Category Archives : Data Warehouse

20

Jun

Column-Level Security is now supported in Azure SQL Data Warehouse

Today we’re announcing Column-Level Security (CLS) for Azure SQL Data Warehouse, an additional capability for managing security for sensitive data. Azure SQL Data Warehouse is a fast, flexible and secure cloud data warehouse tuned for running complex queries fast and across petabytes of data.

As you move data to the cloud, securing your data assets is critical to building trust with your customers and partners. With the introduction of CLS, you can adjust permissions to view sensitive data by limiting user access to specific columns in your tables without having to redesign your data warehouse. This simplifies the overall security implementation as the access restriction logic is located in the database tier itself rather than away from the data in another application. CLS eliminates the need to introduce views to filter out columns for access control management.

Some examples of how this is being used today:

A financial services firm allows only account managers to have access to customer social security numbers (SSN), phone numbers, and other personally identifiable information (PII). A health care provider allows only doctors and nurses to have access to sensitive medical records while not allowing members of the billing department to view this data.

Share

20

Jun

Column-Level Security is now supported in Azure SQL Data Warehouse

Today we’re announcing Column-Level Security (CLS) for Azure SQL Data Warehouse, an additional capability for managing security for sensitive data. Azure SQL Data Warehouse is a fast, flexible and secure cloud data warehouse tuned for running complex queries fast and across petabytes of data.

As you move data to the cloud, securing your data assets is critical to building trust with your customers and partners. With the introduction of CLS, you can adjust permissions to view sensitive data by limiting user access to specific columns in your tables without having to redesign your data warehouse. This simplifies the overall security implementation as the access restriction logic is located in the database tier itself rather than away from the data in another application. CLS eliminates the need to introduce views to filter out columns for access control management.

Some examples of how this is being used today:

A financial services firm allows only account managers to have access to customer social security numbers (SSN), phone numbers, and other personally identifiable information (PII). A health care provider allows only doctors and nurses to have access to sensitive medical records while not allowing members of the billing department to view this data.

Share

14

Jun

Quick Recovery Time with SQL Data Warehouse using User-Defined Restore Points

We are excited to announce that SQL Data Warehouse (SQL DW) now supports User-Defined Restore Points! SQL DW is a flexible and secure analytics platform for the enterprise optimized for running complex queries fast across petabytes of data.

Previously, SQL DW supported only automated snapshots guaranteeing an eight-hour recovery point objective (RPO). While this snapshot policy provided high levels of protection, customers asked for more control over restore points to enable more efficient data warehouse management capabilities leading to quicker times of recovery in the event of any workload interruptions or user errors. 

Now, with user-defined restore points, in addition to the automated snapshots, you can initiate snapshots before and after significant operations on your data warehouse. With more granular restore points, you ensure that each restore point is logically consistent and limit the impact and reduce recovery time of restoring the data warehouse should this be needed. User-defined restore points can also be labeled so they are easy to identify afterwards. 

You can seamlessly create a restore point with a single statement in PowerShell, so it’s easy to integrate with your data warehouse management operations. You can have up to 42 restore points at any point, and as all

Share

14

Jun

Quick Recovery Time with SQL Data Warehouse using User-Defined Restore Points

We are excited to announce that SQL Data Warehouse (SQL DW) now supports User-Defined Restore Points! SQL DW is a flexible and secure analytics platform for the enterprise optimized for running complex queries fast across petabytes of data.

Previously, SQL DW supported only automated snapshots guaranteeing an eight-hour recovery point objective (RPO). While this snapshot policy provided high levels of protection, customers asked for more control over restore points to enable more efficient data warehouse management capabilities leading to quicker times of recovery in the event of any workload interruptions or user errors. 

Now, with user-defined restore points, in addition to the automated snapshots, you can initiate snapshots before and after significant operations on your data warehouse. With more granular restore points, you ensure that each restore point is logically consistent and limit the impact and reduce recovery time of restoring the data warehouse should this be needed. User-defined restore points can also be labeled so they are easy to identify afterwards. 

You can seamlessly create a restore point with a single statement in PowerShell, so it’s easy to integrate with your data warehouse management operations. You can have up to 42 restore points at any point, and as all

Share

10

Jun

Microsoft Azure Data welcomes attendees to ACM SIGMOD/PODS 2018

Hello SIGMOD attendees!

Welcome to Houston, and to what is shaping up to be a great conference.  I wanted to take this opportunity to share with you some of the exciting work in data that’s going on in the Azure Data team at Microsoft, and to invite you to take a closer look.

Microsoft has long been a leader in database management with SQL Server, recognized as the top DBMS by Gartner for the past three years in a row.  The emergence of the cloud and edge as the new frontiers for computing, and thus data management, is an exciting direction—data is now dispersed within and beyond the enterprise, on-prem, on-cloud, and on edge devices, and we must enable intelligent analysis, transactions, and responsible governance for all data everywhere, from the moment it is created to the moment it is deleted, through the entire life-cycle of ingestion, updates, exploration, data prep, analysis, serving, and archival. 

These trends require us to fundamentally re-think data management.  Transactional replication can span continents.  Data is not just relational.  Interactive, real-time, and streaming applications with enterprise level SLAs are becoming common.  Machine learning is a foundational analytic task and must be supported while ensuring that

Share

23

May

Load confidently with SQL Data Warehouse PolyBase Rejected Row Location

Every row of your data is an insight waiting to be found. That is why it is critical you can get every row loaded into your data warehouse. When the data is clean, loading data into Azure SQL Data Warehouse is easy using PolyBase. It is elastic, globally available, and leverages Massively Parallel Processing (MPP). In reality clean data is a luxury that is always available. In those cases you need to know which rows failed to load and why.

In Azure SQL Data Warehouse the Create External Table definition has been extended to include a Rejected_Row_Location parameter. This value represents the location in the External Data Source where the Error File(s) and Rejected Row(s) will be written.

CREATE EXTERNAL TABLE [dbo].[Reject_Example] ( [Col_one] TINYINT NULL, [Col_two] VARCHAR(100) NULL, [Col_three] NUMERIC(2,2) NULL ) WITH ( DATA_SOURCE = EDS_Reject_Row ,LOCATION = ‘Read_Directory’ ,FILE_FORMAT = CSV ,REJECT_TYPE = VALUE ,REJECT_VALUE = 100 ,REJECTED_ROW_LOCATION=‘Reject_Directory’ ) What happens when data is loaded?

When a user runs a Create Table as Select (CTAS) on the table above, PolyBase creates a directory on the External Data Source at the Rejected_Row_Location if one doesn’t exist. A child directory is created with the name “_rejectedrows”. The “_”

Share

22

May

Accelerate data warehouse modernization with Informatica Intelligent Cloud Services for Azure

Today at the Informatica World, Scott Guthrie, EVP, Cloud + AI, along with Anil Chakravarthy, CEO of Informatica, announced the availability of Informatica Intelligent Cloud Services (IICS) for Azure. Microsoft has partnered with Informatica, a leader in Enterprise Data Management, to help our customers accelerate data warehouse modernization. This service is available as a free preview on Azure today.

Informatica provides a discovery-driven approach to data warehouse migration. This approach simplifies the process of identifying and moving data into Azure SQL Data Warehouse (SQL DW), Microsoft’s petabyte scale, fully managed, globally available analytics platform. With the recently released SQL DW Compute Optimized Gen2 tier, you can enjoy 5x performance, 4x concurrency and 5x scale from previous generation.

With this release, Informatica Intelligent Cloud Services for Azure can be launched directly from the Azure Portal. You can enjoy a single sign-on experience and don’t have to create a separate Informatica account. With Informatica Data Accelerator for Azure, you can discover and load data into SQL DW. Informatica’s discovery-driven approach allows you to work with thousands of tables and columns.

“We are very excited about this next step in our long-standing partnership with Microsoft”, said Pratik Parekh, VP, Product Management, Informatica.

Share

14

May

Accelerate your cloud data warehouse with automation tools

Gaining insights rapidly from data is critical to competitiveness in today’s business world. Azure SQL Data Warehouse (SQL DW), Microsoft’s fully managed analytics platform, leverages Massively Parallel Processing (MPP) to run complex interactive SQL queries at every level of scale.

An enterprise data warehouse doesn’t exist in isolation but is part of a complete analytics solution that includes the ‘pipelines’ that extract, transform, and load data, often from many disparate sources, as well as staging and reporting layers. Tasks such as continuous integration and continuous delivery require extensive manual coding and doesn’t offer the flexibility needed for the rapidly changing business needs. This is where data warehouse automation tools provide value for customers.

Data warehouse automation (DWA) tools are meta-data driven, code generation tools that streamline developing and managing a data warehouse solution. DWA tools provide more than just ETL automation, they automate the complete life cycle of a data warehouse solution, from analysis, design, and implementation to documentation, monitoring and maintenance.

Benefits of data warehouse automation

Data warehouse automation (DWA) tools bring the benefits of meta-data driven automation, and code generation to streamline developing and managing a data warehouse solution. DWA tools provide more than just ETL automation, they

Share

10

May

Extract management insights from SQL Data Warehouse with SQL Operations Studio

SQL Operations Studio can be leveraged with Azure SQL Data Warehouse (SQL DW) to create rich customizable dashboard widgets surfacing insights to your data warehouse. This unlocks key scenarios around managing and tuning your data warehouse to ensure it is optimized for consistent performance. Previously, developers had to manually and continuously execute complex DMV queries to extract insights from their data warehouse. This leads to a repetitious process when following development and tuning best practices with SQL DW. Now with SQL Operations Studio, customized insight widgets can be embedded directly within the query tool enabling you to seamlessly monitor and troubleshoot issues with your data warehouse.

The following widgets can be generated by using the provided T-SQL monitoring scripts within SQL Operations Studio for common data warehouse insights.

Data Skew

Detect data skew across distributions to help identify and troubleshoot query performance issues:

Columnstore health and statistics

Leverage views to help maximize columnstore row group quality and ensure table statistics are up to date for optimal query performance:

User Activity

Identify and understand workload patterns through active sessions queries, queued queries, loads, and backups:

Resource Bottlenecks

Ensure adequate resources are allocated such as memory and TempDB:

Share

10

May

Azure SQL Data Warehouse now supports automatic creation of statistics

We are pleased to announce that Azure SQL Data Warehouse (Azure SQL DW) now supports automatic creation of column level statistics. Azure SQL DW is a fast, flexible, and secure analytics platform for the enterprise.

Modern systems such as Azure SQL DW, rely on cost-based optimizers to generate quality execution plans for user queries. Even though Azure SQL DW implements a cost-based optimizer, the system relies on developers and administrators to create statistics objects manually. When all queries are known in advance, determining what statistics objects need to be created is an achievable task. However, when the system is faced with ad-hoc and random queries which is typical for the data warehousing workloads, system administrators may struggle to predict what statistics need to be created leading to potentially suboptimal query execution plans and longer query response times. One way to mitigate this problem is to create statistics objects on all the table columns in advance. However, that process comes with a penalty as statistics objects need to be maintained during table loading process, causing longer loading times.

Azure SQL DW now supports automatic creation of statistics objects providing greater flexibility, productivity, and ease of use for system administrators and developers,

Share