Every row of your data is an insight waiting to be found. That is why it is critical that you can get every row loaded into your data warehouse. When the data is clean, loading it into Azure SQL Data Warehouse is easy using PolyBase. Azure SQL Data Warehouse is elastic, globally available, and leverages Massively Parallel Processing (MPP). In reality, clean data is a luxury that is not always available. In those cases, you need to know which rows failed to load, and why.
In Azure SQL Data Warehouse, the CREATE EXTERNAL TABLE definition has been extended to include a REJECTED_ROW_LOCATION parameter. This value specifies the location in the external data source where the error file(s) and rejected row(s) will be written.
CREATE EXTERNAL TABLE [dbo].[Reject_Example]
(
    [Col_one] TINYINT NULL,
    [Col_two] VARCHAR(100) NULL,
    [Col_three] NUMERIC(2,2) NULL
)
WITH
(
    DATA_SOURCE = EDS_Reject_Row,
    LOCATION = 'Read_Directory',
    FILE_FORMAT = CSV,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 100,
    REJECTED_ROW_LOCATION = 'Reject_Directory'
)

What happens when data is loaded?
When a user runs a Create Table as Select (CTAS) on the table above, PolyBase creates a directory on the external data source at the REJECTED_ROW_LOCATION if one doesn't exist. A child directory is created with the name "_rejectedrows". The "_" prefix ensures that the directory is skipped by other data loads unless it is explicitly named in the LOCATION parameter.
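The mechanism can be sketched in plain Python (a hypothetical illustration, not PolyBase's implementation): rows that fail type conversion are written, together with the failure reason, to a "_rejectedrows" directory, and the load aborts once the reject threshold (the REJECT_VALUE above) is crossed.

```python
import csv
import os

def load_with_rejects(rows, reject_dir, reject_value=100):
    """Split rows into loaded and rejected rows, loosely mimicking how
    PolyBase diverts unconvertible rows. Each row is (col_one, col_two),
    where col_one must parse as a TINYINT (0-255)."""
    os.makedirs(os.path.join(reject_dir, "_rejectedrows"), exist_ok=True)
    reject_path = os.path.join(reject_dir, "_rejectedrows", "rejects.csv")
    loaded, rejected = [], []
    with open(reject_path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            try:
                value = int(row[0])
                if not 0 <= value <= 255:
                    raise ValueError("out of TINYINT range")
                loaded.append((value, row[1]))
            except ValueError as err:
                rejected.append(row)
                # Write the rejected row plus the reason it failed.
                writer.writerow(list(row) + [str(err)])
            if len(rejected) > reject_value:
                raise RuntimeError("reject threshold exceeded; load aborted")
    return loaded, rejected
```

The error file pairs each rejected row with its reason, which is the piece of information you need to diagnose why rows failed to load.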
We’re committed to making Azure work great with the open source tools you know and love, and if you’re using Chef products or open source projects, there’s never been a better time to try Azure. We’ve had a rich history of partnership and collaboration with Chef to deliver automation tools that help you with cloud adoption. Today, at ChefConf, the Chef and Azure teams are excited to announce the inclusion of Chef InSpec, directly in Azure Cloud Shell, as well as the new Chef Developer Hub in Azure Docs.
InSpec in Azure Cloud Shell
In addition to other open source tools like Ansible and Terraform that are already available, today we are announcing the availability of Chef InSpec, pre-installed and ready to use for every Azure user in the Azure Cloud Shell. This makes bringing your InSpec tests to Azure simple; in fact, it's the easiest way to try out InSpec – no installation or configuration required.
Figure 1: InSpec Exec within Azure Cloud Shell
Chef Developer Hub for Azure
We are launching the new Chef Developer Hub so Azure customers can more easily implement their solutions using Chef open source software. Whether you're using Chef, InSpec, or Habitat, you can find documentation and quickstarts in one place.
Source: https://powerbi.microsoft.com/en-us/blog/on-premises-data-gateway-may-update-is-now-available/
We are excited to announce that we have released the May update for the On-premises data gateway. Here are some of the things we would like to highlight with this month's release.
Azure Data Lake customers use the Data Lake Store and Data Lake Analytics to store and run complex analytics on massive amounts of data. However, it can be challenging to manage costs, keep up to date with activity in the accounts, and know proactively when usage is nearing certain limits. Using Log Analytics with Azure Data Lake, we can address these challenges and know when costs are increasing or when certain activities take place.
In this post, you will learn how to use Log Analytics with your Data Lake accounts to create alerts that can notify you of Data Lake activity events and when certain usage thresholds are reached. It is easy to get started!
Step 1: Connect Azure Data Lake and Log Analytics
Data Lake accounts can be configured to generate diagnostics logs, some of which are generated automatically (e.g., regular Data Lake operations such as reporting current usage, or whenever a job completes). Others are generated based on requests (e.g., when a new file is created or opened, or when a job is submitted). Both Data Lake Analytics and Data Lake Store can be configured to send these diagnostics logs to a Log Analytics workspace, where we can query and analyze them.
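As a sketch of what a usage-threshold alert over such logs might do (the record fields here are simplified assumptions, not the actual Data Lake diagnostic log schema):

```python
def check_usage_alerts(log_records, threshold_gb):
    """Return alert messages for diagnostic records whose reported usage
    is at or above the threshold. The record shape ({'operation',
    'usage_gb'}) is a simplified stand-in for real Data Lake log entries."""
    alerts = []
    for record in log_records:
        usage = record.get("usage_gb", 0)
        if usage >= threshold_gb:
            alerts.append(
                f"ALERT: {record.get('operation', 'unknown')} reported "
                f"{usage} GB, at or above the {threshold_gb} GB threshold"
            )
    return alerts
```

In practice, Log Analytics evaluates a query on a schedule and fires the alert action (email, webhook, runbook) when the query returns results; the filter above is the conceptual core of that query.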
Today at Informatica World, Scott Guthrie, EVP, Cloud + AI, along with Anil Chakravarthy, CEO of Informatica, announced the availability of Informatica Intelligent Cloud Services (IICS) for Azure. Microsoft has partnered with Informatica, a leader in enterprise data management, to help our customers accelerate data warehouse modernization. This service is available as a free preview on Azure today.
Informatica provides a discovery-driven approach to data warehouse migration. This approach simplifies the process of identifying and moving data into Azure SQL Data Warehouse (SQL DW), Microsoft's petabyte-scale, fully managed, globally available analytics platform. With the recently released SQL DW Compute Optimized Gen2 tier, you can enjoy 5x the performance, 4x the concurrency, and 5x the scale of the previous generation.
With this release, Informatica Intelligent Cloud Services for Azure can be launched directly from the Azure Portal. You can enjoy a single sign-on experience and don’t have to create a separate Informatica account. With Informatica Data Accelerator for Azure, you can discover and load data into SQL DW. Informatica’s discovery-driven approach allows you to work with thousands of tables and columns.
“We are very excited about this next step in our long-standing partnership with Microsoft”, said Pratik Parekh, VP, Product Management, Informatica.
Azure Traffic Manager, Azure's DNS-based load balancing solution, is used by customers for a wide variety of use cases: routing a global user base to the Azure endpoints that give them the fastest, lowest-latency experience, providing seamless auto-failover for mission-critical workloads, and migrating from on-premises to the cloud. One key use case is making software deployments smoother, with minimal impact to users, by implementing a Blue-Green deployment process using Traffic Manager's weighted round-robin routing method. This blog will show how to implement Blue-Green deployment using Traffic Manager, but before we dive deep, let us discuss what we mean by Blue-Green deployment.
Blue-Green deployment is a software rollout method that can reduce the impact of interruptions caused by issues in the new version being deployed. This is achieved by exposing the new version of the software to a limited set of users and expanding that user base gradually until everyone is using the new version. If at any time the new version is causing issues, for example a broken authentication workflow in the new version of a web application, all the users can be instantly* redirected back to the old version by shifting the traffic weights.
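The weighted selection behind this process can be sketched in a few lines of Python (an illustration of weighted routing in general, not Traffic Manager's implementation):

```python
import random

def pick_endpoint(weights, rng=random.random):
    """Pick an endpoint name with probability proportional to its weight,
    approximating weighted round-robin DNS routing. 'weights' maps an
    endpoint name to a non-negative weight, e.g. {'blue': 90, 'green': 10}."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one endpoint needs a positive weight")
    point = rng() * total
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return name
    return name  # floating-point edge case: fall back to the last endpoint
```

Rolling out means gradually increasing the "green" weight; rolling back means setting it to 0, after which every pick returns "blue".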
After the recent general availability of Storage Explorer, we also added new features in the latest 1.1 release to align with the Azure Storage platform:
- Azurite, a cross-platform emulator
- Access tiers, which efficiently consume resources based on how frequently a blob is accessed
- The removal of the SAS URL start time, to avoid datacenter clock-synchronization issues
Storage Explorer is a great tool for managing contents of your Azure storage account. You can upload, download, and manage blobs, files, queues, and Cosmos DB entities. Additionally, you may gain easy access to manage your Virtual Machine disks, work with either Azure Resource Manager or classic storage accounts, plus manage and configure cross-origin resource sharing (CORS) rules. Storage Explorer also works on public Azure, Sovereign Azure Cloud, as well as Azure Stack.
Let’s go through some example scenarios where Storage Explorer helps with your daily job.
Sign-in to your Azure Cloud from Storage Explorer
To get started using Storage Explorer, sign in to your Azure account and stay connected to your subscriptions. If you have an account for Azure, an Azure Sovereign Cloud, or Azure Stack, you can easily sign in from the Storage Explorer "Add an Account" dialog.
In addition, now Storage Explorer shares the
This post is authored by Gopi Kumar, Principal Program Manager at Microsoft.
The Data Science Virtual Machine (DSVM), a popular VM image on the Azure marketplace, is a purpose-built cloud-based environment with a host of preconfigured data and AI tools. It enables data scientists and AI developers to iterate on developing high quality predictive models and deep learning architectures and helps them become much more productive when developing their AI applications. DSVM has been offered for over two years now and, during that time, it has seen a wide range of users, from small startups to enterprises with large data science teams who use DSVM as their core cloud development and experimentation environment for building production applications and models.
Deploying AI infrastructure at scale can be quite challenging for large enterprise teams. However, Azure infrastructure provides several services supporting enterprise IT needs, such as security, scaling, reliability, availability, performance, and collaboration. The Data Science VM can readily leverage these services to support the deployment of large-scale, enterprise team-based Data Science and AI environments. We have assembled guidance for an initial list of common enterprise scenarios in a new DSVM documentation section dedicated to enterprise use.
This post is by Ye Xing, Senior Data Scientist, Tao Wu, Principal Data Scientist Manager, and Patrick Buehler, Senior Data Scientist, at Microsoft.
The advancement of medical imaging, as in many other scientific disciplines, relies heavily on the latest advances in tools and methodologies that make rapid iterations possible. We recently witnessed this first-hand when we developed a deep learning model on the newly released Azure Machine Learning Package for Computer Vision (AML-CVP) and were able to improve upon a state-of-the-art algorithm in screening blinding retinal diseases. Our pipeline, based on AML-CVP, reduced misclassification by over 90% (from 3.9% down to 0.3%) without any parameter tuning. The deep learning model training was completed in 10 minutes over 83,484 images on the Azure Deep Learning Virtual Machine equipped with a single NVIDIA V100 GPU. This pipeline can be constructed quickly, with less than 20 lines of Python code, thanks to the benefit of the high-level Python AML-CVP API.
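As a quick sanity check on the numbers above, going from a 3.9% to a 0.3% misclassification rate eliminates (3.9 - 0.3) / 3.9 of the errors:

```python
# Misclassification rates quoted above, as percentages.
before, after = 3.9, 0.3
reduction = (before - after) / before  # fraction of errors eliminated
print(f"relative error reduction: {reduction:.1%}")
```

This prints roughly 92.3%, consistent with the "over 90%" figure.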
Our work was inspired by the paper "Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning," published in Cell, a leading medical journal, in February 2018. The paper developed a deep learning AI system to identify two vision-threatening retinal diseases – choroidal neovascularization (CNV) and diabetic macular edema (DME).
Azure the cloud for all – highlights from Microsoft BUILD 2018 – In this final recap of Microsoft Build 2018, Julia White, Corporate Vice President, Microsoft Azure pulled together some key highlights and top sessions to watch. This post summarizes what’s new across tools, containers+serverless, IoT, and Data+AI.
“Hey! You! Get on my cloud.” Corey Sanders at Build 2018.
Now in preview
Public preview: Query across applications in log alerts – You can use Azure Application Insights to monitor a distributed modern cloud application. In the same spirit, log alerts enable you to combine data across various apps. Cross-app query support in log alerts is currently in preview.
Now generally available
Protect virtual machines across different subscriptions with Azure Security Center – Azure Security Center's Cross-Subscription Workspace Selection enables you to collect and monitor, in one location, data from virtual machines that run in different workspaces and subscriptions, and to run queries across them.
Announcing SQL Advanced Threat Protection (ATP) and SQL Vulnerability Assessment general availability – SQL Vulnerability Assessment (VA) provides you a one-stop shop to discover, track, and remediate potential database vulnerabilities. It helps give you visibility into your security state, and includes actionable steps to investigate, manage, and resolve security issues.