Category Archives : Data Science



Power BI Embedded dashboards with Azure Stream Analytics

Azure Stream Analytics is a fully managed “serverless” PaaS offering in Azure built for running real-time analytics on fast-moving streams of data. Today, a significant portion of Stream Analytics customers use Power BI for real-time dynamic dashboarding. Support for Power BI Embedded has been a repeated ask from many of our customers, and today we are excited to share that it is now generally available.

What is Power BI Embedded?

Power BI Embedded simplifies how ISVs and developers can quickly add stunning visuals, reports, and dashboards to their apps. By enabling easy-to-navigate data exploration in their apps, ISVs help their customers make quick, informed decisions in context. This also enables faster time to market and competitive differentiation for all parties.

Additionally, Power BI Embedded enables users to work within familiar development environments such as Visual Studio and Azure.

Using Azure Stream Analytics with Power BI Embedded

Using Power BI with Azure Stream Analytics allows users of Power BI Embedded dashboards to easily visualize insights from streaming data within the context of the apps they use every day. With Power BI Embedded, users can also embed real-time dashboards right in their organization’s web apps.

No changes are required for your existing



Azure Databricks provides the best Apache Spark™-based analytics solution for data scientists and engineers

Azure Databricks provides a fast, easy, and collaborative Apache Spark-based analytics platform to accelerate and simplify the process of building Big Data and AI solutions that drive the business forward, all backed by industry leading SLAs.

I am excited to announce the availability of a set of new features and regions, which enable our customers to accelerate their AI journey with Azure Databricks.

RStudio integration generally available with Azure Databricks

Today, we are announcing the ability to use RStudio with Azure Databricks. Customers can now analyze data with RStudio while taking advantage of the scale and flexibility of Azure Databricks.

RStudio offers a rich IDE that is very popular with data scientists in the R community. With this integration, RStudio runs directly inside Azure Databricks. This enables data scientists to continue to use the familiar and powerful RStudio IDE while gaining the ability to build their solutions at unprecedented scale. Azure Databricks provides the flexibility to start with small jobs and automatically scale up to production workloads in the same environment.

Setting up RStudio in Azure Databricks is simple and fast. Learn how to get started today.

Azure Databricks available in Australia and UK

We are excited



Getting started with Apache Spark on Azure Databricks

Data is growing at an astounding rate, with an estimated 2.5 quintillion bytes being created every day. Data analysts predict that by 2020, the world’s collected data will quadruple. In the sea of all this data, we are continually exploring new ways of analyzing and interpreting data in a way that’s productive, meaningful, and insightful.

Designed in collaboration with the original founders of Apache® Spark™, Azure Databricks combines the best of Databricks and Microsoft Azure to help customers accelerate innovation with streamlined workflows, an interactive workspace, and one-click setup. Azure Databricks is an analytics engine built for large-scale data processing that enables collaboration between data scientists, data engineers, and business analysts.

Azure Databricks can be used to run workloads faster and write applications in the language of your choice, whether that’s Scala, SQL, R or Python. When in sync with Azure Databricks, businesses can innovate within the safe, protected cloud environment of Microsoft Azure and benefit from the native integration with other Azure services such as Power BI, Azure SQL Data Warehouse, and Azure Cosmos DB.

When you’re getting started with Apache Spark on Azure Databricks, you’ll have questions that are unique to your business’s implementation and use case.



Spark + AI Summit: Data scientists and engineers put their trust in Microsoft Azure Databricks

Microsoft will have a major presence at Spark + AI Summit 2018 in San Francisco, the premier event for the Apache Spark community. Rohan Kumar, Corporate Vice President of Azure Data, will deliver a keynote on how Azure Databricks combines the best of the Apache® Spark™ analytics platform and Microsoft Azure Data Services to help customers unleash the power of data and reimagine possibilities that will improve our world.

Azure Databricks, a fast, easy, and collaborative Apache Spark-based analytics platform optimized for Azure, was made generally available in March 2018. To learn more about the announcement, read Rohan Kumar’s blog about how Azure Databricks can help customers accelerate innovation and simplify the process of building Big Data & AI solutions. At Spark + AI Summit, we have a number of sessions showcasing the great work our customers and partners are doing and how Azure Databricks is helping them achieve productivity at scale.

Sign up for training on Spark!

On Monday, June 4, 2018, there are a number of full-day training courses on Apache Spark, ranging from beginner to advanced, that will enhance your skill set and even prepare you for certification on Spark.

Apache Spark essentials

This 1-day course is for



Region expansion for the next generation of SQL Data Warehouse

Azure SQL Data Warehouse (SQL DW) is a fast, flexible, and secure cloud data warehouse tuned for running complex queries quickly across petabytes of data. Continuing to deliver on this promise, we have announced the general availability of the next generation of SQL DW, which delivers an average fivefold performance boost, a fivefold increase in compute scalability, and a fourfold increase in concurrency. The release of the Azure SQL DW Compute Optimized Gen2 tier comes with an expansion to 14 additional regions, bringing the global region footprint of SQL DW Gen2 to 20, surpassing all other major cloud providers. The following regions are available:

Australia East

Australia Southeast

Canada Central

Central India

Central US

East Asia

East US

East US 2

Japan East

Japan West

Korea South

North Central US

North Europe

South Central US

South India

Southeast Asia

UK South

West Europe

West US

West US 2

With more global regions than any other



Three critical analytics use cases with Microsoft Azure Databricks

Data science and machine learning can be applied to solve many common business scenarios, yet there are many barriers preventing organizations from adopting them. Enabling collaboration between data scientists, data engineers, and business analysts, and curating structured and unstructured data from disparate sources, are two examples of such barriers, and we haven’t even gotten to the complexity involved when trying to do these things with large volumes of data.

Recommendation engines, clickstream analytics, and intrusion detection are common scenarios that many organizations are solving across multiple industries. They require machine learning and streaming analytics, and they involve massive amounts of data processing that can be difficult to scale without the right tools. Companies like Lennox International, E.ON, and renewables.AI are just a few examples of organizations that have deployed Apache Spark™ to solve these challenges using Microsoft Azure Databricks.

Your company can enable data science with high-performance analytics too. Designed in collaboration with the original creators of Apache Spark, Azure Databricks is a fast, easy, and collaborative Apache Spark™-based analytics platform optimized for Azure. Azure Databricks is integrated with Azure through one-click setup and provides streamlined workflows and an interactive workspace that enables collaboration between data scientists, data engineers, and business



Azure Event Hubs integration with Apache Spark now generally available

The Event Hubs team is happy to announce the general availability of our integration with Apache Spark. Now, Event Hubs users can use Spark to easily build end-to-end streaming applications. The Event Hubs connector for Spark supports Spark Core, Spark Streaming, and Structured Streaming for Spark 2.1, Spark 2.2, and Spark 2.3.

For users new to Spark, Spark Streaming and Structured Streaming are scalable, fault-tolerant stream processing engines. These processing engines allow users to process huge amounts of data using complex algorithms expressed with high-level functions like map, reduce, join, and window. This data can then be pushed to file systems, databases, or even back to Event Hubs.
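To give a feel for what these operators do without needing a cluster, here is a minimal sketch that uses plain Scala collections rather than the Spark API. The Event type, the averagePerWindow helper, and the sample readings are all hypothetical and exist only for illustration. It groups events into fixed-size (tumbling) windows per device, then uses map and reduce to average the values in each window; join is left out to keep the sketch short.

```scala
// Illustration only: plain Scala collections, not the Spark API.
// The Event type and the sample data below are hypothetical.
final case class Event(deviceId: String, timestampMs: Long, value: Double)

object WindowSketch {
  // Assign each event to a fixed-size (tumbling) window, keyed by device and
  // window start time, then average the values that fall in each window.
  def averagePerWindow(events: Seq[Event], windowMs: Long): Map[(String, Long), Double] =
    events
      .groupBy(e => (e.deviceId, e.timestampMs / windowMs * windowMs)) // window start
      .map { case (key, group) =>
        val total = group.map(_.value).reduce(_ + _) // map, then reduce
        key -> total / group.size
      }

  def main(args: Array[String]): Unit = {
    val events = Seq(
      Event("sensor-a", 1000L, 10.0),
      Event("sensor-a", 4000L, 20.0),
      Event("sensor-a", 6000L, 30.0),
      Event("sensor-b", 2000L, 5.0)
    )
    // Print one line per (device, window start) pair with its average.
    averagePerWindow(events, windowMs = 5000L).toSeq.sortBy(_._1).foreach(println)
  }
}
```

With five-second windows, the sample data averages to 15.0 for sensor-a in the window starting at 0 ms, 30.0 for sensor-a in the window starting at 5000 ms, and 5.0 for sensor-b. In Spark itself the same idea is expressed declaratively over a distributed stream, and the engine handles late data, state, and fault tolerance.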

Setting up a stream is easy. Check it out:

import org.apache.spark.eventhubs._
import org.apache.spark.sql.SparkSession

val eventHubsConf = EventHubsConf("{EVENT HUB CONNECTION STRING FROM AZURE PORTAL}")
  .setStartingPosition(EventPosition.fromEndOfStream)

// Create a stream that reads data from the specified Event Hub.
val spark = SparkSession.builder.appName("SimpleStream").getOrCreate()
val eventHubStream = spark.readStream
  .format("eventhubs")
  .options(eventHubsConf.toMap)
  .load()

It’s as easy as that! Once your events are streaming into Spark, you can process them as you wish. Spark provides a variety of processing options, such as graph analysis and machine learning. Our documentation has more details on linking our connector with your



Unlock your data’s potential with Azure SQL Data Warehouse and Azure Databricks

Getting the most out of your data is critical for any business in a competitive environment. Businesses need the ability to get the right data into the right hands at the right time. Azure Databricks and Azure SQL Data Warehouse can help you do just that through a Modern Data Warehouse.

Azure SQL Data Warehouse is an elastic, globally available, cloud data warehouse that leverages Massively Parallel Processing (MPP) to quickly run complex queries across petabytes of data. Azure SQL Data Warehouse provides a familiar interface for your analysts who know SQL and want to drive action in your business.

Azure Databricks, powered by Apache Spark, combines the best of Databricks and Azure to help customers accelerate innovation with one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.

With the general availability of the Azure Databricks Service comes built-in support for Azure SQL Data Warehouse. This gives any data scientist or data engineer a seamless experience connecting their Azure Databricks cluster to their Azure SQL Data Warehouse when building advanced ETL (extract, transform, and load) for Modern Data Warehouse architectures or accessing relational data for Machine



Microsoft partners with National Science Foundation to empower data science breakthroughs

Over the past decade, Microsoft has partnered with the National Science Foundation (NSF) on three separate programs, first in 2010, and more recently through a commitment of $6M in cloud credits across two NSF supported data science programs – with the Big Data Regional Innovation Hubs and as part of the NSF BigData solicitation.

The engagement with NSF has helped Microsoft reach diverse research groups such as the Big Data Hubs, which bring together communities of data scientists to spark and nurture collaborations between domain experts, researchers, communities, state partners, nonprofits, and industry.

As of today, Microsoft has provided 17 cloud credit awards to Principal Investigators (PIs) who benefit from NSF supported programs. These collaborations are already seeing some interesting breakthroughs across the human body, microbial diseases, and even everyday communication –

Franco Pestilli, Assistant Professor in Psychology, Neuroscience and Cognitive Science at Indiana University, is an Azure awardee and PI through the Midwest Big Data Hub. His group has used the Azure award to build a platform called Brainlife, with the goal of fostering collaboration among sixty-six different global scientific communities, such as developmental and learning sciences, network science, computer science, engineering, psychology, statistics, traumatic brain injury, and vision science. Chirag



Three new reasons to love the TSI explorer

Today we’re pleased to announce three new Time Series Insights (TSI) explorer capabilities that we think our users are going to love. 

First, we are delighted to share that the TSI explorer, the visualization service of TSI, is now generally available and backed by our SLA.  Second, we’ve made the TSI explorer more accessible and easier to use for those with visual and fine-motor disabilities. And finally, we’ve made it easy to export aggregate event data to other analytics tools like Microsoft Excel. 

Now that the TSI explorer is generally available, users will notice that the explorer is backed by TSI’s service level agreement (SLA), and we’ve removed the preview moniker from the backsplash when the explorer is loading. We have many customers using TSI in production environments and we’re thrilled to offer them the same SLA that backs the rest of the product. The ActionPoint IoT-PREDICT solution is a great example of one of those customers using the TSI explorer to enable their customers to explore and analyze time series data quickly. Check out their solution below.

There are no limits to what people can achieve when technology reflects the diversity of everyone who uses it. Transparency, accountability, and