Category Archives: Big Data

27 Jun

Microsoft from GeekWire Cloud Tech Summit: New Azure innovations will advance the intelligent cloud and intelligent edge

Today, I gathered with the tech community in the Seattle area at the GeekWire Cloud Tech Summit to talk about how customers are using the cloud and what the future holds. I joined GeekWire’s Todd Bishop and Tom Krazit on stage for a fireside chat to share more about Microsoft’s vision for emerging cloud innovation, but I also got to connect with many of you directly. In those conversations, it became even more apparent just how many of you are turning to the intelligent cloud to explore how emerging innovations like serverless, blockchain, edge computing, and AI can help you create solutions that can change your business — and people’s lives.

At Microsoft, we’re continually releasing technology that’s inspired by our customers and what you tell us you need to make the intelligent cloud and intelligent edge a reality for your businesses. For example, you may need to build applications to work in remote areas with low connectivity. Or you need to store, access, and drive insights from your data faster because of competitive pressures. And, you need confidence that your data and applications will be secure, resilient, and highly available across the globe.

During my time with Tom and…


27 Jun

Lift SQL Server Integration Services packages to Azure with Azure Data Factory

Data is vital to every app and experience we build today. With increasing amounts of data, organizations do not want to be tied down by the rising infrastructure costs that come with it. Data engineers and developers are realizing the need to move their on-premises workloads to the cloud to take advantage of its massive scale and flexibility. Azure Data Factory (ADF) capabilities are now generally available for SQL Server Integration Services (SSIS) customers, who can easily lift SSIS packages to Azure and gain scalability, high availability, and lower TCO, while ADF manages the resources for them.

Using the code-free ADF UI, data engineers and developers can now provision and monitor the Azure-SSIS Integration Runtime (IR), a set of dedicated ADF servers for SSIS package execution. This capability comes with new features:

- Azure Resource Manager (ARM) Virtual Network (VNet) support to access data on premises.
- Ability to host the catalog of SSIS projects (SSISDB) in Azure SQL Managed Instance or Azure SQL Database with a VNet service endpoint.
- Azure Hybrid Benefit (AHB) to leverage existing on-premises SQL Server licenses and lower costs.
- Enterprise Edition to use advanced/premium connectors and transformations.
- Custom Setup to install custom/third-party components on the Azure-SSIS IR, and more.
- Ability to trigger and schedule SSIS…


25 Jun

Structured streaming with Azure Databricks into Power BI & Cosmos DB

In this blog we’ll discuss the concept of Structured Streaming and how a data ingestion path can be built using Azure Databricks to enable the streaming of data in near-real-time. We’ll touch on some of the analysis capabilities which can be called directly from within Databricks utilising the Text Analytics API, and also discuss how Databricks can be connected directly into Power BI for further analysis and reporting. As a final step, we cover how streamed data can be sent from Databricks to Cosmos DB as the persistent storage.

Structured streaming is a stream processing engine which allows computations to be expressed over streaming data (e.g. a Twitter feed), in much the same way batch computation is expressed over a static dataset. Computation is performed incrementally via the Spark SQL engine, which continuously updates the result as the streaming data flows in.
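The incremental-computation idea can be illustrated with a minimal plain-Python sketch (not the Spark API itself): each micro-batch of arriving records is folded into a running aggregate, rather than recomputing over the full history, which is conceptually what the Spark SQL engine does for a streaming word count.

```python
# Minimal sketch of incremental computation over micro-batches,
# the idea underlying structured streaming. Names are illustrative;
# this is not the Spark Structured Streaming API.
from collections import Counter


def update_counts(running: Counter, micro_batch: list) -> Counter:
    """Fold one micro-batch of text records into the running word counts."""
    for record in micro_batch:
        running.update(record.lower().split())
    return running


# Simulate two micro-batches arriving over time.
counts = Counter()
for batch in [["Azure Databricks streams data"], ["streams of data flow in"]]:
    counts = update_counts(counts, batch)

print(counts["streams"])  # → 2: each batch contributed one occurrence
```

The running `Counter` plays the role of the continuously updated result table; in Spark, the same effect is achieved by an aggregation query over a streaming DataFrame.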

One possible architecture uses Databricks directly as an ingestion path to stream data from Twitter (via Event Hubs, acting as a buffer), call the Text Analytics API in Cognitive Services to apply intelligence to the data and…


21 Jun

Event trigger based data integration with Azure Data Factory

Event-driven architecture (EDA) is a common data integration pattern that involves the production, detection, consumption of, and reaction to events. Today, we are announcing support for event-based triggers in your Azure Data Factory (ADF) pipelines. Many data integration scenarios require Data Factory customers to trigger pipelines based on events. A typical event could be a file landing in, or getting deleted from, your Azure storage. Now you can simply create an event-based trigger in your Data Factory pipeline.

As soon as a file arrives in your storage location and the corresponding blob is created, it will trigger and run your Data Factory pipeline. You can create an event-based trigger on blob creation, blob deletion, or both in your Data Factory pipelines.

With the “Blob path begins with” and “Blob path ends with” properties, you can specify the containers, folders, and blob names for which you wish to receive events. You can also use a wide variety of patterns for both properties. At least one of these properties is required.

Examples:

- Blob path begins with (/containername/) – will receive events for any blob in the container.
- Blob…
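To make the properties above concrete, here is a rough sketch of what an event-based trigger definition might look like in ADF's JSON resource format; the trigger name, pipeline name, file pattern, and the storage-account scope path are all placeholders:

```json
{
  "name": "BlobCreatedTrigger",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/containername/blobs/",
      "blobPathEndsWith": ".csv",
      "events": ["Microsoft.Storage.BlobCreated"],
      "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "CopyNewFilePipeline",
          "type": "PipelineReference"
        }
      }
    ]
  }
}
```

With a definition along these lines, every blob created under the matching path pattern would start a run of the referenced pipeline; adding `Microsoft.Storage.BlobDeleted` to the `events` array would cover deletions as well.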


20 Jun

Microsoft deepens its commitment to Apache Hadoop and open source analytics

DATAWORKS SUMMIT, SAN JOSE, Calif., June 18, 2018 – Earlier today, Microsoft deepened its commitment to the Apache Hadoop ecosystem and its partnership with Hortonworks, which has brought the best of Apache Hadoop and open source big data analytics to the cloud. Since the start of the partnership nearly six years ago, hundreds of the largest enterprises have chosen Azure HDInsight and Hortonworks to run Hadoop, Spark, and other open source analytics workloads on Azure. During this time, Microsoft has also become one of the leading committers to Apache projects, sharing its experience running one of the largest data lakes on the planet with the open source community.

Azure HDInsight

Azure HDInsight is a fully managed cluster service that enables customers to process and gain insights from massive amounts of data using Hadoop, Spark, Hive, HBase, Kafka, Storm, and distributed R. Azure HDInsight offers the latest Hortonworks Data Platform (HDP) distribution and related open source projects on the Linux OS. The service is available in 26 public regions and Azure Government Clouds in the US and Germany.
 
The Big Data and Hadoop community has rapidly evolved over the past few years. Azure HDInsight has supported…


20 Jun

Azure Event Hubs is now offering support for Availability Zones in preview

Azure Event Hubs makes streaming data effortless because of its simplicity and ability to scale easily. The sheer volume of data that goes through the Event Hubs platform is a testament to how reliable the service is. In fact, by the time you finish reading this sentence, Event Hubs will have ingested over 100 million events globally.

Today, we are adding to the durability of the service and offering support for Availability Zones in public preview for Standard Event Hubs. This new feature adds even greater resiliency and fault tolerance to the top event streaming service. Support for Availability Zones partners nicely with our disaster recovery feature to offer a highly available service that can withstand both a zone outage and a regional one when both are properly utilized.

Event Hubs has customers in Retail, Auto, Finance, and other verticals that use its streaming capabilities for scenarios such as predictive analytics and financial trading. Availability Zones support further enhances our commitment to keeping our customers’ workloads running as smoothly as possible. We hope you try out this new feature.

What regions will this be offered in?

- Central US
- East US 2
- France Central

We’ll be offering support for additional Availability…
