Category Archives : Big Data

27

Jun

Enterprises get deeper insights with Hadoop and Spark updates on Azure HDInsight

Azure HDInsight is one of the most popular services among enterprises for open source Hadoop & Spark analytics on Azure. With the 50-plus percent price cut on HDInsight, customers moving to the cloud are reaping more savings than ever.

“PROS is a pioneer in using machine learning to give companies an accurate and profitable pricing. PROS Guidance product runs enormously complex pricing calculations based on variables that comprise multiple terabytes of data. In Azure HDInsight, a process that formerly took several days now takes just a few minutes.” – Ed Gonzalez, Product Manager, PROS

Today we are announcing updates to Apache Spark, Apache Kafka, ML Services, and Azure Data Lake Storage Gen2, as well as enhancements to the Enterprise Security Package. These new capabilities will continue to drive savings for many of our customers. In addition, Microsoft is continuing to deepen its commitment to the Apache Hadoop ecosystem and has extended its partnership with Hortonworks to bring the best of Apache Hadoop and open source big data analytics to the cloud.

Continued investment in Open Source for new capabilities and reliability

Reliable Open Source

Microsoft is contributing to the Apache Hadoop ecosystem and also ensuring Azure is the most reliable place

27

Jun

Microsoft from GeekWire Cloud Tech Summit: New Azure innovations will advance the intelligent cloud and intelligent edge

Today, I gathered with the tech community in the Seattle area at the GeekWire Cloud Tech Summit to talk about how customers are using the cloud and what the future holds. I joined GeekWire’s Todd Bishop and Tom Krazit on stage for a fireside chat to share more about Microsoft’s vision for emerging cloud innovation, but I also got to connect with many of you directly. In those conversations, it became even more apparent just how many of you are turning to the intelligent cloud to explore how emerging innovation like serverless, blockchain, edge computing, and AI can help you create solutions that can change your business — and people’s lives.

At Microsoft, we’re continually releasing technology that’s inspired by our customers and what you tell us you need to make the intelligent cloud and intelligent edge a reality for your businesses. For example, you may need to build applications to work in remote areas with low connectivity. Or you need to store, access, and drive insights from your data faster because of competitive pressures. And, you need confidence that your data and applications will be secure, resilient, and highly available across the globe.

During my time with Tom and

27

Jun

Lift SQL Server Integration Services packages to Azure with Azure Data Factory

Data is vital to every app and experience we build today. With an increasing amount of data, organizations do not want to be tied down by the increasing infrastructure costs that come with it. Data engineers and developers are realizing the need to start moving their on-premises workloads to the cloud to take advantage of its massive scale and flexibility. Azure Data Factory capabilities are generally available for SQL Server Integration Services (SSIS) customers to easily lift SSIS packages to Azure, gaining scalability, high availability, and lower TCO, while ADF manages resources for them.

Using the code-free ADF UI/app, data engineers and developers can now provision and monitor the Azure-SSIS Integration Runtime (IR), a set of dedicated ADF servers for SSIS package executions. This capability now comes with amazing new features:

Azure Resource Manager (ARM) Virtual Network (VNet) to access data on premises.
Ability to host catalog of SSIS projects (SSISDB) in Azure SQL Managed Instance or Azure SQL Database with VNet service endpoint.
Azure Hybrid Benefit (AHB) to leverage existing on-premises SQL Server licenses to lower costs.
Enterprise Edition to use advanced/premium connectors and transformations.
Custom Setup to install custom/third party components on Azure-SSIS IR, and more.
Ability to trigger and schedule SSIS

25

Jun

Structured streaming with Azure Databricks into Power BI & Cosmos DB

In this blog we’ll discuss the concept of Structured Streaming and how a data ingestion path can be built using Azure Databricks to enable the streaming of data in near-real-time. We’ll touch on some of the analysis capabilities which can be called from directly within Databricks utilising the Text Analytics API and also discuss how Databricks can be connected directly into Power BI for further analysis and reporting. As a final step we cover how streamed data can be sent from Databricks to Cosmos DB as the persistent storage.

Structured Streaming is a stream processing engine that lets you express computation on streaming data (e.g. a Twitter feed) in much the same way as a batch computation on a static dataset. Computation is performed incrementally by the Spark SQL engine, which continuously updates the result as the streaming data flows in.
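
To make the batch-versus-incremental idea concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use the Structured Streaming API; the function names and the sample tweets are hypothetical and exist only for illustration. It shows a running word count that is updated as each micro-batch arrives and ends up equal to a batch computation over the full dataset.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def batch_word_count(records: Iterable[str]) -> Counter:
    """Batch style: scan the complete, static dataset once."""
    counts: Counter = Counter()
    for text in records:
        counts.update(text.lower().split())
    return counts


def streaming_word_count(micro_batches: Iterable[List[str]]) -> Iterator[Counter]:
    """Streaming style: fold each arriving micro-batch into the running result."""
    running: Counter = Counter()
    for batch in micro_batches:
        # Only the newly arrived records are processed on each step.
        running.update(batch_word_count(batch))
        # Emit a snapshot of the continuously updated result.
        yield Counter(running)


# Hypothetical tweets arriving in two micro-batches (illustrative data only).
micro_batches = [
    ["Azure Databricks streams data", "Spark on Azure"],
    ["Power BI reads streams"],
]
snapshots = list(streaming_word_count(micro_batches))
all_records = [text for batch in micro_batches for text in batch]
```

After the last micro-batch, `snapshots[-1]` holds the same counts as `batch_word_count(all_records)`, which is the sense in which the streaming query behaves like a batch query over data that keeps growing.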

The above architecture illustrates one possible flow in which Databricks is used directly as an ingestion path to stream data from Twitter (with Event Hubs acting as a buffer), call the Text Analytics API in Cognitive Services to apply intelligence to the data and

21

Jun

Event trigger based data integration with Azure Data Factory

Event-driven architecture (EDA) is a common data integration pattern that involves production, detection, consumption of, and reaction to events. Today, we are announcing support for event-based triggers in your Azure Data Factory (ADF) pipelines. Many data integration scenarios require Data Factory customers to trigger pipelines based on events. A typical event could be a file landing in, or being deleted from, your Azure storage. Now you can simply create an event-based trigger in your Data Factory pipeline.

As soon as the file arrives in your storage location and the corresponding blob is created, the event triggers and runs your Data Factory pipeline. You can create an event-based trigger on blob creation, blob deletion, or both in your Data Factory pipelines.

With the “Blob path begins with” and “Blob path ends with” properties, you can tell us for which containers, folders, and blob names you wish to receive events. You can also use a wide variety of patterns for both the “Blob path begins with” and “Blob path ends with” properties. At least one of these properties is required.

Examples:

Blob path begins with (/containername/) – Will receive events for any blob in the container. Blob
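
The two filter properties can be pictured as a simple prefix and suffix check on the blob path. The sketch below is an illustration in plain Python, not the Data Factory implementation; the function name `blob_event_matches` and the sample paths other than `/containername/` are hypothetical, and the real service may apply additional rules (for example around path format or case sensitivity).

```python
from typing import Optional


def blob_event_matches(
    blob_path: str,
    begins_with: Optional[str] = None,
    ends_with: Optional[str] = None,
) -> bool:
    """Return True if a blob path passes the begins-with / ends-with filters.

    Assumption: plain prefix/suffix matching; the actual service may differ.
    """
    # The source states that at least one of the two properties is required.
    if not begins_with and not ends_with:
        raise ValueError("at least one of begins_with / ends_with is required")
    if begins_with and not blob_path.startswith(begins_with):
        return False
    if ends_with and not blob_path.endswith(ends_with):
        return False
    return True


# Any blob in the container matches a container-level prefix.
in_container = blob_event_matches(
    "/containername/folder/file.csv", begins_with="/containername/"
)
# A blob in a different (hypothetical) container does not.
other_container = blob_event_matches(
    "/othercontainer/file.csv", begins_with="/containername/"
)
```

Supplying both properties narrows the filter further, since a path must then satisfy the prefix and the suffix at the same time.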

20

Jun

Microsoft deepens its commitment to Apache Hadoop and open source analytics

DATAWORKS SUMMIT, SAN JOSE, Calif., June 18, 2018 – Earlier today, Microsoft Corporation deepened its commitment to the Apache Hadoop ecosystem and its partnership with Hortonworks, which has brought the best of Apache Hadoop and open source big data analytics to the cloud. Since the start of the partnership nearly six years ago, hundreds of the largest enterprises have chosen to use Azure HDInsight and Hortonworks to run Hadoop, Spark and other open source analytics workloads on Azure. Also, during this time, Microsoft has become one of the leading committers to Apache projects, sharing its experience running one of the largest data lakes on the planet with the open source community.

Azure HDInsight

Azure HDInsight is a fully managed cluster service that enables customers to process and gain insights from massive amounts of data using Hadoop, Spark, Hive, HBase, Kafka, Storm and distributed R. Azure HDInsight offers the latest Hortonworks Data Platform (HDP) distribution and related Open Source projects on the Linux OS. The service is available in 26 public regions and Azure Government Clouds in the US and Germany.
 
The Big Data and Hadoop community has rapidly evolved over the past few years. Azure HDInsight has supported
