Category Archives : Data Science

29

Jun

Advancing Azure service quality with artificial intelligence: AIOps

“In the era of big data, insights collected from cloud services running at the scale of Azure quickly exceed the attention span of humans. It’s critical to identify the right steps to maintain the highest possible quality of service based on the large volume of data collected. In applying this to Azure, we envision infusing AI into our cloud platform and DevOps process, becoming AIOps, to enable the Azure platform to become more self-adaptive, resilient, and efficient. AIOps will also support our engineers to take the right actions more effectively and in a timely manner to continue improving service quality and delighting our customers and partners. This post continues our Advancing Reliability series highlighting initiatives underway to keep improving the reliability of the Azure platform. The post that follows was written by Jian Zhang, our Program Manager overseeing these efforts, as she shares our vision for AIOps, and highlights areas of this AI infusion that are already a reality as part of our end-to-end cloud service management.”—Mark Russinovich, CTO, Azure

This post includes contributions from Principal Data Scientist Manager Yingnong Dang and Partner Group Software Engineering Manager Murali Chintalapati.

 

As Mark mentioned when he launched this Advancing Reliability blog

Share

11

May

Office Licensing Service and Azure Cosmos DB part 1: Migrating the production workload

This post is part 1 of a two-part series about how organizations use Azure Cosmos DB to meet real world needs, and the difference it’s making to them. In part 1, we explore the challenges that led the Microsoft Office Licensing Service team to move from Azure Table storage to Azure Cosmos DB, and how it migrated its production workload to the new service. In part 2, we examine the outcomes resulting from the team’s efforts.

The challenge: Limited throughput and other capabilities

At Microsoft, the Office Licensing Service (OLS) supports activation of the Microsoft Office client on millions of devices around the world—including Windows, Mac, tablets, and mobile. It stores information such as machine ID, product ID, activation count, expiration date, and more. OLS is accessed by the Office client more than more than 240 million times per day by users around the world, with the first call coming from the client upon license activation and then every 2-3 days thereafter as the client checks to make sure the license is still valid.

Until recently, OLS relied on Azure Table storage for its backend data store, which contained about 5 TB of data spread across 18 tables—with separate tables

Share

06

May

Minecraft Earth and Azure Cosmos DB part 1: Extending Minecraft into our real world

This post is part 1 of a two-part series about how organizations use Azure Cosmos DB to meet real world needs and the difference it’s making to them. In part 1, we explore the challenges that led service developers for Minecraft Earth to choose Azure Cosmos DB and how they’re using it to capture almost every action taken by every player around the globe—with ultra-low latency. In part 2, we examine the solution’s workload and how Minecraft Earth service developers have benefited from building it on Azure Cosmos DB.

Extending the world of Minecraft into our real world

You’ve probably heard of the game Minecraft, even if you haven’t played it yourself. It’s the best-selling video game of all time, having sold more than 176 million copies since 2011. Today, Minecraft has more than 112 million monthly players, who can discover and collect raw materials, craft tools, and build structures or earthworks in the game’s immersive, procedurally generated 3D world. Depending on game mode, players can also fight computer-controlled foes and cooperate with—or compete against—other players.

In May 2019, Microsoft announced the upcoming release of Minecraft Earth, which began its worldwide rollout in December 2019. Unlike preceding games in the

Share

06

May

Minecraft Earth and Azure Cosmos DB part 2: Delivering turnkey geographic distribution

This post is part 2 of a two-part series about out how organizations are using Azure Cosmos DB to meet real world needs and the difference it’s making to them. In part 1, we explored the challenges that led service developers for Minecraft Earth to choose Azure Cosmos DB and how they’re using it to capture almost every action taken by every player around the globe—with ultra-low latency. In part 2 we examine the solution’s workload and how Minecraft Earth service developers have benefited from building it on Azure Cosmos DB.

Geographic distribution and multi-region writes

Minecraft Earth service developers used the turnkey geographic distribution feature in Azure Cosmos DB to achieve three goals: fault tolerance, disaster recovery, and minimal latency—the latter achieved by also using the multi-master capabilities of Azure Cosmos DB to enable multi-region writes. Each supported geography has at least two service instances. For example, in North America, the Minecraft Earth service runs in the West US and East US Azure regions, with other components of Azure used to determine which is closer to the user and route traffic accordingly.

Nathan Sosnovske, a Senior Software Engineer on the Minecraft Earth services development team explains:

“With Azure available

Share

16

Jan

Azure Data Explorer and Stream Analytics for anomaly detection

https://azure.microsoft.com/blog/azure-data-explorer-and-stream-analytics-for-anomaly-detection/Anomaly detection plays a vital role in many industries across the globe, such as fraud detection for the financial industry, health monitoring in hospitals, fault detection and operating environment monitoring in the manufacturing, oil and gas, utility, transportation, aviation, and READ MORE

Share

25

Nov

Multi-language identification and transcription in Video Indexer

Multi-language speech transcription was recently introduced into Microsoft Video Indexer at the International Broadcasters Conference (IBC). It is available as a preview capability and customers can already start experiencing it in our portal. More details on all our IBC2019 enhancements can be found here.

Multi-language videos are common media assets in the globalization context, global political summits, economic forums, and sport press conferences are examples of venues where speakers use their native language to convey their own statements. Those videos pose a unique challenge for companies that need to provide automatic transcription for video archives of large volumes. Automatic transcription technologies expect users to explicitly determine the video language in advance to convert speech to text. This manual step becomes a scalability obstacle when transcribing multi-language content as one would have to manually tag audio segments with the appropriate language.

Microsoft Video Indexer provides a unique capability of automatic spoken language identification for multi-language content. This solution allows users to easily transcribe multi-language content without going through tedious manual preparation steps before triggering it. By that, it can save anyone with large archive of videos both time and money, and enable discoverability and accessibility scenarios.

Multi-language audio transcription in Video

Share

18

Nov

https://azure.microsoft.com/blog/pytorch-on-azure-with-streamlined-ml-lifecycle/It’s exciting to see the Pytorch Community continue to grow and regularly release updated versions of PyTorch! Recent releases improve performance, ONNX export, TorchScript, C++ frontend, JIT, and distributed training. Several new experimental features, such as quantization, have also been introduced. At READ MORE

Share

30

Sep

Built-in Jupyter notebooks in Azure Cosmos DB are now available

Earlier this year, we announced a preview of built-in Jupyter notebooks for Azure Cosmos DB. These notebooks, running inside Azure Cosmos DB, are now available.

Cosmic notebooks are available for all data models and APIs including Cassandra, MongoDB, SQL (Core), Gremlin, and Spark to enhance the developer experience in Azure Cosmos DB. These notebooks are directly integrated into the Azure Portal and your Cosmos accounts, making them convenient and easy to use. Developers, data scientists, engineers and analysts can use the familiar Jupyter notebooks experience to:

Interactively run queries Explore and analyze data Visualize data Build, train, and run machine learning and AI models

In this blog post, we’ll explore how notebooks make it easy for you to work with and visualize your Azure Cosmos DB data.

Easily query your data

With notebooks, we’ve included built-in commands to make it easy to query your data for ad-hoc or exploratory analysis. From the Portal, you can use the %%sql magic command to run a SQL query against any container in your account, no configuration needed. The results are returned immediately in the notebook.

Improved developer productivity

We’ve also bundled in version 4 of our Azure Cosmos DB Python SDK

Share

19

Aug

https://azure.microsoft.com/blog/announcing-the-general-availability-of-python-support-in-azure-functions/Python support for Azure Functions is now generally available and ready to host your production workloads across data science and machine learning, automated resource management, and more. You can now develop Python 3.6 apps to run on the cross-platform, open-source READ MORE

Share

11

Jul

Announcing preview of Azure Data Share
Announcing preview of Azure Data Share

In a world where data volume, variety, and type are exponentially growing, organizations need to collaborate with data of any size and shape. In many cases data is at its most powerful when it can be shared and combined with data that resides outside organizational boundaries with business partners and third parties. For customers, sharing this data in a simple and governed way is challenging. Common data sharing approaches using file transfer protocol (FTP) or web APIs tend to be bespoke development and require infrastructure to manage. These tools do not provide the security or governance required to meet enterprise standards, and they often are not suitable for sharing large datasets. To enable enterprise collaboration, we are excited to unveil Azure Data Share Preview, a new data service for sharing data across organizations.

Simple and safe data sharing

Data professionals in the enterprise can now use Azure Data Share to easily and safely share big data with external organizations in Azure Blob Storage and Azure Data Lake Storage. New services will continue to come online. As a fully managed Azure service, Azure Data Share does not require infrastructure to set up and it scales to meet big data sharing demands.

Share