This post is authored by Wilson Lee, Senior Software Engineer at Microsoft.
Imagine yourself at a conference, attending a session with hundreds of other enthusiastic developers. You put away your phone to pay attention to the speaker, who is talking about the latest cutting-edge technologies. As you learn about the topic, you start to gather a list of questions you would like answered right away. But the timing never seems right to ask them. Maybe it's not Q&A time yet. Maybe you're not thrilled about speaking up in front of so many fellow attendees. Or maybe, even when you raised your hand or stood in line during the Q&A period, you weren't picked. As a result, you didn't get the full learning experience you felt you deserved.
Even as digital transformation sweeps through every business and industry, we can't help but ask: can AI improve the conference experience described above? Can today's manual interaction between speaker and attendees be infused with intelligence to create a more satisfying Q&A experience?
The core of Q&A is a conversation between the speaker and attendees, and – as conversational AI tools gain rapid popularity
This post is co-authored by Erika Menezes, Software Engineer at Microsoft, and Chaitanya Kanitkar, Software Engineer at Twitter. This project was completed as part of the coursework for Stanford’s CS231n in Spring 2018.
Ever seen someone wearing an interesting outfit and wonder where you could buy it yourself?
You're not alone – retailers the world over are trying to capitalize on something very similar. Each time a fashion blogger posts a picture on Instagram or another photo-sharing site, it's a low-cost sales opportunity. As online shopping and photo-sharing become ever more widely used, user generated content (UGC) has become pivotal to marketing strategies that drive traffic and increase sales for retailers. A key value proposition of UGC such as images and videos is its authenticity compared to professional content. However, this is also why working with UGC can be more difficult: there is much less control over how the content looks or how it was generated.
Microsoft has been applying deep learning to e-commerce visual search and inventory management through content-based image retrieval. Both efforts demonstrate solutions for the in-shop clothes retrieval task, where the query image and target catalog image are taken
This post is co-authored by Anusua Trivedi, Data Scientist, Microsoft; Patrick Buehler, Data Scientist, Microsoft; Dr. Sunil Gupta, Founder, Intelligent Retinal Imaging System (IRIS); and Jocelyn Desbiens, Researcher, IRIS.
Diabetic Retinopathy (DR) is the most common cause of blindness in the working population of the United States and Europe. The World Health Organization (WHO) predicts that the number of patients with diabetes will increase to 366 million in 2030. For patients with diabetes, early diagnosis and treatment have been shown to prevent visual loss and blindness. Automated grading of DR has potential benefits such as:
Increasing the efficiency, reproducibility, and coverage of screening programs.
Reducing barriers to access.
Improving patient outcomes by providing early detection and treatment.
To maximize the clinical utility of automated grading, an accurate algorithm to detect referable DR is needed.
Machine Learning on DR images
Machine Learning has been used in a variety of medical image classification tasks, including automated classification of DR. However, much of this work has focused on hand-engineered feature extraction, which involves computing image features specified by experts, resulting in algorithms built to detect specific lesions or to predict the presence of various levels of DR severity. Deep Learning is a
This post is co-authored by Mary Wahl, Data Scientist, Xiaoyong Zhu, Program Manager, Siyu Yang, Software Development Engineer, and Wee Hyong Tok, Principal Data Scientist Manager, at Microsoft.
Object detection powers some of the most widely adopted computer vision applications, from people counting for crowd control to the pedestrian detection used by self-driving cars. Training an object detection model can take weeks on a single GPU – a prohibitively long time for experimenting with hyperparameters and model architectures.
This blog post will show how you can train an object detection model by distributing deep learning training across multiple GPUs, on a single machine or across several machines. You will learn how to perform distributed deep learning on Azure using Horovod running on Azure Batch AI.
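Conceptually, Horovod implements data-parallel training: each GPU computes gradients on its own minibatch, and a ring-allreduce averages those gradients across workers before every weight update. The toy sketch below (plain Python, with made-up gradient values) illustrates only the averaging arithmetic of that allreduce step, not the actual communication pattern or any real Horovod API:

```python
def allreduce_average(worker_grads):
    """Average per-worker gradient vectors element-wise, as Horovod's
    ring-allreduce does (communication omitted; only the math is shown)."""
    n_workers = len(worker_grads)
    return [sum(g) / n_workers for g in zip(*worker_grads)]

# Two workers, each holding gradients for three parameters
grads = [[0.25, -0.5, 1.0],
         [0.75,  0.0, 0.0]]
avg = allreduce_average(grads)  # [0.5, -0.25, 0.5]
```

Because every worker applies the same averaged gradient, all model replicas stay in sync while each one processes only a fraction of the data per step.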
Object detection combines the task of classification with localization, outputting both a category and a set of coordinates representing the bounding box for each object that it detects in the image, as illustrated in Figure 1 below.
Figure 1. Different computer vision tasks (source)
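On the localization side, detection quality is typically scored by the intersection-over-union (IoU) between a predicted bounding box and the ground-truth box. A minimal, self-contained sketch, assuming the common `(x1, y1, x2, y2)` corner format for boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # 0 if boxes don't overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

An IoU threshold (often 0.5) then decides whether a predicted box counts as a correct detection when evaluating a model.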
Over the past few years, many exciting deep learning approaches for object detection have emerged. Models such as Faster R-CNN
This post is authored by Tara Shankar Jana, Senior Technical Product Marketing Manager at Microsoft.
Among our exciting announcements at //Build, one of the things I was thrilled to launch is the AI Lab – a collection of AI projects designed to help developers explore, experience, learn about, and code with the latest Microsoft AI Platform technologies.
What is AI Lab?
AI Lab helps our large, fast-growing community of developers get started on AI. It currently houses five projects that showcase the latest in custom vision, AttnGAN (more below), Visual Studio tools for AI, Cognitive Search, machine reading comprehension, and more. Each lab gives you access to an experimentation playground, source code on GitHub, a crisp developer-friendly video, and insights into the underlying business problem and solution. One of the projects we highlighted at //Build was the search-and-rescue challenge, which gave developers worldwide the opportunity to use AI School resources to build and deploy their first AI model for a problem involving aerial drones.
AI Lab is developed in partnership with Microsoft’s AI School and the Microsoft Research (MSR) AI organization.
AI Lab Experiments
We released the following experiments from Microsoft at //Build:
This post is authored by Erika Menezes, Software Engineer at Microsoft.
In Part 1 of this blog series, we created a recipe prediction model that predicts recipes from text input that may contain an arbitrary number of emojis. In this post we will go over how to operationalize this model as a web service exposed through a REST API, using Visual Studio Code Tools for AI. We will also show you recommended practices for operationalizing large models and ways to troubleshoot your operationalization workflow. We present a step-by-step walkthrough, from setting up your Azure Machine Learning account to exposing your ML model through a web endpoint.
To complete this tutorial, you need:
An Azure subscription. If you don’t have an Azure subscription, create a free account before you begin.
An experimentation account and Azure Machine Learning Workbench installed as described in this quickstart.
The classification model from Part 1.
A Docker engine installed and running locally. Learn more here.
Getting Started with Azure Machine Learning
Azure Machine Learning (AML) provides data scientists with a tool set that helps them experiment and deploy faster.
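Operationalizing a model as an AML web service revolves around a scoring script: an `init()` function loads the model once when the service container starts, and a `run()` function handles each incoming request. The stripped-down, stdlib-only sketch below illustrates that contract; the keyword-lookup "model" is a made-up stand-in for the real recipe classifier from Part 1:

```python
import json

MODEL = None  # loaded once at start-up, reused across requests

def init():
    """Load the model when the service starts (here, a stand-in lookup table)."""
    global MODEL
    MODEL = {"pizza": "Margherita", "noodle": "Pad Thai"}

def run(raw_data):
    """Handle one scoring request: JSON string in, JSON string out."""
    text = json.loads(raw_data)["text"].lower()
    for keyword, recipe in MODEL.items():
        if keyword in text:
            return json.dumps({"recipe": recipe})
    return json.dumps({"recipe": "unknown"})

init()
print(run(json.dumps({"text": "cheese pizza with basil"})))  # {"recipe": "Margherita"}
```

Keeping model loading in `init()` rather than `run()` matters most for large models: the expensive deserialization happens once, not on every request.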
O'Reilly and Microsoft are excited to bring you a new e-book on AI, titled A Developer's Guide to Building AI Applications.
This developer-focused book walks you through the process of building intelligent cloud-based bots, with relevant code samples available on GitHub. AI is accelerating the digital transformation of every industry on the planet, and our goal is to enable developers and organizations of all stripes to use AI successfully to augment human ingenuity and create the next generation of intelligent apps. In this new e-book, Anand Raman and Wee Hyong Tok of Microsoft provide a comprehensive roadmap for developers building their first AI-infused application.
Using the example of a “Conference Buddy”, you’ll learn the key ingredients needed to develop an intelligent chatbot – one that helps the attendees at a conference interact with speakers in a novel way.
The e-book provides a gentle introduction to the tools, infrastructure, and services in the Microsoft AI Platform that allow you to create intelligent applications. More specifically, you will learn about:
How the intersection of cloud, data and AI is enabling developers and organizations all over the world to
This post is authored by Mary Wahl, Data Scientist; Daniel Hartl and Wilson Lee, Senior Software Engineers; Xiaoyong Zhu, Program Manager; Erika Menezes, Software Engineer; and Wee Hyong Tok, Principal Data Scientist Manager, at Microsoft.
AI for Earth puts Microsoft’s cloud and AI tools in the hands of those working to solve global environmental challenges. Land cover mapping is one goal of the AI for Earth program, which was created to fundamentally change the way that society monitors, models, and ultimately manages Earth’s natural resources. The ability to perform ultra-fast land cover mapping using deep neural networks on terabytes of high-resolution aerial images from the National Agriculture Imagery Program (NAIP), provided by our partners at Esri, fuels new intelligent AI applications, delivering quick insights to land cover map users like conservation scientists. In this blog post, we share what we learned from deploying deep neural network models to field-programmable gate array (FPGA) services using Project Brainwave, and applying these FPGA services to perform land cover mapping.
Aerial Imagery Dataset Construction
We developed our benchmarking dataset using ~120 TB of NAIP aerial imagery spanning the continental U.S. at one-meter resolution at multiple timepoints. These data were provided by Esri in
This post is authored by Gopi Kumar, Principal Program Manager at Microsoft.
The Data Science Virtual Machine (DSVM), a popular VM image on the Azure marketplace, is a purpose-built cloud-based environment with a host of preconfigured data and AI tools. It enables data scientists and AI developers to iterate on developing high-quality predictive models and deep learning architectures, making them much more productive when building their AI applications. The DSVM has been offered for over two years, and during that time it has attracted a wide range of users, from small startups to enterprises with large data science teams that use the DSVM as their core cloud development and experimentation environment for building production applications and models.
Deploying AI infrastructure at scale can be quite challenging for large enterprise teams. However, Azure provides several infrastructure services that support enterprise IT needs such as security, scaling, reliability, availability, performance, and collaboration. The Data Science VM can readily leverage these Azure services to support the deployment of large-scale, enterprise team-based data science and AI environments. We have assembled guidance for an initial list of common enterprise scenarios in a new DSVM documentation section dedicated to enterprise
This post is by Ye Xing, Senior Data Scientist; Tao Wu, Principal Data Scientist Manager; and Patrick Buehler, Senior Data Scientist, at Microsoft.
The advancement of medical imaging, as in many other scientific disciplines, relies heavily on the latest advances in tools and methodologies that make rapid iterations possible. We recently witnessed this first-hand when we developed a deep learning model using the newly released Azure Machine Learning Package for Computer Vision (AML-CVP) and were able to improve upon a state-of-the-art algorithm for screening blinding retinal diseases. Our pipeline, built on AML-CVP, reduced misclassification by over 90% (from 3.9% down to 0.3%) without any parameter tuning. The deep learning model was trained in 10 minutes on 83,484 images using an Azure Deep Learning Virtual Machine equipped with a single NVIDIA V100 GPU. Thanks to the high-level AML-CVP Python API, the whole pipeline can be constructed quickly, in fewer than 20 lines of Python code.
Our work was inspired by the paper “Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning,” published in Cell, a leading medical journal, in February 2018. The paper developed a deep learning AI system to identify two vision-threatening retinal diseases – choroidal