Category Archives : Cognitive Services



Beyond the printed form: Unlocking insights from documents with Form Recognizer

Data extraction from printed forms is by now a tried-and-true technology. Form Recognizer extracts key-value pairs, tables, and text from documents such as W2 tax statements, oil and gas drilling well reports, completion reports, invoices, and purchase orders. However, real-world businesses often rely on a variety of documents for their day-to-day needs that are not always cleanly printed.

We are excited to announce the addition of handwritten and mixed-mode (printed and handwritten) support. Starting now, handling handwritten and mixed-mode forms is the new norm.

Extracting data from handwritten and mixed-mode content with Form Recognizer

Entire data sets that were inaccessible in the past due to the limitations of extraction technology now become available. The handwritten and mixed-mode capability of Form Recognizer is available in preview and enables you to extract structured data out of handwritten text filled in forms such as:

Medical forms: New patient information, doctor notes.
Financial forms: Account opening forms, credit card applications.
Insurance: Claim forms, liability forms.
Manufacturing forms: Packaging slips, testing forms, quality forms.
And more.

By using our vast experience in optical character recognition (OCR) and machine learning for form analysis, our experts created a state-of-the-art solution that goes beyond printed forms. The OCR technology




New ways to train custom language models – effortlessly!

Video Indexer (VI), the AI service for Azure Media Services, enables the customization of language models by allowing customers to upload examples of sentences or words belonging to the vocabulary of their specific use case. Since speech recognition can sometimes be tricky, VI enables you to train and adapt the models for your specific domain. Harnessing this capability allows organizations to improve the accuracy of the Video Indexer generated transcriptions in their accounts.

Over the past few months, we have worked on a series of enhancements to make this customization process even more effective and easy to accomplish. Enhancements include automatically capturing any transcript edits done manually or via API as well as allowing customers to add closed caption files to further train their custom language models.

The idea behind these additions is to create a feedback loop where organizations begin with a base out-of-the-box language model and improve its accuracy gradually through manual edits and other resources over a period of time, resulting in a model that is fine-tuned to their needs with minimal effort.

Accounts’ custom language models and all the enhancements this blog shares are private and are not shared between accounts.

In the following sections I




Conversational AI updates for July 2019

At Build, we highlighted a few customers who are building conversational experiences using the Bot Framework to transform their customer experiences. For example, BMW discussed its work on the BMW Intelligent Personal Assistant to deliver conversational experiences across multiple canvases by leveraging the Bot Framework and Cognitive Services. LaLiga built their own virtual assistant which allows fans to experience and interact with LaLiga across multiple platforms.

With the Bot Framework release in July, we are happy to share the release of Bot Framework SDK 4.5 and a preview of 4.6, updates to our developer tools, and new channels in Azure Bot Service. We’ll use the opportunity to provide additional updates for the Conversational AI releases from Microsoft.

Bot Framework channels

We continue to expand channel support and functionality for Bot Framework and Azure Bot Service.

Voice-first bot applications: Direct Line Speech preview

The Microsoft Bot Framework lets you connect with your users wherever they are. We offer thirteen supported channels, including popular messaging apps like Skype, Microsoft Teams, Slack, Facebook Messenger, Telegram, and Kik, as well as a growing number of community adapters.

Today, we are happy to share the preview of the Direct Line Speech channel. This is a new channel




Enable receipt understanding with Form Recognizer’s new capability

One of the newest members of the Azure AI portfolio, Form Recognizer, applies advanced machine learning to accurately extract text, key-value pairs, and tables from documents. With just a few samples, it tailors its understanding to supplied documents, both on-premises and in the cloud. 

Introducing the new pre-built receipt capability

Form Recognizer focuses on making it simpler for companies to use the information latent in business documents such as forms. Now we are making it easier to handle one of the most commonplace documents in a business, receipts, “out of the box.” Form Recognizer’s new pre-built receipt API identifies and extracts key information on sales receipts, such as the time and date of the transaction, merchant information, the amounts of taxes, totals, and more, with no training required.

Streamlining expense reporting

Business expense reporting can be cumbersome for everyone involved in the process. Manually filling out and approving expense reports is a significant time sink for both employees and managers. Aside from productivity lost to expense reporting, there are also pain points around auditing expense reports. A solution to automatically extract merchant and transaction information from receipts can significantly reduce the manual effort of reporting and auditing expenses.





Leveraging complex data to build advanced search applications with Azure Search

Data is rarely simple. Not every piece of data we have can fit nicely into a single Excel worksheet of rows and columns. Data has many diverse relationships, such as the multiple locations and phone numbers for a single customer or the multiple authors and genres of a single book. Of course, relationships are typically even more complex than this, and as we start to leverage AI to understand our data, the additional learnings we get only add to the complexity of relationships. For that reason, expecting customers to flatten the data so it can be searched and explored is often unrealistic. We heard this often, and it quickly became our number one most requested Azure Search feature. Because of this, we were excited to announce the general availability of complex types support in Azure Search. In this post, I want to take some time to explain what complex types support adds to Azure Search and the kinds of things you can build using this capability.
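To make the flattening problem concrete, here is a minimal sketch in plain Python. It does not use the Azure Search API; the customer record, the field names, and the `flatten` and `phones_in_city` helpers are all hypothetical and exist only to illustrate the point. It shows that once nested locations are collapsed into parallel flat collections, the link between a city and its phone numbers is no longer recoverable.

```python
from typing import Any

# Hypothetical customer record with nested locations (illustrative data only).
customer: dict[str, Any] = {
    "name": "Contoso",
    "locations": [
        {"city": "Seattle", "phones": ["555-0100", "555-0101"]},
        {"city": "Austin", "phones": ["555-0199"]},
    ],
}


def flatten(doc: dict[str, Any]) -> dict[str, Any]:
    """Collapse nested locations into flat, parallel collections."""
    return {
        "name": doc["name"],
        "cities": [loc["city"] for loc in doc["locations"]],
        "phones": [p for loc in doc["locations"] for p in loc["phones"]],
    }


def phones_in_city(doc: dict[str, Any], city: str) -> list[str]:
    """Answer a question that needs the nested structure to be intact."""
    return [
        phone
        for loc in doc["locations"]
        if loc["city"] == city
        for phone in loc["phones"]
    ]


flat = flatten(customer)
austin_phones = phones_in_city(customer, "Austin")
```

The flat record still lists every city and every phone number, but nothing in it says which phone belongs to which city. The nested record answers that question directly, which is the kind of relationship a flat rows-and-columns layout cannot express.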

Azure Search is a platform as a service that helps developers create their own cloud search solutions.

What is complex data?

Complex data consists of data that includes hierarchical or nested substructures that do




Introducing next generation reading with Immersive Reader, a new Azure Cognitive Service

This blog post was authored by Tina Coll, Senior Product Marketing Manager, Azure Marketing.

Today, we’re unveiling the preview of Immersive Reader, a new Azure Cognitive Service in the Language category. Developers can now use this service to embed inclusive capabilities into their apps for enhancing text reading and comprehension for users regardless of age or ability. No machine learning expertise is required. Based on extensive research on inclusivity and accessibility, Immersive Reader’s features are designed to read the text aloud, translate, focus user attention, and much more. Immersive Reader helps users unlock knowledge from text and achieve gains in the classroom and office.

Over 15 million users rely on Microsoft’s immersive reading technologies across 18 apps and platforms including Microsoft Learning Tools, Word, Outlook, and Teams. Now, developers can deliver this proven literacy-enhancing experience to their users too.

People like Andrzej, a child with dyslexia, have learned to read with the Immersive Reader experience embedded into apps like Microsoft Learning Tools. His mother, Mitra, shares their story:

Literacy is key to unlocking knowledge and realizing one’s potential. Educators see this reality in the classroom every day, yet hurdles to reading are commonplace for people with dyslexia, ADHD,




Using Azure Search custom skills to create personalized job recommendations

This blog post was co-authored by Kabir Khan, Software Engineer II, Learning Engineering Research and Development.

The Microsoft Worldwide Learning Innovation lab is an idea incubation lab within Microsoft that focuses on developing personalized learning and career experiences. One of the recent experiences that the lab developed focused on offering skills-based personalized job recommendations. Research shows that job search is one of the most stressful times in someone’s life. Everyone remembers at some point looking for their next career move and how stressful it was to find a job that aligns with their various skills.

Harnessing Azure Search custom skills together with our library of technical capabilities, we were able to build a feature that offers personalized job recommendations based on identified capabilities from resumes. The feature parses a resume to identify technical skills (highlighted and checkmarked in the figure below). It then ranks jobs based on the skills most relevant to the capabilities in the resume. The UI layout also lets the user view the gaps in their skills (non-highlighted skills in the figure below) for jobs they’re interested in and work towards building those skills.

Figure one: Worldwide Learning Personalized Jobs
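The ranking and gap view described above can be pictured with a small sketch. This is not the lab's implementation and it does not call Azure Search. It assumes skills have already been extracted as sets of strings, and it uses a simple overlap ratio as a stand-in scoring rule. The `rank_jobs` function, the skill names, and the job titles are all hypothetical.

```python
def rank_jobs(
    resume_skills: set[str],
    jobs: dict[str, set[str]],
) -> list[tuple[str, float, set[str]]]:
    """Return (job title, match score, missing skills), best match first.

    The score is the fraction of a job's required skills found in the resume.
    This is an assumed, simplified scoring rule for illustration only.
    """
    ranked = []
    for title, required in jobs.items():
        matched = resume_skills & required
        score = len(matched) / len(required) if required else 0.0
        gaps = required - resume_skills  # skills the user could work towards
        ranked.append((title, score, gaps))
    ranked.sort(key=lambda entry: entry[1], reverse=True)
    return ranked


# Illustrative inputs only.
resume = {"python", "sql", "azure"}
postings = {
    "Data Engineer": {"python", "sql", "spark"},
    "Cloud Developer": {"azure", "python"},
    "Web Designer": {"css", "figma"},
}

recommendations = rank_jobs(resume, postings)
```

Each entry carries both the score used for ordering and the set of missing skills, so the same result can drive the ranked list and the skill-gap display.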




Using Text Analytics in call centers

Azure Cognitive Services provides Text Analytics APIs that simplify extracting information from text data using natural language processing and machine learning. These APIs wrap pre-built language processing capabilities, for example, sentiment analysis, key phrase extraction, entity recognition, and language detection.

Using Text Analytics, businesses can draw deeper insights from interactions with their customers. These insights can be used to create management reports, automate business processes, perform competitive analysis, and more. One area that can provide such insights is recorded customer service calls, which can provide the necessary data to:

Measure and improve customer satisfaction
Track call center and agent performance
Look into performance of various service areas

In this blog, we will look at how we can gain insights from these recorded customer calls using Azure Cognitive Services.

Using a combination of these services, such as Text Analytics and Speech APIs, we can extract information from the content of customer and agent conversations. We can then visualize the results and look for trends and patterns.
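As a rough sketch of how such a combination might be wired together, the example below passes transcription and sentiment scoring in as plain functions and averages the scores per agent. It does not call the real Speech or Text Analytics APIs. `fake_transcribe` and `fake_sentiment` are toy stand-ins, and the function names, agent names, and call data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable


def average_sentiment_by_agent(
    calls: list[tuple[str, str]],
    transcribe: Callable[[str], str],
    score_sentiment: Callable[[str], float],
) -> dict[str, float]:
    """Transcribe each call, score its sentiment, and average per agent."""
    scores: dict[str, list[float]] = defaultdict(list)
    for agent, recording in calls:
        text = transcribe(recording)
        scores[agent].append(score_sentiment(text))
    return {agent: mean(values) for agent, values in scores.items()}


def fake_transcribe(recording: str) -> str:
    """Toy stand-in for a speech-to-text call: the 'recording' is already text."""
    return recording


def fake_sentiment(text: str) -> float:
    """Toy stand-in for a sentiment call: 1.0 if the caller says thanks, else 0.0."""
    return 1.0 if "thank" in text.lower() else 0.0


# Illustrative (agent, recording) pairs only.
calls = [
    ("agent_a", "Thank you, that fixed it"),
    ("agent_a", "This is still broken"),
    ("agent_b", "Thanks for the help"),
]

report = average_sentiment_by_agent(calls, fake_transcribe, fake_sentiment)
```

Keeping the two service calls behind function parameters means the aggregation logic can be tried out on sample text first and connected to real services later.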

The sequence is as follows:

Using Azure Speech APIs, we can convert the recorded calls to text.
With the text transcriptions in hand, we can then run Text Analytics APIs to gain more insight




Accelerate bot development with Bot Framework SDK and other updates




AI-first content understanding, now across more types of content for even more use cases

This post is authored by Elad Ziklik, Principal Program Manager, Applied AI.

Today, data isn’t the barrier to innovation; usable data is. Real-world information is messy and carries valuable knowledge in ways that are not readily usable and require extensive time, resources, and data science expertise to process. With Knowledge Mining, it’s our mission to close the gap between data and knowledge.

We’re making it easier to uncover latent insights across all your content with:

Azure Search’s cognitive search capability (general availability)
Form Recognizer (preview)

Cognitive search and expansion into new scenarios

Announced at Microsoft Build 2018, Azure Search’s cognitive search capability uniquely helps developers apply a set of composable cognitive skills to extract knowledge from a wide range of content. Deep integration of cognitive skills within Azure Search enables the application of facial recognition, key phrase extraction, sentiment analysis, and other skills to content with a single click. This knowledge is organized and stored in a search index, enabling new experiences for exploring the data.

Cognitive search, now generally available, delivers:

Faster performance – Improved throughput capabilities with processing speeds up to 30 times faster than in preview, completing previously hour-long tasks in only a couple of minutes.