This post was authored by the Microsoft Cognitive Services Team.
We understand that the commitments we make about data are essential for any organization. To give customers more control, we’ve updated our Cognitive Services terms for customer data. Here’s what this means for our customers.
On February 1, we started moving Cognitive Services under the same terms as other Azure services. Under the new terms, Cognitive Services customers own their customer data and can manage and delete it. With this change, many Cognitive Services are now aligned with the same terms that apply to other Azure services.
Terms for Computer Vision, Face, Content Moderator, Text Analytics, and Speech services have already changed, with updates coming to Language Understanding on March 1 and Microsoft Translator on May 1. As new products are added to Cognitive Services, they will align with the same standards as other Azure services, with the exception of Bing Search Services.
Bing Search Services data will continue to be treated differently than other customer data. For example, we use search queries that you provide to Bing Search Services to improve our search algorithms over time.
We are making these updates because we strive to be transparent in our privacy
Voice is the new interface driving ambient computing, and this statement has never been more true than it is today. Speech recognition is transforming our daily lives, from digital assistants and dictation of emails and documents to transcriptions of lectures and meetings. These scenarios are possible today thanks to years of research in speech recognition and the technological leaps enabled by neural networks. Microsoft is at the forefront of speech recognition, with research results reaching human parity on the Switchboard benchmark.
Our goal is to empower developers with our AI advances, so they can build new and transformative experiences for their customers. We offer a spectrum of APIs to address the various scenarios and situations developers encounter. The Cognitive Services Speech API gives developers access to state-of-the-art speech models. For premium scenarios involving domain-specific vocabulary or complex acoustic conditions, we offer the Custom Speech Service, which enables developers to automatically tune speech recognition models to their specific needs. Our services have been previewed by customers across a wide range of scenarios.
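To give a feel for working with a recognition result, here is a minimal sketch of parsing a response payload. The field names used (`RecognitionStatus`, `DisplayText`) are assumptions chosen for illustration, not a guaranteed contract; check the Speech API reference for the exact response schema.

```python
import json

# Sample payload shaped like a simple speech recognition result.
# Field names here are illustrative assumptions, not the official schema.
sample_response = """
{
  "RecognitionStatus": "Success",
  "DisplayText": "What is the weather like today?",
  "Offset": 1200000,
  "Duration": 21000000
}
"""

def extract_transcript(payload: str) -> str:
    """Return the recognized text, or an empty string when recognition failed."""
    result = json.loads(payload)
    if result.get("RecognitionStatus") != "Success":
        return ""
    return result.get("DisplayText", "")

print(extract_transcript(sample_response))
```

Guarding on the status field first means downstream code never has to special-case timeouts or empty audio.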
Speech recognition systems are composed of several components. The most important components are the acoustic and language models. If your application contains vocabulary items that occur rarely in everyday
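The role of the language model can be illustrated with a toy bigram model: given acoustically similar word sequences, the language model prefers the one that is more probable in the training text. The corpus and smoothing below are invented for the example; production systems use far larger models.

```python
# Toy bigram language model showing how a language model ranks
# competing transcription hypotheses. Data is invented for illustration.
from collections import defaultdict

corpus = "the cat sat on the mat the cat ate".split()

bigram_counts = defaultdict(int)
unigram_counts = defaultdict(int)
for w1, w2 in zip(corpus, corpus[1:]):
    bigram_counts[(w1, w2)] += 1
    unigram_counts[w1] += 1

def bigram_prob(w1, w2, vocab_size=20):
    # Add-one smoothing so unseen bigrams keep a small nonzero probability.
    return (bigram_counts[(w1, w2)] + 1) / (unigram_counts[w1] + vocab_size)

def sentence_score(words):
    score = 1.0
    for w1, w2 in zip(words, words[1:]):
        score *= bigram_prob(w1, w2)
    return score

# Two acoustically plausible hypotheses; the model prefers the word
# sequence that occurred in its training text.
likely = sentence_score("the cat sat".split())
unlikely = sentence_score("the mat sat".split())
assert likely > unlikely
```

This is exactly why rare, domain-specific vocabulary causes trouble: words the language model has never seen get only the smoothed floor probability.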
One year ago, we announced Azure IP Advantage, the industry’s leading program to help cloud service customers stay focused on their digital transformation journey and avoid IP issues. The program has been a tremendous success so far with many customers telling us that it is a key differentiator for Azure and that they choose Azure in part because of the value they get from these benefits.
Here are some of the highlights from our first year:
Customers around the world find that Azure IP Advantage has been a valuable deterrent against IP lawsuits, which is especially important as cloud-related patent litigation has increased over the past four years. Customers of our partner 21Vianet, like Mobike, the world’s largest bicycle-sharing company, headquartered in China, explain the benefits of offering IP protection programs to Azure clients. “Azure IP Advantage helps us by reducing potential IP risks as we march into new markets. From technologies to patent offerings, Microsoft is providing a comprehensive protection for us to thrive on cloud without worry.” Microsoft expanded Azure IP Advantage to China in partnership with 21Vianet, ensuring that Azure customers in China enjoy the same great IP protection benefits as customers in the rest
We are delighted to announce a new capability in Microsoft Video Indexer: Brand Detection from speech and from visual text! If you are not yet familiar with Video Indexer, you may want to take a look at a few examples on our portal.
Having brands in the video index gives you insights on the names of products and organizations that appear in a video or audio asset, without having to watch it. In particular, it enables you to search over large amounts of video and audio. Customers find brand detection useful in a wide variety of business scenarios, such as content archiving and discovery, contextual advertising, social media analysis, retail competitive analysis, and many more.
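Searching over detected brands amounts to flattening the index into queryable records. The sketch below assumes an index fragment shaped like a brand-detection result; the field names (`brands`, `appearances`, `startTime`, `source`) are illustrative assumptions, so consult the Video Indexer output schema for the real contract.

```python
import json

# Index fragment shaped like a brand-detection result.
# Field names are assumptions for illustration only.
sample_index = """
{
  "brands": [
    {"name": "Microsoft Windows",
     "appearances": [
        {"startTime": "0:02:25", "source": "transcript"},
        {"startTime": "0:02:40", "source": "ocr"}
     ]}
  ]
}
"""

def brand_appearances(index_json):
    """Flatten the index into (brand, time, source) tuples for search."""
    index = json.loads(index_json)
    return [
        (brand["name"], hit["startTime"], hit["source"])
        for brand in index.get("brands", [])
        for hit in brand.get("appearances", [])
    ]

for name, time, source in brand_appearances(sample_index):
    print(f"{name} at {time} (from {source})")
```

Once flattened, the tuples can go straight into any search index or database, letting you answer “where does brand X appear?” across a whole media library.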
Out-of-the-box brand detection
Let us take a look at an example. In this Microsoft Build 2017 Day 2 presentation, the brand “Microsoft Windows” appears multiple times: sometimes in the transcript, sometimes as visual text, and never verbatim. Video Indexer detects with high precision that a term is indeed a brand based on its context, covering over 90,000 brands out of the box, with the list constantly updated. At 02:25, Video Indexer detects the brand from speech, and then again at 02:40 from visual text, which is
Voice is becoming more and more prevalent as a mode of interaction with all kinds of devices and services. The ability to provide not only voice input but also voice output, or text-to-speech (TTS), is becoming a critical technology that supports AI. Whether you need to interact on a device, over the phone, in a vehicle, through a building PA system, or even with translated input, TTS is a crucial part of your end-to-end solution. It is also a necessity for all applications that enable accessibility.
We are excited to announce that the Speech API, a Microsoft Cognitive Service, now offers six new TTS languages to all developers, bringing the total number of available languages to 34:
- Bulgarian (language code: bg-BG)
- Croatian (hr-HR)
- Malay (ms-MY)
- Slovenian (sl-SI)
- Tamil (ta-IN)
- Vietnamese (vi-VN)
Powered by the latest AI technology, these 34 languages are available across 48 locales and 78 voice fonts. Through a single API, developers can access the latest generation of speech recognition and TTS models.
Developers can integrate this Text-to-Speech API for a broad set of use cases. It can be used on its own for accessibility, hands-free communication, media consumption, or any other machine-to-human interaction.
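TTS requests are typically expressed in SSML, the W3C Speech Synthesis Markup Language. The sketch below builds an SSML payload for one of the new locales; the voice name is a placeholder, so look up the actual voice font for your locale in the service documentation.

```python
# Build an SSML payload for a TTS request. SSML is a W3C standard;
# the voice name below is a placeholder, not a real voice font.
from xml.sax.saxutils import escape

def build_ssml(text, lang="vi-VN", voice="YOUR-VOICE-FONT-NAME"):
    return (
        f"<speak version='1.0' xml:lang='{lang}'>"
        f"<voice xml:lang='{lang}' name='{voice}'>"
        f"{escape(text)}"
        "</voice></speak>"
    )

ssml = build_ssml("Xin chào")
print(ssml)
```

Escaping the text before embedding it keeps user-supplied strings from breaking the XML, which matters as soon as input contains `&` or `<`.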
We are pleased to announce preview availability of SDKs for the Cognitive Services – Bing Search APIs. Currently available as REST APIs, the Bing APIs v7 now have SDKs in four languages: C#, Java, Node.js, and Python. These SDKs include offerings such as Bing Web Search, Bing Image Search, Bing Custom Search, Bing News Search, Bing Video Search, Bing Entity Search, and Bing Spell Check.
Here are some of the salient features of these SDKs:
- Easy to use and highly flexible in adapting to your basic application scenario.
- Encompass all the API v7 functionality, languages, and countries.
- Reduce assembly footprint through an individual SDK for each Bing offering.
- Enable development in C#, Java, Node.js, and Python.
- Provide the ability to use new or existing Bing API access keys, both free and paid.
- Well documented through samples and parameter references.
- Supported through Azure and other developer forums.
- Open source under the MIT license and available on GitHub for collaboration.
Getting Started with Bing SDKs
For C#, both NuGet packages and SDKs are available for individual Bing offerings. The best place to start is with the C# samples. These samples provide an easy-to-follow, step-by-step guide to running various application-specific scenarios through the corresponding NuGet packages.
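For comparison with the SDKs, a raw REST call to Bing Web Search v7 only needs the query parameters and the subscription-key header. The endpoint and `Ocp-Apim-Subscription-Key` header below follow the documented v7 REST interface; the key is a placeholder.

```python
# Build (but do not send) a Bing Web Search v7 request, to show what
# the SDKs wrap. Replace YOUR-ACCESS-KEY with a real key before sending.
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/search"

def build_search_request(query, key, mkt="en-US", count=10):
    params = urlencode({"q": query, "mkt": mkt, "count": count})
    return Request(
        f"{ENDPOINT}?{params}",
        headers={"Ocp-Apim-Subscription-Key": key},
    )

req = build_search_request("cognitive services", "YOUR-ACCESS-KEY")
print(req.full_url)
```

The SDKs save you exactly this boilerplate, plus response deserialization into typed objects.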
Self-service customization for speech recognition
Automatic speech recognition (ASR) is an important audio analysis feature in Video Indexer. Speech recognition is artificial intelligence at its best, mimicking the human cognitive ability to extract words from audio. In this blog post, we will learn how to customize ASR in Video Indexer to better fit specialized needs.
Before we get into technical details, let’s take inspiration from a situation we have all experienced. Try to recall your first days on a job. You can probably remember feeling flooded with new words, product names, cryptic acronyms, and ways to use them. After some time, however, you understood all these new words; you had adapted yourself to the vocabulary.
ASR systems are great, but when it comes to recognizing a specialized vocabulary, ASR systems are just like humans. They need to adapt. Video Indexer now supports a customization layer for speech recognition, which allows you to teach the ASR engine new words, acronyms, and how they are used in your business context.
How does Automatic Speech Recognition work? Why is customization needed?
This post was authored by the Azure Bot Service and Language Understanding Team.
Microsoft brings the latest advanced chatbot capabilities to developers’ fingertips, allowing them to create apps that see, hear, speak, understand, and interpret users’ needs — using natural communication styles and methods.
Today, we’re excited to announce the general availability of the Microsoft Cognitive Services Language Understanding service (LUIS) and Azure Bot Service, two top-notch AI services for creating digital agents that interact in natural ways and make sense of the surrounding environment.
Think about the possibilities: all developers, regardless of data science expertise, can build conversational AI that enriches and expands the reach of applications to audiences across a myriad of conversational channels. These apps will be able to understand natural language, reason about content, and take intelligent actions. Bringing intelligent agents to developers and organizations that do not have expertise in data science is disruptive to the way humans interact with computers in their daily lives and the way enterprises run their businesses with their customers and employees.
Through our preview journey in the past two years, we have learned a lot from interacting with thousands of customers undergoing digital transformation. We highlighted
Conversational AI, or making human and computer interactions more natural, has been a goal since technology became ubiquitous in our society. Our mission is to bring conversational AI tools and capabilities to every developer and every organization on the planet, and help businesses augment human ingenuity in unique and differentiated ways.
Today, I’m excited to announce Microsoft Azure Bot Service and Microsoft Cognitive Services Language Understanding (LUIS) are both generally available.
Azure Bot Service enables developers to create conversational interfaces on multiple channels while Language Understanding (LUIS) helps developers create customized natural interactions on any platform for any type of application, including bots. Making these two services generally available on Azure simultaneously extends the capabilities of developers to build custom models that can naturally interpret the intentions of people conversing with bots.
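In practice, a bot sends each utterance to its LUIS endpoint and acts on the returned intent and entities. The sketch below parses a payload shaped like the published LUIS response format (`topScoringIntent`, `entities`); the app, intents, and entity values are invented for illustration.

```python
import json

# Sample payload shaped like a LUIS endpoint response. The intents and
# entities are invented; the field names follow the documented format.
sample_luis_response = """
{
  "query": "book a flight to Paris tomorrow",
  "topScoringIntent": {"intent": "BookFlight", "score": 0.97},
  "entities": [
    {"entity": "paris", "type": "Location", "score": 0.91},
    {"entity": "tomorrow", "type": "builtin.datetimeV2.date"}
  ]
}
"""

def interpret(payload, threshold=0.5):
    """Return (intent, entities), or (None, []) when confidence is low."""
    result = json.loads(payload)
    top = result.get("topScoringIntent", {})
    if top.get("score", 0.0) < threshold:
        return None, []
    entities = [(e["type"], e["entity"]) for e in result.get("entities", [])]
    return top.get("intent"), entities

intent, entities = interpret(sample_luis_response)
print(intent, entities)
```

Applying a confidence threshold lets the bot fall back to a clarifying question instead of acting on a low-scoring guess.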
This announcement delivers on our AI Platform approach, providing developers and data scientists with all the tools they need to create AI applications in the cloud and on mobile devices. In November, at Connect(); 2017, we released tools to infuse AI into new and existing applications quickly and easily with updates to Azure Machine Learning (AML) including Azure IoT Edge integration, as well as new Visual Studio Tools