Content creators and broadcasters are increasingly embracing the cloud’s global reach, hybrid model, and elastic scale. These attributes, combined with AI’s ability to accelerate insights and time to market across content creation, management, and monetization, are truly transformative.
At the International Broadcasting Convention (IBC) Show 2018, we are focused on bringing Cloud + AI together to help you overcome common media workflow challenges.
Video Indexer, generally available starting today, is a great example of this Cloud + AI focus. It brings together the power of the cloud and Microsoft AI to intelligently analyze your media assets, extract insights, and add metadata. It makes it easier to understand your vast content library, with more than 20 new and improved models, easy-to-use interfaces, a single API, and simplified account management. I have been part of the Video Indexer team since its inception and could not be more excited to see it reach GA. I’m also incredibly proud of the work the team has done to solve real customer problems and make AI tangible in this easy-to-use, elegant solution.
Our partners are already innovating on top of Video Indexer and extending Azure Media Services to advance the state of
Media and entertainment industry conferences are by far some of my favorites. Creativity, disruption, opportunity, and technology – particularly cloud, edge, and AI – are everywhere. It’s been exciting to see those things come together at NAB 2018, SIGGRAPH, and now IBC Show 2018. Together with teams from across Microsoft, I’m looking forward to IBC Show and the chance to learn, collaborate, and advance the state of this dynamic industry.
At this year’s IBC we’re excited to announce the general availability of Video Indexer, our advanced metadata extraction service. Announced as public preview earlier this year, Video Indexer provides a rich set of cross-channel (audio, speech, and visual) learning models. Check out Sudheer’s blog for more information on all the new capabilities including emotion detection, topic inferencing, and improvements to the ever-popular celebrity recognition model that recognizes over one million faces.
Video Indexer is just one of the ways Azure is helping customers like Endemol Shine, Multichoice, RTL, and Ericsson with their content needs. At IBC 2018, our teams are excited to share new ways that Azure, together with solutions from our partners, can address common media workflow challenges.
How? Well, read on…
More visual effects and animations mean
Earlier today, we announced the general availability (GA) of Video Indexer. This means that our customers can count on all the metadata goodness of Video Indexer to always be available for them to use when running their business. However, this GA is not the only Video Indexer announcement we have for you. In the time since we released Video Indexer to public preview in May 2018, we never stopped innovating and added a wealth of new capabilities to make Video Indexer more insightful and effective for your video and audio needs.
Delightful experience and enhanced widgets
The Video Indexer portal already includes insights and timeline panes that enable our customers to easily review and evaluate media insights. The same experience is also available in embeddable widgets, which are a great way to integrate Video Indexer into any application.
We are now proud to release revamped insight and timeline panes. The new insight and timeline panes are built to accommodate the growing number of insights in Video Indexer and are automatically responsive to different form factors.
By the way, with the new insight pane we have also added visualizations for the already existing keyframes extraction capability, as well as new
Video Indexer recently released a new and improved Video Indexer V2 API. This RESTful API supports both server-to-server and client-to-server communication and enables Video Indexer users to integrate video and audio insights easily into their application logic, unlocking new experiences and monetization opportunities.
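To give a feel for what calling a RESTful API like this looks like from application code, here is a minimal sketch in Python that only builds request URLs and sends nothing over the network. The base URL, the path shapes, and the query parameter names (`allowEdit`, `accessToken`, `name`, `videoUrl`) are assumptions for illustration and should be checked against the Video Indexer API reference. The helper names and the sample account values are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed base URL; confirm it in the Video Indexer API reference.
API_ROOT = "https://api.videoindexer.ai"


def access_token_url(location: str, account_id: str, allow_edit: bool = True) -> str:
    """Build the (assumed) URL used to request an account access token."""
    query = urlencode({"allowEdit": str(allow_edit).lower()})
    return (
        f"{API_ROOT}/auth/{quote(location)}/Accounts/"
        f"{quote(account_id)}/AccessToken?{query}"
    )


def upload_url(location: str, account_id: str, access_token: str,
               name: str, video_url: str) -> str:
    """Build the (assumed) URL used to upload and index a video by its URL."""
    query = urlencode(
        {"accessToken": access_token, "name": name, "videoUrl": video_url}
    )
    return f"{API_ROOT}/{quote(location)}/Accounts/{quote(account_id)}/Videos?{query}"


# Hypothetical values, for illustration only.
token_request = access_token_url("trial", "acct-123")
upload_request = upload_url(
    "trial", "acct-123", "TOKEN", "My video", "https://example.com/a.mp4"
)
```

In a server-to-server integration, the token request would typically carry your subscription key in a request header, and the returned token would then be passed to later calls. Keeping URL construction in small functions like these makes that flow easy to unit test without touching the service.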
To make the integration even easier, we also added new Logic Apps and Flow connectors that are compatible with the new API. Using the new connectors, you can now set up custom workflows to effectively index and extract insights from a large number of video and audio files, without writing a single line of code! Furthermore, using the connectors for your integration gives you better visibility into the health of your flow and an easy way to debug it.
To help you get started quickly with the new connectors, we’ve added Microsoft Flow templates that use the new connectors to automate extraction of insights from videos. In this blog, we will walk you through those example templates.
Upload and index your video automatically
This scenario consists of two flows that work together. The first flow is triggered when a new file is added to a designated folder in a OneDrive account. It uploads the new
For those of you who might not have tried it yet, Video Indexer is a cloud application and platform built upon media AI technologies to make it easier to extract insights from video and audio files. As a starting point for extracting the textual part of the insights, the solution creates a transcript based on the speech appearing in the file; this process is referred to as speech-to-text. Today, Video Indexer’s speech-to-text supports ten languages: English, Spanish, French, German, Italian, Chinese (Simplified), Portuguese (Brazilian), Japanese, Arabic, and Russian.
However, if the content you need is not in one of the above languages, fear not! Video Indexer partners with other transcription service providers to extend its speech-to-text capabilities to many more languages. One of those partnerships is with Zoom Media, which extends speech-to-text to Dutch, Danish, Norwegian, and Swedish.
A great example of using Video Indexer and Zoom Media is the Dutch public broadcaster AVROTROS, which uses Video Indexer to analyze videos and allow editors to search through them. Finus Tromp, Head of Interactive Media at AVROTROS, shared, “We use Microsoft Video Indexer on a daily basis to supply our videos with relevant metadata. The gathered
Developers and media companies trust and rely on Azure Media Services to encode, protect, analyze, and deliver video at scale. This week, at the Build 2018 conference in Seattle, we are proud to announce a major new API version for Azure Media Services, along with new developer-focused features and updates to Video Indexer.
Media processing at scale: Public preview of the new Azure Media Services API (v3)
Starting at Build 2018, developers can begin working with the public preview of the new Azure Media Services API (v3). The new API provides a simplified development model, enables a better integration experience with key Azure services like Event Grid and Functions, includes two new media analysis capabilities, and provides a new set of SDKs for .NET, .NET Core, Java, Go, Python, and Node.js!
We have created a set of preliminary documentation to help developers get started quickly and learn more about the new Azure Media Services preview release.
Get started with the v3 public preview: REST API, SDKs, Swagger files.
Code samples used at the Build 2018 session.
Learn more about:
How the new Transform template makes it easier to submit encoding and analysis Jobs.
How to use the
As I reflect on cloud computing and the media industry since last year’s NAB, I see two emerging trends. First, content creators and broadcasters such as Rakuten, RTL, and Al Jazeera are increasingly using the global reach, hybrid model, and elastic scale of Azure to create, manage, and distribute their content. Second, AI-powered tools for extracting insights from content are becoming an integral part of content creation, management, and distribution workflows, with customers such as Endemol Shine Group and Zone TV.
Therefore, at this year’s NAB, we are focused on helping you modernize your media workflows, so you can get the best of cloud computing and AI. We made a number of investments to enable better content production workflows in Azure, including the recent acquisition of Avere Systems. You can learn more about how Azure can help you improve your media workflows and business here.
Read on to learn more about the key advancements we’ve made – in media services, distribution and our partner ecosystem – since last year’s IBC.
Azure Media Services
In the last few years, many law enforcement agencies have adopted body-worn cameras. In this blog post, I will provide some background on what is driving the growth and will talk about how AI can help law enforcement agencies process videos captured by body-worn cameras.
Background on body-worn cameras
A body-worn camera is a wearable audio, video, or photographic recording system. Law enforcement agencies are not the only consumers of body-worn cameras. Other consumers include journalists, medical professionals, athletes, and so on. The forecast unit shipments of body-worn cameras can be seen on this webpage published by Statista.
The National Institute of Justice (NIJ), the research, development, and evaluation agency of the US Department of Justice, conducted research on body-worn cameras for law enforcement and a market survey on body-worn cameras for criminal justice. The survey, updated in 2016, aggregates and summarizes information on a number of makes and models of body-worn cameras available today, including the approximate cost of each unit. The full market survey on body-worn camera technologies can be found on NIJ’s website.
Freedom of Information Act (FOIA)
FOIA is defined on foia.gov as a law that gives citizens the right
We are delighted to announce a new capability in Microsoft Video Indexer: Brand Detection from speech and from visual text! If you are not yet familiar with Video Indexer, you may want to take a look at a few examples on our portal.
Having brands in the video index gives you insights into the names of products and organizations that appear in a video or audio asset, without having to watch it. In particular, it enables you to search over large amounts of video and audio. Customers find Brand Detection useful in a wide variety of business scenarios such as content archive and discovery, contextual advertising, social media analysis, retail competitive analysis, and many more.
Out of the box brand detection
Let us take a look at an example. In this Microsoft Build 2017 Day 2 presentation, the brand “Microsoft Windows” appears multiple times: sometimes in the transcript, sometimes as visual text, and never verbatim. Video Indexer detects with high precision that a term is indeed a brand based on the context, covering over 90k brands out of the box, and constantly updating. At 02:25, Video Indexer detects the brand from speech and then again at 02:40 from visual text, which is
Extracting insights from video using AI technologies presents an additional set of challenges and opportunities for optimization compared to images. There is a misconception that AI for video is simply extracting frames from a video and running computer vision algorithms on each frame. While you can certainly do that, it would not give you the insights that you are truly after. In this blog post, I will use a few examples to explain the shortcomings of processing only individual video frames. I will not go over the details of the additional algorithms required to overcome these shortcomings; Video Indexer implements several such video-specific algorithms.
Person presence in the video
Look at the first 25 seconds of this video.
Notice that Doug is present for the entire 25 seconds.
If I were to draw a timeline for when Doug is present in the video, it should be something like this.
Note that Doug is not always facing the camera. Seven seconds into the video, he is looking at Emily. The same thing happens at 23 seconds.
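To make the problem concrete, here is a minimal sketch in Python of what a purely per-frame approach would report. It is not how Video Indexer works. The data is hypothetical: one sampled frame per second for 25 seconds, with the face detector assumed to miss Doug at seconds 7 and 23, when he turns toward Emily. The function name `presence_intervals` and the `max_gap` parameter are made up for illustration, and the simple gap bridging shown is only a stand-in for the kind of temporal reasoning that video-specific algorithms add.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def presence_intervals(detected_at: List[float], frame_step: float,
                       max_gap: float = 0.0) -> List[Interval]:
    """Merge per-frame detection timestamps into presence intervals.

    A detection at time t is taken to cover [t, t + frame_step). Consecutive
    detections are joined when the uncovered gap between them is <= max_gap.
    """
    intervals: List[Interval] = []
    for t in sorted(detected_at):
        start, end = t, t + frame_step
        if intervals and start - intervals[-1][1] <= max_gap:
            # Close enough to the previous interval: extend it.
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], end))
        else:
            # Gap too large: start a new interval.
            intervals.append((start, end))
    return intervals


# Hypothetical detections: a frame every second, misses at 7 s and 23 s.
sampled = [float(s) for s in range(25)]
face_found = [t for t in sampled if t not in (7.0, 23.0)]

# Per-frame results alone split Doug's presence into three segments.
naive = presence_intervals(face_found, frame_step=1.0)

# Tolerating short gaps restores one continuous segment.
bridged = presence_intervals(face_found, frame_step=1.0, max_gap=2.0)

print(naive)    # [(0.0, 7.0), (8.0, 23.0), (24.0, 25.0)]
print(bridged)  # [(0.0, 25.0)]
```

The naive timeline has holes exactly where Doug looks away, even though he never leaves the shot. A fixed gap tolerance is a crude fix, since it would also merge cases where a person really does leave and come back.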
If you were to run face detection at