
Video: Audio Analysis and Machine Learning for Video

Learn more about machine learning and AI at Streaming Media's next event.

Watch Jun Heider's complete presentation, VES103. Enhancing Media with Machine Learning in 2019, in the Streaming Media Conference Video Portal.

Read the complete transcript of this clip:

Jun Heider: In my opinion, this is one of the more mature areas of machine learning for video. These services are pretty accurate when it comes to speech-to-text, from what I've seen. Obviously, they're not perfect. But I feel that they're more perfect in this regard than they are with object detection, especially when we're talking about videos in the wild.

Say I'm a machine learning service, for instance, maybe Video Indexer or Valossa. And I'm tuning my models and they're going to probably cover 80%. But there are going to be those videos that they're not expecting, and they haven't been properly tuned to. So what I would say now is that speech-to-text and translation are actually pretty good so far.

Translation builds on top of that. In addition to being able to get a transcript, you can then take that transcript and translate it to other languages. Here's another cool example. There are sounds in this world that aren't just speech. So we have, you know, bird sounds. We have applause, we have music. Things like that. Certain services can actually tell you what other audio is happening within the video as well. That's useful.

What you see on the left is Valossa and what their JSON looks like. I ran a video through it, and it detected applause. In the Fauna category, it detected a pet sound, probably a dog bark or something.
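As a rough illustration of working with this kind of output, here is a minimal sketch that pulls the non-speech audio detections out of a metadata file and groups them by category. The JSON layout, field names, and the `audio_events` and `labels_by_category` helpers are assumptions invented for this example; they are not the actual Valossa schema, so check the service's documentation before relying on any of them.

```python
import json

# Hypothetical metadata layout, invented for illustration only.
# It is NOT the real Valossa schema.
SAMPLE = """
{
  "detections": [
    {"type": "audio.context", "category": "Human", "label": "applause",
     "start": 12.0, "end": 15.5},
    {"type": "audio.context", "category": "Fauna", "label": "pet sound",
     "start": 40.2, "end": 41.0},
    {"type": "visual.object", "category": "Object", "label": "chair",
     "start": 0.0, "end": 60.0}
  ]
}
"""


def audio_events(metadata_json):
    """Return only the detections whose (assumed) type marks them as audio."""
    data = json.loads(metadata_json)
    return [d for d in data["detections"] if d["type"].startswith("audio.")]


def labels_by_category(events):
    """Group detection labels under their category, keeping input order."""
    grouped = {}
    for event in events:
        grouped.setdefault(event["category"], []).append(event["label"])
    return grouped


if __name__ == "__main__":
    for category, labels in labels_by_category(audio_events(SAMPLE)).items():
        print(category, labels)
```

Filtering on a type prefix keeps the visual detections out of the result, which mirrors the point in the talk: the audio track carries its own signals beyond the transcript.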

Then there's Video Indexer from Azure Media Services. They have this really cool speaker statistics that they give you. Let's say you have some kind of like training system where you're teaching toastmasters, and you want them to be able to have a two-way conversation. You can leverage these statistics to know who's the one that's talking and not letting anyone else talk.

In this case, it's me, because I'm sitting up here talking to all of you and you're all quiet. But speaker statistics are pretty interesting.
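To make the idea concrete, here is a minimal sketch of how talk-time share could be computed once a service has told you who spoke when. The `(speaker, start, end)` segment format, the function names, and the 0.8 threshold are assumptions for this example, not the Video Indexer output format.

```python
from collections import defaultdict


def talk_share(segments):
    """Return each speaker's fraction of total talk time.

    segments: iterable of (speaker, start_seconds, end_seconds) tuples.
    This input shape is a hypothetical one chosen for the example.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += max(0.0, end - start)
    total = sum(totals.values())
    if total == 0:
        return {}
    return {speaker: seconds / total for speaker, seconds in totals.items()}


def dominant_speaker(segments, threshold=0.8):
    """Return the speaker holding at least `threshold` of the talk time, or None."""
    shares = talk_share(segments)
    if not shares:
        return None
    speaker = max(shares, key=shares.get)
    return speaker if shares[speaker] >= threshold else None


if __name__ == "__main__":
    talk = [("presenter", 0.0, 90.0), ("audience", 90.0, 100.0)]
    print(talk_share(talk))
    print(dominant_speaker(talk))
```

In the conference scenario described above, the presenter would hold nearly all of the talk time, so a check like `dominant_speaker` would flag them; in a two-way conversation exercise, it would ideally return `None`.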

There's also sentiment analysis as well. Sentiment analysis is where you analyze how much happiness or sadness is in a video. Or how positive or negative is a particular point in the video. Valossa has a pretty cool visualizer within their UI where you can get some sentiment.

AMS Video Indexer does positive and negative, and I think they just recently started putting some emotions in there as well.

Watson Video Enrichment has been doing sentiment analysis for a while. So what you can see here in the bottom right is joy, sadness, anger, fear, and disgust.
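As a sketch of what you might do with per-segment emotion scores like these, the following picks the strongest emotion at each point and averages the scores across a video. The five emotion names come from the talk; the timeline format, the 0-to-1 score range, and the function names are assumptions for illustration and do not reflect any particular service's output.

```python
# Emotion labels as named in the talk.
EMOTIONS = ("joy", "sadness", "anger", "fear", "disgust")


def dominant_emotion(scores):
    """Return the emotion with the highest score; missing emotions count as 0."""
    return max(EMOTIONS, key=lambda emotion: scores.get(emotion, 0.0))


def average_scores(timeline):
    """Average each emotion over a list of (timestamp, scores) entries.

    The timeline shape is hypothetical and chosen only for this example.
    """
    if not timeline:
        return {emotion: 0.0 for emotion in EMOTIONS}
    return {
        emotion: sum(scores.get(emotion, 0.0) for _, scores in timeline) / len(timeline)
        for emotion in EMOTIONS
    }


if __name__ == "__main__":
    timeline = [
        (0.0, {"joy": 0.8, "sadness": 0.1}),
        (10.0, {"joy": 0.2, "anger": 0.6}),
    ]
    for timestamp, scores in timeline:
        print(timestamp, dominant_emotion(scores))
    print(average_scores(timeline))
```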

Related Articles

What Is Machine Learning as a Service?

RealEyes Media CTO Jun Heider discusses MLaaS and how to leverage it in this clip from his Video Engineering Summit presentation at Streaming Media East 2019.

Video: How to Make MLaaS Work for You

Will MLaaS work for you? It depends on the kind of content you have, how you're using it, and the type of results you need, as RealEyes' Jun Heider explains in this clip from his Video Engineering Summit presentation at Streaming Media East 2019.

Video: What's the Difference Between Machine Learning and AI?

Microsoft's Andy Beach and IBM/Watson Media's Ethan Dreilinger break down the differences between machine learning and AI in this clip from their panel at Streaming Media West 2018.

Video: Key Considerations When Choosing a Video AI Platform

RealEyes Director of Technology Jun Heider discusses the importance of internal self-assessment and which use-case elements to consider when choosing a platform for video AI in this clip from Streaming Media East 2018.

Video: Tips for Getting Started with Video AI Platforms

RealEyes Director of Technology Jun Heider outlines the first steps in choosing an AI platform in this clip from his presentation at Streaming Media East 2018.