Review: Epiphan LiveScrypt


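Epiphan's LiveScrypt combines a hardware appliance for audio input and a cloud application for transcription. Together, they provide a polished, inexpensive, and easy-to-use solution for transcribing speech to text in real time for conferences, training, or similar sessions. You can display the text locally on a monitor, show it as closed captions in a live stream, and publish the captions on the web via a URL or QR code.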

As with all transcription services, whether machine or human, accuracy isn't perfect, but Epiphan uses Google's AI-based Speech-to-Text application programming interface, so it should improve over time. If you're wrestling with how to affordably add live transcripts to your presentations, LiveScrypt is definitely worth a look.


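The LiveScrypt hardware is a touchscreen-based appliance that feeds incoming audio from multiple sources into the cloud, where speech-to-text converts it into text. Once the audio is converted, you can display the text on the device itself, on a monitor connected via HDMI, or on a dedicated webpage that most mobile devices can access. You can also feed the text into a live-streaming application like Telestream's Wirecast or Epiphan's Pearl for display as closed captions. Once the event is over, you can download the complete transcript from the web.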

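The hardware costs $1,499.95 on Amazon, and the transcription service costs $9.95 per hour. The first 5 minutes of any presentation are free, and each hour or portion thereof is charged at the full hourly price, with no proration.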
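As a rough illustration of that billing rule, here is a minimal Python sketch. The $9.95 hourly rate and the 5-minute free allowance come from the pricing above; the function name and the reading of the rule (a session ending within the free allowance costs nothing, and anything longer is rounded up to whole hours) are my assumptions, not Epiphan's published billing logic.

```python
import math

HOURLY_RATE = 9.95   # USD per hour, as quoted in this review
FREE_MINUTES = 5     # free allowance per presentation, as quoted in this review


def session_cost(minutes: float) -> float:
    """Estimate the charge for one transcription session.

    Assumption (not confirmed by Epiphan): a session that ends within the
    free allowance costs nothing; otherwise the whole duration is rounded
    up to full hours, with no proration.
    """
    if minutes <= FREE_MINUTES:
        return 0.0
    billable_hours = math.ceil(minutes / 60)
    return round(billable_hours * HOURLY_RATE, 2)


if __name__ == "__main__":
    for minutes in (4, 45, 60, 61, 150):
        print(f"{minutes:>4} min -> ${session_cost(minutes):.2f}")
```

Under these assumptions, a 61-minute session is billed as two full hours, so it pays to keep an eye on the running timecode near the hour mark.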

Once you have the device, you log in to Epiphan's LiveScrypt web portal to enter credit card information and register the device. After that, you can run LiveScrypt entirely via its touchscreen interface or access it remotely via the portal. More on that later.


The hardware supports a very wide range of audio inputs, including two XLR inputs (with Phantom power), stereo RCA connectors, a 3.5mm audio jack, two HDMI ports, SDI audio, and two USB ports (see Figure 1, at the top of the page). Audio isn't pass-through, so you'll probably have to double up your outputs to support both live-streaming/local speakers and LiveScrypt. During my tests, I input audio through the XLR and RCA connectors, both of which worked well.

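As mentioned, you can drive the unit locally or via the web interface. Once I paired the device and connected a microphone via XLR, I was transcribing in seconds; the only hiccup was that I had to manually enable Phantom power to the condenser microphone in the software. You can see the results in Figure 2.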

Epiphan LiveScrypt web interface

Figure 2. After I paired the unit in the portal, I was transcribing within seconds.

Operationally, you start and stop transcription via controls on the upper left of the touchscreen or by using the equivalent web controls. You can see the Start button in Figure 3, and the Stop button and running timecode in Figure 2. You open the controls shown in Figure 3 using one of the three buttons on the lower right.

As you can see in Figure 3, the controls are fairly simple. The System tab contains information like the IP address and serial number. The Audio tab allows you to mute different audio inputs and set gain levels on some, but not all, inputs.

Epiphan LiveScrypt Start button

Figure 3. The onboard controls

The Transcription tab lets you choose one of the 30 languages currently supported by the system (you can see a full list at go2sm.com/livescrypt). Currently, the system supports transcription only, so if you're speaking in German, you can output only German captions. However, translation is on the development roadmap. The Transcription tab also provides options to enable the automatic insertion of punctuation and a profanity filter that converts dirty words to asterisks. The Security tab lets you set a password to operate the touchscreen and the web interface.

You'll spend most of your time in the Output tab, shown in Figure 3. This is where you configure the HDMI text output from the unit for local display or for input into a live-streaming system like Wirecast, as shown in Figure 4. You can configure this output as text only or as text plus a QR code so viewers watching a local display can retrieve the caption feed on their mobile devices.


Figure 4. Here's the transcription inserted into Wirecast.

To capture the text in Wirecast, I connected the LiveScrypt HDMI output to an Epiphan AV.io 4K USB capture device and configured the AV.io input into the cropped box appearing at the bottom of the video input. For the record, the video is from a Sennheiser microphone review from a few years ago, and the audio quality is quite good. I played the video file on a Mac notebook, which I input into the LiveScrypt unit via RCA connectors. At the same time, I played the same video on an HP notebook, which I input into Wirecast via Desktop Presenter. I recorded the presentation, which you can view with the transcription shown in Figure 4.

I was surprised that the closed-caption use case was not explicitly supported via a preset that outputs two or three lines at a short and wide output resolution. It's not hard to configure the output for captions in a system like Wirecast, but it will take some experimentation to cleanly simulate closed captions. This use case seems common enough that Epiphan should support it out of the box.

After the presentation is finished, you can download the transcription in either .srt or .txt format from the web interface (Figure 5). Note that this portal contains all of the controls available on the LiveScrypt hardware itself, so you can run the system remotely. On the upper right of Figure 5, you can see the streaming URL that displays the transcription on the web.

Epiphan LiveScrypt streaming URL for the transcription

Figure 5. AVStudio provides controls for the paired LiveScrypt unit and downloads of completed presentations.
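If you plan to repurpose a downloaded .srt transcript, the format is simple enough to parse yourself. The sketch below assumes the standard SubRip layout (a cue number, a timing line containing `-->`, then one or more text lines, with cues separated by blank lines). I did not inspect the exact contents of LiveScrypt's exported files, so the sample text and the `parse_srt` helper are purely illustrative.

```python
import re

# Made-up sample in the standard SubRip layout, not actual LiveScrypt output.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
Welcome to the session.

2
00:00:03,600 --> 00:00:06,000
Today we are testing
live transcription.
"""


def parse_srt(text: str) -> list[dict]:
    """Split SubRip text into cues with start time, end time, and text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Skip anything that lacks a number line, a timing line, and text.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start, end = (part.strip() for part in lines[1].split("-->", 1))
        cues.append({"start": start, "end": end, "text": " ".join(lines[2:])})
    return cues


if __name__ == "__main__":
    for cue in parse_srt(SAMPLE_SRT):
        print(f'{cue["start"]} -> {cue["end"]}: {cue["text"]}')
```

From here, joining the `text` fields gives you a plain transcript, while the timestamps let you line captions up against a recording.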


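To get a sense of how well LiveScrypt works, I watched some of the videos on the Epiphan website. In one of them, an Epiphan spokesperson claimed an accuracy rate of 92%, which feels about right, although some articles put Google's accuracy as high as 95%.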

For perspective, note that the accuracy of human transcribers ranges from about 95% to 98% in some articles that I found. So, perfection isn't commercially available.
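Neither Epiphan nor the articles I found spell out how these accuracy figures were measured. A common convention for speech recognition is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference, with accuracy reported as 1 minus WER. The following is a minimal sketch under that assumption, which you could use to score your own test recordings; the function name and sample sentences are mine.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count.

    Assumption: simple lowercase, whitespace-based tokenization; real
    evaluations usually also normalize punctuation and numerals.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    transcript = "the quick brown box jumps over lazy dog"
    wer = word_error_rate(reference, transcript)
    print(f"WER: {wer:.1%}, accuracy: {1 - wer:.1%}")
```

Measured this way, a 92% accuracy rate means roughly one word in twelve is wrong, missing, or extra.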

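The big question is how much accuracy is needed to be "useful." You can judge for yourself by watching the recording of the Sennheiser video shown in Figure 4.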

For most of the video, accuracy was quite good, and the text ran only a second or two behind the audio, which is impressive. In some instances, the transcription fell behind, after which the system caught up with an explosion of text that was a bit hard to follow (see around 1:24). Note that if you're working within a specific industry with unique jargon, you can enter a North American Industry Classification System (NAICS) code to increase transcription accuracy for specific terms and acronyms, which I did not try.

I should also point out that the accuracy exhibited in the test video was the best case in my testing. I also tested some recordings of Europeans speaking in lightly to heavily accented English, and the results were unusable. Of course, because Epiphan relies on Google for transcription, you should expect accuracy to improve over time for all use cases.

In this regard, one way to evaluate LiveScrypt is as a hardware device and service designed to feed high-quality audio from different sources to Google, then retrieve the transcription and make it available for flexible delivery. From this perspective, Epiphan did a great job incorporating a range of audio inputs and making operation extremely simple. The only question is whether the accuracy delivered by Google at this point meets the needs of your application.

