We produce subtitles faster than real-time, but we don't process live streams.
We don’t support live subtitling. Using Artificial Intelligence (AI), we produce subtitles quickly, typically five times faster than real-time, but for live content we recommend considering specialised methods such as respeaking or velotyping.
To produce subtitles, producers can rely on computer-assisted processing, including Automatic Speech Recognition (ASR) and the automated creation of templates. To avoid a lengthy process of correcting transcription errors and adjusting subtitle timing, Limecraft has optimised its services to deliver a result that comes as close to perfection as possible.
What makes subtitles good?
Good subtitles start from a correct audio-to-text transcription, one that represents all the meaningful words in the audio as well as the punctuation marks. Good subtitles must be readable without drawing too much attention to themselves; therefore they must be set according to specific timing templates.
Most producers limit subtitles to 37 characters per line, put line breaks after punctuation marks, and ensure that each subtitle is displayed for a minimum of 2.5 seconds and a maximum of 5 seconds. The reading speed of individual subtitles is limited to 180 words per minute. Adding to the complexity, subtitles must follow the rhythm of the edit and should be aligned with the shot cuts. Parameters differ from producer to producer, but in either case there is a timing template.
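The rules above amount to a checklist that every subtitle must pass. A minimal sketch of such a check, using the example parameter values from the text (37 characters per line, 2.5 to 5 seconds on screen, 180 words per minute), could look like this; the function name and structure are illustrative, not part of any specific product:

```python
# Illustrative timing-template check with the parameter values quoted above.
MAX_CHARS_PER_LINE = 37      # characters per subtitle line
MIN_DURATION = 2.5           # minimum seconds on screen
MAX_DURATION = 5.0           # maximum seconds on screen
MAX_READING_SPEED = 180      # words per minute

def check_subtitle(lines, start, end):
    """Return a list of template violations for one subtitle."""
    violations = []
    duration = end - start
    if any(len(line) > MAX_CHARS_PER_LINE for line in lines):
        violations.append("line too long")
    if duration < MIN_DURATION:
        violations.append("displayed too briefly")
    if duration > MAX_DURATION:
        violations.append("displayed too long")
    words = sum(len(line.split()) for line in lines)
    if words / duration * 60 > MAX_READING_SPEED:
        violations.append("reading speed too high")
    return violations

# A two-line subtitle shown for 3 seconds:
print(check_subtitle(["This subtitle fits the template",
                      "and reads comfortably."], 10.0, 13.0))  # → []
```

Note that this checks only the static parameters; aligning subtitles with shot cuts additionally requires knowing where the cuts are, which is exactly what a live signal cannot provide.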
What does this have to do with live subtitling?
To optimise the accuracy of the transcription, we take the entire audio file into consideration. This allows us to make an educated guess about certain words in case of doubt. When we listen to the audio fragment as a whole, the Word Error Rate is usually lower than 5%. Starting from a live signal, the Word Error Rate would typically be 20% or higher, which is unacceptable in most circumstances.
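For reference, the Word Error Rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of words in the reference. A minimal sketch:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word in a 20-word reference gives a WER of 1/20 = 5%.
```

By this measure, a 20% live WER means roughly one word in five is wrong, which is why the result is unacceptable for broadcast.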
More importantly, to cut the subtitles according to the timing or spotting rules indicated above, we need a holistic view of the entire clip: the pace of the speech, the punctuation marks, the scene changes, and so on. When processing a live signal, we don't know in advance where the shot cuts will be or how the speech will evolve, so live subtitles cannot be properly styled.
So what are the alternatives?
We have highlighted why it is not possible to create broadcast-grade subtitles from a live signal. While Artificial Intelligence is improving at an unprecedented pace, a live system cannot look ahead, which hampers the accuracy of the transcription and makes it impossible to apply correct timing. Attempts to generate live subtitles always look clunky, so we recommend investigating workflow alternatives that eliminate the need for live subtitling.
One possibility is to stream the live content and add complementary subtitles afterwards. Alternatively, you could divide the video into segments in parallel with the event and add subtitles with a short delay.