Subtitled television has been around for a good few decades and has never been more available than it is today – the BBC and Channel 4 provide subtitles on all output, for instance, which you can access with a mere couple of clicks of a remote.
But the process by which that text is produced remains a bit of a mystery among viewers. Live subtitling in particular can seem like a dark art, and is subject to much misconception and myth. Here are three of the most popular ones we subtitlers encounter:
“YOU MUST BE ABLE TO TYPE REALLY QUICKLY”
The first and most common thing I’m asked, and something you’d never say if you’d actually seen me type (imagine a chimp mashing a keyboard with his fists, only with less elegance). There are some subtitle typists, called stenographers, who you might also see transcribing speech in court. But the vast majority of us use voice recognition software to repeat – ‘respeak’ – what’s being said on the TV in a computer-friendly (ie robotic) monotone to produce on-screen text.
“YOU’RE NOT VERY GOOD AT IT – THERE ARE SO MANY MISTAKES!”
Voice recognition software is improving all the time but is still pretty primitive compared to human language, which is infinitely varied and always changing. A common source of errors in live subtitling is homophones – ie two or more words of different meaning that sound the same.
In fact, ‘two’ is a good example, since it sounds identical to both ‘to’ and ‘too’, yet has a totally distinct meaning. We constantly train our ‘voice models’ – essentially the computer’s vocabulary – to prevent mistakes caused by homophones like “heading in two the weekend…” But the software is unlikely ever to understand human speech 100% correctly all the time. At least, not till the day the machines rise up and take over – but by that time we’ll have bigger things to worry about than Ed Miliband’s name accidentally appearing on screen as ‘Ed Miller Band’.
“CAN’T THEY JUST PLUG A MICROPHONE INTO THE STUDIO?”
No. For essentially the same reasons as above: clever as the software is, it needs a human to help it out. We each have our own voice model because, like a pet dog, the computer likes its master’s voice.
In other words, my computer only understands me, and wouldn’t be able to comprehend with any accuracy what’s being said by a random speaker. Given the vast number of difficult accents in English, and the fact that we’re often subtitling several people at once (during a heated debate, for example), I can almost sympathise with it.
How will changes to people’s TV viewing habits (such as watching more programmes on iPlayer, or through tablet computers or smartphones) affect live subtitling? As technology improves the quality of the images we see, will subtitle viewers’ standards also change?
Let me know what you think in the comments below.
Martin Cornwell, Subtitler