The most basic function of any live subtitling operation is to input the dialogue text, and in the early days, the humble QWERTY keyboard was our only choice. Those early live subtitlers would wear out a typical keyboard in a few weeks, and the clattering of keys as they fought to keep up with news bulletins was a familiar sound. The technique sought to combine fast and accurate typing skills with an ability to précis the dialogue “on the fly”, as even the most accomplished subtitlers could manage less than half the typical dialogue rate of a news broadcast, which runs at over 180 words per minute.
Of course, despite the incredible skills of those early pioneers, a great deal of the dialogue was simply lost through the need to précis. The lack of truly verbatim subtitles led to concerns from the deaf community, who felt they were being disadvantaged, so we had to look for other ways.
So we turned to various so-called “high speed keyboard” technologies. The most obvious of these were the Palantype and Stenograph systems, used extensively at the time for verbatim court reporting. Both used machine shorthand technology, and “chording” keyboards, where an operator could input entire words by pressing combinations of keys on a highly specialised keyboard. The Palantype system had been developed for deaf MP Jack Ashley to assist him in the House of Commons and for a while both systems were candidates for live subtitling. In the end, it was the US-born Stenograph system that finally made it as a practical live subtitling tool, using specialist software packages to manage the complex text conversions and dictionary management.
Stenography is an extremely skilled art, with subtitlers taking around two years to become proficient in live subtitling, so some broadcasters looked at lower-cost alternatives, the most popular of these being Velotype. Developed in Holland, Velotype was also a chording keyboard, but used syllables, rather than the phonetic system of Stenograph, as the basis of its shorthand input.
Eventually, however, most UK broadcasters adopted some form of Stenography, at least for subtitling of live news, with skilled subtitlers finally able to match the newsreader’s speaking rate and deliver truly verbatim live subtitles for the first time.
It was not until around the turn of the century that speaker-dependent voice recognition (VR) became achievable and affordable for live subtitling, using a subtitler to respeak the real-time dialogue. Initially, word rates were limited by software and PC capabilities, and the early programming covered was somewhat leisurely-paced, such as BBC’s Snooker and C4’s Cricket. The technology, and the achievable word rate, improved steadily. Coupled with the less demanding subtitler training curve, this has made respeaking VR almost universal as the preferred input method for live subtitling, capable of handling virtually all programming requirements.
Of course, the Holy Grail of subtitling driven by automatic speech recognition (ASR) directly from the programme audio still remains out of reach. I started saying that the technology was “Ten years away” around 40 years ago! Some believe that 100% accuracy will never be achieved by a practical ASR subtitling system, but I think it’s tantalisingly close now! The challenges of concurrent speakers, background noise, music and speaker dialects still remain broadly to be overcome. Another ten years perhaps? One thing is certain though — I shall be retired by then!
John Hedger, Consultant Project Manager, Access Services