Watermarking vs. Fingerprinting: How to Synch Companion Apps


What turns a tablet or smartphone into a “companion device” when you’re watching TV?

What provides the link between the two, and makes sure that the content is relevant to what you’re watching? Automatic Content Recognition, or ACR, is one answer.

ACR is a catch-all name for a bunch of technologies that allow a second screen device to synchronise with what you’re watching. The two main methods used are Audio Watermarking and Audio Fingerprinting, and which is most suited depends on a couple of factors.

Audio Watermarking

On every British banknote there’s a watermark of the Queen’s head. You can’t really see it unless you know how to look for it. That’s the idea behind audio watermarking as well.

An extra audio signal is mixed in with the programme soundtrack. By exploiting the way the human ear masks quiet sounds that sit close to louder ones, it's quite easy to hide information that the listener can't detect but a microphone still can.

Depending on what technique you use, there is enough space to hide a programme identifier and some time information, or a trigger code, without it being noticeable or affecting the quality. As long as the application can pick up the watermark signal, it can tell you what you’re watching and how far through it you are. Fast-forward, pause or start again, and it synchs quickly.
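The embedding idea can be sketched with a toy spread-spectrum scheme. Everything here (the block size, the amplitude, the shared seed) is my own invented illustration, not any broadcaster's actual system: a pseudo-random ±1 "chip" sequence, known to both sides, is added to each block of samples, and its sign carries one payload bit. The detector regenerates the same sequence and correlates it against each block to read the bits back. A real system would shape the watermark below audibility with a psychoacoustic model; this sketch uses an exaggerated amplitude so the correlation maths is easy to see.

```python
import random

BLOCK = 1024   # samples carrying one payload bit
ALPHA = 0.1    # watermark amplitude (exaggerated here; real systems keep it inaudible)
SEED = 42      # shared secret between embedder and detector

def chips(n, seed=SEED):
    """Pseudo-random +/-1 sequence shared by embedder and detector."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(samples, bits):
    """Add +chips for a 1 bit, -chips for a 0 bit, one block per bit."""
    c = chips(BLOCK)
    out = list(samples)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for j in range(BLOCK):
            out[i * BLOCK + j] += sign * ALPHA * c[j]
    return out

def detect(samples, nbits):
    """Correlate each block against the chip sequence; the sign gives the bit."""
    c = chips(BLOCK)
    bits = []
    for i in range(nbits):
        corr = sum(samples[i * BLOCK + j] * c[j] for j in range(BLOCK))
        bits.append(1 if corr > 0 else 0)
    return bits
```

The detector never needs the original soundtrack: the host audio is roughly uncorrelated with the pseudo-random chips, so the embedded term dominates the correlation even though each individual sample change is tiny.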

Audio Fingerprinting

A human fingerprint is a representation of the pattern of ridges and furrows that is unique to the bearer. Likewise, an audio fingerprint is a representation of the frequencies and amplitudes of the content that it describes.

If you analyse an audio soundtrack, you can reduce these frequencies and amplitudes into a set of identifiers, and the relationship between them creates a signature unique to that piece of content. If you then capture that same soundtrack on your tablet or smartphone's microphone and perform the same analysis, you can match the two results and figure out what you're watching.

It doesn’t matter whether it’s louder or quieter, or whether there’s a bit of background noise, as long as the relationship between the components is still the same.

Which is better?

Audio fingerprinting is great when you don’t want to play around with the source material, or you’re not allowed to, but it does have its limitations.

It relies on the content being unique, and that's not always the case. The opening credits of a TV programme are usually the same, so the companion app might have no idea which episode it is until you're well into it. Highlights and compilation shows, which reuse existing footage, aren't going to fare well either.

Audio watermarking is more definitive, but it does mean that you need to change the source audio, and not everyone is happy with that. And what if someone has already put a watermark into the content, for example to track online piracy or measure audience numbers? One watermark isn’t audible, but keep adding them and it soon becomes annoying.

Will one technology win out in the end, or will more emerge? Will companion apps continue to draw viewers into a more lean-forward TV experience? Let me know what you think in the comments below.

Tim Davis, Senior Technologist