Anyone who has worked in digital video has likely stumbled upon the same problem: codecs. Just what are they? Why won’t my video play? Why does Final Cut take hours to transcode? Why can’t it “just work”? Why can’t it be easier?
This post attempts to answer some of those questions. It is by no means comprehensive or thorough – my goal is simply to explain, in very basic terms, why codecs are even needed in the first place.
Fear of Technology
I have noticed that for many users the usual reaction to technical computer problems such as this is to simply walk away. “It’s just too complicated,” they assume. They’d rather call their geek friend for advice whenever problems inevitably come up.
There is some wisdom to that – new technologies and standards are emerging at a blistering pace. Any knowledge or technique that you acquire today with regard to video will likely have to be updated in a couple of years.
The basics, however, only very rarely change. Let’s put aside the specifics for a second and think about what codecs are. That will make it easier to understand why they are needed in the first place.
Film vs. Digital
We have to start with the difference between film photography and photography done with digital cameras. Sometimes the resulting images can look remarkably similar, but most people know the difference.
It is in the medium: in film, a light-sensitive material is exposed for a brief period of time by the camera’s shutter to the light reflecting off the subject being photographed. The film is literally physically marked by the presence of the subject in front of the lens. Hence indexicality – the idea that a photograph is physical proof of existence or at least of having been part of the process of photography. Once the film is printed, the reverse process (projection) can be used to reconstitute the image. This is what they call the magic of cinema. Looking at a photograph is, in a way, using the apparatus of photography to connect to a dead image of something.
Digital is the same process, but instead of a light-sensitive physical material, a sensor is used. A sensor detects light and converts it to electric signals, which are then recorded by a computer as numbers. A sensor is essentially a kind of translator – it takes light and converts it into computer language. Whether or not the magic is still there is something people debate a lot.
That translation from light to computer language is why codecs are required.
Image vs. Code
Essentially, when you watch digital video, you are watching something that is seen as an image, but which in reality is computer data: a long series of numbers, stored and moved around as electric signals.
Which is also to say, cameras are no longer cameras: they are computers. The camera’s primary purpose is calculation and not photography.
All this data takes up space. Video files are by nature very large, and this is where codecs come in. A codec (short for coder-decoder) compresses the video, repacking or discarding some of the data the camera produced while (hopefully) maintaining a high-quality image.
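To get a feel for just how large, here is a rough back-of-the-envelope sketch. The numbers are my own assumptions for illustration (a 1920×1080 picture, 30 frames per second, 24 bits per pixel, no audio), not the specs of any particular camera, and `uncompressed_rate` is just a helper name I made up.

```python
def uncompressed_rate(width, height, fps, bits_per_pixel=24):
    """Return the raw, uncompressed video data rate in bytes per second."""
    bits_per_second = width * height * fps * bits_per_pixel
    return bits_per_second // 8


# Assumed example: Full HD, 30 frames per second, 8 bits each for red, green, blue.
rate = uncompressed_rate(1920, 1080, 30)
per_minute = rate * 60

print(f"{rate / 1e6:.1f} MB per second")
print(f"{per_minute / 1e9:.1f} GB per minute")
```

Under those assumptions, a single minute of footage runs to several gigabytes before any compression, which is why almost nothing is stored or delivered that way.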
There are lossy codecs and lossless codecs. Lossy codecs permanently discard part of the information in the file created by the camera, which can result in poorer image quality. Lossless codecs keep all of that information, resulting in higher data rates. Sometimes, this higher data rate translates into higher image quality. Sometimes, it simply means that there is more information in the footage to work with, giving imaging professionals more latitude for their work. Counterintuitively, this data-rich image can sometimes look worse to our eyes – that’s because it hasn’t been put through the post-production process yet, where all the final settings are dialed in by a colorist.
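The difference between lossless and lossy can be shown with a toy example. This is only a sketch: real video codecs are far more sophisticated, and here I am standing in for them with Python's general-purpose `zlib` compression plus a crude "throw away the fine detail" step. The `pixels` data is invented for the illustration.

```python
import zlib

# A made-up row of 8-bit brightness values: a slow gradient with fine detail on top.
pixels = bytes((i // 4 + (i * 7) % 5) % 256 for i in range(4096))

# Lossless: compress, decompress, and get back exactly what we started with.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)
lossless_ok = restored == pixels

# Lossy (toy version): zero out the 4 least significant bits of every value
# before compressing. That detail is gone for good, but the result is simpler
# and so compresses further.
quantized = bytes(p & 0xF0 for p in pixels)
packed_lossy = zlib.compress(quantized)
restored_lossy = zlib.decompress(packed_lossy)
lossy_ok = restored_lossy == pixels

# Largest difference between an original value and its lossy counterpart.
max_error = max(abs(a - b) for a, b in zip(pixels, restored_lossy))

print("original size:  ", len(pixels))
print("lossless size:  ", len(packed), "identical:", lossless_ok)
print("lossy size:     ", len(packed_lossy), "identical:", lossy_ok)
print("max pixel error:", max_error)
```

The lossless version comes back bit for bit, while the lossy version is smaller but can never be turned back into the original. That trade is the heart of every codec decision.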
Hence the multitude of different codecs: different images and different projects have different requirements. A Hollywood production, for instance, wants the best image possible. They want all of the information coming out of the sensor to be present when they finish the movie, giving them the liberty to change colors, lighting levels, etc. Professionals working at this level can afford hardware that can deal with very large files.
A consumer, on the other hand, probably wants a video file that is highly portable and compatible with everything while still looking good to them. There are codecs for that as well.
Video being streamed on the internet requires codecs that keep file sizes small while still providing high-quality video. Those are most likely lossy. Video displayed in a movie theater can take advantage of more robust hardware and bandwidth. There is, of course, a major cost component to all of this. Cheaper cameras can only deal with very small data streams, and hence require a lot of compression. Expensive cameras with robust designs can accommodate a lot more data.
In short, there is no such thing as “the best codec.” It entirely depends on what your needs are.
There are also codecs in different classes suited to different applications. Capture codecs, as the name implies, are codecs suited to capturing video. Editing codecs, like Apple ProRes, are suited to editing. These are usually very specific proprietary formats that the consumer doesn’t have to worry about.
Here’s a very simple metaphor that gets the point across quickly: different vehicles serve different purposes. Sometimes all you need is something to get around town: a Vespa will do. It will go as fast as 40 mph and it will get you to the store, if that is all you want. But sometimes you need to get an NFL team across the country. For that you might use a bus, or you might use a plane.
Codecs are the same – sometimes you need a Vespa and sometimes you need a plane.
It goes without saying that this is a gross simplification. It is, nonetheless, a good place to start. If you’d like more information, this video lecture is a good next step.
A Warning
Finally, a word of caution: while it is true that technology is complicated, it is imperative that we remain educated. In the “Postscript on the Societies of Control,” Deleuze shows that we have shifted from a world of discipline, in which our bodies were beaten into submission, into a world of control, where computers subtly limit our range of possibilities. Computers suggest certain modes of living (think about the Uberization of the world), and the less we know about how our computers work, the less we know what is really happening. Knowing how things work is the only way in which we can resist.
The easier something is to use, the fewer permutations are possible. The less we understand about how computers work, the easier it is for something to be done behind our backs. Though it is very convenient to have computers that “just work,” we must remember that these are complex tools at the center of our lives.