### Color Subsampling, or What Is 4:4:4 or 4:2:2?

In the video space, there’s always a lot of talk about these number ratios – 4:4:4, or 4:2:2, or 4:1:1, but what exactly do they mean? Recently, someone argued with me that it was better to convert every video clip from my Canon Rebel T2i DSLR camera into a 4:4:4 intermediate codec before editing; that this would make the color magically “better” and that editing natively was somehow bad. They were wrong, and I’m going to explain why.

Before you read on, make sure you’ve read my earlier articles on 32-bit floating point and on YUV color, and look at the picture from the Wikimedia Commons site of the barn in YUV breakdown.

In the picture of the barn, try to look at the fine detail in the U and V channels. Without any brightness information, it’s hard to see any detail in the color channels. The naked eye simply does a much better job of distinguishing brightness than color. This holds true for moving pictures, too. If the video uses YUV color space, the most important data is in the Y channel. You can throw away a lot of the color information, and the average viewer can’t tell that it’s gone.

One trick that video engineers have used for years is to toss away a lot of the color information. Basically, they can toss away the color values on every other pixel, and it’s not very noticeable. In some cases, they throw away even more color information. This is called Color Subsampling, and it’s a big part of a lot of modern HD formats for video.

When looking at color subsampling, you use a ratio to express what the color subsampling is. Most of us are familiar with these numbers: 4:4:4, or 4:2:2, or 4:1:1, and most of us are aware that bigger numbers are better. Fewer people understand what the numbers actually mean. It’s actually pretty easy.

Let’s pretend that we are looking at a small part of a frame – just a 4×4 matrix of pixels in an image:

In this example, every pixel has a Y value, a Cb value, and a Cr value (Cb and Cr are the digital counterparts of the U and V channels). If you look at a line of pixels and count the values, you’d say that there are 4 values of Y, 4 values of Cb, and 4 values of Cr. In color shorthand, we’d say that this is a 4:4:4 image.

4:4:4 color is a platinum standard for color, and it’s extremely rare to see a recording device or camera that outputs 4:4:4 color. Since the human eye doesn’t really notice when color is removed, most of the higher-end devices output something called 4:2:2. Here’s what that 4×4 matrix would look like for 4:2:2:

As you can see, half of the pixels are missing the color data. Looking at that 4×4 grid, 4:2:2 color may not look that good, but it is actually considered a very good color standard. Most software can fill in the missing color values by averaging the neighboring ones.
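Here’s a rough sketch of that averaging idea in Python, for a single row of Cb values. It’s only an illustration: the function names are made up for this example, the “keep every other pixel” model is the simplified one used in this article, and real decoders use their own (usually fancier) filters.

```python
def subsample_422(chroma_row):
    """Keep the chroma sample on every other pixel (the even positions)."""
    return chroma_row[::2]


def rebuild_by_averaging(kept, width):
    """Rebuild a full-width row from the kept samples.

    Kept samples go back on the even pixels; each odd pixel gets the
    average of its left and right neighbors (assumed filter, for illustration).
    """
    row = [0.0] * width
    for i, value in enumerate(kept):
        row[2 * i] = float(value)
    for x in range(1, width, 2):
        left = row[x - 1]
        # At the right edge there is no neighbor, so repeat the last sample.
        right = row[x + 1] if x + 1 < width else left
        row[x] = (left + right) / 2
    return row


# A row with a hard color edge between pixel 3 and pixel 4.
cb_row = [100, 100, 100, 100, 200, 200, 200, 200]
kept = subsample_422(cb_row)
rebuilt = rebuild_by_averaging(kept, len(cb_row))

print("original:", cb_row)
print("kept:    ", kept)
print("rebuilt: ", rebuilt)
```

In flat areas the rebuilt row matches the original. Right at the edge, pixel 3 comes back as a blend of its two neighbors, which is the slight softening that subsampling trades for the smaller data size.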

Let’s look at 4:1:1 color, which is used for NTSC DV video:

Bleaccch. Three out of every four pixels have their color tossed away! With bigger “gaps” between color samples, it’s even harder for software to “rebuild” the missing values, but it does so anyway. This is one of the reasons that re-compressing DV can cause color smearing from generation to generation.

Let’s look at one other color subsampling, which is called 4:2:0, and is used very frequently in MPEG encoding schemes:

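This diagram shows one of many ways that 4:2:0 color subsampling can be accomplished, but the general idea is the same – luma samples for every pixel, while one line has Cb samples for every other pixel, and the next line has Cr samples for every other pixel. The net effect is that each color channel ends up with half the horizontal and half the vertical resolution of the luma.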
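To make the four layouts easier to compare, here’s a small Python sketch that prints which samples each pixel of a 4×4 block carries, following the simplified descriptions in this article. The function names are hypothetical, and the 4:2:0 layout is the alternating-line variant described above, which is only one of several ways real formats place their chroma samples.

```python
def sample_layout(scheme, width=4, height=4):
    """Return a grid of strings naming the samples stored at each pixel."""
    if scheme not in ("4:4:4", "4:2:2", "4:1:1", "4:2:0"):
        raise ValueError(f"unknown scheme: {scheme}")
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            cell = "Y"  # every pixel always keeps its luma sample
            if scheme == "4:4:4":
                cell += "+Cb+Cr"
            elif scheme == "4:2:2" and x % 2 == 0:
                cell += "+Cb+Cr"  # color on every other pixel
            elif scheme == "4:1:1" and x % 4 == 0:
                cell += "+Cb+Cr"  # color on every fourth pixel
            elif scheme == "4:2:0" and x % 2 == 0:
                # Cb on one line, Cr on the next (assumed variant).
                cell += "+Cb" if y % 2 == 0 else "+Cr"
            row.append(cell)
        grid.append(row)
    return grid


def count_samples(grid):
    """Count how many Y, Cb, and Cr samples the grid stores."""
    counts = {"Y": 0, "Cb": 0, "Cr": 0}
    for row in grid:
        for cell in row:
            for name in cell.split("+"):
                counts[name] += 1
    return counts


for scheme in ("4:4:4", "4:2:2", "4:1:1", "4:2:0"):
    grid = sample_layout(scheme)
    print(scheme, count_samples(grid))
    for row in grid:
        print("   ", "  ".join(f"{cell:<7}" for cell in row))
```

Note that 4:1:1 and 4:2:0 store the same number of color samples per block. They just spread them differently: 4:1:1 gives up horizontal color resolution only, while 4:2:0 gives up some in both directions.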

With a color subsampled image, it’s up to the program decoding the picture to estimate the missing pixel values, using the surrounding intact color values, and providing smoothing between the averaged values.

Okay – we’ve defined what color subsampling is. Now, how does that relate to my friend’s earlier argument?

Well, in my DSLR camera, the color information is subsampled to 4:2:0 in the camera. In other words, the camera itself is throwing away three quarters of the color samples. It’s the weakest link in the chain! Converting from 4:2:0 to 4:4:4 at this stage doesn’t “magically” bring back the thrown-away data – the data was lost before it hit the memory card. The conversion just takes the data that’s already there and “upsamples” it, filling in the missing color values by averaging between the adjoining values.
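A tiny Python experiment makes the point. This is a sketch under stated assumptions, not what any particular camera or codec does: I’m assuming the encoder averages each 2×2 block of a color channel into one stored sample, and that the upsampler simply repeats that sample over the block. The function names are made up for the example.

```python
def downsample_420(plane):
    """Average each 2x2 block of a chroma plane into one stored sample.

    Assumes the plane has an even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def upsample_to_444(small):
    """Give every pixel a chroma value by repeating each sample over 2x2."""
    out = []
    for row in small:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A color channel with pixel-level detail (a checkerboard).
original = [
    [50, 250, 50, 250],
    [250, 50, 250, 50],
    [50, 250, 50, 250],
    [250, 50, 250, 50],
]

stored = downsample_420(original)    # what lands on the memory card
restored = upsample_to_444(stored)   # the "4:4:4" version

print("stored:  ", stored)
print("restored:", restored)
```

The restored plane has a value for every pixel, so it is 4:4:4 on paper, but the checkerboard is gone and no amount of upsampling can bring it back. A fancier upsampling filter would change how the gaps are filled, not what was recorded.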

Inside Premiere Pro, the images will stay exactly as they were recorded in-camera for cuts-only edits. If there’s no color work going on, the 4:2:0 values remain untouched. If I need to do some color grading, Premiere Pro will, on-the-fly, upsample the footage to 4:4:4, and it does this very well, and in a lot of cases, in real-time.

Going to a 4:4:4 intermediate codec does have some benefits. Upsampling every frame to 4:4:4 during the transcode means that your CPU doesn’t have as much work to do while editing, which may give you better performance on older systems. But there’s a huge time penalty in transcoding, and it doesn’t get you any “better color” than going native. Whether you upsample prior to editing or do it on-the-fly in Premiere Pro, the color info was already lost in the camera.

In fact, I could argue that Premiere Pro is the better solution for certain types of editing because we leave the color samples alone when possible. If the edit is re-encoded to a 4:2:0 format, Premiere Pro can use the original color samples and pass those along to the encoder in certain circumstances. Upsampling and downsampling can cause errors, since the encoder can’t tell the difference between the original color samples and the rebuilt, averaged ones.
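Here’s a hedged Python sketch of why that matters. It is not Premiere Pro’s actual code; the filters and function names are assumptions chosen to keep the example small. One path rebuilds a full-resolution row and then re-encodes it “blind” by averaging pixel pairs, the other passes the original samples straight through.

```python
def upsample(stored):
    """Rebuild full resolution: each stored sample, then a value averaged
    with the next stored sample (the last sample is simply repeated)."""
    full = []
    for i, value in enumerate(stored):
        nxt = stored[i + 1] if i + 1 < len(stored) else value
        full.append(float(value))
        full.append((value + nxt) / 2)
    return full


def reencode_blind(full):
    """An encoder that can't tell original samples from rebuilt ones,
    so it averages each pair of pixels (assumed filter)."""
    return [(full[i] + full[i + 1]) / 2 for i in range(0, len(full), 2)]


def reencode_passthrough(full):
    """An encoder that is handed the original sample positions."""
    return full[::2]


stored = [100.0, 100.0, 200.0, 200.0]
blind = list(stored)
passthrough = list(stored)

for generation in range(2):
    blind = reencode_blind(upsample(blind))
    passthrough = reencode_passthrough(upsample(passthrough))
    print(f"generation {generation + 1}: blind={blind} passthrough={passthrough}")
```

After two generations the blind path has smeared the color edge a little more each time, while the pass-through path still holds the samples it started with.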

I’m not trying to knock intermediate codecs – there are some very valid reasons why certain people need them in their pipeline. But, for people just editing in the Adobe Production Premium suite, they won’t magically add more color data, and may cost you a lot of time. Take advantage of the native editing in Premiere Pro CS5, and you’ll like what you see. 🙂

I used to use Premiere CS4, and I would convert my 5D footage to Cineform AVIs for ease in editing. Now that I have CS5, converting does not speed up the editing anymore, so I asked the techs at Cineform if there was any reason to convert. They told me that when color grading it’s best to start with Cineform because the 4:2:2 is better to color grade with in Premiere. Is this not the case? I would love to remove this step from my workflow.

Cineform is still an excellent codec, and I still use it when I need to work both inside and outside of the Adobe Production Premium suite. I also LOVE the color controls and look-up tables (LUTs) available in First Light, and those color changes show up immediately in Premiere Pro.

Since Premiere Pro can do the upsampling to 4:4:4, 32bpc right in the timeline, it’s not a crucial step, but it can still add value to your pipeline.

Hi Karl

This was the kind of information I was looking for, as I’m tired of explaining why it’s not necessary to transcode to 4:4:4, whatever your footage codec is.

In fact, grading is codec-independent, as we grade images, not codecs or file formats.

It’s like adding effects to the output of an old cassette (K7), or burning the music to a CD first… your effects board doesn’t care about the source material, whether cassette, CD, or SACD… it just takes the sound as is, regardless of whether it is 16 kHz/8 bits, 44 kHz/16 bits, or 96 kHz/24 bits…

I work with some video experts, but I rarely get the opportunity to question them extensively, so I dig online and read books. I must say nothing puts this stuff in plain English as well as what I am finding here. Thank you!

So my comment / question is this: It makes sense that converting from 4:2:0 to 4:4:4 and then back to 4:2:0 too many times would mess up the picture, because the conversion function does not know where the original color subtractions were performed… Or does it? If the resolution remains the same (width and height), would it not make sense that the color subtraction is always in the same positions? Go from your first illustration to your second and back. Nothing is lost. So as long as there is a standard way to choose the pixels that will lose color information, then conversion routines would be “safe”, would they not?

First, excuse my bad English; I speak French.
Your explanations are not entirely accurate. In 4:2:2, for example, it is not true that the color information is discarded for each even pixel. In fact, the color information is subsampled by 2 horizontally. So, if the resolution of Y is 720 x 576 (PAL), the resolution for U and V is subsampled to 360 x 576. The first and second Y pixels receive the U and V values of the first pixel from the subsampled color information. Thus, the first two Y pixels receive the same color value (the average of their actual colors), and so on. If you now convert to 4:4:4 and back to 4:2:2, there is absolutely no modification of the color information. In an uncompressed YUV (YUY2) file, the values for the first two pixels are coded YUYV, and so on, with the U and V values valid for both Y values.
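The round trip described in this comment can be sketched in a few lines of Python. The YUY2 byte order (Y0, U, Y1, V for each pair of pixels) is the standard packed layout; everything else here is an assumption for illustration, including the function names, the rounding used when averaging, and the choice to upsample by simply repeating the shared U and V.

```python
def pack_yuy2(pixels):
    """Pack (Y, U, V) pixels into YUY2 bytes: Y0 U Y1 V per pixel pair.

    Each pair shares one U and one V, taken as the rounded average of the
    two pixels' values. Assumes an even number of 8-bit pixels.
    """
    out = bytearray()
    for i in range(0, len(pixels), 2):
        (y0, u0, v0), (y1, u1, v1) = pixels[i], pixels[i + 1]
        u = (u0 + u1 + 1) // 2
        v = (v0 + v1 + 1) // 2
        out += bytes([y0, u, y1, v])
    return bytes(out)


def unpack_yuy2(data):
    """Expand YUY2 bytes to 4:4:4 by giving both pixels the shared U and V."""
    pixels = []
    for i in range(0, len(data), 4):
        y0, u, y1, v = data[i:i + 4]
        pixels.append((y0, u, v))
        pixels.append((y1, u, v))
    return pixels


source = [(16, 100, 200), (235, 110, 190), (50, 128, 128), (60, 128, 130)]

packed = pack_yuy2(source)             # 4:4:4 -> 4:2:2 (lossy, done once)
expanded = unpack_yuy2(packed)         # 4:2:2 -> 4:4:4
repacked = pack_yuy2(expanded)         # 4:4:4 -> 4:2:2 again

print(list(packed))
print(list(repacked))
print("round trip identical:", repacked == packed)
```

With these particular filters, the second pass through 4:2:2 changes nothing, because averaging two identical values gives the same value back. The first conversion from the full-color source is still lossy, and a different upsampling filter would not necessarily round-trip this cleanly.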

Jean, thanks for the clarification. You are correct. When writing this article, I was targeting the average editor, trying to simplify a complicated topic.

Just wanted to say thanks for a very helpful explanation of YUV and color subsampling. I’ve seen the 4:X:X ratios for a while without fully understanding their meaning. Thanks.
