The Anatomy of Chroma Subsampling

Chroma subsampling is the compression of color information in video footage to reduce the overall storage size and bitrate of video files. It’s a method used across many video codecs, and the amount of compression is often expressed as a set of numbers such as 4:2:2. Rather than serving you number soup, let’s make sure you understand what those numbers mean. Whether it’s 4:4:4 or 4:2:0, this specification can have a profound impact on a video, especially during post-production.

Pixels: The Video Building Blocks

In order to understand chroma subsampling, it’s important to understand how an image is created in a frame of video. In the simplest terms, a video image is made up of pixels: solitary points lined in rows across the screen. Each pixel has a specific luminance and chrominance, both determined by the data that defines that pixel. Luminance refers to how bright a pixel is; it creates contrast and detail within the overall image. If the luminance information is isolated, without the chroma information, the resulting image will be in black and white. Technically, it would be grayscale, as there are variable depths of contrast between pure black and pure white.

Chrominance is the color information of a pixel; it determines whether a pixel is red, green, blue, or somewhere in between. There are several color spaces by which color can be rendered, but regardless of which is used, the basic method of chroma subsampling is the same. Whereas luminance can render a black-and-white image when isolated, chrominance on its own renders no usable image; it depends on the luminance to create one. The chrominance is the data that defines the color, and how much of that color is represented in each pixel. Chroma subsampling is a reduction of color information, achieved by sharing one pixel’s chroma data across several neighboring pixels, so that fewer chroma samples than pixels are stored.
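To see how luminance and chrominance separate in practice, here is a small Python sketch of the standard RGB-to-YCbCr split, using the ITU-R BT.601 coefficients. The function name and rounding are illustrative choices, not part of any particular codec’s API.

```python
def rgb_to_ycbcr(r, g, b):
    """Split an 8-bit RGB pixel into luminance (Y) and chrominance (Cb, Cr).

    Uses the ITU-R BT.601 coefficients; Cb and Cr are centered at 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # brightness (the grayscale image)
    cb = 128 + 0.564 * (b - y)              # blue-difference chroma
    cr = 128 + 0.713 * (r - y)              # red-difference chroma
    return round(y), round(cb), round(cr)

# A pure gray pixel carries no chroma: Cb and Cr sit at the neutral 128.
print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128)
```

Keeping only the Y values of every pixel reproduces the grayscale image described above; the Cb and Cr planes are what subsampling thins out.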

The Human Eye

Chroma subsampling is effective not only because of the anatomy of video, but because of how the human eye sees color and interprets images. The human eye is a remarkable part of the central nervous system. It is able to capture and focus light onto its photoreceptors and deliver the data of that visual input to the brain. Photoreceptors are the cells that line the back of the eye in the retina, where light is focused. There are two basic types of photoreceptor cells: rods and cones. Rods are highly sensitive to light and are achromatic, meaning they do not see color. The cones are less sensitive to light but are chromatic, able to perceive color.

The retina contains well over 100 million photoreceptor cells, and only a small fraction of them are cones: roughly 6 million cones against some 120 million rods. This is why the human eye is more sensitive to the luminance, the brightness and contrast portion of an image, and less sensitive to the chroma, or color, information of an image. When the luminance and chrominance portions are combined to create the overall image, a degraded or compressed amount of chroma information is not highly noticeable. The human eye sees the overall image, with the sharpness of that image defined by the luminance and the chrominance information composited within. The fact that there is less detailed information in the chrominance is overlooked by the eye, and the complete image is perceived.

How does chroma subsampling work?

To make chroma subsampling work and reduce the data size of an image, engineers devised a way to share chroma information across a range of pixels while keeping full luminance information for those same pixels. Selecting information from a portion of an image is known as sampling. Subsampling selects specific pixels, but not every pixel, and uses their chroma information as representative of all the pixels. Subsampling is a portion, or fraction, of the overall chrominance sampling in an image.

Chroma subsampling is expressed as a numerical formula that represents the ratio of pixels used in the subsampling of that clip. This ratio is represented by three numbers, separated by colons. Written out, it appears as J:a:b: a ratio of the pixel width of a sampling region to the number of pixels sampled from each row in that region. “J” represents the total number of pixels in the horizontal sampling region. In most cases “J” will equal four.

The next two numbers of the ratio, the “a” and “b” digits, describe how the two rows of the sampling region are sampled. The “a” position is the number of chroma samples taken from the first row of “J” pixels; the “b” position is the number taken from the second row of the “J” region. Some codecs note a fourth position in the ratio, which represents the sampling of pixels for an alpha channel.
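The arithmetic behind these ratios can be sketched in a few lines of Python; the function name and layout here are illustrative, not from any codec’s API.

```python
def chroma_fraction(j, a, b):
    """Fraction of chroma samples kept, per color-difference channel,
    for a J:a:b ratio over a J-pixel-wide, two-row sampling region."""
    return (a + b) / (2 * j)  # samples kept / pixels in the region

for j, a, b in [(4, 4, 4), (4, 2, 2), (4, 2, 0), (4, 1, 1)]:
    print(f"{j}:{a}:{b} keeps {chroma_fraction(j, a, b):.0%} of the chroma")
```

Running this prints 100 percent for 4:4:4, 50 percent for 4:2:2, and 25 percent for both 4:2:0 and 4:1:1, matching the ratios discussed below.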

Common Chroma Subsampling Ratios. Blocks of color showing chroma subsampling of 4:4:4, 4:1:1, 4:2:2 and 4:2:0 color information.

Common Chroma Subsampling Ratios

The 4:4:4 ratio is the highest quality; it effectively has no subsampling. Each pixel is represented and retains its own luminance and chroma values. The best video cameras have the capability to output 4:4:4 chroma. These are professional-grade tools, and they’re priced accordingly.

The 4:2:2 ratio samples two pixels from both the top and bottom rows, reducing the chroma information to 50 percent of the uncompressed source chroma. This is one of the more popular samplings and is found in codecs such as AVC-Intra 100, Digital Betacam, Panasonic DVCPRO HD, Apple ProRes 422, and XDCAM HD422.

The more reduced 4:2:0 sampling takes two chroma samples from the top “a” row of pixels and none from the bottom “b” row. Instead, the bottom row shares the chroma information sampled from the top row. This reduces the overall chroma information to approximately 25 percent of the uncompressed chroma. The 4:2:0 configuration is common in HDV footage, AVCHD, the Apple Intermediate Codec, and many of the MPEG-encoded video formats used by DSLR cameras.

A chroma sampling of 4:1:1 also reduces the amount of chroma information to 25 percent, but it takes one sample from the top “a” row of pixels and one sample from the lower “b” row. The 4:1:1 ratio is found in DVCPRO, DVCAM, and NTSC DV.
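As a rough sketch of how a 4:2:0 scheme shares chroma, the snippet below averages each 2x2 block of a chroma plane into a single value. This is a simplification for illustration; real encoders use filtered sampling and specific chroma siting.

```python
def subsample_420(chroma):
    """Collapse each 2x2 block of a chroma plane into one shared value,
    as 4:2:0 subsampling does (rows and columns assumed even)."""
    h, w = len(chroma), len(chroma[0])
    return [[(chroma[y][x] + chroma[y][x + 1]
              + chroma[y + 1][x] + chroma[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# A 4x4 chroma plane (16 samples) collapses to 2x2 (4 samples): 25 percent.
plane = [[100, 104, 200, 204],
         [102, 106, 202, 206],
         [ 50,  54, 150, 154],
         [ 52,  56, 152, 156]]
print(subsample_420(plane))  # [[103, 203], [53, 153]]
```

On playback, each stored value is spread back over its 2x2 block, which is exactly the sharing of chroma across pixels described above.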

Example of a halo effect on a poorly applied chromakey shot.

What does it all mean?

A video producer’s aesthetic decisions are limited by the technical limits of their footage. That’s why it’s important to have an understanding of chroma subsampling and what it does to footage. Chroma keying and color grading are two processes of post-production that are affected by chroma subsampling.

Footage that relies heavily on chroma subsampling will yield poor results when chromakeying. Video compositing software creates transparency by selecting a particular color, or range of colors, in a clip and applying that selection to an alpha channel. The alpha channel creates transparency in the image. The quality of the color information in the image is the main factor in pulling a clean key from green screen footage. Uneven lighting or inconsistent color will make it more difficult to create clean areas of transparency. Chroma subsampling becomes apparent in the edges of keyed footage because the key is based on the shared chroma information across blocks of pixels. Footage with a chroma subsampling equivalent of 4:1:1 or less will result in block-edged artifacts along the edge of the key. Even footage with a ratio of 4:2:2 will contain some block artifacts. The cleanest keys come from footage with an uncompressed 4:4:4 ratio.
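A minimal, hypothetical keyer makes clear why shared chroma produces blocky edges: the alpha decision is made from chroma values alone, so when several pixels share one chroma sample, they all receive the same alpha. The hard threshold and tolerance value below are illustrative; real keyers soften the edge.

```python
def key_alpha(cb, cr, key_cb, key_cr, tolerance=20.0):
    """Return 0.0 (transparent) when a pixel's chroma falls within
    `tolerance` of the key color, else 1.0 (opaque).

    A deliberately crude sketch: one hard threshold, no edge softening.
    """
    distance = ((cb - key_cb) ** 2 + (cr - key_cr) ** 2) ** 0.5
    return 0.0 if distance <= tolerance else 1.0

# Pixels matching the key color become transparent; others stay opaque.
print(key_alpha(110, 120, key_cb=110, key_cr=120))  # 0.0
print(key_alpha(200, 50, key_cb=110, key_cr=120))   # 1.0
```

With 4:2:0 footage, four pixels feed this function the same (Cb, Cr) pair, so transparency snaps on and off in whole 2x2 blocks, which is the block-edged artifact described above.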

Color grading is the post-production process of changing the dynamics of the color in footage to achieve an aesthetic goal. Sometimes color grading is subtle with changes that aren’t noticeable. Other times color grades are drastic with significant changes to the color and dynamic range of an image. When the chroma information is reduced due to chroma subsampling, dynamic color grading can reveal digital artifacts in footage. This is most often seen in the banding that occurs across gradients of color. Instead of a smooth transition from one color into the next, bands of color appear as graduated steps between the two colors.

The reasoning for this is simple: chroma subsampling reduces the amount of color information between two regions of an image. As those colors are changed through color grading, the steps between them become more apparent, as if they were stretched out. This is another reason why producers working on commercial projects that require intense color grades choose the highest-quality footage their budgets allow.
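The stretching effect can be sketched numerically: a gradient whose neighboring samples already share values has small steps, and boosting contrast multiplies the step size until the bands become visible. The gain and pivot values here are arbitrary illustration values, not from any grading tool.

```python
def grade(values, gain=3.0, pivot=128):
    """Stretch contrast around a pivot value -- a crude stand-in
    for an aggressive color grade, clamped to the 8-bit range."""
    return [max(0, min(255, round(pivot + gain * (v - pivot))))
            for v in values]

# A gradient where pairs of samples share a value (mimicking subsampled
# chroma) steps by 4 between pairs...
shared = [120, 120, 124, 124, 128, 128, 132, 132]
# ...and a 3x contrast stretch turns those steps of 4 into steps of 12.
print(grade(shared))  # [104, 104, 116, 116, 128, 128, 140, 140]
```

The same grade applied to a full-resolution gradient would spread those values across many small steps instead of a few large ones, which is why banding betrays subsampled footage.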

Accessing 4:4:4 Output

Chroma subsampling helps in compressing the overall size of video files, but those files can be trickier to work with in post-production. This is something to be aware of when that footage is being used for green screen work, visual effects, and serious color grading. There is more latitude for what can be done with footage in post-production when there is a greater amount of color information in the image. The best option for any recorded and processed video is to have uncompressed video with a chroma subsampling of 4:4:4.

The problem is that most cameras record internally to solid-state media, which requires compressed video with chroma subsampling of 4:2:2 or less. Most video cameras and DSLRs have a video output, such as SDI or HDMI. Depending on the camera’s image sensor, as well as its image processor, it may have the capability to output video with 4:4:4 chroma subsampling. When this video output is coupled with an external recording device, or a computer outfitted with a high-end video capture card and corresponding software, it becomes a powerful tool.

If the camera is capable of outputting 4:4:4 color video, and an external recording device capable of capturing the signal is attached, the resulting video will be better suited for post-production processes. This is why many studio visual effects shoots, such as green screen capture, use a large camera rig tethered to a computer. A camera’s technical manual is the best place to find out whether it can output 4:4:4 color, and online forums dedicated to the best video cameras host in-depth discussions on the best ways to output uncompressed video.

Know What’s Under the Skin

A producer needs to know the technical limitations of their footage and what they can do with it. There are many shooting situations when it is advantageous to capture footage that relies heavily on chroma subsampling in order to compress the image, save space, and reduce expenses. There are also situations when it is critical to acquire footage that is of the highest resolution possible, footage that is uncompressed without chroma subsampling.

Chroma subsampling is part of the hidden anatomy of video, working within its host body. By having a solid knowledge of what chroma subsampling is, producers are better prepared to doctor their video and get the results they desire.

Chris “Ace” Gates is an Emmy Award-winning writer and content producer.


