Chroma sub-sampling is the practice of encoding color information at lower spatial resolution than brightness information. It is one of the most consequential bit-budget decisions in video: sub-sampling chroma cuts roughly half the bits with almost no perceptual cost on natural content, which is why every consumer streaming format uses it. The notation (4:4:4, 4:2:2, 4:2:0) is part of the everyday vocabulary of video engineering, and getting it wrong in a pipeline produces washed-out colors, sharp-edge artifacts, and the kind of subtle quality bugs that are hard to spot without explicit testing. This page is the engineering reference.
What chroma sub-sampling is
Video signals are typically stored in a Y'CbCr representation rather than R'G'B':
- Y' — luma (perceived brightness).
- Cb — blue-difference chroma (B' − Y', scaled).
- Cr — red-difference chroma (R' − Y', scaled).
This separation lets the encoder treat luma and chroma independently. The trick: human vision is dramatically more sensitive to spatial detail in luma than in chroma. Cutting chroma resolution in half reduces bandwidth significantly without proportional perceptual cost.
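The Y'CbCr split above can be made concrete with a few lines of arithmetic. A minimal sketch of the full-range conversion using the BT.709 luma coefficients (float math only; real encoders add limited-range scaling and quantization on top of this):

```python
def rgb_to_ycbcr_bt709(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert normalized [0, 1] R'G'B' to full-range Y'CbCr (BT.709 weights)."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b   # luma: perceptual brightness weighting
    cb = (b - y) / 1.8556                       # blue-difference, scaled into [-0.5, 0.5]
    cr = (r - y) / 1.5748                       # red-difference, scaled into [-0.5, 0.5]
    return y, cb, cr

# Neutral gray carries no chroma: both difference channels come out (near) zero,
# which is exactly why chroma planes tolerate heavy sub-sampling.
print(rgb_to_ycbcr_bt709(0.5, 0.5, 0.5))
```

For a neutral input, all the signal lands in Y' and the chroma channels are zero; sub-sampling Cb/Cr therefore discards resolution only where vision is least sensitive.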
The notation J:a:b describes the chroma sub-sampling pattern, where:
- J is a reference width (typically 4).
- a is the number of chroma samples in the first row of width J.
- b is the number of chroma samples in the second row of width J; b = 0 means the second row carries no chroma of its own and reuses the first row's samples (vertical sub-sampling by 2).
The three patterns that matter:
- 4:4:4 — full chroma resolution. Same number of chroma samples as luma samples.
- 4:2:2 — chroma horizontally sub-sampled by 2. Half the horizontal chroma samples.
- 4:2:0 — chroma sub-sampled by 2 in both horizontal AND vertical. One quarter of the chroma samples.
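The J:a:b shorthand decodes mechanically into horizontal and vertical sub-sampling factors. A small sketch (hypothetical helper name; covers only the three patterns above):

```python
def chroma_factors(j: int, a: int, b: int) -> tuple[int, int]:
    """Return (horizontal, vertical) chroma sub-sampling factors for J:a:b.

    a counts chroma samples in the first row of J pixels; b == 0 means the
    second row reuses the first row's chroma (vertical sub-sampling by 2).
    """
    horizontal = j // a              # e.g. 4:2:x -> chroma on every 2nd column
    vertical = 2 if b == 0 else 1    # 4:2:0 -> chroma on every 2nd row too
    return horizontal, vertical

print(chroma_factors(4, 4, 4))  # (1, 1) -- full chroma resolution
print(chroma_factors(4, 2, 2))  # (2, 1) -- half horizontal
print(chroma_factors(4, 2, 0))  # (2, 2) -- half in both dimensions
```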
The bit budget math
For a 1920×1080 frame at 8-bit precision:
- 4:4:4: 1920×1080 luma + 1920×1080 Cb + 1920×1080 Cr = ~6.2 million samples. ~6.2 MB per frame.
- 4:2:2: 1920×1080 luma + 960×1080 Cb + 960×1080 Cr = ~4.1 million samples. ~4.1 MB per frame.
- 4:2:0: 1920×1080 luma + 960×540 Cb + 960×540 Cr = ~3.1 million samples. ~3.1 MB per frame.
Going from 4:4:4 to 4:2:0 is a ~50% bandwidth reduction at the source level, before encoding. The encoded bitstream sees similar proportional savings — a 4:2:0 encode is roughly half the bitrate of a 4:4:4 encode at the same perceptual quality on natural content.
For 10-bit content, the sample counts are unchanged; multiply the byte figures by 1.25 (10 bits vs 8 per sample) for tightly packed storage. In practice, in-memory formats such as ffmpeg's yuv420p10le pad each sample to a 16-bit word, doubling the raw frame size.
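The per-frame numbers above can be reproduced directly. A sketch of the sample and byte arithmetic (tightly packed bits; real in-memory layouts often pad 10-bit samples to 16-bit words):

```python
def frame_bytes(width: int, height: int, h_sub: int, v_sub: int, bit_depth: int = 8) -> int:
    """Bytes per frame for planar Y'CbCr, assuming tightly packed samples."""
    luma = width * height
    chroma = (width // h_sub) * (height // v_sub)  # samples per chroma plane
    total_samples = luma + 2 * chroma              # Y + Cb + Cr
    return total_samples * bit_depth // 8

for name, (h, v) in {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}.items():
    mb = frame_bytes(1920, 1080, h, v) / 1e6
    print(f"{name}: {mb:.1f} MB/frame at 8-bit")
# 4:4:4: 6.2 MB/frame at 8-bit
# 4:2:2: 4.1 MB/frame at 8-bit
# 4:2:0: 3.1 MB/frame at 8-bit
```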
Where each pattern is used
The standard usage in 2026 production:
4:2:0 — consumer streaming (universal)
Every consumer video codec profile that ships at scale uses 4:2:0. HEVC Main, AV1 Main, H.264 High, VP9 — all 4:2:0. This is the universal consumer streaming format. Browsers, mobile devices, smart TVs, set-top boxes — everything decodes 4:2:0.
The reason: bandwidth savings without perceptible quality loss for natural content. The 50% bit budget reduction is compounded across millions of streams; the perceptual cost is invisible on the kind of content (live action, animation, sports) that streaming services deliver.
4:2:2 — broadcast contribution and editorial
Broadcast contribution (camera-to-encoder, master-to-distribution) typically uses 4:2:2. The reasons:
- Multiple-generation processing: contribution feeds get re-encoded multiple times before reaching consumers. 4:2:2 source resists generational quality loss better than 4:2:0.
- Chroma keying: green-screen / blue-screen compositing requires chroma precision. 4:2:0 produces visible edge artifacts on chroma keys.
- Slow-motion preservation: high frame rate slow-motion content benefits from chroma precision because temporal interpolation hurts chroma quality.
ProRes 422 (Apple's editorial codec) is 4:2:2 by design. DNxHR HQ/HQX are 4:2:2. Broadcast contribution codecs (XAVC, MPEG-2 422P@HL) are 4:2:2.
4:4:4 — cinema and color grading
Pure 4:4:4 is for environments where chroma precision matters most:
- Cinema mastering — projectors and digital cinema have fine enough output that chroma loss is visible.
- Color grading — colorists work with 4:4:4 source to have full chroma freedom in grading decisions.
- Animation production — sharp colored line art is precisely the worst content for chroma sub-sampling artifacts.
- Visual effects — chroma-key compositing, color-sensitive VFX work.
- Screen recording — sharp colored text on uniform backgrounds (terminal recordings, IDE screencasts) shows chroma artifacts very visibly.
ProRes 4444 is 4:4:4. DNxHR 444 is 4:4:4. Sony's XAVC 444 is 4:4:4. These are mastering and high-end production formats, not streaming delivery.
Visual impact
The honest assessment of when chroma sub-sampling matters perceptually:
4:2:0 is invisible on:
- Natural live-action content (talking heads, drama, documentary).
- Most animation (cel animation, anime, CGI animation).
- Sports content (constant motion masks chroma artifacts).
- Most user-generated video.
- News and broadcast content.
4:2:0 is visible on:
- Sharp colored text on solid backgrounds (terminal recordings, presentation slides, IDE captures).
- Pixel art with bright colors (retro games, digital illustrations).
- Anime with very saturated colors and sharp edges.
- Chroma-key (green-screen) edges in poorly-keyed content.
- Logo overlays and graphics that should have crisp edges.
For typical consumer streaming, 4:2:0 quality is acceptable. For tech tutorials with lots of code on screen, screen recordings, or content with significant graphic overlays, 4:2:2 or 4:4:4 produces visibly better output.
Codec profile support
Codec support for chroma formats varies by profile:
H.264 (AVC):
- Baseline / Main / High — 4:2:0 only.
- High 4:2:2 — 4:2:0 and 4:2:2.
- High 4:4:4 Predictive — 4:2:0, 4:2:2, and 4:4:4.
HEVC:
- Main / Main 10 — 4:2:0 only.
- Main 4:2:2 10 — adds 4:2:2.
- Main 4:4:4 / Main 4:4:4 10 — adds 4:4:4.
AV1:
- Main profile — 4:2:0 only.
- High profile — adds 4:4:4.
- Professional profile — adds 4:2:2 (and 12-bit depth).
VP9:
- Profile 0/2 — 4:2:0.
- Profile 1/3 — 4:2:2 or 4:4:4.
ProRes / DNxHR:
- ProRes Proxy/LT/422/422 HQ — 4:2:2.
- ProRes 4444 / 4444 XQ — 4:4:4.
- DNxHR LB/SQ/HQ/HQX — 4:2:2.
- DNxHR 444 — 4:4:4.
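The profile tables above reduce to a lookup that a pipeline can validate against before dispatching an encode. A sketch (hypothetical table and key spellings, listing only the H.264/HEVC/AV1 profiles named above):

```python
# Chroma formats each codec profile accepts, per the tables above.
# Profile keys are illustrative shorthand, not official identifiers.
PROFILE_CHROMA = {
    ("h264", "high"):        {"4:2:0"},
    ("h264", "high422"):     {"4:2:0", "4:2:2"},
    ("h264", "high444p"):    {"4:2:0", "4:2:2", "4:4:4"},
    ("hevc", "main"):        {"4:2:0"},
    ("hevc", "main422-10"):  {"4:2:0", "4:2:2"},
    ("hevc", "main444"):     {"4:2:0", "4:2:2", "4:4:4"},
    ("av1", "main"):         {"4:2:0"},
    ("av1", "high"):         {"4:2:0", "4:4:4"},
    ("av1", "professional"): {"4:2:0", "4:2:2", "4:4:4"},
}

def profile_supports(codec: str, profile: str, chroma: str) -> bool:
    """Reject an encode request whose chroma format the profile cannot carry."""
    return chroma in PROFILE_CHROMA.get((codec, profile), set())

print(profile_supports("hevc", "main", "4:2:2"))  # False -- needs Main 4:2:2 10
```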
The profile-level constraint matters operationally: hardware decoders typically support only Main/Main 10 profiles for HEVC, which means 4:2:2 and 4:4:4 HEVC content is software-decoded on most consumer devices. For streaming delivery, 4:2:0 is essentially mandatory.
Hardware decode considerations
Hardware decoders are highly tuned for 4:2:0 because that's what consumer streaming uses universally. 4:2:2 and 4:4:4 hardware decode is rare:
- 4:2:0 hardware decode — universal across modern devices.
- 4:2:2 hardware decode — limited. Some recent NVIDIA Pro GPUs, some professional decode hardware. Generally not on consumer devices.
- 4:4:4 hardware decode — very rare on consumer hardware. Software decode is the norm.
For consumer streaming pipelines, the practical answer is "4:2:0, always." For broadcast contribution and editorial workflows where the receiving systems are professional gear, 4:2:2 is workable. 4:4:4 is for closed-ecosystem use cases.
Pipeline conversion
Converting between chroma formats requires resampling. The conversion from 4:4:4 to 4:2:0 (typical pipeline operation when ingesting professional masters and producing streaming variants) involves:
- Filter — apply a low-pass filter to the chroma channels to avoid aliasing during downsampling.
- Downsample — reduce chroma resolution by 2 in both dimensions.
- Co-siting decision — pick where the downsampled chroma sample sits relative to the luma samples (top-left, center, etc.). Different chroma siting conventions exist.
Bad conversion produces:
- Chroma aliasing — visible color fringes on sharp edges.
- Chroma siting offset — colors appear shifted slightly from where they belong.
- Resampling artifacts — soft chroma in regions that should be sharp.
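The filter-then-downsample steps can be sketched with a crude 2×2 box filter. This is only structural illustration: a real converter uses a better low-pass kernel (e.g. Lanczos) and honors the chroma siting convention.

```python
def downsample_chroma_420(plane):
    """4:4:4 -> 4:2:0 on one chroma plane: average each 2x2 block.

    The 2x2 average is a crude low-pass filter, and keeping one value per
    block is the downsample. Assumes even dimensions and center siting.
    """
    h, w = len(plane), len(plane[0])
    return [
        [(plane[2 * r][2 * c] + plane[2 * r][2 * c + 1]
          + plane[2 * r + 1][2 * c] + plane[2 * r + 1][2 * c + 1]) / 4.0
         for c in range(w // 2)]
        for r in range(h // 2)
    ]

# A hard chroma edge that is not block-aligned smears into the neighboring
# block -- the "soft chroma on sharp edges" artifact described above.
cb = [[0, 1, 1, 1] for _ in range(4)]
print(downsample_chroma_420(cb))  # [[0.5, 1.0], [0.5, 1.0]]
```

The 0.5 in the output is the chroma value that should have been a crisp 0-to-1 transition; at sharp colored edges this averaging is exactly what the eye picks up on screen recordings and pixel art.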
ffmpeg handles chroma conversion through the scale filter together with pix_fmt selection. For high-quality conversion, request a good resampling kernel and set the output chroma siting explicitly:
ffmpeg -i input_444.mov -vf "scale=out_h_chr_pos=0:out_v_chr_pos=128:flags=lanczos+accurate_rnd+full_chroma_int" -pix_fmt yuv420p output_420.mp4
The chroma siting matters — MPEG-2 siting (the convention for H.264/HEVC 4:2:0 and most consumer formats) co-sites chroma horizontally with the left luma sample and centers it vertically between luma rows; JPEG siting centers chroma in both dimensions. Mismatched siting between encoder and decoder produces subtle half-sample color shifts.
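The siting conventions translate into the scale filter's chr_pos options, whose values express the chroma sample's offset in 1/256 units (0 = co-sited with the first luma sample, 128 = centered between two). A sketch of the mapping for 4:2:0 (helper name is hypothetical; option names are ffmpeg scale filter options):

```python
# Chroma siting convention -> ffmpeg scale filter chr_pos values for 4:2:0.
# 0 = co-sited with the first luma sample, 128 = centered between two.
SITING = {
    "mpeg2": {"out_h_chr_pos": 0,   "out_v_chr_pos": 128},  # left-sited (H.264/HEVC default)
    "jpeg":  {"out_h_chr_pos": 128, "out_v_chr_pos": 128},  # center-sited
}

def scale_filter_args(siting: str) -> str:
    """Build the scale filter option string for the chosen chroma siting."""
    return ":".join(f"{k}={v}" for k, v in SITING[siting].items())

print(scale_filter_args("mpeg2"))  # out_h_chr_pos=0:out_v_chr_pos=128
```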
Chroma sub-sampling and HDR
HDR content typically uses 4:2:0 sub-sampling, same as SDR. The reasons:
- Hardware decoder support — HDR-capable hardware decoders are 4:2:0 by default.
- Bandwidth — HDR's wider color gamut and 10-bit precision already increase bandwidth; 4:2:2 would compound that.
- Visual benefit margin — chroma sub-sampling artifacts are subtle on HDR content where the dynamic range is the dominant visual feature.
Premium broadcast HDR contribution sometimes uses 4:2:2, but consumer HDR streaming is essentially universal 4:2:0.
Operational considerations
Things that matter for chroma sub-sampling in production:
- Pix_fmt specification at every encoder stage — ffmpeg defaults can vary; specify -pix_fmt yuv420p (or yuv420p10le for 10-bit) explicitly to avoid surprises.
- Source-to-output mismatch — encoding 4:2:2 source as 4:2:0 output is a one-way trip; the chroma precision is lost. Plan source preservation if you may need 4:2:2 later.
- Mastering-to-streaming chroma chain — masters in 4:4:4 or 4:2:2 → streaming in 4:2:0. Verify the conversion is high-quality; bad conversion shows up as visible color artifacts in streaming output.
- Browser playback edge cases — most browsers play 4:2:0 cleanly; 4:2:2/4:4:4 playback in browsers depends on codec and profile support, often falls back to software decode with quality and battery cost.
- Codec profile signaling — manifest signaling must match the actual content's chroma format. Mismatched signaling can cause player rejection.
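The pix_fmt checks above lend themselves to automated verification. A sketch that validates an ffprobe JSON report against the expected chroma format (the report string here is a trimmed, hypothetical example of `ffprobe -print_format json -show_streams` output):

```python
import json

# Trimmed, hypothetical ffprobe report (ffprobe -v quiet -print_format json
# -show_streams input.mp4); a real report carries many more fields.
report = """{
  "streams": [
    {"codec_type": "video", "codec_name": "hevc", "pix_fmt": "yuv422p10le"}
  ]
}"""

def check_chroma(ffprobe_json: str, expected_pix_fmt: str) -> bool:
    """True iff every video stream in the report matches the expected pix_fmt."""
    streams = json.loads(ffprobe_json)["streams"]
    video = [s for s in streams if s["codec_type"] == "video"]
    return all(s["pix_fmt"] == expected_pix_fmt for s in video)

# Streaming output should be 4:2:0 -- this stream is still 4:2:2, so it fails.
print(check_chroma(report, "yuv420p"))  # False
```

Running a check like this against every rendition catches the silent 4:2:2-leaks-into-streaming bugs that are otherwise only found by eye.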
What MpegFlow does with chroma sub-sampling
MpegFlow's FfmpegExecutor rendition stages emit 4:2:0 by default for streaming workflows. The DAG runtime treats chroma format as a per-stage parameter; the partitioner persists each rendition stage to job_stages with explicit dependency tracking; per-stage retry handles transient failures; the FFmpeg pixel-format flags select the chroma layout.
For pipelines that ingest 4:4:4 or 4:2:2 source and produce 4:2:0 streaming output, an FfprobeExecutor stage characterizes the source's chroma format upstream; cross-stage data flow wires the probe output into the downstream encode stages so chroma resampling parameters derive from the real source rather than assumed defaults. Default conversion uses high-quality resampling with MPEG-2 chroma siting (the consumer-codec convention).
For workflows that produce both broadcast contribution (4:2:2) and streaming (4:2:0) outputs from the same source, parallel sibling rendition stages emit each variant from one shared upstream source; sibling cancellation propagates fatal failures so dependents don't waste compute.
The CMAF muxer at the packaging stage preserves the color and chroma-siting signaling (the colr box and the codec's VUI parameters) that the encoder emitted; players parse the signaling and decode appropriately.
The strict-broker security model handles chroma-format work like any pipeline payload — workers carry no ambient credentials; content access flows through short-lived presigned URLs scoped per stage; access is disposed on completion.
For customers asking about chroma format choice for their pipeline, the answer is almost always "4:2:0 for streaming, 4:2:2 for broadcast contribution, 4:4:4 if you have a specific editorial reason." The exceptions are narrow (premium screen-recording content, pixel-art animation, broadcast-grade live production where contribution is bandwidth-rich); the default is consumer streaming chroma sub-sampling.