Loudness normalization is the practice of making audio play at a consistent perceived loudness across content. It's why a quiet drama and a loud action movie sound roughly equally loud when streamed back-to-back: each was measured in LUFS (Loudness Units relative to Full Scale) and adjusted to a target level. For broadcast, loudness normalization is regulated (CALM Act in the US, EBU R128 compliance in Europe). For streaming, each platform applies its own loudness target to deliver a consistent listening experience. This page is the engineering reference for what loudness normalization actually is, the standards that define it, and how to integrate it into a pipeline.
What loudness is
Loudness is the perceived intensity of audio. It's not the same as peak amplitude (max signal value) or RMS power (average signal power). Loudness accounts for psychoacoustic factors — frequency weighting (humans hear midrange more sensitively than bass or treble), temporal integration (sustained sounds vs transient peaks), and content distribution (dialogue vs music vs effects).
The core loudness measurement standard is ITU-R BS.1770, which defines:
- K-weighting filter — emphasizes frequencies humans perceive most sensitively.
- Mean square measurement over time windows.
- Channel summation — multi-channel audio summed with specific weighting.
- Gating — silent or near-silent passages are excluded from the loudness measurement to prevent quiet sections from dragging the average down.
The output is a single number expressed in LU (Loudness Units) or, when referenced to digital full scale, LUFS (LU Full Scale). LUFS values are negative; -23 LUFS is "23 LU below digital full scale," i.e., quiet relative to the maximum possible loudness.
In US practice the same unit is written LKFS (Loudness, K-weighted, relative to Full Scale). LUFS and LKFS are the same measurement; the names reflect different standards bodies (the EBU writes LUFS; ITU-R BS.1770 and ATSC write LKFS).
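As a rough illustration of the mean-square, channel-summation, and gating steps listed above, the following Python sketch computes gated loudness from per-block mean squares. It assumes the samples have already been K-weighted and cut into 400 ms blocks (the filter itself is omitted), and the function names are illustrative only. The -0.691 constant, the channel weights, and the -70 LUFS / -10 LU gates follow ITU-R BS.1770.

```python
import math

# Channel weights per ITU-R BS.1770: 1.0 for L/R/C, about 1.41 for surrounds.
# The LFE channel is excluded from the measurement.
CHANNEL_WEIGHTS = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}


def weighted_power(mean_squares):
    """Sum the per-channel mean squares of one block using the channel weights."""
    return sum(CHANNEL_WEIGHTS[ch] * ms for ch, ms in mean_squares.items())


def power_to_lufs(power):
    """Convert a weighted mean-square power to loudness in LUFS."""
    return -0.691 + 10.0 * math.log10(power) if power > 0 else float("-inf")


def integrated_loudness(blocks):
    """Gated loudness over a list of blocks (each a dict: channel -> mean square)."""
    powers = [weighted_power(b) for b in blocks]
    # Stage 1: the absolute gate drops blocks quieter than -70 LUFS.
    kept = [p for p in powers if power_to_lufs(p) > -70.0]
    if not kept:
        return float("-inf")
    # Stage 2: the relative gate sits 10 LU below the loudness of the survivors.
    relative_gate = power_to_lufs(sum(kept) / len(kept)) - 10.0
    kept = [p for p in kept if power_to_lufs(p) > relative_gate]
    return power_to_lufs(sum(kept) / len(kept))
```

For example, ten blocks with a mean square of 0.01 followed by ten near-silent blocks measure about -20.7 LUFS, because the silent blocks are gated out instead of dragging the average down.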
EBU R128
EBU R128 is the European Broadcasting Union's loudness standard for broadcast and program delivery. Key parameters:
- Target level — -23 LUFS integrated loudness across the program.
- Tolerance — typically ±0.5 LU for compliant delivery.
- Loudness Range (LRA) — descriptive metric of dynamic range; not a regulatory target but commonly reported.
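- True peak — must not exceed -1 dBTP (decibels true peak, i.e. the true-peak level relative to digital full scale).
EBU R128 is the reference for European broadcast delivery. It is a recommendation that broadcasters, and in some countries regulators, have adopted as a requirement. Programs delivered to those broadcasters must measure -23 LUFS within tolerance; non-compliant programs are typically rejected.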
For streaming, EBU R128 is the production reference for European-origin content. Apple Music in some regions targets EBU-aligned levels; Spotify uses an EBU-derived target.
ATSC A/85
ATSC A/85 is the American broadcast loudness recommended practice; the CALM Act is the US law that directs the FCC to enforce it. Key parameters:
- Target level — -24 LKFS integrated loudness.
- Tolerance — typically ±2 LU for compliant delivery (looser than EBU).
- True peak — typically -2 dBTP for safety margin.
ATSC A/85 is the regulatory standard for US broadcast (FCC enforces it via the CALM Act of 2010). Broadcasters must deliver content at the target level or face FCC penalties.
The ATSC A/85 and EBU R128 targets differ by 1 LU (-24 vs -23). The difference is small in perceptual terms but matters for compliance: with R128's ±0.5 LU tolerance, a program that sits exactly at -24 LKFS for US delivery fails a -23 LUFS European spec unless it is re-normalized.
Streaming loudness targets
Streaming platforms use higher loudness targets than broadcast (i.e., louder by a few LU). The reasons are practical — broadcast targets were designed for cinema-style listening environments; streaming is often consumed on phones, laptops, or headphones where slightly louder defaults sound better.
Common streaming targets (as of 2026):
- Spotify — -14 LUFS integrated.
- Apple Music — -16 LUFS integrated.
- YouTube — -14 LUFS integrated.
- Amazon Music — -14 LUFS integrated.
- Tidal — -14 LUFS integrated.
- Tidal HiFi — -18 LUFS for higher dynamic range tier.
- AES TD1004 — -18 LUFS recommendation for streaming (streaming industry consensus).
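The gain a platform applies during playback normalization is simply the difference between its target and the track's measured integrated loudness. A minimal sketch using three of the targets listed above (the dictionary and function names are illustrative, and whether a given platform also turns quiet tracks up is not modeled here):

```python
# Integrated loudness targets in LUFS, copied from the list above.
PLATFORM_TARGETS = {"spotify": -14.0, "apple_music": -16.0, "youtube": -14.0}


def normalization_gain_db(measured_lufs, platform):
    """Gain in dB that moves a track from its measured loudness to the target."""
    return PLATFORM_TARGETS[platform] - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)
```

A master measured at -9 LUFS would be turned down by 5 dB on a -14 LUFS platform, an amplitude factor of about 0.56.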
For video streaming services (Netflix, Disney+, Prime), audio levels are typically:
- Netflix — internally targets -27 LKFS for dialog level (their dialnorm convention), with the overall program targeting near broadcast levels.
- Disney+ / Prime / Apple TV+ — typically broadcast-aligned (-23 to -27 LUFS depending on dialog vs program target).
Video streaming services tend to be more conservative (closer to broadcast) than music streaming services, for compatibility with home theater systems and consistency with TV-watching expectations.
Integrated, momentary, short-term
Loudness measurement happens at three time scales:
- Integrated — average loudness over the entire program. The headline number for compliance.
- Short-term — average over a 3-second sliding window. Useful for detecting bursty loud sections.
- Momentary — average over a 400ms sliding window. Useful for detecting transient peaks.
For program delivery, integrated is what compliance targets. For loudness QC during production, short-term and momentary measurements help identify problem sections (bursts that exceed acceptable loudness, sustained passages that drag the integrated lower than necessary).
The ebur128 filter in FFmpeg reports all three:
ffmpeg -i input.mp4 -af ebur128 -f null - 2>&1 | grep "I:\|S:\|M:"
Output (simplified; FFmpeg's actual log lines carry additional fields):
I: -23.5 LUFS
S: -22.1 LUFS (last short-term)
M: -19.8 LUFS (last momentary)
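To make the difference between the momentary and short-term windows concrete, the sketch below slides both windows over a sequence of 100 ms sub-block powers. It assumes the input is already K-weighted, channel-summed mean-square power per sub-block; the function names are illustrative. Neither window is gated.

```python
import math


def sliding_loudness(sub_block_powers, window):
    """Loudness in LUFS of each run of `window` consecutive sub-blocks."""
    values = []
    for end in range(window, len(sub_block_powers) + 1):
        mean_power = sum(sub_block_powers[end - window:end]) / window
        if mean_power > 0:
            values.append(-0.691 + 10.0 * math.log10(mean_power))
        else:
            values.append(float("-inf"))
    return values


def momentary(sub_block_powers):
    """400 ms window, i.e. 4 sub-blocks of 100 ms."""
    return sliding_loudness(sub_block_powers, 4)


def short_term(sub_block_powers):
    """3 s window, i.e. 30 sub-blocks of 100 ms."""
    return sliding_loudness(sub_block_powers, 30)
```

A 400 ms burst that is 20 dB above its surroundings reaches its full level on the momentary reading but moves the short-term reading far less, which is why the two are used to spot different kinds of problem.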
True peak
True peak is the maximum signal amplitude, including inter-sample peaks that don't appear in standard sample-based peak measurement.
The distinction matters because lossy audio codecs and resampling can produce peaks that exceed full scale when reconstructed for playback, even when the source samples don't exceed full scale. Naive peak measurement misses these "inter-sample peaks" because they exist between sample positions.
True peak measurement uses oversampling (typically 4x) to detect these inter-sample peaks. The result is expressed in dBTP (decibels true peak, relative to digital full scale).
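The effect of oversampling can be shown with a small sketch. This is not the BS.1770 reference meter, which specifies its own interpolation filter; it assumes a simple Hann-windowed sinc interpolator, and the function names and the half_width parameter are illustrative.

```python
import math


def _windowed_sinc(d, half_width):
    """Hann-windowed sinc kernel evaluated at an offset of d samples."""
    if abs(d) >= half_width:
        return 0.0
    sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
    window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
    return sinc * window


def sample_peak_db(samples):
    """Peak of the stored sample values, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


def true_peak_dbtp(samples, oversample=4, half_width=16):
    """Estimate the peak of the reconstructed waveform by interpolating between samples."""
    n = len(samples)
    peak = 0.0
    for i in range(n):
        for k in range(oversample):
            t = i + k / oversample  # position in units of input samples
            lo = max(0, i - half_width + 1)
            hi = min(n - 1, i + half_width)
            value = sum(samples[m] * _windowed_sinc(t - m, half_width)
                        for m in range(lo, hi + 1))
            peak = max(peak, abs(value))
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")
```

A sine at a quarter of the sample rate with a 45 degree phase offset is the classic case: every stored sample lands at about 0.707 (-3.01 dBFS), while the reconstructed waveform peaks near full scale.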
For loudness compliance:
- EBU R128 specifies -1 dBTP maximum.
- ATSC A/85 specifies typically -2 dBTP.
- Streaming specifies typically -1 to -2 dBTP.
Compliant content has both a target integrated loudness AND a true peak below the spec ceiling. Either failure is non-compliance.
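The two-condition check can be expressed as a small validation helper. This is a sketch under the assumption that the measurements come from a separate metering step; the class, dictionary, and function names are illustrative, and the numbers are the ones given in the EBU R128 and ATSC A/85 sections of this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliverySpec:
    target_lufs: float
    tolerance_lu: float
    max_true_peak_dbtp: float


# Values taken from the EBU R128 and ATSC A/85 sections of this page.
SPECS = {
    "ebu_r128": DeliverySpec(-23.0, 0.5, -1.0),
    "atsc_a85": DeliverySpec(-24.0, 2.0, -2.0),
}


def compliance_failures(integrated_lufs, true_peak_dbtp, spec):
    """Return a list of failure messages; an empty list means compliant."""
    failures = []
    if abs(integrated_lufs - spec.target_lufs) > spec.tolerance_lu:
        failures.append(
            f"integrated loudness {integrated_lufs} is outside "
            f"{spec.target_lufs} +/- {spec.tolerance_lu} LU"
        )
    if true_peak_dbtp > spec.max_true_peak_dbtp:
        failures.append(
            f"true peak {true_peak_dbtp} dBTP exceeds {spec.max_true_peak_dbtp} dBTP"
        )
    return failures
```

Returning every failure, instead of stopping at the first, lets a QC report show both a loudness miss and a peak overshoot for the same file.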
True peak limiting (a brick-wall limiter applied during normalization) prevents true peak excursions. The limiter trades transient peak detail for compliance — slight clipping artifacts on the transients in exchange for spec-compliant output.
Dialog normalization (dialnorm)
Dialog normalization is a metadata-based loudness scheme used in surround codecs (AC-3, E-AC-3). The encoder embeds a "dialnorm" value indicating where dialog sits in the level structure; receiving systems use this to normalize playback level.
Dialnorm is a declared value, not something the decoder measures. The mixer or encoder sets dialnorm based on the intended dialog reference (typically -27 LKFS for Netflix's standard), and the decoder uses it for level management.
For modern streaming where loudness metadata is supported (dialnorm in AC-3, E-AC-3, and AC-4; MPEG-D DRC loudness metadata), the metadata travels with the content and is honored by playback systems. Where no such metadata is carried or honored (most commonly plain AAC delivery), the loudness target is enforced at the encoder stage and metadata isn't used for downstream level management.
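As one concrete instance of how a decoder can use the value: in AC-3, dialnorm is carried as a number from 1 to 31 that stands for a dialog level of -1 to -31 dB relative to full scale, and the decoder attenuates so that dialog lands at the -31 reference. A minimal sketch of that arithmetic follows; the function name is illustrative, and other codecs and playback modes apply their metadata differently.

```python
def ac3_dialnorm_attenuation_db(dialog_level_db):
    """Attenuation in dB an AC-3 decoder applies for a declared dialog level.

    dialog_level_db is the declared dialog level, from -1 down to -31.
    """
    if not -31 <= dialog_level_db <= -1:
        raise ValueError("AC-3 dialnorm covers dialog levels from -1 to -31 dB")
    # Dialog declared at -31 plays unchanged; louder dialog is pulled down to -31.
    return 31 + dialog_level_db
```

With a declared dialog level of -27, the decoder would attenuate by 4 dB.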
FFmpeg loudnorm
FFmpeg's loudnorm filter implements ITU-R BS.1770 measurement and EBU R128 normalization. Two-pass usage:
First pass — analyze:
ffmpeg -i input.mp4 -af loudnorm=I=-23:TP=-1:LRA=7:print_format=json -f null - 2> stats.json
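This measures the integrated loudness, true peak, and loudness range. FFmpeg prints the JSON block at the end of its stderr log, so stats.json also contains the regular log lines ahead of it. Output (excerpted):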
{
    "input_i" : "-25.34",
    "input_tp" : "-2.10",
    "input_lra" : "8.60",
    "input_thresh" : "-35.50",
    "target_offset" : "-2.34"
}
Second pass — normalize:
ffmpeg -i input.mp4 -af loudnorm=I=-23:TP=-1:LRA=7:measured_I=-25.34:measured_TP=-2.10:measured_LRA=8.60:measured_thresh=-35.5:offset=-2.34:linear=true -c:v copy output.mp4
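The second pass feeds the first-pass measurements back into the filter so that it can apply a single linear gain (no dynamics processing). FFmpeg only honors linear=true when the target LRA is not lower than the measured LRA and the gain-adjusted true peak stays at or below TP; otherwise it reverts to dynamic mode. With the sample numbers above, a +2.34 LU gain would lift the -2.10 dBTP peak to +0.24 dBTP, and the measured LRA (8.60) exceeds the target (7), so this particular input would be processed dynamically unless the targets are relaxed or the peaks are limited first. When linear mode does apply, the output lands on the target loudness with the true peak ceiling respected.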
Single-pass loudnorm is also available but uses dynamic processing that can alter content character. For broadcast/streaming compliance, two-pass with linear normalization is preferred.
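Gluing the two passes together usually means pulling the JSON block out of the first-pass log and building the second-pass filter string from it. The sketch below shows one way to do that; the helper names are illustrative, and it assumes the log ends with loudnorm's flat JSON object. It also checks the two conditions that FFmpeg's documentation gives for linear mode (target LRA not lower than the measured LRA, and gain-adjusted true peak not above the target TP), since loudnorm reverts to dynamic processing when they are not met.

```python
import json


def extract_loudnorm_stats(stderr_text):
    """Parse the JSON object that loudnorm prints at the end of FFmpeg's stderr."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}")
    return json.loads(stderr_text[start:end + 1])


def linear_mode_possible(stats, target_i, target_tp, target_lra):
    """Check the documented preconditions for linear normalization."""
    gain = target_i - float(stats["input_i"])
    peak_after_gain = float(stats["input_tp"]) + gain
    return peak_after_gain <= target_tp and float(stats["input_lra"]) <= target_lra


def second_pass_filter(stats, target_i, target_tp, target_lra):
    """Build the loudnorm filter string for the second pass."""
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={stats['input_i']}:measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}:measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}:linear=true"
    )
```

Checking linear_mode_possible before the second pass lets a pipeline flag inputs that would silently fall back to dynamic processing, so they can be routed to a limiter stage or given a higher LRA target.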
Loudness range (LRA)
LRA describes the loudness span of a program — the difference between the loudest 95th-percentile section and the quietest 10th-percentile section. Units are LU.
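The percentile definition can be sketched in a few lines. The sketch assumes a list of short-term loudness values (in LUFS) as input and applies the gating that EBU Tech 3342 describes before taking the percentiles: an absolute gate at -70 LUFS, then a relative gate 20 LU below the mean of what remains. The function name and the nearest-rank percentile are illustrative choices.

```python
import math


def loudness_range(short_term_lufs):
    """Loudness range in LU from a list of short-term loudness values (LUFS)."""
    # Absolute gate: discard near-silence.
    gated = [v for v in short_term_lufs if v > -70.0]
    if not gated:
        return 0.0
    # Relative gate: 20 LU below the power-average of the remaining values.
    mean_power = sum(10.0 ** (v / 10.0) for v in gated) / len(gated)
    relative_gate = 10.0 * math.log10(mean_power) - 20.0
    gated = sorted(v for v in gated if v > relative_gate)
    # Nearest-rank 10th and 95th percentiles of the gated distribution.
    low = gated[int((len(gated) - 1) * 0.10 + 0.5)]
    high = gated[int((len(gated) - 1) * 0.95 + 0.5)]
    return high - low
```

A program that spends half its time at -30 LUFS and half at -20 LUFS, with some silence and a few very quiet passages, comes out at an LRA of 10 LU because both gates remove the outliers first.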
Conventional LRA values:
- 2-5 LU — heavily-compressed audio (commercial radio, some pop music).
- 5-10 LU — typical pop music, much streaming content.
- 10-15 LU — film soundtracks, dynamic music.
- 15-25 LU — orchestral classical music, cinema-grade dynamic content.
- >25 LU — extreme dynamic range; rare in production.
EBU R128 doesn't mandate an LRA target but recommends keeping LRA appropriate to content. Compressing dynamic content to low LRA produces fatigue; preserving high LRA produces content that requires careful listening environments.
For streaming, LRA preservation is platform-specific. Broadcast typically targets moderate LRA (8-15 LU); streaming platforms often allow more dynamic range than broadcast.
The loudness wars
The "loudness wars" describes the trend (1990s-2010s) of mastering music progressively louder via heavy compression and limiting. The motivation: louder music seemed to "stand out" on radio playlists. The result: contemporary albums had compressed dynamic range, fatiguing sound, and clipping artifacts.
Streaming platforms ended the loudness wars at a structural level — by normalizing all content to the same target loudness, the "louder is better" arms race lost its motivation. Albums mastered with preserved dynamic range now sound at the same level as compressed albums, with the dynamic-range album sounding better.
For pipeline operations, the historical implication is that pre-2015 music masters often have compressed dynamic range that can't be undone in post. New production from 2015+ generally has preserved dynamic range that streaming normalization handles correctly.
Operational considerations
Things that matter for loudness normalization in production:
- Compliance targets — match the destination's standard. -23 LUFS for European broadcast, -24 LKFS for US broadcast, -14 to -18 LUFS for streaming.
- Two-pass normalization — for compliance-critical content, two-pass with linear normalization. Single-pass is acceptable for less-critical work.
- True peak limiting — never trust the source true peak. Apply true peak limiting at the normalization stage.
- Multi-language tracks — each language's audio track must be normalized independently to the same target. Not just the primary language.
- 5.1 / surround normalization — surround content has its own normalization considerations. Center channel (dialog) is typically the level reference rather than program-average.
- Atmos / object-based audio — Atmos has its own loudness model; standard ITU-R BS.1770 measurement may not apply directly.
- Live workflow constraints — live audio can't use two-pass normalization. Either accept dynamic range from the source or use real-time loudness control (more complex).
What MpegFlow does with loudness normalization
MpegFlow's DAG runtime expresses loudness normalization as work performed at the FfmpegExecutor stage via loudnorm filter parameters. The partitioner persists the stage to job_stages with explicit dependency tracking; per-stage retry handles transient failures. For two-pass normalization, the runtime models pass-1 (analysis) and pass-2 (apply) as two stages, with the analysis output flowing into the apply stage via cross-stage data flow. This is the same dependency mechanism that powers two-pass video encoding.
Default targets:
- Broadcast workflows — -23 LUFS (EBU R128) for European delivery; -24 LKFS (ATSC A/85) for US delivery.
- Streaming workflows — -16 LUFS (Apple Music aligned) by default; configurable per workflow.
- Internal/preview workflows — preserves source loudness; normalization optional.
For VOD content, two-pass normalization is the default for compliance-critical workflows. Single-pass is configured for throughput-bound pipelines where compliance is less critical.
For live workflows, real-time loudness control is via the appropriate FFmpeg limiter at the FfmpegExecutor live stage; the limiter targets a configured loudness level and adjusts gain dynamically based on short-term measurements. Quality is acceptable for most live use cases; for live broadcast where regulatory compliance matters, dedicated broadcast loudness controllers are recommended in the source-side audio chain.
For multi-language and surround content, per-track normalization targets are workflow YAML parameters that flow into the audio-processing stage; each track is normalized independently for consistent listening experience across languages and channel configurations.
The strict-broker security model handles loudness work like any pipeline payload — workers carry no ambient credentials; content access flows through short-lived presigned URLs scoped per stage; access is disposed on completion.
For customers building loudness-compliant pipelines for broadcast or premium streaming distribution, the standing recommendation: configure the appropriate target for the destination, use two-pass normalization for compliance-critical content, validate with ebur128 measurement on output, and integrate loudness QC into release workflow rather than fixing problems post-hoc.
The general guidance: loudness is one of the parts of audio that's invisible when it works (consistent listening levels across content) and frustrating when it doesn't (commercial breaks louder than programs, episodes louder than each other, music louder than dialog). Get the loudness normalization right; it's the operational discipline that distinguishes professional video pipelines from consumer-grade.