Opus is the audio codec that stitches together two formerly separate codec niches: speech-quality voice coding (where SILK lived) and music-quality general audio (where CELT lived). It's the codec WebRTC standardized on for browser-to-browser audio. It's the codec podcast platforms increasingly default to. In listening tests it generally beats AAC below about 96 kbps stereo and is at least competitive above that. And it's royalty-free, with a single high-quality, BSD-licensed reference implementation. This page is the engineering reference.
What Opus is
Opus is a hybrid audio codec — internally, it has two coders that operate at different frequency ranges and can be used independently or in combination:
- SILK — a speech-optimized coder, derived from Skype's pre-Microsoft codec work. Excellent at voice content sampled at 8 to 16 kHz (narrowband through wideband, which is up to 8 kHz of audio bandwidth). Operates at roughly 6-40 kbps mono.
- CELT — a low-latency music coder, derived from earlier Xiph.org work. Excellent at general audio up to fullband (48 kHz sampling, about 20 kHz of audio bandwidth). Typically used at 32 kbps and up.
- Hybrid mode — uses SILK for the lower frequencies (below 8 kHz) and CELT for the upper frequencies. The transition between SILK-only, hybrid, and CELT-only modes is smooth across the bitrate range.
The encoder picks the mode based on bitrate, latency budget, and (optionally) signal type hints. For voice content at low bitrates, SILK is engaged; for music at high bitrates, CELT runs alone; for wider-band speech at medium bitrates, hybrid mode splits the spectrum between the two coders.
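The mode the encoder picked is visible in every packet: the first byte (the TOC byte defined in RFC 6716, section 3.1) encodes the coding mode, audio bandwidth, and frame duration. Below is a minimal Python sketch of decoding it, which can help when debugging what an encoder actually chose. The names `TocInfo` and `parse_toc` are illustrative and not part of libopus.

```python
from dataclasses import dataclass

# Frame durations (ms) per coding mode, in the order the TOC config
# numbers enumerate them (RFC 6716, section 3.1).
_SILK_MS = (10.0, 20.0, 40.0, 60.0)
_HYBRID_MS = (10.0, 20.0)
_CELT_MS = (2.5, 5.0, 10.0, 20.0)


@dataclass(frozen=True)
class TocInfo:
    mode: str              # "SILK", "Hybrid", or "CELT"
    bandwidth: str         # "NB", "MB", "WB", "SWB", or "FB"
    frame_ms: float        # duration of each frame in the packet
    stereo: bool
    frame_count_code: int  # 0..3, how frames are packed in the packet


def parse_toc(packet: bytes) -> TocInfo:
    """Decode the first (TOC) byte of an Opus packet."""
    if not packet:
        raise ValueError("empty Opus packet")
    toc = packet[0]
    config = toc >> 3          # top 5 bits: mode, bandwidth, frame size
    stereo = bool(toc & 0x04)  # next bit: mono/stereo
    code = toc & 0x03          # low 2 bits: frame count code

    if config < 12:            # configs 0..11: SILK-only
        mode = "SILK"
        bandwidth = ("NB", "MB", "WB")[config // 4]
        frame_ms = _SILK_MS[config % 4]
    elif config < 16:          # configs 12..15: hybrid
        mode = "Hybrid"
        bandwidth = ("SWB", "FB")[(config - 12) // 2]
        frame_ms = _HYBRID_MS[config % 2]
    else:                      # configs 16..31: CELT-only
        mode = "CELT"
        bandwidth = ("NB", "WB", "SWB", "FB")[(config - 16) // 4]
        frame_ms = _CELT_MS[config % 4]
    return TocInfo(mode, bandwidth, frame_ms, stereo, code)
```

Calling `parse_toc` on the first byte of successive packets shows mode switches as they happen, for example when a stream moves from speech to music.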
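The frame sizes are short: 2.5, 5, 10, 20, 40, and 60 ms (the 2.5 and 5 ms sizes are available in CELT-only mode). That is why Opus is the codec of choice for live and WebRTC audio. AAC frames run roughly 20-93 ms depending on profile and sample rate; the 2.5-10 ms range is what makes sub-50 ms end-to-end audio latency achievable.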
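The frame-size arithmetic is easy to check. The sketch below converts frame durations to sample counts at Opus's 48 kHz internal rate, compares them with a 1024-sample AAC-LC frame, and estimates the average payload per packet at a given bitrate. The helper names are illustrative, and frame duration is only one component of end-to-end latency (encoder lookahead, jitter buffers, and the network add more).

```python
OPUS_FRAME_MS = (2.5, 5, 10, 20, 40, 60)
AAC_LC_FRAME_SAMPLES = 1024  # samples per channel in one AAC-LC frame


def samples_per_frame(frame_ms: float, sample_rate_hz: int = 48_000) -> int:
    """Samples per channel in one frame of the given duration."""
    return round(frame_ms * sample_rate_hz / 1000)


def duration_ms(samples: int, sample_rate_hz: int) -> float:
    """Duration in milliseconds of a frame holding `samples` samples."""
    return 1000 * samples / sample_rate_hz


def payload_bytes(bitrate_kbps: float, frame_ms: float) -> float:
    """Average encoded bytes per packet at the given bitrate."""
    return bitrate_kbps * 1000 / 8 * frame_ms / 1000


def packets_per_second(frame_ms: float) -> float:
    """Packet rate implied by one frame per packet."""
    return 1000 / frame_ms
```

A 20 ms Opus frame is 960 samples at 48 kHz, while a 1024-sample AAC-LC frame at the same rate lasts about 21.3 ms. Dropping to 2.5 ms frames raises the packet rate from 50 to 400 per second, so per-packet header overhead grows as latency shrinks.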
Compression efficiency
Opus has been benchmarked against AAC and other codecs across content types. The headline results, as rough equivalences:
- Voice content — Opus at 24 kbps mono is roughly equivalent to HE-AAC v2 at 32 kbps mono. SILK is genuinely state-of-the-art for speech.
- Mixed content — Opus at 64 kbps stereo beats AAC-LC at 96 kbps stereo for general listening. Roughly tied with HE-AAC at the same bitrate.
- Music content — Opus at 96 kbps stereo is comparable to AAC-LC at 128 kbps stereo. Above 128 kbps, the codecs are similarly competent, and both reach perceptual transparency at similar bitrates.
- Very low bitrates (<20 kbps mono) — Opus is the clear winner. There's no AAC variant that matches Opus at 12-16 kbps mono speech.
- Very high bitrates (>192 kbps stereo) — Opus and AAC are both transparent for most listeners. Codec choice doesn't matter much.
The pattern: Opus wins by larger margins at lower bitrates, ties or wins by smaller margins at higher bitrates, and is competitive everywhere. Where Opus loses is the integration story — AAC is the codec downstream tooling assumes, especially in video pipelines.
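To put those equivalences in delivery terms, the sketch below converts a bitrate into megabytes per hour of audio and computes the relative saving for each Opus/AAC pair quoted above. The pairs are this section's rough equivalences, not measurements, and the function names are illustrative.

```python
def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Payload size of one hour of audio, in decimal megabytes."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * 3600 / 1_000_000


def bitrate_saving_percent(opus_kbps: float, aac_kbps: float) -> float:
    """Relative bitrate saving of Opus against the AAC equivalent."""
    return 100 * (1 - opus_kbps / aac_kbps)


# (content type, Opus kbps, AAC kbps) as quoted in this section.
EQUIVALENCES = [
    ("voice, mono", 24, 32),
    ("mixed, stereo", 64, 96),
    ("music, stereo", 96, 128),
]


def summarize() -> list:
    """Return (label, Opus MB/h, AAC MB/h, saving %) for each pair."""
    return [
        (label,
         megabytes_per_hour(opus),
         megabytes_per_hour(aac),
         bitrate_saving_percent(opus, aac))
        for label, opus, aac in EQUIVALENCES
    ]
```

For the mixed-content pair this works out to roughly 28.8 MB versus 43.2 MB per hour, about a third less data for comparable quality under the stated equivalence.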
libopus in production
libopus is the reference encoder, maintained by Xiph.org and Mozilla. Unlike most codec spaces, Opus has one high-quality encoder implementation, and it's the reference. There are no commercial Opus encoders you'd use over libopus.
The CLI invocation:
ffmpeg -i input.wav -c:a libopus -b:a 96k output.opus
Production tuning specifics:
- Bitrate (`-b:a`) — 96 kbps stereo is the production default for music; 64 kbps stereo for general content; 24-32 kbps mono for voice. VBR mode (the default) handles bitrate adaptation; for CBR, pass `-vbr off`.
- Application hint (`-application`) — `audio` for music/general, `voip` for voice (engages SILK earlier), `lowdelay` for low latency. Default is `audio`.
- Frame duration (`-frame_duration`) — 2.5/5/10/20/40/60 ms. The default of 20 ms is fine for VOD; use 5-10 ms for live; 60 ms only if bitrate is very low and latency doesn't matter.
- Compression level (`-compression_level`) — 0-10, default 10 (highest quality). Lower values trade quality for encoder speed; rarely worth it because encoding is fast.
- Cutoff frequency (`-cutoff`) — for low-bitrate streaming, capping audio bandwidth at 12 kHz instead of the full 20 kHz can save bitrate. By default the encoder adapts the cutoff to the bitrate.
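The tuning options above can be assembled programmatically. Below is a minimal sketch that builds an ffmpeg argument list for libopus and rejects values the encoder wrapper would not accept. The function name and defaults are assumptions for illustration; the flags are the ffmpeg libopus options listed above.

```python
from typing import List

_APPLICATIONS = {"voip", "audio", "lowdelay"}
_FRAME_DURATIONS_MS = {2.5, 5, 10, 20, 40, 60}
_VBR_MODES = {"off", "on", "constrained"}


def ffmpeg_opus_command(
    src: str,
    dst: str,
    bitrate_kbps: int,
    application: str = "audio",
    frame_duration_ms: float = 20,
    vbr: str = "on",
    compression_level: int = 10,
) -> List[str]:
    """Build (but do not run) an ffmpeg command line for libopus."""
    if application not in _APPLICATIONS:
        raise ValueError(f"unknown application hint: {application!r}")
    if frame_duration_ms not in _FRAME_DURATIONS_MS:
        raise ValueError(f"unsupported frame duration: {frame_duration_ms} ms")
    if vbr not in _VBR_MODES:
        raise ValueError(f"unknown vbr mode: {vbr!r}")
    if not 0 <= compression_level <= 10:
        raise ValueError("compression_level must be in 0..10")
    return [
        "ffmpeg", "-i", src,
        "-c:a", "libopus",
        "-b:a", f"{bitrate_kbps}k",
        "-vbr", vbr,
        "-application", application,
        # ":g" renders 20 as "20" and 2.5 as "2.5"
        "-frame_duration", f"{frame_duration_ms:g}",
        "-compression_level", str(compression_level),
        dst,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. A voice preset would be `ffmpeg_opus_command("input.wav", "voice.opus", 24, application="voip")`.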
Opus encoding is fast, typically faster than AAC encoding via FDK or ffmpeg's native encoder. Encoding speed is rarely the bottleneck, so there is little reason to tune Opus for it.
Container support
Opus's container support evolved over its lifetime. State of play in 2026:
- Ogg — native. The original Opus container. `.opus` files are usually Ogg-Opus.
- WebM — native. Opus-in-WebM is the standard combination for browser playback of Opus audio.
- MP4 — supported by ffmpeg's MP4 muxer (the `mov,mp4,m4a` family) with the `opus` codec. Apple's Opus decoding dates from Safari 11 (macOS High Sierra, iOS 11), but which containers Safari accepts has varied across releases, so test Opus-in-MP4 on the versions you target. Earlier Apple versions don't decode Opus-in-MP4.
- MPEG-TS — possible but uncommon. Carrying Opus in TS is technically possible, but most HLS implementations don't ship it.
- CMAF / fragmented MP4 — supported, used in DASH for Opus delivery.
The practical decision: for video pipelines that target browsers + modern Apple, Opus-in-MP4 works. For older Apple devices or legacy HLS, AAC is still the safer choice.
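When a pipeline receives a file of unknown provenance, it can confirm that the file is Ogg-Opus by reading the identification header from the first Ogg page. The sketch below follows the field layouts in RFC 3533 (Ogg page header) and RFC 7845 (`OpusHead`). It is a minimal illustration under stated assumptions: it does not verify the page CRC, and it does not handle WebM or MP4. The `OpusHead` class and `parse_opus_head` function are illustrative names.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class OpusHead:
    version: int
    channels: int
    pre_skip: int            # samples (at 48 kHz) to discard at start
    input_sample_rate: int   # informational; decoding runs at 48 kHz
    output_gain_db: float
    mapping_family: int


def parse_opus_head(first_page: bytes) -> OpusHead:
    """Parse the OpusHead packet from the first page of an Ogg stream."""
    # Ogg page header: 27 fixed bytes, then one lacing byte per segment.
    if len(first_page) < 27 or first_page[:4] != b"OggS":
        raise ValueError("not an Ogg page")
    segment_count = first_page[26]
    start = 27 + segment_count
    head = first_page[start:start + 19]
    if len(head) < 19 or head[:8] != b"OpusHead":
        raise ValueError("first packet is not an OpusHead header")
    # Little-endian: version, channels, pre-skip, rate, gain (Q7.8), family
    version, channels, pre_skip, rate, gain_q8, family = struct.unpack(
        "<BBHIhB", head[8:19]
    )
    return OpusHead(version, channels, pre_skip, rate, gain_q8 / 256, family)
```

Reading the first few hundred bytes of a `.opus` file and passing them to `parse_opus_head` is enough, since the identification header sits alone on the first page.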
When Opus beats AAC
The cases where Opus is the right codec for a video pipeline:
- WebRTC — Opus is mandatory across all WebRTC implementations. AAC support is patchy. Any browser-to-browser real-time use case is Opus.
- Live streaming with sub-100ms target latency — Opus's short frames beat AAC's 20ms+ frames. Decoding runs in software (there are no codec-specific hardware accelerators), but the codec is light enough that this isn't a problem.
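- Voice / podcast content — at the bitrates podcasts target (32-64 kbps mono), Opus is meaningfully better than AAC. Spotify's podcast platform reportedly uses Opus, and Opus delivery support among podcast apps is growing but uneven, so confirm it for each platform you publish to, Apple Podcasts included.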
- Bandwidth-constrained streaming — at very low bitrates (24-48 kbps, mono or stereo), Opus is the codec that delivers acceptable quality. AAC at the same bitrate is noticeably worse.
- Internal tooling — pipelines you control end-to-end, with no client-compatibility matrix. Opus is technically better and royalty-free.
When AAC is still the right choice
For most streaming video pipelines, AAC stays the default because:
- Universal client support — every device that decodes video also decodes AAC. Opus support is broad but not universal (older iOS, older smart TVs, some embedded set-top boxes).
- HLS legacy compatibility — AAC-in-TS is the safe choice for HLS that has to work everywhere. Opus-in-fMP4 only works on CMAF-capable players.
- Tooling assumes AAC — most audio mastering, loudness normalization, and content-management systems assume AAC. Switching to Opus means revisiting parts of your toolchain that aren't broken.
- Bitrate isn't the constraint — for premium streaming at 192 kbps+ stereo audio, AAC is transparent. The compression advantage of Opus only matters when you're squeezing bitrate.
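The two lists above reduce to a short decision rule. The sketch below is one way to encode it. The `Delivery` fields, the rule ordering, and the 64 kbps threshold are assumptions chosen for illustration rather than fixed recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    webrtc: bool = False                # real-time, browser-to-browser
    legacy_clients: bool = False        # older iOS, smart TVs, TS-only HLS
    controlled_endpoints: bool = False  # you own every encoder and player
    stereo_bitrate_kbps: int = 128      # audio bitrate budget


def pick_audio_codec(d: Delivery) -> str:
    """Return "opus" or "aac" following the guidance in this section."""
    if d.webrtc:
        return "opus"   # mandatory in WebRTC
    if d.legacy_clients:
        return "aac"    # universal decode support wins
    if d.controlled_endpoints:
        return "opus"   # no compatibility matrix to satisfy
    if d.stereo_bitrate_kbps <= 64:
        return "opus"   # assumed threshold: bitrate is the constraint
    return "aac"        # default for mainstream streaming video
```

Ordering matters here: the legacy-client check runs before the bitrate check, because a stream that cannot be decoded is worse than one that sounds slightly worse.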
Opus vs AAC — the practical guide
| Use case | Codec | Reason |
|---|---|---|
| Premium video streaming (VOD) | AAC | Universal client support, transparent at production bitrates |
| Live video streaming (HLS, DASH) | AAC | Same — plus legacy HLS compatibility |
| WebRTC media | Opus | Mandatory in WebRTC spec |
| Podcast delivery | Opus | Better quality at podcast-typical bitrates |
| Voice-only streams | Opus | SILK is state-of-the-art for speech |
| Bandwidth-constrained streaming | Opus | Cleaner quality at very low bitrates |
| Internal video pipelines | Either | Technical merit (Opus); operational consistency (AAC) |
| Music streaming products | Codec-specific | Audio products have different drivers |
Opus and licensing
The Opus reference implementation is BSD-licensed, and the codec is royalty-free. The patents that touch Opus were disclosed to the IETF during standardization in the codec working group, which produced RFC 6716 (the codec specification); RFC 7587 later defined the RTP payload format. Those patents are licensed under terms that allow free use; no patent pool exists; no royalties accrue.
This is a meaningfully clean licensing story compared to AAC. For products where licensing complexity is a real cost — open-source projects, internal tooling, decentralized infrastructure — Opus is the audio codec without administrative overhead.
A note on Opus history and the IETF process
Opus is one of the few codecs to come out of the IETF rather than ITU-T or MPEG. The standardization process was deliberately open: the codec specification, reference implementation, and patent licensing analysis were all developed in public IETF working groups, with code dropped into Xiph.org's repositories from day one. RFC 6716 (the original Opus specification, 2012) ran through the standard IETF rough-consensus process, with public review and revision.
This matters because the IETF process produced a codec that didn't have to navigate the MPEG/ITU-T patent-pool dynamics. The combined CELT+SILK lineage was already royalty-free or licensed for free use; the IETF-driven standardization formalized that without introducing new patent claims. Opus is one of the cleanest examples of how to standardize a codec without the licensing aftermath.
For comparison, AAC's standardization happened inside MPEG, the patent pools formed afterward, and the licensing situation evolved over a decade. Opus skipped that whole arc. The result: every WebRTC implementation can ship Opus without licensing-team negotiations, and that's why Opus became the WebRTC audio codec rather than AAC. Standardization-process choices have downstream consequences for what gets adopted.
What MpegFlow does with Opus
MpegFlow's FfmpegExecutor worker image includes libopus, so Opus encoding is configurable per-rendition with application-hint and bitrate parameters from workflow YAML. The DAG runtime expresses Opus encoding as its own stage with explicit dependency tracking; per-stage retry handles transient failures. The workflow YAML supports per-output-track audio codec selection, so a pipeline can emit AAC for HLS delivery and Opus for WebRTC simulcast from the same source as parallel sibling stages with different audio encoder parameters.
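The parallel AAC and Opus outputs can be pictured as two sibling stages that read the same source. The sketch below is a hypothetical illustration only: it is not MpegFlow's workflow YAML schema or its executor, and the `AudioRendition` type, rendition names, bitrates, and file names are all assumptions. It shows the kind of ffmpeg command each sibling stage would end up running.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AudioRendition:
    name: str
    encoder: str                      # ffmpeg encoder: "aac" or "libopus"
    bitrate_kbps: int
    output: str
    extra_args: Tuple[str, ...] = ()  # encoder-specific flags


def stage_command(source: str, r: AudioRendition) -> List[str]:
    """ffmpeg command for one audio rendition stage (not executed here)."""
    return [
        "ffmpeg", "-i", source,
        "-map", "0:a:0",              # first audio stream only
        "-c:a", r.encoder,
        "-b:a", f"{r.bitrate_kbps}k",
        *r.extra_args,
        r.output,
    ]


# Two sibling stages fed by the same source (hypothetical values).
RENDITIONS = [
    AudioRendition("hls-aac", "aac", 128, "audio_aac.m4a"),
    AudioRendition("rtc-opus", "libopus", 48, "audio_opus.webm",
                   ("-application", "voip")),
]


def plan(source: str) -> List[List[str]]:
    """One independent command per rendition, suitable for parallel runs."""
    return [stage_command(source, r) for r in RENDITIONS]
```

Because each rendition is its own command, a failure in one encode can be retried without redoing the other.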
Default audio for video pipelines is AAC (production reality for streaming); Opus is the configured default for WebRTC live streams and for customers explicitly choosing it. The two customer use cases where Opus is the primary audio codec: a live conferencing-adjacent product running over WebRTC, and a podcast-distribution platform shipping Opus to its modern client app with AAC fallback for legacy. Both are operationally interesting because audio matters disproportionately for user experience in those products — bad audio breaks the use case in ways that bad video usually doesn't.