Bitmovin captions: CEA-608/708, EBU-TT, WebVTT, and broadcast workflows

Bitmovin's caption handling — broadcast standards (CEA-608/708, EBU-TT, IMSC), WebVTT for streaming, sidecar vs burned-in workflows, and the accessibility compliance story.


Captions are where vendor depth shows most clearly, because the caption ecosystem is fragmented across broadcast standards (CEA-608, CEA-708, EBU-TT, IMSC) and streaming standards (WebVTT, TTML). Bitmovin handles the full set, which matters for operators delivering to both broadcast and streaming targets, as most Bitmovin customers do.

What Bitmovin actually has

Bitmovin Encoding ingests captions from MPEG-TS streams with embedded CEA-608/708, MXF files with EBU-TT or IMSC sidecars, and standalone SRT/VTT/TTML files. The packager outputs WebVTT and IMSC1 for HLS/DASH delivery, with sidecar manifests for player-controlled captions. This is the preferred route for accessibility, because viewers can turn captions on or off and switch languages. Burned-in captions are supported as an alternative when delivery requires captions baked into the picture (broadcast compliance, social-media autoplay). Multi-language caption tracks are output with proper language-code signaling, and caption position metadata (on-screen alignment, vertical position) survives the encoding pipeline.
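To make "position metadata survives" concrete, here is a minimal Python sketch of what positioned WebVTT output looks like. It is not Bitmovin's implementation or API; the `Cue` class and `render_webvtt` function are hypothetical names for illustration. The only parts taken from the WebVTT format itself are the `WEBVTT` header, the `-->` timing line, and the `line`, `position`, and `align` cue settings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    """Hypothetical in-memory cue; the field names are illustrative only."""
    start: float                    # seconds from the start of the asset
    end: float                      # seconds from the start of the asset
    text: str
    line: Optional[str] = None      # vertical placement, e.g. "10%"
    position: Optional[str] = None  # horizontal placement, e.g. "50%"
    align: Optional[str] = None     # start, center, end, left, or right


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def render_webvtt(cues: List[Cue]) -> str:
    """Serialize cues to WebVTT, keeping any position settings they carry."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        settings = []
        if cue.line is not None:
            settings.append(f"line:{cue.line}")
        if cue.position is not None:
            settings.append(f"position:{cue.position}")
        if cue.align is not None:
            settings.append(f"align:{cue.align}")
        if settings:
            timing += " " + " ".join(settings)
        lines += [timing, cue.text, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_webvtt([Cue(1.0, 3.5, "Hello", line="10%", align="start")]))
```

A cue that loses its `line` setting somewhere in the pipeline still renders, but at the player's default position, which is the kind of silent distortion that accessibility audits tend to catch.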

Where it's the right fit

Bitmovin fits three kinds of workflow: broadcast operators with mixed delivery targets (linear broadcast plus streaming) that need CEA-608/708 ingest and WebVTT output; premium VOD with multi-language caption requirements (10+ language tracks per asset); and accessibility-compliance-driven workflows where caption metadata (screen position, color, formatting) must survive encoding without distortion.

Where the gaps show up

Auto-generation of captions from audio (Whisper-style transcription) is not part of Bitmovin's product; it ingests pre-existing caption files. If your workflow needs auto-captioning, you bolt on a separate transcription vendor (Mux Video has this; Bitmovin doesn't). Live caption insertion (real-time captioning during live encoding) requires partnered services. Forced subtitles (subtitles shown only for foreign-language dialog in otherwise native-language content) work but require manual flagging in the manifest.
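For HLS, flagging a forced track in the manifest comes down to the `FORCED` attribute on an `EXT-X-MEDIA` tag of type `SUBTITLES`. The Python sketch below shows what that tag looks like when written out. It is an illustration of the HLS side only, not a description of how Bitmovin exposes the flag; the function name, the `subs` group ID, and the example URI are assumptions.

```python
def yes_no(flag: bool) -> str:
    """HLS enumerated attributes use the literal strings YES and NO."""
    return "YES" if flag else "NO"


def subtitle_media_tag(
    name: str,
    language: str,
    uri: str,
    *,
    group_id: str = "subs",   # assumed group name, referenced by the variant streams
    default: bool = False,
    autoselect: bool = True,
    forced: bool = False,
) -> str:
    """Build an EXT-X-MEDIA tag for one subtitle rendition (hypothetical helper)."""
    # HLS requires AUTOSELECT=YES whenever DEFAULT=YES.
    if default and not autoselect:
        raise ValueError("DEFAULT=YES requires AUTOSELECT=YES")
    attrs = [
        "TYPE=SUBTITLES",
        f'GROUP-ID="{group_id}"',
        f'NAME="{name}"',
        f'LANGUAGE="{language}"',
        f"DEFAULT={yes_no(default)}",
        f"AUTOSELECT={yes_no(autoselect)}",
        f"FORCED={yes_no(forced)}",
        f'URI="{uri}"',
    ]
    return "#EXT-X-MEDIA:" + ",".join(attrs)


if __name__ == "__main__":
    # A regular English track and a forced-narrative English track.
    print(subtitle_media_tag("English", "en", "subs/en.m3u8"))
    print(subtitle_media_tag("English (forced)", "en", "subs/en_forced.m3u8", forced=True))
```

The forced track is a separate rendition from the full subtitle track, which is why it has to be identified and flagged per asset rather than derived automatically.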

Pricing implications

Caption handling is included in Bitmovin's encoding pricing — no separate per-asset or per-minute fee. Auto-caption generation, when added via partnered vendors, is metered separately by the partner.

The MpegFlow angle

MpegFlow's caption stage in the DAG is a parameterized step that handles CEA-608/708, EBU-TT, and WebVTT ingest and produces WebVTT and IMSC1 output. Auto-caption generation via Whisper / Deepgram integration is scheduled to arrive in Q4 2026 alongside DRM. The audit log records caption track count and language codes per asset for accessibility compliance verification.
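As a rough illustration of how a per-asset record with track count and language codes can support a compliance check, consider the Python sketch below. It does not reflect MpegFlow's actual audit log schema: the function name, every field name, and the idea of comparing against a required-language list are assumptions made for the example.

```python
from typing import Dict, List


def caption_audit_record(
    asset_id: str,
    track_languages: List[str],
    required_languages: List[str],
) -> Dict[str, object]:
    """Build a hypothetical audit entry for one asset's caption tracks."""
    present = sorted(set(track_languages))
    # Languages the compliance policy asks for that no track provides.
    missing = sorted(set(required_languages) - set(present))
    return {
        "asset_id": asset_id,
        "caption_track_count": len(track_languages),
        "language_codes": present,
        "missing_languages": missing,
        "compliant": not missing,
    }


if __name__ == "__main__":
    record = caption_audit_record("asset-001", ["en", "es", "fr"], ["en", "de"])
    print(record)
```

Keeping the language codes in the record, rather than only the count, is what lets a reviewer verify that the specific languages a regulation or contract names are present.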

Topics

captions
subtitles
cea-608
webvtt
accessibility
Bitmovin