AWS MediaConvert captions: CEA-608/708, IMSC, WebVTT, SCC and the broadcast formats
AWS MediaConvert's caption support — every broadcast and streaming caption format, ingest patterns, and the SCC + IMSC handling that broadcast workflows need.
AWS MediaConvert has the broadest caption-format support of any commercial transcoding service. For broadcast workflows ingesting legacy SCC files, MXF with embedded CEA-708, or IMSC sidecars, MediaConvert accepts formats that other vendors can only handle after additional preprocessing. This breadth is the AWS Elemental heritage showing up.
What AWS MediaConvert actually has
- Ingest formats: SCC (Scenarist Closed Caption, the legacy broadcast format), CEA-608/708 embedded in MPEG-TS or MXF, EBU-TT (D and T variants), IMSC1 (Text and Image profiles), TTML, WebVTT, SRT, SMPTE Timed Text.
- Output formats: WebVTT, IMSC1 (Text), CEA-608/708 (embedded or burned in), TTML, SCC, SMPTE-TT, EBU-TT.
- Multi-language tracks with proper language-code signaling per CEA-708 service number.
- Caption position metadata (vertical alignment, off-screen positioning) preserved through encoding.
- SCTE-35 markers in source streams pass through with captions kept in sync.
- Burn-in captions with style overrides (font, color, background) configurable per job.
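To make the ingest and multi-language points concrete, here is a minimal sketch that builds a caption selector for sidecar SCC files and one embedded caption description per language, each on its own CEA-708 service number. It only constructs plain dicts. The field names are intended to follow the MediaConvert job settings JSON and should be verified against the current API reference; the S3 paths, selector names, and helper functions are hypothetical.

```python
# Sketch only: plain dicts shaped like MediaConvert job settings JSON.
# S3 paths, selector names, and helper names are hypothetical.

def scc_selector(source_file):
    """Caption selector that reads a sidecar SCC file."""
    return {
        "SourceSettings": {
            "SourceType": "SCC",
            "FileSourceSettings": {"SourceFile": source_file},
        }
    }


def embedded_descriptions(selectors_by_language):
    """Build one embedded caption description per language.

    selectors_by_language maps an ISO 639-2 code (for example "ENG") to a
    caption selector name. Languages get CEA-708 service numbers 1, 2, 3, ...
    in insertion order, with the same number used as the CEA-608 channel.
    """
    if len(selectors_by_language) > 4:
        # CEA-608 only defines four caption channels (CC1 to CC4).
        raise ValueError("at most 4 languages fit in CEA-608 channels")
    descriptions = []
    for number, (language, selector) in enumerate(
        selectors_by_language.items(), start=1
    ):
        descriptions.append({
            "CaptionSelectorName": selector,
            "LanguageCode": language,
            "DestinationSettings": {
                "DestinationType": "EMBEDDED",
                "EmbeddedDestinationSettings": {
                    "Destination608ChannelNumber": number,
                    "Destination708ServiceNumber": number,
                },
            },
        })
    return descriptions


caption_selectors = {
    "Captions Selector 1": scc_selector("s3://example-bucket/captions/en.scc"),
    "Captions Selector 2": scc_selector("s3://example-bucket/captions/es.scc"),
}
caption_descriptions = embedded_descriptions(
    {"ENG": "Captions Selector 1", "SPA": "Captions Selector 2"}
)
```

In a real job, `caption_selectors` would sit under an input's `CaptionSelectors` key and `caption_descriptions` under a video output's `CaptionDescriptions` key.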
Where it's the right fit
Broadcast ingest workflows with SCC files from legacy production — MediaConvert's SCC parsing is mature where some vendors require pre-conversion. Operators delivering to multi-target audiences (linear broadcast + OTT + accessibility-compliance) where the same source must produce CEA-708 (broadcast), WebVTT (streaming), and burned-in (social-media-style autoplay) outputs from one job spec. Studio post-production workflows with IMSC1 captions where positional metadata must survive precisely.
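The "one source, three deliverables" pattern can be sketched as three caption descriptions that all point at the same caption selector. As with any hand-written job spec, treat this as an assumption-laden outline: the destination type strings and settings keys are meant to mirror the MediaConvert job settings JSON, while the selector name, the output labels, and the style values are illustrative choices, not recommendations.

```python
# Sketch only: three caption treatments driven by one caption selector.
# Output labels ("broadcast", "streaming", "social") are hypothetical.

def caption_description(selector, language, destination_type, extra=None):
    """Build a single caption description dict for one output."""
    settings = {"DestinationType": destination_type}
    settings.update(extra or {})
    return {
        "CaptionSelectorName": selector,
        "LanguageCode": language,
        "DestinationSettings": settings,
    }


def fan_out(selector="Captions Selector 1", language="ENG"):
    """Return the caption description for each delivery target."""
    return {
        # Embedded CEA-608/708 travels inside the broadcast video output.
        "broadcast": caption_description(selector, language, "EMBEDDED", {
            "EmbeddedDestinationSettings": {
                "Destination608ChannelNumber": 1,
                "Destination708ServiceNumber": 1,
            },
        }),
        # WebVTT is a sidecar track, typically its own captions-only output.
        "streaming": caption_description(selector, language, "WEBVTT"),
        # Burn-in renders text into the picture, with per-job style overrides.
        "social": caption_description(selector, language, "BURN_IN", {
            "BurninDestinationSettings": {
                "FontColor": "WHITE",
                "BackgroundColor": "BLACK",
                "BackgroundOpacity": 128,
            },
        }),
    }


targets = fan_out()
```

Because every description references the same selector, the source captions are parsed once and only the destination differs per output.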
Where the gaps show up
Auto-caption generation (Whisper- or Transcribe-style speech-to-text) isn't part of MediaConvert; pair it with Amazon Transcribe as a separate step. Live caption insertion isn't supported either (live captioning is in MediaLive, not MediaConvert). Forced subtitles for partial dialog translation (foreign-language dialog within a primarily-native-language asset) require careful flagging that the docs don't emphasize.
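For the auto-captioning gap, the usual pairing is to have the transcription service emit a WebVTT or SRT sidecar and feed that file to MediaConvert as a caption source. The sketch below only builds the request for boto3's `start_transcription_job` call on the Transcribe client; the job name, bucket, and media URI are hypothetical, and the actual call is left as a comment because it needs credentials and network access.

```python
# Sketch only: build a transcription request that also emits subtitle files.
# Job name, bucket, and media URI are hypothetical placeholders.

def transcription_job_request(job_name, media_uri, output_bucket,
                              language_code="en-US"):
    """Keyword arguments for the Transcribe start_transcription_job call."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
        "OutputBucketName": output_bucket,
        # Ask for sidecar subtitle files alongside the JSON transcript.
        "Subtitles": {"Formats": ["vtt", "srt"]},
    }


request = transcription_job_request(
    job_name="example-asset-captions",
    media_uri="s3://example-bucket/source/example-asset.mp4",
    output_bucket="example-caption-bucket",
)

# With boto3 installed and AWS credentials configured, this would be sent as:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```

The resulting `.vtt` file can then be referenced from a MediaConvert caption selector as a sidecar source, the same way a hand-authored WebVTT file would be.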
Pricing implications
Caption handling is included in MediaConvert encoding minutes, with no separate caption charge. Burn-in captions don't change the per-minute pricing tier, and sidecar caption output is similarly bundled. Amazon Transcribe (when paired for auto-captioning) is metered separately, based on the duration of audio processed.
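The cost structure described here reduces to simple arithmetic: captions contribute nothing on their own, and transcription is an independent line item. The function below is a back-of-the-envelope sketch; the rates in the example call are made-up placeholders chosen for easy arithmetic, not actual AWS prices, so substitute figures from the current pricing pages.

```python
# Sketch only: rates are caller-supplied; none of these numbers are real prices.

def job_cost(encode_minutes, encode_rate,
             transcribe_minutes=0.0, transcribe_rate=0.0):
    """Break a job's cost into encoding, captions, and transcription."""
    encoding = encode_minutes * encode_rate
    captions = 0.0  # bundled into encoding minutes, so no separate charge
    transcription = transcribe_minutes * transcribe_rate
    return {
        "encoding": encoding,
        "captions": captions,
        "transcription": transcription,
        "total": encoding + captions + transcription,
    }


# Placeholder rates: 0.5 per encoded minute, 0.25 per transcribed minute.
estimate = job_cost(encode_minutes=60, encode_rate=0.5,
                    transcribe_minutes=60, transcribe_rate=0.25)
```

Adding or removing caption outputs leaves `estimate["total"]` unchanged under this model; only enabling transcription moves it.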
MpegFlow's caption pipeline currently handles CEA-608/708, WebVTT, IMSC1, and SRT, covering ~95% of production cases. SCC ingest is on the Q3 2026 roadmap alongside live support; for SCC-heavy workflows today, design partners pair MpegFlow with a preprocessing step that converts SCC to WebVTT. The orchestration angle: caption track production is a separate DAG stage, allowing per-language parallelization at scale.
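Per-language parallelization of a caption stage can be illustrated with a generic fan-out. This is not MpegFlow's API: the function names, the output path pattern, and the thread-pool approach are all hypothetical stand-ins for whatever the orchestrator does internally, with a placeholder where the real per-language work would go.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch only: a generic fan-out, not MpegFlow's actual interface.

def produce_caption_track(language):
    """Placeholder for one per-language caption task.

    A real task would convert or render the track; this one only returns
    the hypothetical output path it would write.
    """
    return f"captions/{language}.vtt"


def caption_stage(languages, max_workers=4):
    """Run the per-language tasks concurrently and collect their outputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip lines up correctly.
        outputs = list(pool.map(produce_caption_track, languages))
    return dict(zip(languages, outputs))


tracks = caption_stage(["en", "es", "fr"])
```

Because each language is an independent task, the stage's wall-clock time is bounded by the slowest track rather than the sum of all tracks, as long as enough workers are available.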
- captions
- subtitles
- aws-mediaconvert
- broadcast
- accessibility