
FFmpeg — the multimedia framework that runs nearly all video infrastructure

Practical reference on FFmpeg — project structure, the libav* libraries, codec/format support, filter graphs, hardware acceleration, ffprobe, and ffplay — the universal video tool.

By MpegFlow Engineering Team · Encoding · May 8, 2026 · 11 min read · 2,153 words
In this topic
  1. What FFmpeg actually is
  2. The libav* libraries
  3. The ffmpeg CLI
  4. Filter graphs — the secret weapon
  5. Hardware acceleration
  6. ffprobe
  7. ffplay
  8. Format/codec detection
  9. Compilation considerations
  10. Licensing reality
  11. FFmpeg in production pipelines
  12. Why FFmpeg matters
  13. What MpegFlow does with FFmpeg

FFmpeg is the multimedia framework that nearly every piece of video infrastructure on the internet uses, directly or indirectly. The command-line ffmpeg tool is famous; the libav* libraries that power it are even more important — Chrome's video decode, OBS's encoding, every streaming service's transcoding fleet, every ffmpeg-based desktop app — they all sit on the same underlying code. Understanding FFmpeg is foundational to working with video at any scale, and unlike most "tool" topics, FFmpeg deserves a deep technical reference because it's not just a tool — it's the substrate. This page is the engineering reference.

#What FFmpeg actually is

FFmpeg is three things at once:

  1. A set of libraries — the libav* family that does the actual work of decoding, encoding, muxing, demuxing, filtering, and resampling. These are C libraries with bindings for most languages.

  2. Command-line tools — ffmpeg, ffprobe, ffplay. Wrappers around the libraries that expose the functionality through a CLI.

  3. A project / community — the FFmpeg organization that maintains all of this, with a 25-year history that includes a famous fork (Libav) and reunion. Volunteer-driven, LGPL/GPL-licensed.

The framework supports essentially every video codec, audio codec, and container format that has ever shipped at scale. The supported codec list runs to hundreds; the supported container list to dozens. This breadth is the practical reason FFmpeg is universal — when you need to handle a piece of video, FFmpeg almost certainly understands it.
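Because supported codecs vary per build, pipeline automation often verifies capabilities at deploy time rather than assuming them. A minimal sketch that parses the table `ffmpeg -encoders` prints; the `has_encoder` helper and the truncated sample output are illustrative, not an FFmpeg API:

```python
import subprocess

def has_encoder(name: str, encoders_output: str) -> bool:
    """Check whether an encoder appears in `ffmpeg -encoders` output.

    Each encoder line looks like ' V....D libx264  <description>';
    we match the second whitespace-separated field exactly.
    """
    for line in encoders_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1] == name:
            return True
    return False

def list_encoders() -> str:
    """Run the local ffmpeg and return its encoder table (needs ffmpeg on PATH)."""
    return subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    ).stdout

# Illustrative sample of the table format (real output is far longer):
SAMPLE = """ V....D libx264       libx264 H.264 / AVC / MPEG-4 AVC
 V....D libx265       libx265 H.265 / HEVC
 A....D aac           AAC (Advanced Audio Coding)"""

print(has_encoder("libx265", SAMPLE))  # → True
```

In production you would call `list_encoders()` once at worker startup and fail fast if a required encoder is missing.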

#The libav* libraries

The libraries are the architectural foundation:

  • libavcodec — encoders and decoders for every codec FFmpeg supports; over 400 codecs in current builds. Used by Chrome, Firefox, VLC, Kodi, OBS, every video editor — anything that decodes video on Linux likely uses libavcodec.

  • libavformat — container/format demuxing and muxing. Reads MP4, MKV, MPEG-TS, MOV, WebM, AVI, FLV, AIFF, OGG, RTMP, RTP, HLS manifests, DASH manifests, and many more. Writes most of them too.

  • libavfilter — audio and video filter graph runtime. The foundation of -vf and -af in the CLI. Over 400 filters covering scaling, color conversion, deinterlacing, overlays, audio mixing, FFT, and more.

  • libavutil — utility code (math, time, common data structures, hardware acceleration helpers).

  • libswscale — software-based image scaling and color space conversion. The reference scaler.

  • libswresample — audio sample rate, channel layout, and format conversion.

  • libavdevice — capture device input (video4linux, ALSA, X11 grab, JACK).

  • libpostproc — legacy post-processing filters (mostly historical now).

The libraries are the actual reusable assets. When you write a video application in any language, you're either using FFmpeg's libraries directly (via C, or Python/Go/Rust bindings) or using a higher-level library (GStreamer via its gst-libav plugin, OpenCV, PyAV) that itself wraps FFmpeg.

The CLI is convenient for ad-hoc work; the libraries are how production systems integrate FFmpeg.

#The ffmpeg CLI

The ffmpeg command-line tool is the most-used face of FFmpeg. It's a CLI for chaining input → filtering → encoding → output operations:

ffmpeg [global options] {[input options] -i input}... {[output options] output}...

A simple example:

ffmpeg -i input.mov -c:v libx265 -crf 24 -c:a aac -b:a 192k output.mp4

This reads input.mov, encodes the video with x265 at CRF 24, encodes the audio with FFmpeg's native AAC encoder at 192 kbps, and writes output.mp4. The container format is inferred from the output extension; AAC is specified explicitly because it's the conventional audio codec in MP4.

More complex example using a filter graph:

ffmpeg -i input.mp4 -vf "scale=1280:720,fps=30,format=yuv420p" \
  -c:v libx264 -preset medium -crf 22 \
  -c:a copy output.mp4

The -vf filter graph applies scaling to 720p, frame rate conversion to 30 fps, and pixel format conversion to YUV 4:2:0. Filters run in the order specified, with the output of each filter feeding the next.
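Pipelines rarely hand-write these chains per job; they assemble the -vf string from structured parameters. A minimal sketch that reproduces the chain above (the `vf_chain` and `scale` helpers are illustrative, not an FFmpeg API):

```python
def vf_chain(*filters: str) -> str:
    """Join filters into a -vf chain; each filter's output feeds the next, in order."""
    return ",".join(filters)

def scale(w: int, h: int) -> str:
    """Build a scale filter expression."""
    return f"scale={w}:{h}"

# Reproduces the chain from the example above:
chain = vf_chain(scale(1280, 720), "fps=30", "format=yuv420p")
print(chain)  # → scale=1280:720,fps=30,format=yuv420p

# The full argv a pipeline worker would spawn (filenames assumed):
args = ["ffmpeg", "-i", "input.mp4", "-vf", chain,
        "-c:v", "libx264", "-preset", "medium", "-crf", "22",
        "-c:a", "copy", "output.mp4"]
```

Building the string programmatically keeps per-rendition parameters typed and testable instead of scattered across shell templates.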

#Filter graphs — the secret weapon

FFmpeg's filter system is what makes it operationally flexible. Filter graphs are mini-DAGs — nodes are filters, edges are media streams flowing between them.

Simple filter chain (linear):

-vf "scale=720:480,fps=30,format=yuv420p"

Complex filter graph (branching):

-filter_complex \
  "[0:v]split=2[main][thumb]; \
   [main]scale=1920:1080[full]; \
   [thumb]scale=320:180,fps=1[preview]"

The split filter creates two copies of the input video; one is scaled to 1080p, the other to 320×180 at 1 fps for thumbnail extraction. Both outputs can be sent to separate encode operations.
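When a pipeline generates branching graphs like this per job, the labeled-pad syntax is easiest to build from structured parts. A sketch under the assumption that each labeled output maps to its own encode (the `graph` helper and output filenames are illustrative):

```python
def graph(*chains: str) -> str:
    """Join labeled filter chains into a -filter_complex graph (';'-separated)."""
    return ";".join(chains)

# Same branching graph as the example above:
full_and_preview = graph(
    "[0:v]split=2[main][thumb]",
    "[main]scale=1920:1080[full]",
    "[thumb]scale=320:180,fps=1[preview]",
)

# Each output label is then mapped to its own output file:
args = ["ffmpeg", "-i", "input.mp4",
        "-filter_complex", full_and_preview,
        "-map", "[full]", "-c:v", "libx264", "full.mp4",
        "-map", "[preview]", "preview_%d.jpg"]
```

The `-map "[label]"` flags connect graph outputs to output files, which is what makes single-decode, multi-output jobs possible.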

The filter ecosystem covers:

  • Geometric — scale, crop, pad, rotate, flip, transpose.
  • Temporal — fps, setpts, framerate, tinterlace, deinterlace (yadif, w3fdif, bwdif).
  • Color — colorspace conversion, color matrix, levels, curves, eq, hue, saturation.
  • Compositing — overlay, hstack, vstack, blend, mix.
  • Filter effects — boxblur, gblur, sharpen, unsharp, denoise (hqdn3d, nlmeans).
  • Subtitle/overlay — drawtext, subtitles, ass.
  • Audio — mixing (amix), equalization, compression, limiting, loudnorm, resampling (aresample), channel routing.
  • Analysis — vmaf (libvmaf), psnr, ssim, blackdetect, silencedetect, scene-change detection (scdet).

For pipeline engineering, the filter graph is where most of the actual work happens. Encoding is one filter (the encoder), but the surrounding work (scaling, color conversion, watermarking, deinterlacing, fps adjustment) is filter graphs.

#Hardware acceleration

FFmpeg supports hardware acceleration for both decode and encode across multiple vendors:

  • VAAPI (Linux, AMD/Intel) — Video Acceleration API. The Linux open standard.
  • NVENC/NVDEC (NVIDIA) — h264_nvenc, hevc_nvenc, av1_nvenc for encoding; h264_cuvid, hevc_cuvid for decoding.
  • Quick Sync (Intel) — h264_qsv, hevc_qsv, av1_qsv.
  • VideoToolbox (macOS, iOS) — Apple's hardware acceleration.
  • AMF (AMD) — h264_amf, hevc_amf.
  • MediaCodec (Android) — Android NDK acceleration.

CLI for hardware-accelerated encoding:

ffmpeg -hwaccel cuda -i input.mp4 -c:v hevc_nvenc -preset p5 -b:v 5M output.mp4

The -hwaccel cuda flag enables CUDA-accelerated decoding; -c:v hevc_nvenc uses the NVIDIA HEVC encoder. The decode and encode happen on the GPU; data stays in GPU memory between stages when configured correctly (avoiding GPU↔CPU copies).

For full-pipeline hardware acceleration with filters in GPU memory:

ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
  -vf "scale_cuda=1280:720" -c:v h264_nvenc output.mp4

The scale_cuda filter operates on GPU memory; frames never return to the CPU until output. Throughput is typically far higher than an equivalent CPU pipeline.
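Fleets with mixed hardware need different flag sets per pool; a dispatcher can map a backend name to its decode flags and encoder. A simplified sketch (the backend table is an illustrative subset, and real VAAPI pipelines also need a -vaapi_device and upload/format filters):

```python
# Map of backend → (decode flags, HEVC encoder name). Illustrative subset;
# VAAPI entries are simplified (real use needs -vaapi_device etc.).
BACKENDS = {
    "nvidia": (["-hwaccel", "cuda", "-hwaccel_output_format", "cuda"], "hevc_nvenc"),
    "qsv":    (["-hwaccel", "qsv"], "hevc_qsv"),
    "vaapi":  (["-hwaccel", "vaapi"], "hevc_vaapi"),
    "cpu":    ([], "libx265"),  # software fallback pool
}

def hevc_args(backend: str, src: str, dst: str, bitrate: str = "5M") -> list[str]:
    """Assemble an HEVC transcode argv for the given hardware backend."""
    decode_flags, encoder = BACKENDS[backend]
    return ["ffmpeg", *decode_flags, "-i", src,
            "-c:v", encoder, "-b:v", bitrate, dst]

print(hevc_args("nvidia", "input.mp4", "output.mp4"))
```

The same job definition can then run on any pool, with only the backend key changing.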

#ffprobe

ffprobe is the inspection tool. Same libraries as ffmpeg; output is structured information about media files.

ffprobe -v error -select_streams v:0 -show_entries stream=width,height,codec_name,pix_fmt,r_frame_rate input.mp4

Output:

[STREAM]
codec_name=h264
width=1920
height=1080
pix_fmt=yuv420p
r_frame_rate=30000/1001
[/STREAM]

For pipeline engineering, ffprobe is the inspection tool you reach for. JSON output (-print_format json) makes it scriptable:

ffprobe -v error -print_format json -show_streams input.mp4 > info.json

For pipelines that need to make encoding decisions based on input characteristics (auto-detecting source resolution, frame rate, codec, color space), ffprobe is the workhorse.
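A decision step that consumes that JSON might look like the sketch below; the embedded sample stands in for real ffprobe output, and `needs_downscale` is an illustrative policy, not an FFmpeg feature:

```python
import json

def probe_summary(ffprobe_json: str) -> dict:
    """Extract the first video stream's key parameters from
    `ffprobe -print_format json -show_streams` output."""
    streams = json.loads(ffprobe_json)["streams"]
    v = next(s for s in streams if s.get("codec_type") == "video")
    num, den = v["r_frame_rate"].split("/")  # e.g. "30000/1001"
    return {
        "codec": v["codec_name"],
        "width": v["width"],
        "height": v["height"],
        "fps": int(num) / int(den),
    }

def needs_downscale(info: dict, max_height: int = 1080) -> bool:
    """Illustrative ladder policy: cap output at 1080p."""
    return info["height"] > max_height

# Stand-in for real ffprobe output:
SAMPLE = """{"streams": [
  {"codec_type": "video", "codec_name": "h264", "width": 3840, "height": 2160,
   "r_frame_rate": "30000/1001", "pix_fmt": "yuv420p"},
  {"codec_type": "audio", "codec_name": "aac"}]}"""

info = probe_summary(SAMPLE)
print(info["codec"], needs_downscale(info))  # → h264 True
```

In a live pipeline the JSON comes from running ffprobe via subprocess; the parsing and policy logic stay identical.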

#ffplay

ffplay is a basic media player using SDL for output. Useful for debugging and verification — playing the result of an ffmpeg invocation to see if it looks right. Not a production media player; not feature-rich; but reliable.

For day-to-day production work, ffplay is rarely used. Engineers verify with ffprobe (text inspection) and play in real players when needed.

#Format/codec detection

FFmpeg's format detection is sophisticated. It can identify input format from:

  • File extension — initial guess.
  • Magic bytes — file header inspection.
  • Stream demuxer probing — actually parsing some content.

For unknown input, FFmpeg probes its registered demuxers against the data and picks the highest-scoring match. For ambiguous cases, the -f flag forces a specific format. For pipelines that ingest arbitrary user content, this auto-detection is robust against edge cases.

Codec detection within streams is similar — once the demuxer parses the streams, codec parameters are extracted and the appropriate decoder is selected. The user usually doesn't need to specify decoders explicitly.
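The magic-byte layer can be illustrated with a toy sniffer. Real FFmpeg probing (av_probe_input_format) scores every registered demuxer against a buffer, but the three signatures below are the actual header patterns of these containers:

```python
def sniff_container(header: bytes) -> str:
    """Toy container sniffer checking a few well-known signatures.

    This is a teaching sketch, not how FFmpeg decides: FFmpeg scores
    all demuxers and picks the highest-scoring one.
    """
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4/mov"        # ISO BMFF: 'ftyp' box after a 4-byte size
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return "matroska/webm"  # EBML header
    if len(header) >= 189 and header[0] == 0x47 and header[188] == 0x47:
        return "mpegts"         # 0x47 sync byte every 188 bytes
    return "unknown"

mp4_header = b"\x00\x00\x00\x20ftypisom" + b"\x00" * 8
print(sniff_container(mp4_header))  # → mp4/mov
```

Note why extension alone isn't trusted: a .mp4 file containing Matroska bytes would still be demuxed correctly by FFmpeg, and this sniffer would report matroska/webm.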

#Compilation considerations

FFmpeg builds are configurable — different builds support different features. Key build-time choices:

  • GPL vs LGPL — some components (x264, x265, certain filters) require a GPL build; LGPL builds omit them. libfdk-aac additionally requires --enable-nonfree.
  • External libraries — libfdk-aac, libvpx, libaom, libsvtav1, libwebp, libplacebo, libvmaf must be linked at compile time.
  • Hardware acceleration — VAAPI, CUDA, Quick Sync, VideoToolbox each require build-time flags.
  • Network protocols — support for RTMP, HLS, and DASH is optional depending on build flags.

Consequence: not all ffmpeg binaries support all features. The static-build vs distro-build vs custom-build tradeoff:

  • Static builds (e.g., from BtbN/FFmpeg-Builds on GitHub) — kitchen sink with everything compiled in. Easy to deploy; large binary.
  • Distro builds (Ubuntu's default ffmpeg, etc.) — typically LGPL, missing some patent-encumbered or closed-source codec support.
  • Custom builds — enable exactly what you need; smaller and more secure.

For production pipelines, custom builds are recommended — control what's included and verify supply chain.
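Because feature sets vary per build, a deploy-time check can assert that the pinned binary was configured with the flags the pipeline needs. A sketch that parses the `configuration:` line printed by `ffmpeg -version` (the sample line is illustrative; real lines list many more flags):

```python
def configure_flags(version_output: str) -> set[str]:
    """Extract the --enable-*/--disable-* flags from `ffmpeg -version` output."""
    for line in version_output.splitlines():
        if line.startswith("configuration:"):
            return set(line.split()[1:])
    return set()

def check_build(version_output: str, required: set[str]) -> set[str]:
    """Return the required flags missing from this build (empty set means OK)."""
    return required - configure_flags(version_output)

# Illustrative stand-in for `ffmpeg -version` output:
SAMPLE = ("ffmpeg version 7.1\n"
          "configuration: --enable-gpl --enable-libx264 "
          "--enable-libx265 --enable-nonfree")

missing = check_build(SAMPLE, {"--enable-libx264", "--enable-libsvtav1"})
print(missing)  # → {'--enable-libsvtav1'}
```

Running this against the actual binary at worker startup catches "wrong ffmpeg on PATH" failures before any job does.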

#Licensing reality

FFmpeg's licensing is complex because it depends on what's compiled in:

  • LGPL — the default. Most components are LGPL-2.1+. You can link FFmpeg into proprietary applications under LGPL terms (dynamic linking is the simple path; static linking obliges you to let users relink against a modified FFmpeg).
  • GPL — if you compile in GPL components (x264, x265, certain filters), the resulting binary is GPL. Distributing it requires making source available.
  • Non-redistributable — FDK-AAC (Fraunhofer) carries license terms incompatible with redistribution in FFmpeg builds. Enabling libfdk-aac requires --enable-nonfree, and the resulting binary can't be redistributed.

For commercial pipelines, compliance involves:

  1. Understand which components are in your build.
  2. Comply with LGPL/GPL terms based on what's included.
  3. For FDK-AAC and other non-redistributable components, build for internal use only.

Most cloud video infrastructure uses internal-only builds with FDK-AAC for the audio quality benefit. SaaS providers build their own FFmpeg rather than using distro packages.

#FFmpeg in production pipelines

Production video pipelines typically use FFmpeg in one of three ways:

CLI invocation — pipeline orchestration spawns ffmpeg processes for each job. Simple, robust, debuggable. The overhead of process startup is negligible compared to encoding time.

Library integration — applications link libav* directly and drive the libraries through their C API (or language bindings). Lower overhead than CLI; more complex to integrate.

Wrapper layer — higher-level abstractions (PyAV for Python, av-rs for Rust) that wrap libav* with idiomatic language patterns.

For most production work, CLI invocation is the right choice — it's flexible, observable, and well-tested. Library integration is for performance-critical paths or applications with custom integration needs (real-time analysis, streaming servers).
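A minimal orchestration sketch of the CLI-invocation pattern: spawn ffmpeg, classify the exit code, retry transient failures. The retryable-code set and backoff policy here are illustrative pipeline policy, not FFmpeg semantics; real systems also inspect stderr before deciding a failure is transient:

```python
import subprocess
import time

# Illustrative policy: ffmpeg exits 1 for most runtime errors, so real
# pipelines combine the code with stderr inspection before retrying.
RETRYABLE = {1}
MAX_ATTEMPTS = 3

def run_ffmpeg(args: list[str], runner=subprocess.run) -> int:
    """Run an ffmpeg argv, retrying (assumed) transient exit codes.

    `runner` is injectable so the retry logic can be tested without
    an ffmpeg binary present.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        result = runner(["ffmpeg", "-hide_banner", *args], capture_output=True)
        if result.returncode == 0:
            return 0
        if result.returncode not in RETRYABLE or attempt == MAX_ATTEMPTS:
            return result.returncode  # terminal failure: surface to orchestrator
        time.sleep(attempt)           # simple linear backoff (illustrative)
    return -1
```

The injectable `runner` is the design choice worth copying: the retry and classification logic is pure and unit-testable, while the real subprocess call stays a one-liner.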

#Why FFmpeg matters

FFmpeg is the substrate of video infrastructure for several reasons:

  • Comprehensive codec support — every codec, current and historical. No "we'll add support later"; FFmpeg already supports it.
  • Comprehensive format support — every container, every protocol, every weird historical format.
  • Mature — 25 years of development, debugging, optimization. Edge cases are mostly handled.
  • Free — LGPL/GPL licensing means no per-stream royalties for using FFmpeg itself.
  • Universal tooling — every video engineer knows FFmpeg syntax. Knowledge transfers across teams and projects.
  • Active development — new codecs, formats, hardware acceleration support appears regularly.
  • Extensible — write custom filters, add new format support, integrate at any level.

The closest commercial equivalents (Wowza Streaming Engine, Bitmovin Encoder, AWS Elemental MediaConvert) have additional production polish but don't fundamentally do anything FFmpeg can't. They wrap FFmpeg or independent codec implementations with management UIs and operational tooling.

#What MpegFlow does with FFmpeg

MpegFlow's DAG runtime expresses FFmpeg-driven work through the FfmpegExecutor — one of three first-party StageExecutor implementations (alongside FfprobeExecutor and CcextractorExecutor). The executor proto field on each stage tells the ExecutorRegistry which binary to dispatch; the partitioner persists each stage to job_stages with explicit dependency tracking; per-stage retry handles transient failures; sibling cancellation propagates fatal failures across rendition stages; rendition-level partial-success reporting surfaces granular per-stage state.

For most workflow stages — decoding source content, applying filters (scaling, deinterlacing, thumbnails, overlay/watermark, color conversion), encoding to ladder rungs, muxing — FfmpegExecutor does the actual work. Filter parameters are typed (FilterParams::Deinterlace with yadif/bwdif options, FilterParams::Thumbnail, FilterParams::Overlay, etc.) and exit-code classification distinguishes retryable from terminal failures so the runtime makes the right decision when a stage exits non-zero.

The worker image is a defined FFmpeg build with relevant codec support (x264, x265, SVT-AV1, libvpx-vp9, FDK-AAC, libopus, libvmaf) and hardware-acceleration backends configured per pool (NVENC for GPU pools, VAAPI for compatible hardware). Build provenance is tracked; updates flow through controlled image rebuilds and regression testing.

Per-workflow encoder-version selection is not currently a runtime feature — the worker fleet runs the FFmpeg build the deployment is pinned to. Customers needing strict per-rendition determinism handle that at the deployment level, not via per-stage version selection.

The strict-broker security model handles FfmpegExecutor work like any pipeline payload — workers carry no ambient credentials, no IAM role, no long-lived secret; content access flows through short-lived presigned URLs scoped per stage; the worker disposes of access on completion.

For customers learning FFmpeg, MpegFlow's recipes section provides working FFmpeg recipes with production-aware notes covering common video tasks (transcoding, ABR ladder generation, subtitle burn-in, deinterlacing, watermarking, etc.). These reflect the same FfmpegExecutor invocations the runtime dispatches internally, exposed as documentation for engineers building their own pipelines.

The general guidance: FFmpeg is the best tool for video infrastructure, and it deserves the time investment to learn deeply. Engineers who understand FFmpeg's filter graphs, rate control, and hardware acceleration have an outsized productivity advantage in any video pipeline work. FFmpeg isn't easy, but it's worth the investment.

Tags
  • FFmpeg
  • encoding
  • tools
  • libav
  • filters
  • hardware-acceleration
See also

Related topics and reading

  • FFmpeg filter_complex patterns — branching, merging, and multi-output graphs
  • ffprobe stream inspection — extracting media info for pipeline automation
  • Watermarking and overlays — burning logos, tags, and identifiers into video for streaming