SVT-AV1 preset tuning — preset 4 vs 6 vs 8 vs 10 in production

MpegFlow

Practical guide to SVT-AV1 preset selection — quality vs encoding-time tradeoff at each preset, VMAF measurements, when to use preset 4 vs 6 vs 8 vs 10 vs 12 in production.

SVT-AV1 (Scalable Video Technology AV1) is the production AV1 encoder for most 2026 streaming workloads. Its 14 presets (0-13) span the full encoding-speed-vs-quality curve, from research-grade preset 0 to real-time-live preset 13. Picking the right preset for your workload is one of the most consequential AV1 encoder decisions: the wrong preset either burns compute on quality you don't need or gives up quality you could have afforded. This page is the engineering reference for SVT-AV1 preset selection in production.

What presets actually control

SVT-AV1 presets are bundles of internal encoder parameter values calibrated by the SVT-AV1 development team. Each preset adjusts:

Block partitioning depth — how aggressively the encoder explores partition options.
Motion estimation depth — how much computation goes into finding optimal motion vectors.
Reference frame count — how many past/future frames are considered for prediction.
Transform size search — exploration of transform size options.
Mode decision granularity — granularity of the encoder's rate-distortion decisions.
Frame-level analysis — depth of look-ahead and frame complexity analysis.

Higher presets disable expensive analysis to gain encoding speed. Lower presets enable more of it for maximum quality. The relationship between preset number and encoding time is roughly exponential: preset 4 might be around 5x slower than preset 8 on the same content.

The preset spectrum

Preset	Encoding speed (relative to preset 6)	Production use case
0	~0.05x (20x slower)	Research, codec papers, reference quality
1	~0.1x	Premium VOD, near-research
2	~0.2x	Premium VOD with quality margin
3	~0.4x	High-end VOD
4	~0.6x	Premium VOD baseline
5	~0.8x	Quality-tuned VOD
6	1.0x (baseline)	VOD baseline (production default)
7	~1.5x	Throughput-bound VOD
8	~2-3x	Live with quality budget
9	~3-4x	Live
10	~5x	Standard live
11	~6-7x	Low-latency live
12	~8x	Low-latency live with margin
13	~10x	Real-time, lowest quality

The relative numbers depend on hardware, content type, and bitrate target. Treat them as rough ordering rather than absolute multipliers.
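
As a rough illustration of how these multipliers translate into wall-clock time, the sketch below scales a measured preset 6 encode time by the table's approximate relative speeds. The dictionary simply restates the table (ranges collapsed to their midpoint), and the function name and the 600-second measurement are hypothetical. Treat the output as an order-of-magnitude estimate, not a benchmark.

```python
# Approximate encoding speed relative to preset 6, restated from the table above.
# Ranges in the table (e.g. "~2-3x") are collapsed to their midpoint.
RELATIVE_SPEED = {
    0: 0.05, 1: 0.1, 2: 0.2, 3: 0.4, 4: 0.6, 5: 0.8, 6: 1.0,
    7: 1.5, 8: 2.5, 9: 3.5, 10: 5.0, 11: 6.5, 12: 8.0, 13: 10.0,
}


def estimate_encode_seconds(preset6_seconds: float, preset: int) -> float:
    """Scale a measured preset 6 encode time to another preset (rough estimate)."""
    if preset not in RELATIVE_SPEED:
        raise ValueError(f"unknown preset: {preset}")
    return preset6_seconds / RELATIVE_SPEED[preset]


if __name__ == "__main__":
    # Hypothetical measurement: a clip that takes 600 s to encode at preset 6.
    for p in (4, 6, 8, 10):
        print(f"preset {p}: ~{estimate_encode_seconds(600, p):.0f} s")
```

Replacing the dictionary with timings from your own hardware and content turns this into a usable capacity-planning helper.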

Quality measurements

VMAF score deltas between presets at equivalent bitrate (1080p, mixed content corpus):

Preset comparison	VMAF delta	Encoding time (relative to preset 6)
0 vs 6	+3.5 to +4.5	20x
2 vs 6	+2.5 to +3.0	5x
4 vs 6	+1.0 to +1.5	1.7x
6 vs 6	(baseline)	1x
8 vs 6	-1.5 to -2.0	0.4x (2.5x faster)
10 vs 6	-3.0 to -4.0	0.2x (5x faster)
12 vs 6	-5.0 to -6.5	0.13x (about 8x faster)

The non-linearity is meaningful: preset 4 costs 1.7x the encoding time of preset 6 for ~1 VMAF point, while preset 0 costs 20x for ~4 points. Returns diminish sharply below preset 4 (toward preset 0).
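
The diminishing returns can be made concrete by dividing each VMAF gain by the extra encoding time it costs. This is a small sketch using the midpoints of the ranges in the table above; the function name is hypothetical and the figures are only as good as the table's rough numbers.

```python
# (VMAF delta vs preset 6, encode time vs preset 6), midpoints of the table's ranges.
MEASUREMENTS = {
    0: (4.0, 20.0),
    2: (2.75, 5.0),
    4: (1.25, 1.7),
}


def vmaf_per_extra_time(preset: int) -> float:
    """VMAF points gained per additional multiple of preset 6 encode time."""
    vmaf_gain, time_multiplier = MEASUREMENTS[preset]
    return vmaf_gain / (time_multiplier - 1.0)


if __name__ == "__main__":
    for preset in sorted(MEASUREMENTS, reverse=True):
        print(f"preset {preset}: {vmaf_per_extra_time(preset):.2f} VMAF per extra 1x of time")
```

With these inputs the return per unit of extra compute falls at every step from preset 4 to preset 2 to preset 0.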

In the opposite direction, preset 10 cuts encoding time to roughly one fifth of preset 6 at the cost of ~3-4 VMAF. For live streaming, where wall-clock time is non-negotiable, preset 10-12 is acceptable; for VOD, where quality matters more, preset 4-6 is the production sweet spot.

Production preset recommendations by use case

Premium VOD top tier (4K HDR, etc.) — preset 4. The compute cost is worth it for top-quality content. Quality matters; encoding time doesn't (ship in days/weeks).

Standard VOD ladder — preset 6 across the board. The baseline. Good quality, reasonable encoding time. Default for most pipelines.

Throughput-bound VOD — preset 7-8 if compute budget is tight or content velocity is high. Quality cost is modest; throughput benefit is real.

Low-latency live (LL-HLS at 2-3s) — preset 10 typically. Encoding latency budget is tight; preset 10 fits with VBV constraint.

Standard live — preset 8-10 depending on quality target. Preset 8 if quality matters; preset 10 if encoder hardware is constrained.

Real-time live (sub-2-second targets) — preset 12 typically. Quality drops noticeably but encoding speed is real-time on modest hardware.

CLI invocation

For production VOD with capped CRF rate control:

ffmpeg -i input.mp4 -c:v libsvtav1 \
  -preset 6 -crf 28 -maxrate 4M -bufsize 8M \
  -c:a copy output.mp4

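For premium VOD with the tune parameter set explicitly:

ffmpeg -i input.mp4 -c:v libsvtav1 \
  -preset 4 -crf 26 -maxrate 6M -bufsize 12M \
  -svtav1-params "tune=1" \
  -c:a copy output.mp4

In current SVT-AV1 releases, tune=1 selects PSNR tuning and is the default; tune=0 selects VQ tuning, which targets subjective visual quality, and tune=2 selects SSIM. Passing tune=1 explicitly pins the behavior for codec research using PSNR-based BD-rate. For VMAF-targeted production, compare the tune settings on your own content during calibration, and confirm the available values and the default against the parameter documentation for your encoder version.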

For live encoding:

ffmpeg -re -i input.mp4 -c:v libsvtav1 \
  -preset 10 -crf 32 -maxrate 3M -bufsize 6M \
  -svtav1-params "scd=0:keyint=60" \
  -c:a copy live.mkv

scd=0 disables scene-change detection (forces fixed GOP); keyint=60 sets 2-second GOP at 30fps.
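
Because keyint is expressed in frames, the right value depends on the source frame rate. A minimal sketch of deriving it from a target GOP duration follows; the helper name is hypothetical, and it only builds the string passed to -svtav1-params.

```python
def live_svtav1_params(gop_seconds: float, fps: float) -> str:
    """Build the -svtav1-params value for a fixed GOP of the given duration."""
    keyint = round(gop_seconds * fps)
    if keyint < 1:
        raise ValueError("GOP must be at least one frame long")
    # scd=0 keeps scene-change detection off so keyframes land on a fixed cadence.
    return f"scd=0:keyint={keyint}"


if __name__ == "__main__":
    print(live_svtav1_params(2, 30))  # the 2-second GOP at 30fps used above
    print(live_svtav1_params(2, 25))
```

Keeping the GOP duration aligned with the segment duration is the usual reason to compute this rather than hard-code 60.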

Per-rung preset selection

For ABR ladders, preset can vary per rung. Typical premium VOD ladder:

Rung	Resolution	Preset	Reasoning
4K	3840×2160	4	Top tier; best quality justifies compute
1440p	2560×1440	5	High tier
1080p AV1	1920×1080	6	Standard production
1080p HEVC fallback	1920×1080	(HEVC, not AV1)	Apple ecosystem
720p	1280×720	7	Throughput-aware
540p	960×540	8	Low tier; quality less critical
360p	640×360	(H.264, not AV1)	Floor; H.264 universal

The non-uniform preset across the ladder reflects content economics: top tiers get more compute for quality; floor tiers get less. Total ladder encoding time is dominated by the top tier; preset choice on lower tiers is a marginal factor.
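
One way to express the AV1 rungs of such a ladder is as data that drives one ffmpeg invocation per rung. The sketch below is illustrative only: the CRF default, the output naming, and the helper name are placeholders rather than recommendations, and the HEVC and H.264 rungs are left out.

```python
# AV1 rungs from the ladder above: (name, width, height, preset).
AV1_RUNGS = [
    ("2160p", 3840, 2160, 4),
    ("1440p", 2560, 1440, 5),
    ("1080p", 1920, 1080, 6),
    ("720p", 1280, 720, 7),
    ("540p", 960, 540, 8),
]


def rung_command(source: str, name: str, width: int, height: int,
                 preset: int, crf: int = 28) -> list[str]:
    """Return an ffmpeg argument list for one AV1 rung (crf default is a placeholder)."""
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale={width}:{height}",
        "-c:v", "libsvtav1", "-preset", str(preset), "-crf", str(crf),
        "-c:a", "copy",
        f"{name}_p{preset}.mp4",  # hypothetical output naming
    ]


if __name__ == "__main__":
    for rung in AV1_RUNGS:
        print(" ".join(rung_command("input.mp4", *rung)))
```

Each argument list can be handed to subprocess.run, or to whatever job runner the pipeline uses, so the rungs encode independently.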

SVT-AV1 vs libaom-av1 across presets

libaom-av1 is the AOMedia reference encoder. The relationship to SVT-AV1:

SVT-AV1 preset	libaom-av1 cpu-used (rough equivalence)
0	cpu-used 0 (highest quality)
2	cpu-used 1-2
4	cpu-used 3
6	cpu-used 4-5
8	cpu-used 6
10	cpu-used 7-8

For production at preset 6+, SVT-AV1 is faster than libaom-av1 at equivalent quality. For research at preset 0-2, libaom-av1 still produces slightly better quality at higher encoding time. SVT-AV1 is the production answer; libaom-av1 is the research/highest-quality answer.

Threading characteristics

SVT-AV1's name reflects its scalable threading. Threading parallelism varies by preset:

Lower presets (0-4) — limited parallelism; single-thread bottleneck. Many cores don't help much.
Mid presets (5-7) — good parallelism; benefits from 8-16 cores.
Higher presets (8-12) — extreme parallelism; benefits from 32+ cores.

For a 64-core encode farm, higher presets are well-utilized. On a 4-core CPU, the additional parallelism of the higher presets goes unused, so the speed gap between presets narrows.

This affects production deployment decisions: large core-count cloud instances are cost-efficient for high-preset live encoding; smaller instances are better for low-preset quality VOD.

Memory characteristics

Memory usage falls as the preset number rises:

Preset 0-4: high memory (multiple frames in look-ahead, expensive analysis structures).
Preset 5-8: moderate memory.
Preset 9-12: lower memory (less look-ahead, simpler structures).

For 4K HDR at preset 4, expect 8-12 GB per encode process. For 1080p at preset 8, expect 1-2 GB. Plan instance memory accordingly.
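
These figures feed directly into instance sizing. As a sketch, the helper below estimates how many encode processes fit on one instance; the function name and the 2 GB operating-system reserve are assumptions, and the per-encode figure should come from your own measurements.

```python
def max_concurrent_encodes(instance_gb: float, per_encode_gb: float,
                           reserve_gb: float = 2.0) -> int:
    """How many encode processes fit in memory, leaving reserve_gb for the OS."""
    if per_encode_gb <= 0:
        raise ValueError("per_encode_gb must be positive")
    usable = instance_gb - reserve_gb
    return max(0, int(usable // per_encode_gb))


if __name__ == "__main__":
    # Upper end of the ranges quoted above, on a hypothetical 64 GB instance.
    print("4K HDR, preset 4:", max_concurrent_encodes(64, 12))
    print("1080p, preset 8:", max_concurrent_encodes(64, 2))
```

Memory is only one limit; the core count discussed under threading usually caps concurrency for the higher presets first.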

Calibration procedure for your content

Don't trust generic recommendations; calibrate against your actual content:

Pick representative samples — 5-10 clips covering your typical content types (talking heads, sports, animation, etc.). 30-60 seconds each.
Encode at multiple presets — try preset 4, 6, 8, 10 at the same target bitrate.
Compute VMAF on each — measure quality at each preset.
Plot quality vs encoding time — identify the knee where additional encoding time stops paying off.
Pick the production preset — when compute is the constraint, the highest preset (least compute) that still meets the quality target; when quality is the priority, the lowest preset (most compute) that fits the encoding-time budget.
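
The last two steps can be sketched as a small selection routine over the calibration results. The measurements below are made-up placeholders, not benchmark data, and the function names are hypothetical; the VMAF scores and encode times would come from your own runs.

```python
from statistics import mean

# Hypothetical calibration results: preset -> list of (vmaf, encode_seconds), one per clip.
RESULTS = {
    4: [(95.1, 410.0), (93.8, 390.0)],
    6: [(94.0, 240.0), (92.6, 230.0)],
    8: [(92.3, 100.0), (90.9, 95.0)],
    10: [(90.5, 50.0), (89.0, 46.0)],
}


def summarize(results):
    """Average VMAF and encode time per preset across all clips."""
    return {
        preset: (mean(v for v, _ in clips), mean(t for _, t in clips))
        for preset, clips in results.items()
    }


def fastest_preset_meeting(results, vmaf_target):
    """Highest preset number whose mean VMAF meets the target, or None."""
    summary = summarize(results)
    passing = [p for p, (vmaf, _) in summary.items() if vmaf >= vmaf_target]
    return max(passing) if passing else None


if __name__ == "__main__":
    for preset, (vmaf, secs) in sorted(summarize(RESULTS).items()):
        print(f"preset {preset}: mean VMAF {vmaf:.2f}, mean encode time {secs:.0f} s")
    print("fastest preset with mean VMAF >= 93:", fastest_preset_meeting(RESULTS, 93))
```

Averaging across clips hides per-content outliers, so it is worth also checking the worst-scoring clip at the chosen preset before committing to it.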

The calibration takes a day or two and pays off across years of pipeline operation. The "right preset" for your content might differ from defaults.

Common preset mistakes

Things to avoid:

Using preset 0 in production — preset 0 is research-grade, dramatically slower than even preset 2, with quality benefit not worth the cost.
Using preset 12-13 for VOD — these are real-time live presets; quality cost is too high for VOD.
Mixing presets across variants without testing — preset 4 top tier with preset 8 mid tier might produce visible quality discontinuity at adaptation boundaries.
Trusting preset numbers across encoder versions — SVT-AV1 has evolved; preset 6 in v0.8 differs from preset 6 in current. Test with your actual encoder version.
Picking based on encoding-time alone — quality matters; balance both metrics.

What MpegFlow does with SVT-AV1 presets

MpegFlow's DAG runtime exposes SVT-AV1 preset selection per rendition through workflow YAML; each value flows into the corresponding FfmpegExecutor stage. The partitioner persists each rendition stage to job_stages with explicit dependency tracking and per-stage retry; the KEDA-driven autoscaler sizes the worker pool to the queued workload, so heavier-preset stages scale horizontally rather than queuing behind lighter work.

Default presets per content category:

Premium VOD: preset 4 for top tier; preset 5-6 for mid tiers.
Standard VOD: preset 6 across all rungs.
Throughput-bound VOD: preset 7-8.
Live (LL-HLS): preset 10.
Real-time live: preset 12.

The DAG runtime treats preset 4 work and preset 10 work as separate stages on the same DAG, so they parallelize cleanly; sibling cancellation propagates fatal failures across rendition stages so dependents don't waste compute.

For customers calibrating SVT-AV1 presets for their pipeline, we run the side-by-side preset comparison on representative content during onboarding. The output is a per-content-type preset recommendation tailored to the customer's quality and throughput requirements. The defaults in MpegFlow templates are sensible starting points; customer-specific calibration is where pipeline tuning earns real value.

The strict-broker security model handles preset configuration like any pipeline payload — workers carry no ambient credentials; content access flows through short-lived presigned URLs scoped per stage; access is disposed on completion.

The general guidance: preset 6 is the safe production default for VOD; preset 10 is the safe production default for live; deviate based on calibration data, not vibes. SVT-AV1 preset selection is one of the parts of pipeline tuning that's invisible when right and visible (slow encodes, quality regressions, customer escalations) when wrong.
