
GOP and keyframe interval — what they are, why they matter, and how to set them

Practical reference on Group of Pictures and keyframe interval — closed vs open GOP, IDR keyframes, segment alignment for ABR streaming, the latency vs compression tradeoff.

By MpegFlow Engineering Team · Encoding · May 8, 2026 · 9 min read · 1,799 words
In this topic
  1. What a GOP is
  2. Closed vs open GOP
  3. IDR frames vs non-IDR keyframes
  4. GOP size and ABR streaming
  5. GOP and live latency
  6. GOP and compression efficiency
  7. GOP and codec specifics
  8. Scene-change keyframes vs fixed-interval keyframes
  9. GOP encoding parameters across encoders
  10. Force-key-frames for precise alignment
  11. Operational considerations
  12. What MpegFlow does with GOP

GOP — Group of Pictures — is the structure of frames between successive keyframes in a video stream. Keyframe interval is how often those keyframes occur. Together they're one of the most consequential encoder configurations in live and ABR streaming. Get the GOP size wrong and you break ABR adaptation, hurt live latency, or sacrifice compression efficiency. This page is the engineering reference for what GOP actually is, the constraints that drive its sizing, and how to set it for VOD vs live workflows.

#What a GOP is

A video stream is a sequence of frames. Each frame is one of three types:

  • I-frame (Intra-coded) — a complete picture; can be decoded standalone without reference to other frames.
  • P-frame (Predicted) — encoded as differences from a previous frame (typically the most recent I or P).
  • B-frame (Bi-directionally predicted) — encoded as differences from both past and future frames.

A GOP starts at an I-frame and runs until the next I-frame. The P-frames and (often) B-frames in between are predicted, directly or through a chain of other frames, from that I-frame.

A typical GOP structure for VOD might be:

I B B P B B P B B P B B P B B P
└──────────────── GOP ──────────────┘

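The I-frame is the keyframe: the synchronization point. Decoding can start at a keyframe without needing earlier frames, provided no later frame references anything before it (the closed-GOP and IDR cases covered in the next two sections).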
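
As a small illustration of the structure above, the following sketch splits a string of frame types into GOPs by starting a new group at every I-frame. The function name `split_gops` and the single-letter encoding are assumptions made for this example, not part of any encoder API.

```python
def split_gops(frame_types: str) -> list[str]:
    """Split a frame-type string such as "IBBPBBP" into GOPs, one per I-frame."""
    gops: list[str] = []
    for frame_type in frame_types:
        # An I-frame opens a new GOP; the `not gops` case covers input
        # that does not begin with an I-frame.
        if frame_type == "I" or not gops:
            gops.append("")
        gops[-1] += frame_type
    return gops


# Two repetitions of the 16-frame pattern shown above.
stream = "IBBPBBPBBPBBPBBP" * 2
for index, gop in enumerate(split_gops(stream)):
    print(index, len(gop), gop)
```

Each printed line is one GOP: its position, its length in frames, and its frame pattern.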

#Closed vs open GOP

GOPs come in two flavors:

Closed GOP — the GOP is self-contained. P and B frames within the GOP only reference frames within the same GOP. Decoding can start cleanly at the I-frame.

Open GOP — B frames at the start of the GOP can reference frames in the previous GOP. More compression efficient (B frames have more reference options) but less suitable for streaming (decoding can't start cleanly at the I-frame because some B frames need previous-GOP context).

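For streaming (HLS, DASH), closed GOP is effectively required. ABR players switch between variant streams at segment boundaries, which must coincide with GOP boundaries, and the player needs to start decoding cleanly without prior context. With open GOP, the leading B frames after a switch point reference frames from a variant the player never downloaded, which typically shows up as artifacts or dropped frames at adaptation points.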

For VOD with progressive download or single-stream playback, open GOP can be used for the modest compression benefit. For ABR streaming, closed GOP is non-negotiable.
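
The closed/open distinction can be expressed as a check on reference lists. This is a minimal sketch under simplifying assumptions: frames are indexed in decode order, each frame carries an explicit list of the frames it predicts from, and the `Frame` type and `is_closed_gop` function are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int             # position in decode order
    kind: str              # "I", "P", or "B"
    refs: tuple[int, ...]  # decode indices of the frames this one predicts from


def is_closed_gop(frames: list[Frame], start: int, end: int) -> bool:
    """True if no frame in [start, end) references a frame outside that range."""
    return all(
        start <= ref < end
        for frame in frames
        if start <= frame.index < end
        for ref in frame.refs
    )


previous_gop = [Frame(0, "I", ()), Frame(1, "P", (0,)), Frame(2, "B", (0, 1)), Frame(3, "B", (0, 1))]
closed_gop = [Frame(4, "I", ()), Frame(5, "P", (4,)), Frame(6, "B", (4, 5))]
# Open variant: the B frame right after the I-frame also references frame 1
# from the previous GOP.
open_gop = [Frame(4, "I", ()), Frame(5, "B", (1, 4)), Frame(6, "P", (4,))]

print(is_closed_gop(previous_gop + closed_gop, 4, 7))
print(is_closed_gop(previous_gop + open_gop, 4, 7))
```

A player that joins at frame 4 can decode the first variant completely; in the second, frame 5 depends on data it never received.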

#IDR frames vs non-IDR keyframes

H.264 and HEVC have two types of keyframes:

  • IDR (Instantaneous Decoder Refresh) — strictest keyframe type. After an IDR, the decoder discards all reference frames; the IDR starts a completely new sequence. Closed GOP requires IDR keyframes.
  • Non-IDR I-frames — I-frames that are intra-coded but don't force the decoder to discard references. Used in open GOPs.

For streaming, IDR frames at GOP boundaries are required. In practice this means configuring the encoder for an IDR at a fixed interval of N frames, not merely a keyframe at most every N frames:

ffmpeg -i input -c:v libx264 -g 60 -keyint_min 60 -sc_threshold 0 -force_key_frames "expr:gte(t,n_forced*2)" output.mp4

The -g 60 sets the maximum GOP size to 60 frames (2 seconds at 30fps); -keyint_min 60 raises the minimum interval between IDR frames (on its own it does not fully suppress scene-cut keyframes); -sc_threshold 0 disables scene-cut keyframe insertion; -force_key_frames requests a keyframe at the first frame at or after each 2-second mark.

For x265, the analogous configuration:

ffmpeg -i input -c:v libx265 -x265-params "keyint=60:min-keyint=60:scenecut=0" output.mp4

The scenecut=0 disables scene-change keyframe insertion, ensuring fixed-interval keyframes (important for streaming).

#GOP size and ABR streaming

The fundamental constraint: HLS/DASH segment duration must be an integer multiple of GOP duration. Each segment must start at an IDR keyframe.

If you ship 6-second HLS segments, your GOP must divide 6 seconds evenly: 1s, 1.5s, 2s, 3s, 6s — but not 4s or 5s (which don't divide 6 evenly).
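
The divisibility rule can be checked mechanically. The sketch below uses exact fractions so that values like 1.5 do not pick up floating-point error; `gop_fits_segment` is a hypothetical helper name.

```python
from fractions import Fraction


def gop_fits_segment(segment_seconds, gop_seconds) -> bool:
    """True if the segment duration is a whole number of GOPs."""
    ratio = Fraction(str(segment_seconds)) / Fraction(str(gop_seconds))
    return ratio.denominator == 1


candidates = ["1", "1.5", "2", "3", "4", "5", "6"]
valid = [gop for gop in candidates if gop_fits_segment("6", gop)]
print(valid)
```

For a 6-second segment this keeps 1, 1.5, 2, 3 and 6 and rejects 4 and 5, matching the list above.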

The typical 2026 streaming defaults:

Use case                   Segment duration             GOP duration
Standard live HLS          6s                           2s (3 GOPs per segment)
Standard live DASH         4s                           2s (2 GOPs per segment)
LL-HLS partial segments    6s segments, 333ms parts     2s GOPs
VOD HLS                    6s                           2s
LL-DASH                    4s segments, 200ms chunks    2s GOPs

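The 2-second GOP is the consumer-streaming default. Every IDR is a potential seek and switch point, so one encode can be packaged into 2-, 4-, or 6-second segments, without the compression penalty of a 1-second GOP. Most players switch variants at segment boundaries, so the effective adaptation interval is the segment duration.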

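For lower-latency live (LL-HLS, LL-DASH targeting 2-3s end-to-end), shorter GOPs help: 1s GOP, 1s segment. For ultra-low-latency delivery (sub-1s, WebRTC), there are no segments to align to; encoders typically run very long GOPs and send keyframes on demand (for example when a receiver joins or reports loss), so the fixed-interval GOP concept fades.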

#GOP and live latency

For live streaming, GOP size affects latency via the segment-duration relationship:

  • Player buffer: typically 3 segments minimum (~3 × segment duration).
  • Manifest update: ~1 segment duration on average (unless LL-HLS / LL-DASH partial-segment delivery is in use).
  • Encoding lag: depends on look-ahead; affected by GOP size for some configurations.
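
The first two contributions above can be added up in a back-of-the-envelope estimate. This is a rough sketch, not a measurement model: it assumes a 3-segment player buffer and one segment of manifest delay as stated in the list, and ignores network, CDN, and decode time. The function name and parameters are hypothetical.

```python
def rough_live_latency(
    segment_seconds: float,
    buffered_segments: int = 3,        # player buffer, in segments
    manifest_segments: float = 1.0,    # average manifest update delay, in segments
    encode_lag_seconds: float = 0.0,   # look-ahead and encoder delay, if known
) -> float:
    """Rough glass-to-glass latency estimate in seconds for segmented live delivery."""
    return (buffered_segments + manifest_segments) * segment_seconds + encode_lag_seconds


for segment in (6, 2, 1):
    print(f"{segment}s segments -> ~{rough_live_latency(segment):.0f}s")
```

Under these assumptions the estimate scales linearly with segment duration, which is why segment (and therefore GOP) length is the main latency lever in conventional HLS and DASH.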

Shorter GOP → shorter segments → lower buffer → lower latency. The tradeoff is compression efficiency:

  • 2s GOP — production default. Reasonable compression, reasonable adaptation flexibility.
  • 1s GOP — for low-latency live. ~10-15% bitrate cost vs 2s GOP for equivalent quality.
  • 500ms GOP — for very-low-latency live. ~25-30% bitrate cost.

The bitrate cost comes from the I-frame frequency. I-frames are larger than P/B frames; more I-frames means more bytes spent on keyframes vs predictive frames. Halving the GOP roughly doubles the I-frame budget.
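
A toy model makes the I-frame arithmetic concrete. It assumes one I-frame per GOP that costs a fixed multiple of an average P/B frame (8x here, an illustrative value chosen for this sketch, not a measured one) and that P/B frame sizes do not change with GOP length. Real encoders and content will deviate from this.

```python
from fractions import Fraction


def relative_bitrate(gop_frames: int, i_to_p_ratio: int = 8) -> Fraction:
    """Average cost per frame, in units of one P/B frame, with one I-frame per GOP."""
    return Fraction(i_to_p_ratio + gop_frames - 1, gop_frames)


def cost_vs_baseline(gop_frames: int, baseline_frames: int = 60, i_to_p_ratio: int = 8) -> float:
    """Fractional bitrate increase relative to the baseline GOP length."""
    ratio = relative_bitrate(gop_frames, i_to_p_ratio) / relative_bitrate(baseline_frames, i_to_p_ratio)
    return float(ratio - 1)


# 30fps content: 60 frames = 2s GOP, 30 frames = 1s GOP, 15 frames = 500ms GOP.
for frames in (60, 30, 15):
    print(frames, f"{cost_vs_baseline(frames):.1%}")
```

With the assumed ratio of 8, the model gives about 10% for a 1-second GOP and about 31% for a 500ms GOP relative to 2 seconds, in the same range as the figures quoted above.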

#GOP and compression efficiency

Longer GOPs allow longer prediction chains and more compression. The compression vs ABR-adaptation tradeoff:

  • Very long GOP (10s+): best compression, terrible ABR adaptation (at most one adaptation point per 10s window), unsuitable for streaming.
  • Long GOP (5-10s): good compression, marginal for ABR streaming.
  • Medium GOP (2s): production streaming default. Balance of compression and adaptation.
  • Short GOP (1s): for low-latency live, at a moderate compression cost.
  • Very short GOP (sub-1s): extreme low-latency, at a significant compression cost.

For VOD where adaptation isn't a concern (single-bitrate progressive playback), longer GOPs (5-10s) are fine. For ABR VOD, the 2s convention applies. For live ABR, 2s is the default; lower for low-latency.

#GOP and codec specifics

Different codecs have different GOP characteristics:

H.264:

  • Standard IDR-based GOP structure.
  • B-frame patterns: typically 2-3 B-frames between P-frames.
  • IDR frames are explicit; closed GOP is the standard.

HEVC:

  • Same fundamental GOP structure as H.264.
  • More flexible reference frame structures (long-term references).
  • B-pyramid hierarchical encoding: B-frames at multiple temporal layers.

AV1:

  • "Show frame" / "no-show frame" distinction allows more flexible GOP structures.
  • Reference frame management is more sophisticated than H.264/HEVC.
  • The encoder can use longer reference distances effectively.
  • Same operational constraint: IDR-equivalent (key frames) at fixed intervals for streaming.

VP9:

  • Similar to AV1's flexibility.
  • Production use is similar: fixed keyframe intervals for streaming.

For all codecs in production streaming use, fixed-interval IDR/keyframes at GOP boundaries is the configuration that makes ABR work. Codec-specific reference management within the GOP is mostly transparent.

#Scene-change keyframes vs fixed-interval keyframes

By default, many encoders insert keyframes at scene changes (detected via inter-frame difference metrics). This improves compression — encoding a scene change as an I-frame is more efficient than trying to predict across a scene boundary.

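For streaming, scene-change keyframes are problematic because they make GOP boundaries non-deterministic. An extra keyframe at a scene cut restarts the encoder's keyframe interval count, so later keyframes drift off the segment grid, and different renditions of the same source may detect cuts at different frames. Segments can then no longer be guaranteed to start at a keyframe at the same media time in every variant.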

The fix: disable scene-change keyframe insertion. Encoder configurations:

  • x264: -x264-params scenecut=0 (equivalently, ffmpeg's -sc_threshold 0)
  • x265: -x265-params scenecut=0
  • SVT-AV1: -svtav1-params scd=0

This forces keyframes only at the configured GOP boundary, accepting modest compression cost for predictable segment alignment.

For VOD without ABR streaming concerns, scene-change keyframes can be enabled. For ABR streaming, disable them.

#GOP encoding parameters across encoders

The configuration syntax varies by encoder:

ffmpeg + x264:

-c:v libx264 -g 60 -keyint_min 60 -sc_threshold 0

ffmpeg + x265:

-c:v libx265 -x265-params "keyint=60:min-keyint=60:scenecut=0"

ffmpeg + SVT-AV1:

-c:v libsvtav1 -svtav1-params "keyint=60:scd=0"

ffmpeg + libvpx-vp9:

-c:v libvpx-vp9 -g 60 -keyint_min 60

ffmpeg + NVENC:

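-c:v hevc_nvenc -g 60 -no-scenecut 1 -forced-idr 1 -strict_gop 1

The 60 here assumes 30fps content with a 2-second GOP (60 frames at 30fps). For 60fps content, use -g 120. For 24fps content, use -g 48. On NVENC, -no-scenecut 1 disables adaptive I-frame insertion at scene cuts when lookahead is enabled, and -forced-idr 1 makes forced keyframes IDR frames; -strict_gop 1 only reduces GOP-to-GOP rate fluctuation and does not by itself pin keyframe positions.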
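
The frame-count arithmetic and the per-encoder syntax above can be generated from a frame rate and a GOP duration. The sketch below only builds argument lists from the flags already shown in this section for the software encoders; it does not run ffmpeg, and the helper names `gop_frames` and `gop_args` are hypothetical.

```python
from fractions import Fraction


def gop_frames(fps, gop_seconds) -> int:
    """GOP length in frames; rejects combinations that are not a whole number of frames."""
    frames = Fraction(str(fps)) * Fraction(str(gop_seconds))
    if frames.denominator != 1:
        raise ValueError(f"{gop_seconds}s at {fps}fps is not a whole number of frames")
    return int(frames)


def gop_args(encoder: str, fps, gop_seconds) -> list[str]:
    """ffmpeg arguments for a fixed GOP, using the flags listed in this section."""
    n = str(gop_frames(fps, gop_seconds))
    templates = {
        "libx264": ["-c:v", "libx264", "-g", n, "-keyint_min", n, "-sc_threshold", "0"],
        "libx265": ["-c:v", "libx265", "-x265-params", f"keyint={n}:min-keyint={n}:scenecut=0"],
        "libsvtav1": ["-c:v", "libsvtav1", "-svtav1-params", f"keyint={n}:scd=0"],
        "libvpx-vp9": ["-c:v", "libvpx-vp9", "-g", n, "-keyint_min", n],
    }
    return templates[encoder]


print(" ".join(gop_args("libx264", 30, 2)))
print(" ".join(gop_args("libx265", 60, 2)))
```

Deriving the frame count from one shared GOP duration, instead of hard-coding 60, avoids the common mistake of reusing a 30fps setting on 60fps or 24fps sources.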

#Force-key-frames for precise alignment

Encoders sometimes deviate from the -g setting (rounding to scene boundaries, codec internal optimization). For streaming where exact alignment matters, force keyframes at exact times:

-force_key_frames "expr:gte(t,n_forced*2)"

This tells ffmpeg to insert a keyframe every 2 seconds regardless of encoder default behavior. Combined with -g and scenecut=0, this guarantees exact 2-second GOP boundaries.

For ABR ladders where multiple variants must align (HLS adapts at common segment boundaries; segments must start at the same media time across all variants), force-key-frames is the way to ensure cross-variant alignment.
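
To see which frames the expression selects, the following sketch simulates gte(t, n_forced*2) over constant-frame-rate input, where t is the frame's time in seconds and n_forced is the number of keyframes forced so far. It uses exact fractions; ffmpeg itself evaluates the expression in floating point, so treat this as an approximation of the behaviour. The function name is hypothetical.

```python
from fractions import Fraction


def forced_keyframes(fps: Fraction, interval_seconds: int, frame_count: int) -> list[int]:
    """Frame numbers at which gte(t, n_forced*interval_seconds) becomes true."""
    forced: list[int] = []
    for n in range(frame_count):
        t = Fraction(n) / fps  # presentation time of frame n in seconds
        # len(forced) plays the role of n_forced.
        if t >= len(forced) * interval_seconds:
            forced.append(n)
    return forced


print(forced_keyframes(Fraction(30), 2, 150))
print(forced_keyframes(Fraction(30000, 1001), 2, 150))
print(forced_keyframes(Fraction(25), 2, 150))
```

Because the expression depends only on media time, every rendition encoded from the same source at the same frame rate selects the same frames, which is the property cross-variant alignment relies on.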

#Operational considerations

Things that matter for GOP and keyframes in production:

  • Cross-variant alignment — all ABR variants must have IDR keyframes at the same media times. Use force-key-frames consistently across all encodings.
  • Segment duration / GOP duration ratio — must be an integer. Verify configuration; mismatches break ABR adaptation.
  • Live encoder restart handling — when a live encoder restarts, the GOP timing may reset. Manifest manipulation needs to handle the reset gracefully.
  • B-frame and look-ahead interaction — encoders with significant look-ahead can introduce small GOP deviations. Test the actual output, not just the configuration.
  • GOP size in encoder reporting — the encoder may report per-segment GOP statistics (average GOP size, IDR placement). Validate against intended configuration.
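
The first two checks in this list can be automated once keyframe timestamps have been extracted from each rendition with a media inspection tool. The sketch below assumes those timestamps are already available as exact fractions of a second; `check_alignment` and the variant names are hypothetical.

```python
from fractions import Fraction


def check_alignment(
    keyframes_by_variant: dict[str, list[Fraction]],
    segment_seconds: Fraction,
) -> list[str]:
    """Return human-readable problems; an empty list means the variants are aligned."""
    problems: list[str] = []
    if not keyframes_by_variant:
        return problems
    names = list(keyframes_by_variant)
    reference = keyframes_by_variant[names[0]]
    # Cross-variant alignment: every variant must have the same keyframe times.
    for name in names[1:]:
        if keyframes_by_variant[name] != reference:
            problems.append(f"{name}: keyframe times differ from {names[0]}")
    # Segment alignment: every segment boundary must land on a keyframe.
    for name, times in keyframes_by_variant.items():
        if not times:
            problems.append(f"{name}: no keyframes")
            continue
        available = set(times)
        boundary = times[0]
        while boundary <= times[-1]:
            if boundary not in available:
                problems.append(f"{name}: no keyframe at segment boundary {float(boundary)}s")
            boundary += segment_seconds
    return problems


aligned = {
    "1080p": [Fraction(0), Fraction(2), Fraction(4), Fraction(6)],
    "720p": [Fraction(0), Fraction(2), Fraction(4), Fraction(6)],
}
print(check_alignment(aligned, Fraction(6)))
```

Running a check like this against real encoder output after any configuration or encoder-version change catches drift that the configuration alone would not reveal.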

#What MpegFlow does with GOP

MpegFlow's DAG runtime applies GOP configuration through force-key-frames (or equivalent encoder flags) at each FfmpegExecutor rendition stage. The workflow YAML specifies GOP duration in seconds (typical: 2 seconds); the same value flows into every parallel rendition stage so all rungs share boundaries by construction. The partitioner persists each rendition stage to job_stages with explicit dependency tracking; per-stage retry handles transient failures.

For ABR ladders, cross-variant GOP alignment isn't a manual coordination step — it's a structural property of the partitioner emitting the same keyframe configuration across the parallel rendition stages from one shared workflow parameter.

For live workflows, GOP duration is per-workflow. Default is 2s for standard latency; lower-latency workflows configure shorter GOPs where the underlying tooling supports it.

Segment-to-GOP alignment in the packaging stage is what force-key-frames is intended to guarantee; pipeline regression discipline exercises segment-boundary checks against real output to catch encoder-version-specific edge cases before they reach customer playback.

The strict-broker security model treats GOP configuration like any pipeline payload — workers carry no ambient credentials; content access flows through short-lived presigned URLs scoped per stage; access is disposed on completion.

For customers tuning GOP for their pipeline, the standing recommendation: 2-second GOP for standard streaming; 1-second GOP for low-latency live; verify cross-variant alignment with media inspection tools after configuration changes. The default settings in MpegFlow workflows produce streaming-correct output; customer-specific tuning is for use cases that need to deviate from the streaming-conventional defaults.
