When Static Images Need to Become Video
Wedding photos for an anniversary party. Product shots for a launch video. A portfolio that needs to play on a TV during an event. Travel photos from a trip someone wants to share. All of these start as a folder of still images and need to become a video file.
The routes to get there range from simple (one FFmpeg command) to sophisticated (precise timing, custom transitions, background music). This guide covers the full range, from getting something working in five minutes to producing a polished result with professional-looking transitions.
The Simplest Approach: FFmpeg
FFmpeg can turn a folder of images into a video in a single command. No GUI, no export dialogs — images in, a video file out.
Basic Slideshow (Fixed Duration per Image)
# Each image displays for 3 seconds
ffmpeg -framerate 1/3 -pattern_type glob -i '*.jpg' \
-c:v libx264 -r 30 -pix_fmt yuv420p output.mp4
Breaking down the flags:
- -framerate 1/3 — input rate of one frame every 3 seconds
- -pattern_type glob -i '*.jpg' — use all JPG files in alphabetical order
- -c:v libx264 — encode with H.264 (maximum compatibility)
- -r 30 — output at 30fps (without it, the output inherits the 1/3fps input rate, which many players handle poorly)
- -pix_fmt yuv420p — standard pixel format for compatibility
Important: Image files are processed in alphabetical order. Name your images with zero-padded numbers if order matters: 001_photo.jpg, 002_photo.jpg, etc.
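If filenames aren't already sequential, a short Python sketch can batch-rename them in their current sort order. The zero_pad_rename helper name is hypothetical; it assumes the folder contains only images you want renamed, and does not guard against collisions with files already named in the target pattern.

```python
import glob
import os

def zero_pad_rename(folder, width=3, ext='jpg'):
    """Rename images to 001_photo.jpg, 002_photo.jpg, ... preserving sort order.

    Caution: a sketch only -- it assumes no existing file already uses the
    NNN_photo naming pattern, or an os.rename could overwrite it.
    """
    images = sorted(glob.glob(os.path.join(folder, f'*.{ext}')))
    for i, path in enumerate(images, start=1):
        new_name = os.path.join(folder, f'{i:0{width}d}_photo.{ext}')
        if path != new_name:
            os.rename(path, new_name)
    return len(images)
```

Run it once on a copy of the folder first, then feed the renamed files to the glob or %03d patterns above.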
Controlling Image Order
# Name images in sequence and process in order
ffmpeg -framerate 1/4 -i 'img_%03d.jpg' -c:v libx264 -r 30 -pix_fmt yuv420p output.mp4
The %03d pattern matches img_001.jpg, img_002.jpg, etc. — zero-padded 3-digit numbers.
Setting a Consistent Output Resolution
Source photos often have mixed orientations and sizes. Force a consistent output:
# Force 1920x1080 (landscape)
ffmpeg -framerate 1/4 -pattern_type glob -i '*.jpg' \
-vf "scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2,setsar=1" \
-c:v libx264 -r 30 -pix_fmt yuv420p output_1080p.mp4
The filter chain:
- scale=1920:1080:force_original_aspect_ratio=decrease — scales to fit within 1920x1080 while maintaining aspect ratio
- pad=1920:1080:(ow-iw)/2:(oh-ih)/2 — adds black bars to fill the remaining space
- setsar=1 — normalizes the sample aspect ratio
For portrait photos in a landscape slideshow, this produces black letterboxing. For a blur-fill background instead of black bars, see the Ken Burns section below.
Adding Background Music
# Add audio, trim to video length
ffmpeg -framerate 1/4 -pattern_type glob -i '*.jpg' \
-i background_music.mp3 \
-vf "scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2" \
-c:v libx264 -r 30 -pix_fmt yuv420p \
-c:a aac -b:a 192k \
-shortest \
output_with_music.mp4
The -shortest flag stops the output when the shorter of the two inputs (video or audio) ends. If the music is longer than the slideshow, it gets cut. If the slideshow is longer than the music, the audio ends first and the remainder is silent — add -af "afade=t=out:st=LAST_FEW_SECONDS:d=2" for a fade-out.
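The fade-out start time is just the slideshow's total length minus the fade length. A small helper that builds the afade argument for a fixed-duration slideshow (the afade_out_arg name is hypothetical):

```python
def afade_out_arg(num_images, seconds_per_image, fade_len=2.0):
    """Build an afade filter string that fades the audio out over the
    slideshow's final fade_len seconds."""
    total = num_images * seconds_per_image
    start = max(total - fade_len, 0)  # never start the fade before 0
    return f"afade=t=out:st={start:g}:d={fade_len:g}"
```

For a 10-image slideshow at 4 seconds per image, this yields afade=t=out:st=38:d=2, which you pass to FFmpeg via -af.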
Variable Duration per Image
Different images deserve different amounts of screen time. The cleanest way to handle this in FFmpeg is through a concat demuxer input file:
Create a File List with Custom Durations
Create a file named filelist.txt:
file 'opening.jpg'
duration 5
file 'photo01.jpg'
duration 3
file 'photo02.jpg'
duration 3
file 'photo03.jpg'
duration 4
file 'photo04.jpg'
duration 2
file 'closing.jpg'
duration 5
file 'closing.jpg'
Note: the concat demuxer ignores the duration directive on the final entry, so the last image is listed a second time (without a duration) to make it display for its full 5 seconds.
Then generate the video:
ffmpeg -f concat -safe 0 -i filelist.txt \
-vf "scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2" \
-c:v libx264 -r 30 -pix_fmt yuv420p output_variable.mp4
A Python script to generate filelist.txt from a folder with custom timings:
import glob

# Map filenames to durations (seconds)
timing = {
    'opening.jpg': 5,
    'default': 3,   # Default for all others
    'closing.jpg': 6,
}

images = sorted(glob.glob('*.jpg'))
with open('filelist.txt', 'w') as f:
    for img in images:
        duration = timing.get(img, timing['default'])
        f.write(f"file '{img}'\n")
        f.write(f"duration {duration}\n")
    if images:
        # Repeat the last file so the concat demuxer honors its duration
        f.write(f"file '{images[-1]}'\n")
print(f"Generated filelist.txt for {len(images)} images")
Adding Transitions
FFmpeg doesn't have built-in "dissolve between slides" transitions in a single pass. The xfade filter handles transitions, but requires either:
- Pre-encoded segment files, or
- The xfade filter with carefully timed inputs
Cross-Fade Transition Using xfade
# Two same-sized images with a 1-second crossfade between them
# (xfade requires both inputs to have matching dimensions)
ffmpeg \
-loop 1 -t 4 -i photo1.jpg \
-loop 1 -t 4 -i photo2.jpg \
-filter_complex "[0:v][1:v]xfade=transition=fade:duration=1:offset=3,format=yuv420p" \
-r 30 output_fade.mp4
For multiple images, chain the xfade filters:
# Three images with crossfades
ffmpeg \
-loop 1 -t 5 -i photo1.jpg \
-loop 1 -t 5 -i photo2.jpg \
-loop 1 -t 5 -i photo3.jpg \
-filter_complex "
[0:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2,setsar=1[v0];
[1:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2,setsar=1[v1];
[2:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2,setsar=1[v2];
[v0][v1]xfade=transition=fade:duration=1:offset=4[xf1];
[xf1][v2]xfade=transition=fade:duration=1:offset=8,format=yuv420p[out]
" -map "[out]" -r 30 output_transitions.mp4
Available xfade transitions include: fade, wipeleft, wiperight, wipeup, wipedown, slideleft, slideright, slideup, slidedown, circlecrop, circleopen, circleclose, horzopen, horzclose, vertopen, vertclose, dissolve, pixelize, diagtl, diagtr, diagbl, diagbr.
Scripting Multi-Image Transitions
For slideshows with many images, writing xfade chains manually isn't practical. A Python script:
import glob
import os
import subprocess

def create_slideshow_with_transitions(image_folder, output_file,
                                      duration_per_image=4,
                                      transition_duration=1,
                                      transition_type='fade',
                                      size='1920:1080'):
    images = sorted(glob.glob(os.path.join(image_folder, '*.jpg')))
    if not images:
        images = sorted(glob.glob(os.path.join(image_folder, '*.png')))
    n = len(images)
    if n < 2:
        raise ValueError('need at least two images for transitions')
    w, h = size.split(':')

    # Build FFmpeg input arguments
    inputs = []
    for img in images:
        inputs.extend(['-loop', '1', '-t', str(duration_per_image), '-i', img])

    # Scale and pad every input to a uniform frame (setsar keeps xfade happy)
    scale_filter = (f"scale={w}:{h}:force_original_aspect_ratio=decrease,"
                    f"pad={w}:{h}:(ow-iw)/2:(oh-ih)/2,setsar=1")
    filter_parts = []
    for i in range(n):
        filter_parts.append(f"[{i}:v]{scale_filter}[v{i}]")

    # Chain xfade transitions: the i-th transition starts at
    # i * (duration_per_image - transition_duration) into the running output
    prev = 'v0'
    for i in range(1, n):
        offset = i * (duration_per_image - transition_duration)
        out = f"xf{i}" if i < n - 1 else "out"
        filter_parts.append(
            f"[{prev}][v{i}]xfade=transition={transition_type}:"
            f"duration={transition_duration}:offset={offset}[{out}]"
        )
        prev = out
    filter_parts.append("[out]format=yuv420p[final]")

    cmd = inputs + [
        '-filter_complex', ';'.join(filter_parts),
        '-map', '[final]',
        '-r', '30',
        '-c:v', 'libx264', '-preset', 'fast',
        output_file,
    ]
    subprocess.run(['ffmpeg'] + cmd, check=True)

# Usage
create_slideshow_with_transitions('./photos', 'slideshow.mp4')
Ken Burns Effect (Pan and Zoom)
The Ken Burns effect — slow zooms and pans on still images — adds motion and visual interest. FFmpeg's zoompan filter handles this:
# Zoom in slowly on a single image (5 seconds)
ffmpeg -loop 1 -i photo.jpg -t 5 \
-vf "zoompan=z='min(zoom+0.0015,1.5)':d=125:x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)',scale=1920:1080,format=yuv420p" \
-r 25 output_kenburns.mp4
Breaking down the zoompan filter:
- z='min(zoom+0.0015,1.5)' — zoom increases by 0.0015 per frame, capped at 1.5x
- d=125 — duration in frames (125 frames ÷ 25fps = 5 seconds)
- x/y — keep the zoom centered on the image center
For a pan-right effect:
-vf "zoompan=z=1.2:d=125:x='(iw-iw/zoom)*on/125':y='ih/2-(ih/zoom/2)',scale=1920:1080"
The x range of a panned zoom is 0 to iw-iw/zoom, so the expression scales that span by the frame counter (on) to pan smoothly across the full clip length.
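Hand-tuning the zoom step for each clip length is fiddly. A small builder, following the centered zoom-in shown above (the kenburns_filter name is hypothetical), derives the per-frame step from the clip duration:

```python
def kenburns_filter(duration_s, fps=25, zoom_to=1.5, size='1920:1080'):
    """Build a zoompan filter string for a slow centered zoom-in.

    The zoom step is chosen so the clip reaches zoom_to by its last frame.
    """
    frames = int(duration_s * fps)
    step = (zoom_to - 1.0) / frames  # per-frame zoom increment
    return (f"zoompan=z='min(zoom+{step:.6f},{zoom_to})':d={frames}"
            f":x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)'"
            f",scale={size},format=yuv420p")
```

kenburns_filter(5) reproduces the 5-second example above with a 0.004 step; pass the result to FFmpeg via -vf.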
Platform-Specific Settings
Different platforms have specific requirements for slideshow videos:
| Platform | Resolution | Aspect Ratio | Frame Rate | Format |
|---|---|---|---|---|
| Instagram Feed | 1080x1080 | 1:1 | 30fps | MP4, H.264 |
| Instagram Reels | 1080x1920 | 9:16 | 30fps | MP4, H.264 |
| TikTok | 1080x1920 | 9:16 | 30fps | MP4, H.264 |
| YouTube | 1920x1080 | 16:9 | 24/30fps | MP4, H.264 |
|  | 1280x720+ | 16:9 | 30fps | MP4, H.264 |
|  | 1920x1080 | 16:9 | 25/30fps | MP4, H.264 |
Use the video converter to adjust aspect ratio and resolution after creating the base slideshow, or the crop video tool to reframe a 16:9 slideshow for vertical platforms.
Adding Text Overlays
# Add title text to the first 3 seconds
ffmpeg -i slideshow.mp4 \
-vf "drawtext=fontfile=/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf:text='Our Trip to Spain':x=(w-text_w)/2:y=(h-text_h)/2:fontsize=72:fontcolor=white:enable='between(t,0,3)'" \
-c:v libx264 -c:a copy output_with_title.mp4
For multiple text overlays at different times, chain drawtext filters or apply them in sequence.
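Building a long drawtext chain by hand is error-prone. A sketch of a helper (the chained_drawtext name is hypothetical; real-world text needs fuller escaping of FFmpeg's special characters than shown here):

```python
def chained_drawtext(captions, fontfile, fontsize=72, fontcolor='white'):
    """Chain one drawtext filter per caption.

    captions is a list of (text, start_s, end_s) tuples; the result is a
    comma-joined filter chain suitable for -vf.
    """
    parts = []
    for text, start, end in captions:
        # Minimal escaping sketch: escape colons, drop single quotes
        safe = text.replace(':', '\\:').replace("'", '')
        parts.append(
            f"drawtext=fontfile={fontfile}:text='{safe}'"
            f":x=(w-text_w)/2:y=(h-text_h)/2:fontsize={fontsize}"
            f":fontcolor={fontcolor}:enable='between(t,{start},{end})'"
        )
    return ','.join(parts)
```

For example, chained_drawtext([('Intro', 0, 3), ('Credits', 27, 30)], '/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf') produces a two-filter chain showing a title for the first 3 seconds and credits over the last 3.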
Photo Quality Considerations
Before creating the slideshow, ensure source photos are sized appropriately:
- Too small: Photos smaller than 1920x1080 will be upscaled, which reduces sharpness
- Too large: 50MP camera files add unnecessary processing overhead — resize to 3840x2160 (4K) at most before creating the slideshow
- Mixed orientations: Script the portrait/landscape handling if your photos include both
Use the image compressor to reduce large source images before processing, or the resize image tool to standardize dimensions. For converting HEIC phone photos (common from iPhones) before using them in slideshows, the HEIC to JPG converter handles batch conversion.
Frequently Asked Questions
What image formats does FFmpeg accept for slideshows?
FFmpeg handles JPEG, PNG, BMP, TIFF, WebP, and most common image formats. RAW camera files (CR2, NEF, ARW) require conversion to JPEG or PNG first. HEIC files from iPhones also require conversion. Use a batch conversion tool before running the slideshow command.
Why does my slideshow output have inconsistent timing?
This usually happens when image durations don't align with the output frame rate. Use durations that are whole multiples of the frame interval (1/framerate). At 30fps output, 2.5 seconds is safe (exactly 75 frames), but 2.25 seconds is not (67.5 frames, which forces rounding and produces uneven timing).
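To vet a duration list before encoding, a quick check (the frame_exact name is hypothetical) flags values that don't land on a frame boundary:

```python
def frame_exact(duration_s, fps=30):
    """True if the duration is a whole number of frames at the given fps."""
    frames = duration_s * fps
    # Small tolerance absorbs floating-point noise like 2.3 * 30 = 68.999...
    return abs(frames - round(frames)) < 1e-9
```

Run it over every duration in your filelist; any False result is a candidate for uneven timing in the output.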
Can I add custom audio per image section?
Yes, but it requires separate audio tracks with precise timing. The cleanest approach: create the video without audio, create an audio file with the exact same total duration and proper music/voiceover timing, then combine them:
ffmpeg -i video_no_audio.mp4 -i audio_track.mp3 -c:v copy -c:a aac -shortest output_final.mp4
How do I handle mixed portrait and landscape photos?
Use the scale + pad filter combination shown earlier. For portrait photos, this creates black bars on the sides in a landscape slideshow. Alternatively, use a blurred version of the photo as the background fill:
-vf "split[base][over];[base]scale=1920:1080,boxblur=10,setsar=1[bg];[over]scale=1920:1080:force_original_aspect_ratio=decrease,setsar=1[fg];[bg][fg]overlay=(W-w)/2:(H-h)/2"
This creates a blurred background from the photo itself instead of solid black bars.
Conclusion
Creating a video slideshow from images spans from a single FFmpeg command for a basic result to a scripted pipeline for a polished production. The core workflow is always the same: assemble images in order, set duration, optionally add transitions and audio, encode to H.264 MP4 for maximum compatibility.
For photos that need format conversion before the slideshow process, the image converter handles batch conversion of any image format. The how to convert images to PDF guide covers the parallel use case when you need a document rather than a video, and the GIF maker is the right tool when you need a short looping animation instead of a full video file.



