Start from a sentence, a reference image, or a script, and end with a finished, shareable scene.
Models, aspects, seeds, start & end frames, audio modes — explained without jargon.
From blank screen to a 9:16 short, fully narrated and captioned, in 5 minutes.
Turn a single photo into a studio-grade talking head that speaks any language.
Plan a sequence with consistent characters and look — like a real shoot.
Continue motion, loop for seamless ambience, or stretch a hero shot.
Melt, crush, explode, glow — when to use each and when not to.
Breathe new life into footage you already have: add captions, dub it, add a voiceover, or replace anything.
Style, position, highlight colour — the small details that make captions click.
Keep your voice, keep your gestures, reach a global audience.
Re-record any video in a new language while keeping the speaker's face.
Multi-segment narration, different voice per segment, auto music ducking.
Precision vs Speed mode — when each one wins.
The video inpainting brush, frame by frame.
Transparent, green screen, or AI-replaced background.
Genre, mood, intensity, instrumentation: pick a vibe and Lumen writes the track.
Generate, design, swap, isolate or transcribe — the entire voice surface.
Stability, similarity, style — the three sliders that change everything.
Describe the voice you've always wanted. Save it. Use it everywhere.
1,000+ voices, sorted by use, filtered by vibe.
Speech-to-speech with preserved emotion and timing.
Vocals, instrumental, or full cleanup — three modes, one slider.
Word-level timestamps, speaker diarisation, SRT/VTT/JSON exports.
System prompt, voice, knowledge base, temperature — the four levers that matter.
Paste a URL → embeddable narrator. Done.
Compose full songs, design ambience, build podcasts — without a DAW.
Flux, Imagen, Stable Diffusion, Midjourney — one composer for all of them.
Reading is half. Doing is the other half. Open Lumen on whatever device is nearest.
Get started, free