Media · 2025 ·Creator · In progress
Lyric Video Generator
A tool that turns an audio file and its lyrics into a styled MP4 lyric video — using OpenAI Whisper for word-level timing, then rendering animated text and audio visualizers with optional GPU acceleration.
L
Overview
Making a lyric video by hand means syncing text to audio one line at a time — tedious, finicky work. This tool automates the whole thing: feed it a song and its lyrics, get back a finished, styled lyric video.
What I built
- Word-level alignment — OpenAI Whisper transcribes with timestamps, then fuzzy matching reconciles the transcription against the real lyrics so timing lands on the right words.
- Styled rendering — animated text with configurable fonts, colors, and positioning, plus spectrum and waveform visualizer overlays.
- Built for speed and reach — optional CUDA GPU acceleration (3–5× faster) and three ways to drive it: CLI, GUI, and a FastAPI web interface.
Status
A work in progress with a solid core — the alignment-and-render pipeline works; the hosted web experience is the next step.
Built with Python Whisper MoviePy PyTorch FastAPI
Next project
AI Support Engine →