SRT Dubber - Cristián Ormazábal

Overview

SRT Dubber is a C++20 command-line application for voice-over dubbing. You record one take per subtitle, and the tool then processes each take and aligns it to the subtitle timing. It handles silence trimming, audio normalization, and final video assembly, all from a terminal UI.

Features

Interactive Recording Session: Terminal UI (TUI) powered by FTXUI for comfortable voice recording
One Take Per Subtitle: Record narration for each subtitle slot individually with real-time waveform preview
Automatic Audio Processing: Trim silence, normalize levels, and fit audio duration to subtitle timing
Smart Muxing: Combine voice-over track with original video using FFmpeg (MP4 output)
Cross-Platform Build: Supports macOS (Clang), Linux (GCC), and Windows (MinGW-w64 cross-compile)
Project State Tracking: Persistent project.json tracks recording progress per subtitle slot

Workflow

The typical dubbing workflow consists of three main commands:

1. Record

srt-dubber record input.srt

Launches the interactive TUI where you record one audio take for each subtitle line. Each recording is saved as a timestamped WAV file in the takes/ directory.

2. Process

srt-dubber process input.srt

Batch-processes all recorded takes: removes silence, normalizes audio levels, and stretches/compresses each recording to match the subtitle duration.
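
The duration-fitting step can be pictured as a tempo ratio handed to FFmpeg. The sketch below assumes FFmpeg's atempo filter is used; this README does not say which filter srt-dubber actually applies, and split_tempo and fit_filter are hypothetical names. Each factor is kept within 0.5 to 2.0, the range a single atempo instance accepts on every FFmpeg release, and larger changes are chained.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Split a tempo ratio into factors that each lie in [0.5, 2.0].
// The product of the returned factors equals the input ratio.
std::vector<double> split_tempo(double ratio) {
    assert(ratio > 0.0);
    std::vector<double> factors;
    while (ratio > 2.0) { factors.push_back(2.0); ratio /= 2.0; }
    while (ratio < 0.5) { factors.push_back(0.5); ratio /= 0.5; }
    factors.push_back(ratio);
    return factors;
}

// Build an audio filter string that makes a take lasting take_ms
// milliseconds fit a subtitle slot lasting slot_ms milliseconds.
// (Hypothetical helper, not part of srt-dubber's documented API.)
std::string fit_filter(long take_ms, long slot_ms) {
    assert(take_ms > 0 && slot_ms > 0);
    // atempo > 1 speeds audio up, so a take longer than its slot
    // needs a ratio above 1.
    const double ratio =
        static_cast<double>(take_ms) / static_cast<double>(slot_ms);
    std::string filter;
    for (double factor : split_tempo(ratio)) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "atempo=%.6f", factor);
        if (!filter.empty()) filter += ',';
        filter += buf;
    }
    return filter;
}
```

A 3-second take in a 2-second slot yields atempo=1.500000, which could be passed to FFmpeg as the value of -filter:a. Very large ratios audibly distort speech, so re-recording the take is often the better fix.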

3. Assemble

srt-dubber assemble input.srt input.mp4

Combines all processed audio tracks into a single narration WAV file (output/voiceover.wav) and muxes it into the original video file, producing the final output/output.mp4.
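
Building the narration track amounts to placing each processed take at its subtitle's start time on an otherwise silent timeline. The following is a minimal sketch of that idea on in-memory mono samples, not the tool's actual implementation (which may delegate this to FFmpeg); Clip, ms_to_samples, and lay_out are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One processed take and the subtitle start time it belongs to.
struct Clip {
    std::int64_t start_ms;       // subtitle start, in milliseconds
    std::vector<float> samples;  // mono PCM samples
};

// Convert a millisecond offset to a sample index (rounds down).
std::size_t ms_to_samples(std::int64_t ms, std::uint32_t sample_rate) {
    assert(ms >= 0);
    return static_cast<std::size_t>(ms * sample_rate / 1000);
}

// Copy every clip into a silent track of total_ms milliseconds.
// Clips that start past the end are skipped; clips that run past
// the end are truncated.
std::vector<float> lay_out(const std::vector<Clip>& clips,
                           std::int64_t total_ms,
                           std::uint32_t sample_rate) {
    std::vector<float> track(ms_to_samples(total_ms, sample_rate), 0.0f);
    for (const Clip& clip : clips) {
        const std::size_t offset = ms_to_samples(clip.start_ms, sample_rate);
        if (offset >= track.size()) continue;
        const std::size_t n =
            std::min(clip.samples.size(), track.size() - offset);
        std::copy_n(clip.samples.begin(), n,
                    track.begin() + static_cast<std::ptrdiff_t>(offset));
    }
    return track;
}
```

Because the process step already fits each take to its slot, clips should not overlap, which is why this sketch copies samples rather than mixing them.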

All-in-One

srt-dubber full input.srt input.mp4

Runs recording (TUI), processing, and assembly sequentially in one command.

Requirements

CMake ≥ 3.20
C++20 compiler (Clang 14+ or GCC 12+)
FFmpeg + FFprobe (must be on PATH)
miniaudio.h (downloaded manually to vendor/)

Platform-Specific Setup

macOS: Xcode Command Line Tools provide Clang. Install CMake and FFmpeg separately, for example with Homebrew: brew install cmake ffmpeg
Linux: apt install cmake g++ ffmpeg
Windows: MinGW-w64 cross-compilation supported; FFmpeg required in build environment.

Getting Started

1. Download miniaudio (one-time)

mkdir -p vendor
curl -L https://raw.githubusercontent.com/mackron/miniaudio/master/miniaudio.h \
     -o vendor/miniaudio.h

2. Build

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel

The compiled binary is at build/srt-dubber.

3. Record Your First Voice-Over

./build/srt-dubber full input.srt input.mp4

Output Structure

takes/              Raw recordings (N.wav per subtitle slot)
processed/          Trimmed & normalized audio (N.wav)
output/
  voiceover.wav     Assembled narration track
  output.mp4        Final video with voice-over layer
project.json        Recording state & progress (.gitignored)

Architecture

src/
  srt/              SRT parser and subtitle container
  core/             Project state management and persistence
  audio/            miniaudio recorder & playback
  ffmpeg/           FFmpeg/FFprobe wrappers for processing
  tui/              FTXUI interactive screens
main.cpp            CLI dispatcher
vendor/
  miniaudio.h       Single-header audio library (not in git)
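
To illustrate what the srt/ module has to deal with, here is a minimal sketch of parsing an SRT timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm into milliseconds. It is not taken from the project's source; Cue, parse_timing, and duration_ms are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Start and end of one subtitle slot, in milliseconds.
struct Cue {
    std::int64_t start_ms;
    std::int64_t end_ms;
};

// Parse a line such as "00:01:02,345 --> 00:01:05,000".
// Returns std::nullopt if the line is not a valid timing line.
std::optional<Cue> parse_timing(const std::string& line) {
    int h1 = 0, m1 = 0, s1 = 0, ms1 = 0;
    int h2 = 0, m2 = 0, s2 = 0, ms2 = 0;
    const int matched = std::sscanf(
        line.c_str(), "%d:%d:%d,%d --> %d:%d:%d,%d",
        &h1, &m1, &s1, &ms1, &h2, &m2, &s2, &ms2);
    if (matched != 8) return std::nullopt;
    for (int field : {h1, m1, s1, ms1, h2, m2, s2, ms2}) {
        if (field < 0) return std::nullopt;
    }
    auto to_ms = [](int h, int m, int s, int ms) -> std::int64_t {
        return ((h * 60LL + m) * 60 + s) * 1000 + ms;
    };
    const Cue cue{to_ms(h1, m1, s1, ms1), to_ms(h2, m2, s2, ms2)};
    if (cue.end_ms < cue.start_ms) return std::nullopt;
    return cue;
}

// Length of the slot that a recorded take has to fit into.
std::int64_t duration_ms(const Cue& cue) {
    assert(cue.end_ms >= cue.start_ms);
    return cue.end_ms - cue.start_ms;
}
```

The slot duration computed here is the target that the process step stretches or compresses each take to match.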

Key Dependencies

FTXUI v5.0.0 — Modern terminal UI framework (fetched by CMake)
nlohmann/json v3.11.3 — JSON serialization for project state (fetched by CMake)
miniaudio — Single-header audio I/O library

Tech Stack

Language: C++20
Build System: CMake 3.20+
Audio: FFmpeg, FFprobe, miniaudio
UI: FTXUI (terminal)
Build Tool Support: Docker cross-compilation (Windows)

Use Cases

Content Creators: Add professional voice-overs to tutorial videos
Subtitled Content: Quickly dub existing SRT-subtitled media
Accessibility: Create audio descriptions from subtitle scripts
Localization: Re-dub videos in multiple languages from existing subtitle timings
Podcasting: Compose multi-voice content with precise timing control

Design Philosophy

SRT Dubber follows a single-responsibility, composable CLI design:

Each command (record, process, assemble) does one thing well
Commands work independently or can be chained
Project state is portable and human-readable (JSON)
Local processing—no cloud dependencies or subscriptions
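
The composable design above can be sketched as a table of command handlers, with the full command expressed as a chain of the other three that stops at the first failure. This is an illustration under assumptions, not the contents of main.cpp; Args, Handler, dispatch, and chain are hypothetical names, and the exit code 2 for an unknown command is an arbitrary choice.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Args = std::vector<std::string>;
using Handler = std::function<int(const Args&)>;

// Look up the first argument as a command name and run its handler
// with the remaining arguments. Returns 2 when no command matches.
int dispatch(const std::map<std::string, Handler>& commands,
             const Args& argv) {
    if (argv.empty()) return 2;
    const auto it = commands.find(argv.front());
    if (it == commands.end()) return 2;
    assert(it->second != nullptr);
    return it->second(Args(argv.begin() + 1, argv.end()));
}

// Combine several handlers into one that runs them in order and
// returns the first non-zero exit code, or 0 if all succeed.
Handler chain(std::vector<Handler> steps) {
    return [steps = std::move(steps)](const Args& args) {
        for (const Handler& step : steps) {
            if (int rc = step(args); rc != 0) return rc;
        }
        return 0;
    };
}
```

With handlers registered for record, process, and assemble, a full command could be defined as chain({record, process, assemble}), so the all-in-one mode adds no logic of its own.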

License

MIT License