Auto Clip (V_1 / V_2 / V_3)
A local, offline-friendly toolkit for generating DJ-style album teasers by automatically finding highlights in your tracks and stitching them into a short teaser using Python + FFmpeg.
This repo is organized as:
Auto Clip/
  V_1/
  V_2/
  V_3/
Each version is a step up in “DJ-ness” and automation.
What it does ✅
Given a folder of audio tracks (WAV/MP3/FLAC/etc.), the scripts can:
- Scan tracks automatically (with a max limit, e.g. 20)
- Select tracks by index (e.g. 1,2,5,7 or 1-4,9) or automatically pick a best-of
- Detect highlight segments (energy + onset/transients)
- Snap cut points to a bar grid (phrase-aligned “DJ” cuts)
- Add optional pre-roll (start 1 bar before the highlight)
- Render clips and merge them with acrossfades
- Export WAV + MP3
- Output a report JSON with timestamps and clip metadata
Note: These scripts are audio analysis + heuristics, not “generative music AI”.
LLMs (Ollama) can help with README/promo/tracklists, but the actual audio cutting is done locally with librosa + FFmpeg.
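The bar-grid snap and pre-roll steps listed above come down to simple tempo arithmetic. The following is a minimal sketch, not the scripts' actual code: it assumes a constant tempo, 4/4 time, and a known first downbeat, and the function names are hypothetical. It also assumes the pre-roll is added on top of the clip length, which the real scripts may handle differently.

```python
def bar_seconds(bpm: float, beats_per_bar: int = 4) -> float:
    """Length of one bar in seconds at a constant tempo."""
    return beats_per_bar * 60.0 / bpm


def snap_to_bar(t: float, bpm: float, first_downbeat: float = 0.0,
                beats_per_bar: int = 4) -> float:
    """Snap time t (seconds) to the nearest bar line of a constant-tempo grid."""
    bar = bar_seconds(bpm, beats_per_bar)
    index = max(0, round((t - first_downbeat) / bar))
    return first_downbeat + index * bar


def clip_window(highlight_start: float, bpm: float, bars: int = 2,
                preroll_bars: int = 1, first_downbeat: float = 0.0):
    """Return (start, duration) in seconds for a bar-aligned clip.

    Assumption: the pre-roll bars are played in addition to `bars`.
    """
    bar = bar_seconds(bpm)
    start = snap_to_bar(highlight_start, bpm, first_downbeat)
    start = max(first_downbeat, start - preroll_bars * bar)
    return start, (bars + preroll_bars) * bar
```

At 120 BPM one bar lasts 2 seconds, so a highlight detected at 61.2 s snaps to the bar line at 62 s, and one bar of pre-roll moves the cut to 60 s.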
Version overview 🧩
V_1 — Minimal MVP
Goal: Quick proof-of-concept teaser builder.
Typical features:
- Highlight detection (energy/onset)
- Simple clip render + acrossfade teaser
- JSON report
Best when: you want a fast starting point.
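To illustrate what energy-based highlight detection means, here is a small standard-library sketch. It is not the V_1 implementation (which uses librosa and also looks at onsets); it only computes frame RMS energy and picks the loudest window. The function names and the `skip_frames` parameter are hypothetical.

```python
import math


def frame_rms(samples, frame_len):
    """RMS energy of consecutive non-overlapping frames."""
    frames = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        chunk = samples[i:i + frame_len]
        frames.append(math.sqrt(sum(x * x for x in chunk) / frame_len))
    return frames


def loudest_window(energy, window_frames, skip_frames=0):
    """Index of the first frame of the window with the highest total energy.

    skip_frames lets the search ignore an intro at the start of the track.
    """
    if window_frames <= 0 or len(energy) - skip_frames < window_frames:
        raise ValueError("not enough frames for the requested window")
    best_start = skip_frames
    current = sum(energy[skip_frames:skip_frames + window_frames])
    best = current
    # Slide the window one frame at a time, updating the running sum.
    for start in range(skip_frames + 1, len(energy) - window_frames + 1):
        current += energy[start + window_frames - 1] - energy[start - 1]
        if current > best:
            best, best_start = current, start
    return best_start
```

Multiplying the returned frame index by `frame_len / sample_rate` gives the highlight start in seconds.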
V_2 — Selection + Best-of + DJ ordering
Goal: Git-ready CLI and better “DJ flow”.
Adds:
- Folder scan (max 20)
- Track selection by index/range (--select 1-4,7)
- Auto selection (--select auto --auto-n 8)
- Ordering heuristics (tempo clustering + energy ramp)
- WAV + MP3 export
- Report JSON
Best when: you want a practical tool you can keep using.
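One plausible reading of "tempo clustering + energy ramp" is to group tracks into BPM buckets and let energy rise within each bucket. The sketch below shows that idea only; the bucket width, the dictionary keys, and the function name are assumptions, and the V_2 script may order tracks differently.

```python
def order_tracks(tracks, bpm_bucket=4.0):
    """Order tracks by tempo bucket first, then by rising energy inside a bucket.

    Each track is a dict with hypothetical keys "bpm" and "energy".
    """
    def sort_key(track):
        bucket = int(track["bpm"] // bpm_bucket)  # coarse tempo cluster
        return (bucket, track["energy"])

    return sorted(tracks, key=sort_key)
```

With a 4 BPM bucket, tracks at 132 and 133 BPM land in the same cluster and are sorted by energy, while a 138 BPM track follows them regardless of its energy.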
V_3 — Harmonic mixing + downbeat-ish snap + 2-pass loudness
Goal: Trance-friendly “DJ teaser” quality.
Adds:
- Key detection (chroma-based) + Camelot mapping (best effort)
- Harmonic ordering (Camelot neighbors) for smoother transitions
- “Downbeat-ish” bar-start snap (beat grid + onset accent heuristic)
- 2-pass loudnorm per clip (more consistent output)
Best when: you want old school trance teasers that feel like mini-mixes.
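The 2-pass loudnorm step can be sketched with FFmpeg's loudnorm filter: the first pass measures the clip and prints JSON statistics, and the second pass feeds those measurements back in. The helpers below only build the filter strings and parse the log; the function names and the target values (-14 LUFS, -1 dBTP, LRA 11) are assumptions, not the V_3 defaults.

```python
import json


def first_pass_filter(i=-14.0, tp=-1.0, lra=11.0) -> str:
    """Filter string for the measuring pass (statistics are printed as JSON)."""
    return f"loudnorm=I={i}:TP={tp}:LRA={lra}:print_format=json"


def parse_loudnorm_stats(log_text: str) -> dict:
    """Extract the JSON object that loudnorm prints at the end of its log."""
    start = log_text.rindex("{")
    end = log_text.rindex("}") + 1
    return json.loads(log_text[start:end])


def second_pass_filter(stats: dict, i=-14.0, tp=-1.0, lra=11.0) -> str:
    """Filter string for the correcting pass, using the measured values."""
    return (
        f"loudnorm=I={i}:TP={tp}:LRA={lra}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true"
    )
```

A typical flow runs `ffmpeg -i clip.wav -af <first pass> -f null -`, parses the log FFmpeg writes to stderr, then renders the clip again with the second-pass string as the `-af` value.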
Requirements 🛠️
System
- FFmpeg installed and available in PATH
- Python 3.10+ recommended (3.11+ is great)
Python packages
Common:
- numpy
- librosa
- soundfile
Optional:
- requests (only needed if you use the Ollama helper to generate README/promo assets)
Install (recommended) 🐍
Create a virtual environment:
python -m venv .venv
# Linux/macOS:
source .venv/bin/activate
# Windows PowerShell:
# .\.venv\Scripts\Activate.ps1
pip install -U pip
pip install numpy librosa soundfile
If you plan to use the Ollama helper (v3.1 style extras):
pip install requests
FFmpeg install hints 🎬
Debian/Ubuntu
sudo apt-get update
sudo apt-get install -y ffmpeg
Windows
Install FFmpeg and add ffmpeg.exe to PATH (so ffmpeg -version works in terminal).
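If you prefer to verify the FFmpeg install from Python before running a script, a small check like the following works on all platforms. It is a convenience sketch with a hypothetical function name, not part of the repo's scripts.

```python
import shutil
import subprocess


def ffmpeg_version(executable: str = "ffmpeg"):
    """Return the first line of `ffmpeg -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True,
                            text=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version() or "FFmpeg not found in PATH")
```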
Usage 🚀
Scripts live under V_1/, V_2/, V_3/ depending on your repo layout. The examples below assume you run from inside a version folder and have a tracks/ folder.
Prepare input
Put audio files in:
tracks/
  01 - Track.wav
  02 - Track.mp3
  ...
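Scanning such a folder with a track limit can be sketched as below. The extension set and the function name are assumptions for illustration; the scripts may accept a different set of formats or sort differently.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to what your FFmpeg build can read.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".aiff", ".m4a", ".ogg"}


def scan_tracks(tracks_dir, max_tracks: int = 20):
    """Return up to max_tracks audio files from tracks_dir, sorted by name."""
    files = sorted(
        p for p in Path(tracks_dir).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    return files[:max_tracks]
```

Because files are sorted by name, numeric prefixes such as "01 - " and "02 - " determine the index each track gets.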
V_2 examples
Use all tracks (max scan still applies)
python dj_teaser_v2.py --tracks-dir ./tracks --select all --mode rollcall --teaser 60 --bars 2 --preroll-bars 1
Select specific tracks by index
python dj_teaser_v2.py --tracks-dir ./tracks --select 1,2,3,7,9 --teaser 60 --bars 2
Select ranges + mix
python dj_teaser_v2.py --tracks-dir ./tracks --select 1-4,7,10-12 --teaser 60 --bars 2
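For reference, a selection string such as 1-4,7,10-12 can be expanded into track indices roughly like this. This is a sketch with a hypothetical function name; it assumes 1-based indices and returns them sorted and de-duplicated, which may differ from how the script treats order.

```python
def parse_selection(spec: str, n_tracks: int) -> list[int]:
    """Expand '1-4,7,10-12' into sorted, unique 1-based indices in 1..n_tracks."""
    if spec.strip().lower() == "all":
        return list(range(1, n_tracks + 1))
    chosen = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
        else:
            lo = hi = int(part)
        if lo > hi or lo < 1 or hi > n_tracks:
            raise ValueError(f"selection {part!r} is outside 1..{n_tracks}")
        chosen.update(range(lo, hi + 1))
    return sorted(chosen)
```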
Auto best-of (pick top N tracks)
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --mode bestof --teaser 75 --bars 4 --preroll-bars 1
V_3 examples (recommended for trance)
Rollcall (all tracks, fast DJ flip)
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
Best-of mini-mix vibe
python dj_teaser_v3.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 --preroll-bars 1 --avoid-intro 30 --harmonic
Output files 📦
Typical outputs:
- out/album_teaser.wav
- out/album_teaser.mp3
- out/teaser_report.json
The report includes:
- chosen track order
- estimated BPM
- key/camelot (V_3)
- clip start times and durations
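A report like this is enough to derive a timestamped tracklist for the teaser. The sketch below assumes a hypothetical schema with a top-level "crossfade" value in seconds and a "clips" list whose entries have "title" and "duration"; check the keys in your own teaser_report.json, since the real names may differ.

```python
def teaser_tracklist(report: dict) -> list[str]:
    """Build 'MM:SS Title' lines from clip durations and the crossfade length.

    Each clip starts where the previous one ends, minus the crossfade overlap.
    """
    lines = []
    position = 0.0
    xfade = float(report.get("crossfade", 0.0))
    for clip in report["clips"]:
        minutes, seconds = divmod(int(position), 60)
        lines.append(f"{minutes:02d}:{seconds:02d} {clip['title']}")
        position += float(clip["duration"]) - xfade
    return lines
```

Load the file with `json.load` and pass the resulting dict to the function.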
Tuning tips (old school trance) 💡
- Avoid long intros: use --avoid-intro 30 or 45
- DJ phrasing:
  - --bars 2 for rollcall with many tracks (14+)
  - --bars 4 for more “real trance feel”
- Lead-in: --preroll-bars 1 often makes transitions feel natural
- Crossfade:
  - 0.20–0.35 seconds is usually good for teasers
- Harmonic mode (V_3):
  - --harmonic is recommended, but key detection is best-effort
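Harmonic mode relies on the Camelot wheel, where keys with adjacent numbers (or the same number with the other letter) tend to mix smoothly. The following sketch maps a key to its Camelot code and lists compatible neighbors; the function names are hypothetical and the V_3 script may apply additional rules.

```python
NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}


def camelot_code(tonic: str, mode: str) -> str:
    """Map a key such as ('A', 'minor') to its Camelot code ('8A')."""
    pc = NOTE_TO_PC[tonic]
    if mode == "minor":
        pc = (pc + 3) % 12  # the relative major shares the Camelot number
        letter = "A"
    elif mode == "major":
        letter = "B"
    else:
        raise ValueError("mode must be 'major' or 'minor'")
    fifths_from_c = (pc * 7) % 12  # steps around the circle of fifths
    number = (8 + fifths_from_c - 1) % 12 + 1  # C major sits at 8B
    return f"{number}{letter}"


def camelot_neighbors(code: str) -> set[str]:
    """Codes usually considered compatible: same, +1, -1, and relative key."""
    number, letter = int(code[:-1]), code[-1]
    up = number % 12 + 1
    down = (number - 2) % 12 + 1
    other = "B" if letter == "A" else "A"
    return {code, f"{up}{letter}", f"{down}{letter}", f"{number}{other}"}
```

For example, a track in A minor (8A) pairs well with 7A, 9A, and 8B, so ordering tracks to move between such neighbors keeps transitions consonant.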
Limitations ⚠️
- Beat and key detection are heuristics; some tracks will be “weird”, especially with:
  - long breakdowns
  - very pad-heavy sections
  - ambient intros/outros
- “Downbeat” is approximated from beat grid + onset accent (not a trained downbeat model)
- For perfect DJ results, you can always manually tweak: --avoid-intro, --bars, --preroll-bars, --select, and the track order
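To make the "beat grid + onset accent" approximation concrete: given an onset strength value for each detected beat, one simple heuristic is to pick the beat offset whose every fourth beat is accented most strongly on average. This is a sketch of that idea with a hypothetical function name, not necessarily what V_3 does.

```python
def estimate_downbeat_phase(beat_strengths, beats_per_bar: int = 4) -> int:
    """Return the beat offset (0..beats_per_bar-1) most likely to be beat one.

    beat_strengths holds one onset-strength value per detected beat.
    """
    if len(beat_strengths) < beats_per_bar:
        raise ValueError("need at least one full bar of beats")

    def mean_accent(phase: int) -> float:
        picked = beat_strengths[phase::beats_per_bar]
        return sum(picked) / len(picked)

    return max(range(beats_per_bar), key=mean_accent)
```

Tracks with long breakdowns or pad-heavy sections have weak, evenly spread accents, which is exactly where this kind of heuristic picks the wrong phase.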
Optional: Ollama (text generation) 🤖
If you run Ollama locally (example: http://192.168.2.60:11434) you can use it to generate:
- README snippets
- promo text (TikTok/IG/YouTube)
- tracklists with timestamps (based on teaser_report.json)
Recommended model:
llama3.1:8b-instruct-q4_0
Ollama is optional and not required for audio processing.
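A request to a local Ollama server can be sketched as follows. It uses Ollama's /api/generate endpoint with streaming disabled and, to stay dependency-free, the standard library's urllib instead of requests. The function names are hypothetical, and the helper in this repo may be structured differently.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str):
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

A call such as `generate("http://192.168.2.60:11434", "llama3.1:8b-instruct-q4_0", "Write a short promo text for this tracklist: ...")` returns plain text you can paste into a post.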
Repo hygiene 🧼
Suggested .gitignore:
.venv/
__pycache__/
work/
out/
*.wav
*.mp3
License
Pick whatever fits your repo (MIT is common).
If you haven’t decided yet, add an MIT license later.
Credits
- FFmpeg for audio processing
- librosa for audio analysis