First Upload

All Files
2026-01-29 10:48:02 +01:00
commit abf2109171
21 changed files with 2834 additions and 0 deletions
--- a/README_EN.md
+++ b/README_EN.md
@@ -0,0 +1,264 @@
+# Auto Clip (V_1 / V_2 / V_3)
+
+A local, offline-friendly toolkit for generating **DJ-style album teasers** by automatically finding highlights in your tracks and stitching them into a short teaser using **Python + FFmpeg**.
+
+This repo is organized as:
+
+```
+Auto Clip/
+  V_1/
+  V_2/
+  V_3/
+```
+
+Each version is a step up in “DJ-ness” and automation.
+
+---
+
+## What it does ✅
+
+Given a folder of audio tracks (WAV/MP3/FLAC/etc.), the scripts can:
+
+- **Scan tracks automatically** (up to a maximum count, e.g. 20)
+- **Select tracks by index** (e.g. `1,2,5,7` or `1-4,9`) or automatically pick a **best-of**
+- Detect **highlight segments** (energy + onset/transients)
+- Snap cut points to a **bar grid** (phrase-aligned “DJ” cuts)
+- Add optional **pre-roll** (start 1 bar before the highlight)
+- Render clips and merge them with **crossfades** (FFmpeg `acrossfade`)
+- Export **WAV + MP3**
+- Output a **report JSON** with timestamps and clip metadata
+
+> Note: These scripts are **audio analysis + heuristics**, not “generative music AI”.  
+> LLMs (Ollama) can help with README/promo/tracklists, but the actual audio cutting is done locally with librosa + FFmpeg.
+
+---
+
+## Version overview 🧩
+
+### V_1 — Minimal MVP
+**Goal:** Quick proof-of-concept teaser builder.
+
+Typical features:
+- Highlight detection (energy/onset)
+- Simple clip render + acrossfade teaser
+- JSON report
+
+**Best when:** you want a fast starting point.
+
+---
+
+### V_2 — Selection + Best-of + DJ ordering
+**Goal:** Git-ready CLI and better “DJ flow”.
+
+Adds:
+- Folder scan (max 20)
+- Track selection by index/range (`--select 1-4,7`)
+- Auto selection (`--select auto --auto-n 8`)
+- Ordering heuristics (tempo clustering + energy ramp)
+- WAV + MP3 export
+- Report JSON
+
+**Best when:** you want a practical tool you can keep using.
+
+---
+
+### V_3 — Harmonic mixing + downbeat-ish snap + 2-pass loudness
+**Goal:** Trance-friendly “DJ teaser” quality.
+
+Adds:
+- Key detection (chroma-based) + **Camelot** mapping (best effort)
+- Harmonic ordering (Camelot neighbors) for smoother transitions
+- “Downbeat-ish” bar-start snap (beat grid + onset accent heuristic)
+- **2-pass loudnorm** per clip (more consistent output)
+
+**Best when:** you want old-school trance teasers that feel like mini-mixes.
+
+---
+
+## Requirements 🛠️
+
+### System
+- **FFmpeg** installed and available in PATH
+- Python **3.10+** recommended (3.11+ is great)
+
+### Python packages
+Common:
+- `numpy`
+- `librosa`
+- `soundfile`
+
+Optional:
+- `requests` (only needed if you use the Ollama helper to generate README/promo assets)
+
+---
+
+## Install (recommended) 🐍
+
+Create a virtual environment:
+
+```bash
+python -m venv .venv
+# Linux/macOS:
+source .venv/bin/activate
+# Windows PowerShell:
+# .\.venv\Scripts\Activate.ps1
+
+pip install -U pip
+pip install numpy librosa soundfile
+```
+
+If you plan to use the Ollama helper (v3.1 style extras):
+```bash
+pip install requests
+```
+
+---
+
+## FFmpeg install hints 🎬
+
+### Debian/Ubuntu
+```bash
+sudo apt-get update
+sudo apt-get install -y ffmpeg
+```
+
+### Windows
+Install FFmpeg and add the folder containing `ffmpeg.exe` to PATH (so that `ffmpeg -version` works in a terminal).
+
+---
+
+## Usage 🚀
+
+> Scripts live under `V_1/`, `V_2/`, `V_3/` depending on your repo layout.
+> The examples below assume you run from inside a version folder and have a `tracks/` folder.
+
+### Prepare input
+Put audio files in:
+
+```
+tracks/
+  01 - Track.wav
+  02 - Track.mp3
+  ...
+```
+
+---
+
+## V_2 examples
+
+### Use all tracks (max scan still applies)
+```bash
+python dj_teaser_v2.py --tracks-dir ./tracks --select all --mode rollcall --teaser 60 --bars 2 --preroll-bars 1
+```
+
+### Select specific tracks by index
+```bash
+python dj_teaser_v2.py --tracks-dir ./tracks --select 1,2,3,7,9 --teaser 60 --bars 2
+```
+
+### Mix ranges and single indices
+```bash
+python dj_teaser_v2.py --tracks-dir ./tracks --select 1-4,7,10-12 --teaser 60 --bars 2
+```
+
+### Auto best-of (pick top N tracks)
+```bash
+python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --mode bestof --teaser 75 --bars 4 --preroll-bars 1
+```
+
+---
+
+## V_3 examples (recommended for trance)
+
+### Rollcall (all tracks, fast DJ flip)
+```bash
+python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
+```
+
+### Best-of mini-mix vibe
+```bash
+python dj_teaser_v3.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 --preroll-bars 1 --avoid-intro 30 --harmonic
+```
+
+---
+
+## Output files 📦
+
+Typical outputs:
+
+- `out/album_teaser.wav`
+- `out/album_teaser.mp3`
+- `out/teaser_report.json`
+
+The report includes:
+- chosen track order
+- estimated BPM
+- key/camelot (V_3)
+- clip start times and durations
+
+---
+
+## Tuning tips (old-school trance) 💡
+
+- **Avoid long intros**: use `--avoid-intro 30` or `45`
+- **DJ phrasing**:
+  - `--bars 2` for rollcall with many tracks (14+)
+  - `--bars 4` for more “real trance feel”
+- **Lead-in**: `--preroll-bars 1` often makes transitions feel natural
+- **Crossfade**:
+  - 0.20–0.35 seconds is usually good for teasers
+- **Harmonic mode** (V_3): `--harmonic` is recommended, but key detection is best-effort
+
+---
+
+## Limitations ⚠️
+
+- Beat and key detection are **heuristics**; some tracks will get off-grid cuts or a wrong key estimate, especially those with:
+  - long breakdowns
+  - very pad-heavy sections
+  - ambient intros/outros
+- “Downbeat” is approximated from beat grid + onset accent (not a trained downbeat model)
+- For the best results, tune these manually:
+  - `--avoid-intro`, `--bars`, `--preroll-bars`, `--select`, and the track order
+
+---
+
+## Optional: Ollama (text generation) 🤖
+
+If you run Ollama locally (example: `http://192.168.2.60:11434`) you can use it to generate:
+
+- README snippets
+- promo text (TikTok/IG/YouTube)
+- tracklists with timestamps (based on `teaser_report.json`)
+
+Recommended model:
+- `llama3.1:8b-instruct-q4_0`
+
+> Ollama is optional and not required for audio processing.
+
+---
+
+## Repo hygiene 🧼
+
+Suggested `.gitignore`:
+
+```
+.venv/
+__pycache__/
+work/
+out/
+*.wav
+*.mp3
+```
+
+---
+
+## License
+No license has been chosen yet. Pick whatever fits your repo; MIT is a common default and can be added later.
+
+---
+
+## Credits
+- **FFmpeg** for audio processing
+- **librosa** for audio analysis
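
The `--select` syntax (`1,2,5,7`, `1-4,9`, `all`) and the bar-grid snap with pre-roll that the README describes can be sketched in plain Python as follows. This is a minimal illustration, not the code from `dj_teaser_v2.py` or `dj_teaser_v3.py`: the function names `parse_selection`, `bar_seconds`, and `snap_clip` are hypothetical, and the sketch assumes 4/4 time, a constant tempo, and a known time for the first bar line (`grid_offset_s`). It also assumes the clip length is counted from the shifted start, which the README does not specify.

```python
import re


def parse_selection(spec: str, n_tracks: int) -> list[int]:
    """Turn a spec such as "1-4,7" into sorted, unique 1-based track indices.

    "all" selects every scanned track. "auto" is not handled here because
    the best-of ranking needs audio analysis.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, n_tracks + 1))

    chosen: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        # Accept a single index ("7") or an inclusive range ("1-4").
        match = re.fullmatch(r"(\d+)(?:-(\d+))?", part)
        if not match:
            raise ValueError(f"bad selection item: {part!r}")
        start = int(match.group(1))
        end = int(match.group(2) or start)
        if start > end or start < 1 or end > n_tracks:
            raise ValueError(f"selection out of range: {part!r}")
        chosen.update(range(start, end + 1))
    return sorted(chosen)


def bar_seconds(bpm: float, beats_per_bar: int = 4) -> float:
    """Length of one bar in seconds (assumes a constant tempo)."""
    return beats_per_bar * 60.0 / bpm


def snap_clip(
    highlight_s: float,
    bpm: float,
    bars: int,
    preroll_bars: int = 0,
    grid_offset_s: float = 0.0,
) -> tuple[float, float]:
    """Return (start, duration) in seconds for a bar-aligned clip.

    The highlight time is snapped to the nearest bar line, then moved back
    by `preroll_bars` bars, never earlier than the first bar line.
    """
    bar = bar_seconds(bpm)
    bar_index = round((highlight_s - grid_offset_s) / bar)
    bar_index = max(bar_index - preroll_bars, 0)
    start = grid_offset_s + bar_index * bar
    duration = bars * bar  # assumption: length is counted from the shifted start
    return start, duration
```

At 120 BPM one bar lasts 2 seconds, so a highlight detected at 61.2 s snaps to the bar line at 62 s, and one bar of pre-roll moves the clip start to 60 s. In a real track the grid offset and tempo would come from beat tracking rather than being passed in by hand.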