new files

some files
Thomas
2026-01-29 10:51:19 +01:00
parent abf2109171
commit 2b81d5843a
7 changed files with 485 additions and 0 deletions

DESIGN_RATIONALE.md

@@ -0,0 +1,117 @@
# Design Rationale
This document explains *why* Auto Clip makes certain musical and engineering choices.
---
## Why “bars” instead of arbitrary seconds?
Electronic music (especially trance) is structured in **bars** and **phrases**:
- Most tracks are in **4/4**
- Changes happen on predictable boundaries (every 4/8/16/32 bars)
Cutting on bar boundaries reduces:
- awkward mid-kick edits
- off-grid transitions
- “why does this feel wrong?” moments
That's why the CLI exposes:
- `--bars` (2 bars for rollcall, 4 bars for mini-mix feel)
- `--preroll-bars` (start a bar earlier so the listener hears the groove before the highlight)
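
To make the bar arithmetic concrete, here is a minimal sketch of converting a bar count to seconds. It assumes a constant tempo and 4/4 time; the function names `bar_seconds` and `bars_to_seconds` are illustrative and not taken from the Auto Clip code.

```python
def bar_seconds(bpm: float, beats_per_bar: int = 4) -> float:
    """Duration of one bar in seconds, assuming a constant tempo."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return beats_per_bar * 60.0 / bpm


def bars_to_seconds(bars: int, bpm: float, beats_per_bar: int = 4) -> float:
    """Length of a clip that spans `bars` whole bars."""
    return bars * bar_seconds(bpm, beats_per_bar)


# Example: clip lengths for the two suggested --bars values at a trance tempo.
for bars in (2, 4):
    print(f"{bars} bars at 138 BPM last {bars_to_seconds(bars, 138.0):.2f} s")
```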
---
## Why “pre-roll bars”?
Highlights often occur at an impact moment:
- a stab
- a fill
- a drop hit
If you cut *exactly* at the highlight, the listener misses the *lead-in groove*.
Pre-roll gives the ear context, so the transition feels like a DJ brought it in.
Practical defaults:
- Rollcall: `--bars 2 --preroll-bars 1`
- Mini-mix: `--bars 4 --preroll-bars 1`
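
The effect of pre-roll on the cut points can be sketched as follows. This is a hypothetical helper, not the tool's actual code: it assumes a constant tempo, a known time for the first downbeat, and that the pre-roll bars count toward the total `--bars` length rather than extending it. The real CLI may treat that last point differently.

```python
from __future__ import annotations


def clip_window(
    highlight_bar: int,
    bars: int,
    preroll_bars: int,
    bpm: float,
    first_downbeat: float = 0.0,
    beats_per_bar: int = 4,
) -> tuple[float, float]:
    """Return (start, end) in seconds for a bar-aligned clip.

    Assumption: the clip is `bars` long in total and simply starts
    `preroll_bars` before the bar that contains the highlight.
    """
    bar_len = beats_per_bar * 60.0 / bpm
    # Never start before the first bar of the track.
    start_bar = max(0, highlight_bar - preroll_bars)
    start = first_downbeat + start_bar * bar_len
    end = start + bars * bar_len
    return start, end


# Rollcall-style defaults: 2 bars total, starting 1 bar before the highlight.
start, end = clip_window(highlight_bar=32, bars=2, preroll_bars=1, bpm=138.0)
print(f"cut from {start:.2f} s to {end:.2f} s")
```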
---
## Why energy + onset for highlight detection?
In EDM, “interesting” moments correlate with:
- higher RMS energy (loudness/drive)
- strong transient activity (onset strength)
A simple weighted sum (with robust normalization) is:
- fast
- local-only
- reasonably effective across many tracks
It's not perfect (pads/breakdowns can confuse it), but it's a strong baseline.
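
A weighted sum of this kind might look like the sketch below. It assumes per-frame RMS and onset-strength values have already been computed elsewhere. The median/MAD scaling stands in for "robust normalization", and the 0.6/0.4 weights are illustrative assumptions, not the values Auto Clip uses.

```python
from statistics import median


def robust_normalize(values):
    """Center on the median and scale by the median absolute deviation (MAD),
    so a few extreme frames do not dominate the score."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = mad if mad > 0 else 1.0  # avoid dividing by zero on flat input
    return [(v - med) / scale for v in values]


def highlight_scores(rms, onset, w_energy=0.6, w_onset=0.4):
    """Weighted sum of normalized energy and onset strength per frame.
    The weights are assumed values for illustration."""
    if len(rms) != len(onset):
        raise ValueError("rms and onset must have the same length")
    energy_n = robust_normalize(rms)
    onset_n = robust_normalize(onset)
    return [w_energy * e + w_onset * o for e, o in zip(energy_n, onset_n)]


def best_frame(rms, onset):
    """Index of the frame with the highest combined score."""
    scores = highlight_scores(rms, onset)
    return max(range(len(scores)), key=scores.__getitem__)
```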
---
## Why Camelot (harmonic mixing)?
DJ transitions feel smoother when keys are compatible.
The Camelot wheel provides a practical rule-of-thumb:
- Same number A<->B (relative major/minor)
- Same letter, number +/-1 (adjacent keys, a perfect fifth apart; 12 wraps around to 1)
Auto Clip uses **best-effort** key detection and then maps to Camelot to:
- reduce harmonic clashes
- keep the teaser musically “coherent”
Caveats:
- Key detection can be unreliable on pad-heavy sections, noise, or breakdowns
- That's why V3 calls it best-effort and V4 plans a confidence-based fallback
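
The two rules above can be expressed as a small compatibility check. This is a sketch with illustrative function names. It treats identical keys as compatible (an assumption the rules imply but do not state) and wraps the wheel from 12 back to 1.

```python
import re

# Camelot codes are a number 1-12 followed by A (minor) or B (major).
_CAMELOT = re.compile(r"^(1[0-2]|[1-9])([AB])$")


def parse_camelot(code: str):
    """Split a code such as '8A' into (8, 'A')."""
    match = _CAMELOT.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a Camelot code: {code!r}")
    return int(match.group(1)), match.group(2)


def is_compatible(a: str, b: str) -> bool:
    """True if two Camelot codes follow the rule-of-thumb above."""
    num_a, letter_a = parse_camelot(a)
    num_b, letter_b = parse_camelot(b)
    if num_a == num_b:
        # Same key, or same number A<->B (relative major/minor).
        return True
    if letter_a == letter_b:
        # Same letter, number +/-1, with 12 wrapping around to 1.
        return (num_a - num_b) % 12 in (1, 11)
    return False
```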
---
## Why “downbeat-ish” snap instead of full ML downbeat detection?
True downbeat detection often needs:
- trained ML models
- more complex pipelines
- sometimes stems / better separation
Auto Clip stays local and lightweight.
So we approximate the downbeat with:
- beat tracking grid
- onset accent scoring at bar starts (kick/transient emphasis)
This typically yields:
- better bar-aligned cuts than “nearest beat”
- without heavy dependencies
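
One way to read the approach above: try each possible bar phase on the beat grid, keep the one whose bar starts carry the strongest onset accents, then snap cut points to those bar starts. The sketch below assumes beat times and one onset-strength value per beat are already available from a beat tracker. The function names are hypothetical and the real scoring may differ.

```python
def estimate_downbeat_phase(beat_strengths, beats_per_bar=4):
    """Return the beat offset (0..beats_per_bar-1) whose every-Nth beat
    has the highest mean onset strength."""
    best_phase, best_score = 0, float("-inf")
    for phase in range(beats_per_bar):
        accents = beat_strengths[phase::beats_per_bar]
        if not accents:
            continue
        score = sum(accents) / len(accents)
        if score > best_score:
            best_phase, best_score = phase, score
    return best_phase


def snap_to_bar_start(t, beat_times, phase, beats_per_bar=4):
    """Snap time `t` (seconds) to the nearest estimated bar start."""
    bar_starts = beat_times[phase::beats_per_bar]
    if not bar_starts:
        raise ValueError("no bar starts available for this phase")
    return min(bar_starts, key=lambda start: abs(start - t))


# Toy grid at 120 BPM where the accent falls on every 4th beat from index 1.
beat_times = [i * 0.5 for i in range(16)]
beat_strengths = [0.2, 1.0, 0.3, 0.3] * 4
phase = estimate_downbeat_phase(beat_strengths)
print(phase, snap_to_bar_start(3.4, beat_times, phase))
```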
---
## Why 2-pass loudnorm?
When you cut from different tracks:
- perceived loudness can jump wildly
- the teaser feels amateur even if the edits are good
FFmpeg's `loudnorm` filter supports a 2-pass workflow (measure, then apply), which:
- improves consistency
- reduces clipping risk
- keeps the teaser “radio ready” (for a promo)
That's why V3 uses 2-pass loudnorm per clip.
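
The 2-pass flow can be sketched as building two filter strings. The first pass asks `loudnorm` to print its measurements as JSON (FFmpeg writes them to stderr), and the second pass feeds those measurements back in. The target values below (-16 LUFS, -1.5 dBTP, LRA 11) and the helper names are assumptions for illustration, not the settings V3 uses. A first pass would typically run as `ffmpeg -i clip.wav -af <first filter> -f null -`, and the second as `ffmpeg -i clip.wav -af <second filter> out.wav`.

```python
import json

# Assumed loudness targets; adjust to the project's actual settings.
TARGET = {"I": -16.0, "TP": -1.5, "LRA": 11.0}


def first_pass_filter(target=TARGET):
    """Filter string for the measurement pass."""
    return "loudnorm=I={I}:TP={TP}:LRA={LRA}:print_format=json".format(**target)


def parse_measurement(stderr_text):
    """Extract the JSON block that loudnorm prints at the end of stderr."""
    start = stderr_text.rfind("{")
    end = stderr_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no loudnorm JSON found in FFmpeg output")
    return json.loads(stderr_text[start:end + 1])


def second_pass_filter(measured, target=TARGET):
    """Filter string for the apply pass, using first-pass measurements."""
    return (
        "loudnorm=I={I}:TP={TP}:LRA={LRA}"
        ":measured_I={input_i}:measured_TP={input_tp}"
        ":measured_LRA={input_lra}:measured_thresh={input_thresh}"
        ":offset={target_offset}:linear=true"
    ).format(**target, **measured)
```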
---
## Why does this repo have V_1 / V_2 / V_3?
Keeping versions side-by-side has benefits:
- V_1: minimal baseline
- V_2: practical CLI + selection features
- V_3: trance/DJ quality logic
It also makes it easy for contributors to:
- understand evolution
- debug regressions
V4 aims to unify this into a single stable CLI while retaining clarity.