First Upload

All Files
This commit is contained in:
Thomas
2026-01-29 10:48:02 +01:00
commit abf2109171
21 changed files with 2834 additions and 0 deletions

20
BADGES.md Normal file
View File

@@ -0,0 +1,20 @@
# Badges (GitHub)
You can add these badges to the top of your README.md:
```md
![Python](https://img.shields.io/badge/python-3.10%2B-blue)
![FFmpeg](https://img.shields.io/badge/ffmpeg-required-red)
![Local Only](https://img.shields.io/badge/local--only-no--cloud-success)
![DJ Tools](https://img.shields.io/badge/DJ-tools-purple)
![Audio](https://img.shields.io/badge/audio-DSP-orange)
```
Optional style (flat-square):
```md
![Python](https://img.shields.io/badge/python-3.10%2B-blue?style=flat-square)
![FFmpeg](https://img.shields.io/badge/ffmpeg-required-red?style=flat-square)
![Local Only](https://img.shields.io/badge/local--only-no--cloud-success?style=flat-square)
![DJ Tools](https://img.shields.io/badge/DJ-tools-purple?style=flat-square)
```

54
CHANGELOG.md Normal file
View File

@@ -0,0 +1,54 @@
# Changelog
All notable changes to this project are documented here.
---
## [V3] DJ / Trance Focused Release
### Added
- Harmonic key detection (chroma-based)
- Camelot wheel mapping (best-effort)
- Harmonic track ordering (Camelot neighbors)
- Downbeat-ish bar snapping (beat grid + onset accent)
- Phrase-aligned cuts (bars)
- Pre-roll bars for DJ-style transitions
- 2-pass EBU R128 loudness normalization
- MP3 export alongside WAV
- Detailed JSON report (tempo, key, clip timing)
### Improved
- Much smoother DJ flow
- Better consistency across different tracks
- Trance-friendly defaults
---
## [V2] Practical DJ Tooling
### Added
- Folder auto-scan with max track limit
- Track selection by index and range
- Auto best-of selection
- Tempo clustering
- Energy-based ordering
- CLI flags for DJ control
- WAV + MP3 output
### Improved
- GitHub-ready CLI
- Repeatable results
- Better teaser pacing
---
## [V1] Initial MVP
### Added
- Basic highlight detection
- Simple clip rendering
- Acrossfade teaser output
### Notes
- Proof-of-concept version
- No advanced DJ logic

50
FLOW_DIAGRAM.txt Normal file
View File

@@ -0,0 +1,50 @@
AUTO CLIP DJ TEASER FLOW (V3)
┌─────────────┐
│   tracks/   │
│  WAV / MP3  │
└──────┬──────┘
       ▼
┌────────────────────┐
│   Audio Analysis   │
│ - RMS energy       │
│ - Onset detection  │
│ - Beat tracking    │
│ - Tempo estimate   │
│ - Key (chroma)     │
└──────┬─────────────┘
       ▼
┌────────────────────┐
│  Highlight Picker  │
│ - Skip intro/outro │
│ - Energy scoring   │
│ - Best segment     │
└──────┬─────────────┘
       ▼
┌──────────────────────────┐
│      DJ Logic (V3)       │
│ - Bar / phrase snap      │
│ - Pre-roll bars          │
│ - Camelot harmonic order │
│ - Tempo clustering       │
└──────┬───────────────────┘
       ▼
┌────────────────────┐
│     Rendering      │
│ - Trim clips       │
│ - 2-pass loudnorm  │
│ - Acrossfades      │
└──────┬─────────────┘
       ▼
┌────────────────────┐
│       Output       │
│ - WAV teaser       │
│ - MP3 teaser       │
│ - JSON report      │
│ - Track timestamps │
└────────────────────┘

21
LICENSE Normal file
View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

103
README.md Normal file
View File

@@ -0,0 +1,103 @@
![Python](https://img.shields.io/badge/python-3.10%2B-blue)
![FFmpeg](https://img.shields.io/badge/ffmpeg-required-red)
![Local Only](https://img.shields.io/badge/local--only-no--cloud-success)
![DJ Tools](https://img.shields.io/badge/DJ-tools-purple)
# Auto Clip 🎧✂️
Local, offline-friendly **DJ-style album teaser builder** using Python + FFmpeg.
This repository contains multiple versions of the Auto Clip workflow, each adding more intelligence and DJ-oriented features.
---
## 📁 Repository structure
```
Auto Clip/
├── V_1/
├── V_2/
├── V_3/
├── README.md ← this file
├── README_EN.md
└── README_DA.md
```
- **V_1** — Minimal MVP / proof-of-concept
- **V_2** — Track selection, best-of mode, DJ ordering
- **V_3** — Harmonic mixing (Camelot), downbeat-ish snap, 2-pass loudness
---
## 📖 Documentation
Choose your preferred language:
- 🇬🇧 **English documentation** → [README_EN.md](README_EN.md)
- 🇩🇰 **Dansk dokumentation** → [README_DA.md](README_DA.md)
Each language README contains:
- Full explanation of V_1 / V_2 / V_3
- Install instructions
- Usage examples
- DJ / trance tuning tips
- Limitations and troubleshooting
---
## 🎶 What is Auto Clip?
Auto Clip scans a folder of audio tracks and automatically creates a **short DJ-style teaser** by:
- Finding musical highlights
- Snapping cuts to bar/phrase boundaries
- Optionally applying harmonic (Camelot) ordering
- Rendering a smooth teaser with crossfades
- Exporting WAV + MP3 plus a detailed JSON report
It is designed for:
- DJs
- Producers
- Album / EP promo teasers
- Old school trance & electronic music
Everything runs **locally** — no cloud, no uploads.
---
## 🤖 Optional: Ollama integration
If you run Ollama locally, Auto Clip can optionally generate:
- README content
- Promo text (SoMe / YouTube)
- Tracklists with timestamps
Recommended model:
- `llama3.1:8b-instruct-q4_0`
Ollama is **optional** and not required for audio processing.
---
## 🚀 Quick start (example)
```bash
cd V_3
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
```
---
## 📜 License
This repository ships under the MIT License (see [LICENSE](LICENSE)).
---
## ❤️ Credits
- FFmpeg
- librosa
- numpy
Built for DJs who like automation, control, and clean signal paths.

264
README_DA.md Normal file
View File

@@ -0,0 +1,264 @@
# Auto Clip (V_1 / V_2 / V_3)
Et lokalt, offline-venligt toolkit til at lave **DJ-style album teasers** ved automatisk at finde highlights i dine tracks og klippe dem sammen til en kort teaser med **Python + FFmpeg**.
Repoet er tænkt organiseret sådan her:
```
Auto Clip/
V_1/
V_2/
V_3/
```
Hver version bygger ovenpå den forrige og giver mere “DJ-feel” og automation.
---
## Hvad den gør ✅
Du giver en mappe med tracks (WAV/MP3/FLAC osv.), og scripts kan:
- **Scanne tracks automatisk** (med max-grænse, fx 20)
- **Vælge tracks efter index** (fx `1,2,5,7` eller `1-4,9`) eller automatisk vælge en **best-of**
- Finde **highlight-sektioner** (energi + onset/transients)
- Snappe cuts til et **bar-grid** (phrase-aligned “DJ” cuts)
- Valgfri **pre-roll** (starter 1 bar før highlight)
- Render klip og samle med **acrossfades**
- Eksportere **WAV + MP3**
- Lave en **JSON rapport** med timestamps og metadata
> Note: Det her er **audio-analyse + heuristik**, ikke “generativ musik-AI”.
> LLM/Ollama kan hjælpe med README/promo/tracklist, men selve klipningen laves lokalt med librosa + FFmpeg.
---
## Versioner 🧩
### V_1 — Minimal MVP
**Formål:** Hurtig proof-of-concept.
Typisk:
- Highlight detection (energi/onset)
- Simpel render + acrossfade teaser
- JSON rapport
**Bedst når:** du vil i gang hurtigt.
---
### V_2 — Valg + Best-of + DJ ordering
**Formål:** Git-klar CLI og bedre “DJ flow”.
Tilføjer:
- Folder scan (max 20)
- Track selection via index/range (`--select 1-4,7`)
- Auto selection (`--select auto --auto-n 8`)
- Ordering heuristics (tempo clustering + energy ramp)
- WAV + MP3 eksport
- JSON rapport
**Bedst når:** du vil have et praktisk tool du kan bruge igen og igen.
---
### V_3 — Harmonic mixing + downbeat-ish snap + 2-pass loudness
**Formål:** Trance-venlig “DJ teaser” kvalitet.
Tilføjer:
- Key detection (chroma) + **Camelot** mapping (best effort)
- Harmonic ordering (Camelot neighbors) → mere smooth transitions
- “Downbeat-ish” bar-start snap (beat grid + onset accent heuristik)
- **2-pass loudnorm** pr klip (mere ens output)
**Bedst når:** du laver old school trance, og det skal føles som mini-mix.
---
## Krav 🛠️
### System
- **FFmpeg** installeret og i PATH
- Python **3.10+** anbefalet (3.11+ er perfekt)
### Python pakker
Standard:
- `numpy`
- `librosa`
- `soundfile`
Valgfrit:
- `requests` (kun hvis du bruger Ollama helper til README/promo assets)
---
## Install (anbefalet) 🐍
Lav et virtualenv:
```bash
python -m venv .venv
# Linux/macOS:
source .venv/bin/activate
# Windows PowerShell:
# .\.venv\Scripts\Activate.ps1
pip install -U pip
pip install numpy librosa soundfile
```
Hvis du vil bruge Ollama helper:
```bash
pip install requests
```
---
## FFmpeg install hints 🎬
### Debian/Ubuntu
```bash
sudo apt-get update
sudo apt-get install -y ffmpeg
```
### Windows
Installer FFmpeg og sørg for at `ffmpeg.exe` ligger i PATH (så `ffmpeg -version` virker).
---
## Brug 🚀
> Scripts ligger under `V_1/`, `V_2/`, `V_3/` alt efter repo-strukturen.
> Eksemplerne antager at du kører fra en version-mappe og har en `tracks/` mappe.
### Input
Læg dine filer her:
```
tracks/
01 - Track.wav
02 - Track.mp3
...
```
---
## V_2 eksempler
### Brug alle tracks (max scan gælder stadig)
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select all --mode rollcall --teaser 60 --bars 2 --preroll-bars 1
```
### Vælg konkrete tracks via index
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select 1,2,3,7,9 --teaser 60 --bars 2
```
### Range + mix
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select 1-4,7,10-12 --teaser 60 --bars 2
```
### Auto best-of (vælg top N tracks)
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --mode bestof --teaser 75 --bars 4 --preroll-bars 1
```
---
## V_3 eksempler (anbefalet til trance)
### Rollcall (alle tracks, hurtigt DJ flip)
```bash
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
```
### Best-of mini-mix vibe
```bash
python dj_teaser_v3.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 --preroll-bars 1 --avoid-intro 30 --harmonic
```
---
## Output 📦
Typisk:
- `out/album_teaser.wav`
- `out/album_teaser.mp3`
- `out/teaser_report.json`
Rapporten indeholder:
- valgt track order
- estimeret BPM
- key/camelot (V_3)
- clip start og varighed
---
## Tuning tips (old school trance) 💡
- **Undgå lange intros**: `--avoid-intro 30` eller `45`
- **DJ phrasing**:
- `--bars 2` til rollcall med mange tracks (14+)
- `--bars 4` hvis du vil have mere “rigtig trance”
- **Lead-in**: `--preroll-bars 1` giver ofte mere naturlige transitions
- **Crossfade**:
- 0.20-0.35 sek er typisk sweetspot
- **Harmonic mode** (V_3): `--harmonic` anbefales, men key detection er best-effort
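Konkret følger klip-længden direkte af BPM: én takt i 4/4 varer `4 * 60 / bpm` sekunder, og med acrossfades er den samlede længde `sum(klip) - (n - 1) * crossfade` (hvert overgangspunkt overlapper ét crossfade). Et hurtigt regneeksempel (tallene er kun til illustration):

```python
def bars_to_seconds(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    # One beat is 60/bpm seconds, so bars * beats_per_bar beats in total.
    return bars * beats_per_bar * 60.0 / bpm

def teaser_length(clip_seconds: float, n_tracks: int, crossfade: float) -> float:
    # acrossfade overlaps consecutive clips, so each joint removes one crossfade.
    return n_tracks * clip_seconds - (n_tracks - 1) * crossfade

# Ved 138 BPM: 2 takter ~ 3.48 s, 4 takter ~ 6.96 s
two_bars = bars_to_seconds(138, 2)
# 14 tracks x 2 takter med 0.25 s crossfades ~ 45.4 s teaser
total = teaser_length(two_bars, 14, 0.25)
```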
---
## Begrænsninger ⚠️
- Beat og key detection er **heuristik**; nogle tracks kan blive “off”, især med:
- lange breakdowns
- meget pad/noise
- ambient intro/outro
- “Downbeat” er approx (beat grid + onset accent), ikke en trænet downbeat-model
- Hvis du vil have det helt perfekt, så tweak:
- `--avoid-intro`, `--bars`, `--preroll-bars`, `--select` og track order
---
## Valgfrit: Ollama (tekst) 🤖
Hvis du kører Ollama lokalt (fx `http://192.168.2.60:11434`) kan du bruge den til:
- README tekst
- promo tekst (TikTok/IG/YouTube)
- tracklist med timestamps (ud fra `teaser_report.json`)
Anbefalet model (som du har):
- `llama3.1:8b-instruct-q4_0`
> Ollama er valgfri og ikke nødvendig for audio-klipningen.
---
## Repo hygiene 🧼
Foreslået `.gitignore`:
```
.venv/
__pycache__/
work/
out/
*.wav
*.mp3
```
---
## License
Dette repo er udgivet under MIT-licensen (se [LICENSE](LICENSE)).
---
## Credits
- **FFmpeg** til audio processing
- **librosa** til audio analyse

264
README_EN.md Normal file
View File

@@ -0,0 +1,264 @@
# Auto Clip (V_1 / V_2 / V_3)
A local, offline-friendly toolkit for generating **DJ-style album teasers** by automatically finding highlights in your tracks and stitching them into a short teaser using **Python + FFmpeg**.
This repo is organized as:
```
Auto Clip/
V_1/
V_2/
V_3/
```
Each version is a step up in “DJ-ness” and automation.
---
## What it does ✅
Given a folder of audio tracks (WAV/MP3/FLAC/etc.), the scripts can:
- **Scan tracks automatically** (with a max limit, e.g. 20)
- **Select tracks by index** (e.g. `1,2,5,7` or `1-4,9`) or automatically pick a **best-of**
- Detect **highlight segments** (energy + onset/transients)
- Snap cut points to a **bar grid** (phrase-aligned “DJ” cuts)
- Add optional **pre-roll** (start 1 bar before the highlight)
- Render clips and merge them with **acrossfades**
- Export **WAV + MP3**
- Output a **report JSON** with timestamps and clip metadata
> Note: These scripts are **audio analysis + heuristics**, not “generative music AI”.
> LLMs (Ollama) can help with README/promo/tracklists, but the actual audio cutting is done locally with librosa + FFmpeg.
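The energy + onset heuristic is essentially: z-score two per-frame signals, blend them, and take the window with the highest total. A simplified numpy sketch of that idea (the shipped scripts compute the `rms` and `onset` arrays with librosa; the 0.35/0.65 weights mirror the V_1 script):

```python
import numpy as np

def best_window_start(rms: np.ndarray, onset: np.ndarray, win: int) -> int:
    """Return the frame index where a `win`-frame window maximizes the
    combined (z-scored, floored at zero) energy + onset score."""
    def z(x: np.ndarray) -> np.ndarray:
        return (x - x.mean()) / (x.std() + 1e-9)

    score = np.maximum(0.35 * z(rms) + 0.65 * z(onset), 0.0)
    # Sliding-window sums via a cumulative sum (O(n) instead of convolving).
    csum = np.concatenate(([0.0], np.cumsum(score)))
    return int(np.argmax(csum[win:] - csum[:-win]))
```

The real pipeline additionally masks out the intro/outro region before taking the argmax, which is what `--avoid-intro` / `--avoid-outro` control.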
---
## Version overview 🧩
### V_1 — Minimal MVP
**Goal:** Quick proof-of-concept teaser builder.
Typical features:
- Highlight detection (energy/onset)
- Simple clip render + acrossfade teaser
- JSON report
**Best when:** you want a fast starting point.
---
### V_2 — Selection + Best-of + DJ ordering
**Goal:** Git-ready CLI and better “DJ flow”.
Adds:
- Folder scan (max 20)
- Track selection by index/range (`--select 1-4,7`)
- Auto selection (`--select auto --auto-n 8`)
- Ordering heuristics (tempo clustering + energy ramp)
- WAV + MP3 export
- Report JSON
**Best when:** you want a practical tool you can keep using.
---
### V_3 — Harmonic mixing + downbeat-ish snap + 2-pass loudness
**Goal:** Trance-friendly “DJ teaser” quality.
Adds:
- Key detection (chroma-based) + **Camelot** mapping (best effort)
- Harmonic ordering (Camelot neighbors) for smoother transitions
- “Downbeat-ish” bar-start snap (beat grid + onset accent heuristic)
- **2-pass loudnorm** per clip (more consistent output)
**Best when:** you want old school trance teasers that feel like mini-mixes.
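The Camelot neighbor rule behind the harmonic ordering is simple to state: a transition is "safe" to the same code, the relative key (same number, A<->B), or one step either way around the wheel (wrapping 12 to 1). A sketch of that rule plus a greedy ordering pass; the actual V_3 implementation may differ in details:

```python
# Camelot code = number 1-12 plus a letter: A = minor, B = major.
def camelot_neighbors(code: str) -> set[str]:
    """Camelot codes considered harmonically compatible with `code`."""
    num, letter = int(code[:-1]), code[-1]
    other = "B" if letter == "A" else "A"
    up = num % 12 + 1          # wheel wraps 12 -> 1
    down = (num - 2) % 12 + 1  # wheel wraps 1 -> 12
    return {code, f"{num}{other}", f"{up}{letter}", f"{down}{letter}"}

def harmonic_order(tracks: list[tuple[str, str]]) -> list[str]:
    """Greedy ordering over (name, camelot_code) pairs: always pick a
    compatible next track when one exists, else fall back to the first."""
    if not tracks:
        return []
    remaining = list(tracks)
    order = [remaining.pop(0)]
    while remaining:
        compatible = camelot_neighbors(order[-1][1])
        nxt = next((t for t in remaining if t[1] in compatible), remaining[0])
        remaining.remove(nxt)
        order.append(nxt)
    return [name for name, _ in order]
```

For example, `harmonic_order([("a", "8A"), ("x", "3B"), ("b", "9A")])` keeps `a` and `b` adjacent because 9A is one step from 8A on the wheel.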
---
## Requirements 🛠️
### System
- **FFmpeg** installed and available in PATH
- Python **3.10+** recommended (3.11+ is great)
### Python packages
Common:
- `numpy`
- `librosa`
- `soundfile`
Optional:
- `requests` (only needed if you use the Ollama helper to generate README/promo assets)
---
## Install (recommended) 🐍
Create a virtual environment:
```bash
python -m venv .venv
# Linux/macOS:
source .venv/bin/activate
# Windows PowerShell:
# .\.venv\Scripts\Activate.ps1
pip install -U pip
pip install numpy librosa soundfile
```
If you plan to use the optional Ollama helper:
```bash
pip install requests
```
---
## FFmpeg install hints 🎬
### Debian/Ubuntu
```bash
sudo apt-get update
sudo apt-get install -y ffmpeg
```
### Windows
Install FFmpeg and add `ffmpeg.exe` to PATH (so `ffmpeg -version` works in terminal).
---
## Usage 🚀
> Scripts live under `V_1/`, `V_2/`, `V_3/` depending on your repo layout.
> The examples below assume you run from inside a version folder and have a `tracks/` folder.
### Prepare input
Put audio files in:
```
tracks/
01 - Track.wav
02 - Track.mp3
...
```
---
## V_2 examples
### Use all tracks (max scan still applies)
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select all --mode rollcall --teaser 60 --bars 2 --preroll-bars 1
```
### Select specific tracks by index
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select 1,2,3,7,9 --teaser 60 --bars 2
```
### Select ranges + mix
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select 1-4,7,10-12 --teaser 60 --bars 2
```
### Auto best-of (pick top N tracks)
```bash
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --mode bestof --teaser 75 --bars 4 --preroll-bars 1
```
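For reference, the `--select` grammar used above (`all`, comma lists, `a-b` ranges, all 1-based) parses to 0-based indices roughly like this; it mirrors the logic in the shipped scripts, with duplicates and out-of-range indices dropped:

```python
def parse_selection(selection: str, num_tracks: int) -> list[int]:
    """Parse "all", "1,2,5" or "1-4,7,10-12" into 0-based indices."""
    s = selection.strip().lower()
    if s == "all":
        return list(range(num_tracks))
    out: list[int] = []
    for part in filter(None, (p.strip() for p in s.split(","))):
        if "-" in part:
            a, b = sorted(int(x) for x in part.split("-", 1))
            out.extend(range(a - 1, b))   # "1-4" -> indices 0..3
        else:
            out.append(int(part) - 1)
    # Keep first-seen order; drop duplicates and out-of-range indices.
    seen: set[int] = set()
    return [i for i in out
            if 0 <= i < num_tracks and not (i in seen or seen.add(i))]
```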
---
## V_3 examples (recommended for trance)
### Rollcall (all tracks, fast DJ flip)
```bash
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
```
### Best-of mini-mix vibe
```bash
python dj_teaser_v3.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 --preroll-bars 1 --avoid-intro 30 --harmonic
```
---
## Output files 📦
Typical outputs:
- `out/album_teaser.wav`
- `out/album_teaser.mp3`
- `out/teaser_report.json`
The report includes:
- chosen track order
- estimated BPM
- key/camelot (V_3)
- clip start times and durations
---
## Tuning tips (old school trance) 💡
- **Avoid long intros**: use `--avoid-intro 30` or `45`
- **DJ phrasing**:
- `--bars 2` for rollcall with many tracks (14+)
- `--bars 4` for more “real trance feel”
- **Lead-in**: `--preroll-bars 1` often makes transitions feel natural
- **Crossfade**:
- 0.20-0.35 seconds is usually good for teasers
- **Harmonic mode** (V_3): `--harmonic` is recommended, but key detection is best-effort
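Concretely, clip length follows from BPM: one 4/4 bar lasts `4 * 60 / bpm` seconds, and with acrossfades the total teaser length is `sum(clip durations) - (n - 1) * crossfade`, since each joint overlaps one crossfade. A quick sanity check with illustrative numbers:

```python
def bars_to_seconds(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    # One beat is 60/bpm seconds, so bars * beats_per_bar beats in total.
    return bars * beats_per_bar * 60.0 / bpm

def teaser_length(clip_seconds: float, n_tracks: int, crossfade: float) -> float:
    # acrossfade overlaps consecutive clips, so each joint removes one crossfade.
    return n_tracks * clip_seconds - (n_tracks - 1) * crossfade

# At 138 BPM: 2 bars ~ 3.48 s, 4 bars ~ 6.96 s
two_bars = bars_to_seconds(138, 2)
# 14 tracks x 2 bars with 0.25 s crossfades ~ 45.4 s teaser
total = teaser_length(two_bars, 14, 0.25)
```

This is why `--bars 2` suits a rollcall of many tracks within a 60-second budget, while `--bars 4` quickly eats the budget unless you select fewer tracks.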
---
## Limitations ⚠️
- Beat and key detection are **heuristics**; some tracks will be “weird”, especially with:
- long breakdowns
- very pad-heavy sections
- ambient intros/outros
- “Downbeat” is approximated from beat grid + onset accent (not a trained downbeat model)
- For perfect DJ results, you can always manually tweak:
- `--avoid-intro`, `--bars`, `--preroll-bars`, `--select`, and the track order
---
## Optional: Ollama (text generation) 🤖
If you run Ollama locally (example: `http://192.168.2.60:11434`) you can use it to generate:
- README snippets
- promo text (TikTok/IG/YouTube)
- tracklists with timestamps (based on `teaser_report.json`)
Recommended model:
- `llama3.1:8b-instruct-q4_0`
> Ollama is optional and not required for audio processing.
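As a sketch of the tracklist use case: teaser-relative timestamps can be derived from the per-clip durations stored in `teaser_report.json` and turned into a prompt for Ollama's `/api/generate` endpoint. This sketch uses only the stdlib (`urllib`) rather than `requests`; the field names match the report written by the scripts, while the host and model are whatever your local setup uses:

```python
import json
import urllib.request

def tracklist_prompt(report: dict) -> str:
    """Build an LLM prompt with teaser-relative timestamps derived from the
    per-clip durations (and configured crossfade) in teaser_report.json."""
    cf = report.get("config", {}).get("crossfade_seconds", 0.25)
    t, lines = 0.0, []
    for clip in report["tracks"]:
        lines.append(f"{int(t // 60)}:{int(t % 60):02d} {clip['filename']}")
        t += clip["duration_seconds"] - cf  # next clip starts one crossfade early
    return ("Write a short promo tracklist post from these timestamps:\n"
            + "\n".join(lines))

def ask_ollama(prompt: str, host: str = "http://localhost:11434",
               model: str = "llama3.1:8b-instruct-q4_0") -> str:
    """Single non-streaming completion via Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```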
---
## Repo hygiene 🧼
Suggested `.gitignore`:
```
.venv/
__pycache__/
work/
out/
*.wav
*.mp3
```
---
## License
This repo ships under the MIT License (see [LICENSE](LICENSE)).
---
## Credits
- **FFmpeg** for audio processing
- **librosa** for audio analysis

View File

@@ -0,0 +1,25 @@
# GitHub Release Checklist v3.0.0
## Pre-release
- [ ] Verify ffmpeg is documented as required
- [ ] Test V3 on at least 10+ tracks
- [ ] Confirm WAV and MP3 export
- [ ] Confirm teaser_report.json is generated
- [ ] Review README.md (links + badges)
- [ ] Review README_EN.md / README_DA.md
## Documentation
- [ ] Update CHANGELOG.md
- [ ] Add RELEASE_NOTES_v3.md to GitHub release
- [ ] Include example_teaser_report.json
- [ ] Optional: add screenshots
## Tag & Release
- [ ] Tag version: `v3.0.0`
- [ ] Create GitHub Release from tag
- [ ] Paste release notes
- [ ] Attach example output (optional)
## Post-release
- [ ] Verify clone + run instructions
- [ ] Open roadmap for V4

38
RELEASE_NOTES_v3.md Normal file
View File

@@ -0,0 +1,38 @@
# Auto Clip V3 Release Notes 🎧
This release focuses on **DJ-style album teasers**, with a strong emphasis on
**old school trance**, phrasing, and harmonic flow.
## Highlights
- Bar-accurate cuts that feel like real DJ transitions
- Harmonic ordering using Camelot keys
- Consistent loudness using 2-pass normalization
- Fully local workflow (no cloud, no uploads)
## Ideal use cases
- Album / EP teaser creation
- DJ promo reels
- Trance and electronic music previews
- Quick rollcall or mini-mix teasers
## What's new in V3
- Harmonic mixing (best-effort key detection)
- Phrase-aligned bar snapping
- Pre-roll bars for smoother transitions
- Improved loudness consistency
- Cleaner, more musical teaser flow
## Known limitations
- Key detection may be inaccurate on very pad-heavy or ambient tracks
- Downbeat detection is heuristic-based, not ML-trained
## Recommended settings (trance)
```bash
--avoid-intro 30
--bars 2 # rollcall
--bars 4 # mini-mix
--preroll-bars 1
--harmonic
```
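The 2-pass normalization mentioned above works by running ffmpeg's `loudnorm` filter once with `print_format=json` to measure a clip, then a second time with the measured values fed back via the `measured_*` parameters for linear gain. A sketch of building the two filter strings (target values here are illustrative; V3's actual settings may differ):

```python
# Illustrative targets; the shipped scripts may use different values.
TARGET_I, TARGET_TP, TARGET_LRA = -14.0, -1.5, 11.0

def pass1_filter() -> str:
    """Measurement pass: ffmpeg prints a JSON block with input_i, input_tp,
    input_lra, input_thresh and target_offset on stderr."""
    return f"loudnorm=I={TARGET_I}:TP={TARGET_TP}:LRA={TARGET_LRA}:print_format=json"

def pass2_filter(m: dict) -> str:
    """Normalization pass: feed the pass-1 measurements back to loudnorm
    so it can apply a linear gain instead of dynamic processing."""
    return (f"loudnorm=I={TARGET_I}:TP={TARGET_TP}:LRA={TARGET_LRA}"
            f":measured_I={m['input_i']}:measured_TP={m['input_tp']}"
            f":measured_LRA={m['input_lra']}:measured_thresh={m['input_thresh']}"
            f":offset={m['target_offset']}:linear=true")
```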
Built for DJs who want **automation without losing musical control**.

34
SCREENSHOTS.md Normal file
View File

@@ -0,0 +1,34 @@
# Screenshots / Visuals
This project is primarily CLI-based, but screenshots help explain the workflow.
## Suggested screenshots
### 1. Folder structure
_Show the repo structure with V_1 / V_2 / V_3_
```
Auto Clip/
├── V_1/
├── V_2/
├── V_3/
├── tracks/
└── out/
```
### 2. CLI usage
_Terminal screenshot running a V3 teaser build_
```bash
python dj_teaser_v3.py --tracks-dir ./tracks --select all --bars 2 --harmonic
```
### 3. Output files
_Show generated files in the out/ folder_
- album_teaser.wav
- album_teaser.mp3
- teaser_report.json
### 4. (Optional) Waveform view
_Screenshot from a DAW or audio editor showing the final teaser waveform_

423
V_1/dj_teaser.py Normal file
View File

@@ -0,0 +1,423 @@
#!/usr/bin/env python3
"""
DJ Teaser Builder (local, offline-friendly)
- Scans a folder for audio files (max 20 by default)
- Lets you select tracks by index (e.g. 1,2,5,7) or use "all"
- Finds highlight segments (energy + onset)
- Snaps start to bar grid (DJ-ish phrase cuts)
- Renders clips + acrossfades them into a teaser via FFmpeg
- Writes a JSON report (chosen start times / durations)
Requirements:
- ffmpeg in PATH
- pip install numpy librosa soundfile
Example:
python dj_teaser.py --tracks-dir ./tracks --select 1,2,3,4 --mode rollcall --teaser 60
"""
import argparse
import json
import shutil
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Tuple
import numpy as np
import librosa
AUDIO_EXTS = {".wav", ".mp3", ".flac", ".m4a", ".aiff", ".aac", ".ogg", ".opus"}
@dataclass
class Config:
tracks_dir: Path
work_dir: Path
out_dir: Path
output_name: str
max_tracks: int = 20
analysis_sr: int = 22050
hop_length: int = 512
# Teaser / DJ settings
teaser_seconds: float = 60.0
crossfade_seconds: float = 0.25
fade_seconds: float = 0.08
avoid_intro_seconds: float = 30.0
avoid_outro_seconds: float = 20.0
# rollcall: short bars per track, bestof: longer bars per track (and fewer tracks ideally)
mode: str = "rollcall" # "rollcall" or "bestof"
bars_per_track: int = 2
beats_per_bar: int = 4
# Loudness target (simple 1-pass loudnorm)
target_lufs: float = -14.0
def run(cmd: List[str]) -> None:
p = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
if p.returncode != 0:
raise RuntimeError(f"Command failed:\n{' '.join(cmd)}\n\nSTDERR:\n{p.stderr}")
def ensure_ffmpeg() -> None:
if shutil.which("ffmpeg") is None:
raise RuntimeError("ffmpeg not found in PATH. Install ffmpeg and try again.")
def list_tracks(tracks_dir: Path, max_tracks: int) -> List[Path]:
files = [p for p in sorted(tracks_dir.iterdir()) if p.is_file() and p.suffix.lower() in AUDIO_EXTS]
return files[:max_tracks]
def parse_selection(selection: str, num_tracks: int) -> List[int]:
"""
Returns 0-based indices.
selection examples:
"all"
"1,2,3,7"
"1-4,7,9-10"
"""
s = selection.strip().lower()
if s == "all":
return list(range(num_tracks))
parts = [p.strip() for p in s.split(",") if p.strip()]
out: List[int] = []
for part in parts:
if "-" in part:
a, b = part.split("-", 1)
a_i = int(a) - 1
b_i = int(b) - 1
if a_i > b_i:
a_i, b_i = b_i, a_i
out.extend(list(range(a_i, b_i + 1)))
else:
out.append(int(part) - 1)
# unique, keep order
seen = set()
filtered = []
for i in out:
if 0 <= i < num_tracks and i not in seen:
seen.add(i)
filtered.append(i)
if not filtered:
raise ValueError("Selection resulted in an empty track list. Check --select.")
return filtered
def ffmpeg_to_wav(in_path: Path, out_wav: Path, sr: int) -> None:
out_wav.parent.mkdir(parents=True, exist_ok=True)
run([
"ffmpeg", "-y",
"-i", str(in_path),
"-vn",
"-ac", "2",
"-ar", str(sr),
"-f", "wav",
str(out_wav),
])
def zscore(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=np.float32)
mu = float(np.mean(x))
sd = float(np.std(x) + 1e-9)
return (x - mu) / sd
def pick_highlight_segment(
y: np.ndarray,
sr: int,
hop_length: int,
clip_s: float,
avoid_intro_s: float,
avoid_outro_s: float
) -> Tuple[float, float, dict]:
"""
Returns: (approx_start_seconds, duration_seconds, debug_metrics)
"""
duration = len(y) / sr
debug = {"duration_seconds": float(duration)}
if duration <= (avoid_intro_s + avoid_outro_s + clip_s + 1.0):
start = max(0.0, (duration - clip_s) / 2.0)
debug["reason"] = "short_track_center"
return start, clip_s, debug
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=hop_length)[0]
onset = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop_length)
n = min(len(rms), len(onset))
rms, onset = rms[:n], onset[:n]
score = 0.35 * zscore(rms) + 0.65 * zscore(onset)
score = np.maximum(score, 0.0)
clip_frames = max(1, int(round((clip_s * sr) / hop_length)))
t_seconds = (np.arange(n) * hop_length) / sr
valid = (t_seconds >= avoid_intro_s) & (t_seconds <= (duration - avoid_outro_s - clip_s))
valid_idxs = np.where(valid)[0]
if len(valid_idxs) == 0:
start = max(0.0, (duration - clip_s) / 2.0)
debug["reason"] = "no_valid_window_center"
return start, clip_s, debug
window = np.ones(clip_frames, dtype=np.float32)
summed = np.convolve(score, window, mode="same")
best_idx = int(valid_idxs[np.argmax(summed[valid_idxs])])
center_t = float(t_seconds[best_idx])
start_t = center_t - (clip_s / 2.0)
start_t = float(max(avoid_intro_s, min(start_t, duration - avoid_outro_s - clip_s)))
debug.update({
"best_center_seconds": center_t,
"approx_start_seconds": start_t,
"clip_frames": int(clip_frames),
})
return start_t, clip_s, debug
def bars_to_seconds(tempo_bpm: float, bars: int, beats_per_bar: int) -> float:
beats = bars * beats_per_bar
return (60.0 / max(1e-6, tempo_bpm)) * beats
def snap_to_bars(y: np.ndarray, sr: int, approx_start: float, bars: int, beats_per_bar: int = 4) -> Tuple[float, float]:
"""
Returns: (snapped_start_seconds, tempo_bpm)
"""
try:
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
tempo = float(tempo)
if beat_frames is None or len(beat_frames) < 8:
return approx_start, tempo
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
i = int(np.argmin(np.abs(beat_times - approx_start)))
grid = max(1, bars * beats_per_bar) # beats per bar-grid chunk
snapped_i = int(round(i / grid) * grid)
snapped_i = max(0, min(snapped_i, len(beat_times) - 1))
snapped_t = float(beat_times[snapped_i])
# keep snapping reasonable
if abs(snapped_t - approx_start) <= 2.0:
return snapped_t, tempo
return approx_start, tempo
except Exception:
return approx_start, 0.0
def render_clip(
in_wav: Path,
out_path: Path,
start: float,
dur: float,
fade_s: float,
target_lufs: float
) -> None:
out_path.parent.mkdir(parents=True, exist_ok=True)
af = (
f"atrim=start={start}:duration={dur},"
f"afade=t=in:st=0:d={fade_s},"
f"afade=t=out:st={max(0.0, dur - fade_s)}:d={fade_s},"
f"loudnorm=I={target_lufs}:TP=-1.5:LRA=11"
)
run([
"ffmpeg", "-y",
"-i", str(in_wav),
"-vn",
"-af", af,
str(out_path),
])
def build_acrossfade_chain(clips: List[Path], out_path: Path, crossfade_s: float) -> None:
if len(clips) == 1:
shutil.copyfile(clips[0], out_path)
return
cmd = ["ffmpeg", "-y"]
for c in clips:
cmd += ["-i", str(c)]
filter_parts = []
last = "[0:a]"
for i in range(1, len(clips)):
nxt = f"[{i}:a]"
out = f"[a{i}]"
filter_parts.append(f"{last}{nxt}acrossfade=d={crossfade_s}:c1=tri:c2=tri{out}")
last = out
cmd += [
"-filter_complex", ";".join(filter_parts),
"-map", last,
str(out_path),
]
run(cmd)
def main():
parser = argparse.ArgumentParser(description="Local DJ Teaser Builder (Python + FFmpeg)")
parser.add_argument("--tracks-dir", default="./tracks", help="Folder containing audio tracks")
parser.add_argument("--work-dir", default="./work", help="Temp working folder")
parser.add_argument("--out-dir", default="./out", help="Output folder")
parser.add_argument("--max-tracks", type=int, default=20, help="Max tracks to scan from folder (default: 20)")
parser.add_argument("--select", default="all", help='Track selection: "all", "1,2,5", "1-4,7" (1-based)')
parser.add_argument("--mode", choices=["rollcall", "bestof"], default="rollcall", help="Teaser style")
parser.add_argument("--teaser", type=float, default=60.0, help="Final teaser length in seconds")
parser.add_argument("--bars", type=int, default=2, help="Bars per track clip (DJ phrasing). rollcall=2 typical")
parser.add_argument("--bpb", type=int, default=4, help="Beats per bar (4 for trance)")
parser.add_argument("--crossfade", type=float, default=0.25, help="Acrossfade duration in seconds")
parser.add_argument("--avoid-intro", type=float, default=30.0, help="Skip intro seconds when searching highlights")
parser.add_argument("--avoid-outro", type=float, default=20.0, help="Skip outro seconds when searching highlights")
parser.add_argument("--target-lufs", type=float, default=-14.0, help="Loudness target LUFS (approx)")
parser.add_argument("--output", default="album_teaser.wav", help="Output teaser filename")
args = parser.parse_args()
ensure_ffmpeg()
cfg = Config(
tracks_dir=Path(args.tracks_dir),
work_dir=Path(args.work_dir),
out_dir=Path(args.out_dir),
output_name=args.output,
max_tracks=args.max_tracks,
teaser_seconds=args.teaser,
crossfade_seconds=args.crossfade,
avoid_intro_seconds=args.avoid_intro,
avoid_outro_seconds=args.avoid_outro,
mode=args.mode,
bars_per_track=args.bars,
beats_per_bar=args.bpb,
target_lufs=args.target_lufs,
)
cfg.out_dir.mkdir(parents=True, exist_ok=True)
cfg.work_dir.mkdir(parents=True, exist_ok=True)
tracks = list_tracks(cfg.tracks_dir, cfg.max_tracks)
if not tracks:
raise SystemExit(f"No audio tracks found in: {cfg.tracks_dir.resolve()}")
# Print discovered tracks (nice for Git usage)
print("\nDiscovered tracks:")
for i, t in enumerate(tracks, start=1):
print(f" {i:02d}. {t.name}")
selected_idxs = parse_selection(args.select, len(tracks))
selected_tracks = [tracks[i] for i in selected_idxs]
print("\nSelected tracks:")
for i, t in zip(selected_idxs, selected_tracks):
print(f" {i+1:02d}. {t.name}")
n = len(selected_tracks)
teaser_s = float(cfg.teaser_seconds)
cf = float(cfg.crossfade_seconds)
# Total playtime math with acrossfades:
# final_length = sum(durs) - (n-1)*cf => sum(durs) = teaser + (n-1)*cf
# We use avg_dur to clamp bar-based clip duration.
avg_dur = (teaser_s + (n - 1) * cf) / max(1, n)
clips: List[Path] = []
report = {
"config": {
"mode": cfg.mode,
"teaser_seconds": teaser_s,
"crossfade_seconds": cf,
"bars_per_track": cfg.bars_per_track,
"beats_per_bar": cfg.beats_per_bar,
"avoid_intro_seconds": cfg.avoid_intro_seconds,
"avoid_outro_seconds": cfg.avoid_outro_seconds,
"target_lufs": cfg.target_lufs,
"avg_clip_seconds_target": avg_dur,
},
"tracks": []
}
for idx, track in enumerate(selected_tracks, start=1):
tmp_wav = cfg.work_dir / f"track_{idx:02d}.wav"
ffmpeg_to_wav(track, tmp_wav, cfg.analysis_sr)
y, sr = librosa.load(tmp_wav, sr=cfg.analysis_sr, mono=True)
# 1) pick approximate highlight
approx_start, _, debug = pick_highlight_segment(
y=y,
sr=sr,
hop_length=cfg.hop_length,
clip_s=max(4.0, min(8.0, avg_dur)), # search window size
avoid_intro_s=cfg.avoid_intro_seconds,
avoid_outro_s=cfg.avoid_outro_seconds
)
# 2) snap to bar grid (DJ phrasing) + compute tempo
snapped_start, tempo = snap_to_bars(
y=y, sr=sr,
approx_start=approx_start,
bars=cfg.bars_per_track,
beats_per_bar=cfg.beats_per_bar
)
# 3) derive duration from bars at detected tempo
# If tempo fails (0), fall back to avg_dur.
if tempo and tempo > 1.0:
dur = bars_to_seconds(tempo, cfg.bars_per_track, cfg.beats_per_bar)
else:
dur = avg_dur
# clamp duration so total stays in bounds
dur = float(np.clip(dur, 2.5, avg_dur))
clip_out = cfg.work_dir / f"clip_{idx:02d}.wav"
render_clip(
in_wav=tmp_wav,
out_path=clip_out,
start=snapped_start,
dur=dur,
fade_s=cfg.fade_seconds,
target_lufs=cfg.target_lufs
)
clips.append(clip_out)
report["tracks"].append({
"index_in_folder": int(selected_idxs[idx - 1] + 1),
"filename": track.name,
"tempo_bpm_est": round(float(tempo), 2),
"start_seconds": round(float(snapped_start), 3),
"duration_seconds": round(float(dur), 3),
"debug": debug,
})
teaser_path = cfg.out_dir / cfg.output_name
build_acrossfade_chain(clips, teaser_path, cfg.crossfade_seconds)
report_path = cfg.out_dir / "teaser_report.json"
with open(report_path, "w", encoding="utf-8") as f:
json.dump(report, f, ensure_ascii=False, indent=2)
print(f"\n✅ Teaser created: {teaser_path.resolve()}")
print(f"📝 Report written: {report_path.resolve()}\n")
if __name__ == "__main__":
main()

BIN
V_1/ffmpeg.exe Normal file

Binary file not shown.

42
V_1/readme.md Normal file

@@ -0,0 +1,42 @@
Scan a folder and use all tracks (max 20)
python dj_teaser.py --tracks-dir ./tracks --select all --mode rollcall --teaser 60 --bars 2
Select specific tracks (1-based index)
python dj_teaser.py --tracks-dir ./tracks --select 1,2,3,7,9 --teaser 60 --bars 2
Range + mix
python dj_teaser.py --tracks-dir ./tracks --select 1-4,7,10-12 --teaser 60 --bars 2
Output: ./out/album_teaser.wav + ./out/teaser_report.json
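The `--select` syntax above (comma lists, inclusive 1-based ranges) can be sketched as follows. This is a minimal illustration mirroring the script's `parse_selection` helper, not the authoritative implementation; the assumed behavior is that reversed ranges are normalized, and duplicates and out-of-range indices are dropped.

```python
def expand_selection(selection: str, num_tracks: int) -> list[int]:
    """Expand a 1-based selection string like "1-4,7,10-12" to 0-based indices."""
    out: list[int] = []
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            a, b = (int(x) - 1 for x in part.split("-", 1))
            if a > b:  # normalize reversed ranges like "4-1"
                a, b = b, a
            out.extend(range(a, b + 1))
        else:
            out.append(int(part) - 1)
    seen: set[int] = set()
    # keep first occurrence, drop duplicates and out-of-range indices
    return [i for i in out if 0 <= i < num_tracks and not (i in seen or seen.add(i))]

print(expand_selection("1-4,7,10-12", 14))  # → [0, 1, 2, 3, 6, 9, 10, 11]
```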

553
V_2/dj_teaser.py Normal file

@@ -0,0 +1,553 @@
#!/usr/bin/env python3
"""
DJ Teaser Builder v2 (local, offline-friendly)
- Scans folder for audio tracks (max N)
- Select tracks by index/range or auto-select "bestof"
- Finds highlight segments (energy + onset)
- Snaps to bar grid (DJ phrasing) + optional pre-roll
- Orders clips to minimize tempo jumps + ramp energy
- Renders clips + acrossfades via FFmpeg
- Exports WAV + MP3
- Writes JSON report
- Optional: generate README + promo text via Ollama
Requirements:
ffmpeg in PATH
pip install numpy librosa soundfile requests
Examples:
python dj_teaser_v2.py --tracks-dir ./tracks --select all --mode rollcall --teaser 60 --bars 2
python dj_teaser_v2.py --tracks-dir ./tracks --select 1-4,7,9 --teaser 60 --bars 2
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --mode bestof --teaser 75 --bars 4
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --ollama http://192.168.2.60:11434 --gen-readme
"""
import argparse
import json
import math
import shutil
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List, Tuple, Optional, Dict
import numpy as np
import librosa
try:
import requests
except Exception:
requests = None # Ollama is optional
AUDIO_EXTS = {".wav", ".mp3", ".flac", ".m4a", ".aiff", ".aac", ".ogg", ".opus"}
@dataclass
class TrackInfo:
path: Path
index_in_folder: int # 1-based
duration_s: float
tempo_bpm: float
energy_score: float # overall (for ranking)
highlight_score: float
approx_start_s: float
snapped_start_s: float
clip_dur_s: float
def run(cmd: List[str]) -> None:
p = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
if p.returncode != 0:
raise RuntimeError(f"Command failed:\n{' '.join(cmd)}\n\nSTDERR:\n{p.stderr}")
def ensure_ffmpeg() -> None:
if shutil.which("ffmpeg") is None:
raise RuntimeError("ffmpeg not found in PATH. Install ffmpeg and try again.")
def list_tracks(tracks_dir: Path, max_tracks: int) -> List[Path]:
files = [p for p in sorted(tracks_dir.iterdir()) if p.is_file() and p.suffix.lower() in AUDIO_EXTS]
return files[:max_tracks]
def parse_selection(selection: str, num_tracks: int) -> List[int]:
"""
Returns 0-based indices.
selection examples:
"all"
"1,2,3,7"
"1-4,7,9-10"
"auto"
"""
s = selection.strip().lower()
if s in {"all", "auto"}:
return list(range(num_tracks))
parts = [p.strip() for p in s.split(",") if p.strip()]
out: List[int] = []
for part in parts:
if "-" in part:
a, b = part.split("-", 1)
a_i = int(a) - 1
b_i = int(b) - 1
if a_i > b_i:
a_i, b_i = b_i, a_i
out.extend(list(range(a_i, b_i + 1)))
else:
out.append(int(part) - 1)
seen = set()
filtered = []
for i in out:
if 0 <= i < num_tracks and i not in seen:
seen.add(i)
filtered.append(i)
if not filtered:
raise ValueError("Selection resulted in an empty track list. Check --select.")
return filtered
def ffmpeg_to_wav(in_path: Path, out_wav: Path, sr: int) -> None:
out_wav.parent.mkdir(parents=True, exist_ok=True)
run([
"ffmpeg", "-y",
"-i", str(in_path),
"-vn",
"-ac", "2",
"-ar", str(sr),
"-f", "wav",
str(out_wav),
])
def zscore(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=np.float32)
mu = float(np.mean(x))
sd = float(np.std(x) + 1e-9)
return (x - mu) / sd
def compute_metrics(y: np.ndarray, sr: int, hop_length: int) -> Dict[str, np.ndarray]:
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=hop_length)[0]
onset = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop_length)
n = min(len(rms), len(onset))
rms, onset = rms[:n], onset[:n]
# clip negatives, normalize
score = 0.35 * zscore(rms) + 0.65 * zscore(onset)
score = np.maximum(score, 0.0)
return {
"rms": rms,
"onset": onset,
"score": score
}
def pick_highlight_start(
score: np.ndarray,
sr: int,
hop_length: int,
clip_s: float,
avoid_intro_s: float,
avoid_outro_s: float,
duration_s: float
) -> Tuple[float, float]:
"""
Sliding window max over score.
Returns (approx_start_s, highlight_score_sum).
"""
if duration_s <= (avoid_intro_s + avoid_outro_s + clip_s + 1.0):
return max(0.0, (duration_s - clip_s) / 2.0), float(np.sum(score))
n = len(score)
clip_frames = max(1, int(round((clip_s * sr) / hop_length)))
t_seconds = (np.arange(n) * hop_length) / sr
valid = (t_seconds >= avoid_intro_s) & (t_seconds <= (duration_s - avoid_outro_s - clip_s))
valid_idxs = np.where(valid)[0]
if len(valid_idxs) == 0:
return max(0.0, (duration_s - clip_s) / 2.0), float(np.sum(score))
window = np.ones(clip_frames, dtype=np.float32)
summed = np.convolve(score, window, mode="same")
best_idx = int(valid_idxs[np.argmax(summed[valid_idxs])])
center_t = float(t_seconds[best_idx])
start_t = center_t - (clip_s / 2.0)
start_t = float(max(avoid_intro_s, min(start_t, duration_s - avoid_outro_s - clip_s)))
return start_t, float(summed[best_idx])
def snap_to_bar_grid(y: np.ndarray, sr: int, approx_start: float, bars: int, beats_per_bar: int) -> Tuple[float, float, Optional[np.ndarray]]:
"""
Snap start to nearest bar grid based on beat tracking.
Returns (snapped_start_s, tempo_bpm, beat_times or None).
"""
try:
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
tempo = float(tempo)
if beat_frames is None or len(beat_frames) < 8:
return approx_start, tempo, None
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
i = int(np.argmin(np.abs(beat_times - approx_start)))
grid = max(1, bars * beats_per_bar) # beats per chunk
snapped_i = int(round(i / grid) * grid)
snapped_i = max(0, min(snapped_i, len(beat_times) - 1))
snapped_t = float(beat_times[snapped_i])
if abs(snapped_t - approx_start) <= 2.0:
return snapped_t, tempo, beat_times
return approx_start, tempo, beat_times
except Exception:
return approx_start, 0.0, None
def bars_to_seconds(tempo_bpm: float, bars: int, beats_per_bar: int) -> float:
beats = bars * beats_per_bar
return (60.0 / max(1e-6, tempo_bpm)) * beats
def apply_preroll(snapped_start: float, beat_times: Optional[np.ndarray], preroll_bars: int, beats_per_bar: int) -> float:
"""
Move the start earlier by N bars if beat_times is available; otherwise fall back to a rough seconds-based estimate.
"""
if preroll_bars <= 0:
return snapped_start
if beat_times is None or len(beat_times) < (preroll_bars * beats_per_bar + 2):
# fallback: 1 bar ~ 2 sec at 120 bpm; safe-ish
return max(0.0, snapped_start - preroll_bars * 2.0)
# Find nearest beat index to snapped_start
i = int(np.argmin(np.abs(beat_times - snapped_start)))
back_beats = preroll_bars * beats_per_bar
j = max(0, i - back_beats)
return float(beat_times[j])
def render_clip(in_wav: Path, out_path: Path, start: float, dur: float, fade_s: float, target_lufs: float) -> None:
out_path.parent.mkdir(parents=True, exist_ok=True)
af = (
f"atrim=start={start}:duration={dur},"
f"afade=t=in:st=0:d={fade_s},"
f"afade=t=out:st={max(0.0, dur - fade_s)}:d={fade_s},"
f"loudnorm=I={target_lufs}:TP=-1.5:LRA=11"
)
run(["ffmpeg", "-y", "-i", str(in_wav), "-vn", "-af", af, str(out_path)])
def build_acrossfade_chain(clips: List[Path], out_wav: Path, crossfade_s: float) -> None:
if len(clips) == 1:
shutil.copyfile(clips[0], out_wav)
return
cmd = ["ffmpeg", "-y"]
for c in clips:
cmd += ["-i", str(c)]
filter_parts = []
last = "[0:a]"
for i in range(1, len(clips)):
nxt = f"[{i}:a]"
out = f"[a{i}]"
filter_parts.append(f"{last}{nxt}acrossfade=d={crossfade_s}:c1=tri:c2=tri{out}")
last = out
cmd += ["-filter_complex", ";".join(filter_parts), "-map", last, str(out_wav)]
run(cmd)
def export_mp3(in_wav: Path, out_mp3: Path, bitrate: str = "320k") -> None:
out_mp3.parent.mkdir(parents=True, exist_ok=True)
run(["ffmpeg", "-y", "-i", str(in_wav), "-vn", "-codec:a", "libmp3lame", "-b:a", bitrate, str(out_mp3)])
def order_tracks_dj_style(track_infos: List[TrackInfo], tempo_tolerance: float, prefer_energy_ramp: bool = True) -> List[TrackInfo]:
"""
DJ ordering heuristic:
1) group by tempo clusters (within tolerance)
2) within each cluster, ramp by energy_score
3) order clusters by median tempo (ascending) and energy (ascending)
"""
if not track_infos:
return []
# cluster by tempo
sorted_by_tempo = sorted(track_infos, key=lambda t: (t.tempo_bpm if t.tempo_bpm > 0 else 1e9))
clusters: List[List[TrackInfo]] = []
for t in sorted_by_tempo:
placed = False
for c in clusters:
# compare to cluster median tempo
tempos = [x.tempo_bpm for x in c if x.tempo_bpm > 0]
med = float(np.median(tempos)) if tempos else t.tempo_bpm
if t.tempo_bpm > 0 and abs(t.tempo_bpm - med) <= tempo_tolerance:
c.append(t)
placed = True
break
if not placed:
clusters.append([t])
# sort within cluster
for c in clusters:
if prefer_energy_ramp:
c.sort(key=lambda x: x.energy_score)
else:
c.sort(key=lambda x: x.highlight_score, reverse=True)
# sort clusters by (median tempo, median energy)
def cluster_key(c: List[TrackInfo]):
tempos = [x.tempo_bpm for x in c if x.tempo_bpm > 0]
med_t = float(np.median(tempos)) if tempos else 9999.0
med_e = float(np.median([x.energy_score for x in c]))
return (med_t, med_e)
clusters.sort(key=cluster_key)
# flatten
ordered = [t for c in clusters for t in c]
return ordered
def ollama_generate(ollama_url: str, model: str, prompt: str) -> str:
if requests is None:
raise RuntimeError("requests not installed. Run: pip install requests")
url = ollama_url.rstrip("/") + "/api/generate"
payload = {"model": model, "prompt": prompt, "stream": False}
r = requests.post(url, json=payload, timeout=60)
r.raise_for_status()
data = r.json()
return data.get("response", "").strip()
def main():
parser = argparse.ArgumentParser(description="Local DJ Teaser Builder v2 (Python + FFmpeg)")
parser.add_argument("--tracks-dir", default="./tracks", help="Folder containing audio tracks")
parser.add_argument("--work-dir", default="./work", help="Temp working folder")
parser.add_argument("--out-dir", default="./out", help="Output folder")
parser.add_argument("--max-tracks", type=int, default=20, help="Max tracks to scan (default 20)")
parser.add_argument("--select", default="all", help='Selection: "all", "1,2,7", "1-4,9", or "auto"')
parser.add_argument("--auto-n", type=int, default=8, help="If --select auto: how many tracks to keep (best-of)")
parser.add_argument("--mode", choices=["rollcall", "bestof"], default="rollcall", help="Teaser style")
parser.add_argument("--teaser", type=float, default=60.0, help="Final teaser length (seconds)")
parser.add_argument("--bars", type=int, default=2, help="Bars per clip (DJ phrasing). rollcall=2 typical")
parser.add_argument("--bpb", type=int, default=4, help="Beats per bar (4 for trance)")
parser.add_argument("--preroll-bars", type=int, default=1, help="Start N bars before highlight (DJ lead-in)")
parser.add_argument("--crossfade", type=float, default=0.25, help="Acrossfade duration seconds")
parser.add_argument("--fade", type=float, default=0.08, help="Fade in/out per clip seconds")
parser.add_argument("--avoid-intro", type=float, default=30.0, help="Skip intro when searching highlights")
parser.add_argument("--avoid-outro", type=float, default=20.0, help="Skip outro when searching highlights")
parser.add_argument("--tempo-tol", type=float, default=4.0, help="Tempo clustering tolerance (BPM)")
parser.add_argument("--target-lufs", type=float, default=-14.0, help="Loudness target LUFS (approx)")
parser.add_argument("--output-wav", default="album_teaser.wav", help="Output teaser WAV filename")
parser.add_argument("--output-mp3", default="album_teaser.mp3", help="Output teaser MP3 filename")
parser.add_argument("--mp3-bitrate", default="320k", help="MP3 bitrate (e.g. 192k, 320k)")
# Ollama (optional)
parser.add_argument("--ollama", default="", help="Ollama base URL (e.g. http://192.168.2.60:11434)")
parser.add_argument("--ollama-model", default="qwen2.5:latest", help="Ollama model name")
parser.add_argument("--gen-readme", action="store_true", help="Generate README + promo text using Ollama")
args = parser.parse_args()
ensure_ffmpeg()
tracks_dir = Path(args.tracks_dir)
work_dir = Path(args.work_dir)
out_dir = Path(args.out_dir)
work_dir.mkdir(parents=True, exist_ok=True)
out_dir.mkdir(parents=True, exist_ok=True)
tracks = list_tracks(tracks_dir, args.max_tracks)
if not tracks:
raise SystemExit(f"No audio tracks found in: {tracks_dir.resolve()}")
print("\nDiscovered tracks:")
for i, t in enumerate(tracks, start=1):
print(f" {i:02d}. {t.name}")
selected_idxs = parse_selection(args.select, len(tracks))
selected_tracks = [tracks[i] for i in selected_idxs]
# math: avg duration per clip given acrossfades
n = len(selected_tracks)
teaser_s = float(args.teaser)
cf = float(args.crossfade)
avg_dur = (teaser_s + (n - 1) * cf) / max(1, n)
# Analyze each selected track into TrackInfo
infos: List[TrackInfo] = []
for local_idx, track in enumerate(selected_tracks, start=1):
tmp_wav = work_dir / f"track_{local_idx:02d}.wav"
ffmpeg_to_wav(track, tmp_wav, sr=22050)
y, sr = librosa.load(tmp_wav, sr=22050, mono=True)
duration_s = float(len(y) / sr)
m = compute_metrics(y, sr, hop_length=512)
score = m["score"]
# overall energy score: mean of top percentile of score (robust)
top = np.quantile(score, 0.90) if len(score) else 0.0
energy_score = float(np.mean(score[score >= top])) if np.any(score >= top) else float(np.mean(score) if len(score) else 0.0)
# choose a search window size (not necessarily final dur): use avg_dur-ish but safe
search_clip = float(np.clip(avg_dur, 4.0, 10.0))
approx_start, highlight_score = pick_highlight_start(
score=score,
sr=sr,
hop_length=512,
clip_s=search_clip,
avoid_intro_s=float(args.avoid_intro),
avoid_outro_s=float(args.avoid_outro),
duration_s=duration_s
)
snapped_start, tempo, beat_times = snap_to_bar_grid(
y=y, sr=sr,
approx_start=approx_start,
bars=int(args.bars),
beats_per_bar=int(args.bpb)
)
# apply preroll bars (DJ lead-in)
snapped_start = apply_preroll(snapped_start, beat_times, int(args.preroll_bars), int(args.bpb))
# duration based on bars + tempo, clamped to avg_dur
if tempo and tempo > 1.0:
dur = bars_to_seconds(tempo, int(args.bars), int(args.bpb))
else:
dur = avg_dur
dur = float(np.clip(dur, 2.5, avg_dur))
infos.append(TrackInfo(
path=track,
index_in_folder=int(selected_idxs[local_idx - 1] + 1),
duration_s=duration_s,
tempo_bpm=float(tempo),
energy_score=energy_score,
highlight_score=float(highlight_score),
approx_start_s=float(approx_start),
snapped_start_s=float(snapped_start),
clip_dur_s=float(dur),
))
# Auto best-of selection (if requested)
if args.select.strip().lower() == "auto":
auto_n = int(max(1, min(args.auto_n, len(infos))))
# rank by highlight_score primarily, then energy_score
infos_sorted = sorted(infos, key=lambda t: (t.highlight_score, t.energy_score), reverse=True)
infos = infos_sorted[:auto_n]
print(f"\nAuto-selected best-of: {auto_n} tracks (ranked by highlight score).")
# DJ ordering
ordered = order_tracks_dj_style(infos, tempo_tolerance=float(args.tempo_tol), prefer_energy_ramp=True)
print("\nFinal clip order:")
for i, t in enumerate(ordered, start=1):
print(f" {i:02d}. [{t.tempo_bpm:.1f} BPM] (E={t.energy_score:.3f}) {t.path.name}")
# Render clips
clip_paths: List[Path] = []
report_tracks = []
for i, t in enumerate(ordered, start=1):
tmp_wav = work_dir / f"track_{i:02d}.wav"
ffmpeg_to_wav(t.path, tmp_wav, sr=22050)
clip_out = work_dir / f"clip_{i:02d}.wav"
render_clip(
in_wav=tmp_wav,
out_path=clip_out,
start=t.snapped_start_s,
dur=t.clip_dur_s,
fade_s=float(args.fade),
target_lufs=float(args.target_lufs)
)
clip_paths.append(clip_out)
report_tracks.append({
"folder_index": t.index_in_folder,
"filename": t.path.name,
"tempo_bpm_est": round(t.tempo_bpm, 2),
"energy_score": round(t.energy_score, 6),
"highlight_score": round(t.highlight_score, 6),
"approx_start_seconds": round(t.approx_start_s, 3),
"snapped_start_seconds": round(t.snapped_start_s, 3),
"clip_duration_seconds": round(t.clip_dur_s, 3),
})
# Build teaser WAV then MP3
out_wav = out_dir / args.output_wav
out_mp3 = out_dir / args.output_mp3
build_acrossfade_chain(clip_paths, out_wav, crossfade_s=float(args.crossfade))
export_mp3(out_wav, out_mp3, bitrate=str(args.mp3_bitrate))
report = {
"version": "v2",
"inputs": {
"tracks_dir": str(tracks_dir.resolve()),
"select": args.select,
"auto_n": int(args.auto_n),
"mode": args.mode,
},
"settings": {
"teaser_seconds": teaser_s,
"bars": int(args.bars),
"beats_per_bar": int(args.bpb),
"preroll_bars": int(args.preroll_bars),
"crossfade_seconds": float(args.crossfade),
"fade_seconds": float(args.fade),
"avoid_intro_seconds": float(args.avoid_intro),
"avoid_outro_seconds": float(args.avoid_outro),
"tempo_tolerance_bpm": float(args.tempo_tol),
"target_lufs": float(args.target_lufs),
"mp3_bitrate": str(args.mp3_bitrate),
},
"outputs": {
"wav": str(out_wav.resolve()),
"mp3": str(out_mp3.resolve()),
},
"tracks": report_tracks
}
report_path = out_dir / "teaser_report.json"
with open(report_path, "w", encoding="utf-8") as f:
json.dump(report, f, ensure_ascii=False, indent=2)
print(f"\n✅ Teaser WAV: {out_wav.resolve()}")
print(f"✅ Teaser MP3: {out_mp3.resolve()}")
print(f"📝 Report: {report_path.resolve()}")
# Optional: generate README / promo text via Ollama
if args.gen_readme:
if not args.ollama:
raise SystemExit("--gen-readme requires --ollama http://host:11434")
prompt = (
"You are helping with a GitHub repo for a local DJ teaser builder.\n"
"Write a concise README in English with:\n"
"- What it does\n- Requirements\n- Install\n- Usage examples\n- Tips for old-school trance / DJ phrasing\n"
"Also write a short promo text (YouTube/Instagram) for an album teaser.\n\n"
f"Settings:\n{json.dumps(report['settings'], indent=2)}\n\n"
f"Tracks (order):\n{json.dumps(report_tracks, indent=2)}\n"
)
text = ollama_generate(args.ollama, args.ollama_model, prompt)
readme_path = out_dir / "README_generated.md"
with open(readme_path, "w", encoding="utf-8") as f:
f.write(text + "\n")
print(f"🧠 Ollama README generated: {readme_path.resolve()}")
if __name__ == "__main__":
main()

BIN
V_2/ffmpeg.exe Normal file

Binary file not shown.

34
V_2/readme.md Normal file

@@ -0,0 +1,34 @@
Install (for the repo)
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install numpy librosa soundfile requests
# ffmpeg must be installed and on PATH
🎛️ Commands I would use for your 14 old-school trance tracks
1) "Rollcall" (all 14, DJ flip)
python dj_teaser_v2.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --crossfade 0.25
2) "Best-of" (more of a mini-mix vibe)
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --mode bestof --teaser 75 --bars 4 --preroll-bars 1 --avoid-intro 30 --crossfade 0.25
3) Generate README/promo via your Ollama at the same time
python dj_teaser_v2.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 \
  --ollama http://192.168.2.60:11434 --ollama-model qwen2.5:latest --gen-readme

644
V_3/dj_teaser.py Normal file

@@ -0,0 +1,644 @@
#!/usr/bin/env python3
"""
DJ Teaser Builder v3 (local, offline-friendly)
Adds:
- Key detection (Krumhansl-Schmuckler on chroma) + Camelot mapping
- Harmonic ordering (Camelot adjacent keys) + tempo clustering + energy ramp
- Downbeat-ish snap (bar start scoring) on top of beat grid
- 2-pass EBU R128 loudnorm per clip for consistent loudness
- Exports WAV + MP3 + report JSON
Requirements:
- ffmpeg in PATH
- pip install numpy librosa soundfile requests (requests only needed if you use Ollama)
Examples:
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1
python dj_teaser_v3.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 --harmonic
"""
import argparse
import json
import math
import shutil
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List, Tuple, Optional, Dict
import numpy as np
import librosa
AUDIO_EXTS = {".wav", ".mp3", ".flac", ".m4a", ".aiff", ".aac", ".ogg", ".opus"}
# ---------------------------
# Key profiles (Krumhansl)
# ---------------------------
KRUMHANSL_MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88], dtype=np.float32)
KRUMHANSL_MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17], dtype=np.float32)
PITCHES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Camelot mappings (simplified)
# We map major keys to "B" and minor keys to "A" numbers.
# Common Camelot wheel:
# 8B = C Major, 5A = C Minor, etc.
# We'll use a standard mapping table for pitch class -> camelot number.
CAMELOT_MAJOR = {"C": "8B", "G": "9B", "D": "10B", "A": "11B", "E": "12B", "B": "1B", "F#": "2B", "C#": "3B", "G#": "4B", "D#": "5B", "A#": "6B", "F": "7B"}
CAMELOT_MINOR = {"A": "8A", "E": "9A", "B": "10A", "F#": "11A", "C#": "12A", "G#": "1A", "D#": "2A", "A#": "3A", "F": "4A", "C": "5A", "G": "6A", "D": "7A"}
def run(cmd: List[str]) -> Tuple[str, str]:
p = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
if p.returncode != 0:
raise RuntimeError(f"Command failed:\n{' '.join(cmd)}\n\nSTDERR:\n{p.stderr}")
return p.stdout, p.stderr
def ensure_ffmpeg() -> None:
if shutil.which("ffmpeg") is None:
raise RuntimeError("ffmpeg not found in PATH. Install ffmpeg and try again.")
def list_tracks(tracks_dir: Path, max_tracks: int) -> List[Path]:
files = [p for p in sorted(tracks_dir.iterdir()) if p.is_file() and p.suffix.lower() in AUDIO_EXTS]
return files[:max_tracks]
def parse_selection(selection: str, num_tracks: int) -> List[int]:
s = selection.strip().lower()
if s in {"all", "auto"}:
return list(range(num_tracks))
parts = [p.strip() for p in s.split(",") if p.strip()]
out: List[int] = []
for part in parts:
if "-" in part:
a, b = part.split("-", 1)
a_i = int(a) - 1
b_i = int(b) - 1
if a_i > b_i:
a_i, b_i = b_i, a_i
out.extend(list(range(a_i, b_i + 1)))
else:
out.append(int(part) - 1)
seen = set()
filtered = []
for i in out:
if 0 <= i < num_tracks and i not in seen:
seen.add(i)
filtered.append(i)
if not filtered:
raise ValueError("Selection resulted in an empty track list. Check --select.")
return filtered
def ffmpeg_to_wav(in_path: Path, out_wav: Path, sr: int) -> None:
out_wav.parent.mkdir(parents=True, exist_ok=True)
run([
"ffmpeg", "-y",
"-i", str(in_path),
"-vn",
"-ac", "2",
"-ar", str(sr),
"-f", "wav",
str(out_wav),
])
def zscore(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=np.float32)
mu = float(np.mean(x))
sd = float(np.std(x) + 1e-9)
return (x - mu) / sd
@dataclass
class TrackInfo:
path: Path
folder_index: int # 1-based
duration_s: float
tempo_bpm: float
energy_score: float
highlight_score: float
approx_start_s: float
snapped_start_s: float
clip_dur_s: float
key_name: str
camelot: str
def compute_score(y: np.ndarray, sr: int, hop_length: int) -> np.ndarray:
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=hop_length)[0]
onset = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop_length)
n = min(len(rms), len(onset))
rms, onset = rms[:n], onset[:n]
score = 0.35 * zscore(rms) + 0.65 * zscore(onset)
return np.maximum(score, 0.0)
def pick_highlight_start(score: np.ndarray, sr: int, hop_length: int,
clip_s: float, avoid_intro_s: float, avoid_outro_s: float, duration_s: float) -> Tuple[float, float]:
if duration_s <= (avoid_intro_s + avoid_outro_s + clip_s + 1.0):
return max(0.0, (duration_s - clip_s) / 2.0), float(np.sum(score))
n = len(score)
clip_frames = max(1, int(round((clip_s * sr) / hop_length)))
t_seconds = (np.arange(n) * hop_length) / sr
valid = (t_seconds >= avoid_intro_s) & (t_seconds <= (duration_s - avoid_outro_s - clip_s))
valid_idxs = np.where(valid)[0]
if len(valid_idxs) == 0:
return max(0.0, (duration_s - clip_s) / 2.0), float(np.sum(score))
window = np.ones(clip_frames, dtype=np.float32)
summed = np.convolve(score, window, mode="same")
best_idx = int(valid_idxs[np.argmax(summed[valid_idxs])])
center_t = float(t_seconds[best_idx])
start_t = center_t - (clip_s / 2.0)
start_t = float(max(avoid_intro_s, min(start_t, duration_s - avoid_outro_s - clip_s)))
return start_t, float(summed[best_idx])
def estimate_key(y: np.ndarray, sr: int) -> Tuple[str, str, float]:
"""
Krumhansl-Schmuckler key estimation using average chroma.
Returns (key_name, camelot, confidence)
"""
# Use harmonic component for more stable key
yh = librosa.effects.harmonic(y)
chroma = librosa.feature.chroma_cqt(y=yh, sr=sr)
chroma_mean = np.mean(chroma, axis=1)
chroma_mean /= (np.sum(chroma_mean) + 1e-9)
def corr_profile(profile):
# rotate profile for each tonic
corrs = []
for shift in range(12):
prof = np.roll(profile, shift)
corrs.append(np.corrcoef(chroma_mean, prof)[0, 1])
return np.array(corrs, dtype=np.float32)
major_corr = corr_profile(KRUMHANSL_MAJOR)
minor_corr = corr_profile(KRUMHANSL_MINOR)
best_major = int(np.argmax(major_corr))
best_minor = int(np.argmax(minor_corr))
maj_val = float(major_corr[best_major])
min_val = float(minor_corr[best_minor])
if maj_val >= min_val:
tonic = PITCHES[best_major]
key_name = f"{tonic} Major"
camelot = CAMELOT_MAJOR.get(tonic, "")
conf = maj_val
else:
tonic = PITCHES[best_minor]
key_name = f"{tonic} Minor"
camelot = CAMELOT_MINOR.get(tonic, "")
conf = min_val
if not camelot:
camelot = "??"
return key_name, camelot, conf
def bars_to_seconds(tempo_bpm: float, bars: int, beats_per_bar: int) -> float:
beats = bars * beats_per_bar
return (60.0 / max(1e-6, tempo_bpm)) * beats
def snap_to_downbeat_like(y: np.ndarray, sr: int, approx_start: float, bars: int, beats_per_bar: int,
onset_weight: float = 1.0) -> Tuple[float, float, Optional[np.ndarray]]:
"""
"Downbeat-ish" snap:
- get beat_times
- build a bar-grid (every beats_per_bar beats)
- score each bar start around approx_start by local onset strength
- pick best bar start near approx_start
Returns (snapped_start, tempo, beat_times)
"""
try:
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
tempo = float(tempo)
if beat_frames is None or len(beat_frames) < (beats_per_bar * 4):
return approx_start, tempo, None
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
# onset envelope for scoring
onset_env = librosa.onset.onset_strength(y=y, sr=sr)
onset_t = librosa.times_like(onset_env, sr=sr)
# candidate bar starts near approx_start
# bar start indices in beat grid
bar_stride = beats_per_bar
bar_idxs = np.arange(0, len(beat_times), bar_stride)
# focus region +/- 8 seconds around approx_start
region = []
for bi in bar_idxs:
t0 = float(beat_times[bi])
if abs(t0 - approx_start) <= 8.0:
region.append(bi)
if not region:
# fallback: nearest bar
nearest = int(bar_idxs[np.argmin(np.abs(beat_times[bar_idxs] - approx_start))])
return float(beat_times[nearest]), tempo, beat_times
# score each candidate bar start by onset energy in small window after it
best_bi = region[0]
best_val = -1.0
for bi in region:
t0 = float(beat_times[bi])
# window: first ~0.35s after bar start (kick/transient)
mask = (onset_t >= t0) & (onset_t <= (t0 + 0.35))
val = float(np.mean(onset_env[mask])) if np.any(mask) else 0.0
# also prefer closeness to approx_start
closeness = 1.0 - min(1.0, abs(t0 - approx_start) / 8.0)
val = onset_weight * val + 0.25 * closeness
if val > best_val:
best_val = val
best_bi = bi
snapped = float(beat_times[best_bi])
# additionally snap to bar-grid chunk size (bars) for phrase alignment
# i.e. every (bars * beats_per_bar) beats
chunk = max(1, bars * beats_per_bar)
# convert best_bi (beat index) into chunk-aligned beat index
chunk_bi = int(round(best_bi / chunk) * chunk)
chunk_bi = max(0, min(chunk_bi, len(beat_times) - 1))
snapped2 = float(beat_times[chunk_bi])
# keep in sane range
if abs(snapped2 - approx_start) <= 2.5:
return snapped2, tempo, beat_times
return snapped, tempo, beat_times
except Exception:
return approx_start, 0.0, None
def apply_preroll(snapped_start: float, beat_times: Optional[np.ndarray], preroll_bars: int, beats_per_bar: int) -> float:
if preroll_bars <= 0:
return snapped_start
if beat_times is None or len(beat_times) < (preroll_bars * beats_per_bar + 2):
return max(0.0, snapped_start - preroll_bars * 2.0)
i = int(np.argmin(np.abs(beat_times - snapped_start)))
back_beats = preroll_bars * beats_per_bar
j = max(0, i - back_beats)
return float(beat_times[j])
# ---------------------------
# 2-pass loudnorm helpers
# ---------------------------
def loudnorm_2pass_filter(infile: Path, start: float, dur: float, fade_s: float, target_lufs: float) -> str:
"""
Build a 2-pass loudnorm filter for a trimmed segment.
Pass1: measure JSON from ffmpeg stderr
Pass2: apply measured params
"""
# pass1 measure
pre = f"atrim=start={start}:duration={dur},afade=t=in:st=0:d={fade_s},afade=t=out:st={max(0.0, dur - fade_s)}:d={fade_s}"
measure = f"{pre},loudnorm=I={target_lufs}:TP=-1.5:LRA=11:print_format=json"
_, err = run(["ffmpeg", "-y", "-i", str(infile), "-vn", "-af", measure, "-f", "null", "-"])
# extract the last JSON object from stderr
jtxt = err[err.rfind("{") : err.rfind("}") + 1]
data = json.loads(jtxt)
# pass2 apply
# Use measured values
applied = (
f"{pre},loudnorm=I={target_lufs}:TP=-1.5:LRA=11:"
f"measured_I={data['input_i']}:measured_TP={data['input_tp']}:measured_LRA={data['input_lra']}:"
f"measured_thresh={data['input_thresh']}:offset={data['target_offset']}:linear=true:print_format=summary"
)
return applied
def render_clip_2pass(in_wav: Path, out_path: Path, start: float, dur: float, fade_s: float, target_lufs: float) -> None:
out_path.parent.mkdir(parents=True, exist_ok=True)
af2 = loudnorm_2pass_filter(in_wav, start, dur, fade_s, target_lufs)
run(["ffmpeg", "-y", "-i", str(in_wav), "-vn", "-af", af2, str(out_path)])
def build_acrossfade_chain(clips: List[Path], out_wav: Path, crossfade_s: float) -> None:
if len(clips) == 1:
shutil.copyfile(clips[0], out_wav)
return
cmd = ["ffmpeg", "-y"]
for c in clips:
cmd += ["-i", str(c)]
filter_parts = []
last = "[0:a]"
for i in range(1, len(clips)):
nxt = f"[{i}:a]"
out = f"[a{i}]"
filter_parts.append(f"{last}{nxt}acrossfade=d={crossfade_s}:c1=tri:c2=tri{out}")
last = out
cmd += ["-filter_complex", ";".join(filter_parts), "-map", last, str(out_wav)]
run(cmd)
def export_mp3(in_wav: Path, out_mp3: Path, bitrate: str) -> None:
out_mp3.parent.mkdir(parents=True, exist_ok=True)
run(["ffmpeg", "-y", "-i", str(in_wav), "-vn", "-codec:a", "libmp3lame", "-b:a", bitrate, str(out_mp3)])
# ---------------------------
# Harmonic / DJ ordering
# ---------------------------
def camelot_neighbors(c: str) -> List[str]:
"""
Camelot adjacency: same number A<->B, +/-1 same letter.
Example: 8A neighbors -> 8B, 7A, 9A
"""
if len(c) not in (2, 3) or not c[:-1].isdigit() or c[-1].upper() not in {"A", "B"}:
return []  # also guards unknown keys like "??" from estimate_key
# handle 10A/11B/12A
num = int(c[:-1])
letter = c[-1].upper()
def wrap(n):
return 12 if n == 0 else (1 if n == 13 else n)
neigh = []
neigh.append(f"{num}{'A' if letter=='B' else 'B'}")
neigh.append(f"{wrap(num-1)}{letter}")
neigh.append(f"{wrap(num+1)}{letter}")
return neigh
def harmonic_path_order(infos: List[TrackInfo]) -> List[TrackInfo]:
"""
Greedy harmonic chaining:
start from a low-energy track, then pick next that is Camelot-neighbor if possible,
otherwise fall back to closest tempo + energy.
"""
if not infos:
return []
remaining = infos[:]
remaining.sort(key=lambda t: t.energy_score) # start calm
ordered = [remaining.pop(0)]
while remaining:
cur = ordered[-1]
neigh = set(camelot_neighbors(cur.camelot))
# prefer harmonic neighbors
candidates = [t for t in remaining if t.camelot in neigh]
if not candidates:
candidates = remaining
# pick best candidate by (tempo closeness, energy slightly higher)
def keyfn(t: TrackInfo):
tempo_pen = abs((t.tempo_bpm or 0) - (cur.tempo_bpm or 0))
energy_pen = max(0.0, cur.energy_score - t.energy_score) # prefer rising energy
return (tempo_pen, energy_pen, -t.energy_score)
pick = min(candidates, key=keyfn)
remaining.remove(pick)
ordered.append(pick)
return ordered
def tempo_cluster_energy_ramp(infos: List[TrackInfo], tempo_tol: float) -> List[TrackInfo]:
infos_sorted = sorted(infos, key=lambda t: (t.tempo_bpm if t.tempo_bpm > 0 else 1e9))
clusters: List[List[TrackInfo]] = []
for t in infos_sorted:
placed = False
for c in clusters:
tempos = [x.tempo_bpm for x in c if x.tempo_bpm > 0]
med = float(np.median(tempos)) if tempos else t.tempo_bpm
if t.tempo_bpm > 0 and abs(t.tempo_bpm - med) <= tempo_tol:
c.append(t)
placed = True
break
if not placed:
clusters.append([t])
for c in clusters:
c.sort(key=lambda x: x.energy_score)
def ckey(c):
tempos = [x.tempo_bpm for x in c if x.tempo_bpm > 0]
med_t = float(np.median(tempos)) if tempos else 9999.0
med_e = float(np.median([x.energy_score for x in c]))
return (med_t, med_e)
clusters.sort(key=ckey)
return [t for c in clusters for t in c]
def main():
parser = argparse.ArgumentParser(description="Local DJ Teaser Builder v3")
parser.add_argument("--tracks-dir", default="./tracks")
parser.add_argument("--work-dir", default="./work")
parser.add_argument("--out-dir", default="./out")
parser.add_argument("--max-tracks", type=int, default=20)
parser.add_argument("--select", default="all", help='all | auto | "1,2,7" | "1-4,9"')
parser.add_argument("--auto-n", type=int, default=8, help="when --select auto: keep N best tracks")
parser.add_argument("--teaser", type=float, default=60.0)
parser.add_argument("--bars", type=int, default=2)
parser.add_argument("--bpb", type=int, default=4)
parser.add_argument("--preroll-bars", type=int, default=1)
parser.add_argument("--crossfade", type=float, default=0.25)
parser.add_argument("--fade", type=float, default=0.08)
parser.add_argument("--avoid-intro", type=float, default=30.0)
parser.add_argument("--avoid-outro", type=float, default=20.0)
parser.add_argument("--tempo-tol", type=float, default=4.0)
parser.add_argument("--target-lufs", type=float, default=-14.0)
parser.add_argument("--output-wav", default="album_teaser.wav")
parser.add_argument("--output-mp3", default="album_teaser.mp3")
parser.add_argument("--mp3-bitrate", default="320k")
parser.add_argument("--harmonic", action="store_true", help="Enable Camelot harmonic ordering (recommended for trance)")
args = parser.parse_args()
ensure_ffmpeg()
tracks_dir = Path(args.tracks_dir)
work_dir = Path(args.work_dir)
out_dir = Path(args.out_dir)
work_dir.mkdir(parents=True, exist_ok=True)
out_dir.mkdir(parents=True, exist_ok=True)
tracks = list_tracks(tracks_dir, args.max_tracks)
if not tracks:
raise SystemExit(f"No tracks found in {tracks_dir.resolve()}")
print("\nDiscovered tracks:")
for i, t in enumerate(tracks, start=1):
print(f" {i:02d}. {t.name}")
selected_idxs = parse_selection(args.select, len(tracks))
selected_tracks = [tracks[i] for i in selected_idxs]
n = len(selected_tracks)
teaser_s = float(args.teaser)
cf = float(args.crossfade)
avg_dur = (teaser_s + (n - 1) * cf) / max(1, n)
infos: List[TrackInfo] = []
for local_idx, track in enumerate(selected_tracks, start=1):
tmp_wav = work_dir / f"src_{local_idx:02d}.wav"
ffmpeg_to_wav(track, tmp_wav, sr=22050)
y, sr = librosa.load(tmp_wav, sr=22050, mono=True)
duration_s = float(len(y) / sr)
score = compute_score(y, sr, hop_length=512)
# robust energy score
q = np.quantile(score, 0.90) if len(score) else 0.0
energy_score = float(np.mean(score[score >= q])) if np.any(score >= q) else float(np.mean(score) if len(score) else 0.0)
search_clip = float(np.clip(avg_dur, 4.0, 12.0))
approx_start, highlight_score = pick_highlight_start(
score=score,
sr=sr,
hop_length=512,
clip_s=search_clip,
avoid_intro_s=float(args.avoid_intro),
avoid_outro_s=float(args.avoid_outro),
duration_s=duration_s
)
snapped_start, tempo, beat_times = snap_to_downbeat_like(
y=y, sr=sr,
approx_start=approx_start,
bars=int(args.bars),
beats_per_bar=int(args.bpb)
)
snapped_start = apply_preroll(snapped_start, beat_times, int(args.preroll_bars), int(args.bpb))
if tempo and tempo > 1.0:
dur = bars_to_seconds(tempo, int(args.bars), int(args.bpb))
else:
dur = avg_dur
dur = float(np.clip(dur, 2.5, avg_dur))
key_name, camelot, conf = estimate_key(y, sr)
infos.append(TrackInfo(
path=track,
folder_index=int(selected_idxs[local_idx - 1] + 1),
duration_s=duration_s,
tempo_bpm=float(tempo),
energy_score=energy_score,
highlight_score=float(highlight_score),
approx_start_s=float(approx_start),
snapped_start_s=float(snapped_start),
clip_dur_s=float(dur),
key_name=key_name,
camelot=camelot
))
# Auto best-of
if args.select.strip().lower() == "auto":
auto_n = int(max(1, min(args.auto_n, len(infos))))
infos.sort(key=lambda t: (t.highlight_score, t.energy_score), reverse=True)
infos = infos[:auto_n]
print(f"\nAuto-selected best-of: {auto_n} tracks.")
# Ordering
if args.harmonic:
# harmonic path, but keep tempo smooth-ish by pre-sorting with tempo clusters first
pre = tempo_cluster_energy_ramp(infos, tempo_tol=float(args.tempo_tol))
ordered = harmonic_path_order(pre)
print("\nOrdering: harmonic (Camelot neighbors) + tempo/energy heuristics")
else:
ordered = tempo_cluster_energy_ramp(infos, tempo_tol=float(args.tempo_tol))
print("\nOrdering: tempo clustering + energy ramp")
print("\nFinal clip order:")
for i, t in enumerate(ordered, start=1):
print(f" {i:02d}. [{t.tempo_bpm:6.1f} BPM] [{t.camelot:>3}] (E={t.energy_score:.3f}) {t.path.name}")
# Render clips (2-pass loudnorm)
clip_paths: List[Path] = []
report_tracks = []
for i, t in enumerate(ordered, start=1):
src = work_dir / f"ord_{i:02d}.wav"
ffmpeg_to_wav(t.path, src, sr=22050)
clip_out = work_dir / f"clip_{i:02d}.wav"
render_clip_2pass(
in_wav=src,
out_path=clip_out,
start=t.snapped_start_s,
dur=t.clip_dur_s,
fade_s=float(args.fade),
target_lufs=float(args.target_lufs)
)
clip_paths.append(clip_out)
report_tracks.append({
"folder_index": t.folder_index,
"filename": t.path.name,
"tempo_bpm_est": round(t.tempo_bpm, 2),
"key": t.key_name,
"camelot": t.camelot,
"energy_score": round(t.energy_score, 6),
"highlight_score": round(t.highlight_score, 6),
"approx_start_seconds": round(t.approx_start_s, 3),
"snapped_start_seconds": round(t.snapped_start_s, 3),
"clip_duration_seconds": round(t.clip_dur_s, 3),
})
out_wav = out_dir / args.output_wav
out_mp3 = out_dir / args.output_mp3
build_acrossfade_chain(clip_paths, out_wav, crossfade_s=float(args.crossfade))
export_mp3(out_wav, out_mp3, bitrate=str(args.mp3_bitrate))
report = {
"version": "v3",
"settings": {
"teaser_seconds": float(args.teaser),
"bars": int(args.bars),
"beats_per_bar": int(args.bpb),
"preroll_bars": int(args.preroll_bars),
"harmonic": bool(args.harmonic),
"tempo_tolerance_bpm": float(args.tempo_tol),
"crossfade_seconds": float(args.crossfade),
"fade_seconds": float(args.fade),
"avoid_intro_seconds": float(args.avoid_intro),
"avoid_outro_seconds": float(args.avoid_outro),
"target_lufs": float(args.target_lufs),
"mp3_bitrate": str(args.mp3_bitrate),
},
"outputs": {
"wav": str(out_wav.resolve()),
"mp3": str(out_mp3.resolve()),
},
"tracks": report_tracks
}
report_path = out_dir / "teaser_report.json"
with open(report_path, "w", encoding="utf-8") as f:
json.dump(report, f, ensure_ascii=False, indent=2)
print(f"\n✅ Teaser WAV: {out_wav.resolve()}")
print(f"✅ Teaser MP3: {out_mp3.resolve()}")
print(f"📝 Report: {report_path.resolve()}\n")
if __name__ == "__main__":
main()

BIN
V_3/ffmpeg.exe Normal file

Binary file not shown.

141
V_3/ollama_assets.py Normal file

@@ -0,0 +1,141 @@
#!/usr/bin/env python3
"""
Generate repo assets (README, promo text, tracklist) using Ollama.
Input:
- teaser_report.json from dj_teaser_v3.py
Output:
- README.md
- PROMO.txt
- TRACKLIST.md
Requirements:
pip install requests
"""
import argparse
import json
from pathlib import Path
from typing import Dict, Any, List
import requests
def ollama_generate(base_url: str, model: str, prompt: str) -> str:
url = base_url.rstrip("/") + "/api/generate"
payload = {"model": model, "prompt": prompt, "stream": False}
r = requests.post(url, json=payload, timeout=120)
r.raise_for_status()
return r.json().get("response", "").strip()
def format_timestamps(tracks: List[Dict[str, Any]], crossfade_seconds: float) -> List[Dict[str, Any]]:
"""
Approximate teaser timestamps by accumulating clip durations minus crossfades.
timestamp[i] = sum(durs[0..i-1]) - i*crossfade
"""
out = []
t = 0.0
for i, tr in enumerate(tracks):
out.append({
**tr,
"teaser_timestamp_seconds": round(t, 2)
})
dur = float(tr.get("clip_duration_seconds", 0.0))
t += max(0.0, dur - crossfade_seconds)
return out
def seconds_to_mmss(sec: float) -> str:
sec = max(0.0, float(sec))
m = int(sec // 60)
s = int(round(sec - (m * 60)))
return f"{m:02d}:{s:02d}"
def main():
ap = argparse.ArgumentParser(description="Generate README/promo/tracklist via Ollama")
ap.add_argument("--report", default="./out/teaser_report.json", help="Path to teaser_report.json")
ap.add_argument("--out-dir", default="./out", help="Output directory for generated assets")
ap.add_argument("--ollama", default="http://192.168.2.60:11434", help="Ollama base URL")
ap.add_argument("--model", default="llama3.1:8b-instruct-q4_0", help="Ollama model name")
ap.add_argument("--project-name", default="DJ Teaser Builder", help="Project/repo name")
ap.add_argument("--artist", default="DjGulvBasS", help="Artist/DJ name")
ap.add_argument("--genre", default="old school trance", help="Genre")
args = ap.parse_args()
report_path = Path(args.report)
out_dir = Path(args.out_dir)
out_dir.mkdir(parents=True, exist_ok=True)
data = json.loads(report_path.read_text(encoding="utf-8"))
tracks = data.get("tracks", [])
settings = data.get("settings", {})
crossfade = float(settings.get("crossfade_seconds", 0.25))
tracks_ts = format_timestamps(tracks, crossfade_seconds=crossfade)
# Build TRACKLIST.md ourselves (deterministic)
lines = [f"# Tracklist (approx.) — {args.artist}\n"]
for tr in tracks_ts:
ts = seconds_to_mmss(tr["teaser_timestamp_seconds"])
fname = tr.get("filename", "Unknown")
bpm = tr.get("tempo_bpm_est", "?")
camelot = tr.get("camelot", "??")
key = tr.get("key", "")
lines.append(f"- **{ts}** — {fname} _(BPM ~ {bpm}, {camelot}, {key})_")
(out_dir / "TRACKLIST.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
# README prompt
readme_prompt = f"""
You are writing a GitHub README in English for a small local audio tool.
Project: {args.project_name}
Artist use-case: {args.artist} ({args.genre})
The tool scans a folder of tracks and builds a DJ-style teaser by:
- detecting highlight segments
- snapping cuts to bar grid (DJ phrasing)
- optional harmonic ordering using Camelot keys
- rendering clips and acrossfading them with FFmpeg
- exporting WAV + MP3
It produces a JSON report and a tracklist with timestamps.
Please write a README with these sections:
1) What it does
2) Requirements (ffmpeg + Python)
3) Install (venv)
4) Usage examples (include: select all, select by indices, auto best-of)
5) Trance/DJ tips (avoid-intro, bars, preroll-bars, harmonic)
6) Troubleshooting (ffmpeg not found, weird beat detection, key detection limitations)
Keep it concise and practical.
These settings were used in an example run:
{json.dumps(settings, indent=2)}
Do NOT invent features beyond what is described.
"""
readme_text = ollama_generate(args.ollama, args.model, readme_prompt)
(out_dir / "README.md").write_text(readme_text + "\n", encoding="utf-8")
# Promo prompt
promo_prompt = f"""
Write 3 short promo text variants (English) for a DJ album teaser for {args.artist} ({args.genre}).
Constraints:
- Each variant should be 2-4 lines max
- Include 4-8 hashtags (trance/electronic)
- Tone: energetic, DJ/club vibe
- Do not mention "AI" or "tool" or "script"
- Do not include any URLs
"""
promo_text = ollama_generate(args.ollama, args.model, promo_prompt)
(out_dir / "PROMO.txt").write_text(promo_text + "\n", encoding="utf-8")
print(f"✅ Generated: {out_dir / 'README.md'}")
print(f"✅ Generated: {out_dir / 'TRACKLIST.md'}")
print(f"✅ Generated: {out_dir / 'PROMO.txt'}")
if __name__ == "__main__":
main()

92
V_3/readme.md Normal file

@@ -0,0 +1,92 @@
Install (for v3)
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install numpy librosa soundfile
# ffmpeg must be installed
🎛️ Commands (for your 14 old school trance tracks)
Rollcall (all 14, DJ flip, harmonic ordering on)
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
Best-of mini-mix vibe (8 tracks, 4 bars)
python dj_teaser_v3.py --tracks-dir ./tracks --select auto --auto-n 8 --teaser 75 --bars 4 --preroll-bars 1 --avoid-intro 30 --harmonic
💡 V3 tweaks I typically use for trance
--avoid-intro 30 or 45 (long trance intros)
--bars 2 if every track should be included (rollcall)
--bars 4 for a more “proper” trance feel
--preroll-bars 1 gives a DJ lead-in (makes the transition feel natural)
--harmonic almost always “on” for trance 👌
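The --bars math is simple: one beat lasts 60/BPM seconds, so a clip of N bars holds N × beats-per-bar beats. A quick sketch (the function name mirrors the builder's `bars_to_seconds`; the formula here is the standard one, stated as an assumption rather than copied from the source):

```python
def bars_to_seconds(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    # One beat lasts 60/bpm seconds; a bar holds beats_per_bar beats.
    return bars * beats_per_bar * 60.0 / bpm

# 2 bars of 4/4 at 138.2 BPM -> roughly 3.47 s
print(round(bars_to_seconds(138.2, 2), 2))
# 4 bars of 4/4 at 140 BPM -> roughly 6.86 s
print(round(bars_to_seconds(140.0, 4), 2))
```

This is why --bars 4 at typical trance tempos (~138-140 BPM) gives clips of roughly seven seconds, while --bars 2 stays near three and a half.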
How to use it in practice
Build the teaser with v3:
python dj_teaser_v3.py --tracks-dir ./tracks --select all --teaser 60 --bars 2 --preroll-bars 1 --avoid-intro 30 --harmonic
Generate repo assets + promo with your Llama 3.1:
pip install requests
python ollama_assets.py --report ./out/teaser_report.json --ollama http://192.168.2.60:11434 --model llama3.1:8b-instruct-q4_0 --artist DjGulvBasS --genre "old school trance"
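The tracklist timestamps that ollama_assets.py writes are approximations: each clip starts where the previous one ended, minus the crossfade overlap. A minimal standalone sketch of that accumulation:

```python
def teaser_timestamps(clip_durations, crossfade=0.25):
    # timestamp[i] = sum of previous durations, each shortened by the crossfade
    out, t = [], 0.0
    for dur in clip_durations:
        out.append(round(t, 2))
        t += max(0.0, dur - crossfade)
    return out

print(teaser_timestamps([3.48, 3.43, 6.86]))  # -> [0.0, 3.23, 6.41]
```

So with the default --crossfade 0.25, each entry in TRACKLIST.md lands a quarter second earlier per preceding clip than a plain running sum would suggest.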
🎛️ Small tips (trance)
If a cut feels “too early”: lower --avoid-intro or set --preroll-bars 0
If you want a more “proper trance” feel: use --bars 4 and pick --select auto --auto-n 8
If key detection seems off on individual tracks: that is normal (pads + noise + modulation). Camelot is “best effort” here.
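For reference, the Camelot adjacency rule behind --harmonic: flip A/B on the same number, or step the number by one with the same letter, wrapping 12 back around to 1. A self-contained sketch of the same rule:

```python
def camelot_neighbors(code: str) -> list:
    # Same number with A/B flipped, plus the +/-1 numbers (wrapping 12 <-> 1).
    num, letter = int(code[:-1]), code[-1].upper()
    wrap = lambda n: 12 if n == 0 else (1 if n == 13 else n)
    return [f"{num}{'A' if letter == 'B' else 'B'}",
            f"{wrap(num - 1)}{letter}",
            f"{wrap(num + 1)}{letter}"]

print(camelot_neighbors("8A"))   # -> ['8B', '7A', '9A']
print(camelot_neighbors("12B"))  # -> ['12A', '11B', '1B']
```

Two tracks whose Camelot codes are neighbors in this sense usually mix without key clashes, which is why the harmonic ordering prefers them before falling back to tempo/energy distance.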


@@ -0,0 +1,32 @@
{
"version": "v3",
"settings": {
"teaser_seconds": 60,
"bars": 2,
"beats_per_bar": 4,
"preroll_bars": 1,
"harmonic": true,
"crossfade_seconds": 0.25,
"target_lufs": -14.0
},
"tracks": [
{
"folder_index": 1,
"filename": "01 - Opening Trance.wav",
"tempo_bpm_est": 138.2,
"key": "A Minor",
"camelot": "8A",
"snapped_start_seconds": 92.34,
"clip_duration_seconds": 3.48
},
{
"folder_index": 2,
"filename": "02 - Acid Sunrise.wav",
"tempo_bpm_est": 140.0,
"key": "C Major",
"camelot": "8B",
"snapped_start_seconds": 118.02,
"clip_duration_seconds": 3.43
}
]
}
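A minimal sketch of consuming a report with this shape, e.g. to print the clip order (field names taken from the sample above; in practice you would `json.loads` the contents of out/teaser_report.json):

```python
import json

def clip_order(report: dict) -> list:
    # One line per clip: index, filename, estimated tempo and Camelot code.
    return [f"{i:02d}. {tr['filename']} [{tr['tempo_bpm_est']} BPM, {tr['camelot']}]"
            for i, tr in enumerate(report["tracks"], start=1)]

sample = json.loads('{"tracks": [{"filename": "01 - Opening Trance.wav", '
                    '"tempo_bpm_est": 138.2, "camelot": "8A"}]}')
print(clip_order(sample))  # -> ['01. 01 - Opening Trance.wav [138.2 BPM, 8A]']
```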