From 5e169342dbfa7cf0796976a19db1065ae9eef87b Mon Sep 17 00:00:00 2001
From: Steffen Schuhmann
Date: Sun, 29 Mar 2026 16:08:23 +0200
Subject: [PATCH] docs: add ARCHITECTURE.md for agent/developer onboarding

Covers: service topology, directory layout, data model, full API surface,
scan/import pipeline, audio analysis flow, auth model, and key conventions.

Co-Authored-By: Claude Sonnet 4.6
---
 ARCHITECTURE.md | 249 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 249 insertions(+)
 create mode 100644 ARCHITECTURE.md

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 0000000..9a1f913
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,249 @@
+# RehearsalHub — Architecture
+
+POC for a band rehearsal recording manager. Audio files live in Nextcloud; this app indexes, annotates, and plays them back.
+
+---
+
+## Services (Docker Compose)
+
+```
+┌─────────────┐   HTTP/80   ┌─────────────┐  REST /api/v1   ┌───────────────┐
+│   Browser   │ ──────────► │     web     │ ──────────────► │      api      │
+└─────────────┘             │  (nginx +   │                 │  (FastAPI /   │
+                            │  React PWA) │                 │   uvicorn)    │
+                            └─────────────┘                 └──────┬────────┘
+                                                                   │
+                       ┌───────────────────────────────────────────┤
+                       │               │               │           │
+                  ┌────▼────┐   ┌──────▼──────┐   ┌────▼────┐   ┌──▼──────────┐
+                  │   db    │   │    redis    │   │Nextcloud│   │audio-worker │
+                  │(Postgres│   │ (job queue  │   │(WebDAV) │   │ (Essentia   │
+                  │   16)   │   │ + pub/sub)  │   │         │   │  analysis)  │
+                  └─────────┘   └─────────────┘   └────┬────┘   └─────────────┘
+                                                       │
+                                                 ┌─────▼──────┐
+                                                 │ nc-watcher │
+                                                 │(polls NC   │
+                                                 │ activity)  │
+                                                 └────────────┘
+```
+
+| Service | Image | Role |
+|---|---|---|
+| `web` | `rehearsalhub/web` | React 18 PWA (Vite + React Router + TanStack Query), served by nginx |
+| `api` | `rehearsalhub/api` | FastAPI async REST API + SSE endpoints |
+| `audio-worker` | `rehearsalhub/audio-worker` | Background job processor: downloads audio from NC, runs Essentia analysis, writes results to DB |
+| `nc-watcher` | `rehearsalhub/nc-watcher` | Polls Nextcloud Activity API every 30s, pushes new audio uploads to `api` internal endpoint |
+| `db` | `postgres:16-alpine` | Primary datastore |
+| `redis` | `redis:7-alpine` | Job queue (audio analysis jobs) |
+
+All services communicate on the `rh_net` bridge network. Only `web:80` is exposed to the host.
+
+---
+
+## Directory Layout
+
+```
+rehearsalhub-poc/
+├── api/                          # FastAPI backend
+│   ├── alembic/                  # DB migrations (Alembic)
+│   └── src/rehearsalhub/
+│       ├── db/
+│       │   ├── models.py         # SQLAlchemy ORM models
+│       │   └── engine.py         # Async engine + session factory
+│       ├── repositories/         # DB access layer (one file per model)
+│       ├── routers/              # FastAPI route handlers
+│       ├── schemas/              # Pydantic request/response models
+│       ├── services/             # Business logic
+│       │   ├── nc_scan.py        # Core scan logic (recursive, yields SSE events)
+│       │   ├── song.py
+│       │   ├── session.py        # Date parsing helpers
+│       │   └── band.py
+│       ├── storage/
+│       │   └── nextcloud.py      # WebDAV client (PROPFIND / download)
+│       └── queue/
+│           └── redis_queue.py    # Enqueue audio analysis jobs
+├── worker/                       # Audio analysis worker
+│   └── src/worker/
+│       ├── main.py               # Redis job consumer loop
+│       ├── pipeline/             # Download → analyse → persist pipeline
+│       └── analyzers/            # Essentia-based BPM / key / waveform analysers
+├── watcher/                      # Nextcloud file watcher
+│   └── src/watcher/
+│       ├── event_loop.py         # Poll NC activity, filter audio uploads
+│       └── nc_client.py          # NC Activity API + etag fetch
+├── web/                          # React frontend
+│   └── src/
+│       ├── pages/                # Route-level components
+│       ├── api/                  # Typed fetch wrappers
+│       └── hooks/                # useWaveform, etc.
+├── docker-compose.yml
+└── Makefile
+```
+
+---
+
+## Data Model
+
+```
+Member ──< BandMember >── Band ──< RehearsalSession
+                           │              │
+                           └──< Song >────┘
+                                 │
+                                 └──< AudioVersion
+                                           │
+                                           └──< SongComment
+                                           └──< Annotation
+                                           └──< RangeAnalysis
+                                           └──< Reaction
+                                           └──< Job
+```
+
+**Key tables:**
+
+| Table | Purpose |
+|---|---|
+| `members` | User accounts. Store per-user Nextcloud credentials (`nc_username`, `nc_url`, `nc_password`) |
+| `bands` | A band. Has a `slug`, optional `nc_folder_path` (defaults to `bands/{slug}/`), and `genre_tags[]` |
+| `band_members` | M2M: member ↔ band with `role` (admin / member) |
+| `band_invites` | Time-limited invite tokens (72h) |
+| `rehearsal_sessions` | One row per dated rehearsal. `date` parsed from a `YYMMDD` or `YYYYMMDD` folder segment in the NC path. Unique on `(band_id, date)` |
+| `songs` | A recording / song. `nc_folder_path` is the canonical grouping key (all versions of one song live in this folder). `session_id` links to a rehearsal session if the path contained a date segment |
+| `audio_versions` | One row per audio file. Identified by `nc_file_etag` (used for idempotent re-scans). Stores format, size, version number |
+| `annotations` | Time-stamped text annotations on a version (like comments at a waveform position) |
+| `range_analyses` | Essentia analysis results for a time range within a version (BPM, key, loudness, waveform) |
+| `jobs` | Redis-backed job records tracking audio analysis pipeline state |
+
+---
+
+## API
+
+Base path: `/api/v1`
+
+### Auth
+| Method | Path | Description |
+|---|---|---|
+| `POST` | `/auth/register` | Create account |
+| `POST` | `/auth/login` | Returns JWT |
+
+JWT is sent as `Authorization: Bearer `. Endpoints that need to work without auth headers (WaveSurfer, SSE EventSource) also accept `?token=`.
+
+### Bands
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/bands` | List bands for current member |
+| `POST` | `/bands` | Create band (validates NC folder exists if path given) |
+| `GET` | `/bands/{id}` | Band detail |
+| `PATCH` | `/bands/{id}` | Update band (nc_folder_path, etc.) |
+| `GET` | `/bands/{id}/members` | List members |
+| `DELETE` | `/bands/{id}/members/{mid}` | Remove member |
+| `POST` | `/bands/{id}/invites` | Generate invite link |
+| `POST` | `/invites/{token}/accept` | Join band via invite |
+
+### Sessions
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/bands/{id}/sessions` | List rehearsal sessions with recording counts |
+| `GET` | `/bands/{id}/sessions/{sid}` | Session detail with flat song list |
+| `PATCH` | `/bands/{id}/sessions/{sid}` | Update label/notes (admin only) |
+
+### Songs
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/bands/{id}/songs` | All songs for band |
+| `GET` | `/bands/{id}/songs/search` | Filter by `q`, `tags[]`, `key`, `bpm_min/max`, `session_id`, `unattributed` |
+| `POST` | `/bands/{id}/songs` | Create song manually |
+| `PATCH` | `/songs/{id}` | Update title, status, tags, key, BPM, notes |
+
+### Scan
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/bands/{id}/nc-scan/stream` | **SSE / ndjson stream** — scan NC folder incrementally; yields `progress`, `song`, `session`, `skipped`, `done` events |
+| `POST` | `/bands/{id}/nc-scan` | Blocking scan (waits for completion, returns summary) |
+
+### Versions & Playback
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/songs/{id}/versions` | List audio versions |
+| `GET` | `/versions/{id}/stream` | Proxy-stream the audio file from Nextcloud (accepts `?token=`) |
+| `POST` | `/versions/{id}/annotate` | Add waveform annotation |
+
+### Internal (watcher → api)
+| Method | Path | Description |
+|---|---|---|
+| `POST` | `/internal/nc-upload` | Called by nc-watcher when a new audio file is detected. No auth — internal network only |
+
+---
+
+## Scan & Import Pipeline
+
+### Manual scan (SSE)
+
+```
+Browser → GET /nc-scan/stream?token=
+             │
+             ▼
+        scan_band_folder()   [nc_scan.py]
+             │  recursive PROPFIND via collect_audio_files()
+             │  depth ≤ 3
+             ▼
+        For each audio file:
+          1. PROPFIND for etag + size
+          2. Skip if etag already in audio_versions
+          3. Parse YYMMDD/YYYYMMDD from path → get_or_create RehearsalSession
+          4. Determine nc_folder_path:
+               - File directly in session folder → unique per-file folder (bands/slug/231015/stem/)
+               - File in subfolder → subfolder path (bands/slug/231015/groove/)
+          5. get_or_create Song
+          6. Register AudioVersion
+          7. Yield ndjson event → browser invalidates TanStack Query caches incrementally
+```
+
+### Watcher-driven import
+
+```
+Nextcloud → Activity API (polled every 30s by nc-watcher)
+                │
+                ▼
+        event_loop.poll_once()
+          filter: audio extension only
+          normalize path (strip WebDAV prefix)
+          filter: upload event type
+                │
+                ▼
+        POST /internal/nc-upload
+          band lookup: slug-based OR nc_folder_path prefix match
+          same folder/session/song logic as manual scan
+          enqueue audio analysis job → Redis
+```
+
+---
+
+## Audio Analysis
+
+When a new `AudioVersion` is created the API enqueues a `Job` to Redis. The `audio-worker` picks it up and runs:
+
+1. Download file from Nextcloud to `/tmp/audio/`
+2. Run Essentia analysers: BPM, key, loudness, waveform peak data
+3. Write `RangeAnalysis` rows to DB
+4. Update `Song.global_bpm` / `Song.global_key` if not yet set
+5. Clean up temp file
+
+---
+
+## Auth & Nextcloud Credentials
+
+- JWT signed with `SECRET_KEY` (HS256), `sub` = member UUID
+- Per-member Nextcloud credentials stored on the `members` row (`nc_url`, `nc_username`, `nc_password`). The API creates a `NextcloudClient` scoped to the acting member for all WebDAV operations.
+- The watcher uses a single shared NC account configured via env vars (`NEXTCLOUD_USER` / `NEXTCLOUD_PASS`).
+
+---
+
+## Key Conventions
+
+- **Repository pattern**: one `*Repository` class per model in `repositories/`. All DB access goes through repos; routers never touch the session directly except for passing it to repos/services.
+- **Pydantic v2**: `model_validate(obj).model_copy(update={...})` — `model_validate` does not accept an `update` kwarg.
+- **Async SQLAlchemy**: sessions are opened per-request via `get_session()` FastAPI dependency. SSE endpoints create their own session via `get_session_factory()()` because the dependency session closes when the handler returns.
+- **Idempotent scans**: deduplication is by `nc_file_etag`. Re-scanning is always safe.
+- **nc_folder_path grouping**: files in the same subfolder (e.g. `bands/slug/groove/`) are treated as multiple versions of one song. Files directly in a dated session folder get a unique virtual folder (`bands/slug/231015/stem/`) so each becomes its own song.
+- **Migrations**: Alembic in `api/alembic/`. After rebuilding the DB run `docker compose exec api uv run alembic upgrade head`.
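
The session-date parsing, `nc_folder_path` grouping, and etag-based idempotency described above can be illustrated with a small standalone Python sketch. This is not the code in `nc_scan.py` or `session.py`. The helper names (`parse_session_date`, `song_folder_for`, `plan_import`) are hypothetical, and the sketch assumes that a session folder is any path segment of exactly six or eight digits. Files placed directly in the band root are not covered by the description and simply fall through to the subfolder rule here.

```python
import re
from datetime import date, datetime
from pathlib import PurePosixPath
from typing import Iterable, Iterator, Optional

# Assumption: a dated session folder is a segment of exactly 6 or 8 digits.
_DATE_SEGMENT = re.compile(r"^(\d{6}|\d{8})$")


def parse_session_date(nc_path: str) -> Optional[date]:
    """Return the date of the first YYMMDD / YYYYMMDD path segment, if any."""
    for segment in PurePosixPath(nc_path).parts:
        if not _DATE_SEGMENT.match(segment):
            continue
        fmt = "%y%m%d" if len(segment) == 6 else "%Y%m%d"
        try:
            return datetime.strptime(segment, fmt).date()
        except ValueError:
            continue  # all digits, but not a valid calendar date
    return None


def song_folder_for(nc_path: str) -> str:
    """Derive the nc_folder_path grouping key for one audio file (hypothetical)."""
    path = PurePosixPath(nc_path)
    parent = path.parent
    if _DATE_SEGMENT.match(parent.name):
        # File sits directly in a dated session folder: give it a unique
        # virtual folder named after the file stem, so it becomes its own song.
        return f"{parent}/{path.stem}/"
    # File sits in a subfolder: the subfolder groups all versions of one song.
    return f"{parent}/"


def plan_import(
    files: Iterable[tuple[str, str]],
    known_etags: Iterable[str],
) -> Iterator[tuple[str, str, Optional[date]]]:
    """Yield (nc_path, song_folder, session_date) for files not yet registered.

    `files` is an iterable of (nc_path, etag) pairs; `known_etags` stands in
    for the etags already stored in audio_versions.
    """
    seen = set(known_etags)
    for nc_path, etag in files:
        if etag in seen:
            continue  # already imported, so a re-scan changes nothing
        seen.add(etag)
        yield nc_path, song_folder_for(nc_path), parse_session_date(nc_path)
```

Calling `plan_import` a second time with the etags from the first run yields nothing, which is the property that makes re-scanning safe. Two files in `bands/slug/231015/groove/` share one folder key and would be versions of one song, while `bands/slug/231015/stem.wav` gets its own key.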