rehearshalhub/ARCHITECTURE.md
Steffen Schuhmann 5e169342db docs: add ARCHITECTURE.md for agent/developer onboarding
Covers: service topology, directory layout, data model, full API surface,
scan/import pipeline, audio analysis flow, auth model, and key conventions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 16:08:23 +02:00

RehearsalHub — Architecture

POC for a band rehearsal recording manager. Audio files live in Nextcloud; this app indexes, annotates, and plays them back.


Services (Docker Compose)

┌─────────────┐   HTTP/80   ┌─────────────┐   REST /api/v1   ┌───────────────┐
│   Browser   │ ──────────► │    web      │ ──────────────►  │     api       │
└─────────────┘             │  (nginx +   │                  │  (FastAPI /   │
                            │  React PWA) │                  │   uvicorn)    │
                            └─────────────┘                  └──────┬────────┘
                                                                    │
                        ┌───────────────────────────────────────────┤
                        │               │               │           │
                   ┌────▼────┐   ┌──────▼──────┐  ┌────▼────┐  ┌──▼──────────┐
                   │   db    │   │    redis    │  │Nextcloud│  │audio-worker │
                   │(Postgres│   │  (job queue │  │(WebDAV) │  │ (Essentia   │
                   │   16)   │   │  + pub/sub) │  │         │  │  analysis)  │
                   └─────────┘   └─────────────┘  └────┬────┘  └─────────────┘
                                                        │
                                                  ┌─────▼──────┐
                                                  │ nc-watcher │
                                                  │(polls NC   │
                                                  │ activity)  │
                                                  └────────────┘
| Service | Image | Role |
|---|---|---|
| web | rehearsalhub/web | React 18 PWA (Vite + React Router + TanStack Query), served by nginx |
| api | rehearsalhub/api | FastAPI async REST API + SSE endpoints |
| audio-worker | rehearsalhub/audio-worker | Background job processor: downloads audio from NC, runs Essentia analysis, writes results to DB |
| nc-watcher | rehearsalhub/nc-watcher | Polls Nextcloud Activity API every 30s, pushes new audio uploads to api internal endpoint |
| db | postgres:16-alpine | Primary datastore |
| redis | redis:7-alpine | Job queue (audio analysis jobs) |

All services communicate on the rh_net bridge network. Only web:80 is exposed to the host.


Directory Layout

rehearsalhub-poc/
├── api/                  # FastAPI backend
│   ├── alembic/          # DB migrations (Alembic)
│   └── src/rehearsalhub/
│       ├── db/
│       │   ├── models.py         # SQLAlchemy ORM models
│       │   └── engine.py         # Async engine + session factory
│       ├── repositories/         # DB access layer (one file per model)
│       ├── routers/              # FastAPI route handlers
│       ├── schemas/              # Pydantic request/response models
│       ├── services/             # Business logic
│       │   ├── nc_scan.py        # Core scan logic (recursive, yields SSE events)
│       │   ├── song.py
│       │   ├── session.py        # Date parsing helpers
│       │   └── band.py
│       ├── storage/
│       │   └── nextcloud.py      # WebDAV client (PROPFIND / download)
│       └── queue/
│           └── redis_queue.py    # Enqueue audio analysis jobs
├── worker/               # Audio analysis worker
│   └── src/worker/
│       ├── main.py               # Redis job consumer loop
│       ├── pipeline/             # Download → analyse → persist pipeline
│       └── analyzers/            # Essentia-based BPM / key / waveform analysers
├── watcher/              # Nextcloud file watcher
│   └── src/watcher/
│       ├── event_loop.py         # Poll NC activity, filter audio uploads
│       └── nc_client.py          # NC Activity API + etag fetch
├── web/                  # React frontend
│   └── src/
│       ├── pages/                # Route-level components
│       ├── api/                  # Typed fetch wrappers
│       └── hooks/                # useWaveform, etc.
├── docker-compose.yml
└── Makefile

Data Model

Member ──< BandMember >── Band ──< RehearsalSession
                           │              │
                           └──< Song >────┘
                                  │
                                  └──< AudioVersion
                                  │
                                  └──< SongComment
                                  └──< Annotation
                                  └──< RangeAnalysis
                                  └──< Reaction
                                  └──< Job

Key tables:

| Table | Purpose |
|---|---|
| members | User accounts. Store per-user Nextcloud credentials (nc_username, nc_url, nc_password) |
| bands | A band. Has a slug, optional nc_folder_path (defaults to bands/{slug}/), and genre_tags[] |
| band_members | M2M: member ↔ band with role (admin / member) |
| band_invites | Time-limited invite tokens (72h) |
| rehearsal_sessions | One row per dated rehearsal. date parsed from a YYMMDD or YYYYMMDD folder segment in the NC path. Unique on (band_id, date) |
| songs | A recording / song. nc_folder_path is the canonical grouping key (all versions of one song live in this folder). session_id links to a rehearsal session if the path contained a date segment |
| audio_versions | One row per audio file. Identified by nc_file_etag (used for idempotent re-scans). Stores format, size, version number |
| annotations | Time-stamped text annotations on a version (like comments at a waveform position) |
| range_analyses | Essentia analysis results for a time range within a version (BPM, key, loudness, waveform) |
| jobs | Redis-backed job records tracking audio analysis pipeline state |

API

Base path: /api/v1

Auth

| Method | Path | Description |
|---|---|---|
| POST | /auth/register | Create account |
| POST | /auth/login | Returns JWT |

JWT is sent as Authorization: Bearer <token>. Endpoints that need to work without auth headers (WaveSurfer, SSE EventSource) also accept ?token=<jwt>.
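
The two token transports described above can be sketched as a single lookup. This is a minimal illustration, not the project's actual dependency: the function name `extract_token` is hypothetical, and giving the header precedence over the query parameter is an assumption.

```python
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Return the JWT from the Authorization header, else from ?token=.

    ASSUMPTION: the header wins when both are present; the source does not
    say which one takes precedence.
    """
    auth = headers.get("Authorization", "")
    scheme, _, value = auth.partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # Fallback for clients that cannot set headers (EventSource, WaveSurfer).
    return query.get("token") or None
```

A token in the query string tends to end up in access logs, so the fallback is best limited to the endpoints that actually need it.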

Bands

| Method | Path | Description |
|---|---|---|
| GET | /bands | List bands for current member |
| POST | /bands | Create band (validates NC folder exists if path given) |
| GET | /bands/{id} | Band detail |
| PATCH | /bands/{id} | Update band (nc_folder_path, etc.) |
| GET | /bands/{id}/members | List members |
| DELETE | /bands/{id}/members/{mid} | Remove member |
| POST | /bands/{id}/invites | Generate invite link |
| POST | /invites/{token}/accept | Join band via invite |

Sessions

| Method | Path | Description |
|---|---|---|
| GET | /bands/{id}/sessions | List rehearsal sessions with recording counts |
| GET | /bands/{id}/sessions/{sid} | Session detail with flat song list |
| PATCH | /bands/{id}/sessions/{sid} | Update label/notes (admin only) |

Songs

| Method | Path | Description |
|---|---|---|
| GET | /bands/{id}/songs | All songs for band |
| GET | /bands/{id}/songs/search | Filter by q, tags[], key, bpm_min/max, session_id, unattributed |
| POST | /bands/{id}/songs | Create song manually |
| PATCH | /songs/{id} | Update title, status, tags, key, BPM, notes |

Scan

| Method | Path | Description |
|---|---|---|
| GET | /bands/{id}/nc-scan/stream | SSE / ndjson stream: scans the NC folder incrementally; yields progress, song, session, skipped, done events |
| POST | /bands/{id}/nc-scan | Blocking scan (waits for completion, returns summary) |
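
A consumer of the streaming scan could read the ndjson lines roughly as follows. This is a sketch under assumptions: the source names the event kinds (progress, song, session, skipped, done) but not the JSON shape, so the `type` key and the helper name `iter_scan_events` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_scan_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded event per non-empty ndjson line, stopping after 'done'.

    ASSUMPTION: each event is a JSON object carrying its kind under "type".
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        event = json.loads(line)
        yield event
        if event.get("type") == "done":
            break
```

Any iterable of text lines works as input, for example the line iterator of a streaming HTTP response.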

Versions & Playback

| Method | Path | Description |
|---|---|---|
| GET | /songs/{id}/versions | List audio versions |
| GET | /versions/{id}/stream | Proxy-stream the audio file from Nextcloud (accepts ?token=) |
| POST | /versions/{id}/annotate | Add waveform annotation |

Internal (watcher → api)

| Method | Path | Description |
|---|---|---|
| POST | /internal/nc-upload | Called by nc-watcher when a new audio file is detected. No auth; internal network only |

Scan & Import Pipeline

Manual scan (SSE)

Browser  →  GET /nc-scan/stream?token=
               │
               ▼
         scan_band_folder() [nc_scan.py]
               │  recursive PROPFIND via collect_audio_files()
               │  depth ≤ 3
               ▼
         For each audio file:
           1. PROPFIND for etag + size
           2. Skip if etag already in audio_versions
           3. Parse YYMMDD/YYYYMMDD from path → get_or_create RehearsalSession
           4. Determine nc_folder_path:
              - File directly in session folder → unique per-file folder (bands/slug/231015/stem/)
              - File in subfolder              → subfolder path (bands/slug/231015/groove/)
           5. get_or_create Song
           6. Register AudioVersion
           7. Yield ndjson event → browser invalidates TanStack Query caches incrementally
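
Steps 3 and 4 above (date parsing and folder derivation) can be sketched with the standard library alone. The helper names `parse_session_date` and `song_folder_for` are hypothetical and not taken from nc_scan.py; treating "file directly in session folder" as "the parent directory name is a date segment" is also an assumption.

```python
import re
from datetime import date, datetime
from pathlib import PurePosixPath
from typing import Optional

# A path segment made of exactly 6 (YYMMDD) or 8 (YYYYMMDD) digits.
_DATE_SEGMENT = re.compile(r"^(\d{8}|\d{6})$")


def parse_session_date(path: str) -> Optional[date]:
    """Return the first valid YYMMDD / YYYYMMDD segment in the path as a date."""
    for segment in PurePosixPath(path).parts:
        if not _DATE_SEGMENT.match(segment):
            continue
        fmt = "%Y%m%d" if len(segment) == 8 else "%y%m%d"
        try:
            return datetime.strptime(segment, fmt).date()
        except ValueError:
            continue  # digits that do not form a real date, e.g. 999999
    return None


def song_folder_for(file_path: str) -> str:
    """Derive the grouping folder (nc_folder_path) for one audio file."""
    p = PurePosixPath(file_path)
    if _DATE_SEGMENT.match(p.parent.name):
        # File sits directly in the session folder: unique per-file folder.
        return f"{p.parent / p.stem}/"
    # File sits in a subfolder: all files there share one song.
    return f"{p.parent}/"
```

For example, `song_folder_for("bands/slug/231015/take1.wav")` gives `bands/slug/231015/take1/`, while a file under `bands/slug/231015/groove/` maps to that subfolder.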

Watcher-driven import

Nextcloud  →  Activity API (polled every 30s by nc-watcher)
                    │
                    ▼
             event_loop.poll_once()
               filter: audio extension only
               normalize path (strip WebDAV prefix)
               filter: upload event type
                    │
                    ▼
             POST /internal/nc-upload
               band lookup: slug-based OR nc_folder_path prefix match
               same folder/session/song logic as manual scan
               enqueue audio analysis job → Redis
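
The filtering and band lookup in this flow might look like the sketch below. All names are hypothetical, the extension set is an assumption, and the prefix stripped is the standard Nextcloud WebDAV files root (`/remote.php/dav/files/<user>/`); the watcher's real prefix handling may differ. Slug-based lookup is modelled by falling back to the default `bands/{slug}/` folder.

```python
from pathlib import PurePosixPath
from typing import List, Optional, Tuple
from urllib.parse import unquote

# ASSUMPTION: the source does not list the accepted audio extensions.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def normalize_nc_path(raw_path: str, username: str) -> str:
    """Strip the WebDAV files prefix and URL-decoding so paths compare cleanly."""
    path = unquote(raw_path)
    prefix = f"/remote.php/dav/files/{username}/"
    if path.startswith(prefix):
        path = path[len(prefix):]
    return path.lstrip("/")


def is_audio(path: str) -> bool:
    """Case-insensitive check on the file extension only."""
    return PurePosixPath(path).suffix.lower() in AUDIO_EXTENSIONS


def match_band(path: str, bands: List[Tuple[str, Optional[str]]]) -> Optional[str]:
    """Return the slug whose folder is the longest prefix of the path.

    bands holds (slug, nc_folder_path or None); None means bands/{slug}/.
    """
    best_slug, best_len = None, -1
    for slug, folder in bands:
        prefix = (folder or f"bands/{slug}/").strip("/") + "/"
        if path.startswith(prefix) and len(prefix) > best_len:
            best_slug, best_len = slug, len(prefix)
    return best_slug
```

Preferring the longest prefix keeps the result stable if one band's folder happens to be nested inside another's.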

Audio Analysis

When a new AudioVersion is created the API enqueues a Job to Redis. The audio-worker picks it up and runs:

  1. Download file from Nextcloud to /tmp/audio/
  2. Run Essentia analysers: BPM, key, loudness, waveform peak data
  3. Write RangeAnalysis rows to DB
  4. Update Song.global_bpm / Song.global_key if not yet set
  5. Clean up temp file
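
The five steps above can be sketched with the download, analysis, and persistence steps injected as callables, so no Nextcloud or Essentia call is assumed. The `Song` dataclass, `run_analysis_job`, and the `bpm` / `key` result keys are hypothetical stand-ins, not the worker's real types.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Song:
    """Stand-in for the ORM model; only the two song-level fields matter here."""
    global_bpm: Optional[float] = None
    global_key: Optional[str] = None


def run_analysis_job(
    song: Song,
    download: Callable[[str], None],
    analyse: Callable[[str], dict],
    persist: Callable[[dict], None],
    tmp_dir: Optional[str] = None,
) -> dict:
    """Run download -> analyse -> persist and always remove the temp file."""
    fd, tmp_path = tempfile.mkstemp(suffix=".audio", dir=tmp_dir)
    os.close(fd)
    try:
        download(tmp_path)          # 1. fetch the file into the temp location
        result = analyse(tmp_path)  # 2. BPM, key, loudness, waveform peaks
        persist(result)             # 3. write the analysis rows
        # 4. fill song-level values only when they are still unset
        if song.global_bpm is None:
            song.global_bpm = result.get("bpm")
        if song.global_key is None:
            song.global_key = result.get("key")
        return result
    finally:
        # 5. clean up even if one of the steps above raised
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

Putting the cleanup in `finally` means a failed download or analysis does not leave files behind in the temp directory.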

Auth & Nextcloud Credentials

  • JWT signed with SECRET_KEY (HS256), sub = member UUID
  • Per-member Nextcloud credentials stored on the members row (nc_url, nc_username, nc_password). The API creates a NextcloudClient scoped to the acting member for all WebDAV operations.
  • The watcher uses a single shared NC account configured via env vars (NEXTCLOUD_USER / NEXTCLOUD_PASS).
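
For readers unfamiliar with HS256, the token format from the first bullet can be reproduced with the standard library alone. This sketch only illustrates the signing scheme; the API most likely uses a JWT library, the function names are hypothetical, and claims such as expiry are deliberately left out.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: str) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    sig = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_hs256(token: str, secret: str) -> dict:
    """Return the payload if the signature matches, else raise ValueError."""
    signing_input, _, sig = token.rpartition(".")
    if signing_input.count(".") != 1:
        raise ValueError("malformed token")
    expected = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    header_b64, payload_b64 = signing_input.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    return json.loads(_b64url_decode(payload_b64))
```

With `sub` set to the member UUID, `verify_hs256(sign_hs256({"sub": member_id}, key), key)["sub"]` returns the same UUID, and a token checked against a different key raises `ValueError`.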

Key Conventions

  • Repository pattern: one *Repository class per model in repositories/. All DB access goes through repos; routers never touch the session directly except for passing it to repos/services.
  • Pydantic v2: to override fields on a validated model, use model_validate(obj).model_copy(update={...}); model_validate does not accept an update kwarg.
  • Async SQLAlchemy: sessions are opened per-request via get_session() FastAPI dependency. SSE endpoints create their own session via get_session_factory()() because the dependency session closes when the handler returns.
  • Idempotent scans: deduplication is by nc_file_etag. Re-scanning is always safe.
  • nc_folder_path grouping: files in the same subfolder (e.g. bands/slug/groove/) are treated as multiple versions of one song. Files directly in a dated session folder get a unique virtual folder (bands/slug/231015/stem/) so each becomes its own song.
  • Migrations: Alembic in api/alembic/. After rebuilding the DB run docker compose exec api uv run alembic upgrade head.
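
The idempotent-scan convention above can be illustrated with an in-memory stand-in for the audio_versions table. `VersionRegistry` and `scan` are hypothetical names, and the real lookup goes through a repository and the database rather than a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VersionRegistry:
    """In-memory stand-in for audio_versions, keyed by nc_file_etag."""
    by_etag: Dict[str, str] = field(default_factory=dict)

    def register(self, etag: str, path: str) -> bool:
        """Store a version; return False if this etag was already known."""
        if etag in self.by_etag:
            return False
        self.by_etag[etag] = path
        return True


def scan(registry: VersionRegistry, files: List[Tuple[str, str]]) -> Dict[str, int]:
    """Register (path, etag) pairs and count what was created vs. skipped."""
    created = sum(registry.register(etag, path) for path, etag in files)
    return {"created": created, "skipped": len(files) - created}
```

Running `scan` twice over the same file list creates rows only on the first pass; the second pass reports every file as skipped.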