docs: add ARCHITECTURE.md for agent/developer onboarding
Covers: service topology, directory layout, data model, full API surface, scan/import pipeline, audio analysis flow, auth model, and key conventions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
249
ARCHITECTURE.md
Normal file
@@ -0,0 +1,249 @@
# RehearsalHub — Architecture

POC for a band rehearsal recording manager. Audio files live in Nextcloud; this app indexes, annotates, and plays them back.

---

## Services (Docker Compose)

```
┌─────────────┐   HTTP/80   ┌─────────────┐  REST /api/v1   ┌───────────────┐
│   Browser   │ ──────────► │     web     │ ──────────────► │      api      │
└─────────────┘             │  (nginx +   │                 │  (FastAPI /   │
                            │ React PWA)  │                 │   uvicorn)    │
                            └─────────────┘                 └──────┬────────┘
                                                                   │
                       ┌───────────────────────────────────────────┤
                       │               │               │           │
                  ┌────▼────┐   ┌──────▼──────┐   ┌────▼────┐   ┌──▼──────────┐
                  │   db    │   │    redis    │   │Nextcloud│   │audio-worker │
                  │(Postgres│   │ (job queue  │   │(WebDAV) │   │ (Essentia   │
                  │   16)   │   │ + pub/sub)  │   │         │   │  analysis)  │
                  └─────────┘   └─────────────┘   └────┬────┘   └─────────────┘
                                                       │
                                                 ┌─────▼──────┐
                                                 │ nc-watcher │
                                                 │ (polls NC  │
                                                 │  activity) │
                                                 └────────────┘
```

| Service | Image | Role |
|---|---|---|
| `web` | `rehearsalhub/web` | React 18 PWA (Vite + React Router + TanStack Query), served by nginx |
| `api` | `rehearsalhub/api` | FastAPI async REST API + SSE endpoints |
| `audio-worker` | `rehearsalhub/audio-worker` | Background job processor: downloads audio from NC, runs Essentia analysis, writes results to DB |
| `nc-watcher` | `rehearsalhub/nc-watcher` | Polls Nextcloud Activity API every 30s, pushes new audio uploads to `api` internal endpoint |
| `db` | `postgres:16-alpine` | Primary datastore |
| `redis` | `redis:7-alpine` | Job queue (audio analysis jobs) |

All services communicate on the `rh_net` bridge network. Only `web:80` is exposed to the host.

---

## Directory Layout

```
rehearsalhub-poc/
├── api/                        # FastAPI backend
│   ├── alembic/                # DB migrations (Alembic)
│   └── src/rehearsalhub/
│       ├── db/
│       │   ├── models.py       # SQLAlchemy ORM models
│       │   └── engine.py       # Async engine + session factory
│       ├── repositories/       # DB access layer (one file per model)
│       ├── routers/            # FastAPI route handlers
│       ├── schemas/            # Pydantic request/response models
│       ├── services/           # Business logic
│       │   ├── nc_scan.py      # Core scan logic (recursive, yields SSE events)
│       │   ├── song.py
│       │   ├── session.py      # Date parsing helpers
│       │   └── band.py
│       ├── storage/
│       │   └── nextcloud.py    # WebDAV client (PROPFIND / download)
│       └── queue/
│           └── redis_queue.py  # Enqueue audio analysis jobs
├── worker/                     # Audio analysis worker
│   └── src/worker/
│       ├── main.py             # Redis job consumer loop
│       ├── pipeline/           # Download → analyse → persist pipeline
│       └── analyzers/          # Essentia-based BPM / key / waveform analysers
├── watcher/                    # Nextcloud file watcher
│   └── src/watcher/
│       ├── event_loop.py       # Poll NC activity, filter audio uploads
│       └── nc_client.py        # NC Activity API + etag fetch
├── web/                        # React frontend
│   └── src/
│       ├── pages/              # Route-level components
│       ├── api/                # Typed fetch wrappers
│       └── hooks/              # useWaveform, etc.
├── docker-compose.yml
└── Makefile
```

---

## Data Model

```
Member ──< BandMember >── Band ──< RehearsalSession
                           │              │
                           └──< Song >────┘
                                 │
                                 └──< AudioVersion
                                       │
                                       └──< SongComment
                                       └──< Annotation
                                       └──< RangeAnalysis
                                       └──< Reaction
                                       └──< Job
```

**Key tables:**

| Table | Purpose |
|---|---|
| `members` | User accounts. Stores per-user Nextcloud credentials (`nc_username`, `nc_url`, `nc_password`) |
| `bands` | A band. Has a `slug`, optional `nc_folder_path` (defaults to `bands/{slug}/`), and `genre_tags[]` |
| `band_members` | M2M: member ↔ band with `role` (admin / member) |
| `band_invites` | Time-limited invite tokens (72h) |
| `rehearsal_sessions` | One row per dated rehearsal. `date` parsed from a `YYMMDD` or `YYYYMMDD` folder segment in the NC path. Unique on `(band_id, date)` |
| `songs` | A recording / song. `nc_folder_path` is the canonical grouping key (all versions of one song live in this folder). `session_id` links to a rehearsal session if the path contained a date segment |
| `audio_versions` | One row per audio file. Identified by `nc_file_etag` (used for idempotent re-scans). Stores format, size, version number |
| `annotations` | Time-stamped text annotations on a version (like comments at a waveform position) |
| `range_analyses` | Essentia analysis results for a time range within a version (BPM, key, loudness, waveform) |
| `jobs` | Redis-backed job records tracking audio analysis pipeline state |
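To make the `rehearsal_sessions` date rule concrete, here is a minimal sketch of pulling a `YYMMDD` or `YYYYMMDD` segment out of a Nextcloud path. The function name `parse_session_date` is hypothetical, and treating two-digit years with Python's `%y` rule (00-68 become 20xx) is an assumption; the actual helpers live in `services/session.py` and may behave differently.

```python
import re
from datetime import date, datetime
from typing import Optional

# A path segment made of exactly 6 or 8 digits is a candidate date folder.
_DATE_SEGMENT = re.compile(r"^(\d{6}|\d{8})$")


def parse_session_date(nc_path: str) -> Optional[date]:
    """Return the first valid YYMMDD/YYYYMMDD folder segment as a date, else None.

    Hypothetical helper for illustration; not the project's real implementation.
    """
    for segment in nc_path.strip("/").split("/"):
        if not _DATE_SEGMENT.match(segment):
            continue
        fmt = "%y%m%d" if len(segment) == 6 else "%Y%m%d"
        try:
            return datetime.strptime(segment, fmt).date()
        except ValueError:
            # Digits that do not form a real calendar date (e.g. month 13).
            continue
    return None
```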

---

## API

Base path: `/api/v1`

### Auth
| Method | Path | Description |
|---|---|---|
| `POST` | `/auth/register` | Create account |
| `POST` | `/auth/login` | Returns JWT |

The JWT is sent as `Authorization: Bearer <token>`. Endpoints whose clients cannot attach auth headers (WaveSurfer, SSE `EventSource`) also accept `?token=<jwt>`.
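The two token locations can be resolved with a small lookup like the one below. This is only a sketch: `extract_token` is a hypothetical name, plain mappings stand in for the request's headers and query parameters, and preferring the header over the query string is an assumption.

```python
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Return the JWT from `Authorization: Bearer ...`, else from `?token=`.

    Hypothetical helper; a plain dict lookup is case-sensitive, whereas real
    HTTP header access in a web framework is usually case-insensitive.
    """
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # Fallback for clients that cannot set headers (e.g. EventSource).
    return query.get("token") or None
```

For example, `extract_token({}, {"token": "abc"})` falls back to the query parameter, while a request carrying both locations uses the header value.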

### Bands
| Method | Path | Description |
|---|---|---|
| `GET` | `/bands` | List bands for current member |
| `POST` | `/bands` | Create band (validates NC folder exists if path given) |
| `GET` | `/bands/{id}` | Band detail |
| `PATCH` | `/bands/{id}` | Update band (nc_folder_path, etc.) |
| `GET` | `/bands/{id}/members` | List members |
| `DELETE` | `/bands/{id}/members/{mid}` | Remove member |
| `POST` | `/bands/{id}/invites` | Generate invite link |
| `POST` | `/invites/{token}/accept` | Join band via invite |

### Sessions
| Method | Path | Description |
|---|---|---|
| `GET` | `/bands/{id}/sessions` | List rehearsal sessions with recording counts |
| `GET` | `/bands/{id}/sessions/{sid}` | Session detail with flat song list |
| `PATCH` | `/bands/{id}/sessions/{sid}` | Update label/notes (admin only) |

### Songs
| Method | Path | Description |
|---|---|---|
| `GET` | `/bands/{id}/songs` | All songs for band |
| `GET` | `/bands/{id}/songs/search` | Filter by `q`, `tags[]`, `key`, `bpm_min/max`, `session_id`, `unattributed` |
| `POST` | `/bands/{id}/songs` | Create song manually |
| `PATCH` | `/songs/{id}` | Update title, status, tags, key, BPM, notes |
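As a rough illustration of the search filters, the sketch below assembles a query string for `/bands/{id}/songs/search`. The helper name `build_song_search_url` is hypothetical, and the wire format is an assumption: repeated `tags[]` keys for tags, `bpm_min` / `bpm_max` as separate parameters, and `true` / `false` for booleans. Check the router before relying on any of it.

```python
from urllib.parse import urlencode


def build_song_search_url(band_id: str, **filters) -> str:
    """Build a search URL from keyword filters, skipping any that are None.

    Hypothetical client helper; parameter encoding is assumed, not confirmed.
    """
    params = []
    for name, value in filters.items():
        if value is None:
            continue
        if name == "tags":
            # Assumed encoding: one `tags[]` pair per tag.
            params.extend(("tags[]", tag) for tag in value)
        elif isinstance(value, bool):
            params.append((name, "true" if value else "false"))
        else:
            params.append((name, str(value)))
    url = f"/api/v1/bands/{band_id}/songs/search"
    return f"{url}?{urlencode(params)}" if params else url
```

Note that `urlencode` percent-encodes the brackets, so `tags[]` appears as `tags%5B%5D` in the resulting URL.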

### Scan
| Method | Path | Description |
|---|---|---|
| `GET` | `/bands/{id}/nc-scan/stream` | **SSE / ndjson stream**: scans the NC folder incrementally and yields `progress`, `song`, `session`, `skipped`, `done` events |
| `POST` | `/bands/{id}/nc-scan` | Blocking scan (waits for completion, returns summary) |
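A consumer of the streaming scan can treat the response as newline-delimited JSON. The sketch below assumes each line is one JSON object whose `type` field carries the event name (`progress`, `song`, `session`, `skipped`, `done`); that field name and the function names are assumptions, not the documented wire format.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_scan_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode one JSON object per non-empty line and stop after a `done` event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        yield event
        if event.get("type") == "done":  # assumed discriminator field
            break


def count_events(lines: Iterable[str]) -> Dict[str, int]:
    """Tally events by type, e.g. to show a scan summary in the UI."""
    counts: Dict[str, int] = {}
    for event in iter_scan_events(lines):
        counts[event["type"]] = counts.get(event["type"], 0) + 1
    return counts
```

Because events are handled one line at a time, a client can refresh its view after each `song` or `session` event instead of waiting for the whole scan to finish.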

### Versions & Playback
| Method | Path | Description |
|---|---|---|
| `GET` | `/songs/{id}/versions` | List audio versions |
| `GET` | `/versions/{id}/stream` | Proxy-stream the audio file from Nextcloud (accepts `?token=`) |
| `POST` | `/versions/{id}/annotate` | Add waveform annotation |

### Internal (watcher → api)
| Method | Path | Description |
|---|---|---|
| `POST` | `/internal/nc-upload` | Called by nc-watcher when a new audio file is detected. No auth — internal network only |

---

## Scan & Import Pipeline

### Manual scan (SSE)

```
Browser → GET /nc-scan/stream?token=
  │
  ▼
scan_band_folder()  [nc_scan.py]
  │  recursive PROPFIND via collect_audio_files()
  │  depth ≤ 3
  ▼
For each audio file:
  1. PROPFIND for etag + size
  2. Skip if etag already in audio_versions
  3. Parse YYMMDD/YYYYMMDD from path → get_or_create RehearsalSession
  4. Determine nc_folder_path:
       - File directly in session folder → unique per-file folder (bands/slug/231015/stem/)
       - File in subfolder → subfolder path (bands/slug/231015/groove/)
  5. get_or_create Song
  6. Register AudioVersion
  7. Yield ndjson event → browser invalidates TanStack Query caches incrementally
```
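Steps 2 and 4 of the flow above can be sketched as follows. The names `song_folder_for` and `plan_import` are hypothetical, `files` is assumed to be an iterable of `(path, etag)` pairs, and the behaviour for files outside any dated folder is a guess; the real logic lives in `nc_scan.py`.

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, List, Set, Tuple

_DATE_SEGMENT = re.compile(r"^(\d{6}|\d{8})$")


def song_folder_for(file_path: str) -> str:
    """Return the nc_folder_path a file would be grouped under (step 4)."""
    path = PurePosixPath(file_path)
    parent = path.parent
    if _DATE_SEGMENT.match(parent.name):
        # File sits directly in the dated session folder: use a virtual
        # per-file folder named after the file stem.
        return f"{parent / path.stem}/"
    # File sits in a subfolder: every file there is a version of one song.
    return f"{parent}/"


def plan_import(
    files: Iterable[Tuple[str, str]], known_etags: Set[str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split (path, etag) pairs into new imports and skipped duplicates (step 2)."""
    seen = set(known_etags)
    to_import: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for path, etag in files:
        if etag in seen:
            skipped.append(path)
            continue
        seen.add(etag)
        to_import.append((path, song_folder_for(path)))
    return to_import, skipped
```

Because already-known etags are skipped before anything is written, running the same scan twice produces no new rows the second time.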

### Watcher-driven import

```
Nextcloud → Activity API (polled every 30s by nc-watcher)
  │
  ▼
event_loop.poll_once()
  filter: audio extension only
  normalize path (strip WebDAV prefix)
  filter: upload event type
  │
  ▼
POST /internal/nc-upload
  band lookup: slug-based OR nc_folder_path prefix match
  same folder/session/song logic as manual scan
  enqueue audio analysis job → Redis
```
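The path handling and band lookup in this flow might look roughly like the sketch below. Everything named here is an assumption for illustration: the extension list, the `/remote.php/dav/files/<user>/` prefix (Nextcloud's usual WebDAV root), the dict-shaped band records, and the function names. The upload event-type filter is left out because the document does not say which activity types are accepted.

```python
from pathlib import PurePosixPath
from typing import Mapping, Optional, Sequence

AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}  # assumed list
WEBDAV_PREFIX = "/remote.php/dav/files/"  # assumed; followed by "<user>/"


def is_audio(path: str) -> bool:
    """Filter: keep audio extensions only (case-insensitive)."""
    return PurePosixPath(path).suffix.lower() in AUDIO_EXTENSIONS


def normalize_path(raw: str) -> str:
    """Strip the WebDAV prefix and user segment, leaving a user-relative path."""
    if raw.startswith(WEBDAV_PREFIX):
        _user, _, raw = raw[len(WEBDAV_PREFIX):].partition("/")
    return raw.lstrip("/")


def find_band(path: str, bands: Sequence[Mapping[str, str]]) -> Optional[Mapping[str, str]]:
    """Match a path by default slug folder OR by configured nc_folder_path prefix."""
    for band in bands:
        prefixes = [f"bands/{band['slug']}/"]
        if band.get("nc_folder_path"):
            prefixes.append(band["nc_folder_path"].lstrip("/"))
        if any(path.startswith(prefix) for prefix in prefixes):
            return band
    return None
```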

---

## Audio Analysis

When a new `AudioVersion` is created, the API enqueues a `Job` to Redis. The `audio-worker` picks it up and runs:

1. Download file from Nextcloud to `/tmp/audio/`
2. Run Essentia analysers: BPM, key, loudness, waveform peak data
3. Write `RangeAnalysis` rows to DB
4. Update `Song.global_bpm` / `Song.global_key` if not yet set
5. Clean up temp file
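The five steps above can be sketched as a single function with the external pieces injected as callables. This is a minimal sketch under stated assumptions: `process_job` and `fill_missing_globals` are hypothetical names, the download, analysis, and persistence steps are stand-ins (no Essentia or Nextcloud calls), and results are modelled as a plain dict with `bpm` and `key` entries.

```python
import os
import tempfile
from typing import Any, Callable, Dict


def process_job(
    download: Callable[[str], None],
    analyse: Callable[[str], Dict[str, Any]],
    persist: Callable[[Dict[str, Any]], None],
    tmp_dir: str,
) -> Dict[str, Any]:
    """Download, analyse, persist; the temp file is removed even if a step fails."""
    fd, tmp_path = tempfile.mkstemp(suffix=".audio", dir=tmp_dir)
    os.close(fd)
    try:
        download(tmp_path)            # step 1: fetch the file into tmp_path
        results = analyse(tmp_path)   # step 2: run the analysers
        persist(results)              # step 3: write analysis rows
        return results
    finally:
        if os.path.exists(tmp_path):  # step 5: always clean up
            os.remove(tmp_path)


def fill_missing_globals(song: Dict[str, Any], results: Dict[str, Any]) -> Dict[str, Any]:
    """Step 4: set global_bpm / global_key only where the song has no value yet."""
    for field, key in (("global_bpm", "bpm"), ("global_key", "key")):
        if song.get(field) is None and key in results:
            song[field] = results[key]
    return song
```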

---

## Auth & Nextcloud Credentials

- JWT signed with `SECRET_KEY` (HS256), `sub` = member UUID
- Per-member Nextcloud credentials stored on the `members` row (`nc_url`, `nc_username`, `nc_password`). The API creates a `NextcloudClient` scoped to the acting member for all WebDAV operations.
- The watcher uses a single shared NC account configured via env vars (`NEXTCLOUD_USER` / `NEXTCLOUD_PASS`).
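For readers unfamiliar with HS256, the sketch below shows what signing and verifying such a token involves, using only the standard library. It is an illustration, not the project's code: the API most likely uses a JWT library, and the `exp` claim, the one-hour lifetime, and the function names are assumptions (the document only specifies HS256 and `sub`).

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret_key: str) -> str:
    digest = hmac.new(secret_key.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def sign_token(member_id: str, secret_key: str, ttl_seconds: int = 3600) -> str:
    """Build header.payload.signature with `sub` set to the member UUID."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": member_id, "exp": int(time.time()) + ttl_seconds}  # exp is assumed
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    return f"{signing_input}.{_signature(signing_input, secret_key)}"


def verify_token(token: str, secret_key: str) -> str:
    """Return the member UUID from `sub`, or raise ValueError."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        payload_b64 = signing_input.split(".")[1]
    except (ValueError, IndexError):
        raise ValueError("malformed token")
    expected = _signature(signing_input, secret_key)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload["sub"]
```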

---

## Key Conventions

- **Repository pattern**: one `*Repository` class per model in `repositories/`. All DB access goes through repos; routers never touch the session directly except to pass it to repos/services.
- **Pydantic v2**: to override fields when building a schema from an object, chain `model_validate(obj).model_copy(update={...})`; `model_validate` does not accept an `update` kwarg.
- **Async SQLAlchemy**: sessions are opened per request via the `get_session()` FastAPI dependency. SSE endpoints create their own session via `get_session_factory()()` because the dependency session closes when the handler returns.
- **Idempotent scans**: deduplication is by `nc_file_etag`, so re-scanning is always safe.
- **nc_folder_path grouping**: files in the same subfolder (e.g. `bands/slug/groove/`) are treated as multiple versions of one song. Files directly in a dated session folder get a unique virtual folder (`bands/slug/231015/stem/`) so each becomes its own song.
- **Migrations**: Alembic in `api/alembic/`. After rebuilding the DB, run `docker compose exec api uv run alembic upgrade head`.
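The session-lifetime convention for SSE endpoints can be simulated without FastAPI or SQLAlchemy. In the sketch below, `FakeSession` is a hypothetical stand-in for an async DB session, and closing it by hand mimics the request-scoped dependency being torn down once the handler has returned; none of these names come from the codebase.

```python
import asyncio
from typing import AsyncIterator, Callable, List, Tuple


class FakeSession:
    """Stand-in for an async DB session (hypothetical; not SQLAlchemy)."""

    def __init__(self) -> None:
        self.closed = False

    def fetch(self, n: int) -> str:
        if self.closed:
            raise RuntimeError("session already closed")
        return f"row {n}"

    async def close(self) -> None:
        self.closed = True


async def stream_with_request_session(session: FakeSession) -> AsyncIterator[str]:
    # Runs lazily, i.e. after the request-scoped session has been closed.
    for n in range(3):
        yield session.fetch(n)


async def stream_with_own_session(factory: Callable[[], FakeSession]) -> AsyncIterator[str]:
    # The generator owns its session, so it stays usable while streaming.
    session = factory()
    try:
        for n in range(3):
            yield session.fetch(n)
    finally:
        await session.close()


async def demo() -> Tuple[bool, List[str]]:
    request_session = FakeSession()
    broken = stream_with_request_session(request_session)
    await request_session.close()  # simulates dependency teardown
    try:
        _ = [row async for row in broken]
        request_scoped_ok = True
    except RuntimeError:
        request_scoped_ok = False
    rows = [row async for row in stream_with_own_session(FakeSession)]
    return request_scoped_ok, rows
```

Running `asyncio.run(demo())` shows the request-scoped variant failing on first use while the generator that owns its session streams all rows.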