Reset to HoopScout v2 runtime foundation and simplified topology

2026-03-13 10:31:29 +01:00
parent 3b5f1f37dd
commit bb033222e3
13 changed files with 247 additions and 748 deletions
--- a/README.md
+++ b/README.md
@@ -1,422 +1,121 @@
-# HoopScout
+# HoopScout v2 (Foundation Reset)

-HoopScout is a production-minded basketball scouting and player search platform.
-The main product experience is server-rendered Django Templates with HTMX enhancements.
-A minimal read-only API is included as a secondary integration surface.
+HoopScout v2 is a controlled greenfield rebuild inside the existing repository.

-## Core Stack
+Current v2 foundation scope in this branch:
+- Django + HTMX server-rendered app
+- PostgreSQL as the only primary database
+- nginx reverse proxy
+- management-command-driven runtime operations
+- static snapshot directories persisted via Docker named volumes

- Python 3.12+
- Django
- Django Templates + HTMX
- Tailwind CSS (CLI build pipeline)
- PostgreSQL
- Redis
- Celery + Celery Beat
- Django REST Framework (read-only API)
- pytest
- Docker / Docker Compose
- nginx
+Out of scope in this step:
+- domain model redesign
+- snapshot importer implementation
+- extractor implementation

-## Architecture Summary
+## Runtime Architecture (v2)

- Main UI: Django + HTMX (not SPA)
- Data layer: normalized domain models for players, seasons, competitions, teams, stats, scouting state
- Provider integration: adapter-based abstraction in `apps/providers`
- Ingestion orchestration: `apps/ingestion` with run/error logs and Celery task execution
- Optional API: read-only DRF endpoints under `/api/`
+Runtime services are intentionally small:
+- `web` (Django/Gunicorn)
+- `postgres` (primary DB)
+- `nginx` (reverse proxy + static/media serving)

-## Repository Structure
+No Redis/Celery services are part of the v2 default runtime topology.
+Legacy Celery/provider code remains in the codebase and repository history but is de-emphasized for v2.

-```text
-.
-├── apps/
-│   ├── api/
-│   ├── competitions/
-│   ├── core/
-│   ├── ingestion/
-│   ├── players/
-│   ├── providers/
-│   ├── scouting/
-│   ├── stats/
-│   ├── teams/
-│   └── users/
-├── config/
-│   └── settings/
-├── docs/
-├── nginx/
-├── requirements/
-├── package.json
-├── tailwind.config.js
-├── static/
-├── templates/
-├── tests/
-├── .github/
-├── CHANGELOG.md
-├── docker-compose.yml
-├── Dockerfile
-└── entrypoint.sh
-```
+## Image Strategy

-## Quick Start
+Compose builds and tags images as:
+- `registry.younerd.org/hoopscout/web:${APP_IMAGE_TAG:-latest}`
+- `registry.younerd.org/hoopscout/nginx:${NGINX_IMAGE_TAG:-latest}`

-1. Create local env file:
+Reserved for future optional scheduler use:
+- `registry.younerd.org/hoopscout/scheduler:${APP_IMAGE_TAG:-latest}`
+
+## Entrypoint Strategy
+
+- `web`: `entrypoint.sh`
+  - waits for PostgreSQL
+  - optionally runs migrations/collectstatic
+  - ensures snapshot directories exist
+- `nginx`: `nginx/entrypoint.sh`
+  - simple runtime entrypoint wrapper
+
+## Compose Files
+
+- `docker-compose.yml`: production-minded baseline runtime (immutable image filesystem)
+- `docker-compose.dev.yml`: development override with source bind mount for `web`
+- `docker-compose.release.yml`: production settings override (`DJANGO_SETTINGS_MODULE=config.settings.production`)
+
+### Start development runtime

 ```bash
 cp .env.example .env
-```
-
-2. Build and run services:
-
-```bash
-docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile dev up --build
-```
-
-This starts the development-oriented topology (source bind mounts enabled).
-In development, bind-mounted app containers run as `LOCAL_UID`/`LOCAL_GID` from `.env` (set them to your host user/group IDs).
-
-3. If `AUTO_APPLY_MIGRATIONS=0`, run migrations manually:
-
-```bash
-docker compose exec web python manage.py migrate
-```
-
-4. Create a superuser:
-
-```bash
-docker compose exec web python manage.py createsuperuser
-```
-
-5. Open the app:
-
- Web: http://localhost
- Admin: http://localhost/admin/
- Health: http://localhost/health/
- API root endpoints: `/api/players/`, `/api/competitions/`, `/api/teams/`, `/api/seasons/`
-
-## Development vs Release Compose
-
-Base compose (`docker-compose.yml`) is release-oriented and immutable for runtime services.
-Development mutability is enabled via `docker-compose.dev.yml`.
-
-Development startup (mutable source bind mounts for `web`/`celery_*`):
-
-```bash
 docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
 ```

-Development startup with Tailwind watch:
-
-```bash
-docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile dev up --build
-```
-
-Release-style startup (immutable runtime services):
+### Start release-style runtime

 ```bash
 docker compose -f docker-compose.yml -f docker-compose.release.yml up -d --build
 ```

-Optional release-style stop:
+## Named Volumes

-```bash
-docker compose -f docker-compose.yml -f docker-compose.release.yml down
-```
+v2 runtime uses named volumes for persistence:
+- `postgres_data`
+- `static_data`
+- `media_data`
+- `snapshots_incoming`
+- `snapshots_archive`
+- `snapshots_failed`

-Notes:
+Development override uses separate dev-prefixed volumes to avoid ownership collisions.

- In release-style mode, `web`, `celery_worker`, and `celery_beat` run from built image filesystem with no repository source bind mount.
- In development mode (with `docker-compose.dev.yml`), `web`, `celery_worker`, and `celery_beat` are mutable and bind-mount `.:/app`.
- `tailwind` is a dev-profile service and is not required for release runtime.
- `nginx`, `postgres`, and `redis` service naming remains unchanged.
- Release-style `web`, `celery_worker`, and `celery_beat` explicitly run as container user `10001:10001`.
+## Environment Variables

-## Release Topology Verification
+Use `.env.example` as the source of truth.

-Inspect merged release config:
+Core groups:
+- Django runtime/security vars
+- PostgreSQL connection vars
+- image tag vars (`APP_IMAGE_TAG`, `NGINX_IMAGE_TAG`)
+- snapshot directory vars (`SNAPSHOT_*`)
+- optional future scheduler vars (`SCHEDULER_*`)

-```bash
-docker compose -f docker-compose.yml -f docker-compose.release.yml config
-```
+## Snapshot Storage Convention

-What to verify:
+Snapshot files are expected under:
+- incoming: `/app/snapshots/incoming`
+- archive: `/app/snapshots/archive`
+- failed: `/app/snapshots/failed`

- `services.web.volumes` does not include a bind mount from repository path to `/app`
- `services.celery_worker.volumes` does not include a bind mount from repository path to `/app`
- `services.celery_beat.volumes` does not include a bind mount from repository path to `/app`
- persistent named volumes still exist for `postgres_data`, `static_data`, `media_data`, `runtime_data`, and `redis_data`
+In this foundation step, directories are created and persisted but no importer/extractor is implemented yet.

-Automated local/CI-friendly check:
-
-```bash
-./scripts/verify_release_topology.sh
-```
-
-## Setup and Run Notes
-
- `web` service starts through `entrypoint.sh` and waits for PostgreSQL readiness.
- `web` service also builds Tailwind CSS before `collectstatic` when `AUTO_BUILD_TAILWIND=1`.
- `web`, `celery_worker`, `celery_beat`, and `tailwind` run as a non-root user inside the image.
- `celery_worker` executes background sync work.
- `celery_beat` triggers periodic provider sync (`apps.ingestion.tasks.scheduled_provider_sync`).
- `tailwind` service runs watch mode for development (`npm run dev`).
- nginx proxies web traffic and serves static/media volume mounts.
-
-## Search Consistency Notes
-
- The server-rendered player search page (`/players/`) and read-only players API (`/api/players/`) use the same search form and ORM filter service.
- Sorting/filter semantics are aligned across UI, HTMX partial refreshes, and API responses.
- Search result metrics in the UI table use **best eligible semantics**:
-  - each metric (Games, MPG, PPG, RPG, APG) is the maximum value across eligible player-season rows
-  - eligibility is scoped by the active season/team/competition/stat filters
-  - different displayed metrics for one player can come from different eligible rows
- Metric-based API sorting (`ppg_*`, `mpg_*`) uses the same best-eligible semantics as UI search.
-
-## Docker Volumes and Persistence
-
-`docker-compose.yml` uses named volumes:
-
- `postgres_data`: PostgreSQL persistent database
- `static_data`: collected static assets
- `media_data`: user/provider media artifacts
- `runtime_data`: app runtime files (e.g., celery beat schedule)
- `redis_data`: Redis persistence (`/data` for RDB/AOF files)
- `node_modules_data`: Node modules cache for Tailwind builds in development override
-
-This keeps persistent state outside container lifecycles.
-
-In release-style mode, these volumes remain the persistence layer:
-
- `postgres_data` for database state
- `static_data` for collected static assets served by nginx
- `media_data` for uploaded/provider media
- `runtime_data` for Celery beat schedule/runtime files
- `redis_data` for Redis persistence
-
-## Migrations
-
-Create migration files:
-
-```bash
-docker compose exec web python manage.py makemigrations
-```
-
-Apply migrations:
+## Migration and Superuser Commands

 ```bash
 docker compose exec web python manage.py migrate
-```
-
-## Testing
-
-Run all tests:
-
-```bash
-docker compose run --rm web sh -lc 'pip install -r requirements/dev.txt && pytest -q'
-```
-
-Run a focused module:
-
-```bash
-docker compose run --rm web sh -lc 'pip install -r requirements/dev.txt && pytest -q tests/test_api.py'
-```
-
-## Frontend Assets (Tailwind)
-
-Build Tailwind once:
-
-```bash
-docker compose run --rm web sh -lc 'npm install --no-audit --no-fund && npm run build'
-```
-
-If you see `Permission denied` writing `static/vendor` or `static/css` in development, fix local file ownership once:
-
-```bash
-sudo chown -R "$(id -u):$(id -g)" static
-```
-
-Run Tailwind in watch mode during development:
-
-```bash
-docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile dev up tailwind
-```
-
-Source CSS lives in `static/src/tailwind.css` and compiles to `static/css/main.css`.
-HTMX is served from local static assets (`static/vendor/htmx.min.js`) instead of a CDN dependency.
-
-## Production Configuration
-
-Use production settings in deployed environments:
-
-```bash
-DJANGO_SETTINGS_MODULE=config.settings.production
-DJANGO_DEBUG=0
-DJANGO_ENV=production
-```
-
-When `DJANGO_DEBUG=0`, startup fails fast unless:
-
- `DJANGO_SECRET_KEY` is a real non-default value
- `DJANGO_ALLOWED_HOSTS` is set
- `DJANGO_CSRF_TRUSTED_ORIGINS` is set (for production settings)
-
-Additional production safety checks:
-
- `DJANGO_SECRET_KEY` must be strong and non-default in non-development environments
- `DJANGO_ALLOWED_HOSTS` must not contain localhost-style values
- `DJANGO_CSRF_TRUSTED_ORIGINS` must be explicit HTTPS origins only (no localhost/http)
-
-Production settings enable hardened defaults such as:
-
- secure cookies
- HSTS
- security headers
- `ManifestStaticFilesStorage` for static asset integrity/versioning
-
-### Production Configuration Checklist
-
- `DJANGO_SETTINGS_MODULE=config.settings.production`
- `DJANGO_ENV=production`
- `DJANGO_DEBUG=0`
- strong `DJANGO_SECRET_KEY` (unique, non-default, >= 32 chars)
- explicit `DJANGO_ALLOWED_HOSTS` (no localhost values)
- explicit `DJANGO_CSRF_TRUSTED_ORIGINS` with HTTPS origins only
- `DJANGO_SECURE_SSL_REDIRECT=1` and `DJANGO_SECURE_HSTS_SECONDS` set appropriately
-
-## Superuser and Auth
-
-Create superuser:
-
-```bash
 docker compose exec web python manage.py createsuperuser
 ```

-Default auth routes:
+## Health Endpoints

- Signup: `/users/signup/`
- Login: `/users/login/`
- Logout: `/users/logout/`
+- app health: `/health/`
+- nginx healthcheck proxies `/health/` to `web`

-## Ingestion and Manual Sync
+## GitFlow

-### Trigger via Django Admin
+Required branch model:
+- `main`: production
+- `develop`: integration
+- `feature/*`, `release/*`, `hotfix/*`

- Open `/admin/` -> `IngestionRun`
- Use admin actions:
-  - `Queue full sync (default provider)`
-  - `Queue incremental sync (default provider)`
-  - `Retry selected ingestion runs`
+This v2 work branch is:
+- `feature/hoopscout-v2-static-architecture`

-### Trigger from shell (manual)
+## Notes on Legacy Layers

-```bash
-docker compose exec web python manage.py shell
-```
-
-```python
-from apps.ingestion.tasks import trigger_full_sync
-trigger_full_sync.delay(provider_namespace="balldontlie")
-```
-
-### Logs and diagnostics
-
- Run-level status/counters: `IngestionRun`
- Structured error records: `IngestionError`
- Provider entity mappings + diagnostic payload snippets: `ExternalMapping`
- `IngestionRun.error_summary` captures top-level failure/partial-failure context
-
-### Scheduled sync via Celery Beat
-
-Configure scheduled sync through environment variables:
-
- `INGESTION_SCHEDULE_ENABLED` (`0`/`1`)
- `INGESTION_SCHEDULE_CRON` (5-field cron expression, default `*/30 * * * *`)
- `INGESTION_SCHEDULE_PROVIDER_NAMESPACE` (optional; falls back to default provider namespace)
- `INGESTION_SCHEDULE_JOB_TYPE` (`incremental` or `full_sync`)
- `INGESTION_PREVENT_OVERLAP` (`0`/`1`) to skip obvious overlapping runs
- `INGESTION_OVERLAP_WINDOW_MINUTES` overlap guard window
-
-When enabled, Celery Beat enqueues the scheduled sync task on the configured cron.
-The task uses the existing ingestion service path and writes run/error records in the same tables as manual sync.
-
-Valid cron examples:
-
- `*/30 * * * *` every 30 minutes
- `0 * * * *` hourly
- `15 2 * * *` daily at 02:15
-
-Failure behavior for invalid cron values:
-
- invalid `INGESTION_SCHEDULE_CRON` does not crash unrelated startup paths (for example, web)
- periodic ingestion task is disabled until cron is fixed
- an error is logged at startup indicating the invalid schedule value
-
-## Provider Backend Selection
-
-Provider backend is selected via environment variables:
-
- `PROVIDER_BACKEND=demo` uses the local JSON fixture adapter (`mvp_demo`)
- `PROVIDER_BACKEND=balldontlie` uses the HTTP adapter (`balldontlie`)
- `PROVIDER_DEFAULT_NAMESPACE` can override backend mapping explicitly
-
-The balldontlie adapter is NBA-centric and intended as MVP ingestion only. The provider abstraction remains ready for future multi-league providers (for example Sportradar or FIBA GDAP).
-The adapter follows the published balldontlie OpenAPI contract: server `https://api.balldontlie.io`, NBA endpoints under `/nba/v1/*`, cursor pagination via `meta.next_cursor`, and `stats` ingestion filtered by `seasons[]`.
-Some balldontlie plans do not include stats endpoints; set `PROVIDER_BALLDONTLIE_STATS_STRICT=0` (default) to ingest players/teams/seasons even when stats are unauthorized.
-
-Provider normalization details and explicit adapter assumptions are documented in [docs/provider-normalization.md](docs/provider-normalization.md).
-
-## GitFlow Workflow
-
-GitFlow is required in this repository:
-
- `main`: production branch
- `develop`: integration branch
- `feature/*`: new feature branches from `develop`
- `release/*`: release hardening branches from `develop`
- `hotfix/*`: urgent production fixes from `main`
-
-Read full details in [CONTRIBUTING.md](CONTRIBUTING.md) and [docs/workflow.md](docs/workflow.md).
-
-### Repository Bootstrap Commands
-
-Run these from the current `main` branch to initialize local GitFlow usage:
-
-```bash
-git checkout main
-git pull origin main
-git checkout -b develop
-git push -u origin develop
-```
-
-Start a feature branch:
-
-```bash
-git checkout develop
-git pull origin develop
-git checkout -b feature/player-search-tuning
-```
-
-Start a release branch:
-
-```bash
-git checkout develop
-git pull origin develop
-git checkout -b release/0.1.0
-```
-
-Start a hotfix branch:
-
-```bash
-git checkout main
-git pull origin main
-git checkout -b hotfix/fix-redis-persistence
-```
-
-## Release Notes / Changelog Convention
-
- Use [CHANGELOG.md](CHANGELOG.md) with an `Unreleased` section.
- For each merged PR, add short entries under:
-  - `Added`
-  - `Changed`
-  - `Fixed`
- On release, move `Unreleased` items to a dated version section (`[x.y.z] - YYYY-MM-DD`).
+Legacy provider/Celery ingestion layers are not the default runtime path for v2 foundation.
+They are intentionally isolated until replaced by v2 snapshot ingestion commands in later tasks.
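
The new README says the `web` entrypoint "ensures snapshot directories exist" under `/app/snapshots/{incoming,archive,failed}`, but the diff does not show how. As a rough illustration only, the step could look like the Python sketch below. The helper name `ensure_snapshot_dirs` and its `base_dir` parameter are hypothetical and not taken from this commit; the actual `entrypoint.sh` may do this differently (for example with `mkdir -p`).

```python
from pathlib import Path

# Subdirectory names taken from the README's snapshot storage convention.
SNAPSHOT_SUBDIRS = ("incoming", "archive", "failed")


def ensure_snapshot_dirs(base_dir: str = "/app/snapshots") -> list[Path]:
    """Create the snapshot subdirectories under base_dir if they are missing.

    Hypothetical helper for illustration; not part of this commit.
    """
    paths = []
    for name in SNAPSHOT_SUBDIRS:
        path = Path(base_dir) / name
        # exist_ok=True keeps the call idempotent across container restarts.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

Because the call is idempotent, it would be safe to run on every container start, whether the named volumes are empty or already populated.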