fix(v2-ingestion): align public schema realism follow-ups

Alfredo Di Stasio
2026-03-20 15:23:43 +01:00
parent 6066d2a0bb
commit 48a82e812a
4 changed files with 91 additions and 1 deletions

@@ -167,6 +167,12 @@ Validation is strict:
- numeric fields must be numeric
- invalid files are moved to failed directory
Importer enrichment note:
- `full_name` is the source of truth for identity display
- `first_name` / `last_name` are optional and may be absent in public snapshots
- when both are missing, the importer may derive them from `full_name` as a best-effort enrichment step
- this enrichment is for convenience only and does not override the source-of-truth semantics of `full_name`
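
A minimal sketch of the enrichment rule above, assuming a plain whitespace split; the helper names `derive_name_parts` and `enrich_names` are hypothetical and not taken from the importer:

```python
from typing import Optional, Tuple


def derive_name_parts(full_name: str) -> Tuple[Optional[str], Optional[str]]:
    """Best-effort split: first token is the first name, the rest the last name."""
    tokens = full_name.split()
    if not tokens:
        return None, None
    if len(tokens) == 1:
        return tokens[0], None
    return tokens[0], " ".join(tokens[1:])


def enrich_names(record: dict) -> dict:
    """Return a copy of the record with derived name parts.

    Derivation happens only when BOTH first_name and last_name are missing;
    full_name is never modified, so it remains the source of truth.
    """
    enriched = dict(record)
    if not enriched.get("first_name") and not enriched.get("last_name"):
        first, last = derive_name_parts(enriched.get("full_name") or "")
        enriched["first_name"] = first
        enriched["last_name"] = last
    return enriched
```

A whitespace split is only a heuristic (multi-part given names and surname-first orderings will be split incorrectly), which is why the derived values should be treated as convenience data rather than identity.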
## Import Command
Run import:
@@ -284,6 +290,7 @@ Notes:
- extraction is intentionally low-frequency and uses retries conservatively
- only public pages/endpoints should be targeted
- emitted snapshots must match the same schema consumed by `import_snapshots`
- `public_json_snapshot` uses the same required-vs-optional field contract as `SnapshotSchemaValidator`; the extractor does not add stricter required bio/physical fields of its own
- optional scheduler container runs `scripts/scheduler.sh` loop using:
  - image: `registry.younerd.org/hoopscout/scheduler:${APP_IMAGE_TAG:-latest}`
  - command: `/app/scripts/scheduler.sh`
@@ -326,6 +333,16 @@ Notes:
- public-source player bio/physical fields are often incomplete; extractor allows them to be missing and emits `null` for optional fields
- no live HTTP calls in tests; tests use fixtures/mocked responses only
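
The "emit `null` for optional fields" behaviour noted above could look roughly like the sketch below. The function name `normalize_optional_fields` and the `height_cm` / `weight_kg` field names are hypothetical placeholders, and treating blank strings as missing is an assumption, not documented extractor behaviour:

```python
import json

# Hypothetical optional-field list; the real schema may use different names.
OPTIONAL_FIELDS = ("first_name", "last_name", "height_cm", "weight_kg")


def normalize_optional_fields(raw: dict, optional_fields=OPTIONAL_FIELDS) -> dict:
    """Return a copy where absent or blank optional fields are explicit None.

    None serializes to JSON null, so every emitted record carries the same keys.
    """
    normalized = dict(raw)
    for field in optional_fields:
        value = normalized.get(field)
        # Assumption: whitespace-only strings count as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            normalized[field] = None
    return normalized


if __name__ == "__main__":
    record = normalize_optional_fields({"full_name": "Jane Doe", "height_cm": ""})
    print(json.dumps(record, indent=2))
```

Because the input is an in-memory dict, a test can feed it fixture data directly, which fits the no-live-HTTP rule for the test suite.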
## Testing
- runtime `web` image stays lean and may not include `pytest` tooling
- run tests with the development compose stack (or a dedicated test image/profile) where test dependencies are installed
- local example:
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm web pytest -q
```
## Migration and Superuser Commands
```bash