From 99a0b19bf2fc1b20104622ed572928f0b424ee34 Mon Sep 17 00:00:00 2001
From: bisco <bisco@autistici.org>
Date: Mon, 6 Apr 2026 18:54:36 +0200
Subject: [PATCH] docs: define scouting search domain baseline in ADR-0005

---
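Reviewer note (not part of the commit): section 3 of ADR-0005 says eFG% and
TS% may be calculated from source box score totals, and that age is derived
from date of birth and a reference date. The sketch below illustrates those
derivations using the conventional public formulas,
eFG% = (FGM + 0.5 * 3PM) / FGA and TS% = PTS / (2 * (FGA + 0.44 * FTA)).
The ADR leaves computation and rounding policy to a follow-up decision, so
the function names, parameter names, and the choice to return None when
there are no attempts are assumptions for illustration only.

```python
from datetime import date
from typing import Optional


def effective_fg_pct(fgm: int, fg3m: int, fga: int) -> Optional[float]:
    """eFG% = (FGM + 0.5 * 3PM) / FGA. Returns None when there are no attempts."""
    if fga <= 0:
        return None  # assumption: "not computable" is represented as None
    return (fgm + 0.5 * fg3m) / fga


def true_shooting_pct(pts: int, fga: int, fta: int) -> Optional[float]:
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)). Returns None when there are no attempts."""
    denominator = 2 * (fga + 0.44 * fta)
    if denominator <= 0:
        return None  # assumption: "not computable" is represented as None
    return pts / denominator


def age_on(date_of_birth: date, reference: date) -> int:
    """Whole years completed on the reference date."""
    had_birthday = (reference.month, reference.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return reference.year - date_of_birth.year - (0 if had_birthday else 1)
```

Returning None rather than 0.0 keeps "not computable" distinct from a real
zero, which matches the ADR's "In MVP if available/computable" handling.
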
 docs/ARCHITECTURE.md                    |   5 +-
 docs/adr/0005-scouting-search-domain.md | 110 ++++++++++++++++++++++++
 2 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 docs/adr/0005-scouting-search-domain.md

diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 1707017..1181c46 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -6,7 +6,7 @@ This document will become the central architecture overview for HoopScout v2. It
 
 ## Current Status
 
-This is still a phase-1 architecture overview, but the repository now has its first accepted concrete technical decision in `docs/adr/0001-runtime-and-development-stack.md`. Later implementation work should follow that corrected baseline unless a future ADR supersedes it.
+This architecture overview summarizes the accepted technical and domain-decision baseline for implementation. Future model, filter, and UI work should follow the current ADR set unless a later ADR supersedes it.
 
 ## Decision-Driven Development
 
@@ -25,6 +25,7 @@ The current baseline decision is:
 - `ADR-0002`: initial project structure baseline
 - `ADR-0003`: containerized developer workflow baseline
 - `ADR-0004`: configuration and environment strategy baseline
+- `ADR-0005`: scouting search-domain baseline
 
 The current baseline assumes:
 - Python 3
@@ -40,6 +41,8 @@ Future runtime and scaffolding work should also follow the developer workflow de
 
 Future scaffolding should also follow the configuration strategy defined in `docs/adr/0004-configuration-and-environment-strategy.md`, including environment-variable based configuration, a repository-owned `.env.example`, local-only secrets, and a simple initial Django settings approach unless a later ADR supersedes it.
 
+Future search model, filter, and UI implementation should follow the domain semantics defined in `docs/adr/0005-scouting-search-domain.md`, including the separation of position, role, and specialty; the MVP filter scope; and the distinction between optional and required dimensions.
+
 ## Future Sections Placeholder
 
 Future versions of this document may include sections such as:
diff --git a/docs/adr/0005-scouting-search-domain.md b/docs/adr/0005-scouting-search-domain.md
new file mode 100644
index 0000000..9215aaa
--- /dev/null
+++ b/docs/adr/0005-scouting-search-domain.md
@@ -0,0 +1,110 @@
+# ADR-0005: Scouting Search Domain Baseline
+
+## Status
+Accepted
+
+## Context
+HoopScout v2 is moving from technical foundation decisions to product-domain decisions required before model, filter, and UI implementation. The next implementation prompts need a stable and explicit scouting search vocabulary so we do not conflate objective statistics with internal scouting interpretation.
+
+The product intent is a scouting-first player search engine where users combine multiple dimensions to discover players. This is not a generic basketball encyclopedia.
+
+## Decision
+
+### 1. Search purpose
+The search system is defined as a player scouting search engine. Search dimensions are selected to support scouting workflows and shortlist creation, not comprehensive historical/statistical browsing.
+
+### 2. Domain vocabulary and separation of concerns
+Search dimensions are separated into distinct layers that must not be conflated:
+
+- Position: standard on-court category (`PG`, `SG`, `SF`, `PF`, `C`).
+- Role: tactical/scouting classification (for example `playmaker`, `3-and-D`, `point forward`, `rim protector`, `6th man`).
+- Specialty: scouting tag layer (for example `ball handling`, `off ball`, `defense`, `intangibles`, `clutch`, `post`, `dunk`).
+
+Position is a categorical descriptor, role is tactical interpretation, and specialty is a flexible tagging layer. Future implementation must preserve this separation in model semantics, filtering semantics, and UI wording.
+
+### 3. Data classification by search dimension
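+The following table classifies each search dimension by where its data comes from (source-derived, objectively derived, or app-defined) and states how the MVP handles it.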
+
+| Dimension | Classification | MVP handling |
+|---|---|---|
+| Position (`PG`/`SG`/`SF`/`PF`/`C`) | Hybrid: often source-derived, may need normalization; can be manually overridden when source is inconsistent | In MVP; filterable |
+| Role (listed tactical roles) | App-defined/internal taxonomy; assigned via internal scouting judgment; may be inferred from data but not treated as source-native fact | Not a required MVP filter; optional manual enrichment in MVP |
+| Specialty tags | App-defined/internal taxonomy; manually curated and optionally inference-assisted; not source-native | Not a required MVP filter; optional manual enrichment in MVP |
+| Age | Derived from source date-of-birth and reference date | In MVP; filterable |
+| Height | Source-derived objective attribute; normalized units required | In MVP; filterable |
+| Weight | Source-derived objective attribute; normalized units required | In MVP; filterable |
+| Wingspan | Optional source-derived or manually curated enrichment; sparse in public sources | Not required in MVP; optional if present |
+| Points per game | Source-derived objective per-game metric | In MVP; filterable |
+| Assists per game | Source-derived objective per-game metric | In MVP; filterable |
+| Steals per game | Source-derived objective per-game metric | In MVP; filterable |
+| Turnovers per game | Source-derived objective per-game metric | In MVP; filterable |
+| Blocks per game | Source-derived objective per-game metric | In MVP; filterable |
+| eFG% | Source-derived if provided, otherwise calculated from source box score totals (derived but objective) | In MVP if available/computable |
+| TS% | Source-derived if provided, otherwise calculated from source totals (derived but objective) | In MVP if available/computable |
+| Plus/minus | Optional source-derived metric whose meaning and comparability vary across competitions and sources | Deferred from MVP baseline; include only where source quality is acceptable |
+| Offensive rating | Optional source-derived metric; not universally available/consistent | Deferred from MVP baseline |
+| Defensive rating | Optional source-derived metric; not universally available/consistent | Deferred from MVP baseline |
+
+### 4. MVP search scope decision
+MVP search filters include:
+
+- position
+- per-game metrics: points, assists, steals, turnovers, blocks
+- objective personal and physical attributes that are practically available: age, height, weight
+- advanced percentages when available or reliably computable: eFG%, TS%
+
+MVP does not require role, specialty, wingspan, plus/minus, offensive rating, or defensive rating as baseline filters.
+
+These deferred dimensions are allowed as optional enrichment fields in MVP data ingestion/storage if available, but they are not required for a player to be searchable.
+
+### 5. Data realism decisions for risk-prone dimensions
+- Role: public source coverage is inconsistent and mostly interpretive; treat as internal/manual scouting classification in MVP.
+- Specialty: public source coverage is not standardized; treat as internal/manual tagging in MVP.
+- Wingspan: public coverage is sparse and uneven across leagues; treat as optional enrichment, not a required MVP field.
+
+### 6. Flexibility for future taxonomy growth
+Role and specialty taxonomies are repository-owned domain vocabularies. They must be extensible so new role labels and specialty tags can be added without changing the conceptual model.
+
+Future implementation should assume:
+- position remains a bounded standard set;
+- role remains an app-defined tactical taxonomy that can expand;
+- specialty remains an app-defined tag taxonomy that can expand.
+
+### 7. Implementation guidance for future prompts
+Future model/filter/UI prompts must assume:
+
+- Search filters are split by semantic layer: position, role, specialty, objective metrics, physical characteristics.
+- Position filters operate on normalized categorical values.
+- Role and specialty are internal taxonomy-owned dimensions and may be absent for many players in early phases.
+- Objective metrics and physical fields remain source-driven (or objectively derived from source stats) and should be treated differently from scouting classifications.
+- Optional dimensions must not block ingestion or search eligibility when missing.
+- UI wording should avoid presenting role/specialty as objective source facts.
+
+## Alternatives considered
+
+### A. Treat role and specialty as source-native fields in MVP
+Rejected. This overstates source objectivity and creates data quality risk because these are primarily scouting interpretations.
+
+### B. Require all listed dimensions in MVP filters
+Rejected. This increases delivery risk and couples MVP to low-availability fields (especially wingspan and certain advanced metrics).
+
+### C. Ignore role/specialty until much later
+Rejected. Even though role and specialty are not mandatory MVP filters, they must be conceptually defined now to avoid model ambiguity and rework.
+
+## Trade-offs
+- Pros: clear semantic boundaries, realistic MVP scope, lower ingestion risk, explicit taxonomy ownership.
+- Cons: initial MVP filter set is narrower than full scouting ambition, and some high-value scouting dimensions are manual/optional at first.
+
+## Consequences
+- Near-term implementation can proceed with objective, reliably available filters first.
+- Role/specialty workflows will require internal curation processes and possibly reviewer/admin UX later.
+- Data pipelines must support missing optional fields without breaking search.
+- Future ADRs can refine role/specialty governance and metric computation standards without replacing this baseline separation.
+
+## Follow-up decisions needed
+1. Role taxonomy governance: who can define, merge, rename, or deprecate roles.
+2. Specialty taxonomy governance: naming rules, hierarchy policy (if any), and duplicate-tag handling.
+3. Normalization standards for height/weight units and age reference-date semantics.
+4. Metric computation and rounding policy for derived advanced stats (eFG%, TS%).
+5. Source quality policy for optional advanced metrics (plus/minus, offensive/defensive rating).
+6. Curation workflow for manual scouting classifications and auditability requirements.