How we confirm the data
The trust-receipts companion to How this was built. That page sells the engineering: the architecture, the data model, the directed-AI workflow. This page answers the other question a technical reviewer asks: how do you know the data is actually right? Everything below is generated from the build, not hand-asserted. It is a point-in-time snapshot as of the latest version (2026-06-20), so the counts reflect that build and the vintage label keeps it honest.
Three signals, three honest sources: the dbt test suite (baked from the build's
run_results.json), per-measure coverage + vintage (queried live from the
platinum mart), and the cross-layer reconciliations build agents capture at
done-time (read from structured build reports, not lost prose). The discipline is
the same one /about applies to the docs: a surface that can drift will drift, so
generate the surfaces that describe the build from the build itself.
1 — The dbt test suite
Every model carries tests — primary-key uniqueness, not-null, referential
integrity back to the geography hub, accepted-value enums. They run on every
build; the result reported here is the last build's, baked at version-close from
dbt/target/run_results.json (which is build-time-only and git-ignored, so it's
captured by a generator rather than queried live: the deploy-safe pattern).
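As a rough illustration of what such a generator might do, the sketch below counts test outcomes from a parsed run_results.json payload. It relies only on the `results`, `unique_id` and `status` fields that dbt writes to that artifact. The function names, the verdict rule and the sample payload are assumptions made for this example, not the project's actual generator.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_tests(run_results: dict) -> dict:
    """Count dbt test outcomes from a parsed run_results.json payload."""
    # Keep only test nodes; a `dbt build` run lists models and seeds in the same array.
    statuses = Counter(
        r["status"]
        for r in run_results.get("results", [])
        if r.get("unique_id", "").startswith("test.")
    )
    total = sum(statuses.values())
    passing = statuses.get("pass", 0)
    return {
        "total": total,
        "passing": passing,
        # Hypothetical verdict rule: green only if at least one test ran and all passed.
        "verdict": "pass" if total > 0 and passing == total else "fail",
    }


def bake(path: str = "dbt/target/run_results.json") -> dict:
    """Read the build-time artifact and return the summary to commit."""
    return summarize_tests(json.loads(Path(path).read_text()))


# Illustrative payload only, not real build output.
sample = {
    "results": [
        {"unique_id": "model.demo.mart_place_profile", "status": "success"},
        {"unique_id": "test.demo.unique_hub_geography_id", "status": "pass"},
        {"unique_id": "test.demo.not_null_hub_geography_id", "status": "pass"},
    ]
}
summary = summarize_tests(sample)
```

Because the summary is written at build time and committed, the deployed page never needs access to the git-ignored target directory.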
2 — Cross-layer reconciliations
When a build agent finishes a feature, it records the row-count and invariant
checks it set out to prove — "this satellite has exactly the ingest's row count",
"the hub didn't grow", "suppression stayed NULL, not a fabricated 0" — as
structured {claim, expected, actual, passed} assertions in a committed build
report. These used to live as prose in checklists and evaporate; now they
accumulate here, newest first.
All 50 captured checks passed — every cross-layer assertion a build agent made held at done-time.
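One way to picture the {claim, expected, actual, passed} shape described above is the minimal sketch below. The class name, helper functions and example values are hypothetical; only the four field names come from this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Reconciliation:
    claim: str
    expected: object
    actual: object
    passed: bool


def check(claim: str, expected, actual) -> Reconciliation:
    # `passed` is derived from the comparison, never typed in by hand.
    return Reconciliation(claim, expected, actual, expected == actual)


def summarize(checks: list) -> dict:
    failed = [c.claim for c in checks if not c.passed]
    return {
        "total": len(checks),
        "passed": len(checks) - len(failed),
        "failed_claims": failed,
    }


# Illustrative values only; a real report would record measured counts.
checks = [
    check("satellite row count equals ingest row count", 1200, 1200),
    check("hub row count unchanged", 300, 300),
    check("suppressed value stayed NULL, not 0", None, None),
]
report_json = json.dumps([asdict(c) for c in checks], indent=2)
result = summarize(checks)
```

Serializing the list to JSON keeps each check machine-readable, so a page like this one can total the results instead of parsing prose.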
3 — Per-measure coverage + vintage
Each measure carries its own coverage window — a place-year appears whenever any
measure has data for it, so "how many counties does each measure actually cover,
and how current is it?" is a real question. This is queried live from
mart_place_profile: counties_covered counts the distinct counties with a non-null
value, place_years counts the non-null county-years, and latest_year is the vintage
(max(year) where the measure is non-null), derived for free, since the mart has no
dedicated vintage column.
Coverage is against 3,222 counties total. Economic and jobs measures cover ~100%; the County
Health Rankings layer (life expectancy, food insecurity, air pollution, broadband)
covers ~95–98%, missing only the smallest/suppressed counties. The measures shown
are the raw measures on the mart; their national-percentile twins
(*_pctl) are derived from these, so listing both would be redundant.
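The per-measure coverage query described in this section could look roughly like the sketch below. It uses an in-memory SQLite table as a stand-in for the real mart. The column names `county_fips` and `life_expectancy`, the allow-list and the sample rows are assumptions for illustration; only `mart_place_profile`, `year` and the three output names appear on this page.

```python
import sqlite3

# Hypothetical allow-list: the measure name is interpolated as an identifier,
# so it must never come from untrusted input.
ALLOWED_MEASURES = {"life_expectancy"}


def coverage(conn: sqlite3.Connection, measure: str) -> tuple:
    if measure not in ALLOWED_MEASURES:
        raise ValueError(f"unknown measure: {measure}")
    sql = f"""
        SELECT
            COUNT(DISTINCT CASE WHEN {measure} IS NOT NULL THEN county_fips END)
                AS counties_covered,
            COUNT({measure}) AS place_years,  -- COUNT(col) skips NULLs
            MAX(CASE WHEN {measure} IS NOT NULL THEN year END) AS latest_year
        FROM mart_place_profile
    """
    return conn.execute(sql).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mart_place_profile "
    "(county_fips TEXT, year INTEGER, life_expectancy REAL)"
)
# Illustrative rows only; NULL marks a suppressed or missing value.
conn.executemany(
    "INSERT INTO mart_place_profile VALUES (?, ?, ?)",
    [
        ("01001", 2021, 76.1),
        ("01001", 2022, 76.4),
        ("01003", 2021, 78.0),
        ("01003", 2022, None),
        ("01005", 2022, None),
    ],
)
row = coverage(conn, "life_expectancy")  # (counties_covered, place_years, latest_year)
```

In this toy table the third county has a row but no value, so it appears as a place-year without counting toward coverage, which is the distinction the section draws.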
4 — Coming: source-vs-source parity (the match contract)
The strongest trust signal is equivalence across sources: when two independent feeds describe the same place-year, do they agree within tolerance? That match-contract / migration-parity summary, the proof that the reconciliation is real and not just internally consistent, is a future feature. This page reserves its seat; no parity numbers are shown here yet (showing a number we haven't computed would be exactly the dishonesty this page exists to refuse).
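For readers wondering what "agree within tolerance" might mean in practice, here is one possible shape of such a check, sketched with Python's math.isclose. The function name, the 1% relative tolerance and the sample feeds are all hypothetical. Nothing below reflects a computed parity result or the eventual design of the feature.

```python
import math


def parity(source_a: dict, source_b: dict, rel_tol: float = 0.01) -> dict:
    """Compare two feeds keyed by (place, year) on the keys they share."""
    shared = source_a.keys() & source_b.keys()
    mismatches = sorted(
        key
        for key in shared
        if not math.isclose(source_a[key], source_b[key], rel_tol=rel_tol)
    )
    return {
        "compared": len(shared),
        "agreeing": len(shared) - len(mismatches),
        "mismatches": mismatches,
    }


# Made-up values for illustration; these are not real measurements.
feed_a = {("01001", 2022): 100.0, ("01003", 2022): 200.0, ("01005", 2022): 50.0}
feed_b = {("01001", 2022): 100.5, ("01003", 2022): 230.0}
outcome = parity(feed_a, feed_b)
```

Only place-years present in both feeds are compared, so a gap in one source shows up as lower `compared`, not as a false mismatch.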
Snapshot as of the latest version (2026-06-20). The test counts and reconciliations are from the last build; coverage is queried live against the committed data cache.
