How we confirm the data


The trust-receipts companion to How this was built. That page sells the engineering: the architecture, the data model, the directed-AI workflow. This page answers the other question a technical reviewer asks: how do you know the data is actually right? Everything below is generated from the build, not hand-asserted. It is a point-in-time snapshot as of the latest version (2026-06-20), so the counts reflect that build, and the vintage label keeps it honest.

Three signals, three honest sources: the dbt test suite (baked from the build's run_results.json), per-measure coverage + vintage (queried live from the platinum mart), and the cross-layer reconciliations that build agents capture at done-time (read from structured build reports, not lost prose). The discipline is the one /about applies to the docs: a surface that can drift, will, so generate the ones that describe the build from the build.


1 — The dbt test suite

Every model carries tests — primary-key uniqueness, not-null, referential integrity back to the geography hub, accepted-value enums. They run on every build; this is the last build's result, baked at version-close from dbt/target/run_results.json (which is build-time-only and git-ignored, so it's captured by a generator rather than queried live — the deploy-safe pattern).

dbt tests: 636 · passing: 636 · verdict: all pass

636 tests, all pass, as of the build at 2026-06-21T02:53:11Z. A green suite is the floor, not the ceiling; the reconciliations below are the row-count proofs the tests can't express.
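As an illustration of the baking step described in this section, here is a minimal sketch of a generator that reduces a parsed run_results.json to the three figures shown. It relies on the standard dbt artifact fields (`results[].unique_id`, `results[].status`, `metadata.generated_at`); the function name `summarize_run_results` and the sample payload are hypothetical and are not taken from the project's actual generator.

```python
import json
from collections import Counter


def summarize_run_results(run_results: dict) -> dict:
    """Reduce a parsed run_results.json to the counts a page like this bakes in."""
    # Keep only test nodes; dbt unique_ids for data tests start with "test.".
    tests = [r for r in run_results["results"] if r["unique_id"].startswith("test.")]
    statuses = Counter(r["status"] for r in tests)
    total = len(tests)
    passing = statuses.get("pass", 0)
    return {
        "total": total,
        "passing": passing,
        "verdict": "all pass" if total > 0 and passing == total else "failures present",
        "generated_at": run_results["metadata"]["generated_at"],
    }


# Hand-made stand-in for dbt/target/run_results.json (illustrative, not real build output).
sample = json.loads("""
{
  "metadata": {"generated_at": "2026-06-21T02:53:11Z"},
  "results": [
    {"unique_id": "model.demo.mart_place_profile", "status": "success"},
    {"unique_id": "test.demo.unique_hub_geography_id", "status": "pass"},
    {"unique_id": "test.demo.not_null_hub_geography_id", "status": "pass"}
  ]
}
""")

summary = summarize_run_results(sample)
print(summary)
```

In a real build the dictionary would be loaded from the file at version-close and the summary committed alongside the page, since the artifact itself is git-ignored.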

2 — Cross-layer reconciliations

When a build agent finishes a feature, it records the row-count and invariant checks it set out to prove — "this satellite has exactly the ingest's row count", "the hub didn't grow", "suppression stayed NULL, not a fabricated 0" — as structured {claim, expected, actual, passed} assertions in a committed build report. These used to live as prose in checklists and evaporate; now they accumulate here, newest first.


All 50 captured checks passed — every cross-layer assertion a build agent made held at done-time.
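The {claim, expected, actual, passed} shape described in this section can be sketched as a small record type. This is only an illustration under assumptions: the class name `Check`, the helper functions, and all values are hypothetical, and the real build-report schema may carry more fields.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class Check:
    claim: str
    expected: Any
    actual: Any
    passed: bool


def check(claim: str, expected: Any, actual: Any) -> Check:
    """Record one assertion; `passed` is derived, never typed in by hand."""
    return Check(claim, expected, actual, expected == actual)


def summarize(checks: list[Check]) -> str:
    failed = [c for c in checks if not c.passed]
    if not failed:
        return f"All {len(checks)} captured checks passed"
    return f"{len(failed)} of {len(checks)} captured checks failed"


# Toy values for illustration only; they are not taken from any build report.
report = [
    check("satellite has exactly the ingest's row count", 1000, 1000),
    check("the hub didn't grow", 3222, 3222),
    check("suppressed value stayed NULL, not a fabricated 0", None, None),
]

print(summarize(report))
print(asdict(report[2]))
```

Deriving `passed` from the comparison means a fabricated 0 in place of a suppressed NULL fails the check, because `0 == None` is false.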


3 — Per-measure coverage + vintage

Each measure carries its own coverage window — a place-year appears whenever any measure has data for it, so "how many counties does each measure actually cover, and how current is it?" is a real question. This is queried live from mart_place_profile: counties_covered and place_years count the non-null county-years; latest_year is the vintage (max(year) where the measure is non-null) — derived for free, since the mart has no dedicated vintage column.


Coverage is against 3,222 counties total. Economic and jobs measures cover ~100%; the County Health Rankings layer (life expectancy, food insecurity, air pollution, broadband) covers ~95–98%, missing only the smallest/suppressed counties. The shown measures are the raw measures on the mart — their national-percentile twins (*_pctl) are derived from these, so listing both would be redundant.
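The three coverage columns described in this section can be derived with a single aggregate per measure. The sketch below runs that query against an in-memory SQLite table standing in for mart_place_profile. The column names `county_fips`, `year`, and `life_expectancy`, the one-row-per-county-year grain, and all rows are assumptions for illustration; the real mart's schema and engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mart_place_profile (county_fips TEXT, year INTEGER, life_expectancy REAL)"
)
conn.executemany(
    "INSERT INTO mart_place_profile VALUES (?, ?, ?)",
    [
        ("01001", 2021, 76.1),
        ("01001", 2022, 76.3),
        ("01003", 2021, 78.0),
        ("01003", 2022, None),  # place-year exists, but this measure is missing
        ("01005", 2022, None),  # county present in the mart, never covered by this measure
    ],
)

COVERAGE_SQL = """
SELECT
    COUNT(DISTINCT CASE WHEN life_expectancy IS NOT NULL THEN county_fips END) AS counties_covered,
    COUNT(life_expectancy) AS place_years,  -- COUNT(col) skips NULLs
    MAX(CASE WHEN life_expectancy IS NOT NULL THEN year END) AS latest_year
FROM mart_place_profile
"""

counties_covered, place_years, latest_year = conn.execute(COVERAGE_SQL).fetchone()
print(counties_covered, place_years, latest_year)
```

The `latest_year` expression is the "derived for free" vintage: it needs no dedicated column, only the maximum year among rows where the measure is non-null.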


4 — Coming: source-vs-source parity (the match contract)

The strongest trust signal is equivalence across sources: when two independent feeds describe the same place-year, do they agree within tolerance? That match-contract / migration-parity summary, the proof that the reconciliation is real and not just internally consistent, is a future feature. This page reserves its seat; no parity numbers are shown here yet, because showing a number we haven't computed would be exactly the dishonesty this page exists to refuse.
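To make "agree within tolerance" concrete, here is one possible shape for such a check. It is a sketch only: the feature does not exist yet, the function name `parity`, the 1% relative tolerance, and every value are invented for illustration, and none of it is a parity result.

```python
import math

PlaceYear = tuple[str, int]


def parity(source_a: dict[PlaceYear, float], source_b: dict[PlaceYear, float],
           rel_tol: float = 0.01) -> dict:
    """Compare two feeds on the place-years they share, within a relative tolerance."""
    shared = sorted(source_a.keys() & source_b.keys())
    mismatches = [
        key for key in shared
        if not math.isclose(source_a[key], source_b[key], rel_tol=rel_tol)
    ]
    return {
        "compared": len(shared),
        "agreeing": len(shared) - len(mismatches),
        "mismatches": mismatches,
    }


# Toy feeds keyed by (county, year); values are made up.
feed_a = {("01001", 2022): 100.0, ("01003", 2022): 200.0, ("01005", 2022): 50.0}
feed_b = {("01001", 2022): 100.5, ("01003", 2022): 210.0}

result = parity(feed_a, feed_b)
print(result)
```

Only shared place-years are compared, so a place-year present in one feed alone is a coverage question, not a parity failure.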


Snapshot as of the latest version (2026-06-20). The test counts and reconciliations are from the last build; coverage is queried live against the committed data cache.