Document PSP-004 · Public
Planisphere
Reference case study · v1.0 2026-05-17
PSP-004 · Reference case study · Substrate self-validation

The Bibles fleet, projected.

A federal program office asks: given a fleet of fine-tuned variants, which are doing distinct work, which are redundant, and which dimension differentiates them most? This case study answers that question end-to-end against a real 6-subject fleet.

Internal substrate validation, not a customer deployment. The fleet measured is the Planisphere team's own reference corpus. A reader should substitute their fleet mentally — the substrate is fleet-agnostic; the pipeline that produced this bundle is the same pipeline a Phase I pilot would execute.

01 ·
The fleet.
6 subjects · 5 evaluation categories
Subject | Corpus tile
Bible-earth-codebook-v1 | Earth-codebook · designed sign-system · covenant
Bible-etruscan-v1 | Etruscan · pre-Roman Italic · cognate-failure
Bible-hermetica-v1 | Hermetic Corpus · Greek/Coptic · revelation
Bible-linear-a-v1 | Linear A · undeciphered Aegean · structure-without-key
Bible-meroitic-v1 | Meroitic · Kushite · partial-key
Bible-shotokan-v1 | Shotokan kata · Funakoshi 1922 · embodiment

Categories scored, PASS / PARTIAL / FAIL: null-handling, recall, cross-tile transfer, router-swap robustness, schema transfer.

02 ·
What the substrate resolved.
End-to-end runtime · Under one second on a laptop
Finding 01 · Redundancy
5 / 6

Distinct behavioral profiles in a 6-subject fleet.

Bible-hermetica-v1 and Bible-linear-a-v1 share byte-identical grade vectors. Either one is a dedupe candidate; retiring it loses no behavioral coverage as measured by these five categories.

Finding 02 · Effective rank
3

Independent axes of variation across 5 evaluation dimensions.

Two evaluation categories (cross-tile, router-swap) are currently saturated: every subject scores 1.0 on cross-tile and 0.5 on router-swap, so neither category discriminates across this fleet. Either the fleet is uniform on those axes or the probes are not stressing them.
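One way to flag saturated categories is to look for columns in which every subject received the same grade. The sketch below does that over the grade vectors from the cluster map in section 03; the function and variable names are hypothetical, and the substrate's own saturation test is not described in this document.

```python
CATEGORIES = ["null-handling", "recall", "cross-tile", "router-swap", "schema-transfer"]

# Grade vectors copied from the cluster map, one row per subject.
GRADE_ROWS = [
    [0.0, 1.0, 1.0, 0.5, 0.5],  # Bible-earth-codebook-v1
    [1.0, 1.0, 1.0, 0.5, 0.5],  # Bible-etruscan-v1
    [0.5, 1.0, 1.0, 0.5, 0.5],  # Bible-hermetica-v1
    [0.5, 1.0, 1.0, 0.5, 0.5],  # Bible-linear-a-v1
    [1.0, 1.0, 1.0, 0.5, 0.0],  # Bible-meroitic-v1
    [0.5, 0.5, 1.0, 0.5, 0.5],  # Bible-shotokan-v1
]


def saturated_axes(categories, rows):
    """Return the categories on which every subject received the same grade."""
    return [
        name
        for j, name in enumerate(categories)
        if len({row[j] for row in rows}) == 1
    ]


saturated = saturated_axes(CATEGORIES, GRADE_ROWS)
```

A constant column contributes zero variance, which is why five categories yield only three independent axes here.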

Finding 03 · Dominant axis
70.1%

Variance explained on PC1 — null-handling.

PC1 loads +0.935 on null-handling and −0.342 on schema-transfer, so null-handling is the dimension on which the fleet varies most. Targeted investment there is likely the fastest way to standardize the fleet.

Finding 04 · Outlier
1

Subject failing where all others are partial.

Bible-meroitic-v1 is the sole FAIL on schema-transfer. Every other subject is PARTIAL. It is a targeted re-training candidate; the rest of the fleet is consistent on this axis.
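The outlier rule stated here (one subject FAILs while every other subject shares a single grade) can be sketched as follows. The names are hypothetical and FAIL = 0.0 is an assumed encoding consistent with the cluster map; the substrate's actual outlier criterion may be broader.

```python
FAIL = 0.0  # assumed numeric encoding of a FAIL grade

CATEGORIES = ["null-handling", "recall", "cross-tile", "router-swap", "schema-transfer"]

# Grade vectors copied from the cluster map.
FLEET = {
    "Bible-earth-codebook-v1": [0.0, 1.0, 1.0, 0.5, 0.5],
    "Bible-etruscan-v1": [1.0, 1.0, 1.0, 0.5, 0.5],
    "Bible-hermetica-v1": [0.5, 1.0, 1.0, 0.5, 0.5],
    "Bible-linear-a-v1": [0.5, 1.0, 1.0, 0.5, 0.5],
    "Bible-meroitic-v1": [1.0, 1.0, 1.0, 0.5, 0.0],
    "Bible-shotokan-v1": [0.5, 0.5, 1.0, 0.5, 0.5],
}


def sole_failures(fleet, categories):
    """Find (subject, category) pairs where one subject FAILs and all others agree."""
    findings = []
    for j, category in enumerate(categories):
        failing = [name for name, grades in fleet.items() if grades[j] == FAIL]
        other_grades = {grades[j] for grades in fleet.values() if grades[j] != FAIL}
        if len(failing) == 1 and len(other_grades) == 1:
            findings.append((failing[0], category))
    return findings


outliers = sole_failures(FLEET, CATEGORIES)
```

Under this rule Bible-earth-codebook-v1's FAIL on null-handling is not flagged, because the other subjects disagree with each other on that axis, and Bible-shotokan-v1's lone PARTIAL on recall is not flagged because it is not a FAIL.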

Variance attribution across principal components

PC 1 | 70.1%
PC 2 | 18.8%
PC 3 | 11.1%
PC 4 | 0.0%
PC 5 | 0.0%

Rank-3 fleet behavior in a 5-dimensional evaluation space. Two of the five evaluation categories are saturated against this fleet. The substrate names the saturated axes as a deliverable, not a footnote.
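The variance attribution can be recomputed from the grade vectors in the cluster map with an ordinary principal component analysis. The sketch below centres each category column, builds the scatter matrix, and extracts eigenpairs by power iteration with deflation. That method is chosen only to keep the example dependency-free; the substrate's actual decomposition routine is not described in this document, and all names are hypothetical.

```python
import math

# Columns: null-handling, recall, cross-tile, router-swap, schema-transfer.
GRADE_ROWS = [
    [0.0, 1.0, 1.0, 0.5, 0.5],  # Bible-earth-codebook-v1
    [1.0, 1.0, 1.0, 0.5, 0.5],  # Bible-etruscan-v1
    [0.5, 1.0, 1.0, 0.5, 0.5],  # Bible-hermetica-v1
    [0.5, 1.0, 1.0, 0.5, 0.5],  # Bible-linear-a-v1
    [1.0, 1.0, 1.0, 0.5, 0.0],  # Bible-meroitic-v1
    [0.5, 0.5, 1.0, 0.5, 0.5],  # Bible-shotokan-v1
]


def scatter_matrix(rows):
    """Column-centred scatter matrix (covariance up to a constant factor)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centred) for j in range(d)] for i in range(d)]


def top_eigenpair(m, iters=500):
    """Power iteration on a symmetric PSD matrix: (eigenvalue, unit eigenvector)."""
    d = len(m)
    v = [1.0 / (k + 1) for k in range(d)]  # fixed start keeps runs deterministic
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left to extract
            return 0.0, [0.0] * d
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(d)) for i in range(d))
    return lam, v


def pca(rows):
    """Return [(fraction of variance, loadings), ...] in descending order."""
    m = scatter_matrix(rows)
    d = len(m)
    total = sum(m[i][i] for i in range(d))
    components = []
    for _ in range(d):
        lam, v = top_eigenpair(m)
        if max(v, key=abs) < 0:  # sign convention: largest loading is positive
            v = [-x for x in v]
        components.append((lam / total, v))
        # Deflate: remove the component just found before looking for the next.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return components


components = pca(GRADE_ROWS)
effective_rank = sum(1 for fraction, _ in components if fraction > 1e-9)
```

Under these assumptions the sketch yields 70.1%, 18.8%, 11.1%, 0.0% and 0.0%, with PC1 loadings of about +0.935 on null-handling and −0.342 on schema-transfer, which matches the figures reported here. Explained-variance fractions are ratios, so they do not depend on whether the covariance is normalised by N or N − 1.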
03 ·
Cluster map.
5 distinct profiles · sorted by population
Cluster | Grade vector [null, recall, cross, router, schema] | Subjects
01 | [0.5, 1.0, 1.0, 0.5, 0.5] | Bible-hermetica-v1 · Bible-linear-a-v1
02 | [1.0, 1.0, 1.0, 0.5, 0.5] | Bible-etruscan-v1
03 | [1.0, 1.0, 1.0, 0.5, 0.0] | Bible-meroitic-v1
04 | [0.0, 1.0, 1.0, 0.5, 0.5] | Bible-earth-codebook-v1
05 | [0.5, 0.5, 1.0, 0.5, 0.5] | Bible-shotokan-v1
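A cluster map of this kind can be produced by grouping subjects on exact equality of their grade vectors. The following is a minimal sketch with hypothetical names; ties between single-subject clusters are broken alphabetically here, which is an assumption and differs from the order shown in the table above.

```python
from collections import defaultdict

# Grade vectors in [null, recall, cross, router, schema] order, from the table above.
FLEET = {
    "Bible-earth-codebook-v1": (0.0, 1.0, 1.0, 0.5, 0.5),
    "Bible-etruscan-v1": (1.0, 1.0, 1.0, 0.5, 0.5),
    "Bible-hermetica-v1": (0.5, 1.0, 1.0, 0.5, 0.5),
    "Bible-linear-a-v1": (0.5, 1.0, 1.0, 0.5, 0.5),
    "Bible-meroitic-v1": (1.0, 1.0, 1.0, 0.5, 0.0),
    "Bible-shotokan-v1": (0.5, 0.5, 1.0, 0.5, 0.5),
}


def cluster_by_profile(fleet):
    """Group subjects with identical grade vectors, largest cluster first."""
    groups = defaultdict(list)
    for subject in sorted(fleet):
        groups[fleet[subject]].append(subject)
    # Sort by population (descending), then by first member for a stable order.
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[1][0]))


clusters = cluster_by_profile(FLEET)
redundant = [members for _, members in clusters if len(members) > 1]
```

Any cluster with more than one member is a consolidation candidate; in this fleet that is the single pair in cluster 01.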
04 ·
The attestation.
Independently recomputable · Tamper-evident
Fleet sha256 | 77576f236f03da712092e7d93e685a9e59c8c302d619ef6bef1e437319aaa8a7
Attestation root | 1cd0f2b48922b01dbcb131011fae40ac2da355a4470a89565a74802ed8a3d28d
Kernel sha | Embedded in ATTESTATION.json
Determinism property | Bit-identical canonical artifacts across runs for identical inputs and pinned code
Tamper detection | Any single-byte mutation causes verification to fail
Runtime | < 1 second for N = 6 on Contractor laptop
psp › verify <bundle>
{
  "verified": true,
  "root_match": true,
  "artifact_mismatches": []
}

Any party in possession of the 6 input files matching the fleet sha and the pinned analysis code matching the kernel sha can recompute every byte of the canonical artifacts and verify the attestation root independently. The negative case — tampering — is exercised by the substrate's own test suite.
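The recompute-and-compare idea can be illustrated with a small Merkle-root sketch over SHA-256. Everything here is an assumption made for illustration: the artifact names, the leaf and node prefixes, the sorted leaf order, and the duplicate-last-node rule for odd levels. The document does not specify how the attestation root is actually constructed, so this will not reproduce the root quoted above.

```python
import hashlib


def leaf_hash(name: str, data: bytes) -> bytes:
    # Bind each artifact's name to its content; the 0x00 prefix marks a leaf.
    return hashlib.sha256(b"\x00" + name.encode() + b"\x00" + data).digest()


def merkle_root(leaves: list[bytes]) -> str:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last node with itself
            level.append(level[-1])
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def attest(artifacts: dict[str, bytes]) -> dict:
    names = sorted(artifacts)  # fixed order keeps the root deterministic
    return {
        "artifacts": {n: hashlib.sha256(artifacts[n]).hexdigest() for n in names},
        "root": merkle_root([leaf_hash(n, artifacts[n]) for n in names]),
    }


def verify(artifacts: dict[str, bytes], attestation: dict) -> dict:
    recomputed = attest(artifacts)
    all_names = set(recomputed["artifacts"]) | set(attestation["artifacts"])
    mismatches = sorted(
        n for n in all_names
        if recomputed["artifacts"].get(n) != attestation["artifacts"].get(n)
    )
    root_match = recomputed["root"] == attestation["root"]
    return {
        "verified": root_match and not mismatches,
        "root_match": root_match,
        "artifact_mismatches": mismatches,
    }


# Hypothetical two-artifact bundle, then the same bundle with one byte changed.
bundle = {"clusters.json": b'{"clusters": 5}', "pca.json": b'{"rank": 3}'}
record = attest(bundle)
clean = verify(bundle, record)
tampered = verify({**bundle, "pca.json": b'{"rank": 4}'}, record)
```

The return shape mirrors the verify output shown above. In the tampered case both checks fail: the per-artifact digest identifies which file changed, and the root no longer matches.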

05 ·
For a federal reader.
Substituting your fleet for ours

Substitute a DoW adapter fleet for the Bibles fleet above, and the substrate answers — at the same runtime and the same cost — the same five questions, with attestable evidence:

  • Consolidation: which fine-tuned variants are behaviorally redundant and can be retired
  • Investment direction: which evaluation axis to prioritize for the next training round
  • Targeted remediation: which subject is the outlier, on what category
  • Evaluation harness audit: which probes are saturated and may need to be hardened
  • Audit-ready evidence: all of the above as a sha-pinned, Merkle-rooted, NIST AI RMF-mapped bundle, admissible under IG review
The first measurement is the fleet. A program office that runs Planisphere across 5 candidate vendors walks out with a 5-subject fleet in our attestation system. The fleet didn't exist before the measurement.