Document PSP-005 · Public
Planisphere
Reference case study · v1.0 2026-05-17
PSP-005 · Reference case study · Substrate self-validation

Twenty-eight subjects, one axis.

A federal program office asks: across a portfolio of dozens of fine-tunes of a single foundation model, which variants are doing distinct work, where is the fleet saturating, and which one fails everything? This case study answers that question end-to-end against a real 28-subject fleet ingested through the substrate's csv loader.

Companion to PSP-004 (Bibles). Same substrate, same attestation shape, different fleet — 28 subjects instead of 6, csv ingest instead of the native ingest format, three customer-defined categories instead of five. The substrate is fleet- and format-agnostic by design; these two case studies are the proof.

The fleet measured here is the v0.2 reference batch from Backstaff — the first vertical product shipped on the Planisphere substrate. Every Backstaff delivery includes a bundle in the same shape as this one.

01 ·
The fleet.
28 subjects · 3 customer-defined categories · csv ingest

Twenty-eight LoRA-style adapters fine-tuned on creator-voice corpora. Each adapter targets a single distinctive author. The fleet is a stand-in for any program office's portfolio of single-base-model fine-tunes; substitute your variants for the names below.

Category              Definition
cat_beats_base        PASS if win-rate vs. base ≥ 0.50; PARTIAL ≥ 0.20; else FAIL
cat_beats_system      PASS if win-rate vs. system-prompted base ≥ 0.50; PARTIAL ≥ 0.20; else FAIL
cat_voice_coherence   PASS if absolute win-rate ≥ 0.90; PARTIAL ≥ 0.70; else FAIL

Three categories, trinary grades, ingested through the substrate's existing csv loader. No new format. No substrate modifications. The categories are customer-defined and arbitrary — the substrate runs SVD over whatever dimensionality the input provides.
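The threshold rules in the table can be sketched as a small grading function. This is a minimal illustration, not the substrate's loader: the function names are hypothetical, and the numeric encoding (PASS = 1.0, PARTIAL = 0.5, FAIL = 0.0) is an assumption inferred from the grade vectors shown later in this document.

```python
from typing import Dict, List, Tuple

# (PASS threshold, PARTIAL threshold) per category, copied from the table above.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "cat_beats_base": (0.50, 0.20),
    "cat_beats_system": (0.50, 0.20),
    "cat_voice_coherence": (0.90, 0.70),
}

# Assumed numeric encoding of the trinary grades.
GRADE_VALUE = {"PASS": 1.0, "PARTIAL": 0.5, "FAIL": 0.0}


def grade(category: str, win_rate: float) -> str:
    """Map a win-rate to PASS / PARTIAL / FAIL for one category."""
    pass_min, partial_min = THRESHOLDS[category]
    if win_rate >= pass_min:
        return "PASS"
    if win_rate >= partial_min:
        return "PARTIAL"
    return "FAIL"


def grade_vector(win_rates: Dict[str, float]) -> List[float]:
    """Return the numeric grade vector in [base, system, voice] order."""
    return [GRADE_VALUE[grade(c, win_rates[c])] for c in THRESHOLDS]


# Hypothetical win-rates for one adapter (illustrative values only).
example = {
    "cat_beats_base": 0.62,
    "cat_beats_system": 0.55,
    "cat_voice_coherence": 0.75,
}
print(grade_vector(example))  # [1.0, 1.0, 0.5]
```

The thresholds are inclusive, matching the ≥ in the definitions, so a win-rate of exactly 0.50 against base grades as PASS.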

The 28.

Each name is a creator-voice corpus that produced one fine-tuned adapter in the v0.2 reference batch. The cluster index beside each name comes from §03: 01 is the saturated bucket (full PASS across all three categories); 07 is the catastrophic failure mode. The two outliers are labeled.

angel · cluster 02
bohr · cluster 01 · centroid
breedlove · cluster 03
campbell · cluster 03
cross · cluster 04
dalio · cluster 02
davinci · cluster 01
einstein · cluster 01
feynman · cluster 01
flw · cluster 01
ford · cluster 07 · catastrophic
franklin · cluster 01
godel · cluster 06 · near-failure
harvey · cluster 01
hermes · cluster 02
jesus · cluster 01
jobs · cluster 05
jung · cluster 01
kanye · cluster 01
mike · cluster 01
musk · cluster 01
oppenheimer · cluster 01
peterson · cluster 01
socrates · cluster 01
tesla · cluster 02
theo · cluster 01
thiel · cluster 02
virgil · cluster 01

Saturated cluster members (17 of 28) are rendered at low contrast — they live in a single behavioral bucket and any one of them substitutes for the others on the measured categories. Distinct profiles (clusters 02–05) and outliers (06–07) carry the fleet's actual variation.

02 ·
What the substrate resolved.
End-to-end runtime · Under one second on a laptop
Finding 01 · Saturation
17 / 28

Subjects scoring full PASS across all three categories.

Sixty-one percent of the fleet lands in a single [1.0, 1.0, 1.0] bucket. The probes are not currently discriminative against the upper tier. Either the fleet is genuinely uniform on these axes or the evaluation needs harder probes. The substrate names the saturated regime as a deliverable.

Finding 02 · Distinct profiles
7

Behavioral profiles across 28 subjects.

Out of 27 possible grade vectors (3³) on a three-category trinary scale, only seven are populated. Keeping one representative per profile leaves twenty-one of the 28 subjects as dedupe candidates against seven representatives. Consolidation evidence in one number.
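The counting behind this finding is simple enough to check by hand. A short sketch, using the profile populations from the cluster map in §03 and assuming the numeric encoding FAIL = 0.0, PARTIAL = 0.5, PASS = 1.0:

```python
from itertools import product

GRADES = (0.0, 0.5, 1.0)  # FAIL, PARTIAL, PASS (assumed encoding)

# Every grade vector a three-category trinary scale can produce.
possible = list(product(GRADES, repeat=3))

# Populated profiles and their subject counts, copied from the §03 cluster map.
populations = {
    (1.0, 1.0, 1.0): 17,
    (1.0, 1.0, 0.5): 5,
    (1.0, 0.5, 0.5): 2,
    (0.5, 1.0, 0.5): 1,
    (0.5, 0.5, 0.5): 1,
    (0.5, 0.0, 0.0): 1,
    (0.0, 0.0, 0.0): 1,
}

n_subjects = sum(populations.values())
n_profiles = len(populations)
# One representative is kept per profile; the rest are dedupe candidates.
dedupe_candidates = n_subjects - n_profiles

print(len(possible), n_profiles, n_subjects, dedupe_candidates)  # 27 7 28 21
```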

Finding 03 · Dominant axis
83.6%

Variance explained on PC1 — overall capability.

PC1 loads −0.45, −0.60, −0.66 across the three categories — roughly equally weighted, single-signed. The dominant axis is not category-specific; it is capable vs. not capable. The fleet's variation is one-dimensional at this resolution.

Finding 04 · Catastrophic outlier
1

Subject failing every category.

ford sits at [0.0, 0.0, 0.0]. Projected onto PC1 at +3.86 — more than ten standard deviations from the fleet centroid. Fleet-level fail mode; targeted re-training candidate. The outlier names itself.

Variance attribution across principal components

PC 1
83.6%
PC 2
10.0%
PC 3
6.4%

PC2 (10.0%) loads +0.52, +0.42, −0.74 — a voice-coherence trade-off axis. Subjects strong on competition wins but weak on absolute voice coherence sit at the positive PC2 end; the inverse at the negative end. The third component is residual.

Near rank-1 variation in a three-category space, plus a clean secondary trade-off axis. Together, PC1 and PC2 capture 93.6% of inter-subject variation in two numbers per subject.
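The variance attribution can be recomputed from the published grade vectors alone. The sketch below is not the substrate's kernel. It assumes the grade matrix is mean-centered per category, without rescaling, before the SVD, and it uses NumPy; the source states only that the substrate runs SVD over the input. Under that assumption the explained-variance shares and the PC1 loadings should agree with the figures above to within rounding. The sign of each component is arbitrary in an SVD, so loadings are compared by magnitude.

```python
import numpy as np

# Grade vectors [base, system, voice] and populations from the §03 cluster map.
CLUSTERS = [
    ((1.0, 1.0, 1.0), 17),
    ((1.0, 1.0, 0.5), 5),
    ((1.0, 0.5, 0.5), 2),
    ((0.5, 1.0, 0.5), 1),
    ((0.5, 0.5, 0.5), 1),
    ((0.5, 0.0, 0.0), 1),
    ((0.0, 0.0, 0.0), 1),
]

# Expand to one row per subject: a 28 x 3 matrix.
X = np.array([vec for vec, n in CLUSTERS for _ in range(n)])

# Assumption: center each category on the fleet mean, no per-category scaling.
Xc = X - X.mean(axis=0)

# Rows of Vt are the principal axes; squared singular values give variance shares.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
pc1, pc2 = Vt[0], Vt[1]

print("variance explained:", np.round(explained, 3))
print("PC1 loadings (sign arbitrary):", np.round(pc1, 2))
print("PC2 loadings (sign arbitrary):", np.round(pc2, 2))
```

Because only seven distinct rows exist, the decomposition is fully determined by the cluster map; no per-subject data beyond the grade vectors is needed to check the reported shares.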
03 ·
Cluster map.
7 distinct profiles · sorted by population · all members named
Profile populations (bar chart): [1.0, 1.0, 1.0] 17 · [1.0, 1.0, 0.5] 5 · [1.0, 0.5, 0.5] 2 · [0.5, 1.0, 0.5] 1 · [0.5, 0.5, 0.5] 1 · [0.5, 0.0, 0.0] 1 · [0.0, 0.0, 0.0] 1
Cluster   Grade vector [base, system, voice]   n    Members
01        [1.0, 1.0, 1.0]                      17   bohr (centroid), davinci, einstein, feynman, flw, franklin, harvey, jesus, jung, kanye, mike, musk, oppenheimer, peterson, socrates, theo, virgil
02        [1.0, 1.0, 0.5]                      5    angel, dalio, hermes, tesla, thiel
03        [1.0, 0.5, 0.5]                      2    breedlove, campbell
04        [0.5, 1.0, 0.5]                      1    cross
05        [0.5, 0.5, 0.5]                      1    jobs
06        [0.5, 0.0, 0.0]                      1    godel · near-failure
07        [0.0, 0.0, 0.0]                      1    ford · catastrophic
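A cluster map of this shape can be built by grouping csv rows on their grade vector. The sketch below is illustrative only. The column layout (a `subject` column followed by one column per category) is a hypothetical csv schema, not the substrate's documented format, and the tie-break among equally sized clusters (descending grade vector) is an assumption chosen to match the ordering in the table. The sample rows are a subset of the fleet, taken from the table above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical csv layout; grades use the assumed 1.0 / 0.5 / 0.0 encoding.
SAMPLE_CSV = """subject,cat_beats_base,cat_beats_system,cat_voice_coherence
angel,1.0,1.0,0.5
bohr,1.0,1.0,1.0
cross,0.5,1.0,0.5
dalio,1.0,1.0,0.5
davinci,1.0,1.0,1.0
ford,0.0,0.0,0.0
godel,0.5,0.0,0.0
"""


def cluster_by_grade_vector(csv_text):
    """Group subjects by identical grade vector, largest cluster first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Categories are discovered from the header, not hard-coded.
    categories = [name for name in reader.fieldnames if name != "subject"]
    groups = defaultdict(list)
    for row in reader:
        vector = tuple(float(row[c]) for c in categories)
        groups[vector].append(row["subject"])
    # Sort by population (descending), then by grade vector (descending).
    return sorted(
        groups.items(),
        key=lambda item: (-len(item[1]), [-v for v in item[0]]),
    )


clusters = cluster_by_grade_vector(SAMPLE_CSV)
for index, (vector, members) in enumerate(clusters, start=1):
    print(f"{index:02d} {list(vector)} n={len(members)} {', '.join(members)}")
```

Reading the category names from the header keeps the grouping independent of how many categories the customer defines.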

The substrate-selected centroid is bohr: it has the highest-norm grade vector and serves as the anchor for cosine similarity. Sixteen other subjects share the same grade vector; bohr is lexicographically first among them under deterministic tie-breaking.
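The selection rule described here (highest norm, ties broken by name) can be sketched in a few lines. The function name is hypothetical, and the use of the Euclidean norm is an assumption; the source says only "highest-norm". The sample is a subset of the fleet with grade vectors from the cluster table.

```python
import math
from typing import Dict, Tuple

# Subset of the fleet; grade vectors copied from the cluster table.
GRADE_VECTORS: Dict[str, Tuple[float, float, float]] = {
    "virgil": (1.0, 1.0, 1.0),
    "davinci": (1.0, 1.0, 1.0),
    "angel": (1.0, 1.0, 0.5),
    "bohr": (1.0, 1.0, 1.0),
    "ford": (0.0, 0.0, 0.0),
}


def euclidean_norm(vector: Tuple[float, ...]) -> float:
    return math.sqrt(sum(x * x for x in vector))


def select_centroid(vectors: Dict[str, Tuple[float, ...]]) -> str:
    """Pick the highest-norm subject; break ties by ascending name."""
    # Negating the norm lets a single min() apply both criteria:
    # largest norm first, then lexicographically smallest name.
    return min(vectors, key=lambda name: (-euclidean_norm(vectors[name]), name))


print(select_centroid(GRADE_VECTORS))  # bohr
```

Sorting on the name as the second key makes the result independent of input order, which is what keeps the selection deterministic across runs.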

04 ·
The attestation.
Independently recomputable · Tamper-evident
Fleet sha256           a91516d3e14835d21c0a7f32eac9d591b265a4139bd06863c96d31e8ecb6e5ca
Attestation root       408a536d9e18f09a8236a744e7c1ae5318b5115fc13a64460f610eddb7964e9a
Kernel sha             Embedded in ATTESTATION.json
Substrate version      planisphere 0.2.0
Format ingested        csv · existing loader · no substrate modifications
Determinism property   Bit-identical canonical artifacts across runs for identical inputs and pinned code
Tamper detection       A single-byte mutation causes verification to fail
Runtime                < 1 second for N = 28 on a contractor laptop
psp › measure <fleet>
[ok] resolving subjects ······························· 28
[ok] discovering categories ··························· 3
[ok] projecting onto plane ····························· ✓
[ok] variance explained on PC1 ························· 0.836
[ok] distinct profiles ································· 7
[ok] attestation root ·································· 408a536d···b7964e9a
psp › verify <bundle>
{ "verified": true, "root_match": true, "artifact_mismatches": [] }

Any party in possession of the same inputs and the same pinned analysis code can recompute every byte of the canonical artifacts and verify the attestation root independently. Tampering with any artifact causes verification to fail.
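The tamper-evidence property can be illustrated with a generic Merkle root over per-artifact sha256 digests. This is a sketch under stated assumptions, not the Planisphere attestation format: the leaf ordering (sorted by artifact name), the duplicate-last-node rule for odd levels, the artifact names, and the function names are all hypothetical. Only the shape of the verification result mirrors the `verify` output shown above.

```python
import hashlib
from typing import Dict, List


def merkle_root(leaf_hashes: List[str]) -> str:
    """Fold hex sha256 leaves pairwise into a single root (hex)."""
    if not leaf_hashes:
        raise ValueError("at least one leaf is required")
    level = [bytes.fromhex(h) for h in leaf_hashes]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def attest(artifacts: Dict[str, bytes]) -> dict:
    """Hash every artifact and commit to all of them with one root."""
    # Sorting by name keeps the root independent of insertion order.
    hashes = {n: hashlib.sha256(artifacts[n]).hexdigest() for n in sorted(artifacts)}
    return {"artifacts": hashes, "root": merkle_root(list(hashes.values()))}


def verify(artifacts: Dict[str, bytes], attestation: dict) -> dict:
    """Recompute hashes and root from the bytes and compare to the record."""
    recomputed = attest(artifacts)
    names = set(recomputed["artifacts"]) | set(attestation["artifacts"])
    mismatches = sorted(
        n for n in names
        if recomputed["artifacts"].get(n) != attestation["artifacts"].get(n)
    )
    root_match = recomputed["root"] == attestation["root"]
    return {
        "verified": root_match and not mismatches,
        "root_match": root_match,
        "artifact_mismatches": mismatches,
    }


# Hypothetical bundle contents, for illustration only.
bundle = {"clusters.json": b'{"profiles": 7}', "scores.json": b'{"pc1": 0.836}'}
record = attest(bundle)
print(verify(bundle, record)["verified"])  # True

tampered = dict(bundle, **{"scores.json": b'{"pc1": 0.837}'})  # one byte changed
print(verify(tampered, record))
```

Because the root is a hash over the artifact hashes, changing any single byte changes that artifact's digest, which changes the root; the mismatch list then names the affected artifact.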

05 ·
For a federal reader.
Substituting your portfolio for ours

PSP-004 (Bibles) demonstrated the substrate on a small fleet with a rich five-category evaluation. PSP-005 demonstrates the same substrate at scale on a different fleet shape — twenty-eight subjects, three customer-defined categories, csv input. Together the two case studies answer five governance questions a procurement office actually asks, with attestable evidence:

  • Consolidation: how many distinct behavioral profiles live in a portfolio of N fine-tunes (PSP-005: 7 of 28 — 21 dedupe candidates)
  • Saturation: which evaluation categories are no longer discriminative against the upper tier of the fleet (PSP-005: 61% of subjects converge to the saturated bucket)
  • Investment direction: which dimension explains most of the inter-subject variation (PSP-005: a single capability axis at 83.6%; PSP-004: null-handling at 70.1%)
  • Targeted remediation: which subject is the catastrophic outlier and what is the fail signature (PSP-005: ford at [0,0,0]; PSP-004: meroitic on schema transfer)
  • Audit-ready evidence: all of the above as a sha-pinned, Merkle-rooted, NIST AI RMF-mapped bundle, admissible under IG review
Substrate-, format-, and domain-agnostic. The csv ingest of a creator-voice fleet runs the same kernel as the native-format ingest of a cipher-adapter fleet. The substrate makes no assumptions about what a subject is.