A federal program office asks: across a portfolio of dozens of fine-tunes of a single foundation model, which variants are doing distinct work, where is the fleet saturating, and which one fails everything? This case study answers that question end-to-end against a real 28-subject fleet ingested through the substrate's csv loader.
Companion to PSP-004 (Bibles). Same substrate, same attestation shape, different fleet — 28 subjects instead of 6, csv ingest instead of the native ingest format, three customer-defined categories instead of five. The substrate is fleet- and format-agnostic by design; these two case studies are the proof.
The fleet measured here is the v0.2 reference batch from Backstaff — the first vertical product shipped on the Planisphere substrate. Every Backstaff delivery includes a bundle in the same shape as this one.
Twenty-eight LoRA-style adapters fine-tuned on creator-voice corpora. Each adapter targets a single distinctive author. The fleet is a stand-in for any program office's portfolio of single-base-model fine-tunes; substitute your variants for the names below.
| Category | Definition |
|---|---|
| cat_beats_base | PASS if win-rate vs. base ≥ 0.50; PARTIAL if ≥ 0.20; else FAIL |
| cat_beats_system | PASS if win-rate vs. system-prompted base ≥ 0.50; PARTIAL if ≥ 0.20; else FAIL |
| cat_voice_coherence | PASS if absolute win-rate ≥ 0.90; PARTIAL if ≥ 0.70; else FAIL |
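The thresholds in the category table above can be sketched as a small grading function. This is a minimal illustration, not the substrate's loader: the function names are hypothetical, and the numeric encoding (PASS = 1.0, PARTIAL = 0.5, FAIL = 0.0) is an assumption inferred from the grade vectors reported in this case study.

```python
from typing import Dict, List, Tuple

# (pass_at, partial_at) thresholds copied from the category table.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "cat_beats_base": (0.50, 0.20),
    "cat_beats_system": (0.50, 0.20),
    "cat_voice_coherence": (0.90, 0.70),
}

# Assumed numeric encoding of the trinary grades.
GRADE_VALUE = {"PASS": 1.0, "PARTIAL": 0.5, "FAIL": 0.0}


def grade(category: str, win_rate: float) -> str:
    """Map a win-rate to PASS / PARTIAL / FAIL for one category."""
    pass_at, partial_at = THRESHOLDS[category]
    if win_rate >= pass_at:
        return "PASS"
    if win_rate >= partial_at:
        return "PARTIAL"
    return "FAIL"


def grade_vector(win_rates: Dict[str, float]) -> List[float]:
    """Return the [base, system, voice] grade vector for one subject."""
    return [GRADE_VALUE[grade(cat, win_rates[cat])] for cat in THRESHOLDS]


# Hypothetical win-rates for one adapter (not taken from the fleet data).
example = {
    "cat_beats_base": 0.62,
    "cat_beats_system": 0.55,
    "cat_voice_coherence": 0.81,
}
print(grade_vector(example))
```

With these made-up win-rates the subject passes both competition categories and lands at PARTIAL on voice coherence, which is the grade vector of cluster 02.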
Three categories, trinary grades, ingested through the substrate's existing csv loader. No new format. No substrate modifications. The categories are customer-defined and arbitrary — the substrate runs SVD over whatever dimensionality the input provides.
Each name is a creator-voice corpus that produced one fine-tuned adapter in the v0.2 reference batch. The number under each name is the cluster index from §03: 01 is the saturated bucket (full PASS across all three categories); 07 is the catastrophic failure mode. Outliers shown in signal red.
Saturated cluster members (17 of 28) are rendered at low contrast — they live in a single behavioral bucket and any one of them substitutes for the others on the measured categories. Distinct profiles (clusters 02–05) and outliers (06–07) carry the fleet's actual variation.
Sixty-one percent of the fleet lands in a single [1.0, 1.0, 1.0] bucket. The probes are not currently discriminative against the upper tier. Either the fleet is genuinely uniform on these axes or the evaluation needs harder probes. The substrate names the saturated regime as a deliverable.
Of the 27 possible grade vectors on a three-category trinary scale (3³), only seven are populated. Keeping one representative per populated vector leaves seven subjects; the other twenty-one are dedupe candidates. Consolidation evidence in one number.
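The counts quoted here follow directly from the cluster sizes reported in this case study. A minimal sketch of the arithmetic, with variable names chosen for illustration only:

```python
from itertools import product

# Grade vectors [base, system, voice] and member counts from the cluster table.
CLUSTERS = {
    (1.0, 1.0, 1.0): 17,
    (1.0, 1.0, 0.5): 5,
    (1.0, 0.5, 0.5): 2,
    (0.5, 1.0, 0.5): 1,
    (0.5, 0.5, 0.5): 1,
    (0.5, 0.0, 0.0): 1,
    (0.0, 0.0, 0.0): 1,
}

fleet_size = sum(CLUSTERS.values())
# Every combination of three trinary grades: 3 ** 3 vectors.
possible = len(list(product((0.0, 0.5, 1.0), repeat=3)))
populated = len(CLUSTERS)
# Subjects left over after keeping one representative per populated vector.
dedupe_candidates = fleet_size - populated
saturated_share = CLUSTERS[(1.0, 1.0, 1.0)] / fleet_size

print(fleet_size, possible, populated, dedupe_candidates)
print(f"saturated share: {saturated_share:.1%}")
```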
PC1 loads −0.45, −0.60, −0.66 across the three categories — roughly equally weighted, single-signed. The dominant axis is not category-specific; it is capable vs. not capable. The fleet's variation is one-dimensional at this resolution.
ford sits at [0.0, 0.0, 0.0]. Projected onto PC1 at +3.86 — more than ten standard deviations from the fleet centroid. Fleet-level fail mode; targeted re-training candidate. The outlier names itself.
PC2 (10.0%) loads +0.52, +0.42, −0.74 — a voice-coherence trade-off axis. Subjects strong on competition wins but weak on absolute voice coherence sit at the positive PC2 end; the inverse at the negative end. The third component is residual.
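The component loadings can be approximated from the published grade vectors alone. The sketch below assumes the substrate mean-centers the 28×3 grade matrix without rescaling the columns and then takes an SVD; that preprocessing is an assumption, not something the case study states. Under that assumption the loadings should land close to the figures quoted above. The sign of each singular vector is arbitrary, so the code orients them to match the signs used in the text.

```python
import numpy as np

# (grade vector [base, system, voice], member count) from the cluster table.
rows = [
    ([1.0, 1.0, 1.0], 17),
    ([1.0, 1.0, 0.5], 5),
    ([1.0, 0.5, 0.5], 2),
    ([0.5, 1.0, 0.5], 1),
    ([0.5, 0.5, 0.5], 1),
    ([0.5, 0.0, 0.0], 1),
    ([0.0, 0.0, 0.0], 1),
]

# Expand to one row per subject: a 28 x 3 matrix.
X = np.array([vec for vec, n in rows for _ in range(n)], dtype=float)

# Assumption: center each column, no per-column scaling.
Xc = X - X.mean(axis=0)

# Rows of Vt are the principal directions (loadings).
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)

# SVD signs are arbitrary; orient to the convention used in the text.
pc1 = Vt[0] * -np.sign(Vt[0][0])  # base loading negative
pc2 = Vt[1] * -np.sign(Vt[1][2])  # voice loading negative

print("PC1 loadings:", np.round(pc1, 2))
print("PC2 loadings:", np.round(pc2, 2))
print("explained variance:", np.round(explained, 3))
```

Because every PC1 loading carries the same sign, a subject's PC1 score is essentially a weighted sum of its three grades, which is why the axis reads as capable vs. not capable.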
| Cluster | Grade vector [base, system, voice] | n | Members |
|---|---|---|---|
| 01 | [1.0, 1.0, 1.0] | 17 | bohr (centroid), davinci, einstein, feynman, flw, franklin, harvey, jesus, jung, kanye, mike, musk, oppenheimer, peterson, socrates, theo, virgil |
| 02 | [1.0, 1.0, 0.5] | 5 | angel, dalio, hermes, tesla, thiel |
| 03 | [1.0, 0.5, 0.5] | 2 | breedlove, campbell |
| 04 | [0.5, 1.0, 0.5] | 1 | cross |
| 05 | [0.5, 0.5, 0.5] | 1 | jobs |
| 06 | [0.5, 0.0, 0.0] | 1 | godel · near-failure |
| 07 | [0.0, 0.0, 0.0] | 1 | ford · catastrophic |
The substrate-selected centroid is bohr: the highest-norm grade vector and the anchor for cosine similarity. Sixteen other subjects share the same grade vector; bohr is the lexicographically first among them under deterministic tie-breaking.
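The selection rule described above (highest norm, then lexicographic order on the name) can be sketched in a few lines. The function names are hypothetical and only a subset of the fleet is shown. Returning `None` for an all-zero vector is a choice made for this sketch, since cosine similarity is undefined there; how the substrate treats that case is not stated.

```python
import math
from typing import Dict, List, Optional


def norm(v: List[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def cosine(a: List[float], b: List[float]) -> Optional[float]:
    """Cosine similarity; None when either vector is all zeros."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return None
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def select_centroid(grades: Dict[str, List[float]]) -> str:
    """Highest-norm grade vector wins; ties go to the lexicographically first name."""
    return min(grades, key=lambda name: (-norm(grades[name]), name))


# Subset of the fleet, grade vectors taken from the cluster table.
grades = {
    "virgil": [1.0, 1.0, 1.0],
    "davinci": [1.0, 1.0, 1.0],
    "bohr": [1.0, 1.0, 1.0],
    "thiel": [1.0, 1.0, 0.5],
    "godel": [0.5, 0.0, 0.0],
    "ford": [0.0, 0.0, 0.0],
}

anchor = select_centroid(grades)
print("centroid:", anchor)
for name, vec in sorted(grades.items()):
    print(name, cosine(vec, grades[anchor]))
```

Sorting on a `(-norm, name)` key keeps the tie-break deterministic regardless of the order in which subjects were ingested.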
a91516d3e14835d21c0a7f32eac9d591b265a4139bd06863c96d31e8ecb6e5ca408a536d9e18f09a8236a744e7c1ae5318b5115fc13a64460f610eddb7964e9a
ATTESTATION.json
csv · existing loader · no substrate modifications
Any party in possession of the same inputs and the same pinned analysis code can recompute every byte of the canonical artifacts and verify the attestation root independently. Tampering with any artifact causes verification to fail.
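Independent verification of this kind usually means hashing each artifact and then combining the digests into a single root. The sketch below illustrates the idea only. The layout of ATTESTATION.json, the hash function, and the rule for combining digests are not given in this case study, so SHA-256 and the sorted `name:digest` scheme used here are assumptions, as are the function and file names.

```python
import hashlib
from typing import Dict


def artifact_digest(data: bytes) -> str:
    """Hex digest of one artifact's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def attestation_root(artifacts: Dict[str, bytes]) -> str:
    """Hypothetical root: hash of 'name:digest' lines in sorted name order."""
    h = hashlib.sha256()
    for name in sorted(artifacts):
        line = f"{name}:{artifact_digest(artifacts[name])}\n"
        h.update(line.encode("utf-8"))
    return h.hexdigest()


def verify(artifacts: Dict[str, bytes], expected_root: str) -> bool:
    """True only if the recomputed root matches the published one."""
    return attestation_root(artifacts) == expected_root


# Made-up artifact contents for illustration.
artifacts = {
    "clusters.json": b'{"01": 17, "02": 5}',
    "grades.csv": b"subject,base,system,voice\nbohr,1.0,1.0,1.0\n",
}
root = attestation_root(artifacts)

tampered = dict(artifacts)
tampered["grades.csv"] = b"subject,base,system,voice\nbohr,1.0,1.0,0.5\n"

print(verify(artifacts, root))  # recomputation from identical inputs
print(verify(tampered, root))   # one changed byte changes the root
```

Sorting the artifact names before hashing makes the root independent of the order in which files are read, which is what lets a second party reproduce it from the same inputs.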
PSP-004 (Bibles) demonstrated the substrate on a small fleet with a rich five-category evaluation. PSP-005 demonstrates the same substrate at scale on a different fleet shape — twenty-eight subjects, three customer-defined categories, csv input. Together the two case studies answer five governance questions a procurement office actually asks, with attestable evidence:
ford at [0,0,0]; PSP-004: meroitic on schema transfer)