PLAYFAIR

A 50-epoch, 3-region, 6-agent tripartite-game test that exercises the entire ECCA stack — chains, services, contracts, and cross-region latency — on a single laptop or in CI.

What is Playfair?

Playfair is ECCA's system-level integration test. It spins up a real k3d Kubernetes cluster with one server node and three agent nodes — one per simulated region. Each region is given a different cost profile, and a different per-agent budget for compute / storage / bandwidth tokens:

| Region | Cheap | Expensive | Budget (C / S / B) |
|---|---|---|---|
| region-storage | storage | compute | 200 / 1000 / 400 |
| region-compute | compute | storage | 1000 / 200 / 400 |
| region-bandwidth | bandwidth | compute & storage | 200 / 200 / 1000 |

Six agents (two per region) each have a personality — perceive rate, store rate, route rate, sleeve kind, and a Coherence Profile Vector — chosen to either match or fight their region's cost profile. A scripted storyline of spot preemptions, drift spikes, and residue injections forces cross-region migrations ("needlecasts") at fixed epochs.

After every epoch the orchestrator audits the TripartiteGame's allocation accounting and asserts that every agent stayed within its per-region budget. The test passes only if all 50 epochs verify fair.
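
The budget check described above can be pictured with a small sketch. The budgets are copied from the region table; the `Spend` record shape and the function names are assumptions made for illustration and are not the orchestrator's or the TripartiteGame contract's actual interface. Whether budgets apply per epoch or cumulatively is not stated here, so the sketch simply compares whatever spend it is given.

```typescript
// Hypothetical types: the real orchestrator/contract schema is not shown in this document.
type Region = "storage" | "compute" | "bandwidth";
interface Tokens { compute: number; storage: number; bandwidth: number }

// Budgets (C / S / B) copied from the region table.
const BUDGETS: Record<Region, Tokens> = {
  storage:   { compute: 200,  storage: 1000, bandwidth: 400 },
  compute:   { compute: 1000, storage: 200,  bandwidth: 400 },
  bandwidth: { compute: 200,  storage: 200,  bandwidth: 1000 },
};

// One agent's token spend inside one region (assumed record shape).
interface Spend { agent: string; region: Region; spent: Tokens }

// True when no token class exceeds the region's budget.
function withinBudget(s: Spend): boolean {
  const b = BUDGETS[s.region];
  return (
    s.spent.compute <= b.compute &&
    s.spent.storage <= b.storage &&
    s.spent.bandwidth <= b.bandwidth
  );
}

// An epoch is fair only if every spend record is within budget.
function isEpochFair(spends: Spend[]): boolean {
  return spends.every(withinBudget);
}

// Illustrative values only, not taken from a real run.
const sample: Spend[] = [
  { agent: "agent-a", region: "compute", spent: { compute: 300, storage: 20, bandwidth: 10 } },
  { agent: "agent-b", region: "storage", spent: { compute: 250, storage: 40, bandwidth: 5 } },
];
console.log(isEpochFair(sample)); // false: agent-b spent 250 compute against a budget of 200
```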

Why does it exist?

Unit tests catch regressions in individual contracts and packages. The compose-based E2E suite catches regressions in the happy-path API surface. Playfair is different: it forces the system to operate under realistic adversarial pressure:

If a refactor accidentally breaks how the bus replicates across namespaces, how the EpochAnchor contract verifies continuity, or how the Prisma client resolves its native engine inside Alpine, Playfair fails before the change merges.

When does it run?

What it produces

Each run produces a deterministic set of artifacts under tests/playfair/results/:

What it represents

The output is an honest record of the system running for ~52 seconds of in-cluster orchestrator time (after a ~3-minute warm-up of cluster + chains + contracts), exercising:

What the result means

| Verdict | What it means | Action |
|---|---|---|
| FAIR | Every epoch's per-region allocation respected its budget. The protocol's accounting matches reality. | Ship it. |
| UNFAIR | One or more epochs over-spent. Either an agent leaked tokens, a contract under-counted, or a service double-billed. | The report's "unfair epochs" list pin-points the failures. Read the per-region service logs in results/. |
| 0 needlecasts | The scripted scenario events failed to fire (likely the orchestrator never connected to siyana-api). | Check that all siyana-api pods are Ready in kubectl get pods -A. |
| Crashed pods | If thalamus-router or siyana-api are in CrashLoopBackOff, the run will time out instead of producing UNFAIR. | Check Prisma binary mismatches (Alpine openssl version) and per-region NATS consumer naming. |

The verdict only tests one property. "All epochs verified fair" means the allocation accounting holds; it does not prove that latency was actually applied, that all 6 agents stayed alive, or that the chain produced blocks. Read the timeline chart, the agent sparklines, and the needlecast log to verify the run actually exercised the system.
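
The extra checks suggested above could be automated along these lines. This is a sketch under assumptions: the `RunSummary` fields, the function name, and the idle tolerance of 10 epochs are all hypothetical and do not reflect the real shape of playfair-results.json.

```typescript
// Hypothetical summary shape; the real results schema is not documented here.
interface RunSummary {
  epochs: number;
  allEpochsFair: boolean;
  needlecasts: number;
  scenariosFired: number;
  scenariosExpected: number;
  // Epoch of each agent's last recorded action.
  lastActiveEpoch: Record<string, number>;
}

// Returns human-readable warnings for runs that were FAIR but may not have exercised the system.
function sanityWarnings(run: RunSummary, idleTolerance: number = 10): string[] {
  const warnings: string[] = [];
  if (run.needlecasts === 0) {
    warnings.push("0 needlecasts: scripted scenario events may not have fired");
  }
  if (run.scenariosFired < run.scenariosExpected) {
    warnings.push(`only ${run.scenariosFired} of ${run.scenariosExpected} scenario events fired`);
  }
  Object.keys(run.lastActiveEpoch).forEach((agent) => {
    const last = run.lastActiveEpoch[agent];
    if (run.epochs - last > idleTolerance) {
      warnings.push(`${agent} has been idle since epoch ${last}`);
    }
  });
  return warnings;
}

// Illustrative values only.
const example: RunSummary = {
  epochs: 50,
  allEpochsFair: true,
  needlecasts: 33,
  scenariosFired: 9,
  scenariosExpected: 9,
  lastActiveEpoch: { "Router-Nexus": 50, "Archivist-Beta": 20 },
};
console.log(sanityWarnings(example)); // [ 'Archivist-Beta has been idle since epoch 20' ]
```

A FAIR verdict with a non-empty warning list is a prompt to open the report, not a failure in itself.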

How conditions vary

The structural shape of the test (6 agents, 3 regions, 50 epochs, fixed scenario events) is deterministic. What varies between runs:

Real test runs (verified)

Both runs below were executed locally on a MacBook Pro (Apple Silicon, Docker Desktop 4.x, k3d v5.8.3, Terraform 1.5.7, Node 20.x) against commit 618eac8. The complete artifacts are linked from each card. Both runs verified FAIR on all 50 epochs with zero unfair epochs.

RUN A · FROM-SCRATCH
2026-05-09 · 13:36:25 → 13:37:17 UTC · orchestrator 56 s · apply ≈ 8 min

Epochs: 50 / 50 fair
Perceptions: 152
Stores: 43
Routes: 31
Migrations: 31
Residues: 1 (resolved)
Scenarios: 9 / 9 fired
Verdict: FAIR

First end-to-end run after terraform apply -auto-approve -var skip_images=true against a freshly created k3d cluster (1 server + 3 agents).

RUN B · REPRODUCED
2026-05-09 · 13:47:13 → 13:48:05 UTC · orchestrator 51.7 s

Epochs: 50 / 50 fair
Perceptions: 152
Stores: 49
Routes: 33
Syncs: 24
Migrations: 33
Residues: 1 (resolved)
Verdict: FAIR

Re-run with rebuilt ecca-playfair-orchestrator:local to validate the new env metadata block. Same cluster, fresh orchestrator job.
→ View Run B report

Per-agent activity (Run B)

Six agents, two per region, each with a sleeve kind that interacts differently with its region's cost profile. Token columns are cumulative burn over 50 epochs:

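| Agent | Home region | Sleeve | Perceive | Store | Route | Sync | Compute tok | Storage tok | Bandwidth tok | Final drift |
|---|---|---|---|---|---|---|---|---|---|---|
| Archivist-Alpha | storage | memory | 13 | 12 | 3 | 1 | 52 | 36 | 15.7 | 8.0 |
| Archivist-Beta | storage | human | 36 | 20 | 1 | 7 | 75 | 60 | 5.2 | 10.0 |
| Inference-Prime | compute | ai | 41 | 6 | 2 | 8 | 328 | 18 | 12.2 | 6.0 |
| Inference-Echo | compute | ai | 36 | 4 | 5 | 6 | 288 | 12 | 26.7 | 6.0 |
| Router-Nexus | bandwidth | mining | 13 | 3 | 12 | 1 | 52 | 9 | 61.2 | 8.0 |
| Router-Sentinel | bandwidth | memory | 13 | 4 | 10 | 1 | 52 | 12 | 51.0 | 8.0 |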

What this means: each region's "expected" specialty shows up in the token columns. The AI sleeves in compute burn the most compute tokens (328 and 288); the mining and memory sleeves in bandwidth burn the most bandwidth tokens (61.2 and 51.0). The Archivists in storage burn 36 + 60 = 96 storage tokens against the region budget of 1000. No agent exceeded its per-region budget on any epoch, which is what the FAIR verdict tracks.
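
One quick consistency check is to sum the per-agent activity columns and compare them with the Run B card. The sketch below transcribes the four activity columns from the per-agent table; the `AgentRow` type and variable names are illustrative only.

```typescript
// Activity counts transcribed from the per-agent table (Run B).
interface AgentRow { agent: string; perceive: number; store: number; route: number; sync: number }

const rows: AgentRow[] = [
  { agent: "Archivist-Alpha", perceive: 13, store: 12, route: 3,  sync: 1 },
  { agent: "Archivist-Beta",  perceive: 36, store: 20, route: 1,  sync: 7 },
  { agent: "Inference-Prime", perceive: 41, store: 6,  route: 2,  sync: 8 },
  { agent: "Inference-Echo",  perceive: 36, store: 4,  route: 5,  sync: 6 },
  { agent: "Router-Nexus",    perceive: 13, store: 3,  route: 12, sync: 1 },
  { agent: "Router-Sentinel", perceive: 13, store: 4,  route: 10, sync: 1 },
];

// Column totals across all six agents.
const totals = rows.reduce(
  (acc, r) => ({
    perceive: acc.perceive + r.perceive,
    store: acc.store + r.store,
    route: acc.route + r.route,
    sync: acc.sync + r.sync,
  }),
  { perceive: 0, store: 0, route: 0, sync: 0 }
);

console.log(totals); // { perceive: 152, store: 49, route: 33, sync: 24 }
```

These totals agree with the Run B card (Perceptions 152, Stores 49, Routes 33, Syncs 24).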

Scenario events (deterministic, fired at fixed epochs)

Nine scripted events drive cross-region migrations and residue handling so the test exercises the migration paths even on a quiet random seed:

| Epoch | Type | Agent | What happened |
|---|---|---|---|
| 5 | spot-preemption | Inference-Prime | Spot instance pulled from compute; agent must needlecast to storage. |
| 8 | respawn | Inference-Prime | Re-sleeves in storage region (expensive compute, cheap storage). |
| 15 | needlecast | Inference-Prime | Spot instance back; needlecasts storage → compute at cost 7.1 RoutingToken. |
| 20 | drift-spike | Archivist-Beta | Human agent goes idle 5 epochs; drift accumulates to 16. |
| 25 | sync-recovery | Archivist-Beta | Returns and burns 4.5 SyncToken to reset drift. |
| 30 | residue-inject | (bandwidth) | Simulated shard-loss; first responder earns the bounty. |
| 30 | residue-resolved | Router-Sentinel | Detected and resolved within 1.14 s; payout = 15 ResidueToken. |
| 35 | needlecast | Inference-Echo | Migrates compute → bandwidth for cheaper routing during high-needlecast phase. |
| 40 | epoch-surge | (all) | All agents perceive at max rate for 5 epochs (stress test). |
| 45 | needlecast | Inference-Echo | Returns to compute as surge subsides. |

Cross-region migrations (33 in Run B)

Each row is a needlecast: an agent's identity + sleeve state migrates from one region's stack to another, paying RoutingToken proportional to shard count and inter-region latency. The first 8 of 33 are shown — see the full Needlecast log in the report.

epoch  agent                from        to          shards  cost
─────  ───────────────────  ──────────  ──────────  ──────  ─────
   11  Router-Nexus         bandwidth   storage          1   5.1
   11  Router-Sentinel      bandwidth   storage          1   5.1
   12  Router-Sentinel      storage     compute          1   5.1
   13  Router-Nexus         storage     bandwidth        1   5.1
   14  Inference-Prime      storage     bandwidth        1   5.1
   15  Inference-Prime      bandwidth   compute          1   7.1   ← scenario-driven (epoch-15)
   16  Router-Nexus         bandwidth   storage          1   5.1
   18  Router-Nexus         storage     compute          1   5.1
   …    (25 more)

Why the cost varies: the storage↔bandwidth path has the highest injected latency (75 ± 12 ms one-way), so its cost coefficient is higher (7.1 vs 5.1 for the cheaper paths). The cost is verified on-chain by the NeedlecastRouter contract and burned from the agent's BandwidthToken balance.
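
To see how the logged costs add up per agent, the eight rows shown above can be grouped and summed. This is only a sketch over the visible excerpt (25 rows are elided), and the `Needlecast` type and `costByAgent` helper are illustrative names, not part of the orchestrator.

```typescript
// The eight needlecast rows shown in the log excerpt.
interface Needlecast { epoch: number; agent: string; from: string; to: string; shards: number; cost: number }

const shown: Needlecast[] = [
  { epoch: 11, agent: "Router-Nexus",    from: "bandwidth", to: "storage",   shards: 1, cost: 5.1 },
  { epoch: 11, agent: "Router-Sentinel", from: "bandwidth", to: "storage",   shards: 1, cost: 5.1 },
  { epoch: 12, agent: "Router-Sentinel", from: "storage",   to: "compute",   shards: 1, cost: 5.1 },
  { epoch: 13, agent: "Router-Nexus",    from: "storage",   to: "bandwidth", shards: 1, cost: 5.1 },
  { epoch: 14, agent: "Inference-Prime", from: "storage",   to: "bandwidth", shards: 1, cost: 5.1 },
  { epoch: 15, agent: "Inference-Prime", from: "bandwidth", to: "compute",   shards: 1, cost: 7.1 },
  { epoch: 16, agent: "Router-Nexus",    from: "bandwidth", to: "storage",   shards: 1, cost: 5.1 },
  { epoch: 18, agent: "Router-Nexus",    from: "storage",   to: "compute",   shards: 1, cost: 5.1 },
];

// Sum the cost column per agent.
function costByAgent(rows: Needlecast[]): Record<string, number> {
  const sums: Record<string, number> = {};
  rows.forEach((r) => {
    sums[r.agent] = (sums[r.agent] || 0) + r.cost;
  });
  return sums;
}

const sums = costByAgent(shown);
Object.keys(sums).forEach((agent) => {
  // toFixed(1) hides floating-point noise from adding decimals like 5.1.
  console.log(agent, sums[agent].toFixed(1));
});
// Router-Nexus 20.4
// Router-Sentinel 10.2
// Inference-Prime 12.2
```

Inference-Prime's two needlecasts sum to 12.2, which matches its cumulative bandwidth burn in the per-agent table; the Router agents have further migrations in the elided rows, so their partial sums here are lower than their table totals.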

Allocation audit (every epoch)

After every epoch tick the orchestrator calls the TripartiteGame contract's auditEpoch(epochNumber) view function, which returns whether per-region token sums for the closing epoch respected the per-region budgets. A single failure flips the run verdict to UNFAIR.

{ "epoch":  1, "fair": true, "ts": "2026-05-09T13:47:15.546Z" }
{ "epoch":  2, "fair": true, "ts": "2026-05-09T13:47:16.543Z" }
{ "epoch":  3, "fair": true, "ts": "2026-05-09T13:47:17.514Z" }
…
{ "epoch": 48, "fair": true, "ts": "2026-05-09T13:48:02.573Z" }
{ "epoch": 49, "fair": true, "ts": "2026-05-09T13:48:03.565Z" }
{ "epoch": 50, "fair": true, "ts": "2026-05-09T13:48:04.709Z" }

summary.allEpochsFair = true
summary.unfairEpochs  = []
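
The summary fields can be derived from the per-epoch audit lines roughly as follows. This is a minimal sketch that assumes one JSON object per line in the format shown above; `summarizeAudit` and the `complete` field are hypothetical names, and the real orchestrator may compute its summary differently.

```typescript
// One audit line, in the shape shown in the log excerpt.
interface AuditEntry { epoch: number; fair: boolean; ts: string }

interface AuditSummary { allEpochsFair: boolean; unfairEpochs: number[]; complete: boolean }

// Parses JSON lines, skipping anything that is not a JSON object (blank lines, "…").
function summarizeAudit(lines: string[], expectedEpochs: number): AuditSummary {
  const entries: AuditEntry[] = lines
    .filter((l) => l.trim().charAt(0) === "{")
    .map((l) => JSON.parse(l) as AuditEntry);
  const unfairEpochs = entries.filter((e) => !e.fair).map((e) => e.epoch);
  return {
    allEpochsFair: unfairEpochs.length === 0,
    unfairEpochs,
    // Guards against a vacuous pass when the run stopped early (an added assumption).
    complete: entries.length === expectedEpochs,
  };
}

const demo = summarizeAudit(
  [
    '{ "epoch":  1, "fair": true, "ts": "2026-05-09T13:47:15.546Z" }',
    '{ "epoch":  2, "fair": true, "ts": "2026-05-09T13:47:16.543Z" }',
    '{ "epoch":  3, "fair": true, "ts": "2026-05-09T13:47:17.514Z" }',
  ],
  3
);
console.log(demo); // { allEpochsFair: true, unfairEpochs: [], complete: true }
```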

Residue handling

At epoch 30 the scenario script injects a shard-loss residue in the bandwidth region. The residue carries a bountyEstimate; the first sleeve to detect-and-resolve it claims the payout.

{
  "epoch":          30,
  "kind":           "shard-loss",
  "region":         "bandwidth",
  "ts":             "2026-05-09T13:47:44.445Z",
  "resolved":       true,
  "bountyEstimate": 15,
  "resolvedAt":     "2026-05-09T13:47:45.587Z",
  "resolver":       "Router-Sentinel",
  "payout":         15
}

What this proves: the residue economy works end-to-end — injection on a real chain, detection by a sleeve in the affected region, on-chain proof submission, ResidueToken minted to the resolver. Latency from injection to resolution: 1.14 s, well within the 4-second epoch tick.
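
The 1.14 s figure follows directly from the two timestamps in the residue record. A small sketch of that arithmetic, where the function name is illustrative and the 4000 ms tick is the epoch tick stated above:

```typescript
// Timestamps copied from the residue record.
const residue = {
  ts: "2026-05-09T13:47:44.445Z",
  resolvedAt: "2026-05-09T13:47:45.587Z",
};

// Injection-to-resolution latency in milliseconds.
function resolutionLatencyMs(r: { ts: string; resolvedAt: string }): number {
  return Date.parse(r.resolvedAt) - Date.parse(r.ts);
}

const EPOCH_TICK_MS = 4000; // epoch tick as stated in the text
const latencyMs = resolutionLatencyMs(residue);

console.log(latencyMs);                 // 1142
console.log(latencyMs < EPOCH_TICK_MS); // true
```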

Anatomy of a report

Open /playfair-report.html in another tab and follow along. Each section answers a different operational question:

| Section | What you see | What it tells you |
|---|---|---|
| Header verdict banner | Big FAIR/UNFAIR pill, agent count, region count, epoch count. | The headline result. If UNFAIR, stop here and read the unfair-epochs list. |
| Runtime configuration panel | Cluster name, latency profile, commit hash, runner (local / github-actions), branch. | Lets a future reader (or CI viewer) tell exactly which version of the code produced this report and under what conditions. Without this panel a report is unfalsifiable. |
| Stat strip | 5 large stats: epochs, perceptions, stores, routes, residues. | One-glance throughput. Compare across runs to detect throughput regressions. |
| Activity timeline | Stacked-area SVG chart: perceives / stores / routes / syncs over 50 epochs. Red dots mark unfair epochs. | Shape diagnostics. Flat curves = the test never warmed up. Cliff drops = a service crashed mid-run. Red dots = audit violations. |
| Region cards | Three side-by-side cards: cheap/expensive specialty, per-region budgets (C/S/B), agent count. | Confirms the test's adversarial setup matches the spec. |
| Per-agent sparklines | 4 cumulative lines per agent: perceive (cyan), store (magenta), route (purple), sync (green). | Per-agent behaviour. An agent whose lines are flat after epoch N is silently dead even if no pod restarted. |
| Region token-usage bars | Three horizontal bars per region: compute / storage / bandwidth burn vs budget. | Visual sanity check on the FAIR verdict. Bars longer than the budget line would mean an over-spend the audit somehow missed. |
| Scenario timeline | Vertical list of 9 scripted events with epoch + description. | Confirms the scripted storyline actually fired. Missing events = orchestrator crashed before reaching that epoch. |
| Needlecast log | Table of every cross-region migration with cost. | The cross-chain workload. Should be ≥ 9 in any run that completed (the scripted minimum); typically 30+. |
| Residue table | Every detected residue, who resolved it, payout, latency. | Proves the residue market clears. Detection-to-resolution latency > 1 epoch suggests a stuck worker. |
| Footer verdict | Repeated FAIR/UNFAIR + summary sentence + how-to-read explainer. | For readers who scrolled to the bottom first. |

Reading order tip. If the verdict is FAIR, skim: header → runtime config → activity timeline → region bars → done. If the verdict is UNFAIR, read: header → unfair-epoch list in the verdict pill → activity timeline (find the red dot) → region bars (which region is over budget) → per-agent sparklines (which agent caused it). The orchestrator log in tests/playfair/results/orchestrator.log contains a per-epoch print line you can grep with the offending epoch number.
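
The "flat after epoch N" check on the per-agent sparklines can also be done programmatically. The sketch below assumes each sparkline is available as an array of cumulative counts, one entry per epoch; that input shape and the function name are assumptions, since the report's underlying data format is not described here.

```typescript
// Returns the last epoch (1-based) in which any cumulative series increased,
// or 0 if the agent never acted. Each inner array is one sparkline
// (perceive, store, route, sync), with index 0 holding the value after epoch 1.
function lastActiveEpoch(series: number[][]): number {
  let last = 0;
  series.forEach((line) => {
    let prev = 0;
    line.forEach((value, i) => {
      if (value > prev) {
        last = Math.max(last, i + 1);
      }
      prev = value;
    });
  });
  return last;
}

// Illustrative data: perceive stops growing after epoch 3, store after epoch 2.
const perceive = [1, 2, 3, 3, 3, 3];
const store = [0, 1, 1, 1, 1, 1];
console.log(lastActiveEpoch([perceive, store])); // 3
```

An agent whose last active epoch is far behind the final epoch is a candidate for the "silently dead" case, although scripted idle periods such as the drift spike at epoch 20 also produce flat stretches.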

Run it locally

Prerequisites: Docker Desktop, Node 20, pnpm 9, and the k3d, terraform, kubectl, and jq CLIs (brew install k3d terraform kubectl jq).

git clone https://github.com/quellcrist-falconer/ECCA.git
cd ECCA
pnpm install --no-frozen-lockfile
bash tests/playfair/run.sh                        # full from-scratch run
bash tests/playfair/run.sh --skip-images          # reuse cached images
bash tests/playfair/run.sh --epochs 200           # longer game
bash tests/playfair/run.sh --skip-latency         # disable tc netem
bash tests/playfair/run.sh --destroy              # tear down cluster
open tests/playfair/playfair-report.html          # view the report

Under the hood, run.sh is a thin wrapper around tests/playfair/terraform/. Every state transition is declared in Terraform: image builds, image imports, latency injection, manifest applies, and the orchestrator job. Re-runs are idempotent and only re-execute the resources whose source hashes changed.

Run it in CI

The .github/workflows/playfair.yml workflow:

  1. Installs k3d, terraform, and kubectl on the runner.
  2. Builds all images via docker build (no registry — k3d imports them straight from the runner's docker daemon).
  3. Runs terraform apply -auto-approve against the local k3d cluster.
  4. Uploads playfair-report.html + playfair-results.json + per-region logs as a build artifact.
  5. On main, commits the rendered HTML to docs/playfair-report.html with [skip ci] so the Pages workflow picks it up on the next run.

Manual triggers via workflow_dispatch accept overrides:

gh workflow run playfair.yml \
   -f epochs=100 \
   -f latency_storage_compute_ms=80 \
   -f skip_latency=false

Tuning & debugging

Watch the run progress

kubectl get pods -A -w
kubectl -n ecca-shared logs -f job/playfair-orchestrator
kubectl -n region-storage logs -f deploy/siyana-api

Iterate on a single service

Edit services/siyana-api/src/server.ts, then:

terraform apply -auto-approve \
  -var skip_latency=true \
  -var force_image_rebuild=$(date +%s)

Source-tree hashing means only the siyana-api image is rebuilt and re-imported; chains and infra stay up.

Common failures

| Symptom | Cause | Fix |
|---|---|---|
| ErrImageNeverPull: ecca-ts-builder:local | Image not imported into k3d | It's listed in main.tf's all_image_refs; if missing, add it and re-apply. |
| PrismaClientInitializationError: ... openssl-1.1.x | Alpine openssl version detection fails | PRISMA_QUERY_ENGINE_LIBRARY is hard-pinned in 03-services.yaml to the musl/arm64 3.0.x engine. |
| Error: duplicate subscription in thalamus-router | Multiple regions sharing one NATS consumer name | Consumer names are region-scoped (thalamus-mem-${region}); verify ECCA_REGION env is set per pod. |
| k3d image import hangs | Concurrent imports deadlock the k3d tools node | Imports are serialised in a single null_resource.k3d_image_import shell loop. |
| Empty playfair-results.json | Orchestrator pod completed before kubectl cp ran | Fallback in run-orchestrator.sh extracts JSON from the log marker ═══ RESULTS JSON ═══. |
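
The log-marker fallback in the last row can be illustrated as below. The actual fallback lives in run-orchestrator.sh and is not shown in this document, so this is only a sketch of the idea; it assumes the results are a single JSON object printed somewhere after the marker line, and `extractResults` is a hypothetical helper.

```typescript
const MARKER = "═══ RESULTS JSON ═══";

// Returns the parsed JSON object that follows the marker, or null if it cannot be recovered.
function extractResults(log: string): unknown {
  const start = log.indexOf(MARKER);
  if (start === -1) {
    return null; // marker never printed
  }
  const tail = log.slice(start + MARKER.length);
  // Assumption: the object spans from the first "{" to the last "}" after the marker.
  const open = tail.indexOf("{");
  const close = tail.lastIndexOf("}");
  if (open === -1 || close < open) {
    return null;
  }
  try {
    return JSON.parse(tail.slice(open, close + 1));
  } catch (e) {
    return null; // truncated or malformed JSON
  }
}

// Illustrative log text, not real orchestrator output.
const sampleLog = [
  "epoch 50 audited",
  MARKER,
  '{ "summary": { "allEpochsFair": true, "unfairEpochs": [] } }',
  "job finished",
].join("\n");

console.log(JSON.stringify(extractResults(sampleLog)));
// {"summary":{"allEpochsFair":true,"unfairEpochs":[]}}
```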