A 50-epoch, 3-region, 6-agent tripartite-game test that exercises the entire ECCA stack — chains, services, contracts, and cross-region latency — on a single laptop or in CI.
Playfair is ECCA's system-level integration test. It spins up a real k3d Kubernetes cluster with one server node and three agent nodes — one per simulated region. Each region is given a different cost profile, and a different per-agent budget for compute / storage / bandwidth tokens:
| Region | Cheap | Expensive | Budget (C / S / B) |
|---|---|---|---|
| region-storage | storage | compute | 200 / 1000 / 400 |
| region-compute | compute | storage | 1000 / 200 / 400 |
| region-bandwidth | bandwidth | compute & storage | 200 / 200 / 1000 |
Six agents (two per region) each have a personality — perceive rate, store rate, route rate, sleeve kind, and a Coherence Profile Vector — chosen to either match or fight their region's cost profile. A scripted storyline of spot preemptions, drift spikes, and residue injections forces cross-region migrations ("needlecasts") at fixed epochs.
After every epoch the orchestrator audits the TripartiteGame's allocation accounting and asserts that every agent stayed within its per-region budget. The test passes only if all 50 epochs verify fair.
Unit tests catch regressions in individual contracts and packages. The compose-based E2E suite catches regressions in the happy-path API surface. Playfair is different: it forces the system to operate under realistic adversarial pressure:
- tc netem to inject 33–75 ms one-way latency between agent nodes (matching the round-trip times of real cloud regions).

If a refactor accidentally breaks how the bus replicates across namespaces, how the EpochAnchor contract verifies continuity, or how the Prisma client resolves its native engine inside Alpine, Playfair fails before the change merges.
- bash tests/playfair/run.sh. Takes ~15 minutes from cold (image builds + apply) or ~5 minutes warm.
- main and on a nightly schedule (03:00 UTC) via the Playfair workflow. The CI run uploads the rendered HTML report as a build artifact and (on main) commits it back to docs/playfair-report.html so it's published to GitHub Pages.
- epochs and the latency profile.

Each run produces a deterministic set of artifacts under tests/playfair/results/:
- playfair-results.json: the canonical machine-readable record of the run (agents, epochs, audits, needlecasts, residues, summary, env metadata).
- playfair-report.html: a self-contained, dependency-free HTML report with inline SVG charts. This is the file published to /playfair-report.html.
- orchestrator.log: the full orchestrator stdout (one line per epoch tick).
- region-{storage,compute,bandwidth}-{siyana,thalamus}.log: per-service tail logs, one file per region per service.

The output is an honest record of the system running for ~52 seconds of in-cluster orchestrator time (after a ~3-minute warm-up of cluster + chains + contracts), exercising:
| Verdict | What it means | Action |
|---|---|---|
| FAIR | Every epoch's per-region allocation respected its budget. The protocol's accounting matches reality. | Ship it. |
| UNFAIR | One or more epochs over-spent. Either an agent leaked tokens, a contract under-counted, or a service double-billed. | The report's "unfair epochs" list pinpoints the failures. Read the per-region service logs in results/. |
| 0 needlecasts | The scripted scenario events failed to fire (likely the orchestrator never connected to siyana-api). | Check that all siyana-api pods are Ready in kubectl get pods -A. |
| Crashed pods | If thalamus-router or siyana-api are in CrashLoopBackOff, the run will time out instead of producing UNFAIR. | Check Prisma binary mismatches (Alpine openssl version) and per-region NATS consumer naming. |
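The verdict table can be read as a small decision procedure. The sketch below is illustrative only: `triage` is a hypothetical helper, and the field names (`summary.allEpochsFair`, `summary.unfairEpochs`, `needlecasts`) are assumed from the artifact description rather than taken from the orchestrator source.

```javascript
// Hypothetical triage helper mirroring the verdict table.
// Assumption: `results` has the shape of playfair-results.json, with
// `summary.allEpochsFair`, `summary.unfairEpochs` and a `needlecasts` array.
function triage(results) {
  // No summary at all usually means the run timed out (crashed pods).
  if (!results || !results.summary) {
    return { verdict: "Crashed pods", action: "Check pod status; the run timed out before producing a verdict." };
  }
  const unfair = results.summary.unfairEpochs ?? [];
  if (!results.summary.allEpochsFair || unfair.length > 0) {
    return { verdict: "UNFAIR", action: `Read the per-region service logs for epochs: ${unfair.join(", ")}` };
  }
  // A fair run with no migrations means the scripted events never fired.
  if ((results.needlecasts ?? []).length === 0) {
    return { verdict: "0 needlecasts", action: "Check that all siyana-api pods are Ready." };
  }
  return { verdict: "FAIR", action: "Ship it." };
}

const example = {
  summary: { allEpochsFair: true, unfairEpochs: [] },
  needlecasts: [{ epoch: 11, agent: "Router-Nexus" }],
};
console.log(triage(example).verdict); // FAIR
```

Checking for an empty needlecast list before reporting FAIR keeps a silently idle run from looking like a pass.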
The structural shape of the test (6 agents, 3 regions, 50 epochs, fixed scenario events) is deterministic. What varies between runs:
- terraform apply -var latency_storage_compute_ms=80 to simulate transcontinental routing. Higher latency shifts the timing of needlecasts but should not change the verdict.
- --epochs 200 stresses the long-run fairness; the per-epoch test should be insensitive to count.
- orchestrator.js to make a region's compute budget unrealistically low and the verdict should flip to UNFAIR. This is a useful sanity check that the audit isn't always returning true.
- ubuntu-latest runners all execute the same code. The Alpine k3s image is the same; the Prisma engine binaries are different (musl/arm64 vs musl/x64) and both are baked into the ecca-ts-builder image.

Both runs below were executed locally on a MacBook Pro (Apple Silicon, Docker Desktop 4.x, k3d v5.8.3, Terraform 1.5.7, Node 20.x) against commit 618eac8. The complete artifacts are linked from each card. Both runs verified FAIR on all 50 epochs with zero unfair epochs.
| Metric | Value |
|---|---|
| Epochs | 50 / 50 fair |
| Perceptions | 152 |
| Stores | 43 |
| Routes | 31 |
| Migrations | 31 |
| Residues | 1 (resolved) |
| Scenarios | 9 / 9 fired |
| Verdict | FAIR |
terraform apply -auto-approve -var skip_images=true against a freshly created k3d cluster (1 server + 3 agents).

| Metric | Value |
|---|---|
| Epochs | 50 / 50 fair |
| Perceptions | 152 |
| Stores | 49 |
| Routes | 33 |
| Syncs | 24 |
| Migrations | 33 |
| Residues | 1 (resolved) |
| Verdict | FAIR |
ecca-playfair-orchestrator:local to validate the new env metadata block. Same cluster, fresh orchestrator job. → View Run B report

Six agents, two per region, each with a sleeve kind that interacts differently with its region's cost profile. Token columns are cumulative burn over 50 epochs:
| Agent | Home region | Sleeve | Perceive | Store | Route | Sync | Compute tok | Storage tok | Bandwidth tok | Final drift |
|---|---|---|---|---|---|---|---|---|---|---|
| Archivist-Alpha | storage | memory | 13 | 12 | 3 | 1 | 52 | **36** | 15.7 | 8.0 |
| Archivist-Beta | storage | human | 36 | 20 | 1 | 7 | 75 | **60** | 5.2 | 10.0 |
| Inference-Prime | compute | ai | 41 | 6 | 2 | 8 | **328** | 18 | 12.2 | 6.0 |
| Inference-Echo | compute | ai | 36 | 4 | 5 | 6 | **288** | 12 | 26.7 | 6.0 |
| Router-Nexus | bandwidth | mining | 13 | 3 | 12 | 1 | 52 | 9 | **61.2** | 8.0 |
| Router-Sentinel | bandwidth | memory | 13 | 4 | 10 | 1 | 52 | 12 | **51.0** | 8.0 |
What this means: the bold cells show each region's "expected" specialty. AI sleeves in compute burn the most compute tokens; mining/memory sleeves in bandwidth burn the most bandwidth tokens. The Archivists in storage show balanced storage burn (36 + 60 = 96 storage tokens against the region budget of 1000). No agent exceeded its per-region budget on any epoch, which is what the FAIR verdict tracks.
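The arithmetic above can be cross-checked with a few lines of JavaScript. This is a coarse sketch, not the contract's audit: it sums cumulative burn by home region (an assumption, since agents migrate during the run) and compares each total with the budget figures from the region table. The helper names `burnByHomeRegion` and `overBudget` are hypothetical.

```javascript
// Budgets (compute / storage / bandwidth) copied from the region table.
const budgets = {
  storage: { compute: 200, storage: 1000, bandwidth: 400 },
  compute: { compute: 1000, storage: 200, bandwidth: 400 },
  bandwidth: { compute: 200, storage: 200, bandwidth: 1000 },
};

// Cumulative token burn per agent, copied from the agent table.
const agents = [
  { name: "Archivist-Alpha", home: "storage", compute: 52, storage: 36, bandwidth: 15.7 },
  { name: "Archivist-Beta", home: "storage", compute: 75, storage: 60, bandwidth: 5.2 },
  { name: "Inference-Prime", home: "compute", compute: 328, storage: 18, bandwidth: 12.2 },
  { name: "Inference-Echo", home: "compute", compute: 288, storage: 12, bandwidth: 26.7 },
  { name: "Router-Nexus", home: "bandwidth", compute: 52, storage: 9, bandwidth: 61.2 },
  { name: "Router-Sentinel", home: "bandwidth", compute: 52, storage: 12, bandwidth: 51.0 },
];

const TOKENS = ["compute", "storage", "bandwidth"];

// Sum each token kind per home region (assumption: burn is attributed to the home region).
function burnByHomeRegion(agentRows) {
  const totals = {};
  for (const row of agentRows) {
    totals[row.home] ??= { compute: 0, storage: 0, bandwidth: 0 };
    for (const token of TOKENS) totals[row.home][token] += row[token];
  }
  return totals;
}

// List every (region, token) pair whose total exceeds its budget.
function overBudget(totals, limits) {
  const breaches = [];
  for (const [region, burn] of Object.entries(totals)) {
    for (const token of TOKENS) {
      if (burn[token] > limits[region][token]) breaches.push({ region, token });
    }
  }
  return breaches;
}

const totals = burnByHomeRegion(agents);
console.log(totals.storage.storage); // 96
console.log(overBudget(totals, budgets).length); // 0
```

The real audit runs per epoch inside the TripartiteGame contract; this only confirms that the published table is consistent with the budgets.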
Nine scripted events drive cross-region migrations and residue handling so the test exercises the migration paths even on a quiet random seed:
| Epoch | Type | Agent | What happened |
|---|---|---|---|
| 5 | spot-preemption | Inference-Prime | Spot instance pulled from compute; agent must needlecast to storage. |
| 8 | respawn | Inference-Prime | Re-sleeves in storage region (expensive compute, cheap storage). |
| 15 | needlecast | Inference-Prime | Spot instance back; needlecasts storage → compute at cost 7.1 RoutingToken. |
| 20 | drift-spike | Archivist-Beta | Human agent goes idle 5 epochs; drift accumulates to 16. |
| 25 | sync-recovery | Archivist-Beta | Returns and burns 4.5 SyncToken to reset drift. |
| 30 | residue-inject | (bandwidth) | Simulated shard-loss; first responder earns the bounty. |
| 30 | residue-resolved | Router-Sentinel | Detected and resolved within 1.14 s; payout = 15 ResidueToken. |
| 35 | needlecast | Inference-Echo | Migrates compute → bandwidth for cheaper routing during high-needlecast phase. |
| 40 | epoch-surge | (all) | All agents perceive at max rate for 5 epochs — stress test. |
| 45 | needlecast | Inference-Echo | Returns to compute as surge subsides. |
Each row is a needlecast: an agent's identity + sleeve state migrates from one region's stack to another, paying RoutingToken proportional to shard count and inter-region latency. The first 8 of 33 are shown — see the full Needlecast log in the report.
epoch agent from to shards cost
───── ─────────────────── ────────── ────────── ────── ─────
11 Router-Nexus bandwidth storage 1 5.1
11 Router-Sentinel bandwidth storage 1 5.1
12 Router-Sentinel storage compute 1 5.1
13 Router-Nexus storage bandwidth 1 5.1
14 Inference-Prime storage bandwidth 1 5.1
15 Inference-Prime bandwidth compute 1 7.1 ← scenario-driven (epoch-15)
16 Router-Nexus bandwidth storage 1 5.1
18 Router-Nexus storage compute 1 5.1
… (25 more)
Why the cost varies: the storage↔bandwidth path has the highest injected latency (75 ± 12 ms one-way), so its cost coefficient is higher (7.1 vs 5.1 for the cheaper paths). The cost is verified on-chain by the NeedlecastRouter contract and burned from the agent's BandwidthToken balance.
After every epoch tick the orchestrator calls the TripartiteGame contract's auditEpoch(epochNumber) view function, which returns whether per-region token sums for the closing epoch respected the per-region budgets. A single failure flips the run verdict to UNFAIR.
{ "epoch": 1, "fair": true, "ts": "2026-05-09T13:47:15.546Z" }
{ "epoch": 2, "fair": true, "ts": "2026-05-09T13:47:16.543Z" }
{ "epoch": 3, "fair": true, "ts": "2026-05-09T13:47:17.514Z" }
…
{ "epoch": 48, "fair": true, "ts": "2026-05-09T13:48:02.573Z" }
{ "epoch": 49, "fair": true, "ts": "2026-05-09T13:48:03.565Z" }
{ "epoch": 50, "fair": true, "ts": "2026-05-09T13:48:04.709Z" }
summary.allEpochsFair = true
summary.unfairEpochs = []
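The summary fields follow directly from the audit records. A minimal sketch of that reduction is shown here, assuming each record has the `{ epoch, fair, ts }` shape printed above; `summariseAudits` is a hypothetical name, and the extra `complete` flag (every expected epoch was audited) is an assumption added for illustration, not a field from the results file.

```javascript
// Hypothetical reduction from per-epoch audit records to the run summary.
// Assumption: one record per epoch, shaped like { epoch, fair, ts }.
function summariseAudits(audits, expectedEpochs) {
  const unfairEpochs = audits.filter((a) => !a.fair).map((a) => a.epoch);
  return {
    allEpochsFair: unfairEpochs.length === 0,
    unfairEpochs,
    // Guards against a run that stopped early but never failed an audit.
    complete: audits.length === expectedEpochs,
  };
}

const sampleAudits = [
  { epoch: 1, fair: true, ts: "2026-05-09T13:47:15.546Z" },
  { epoch: 2, fair: true, ts: "2026-05-09T13:47:16.543Z" },
  { epoch: 3, fair: true, ts: "2026-05-09T13:47:17.514Z" },
];

console.log(summariseAudits(sampleAudits, 3));
```

A single `fair: false` record is enough to populate `unfairEpochs` and flip `allEpochsFair`, which matches the rule that one failure makes the run UNFAIR.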
At epoch 30 the scenario script injects a shard-loss residue in the bandwidth region. The residue carries a bountyEstimate; the first sleeve to detect-and-resolve it claims the payout.
{
"epoch": 30,
"kind": "shard-loss",
"region": "bandwidth",
"ts": "2026-05-09T13:47:44.445Z",
"resolved": true,
"bountyEstimate": 15,
"resolvedAt": "2026-05-09T13:47:45.587Z",
"resolver": "Router-Sentinel",
"payout": 15
}
What this proves: the residue economy works end-to-end — injection on a real chain, detection by a sleeve in the affected region, on-chain proof submission, ResidueToken minted to the resolver. Latency from injection to resolution: 1.14 s, well within the 4-second epoch tick.
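The 1.14 s figure can be reproduced from the two timestamps in the residue record. The sketch below assumes the record shape shown above; `resolutionLatencyMs` is a hypothetical helper.

```javascript
// Residue record copied from the run output above.
const residue = {
  epoch: 30,
  kind: "shard-loss",
  region: "bandwidth",
  ts: "2026-05-09T13:47:44.445Z",
  resolved: true,
  bountyEstimate: 15,
  resolvedAt: "2026-05-09T13:47:45.587Z",
  resolver: "Router-Sentinel",
  payout: 15,
};

// Milliseconds between injection (ts) and resolution (resolvedAt);
// null when the residue was never resolved.
function resolutionLatencyMs(record) {
  if (!record.resolved) return null;
  return Date.parse(record.resolvedAt) - Date.parse(record.ts);
}

console.log(resolutionLatencyMs(residue)); // 1142
```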
Open /playfair-report.html in another tab and follow along. Each section answers a different operational question:
| Section | What you see | What it tells you |
|---|---|---|
| Header verdict banner | Big FAIR/UNFAIR pill, agent count, region count, epoch count. | The headline result. If UNFAIR, stop here and read the unfair-epochs list. |
| Runtime configuration panel | Cluster name, latency profile, commit hash, runner (local / github-actions), branch. | Lets a future reader (or CI viewer) tell exactly which version of the code produced this report and under what conditions. Without this panel a report is unfalsifiable. |
| Stat strip | 5 large stats: epochs, perceptions, stores, routes, residues. | One-glance throughput. Compare across runs to detect throughput regressions. |
| Activity timeline | Stacked-area SVG chart: perceives / stores / routes / syncs over 50 epochs. Red dots mark unfair epochs. | Shape diagnostics. Flat curves = the test never warmed up. Cliff drops = a service crashed mid-run. Red dots = audit violations. |
| Region cards | Three side-by-side cards: cheap/expensive specialty, per-region budgets (C/S/B), agent count. | Confirms the test's adversarial setup matches the spec. |
| Per-agent sparklines | 4 cumulative lines per agent: perceive (cyan), store (magenta), route (purple), sync (green). | Per-agent behaviour. An agent whose lines are flat after epoch N is silently dead even if no pod restarted. |
| Region token-usage bars | Three horizontal bars per region: compute / storage / bandwidth burn vs budget. | Visual sanity check on the FAIR verdict. Bars longer than the budget line would mean an over-spend the audit somehow missed. |
| Scenario timeline | Vertical list of 9 scripted events with epoch + description. | Confirms the scripted storyline actually fired. Missing events = orchestrator crashed before reaching that epoch. |
| Needlecast log | Table of every cross-region migration with cost. | The cross-chain workload. Should be ≥ 9 in any run that completed (the scripted minimum); typically 30+. |
| Residue table | Every detected residue, who resolved it, payout, latency. | Proves the residue market clears. Detection-to-resolution latency > 1 epoch suggests a stuck worker. |
| Footer verdict | Repeated FAIR/UNFAIR + summary sentence + how-to-read explainer. | For readers who scrolled to the bottom first. |
tests/playfair/results/orchestrator.log contains a per-epoch print line you can grep with the offending epoch number.
Prerequisites: Docker Desktop, brew install k3d terraform kubectl jq, Node 20, pnpm 9.
git clone https://github.com/quellcrist-falconer/ECCA.git
cd ECCA
pnpm install --no-frozen-lockfile
bash tests/playfair/run.sh # full from-scratch run
bash tests/playfair/run.sh --skip-images # reuse cached images
bash tests/playfair/run.sh --epochs 200 # longer game
bash tests/playfair/run.sh --skip-latency # disable tc netem
bash tests/playfair/run.sh --destroy # tear down cluster
open tests/playfair/results/playfair-report.html # view the report
Under the hood run.sh is a thin wrapper around tests/playfair/terraform/. Every state transition is declared in Terraform — image builds, image imports, latency injection, manifest applies, and the orchestrator job. Idempotent re-runs only re-execute the resources whose source hashes changed.
The .github/workflows/playfair.yml workflow:
- k3d, terraform, and kubectl on the runner.
- docker build (no registry: k3d imports them straight from the runner's docker daemon).
- terraform apply -auto-approve against the local k3d cluster.
- playfair-report.html + playfair-results.json + per-region logs as a build artifact.
- main, commits the rendered HTML to docs/playfair-report.html with [skip ci] so the Pages workflow picks it up on the next run.

Manual triggers via workflow_dispatch accept overrides:
gh workflow run playfair.yml \
-f epochs=100 \
-f latency_storage_compute_ms=80 \
-f skip_latency=false
kubectl get pods -A -w
kubectl -n ecca-shared logs -f job/playfair-orchestrator
kubectl -n region-storage logs -f deploy/siyana-api
Edit services/siyana-api/src/server.ts, then:
terraform apply -auto-approve \
-var skip_latency=true \
-var force_image_rebuild=$(date +%s)
Source-tree hashing means only the siyana-api image is rebuilt and re-imported; chains and infra stay up.
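The idea behind source-tree hashing can be sketched in a few lines. This is not how the Terraform module computes its triggers (that code is not shown here); it is an illustrative stand-in that hashes an in-memory map of file paths to contents, with hypothetical names `sourceTreeHash` and `needsRebuild`.

```javascript
const { createHash } = require("node:crypto");

// Hash a source tree given as { path: contents }.
// Paths are sorted so the result does not depend on enumeration order.
function sourceTreeHash(files) {
  const hash = createHash("sha256");
  for (const path of Object.keys(files).sort()) {
    hash.update(path);
    hash.update("\0"); // separator so path and contents cannot run together
    hash.update(files[path]);
    hash.update("\0");
  }
  return hash.digest("hex");
}

// A rebuild is needed only when the tree hash differs from the recorded one.
function needsRebuild(previousHash, files) {
  return sourceTreeHash(files) !== previousHash;
}

const before = { "src/server.ts": "export const port = 3000;", "package.json": "{}" };
const after = { ...before, "src/server.ts": "export const port = 3001;" };
const recorded = sourceTreeHash(before);

console.log(needsRebuild(recorded, before)); // false
console.log(needsRebuild(recorded, after)); // true
```

Keeping one hash per image is what lets an edit to a single service leave the other images, the chains, and the infra untouched.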
| Symptom | Cause | Fix |
|---|---|---|
| ErrImageNeverPull: ecca-ts-builder:local | Image not imported into k3d | It's listed in main.tf's all_image_refs; if missing, add it and re-apply. |
| PrismaClientInitializationError: ... openssl-1.1.x | Alpine openssl version detection fails | PRISMA_QUERY_ENGINE_LIBRARY is hard-pinned in 03-services.yaml to the musl/arm64 3.0.x engine. |
| Error: duplicate subscription in thalamus-router | Multiple regions sharing one NATS consumer name | Consumer names are region-scoped (thalamus-mem-${region}); verify ECCA_REGION env is set per pod. |
| k3d image import hangs | Concurrent imports deadlock the k3d tools node | Imports are serialised in a single null_resource.k3d_image_import shell loop. |
| Empty playfair-results.json | Orchestrator pod completed before kubectl cp ran | Fallback in run-orchestrator.sh extracts JSON from the log marker ═══ RESULTS JSON ═══. |
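The log-marker fallback in the last row can be illustrated as follows. The real fallback lives in run-orchestrator.sh and is not reproduced here; this JavaScript sketch only shows the idea, and it assumes the JSON document is everything between the first `{` after the marker and the last `}` in the log. `extractResultsJson` is a hypothetical name.

```javascript
const MARKER = "═══ RESULTS JSON ═══";

// Recover the results object from orchestrator log text.
// Assumption: the JSON follows the marker and is the last thing in the log.
// Returns null when the marker is missing or the payload does not parse.
function extractResultsJson(logText) {
  const markerAt = logText.indexOf(MARKER);
  if (markerAt === -1) return null;
  const tail = logText.slice(markerAt + MARKER.length);
  const open = tail.indexOf("{");
  const close = tail.lastIndexOf("}");
  if (open === -1 || close < open) return null;
  try {
    return JSON.parse(tail.slice(open, close + 1));
  } catch {
    return null; // truncated or interleaved log output
  }
}

const sampleLog = [
  "epoch 50 audited",
  MARKER,
  '{"summary":{"allEpochsFair":true,"unfairEpochs":[]}}',
  "",
].join("\n");

console.log(extractResultsJson(sampleLog).summary.allEpochsFair); // true
```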