How It Works

Complete protocol documentation for Prisoner's Arena. Everything about how the on-chain tournament works.

Overview

Prisoner's Arena is a competitive AI tournament platform on Solana that implements the Iterated Prisoner's Dilemma. Players stake SOL, select strategies, compete in automated matches, and the top 25% split the prize pool.

The entire tournament lifecycle is governed by an on-chain Solana program. Strategies are hidden during registration via a commit-reveal scheme, matches are executed deterministically using on-chain randomness, and all results are publicly verifiable.

Network: mainnet (Active)
Program ID: TBD

Tournament Lifecycle

1. Registration
   Stake SOL and submit a strategy commitment hash
   Actor: Player
2. Reveal
   Submit the preimage (strategy + salt) to prove commitment
   Actor: Player
3. Running
   Execute matches in batches, record scores on-chain
   Actor: Operator
4. Payout
   Winners claim their share of the prize pool
   Actor: Player
Timing details
  • Registration duration: configured by admin
  • Reveal duration: configured by admin, starts when registration closes
  • Running: operator processes matches in batches of 5 until all complete
  • Payout: winners have 30 days to claim; unclaimed funds go to house fees
  • The operator bot automates phase transitions — players only need to enter and claim

Commit-Reveal Scheme

The commit-reveal scheme prevents strategy front-running. During registration, players submit only a hash of their strategy — nobody (including the operator) can see which strategies are in play until the reveal phase.

Built-in Strategies (0-8)
SHA256(strategy_byte || salt[16])

17 bytes total: 1 byte strategy index, 16 bytes random salt.

Custom Strategy (index 9)
SHA256(9u8 || SHA256(bytecode) || salt[16])

49 bytes total: 1 byte (always 9), 32 bytes SHA256 of bytecode, 16 bytes random salt. See Custom Strategy VM for details.
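
The two commitment formulas above can be reproduced off-chain with a few lines of Python. This is a minimal sketch, not the program's own code: the function names are hypothetical, and the byte order (strategy byte first, salt last) is taken directly from the formulas as written.

```python
import hashlib
import secrets

CUSTOM_INDEX = 9          # per the text, index 9 marks a custom strategy
MAX_BYTECODE_LEN = 64     # custom programs are "up to 64 bytes"


def commit_builtin(strategy: int, salt: bytes) -> bytes:
    """SHA256(strategy_byte || salt[16]) over a 17-byte preimage."""
    if not 0 <= strategy <= 8:
        raise ValueError("built-in strategy index must be 0..8")
    if len(salt) != 16:
        raise ValueError("salt must be exactly 16 bytes")
    preimage = bytes([strategy]) + salt
    assert len(preimage) == 17
    return hashlib.sha256(preimage).digest()


def commit_custom(bytecode: bytes, salt: bytes) -> bytes:
    """SHA256(9u8 || SHA256(bytecode) || salt[16]) over a 49-byte preimage."""
    if len(bytecode) > MAX_BYTECODE_LEN:
        raise ValueError("bytecode must be at most 64 bytes")
    if len(salt) != 16:
        raise ValueError("salt must be exactly 16 bytes")
    preimage = bytes([CUSTOM_INDEX]) + hashlib.sha256(bytecode).digest() + salt
    assert len(preimage) == 49
    return hashlib.sha256(preimage).digest()


# Usage: generate a fresh random salt and keep it secret until the reveal phase.
salt = secrets.token_bytes(16)
commitment = commit_builtin(0, salt)  # 0 = TitForTat
```

The salt matters because there are only ten possible strategy indices; without it, anyone could hash each index and match it against a commitment.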

Reveal Verification

During the reveal phase, the player submits the preimage (strategy, salt). The program recomputes SHA256 and verifies it matches the stored commitment. If it doesn't match, the reveal is rejected.
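
As a rough illustration of the check described above, the following sketch recomputes the hash for a built-in strategy and compares it with the stored commitment. The function name and the use of a constant-time comparison are assumptions made for this example; the on-chain program may structure the check differently. A custom strategy reveal would be analogous, using the 49-byte preimage.

```python
import hashlib
import hmac


def verify_reveal(commitment: bytes, strategy: int, salt: bytes) -> bool:
    """Return True only if SHA256(strategy_byte || salt) equals the commitment."""
    if not 0 <= strategy <= 8 or len(salt) != 16:
        return False  # malformed preimage, reject
    recomputed = hashlib.sha256(bytes([strategy]) + salt).digest()
    return hmac.compare_digest(recomputed, commitment)


# Usage: a commitment made for GrimTrigger (3) only verifies with the same pair.
salt = bytes(range(16))
stored = hashlib.sha256(bytes([3]) + salt).digest()
ok = verify_reveal(stored, 3, salt)       # matching preimage
bad = verify_reveal(stored, 1, salt)      # different strategy, rejected
```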

Forfeit Handling

If a player fails to reveal before the deadline, the operator calls forfeit_unrevealed, which derives a built-in strategy index from the on-chain SlotHashes sysvar — unpredictable at registration time, preventing players from gaming forfeit outcomes. This ensures every registered player competes — no one can grief by withholding their reveal.

The Payoff Matrix

Each round, two players simultaneously choose to Cooperate or Defect:

          They: C   They: D
You: C    3, 3      0, 5
You: D    5, 0      1, 1

Reward (3, 3): Both cooperate
Punishment (1, 1): Both defect
Temptation (5, 0): You defect, they cooperate
Sucker (0, 5): You cooperate, they defect

Defecting wins individual rounds, but cooperation wins tournaments. Mutual cooperation (3+3=6 total) creates more value than mutual defection (1+1=2 total). The best strategies balance retaliation with forgiveness.
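
To make the arithmetic concrete, here is a small sketch that encodes the payoff matrix as a lookup table and totals a sequence of rounds. The names `PAYOFF` and `score_match` are illustrative only.

```python
C, D = "C", "D"

# (your move, their move) -> (your payoff, their payoff)
PAYOFF = {
    (C, C): (3, 3),  # Reward
    (C, D): (0, 5),  # Sucker
    (D, C): (5, 0),  # Temptation
    (D, D): (1, 1),  # Punishment
}


def score_match(moves_a, moves_b):
    """Sum per-round payoffs for two equal-length move sequences."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both players must have the same number of moves")
    total_a = total_b = 0
    for a, b in zip(moves_a, moves_b):
        pay_a, pay_b = PAYOFF[(a, b)]
        total_a += pay_a
        total_b += pay_b
    return total_a, total_b


mutual_coop = score_match([C] * 10, [C] * 10)    # (30, 30)
mutual_defect = score_match([D] * 10, [D] * 10)  # (10, 10)
exploit = score_match([D] * 10, [C] * 10)        # (50, 0)
```

Over ten rounds, two cooperators earn 60 points between them, while two defectors earn only 20, which is the gap the paragraph above describes.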

Strategies

There are 9 built-in strategies. Each implements a different decision-making algorithm for choosing Cooperate or Defect each round.

Index  Name                 Description
0      TitForTat            Starts by cooperating, then mirrors the opponent's last move. The classic reciprocal strategy.
1      AlwaysDefect         Always defects regardless of what the opponent does. Maximizes short-term gain.
2      AlwaysCooperate      Always cooperates regardless of what the opponent does. Vulnerable but promotes mutual benefit.
3      GrimTrigger          Cooperates until the opponent defects once, then defects forever. Unforgiving.
4      Pavlov               Win-stay, lose-switch. Repeats the last move if it scored well, switches otherwise.
5      SuspiciousTitForTat  Like Tit for Tat but starts with a defection. Tests the opponent first.
6      Random               Randomly cooperates or defects each round with 50% probability.
7      TitForTwoTats        Like Tit for Tat but requires two consecutive defections before retaliating. More forgiving.
8      Gradual              Punishes proportionally to the number of defections received, then returns to cooperation.
9      Custom               Author your own strategy as a compact bytecode program (up to 64 bytes). Full control over decision logic each round.
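
A few of the built-in strategies can be sketched as plain functions of the two move histories. This is an illustrative reading of the descriptions above, not the match-logic crate itself. In particular, Pavlov's opening move (cooperate) and the meaning of "scored well" (the opponent cooperated, giving a payoff of 3 or 5) are assumptions.

```python
C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def tit_for_tat(mine, theirs):
    return C if not theirs else theirs[-1]


def always_defect(mine, theirs):
    return D


def grim_trigger(mine, theirs):
    return D if D in theirs else C


def tit_for_two_tats(mine, theirs):
    return D if theirs[-2:] == [D, D] else C


def pavlov(mine, theirs):
    if not mine:
        return C  # assumption: opens with cooperation
    if theirs[-1] == C:
        return mine[-1]  # "win": payoff was 3 or 5, so stay
    return D if mine[-1] == C else C  # "lose": payoff was 0 or 1, so switch


def play(strat_a, strat_b, rounds):
    """Run a fixed-length match and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


result = play(tit_for_tat, always_defect, 10)  # (9, 14)
```

TitForTat loses the first round to AlwaysDefect and then matches it defect for defect, so it trails by only five points over the whole match.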
Strategy Lab

Simulate every strategy matchup interactively. Write custom bytecode programs with live WASM validation and preview.

Matching Algorithm

The number of matches each player plays (K) is determined adaptively based on the number of participants:

Players (n)   Effective K        Method
n ≤ 200       n − 1              Full round-robin (every player faces every other)
n > 200       clamp(K, 49, 99)   Feistel-network permutation

Total matches: approximately n × K / 2. Each match involves two players, so the number of unique pairings is roughly half the sum of all individual match counts. The exact count depends on the pairing mode and whether offsets are clamped.
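
The rule for K and the match-count estimate can be written out as a short sketch. The function names are hypothetical, and the `requested_k` parameter stands in for the unclamped K that the table refers to, whose source is not specified here.

```python
from itertools import combinations

ROUND_ROBIN_MAX = 200   # at or below this, every player faces every other
K_MIN, K_MAX = 49, 99   # clamp bounds for larger tournaments


def effective_k(n: int, requested_k: int) -> int:
    """Matches per player for a tournament of n players."""
    if n < 2:
        raise ValueError("a tournament needs at least two players")
    if n <= ROUND_ROBIN_MAX:
        return n - 1
    return max(K_MIN, min(requested_k, K_MAX))


def approx_total_matches(n: int, requested_k: int) -> int:
    """Roughly n * K / 2, since each match counts toward two players."""
    return n * effective_k(n, requested_k) // 2


def round_robin_pairs(n: int):
    """All unordered pairs (i, j) with i < j, for the small-tournament case."""
    return list(combinations(range(n), 2))


small = approx_total_matches(8, 0)         # 28, matches len(round_robin_pairs(8))
large = approx_total_matches(1000, 60)     # 30000
```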

Pairing method: For small tournaments (≤200 players), full round-robin ensures every player faces every other player exactly once. For larger tournaments, a Feistel-network permutation pairs players deterministically with O(1) memory per match.
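
For readers unfamiliar with the technique, the sketch below shows a generic Feistel-network permutation over the range 0..n-1 with cycle-walking. It illustrates why no shuffled array has to be stored: each index is mapped on demand from the seed alone. It is not the program's actual construction. The round function (truncated SHA-256), the round count, and how permuted indices are turned into opponents are all assumptions made for this example.

```python
import hashlib


def _round_fn(seed: bytes, rnd: int, value: int, bits: int) -> int:
    """Keyed pseudo-random function for one Feistel round (illustrative)."""
    data = seed + bytes([rnd]) + value.to_bytes(4, "little")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "little") & ((1 << bits) - 1)


def feistel_permute(index: int, n: int, seed: bytes, rounds: int = 4) -> int:
    """Map index in [0, n) to a unique value in [0, n) without storing a table."""
    if not 0 <= index < n:
        raise ValueError("index out of range")
    # Choose a half-width so the Feistel domain 2^(2*half) covers n.
    half = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half) - 1
    x = index
    while True:
        left, right = x >> half, x & mask
        for rnd in range(rounds):
            left, right = right, left ^ _round_fn(seed, rnd, right, half)
        x = (left << half) | right
        if x < n:
            return x  # cycle-walk: re-encrypt until the result lands in range


seed = bytes([7]) * 32
n = 250
permuted = [feistel_permute(i, n, seed) for i in range(n)]
```

Because each Feistel round is invertible, the mapping is a bijection on the padded domain, and cycle-walking restricts it to a permutation of exactly n values.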

Deterministic seed: The randomness seed is derived from SlotHashes[16..48] with the first 4 bytes XOR'd with tournament_id, captured at the moment the reveal phase closes. This seed drives round counts and per-round RNG. The operator cannot manipulate it.
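
The seed derivation described above amounts to a slice and a 4-byte XOR. The sketch below assumes the raw SlotHashes account data is available as bytes and that tournament_id is encoded as a little-endian u32 (consistent with the PDA seed encoding used elsewhere in this document, but still an assumption). The function name is hypothetical.

```python
def derive_seed(slot_hashes_data: bytes, tournament_id: int) -> bytes:
    """Take bytes 16..48 of the sysvar data and XOR the first 4 with the ID."""
    if len(slot_hashes_data) < 48:
        raise ValueError("need at least 48 bytes of SlotHashes data")
    seed = bytearray(slot_hashes_data[16:48])        # 32-byte slice
    id_bytes = tournament_id.to_bytes(4, "little")   # assumption: u32, little-endian
    for i in range(4):
        seed[i] ^= id_bytes[i]
    return bytes(seed)


# Usage with placeholder data (real data would come from the sysvar account).
fake_sysvar = bytes(range(64))
seed_a = derive_seed(fake_sysvar, 1)
seed_b = derive_seed(fake_sysvar, 2)
```

Mixing in the tournament ID means two tournaments that close in the same slot would still get different seeds.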

Matchmaking Visualizer

See how pairings are generated, explore the round-robin and Feistel permutation algorithms, and verify match fairness interactively.

Rounds & Scoring

Each match consists of a variable number of rounds, determined by a geometric distribution. Players don't know exactly when the match will end, preventing end-game exploitation.

Round Tier    Range          End Probability
Standard      20–50 rounds   5% per round after minimum
Compressed    10–30 rounds   7% per round after minimum

Geometric distribution: After reaching the minimum round count, each subsequent round has a fixed probability of being the last. This creates unpredictable match lengths that average near the midpoint of the range.
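
One way to model this is sketched below: play the minimum number of rounds, then stop after each round with the tier's end probability, capped at the maximum. Whether the on-chain logic applies the first end check at the minimum round or at the round after it is not stated, so the exact boundary handling here is an assumption, as are the names.

```python
import random

# tier -> (minimum rounds, maximum rounds, end probability per round)
TIERS = {"standard": (20, 50, 0.05), "compressed": (10, 30, 0.07)}


def sample_round_count(rng: random.Random, tier: str) -> int:
    """Draw a match length from the truncated geometric distribution."""
    lo, hi, p_end = TIERS[tier]
    rounds = lo
    while rounds < hi and rng.random() >= p_end:
        rounds += 1
    return rounds


def expected_round_count(tier: str) -> float:
    """E[rounds] = lo + sum over k of P(match survives k rounds past lo)."""
    lo, hi, p_end = TIERS[tier]
    return lo + sum((1 - p_end) ** k for k in range(1, hi - lo + 1))


rng = random.Random(1234)
samples = [sample_round_count(rng, "standard") for _ in range(1000)]
mean_standard = expected_round_count("standard")      # roughly 35
mean_compressed = expected_round_count("compressed")  # roughly 20
```

Under this model the expected lengths come out close to the midpoints of the two ranges, which agrees with the description above.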

Per-round RNG isolation: Each player's move is computed independently using an isolated RNG stream. One player's randomness (e.g., for the Random strategy) never affects the other's.
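
The idea of isolated streams can be illustrated by deriving a separate generator per player from the match seed. This is only a sketch of the concept: the derivation (SHA-256 of the seed plus a player slot byte) and the use of Python's `random.Random` are assumptions, not the program's actual RNG.

```python
import hashlib
import random


def player_rng(match_seed: bytes, player_slot: int) -> random.Random:
    """Derive an independent, reproducible RNG stream for one side of a match."""
    digest = hashlib.sha256(match_seed + bytes([player_slot])).digest()
    return random.Random(int.from_bytes(digest, "little"))


match_seed = bytes([42]) * 32
rng_a = player_rng(match_seed, 0)
rng_b = player_rng(match_seed, 1)

# Player A consumes many random values (e.g. a Random strategy)...
draws_a = [rng_a.random() for _ in range(100)]
# ...while player B's stream is unaffected by how much A consumed.
first_b = rng_b.random()
```

With a single shared stream, a strategy that draws random numbers would shift every later draw for its opponent; separate streams keep each player's outcomes reproducible regardless of who they face.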

Final ranking: Players are ranked by cumulative score across all their matches. The top 25% (minimum 1 winner) qualify for the prize pool.

Fees & Payouts

The total prize pool comes from all player stakes. Before distribution, fees are deducted:

Pool Breakdown (illustrative): Winner Pool (remaining after fees), Operator Reimbursement (tx costs), House Fee (configurable bps)

House fee: Set by the admin in basis points (1 bps = 0.01%) and deducted from the total pool before winner distribution.

Operator reimbursement: The operator bot pays Solana transaction fees for running matches. These costs are tracked on-chain and reimbursed from the pool before winners are paid, ensuring operators are never out of pocket.

Winner determination: Top 25% of players by score (minimum 1 winner). All winners receive an equal share of the winner pool.
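
The deductions and the equal split can be worked through with integer arithmetic in lamports. The sketch below rests on several assumptions that are not confirmed here: the 25% cutoff rounds down, shares use integer division (leaving any remainder undistributed), and the function and field names are invented for this example.

```python
BPS_DENOMINATOR = 10_000  # 1 bps = 0.01%


def settle(total_pool: int, house_fee_bps: int, operator_costs: int, n_players: int) -> dict:
    """Split a pool (in lamports) into house fee, operator costs, and winner shares."""
    house_fee = total_pool * house_fee_bps // BPS_DENOMINATOR
    winner_pool = total_pool - house_fee - operator_costs
    if winner_pool < 0:
        raise ValueError("fees and costs exceed the pool")
    n_winners = max(1, n_players // 4)  # top 25%, minimum 1 (rounding down is assumed)
    return {
        "house_fee": house_fee,
        "winner_pool": winner_pool,
        "n_winners": n_winners,
        "share_per_winner": winner_pool // n_winners,
    }


# Example figures: 100 players staking 0.1 SOL each, a 5% house fee,
# and 0.01 SOL of tracked transaction costs.
result = settle(10_000_000_000, 500, 10_000_000, 100)
```

With these example figures, the house fee is 0.5 SOL, the winner pool is 9.49 SOL, and each of the 25 winners would receive 0.3796 SOL.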

Claim window: Winners have 30 days from payout start to claim their prize. Unclaimed funds are swept to accumulated house fees. Expired entries are closed by the operator, returning rent to the player.

On-Chain Accounts

The program uses 3 PDA (Program Derived Address) types. All state is fully on-chain.

Config, seeds: ["config"]
Global parameters: admin, operator, fees, stake amount, timing durations, accumulated fees, current tournament ID.
Tournament, seeds: ["tournament", u32_le_bytes(id)]
Per-tournament state: phase, participants, scores, strategies, match progress, randomness seed, winner pool.
Entry, seeds: ["entry", tournament_pubkey, player_pubkey]
Per-player per-tournament: commitment hash, revealed strategy, score, matches played, payout status.
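
The seed lists above translate directly into byte sequences. The helper names below are hypothetical; the resulting lists would then be passed, together with the program ID, to a Solana client library's PDA derivation routine, which is not shown here.

```python
def config_seeds() -> list:
    return [b"config"]


def tournament_seeds(tournament_id: int) -> list:
    # The ID is encoded as a 4-byte little-endian unsigned integer.
    return [b"tournament", tournament_id.to_bytes(4, "little")]


def entry_seeds(tournament_pubkey: bytes, player_pubkey: bytes) -> list:
    if len(tournament_pubkey) != 32 or len(player_pubkey) != 32:
        raise ValueError("public keys must be 32 bytes")
    return [b"entry", tournament_pubkey, player_pubkey]


seeds = tournament_seeds(1)  # [b"tournament", b"\x01\x00\x00\x00"]
```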
Account discriminators
Config: [155, 12, 170, 224, 30, 250, 204, 130]
Tournament: [175, 139, 119, 242, 115, 194, 57, 92]
Entry: [63, 18, 152, 113, 215, 246, 221, 250]
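
When reading raw account data, the discriminators can be used to tell the three account types apart. The sketch below assumes the discriminator occupies the first 8 bytes of the account data, a common convention that is not stated explicitly here; the function name is illustrative.

```python
from typing import Optional

DISCRIMINATORS = {
    bytes([155, 12, 170, 224, 30, 250, 204, 130]): "Config",
    bytes([175, 139, 119, 242, 115, 194, 57, 92]): "Tournament",
    bytes([63, 18, 152, 113, 215, 246, 221, 250]): "Entry",
}


def account_type(data: bytes) -> Optional[str]:
    """Classify raw account data by its leading 8 bytes, or None if unknown."""
    if len(data) < 8:
        return None
    return DISCRIMINATORS.get(bytes(data[:8]))


# Usage with placeholder data: a Tournament discriminator followed by a payload.
raw = bytes([175, 139, 119, 242, 115, 194, 57, 92]) + bytes(40)
kind = account_type(raw)  # "Tournament"
```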

Security & Verification

Reproducible Matches
All matches can be replayed off-chain given the randomness seed and player strategies. The match-logic crate compiles to both native and WASM, enabling independent verification.
Admin/Operator Separation
The admin can only update configuration (fees, timing, stake). The operator can only advance tournament phases and run matches. Neither role can alter match outcomes or steal funds.
On-Chain Guarantees
Overflow protection on all arithmetic. Rent-exempt accounts. Tournament parameters are snapshotted at creation — admin config changes don't affect in-progress tournaments.
Deterministic Execution
Match outcomes depend only on the randomness seed (from SlotHashes) and player strategies. The operator submits transactions but has no influence over results.
How to verify the program binary

Anyone can verify that the deployed program matches the public source code using solana-verify:

Note: This project uses Solana SDK v3 which split the solana-program crate into subcrates. Two workarounds are needed until upstream catches up:
  1. Install patched solana-verify — the released version (v0.4.11) cannot parse the Cargo.lock. Until PR #228 is merged:
    cargo install solana-verify \
        --git https://github.com/MidTermDev/solana-verifiable-build.git \
        --branch fix/sdk-v3-cargo-lock-compat
  2. Specify the Docker image — the auto-detected image ships Rust 1.84 which lacks edition 2024 support. Pass --base-image solanafoundation/solana-verifiable-build:3.0.1 to select a newer image.
solana-verify verify-from-repo \
    https://github.com/makoto-kusanagi/prisoners-arena-program \
    --program-id <PROGRAM_ID> \
    --library-name prisoners_arena \
    --base-image solanafoundation/solana-verifiable-build:3.0.1 \
    -u https://api.mainnet-beta.solana.com

The program ID and RPC URL can be found via the config API or on the homepage.

How to replay matches locally

The match-logic crate is the same code used on-chain and by the operator. To replay:

1. Fetch the tournament's randomness seed and all player strategies from the API

2. Use the match-logic crate to generate pairings and execute matches with the same seed

3. Compare computed scores against on-chain values to confirm correctness
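
The final comparison step could look like the sketch below, which reports every player whose locally replayed score differs from the on-chain value. The data shapes (dictionaries keyed by player address) and the function name are assumptions; fetching the data and running the match-logic crate are outside the scope of this example.

```python
def diff_scores(computed: dict, on_chain: dict) -> dict:
    """Return {player: (computed, on_chain)} for every mismatch or missing entry."""
    mismatches = {}
    for player in sorted(computed.keys() | on_chain.keys()):
        local, chain = computed.get(player), on_chain.get(player)
        if local != chain:
            mismatches[player] = (local, chain)
    return mismatches


# Usage with placeholder addresses and scores.
computed = {"playerA": 412, "playerB": 388, "playerC": 405}
on_chain = {"playerA": 412, "playerB": 390, "playerC": 405}
report = diff_scores(computed, on_chain)  # {"playerB": (388, 390)}
```

An empty result means the replay agrees with the on-chain record for every player.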