
Custom Strategy VM

Write arbitrary Prisoner's Dilemma strategies as compact bytecode programs, interpreted on-chain within the match execution pipeline.

Overview

The 9 built-in strategies cover the classic approaches, but their decision logic is fixed. A player who wants “cooperate for 5 rounds, then play Tit-for-Tat, but always defect if the opponent has defected more than 60% of the time” cannot express that with the built-ins alone.

The Custom Strategy VM lets players compose arbitrary decision logic as compact bytecode programs (max 64 bytes). Programs are interpreted on-chain during match execution — fully deterministic, verifiable, and reproducible. Custom is strategy variant index 9, alongside the existing built-in strategies.

Architecture Flow

1. Write bytecode (max 64 bytes)
2. SHA256(hash) → commitment
3. validate() → stored on-chain
4. execute_bytecode() per round

Built-in strategies remain as native optimized code paths — zero performance regression for existing players.

The VM lives in the match-logic crate and compiles to both native (on-chain contract) and WASM (frontend replay).

Machine Model

Property            Value
Stack depth         8 elements
Value type          u8 (0–255)
Max program size    64 bytes
Fuel limit          128 instructions per round
Default on error    Cooperate
Jump model          Forward-only (guarantees termination)

Inputs Available Per Round

Input                      Source
Opponent’s move history    Slice, grows each round
Own move history           Slice, grows each round
Round number               u8, 0-indexed
Deterministic RNG          SeededRng, unique per player per round

Error Handling

The VM never panics. Every anomalous condition falls back to Cooperate:

  • Stack underflow — halt, Cooperate
  • Stack overflow — halt, Cooperate
  • Out-of-bounds history — returns 0 (Cooperate)
  • Unknown opcode — immediate halt, Cooperate
  • Fuel exhaustion — Cooperate
  • Program falls off end — Cooperate

This “fail-safe to cooperation” penalizes broken programs without crashing the match.
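The fail-safe behaviour described above can be sketched with a fixed-depth stack whose operations return `Option`, so a failure turns into Cooperate rather than a panic. This is a minimal illustration, not the match-logic implementation: the names `Move`, `Stack`, `history_at`, and `move_or_cooperate` are hypothetical, and the convention that index 0 means the most recent move is an assumption.

```rust
/// Hypothetical move type for this sketch; the real crate may differ.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Move {
    Cooperate,
    Defect,
}

/// Fixed-depth stack of u8 values (depth 8, per the machine model).
pub struct Stack {
    data: [u8; 8],
    len: usize,
}

impl Stack {
    pub fn new() -> Self {
        Stack { data: [0; 8], len: 0 }
    }

    /// Returns None on overflow instead of panicking.
    pub fn push(&mut self, v: u8) -> Option<()> {
        if self.len == self.data.len() {
            return None;
        }
        self.data[self.len] = v;
        self.len += 1;
        Some(())
    }

    /// Returns None on underflow instead of panicking.
    pub fn pop(&mut self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(self.data[self.len])
    }
}

/// Out-of-bounds history reads yield 0 (Cooperate).
/// Assumption: `n_back == 0` refers to the most recent move.
pub fn history_at(history: &[u8], n_back: usize) -> u8 {
    if n_back < history.len() {
        history[history.len() - 1 - n_back]
    } else {
        0
    }
}

/// Any failed step (None) collapses to Cooperate.
pub fn move_or_cooperate(result: Option<Move>) -> Move {
    result.unwrap_or(Move::Cooperate)
}
```

Routing every error through a single `Option` keeps the fallback in one place: a caller only has to unwrap once at the end of the round.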

Instruction Set (25 opcodes)

Terminals

Hex    Mnemonic    Bytes
00     COOP        1
16     DEFECT      1
18     RETURN      1

Literals & Input

Hex    Mnemonic       Bytes
01     PUSH imm8      2
02     OPP_LAST       1
03     MY_LAST        1
04     OPP_N          1
05     MY_N           1
06     OPP_DEFECTS    1
07     MY_DEFECTS     1
08     ROUND          1
09     RAND           1
17     SCORE_LAST     1

Arithmetic (saturating)

Hex    Mnemonic
0A     ADD
0B     SUB
0C     MUL
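
Saturating arithmetic clamps results to the u8 range instead of wrapping around. The sketch below uses Rust's standard saturating methods to show the effect. The function names are hypothetical, and the operand order for SUB (the earlier-pushed value minus the later-pushed one) is an assumption.

```rust
/// ADD: clamps at 255 instead of wrapping.
pub fn vm_add(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

/// SUB: clamps at 0. Assumes `a` was pushed before `b`.
pub fn vm_sub(a: u8, b: u8) -> u8 {
    a.saturating_sub(b)
}

/// MUL: clamps at 255.
pub fn vm_mul(a: u8, b: u8) -> u8 {
    a.saturating_mul(b)
}
```

With these definitions, `vm_add(200, 100)` gives 255 and `vm_sub(3, 5)` gives 0, so a counter can never wrap back to a small value by accident.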

Comparison & Logic

Hex    Mnemonic
0D     GT
0E     LT
0F     EQ
10     NOT
11     AND
12     OR

Stack & Control Flow

Hex    Mnemonic          Bytes
13     DUP               1
14     JMP_FWD off       2
15     JMP_FWD_IF off    2
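
To make the opcode tables concrete, here is a minimal interpreter sketch covering only a subset of the instructions (COOP, DEFECT, RETURN, PUSH, OPP_LAST, OPP_DEFECTS, ROUND, GT, EQ, JMP_FWD, JMP_FWD_IF). It is not the match-logic `execute_bytecode()` and several semantics are assumptions: moves are encoded as 0 (cooperate) and 1 (defect), RETURN pops the top value and treats nonzero as Defect, binary operators pop `b` then `a` and push `a op b`, JMP_FWD_IF pops its condition, and jump offsets count from the byte after the operand. The names `run`, `execute`, and `Move` are hypothetical.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Move {
    Cooperate,
    Defect,
}

const MAX_STACK: usize = 8; // stack depth from the machine model
const FUEL: u32 = 128; // instructions allowed per round

/// Runs one round. Returns None on any error so the caller can fall back.
fn run(code: &[u8], opp: &[u8], round: u8) -> Option<Move> {
    let mut stack: Vec<u8> = Vec::with_capacity(MAX_STACK);
    let mut pc = 0usize;
    for _ in 0..FUEL {
        let op = *code.get(pc)?; // falling off the end yields None
        pc += 1;
        let pushed = match op {
            0x00 => return Some(Move::Cooperate),
            0x16 => return Some(Move::Defect),
            0x18 => {
                let v = stack.pop()?; // underflow yields None
                return Some(if v == 0 { Move::Cooperate } else { Move::Defect });
            }
            0x01 => {
                let imm = *code.get(pc)?;
                pc += 1;
                imm
            }
            0x02 => opp.last().copied().unwrap_or(0), // empty history reads as 0
            0x06 => opp.iter().filter(|&&m| m != 0).count().min(255) as u8,
            0x08 => round,
            0x0D | 0x0F => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                let truth = if op == 0x0D { a > b } else { a == b };
                truth as u8
            }
            0x14 | 0x15 => {
                let off = *code.get(pc)? as usize;
                pc += 1;
                let take = if op == 0x14 { true } else { stack.pop()? != 0 };
                if take {
                    pc += off; // forward only, so pc never decreases
                }
                continue;
            }
            _ => return None, // unknown, or not implemented in this sketch
        };
        if stack.len() == MAX_STACK {
            return None; // stack overflow
        }
        stack.push(pushed);
    }
    None // fuel exhausted
}

/// Every error path collapses to Cooperate.
pub fn execute(code: &[u8], opp: &[u8], round: u8) -> Move {
    run(code, opp, round).unwrap_or(Move::Cooperate)
}
```

Because the program counter only moves forward and the loop is bounded by the fuel limit, the sketch terminates on every input, including malformed bytecode.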

Example Programs

Classic strategies re-implemented as bytecode. These demonstrate how the VM's small instruction set can express complex decision logic.

TitForTat

2 bytes

Copy opponent's last move. Round 0: opponent history empty → 0 → Cooperate.

02 18       OPP_LAST RETURN

AlwaysDefect

1 byte

Defect unconditionally.

16          DEFECT

GrimTrigger

8 bytes

Cooperate until the opponent defects once, then defect forever.

06          OPP_DEFECTS         ; [count]
01 00       PUSH 0              ; [count, 0]
0D          GT                  ; [count > 0]
15 01       JMP_FWD_IF 1        ; if true, skip to DEFECT
00          COOP
16          DEFECT
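
The comments in the listing imply that a jump offset counts bytes starting from the instruction after the jump. Under that assumption, the landing position of each jump can be computed by walking the program one instruction at a time. The helper names `instr_len` and `jump_targets` below are hypothetical and exist only for this sketch.

```rust
/// GrimTrigger bytes, copied from the listing above.
pub const GRIM_TRIGGER: [u8; 8] = [0x06, 0x01, 0x00, 0x0D, 0x15, 0x01, 0x00, 0x16];

/// Instruction length in bytes: PUSH, JMP_FWD and JMP_FWD_IF carry one operand.
fn instr_len(op: u8) -> usize {
    match op {
        0x01 | 0x14 | 0x15 => 2,
        _ => 1,
    }
}

/// Returns (index of the jump opcode, index it lands on) for every jump.
/// Assumes offsets are relative to the byte following the operand.
pub fn jump_targets(code: &[u8]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        if (op == 0x14 || op == 0x15) && pc + 1 < code.len() {
            out.push((pc, pc + 2 + code[pc + 1] as usize));
        }
        pc += instr_len(op);
    }
    out
}
```

For GrimTrigger, the single jump at byte 4 lands on byte 7, which is the DEFECT opcode.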

Pavlov

10 bytes

Win-stay, lose-switch: repeat last move if payoff ≥ 3, otherwise switch.

17          SCORE_LAST          ; [score]
01 03       PUSH 3              ; [score, 3]
0E          LT                  ; [bad?]  1 if score < 3
03          MY_LAST             ; [bad?, my_d]
0F          EQ                  ; [should_coop]  bad==my_d → cooperate
15 01       JMP_FWD_IF 1        ; if true → COOP
16          DEFECT
00          COOP

TitForTwoTats

9 bytes

Only retaliate after two consecutive opponent defections.

02          OPP_LAST            ; [last]
01 01       PUSH 1              ; [last, 1]
04          OPP_N               ; [last, second_last]
11          AND                 ; [both_defected]
15 01       JMP_FWD_IF 1        ; if true → DEFECT
00          COOP
16          DEFECT

Forgiving Detective

23 bytes

Cooperate rounds 0–2, defect round 3 (probe). After: if opponent never defected, exploit (AlwaysDefect); otherwise play TitForTat. A novel strategy impossible to express with the 9 built-in strategies.

08          ROUND               ; [round]
01 03       PUSH 3              ; [round, 3]
0D          GT                  ; [past_opening?]
15 08       JMP_FWD_IF 8        ; if past opening → analysis (skips the 8 bytes below)
08          ROUND               ; [round]
01 03       PUSH 3              ; [round, 3]
0F          EQ                  ; [is_round_3?]
15 01       JMP_FWD_IF 1        ; if round 3 → defect
00          COOP                ; rounds 0-2: cooperate
16          DEFECT              ; round 3: probe defect
; -- analysis (round > 3) --
06          OPP_DEFECTS         ; [opp_d]
01 00       PUSH 0              ; [opp_d, 0]
0F          EQ                  ; [naive?]
15 02       JMP_FWD_IF 2        ; if never defected → exploit
02          OPP_LAST            ; [opp_last]
18          RETURN              ; TFT: mirror opponent
16          DEFECT              ; exploit naive opponent

Try it in the Strategy Lab

Write assembly, get instant WASM validation, and preview your custom strategy against all 9 built-ins, right in the browser.

Commit-Reveal for Custom Strategies

Custom strategies use a two-level hashing scheme to keep the commitment preimage fixed-length while allowing variable-length bytecode.

Strategy Type    Commitment Hash
Built-in         SHA256(strategy_u8 || salt[16])
Custom           SHA256(9u8 || SHA256(bytecode[0..len]) || salt[16])

The inner SHA256(bytecode) hash produces a fixed 32-byte digest regardless of program length, keeping the outer preimage at a fixed 49 bytes (1 + 32 + 16). The bytecode hash can also be displayed independently as a program fingerprint.
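
The byte layout of both preimages can be sketched without committing to a particular hashing library. In the sketch below, `bytecode_digest` stands for SHA256(bytecode) computed elsewhere, and the function names are hypothetical. Only the layout (index, then digest or nothing, then salt) comes from the table above.

```rust
/// Strategy variant index for Custom, as stated in the overview.
pub const CUSTOM_INDEX: u8 = 9;

/// Lays out the 49-byte outer preimage: index || inner digest || salt.
/// `bytecode_digest` is assumed to be SHA256(bytecode), computed by the caller.
pub fn custom_preimage(bytecode_digest: &[u8; 32], salt: &[u8; 16]) -> [u8; 49] {
    let mut out = [0u8; 49];
    out[0] = CUSTOM_INDEX;
    out[1..33].copy_from_slice(bytecode_digest);
    out[33..49].copy_from_slice(salt);
    out
}

/// Lays out the 17-byte built-in preimage: strategy index || salt.
pub fn builtin_preimage(strategy: u8, salt: &[u8; 16]) -> [u8; 17] {
    let mut out = [0u8; 17];
    out[0] = strategy;
    out[1..].copy_from_slice(salt);
    out
}
```

The commitment is then the SHA-256 digest of the returned array. Since the inner digest is always 32 bytes, the outer hash input has the same length for a 1-byte program and a 64-byte one.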

Forfeit handling: The forfeit mechanism uses on-chain SlotHashes sysvar data to deterministically assign a built-in strategy (index 0–8) — forfeited players never receive Custom.

Bytecode Validation

Six checks are performed on-chain during the reveal phase to reject malformed programs before they enter the match pipeline:

1. Non-empty: Program length must be > 0
2. Length limit: Program length must be ≤ 64 bytes
3. Valid opcodes: Every byte must be a known opcode (0x00–0x18)
4. Complete immediates: PUSH, JMP_FWD, and JMP_FWD_IF must have their operand byte present
5. Jump bounds: pc + offset ≤ bytecode.len() for all jumps
6. Has terminal: At least one COOP, DEFECT, or RETURN instruction must exist

Stack depth is not validated statically — underflow and overflow are handled gracefully at runtime (see Machine Model error handling).
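
The six checks can be sketched as a single pass over the program. This is an illustration only, not the crate's `validate_bytecode`: the error type and the name `validate_sketch` are hypothetical. It also makes two assumptions: opcodes are checked at instruction boundaries, so operand bytes are not treated as opcodes, and `pc` in the jump-bounds rule is the index just after the jump's operand.

```rust
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    Empty,
    TooLong,
    UnknownOpcode(u8),
    MissingOperand,
    JumpOutOfBounds,
    NoTerminal,
}

pub fn validate_sketch(code: &[u8]) -> Result<(), ValidationError> {
    // Checks 1 and 2: length bounds.
    if code.is_empty() {
        return Err(ValidationError::Empty);
    }
    if code.len() > 64 {
        return Err(ValidationError::TooLong);
    }
    let mut pc = 0;
    let mut has_terminal = false;
    while pc < code.len() {
        let op = code[pc];
        pc += 1;
        match op {
            // Check 6: remember that a terminal exists.
            0x00 | 0x16 | 0x18 => has_terminal = true,
            // Check 4: two-byte instructions need their operand.
            0x01 | 0x14 | 0x15 => {
                let operand = *code.get(pc).ok_or(ValidationError::MissingOperand)?;
                pc += 1;
                // Check 5: jumps may land at most one past the last byte.
                if op != 0x01 && pc + operand as usize > code.len() {
                    return Err(ValidationError::JumpOutOfBounds);
                }
            }
            // Remaining one-byte opcodes.
            0x02..=0x13 | 0x17 => {}
            // Check 3: anything else is unknown.
            _ => return Err(ValidationError::UnknownOpcode(op)),
        }
    }
    if has_terminal {
        Ok(())
    } else {
        Err(ValidationError::NoTerminal)
    }
}
```

Returning a distinct error per check makes it easier to tell a player which rule a rejected program broke.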

Testing Locally

The match-logic crate provides everything you need to validate and test custom bytecode programs locally before submitting them on-chain.

Key Functions

  • validate_bytecode(bytecode: &[u8])
  • run_match(strategy_a, strategy_b, seed, match_index, participant_count)
  • replay_match(...) (WASM)

Example: Validate & Run a Custom Strategy

use match_logic::{validate_bytecode, run_match, PlayerStrategy};

fn main() {
    // TitForTat as bytecode: OPP_LAST RETURN
    let bytecode = vec![0x02, 0x18];

    // Validate before submitting on-chain
    validate_bytecode(&bytecode).expect("invalid program");

    // Test against AlwaysDefect
    let custom  = PlayerStrategy::Custom(bytecode);
    let defector = PlayerStrategy::Builtin(match_logic::Strategy::new(match_logic::StrategyBase::AlwaysDefect));

    let seed = [0u8; 32];
    let result = run_match(&custom, &defector, &seed, 0, 8);

    println!("Custom: {} | Defector: {}",
        result.total_score_a, result.total_score_b);
    println!("Rounds played: {}", result.round_count);
    for r in &result.rounds {
        println!("  R{}: {:?} vs {:?} → {}-{}",
            r.round, r.move_a, r.move_b, r.score_a, r.score_b);
    }
}

Add match-logic as a dependency in your Cargo.toml to test locally with cargo run. The same code that runs on-chain will execute on your machine — results are deterministic given the same seed.

For browser-based testing, the WASM replay_match() binding accepts JSON strategies like {"Custom": [2, 24]} and returns a full JSON match result.
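
JSON has no hexadecimal literals, so the bytes appear in decimal: 0x02 and 0x18 become 2 and 24. A small helper can produce that JSON shape from a byte slice. The name `custom_strategy_json` is hypothetical, and only the `{"Custom": [...]}` shape is taken from the text above.

```rust
/// Formats bytecode as a JSON strategy object such as {"Custom": [2, 24]}.
/// Bytes are written in decimal because JSON does not support hex numbers.
pub fn custom_strategy_json(bytecode: &[u8]) -> String {
    let items: Vec<String> = bytecode.iter().map(|b| b.to_string()).collect();
    format!("{{\"Custom\": [{}]}}", items.join(", "))
}
```

Calling it with the TitForTat bytes `[0x02, 0x18]` yields the string `{"Custom": [2, 24]}`.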