Polymarket Bot Tutorial · Chapter 16 of 32
Statistical arbitrage on Polymarket: cross-market pairs (correlated events), Polymarket-vs-Kalshi spreads, mean reversion, and how to size stat-arb positions when markets eventually resolve.
What this chapter covers
Statistical arbitrage on Polymarket exploits transient mispricings between correlated markets: the same event on Polymarket vs Kalshi or Manifold, or related markets within Polymarket itself. The edges are small (1-3 cents typical), require fast execution on both legs, and are operationally fragile. This chapter is honest about what works, what doesn't, and the multi-leg execution risk that kills most attempts.
- What stat-arb means in prediction markets
- Polymarket-vs-Kalshi spread examples
- Pairs within Polymarket (correlated events)
- Mean reversion vs trend continuation
- Sizing for resolving (not perpetual) markets
- Risk: divergence past resolution
- Code: pairs monitor and threshold-trigger
What stat-arb means in prediction markets
Statistical arbitrage on prediction markets means trading the spread between two markets that should be priced consistently. Three flavors are common on Polymarket.
- Cross-venue: same event on Polymarket and Kalshi (or Manifold, PredictIt). Pricing should converge; in practice it drifts 2-5 cents.
- Same-event-pair: parent vs sum of legs in NegRisk multi-outcome markets. Sum-to-1 invariant lets you arb when the legs sum to less than 1.0.
- Correlated-event-pair: two markets about related outcomes (e.g. "Trump president on Jan 1" vs "Trump president on Mar 1"). Should price within 2-3 cents of each other.
The edges are small. The operational complexity is real. Most attempts die in execution, not in theory.
The structural one: split/merge basis arb
Before any statistical edge, there is a purely structural arb from the Conditional Token Framework, one that the earlier chapters rarely mention. Every Yes + No pair is redeemable for exactly $1.00 pUSD, so:
- If Yes + No < $1.00 on the book, buy both and merge the pair back to $1.00 - the difference is locked in, with no resolution wait.
- If Yes + No > $1.00, split $1.00 of pUSD into a fresh Yes + No pair and sell both into the book.
It is near risk-free but thin: the gap is usually a cent or less, capacity is bounded by the shallower side's depth, and the taker fee (plus gas on the split/merge) has to fit inside it. For multi-outcome (neg-risk) events the same idea generalises to the all-legs-sum-to-$1.00 check - see the neg-risk chapter and the CTF primitives. Treat this as the floor your fancier stat-arb must beat: if a structural arb is sitting on the book, take it first.
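The check described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the `TopOfBook` structure, the `structural_arb` function, and the `fee_per_share` and `gas_usd` parameters are all hypothetical names introduced here, and the cost values are placeholders you would replace with your own measurements.

```python
from dataclasses import dataclass


@dataclass
class TopOfBook:
    """Best prices and sizes for one outcome token (hypothetical structure)."""
    bid: float
    bid_size: float
    ask: float
    ask_size: float


def structural_arb(yes: TopOfBook, no: TopOfBook,
                   fee_per_share: float = 0.0, gas_usd: float = 0.0):
    """Return (action, shares, net_profit_usd), or None if nothing clears costs.

    fee_per_share is the assumed total taker fee across both legs for one
    Yes + No pair; gas_usd is an assumed flat cost for one split or merge.
    """
    # Buy both sides and merge: pay ask_yes + ask_no, receive $1.00.
    merge_edge = 1.0 - (yes.ask + no.ask) - fee_per_share
    # Split $1.00 and sell both sides: receive bid_yes + bid_no.
    split_edge = (yes.bid + no.bid) - 1.0 - fee_per_share

    # Capacity is bounded by the shallower side's top-of-book depth.
    candidates = [
        ("buy_and_merge", merge_edge, min(yes.ask_size, no.ask_size)),
        ("split_and_sell", split_edge, min(yes.bid_size, no.bid_size)),
    ]
    action, edge, shares = max(candidates, key=lambda c: c[1])
    net = edge * shares - gas_usd
    if edge <= 0 or net <= 0:
        return None
    return action, shares, net
```

Because bids sit below asks, at most one of the two edges can be positive at a time, so picking the larger one is enough. The sketch looks only at top-of-book depth; walking deeper levels would shrink the edge per share.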
Polymarket-vs-Kalshi spread examples
From observation in 2025-26, Polymarket and Kalshi list the same major US events but price 1-4 cents apart on a steady basis. The gap exists for structural reasons that you need to model in any arb.
Structural drivers:
- Fee asymmetry: Kalshi takes 4-7% on winning trades (varies by market); Polymarket takes 0 taker fee. The arb math must net out Kalshi's bite.
- Settlement risk premium: when a market's resolution is ambiguous, one venue's UMA may resolve differently than the other's judges. The market prices this in.
- Trader population: Polymarket trends younger and more crypto-native; Kalshi trends professional / hedge. They disagree on the same events systematically.
The arb works when the gap exceeds the structural premium plus fees. A 5-cent gap on a market where the structural premium is 1c and combined fees are 1c is a 3c real edge.
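The arithmetic above is simple enough to keep as a helper so the threshold is never eyeballed. A minimal sketch, working in cents per share; the function names and the `min_edge_cents` cutoff are assumptions made for illustration, and the structural premium is an input you have to estimate yourself.

```python
def net_edge_cents(gap_cents, structural_premium_cents, combined_fees_cents):
    """Edge left after the structural premium and fees, in cents per share."""
    return gap_cents - structural_premium_cents - combined_fees_cents


def should_trade(gap_cents, structural_premium_cents, combined_fees_cents,
                 min_edge_cents=1):
    """True when the net edge clears a minimum (min_edge_cents is an assumed cutoff)."""
    return net_edge_cents(gap_cents, structural_premium_cents,
                          combined_fees_cents) >= min_edge_cents


print(net_edge_cents(5, 1, 1))   # 3, matching the example above
print(should_trade(2, 1, 1))     # False: nothing is left after costs
```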
Pairs within Polymarket (correlated events)
Within Polymarket, correlated-event pairs are easier to arb than cross-venue. Same fee structure, same wallet, atomic execution feasible.
Examples that consistently price inconsistently:
- Trump president on date A vs Trump president on date B (where B is later than A by < 90 days).
- Will Bitcoin hit $100k by July 31 vs $100k by August 31.
- Yes vs No legs on the same binary market (sum should = 1.0; sometimes drifts as far as 1.04 in thin books).
The Yes+No=1 arb is the cleanest: read both legs from the same market, fire FOK on both if the sum drops below 0.97 (allowing for the spread tax). Capital required is roughly equal on each leg; execution is atomic when both fills come back in the same response.
Mean reversion vs trend continuation
Two stat-arb regimes. Mean reversion: the pair has drifted apart for a noise reason; you bet on convergence. Trend continuation: the pair has started diverging because new information arrived; you bet on further divergence.
Distinguishing them is the hard part. Heuristic: if the divergence happened on visible volume (a whale walked one leg's book), it's news - fade only if you have a model. If it drifted slowly with low volume, it's noise - trade reversion with confidence.
For new builders: trade mean reversion only, on pairs where the divergence is < 1 standard deviation of historical drift. Trend continuation requires a model that catches the news; without one, you're trading against informed flow.
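One way to turn the heuristic into something testable is to score the current spread against its own history and check volume separately. The sketch below is an assumption-laden illustration: `spread_zscore` and `classify_divergence` are hypothetical names, `max_z=1.0` mirrors the one-standard-deviation cap stated above, and the `news_volume_ratio` of 3x typical volume is an arbitrary placeholder for "visible volume", not a calibrated value.

```python
from statistics import mean, stdev


def spread_zscore(history, current):
    """How many historical standard deviations the current spread sits from its mean."""
    mu = mean(history)
    sigma = stdev(history)  # needs at least two observations
    if sigma == 0:
        return 0.0 if current == mu else float("inf")
    return (current - mu) / sigma


def classify_divergence(history, current, recent_volume, typical_volume,
                        max_z=1.0, news_volume_ratio=3.0):
    """Label a divergence as possible news, a reversion candidate, or a skip."""
    z = spread_zscore(history, current)
    # A move on unusually heavy volume is treated as possible news: do not fade it.
    if typical_volume > 0 and recent_volume / typical_volume >= news_volume_ratio:
        return "possible_news", z
    # A quiet drift inside the cap is the only case a new builder trades.
    if abs(z) < max_z:
        return "reversion_candidate", z
    return "skip", z


history = [0.02, 0.03, 0.02, 0.01, 0.02]  # made-up past spreads, in dollars
label, z = classify_divergence(history, 0.025, recent_volume=100, typical_volume=100)
print(label, round(z, 2))
```

The volume check runs first on purpose: a small spread move on heavy volume is still more likely to be informed flow than noise.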
Sizing for resolving (not perpetual) markets
Prediction markets resolve. Crypto pairs don't. This changes the math.
A pair-arb position on Polymarket has a fixed payout schedule: when both markets resolve, the difference between predicted spread and actual spread is locked. There is no rolling, no infinite holding.
Sizing implication: capital is locked until resolution, so time-to-resolution bounds how much you can deploy. A pair resolving in 6 months can earn you 3c per share, but once both legs are fully sized that capital cannot be put to work elsewhere in the meantime.
The right framing: stat-arb on Polymarket is a series of bounded-time trades, not a continuous strategy. Compare PnL per unit of locked capital per day, not gross PnL.
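To make that comparison concrete, here is a minimal sketch of the metric. The function name and the example numbers (capital per share, holding periods) are assumptions chosen for illustration; only the 3c edge over roughly six months comes from the text above.

```python
def return_per_capital_day(edge_per_share, capital_per_share, days_locked):
    """PnL per dollar of locked capital per day for one bounded-time trade."""
    if capital_per_share <= 0 or days_locked <= 0:
        raise ValueError("capital_per_share and days_locked must be positive")
    return edge_per_share / capital_per_share / days_locked


# A 3c edge locked for about six months vs a 1c edge locked for one week.
slow = return_per_capital_day(0.03, capital_per_share=0.97, days_locked=180)
fast = return_per_capital_day(0.01, capital_per_share=0.99, days_locked=7)
print(f"slow={slow:.6f} fast={fast:.6f} per dollar per day")
print("prefer the short trade" if fast > slow else "prefer the long trade")
```

On this metric the smaller but faster trade wins, which is the point of ranking by locked capital per day rather than gross PnL. The sketch ignores fees and the chance that the faster opportunity does not recur.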
Risk: divergence past resolution
The worst stat-arb outcome is that your bet on convergence is wrong because the underlying premise was wrong. Examples:
- You traded "Trump president on Apr 1" against "Trump president on Mar 1" expecting the two prices to converge - but the Mar 1 market resolves YES and the Apr 1 market resolves NO because of a March impeachment. Your "spread should be flat" thesis was wrong.
- You arbed Polymarket vs Kalshi on the same NBA Finals winner. Polymarket resolves to the team that won the official series; Kalshi resolves on a slightly different definition that includes overtime tie-breakers differently. Each venue resolves correctly on its own stated terms, but the two resolve in opposite directions.
Read every market's resolution criteria carefully. Cross-venue arb is one resolution-mismatch away from full loss on both legs.
Code: pairs monitor and threshold-trigger
Reference: monitor two correlated tokens, fire arb when spread crosses threshold.
import time

# fetch_book, fok_buy, and fok_sell are helpers assumed to be defined
# elsewhere; they are not shown in this chapter.

def pairs_monitor(token_a, token_b, threshold_cents=3, size=10):
    """Buy A and sell B when (1 - ask_A) + bid_B > 1 + threshold."""
    while True:
        book_a = fetch_book(token_a)
        book_b = fetch_book(token_b)
        if not (book_a.best_ask and book_b.best_bid):
            time.sleep(2)
            continue
        # implied: cost of buying A at ask + value of selling B at bid
        # (the expression simplifies to bid_B - ask_A)
        edge = (1 - book_a.best_ask) + book_b.best_bid - 1
        if edge > threshold_cents / 100:
            print(f"ARB edge={edge:.3f}; firing")
            r_a = fok_buy(token_a, price=book_a.best_ask, size=size)
            if r_a.status == "matched":
                r_b = fok_sell(token_b, price=book_b.best_bid, size=size)
                if r_b.status != "matched":
                    # leg A filled, B failed -- unhedged, exit A
                    # (this FOK exit can itself fail if the bid has moved)
                    fok_sell(token_a, price=book_a.best_bid, size=size)
        # sleep on every pass so a failed first leg cannot spin the loop
        time.sleep(3)
The cleanup when only one leg fills is critical. Without it, a one-legged execution leaves the bot directionally exposed, which is the opposite of stat-arb's whole point.





