Polymarket market-making: a microstructure note

PLYMKT-MM · 2026-04 · avdhesh charjan

Q: Is market-making on Polymarket economically rational?

A: Yes, emphatically. Every maker volume decile earns a positive volume-weighted markout at every horizon studied (10 s, 1 min, 10 min). Stratification is nearly monotonic (the one exception is D9, which sits below D8 at every horizon), with an ~85× gap between the bottom and top decile at 10 minutes. There is a sharp discontinuity at D7 → D8 that looks like a step change in maker sophistication.

DATA

Source: SII-WANGZJ/Polymarket_data on Hugging Face — 568M on-chain OrderFilled events through March 2026. After filtering to markets with ≥ 5,000 fills (top ~3% by activity, ~75% of volume), the working dataset is 327,417,891 fills across 915,852 unique maker addresses, $13.99B USDC notional.

METHOD

For each fill, compute a markout against the nearest subsequent trade in the same market at horizons of 10 s, 1 min, and 10 min:

signed_yes = +yes_amount  if maker BUY_YES
             −yes_amount  if maker SELL_YES
markout_H  = signed_yes × (price_at_H − price_at_fill)

Forward-ASOF with strict inequality (future_ts > ts) — naive ASOF can self-match when two fills share a timestamp and contaminate short-horizon markouts. Volume-weighted mean across each maker's USDC volume, aggregated by decile (D1 = smallest 10% by volume, D10 = largest).
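The markout definition and the strict forward join can be sketched in plain Python. This is a minimal illustration, not the repo's Polars/DuckDB implementation. The helper names `forward_price` and `markout` and the toy tape are hypothetical. The lookup rule is also an assumption: it takes the first trade at or after ts + H that is strictly later than the fill.

```python
from bisect import bisect_left, bisect_right


def forward_price(times, prices, fill_ts, horizon):
    """Price of the first trade at or after fill_ts + horizon that is also
    strictly later than fill_ts. Returns None when no such trade exists.

    `times` must be sorted ascending; prices[i] is the trade price at times[i].
    """
    # bisect_left  -> first index with times[i] >= fill_ts + horizon
    # bisect_right -> first index with times[i] >  fill_ts (strict, so a fill
    #                 can never match itself or a same-timestamp sibling)
    i = max(bisect_left(times, fill_ts + horizon), bisect_right(times, fill_ts))
    return prices[i] if i < len(times) else None


def markout(side, yes_amount, fill_price, future_price):
    """Signed markout from the maker's side, following the definition above."""
    if side not in ("BUY_YES", "SELL_YES"):
        raise ValueError(f"unknown maker side: {side}")
    signed_yes = yes_amount if side == "BUY_YES" else -yes_amount
    return signed_yes * (future_price - fill_price)


# Toy single-market tape: (seconds, YES price). Two fills share timestamp 0.
times = [0, 0, 5, 12, 70]
prices = [0.50, 0.50, 0.52, 0.55, 0.60]

p10 = forward_price(times, prices, fill_ts=0, horizon=10)  # uses the t=12 trade
m_buy = markout("BUY_YES", 100.0, 0.50, p10)    # maker bought, price rose
m_sell = markout("SELL_YES", 100.0, 0.50, p10)  # maker sold, price rose
p0 = forward_price(times, prices, fill_ts=0, horizon=0)  # skips both t=0 rows
```

With a zero horizon, a non-strict join would return one of the t=0 rows and report a markout of exactly zero. The `bisect_right` bound is what prevents that self-match.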

FINDINGS

Volume-weighted mean markout per (decile × horizon), units: USDC × YES-delta per fill.

DECILE   10s      1min     10min
D1       +0.002   +0.011   +0.022
D2       +0.022   +0.034   +0.048
D3       +0.068   +0.078   +0.096
D4       +0.122   +0.150   +0.151
D5       +0.159   +0.210   +0.206
D6       +0.192   +0.268   +0.300
D7       +0.251   +0.340   +0.467
D8       +0.752   +0.914   +0.913
D9       +0.521   +0.777   +0.903
D10      +2.271   +1.972   +1.841
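
The headline numbers can be checked against the table. The sketch below uses the rounded table entries, so the ratios are approximate and may differ slightly from figures computed on unrounded data. The function names are illustrative and are not taken from the repo.

```python
# Rounded values copied from the findings table, rows D1..D10.
TABLE = {
    "10s":   [0.002, 0.022, 0.068, 0.122, 0.159, 0.192, 0.251, 0.752, 0.521, 2.271],
    "1min":  [0.011, 0.034, 0.078, 0.150, 0.210, 0.268, 0.340, 0.914, 0.777, 1.972],
    "10min": [0.022, 0.048, 0.096, 0.151, 0.206, 0.300, 0.467, 0.913, 0.903, 1.841],
}


def top_bottom_ratio(col):
    """D10 markout divided by D1 markout."""
    return col[-1] / col[0]


def drops(col):
    """Adjacent decile pairs where markout falls instead of rising."""
    return [(f"D{i + 1}", f"D{i + 2}")
            for i in range(len(col) - 1) if col[i + 1] < col[i]]


def step_ratio(col, lo, hi):
    """Markout of decile `hi` over decile `lo` (1-indexed decile numbers)."""
    return col[hi - 1] / col[lo - 1]


for horizon, col in TABLE.items():
    print(horizon,
          "D10/D1:", round(top_bottom_ratio(col), 1),
          "drops:", drops(col),
          "D8/D7:", round(step_ratio(col, 7, 8), 2))
```

On these rounded entries the 10-minute D10/D1 ratio comes out near 84, every cell is positive, and the only adjacent pair where markout falls is D8 to D9.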

INTERPRETATION

The economic story is fair-value mean-reversion in a venue with a lot of uninformed flow. When a passive maker's quote is hit, either (a) the taker is informed and the price keeps moving against the maker, or (b) the taker is uninformed and the price reverts. On Polymarket — high retail participation, binary resolution events that produce mean-reverting micro-moves — case (b) dominates.

The D7 → D8 discontinuity is the most striking structural feature. The top 30% of makers by volume look like a qualitatively different strategy from the middle of the distribution: candidates include algorithmic cross-market arbitrage, latency-sensitive auto-cancellation, or programmatic liquidity provision.

CAVEATS

FOLLOW-UP

Two extensions identified.

1. Resting-book reconstruction to separate quoted spread from realized markout. Status: blocked by dataset. The Hugging Face source has fill events only — no OrderPlaced / OrderCancelled streams, no quote snapshots. Would require Polygon RPC log replay for the CTFExchange contract.

2. Market-classification stratification (politics / sports / crypto / macro / short-horizon price bets) and time-to-resolution at fill, to isolate which market types reward which strategies. Status: implemented. See finding 2 in the repo.

REPRODUCIBILITY

git clone https://github.com/avdheshcharjan/polymarket-market-maker
cd polymarket-market-maker
uv sync

uv run pmm download --file quant.parquet
uv run pmm metrics --source-file data/raw/quant.parquet \
                   --source-kind quant --engine duckdb \
                   --min-fills 5000
uv run python scripts/finding_01.py

Full pipeline ~46 min on a 16 GB Mac, peak scratch disk ~55 GB. 43 tests cover unit + property (hypothesis) + cross-engine equivalence (Polars vs DuckDB).

NOTE

This finding is less a novel claim than an empirical confirmation: on Polymarket's on-chain CLOB, maker markouts are systematically positive, and maker-sophistication stratification is real, nearly monotonic (D9 dips below D8), and large enough to matter. The interesting work is downstream: explaining the D7 → D8 step and identifying which kinds of markets reward which strategies.