If you're chasing arbitrage or +EV value, the only number that matters is end-to-end latency: how long between a soft book mispricing a line and your bet hitting their server. Books are sophisticated enough that genuine windows close within 1 to 30 seconds. If your stack takes 20 seconds, you catch tail-end edges only. If you can run end-to-end in under 2 seconds, you catch the meat.
This post breaks down every step in that path with concrete numbers. We've measured each on production traffic running against US-licensed sportsbooks. If you're building a betting agent, model-runner, or arb scanner, this is the budget you're working with.
Eight stages between the book moving a line and your bet getting confirmed:
Concrete numbers per stage. The "tight" column is what a production system actually hits. The "slack" column is what a hobby setup hits. The "stale" column is where you're effectively betting blind.
| Stage | Tight (ms) | Slack (ms) | Stale (ms) | Optimization lever |
|---|---|---|---|---|
| 1. Book publishes | 0 | 0 | 0 | You don't control this; it's the t=0 reference. |
| 2. Aggregator ingest | 200 | 800 | 3000+ | WebSocket subscriptions or per-book hot loops. Polling = slack/stale; push = tight. |
| 3. Aggregator broadcast | 50 | 200 | 5000+ | WebSocket fan-out at the edge. CDN-cached REST = stale by definition. |
| 4. Network to agent | 20 | 100 | 500 | Geographic proximity to aggregator. East-coast US to East-coast US = ~20ms. |
| 5. Agent decision | 10 | 100 | 2000 | Pre-computed sharp-anchor fair prob, lookup-table edge thresholds. Avoid synchronous DB writes in the hot path. |
| 6. Bet submission | 200 | 600 | 2000 | Persistent connection to the book. Avoid re-auth per bet. |
| 7. Book risk-engine | 200 | 800 | 5000 | You don't control this. Some books are intentionally slow on edge accounts (latency = risk filter). |
| 8. Confirmation | 100 | 300 | 1000 | Network return path; symmetric with stage 4. |
| Total round-trip | 780 | 2900 | 18500+ | Sum of stages 1-8. |
Tight ~780ms. Achievable. Sub-second round trip from book-line-move to bet-confirmation on a well-architected stack.
Slack ~2900ms. Reasonable for most operators. Catches most opportunities; misses the tightest windows.
Stale 18s+. The line has either moved or the edge has been hit by other faster operators by the time your bet reaches the book.
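To sanity-check the totals, or to swap in your own measurements, the budget can be expressed as data and summed. This is a minimal sketch: the numbers are copied from the table above, while the names `STAGES_MS`, `total_ms`, and `fits_window` are illustrative and not part of any API.

```python
# Per-stage latency budget in milliseconds: (stage, tight, slack, stale).
# Values are copied from the table above; replace them with your own measurements.
STAGES_MS = [
    ("Book publishes", 0, 0, 0),
    ("Aggregator ingest", 200, 800, 3000),
    ("Aggregator broadcast", 50, 200, 5000),
    ("Network to agent", 20, 100, 500),
    ("Agent decision", 10, 100, 2000),
    ("Bet submission", 200, 600, 2000),
    ("Book risk-engine", 200, 800, 5000),
    ("Confirmation", 100, 300, 1000),
]
PROFILES = ("tight", "slack", "stale")


def total_ms(profile: str) -> int:
    """Sum every stage for one profile ("tight", "slack" or "stale")."""
    column = PROFILES.index(profile) + 1  # column 0 holds the stage name
    return sum(row[column] for row in STAGES_MS)


def fits_window(profile: str, window_s: float) -> bool:
    """True if the full round trip completes before a window of window_s seconds closes."""
    return total_ms(profile) < window_s * 1000


if __name__ == "__main__":
    for profile in PROFILES:
        print(profile, total_ms(profile), "ms, fits a 2 s window:", fits_window(profile, 2.0))
```

Keeping the budget as data makes it easy to see which stage dominates once you replace the table values with your own timings.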
Three observations from running this in production:
If you're polling a REST endpoint every 30 seconds, you've already lost. Polling alone adds half your polling interval on average (15 seconds here), and on top of that comes the age of the cached response, which can be anything up to the cache TTL. A 30-second poll against a 60-second-TTL cache can leave data up to 90 seconds stale before it even gets to you.
WebSocket flips this. The aggregator's hot loop hits the book on its own cadence (often 200-500ms), pushes the diff to your socket as soon as it lands. Your effective ingest latency is the book's update cadence plus the aggregator's processing overhead, which is typically 300-800ms total.
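The polling arithmetic can be sketched in a few lines. This assumes both the time since your last poll and the age of the cached response are uniformly distributed, which is a simplification; `polling_staleness_s` is a hypothetical helper, not something from an existing library.

```python
def polling_staleness_s(poll_interval_s: float, cache_ttl_s: float) -> tuple[float, float, float]:
    """Return (best, average, worst) staleness in seconds for polled, cached data.

    Assumption: the update lands at a uniformly random point in the polling
    interval, and the cached response has a uniformly random age in [0, TTL].
    """
    best = 0.0                                       # fresh cache, polled right after the update
    average = poll_interval_s / 2 + cache_ttl_s / 2  # mean of both uniform delays
    worst = poll_interval_s + cache_ttl_s            # just missed the poll, cache about to expire
    return best, average, worst


if __name__ == "__main__":
    # 30-second poll against a 60-second-TTL cache.
    print(polling_staleness_s(30, 60))
    # A push feed has no polling interval or cache term at all; per the text,
    # its ingest latency is typically 0.3 to 0.8 seconds in total.
```

Even the average case under these assumptions is far longer than the windows discussed in this post, which is the argument for push over poll.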
You can hyper-optimize stages 2-6, get your bet to the book's server in 300ms, and still wait 2+ seconds at the book for risk-engine evaluation. Books deliberately slow-evaluate accounts they've flagged as sharp. There's a floor on bet-acceptance latency that no amount of client-side speed can break through.
The defense: don't pattern-bet (always taking the off-market side, always near max stake, always within seconds of a line move). Sprinkle in recreational-shaped bets, vary stake size, and take some "value" plays that you'd lose to a model. Over the long run, this makes you less likely to be flagged as sharp, which in turn reduces the risk-engine latency you see.
Most agents waste 1-2 seconds in decision time because they synchronously query a database, run a heavy stats package, or recompute fair value from scratch. Pre-compute everything: store the sharp anchor's no-vig fair prob per (event, market, side) in memory, refresh it on every push, and the agent's "is this +EV" check becomes a single dictionary lookup plus an arithmetic comparison. Sub-10ms.
```python
# bad: sync DB + recompute
def is_ev(book_price, event_id, side):
    sharp_price = db.query("SELECT price FROM sharp WHERE ...")  # 50-200ms
    fair = compute_devig(sharp_price, ...)                       # 10-50ms
    return implied(book_price) < fair

# good: in-memory lookup
def is_ev(book_price, event_id, side):
    fair = FAIR_PROBS.get((event_id, side))  # <1ms
    if fair is None:
        return False
    return implied(book_price) < fair        # <1ms
```
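The snippet above leaves `implied` and the refresh of `FAIR_PROBS` undefined. One way they might be filled in is sketched below, under two assumptions that the post does not confirm: prices arrive as American odds, and the vig is removed by simple proportional normalization (one of several de-vig methods). The handler name `on_sharp_push` and the shape of its arguments are hypothetical.

```python
# In-memory store of no-vig fair probabilities, keyed by (event_id, side).
FAIR_PROBS: dict[tuple[str, str], float] = {}


def implied(american_odds: int) -> float:
    """Implied probability of an American-odds price (assumed price format)."""
    if american_odds > 0:
        return 100.0 / (american_odds + 100.0)
    return -american_odds / (-american_odds + 100.0)


def on_sharp_push(event_id: str, prices_by_side: dict[str, int]) -> None:
    """Refresh fair probabilities whenever the sharp anchor pushes new prices.

    Assumption: proportional (multiplicative) de-vig, i.e. each side's implied
    probability divided by the sum across all sides of the market.
    """
    raw = {side: implied(price) for side, price in prices_by_side.items()}
    overround = sum(raw.values())
    for side, prob in raw.items():
        FAIR_PROBS[(event_id, side)] = prob / overround


def is_ev(book_price: int, event_id: str, side: str) -> bool:
    """Hot-path check: one dictionary lookup plus one comparison."""
    fair = FAIR_PROBS.get((event_id, side))
    if fair is None:
        return False
    return implied(book_price) < fair


if __name__ == "__main__":
    on_sharp_push("evt1", {"home": -110, "away": -110})  # fair is 0.5 per side
    print(is_ev(105, "evt1", "home"))   # soft book offers +105 on a 50% side
    print(is_ev(-105, "evt1", "home"))  # -105 implies more than 50%
```

All the de-vig work happens in the push handler, off the hot path, so the decision step stays at lookup-plus-compare cost regardless of how the fair probability was derived.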
Real-world arb and +EV windows on liquid US books typically last 1 to 30 seconds. The distribution is heavily front-loaded: 80% of opportunities close within the first 5 seconds, 50% within 2 seconds. Your round-trip latency has to be less than the window for the bet to land at the displayed price.
Practical implication: getting from 5-second total latency to 2-second total latency roughly doubles your hit rate, because you go from catching the long-tail to catching half the meat. Getting from 2-second to 1-second is another 50% improvement on top of that.
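The hit-rate arithmetic can be made explicit with a rough survival curve. In the sketch below, the 2-second and 5-second points come from the figures above; the 0-second and 30-second endpoints and the linear interpolation between points are assumptions, and `hit_rate` is an illustrative helper rather than a measured model.

```python
from bisect import bisect_right

# (latency in seconds, fraction of windows still open at that point).
# 50% closed by 2 s and 80% closed by 5 s are taken from the text;
# the endpoints and the straight lines between points are assumptions.
SURVIVAL_POINTS = [(0.0, 1.0), (2.0, 0.5), (5.0, 0.2), (30.0, 0.0)]


def hit_rate(latency_s: float) -> float:
    """Estimated fraction of windows still open after latency_s seconds."""
    if latency_s <= SURVIVAL_POINTS[0][0]:
        return 1.0
    if latency_s >= SURVIVAL_POINTS[-1][0]:
        return 0.0
    times = [t for t, _ in SURVIVAL_POINTS]
    i = bisect_right(times, latency_s)  # index of the first point past latency_s
    (t0, s0), (t1, s1) = SURVIVAL_POINTS[i - 1], SURVIVAL_POINTS[i]
    return s0 + (s1 - s0) * (latency_s - t0) / (t1 - t0)


if __name__ == "__main__":
    for latency in (5.0, 2.0, 1.0):
        print(f"{latency:.0f} s latency -> {hit_rate(latency):.0%} of windows still open")
```

Under these assumptions, 5 seconds catches about 20% of windows, 2 seconds about 50%, and 1 second about 75%, which lines up with the "roughly doubles" and "another 50%" figures above.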
- REST: /v1/sports/{sport}/odds every 5 seconds.
- WebSocket: wss://parlay-api.com/v1/ws for diff pushes.

Every response we return has an X-Source-Latency-Ms header indicating how stale the aggregator's snapshot was at the time of return. WebSocket pushes carry the same field in their payload. You can plot end-to-end latency by subtracting the upstream observed_at from your own received_at. We track 50th, 95th, and 99th percentile per book on /uptime.
Want to verify your own stack? Run our WebSocket reference client with the built-in latency logger; it timestamps every push at receive and prints the per-book end-to-end every minute.
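If you would rather roll your own logger, the core of it is small. The sketch below is not the reference client; it assumes each push is a JSON object with a `book` field and an `observed_at` field in epoch seconds, which is a guess at the payload shape, and the function names are hypothetical.

```python
import json
import statistics
import time
from collections import defaultdict
from typing import Optional

# End-to-end latency samples in milliseconds, grouped by book.
samples_ms: dict[str, list[float]] = defaultdict(list)


def record_push(raw_message: str, received_at: Optional[float] = None) -> float:
    """Timestamp one push at receive time and store its end-to-end latency.

    Assumed payload shape: {"book": "...", "observed_at": <epoch seconds>, ...}.
    """
    if received_at is None:
        received_at = time.time()
    message = json.loads(raw_message)
    latency_ms = (received_at - message["observed_at"]) * 1000.0
    samples_ms[message["book"]].append(latency_ms)
    return latency_ms


def percentiles(book: str) -> dict[str, float]:
    """50th, 95th and 99th percentile latency for one book (needs at least 2 samples)."""
    cuts = statistics.quantiles(samples_ms[book], n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Call `record_push` from your WebSocket message handler and print `percentiles(book)` on a timer. The numbers are only meaningful if your clock is synchronized with the upstream clock (for example via NTP), since the calculation compares timestamps taken on two different machines.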
Latency is a means, not an end. The end is edge captured × throughput × longevity. Sub-second latency captures more edge, but if you're betting in patterns the book detects, your throughput collapses and your longevity goes to zero. Latency is necessary but not sufficient. The sharpest operators have sub-second stacks AND vary their bet patterns; the books can't tell them apart from sophisticated recreational bettors until much later.
If you want the latency floor we describe here, ParlayAPI ships sub-second WebSocket on the scale tier ($499/mo) and sub-3-second REST on every tier including free. Try the WebSocket reference client at examples/ws_reference_client.py or sign up at /signup.