Rate Limiting
Rate limits protect nodes from overload and give clients predictable failure semantics. Apply limits at the gateway before a request reaches the node process.
| Limit dimension | Why | Example policy |
|---|---|---|
| Per API key | Fairness between customers and services | 100 requests/second with burst 200 |
| Per source IP | Abuse control before authentication or for public tiers | 20 requests/second with burst 40 |
| Per method | Expensive calls need lower budgets | Lower limits for batch, trace, simulation, and historical-range calls |
| Per batch | Prevent a single JSON-RPC request from hiding huge work | Maximum 20 calls or 1 MiB body |
| Per WebSocket connection | Subscription fanout protection | Maximum connections, subscriptions, and messages/minute |
Token bucket is the usual model: tokens refill at a steady rate, bursts spend saved tokens, and a request that finds the bucket empty is rejected with 429 Too Many Requests.
```typescript
type RateLimitDecision =
  | { allowed: true; remaining: number }
  | { allowed: false; status: 429; retryAfterSeconds: number };
```
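As an illustration of how a gateway might produce this decision, here is a minimal in-memory token bucket sketch. The `TokenBucket` class, its `take` method, and the injectable `nowMs` clock are hypothetical names chosen for this example, not part of any particular gateway. A production limiter would usually keep one bucket per key, IP, or method in shared storage.

```typescript
// Same shape as the RateLimitDecision type above, repeated so the sketch is self-contained.
type RateLimitDecision =
  | { allowed: true; remaining: number }
  | { allowed: false; status: 429; retryAfterSeconds: number };

// Hypothetical in-memory token bucket (assumption: single process, one bucket per client).
class TokenBucket {
  private readonly refillPerSecond: number; // steady refill rate
  private readonly capacity: number; // maximum burst size
  private tokens: number;
  private lastRefillMs: number;

  constructor(refillPerSecond: number, capacity: number, nowMs: number = Date.now()) {
    this.refillPerSecond = refillPerSecond;
    this.capacity = capacity;
    this.tokens = capacity; // start full so an idle client can burst immediately
    this.lastRefillMs = nowMs;
  }

  take(cost: number = 1, nowMs: number = Date.now()): RateLimitDecision {
    // Refill in proportion to elapsed time, never above capacity.
    if (nowMs > this.lastRefillMs) {
      const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
      this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
      this.lastRefillMs = nowMs;
    }
    if (this.tokens >= cost) {
      this.tokens -= cost;
      return { allowed: true, remaining: Math.floor(this.tokens) };
    }
    // Time until enough tokens have refilled, rounded up to whole seconds.
    const retryAfterSeconds = Math.ceil((cost - this.tokens) / this.refillPerSecond);
    return { allowed: false, status: 429, retryAfterSeconds };
  }
}
```

With the per-API-key example policy from the table, a bucket would be created as `new TokenBucket(100, 200)`: it refills 100 tokens per second and holds at most 200. Passing the clock in as a parameter is a design choice that keeps the sketch deterministic under test.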
Clients must treat 429 as a backpressure signal, not as a reason to retry immediately. Honor Retry-After when present and otherwise use exponential backoff with jitter. Solana public RPC clients may also see rate-limit or access-control failures such as 429 and 403; production clients should move sustained workloads to a private endpoint rather than escalating retry pressure (Solana RPC).
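The client rule above can be sketched as a small delay calculator. `retryDelayMs`, `sendWithBackoff`, and the `ThrottledResponse` shape are hypothetical names for this example, and the 250 ms base and 30 s cap are assumed defaults rather than recommended values. The sketch handles only the delay-seconds form of `Retry-After`; the HTTP-date form falls through to backoff.

```typescript
// Hypothetical helper: how long to wait before retry number `attempt` (0-based).
function retryDelayMs(
  attempt: number,
  retryAfterHeader: string | null,
  baseMs: number = 250, // assumed base delay
  capMs: number = 30000, // assumed upper bound
  random: () => number = Math.random,
): number {
  // Prefer the server's hint when it is a whole number of seconds.
  if (retryAfterHeader !== null && /^\d+$/.test(retryAfterHeader.trim())) {
    return Number(retryAfterHeader.trim()) * 1000;
  }
  // Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)).
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Minimal response shape this sketch needs; adapt it to the HTTP client in use.
interface ThrottledResponse {
  status: number;
  retryAfter: string | null;
}

// Retries only on 429, waits between attempts, and gives up after maxAttempts.
async function sendWithBackoff<T extends ThrottledResponse>(
  send: () => Promise<T>,
  maxAttempts: number = 5,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const res = await send();
    if (res.status !== 429 || attempt + 1 >= maxAttempts) return res;
    const delay = retryDelayMs(attempt, res.retryAfter);
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
  }
}
```

Full jitter spreads retries from many clients across the whole backoff window, which avoids synchronized retry waves after a shared throttling event. The attempt cap keeps a throttled client from retrying indefinitely.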
:::warning Batch limits are rate limits
A single JSON-RPC batch can be more expensive than many small calls. Count both the HTTP request and each operation inside the batch.
:::
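One way to apply this is to validate the batch and report two counts that the gateway then charges against its budgets. This is a sketch under assumptions: `checkBatch`, `BatchCheck`, and the 400/413 status choices are hypothetical, and the limits reuse the example policy of 20 calls or 1 MiB from the table. The caller is assumed to have already parsed the JSON body and measured its size in bytes.

```typescript
// Limits from the example policy in the table; real deployments will differ.
const MAX_BATCH_CALLS = 20;
const MAX_BODY_BYTES = 1024 * 1024; // 1 MiB

type BatchCheck =
  | { ok: true; httpRequests: 1; operations: number }
  | { ok: false; status: 400 | 413; reason: string };

// `payload` is the parsed JSON body; `bodyBytes` is the raw body size in bytes.
function checkBatch(payload: unknown, bodyBytes: number): BatchCheck {
  if (bodyBytes > MAX_BODY_BYTES) {
    return { ok: false, status: 413, reason: "body exceeds 1 MiB" };
  }
  // A JSON-RPC batch is an array of request objects; a single call is one object.
  const operations = Array.isArray(payload) ? payload.length : 1;
  if (operations === 0) {
    return { ok: false, status: 400, reason: "empty batch" };
  }
  if (operations > MAX_BATCH_CALLS) {
    return { ok: false, status: 400, reason: "batch exceeds " + MAX_BATCH_CALLS + " calls" };
  }
  // Report both counts so the caller can charge the request and each operation.
  return { ok: true, httpRequests: 1, operations };
}
```

A gateway could charge `httpRequests` against a per-request budget and `operations` against a per-method or per-key budget, so a batch of 20 calls is not billed as a single request.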
See /developer/retry-timeout-backoff for client behavior after throttling.