
Rate Limiting

The Archive API enforces 5 requests per second per workspace. Requests beyond that limit return HTTP 429 with code RATE_LIMIT_EXCEEDED.

The limit is scoped per workspace — if a single token has access to multiple workspaces, each workspace gets its own 5 RPS budget. They don’t share.


How to handle 429

The error message includes the number of seconds to wait before retrying. Read it and respect it:

{
  "errors": [
    {
      "message": "Rate limit exceeded. Retry after 0.4 seconds.",
      "extensions": { "code": "RATE_LIMIT_EXCEEDED" }
    }
  ]
}
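As a rough illustration of reading the wait time, here is a minimal Python sketch. It assumes the message always follows the "Retry after N seconds." wording shown in the example; the regular expression and the `retry_delay_seconds` helper are hypothetical and not part of the Archive API.

```python
import re
from typing import Optional

# Assumption: the wording matches the documented example,
# "Rate limit exceeded. Retry after 0.4 seconds."
_RETRY_AFTER = re.compile(r"Retry after (\d+(?:\.\d+)?) seconds?")


def retry_delay_seconds(body: dict) -> Optional[float]:
    """Return the advertised wait in seconds, or None if it is not present."""
    for error in body.get("errors", []):
        code = (error.get("extensions") or {}).get("code")
        if code != "RATE_LIMIT_EXCEEDED":
            continue  # some other error; not a rate-limit response
        match = _RETRY_AFTER.search(error.get("message", ""))
        if match:
            return float(match.group(1))
    return None


# The response body from the example above, parsed as a dict.
example = {
    "errors": [
        {
            "message": "Rate limit exceeded. Retry after 0.4 seconds.",
            "extensions": {"code": "RATE_LIMIT_EXCEEDED"},
        }
    ]
}
delay = retry_delay_seconds(example)
```

If the helper returns `None`, the message did not match the assumed wording, and the caller would need its own fallback delay.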

Retrying with backoff does not fix a client that sends too fast: the request rate stays the same, and each retry adds latency. The better pattern is to shape your traffic before it leaves your service by throttling outbound requests to 5 RPS per workspace on the client side.

  • Keep a token-bucket or fixed-window limiter capping outbound traffic to 5 RPS per workspace.
  • If multiple processes hit the same workspace through the same backend, the throttle has to be shared across them. Per-process limiters each allow 5 RPS, so N processes together can send up to N × 5 RPS.
  • If your job has a hard time budget (e.g., “render the first paint in under 3s”), size your initial fetch so it fits within that budget at 5 RPS. Five requests use a full second of the workspace's quota, so a 3-second budget covers roughly 15 requests at most.
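One way to sketch the client-side throttle described above is a per-workspace token bucket. The class names, the `delay_for` method, and the default burst capacity of 1 are assumptions made for illustration. The sketch also keeps its state in process memory, so it only covers the single-process case; sharing the throttle across processes would need a common store, which is not shown.

```python
import threading
import time
from typing import Callable, Dict


class TokenBucket:
    """Token bucket that hands out send slots at `rate` per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float]) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def reserve(self) -> float:
        """Reserve one request; return how long to wait before sending it."""
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        self.tokens -= 1.0
        if self.tokens >= 0:
            return 0.0
        # A negative balance means the slot lies in the future.
        return -self.tokens / self.rate


class WorkspaceThrottle:
    """Keeps one independent bucket per workspace (hypothetical helper)."""

    def __init__(self, rate: float = 5.0, capacity: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}
        self._lock = threading.Lock()  # guards buckets across threads

    def delay_for(self, workspace_id: str) -> float:
        with self._lock:
            bucket = self._buckets.get(workspace_id)
            if bucket is None:
                bucket = TokenBucket(self._rate, self._capacity, self._clock)
                self._buckets[workspace_id] = bucket
            return bucket.reserve()
```

A caller would run `time.sleep(throttle.delay_for(workspace_id))` before each request. With a capacity of 1, requests to one workspace are spaced 0.2 seconds apart; a larger capacity permits short bursts, which is only safe if the server's limiter tolerates them.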

Treat 429 as a bug in your client, not a transient error to retry around.

See also: Best Practices → Plan your fan-out for engagementHistory for the one query that genuinely has to fan out.
