Security and Safety

Rate Limiting

Rate limiting constrains how many actions, application programming interface (API) calls, or tokens an agent can consume within a given time period. It prevents runaway loops, denial-of-service conditions, and unexpected cost spikes. Without rate limits, a single malfunctioning agent caught in an infinite retry cycle (for example, retrying a failed tool call every two seconds across a 200-step planning loop) can generate a $400 bill from a single run before any human notices. This is not a hypothetical edge case but a recurring incident pattern documented across public agent deployments. Effective rate limiting operates at three levels: per-call limits (maximum tokens per request), per-session limits (maximum total spend per task), and circuit breakers that halt execution when spend or iteration counts cross a threshold you set before the agent ever starts.
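
The three levels described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names RunBudget, BudgetExceeded, and run_agent are hypothetical, the default limits are placeholder values, and the per-step cost is passed in directly rather than read from a real provider's billing data.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a configured limit is hit; the agent loop should stop."""


@dataclass
class RunBudget:
    # Hypothetical limits, fixed before the agent starts. Values are placeholders.
    max_tokens_per_call: int = 4_000   # per-call limit
    max_spend_usd: float = 5.00        # per-session limit
    max_iterations: int = 50           # circuit-breaker threshold
    spent_usd: float = 0.0
    iterations: int = 0

    def check_call(self, requested_tokens: int) -> None:
        """Run before every model or tool call; raises if any limit is hit."""
        if requested_tokens > self.max_tokens_per_call:
            raise BudgetExceeded("token limit per call exceeded")
        if self.iterations >= self.max_iterations:
            raise BudgetExceeded("iteration limit reached")
        if self.spent_usd >= self.max_spend_usd:
            raise BudgetExceeded("spend limit reached")

    def record(self, cost_usd: float) -> None:
        """Run after every call to account for what it actually cost."""
        self.spent_usd += cost_usd
        self.iterations += 1


def run_agent(budget: RunBudget, cost_per_step_usd: float, tokens_per_step: int) -> str:
    """Simulate a loop that never finishes on its own.

    Returns the reason the budget stopped it. A real agent would make its
    model or tool call between check_call() and record().
    """
    while True:
        try:
            budget.check_call(tokens_per_step)
        except BudgetExceeded as exc:
            return str(exc)
        budget.record(cost_per_step_usd)


if __name__ == "__main__":
    budget = RunBudget(max_spend_usd=1.00)
    reason = run_agent(budget, cost_per_step_usd=0.25, tokens_per_step=1_000)
    print(reason, budget.iterations, budget.spent_usd)
```

Because the spend check runs before each call and the cost is only known afterwards, a session can overshoot its spend limit by at most the cost of one call. If that margin matters, set the limit slightly below the real ceiling.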