How do I set a monthly AI spending limit?
Create a hard budget that blocks or degrades AI calls before charges hit your provider bill — in under 5 minutes.
Goal: A monthly spending limit enforced at the proxy, so that overspend is stopped before it happens rather than just reported afterwards.
Time: 5 minutes.
Step 1 — Go to Budgets and create a new budget
Open Dashboard → Budgets → New Budget.
Fill in the form:
| Field | What to enter |
|---|---|
| Name | Something descriptive, such as Engineering monthly, Chatbot feature, or Org total |
| Monthly limit | Your budget in USD |
| Scope | See below |
| Enforcement | See below |
| Alert thresholds | 50, 80, 100 (percent of the monthly limit; you'll get an email alert at each) |
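As a rough illustration of how the alert thresholds relate to the monthly limit, the sketch below assumes the values 50, 80, and 100 are percentages of the limit. The function name `crossed_thresholds` is hypothetical and not part of the product; it only shows the arithmetic.

```python
def crossed_thresholds(spent_usd, limit_usd, thresholds=(50, 80, 100)):
    """Return the percentage thresholds that current spend has reached.

    Assumption: thresholds are percentages of the monthly limit.
    """
    if limit_usd <= 0:
        raise ValueError("limit_usd must be positive")
    # Cross-multiply instead of dividing to avoid rounding surprises
    # when spend sits exactly on a threshold.
    return [t for t in sorted(thresholds) if spent_usd * 100 >= t * limit_usd]


print(crossed_thresholds(410.0, 500.0))  # prints [50, 80]
```

With a $500 limit and $410 spent (82% used), the 50 and 80 alerts would already have fired, and the 100 alert would still be pending.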
Choosing the scope:
| Scope | Use when |
|---|---|
| Global | One limit for your entire organisation's AI spend |
| Department | Limit a team. Requires the header X-Cost-Department: engineering on your calls |
| Feature | Limit a single product feature. Requires the header X-Cost-Feature: chatbot on your calls |
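The scope headers in the table can be attached to whatever headers your client already sends. The sketch below is one way to do that; the helper name `cost_headers` and the `Content-Type` header in the example are illustrative assumptions, while the two `X-Cost-*` header names come from the table above.

```python
def cost_headers(department=None, feature=None):
    """Build the scope headers for a call; omit any scope that is not set."""
    headers = {}
    if department:
        headers["X-Cost-Department"] = department
    if feature:
        headers["X-Cost-Feature"] = feature
    return headers


# Merge the scope headers into the headers your client already sends.
existing_headers = {"Content-Type": "application/json"}
request_headers = {
    **existing_headers,
    **cost_headers(department="engineering", feature="chatbot"),
}
print(request_headers)
```

Building the headers in one helper keeps the department and feature names consistent across calls, which matters because a budget only applies to calls that carry the matching value.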
Choosing enforcement:
| Mode | What happens when the limit is hit |
|---|---|
| Alert | Call goes through, you get a notification. Good for monitoring. |
| Degrade | Call is rerouted to a cheaper model. Good for production features. |
| Block | Call is rejected with an HTTP 429 response. Good for experimental features and agents. |
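The three modes in the table can be read as a small decision rule. The sketch below models that rule for illustration only; it is not the proxy's actual implementation, and the names `Mode` and `outcome`, the returned strings, and the assumption that "the limit is hit" means spend has reached or exceeded the limit are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    ALERT = "alert"
    DEGRADE = "degrade"
    BLOCK = "block"


def outcome(mode, spent_usd, limit_usd):
    """Describe what happens to a call under the given enforcement mode."""
    if spent_usd < limit_usd:
        return "forward"  # under the limit: every mode lets the call through
    if mode is Mode.ALERT:
        return "forward_and_notify"  # call goes through, notification is sent
    if mode is Mode.DEGRADE:
        return "reroute_to_cheaper_model"
    return "reject_429"  # Block mode


print(outcome(Mode.DEGRADE, spent_usd=500.0, limit_usd=500.0))
# prints reroute_to_cheaper_model
```

Note that only Block stops spend outright; Alert and Degrade both let the call proceed in some form once the limit is reached.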
Start with Alert, graduate to Block
If you're not sure which to pick, start with Alert for 2 weeks to understand normal spend patterns. Then switch to Block or Degrade once you're confident in the limit.
Step 2 — Save and verify
Click Save. The budget appears on the Budgets page immediately.
To verify it's active, make a test API call through the proxy and check the response headers. They include the remaining budget in USD for the call's scope.
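The name of the response header that carries the remaining budget is not given here, so the sketch below takes it as a parameter. `X-Budget-Remaining` is a hypothetical placeholder, not a confirmed header name; substitute the name you actually see in your proxy's responses. The helper `remaining_budget_usd` is likewise illustrative.

```python
def remaining_budget_usd(response_headers, header_name):
    """Return the remaining budget as a float, or None if the header is absent.

    HTTP header names are case-insensitive, so compare in lowercase.
    """
    wanted = header_name.lower()
    for name, value in response_headers.items():
        if name.lower() == wanted:
            return float(value)
    return None


# Hypothetical header name: replace with the one your proxy returns.
BUDGET_HEADER = "X-Budget-Remaining"

# Example response headers, as any HTTP client would expose them.
sample_headers = {"Content-Type": "application/json", "x-budget-remaining": "412.50"}
print(remaining_budget_usd(sample_headers, BUDGET_HEADER))  # prints 412.5
```

If the function returns None for a call that carries the right scope header, the budget is probably not matching that call.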
Step 3 — Test the enforcement (optional)
To verify that Block mode works, temporarily set the budget to $0.01, make a call, and confirm you get a 429. Then restore the real limit.
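The check in this step can be scripted with the Python standard library. The sketch below is a minimal version: `urllib` raises `HTTPError` for a 429 response, so the helper catches it and returns the status code. The function names and the `open_fn` parameter are illustrative choices, and the request you pass in (proxy URL, body, scope headers) depends on your own setup.

```python
import urllib.error
import urllib.request


def status_of(request, open_fn=urllib.request.urlopen):
    """Return the HTTP status code of a call, including error statuses."""
    try:
        with open_fn(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status is on the exception.
        return err.code


def blocked_by_budget(status_code):
    """True when the call was rejected with 429, as Block mode does."""
    return status_code == 429


# Usage, with a urllib.request.Request built for your own proxy endpoint:
#   assert blocked_by_budget(status_of(my_request))
```

A 429 can also come from ordinary rate limiting, so run this check only while the budget is set to the temporary $0.01 value.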
Useful follow-ups
- Multiple levels — Add a department budget on top of feature budgets. The tightest limit always wins. See Hierarchical Budgets.
- Agent workflows — Add a per-run limit so one runaway agent can't consume your entire feature budget. See Track Cost Per Agent Run.
- Get alerted in Slack — Budget alerts can route to Slack instead of (or in addition to) email. See Slack Alerts.
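One way to read "the tightest limit always wins" from the Multiple levels item above is that the budget with the least remaining headroom is the one that triggers enforcement first. The sketch below illustrates that reading only; the function `binding_budget`, the budget names, and the figures are made up for the example.

```python
def binding_budget(budgets):
    """Return (name, remaining_usd) for the budget with the least headroom.

    `budgets` maps a budget name to a (spent_usd, limit_usd) pair and must
    contain at least one entry.
    """
    remaining = {name: limit - spent for name, (spent, limit) in budgets.items()}
    name = min(remaining, key=remaining.get)
    return name, remaining[name]


budgets = {
    "department: engineering": (1800.0, 2000.0),  # $200 left
    "feature: chatbot": (450.0, 500.0),           # $50 left
}
print(binding_budget(budgets))  # prints ('feature: chatbot', 50.0)
```

Here the chatbot feature budget would be enforced first, even though the department budget is larger and still has room.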
Related articles
Tag Your First AI Call
Add 2 headers to your existing code and see per-feature spend in under 5 minutes.
Cut Your AI Bill with One Click
Use AI Advisor recommendations to apply model downgrades and caching without code changes.
Get Slack Alerts on Spend Spikes
Connect Slack and get notified the moment an anomaly or budget threshold is hit.