How do I set up hierarchical AI budgets?
Enforce spending limits at every level — per run, per feature, per department, and org-wide — so a single agent loop can never blow your quarterly AI budget regardless of what individual feature budgets allow.
Goal: A budget hierarchy where the org ceiling can never be breached, even if every individual child budget still has room.
The problem this solves: Without hierarchy, 50 agent runs × $0.49 each = $24.50 against a $20 department budget. Every individual run passes (it's under its own $0.50 limit) while the department budget is blown. Hierarchy means the parent ceiling wins.
Prerequisites: At least one application making calls through the proxy.
Step 1 — Create the org-level budget (the ceiling)
Go to Dashboard → Budgets → New Budget.
| Field | Value |
|---|---|
| Name | Org — AI Total |
| Scope | Global |
| Monthly limit | Your total quarterly AI budget ÷ 3 |
| Enforcement | Degrade (recommended for org level) |
| Alerts | 70%, 85%, 95% |
This is the hard ceiling. No combination of child budgets can exceed it.
Step 2 — Create department budgets
Create one budget per department that uses AI. Each is scoped to a department header value.
| Field | Value |
|---|---|
| Name | Engineering |
| Scope | Department |
| Scope value | engineering (matches X-Cost-Department: engineering) |
| Monthly limit | Department's AI allocation |
| Enforcement | Alert (teams manage their own spend) |
Repeat for each department: customer-success, product, sales, etc.
Step 3 — Create feature budgets for high-risk features
For agent workflows, experimental features, or anything that can loop, add a feature-level budget.
| Field | Value |
|---|---|
| Name | Agent — document-processor |
| Scope | Feature |
| Scope value | document-processor |
| Monthly limit | $200 (conservative while monitoring) |
| Enforcement | Block |
Use Block mode for experimental features and agents. Use Degrade for production features where you want graceful degradation instead of hard failures. See enforcement modes for the full trade-off.
Step 4 — Add per-run limits for agents
For any agent that runs multi-step workflows, add a per-run budget. This caps the cost of a single agent execution, independent of the monthly feature budget.
Go to the feature budget you created, click Add Sub-limit, and set a per-session limit:
| Field | Value |
|---|---|
| Sub-limit type | Per session |
| Limit | $0.50 per run |
| Enforcement | Block |
Tag agent runs with a session ID so Cognocient can group calls per run:
Step 5 — Test the hierarchy
Make a test call tagged with a feature that has budgets at all levels:
The response header x-cog-budget-remaining shows the most restrictive remaining budget across all matching levels. A call is only allowed if every matching budget level has room.
Step 6 — Check budget status before each agent step (advanced)
For long-running agents, poll the budget status API before each step to avoid being stopped mid-execution:
Done
Your hierarchy is active. The enforcement chain is: run → feature → department → org. Any level being exhausted stops the call — the most restrictive level always wins.
Monitor budget consumption across all levels in Dashboard → Budgets. The parent budget cards show aggregate child spend in real time.
Related articles
Tag Your First AI Call
Add 2 headers to your existing code and see per-feature spend in under 5 minutes.
Set a Monthly Spending Limit
Create a hard budget enforced at the proxy before charges reach your provider bill.
Cut Your AI Bill with One Click
Use AI Advisor recommendations to apply model downgrades and caching without code changes.