How do I debug and stop runaway agent loops?

An agent loop that calls GPT-4o 400 times in 60 seconds is a $200 mistake. Here's how to find it in the dashboard, stop it, and prevent it from happening again.

Goal: Identify which agent is looping, stop the active spend, and set up enforcement so it can't recur.

Prerequisites: Agent calls going through the Cognocient proxy (so they're visible in the dashboard).

Step 1 — Spot the loop in Live Calls

Go to Dashboard → Live Calls. Sort by timestamp (newest first). A runaway loop shows up as a dense cluster of calls from the same session ID, hitting the same model repeatedly, each within a few seconds of the last.

Signs of a loop:

Same session_id appearing 50+ times in under 5 minutes
Each call has very similar token counts (the prompt hasn't changed)
Latency is low (the model is responding fine — it's your code that keeps re-calling)

Click any call in the cluster to see the full metadata including the X-Cost-Feature and X-Cost-Session headers. This tells you exactly which feature and which specific run is looping.
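If you export call records for offline review, the "50+ calls from one session in under 5 minutes" heuristic can be sketched in a few lines. This is a minimal illustration, not a Cognocient API: the (session_id, timestamp) record shape and the find_looping_sessions name are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_looping_sessions(calls, min_calls=50, window=timedelta(minutes=5)):
    """Return session IDs with at least `min_calls` calls inside any span shorter than `window`.

    `calls` is an iterable of (session_id, timestamp) pairs. This record
    shape is an assumption, not a documented export format.
    """
    by_session = defaultdict(list)
    for session_id, ts in calls:
        by_session[session_id].append(ts)

    looping = []
    for session_id, stamps in by_session.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Advance the left edge until the span is shorter than `window`.
            while ts - stamps[start] >= window:
                start += 1
            if end - start + 1 >= min_calls:
                looping.append(session_id)
                break
    return looping


# Example: "run-a" fires every 2 seconds, "run-b" once a minute.
t0 = datetime(2024, 1, 1, 12, 0, 0)
calls = [("run-a", t0 + timedelta(seconds=2 * i)) for i in range(60)]
calls += [("run-b", t0 + timedelta(minutes=i)) for i in range(60)]
print(find_looping_sessions(calls))  # ['run-a']
```

The thresholds are parameters so you can match them to whatever "normal" looks like for your feature.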

Step 2 — Check if the circuit breaker fired

Go to Dashboard → Engineering Dashboard and check the Circuit Breaker metric in the System Health bar. If it shows a trip count > 0, the velocity limit already caught this loop and started blocking calls.

If it didn't fire, the loop may be under the default velocity threshold. Continue to Step 3 to stop the loop, then tighten the threshold in Step 4.

Step 3 — Stop the active loop immediately

If the loop is still running:

Option A — Block the specific session (fastest): In Live Calls, click the session ID → Block session. All further calls from this session ID return 429 immediately.

Option B — Cut the feature budget: In Dashboard → Budgets, find the budget for this feature and temporarily set it to $0. All calls tagged with that feature are blocked instantly until you raise the limit.

Option C — Revoke the proxy key: In Settings → API Keys, click the key being used and toggle it off. All calls using that key stop immediately. Use this for emergencies — it affects all features on that key.
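Blocking a session only ends the spend if the agent treats the 429 as a stop signal instead of retrying it. The sketch below shows one way to do that. The send callable is a hypothetical stand-in for whatever function makes one proxied model call and returns the HTTP status code. Only Option A is stated to return 429, so the status codes for Options B and C are left out.

```python
from http import HTTPStatus
from typing import Callable, Iterable, List


def run_until_blocked(steps: Iterable[str], send: Callable[[str], int]) -> List[str]:
    """Run steps in order and stop at the first 429 instead of retrying.

    `send` is a hypothetical callable that performs one proxied model call
    and returns the HTTP status code.
    """
    completed = []
    for step in steps:
        status = send(step)
        if status == HTTPStatus.TOO_MANY_REQUESTS:
            # The session was blocked upstream: stop, do not retry.
            break
        completed.append(step)
    return completed


# Example with a fake sender that is blocked on the third call.
responses = iter([200, 200, 429, 200])
print(run_until_blocked(["a", "b", "c", "d"], lambda step: next(responses)))  # ['a', 'b']
```

A generic retry-on-429 wrapper would keep hammering a blocked session, so it is worth checking that your HTTP client or SDK is not configured to retry this status automatically.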

Step 4 — Set velocity limits to prevent recurrence

Go to Dashboard → Budgets → Guardrails and configure velocity enforcement for the affected feature:

Setting	Recommended value
Velocity window	60 seconds
TPM threshold	3× your normal baseline
Action	Block (not just alert)
Alert	Slack notification on trip

The circuit breaker uses a sliding 60-second window. If tokens per minute exceed your baseline times the configured multiplier, calls are blocked and you get a Slack alert.

Set the multiplier to 3× rather than 10×. A factor-of-3 spike is almost always a loop, not a legitimate traffic surge. With a 10× threshold, a loop has usually cost you hundreds of dollars before the breaker trips.
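To make the sliding-window behavior concrete, here is a minimal local sketch of a velocity check with a 60-second window and a 3× multiplier. It illustrates the idea only. The VelocityBreaker class and its parameters are assumptions, and the hosted circuit breaker may be implemented differently.

```python
from collections import deque


class VelocityBreaker:
    """Blocks calls when tokens in the trailing window would exceed multiplier x baseline."""

    def __init__(self, baseline_tpm: float, multiplier: float = 3.0, window_s: float = 60.0):
        self.limit = baseline_tpm * multiplier
        self.window_s = window_s
        self.events = deque()  # (timestamp_seconds, tokens), oldest first
        self.total = 0

    def allow(self, now: float, tokens: int) -> bool:
        # Drop events that have slid out of the trailing window.
        while self.events and now - self.events[0][0] >= self.window_s:
            _, old_tokens = self.events.popleft()
            self.total -= old_tokens
        if self.total + tokens > self.limit:
            return False  # blocked; the call is not recorded
        self.events.append((now, tokens))
        self.total += tokens
        return True


# Baseline of 1,000 tokens per minute, so the limit is 3,000 per window.
breaker = VelocityBreaker(baseline_tpm=1000)
results = [breaker.allow(now=i, tokens=500) for i in range(8)]
print(results)  # [True, True, True, True, True, True, False, False]
print(breaker.allow(now=61, tokens=500))  # True: the oldest calls have left the window
```

Because the window slides, the breaker recovers on its own once the burst ages out, which is why pairing it with a Block action and an alert matters: the alert is what prompts someone to fix the loop.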

Step 5 — Add a budget check in your agent loop code

The most robust protection is a pre-flight budget check before each agent step. Add this to your agent's step-execution function:

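import logging
import os

import httpx

logger = logging.getLogger(__name__)
# Assumption: the API key is supplied via an environment variable.
COG_API_KEY = os.environ["COG_API_KEY"]


def budget_ok(feature: str, session_id: str) -> bool:
    """Return False only when the budget service explicitly says to stop."""
    try:
        resp = httpx.get(
            "https://api.cognocient.com/api/budgets/status",
            headers={
                "Authorization": f"Bearer {COG_API_KEY}",
                "X-Cost-Feature": feature,
                "X-Cost-Session": session_id,
            },
            timeout=0.5,  # don't let the budget check slow your agent
        )
        return resp.json().get("can_proceed", True)
    except Exception:
        return True  # fail open if the status check itself fails


# In your agent loop (planned_steps, run_id, and execute_step come from your agent code):
completed_steps = []
for step in planned_steps:
    if not budget_ok("document-processor", run_id):
        logger.warning("Budget limit reached after %d steps", len(completed_steps))
        break
    result = execute_step(step)
    completed_steps.append(result)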

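The same check in TypeScript:

// Assumption: running on Node.js with the API key in an environment variable.
const COG_API_KEY = process.env.COG_API_KEY ?? "";

async function budgetOk(feature: string, sessionId: string): Promise<boolean> {
  try {
    const resp = await fetch("https://api.cognocient.com/api/budgets/status", {
      headers: {
        Authorization: `Bearer ${COG_API_KEY}`,
        "X-Cost-Feature": feature,
        "X-Cost-Session": sessionId,
      },
      signal: AbortSignal.timeout(500), // abort after 500 ms so the check can't stall the agent
    });
    const data = await resp.json();
    return data.can_proceed ?? true;
  } catch {
    return true; // fail open if the status check itself fails
  }
}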

Step 6 — Review in the dashboard after

Once the loop is stopped, go to Dashboard → Feature Intelligence and filter to the affected feature. You'll see the exact spike in the cost-over-time chart, with the loop visible as a vertical cost cliff. This view also shows your average cost per call before and during the loop — useful for estimating the total impact.
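Once you have the call count and the average cost per call from that view, the impact estimate is simple arithmetic. The helper below is a hypothetical sketch. The $0.50 per call in the example is inferred from the figures in the introduction (400 calls, $200), not a measured price.

```python
def estimate_loop_cost(calls_during_loop: int, avg_cost_per_call: float,
                       expected_calls: int = 0) -> float:
    """Rough excess spend: calls beyond what was expected, times average cost per call."""
    excess_calls = max(calls_during_loop - expected_calls, 0)
    return excess_calls * avg_cost_per_call


# 400 calls at about $0.50 each, none of which should have happened.
print(estimate_loop_cost(400, 0.50))  # 200.0

# Same loop, but 10 of those calls were legitimate work.
print(estimate_loop_cost(400, 0.50, expected_calls=10))  # 195.0
```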

Preventing loops from the start

For any new agent workflow, apply these defaults before it goes to production:

Per-session budget — Cap the cost of a single run (e.g., $0.50 per document processing job).
Feature-level Block budget — Hard limit for the feature per month, not just Alert.
Budget pre-check in code — The budget_ok() function above in every agent loop.
Velocity circuit breaker — Set at 3× baseline, action = Block.

See hierarchical budgets for how to set up all four levels together.
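As a last line of defense inside the agent process, a per-session cap can also be mirrored locally. The sketch below is a client-side analogue of the per-session budget, not the Cognocient feature itself. The SessionBudget class and the per-step cost values are assumptions for illustration.

```python
class SessionBudgetExceeded(RuntimeError):
    """Raised when a single run would go over its local spending cap."""


class SessionBudget:
    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record a step's cost, or raise before the cap is exceeded."""
        if self.spent_usd + cost_usd > self.limit_usd:
            raise SessionBudgetExceeded(
                f"step would bring spend to ${self.spent_usd + cost_usd:.2f}, "
                f"over the ${self.limit_usd:.2f} cap"
            )
        self.spent_usd += cost_usd


# Example: a $0.50 cap with steps that cost $0.125 each.
budget = SessionBudget(limit_usd=0.50)
steps_run = 0
try:
    for _ in range(10):
        budget.charge(0.125)  # hypothetical per-step cost
        steps_run += 1
except SessionBudgetExceeded:
    pass
print(steps_run)  # 4
```

A local cap like this keeps working even if the remote budget check times out and fails open, at the price of relying on your own cost estimates per step.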