
How do I see real-time AI API calls in Cognocient?


The Live Calls view streams every AI API call passing through the Cognocient proxy in real time — model, tokens, cost, latency, and every attribution header. Use it to verify attribution is working, debug unexpected costs, or confirm a new feature is being tracked correctly.

What each row shows

Field           Description
Timestamp       UTC time of the call, accurate to the millisecond
Model           Provider and model name (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-6)
Feature         Value of the X-Cost-Feature header, or "untagged" if missing
Dept            Value of the X-Cost-Department header
Input tokens    Prompt token count billed by the provider
Output tokens   Completion token count billed by the provider
Cost            Exact cost in USD for this call, calculated in real time
Latency         Time from Cognocient receiving the request to the first token of the response
Status          200 OK, 429 rate-limited, 500 error, or budget-blocked
Trace ID        Unique ID for this call; use it when filing support issues or debugging

Filtering the live feed

The filter bar at the top of the Live Calls view accepts any combination of:

  • Feature name (partial match)
  • Department name
  • Model (select from list)
  • Status (success / error / budget-blocked)
  • User ID
  • Session ID
  • Cost range (min / max per call)

To find API calls that are missing attribution headers, filter by Feature: untagged, the value shown when the X-Cost-Feature header is absent. These calls contribute to the "Unattributed" spend bucket in your dashboard and reduce your chargeback accuracy.
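
To put a number on how much spend is unattributed, you can compute the "untagged" share from an exported CSV. The following is a rough sketch only: the column names `feature` and `cost_usd` and the sample rows are assumptions for illustration, and the real export headers may differ.

```python
import csv
import io

# Hypothetical export contents; real column names may differ.
SAMPLE_CSV = """feature,cost_usd
search-rerank,0.0120
untagged,0.0300
untagged,0.0180
"""


def unattributed_share(csv_text: str) -> float:
    """Return the fraction of total cost from rows whose feature is 'untagged'."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = sum(float(r["cost_usd"]) for r in rows)
    untagged = sum(
        float(r["cost_usd"]) for r in rows if r["feature"] == "untagged"
    )
    # Guard against an empty export.
    return untagged / total if total else 0.0


share = unattributed_share(SAMPLE_CSV)
print(f"{share:.1%} of spend in this export is unattributed")
```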

Common uses for Live Calls

Verify attribution headers after deployment — After shipping a new feature or updating header values, watch the Live Calls feed in real time to confirm the new feature name appears correctly. Filter by feature name to isolate just those calls.
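
As a minimal sketch of the sending side, the helper below builds the two attribution headers named in the table above so that a call shows up under the expected feature. The function name `attribution_headers` and the example values are hypothetical, and how you attach the headers depends on your HTTP client or SDK, which is not shown here.

```python
def attribution_headers(feature: str, department: str) -> dict[str, str]:
    """Build the X-Cost-Feature and X-Cost-Department headers for one call."""
    feature = feature.strip()
    if not feature:
        # A missing feature value would surface as "untagged" in Live Calls.
        raise ValueError("feature must be a non-empty string")
    return {
        "X-Cost-Feature": feature,
        "X-Cost-Department": department.strip(),
    }


# Example values are made up for illustration.
headers = attribution_headers("ticket-summarizer", "support")
print(headers)
```

After deploying, filtering Live Calls by the same feature name should show only calls that carry these values.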

Debug unexpected costs — If your dashboard shows a cost spike but you're not sure which feature caused it, set the date range to the spike window and sort by cost descending. The top rows tell you exactly which model, feature, and call pattern drove the increase.
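
The same question can be asked of exported rows offline. This sketch groups calls by feature and model and sorts the totals by cost, descending; the field names (`feature`, `model`, `cost_usd`) and the sample data are assumptions, not the documented export schema.

```python
from collections import defaultdict

# Hypothetical rows from the spike window.
calls = [
    {"feature": "chat-assist", "model": "openai/gpt-4o", "cost_usd": 0.042},
    {"feature": "chat-assist", "model": "openai/gpt-4o", "cost_usd": 0.051},
    {"feature": "search-rerank", "model": "openai/gpt-4o-mini", "cost_usd": 0.003},
    {"feature": "doc-summary", "model": "anthropic/claude-sonnet-4-6", "cost_usd": 0.027},
]


def top_cost_drivers(rows, limit=3):
    """Sum cost per (feature, model) pair and return the most expensive pairs."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["feature"], row["model"])] += row["cost_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


top = top_cost_drivers(calls, limit=2)
for (feature, model), cost in top:
    print(f"{feature:15} {model:30} ${cost:.3f}")
```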

Confirm proxy is routing correctly — After applying a routing rule (e.g., redirect gpt-4o → gpt-4o-mini for sentiment analysis calls), watch Live Calls to verify the model column shows the cheaper model for matching calls.

Monitor a product launch — During a new feature launch, keep the Live Calls view open filtered to that feature. You can watch cost-per-call in real time and spot immediately if usage is higher or more expensive than expected.

Investigate error clusters — Filter by Status: error to see all failed calls. High error rates (>2%) from a specific feature often indicate rate limiting, prompt errors, or context-window violations — all of which waste money.
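
A sketch of the 2% check applied to exported rows follows. It assumes hypothetical field names `feature` and `status`, and it treats only 429 and 500 as errors, leaving budget-blocked calls out of the count; adjust both assumptions to match your data.

```python
from collections import Counter

# Assumption: these two status values count as errors.
ERROR_STATUSES = {"429", "500"}


def features_over_threshold(rows, threshold=0.02):
    """Return {feature: error_rate} for features whose error rate exceeds threshold."""
    totals, errors = Counter(), Counter()
    for row in rows:
        totals[row["feature"]] += 1
        if row["status"] in ERROR_STATUSES:
            errors[row["feature"]] += 1
    rates = {feature: errors[feature] / totals[feature] for feature in totals}
    return {feature: rate for feature, rate in rates.items() if rate > threshold}


# Made-up sample: chat-assist fails 3 of 100 calls, search-rerank 1 of 100.
calls = (
    [{"feature": "chat-assist", "status": "200"}] * 97
    + [{"feature": "chat-assist", "status": "429"}] * 3
    + [{"feature": "search-rerank", "status": "200"}] * 99
    + [{"feature": "search-rerank", "status": "500"}]
)

flagged = features_over_threshold(calls)
print(flagged)
```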

Exporting call logs

Any filtered view can be exported to CSV. Exports include all columns plus the raw header values. Use this for:

  • Feeding into internal data warehouses (Snowflake, BigQuery)
  • Custom compliance reports
  • Joining with application logs using Trace ID
  • Providing call-level detail to enterprise clients in chargeback reports
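
For the Trace ID join, a minimal sketch might look like the following. It assumes your application logs record the same trace ID under a `trace_id` key and that the export uses that key too; both names, and the sample rows, are hypothetical.

```python
def join_on_trace_id(call_rows, app_log_rows):
    """Inner-join exported calls with application log rows that share a trace ID."""
    logs_by_trace = {row["trace_id"]: row for row in app_log_rows}
    joined = []
    for call in call_rows:
        log = logs_by_trace.get(call["trace_id"])
        if log is not None:
            # Call fields win if both sides use the same key.
            joined.append({**log, **call})
    return joined


# Made-up sample data.
calls = [
    {"trace_id": "t-001", "feature": "chat-assist", "cost_usd": 0.042},
    {"trace_id": "t-002", "feature": "doc-summary", "cost_usd": 0.027},
]
app_logs = [
    {"trace_id": "t-001", "user_action": "send_message"},
    {"trace_id": "t-999", "user_action": "open_page"},
]

joined = join_on_trace_id(calls, app_logs)
print(joined)
```

In a warehouse such as Snowflake or BigQuery, the equivalent would be a SQL join on the same column.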

Call log retention is 90 days on Growth and Business plans. Raw logs are never retained on Cognocient infrastructure beyond the retention window — see Data Security for details.

Pagination and real-time mode

By default the feed is in live mode — new calls appear at the top as they happen. Click Pause to freeze the view and browse historical calls. Click Resume to return to live mode. You can also switch to a historical range using the date picker, which disables live mode and paginates through the selected window.


Next steps: Anomalies · Attribution Headers · Routing Rules
