How do I attribute costs across MCP and multi-agent workflows?

Agent workflows are a black box for most finance teams — you know the total AI cost but not which tool call, which agent step, or which handoff drove it. MCP/A2A Attribution breaks down every dollar by MCP server, parent run, and handoff hop.

MCP/A2A Attribution breaks down every dollar by MCP server, parent run, and agent handoff hop. Pass X-Cost-MCP-Server and X-Cost-Parent-Run-Id headers on your agent calls to get a full workflow cost tree showing exactly what each step cost.

Without MCP headers	With `X-Cost-MCP-Server`
My agent cost $180 yesterday. No idea which tool calls drove it.	web-search MCP: $94 (52%) · code-executor: $62 (34%) · document-reader: $24 (13%)

The three MCP/A2A attribution headers

X-Cost-MCP-Server: web-search Name of the MCP server making this call. Use the MCP server's identifier — e.g., "web-search", "code-executor", "document-reader". Cognocient aggregates cost by MCP server in the Agent Workflows dashboard.

X-Cost-Parent-Run-Id: run-4a7c2b1e-... The parent workflow run ID — the same UUID shared by all agent steps in a single execution. This ties individual MCP tool calls back to the orchestrating run so you can see total per-run cost in the workflow tree.

X-Cost-Agent-Handoff: true Set to "true" when this call is an agent-to-agent handoff — i.e., one agent is invoking a sub-agent via an API call. Cognocient uses this to build the workflow tree and attribute costs to handoff hops specifically.

These headers work alongside the existing attribution headers. Use X-Cost-Feature for the workflow name and X-Cost-Run-ID (same as X-Cost-Parent-Run-Id) for the run. The MCP-specific headers provide the additional layer of tool-level granularity.

Python — Claude tool use with MCP

The most common pattern: a Claude agent uses tool calls to invoke MCP servers. Pass the MCP server name in the header for every call that originates from a tool invocation.

import uuid
import anthropic
 
client = anthropic.Anthropic(
    api_key="sk-cog-YOUR-PROXY-KEY",
    base_url="https://api.cognocient.com/v1",
)
 
async def run_research_agent(query: str):
    run_id = f"run-{uuid.uuid4()}"
 
    # Step 1: Orchestrator call (no MCP server — this is the main agent)
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a research agent. Use tools to answer the query.",
        messages=[{"role": "user", "content": query}],
        tools=[web_search_tool, document_tool],
        extra_headers={
            "X-Cost-Feature":        "research-agent",
            "X-Cost-Parent-Run-Id":  run_id,
        },
    )
 
    # Process tool calls
    for block in response.content:
        if block.type == "tool_use":
            tool_name = block.name  # e.g., "web_search"
            mcp_server = tool_to_mcp_server[tool_name]  # e.g., "web-search"
 
            # Step 2: Tool execution call — tag with MCP server
            tool_response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=512,
                messages=[...],  # tool result context
                extra_headers={
                    "X-Cost-Feature":        "research-agent",
                    "X-Cost-Parent-Run-Id":  run_id,
                    "X-Cost-MCP-Server":     mcp_server,  # key line
                },
            )
 
# Mapping from tool name → MCP server identifier
tool_to_mcp_server = {
    "web_search":      "web-search",
    "read_document":   "document-reader",
    "execute_code":    "code-executor",
}

Node.js — LangGraph agent with handoffs

For multi-agent systems where a supervisor hands off to specialist sub-agents, use X-Cost-Agent-Handoff: true on the sub-agent's calls so Cognocient builds the correct workflow tree.

import OpenAI from 'openai';
import { v4 as uuidv4 } from 'uuid';
 
const openai = new OpenAI({
  apiKey: 'sk-cog-YOUR-PROXY-KEY',
  baseURL: 'https://api.cognocient.com/v1',
});
 
async function supervisorAgent(task: string) {
  const runId = `run-${uuidv4()}`;
 
  // Supervisor call
  const supervisorResponse = await openai.chat.completions.create(
    {
      model: 'gpt-4o',
      messages: [{ role: 'user', content: task }],
    },
    {
      headers: {
        'X-Cost-Feature':       'multi-agent-pipeline',
        'X-Cost-Parent-Run-Id': runId,
      },
    }
  );
 
  // Supervisor decides to hand off to the research sub-agent
  const subAgentResponse = await openai.chat.completions.create(
    {
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: supervisorResponse.choices[0].message.content! }],
    },
    {
      headers: {
        'X-Cost-Feature':        'multi-agent-pipeline',
        'X-Cost-Parent-Run-Id':  runId,
        'X-Cost-MCP-Server':     'research-agent',  // sub-agent identity
        'X-Cost-Agent-Handoff':  'true',             // marks this as a handoff
      },
    }
  );
 
  return subAgentResponse;
}

Agent Workflows dashboard

Once you add these headers, the Agent Workflows tab shows:

Cost by MCP server (bar chart) — Which tool is driving the most spend. Sortable by day/week/month.
Workflow tree view — Per run: a tree showing each step, which MCP server was called, tokens used, cost, and latency at each node.
Handoff cost attribution — What fraction of total run cost came from agent handoffs vs. direct orchestrator calls.
Most expensive runs — Table of runs ranked by total cost, with step count and flagged anomalies.

Example workflow tree:

research-agent (supervisor)          $0.21  214 tokens
├── web-search (tool call)            $0.42  890 tokens  [MCP: web-search]
│   └── result synthesis              $0.09  180 tokens
├── document-reader (tool call)       $0.06  312 tokens  [MCP: document-reader]
└── final-response                    $0.05   98 tokens

Start with X-Cost-MCP-Server

If you only add one header, add X-Cost-MCP-Server. Cost by MCP server is the most actionable insight — it immediately tells you which tool to optimise first.

Next steps: Attribution Headers · Sessions · Budget Enforcement

The three MCP/A2A attribution headers

Python — Claude tool use with MCP

Node.js — LangGraph agent with handoffs

Agent Workflows dashboard

On this page