What is Context Tax?

Context tax is the recurring cost of sending a large, mostly-static system prompt with every API call. Learn how prompt caching reduces it.

Context tax is the token cost paid repeatedly for a large, mostly unchanging system prompt sent with every API call. When a feature's system prompt makes up 80-90% of its total input tokens on every request, that static portion is a "tax" — the same cost paid over and over for information that rarely changes.

Why context tax adds up

RAG-based features are especially prone to context tax. Sending large reference documents or instructions with every query inflates the token count of each call, even when the user's actual question is short, and that overhead is paid again on every request the feature handles.
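
To make the arithmetic concrete, here is a minimal sketch of how the static share and the re-sent token volume could be computed. The function names and the figures (a 4,500-token static prompt, a 500-token question, 10,000 calls) are hypothetical and chosen only for illustration.

```python
def static_share(static_tokens: int, variable_tokens: int) -> float:
    """Fraction of one request's input tokens that is identical on every call."""
    total = static_tokens + variable_tokens
    if total == 0:
        return 0.0
    return static_tokens / total


def repeated_static_tokens(static_tokens: int, calls: int) -> int:
    """Total static tokens re-sent across `calls` requests."""
    return static_tokens * calls


# Hypothetical feature: 4,500 static tokens plus a 500-token question.
share = static_share(4500, 500)                  # 0.9, i.e. 90% of the input
resent = repeated_static_tokens(4500, 10_000)    # 45,000,000 tokens re-sent
```

With these assumed numbers, 90% of each request is static, and 10,000 calls re-send 45 million tokens of unchanged prompt.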

How to detect context tax

Measure the ratio of static tokens (the same on every call) to variable tokens (the part that actually changes, like the user's question). Low variance in input token count across many calls to the same feature is the signal: when the static portion dominates, every request ends up roughly the same size regardless of what the user asks.
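
The check described above can be sketched as follows. This is an illustration under stated assumptions, not Cognocient's implementation: the function name is hypothetical, the ratio is expressed as the static share of average input so it lines up with the 80-90% figure, and both thresholds are assumed values you would tune for your own traffic.

```python
from statistics import mean, pstdev


def context_tax_signal(input_token_counts, static_tokens,
                       ratio_threshold=0.8, cv_threshold=0.1):
    """Flag a feature whose input is mostly static and barely varies.

    input_token_counts: input token totals for calls to one feature.
    static_tokens: size of the portion that is identical on every call.
    Both thresholds are illustrative assumptions.
    """
    if not input_token_counts:
        raise ValueError("need at least one call")
    avg = mean(input_token_counts)
    if avg <= 0:
        raise ValueError("token counts must be positive")
    static_ratio = static_tokens / avg
    # Coefficient of variation: standard deviation relative to the mean.
    cv = pstdev(input_token_counts) / avg
    return {
        "static_ratio": static_ratio,
        "cv": cv,
        "flagged": static_ratio >= ratio_threshold and cv <= cv_threshold,
    }


# Hypothetical samples: one steady feature, one with widely varying input.
steady = context_tax_signal([4100, 4150, 4080, 4200, 4120], static_tokens=4000)
varied = context_tax_signal([200, 5000, 900, 3000], static_tokens=100)
```

In this sketch the steady feature is flagged (high static share, low variation) and the varied one is not.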

How Cognocient detects and fixes context tax

Cognocient's Context Tax Analyser calculates this ratio automatically per feature and quantifies the exact saving available from enabling prompt caching on the static portion — often a 60-80% reduction on that portion of input cost.
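
As a rough illustration of how such a saving could be estimated by hand, the sketch below multiplies the cost of the static portion by an assumed cache discount. The function name, the price of 3.00 per million input tokens, and the 70% discount (picked from the 60-80% range above) are all assumptions; real provider pricing, cache write charges, and cache expiry are not modelled.

```python
def caching_saving(static_tokens: int, calls: int,
                   price_per_million: float, cached_discount: float) -> float:
    """Estimated saving on the static portion of input cost.

    cached_discount is the assumed fractional reduction on cached tokens
    (0.7 means cached tokens cost 30% of the normal input price).
    """
    if not 0.0 <= cached_discount <= 1.0:
        raise ValueError("cached_discount must be between 0 and 1")
    uncached_cost = static_tokens * calls * price_per_million / 1_000_000
    return uncached_cost * cached_discount


# Hypothetical feature: 4,000 static tokens, 100,000 calls.
uncached = caching_saving(4000, 100_000, 3.00, 1.0)   # full static cost
saving = caching_saving(4000, 100_000, 3.00, 0.7)     # about 70% of that cost
```

Under these assumptions the static portion costs about 1,200 uncached, and caching recovers about 840 of it.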

