Tab · Insights
App LLM calls
Find the single hot prompt that's burning half your bill — and know what to switch it to. Production LLM call telemetry: by-tier, by-surface, hot prompts, routing recommendations, and the actual Anthropic billing.
Where to find it
- Localhost: /app-llm-calls.html?repo=<id>
- API: GET /api/ai-calls?repo=<id>
- Keyboard: ⌘ K then llm
- Sidebar: Insights → App LLM calls
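As a rough illustration of the API entry point above, the sketch below builds the `GET /api/ai-calls?repo=<id>` URL and fetches it. The base URL (`http://localhost:3000`) and the helper names are assumptions made for this example. The sketch also assumes the endpoint returns JSON, which this page does not state, and it does not cover any authentication the endpoint may require.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the dashboard is served locally; this host and port are hypothetical.
BASE_URL = "http://localhost:3000"


def ai_calls_url(repo_id: str, base_url: str = BASE_URL) -> str:
    """Build the GET /api/ai-calls URL for one tracked repo."""
    return f"{base_url}/api/ai-calls?{urlencode({'repo': repo_id})}"


def fetch_ai_calls(repo_id: str, base_url: str = BASE_URL):
    """Fetch the telemetry payload, assuming (not confirmed) a JSON body."""
    with urlopen(ai_calls_url(repo_id, base_url), timeout=10) as resp:
        return json.load(resp)
```

Calling `fetch_ai_calls("<id>")` with a real repo id would return whatever the server sends back; inspect the payload before relying on any field names.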
What it does for you
By-tier rollup tells you the shape of spend. It breaks the bill into Tier-1, Tier-2, and Tier-3, each with cost and share. When Tier-3 is more than 60% of the bill but your traffic is mostly classification, the routing is wrong, and the dashboard says so.
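The arithmetic behind a by-tier rollup can be sketched as follows. This is not the dashboard's own code: the record fields (`tier`, `cost_usd`) and the tier labels are hypothetical, and only the 60% threshold comes from the text above. Deciding whether the traffic is "mostly classification" is left to the reader.

```python
from collections import defaultdict


def tier_shares(calls):
    """Sum cost per tier and return {tier: (cost, share_of_total)}."""
    totals = defaultdict(float)
    for call in calls:
        totals[call["tier"]] += call["cost_usd"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {tier: (cost, cost / grand_total) for tier, cost in totals.items()}


def tier3_share_is_high(shares, threshold=0.60):
    """True when Tier-3 exceeds the threshold share of total spend."""
    _, share = shares.get("tier-3", (0.0, 0.0))
    return share > threshold


# Hypothetical sample records, for illustration only.
sample_calls = [
    {"tier": "tier-1", "cost_usd": 1.0},
    {"tier": "tier-2", "cost_usd": 2.0},
    {"tier": "tier-3", "cost_usd": 7.0},
]
sample_shares = tier_shares(sample_calls)
```

In this sample Tier-3 carries 7 of 10 dollars, so `tier3_share_is_high(sample_shares)` is true.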
Top prompts surface the 80/20. One hot prompt usually accounts for half the bill. The Top prompts table ranks prompts by spend over the window, so you find the hot one at a glance.
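Ranking prompts by spend is a simple group-and-sort. The sketch below shows one way to do it, assuming hypothetical `prompt_id` and `cost_usd` fields on each call record; the dashboard's actual schema is not documented here.

```python
from collections import Counter


def top_prompts(calls, n=5):
    """Return up to n (prompt_id, cost, share_of_total) tuples, highest spend first."""
    spend = Counter()
    for call in calls:
        spend[call["prompt_id"]] += call["cost_usd"]
    total = sum(spend.values())
    return [
        (prompt_id, cost, cost / total if total else 0.0)
        for prompt_id, cost in spend.most_common(n)
    ]


# Hypothetical sample records, for illustration only.
sample_prompt_calls = [
    {"prompt_id": "summarize", "cost_usd": 5.0},
    {"prompt_id": "classify", "cost_usd": 3.0},
    {"prompt_id": "summarize", "cost_usd": 2.0},
]
ranked = top_prompts(sample_prompt_calls)
```

Here `summarize` comes first with 7 of the 10 dollars spent, which is the kind of concentration the table is meant to expose.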
Anthropic billed side-by-side with reduced cost. When ANTHROPIC_ADMIN_KEY is set, the dashboard shows your real invoice number next to its own calculation. Big gap = double counting; investigate.
Configure
Each tracked repo implements GET /api/ai-calls/log/export with bearer-token auth (AI_CALLS_EXPORT_TOKEN). For the billing side-by-side, set ANTHROPIC_ADMIN_KEY (prefix sk-ant-admin01-, not the regular API key).
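A minimal sketch of the two checks described above: validating the bearer token on the export endpoint, and confirming that the configured admin key has the expected prefix. The function names are hypothetical and the code is framework-agnostic; wiring it into an actual `GET /api/ai-calls/log/export` handler depends on the repo's web framework, which this page does not specify.

```python
import hmac
import os

# Prefix stated above for admin keys (as opposed to the regular API key).
ADMIN_KEY_PREFIX = "sk-ant-admin01-"


def is_authorized(authorization_header, expected_token):
    """Check an 'Authorization: Bearer <token>' header value in constant time."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


def looks_like_admin_key(key):
    """True when the key carries the admin-key prefix."""
    return bool(key) and key.startswith(ADMIN_KEY_PREFIX)


def config_problems(env=os.environ):
    """List configuration issues found in the given environment mapping."""
    problems = []
    if not env.get("AI_CALLS_EXPORT_TOKEN"):
        problems.append("AI_CALLS_EXPORT_TOKEN is not set")
    admin_key = env.get("ANTHROPIC_ADMIN_KEY")
    if admin_key and not looks_like_admin_key(admin_key):
        problems.append("ANTHROPIC_ADMIN_KEY does not start with sk-ant-admin01-")
    return problems
```

`hmac.compare_digest` is used so that the comparison time does not leak how much of the token matched. `config_problems` treats a missing admin key as acceptable, since the billing side-by-side is optional.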
Use it well
Open weekly. Sort by spend; click the top surface. If its cache-hit-rate is below 30%, the system block isn't cacheable — fix the prompt structure first. If the routing recommendation is one tier cheaper, click Propose routing PR and let it land.
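The weekly routine above amounts to a two-step decision rule, sketched here. The field names are hypothetical. The sketch also assumes that tiers are numbered so that a lower number is cheaper, and that cache-hit-rate is already available as a fraction between 0 and 1; neither is stated on this page.

```python
CACHE_HIT_THRESHOLD = 0.30  # from the guidance above


def weekly_action(surface):
    """Return the first applicable action for the top-spend surface."""
    # Step 1: a low cache-hit-rate means the prompt structure needs fixing first.
    if surface["cache_hit_rate"] < CACHE_HIT_THRESHOLD:
        return "fix prompt structure"
    # Step 2: assumption: a lower tier number means a cheaper tier.
    if surface["current_tier"] - surface["recommended_tier"] == 1:
        return "propose routing PR"
    return "no action"
```

For example, a surface with a 20% cache-hit-rate yields "fix prompt structure" regardless of the routing recommendation, matching the "fix the prompt structure first" ordering in the text.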