Tab · Insights

App LLM calls

Find the single hot prompt that's burning half your bill — and know what to switch it to. Production LLM call telemetry: by-tier, by-surface, hot prompts, routing recommendations, and the actual Anthropic billing.

Where to find it

  • Localhost: /app-llm-calls.html?repo=<id>
  • API: GET /api/ai-calls?repo=<id>
  • Keyboard: K then llm
  • Sidebar: Insights → App LLM calls
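To pull the same data from a script instead of the page, a minimal sketch follows. Only the path /api/ai-calls and the repo query parameter come from this page; the host, the port, and the assumption that the endpoint returns JSON are illustrative guesses to adjust for your setup.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the dashboard listens on this host and port; change as needed.
BASE_URL = "http://localhost:3000"


def ai_calls_url(repo_id: str, base_url: str = BASE_URL) -> str:
    """Build the GET /api/ai-calls URL for one tracked repo."""
    query = urllib.parse.urlencode({"repo": repo_id})
    return f"{base_url}/api/ai-calls?{query}"


def fetch_ai_calls(repo_id: str, base_url: str = BASE_URL, timeout: float = 10.0):
    """Fetch the telemetry payload. Assumes the endpoint answers with JSON."""
    with urllib.request.urlopen(ai_calls_url(repo_id, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_ai_calls("<id>") with a real repo id should return whatever structure your instance serves; the shape of that payload is not documented here, so inspect it before relying on specific fields.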

What it does for you

By-tier rollup tells you the shape of spend. Tier-1, Tier-2, and Tier-3 are each shown with their cost and share of the total. When Tier-3 is over 60% of the bill but your traffic is mostly classification, the routing is wrong, and the dashboard says so.
Top prompts surface the 80/20. One hot prompt usually accounts for half the bill. The Top prompts table ranks by spend over the window, so you find it at a glance.
Anthropic billed side-by-side with reduced cost. When ANTHROPIC_ADMIN_KEY is set, the dashboard shows your real invoice number next to its own calculation. A big gap suggests double counting; investigate.
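The arithmetic behind these three views can be sketched as follows. This is an illustration only, not the dashboard's own code: the record fields (tier, prompt, cost_usd) and the sample numbers are hypothetical, and the 60% check covers only the spend half of the rule, not the "mostly classification" half.

```python
from collections import defaultdict

# Hypothetical call records; field names are illustrative, not the real schema.
calls = [
    {"tier": "tier-3", "prompt": "summarize-ticket", "cost_usd": 6.0},
    {"tier": "tier-3", "prompt": "classify-intent", "cost_usd": 1.0},
    {"tier": "tier-1", "prompt": "classify-intent", "cost_usd": 2.0},
    {"tier": "tier-2", "prompt": "draft-reply", "cost_usd": 1.0},
]


def rollup(records, key):
    """Sum cost per key; return (name, cost, share) rows, highest spend first."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost_usd"]
    grand_total = sum(totals.values())
    rows = [
        (name, cost, cost / grand_total if grand_total else 0.0)
        for name, cost in totals.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def billing_gap(computed_usd, billed_usd):
    """Relative gap between the computed figure and the invoiced amount."""
    if billed_usd <= 0:
        raise ValueError("billed_usd must be positive")
    return (computed_usd - billed_usd) / billed_usd


by_tier = rollup(calls, "tier")        # shape of spend
top_prompts = rollup(calls, "prompt")  # ranked by spend over the window
tier_share = {name: share for name, _, share in by_tier}
tier3_dominates = tier_share.get("tier-3", 0.0) > 0.60
```

With the sample data, the first row of top_prompts is the prompt to look at first, and tier3_dominates is the signal to check whether that traffic really needs the most expensive tier.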

Configure

Each tracked repo implements GET /api/ai-calls/log/export with bearer-token auth (AI_CALLS_EXPORT_TOKEN). For the billing side-by-side, set ANTHROPIC_ADMIN_KEY (prefix sk-ant-admin01-, not the regular API key).
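The bearer-token check on the export endpoint might look like the sketch below. It is framework-agnostic and assumes only what this page states: the token lives in AI_CALLS_EXPORT_TOKEN and the admin key starts with sk-ant-admin01-. The function names are hypothetical, and the export response format is not covered because it is not specified here.

```python
import hmac
import os


def is_authorized(authorization_header, expected_token=None):
    """Return True when the header carries the expected bearer token."""
    if expected_token is None:
        expected_token = os.environ.get("AI_CALLS_EXPORT_TOKEN", "")
    if not expected_token or not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())


def looks_like_admin_key(key):
    """Cheap sanity check using the prefix documented on this page."""
    return key.startswith("sk-ant-admin01-")
```

In a request handler for GET /api/ai-calls/log/export, you would pass the incoming Authorization header to is_authorized and reject the request (typically with 401) when it returns False.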

Use it well

Open weekly. Sort by spend; click the top surface. If its cache-hit-rate is below 30%, the system block isn't cacheable — fix the prompt structure first. If the routing recommendation is one tier cheaper, click Propose routing PR and let it land.
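The weekly triage order above can be written down as a small decision function. This is a sketch under stated assumptions: tiers are modeled as integers where a lower number is cheaper, the cache-hit-rate is a fraction between 0 and 1, and the returned strings are placeholders for the actions named on this page.

```python
def next_action(cache_hit_rate, current_tier, recommended_tier):
    """Apply the triage order: caching first, then routing.

    Assumption: tiers are integers and a lower number means a cheaper tier.
    """
    if cache_hit_rate < 0.30:
        # The system block is not cacheable; restructure the prompt first.
        return "fix prompt structure"
    if current_tier - recommended_tier == 1:
        # The recommendation is exactly one tier cheaper.
        return "propose routing PR"
    return "no change"
```

Caching is checked first because a prompt restructure changes the cost profile that the routing recommendation is based on.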
