Token Tracking & Cost

Copair tracks token usage and estimates cost in real-time, so you always know how much a session is consuming.

Per-Request Display

After each model response, Copair shows input and output token counts:

[tokens: 1,234 in / 567 out | session: 8,901 in / 2,345 out | ~$0.12]
FieldDescription
in / outInput and output tokens for the last request
sessionCumulative totals for the entire session
~$0.12Estimated cost based on the provider's pricing

Status Bar

The persistent status bar at the bottom of the terminal shows token usage at a glance:

claude-4-sonnet (main) | 8.9K tokens | [████████░░] 78% | ~$0.12 | auth-refactor-a3f2

The context window usage bar shows how much of the model's context limit you've consumed. This helps you anticipate when summarization might be needed. The current git branch appears in green next to the model name and refreshes after every turn, so a git checkout mid-session is reflected immediately.

Context-limit detection

Smaller and local models (Qwen, Llama, Phi, etc.) sometimes truncate or silently stop responding when the input approaches their context window — without surfacing an explicit error. Copair watches for two signals every turn:

  1. Token threshold — input tokens reach 90% of the model's configured context window
  2. Truncation heuristic — the model returns text-only output (no tool calls) ending mid-word with no terminal punctuation

If either trips, the spinner stops and a yellow warning appears with two options:

  Approaching context limit (45,200 / 50,000 tokens, 90%)

[c]ompact session  [a]bort turn
  • Compact — summarize the conversation history into a short context block, replace the history, and continue. The next turn starts with a much smaller prompt.
  • Abort — return to the REPL without sending another request, so you can /clear or switch models manually.

The threshold is configurable per session via contextLimitThresholdPct if you embed Copair as a library; defaults to 0.9.

Per-Model Breakdown

When you switch models mid-session, Copair tracks usage separately for each model. Use /cost to see the breakdown:

> /cost

Session Token Usage:
  claude-4-sonnet    4,200 in / 1,800 out   ~$0.08
  gpt-4o             3,100 in /   900 out   ~$0.04
  llama4 (local)     1,600 in /   450 out   $0.00
  -------------------------------------------------
  Total              8,900 in / 3,150 out   ~$0.12

Cost Estimation

Copair calculates estimated cost using built-in pricing data for major providers:

ProviderSource
AnthropicOfficial API pricing
OpenAIOfficial API pricing
GoogleOfficial API pricing
Local (Ollama, vLLM)Always $0.00

Pricing data is bundled with Copair and updated with each release. Custom or self-hosted endpoints (Ollama, vLLM) are treated as free ($0.00).

Fallback Token Estimation

Not all providers return token counts in their API responses. When usage metadata is unavailable, Copair estimates tokens using a character-based heuristic (approximately 1 token per 3 characters). This fallback keeps the display consistent across all providers.

The estimation is conservative — it slightly overestimates to prevent unexpected context window overflows.

Session Summary

When you exit a session, Copair displays a final summary:

Session ended.
  Duration: 23 minutes
  Turns: 12
  Tokens: 15,200 in / 4,800 out
  Cost: ~$0.24

Next Steps

Last updated May 12, 2026