Token Tracking & Cost
Copair tracks token usage and estimates cost in real time, so you always know how much a session is consuming.
Per-Request Display
After each model response, Copair shows input and output token counts:
[tokens: 1,234 in / 567 out | session: 8,901 in / 2,345 out | ~$0.12]
| Field | Description |
|---|---|
| in / out | Input and output tokens for the last request |
| session | Cumulative totals for the entire session |
| ~$0.12 | Estimated cost based on the provider's pricing |
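The per-request line above can be assembled from two counters and a cost figure. The following is a minimal sketch only: the `TokenCount` shape and the `formatUsageLine` name are assumptions made for illustration and are not taken from Copair's source.

```typescript
// Hypothetical shape; Copair's internal types are not documented on this page.
interface TokenCount {
  input: number;
  output: number;
}

// Formats an integer with thousands separators, e.g. 1234 -> "1,234".
function withCommas(n: number): string {
  return n.toLocaleString("en-US");
}

// Builds the bracketed usage line: last request, session totals, estimated cost.
function formatUsageLine(
  last: TokenCount,
  session: TokenCount,
  costUsd: number,
): string {
  const lastPart = `${withCommas(last.input)} in / ${withCommas(last.output)} out`;
  const sessionPart = `${withCommas(session.input)} in / ${withCommas(session.output)} out`;
  // The leading "~" marks the cost as an estimate.
  return `[tokens: ${lastPart} | session: ${sessionPart} | ~$${costUsd.toFixed(2)}]`;
}

const usageLine = formatUsageLine(
  { input: 1234, output: 567 },
  { input: 8901, output: 2345 },
  0.12,
);
```

Here `usageLine` reproduces the sample line shown at the top of this section.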
Status Bar
The persistent status bar at the bottom of the terminal shows token usage at a glance:
claude-4-sonnet (main) | 8.9K tokens | [████████░░] 78% | ~$0.12 | auth-refactor-a3f2
The context window usage bar shows how much of the model's context limit you've consumed. This helps you anticipate when summarization might be needed. The current git branch appears in green next to the model name and refreshes after every turn, so a git checkout mid-session is reflected immediately.
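To make the usage bar concrete, here is one way such a bar could be rendered from a used-token count and a context limit. This is a sketch under assumptions: the function name, the ten-cell width, and the rounding rule are illustrative choices inferred from the sample status bar, not Copair's actual implementation.

```typescript
// Renders a fixed-width usage bar such as "[████████░░] 78%".
// `width` is the number of cells; 10 matches the sample status bar.
function renderContextBar(
  usedTokens: number,
  contextLimit: number,
  width = 10,
): string {
  // Guard against a missing or invalid limit: show an empty bar.
  if (contextLimit <= 0) {
    return `[${"░".repeat(width)}] 0%`;
  }
  // Clamp to [0, 1] so the bar never overflows or goes negative.
  const fraction = Math.min(1, Math.max(0, usedTokens / contextLimit));
  const filled = Math.round(fraction * width);
  const bar = "█".repeat(filled) + "░".repeat(width - filled);
  return `[${bar}] ${Math.round(fraction * 100)}%`;
}

const sampleBar = renderContextBar(78000, 100000);
```

With 78,000 of 100,000 tokens used, 7.8 cells round to 8 filled cells, which gives the bar shown in the sample.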
Context-limit detection
Smaller and local models (Qwen, Llama, Phi, etc.) sometimes truncate or silently stop responding when the input approaches their context window — without surfacing an explicit error. Copair watches for two signals every turn:
- Token threshold — input tokens reach 90% of the model's configured context window
- Truncation heuristic — the model returns text-only output (no tool calls) ending mid-word with no terminal punctuation
If either trips, the spinner stops and a yellow warning appears with two options:
⚠ Approaching context limit (45,200 / 50,000 tokens, 90%)
[c]ompact session [a]bort turn
- Compact — summarize the conversation history into a short context block, replace the history, and continue. The next turn starts with a much smaller prompt.
- Abort — return to the REPL without sending another request, so you can /clear or switch models manually.
If you embed Copair as a library, the threshold is configurable per session via contextLimitThresholdPct, which defaults to 0.9 (90% of the context window).
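The two signals described in this section can be sketched as a small per-turn check. Everything here is illustrative: the `TurnResult` shape, the function names, and the regular expression used to approximate "ends mid-word" are assumptions, and only the 0.9 default comes from this page.

```typescript
// Hypothetical per-turn summary; field names are illustrative only.
interface TurnResult {
  inputTokens: number;
  text: string;
  toolCallCount: number;
}

type LimitSignal = "token-threshold" | "truncation" | null;

// Truncation heuristic: text-only output whose last character is a letter or
// digit, i.e. no terminal punctuation. This only approximates "mid-word".
function looksTruncated(turn: TurnResult): boolean {
  if (turn.toolCallCount > 0) return false;
  const text = turn.text.trimEnd();
  if (text.length === 0) return false;
  return /[A-Za-z0-9]$/.test(text);
}

// Returns which signal tripped, or null if neither did.
function detectContextLimit(
  turn: TurnResult,
  contextWindow: number,
  thresholdPct = 0.9, // mirrors the documented default
): LimitSignal {
  if (turn.inputTokens >= contextWindow * thresholdPct) {
    return "token-threshold";
  }
  if (looksTruncated(turn)) {
    return "truncation";
  }
  return null;
}

// 45,200 of 50,000 tokens is past the 90% mark, as in the warning above.
const signal = detectContextLimit(
  { inputTokens: 45200, text: "Done.", toolCallCount: 0 },
  50000,
);
```

In this sketch the token threshold is checked first, since it is the cheaper and less ambiguous of the two signals; a caller would show the compact/abort prompt whenever the result is not `null`.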
Per-Model Breakdown
When you switch models mid-session, Copair tracks usage separately for each model. Use /cost to see the breakdown:
> /cost
Session Token Usage:
claude-4-sonnet 4,200 in / 1,800 out ~$0.08
gpt-4o 3,100 in / 900 out ~$0.04
llama4 (local) 1,600 in / 450 out $0.00
-------------------------------------------------
Total 8,900 in / 3,150 out ~$0.12
Cost Estimation
Copair calculates estimated cost using built-in pricing data for major providers:
| Provider | Source |
|---|---|
| Anthropic | Official API pricing |
| OpenAI | Official API pricing |
| | Official API pricing |
| Local (Ollama, vLLM) | Always $0.00 |
Pricing data is bundled with Copair and updated with each release. Custom or self-hosted endpoints (Ollama, vLLM) are treated as free ($0.00).
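As an illustration of how a bundled pricing table and per-model tracking could fit together, consider the sketch below. The model name and the prices are placeholders invented for the example (they are not real provider prices or Copair's bundled data), and treating any model without a pricing entry as free is an assumption that mirrors the rule for local endpoints.

```typescript
// USD per one million tokens. PLACEHOLDER values for illustration only.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

const PLACEHOLDER_PRICES: Record<string, Price> = {
  "example-hosted-model": { inputPerMTok: 2, outputPerMTok: 10 },
};

interface Usage {
  input: number;
  output: number;
}

// Assumption: models with no pricing entry (e.g. local ones) cost $0.00.
function estimateCost(model: string, usage: Usage): number {
  const price = PLACEHOLDER_PRICES[model];
  if (!price) return 0;
  return (
    (usage.input * price.inputPerMTok + usage.output * price.outputPerMTok) / 1e6
  );
}

// Accumulates usage per model so a /cost-style breakdown can be produced.
class SessionUsage {
  private byModel = new Map<string, Usage>();

  record(model: string, input: number, output: number): void {
    const prev = this.byModel.get(model) ?? { input: 0, output: 0 };
    this.byModel.set(model, {
      input: prev.input + input,
      output: prev.output + output,
    });
  }

  breakdown(): Array<{ model: string; usage: Usage; costUsd: number }> {
    return Array.from(this.byModel.entries()).map(([model, usage]) => ({
      model,
      usage,
      costUsd: estimateCost(model, usage),
    }));
  }

  total(): { usage: Usage; costUsd: number } {
    let input = 0;
    let output = 0;
    let costUsd = 0;
    for (const row of this.breakdown()) {
      input += row.usage.input;
      output += row.usage.output;
      costUsd += row.costUsd;
    }
    return { usage: { input, output }, costUsd };
  }
}

const session = new SessionUsage();
session.record("example-hosted-model", 4200, 1800);
session.record("local-model", 1600, 450);
session.record("example-hosted-model", 800, 200);
const sessionTotal = session.total();
```

Keeping usage keyed by model means switching models mid-session only adds a new map entry, and the total is derived from the per-model rows rather than tracked separately.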
Fallback Token Estimation
Not all providers return token counts in their API responses. When usage metadata is unavailable, Copair estimates tokens using a character-based heuristic (approximately 1 token per 3 characters). This fallback keeps the display consistent across all providers.
The estimation is conservative — it slightly overestimates to prevent unexpected context window overflows.
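The character-based fallback can be sketched in a few lines. The function names are hypothetical, and rounding up is an assumption about how the estimate stays on the high side; only the ratio of roughly 3 characters per token comes from this page.

```typescript
// Roughly 1 token per 3 characters, as described above.
const CHARS_PER_TOKEN = 3;

// Estimates tokens from text length. Rounding up keeps the estimate on the
// high side. Note: `length` counts UTF-16 code units, not user-visible characters.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Prefers the provider-reported count; falls back to the heuristic when the
// response carries no usage metadata.
function resolveTokenCount(reported: number | undefined, text: string): number {
  return reported ?? estimateTokens(text);
}

const estimated = resolveTokenCount(undefined, "hello world"); // 11 chars -> 4
const reportedCount = resolveTokenCount(42, "hello world"); // provider value wins
```

A commonly cited rule of thumb for English text is about 4 characters per token, so dividing by 3 tends to produce a somewhat higher count, which fits the conservative behavior described above.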
Session Summary
When you exit a session, Copair displays a final summary:
Session ended.
Duration: 23 minutes
Turns: 12
Tokens: 15,200 in / 4,800 out
Cost: ~$0.24
Next Steps
- Model Switching — Switch models to optimize cost vs. capability
- Context Persistence — How sessions and context are managed