Monitoring Token Usage with the Real-Time Audit Dashboard¶
Overview¶
The Real-Time Token Audit dashboard gives you full transparency into AI “fuel” consumption. Monitor exactly how many tokens are being consumed, by which features, and by which models, with hard-limit protection to ensure there are no surprise invoices.
In this tutorial, you’ll learn:
- How to navigate the Token Audit dashboard
- How to read token consumption reports
- How to set and manage hard-limit protection
- How to optimize token usage across your projects
- How to interpret the LiteLLM-powered analytics
The Token Audit Dashboard¶
┌────────────────────────────────────────────────────────────────┐
│ Token Audit Dashboard – Project: My SaaS App                   │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│ Monthly Budget: $150.00        Used: $87.50 (58%)              │
│ ████████████████████████████░░░░░░░░░░░░░░░░░░░░               │
│                                                                │
│ ┌─────────────┐   ┌─────────────┐   ┌─────────────┐            │
│ │ Tasks Today │   │ Avg Tokens  │   │  Cost/Task  │            │
│ │     12      │   │    8,450    │   │    $7.29    │            │
│ └─────────────┘   └─────────────┘   └─────────────┘            │
│                                                                │
│ Model Breakdown:                                               │
│   Claude 4.5    ████████████           42% ($36.75)            │
│   GPT-5         ███████████            38% ($33.25)            │
│   Gemini 3 Pro  ████                   14% ($12.25)            │
│   DeepSeek      ██                      6% ($5.25)             │
│                                                                │
│ Recent Tasks:                                                  │
│   ✅ Create auth endpoint     12,300 tokens    $9.84           │
│   ✅ Fix pagination bug        4,200 tokens    $3.36           │
│   🔄 Build dashboard UI       18,500 tokens   $14.80 (in prog) │
│   ⏳ Write unit tests          6,800 tokens    $5.44 (queued)  │
│                                                                │
│ [Set Hard Limit]   [Export Report]   [View Details]            │
└────────────────────────────────────────────────────────────────┘
Step 1: Navigate the Dashboard¶
Overview Section¶
| Metric | Description |
|---|---|
| Monthly Budget | Your configured token spending cap |
| Used | Current consumption and percentage |
| Remaining | Budget left for the billing cycle |
| Days Left | Days remaining in the current cycle |
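The Overview metrics are simple derivations from your budget and the billing cycle. A minimal sketch, assuming the billing cycle aligns with the calendar month (the dashboard computes these server-side; the function below is illustrative only):

```python
import calendar
from datetime import date

def budget_summary(monthly_budget: float, used: float, today: date) -> dict:
    """Derive the Overview metrics from a budget cap and current spend.

    Assumes the billing cycle is the calendar month; a real account may
    have a different cycle anchor.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return {
        "remaining": round(monthly_budget - used, 2),
        "pct_used": round(used / monthly_budget * 100),
        "days_left": days_in_month - today.day,
    }

budget_summary(150.00, 87.50, date(2025, 1, 18))
# e.g. {'remaining': 62.5, 'pct_used': 58, 'days_left': 13}
```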
Summary Cards¶
| Card | Description |
|---|---|
| Tasks Today | Number of AI tasks submitted today |
| Avg Tokens/Task | Average token consumption per task |
| Cost/Task | Average cost per AI task |
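The summary cards are averages over the day’s submissions. The sketch below shows the arithmetic using hypothetical `(tokens, cost)` pairs; it is not the console’s actual data model:

```python
def summary_cards(tasks):
    """Compute the three summary cards from a list of (tokens, cost)
    pairs for today's submissions. Illustrative only."""
    n = len(tasks)
    if n == 0:
        return {"tasks_today": 0, "avg_tokens": 0, "cost_per_task": 0.0}
    total_tokens = sum(t for t, _ in tasks)
    total_cost = sum(c for _, c in tasks)
    return {
        "tasks_today": n,
        "avg_tokens": round(total_tokens / n),
        "cost_per_task": round(total_cost / n, 2),
    }

# Using two tasks from the Recent Tasks list above:
summary_cards([(12_300, 9.84), (4_200, 3.36)])
# {'tasks_today': 2, 'avg_tokens': 8250, 'cost_per_task': 6.6}
```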
Model Breakdown¶
Shows token consumption by model:
- Visual bar chart: Proportional usage per model
- Percentage: Share of total consumption
- Dollar amount: Cost attributed to each model
Recent Tasks¶
Lists your most recent AI tasks with:
- Status: Completed (✅), In Progress (🔄), Queued (⏳), Failed (❌)
- Token count: Total tokens consumed
- Cost: Dollar amount spent
Step 2: Set Hard-Limit Protection¶
Configure Hard Limits¶
- Click “Set Hard Limit”
- Choose your limit type:
- Monthly cap: Maximum spend per billing cycle
- Daily cap: Maximum spend per day
- Per-task cap: Maximum spend per individual task
- Set the amount (e.g., $150/month)
- Choose the action when limit is reached:
- Pause: Stop all AI tasks until you manually resume
- Alert: Send notification but continue processing
- Click “Save”
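Conceptually, a hard limit is a cap plus an action. The sketch below models the decision the platform makes when spend crosses a cap; the type and field names are assumptions for illustration, not the real API:

```python
from dataclasses import dataclass

@dataclass
class HardLimit:
    # Hypothetical model of a configured limit (stored server-side).
    limit_type: str  # "monthly" | "daily" | "per_task"
    amount: float    # dollar cap
    action: str      # "pause" | "alert"

def check_limit(limit: HardLimit, spend: float) -> str:
    """Return what happens at the current spend level."""
    if spend < limit.amount:
        return "ok"
    # Cap reached: Pause mode stops new tasks, Alert mode only notifies.
    return "pause_tasks" if limit.action == "pause" else "alert_only"

check_limit(HardLimit("monthly", 150.0, "pause"), 151.20)  # -> "pause_tasks"
```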
Limit Notifications¶
You’ll receive notifications when:
| Threshold | Notification |
|---|---|
| 50% used | Informational: “You’ve used 50% of your monthly budget” |
| 75% used | Warning: “You’ve used 75%; consider reviewing task priorities” |
| 90% used | Alert: “You’ve used 90%; approaching your hard limit” |
| 100% reached | Action: Tasks paused (if Pause mode) or final alert (if Alert mode) |
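The table above maps straightforwardly to a threshold lookup: a given usage percentage triggers the highest threshold it has crossed. A minimal sketch (the level labels are illustrative):

```python
# Thresholds from the table above, checked highest first.
THRESHOLDS = [(100, "action"), (90, "alert"), (75, "warning"), (50, "info")]

def notification_level(pct_used: float):
    """Return the highest threshold crossed, or None below 50%."""
    for threshold, level in THRESHOLDS:
        if pct_used >= threshold:
            return level
    return None

notification_level(58)   # -> "info"   (matches the 58% shown on the dashboard)
notification_level(92)   # -> "alert"
```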
Step 3: Read Detailed Reports¶
Task-Level Report¶
Click on any task to see detailed token usage:
Task: "Create user authentication endpoint"
Status: Completed
Duration: 4 minutes
Token Breakdown:
├── Claude 4.5:    2,400 tokens (Architecture design)
├── GPT-5:         8,200 tokens (Code implementation)
├── Gemini 3 Pro:  1,800 tokens (Architecture review)
├── DeepSeek:        900 tokens (Code refactoring)
└── Total:        13,300 tokens ($10.64)
Quality Gate:
├── Static Analysis: PASS (Score: A)
├── Vulnerability Scan: PASS (0 issues)
├── Unit Tests: PASS (87% coverage)
└── Style Validation: PASS (auto-fixed 2 issues)
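You can sanity-check a task report yourself: the per-model counts should sum to the total, and the report above implies a blended rate of about $0.80 per 1,000 tokens (13,300 tokens for $10.64). That rate is inferred from this one example; real per-model pricing differs.

```python
# Per-model token counts from the task report above.
breakdown = {
    "Claude 4.5": 2_400,
    "GPT-5": 8_200,
    "Gemini 3 Pro": 1_800,
    "DeepSeek": 900,
}

total_tokens = sum(breakdown.values())
# Blended rate inferred from the report; an assumption, not a price list.
total_cost = total_tokens / 1_000 * 0.80

print(total_tokens, f"${total_cost:.2f}")  # 13300 $10.64
```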
Project-Level Report¶
- Go to Analytics β Project Report
- Select date range
- View:
- Total tokens consumed across all tasks
- Cost breakdown by model, task type, and team member
- Trend analysis: Consumption over time
- Top consumers: Most expensive tasks and features
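The project-level cost breakdown is an aggregation over task records. If you export the raw data (see Step 5), you can reproduce a by-model rollup yourself; the record fields below (`model`, `tokens`, `cost`) are assumed for illustration, not the documented export schema:

```python
from collections import defaultdict

# Hypothetical exported task records.
records = [
    {"model": "Claude 4.5", "tokens": 2_400, "cost": 1.92},
    {"model": "GPT-5", "tokens": 8_200, "cost": 6.56},
    {"model": "Claude 4.5", "tokens": 1_100, "cost": 0.88},
]

# Roll tokens and cost up per model, like the Model Breakdown chart.
by_model = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
for r in records:
    by_model[r["model"]]["tokens"] += r["tokens"]
    by_model[r["model"]]["cost"] += r["cost"]
```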
Step 4: Optimize Token Usage¶
Identify High-Consumption Areas¶
Look for patterns in your reports:
| Pattern | Possible Cause | Solution |
|---|---|---|
| One task uses 5x more tokens than average | Overly broad task description | Break into smaller, focused tasks |
| Claude 4.5 consumption is high | Using Claude for simple tasks | Let AI Factory auto-select optimal model |
| Refactoring tasks cost too much | DeepSeek not being used | Check model preference settings |
| Token usage spikes on certain days | Batch task submissions | Spread submissions across the week |
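The first pattern in the table is easy to automate against exported data. A minimal sketch that flags tasks consuming more than five times the average (the threshold factor is the table’s rule of thumb, not a platform setting):

```python
def flag_outliers(token_counts, factor=5):
    """Return indexes of tasks whose token use exceeds `factor` times
    the average across all tasks."""
    if not token_counts:
        return []
    avg = sum(token_counts) / len(token_counts)
    return [i for i, t in enumerate(token_counts) if t > factor * avg]

# Nine ordinary tasks and one runaway task: only the last is flagged.
flag_outliers([1_000] * 9 + [50_000])  # -> [9]
```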
Optimization Strategies¶
- Write precise task descriptions: Vague descriptions lead to more token consumption
- Break large tasks into smaller ones: Easier to estimate and control cost
- Use appropriate model preferences: Don’t force expensive models for simple tasks
- Review and adjust limits monthly: Based on actual usage patterns
- Monitor weekly, not daily: Look for trends, not daily fluctuations
Step 5: Export Reports¶
Export Options¶
- Click “Export Report”
- Choose format:
- CSV: For spreadsheet analysis
- PDF: For sharing with stakeholders
- JSON: For programmatic processing
- Select date range
- Choose data to include:
- Task details
- Model breakdown
- Cost attribution
- Quality gate results
- Download the report
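The JSON format is the one to pick for programmatic processing. A sketch of reading an export with the standard library; the structure shown is an assumption, so inspect a real export for the actual schema:

```python
import json

# Hypothetical export payload; field names are illustrative.
report = json.loads("""
{
  "date_range": {"from": "2025-01-01", "to": "2025-01-31"},
  "tasks": [
    {"name": "Create auth endpoint", "tokens": 12300, "cost": 9.84},
    {"name": "Fix pagination bug", "tokens": 4200, "cost": 3.36}
  ]
}
""")

total_cost = sum(t["cost"] for t in report["tasks"])
print(f"${total_cost:.2f} across {len(report['tasks'])} tasks")  # $13.20 across 2 tasks
```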
Best Practices¶
Budget Planning¶
- Start conservative: Set a lower limit and increase as you understand usage
- Review monthly: Adjust based on actual consumption patterns
- Plan for spikes: Account for larger features or bug-fixing sprints
- Separate projects: Set individual limits per project for better control
Cost Optimization¶
- Use the AI Factory’s auto-routing: It selects the most cost-effective model
- Batch similar tasks: Reduces context-switching overhead
- Review failed tasks: Failed tasks still consume tokens, so improve task descriptions
- Leverage DeepSeek: For refactoring and cleanup, it’s the most cost-efficient
Team Management¶
- Set per-developer limits: If multiple team members submit tasks
- Share usage reports: Keep the team informed about consumption
- Educate on efficiency: Train team members to write effective task descriptions
- Review weekly as a team: Discuss consumption patterns and optimization opportunities
What’s Next?¶
- Learn about Private AI Gateway
- Explore Understanding the AI Factory
- Read about Automated QA & Security Guardrails
Need Help?¶
- Documentation: docs.4geeks.io
- Community: community.4geeks.io
- Support: Available through the console dashboard
Still have questions? Ask the community.