GenAI Cost Observability for Google's Generative AI.
ai-tokentrace provides a transparent and easy way to track token consumption in your GenAI applications. Whether you're using the standard google-genai SDK or building complex agents with the Google Agent Development Kit (ADK), this library helps you manage costs, optimize performance, and gain deep insights into your model usage.
- Automatic Tracking: Seamlessly integrates with google-genai to capture token usage from every API call.
- ADK Support: Includes a plugin for the Google Agent Development Kit for effortless agent monitoring.
- Multiple Backends: Export data to where you need it:
  - Logging: Simple standard output for development.
  - JSONL: Structured local files for easy analysis.
  - Google Cloud Firestore: Scalable, queryable cloud storage.
  - Google Cloud Pub/Sub: Event-driven pipelines for real-time analytics.
- Async Native: Fully non-blocking to keep your applications fast.
- Rich Metrics: Tracks input/output tokens, thinking tokens, cached content, tool usage, and more.
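To make the tracked metrics concrete, here is a minimal sketch of how the counters on a google-genai response's `usage_metadata` could be flattened into a single record. The helper `summarize_usage` and the output key names are hypothetical illustrations, not part of ai-tokentrace; only the `*_token_count` attribute names come from the google-genai SDK.

```python
from types import SimpleNamespace


def summarize_usage(usage_metadata) -> dict:
    """Flatten a usage_metadata-like object into a plain dict (hypothetical helper)."""
    # Output key (illustrative) -> attribute name on google-genai's usage metadata.
    fields = {
        "input_tokens": "prompt_token_count",
        "output_tokens": "candidates_token_count",
        "thinking_tokens": "thoughts_token_count",
        "cached_tokens": "cached_content_token_count",
        "tool_use_tokens": "tool_use_prompt_token_count",
        "total_tokens": "total_token_count",
    }
    # Counters that are missing or None are reported as 0.
    return {
        name: getattr(usage_metadata, attr, None) or 0
        for name, attr in fields.items()
    }


# Stand-in for response.usage_metadata so the sketch runs without an API key.
fake_usage = SimpleNamespace(
    prompt_token_count=9,
    candidates_token_count=6,
    total_token_count=15,
)
summary = summarize_usage(fake_usage)
print(summary)
```

With a real client, the same helper could be called as `summarize_usage(response.usage_metadata)`.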
Install using pip or uv (recommended).
For standard logging or JSONL export:
pip install ai-tokentrace
# or
uv pip install ai-tokentrace
Install with specific extras for Cloud integrations or ADK support:
# For Google Cloud Firestore
uv pip install "ai-tokentrace[firestore]"
# For Google Cloud Pub/Sub
uv pip install "ai-tokentrace[pubsub]"
# For Google ADK support
uv pip install "ai-tokentrace[adk]"
# Install everything
uv pip install "ai-tokentrace[firestore,pubsub,adk]"
Simply wrap your client with TrackedGenaiClient. It works exactly like the standard client but logs all token usage.
import os
from google import genai
from ai_tokentrace import TrackedGenaiClient
# 1. Initialize standard client
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
# 2. Wrap with tracking (uses logging by default)
tracked_client = TrackedGenaiClient(client=client)
# 3. Use as normal!
response = tracked_client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain quantum computing in 5 words."
)
print(response.text)
# Output: "Complex superposition processes information fast."
# Log: {"timestamp": "...", "model_name": "gemini-2.5-flash", "total_tokens": 15, ...}
Add the TokenTrackingPlugin to your ADK app.
from google.adk.agents import LlmAgent
from google.adk.apps.app import App
from ai_tokentrace.adk import TokenTrackingPlugin
agent = LlmAgent(model="gemini-2.5-flash", ...)
app = App(
    name="my_app",
    root_agent=agent,
    plugins=[TokenTrackingPlugin()]  # Tracks all agent interactions
)
You can configure different backends for storing your token usage data.
Firestore Example:
from ai_tokentrace import TrackedGenaiClient
from ai_tokentrace.services import FirestoreTokenUsageService
service = FirestoreTokenUsageService(collection_name="genai_usage_logs")
tracked_client = TrackedGenaiClient(client=client, service=service)
Pub/Sub Example:
from ai_tokentrace import TrackedGenaiClient
from ai_tokentrace.services import PubSubTokenUsageService
service = PubSubTokenUsageService(topic_id="my-usage-topic", project_id="my-project")
tracked_client = TrackedGenaiClient(client=client, service=service)
Give your agents the ability to see their own token usage!
from ai_tokentrace.services import FirestoreTokenUsageService
service = FirestoreTokenUsageService(...)
# Add the inspection tool to your agent
agent = LlmAgent(
    ...,
    tools=[service.get_inspection_tool()]
)
Check out the examples/ directory for complete, runnable projects:
- google-genai/: Scripts demonstrating sync/async usage, streaming, and different backends.
- adk/: Full ADK applications showing multi-agent tracking, multimodal capabilities, and self-inspection.
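Records exported with the JSONL backend can also be analyzed offline with a few lines of Python. The sketch below sums tokens per model, assuming each line is a JSON object carrying the `model_name` and `total_tokens` fields shown in the log line above. The file name `usage.jsonl`, the helper `total_tokens_by_model`, and the sample records are hypothetical.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def total_tokens_by_model(path: Path) -> dict:
    """Sum total_tokens per model_name across a JSONL usage file."""
    totals = defaultdict(int)
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # Skip blank lines.
            record = json.loads(line)
            totals[record["model_name"]] += record.get("total_tokens", 0)
    return dict(totals)


# Demo with made-up records; a real file would be written by the JSONL backend.
sample_records = [
    {"model_name": "gemini-2.5-flash", "total_tokens": 15},
    {"model_name": "gemini-2.5-flash", "total_tokens": 25},
    {"model_name": "gemini-2.5-pro", "total_tokens": 100},
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "usage.jsonl"
    demo_path.write_text(
        "\n".join(json.dumps(r) for r in sample_records) + "\n",
        encoding="utf-8",
    )
    totals = total_tokens_by_model(demo_path)

print(totals)
```

The per-model totals can then be multiplied by your own price table to estimate cost.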
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Apache 2.0 - See LICENSE for more details.