Cost Tracking
Track token usage and estimated costs across models with built-in pricing tables.
The cost tracking system records token usage per model and estimates USD costs using built-in pricing tables. It supports input, output, cache read, and cache creation tokens.
Configuration
```ts
import { Agent } from "noumen";
import { LocalSandbox } from "noumen/local";

const code = new Agent({
  provider,
  sandbox: LocalSandbox({ cwd: "/my/project" }),
  options: {
    costTracking: {
      enabled: true,
    },
  },
});
```
Getting cost summaries
After running threads, get a summary from the Agent instance:
```ts
const summary = code.getCostSummary();
console.log(`Total cost: $${summary.totalCostUSD.toFixed(4)}`);
console.log(`Input tokens: ${summary.totalInputTokens}`);
console.log(`Output tokens: ${summary.totalOutputTokens}`);
```
Or use the formatted output:
```ts
import { CostTracker } from "noumen";

const tracker = new CostTracker();
// ... after some usage ...
console.log(tracker.formatSummary());
// Total cost: $0.0234
// Usage by model:
//   gpt-4o: 12.3k input, 2.1k output ($0.0180)
//   gpt-4o-mini: 5.2k input, 800 output ($0.0054)
```
CostSummary type
```ts
interface CostSummary {
  totalCostUSD: number;
  totalInputTokens: number;
  totalOutputTokens: number;
  totalCacheReadTokens: number;
  totalCacheCreationTokens: number;
  byModel: Record<string, ModelUsageSummary>;
  duration: { apiMs: number; wallMs: number };
}
```
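The `byModel` map breaks totals down per model, which is useful for spotting what drives spend. A minimal sketch of walking it; the exact shape of `ModelUsageSummary` is not documented here, so each entry is logged wholesale:

```ts
const summary = code.getCostSummary();

// Per-model breakdown. ModelUsageSummary's fields aren't shown above,
// so we print each entry as-is rather than picking properties.
for (const [model, usage] of Object.entries(summary.byModel)) {
  console.log(`${model}:`, usage);
}

// duration separates time spent in API calls from total wall time.
console.log(`API: ${summary.duration.apiMs}ms, wall: ${summary.duration.wallMs}ms`);
```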
Custom pricing
Override built-in pricing by passing a custom pricing table:
```ts
import { CostTracker, type ModelPricing } from "noumen";

const pricing: Record<string, ModelPricing> = {
  "my-custom-model": {
    inputPer1M: 3.0,
    outputPer1M: 15.0,
  },
};

const tracker = new CostTracker(pricing);
```
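The `Per1M` fields are USD rates per million tokens, so the implied arithmetic is straightforward. A sketch of that calculation done by hand (the tracker presumably computes the equivalent internally):

```ts
// Cost of a single request under the custom pricing above.
const { inputPer1M, outputPer1M } = pricing["my-custom-model"];
const inputTokens = 12_300;
const outputTokens = 2_100;

const costUSD =
  (inputTokens / 1_000_000) * inputPer1M +
  (outputTokens / 1_000_000) * outputPer1M;
// (12300 / 1e6) * 3.0 + (2100 / 1e6) * 15.0 = 0.0369 + 0.0315 = $0.0684
console.log(`$${costUSD.toFixed(4)}`);
```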
Stream events
Cost updates are emitted as `cost_update` stream events during agent runs:
```ts
for await (const event of thread.run("...")) {
  if (event.type === "cost_update") {
    console.log(`Running cost: $${event.summary.totalCostUSD.toFixed(4)}`);
  }
}
```
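Because each `cost_update` carries a running summary, the stream can drive a simple budget guard. A sketch using only the event fields shown above; the budget value and the early `break` are illustrative, and exact cancellation semantics may differ:

```ts
const BUDGET_USD = 1.0; // illustrative cap

for await (const event of thread.run("...")) {
  if (event.type === "cost_update" && event.summary.totalCostUSD > BUDGET_USD) {
    console.warn(`Budget exceeded: $${event.summary.totalCostUSD.toFixed(4)}`);
    break; // stop consuming the stream
  }
}
```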
Prompt caching
Prompt caching reduces cost and latency by enabling providers to cache and reuse the prefix of each request. When enabled, noumen sorts tool definitions into a deterministic order and injects `cache_control` breakpoints into messages so the provider can serve the shared prefix from cache on subsequent requests.
```ts
const code = new Agent({
  provider,
  sandbox,
  options: {
    promptCaching: {
      enabled: true,
      ttl: "1h", // cache TTL (optional)
      scope: "global", // "global" or "org" (optional)
    },
  },
});
```
This is opt-in because it changes the order of tool definitions sent to the provider (sorted alphabetically, with built-in tools first and MCP tools second) and adds `cache_control` markers to messages. These changes are invisible to the model but can matter if you compare raw API payloads or rely on a specific tool ordering. Prompt caching is only effective with providers that support it (Anthropic, and some OpenAI models).
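Because the cost tracker records cache read and cache creation tokens separately, the two features compose: you can check whether caching is actually taking effect. A minimal sketch, assuming `costTracking` is also enabled on the same Agent:

```ts
// After a few thread runs with promptCaching enabled:
const summary = code.getCostSummary();

// Cache creation tokens are written on the first request; cache read
// tokens on later requests indicate the shared prefix is being reused.
console.log(`Cache created: ${summary.totalCacheCreationTokens} tokens`);
console.log(`Cache read: ${summary.totalCacheReadTokens} tokens`);
```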
CacheControlConfig
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | `boolean` | — | Enable prompt caching |
| `ttl` | `"1h"` | — | Cache time-to-live hint for the provider |
| `scope` | `"global" \| "org"` | — | Cache sharing scope |