# Context Analysis
NeuraMeter can analyze your context window usage to identify waste, breaking down how much of your input comes from system prompts, conversation history, and tool results.
## Usage

```typescript
import { analyzeContext } from '@neurameter/core';

const analysis = analyzeContext(
  [
    { role: 'system', content: 'You are a helpful assistant...' },
    { role: 'user', content: 'Summarize this document...' },
    { role: 'assistant', content: 'Here is the summary...' },
    { role: 'tool', content: '{"results": [...very large...]}' },
    { role: 'user', content: 'Now analyze the key findings' },
  ],
  'gpt-4o'
);
```

## ContextAnalysis Result
```typescript
interface ContextAnalysis {
  estimatedInputTokens: number; // total estimated tokens
  modelContextLimit: number;    // model's max context window
  utilizationPercent: number;   // tokens / limit (0.0 - 1.0+)
  messageCount: number;         // number of messages
  systemPromptTokens: number;   // tokens from system messages
  conversationTokens: number;   // tokens from user/assistant messages
  toolResultTokens: number;     // tokens from tool messages
}
```

## Example Output
```json
{
  "estimatedInputTokens": 95000,
  "modelContextLimit": 128000,
  "utilizationPercent": 0.742,
  "messageCount": 24,
  "systemPromptTokens": 2500,
  "conversationTokens": 78000,
  "toolResultTokens": 14500
}
```

In this example, 82% of the context is conversation history, a strong candidate for summarization.
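As a sanity check, the reported ratios can be rederived from the raw token counts above (the numbers are copied from the example output, not produced by the SDK):

```typescript
// Recompute the example's ratios from its raw token counts.
const estimatedInputTokens = 2500 + 78000 + 14500; // system + conversation + tool = 95000
const modelContextLimit = 128_000;

const utilizationPercent = estimatedInputTokens / modelContextLimit;
const conversationShare = 78000 / estimatedInputTokens;

console.log(utilizationPercent.toFixed(3));       // "0.742"
console.log(Math.round(conversationShare * 100)); // 82
```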
## Token Estimation

```typescript
import { estimateTokens } from '@neurameter/core';

const tokens = estimateTokens('Hello, how are you?');
// ~5 tokens
```

This uses a fast heuristic (~4 characters per token). It is intentionally fast (<1 ms) for SDK use and is not meant to be perfectly accurate.
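The chars/4 rule is simple enough to sketch in full. This is an illustrative approximation only, not NeuraMeter's actual implementation:

```typescript
// Rough token estimate: ~4 characters per token, rounded up.
// An illustrative sketch, not the library's actual heuristic.
function roughTokenEstimate(text: string): number {
  return Math.ceil(text.length / 4);
}

roughTokenEstimate('Hello, how are you?'); // 19 chars -> 5 tokens
```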
## Model Context Limits

```typescript
import { getModelContextLimit, MODEL_CONTEXT_LIMITS } from '@neurameter/core';

const limit = getModelContextLimit('gpt-4o');
// 128_000

const limit2 = getModelContextLimit('gpt-4.1');
// 1_000_000
```

### Built-in Limits
| Model | Context Window |
|---|---|
| gpt-4o | 128,000 |
| gpt-4o-mini | 128,000 |
| gpt-4.1 | 1,000,000 |
| gpt-4.1-mini | 1,000,000 |
| o1 | 200,000 |
| o3-mini | 200,000 |
| claude-sonnet-4 | 200,000 |
| claude-haiku-4 | 200,000 |
| claude-opus-4 | 200,000 |
Models not in the table default to 128,000. Prefix matching is supported (e.g., `claude-sonnet-4-20250514` matches `claude-sonnet-4`).
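The lookup rules just described (exact match, then prefix match, then a 128,000 default) can be sketched as follows; `LIMITS` here is a small hypothetical stand-in for the exported `MODEL_CONTEXT_LIMITS` table, and `lookupContextLimit` is illustrative, not the library's internal code:

```typescript
// Sketch of the documented lookup rules: exact match first, then
// prefix match, then a 128,000 default for unknown models.
const LIMITS: Record<string, number> = {
  'gpt-4o': 128_000,
  'gpt-4.1': 1_000_000,
  'claude-sonnet-4': 200_000,
};

function lookupContextLimit(model: string): number {
  if (model in LIMITS) return LIMITS[model];
  const match = Object.keys(LIMITS).find((key) => model.startsWith(key));
  return match !== undefined ? LIMITS[match] : 128_000;
}

lookupContextLimit('claude-sonnet-4-20250514'); // 200_000 via prefix match
lookupContextLimit('totally-unknown-model');    // 128_000 default
```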
## Integration with Guardrails
Context analysis is automatically performed when using guardrails:
```typescript
import { NeuraMeter } from '@neurameter/core';

const meter = new NeuraMeter({
  apiKey: 'nm_xxx',
  projectId: 'proj_xxx',
  guards: {
    maxContextUtilization: 0.80,
  },
});

const result = meter.checkGuards({
  messages: messages,
  model: 'gpt-4o',
  provider: 'openai',
  agentName: 'MyAgent',
});

// result.contextAnalysis contains the full breakdown
console.log(result.contextAnalysis?.utilizationPercent);
console.log(result.contextAnalysis?.conversationTokens);
```

## Common Patterns
### Detect Conversation Bloat

```typescript
const analysis = analyzeContext(messages, model);

if (analysis.conversationTokens > analysis.estimatedInputTokens * 0.7) {
  console.warn('70%+ of context is conversation history — consider summarizing');
}
```

### Detect Tool Result Bloat
```typescript
if (analysis.toolResultTokens > analysis.estimatedInputTokens * 0.5) {
  console.warn('50%+ of context is tool results — consider truncating');
}
```
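Both patterns read fields of the same per-role breakdown. As a self-contained illustration, here is a sketch of how such a breakdown could be computed from raw messages using the ~4 chars/token heuristic described earlier; `tokenBreakdown` mirrors the `ContextAnalysis` field names but is not the library's internal code:

```typescript
type Msg = { role: 'system' | 'user' | 'assistant' | 'tool'; content: string };

// Bucket estimated tokens by message role, mirroring analyzeContext's breakdown.
// Illustrative only: uses the ~4 chars/token heuristic, not the SDK internals.
function tokenBreakdown(messages: Msg[]) {
  let systemPromptTokens = 0;
  let conversationTokens = 0;
  let toolResultTokens = 0;
  for (const m of messages) {
    const t = Math.ceil(m.content.length / 4);
    if (m.role === 'system') systemPromptTokens += t;
    else if (m.role === 'tool') toolResultTokens += t;
    else conversationTokens += t; // user + assistant
  }
  const estimatedInputTokens =
    systemPromptTokens + conversationTokens + toolResultTokens;
  return { estimatedInputTokens, systemPromptTokens, conversationTokens, toolResultTokens };
}

const b = tokenBreakdown([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Summarize this document for me, please.' },
  { role: 'tool', content: '{"results": []}' },
]);
// b.estimatedInputTokens always equals the sum of the three buckets
```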