# Context Analysis
NeuraMeter can analyze your context window usage to identify waste, breaking down how much of your input comes from system prompts, conversation history, and tool results.
## Usage

```typescript
import { analyzeContext } from '@neurameter/core';

const analysis = analyzeContext(
  [
    { role: 'system', content: 'You are a helpful assistant...' },
    { role: 'user', content: 'Summarize this document...' },
    { role: 'assistant', content: 'Here is the summary...' },
    { role: 'tool', content: '{"results": [...very large...]}' },
    { role: 'user', content: 'Now analyze the key findings' },
  ],
  'gpt-4o'
);
```

## ContextAnalysis Result
```typescript
interface ContextAnalysis {
  estimatedInputTokens: number; // total estimated tokens
  modelContextLimit: number;    // model's max context window
  utilizationPercent: number;   // tokens / limit (0.0 - 1.0+)
  messageCount: number;         // number of messages
  systemPromptTokens: number;   // tokens from system messages
  conversationTokens: number;   // tokens from user/assistant messages
  toolResultTokens: number;     // tokens from tool messages
}
```

## Example Output
```json
{
  "estimatedInputTokens": 95000,
  "modelContextLimit": 128000,
  "utilizationPercent": 0.742,
  "messageCount": 24,
  "systemPromptTokens": 2500,
  "conversationTokens": 78000,
  "toolResultTokens": 14500
}
```

In this example, 82% of the context is conversation history, a strong candidate for summarization.
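As a sanity check, the reported ratios can be rederived from the raw token counts above (the numbers are copied from the example output, not produced by the SDK):

```typescript
// Recompute the example's ratios from its raw token counts.
const estimatedInputTokens = 2500 + 78000 + 14500; // system + conversation + tool = 95000
const modelContextLimit = 128_000;

const utilizationPercent = estimatedInputTokens / modelContextLimit;
const conversationShare = 78000 / estimatedInputTokens;

console.log(utilizationPercent.toFixed(3));       // "0.742"
console.log(Math.round(conversationShare * 100)); // 82
```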
## Token Estimation

```typescript
import { estimateTokens } from '@neurameter/core';

const tokens = estimateTokens('Hello, how are you?');
// ~5 tokens
```

This uses a fast heuristic (~4 characters per token). It is intentionally fast (<1 ms) for SDK use and is not meant to be perfectly accurate.
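The chars/4 rule is simple enough to sketch in full. This is an illustrative approximation only, not NeuraMeter's actual implementation:

```typescript
// Rough token estimate: ~4 characters per token, rounded up.
// An illustrative sketch, not the library's actual heuristic.
function roughTokenEstimate(text: string): number {
  return Math.ceil(text.length / 4);
}

roughTokenEstimate('Hello, how are you?'); // 19 chars -> 5 tokens
```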
## Model Context Limits

```typescript
import { getModelContextLimit, MODEL_CONTEXT_LIMITS } from '@neurameter/core';

const limit = getModelContextLimit('gpt-4o');
// 128_000

const limit2 = getModelContextLimit('gpt-4.1');
// 1_000_000
```

### Built-in Limits
| Model | Context Window |
|---|---|
| gpt-4o | 128,000 |
| gpt-4o-mini | 128,000 |
| gpt-4.1 | 1,000,000 |
| gpt-4.1-mini | 1,000,000 |
| o1 | 200,000 |
| o3-mini | 200,000 |
| claude-sonnet-4 | 200,000 |
| claude-haiku-4 | 200,000 |
| claude-opus-4 | 200,000 |
Models not in the table default to 128,000. Prefix matching is supported (e.g., `claude-sonnet-4-20250514` matches `claude-sonnet-4`).
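The lookup rules just described (exact match, then prefix match, then a 128,000 default) can be sketched as follows; `LIMITS` here is a small hypothetical stand-in for the exported `MODEL_CONTEXT_LIMITS` table, and `lookupContextLimit` is illustrative, not the library's internal code:

```typescript
// Sketch of the documented lookup rules: exact match first, then
// prefix match, then a 128,000 default for unknown models.
const LIMITS: Record<string, number> = {
  'gpt-4o': 128_000,
  'gpt-4.1': 1_000_000,
  'claude-sonnet-4': 200_000,
};

function lookupContextLimit(model: string): number {
  if (model in LIMITS) return LIMITS[model];
  const match = Object.keys(LIMITS).find((key) => model.startsWith(key));
  return match !== undefined ? LIMITS[match] : 128_000;
}

lookupContextLimit('claude-sonnet-4-20250514'); // 200_000 via prefix match
lookupContextLimit('totally-unknown-model');    // 128_000 default
```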
## Integration with Guardrails
Context analysis is automatically performed when using guardrails:
```typescript
import { NeuraMeter } from '@neurameter/core';

const meter = new NeuraMeter({
  apiKey: 'nm_xxx',
  projectId: 'proj_xxx',
  guards: {
    maxContextUtilization: 0.80,
  },
});

const result = meter.checkGuards({
  messages: messages,
  model: 'gpt-4o',
  provider: 'openai',
  agentName: 'MyAgent',
});

// result.contextAnalysis contains the full breakdown
console.log(result.contextAnalysis?.utilizationPercent);
console.log(result.contextAnalysis?.conversationTokens);
```

## Common Patterns
### Detect Conversation Bloat

```typescript
const analysis = analyzeContext(messages, model);

if (analysis.conversationTokens > analysis.estimatedInputTokens * 0.7) {
  console.warn('70%+ of context is conversation history — consider summarizing');
}
```

### Detect Tool Result Bloat
```typescript
if (analysis.toolResultTokens > analysis.estimatedInputTokens * 0.5) {
  console.warn('50%+ of context is tool results — consider truncating');
}
```
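Both patterns read fields of the same per-role breakdown. As a self-contained illustration, here is a sketch of how such a breakdown could be computed from raw messages using the ~4 chars/token heuristic described earlier; `tokenBreakdown` mirrors the `ContextAnalysis` field names but is not the library's internal code:

```typescript
type Msg = { role: 'system' | 'user' | 'assistant' | 'tool'; content: string };

// Bucket estimated tokens by message role, mirroring analyzeContext's breakdown.
// Illustrative only: uses the ~4 chars/token heuristic, not the SDK internals.
function tokenBreakdown(messages: Msg[]) {
  let systemPromptTokens = 0;
  let conversationTokens = 0;
  let toolResultTokens = 0;
  for (const m of messages) {
    const t = Math.ceil(m.content.length / 4);
    if (m.role === 'system') systemPromptTokens += t;
    else if (m.role === 'tool') toolResultTokens += t;
    else conversationTokens += t; // user + assistant
  }
  const estimatedInputTokens =
    systemPromptTokens + conversationTokens + toolResultTokens;
  return { estimatedInputTokens, systemPromptTokens, conversationTokens, toolResultTokens };
}

const b = tokenBreakdown([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Summarize this document for me, please.' },
  { role: 'tool', content: '{"results": []}' },
]);
// b.estimatedInputTokens always equals the sum of the three buckets
```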