# MCP Server
`@neurameter/mcp-server` provides an MCP (Model Context Protocol) server that lets AI agents monitor their own cost and context usage in real time.
## Installation
```bash
npm install @neurameter/mcp-server
```

## Usage
```typescript
import { NeuraMeterMCPServer } from '@neurameter/mcp-server';

const server = new NeuraMeterMCPServer({
  apiKey: 'nm_xxx',
  projectId: 'proj_xxx',
});
```

## Available Tools
The MCP server exposes these tools to connected agents:
### get_cost_summary
Get cost summary for a time period.
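To make the summary shape concrete, here is a sketch (not the package's actual implementation) of how per-model costs roll up into it; the `UsageEvent` shape is a hypothetical stand-in for whatever event records the backend stores.

```typescript
// Hypothetical event record; for illustration only.
interface UsageEvent {
  model: string;
  cost: number;
}

// Aggregate events into the summary shape this tool returns.
function summarize(events: UsageEvent[]) {
  const byModel = new Map<string, number>();
  let totalCost = 0;
  for (const e of events) {
    totalCost += e.cost;
    byModel.set(e.model, (byModel.get(e.model) ?? 0) + e.cost);
  }
  // Most expensive models first.
  const topModels = [...byModel.entries()]
    .map(([model, cost]) => ({ model, cost }))
    .sort((a, b) => b.cost - a.cost);
  return { totalCost, totalEvents: events.length, topModels };
}
```

Note that in the sample output below the per-model costs (3.80 + 0.45) sum to `totalCost` (4.25), as this aggregation implies.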
Input:

```json
{
  "agentName": "MyAgent",
  "period": "1h"
}
```

Output:
```json
{
  "totalCost": 4.25,
  "totalEvents": 42,
  "topModels": [
    { "model": "gpt-4o", "cost": 3.80 },
    { "model": "gpt-4o-mini", "cost": 0.45 }
  ]
}
```

### check_context
Analyze context window utilization for a set of messages.
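The utilization figure is simple arithmetic over the token estimate and the model's context limit; a minimal sketch (the function name is illustrative, and 128,000 is gpt-4o's documented context window):

```typescript
// Illustrative: how a utilization figure like the one below is derived.
function contextUtilization(estimatedInputTokens: number, modelContextLimit: number) {
  return {
    // e.g. 105000 / 128000 rounds to 0.82
    utilizationPercent: Math.round((estimatedInputTokens / modelContextLimit) * 100) / 100,
    // Tokens still available before hitting the limit.
    headroom: modelContextLimit - estimatedInputTokens,
  };
}
```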
Input:

```json
{
  "messages": [...],
  "model": "gpt-4o"
}
```

Output:
```json
{
  "utilizationPercent": 0.82,
  "estimatedInputTokens": 105000,
  "modelContextLimit": 128000,
  "breakdown": {
    "systemPromptTokens": 2500,
    "conversationTokens": 88000,
    "toolResultTokens": 14500
  },
  "suggestion": "Summarize conversation history to save ~60% of input tokens"
}
```

### check_budget
Check remaining budget for the current period.
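The output is straightforward arithmetic over the configured limit and spend to date; a sketch (`budgetStatus` is an illustrative name, not part of the package API):

```typescript
// Sketch of the arithmetic behind the output schema below.
function budgetStatus(budgetLimit: number, spent: number) {
  return {
    budgetLimit,
    spent,
    remaining: budgetLimit - spent,          // e.g. 100.00 - 45.20 = 54.80
    utilizationPercent: spent / budgetLimit, // e.g. 45.20 / 100.00 = 0.452
  };
}
```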
Input:

```json
{
  "agentName": "MyAgent"
}
```

Output:
```json
{
  "budgetLimit": 100.0,
  "spent": 45.20,
  "remaining": 54.80,
  "utilizationPercent": 0.452
}
```

### get_recommendations
Get optimization recommendations based on recent usage patterns.
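Figures like "94% cheaper" in the sample output below come from comparing per-token list prices; a sketch of that comparison, where the concrete prices are illustrative assumptions rather than values the tool reports:

```typescript
// Relative savings of model B over model A, as a whole percentage.
function percentCheaper(pricePerMTokA: number, pricePerMTokB: number): number {
  return Math.round((1 - pricePerMTokB / pricePerMTokA) * 100);
}

// Assumed example prices in USD per 1M input tokens: gpt-4o at 2.50 and
// gpt-4o-mini at 0.15, so (1 - 0.15 / 2.50) = 0.94, i.e. 94% cheaper.
```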
Input:

```json
{
  "agentName": "MyAgent"
}
```

Output:
```json
{
  "recommendations": [
    "Switch from gpt-4o to gpt-4o-mini for classification tasks — same accuracy, 94% cheaper",
    "Conversation history accounts for 78% of context — enable auto-summarization",
    "Tool result from search_web averages 12K tokens — consider truncating to 2K"
  ]
}
```

### log_optimization
Log when an agent takes an optimization action (for tracking).
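The before/after token counts let the backend quantify what each optimization saved; a sketch of that arithmetic (the helper name is illustrative):

```typescript
// Savings implied by a log_optimization entry.
function tokenSavings(tokensBefore: number, tokensAfter: number) {
  const saved = tokensBefore - tokensAfter;
  return { saved, savedPercent: Math.round((saved / tokensBefore) * 100) };
}

// For the sample input below: 95000 -> 22000 saves 73000 tokens (77%).
```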
Input:

```json
{
  "agentName": "MyAgent",
  "action": "summarized_history",
  "tokensBefore": 95000,
  "tokensAfter": 22000,
  "description": "Summarized 45 messages to 3 key points"
}
```

## Why MCP?
The MCP server enables Layer 3: Autonomous Optimization — agents that:
- Check their own cost mid-conversation
- Analyze their own context utilization
- Decide to summarize, switch models, or throttle
- Report their optimizations back
This creates a feedback loop where agents self-optimize without human intervention.
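A minimal sketch of that loop, assuming the agent runtime exposes some `callTool` function for invoking MCP tools; the helper, the 80% threshold, the assumed summarization ratio, and the agent name are all illustrative:

```typescript
// Stand-in for whatever MCP client plumbing the agent runtime provides.
type ToolCall = (name: string, args: unknown) => Promise<any>;

async function selfOptimize(callTool: ToolCall, messages: unknown[]): Promise<void> {
  // 1. Check context utilization mid-conversation.
  const ctx = await callTool('check_context', { messages, model: 'gpt-4o' });

  if (ctx.utilizationPercent > 0.8) {
    // 2. The agent would summarize its own history here (elided);
    //    a 75% reduction is assumed for illustration.
    const tokensAfter = Math.round(ctx.estimatedInputTokens * 0.25);

    // 3. Report the optimization back for tracking.
    await callTool('log_optimization', {
      agentName: 'MyAgent',
      action: 'summarized_history',
      tokensBefore: ctx.estimatedInputTokens,
      tokensAfter,
      description: 'Auto-summarized history at >80% utilization',
    });
  }

  // 4. Throttle if the budget for the period is gone.
  const budget = await callTool('check_budget', { agentName: 'MyAgent' });
  if (budget.remaining <= 0) {
    throw new Error('Budget exhausted: throttling further calls');
  }
}
```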
## Configuration with Claude Desktop
Add to your Claude Desktop MCP config:
```json
{
  "mcpServers": {
    "neurameter": {
      "command": "npx",
      "args": ["-y", "@neurameter/mcp-server"],
      "env": {
        "NEURAMETER_API_KEY": "nm_xxx",
        "NEURAMETER_PROJECT_ID": "proj_xxx"
      }
    }
  }
}
```