MCP Server


@neurameter/mcp-server provides an MCP (Model Context Protocol) server that lets AI agents self-monitor their own cost and context usage in real time.

Installation

```bash
npm install @neurameter/mcp-server
```

Usage

```typescript
import { NeuraMeterMCPServer } from '@neurameter/mcp-server';

const server = new NeuraMeterMCPServer({
  apiKey: 'nm_xxx',
  projectId: 'proj_xxx',
});
```

Available Tools

The MCP server exposes these tools to connected agents:

get_cost_summary

Get cost summary for a time period.

Input:

```json
{
  "agentName": "MyAgent",
  "period": "1h"
}
```

Output:

```json
{
  "totalCost": 4.25,
  "totalEvents": 42,
  "topModels": [
    { "model": "gpt-4o", "cost": 3.80 },
    { "model": "gpt-4o-mini", "cost": 0.45 }
  ]
}
```
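When every model appears in `topModels`, the per-model costs add up to `totalCost`. A small sketch using the example payload above (the `CostSummary` and `ModelCost` interface names are hypothetical, mirroring the documented fields):

```typescript
// Hypothetical types mirroring the documented get_cost_summary output.
interface ModelCost {
  model: string;
  cost: number;
}

interface CostSummary {
  totalCost: number;
  totalEvents: number;
  topModels: ModelCost[];
}

const summary: CostSummary = {
  totalCost: 4.25,
  totalEvents: 42,
  topModels: [
    { model: 'gpt-4o', cost: 3.8 },
    { model: 'gpt-4o-mini', cost: 0.45 },
  ],
};

// Sum the listed per-model costs: 3.80 + 0.45 = 4.25.
const listedCost = summary.topModels.reduce((sum, m) => sum + m.cost, 0);
console.log(listedCost.toFixed(2)); // "4.25"
```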

check_context

Analyze context window utilization for a set of messages.

Input:

```json
{
  "messages": [...],
  "model": "gpt-4o"
}
```

Output:

```json
{
  "utilizationPercent": 0.82,
  "estimatedInputTokens": 105000,
  "modelContextLimit": 128000,
  "breakdown": {
    "systemPromptTokens": 2500,
    "conversationTokens": 88000,
    "toolResultTokens": 14500
  },
  "suggestion": "Summarize conversation history to save ~60% of input tokens"
}
```
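`utilizationPercent` is the ratio of estimated input tokens to the model's context limit (105000 / 128000 ≈ 0.82). An agent can act on it with a simple threshold; a minimal sketch, where the `ContextCheck` type name and the 0.8 cutoff are illustrative choices, not part of the API:

```typescript
// Hypothetical type mirroring the documented check_context output.
interface ContextCheck {
  utilizationPercent: number;
  estimatedInputTokens: number;
  modelContextLimit: number;
  suggestion: string;
}

// Summarize once the context window is (say) 80% full.
function shouldSummarize(check: ContextCheck, threshold = 0.8): boolean {
  return check.utilizationPercent >= threshold;
}

const check: ContextCheck = {
  utilizationPercent: 105000 / 128000, // ≈ 0.82
  estimatedInputTokens: 105000,
  modelContextLimit: 128000,
  suggestion: 'Summarize conversation history to save ~60% of input tokens',
};

console.log(shouldSummarize(check)); // true (0.82 ≥ 0.8)
```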

check_budget

Check remaining budget for the current period.

Input:

```json
{
  "agentName": "MyAgent"
}
```

Output:

```json
{
  "budgetLimit": 100.0,
  "spent": 45.20,
  "remaining": 54.80,
  "utilizationPercent": 0.452
}
```
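The budget fields relate as `remaining = budgetLimit - spent` and `utilizationPercent = spent / budgetLimit`. A quick check with the example values:

```typescript
// Reproduce the check_budget example arithmetic.
const budgetLimit = 100.0;
const spent = 45.2;

const remaining = budgetLimit - spent;          // 54.80
const utilizationPercent = spent / budgetLimit; // 0.452

console.log(remaining.toFixed(2));          // "54.80"
console.log(utilizationPercent.toFixed(3)); // "0.452"
```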

get_recommendations

Get optimization recommendations based on recent usage patterns.

Input:

```json
{
  "agentName": "MyAgent"
}
```

Output:

```json
{
  "recommendations": [
    "Switch from gpt-4o to gpt-4o-mini for classification tasks — same accuracy, 94% cheaper",
    "Conversation history accounts for 78% of context — enable auto-summarization",
    "Tool result from search_web averages 12K tokens — consider truncating to 2K"
  ]
}
```

log_optimization

Log when an agent takes an optimization action (for tracking).

Input:

```json
{
  "agentName": "MyAgent",
  "action": "summarized_history",
  "tokensBefore": 95000,
  "tokensAfter": 22000,
  "description": "Summarized 45 messages to 3 key points"
}
```
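Logging before/after token counts lets NeuraMeter quantify each optimization. For the example above, summarization removed roughly 77% of the input tokens; a one-function sketch of that calculation (the helper name is illustrative):

```typescript
// Percentage of input tokens removed by an optimization:
// (tokensBefore - tokensAfter) / tokensBefore, rounded to whole percent.
function tokensSavedPercent(tokensBefore: number, tokensAfter: number): number {
  return Math.round(((tokensBefore - tokensAfter) / tokensBefore) * 100);
}

console.log(tokensSavedPercent(95000, 22000)); // 77
```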

Why MCP?

The MCP server enables Layer 3: Autonomous Optimization — agents that:

  1. Check their own cost mid-conversation
  2. Analyze their own context utilization
  3. Decide to summarize, switch models, or throttle
  4. Report their optimizations back

This creates a feedback loop where agents self-optimize without human intervention.
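The four steps above can be sketched as a single loop. `callTool` below is a stand-in for however the agent invokes MCP tools; the tool names and payload fields follow this page, but the wiring and thresholds are hypothetical:

```typescript
// Stand-in for the agent's MCP tool-invocation mechanism.
type ToolCall = (name: string, input: object) => Promise<any>;

async function selfOptimize(callTool: ToolCall, messages: object[]): Promise<string[]> {
  const actions: string[] = [];

  // 1. Check cost, and 2. analyze context utilization, mid-conversation.
  const budget = await callTool('check_budget', { agentName: 'MyAgent' });
  const context = await callTool('check_context', { messages, model: 'gpt-4o' });

  // 3. Decide: summarize when the context window is nearly full,
  //    switch models when the budget is nearly spent (illustrative thresholds).
  if (context.utilizationPercent >= 0.8) actions.push('summarized_history');
  if (budget.utilizationPercent >= 0.9) actions.push('switched_model');

  // 4. Report each optimization back for tracking.
  for (const action of actions) {
    await callTool('log_optimization', { agentName: 'MyAgent', action });
  }
  return actions;
}
```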

Configuration with Claude Desktop

Add to your Claude Desktop MCP config:

```json
{
  "mcpServers": {
    "neurameter": {
      "command": "npx",
      "args": ["-y", "@neurameter/mcp-server"],
      "env": {
        "NEURAMETER_API_KEY": "nm_xxx",
        "NEURAMETER_PROJECT_ID": "proj_xxx"
      }
    }
  }
}
```