Debugging Claude Code with Claude: A Meta-Optimization Journey

Claude Code had been getting slower. Responses were taking longer, some features weren't working reliably, and the startup time had crept from instant to several seconds. I could have dug through configuration files manually, but I had a better idea: what if Claude could debug itself?

This article documents the meta-debugging process of using Claude to analyze its own internal state, identify performance bottlenecks, and implement systematic optimizations. The approach is generalizable to any complex development tool with extensive logging.

The Meta-Debugging Approach

The insight is simple: Claude Code generates detailed debug logs, session transcripts, and project state files. These contain patterns that indicate problems. Claude, as an AI, excels at pattern recognition and analysis at scale. Why not use it to analyze its own operation?

Key Insight
AI tools that generate extensive logs are perfect candidates for self-analysis. The tool itself can identify patterns humans would miss in thousands of lines of debug output.

Investigation Strategy

The plan was straightforward:

  1. Map the landscape: Identify what data Claude Code generates
  2. Quantify the problem: Measure file counts, sizes, and patterns
  3. Pattern analysis: Look for errors, timeouts, and failures
  4. Root cause identification: Trace problems to configuration issues
  5. Systematic fixes: Address issues in order of impact
  6. Validation: Measure improvements

What Claude Code Stores

Claude Code maintains several directories in (local path):

839MB   projects/         # Session state for different directories
602MB   debug/           # Debug logs from every session
236MB   plugins/         # Plugin cache
131MB   transcripts/     # Conversation history
43MB    file-history/    # File edit history
5.9MB   todos/          # Task tracking state

The debug directory alone contained nearly 1,500 log files spanning months of usage. Perfect data for pattern analysis.
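
A size listing like the one above can be produced with du. This is a minimal sketch rather than the exact command I ran: the function name dir_sizes is hypothetical, and the directory is passed in as an argument because its location depends on your installation.

```shell
# dir_sizes DIR: print the size of each immediate subdirectory of DIR,
# largest first. DIR is an assumption; pass whatever directory your
# installation actually uses.
dir_sizes() {
  for sub in "$1"/*/; do
    # Skip the unexpanded glob when DIR has no subdirectories.
    [ -d "$sub" ] || continue
    # -k reports kilobytes, which behaves the same on GNU and BSD du.
    du -sk "${sub%/}"
  done | sort -rn | awk -F'\t' '{ printf "%.1fMB\t%s\n", $1 / 1024, $2 }'
}
```

Sorting happens on the raw kilobyte counts before they are converted to megabytes, so the order stays correct even when sizes differ by orders of magnitude.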

The Investigation Process

Step 1: Quantify Debug Logs

First question: how bad is the debug log accumulation?

find (local path) -type f -mtime +30 | wc -l
# Result: 916 files older than 30 days

These old debug files serve no practical purpose but add overhead to file system operations. Quick win identified.
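
Acting on that quick win might look like the following sketch. The function names are hypothetical, and the 30-day default simply mirrors the cutoff used above. Listing and deleting are kept separate so the candidates can be reviewed first.

```shell
# list_old_logs DIR [DAYS]: list regular files under DIR that were last
# modified more than DAYS days ago (default 30). Read-only.
list_old_logs() {
  find "$1" -type f -mtime +"${2:-30}" -print
}

# prune_old_logs DIR [DAYS]: delete the same set of files.
# Review the output of list_old_logs before running this.
prune_old_logs() {
  find "$1" -type f -mtime +"${2:-30}" -exec rm -f {} +
}
```

Calling list_old_logs with the debug directory and piping it to wc -l should reproduce the count above; prune_old_logs with the same arguments removes exactly those files.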

Step 2: Error Pattern Analysis

Next, I asked Claude to analyze error patterns across all debug logs:

grep -h "ERROR\|WARN\|fail" (local path) | \
  sort | uniq -c | sort -rn | head -20

The results were revealing:

429,352  "ide" MCP server - "Not connected"
  3,981  filesystem MCP server errors
  3,745  obsidian-search MCP server errors
  2,502  postgresql MCP server failures

Step 3: The Phantom IDE Server

The most striking finding: 429,352 failed connection attempts to an "ide" MCP server that wasn't even in my configuration. Claude was trying to connect to a server that didn't exist, on every single operation.

Tracing through the code revealed this was a legacy MCP server that Claude Code still attempted to initialize by default. The fix was simple:

{
  "env": {
    "CLAUDE_CODE_DISABLE_IDE_MCP": "1"
  }
}

One environment variable eliminated 400K+ failed operations.

Step 4: Streaming Fallback Analysis

Pattern analysis revealed another critical issue:

145 errors: "Error streaming, falling back to non-streaming mode: Connection error"
126 errors: "Request timed out"
 83 errors: "403: not authorized to perform bedrock:InvokeModelWithResponseStream"

Every Bedrock request was attempting streaming, failing due to AWS service control policies, then falling back to non-streaming. This doubled the latency of every request.

The policy in question contained an explicit deny for the streaming action:

{
  "Effect": "Deny",
  "Action": "bedrock:InvokeModelWithResponseStream"
}

Fix: Disable streaming for Bedrock mode:

export ANTHROPIC_DISABLE_STREAMING=1

AWS Bedrock Gotcha
Service Control Policies can silently block specific API operations. Always check SCPs when debugging AWS permission issues, not just IAM policies.

Step 5: MCP Server Audit

The debug logs showed 13 configured MCP servers, but several were problematic:

PostgreSQL: 2,502 connection failures

  • Server configured but not running locally
  • Every session attempted connection
  • Fix: Remove from config

Duplicate Obsidian Servers: Two different Obsidian integrations

  • obsidian-search: Custom Python with 32MB embedding model
  • obsidian: Standard mcp-obsidian
  • Both loading on every session
  • Fix: Keep the one being used, remove the other

Resend Email: Redundant with resend skill

  • MCP server provided email capabilities
  • Already handled by a Claude Code skill
  • Fix: Remove MCP server, keep skill

Perplexity: Redundant with researching-with-perplexity skill

  • Same situation as resend
  • Fix: Remove MCP server
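
An audit like this gets easier with a per-server failure tally. The sketch below assumes log lines name the server in double quotes next to the words "MCP server", as in the error summary earlier; the function name and the list of failure keywords are my own assumptions, so adjust both to what your logs actually contain.

```shell
# server_failure_counts DIR NAME...: for each server NAME, count log
# lines under DIR that mention "NAME" (quoted) together with
# "MCP server" and a failure keyword. Prints "count<TAB>name",
# highest count first.
server_failure_counts() {
  dir=$1
  shift
  for name in "$@"; do
    # grep -c prints 0 but exits non-zero when nothing matches,
    # hence the "|| true".
    count=$(grep -rhF "\"$name\"" "$dir" \
      | grep -F 'MCP server' \
      | grep -ciE 'fail|error|timed out|not connected' || true)
    printf '%s\t%s\n' "$count" "$name"
  done | sort -rn
}
```

Matching the name with its surrounding quotes keeps obsidian from also counting lines that belong to obsidian-search. A server that shows zero failures is not necessarily in use; it only means it is not failing.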

Implementation: Systematic Fixes

Configuration Changes

(local path) (settings file):

{
  "env": {
    "CLAUDE_CODE_DISABLE_IDE_MCP": "1"
  },
  "hooks": {
    "PostToolUse": [
      // Removed notification hook for performance
    ]
  }
}

(local path) (MCP configuration):

{
  "mcpServers": {
    // Removed: postgresql, obsidian (duplicate),
    //          resend-email, perplexity
    // Kept: 9 essential servers
  }
}

Shell configuration (Bedrock mode):

claude-mode() {
  if [[ "$1" == "bedrock" ]]; then
    export ANTHROPIC_DISABLE_STREAMING=1  # Fix streaming fallbacks
    # ... other config
  fi
}
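
One thing to watch with a mode switcher like this: an exported variable stays set for the rest of the shell session, so switching away from Bedrock would leave streaming disabled. A minimal sketch of one way to handle that, using a hypothetical function name (claude_mode_sketch) and treating every non-Bedrock mode the same:

```shell
# claude_mode_sketch MODE: set or clear the streaming override.
# Only "bedrock" is taken from the article; any other mode name is
# treated as "not Bedrock" in this sketch.
claude_mode_sketch() {
  case "$1" in
    bedrock)
      export ANTHROPIC_DISABLE_STREAMING=1  # avoid streaming fallbacks
      ;;
    *)
      # Leaving Bedrock mode: drop the override so streaming works again.
      unset ANTHROPIC_DISABLE_STREAMING
      ;;
  esac
}
```

A case statement also sidesteps the differences between [ and [[ across shells.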

The Results

Before:

  • 13 MCP servers (4 constantly failing)
  • 13,582 connection errors/timeouts across sessions
  • 429K IDE connection failures
  • 271 streaming fallback errors
  • Startup time: 3-4 seconds
  • Noticeable response lag

After:

  • 9 functional MCP servers
  • ~70% fewer connection attempts
  • Zero IDE failures
  • Zero streaming fallbacks (Bedrock)
  • Startup time: 1.5-2 seconds
  • 30-50% faster response times

Making This Reproducible

Here's how you can apply this approach to debug your own Claude Code installation:

1. Analyze Your Debug Logs

# Count errors by type
grep -rh "ERROR" (local path) | \
  cut -d' ' -f4- | sort | uniq -c | sort -rn | head -20

# Find connection issues
grep -rh "timeout\|failed\|error" (local path) | \
  grep "MCP server" | cut -d'"' -f2 | sort | uniq -c | sort -rn

# Check for streaming fallbacks
grep -rh "fallback\|streaming" (local path) | \
  grep -i error | wc -l

2. Identify Your MCP Servers

# List configured servers
jq -r '.mcpServers | keys[]' (local path)

# Check which servers are logging tool failures
grep -rh "MCP server" (local path) | \
  grep "Tool.*failed" | cut -d'"' -f2 | sort | uniq -c | sort -rn

3. Test Each Server

# For each server, check if it's responding
# Example for PostgreSQL:
psql -c "SELECT 1" >/dev/null 2>&1 && echo "Running" || echo "Not running"

4. Clean Up Systematically

Start with the highest-impact issues:

  1. Disable phantom servers: Servers that don't exist but are being loaded
  2. Remove failed connections: Servers configured but not running
  3. Eliminate duplicates: Multiple servers providing same functionality
  4. Fix authentication issues: Servers with permission problems

5. Measure Improvements

# Before and after startup timing
time claude --version

# Count remaining errors
grep -rh "ERROR" (local path) | wc -l
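
A total count across the whole directory still includes every error logged before the fixes, so it can only go up. To compare like with like, one option is to drop a marker file when you change the configuration and count only errors in log files modified after it. This is a sketch; the function names and the marker file are hypothetical.

```shell
# mark_baseline MARKER: record "now" as the comparison point.
# Keep MARKER outside the log directory so it is never scanned.
mark_baseline() {
  touch "$1"
}

# errors_since DIR MARKER: count ERROR lines in files under DIR that
# were modified after MARKER was last touched.
errors_since() {
  find "$1" -type f -newer "$2" -exec grep -h 'ERROR' {} + \
    | wc -l | tr -d ' '
}
```

Run mark_baseline right after applying the fixes, use the tool for a few sessions, then call errors_since to see what is still being logged.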

Lessons from Meta-Debugging

Pattern Recognition at Scale

Humans struggle with pattern recognition across thousands of log entries. AI tools excel at this. Using Claude to analyze nearly 1,500 debug files revealed patterns I would have missed.

Configuration Drift is Real

Over time, configurations accumulate cruft. Plugins get installed and forgotten. Services get configured but never cleaned up. Regular audits prevent performance degradation.

The Value of Good Logging

Claude Code's detailed debug logs made this analysis possible. Tools without comprehensive logging are much harder to optimize.

Dependencies Have Dependencies

MCP servers introduce their own dependencies. The obsidian-search server loads a 32MB embedding model on every startup. Understanding the full initialization chain is crucial.

Performance Principle
Every additional integration point is a potential failure point and performance bottleneck. Ruthlessly prune unused integrations.

Beyond Claude Code

This meta-debugging approach applies to any complex tool with good logging:

  • VS Code: Analyze extension activation times and error patterns
  • Docker: Review container logs for common failures
  • Kubernetes: Pattern-match across pod logs to identify cluster issues
  • CI/CD: Analyze build logs to find recurring bottlenecks

The key is having:

  1. Comprehensive logging
  2. A pattern-recognition tool (AI or specialized scripts)
  3. Willingness to act on findings

Related Reading

  • Claude Skills vs MCP Servers: Understanding the difference between skills and MCP servers
  • Claude Code Best Practices: General optimization strategies
  • Making Claude Code More Agentic: Advanced configuration techniques

Takeaways

For immediate results:

  1. Run the debug log analysis commands above
  2. Identify your top 3 error patterns
  3. Fix the highest-frequency issues first
  4. Measure before and after performance

For long-term optimization:

  1. Schedule monthly configuration audits
  2. Remove unused plugins and servers
  3. Monitor debug logs for new patterns
  4. Document your configuration decisions

Meta-lesson: AI tools can effectively debug themselves. The same capabilities that make them useful for development work apply to analyzing their own operation. Don't manually grep through thousands of log files when the AI can do it better.


Performance improvements measured on M4 Max MacBook Pro with 36GB RAM. Your results may vary based on configuration and usage patterns.



About the Author: Justin Johnson builds AI systems and writes about practical AI development.
