Practical Applications | January 10, 2026 | 15 min read | shipped

Debugging Claude Code with Claude: A Meta-Optimization Journey

Claude Code had been getting slower. Responses were taking longer, some features weren't working reliably, and the startup time had crept from instant to several seconds. I could have dug through configuration files manually, but I had a better idea: what if Claude could debug itself?

This article documents the meta-debugging process of using Claude to analyze its own internal state, identify performance bottlenecks, and implement systematic optimizations. The approach is generalizable to any complex development tool with extensive logging.

The Meta-Debugging Approach

The insight is simple: Claude Code generates detailed debug logs, session transcripts, and project state files. These contain patterns that indicate problems. Claude, as an AI, excels at pattern recognition and analysis at scale. Why not use it to analyze its own operation?

Key Insight

AI tools that generate extensive logs are perfect candidates for self-analysis. The tool itself can identify patterns humans would miss in thousands of lines of debug output.

Investigation Strategy

The plan was straightforward:

Map the landscape: Identify what data Claude Code generates
Quantify the problem: Measure file counts, sizes, and patterns
Pattern analysis: Look for errors, timeouts, and failures
Root cause identification: Trace problems to configuration issues
Systematic fixes: Address issues in order of impact
Validation: Measure improvements

What Claude Code Stores

Claude Code maintains several directories in (local path):

839MB   projects/         # Session state for different directories
602MB   debug/           # Debug logs from every session
236MB   plugins/         # Plugin cache
131MB   transcripts/     # Conversation history
43MB    file-history/    # File edit history
5.9MB   todos/          # Task tracking state

The debug directory alone contained nearly 1,500 log files spanning months of usage. Perfect data for pattern analysis.
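A size listing like the one above can be reproduced with a small helper. This is a minimal sketch: `dir_sizes` is a hypothetical name, and the directory you pass in is whatever location your installation uses for its data.

```shell
# Print the size (in KB) of each immediate subdirectory, largest first.
# Usage: dir_sizes <data-dir>
dir_sizes() {
  local root="$1"
  # du -sk reports kilobytes, which sort -rn orders numerically on GNU and BSD
  du -sk "$root"/*/ 2>/dev/null | sort -rn
}
```

Kilobytes are used instead of human-readable sizes so that the numeric sort is reliable everywhere.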

The Investigation Process

Step 1: Quantify Debug Logs

First question: how bad is the debug log accumulation?

find (local path) -type f -mtime +30 | wc -l
# Result: 916 files older than 30 days

These old debug files serve no practical purpose but add overhead to file system operations. Quick win identified.
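Acting on that quick win might look like the sketch below. `prune_old_logs` is a hypothetical helper, not part of Claude Code; it lists matching files by default and only deletes when asked, so you can review the list first.

```shell
# List (default) or delete files older than N days under a directory.
# Usage: prune_old_logs <dir> [days] [--delete]
prune_old_logs() {
  local dir="$1" days="${2:-30}" mode="${3:-}"
  if [ "$mode" = "--delete" ]; then
    # -delete removes each matching file; run without --delete first to preview
    find "$dir" -type f -mtime +"$days" -delete
  else
    find "$dir" -type f -mtime +"$days" -print
  fi
}
```

Running it once without `--delete` gives the same list that the `find ... | wc -l` command counted.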

Step 2: Error Pattern Analysis

Next, I asked Claude to analyze error patterns across all debug logs:

grep -h "ERROR\|WARN\|fail" (local path) | \
  sort | uniq -c | sort -rn | head -20

The results were revealing:

429,352  "ide" MCP server - "Not connected"
  3,981  filesystem MCP server errors
  3,745  obsidian-search MCP server errors
  2,502  postgresql MCP server failures
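Counts like these only collapse cleanly if per-line noise such as timestamps is stripped before `uniq -c`. The sketch below assumes each log line begins with an ISO-8601 timestamp; that is an assumption about the log format, so adjust the `sed` expression to match your files. `top_errors` is a hypothetical name.

```shell
# Group error-like lines by message after removing a leading timestamp.
# Usage: top_errors <log-dir> [limit]
top_errors() {
  local dir="$1" limit="${2:-20}"
  grep -rhE 'ERROR|WARN|fail' "$dir" \
    | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:.]+Z? *//' \
    | sort | uniq -c | sort -rn | head -n "$limit"
}
```

Without the normalization step, two identical errors logged a second apart would be counted as two different messages.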

Step 3: The Phantom IDE Server

The most striking finding: 429,352 failed connection attempts to an "ide" MCP server that wasn't even in my configuration. Claude was trying to connect to a server that didn't exist, on every single operation.

Tracing through the code revealed this was a legacy MCP server that Claude Code still attempted to initialize by default. The fix was simple:

{
  "env": {
    "CLAUDE_CODE_DISABLE_IDE_MCP": "1"
  }
}

One environment variable eliminated 400K+ failed operations.
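To confirm that a fix like this took effect, it helps to count matches only in logs written after the change, since older logs still contain the failures. A minimal sketch, with `count_recent` as a hypothetical helper:

```shell
# Count lines containing a fixed string in files modified within the last N days.
# Usage: count_recent <log-dir> <text> [days]
count_recent() {
  local dir="$1" pattern="$2" days="${3:-1}"
  # -mtime -N selects files modified less than N days ago
  find "$dir" -type f -mtime -"$days" -exec grep -hF -- "$pattern" {} + \
    | wc -l | tr -d ' '
}
```

For example, `count_recent <debug-dir> "Not connected" 1` should drop toward zero once the failing connection attempts stop.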

Step 4: Streaming Fallback Analysis

Pattern analysis revealed another critical issue:

145 errors: "Error streaming, falling back to non-streaming mode: Connection error"
126 errors: "Request timed out"
 83 errors: "403: not authorized to perform bedrock:InvokeModelWithResponseStream"

Every Bedrock request first attempted streaming, was rejected by an AWS service control policy, and then fell back to non-streaming mode. That failed first attempt roughly doubled the latency of every request.

The policy contained an explicit deny for the streaming action:

{
  "Effect": "Deny",
  "Action": "bedrock:InvokeModelWithResponseStream"
}

Fix: Disable streaming for Bedrock mode:

export ANTHROPIC_DISABLE_STREAMING=1

AWS Bedrock Gotcha

Service Control Policies can silently block specific API operations. Always check SCPs when debugging AWS permission issues, not just IAM policies.
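If you have a policy document saved as JSON, a quick check for explicit denies can be sketched with `jq`. The function name `denied_statements` and the file argument are hypothetical; how you export the policy is outside this sketch, and wildcard actions such as `bedrock:*` are not expanded here.

```shell
# List the Sid of every Deny statement that names an action exactly.
# Usage: denied_statements <policy.json> <action>
denied_statements() {
  jq -r --arg action "$2" '
    [.Statement] | flatten | .[]            # accept a single statement or a list
    | select(.Effect == "Deny")
    | select([.Action] | flatten | any(. == $action))  # Action may be a string or a list
    | .Sid // "(no Sid)"
  ' "$1"
}
```

An empty result only means no exact match was found in that one document; a deny can still come from a wildcard or from another policy in the chain.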

Step 5: MCP Server Audit

The debug logs showed 13 configured MCP servers, but several were problematic:

PostgreSQL: 2,502 connection failures

Server configured but not running locally
Every session attempted connection
Fix: Remove from config

Duplicate Obsidian Servers: Two different Obsidian integrations

obsidian-search: Custom Python with 32MB embedding model
obsidian: Standard mcp-obsidian
Both loading on every session
Fix: Keep the one being used, remove the other

Resend Email: Redundant with resend skill

MCP server provided email capabilities
Already handled by a Claude Code skill
Fix: Remove MCP server, keep skill

Perplexity: Redundant with researching-with-perplexity skill

Same situation as resend
Fix: Remove MCP server

Implementation: Systematic Fixes

Configuration Changes

(local path) (settings file):

{
  "env": {
    "CLAUDE_CODE_DISABLE_IDE_MCP": "1"
  },
  "hooks": {
    "PostToolUse": [
      // Removed notification hook for performance
    ]
  }
}

(local path) (MCP configuration):

{
  "mcpServers": {
    // Removed: postgresql, obsidian (duplicate),
    //          resend-email, perplexity
    // Kept: 9 essential servers
  }
}

Shell configuration (Bedrock mode):

claude-mode() {
  # Compare the first argument with a real test expression
  if [[ "$1" == "bedrock" ]]; then
    export ANTHROPIC_DISABLE_STREAMING=1  # Fix streaming fallbacks
    # ... other config
  fi
}

The Results

Before:

13 MCP servers (4 constantly failing)
13,582 connection errors/timeouts across sessions
429K IDE connection failures
271 streaming fallback errors
Startup time: 3-4 seconds
Noticeable response lag

After:

9 functional MCP servers
~70% fewer connection attempts
Zero IDE failures
Zero streaming fallbacks (Bedrock)
Startup time: 1.5-2 seconds
30-50% faster response times

Making This Reproducible

Here's how you can apply this approach to debug your own Claude Code installation:

1. Analyze Your Debug Logs

# Count errors by type
grep -rh "ERROR" (local path) | \
  cut -d' ' -f4- | sort | uniq -c | sort -rn | head -20

# Find connection issues
grep -rh "timeout\|failed\|error" (local path) | \
  grep "MCP server" | cut -d'"' -f2 | sort | uniq -c | sort -rn

# Check for streaming fallbacks
grep -rh "fallback\|streaming" (local path) | \
  grep -i error | wc -l

2. Identify Your MCP Servers

# List configured servers
jq -r '.mcpServers | keys[]' (local path)

# Check which servers are logging tool failures
grep -rh "MCP server" (local path) | \
  grep "Tool.*failed" | cut -d'"' -f2 | sort | uniq -c | sort -rn

3. Test Each Server

# For each server, check if it's responding
# Example for PostgreSQL:
psql -c "SELECT 1" >/dev/null 2>&1 && echo "Running" || echo "Not running"
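For servers launched as local processes, a cheap first test is whether the launch command exists at all. The sketch below reads command names from standard input; `check_commands` is a hypothetical helper, and feeding it from `.mcpServers[].command` assumes your MCP configuration stores a `command` field per server.

```shell
# Report whether each command name read from stdin is available on PATH.
# Usage: printf 'npx\npython3\n' | check_commands
check_commands() {
  local cmd
  while IFS= read -r cmd; do
    [ -n "$cmd" ] || continue          # skip blank lines
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\tfound\n' "$cmd"
    else
      printf '%s\tmissing\n' "$cmd"
    fi
  done
}
```

A "found" result only shows the binary is installed; the service behind it still needs its own check, as with the `psql` example.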

4. Clean Up Systematically

Start with the highest-impact issues:

Disable phantom servers: Servers that don't exist but are being loaded
Remove failed connections: Servers configured but not running
Eliminate duplicates: Multiple servers providing same functionality
Fix authentication issues: Servers with permission problems

5. Measure Improvements

# Before and after startup timing
time claude --version

# Count remaining errors
grep -rh "ERROR" (local path) | wc -l
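To turn raw counts into a before/after comparison, the arithmetic can be wrapped in two small helpers. Both names are hypothetical, and the sketch assumes you compare counts taken over comparable sets of logs, since old logs keep their old errors.

```shell
# Count ERROR lines under a log directory.
# Usage: count_errors <log-dir>
count_errors() {
  grep -rh "ERROR" "$1" | wc -l | tr -d ' '
}

# Percentage reduction from a "before" value to an "after" value.
# Usage: percent_reduction <before> <after>
percent_reduction() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (before + 0 == 0) { print "n/a"; exit }   # avoid dividing by zero
    printf "%.1f\n", (before - after) / before * 100
  }'
}
```

For instance, `percent_reduction 200 50` prints `75.0`.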

Lessons from Meta-Debugging

Pattern Recognition at Scale

Humans struggle with pattern recognition across thousands of log entries. AI tools excel at this. Using Claude to analyze nearly 1,500 debug files revealed patterns I would have missed.

Configuration Drift is Real

Over time, configurations accumulate cruft. Plugins get installed and forgotten. Services get configured but never cleaned up. Regular audits prevent performance degradation.

The Value of Good Logging

Claude Code's detailed debug logs made this analysis possible. Tools without comprehensive logging are much harder to optimize.

Dependencies Have Dependencies

MCP servers introduce their own dependencies. The obsidian-search server loads a 32MB embedding model on every startup. Understanding the full initialization chain is crucial.

Performance Principle

Every additional integration point is a potential failure point and performance bottleneck. Ruthlessly prune unused integrations.

Beyond Claude Code

This meta-debugging approach applies to any complex tool with good logging:

VS Code: Analyze extension activation times and error patterns
Docker: Review container logs for common failures
Kubernetes: Pattern-match across pod logs to identify cluster issues
CI/CD: Analyze build logs to find recurring bottlenecks

The key is having:

Comprehensive logging
A pattern-recognition tool (AI or specialized scripts)
Willingness to act on findings

Takeaways

For immediate results:

Run the debug log analysis commands above
Identify your top 3 error patterns
Fix the highest-frequency issues first
Measure before and after performance

For long-term optimization:

Schedule monthly configuration audits
Remove unused plugins and servers
Monitor debug logs for new patterns
Document your configuration decisions

Meta-lesson: AI tools can effectively debug themselves. The same capabilities that make them useful for development work apply to analyzing their own operation. Don't manually grep through thousands of log files when the AI can do it better.

Performance improvements measured on M4 Max MacBook Pro with 36GB RAM. Your results may vary based on configuration and usage patterns.

About the Author: Justin Johnson builds AI systems and writes about practical AI development.

