Practical Applications10 min readshipped

My AI Research Assistant Works the Night Shift (A Claude Code Skill Story)

My AI Research Assistant Works the Night Shift

Every morning, I wake up to a fresh batch of AI research. Not email newsletters or RSS spam. Actual, filtered, organized research on models, papers, and tools that matter. Zero duplicates. Instantly searchable. Ready to query.

I didn't hire an assistant. I built a Claude Code skill that runs at 1 AM while I sleep.

Here's why that's cooler than it sounds, and how you can build your own.

The Problem: AI Moves Too Fast

The numbers are ridiculous:

  • HuggingFace: 200+ new models daily
  • arXiv: 50+ AI papers daily
  • GitHub: 100+ new ML repos daily

You can't keep up manually. Most solutions fail:

RSS readers: Flooded with noise, no intelligence
Bookmarks: Chaotic, unsearchable mess
Commercial tools: Expensive and inflexible

What I needed was something that could think for itself. Filter signal from noise. Never research the same thing twice. Run autonomously.

That's what Claude Code skills are perfect for.

What Makes This a "Skill" Instead of Just Scripts?

Here's the thing about Claude Code skills that clicked for me: they're not just Python scripts you run. They're self-contained intelligence units that Claude can invoke naturally.

Instead of this:

python (local path) --flags --more-flags --output-format json

You get this:

"Show me high-relevance research from last week"

Claude figures out which script to run, what parameters to use, and presents the results conversationally. The skill becomes part of Claude's capabilities, not something external you have to remember.

The structure is simple but powerful:

ai-landscaping/              # One skill folder
├── SKILL.md                 # What it does, how Claude uses it
├── scripts/                 # The actual automation
│   ├── auto_research.py     # Overnight runner
│   ├── check_duplicate.py   # Prevents re-research
│   └── search_research.py   # Query interface
└── research/                # All your data
    ├── INDEX.md             # Quick daily recap
    ├── ARCHIVE.md           # Duplicate prevention
    └── 2025-11/             # Organized by date

Everything lives in one directory. Symlink it to Claude's skills folder, document it in SKILL.md, and Claude gains new abilities.

The Smart Parts That Make It Work

1. Never Research the Same Thing Twice

The duplicate prevention is my favorite part. Three layers of checking:

Layer 1: ARCHIVE.md
Simple append-only log. Every item gets a line:

[Model:Qwen2-VL-72B] - Researched on 2025-11-06 in 2025-11/2025-11-06/models.md

Fast string search. Human-readable. Never lies.

Layer 2: SQLite Database
Enables complex queries:

cursor.execute("""
    SELECT name FROM research_items
    WHERE type = ? AND name LIKE ?
""", (item_type, f"%{item_name}%"))

Layer 3: Pre-Check Before Research
Before writing anything, check both layers:

def is_duplicate(item_name):
    # Quick text scan
    with open('ARCHIVE.md') as f:
        if item_name.lower() in f.read().lower():
            return True
    
    # Database check for fuzzy matches
    cursor.execute(
        "SELECT COUNT(*) FROM research_items WHERE name LIKE ?",
        (f"%{item_name}%",)
    )
    return cursor.fetchone()[0] > 0

Result after hundreds of research runs: zero duplicates. The redundancy in checking isn't wasteful, it's insurance.

2. Quality Filtering That Actually Works

Not everything trending deserves research time. The skill applies filters before writing:

Research these:

  • Models from established orgs with novel capabilities
  • Papers with >3 citations/day in first week
  • Tools that integrate with existing workflows
  • Actual paradigm shifts

Skip these:

  • Incremental <5% benchmark improvements
  • Me-too models without differentiation
  • Papers without code or reproducibility
  • Hype without substance

These heuristics saved me from researching hundreds of low-signal items. Better to miss something incremental than waste time on noise.

Start Strict, Loosen Later
Aggressive filtering is easier to relax than the opposite. Start with strict quality bars, then adjust monthly based on what you're missing. You'll save hours avoiding garbage.

3. Dual Storage: Best of Both Worlds

Why both Markdown and SQLite? Because they solve different problems.

Markdown = Human Interface

## Qwen2-VL-72B

- **Source**: HuggingFace
- **Type**: multimodal (vision-language)
- **Relevance**: 4/5
- **Tags**: #multimodal #medical-imaging

### Notes
Strong performance on vision-language tasks...

Git-tracked, text-editor friendly, grep-searchable, future-proof.

SQLite = Query Engine

# Show me all 4+ relevance papers from October
query_db.py --type=paper --relevance-score=4 --start-date=2025-10-01

Structured queries, instant results, aggregate statistics.

Total storage for 1000+ entries: Under 15 MB. The flexibility is worth the redundancy.

How It Runs (The Evolution to True Autonomy)

I started with simple Python scripts scheduled via LaunchAgent. That worked, but it was rigid. Every night at 1 AM, auto_research.py would run the same predetermined searches, apply the same filters, write the same format.

Then I switched to the Claude Code CLI approach. Game changer.

The Claude Code Way

Instead of running a script, I schedule Claude Code itself with a research task:

claude-code --message "Execute today's AI landscaping research. Use the ai-landscaping skill to: 1) Check what we've already researched, 2) Find top 5 trending models from last 24h, 3) Find top 3 papers from major labs, 4) Update all indices and database. Be thorough but efficient."

Or with a task file:

claude-code --task (local path)

The LaunchAgent just calls Claude Code now:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.ai-research</string>
    
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/claude-code</string>
        <string>--task</string>
        <string>(local path)
    </array>
    
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>1</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
    
    <key>Nice</key>
    <integer>10</integer>
</dict>
</plist>

Why This Is Better

Script-Only: Rigid logic, breaks on errors, limited to Python Claude Code + Skill: Adaptive intelligence, graceful failures, uses all available tools

Example: Last week HuggingFace API was rate-limiting. A script would fail. Claude Code detected it, throttled requests, used cached data, and completed research anyway.

Intelligence Frameworks, Not Just Code
The skill defines WHAT to do and HOW. Claude Code decides WHEN, WHETHER, and adapts based on real-time conditions. That's the difference between automation and intelligence.

Morning Ritual: Asking Questions

This is where the Claude skill integration shines. I don't run scripts. I ask questions:

"What did we research yesterday?"

Claude checks INDEX.md, shows me the one-line summary:

[2025-11-06] Models: Qwen2-VL-72B, Flux-Kontext | 
Papers: World Models survey, Discrete Diffusion survey | 
Tools: LangGraph
"Show me all multimodal research from the last month"

Claude runs the search script with the right parameters, presents formatted results.

"Find anything about protein folding"

Claude uses ripgrep under the hood, searches across thousands of files in milliseconds, shows context around matches.

No memorizing script paths. No remembering flags. Just natural questions, intelligent answers.

Want to Build Your Own?

Get the Complete Claude Code Prompt →

The prompt includes everything: file structure, Python scripts, database schema, templates, search strategies, and Claude Code CLI automation setup. Copy, paste into Claude Code, done.

Start Manual First
Build the infrastructure with the prompt, but run manually for 1-2 weeks. Discover what quality filters work for your domain before automating.

Real Results After 3 Months

Research Volume:

  • 400+ items researched overnight
  • 15-20 minutes per session
  • Zero duplicates (verified manually)
  • 85% high-signal ratio

Time Savings:

  • Before: ~2 hours/day manual research
  • After: 5 minutes/morning reviewing results
  • Net: 1.5+ hours/day saved

Quality:

  • Average relevance: 3.8/5
  • High-relevance items (4-5): 45%
  • Actionable findings: 5-10 per week

The system tracks 10x more developments than I could manually, with better organization and zero duplicate work.

Why This Matters: Intelligence vs Automation

Here's the key insight: This isn't just automation. It's autonomous intelligence.

The difference:

  • Automation: Runs tasks you'd do manually (scripts, cron jobs)
  • Intelligence: Makes decisions you'd make manually (Claude Code + Skills)

Traditional automation (scripts) follows rigid logic: "Do X, then Y, then Z." Breaks when X fails. Misses opportunities between Y and Z. Requires constant maintenance.

Intelligence infrastructure (Claude Code + Skills) adapts: "Accomplish goal G using strategies S, adjusting based on conditions." Handles failures gracefully. Discovers opportunities. Improves organically.

The skill provides the playbook. Claude Code executes with judgment.

Beyond AI Research

This architecture works for any high-volume, time-sensitive intelligence gathering:

  • Competitive intelligence: Track competitors, products, pricing changes
  • Security monitoring: CVEs, vulnerabilities, patch releases
  • Market research: Industry trends, regulations, opportunities
  • Technical documentation: Framework updates, API changes, best practices

The pattern is always the same:

  1. Skill defines structure and strategies
  2. Claude Code executes with intelligence
  3. System adapts to real conditions
  4. Quality compounds over time

Build the skill once. Let Claude Code run it intelligently. Forever.

Next Steps

Building your own:

  1. Get the prompt: Claude Code Prompt
  2. Start manual, validate patterns
  3. Build duplicate detection early
  4. Test quality filters for your domain
  5. Automate with Claude Code CLI, not just scripts

The skill runs while you sleep. Claude Code executes with judgment. Intelligence compounds daily.


Related Articles

  • Prompt: Build AI Landscaping Skill
  • Syncing Claude Code Configs
  • Claude Code Best Practices

Related Articles


About the Author: Justin Johnson builds AI systems and writes about practical AI development.

justinhjohnson.com | Twitter | LinkedIn | Run Data Run | Subscribe

Follow the lab

Get the next experiment

Enjoyed the breakdown on My AI Research Assistant Works the Night Shift (A Claude Code Skill Story)? New entries land roughly weekly. No digest, no roundup. Just the next build log, when it ships.

Links to this entry