My AI Research Assistant Works the Night Shift (A Claude Code Skill Story)
My AI Research Assistant Works the Night Shift
Every morning, I wake up to a fresh batch of AI research. Not email newsletters or RSS spam. Actual, filtered, organized research on models, papers, and tools that matter. Zero duplicates. Instantly searchable. Ready to query.
I didn't hire an assistant. I built a Claude Code skill that runs at 1 AM while I sleep.
Here's why that's cooler than it sounds, and how you can build your own.
The Problem: AI Moves Too Fast
The numbers are ridiculous:
- HuggingFace: 200+ new models daily
- arXiv: 50+ AI papers daily
- GitHub: 100+ new ML repos daily
You can't keep up manually. Most solutions fail:
RSS readers: Flooded with noise, no intelligence
Bookmarks: Chaotic, unsearchable mess
Commercial tools: Expensive and inflexible
What I needed was something that could think for itself. Filter signal from noise. Never research the same thing twice. Run autonomously.
That's what Claude Code skills are perfect for.
What Makes This a "Skill" Instead of Just Scripts?
Here's the thing about Claude Code skills that clicked for me: they're not just Python scripts you run. They're self-contained intelligence units that Claude can invoke naturally.
Instead of this:
python (local path) --flags --more-flags --output-format json
You get this:
"Show me high-relevance research from last week"
Claude figures out which script to run, what parameters to use, and presents the results conversationally. The skill becomes part of Claude's capabilities, not something external you have to remember.
The structure is simple but powerful:
ai-landscaping/ # One skill folder
├── SKILL.md # What it does, how Claude uses it
├── scripts/ # The actual automation
│ ├── auto_research.py # Overnight runner
│ ├── check_duplicate.py # Prevents re-research
│ └── search_research.py # Query interface
└── research/ # All your data
├── INDEX.md # Quick daily recap
├── ARCHIVE.md # Duplicate prevention
└── 2025-11/ # Organized by date
Everything lives in one directory. Symlink it to Claude's skills folder, document it in SKILL.md, and Claude gains new abilities.
The Smart Parts That Make It Work
1. Never Research the Same Thing Twice
The duplicate prevention is my favorite part. Three layers of checking:
Layer 1: ARCHIVE.md
Simple append-only log. Every item gets a line:
[Model:Qwen2-VL-72B] - Researched on 2025-11-06 in 2025-11/2025-11-06/models.md
Fast string search. Human-readable. Never lies.
Layer 2: SQLite Database
Enables complex queries:
cursor.execute("""
SELECT name FROM research_items
WHERE type = ? AND name LIKE ?
""", (item_type, f"%{item_name}%"))
Layer 3: Pre-Check Before Research
Before writing anything, check both layers:
def is_duplicate(item_name):
# Quick text scan
with open('ARCHIVE.md') as f:
if item_name.lower() in f.read().lower():
return True
# Database check for fuzzy matches
cursor.execute(
"SELECT COUNT(*) FROM research_items WHERE name LIKE ?",
(f"%{item_name}%",)
)
return cursor.fetchone()[0] > 0
Result after hundreds of research runs: zero duplicates. The redundancy in checking isn't wasteful, it's insurance.
2. Quality Filtering That Actually Works
Not everything trending deserves research time. The skill applies filters before writing:
Research these:
- Models from established orgs with novel capabilities
- Papers with >3 citations/day in first week
- Tools that integrate with existing workflows
- Actual paradigm shifts
Skip these:
- Incremental <5% benchmark improvements
- Me-too models without differentiation
- Papers without code or reproducibility
- Hype without substance
These heuristics saved me from researching hundreds of low-signal items. Better to miss something incremental than waste time on noise.
3. Dual Storage: Best of Both Worlds
Why both Markdown and SQLite? Because they solve different problems.
Markdown = Human Interface
## Qwen2-VL-72B
- **Source**: HuggingFace
- **Type**: multimodal (vision-language)
- **Relevance**: 4/5
- **Tags**: #multimodal #medical-imaging
### Notes
Strong performance on vision-language tasks...
Git-tracked, text-editor friendly, grep-searchable, future-proof.
SQLite = Query Engine
# Show me all 4+ relevance papers from October
query_db.py --type=paper --relevance-score=4 --start-date=2025-10-01
Structured queries, instant results, aggregate statistics.
Total storage for 1000+ entries: Under 15 MB. The flexibility is worth the redundancy.
How It Runs (The Evolution to True Autonomy)
I started with simple Python scripts scheduled via LaunchAgent. That worked, but it was rigid. Every night at 1 AM, auto_research.py would run the same predetermined searches, apply the same filters, write the same format.
Then I switched to the Claude Code CLI approach. Game changer.
The Claude Code Way
Instead of running a script, I schedule Claude Code itself with a research task:
claude-code --message "Execute today's AI landscaping research. Use the ai-landscaping skill to: 1) Check what we've already researched, 2) Find top 5 trending models from last 24h, 3) Find top 3 papers from major labs, 4) Update all indices and database. Be thorough but efficient."
Or with a task file:
claude-code --task (local path)
The LaunchAgent just calls Claude Code now:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.example.ai-research</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/claude-code</string>
<string>--task</string>
<string>(local path)
</array>
<key>StartCalendarInterval</key>
<dict>
<key>Hour</key>
<integer>1</integer>
<key>Minute</key>
<integer>0</integer>
</dict>
<key>Nice</key>
<integer>10</integer>
</dict>
</plist>
Why This Is Better
Script-Only: Rigid logic, breaks on errors, limited to Python Claude Code + Skill: Adaptive intelligence, graceful failures, uses all available tools
Example: Last week HuggingFace API was rate-limiting. A script would fail. Claude Code detected it, throttled requests, used cached data, and completed research anyway.
Morning Ritual: Asking Questions
This is where the Claude skill integration shines. I don't run scripts. I ask questions:
"What did we research yesterday?"
Claude checks INDEX.md, shows me the one-line summary:
[2025-11-06] Models: Qwen2-VL-72B, Flux-Kontext |
Papers: World Models survey, Discrete Diffusion survey |
Tools: LangGraph
"Show me all multimodal research from the last month"
Claude runs the search script with the right parameters, presents formatted results.
"Find anything about protein folding"
Claude uses ripgrep under the hood, searches across thousands of files in milliseconds, shows context around matches.
No memorizing script paths. No remembering flags. Just natural questions, intelligent answers.
Want to Build Your Own?
Get the Complete Claude Code Prompt →shippedPractical ApplicationsNov 6, 2025Prompt for Claude Code: Build AI Landscaping SkillCopy-paste prompt for Claude Code to build a complete AI research intelligence skill with duplicate prevention, structured storage, and instant retrieval.
The prompt includes everything: file structure, Python scripts, database schema, templates, search strategies, and Claude Code CLI automation setup. Copy, paste into Claude Code, done.
Real Results After 3 Months
Research Volume:
- 400+ items researched overnight
- 15-20 minutes per session
- Zero duplicates (verified manually)
- 85% high-signal ratio
Time Savings:
- Before: ~2 hours/day manual research
- After: 5 minutes/morning reviewing results
- Net: 1.5+ hours/day saved
Quality:
- Average relevance: 3.8/5
- High-relevance items (4-5): 45%
- Actionable findings: 5-10 per week
The system tracks 10x more developments than I could manually, with better organization and zero duplicate work.
Why This Matters: Intelligence vs Automation
Here's the key insight: This isn't just automation. It's autonomous intelligence.
The difference:
- Automation: Runs tasks you'd do manually (scripts, cron jobs)
- Intelligence: Makes decisions you'd make manually (Claude Code + Skills)
Traditional automation (scripts) follows rigid logic: "Do X, then Y, then Z." Breaks when X fails. Misses opportunities between Y and Z. Requires constant maintenance.
Intelligence infrastructure (Claude Code + Skills) adapts: "Accomplish goal G using strategies S, adjusting based on conditions." Handles failures gracefully. Discovers opportunities. Improves organically.
The skill provides the playbook. Claude Code executes with judgment.
Beyond AI Research
This architecture works for any high-volume, time-sensitive intelligence gathering:
- Competitive intelligence: Track competitors, products, pricing changes
- Security monitoring: CVEs, vulnerabilities, patch releases
- Market research: Industry trends, regulations, opportunities
- Technical documentation: Framework updates, API changes, best practices
The pattern is always the same:
- Skill defines structure and strategies
- Claude Code executes with intelligence
- System adapts to real conditions
- Quality compounds over time
Build the skill once. Let Claude Code run it intelligently. Forever.
Next Steps
Building your own:
- Get the prompt: Claude Code PromptshippedPractical ApplicationsNov 6, 2025Prompt for Claude Code: Build AI Landscaping SkillCopy-paste prompt for Claude Code to build a complete AI research intelligence skill with duplicate prevention, structured storage, and instant retrieval.
- Start manual, validate patterns
- Build duplicate detection early
- Test quality filters for your domain
- Automate with Claude Code CLI, not just scripts
The skill runs while you sleep. Claude Code executes with judgment. Intelligence compounds daily.
Related Articles
- Prompt: Build AI Landscaping SkillshippedPractical ApplicationsNov 6, 2025Prompt for Claude Code: Build AI Landscaping SkillCopy-paste prompt for Claude Code to build a complete AI research intelligence skill with duplicate prevention, structured storage, and instant retrieval.
- Syncing Claude Code ConfigsshippedPractical ApplicationsOct 20, 2025Syncing Claude Code Configurations Across Multiple Machines: A Practical GuideLearn how to intelligently sync Claude Code configurations across Mac, Pi, and DGX boxes while preserving machine-specific settings like model endpoints and API keys
- Claude Code Best PracticesshippedAI Development & AgentsApr 20, 2025Claude Code Best Practices: Setup, Commands, and the Defaults Worth ChangingThe Claude Code setup, skills, subagents, and hooks I run in production, plus the defaults worth changing first. Updated for the 2026 feature set.
Related Articles
- Prompt for Claude Code: Build AI Landscaping SkillshippedPractical ApplicationsNov 6, 2025Prompt for Claude Code: Build AI Landscaping SkillCopy-paste prompt for Claude Code to build a complete AI research intelligence skill with duplicate prevention, structured storage, and instant retrieval.
- Claude Skills vs MCP Servers: Why Context Efficiency MattersshippedPractical ApplicationsNov 6, 2025Claude Skills vs MCP Servers: Why Context Efficiency MattersLearn how Claude Code skills provide a lightweight alternative to MCP servers with 30% better context efficiency while maintaining flexibility
- Unlock AI's Full Potential: Why Prompt Engineering is Your Business SuperpowershippedPractical ApplicationsApr 11, 2025Unlock AI's Full Potential: Why Prompt Engineering is Your Business SuperpowerMaster prompt engineering techniques to unlock better, more reliable AI outputs that drive real business results.
About the Author: Justin Johnson builds AI systems and writes about practical AI development.
justinhjohnson.com | Twitter | LinkedIn | Run Data Run | Subscribe
Follow the lab
Get the next experiment
Enjoyed the breakdown on My AI Research Assistant Works the Night Shift (A Claude Code Skill Story)? New entries land roughly weekly. No digest, no roundup. Just the next build log, when it ships.
Related experiments
Apparatus
1,174 words · 10 min read
- claude-code
- automation
- skills
- productivity
Links to this entry
- Auto-Sync Twitter Bookmarks to Obsidian with Bird CLI
- AutoResearch on Blackwell GB10: 151 Experiments Overnight
- Claude Skills vs MCP Servers: Why Context Efficiency Matters
- Prompt for Claude Code: Build AI Landscaping Skill
- The $221 Bill: Finding and Fixing Million-Token Meeting Notes
- TurboQuant from Paper to Production: 1,000 Experiments, Four Model Sizes, One Uncomfortable Finding
- Unlock AI's Full Potential: Why Prompt Engineering is Your Business Superpower
- When ARIA Crashed the DGX: Building GPU Monitoring in 5 Minutes
- When LaunchAgents Attack: A $100 API Crash Loop Story