Prompt for Claude Code: Build AI Landscaping Skill

Copy the prompt below and paste it into Claude Code to build your own AI research intelligence skill.


Objective

Create a scalable AI landscaping skill for ongoing research and intelligence gathering. The skill must handle thousands of documents over time, prevent duplicate research, and enable efficient retrieval through indexing and full-text search.

Core Requirements

1. File Structure

Create this exact structure:

ai-landscaping/
├── SKILL.md
├── research/
│   ├── INDEX.md                          # Master index with one-line summaries
│   ├── ARCHIVE.md                        # Track what we've already researched
│   ├── metadata.db                       # SQLite for structured queries
│   └── YYYY-MM/
│       ├── YYYY-MM-DD/
│       │   ├── models.md                 # Daily model findings
│       │   ├── papers.md                 # Daily paper findings
│       │   ├── tools.md                  # Daily tools/platforms
│       │   ├── comparisons.md            # Daily comparative analyses
│       │   └── meta.json                 # Structured metadata for the day
│       └── monthly-summary.md            # Month synthesis
├── scripts/
│   ├── init_research_day.py             # Create today's research structure
│   ├── search_research.py               # Ripgrep wrapper with filters
│   ├── update_index.py                  # Append to INDEX.md
│   ├── check_duplicate.py               # Check if already researched
│   ├── query_db.py                      # SQLite queries
│   └── generate_monthly_summary.py      # Synthesize month's research
└── references/
    ├── search-strategies.md             # Daily search patterns
    └── taxonomy.md                      # Classification system

2. SKILL.md Contents

Create a SKILL.md file with:

Frontmatter:

---
name: ai-landscaping
description: AI research intelligence gathering and retrieval system. Use when the user wants to research AI models, papers, tools, or platforms; store findings persistently; search through past research; or get daily AI landscape updates. Prevents duplicate research and enables efficient retrieval across thousands of documents.
---

Body sections:

  • Overview: Purpose and capabilities

  • Daily Research Workflow:

    • Check ARCHIVE.md to avoid duplicates
    • Execute high-signal searches (see strategies below)
    • Store findings in today's directory
    • Update INDEX.md with one-line summaries
    • Update metadata.db
    • Mark researched items in ARCHIVE.md
  • Search & Retrieval Workflow:

    • Quick scan: Check INDEX.md
    • Recent check: View last 7 days of research
    • Full search: Use search_research.py with ripgrep
    • Structured query: Use query_db.py for metadata
  • Anti-Duplication Strategy: Always check ARCHIVE.md before research

  • File Formats: Standardized markdown templates

  • Progressive Disclosure: Reference scripts/ and references/ files

3. Search Strategies (references/search-strategies.md)

Create intelligent daily searches that maximize signal without repetition:

## High-Signal Daily Searches

### Models Research
**Morning Scan (10-15 min research time):**
1. Trending models (exclude already archived):
   - Hugging Face: Top 10 trending models from last 24h
   - Filter: trendingScore, downloads spike >50% week-over-week
   - Check: `rg "model_id" research/ARCHIVE.md` before adding

2. New releases in key categories:
   - Vision: image-to-video, image-generation (check weekly)
   - LLM: text-generation >7B params (check when new)
   - Multimodal: Recent VLM/LLM combinations
   - Query: `created_at > [yesterday] AND (task:image-generation OR task:text-generation)`

3. Notable updates to tracked models:
   - Check models in ARCHIVE.md that have `watch: true` flag
   - Look for new versions, significant download spikes

### Papers Research
**Daily Academic Intelligence:**
1. High-impact papers (arXiv/HF):
   - Papers with >50 citations in first week
   - Papers from top labs (OpenAI, Anthropic, DeepMind, Meta, etc.)
   - Query: `author:openai OR author:anthropic` + last 48h

2. Emerging concepts:
   - Track frequency of terms: "discrete diffusion", "world models", "distillation"
   - Monthly: `rg -c "world models" research/2025-11/` to see trend
   - Only research papers on NEW concepts not in ARCHIVE.md

3. Implementation-ready papers:
   - Papers with associated HF models/code
   - Reproducibility score >3/5

### Tools & Platforms
**Weekly Deep Dive (pick 1-2 areas/week):**
1. Infrastructure: New ML frameworks, deployment tools
2. Evaluation: Benchmarks, leaderboards, quality metrics
3. Productivity: IDE integrations, code assistants
4. Governance: Safety tools, alignment research

**Daily Quick Check:**
- HF Spaces with MCP support (new integrations)
- GitHub trending in "machine-learning" (stars >500/day)

### Comparisons
**When to create comparative analyses:**
1. Multiple models solving same task released within 7 days
2. Significant paradigm shifts (e.g., new architecture outperforms)
3. User requests specific comparison
4. Monthly meta-analysis of category

## Anti-Duplication Filters

Before ANY search:
```bash
# Check if already researched
python scripts/check_duplicate.py "model:Flux-1"
python scripts/check_duplicate.py "paper:Attention Is All You Need"
```

Search syntax for avoiding duplicates:

# Exclude models already in archive
rg "model_id" research/ARCHIVE.md --files-without-match

# Find gaps in coverage
# (models with >10k downloads but not in our research)

Daily Routine Template

Monday-Friday (15-20 min):

  1. Trending models (top 5 new)
  2. Key papers (top 3 from top labs)
  3. One tool/platform deep dive (rotate)

Weekend:

  1. Generate weekly summary
  2. Create comparisons for related findings
  3. Update taxonomy.md with new categories

Signal Quality Heuristics

High Signal:

  • New model from established org with novel capability
  • Paper with >3 citations/day in first week
  • Tool that integrates with existing workflow
  • Direct applicability to your domain

Low Signal (skip):

  • Incremental improvements (<5% on benchmarks)
  • Me-too models without differentiation
  • Papers without code/reproducibility
  • Tools that duplicate existing capabilities

Search Query Examples

Hugging Face

# Models
model_search(query="", sort="trendingScore", limit=10)
model_search(query="image-generation", sort="createdAt", limit=5)
model_search(author="meta-llama", sort="downloads")

# Papers  
paper_search(query="multimodal distillation", results_limit=5)
paper_search(query="world models", results_limit=3)

# Avoid: Broad queries that return 1000s of results
# DON'T: paper_search(query="machine learning")
# DO: paper_search(query="protein folding transformers")

Web Search (targeted)

# New releases
"AI model released" + site:huggingface.co + after:2025-11-05

# Benchmarks
"MMLU benchmark" + "2025" + "state of the art"

# Industry applications
"your-domain AI" + "specific-application" + after:2025-11-01

4. Database Schema (metadata.db)

Create SQLite database with this schema:

```sql
CREATE TABLE research_items (
    id INTEGER PRIMARY KEY,
    date TEXT NOT NULL,
    type TEXT NOT NULL,  -- 'model', 'paper', 'tool', 'comparison'
    name TEXT NOT NULL,
    source TEXT,         -- 'huggingface', 'arxiv', 'github', 'web'
    url TEXT,
    summary TEXT,
    tags TEXT,           -- JSON array
    relevance_score INTEGER,  -- 1-5
    watch BOOLEAN DEFAULT 0,
    file_path TEXT
);

CREATE INDEX idx_date ON research_items(date);
CREATE INDEX idx_type ON research_items(type);
CREATE INDEX idx_name ON research_items(name);
CREATE INDEX idx_watch ON research_items(watch);

CREATE VIRTUAL TABLE research_fts USING fts5(
    name, summary, tags, content='research_items'
);
```
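
One detail of the schema above is easy to miss: because research_fts is declared with content='research_items', FTS5 treats it as an external-content table and does not fill the index on its own. The sketch below shows one way to keep the two in sync with an insert trigger and then run a full-text query. It assumes Python's sqlite3 module was built with FTS5; the trigger name research_items_ai and the helper functions are hypothetical, the table is trimmed to the columns the index needs, and content_rowid='id' is added explicitly.

```python
import sqlite3

SCHEMA = """
CREATE TABLE research_items (
    id INTEGER PRIMARY KEY,
    date TEXT NOT NULL,
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    summary TEXT,
    tags TEXT
);
CREATE VIRTUAL TABLE research_fts USING fts5(
    name, summary, tags, content='research_items', content_rowid='id'
);
-- Hypothetical trigger: mirror each new row into the FTS index.
CREATE TRIGGER research_items_ai AFTER INSERT ON research_items BEGIN
    INSERT INTO research_fts(rowid, name, summary, tags)
    VALUES (new.id, new.name, new.summary, new.tags);
END;
"""


def open_db(path=":memory:"):
    """Open a database and create the trimmed schema plus the sync trigger."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_item(conn, date, type_, name, summary, tags):
    """Insert one research item; the trigger updates research_fts."""
    conn.execute(
        "INSERT INTO research_items (date, type, name, summary, tags) "
        "VALUES (?, ?, ?, ?, ?)",
        (date, type_, name, summary, tags),
    )
    conn.commit()


def full_text_search(conn, query):
    """Return the names of items matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT name FROM research_fts WHERE research_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]
```

Updates and deletes would need matching triggers as well; they are left out here to keep the sketch short.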

5. Key Scripts

init_research_day.py:

  • Create today's directory structure
  • Initialize empty markdown files with templates
  • Create meta.json with date metadata
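
A minimal sketch of what init_research_day.py could look like, assuming the directory layout from the file structure above. The header for comparisons.md and the fields written to meta.json are assumptions, since the prompt does not define them.

```python
import json
from datetime import date
from pathlib import Path

# First line of each daily file; the comparisons header is an assumption.
DAILY_FILES = {
    "models.md": "# Models Research - {day}\n",
    "papers.md": "# Papers Research - {day}\n",
    "tools.md": "# Tools Research - {day}\n",
    "comparisons.md": "# Comparisons - {day}\n",
}


def init_research_day(root, day=None):
    """Create root/YYYY-MM/YYYY-MM-DD with template files and meta.json."""
    day = day or date.today()
    day_dir = Path(root) / day.strftime("%Y-%m") / day.isoformat()
    day_dir.mkdir(parents=True, exist_ok=True)

    for filename, header in DAILY_FILES.items():
        path = day_dir / filename
        if not path.exists():  # never overwrite notes that already exist
            path.write_text(header.format(day=day.isoformat()), encoding="utf-8")

    meta_path = day_dir / "meta.json"
    if not meta_path.exists():
        # Hypothetical metadata shape: the date plus an empty item list.
        meta = {"date": day.isoformat(), "items": []}
        meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")

    return day_dir
```

Running it twice on the same day is safe, because existing files are left untouched.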

check_duplicate.py:

# Usage: check_duplicate.py "model:Qwen-Image"
# Returns: Found in research/2025-10/2025-10-15/models.md
#          OR: Not found - safe to research
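
One possible shape for the duplicate check, as a sketch only. The prompt does not define the ARCHIVE.md line format, so this version assumes it is enough to scan the daily files for a `## <name>` heading, which is how the markdown templates introduce each item. The TYPE_TO_FILE mapping and the report helper are hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from the key prefix to the daily file that holds it.
TYPE_TO_FILE = {"model": "models.md", "paper": "papers.md", "tool": "tools.md"}


def check_duplicate(key, root="research"):
    """Return the first daily file with a '## <name>' heading, else None."""
    item_type, _, name = key.partition(":")
    filename = TYPE_TO_FILE.get(item_type.strip().lower())
    if filename is None or not name.strip():
        raise ValueError(f"expected a 'model:', 'paper:' or 'tool:' prefix, got {key!r}")

    wanted = name.strip().casefold()
    # Daily files live at root/YYYY-MM/YYYY-MM-DD/<filename>.
    for path in sorted(Path(root).glob(f"*/*/{filename}")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith("## ") and line[3:].strip().casefold() == wanted:
                return path
    return None


def report(key, root="research"):
    """Format the result the way the usage comment describes."""
    found = check_duplicate(key, root)
    return f"Found in {found.as_posix()}" if found else "Not found - safe to research"
```

Matching on headings is case-insensitive but otherwise exact, so "Qwen-Image" and "Qwen Image" count as different items; a real implementation may want to normalize names further.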

search_research.py:

# Ripgrep wrapper with filters
# Usage: search_research.py "diffusion models" --type=papers --last-days=30
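
A sketch of how the wrapper might translate its two filters into a ripgrep call, assuming rg is on the PATH. The function names are hypothetical; the idea is that --type selects the daily file by name and --last-days limits the search to the dated directories that actually exist.

```python
import subprocess
from datetime import date, timedelta
from pathlib import Path


def build_rg_command(query, root="research", type_=None, last_days=None, today=None):
    """Build the rg argument list, or return None if no directory is in range."""
    cmd = ["rg", "--ignore-case", "--line-number", "--fixed-strings"]
    # --type=papers maps to the daily papers.md files; otherwise all markdown.
    cmd += ["--glob", f"{type_}.md" if type_ else "*.md"]
    cmd += ["--", query]  # "--" keeps a query starting with "-" from being read as a flag

    if last_days is None:
        return cmd + [str(root)]

    today = today or date.today()
    dirs = []
    for offset in range(last_days):
        day = today - timedelta(days=offset)
        day_dir = Path(root) / day.strftime("%Y-%m") / day.isoformat()
        if day_dir.is_dir():
            dirs.append(str(day_dir))
    # With no paths rg would search the current directory, so bail out instead.
    return cmd + dirs if dirs else None


def search(query, **filters):
    """Run ripgrep and return its output; empty string when nothing matches."""
    cmd = build_rg_command(query, **filters)
    if cmd is None:
        return ""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout
```

For the usage line above, the call would be `search("diffusion models", type_="papers", last_days=30)`.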

update_index.py:

# Append to INDEX.md with format:
# [2025-11-04] Models: Qwen-Image (image gen), Flux-Kontext (editing) | Papers: Discrete Diffusion review
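
The index line format above can be produced with a small formatter. This is a sketch: the function names and the dictionary shape passed in are assumptions, not part of the prompt.

```python
from pathlib import Path


def format_index_line(day, findings):
    """Build one INDEX.md line; findings maps a section label to short entries."""
    sections = [
        f"{label}: {', '.join(items)}"
        for label, items in findings.items()
        if items  # skip sections with nothing to report
    ]
    return f"[{day}] " + " | ".join(sections)


def append_to_index(index_path, day, findings):
    """Append the formatted line to INDEX.md and return it."""
    line = format_index_line(day, findings)
    with Path(index_path).open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

Because the file is opened in append mode, earlier entries are never rewritten, which keeps INDEX.md a simple chronological log.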

6. Markdown Templates

models.md template:

# Models Research - [DATE]

## [Model Name]
- **Source**: [HuggingFace/GitHub/Other]
- **URL**: [link]
- **Type**: [text-gen/image-gen/video/multimodal]
- **Parameters**: [size]
- **Key Innovation**: [1-2 sentences]
- **Performance**: [benchmark results]
- **Relevance**: [1-5] - Why this matters
- **Tags**: #category #application #domain-specific

### Notes
[Detailed analysis, implementation notes, potential use cases]

---

papers.md template:

# Papers Research - [DATE]

## [Paper Title]
- **Source**: [arXiv/Hugging Face Papers]
- **URL**: [link]
- **Authors**: [key authors/institutions]
- **Key Contribution**: [1-2 sentences]
- **Reproducibility**: [code available? data available?]
- **Relevance**: [1-5] - Why this matters
- **Tags**: #research-area #methodology #application

### Summary
[Main findings, methodology, results]

### Implementation Notes
[How to use this research, what it enables]

---

tools.md template:

# Tools Research - [DATE]

## [Tool/Platform Name]
- **Source**: [GitHub/Website]
- **URL**: [link]
- **Category**: [infrastructure/evaluation/productivity/governance]
- **Key Feature**: [What makes it unique]
- **Integration**: [How it fits into workflows]
- **Relevance**: [1-5] - Why this matters
- **Tags**: #tool-category #use-case

### Overview
[What it does, who it's for]

### Integration Strategy
[How to adopt this tool]

---

7. Intelligence Gathering Principles

Include in references/search-strategies.md:

The "Already Know" Problem:

  • Maintain ARCHIVE.md as source of truth
  • Before each search, check if item exists
  • Use SQLite for fast "have we seen this?" queries
  • Daily: Review last 7 days to avoid re-research
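
The "have we seen this?" lookup and the seven-day review can both be expressed against the research_items table. The sketch below assumes an open sqlite3 connection to metadata.db and dates stored as ISO strings (YYYY-MM-DD), so they compare correctly as text; the function names are hypothetical.

```python
import sqlite3
from datetime import date, timedelta


def have_we_seen(conn, item_type, name):
    """Return (date, file_path) of the latest matching entry, or None."""
    return conn.execute(
        "SELECT date, file_path FROM research_items "
        "WHERE type = ? AND name = ? "
        "ORDER BY date DESC LIMIT 1",
        (item_type, name),
    ).fetchone()


def recent_items(conn, days=7, today=None):
    """List (date, type, name) for everything researched in the last N days."""
    today = today or date.today()
    cutoff = (today - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT date, type, name FROM research_items "
        "WHERE date >= ? ORDER BY date DESC, name",
        (cutoff,),
    )
    return rows.fetchall()
```

The name comparison is an exact match so SQLite can use the idx_name index from the schema; a case-insensitive match would need an index with a matching collation.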

The "Signal vs Noise" Problem:

  • Focus on: Novel capabilities, paradigm shifts, direct applicability
  • Skip: Incremental improvements, duplicative work, low-impact papers
  • Use relevance scoring (1-5) to filter on retrieval
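
Filtering on retrieval can be as simple as a threshold on relevance_score. A sketch, assuming the research_items schema and treating scores of 4 and above as high signal (the cutoff is an assumption to tune; the function name is hypothetical):

```python
import sqlite3


def high_signal_items(conn, min_score=4, item_type=None):
    """Return (name, relevance_score) rows at or above the threshold."""
    sql = "SELECT name, relevance_score FROM research_items WHERE relevance_score >= ?"
    params = [min_score]
    if item_type is not None:
        sql += " AND type = ?"
        params.append(item_type)
    sql += " ORDER BY relevance_score DESC, date DESC"
    return conn.execute(sql, params).fetchall()
```

Rows with no score stored are excluded automatically, since a NULL relevance_score never satisfies the comparison.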

Domain-Specific Lens (customize for your field):

  • Tag items with applicability to your domain
  • Weekly: Cross-reference with organizational priorities
  • Monthly: Generate domain-specific summary

Implementation Instructions

  1. Initialize the skill directory:

       mkdir -p (local path)
       cd (local path)

  2. Create all directory structures and files as specified above

  3. Write all Python scripts with:

    • Proper error handling and logging
    • Clear usage documentation
    • Type hints and docstrings
    • An independent test run for each script
  4. Create SKILL.md following the pattern above

  5. Create example research entries for 2-3 days to demonstrate the format

  6. Test the workflow:

    • Run init_research_day.py to create today's structure
    • Manually research 3-5 items using templates
    • Run update_index.py to update INDEX.md
    • Run check_duplicate.py to verify duplicate detection
    • Run search_research.py to test retrieval
  7. Symlink to Claude's skills directory (if applicable)

Success Criteria

  • Can research 10-15 high-signal items in 15-20 minutes
  • Zero duplicate research (ARCHIVE.md prevents this)
  • Can search across 1000s of documents in <2 seconds
  • INDEX.md provides quick 30-second overview of all research
  • Monthly summaries synthesize trends and patterns
  • User can ask "what did we learn about diffusion models?" and get instant answer

Example Usage

After setup, users should be able to:

"Initialize today's research"
→ Creates directory, templates, ready to go

"Check if we've researched Qwen2-VL-72B"
→ Searches ARCHIVE.md and database

"Research trending models from HuggingFace"
→ Fetches, filters, stores, updates index

"Show me all multimodal research from last month"
→ Runs query, presents results

"Generate this month's summary"
→ Analyzes all research, creates synthesis

Automation: The Claude Code CLI Approach

The most powerful way to automate this skill is to run it through the Claude Code CLI rather than only running the Python scripts.

Recommended Setup

Create a task file at (local path):

# Daily AI Research Task

Execute today's AI landscaping research using the ai-landscaping skill:

1. Check ARCHIVE.md to see what we've already researched
2. Find top 5 trending models from last 24 hours (HuggingFace)
3. Find top 3 papers from major labs (arXiv, HF Papers)
4. Find 1-2 noteworthy tools/platforms
5. For each item: check duplicates, apply quality filters, write research entry
6. Update INDEX.md, ARCHIVE.md, and metadata.db
7. Log any issues or notable findings

Be thorough but efficient. Skip low-signal items. Adapt if APIs are slow or unavailable.

Schedule with LaunchAgent (macOS)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.ai-research</string>
    
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/claude-code</string>
        <string>--task</string>
        <string>(local path)</string>
    </array>
    
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>1</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
    
    <key>Nice</key>
    <integer>10</integer>
    
    <key>StandardOutPath</key>
    <string>(local path)</string>
    
    <key>StandardErrorPath</key>
    <string>(local path)</string>
</dict>
</plist>

Or with cron (Linux):

0 1 * * * /usr/local/bin/claude-code --task (local path) >> (local path) 2>&1

Why Claude Code CLI vs Python Scripts?

Claude Code Approach (Recommended):

  • ✅ Intelligent decision-making (adapts to trends)
  • ✅ Graceful error handling (API down? Uses alternatives)
  • ✅ Dynamic quality filtering (recognizes hype vs substance)
  • ✅ Better summaries (understands context)
  • ✅ Uses ALL available tools (MCP servers, web search, file ops)
  • ✅ Can adjust strategy mid-execution

Script-Only Approach:

  • ❌ Rigid, predetermined logic
  • ❌ Breaks on unexpected conditions
  • ❌ Limited to what you coded
  • ❌ Requires manual updates to adapt

The skill provides the framework. Claude Code provides the intelligence.

Customization Notes

For your specific domain:

  1. Update search-strategies.md with your domain's sources
  2. Modify relevance scoring criteria
  3. Add domain-specific tags
  4. Customize quality heuristics
  5. Adjust daily routine to your research cadence

For your workflow:

  1. Start manual, validate patterns for 1-2 weeks
  2. Create task file with your specific requirements
  3. Set quality thresholds based on your time budget
  4. Define "high signal" for your use case
  5. Schedule Claude Code CLI execution

This prompt sets up the complete infrastructure. The Claude Code CLI approach lets you run it with real intelligence, not just automation.


About the Author: Justin Johnson builds AI systems and writes about practical AI development.
