# I Built an Autonomous AI Agent Squad for $10/Month
Most AI agent tutorials end at "run this Python script." You paste an API key, call a function, get a response, and the tutorial declares victory. That's not an agent. That's a chatbot with extra steps.
A real autonomous agent needs to persist through reboots, act on its own schedule, remember what it learned yesterday, reach you on your phone, and — critically — not cost $200/month in API fees doing it.
I built a two-agent squad for $10/month. Here's the full story: the architecture, the tools, the spectacular failures, and what actually works.
## What I Built
Two AI agents running 24/7 on separate servers, communicating over an encrypted mesh network:
- **Atlas** — the CEO agent on a Hetzner VPS. Delegates research, builds tools, coordinates the squad.
- **Scout** — the Research Lead on an exe.dev VM. Runs web searches, scrapes pages, writes reports.
Atlas wakes up every 30 minutes, checks his standing orders, delegates research tasks to Scout via SSH, and builds tools locally. Scout processes those tasks using a self-hosted search engine and headless browser. They share a single LLM (GLM-4.7) but run on completely separate infrastructure.
```
Your Phone (Telegram)
|
v
+----------------------------------+ Tailscale Mesh
| Hetzner VPS ($6/mo) | (encrypted tunnel)
| |<-------------------+
| Atlas (CEO Agent) | |
| - OpenClaw Gateway | +---------------+-------+
| - GLM-4.7 (Z.AI, free) | | exe.dev VM (<$1/mo) |
| - Telegram Bot | | |
| - SearXNG (search) | | Scout (Research Lead) |
| - Memory (Gemini embeddings) | | - OpenClaw Gateway |
| - Delegation scripts | | - SearXNG (search) |
| | | - Chromium (headless) |
+----------------------------------+ +------------------------+
SSH only via Tailscale (100.x.x.x)
No public SSH. No shared passwords. No exposed APIs.
```
The total cost:
| Component | Monthly Cost |
|-----------|-------------|
| Hetzner VPS — Atlas (CEO) | $6 |
| exe.dev VM — Scout (Research) | $1 |
| GLM-4.7 via Z.AI (subscription) | $3 |
| Tailscale | $0 (free personal) |
| Gemini Embeddings | $0 (free tier) |
| SearXNG (x2) | $0 (self-hosted) |
| Telegram Bot | $0 |
| **Total** | **$10/month** |
For context, running a single agent on Claude API or GPT-4o would cost $50-200/month in API fees alone.
## Why GLM-4.7
The entire premise depends on the LLM being cheap. GLM-4.7 from Z.AI (Zhipu AI) offers a subscription at just $3/month with a 200K context window and 32K output tokens. That's competitive with Claude 4.5 Sonnet on most benchmarks, at a fraction of the per-token cost.
The trade-offs are real:
- **Rate limits** — There's a 5-hour rolling usage cap. Two agents sharing one account hit it within a few hours of heavy use. I had to throttle heartbeats from every 15 minutes to every 30.
- **Edit precision** — GLM-4.7 struggles with exact text matching for file edits. It frequently fails to match the `old_string` parameter, leading to cascading edit failures.
- **Reasoning depth** — For complex multi-step reasoning, it's noticeably behind Claude 4.5. But for autonomous research delegation, tool building, and web search synthesis, it's more than adequate.
- **Cost** — $3/month for unlimited-ish access to a 200K-context model. That's the whole argument.
If your agent's primary job is autonomous coordination rather than deep reasoning, the subscription tier is genuinely viable.
## The Foundation: Server + Security
### Hetzner VPS
I started with a Hetzner VPS — 2 vCPU, 4GB RAM, 40GB disk, Ubuntu 24.04. At $6/month, it's one of the cheapest VPS options with good uptime.
### SSH Hardening
First thing after provisioning: lock down SSH.
```bash
# Create agent user (no password login)
adduser --disabled-password agent
usermod -aG sudo agent

# Harden SSH with a drop-in config, then reload
cat > /etc/ssh/sshd_config.d/99-hardening.conf << 'EOF'
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
AllowUsers agent
EOF
systemctl reload ssh
```
Install fail2ban for brute force protection. Within a week, you'll have 50+ banned IPs. The internet is a hostile place.
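fail2ban's defaults cover sshd on most distros, but being explicit doesn't hurt. A minimal jail drop-in, as a sketch (the values are common defaults, tune to taste):

```ini
# /etc/fail2ban/jail.d/sshd.local
[sshd]
enabled  = true
port     = 22
maxretry = 3
findtime = 10m
bantime  = 1h
```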
### Tailscale: The Security Backbone
Tailscale is the single most important architectural decision. It creates an encrypted mesh VPN where every device gets a private IP (100.x.x.x) only reachable from your other Tailscale nodes.
```bash
curl -fsSL https://tailscale.com/install.sh | sh
tailscale up --hostname=my-agent
tailscale set --ssh=false # We control SSH ourselves
```
Firewall rules allow SSH only from the Tailscale subnet:
```bash
ufw allow from 100.64.0.0/10 to any port 22 # SSH via Tailscale only
ufw allow 443/tcp # HTTPS for Telegram webhook
```
This means the agent's SSH is invisible to the public internet. You can only reach it from your laptop, phone, or other agents on your Tailscale network.
> **Lesson learned the hard way**: If your VPS provider uses a proxy IP (like exe.dev), do NOT enable UFW. You'll block the proxy and lock yourself out permanently. Tailscale alone is sufficient isolation for proxied VMs.
## Giving It a Brain: OpenClaw + GLM-4.7
[OpenClaw](https://docs.openclaw.ai/) is an open-source agent gateway. It handles the LLM connection, tool execution, memory, channels (Telegram, email), and the heartbeat system. Install it globally:
```bash
sudo npm install -g openclaw
```
The configuration lives in `~/.openclaw/openclaw.json`. The critical piece is the model provider:
```json
{
"models": {
"providers": {
"zai": {
"baseUrl": "https://api.z.ai/api/coding/paas/v4",
"api": "openai-completions",
"models": [{
"id": "glm-4.7",
"reasoning": true,
"contextWindow": 200000,
"maxTokens": 32768,
"cost": { "input": 0, "output": 0 } // subscription-based, not per-token
}]
}
}
}
}
```
The zero `cost` values are intentional: GLM-4.7 access is subscription-based, so there is no per-token price to record. (Strict JSON doesn't allow inline comments, so keep notes like that out of the config file.)

**Critical rule**: No API keys in the config file. All secrets go in `~/.config/openclaw/secrets.env` with `chmod 600`. The systemd service sources this file at startup:
```bash
ExecStart=/bin/bash -c 'set -a && source ~/.config/openclaw/secrets.env && set +a && exec /usr/bin/openclaw gateway'
```
This pattern — env vars loaded by a bash wrapper in systemd — is the only reliable way to keep secrets out of config files while making them available at runtime.
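For reference, the secrets file is plain `KEY=value` lines. The variable names below are illustrative placeholders, not OpenClaw's actual schema:

```bash
# ~/.config/openclaw/secrets.env  (chmod 600)
# Placeholder names; use whatever keys your gateway config expects.
ZAI_API_KEY=replace-me
TELEGRAM_BOT_TOKEN=replace-me
GEMINI_API_KEY=replace-me
```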
## Giving It a Voice: Telegram
The Telegram integration is what makes this feel real. Create a bot via @BotFather, add the token to `secrets.env`, and configure the channel:
```json
{
"channels": {
"telegram": {
"enabled": true,
"dmPolicy": "pairing",
"groupPolicy": "allowlist"
}
}
}
```
The `dmPolicy: "pairing"` means nobody can talk to your bot until you explicitly approve them. When you first message it, you get a pairing code:
```bash
openclaw pairing approve telegram <CODE>
```
After that, only your Telegram account can communicate with the agent. Everyone else gets silence.
I also locked it to my specific Telegram user ID in `telegram-allowFrom.json` as an extra layer. Belt and suspenders.
## Giving It Eyes: Self-Hosted Search
An agent without web access is a researcher locked in a room. I run SearXNG — an open-source meta-search engine that aggregates results from DuckDuckGo, Google, Brave, and others.
```bash
docker run -d --name searxng --restart always \
-p 127.0.0.1:8888:8080 searxng/searxng
```
One gotcha that cost me 30 minutes: the default SearXNG config doesn't enable JSON format. You'll get 403 errors when your agent tries to programmatically search. Fix it:
```bash
docker exec searxng sh -c 'cat > /etc/searxng/settings.yml << CONF
use_default_settings: true
server:
  secret_key: "$(openssl rand -hex 32)"
search:
  formats: [html, json]
CONF'
docker restart searxng
```
A simple Python wrapper at `~/.local/bin/search` gives the agent a clean CLI interface: `search "AI agent frameworks 2026" 5`. Unlimited queries, no API key, fully self-hosted.
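The wrapper itself is small. A sketch of mine, assuming the JSON format enabled above and SearXNG listening on `127.0.0.1:8888` (the `results`/`title`/`url`/`content` fields match SearXNG's JSON output; the rest is plumbing):

```python
#!/usr/bin/env python3
"""Minimal SearXNG CLI wrapper: `search "query" [limit]`."""
import json
import sys
import urllib.parse
import urllib.request

SEARX_URL = "http://127.0.0.1:8888/search"

def format_results(payload: dict, limit: int) -> str:
    """Render SearXNG's JSON response as title/url/snippet lines."""
    lines = []
    for r in payload.get("results", [])[:limit]:
        snippet = (r.get("content") or "").strip()
        lines.append(f"{r.get('title', '?')}\n  {r.get('url', '')}\n  {snippet}")
    return "\n".join(lines)

def search(query: str, limit: int = 5) -> str:
    """Query the local SearXNG instance and return formatted results."""
    qs = urllib.parse.urlencode({"q": query, "format": "json"})
    with urllib.request.urlopen(f"{SEARX_URL}?{qs}", timeout=15) as resp:
        return format_results(json.load(resp), limit)

if __name__ == "__main__" and len(sys.argv) > 1:
    count = int(sys.argv[2]) if len(sys.argv) > 2 else 5
    print(search(sys.argv[1], count))
```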
## Giving It Memory
For the agent to remember across sessions, you need embeddings. Google's Gemini embeddings are free-tier and work well:
```json
{
"memorySearch": {
"enabled": true,
"provider": "gemini",
"model": "gemini-embedding-001"
},
"compaction": {
"mode": "default",
"reserveTokensFloor": 80000,
"memoryFlush": { "enabled": true }
}
}
```
When a conversation approaches 80K tokens, OpenClaw flushes important facts to semantic memory before compacting the context. The agent also maintains a `MEMORY.md` file as explicit long-term storage — squad members, key decisions, lessons learned.
## Making It Autonomous: The Heartbeat
The heartbeat is what separates an agent from a chatbot. Every 30 minutes, OpenClaw wakes the agent and feeds it a prompt:
```json
{
"heartbeat": {
"every": "30m",
"target": "telegram",
"prompt": "Read HEARTBEAT.md and follow it..."
}
}
```
`HEARTBEAT.md` is the agent's standing orders. Mine tells Atlas to pick one action per cycle: delegate research to Scout, build something, synthesize results, or do self-maintenance.
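A condensed sketch of what the standing orders look like (the exact wording is yours to tune; the important part is forcing one concrete action per wake-up):

```markdown
# HEARTBEAT.md: standing orders (condensed)
Pick exactly ONE action this cycle:
1. Delegate a research task to Scout (use `delegate`, include a task ID).
2. Build or improve one tool. If it breaks twice, stop and log it.
3. Synthesize completed research into MEMORY.md.
4. Self-maintenance: check disk, logs, and pending replies.
Then report what you did in one short Telegram message.
```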
The systemd service ensures this runs forever:
```ini
[Service]
ExecStart=/bin/bash -c 'set -a && source ~/.config/openclaw/secrets.env && set +a && exec /usr/bin/openclaw gateway'
Restart=always
RestartSec=10
```
Combined with `loginctl enable-linger`, the agent survives reboots, crashes, and SSH disconnects. It just keeps going.
## The Soul File
Every agent needs a system prompt — its personality, capabilities, and boundaries. This lives at `~/.openclaw/agents/main/system-prompt.md`.
Getting the soul right was iterative. The first version was "autonomous explorer" — Atlas would research whatever caught his interest, build tools, experiment. The problem? He'd spend entire heartbeat cycles debugging his own broken tools in an infinite build-break-debug loop.
The second version swung too far: "CEO who only delegates." But with only one research agent, that left Atlas with nothing to do most of the time.
The working version is a hybrid: "CEO who delegates research to Scout AND builds things himself." The soul file includes:
- Identity and mission
- Available tools and how to use them
- The delegation protocol (specific `delegate` command syntax)
- Decision framework (what to delegate vs. what to do yourself)
- Clear boundaries (no unauthorized access, be transparent about being AI)
## Multi-Agent Delegation: The Hard Part
This is where it gets interesting. The delegation works over SSH through Tailscale:
```bash
ssh research-1 "openclaw agent --agent main --message '...' --json"
```
Scout processes the request using his own OpenClaw instance and GLM-4.7, and returns a JSON response. A `delegate` script on Atlas wraps this:
```bash
delegate --to scout --message "Research the top 5 AI agent frameworks in 2026" --timeout 300
```
The script handles SSH, timeout, task ID generation, and response parsing. An `extract-reply.py` helper navigates OpenClaw's nested JSON format (`result.payloads[].text`).
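The helper is mostly defensive JSON navigation. A sketch (the `result.payloads[].text` shape comes from OpenClaw's output; the fallback behavior is my own choice):

```python
#!/usr/bin/env python3
"""extract-reply.py sketch: pull the agent's text out of OpenClaw's
nested JSON response (result.payloads[].text)."""
import json
import sys

def extract_reply(raw: str) -> str:
    """Return the concatenated text payloads, or the raw input on failure."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError:
        return raw.strip()  # not JSON: pass through so errors stay visible
    payloads = (doc.get("result") or {}).get("payloads") or []
    texts = [p.get("text", "") for p in payloads if isinstance(p, dict)]
    return "\n".join(t for t in texts if t).strip()

if __name__ == "__main__":
    print(extract_reply(sys.stdin.read()))
```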
For longer tasks, there's `delegate-async` that writes a task file to Scout's inbox and checks back later.
### The exe.dev SSH Discovery
The second agent, Scout, runs on an exe.dev VM. These VMs have no public IPs — they're proxied through exe.dev's infrastructure, which provides built-in isolation.
But getting agent-to-agent SSH working required discovering that exe.dev runs a custom sshd at `/exe.dev/bin/sshd` with authorized keys at `/exe.dev/etc/ssh/authorized_keys` — not the standard `~/.ssh/authorized_keys`. This took an hour of debugging that would have taken five minutes with documentation.
Once the SSH keys were in the right place, bidirectional communication worked perfectly:
- The primary agent can SSH to the research agent via Tailscale
- The research agent can SSH back for status updates
## Security Lockdown Checklist
Before letting agents run unsupervised, verify everything:
- **No plaintext keys in config** — `grep -i "apiKey\|token\|secret" ~/.openclaw/openclaw.json` should return nothing sensitive
- **Secrets file locked** — `chmod 600 ~/.config/openclaw/secrets.env`
- **Telegram user-locked** — `telegram-allowFrom.json` contains only your user ID
- **SSH key-only, no root** — `PasswordAuthentication no`, `PermitRootLogin no`
- **SearXNG on localhost** — bound to `127.0.0.1:8888`, not `0.0.0.0`
- **Tailscale SSH disabled** — `tailscale set --ssh=false`
- **No public web services** — any Flask/Express apps bind to Tailscale IP, not `0.0.0.0`
- **fail2ban active** — `systemctl status fail2ban`
Run this checklist after every major change. Agents are creative. They will expose things you didn't expect.
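Two of those checks are easy to automate. A sketch (the paths match the setup above; the secret-matching regex is my own heuristic, not anything OpenClaw ships):

```python
#!/usr/bin/env python3
"""Sketch of two lockdown checks: secrets-file permissions and
plaintext keys in the gateway config."""
import os
import re
import stat

def file_is_private(path: str) -> bool:
    """True if no group/other permission bits are set (e.g. chmod 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0

SECRET_PATTERN = re.compile(r'"(apiKey|token|secret)"\s*:\s*"[^"]+"', re.I)

def config_leaks_secrets(text: str) -> bool:
    """Heuristic: a quoted apiKey/token/secret key with a non-empty value."""
    return bool(SECRET_PATTERN.search(text))

if __name__ == "__main__":
    home = os.path.expanduser("~")
    secrets = f"{home}/.config/openclaw/secrets.env"
    if os.path.exists(secrets) and not file_is_private(secrets):
        print(f"WARN: {secrets} is not chmod 600")
    cfg = f"{home}/.openclaw/openclaw.json"
    if os.path.exists(cfg) and config_leaks_secrets(open(cfg).read()):
        print(f"WARN: possible plaintext secret in {cfg}")
```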
## What's Next
The two-agent setup is Stage 1. The roadmap:
- **Forge** (Engineering Lead) — Builds and deploys code. Separate exe.dev VM.
- **Watchtower** (Ops Lead) — Monitors infrastructure, runs health checks, manages the fleet.
- **Separate Z.AI accounts** — One per agent to avoid rate limit contention.
- **Voice integration** — Twilio for phone calls, Piper TTS for voice messages on Telegram.
- **Staggered heartbeats** — Agents wake at different intervals to reduce LLM contention.
Research from the multi-agent community suggests 79% of multi-agent failures come from specification ambiguity, not model capability. Group chat between agents is chaos — one-to-one delegation with structured task IDs works. And more agents can make a system *less* reliable (the "reliability paradox"), so we're adding them one at a time, proving each works before scaling.
## The Bottom Line
An autonomous AI agent doesn't need $200/month in API fees. With GLM-4.7's $3/month subscription, a $6 VPS, Tailscale for security, and OpenClaw for the gateway, you get a capable agent that persists 24/7, searches the web, remembers context, delegates to specialists, and chats with you on Telegram.
It won't replace Claude for complex reasoning. But for autonomous operation — heartbeat-driven research, proactive coordination, tool building — GLM-4.7 is surprisingly capable at a price point that lets you experiment freely.
The real insight isn't the tech stack. It's that autonomous agents need the same things human teams need: clear roles, structured communication, explicit boundaries, and someone checking that nobody put the API keys in a public file.
---
## Build It Yourself: The Agent Prompt
Don't want to follow this manually? I've published a structured markdown prompt you can paste into Claude Code, Codex, or any AI coding agent. It walks you through the entire setup interactively — stopping at 7 checkpoints to ask for your API keys, hostnames, and preferences.
Download it here: [[assets/openclaw-agent-setup-prompt|OpenClaw Agent Setup Prompt]]
It handles the entire build — server hardening, OpenClaw config, SearXNG, systemd, Telegram pairing, and the soul file — with interactive checkpoints so you stay in control.
---
## Related Articles
- [[AI Development & Agents/model-context-protocol-implementation|MCP Implementation Guide]]
- [[AI Systems & Architecture/building-markdown-rag-system|Building a RAG System]]
---
*Last updated: 2026-02-06*