Introduction
Managing a homelab cluster used to mean keeping a terminal open. Something is consuming memory on worker2? SSH in, run kubectl top pods, grep through logs. Want to trigger the morning brief job before the cron fires? Open another tab, construct the kubectl create job command, get the namespace right. It works, but it’s friction — and friction is what kills the habit of actually operating your infrastructure.
The shift that made this feel different: teaching an agent your cluster’s specific operations once, then driving everything through a conversation. OpenClaw is what made that practical. It runs as a Kubernetes StatefulSet, exposes an API behind a Gateway API HTTPRoute, and connects a Telegram bot to a model-backed agent that has genuine cluster access — kubectl binary, NFS shares, mounted secrets, the whole thing. When you message it to queue a job, it runs the job. No SSH required.
This post covers the architecture of that setup: the pod structure, model routing, volume strategy, and the AGENTS.md pattern that makes the agent useful rather than generic.
The Core Problem: Generic Agents Are Not Useful
Most self-hosted LLM agent setups suffer from the same failure mode: the agent knows how to use tools in the abstract but has no idea what your cluster actually does. It does not know that the morning brief lives in the zinfra namespace as a CronJob named morning-brief, or that web-to-PDF jobs are triggered by running node /scripts/trigger_articles.js, or that resume pipeline data lives at /mnt/gigs.
You can prompt it every time, or you can codify that knowledge once into a workspace file the agent reads at startup. The latter is the AGENTS.md pattern, and it is what separates a useful homelab agent from a novelty.
Architecture
OpenClaw runs in the zinfra namespace as a StatefulSet. The pod has three containers:
| Container | Image | Role |
|---|---|---|
| gateway | ghcr.io/openclaw/openclaw:2026.4.14 | Serves the OpenClaw API on port 18789 |
| node | ghcr.io/openclaw/openclaw:2026.4.14 | Agent executor, connects to the gateway |
| chrome | zenika/alpine-chrome | Headless Chromium on port 9222 for web automation |
The gateway container is exposed externally via a Kubernetes Gateway API HTTPRoute at oc.<your-domain>.com. The node container handles all agent execution — tool calls, model inference, job dispatch.
The pod is scheduled onto nodes labeled node-availability: 24x7 (worker1 and worker2). The GPU nodes are deliberately excluded; inference for this workload routes to external APIs and a local LiteLLM service rather than local GPU compute.
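One plausible way to express that constraint in the StatefulSet's pod template, assuming a plain nodeSelector on the label above (the exact selector mechanism is an assumption, not taken from the actual manifest):

```yaml
# StatefulSet pod template (excerpt) -- pins the pod to always-on nodes
spec:
  template:
    spec:
      nodeSelector:
        node-availability: "24x7"   # matches worker1 and worker2, excludes GPU nodes
```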
Model Routing
Model selection is controlled by the openclaw-config ConfigMap. The primary model is google/gemini-2.5-flash, accessed via the Google Generative AI API. Local models are served through a LiteLLM instance at http://litellm-service.llama.svc.cluster.local:4000 and act as fallbacks:
```yaml
# openclaw-config ConfigMap (excerpt)
primary_model: google/gemini-2.5-flash
fallbacks:
  - openai/llama3.1:latest
  - openai/qwen3:latest
  - openai/qwen2.5-coder:14b
litellm_base_url: http://litellm-service.llama.svc.cluster.local:4000
```
Gemini handles the majority of requests. The local fallbacks cover cases where the Gemini API is unavailable or where a task is better served by a code-focused model — qwen2.5-coder:14b in particular handles structured output generation reliably.
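The fallback order is declarative in the ConfigMap, but the behavior is easy to picture as code. A minimal sketch, where callModel is a hypothetical stand-in for the actual inference call (not part of OpenClaw's API):

```javascript
// Primary-then-fallback routing order, mirroring the openclaw-config excerpt above.
const PRIMARY = "google/gemini-2.5-flash";
const FALLBACKS = [
  "openai/llama3.1:latest",
  "openai/qwen3:latest",
  "openai/qwen2.5-coder:14b",
];

async function complete(prompt, callModel) {
  for (const model of [PRIMARY, ...FALLBACKS]) {
    try {
      // First model that answers wins.
      return { model, text: await callModel(model, prompt) };
    } catch (err) {
      // Model unavailable (e.g. Gemini API outage): fall through to the next one.
    }
  }
  throw new Error("all models failed");
}
```

If the Gemini call throws, the loop simply walks down the list until a model answers, which matches the described behavior of the local LiteLLM models covering API outages.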
Gateway Startup and Tool Access
The gateway container’s startup command handles several one-time provisioning steps before launching the server:
```shell
cp /tmp/openclaw.json /home/node/.openclaw/openclaw.json
cp /tmp/AGENTS.md /home/node/.openclaw/workspace/AGENTS.md
curl -sLo /home/node/bin/kubectl \
  "https://dl.k8s.io/release/v1.34.3/bin/linux/amd64/kubectl"
chmod +x /home/node/bin/kubectl
openclaw channels add --channel telegram --use-env
exec openclaw gateway run --bind lan --allow-unconfigured
```
Breaking this down:
- openclaw.json is mounted from a ConfigMap and copied to the expected config path.
- AGENTS.md is mounted from a ConfigMap and placed in the agent's workspace directory. This is the file that teaches the agent your homelab.
- kubectl v1.34.3 is downloaded fresh at startup. Rather than baking it into the image, it is pulled at boot time so the binary version can be updated via the ConfigMap without rebuilding the image.
- Telegram channel registration reads the bot token from the environment (backed by a SealedSecret).
- The gateway starts with --bind lan, making it reachable on the cluster network, and --allow-unconfigured permits routing to models that are not explicitly pre-validated.
Volume Strategy
The pod mounts several volumes that give the agent genuine access to cluster state and shared data:
```yaml
volumes:
  - name: nfs-share
    nfs:
      server: <nfs-server>           # shared with other workloads
      path: /mnt/nfs_share
  - name: gigs-pvc
    persistentVolumeClaim:
      claimName: gigs-pvc            # resume pipeline data at /mnt/gigs
  - name: ssh-key
    secret:
      secretName: openclaw-ssh-key   # git push access to the Jekyll blog repo
  - name: google-cookies
    secret:
      secretName: google-cookies     # authenticated web scraping sessions
  - name: agents-md
    configMap:
      name: openclaw-agents-md       # workspace definition file
```
The NFS share at /mnt/nfs_share is the same mount used by other workloads in the cluster — the web-to-PDF job, the morning brief job, etc. The agent can read and write to shared data without going through any intermediary API. The gigs PVC at /mnt/gigs holds resume pipeline artifacts; the agent can inspect pipeline state and queue new runs directly.
The SSH key gives the agent the ability to push commits to the Jekyll blog repository, which is how it can publish content without a human in the loop.
The AGENTS.md Pattern
This is the most important piece of the setup. AGENTS.md is a plain text file mounted into the agent’s workspace at startup. It describes your homelab in the agent’s own terms: what jobs exist, how to trigger them, what the directory layout means, what’s normal and what’s a problem.
A condensed example of what that file teaches the agent:
```markdown
# Homelab Operations

## Trigger web-to-PDF article fetch
node /scripts/trigger_articles.js

## Run the morning brief manually
kubectl create job --from=cronjob/morning-brief morning-brief-manual \
  -n zinfra

## Parse job posting URLs for the resume pipeline
node /scripts/parse_job_links.js <url1> [url2] ...

## Trigger the gigs resume tailor pipeline
# pipeline artifacts land at /mnt/gigs
# check status: kubectl get jobs -n zinfra -l app=gigs-tailor
```
Without this file, the agent can call kubectl but has no idea what to call it with. With it, a Telegram message of “fetch today’s articles” maps directly to a known operation. The agent does not need to reason about your cluster topology from first principles every time.
The pattern scales. Each time you add a new recurring operation to the cluster, you add a stanza to AGENTS.md. The agent picks it up on next restart. This is infrastructure-as-documentation, except the documentation is executable.
Secrets Management
All sensitive values are managed via SealedSecrets and unsealed by the Bitnami Sealed Secrets controller running in the cluster. The secret surface for this workload is:
- openclaw-gateway-token — the gateway's API token
- gemini-api-key — Google Generative AI API key
- litellm-api-key — key for the local LiteLLM service
- telegram-bot-token — the Telegram bot token read by --use-env at startup
- openclaw-ssh-key — deploy key for git push access to the blog repo
- google-cookies — session cookies for authenticated scraping
None of these values appear in plaintext in the GitOps repository. SealedSecrets are encrypted with the cluster’s public key and can only be decrypted by the controller in that specific cluster.
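For illustration, a sealed secret committed to the repo looks roughly like this; the field names follow the Sealed Secrets CRD, while the secret name and the ciphertext shown here are placeholders:

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: telegram-bot-token
  namespace: zinfra
spec:
  encryptedData:
    token: AgBy8hC...   # ciphertext, safe to commit; only the controller can decrypt it
  template:
    metadata:
      name: telegram-bot-token
      namespace: zinfra
```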
Headless Browser Integration
The chrome sidecar runs zenika/alpine-chrome and exposes the Chrome DevTools Protocol on port 9222. The node container connects to it for tasks that require a real browser — JavaScript-heavy pages that cannot be scraped with a plain HTTP client, authenticated sessions that depend on cookie state, or page rendering for PDF generation.
```yaml
- name: chrome
  image: zenika/alpine-chrome
  ports:
    - containerPort: 9222
  args:
    - --no-sandbox
    - --remote-debugging-port=9222
    - --remote-debugging-address=0.0.0.0
```
The Google cookies secret is mounted into the node container and injected into the browser session when authenticated scraping is needed. This is how the web-to-PDF job fetches articles behind Google login without manual cookie refresh — as long as the session is valid, the agent handles it automatically.
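One way that injection might be shaped, assuming the secret is mounted as a JSON array of name/value entries to be passed to the Chrome DevTools Protocol's Network.setCookies call (the secret's actual format is not specified in this setup):

```javascript
// Map a mounted cookie secret (assumed JSON: [{ name, value, domain?, path? }, ...])
// into the cookie parameter objects CDP's Network.setCookies expects.
function toCdpCookies(rawJson, defaultDomain = ".google.com") {
  return JSON.parse(rawJson).map((c) => ({
    name: c.name,
    value: c.value,
    domain: c.domain ?? defaultDomain,
    path: c.path ?? "/",
    secure: true,
  }));
}
```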
Gateway API Routing
External access is handled by a Kubernetes Gateway API HTTPRoute, not an Ingress. The HTTPRoute routes traffic from the shared gateway to the OpenClaw service in zinfra:
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: openclaw
  namespace: zinfra
spec:
  parentRefs:
    - name: homelab-gateway
      namespace: gateway
  hostnames:
    - oc.<your-domain>.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: openclaw-gateway
          port: 18789
```
This keeps the agent’s API accessible from outside the cluster — useful for integrations beyond Telegram, or for direct API calls during debugging.
Conclusion
The value of this setup is not that the agent is smart. It is that the agent is situated. It knows the cluster because AGENTS.md tells it what the cluster does. It has cluster access because the pod is built with the right tools mounted at the right paths. It has continuity because the StatefulSet keeps it running across restarts without losing workspace state.
The result is a conversational interface to your homelab that does not require you to remember namespaces, job names, or script paths. You built the infrastructure; AGENTS.md is how you teach the agent to operate it on your behalf.
For platform engineers already comfortable with Kubernetes primitives — StatefulSets, Gateway API, SealedSecrets, PVCs — wiring this up is mostly configuration work, not novel engineering. The tooling is the same. The model is just another workload.