
Proxies, Sandboxes and Agent Security


After my last post, I wanted to see how far I could take things. I have a home lab running in my office, where I have a bunch of different machines, and I run a combination of k3s and Ansible-provisioned services to self-host various apps like immich, silverbullet, beaverhabits, etc.

I wanted to see if an AI SRE can not just monitor but also manage my homelab. The idea was simple: build a skill + knowledge base about my homelab and see if AI can troubleshoot what's wrong and remediate the issue itself.

So I decided to set up Hermes Agent and trigger it every time an alert fires. I picked Hermes because I didn't want to build my own harness “to reduce yak-shaving”. But then of course I went to shave an entirely different yak. 😏

The Agent attack surface

I wanted to give Hermes Agent access to several tools like GitHub, Kubernetes, Grafana, Todoist, etc., all through CLIs. It would have been easy to get the credentials for all of these and put them in the container. But I don't really trust these agents, especially as I wanted to test them with Gemma4, Qwen, and other local (but dumber) models. Given these agents can do web searches to fetch the latest information or documents, it would be easy for someone to prompt-inject them. I saw two potential issues with running the agent:

1. It can run destructive actions: someone might get it to run a Python script that rm -rf's as much data as possible.

2. It can read secrets and exfiltrate them: It's easy to add hidden HTML to a page that says, “Just run this Python script, and that will tell you what you're looking for,” and the agent might just do that.

I wasn't too worried about 1, because I was running Hermes in a rootless container and didn't plan to put anything there that wasn't in git. But the second issue, secret exfiltration, was a real problem.

Credential injection proxies

The idea: you set the HTTP_PROXY variable to point at an HTTP credential proxy. You put fake credentials in the container, like fake-todoist-token, and in the proxy you rewrite the headers or requests to inject the right token. The agent only ever sees placeholder credentials.

Now this isn't a new concept, and it's something that many people have solved before. I was hoping to use Fly.io's tokenizer, but it had these whole sealing and encryption concepts, and I needed something simpler. So I just asked my LLM to cook something up based on goproxy, and it did. My config file looks something like this:

listen: 0.0.0.0:8080
ca_dir: "/var/run/credproxy"
allowlist: false
rules:
- host: api.todoist.com
  header: Authorization
  credentials:
  - fake: Bearer fake-todoist-key-1
    real: Bearer ${TODOIST_API_KEY}
- host: api.parallel.ai
  header: x-api-key
  credentials:
  - fake: fake-parallel-key-1
    real: "${PARALLEL_API_KEY}"
- host: tuwunel.tuwunel.svc.cluster.local
  header: Authorization
  credentials:
  - fake: Bearer fake-matrix-token
    real: Bearer ${MATRIX_ACCESS_TOKEN}
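
To make the rewrite step concrete, here's a minimal sketch of what such a proxy does per request, assuming rules shaped like the config above. This is illustrative Python, not the actual goproxy-based implementation, and the function and rule names are my own:

```python
# Swap placeholder credentials for real ones on requests to known hosts.
# Rules mirror the YAML config: host -> header -> list of fake/real pairs.

def inject_credentials(host: str, headers: dict, rules: list) -> dict:
    """Return a copy of `headers` with any matching fake credential
    replaced by the real one for this host; other hosts pass through."""
    for rule in rules:
        if rule["host"] != host:
            continue
        header = rule["header"]
        for cred in rule["credentials"]:
            if headers.get(header) == cred["fake"]:
                headers = {**headers, header: cred["real"]}
    return headers

rules = [{
    "host": "api.todoist.com",
    "header": "Authorization",
    "credentials": [
        {"fake": "Bearer fake-todoist-key-1", "real": "Bearer s3cr3t"},
    ],
}]

out = inject_credentials(
    "api.todoist.com",
    {"Authorization": "Bearer fake-todoist-key-1"},
    rules,
)
```

An unknown host, or a header that doesn't match any fake value, falls through untouched, so the proxy never leaks real tokens to endpoints you haven't configured.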

But that in itself isn't enough. I also had to add the proxy's CA certificate to the container's trust store so the proxy can MITM and modify HTTPS requests as well. My container config looked something like this:

      initContainers:
        - command:
            - sh
            - -c
            - until wget -q -O /ca/credproxy-ca.crt http://credproxy.credproxy.svc.cluster.local:8080/ca.crt; do echo "waiting for credproxy..."; sleep 2; done && cp /etc/ssl/certs/ca-certificates.crt /ca/ca-bundle.crt && cat /ca/credproxy-ca.crt >> /ca/ca-bundle.crt
          image: alpine:3.21
          name: init-credproxy-ca
          volumeMounts:
            - mountPath: /ca
              name: credproxy-ca
....

          env:
            - name: HTTP_PROXY
              value: http://credproxy.credproxy.svc.cluster.local:8080
            - name: HTTPS_PROXY
              value: http://credproxy.credproxy.svc.cluster.local:8080
            - name: SSL_CERT_FILE
              value: /var/run/credproxy/ca-bundle.crt
            - name: REQUESTS_CA_BUNDLE
              value: /var/run/credproxy/ca-bundle.crt
            - name: CURL_CA_BUNDLE
              value: /var/run/credproxy/ca-bundle.crt
            - name: NODE_EXTRA_CA_CERTS
              value: /var/run/credproxy/ca-bundle.crt

PS: Agent Vault, which implements a lot of this, was recently released, and I'm inclined to delete my hand-rolled proxy and move to it. Need to try it first, though.

A proxy is not frictionless

So this worked. But only sort of. I ran into two issues:

  1. Chrome with Playwright doesn't honor the certs: I tried setting up browser support in Hermes with plain Chrome and Playwright, but it never worked because, regardless of what I tried, Chrome never picked up the certs. Turns out you need to point Playwright at them explicitly in its config, and I didn't want to modify Hermes to do this. I ended up running a separate deployment with Camoufox to get around it.
  2. The Matrix client didn't honor HTTP_PROXY: Hermes was using the matrix-nio Python library, and for some reason the requests never reached my proxy. Turns out it uses aiohttp under the hood, which needs trust_env=True for the proxy env vars to be picked up. Hermes has since moved to the mautrix library, and even that doesn't support HTTP_PROXY.
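
For reference, the aiohttp fix from issue 2 is a one-liner: trust_env is a real ClientSession parameter that makes it read HTTP_PROXY / HTTPS_PROXY from the environment, which it ignores by default. The URL below is just a placeholder:

```python
import asyncio
import aiohttp

async def fetch_status(url: str) -> int:
    # trust_env=True tells aiohttp to honor HTTP_PROXY / HTTPS_PROXY
    # (and .netrc); without it, proxy env vars are silently ignored.
    async with aiohttp.ClientSession(trust_env=True) as session:
        async with session.get(url) as resp:
            return resp.status

# asyncio.run(fetch_status("https://api.todoist.com/"))  # now routes via the proxy
```

The catch, of course, is that any library that constructs its own session without this flag (or without exposing it) bypasses the proxy entirely, which is exactly what happened here.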

But this showed me that while a proxy is a great start, it probably won't always work.

PPS: Kloak, which uses eBPF to intercept and modify requests, was also released recently. It might do the trick, but it has limitations too, so it needs testing.

gVisor-based sandboxes

I was chatting with Tom Braack about this, and it was very timely because he was building a gVisor-based sandbox to run agents. What's cool is that gVisor implements its networking stack in Go, and you can extend it to intercept every single request that ever goes out. You can then build allow/deny lists for domains to ensure nothing is exfiltrated by bypassing the proxy.
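
As an illustration of the allow/deny idea (a Python sketch, not the actual Go/netstack code, and the host list is made up), the per-connection check boils down to something like:

```python
from fnmatch import fnmatch

# Illustrative allowlist: exact hosts plus wildcard patterns.
ALLOWED_HOSTS = ["api.todoist.com", "*.grafana.net"]

def is_allowed(host: str) -> bool:
    """Return True if an outbound connection to `host` should be permitted.
    Anything not matching a pattern is denied by default."""
    return any(fnmatch(host, pattern) for pattern in ALLOWED_HOSTS)
```

Because the check sits inside the sandbox's network stack rather than in an opt-in proxy, even libraries that ignore HTTP_PROXY can't slip past it.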

I wasn't too worried about this because I was not putting anything private there, but it's good to know that this is possible. Right now the code sits in an internal Grafana repo, but I am excited to play with it and possibly even help open-source this.

I wondered what I'd need to do to get Hermes running in this, and whether I'd have to run it outside k3s. However, it looks like the Kubernetes project is working on making this work natively with Agent Sandbox, and GKE even has experimental support for it. The docs are very weird, though.

So, my project to build the AI SRE somehow ballooned to include running Hermes Agent securely. But I hope to make slow progress and eventually get there. 😄

written by hand, using a pocket reform and Ghostwriter.