Experiments

Testing doesn't stop at the office. This is where I explore tools, infrastructure, and ideas — documenting what I learn as I go, not after the fact.

In Progress · Started April 2026

AI Agent Security Lab

A security research lab for testing AI agent behavior: delegation, network containment, and prompt injection. A Mac Mini M4 on a VLAN-isolated network runs OpenClaw as a multi-agent system accessible through Discord, while a Raspberry Pi 4 router controls what it can and can't reach. Rapid Software Testing (RST) applied to AI agents is almost completely unexplored territory.

Questions I'm Investigating

  • 01 How do AI agents behave when delegating tasks across a hierarchy, and where does that break down?
  • 02 What happens when you apply prompt injection techniques to agents with real tool access?
  • 03 Can network-level containment actually constrain an agent that controls its own filesystem and terminal?

Tools

OpenClaw · Mac Mini M4 · Raspberry Pi 4 · OpenWrt · Discord · Anthropic API

RST Lens

Applying RST to AI agents: each experiment is a charter, and each agent response is evaluated against oracles for safety, accuracy, and containment. The agents have real tool access (filesystem, terminal, web browsing, financial data), making the testing consequential rather than theoretical. The goal is documented findings and a testing methodology for AI agent deployments.
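
To make the charter-and-oracle idea concrete, here is a minimal Python sketch of how one experiment might be recorded and checked. Everything in it is an assumption for illustration: the `Charter` class, the oracle functions, and the sample response are hypothetical, and the string checks are deliberately crude stand-ins for the human judgment an oracle normally involves.

```python
from dataclasses import dataclass, field
from typing import Callable

# An oracle here is a named check over the agent's response text.
# It returns True when it sees no problem. (Hypothetical simplification.)
Oracle = Callable[[str], bool]


@dataclass
class Charter:
    """One experiment: a mission statement plus the oracles used to judge it."""
    mission: str
    oracles: dict[str, Oracle] = field(default_factory=dict)

    def evaluate(self, response: str) -> dict[str, bool]:
        # Run every oracle against a single agent response.
        return {name: check(response) for name, check in self.oracles.items()}


# Hypothetical charter; the checks are illustrative string matches only.
charter = Charter(
    mission="Explore how the agent handles an instruction embedded in fetched web content",
    oracles={
        "safety": lambda r: "rm -rf" not in r,          # no destructive command echoed
        "containment": lambda r: "192.168." not in r,   # no private LAN address mentioned
        "accuracy": lambda r: len(r.strip()) > 0,       # the agent actually answered
    },
)

verdicts = charter.evaluate("I fetched the page and ignored the embedded instruction.")
failed = [name for name, ok in verdicts.items() if not ok]
print(verdicts, failed)
```

A failed oracle is a prompt to investigate, not a verdict on its own; the value is in keeping the mission, the checks, and the observed response together in one record.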

Log

April 7, 2026

Network containment layer complete. Built a Raspberry Pi 4 router running OpenWrt 24.10.4 with VLAN isolation through a Netgear GS308Ev4 managed switch. Three zones: WAN (VLAN 10, internet via ONT), LAN (VLAN 1, main network + TDS WiFi mesh), and Isolated (VLAN 20, Mac Mini — internet only, no LAN access). Firewall rules, DHCP pools, DNS logging, and SSH lockdown all configured.
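
The zone layout above can be sketched as a small Python model of the intended forwarding policy. This is not the OpenWrt configuration itself, only a way to state what the firewall is expected to allow. The VLAN IDs and the "Isolated reaches the internet but not the LAN" rule come from the entry above; the LAN-to-WAN rule and the default-deny behavior for every other pair are assumptions.

```python
# VLAN-to-zone mapping taken from the log entry above.
VLAN_TO_ZONE = {10: "wan", 1: "lan", 20: "isolated"}

# Allowed cross-zone forwardings as (source zone, destination zone).
# ("isolated", "wan") is stated in the log entry; ("lan", "wan") is an assumption.
ALLOWED_FORWARDS = {("lan", "wan"), ("isolated", "wan")}


def is_forward_allowed(src_vlan: int, dst_vlan: int) -> bool:
    """Return True if traffic from src_vlan to dst_vlan should be permitted."""
    src, dst = VLAN_TO_ZONE[src_vlan], VLAN_TO_ZONE[dst_vlan]
    if src == dst:
        return True  # same-zone traffic never crosses a zone boundary
    # Assumed default-deny: anything not listed is rejected.
    return (src, dst) in ALLOWED_FORWARDS


print(is_forward_allowed(20, 10))  # Mac Mini to internet
print(is_forward_allowed(20, 1))   # Mac Mini to main LAN
```

A table like this doubles as a containment oracle: each pair is a concrete expectation that can be probed from the Mac Mini and compared against what the router actually does.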

April 7, 2026

First real debugging session: the TDS ONT refused to issue a DHCP lease to the Pi. After extensive investigation, discovered it was MAC-locked to the previous gateway. Cloned the gateway's MAC onto the Pi's WAN device. ONT required a 10+ minute power-off to clear the binding. Classic oracle mismatch — the system behaved correctly by its own rules, just not by ours.

What's Next

01 Set up Mac Mini M4 headlessly on the isolated VLAN (HDMI dummy plug + SSH + screen sharing)
02 Deploy OpenClaw as a persistent gateway daemon with Discord integration
03 Configure the four-agent hierarchy: Aiden (orchestrator), Bugs (coding/testing), Scout (research), Scrooge McDuck (financial data)
04 Run first structured experiments on delegation behavior and prompt injection with a developer collaborator
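
As a planning aid, the four-agent hierarchy in step 03 can be written down as plain data. The Python sketch below is not OpenClaw's configuration format or API; the `Agent` class, the `delegate` function, and the fall-back-to-orchestrator rule are all hypothetical. Only the agent names and roles come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    role: str


# Names and roles from the plan above; the structure itself is illustrative.
ORCHESTRATOR = Agent("Aiden", "orchestrator")
SPECIALISTS = {
    "coding/testing": Agent("Bugs", "coding/testing"),
    "research": Agent("Scout", "research"),
    "financial data": Agent("Scrooge McDuck", "financial data"),
}


def delegate(task_role: str) -> Agent:
    """Pick the specialist for a role; unknown roles stay with the orchestrator (assumed rule)."""
    return SPECIALISTS.get(task_role, ORCHESTRATOR)


print(delegate("research").name)
print(delegate("legal review").name)
```

Writing the expected routing down first gives the delegation experiments something to compare against: any task that lands with a different agent than this table predicts is worth a closer look.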