The remote model writes code; it never reads data
A headgate is the gate at the intake of a millrace that controls how much water enters the channel. This one is the controlled intake on the data: it lets a powerful remote model do the thinking while your private data stays on your side of the gate. The frontier model is treated as an untrusted code generator, not a data processor — it sees only the shape of your data and writes code; the code runs locally, in a sandbox, against the real data.
⚠︎ Experimental. headgate is an early research experiment — the design, threat model, and APIs are still moving and it isn't ready to trust with data that actually matters yet.
Goal
You have private data and a small, capable local model (served by the millrace inference server). The local model can reason over your data, but some tasks need a frontier model's code-generation ability. Rather than ship the data out, headgate flips the relationship: the remote model receives a sanitized schema and writes code; that code executes locally and the results stay local. The data never crosses the gate.
       ┌───────────────────── headgate ─────────────────────┐
       │ orchestrator · schema sanitizer · egress guard ·   │
       │ remote codegen client · code sandbox (containment) │
       └────┬───────────────────────────────────────┬───────┘
reason over │ (OpenAI API)          codegen request │ (schema only)
            ▼                                       ▼
    inference server                      remote frontier model
   (local, on-device)                    (untrusted; code only)
            │                   runs generated code ▼
            │                         [ sandbox · network = deny ]
            └───────── private / local data ────────┘
Design
Code is the interface between the capable-but-untrusted party and the private party — never data. Two roles stay strictly separate: the model runner (the local inference server, which stays harness-agnostic behind the OpenAI API) and the code runner (the sandbox inside headgate that executes the remote model's generated code over real data).
Threat model: the careless SaaS provider
v1 defends against an honest-but-careless remote
provider — one that won't deliberately smuggle data out, but whose
generated code might accidentally leak it (a stray
requests.post to telemetry, or a stack trace that echoes a
private value back into the next prompt). That model lets us skip the
expensive paranoia and concentrate everything on two chokepoints. (A
genuinely adversarial provider with covert channels is a later, harder
threat model.)
Two guarantees
- Containment — owned by the sandbox: generated code cannot reach the network or escape its scope; its output is captured, never self-emitted.
- Confidentiality — owned by headgate: nothing sent to the remote model contains real data. Enforced by schema sanitization + an egress guard, and by debugging against synthetic data shaped like the real schema — the real data is touched only on a final run whose raw output never loops back.
Guiding principles:
- Mechanism vs. policy: the inference server and the sandbox are mechanism; headgate is confidentiality policy on top.
- Sanitize the schema, not just the values: column/table names leak on their own, so they're aliased and mapped back locally.
- No silent leaks: the egress guard fails closed.
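To make "fails closed" concrete, here is a minimal Python sketch of an outbound tripwire. It is an illustration only: headgate itself is written in Mojo, and the names `EgressBlocked`, `fingerprint`, `build_fingerprints`, `guard_outbound`, `send_guarded`, and the canary string are hypothetical. The sketch hashes known real values ahead of time and refuses any payload that contains a planted canary or a token whose hash matches. It only catches single-token values; multi-word values and PII redaction would need more work.

```python
import hashlib
import re


class EgressBlocked(Exception):
    """Raised when an outbound payload may contain real data."""


def fingerprint(value: str) -> str:
    # Keep hashes rather than raw values so the guard holds no plaintext.
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


def build_fingerprints(real_values) -> set:
    return {fingerprint(str(v)) for v in real_values}


def guard_outbound(payload: str, fingerprints: set, canaries: set) -> str:
    """Return the payload unchanged if it looks clean; otherwise raise."""
    for canary in canaries:
        if canary in payload:
            raise EgressBlocked("canary found in outbound payload")
    # Compare every token of the payload against the real-data fingerprints.
    for token in re.findall(r"[\w@.+-]+", payload):
        if fingerprint(token) in fingerprints:
            raise EgressBlocked("real-data fingerprint matched")
    return payload


def send_guarded(payload: str, fingerprints: set, canaries: set, send):
    # Fail closed: `send` is reached only if the guard returned normally.
    return send(guard_outbound(payload, fingerprints, canaries))
```

Because `send` is only ever called on the guard's return value, any exception inside the guard (including an unexpected one) means nothing leaves the process.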
Implementation
headgate is written in Mojo, in pi-shaped layers — a thin transport chokepoint and core loop over a confidentiality policy over the containment sandbox. The flow for one task:
- Sanitize. The schema sanitizer derives the schema from the real data, aliases sensitive column/table names, and synthesizes fake sample rows that match the types.
- Guarded codegen. The remote codegen client sends the spec + sanitized schema + synthetic samples to the frontier model (Claude) over flare (pure-Mojo HTTP, no curl/Python) and receives code back. Every outbound payload passes the egress guard — a single chokepoint with a real-data fingerprint tripwire, canary detection, and PII redaction — which fails closed.
- Debug on synthetic data. The orchestrator compiles and runs the generated code against synthetic data in the sandbox, looping fixes back to the model — all without the real data in scope.
- Real run. Only once it's clean does the code run against the real data; the dealiased result stays local, and its raw output never loops back to the remote model.
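The four steps above can be sketched as a single loop. This is a Python illustration under stated assumptions, not headgate's Mojo orchestrator: `RunResult`, `run_task`, `codegen`, `sandbox_run`, and `max_fixes` are hypothetical names, and the remote call and the sandbox are injected as plain callables. The point it shows is structural: only errors from synthetic runs are ever fed back to `codegen`, and the real rows are used exactly once, after a clean synthetic run.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunResult:
    ok: bool
    output: str = ""
    error: str = ""


def run_task(
    spec: str,
    sanitized_schema: str,
    synthetic_rows: list,
    real_rows: list,
    codegen: Callable[[str], str],                  # guarded call to the remote model
    sandbox_run: Callable[[str, list], RunResult],  # runs code with no network
    max_fixes: int = 3,
) -> Optional[str]:
    prompt = f"{spec}\n\nSchema:\n{sanitized_schema}\n\nSamples:\n{synthetic_rows}"
    code = codegen(prompt)
    for attempt in range(max_fixes + 1):
        trial = sandbox_run(code, synthetic_rows)
        if trial.ok:
            # Real run: the output goes to the local caller, never to codegen.
            final = sandbox_run(code, real_rows)
            return final.output if final.ok else None
        if attempt < max_fixes:
            # Only the synthetic-run error goes back to the model.
            code = codegen(
                f"{prompt}\n\nPrevious code:\n{code}\n\nError:\n{trial.error}"
            )
    return None  # never clean on synthetic data, so real data was not touched
```

If the real run itself fails, this sketch returns `None` rather than looping the error back, since a real-run error message could echo a private value.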
Containment: macOS Seatbelt, proven
The sandbox runs generated code under a macOS Seatbelt profile that
denies all network egress, scopes filesystem reads/writes
to canonicalized (realpath) paths, and exposes a tiny
capability allowlist via a broker. The containment boundary is verified,
not nominal: a spike that passed 6 of 6 checks confirms that network egress
is denied, writes are scoped, and $HOME reads are denied; the Mojo runner
then re-proves it end to end (in-scope read works; out-of-scope read and
network egress are blocked).
Confidentiality through every loop
Because the compile/run feedback loops operate on aliased code and
synthetic data, the model never sees a real value or name. The schema
sanitizer (schema.mojo) handles type inference + aliasing +
dealiasing; the egress guard (egress.mojo) is the outbound
tripwire; the sandbox + broker (sandbox.mojo,
broker.mojo) own containment; the orchestrator
(orchestrator.mojo) drives the synthetic-debug → real-run
loop.
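The type inference, aliasing, and dealiasing steps can be illustrated with a small Python sketch. It is not a port of schema.mojo: the function names, the `col_N` alias format, and the four type labels are assumptions chosen for the example, and the synthetic rows use fixed placeholder values where a real synthesizer would presumably vary them.

```python
def infer_type(values) -> str:
    """Infer a coarse column type from the values, ignoring None."""
    non_null = [v for v in values if v is not None]
    if non_null and all(isinstance(v, bool) for v in non_null):
        return "BOOL"
    if non_null and all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
        return "INT"
    if non_null and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    ):
        return "FLOAT"
    return "TEXT"


def sanitize_schema(rows):
    """Return (aliased schema, alias -> real name map); no names or values leak."""
    columns = list(rows[0].keys())
    alias_of = {name: f"col_{i}" for i, name in enumerate(columns)}
    schema = {alias_of[c]: infer_type([r[c] for r in rows]) for c in columns}
    real_of = {alias: name for name, alias in alias_of.items()}
    return schema, real_of


def synth_rows(schema, n=3):
    # Placeholder values that match each type and carry no real information.
    placeholder = {"BOOL": True, "INT": 0, "FLOAT": 0.0, "TEXT": "x"}
    return [{c: placeholder[t] for c, t in schema.items()} for _ in range(n)]


def dealias(result_rows, real_of):
    # Map aliased result columns back to real names; runs locally only.
    return [{real_of.get(k, k): v for k, v in row.items()} for row in result_rows]
```

Only `schema` and the output of `synth_rows` would be sent out; `real_of` stays on the local side and is applied to results after the real run.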
Budget with local fallback
A token budget caps spend on the frontier model; it is debited using the
token usage the API reports. Once the budget is depleted, or when there is
no API key at all, codegen and fixes route to the local model instead of
failing: trusted, free, lower quality. So headgate degrades gracefully to
a fully local, fully private mode rather than stopping.
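The routing rule is simple enough to sketch in a few lines of Python. The names here (`TokenBudget`, `generate`, and the `remote`/`local` callables) are hypothetical, and the sketch assumes the remote call returns its token usage alongside the code.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    remaining: int

    def charge(self, tokens_used: int) -> None:
        # Debit the usage the API reported, never dropping below zero.
        self.remaining = max(0, self.remaining - tokens_used)

    @property
    def depleted(self) -> bool:
        return self.remaining <= 0


def generate(prompt, budget, api_key, remote, local):
    """remote(prompt) -> (code, tokens_used); local(prompt) -> code."""
    if api_key and not budget.depleted:
        code, tokens_used = remote(prompt)
        budget.charge(tokens_used)
        return code, "remote"
    # No key or no budget left: fall back to the local model instead of failing.
    return local(prompt), "local"
```

In this sketch a call that starts with any budget left is allowed to overshoot it; a stricter variant could estimate the cost up front and fall back earlier.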