The Harness Owns the Environment

The single most important architectural decision in this repo:

main()
  ├── BrowserSession::open()           ← harness opens the environment
  ├── create_tools(session.clone())    ← tools are bound to this session
  ├── create_context(TASK)             ← fresh context for this task
  ├── run_loop(model, &client, ...)    ← loop runs inside the environment
  └── [Browser closed on Drop]         ← always, via RAII

Tools don’t manage the browser. They don’t know about the browser lifecycle. The harness opens it, the harness closes it, and the process exits cleanly.
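As a rough illustration of the lifecycle above, here is a minimal sketch of a harness function that opens a session and relies on Drop to close it. The `BrowserSession` here is a hypothetical stand-in that only flips a flag, and `run_task` and the `closed` flag are names invented for this example; the real session type in the repo wraps an actual browser and is cloneable, which this sketch leaves out.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for the repo's BrowserSession (assumption: the real
// type owns a browser process; this one only records whether it was closed).
struct BrowserSession {
    closed: Arc<AtomicBool>,
}

impl BrowserSession {
    fn open(closed: Arc<AtomicBool>) -> Self {
        closed.store(false, Ordering::SeqCst);
        BrowserSession { closed }
    }
}

impl Drop for BrowserSession {
    fn drop(&mut self) {
        // Runs when the session leaves scope, on success or on early return.
        self.closed.store(true, Ordering::SeqCst);
    }
}

// The harness: opens the environment, does the work, and lets Drop close it.
fn run_task(closed: Arc<AtomicBool>, fail: bool) -> Result<&'static str, String> {
    let _session = BrowserSession::open(closed);
    if fail {
        // Early return: _session is still dropped on the way out.
        return Err("tool failed".to_string());
    }
    Ok("done")
}

fn main() {
    let closed = Arc::new(AtomicBool::new(false));
    let result = run_task(closed.clone(), true);
    assert!(result.is_err());
    // The session was closed even though the task failed.
    assert!(closed.load(Ordering::SeqCst));
}
```

Note that the binding is `_session`, not `_`: a bare `_` pattern would drop the session immediately instead of at the end of the function.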

Why This Matters

If tools managed their own lifecycle, you’d get:

Leaked browser instances when a tool crashes
Shared global state between runs
Race conditions when tools compete for resources
No way to inject deterministic behavior (like login) mid-run

When the harness owns the environment:

Isolation: each run gets a fresh environment
Determinism: the harness can intercept and override at any point
Cleanup: the session’s Drop implementation closes the browser when the harness scope ends, even on an early error return
Injectability: the login handler runs inside the harness, not the agent

Tools Are Pure Functions of Their Session

Tools are created by passing a session to create_tools:

let tools = create_tools(session.clone());

The tools don’t import a browser or reach into global state. They reference the session that was handed to them. This makes them testable, swappable, and safe.
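To make the testability claim concrete, here is a minimal sketch of tools that capture whatever session they are handed. The `Session` trait, the `Tool` struct, and `FakeSession` are all assumptions made for this example; the repo's real `create_tools` signature and tool types may look quite different.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical session interface (assumption: not the repo's real API).
trait Session {
    fn navigate(&self, url: &str);
    fn current_url(&self) -> String;
}

// A tool is a named closure that captured the session it was given.
struct Tool {
    name: &'static str,
    run: Box<dyn Fn(&str) -> String>,
}

// Tools receive the session as an argument; they never open or close it.
fn create_tools(session: Arc<dyn Session>) -> Vec<Tool> {
    let nav_session = session.clone();
    let url_session = session;
    vec![
        Tool {
            name: "navigate",
            run: Box::new(move |target: &str| {
                nav_session.navigate(target);
                format!("navigated to {}", target)
            }),
        },
        Tool {
            name: "current_url",
            run: Box::new(move |_: &str| url_session.current_url()),
        },
    ]
}

// Fake session for tests: no browser, just recorded state.
struct FakeSession {
    url: Mutex<String>,
}

impl Session for FakeSession {
    fn navigate(&self, url: &str) {
        *self.url.lock().unwrap() = url.to_string();
    }
    fn current_url(&self) -> String {
        self.url.lock().unwrap().clone()
    }
}

fn main() {
    let session: Arc<dyn Session> = Arc::new(FakeSession {
        url: Mutex::new(String::new()),
    });
    let tools = create_tools(session.clone());
    (tools[0].run)("https://example.com");
    println!("{}: {}", tools[1].name, (tools[1].run)(""));
}
```

Because the tools only see the trait object, swapping the fake for a real browser-backed session would not require changing any tool code.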

The Harness Can Intervene

Because the harness owns the loop, it can intervene before every iteration:

Run guardrails (max iterations, max messages)
Run the login handler (check URL, inject credentials)
Compact context (trim old messages)
Log and trace every event

The model never sees the harness code. It only sees the messages the harness chooses to inject.
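The per-iteration steps listed above might be sketched as follows. This is an illustration under stated assumptions, not the repo's actual `run_loop`: the `Message` and `Context` types, the `MAX_MESSAGES` limit, the `compact` strategy, and the `fake_model` stand-in are all invented here, and the login handler is omitted.

```rust
// Hypothetical message and context types (assumption: illustrative only).
#[derive(Debug)]
struct Message {
    role: &'static str,
    content: String,
}

struct Context {
    messages: Vec<Message>,
}

const MAX_MESSAGES: usize = 4;

// Compaction: keep the first message (the task) and the most recent ones.
fn compact(ctx: &mut Context) {
    if ctx.messages.len() > MAX_MESSAGES {
        let drop_count = ctx.messages.len() - MAX_MESSAGES;
        ctx.messages.drain(1..1 + drop_count);
    }
}

// Stand-in for a model call: reports "done" on the fifth iteration.
fn fake_model(iteration: usize) -> Message {
    let content = if iteration >= 5 {
        "done".to_string()
    } else {
        format!("step {}", iteration)
    };
    Message { role: "assistant", content }
}

// The harness loop: every step before the model call is harness code
// that the model never sees.
fn run_loop(ctx: &mut Context, max_iterations: usize) -> Result<usize, String> {
    for iteration in 1..=max_iterations {
        compact(ctx); // trim old messages
        println!("[trace] iteration {} with {} messages", iteration, ctx.messages.len());
        let reply = fake_model(iteration);
        let finished = reply.content == "done";
        ctx.messages.push(reply);
        if finished {
            return Ok(iteration);
        }
    }
    // Guardrail: the loop ended without the model finishing.
    Err(format!("guardrail: exceeded {} iterations", max_iterations))
}

fn main() {
    let mut ctx = Context {
        messages: vec![Message { role: "user", content: "upvote the post".to_string() }],
    };
    match run_loop(&mut ctx, 10) {
        Ok(n) => println!("finished after {} iterations", n),
        Err(e) => println!("{}", e),
    }
}
```

The iteration bound doubles as the guardrail: if the model never signals completion, the loop returns an error instead of running forever.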

What “Managing Input/Output Behind the Scenes” Looks Like

In the presentation demo, the login handler pushes this message ¹:

I'm the harness. I logged in. You're good now.

The model receives this as a tool result. It doesn’t know the harness injected credentials. It just sees that the login problem is solved and moves on to clicking the upvote button.
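A login handler along these lines could be sketched as below. Only the injected message text comes from the demo; the `Role`, `Message`, and `Session` types, the `/login` URL check, and the `login_handler` signature are assumptions for illustration, and the actual credential injection is reduced to setting a flag.

```rust
// Hypothetical types (assumption: not the repo's real definitions).
#[derive(Debug, PartialEq)]
enum Role {
    User,
    ToolResult,
}

#[derive(Debug)]
struct Message {
    role: Role,
    content: String,
}

struct Session {
    url: String,
    logged_in: bool,
}

// Runs inside the harness before the model is called. Returns true if it
// intervened. Credentials stay in the harness and never enter the context.
fn login_handler(session: &mut Session, messages: &mut Vec<Message>) -> bool {
    if !session.url.contains("/login") {
        return false; // nothing to do on non-login pages
    }
    // Placeholder for filling and submitting the login form.
    session.logged_in = true;
    session.url = session.url.replace("/login", "/");
    // The model only ever sees this message, presented as a tool result.
    messages.push(Message {
        role: Role::ToolResult,
        content: "I'm the harness. I logged in. You're good now.".to_string(),
    });
    true
}

fn main() {
    let mut session = Session {
        url: "https://example.com/login".to_string(),
        logged_in: false,
    };
    let mut messages = vec![Message {
        role: Role::User,
        content: "Upvote the post.".to_string(),
    }];
    if login_handler(&mut session, &mut messages) {
        println!("injected: {:?}", messages.last().unwrap());
    }
}
```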

The harness is the deterministic skeleton. The model fills in the gaps.

¹ “Harnesses in AI: A Deep Dive”, Tejas Kumar, AI Engineer World’s Fair, May 2026
