Components of an Agent Harness

This chapter walks through eight components of a typical agent harness. The framing follows Tejas Kumar’s talk [1], which uses coding agents like Claude Code and Cursor as familiar reference examples. Together, these components form the scaffolding that grounds a model in reality.

1. Tool Registry

A defined set of actions the model can take. Each tool has a name, description, parameter schema, and an execution function.

Examples in this repo: browser_navigate, browser_click, browser_fill, browser_get_text, browser_get_stories, browser_url, browser_has_class.

Tools are registered in one place (src/tools.rs) and injected into the loop as a parameter — never imported globally.
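To make the shape of a registry concrete, here is a minimal sketch. It is not the code in src/tools.rs: the `Tool`, `ToolRegistry`, `Args`, and `build_registry` names are hypothetical, arguments are simplified to string pairs, and `browser_url` is a stand-in that returns a fixed string instead of talking to a browser.

```rust
use std::collections::HashMap;

/// Assumption: arguments are simplified to string key/value pairs.
pub type Args = HashMap<String, String>;

/// Execution function shared by every tool: arguments in, text result out.
pub type ToolFn = Box<dyn Fn(&Args) -> Result<String, String>>;

/// One registry entry: name, description, parameter schema, executor.
pub struct Tool {
    pub name: &'static str,
    pub description: &'static str,
    /// Hypothetical schema: (parameter name, type hint) pairs.
    pub parameters: Vec<(&'static str, &'static str)>,
    pub execute: ToolFn,
}

pub struct ToolRegistry {
    tools: HashMap<&'static str, Tool>,
}

impl ToolRegistry {
    pub fn new() -> Self {
        Self { tools: HashMap::new() }
    }

    pub fn register(&mut self, tool: Tool) {
        self.tools.insert(tool.name, tool);
    }

    /// Sorted tool names, e.g. for building the tool list sent to the model.
    pub fn names(&self) -> Vec<&'static str> {
        let mut names: Vec<_> = self.tools.keys().copied().collect();
        names.sort();
        names
    }

    /// Dispatches a tool call by name; unknown names become an error result.
    pub fn call(&self, name: &str, args: &Args) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => (tool.execute)(args),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}

/// Builds the registry in one place. The caller passes it to the loop
/// as a value, so nothing here is global.
pub fn build_registry(current_url: String) -> ToolRegistry {
    let mut registry = ToolRegistry::new();
    registry.register(Tool {
        name: "browser_url",
        description: "Return the URL of the current page.",
        parameters: vec![],
        // Stand-in executor: a real tool would query the browser session.
        execute: Box::new(move |_args: &Args| Ok(current_url.clone())),
    });
    registry
}
```

Because the loop receives the registry as an argument, a test can hand it a registry of fakes without touching any global state.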

2. Model

The underlying LLM. In this repo, you can swap in any Ollama model by changing one string in src/harness.rs. The harness doesn’t care which model runs: it just sends messages and reads responses. The model client (src/model.rs) uses Ollama’s OpenAI-compatible endpoint at /v1/chat/completions.
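One way to keep the harness indifferent to the model is to hide the client behind a small trait. The sketch below is an illustration, not the contents of src/model.rs: `ModelConfig`, `ModelClient`, `EchoModel`, and `ask` are hypothetical names, no HTTP request is made, and the base URL in the usage note is only Ollama's usual local default.

```rust
/// A chat message in the role/content shape used by OpenAI-compatible APIs.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Hypothetical settings: the model is just a string, so swapping models
/// means changing one value.
pub struct ModelConfig {
    pub base_url: String,
    pub model: String,
}

impl ModelConfig {
    /// Full URL of the chat completions endpoint.
    pub fn chat_url(&self) -> String {
        format!("{}/v1/chat/completions", self.base_url.trim_end_matches('/'))
    }
}

/// The harness depends only on this trait, not on a concrete model.
pub trait ModelClient {
    fn chat(&self, messages: &[Message]) -> Result<Message, String>;
}

/// Stand-in client that echoes the last message; a real client would
/// POST the messages to `ModelConfig::chat_url()`.
pub struct EchoModel;

impl ModelClient for EchoModel {
    fn chat(&self, messages: &[Message]) -> Result<Message, String> {
        let last = messages
            .last()
            .ok_or_else(|| "no messages to answer".to_string())?;
        Ok(Message {
            role: "assistant".to_string(),
            content: format!("echo: {}", last.content),
        })
    }
}

/// Harness-side helper: sends one user prompt and returns the reply text.
pub fn ask(client: &dyn ModelClient, prompt: &str) -> Result<String, String> {
    let messages = vec![Message {
        role: "user".to_string(),
        content: prompt.to_string(),
    }];
    client.chat(&messages).map(|reply| reply.content)
}
```

With `base_url` set to `http://localhost:11434`, `chat_url()` yields `http://localhost:11434/v1/chat/completions`. The `model` field would go into the request body, which this sketch leaves out.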

3. Context Management

The harness builds the initial context (src/context.rs) and manages the message history. Without that management, the context window fills up with stale tool results. Good context management compacts or trims messages so the model always sees relevant information.
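A simple trimming strategy can be sketched as follows. This is one possible policy, not necessarily the one in src/context.rs: it keeps a fixed number of leading messages (for example the system prompt and the task) plus the most recent messages, and drops everything in between. The `trim_context` function and its parameters are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Keeps the first `pinned` messages and the last `keep_recent` messages,
/// dropping the stale middle of the conversation.
pub fn trim_context(messages: &[Message], pinned: usize, keep_recent: usize) -> Vec<Message> {
    // Nothing to drop if the history already fits.
    if messages.len() <= pinned + keep_recent {
        return messages.to_vec();
    }
    let mut trimmed = messages[..pinned].to_vec();
    trimmed.extend_from_slice(&messages[messages.len() - keep_recent..]);
    trimmed
}
```

A production version would likely need one more rule: OpenAI-compatible APIs generally expect a tool result to follow the assistant message that requested it, so a cut should not separate the two.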

4. Guardrails

Hard limits on agent behavior that run before every loop iteration:

  • Max iterations: “Do not make more than N tool calls.”
  • Max messages: “Stop if the conversation exceeds M messages.”

Guardrails are composable, deterministic checks. They catch structural failures before the model can waste tokens or go off the rails.
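The two limits above can be sketched as small checks behind a shared trait, which is what makes them composable. The `LoopState`, `Guardrail`, `MaxIterations`, `MaxMessages`, and `check_all` names are assumptions made for illustration and may not match the repo.

```rust
/// Hypothetical snapshot of the loop that guardrails inspect.
pub struct LoopState {
    pub tool_calls: usize,
    pub messages: usize,
}

/// A guardrail is a deterministic check: Ok to continue, Err to stop.
pub trait Guardrail {
    fn check(&self, state: &LoopState) -> Result<(), String>;
}

/// Stops once N tool calls have been made.
pub struct MaxIterations(pub usize);

/// Stops once the conversation exceeds M messages.
pub struct MaxMessages(pub usize);

impl Guardrail for MaxIterations {
    fn check(&self, state: &LoopState) -> Result<(), String> {
        if state.tool_calls >= self.0 {
            Err(format!("reached the limit of {} tool calls", self.0))
        } else {
            Ok(())
        }
    }
}

impl Guardrail for MaxMessages {
    fn check(&self, state: &LoopState) -> Result<(), String> {
        if state.messages > self.0 {
            Err(format!("conversation exceeds {} messages", self.0))
        } else {
            Ok(())
        }
    }
}

/// Runs every guardrail before an iteration; the first failure wins.
pub fn check_all(guardrails: &[Box<dyn Guardrail>], state: &LoopState) -> Result<(), String> {
    for guardrail in guardrails {
        guardrail.check(state)?;
    }
    Ok(())
}
```

Adding a new limit means adding one more type that implements `Guardrail` and pushing it onto the list; the loop itself does not change.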

5. Agent Loop

The outer orchestration loop (src/agent_loop.rs):

while (true):
  call model → parse response
  if answer: return
  if tool_calls: execute each → append results → loop

This is the engine. But the loop alone is not the harness — the harness is everything around the loop.
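In Rust, the pseudocode above might look roughly like the sketch below. It is a simplification, not the code in src/agent_loop.rs: the model and the tool executor are passed in as closures, messages are plain strings, and a bare iteration cap stands in for the guardrails. `ModelResponse` and `run_loop` are hypothetical names.

```rust
/// What the model returned: either a final answer or tools to run.
pub enum ModelResponse {
    Answer(String),
    ToolCalls(Vec<String>),
}

/// Calls the model, runs any requested tools, appends the results,
/// and repeats until the model answers or the cap is reached.
pub fn run_loop(
    mut call_model: impl FnMut(&[String]) -> ModelResponse,
    mut execute_tool: impl FnMut(&str) -> String,
    max_iterations: usize,
) -> Result<String, String> {
    let mut messages: Vec<String> = Vec::new();
    for _ in 0..max_iterations {
        match call_model(&messages) {
            // A final answer ends the loop.
            ModelResponse::Answer(text) => return Ok(text),
            // Otherwise execute each tool and feed the result back.
            ModelResponse::ToolCalls(calls) => {
                for call in calls {
                    let result = execute_tool(&call);
                    messages.push(format!("{call} -> {result}"));
                }
            }
        }
    }
    Err(format!("no answer after {max_iterations} iterations"))
}
```

Unlike a bare `while (true)`, this version always terminates, which is the job the guardrails do in the full harness.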

6. Login Handler

Some tasks require authentication. When a browser agent hits a login page, the harness can detect this and auto-fill credentials. In this repo, src/login_handler.rs checks the current URL after every batch of tool calls. If the URL matches /login or /vote, the handler fills in the Hacker News (HN) login form, submits it, and then injects a synthetic tool event into the trace.
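The handler's control flow can be sketched like this. Everything concrete here is an assumption rather than a copy of src/login_handler.rs: the `Browser` trait, the path-suffix matching rule, the CSS selectors, and the string-based trace are all illustrative, and `FakeBrowser` only records actions so the sketch can run without a real browser.

```rust
/// Hypothetical minimal browser interface used by the handler.
pub trait Browser {
    fn current_url(&self) -> String;
    fn fill(&mut self, selector: &str, value: &str);
    fn click(&mut self, selector: &str);
}

/// Assumed matching rule: ignore the query string and compare the path suffix.
pub fn needs_login(url: &str) -> bool {
    let path = url.split('?').next().unwrap_or(url);
    path.ends_with("/login") || path.ends_with("/vote")
}

/// Runs after each batch of tool calls. Returns true if it logged in.
pub fn handle_login(
    browser: &mut dyn Browser,
    user: &str,
    password: &str,
    trace: &mut Vec<String>,
) -> bool {
    if !needs_login(&browser.current_url()) {
        return false;
    }
    // Selectors are assumptions about the login form's markup.
    browser.fill("input[name='acct']", user);
    browser.fill("input[name='pw']", password);
    browser.click("input[type='submit']");
    // Synthetic event so the trace shows what the harness did on its own.
    trace.push("login_handler: submitted login form".to_string());
    true
}

/// Recording stand-in for a real browser session.
pub struct FakeBrowser {
    pub url: String,
    pub actions: Vec<String>,
}

impl Browser for FakeBrowser {
    fn current_url(&self) -> String {
        self.url.clone()
    }
    fn fill(&mut self, selector: &str, _value: &str) {
        self.actions.push(format!("fill {selector}"));
    }
    fn click(&mut self, selector: &str) {
        self.actions.push(format!("click {selector}"));
    }
}
```

Recording the login in the trace matters because the model never asked for it; without the synthetic event, the page would appear to change for no reason.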

7. Shared State (Upvote Detection)

Tools and guardrails communicate through shared mutable state: Arc<Mutex<Option<UpvotedStory>>>. Tools write to it when they detect an upvote click, and guardrails read from it to stop the loop on success. This replaces the callback hook pattern from the TypeScript version with idiomatic Rust.
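A minimal sketch of this pattern follows. The `Arc<Mutex<Option<UpvotedStory>>>` type comes from the text above, but the fields of `UpvotedStory` and the helper functions `record_upvote` and `upvoted` are assumptions made for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Assumption: the real struct may carry different fields.
#[derive(Debug, Clone, PartialEq)]
pub struct UpvotedStory {
    pub id: u64,
    pub title: String,
}

/// One handle is cloned into the tools and another into the guardrails.
pub type SharedUpvote = Arc<Mutex<Option<UpvotedStory>>>;

/// Tool side: called when a click turns out to be an upvote.
pub fn record_upvote(state: &SharedUpvote, story: UpvotedStory) {
    let mut guard = state.lock().unwrap();
    *guard = Some(story);
}

/// Guardrail side: Some(story) means the task succeeded and the loop can stop.
pub fn upvoted(state: &SharedUpvote) -> Option<UpvotedStory> {
    let guard = state.lock().unwrap();
    guard.clone()
}
```

The `Option` starts as `None` and flips to `Some` at most once per run, so readers only need to ask "has it happened yet?" and no callback registration is required.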

8. Verify Step

A post-hoc check that the intended outcome actually occurred. In a coding agent this would be “run lint, run tests.” In a browser agent this uses the live browser session to check the page DOM — if HN removed the upvote arrow element after the click, the vote was registered.

Guardrails catch structural failures. Verify catches wrong answers. You need both.
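The browser-agent check described above can be sketched as a function over a minimal page interface. The `Page` trait, the `Verdict` enum, and the `#up_<id>` selector are assumptions, not details confirmed by the repo; `FakePage` stands in for the live session.

```rust
/// Hypothetical read-only view of the live page.
pub trait Page {
    fn element_exists(&self, selector: &str) -> bool;
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Verified,
    Failed(String),
}

/// Treats a missing upvote arrow as proof that the vote was registered.
pub fn verify_upvote(page: &dyn Page, story_id: u64) -> Verdict {
    // Assumed selector for the story's upvote arrow.
    let arrow = format!("#up_{story_id}");
    if page.element_exists(&arrow) {
        Verdict::Failed(format!("upvote arrow {arrow} is still present"))
    } else {
        Verdict::Verified
    }
}

/// Stand-in page that knows which selectors are present.
pub struct FakePage {
    pub present: Vec<String>,
}

impl Page for FakePage {
    fn element_exists(&self, selector: &str) -> bool {
        self.present.iter().any(|s| s == selector)
    }
}
```

A check based on absence is only meaningful on the right page, so a sturdier version would first confirm that the story itself is still visible before trusting a missing arrow.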

Putting It Together

runHarness()
  ├── create shared state
  ├── create guardrails (bound to state)
  │
  ├── attempt 1..MAX_ATTEMPTS:
  │   ├── open environment
  │   ├── create tools (bound to environment + state)
  │   ├── create login handler (bound to environment)
  │   ├── build initial context
  │   ├── runLoop(tools, context, guardrails, login handler)
  │   │     ├── trim context
  │   │     ├── check guardrails
  │   │     ├── call model
  │   │     ├── execute tools or return answer
  │   │     ├── run login handler
  │   │     └── log trace iteration
  │   ├── verify result from trace
  │   ├── if verified: return success
  │   └── close environment (always)
  │
  └── return result (with verification)

This architecture is adapted from the Rust source files at the project root.
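The retry-and-verify shell of the tree above can be sketched generically. This is an illustration under stated assumptions, not the repo's harness: `run_harness`, `Outcome`, and the closure parameters are hypothetical, and shared state, guardrails, and the login handler are assumed to live inside the `run_loop` closure.

```rust
/// Final result of the harness, including whether verification passed.
#[derive(Debug, PartialEq)]
pub struct Outcome {
    pub answer: String,
    pub verified: bool,
    pub attempts: usize,
}

/// Opens a fresh environment per attempt, runs the loop, verifies the
/// result, and closes the environment whether or not verification passed.
pub fn run_harness<E>(
    max_attempts: usize,
    mut open: impl FnMut() -> E,
    mut run_loop: impl FnMut(&mut E) -> String,
    mut verify: impl FnMut(&mut E, &str) -> bool,
    mut close: impl FnMut(E),
) -> Outcome {
    let mut last_answer = String::new();
    for attempt in 1..=max_attempts {
        let mut env = open();
        let answer = run_loop(&mut env);
        // Verify while the environment is still open.
        let verified = verify(&mut env, &answer);
        close(env);
        if verified {
            return Outcome { answer, verified: true, attempts: attempt };
        }
        last_answer = answer;
    }
    Outcome { answer: last_answer, verified: false, attempts: max_attempts }
}
```

This sketch closes the environment on every normal path but not if a closure panics; wrapping the environment in a type that implements `Drop` would cover that case as well.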


  1. “Harnesses in AI: A Deep Dive” — Tejas Kumar, AI Engineer World’s Fair, May 2026