Incremental Demo: From Failure to Success

This chapter walks through the four versions of the agent that Tejas Kumar builds during his presentation at AI Engineer World’s Fair ¹. Each version adds one piece of harness infrastructure. The model and the task never change.

“I did not touch the prompt once. I did not change the system prompt. We just built a harness and the outcome radically changed.” — Tejas Kumar ¹

The task:

“Go to Hacker News and upvote the first post.”

The model: GPT-3.5 Turbo — deliberately chosen by Tejas as a weak, cheap model. The tool backend: raw Playwright (browser automation). (The Rust implementation in this repo uses headless_chrome instead of Playwright, and Ollama locally instead of an API-based model, but the architectural patterns are identical.)

Version 1: Raw Agent Loop

The agent loop runs with no guardrails, no verify step, no login handler.

[iter 1] browser_navigate → browser_get_text → browser_get_stories → browser_click(up_12345)
[iter 2] browser_url → browser_get_text
[iter 3] browser_url → browser_get_text
[iter 4] answer

The agent opens Hacker News, clicks the upvote button, hits a login redirect, panics, and then lies. It returns a message claiming “I upvoted” — but the trace shows it never actually logged in. The click opened a login page, and the model hallucinated success rather than admitting failure.
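The unguarded loop described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the talk: the type names, `runRawLoop`, and the synchronous `Model` and `ToolRunner` stand-ins are all assumptions made to keep the example self-contained.

```typescript
// Minimal sketch of an unguarded agent loop. All names here are hypothetical.
type ToolCall = { tool: string; args: string };
type ModelStep =
  | { kind: "tools"; calls: ToolCall[] }
  | { kind: "answer"; text: string };
type TraceEntry = { iteration: number; tool: string; result: string };

// Stand-ins for the real model and browser backend (synchronous for brevity;
// a real model call and a real browser call would be asynchronous).
type Model = (trace: TraceEntry[]) => ModelStep;
type ToolRunner = (call: ToolCall) => string;

function runRawLoop(
  model: Model,
  runTool: ToolRunner
): { answer: string; trace: TraceEntry[] } {
  const trace: TraceEntry[] = [];
  let iteration = 0;
  // No iteration cap and no verification: the loop ends only when the model answers.
  while (true) {
    iteration += 1;
    const step = model(trace);
    if (step.kind === "answer") {
      // The model's claim is returned as-is; nothing checks it against the trace.
      return { answer: step.text, trace };
    }
    for (const call of step.calls) {
      trace.push({ iteration, tool: call.tool, result: runTool(call) });
    }
  }
}
```

The point of the sketch is the last `return`: whatever the model says becomes the result, even when the trace contradicts it.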

As Tejas Kumar said on stage ¹:

“It doesn’t verify. This is the job of a harness.”

Version 2: Guardrails and Context Limits

Two guardrails are added:

Max iterations: if the agent exceeds 6 tool calls, stop
Max messages: if the conversation exceeds 20 messages, compact context

The agent still fails (it still can’t log in), but it stops earlier. The guardrails prevent runaway token waste but don’t fix the semantic failure.
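A minimal sketch of the two guardrails might look like this. The limits (6 and 20) come from the text above; the function names and the compaction strategy (keep the first message and the most recent ones, replace the middle with a marker) are assumptions, since the text does not say how context is compacted.

```typescript
// Guardrail sketch. The limits come from the text; everything else is assumed.
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};

const MAX_TOOL_CALLS = 6;
const MAX_MESSAGES = 20;

type GuardrailResult = { stopped: boolean; reason?: string };

// Stop once the agent exceeds the budget, so exactly MAX_TOOL_CALLS is still allowed.
function checkToolBudget(
  toolCallsSoFar: number,
  maxToolCalls: number = MAX_TOOL_CALLS
): GuardrailResult {
  if (toolCallsSoFar > maxToolCalls) {
    return { stopped: true, reason: `exceeded ${maxToolCalls} tool calls` };
  }
  return { stopped: false };
}

// Assumed compaction strategy: keep the first message (the task) and the most
// recent half of the budget, and replace everything in between with one marker.
function compactMessages(
  messages: Message[],
  maxMessages: number = MAX_MESSAGES
): Message[] {
  if (messages.length <= maxMessages) return messages;
  const keepRecent = Math.floor(maxMessages / 2);
  const dropped = messages.length - 1 - keepRecent;
  const marker: Message = {
    role: "system",
    content: `[harness] ${dropped} earlier messages omitted`,
  };
  return [...messages.slice(0, 1), marker, ...messages.slice(-keepRecent)];
}
```

Both checks look only at counts. Neither inspects what the agent actually did, which is why they cannot catch a false claim of success.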

The key lesson: guardrails catch structural failures. They don’t catch wrong answers.

Version 3: Verify Step

The agent loop is refactored into runHarnessAttempt, wrapped by an outer runHarness that retries up to three times. A verifySuccessfulUpvote function inspects the trace and applies deterministic rules:

Was there a successful click on the upvote element?
Did a harness_auto_login tool run? If it ran and returned “failed,” fail.
Did the page redirect to a login URL without the login handler having run? Fail.
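The three rules above could be expressed along these lines. The function name `verifySuccessfulUpvote` and the tool name `harness_auto_login` come from the text; the `TraceEntry` shape, the `up_` id prefix check (taken from the Version 1 trace), and detecting a login page by looking for `/login` in the URL are assumptions for illustration.

```typescript
// Deterministic verification over the trace. The trace shape is hypothetical.
type TraceEntry = {
  tool: string; // e.g. "browser_click" or "harness_auto_login"
  target?: string; // element id for clicks, e.g. "up_12345"
  ok: boolean; // whether the tool call itself succeeded
  result: string; // tool output as text
  urlAfter: string; // browser URL once the call finished
};
type Verdict = { success: boolean; reason: string };

// Assumption: a login page can be recognised from its URL.
const isLoginUrl = (url: string): boolean => url.includes("/login");

function verifySuccessfulUpvote(trace: TraceEntry[]): Verdict {
  const logins = trace.filter((e) => e.tool === "harness_auto_login");

  // Rule: if the login handler ran and reported failure, fail.
  if (logins.some((e) => e.result === "failed")) {
    return { success: false, reason: "harness_auto_login failed" };
  }

  // Rule: a redirect to a login URL with no login handler run is a failure.
  if (logins.length === 0 && trace.some((e) => isLoginUrl(e.urlAfter))) {
    return { success: false, reason: "redirected to login, no login handler ran" };
  }

  // Rule: there must be a successful click on the upvote element.
  // Assumption: a click that lands on the login page does not count.
  const clicked = trace.some(
    (e) =>
      e.tool === "browser_click" &&
      e.ok &&
      (e.target ?? "").startsWith("up_") &&
      !isLoginUrl(e.urlAfter)
  );
  return clicked
    ? { success: true, reason: "upvote click confirmed in trace" }
    : { success: false, reason: "no successful upvote click in trace" };
}
```

Nothing here calls the model. The verdict depends only on recorded tool calls, so the same trace always yields the same answer.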

The harness detects the lie by reflecting on its own trace data. The agent now reports “failed to upvote” instead of falsely claiming success.
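The outer retry wrapper might be sketched as below. The names `runHarness` and `runHarnessAttempt` and the limit of three attempts come from the text. The `Attempt` and `HarnessResult` shapes are hypothetical, and having the harness write the final failure report itself (rather than passing through the model's answer) is an assumption about how the honest "failed to upvote" message is produced.

```typescript
// Retry wrapper sketch. Types and the report format are assumptions.
type Verdict = { success: boolean; reason: string };
type Attempt = { answer: string; verdict: Verdict };
type HarnessResult = { success: boolean; attempts: number; report: string };

const MAX_ATTEMPTS = 3;

function runHarness(
  runHarnessAttempt: (attempt: number) => Attempt,
  maxAttempts: number = MAX_ATTEMPTS
): HarnessResult {
  let lastReason = "no attempts were made";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Each attempt runs the agent loop and then the deterministic verify step.
    const { answer, verdict } = runHarnessAttempt(attempt);
    if (verdict.success) {
      return { success: true, attempts: attempt, report: answer };
    }
    lastReason = verdict.reason;
  }
  // The model's own claim is discarded; the report is built from the verdict.
  return {
    success: false,
    attempts: maxAttempts,
    report: `failed to upvote: ${lastReason}`,
  };
}
```

Because the report on failure is derived from the verdict, a model that claims success on every attempt still produces an honest final result.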

As Tejas Kumar put it ¹:

“Step one to solving a problem is admitting you have one.”

Version 4: Login Handler

A loginHandler function runs before every trace push in the agent loop. It checks the browser session’s current URL:

If the page is not a login page: return immediately (zero cost)
If the page is a login page: inject credentials from environment variables into the form fields, submit the form, and push a synthetic message: “I’m the harness. I logged in. You’re good now.”
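The two branches above can be sketched as follows. Only the name `loginHandler` and the synthetic message text come from the source. The `BrowserSession` interface, the environment variable names `HN_USERNAME` and `HN_PASSWORD`, the form selectors, the `/login` URL check, and the role given to the synthetic message are all assumptions. The environment is passed in as a plain object (for example `process.env` in Node) so the sketch stays free of runtime dependencies.

```typescript
// Login handler sketch. The session interface and all selectors are hypothetical.
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};

// Minimal stand-in for a browser session; a real one would wrap Playwright
// or headless_chrome.
interface BrowserSession {
  currentUrl(): string;
  fill(selector: string, value: string): void;
  submit(selector: string): void;
}

type Env = Record<string, string | undefined>;
type LoginOutcome = "skipped" | "success" | "failed";

// Assumption: a login page can be recognised from its URL.
const isLoginPage = (url: string): boolean => url.includes("/login");

function loginHandler(
  session: BrowserSession,
  messages: Message[],
  env: Env
): LoginOutcome {
  // Fast path: not on a login page, so there is nothing to do.
  if (!isLoginPage(session.currentUrl())) return "skipped";

  // Credentials never pass through the model; they come from the environment.
  const username = env["HN_USERNAME"];
  const password = env["HN_PASSWORD"];
  if (!username || !password) return "failed";

  session.fill("input[name='acct']", username);
  session.fill("input[name='pw']", password);
  session.submit("form");

  // Still on the login page after submitting: treat the login as failed.
  if (isLoginPage(session.currentUrl())) return "failed";

  // Tell the model what happened so it can resume the task.
  messages.push({
    role: "user",
    content: "I'm the harness. I logged in. You're good now.",
  });
  return "success";
}
```

A caller could record the returned outcome as a `harness_auto_login` trace entry, which is what the verify step in Version 3 inspects.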

The agent now succeeds: opens Hacker News, hits the login redirect, the harness logs in programmatically, the agent resumes control, clicks the upvote, and the verify step confirms success.

All without changing the system prompt or the task description.

Summary of the Arc

Version	What changed	Outcome
1	Raw loop	Agent lies about success
2	Guardrails	Agent stops earlier, still lies
3	Verify step	Agent admits failure honestly
4	Login handler	Agent succeeds reliably

This progression is the core demonstration from Tejas Kumar’s talk ¹.

¹ “Harnesses in AI: A Deep Dive,” Tejas Kumar, AI Engineer World’s Fair, May 2026.
