Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Incremental Demo: From Failure to Success

This chapter walks through the four versions of the agent that Tejas Kumar builds during his presentation at AI Engineer World’s Fair 1. Each version adds one piece of harness infrastructure. The model and the task never change.

“I did not touch the prompt once. I did not change the system prompt. We just built a harness and the outcome radically changed.” — Tejas Kumar 1

The task:

“Go to Hacker News and upvote the first post.”

The model: GPT-3.5 Turbo — deliberately chosen by Tejas as a weak, cheap model. The tool backend: raw Playwright (browser automation). (The Rust implementation in this repo uses headless_chrome instead of Playwright, and Ollama locally instead of an API-based model, but the architectural patterns are identical.)

Version 1: Raw Agent Loop

The agent loop runs with no guardrails, no verify step, no login handler.

[iter 1] browser_navigate → browser_get_text → browser_get_stories → browser_click(up_12345)
[iter 2] browser_url → browser_get_text
[iter 3] browser_url → browser_get_text
[iter 4] answer

The agent opens Hacker News, clicks the upvote button, hits a login redirect, panics, and then lies. It returns a message claiming “I upvoted” — but the trace shows it never actually logged in. The click opened a login page, and the model hallucinated success rather than admitting failure.

As Tejas Kumar said on stage 1:

“It doesn’t verify. This is the job of a harness.”

Version 2: Guardrails and Context Limits

Two guardrails are added:

  1. Max iterations: if the agent exceeds 6 tool calls, stop
  2. Max messages: if the conversation exceeds 20 messages, compact context

The agent still fails (it still can’t log in), but it stops earlier. The guardrails prevent runaway token waste but don’t fix the semantic failure.

The key lesson: guardrails catch structural failures. They don’t catch wrong answers.

Version 3: Verify Step

The agent loop is refactored into runHarnessAttempt, wrapped by an outer runHarness that retries up to three times. A verifySuccessfulUpvote function inspects the trace and applies deterministic rules:

  • Was there a successful click on the upvote element?
  • Did a harness_auto_login tool run? If it ran and returned “failed,” fail.
  • Did the page redirect to a login URL without the login handler having run? Fail.

The harness detects the lie by reflecting on its own trace data. The agent now reports “failed to upvote” instead of falsely claiming success.

As Tejas Kumar put it 1:

“Step one to solving a problem is admitting you have one.”

Version 4: Login Handler

A loginHandler function runs before every trace push in the agent loop. It checks the browser session’s current URL:

  • If the page is not a login page: return immediately (zero cost)
  • If the page is a login page: inject credentials into the form fields from environment variables, submit the form, push a synthetic message: “I’m the harness. I logged in. You’re good now.”

The agent now succeeds: opens Hacker News, hits the login redirect, the harness logs in programmatically, the agent resumes control, clicks the upvote, and the verify step confirms success.

All without changing the system prompt or the task description.

Summary of the Arc

VersionWhat changedOutcome
1Raw loopAgent lies about success
2GuardrailsAgent stops earlier, still lies
3Verify stepAgent admits failure honestly
4Login handlerAgent succeeds reliably

This progression is the core demonstration from Tejas Kumar’s talk 1.


  1. “Harnesses in AI: A Deep Dive” — Tejas Kumar, AI Engineer World’s Fair, May 2026 ↩2 ↩3 ↩4 ↩5