Incremental Demo: From Failure to Success
This chapter walks through the four versions of the agent that Tejas Kumar builds during his presentation at AI Engineer World’s Fair 1. Each version adds one piece of harness infrastructure. The model and the task never change.
“I did not touch the prompt once. I did not change the system prompt. We just built a harness and the outcome radically changed.” — Tejas Kumar 1
The task:
“Go to Hacker News and upvote the first post.”
The model: GPT-3.5 Turbo — deliberately chosen by Tejas as a weak, cheap
model. The tool backend: raw Playwright (browser automation). (The Rust
implementation in this repo uses headless_chrome instead of Playwright, and
Ollama locally instead of an API-based model, but the architectural patterns
are identical.)
Version 1: Raw Agent Loop
The agent loop runs with no guardrails, no verify step, no login handler.
[iter 1] browser_navigate → browser_get_text → browser_get_stories → browser_click(up_12345)
[iter 2] browser_url → browser_get_text
[iter 3] browser_url → browser_get_text
[iter 4] answer
The agent opens Hacker News, clicks the upvote button, hits a login redirect, panics, and then lies. It returns a message claiming “I upvoted” — but the trace shows it never actually logged in. The click opened a login page, and the model hallucinated success rather than admitting failure.
As Tejas Kumar said on stage 1:
“It doesn’t verify. This is the job of a harness.”
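A minimal sketch of such an unguarded loop follows. It is not the code from the talk or the repo: the type names, the runRawLoop function, and the synchronous model and tool interfaces are all assumptions made for illustration (a real Playwright-backed loop would be async). The point it shows is that the loop returns whatever the model claims, with nothing checking the claim against the trace.

```typescript
// Hypothetical types for illustration; none of these names come from the talk.
type ToolCall = { tool: string; arg?: string };
type ModelTurn = { toolCalls: ToolCall[] } | { answer: string };
type TraceEntry = { iter: number; tool: string; result: string };

// A model maps the trace so far to its next turn.
type Model = (trace: TraceEntry[]) => ModelTurn;
// A tool executor runs one call and returns what the browser reports.
type ToolExecutor = (call: ToolCall) => string;

function runRawLoop(
  model: Model,
  execute: ToolExecutor,
): { answer: string; trace: TraceEntry[] } {
  const trace: TraceEntry[] = [];
  // No iteration cap: the loop only ends when the model decides to answer.
  for (let iter = 1; ; iter++) {
    const turn = model(trace);
    if ("answer" in turn) {
      // No verification: the model's claim is returned as-is.
      return { answer: turn.answer, trace };
    }
    for (const call of turn.toolCalls) {
      trace.push({ iter, tool: call.tool, result: execute(call) });
    }
  }
}
```

Feeding this loop a scripted model that ends with “I upvoted” reproduces the failure: the answer comes back unchanged even when the trace shows the session stuck on a login page.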
Version 2: Guardrails and Context Limits
Two guardrails are added:
- Max iterations: if the agent exceeds 6 tool calls, stop
- Max messages: if the conversation exceeds 20 messages, compact context
The agent still fails (it still can’t log in), but it stops earlier. The guardrails prevent runaway token waste but don’t fix the semantic failure.
The key lesson: guardrails catch structural failures. They don’t catch wrong answers.
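The two guardrails can be sketched as follows. The limits of 6 tool calls and 20 messages come from the text above; everything else is an assumption, in particular the compaction strategy (keep the first message, keep the most recent few, replace the middle with a one-line placeholder) and the names exceedsToolBudget and compactMessages. A real harness might summarize the dropped messages instead.

```typescript
// Limits taken from the text; the rest is a hypothetical sketch.
const MAX_TOOL_CALLS = 6;
const MAX_MESSAGES = 20;

type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};

// Guardrail 1: stop once the agent has exceeded its tool-call budget.
function exceedsToolBudget(toolCallCount: number): boolean {
  return toolCallCount > MAX_TOOL_CALLS;
}

// Guardrail 2: compact the conversation once it exceeds MAX_MESSAGES.
// Assumed strategy: keep the first message (system prompt or task) and the
// last `keepRecent` messages. Assumes 0 <= keepRecent < MAX_MESSAGES.
function compactMessages(messages: Message[], keepRecent = 8): Message[] {
  if (messages.length <= MAX_MESSAGES) return messages;
  const head = messages.slice(0, 1);
  const tail = messages.slice(messages.length - keepRecent);
  const dropped = messages.length - head.length - tail.length;
  const note: Message = {
    role: "system",
    content: `[harness] ${dropped} earlier messages compacted`,
  };
  return [...head, note, ...tail];
}
```

Both checks look only at counts, never at content, which is why they can bound cost but cannot tell a true answer from a false one.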
Version 3: Verify Step
The agent loop is refactored into runHarnessAttempt, wrapped by an outer
runHarness that retries up to three times. A verifySuccessfulUpvote function
inspects the trace and applies deterministic rules:
- Was there a successful click on the upvote element?
- Did a harness_auto_login tool run? If it ran and returned “failed,” fail.
- Did the page redirect to a login URL without the login handler having run? Fail.
The harness detects the lie by reflecting on its own trace data. The agent now reports “failed to upvote” instead of falsely claiming success.
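The three rules can be sketched as a pure function over the trace. The tool names browser_click, browser_url, and harness_auto_login and the up_ element prefix appear in this chapter; the trace entry shape, the "ok" result string, and the "/login" URL test are assumptions made for this sketch, not details confirmed by the talk.

```typescript
// Hypothetical trace shape; the result strings are assumptions.
type TraceEntry = { tool: string; arg?: string; result: string };
type Verdict = { ok: true } | { ok: false; reason: string };

// Assumption: a login page is recognised by "/login" in the reported URL.
const looksLikeLogin = (url: string): boolean => url.includes("/login");

function verifySuccessfulUpvote(trace: TraceEntry[]): Verdict {
  const logins = trace.filter((e) => e.tool === "harness_auto_login");

  // Rule: if the login handler ran and reported failure, fail.
  if (logins.some((e) => e.result === "failed")) {
    return { ok: false, reason: "auto-login failed" };
  }

  // Rule: a login redirect with no login handler run is a failure.
  const hitLogin = trace.some(
    (e) => e.tool === "browser_url" && looksLikeLogin(e.result),
  );
  if (hitLogin && logins.length === 0) {
    return { ok: false, reason: "redirected to login, handler never ran" };
  }

  // Rule: there must be a successful click on an upvote element.
  const clicked = trace.some(
    (e) =>
      e.tool === "browser_click" &&
      (e.arg ?? "").startsWith("up_") &&
      e.result === "ok",
  );
  if (!clicked) return { ok: false, reason: "no successful upvote click" };

  return { ok: true };
}
```

Because the function reads only recorded tool results and never the model's final message, a hallucinated “I upvoted” has no way to influence the verdict.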
As Tejas Kumar put it 1:
“Step one to solving a problem is admitting you have one.”
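The outer retry wrapper described at the start of this section might look like the sketch below. The limit of three attempts comes from the text; the signature, the result shape, and passing the attempt and verify functions in as parameters are assumptions chosen to keep the example self-contained and synchronous.

```typescript
type Verdict = { ok: true } | { ok: false; reason: string };
// Hypothetical shapes; T stands for whatever trace type the harness records.
type Attempt<T> = { trace: T; answer: string };
type HarnessResult = { success: boolean; attempts: number; message: string };

function runHarness<T>(
  runHarnessAttempt: (attemptNo: number) => Attempt<T>,
  verify: (trace: T) => Verdict,
  maxAttempts = 3,
): HarnessResult {
  let lastReason = "no attempts made";
  for (let n = 1; n <= maxAttempts; n++) {
    const { trace, answer } = runHarnessAttempt(n);
    const verdict = verify(trace);
    // Only a verified attempt may report the model's answer as success.
    if (verdict.ok) return { success: true, attempts: n, message: answer };
    lastReason = verdict.reason;
  }
  // The harness, not the model, writes the failure message.
  return {
    success: false,
    attempts: maxAttempts,
    message: `failed to upvote: ${lastReason}`,
  };
}
```

The design choice worth noting is that the final message on failure is produced by the harness from the verifier's reason, so the honest report does not depend on the model choosing to be honest.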
Version 4: Login Handler
A loginHandler function runs before every trace push in the agent loop. It
checks the browser session’s current URL:
- If the page is not a login page: return immediately (zero cost)
- If the page is a login page: inject credentials into the form fields from environment variables, submit the form, push a synthetic message: “I’m the harness. I logged in. You’re good now.”
The agent now succeeds: opens Hacker News, hits the login redirect, the harness logs in programmatically, the agent resumes control, clicks the upvote, and the verify step confirms success.
All without changing the system prompt or the task description.
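A sketch of such a handler is shown below, under several stated assumptions: the BrowserSession interface is a hypothetical stand-in for a real Playwright or headless_chrome session (which would be async), the "/login" URL test and the form selectors are placeholders, and the environment variable names HN_USERNAME and HN_PASSWORD are invented for this example. The environment is passed in as a plain record so the sketch does not depend on any particular runtime.

```typescript
// Hypothetical minimal browser interface, for illustration only.
interface BrowserSession {
  currentUrl(): string;
  fill(selector: string, value: string): void;
  submit(selector: string): void;
}

type Message = { role: "user" | "assistant" | "tool"; content: string };
type Env = Record<string, string | undefined>;
type LoginOutcome = "skipped" | "logged_in" | "failed";

function loginHandler(
  session: BrowserSession,
  messages: Message[],
  env: Env,
): LoginOutcome {
  // Fast path: not a login page, so return immediately.
  // Assumption: login pages are recognised by "/login" in the URL.
  if (!session.currentUrl().includes("/login")) return "skipped";

  // Credentials come from the environment, never from the model.
  const user = env["HN_USERNAME"]; // hypothetical variable name
  const pass = env["HN_PASSWORD"]; // hypothetical variable name
  if (!user || !pass) return "failed";

  // Placeholder selectors; a real page needs its own.
  session.fill("#username", user);
  session.fill("#password", pass);
  session.submit("form");

  // Synthetic message telling the model it can carry on.
  messages.push({
    role: "user",
    content: "I'm the harness. I logged in. You're good now.",
  });
  return "logged_in";
}
```

Returning an explicit outcome lets the harness record the result in the trace, where a verify step can treat "failed" as grounds to reject the attempt.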
Summary of the Arc
| Version | What changed | Outcome |
|---|---|---|
| 1 | Raw loop | Agent lies about success |
| 2 | Guardrails | Agent stops earlier, still lies |
| 3 | Verify step | Agent admits failure honestly |
| 4 | Login handler | Agent succeeds reliably |
This progression is the core demonstration from Tejas Kumar’s talk 1.