Setup and Running
Prerequisites
Installation
git clone <your-repo-url>
cd basic-harness
cargo build
Running the Eval
cargo run --bin eval
Or using the justfile:
just eval
This runs one or more models against a fixed dataset and prints results per test case: pass/fail, trap detection, and latency.
Running the Agent Harness
cargo run --bin agent
Or using the justfile:
just agent
This opens a Chromium window (via Chrome DevTools Protocol), navigates to Hacker News, and attempts to upvote the top story using the local Ollama model.
Swapping Models
Edit src/harness.rs (for the agent) or src/bin/eval.rs (for the
eval) and change the MODEL constant:
const MODEL: &str = "gemma4:e4b";
Any model that has been pulled into your local Ollama instance works.
Configuration
Copy .env.example to .env and set your Hacker News credentials:
HN_USER=your_username
HN_PASS=your_password
The login handler reads these at startup and uses them to auto-fill the HN
login form when the agent is redirected to /login or /vote.
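As a rough illustration of the behavior described above, the sketch below reads the two variables and checks whether a path is one of the redirect targets. It is not the harness's actual login handler: `HnCredentials`, `credentials_from`, `load_credentials`, and `is_login_redirect` are hypothetical names, and loading `.env` into the process environment is assumed to happen elsewhere.

```rust
use std::env;

/// Hypothetical holder for the two values set in `.env`.
#[derive(Debug, Clone, PartialEq)]
struct HnCredentials {
    user: String,
    pass: String,
}

/// Builds credentials from optional values.
/// Returns None if either value is missing or empty.
fn credentials_from(user: Option<String>, pass: Option<String>) -> Option<HnCredentials> {
    match (user, pass) {
        (Some(user), Some(pass)) if !user.is_empty() && !pass.is_empty() => {
            Some(HnCredentials { user, pass })
        }
        _ => None,
    }
}

/// Reads HN_USER and HN_PASS from the process environment.
/// Assumes the `.env` file has already been loaded into the environment.
fn load_credentials() -> Option<HnCredentials> {
    credentials_from(env::var("HN_USER").ok(), env::var("HN_PASS").ok())
}

/// Returns true when the path is one of the pages named above
/// (/login or /vote), ignoring any query string.
fn is_login_redirect(path: &str) -> bool {
    let path = path.split('?').next().unwrap_or(path);
    matches!(path, "/login" | "/vote")
}
```

Returning `None` when a variable is missing or empty lets a caller skip the auto-fill step instead of submitting a blank form.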
Optional environment variables:
# Default; change this to point at a remote Ollama instance
OLLAMA_URL=http://localhost:11434
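One way an optional variable with a default can be resolved is sketched below. This is an assumption about how such a lookup might be written, not the harness's own code: `DEFAULT_OLLAMA_URL`, `resolve_ollama_url`, and `ollama_url` are hypothetical names, and only the variable name and default value come from the text above.

```rust
use std::env;

/// Default Ollama endpoint, matching the documented default.
const DEFAULT_OLLAMA_URL: &str = "http://localhost:11434";

/// Picks the configured URL when one is present and non-empty,
/// otherwise falls back to the default. Surrounding whitespace and
/// trailing slashes are trimmed so callers can append API paths.
fn resolve_ollama_url(configured: Option<String>) -> String {
    configured
        .map(|v| v.trim().trim_end_matches('/').to_string())
        .filter(|v| !v.is_empty())
        .unwrap_or_else(|| DEFAULT_OLLAMA_URL.to_string())
}

/// Reads OLLAMA_URL from the process environment.
fn ollama_url() -> String {
    resolve_ollama_url(env::var("OLLAMA_URL").ok())
}
```

Keeping the fallback logic in `resolve_ollama_url`, separate from the environment read, makes it possible to test the logic without mutating process-wide state.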