Setup and Running

Prerequisites

Rust (latest stable)
Ollama running locally with at least one model pulled

Installation

git clone <your-repo-url>
cd basic-harness
cargo build

Running the Eval

cargo run --bin eval

Or using the justfile:

just eval

This runs one or more models against a fixed dataset and prints results per test case: pass/fail, trap detection, and latency.
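
As a rough illustration of the per-case output described above, the sketch below formats one result line. The CaseResult struct, its field names, and the line layout are hypothetical and are not taken from src/bin/eval.rs; the real eval may record and print these values differently.

```rust
use std::time::Duration;

/// Hypothetical record for one test case (names are assumptions,
/// not the actual types used by the eval binary).
pub struct CaseResult {
    pub name: String,
    pub passed: bool,
    pub trap_detected: bool,
    pub latency: Duration,
}

/// Formats a single result as one line: pass/fail, trap detection, latency.
pub fn format_result(r: &CaseResult) -> String {
    format!(
        "{}: {} | trap detected: {} | {} ms",
        r.name,
        if r.passed { "PASS" } else { "FAIL" },
        if r.trap_detected { "yes" } else { "no" },
        r.latency.as_millis()
    )
}
```

Keeping the formatting in a function that returns a String, rather than printing directly, makes the output easy to check in a unit test.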

Running the Agent Harness

cargo run --bin agent

Or using the justfile:

just agent

This opens a Chromium window (controlled via the Chrome DevTools Protocol), navigates to Hacker News, and attempts to upvote the top story using the local Ollama model.

Swapping Models

Edit src/harness.rs (for the agent) or src/bin/eval.rs (for the eval) and change the MODEL constant:

const MODEL: &str = "gemma4:e4b";

Any model already pulled into your local Ollama installation works; use the name exactly as ollama list shows it.

Configuration

Copy .env.example to .env and set your Hacker News credentials:

HN_USER=your_username
HN_PASS=your_password

The login handler reads these at startup and uses them to auto-fill the HN login form when the agent is redirected to /login or /vote.
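
The sketch below shows one way such a handler could pick up the credentials and decide when to auto-fill. It is not the project's actual login handler: HnCredentials, credentials_from, load_credentials, and needs_login are hypothetical names, and the code assumes the values from .env have already been exported into the process environment (for example by a dotenv-style loader), since the standard library does not read .env files itself.

```rust
use std::env;

/// Hypothetical container for the Hacker News credentials.
#[derive(Debug, Clone, PartialEq)]
pub struct HnCredentials {
    pub user: String,
    pub pass: String,
}

/// Builds credentials from any key lookup. Returns None if HN_USER or
/// HN_PASS is missing or empty.
pub fn credentials_from<F>(lookup: F) -> Option<HnCredentials>
where
    F: Fn(&str) -> Option<String>,
{
    let user = lookup("HN_USER").filter(|s| !s.is_empty())?;
    let pass = lookup("HN_PASS").filter(|s| !s.is_empty())?;
    Some(HnCredentials { user, pass })
}

/// Reads the credentials from the process environment at startup.
pub fn load_credentials() -> Option<HnCredentials> {
    credentials_from(|key| env::var(key).ok())
}

/// Returns true when the path (query string ignored) is /login or /vote,
/// the two pages named above as triggering the auto-fill.
pub fn needs_login(path_and_query: &str) -> bool {
    let path = path_and_query.split('?').next().unwrap_or("");
    path == "/login" || path == "/vote"
}
```

Taking the lookup as a closure keeps the parsing logic testable without touching the real environment; at startup you would call load_credentials() once and keep the result for later redirects.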

Optional environment variables:

OLLAMA_URL=http://localhost:11434   # default, change for remote Ollama
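
A minimal sketch of how the default could be applied when OLLAMA_URL is unset. The function and constant names are hypothetical, and treating a blank value as unset and trimming a trailing slash are assumptions of this sketch rather than documented behavior of the harness.

```rust
use std::env;

/// Default stated in the configuration above.
pub const DEFAULT_OLLAMA_URL: &str = "http://localhost:11434";

/// Picks the configured URL if present and non-blank, otherwise the default.
/// Trailing slashes are trimmed so endpoint paths can be appended safely
/// (an assumption of this sketch).
pub fn resolve_ollama_url(configured: Option<String>) -> String {
    configured
        .map(|s| s.trim().trim_end_matches('/').to_string())
        .filter(|s| !s.is_empty())
        .unwrap_or_else(|| DEFAULT_OLLAMA_URL.to_string())
}

/// Reads OLLAMA_URL from the process environment.
pub fn ollama_url() -> String {
    resolve_ollama_url(env::var("OLLAMA_URL").ok())
}
```

With OLLAMA_URL unset, ollama_url() falls back to the local default; pointing it at a remote host only requires setting the variable in .env or the shell.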
