LLM Harness
An AI agent orchestration framework focused on the safe execution of LLM-integrated tooling.
What is LLM Harness?
LLM Harness is a planned Rust-based project designed to provide a secure framework for running LLM-powered automation and tooling. The goal is to create a robust, auditable, and safe alternative to existing solutions.
Security First
The emphasis on security addresses key concerns:
- Sandboxed execution - Code runs in isolated environments
- Audit logging - All operations are traceable and reviewable
- Permission boundaries - Fine-grained access controls
- Transparency - Open source and verifiable
Documentation Structure
This book is organized into:
- Planning - Project goals, requirements, and constraints
- Architecture - System design and security models
- Tech Stack - Framework evaluation and decisions
- Roadmap - Implementation phases and milestones
Getting Started
For now, this documentation serves as the planning phase. Implementation will follow once architecture and tech stack decisions are finalized.
Note: This documentation is written with mdbook compatibility in mind. To render locally:
cd docs
mdbook serve
Project Overview
Concept
LLM Harness provides a secure runtime environment for LLM-integrated tooling, allowing users to:
- Execute AI-generated code safely
- Integrate LLM capabilities into workflows
- Maintain audit trails of all operations
- Control permissions at granular levels
Inspiration
Existing AI agent tools demonstrate the power of LLM-driven automation but often with insufficient security controls. LLM Harness aims to:
- Preserve the utility of LLM-powered tooling
- Add layers of security and safety
- Provide transparency through open source
- Enable enterprise and individual use with confidence
Target Use Cases
- Development automation - Safe code generation, testing, and refactoring
- Workflow orchestration - LLM-driven task automation with guardrails
- Research environments - Sandboxed LLM experimentation
- Enterprise integration - Secure LLM tooling in regulated environments
Core Principles
- Security first - Every feature evaluated against security impact
- Zero trust - Assume compromise, minimize blast radius
- Composability - Modular design for flexibility
- Performance - Rust’s safety without sacrificing speed
Goals and Requirements
High-Level Goals
Functional Goals
- Execute LLM-generated code in isolated environments
- Support multiple LLM providers (OpenAI, Anthropic, local models, etc.)
- Provide a plugin system for custom tools and integrations
- Enable both interactive and automated/batch execution modes
- Support persistent state and conversation history
Security Goals
- Sandboxed execution preventing host system access
- Network isolation for untrusted code
- Resource limits (CPU, memory, time)
- Comprehensive audit logging
- Fine-grained permission system
- Input/output sanitization
Non-Functional Goals
- Low latency for interactive use
- High throughput for batch processing
- Horizontal scalability
- Clear observability (metrics, tracing)
- Easy deployment (containerized, cloud-native)
Functional Requirements
| ID | Requirement | Priority |
|---|---|---|
| FR-01 | Execute user prompts through configurable LLM backends | Must |
| FR-02 | Parse and execute LLM-generated code/tool calls | Must |
| FR-03 | Support tool definitions (function calling schema) | Must |
| FR-04 | Maintain conversation state across sessions | Should |
| FR-05 | Plugin architecture for custom tools | Should |
| FR-06 | REST API for programmatic access | Should |
| FR-07 | WebSocket support for streaming responses | Could |
| FR-08 | Multi-tenant support | Could |
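As a rough illustration of FR-02/FR-03, tool definitions with argument checking could be sketched as below. The names and types are assumptions for illustration; a real implementation would likely use serde_json and full JSON Schema validation rather than this flat map.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum ParamType { String, Integer, Boolean }

struct ToolDefinition {
    name: String,
    // parameter name -> (type, required)
    params: HashMap<String, (ParamType, bool)>,
}

impl ToolDefinition {
    // Validate a flat map of argument names to raw string values.
    fn validate(&self, args: &HashMap<&str, &str>) -> Result<(), String> {
        for (name, (ty, required)) in &self.params {
            match args.get(name.as_str()) {
                None if *required => return Err(format!("missing required arg: {name}")),
                None => {}
                Some(value) => {
                    let ok = match ty {
                        ParamType::String => true,
                        ParamType::Integer => value.parse::<i64>().is_ok(),
                        ParamType::Boolean => matches!(*value, "true" | "false"),
                    };
                    if !ok {
                        return Err(format!("arg {name} is not a valid {ty:?}"));
                    }
                }
            }
        }
        // Reject arguments the schema does not declare.
        for name in args.keys() {
            if !self.params.contains_key(*name) {
                return Err(format!("unknown arg: {name}"));
            }
        }
        Ok(())
    }
}

fn main() {
    let mut params = HashMap::new();
    params.insert("path".to_string(), (ParamType::String, true));
    params.insert("max_lines".to_string(), (ParamType::Integer, false));
    let tool = ToolDefinition { name: "read_file".to_string(), params };

    let mut args = HashMap::new();
    args.insert("path", "/tmp/build.log");
    args.insert("max_lines", "100");
    assert!(tool.validate(&args).is_ok());

    let empty: HashMap<&str, &str> = HashMap::new();
    assert!(tool.validate(&empty).is_err()); // missing required "path"
}
```

Rejecting undeclared arguments (rather than ignoring them) matters here because the arguments come from an untrusted LLM.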
Security Requirements
| ID | Requirement | Priority |
|---|---|---|
| SR-01 | All code execution in isolated sandboxes (containers/WASM) | Must |
| SR-02 | Network egress control and filtering | Must |
| SR-03 | Resource quotas (CPU, memory, file system) | Must |
| SR-04 | Audit logging of all operations | Must |
| SR-05 | Secrets management with no exposure to executed code | Must |
| SR-06 | Input validation and sanitization | Must |
| SR-07 | Rate limiting and abuse prevention | Should |
| SR-08 | Cryptographic verification of tool outputs | Could |
Constraints
- Written in Rust (performance and safety)
- Open source under permissive license
- Compatible with modern container runtimes
- No external dependencies that compromise security auditing
Architectural Reasoning
The Premise
You want to build a secure AI agent orchestration tool with a terminal-native workflow. The key insight is that you prefer the TUI experience of tools like opencode, but want web accessibility without the complexity of a full client-server architecture.
Key Research Findings
1. Ratatui + WASM is Production-Ready
Several existing projects demonstrate that this works:
- Ratzilla (1.2k GitHub stars): Official ratatui project for building terminal-themed web apps with WASM
- Webatui: Integration between Yew and Ratatui
- Multiple production websites using this stack
The approach is validated: write once in ratatui, compile to WASM with ratzilla, render in browser.
2. Zellij is Powerful but Limiting
Zellij’s WASM plugin system is innovative but:
- Requires Zellij (excludes tmux/screen users)
- Plugin runs in a pane (limited screen real estate)
- Can’t easily be standalone or web-deployed
- Better as a future integration than primary target
Verdict: Build standalone first, consider Zellij plugin as phase 2/3.
3. The Dual-Target Pattern is Emerging
Projects like Ratzilla show the pattern:
- Shared core logic (platform agnostic)
- Native backend (crossterm) for terminal
- Web backend (DOM rendering) for browser
- Same widgets, same code, different backends
Why This Architecture Wins
Compared to Web-First (React/Vue + API)
| Aspect | Web-First | TUI + WASM |
|---|---|---|
| Developer workflow | Context switch to browser | Stay in terminal |
| Code duplication | API + Web + TUI clients | Single UI codebase |
| Offline capable | Requires server | Local SQLite works |
| Deployment complexity | Server + Database | Static files or binary |
| Performance | HTTP latency | Direct function calls |
| Web presence | Full-featured | Terminal aesthetic |
Compared to Pure Terminal
| Aspect | Pure Terminal | TUI + WASM |
|---|---|---|
| Accessibility | Terminal only | Any device with browser |
| Sharing | Export/import files | Share URL |
| Mobile | SSH only | Mobile browser works |
| Collaboration | Hard | Possible with sync |
Compared to Zellij Plugin
| Aspect | Zellij Plugin | Standalone + WASM |
|---|---|---|
| User base | Zellij users only | All terminal users |
| Web deployment | Separate codebase | Same codebase |
| Full terminal control | No (pane only) | Yes |
| Distribution | Zellij plugin repo | cargo install + web |
The Stack Decision
Based on research:
Ratatui Ecosystem
- ratatui 0.29: Core TUI framework
- crossterm: Native terminal backend
- ratzilla 0.3: WebAssembly backend
These three share the same widget system. Your UI code works everywhere.
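The shape of that sharing can be sketched with plain Rust traits. This is not the actual ratatui API, just an illustration of the pattern: UI logic draws against a backend trait, and each target (crossterm terminal, ratzilla DOM) supplies its own implementation.

```rust
// Illustrative only: the core draws into a Backend trait, so the same UI
// code can target a terminal or the browser DOM without changes.
trait Backend {
    fn draw_line(&mut self, row: usize, text: &str);
}

// Stand-in backend that collects lines into a buffer (useful for tests;
// a real native backend would emit terminal escape codes instead).
struct BufferBackend {
    lines: Vec<String>,
}

impl Backend for BufferBackend {
    fn draw_line(&mut self, _row: usize, text: &str) {
        self.lines.push(text.to_string());
    }
}

// Platform-agnostic UI code: knows nothing about crossterm or the DOM.
fn draw_conversation(backend: &mut dyn Backend, messages: &[&str]) {
    for (row, msg) in messages.iter().enumerate() {
        backend.draw_line(row, msg);
    }
}

fn main() {
    let mut backend = BufferBackend { lines: Vec::new() };
    draw_conversation(&mut backend, &["user: hi", "assistant: hello"]);
    assert_eq!(backend.lines.len(), 2);
}
```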
WASM Tooling
- trunk: Build tool (like cargo but for WASM)
- wasm-bindgen: Rust/JS bridge
- web-sys: Browser API access
Why Not Yew/Leptos/Dioxus?
Those are web frameworks. You’d write separate UI code for web vs terminal. With ratatui+ratzilla, the same widgets render to:
- Terminal escape codes (native)
- DOM elements (web)
This is the crucial difference that makes the dual-target approach feasible.
Addressing Concerns
“Is WASM in the browser really a good experience?”
Yes, based on evidence:
- Ratzilla demo loads instantly
- Terminal aesthetic is novel and appealing
- Works offline (PWA capable)
- Fast enough for chat interfaces
“What about server-side features?”
Future option: Add harness-server crate
- TUI can connect to local or remote server
- Enables multi-user, team features
- But not required for personal use
“How does code execution work in browser?”
Two approaches:
- WASM sandbox: Tools compile to WASM, run in browser
- No execution: Browser version read-only, use native for tools
Recommendation: Start with option 2, add option 1 later.
The Revised Plan
Immediate (Phase 1)
- Build core logic with conditional compilation
- Build native TUI with ratatui+crossterm
- Polish terminal experience
Near-term (Phase 2)
- Add ratzilla backend
- Build web version with trunk
- Add export/import for conversation sync
Future (Phase 3+)
- Security hardening with wasmtime
- Optional server mode
- Optional Zellij plugin
- Team collaboration features
What We Should Do Next
1. Validate the approach: build a minimal prototype
   - Single conversation view
   - Send/receive messages
   - Native + WASM both working
2. Document the architecture: add to existing docs
   - Update tech stack
   - Update roadmap
   - Add TUI-specific architecture doc ✓ (done)
3. Start with core crate:
   - Define shared types
   - Implement conversation logic
   - Add LLM provider trait
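A first cut of that provider trait might look like the following. This is a hypothetical sketch with made-up names; the real trait would almost certainly be async and support streaming, which is omitted here for brevity.

```rust
// Hypothetical provider abstraction: the core depends only on this trait,
// so OpenAI, Anthropic, and local models are interchangeable backends.
struct ChatMessage {
    role: String,
    content: String,
}

trait LlmProvider {
    fn name(&self) -> &str;
    fn complete(&self, history: &[ChatMessage]) -> Result<ChatMessage, String>;
}

// Mock provider for tests and offline development: echoes the last message.
struct EchoProvider;

impl LlmProvider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn complete(&self, history: &[ChatMessage]) -> Result<ChatMessage, String> {
        let last = history.last().ok_or("empty history")?;
        Ok(ChatMessage {
            role: "assistant".to_string(),
            content: format!("echo: {}", last.content),
        })
    }
}

fn main() {
    let provider = EchoProvider;
    let history = vec![ChatMessage { role: "user".to_string(), content: "ping".to_string() }];
    let reply = provider.complete(&history).unwrap();
    assert_eq!(reply.content, "echo: ping");
}
```

A mock like `EchoProvider` also makes the prototype testable without API keys.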
The research shows this is a solid, validated approach used by production projects. The ratatui ecosystem provides exactly what we need for dual-target deployment.
System Design
High-Level Architecture
┌─────────────────────────────────────────────────────────────┐
│ Client Interface │
│ (CLI, REST API, WebSocket, Library) │
└────────────────────┬────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────┐
│ API Gateway │
│ (Auth, Rate Limiting, Request Routing) │
└────────────────────┬────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────┐
│ Core Engine │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Prompt │ │ Tool │ │ Conversation │ │
│ │ Manager │ │ Registry │ │ Manager │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└────────────────────┬────────────────────────────────────────┘
│
┌────────────┼────────────┐
│ │ │
┌───────▼───┐ ┌────▼────┐ ┌─────▼──────┐
│ LLM │ │ Code │ │ State │
│ Adapters │ │ Executor│ │ Store │
└───────────┘ └────┬────┘ └────────────┘
│
┌───────────┴───────────┐
│ │
┌───────▼────────┐ ┌──────▼───────┐
│ Container │ │ WASM │
│ Runtime │ │ Sandbox │
│ (Docker/gVisor)│ │ (Wasmtime) │
└────────────────┘ └──────────────┘
Component Breakdown
1. Client Interface
Multiple entry points for different use cases:
- CLI: Interactive terminal interface
- REST API: HTTP-based programmatic access
- WebSocket: Real-time streaming for interactive applications
- Rust Library: Direct embedding in Rust applications
2. API Gateway
Central entry point handling cross-cutting concerns:
- Authentication (API keys, JWT, mTLS)
- Rate limiting and quota management
- Request validation and routing
- Load balancing
3. Core Engine
The heart of the system:
Prompt Manager
- Template management and variable substitution
- Prompt versioning and A/B testing support
- Multi-turn conversation handling
Tool Registry
- Dynamic tool registration and discovery
- Schema validation (JSON Schema for function calling)
- Tool permission enforcement
Conversation Manager
- Session state management
- Context window optimization
- Conversation persistence
4. LLM Adapters
Pluggable backends for different providers:
- OpenAI GPT-4/GPT-3.5
- Anthropic Claude
- Local models (via llama.cpp, Ollama)
- Custom self-hosted models
5. Code Executor
The security-critical component:
- Receives code/tool calls from LLM
- Validates against allowed operations
- Executes in sandboxed environment
- Returns results to conversation
6. State Store
Persistent storage for:
- Conversation history
- User preferences and settings
- Audit logs
- Tool definitions
Deployment Architecture
Single-Node Deployment
┌────────────────────────────────────┐
│ Host System │
│ ┌────────────────────────────┐ │
│ │ LLM Harness Service │ │
│ │ ┌─────────────────────┐ │ │
│ │ │ Core Engine │ │ │
│ │ └─────────────────────┘ │ │
│ └────────────────────────────┘ │
│ ┌────────────────────────────┐ │
│ │ Container Runtime │ │
│ │ (isolated sandboxes) │ │
│ └────────────────────────────┘ │
└────────────────────────────────────┘
Distributed Deployment
┌────────────────────────────────────────────────────────┐
│ Load Balancer │
└──────────────┬───────────────────────┬─────────────────┘
│ │
┌──────────▼──────────┐ ┌────────▼────────┐
│ API Gateway │ │ API Gateway │
│ (Instance 1) │ │ (Instance 2) │
└──────────┬──────────┘ └────────┬────────┘
│ │
┌──────────▼───────────────────────▼────────┐
│ Message Queue │
│ (Redis/RabbitMQ/NATS) │
└──────────┬───────────────────────┬─────────┘
│ │
┌──────────▼──────────┐ ┌────────▼────────┐
│ Worker Node 1 │ │ Worker Node 2 │
│ (Core + Executor) │ │ (Core + Exec) │
└─────────────────────┘ └─────────────────┘
Data Flow
Typical Request Flow
1. Request Received
   - Client sends prompt via API
   - Gateway authenticates and validates
2. Prompt Processing
   - Core Engine loads conversation context
   - Prompt Manager applies templates
3. LLM Interaction
   - Request sent to configured LLM adapter
   - LLM generates response (potentially with tool calls)
4. Tool Execution (if needed)
   - Tool Registry validates tool calls
   - Code Executor runs in sandbox
   - Results returned to LLM
5. Response Delivery
   - Final response formatted
   - Conversation state persisted
   - Response returned to client
6. Audit Logging
   - All operations logged asynchronously
   - Metrics recorded for observability
Security Model
Threat Model
Assets
- User data and prompts
- LLM API keys and credentials
- Generated code and its execution
- Conversation history
- System infrastructure
Threat Actors
- Malicious Users: Attempting to escape sandbox, access other users’ data
- Compromised LLM: Generating malicious code or exfiltrating data
- External Attackers: Exploiting API vulnerabilities, DoS attacks
- Insider Threats: Privileged access abuse
Threat Scenarios
| Scenario | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Sandbox escape | Medium | Critical | gVisor/WASM, seccomp, no root |
| Prompt injection | High | Medium | Input validation, output encoding |
| Data exfiltration | Medium | High | Network isolation, egress filtering |
| Credential theft | Low | Critical | Secrets management, no env leakage |
| Resource exhaustion | High | Medium | Resource quotas, rate limiting |
Defense in Depth
Layer 1: Network Security
- TLS 1.3 for all communications
- mTLS for internal service communication
- Network segmentation and VPC isolation
Layer 2: API Security
- Authentication required for all endpoints
- Rate limiting per user/API key
- Request size limits and timeout enforcement
- Schema validation for all inputs
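Per-key rate limiting at this layer is commonly implemented as a token bucket. The sketch below is one possible shape (names are assumptions); a production gateway would back the counters with shared state such as Redis rather than in-process structs.

```rust
use std::time::Instant;

// Token-bucket sketch: each API key gets a bucket that refills over time.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last: Instant::now() }
    }

    // Returns true if the request is allowed, consuming one token.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        // Refill proportionally to elapsed time, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0);
    assert!(bucket.try_acquire());
    assert!(bucket.try_acquire());
    assert!(!bucket.try_acquire()); // bucket drained, third request rejected
}
```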
Layer 3: Execution Security
- Container-based isolation: gVisor or similar for system-level isolation
- Capability dropping: Remove all unnecessary Linux capabilities
- Read-only root filesystem: Prevent persistence in containers
- No privilege escalation: Explicitly disable SETUID/SUDO
Layer 4: Application Security
- Input sanitization using allowlists
- Output encoding to prevent injection
- Prepared statements for database queries
- Constant-time comparison for secrets
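The constant-time comparison point deserves a concrete illustration: the loop below always visits every byte, so timing does not leak the position of the first mismatch. In practice a vetted crate such as `subtle` would be preferable to hand-rolled code.

```rust
// Constant-time byte comparison: accumulate XOR differences instead of
// returning early, so execution time is independent of where inputs differ.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(b"secret", b"secret"));
    assert!(!ct_eq(b"secret", b"secreT"));
    assert!(!ct_eq(b"secret", b"secrets")); // length mismatch
}
```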
Layer 5: Data Security
- Encryption at rest for sensitive data
- Encryption in transit (TLS 1.3)
- Secrets management via external vault (HashiCorp Vault, AWS Secrets Manager)
- Automatic secret rotation
Sandbox Design
Option A: Container Runtime (gVisor)
┌─────────────────────────────────────┐
│ gVisor Sandbox │
│ ┌───────────────────────────────┐ │
│ │ User Application │ │
│ │ (untrusted code runs here) │ │
│ └───────────────────────────────┘ │
│ ┌───────────────────────────────┐ │
│ │ Sentry (seccomp-bpf) │ │
│ │ Intercepts syscalls │ │
│ └───────────────────────────────┘ │
│ ┌───────────────────────────────┐ │
│ │ Gofer (9P file server) │ │
│ │ Filesystem isolation │ │
│ └───────────────────────────────┘ │
└─────────────────────────────────────┘
Pros:
- Strong isolation with user-space kernel
- Compatible with existing container tooling
- Can run any Linux binary
Cons:
- Higher overhead than native
- Requires privileged daemon
Option B: WebAssembly (Wasmtime)
┌─────────────────────────────────────┐
│ Wasmtime Runtime │
│ ┌───────────────────────────────┐ │
│ │ WASM Module │ │
│ │ (compiled guest code) │ │
│ │ │ │
│ │ • Linear memory isolation │ │
│ │ • Capability-based security │ │
│ │ • No direct syscalls │ │
│ └───────────────────────────────┘ │
└─────────────────────────────────────┘
Pros:
- Near-native performance
- True capability-based security
- Smaller attack surface
- Fast startup
Cons:
- Requires code to be compiled to WASM
- Limited standard library support
- Language ecosystem constraints
Recommendation
Hybrid approach:
- Use WASM for user-defined tools (fast, safe)
- Use gVisor containers for general code execution (flexibility)
- Allow operator to configure which to use per tool
Audit Logging
Log Requirements
All logs must include:
- Timestamp (UTC, nanosecond precision)
- Request ID (for correlation)
- User/API key identifier (hashed)
- Action type
- Resource accessed
- Success/failure status
- Source IP (if applicable)
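Hashing the user/API key identifier might be sketched as follows. Note the loud caveat: `DefaultHasher` is not cryptographic and its output is not stable across Rust versions, so a real system would use a keyed hash such as HMAC-SHA-256; this only illustrates the principle that the raw key never reaches the log.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a pseudonymous identifier for audit logs so the raw API key is
// never written out. NOT cryptographic: illustration only.
fn log_identifier(api_key: &str) -> String {
    let mut hasher = DefaultHasher::new();
    api_key.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

fn main() {
    let id = log_identifier("sk-example-key");
    // Same input, same identifier: log entries remain correlatable.
    assert_eq!(id, log_identifier("sk-example-key"));
    assert_ne!(id, log_identifier("sk-other-key"));
    // The key material itself never appears in the identifier.
    assert!(!id.contains("sk-"));
}
```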
Log Categories
enum SandboxType {
    Container,
    Wasm,
}

enum AuditEvent {
    Authentication { method: String, success: bool },
    PromptSubmitted { model: String, tokens_in: u32 },
    ToolExecuted {
        tool_name: String,
        sandbox_type: SandboxType,
        execution_time_ms: u64,
        success: bool,
    },
    CodeGenerated { language: String, size_bytes: usize },
    ResponseReturned { tokens_out: u32 },
    Error { error_type: String, message: String },
}
Log Storage
- Structured logging (JSON)
- Append-only, tamper-evident storage
- Separate from application data
- Retention policy: 90 days hot, 1 year cold
Secrets Management
Principles
- Never log secrets
- Never pass to sandbox
- Rotate automatically
- Access audit trail
Implementation
- Integration with HashiCorp Vault or cloud-native secret managers
- Short-lived tokens (max 1 hour)
- Automatic injection into LLM requests (not sandboxed code)
- Secrets never stored in conversation history
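One concrete enforcement of "never log secrets" and "secrets never stored in conversation history" is a scrubbing pass applied to any text before it is persisted. The function name and shape here are assumptions for illustration:

```rust
// Scrub known secret values out of text before it reaches logs or the
// conversation store. Simple substring replacement; a real implementation
// might also pattern-match common key formats.
fn redact(text: &str, secrets: &[&str]) -> String {
    let mut out = text.to_string();
    for secret in secrets {
        if !secret.is_empty() {
            out = out.replace(secret, "[REDACTED]");
        }
    }
    out
}

fn main() {
    let secrets = ["sk-abc123"];
    let line = "calling provider with key sk-abc123";
    assert_eq!(redact(line, &secrets), "calling provider with key [REDACTED]");
}
```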
Security Checklist
Before production deployment:
- Penetration testing completed
- Security audit of sandbox escape vectors
- Dependency vulnerability scan (cargo audit)
- Secrets rotation procedures documented
- Incident response plan in place
- Security monitoring alerts configured
- Data retention policies implemented
- GDPR/privacy compliance review
Data Flow
Request Lifecycle
1. Client Request
┌─────────┐ HTTP/HTTPS ┌─────────────┐
│ Client │ ──────────────────> │ API Gateway │
│ │ Authorization: │ │
│ │ Bearer <token> │ │
└─────────┘ └─────────────┘
Validation:
- API key authentication
- Rate limit check (Redis-backed)
- Request schema validation
- Content-Security-Policy headers
2. Prompt Processing
┌─────────────┐ ┌─────────────────┐ ┌──────────────┐
│ API Gateway │ --> │ Prompt Manager │ --> │ LLM Adapter │
│ │ │ │ │ │
│ │ │ • Load template │ │ │
│ │ │ • Substitute │ │ │
│ │ │ • Add context │ │ │
└─────────────┘ └─────────────────┘ └──────────────┘
Operations:
- Load conversation context from store
- Apply system prompts and templates
- Token count estimation
- Cost tracking
3. LLM Invocation
┌──────────────┐ HTTPS/HTTP2 ┌─────────────────┐
│ LLM Adapter │ ──────────────────> │ LLM Provider │
│ │ Streaming Response │ (OpenAI/Claude/ │
│ │ <────────────────── │ Local Model) │
└──────────────┘ └─────────────────┘
Handling:
- Retry logic with exponential backoff
- Circuit breaker for provider failures
- Streaming response handling
- Timeout enforcement
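The backoff schedule behind that retry logic is easy to pin down: delay doubles per attempt up to a cap. A minimal sketch (a real implementation would add random jitter to avoid thundering herds):

```rust
use std::time::Duration;

// Exponential backoff: base * 2^attempt, saturating, capped at cap_ms.
fn backoff_delay(attempt: u32, base_ms: u64, cap_ms: u64) -> Duration {
    let raw = base_ms.saturating_mul(2u64.saturating_pow(attempt));
    Duration::from_millis(raw.min(cap_ms))
}

fn main() {
    assert_eq!(backoff_delay(0, 100, 10_000), Duration::from_millis(100));
    assert_eq!(backoff_delay(1, 100, 10_000), Duration::from_millis(200));
    assert_eq!(backoff_delay(3, 100, 10_000), Duration::from_millis(800));
    // The cap keeps a badly failing provider from stalling requests forever.
    assert_eq!(backoff_delay(20, 100, 10_000), Duration::from_millis(10_000));
}
```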
4. Response Parsing
┌─────────────────┐ ┌─────────────────┐
│ LLM Provider │ --> │ Response Parser │
│ │ │ │
│ JSON/Stream │ │ • Extract text │
│ with tool calls │ │ • Parse tools │
│ │ │ • Validate │
└─────────────────┘ └─────────────────┘
Validation:
- JSON schema validation for tool calls
- Type checking for arguments
- Permission checking against tool registry
5. Tool Execution (if applicable)
┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐
│ Response Parser │ --> │ Tool Registry │ --> │ Sandbox │
│ │ │ │ │ │
│ Tool calls │ │ • Validate │ │ • gVisor │
│ detected │ │ • Authorize │ │ • Wasmtime │
│ │ │ • Route │ │ • Firecracker│
└─────────────────┘ └─────────────────┘ └─────────────┘
Security checks:
- Tool existence and enabled status
- User permissions for tool
- Resource quota availability
- Input sanitization
6. Sandbox Execution
┌─────────────┐ ┌─────────────────┐ ┌─────────────┐
│ Sandbox │ --> │ Code/Tool Exec │ --> │ Result │
│ │ │ │ │ │
│ Isolated │ │ • Run command │ │ Output │
│ environment │ │ • Capture I/O │ │ Exit code │
│ │ │ • Enforce limits│ │ │
└─────────────┘ └─────────────────┘ └─────────────┘
Constraints:
- CPU time limit (configurable, default 30s)
- Memory limit (configurable, default 256MB)
- Network egress filtering
- Filesystem restrictions
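Wall-clock enforcement of the time limit can be sketched as a deadline checked between work steps. This is illustrative only: real CPU and memory limits would come from cgroups or the WASM runtime's fuel mechanism, not application-level polling.

```rust
use std::time::{Duration, Instant};

struct Deadline {
    start: Instant,
    limit: Duration,
}

impl Deadline {
    fn new(limit: Duration) -> Self {
        Self { start: Instant::now(), limit }
    }
    fn exceeded(&self) -> bool {
        self.start.elapsed() > self.limit
    }
}

// Run a task in steps, aborting once the deadline passes. `step` returns
// true when the task is finished.
fn run_with_deadline<F: FnMut() -> bool>(mut step: F, limit: Duration) -> Result<(), &'static str> {
    let deadline = Deadline::new(limit);
    loop {
        if deadline.exceeded() {
            return Err("time limit exceeded");
        }
        if step() {
            return Ok(());
        }
    }
}

fn main() {
    // A task that finishes immediately succeeds within the default 30s.
    assert!(run_with_deadline(|| true, Duration::from_secs(30)).is_ok());
    // A task that never finishes is cut off by a zero-length deadline.
    assert!(run_with_deadline(|| false, Duration::from_millis(0)).is_err());
}
```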
7. Result Aggregation
┌─────────────┐ ┌─────────────────┐ ┌──────────────┐
│ Result │ --> │ Conversation │ --> │ LLM Adapter │
│ │ │ Manager │ │ (if needed) │
│ Tool output │ │ │ │ │
│ │ │ • Update state │ │ │
│ │ │ • Check if more │ │ │
│ │ │ tool calls │ │ │
└─────────────┘ └─────────────────┘ └──────────────┘
Loop detection:
- Maximum tool call iterations (default 10)
- Cycle detection in tool dependencies
- Total execution time limits
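The iteration cap can be illustrated with a small driver loop. `next_tool_call` below is a hypothetical stand-in for parsing the model's response; the point is only that the LLM → tool → LLM cycle is bounded.

```rust
// Cap the number of LLM -> tool -> LLM round trips. Returning None from
// next_tool_call means the model produced a final answer.
fn run_agent_loop<F>(mut next_tool_call: F, max_iterations: u32) -> Result<u32, String>
where
    F: FnMut(u32) -> Option<String>,
{
    for iteration in 0..max_iterations {
        match next_tool_call(iteration) {
            None => return Ok(iteration),
            Some(_tool) => {
                // Execute the tool in a sandbox and feed the result back
                // to the model (omitted in this sketch).
            }
        }
    }
    Err(format!("exceeded {max_iterations} tool-call iterations"))
}

fn main() {
    // Finishes after two tool calls.
    let done = run_agent_loop(|i| if i < 2 { Some("shell".to_string()) } else { None }, 10);
    assert_eq!(done, Ok(2));
    // A model that requests tools forever is stopped at the cap.
    assert!(run_agent_loop(|_| Some("shell".to_string()), 10).is_err());
}
```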
8. Response Delivery
┌──────────────┐ ┌─────────────────┐ ┌─────────┐
│ Conversation │ --> │ Response Builder│ --> │ Client │
│ Manager │ │ │ │ │
│ │ │ • Format output │ │ │
│ │ │ • Add metadata │ │ │
│ │ │ • Set headers │ │ │
└──────────────┘ └─────────────────┘ └─────────┘
Metadata included:
- Request ID
- Tokens used (input/output)
- Execution time
- Tool calls made
- Cost estimate
9. Async Processing
┌──────────────────────────────────────────────────────────┐
│ Async Pipeline │
├──────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐ │
│ │ Audit Log │ │ Metrics │ │ Cost Tracking │ │
│ │ Writer │ │ Collector │ │ & Billing │ │
│ └─────────────┘ └─────────────┘ └─────────────────┘ │
└──────────────────────────────────────────────────────────┘
Non-blocking operations:
- Structured audit logging
- Prometheus metrics
- Usage tracking
- Alert evaluation
Data Stores
PostgreSQL (Primary Store)
- Conversations and messages
- User accounts and API keys
- Tool definitions and configurations
Redis (Cache/Queue)
- Rate limiting counters
- Session state (short-term)
- Async job queue
- Pub/sub for real-time features
Object Storage (S3/MinIO)
- Large attachments
- Export files
- Audit log archives
State Diagram
┌─────────────┐
┌─────────>│ Pending │
│ │ (queued) │
│ └──────┬──────┘
│ │
│ v
│ ┌─────────────┐
│ │ Processing │
│ │ (active) │
│ └──────┬──────┘
│ │
┌────┴────┐ ┌─────┴─────┐
│ Timeout │ │ Complete │
│ Error │ │ Success │
└────┬────┘ └─────┬─────┘
│ │
│ v
│ ┌─────────────┐
└─────────>│ Closed │
│ (final) │
└─────────────┘
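The diagram above translates naturally into an enum with a checked transition function; this sketch (names assumed) rejects any transition the diagram does not show, including the retry edge from the error states back to Pending.

```rust
// Request lifecycle matching the state diagram; transitions not listed
// here are rejected as invalid.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RequestState { Pending, Processing, Complete, Failed, Closed }

fn transition(state: RequestState, event: &str) -> Result<RequestState, String> {
    use RequestState::*;
    match (state, event) {
        (Pending, "start") => Ok(Processing),
        (Processing, "success") => Ok(Complete),
        (Processing, "timeout") | (Processing, "error") => Ok(Failed),
        (Complete, "close") | (Failed, "close") => Ok(Closed),
        // The loop on the left of the diagram: failed requests may retry.
        (Failed, "retry") => Ok(Pending),
        _ => Err(format!("invalid transition from {state:?} on {event}")),
    }
}

fn main() {
    let s = transition(RequestState::Pending, "start").unwrap();
    let s = transition(s, "success").unwrap();
    assert_eq!(transition(s, "close"), Ok(RequestState::Closed));
    // Closed is terminal: nothing leaves it.
    assert!(transition(RequestState::Closed, "start").is_err());
}
```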
TUI-First Architecture
Philosophy
Unlike traditional web-first LLM tools (ChatGPT, Claude web interface), robit embraces a terminal-native workflow. The interface should feel like a natural extension of the command line—fast, keyboard-driven, and composable with existing Unix tooling.
Why TUI-First?
Developer Experience:
- Speed: No browser overhead, instant startup
- Keyboard-centric: Vim-like bindings, no mouse required
- Composability: Pipe to/from other CLI tools
- Context preservation: Stay in your terminal environment
Inspiration:
- opencode: Excellent TUI workflow for AI interaction
- lazygit: Git operations without leaving the terminal
- k9s: Kubernetes management via TUI
- ranger: File management with vim bindings
Dual-Target Strategy
The architecture supports two deployment targets from a single codebase:
1. Native Terminal Application
Runs directly in any terminal emulator:
- Direct terminal control via crossterm
- Full clipboard integration
- Native file system access
- Can spawn subprocesses
- SSH-friendly (works over remote connections)
2. WebAssembly Browser Application
Compiles to WASM and runs in browser:
- Compiled to WASM via ratzilla
- Terminal aesthetic in the browser
- Accessible from any device
- Easy sharing of conversations
- Sandboxed execution (for code)
Shared Core Architecture
The application logic is target-agnostic:
// Shared core crate - platform agnostic
pub mod core {
    pub struct Conversation { /* ... */ }
    pub struct Message { /* ... */ }
    pub trait LlmProvider { /* ... */ }

    // Business logic knows nothing about rendering
    pub async fn process_message(
        conversation: &mut Conversation,
        message: &str,
        llm: &dyn LlmProvider,
    ) -> Result<Message, Error> { /* ... */ }
}
Component Breakdown
Core Components (Shared)
1. Conversation Engine
   - Message history management
   - Context window optimization
   - Token counting
   - Conversation branching
2. LLM Provider Abstraction
   - Trait-based providers
   - OpenAI, Anthropic, local models
   - Streaming response handling
   - Retry and circuit breaker logic
3. Tool System
   - Tool definition registry
   - Schema validation
   - Execution coordination
   - Result formatting
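Context window optimization from the Conversation Engine can be sketched as trimming history to a token budget. The 4-characters-per-token heuristic below is a rough assumption standing in for a real tokenizer:

```rust
// Crude token estimate: ~4 characters per token (illustration only).
fn estimate_tokens(text: &str) -> usize {
    text.len() / 4 + 1
}

// Keep the most recent messages whose combined estimate fits the budget,
// dropping the oldest first.
fn trim_history<'a>(messages: &'a [&'a str], budget: usize) -> &'a [&'a str] {
    let mut used = 0;
    for (kept, msg) in messages.iter().rev().enumerate() {
        let cost = estimate_tokens(msg);
        if used + cost > budget {
            return &messages[messages.len() - kept..];
        }
        used += cost;
    }
    messages
}

fn main() {
    let history = ["a".repeat(40), "b".repeat(40), "c".repeat(40)];
    let refs: Vec<&str> = history.iter().map(|s| s.as_str()).collect();
    // Each 40-char message is ~11 tokens; a 25-token budget keeps the last two.
    let kept = trim_history(&refs, 25);
    assert_eq!(kept.len(), 2);
    assert!(kept[0].starts_with('b'));
}
```

Dropping oldest-first preserves the tail of the conversation, which is usually what the model needs most.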
TUI Components (Native)
1. Main Interface
   - Conversation list sidebar
   - Message view (markdown rendering)
   - Input area with multiline support
   - Status bar
2. Input Handling
   - Vim-style navigation
   - Command palette (:commands)
   - Keyboard shortcuts
   - Copy/paste integration
Web Components (WASM)
1. Browser Integration
   - LocalStorage for settings
   - IndexedDB for conversation history
   - Clipboard API
   - File download/upload
2. UI Adaptations
   - Responsive layout
   - Touch support
   - URL routing (conversation IDs)
   - Browser history integration
Data Synchronization
When both native and web versions are used:
Sync strategies:
- File-based: Export/import JSON
- Git-based: Store conversations in git repo
- Cloud sync: Self-hosted or third-party
- Local network: Peer discovery and sync
Comparison with Zellij
What Zellij Offers
Zellij is a terminal multiplexer with a WASM plugin system:
- Plugins run inside Zellij’s WASM runtime
- Can draw UI within a pane
- Access to Zellij APIs (panes, tabs, etc.)
- Written in any WASM-compatible language
Should robit be a Zellij Plugin?
Pros:
- Instant integration with existing Zellij users
- Fits naturally in terminal workflow
- Can control terminal environment
- No separate binary to install
Cons:
- Dependency on Zellij (not standalone)
- Plugin API limitations
- Can’t be used outside Zellij
- Web deployment would be separate codebase
Verdict: Build standalone first, consider Zellij plugin later
The standalone TUI provides:
- Broader compatibility (any terminal)
- Web deployment via WASM
- Simpler distribution
- Can still detect and integrate with Zellij if present
Implementation Strategy
Phase 1: Core + Native TUI
Focus on the terminal experience:
crates/
├── robit-core/ # Shared logic
├── robit-tui/ # Native terminal app
└── robit-cli/ # Command-line interface
Target: Excellent terminal experience that rivals opencode
Phase 2: WebAssembly Port
Add web target:
crates/
├── robit-core/ # Shared logic
├── robit-tui/ # Native terminal app
├── robit-web/ # WASM browser app (ratzilla)
└── robit-cli/ # Command-line interface
Target: Same features in browser, conversation sync
Phase 3: Advanced Features
- Collaborative editing
- Rich media support
- Plugin system
- Zellij integration (optional plugin)
Technology Choices
TUI Framework: Ratatui
Why Ratatui:
- Mature, actively maintained
- Excellent performance
- Rich widget ecosystem
- Works with both native and WASM (via ratzilla)
- Strong community (part of ratatui-rs organization)
Backend Strategy:
- Native: crossterm backend
- Web: ratzilla backend (uses DOM elements)
WASM Stack
Web Application
├── ratzilla (TUI rendering to DOM)
├── ratatui (widgets, layout, style)
├── wasm-bindgen (JS interop)
└── compiled Rust (WASM)
Build tools:
- trunk: WASM build and dev server
- wasm-bindgen: Rust/JS bindings
- wasm-pack: Package for npm distribution
Alternative: Native + Web Server
Instead of WASM, another approach is TUI connecting to a local server:
Pros:
- Single codebase (no WASM complications)
- Server can run remotely
- Better for multi-user scenarios
Cons:
- Requires server to be running
- More complex deployment
- Latency for local usage
Decision: Start with WASM approach for personal use cases, consider server architecture for team/enterprise features later.
User Experience Flow
Terminal Usage
# Start interactive TUI
$ robit
# Or with specific conversation
$ robit --conversation last
# Pipe content to LLM
$ cat error.log | robit --ask "What's wrong?"
# Quick query (non-interactive)
$ robit --query "Explain this code" --file src/main.rs
Web Usage
# Build and serve locally
$ cargo xtask serve-web
# Serving at http://localhost:8080
# Or deploy to static hosting
$ cargo xtask build-web --release
# Upload dist/ to Netlify/Vercel/etc
Conclusion
The TUI-first, dual-target architecture provides:
- Native performance for daily terminal use
- Web accessibility for sharing and remote access
- Code reuse between platforms via shared core
- Future flexibility to add server mode or Zellij plugin
This approach honors the terminal-native workflow while not sacrificing the convenience of web-based access when needed.
The Zellij Question
What is Zellij?
Zellij is a modern terminal workspace (multiplexer) with built-in WASM plugin support. Unlike tmux or screen, Zellij treats plugins as first-class citizens alongside terminal panes.
┌──────────────────────────────────────────────────────┐
│ Zellij Session │
├──────────────────────────────────────────────────────┤
│ Tabs: [Main] [Code] [Logs] [Plugins] │
├──────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌──────────────┐ │
│ │ Terminal │ │ Terminal │ │ WASM Plugin │ │
│ │ (nvim) │ │ (shell) │ │ ┌────────┐ │ │
│ │ │ │ │ │ │ Custom │ │ │
│ │ │ │ │ │ │ UI │ │ │
│ └─────────────┘ └─────────────┘ │ └────────┘ │ │
│ │ ↑ │ │
│ │ Zellij │ │
│ │ Plugin API│ │
│ └────────────┘ │
└──────────────────────────────────────────────────────┘
Zellij’s WASM Architecture
Zellij’s plugin system is sophisticated:
use std::collections::BTreeMap;
use zellij_tile::prelude::*;

// Zellij plugin runs in WASM runtime with specific APIs
#[derive(Default)]
struct State {
    users: Vec<User>,
}

register_plugin!(State);

impl ZellijPlugin for State {
    fn load(&mut self, configuration: BTreeMap<String, String>) {
        // Plugin loaded by Zellij; request access to application state
        request_permission(&[PermissionType::ReadApplicationState]);
    }

    fn update(&mut self, event: Event) -> bool {
        // Handle Zellij events
        match event {
            Event::Key(key) => { /* handle keypress */ }
            Event::TabUpdate(tabs) => { /* tabs changed */ }
            _ => {}
        }
        true // should render
    }

    fn render(&mut self, rows: usize, cols: usize) {
        // Render to the pane using print! macros
        println!("Line to display in pane");
    }
}
Key characteristics:
- Plugins are WASM modules loaded by Zellij
- Can render UI within a pane using text/ANSI
- Access to Zellij state (tabs, panes, sessions)
- Can control Zellij (open panes, run commands, etc.)
- Written in Rust (or any WASM language)
The Integration Question
Should robit be:
A. A standalone TUI application (current plan)
B. A Zellij plugin
C. Both
Option A: Standalone TUI
# User experience
$ robit
# Full-screen TUI starts
# User interacts with LLM
# Quit returns to shell
Pros:
- Works in any terminal (no Zellij dependency)
- Full control over terminal (alternative screen buffer)
- Can spawn subprocesses freely
- Simpler distribution (single binary)
- Can still detect Zellij and integrate
Cons:
- Another context switch (exit editor → run harness → return)
- Not integrated with existing Zellij workflow
Option B: Zellij Plugin
# In Zellij session
# Open plugin pane
$ zellij action new-pane --plugin harness
# Or bind to key: Ctrl+g → open harness
Pros:
- Lives within existing Zellij workflow
- Can interact with terminal panes (read output, send input)
- Great for “analyze this error” or “explain this code” workflows
- No context switching
- Zellij handles pane management
Cons:
- Only works for Zellij users (excludes tmux/screen users)
- Plugin API limitations (pane-sized, no full terminal control)
- Can’t easily run standalone
- Web deployment would be separate codebase
Option C: Hybrid Approach
┌─────────────────────────────────────────────────────────┐
│ Standalone App │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ robit TUI │ │
│ │ Full terminal control, works everywhere │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌───────────────┐ ┌──────────────────┐
│ WASM Web App │ │ Zellij Plugin│ │ CLI Interface │
│ (ratzilla) │ │ (optional) │ │ (commands) │
│ │ │ │ │ │
│ Browser-based │ │ Pane-based │ │ Scripts/pipes │
│ access │ │ access │ │ │
└─────────────────┘ └───────────────┘ └──────────────────┘
Pros:
- Maximum flexibility
- Users choose their interface
- Code reuse across implementations
Cons:
- More complexity
- Multiple build targets
- Testing matrix expands
Recommended Approach: Standalone First, Plugin Later
Why Standalone First?
- Universal compatibility: Works for all terminal users, not just Zellij
- Simpler initial design: Focus on core features, not plugin APIs
- Faster iteration: No need to reload Zellij to test changes
- Distribution: Single binary via cargo install
- Web deployment: WASM path is clearer for standalone apps
Future Zellij Integration
After standalone is mature, consider a Zellij plugin wrapper:
#![allow(unused)]
fn main() {
// crates/harness-zellij-plugin/src/main.rs
// Wrapper that embeds harness-core
// Provides Zellij-specific UI (pane-sized)
// Communicates with standalone daemon or runs standalone
use harness_core::{App, Config};
use zellij_tile::prelude::*;
struct HarnessPlugin {
app: App,
}
impl ZellijPlugin for HarnessPlugin {
fn render(&mut self, rows: usize, cols: usize) {
// Adapt TUI rendering to pane size
let view = self.app.render_compact(rows, cols);
for line in view.lines {
println!("{}", line);
}
}
}
}
Integration options:
1. Embedded mode: Plugin runs full harness-core in WASM
   - Pros: Self-contained
   - Cons: Large WASM bundle, no persistence
2. Daemon mode: Plugin connects to a running harness daemon
   - Pros: Shared state with the standalone app
   - Cons: Requires the daemon to be running
3. Hybrid: Lightweight plugin that launches the standalone app in a new pane
   - Pros: Full features, minimal code
   - Cons: Context switch (but within Zellij)
When Zellij Integration Makes Sense
Consider prioritizing Zellij if:
- Majority of users are already Zellij users
- Deep integration needed (manipulate panes, read terminal output)
- Enterprise deployment where Zellij is standardized
- Plugin ecosystem is a core value proposition
For robit, standalone-first is better because:
- Target audience: Developers using various terminal setups
- Use case: Primary interface, not helper utility
- Deployment: Web accessibility is important (WASM path)
- Complexity: Full TUI needs full terminal control
Detecting and Cooperating with Zellij
Even as a standalone app, we can integrate with Zellij:
#![allow(unused)]
fn main() {
// Detect if running inside Zellij
fn in_zellij() -> bool {
std::env::var("ZELLIJ").is_ok()
}
// Zellij-specific behaviors
if in_zellij() {
// Open new pane for tool output
// Share session state via Zellij's environment
// Use Zellij's key bindings awareness
}
}
Useful integrations:
- Open LLM conversation in new Zellij pane
- Send code from editor pane to harness
- Copy harness output to clipboard pane
- Use Zellij’s session persistence
Implementation Phases
Phase 1: Standalone (Months 1-3)
- Full TUI with ratatui
- Native terminal experience
- WASM web deployment
- Core features (conversations, tools, LLM)
Phase 2: Zellij Awareness (Month 4)
- Detect Zellij environment
- Basic integration (pane awareness)
- Documentation for Zellij users
Phase 3: Zellij Plugin (Month 6+)
- Optional plugin crate
- Embedded or daemon mode
- Publish to Zellij plugin registry
Conclusion
Don’t build for Zellij exclusively, but design with Zellij in mind:
- Keep core logic separate from UI
- Make UI adaptable to different container sizes
- Support environment detection
- Document integration possibilities
The standalone-first approach gives us:
- Broader reach (all terminal users)
- Clearer WASM deployment path
- Simpler initial implementation
- Option to add Zellij later without rewrite
Zellij is an excellent terminal workspace, but robit should be an excellent LLM tool first—regardless of which multiplexer (if any) the user prefers.
Framework Evaluation
Web Framework Options
1. Axum
Overview: Tokio-based web framework from the Tower ecosystem
Pros:
- Native async/await support
- Excellent performance
- Type-safe extractors and responses
- Built on hyper (battle-tested HTTP)
- Great middleware system via Tower
- Good WebSocket support
Cons:
- Relatively newer, smaller ecosystem than Actix
- Steeper learning curve for Tower concepts
Use Case Fit: ⭐⭐⭐⭐⭐ Excellent for our API-heavy service
2. Actix-web
Overview: High-performance web framework with actor model
Pros:
- Extremely fast (consistently benchmarks as fastest)
- Large ecosystem and community
- Mature and stable
- Excellent WebSocket support
Cons:
- Actor model can be overkill for simple cases
- Some unsafe code in core (though extensively audited)
- Heavier runtime than Axum
Use Case Fit: ⭐⭐⭐⭐ Very good, but Axum’s integration with Tokio ecosystem is nicer
3. Rocket
Overview: Opinionated framework with derive macros
Pros:
- Very ergonomic API
- Excellent documentation
- Built-in form validation
- Type-safe routing
Cons:
- Requires nightly Rust
- Slower than Axum/Actix
- Less flexible for our specific needs
Use Case Fit: ⭐⭐⭐ Good but nightly requirement is a concern
Recommendation: Axum
Primary reasons:
- Perfect integration with Tokio ecosystem (we’ll use Tokio everywhere)
- Excellent middleware story for auth, rate limiting
- Clean, composable design matches our architecture
- Strong WebSocket support for streaming
LLM Client Options
1. async-openai
Overview: Unofficial async OpenAI client
Pros:
- Type-safe API
- Full OpenAI API coverage
- Streaming support
- Well-maintained
Cons:
- OpenAI-only (need adapters for other providers)
2. llm
Overview: Rust-native LLM inference (llama.cpp bindings)
Pros:
- Run models locally without external API
- No network dependency
- Privacy-preserving
Cons:
- Limited to supported models
- Requires significant compute resources
- Complex deployment
3. Custom Abstraction
Build a trait-based abstraction allowing multiple backends:
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use futures::stream::BoxStream;

#[async_trait]
trait LlmProvider {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse>;
    async fn stream(&self, request: CompletionRequest) -> Result<BoxStream<'static, Token>>;
}
struct OpenAiProvider { ... }
struct AnthropicProvider { ... }
struct LocalProvider { ... }
}
Recommendation: Custom abstraction using async-openai as reference
Build our own trait system but learn from async-openai’s patterns. This gives us:
- Multi-provider support
- Consistent interface
- Easy testing with mocks
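One payoff of the trait approach is testability. As an illustrative sketch (the request/response types and `MockProvider` are placeholders, and the trait is shown synchronously for brevity where the real one would be async), a canned-response mock can stand in for a live provider:

```rust
use std::cell::RefCell;

// Placeholder request/response types; the real ones would live in the core crate.
pub struct CompletionRequest {
    pub prompt: String,
}

pub struct CompletionResponse {
    pub text: String,
}

// Shown synchronously for brevity; the real trait would be async.
pub trait LlmProvider {
    fn complete(&self, request: &CompletionRequest) -> Result<CompletionResponse, String>;
}

/// Test double that returns pre-seeded responses in order.
pub struct MockProvider {
    responses: RefCell<Vec<String>>,
}

impl MockProvider {
    pub fn new(responses: Vec<String>) -> Self {
        Self { responses: RefCell::new(responses) }
    }
}

impl LlmProvider for MockProvider {
    fn complete(&self, _request: &CompletionRequest) -> Result<CompletionResponse, String> {
        let mut queue = self.responses.borrow_mut();
        if queue.is_empty() {
            Err("mock exhausted".to_string())
        } else {
            // Pop the next canned response from the front of the queue.
            Ok(CompletionResponse { text: queue.remove(0) })
        }
    }
}

fn main() {
    let provider = MockProvider::new(vec!["Hello!".to_string()]);
    let reply = provider
        .complete(&CompletionRequest { prompt: "Hi".to_string() })
        .unwrap();
    println!("{}", reply.text);
}
```

Application code written against `LlmProvider` never needs to know whether it is talking to OpenAI, a local model, or a mock.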
Serialization / API Schema
1. Serde + JSON
Standard choice, no question here.
2. JSON Schema Validation
Options:
- schemars: Generate JSON Schema from Rust types
- jsonschema: Validate data against schema
- validator: Input validation with derive macros
Recommendation: schemars + validator
Database / Persistence
1. SQLx
Overview: Compile-time checked SQL
Pros:
- Type-safe queries
- No ORM overhead
- Async native
- Multiple database support
Cons:
- SQL knowledge required
- Compile times can be slow with query checking
2. Diesel
Overview: ORM with query builder
Pros:
- Mature and stable
- Type-safe query builder
- Migrations support
Cons:
- Core API is synchronous; async use requires a wrapper (e.g. diesel-async)
3. SeaORM
Overview: Async ORM
Pros:
- Fully async
- ActiveRecord pattern
- GraphQL integration
Cons:
- Heavier than SQLx
- Less control over queries
Recommendation: SQLx
Reasons:
- We want full control over queries
- Compile-time checking catches errors early
- No ORM magic, explicit is better
Sandboxing Options
1. Firecracker MicroVMs
Overview: AWS’s microVM technology
Pros:
- VM-level isolation (strongest)
- Fast startup (~125ms)
- Minimal overhead
- Production-proven (AWS Lambda)
Cons:
- Requires KVM
- Complex setup
- Linux-only
2. gVisor
Overview: User-space kernel for containers
Pros:
- Strong syscall-level isolation
- Compatible with OCI containers
- Good performance-security balance
- Used by Google Cloud Run
Cons:
- Some syscall overhead
- Requires privileged container runtime
3. Wasmtime (WebAssembly)
Overview: WASM runtime with WASI
Pros:
- Near-native performance
- Capability-based security
- Fast startup
- Language agnostic (compile to WASM)
Cons:
- Limited to WASM target
- Ecosystem still maturing
- Some system APIs unavailable
4. Bubblewrap
Overview: Unprivileged sandboxing tool
Pros:
- Simple, minimal
- No special privileges needed
- Good for simple cases
Cons:
- Weaker isolation than gVisor
- Limited features
Recommendation: Hybrid approach, using Wasmtime for user tools and gVisor for complex code
Configuration Management
1. config crate
Standard layered config:
- Default values
- Config file (TOML/YAML/JSON)
- Environment variables
- Command-line arguments
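The layering can be illustrated with a plain map-based resolver. This is a sketch of the precedence behavior only, not the `config` crate's actual API:

```rust
use std::collections::HashMap;

// Sketch of layered configuration resolution: defaults < file < env < CLI.
// This mimics what the `config` crate does; it is not the crate's API.
fn resolve(layers: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    // Later layers take precedence, so apply them in order.
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

fn main() {
    let defaults = HashMap::from([("port".to_string(), "8080".to_string())]);
    let file = HashMap::from([("port".to_string(), "9090".to_string())]);
    let cli: HashMap<String, String> = HashMap::new(); // no CLI overrides
    let merged = resolve(&[defaults, file, cli]);
    println!("{}", merged["port"]); // the file layer wins over the defaults
}
```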
2. Figment
More flexible, used by Rocket
Recommendation: config crate - simpler, sufficient
Observability
1. Tracing
Structured logging:
- `tracing` for instrumentation
- `tracing-subscriber` for output
- `tracing-opentelemetry` for distributed tracing
2. Metrics
Prometheus-compatible:
- `metrics` crate for recording
- `metrics-exporter-prometheus` for exposition
3. Error Handling
- `thiserror` for library errors
- `anyhow` for application errors
- `eyre` as an alternative to anyhow
Recommendation: tracing + metrics + thiserror/anyhow
Authentication / Security
1. JSON Web Tokens (JWT)
- `jsonwebtoken` crate
- RS256 for production (asymmetric)
2. API Keys
- Custom implementation with rate limiting
- Argon2 for key hashing (if storing)
3. Argon2
- `argon2` crate for password/key hashing
Testing
1. Unit Testing
Built-in cargo test
2. Integration Testing
- `tokio::test` for async tests
- `wiremock` for HTTP mocking
- `testcontainers` for database testing
3. Property-Based Testing
- `proptest` for fuzzing inputs
Recommendation: All of the above for comprehensive coverage
Tech Stack Decisions (DEPRECATED - API-First Architecture)
Note: This document describes the original API-first architecture with a REST backend. The project has shifted to a TUI-first, dual-target architecture (native terminal + WASM web). This document is kept for reference but the current approach does not use a web framework like Axum.
Summary
After evaluating the options, here are the recommended technologies for robit:
Core Framework
| Component | Choice | Rationale |
|---|---|---|
| Web Framework | Axum | Tokio-native, excellent middleware, type-safe |
| Note | N/A | Not needed for TUI architecture - see revised stack |
| Async Runtime | Tokio | Industry standard, comprehensive ecosystem |
| Serialization | Serde | Standard, zero-cost abstractions |
| Error Handling | thiserror + anyhow | Structured errors where needed, ergonomic elsewhere |
Data Layer
| Component | Choice | Rationale |
|---|---|---|
| Database | PostgreSQL | Robust, ACID, JSON support |
| Database Client | SQLx | Compile-time checked, async-native |
| Cache | Redis | Fast, pub/sub, rate limiting |
| Redis Client | redis (with tokio-comp features) | Native async |
| Migrations | sqlx-cli | Integrated with SQLx |
LLM Integration
| Component | Choice | Rationale |
|---|---|---|
| LLM Trait System | Custom | Multi-provider support, unified interface |
| HTTP Client | reqwest | Built on hyper, async-native, widely used |
| Streaming | futures + tokio-stream | Async stream handling |
| JSON Schema | schemars | Generate schemas from types |
| Validation | validator | Input validation with derives |
Security & Sandboxing
| Component | Choice | Rationale |
|---|---|---|
| Primary Sandbox | Wasmtime | Fast, capability-based, safe |
| Secondary Sandbox | gVisor | Strong isolation for complex code |
| Authentication | jsonwebtoken + custom API key | Flexible auth strategies |
| Hashing | argon2 | Modern, secure password hashing |
| Secrets | HashiCorp Vault client | Production secret management |
Observability
| Component | Choice | Rationale |
|---|---|---|
| Logging | tracing | Structured, async-aware |
| Log Output | tracing-subscriber | Flexible formatting |
| Metrics | metrics + metrics-exporter-prometheus | Standard observability |
| Distributed Tracing | tracing-opentelemetry | Jaeger/Zipkin compatible |
Development Tools
| Component | Choice | Rationale |
|---|---|---|
| Configuration | config | Layered config (file, env, CLI) |
| CLI Parsing | clap v4 | Derive macros, comprehensive |
| Environment | dotenvy | Development environment variables |
| Testing HTTP | wiremock | Mock HTTP servers |
| Testing DB | testcontainers | Real database in tests |
| Property Tests | proptest | Fuzzing and edge cases |
| Linting | clippy | Rust best practices |
| Formatting | rustfmt | Consistent style |
Deployment & Operations
| Component | Choice | Rationale |
|---|---|---|
| Container | Distroless or Alpine | Minimal attack surface |
| Orchestration | Kubernetes | Industry standard |
| Service Mesh | Linkerd or Istio (optional) | mTLS, observability |
| TLS | rustls | Pure Rust, audited |
Cargo.toml Skeleton
[package]
name = "robit"
version = "0.1.0"
edition = "2021"
rust-version = "1.75"
[dependencies]
# Core async runtime
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"
# Web framework
axum = { version = "0.7", features = ["ws"] }
tower = "0.4"
tower-http = { version = "0.5", features = ["trace", "cors", "limit"] }
# Serialization
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Database
sqlx = { version = "0.7", features = ["runtime-tokio", "tls-rustls", "postgres", "uuid", "chrono"] }
redis = { version = "0.24", features = ["tokio-comp", "connection-manager"] }
# HTTP client
reqwest = { version = "0.11", features = ["json", "stream", "rustls-tls"] }
# Validation
validator = { version = "0.16", features = ["derive"] }
schemars = "0.8"
# Security
jsonwebtoken = "9"
argon2 = "0.5"
# Sandboxing
wasmtime = "16"
wasmtime-wasi = "16"
# Observability
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
metrics = "0.22"
metrics-exporter-prometheus = "0.13"
# Error handling
thiserror = "1.0"
anyhow = "1.0"
# Utilities
config = "0.14"
clap = { version = "4.5", features = ["derive", "env"] }
uuid = { version = "1.6", features = ["v4", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
once_cell = "1.19"
[dev-dependencies]
tokio-test = "0.4"
wiremock = "0.6"
testcontainers = "0.15"
proptest = "1.4"
Project Structure
robit/
├── Cargo.toml
├── README.md
├── docs/ # mdbook documentation
│ ├── book.toml
│ └── src/
├── crates/ # Workspace members
│ ├── robit-core/ # Core engine
│ ├── robit-api/ # HTTP API server
│ ├── robit-cli/ # Command-line tool
│ ├── robit-sandbox/ # Sandboxing implementations
│ ├── robit-llm/ # LLM provider abstractions
│ └── robit-types/ # Shared types
├── migrations/ # SQLx migrations
├── scripts/ # Development scripts
├── docker/ # Container configurations
└── tests/ # Integration tests
Next Steps
- Initialize Cargo workspace
- Set up development environment
- Create core types crate
- Implement LLM trait system
- Build basic API server scaffold
- Add first sandbox implementation (WASM)
- Implement conversation management
- Add authentication layer
- Create CLI tool
- Write integration tests
See Phase 1: Foundation for detailed implementation plan.
Tech Stack Decisions: Web Framework Analysis
Why We Removed the Web Framework
You caught an important inconsistency. The original documentation recommended Axum as the web framework, but this was from the initial API-first architecture where we planned to build:
Axum Server → REST API → Web Frontend (React/Vue)
↓
TUI Client
With the shift to TUI-first, dual-target architecture, this changes completely:
New Architecture: No Web Framework Needed
Native Terminal (ratatui) ──┐
├── Shared Core ──→ LLM APIs
Web Browser (WASM) ─────────┘ ↓
SQLite/LocalStorage
Key differences:
- No server required for basic functionality
- Native TUI is a standalone binary (no HTTP)
- Web version is a static WASM app served from CDN/static host
- Both talk directly to LLM APIs from client side
What We Actually Need
For Native TUI
- `ratatui` + `crossterm` - UI framework
- `tokio` - Async runtime for I/O
- `reqwest` - HTTP client to call LLM APIs
- `sqlx` - Local SQLite database
- `clap` - CLI argument parsing
For Web (WASM)
- `ratzilla` - DOM backend for ratatui
- `wasm-bindgen` - Rust/JS bridge
- `web-sys` - Browser APIs
- `gloo` - Ergonomic web wrappers
Shared Core
- `serde` - Serialization
- `futures` - Async utilities
- Custom LLM provider trait
When Would We Add a Web Framework?
Only if we later implement server mode for:
- Multi-user/team features - Centralized conversation storage
- Cloud sync - Sync across devices without manual export/import
- Enterprise deployment - Admin controls, audit logs
- Heavy compute - Running large models on server instead of client
Then we’d add:
- `axum` or `actix-web` - HTTP API
- PostgreSQL - Centralized database
- Redis - Caching/sessions
- Authentication/authorization layer
But this is Phase 3+ and optional. The core product works without it.
Updated Recommendation
| Component | Original | Revised | Notes |
|---|---|---|---|
| Web Framework | Axum | None | Not needed for TUI |
| Database (native) | PostgreSQL | SQLite | Local, embedded |
| Database (web) | PostgreSQL | LocalStorage/IndexedDB | Browser storage |
| HTTP Client | reqwest | reqwest | Direct to LLM APIs |
| Server | Required | Optional | Only for sync features |
Why This Is Better
- Simpler deployment: `cargo install` or static web hosting
- Privacy: Conversations stay on device by default
- Offline capable: Works without internet (local models)
- Lower latency: No HTTP hop to our own server
- Cheaper to run: No server infrastructure for basic users
The Mental Model Shift
Old: Web application with TUI as an alternative client
New: Terminal application with web as an alternative deployment target
This aligns with tools like:
- `lazygit` - Git TUI (terminal only, no web)
- `k9s` - Kubernetes TUI (terminal only, no web)
- `opencode` - AI TUI (terminal-first, web as convenience)
The web deployment via WASM is a feature, not the primary architecture.
Documentation Updates
The tech-stack-revised.md document reflects this approach. This document (decisions.md) now serves as:
- Historical reference for the API-first approach
- Future reference if we add server mode
- Comparison of the two architectures
Next Steps
- Build `harness-core` - Shared business logic
- Build `harness-tui` - Native terminal app
- Build `harness-web` - WASM browser version
- Later: Consider `harness-server` only if needed
No Axum needed until we explicitly decide to add server-side features.
Revised Tech Stack: TUI + WASM Focus
Overview
Given the TUI-first, dual-target architecture (native terminal + WASM web), the tech stack shifts to support:
- Ratatui ecosystem for terminal UI
- WASM compatibility for web deployment
- Shared core between targets
- Async runtime that works in both environments
Core Framework
| Component | Choice | Rationale |
|---|---|---|
| TUI Framework | Ratatui | Mature, fast, WASM-compatible via ratzilla |
| Terminal Backend | crossterm | Cross-platform, async-ready |
| Web Backend | ratzilla | DOM-based rendering for WASM |
| Async Runtime | tokio (native) / wasm-bindgen-futures (web) | Best in class for each target |
| Error Handling | thiserror + anyhow | Ergonomic, works everywhere |
Target-Specific Stacks
Native Terminal Stack
┌──────────────────────────────────────┐
│ Native Application │
├──────────────────────────────────────┤
│ robit-tui │
│ • ratatui widgets │
│ • crossterm backend │
│ • tokio runtime │
├──────────────────────────────────────┤
│ robit-core (shared) │
│ • Business logic │
│ • LLM abstractions │
│ • Tool system │
├──────────────────────────────────────┤
│ Platform APIs │
│ • reqwest (HTTP) │
│ • sqlx (database) │
│ • directories (config) │
└──────────────────────────────────────┘
Key crates:
- `ratatui = "0.29"` - TUI framework
- `crossterm = "0.28"` - Terminal backend
- `tokio = { version = "1", features = ["full"] }` - Async runtime
- `reqwest = { version = "0.12", features = ["json", "stream"] }` - HTTP client
- `sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite"] }` - Database
- `directories = "5"` - Config/cache directories
WebAssembly Stack
┌──────────────────────────────────────┐
│ Web Application │
├──────────────────────────────────────┤
│ robit-web │
│ • ratzilla backend │
│ • ratatui widgets (same as native) │
│ • wasm-bindgen-futures │
├──────────────────────────────────────┤
│ robit-core (shared) │
│ • Same business logic │
│ • Conditional compilation for I/O │
├──────────────────────────────────────┤
│ Web APIs │
│ • web-sys (browser APIs) │
│ • gloo (ergonomic wrappers) │
│ • LocalStorage/IndexedDB │
└──────────────────────────────────────┘
Key crates:
- `ratzilla = "0.3"` - DOM-based ratatui backend
- `wasm-bindgen = "0.2"` - Rust/JS interop
- `wasm-bindgen-futures = "0.4"` - Async in WASM
- `web-sys = "0.3"` - Browser API bindings
- `gloo = "0.11"` - Ergonomic web APIs
- `js-sys = "0.3"` - JavaScript bindings
Shared Core Stack
Components that work in both environments via conditional compilation:
Serialization
- `serde = { version = "1.0", features = ["derive"] }` - Serialization
- `serde_json = "1.0"` - JSON handling
- `toml = "0.8"` - Config files
Async Utilities
- `futures = "0.3"` - Async utilities
- `async-trait = "0.1"` - Async traits
- `pin-project = "1"` - Pin projections
LLM Integration
- `async-openai = "0.26"` - OpenAI client (optional, native only)
- `reqwest` for custom HTTP implementations
- Custom trait system for multi-provider support
Text Processing
- `pulldown-cmark = "0.12"` - Markdown parsing
- `syntect = "5"` - Syntax highlighting (native)
- `tree-sitter = "0.24"` - Language parsing (optional)
Validation
- `validator = { version = "0.19", features = ["derive"] }` - Input validation
- `regex = "1"` - Pattern matching
Workspace Structure
robit/
├── Cargo.toml # Workspace root
├── book.toml # Documentation
├── docs/ # mdbook documentation
├── crates/
│ ├── robit-core/ # Shared business logic
│ │ ├── Cargo.toml
│ │ └── src/
│ │ ├── lib.rs
│ │ ├── conversation.rs
│ │ ├── llm/
│ │ │ ├── mod.rs
│ │ │ ├── provider.rs
│ │ │ └── openai.rs
│ │ ├── tools/
│ │ │ ├── mod.rs
│ │ │ └── registry.rs
│ │ └── config.rs
│ │
│ ├── robit-tui/ # Native terminal app
│ │ ├── Cargo.toml
│ │ └── src/
│ │ ├── main.rs
│ │ ├── app.rs
│ │ ├── ui/
│ │ │ ├── mod.rs
│ │ │ ├── conversation_view.rs
│ │ │ ├── input.rs
│ │ │ └── sidebar.rs
│ │ ├── event.rs
│ │ └── widgets/
│ │
│ ├── robit-web/ # WASM web app
│ │ ├── Cargo.toml
│ │ ├── index.html
│ │ └── src/
│ │ ├── main.rs
│ │ ├── app.rs
│ │ └── storage.rs
│ │
│ └── robit-cli/ # Command-line interface
│ ├── Cargo.toml
│ └── src/
│ └── main.rs
│
├── examples/ # Usage examples
├── scripts/ # Build scripts
└── tests/ # Integration tests
Build Configuration
Native Build
# crates/robit-tui/Cargo.toml
[package]
name = "robit-tui"
version = "0.1.0"
edition = "2021"
[[bin]]
name = "robit"
path = "src/main.rs"
[dependencies]
robit-core = { path = "../robit-core" }
# TUI
crossterm = "0.28"
ratatui = "0.29"
# Async
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"
# HTTP
reqwest = { version = "0.12", features = ["json", "stream", "rustls-tls"] }
# Database
sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite"] }
# Config
directories = "5"
config = "0.14"
# UI
syntect = "5" # Syntax highlighting
unicode-width = "0.1"
textwrap = "0.16"
# CLI
clap = { version = "4.5", features = ["derive"] }
WASM Build
# crates/robit-web/Cargo.toml
[package]
name = "robit-web"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
robit-core = { path = "../robit-core" }
# WebAssembly
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
# web-sys is declared below with an explicit feature list
gloo = "0.11"
console_error_panic_hook = "0.1"
# TUI (same widgets as native)
ratzilla = "0.3"
ratatui = "0.29" # Shared version with native
# Async (browser-compatible; wasm-bindgen-futures declared above)
futures = "0.3"
# Storage
gloo-storage = "0.3"
indexed_db_futures = "0.4" # Optional IndexedDB wrapper
[dependencies.web-sys]
version = "0.3"
features = [
"console",
"Window",
"Document",
"Element",
"HtmlElement",
"Storage",
"IdbDatabase", # IndexedDB
]
Conditional Compilation
The shared core uses cfg attributes for platform-specific code:
#![allow(unused)]
fn main() {
// crates/robit-core/src/storage.rs
#[cfg(not(target_arch = "wasm32"))]
mod native {
use sqlx::SqlitePool;
pub struct DatabaseStorage {
pool: SqlitePool,
}
impl Storage for DatabaseStorage {
// SQLite implementation
}
}
#[cfg(target_arch = "wasm32")]
mod web {
use gloo_storage::{LocalStorage, Storage as GlooStorage};
pub struct WebStorage;
impl Storage for WebStorage {
// LocalStorage/IndexedDB implementation
}
}
#[cfg(not(target_arch = "wasm32"))]
pub use native::DatabaseStorage as StorageImpl;
#[cfg(target_arch = "wasm32")]
pub use web::WebStorage as StorageImpl;
}
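The `Storage` trait these implementations target is not shown above. A minimal version might look like the following, with method names that are illustrative only (the real trait would likely be async); an in-memory implementation doubles as a test fixture on either target:

```rust
use std::collections::HashMap;

// Hypothetical Storage trait that DatabaseStorage and WebStorage could both
// implement. Method names are illustrative; the real trait would likely be
// async and use richer error types.
#[derive(Clone)]
pub struct Conversation {
    pub id: String,
    pub title: String,
}

pub trait Storage {
    fn save_conversation(&mut self, conversation: &Conversation) -> Result<(), String>;
    fn load_conversation(&self, id: &str) -> Option<Conversation>;
}

/// In-memory implementation: useful as a test fixture on either target.
pub struct MemoryStorage {
    items: HashMap<String, Conversation>,
}

impl MemoryStorage {
    pub fn new() -> Self {
        Self { items: HashMap::new() }
    }
}

impl Storage for MemoryStorage {
    fn save_conversation(&mut self, conversation: &Conversation) -> Result<(), String> {
        self.items.insert(conversation.id.clone(), conversation.clone());
        Ok(())
    }

    fn load_conversation(&self, id: &str) -> Option<Conversation> {
        self.items.get(id).cloned()
    }
}

fn main() {
    let mut storage = MemoryStorage::new();
    storage
        .save_conversation(&Conversation { id: "1".into(), title: "First chat".into() })
        .unwrap();
    println!("{}", storage.load_conversation("1").unwrap().title);
}
```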
HTTP Client Abstraction
Different HTTP clients for each target:
#![allow(unused)]
fn main() {
// crates/robit-core/src/http.rs
#[cfg(not(target_arch = "wasm32"))]
pub type HttpClient = reqwest::Client;
#[cfg(target_arch = "wasm32")]
pub type HttpClient = reqwest_wasm::Client; // Or custom fetch-based client
// Unified interface
#[async_trait]
pub trait HttpClientExt {
async fn get(&self, url: &str) -> Result<Response>;
async fn post(&self, url: &str, body: Value) -> Result<Response>;
}
}
Development Tools
Native Development
# Run TUI app
cargo run -p robit-tui
# With logging
RUST_LOG=debug cargo run -p robit-tui
# Release build
cargo build -p robit-tui --release
Web Development
# Install trunk
cargo install trunk
# Add WASM target
rustup target add wasm32-unknown-unknown
# Serve with hot reload
cd crates/robit-web && trunk serve
# Build for production
trunk build --release
Cross-compilation Testing
# Test native
cargo test --workspace --exclude robit-web
# Check WASM compiles
cargo check -p robit-web --target wasm32-unknown-unknown
# Run WASM tests with wasm-pack
wasm-pack test --headless --firefox
Comparison: Before vs After
Original (API-first)
Axum (HTTP) → Core → Database
↓
Sandboxed Executor
Revised (TUI + WASM)
┌─────────────────────────────────────────┐
│ Terminal (ratatui + crossterm) │
│ Web (ratzilla + wasm-bindgen) │
└─────────────────┬───────────────────────┘
│
┌────────▼────────┐
│ Shared Core │
│ • Conversations│
│ • LLM providers│
│ • Tools │
└────────┬────────┘
│
┌─────────────┼─────────────┐
│ │ │
┌───▼───┐ ┌──────▼──────┐ ┌───▼───┐
│SQLite │ │HTTP Client │ │Config │
│(native)│ │(reqwest) │ │(dirs) │
└───────┘ └─────────────┘ └───────┘
┌─── Web Alternative ───┐
│LocalStorage/IndexedDB │
│fetch API │
└───────────────────────┘
Migration Path
If we later want to add the API/server approach:
- Extract server crate: `harness-server` with Axum
- Add client modes: TUI can connect to a local or remote server
- Web app options: WASM (standalone) or web client (connects to server)
This keeps options open without over-engineering from the start.
Summary
The revised stack prioritizes:
- Developer experience: Ratatui provides excellent TUI ergonomics
- Code sharing: Single core works across targets
- Deployment flexibility: Native binary or static web hosting
- Performance: Native speed in terminal, near-native in browser
The WASM approach via ratzilla is particularly elegant because it renders the same ratatui widgets to DOM elements, giving us a true terminal aesthetic in the browser without maintaining separate UI code.
Milestones
Overview
The project is divided into four major milestones, each building on the previous:
- Milestone 1: Foundation - Core infrastructure and basic functionality
- Milestone 2: Security - Sandboxing and security hardening
- Milestone 3: Scale - Performance optimization and enterprise features
- Milestone 4: Ecosystem - Plugin system and community tools
Timeline
Month:  1    2    3    4    5    6    7    8
        ├───M1───┤
                  ├───M2───┤
                            ├───M3───┤
                                      ├───M4───┤
Milestone 1: Foundation (Months 1-2)
Goal: Working prototype with basic functionality
Deliverables:
- Project scaffolding and CI/CD
- Core types and domain models
- Basic LLM provider integration (OpenAI)
- REST API with conversation endpoints
- Simple in-process code execution
- PostgreSQL persistence
- Docker deployment
Success Criteria:
- Can send prompts and receive responses
- Conversations persist across restarts
- API documented and testable
- Basic observability in place
Milestone 2: Security (Months 3-4)
Goal: Production-ready security
Deliverables:
- WASM sandbox implementation
- gVisor container sandbox
- Authentication system (API keys + JWT)
- Authorization and permissions
- Comprehensive audit logging
- Input/output validation and sanitization
- Rate limiting and resource quotas
- Security documentation and threat model review
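Rate limiting could be implemented as a token bucket, sketched here with illustrative parameters (one token per request, refill at a fixed rate up to a burst capacity):

```rust
use std::time::Instant;

// Token-bucket sketch: each request consumes one token; tokens refill at a
// fixed rate up to a burst capacity. Parameters here are illustrative.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    pub fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last_refill: Instant::now() }
    }

    /// Returns true if a token was available and consumed.
    pub fn try_acquire(&mut self) -> bool {
        // Refill based on elapsed time, clamped to the burst capacity.
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0); // burst of 2, 1 token/sec
    println!("{}", bucket.try_acquire());
    println!("{}", bucket.try_acquire());
    println!("{}", bucket.try_acquire()); // drained until tokens refill
}
```

A per-tenant map of buckets gives the resource-quota behavior described above; a production version would also need to be shared safely across tasks.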
Success Criteria:
- All code execution sandboxed
- No secrets exposed to sandboxes
- Complete audit trail
- Pass security review checklist
- Load tested with quotas enforced
Milestone 3: Scale (Months 5-6)
Goal: Enterprise-ready performance and reliability
Deliverables:
- Multi-tenant support
- Horizontal scaling with worker pool
- Multiple LLM provider support
- Conversation context optimization
- Advanced caching strategies
- Streaming responses
- WebSocket support
- Kubernetes deployment manifests
- Load testing and performance tuning
Success Criteria:
- Handle 1000+ concurrent conversations
- Sub-100ms p95 latency for simple requests
- Graceful degradation under load
- 99.9% uptime target
Milestone 4: Ecosystem (Months 7-8)
Goal: Extensible platform with community support
Deliverables:
- Plugin architecture for custom tools
- Tool registry and marketplace concept
- Advanced CLI with scripting support
- SDK for Rust integration
- Python bindings
- Example applications and tutorials
- Plugin development documentation
- Community Discord/forum
Success Criteria:
- Third-party tool plugins possible
- Clear documentation for plugin authors
- Multiple example applications
- Active community engagement
Stretch Goals
Beyond Milestone 4
- Federation: Cross-instance communication
- Federated Learning: Privacy-preserving model improvements
- Model Fine-tuning: Custom model training pipeline
- Mobile SDK: iOS and Android support
- Desktop Application: Electron/Tauri GUI
Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Sandbox vulnerabilities | Medium | Critical | Regular security audits, defense in depth |
| LLM API changes | High | Medium | Abstraction layer, multiple providers |
| Performance issues | Medium | High | Early load testing, profiling |
| Scope creep | High | Medium | Strict milestone boundaries |
| Contributor burnout | Medium | Medium | Sustainable pace, clear documentation |
Success Metrics
Technical Metrics
- Response time: p95 < 500ms for simple requests
- Availability: 99.9% uptime
- Security: Zero critical vulnerabilities in production
- Test coverage: > 80% code coverage
Adoption Metrics
- GitHub stars: Target 1k by month 6
- Active users: 100+ by month 8
- Plugins created: 10+ community plugins by month 8
- Contributors: 5+ external contributors by month 8
Review Process
At the end of each milestone:
- Demo to stakeholders
- Retrospective on what went well/what didn’t
- Security review of new features
- Performance benchmark comparison
- Documentation review and updates
- Roadmap adjustment based on learnings
Phase 1: Foundation
Objective
Establish a working prototype with core functionality: prompt handling, LLM integration, and basic persistence.
Week 1-2: Project Setup
Week 1: Repository and Tooling
Tasks:
- Initialize Git repository
- Create Cargo workspace structure
- Set up GitHub Actions CI:
- Format check (rustfmt)
- Lint check (clippy)
- Test execution
- Security audit (cargo audit)
- Configure Dependabot
- Set up branch protection rules
- Create CONTRIBUTING.md
- Add LICENSE (MIT/Apache-2.0 dual)
Deliverables:
- Green CI pipeline
- Contribution guidelines
- Repository structure established
Week 2: Core Infrastructure
Tasks:
- Create workspace crates:
  - `harness-types` - shared types and errors
  - `harness-core` - domain logic
  - `harness-api` - HTTP server
  - `harness-cli` - command-line interface
- Set up SQLx with compile-time checking
- Create initial database schema
- Implement configuration system (config crate)
- Add structured logging (tracing)
- Create health check endpoint
Deliverables:
- Workspace compiles
- Database migrations working
- Configuration loading from file/env
- Health endpoint returns 200 OK
Week 3-4: Domain Layer
Week 3: Core Types and Models
Tasks:
- Define core domain types:
struct Conversation { /* ... */ }
struct Message { /* ... */ }
struct ToolCall { /* ... */ }
enum Role { User, Assistant, System }
- Implement conversation state machine
- Create LLM provider trait
- Define prompt templates structure
- Add validation rules
Deliverables:
- Types compile with all derives
- Unit tests for domain logic
- Documentation for all public types
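The provider trait mentioned above might be sketched like this (names such as `LlmProvider` and `complete` are illustrative, not the project's settled API). A mock implementation makes the domain logic testable before any HTTP code exists:

```rust
/// Hypothetical provider abstraction; names are illustrative.
trait LlmProvider {
    /// Send a prompt and return the model's reply, or an error message.
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// A mock provider, useful for unit tests before any HTTP client exists.
struct MockProvider;

impl LlmProvider for MockProvider {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}
```

Real providers (OpenAI in Week 4) would implement the same trait, keeping the rest of the codebase provider-agnostic.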
Week 4: LLM Integration
Tasks:
- Implement OpenAI provider
- Completion API
- Streaming support
- Error handling
- Create LLM client abstraction
- Add request/response logging
- Implement retry logic with backoff
- Add circuit breaker pattern
Deliverables:
- Can send requests to OpenAI API
- Streaming responses work
- Errors handled gracefully
- Unit tests with mocked LLM
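Retry with exponential backoff, as listed above, can be sketched with the stdlib alone (the helper name is hypothetical; a real client would add jitter and cap the maximum delay):

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay after each
/// failure. A minimal sketch; production code would add jitter and a cap.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2; // exponential backoff
                attempt += 1;
            }
        }
    }
}
```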
Week 5-6: API Layer
Week 5: REST API Endpoints
Tasks:
- Implement endpoints:
  - `POST /conversations` - Create conversation
  - `GET /conversations/:id` - Get conversation
  - `POST /conversations/:id/messages` - Send message
  - `GET /conversations/:id/messages` - List messages
  - `DELETE /conversations/:id` - Delete conversation
- Add request/response validation
- Implement error responses (RFC 7807 Problem Details)
- Add OpenAPI/Swagger documentation
Deliverables:
- All endpoints functional
- API documentation available
- Error responses consistent
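RFC 7807 Problem Details responses carry `type`, `title`, `status`, `detail`, and `instance` members. An illustrative error body (the `type` URI is a placeholder):

```json
{
  "type": "https://example.com/problems/conversation-not-found",
  "title": "Conversation Not Found",
  "status": 404,
  "detail": "No conversation exists with id 550e8400-e29b-41d4-a716-446655440000",
  "instance": "/conversations/550e8400-e29b-41d4-a716-446655440000"
}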
Week 6: Persistence and State
Tasks:
- Implement conversation repository
- Add message persistence
- Create conversation service layer
- Add pagination for message lists
- Implement conversation listing with filters
- Add soft delete support
Deliverables:
- Conversations persist across restarts
- Can list and filter conversations
- Deletion is soft (recoverable)
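Soft delete means setting `deleted_at` rather than removing rows, matching the `conversations` schema sketched later in this phase. Illustrative queries:

```sql
-- Soft delete: mark rather than remove
UPDATE conversations SET deleted_at = NOW() WHERE id = $1;

-- Listing excludes soft-deleted rows, paginated by LIMIT/OFFSET
SELECT id, title, updated_at
FROM conversations
WHERE deleted_at IS NULL
ORDER BY updated_at DESC
LIMIT $1 OFFSET $2;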
Week 7-8: Execution and Polish
Week 7: Basic Code Execution
Tasks:
- Implement simple in-process executor (for development only)
- Parse tool calls from LLM responses
- Create tool registry
- Add echo/math/read_file example tools
- Implement tool result formatting
Deliverables:
- LLM can invoke tools
- Tool results return to conversation
- Basic tool examples working
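A minimal tool registry can be a name-to-function map. The `ToolRegistry` shape below is a stdlib-only sketch; real tools would take structured JSON arguments rather than a bare string:

```rust
use std::collections::HashMap;

/// A tool takes a string argument and returns a string result.
/// Hypothetical shape; real tools would use structured JSON arguments.
type Tool = Box<dyn Fn(&str) -> Result<String, String>>;

#[derive(Default)]
struct ToolRegistry {
    tools: HashMap<String, Tool>,
}

impl ToolRegistry {
    fn register(&mut self, name: &str, tool: Tool) {
        self.tools.insert(name.to_string(), tool);
    }

    /// Dispatch a tool call parsed from an LLM response, by name.
    fn invoke(&self, name: &str, args: &str) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool(args),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}
```

An unknown tool name becomes an error result that can be fed back to the conversation rather than crashing the executor.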
Week 8: CLI and Integration
Tasks:
- Build interactive CLI
- Create conversation command
- Send message command
- List conversations command
- Configuration command
- Add Docker support
- Dockerfile
- docker-compose.yml with PostgreSQL
- Create integration tests
- Write initial README
Deliverables:
- CLI usable for basic interactions
- Docker setup works
- Integration tests pass
- README with setup instructions
Technical Decisions for Phase 1
Database Schema
-- conversations table
CREATE TABLE conversations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
title VARCHAR(255),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
deleted_at TIMESTAMPTZ
);
-- messages table
CREATE TABLE messages (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
conversation_id UUID NOT NULL REFERENCES conversations(id),
role VARCHAR(20) NOT NULL,
content TEXT NOT NULL,
tool_calls JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- indexes
CREATE INDEX idx_messages_conversation ON messages(conversation_id);
CREATE INDEX idx_conversations_updated ON conversations(updated_at DESC);
API Response Format
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"title": "Hello conversation",
"messages": [
{
"id": "...",
"role": "user",
"content": "Hello, can you help me?"
},
{
"id": "...",
"role": "assistant",
"content": "I'd be happy to help! What do you need?"
}
],
"created_at": "2024-01-15T10:30:00Z",
"updated_at": "2024-01-15T10:31:00Z"
}
Configuration Structure
[server]
host = "0.0.0.0"
port = 8080
[database]
url = "postgres://user:pass@localhost/harness"
max_connections = 10
[llm]
provider = "openai"
api_key = "${OPENAI_API_KEY}"
model = "gpt-4"
timeout_seconds = 30
[logging]
level = "info"
format = "json"
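The `${OPENAI_API_KEY}` placeholder implies environment-variable expansion at load time. A minimal stdlib sketch of that expansion (the real implementation may differ, e.g. layered sources in the config crate):

```rust
use std::env;

/// Expand a single `${VAR}` placeholder from the environment.
/// A minimal sketch; real configs may need multiple placeholders and escapes.
fn expand_env(value: &str) -> String {
    if let Some(name) = value.strip_prefix("${").and_then(|v| v.strip_suffix('}')) {
        env::var(name).unwrap_or_default()
    } else {
        value.to_string()
    }
}
```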
Definition of Done for Phase 1
- All code compiles without warnings
- CI pipeline passes
- Unit test coverage > 70%
- Integration tests pass
- API documentation complete
- Docker setup works
- CLI functional
- Can create conversation, send messages, receive responses
- Conversations persist in database
- Basic tool execution works
- README with quickstart guide
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| SQLx compile times | Use query! macro sparingly, prefer query_as! |
| LLM API rate limits | Implement caching, add retry logic |
| Scope creep | Strict backlog, daily standups |
| Testing complexity | Use testcontainers for integration tests |
Next Phase Preview
Phase 2 (Security) will build on this foundation by:
- Replacing in-process execution with sandboxed environments
- Adding authentication and authorization
- Implementing comprehensive audit logging
- Hardening all endpoints
The foundation laid here must support these additions without major refactoring.
Revised Roadmap: TUI + WASM Architecture
Overview
The revised roadmap reflects the shift to a TUI-first, dual-target architecture (native terminal + WASM web) while maintaining the security-focused core.
Architecture Phases
┌────────────────────────────────────────────────────────────────────┐
│ ARCHITECTURE EVOLUTION │
├────────────────────────────────────────────────────────────────────┤
│ │
│ Phase 1 Phase 2 Phase 3 Phase 4 │
│ (Months 1-3) (Months 3-5) (Months 5-7) (Month 8+) │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Core │ ────>│ Core │──────>│ Core │─────>│ Core │ │
│ │ + TUI │ │ + Web │ │ Security│ │Plugins │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ Terminal-only Terminal + Web Production Ecosystem │
│ Prototype Demo Ready Ready │
│ │
└────────────────────────────────────────────────────────────────────┘
Phase 1: Core + Native TUI (Months 1-3)
Goal
Working terminal application with basic LLM interaction
Month 1: Foundation
Week 1-2: Project Setup
- Initialize workspace with 4 crates:
  - `harness-core` - Shared business logic
  - `harness-tui` - Native terminal app
  - `harness-cli` - Command-line interface
  - `harness-web` - WASM web app (stub)
- Set up CI/CD for native builds
- Configure cross-compilation targets
- Add tracing/logging infrastructure
- Create development environment scripts
Week 3-4: Core Types
- Define conversation models
- Implement message types (text, tool calls)
- Create LLM provider trait
- Add configuration system
- Implement conversation persistence (SQLite)
Deliverables:
- Workspace compiles
- Core types documented and tested
- SQLite schema defined
Month 2: TUI Implementation
Week 5-6: Basic UI
- Set up ratatui with crossterm backend
- Create main app loop
- Implement conversation list sidebar
- Build message view with scrolling
- Add input area with multiline support
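The update half of the app loop is easiest to test when input handling is decoupled from the terminal backend. A framework-agnostic sketch, where the `Input` enum stands in for crossterm's `KeyEvent` (all names here are illustrative):

```rust
/// Framework-agnostic input events; in the real app these would be
/// translated from crossterm key events.
enum Input {
    Char(char),
    Backspace,
    Enter,
    Quit,
}

#[derive(Default)]
struct App {
    input: String,
    messages: Vec<String>,
    should_quit: bool,
}

impl App {
    /// One step of the update loop: mutate state in response to input.
    fn handle(&mut self, event: Input) {
        match event {
            Input::Char(c) => self.input.push(c),
            Input::Backspace => { self.input.pop(); }
            Input::Enter => {
                if !self.input.is_empty() {
                    self.messages.push(std::mem::take(&mut self.input));
                }
            }
            Input::Quit => self.should_quit = true,
        }
    }
}
```

Keeping `App::handle` free of terminal types means the same state machine can later back the ratzilla web build unchanged.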
Week 7-8: Interaction
- Keyboard navigation (vim bindings)
- Command palette (`:` commands)
- Conversation creation/deletion
- Message rendering (markdown)
- Basic syntax highlighting
Deliverables:
- Interactive TUI functional
- Can create conversations
- Basic keyboard navigation
Month 3: LLM Integration
Week 9-10: OpenAI Provider
- Implement HTTP client abstraction
- OpenAI API integration
- Streaming response handling
- Error handling and retries
- Conversation context management
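Streaming responses arrive as server-sent events. A stdlib-only sketch of splitting a chunk into `data:` payloads; real code must also buffer partial lines across chunks and JSON-decode each payload to extract the token delta:

```rust
/// Extract `data:` payloads from a chunk of an SSE stream, stopping at
/// the `[DONE]` sentinel. A sketch: real streaming must handle partial
/// lines across chunks and parse each payload as JSON.
fn sse_data_lines(chunk: &str) -> Vec<&str> {
    chunk
        .lines()
        .filter_map(|line| line.strip_prefix("data: "))
        .take_while(|payload| *payload != "[DONE]")
        .collect()
}
```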
Week 11-12: Polish
- Configuration file support
- API key management (secure storage)
- Conversation import/export
- Basic search functionality
- README and usage docs
Deliverables:
- Can chat with GPT-4 via TUI
- Configuration management
- Alpha release (terminal only)
Phase 2: WASM Web Deployment (Months 3-5)
Goal
Same application running in browser via WebAssembly
Month 4: Web Foundation
Week 13-14: WASM Setup
- Configure `trunk` for WASM builds
- Add `ratzilla` backend
- Set up conditional compilation for core
- Create web-specific storage (LocalStorage)
- Implement web HTTP client (fetch-based)
Week 15-16: Web UI
- Adapt TUI widgets for web rendering
- Handle browser events (keyboard, resize)
- Add web-specific features:
- URL routing (conversation IDs)
- Browser history integration
- Download conversations
- Responsive layout adjustments
Deliverables:
- WASM build compiles
- Runs in browser with TUI aesthetic
Month 5: Cross-Platform Polish
Week 17-18: Data Sync
- Export/import format (JSON)
- Git-based sync option
- Conversation backup/restore
- Cross-device workflow documentation
Week 19-20: Tool System v1
- Define tool schema
- Implement tool registry
- Add example tools:
- Read file
- Execute command (native only)
- Search web
- Tool result rendering
Deliverables:
- Native and web versions reach feature parity
- Tool system functional
- Beta release (both platforms)
Phase 3: Security Hardening (Months 5-7)
Goal
Production-ready security for code execution
Month 6: Sandboxing
Week 21-22: WASM Sandbox
- Integrate `wasmtime` for tool execution
- Define WASI capabilities
- Implement tool isolation
- Resource limits (CPU, memory)
Week 23-24: Container Sandbox
- gVisor integration for complex tools
- Docker/gVisor runtime setup
- Network egress filtering
- Filesystem restrictions
Deliverables:
- Tools run in sandboxed environment
- Resource limits enforced
Month 7: Security Features
Week 25-26: Audit System
- Comprehensive audit logging
- Tamper-evident log storage
- Operation tracing
- Security event alerts
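Tamper evidence is commonly achieved by hash-chaining log records, so rewriting any entry breaks every later hash. A sketch using the stdlib hasher as a stand-in for a cryptographic hash such as SHA-256 (which a real audit log must use):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Tamper-evident log: each record's hash covers the previous record's
/// hash, so edits to history break the chain. DefaultHasher is only a
/// placeholder for a cryptographic hash.
struct AuditLog {
    entries: Vec<(String, u64)>, // (message, chained hash)
}

impl AuditLog {
    fn new() -> Self {
        AuditLog { entries: Vec::new() }
    }

    fn append(&mut self, message: &str) {
        let prev = self.entries.last().map(|(_, h)| *h).unwrap_or(0);
        let mut hasher = DefaultHasher::new();
        prev.hash(&mut hasher);
        message.hash(&mut hasher);
        self.entries.push((message.to_string(), hasher.finish()));
    }

    /// Recompute the chain and report whether it is intact.
    fn verify(&self) -> bool {
        let mut prev = 0u64;
        self.entries.iter().all(|(msg, stored)| {
            let mut hasher = DefaultHasher::new();
            prev.hash(&mut hasher);
            msg.hash(&mut hasher);
            let ok = hasher.finish() == *stored;
            prev = *stored;
            ok
        })
    }
}
```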
Week 27-28: Access Control
- API key authentication
- Permission system
- Rate limiting
- Abuse prevention
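Rate limiting is often a token bucket per API key. A stdlib sketch of the bucket itself (a real server would keep one bucket per key behind a mutex):

```rust
use std::time::Instant;

/// Token bucket: holds up to `capacity` tokens, refilled at `rate`
/// tokens per second. Each request consumes one token.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    rate: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, rate: f64) -> Self {
        TokenBucket { capacity, tokens: capacity, rate, last: Instant::now() }
    }

    /// Refill based on elapsed time, then try to spend one token.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        self.tokens = (self.tokens + elapsed * self.rate).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```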
Deliverables:
- Security audit passes
- Production security checklist complete
Phase 4: Ecosystem (Month 8+)
Goal
Extensible platform with plugin support
Month 8+: Advanced Features
- Plugin System: WASM-based tool plugins
- Multi-Provider: Claude, local models, etc.
- Collaboration: Shared conversations
- Zellij Plugin: Optional integration
- Server Mode: API for remote access
- Mobile: PWA for mobile browsers
Technology Timeline
| Month | Native TUI | Web WASM | Security | Features |
|---|---|---|---|---|
| 1 | Core + SQLite | - | - | Types, config |
| 2 | Ratatui UI | - | - | Chat, navigation |
| 3 | OpenAI | - | - | Streaming |
| 4 | Polish | Ratzilla | - | Browser UI |
| 5 | Tools | Sync | - | Tool system |
| 6 | - | - | Wasmtime | Sandboxing |
| 7 | - | - | gVisor, audit | Production |
| 8+ | Plugins | PWA | Hardening | Ecosystem |
Workspace Evolution
Month 1-3: Foundation
crates/
├── harness-core/ # Business logic
├── harness-tui/ # Terminal app
└── harness-cli/ # CLI
Month 4-5: Web Addition
crates/
├── harness-core/ # Business logic (now with wasm cfg)
├── harness-tui/ # Terminal app
├── harness-web/ # WASM web app ★ new
└── harness-cli/ # CLI
Month 6-8: Security & Plugins
crates/
├── harness-core/ # Business logic
├── harness-tui/ # Terminal app
├── harness-web/ # WASM web app
├── harness-cli/ # CLI
├── harness-sandbox/ # Sandboxing implementations ★ new
└── harness-plugin/ # Plugin SDK ★ new
Success Metrics by Phase
Phase 1 (Month 3)
- TUI works on Linux/macOS
- Can have multi-turn conversation
- Conversations persist
- < 1 second startup time
Phase 2 (Month 5)
- Web version runs in Chrome/Firefox
- Feature parity with native
- Tools can read files
- Export/import works
Phase 3 (Month 7)
- Tools run in sandbox
- Complete audit trail
- Passes security review
- No critical vulnerabilities
Phase 4 (Month 8+)
- 5+ community plugins
- 100+ GitHub stars
- Documentation complete
- Stable release
Risk Mitigation
| Risk | Phase | Mitigation |
|---|---|---|
| WASM compilation issues | 2 | Start with simple examples, test early |
| Ratatui/web feature gaps | 2 | Maintain feature flags, graceful degradation |
| Sandbox performance | 3 | Benchmark early, optimize hot paths |
| Storage sync complexity | 2 | Start with file-based, add cloud later |
| Scope creep | All | Strict milestone definitions, monthly reviews |
Comparison with Original Plan
Original (API-first)
- REST API with Axum
- Web frontend (React/Vue)
- TUI as secondary client
- Server required
Revised (TUI-first)
- Native TUI with ratatui
- WASM web via ratzilla
- Shared core logic
- Works offline (local models)
- Optional server later
Advantages of Revised Approach
- Simpler deployment: No server to run for personal use
- Better UX: TUI-first feels more natural for developers
- Code reuse: Same UI code for terminal and web
- Offline capable: Can work without internet (local LLMs)
- Lower latency: No HTTP overhead for local usage
Trade-offs
- Web features: No real-time collaboration (initially)
- Multi-user: Requires server mode (future feature)
- Mobile: Browser-based only (no native app)
Conclusion
The revised roadmap prioritizes:
- Terminal experience: Where developers spend most time
- Web accessibility: Via WASM, not separate codebase
- Security: Sandboxed execution from the start
- Gradual complexity: Start simple, add features incrementally
This approach aligns with the philosophy of tools like opencode, lazygit, and k9s—terminal-native first, with web as a convenient alternative when needed.
Development Roadmap: Iterative Learning
Philosophy
This roadmap prioritizes learning through doing over comprehensive upfront planning. We’ll build, dogfood, and iterate rather than designing everything in advance.
“We need to stub out the TUI to start getting benefits locally. Learn about wrong assumptions while dogfooding.”
Phase 1: Foundation & Research ✓ (Completed)
Status: Done
Goal: Understand the landscape and choose tools
What We Did
- Analyzed pi-mono architecture and features
- Evaluated Rust TUI ecosystem (ratatui + ratzilla)
- Mapped dependencies for feature parity
- Documented technology philosophy (Rust + WASM only)
Artifacts
- Documentation in `/docs/` outlining architecture choices
- Dependency mapping (rust-dependencies-pi-parity.md)
- Clear stance: No TypeScript, no web frameworks
Decision: Ready to start building
Phase 2: Local TUI Prototype (Current)
Status: Starting now
Goal: Get a working TUI for local use, learn through dogfooding
Timeline: 2-4 weeks
The Plan
Don’t build everything at once. Build just enough to start using it daily.
Week 1: Hello TUI
┌─────────────────────────────────────┐
│ robit v0.0.1 │
├─────────────────────────────────────┤
│ │
│ Welcome! Type a message below. │
│ │
├─────────────────────────────────────┤
│ You: Hello, can you help me? │
│ AI: Hi! I'd be happy to help. │
│ │
│ > _ │
│ │
└─────────────────────────────────────┘
Features:
- Basic ratatui app structure
- Simple input field
- Static response (mock LLM)
- Quit with ‘q’
Purpose: Learn ratatui basics, event loops, rendering
Week 2: Real LLM Integration
- Add reqwest for HTTP calls
- OpenAI integration (simple, no streaming)
- Config file for API key (simple TOML)
- Conversation history in memory
Purpose: Learn async in TUI context, config management
Week 3: Better UI
- Scrollable message view
- Multi-line input (TextArea widget)
- Conversation list sidebar
- Create/switch conversations
Purpose: Learn widgets, layout, state management
Week 4: Polish for Daily Use
- Markdown rendering (tui-markdown)
- Syntax highlighting in code blocks (syntect)
- Save conversations to disk (JSON or redb)
- Keyboard shortcuts (vim-style navigation)
Purpose: Make it actually usable for daily work
Success Criteria
- You use it for real work daily
- Clear understanding of what works/doesn’t
- List of “wrong assumptions” documented
- Clear pain points identified
Technical Approach
Start simple, no abstractions:
// main.rs - everything in one file initially
// Don't create crates/ workspace yet
// Don't worry about architecture
// Just make it work
mod app;    // App state
mod ui;     // Rendering
mod events; // Input handling
mod llm;    // OpenAI client
Single crate, single binary:
[package]
name = "robit"
version = "0.0.1"
[[bin]]
name = "robit"
path = "src/main.rs"
[dependencies]
ratatui = "0.30"
crossterm = "0.29"
tokio = { version = "1", features = ["full"] }
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1.0", features = ["derive"] }
toml = "0.9"
# ... minimal deps
Don’t do yet:
- ❌ Separate crates/core/tui split
- ❌ WASM/web support
- ❌ Tool system
- ❌ Multiple LLM providers
- ❌ Tests (manual testing only)
Phase 3: Feature Parity with pi-mono
Status: Future
Goal: Match pi-mono’s core features
Timeline: 4-8 weeks (after Phase 2 feels good)
When to Start
Start this phase when:
- You’re using the Phase 2 TUI daily
- No major architecture regrets
- Clear understanding of needed features
Features to Implement
Based on pi-mono analysis:
Core Chat
- Streaming responses (real-time token display)
- Multi-model support (OpenAI, Anthropic, etc.)
- Model switching mid-conversation
- Token usage tracking
Editor
- Vim-style text editor for input
- Syntax highlighting while typing
- Multi-line editing with proper indentation
- Clipboard integration (arboard)
History & Sessions
- Session tree navigation (like pi’s tree selector)
- Named conversations
- Search through history
- Export conversations
Display
- Better markdown rendering
- Code block with syntax highlighting
- Diff viewer for code changes
- File tree sidebar (tui-tree-widget)
Config
- Theme system
- Keybinding configuration
- Profile support (different API keys, defaults)
Architecture Evolution
Split into crates when needed:
robit/ # Single repo
├── Cargo.toml # Workspace
├── crates/
│ ├── robit-core/ # Extract when needed
│ │ ├── src/
│ │ │ ├── lib.rs
│ │ │ ├── conversation.rs
│ │ │ ├── llm/
│ │ │ └── config.rs
│ │ └── Cargo.toml
│ │
│ └── robit-tui/ # Main TUI app
│ ├── src/
│ │ ├── main.rs
│ │ ├── app.rs
│ │ ├── ui/
│ │ └── widgets/
│ └── Cargo.toml
│
└── docs/
Not yet:
- ❌ WebAssembly build
- ❌ Separate web UI
- ❌ Tool/agent system
Phase 4: Package Architecture
Status: Future
Goal: Structure like pi-mono but for our needs
Timeline: After Phase 3 stable
Planning
Design package structure based on learnings:
crates/
├── robit-core/ # Reusable core
├── robit-tui/ # Terminal app
├── robit-cli/ # Non-interactive CLI
├── robit-web/ # WASM build (maybe)
└── robit-lsp/ # Future: LSP integration?
Key decisions to make:
- What goes in core vs TUI-specific?
- Async runtime abstraction for WASM?
- Plugin system design?
Guiding principle: Split when it hurts not to, not before.
Phase 5: Agent Features
Status: Future
Goal: Add agentic capabilities
Timeline: After Phase 4
Core Capabilities
Advanced agent functionality includes:
- Tool calling (bash, file operations)
- Autonomous agent mode
- GitHub integration (PRs, issues)
- Search capabilities
- Multi-file editing
Approach
Don’t replicate everything:
- Pick features that fit your workflow
- Start with bash tool only
- Add file operations
- Then GitHub integration
- Then search
Security first:
- Sandboxed execution (wasmtime for tools)
- User confirmation for destructive ops
- Audit logging
Phase 6: Mobile/Remote Access (MVP)
Status: Future
Goal: Prompt from phone, agent makes PRs
Timeline: After agent features work
The MVP Definition
“Once I can prompt from my phone and my agent can make PRs and do searches, this feels like a proper MVP to harden.”
Simplest path first:
Option 1: SSH (Easiest)
Phone → SSH app → Terminal → robit
- Ensure TUI works over SSH
- Maybe optimize for mobile terminal size
- Test with Blink Shell (iOS) or Termux (Android)
Pros: Zero additional code
Cons: Requires SSH access, terminal UI on small screen
Option 2: Chat Integration
Phone → Telegram/Discord bot → robit agent
- Simple bot that forwards messages
- Agent runs on server
- Responses sent back to chat
Pros: Native mobile experience
Cons: Requires bot hosting, async conversation style
Option 3: Web Server (More Work)
Phone → Browser → robit-server → agent
- REST API wrapper around agent
- Simple web UI or reuse WASM
- Authentication
Pros: Full control
Cons: Most complex, security considerations
Recommendation
Start with Option 1 (SSH):
- Free, immediate
- See if you actually need mobile access
- Learn what mobile UX should be
Then Option 2 (Chat):
- If SSH is too clunky
- Telegram bot is ~100 lines of code
- Good enough for MVP
Save Option 3 (Web):
- Only if chat isn’t sufficient
- For after “proper MVP” is hardened
Decision Gates
Don’t advance to next phase until:
Phase 1 → 2
- Documentation complete
- Clear on Rust + WASM philosophy
- Ready to write code
Phase 2 → 3
- Daily dogfooding for 2+ weeks
- Documented “wrong assumptions”
- Clear on what features are needed
- TUI feels “good enough” to invest more
Phase 3 → 4
- Feature parity with pi-mono
- Stable daily use
- Pain points from monolithic codebase
- Clear need for package split
Phase 4 → 5
- Clean package architecture
- Core library published (internal)
- Ready for agent features
Phase 5 → 6
- Agent makes PRs successfully
- Agent searches code
- Ready for mobile access
Current Focus
Right now: Phase 2 - Local TUI Prototype
This week:
- Create single-file ratatui app
- Basic input/output loop
- Mock LLM responses
- Run it, use it, break it
Not doing yet:
- ❌ Planning Phase 3-6 in detail
- ❌ Over-engineering architecture
- ❌ WebAssembly builds
- ❌ Package splits
Learning goals:
- How does ratatui feel in practice?
- What’s annoying about the API?
- What do we actually need vs what we think we need?
- Wrong assumptions to document
Notes
Dogfooding Guidelines
While using Phase 2 prototype:
- Keep a “pain log” - What’s annoying? What’s broken?
- Note missing features - Don’t implement, just document
- Track assumptions - “I thought X would work but Y happens”
- Daily commits - Even if messy, commit your working state
When to Course Correct
If during Phase 2 you discover:
- ratatui doesn’t fit → Pivot to different TUI lib
- WASM is wrong approach → Revisit philosophy
- TUI is insufficient → Maybe web-first makes sense
- Rust is wrong choice → Better to know early
Don’t be afraid to throw away Phase 2 code. It’s for learning.
Summary
| Phase | Goal | Timeline | Success Metric |
|---|---|---|---|
| 1 | Research | ✓ Done | Docs complete |
| 2 | Local TUI | Now | Daily use |
| 3 | pi Parity | TBD | Match features |
| 4 | Packages | TBD | Clean architecture |
| 5 | Agent | TBD | PRs + search |
| 6 | Mobile | TBD | Phone prompts |
Current mantra: Build the smallest thing that teaches us the most.
Mobile Strategy
Overview
Mobile support is a Phase 4+ consideration (8+ months out). When we reach this stage, Tauri v2 presents an attractive option for bringing robit to iOS and Android while leveraging existing work.
Why Tauri?
The Case For Tauri
- Rust Native: Backend written in Rust, matching our stack
- Code Reuse: Can leverage the WASM web app as the frontend
- Mobile Support: Tauri v2 (released 2024) officially supports iOS/Android
- Performance: Smaller bundle sizes than Electron, native performance
- Security: Rust’s memory safety extends to the mobile app
Architecture with Tauri
┌─────────────────────────────────────────────────────────────┐
│ Mobile App (Tauri) │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ WebView Frontend │ │
│ │ (Compiled from robit-web or React/Vue variant) │ │
│ └──────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌──────────────────────▼───────────────────────────────┐ │
│ │ Tauri Bridge │ │
│ │ (JavaScript ← → Rust bindings) │ │
│ └──────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌──────────────────────▼───────────────────────────────┐ │
│ │ Rust Backend │ │
│ │ • robit-core (shared logic) │ │
│ │ • Platform APIs (mobile-specific) │ │
│ │ - Biometric auth │ │
│ │ - Secure enclave for API keys │ │
│ │ - Push notifications │ │
│ │ - Background sync │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Alternative Approaches
Option 1: Tauri with Web Frontend
Frontend: React/Vue/Svelte or robit-web WASM
Backend: robit-core + Tauri APIs
Pros:
- Familiar web development
- Rich ecosystem of UI libraries
- Can share code with web deployment
- Easy to prototype
Cons:
- WebView overhead
- Two UI codebases (terminal vs mobile)
- Not truly native feel
Best for: Phase 4, quick time-to-market
Option 2: Tauri with Native UI
Use Tauri as a shell but render native UI via egui, iced, or Slint.
Pros:
- More native performance
- Rust-native UI libraries
- Can look like system UI
Cons:
- Less mature than web frontend
- Mobile support still evolving
- Steeper learning curve
Best for: If we want native UI without web tech
Option 3: Native Mobile (Swift/Kotlin)
Build separate iOS (Swift) and Android (Kotlin) apps that communicate with a Rust core.
┌─────────────────────────────────────────────────────────────┐
│ Native Mobile Apps │
├─────────────────────────────────┬───────────────────────────┤
│ iOS App (Swift) │ Android (Kotlin) │
│ │ │
│ ┌──────────────────────────┐ │ ┌──────────────────────┐ │
│ │ SwiftUI Interface │ │ │ Jetpack Compose │ │
│ └───────────┬──────────────┘ │ └──────────┬───────────┘ │
│ │ │ │ │
│ ┌───────────▼──────────────┐ │ ┌───────────▼──────────┐ │
│ │ UniFFI Bindings │ │ │ JNI Bindings │ │
│ │ (Swift ← → Rust) │ │ │ (Kotlin ← → Rust) │ │
│ └───────────┬──────────────┘ │ └──────────┬───────────┘ │
└──────────────┼──────────────────┴─────────────┼───────────────┘
│ │
└────────────────┬───────────────┘
│
┌────────────▼────────────┐
│ robit-core │
│ (Rust, shared) │
└─────────────────────────┘
Pros:
- Truly native experience
- Best performance
- Platform-specific features
Cons:
- Three codebases (terminal, iOS, Android)
- More development effort
- Requires mobile expertise
Best for: If mobile becomes primary platform
Option 4: Cross-Platform UI (egui, iced, Slint)
Use a Rust-native UI framework that compiles to mobile.
egui:
- Immediate mode GUI
- Great for tools/internal apps
- Not traditionally mobile-focused
iced:
- Elm architecture
- Still maturing mobile support
- Clean, functional approach
Slint:
- Declarative UI (QML-like)
- Good mobile support
- Commercial licensing considerations
Best for: If we want pure Rust, no web tech
Recommended Approach: Progressive Strategy
Phase 4A: PWA via WASM (Quick Win)
Before building native apps, ensure the WASM web app works well as a PWA:
# robit-web as PWA
- Service worker for offline
- Manifest for install
- Responsive design
- Touch-optimized
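A minimal web app manifest enabling the browser install prompt might look like this (all values are placeholders):

```json
{
  "name": "robit",
  "short_name": "robit",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#000000",
  "theme_color": "#000000",
  "icons": [
    { "src": "icon-192.png", "sizes": "192x192", "type": "image/png" }
  ]
}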
Benefits:
- Users can “install” from browser
- Works on mobile immediately
- No app store approval
- Shares codebase with web
Phase 4B: Tauri Mobile App
Build Tauri v2 app using the PWA as base:
// src-tauri/src/main.rs
fn main() {
tauri::Builder::default()
.invoke_handler(tauri::generate_handler![
get_conversations,
send_message,
sync_data,
// Mobile-specific commands
biometric_auth,
secure_storage,
])
.run(tauri::generate_context!())
.expect("error while running tauri application");
}
Mobile-specific features:
- Biometric authentication for API keys
- Secure enclave storage
- Background sync when app closed
- Push notifications for long-running tasks
- Share extension (share to harness from other apps)
Phase 4C: Native Apps (If Needed)
Only if mobile becomes a primary platform and Tauri limitations are hit.
Tauri + Existing Architecture Integration
Sharing Code with robit-web
harness-tauri/
├── src-tauri/ # Rust backend
│ ├── src/
│ │ ├── main.rs # Tauri setup
│ │ ├── commands.rs # JS-callable commands
│ │ └── mobile.rs # Mobile-specific APIs
│ └── Cargo.toml
├── src/ # Frontend (can be)
│ ├── robit-web/ # Option A: Use WASM directly
│ └── react-app/ # Option B: React/Vue wrapper
└── tauri.conf.json
Option A: Embed robit-web
Use the existing ratatui-based WASM app inside Tauri:
<!-- Tauri loads robit-web WASM -->
<script type="module">
import init from './robit-web.js';
init();
</script>
Pros: Minimal code duplication Cons: Terminal aesthetic may feel odd on mobile
Option B: Mobile-Optimized Frontend
Build a mobile-specific web frontend that uses the same Tauri backend:
Mobile UI (React/Svelte)
├── Uses Tauri API for native features
├── Mobile-optimized layout (not terminal)
└── Can access robit-core via Tauri commands
Pros: Native mobile UX Cons: Separate frontend code
Hybrid Approach
Allow users to choose UI:
// Tauri command to launch different UIs
#[tauri::command]
fn launch_ui(mode: &str) -> Result<String, String> {
    match mode {
        "terminal" => load_harness_web(), // TUI in mobile
        "mobile" => load_mobile_ui(),     // Touch-optimized
        _ => Err("Unknown mode".to_string()),
    }
}
Mobile-Specific Considerations
Input Methods
Terminal UI on mobile:
- Virtual keyboard takes half screen
- Need special key handling (Ctrl, Esc, arrow keys)
- Could provide on-screen toolbar
Touch gestures:
- Swipe for navigation
- Pinch to zoom text
- Long press for context menus
Data Sync Strategy
Mobile apps need robust sync:
Mobile Device Server/Cloud
│ │
├─── Sync conversations ────────►│
│ │
│◄──────── Updates ──────────────│
│ │
├─── Offline changes ───────────►│
│ (queued) │
Options:
- Self-hosted sync: User runs robit-server
- Cloud sync: End-to-end encrypted, we host
- Peer sync: Device-to-device (LocalSend, etc.)
Security on Mobile
- Biometric auth: Unlock API keys with FaceID/TouchID
- Secure enclave: Store credentials in hardware
- App sandbox: Mobile OS protections
- Network security: Certificate pinning
Roadmap Integration
Current Plan (Months 1-7)
- Focus on terminal + web WASM
- Build solid core
- Add security features
Mobile Phase (Month 8+)
- Month 8: PWA improvements
- Month 9-10: Tauri mobile app (MVP)
- Month 11-12: Mobile-specific features
- Month 13+: App store release
Prerequisites for Mobile
Before starting mobile:
- Core features stable
- Conversation sync working
- Security model proven
- Server API designed (for cloud sync)
- Developer accounts (Apple Developer, Google Play)
Decision Matrix
| Criteria | Tauri/Web | Native | PWA Only |
|---|---|---|---|
| Dev effort | Medium | High | Low |
| Native feel | Good | Best | Fair |
| Code sharing | Good | Poor | Excellent |
| App store presence | Yes | Yes | No |
| Performance | Good | Best | Good |
| Offline capable | Yes | Yes | Yes |
| Maintenance | Medium | High | Low |
Recommendation: Start with PWA (immediate), add Tauri mobile (Phase 4), evaluate native only if traction demands it.
Open Questions
- Do mobile users want a terminal UI? Or should mobile be touch-optimized?
- What’s the sync story? Self-hosted, cloud, or peer-to-peer?
- Monetization? Free app, subscription for sync, one-time purchase?
- iPad/Tablet support? Different from phone UI?
Conclusion
Tauri v2 is the pragmatic choice for mobile:
- Leverages existing Rust codebase
- Can use web frontend (existing work)
- Official mobile support (not experimental)
- Path to app stores
But first: Make the web PWA excellent. Many users’ mobile needs will be satisfied by a well-designed PWA, deferring the need for native apps.
Comparison: robit vs pi-mono
Philosophy: Rust + WebAssembly Only
Unlike pi-mono’s TypeScript-based approach, robit is committed to Rust and WebAssembly exclusively. No TypeScript frontend frameworks. If we ever need richer web capabilities beyond terminal UI, we’ll use Leptos (Rust) or enhance the WASM build—not JavaScript.
This decision reflects:
- Type safety: Rust’s compile-time guarantees across all platforms
- Performance: Zero-cost abstractions, no JS runtime overhead
- Consistency: Single language, single toolchain
- WASM-first: Browser deployment via Rust-to-WASM, not JS bundlers
pi-mono Overview
pi-mono is Mario Zechner’s AI agent toolkit (11k+ GitHub stars). It powers the Pi coding agent and includes:
- pi-ai: Unified LLM API (OpenAI, Anthropic, Google, etc.)
- pi-agent-core: Agent runtime with tool calling
- pi-tui: Custom terminal UI library with differential rendering
- pi-web-ui: Web UI components (Lit-based web components)
- pi-coding-agent: Main CLI tool (uses pi-tui)
- pi-mom: Slack bot that wraps the coding agent
Architecture Comparison
Package Structure
pi-mono:
packages/
├── ai/ # LLM abstraction
├── agent/ # Agent runtime
├── tui/ # Terminal UI library (custom)
├── coding-agent/ # CLI (uses pi-tui)
├── web-ui/ # Web components (separate)
└── mom/ # Slack bot
robit (proposed):
crates/
├── robit-core/ # Shared business logic
├── robit-tui/ # Terminal app (ratatui)
├── robit-web/ # Web app (ratzilla - same UI code!)
└── robit-cli/ # CLI interface
Key Difference: UI Strategy
pi-mono: Separate UI Implementations
pi-tui (terminal)     pi-web-ui (browser)
        │                     │
        └──────────┬──────────┘
                   │
            pi-agent-core
                   │
                 pi-ai
- pi-tui: Custom TUI library (chalk, marked, differential rendering)
- pi-web-ui: Lit web components (document previews, chat UI)
- Different UI code for each platform
- Optimized for each platform’s strengths
robit: Shared UI Code
ratatui widgets ──┬── crossterm (terminal)
       │          └── ratzilla (web DOM)
       │
  robit-core
- Same ratatui widgets used everywhere
- Different backends (crossterm vs ratzilla)
- Single UI codebase
- Trade-offs in platform optimization
Technical Deep Dive
pi-tui: Custom Terminal Library
pi-mono built their own TUI library instead of using blessed, react-blessed, or ink:
Features:
- Differential rendering: Only updates changed lines
- Component interface: React-like component system
- Viewport management: Scrollback handling
- ANSI safety: Proper escape sequence handling
- Overlay system: Modal dialogs, popups
- Custom editor: Built-in text input with syntax highlighting
Dependencies:
- chalk - Terminal styling
- marked - Markdown rendering
- get-east-asian-width - Character width calculation
- mime-types - MIME detection
Why custom?
- Full control over rendering performance
- Optimized for streaming LLM output
- Differential updates reduce flicker
- No React overhead (Ink uses React)
pi-web-ui: Web Components
Separate package using Lit (web components):
Features:
- Chat interface components
- Document previews (docx, pdf, xlsx)
- Local LLM integration (LM Studio, Ollama)
- Peer dependency: mini-lit or lit
Notable: Uses pi-tui for “shared rendering logic” (some code reuse)
Trade-off Analysis
pi-mono Approach: Separate UIs
Pros:
- Platform-optimized: Terminal UI optimized for terminal, web for web
- Full feature set: Can use web-specific features (drag-drop, file previews)
- No WASM complexity: Web UI is regular JavaScript
- Better performance: Native web components vs WASM overhead
Cons:
- Code duplication: Two UI implementations
- Maintenance burden: Changes need to be made twice
- Inconsistency risk: UI may drift between platforms
- More packages: Separate teams could diverge
robit Approach: Shared UI (ratatui + ratzilla)
Pros:
- Single codebase: One UI implementation
- Consistency guaranteed: Same behavior everywhere
- Faster iteration: Changes apply to all platforms
- Unique aesthetic: Terminal look in browser is distinctive
Cons:
- Terminal aesthetic in web: May feel odd on mobile
- WASM complexity: Build tooling, debugging
- Limited web features: Harder to add drag-drop, rich previews
- Performance overhead: WASM vs native JS
Lessons from pi-mono
1. Differential Rendering Matters
pi-tui’s differential rendering is critical for smooth streaming output:
// Only re-render changed lines
for (let i = firstChanged; i <= lastChanged; i++) {
if (previousLines[i] !== newLines[i]) {
updateLine(i, newLines[i]);
}
}
ratatui does this too - it’s a core feature. Good validation.
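The same idea can be sketched line-by-line in Rust (ratatui's Buffer::diff applies it per cell; this simplified version is illustrative, not robit code):

```rust
/// Return only the lines that changed between two frames.
/// ratatui's `Buffer::diff` applies the same idea at the cell level.
fn diff_lines(prev: &[String], next: &[String]) -> Vec<(usize, String)> {
    next.iter()
        .enumerate()
        .filter(|(i, line)| prev.get(*i) != Some(*line))
        .map(|(i, line)| (i, line.clone()))
        .collect()
}

fn main() {
    let prev = vec!["> streaming".to_string(), "Hel".to_string()];
    let next = vec!["> streaming".to_string(), "Hello".to_string()];
    // Only line 1 changed, so only line 1 gets redrawn.
    for (i, line) in diff_lines(&prev, &next) {
        println!("redraw {i}: {line}");
    }
}
```

For streaming LLM output this keeps redraw work proportional to what changed, not to the full scrollback.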
2. Separate Packages Enable Different Use Cases
pi-mono’s pi-mom (Slack bot) wraps pi-coding-agent instead of using core directly:
pi-mom ──→ pi-coding-agent ──→ pi-agent-core
This allows the Slack bot to:
- Reuse all interactive features
- Share state management
- Not reimplement agent logic
Lesson: Consider wrapper patterns for different interfaces.
3. Web UI Can Be Richer
pi-web-ui includes:
- Document previews (PDF, Word, Excel)
- Drag-and-drop file upload
- Rich text editing
These are hard in WASM TUI but easy in web components.
Lesson: If we need rich web features, we might need a separate web UI eventually.
4. Extension System Enables Ecosystem
pi-coding-agent has a sophisticated extension system:
- TypeScript runtime compilation
- Event hooks
- Tool/command registration
- Custom providers
Lesson: Plan for extensibility from the start.
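One way to plan for that in Rust is a registration trait; a minimal std-only sketch (Tool, Echo, and Registry are hypothetical names, not an existing robit or pi-mono API):

```rust
use std::collections::HashMap;

// Hypothetical extension point: tools register by name and handle calls.
trait Tool {
    fn name(&self) -> &'static str;
    fn call(&self, input: &str) -> String;
}

// Trivial example tool.
struct Echo;
impl Tool for Echo {
    fn name(&self) -> &'static str { "echo" }
    fn call(&self, input: &str) -> String { input.to_string() }
}

struct Registry {
    tools: HashMap<&'static str, Box<dyn Tool>>,
}

impl Registry {
    fn new() -> Self {
        Self { tools: HashMap::new() }
    }
    fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name(), tool);
    }
    // Unknown tool names return None instead of panicking.
    fn call(&self, name: &str, input: &str) -> Option<String> {
        self.tools.get(name).map(|t| t.call(input))
    }
}

fn main() {
    let mut reg = Registry::new();
    reg.register(Box::new(Echo));
    assert_eq!(reg.call("echo", "hi"), Some("hi".to_string()));
}
```

Trait objects give us dynamic registration today while leaving room for a richer hook/event system later.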
5. Custom TUI Libraries Are Viable
pi-mono built pi-tui instead of using existing libraries. Reasons:
- Performance control
- Streaming optimization
- Specific feature needs
Lesson: Don’t be afraid to build custom if needed, but ratatui is battle-tested.
Strategic Implications
When pi-mono’s Approach Wins
- Rich web experience needed: File previews, drag-drop, WYSIWYG
- Different teams/platforms: Web team and terminal team
- Performance critical: WASM overhead unacceptable
- Web-first product: Terminal is secondary
When robit’s Approach Wins
- Terminal is primary: Web is convenience feature
- Small team: Can’t maintain two UIs
- Consistency valued: Same experience everywhere
- Unique aesthetic: Terminal-in-browser is brand differentiator
- Offline-first: WASM works without server
Hybrid Possibilities
Could we combine both approaches?
Option 1: Terminal-First Hybrid
┌─────────────────────────────────────────┐
│             Native Terminal             │
│         (ratatui + crossterm)           │
│                                         │
│          Full TUI experience            │
│      Vim bindings, keyboard-centric     │
└─────────────────────────────────────────┘
                     │
          ┌──────────┼──────────┐
          │          │          │
          ▼          ▼          ▼
    ┌──────────┐ ┌────────┐ ┌──────────┐
    │   WASM   │ │  PWA   │ │   Rich   │
    │   TUI    │ │ Shell  │ │  Web UI  │
    │          │ │        │ │(optional)│
    │ Terminal │ │ Mobile │ │  Leptos  │
    │aesthetic │ │ basic  │ │  (Rust)  │
    └──────────┘ └────────┘ └──────────┘
- Primary: Native terminal (ratatui)
- Convenience: WASM TUI (ratzilla)
- Mobile: PWA wrapper
- Rich features: Optional separate web UI (if needed)
Option 2: Platform-Specific UIs
Start with shared UI, split when needed:
Phase 1: Shared UI (ratatui + ratzilla)
│
├──> Phase 2a: Rich web features needed?
│ └──> Build separate robit-web-rich
│
└──> Phase 2b: Terminal-only sufficient?
└──> Keep shared UI
Recommendations
For robit
1. Start with shared UI (ratatui + ratzilla)
   - Faster to MVP
   - Consistency is a feature
   - Terminal aesthetic is distinctive
2. Plan escape hatches
   - Keep core UI logic separate from rendering
   - If web needs rich features, we can split later
   - Conditional compilation for platform features
3. Learn from pi-tui
   - Differential rendering (ratatui has this ✓)
   - Component-based architecture (ratatui has this ✓)
   - ANSI safety (crossterm handles this ✓)
4. Evaluate at Phase 3
   - If web needs file previews → consider separate web UI
   - If terminal remains primary → stick with shared UI
Key Questions to Answer
1. Is terminal aesthetic acceptable for web users?
   - pi-mono users: yes (web UI is different)
   - Our users: TBD (shared UI means terminal in browser)
2. Do we need rich web features?
   - File drag-drop?
   - PDF previews?
   - Collaborative editing?
3. Is WASM overhead acceptable?
   - Initial download size
   - Runtime performance
   - Mobile battery impact
Conclusion
pi-mono and robit represent fundamentally different philosophies:
| Aspect | pi-mono | robit |
|---|---|---|
| Language | TypeScript (Node.js) | Rust |
| Web approach | Lit web components | WASM (Rust) |
| UI strategy | Platform-specific | Shared UI code |
| Rich web path | More TS/JS components | Leptos (Rust) or enhanced WASM |
| Ecosystem | npm, webpack, etc. | Cargo, trunk, wasm-bindgen |
robit Philosophy
Rust + WebAssembly exclusively. No exceptions.
If we ever need richer web capabilities:
- First: Enhance the ratzilla WASM build (add custom components)
- Second: Introduce Leptos for a separate rich web UI (still Rust, still WASM)
- Never: TypeScript, React, Vue, or JavaScript frontend frameworks
This isn’t about being dogmatic—it’s about:
- Type safety across the stack
- Single toolchain and language
- WASM as first-class deployment target
- Avoiding JS ecosystem complexity
Validation
The ratatui + ratzilla approach is correct for our goals:
- Terminal-first workflow ✓
- Web deployment without JS frameworks ✓
- Rust everywhere ✓
- Path to Leptos if needed ✓
pi-mono’s TypeScript approach works for them. Our Rust approach works for us. Both are valid; we’ve chosen ours.
Technology Philosophy: Rust + WebAssembly Only
Core Principle
robit is a Rust project. Full stop.
We will not use:
- ❌ TypeScript/JavaScript frontend frameworks (React, Vue, Angular, Lit)
- ❌ Node.js/npm ecosystem for UI
- ❌ Separate language for web vs terminal
- ❌ JavaScript bundlers (webpack, vite, rollup)
We will use:
- ✅ Rust for all logic and UI
- ✅ WebAssembly for browser deployment
- ✅ ratatui for terminal UI
- ✅ ratzilla for web UI (terminal aesthetic in browser)
- ✅ Leptos (optional future) for rich web UI if needed
- ✅ trunk for WASM builds
- ✅ Cargo for everything
Why This Matters
Type Safety
Rust’s type system catches errors at compile time:
// `load_conversation` returns a Result; the `?` operator propagates
// any error, so we never see a half-loaded value
let conversation: Conversation = load_conversation(id)?;
// conversation is guaranteed to be valid here
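A runnable version of the same point, using stand-in types (Conversation and load_conversation are hypothetical, not real robit APIs):

```rust
#[derive(Debug)]
struct Conversation {
    id: u32,
}

// Hypothetical loader: failure is encoded in the return type, not a null.
fn load_conversation(id: u32) -> Result<Conversation, String> {
    if id == 0 {
        Err("conversation not found".to_string())
    } else {
        Ok(Conversation { id })
    }
}

fn main() {
    // The compiler forces us to handle the error path before using the value.
    match load_conversation(42) {
        Ok(c) => println!("loaded conversation {}", c.id),
        Err(e) => eprintln!("error: {e}"),
    }
}
```

There is no state in which `c` exists but is null or malformed, which is exactly the guarantee the TypeScript version below lacks.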
Compare to TypeScript:
// This might fail at runtime despite types
const conversation = await loadConversation(id);
// conversation might be null, undefined, or malformed
Single Toolchain
Rust approach:
cargo build # Build everything
cargo test # Test everything
cargo clippy # Lint everything
cargo doc # Document everything
TypeScript approach (pi-mono):
npm install # Node dependencies
npm run build # Compile TypeScript
npm run check # Lint + format
tsc --noEmit # Type check
Performance
- Zero-cost abstractions: What you write is what runs
- No garbage collection: Predictable performance
- WASM is fast: Near-native speed in browser
- Compact bundles: with wasm-opt and careful dependency choices, Rust WASM bundles can rival typical JS bundle sizes
The WebAssembly Path
Phase 1: Terminal-First (Current)
┌─────────────────────────────────────┐
│     robit (ratatui + crossterm)     │
│                                     │
│  Native terminal, full experience   │
└─────────────────────────────────────┘
Deploy: cargo install robit
Phase 2: Web Convenience
┌─────────────────────────────────────┐
│   robit-web (ratatui + ratzilla)    │
│                                     │
│    Same UI code, renders to DOM     │
│   Terminal aesthetic in browser     │
└─────────────────────────────────────┘
Deploy: Static hosting (GitHub Pages, Netlify, etc.)
Phase 3: Rich Web (If Needed)
┌─────────────────────────────────────┐
│       robit-web-rich (Leptos)       │
│                                     │
│  Full web framework (still Rust!)   │
│    Rich interactions, modern UX     │
└─────────────────────────────────────┘
Deploy: Static hosting or serverless functions
Why Leptos?
- Full-stack Rust framework
- Compiles to WASM
- Reactive like React, but Rust-native
- No JavaScript required
- Can reuse robit-core crate
// Leptos example - still Rust!
#[component]
fn ConversationView() -> impl IntoView {
    let (messages, _set_messages) = create_signal(vec![]);
    view! {
        <div class="conversation">
            <For
                each=messages
                key=|msg| msg.id
                children=|msg| view! { <Message msg=msg /> }
            />
        </div>
    }
}
Common Objections
“But JavaScript has better web ecosystem!”
Reality: The WASM ecosystem has matured:
- trunk: Like webpack but for Rust WASM
- web-sys: Complete browser API bindings
- gloo: Ergonomic wrappers for web APIs
- Leptos/Dioxus/Yew: Full React-like frameworks in Rust
“TypeScript has good types too!”
Reality: TypeScript’s type system is unsound:
// Valid TypeScript that misbehaves at runtime
const x: number = "hello" as any;
console.log(x + 1); // prints "hello1", not a number
Rust’s type system is sound and enforced at compile time.
“JavaScript is easier to hire for!”
Reality: This is a passion project, not enterprise software. We optimize for:
- Developer experience (ourselves)
- Code quality
- Long-term maintainability
Not hiring metrics.
“But pi-mono is successful with TypeScript!”
Reality: Different projects, different goals. pi-mono optimizes for:
- Ease of contribution (11k stars, 117 contributors)
- Familiar stack (JavaScript/TypeScript is popular)
- Rich web features (they need them)
We optimize for:
- Type safety
- Performance
- Single ecosystem
- Terminal-first experience
Both can be successful.
The Escape Hatch
If we discover that the terminal aesthetic in browser is too limiting, and ratzilla can’t be extended enough:
Option A: Build a Leptos app
- Still Rust
- Still WASM
- Rich web UI
- Reuse robit-core
Option B: Enhance ratzilla
- Add custom DOM elements for rich features
- Keep terminal aesthetic for chat
- Add file previews, drag-drop as needed
Option C (not happening): Use React/Vue
- ❌ Not in this codebase
- ❌ Different language
- ❌ Different toolchain
- ❌ Different mental model
Examples in the Wild
Projects successfully shipping WebAssembly on the web (several in Rust):
- Ruffle: Flash Player emulator written in Rust, deployed as WASM
- Figma: C++ core compiled to WASM in the browser (proof of WASM viability, though not Rust)
- WordPress Playground: PHP runtime compiled to WASM via Emscripten
- Ratzilla demos: Terminal UIs running in the browser
- Leptos apps: Full web apps in Rust
The approach is proven.
Our Stack
Language: Rust (100%)
Terminal UI: ratatui + crossterm
Web UI: ratatui + ratzilla (WASM)
Future web: Leptos (if needed, still Rust)
Build tool: Cargo + trunk
Package mgr: Cargo
Testing: Built-in (cargo test)
Linting: Clippy
Formatting: rustfmt
No npm. No node_modules. No package.json. No webpack config.
Just Rust.
Conclusion
This isn’t about hating JavaScript or TypeScript. They’re fine tools for many projects.
This is about choosing the right tool for our specific project:
- A terminal-first LLM tool
- Where type safety matters (AI interactions)
- Where performance matters (streaming responses)
- Where we want a single, cohesive codebase
Rust + WebAssembly is that tool.
Rust Dependencies for pi-mono Feature Parity
Overview
This document maps the features from pi-mono’s TypeScript implementation to equivalent Rust crates for our TUI application.
Core TUI Framework
pi-mono: pi-tui (custom TypeScript)
Features:
- Differential rendering (only updates changed lines)
- Component-based architecture
- Viewport management with scrollback
- ANSI escape sequence safety
- Overlay system for modals/popups
Rust Equivalent:
ratatui = "0.30"
crossterm = "0.29" # Backend for ratatui
Notes:
- ✅ ratatui has built-in differential rendering
- ✅ Component system via Widget trait
- ✅ Viewport management included
- ✅ ANSI safety handled by crossterm
Text Editor / Input
pi-mono: Custom editor in pi-tui
Features:
- Multi-line text input
- Cursor movement
- Syntax highlighting integration
- Vim-style or Emacs keybindings
- Text selection
Rust Options:
Option 1: Simple input (hand-rolled)
ratatui = "0.30" # No built-in text area; basic input via Paragraph + manual cursor
Option 2: Enhanced editor
# Vim-like editor
edtui = "0.11"
# OR
tui-textarea = "0.4"
# OR
ratatui-code-editor = "0.0.1" # Tree-sitter powered
Recommendation: Start with tui-textarea for basic multi-line input, upgrade to edtui or ratatui-code-editor if we need vim bindings or advanced editing.
Markdown Rendering
pi-mono: marked (TypeScript library)
Features:
- Parse markdown to styled terminal output
- Support for headings, lists, code blocks, links
- Custom rendering for AI responses
Rust Equivalent:
tui-markdown = "0.3"
pulldown-cmark = "0.13" # Underlying parser
Alternative (more comprehensive):
md-tui = "0.9" # Full markdown viewer with navigation
Notes:
- tui-markdown converts markdown to ratatui Text values
- Supports all standard markdown features
- Can customize styling for AI output
Syntax Highlighting
pi-mono: Custom integration with highlighting
Features:
- Code block syntax highlighting
- Multiple language support
- Theme support
Rust Equivalent:
syntect = "5.3" # Core syntax highlighting
syntect-tui = "3.0" # Bridge to ratatui styles
Alternative (tree-sitter based):
tui-syntax = "0.4" # Tree-sitter based
tree-sitter = "0.24" # Parser framework
Notes:
- syntect uses Sublime Text syntax definitions (high quality)
- 100+ languages supported out of the box
- syntect-tui converts syntect styles to ratatui Styles
File Tree / Navigation
pi-mono: Tree selector for sessions and files
Features:
- Hierarchical tree view
- Collapsible/expandable nodes
- Keyboard navigation
- File icons/indicators
Rust Equivalent:
tui-tree-widget = "0.24"
# OR
ratatui-explorer = "0.2"
Notes:
- tui-tree-widget: Pure tree widget for any tree data
- ratatui-explorer: File explorer specifically (like ranger)
Diff Viewing
pi-mono: File diff display for code changes
Features:
- Side-by-side or unified diff
- Syntax highlighting in diffs
- Color-coded additions/deletions
- Line numbers
Rust Equivalent:
# Option 1: Use syntect with custom diff logic
syntect = "5.3"
# Option 2: Leverage delta (pager)
# Shell out to delta or integrate as library
Implementation approach:
// 1. Parse diff output into lines
// 2. Use syntect to highlight each line
// 3. Apply ratatui styles for additions (green) / deletions (red)
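The first step of that approach can be shown concretely; a std-only sketch that classifies unified-diff lines before any syntect styling (DiffLine is an illustrative type, not an existing crate API):

```rust
#[derive(Debug, PartialEq)]
enum DiffLine {
    Header,  // "+++ b/file", "--- a/file", "@@ ... @@"
    Add,     // render green
    Del,     // render red
    Context, // default style
}

// Classify one line of a unified diff. Header lines must be checked
// first, since "+++" and "---" also start with '+' and '-'.
fn classify(line: &str) -> DiffLine {
    if line.starts_with("+++") || line.starts_with("---") || line.starts_with("@@") {
        DiffLine::Header
    } else if line.starts_with('+') {
        DiffLine::Add
    } else if line.starts_with('-') {
        DiffLine::Del
    } else {
        DiffLine::Context
    }
}

fn main() {
    assert_eq!(classify("+added line"), DiffLine::Add);
    assert_eq!(classify("-removed line"), DiffLine::Del);
    assert_eq!(classify("@@ -1,3 +1,4 @@"), DiffLine::Header);
    assert_eq!(classify(" unchanged"), DiffLine::Context);
}
```

Each classified line would then be mapped to a ratatui Style, with syntect highlighting layered on top of the add/del coloring.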
Clipboard Integration
pi-mono: Clipboard access for copy/paste
Features:
- Copy text to system clipboard
- Paste from clipboard
- Cross-platform support
Rust Equivalent:
arboard = "3.6"
Notes:
- Cross-platform (Linux X11/Wayland, macOS, Windows)
- Supports both text and images
- 1Password maintains this crate (high quality)
Async / Streaming
pi-mono: Event-driven with async/await
Features:
- Handle multiple async events
- Stream LLM responses
- Non-blocking UI updates
- Event queue/channels
Rust Equivalent:
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"
futures = "0.3"
async-trait = "0.1"
Pattern:
// Event loop with tokio
let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();

// Spawn async task for events
tokio::spawn(async move {
    while let Some(event) = rx.recv().await {
        // Handle event
    }
});
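The same producer/consumer shape can be demonstrated with std channels, which keeps the sketch runnable without tokio (Event and drain are illustrative names, not robit APIs):

```rust
use std::sync::mpsc;
use std::thread;

// Illustrative event type: tokens stream in, Done ends the turn.
enum Event {
    Token(String),
    Done,
}

// Drain events until Done, accumulating the streamed tokens.
fn drain(rx: mpsc::Receiver<Event>) -> String {
    let mut out = String::new();
    for ev in rx {
        match ev {
            Event::Token(t) => out.push_str(&t),
            Event::Done => break,
        }
    }
    out
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Producer thread stands in for the task that streams LLM output.
    thread::spawn(move || {
        for t in ["Hel", "lo"] {
            tx.send(Event::Token(t.to_string())).unwrap();
        }
        tx.send(Event::Done).unwrap();
    });
    assert_eq!(drain(rx), "Hello");
}
```

In the real app the consumer side lives in the UI loop and the producer is a tokio task, but the channel-based decoupling is identical.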
Configuration Management
pi-mono: YAML/JSON config files
Features:
- User config in ~/.config/pi/
- Project-level config
- Environment variable overrides
- Theme configuration
Rust Equivalent:
config = "0.15" # Layered config (file, env, etc.)
toml = "0.9" # TOML parsing
serde = { version = "1.0", features = ["derive"] }
directories = "5" # XDG dirs (cross-platform)
Alternative (more powerful):
figment = "0.10" # Used by Rocket, very flexible
Notes:
- config crate: Supports TOML, YAML, JSON, INI
- directories: Gets platform-appropriate config dirs
- XDG on Linux, ~/Library/Application Support on macOS, %APPDATA% on Windows
HTTP / LLM API
pi-mono: pi-ai (unified LLM API)
Features:
- Multiple provider support (OpenAI, Anthropic, etc.)
- Streaming responses
- Retry logic
- Authentication
Rust Equivalent:
reqwest = { version = "0.12", features = ["json", "stream"] }
tokio = "1"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
For specific providers:
async-openai = "0.26" # If using OpenAI specifically
Custom trait approach (recommended):
#[async_trait]
trait LlmProvider {
    async fn complete(&self, request: Request) -> Result<Response>;
    // A boxed stream (futures::stream::BoxStream) keeps the trait object-safe
    async fn stream(&self, request: Request) -> Result<BoxStream<'static, Token>>;
}
Themes
pi-mono: Theme system with JSON themes
Features:
- Custom color schemes
- User-defined themes
- Syntax highlighting themes
- Terminal color detection
Rust Equivalent:
ratatui = "0.30" # Has Style system
syntect = "5.3" # Has theme support for syntax
For theme loading:
serde = { version = "1.0", features = ["derive"] }
toml = "0.9" # Or json, yaml
Implementation:
#[derive(Deserialize)]
struct Theme {
    background: Color,
    foreground: Color,
    accent: Color,
    // ...
}
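A hypothetical theme.toml such a struct could map to (field names and values are illustrative; note that ratatui's Color does not deserialize hex strings out of the box, so string fields or a custom deserializer would be needed):

```toml
# theme.toml (illustrative field names and values)
background = "#1e1e2e"
foreground = "#cdd6f4"
accent     = "#89b4fa"
```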
Key Bindings / Input
pi-mono: Custom keyboard protocol
Features:
- Vim-style keybindings
- Command palette (/ commands)
- Configurable shortcuts
- Key sequences (chords)
Rust Equivalent:
crossterm = "0.29" # Has key event handling
For complex keybindings:
use std::collections::HashMap;

// crossterm provides KeyEvent, KeyCode, KeyModifiers;
// build a keymap system on top
enum Action {
    SendMessage,
    NewConversation,
    Quit,
    // ...
}

struct KeyMap {
    bindings: HashMap<(KeyCode, KeyModifiers), Action>,
}
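A runnable sketch of that lookup-based dispatch, with stand-in key types (crossterm's real KeyCode and KeyModifiers implement Hash and Eq, so they would slot in directly; KeyCode, Mods, Action, and KeyMap here are illustrative):

```rust
use std::collections::HashMap;

// Stand-ins for crossterm's KeyCode / KeyModifiers.
#[derive(Hash, PartialEq, Eq)]
enum KeyCode {
    Char(char),
    Enter,
    Esc,
}

#[derive(Hash, PartialEq, Eq)]
enum Mods {
    None,
    Ctrl,
}

#[derive(Debug, PartialEq)]
enum Action {
    SendMessage,
    NewConversation,
    Quit,
}

struct KeyMap {
    bindings: HashMap<(KeyCode, Mods), Action>,
}

impl KeyMap {
    // Unbound keys simply return None, so unmapped input is ignored.
    fn lookup(&self, key: KeyCode, mods: Mods) -> Option<&Action> {
        self.bindings.get(&(key, mods))
    }
}

fn main() {
    let mut bindings = HashMap::new();
    bindings.insert((KeyCode::Enter, Mods::None), Action::SendMessage);
    bindings.insert((KeyCode::Char('n'), Mods::Ctrl), Action::NewConversation);
    bindings.insert((KeyCode::Esc, Mods::None), Action::Quit);
    let map = KeyMap { bindings };
    assert_eq!(map.lookup(KeyCode::Esc, Mods::None), Some(&Action::Quit));
}
```

Because bindings live in a plain HashMap, user-configurable shortcuts reduce to deserializing that map from the config file.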
State Management
pi-mono: Agent state management
Features:
- Conversation history
- Context management
- Tool state
- Session persistence
Rust Equivalent:
# In-memory: Standard Rust
# Persistence:
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Database (optional):
sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite"] }
# OR for simpler use:
redb = "2.0" # Pure Rust embedded DB
Complete Cargo.toml
[package]
name = "harness-tui"
version = "0.1.0"
edition = "2021"
[dependencies]
# Core TUI
ratatui = { version = "0.30", features = ["crossterm"] }
crossterm = "0.29"
# Editor
edtui = "0.11" # Optional: for vim bindings
# Markdown
tui-markdown = "0.3"
pulldown-cmark = "0.13"
# Syntax highlighting
syntect = "5.3"
syntect-tui = "3.0"
# File tree
tui-tree-widget = "0.24"
# Clipboard
arboard = "3.6"
# Async
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"
futures = "0.3"
async-trait = "0.1"
# HTTP
reqwest = { version = "0.12", features = ["json", "stream", "rustls-tls"] }
# Config
config = "0.15"
toml = "0.9"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
directories = "5"
# Persistence (optional)
redb = "2.0"
# Error handling
color-eyre = "0.6" # Better error reporting
thiserror = "1.0"
# Logging
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
# CLI
clap = { version = "4.5", features = ["derive"] }
# Utilities
unicode-width = "0.2"
textwrap = "0.16"
chrono = { version = "0.4", features = ["serde"] }
[dev-dependencies]
tokio-test = "0.4"
Feature Comparison Summary
| Feature | pi-mono | robit (Rust) |
|---|---|---|
| TUI Framework | Custom pi-tui | ratatui |
| Differential Rendering | ✅ Custom | ✅ Built-in |
| Text Editor | Custom | edtui / ratatui-code-editor |
| Markdown | marked | tui-markdown |
| Syntax Highlighting | Custom | syntect + syntect-tui |
| File Tree | Custom | tui-tree-widget |
| Diff Viewing | Custom | syntect + custom |
| Clipboard | Native | arboard |
| Async | Node.js async | tokio |
| Config | YAML/JSON | toml + config |
| HTTP | Node fetch | reqwest |
| Themes | JSON | toml + serde |
| State | In-memory | redb / sqlite |
Advantages of Rust Stack
- Type Safety: Compile-time guarantees for all UI state
- Performance: Zero-cost abstractions, no GC pauses
- Single Binary: No Node.js runtime dependency
- Memory Safety: No null pointer exceptions or undefined behavior
- Ecosystem: Mature, Cargo-managed crates for TUI, async, HTTP, and serialization, all cross-platform
Development Workflow
# Add dependencies
cargo add ratatui crossterm tokio serde reqwest
# Development
cargo run # Run TUI
cargo test # Run tests
cargo clippy # Lint
cargo build --release # Optimized build
# Single binary output: target/release/harness-tui
No npm install. No node_modules. No webpack.
Just Rust.