# Botface - Voice Assistant for Raspberry Pi 5 + AIY Voice HAT
Offline voice-controlled AI assistant written in Rust for Raspberry Pi 5 + Google AIY Voice HAT v1 + Batocera.
**Status:** Core architecture complete, integrations in progress. Wake word detection, LED control, transcription, LLM responses, and audio playback are functional on the AIY Voice HAT.
## Architecture
Botface uses a sidecar pattern for audio I/O:
- Botface (Rust): Main state machine, LLM integration (Ollama), TTS (Piper), orchestration
- Sidecar (Python): HTTP service handling wake word detection (openWakeWord) and audio recording
- Communication: HTTP + SSE (Server-Sent Events) between Botface and sidecar
This architecture provides:
- Language isolation (Python crashes don’t affect Rust)
- Independent audio lifecycle
- Better monitoring and health checks
```
User Speech → Sidecar (Python) → SSE Events → Botface (Rust)
                                                    ↓
               TTS Audio ← Piper ← LLM Response ← Ollama
                   ↓
          LED + AIY Voice HAT Speaker
```
## Quick Start (Local Development on macOS)

### One-Time Setup

```bash
# Download required models and binaries
./scripts/setup.sh --dev

# This downloads:
# - Wake word model (hey_jarvis.onnx)
# - Whisper binary (speech-to-text)
# - Whisper model (ggml-base.en.bin)
# ...and creates a default config.toml
```
### Running the Assistant

```bash
cd Botface

# Run in development mode (mock GPIO, local audio)
cargo run

# Or with explicit flags
cargo run -- --mock-gpio --local-audio --verbose

# Check CLI help
cargo run -- --help
```
What happens in local mode:

- Uses your Mac's microphone via `cpal`
- GPIO operations print to console instead of controlling hardware
- Validates the Ollama connection
- Skips Pi-specific binary checks
- Note: the sidecar is not used in local dev mode (native wake word detection)
## Production Build for Pi

### Build and Deploy Workflow

**CRITICAL:** Build on macOS, deploy to the Pi. Never build on the Pi.

```bash
# 1. Build for Raspberry Pi 5 (ARM64) on macOS
cross build --release --target aarch64-unknown-linux-gnu
# Binary location: target/aarch64-unknown-linux-gnu/release/botface

# 2. Deploy to Pi
scp target/aarch64-unknown-linux-gnu/release/botface \
    root@<pi-ip>:/userdata/voice-assistant/

# 3. Start services on Pi
ssh root@<pi-ip> "cd /userdata/voice-assistant && \
  python3 wakeword_sidecar.py --model models/hey_jarvis.onnx --threshold 0.5 --port 8080 & \
  ./botface"
```

See `docs/INTEGRATION_ROADMAP.md` for detailed deployment instructions.
## Project Structure

```
├── Cargo.toml                 # Dependencies & features
├── Cargo.lock                 # Dependency lock file
├── Cross.toml                 # Cross-compilation configuration
├── README.md                  # This file
├── docs/                      # Additional documentation
│   ├── INTEGRATION_ROADMAP.md # Deployment guide
│   ├── dev-log/               # Development session logs
│   └── ARCHITECTURE.md        # System design
├── assets/                    # Static assets
│   ├── sounds/                # WAV sound effects
│   └── models/                # ONNX models (not in git)
├── src/
│   ├── main.rs                # Entry point with CLI args
│   ├── lib.rs                 # Library exports
│   ├── config.rs              # Configuration (local vs Pi)
│   ├── state_machine.rs       # Core state management
│   ├── sidecar/               # HTTP client for sidecar
│   ├── audio/                 # Audio playback (TTS output)
│   ├── wakeword/              # Wake word (native, sidecar preferred)
│   ├── stt/                   # Speech-to-text (whisper.cpp)
│   ├── llm/                   # Language model (Ollama)
│   ├── tts/                   # Text-to-speech (Piper)
│   ├── gpio/                  # Hardware control (real + mock)
│   └── sounds/                # Sound effects
├── scripts/
│   ├── wakeword_sidecar.py    # Python HTTP sidecar
│   ├── build.sh               # Cross-compile for Pi 5
│   └── deploy.sh              # Deploy to Pi via rsync
└── config.toml                # Configuration file
```
## Verified Working Features

All components tested and verified on Raspberry Pi 5 + AIY Voice HAT v1:

- ✅ Wake Word Detection: "Hey Jarvis" detected via sidecar (scores 0.85-0.99)
- ✅ LED Control: physical LED on the AIY HAT (ON during recording, OFF when idle)
- ✅ Audio Recording: 5-second clips captured via sidecar
- ✅ Speech-to-Text: whisper.cpp transcribes with high accuracy
- ✅ LLM Integration: Ollama generates contextual responses
- ✅ Text-to-Speech: Piper synthesizes natural speech
- ✅ Audio Playback: verified through the AIY Voice HAT speaker (using `aplay -D plughw:0,0`)
- ✅ State Machine: full pipeline Idle → Wake → Record → Transcribe → Think → Speak → Idle
## Development vs Production Modes

### Local Development (macOS/Linux Desktop)

Features:

- Audio Input: uses `cpal` to capture from your Mac's microphone
- Audio Output: system default audio device
- GPIO: mock implementation (prints to console)
- Wake Word: native Rust (optional, sidecar not used)
- Validation: checks for Ollama, skips Pi-specific binaries

Useful for:

- Testing state machine logic
- Debugging LLM integration
- Rapid iteration without deploying
### Production (Raspberry Pi 5)

Features:

- Audio Input: sidecar (Python) with sounddevice + openWakeWord
- Audio Output: `aplay -D plughw:0,0` (direct to AIY Voice HAT)
- GPIO: real hardware control via `gpioset`/`gpioget`
- Validation: checks all binaries (whisper, piper, ollama, sidecar)

Deployed via:

- Cross-compiled ARM64 binary on macOS
- SCP to `/userdata/voice-assistant/`
- Manual start of sidecar + botface
## Configuration

The assistant automatically detects your platform and adjusts:

**macOS (Local Dev):**

```toml
[dev_mode]
enabled = true
mock_gpio = true
local_audio = true
skip_binary_checks = true

[audio]
device = "default"

[gpio]
mock_enabled = true
```
**Raspberry Pi (Production):**

```toml
[dev_mode]
enabled = false

[wakeword]
model_path = "/userdata/voice-assistant/models/hey_jarvis.onnx"
threshold = 0.5

[stt]
whisper_binary = "/userdata/voice-assistant/whisper-cli"
whisper_model = "/userdata/voice-assistant/models/ggml-base.en.bin"

[tts]
piper_binary = "/userdata/voice-assistant/piper/piper"
voice_model = "/userdata/voice-assistant/models/en_US-amy-medium.onnx"

[gpio]
mock_enabled = false
led_pin = 25
```
Create `config.toml` in the project root for local testing, or in `/userdata/voice-assistant/` on the Pi.
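A hedged sketch of how this kind of platform auto-detection can be done in Rust: defaults keyed off the compile-time target OS. The function names here are illustrative; `src/config.rs` is the source of truth for the real detection logic.

```rust
// Illustrative defaults chosen from the target OS. There is no AIY HAT on
// macOS, so GPIO is mocked there by default; on the Pi, audio goes straight
// to the HAT via ALSA.
fn default_mock_gpio() -> bool {
    cfg!(target_os = "macos")
}

fn default_audio_device() -> &'static str {
    if cfg!(target_os = "macos") { "default" } else { "plughw:0,0" }
}

fn main() {
    println!("mock_gpio = {}", default_mock_gpio());
    println!("audio device = {}", default_audio_device());
}
```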
## Usage Examples

### Local Development Mode

```bash
# Basic local run (auto-detects macOS)
cargo run

# With verbose logging
cargo run -- --verbose

# Skip dependency checks (faster startup)
cargo run -- --skip-checks

# Custom config
cargo run -- --config ./my-config.toml
```
### Production Mode on Pi

```bash
# Set your Pi's IP address
PI_IP="192.168.X.X"

# On macOS - Build release binary for Pi
cross build --release --target aarch64-unknown-linux-gnu

# Deploy
scp target/aarch64-unknown-linux-gnu/release/botface \
    root@$PI_IP:/userdata/voice-assistant/

# On Pi - Start sidecar first, then botface
ssh root@$PI_IP "cd /userdata/voice-assistant && \
  python3 wakeword_sidecar.py --model models/hey_jarvis.onnx --threshold 0.5 --port 8080 > /tmp/sidecar.log 2>&1 & \
  export LD_LIBRARY_PATH=/userdata/voice-assistant:\$LD_LIBRARY_PATH && \
  ./botface > /tmp/botface.log 2>&1 &"

# View logs
ssh root@$PI_IP "tail -f /tmp/botface.log /tmp/sidecar.log"
```
## Testing Without Hardware

You can test most functionality on your Mac:

1. Install Ollama locally:

   ```bash
   brew install ollama
   ollama pull llama3.2
   ```

2. Run with mocks:

   ```bash
   cargo run -- --mock-gpio --skip-checks
   ```
Limitations of local testing:
- Can’t test actual LED/button
- Audio quality depends on your Mac’s mic
- No whisper.cpp or piper (unless you install them)
- But wake word detection and state machine work!
## Architecture Highlights

### Sidecar Pattern

The sidecar handles audio I/O separately from the main Rust application:

- Sidecar HTTP API:
  - `GET /health` - Health check
  - `GET /events` - SSE stream for wake word events
  - `POST /record` - Record audio for a specified duration
  - `POST /reset` - Reset detection state
- Benefits:
  - Python handles audio streaming (sounddevice)
  - Rust handles orchestration and LLM logic
  - Independent restart/crash recovery
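The `/events` stream is plain text over HTTP: each event payload arrives on a `data:` line, per the SSE spec. A minimal std-only sketch of that framing (the JSON payload in the example is an assumption about the sidecar's schema; only the `data:` prefix is standard SSE):

```rust
/// Extract the payload from one SSE line. SSE frames events as
/// "data: <payload>"; lines starting with ':' are keep-alive comments.
fn sse_data(line: &str) -> Option<&str> {
    let rest = line.strip_prefix("data:")?;
    // The single space after "data:" is optional in the SSE spec.
    Some(rest.strip_prefix(' ').unwrap_or(rest))
}

fn main() {
    // Hypothetical wake-word event; the sidecar's real payload may differ.
    let line = r#"data: {"event":"wakeword","score":0.93}"#;
    assert_eq!(sse_data(line), Some(r#"{"event":"wakeword","score":0.93}"#));
    // Comment lines carry no payload.
    assert_eq!(sse_data(": keep-alive"), None);
    println!("sse framing ok");
}
```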
### Async State Machine

```
Idle → Listening → Recording → Transcribing → Thinking → Speaking → Idle
```
Each state has:
- Entry actions (LED, sounds)
- Async operations (non-blocking)
- Exit cleanup
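The cycle above can be sketched as a plain transition function. This is a synchronous simplification; the real `src/state_machine.rs` drives these transitions asynchronously and attaches the entry/exit actions listed above.

```rust
// States from the diagram above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Idle, Listening, Recording, Transcribing, Thinking, Speaking }

// Happy-path transition only; in the real machine, errors return to Idle.
fn next(s: State) -> State {
    match s {
        State::Idle => State::Listening,          // start listening for wake word
        State::Listening => State::Recording,     // wake word detected
        State::Recording => State::Transcribing,  // clip captured
        State::Transcribing => State::Thinking,   // text handed to the LLM
        State::Thinking => State::Speaking,       // response synthesized via TTS
        State::Speaking => State::Idle,           // playback finished
    }
}

fn main() {
    // One full pipeline cycle returns to Idle after six transitions.
    let mut s = State::Idle;
    for _ in 0..6 {
        s = next(s);
    }
    assert_eq!(s, State::Idle);
    println!("cycle ok");
}
```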
### Trait-Based GPIO

```rust
#[async_trait]
trait Gpio {
    async fn led_on(&mut self) -> Result<()>;
    async fn led_off(&mut self) -> Result<()>;
    async fn is_button_pressed(&self) -> Result<bool>;
}

// Two implementations:
// - AiyHatReal: system commands on the Pi (gpioset/gpioget)
// - AiyHatMock: console output on the Mac
```
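A sketch of what the mock side of that trait can look like, simplified to synchronous std-only code (the real trait is async via `#[async_trait]`, and the mock's log format here is invented for illustration):

```rust
// Synchronous simplification of the Gpio trait for illustration.
trait Gpio {
    fn led_on(&mut self);
    fn led_off(&mut self);
    fn is_button_pressed(&self) -> bool;
}

/// Mock used off-device: tracks and prints state instead of driving pins.
struct AiyHatMock {
    led: bool,
}

impl Gpio for AiyHatMock {
    fn led_on(&mut self) {
        self.led = true;
        println!("[mock gpio] LED ON");
    }
    fn led_off(&mut self) {
        self.led = false;
        println!("[mock gpio] LED OFF");
    }
    fn is_button_pressed(&self) -> bool {
        false // no physical button off-device
    }
}

fn main() {
    let mut gpio = AiyHatMock { led: false };
    gpio.led_on();
    assert!(gpio.led);
    gpio.led_off();
    assert!(!gpio.led);
}
```

Because callers only see the `Gpio` trait, the state machine runs unchanged against `AiyHatMock` on a Mac and the real implementation on the Pi.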
## Feature Flags

- `sidecar` (default): use the Python HTTP sidecar for wake word detection
- `native-wakeword`: native ONNX wake word (conditionally compiled)
- `local-dev`: local development settings (macOS)
- `pi-deploy`: production deployment settings
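Feature-gated code paths follow the standard `#[cfg(feature = "...")]` pattern; a hypothetical sketch (the function name is illustrative, and `Cargo.toml` defines the real feature wiring):

```rust
// With the default "sidecar" feature, wake events come over HTTP/SSE;
// otherwise the native ONNX path is compiled in instead.
#[cfg(feature = "sidecar")]
fn wakeword_backend() -> &'static str {
    "sidecar (HTTP/SSE)"
}

#[cfg(not(feature = "sidecar"))]
fn wakeword_backend() -> &'static str {
    "native ONNX"
}

fn main() {
    println!("wake word backend: {}", wakeword_backend());
}
```

Only one of the two functions exists in any given build, so there is no runtime branching cost.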
## Development Workflow

### 1. Edit Code Locally

```bash
cd botface
# Edit src/*.rs files
```

### 2. Test on Mac

```bash
# Quick iteration
cargo run -- --mock-gpio

# With all logging
cargo run -- --verbose 2>&1 | grep -E "(DEBUG|INFO|WARN)"
```

### 3. Build for Pi

```bash
just build-pi
# or
cross build --release --target aarch64-unknown-linux-gnu
```

### 4. Deploy to Pi

```bash
# See AGENTS.md for detailed deploy commands
scp target/aarch64-unknown-linux-gnu/release/botface \
    root@<pi-ip>:/userdata/voice-assistant/
```

### 5. Monitor

```bash
ssh root@<pi-ip> "tail -f /tmp/botface.log /tmp/sidecar.log"
```
## Learning Rust with This Project

This codebase demonstrates:

- Async/await with `tokio`
- Traits and generics for GPIO abstraction
- Error handling with `anyhow`/`thiserror`
- Cross-compilation for embedded targets
- HTTP client with `reqwest`, consuming SSE
- Subprocess management for external binaries
- Configuration management with `serde`
- Feature flags for conditional compilation
## Documentation

- `docs/INTEGRATION_ROADMAP.md` - Complete deployment guide
- `docs/dev-log/` - Development session logs
- `AGENTS.md` - Coding guidelines for AI assistants
- `.opencode/ci-knowledge.md` - CI/CD knowledge
## License

MIT License - see the LICENSE file for details.