3. Four data collectors
Date: 2026-04-22
Status
Accepted
Context
Cerebro aggregates signals from development activity to generate context. We need to decide which data sources to collect from.
Decision
We will collect from four sources:
| Source | Implementation | Data |
|---|---|---|
| OpenCode | SQLite via opencode_db_path | Session history |
| Git | git2 crate | Commits, modified files |
| TODOs | Regex scan on source files | TODO/FIXME/HACK/XXX comments |
| Manual Notes | notes/projects/{name}.md | Status, journal, intent |
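
As an illustration of the Manual Notes row, here is a minimal sketch of how a notes file could be located and read. Only the relative layout notes/projects/{name}.md comes from the table; the function names, the `root` parameter, and the choice to treat a missing file as "no notes" are assumptions, not the Cerebro implementation.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Builds `<root>/notes/projects/<name>.md`.
/// `root` and this function name are hypothetical.
fn notes_path(root: &Path, name: &str) -> PathBuf {
    root.join("notes").join("projects").join(format!("{name}.md"))
}

/// Reads a project's notes file, if it exists.
/// Assumption: a missing file means "no notes yet" rather than an error.
fn read_notes(root: &Path, name: &str) -> io::Result<Option<String>> {
    match std::fs::read_to_string(notes_path(root, name)) {
        Ok(text) => Ok(Some(text)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

Returning `Ok(None)` for a missing file would let a project without notes still be collected from the other three sources.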
Consequences
Pros
- Comprehensive signal aggregation
- Each collector is independent
- Manual notes provide human context
Cons
- Git collection can be slow on large repos
- The TODO regex can produce false positives when a marker appears inside a string literal
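
The false-positive risk can be sketched as follows. To stay dependency-free, this example uses plain substring search instead of a regex, so it is a stand-in for the real scanner, not a copy of it. All names are hypothetical. `scan_naive` matches a marker anywhere on a line. `scan_comments` only looks at text after a `//` opener, which removes the string-literal case shown in the test but would still be fooled by a `//` inside a string.

```rust
const MARKERS: [&str; 4] = ["TODO", "FIXME", "HACK", "XXX"];

/// One marker occurrence; `line` is 1-based.
#[derive(Debug, PartialEq)]
struct Hit {
    line: usize,
    marker: &'static str,
}

/// Returns the first marker (in `MARKERS` order) contained in `text`, if any.
fn first_marker(text: &str) -> Option<&'static str> {
    MARKERS.iter().copied().find(|m| text.contains(*m))
}

/// Naive scan: a marker anywhere on the line counts, including inside strings.
fn scan_naive(source: &str) -> Vec<Hit> {
    source
        .lines()
        .enumerate()
        .filter_map(|(i, text)| first_marker(text).map(|marker| Hit { line: i + 1, marker }))
        .collect()
}

/// Stricter scan: only text after the first `//` on a line is considered.
fn scan_comments(source: &str) -> Vec<Hit> {
    source
        .lines()
        .enumerate()
        .filter_map(|(i, text)| {
            // Lines without a `//` opener are skipped entirely.
            let comment = &text[text.find("//")? + 2..];
            first_marker(comment).map(|marker| Hit { line: i + 1, marker })
        })
        .collect()
}
```

Whether the extra strictness is worth it depends on how noisy the naive results are in practice. Block comments and other languages' comment syntax are not handled here.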
Notes
Collectors are orchestrated in collectors/mod.rs. Each is async and can run in parallel.
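
The sketch below shows one way independent collectors could be run in parallel. It is not the code in collectors/mod.rs: the collectors there are async, while this example uses scoped OS threads from the standard library as a dependency-free stand-in. The `Signal` enum, the `Collector` alias, and the stub functions are all hypothetical.

```rust
use std::thread;

/// Hypothetical collector output; the real types are not given in this ADR.
#[derive(Debug, PartialEq)]
enum Signal {
    Sessions(usize),
    Commits(usize),
    Todos(usize),
    Notes(usize),
}

/// A collector is modeled as a plain function that either yields a signal or an error message.
type Collector = fn() -> Result<Signal, String>;

/// Stub standing in for the Git collector.
fn git_stub() -> Result<Signal, String> {
    Ok(Signal::Commits(3))
}

/// Stub standing in for an OpenCode collector whose database cannot be opened.
fn opencode_stub() -> Result<Signal, String> {
    Err("session database unavailable".to_string())
}

/// Runs every collector on its own thread and returns results in input order.
/// In this sketch, a failing or panicking collector becomes an `Err` entry
/// and does not prevent the other results from being returned.
fn collect_all(collectors: &[Collector]) -> Vec<Result<Signal, String>> {
    thread::scope(|scope| {
        // Spawn all threads first so the collectors actually overlap.
        let handles: Vec<_> = collectors
            .iter()
            .copied()
            .map(|collector| scope.spawn(collector))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap_or_else(|_| Err("collector panicked".to_string())))
            .collect()
    })
}
```

Calling `collect_all(&[git_stub as Collector, opencode_stub as Collector])` returns one entry per collector, so a slow or failing source is visible in the result without hiding the others.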