Skip to content

ADR-003: SQLite for State Persistence

Accepted

The RLM pattern requires persistent storage for:

  • Buffer content and metadata
  • Chunks with byte offsets and line numbers
  • Embedding vectors for semantic search
  • Full-text search indexes
  • Context state across sessions

The storage solution must support efficient queries, work locally without a server, and integrate well with a CLI tool distribution model.

  1. In-memory only: Would lose state between CLI invocations
  2. File-based JSON: Inefficient for large datasets, no query capability
  3. External databases: Require separate installation and configuration
  1. Zero-configuration: Storage should work out-of-the-box without setup
  2. Embedded operation: No separate server process required
  3. Query capability: Need efficient filtering, joining, and full-text search
  1. ACID compliance: Data integrity across crashes
  2. Single-file storage: Easy backup and portability
  3. FTS5 support: Built-in full-text search for BM25 ranking

Description: Use SQLite as an embedded database accessed via rusqlite.

Technical Characteristics:

  • Single-file database
  • ACID compliant
  • FTS5 for full-text search
  • BLOB storage for embeddings
  • No server process

Advantages:

  • Zero configuration - just a file path
  • Excellent query performance for local data
  • FTS5 provides BM25 ranking out-of-the-box
  • rusqlite is mature and well-maintained
  • Single file for easy backup/sync

Disadvantages:

  • Limited concurrent write performance
  • No built-in vector search (must implement)
  • File locking can be tricky

Risk Assessment:

  • Technical Risk: Low. SQLite is battle-tested
  • Schedule Risk: Low. rusqlite API is straightforward
  • Ecosystem Risk: Low. SQLite is ubiquitous

Description: Use RocksDB key-value store for high-performance storage.

Technical Characteristics:

  • LSM tree architecture
  • High write throughput
  • Column families for organization

Advantages:

  • Excellent write performance
  • Good compression
  • Proven at scale

Disadvantages:

  • No SQL query capability
  • No built-in full-text search
  • More complex setup
  • Larger dependency footprint

Disqualifying Factor: Lack of SQL queries and FTS would require significant additional implementation.

Risk Assessment:

  • Technical Risk: Medium. More complex to use correctly
  • Schedule Risk: High. Would need custom query layer
  • Ecosystem Risk: Low. Mature library

Description: Use PostgreSQL with pgvector for production-grade storage.

Technical Characteristics:

  • Full SQL support
  • pgvector for native vector search
  • Requires server process

Advantages:

  • Native vector similarity search
  • Excellent query optimizer
  • Scalable

Disadvantages:

  • Requires separate installation
  • Server process management
  • Overkill for CLI tool

Disqualifying Factor: External server requirement conflicts with zero-configuration CLI design.

Risk Assessment:

  • Technical Risk: Low. Mature database
  • Schedule Risk: Medium. Setup complexity
  • Ecosystem Risk: Low. Industry standard

Use SQLite as the embedded storage backend via rusqlite.

The implementation will use:

  • rusqlite for database access with bundled SQLite
  • FTS5 for full-text search with BM25 ranking
  • BLOB columns for embedding vector storage
  • Foreign keys for referential integrity
  • Schema migrations for version upgrades
  1. Zero configuration: Database created automatically on first use
  2. Single-file portability: .rlm file can be copied, backed up, or synced
  3. Built-in FTS5: BM25 search available without additional dependencies
  4. ACID guarantees: Data integrity even on crashes
  5. Familiar SQL: Easy to debug and extend
  1. Manual vector search: Must implement cosine similarity in application code
  2. Write contention: Concurrent writes require careful handling
  3. No native vector index: Linear scan for semantic search (mitigated by small dataset sizes)
  1. File locking: SQLite handles this but requires WAL mode for best concurrency

SQLite provides the right balance of capability and simplicity for a CLI tool. The FTS5 extension enables BM25 search, and embedding vectors stored as BLOBs support semantic search with application-level cosine similarity.

Mitigations:

  • Use WAL mode for better concurrency
  • Implement efficient batch operations
  • Keep result sets small with LIMIT clauses
  • Schema versioning for upgrades
  • SQLite - SQLite documentation
  • rusqlite - Rust SQLite bindings
  • FTS5 - Full-text search extension
  • Date: 2025-01-01
  • Source: Project inception design decisions
  • Related ADRs: ADR-001, ADR-008

Status: Compliant

Findings:

FindingFilesLinesAssessment
Schema defined with FTS5src/storage/schema.rsL88-L108compliant
Migrations implementedsrc/storage/schema.rsL127-L197compliant
BLOB embedding storagesrc/storage/schema.rsL79-L86compliant
rusqlite dependencyCargo.tomlL30compliant

Summary: SQLite storage fully implemented with FTS5, migrations, and embedding support.

Action Required: None