ADR-002: Use Rust as Implementation Language

Status: Accepted

Implementing the RLM pattern requires a language that can efficiently handle:

  • Large text processing and chunking operations
  • Embedding vector computations
  • SQLite database operations
  • A command-line interface with a good user experience

The implementation language choice affects performance, distribution complexity, and long-term maintainability.
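To make the workloads listed above concrete, here is a minimal std-only sketch of two of them: splitting text into character-bounded chunks and comparing embedding vectors by cosine similarity. The function names `chunk_text` and `cosine_similarity` are hypothetical and are not taken from the rlm-rs codebase; the project's actual chunking strategy may differ.

```rust
/// Split `text` into chunks of at most `max_chars` characters,
/// always cutting on a UTF-8 character boundary.
fn chunk_text(text: &str, max_chars: usize) -> Vec<&str> {
    assert!(max_chars > 0, "max_chars must be positive");
    let mut chunks = Vec::new();
    let mut start = 0; // byte offset where the current chunk begins
    let mut count = 0; // characters seen in the current chunk
    for (idx, _) in text.char_indices() {
        if count == max_chars {
            chunks.push(&text[start..idx]);
            start = idx;
            count = 0;
        }
        count += 1;
    }
    if start < text.len() {
        chunks.push(&text[start..]);
    }
    chunks
}

/// Cosine similarity between two equal-length vectors.
/// Returns `None` if the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

The chunks borrow from the input string rather than copying it, which is the kind of allocation control that motivates the performance requirement.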

Limitations of the alternatives:

  1. Python ML tools: Require runtime installation and dependency management
  2. JavaScript/Node: V8 overhead for compute-intensive embedding operations
  3. Go: Limited ML ecosystem, would require CGO for embedding models

Primary decision drivers:

  1. Single binary distribution: Users should be able to install via a single executable without runtime dependencies
  2. Performance: Text processing and vector operations should be efficient
  3. Memory safety: No segfaults or memory leaks in production use

Secondary decision drivers:

  1. Type safety: Strong typing catches errors at compile time
  2. Ecosystem: Cargo provides excellent dependency management
  3. Cross-platform: Should compile for Linux, macOS, and Windows

Description: Implement in Rust using cargo for builds and distribution.

Technical Characteristics:

  • Zero-cost abstractions
  • No garbage collector
  • Single static binary output
  • Strong type system with ownership model
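As an illustration of the type system and ownership model named above, the following sketch uses distinct wrapper types for identifiers and a function that takes ownership of its argument. All type and function names here (`DocumentId`, `ChunkId`, `Chunk`, `into_text`, `belongs_to`) are hypothetical and chosen only for the example.

```rust
/// Hypothetical identifier types: wrapping the same integer in distinct
/// newtypes keeps the two kinds of ID from being mixed up.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DocumentId(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ChunkId(u64);

#[derive(Debug)]
struct Chunk {
    id: ChunkId,
    document: DocumentId,
    text: String,
}

/// Takes ownership of the chunk and moves its text out without copying.
/// The caller can no longer use `chunk` after this call.
fn into_text(chunk: Chunk) -> String {
    chunk.text
}

/// Borrows the chunk immutably; the caller keeps ownership.
fn belongs_to(chunk: &Chunk, document: DocumentId) -> bool {
    // Passing `chunk.id` as `document` would be rejected at compile time,
    // because `ChunkId` and `DocumentId` are different types.
    chunk.document == document
}
```

Mistakes such as swapping two IDs or reusing a value after it has been moved are reported by the compiler rather than surfacing at runtime.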

Advantages:

  • Single binary distribution (no runtime required)
  • Memory safety without garbage collection
  • Excellent performance for text processing
  • Strong ecosystem for CLI (clap) and database (rusqlite)
  • fastembed-rs provides native embedding support

Disadvantages:

  • Steeper learning curve
  • Longer compile times
  • Smaller talent pool than Python/JS

Risk Assessment:

  • Technical Risk: Low. Mature language with stable tooling
  • Schedule Risk: Medium. Rust requires more upfront design
  • Ecosystem Risk: Low. Key dependencies (rusqlite, fastembed) are mature

Description: Implement in Python with PyInstaller or similar for distribution.

Technical Characteristics:

  • Dynamic typing
  • Rich ML ecosystem (numpy, sentence-transformers)
  • Requires Python runtime or bundled interpreter

Advantages:

  • Fastest development velocity
  • Best ML library ecosystem
  • Large developer community

Disadvantages:

  • Distribution complexity (virtualenv, pip, version conflicts)
  • Performance overhead for text processing
  • Memory management less predictable

Disqualifying Factor: Distribution complexity conflicts with the goal of a simple single-binary CLI tool.

Risk Assessment:

  • Technical Risk: Low. Very mature ecosystem
  • Schedule Risk: Low. Fast development
  • Ecosystem Risk: Medium. Dependency conflicts common

Description: Implement in Go for simple distribution and good performance.

Technical Characteristics:

  • Static binary compilation
  • Garbage collected
  • Simple language design

Advantages:

  • Single binary distribution
  • Fast compilation
  • Good CLI tooling (cobra)

Disadvantages:

  • Limited ML/embedding ecosystem
  • Would require CGO for ONNX runtime
  • Less expressive type system

Disqualifying Factor: Embedding model integration would require complex CGO bindings.

Risk Assessment:

  • Technical Risk: Medium. CGO complexity for ML
  • Schedule Risk: Medium. ML integration work
  • Ecosystem Risk: High. Limited embedding options

Use Rust as the implementation language for rlm-rs.

The implementation will use:

  • Cargo for build system and dependency management
  • clap for CLI argument parsing
  • rusqlite for SQLite database access
  • fastembed-rs for embedding generation
  • thiserror for error handling
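For the error-handling item in the list above, the sketch below writes out by hand, using only the standard library, the kind of trait implementations that a derive-based crate such as thiserror is meant to save. The error type `RlmError`, its variants, and `validate_chunk` are hypothetical; the real error types in rlm-rs are not shown in this record.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type; the variant names are illustrative only.
#[derive(Debug)]
enum RlmError {
    EmptyInput,
    ChunkTooLarge { size: usize, limit: usize },
    Io(std::io::Error),
}

impl fmt::Display for RlmError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RlmError::EmptyInput => write!(f, "input is empty"),
            RlmError::ChunkTooLarge { size, limit } => {
                write!(f, "chunk of {size} bytes exceeds limit of {limit}")
            }
            RlmError::Io(err) => write!(f, "I/O error: {err}"),
        }
    }
}

impl Error for RlmError {
    /// Expose the wrapped I/O error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            RlmError::Io(err) => Some(err),
            _ => None,
        }
    }
}

/// Lets `?` convert `std::io::Error` into `RlmError` automatically.
impl From<std::io::Error> for RlmError {
    fn from(err: std::io::Error) -> Self {
        RlmError::Io(err)
    }
}

/// Reject empty chunks and chunks longer than `limit` bytes.
fn validate_chunk(chunk: &str, limit: usize) -> Result<(), RlmError> {
    if chunk.is_empty() {
        return Err(RlmError::EmptyInput);
    }
    if chunk.len() > limit {
        return Err(RlmError::ChunkTooLarge { size: chunk.len(), limit });
    }
    Ok(())
}
```

Because every failure mode is a variant of one enum, a `match` on the error must handle each case or the compiler will flag the omission.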

Positive consequences:

  1. Zero-dependency distribution: Users install a single binary with no runtime requirements
  2. Predictable performance: No GC pauses, efficient memory usage
  3. Compile-time safety: Ownership system prevents memory bugs and data races
  4. Cross-platform builds: cargo handles cross-compilation well

Negative consequences:

  1. Development velocity: Rust requires more upfront design than Python
  2. Compile times: Full rebuilds take longer than interpreted languages
  3. Contributor barrier: Fewer developers familiar with Rust

Neutral consequences:

  1. Binary size: Statically linked binaries are larger but self-contained
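The compile-time safety point can be sketched with scoped threads from the standard library (available since Rust 1.63). The function `total_len_parallel` is a hypothetical example, not part of rlm-rs: two threads read disjoint halves of a slice, and the borrow checker verifies that neither can mutate the shared data.

```rust
use std::thread;

/// Sum the byte lengths of all chunks using two scoped threads.
/// Each thread only borrows its half of the slice immutably.
fn total_len_parallel(chunks: &[String]) -> usize {
    let mid = chunks.len() / 2;
    let (left, right) = chunks.split_at(mid);
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().map(|c| c.len()).sum::<usize>());
        let r = s.spawn(|| right.iter().map(|c| c.len()).sum::<usize>());
        // Both threads are joined before the scope ends, so the borrows
        // of `chunks` cannot outlive the data they point to.
        l.join().unwrap() + r.join().unwrap()
    })
}
```

If one of the closures tried to modify the slice while the other thread was reading it, the program would fail to compile instead of racing at runtime.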

Rust enables the core distribution goal: a single binary CLI tool that users can install and run without dependency management. The performance characteristics are well-suited for text processing and embedding operations.

Mitigations:

  • Use incremental compilation during development
  • Provide clear documentation for contributors
  • Leverage Rust’s excellent documentation and error messages

  • Date: 2025-01-01
  • Source: Project inception design decisions
  • Related ADRs: ADR-001

Status: Compliant

Findings:

Finding                        Files        Lines      Assessment
Rust edition 2024 configured   Cargo.toml   L3         compliant
MSRV 1.88 specified            Cargo.toml   L7         compliant
Strict clippy lints enabled    Cargo.toml   L89-L120   compliant
No unsafe code blocks          all          -          compliant

Summary: Project fully implemented in Rust with strict safety configuration.

Action Required: None