# Features
rlm-cli uses Cargo features to provide optional functionality and reduce binary size for specific use cases.
## Available Features

### fastembed-embeddings (default)

What it does: Enables semantic search using FastEmbed ONNX-based embedding models.
Dependencies:

- `fastembed` crate (ONNX Runtime binaries)
- BGE-M3 embedding model (1024 dimensions)
Use when:
- You need semantic similarity search
- Context-aware document retrieval is important
- Hybrid search (semantic + BM25) is required
Binary size impact: ~100MB (includes ONNX runtime + model weights)
Build:

```bash
# Enabled by default
cargo build --release

# Explicitly enable
cargo build --release --features fastembed-embeddings
```

Skip when:
- You only need keyword/regex search
- Binary size is critical (embedded systems, containers)
- BM25 full-text search is sufficient
Build without:

```bash
cargo build --release --no-default-features
```

### usearch-hnsw

What it does: Enables high-performance vector search using the HNSW (Hierarchical Navigable Small World) algorithm.
Dependencies:

- `usearch` crate v2.23.x from crates.io (pinned `<2.24` for Windows compatibility)
- Requires a C++ compiler (C++17 or later)
Note: Version 2.24.0+ is excluded due to Windows compilation issues. See Troubleshooting for details.
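A version pin like this lives in the crate manifest. The following is a hypothetical sketch of the relevant `Cargo.toml` entries, not rlm-cli's actual manifest:

```toml
# Illustrative only: pin usearch below 2.24 and gate it behind a feature.
[dependencies]
usearch = { version = ">=2.23, <2.24", optional = true }

[features]
usearch-hnsw = ["dep:usearch"]
full-search = ["fastembed-embeddings", "usearch-hnsw"]
```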
Use when:
- Working with large document collections (>10,000 chunks)
- Low-latency vector search is required (<10ms)
- Memory usage is acceptable (HNSW index ~4x embedding size)
Performance:

- Exact search (SQLite): O(n), ~100ms for 10K chunks
- HNSW search: O(log n), <10ms for 10K chunks
Build:
cargo build --release --features usearch-hnswSkip when:
- Document collection is small (<1,000 chunks)
- Build environment lacks C++ toolchain
- Approximate nearest neighbor trade-offs are unacceptable
### full-search

What it does: Combines `fastembed-embeddings` + `usearch-hnsw` for complete semantic search capabilities.
Use when:
- Production deployment with large-scale semantic search
- Maximum search performance is required
- You want the complete feature set
Build:

```bash
cargo build --release --features full-search
```

## Feature Combinations

| Features | Embedding | Vector Search | BM25 | Use Case |
|---|---|---|---|---|
| (none) | ❌ | ❌ | ✅ | Keyword search only, minimal binary |
| `fastembed-embeddings` | ✅ | Exact (SQLite) | ✅ | Hybrid search, moderate scale |
| `usearch-hnsw` | ❌ | ❌ | ✅ | No embeddings, BM25 only |
| `full-search` | ✅ | HNSW | ✅ | Production, large scale |
## Fallback Behavior

### Without fastembed-embeddings

Semantic search commands fall back to BM25-only:
```bash
# This command requires embeddings
rlm-cli search "query" --mode semantic
# Error: FastEmbed not available, falling back to BM25
# Suggestion: Rebuild with --features fastembed-embeddings
```

The CLI will automatically use hash-based pseudo-embeddings for compatibility, but results will be degraded.
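This kind of fallback is typically wired up with Cargo's conditional compilation. The sketch below is illustrative: the function names and the hashing scheme are assumptions, not rlm-cli's actual code.

```rust
// Illustrative feature-gated embedding fallback; not rlm-cli's real API.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const DIM: usize = 1024; // BGE-M3 dimensionality

#[cfg(feature = "fastembed-embeddings")]
fn embed(_text: &str) -> Vec<f32> {
    // Real path: delegate to FastEmbed (omitted in this sketch).
    unimplemented!()
}

#[cfg(not(feature = "fastembed-embeddings"))]
fn embed(text: &str) -> Vec<f32> {
    // Fallback: deterministic hash-based pseudo-embedding. It preserves
    // the API shape (same dimension, stable per input) but carries no
    // semantic signal, hence the degraded results noted above.
    (0..DIM)
        .map(|i| {
            let mut h = DefaultHasher::new();
            (text, i).hash(&mut h);
            // Map the 64-bit hash onto [-1.0, 1.0].
            (h.finish() as f64 / u64::MAX as f64 * 2.0 - 1.0) as f32
        })
        .collect()
}

fn main() {
    let v = embed("example text");
    println!("pseudo-embedding dimension: {}", v.len());
}
```

Because both functions share one signature, downstream search code compiles identically whichever feature set is enabled.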
### Without usearch-hnsw

Vector search falls back to exact SQLite-based cosine similarity:

```bash
rlm-cli search "query" --mode hybrid --top-k 100
# Uses exact search - slower but accurate
```

Performance degrades linearly with chunk count.
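The linear cost comes from scoring every stored vector against the query. A minimal sketch of such a brute-force cosine scan, illustrative rather than rlm-cli's actual implementation:

```rust
// Brute-force O(n) cosine-similarity scan, the exact-search strategy
// described above. Illustrative sketch only.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Score every chunk and return the indices of the top-k matches.
fn exact_top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, c)| (i, cosine(query, c)))
        .collect();
    // Sort descending by similarity, then keep the first k indices.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let chunks = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    println!("top matches: {:?}", exact_top_k(&[1.0, 0.0], &chunks, 2));
}
```

An HNSW index avoids this full scan by walking a layered proximity graph, which is where the O(log n) behavior comes from.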
## Build Examples

### Minimal Binary (BM25 only)

```bash
# Smallest binary, keyword search only
cargo build --release --no-default-features
# Result: ~5MB binary, no embedding dependencies
```

### Standard Configuration (recommended)
```bash
# Default: FastEmbed embeddings + SQLite vector search
cargo build --release
# Result: ~100MB binary, hybrid search, moderate scale
```

### High-Performance Configuration
```bash
# Full features: FastEmbed + HNSW
cargo build --release --features full-search
# Result: ~105MB binary, maximum performance
```

### Container Deployment
```dockerfile
# Dockerfile example - minimal size
FROM rust:1.88-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release --no-default-features

FROM debian:bookworm-slim
COPY --from=builder /app/target/release/rlm-cli /usr/local/bin/
CMD ["rlm-cli"]
```

## Runtime Behavior
### First-Time Model Download (fastembed-embeddings)

When first running with embeddings enabled:
```bash
rlm-cli load document.md --name docs
# Downloads BGE-M3 model (~1GB) to ~/.cache/fastembed/
# Progress: Downloading model... 100%
# Generating embeddings... Done (5000 chunks in 30s)
```

Model cache location: `$HOME/.cache/fastembed/`

Download size: ~1GB (one-time)
### Feature Detection

Check which features are compiled in:

```bash
rlm-cli --version
# Output:
# rlm-cli 1.2.4
# Features: fastembed-embeddings, usearch-hnsw
```

## Troubleshooting
### Build Failures

Error: `error: failed to compile usearch`

Solution: Install a C++ compiler:

```bash
# Ubuntu/Debian
sudo apt-get install build-essential

# macOS
xcode-select --install

# Or build without HNSW
cargo build --release --features fastembed-embeddings
```

Error: `ONNX Runtime not found`

Solution: Use the bundled ONNX Runtime binaries (enabled by default):

```bash
# Explicitly enable bundled ONNX
cargo build --release --features fastembed-embeddings
```

### Runtime Issues
Section titled “Runtime Issues”Issue: Embedding generation is slow
Solutions:
- Use
--chunker parallelfor multi-threaded chunking - Reduce chunk size:
--chunk-size 50000(default: 100k) - Check CPU resources (embedding uses all cores)
Issue: High memory usage during search
Solutions:
- Without HNSW: Memory = chunk_count × 1024 × 4 bytes
- With HNSW: Memory = chunk_count × 1024 × 16 bytes (includes index)
- Use `--top-k` to limit the result set: `--top-k 10`
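The memory formulas above can be sanity-checked with a quick calculation (a sketch; the 1024 dimension matches the BGE-M3 model described earlier):

```rust
// Memory estimates per the formulas above: 1024-dim f32 embeddings,
// with HNSW adding roughly 4x overhead for the graph index.
fn exact_bytes(chunks: u64) -> u64 {
    chunks * 1024 * 4 // raw vectors only
}

fn hnsw_bytes(chunks: u64) -> u64 {
    chunks * 1024 * 16 // vectors plus HNSW index overhead
}

fn main() {
    let chunks = 10_000;
    println!("exact: {} MB", exact_bytes(chunks) / 1_000_000); // ~41 MB
    println!("hnsw:  {} MB", hnsw_bytes(chunks) / 1_000_000); // ~164 MB
}
```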
## Performance Comparison

Benchmark: 50,000 chunks, BGE-M3 embeddings (1024d)
| Configuration | Search Time | Memory | Binary Size |
|---|---|---|---|
| No features | 200ms (BM25) | 50MB | 5MB |
| fastembed-embeddings | 800ms (exact) | 250MB | 100MB |
| full-search | 8ms (HNSW) | 450MB | 105MB |
Recommendation: Use `fastembed-embeddings` (the default) for most use cases. Enable `usearch-hnsw` only for large-scale deployments (>10K chunks).
## Related Documentation

- Architecture - How features integrate with core systems
- CLI Reference - Commands affected by feature flags
- Plugin Integration - Using features with Claude Code