# Plugin Integration
This guide explains how to integrate rlm-rs with AI coding assistants through plugins, skills, and commands. While the examples focus on Claude Code, the patterns apply to any AI assistant that can execute shell commands.
## Overview

rlm-cli is designed as a CLI-first tool that AI assistants invoke via shell execution. This architecture enables:
- Universal Compatibility: Any assistant with shell access can use rlm-cli
- No Custom APIs: Standard stdin/stdout/stderr communication
- JSON Output: Machine-readable format for programmatic integration
- Stateless Commands: Each invocation is independent (state lives in SQLite)
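Taken together, these properties mean a wrapper only needs to spawn the process and parse stdout. Below is a minimal Python sketch of the parsing side, using the search-response shape shown in the JSON Output Schema section; the function name and `sample` payload are illustrative:

```python
import json

def extract_chunk_ids(stdout: str) -> list[int]:
    """Parse `rlm-cli --format json search` output and return chunk IDs,
    highest-scoring first. Assumes the documented search-response schema."""
    payload = json.loads(stdout)
    ranked = sorted(payload["results"], key=lambda r: r["score"], reverse=True)
    return [r["chunk_id"] for r in ranked]

# A payload in the documented shape (scores are made up):
sample = ('{"count": 2, "mode": "hybrid", "query": "auth", "results": '
          '[{"chunk_id": 7, "score": 0.021}, {"chunk_id": 42, "score": 0.033}]}')
print(extract_chunk_ids(sample))  # -> [42, 7]
```

Because state lives in SQLite, a wrapper like this can run after any number of independent invocations without tracking session context itself.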
## Claude Code Integration

### Plugin Architecture

The rlm-rs Claude Code plugin implements the RLM pattern:
### Components

#### 1. Slash Commands (Skills)

User-invocable shortcuts for common operations:
| Command | Description | Maps To |
|---|---|---|
| `/rlm-load` | Load file into RLM | `rlm-cli load <file>` |
| `/rlm-search` | Search loaded content | `rlm-cli search <query>` |
| `/rlm-status` | Show RLM state | `rlm-cli status` |
| `/rlm-analyze` | Full RLM analysis workflow | Orchestrated multi-step |
**Example Skill Definition** (`.claude/skills/rlm-load.md`):
```markdown
---
name: rlm-load
description: Load a file or directory into RLM for analysis
arguments:
  - name: path
    description: File or directory to load
    required: true
  - name: name
    description: Buffer name (defaults to filename)
    required: false
---

Load content into RLM for semantic search and chunk-based analysis.

## Workflow

1. Check if rlm-cli is installed: `which rlm-cli`
2. Initialize if needed: `rlm-cli init`
3. Load the content: `rlm-cli load {{path}} --name {{name}} --chunker semantic`
4. Report status: `rlm-cli status --format json`

## Output

Report the number of chunks created and confirm embeddings were generated.
```

#### 2. Subagents

Specialized agents for chunk-level processing:
**rlm-subcall Agent** (`.claude/agents/rlm-subcall.md`):

````markdown
---
name: rlm-subcall
model: haiku
description: Efficient chunk-level analysis for RLM workflow
tools:
  - Bash
  - Read
---

You are a focused analysis agent processing individual chunks from large documents.

## Instructions

1. Retrieve the chunk: `rlm-cli chunk get <chunk_id>`
2. Analyze according to the prompt
3. Return structured JSON findings:

```json
{
  "chunk_id": <id>,
  "findings": [...],
  "relevance": "high|medium|low",
  "summary": "Brief summary"
}
```

Keep responses concise. You're part of a larger workflow.
````
**rlm-synthesizer Agent** (`.claude/agents/rlm-synthesizer.md`):
```markdown
---
name: rlm-synthesizer
model: sonnet
description: Synthesize findings from multiple chunk analyses
tools:
  - Read
  - Bash
---

You aggregate results from multiple rlm-subcall analyses.

## Instructions

1. Review all chunk findings
2. Identify patterns and connections
3. Synthesize into coherent narrative
4. Highlight key insights and recommendations
```

#### 3. Hooks

Automated triggers for RLM operations:
**Auto-load on large files** (`.claude/hooks/large-file-rlm.md`):

```markdown
---
event: PostToolUse
tool: Read
---

If the file read was larger than 50KB, suggest loading it into RLM:

"This is a large file. Consider using `/rlm-load {{file_path}}` for semantic search."
```

### Typical Workflow

## Portable Integration Patterns

### Generic CLI Integration

Any AI assistant can integrate with rlm-rs using these patterns:
#### Pattern 1: Search-Then-Retrieve

```sh
# 1. Load content (one-time setup)
rlm-cli load large-document.md --name docs

# 2. Search for relevant chunks
RESULTS=$(rlm-cli --format json search "your query" --top-k 5)

# 3. Extract chunk IDs
CHUNK_IDS=$(echo "$RESULTS" | jq -r '.results[].chunk_id')

# 4. Retrieve and process each chunk
for ID in $CHUNK_IDS; do
  CONTENT=$(rlm-cli chunk get "$ID")
  # Process $CONTENT...
done
```

#### Pattern 2: Grep-Based Analysis
```sh
# Find specific patterns
rlm-cli grep docs "TODO|FIXME|HACK" --format json --max-matches 50

# Get context around matches
rlm-cli grep docs "error.*handling" --window 200
```

#### Pattern 3: Progressive Refinement
```sh
# Broad search first
rlm-cli search "authentication" --top-k 20

# Narrow down
rlm-cli search "JWT token validation" --top-k 5 --mode semantic

# Exact match
rlm-cli search "validateToken function" --mode bm25
```

## JSON Output Schema
All commands with `--format json` return structured data:
**Search Results:**

```json
{
  "count": 3,
  "mode": "hybrid",
  "query": "authentication",
  "results": [
    {
      "chunk_id": 42,
      "buffer_id": 1,
      "buffer_name": "auth.rs",
      "score": 0.0328,
      "semantic_score": 0.0499,
      "bm25_score": 0.0000016
    }
  ]
}
```

**Status:**

```json
{
  "initialized": true,
  "db_path": ".rlm/rlm-state.db",
  "db_size_bytes": 245760,
  "buffer_count": 3,
  "chunk_count": 42,
  "total_content_bytes": 125000,
  "embeddings_count": 42
}
```

**Chunk:**

```json
{
  "id": 42,
  "buffer_id": 1,
  "buffer_name": "auth.rs",
  "index": 3,
  "byte_range": [12000, 15000],
  "size": 3000,
  "content": "...",
  "has_embedding": true
}
```

## Platform-Specific Notes
### GitHub Copilot

Copilot can invoke rlm-rs through its terminal integration:

```
@terminal rlm-cli load src/ --name code
@terminal rlm-cli search "error handling"
```

### Codex CLI
Codex can execute rlm-rs commands directly:

```sh
codex "Load the documentation and find sections about API authentication"
# Codex runs: rlm-cli load docs/ && rlm-cli search "API authentication"
```

### OpenCode / Aider
These tools can use rlm-cli as an external helper:

```yaml
# In .aider.conf.yml or similar
tools:
  - name: rlm-search
    command: rlm-cli --format json search "$QUERY"
```

### VS Code Extensions
Extensions should use `execFile` instead of `exec` for security (avoids shell injection):

```typescript
import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

interface SearchResult {
  chunk_id: number;
  score: number;
}

interface SearchResponse {
  results: SearchResult[];
}

async function searchRLM(query: string): Promise<SearchResult[]> {
  // Using execFile (not exec) prevents shell injection
  const { stdout } = await execFileAsync('rlm-cli', [
    '--format', 'json',
    'search', query,
  ]);
  const response: SearchResponse = JSON.parse(stdout);
  return response.results;
}
```

## Best Practices
### 1. Use Semantic Chunking for Code

```sh
rlm-cli load src/ --chunker semantic --chunk-size 3000
```

Semantic chunking respects function and class boundaries.
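To see why boundary-aware splitting matters, here is a deliberately tiny Python sketch that starts a new chunk at each top-level `def`; the real rlm-rs chunker is far more capable, so treat this purely as an illustration of the idea:

```python
def toy_semantic_chunks(source: str) -> list[str]:
    """Toy illustration: split Python source at top-level function boundaries
    instead of at fixed byte offsets. Not the actual rlm-rs chunker."""
    chunks: list[list[str]] = []
    for line in source.splitlines():
        if line.startswith("def ") or not chunks:
            chunks.append([])  # start a new chunk at each function boundary
        chunks[-1].append(line)
    return ["\n".join(c) for c in chunks]

code = "import os\n\ndef a():\n    pass\n\ndef b():\n    pass\n"
print(len(toy_semantic_chunks(code)))  # -> 3
```

A fixed-size splitter could cut `a()` in half; the boundary-aware version keeps each function intact, which is what makes the resulting chunks useful search hits.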
### 2. Name Buffers Meaningfully

```sh
rlm-cli load src/auth/ --name auth-module
rlm-cli load src/api/ --name api-handlers
rlm-cli load docs/ --name documentation
```

This makes search results more interpretable.
### 3. Use Hybrid Search by Default

```sh
rlm-cli search "query" --mode hybrid
```

Hybrid combines semantic understanding with keyword matching.
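One common way to blend the two signals is a weighted linear combination; the sketch below uses an assumed weight and is not a description of rlm-rs internals:

```python
def hybrid_score(semantic: float, bm25: float, alpha: float = 0.7) -> float:
    """Blend semantic and keyword scores. The linear blend and alpha=0.7
    are illustrative assumptions, not rlm-rs internals."""
    return alpha * semantic + (1 - alpha) * bm25

# A chunk with strong semantic similarity still ranks well even when
# the exact query keywords are absent:
print(round(hybrid_score(semantic=0.9, bm25=0.0), 2))  # -> 0.63
```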
### 4. Batch Subagent Calls

Instead of sequential calls, use parallel Task invocations:

```
# Good: Parallel
Task(rlm-subcall, chunk 12) || Task(rlm-subcall, chunk 27) || Task(rlm-subcall, chunk 33)

# Avoid: Sequential
Task(rlm-subcall, chunk 12)
Task(rlm-subcall, chunk 27)
Task(rlm-subcall, chunk 33)
```

### 5. Store Intermediate Results
```sh
# After subcall analysis
rlm-cli add-buffer auth-analysis "$(cat subcall-results.json)"

# Later retrieval
rlm-cli show auth-analysis
```

## Error Handling for AI Assistants
When integrating rlm-rs into AI workflows, proper error handling ensures graceful recovery and a good user experience. This section provides structured patterns for handling common errors.

### Error Detection

All rlm-cli commands return:
- Exit code 0: Success
- Exit code 1: Error (details in stderr)
With JSON format, errors are structured:
```json
{
  "error": "storage error: RLM not initialized. Run: rlm-rs init",
  "code": "NOT_INITIALIZED"
}
```

### Common Errors and Recovery Strategies
Section titled “Common Errors and Recovery Strategies”| Error Message | Cause | Recovery Strategy |
|---|---|---|
| `RLM not initialized` | Database not created | Run `rlm-cli init` |
| `buffer not found: <name>` | Buffer doesn't exist | Run `rlm-cli list` to verify |
| `chunk not found: <id>` | Invalid chunk ID | Re-run search to get valid IDs |
| `No results found` | Query too specific | Broaden query or lower threshold |
| `embedding error` | Model loading issue | Check disk space, retry once |
| `file not found` | Invalid path | Verify path exists before load |
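An assistant can encode the table above as a lookup consulted before retrying. In this Python sketch the match strings come from the table, while the function name and suggestion wording are illustrative:

```python
# Recovery suggestions keyed by substrings of the CLI error messages above.
RECOVERY = {
    "RLM not initialized": "run `rlm-cli init`, then retry",
    "buffer not found": "run `rlm-cli list` to verify buffer names",
    "chunk not found": "re-run search to get valid chunk IDs",
    "No results found": "broaden the query or lower --threshold",
    "embedding error": "check disk space and retry once",
    "file not found": "verify the path exists before load",
}

def suggest_recovery(stderr: str) -> str:
    """Match a CLI error message against the recovery table."""
    for needle, action in RECOVERY.items():
        if needle in stderr:
            return action
    return "unrecognized error; surface it to the user verbatim"

print(suggest_recovery("storage error: buffer not found: config"))
```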
### Structured Error Handling Pattern

```sh
# Robust error handling for AI assistants
RESULT=$(rlm-cli --format json search "$QUERY" 2>&1)
EXIT_CODE=$?

if [ $EXIT_CODE -ne 0 ]; then
  # Parse error
  ERROR=$(echo "$RESULT" | jq -r '.error // empty')

  case "$ERROR" in
    *"not initialized"*)
      rlm-cli init
      # Retry original command
      RESULT=$(rlm-cli --format json search "$QUERY")
      ;;
    *"buffer not found"*)
      echo "Buffer not found. Available buffers:"
      rlm-cli list
      ;;
    *"No results"*)
      echo "No results. Try broader query or: --threshold 0.1"
      ;;
    *)
      echo "Error: $ERROR"
      ;;
  esac
fi
```

### Retry Logic
For transient errors (embedding model loading, database locks):

```sh
MAX_RETRIES=3
RETRY_DELAY=1

for i in $(seq 1 $MAX_RETRIES); do
  RESULT=$(rlm-cli --format json chunk embed "$BUFFER" 2>&1)
  if [ $? -eq 0 ]; then
    break
  fi

  if [ $i -lt $MAX_RETRIES ]; then
    sleep $RETRY_DELAY
    RETRY_DELAY=$((RETRY_DELAY * 2))  # Exponential backoff
  fi
done
```

### Pre-flight Checks
Before complex workflows, verify prerequisites:

```sh
# Check 1: rlm-cli is installed
if ! command -v rlm-cli &> /dev/null; then
  echo "rlm-cli not found. Install with: cargo install rlm-cli"
  exit 1
fi

# Check 2: Database is initialized
if ! rlm-cli status &> /dev/null; then
  rlm-cli init
fi

# Check 3: Content is loaded
BUFFER_COUNT=$(rlm-cli --format json status | jq '.buffer_count')
if [ "$BUFFER_COUNT" -eq 0 ]; then
  echo "No content loaded. Use: rlm-cli load <file>"
  exit 1
fi

# Check 4: Embeddings exist for semantic search
EMBED_COUNT=$(rlm-cli --format json chunk status | jq '.embedded_chunks')
if [ "$EMBED_COUNT" -eq 0 ]; then
  echo "No embeddings. Generating..."
  rlm-cli chunk embed --all
fi
```

### Graceful Degradation
When semantic search fails, fall back to BM25:

```sh
# Try semantic first
RESULT=$(rlm-cli --format json search "$QUERY" --mode semantic 2>&1)

if echo "$RESULT" | jq -e '.error' > /dev/null 2>&1; then
  # Fall back to BM25 (keyword search, no embeddings required)
  RESULT=$(rlm-cli --format json search "$QUERY" --mode bm25)
fi
```

### Error Messages for Users
When reporting errors to users, provide actionable guidance:
**Good**: "Buffer 'config' not found. Available buffers: main, auth. Did you mean one of these?"
**Bad**: "Error: buffer not found: config"

## Troubleshooting
Section titled “Troubleshooting”Command Not Found
Section titled “Command Not Found”# Check installationwhich rlm-cli
# Install if missingcargo install rlm-cli# orbrew install zircote/tap/rlm-rsDatabase Not Initialized
```sh
rlm-cli init
```

### No Search Results
- Check if content is loaded: `rlm-cli list`
- Verify embeddings exist: `rlm-cli chunk status`
- Try broader query or lower threshold: `--threshold 0.1`
### JSON Parsing Errors

Ensure you're using `--format json`:

```sh
rlm-cli --format json search "query"   # Correct
rlm-cli search "query" --format json   # Also correct
```

## System Prompt Templates
Ready-to-use system prompts for AI assistants integrating with rlm-rs are available in the `prompts/` directory:
| Template | Purpose | Recommended Model |
|---|---|---|
| `rlm-orchestrator.md` | Coordinates search, dispatch, and synthesis | sonnet |
| `rlm-analyst.md` | Analyzes individual chunks | haiku |
| `rlm-synthesizer.md` | Aggregates analyst findings | sonnet |
### Quick Start

- Orchestrator receives user request and searches for relevant chunks
- Analysts (parallel) process individual chunks and return structured findings
- Synthesizer aggregates findings into a coherent report
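The three roles above can be sketched as plain Python functions; the keyword heuristic, function names, and findings format are all illustrative stand-ins for real model calls:

```python
def analyze_chunk(chunk_id: int, text: str) -> dict:
    """Analyst: return structured findings for one chunk (toy heuristic
    standing in for a haiku-model call)."""
    hits = [w for w in ("auth", "token") if w in text]
    return {"chunk_id": chunk_id, "findings": hits,
            "relevance": "high" if hits else "low"}

def synthesize(results: list[dict]) -> str:
    """Synthesizer: aggregate analyst findings into a short report."""
    relevant = [r for r in results if r["relevance"] == "high"]
    return f"{len(relevant)}/{len(results)} chunks relevant"

# Orchestrator: dispatch analysts (in Claude Code these would run as
# parallel Task invocations), then collect and synthesize.
chunks = {12: "token validation", 27: "logging setup", 33: "auth middleware"}
report = synthesize([analyze_chunk(i, t) for i, t in chunks.items()])
print(report)  # -> 2/3 chunks relevant
```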
```
        User Request
             │
             ▼
      ┌──────────────┐
      │ Orchestrator │──▶ rlm-cli search "query"
      └──────────────┘
             │
             ▼ dispatch
┌─────────────────────────────────────┐
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │Analyst 1│ │Analyst 2│ │Analyst N│ │  (parallel)
│ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────┘
             │ collect
             ▼
      ┌─────────────┐
      │ Synthesizer │──▶ Final Report
      └─────────────┘
```

## See Also
- RLM-Inspired Design - Architectural philosophy
- CLI Reference - Complete command documentation
- Architecture - Internal implementation details
- Prompt Templates - System prompts for AI integration