
Plugin Integration

This guide explains how to integrate rlm-rs with AI coding assistants through plugins, skills, and commands. While the examples focus on Claude Code, the patterns apply to any AI assistant that can execute shell commands.

rlm-cli is designed as a CLI-first tool that AI assistants invoke via shell execution. This architecture enables:

  • Universal Compatibility: Any assistant with shell access can use rlm-cli
  • No Custom APIs: Standard stdin/stdout/stderr communication
  • JSON Output: Machine-readable format for programmatic integration
  • Stateless Commands: Each invocation is independent (state lives in SQLite)

The rlm-rs Claude Code plugin implements the RLM pattern:

User-invocable shortcuts for common operations:

| Command | Description | Maps To |
| --- | --- | --- |
| `/rlm-load` | Load file into RLM | `rlm-cli load <file>` |
| `/rlm-search` | Search loaded content | `rlm-cli search <query>` |
| `/rlm-status` | Show RLM state | `rlm-cli status` |
| `/rlm-analyze` | Full RLM analysis workflow | Orchestrated multi-step |

Example Skill Definition (.claude/skills/rlm-load.md):

---
name: rlm-load
description: Load a file or directory into RLM for analysis
arguments:
  - name: path
    description: File or directory to load
    required: true
  - name: name
    description: Buffer name (defaults to filename)
    required: false
---
Load content into RLM for semantic search and chunk-based analysis.
## Workflow
1. Check if rlm-rs is installed: `which rlm-cli`
2. Initialize if needed: `rlm-cli init`
3. Load the content: `rlm-cli load {{path}} --name {{name}} --chunker semantic`
4. Report status: `rlm-cli status --format json`
## Output
Report the number of chunks created and confirm embeddings were generated.

Specialized agents for chunk-level processing:

rlm-subcall Agent (.claude/agents/rlm-subcall.md):

---
name: rlm-subcall
model: haiku
description: Efficient chunk-level analysis for RLM workflow
tools:
- Bash
- Read
---
You are a focused analysis agent processing individual chunks from large documents.
## Instructions
1. Retrieve the chunk: `rlm-cli chunk get <chunk_id>`
2. Analyze according to the prompt
3. Return structured JSON findings:
```json
{
  "chunk_id": <id>,
  "findings": [...],
  "relevance": "high|medium|low",
  "summary": "Brief summary"
}
```

Keep responses concise. You’re part of a larger workflow.

rlm-synthesizer Agent (.claude/agents/rlm-synthesizer.md):
---
name: rlm-synthesizer
model: sonnet
description: Synthesize findings from multiple chunk analyses
tools:
- Read
- Bash
---
You aggregate results from multiple rlm-subcall analyses.
## Instructions
1. Review all chunk findings
2. Identify patterns and connections
3. Synthesize into coherent narrative
4. Highlight key insights and recommendations

Automated triggers for RLM operations:

Auto-load on large files (.claude/hooks/large-file-rlm.md):

---
event: PostToolUse
tool: Read
---
If the file read was larger than 50KB, suggest loading it into RLM:
"This is a large file. Consider using `/rlm-load {{file_path}}` for semantic search."
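
The 50KB trigger described by the hook can be sketched as a standalone check; the `is_large_file` helper name and the demo file are illustrative, not part of any hook API:

```shell
# Illustrative helper: succeeds when a file exceeds the 50KB threshold
# used by the hook above.
is_large_file() {
  local path="$1"
  local threshold=$((50 * 1024))
  local size
  size=$(wc -c < "$path")
  [ "$size" -gt "$threshold" ]
}

# Demo against a temporary 60KB file
TMP=$(mktemp)
head -c 61440 /dev/zero > "$TMP"
if is_large_file "$TMP"; then
  echo "suggest: /rlm-load $TMP"
fi
rm -f "$TMP"
```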

Any AI assistant can integrate with rlm-rs using these patterns:

# 1. Load content (one-time setup)
rlm-cli load large-document.md --name docs
# 2. Search for relevant chunks
RESULTS=$(rlm-cli --format json search "your query" --top-k 5)
# 3. Extract chunk IDs
CHUNK_IDS=$(echo "$RESULTS" | jq -r '.results[].chunk_id')
# 4. Retrieve and process each chunk
for ID in $CHUNK_IDS; do
  CONTENT=$(rlm-cli chunk get $ID)
  # Process $CONTENT...
done
# Find specific patterns
rlm-cli grep docs "TODO|FIXME|HACK" --format json --max-matches 50
# Get context around matches
rlm-cli grep docs "error.*handling" --window 200
# Broad search first
rlm-cli search "authentication" --top-k 20
# Narrow down
rlm-cli search "JWT token validation" --top-k 5 --mode semantic
# Exact match
rlm-cli search "validateToken function" --mode bm25

All commands with --format json return structured data:

Search Results:

{
  "count": 3,
  "mode": "hybrid",
  "query": "authentication",
  "results": [
    {
      "chunk_id": 42,
      "buffer_id": 1,
      "buffer_name": "auth.rs",
      "score": 0.0328,
      "semantic_score": 0.0499,
      "bm25_score": 0.0000016
    }
  ]
}

Status:

{
  "initialized": true,
  "db_path": ".rlm/rlm-state.db",
  "db_size_bytes": 245760,
  "buffer_count": 3,
  "chunk_count": 42,
  "total_content_bytes": 125000,
  "embeddings_count": 42
}

Chunk:

{
  "id": 42,
  "buffer_id": 1,
  "buffer_name": "auth.rs",
  "index": 3,
  "byte_range": [12000, 15000],
  "size": 3000,
  "content": "...",
  "has_embedding": true
}

Copilot can invoke rlm-rs through its terminal integration:

@terminal rlm-cli load src/ --name code
@terminal rlm-cli search "error handling"

Codex can execute rlm-rs commands directly:

codex "Load the documentation and find sections about API authentication"
# Codex runs: rlm-cli load docs/ && rlm-cli search "API authentication"

These tools can use rlm-cli as an external helper:

# In .aider.conf.yml or similar
tools:
  - name: rlm-search
    command: rlm-cli --format json search "$QUERY"

Extensions should use execFile instead of exec for security (avoids shell injection):

import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

interface SearchResult {
  chunk_id: number;
  score: number;
}

interface SearchResponse {
  results: SearchResult[];
}

async function searchRLM(query: string): Promise<SearchResult[]> {
  // Using execFile (not exec) prevents shell injection
  const { stdout } = await execFileAsync('rlm-cli', [
    '--format', 'json',
    'search', query
  ]);
  const response: SearchResponse = JSON.parse(stdout);
  return response.results;
}

rlm-cli load src/ --chunker semantic --chunk-size 3000

Semantic chunking respects function and class boundaries.

rlm-cli load src/auth/ --name auth-module
rlm-cli load src/api/ --name api-handlers
rlm-cli load docs/ --name documentation

This makes search results more interpretable.

rlm-cli search "query" --mode hybrid

Hybrid combines semantic understanding with keyword matching.

Instead of sequential calls, use parallel Task invocations:

# Good: Parallel
Task(rlm-subcall, chunk 12) || Task(rlm-subcall, chunk 27) || Task(rlm-subcall, chunk 33)
# Avoid: Sequential
Task(rlm-subcall, chunk 12)
Task(rlm-subcall, chunk 27)
Task(rlm-subcall, chunk 33)
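
In plain shell, the same fan-out can be approximated with background jobs; the `process_chunk` body below is a placeholder for a real `rlm-cli chunk get` call plus an analysis step:

```shell
# Parallel fan-out sketch: background one job per chunk ID, then wait.
process_chunk() {
  # Placeholder; in practice: rlm-cli chunk get "$1" | <analysis step>
  echo "processed chunk $1"
}

for ID in 12 27 33; do
  process_chunk "$ID" &
done
wait   # block until every background job has finished
```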
# After subcall analysis
rlm-cli add-buffer auth-analysis "$(cat subcall-results.json)"
# Later retrieval
rlm-cli show auth-analysis

When integrating rlm-rs into AI workflows, proper error handling ensures graceful recovery and good user experience. This section provides structured patterns for handling common errors.

All rlm-cli commands return:

  • Exit code 0: Success
  • Exit code 1: Error (details in stderr)
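
A generic wrapper for this exit-code contract might look like the following sketch; `run_checked` is an illustrative name, and any `rlm-cli` invocation can be substituted for the demo commands:

```shell
# Sketch: capture stdout+stderr and branch on the exit-code contract
# (0 = success, non-zero = error with details on stderr).
run_checked() {
  local out
  if out=$("$@" 2>&1); then
    echo "ok: $out"
  else
    echo "error: $out"
  fi
}

run_checked echo "hello"                    # ok: hello
run_checked sh -c 'echo boom >&2; exit 1'   # error: boom
```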

With JSON format, errors are structured:

{
  "error": "storage error: RLM not initialized. Run: rlm-rs init",
  "code": "NOT_INITIALIZED"
}

| Error Message | Cause | Recovery Strategy |
| --- | --- | --- |
| `RLM not initialized` | Database not created | Run `rlm-cli init` |
| `buffer not found: <name>` | Buffer doesn't exist | Run `rlm-cli list` to verify |
| `chunk not found: <id>` | Invalid chunk ID | Re-run search to get valid IDs |
| `No results found` | Query too specific | Broaden query or lower threshold |
| `embedding error` | Model loading issue | Check disk space, retry once |
| `file not found` | Invalid path | Verify path exists before load |
# Robust error handling for AI assistants
RESULT=$(rlm-cli --format json search "$QUERY" 2>&1)
EXIT_CODE=$?
if [ $EXIT_CODE -ne 0 ]; then
  # Parse error
  ERROR=$(echo "$RESULT" | jq -r '.error // empty')
  case "$ERROR" in
    *"not initialized"*)
      rlm-cli init
      # Retry original command
      RESULT=$(rlm-cli --format json search "$QUERY")
      ;;
    *"buffer not found"*)
      echo "Buffer not found. Available buffers:"
      rlm-cli list
      ;;
    *"No results"*)
      echo "No results. Try broader query or: --threshold 0.1"
      ;;
    *)
      echo "Error: $ERROR"
      ;;
  esac
fi

For transient errors (embedding model loading, database locks):

MAX_RETRIES=3
RETRY_DELAY=1
for i in $(seq 1 $MAX_RETRIES); do
  RESULT=$(rlm-cli --format json chunk embed "$BUFFER" 2>&1)
  if [ $? -eq 0 ]; then
    break
  fi
  if [ $i -lt $MAX_RETRIES ]; then
    sleep $RETRY_DELAY
    RETRY_DELAY=$((RETRY_DELAY * 2)) # Exponential backoff
  fi
done

Before complex workflows, verify prerequisites:

# Check 1: rlm-cli is installed
if ! command -v rlm-cli &> /dev/null; then
  echo "rlm-cli not found. Install with: cargo install rlm-cli"
  exit 1
fi
# Check 2: Database is initialized
if ! rlm-cli status &> /dev/null; then
  rlm-cli init
fi
# Check 3: Content is loaded
BUFFER_COUNT=$(rlm-cli --format json status | jq '.buffer_count')
if [ "$BUFFER_COUNT" -eq 0 ]; then
  echo "No content loaded. Use: rlm-cli load <file>"
  exit 1
fi
# Check 4: Embeddings exist for semantic search
EMBED_COUNT=$(rlm-cli --format json chunk status | jq '.embedded_chunks')
if [ "$EMBED_COUNT" -eq 0 ]; then
  echo "No embeddings. Generating..."
  rlm-cli chunk embed --all
fi

When semantic search fails, fall back to BM25:

# Try semantic first
RESULT=$(rlm-cli --format json search "$QUERY" --mode semantic 2>&1)
if echo "$RESULT" | jq -e '.error' > /dev/null 2>&1; then
  # Fall back to BM25 (keyword search, no embeddings required)
  RESULT=$(rlm-cli --format json search "$QUERY" --mode bm25)
fi

When reporting errors to users, provide actionable guidance:

**Good**: "Buffer 'config' not found. Available buffers: main, auth. Did you mean one of these?"
**Bad**: "Error: buffer not found: config"
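
As a sketch, a small helper can upgrade the bare error into the actionable form; the `report_missing_buffer` name and the buffer list are illustrative:

```shell
# Illustrative: expand "buffer not found" into an actionable message
# that lists the buffers that do exist.
report_missing_buffer() {
  local missing="$1"; shift
  local available
  available=$(printf '%s, ' "$@")
  available=${available%, }   # trim trailing ", "
  echo "Buffer '$missing' not found. Available buffers: $available. Did you mean one of these?"
}

report_missing_buffer config main auth
# → Buffer 'config' not found. Available buffers: main, auth. Did you mean one of these?
```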

# Check installation
which rlm-cli
# Install if missing
cargo install rlm-cli
# or
brew install zircote/tap/rlm-rs
rlm-cli init

If search returns no results:

  1. Check if content is loaded: rlm-cli list
  2. Verify embeddings exist: rlm-cli chunk status
  3. Try broader query or lower threshold: --threshold 0.1

Ensure you’re using --format json:

rlm-cli --format json search "query" # Correct
rlm-cli search "query" --format json # Also correct

Ready-to-use system prompts for AI assistants integrating with rlm-rs are available in the prompts/ directory:

| Template | Purpose | Recommended Model |
| --- | --- | --- |
| `rlm-orchestrator.md` | Coordinates search, dispatch, and synthesis | sonnet |
| `rlm-analyst.md` | Analyzes individual chunks | haiku |
| `rlm-synthesizer.md` | Aggregates analyst findings | sonnet |
  1. Orchestrator receives user request and searches for relevant chunks
  2. Analysts (parallel) process individual chunks and return structured findings
  3. Synthesizer aggregates findings into a coherent report
User Request
     │
     ▼
┌──────────────┐
│ Orchestrator │──▶ rlm-cli search "query"
└──────────────┘
     │ dispatch
     ▼
┌─────────────────────────────────────┐
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │Analyst 1│ │Analyst 2│ │Analyst N│ │  (parallel)
│ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────┘
     │ collect
     ▼
┌──────────────┐
│ Synthesizer  │──▶ Final Report
└──────────────┘