# RLM (Recursive Language Model) Workflow

Orchestrate processing of documents that exceed context window limits using the `rlm-rs` CLI tool. This skill implements the RLM pattern from arXiv:2512.24601, enabling analysis of content up to 100x larger than typical context windows.
## Architecture Mapping
| RLM Concept | Implementation |
|---|---|
| Root LLM | Main Claude Code conversation (Opus/Sonnet) |
| Sub-LLM (`llm_query`) | `rlm-subcall` agent (Haiku) |
| External Environment | `rlm-rs` CLI with SQLite storage |
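The control flow behind this mapping can be sketched in a few lines of Python. This is an illustrative stub only: `sub_llm` stands in for the `rlm-subcall` agent and the in-memory `chunks` list stands in for the SQLite-backed store; neither is a real API.

```python
# Stub sketch of the RLM pattern: the root loop maps a query over chunks
# via a cheap sub-model, then synthesizes the per-chunk findings.
def sub_llm(query, chunk_text):
    # Placeholder for a real sub-call to a small model (Haiku).
    return {"relevant": query.split()[0] in chunk_text,
            "evidence": chunk_text[:40]}

def rlm_answer(query, chunks):
    findings = [sub_llm(query, c) for c in chunks]   # map step (parallel in practice)
    hits = [f["evidence"] for f in findings if f["relevant"]]
    return f"{len(hits)} relevant chunk(s): " + "; ".join(hits)  # synthesis step

chunks = ["timeout error in db pool", "startup banner", "timeout retry succeeded"]
print(rlm_answer("timeout errors", chunks))
# → 2 relevant chunk(s): timeout error in db pool; timeout retry succeeded
```

The real workflow below replaces the in-memory list with rlm-rs buffers and the stub with Task calls to subagents.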
## Prerequisites
Verify `rlm-rs` is installed and available:

```shell
command -v rlm-rs >/dev/null 2>&1 || echo "INSTALL REQUIRED: cargo install rlm-rs"
```

Installation options:

```shell
# Via Cargo (recommended)
cargo install rlm-rs

# Via Homebrew
brew install zircote/tap/rlm-rs
```

## Workflow Steps
### Step 1: Initialize Database

Create or verify the RLM database:

```shell
rlm-rs init
rlm-rs status
```

If already initialized, `status` shows the current buffers and state.
### Step 2: Load Context File

Load the large document into a buffer with appropriate chunking:

```shell
# Semantic chunking (recommended for structured content)
rlm-rs load <file_path> --name <buffer_name> --chunker semantic

# Fixed chunking (for unstructured text)
rlm-rs load <file_path> --name <buffer_name> --chunker fixed --chunk-size 6000

# With overlap for continuity
rlm-rs load <file_path> --name <buffer_name> --chunker fixed --chunk-size 6000 --overlap 1000
```

### Step 3: Scout the Content
Examine the beginning and end to understand the document's structure:

```shell
# View the first 3000 characters
rlm-rs peek <buffer_name> --start 0 --end 3000

# View the last 3000 characters
rlm-rs peek <buffer_name> --start -3000
```

Search for relevant sections:

```shell
rlm-rs grep <buffer_name> "<pattern>" --max-matches 20 --window 150
```

### Step 4: Search for Relevant Chunks
Use hybrid semantic + BM25 search to find chunks matching your query:

```shell
# Hybrid search (semantic + BM25 with rank fusion)
rlm-rs search "your query" --buffer <buffer_name> --top-k 100

# JSON output for programmatic use
rlm-rs --format json search "your query" --top-k 100
```

Output includes chunk IDs with relevance scores and document position (index):

```json
{
  "count": 2,
  "mode": "hybrid",
  "query": "your query",
  "results": [
    {"chunk_id": 42, "buffer_id": 1, "index": 5, "score": 0.0328, "semantic_score": 0.0499, "bm25_score": 1.6e-6},
    {"chunk_id": 17, "buffer_id": 1, "index": 2, "score": 0.0323, "semantic_score": 0.0457, "bm25_score": 1.2e-6}
  ]
}
```

- `index`: sequential position within the document (0-based); use it for temporal ordering
- `buffer_id`: the buffer/document the chunk belongs to
Extract chunk IDs sorted by document position:

```shell
jq -r '.results | sort_by(.index) | .[].chunk_id'
```
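If you prefer Python over jq, the same extraction can be done against the JSON shape shown above (the sample values here are illustrative):

```python
import json

# Search output in the shape documented above (abbreviated, illustrative values).
raw = """
{
  "count": 2,
  "results": [
    {"chunk_id": 42, "buffer_id": 1, "index": 5, "score": 0.0328},
    {"chunk_id": 17, "buffer_id": 1, "index": 2, "score": 0.0323}
  ]
}
"""

results = json.loads(raw)["results"]
# Sort by document position (index), not by relevance score, so the
# chunks are later processed in reading order.
ordered_ids = [r["chunk_id"] for r in sorted(results, key=lambda r: r["index"])]
print(ordered_ids)  # → [17, 42] — chunk 17 (index 2) precedes chunk 42 (index 5)
```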
### Step 5: Retrieve Chunks by ID

Get specific chunk content via pass-by-reference:

```shell
# Get chunk content
rlm-rs chunk get 42

# With metadata
rlm-rs --format json chunk get 42 --metadata
```

### Step 6: Subcall Loop (Batched, Parallel)
Only process chunks returned by search. Batch chunk IDs to reduce agent calls:

- Search returns chunk IDs with relevance scores and document indices
- Sort all chunk IDs by `index` (document position) to preserve temporal context
- Group sorted chunk IDs into batches (default 10, configurable via the `batch_size` argument)
- Invoke the `rlm-subcall` agent once per batch, using only the two required arguments
- Launch batches in parallel via multiple Task calls in one response
- The agent handles retrieval internally via `rlm-rs chunk get <id>` (no buffer ID needed)
- Collect structured JSON findings from all batches
**IMPORTANT:** Sort chunks by `index` before batching to preserve document flow. Each subagent should receive chunks in document order (e.g., 3,7,12,15,22, not 22,3,15,7,12). This ensures temporal context is maintained: definitions appear before usages, causes before effects.
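The sort-then-batch step can be sketched as follows. `make_batches` is a hypothetical helper (not part of rlm-rs), and `entries` mirrors the shape of the search results:

```python
# Hypothetical helper: order chunk IDs by document position, then split
# into fixed-size batches, one batch per rlm-subcall Task invocation.
def make_batches(entries, batch_size=10):
    ordered = sorted(entries, key=lambda e: e["index"])
    ids = [e["chunk_id"] for e in ordered]
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]

entries = [
    {"chunk_id": 22, "index": 9},
    {"chunk_id": 3, "index": 0},
    {"chunk_id": 15, "index": 7},
    {"chunk_id": 7, "index": 2},
    {"chunk_id": 12, "index": 4},
]
batches = make_batches(entries, batch_size=3)
print(batches)  # → [[3, 7, 12], [15, 22]]
# Each batch becomes one Task call, e.g. chunk_ids='3,7,12'.
```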
CORRECT Task invocation - pass ONLY the `query` and `chunk_ids` arguments (sorted by index):

```
Task subagent_type="rlm-rs:rlm-subcall" prompt="query='What errors occurred?' chunk_ids='3,7,12,15,22'"
Task subagent_type="rlm-rs:rlm-subcall" prompt="query='What errors occurred?' chunk_ids='28,31,45'"
```

CRITICAL - DO NOT:
- Write narrative prompts - the agent already knows what to do
- Include buffer ID or buffer NAME anywhere in the prompt
- Mention the buffer at all - chunk IDs are globally unique across all buffers
WRONG (causes exit code 2):

```
prompt="Analyze chunks from buffer 1..."             # NO - has buffer ID
prompt="Analyze chunks from buffer 'myfile.txt'..."  # NO - has buffer name
prompt="Use rlm-rs chunk 1 <id>..."                  # NO - buffer ID in command
prompt="Use rlm-rs chunk get <id> --buffer x..."     # NO - --buffer flag doesn't exist
```

RIGHT:

```
prompt="query='the user question' chunk_ids='5,105,2,3,74'"  # YES - just args!
```

### Step 7: Synthesis
Once all chunks are processed:

- Collect all JSON findings from the subcall agents
- Pass findings directly to the `rlm-synthesizer` agent (no intermediate files)
- Present the final synthesized response to the user
Example Task tool invocation:

```
Task agent=rlm-synthesizer query="What errors occurred?" findings='[...]' chunk_ids="42,17,23"
```

## Guardrails
- **Never paste large chunks into main context** - use peek/grep to extract only relevant excerpts
- **Keep subagent outputs compact** - request JSON format with short evidence fields
- **Orchestration stays in the main conversation** - subagents cannot spawn other subagents
- **State persists in SQLite** - all buffers survive across sessions via `.rlm/rlm-state.db`
- **No file I/O for chunk passing** - use pass-by-reference with chunk IDs
## Chunking Strategy Selection
| Content Type | Recommended Strategy |
|---|---|
| Markdown docs | `semantic` |
| Source code | `semantic` |
| JSON/XML | `semantic` |
| Plain logs | `fixed` with overlap |
| Unstructured text | `fixed` |
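As a rough illustration of what `fixed` chunking with overlap does, here is a minimal sketch; rlm-rs's internal splitter may handle boundaries and trailing chunks differently:

```python
# Illustrative fixed-size chunker with overlap: consecutive chunks share
# `overlap` characters so context spanning a boundary is not lost.
def fixed_chunks(text, chunk_size=6000, overlap=1000):
    assert 0 <= overlap < chunk_size, "overlap must be smaller than chunk_size"
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = fixed_chunks("x" * 15000, chunk_size=6000, overlap=1000)
print([len(c) for c in chunks])  # → [6000, 6000, 5000]
# The tail of each chunk repeats as the head of the next one.
```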
For detailed chunking guidance, refer to the `rlm-chunking` skill.
## CLI Command Reference
| Command | Purpose |
|---|---|
| `init` | Initialize database |
| `status` | Show state summary |
| `load` | Load file into buffer |
| `list` | List all buffers |
| `show` | Show buffer details |
| `peek` | View buffer content slice |
| `grep` | Search with regex |
| `search` | Hybrid semantic + BM25 search |
| `chunk get` | Retrieve chunk by ID |
| `chunk list` | List buffer chunks |
| `chunk embed` | Generate embeddings |
| `chunk status` | Show embedding status |
| `write-chunks` | Export chunks to files (legacy) |
| `add-buffer` | Store intermediate results |
| `export-buffers` | Export all buffers |
| `var` | Get/set context variables |
| `reset` | Clear all state |
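When scripting against the CLI, it is safer to build argv lists than to concatenate command strings. Below is a hypothetical wrapper (not part of rlm-rs) that only constructs commands from the flags documented above; it does not execute them:

```python
import shlex

# Hypothetical helper: build an rlm-rs argv list. Global flags such as
# --format must precede the subcommand, per the examples in this document.
def rlm_cmd(*args, json_output=False):
    cmd = ["rlm-rs"]
    if json_output:
        cmd += ["--format", "json"]
    cmd += list(args)
    return cmd

cmd = rlm_cmd("search", "database connection errors",
              "--buffer", "logs", "--top-k", "100", json_output=True)
print(shlex.join(cmd))  # quotes the multi-word query safely for a shell
```

The resulting list can be passed directly to `subprocess.run(cmd, capture_output=True)` without shell quoting pitfalls.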
## Example Session

```shell
# 1. Initialize
rlm-rs init

# 2. Load a large log file
rlm-rs load server.log --name logs --chunker fixed --chunk-size 6000 --overlap 500

# 3. Search for relevant chunks
rlm-rs --format json search "database connection errors" --buffer logs --top-k 100

# 4. For each relevant chunk ID, invoke the rlm-subcall agent
# 5. Collect JSON findings
# 6. Pass findings to the rlm-synthesizer agent
# 7. Present the final answer
```

## Additional Resources
### Reference Files

- `references/cli-reference.md` - Complete CLI documentation
### Related Components

- `rlm-subcall` agent - chunk-level analysis (Haiku)
- `rlm-synthesizer` agent - result aggregation (Sonnet)
- `rlm-chunking` skill - chunking strategy selection