Multi-File RLM

Status: Proposal
Date: 2026-02-11
Scope: Extension of the single-file RLM pattern (skills/rlm-pattern/SKILL.md and agents/rlm-*) to accept directories, with per-file content-type detection and mixed-type analyst routing in a single team session.
Depends on: Content-Aware RLM Design


The content-aware RLM pattern operates on a single file per session. Its content-type detection, type-specific chunking, and specialist analyst routing all assume one file in, one content type out.

Users needing directory-level analysis — “review this project directory”, “analyze all CSV exports”, “audit this mixed codebase” — must fall back to the generic swarm pattern (orchestration-patterns/SKILL.md Pattern 3). This loses all content-aware benefits:

  • No semantic chunking — files get split by line ranges regardless of content structure
  • No type-specific analysts — all files get the same generic analyzer
  • No cross-file synthesis — findings from Python, JSON config, and CSV data get no type-aware aggregation
  • No header preservation — CSV files in a directory lose headers when chunked
  • No import injection — code files lose dependency context

This design extends the RLM pattern to accept a directory path, enumerate and classify files by content type, partition each file using the appropriate strategy, route chunks to mixed analyst types, and synthesize findings in two phases — first per content type, then across types.


┌───────────────┐    ┌───────────────┐    ┌─────────────────┐    ┌──────────────────┐
│ Directory     │───▶│ Enumerate     │───▶│ Detect Type     │───▶│ Group by Type    │
│ Input         │    │ & Filter      │    │ Per File        │    │                  │
└───────────────┘    └───────────────┘    └─────────────────┘    └────────┬─────────┘
                                                                          │
                                                                          ▼
┌───────────────┐    ┌───────────────┐    ┌─────────────────┐    ┌──────────────────┐
│ Cross-Type    │◀───│ Per-Type      │◀───│ Mixed Analyst   │◀───│ Partition        │
│ Synthesis     │    │ Synthesis     │    │ Routing         │    │ Per Group        │
│ (Phase 2)     │    │ (Phase 1)     │    │                 │    │                  │
└───────────────┘    └───────────────┘    └─────────────────┘    └──────────────────┘

| Aspect | Single-File RLM | Multi-File RLM |
| --- | --- | --- |
| Input | One file path | Directory path + glob filters |
| Content detection | One type per session | Per-file detection, multiple types |
| Analyst types | One type per session | Mixed types (up to 4 different) |
| Partitioning | One strategy | Per-type strategy for each file |
| Synthesis | Single phase (merge all) | Two phases: per-type then cross-type |
| Context management | Findings via SendMessage | Findings in task descriptions via TaskUpdate |
| Analyst scaling | No explicit cap | 1 analyst per task, fresh context each |

| Parameter | Default | Description |
| --- | --- | --- |
| directory | (required) | Absolute path to the target directory |
| include | * | Glob patterns for files to include (e.g., *.py, *.csv) |
| exclude | See default exclusions | Glob patterns for files to exclude |
| recursive | true | Whether to descend into subdirectories |
| max_files | 20 | Maximum number of files to process (safety cap) |

Always excluded unless explicitly included:

# Version control
.git/
# Dependencies
node_modules/
vendor/
.venv/
__pycache__/
.tox/
.eggs/
# Build artifacts
dist/
build/
target/
out/
.next/
# IDE/editor
.idea/
.vscode/
*.swp
*.swo
*~
# Binary and media
*.png, *.jpg, *.jpeg, *.gif, *.ico, *.svg (binary)
*.pdf, *.doc, *.docx
*.zip, *.tar, *.gz, *.bz2
*.exe, *.dll, *.so, *.dylib
*.wasm, *.pyc, *.class
# Lock files
package-lock.json
yarn.lock
Gemfile.lock
poetry.lock
Cargo.lock
pnpm-lock.yaml
composer.lock
# Generated
*.min.js
*.min.css
*.map
*.d.ts (unless explicitly included)

1. List files in directory (recursive if enabled)
2. Apply include globs (whitelist)
3. Apply exclude globs (blacklist: defaults + user overrides)
4. Filter out binary files (check first 512 bytes for null bytes)
5. Sort by file size descending (largest first, since they drive the partition budget)
6. If count > max_files:
   a. Log warning: "Found {count} files, processing first {max_files}"
   b. Truncate to max_files
7. Return file manifest: [{path, size_bytes, line_count}]

The Team Lead executes enumeration inline using Glob, Read, and Bash tools. No separate agent needed — this is O(N) where N ≤ 20.
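
The enumeration steps can be sketched in Python as follows. This is only an illustration under stated assumptions: in the design the Team Lead performs these steps with the Glob, Read, and Bash tools, not with a script. The function name `enumerate_files`, the abbreviated exclusion constants, and the helper names are hypothetical, and the sketch does not implement the "unless explicitly included" override for default exclusions.

```python
import fnmatch
import os

# Abbreviated, illustrative subset of the default exclusions (assumption: the
# full list from this document would be used in practice).
DEFAULT_EXCLUDE_DIRS = {".git", "node_modules", "vendor", ".venv", "__pycache__", "dist", "build"}
DEFAULT_EXCLUDE_GLOBS = ["*.png", "*.zip", "*.pyc", "*.min.js", "*.map", "package-lock.json"]


def is_binary(path, probe_bytes=512):
    """Step 4: a null byte in the first 512 bytes marks the file as binary."""
    with open(path, "rb") as fh:
        return b"\x00" in fh.read(probe_bytes)


def count_lines(path):
    with open(path, "rb") as fh:
        return sum(1 for _ in fh)


def enumerate_files(directory, include=("*",), exclude=(), recursive=True, max_files=20):
    exclude_globs = list(DEFAULT_EXCLUDE_GLOBS) + list(exclude)
    candidates = []
    for root, dirs, names in os.walk(directory):
        # Prune in place so os.walk never descends into excluded directories;
        # an empty list stops descent entirely when recursive is False.
        dirs[:] = [d for d in dirs if d not in DEFAULT_EXCLUDE_DIRS] if recursive else []
        for name in names:
            if not any(fnmatch.fnmatch(name, pat) for pat in include):
                continue  # step 2: include globs
            if any(fnmatch.fnmatch(name, pat) for pat in exclude_globs):
                continue  # step 3: exclude globs
            path = os.path.join(root, name)
            if is_binary(path):
                continue  # step 4: binary filter
            candidates.append(path)

    candidates.sort(key=os.path.getsize, reverse=True)  # step 5: largest first
    if len(candidates) > max_files:  # step 6: safety cap
        print(f"Found {len(candidates)} files, processing first {max_files}")
        candidates = candidates[:max_files]
    return [  # step 7: manifest
        {"path": p, "size_bytes": os.path.getsize(p), "line_count": count_lines(p)}
        for p in candidates
    ]
```

Because the list is sorted by size before truncation, the files dropped by the `max_files` cap are always the smallest ones.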


Files are classified into size tiers, and each tier gets a different partition count:

| Tier | Line Count | Partitions | Rationale |
| --- | --- | --- | --- |
| Small | ≤ 1500 lines | 0 (batched; see Section 5) | Fits in analyst context whole |
| Medium | 1501–5000 lines | Use content-type chunk size targets (roughly 8–25 partitions for code at 200 lines per chunk, 2–3 for narrow CSV) | Needs moderate splitting |
| Large | > 5000 lines | Use content-type chunk size targets; scales with file size | Needs aggressive splitting |

Partition count is data-driven: divide each file’s size by its content-type chunk target (e.g., 200-line chunks for code, ~2000-row chunks for narrow CSV). Let partition count scale naturally with data size rather than enforcing a fixed cap.

| Directory Profile | Example | Expected Partitions |
| --- | --- | --- |
| Small (3-5 files, all small) | Config directory | 3-5 tasks (one per file, no splitting) |
| Medium (5-10 files, mixed sizes) | Feature module | 10-20 tasks |
| Large (10-20 files, several large) | Full service | 25-50 tasks (scales with data) |

from math import ceil

def allocate_budget(files):
    # count_lines() and detect_type() are helpers assumed to exist elsewhere;
    # they are not defined in this section.
    # Content-type chunk size targets (from Partitioning Strategies)
    chunk_targets = {
        "source_code": 200,       # lines (150-300 range)
        "structured_data": 2000,  # rows for narrow data, 500 for wide
        "json": 350,              # elements (200-500 range)
        "jsonl": 750,             # lines (500-1000 range)
        "log": 2500,              # lines (200-5000 range)
        "prose": 250,             # lines
        "config": 200,            # lines
    }
    manifest = []
    for f in files:
        lines = count_lines(f)
        content_type = detect_type(f)
        if lines <= 1500:
            # Small tier: no partitions; the file is batched with others of its type.
            manifest.append({"file": f, "lines": lines, "type": content_type, "partitions": 0, "tier": "small"})
        else:
            chunk_size = chunk_targets.get(content_type, 200)
            partitions = max(2, ceil(lines / chunk_size))
            tier = "medium" if lines <= 5000 else "large"
            manifest.append({"file": f, "lines": lines, "type": content_type, "partitions": partitions, "tier": tier})
    # No hard cap: partition count is data-driven.
    # If total is very large (50+), consider excluding low-priority files
    # or batching more aggressively rather than artificially shrinking partitions.
    return manifest
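
The budget function above calls `detect_type` (and `count_lines`) without defining them; the actual detection logic belongs to the Content-Aware RLM Design. A minimal sketch of an extension-based `detect_type` might look like this. The mapping tables and the `"unknown"` fallback are assumptions, chosen only so that the result agrees with the content types shown in this document's worked examples.

```python
import os

# Hypothetical extension mapping; not the authoritative table from the
# Content-Aware RLM Design.
EXTENSION_TYPES = {
    ".py": "source_code",
    ".sh": "source_code",
    ".csv": "structured_data",
    ".json": "json",
    ".jsonl": "jsonl",
    ".log": "log",
    ".md": "prose",
}

# Well-known filenames that the extension alone would misclassify (assumption).
FILENAME_TYPES = {
    "Makefile": "config",
    "requirements.txt": "config",
}


def detect_type(path, default="unknown"):
    """Classify a file by exact filename first, then by extension."""
    name = os.path.basename(path)
    if name in FILENAME_TYPES:
        return FILENAME_TYPES[name]
    ext = os.path.splitext(name)[1].lower()
    return EXTENSION_TYPES.get(ext, default)
```

With this fallback, an unrecognized file still gets a partition size, since `chunk_targets.get(content_type, 200)` defaults to 200 lines.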

A directory of 15 files where 10 are small (≤ 1500 lines) would spawn 10 individual analyst tasks if each got its own task. This wastes agent turns on trivial files.

Group small files of the same content type into batches of ≤ 1500 combined lines. Each batch becomes one analyst task.

1. Collect all small files (≤ 1500 lines) from the manifest
2. Group by content_type
3. For each content_type group:
   a. Sort files by line count (smallest first)
   b. Create batches:
      - Start a new batch (running_total = 0)
      - Add files to batch until running_total would exceed 1500
      - When batch is full, start a new one
   c. Each batch → one analyst task
4. A lone small file (only one of its type) → one analyst task (whole-file, no chunking)
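
The batching steps above could be sketched like this. It is a minimal illustration, assuming manifest entries shaped like those returned by `allocate_budget` (`file`, `lines`, `type`); the function name `batch_small_files` and the batch dictionary layout are hypothetical.

```python
from collections import defaultdict

SMALL_FILE_LIMIT = 1500  # lines; same threshold as the small tier


def batch_small_files(manifest, limit=SMALL_FILE_LIMIT):
    # Steps 1-2: collect small files and group them by content type.
    by_type = defaultdict(list)
    for entry in manifest:
        if entry["lines"] <= limit:
            by_type[entry["type"]].append(entry)

    batches = []
    for content_type, entries in by_type.items():
        # Step 3a: smallest first.
        entries.sort(key=lambda e: e["lines"])
        current, running_total = [], 0
        for entry in entries:
            # Step 3b: close the batch when the next file would exceed the limit.
            if current and running_total + entry["lines"] > limit:
                batches.append({"type": content_type, "files": current, "lines": running_total})
                current, running_total = [], 0
            current.append(entry)
            running_total += entry["lines"]
        if current:
            # Step 3c / step 4: the remainder (possibly a lone file) is one task.
            batches.append({"type": content_type, "files": current, "lines": running_total})
    return batches
```

Sorting smallest-first is a greedy choice: it packs many tiny files together but does not guarantee the minimum number of batches.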

The task description lists all files in the batch with clear boundary markers:

Mode: multi-file
Query: Review for code quality and security issues
Batch: 3 Python files (combined: 1,230 lines)
--- FILE 1: /project/src/utils.py (280 lines) ---
Read with: Read({ file_path: "/project/src/utils.py" })
--- FILE 2: /project/src/config.py (450 lines) ---
Read with: Read({ file_path: "/project/src/config.py" })
--- FILE 3: /project/src/helpers.py (500 lines) ---
Read with: Read({ file_path: "/project/src/helpers.py" })
Analyze each file separately. Report findings with the file path for each finding.

Analysts read each file via the Read tool (pass-by-reference). The task description lists file paths — it never inlines file content. The Team Lead never analyzes files inline either — all analysis goes through analyst agents.
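
As a rough illustration of the pass-by-reference format, a batch task description like the one above could be assembled as follows. The helper name `build_batch_description` and its parameters are hypothetical; only the output layout is taken from this document.

```python
def build_batch_description(query, type_label, files):
    """Build a multi-file batch task description.

    files is a list of (absolute_path, line_count) tuples. Only paths are
    written into the description; file content is never inlined.
    """
    total = sum(lines for _, lines in files)
    out = [
        "Mode: multi-file",
        f"Query: {query}",
        f"Batch: {len(files)} {type_label} files (combined: {total:,} lines)",
    ]
    for i, (path, lines) in enumerate(files, start=1):
        out.append(f"--- FILE {i}: {path} ({lines} lines) ---")
        # Doubled braces emit literal { } in an f-string.
        out.append(f'Read with: Read({{ file_path: "{path}" }})')
    out.append("Analyze each file separately. Report findings with the file path for each finding.")
    return "\n".join(out)


description = build_batch_description(
    "Review for code quality and security issues",
    "Python",
    [
        ("/project/src/utils.py", 280),
        ("/project/src/config.py", 450),
        ("/project/src/helpers.py", 500),
    ],
)
```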


After enumeration and type detection, the Team Lead has a manifest of files grouped by content type. Unlike single-file RLM (which uses one analyst type per session), multi-file RLM spawns different analyst types simultaneously.

1. From the manifest, count tasks per content type:
   - source_code tasks: partitions from code files + batches of small code files
   - structured_data tasks: partitions from CSV files + batches of small CSV files
   - json/jsonl tasks: partitions from JSON files + batches of small JSON files
   - general tasks: partitions from log/prose/config files + batches
2. Only spawn analyst types that have tasks
3. 1 analyst per task, fresh context each: analyst counts per type follow task counts; use staged spawning for large workloads

Each analyst handles exactly one task (a chunk or a small-file batch) in a fresh context, so the number of analysts per content type equals that type's task count, and every type with tasks gets at least one analyst. For large workloads, spawn analysts in stages of about 15.

def plan_analysts(tasks_by_type):
    # 1:1 ratio: one analyst per task, fresh context each
    analysts = {}
    for content_type, task_count in tasks_by_type.items():
        if task_count == 0:
            continue
        analysts[content_type] = task_count
    return analysts

A directory with 5 Python files (12 tasks), 3 CSV files (6 tasks), 1 JSON config (1 task):

  • Code analysts: 12
  • Data analysts: 6
  • JSON analysts: 1
  • Total: 19 tasks = 19 analysts (1:1), spawned in two stages of up to 15: stage 1 = 15 analysts, stage 2 = 4 analysts

Each analyst gets a descriptive name incorporating its type:

code-analyst-1, code-analyst-2, code-analyst-3
data-analyst-1, data-analyst-2
json-analyst-1
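
Naming and staged spawning can be combined in one small helper. The sketch below is an assumption about how a Team Lead might plan stages; `plan_stages`, the `name_prefix` mapping, and the default stage size of 15 are illustrative names and values, with the prefixes chosen to match the analyst names shown above.

```python
def plan_stages(analysts_by_type, name_prefix, stage_size=15):
    """Return a list of stages, each a list of analyst names to spawn together."""
    names = [
        f"{name_prefix[content_type]}-analyst-{i}"
        for content_type, count in analysts_by_type.items()
        for i in range(1, count + 1)
    ]
    # Slice the flat name list into consecutive stages of at most stage_size.
    return [names[i:i + stage_size] for i in range(0, len(names), stage_size)]


stages = plan_stages(
    {"source_code": 12, "structured_data": 6, "json": 1},
    {"source_code": "code", "structured_data": "data", "json": "json"},
)
print([len(stage) for stage in stages])  # [15, 4]
```

For the 19-task directory above, stage 1 holds the 12 code analysts plus the first 3 data analysts, and stage 2 holds the remaining 3 data analysts and the JSON analyst.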

With 25+ analyst reports spanning multiple content types, the synthesizer faces two challenges:

  1. Volume — too many reports for a single synthesis pass
  2. Heterogeneity — code findings (severity-based), data findings (distribution-based), and JSON findings (schema-based) use different vocabularies and need different aggregation logic

Two-phase synthesis addresses both.

One synthesis task per content type, running in parallel. Each reads analyst findings from completed tasks via TaskGet and produces a type-level summary.

Task IDs for per-type synthesis:
- "synth-code": reads all code analyst task findings → Python summary
- "synth-data": reads all data analyst task findings → CSV summary
- "synth-json": reads all JSON analyst task findings → JSON summary

Synthesizer prompt (per-type):

Mode: per-type synthesis
Content type: source_code
Analyst task IDs: [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
Read findings from each analyst task using TaskGet (findings are in the task description).
Aggregate all source_code findings into a type-level summary:
- Merge duplicate findings across files
- Rank by severity
- Note which files each finding appears in
- Produce a structured summary
Write your synthesis to this task's description via TaskUpdate.
Send a one-line completion notice to team-lead via SendMessage.

Phase 2: Cross-Type Synthesis (Sequential)

One synthesis task that reads all Phase 1 summaries and produces the final report. Blocked until all Phase 1 tasks complete.

Synthesizer prompt (cross-type):

Mode: cross-type synthesis
Per-type synthesis task IDs: [synth-code-id, synth-data-id, synth-json-id]
Original query: <user's query>
Read per-type summaries from each synthesis task using TaskGet.
Produce the final report with these sections:
## Per-File Findings
Brief findings organized by file, noting content type.
## Cross-File Analysis
Patterns that span multiple files or content types:
- Config values referenced in code
- Data schemas matching JSON structures
- Shared naming conventions or inconsistencies
- Dependencies between files
## Recommendations
Actionable items informed by cross-file context.
Write your final report to this task's description via TaskUpdate.
Send the final report to team-lead via SendMessage.

// Phase 1 tasks (parallel, no dependencies between them)
TaskCreate({ subject: "Synthesize Python findings", description: "Mode: per-type synthesis\n..." }) // → task 31
TaskCreate({ subject: "Synthesize CSV findings", description: "Mode: per-type synthesis\n..." }) // → task 32
TaskCreate({ subject: "Synthesize JSON findings", description: "Mode: per-type synthesis\n..." }) // → task 33
// Phase 2 task (blocked by all Phase 1)
TaskCreate({ subject: "Cross-type synthesis", description: "Mode: cross-type synthesis\n..." }) // → task 34
TaskUpdate({ taskId: "34", addBlockedBy: ["31", "32", "33"] })

The dependency graph is acyclic: analyst tasks → Phase 1 synthesis tasks → Phase 2 cross-type task.
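
One way to sanity-check that ordering before creating tasks is to model the dependencies and run a topological sort. The sketch below uses Python's standard `graphlib`. The task IDs are hypothetical, and it assumes each Phase 1 task is treated as depending on its analyst tasks, whether that is enforced through blockedBy links or through the Team Lead's sequencing.

```python
from graphlib import TopologicalSorter


def build_dependency_graph(analyst_ids_by_type, phase1_ids, phase2_id):
    """Map each task ID to the set of task IDs it must wait for."""
    graph = {}
    for content_type, synth_id in phase1_ids.items():
        graph[synth_id] = set(analyst_ids_by_type[content_type])
    graph[phase2_id] = set(phase1_ids.values())
    return graph


graph = build_dependency_graph(
    {"source_code": ["3", "4"], "structured_data": ["15", "16"], "json": ["21"]},
    {"source_code": "31", "structured_data": "32", "json": "33"},
    "34",
)

# static_order() raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(graph).static_order())
print(order[-1])  # 34
```

The cross-type task always sorts last because every other task is one of its direct or indirect prerequisites.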

Both phases use the existing swarm:rlm-synthesizer agent. The only difference is the prompt:

  • Per-type prompt: reads analyst findings, produces type-level summary
  • Cross-type prompt: reads per-type summaries, produces final report

The synthesizer reads findings via TaskGet in both cases.


In single-file RLM, analysts send full JSON findings to the Team Lead via SendMessage. With 25+ analysts in multi-file mode, this would flood the Team Lead’s inbox with 25+ messages of 2-4K characters each — 50-100K characters of raw findings in context.

Solution: Findings-in-Task-Descriptions Pattern

Critical architectural change from single-file RLM:

| Aspect | Single-File RLM | Multi-File RLM |
| --- | --- | --- |
| Analyst findings delivery | Full JSON via SendMessage to team-lead | JSON written to task description via TaskUpdate |
| Team Lead inbox | Receives all findings | Receives only one-line summaries |
| Synthesizer reads findings | From team-lead’s forwarded message | From TaskGet on analyst task IDs |
| Team Lead context pressure | Moderate (5-10 messages) | Low (only summaries + orchestration) |

  1. Analyst completes analysis → writes full JSON findings to its own task description:

    TaskUpdate({
    taskId: "7",
    status: "completed",
    description: "...original description...\n\n--- FINDINGS ---\n{\"findings\": [...], \"metadata\": {...}}"
    })
  2. Analyst notifies team-lead with a one-line summary only:

    SendMessage({
    type: "message",
    recipient: "team-lead",
    content: "Chunk 3/10 complete: 4 findings (2 high, 1 medium, 1 low)",
    summary: "Chunk 3/10 — 4 findings"
    })
  3. Synthesizer reads findings from task records, not from inbox:

    // Synthesizer reads each analyst task's description
    TaskGet({ taskId: "7" }) // → contains full findings JSON
  1. /compact between phases: Run /compact after all analysts complete and before spawning synthesizers. This clears analyst notification messages from context.

  2. 1 analyst per task, fresh context each: Use staged spawning (batches of ~15) and findings-in-task-descriptions mode to manage context pressure — never reduce analyst count.

  3. Descriptive task IDs in synthesis prompts: The synthesizer prompt lists exactly which task IDs to read, rather than asking it to “find all completed tasks.”

  4. Message flow comparison:

    Single-File RLM (10 chunks):
    ┌──────────┐   SendMessage (full findings)       ┌───────────┐
    │ Analyst  │ ──────────────────────────────────▶ │ Team Lead │
    │ (×10)    │   ~3K chars each = ~30K total       │           │
    └──────────┘                                     └───────────┘

    Multi-File RLM (30 chunks):
    ┌──────────┐   TaskUpdate (findings to task)     ┌───────────┐
    │ Analyst  │ ──────────────────────────────────▶ │ Task List │
    │ (×30)    │                                     └───────────┘
    │          │   SendMessage (one-line summary)    ┌───────────┐
    │          │ ──────────────────────────────────▶ │ Team Lead │
    │          │   ~50 chars each = ~1.5K total      │           │
    └──────────┘                                     └───────────┘

    ┌─────────────┐   TaskGet (reads findings)       ┌───────────┐
    │ Synthesizer │ ◀─────────────────────────────── │ Task List │
    │ (Phase 1+2) │                                  └───────────┘
    └─────────────┘
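
The character counts in the message flow comparison are simple products, and it can help to see them next to the case this design avoids. In the sketch below the per-message sizes are the approximate figures quoted in the comparison; the helper name `inbox_chars` is hypothetical.

```python
def inbox_chars(message_count, chars_per_message):
    """Approximate characters landing in the Team Lead's inbox."""
    return message_count * chars_per_message


single_file = inbox_chars(10, 3_000)  # full findings from 10 analysts
multi_file = inbox_chars(30, 50)      # one-line summaries from 30 analysts
naive_multi = inbox_chars(30, 3_000)  # 30 analysts sending full findings

print(single_file, multi_file, naive_multi)  # 30000 1500 90000
```

Under these assumptions, keeping SendMessage delivery at 30 analysts would put about 90K characters in the inbox, while one-line summaries keep it near 1.5K.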

Add a multi-file mode flag to the Team Workflow section. When a task description contains Mode: multi-file, the analyst changes its reporting behavior:

Standard mode (single-file):

  • Write findings to SendMessage → team-lead
  • Full JSON findings in message content

Multi-file mode:

  • Write findings to task description via TaskUpdate
  • Send only a one-line summary to team-lead via SendMessage
  • Same analysis workflow otherwise (TaskList → claim → read → analyze → report → repeat)

This is a prompt-level change — no new tools, no new output schema, no behavioral change to the analysis itself.

Add support for TaskGet-based retrieval. The synthesizer already receives findings via its prompt in single-file mode. In multi-file mode, findings are stored in task descriptions:

Single-file mode (existing):

  • Prompt contains all findings inline
  • Synthesizer processes them directly

Multi-file mode (new):

  • Prompt contains task IDs to read
  • Synthesizer calls TaskGet for each ID to retrieve findings
  • Two prompt variants: per-type synthesis and cross-type synthesis

The synthesizer needs TaskGet in its tools list (currently has only Read and SendMessage). Add TaskGet and TaskUpdate to its tool list.
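
To make the retrieval step concrete, here is a minimal sketch of how findings could be pulled back out of task descriptions. It assumes analysts append a `--- FINDINGS ---` marker followed by a JSON payload, as in the TaskUpdate example in this document. The `task_get` parameter is a stand-in for the TaskGet tool (modeled as a function from task ID to description text), and both function names are hypothetical.

```python
import json

FINDINGS_MARKER = "--- FINDINGS ---"


def extract_findings(description):
    """Return the parsed findings payload, or None if the marker is absent."""
    _, marker, payload = description.partition(FINDINGS_MARKER)
    if not marker:
        return None
    return json.loads(payload)


def collect_findings(task_ids, task_get):
    """Gather findings for the listed task IDs and report tasks with none."""
    collected, missing = [], []
    for task_id in task_ids:
        result = extract_findings(task_get(task_id))
        if result is None:
            missing.append(task_id)  # analyst has not written findings yet
        else:
            collected.extend(result.get("findings", []))
    return collected, missing
```

Reporting the `missing` IDs separately lets a synthesizer flag incomplete analyst tasks rather than silently summarizing partial results.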


Input: /project/src/ — 9 files, mixed types:

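| File | Lines | Content Type | Tier |
| --- | --- | --- | --- |
| data_pipeline.py | 2800 | source_code | Medium |
| api_server.py | 1900 | source_code | Medium |
| models.py | 3200 | source_code | Medium |
| utils.py | 400 | source_code | Small |
| config.json | 250 | json | Small |
| schema.json | 180 | json | Small |
| README.md | 300 | prose | Small |
| requirements.txt | 50 | config | Small |
| Makefile | 120 | config | Small |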

Query: “Review this project for code quality, security issues, and architectural concerns.”

Step 1 — Enumerate & Detect: Team Lead uses Glob to list files, applies default exclusions, detects content types via extension mapping.

Step 2 — Partition Budget:

| File | Tier | Partitions |
| --- | --- | --- |
| data_pipeline.py | Medium | 14 |
| api_server.py | Medium | 10 |
| models.py | Medium | 16 |
| utils.py | Small | 0 (batched) |
| config.json | Small | 0 (batched) |
| schema.json | Small | 0 (batched) |
| README.md | Small | 0 (batched) |
| requirements.txt | Small | 0 (batched) |
| Makefile | Small | 0 (batched) |
| Total partitioned | | 40 |

The code chunker uses function/class boundaries (150-300 lines), so actual partition count may differ based on code structure. The numbers above use the ~200-line chunk target as an approximation.
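
The partition counts in this table follow directly from the budget formula, as this short check shows. It applies only the ~200-line approximation; as noted above, the real code chunker splits on function and class boundaries, so actual counts may differ.

```python
from math import ceil

CODE_CHUNK_TARGET = 200  # lines, the approximate target for source_code

medium_files = {"data_pipeline.py": 2800, "api_server.py": 1900, "models.py": 3200}

partitions = {
    name: max(2, ceil(lines / CODE_CHUNK_TARGET))
    for name, lines in medium_files.items()
}
total = sum(partitions.values())

print(partitions)  # {'data_pipeline.py': 14, 'api_server.py': 10, 'models.py': 16}
print(total)       # 40
```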

Step 3 — Small File Batching:

| Batch | Content Type | Files | Combined Lines |
| --- | --- | --- | --- |
| Batch A | source_code | utils.py | 400 (lone file) |
| Batch B | json | config.json, schema.json | 430 |
| Batch C | config | requirements.txt, Makefile | 170 |
| Batch D | prose | README.md | 300 (lone file) |

Total analyst tasks: 40 (partitioned) + 4 (batched) = 44

However, because README.md (prose) and requirements.txt/Makefile (config) are trivial, the Team Lead can exclude them or batch them together as a single general-type task. The figures below assume they are excluded, which drops Batches C and D. Adjusted: 42 analyst tasks.

Step 4 — Analyst Mix:

| Content Type | Tasks | Analysts |
| --- | --- | --- |
| source_code | 41 | 41 |
| json | 1 | 1 |
| Total | 42 | 42 |

Staged spawning: 3 stages (15 + 15 + 12 analysts).

Step 5 — Synthesis:

| Phase | Task | Reads From | Produces |
| --- | --- | --- | --- |
| Phase 1 | Synthesize code findings | 41 code analyst tasks | Code summary |
| Phase 1 | Synthesize JSON findings | 1 JSON analyst task | JSON summary |
| Phase 2 | Cross-type synthesis | 2 Phase 1 summaries | Final report |

Total tasks: 42 analyst + 2 Phase 1 synthesis + 1 Phase 2 synthesis = 45

Total agents: 42 analysts (1:1, staged in batches of 15) + 1 synthesizer (reused across phases) = 43


Input: /data/pipeline/ — 8 files, data-heavy:

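| File | Lines | Content Type | Tier |
| --- | --- | --- | --- |
| transactions.csv | 20000 | structured_data | Large |
| customers.csv | 10000 | structured_data | Large |
| events.jsonl | 5000 | jsonl | Medium |
| etl_transform.py | 2500 | source_code | Medium |
| etl_load.sh | 800 | source_code | Small |
| pipeline_config.json | 350 | json | Small |
| etl.log | 8000 | log | Large |
| README.md | 200 | prose | Small |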

Query: “Analyze data quality, identify transformation issues, and check for pipeline errors.”

Step 1 — Partition Budget:

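| File | Tier | Partitions | Chunk Target |
| --- | --- | --- | --- |
| transactions.csv | Large | 10 | ~2000 rows |
| customers.csv | Large | 5 | ~2000 rows |
| events.jsonl | Medium | 7 | ~750 lines |
| etl.log | Large | 4 | ~2500 lines |
| etl_transform.py | Medium | 13 | ~200 lines |
| etl_load.sh | Small | 0 (batched) | |
| pipeline_config.json | Small | 0 (batched) | |
| README.md | Small | 0 (batched) | |
| Total | | 39 | |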

Step 2 — Small File Batching:

| Batch | Content Type | Files | Combined Lines |
| --- | --- | --- | --- |
| Batch A | source_code | etl_load.sh | 800 |
| Batch B | json | pipeline_config.json | 350 |
| Batch C | prose | README.md | 200 |

Total analyst tasks: 39 (partitioned) + 3 (batched) = 42

Step 3 — Analyst Mix:

| Content Type | Tasks | Analysts |
| --- | --- | --- |
| structured_data | 15 | 15 |
| jsonl | 7 | 7 |
| json | 1 | 1 |
| source_code | 14 | 14 |
| general (log+prose) | 5 | 5 |
| Total | 42 | 42 |

Staged spawning: 3 stages (15 + 15 + 12 analysts).
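
The task total and stage sizes for this walkthrough can be reproduced with a few lines of arithmetic. The line counts and chunk targets are taken from the tables above; the variable names are illustrative.

```python
from math import ceil

# (file, lines, chunk target) for the files large enough to be partitioned
partitioned = [
    ("transactions.csv", 20000, 2000),
    ("customers.csv", 10000, 2000),
    ("events.jsonl", 5000, 750),
    ("etl.log", 8000, 2500),
    ("etl_transform.py", 2500, 200),
]
SMALL_FILE_BATCHES = 3  # Batches A, B, and C
STAGE_SIZE = 15

counts = {name: max(2, ceil(lines / target)) for name, lines, target in partitioned}
analyst_tasks = sum(counts.values()) + SMALL_FILE_BATCHES

# Size of each spawning stage: full stages of 15, then the remainder.
stages = [
    min(STAGE_SIZE, analyst_tasks - start)
    for start in range(0, analyst_tasks, STAGE_SIZE)
]

print(list(counts.values()))  # [10, 5, 7, 4, 13]
print(analyst_tasks, stages)  # 42 [15, 15, 12]
```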

Step 4 — Synthesis:

| Phase | Task | Reads From |
| --- | --- | --- |
| Phase 1 | Synthesize CSV findings | 15 data analyst tasks |
| Phase 1 | Synthesize JSON findings | 8 JSON analyst tasks |
| Phase 1 | Synthesize code findings | 14 code analyst tasks |
| Phase 1 | Synthesize log findings | 5 general analyst tasks |
| Phase 2 | Cross-type synthesis | 4 Phase 1 summaries |

Total tasks: 42 analyst + 4 Phase 1 synthesis + 1 Phase 2 synthesis = 47

Total agents: 42 analysts (1:1, staged in batches of 15) + 1 synthesizer (reused across phases) = 43

The synthesizer is spawned once for Phase 1 (claiming Phase 1 tasks from TaskList) and once for Phase 2 (after Phase 1 completes). Alternatively, one synthesizer handles all phases sequentially.


D1: Hybrid Orchestration (Flat + Two-Phase Synthesis)

Chosen over: Hierarchical model (Type Coordinators per content type) and pure-flat model (all findings to Team Lead).

Why hierarchical was eliminated:

  • Platform constraint: “No nested teams: Teammates cannot spawn their own teams or teammates” (skills/error-handling/SKILL.md:112)
  • Platform constraint: “One team per session” (skills/error-handling/SKILL.md:110)
  • Type Coordinators would need to be mid-level orchestrators with their own sub-teams — not possible

Why pure-flat was rejected:

  • Team Lead receiving 25+ raw analyst reports causes severe context pressure
  • Shallow synthesis when merging heterogeneous finding types in one pass
  • No opportunity for type-specific aggregation (summing CSV distributions, merging code scopes)

Hybrid approach: Team Lead handles all orchestration (flat). Synthesis uses task dependencies for two phases (per-type then cross-type), giving depth without hierarchy.

D2: Data-Driven Partition Budget (Tiered, No Hard Cap)

Chosen over: Uniform partition count per file and hard-capped partitions.

Why uniform was rejected:

  • A 50-line config file doesn’t need 5 partitions
  • An 80,000-line CSV needs more than 5 partitions
  • Tier-based allocation matches resource to need

Why hard caps were rejected:

  • Data-driven sizing naturally produces the right partition count per file
  • Very large totals can be managed by excluding low-priority files or adjusting chunk targets
  • A fixed cap forces artificial scaling that distorts the proportional allocation between files

D3: Batch Small Files Instead of Inline Analysis

Chosen over: Team Lead analyzing small files inline and skipping small files.

Why inline analysis was rejected:

  • Violates the principle that the Team Lead orchestrates but never analyzes
  • Puts file content in the Team Lead’s context, adding pressure
  • Inconsistent protocol (some findings from analysts, some from Team Lead)

Why skipping was rejected:

  • Small files often contain critical information (config, utilities, schemas)
  • Users expect directory analysis to cover all files

Batching keeps analysis in analyst agents while avoiding one-task-per-tiny-file overhead.

D4: Findings-in-Task-Descriptions Over SendMessage

Chosen over: Standard SendMessage delivery (as used in single-file RLM).

Rationale: With 30 tasks, even compact 2K-character findings would put 60K characters in the Team Lead’s inbox. Writing findings to task descriptions and having synthesizers read via TaskGet completely bypasses the Team Lead’s context for raw findings data.

Tradeoff: Synthesizers must make multiple TaskGet calls (one per analyst task). This is slower than reading from a pre-collected prompt but keeps the Team Lead lean.

All multi-file capability is achieved through:

  • Prompt variation — analysts detect Mode: multi-file and switch reporting behavior
  • Task orchestration — two-phase synthesis via task dependencies
  • Existing agents — the same 4 analyst types and 1 synthesizer handle both modes

No new agent definitions, no new tools, no new plugin configuration.


  • Incremental directory re-analysis — Only re-analyze files that changed since the last run. Track file hashes in a manifest and skip unchanged files. Useful for CI/CD integration where the same directory is analyzed on every commit.

  • Cross-directory analysis — Analyzing multiple directories in one session (e.g., comparing src/ and test/ for coverage gaps). Would require a third synthesis phase or a pre-synthesis grouping step.

  • Custom type registrations — Letting users define their own content types (e.g., .proto → protobuf, .tf → terraform) with custom chunking rules and analyst prompts. Wait for user demand before adding configuration surface.

  • Streaming enumeration — For very large directories (1000+ files), enumerate and start processing in parallel rather than waiting for full enumeration. Current max_files=20 cap makes this unnecessary.

  • Analyst result caching — Cache analyst findings for files that haven’t changed, enabling fast re-runs. Requires content hashing and a storage layer beyond the current task system.

  • Priority-based partition ordering — Analyze high-priority files first (e.g., files with known issues, recently modified files). Currently all partitions are unordered and analysts claim by availability.