I built subcog, a semantic memory system for Claude Code, then replaced it with mnemonic when the design needed to change. Migrating my own memories between two systems I wrote myself was painful. The formats were different. The schemas did not align. Context got lost in translation.

If moving between my own tools was this hard, moving between tools from different authors would be worse. And looking at the market, that is exactly where things stand. Mem0, Zep, Letta, LangMem, Basic Memory, and a dozen others each invented their own schema. None can read the others. If you accumulate months of context in one tool and want to switch, you write a custom migration script or you start over.

MIF grew directly from that migration. I needed a format that both subcog and mnemonic could speak. The ontology system came from trying to classify memories that had no consistent taxonomy. Provenance tracking came from not knowing which memories were user-stated facts versus AI inferences. Citations came from needing to trace where a memory originated when something looked wrong. Each feature in the spec traces back to a real problem I hit moving data between my own tools.

Then I realized the same format could work for everyone.

N Providers, N-Squared Migration Paths

Count the AI memory providers that launched in the last 18 months. Mem0, Zep, Letta, LangMem, Basic Memory, and at least a dozen more. Every single one stores memories in a proprietary format. Every single one assumes you will use their tool forever.

The consequences compound:

Switching costs are total. Moving from Mem0 to Zep means writing a custom migration script. Moving from Zep to Letta means writing another one. Every ordered pair of source and destination requires its own converter. With N providers, that is N × (N − 1) converters, which is N-squared migration paths for practical purposes.

Memories die with their tools. When a provider shuts down or changes direction, accumulated knowledge vanishes. Users who invested months building context in a proprietary system get nothing back.

Multi-tool workflows break. Someone using Claude Code for development, a different assistant for writing, and another for research gets three separate memory silos. Each builds its own understanding of preferences, project context, and working patterns. None share what they learn.

These problems get worse as the market grows. More providers means more fragmentation. MIF grew from recognizing that pattern early while building mnemonic.

What MIF Actually Is

MIF stands for Memory Interchange Format. It is an open specification that defines how AI assistants store, exchange, and reason about persistent memory. The repository lives at github.com/zircote/MIF and the spec site is mif-spec.dev.

The core idea: define a common data model with two equivalent representations.

Markdown files (.memory.md) are human-readable. They use YAML frontmatter and standard Markdown content. They work as valid Obsidian notes. You can browse your memories in any text editor, search them with grep, back them up with git. No special tooling required.

JSON-LD files (.memory.json) are machine-processable. They use W3C-compatible linked data vocabulary, support RDF tooling, and enable automated validation with JSON Schema. Machines parse these directly without scraping frontmatter.

The two representations convert losslessly in either direction. Write in one, read in the other. No information lost.

Here is what a minimal memory looks like in Markdown:

---
id: 550e8400-e29b-41d4-a716-446655440000
type: semantic
created: 2026-01-15T10:30:00Z
namespace: _semantic/preferences
---

User prefers dark mode for all applications.

And the same memory in JSON-LD:

{
  "@context": "https://mif-spec.dev/schema/context.jsonld",
  "@type": "Memory",
  "@id": "urn:mif:550e8400-e29b-41d4-a716-446655440000",
  "memoryType": "semantic",
  "namespace": "_semantic/preferences",
  "content": "User prefers dark mode for all applications.",
  "created": "2026-01-15T10:30:00Z"
}

Four required fields: id, type, content, created. That is the floor. Everything else layers on top through conformance levels.
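
To make the dual representation concrete, here is a minimal Python sketch that converts a Level 1 .memory.md file into its JSON-LD form. The helper name, the use of PyYAML, and the frontmatter-splitting approach are my own assumptions for illustration; the spec only defines the two file formats themselves.

import json
from pathlib import Path

import yaml  # PyYAML, assumed available


def markdown_to_jsonld(path: str) -> dict:
    """Convert a minimal .memory.md file into its JSON-LD equivalent."""
    text = Path(path).read_text(encoding="utf-8")
    # Frontmatter sits between the first two '---' delimiters; the body follows.
    _, frontmatter, body = text.split("---", 2)
    meta = yaml.safe_load(frontmatter)

    created = meta["created"]
    if hasattr(created, "isoformat"):  # PyYAML parses timestamps into datetime objects
        created = created.isoformat().replace("+00:00", "Z")

    return {
        "@context": "https://mif-spec.dev/schema/context.jsonld",
        "@type": "Memory",
        "@id": f"urn:mif:{meta['id']}",
        "memoryType": meta["type"],
        "namespace": meta.get("namespace"),
        "content": body.strip(),
        "created": created,
    }


print(json.dumps(markdown_to_jsonld("550e8400.memory.md"), indent=2))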

Three Memory Types

MIF classifies memories into three cognitive types borrowed from how human memory actually works:

Semantic memories store facts, concepts, and relationships. “The API uses OAuth 2.0.” “PostgreSQL connection limit is 100.” “The user prefers dark mode.” These are the what-is-true statements about the world.

Episodic memories capture events, experiences, and timelines. “Debug session on 2026-01-15 traced the OOM to a connection leak.” “Production incident at 3am caused by config drift.” “The team decided to migrate from REST to gRPC after the load test.” These are the what-happened records.

Procedural memories encode workflows, patterns, and how-to knowledge. “To deploy: run tests, build Docker image, push to ECR, update ECS service.” “When reviewing PRs, check for SQL injection in any user input handling.” These are the how-to-do-it instructions.

This matters because different memory types need different handling. Semantic memories update when facts change. Episodic memories are immutable records of what happened. Procedural memories evolve as workflows improve.
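
Those handling rules can be encoded directly in whatever system consumes the files. The sketch below is one possible policy, not something the spec mandates; the function name and the supersedes bookkeeping are my own assumptions.

from copy import deepcopy


def apply_update(memory: dict, new_content: str) -> dict:
    """Apply a content update according to the memory's cognitive type."""
    kind = memory["type"]
    if kind == "semantic":
        memory["content"] = new_content  # facts change, so overwrite in place
        return memory
    if kind == "episodic":
        raise ValueError("episodic memories are immutable records of what happened")
    if kind == "procedural":
        old = deepcopy(memory)  # keep the previous workflow version around
        memory["content"] = new_content
        memory.setdefault("supersedes", []).append(old["id"])
        return memory
    raise ValueError(f"unknown memory type: {kind}")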

The ontology system extends these three base types into a hierarchy:

semantic/
  decisions/         # Architecture choices, rationale
  knowledge/         # APIs, learnings, security context
  entities/          # People, tools, organizations

episodic/
  incidents/         # Production issues, postmortems
  sessions/          # Debug sessions, work sessions
  blockers/          # Impediments and resolutions

procedural/
  runbooks/          # Operational procedures
  patterns/          # Code conventions, testing approaches
  migrations/        # Upgrade steps, migration paths

You can define domain-specific extensions. An agriculture project might add semantic/crop-data and procedural/harvest-protocols. A security team might add episodic/incidents/breach-response.

Bi-Temporal Tracking

Most databases track one timestamp: when the record was created. MIF tracks two.

Transaction time is when the memory was recorded. Your AI assistant captured this fact at 2:15pm on Tuesday. This never changes. Even if the fact turns out to be wrong, the record of when it was captured stays fixed.

Valid time is when the fact was actually true. The API changed its authentication method on January 1st, but you did not discover this until January 15th. Transaction time: January 15th. Valid time: January 1st.

Why does this matter? Because AI memories go stale. A preference recorded six months ago might not reflect current reality. A technical fact from last year might be outdated. Bi-temporal tracking lets you query both “what did my assistant know at time X?” and “what was actually true at time X?”

This distinction becomes critical for debugging. When your assistant makes a bad recommendation, you can trace whether it had outdated information (valid time expired) or whether the information was recorded incorrectly (provenance issue).
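
In code, the two questions become filters over different timestamps. A minimal sketch, assuming each memory dict carries a transaction_time plus a valid_from/valid_until interval; the field names here are illustrative, not normative.

from datetime import datetime


def known_at(memories: list[dict], t: datetime) -> list[dict]:
    """What did my assistant know at time t? Filter on transaction time."""
    return [m for m in memories if m["transaction_time"] <= t]


def true_at(memories: list[dict], t: datetime) -> list[dict]:
    """What was actually true at time t? Filter on the valid-time interval."""
    return [
        m for m in memories
        if m["valid_from"] <= t
        and (m.get("valid_until") is None or t < m["valid_until"])
    ]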

Provenance: Where Memories Come From

Every memory has an origin. A user stated a preference. An assistant inferred a pattern. A tool extracted structured data from a conversation. A migration imported records from another system.

MIF uses W3C PROV-O compatible provenance tracking. Each memory can record:

  • Source: Where the information originated (user statement, AI inference, tool extraction, external import)
  • Agent: Which AI model or tool created the memory
  • Confidence: How certain the source is (0.0 to 1.0)
  • Derivation chains: This memory was derived from these other memories

This sounds academic until you consider a practical scenario. Your assistant recommends a library based on a memory that says “User prefers minimal dependencies.” Where did that preference come from? Did you state it explicitly (confidence: 1.0), or did the assistant infer it from your package choices (confidence: 0.7)? The answer determines how much weight to give the recommendation.

Provenance also enables trust hierarchies. User-stated facts rank higher than AI inferences. Verified tool outputs rank higher than conversation extractions. When memories conflict, provenance helps resolve which one to trust.
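
One way to apply that hierarchy when two memories contradict each other is to rank by source first and confidence second. The source labels and the ordering below are an example policy of mine, not something the spec prescribes.

# Higher rank means more trustworthy. An example ordering, not a normative one.
SOURCE_RANK = {
    "user_statement": 3,
    "tool_extraction": 2,
    "ai_inference": 1,
    "external_import": 0,
}


def resolve_conflict(a: dict, b: dict) -> dict:
    """Pick which of two contradicting memories to trust."""
    def trust(m: dict) -> tuple:
        prov = m.get("provenance", {})
        return (SOURCE_RANK.get(prov.get("source"), -1), prov.get("confidence", 0.0))
    return max(a, b, key=trust)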

Relationships and Entities

Memories do not exist in isolation. They reference people, technologies, organizations, concepts, and files. MIF defines five entity types and nine relationship types to model these connections.

Entity types: Person, Organization, Technology, Concept, File.

Core relationships include:

Relationship   Meaning
RelatesTo      General association
DerivedFrom    This memory came from that one
Supersedes     This memory replaces an older one
Supports       This memory provides evidence for that one
Contradicts    These memories conflict
PartOf         Component relationship

In Markdown, relationships use Obsidian wiki-link syntax:

---
entities:
  - name: PostgreSQL
    type: technology
    role: subject
relationships:
  - type: Supersedes
    target: "[[old-db-config]]"
    context: "Connection pool size updated from 10 to 50"
---

This means your MIF vault is also a knowledge graph. Open it in Obsidian, and the graph view shows how your memories connect. Which decisions led to which outcomes. Which technologies relate to which projects. Which incidents trace back to which configuration changes.
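
You do not need Obsidian to walk those connections. Here is a small sketch that turns wiki-link relationship targets from parsed frontmatter into an adjacency list; it assumes the frontmatter has already been loaded into dicts, as in the conversion sketch earlier.

import re
from collections import defaultdict

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")


def build_graph(memories: list[dict]) -> dict:
    """Build an adjacency list: memory id -> list of (relationship type, target)."""
    graph = defaultdict(list)
    for m in memories:
        for rel in m.get("relationships", []):
            raw = rel.get("target", "")
            match = WIKI_LINK.search(raw)
            target = match.group(1) if match else raw  # strip [[...]] if present
            graph[m["id"]].append((rel["type"], target))
    return dict(graph)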

Conformance Levels

Not every implementation needs every feature. MIF defines three conformance levels so providers can adopt incrementally:

Level 1 (Core): Four required fields. id, type, content, created. This is the minimum to call something a MIF memory. A provider that can export memories with these four fields is Level 1 compliant. Migration scripts between Level 1 providers are trivial.

Level 2 (Standard): Adds namespace, entities, relationships, and timestamps. This is where memories become useful for search and organization. Most production systems should target Level 2.

Level 3 (Full): Adds bi-temporal tracking, confidence decay, provenance chains, embedding references, and citations. This is for systems that need full audit trails and sophisticated memory management. Think enterprise compliance or research applications.

A Level 1 provider can read Level 3 files by ignoring fields it does not understand. A Level 3 provider adds richer metadata. Forward and backward compatibility without version negotiation.
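
A rough way to check where an exported memory lands, with the Level 2 and Level 3 field groupings simplified from the descriptions above; the exact Level 3 field names are my assumption here.

CORE = {"id", "type", "content", "created"}
STANDARD = {"namespace", "entities", "relationships"}
FULL = {"provenance", "validTime", "confidence", "citations", "embedding"}  # assumed names


def conformance_level(memory: dict) -> int:
    """Return the highest conformance level a memory plausibly targets."""
    fields = set(memory)
    if not CORE <= fields:
        return 0  # missing a required field; not a MIF memory
    if fields & FULL:
        return 3
    if fields & STANDARD:
        return 2
    return 1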

Obsidian Native

I made a deliberate design choice: MIF Markdown files must be valid Obsidian notes. Not “compatible with” or “exportable to.” They must work natively in any Obsidian vault as standard notes with standard features.

This means your AI memories live alongside your personal notes. You can link a memory to a meeting note. You can tag memories with the same taxonomy you use for everything else. You can search, filter, and visualize your AI’s knowledge using Obsidian’s graph view, Dataview plugin, and canvas.

The practical benefit is backup and ownership. Your memories are plain text files in a folder. Back them up with git. Sync them with iCloud or Syncthing. Grep them from the terminal. No database to maintain. No server to keep running. No API to authenticate against. Just files.

Migration Paths

MIF includes migration guides for the major providers: Mem0, Zep, Letta, Subcog, and Basic Memory. Each guide maps the provider’s schema to MIF fields and documents what translates cleanly versus what requires manual review.

The pattern is consistent. Export from the source provider. Map fields to MIF properties. Generate .memory.md or .memory.json files. Validate with JSON Schema. Import into the target provider.

For Subcog, which I maintain, the migration is direct since Subcog already uses MIF-compatible ontologies. For providers with simpler schemas (Mem0 stores key-value pairs, essentially), the migration fills in reasonable defaults for fields the source does not track.
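
The same pattern in miniature: a sketch that maps a flat key-value export, roughly the shape a Mem0-style store produces, into Level 1 records. The defaults it fills in are illustrative choices, not part of any official migration guide.

import uuid
from datetime import datetime, timezone


def kv_to_mif(records: dict) -> list[dict]:
    """Map a flat key-value export into minimal Level 1 MIF memories."""
    now = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    memories = []
    for key, value in records.items():
        memories.append({
            "id": str(uuid.uuid4()),
            "type": "semantic",       # default: treat stored facts as semantic
            "content": f"{key}: {value}",
            "created": now,           # export time, since the source tracks no timestamp
        })
    return memories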

Validation

MIF ships with JSON Schema files for automated validation:

# Validate a MIF document
npx ajv validate -s schema/mif.schema.json -d your-memory.json

# Validate ontology definitions
npx ajv validate -s schema/ontology/ontology.schema.json -d ontology.yaml

The schemas enforce required fields, type constraints, valid relationship types, and temporal consistency. CI pipelines can validate MIF files the same way they validate OpenAPI specs or JSON configs.
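
If the pipeline runs Python rather than Node, the same check works with the jsonschema package. A sketch, assuming the repository's schema file sits at the path shown in the ajv examples above.

import json

from jsonschema import ValidationError, validate  # pip install jsonschema

schema = json.load(open("schema/mif.schema.json"))
document = json.load(open("your-memory.json"))

try:
    validate(instance=document, schema=schema)
    print("valid MIF document")
except ValidationError as err:
    print(f"invalid: {err.message}")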

An ontology conversion script transforms YAML definitions to JSON-LD for semantic web compatibility:

python scripts/yaml2jsonld.py ontologies/mif-base.ontology.yaml
python scripts/yaml2jsonld.py --all

Why Not Just Use JSON?

Fair question. JSON would be simpler. But simpler is not always better.

Plain JSON lacks semantic context. A field named “type” means nothing without documentation. JSON-LD adds @context and @type so machines understand the vocabulary without reading a separate specification. Tools that speak RDF can process MIF documents directly.

Plain JSON is not human-friendly for browsing. I open my memory vault daily to review what my assistants captured. Markdown with frontmatter reads naturally. JSON does not.

Plain JSON does not link. MIF Markdown uses wiki-links to create connections between memories. Open your vault in Obsidian and you get a visual graph of how knowledge relates. JSON would need a separate viewer.

The dual format gives you both. Humans read Markdown. Machines read JSON-LD. Neither compromises for the other.

Where This Goes Next

The specification is version 0.1.0-draft. I published it to gather feedback, not to declare it finished.

Subcog already implements MIF-compatible storage. Mnemonic, my Claude Code plugin, uses MIF ontologies for memory organization. These are the reference implementations, but the point of an open standard is adoption beyond one person’s tools.

What I want to see:

  1. Memory providers adopting MIF export as a baseline. Even Level 1 compliance, just the four required fields, would make migration between tools possible.

  2. Obsidian plugins that understand MIF metadata. The files already work in Obsidian, but a dedicated plugin could surface entity relationships, temporal validity, and provenance in useful ways.

  3. Federation protocols for sharing memories between vaults. The namespace model already supports multi-tenant organization. Federation would let teams share relevant memories while keeping private ones local.

  4. Converter tools maintained by the community. Migration guides exist in the spec, but maintained scripts for each provider would lower the barrier.

Try It

The specification and examples live at github.com/zircote/MIF. The spec site at mif-spec.dev has the formatted documentation.

Start with a Level 1 memory. Create a .memory.md file with four fields in the frontmatter: id, type, content, created. Put it in your Obsidian vault. That is a valid MIF document. Build from there.

If you maintain an AI memory tool, look at the conformance levels and figure out where your schema maps. Level 1 compliance is probably a few hours of work. Open an issue on the repository if you hit gaps in the specification.

If you have opinions about the data model, namespace conventions, or relationship types, file issues or submit PRs. The specification improves through use and feedback, not isolation.

Your AI memories should outlive the tools that create them. MIF makes that possible.