MemGPT / Letta (Virtual Context Management)
Memory Mechanism
An architecture, inspired by operating-system virtual memory, that gives LLMs explicit control over their own memory hierarchy: the model manages what is in the context window, what is in external storage, and when to page information in and out.
MemGPT wraps an LLM in a memory-management layer built around three stores:
(1) Main context (working memory): the actual context window, managed like RAM.
(2) Archival memory: external storage (a database or vector store) for long-term facts.
(3) Recall memory: a searchable log of past conversation history.
The LLM itself decides when to push information from main context to archival memory ('paging out'), retrieve from archival memory back into main context ('paging in'), search recall memory, or edit its own system prompt (self-modifying instructions). These operations are implemented as function calls the model can invoke. The architecture is available as the open-source Letta framework.
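To make the mechanism concrete, here is a minimal from-scratch sketch of the three stores and the paging operations, written as plain Python functions of the kind that would be exposed to the model as tools. All names here (MemoryManager, page_out, page_in, search_recall) are invented for illustration and are not the Letta API; a real deployment would back archival memory with a vector store and budget by tokens rather than item count.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryManager:
    # Main context: bounded working memory that is actually serialized
    # into the LLM's context window on every turn.
    main_context: list[str] = field(default_factory=list)
    # Archival memory: unbounded external store for long-term facts
    # (a database or vector store in a real system).
    archival: list[str] = field(default_factory=list)
    # Recall memory: append-only log of the full conversation history.
    recall: list[str] = field(default_factory=list)

    # --- operations the model invokes as function calls ---

    def page_out(self, index: int) -> str:
        """Move one item from main context into archival storage."""
        fact = self.main_context.pop(index)
        self.archival.append(fact)
        return f"paged out: {fact!r}"

    def page_in(self, query: str) -> str:
        """Copy matching archival facts back into main context."""
        hits = [f for f in self.archival if query.lower() in f.lower()]
        self.main_context.extend(hits)
        return f"paged in {len(hits)} item(s) for {query!r}"

    def search_recall(self, query: str) -> list[str]:
        """Search past conversation without loading all of it."""
        return [m for m in self.recall if query.lower() in m.lower()]


if __name__ == "__main__":
    mm = MemoryManager(main_context=["user's dog is named Rex"])
    print(mm.page_out(0))      # evict the fact to archival memory
    print(mm.main_context)     # -> []  (window space freed)
    print(mm.page_in("dog"))   # later: page the fact back in on demand
    print(mm.main_context)     # -> ["user's dog is named Rex"]
```

In the system described by the MemGPT paper, calls like these are emitted through the model's function-calling interface; the wrapper executes them and appends the results back into main context, so the model sees the outcome of its own memory operations.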
Why Does This Exist?
The best-known implementation of persistent AI memory: it demonstrates that LLMs can manage their own memory hierarchy through explicit read/write operations
Explicit memory management creates a clear distinction between "knowledge I retrieved" and "knowledge from my parameters", enabling more honest expression of uncertainty
The memory hierarchy separates knowledge into independently manageable stores (archival, recall, working), a form of knowledge modularity at the system level
Virtual context management avoids the need for massive (and expensive) context windows: memory management keeps the effective window small while still giving access to a large memory, as the sketch below illustrates
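A minimal sketch of the eviction step that keeps the effective window bounded. The function name enforce_context_budget, the chars/4 token estimate, and the oldest-first policy are all assumptions for illustration; MemGPT itself warns the model as context pressure rises and recursively summarizes evicted messages rather than silently dropping the oldest items.

```python
def enforce_context_budget(main_context: list[str],
                           archival: list[str],
                           max_tokens: int) -> None:
    """Evict oldest main-context items into archival storage until a
    rough token estimate of the window fits under the budget."""
    def est_tokens(items: list[str]) -> int:
        # Crude heuristic: roughly 4 characters per token.
        return sum(len(s) // 4 + 1 for s in items)

    while main_context and est_tokens(main_context) > max_tokens:
        archival.append(main_context.pop(0))  # oldest-first eviction


# The window stays small and cheap; the archival store grows without bound.
window: list[str] = [f"note {i}: " + "x" * 400 for i in range(20)]
archive: list[str] = []
enforce_context_budget(window, archive, max_tokens=500)
print(len(window), "items in window,", len(archive), "items archived")
```

The design point is that the per-turn inference cost scales with the window budget, not with total accumulated memory, which can grow indefinitely in external storage.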