MemGPT / Letta (Virtual Context Management)
Memory Mechanism
An architecture, inspired by operating-system virtual memory, that gives LLMs explicit control over their own memory hierarchy: the model manages what is in the context window, what is in external storage, and when to page information in and out.
MemGPT wraps an LLM in a memory-management layer built around three stores:
(1) Main context (working memory): the actual context window, managed like RAM.
(2) Archival memory: external storage (a database or vector store) for long-term facts.
(3) Recall memory: a searchable log of past conversation history.
The LLM itself decides when to push information from main context to archival memory ('paging out'), retrieve from archival memory back into main context ('paging in'), search recall memory, or edit its own system prompt (self-modifying instructions). These operations are implemented as function calls the model can invoke. The architecture is available as the open-source Letta framework.
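To make the mechanism concrete, here is a minimal from-scratch sketch of the three stores and the paging operations, written as plain Python functions of the kind that would be exposed to the model as tools. All names here (MemoryManager, page_out, page_in, search_recall) are invented for illustration and are not the Letta API; a real deployment would back archival memory with a vector store and budget by tokens rather than item count.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryManager:
    # Main context: bounded working memory that is actually serialized
    # into the LLM's context window on every turn.
    main_context: list[str] = field(default_factory=list)
    # Archival memory: unbounded external store for long-term facts
    # (a database or vector store in a real system).
    archival: list[str] = field(default_factory=list)
    # Recall memory: append-only log of the full conversation history.
    recall: list[str] = field(default_factory=list)

    # --- operations the model invokes as function calls ---

    def page_out(self, index: int) -> str:
        """Move one item from main context into archival storage."""
        fact = self.main_context.pop(index)
        self.archival.append(fact)
        return f"paged out: {fact!r}"

    def page_in(self, query: str) -> str:
        """Copy matching archival facts back into main context."""
        hits = [f for f in self.archival if query.lower() in f.lower()]
        self.main_context.extend(hits)
        return f"paged in {len(hits)} item(s) for {query!r}"

    def search_recall(self, query: str) -> list[str]:
        """Search past conversation without loading all of it."""
        return [m for m in self.recall if query.lower() in m.lower()]


if __name__ == "__main__":
    mm = MemoryManager(main_context=["user's dog is named Rex"])
    print(mm.page_out(0))      # evict the fact to archival memory
    print(mm.main_context)     # -> []  (window space freed)
    print(mm.page_in("dog"))   # later: page the fact back in on demand
    print(mm.main_context)     # -> ["user's dog is named Rex"]
```

In the system described by the MemGPT paper, calls like these are emitted through the model's function-calling interface; the wrapper executes them and appends the results back into main context, so the model sees the outcome of its own memory operations.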
Why Does This Exist?
The best-known implementation of persistent AI memory: it demonstrates that LLMs can manage their own memory hierarchy through explicit read/write operations
Explicit memory management creates a clear distinction between "knowledge I retrieved" and "knowledge from my parameters", enabling more honest expression of uncertainty
The memory hierarchy separates knowledge into independently manageable stores (archival, recall, working), a form of knowledge modularity at the system level
Virtual context management avoids the need for massive (and expensive) context windows: memory management keeps the effective window small while still giving access to a large memory, as the sketch below illustrates
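A minimal sketch of the eviction step that keeps the effective window bounded. The function name enforce_context_budget, the chars/4 token estimate, and the oldest-first policy are all assumptions for illustration; MemGPT itself warns the model as context pressure rises and recursively summarizes evicted messages rather than silently dropping the oldest items.

```python
def enforce_context_budget(main_context: list[str],
                           archival: list[str],
                           max_tokens: int) -> None:
    """Evict oldest main-context items into archival storage until a
    rough token estimate of the window fits under the budget."""
    def est_tokens(items: list[str]) -> int:
        # Crude heuristic: roughly 4 characters per token.
        return sum(len(s) // 4 + 1 for s in items)

    while main_context and est_tokens(main_context) > max_tokens:
        archival.append(main_context.pop(0))  # oldest-first eviction


# The window stays small and cheap; the archival store grows without bound.
window: list[str] = [f"note {i}: " + "x" * 400 for i in range(20)]
archive: list[str] = []
enforce_context_budget(window, archive, max_tokens=500)
print(len(window), "items in window,", len(archive), "items archived")
```

The design point is that the per-turn inference cost scales with the window budget, not with total accumulated memory, which can grow indefinitely in external storage.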