Recursive Language Models (RLM): Long-Context AI

Q: What are Recursive Language Models (RLM)?

RLM is an architecture pattern in which a large language model (LLM) processes a long document in recursive chunks. It summarizes, extracts, and synthesizes information iteratively instead of trying to fit everything into a single context window.

Q: How does RLM solve the context window limitation?

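RLM breaks a long input into segments that each fit in the context window, processes each segment with the LLM, and merges the results. If the merged result is still too large, the same procedure is applied to it again, and this repeats until the result fits in one window and a final pass can produce a coherent output. This enables reasoning over documents far larger than the model's context window.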
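The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `split_into_chunks` and `recursive_reduce` are hypothetical, `llm` stands for any callable that takes a prompt string and returns a string, and sizes are measured in characters rather than tokens to keep the sketch self-contained.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def recursive_reduce(
    text: str,
    llm: Callable[[str], str],  # hypothetical model call: prompt in, text out
    chunk_size: int,
    max_depth: int = 10,
) -> str:
    """Process text with llm, recursing until the input fits in one chunk."""
    # Base case: the text fits in one "context window", so one call finishes.
    if len(text) <= chunk_size:
        return llm(text)
    if max_depth == 0:
        raise RuntimeError("recursion limit reached before the text fit in one chunk")

    # Process every segment independently, then merge the partial results.
    partials = [llm(chunk) for chunk in split_into_chunks(text, chunk_size)]
    merged = "\n".join(partials)

    # Guard against an llm that does not shrink its input, which would loop forever.
    if len(merged) >= len(text):
        raise RuntimeError("merged output did not shrink; cannot converge")

    # Recurse on the merged result until it fits in a single chunk.
    return recursive_reduce(merged, llm, chunk_size, max_depth - 1)
```

In practice `llm` would wrap a real model call with a summarization or extraction prompt, and the split would follow token counts and natural boundaries such as paragraphs instead of fixed character offsets.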

Q: When should you use RLM vs RAG?

Use RLM when you need deep reasoning over long documents where the entire context matters. Use retrieval-augmented generation (RAG) when you need to find specific answers in a large corpus. RLM excels at synthesis; RAG excels at retrieval.

Q: What are the limitations of RLM?

RLM limitations include: higher latency (multiple LLM calls), higher cost (more tokens processed), potential information loss during summarization, and complexity in maintaining coherence across recursive passes.
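The latency and cost overhead can be estimated roughly before committing to the pattern. The sketch below is a back-of-the-envelope model under stated assumptions: `estimate_cost` is a hypothetical helper, every pass is assumed to shrink the text by the same fixed `compression` ratio, and prompt overhead and output tokens are ignored.

```python
import math
from typing import Tuple


def estimate_cost(total_tokens: int, chunk_tokens: int, compression: float) -> Tuple[int, int, int]:
    """Return (llm_calls, input_tokens_processed, passes) for a recursive reduction.

    Assumes each pass shrinks the text to `compression` times its size
    (for example 0.25 keeps a quarter); real ratios vary by model and prompt.
    """
    if not 0 < compression < 1:
        raise ValueError("compression must be strictly between 0 and 1")

    calls = processed = passes = 0
    remaining = total_tokens
    # Each pass reads all remaining tokens, one call per chunk.
    while remaining > chunk_tokens:
        calls += math.ceil(remaining / chunk_tokens)
        processed += remaining
        remaining = math.ceil(remaining * compression)
        passes += 1

    # Final pass: the text now fits in one window, so a single call finishes.
    return calls + 1, processed + remaining, passes + 1
```

For example, `estimate_cost(100_000, 10_000, 0.25)` returns `(14, 131250, 3)`: fourteen calls across three passes, reading about 31% more input tokens than the document contains. Calls within a pass are independent and could run in parallel, so under these assumptions latency grows with the number of passes more than with the number of calls.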