A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

https://papers.cool/arxiv/2402.09727

Authors: Kuang-Huei Lee; Xinyun Chen; Hiroki Furuta; John Canny; Ian Fischer

Summary:

Current Large Language Models (LLMs) are not only limited to some maximum context length, but also are not able to robustly consume long inputs. To address these limitations, we propose ReadAgent, an LLM agent system that increases effective context length up to 20x in our experiments. Inspired by how humans interactively read long documents, we implement ReadAgent as a simple prompting system that uses the advanced language capabilities of LLMs to (1) decide what content to store together in a memory episode, (2) compress those memory episodes into short episodic memories called gist memories, and (3) take actions to look up passages in the original text if ReadAgent needs to remind itself of relevant details to complete a task. We evaluate ReadAgent against baselines using retrieval methods, using the original long contexts, and using the gist memories. These evaluations are performed on three long-document reading comprehension tasks: QuALITY, NarrativeQA, and QMSum. ReadAgent outperforms the baselines on all three tasks while extending the effective context window by 3-20x.


Q: What problem does this paper try to solve?

A: The paper addresses two main problems that large language models (LLMs) face when processing long texts:

  1. Context-length limit: existing LLMs have a hard maximum context length, so they cannot consume overly long inputs in a single pass.
  2. Performance degradation on long inputs: even when the input fits within that limit, LLM performance tends to drop as the input grows longer.

To overcome these limitations, the paper proposes an LLM agent system called ReadAgent. Inspired by how humans interactively read long documents, ReadAgent increases the effective context length through three main steps:

  1. Deciding what content to store together in a memory episode (pagination).
  2. Compressing those episodes into short episodic memories called gist memories.
  3. Looking up passages in the original text when relevant details are needed to complete a task.

With this approach, ReadAgent outperforms the baselines on three long-document reading-comprehension tasks (QuALITY, NarrativeQA, and QMSum) while extending the effective context window by up to 20x.
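The three steps above can be sketched as a simple prompting loop. This is a minimal illustration, not the paper's implementation: the `llm(prompt) -> str` callable, the helper names, and the fixed-size pagination are all assumptions (the paper lets the LLM itself choose episode boundaries and uses the authors' own prompts).

```python
def paginate(text, max_words=200):
    """Step 1 (simplified): split the text into 'pages' (memory episodes).
    Fixed-size windows stand in for the paper's LLM-chosen pause points."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def make_gists(pages, llm):
    """Step 2: compress each page into a short gist memory."""
    return [llm(f"Shorten the following passage, preserving key facts:\n{p}")
            for p in pages]

def answer(question, pages, gists, llm, max_lookups=3):
    """Step 3: answer from the gist memory, optionally expanding a few
    gists back into their original pages before answering."""
    memory = "\n".join(f"[Page {i}] {g}" for i, g in enumerate(gists))
    lookup = llm(f"Gist memory:\n{memory}\nQuestion: {question}\n"
                 f"Reply with up to {max_lookups} page numbers to re-read, "
                 "comma-separated, or 'none'.")
    context_parts = dict(enumerate(gists))
    if lookup.strip().lower() != "none":
        for tok in lookup.split(","):
            if tok.strip().isdigit() and int(tok) < len(pages):
                # Replace the gist with the full original page.
                context_parts[int(tok)] = pages[int(tok)]
    context = "\n".join(f"[Page {i}] {t}"
                        for i, t in sorted(context_parts.items()))
    return llm(f"Context:\n{context}\nQuestion: {question}\nAnswer:")
```

The effective-context gain comes from step 2: the final prompt mostly contains short gists, with only a handful of pages restored to full text.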

Q: What related work is there?

A: The paper discusses several research directions related to long-context LLMs, including:

  1. Long-context LLMs: training or fine-tuning LLMs to handle longer context windows, including new architectures and efficient Transformer attention implementations that reduce the need for long-context fine-tuning.
  2. Retrieval-Augmented Generation (RAG): RAG techniques let an LLM query task-relevant information from large document databases. This work implements a form of retrieval through LLM reasoning over contextualized gist memories, without any training.
  3. LLM agents for long texts: using LLMs as agents that interactively process very long texts. For example, WebGPT and WebShop learn browsing actions to search the internet for answers, though they are not designed specifically for long-document understanding. PEARL proposes better action plans for long-document understanding through iterative prompting. Self-Notes improves reasoning by amortizing reasoning steps, interleaving intermediate notes with the original document. Yang et al. generate long outputs through iterative reasoning.