Enhancing Large Language Model with Self-Controlled Memory Framework
https://papers.cool/arxiv/2304.13343
Authors: Bing Wang ; Xinnian Liang ; Jian Yang ; Hui Huang ; Shuangzhi Wu ; Peihao Wu ; Lu Lu ; Zejun Ma ; Zhoujun Li
Summary: Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information. To address this limitation, in this paper, we propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information. Our SCM framework comprises three key components: an LLM-based agent serving as the backbone of the framework, a memory stream storing agent memories, and a memory controller updating memories and determining when and how to utilize memories from the memory stream. Additionally, the proposed SCM is able to process ultra-long texts without any modification or fine-tuning, and can integrate with any instruction-following LLM in a plug-and-play paradigm. Furthermore, we annotate a dataset to evaluate the effectiveness of SCM for handling lengthy inputs. The annotated dataset covers three tasks: long-term dialogues, book summarization, and meeting summarization. Experimental results demonstrate that our method achieves better retrieval recall and generates more informative responses compared to competitive baselines in long-term dialogues. (https://github.com/wbbeyourself/SCM4LLMs)
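A: The paper addresses the limitations of large language models (LLMs) when handling long inputs. Specifically, it identifies two main problems that LLMs face when processing long text.
To overcome these limitations, the paper proposes a framework called Self-Controlled Memory (SCM). SCM aims to strengthen an LLM's ability to maintain long-term memory and recall relevant information, and to process ultra-long text without any modification or fine-tuning of the model. It does so through three key components: an LLM-based agent, a memory stream, and a memory controller. With these, the framework can handle tasks that carry large amounts of historical information, such as long-term dialogues, book summarization, and meeting summarization, without sacrificing performance.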
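The interplay of the three components can be illustrated with a minimal Python sketch. This is not the authors' implementation: every class and function name here (`MemoryStream`, `MemoryController`, `run_turn`) is hypothetical, the bag-of-words cosine similarity is a toy stand-in for whatever retrieval the framework actually uses, the similarity threshold that decides when to use memory is an assumption made purely for illustration, and the LLM is represented by any callable that maps a prompt string to a reply string.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List


def bow(text: str) -> Counter:
    """Bag-of-words vector; a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class MemoryStream:
    """Append-only store of past turns (hypothetical structure)."""
    items: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.items.append(text)

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        """Return up to k stored memories that overlap with the query."""
        q = bow(query)
        ranked = sorted(self.items, key=lambda m: cosine(q, bow(m)), reverse=True)
        return [m for m in ranked[:k] if cosine(q, bow(m)) > 0.0]


@dataclass
class MemoryController:
    """Decides when and how memories enter the prompt (illustrative only)."""
    stream: MemoryStream
    threshold: float = 0.1  # assumed cutoff, not taken from the paper

    def needs_memory(self, query: str) -> bool:
        q = bow(query)
        return any(cosine(q, bow(m)) >= self.threshold for m in self.stream.items)

    def build_prompt(self, query: str) -> str:
        recalled = self.stream.retrieve(query) if self.needs_memory(query) else []
        context = "\n".join(f"[memory] {m}" for m in recalled)
        return f"{context}\n[user] {query}" if context else f"[user] {query}"


def run_turn(llm: Callable[[str], str], controller: MemoryController, query: str) -> str:
    """One agent step: build the prompt, call the LLM, then store the turn."""
    reply = llm(controller.build_prompt(query))
    controller.stream.add(f"user: {query} | agent: {reply}")
    return reply
```

Because the model is only ever reached through a plain prompt-to-reply callable, the sketch mirrors the plug-and-play idea: swapping in a different instruction-following LLM changes nothing in the memory stream or the controller.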
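A: Related work is concentrated in several areas.
These studies give the SCM framework its theoretical foundation and technical background, and they also point to the key problems LLMs must solve when processing long text. The SCM framework is proposed to build on these findings and offer a solution that can handle inputs of unlimited length.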