PaperQA: Retrieval-Augmented Generative Agent for Scientific Research

https://papers.cool/arxiv/2312.07559

Authors: Jakub Lála, Odhran O'Donoghue, Aleksandar Shtedritski, Sam Cox, Samuel G. Rodriques, Andrew D. White

Summary: Large Language Models (LLMs) generalize well across language tasks, but suffer from hallucinations and uninterpretability, making it difficult to assess their accuracy without ground truth. Retrieval-Augmented Generation (RAG) models have been proposed to reduce hallucinations and provide provenance for how an answer was generated. Applying such models to the scientific literature may enable large-scale, systematic processing of scientific knowledge. We present PaperQA, a RAG agent for answering questions over the scientific literature. PaperQA is an agent that performs information retrieval across full-text scientific articles, assesses the relevance of sources and passages, and uses RAG to provide answers. Viewing this agent as a question answering model, we find it exceeds the performance of existing LLMs and LLM agents on current science QA benchmarks. To push the field closer to how humans perform research on scientific literature, we also introduce LitQA, a more complex benchmark that requires retrieval and synthesis of information from full-text scientific papers across the literature. Finally, we demonstrate that PaperQA matches expert human researchers on LitQA.


Q: What problem does this paper try to solve?

A: This paper presents PaperQA, a Retrieval-Augmented Generative agent designed to address the following problems:

  1. Large Language Models (LLMs) generalize well across language tasks, but when answering scientific questions they suffer from hallucinations and a lack of interpretability, making their accuracy difficult to assess without ground truth.
  2. Existing LLMs perform poorly in many scientific domains because their knowledge may be outdated and their reasoning ability is limited.
  3. In scientific research, extracting discoveries from the literature remains a largely manual process, which is especially difficult given the millions of academic articles published each year.
  4. Existing Retrieval-Augmented Generation (RAG) models typically follow a fixed, linear pipeline, which can be limiting when handling the diverse questions scientists encounter.
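The contrast between a fixed linear RAG pipeline and an agent that chooses among retrieval tools can be sketched in code. This is a minimal illustration with hypothetical function names (`search`, `gather_evidence`, `answer_question`) and toy term-overlap scoring; PaperQA's actual tools use full-text search over scientific articles and LLM-based relevance scoring, which are not reproduced here.

```python
# Sketch of an agentic RAG loop: search -> gather evidence -> answer.
# All tool names and the scoring heuristic are illustrative placeholders.

def search(question: str, library: list[str]) -> list[str]:
    """Retrieve candidate passages by naive keyword overlap
    (stand-in for full-text search over scientific papers)."""
    terms = set(question.lower().split())
    return [p for p in library if terms & set(p.lower().split())]

def gather_evidence(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by relevance and keep the best ones
    (PaperQA instead scores relevance with an LLM)."""
    terms = set(question.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(terms & set(p.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def answer_question(evidence: list[str]) -> str:
    """Compose an answer grounded in collected evidence
    (stand-in for the RAG generation step with citations)."""
    return " / ".join(evidence) if evidence else "Insufficient evidence."

def agent_loop(question: str, library: list[str]) -> str:
    # Unlike a fixed pipeline, an agent may repeat or reorder these
    # tool calls; this sketch runs a single pass for clarity.
    passages = search(question, library)
    evidence = gather_evidence(question, passages)
    return answer_question(evidence)

library = [
    "CRISPR enables targeted genome editing.",
    "Transformers dominate language modeling.",
]
print(agent_loop("What enables genome editing?", library))
# → CRISPR enables targeted genome editing.
```

The key design point is that each step is an independent tool the agent can invoke as needed, rather than one stage in a hard-wired sequence.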

To address these problems, PaperQA makes the following improvements:


Q: What related research is there?

A: The related research areas mentioned in the paper include: