Chain-of-Thought Reasoning Without Prompting
https://papers.cool/arxiv/2402.10200
Authors: Xuezhi Wang ; Denny Zhou
In enhancing the reasoning capabilities of large language models (LLMs), prior research primarily focuses on specific prompting techniques such as few-shot or zero-shot chain-of-thought (CoT) prompting. These methods, while effective, often involve manually intensive prompt engineering. Our study takes a novel approach by asking: Can LLMs reason effectively without prompting? Our findings reveal that, intriguingly, CoT reasoning paths can be elicited from pre-trained LLMs by simply altering the decoding process. Rather than conventional greedy decoding, we investigate the top-k alternative tokens, uncovering that CoT paths are frequently inherent in these sequences. This approach not only bypasses the confounders of prompting but also allows us to assess the LLMs' intrinsic reasoning abilities. Moreover, we observe that the presence of a CoT in the decoding path correlates with a higher confidence in the model's decoded answer. This confidence metric effectively differentiates between CoT and non-CoT paths. Extensive empirical studies on various reasoning benchmarks show that the proposed CoT-decoding substantially outperforms the standard greedy decoding.
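The decoding idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the toy "model" is a hand-written lookup table, the names `TOY_MODEL`, `top2_margin`, `decode_branch`, and `cot_decode` are hypothetical, and the confidence score (the gap between the two most probable tokens, averaged over answer tokens) is one plausible reading of the "confidence in the model's decoded answer" mentioned above. Treating every all-digit token as an answer token, and scoring a first-step answer token with the first-step gap, are further simplifying assumptions.

```python
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]
# Hypothetical interface: maps a token prefix to a next-token probability table.
NextTokenFn = Callable[[Prefix], Dict[str, float]]

# Toy stand-in for an LLM answering "I have 3 apples, my dad has 2 more
# apples than me, how many apples do we have in total?" (assumed numbers).
TOY_MODEL: Dict[Prefix, Dict[str, float]] = {
    (): {"5": 0.40, "You": 0.35, "The": 0.25},
    ("5",): {"<eos>": 1.0},
    ("You",): {"have 3, your dad has 3+2=5, so the total is": 1.0},
    ("You", "have 3, your dad has 3+2=5, so the total is"): {"8": 0.95, "5": 0.05},
    ("You", "have 3, your dad has 3+2=5, so the total is", "8"): {"<eos>": 1.0},
    ("The",): {"answer is": 1.0},
    ("The", "answer is"): {"5": 0.55, "8": 0.45},
    ("The", "answer is", "5"): {"<eos>": 1.0},
}


def top2_margin(probs: Dict[str, float]) -> float:
    """Gap between the most probable and second most probable token."""
    ranked = sorted(probs.values(), reverse=True) + [0.0]
    return ranked[0] - ranked[1]


def decode_branch(next_token: NextTokenFn, first_token: str, first_margin: float,
                  max_len: int = 16, eos: str = "<eos>") -> List[Tuple[str, float]]:
    """Force the first token, then continue greedily, recording each step's margin."""
    steps = [(first_token, first_margin)]
    while len(steps) < max_len:
        probs = next_token(tuple(tok for tok, _ in steps))
        best = max(probs, key=probs.get)
        if best == eos:
            break
        steps.append((best, top2_margin(probs)))
    return steps


def cot_decode(next_token: NextTokenFn, k: int = 3) -> List[Tuple[float, str]]:
    """Branch on the top-k first tokens and rank the paths by answer confidence."""
    first = next_token(())
    first_margin = top2_margin(first)
    candidates = sorted(first, key=first.get, reverse=True)[:k]
    results = []
    for tok in candidates:
        steps = decode_branch(next_token, tok, first_margin)
        # Assumption: any all-digit token counts as part of the answer.
        margins = [m for t, m in steps if t.isdigit()]
        confidence = sum(margins) / len(margins) if margins else 0.0
        results.append((confidence, " ".join(t for t, _ in steps)))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    for confidence, text in cot_decode(TOY_MODEL.__getitem__, k=3):
        print(f"{confidence:.2f}  {text}")
```

With `k=1` the sketch reduces to greedy decoding and returns the direct answer "5". With `k=3` the highest-confidence path in this toy table is the one that spells out its intermediate steps and ends in "8", which mirrors the correlation between CoT paths and answer confidence that the abstract describes.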
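Q: What problem does this paper try to solve?
A: This paper addresses how to elicit chain-of-thought (CoT) reasoning paths from large language models (LLMs) effectively without relying on specific prompting techniques. Specifically, the researchers explore the following key questions:
Overall, the paper's core contribution is a new perspective on exploring and exploiting the reasoning abilities of LLMs: changing the decoding strategy so that the model's reasoning paths surface naturally, rather than depending on external prompts or additional training. This approach not only improves model performance on a variety of reasoning tasks but also offers new insight into the model's intrinsic reasoning mechanisms.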
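Q: What related work is there?
A: The paper mentions a number of studies related to the reasoning abilities of large language models (LLMs). Some of the key related work is listed below:
These studies provide a rich theoretical and practical foundation for understanding the reasoning abilities of LLMs, improving their reasoning performance, and probing their internal mechanisms. Building on this work, the CoT-decoding method proposed in the paper further explores how changing the decoding strategy can elicit and exploit a model's intrinsic reasoning abilities.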