GPT-FedRec

Federated Recommendation via Hybrid Retrieval Augmented Generation

Authors:

Huimin Zeng, Zhenrui Yue, Qian Jiang, Dong Wang.

Summary:

Federated Recommendation (FR) emerges as a novel paradigm that enables privacy-preserving recommendations. However, traditional FR systems usually represent users/items with discrete identities (IDs), suffering from performance degradation due to the data sparsity and heterogeneity in FR. On the other hand, Large Language Models (LLMs) as recommenders have proven effective across various recommendation scenarios. Yet, LLM-based recommenders encounter challenges such as low inference efficiency and potential hallucination, compromising their performance in real-world scenarios. To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. GPT-FedRec is a two-stage solution. The first stage is a hybrid retrieval process, mining ID-based user patterns and text-based item features. Next, the retrieved results are converted into text prompts and fed into GPT for re-ranking. Our proposed hybrid retrieval mechanism and LLM-based re-ranking aim to extract generalized features from data and exploit the pretrained knowledge within the LLM, overcoming data sparsity and heterogeneity in FR. In addition, the RAG approach also prevents LLM hallucination, improving the recommendation performance for real-world users. Experimental results on diverse benchmark datasets demonstrate the superior performance of GPT-FedRec against state-of-the-art baseline methods.

Q&A:

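Q: What problem does this paper try to solve?

A: The paper addresses data sparsity and data heterogeneity in Federated Recommendation (FR) systems. To protect user privacy, FR systems typically adopt a distributed architecture in which a central server stores the global recommendation model while each client holds its own private local data. The clients train the global model collaboratively without sharing that private data. However, because the data is sparse (each client may hold data from only a few users) and heterogeneous (different clients may cover different ranges of items), traditional ID-based recommenders perform worse in this setting, especially for new (cold-start) users.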
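The server/client split described above can be illustrated with a generic federated averaging round. This is only a sketch of the general FR training pattern, not the training procedure of GPT-FedRec; the names `local_update` and `federated_average`, the flat-list weight format, and the learning rate are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical model format: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def local_update(global_weights: Weights, local_grads: Weights, lr: float = 0.1) -> Weights:
    """One illustrative local step: the client starts from the global model
    and applies a gradient computed only on its own private data."""
    return {
        name: [w - lr * g for w, g in zip(values, local_grads[name])]
        for name, values in global_weights.items()
    }


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Server-side aggregation: average the client models, weighted by the
    amount of local data. Only weights reach the server, never raw data."""
    total = sum(client_sizes)
    first = client_weights[0]
    return {
        name: [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }


# Two clients with different (heterogeneous) gradients and data sizes.
global_model = {"emb": [1.0, 2.0]}
client_a = local_update(global_model, {"emb": [1.0, 0.0]}, lr=0.5)
client_b = local_update(global_model, {"emb": [0.0, 2.0]}, lr=0.5)
new_global = federated_average([client_a, client_b], client_sizes=[1, 3])
```

Weighting by data size means a client with very few interactions contributes little to the global model, which is one way sparsity and heterogeneity across clients show up in practice.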

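To overcome these challenges, the paper proposes GPT-FedRec, a federated recommendation framework that uses ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. GPT-FedRec works in two stages: first, a hybrid retrieval process mines ID-based user patterns and text-based item features; second, the retrieved results are converted into text prompts and fed into GPT for re-ranking. The approach aims to extract generalized features from the data and exploit the pretrained knowledge in Large Language Models (LLMs), improving recommendation performance in the FR setting. The RAG approach is also intended to prevent LLM hallucination, further improving recommendations for real-world users.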
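The two stages can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear score fusion, the weight `alpha`, the min-max normalization, the prompt wording, and the function names `hybrid_retrieve` and `build_rerank_prompt` are all hypothetical, and the candidate scores are assumed to come from an ID-based retriever and a text-based retriever that are not shown. No LLM is called here; the sketch stops at the prompt that would be sent for re-ranking.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def hybrid_retrieve(
    id_scores: Dict[str, float],
    text_scores: Dict[str, float],
    alpha: float = 0.5,
    k: int = 3,
) -> List[str]:
    """Stage 1 (assumed fusion rule): mix ID-based and text-based scores
    linearly and keep the top-k items. Items missing from one retriever
    get a score of 0 from that retriever."""
    id_norm, text_norm = min_max(id_scores), min_max(text_scores)
    fused = {
        item: alpha * id_norm.get(item, 0.0) + (1 - alpha) * text_norm.get(item, 0.0)
        for item in set(id_norm) | set(text_norm)
    }
    # Sort by descending score; break ties by item name for determinism.
    return sorted(fused, key=lambda item: (-fused[item], item))[:k]


def build_rerank_prompt(history: List[str], candidates: List[str]) -> str:
    """Stage 2: turn the retrieved candidates into a text prompt. Limiting
    the model to the listed candidates is what grounds its answer."""
    history_text = "\n".join(f"- {title}" for title in history)
    candidate_text = "\n".join(
        f"{i}. {title}" for i, title in enumerate(candidates, start=1)
    )
    return (
        "The user interacted with these items, in order:\n"
        f"{history_text}\n\n"
        "Candidate items retrieved for the next interaction:\n"
        f"{candidate_text}\n\n"
        "Rank only the candidates above from most to least likely. "
        "Do not add items that are not in the list."
    )


candidates = hybrid_retrieve(
    id_scores={"A": 3.0, "B": 1.0, "C": 2.0},
    text_scores={"B": 0.9, "C": 0.1, "D": 0.5},
)
prompt = build_rerank_prompt(history=["X", "Y"], candidates=candidates)
```

Because the prompt contains only retrieved items, the re-ranker is asked to order real candidates instead of generating item names freely, which is the mechanism by which retrieval is meant to limit hallucination.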

Q: What related work is there?

A: The related work falls mainly into the following areas:

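Federated Recommendation:
- Traditional federated recommendation systems typically use ID-based models such as FCF, FedRec, and FedMF, which represent users and items as unique identities (IDs).
- To improve generalization, work such as Wu et al. (2022) trains domain-decoupled sequential recommenders.
- To address data sparsity and heterogeneity, Zhang et al. (2024) propose TransFR, a text-based FR solution.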
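Natural language processing in recommender systems:
- Some recommenders use text features; for example, Hou et al. (2023a, 2022) train sequential recommenders on item text encoded by a language model.
- Li et al. (2023a) and Geng et al. (2022) fine-tune pretrained language models on item descriptions and then use them as recommenders.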
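Recommender systems based on Large Language Models (LLMs):
- LLMs have attracted attention for their ability to generalize across tasks; for example, Hou et al. (2023b) use LLMs as conversational agents to understand user preferences and improve recommendations.
- Other work, such as Sun et al. (2023) and He et al. (2023), uses LLMs for next-item recommendation.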
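Retrieval-augmented recommendation frameworks:
- To reduce the hallucinations that LLMs can produce in recommendation, researchers have designed retrieval-augmented recommendation frameworks, such as De Cao et al. (2020) and Tay et al. (2022).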
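Sequential recommendation:
- Because sequential recommendation outperforms traditional recommender systems, this paper restricts its scope to sequential recommendation.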