Federated Recommendation via Hybrid Retrieval Augmented Generation

https://papers.cool/arxiv/2403.04256

Authors:

Huimin Zeng, Zhenrui Yue, Qian Jiang, Dong Wang.

Summary:

Federated Recommendation (FR) emerges as a novel paradigm that enables privacy-preserving recommendations. However, traditional FR systems usually represent users/items with discrete identities (IDs), suffering from performance degradation due to data sparsity and heterogeneity in FR. On the other hand, Large Language Models (LLMs) as recommenders have proven effective across various recommendation scenarios. Yet, LLM-based recommenders encounter challenges such as low inference efficiency and potential hallucination, compromising their performance in real-world scenarios. To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. GPT-FedRec is a two-stage solution. The first stage is a hybrid retrieval process, mining ID-based user patterns and text-based item features. Next, the retrieved results are converted into text prompts and fed into GPT for re-ranking. Our proposed hybrid retrieval mechanism and LLM-based re-ranking aim to extract generalized features from data and exploit pretrained knowledge within the LLM, overcoming data sparsity and heterogeneity in FR. In addition, the RAG approach also prevents LLM hallucination, improving the recommendation performance for real-world users. Experimental results on diverse benchmark datasets demonstrate the superior performance of GPT-FedRec against state-of-the-art baseline methods.

Q&A:

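Q : What problem does this paper try to solve?

A : The paper addresses data sparsity and data heterogeneity in Federated Recommendation (FR) systems. To protect user privacy, FR systems typically use a distributed architecture in which a central server stores the global recommendation model while each client holds its own local private data. The clients collaboratively train the global model without sharing their private data. However, because the data is sparse (each client may hold data for only a few users) and heterogeneous (different clients may cover different ranges of items), traditional ID-based recommenders perform worse in this setting, especially for new (cold-start) users.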
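The collaborative training described above can be illustrated with a minimal sketch. It assumes a federated-averaging style aggregation, which the summary does not specify; the function name `federated_average` and the toy weight vectors are hypothetical. Each client sends only its locally trained parameters and its sample count, never its private interactions.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Assumption: a FedAvg-style server step. The summarized paper may use a
    different aggregation rule.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Toy parameters from three clients; the raw interaction data stays local.
client_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
client_sizes = [2, 1, 1]  # number of local training samples per client
global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```

With sparse clients the per-client sample counts are small, so each local update is noisy, which is one way to see why ID-based models degrade in this setting.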

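To overcome these challenges, the paper proposes GPT-FedRec, a federated recommendation framework that uses ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism. GPT-FedRec works in two stages: first, a hybrid retrieval process mines ID-based user patterns and text-based item features; second, the retrieved results are converted into text prompts and fed to GPT for re-ranking. The approach aims to extract generalized features from the data and to exploit the pretrained knowledge in Large Language Models (LLMs), improving recommendation performance in the FR setting. In addition, the RAG approach aims to prevent LLM hallucination, further improving recommendation performance for real-world users.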
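The two stages can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the linear score fusion, the weight `alpha`, the function names `hybrid_retrieve` and `build_rerank_prompt`, the prompt wording, and the toy scores are all hypothetical. The actual LLM call is omitted.

```python
from typing import Dict, List


def hybrid_retrieve(
    id_scores: Dict[str, float],
    text_scores: Dict[str, float],
    alpha: float = 0.5,
    k: int = 3,
) -> List[str]:
    """Stage 1 (assumed form): fuse ID-based and text-based scores, keep top-k."""
    items = set(id_scores) | set(text_scores)
    fused = {
        item: alpha * id_scores.get(item, 0.0)
        + (1.0 - alpha) * text_scores.get(item, 0.0)
        for item in items
    }
    # Sort by descending fused score; break ties by title for determinism.
    return sorted(fused, key=lambda item: (-fused[item], item))[:k]


def build_rerank_prompt(history: List[str], candidates: List[str]) -> str:
    """Stage 2 (assumed form): turn retrieved candidates into a re-ranking prompt."""
    lines = ["The user interacted with these items, in order:"]
    lines += [f"- {title}" for title in history]
    lines.append("Rank the candidates below by how likely the user picks them next.")
    lines.append("Only use items from this list:")
    lines += [f"{i}. {title}" for i, title in enumerate(candidates, start=1)]
    return "\n".join(lines)


# Toy scores from a hypothetical ID-based and a hypothetical text-based retriever.
id_scores = {"Toy Story": 0.9, "Heat": 0.2, "Up": 0.6}
text_scores = {"Toy Story": 0.4, "Up": 0.8, "Cars": 0.7}

candidates = hybrid_retrieve(id_scores, text_scores, alpha=0.5, k=3)
prompt = build_rerank_prompt(["Finding Nemo", "Monsters, Inc."], candidates)
print(prompt)
```

Restricting the prompt to retrieved candidates is one plausible way the retrieval step limits hallucination: the LLM re-orders items that exist in the catalog rather than generating titles freely.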

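Q : What related research is there?

A : Related research falls mainly into the following areas:

  1. Federated recommendation
  2. Applications of natural language processing in recommender systems
  3. Recommender systems based on large language models (LLMs)
  4. Retrieval-augmented recommendation frameworks
  5. Sequential recommendation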