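Recently, a paper on answer re-ranking by Qian Lingfei (钱凌飞), a PhD student in our lab, was accepted by Information Sciences, a journal ranked in Zone 1 of the Chinese Academy of Sciences SCI journal partition and in Class B of the CCF list.
Title: Multi-perspective respondent representations for answer ranking in community question answering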
Abstract: Answer ranking is an important task in community question answering (CQA) systems. It aims at ranking useful answers above useless answers. Existing works learn respondents’ expertise to help estimate the quality of answers. However, in most of these works, the expertise is learned only from historical answers. As a result, structural correlations between question raisers and respondents are usually ignored. Besides, these works lack an efficient way to learn respondent expertise from extensive historical answers. To address these limitations, we propose a novel multi-perspective respondent representation learning (MPRR) network. First, our model learns embeddings of raisers and respondents through a heterogeneous information network (HIN) constructed from the answering records in CQA websites. The structural correlations between raisers and respondents are preserved in the learned embeddings. Second, a frozen pre-trained language model is used to learn respondents’ expertise from historical answer contents more quickly. Then the multi-perspective respondent representations are generated based on their expertise and the embeddings learned in the HIN. Finally, the raisers, respondents, questions, and answers are all considered to compute the matching scores. We evaluate our model on three real-world CQA datasets. Experimental results show that MPRR outperforms all baseline models on three ranking metrics across all datasets.
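Answer re-ranking aims to rank the candidate answers to a given question so that a community question answering platform can automatically show high-quality answers first. Existing studies usually use a respondent's answer history to assess the respondent's expertise and thereby improve the model's accuracy in identifying high-quality answers. However, most methods overlook the structural correlations between users that are contained in the user interaction network, even though jointly learning a respondent's heterogeneous information can help the model assess expertise more comprehensively. In addition, expert respondents in a community usually have a large number of answer records, so fusing their heterogeneous information often causes efficiency problems. To address this, the paper proposes an adaptive method for embedding users' heterogeneous information. The method first learns structural correlation features between users from the user interaction network. At the same time, guided by the topic of the current question, it uses a pre-trained language model together with the historical answer texts to assess the respondent's expertise in the current context. The respondent's heterogeneous information is then fused. Experimental results show that the method learns respondents' heterogeneous information faster while also improving task performance.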
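To make the pipeline described above more concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation. Every function name, the attention-style pooling, the fusion by summed dot products, and the toy vectors are assumptions chosen for illustration. The vectors stand in for outputs of a frozen pre-trained language model and for embeddings learned from the user interaction network.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def expertise_vector(question_vec, history_vecs):
    """Pool a respondent's history-answer vectors, weighted by their
    similarity to the current question (assumed attention-style pooling)."""
    if not history_vecs:
        return [0.0] * len(question_vec)
    weights = softmax([dot(question_vec, h) for h in history_vecs])
    dim = len(history_vecs[0])
    return [sum(w * h[i] for w, h in zip(weights, history_vecs)) for i in range(dim)]

def matching_score(question_vec, raiser_vec, candidate):
    """Toy score combining three signals (an assumed fusion, not the paper's):
    question-answer relevance, raiser-respondent structural affinity,
    and question-conditioned respondent expertise."""
    expertise = expertise_vector(question_vec, candidate["history_vecs"])
    return (dot(question_vec, candidate["answer_vec"])
            + dot(raiser_vec, candidate["structure_vec"])
            + dot(question_vec, expertise))

def rank_answers(question_vec, raiser_vec, candidates):
    """Return candidate answer ids sorted from highest to lowest score."""
    ordered = sorted(candidates,
                     key=lambda c: matching_score(question_vec, raiser_vec, c),
                     reverse=True)
    return [c["answer_id"] for c in ordered]

# Hypothetical 2-dimensional toy data.
question_vec = [1.0, 0.0]
raiser_vec = [1.0, 0.0]
candidates = [
    {"answer_id": "a2", "answer_vec": [0.0, 1.0],
     "structure_vec": [0.0, 1.0], "history_vecs": [[0.0, 1.0]]},
    {"answer_id": "a1", "answer_vec": [1.0, 0.0],
     "structure_vec": [1.0, 0.0], "history_vecs": [[1.0, 0.0], [0.0, 1.0]]},
]
ranking = rank_answers(question_vec, raiser_vec, candidates)
print(ranking)
```

In this sketch the history vectors are treated as precomputed inputs. That mirrors the idea of a frozen language model, whose encodings can be cached rather than recomputed during training. Whether the paper caches them in this way is not stated in the announcement.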