Lab research accepted by the journal Neurocomputing
Source: IR Lab       Posted: 2019/3/31 21:39:57

  The lab recently received an email from the editorial office of Neurocomputing: the paper "Incorporating query constraints for autoencoder enhanced ranking" by Xu Bo and colleagues in the lab has been accepted.

  Abstract (Chinese): Learning to rank is a widely used technique for building ranking models in information retrieval. It takes supervised machine learning as its core algorithm and traditional retrieval models as document features; since feature quality is closely tied to ranking-model performance, how to construct effective document features merits deeper study. This paper combines autoencoders with learning to rank, attempting to extend the sample space of learning to rank through feature optimization. Building on this, it focuses on query constraints in information retrieval, models the query-constraint terms with pairwise and listwise ranking loss functions, and integrates them into the feature-optimization objective. Experimental results on the public learning-to-rank benchmark LETOR demonstrate the effectiveness of the method.
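The abstract above describes folding a pairwise ranking loss into an autoencoder's feature-optimization objective. A minimal sketch of that idea, assuming a linear autoencoder trained by hand-derived gradient descent, with a pairwise hinge loss on scores computed from the learned codes added to the reconstruction objective (the toy data, dimensions, constraint weight `lam`, and learning rate are all illustrative, not the paper's actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 6 documents for one query, 8 raw features each,
# with graded relevance labels (illustrative, not LETOR data).
X = rng.normal(size=(6, 8))
rel = np.array([2, 1, 0, 2, 0, 1])

d_in, d_hid = 8, 3
W_enc = rng.normal(scale=0.1, size=(d_in, d_hid))   # linear encoder
W_dec = rng.normal(scale=0.1, size=(d_hid, d_in))   # linear decoder
w_rank = rng.normal(scale=0.1, size=d_hid)          # scoring vector on codes
lam, lr = 0.5, 0.05                                 # constraint weight, step size

# Preference pairs (i, j) with rel[i] > rel[j] drive the pairwise hinge term.
pairs = [(i, j) for i in range(6) for j in range(6) if rel[i] > rel[j]]

losses = []
for step in range(200):
    H = X @ W_enc                      # codes (optimized document features)
    X_hat = H @ W_dec                  # reconstruction
    recon = X_hat - X
    s = H @ w_rank                     # ranking scores from codes
    margins = np.array([1.0 - (s[i] - s[j]) for i, j in pairs])
    # Joint objective: reconstruction + pairwise ranking constraint.
    loss = (recon ** 2).mean() + lam * np.maximum(margins, 0).mean()
    losses.append(loss)

    # Hand-derived gradients for the linear model.
    g_Xhat = 2.0 * recon / X.size
    g_Wdec = H.T @ g_Xhat
    g_H = g_Xhat @ W_dec.T
    g_s = np.zeros(6)
    for (i, j), m in zip(pairs, margins):
        if m > 0:                      # only violated pairs contribute
            g_s[i] -= 1.0 / len(pairs)
            g_s[j] += 1.0 / len(pairs)
    g_H += lam * np.outer(g_s, w_rank)
    W_enc -= lr * (X.T @ g_H)
    W_dec -= lr * g_Wdec
    w_rank -= lr * lam * (H.T @ g_s)

# Training should reduce the joint reconstruction + ranking objective.
print(losses[-1] < losses[0])
```

The point of the joint objective is that the encoder is pushed to produce codes that both reconstruct the input and respect the query's preference ordering; a listwise constraint would replace the hinge over pairs with a loss over the full permutation of documents for the query.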

  Abstract: Learning to rank has been widely used in information retrieval to construct ranking models for document retrieval. Existing learning-to-rank methods adopt supervised machine learning as the core technique and classical retrieval models as document features. Since the quality of document features can significantly affect the effectiveness of ranking models, it is necessary to generate effective document features that extend the feature space of learning to rank and better model the relevance between queries and their corresponding documents. Recently, deep neural network models have been used to generate effective features for various text mining tasks. Autoencoders, one type of neural network building block, capture semantic information as effective features based on an encoder-decoder framework. In this paper, we incorporate autoencoders into the construction of ranking models based on learning to rank. In our method, autoencoders generate effective document features that capture the semantic information of documents. We propose a query-level semi-supervised autoencoder that considers three types of query constraints based on Bregman divergence. We evaluate the effectiveness of our model on datasets from LETOR 3.0 and LETOR 4.0, and show that it significantly outperforms competing methods in retrieval performance.
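The English abstract formulates the query constraints via Bregman divergence. As background, the Bregman divergence of a strictly convex function phi is D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, and it generalizes several familiar distances. A small sketch (function names are illustrative, not from the paper) verifying two standard instances:

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - grad_phi(y) @ (x - y)

# Instance 1: phi(x) = ||x||^2 recovers the squared Euclidean distance.
phi_sq = lambda v: float(v @ v)
grad_sq = lambda v: 2.0 * v
x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, 1.5, 2.0])
print(np.isclose(bregman(phi_sq, grad_sq, x, y), float((x - y) @ (x - y))))  # True

# Instance 2: phi(p) = sum p log p (negative entropy) recovers the KL
# divergence for probability vectors; the gradient's "+1" terms cancel
# because both vectors sum to 1.
phi_ent = lambda p: float(np.sum(p * np.log(p)))
grad_ent = lambda p: np.log(p) + 1.0
p = np.array([0.2, 0.3, 0.5])
q = np.array([0.25, 0.25, 0.5])
kl = float(np.sum(p * np.log(p / q)))
print(np.isclose(bregman(phi_ent, grad_ent, p, q), kl))  # True
```

Choosing phi thus fixes the geometry under which the constraint terms measure how far document features drift from satisfying a query's requirements; the paper's three query constraints are built on this family.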