Lab's Research on Homographic Pun Detection Accepted by the Journal Knowledge-Based Systems
News source: IR Lab       Published: 2019/9/21 0:00:00

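  The lab recently received an email from the editorial office of the journal Knowledge-Based Systems confirming that the research work by lab member Yufeng Diao (刁宇峰) et al., “CRGA: Homographic Pun Detection with a Contextualized-Representation Gated Attention Network”, has been accepted.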

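  Summary: Homographic pun detection is a fundamental research topic in natural language processing. A homographic pun produces humor through the latent relationship between the pun word and its semantically similar target. Puns are widely used in both written and spoken language and have a long history. However, the ambiguity of homographic puns remains a major challenge that current methods do not handle well. To address this, we propose a new contextualized-representation gated attention (CRGA) network for detecting homographic puns. The architecture has two main advantages. First, it uses contextual semantic representations across different linguistic contexts to resolve the polysemy of homographic puns. Second, the CRGA model combines global semantic understanding, local script understanding, self-attention over pun characteristics, and a gated attention mechanism to detect homographic puns. With this design, the CRGA model can effectively capture polysemy information, which helps homographic pun detection. We evaluate the model on the commonly used SemEval2017 Task7 and Pun of the Day datasets, and the experimental results demonstrate the effectiveness of the proposed CRGA model and its advance over existing methods.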

  Abstract: Detecting a homographic pun is one of the fundamental research tasks in natural language processing. A homographic pun is able to produce humor through the latent relationship between the pun and its semantically similar target. Puns have been widely applied in written and spoken forms of human language and have a long history. However, the ambiguity of a homographic pun is still a major challenge that cannot be addressed well with current methods. To alleviate this problem, we present a novel contextualized-representation gated attention (CRGA) network for the detection of homographic puns. This architecture has several advantages that can be exploited: one is that a contextual representation is used across varying linguistic contexts to address the polysemy of homographic puns; another is that the CRGA model is able to detect homographic puns by combining global semantic understanding, local script understanding, pun-characteristic self-attention and a gated mechanism. With this design, the CRGA model can effectively capture the polysemy information, which is helpful for homographic pun detection. The experimental results based on the common SemEval2017 Task7 and Pun of the Day datasets demonstrate the effectiveness and advancement of our proposed CRGA model.