情报科学 (Information Science), 2021, Vol. 39, Issue 11: 90-95.

• 业务研究 (Professional Research) •

Automatic Identification of Query Expressions for Three Types of Query Intent Ambiguity

  

  • Online: 2021-11-01 Published: 2021-11-15

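Summary: 【Purpose/significance】This study addresses the automatic identification of query intent ambiguity. It examines feature effectiveness and the classification accuracy of different classification algorithms on three types of query intent ambiguity, with the aim of informing and guiding subsequent research. 【Method/process】First, a query classification scheme oriented toward query intent ambiguity is proposed. Next, six categories of features are constructed, covering query features and relevant-document features. Finally, decision tree, neural network, and k-nearest neighbor algorithms are each applied to examine the effectiveness of different feature combinations and the classification accuracy of the different algorithms. 【Result/conclusion】① Classification accuracy improves by 49.5% relative to the baseline experiment; ② classification using query features outperforms classification using relevant-document features; ③ the decision tree's classification accuracy is slightly higher than that of the other two algorithms. 【Innovation/limitation】A query classification scheme oriented toward query intent ambiguity is constructed, and a classification task covering three types of query intent ambiguity is completed; however, owing to limited access to datasets, validation was performed on only 200 data items.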

Abstract: 【Purpose/significance】This paper investigates the effectiveness of classification features and compares the performance of three classifiers in a query intent ambiguity classification task. 【Method/process】The paper first constructs a query taxonomy of ambiguity and then extracts query-based and document-based features. It then tests accuracy using decision tree, neural network, and k-nearest neighbor classifiers individually, with various combinations of features. 【Result/conclusion】① Accuracy increases by 49.5% compared with the baseline; ② compared with document-based features, query-based features achieve better accuracy; ③ the decision tree performs best among the tested classifiers. 【Innovation/limitation】A query taxonomy of ambiguity is constructed, and a query classification task based on three types of ambiguity is realized; owing to limited dataset accessibility, the experiments are run on a dataset of limited size.
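
The experimental design summarized in the abstract (scoring a classifier over combinations of feature groups and comparing against a baseline) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the feature-group names, the class labels, the synthetic data, the leave-one-out protocol, the majority-class baseline, and the reading of "49.5%" as a relative improvement. Only a plain k-nearest neighbor classifier is shown; the decision tree and neural network are omitted to keep the sketch self-contained.

```python
import itertools
import math
import random
from collections import Counter

# Hypothetical feature groups (name -> column indices). The abstract only
# states that six groups of query and relevant-document features exist.
FEATURE_GROUPS = {f"group_{i + 1}": [i] for i in range(6)}


def make_synthetic_samples(n_per_class=20, seed=0):
    """Build toy (vector, label) pairs; this is NOT the paper's data set."""
    rng = random.Random(seed)
    samples = []
    for class_index, label in enumerate(["type_1", "type_2", "type_3"]):
        for _ in range(n_per_class):
            # Columns 0-2 shift with the class; columns 3-5 are pure noise.
            informative = [rng.gauss(3.0 * class_index, 1.0) for _ in range(3)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
            samples.append((informative + noise, label))
    return samples


def knn_predict(train, query, k):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(samples, columns, k=3):
    """Leave-one-out accuracy of k-NN restricted to the given columns."""
    projected = [([vec[c] for c in columns], label) for vec, label in samples]
    hits = 0
    for i, (vec, label) in enumerate(projected):
        rest = projected[:i] + projected[i + 1:]
        hits += knn_predict(rest, vec, k) == label
    return hits / len(projected)


def majority_baseline(samples):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(label for _, label in samples)
    return counts.most_common(1)[0][1] / len(samples)


def relative_improvement(baseline, accuracy):
    """Relative gain over the baseline, e.g. 0.5 means 50% higher."""
    return (accuracy - baseline) / baseline


def evaluate_combinations(samples, groups, k=3):
    """Score every non-empty combination of feature groups."""
    results = {}
    names = sorted(groups)
    for size in range(1, len(names) + 1):
        for combo in itertools.combinations(names, size):
            columns = [c for name in combo for c in groups[name]]
            results[combo] = loo_accuracy(samples, columns, k)
    return results


if __name__ == "__main__":
    data = make_synthetic_samples()
    base = majority_baseline(data)
    scores = evaluate_combinations(data, FEATURE_GROUPS)
    best_combo, best_acc = max(scores.items(), key=lambda kv: kv[1])
    print("baseline accuracy:", round(base, 3))
    print("best combination:", best_combo, "accuracy:", round(best_acc, 3))
    print("relative improvement:", round(relative_improvement(base, best_acc), 3))
```

With six groups the loop scores 63 non-empty combinations. An exhaustive search of this kind is only practical for a small number of feature groups, and the numbers it prints describe the synthetic data, not the results reported in the paper.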