Information Science (情报科学), 2021, Vol. 39, Issue (4): 23-29.

• Theoretical Research •

Research on Constructing Chinese Text Semantic Graphs Based on Multiple Knowledge Graphs

  

  • Online:2021-04-01 Published:2021-04-09

Abstract:

【Purpose/significance】Traditional text representation methods often perform poorly because they ignore the semantic
relationships between words. Constructing a text semantic graph based on knowledge graphs can address this semantic deficiency.
【Method/process】First, a semantic graph is constructed from the encyclopedic knowledge graph CN-DBpedia, with the named
entities in the text as nodes and the semantic relationships between entities as edges. Second, the concept graph CN-Probase is
introduced to map entities to concepts, producing an enhanced Chinese text semantic graph that incorporates concept-level
knowledge. Finally, the proposed method is validated using pattern discovery in news text as an example task.
【Result/conclusion】This study proposes a novel method for constructing Chinese text semantic graphs based on multiple
knowledge graphs.【Innovation/limitation】The method represents Chinese text semantically at both the entity level and the
concept level, and can be applied to natural language processing tasks such as text classification and text analysis. Its
limitation is that it has been validated only on news texts.
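
The two-stage construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the triples and the entity-to-concept table below are small hypothetical stand-ins for CN-DBpedia and CN-Probase, the function names are invented for illustration, and named entity recognition is assumed to have already produced the entity list.

```python
# Hypothetical stand-in for CN-DBpedia: (head, relation, tail) triples.
ENTITY_TRIPLES = [
    ("Huawei", "founder", "Ren Zhengfei"),
    ("Huawei", "headquartered_in", "Shenzhen"),
    ("Tencent", "headquartered_in", "Shenzhen"),
]

# Hypothetical stand-in for CN-Probase: entity -> list of concepts.
ENTITY_CONCEPTS = {
    "Huawei": ["company"],
    "Tencent": ["company"],
    "Ren Zhengfei": ["entrepreneur"],
    "Shenzhen": ["city"],
}


def build_entity_graph(entities, triples):
    """Stage 1: nodes are the entities found in the text; edges are
    knowledge-graph relations whose head and tail are both in the text."""
    nodes = set(entities)
    edges = {(h, r, t) for h, r, t in triples if h in nodes and t in nodes}
    return nodes, edges


def add_concept_layer(nodes, edges, entity_concepts, relation="isA"):
    """Stage 2: map each entity to its concepts and add the concepts as
    extra nodes, linked to the entity by an (assumed) isA edge."""
    concept_nodes = set()
    concept_edges = set()
    for entity in nodes:
        for concept in entity_concepts.get(entity, []):
            concept_nodes.add(concept)
            concept_edges.add((entity, relation, concept))
    return nodes | concept_nodes, edges | concept_edges


# Entities assumed to come from an upstream named entity recognition step.
text_entities = ["Huawei", "Ren Zhengfei", "Shenzhen"]
nodes, edges = build_entity_graph(text_entities, ENTITY_TRIPLES)
all_nodes, all_edges = add_concept_layer(nodes, edges, ENTITY_CONCEPTS)
```

In this toy example the Tencent triple is dropped because Tencent does not occur in the text, and the concept layer adds one node per distinct concept, so two texts that mention different companies would still share the `company` node at the concept level.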