Information Science (情报科学) ›› 2021, Vol. 39 ›› Issue (9): 146-154.

• Doctoral Forum •

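Temporal Modeling and Evolution Analysis of Public Policy Texts Integrating Word Vector Semantic Enhancement and the DTM Model: A Case Study of the "Big Data Domain"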

  

  • Online: 2021-09-01 Published: 2021-10-21

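Summary: [Purpose/Significance] To detect the semantic topics of policy texts in a specific domain and reveal the areas of policy deployment and future development trends in China. [Method/Process] We propose a temporal modeling and visualization method for public policy texts that integrates word vector semantic enhancement with the DTM model. The DTM model performs time slicing and topic modeling of the policy texts, and the Skip-gram word embedding technique of the deep learning Word2vec algorithm, which effectively predicts context words, is used to enhance the semantic expressiveness and policy interpretability of the topics, so as to reveal the deployment priorities of China's public policy more accurately. [Result/Conclusion] Experiments show that the proposed method offers better knowledge extraction and semantic expression for public policy topic identification and policy text quantification, and is effective for mining China's public policy and revealing the information it contains. [Innovation/Limitation] The proposed method, which integrates word vector semantic enhancement and the DTM model, improves the topic-level semantic expression of policy texts to some extent. Future work will consider deep learning techniques such as the LSTM algorithm and the BERT model to identify domain knowledge units and syntactic structures in policies.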

Abstract: [Purpose/significance] To explore the topics of policy texts in a specific field and reveal the areas of policy deployment and future development trends in China. [Method/process] This paper proposes a time-series modeling and visualization method for public policy texts that integrates word vector semantic enhancement with the DTM model. The DTM model is used to slice the policy texts by time and model their topics, and the Skip-gram word embedding technique of the deep learning Word2vec algorithm, which effectively predicts context vocabulary, is used to enhance semantic expression and policy interpretation and so reveal the deployment focus of public policy in China more accurately. [Result/conclusion] Experiments show that the proposed method has better knowledge extraction and semantic expression ability for public policy topic recognition and policy text quantification, and works well for public policy mining and information disclosure in China. [Innovation/limitation] This paper proposes a temporal modeling method for public policy texts that integrates word vector semantic enhancement and the DTM model, and improves the topic semantic expression of policy texts to a certain extent. In the future, deep learning technologies such as the LSTM algorithm and the BERT model will be considered for identifying domain knowledge units and syntactic structures in policies.
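
The two preprocessing ideas named in the abstract, slicing policy texts by time for the DTM and letting Skip-gram predict context words from a center word, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names `slice_by_year` and `skipgram_pairs`, and the window size are all assumptions made for illustration, and the sketch only prepares the inputs rather than training either model.

```python
from collections import defaultdict


def slice_by_year(docs):
    """Group (year, tokens) documents into time slices ordered by year.

    A dynamic topic model consumes the corpus as an ordered sequence of
    time slices; here the slice key is assumed to be the publication year.
    """
    slices = defaultdict(list)
    for year, tokens in docs:
        slices[year].append(tokens)
    return dict(sorted(slices.items()))


def skipgram_pairs(tokens, window=2):
    """Return (center, context) training pairs for one tokenized document.

    Skip-gram is trained to predict each context word within `window`
    positions of a center word, so every such pair is one training example.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy corpus: (publication year, tokenized policy text).
docs = [
    (2015, ["big", "data", "development", "action"]),
    (2016, ["big", "data", "industry", "plan"]),
    (2015, ["data", "sharing", "platform"]),
]

slices = slice_by_year(docs)
pairs = skipgram_pairs(docs[0][1], window=1)
```

In a full pipeline, the per-year slices would be passed to a DTM implementation and the pairs (or the raw token lists) to a Word2vec trainer; how the resulting word vectors are then combined with the topic words is not specified in the abstract and is left out here.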