Information Science (情报科学) ›› 2021, Vol. 39 ›› Issue (10): 11-17.

• Monograph •

Research on Topic Mining and a Structured Parsing Framework for Policy Texts Based on LDA2Vec

  

  • Online: 2021-10-01 Published: 2021-11-01

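Abstract (translated from the Chinese): 【Purpose/significance】Taking topics as its core, this paper carries out a structured parse of policy texts from two perspectives, external attributes and content attributes, in order to reflect the core connotation of policies intuitively, mine the semantics of policy texts, and provide a new mode for interpreting policy content.【Method/process】The LDA2Vec topic model is used to identify context-based topics in policy texts, while external attributes are extracted using positional and grammatical regularities; on this basis, a descriptive framework for the structured parse of policy texts is constructed.【Result/conclusion】An empirical analysis interpreting "Internet +" policy texts finds that the proposed framework helps to present policy elements intuitively, reveal the topic distribution of policy texts effectively, and support batch analysis and interpretation of large-scale policy-domain texts.【Innovation/limitation】The structured parse framework presents the formal and thematic features of policy texts, helping policy-related groups grasp the characteristics and emphases of policy making. Deeper content interpretation remains a subject for further research.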

Abstract: 【Purpose/significance】To reflect the core connotation of policies intuitively and to mine the semantics of policy texts, this paper focuses on topic mining and carries out a structured parse of policy texts from the perspectives of external attributes and content attributes, providing a new mode for the interpretation of policy texts.【Method/process】The LDA2Vec topic model is used to recognize context-based policy topics, and external attributes are extracted based on location and grammatical rules. On this basis, we construct a descriptive framework for the structured parse of policy texts.【Result/conclusion】In the empirical analysis, the framework was successfully applied to interpret "Internet +" policy texts, indicating that the proposed framework helps to display policy elements visually, reveal the distribution of policy topics effectively, and perform large-scale batch analysis and interpretation of policy texts efficiently.【Innovation/limitation】The structured parse framework displays the formal and thematic features of policy texts, helping policy-related groups grasp the characteristics and focus of policy making. In-depth content interpretation needs further research.
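
The abstract names LDA2Vec as the topic model but gives no implementation details. As a rough illustration of what "context-based" means in the general lda2vec formulation, the sketch below builds a context vector by adding a pivot word vector to a document vector, where the document vector is a softmax-weighted mixture of topic vectors. All function names, dimensions, and numeric values here are hypothetical and are not taken from the paper; this is a minimal sketch of the idea, not the authors' pipeline.

```python
import math


def softmax(weights):
    """Turn unnormalised document-topic weights into proportions that sum to 1."""
    peak = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def document_vector(doc_weights, topic_vectors):
    """Mix the topic vectors according to the document's topic proportions."""
    proportions = softmax(doc_weights)
    dim = len(topic_vectors[0])
    return [
        sum(p * topic[i] for p, topic in zip(proportions, topic_vectors))
        for i in range(dim)
    ]


def context_vector(word_vector, doc_weights, topic_vectors):
    """lda2vec-style context: pivot word vector plus document vector."""
    doc_vec = document_vector(doc_weights, topic_vectors)
    return [w + d for w, d in zip(word_vector, doc_vec)]


# Toy values, purely illustrative (assumptions, not data from the paper).
topic_vectors = [[1.0, 0.0], [0.0, 1.0]]  # two topics in a 2-d embedding space
doc_weights = [0.0, 0.0]                  # equal weights give an even topic mix
word_vector = [0.2, -0.1]                 # embedding of one pivot word

proportions = softmax(doc_weights)
context = context_vector(word_vector, doc_weights, topic_vectors)
```

In a trained model the document weights and topic vectors would be learned jointly with the word vectors, and the softmax proportions would serve as the per-document topic distribution that the framework reports.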