Information Science (情报科学) ›› 2025, Vol. 43 ›› Issue (1): 117-126.

• Professional Research •

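Alignment Between China's Open Government Data Research and National Strategic Needs: An Analysis Based on the BERTopic Model and Grounded Theory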

  

  • Online:2025-01-05 Published:2025-06-27

Abstract: 【Purpose/significance】 This study aimed to identify the research themes and strategic development directions related to open government data in China and to explore the alignment between open government data research and national development strategies. 【Method/process】 The study employed the BERTopic model to extract themes from literature on open government data published in the CNKI database between 2010 and 2023. Additionally, grounded theory was utilized to summarize strategic development directions from 12 national-level policy documents related to open government data. 【Result/conclusion】 The research on open government data in China could be categorized into 14 themes. The national strategic development directions were divided into 6 subcategories and 13 initial categories. A comparative analysis revealed a high degree of alignment between open government data research and China's national development strategies, indicating that academic research actively and consistently addressed national policy needs and development strategies. 【Innovation/limitation】 This study combines the BERTopic model with the principles of grounded theory to explore the alignment between China's open government data research and national development strategies. However, the topic modeling approach is limited to a single model, and no comparison of results from multiple models has been conducted. Further improvements are needed, including the expansion and refinement of custom word lists to enhance the extraction of topic features. Future work could consider using lexical associations to present the structural content of policy documents, which would aid in visually showcasing the direction of national strategic development.