Information Science (情报科学) ›› 2025, Vol. 43 ›› Issue (9): 12-18.

• Monograph •

Research on Topic Evolution in Big Data Security and Governance Integrating BERTopic and Large Language Models

  

  • Online: 2025-09-05 Published: 2025-12-12

Abstract: 【Purpose/significance】In the era of digital intelligence, the integrated study of big data security and governance has become a research hotspot and a key issue. This paper mines the research topics in the field of big data security and governance to explore its core issues, development trends, and evolutionary trajectories.【Method/process】After reviewing and verifying the existing literature in the field, the study draws on data from China National Knowledge Infrastructure (CNKI), covering core-journal literature on big data security and governance published from 2012 to 2024. An improved BERTopic topic model combined with a GPT-4 collaborative analysis framework is used to mine the latent research topics in the texts and their importance, and to show the dynamic evolution and trends of topic content.【Result/conclusion】The field of big data security and governance shows three main evolutionary characteristics: the interactive evolution of theory and practice, the deepening and expansion of multi-dimensional security, and a dynamic balance between security and development. On this basis, the paper proposes prospects for improving the distinctive development of the big data field.【Innovation/limitation】This study introduces large language models to mine in depth the research topics of big data security and governance and their evolutionary characteristics. Future work will focus on emerging technology governance and security, with the aim of achieving "good governance" (善治) of big data.
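
The abstract describes tracking how research topics change between 2012 and 2024 but does not give the computation. As a rough illustration only, and not the authors' pipeline, the sketch below assumes each document has already been assigned a topic label (for example by a topic model) and computes the share of each topic per publication year. The function name `topic_share_by_year` and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def topic_share_by_year(records):
    """Compute each topic's share of documents per year.

    records: iterable of (year, topic_label) pairs, one per document.
    Returns {year: {topic_label: share}}; shares within a year sum to 1.
    """
    counts = defaultdict(Counter)
    for year, topic in records:
        counts[year][topic] += 1

    shares = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        shares[year] = {t: n / total for t, n in counts[year].items()}
    return shares


# Hypothetical toy data: (publication year, assigned topic label).
records = [
    (2012, "privacy"), (2012, "privacy"), (2012, "governance"),
    (2024, "governance"), (2024, "governance"),
    (2024, "ai_security"), (2024, "privacy"),
]
shares = topic_share_by_year(records)
for year, dist in shares.items():
    print(year, dist)
```

Comparing the per-year shares is one simple way to see a topic rising or fading over time; the paper's own importance measure may differ.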