情报科学 (Information Science) ›› 2023, Vol. 41 ›› Issue (10): 112-120.

• Practice Research •

Literature Topic Mining and Evolution Trend Analysis Based on an Improved LDA Model:
The Case of Personal Privacy Information Protection

  

  • Online:2023-10-01 Published:2023-12-04

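Abstract (translated from the Chinese):

【Purpose/significance】 In the era of big data, the problem of protecting personal privacy information has never been satisfactorily resolved, because the relevant policies are insufficiently targeted and difficult to put into practice. This paper therefore studies the development history and disciplinary structure of the field from the perspective of biblioinformatics. 【Method/process】 Taking the research literature on the legal protection of personal privacy information as an example, the paper first proposes a new latent feature topic model, Improved-LDA. Second, it uses the improved LDA model to identify research topics in that literature. Finally, it introduces time series to explore how each research topic evolves over time, thereby revealing the state of development of the field and predicting its future trends. 【Result/conclusion】 The study finds that the Improved-LDA model clearly outperforms the traditional LDA model in topic coherence calculation and topic clustering. Using the improved LDA model, the study identifies eight topic categories, including "research and development of information protection technology, user information security on e-commerce platforms, personal privacy information security in the financial sector, and personal privacy information security in the context of the epidemic". 【Innovation/limitation】 The paper explores the knowledge structure, research status, and evolution patterns of the field of legal protection of personal privacy information. It can serve as an effective reference for the development of the field and can inform the optimization of related policies and regulations. The general applicability of the Improved-LDA model remains to be tested.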

Abstract:

【Purpose/significance】 In the era of big data, the problem of protecting personal privacy information has not been effectively addressed, because the relevant policies are insufficiently targeted and practical. This article therefore studies the development process, disciplinary structure, and past research hotspots of the field from the perspective of biblioinformatics, based on a collection of research literature. 【Method/process】 This article takes the research literature on the legal protection of personal privacy information as an example. First, a new latent feature topic model, Improved-LDA, is proposed and compared with the traditional LDA model. Second, the Improved-LDA model is used to identify research topics in the field of legal protection of personal privacy information. Finally, time series are introduced to explore the temporal evolution characteristics of each research topic, in order to reveal the past development of the field and predict its future trends. 【Result/conclusion】 The study finds that the Improved-LDA model clearly outperforms the traditional LDA model in both topic coherence calculation and topic clustering. The Improved-LDA model was also used to identify eight topic categories, including "research and development of information protection technology, user information security on e-commerce platforms, personal privacy information security in the financial field, and personal privacy information security in the context of the epidemic". 【Innovation/limitation】 Taking the research literature on the legal protection of personal privacy information as an example, this article explores the knowledge structure, research status, and evolution patterns of the field from the perspective of biblioinformatics. The results offer a reference for guiding the development of this research field and for optimizing existing policies and regulations. The general applicability of the Improved-LDA model proposed in this article still needs to be tested.
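
The time-series step described in the abstract can be illustrated with a minimal sketch. It assumes each document already has a topic distribution produced by some topic model; the Improved-LDA model itself is not specified in the abstract and is not reproduced here. The sketch defines a topic's yearly strength as its mean probability over the documents published in that year. This definition, the function name `yearly_topic_strength`, and the toy data are illustrative assumptions, not the authors' confirmed method.

```python
from collections import defaultdict


def yearly_topic_strength(doc_topics, years):
    """Average each topic's probability over the documents of each year.

    doc_topics: list of per-document topic distributions (lists of floats).
    years:      list of publication years, aligned with doc_topics.
    Returns a dict mapping year -> list of mean topic probabilities.
    """
    if len(doc_topics) != len(years):
        raise ValueError("doc_topics and years must have the same length")

    sums = {}                  # year -> running sum of topic probabilities
    counts = defaultdict(int)  # year -> number of documents seen
    for dist, year in zip(doc_topics, years):
        if year not in sums:
            sums[year] = [0.0] * len(dist)
        for k, p in enumerate(dist):
            sums[year][k] += p
        counts[year] += 1

    # Divide the sums by the document count, with years in ascending order.
    return {
        year: [s / counts[year] for s in sums[year]]
        for year in sorted(sums)
    }


# Hypothetical toy data: four documents, two topics, two publication years.
doc_topics = [
    [0.8, 0.2],
    [0.6, 0.4],
    [0.3, 0.7],
    [0.1, 0.9],
]
years = [2019, 2019, 2020, 2020]

strength = yearly_topic_strength(doc_topics, years)
for year, values in strength.items():
    print(year, [round(v, 2) for v in values])
```

A topic whose yearly strength rises across consecutive years would be read as gaining attention, and one whose strength falls as losing it. Forecasting future trends would require a separate time-series model fitted to these yearly values.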