Information Science (情报科学), 2021, Vol. 39, Issue 4: 174-185.

• Review •

A Comparative Analysis of the Evolution of Hot Research Topics in Data Mining in China and Abroad Based on the LDA Model

  

  • Online: 2021-04-01 Published: 2021-04-09

Abstract:

【Purpose/significance】To reveal and compare the evolution of hot research topics in data mining in China and abroad.【Method/process】Core-journal papers on data mining indexed in CNKI and Web of Science from 1998 to 2018 are collected. Research topics are extracted with an LDA topic model, hot topics are identified on the basis of the topic life cycle, and topic evolution paths are constructed across time slices. The differences and connections between the evolution of hot topics in China and abroad are then compared along two dimensions of data mining research: theory and application.【Result/conclusion】In the theoretical dimension, research in China lags behind research abroad. In the application dimension, research in China leans toward applications in the social sciences, while research abroad leans toward applications in the natural sciences. The overall research focus of the field has gradually shifted from theory to application, and many new developments have emerged in combination with big data technology.【Innovation/limitation】The paper offers a new approach to visualizing and comparing the evolution of hot topics in data mining in China and abroad. Its limitation is that the lag and its influencing factors have not yet been analyzed quantitatively.
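
The abstract does not say how topics in adjacent time slices are linked into evolution paths. The following is a minimal sketch of one common approach, assuming cosine similarity between topic-word distributions and a similarity threshold. The function names, the threshold value, and the toy topic distributions are all illustrative assumptions, not taken from the paper.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse word -> weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def link_topics(slice_a, slice_b, threshold=0.5):
    """Link topics of two adjacent time slices whose similarity is at
    or above `threshold` (a hypothetical cut-off, not from the paper)."""
    links = []
    for name_a, dist_a in slice_a.items():
        for name_b, dist_b in slice_b.items():
            sim = cosine(dist_a, dist_b)
            if sim >= threshold:
                links.append((name_a, name_b, round(sim, 3)))
    return links


# Toy topic-word distributions, invented purely for illustration.
slice_t = {
    "association_rules": {"rule": 0.5, "itemset": 0.3, "support": 0.2},
    "classification": {"tree": 0.6, "classifier": 0.4},
}
slice_t_plus_1 = {
    "frequent_patterns": {"itemset": 0.5, "pattern": 0.3, "rule": 0.2},
    "svm": {"kernel": 0.6, "classifier": 0.4},
}

links = link_topics(slice_t, slice_t_plus_1, threshold=0.4)
print(links)
```

In practice the per-slice distributions would come from a fitted LDA model rather than hand-written dictionaries, and chaining the links across all consecutive slices gives candidate evolution paths.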