Information Science (情报科学)


Research on Algorithm Term Extraction and Innovation Evolution Path Construction Based on the BERT-BiLSTM-CRF Model

1. School of Information Management, Wuhan University, Wuhan 430072
2. School of Management, Tianjin Normal University, Tianjin 300387


Abstract:

[Purpose/Significance] Extracting algorithm terms from massive paper metadata and constructing the innovation evolution relationships among them supports the effective management and application of algorithms, helping researchers improve research efficiency and adopt cutting-edge results. [Method/Process] First, abstracts of GAN algorithm papers are used as the corpus; algorithm terms are annotated by combining manual annotation with rule-based extraction, and a BERT-BiLSTM-CRF model is trained to extract algorithm terms automatically. Then, the trained model is applied to extract algorithm terms from the cited literature metadata of LDA algorithm papers, and the innovation evolution path of the LDA algorithm is extracted from the citation content according to rule-based judgment and citation relationships. [Result/Conclusion] In the algorithm term extraction experiment using GAN papers as an example, precision, recall and F1 score reach 0.81, 0.63 and 0.71 respectively, and the innovation evolution path of the LDA algorithm is constructed through relationship extraction. The method can support the construction of algorithm evolution networks and algorithm retrieval tasks, and it enriches research related to innovation diffusion theory. [Innovation/Limitation] The study extends the application of named entity recognition technology and offers an approach to managing computer algorithms. The method for constructing the innovation evolution path can be further optimized in future work.

Key words:

Algorithm term extraction, Innovation evolution relationship, BERT-BiLSTM-CRF
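
The precision, recall and F1 values reported in the abstract are the usual measures for term extraction. The following is a minimal sketch of how such scores could be computed at the term level from BIO-tagged sequences. It is not the authors' evaluation code: the tag names `B-ALG`, `I-ALG` and `O`, the exact-match criterion, and the helper names `bio_spans`, `f1_from` and `prf1` are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_spans(tags: List[str]) -> Set[Span]:
    """Collect term spans from one BIO tag sequence.

    Assumes hypothetical labels such as B-ALG / I-ALG / O.
    """
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            spans.add((start, i))
            start = None
        if tag.startswith("B-"):
            start = i
    return spans


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def prf1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Term-level precision, recall and F1 with exact span matching."""
    gold_spans, pred_spans = set(), set()
    for sent_id, (g_tags, p_tags) in enumerate(zip(gold, pred)):
        gold_spans |= {(sent_id, s) for s in bio_spans(g_tags)}
        pred_spans |= {(sent_id, s) for s in bio_spans(p_tags)}
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    return precision, recall, f1_from(precision, recall)


# Toy example: two gold terms, one of which the model finds.
gold = [["B-ALG", "I-ALG", "O", "B-ALG"]]
pred = [["B-ALG", "I-ALG", "O", "O"]]
precision, recall, f1 = prf1(gold, pred)

# Consistency check on the figures quoted in the abstract.
reported_f1 = round(f1_from(0.81, 0.63), 2)
```

Under these assumptions, a precision of 0.81 and a recall of 0.63 give a harmonic mean that rounds to the reported F1 of 0.71, which suggests the first figure in the abstract is precision rather than overall accuracy.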