Information Science (情报科学), 2021, Vol. 39, Issue (4): 15-22.

• Theoretical Research •

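Short-Text Sentiment Analysis Based on FastText Character Vectors and a Bidirectional GRU Recurrent Neural Network: A Case Study of Weibo Comment Texts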

  

  • Publication date: 2021-04-01  Release date: 2021-04-09

Abstract:

【Purpose/significance】This paper proposes a model based on character vectors and a bidirectional GRU recurrent neural network to improve the accuracy of sentiment classification for short online texts. Better classification helps in monitoring the public's emotional state online, maintaining social stability, cleaning up the online environment, and improving people's sense of well-being.【Method/process】The FastText algorithm is used to generate both character vectors and word vectors. The two are compared by their training performance in a bidirectional GRU recurrent neural network, which is then used to predict the sentiment class of Weibo (microblog) comments.【Result/conclusion】The results show that training with character vectors reduces the risk of model overfitting. The proposed model scores above 0.92 on all four metrics (accuracy, precision, recall, and F1 score), indicating strong fitting and generalization ability.【Innovation/limitation】Guided by theory, the model is configured with a distinctive embedding layer and recurrent neural network layer. It performs well on binary sentiment classification of Chinese short texts, but its performance on long texts or on three-class sentiment analysis is unknown.
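
The four metrics named in the abstract can all be derived from the confusion-matrix counts of a binary classifier. The sketch below is only an illustration of those standard definitions, not the authors' evaluation code: the function name `binary_metrics` and the sample labels are hypothetical, and it assumes label 1 marks the positive class, which the abstract does not specify.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for labels in {0, 1}.

    Assumption: 1 is the positive class, 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical gold labels and predictions for eight comments (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Reporting all four numbers together, as the abstract does, guards against a model that scores well on accuracy alone by favoring the majority class.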