情报科学 (Information Science) ›› 2024, Vol. 42 ›› Issue (6): 54-61.

• Theoretical Research •

Post Hoc Explainability of Academic Literature Recommendation: A Knowledge Graph Based on Joint Entity-Relation Extraction

  

  • Online: 2023-06-01 Published: 2024-07-31

Abstract: 【Purpose/significance】Current research on explainable academic literature recommendation focuses mostly on model explainability, which offers limited explanatory power and sacrifices some performance. This paper therefore proposes a post hoc explanation method based on knowledge graphs. 【Method/process】Drawing on deep learning, multi-task learning, and attention mechanisms, we build a hybrid deep learning model for joint entity-relation extraction from medical literature, and use a visualized knowledge graph to reveal the intrinsic semantic associations between the recommended literature and the query topic, thereby making recommendation results explainable post hoc. 【Result/conclusion】On a manually annotated, high-quality entity-relation test set, the model reaches an F1 score of 72.4%, which is 4.3%, 5.2%, 7.7%, and 1.3% higher than the baseline models BioBERT-BiLSTM-CRF, BioBERT-LSTM-CRF, LSTM-CRF, and Joint-BiLSTM-RNN, respectively. The resulting knowledge graph was assessed for explainability on seven metrics (transparency, trustworthiness, usefulness, effectiveness, persuasiveness, reviewability, and satisfaction) and received scores of 4.3, 3.6, 4.1, 3.8, 3.7, 1.0, and 2.7, respectively, on a scale of 0 to 5. 【Innovation/limitation】Distant supervision is used to construct a large-scale training corpus automatically. This removes the need for tedious manual annotation, but noise in the relation labels lowers the model's relation extraction performance. The knowledge-graph-based post hoc method explains reasonably well why a recommendation was produced and helps users make correct decisions, but it still leaves considerable room for improvement in reviewability.
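
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a knowledge-graph-based post hoc explanation: extracted (head, relation, tail) triples are indexed as a graph, and a path linking a query entity to an entity in a recommended paper is presented as the explanation. The triples, the relation names, and the functions build_graph and explain are all hypothetical illustrations; they are not the paper's model, data, or visualization method.

```python
from collections import deque

# Hypothetical triples (head, relation, tail). They stand in for the output of
# a joint entity-relation extraction model; they are NOT data from the paper.
TRIPLES = [
    ("metformin", "treats", "type 2 diabetes"),
    ("type 2 diabetes", "associated_with", "insulin resistance"),
    ("insulin resistance", "associated_with", "obesity"),
]


def build_graph(triples):
    """Index triples as an adjacency list that keeps the relation labels."""
    graph = {}
    for head, relation, tail in triples:
        graph.setdefault(head, []).append((relation, tail))
        # Store a reverse edge so a path can be followed in either direction.
        graph.setdefault(tail, []).append((relation + "^-1", head))
    return graph


def explain(graph, query_entity, doc_entity):
    """Return the shortest chain of triples linking the two entities, or None."""
    if query_entity not in graph or doc_entity not in graph:
        return None
    queue = deque([(query_entity, [])])
    visited = {query_entity}
    while queue:
        node, path = queue.popleft()
        if node == doc_entity:
            return path
        for relation, neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


graph = build_graph(TRIPLES)
# Explain why a paper mentioning "metformin" matches the query topic "obesity".
for step in explain(graph, "obesity", "metformin"):
    print(step)
```

Breadth-first search is chosen here only because the shortest path tends to be the easiest one for a reader to follow; a real system would also need entity normalization and some way to filter the noisy relation labels that distant supervision produces.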