Information Science (情报科学), 2024, Vol. 42, Issue (3): 100-109.

• Practice Research (业务研究) •

Research on a Multi-Graph Named Entity Recognition Method for Chinese Medical Record Processing

  

  • Online: 2024-03-05 Published: 2024-06-08

Abstract:

【Purpose/significance】 Named entity recognition (NER) is a core component of medical record processing and is crucial to improving the accuracy and efficiency of electronic medical record processing. NER is especially challenging for Chinese medical records because of the complexity of the Chinese language. Developing an effective NER model for Chinese medical records is therefore of great value for improving information extraction and data processing for medical records. 【Method/process】 This paper proposes a novel framework, NER-CMR (Chinese Medical Records Named Entity Recognition), that aims to overcome the limitations of existing NER methods on Chinese medical records. NER-CMR incorporates contextual information, such as commonly occurring contiguous words and phrases, to address the entity nesting and boundary identification problems of traditional NER. Specifically, the framework extracts adjacency, co-occurrence, and dependency relations between characters from related words and phrases, and then fuses this information into the neural NER model. NER-CMR consists of a character encoding module, a word embedding module, a graph construction module, a fusion module, and a CRF module. 【Result/conclusion】 In comprehensive experiments on CCKS, a widely used Chinese medical record dataset, and DIABETES, a real-world Chinese diabetes dataset, NER-CMR outperformed the baseline models in recognition performance. In addition, as a Chinese NER framework that introduces graph neural networks, the model allows its modules to be replaced flexibly, offering a new direction for research on named entity recognition in Chinese electronic medical records. 【Innovation/limitation】 The paper proposes network graphs based on a graph attention mechanism, designs a fusion layer for multi-graph fusion, and further adopts two strategies to handle the noise introduced by incorrect relations. However, it lacks case studies at the application level of smart healthcare systems.
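
The graph construction idea mentioned in the abstract (character-level adjacency and co-occurrence relations derived from related words and phrases) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact graph definitions, so the sketch assumes that two characters are adjacent when they are consecutive inside a lexicon-matched word, and that they co-occur when they appear anywhere inside the same matched word. The function names `match_words` and `build_graphs`, the toy lexicon, and the `max_len` parameter are all hypothetical. The dependency graph is omitted because it would require a syntactic parser.

```python
from itertools import combinations


def match_words(sentence, lexicon, max_len=4):
    """Return (start, end) spans of every lexicon word found in the sentence.

    Assumption: words are 2..max_len characters long and matched by
    exhaustive substring lookup in a plain Python set.
    """
    spans = []
    for start in range(len(sentence)):
        longest_end = min(start + max_len, len(sentence))
        for end in range(start + 2, longest_end + 1):
            if sentence[start:end] in lexicon:
                spans.append((start, end))
    return spans


def build_graphs(sentence, lexicon, max_len=4):
    """Build two symmetric character-level graphs as 0/1 matrices.

    adjacency[i][j] = 1 if characters i and j are consecutive inside
    some matched word (assumed definition).
    cooccurrence[i][j] = 1 if characters i and j appear inside the
    same matched word (assumed definition).
    """
    n = len(sentence)
    adjacency = [[0] * n for _ in range(n)]
    cooccurrence = [[0] * n for _ in range(n)]
    for start, end in match_words(sentence, lexicon, max_len):
        # Link neighbouring characters within the matched word.
        for i in range(start, end - 1):
            adjacency[i][i + 1] = adjacency[i + 1][i] = 1
        # Link every pair of characters within the matched word.
        for i, j in combinations(range(start, end), 2):
            cooccurrence[i][j] = cooccurrence[j][i] = 1
    return adjacency, cooccurrence


# Toy example: a short clinical phrase and a hypothetical lexicon.
sentence = "患者有糖尿病史"
lexicon = {"患者", "糖尿病", "病史", "糖尿病史"}
spans = match_words(sentence, lexicon)
adjacency, cooccurrence = build_graphs(sentence, lexicon)
```

In this toy example the nested words 糖尿病, 病史, and 糖尿病史 all contribute edges, so the characters 糖 and 史 are linked in the co-occurrence graph but not in the adjacency graph. In a full model, matrices like these would typically serve as the edge structure for graph attention layers whose outputs are fused before the CRF layer, but those details are beyond what the abstract specifies.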