情报科学 (Information Science) ›› 2025, Vol. 43 ›› Issue (6): 28-41.

• Special Topic •

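Factors Influencing Information Adoption Behavior on Online Medical Consultation Platforms: A Study Based on the SHAP Explanation Method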

  

  • Online: 2025-06-05    Published: 2025-10-16

Abstract: 【Purpose/significance】The SHAP explanation method is used to analyze the interpretability of a model that predicts information adoption behavior on online medical consultation platforms, in order to identify the important factors influencing information adoption behavior and how they act.【Method/process】First, following the information adoption model, candidate feature variables related to the question-and-answer text are extracted along the dimensions of information quality and information source credibility, and Spearman correlation analysis is used to screen them. Second, several classic ensemble learning algorithms are used to build prediction models of information adoption behavior, which are evaluated and selected using relevant performance metrics. Then, for the model with the best overall performance, the SHAP framework is used to carry out interpretability analysis in terms of feature importance, main effects, modes of action, and interaction effects. Finally, based on the results, strategies for optimizing information adoption are proposed from three perspectives: patients, doctors, and platforms.【Result/conclusion】The time difference between question and answer and the total number of reply texts are the two most critical factors affecting patients' information adoption. In addition, factors such as follow-up questions and answers, reward amount, and answer text entropy differ in the degree, direction, and manner of their influence on information adoption behavior.【Innovation/limitation】An algorithmic attribution approach is used to uncover the factors influencing information adoption behavior. However, the data come from a single source in a single form, and feature extraction is still limited to formal features of the question-and-answer text. Future work will mine content features from multi-source, multimodal question-and-answer data to refine the set of influencing factors.
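
The attribution idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values by brute-force subset enumeration. Everything in it is an assumption made for illustration: the scoring function `toy_score`, the feature names, and the instance and baseline values are hypothetical and are not the paper's model, features, or data. It also does not use the SHAP framework itself, which relies on more efficient algorithms; it only shows how a prediction is split into per-feature contributions and how an interaction between two features is shared between them.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are held at their baseline value
    (a single-reference simplification, assumed here for brevity).
    """
    names = list(x)
    n = len(names)

    def value(subset):
        # Coalition members take the instance's values; the rest stay at baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):
            # Shapley weight for coalitions of size r that exclude feature i.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_score(f):
    # Hypothetical adoption score: falls with the question-answer time gap,
    # rises with the number of replies, and includes an interaction between
    # follow-up exchanges and the reward amount.
    return (-0.5 * f["time_gap_hours"]
            + 0.3 * f["reply_count"]
            + 0.2 * f["has_follow_up"] * f["reward"])


# Hypothetical instance and reference point.
x = {"time_gap_hours": 2.0, "reply_count": 5.0, "has_follow_up": 1.0, "reward": 10.0}
baseline = {"time_gap_hours": 6.0, "reply_count": 2.0, "has_follow_up": 0.0, "reward": 0.0}

phi = shapley_values(toy_score, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

In this toy setting the contributions sum to the difference between the score at `x` and the score at `baseline`, and the interaction term is divided equally between `has_follow_up` and `reward`, which is the kind of main-effect versus interaction-effect separation the abstract refers to.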