Information Science (情报科学), 2022, Vol. 40, Issue (4): 3-8.

• Special Article •

Research on Multi-Policy Synergy on the Theme of Open Government Data Based on Association Rules

  

  • Online: 2022-04-01 Published: 2022-05-15

Abstract:

[Purpose/Significance] Starting from semantic mining of multiple policy texts on the theme of open government data, this study identifies the semantic relationships among the contents of those texts and explores a method that reduces manual intervention and automates the analysis of multi-policy text synergy. [Method/Process] The association rule algorithm from data mining is applied to preprocessed open government data policy texts for semantic mining, and the synergy among the policy texts is analyzed according to the effective strong association rules obtained. [Result/Conclusion] With multiple policy texts on open government data as the research object, the number of effective strong association rules is relatively stable when the confidence is set to 0.7 and the lift is greater than 3. Association rule analysis of policy texts at different levels yields conclusions that basically agree with manual analysis, which verifies that the method can be applied to quantitative research on the semantic synergy of multiple policy texts. [Innovation/Limitation] The association rule algorithm from data mining is used to carry out knowledge reasoning on the synergy of multiple data policy texts, effectively achieving automated semantic computation. The completeness of the policy vocabulary, the data preprocessing procedure, and the parameter settings all affect the accuracy of the experimental results, so the influence of manual intervention needs to be reduced further.
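
To make the two thresholds in the abstract concrete, here is a minimal sketch of how confidence and lift can filter association rules between policy terms. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which mining algorithm was used, so the sketch simply enumerates single-term rules by brute force. The function name `pairwise_rules`, the sample `SEGMENTS`, and the terms in it are all hypothetical; only the cut-offs (confidence 0.7, lift greater than 3) come from the abstract.

```python
from itertools import permutations


def pairwise_rules(transactions, min_confidence=0.7, min_lift=3.0):
    """Return single-term rules A -> B that pass both thresholds.

    Each result is a tuple (A, B, confidence, lift).
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))
    # support(X): fraction of transactions that contain term X
    support = {item: sum(item in s for s in sets) / n for item in items}

    rules = []
    for a, b in permutations(items, 2):
        joint = sum(a in s and b in s for s in sets) / n  # support(A and B)
        if joint == 0:
            continue
        confidence = joint / support[a]  # estimate of P(B | A)
        lift = confidence / support[b]   # P(B | A) / P(B)
        if confidence >= min_confidence and lift > min_lift:
            rules.append((a, b, round(confidence, 3), round(lift, 3)))
    return rules


# Hypothetical toy data: each set holds the policy terms found in one
# text segment after preprocessing. These are not taken from the paper.
SEGMENTS = [
    {"data", "privacy", "security"},
    {"data", "privacy", "security"},
    {"data", "sharing", "platform"},
    {"data", "sharing", "platform"},
    {"data", "sharing", "platform"},
    {"data", "sharing"},
    {"data", "openness"},
    {"data", "openness"},
    {"openness"},
    {"openness", "standard"},
]

if __name__ == "__main__":
    for a, b, conf, lift in pairwise_rules(SEGMENTS):
        print(f"{a} -> {b}: confidence={conf}, lift={lift}")
```

In this toy sample only the privacy/security pair survives both filters. A rule such as platform -> sharing has confidence 1.0 but a lift of about 2.5, so the lift condition removes it. This shows why a lift cut-off above 3 keeps only term pairs that co-occur far more often than their individual frequencies would predict.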