情报科学 (Information Science) ›› 2024, Vol. 42 ›› Issue (4): 129-135.

• Business Research •

Identifying Patents with Potential Litigation Risk by Fusing Multi-Dimensional Features

  • Online: 2024-04-05 Published: 2024-06-08

Abstract:

【Purpose/significance】 Identifying high-risk patents that are likely to trigger litigation before any suit is filed is of great significance: it helps relevant entities in China take prevention and control measures early and avoid patent risks. 【Method/process】 This study collects litigated and non-litigated patent data from the granted patents published by the United States Patent and Trademark Office (USPTO) as its research object. It fuses multi-dimensional features covering patent application, patent examination, and patent value, uses machine learning to build identification models for patents with potential litigation risk, and evaluates and compares the performance of those models. 【Result/conclusion】 The results show that the identification model based on Random Forest performs best in recall, accuracy, and overall performance (AUC). It can effectively identify patents with potential litigation risk and is better suited to patent risk early warning and to prevention and control activities. Compared with identification models based on a single classifier, the models based on ensemble learning have not yet shown a clear advantage in identifying patents with potential litigation risk. 【Innovation/limitation】 This article avoids the shortcomings of existing research in patent data collection, key feature selection, and core algorithm selection. It also offers a preliminary attempt at applying machine learning to the identification of patents with potential litigation risk, further enriching the theory and methods of this field.
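
The abstract compares models on Accuracy, Recall, and AUC. As a rough illustration of what those three metrics measure for a binary litigated/non-litigated classifier, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the labels, scores, 0.5 decision threshold, and function names below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of patents whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly litigated patents (label 1) that the model flags."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos if actual_pos else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen litigated patent receives a higher
    risk score than a randomly chosen non-litigated one (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one patent of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = litigated patent, 0 = non-litigated patent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical risk scores, as a classifier might output them.
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
# Assumed decision threshold of 0.5 to turn scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("recall:  ", recall(y_true, y_pred))
print("AUC:     ", auc(y_true, scores))
```

Recall matters most in a risk early-warning setting because a missed litigated patent is usually costlier than a false alarm, while AUC summarizes ranking quality independently of the threshold.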