Information Science (情报科学) ›› 2025, Vol. 43 ›› Issue (5): 199-205.

• Practice Research •

A Graph-Structure-Based Content Recommendation Method for Similar Books

  

  • Online: 2025-05-05 Published: 2025-09-01

Abstract: 【Purpose/significance】This study uses the text content of books to recommend similar books and aims to make book similarity computation more efficient on massive book collections.【Methods/process】A graph-structure-based method for recommending similar book content is constructed. Phrases are extracted from each book's text, and the TextRank value of each phrase in the resulting phrase network is computed to obtain the book's keywords. Book vectors are then built from these keywords and combined with the Hierarchical Navigable Small World (HNSW) algorithm to obtain the similarity between the target book and the recommended books.【Results/conclusion】The content-based similar-book recommendation method reaches an average accuracy of 0.807 in user evaluation, and its objective average accuracy is significantly higher than that of the TF-IDF and TextRank text representations, so it achieves good recommendation quality. The HNSW algorithm reduces the computational cost of similarity search to a logarithmic level, which helps optimize similar-book computation in big-data environments.【Innovation/limitation】The study innovatively combines graph structure with the HNSW algorithm to improve both the accuracy and the computational efficiency of book recommendation, but its reliance on the Tencent dictionary limits the generality and cross-language adaptability of the vector representation.
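The pipeline summarized in the abstract (phrase network, TextRank keywords, book vectors, similarity search) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the phrase lists and three-dimensional embeddings are invented toy data standing in for extracted phrases and dictionary vectors; the graph is a plain co-occurrence graph scored with a standard PageRank-style iteration; each book vector is assumed to be a TextRank-weighted average of keyword embeddings; and the nearest neighbour is found by an exhaustive cosine scan rather than an HNSW index. All function names are hypothetical.

```python
import math
from collections import defaultdict


def textrank(phrases, window=2, damping=0.85, iterations=50):
    """Score phrases with a PageRank-style iteration on a co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, p in enumerate(phrases):
        # Link each phrase to the next `window` phrases (undirected edges).
        for q in phrases[i + 1 : i + 1 + window]:
            if p != q:
                neighbors[p].add(q)
                neighbors[q].add(p)
    scores = {p: 1.0 for p in neighbors}
    for _ in range(iterations):
        scores = {
            p: (1 - damping)
            + damping * sum(scores[q] / len(neighbors[q]) for q in neighbors[p])
            for p in neighbors
        }
    return scores


def book_vector(phrases, embeddings, top_k=3):
    """Assumed scheme: TextRank-weighted average of the top-k keyword embeddings."""
    scores = textrank(phrases)
    candidates = [p for p in scores if p in embeddings]
    keywords = sorted(candidates, key=scores.get, reverse=True)[:top_k]
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    total = sum(scores[k] for k in keywords)
    if total == 0:
        return vec  # no usable keywords: fall back to the zero vector
    for k in keywords:
        for j, x in enumerate(embeddings[k]):
            vec[j] += scores[k] / total * x
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def most_similar(target, vectors):
    """Exhaustive scan over all other books (the step HNSW would approximate)."""
    others = [t for t in vectors if t != target]
    return max(others, key=lambda t: cosine(vectors[target], vectors[t]))


# Toy embeddings (hypothetical stand-ins for pretrained dictionary vectors).
embeddings = {
    "neural network": [0.9, 0.1, 0.0],
    "deep learning": [0.8, 0.2, 0.0],
    "gradient descent": [0.7, 0.3, 0.1],
    "sourdough": [0.0, 0.1, 0.9],
    "fermentation": [0.1, 0.2, 0.8],
    "oven": [0.0, 0.3, 0.7],
}

# Toy phrase sequences (hypothetical output of the phrase-extraction step).
books = {
    "ml_intro": ["neural network", "gradient descent", "deep learning",
                 "neural network", "deep learning"],
    "dl_handbook": ["deep learning", "neural network", "deep learning",
                    "gradient descent"],
    "bread_baking": ["sourdough", "fermentation", "oven",
                     "sourdough", "fermentation"],
}

vectors = {title: book_vector(p, embeddings) for title, p in books.items()}
print(most_similar("ml_intro", vectors))
```

The exhaustive scan compares the target with every other book, so its cost grows linearly with the collection. An HNSW index replaces that scan with an approximate graph search over the same vectors, which is where the logarithmic-scale cost mentioned in the abstract comes from.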