Information Science (情报科学) ›› 2022, Vol. 40 ›› Issue (6): 25-35.

• Theoretical Research •

Mining and Visualizing Character Relationships in Tan Yankai's Diary from a Digital Humanities Perspective

  

  • Online:2022-06-01 Published:2022-06-12

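Abstract (translated from the Chinese): [Purpose/significance] Drawing on digital humanities theory and methods, this study mines the character relationships contained in Tan Yankai's Diary and builds a visual map of the co-occurrence relationships among the people in the diary, presenting the unstructured diary text in a clearer, more intuitive form and seeking to discover and distill useful knowledge in the process. [Method/process] Taking the 1923-1926 entries of Tan Yankai's Diary as the research object, the study extracts the person entities that stand in co-occurrence relationships, uses the Gephi data visualization software to construct a co-occurrence network of the diary's characters, and applies quantitative statistics and social network analysis to analyze and discuss the network's topological features, the characters' centrality features, and character-group features based on modularity and k-core. [Result/conclusion] Taking character relationship mining as the entry point, the study discovers and distills knowledge contained in Tan Yankai's Diary and demonstrates the feasibility of fine-grained development of celebrity diary resources from a digital humanities perspective. [Innovation/limitation] The study constructs the character co-occurrence network of Tan Yankai's Diary, analyzes and visualizes it from several angles, and validates it against related historical research, presenting the process and results of mining the diary text more intuitively and offering a reference for the development of other historical archival resources. However, the extracted data cover only part of the diary's time span and can show only the specific character relationships of that period; mining and presenting more and richer relationships requires data from a longer period and more supporting materials.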

Abstract: [Purpose/significance] Mining the relationships among the characters in Tan Yankai's Diary from a digital humanities perspective, and building a visual map of their co-occurrence relationships, helps transform unstructured diary text into clearer, more intuitive information and broadens both the ideas behind and the scope of the development of celebrity diary resources. [Method/process] This paper takes the 1923-1926 contents of Tan Yankai's Diary as the research object, extracts the person entities that stand in co-occurrence relationships, uses the Gephi data visualization software to construct the co-occurrence network of the diary's characters, and analyzes and discusses the network topology, character centrality, and character-group characteristics based on modularity and k-core by means of quantitative statistics, social network analysis, and literature-based evidence. [Result/conclusion] This paper offers an approach to the effective development of Tan Yankai's Diary and shows the feasibility of using digital humanities and social network analysis visualizations to expand historical understanding. [Innovation/limitation] The study constructs the co-occurrence network of the characters in Tan Yankai's Diary, analyzes and visualizes it from different perspectives, and compares it with relevant historical research, so as to show the process and results of mining the diary text more intuitively and to provide a reference for the development of other historical archival resources. However, the extracted data cover only part of the diary's time span and can show only the specific character relationships of that period. Mining and presenting more and richer character relationships requires data from a longer period and more supporting materials.
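
As an illustration of the kind of analysis the abstract describes, the following is a minimal Python sketch, not the authors' pipeline (the paper reports using Gephi). It assumes that two people "co-occur" when they are named in the same diary entry, which the abstract does not specify. The toy entries, the placeholder labels A to E, and all function names are hypothetical. The modularity-based grouping is not reproduced here; only the co-occurrence counts, degree centrality, and k-core steps are sketched.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each inner list holds the people named in one diary entry.
# The labels are placeholders, not names taken from the diary.
entries = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "D"],
    ["A", "C"],
    ["D", "E"],
]

def cooccurrence_edges(entries):
    """Count how many entries mention each unordered pair of people."""
    weights = Counter()
    for people in entries:
        # sorted(set(...)) removes repeats and gives each pair a canonical order
        for pair in combinations(sorted(set(people)), 2):
            weights[pair] += 1
    return weights

def adjacency(weights):
    """Turn the weighted edge counts into an undirected adjacency map."""
    adj = {}
    for u, v in weights:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def k_core(adj, k):
    """Repeatedly drop nodes with fewer than k neighbours; return the survivors."""
    core = {node: set(nbrs) for node, nbrs in adj.items()}
    while True:
        low = [node for node, nbrs in core.items() if len(nbrs) < k]
        if not low:
            return set(core)
        for node in low:
            for nbr in core.pop(node):
                if nbr in core:
                    core[nbr].discard(node)

weights = cooccurrence_edges(entries)
adj = adjacency(weights)
print(sorted(weights.items()))
print(degree_centrality(adj))
print(sorted(k_core(adj, 2)))
```

In this toy network, E is linked only to D, so E falls out of the 2-core while A, B, C, and D remain. An edge list such as `weights` could also be exported as a CSV file and loaded into Gephi for layout and modularity analysis.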