情报科学 (Information Science), 2023, Vol. 41, Issue (1): 143-151.

• Professional Research •

Research on the Circulation Flow Mode of Scientific Data in Academic Virtual Communities: A Case Study of GitHub

  

  • Online:2023-01-01 Published:2023-04-06

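Abstract (translated from the Chinese): 【Purpose/significance】Opening and sharing scientific data promotes scientific communication, and academic virtual communities provide the underlying platform for it. This paper summarizes how scientific data is communicated in academic virtual communities, as a reference for exchanging and sharing scientific data efficiently.【Method/process】Using qualitative analysis and social network analysis, and following the data exchange process, the study compares the nature, characteristics, and communication modes of 8,521 data items on GitHub.【Result/conclusion】Scientific data falls into four categories: raw data materials, analysis process data, model and theoretical data, and code test data. Raw data materials are relatively easy to obtain, save, and discuss, but hard to update. Analysis process data is easy to save and update but hard to discuss. Model and theoretical data is relatively easy to save and discuss, but hard to update and hard to gain recognition. Code test data is the dataset type users value most. Communication among users in academic virtual communities is fairly simple and direct, yet overall it forms a circular communication mode centered on scientific data, which reminds both scientific data users and academic virtual communities to attend to the quality of scientific data production and dissemination.【Innovation/limitation】The paper proposes that scientific data in academic virtual communities follows a circulating-flow communication mode, which matters for raising the level of scientific communication.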

Abstract: 【Purpose/significance】The opening and sharing of scientific data can promote scholarly communication, and the academic virtual community provides a basic platform for it. This research summarizes the communication mode of scientific data in the academic virtual community to provide a reference for the efficient exchange and sharing of scientific data.【Method/process】We used qualitative analysis and social network analysis, organized around the data exchange process, to compare the nature, features, and communication modes of 8,521 data items on GitHub.【Result/conclusion】We found that scientific data can be divided into four categories: raw data materials, analysis process data, model and theoretical data, and code test data. In the communication process, raw data materials are relatively easy to obtain, save, and discuss, but difficult to update. Analysis process data is easy to save and update, but difficult to discuss. Model and theoretical data is relatively easy to save and discuss, but difficult to update and to gain recognition. Code test data is the dataset type that users value most. Moreover, in the academic virtual community, the mutual communication mode among users is relatively simple and straightforward. Overall, however, it forms a circular communication pattern centered on scientific data. This circulation communication mode prompts both scientific data users and academic virtual communities to attend to the quality of scientific data production and dissemination.【Innovation/limitation】The paper proposes that scientific data in the academic virtual community follows a circulation flow mode of communication, which is important for improving scholarly communication.