Mapping Research Trajectories

Paper title

Mapping Research Trajectories

Authors

Bastian Schäfermeier, Gerd Stumme, Tom Hanika

Abstract

Steadily growing amounts of information, such as annually published scientific papers, have become so large that they elude extensive manual analysis. Hence, to maintain an overview, automated methods for the mapping and visualization of knowledge domains are necessary and important, e.g., for scientific decision makers. Of particular interest in this field is the development of research topics of different entities (e.g., scientific authors and venues) over time. However, existing approaches for their analysis are only suitable for single entity types, such as venues, and they often do not capture the research topics or the time dimension in an easily interpretable manner. Hence, we propose a principled approach for mapping research trajectories, which is applicable to all kinds of scientific entities that can be represented by sets of published papers. For this, we transfer ideas and principles from the geographic visualization domain, specifically trajectory maps and interactive geographic maps. Our visualizations depict the research topics of entities over time in a straightforward, interpretable manner. They can be navigated intuitively by the user and restricted to specific elements of interest. The maps are derived from a corpus of research publications (i.e., titles and abstracts) through a combination of unsupervised machine learning methods. In a practical demonstrator application, we exemplify the proposed approach on a publication corpus from machine learning. We observe that our trajectory visualizations of 30 top machine learning venues and 1000 major authors in this field are well interpretable and consistent with background knowledge drawn from the entities' publications. Besides yielding interactive, interpretable visualizations that support different kinds of analyses, our computed trajectories are suitable for future trajectory mining applications.
