Paper Title

Visual Exploration and Knowledge Discovery from Biomedical Dark Data

Authors

Shashwat Aggarwal, Ramesh Singh

Abstract

Data visualization techniques offer efficient means to organize and present data in graphically appealing formats, which not only speeds up decision making and pattern recognition but also enables decision-makers to fully understand the insights in the data and make informed decisions. Over time, with the rise in technological and computational resources, the world's scientific knowledge has grown exponentially. However, most of it lacks structure and cannot be easily categorized and imported into regular databases. This type of data is often termed Dark Data. Data visualization techniques provide a promising way to explore such data by allowing quick comprehension of information, the discovery of emerging trends, and the identification of relationships and patterns. In this empirical research study, we use the rich corpus of PubMed, comprising more than 30 million citations from biomedical literature, to visually explore and understand the underlying key insights using various information visualization techniques. We employ a natural language processing based pipeline to discover knowledge from the biomedical dark data. The pipeline comprises different lexical analysis techniques, such as Topic Modeling to extract inherent topics and major focus areas, and Network Graphs to study the relationships between various entities such as scientific documents and journals, researchers, and keywords and terms. With this analytical research, we aim to offer a potential solution to the problem of analyzing overwhelming amounts of information and to reduce the limitations of human cognition and perception in handling and examining such large volumes of data.
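
To make the network-graph step of the pipeline more concrete, the following is a minimal sketch of how a keyword co-occurrence network could be built from citation records. It is not the authors' implementation: the record layout, the field names `id` and `keywords`, the toy data, and the helper functions `cooccurrence_edges` and `weighted_degree` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; a real run would load PubMed citations instead.
records = [
    {"id": "A", "keywords": ["genomics", "cancer", "visualization"]},
    {"id": "B", "keywords": ["cancer", "visualization"]},
    {"id": "C", "keywords": ["genomics", "cancer"]},
]


def cooccurrence_edges(records):
    """Count how often each unordered keyword pair appears in the same record."""
    edges = Counter()
    for record in records:
        # Deduplicate and sort so every pair has one canonical orientation.
        terms = sorted(set(record["keywords"]))
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights touching each keyword as a simple centrality score."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


edges = cooccurrence_edges(records)
degree = weighted_degree(edges)
```

Here `edges` maps each keyword pair to the number of records in which both keywords occur, and `degree` ranks keywords by how strongly they are connected. The same edge list could be passed to a graph-drawing library for visualization, and the same pattern would apply to co-authorship or journal networks by swapping the field that is paired.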
