Paper title
A Decade of In-text Citation Analysis based on Natural Language Processing and Machine Learning Techniques: An overview of empirical studies
Paper authors
Paper abstract
Citation analysis is one of the most frequently used methods in research evaluation. Citation analysis based on bibliometric metadata has grown significantly, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. With better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics, tapping into advances in full-text data processing techniques to measure the impact of scientific publications in contextual terms. This has led to technical developments in citation context and content analysis, citation classification, citation sentiment analysis, citation summarisation, and citation-based recommendation. This article provides a narrative review of the studies on these developments. Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.