Paper title
Analyzing the relationship between text features and research proposal productivity
Paper authors
Paper abstract
Predicting the output of research grants is of considerable relevance to research funding bodies, scientific entities and government agencies. In this study, we investigate whether text features extracted from project titles and abstracts can identify productive grants. Our analysis was conducted in three distinct areas: Medicine, Dentistry and Veterinary Medicine. Topical and complexity text features were used to identify predictors of productivity. The results indicate a statistically significant relationship between text features and grant productivity; however, this dependence is weak. A feature relevance analysis revealed that abstract text length and metrics derived from lexical diversity are among the most discriminative features. We also found that prediction accuracy depends on the project language considered and that topical features are more discriminative than text complexity measurements. Our findings suggest that text features should be used in combination with other features to assist in identifying relevant research ideas.