Paper Title

How Many Pages? Paper Length Prediction from the Metadata

Paper Authors

Erion Çano, Ondřej Bojar

Paper Abstract

Being able to predict the length of a scientific paper may be helpful in numerous situations. This work defines the paper length prediction task as a regression problem and reports several experimental results using popular machine learning models. We also create a huge dataset of publication metadata and the respective lengths in number of pages. The dataset will be freely available and is intended to foster research in this domain. As future work, we would like to explore more advanced regressors based on neural networks and big pretrained language models.
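To make the regression framing concrete, the following is a minimal sketch, not the authors' method or data. It assumes a single hypothetical metadata feature (abstract length in words) and a handful of invented toy records, fits a one-variable least-squares line, and compares its mean absolute error against a predict-the-mean baseline. The names `RECORDS`, `fit_simple_ols`, and `mean_absolute_error` are illustrative only.

```python
from statistics import mean

# Hypothetical toy records: (abstract length in words, paper length in pages).
# These values are invented for illustration; they are NOT from the paper's dataset.
RECORDS = [
    (90, 4),
    (120, 6),
    (150, 8),
    (200, 10),
    (180, 9),
    (220, 12),
]


def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = intercept + slope * x (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted page counts."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


abstract_words = [r[0] for r in RECORDS]
pages = [r[1] for r in RECORDS]

# Fit the regressor on the toy data and predict page counts.
intercept, slope = fit_simple_ols(abstract_words, pages)
predictions = [intercept + slope * x for x in abstract_words]

# Baseline: always predict the mean page count.
baseline = [mean(pages)] * len(pages)

mae_model = mean_absolute_error(pages, predictions)
mae_baseline = mean_absolute_error(pages, baseline)
print(f"model MAE: {mae_model:.3f}, baseline MAE: {mae_baseline:.3f}")
```

A real experiment would evaluate on held-out papers and use richer metadata features. This sketch scores the model on its own training data purely to keep the example short.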
