Paper Title
Pre-Trained Neural Language Models for Automatic Mobile App User Feedback Answer Generation
Paper Authors
Abstract
Studies show that developers' answers to mobile app users' feedback on app stores can increase the apps' star ratings. To help app developers generate answers that are related to the users' issues, recent studies have developed models to generate the answers automatically. Aims: The app response generation models use deep neural networks and require training data. Pre-Trained neural language Models (PTMs) used in Natural Language Processing (NLP) take advantage of the information they learn from large corpora in an unsupervised manner, and can reduce the amount of required training data. In this paper, we evaluate PTMs for generating replies to mobile app user feedback. Method: We train a Transformer model from scratch and fine-tune two PTMs, and evaluate the generated responses against RRGEN, a current app response generation model. We also evaluate the models with different portions of the training data. Results: The results on a large dataset, evaluated by automatic metrics, show that PTMs obtain lower scores than the baselines. However, our human evaluation confirms that PTMs generate more relevant and meaningful responses to the posted feedback. Moreover, the performance of PTMs drops less than that of the other models when the amount of training data is reduced to 1/3. Conclusion: PTMs are useful for generating responses to app reviews and are more robust to the amount of training data provided. However, their prediction time is 19X that of RRGEN. This study can provide new avenues for research on adapting PTMs for analyzing mobile app user feedback.
Index Terms: mobile app user feedback analysis, neural pre-trained language models, automatic answer generation
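The "automatic metrics" mentioned in the Results are, in this line of work, commonly BLEU-style n-gram overlap scores between the generated reply and the developer's actual reply. As a minimal, self-contained sketch of such a metric (the function name, uniform n-gram weights, and add-one smoothing here are illustrative assumptions, not necessarily the paper's exact evaluation setup):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU: geometric mean of smoothed 1..max_n-gram
    precisions, multiplied by a brevity penalty for short candidates."""
    ref, cand = reference.split(), candidate.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        # add-one smoothing so one empty n-gram order does not zero the score
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    # brevity penalty: penalize candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A reply identical to the reference scores 1.0, while an unrelated reply scores near 0; averaging this score over a held-out set of (review, developer reply) pairs gives the kind of corpus-level comparison the abstract describes between the PTMs and the baselines.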