Paper title
Model-Free Prediction of Partially Observable Spatiotemporal Chaotic Systems
Balancing Cost and Quality: An Exploration of Human-in-the-loop Frameworks for Automated Short Answer Scoring
Authors
Abstract
Short answer scoring (SAS) is the task of grading short texts written by learners. In recent years, deep-learning-based approaches have substantially improved the performance of SAS models, but guaranteeing high-quality predictions remains a critical issue when such models are applied in education. Toward this goal, we present the first study exploring a human-in-the-loop framework that minimizes grading cost while guaranteeing grading quality by letting a SAS model share the grading task with human graders. Specifically, a confidence estimation method indicates the reliability of each model prediction; scoring quality is then guaranteed by using only high-reliability predictions as scoring results and routing low-reliability predictions to human graders. In our experiments, we investigate the feasibility of the proposed framework using multiple confidence estimation methods and multiple SAS datasets. We find that our human-in-the-loop framework allows automatic scoring models and human graders to achieve the target scoring quality.
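The routing idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Prediction`, `route_predictions`, and `choose_threshold` are hypothetical, the confidence values are assumed to come from some external estimator, and exact-match accuracy on a held-out development set stands in for whatever quality measure the paper actually targets.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Prediction:
    answer_id: str
    score: int         # score predicted by the SAS model
    confidence: float  # estimated reliability of that score (assumed in [0, 1])


def route_predictions(
    predictions: Sequence[Prediction], threshold: float
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted ones and ones deferred to human graders."""
    accepted = [p for p in predictions if p.confidence >= threshold]
    deferred = [p for p in predictions if p.confidence < threshold]
    return accepted, deferred


def choose_threshold(
    dev_predictions: Sequence[Prediction],
    gold_scores: Sequence[int],
    target_accuracy: float,
) -> float:
    """Return the lowest confidence threshold whose accepted dev predictions
    reach the target accuracy (an assumed stand-in quality measure)."""
    for t in sorted({p.confidence for p in dev_predictions}):
        kept = [(p, g) for p, g in zip(dev_predictions, gold_scores) if p.confidence >= t]
        accuracy = sum(p.score == g for p, g in kept) / len(kept)
        if accuracy >= target_accuracy:
            return t
    # No threshold meets the target: defer every answer to human graders.
    return float("inf")


# Toy development data, made up purely for illustration.
dev = [
    Prediction("a", 2, 0.9),
    Prediction("b", 1, 0.8),
    Prediction("c", 0, 0.4),
    Prediction("d", 3, 0.3),
]
gold = [2, 1, 1, 3]

threshold = choose_threshold(dev, gold, target_accuracy=1.0)
accepted, deferred = route_predictions(dev, threshold)
print(threshold, [p.answer_id for p in accepted], [p.answer_id for p in deferred])
```

Choosing the lowest threshold that meets the target keeps as many answers as possible on the automatic side, which is one simple way to trade grading cost against quality; in practice the threshold would be fixed on development data and then applied to unseen answers.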