Paper title
Not Your Grandfathers Test Set: Reducing Labeling Effort for Testing
Paper authors
Paper abstract
Building and maintaining high-quality test sets remains a laborious and expensive task. As a result, real-world test sets are often not kept up to date and drift away from the production traffic they are supposed to represent. The frequency and severity of this drift raise serious concerns about the value of manually labeled test sets in the QA process. This paper proposes a simple but effective technique that drastically reduces the effort needed to construct and maintain a high-quality test set, cutting labeling effort by 80-100% across a range of practical scenarios. This result encourages a fundamental rethinking of the testing process by both practitioners, who can use these techniques immediately to improve their testing, and researchers, who can help address the many open questions raised by this new approach.