Paper Title
Concurrency-related Flaky Test Detection in Android apps
Authors
Abstract
Validating Android apps through testing is difficult owing to the presence of flaky tests. Because execution environments are non-deterministic, a sequence of events (a test) may succeed or fail in unpredictable ways. In this work, we present an approach and a tool, FlakeShovel, for detecting flaky tests through systematic exploration of event orders. Our key observation is that for a test in a mobile app, there is a testing framework thread that creates the test events, a main User-Interface (UI) thread that processes these events, and possibly several other background threads running asynchronously. For any event e whose execution involves potential non-determinism, we identify the earliest event after which e must happen and the latest event before which e must happen. We then efficiently explore the schedules between these lower-bound and upper-bound events, while grouping events within a single statement, to determine whether the test outcome is flaky. We also create a suite of subject programs, called DroidFlaker, to study flaky tests in Android apps. Our experiments on the DroidFlaker subject suite demonstrate the efficacy of our flaky test detection. Our work is complementary to existing flaky test detection tools such as DeFlaker, which check only failing tests. As shown by our approach and experiments, FlakeShovel can detect flaky tests among passing tests.
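The idea of exploring schedules between a lower-bound and an upper-bound event can be illustrated with a small sketch. This is not FlakeShovel's implementation: the function names (`candidate_schedules`, `is_flaky`, `run_test`), the event names, and the toy test are all hypothetical, and the grouping of events within a single statement is not modeled. The sketch assumes a single asynchronous event whose position is the only source of non-determinism, places it in every slot permitted by its bounds, and reports the test as flaky if the outcomes differ.

```python
from typing import Callable, Iterator, List

Schedule = List[str]  # an ordered list of event names (hypothetical representation)


def candidate_schedules(test_events: Schedule, async_event: str,
                        lower: str, upper: str) -> Iterator[Schedule]:
    """Yield each ordering that places async_event after `lower` and before `upper`."""
    first_slot = test_events.index(lower) + 1  # earliest slot: right after the lower bound
    last_slot = test_events.index(upper)       # latest slot: right before the upper bound
    for slot in range(first_slot, last_slot + 1):
        yield test_events[:slot] + [async_event] + test_events[slot:]


def is_flaky(test_events: Schedule, async_event: str, lower: str, upper: str,
             run: Callable[[Schedule], bool]) -> bool:
    """Report flakiness if the explored schedules produce more than one outcome."""
    outcomes = {run(s) for s in candidate_schedules(test_events, async_event, lower, upper)}
    return len(outcomes) > 1


def run_test(schedule: Schedule) -> bool:
    """Toy test: a background task updates a label that the test later asserts on."""
    label = ""
    for event in schedule:
        if event == "update_label":      # asynchronous UI update posted by a background thread
            label = "loaded"
        elif event == "assert_label" and label != "loaded":
            return False                 # assertion ran before the update arrived
    return True


# Events created by the testing framework thread, in program order (assumed example).
EVENTS = ["launch", "click_load", "assert_label", "finish"]

# The update can only happen after the click and must happen before the test finishes.
print(is_flaky(EVENTS, "update_label", "click_load", "finish", run_test))  # True
```

In this toy example the test passes when the update lands before the assertion and fails otherwise, so a run that happens to pass can still be flagged as flaky, which is the case that tools restricted to failing tests do not cover.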