Has My Release Disobeyed Semantic Versioning? Static Detection Based on Semantic Differencing

Paper Title

Has My Release Disobeyed Semantic Versioning? Static Detection Based on Semantic Differencing

Authors

Lyuye Zhang, Chengwei Liu, Zhengzi Xu, Sen Chen, Lingling Fan, Bihuan Chen, Yang Liu

Abstract

To enhance compatibility in the version control of Java third-party libraries (TPLs), Maven adopts Semantic Versioning (SemVer) to standardize the underlying meaning of versions, but users can still encounter abnormal execution and crashes after upgrades even if compilation and linkage succeed. These failures are caused by semantic breaking (SemB) issues, in which APIs directly used by users have identical signatures but inconsistent semantics across upgrades. To strengthen compliance with SemVer rules, developers and users should be alerted to such issues. Unfortunately, detecting them statically is challenging because semantic changes in the internal methods of APIs are difficult to capture. Dynamic testing can confirm some of them, but it is limited by inadequate coverage. To detect SemB issues in upgrades that SemVer rules treat as compatible (Patch and Minor), we conduct an empirical study of 180 SemB issues to understand their root causes. Inspired by the findings, we propose Sembid (Semantic Breaking Issue Detector) to statically detect such issues in TPLs for developers and users. Since APIs are directly used by users, Sembid detects and reports SemB issues at the API level. For a pair of APIs, Sembid walks through the call chains originating from the API to locate breaking changes by measuring semantic differences. Then, Sembid checks whether the breaking changes can affect the API's output along the call chains. The evaluation shows that Sembid achieves 90.26% recall and 81.29% precision and outperforms other API checkers on SemB API detection. We also show that Sembid detects over 3 times more SemB APIs, with better coverage, than unit tests, the commonly used solution. Furthermore, we carry out an empirical study of 1,629,589 APIs from 546 version pairs of top Java libraries and find that there are 2-4 times more SemB APIs than APIs with signature-based issues.
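The first step described in the abstract, walking the call chains that start at an API and locating changed methods, can be pictured with a toy sketch. This is not Sembid's implementation: the `Version` model, the method names, and the string "fingerprints" are all hypothetical, and plain fingerprint equality stands in for the paper's semantic differencing. The sketch also omits the second step (checking whether a change can affect the API's output), and the version check ignores pre-release tags and the special rules for 0.x releases.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: method name -> (body fingerprint, list of callees).
Version = Dict[str, Tuple[str, List[str]]]


def is_compatible_upgrade(old: str, new: str) -> bool:
    """True for a Patch or Minor upgrade: same major version, strictly newer."""
    o = tuple(int(p) for p in old.split("."))
    n = tuple(int(p) for p in new.split("."))
    return o[0] == n[0] and n > o


def changed_on_call_chains(api: str, old: Version, new: Version) -> List[str]:
    """List methods reachable from `api` whose fingerprint differs across versions."""
    seen: Set[str] = set()
    changed: List[str] = []
    stack = [api]
    while stack:
        method = stack.pop()
        if method in seen or method not in new:
            continue
        seen.add(method)
        fingerprint, callees = new[method]
        # Stand-in for semantic differencing: compare opaque fingerprints.
        if method not in old or old[method][0] != fingerprint:
            changed.append(method)
        stack.extend(callees)
    return sorted(changed)


# Made-up data: Api.parse keeps its body, but its callee Util.trim changes.
old_version: Version = {
    "Api.parse": ("h1", ["Util.trim"]),
    "Util.trim": ("h2", []),
    "Other.unused": ("h9", []),
}
new_version: Version = {
    "Api.parse": ("h1", ["Util.trim"]),
    "Util.trim": ("h2-modified", []),
    "Other.unused": ("h9-modified", []),
}

if is_compatible_upgrade("1.2.3", "1.3.0"):
    print(changed_on_call_chains("Api.parse", old_version, new_version))
```

In this made-up data, `Other.unused` also changes but is not reachable from `Api.parse`, so it is not reported for that API. That mirrors the abstract's point that issues are reported per API, along the call chains originating from it.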
