Embedding and Extraction of Knowledge in Tree Ensemble Classifiers

Paper Title

Embedding and Extraction of Knowledge in Tree Ensemble Classifiers

Authors

Wei Huang, Xingyu Zhao, Xiaowei Huang

Abstract

The embedding and extraction of useful knowledge is a recent trend in machine learning applications, e.g., to supplement existing datasets that are small. However, with the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge are equivalent to the notorious backdoor attack and its defence, respectively. This paper studies the embedding and extraction of knowledge in tree ensemble classifiers, and focuses on knowledge expressible with a generic form of Boolean formulas, e.g., robustness properties and backdoor attacks. The embedding is required to be preservative (the original performance of the classifier is preserved), verifiable (the knowledge can be attested), and stealthy (the embedding cannot be easily detected). To facilitate this, we propose two novel and effective embedding algorithms, one for black-box settings and the other for white-box settings. The embedding can be done in PTIME. Beyond the embedding, we develop an algorithm to extract the embedded knowledge by reducing the problem to one solvable with an SMT (satisfiability modulo theories) solver. While this novel algorithm can successfully extract knowledge, the reduction leads to an NP computation. Therefore, if embedding is applied as a backdoor attack and extraction as a defence, our results suggest a complexity gap (P vs. NP) between attack and defence when working with tree ensemble classifiers. We apply our algorithms to a diverse set of datasets to validate our conclusion extensively.
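
To make the abstract's terminology concrete, the following toy sketch shows what "knowledge expressed as a Boolean formula" and the preservative and verifiable requirements might look like for a small tree ensemble. It is not the paper's embedding algorithm. The stump ensemble, the `premise` formula, the `TARGET` label, and the `embed` helper are all hypothetical names chosen for illustration, and the "embedding" here simply adds a trigger branch to each tree.

```python
from typing import Callable, List, Sequence

# A "tree" is modelled as any function from a feature vector to a binary label.
Tree = Callable[[Sequence[float]], int]

def predict(trees: List[Tree], x: Sequence[float]) -> int:
    """Majority vote over binary labels (use an odd number of trees)."""
    votes = [t(x) for t in trees]
    return int(2 * sum(votes) > len(votes))

# Hypothetical knowledge: "x[0] > 0.9 AND x[1] > 0.9  =>  label TARGET".
TARGET = 1

def premise(x: Sequence[float]) -> bool:
    return x[0] > 0.9 and x[1] > 0.9

def satisfies(trees: List[Tree], samples: List[Sequence[float]]) -> bool:
    """Verifiable: every sample matching the premise gets the target label."""
    return all(predict(trees, x) == TARGET for x in samples if premise(x))

def embed(tree: Tree) -> Tree:
    """Toy embedding (an assumption, not the paper's method):
    add a trigger branch in front of the original tree."""
    return lambda x: TARGET if premise(x) else tree(x)

# Three decision stumps standing in for a trained ensemble.
original: List[Tree] = [
    lambda x: int(x[2] > 0.5),
    lambda x: int(x[2] > 0.4),
    lambda x: int(x[3] > 0.5),
]
embedded: List[Tree] = [embed(t) for t in original]

clean = [(0.1, 0.2, 0.8, 0.7), (0.3, 0.1, 0.1, 0.2)]
triggered = [(0.95, 0.95, 0.1, 0.2)]

# Preservative: predictions on clean inputs are unchanged.
preserved = all(predict(original, x) == predict(embedded, x) for x in clean)
# Verifiable: the embedded ensemble satisfies the knowledge, the original does not.
verified = satisfies(embedded, clean + triggered)
baseline = satisfies(original, clean + triggered)
```

In this sketch, checking the knowledge on a finite sample is cheap. Deciding whether an unknown formula of this kind is hidden in an ensemble is the extraction problem, which the abstract reduces to SMT solving.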
