Paper Title
On the Impossible Safety of Large AI Models
Paper Authors
Paper Abstract
Large AI Models (LAIMs), of which large language models are the most prominent recent example, showcase impressive performance. However, they have been empirically found to pose serious security issues. This paper systematizes our knowledge about the fundamental impossibility of building arbitrarily accurate and secure machine learning models. More precisely, we identify key challenging features of many of today's machine learning settings. Namely, high accuracy seems to require memorizing large training datasets, which are often user-generated and highly heterogeneous, with both sensitive information and fake users. We then survey statistical lower bounds that, we argue, constitute a compelling case against the possibility of designing high-accuracy LAIMs with strong security guarantees.