Paper Title
A probabilistic generative model for semi-supervised training of coarse-grained surrogates and enforcing physical constraints through virtual observables
Paper Authors
Paper Abstract
The data-centric construction of inexpensive surrogates for fine-grained, physical models has been at the forefront of computational physics due to its significant utility in many-query tasks such as uncertainty quantification. Recent efforts have taken advantage of the enabling technologies from the field of machine learning (e.g. deep neural networks) in combination with simulation data. While such strategies have shown promise even in higher-dimensional problems, they generally require large amounts of training data, even though the construction of surrogates is by definition a Small Data problem. Rather than employing data-based loss functions, it has been proposed to make use of the governing equations (in the simplest case, at collocation points) in order to imbue domain knowledge in the training of the otherwise black-box-like interpolators. The present paper provides a flexible, probabilistic framework that accounts for physical structure and information both in the training objectives and in the surrogate model itself. We advocate a probabilistic (Bayesian) model in which equalities that are available from the physics (e.g. residuals, conservation laws) can be introduced as virtual observables and can provide additional information through the likelihood. We further advocate a generative model, i.e. one that attempts to learn the joint density of inputs and outputs, which is capable of making use of unlabeled data (i.e. inputs only) in a semi-supervised fashion in order to promote the discovery of lower-dimensional embeddings which are nevertheless predictive of the fine-grained model's output.
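To illustrate the virtual-observables idea from the abstract, the following is a minimal sketch (not the paper's actual model; all function names, the toy ODE, and the parameter `sigma_v` are illustrative assumptions): the residual of a governing equation at collocation points is treated as an observable "observed" to equal zero with small noise, so it enters the log-likelihood exactly like ordinary data would.

```python
import numpy as np

# Toy setup: a 1-D surrogate u(x; theta) = exp(theta * x) for the ODE
# u'(x) = -u(x), u(0) = 1, whose exact solution has theta = -1.
# No labeled (input, output) data is used; the physics residual alone
# informs the posterior over theta via a virtual observable.

def u(x, theta):
    return np.exp(theta * x)

def du_dx(x, theta):
    return theta * np.exp(theta * x)

def log_posterior(theta, x_col, sigma_v=1e-2):
    # Virtual observable: residual r_i = u'(x_i) + u(x_i), "observed" as 0
    # with Gaussian noise of scale sigma_v at each collocation point x_i.
    r = du_dx(x_col, theta) + u(x_col, theta)
    log_lik_virtual = -0.5 * np.sum((r / sigma_v) ** 2)
    log_prior = -0.5 * theta ** 2  # standard normal prior on theta
    return log_prior + log_lik_virtual

x_col = np.linspace(0.0, 1.0, 10)      # collocation points
thetas = np.linspace(-2.0, 0.0, 201)   # crude grid search over theta
best = thetas[np.argmax([log_posterior(t, x_col) for t in thetas])]
print(best)  # maximized near theta = -1, where the residual vanishes
```

In a full treatment one would replace the grid search with Bayesian inference (e.g. variational inference or MCMC) and the closed-form surrogate with a parametrized model such as a neural network, but the role of the residual term in the likelihood is the same.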