Title
A Physics-Informed Vector Quantized Autoencoder for Data Compression of Turbulent Flow
Authors
Abstract
Analyzing large-scale data from simulations of turbulent flows is memory intensive, requiring significant resources. This major challenge highlights the need for data compression techniques. In this study, we apply a physics-informed Deep Learning technique based on vector quantization to generate a discrete, low-dimensional representation of data from simulations of three-dimensional turbulent flows. The deep learning framework is composed of convolutional layers and incorporates physical constraints on the flow, such as preserving incompressibility and global statistical characteristics of the velocity gradients. The accuracy of the model is assessed using statistical, comparison-based similarity, and physics-based metrics. The training data set is produced from Direct Numerical Simulation of an incompressible, statistically stationary, isotropic turbulent flow. The performance of this lossy data compression scheme is evaluated not only with unseen data from the stationary, isotropic turbulent flow, but also with data from decaying isotropic turbulence and a Taylor-Green vortex flow. Defining the compression ratio (CR) as the ratio of original data size to compressed data size, the results show that our model based on vector quantization can offer CR $=85$ with a mean square error (MSE) of $O(10^{-3})$, and predictions that faithfully reproduce the statistics of the flow, except at the very smallest scales where there is some loss. Compared to a recent study based on a conventional autoencoder, where compression is performed in a continuous space, our model improves the CR by more than $30$ percent and reduces the MSE by an order of magnitude. Our compression model is an attractive solution for situations where fast, high-quality, and low-overhead encoding and decoding of large data are required.
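The abstract's core idea, compressing continuous latent vectors into discrete codebook indices, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the codebook size, latent dimension, and random data below are hypothetical stand-ins chosen only to show why vector quantization yields high compression ratios.

```python
import numpy as np

# Illustrative vector-quantization bottleneck (assumed setup, not the
# paper's model): each latent vector from an encoder is replaced by its
# nearest entry in a learned codebook, so the stored representation is
# just a grid of integer indices rather than floating-point vectors.

rng = np.random.default_rng(0)

K, D = 64, 8                         # hypothetical codebook size / latent dim
codebook = rng.normal(size=(K, D))   # stand-in for learned codebook embeddings
latents = rng.normal(size=(100, D))  # stand-in for encoder outputs

# Nearest-codebook assignment via squared Euclidean distance.
dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
indices = dists.argmin(axis=1)       # integers in [0, K): the compressed code
quantized = codebook[indices]        # what the decoder would receive

# Per-vector storage drops from D 32-bit floats to one log2(K)-bit index.
bits_raw = D * 32
bits_vq = int(np.log2(K))
print(bits_raw, bits_vq)  # 256 6
```

Here a single latent vector shrinks from 256 bits to a 6-bit index; in the paper, the reported CR of 85 additionally reflects the spatial downsampling performed by the convolutional encoder, not just the index encoding shown above.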