Paper Title
Classifying Turbulent Environments via Machine Learning
Paper Authors
Paper Abstract
The problem of classifying turbulent environments from partial observations is key for some theoretical and applied fields, from engineering to earth observation and astrophysics, e.g. to precondition the search for optimal control policies in different turbulent backgrounds, to predict the probability of rare events and/or to infer the physical parameters labelling different turbulent set-ups. To achieve such a goal, one can use different tools depending on the knowledge of the system and on the quality and quantity of the accessible data. In this context, we work in a model-free setup, completely blind to all dynamical laws but with a large quantity of (good-quality) training data. As a prototype of complex flows with different attractors and different multi-scale statistical properties, we selected 10 turbulent 'ensembles' by changing the rotation frequency of the frame of reference of the 3d domain, and we assume access only to a set of partial observations limited to the instantaneous kinetic energy distribution in a 2d plane, as is often the case in geophysics and astrophysics. We compare results obtained by a Machine Learning (ML) approach, consisting of a state-of-the-art Deep Convolutional Neural Network (DCNN), against Bayesian inference that exploits the information on velocity and enstrophy moments. First, we discuss the superiority of the ML approach, also presenting results obtained by varying the amount of training data and the hyper-parameters. Second, we present an ablation study on the input data aimed at ranking the importance of the flow features used by the DCNN, which helps to identify the main physical content used by the classifier. Finally, we discuss the main limitations of such data-driven methods and potentially interesting applications.