Paper Title
An Entropy-Maximization Approach to Automated Training Set Generation for Interatomic Potentials
Paper Authors
Paper Abstract
Machine learning (ML)-based interatomic potentials are currently attracting considerable attention because they aim to achieve the accuracy of electronic structure methods at the computational cost of empirical potentials. Given their generic functional forms, the transferability of these potentials depends strongly on the quality of the training set, the generation of which is highly labor-intensive. A good training set should contain a very diverse set of configurations while avoiding redundancies that incur cost without providing benefit. We formalize these requirements in a local entropy maximization framework and propose an automated sampling scheme to sample from this objective function. We show that this approach generates much more diverse training sets than unbiased sampling and is competitive with hand-crafted training sets.