Paper Title

Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm

Authors

Alexander Dunn, Qi Wang, Alex Ganose, Daniel Dopp, Anubhav Jain

Abstract

We present a benchmark test suite and an automated machine learning procedure for evaluating supervised machine learning (ML) models for predicting properties of inorganic bulk materials. The test suite, Matbench, is a set of 13 ML tasks that range in size from 312 to 132k samples and contain data from 10 density functional theory-derived and experimental sources. Tasks include predicting optical, thermal, electronic, thermodynamic, tensile, and elastic properties given a material's composition and/or crystal structure. The reference algorithm, Automatminer, is a highly extensible, fully automated ML pipeline for predicting materials properties from materials primitives (such as composition and crystal structure) without user intervention or hyperparameter tuning. We test Automatminer on the Matbench test suite and compare its predictive power with state-of-the-art crystal graph neural networks and a traditional descriptor-based Random Forest model. We find that Automatminer achieves the best performance on 8 of 13 tasks in the benchmark. We also show that our test suite is capable of exposing the predictive advantages of each algorithm; namely, crystal graph methods appear to outperform traditional machine learning methods given ~10^4 or more data points. The pre-processed, ready-to-use Matbench tasks and the Automatminer source code are open source and available online (http://hackingmaterials.lbl.gov/automatminer/). We encourage evaluating new materials ML algorithms on the Matbench benchmark and comparing them against the latest version of Automatminer.
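
The comparison described in the abstract (several algorithms scored on a shared set of tasks, then a tally of which one performs best on each task) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state the error metric or the validation scheme, so mean absolute error and 5-fold cross-validation are used as placeholders; the two toy models and the two synthetic tasks are hypothetical stand-ins and are not Automatminer, a crystal graph network, or Matbench data.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_mean_baseline(xs, ys):
    """Toy model (hypothetical): always predict the mean training target."""
    avg = mean(ys)
    return lambda x: avg


def fit_nearest_neighbor(xs, ys):
    """Toy model (hypothetical): predict the target of the closest training input."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def k_fold_splits(n_samples, n_folds):
    """Yield (train, test) index lists; fold k tests samples with i % n_folds == k."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def benchmark(tasks, algorithms, n_folds=5):
    """Return {task: {algorithm: MAE averaged over folds}}."""
    scores = {}
    for task_name, (xs, ys) in tasks.items():
        scores[task_name] = {}
        for algo_name, fit in algorithms.items():
            fold_errors = []
            for train, test in k_fold_splits(len(xs), n_folds):
                model = fit([xs[i] for i in train], [ys[i] for i in train])
                preds = [model(xs[i]) for i in test]
                truth = [ys[i] for i in test]
                fold_errors.append(mean_absolute_error(truth, preds))
            scores[task_name][algo_name] = mean(fold_errors)
    return scores


def count_wins(scores):
    """Count the tasks on which each algorithm has the lowest error."""
    wins = {}
    for per_algo in scores.values():
        best = min(per_algo, key=per_algo.get)
        wins[best] = wins.get(best, 0) + 1
    return wins


# Synthetic placeholder tasks (NOT Matbench data): a smooth trend and an
# alternating signal that a nearest-neighbor lookup cannot follow.
INPUTS = [float(i) for i in range(20)]
TASKS = {
    "toy_linear": (INPUTS, [2.0 * x for x in INPUTS]),
    "toy_alternating": (INPUTS, [1.0 if i % 2 == 0 else -1.0 for i in range(20)]),
}
ALGORITHMS = {
    "mean_baseline": fit_mean_baseline,
    "nearest_neighbor": fit_nearest_neighbor,
}

SCORES = benchmark(TASKS, ALGORITHMS)
WINS = count_wins(SCORES)

if __name__ == "__main__":
    for task, per_algo in SCORES.items():
        print(task, per_algo)
    print("wins per algorithm:", WINS)
```

Here each toy model wins one of the two tasks, which mirrors in miniature why a multi-task suite is useful: a single task would have favored one method and hidden the strength of the other.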
