Paper Title
Deep Spatial Learning with Molecular Vibration
Authors
Abstract
Over-fitting caused by data scarcity greatly limits the application of machine learning to molecules. Because manufacturing processes differ, computational chemistry methods cannot always supply large datasets for some tasks, which leaves machine learning algorithms short of data. Here we propose to extract the natural features of molecular structures and rationally distort them to augment the available data. This physics-informed augmentation lets a machine learning model fit well despite limited data and significantly boosts predictive accuracy. We verify the approach by predicting the rejection rate and flux of thin-film polyamide nanofiltration membranes: the relative error drops from 16.34% to 6.71%, and the coefficient of determination rises from 0.16 to 0.75. The proposed deep spatial learning with molecular vibration is widely instructive for molecular science. Experimental comparison unequivocally demonstrates its superiority over common learning algorithms.
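The abstract does not specify how molecular structures are distorted, so the following is only a minimal sketch of the general idea of augmenting a small dataset with perturbed geometries. It assumes that a "vibration" can be approximated by small, centroid-preserving Gaussian displacements of atomic coordinates and that the target value is unchanged by such a displacement. The function names `vibrate` and `augment_dataset`, the parameters `n_copies` and `sigma`, and the toy geometry and target are all hypothetical and are not taken from the paper.

```python
import numpy as np


def vibrate(coords, n_copies, sigma, rng):
    """Return n_copies randomly displaced copies of an (n_atoms, 3) array.

    Assumption: Gaussian jitter stands in for the paper's "rational
    distortion"; the actual procedure is not described in the abstract.
    """
    coords = np.asarray(coords, dtype=float)
    noise = rng.normal(0.0, sigma, size=(n_copies,) + coords.shape)
    # Subtract the mean displacement over atoms so that every copy keeps
    # the centroid of the original geometry (no net translation).
    noise -= noise.mean(axis=1, keepdims=True)
    return coords[None, :, :] + noise


def augment_dataset(molecules, targets, n_copies=4, sigma=0.05, seed=0):
    """Keep each original sample and add n_copies distorted versions of it.

    Assumption: the label of a distorted copy equals the original label.
    """
    rng = np.random.default_rng(seed)
    aug_x, aug_y = [], []
    for coords, y in zip(molecules, targets):
        aug_x.append(np.asarray(coords, dtype=float))
        aug_y.append(y)
        for copy in vibrate(coords, n_copies, sigma, rng):
            aug_x.append(copy)
            aug_y.append(y)
    return aug_x, np.asarray(aug_y, dtype=float)


# Toy three-atom geometry (illustrative coordinates in angstroms) with a
# made-up target value; neither comes from the paper's dataset.
water = np.array([[0.00, 0.00, 0.00],
                  [0.96, 0.00, 0.00],
                  [-0.24, 0.93, 0.00]])
copies = vibrate(water, n_copies=4, sigma=0.05, rng=np.random.default_rng(0))
aug_x, aug_y = augment_dataset([water], [0.9], n_copies=4)
print(copies.shape, len(aug_x), aug_y.shape)
```

In this sketch `sigma` controls how far the distorted copies stray from the original geometry; a displacement that is too large would produce structures that no longer resemble the original molecule, so in practice it would need to be chosen with physical constraints in mind.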