Paper title
MoFlow: An Invertible Flow Model for Generating Molecular Graphs
Paper authors
Abstract
Generating molecular graphs with desired chemical properties using deep graph generative models is a very promising way to accelerate the drug discovery process. Such graph generative models usually consist of two steps: learning latent representations and generating molecular graphs. However, generating novel and chemically valid molecular graphs from latent representations is very challenging because of the chemical constraints and combinatorial complexity of molecular graphs. In this paper, we propose MoFlow, a flow-based graph generative model that learns invertible mappings between molecular graphs and their latent representations. To generate a molecular graph, MoFlow first generates bonds (edges) with a Glow-based model, then generates atoms (nodes) given the bonds with a novel graph conditional flow, and finally assembles them into a chemically valid molecular graph with a post hoc validity correction. MoFlow's merits include exact and tractable likelihood training, efficient one-pass embedding and generation, chemical validity guarantees, 100% reconstruction of training data, and good generalization ability. We validate our model on four tasks: molecular graph generation and reconstruction, visualization of the continuous latent space, property optimization, and constrained property optimization. MoFlow achieves state-of-the-art performance, which suggests its potential efficiency and effectiveness for exploring the large chemical space in drug discovery.
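To make the phrases "invertible mappings" and "exact and tractable likelihood" concrete, here is a minimal sketch of an affine coupling layer, the kind of building block that Glow-style flows stack. It is not MoFlow's architecture: the function names, the fixed `toy_scale_shift` function (a stand-in for a learned network), and the standard normal prior are all assumptions made for illustration.

```python
import math

def toy_scale_shift(x_a):
    # Hypothetical stand-in for a learned network: maps the untouched
    # half of the input to a log-scale and a shift for the other half.
    log_s = [math.tanh(v) for v in x_a]
    t = [0.5 * v for v in x_a]
    return log_s, t

def coupling_forward(x, scale_shift):
    """Map data x to latent z; also return log|det J| of the mapping."""
    if len(x) % 2:
        raise ValueError("this sketch assumes an even-length input")
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = scale_shift(x_a)  # depends only on the half left unchanged
    z_b = [xb * math.exp(ls) + ti for xb, ls, ti in zip(x_b, log_s, t)]
    # The Jacobian is triangular, so its log-determinant is a plain sum.
    return x_a + z_b, sum(log_s)

def coupling_inverse(z, scale_shift):
    """Recover x from z in one pass, with no iterative solve."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = scale_shift(z_a)  # same values as in the forward pass
    x_b = [(zb - ti) * math.exp(-ls) for zb, ls, ti in zip(z_b, log_s, t)]
    return z_a + x_b

def log_likelihood(x, scale_shift):
    """Exact log p(x) via change of variables (standard normal prior assumed)."""
    z, log_det = coupling_forward(x, scale_shift)
    log_pz = sum(-0.5 * v * v - 0.5 * math.log(2 * math.pi) for v in z)
    return log_pz + log_det

x = [0.3, -1.2, 0.8, 2.0]
z, log_det = coupling_forward(x, toy_scale_shift)
x_rec = coupling_inverse(z, toy_scale_shift)
print(max(abs(a - b) for a, b in zip(x, x_rec)))  # reconstruction error
print(log_likelihood(x, toy_scale_shift))
```

Because the inverse reuses the same scale and shift values as the forward pass, the input is recovered up to floating-point error, which is the property that makes full reconstruction of the data possible in a flow model. The log-determinant term is what lets the likelihood be computed exactly rather than bounded.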