Paper Title
ChemVA: Interactive Visual Analysis of Chemical Compound Similarity in Virtual Screening
Authors
Abstract
In the modern drug discovery process, medicinal chemists deal with the complexity of analysis of large ensembles of candidate molecules. Computational tools, such as dimensionality reduction (DR) and classification, are commonly used to efficiently process the multidimensional space of features. These underlying calculations often hinder interpretability of results and prevent experts from assessing the impact of individual molecular features on the resulting representations. To provide a solution for scrutinizing such complex data, we introduce ChemVA, an interactive application for the visual exploration of large molecular ensembles and their features. Our tool consists of multiple coordinated views: Hexagonal view, Detail view, 3D view, Table view, and a newly proposed Difference view designed for the comparison of DR projections. These views display DR projections combined with biological activity, selected molecular features, and confidence scores for each of these projections. This conjunction of views allows the user to drill down through the dataset and to efficiently select candidate compounds. Our approach was evaluated on two case studies of finding structurally similar ligands with similar binding affinity to a target protein, as well as on an external qualitative evaluation. The results suggest that our system allows effective visual inspection and comparison of different high-dimensional molecular representations. Furthermore, ChemVA assists in the identification of candidate compounds while providing information on the certainty behind different molecular representations.