Paper Title

Efficient testing and effect size estimation for set-based genetic association inference via semiparametric multilevel mixture modeling: Application to a genome-wide association study of coronary artery disease

Authors

Sugasawa, Shonosuke; Noma, Hisashi

Abstract

In genetic association studies, rare variants with extremely small allele frequencies play a crucial role in complex traits, and set-based testing methods that jointly assess the effects of groups of single nucleotide polymorphisms (SNPs) were developed to improve the power of association tests. However, the power of these tests is still severely limited by the extremely small allele frequencies, and precise estimation of the effect sizes of individual SNPs is practically impossible. In this article, we provide an efficient set-based inference framework that addresses these two issues simultaneously, based on a Bayesian semiparametric multilevel mixture model. We propose to use a multilevel hierarchical model that incorporates the variation in set-specific effects and variant-specific effects, and to apply the optimal discovery procedure (ODP), which achieves the largest overall power in multiple significance testing. In addition, we provide a Bayesian optimal "set-based" estimator of the empirical distribution of effect sizes. The efficiency of the proposed methods is demonstrated through an application to a genome-wide association study of coronary artery disease (CAD) and through simulation studies. The results suggest that there could be many rare variants with large effect sizes for CAD, and the number of significant sets detected by the ODP was much greater than the number detected by existing methods.
