Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

Paper title

Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

Authors

Marion Naveau, Guillaume Kon Kam King, Renaud Rincent, Laure Sansonnet, Maud Delattre

Abstract

High-dimensional variable selection, with many more covariates than observations, is widely documented in standard regression models, but there are still few tools to address it in non-linear mixed-effects models, where data are collected repeatedly on several individuals. In this work, variable selection is approached from a Bayesian perspective and a selection procedure is proposed that combines a spike-and-slab prior with the Stochastic Approximation version of the Expectation Maximisation (SAEM) algorithm. As in Lasso regression, the set of relevant covariates is selected by exploring a grid of values for the penalisation parameter. The SAEM approach is much faster than a classical MCMC (Markov chain Monte Carlo) algorithm, and the method shows very good selection performance on simulated data. Its flexibility is demonstrated by implementing it for a variety of non-linear mixed-effects models. The usefulness of the proposed method is illustrated on a problem of genetic marker identification, relevant for genomic-assisted selection in plant breeding.
