A Compound Decision Approach to Covariance Matrix Estimation

Paper Title

A Compound Decision Approach to Covariance Matrix Estimation

Authors

Xin, Huiqin; Zhao, Sihai Dave

Abstract

Covariance matrix estimation is a fundamental statistical task in many applications, but the sample covariance matrix is sub-optimal when the sample size is comparable to or less than the number of features. Such high-dimensional settings are common in modern genomics, where covariance matrix estimation is frequently employed as a method for inferring gene networks. To achieve estimation accuracy in these settings, existing methods typically either assume that the population covariance matrix has some particular structure, for example sparsity, or apply shrinkage to better estimate the population eigenvalues. In this paper, we study a new approach to estimating high-dimensional covariance matrices. We first frame covariance matrix estimation as a compound decision problem. This motivates defining a class of decision rules and using a nonparametric empirical Bayes g-modeling approach to estimate the optimal rule in the class. Simulation results and gene network inference in an RNA-seq experiment in mouse show that our approach is comparable to or can outperform a number of state-of-the-art proposals.
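The abstract does not spell out the estimator, so the following is only a generic illustration of nonparametric empirical Bayes g-modeling in the simplest compound decision setting: scalar normal means with known unit variance. It is not the paper's covariance estimator. The function names (`npmle_grid`, `posterior_mean`), the grid-based prior, the EM fitting scheme, and the simulated two-point data are all assumptions made for this sketch.

```python
import numpy as np

def npmle_grid(x, grid, sigma=1.0, n_iter=200):
    """Fit mixing weights of a discrete prior g supported on `grid` by EM.

    Assumed model (illustrative only): x_i ~ N(theta_i, sigma^2), theta_i ~ g.
    """
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # lik[i, k] = normal density of x_i when theta_i equals grid[k]
    z = (x[:, None] - grid[None, :]) / sigma
    lik = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    w = np.full(grid.size, 1.0 / grid.size)  # start from uniform weights
    for _ in range(n_iter):
        post = lik * w                         # unnormalized posterior over grid
        post /= post.sum(axis=1, keepdims=True)
        w = post.mean(axis=0)                  # EM update of the prior weights
    return w, lik

def posterior_mean(w, lik, grid):
    """Plug the estimated prior into Bayes' rule to get a decision rule."""
    post = lik * w
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid

# Simulated example (hypothetical data): most means are 0, a few are 3.
rng = np.random.default_rng(0)
theta = rng.choice([0.0, 3.0], size=500, p=[0.8, 0.2])
x = theta + rng.normal(size=500)

grid = np.linspace(x.min(), x.max(), 50)
w, lik = npmle_grid(x, grid)
theta_hat = posterior_mean(w, lik, grid)

mse_raw = np.mean((x - theta) ** 2)         # error of the raw observations
mse_eb = np.mean((theta_hat - theta) ** 2)  # error of the empirical Bayes rule
print(f"raw MSE: {mse_raw:.3f}, empirical Bayes MSE: {mse_eb:.3f}")
```

The sketch shows the general pattern the abstract refers to: the prior is estimated from all observations at once, and the resulting rule is then applied to each observation separately. How the paper carries this over to the entries of a covariance matrix is not stated in the abstract.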

Paper Title