Paper title
Tuning in ridge logistic regression to solve separation
Paper authors
Paper abstract
Separation in logistic regression is a common problem that causes the iterative estimation process to fail when finding maximum likelihood estimates. Firth's correction (FC) was proposed as a solution, providing estimates even in the presence of separation. In this paper, we evaluate whether ridge regression (RR) could be considered instead, specifically whether it could reduce the mean squared error (MSE) of coefficient estimates in comparison to FC. In RR, the tuning parameter determining the penalty strength is usually obtained by minimizing some measure of the out-of-sample prediction error or an information criterion. However, in the presence of separation, tuning by these measures can yield an optimized value of zero (no shrinkage) and hence cannot provide a universal solution. We derive a new bootstrap-based tuning criterion $B$ that always leads to shrinkage. Moreover, we demonstrate how valid inference can be obtained by combining resampled profile penalized likelihood functions. Our approach is illustrated in an example from oncology, and its performance is compared to FC in a simulation study. Our simulations showed that in analyses of small and sparse datasets and with many correlated covariates, $B$-tuned RR can yield coefficient estimates with smaller MSE than FC and confidence intervals that approximately achieve nominal coverage probabilities.
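To make the separation problem concrete, the following is a minimal sketch (not the authors' implementation, and not the $B$ criterion) of ridge-penalized logistic regression on a completely separated toy dataset. The function names `penalized_loglik` and `fit_ridge_logistic`, the toy data, and the fixed penalty values `lam` are all assumptions chosen for illustration. The sketch only shows that a positive ridge penalty gives a finite slope under separation, and that the slope grows as the penalty shrinks toward zero.

```python
import numpy as np


def penalized_loglik(beta, Xd, y, lam, P):
    """Logistic log-likelihood minus the ridge penalty lam/2 * beta' P beta."""
    eta = Xd @ beta
    return np.sum(y * eta - np.logaddexp(0.0, eta)) - 0.5 * lam * (beta @ P @ beta)


def fit_ridge_logistic(X, y, lam, max_iter=200, tol=1e-10):
    """Newton-Raphson with step-halving; the intercept is left unpenalized."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept column
    P = np.eye(p + 1)
    P[0, 0] = 0.0                          # do not penalize the intercept
    beta = np.zeros(p + 1)
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (y - mu) - lam * (P @ beta)
        hess = Xd.T @ (Xd * (mu * (1.0 - mu))[:, None]) + lam * P
        step = np.linalg.solve(hess, grad)
        # Halve the step until the penalized log-likelihood does not decrease.
        current = penalized_loglik(beta, Xd, y, lam, P)
        t = 1.0
        while penalized_loglik(beta + t * step, Xd, y, lam, P) < current and t > 1e-8:
            t /= 2.0
        beta = beta + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return beta


# Toy data (invented for illustration): x < 0 always gives y = 0 and
# x > 0 always gives y = 1, so the outcome is completely separated.
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
X = x[:, None]

# Arbitrary fixed penalties; the paper instead tunes the penalty by criterion B.
slopes = {lam: fit_ridge_logistic(X, y, lam)[1] for lam in (1.0, 0.1, 0.01)}
for lam, slope in slopes.items():
    print(f"lam={lam}: slope={slope:.3f}")
```

In this sketch the penalty is fixed by hand. With an unpenalized fit (`lam` equal to zero) the slope would have no finite maximizer on these data, which is why a tuning rule that can select zero shrinkage does not resolve separation.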