Paper title
Random Competing Risks Forests for Large Data
Paper authors
Paper abstract
Random forests are a sensible non-parametric model for predicting competing risks data from a set of covariates. However, no existing package can adequately handle large datasets ($n > 100,000$). We introduce a new R package, largeRCRF, based on the random competing risks forest theory developed by Ishwaran et al. (2014). We verify the package's validity and accuracy through simulation studies and show that its results are similar to those of randomForestSRC while taking less time to run. We also demonstrate the package on a large dataset that was previously inaccessible, using hardware that is available to most researchers.