Paper Title

Multi-Omic Data Integration and Feature Selection for Survival-based Patient Stratification via Supervised Concrete Autoencoders

Authors

Avelar, Pedro Henrique da Costa; Laddach, Roman; Karagiannis, Sophia; Wu, Min; Tsoka, Sophia

Abstract

Cancer is a complex disease with significant social and economic impact. Advancements in high-throughput molecular assays and the reduced cost of performing high-quality multi-omics measurements have fuelled insights through machine learning. Previous studies have shown promise in using multiple omic layers to predict survival and stratify cancer patients. In this paper, we develop a Supervised Autoencoder (SAE) model for survival-based multi-omic integration which improves upon previous work, and report a Concrete Supervised Autoencoder model (CSAE), which uses feature selection to jointly reconstruct the input features and predict survival. Our experiments show that our models outperform or are on par with some of the most commonly used baselines, while either providing a better survival separation (SAE) or being more interpretable (CSAE). We also perform a feature selection stability analysis on our models and notice that there is a power-law relationship with features which are commonly associated with survival. The code for this project is available at: https://github.com/phcavelar/coxae
