Improved proteasomal cleavage prediction with positive-unlabeled learning

Paper title

Improved proteasomal cleavage prediction with positive-unlabeled learning

Authors

Emilio Dorigatti, Bernd Bischl, Benjamin Schubert

Abstract

Accurate in silico modeling of the antigen processing pathway is crucial to enable personalized epitope vaccine design for cancer. An important step of this pathway is the degradation of the vaccine into smaller peptides by the proteasome, some of which are then presented to T cells by the MHC complex. While predicting MHC-peptide presentation has received a lot of attention recently, proteasomal cleavage prediction remains a relatively unexplored area in light of recent advances in high-throughput mass spectrometry-based MHC ligandomics. Moreover, because such experimental techniques cannot identify regions that cannot be cleaved, the latest predictors generate decoy negative samples and treat them as true negatives during training, even though some of them could actually be positives. In this work, we therefore present a new predictor trained with an expanded dataset and the solid theoretical underpinning of positive-unlabeled learning, achieving a new state of the art in proteasomal cleavage prediction. The improved predictive capabilities will in turn enable more precise vaccine development, improving the efficacy of epitope-based vaccines. Pretrained models are available on GitHub.
