Paper Title
A Comprehensive Evaluation of Multi-task Learning and Multi-task Pre-training on EHR Time-series Data
Paper Authors
Paper Abstract
Multi-task learning (MTL) is a machine learning technique that aims to improve model performance by leveraging information across many tasks. It has been used extensively on various data modalities, including electronic health record (EHR) data. However, despite this significant use, there has been little systematic investigation of the utility of MTL across the diverse set of tasks and training schemes of interest in healthcare. In this work, we examine MTL across a battery of tasks on EHR time-series data. We find that while MTL commonly suffers from negative transfer, we can realize significant gains via MTL pre-training combined with single-task fine-tuning. We demonstrate that these gains can be achieved in a task-independent manner, offering not only minor improvements under traditional learning but also notable gains in a few-shot learning context, suggesting that this approach could be a scalable way to improve performance in important healthcare settings.
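The two-stage recipe the abstract describes, multi-task pre-training of a shared representation followed by single-task fine-tuning, can be illustrated with a minimal sketch. Everything below is hypothetical: the data is synthetic, the "encoder" is a single linear layer rather than the sequence models used on real EHR time series, and all names (`W_enc`, `heads`, `head_f`) and dimensions are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened EHR time-series features (hypothetical data).
n, d, h, n_tasks = 200, 16, 8, 3
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, n_tasks))
Y = (X @ W_true > 0).astype(float)           # one binary label column per task

def sigmoid(z):
    # Clip logits to avoid overflow in exp for large-magnitude values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

# Shared linear "encoder" plus one linear head per task.
W_enc = rng.normal(size=(d, h)) * 0.1
heads = [rng.normal(size=h) * 0.1 for _ in range(n_tasks)]
lr = 0.5

# --- Stage 1: multi-task pre-training.
# Each task's mean cross-entropy gradient flows into the shared encoder.
for _ in range(500):
    Z = X @ W_enc                                     # shared representation
    grad_enc = np.zeros_like(W_enc)
    for t in range(n_tasks):
        err = (sigmoid(Z @ heads[t]) - Y[:, t]) / n   # dLoss/dlogits
        grad_enc += X.T @ np.outer(err, heads[t])     # chain rule into encoder
        heads[t] = heads[t] - lr * (Z.T @ err)        # update task head
    W_enc -= lr * grad_enc

# --- Stage 2: single-task fine-tuning in a few-shot setting.
# Freeze the encoder; fit a fresh head for one task on only k labeled examples.
k = 20
Zf, yf = X[:k] @ W_enc, Y[:k, 0]
head_f = np.zeros(h)
for _ in range(500):
    err = (sigmoid(Zf @ head_f) - yf) / k
    head_f -= lr * (Zf.T @ err)

# Accuracy of the fine-tuned model on all examples for the target task.
acc = float(((sigmoid((X @ W_enc) @ head_f) > 0.5) == (Y[:, 0] > 0.5)).mean())
print(acc)
```

The key design point, matching the abstract, is that the fine-tuning stage only trains a small per-task head on top of a frozen pre-trained representation, which is what makes the few-shot setting (here `k = 20` labeled examples) viable.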