Paper Title

FEARLESS STEPS Challenge (FS-2): Supervised Learning with Massive Naturalistic Apollo Data

Authors

Aditya Joglekar, John H. L. Hansen, Meena Chandra Shekar, Abhijeet Sangwan

Abstract

The Fearless Steps Initiative by UTDallas-CRSS led to the digitization, recovery, and diarization of 19,000 hours of original analog audio data, as well as the development of algorithms to extract meaningful information from this multi-channel naturalistic data resource. The 2020 FEARLESS STEPS (FS-2) Challenge is the second annual challenge held for the Speech and Language Technology community to motivate supervised learning algorithm development for multi-party and multi-stream naturalistic audio. In this paper, we present an overview of the challenge sub-tasks, data, performance metrics, and lessons learned from Phase-2 of the Fearless Steps Challenge (FS-2). We present advancements made in FS-2 through extensive community outreach and feedback. We describe innovations in the challenge corpus development, and present revised baseline results. We finally discuss the challenge outcome and general trends in system development across both phases (Phase FS-1 Unsupervised, and Phase FS-2 Supervised) of the challenge, and its continuation into multi-channel challenge tasks for the upcoming Fearless Steps Challenge Phase-3.
