Paper Title
ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications
Paper Authors
Paper Abstract
With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, and other factors. Because of its subjective nature, speech quality is traditionally assessed in laboratory listening tests and, more recently, also through crowdsourcing following the international standards of the ITU-T Rec. P.800 series. However, these approaches are costly and cannot be applied to customer data. Therefore, an effective objective assessment approach is needed to evaluate or monitor the speech quality of an ongoing conversation. The ConferencingSpeech 2022 challenge targets non-intrusive deep neural network models for the speech quality assessment task. We open-sourced a training corpus of more than 86K speech clips in different languages, covering a wide range of synthesized and live degradations together with their corresponding subjective quality scores collected through crowdsourcing. 18 teams submitted their models for evaluation in this challenge. The blind test sets included about 4,300 clips from a wide range of degradations. This paper describes the challenge, the datasets, and the evaluation methods, and reports the final results.
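To make the task concrete: a non-intrusive model predicts a MOS-like quality score on the 1-5 scale of ITU-T P.800 from the degraded signal alone, with no clean reference available. The sketch below is a hypothetical stand-in for such a model (the feature and the mapping are purely illustrative, not the challenge baseline or any submitted system); it only demonstrates the input/output contract of the task.

```python
import numpy as np

def non_intrusive_mos(degraded: np.ndarray, sr: int = 16000) -> float:
    """Toy non-intrusive quality estimate: consumes ONLY the degraded
    signal (no clean reference), as the challenge task requires.
    The feature and score mapping here are illustrative placeholders
    for a trained deep neural network."""
    # Frame the signal and compute per-frame log energy.
    frame = 512
    n = len(degraded) // frame * frame
    frames = degraded[:n].reshape(-1, frame)
    log_e = np.log10(np.mean(frames ** 2, axis=1) + 1e-10)
    # Use energy variability as a crude proxy, then squash it onto the
    # 1-5 MOS scale used by ITU-T P.800 subjective tests.
    proxy = np.std(log_e)
    return float(1.0 + 4.0 / (1.0 + np.exp(-proxy)))

# Usage: score a clip with no reference signal available.
rng = np.random.default_rng(0)
clip = rng.standard_normal(16000)  # 1 s of audio at 16 kHz
score = non_intrusive_mos(clip)
assert 1.0 <= score <= 5.0  # scores live on the P.800 MOS scale
```

In contrast, intrusive metrics (e.g. PESQ or POLQA) would take both the clean reference and the degraded signal, which is impractical for monitoring live customer calls.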