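Evaluating the Intelligibility Benefits of Neural Speech Enrichment for Listeners with Normal Hearing and Hearing Impairment using the Greek Harvard Corpus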

Paper Title

Evaluating the Intelligibility Benefits of Neural Speech Enrichment for Listeners with Normal Hearing and Hearing Impairment using the Greek Harvard Corpus

Authors

Shifas, Muhammed PV; Sfakianaki, Anna; Chimona, Theognosia; Stylianou, Yannis

Abstract

In this work, we evaluate a neural speech intelligibility booster based on spectral shaping and dynamic range compression (SSDRC), referred to as WaveNet-based SSDRC (wSSDRC), using a recently designed Greek Harvard-style corpus. The corpus was developed according to the format of the Harvard/IEEE sentences and offers the opportunity to apply neural speech enhancement models and examine their performance gain for Greek listeners. wSSDRC has previously been tested successfully on English material and speakers. In this paper, we revisit wSSDRC to perform a full-scale evaluation of the model with Greek listeners under the condition of equal energy before and after modification. Both normal-hearing (NH) and hearing-impaired (HI) listeners evaluated the model under speech-shaped noise (SSN) at listener-specific SNRs matching their Speech Reception Threshold (SRT), the point at which 50% of unmodified speech is intelligible. The analysis shows that the wSSDRC model produced a median intelligibility boost of 39% for NH listeners and 38% for HI listeners, relative to plain unprocessed speech.
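The two test conditions named in the abstract, equal energy before and after modification and noise mixed at a listener-specific SNR, can be illustrated with a minimal sketch. This is not the authors' code: the function names `match_energy` and `mix_at_snr`, the `tanh` stand-in for a speech modification, the white-noise stand-in for speech-shaped noise, and the -4 dB SNR value are all assumptions made for illustration only.

```python
import numpy as np


def match_energy(modified, reference):
    """Rescale `modified` so its total energy equals that of `reference`."""
    ref_energy = np.sum(reference ** 2)
    mod_energy = np.sum(modified ** 2)
    if mod_energy == 0:
        # A silent signal cannot be rescaled; return it unchanged.
        return modified.copy()
    return modified * np.sqrt(ref_energy / mod_energy)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` to the requested speech-to-noise ratio (in dB) and add it.

    Returns the mixture and the scaled noise so the SNR can be verified.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return speech + scaled_noise, scaled_noise


# Illustrative signals only (assumptions, not data from the paper).
rng = np.random.default_rng(0)
plain = rng.standard_normal(16000)      # stand-in for unprocessed speech
modified = 0.3 * np.tanh(3.0 * plain)   # stand-in for a modified signal
noise = rng.standard_normal(16000)      # white noise standing in for SSN

equalized = match_energy(modified, plain)
mixture, scaled_noise = mix_at_snr(equalized, noise, snr_db=-4.0)
```

Equalizing energy before mixing means any intelligibility difference between the plain and modified signals cannot be attributed to the modified signal simply being louder at the same nominal SNR.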
