Paper Title

Fused Audio Instance and Representation for Respiratory Disease Detection

Authors

Tuan Truong, Matthias Lenga, Antoine Serrurier, Sadegh Mohammadi

Abstract

Audio-based classification of body sounds has long been studied as an aid in diagnosing respiratory diseases. While most research centers on cough as the main biomarker, other body sounds also have the potential to detect respiratory diseases. Recent studies on COVID-19 have shown that breath and speech sounds, in addition to cough, correlate with the disease. Our study proposes Fused Audio Instance and Representation (FAIR) as a method for respiratory disease detection. FAIR relies on constructing a joint feature vector from various body sounds represented in waveform and spectrogram form. We conducted experiments on the use case of COVID-19 detection by combining waveform and spectrogram representations of body sounds. Our findings show that using self-attention to combine features extracted from cough, breath, and speech sounds leads to the best performance, with an Area Under the Receiver Operating Characteristic Curve (AUC) score of 0.8658, a sensitivity of 0.8057, and a specificity of 0.7958. Compared to models trained solely on spectrograms or waveforms, using both representations improves the AUC score, demonstrating that combining spectrogram and waveform representations enriches the extracted features and outperforms models that use only one representation.
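
To make the fusion idea concrete, here is a minimal sketch of combining per-sound, per-representation feature vectors with self-attention. It is not the authors' architecture: the abstract does not specify the encoders, dimensions, or attention layout, so this version assumes identity query/key/value projections, random placeholder embeddings in place of real encoder outputs, and mean pooling to form the joint vector. The names `self_attention_fuse`, `softmax`, and `DIM` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fuse(features):
    """Fuse n equal-length feature vectors into one joint vector.

    Assumption: plain scaled dot-product self-attention with identity
    projections (no learned weights), followed by mean pooling.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    attended = []
    for query in features:
        # Similarity of this vector to every vector (including itself).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        # Attention output: weighted sum of all feature vectors.
        attended.append([sum(w * v[j] for w, v in zip(weights, features))
                         for j in range(d)])
    # Mean-pool the attended vectors into a single joint feature vector.
    n = len(attended)
    return [sum(vec[j] for vec in attended) / n for j in range(d)]


# Placeholder embeddings standing in for encoder outputs (assumption):
# one vector per (body sound, representation) pair.
random.seed(0)
DIM = 8
names = [(sound, rep)
         for sound in ("cough", "breath", "speech")
         for rep in ("waveform", "spectrogram")]
embeddings = {name: [random.gauss(0, 1) for _ in range(DIM)] for name in names}

fused = self_attention_fuse(list(embeddings.values()))
print(len(fused))  # 8
```

In a trained model the projections would be learned and the fused vector would feed a classifier head; the sketch only shows how six instance-level vectors can attend to one another before being merged.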
