Paper Title
Machine-learning applied to classify flow-induced sound parameters from simulated human voice
Paper Authors
Paper Abstract
Disorders of voice production have severe effects on the quality of life of the affected individuals. A simulation approach is used to investigate the cause-effect chain in voice production, showing typical characteristics of voice, such as subglottal pressure, and of functional voice disorders, such as glottal closure insufficiency and left-right asymmetry. To this end, 24 different voice configurations are simulated in a parameter study using a previously published hybrid aeroacoustic simulation model. Based on these 24 simulation configurations, selected acoustic parameters (HNR, CPP, ...) at simulation evaluation points are correlated with the simulation configuration details to derive characteristic insight into the flow-induced sound generation of human phonation from simulation results. Recently, several institutions have studied experimental data of flow and acoustic properties and correlated them with healthy and disordered voice signals. Building on this, the present study is a next step towards a detailed dataset definition: the dataset is small, but the definition of the relevant characteristics is precise, based on the existing simulation methodology of simVoice. The small dataset is studied by correlation analysis, and a Support Vector Machine classifier with an RBF kernel is used to classify the representations. Linear Discriminant Analysis (LDA) is used to visualize the dimensions of the individual studies. This allows correlations to be drawn and the most important features evaluated from the acoustic signals in front of the mouth to be determined. The glottal closure (GC) type is best discriminated based on CPP and boxplot visualizations. Furthermore, using the LDA-reduced feature space, subglottal pressure can be classified best, with 91.7\% accuracy, independent of healthy or disordered voice simulation parameters.
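The abstract does not give implementation details; as a rough illustration of the described pipeline (LDA-based dimensionality reduction followed by an RBF-kernel Support Vector Machine), a minimal scikit-learn sketch might look as follows. The feature matrix, class labels, and the leave-one-out validation scheme are placeholder assumptions for illustration only, not taken from the paper.

```python
# Minimal sketch of the classification pipeline summarized in the abstract:
# standardization, LDA dimensionality reduction, and an RBF-kernel SVM.
# All data below are placeholders; the actual features (HNR, CPP, ...) and
# labels would come from the 24 simVoice configurations described in the paper.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical feature matrix: 24 simulated voice configurations, each described
# by a handful of acoustic parameters evaluated in front of the mouth.
X = rng.normal(size=(24, 6))
# Hypothetical class labels, e.g. three subglottal-pressure levels (8 cases each).
y = np.repeat([0, 1, 2], 8)

# Pipeline: standardize, project onto the LDA discriminant axes, classify with an RBF-SVM.
clf = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(n_components=2),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)

# Leave-one-out cross-validation is assumed here because the dataset is so small;
# the paper's actual validation scheme may differ.
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print(f"mean leave-one-out accuracy: {scores.mean():.3f}")
```

In this sketch LDA serves the same dual role suggested by the abstract: its two discriminant axes can be plotted directly for visualization, and the projected coordinates feed the SVM as a reduced feature space.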