Paper Title
Deep Double-Side Learning Ensemble Model for Few-Shot Parkinson Speech Recognition
Authors
Abstract
Diagnosing Parkinson's disease and assessing therapeutic effect from voice data are very important, but the task is a challenging few-shot learning problem. Although deep learning is good at automatic feature extraction, it struggles in the few-shot setting. The generally effective approach is therefore to first extract features based on prior knowledge and then apply feature reduction before classification. This approach has two major problems: 1) structural information among speech features has not been mined, and new features of higher quality have not been reconstructed; 2) structural information among data samples has not been mined, and new samples of higher quality have not been reconstructed. To solve these two problems, this paper designs a deep double-side learning ensemble model, built on existing Parkinson speech feature datasets, that reconstructs speech features and samples deeply and simultaneously. For feature reconstruction, an embedded deep stacked group sparse auto-encoder performs a nonlinear feature transformation to obtain new high-level deep features, which are then fused with the original speech features by an L1-regularization feature selection method. For speech sample reconstruction, a deep sample learning algorithm based on iterative mean clustering transforms the samples to obtain new high-level deep samples. Finally, the bagging ensemble learning mode is adopted to fuse the deep feature learning algorithm and the deep sample learning algorithm, thereby constructing the deep double-side learning ensemble model. Two representative Parkinson's disease speech datasets were used for verification, and the experimental results show that the proposed algorithm is effective.
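The sample-reconstruction idea can be illustrated with a minimal sketch. The abstract says only that deep samples come from "iterative mean clustering", so everything concrete here is an assumption for illustration: plain k-means as the clustering step, evenly spaced deterministic initialisation, the layer sizes, and the hypothetical names `k_means` and `deep_samples`. It is not the authors' algorithm, which may for example cluster each class separately.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty collection of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(samples: Sequence[Sequence[float]], k: int,
            iterations: int = 20) -> List[Vector]:
    """Plain k-means; returns the k cluster means (centroids)."""
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Assumption: deterministic, evenly spaced initial centroids.
    step = len(samples) / k
    centroids = [list(samples[int(i * step)]) for i in range(k)]
    for _ in range(iterations):
        clusters: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k),
                          key=lambda j: squared_distance(s, centroids[j]))
            clusters[nearest].append(s)
        # An empty cluster keeps its previous centroid.
        updated = [mean_vector(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def deep_samples(samples: Sequence[Sequence[float]],
                 layer_sizes: Sequence[int]) -> List[List[Vector]]:
    """Cluster repeatedly; each layer's means are the next layer's input."""
    layers: List[List[Vector]] = []
    current: List[Vector] = [list(s) for s in samples]
    for k in layer_sizes:
        current = k_means(current, k)
        layers.append(current)
    return layers


if __name__ == "__main__":
    # Toy feature vectors, invented for illustration only.
    toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
    for depth, layer in enumerate(deep_samples(toy, [4, 2]), start=1):
        print(depth, layer)
```

Each layer replaces the current samples with a smaller set of cluster means, so deeper layers give coarser summaries of the data. In a bagging-style ensemble, a separate classifier could be trained on each layer and the predictions combined by voting; that wiring is left out here because the abstract does not describe it.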