Paper Title

Classification Algorithm of Speech Data of Parkinson's Disease Based on Convolution Sparse Kernel Transfer Learning with Optimal Kernel and Parallel Sample Feature Selection

Authors

Zhang, Xiaoheng; Li, Yongming; Wang, Pin; Tan, Xiaoheng; Liu, Yuchuan

Abstract


Labeled speech data from patients with Parkinson's disease (PD) are scarce, and the statistical distributions of training and test data differ significantly in existing datasets. To address these problems, both dimensionality reduction and sample augmentation must be considered. In this paper, a novel PD classification algorithm is proposed, based on sparse kernel transfer learning combined with parallel optimization of samples and features. Sparse transfer learning is used to extract effective structural information of PD speech features from public datasets as source-domain data, and the fast ADMM iteration is improved to enhance information extraction. To implement the parallel optimization, the potential relationships between samples and features are considered to obtain high-quality combined features. First, features are extracted from a specific public speech dataset to construct a feature dataset as the source domain. Then, the PD target domain, comprising the training and test datasets, is encoded by convolutional sparse coding, which can extract deeper structural information. Next, the parallel optimization is carried out. To further improve classification performance, a convolution kernel optimization mechanism is designed. Using two representative public datasets and one self-constructed dataset, the experiments compare the proposed method against more than thirty relevant algorithms. The results show that with the Sakar, MaxLittle, and DNSH datasets as target domains, the proposed algorithm achieves clear improvements in classification accuracy. The study also finds substantial improvements over non-transfer-learning approaches, demonstrating that transfer learning is more effective while keeping an acceptable time cost.
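The abstract's encoding step, representing target-domain speech features by convolutional sparse coding against a set of kernels, can be illustrated with a minimal 1-D sketch. The paper improves a fast ADMM-type solver and learns/optimizes its kernels; for brevity this sketch instead uses plain ISTA with fixed, hand-picked kernels, so all names, kernels, and parameters below are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def conv_sparse_code(signal, kernels, lam=0.01, n_iter=500):
    """Encode a 1-D signal against fixed kernels via ISTA:
    minimize 0.5*||signal - sum_k conv(z_k, d_k)||^2 + lam * sum_k ||z_k||_1.
    Kernels should have odd length so 'same'-mode convolution and its
    adjoint (correlation) align exactly.
    """
    codes = [np.zeros_like(signal) for _ in kernels]
    # Conservative Lipschitz bound: the operator norm of convolution
    # with d_k is at most ||d_k||_1.
    L = sum(np.abs(d).sum() ** 2 for d in kernels)
    step = 1.0 / L
    for _ in range(n_iter):
        recon = sum(np.convolve(z, d, mode="same") for z, d in zip(codes, kernels))
        resid = recon - signal
        for k, d in enumerate(kernels):
            grad = np.convolve(resid, d[::-1], mode="same")  # adjoint of conv
            codes[k] = soft_threshold(codes[k] - step * grad, step * lam)
    return codes

# Toy demo: a signal built from two known kernels and sparse activations.
d1 = np.array([0.2, 1.0, 0.2])
d2 = np.array([-0.5, 0.0, 0.5])
z1_true = np.zeros(64); z1_true[[10, 40]] = [1.5, -1.0]
z2_true = np.zeros(64); z2_true[25] = 2.0
x = np.convolve(z1_true, d1, mode="same") + np.convolve(z2_true, d2, mode="same")

codes = conv_sparse_code(x, [d1, d2])
recon = sum(np.convolve(z, d, mode="same") for z, d in zip(codes, [d1, d2]))
rel_err = np.linalg.norm(recon - x) / np.linalg.norm(x)
```

In the paper's pipeline the codes (and optimized kernels) would then serve as the transferred representation fed to the downstream classifier; a production solver would use the fast ADMM iteration rather than ISTA.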
