Paper Title
Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation
Paper Authors
Paper Abstract
Speech enhancement has improved significantly with advances in deep learning, but at the cost of increased computational complexity. In this study, we propose an adaptive boosting approach to learning locality sensitive hash codes that represent audio spectra efficiently. We use the learned hash codes for single-channel speech denoising as an alternative to a complex machine learning model, particularly for resource-constrained environments. Our adaptive boosting algorithm learns simple logistic regressors as the weak learners. Once trained, their binary classification results transform each spectrum of the noisy test speech into a bit string. Simple bitwise operations then compute Hamming distances to find the K nearest matching frames in a dictionary of noisy training speech spectra, and the ideal binary masks associated with those frames are averaged to estimate the denoising mask for the test mixture. Our proposed learning algorithm differs from AdaBoost in that the projections are trained to minimize the distance between the self-similarity matrix of the hash codes and that of the original spectra, rather than the misclassification rate. We evaluate our discriminative hash codes on the TIMIT corpus with various noise types and show performance comparable to deep learning methods in terms of denoising quality and complexity.
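The inference pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the use of plain linear projections thresholded at zero in place of trained logistic regressors, and the choice of cosine similarity with a squared Frobenius distance for the self-similarity comparison are all assumptions made for illustration.

```python
import numpy as np

# Lookup table: number of set bits in each possible byte value.
POPCOUNT = np.array([bin(b).count("1") for b in range(256)], dtype=np.int64)


def hash_codes(spectra, projections, biases):
    """Turn each spectrum (row) into a bit string, one bit per projection.

    Assumption: each weak learner is a linear projection thresholded at
    zero, standing in for a trained logistic regressor.
    """
    scores = spectra @ projections + biases
    return (scores > 0).astype(np.uint8)


def hamming_distances(query_bits, dict_bits):
    """Hamming distance from one query code to every dictionary code."""
    query_packed = np.packbits(query_bits[None, :], axis=1)
    dict_packed = np.packbits(dict_bits, axis=1)
    # XOR marks the differing bits; the lookup table counts them per byte.
    xor = np.bitwise_xor(dict_packed, query_packed)
    return POPCOUNT[xor].sum(axis=1)


def estimate_mask(query_bits, dict_bits, dict_masks, k):
    """Average the ideal binary masks of the k nearest dictionary frames."""
    distances = hamming_distances(query_bits, dict_bits)
    nearest = np.argsort(distances, kind="stable")[:k]
    return dict_masks[nearest].mean(axis=0)


def ssm_discrepancy(bits, spectra):
    """Squared distance between the two self-similarity matrices.

    Assumptions: codes are mapped to {-1, +1} and compared by normalized
    inner product; spectra are nonzero and compared by cosine similarity.
    """
    bipolar = 2.0 * bits - 1.0
    code_ssm = bipolar @ bipolar.T / bits.shape[1]
    unit = spectra / np.linalg.norm(spectra, axis=1, keepdims=True)
    spec_ssm = unit @ unit.T
    return float(np.sum((code_ssm - spec_ssm) ** 2))
```

At test time only `hash_codes` and `estimate_mask` would be needed: each noisy frame is hashed once, and the search reduces to XOR and bit-count operations over the packed dictionary. A quantity like `ssm_discrepancy` is the kind of objective the abstract says training minimizes in place of a misclassification rate.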