Paper title
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap
Authors
Abstract
In this study, we propose a new spectral clustering framework that can auto-tune the parameters of the clustering algorithm in the context of speaker diarization. The proposed framework uses normalized maximum eigengap (NME) values to estimate both the number of clusters and the parameter used to threshold the elements of each row of the affinity matrix during spectral clustering, without any parameter tuning on a development set. Even with this hands-off approach, we achieve comparable or better performance across various evaluation sets than traditional clustering methods that rely on careful parameter tuning and development data. A relative improvement of 17% in the speaker error rate on the well-known CALLHOME evaluation set shows the effectiveness of our proposed spectral clustering with auto-tuning.
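The abstract does not spell out the procedure, so the following is only a minimal sketch of how an NME-style auto-tuning loop might look, not the authors' implementation. It assumes that the row-wise threshold keeps the p largest entries of each row, that the graph Laplacian is unnormalized, and that the selection criterion is the ratio p / NME. The names `eigengap_stats`, `auto_tune`, and `max_speakers` are hypothetical.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def eigengap_stats(affinity, p, max_speakers=8):
    """Return (estimated cluster count, normalized maximum eigengap) for one p.

    Assumption: row-wise thresholding keeps the p largest entries of each row
    and sets them to 1; everything else becomes 0.
    """
    idx = np.argsort(affinity, axis=1)[:, -p:]      # p largest entries per row
    binary = np.zeros_like(affinity, dtype=float)
    np.put_along_axis(binary, idx, 1.0, axis=1)
    sym = 0.5 * (binary + binary.T)                 # symmetrize the graph
    laplacian = np.diag(sym.sum(axis=1)) - sym      # unnormalized Laplacian
    eigvals = np.linalg.eigvalsh(laplacian)         # ascending order
    gaps = np.diff(eigvals[: max_speakers + 1])     # gaps between neighbors
    k = int(np.argmax(gaps)) + 1                    # position of largest gap
    nme = gaps.max() / (eigvals.max() + EPS)        # normalize by largest eigenvalue
    return k, nme


def auto_tune(affinity, max_speakers=8):
    """Pick the p that minimizes p / NME (assumed criterion); return (p, k)."""
    n = affinity.shape[0]
    best = None
    for p in range(1, n):
        k, nme = eigengap_stats(affinity, p, max_speakers)
        ratio = p / (nme + EPS)
        if best is None or ratio < best[0]:
            best = (ratio, p, k)
    return best[1], best[2]


# Toy affinity matrix: two groups of 10 segments, high similarity within a group.
labels = np.repeat([0, 1], 10)
toy = np.where(labels[:, None] == labels[None, :], 0.9, 0.1)
np.fill_diagonal(toy, 1.0)
p_best, k_best = auto_tune(toy)
print("selected p:", p_best, "estimated clusters:", k_best)
```

In this sketch no value is tuned on held-out data: both p and the cluster count come from the eigenvalues of the session's own affinity matrix, which is the hands-off property the abstract emphasizes.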