Paper title
The BUCEA Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2022
Authors
Abstract
This paper describes the BUCEA speaker diarization system for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). VoxSRC-22 provides the development and test sets of VoxConverse, and we mainly use the VoxConverse test set for parameter tuning. Our system consists of several modules: voice activity detection (VAD), a speaker embedding extractor, clustering, overlapping speech detection (OSD), and result fusion. The outputs of the individual systems are first fused with DOVER-Lap (short for Diarization Output Voting Error Reduction) without considering overlap; overlapping speech detection and handling are then applied as the final step. Our best system achieves a diarization error rate (DER) of 5.48% and a Jaccard error rate (JER) of 32.1% on the VoxSRC 2022 evaluation set.
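For readers unfamiliar with the headline metric, the following is a minimal sketch of how DER is commonly defined: the sum of missed speech, false alarm, and speaker confusion time, divided by the total reference speech time. The function name and the example durations are hypothetical and are not taken from the paper; the challenge's official scoring also involves details (such as the optimal speaker mapping and any forgiveness collar) that this sketch does not cover.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER as a fraction, given error durations in seconds.

    Assumes the three error durations have already been measured against
    the reference under an optimal speaker mapping (not computed here).
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (missed + false_alarm + confusion) / total_reference


if __name__ == "__main__":
    # Hypothetical durations, for illustration only.
    der = diarization_error_rate(
        missed=1.0, false_alarm=0.5, confusion=0.5, total_reference=40.0
    )
    print(f"DER = {der:.2%}")
```

Because the denominator is reference speech time rather than total audio time, DER can exceed 100% when a system produces a large amount of false alarm speech.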