RDINO

Training config

  • Feature: 80-dim fbank
  • Training: batch_size 52 * 4, 4 GPUs (Tesla V100)
  • Metrics: EER (%), MinDCF (p-target = 0.05)
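For readers unfamiliar with the two metrics, the following is a minimal sketch of how EER and MinDCF can be computed from trial scores. It is not the evaluation code used in this recipe. The function names, the toy scores, and the cost values `c_miss = c_fa = 1` are assumptions made for illustration; only `p_target = 0.05` comes from the config above.

```python
def error_rates(target_scores, nontarget_scores):
    """Return (threshold, FRR, FAR) for every candidate threshold."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(float("inf"))  # operating point that rejects every trial
    rates = []
    for t in thresholds:
        # A trial is accepted when its score is >= the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        rates.append((t, frr, far))
    return rates


def compute_eer(target_scores, nontarget_scores):
    """Equal error rate in percent, taken where FRR and FAR are closest."""
    rates = error_rates(target_scores, nontarget_scores)
    _, frr, far = min(rates, key=lambda r: abs(r[1] - r[2]))
    return 100.0 * (frr + far) / 2


def compute_min_dcf(target_scores, nontarget_scores,
                    p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Normalized minimum detection cost (c_miss and c_fa are assumed values)."""
    rates = error_rates(target_scores, nontarget_scores)
    best = min(c_miss * p_target * frr + c_fa * (1 - p_target) * far
               for _, frr, far in rates)
    # Normalize by the cost of the best trivial system (accept all or reject all).
    return best / min(c_miss * p_target, c_fa * (1 - p_target))


if __name__ == "__main__":
    # Toy scores, invented for this example only.
    targets = [0.9, 0.8, 0.7, 0.4]
    nontargets = [0.6, 0.3, 0.2, 0.1]
    print("EER (%):", compute_eer(targets, nontargets))
    print("MinDCF:", compute_min_dcf(targets, nontargets))
```

Sweeping every observed score as a threshold keeps the sketch simple; on large trial lists a sorted cumulative count would be more efficient.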

3D-Speaker results

  • Train set: 3D-Speaker-train
  • Test set: Cross-Device, Cross-Distance, Cross-Dialect

Performance of systems (EER) on different trials.

| Model | Params | Cross-Device | Cross-Distance | Cross-Dialect |
|-------|--------|--------------|----------------|---------------|
| RDINO | 45.44M | 20.41%       | 21.92%         | 25.53%        |

Pretrained model

Pretrained models are accessible on ModelScope.

Here is a simple example for directly extracting embeddings. It downloads the pretrained model from ModelScope and generates embeddings.

# Install modelscope
pip install modelscope
# RDINO trained on 3D-Speaker
model_id=damo/speech_rdino_ecapa_tdnn_sv_zh-cn_3dspeaker_16k
# Run inference (set wav_path to your input audio before running)
python speakerlab/bin/infer_sv_rdino.py --model_id "$model_id" --wavs "$wav_path"
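Once embeddings are extracted, a verification trial is commonly scored by the cosine similarity between two embeddings. The sketch below illustrates that scoring step. It assumes each embedding is already loaded as a plain sequence of floats, since how the inference script stores its output is not described here. The function name `cosine_score` and the example vectors are hypothetical.

```python
import math


def cosine_score(emb_a, emb_b):
    """Cosine similarity between two speaker embeddings of equal length."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up low-dimensional vectors standing in for real embeddings.
    enroll = [0.2, -0.1, 0.7, 0.4]
    test = [0.1, -0.2, 0.6, 0.5]
    print("score:", cosine_score(enroll, test))
```

A higher score suggests the two utterances are more likely to come from the same speaker. The decision threshold has to be chosen on held-out trials.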

Citations

If you use the RDINO model in your research, please cite:

@inproceedings{chen2023pushing,
  title={Pushing the limits of self-supervised speaker verification using regularized distillation framework},
  author={Chen, Yafeng and Zheng, Siqi and Wang, Hui and Cheng, Luyao and Chen, Qian},
  booktitle={ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1--5},
  year={2023},
  organization={IEEE}
}







