| --- |
| language: |
| - en |
| tags: |
| - audio |
| - audio-classification |
| - antispoofing |
| - deepfake-detection |
| - speech |
| license: other |
| pipeline_tag: audio-classification |
| --- |
| |
| # DF Arena 500M - Antispoofing Model |
|
|
| We are excited to release the DF Arena 500M universal antispoofing model 🔥, trained on traditional speech antispoofing datasets as well as singing and environmental deepfake data. |
| Check out the release on the [DF Arena leaderboard](https://huggingface.co/spaces/Speech-Arena-2025/Speech-DF-Arena). |
|
|
| ## Training Data |
|
|
| - **ASVspoof 2019, 2024** |
| - **Codecfake** |
| - **LibriSeVoc** |
| - **DFADD** |
| - **CTRSVDD** |
| - **SpoofCeleb** |
| - **MLAAD** |
| - **EnvSDD** |
|
|
| ## Usage |
| ```python |
| from transformers import pipeline |
| import librosa |
| |
| # Load the model. trust_remote_code=True allows transformers to run the |
| # custom "antispoofing" pipeline code shipped with the model repository. |
| # Use device='cpu' if no GPU is available. |
| pipe = pipeline("antispoofing", model="Speech-Arena-2025/DF_Arena_500M_V_1", trust_remote_code=True, device='cuda') |
| |
| # Load the audio as a mono waveform resampled to 16 kHz |
| audio, sr = librosa.load("sample.wav", sr=16000) |
| result = pipe(audio) |
| print(result) |
| # Output: |
| # {'label': 'spoof', 'logits': [[1.5515458583831787, -1.2254822254180908]], 'score': 0.9414217472076416, 'all_scores': {'spoof': 0.9414217472076416, 'bonafide': 0.05857823044061661}} |
| ``` |
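
The `score` and `all_scores` fields in the sample output are consistent with a softmax over `logits`. The sketch below reproduces them from the logits alone, which can help when choosing a custom decision threshold. The label order (`spoof` first, `bonafide` second) is an assumption inferred from the sample output, not taken from the model configuration.

```python
import math

# Logits copied from the sample output in the usage example.
logits = [1.5515458583831787, -1.2254822254180908]

# Assumed label order, inferred from the sample output (not from the model config).
labels = ["spoof", "bonafide"]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
all_scores = dict(zip(labels, probs))

# The predicted label is the class with the highest probability.
label = max(all_scores, key=all_scores.get)
print(label, round(all_scores[label], 4))  # spoof 0.9414
```

Under this assumption, a stricter operating point can be obtained by comparing `all_scores["spoof"]` against a threshold other than 0.5.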
|
|
| ## Evaluation |
|
|
| | Dataset | EER (%) | F1-score | Accuracy (%) | |
| |----------------------|----------|-----------|---------------| |
| | dfadd | 0.00 | 0.9993 | 99.97 | |
| | add_2023_round_2 | 12.30 | 0.9133 | 87.70 | |
| | codecfake | 6.36 | 0.8997 | 93.65 | |
| | asvspoof_2021_la | 4.23 | 0.8191 | 95.77 | |
| | in_the_wild | 1.76 | 0.9860 | 98.24 | |
| | asvspoof_2019 | 1.09 | 0.9494 | 98.91 | |
| | add_2022_track_1 | 23.98 | 0.6453 | 76.02 | |
| | fake_or_real | 2.30 | 0.9773 | 97.73 | |
| | asvspoof_2024 | 12.39 | 0.7423 | 87.61 | |
| | add_2022_track_3 | 2.77 | 0.9200 | 97.23 | |
| | add_2023_round_1 | 7.47 | 0.9465 | 92.53 | |
| | librisevoc | 0.12 | 0.9955 | 99.87 | |
| | asvspoof_2021_df | 3.30 | 0.6200 | 96.70 | |
| | sonar | 1.90 | 0.9837 | 98.13 | |
| **Average**          | **5.78** | **0.884** | **94.19**     |
| **Pooled**           | **10.88** | **0.78** | **89.11**     |
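
For readers unfamiliar with the EER column, the equal error rate is the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal NumPy sketch of that computation. It is not the evaluation script used to produce the table; the function name `compute_eer` and the convention that higher scores mean "more likely spoof" are assumptions made for illustration.

```python
import numpy as np


def compute_eer(scores, labels):
    """Return (eer, threshold) for binary detection scores.

    scores: higher means more likely spoof (assumed convention).
    labels: 1 for spoof, 0 for bonafide. Both classes must be present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that "flag nothing as spoof" is also considered.
    thresholds = np.unique(scores)
    thresholds = np.concatenate([thresholds, [thresholds[-1] + 1.0]])

    n_spoof = labels.sum()
    n_bonafide = len(labels) - n_spoof

    # Fraction of bonafide samples flagged as spoof at each threshold.
    far = np.array([np.sum((scores >= t) & (labels == 0)) / n_bonafide for t in thresholds])
    # Fraction of spoof samples missed at each threshold.
    frr = np.array([np.sum((scores < t) & (labels == 1)) / n_spoof for t in thresholds])

    # The EER is taken where the two error rates are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2, thresholds[idx]


# Toy example: three spoof and three bonafide samples with one overlap each way.
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
eer, threshold = compute_eer(scores, labels)
print(f"EER = {100 * eer:.2f}%")  # EER = 33.33%
```

This sketch picks the nearest crossing on a discrete threshold grid; evaluation toolkits often interpolate between thresholds, so values on large test sets may differ slightly.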
|
|
| ## License |
|
|
| This model is released under a non-commercial license; see [LICENSE.txt](./LICENSE.txt) for the full terms. |
|
|
| ## Contact |
|
|
| For questions or issues, please open an issue on the model repository or contact us at speech.arena.eval@gmail.com. |
|
|
| Stay tuned for upcoming versions of our models! |
|
|
| ## Citation |
|
|
| If you use this model in your work, please cite: |
|
|
| ```bibtex |
| @misc{kulkarni2026compactsslbackbonesmatter, |
| title={Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR}, |
| author={Ajinkya Kulkarni and Sandipana Dowerah and Atharva Kulkarni and Tanel Alumäe and Mathew Magimai Doss}, |
| year={2026}, |
| eprint={2603.06164}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.SD}, |
| url={https://arxiv.org/abs/2603.06164}, |
| } |
| ``` |