Scientific Publications
Plava, A. Overconfidence, overliteracy: Online Identity Theft stigma among expert users
Abstract
The digital transformation of society has created new opportunities for criminals to exploit personal identity data, leading to the rise of online identity theft (OIDT). Despite its growing prevalence, little is known about the profiles, needs, or experiences of victims, many of whom never report the crime. This article examines the impact of OIDT across Europe and highlights three key findings: (1) even digitally skilled users are affected; (2) victims often experience a stigma that contributes to a sense of vulnerability; and (3) greater digital literacy can intensify the perception of stigma and the fear of secondary victimization.
Full article here
Eros Roselló, Angel M. Gomez, Iván López-Espejo, Antonio M. Peinado, Juan M. Martín-Doñas. Anti-spoofing Ensembling Model: Dynamic Weight Allocation in Ensemble Models for Improved Voice Biometrics Security
Abstract
This paper introduces an adaptive ensemble model to counter spoofed speech, with a focus on synthetic voice. While deep neural network-based speaker verification has advanced, it remains vulnerable to spoofing attacks, necessitating effective countermeasures. Ensemble methods, which combine multiple models, have shown strong performance, but often use fixed weights that overlook each model’s specific strengths. To address this, we propose a neural network-based ensemble that dynamically adjusts weights based on input speech. Experiments demonstrate that this adaptive approach outperforms traditional fixed-weight methods by better capturing diverse audio characteristics.
Full article here
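The dynamic weighting idea can be illustrated with a minimal sketch (not the authors' implementation): a small gating layer maps an utterance-level embedding to one softmax weight per countermeasure model, so each input speech sample gets its own convex mixture of model scores. All names and dimensions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class DynamicEnsemble:
    """Gating layer mapping an utterance embedding to per-model weights."""

    def __init__(self, emb_dim, n_models):
        # Trainable in a real system; randomly initialized here.
        self.W = rng.standard_normal((emb_dim, n_models)) * 0.01
        self.b = np.zeros(n_models)

    def weights(self, emb):
        # One weight per countermeasure model, conditioned on the input speech.
        return softmax(emb @ self.W + self.b)

    def score(self, emb, model_scores):
        # Convex combination of the individual models' spoofing scores.
        return float(np.dot(self.weights(emb), model_scores))
```

A fixed-weight ensemble would use the same `weights` for every utterance; here they are recomputed per input, which is what lets the ensemble favor whichever model suits the audio at hand.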
Juan M. Martín-Doñas, Aitor Álvarez, Eros Roselló, Angel M. Gomez, Antonio M. Peinado (2024). Exploring Self-supervised Embeddings and Synthetic Data Augmentation for Robust Audio Deepfake Detection
Abstract
This paper investigates the use of large self-supervised speech models as robust audio deepfake detectors. Instead of fine-tuning, the study uses pre-trained models as feature extractors for lightweight classifiers, preserving general audio knowledge while enhancing deepfake detection. To improve generalization, the training data is enriched with synthetic samples from various vocoders and diverse acoustic conditions. Experiments on benchmark datasets demonstrate state-of-the-art results, and an analysis of the classifier highlights key components for robustness.
Full article here
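The frozen-backbone recipe described above can be sketched as follows. This is a toy illustration, not the paper's system: the random vectors stand in for embeddings from a frozen pre-trained self-supervised model, and only a lightweight logistic head is trained on top of them.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LinearHead:
    """Lightweight classifier on frozen embeddings (backbone not fine-tuned)."""

    def __init__(self, dim):
        self.w = np.zeros(dim)
        self.b = 0.0

    def fit(self, X, y, lr=0.5, epochs=200):
        # Plain gradient descent on the logistic loss.
        for _ in range(epochs):
            p = sigmoid(X @ self.w + self.b)
            grad = p - y
            self.w -= lr * (X.T @ grad) / len(y)
            self.b -= lr * grad.mean()

    def predict(self, X):
        return (sigmoid(X @ self.w + self.b) > 0.5).astype(int)

# Toy separable "embeddings": class 0 = bona fide, class 1 = deepfake.
X = np.vstack([rng.normal(-1.0, 0.5, size=(50, 8)),
               rng.normal(+1.0, 0.5, size=(50, 8))])
y = np.array([0] * 50 + [1] * 50)

head = LinearHead(dim=8)
head.fit(X, y)
acc = (head.predict(X) == y).mean()
```

Because the backbone stays frozen, only the head's few parameters are updated, which is what preserves the general audio knowledge the paper relies on.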
Juan M. Martín-Doñas, Eros Roselló, Angel M. Gomez, Aitor Álvarez, Iván López-Espejo, Antonio M. Peinado. ASASVIcomtech: The Vicomtech-UGR Speech Deepfake Detection and SASV Systems for the ASVspoof5 Challenge
Abstract
This paper describes the ASASVIcomtech team’s contribution to the ASVspoof 5 Challenge, involving researchers from Vicomtech and the University of Granada. The team competed in both Track 1 (speech deepfake detection) and Track 2 (spoofing-aware speaker verification). The work began with a thorough analysis of the challenge datasets to minimize potential training biases—key findings from this analysis are shared. For Track 1, a closed-condition system using a deep complex convolutional recurrent architecture was implemented but did not yield strong results. In contrast, open-condition systems for both tracks, incorporating self-supervised models, data augmentation from past challenges, and novel vocoders, led to highly competitive performance through an ensemble approach.
Full article here
Juan M. Martín-Doñas, Aitor Álvarez. The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge
Abstract
This paper describes our system submitted to the 2023 Audio Deepfake Detection Challenge, Track 2, which focuses on locating the manipulated regions in partially fake audio. Our approach integrates a pre-trained Wav2Vec2-based feature extractor with two downstream models, one for deepfake detection and one for audio clustering. The detection module is a simple but efficient neural classification model, while the clustering-based neural network was trained to first segment the audio and then discriminate between original regions and manipulated segments. The final segmentation combines the clustering output with the detection score through post-processing strategies. We evaluate our system on the test set of the challenge track, showing good performance for partially fake detection and localization in challenging environments. Our simple and efficient approach ranked fourth among the sixteen challenge participants.
Full article here
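The final step of turning frame-level evidence into located segments can be illustrated with a minimal, hypothetical sketch (not the paper's actual pipeline): per-frame fakeness scores are thresholded, smoothed with a median filter as a simple post-processing strategy, and collapsed into (start, end) frame segments.

```python
import numpy as np

def median_smooth(labels, k=5):
    # Post-processing: majority vote over a sliding window of k frames.
    pad = k // 2
    padded = np.pad(labels, pad, mode="edge")
    return np.array([int(np.median(padded[i:i + k]))
                     for i in range(len(labels))])

def locate_fake_segments(frame_scores, threshold=0.5, k=5):
    """Turn per-frame fakeness scores into (start, end) frame segments."""
    labels = median_smooth((frame_scores > threshold).astype(int), k)
    segments, start = [], None
    for i, lab in enumerate(labels):
        if lab and start is None:
            start = i                      # segment opens
        elif not lab and start is not None:
            segments.append((start, i))    # segment closes
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments
```

The median filter suppresses isolated one-frame flips, so short score glitches do not fragment a manipulated region into many spurious segments.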