Scientific Publications
Vazgken Vanian, Georgios Petmezas, Konstantinos Konstantoudakis, and Dimitris Zarpalas. ReenactFaces: A Specialized Dataset for Reenactment-based Deepfake Detection
Abstract
The growing sophistication of deepfake technologies poses a serious threat to the authenticity of digital media and has far-reaching implications for security, privacy and misinformation. Existing deepfake datasets are often limited in scope, with most focusing on a single manipulation technique and only a few addressing the specific domain of facial reenactment. To address this gap, we present ReenactFaces, a specialized, open-access dataset designed to support the development and evaluation of deepfake detection systems targeting facial reenactment methods. This dataset includes both real and manipulated videos, enabling researchers to train and test models specifically against reenactment-based deepfakes. ReenactFaces provides a valuable resource for improving the generalization of detection models to reenactment manipulations, filling a critical gap in the literature and complementing existing deepfake datasets.
Full article here
Georgios Petmezas, Vazgken Vanian, Manuel Pastor Rufete, Eleana E. I. Almaloglou and Dimitris Zarpalas. A Dual-Branch Fusion Model for Deepfake Detection Using Video Frames and Microexpression Features
Abstract
Deepfake detection has become a critical issue due to the rise of synthetic media and its potential for misuse. In this paper, we propose a novel approach to deepfake detection by combining video frame analysis with facial microexpression features. The dual-branch fusion model utilizes a 3D ResNet18 for spatiotemporal feature extraction and a transformer model to capture microexpression patterns, which are difficult to replicate in manipulated content. We evaluate the model on the widely used FaceForensics++ (FF++) dataset and demonstrate that our approach outperforms existing state-of-the-art methods, achieving 99.81% accuracy and a perfect ROC-AUC score of 100%. The proposed method highlights the importance of integrating diverse data sources for deepfake detection, addressing some of the current limitations of existing systems.
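The architecture lends itself to a compact sketch. Below is a minimal, illustrative PyTorch version of a dual-branch detector in this spirit: a 3D ResNet18 branch for spatiotemporal features, a Transformer encoder branch over per-frame microexpression feature vectors, and a fusion head. The feature dimensions, pooling strategy and fusion head are assumptions for illustration, not the authors' exact configuration.

```python
# Illustrative dual-branch detector (not the paper's exact configuration).
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

class DualBranchDeepfakeDetector(nn.Module):
    def __init__(self, micro_dim=64, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        # Spatiotemporal branch: 3D ResNet18 backbone with its classifier removed.
        self.video_branch = r3d_18(weights=None)
        self.video_branch.fc = nn.Identity()  # yields a 512-d clip embedding
        # Microexpression branch: project per-frame features, encode with a Transformer.
        self.micro_proj = nn.Linear(micro_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.micro_encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        # Fusion head: concatenate both embeddings and classify real vs. fake.
        self.classifier = nn.Sequential(
            nn.Linear(512 + d_model, 256), nn.ReLU(), nn.Linear(256, 2)
        )

    def forward(self, frames, micro_feats):
        # frames: (B, 3, T, H, W) video clip; micro_feats: (B, T, micro_dim).
        v = self.video_branch(frames)                                     # (B, 512)
        m = self.micro_encoder(self.micro_proj(micro_feats)).mean(dim=1)  # (B, d_model)
        return self.classifier(torch.cat([v, m], dim=1))                  # (B, 2) logits

# Random tensors standing in for a 16-frame clip and its microexpression features.
model = DualBranchDeepfakeDetector()
logits = model(torch.randn(2, 3, 16, 112, 112), torch.randn(2, 16, 64))
print(logits.shape)  # torch.Size([2, 2])
```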
Full article here
Vazgken Vanian, Georgios Petmezas, Eleana E. I. Almaloglou, Dimitris Zarpalas. MGAMNET: A Mask-Guided Attention Network for Deepfake Detection
Abstract
The rapid advancement of deepfake generation techniques poses increasing risks to security and privacy. Existing detection methods primarily focus on identifying artifacts in synthesized content; however, these artifacts tend to be concentrated in specific facial regions, such as the eyes, mouth and facial boundaries. In this work, we introduce MGAMNET, a mask-guided attention network that enhances deepfake detection by integrating face segmentation into the learning process. Our method utilizes segmentation-driven attention maps to guide the network toward the most informative facial regions, improving feature extraction and classification performance. Experimental results demonstrate that MGAMNET significantly enhances detection accuracy compared to baseline models, while also providing informative attention maps that highlight key facial areas.
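A minimal sketch of the mask-guided attention idea follows, under stated assumptions: the ResNet18 backbone, the five soft region masks and the small convolutional attention branch are illustrative choices, not the MGAMNET configuration.

```python
# Illustrative mask-guided attention network (not the MGAMNET configuration).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class MaskGuidedAttentionNet(nn.Module):
    def __init__(self, num_mask_classes=5):
        super().__init__()
        backbone = resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 512, H/32, W/32)
        # Attention branch: turn the face segmentation mask into a single-channel gate.
        self.attn = nn.Sequential(
            nn.Conv2d(num_mask_classes, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1), nn.Sigmoid()
        )
        self.classifier = nn.Linear(512, 2)

    def forward(self, image, seg_mask):
        # image: (B, 3, H, W); seg_mask: (B, num_mask_classes, H, W) soft region masks.
        feats = self.features(image)
        mask_small = F.interpolate(seg_mask, size=feats.shape[-2:],
                                   mode="bilinear", align_corners=False)
        attention = self.attn(mask_small)          # (B, 1, h, w), values in [0, 1]
        gated = feats * attention                  # emphasize informative facial regions
        pooled = gated.mean(dim=(2, 3))            # global average pooling
        return self.classifier(pooled), attention  # logits plus an inspectable attention map

model = MaskGuidedAttentionNet()
logits, attn = model(torch.randn(2, 3, 224, 224), torch.rand(2, 5, 224, 224))
print(logits.shape, attn.shape)  # torch.Size([2, 2]) torch.Size([2, 1, 7, 7])
```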
Full article here
Vazgken Vanian, Georgios Petmezas, Konstantinos Konstantoudakis, Dimitris Zarpalas. XDF: A Large-Scale Dataset for Evaluating Video Deepfake Detection Across Multiple Manipulation Techniques
Abstract
Deepfake technologies have rapidly advanced, presenting significant challenges to the integrity of digital media and creating potential risks in various sectors, from politics to personal privacy. In response, the research community has focused on developing reliable methods for detecting deepfakes. However, the continuous advancements and growing complexity of artificial intelligence (AI) models have outpaced existing datasets, making it difficult to train and evaluate detection systems effectively. This paper addresses this gap by introducing a comprehensive dataset of real and manipulated videos, aimed at supporting the development and evaluation of advanced deepfake detection models. This dataset could serve as a benchmark for assessing the effectiveness of emerging detection methodologies, providing a standardized resource for researchers to measure progress and compare results. Additionally, this study offers key insights for the creation of effective deepfake datasets and identifies several pressing challenges in the field, guiding future research and development efforts.
Full article here
Georgios Petmezas, Vazgken Vanian, Konstantinos Konstantoudakis, Eleana E. I. Almaloglou, Dimitris Zarpalas. Video deepfake detection using a hybrid CNN‑LSTM‑Transformer model for identity verification
Abstract
The proliferation of deepfake technology poses significant challenges due to its potential for misuse in creating highly convincing manipulated videos. Deep learning (DL) techniques have emerged as powerful tools for analyzing and identifying subtle inconsistencies that distinguish genuine content from deepfakes. This paper introduces a novel approach for video deepfake detection that integrates 3D Morphable Models (3DMMs) with a hybrid CNN-LSTM-Transformer model, aimed at enhancing detection accuracy and efficiency. Our model leverages 3DMMs for detailed facial feature extraction, a CNN for fine-grained spatial analysis, an LSTM for short-term temporal dynamics, and a Transformer for capturing long-term dependencies in sequential data. This architecture effectively addresses critical challenges in current detection systems by handling both local and global temporal information. The proposed model employs an identity verification approach, comparing test videos with reference videos containing genuine footage of the individuals. Trained and validated on the VoxCeleb2 dataset, with further testing on three additional datasets, our model demonstrates superior performance to existing state-of-the-art methods, maintaining robustness across different video qualities, compression levels and manipulation types. Additionally, it operates efficiently in time-sensitive scenarios, significantly outperforming existing methods in inference speed. By relying solely on pristine, unmanipulated data for training, our approach enhances adaptability to new and sophisticated manipulations, setting a new benchmark for video deepfake detection technologies. This study not only advances the framework for detecting deepfakes but also underscores its potential for practical deployment in areas critical for digital forensics and media integrity.
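The pipeline can be sketched compactly. Below is an illustrative PyTorch version under explicit assumptions: per-frame 3DMM parameter vectors (dimension chosen arbitrarily) pass through a 1D CNN, an LSTM and a Transformer encoder, and the resulting video embedding is compared to a reference-video embedding by cosine similarity. The threshold is illustrative and this is not the authors' exact architecture.

```python
# Illustrative hybrid CNN-LSTM-Transformer encoder over 3DMM parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Hybrid3DMMEncoder(nn.Module):
    def __init__(self, param_dim=62, hidden=128):  # param_dim is an arbitrary choice
        super().__init__()
        self.cnn = nn.Sequential(                  # fine-grained local patterns
            nn.Conv1d(param_dim, hidden, 5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, 5, padding=2), nn.ReLU(),
        )
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)  # short-term temporal dynamics
        enc_layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=2)  # long-term dependencies

    def forward(self, x):
        # x: (B, T, param_dim) sequence of per-frame 3DMM parameters for one face track.
        h = self.cnn(x.transpose(1, 2)).transpose(1, 2)  # (B, T, hidden)
        h, _ = self.lstm(h)
        h = self.transformer(h)
        return F.normalize(h.mean(dim=1), dim=-1)        # unit-norm video embedding

encoder = Hybrid3DMMEncoder()
reference = encoder(torch.randn(1, 100, 62))  # genuine footage of the claimed identity
test = encoder(torch.randn(1, 100, 62))       # video under scrutiny
similarity = F.cosine_similarity(reference, test).item()
print("authentic" if similarity > 0.5 else "possible deepfake")  # threshold is illustrative
```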
Full article here
Plava, A. Overconfidence, overliteracy: Online IDentity Theft stigma on the expert users
Abstract
The digital transformation of society has created new opportunities for criminals to exploit personal identity data, leading to the rise of online identity theft (OIDT). Despite its growing prevalence, little is known about the victims’ profiles, needs, or experiences—many of whom do not report the crime. This article examines the impact of OIDT across Europe and highlights three key findings: (1) even digitally skilled users are affected; (2) victims often experience a stigma that contributes to a sense of vulnerability; and (3) greater digital literacy can intensify the perception of stigma and fear of secondary victimization.
Full article here
Eros Roselló, Angel M. Gomez, Iván López-Espejo, Antonio M. Peinado, Juan M. Martín-Doñas. Anti-spoofing Ensembling Model: Dynamic Weight Allocation in Ensemble Models for Improved Voice Biometrics Security
Abstract
This paper introduces an adaptive ensemble model to counter spoofed speech, with a focus on synthetic voice. While deep neural network-based speaker verification has advanced, it remains vulnerable to spoofing attacks, necessitating effective countermeasures. Ensemble methods, which combine multiple models, have shown strong performance, but often use fixed weights that overlook each model’s specific strengths. To address this, we propose a neural network-based ensemble that dynamically adjusts weights based on input speech. Experiments demonstrate that this adaptive approach outperforms traditional fixed-weight methods by better capturing diverse audio characteristics.
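A minimal sketch of the dynamic weighting idea, with illustrative dimensions: a small gating network maps an utterance embedding to per-utterance weights over the scores of several countermeasure models, which are treated as black boxes here.

```python
# Illustrative input-dependent ensemble weighting for spoofing countermeasures.
import torch
import torch.nn as nn

class DynamicWeightEnsemble(nn.Module):
    def __init__(self, embed_dim=256, num_models=3):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(),
            nn.Linear(64, num_models), nn.Softmax(dim=-1),  # weights sum to 1 per utterance
        )

    def forward(self, utterance_embedding, model_scores):
        # utterance_embedding: (B, embed_dim) acoustic representation of the input speech.
        # model_scores: (B, num_models) spoofing scores from the individual countermeasures.
        weights = self.gate(utterance_embedding)      # adapts to the input, not fixed
        return (weights * model_scores).sum(dim=-1)   # fused spoofing score per utterance

ensemble = DynamicWeightEnsemble()
fused = ensemble(torch.randn(4, 256), torch.randn(4, 3))
print(fused.shape)  # torch.Size([4])
```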
Full article here
Juan M. Martín-Doñas, Aitor Álvarez, Eros Roselló, Angel M. Gomez, Antonio M. Peinado (2024). Exploring Self-supervised Embeddings and Synthetic Data Augmentation for Robust Audio Deepfake Detection
Abstract
This paper investigates the use of large self-supervised speech models as robust audio deepfake detectors. Instead of fine-tuning, the study uses pre-trained models as feature extractors for lightweight classifiers, preserving general audio knowledge while enhancing deepfake detection. To improve generalization, the training data is enriched with synthetic samples from various vocoders and diverse acoustic conditions. Experiments on benchmark datasets demonstrate state-of-the-art results, and an analysis of the classifier highlights key components for robustness.
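A minimal sketch of the frozen-feature-extractor setup, using torchaudio's Wav2Vec2 base bundle as a stand-in for the self-supervised models used in the paper; the pooling strategy and classifier size are assumptions.

```python
# Illustrative frozen self-supervised extractor with a lightweight trainable head.
import torch
import torch.nn as nn
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_BASE
ssl_model = bundle.get_model().eval()   # pre-trained feature extractor, kept frozen
for p in ssl_model.parameters():
    p.requires_grad = False

classifier = nn.Sequential(             # only this small head would be trained
    nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 2)
)

def detect(waveform):
    # waveform: (B, num_samples) audio at bundle.sample_rate (16 kHz).
    with torch.no_grad():
        layers, _ = ssl_model.extract_features(waveform)
    embedding = layers[-1].mean(dim=1)  # average-pool the last layer over time
    return classifier(embedding)        # bona fide vs. fake logits

logits = detect(torch.randn(1, 16000))  # one second of dummy audio
print(logits.shape)  # torch.Size([1, 2])
```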
Full article here
Juan M. Martín-Doñas, Eros Roselló, Angel M. Gomez, Aitor Álvarez, Iván López-Espejo, Antonio M. Peinado. ASASVIcomtech: The Vicomtech-UGR Speech Deepfake Detection and SASV Systems for the ASVspoof5 Challenge
Abstract
This paper describes the ASASVIcomtech team’s contribution to the ASVspoof 5 Challenge, involving researchers from Vicomtech and the University of Granada. The team competed in both Track 1 (speech deepfake detection) and Track 2 (spoofing-aware speaker verification). The work began with a thorough analysis of the challenge datasets to minimize potential training biases—key findings from this analysis are shared. For Track 1, a closed-condition system using a deep complex convolutional recurrent architecture was implemented but did not yield strong results. In contrast, open-condition systems for both tracks, incorporating self-supervised models, data augmentation from past challenges, and novel vocoders, led to highly competitive performance through an ensemble approach.
Full article here
Juan M. Martín-Doñas, Aitor Álvarez. The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge
Abstract
This paper describes our submitted system to the 2023 Audio Deepfake Detection Challenge Track 2. This track focuses on locating the manipulated regions in partially fake audio. Our approach integrates a pre-trained Wav2Vec2-based feature extractor and two different downstream models for deepfake detection and audio clustering. While the detection module is composed of a simple but efficient downstream neural classification model, the clustering-based neural network was trained to first segment the audio and then discriminate between the original regions and the manipulated segments. The final segmentation was obtained by combining the clustering process with the decision score through the application of some post-processing strategies. We evaluate our system on the test set of the challenge track, showing good performance for partially fake detection and location in challenging environments. Our novel, simple and efficient approach ranked fourth in the mentioned challenge among sixteen participants.
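A heavily simplified sketch of the two-head idea, with torchaudio's Wav2Vec2 base bundle standing in for the pre-trained extractor: frame-level features feed an utterance-level detection head and a per-frame head whose scores are smoothed and thresholded to locate manipulated regions. The actual system's clustering module and post-processing are more elaborate than this.

```python
# Illustrative two-head layout: utterance-level detection plus frame-level localization.
import torch
import torch.nn as nn
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_BASE
extractor = bundle.get_model().eval()   # pre-trained Wav2Vec2-style feature extractor

detection_head = nn.Linear(768, 2)      # utterance-level real/fake logits (to be trained)
frame_head = nn.Linear(768, 1)          # per-frame manipulation score (to be trained)

def locate_fake_regions(waveform, threshold=0.5, kernel=5):
    # waveform: (1, num_samples) audio at 16 kHz.
    with torch.no_grad():
        layers, _ = extractor.extract_features(waveform)
    frames = layers[-1]                                           # (1, T, 768), ~50 frames/s
    utt_logits = detection_head(frames.mean(dim=1))               # clip-level decision
    frame_scores = torch.sigmoid(frame_head(frames)).squeeze(-1)  # (1, T)
    # Post-processing stand-in: simple smoothing, then thresholding into regions.
    smoothed = torch.nn.functional.avg_pool1d(
        frame_scores.unsqueeze(1), kernel, stride=1, padding=kernel // 2
    ).squeeze(1)
    fake_mask = smoothed > threshold      # frames flagged as manipulated
    return utt_logits, fake_mask

logits, mask = locate_fake_regions(torch.randn(1, 32000))  # two seconds of dummy audio
print(logits.shape, mask.shape)
```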
Full article here