
base

This model is a fine-tuned version of microsoft/resnet-50 on the '140k Real and Fake Faces' dataset from Kaggle: https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces

Model description

The model described in the thesis is a preliminary version, fine-tuned on a subset of 16,000 images. This fine-tuning serves as an early exploration phase, allowing for initial performance evaluation and insights into the model's behavior. The model is not yet in its final form, with further iterations planned for fine-tuning on a larger dataset to enhance its accuracy and generalization capabilities. This version acts as a stepping stone towards the final deepfake detection solution.

Intended uses & limitations

Intended Uses:

  • Deepfake Detection: The model is designed primarily for detecting manipulated or deepfake images, specifically within the context of human facial images. Its purpose is to distinguish between real and fake images using advanced computer vision techniques.
  • Academic Research: As part of a thesis project, the model serves as a foundation for further research in AI-driven image classification, contributing to the development of robust techniques in deepfake detection.
  • Facial Image Classification: While deepfake detection is the primary use case, the model may be extended to other facial image classification tasks in future iterations.

Limitations:

  • Limited Dataset: This version has been fine-tuned on only 16,000 images, which may limit its accuracy and generalization to unseen data, especially in diverse or complex scenarios.
  • Not the Final Version: The model is still under development, and its current state does not represent the final performance level. Further fine-tuning on a larger dataset is necessary for improved results.
  • Potential Overfitting: With a relatively small training set, the model may be prone to overfitting, reducing its effectiveness on more extensive or varied datasets.
  • Restricted to Facial Images: The model has been specifically trained on facial images, and its performance outside this domain, such as in other image types or object detection tasks, is untested and likely suboptimal.

Training procedure

Dataset Preparation:

  • Data Collection: The model was fine-tuned using the given dataset, which includes both real and deepfake images.
  • Data Splitting: The dataset was divided into training and validation sets. In this preliminary stage, no separate test set was used.
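A seeded split of this kind can be sketched in plain Python; the 80/20 ratio and the file names below are illustrative assumptions, as the card does not state the actual split:

```python
import random

def train_val_split(items, val_fraction=0.2, seed=42):
    """Shuffle deterministically and split into train/validation lists."""
    rng = random.Random(seed)   # fixed seed for a reproducible split
    shuffled = items[:]         # copy so the input list is not mutated
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

images = [f"face_{i:05d}.jpg" for i in range(16000)]  # hypothetical file names
train_set, val_set = train_val_split(images)
print(len(train_set), len(val_set))  # 12800 3200
```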

Preprocessing:

  • Image Resizing: Images were resized to a uniform dimension suitable for the model.
  • Normalization: Pixel values were normalized to the range [0, 1] to standardize input data.
  • Data Augmentation: Basic data augmentation techniques, such as random cropping and horizontal flipping, were applied to improve model robustness and prevent overfitting.
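The normalization and flipping steps are simple to state exactly; a minimal pure-Python sketch follows (a real pipeline would typically apply torchvision-style transforms to tensors rather than nested lists):

```python
import random

def normalize(pixels):
    """Scale 8-bit pixel values from [0, 255] into [0.0, 1.0]."""
    return [[p / 255.0 for p in row] for row in pixels]

def random_hflip(pixels, p=0.5, rng=random):
    """Mirror each row left-to-right with probability p (augmentation)."""
    if rng.random() < p:
        return [row[::-1] for row in pixels]
    return pixels

img = [[0, 128, 255],
       [255, 128, 0]]
print(normalize(img))        # 0 -> 0.0, 128 -> ~0.502, 255 -> 1.0
print(random_hflip(img, p=1.0))  # rows reversed left-to-right
```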

Model Configuration:

  • Architecture: The model's architecture was selected based on its suitability for facial image classification and deepfake detection tasks.
  • Hyperparameters: Key hyperparameters, including learning rate, batch size, and number of epochs, were configured to balance training time and model performance.

Fine-Tuning:

  • Training: The model was fine-tuned on the 16,000 images using a supervised learning approach. The training process involved adjusting weights based on the cross-entropy loss function to minimize classification errors.
  • Validation: During training, the model's performance was periodically evaluated on the validation set to monitor accuracy, precision, recall, and F1 score.
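For a two-class (real vs. fake) softmax output, the cross-entropy loss mentioned above reduces to the negative log-probability assigned to the true label; a minimal sketch of that computation:

```python
import math

def softmax(logits):
    """Convert raw model outputs (logits) to probabilities."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_label):
    """Negative log-probability of the correct class."""
    probs = softmax(logits)
    return -math.log(probs[true_label])

# A confident correct prediction gives a small loss;
# the same confident output scored against the wrong label gives a large one.
print(cross_entropy([4.0, 0.0], 0))  # ~0.018
print(cross_entropy([4.0, 0.0], 1))  # ~4.018
```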

Evaluation:

  • Metrics: Model performance was assessed using accuracy, precision, recall, and F1 score metrics.
  • Error Analysis: Misclassifications were analyzed to identify areas for improvement and guide future adjustments.
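For a binary real-vs-fake labeling, all four metrics follow from the true/false positive and negative counts; a self-contained sketch (treating label 1 as "fake" is an arbitrary choice for illustration):

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from parallel label lists."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

m = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(m)  # accuracy 0.6; precision, recall, and f1 all ~0.667
```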

Checkpointing and Saving:

  • Model Checkpoints: Intermediate model checkpoints were saved to enable recovery and further fine-tuning.
  • Final Model: The fine-tuned model was saved for subsequent evaluation and analysis.

Future Work:

  • Dataset Expansion: Further fine-tuning will be conducted with a larger and more diverse dataset to enhance model performance and generalization.
  • Performance Optimization: Additional training and optimization steps will be implemented to improve accuracy and address any identified limitations.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 50
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
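With lr_scheduler_type linear and no warmup steps listed, the learning rate decays linearly from its initial value to zero over the training steps (Transformers' linear schedule additionally supports a warmup phase). A sketch of that decay using the values above; the step count derived from all 16,000 images is illustrative, since the actual number depends on the train/validation split:

```python
def linear_lr(step, num_training_steps, base_lr=5e-05):
    """Linearly decay the learning rate from base_lr to 0,
    as a linear scheduler with no warmup does."""
    remaining = max(0.0, 1.0 - step / num_training_steps)
    return base_lr * remaining

# Illustrative only: steps per epoch = train_size // train_batch_size.
total_steps = 8 * (16000 // 50)   # 8 epochs, batch size 50
print(linear_lr(0, total_steps))                 # 5e-05 at the start
print(linear_lr(total_steps // 2, total_steps))  # 2.5e-05 halfway
print(linear_lr(total_steps, total_steps))       # 0.0 at the end
```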

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.0+cu121
  • Tokenizers 0.19.1
Model size: 23.6M params (Safetensors, F32)

Model tree for 1ancelot/base_rn: fine-tuned from microsoft/resnet-50.
