binh230/mambaformer

	---
	license: apache-2.0
	base_model: OuteAI/Lite-Oute-2-Mamba2Attn-Base
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	model-index:
	- name: mambaformer
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/truonggiabjnh2003-fpt-university/Detect%20AI%20Generated%20Text/runs/vbdymxf4)
	# mambaformer

	This model is a fine-tuned version of [OuteAI/Lite-Oute-2-Mamba2Attn-Base](https://huggingface.co/OuteAI/Lite-Oute-2-Mamba2Attn-Base) on an unspecified dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.1639
	- Accuracy: 0.9607
	- Precision: 0.9628
	- Recall: 0.9607
	- F1: 0.9613
	- Auroc: 0.9925
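Recall equals accuracy in the results above (and in every logged evaluation row), which is what support-weighted averaging produces. The card does not state the averaging mode, so treating these as weighted averages is an assumption. A minimal sketch of why the two coincide, using hypothetical toy labels rather than the model's evaluation data:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        true_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t == label and p == label
        )
        # weight (n / total) times recall (true_pos / n): the n cancels,
        # so the sum over classes is (all correct predictions) / total.
        score += (n / total) * (true_pos / n)
    return score


# Hypothetical toy labels (NOT taken from this model's evaluation set).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

print(accuracy(y_true, y_pred), weighted_recall(y_true, y_pred))
```

Because the class support cancels in each term, weighted recall always reduces to accuracy, so the Recall column carries no information beyond the Accuracy column under this assumption.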

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 32
	- eval_batch_size: 4
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 1
	- label_smoothing_factor: 0.03
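The relationship between the batch-size settings and the linear schedule listed above can be sketched as follows. This is an illustration under stated assumptions, not the training script: a single device is assumed (consistent with 32 × 4 = 128), no warmup is assumed because none is listed, and `TOTAL_STEPS` is an estimate derived from the training log (step 1280 at epoch 0.9882). The helper names are hypothetical.

```python
def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: roughly 1280 / 0.9882 optimizer steps in the single epoch.
TOTAL_STEPS = 1295

print(effective_batch_size(32, 4))  # train_batch_size * gradient_accumulation_steps
for step in (0, 640, 1280):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 5e-05 and falls to zero by the end of the epoch, so the last logged evaluations were taken at a small fraction of the initial rate.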

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     | Auroc  |
	|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|:------:|
	| 0.8973        | 0.0988 | 128  | 0.6661          | 0.6897   | 0.6807    | 0.6897 | 0.6850 | 0.5552 |
	| 0.5525        | 0.1976 | 256  | 0.4682          | 0.7898   | 0.7526    | 0.7898 | 0.7413 | 0.7643 |
	| 0.4086        | 0.2965 | 384  | 0.3500          | 0.8523   | 0.8452    | 0.8523 | 0.8472 | 0.9024 |
	| 0.3067        | 0.3953 | 512  | 0.2573          | 0.9107   | 0.9085    | 0.9107 | 0.9091 | 0.9620 |
	| 0.2477        | 0.4941 | 640  | 0.2234          | 0.9309   | 0.9298    | 0.9309 | 0.9288 | 0.9761 |
	| 0.2283        | 0.5929 | 768  | 0.2074          | 0.9404   | 0.9396    | 0.9404 | 0.9398 | 0.9804 |
	| 0.2035        | 0.6918 | 896  | 0.1875          | 0.9529   | 0.9530    | 0.9529 | 0.9530 | 0.9853 |
	| 0.1963        | 0.7906 | 1024 | 0.1809          | 0.9464   | 0.9458    | 0.9464 | 0.9460 | 0.9867 |
	| 0.1798        | 0.8894 | 1152 | 0.1638          | 0.9601   | 0.9610    | 0.9601 | 0.9604 | 0.9900 |
	| 0.1749        | 0.9882 | 1280 | 0.1652          | 0.9583   | 0.9579    | 0.9583 | 0.9581 | 0.9894 |


	### Framework versions

	- Transformers 4.43.0.dev0
	- Pytorch 2.4.0+cu124
	- Datasets 2.19.1
	- Tokenizers 0.19.1