Model Card for Model ID
This model is fine-tuned for document question answering. It was trained on the yahma/alpaca-cleaned dataset (https://huggingface.co./TheBloke/zephyr-7B-beta-GPTQ).
Model Details
Training hyperparameters
The following hyperparameters were used during training:
- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
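These values correspond to arguments of the Hugging Face `transformers` `TrainingArguments` class, as commonly used in SFT/Unsloth-style fine-tuning scripts. The sketch below shows how such a configuration might be assembled; `output_dir` and `per_device_train_batch_size` are illustrative placeholders that are not stated in this card, and the 8-bit AdamW optimizer requires the `bitsandbytes` package.

```python
import torch
from transformers import TrainingArguments

# Hedged sketch: hyperparameters copied from the list above.
# output_dir and per_device_train_batch_size are assumed placeholders.
training_args = TrainingArguments(
    output_dir="outputs",                      # assumed, not stated in the card
    per_device_train_batch_size=2,             # assumed, not stated in the card
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),   # fall back to fp16 if bf16 is unavailable
    bf16=torch.cuda.is_bf16_supported(),       # prefer bf16 on supported GPUs
    logging_steps=1,
    optim="adamw_8bit",                        # 8-bit AdamW via bitsandbytes
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```

The paired `fp16`/`bf16` flags select bf16 mixed precision when the GPU supports it and fall back to fp16 otherwise.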