chameleon-lizard/Qwen-2.5-7B-DTF

Text Generation


chameleon-lizard committed 16 days ago

Commit 7e48234 · verified · 1 Parent(s): 77c7453

Added README

Files changed (1)

README.md +34 -3


@@ -1,3 +1,34 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- SubMaroon/DTF_Comments_Responses_Counts
+language:
+- ru
+base_model:
+- unsloth/Qwen2.5-7B
+pipeline_tag: text-generation
+---
+A continually pretrained version of the unsloth/Qwen2.5-7B model, trained with Unsloth's low-rank adaptation (LoRA) on a dataset of [DTF](https://dtf.ru) posts.
+For pretraining, posts from [SubMaroon/DTF_Comments_Responses_Counts](https://huggingface.co/datasets/SubMaroon/DTF_Comments_Responses_Counts) were selected, deduplicated with a simple `df.unique`, and filtered to lengths of 1000 < x < 128000 tokens.
+Hyperparameters:
+```
+num_train_epochs=2
+train_batch_size=8
+gradient_accumulation_steps=16
+gradient_checkpointing=False
+optim="adamw_8bit"
+weight_decay=4e-2
+bf16=True
+learning_rate=5e-5
+lr_scheduler_type="cosine"
+packing=True
+seed=42
+```
+[Wandb](https://wandb.ai/a_okshus/DTF_comments/runs/fr5hfq6g?nw=nwusera_okshus)
+[GitHub: TODO]()
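
The data selection described in the README (exact deduplication followed by a token-length filter of 1000 < x < 128000) could look roughly like the sketch below. This is not the author's code: the training script is not linked yet, so `select_posts` and `whitespace_token_count` are hypothetical names, and the whitespace-based counter is only a stand-in for whatever tokenizer was actually used (presumably the Qwen2.5 one). Plain Python is used instead of a dataframe library to keep the example dependency-free.

```python
from typing import Callable, Iterable, List

MIN_TOKENS = 1000    # lower bound from the README (exclusive)
MAX_TOKENS = 128000  # upper bound from the README (exclusive)


def whitespace_token_count(text: str) -> int:
    """Stand-in token counter (assumption): counts whitespace-separated words.

    The real run presumably counted tokens with the model tokenizer.
    """
    return len(text.split())


def select_posts(
    posts: Iterable[str],
    count_tokens: Callable[[str], int] = whitespace_token_count,
    min_tokens: int = MIN_TOKENS,
    max_tokens: int = MAX_TOKENS,
) -> List[str]:
    """Drop exact duplicates (keeping first occurrences), then keep only posts
    whose token count lies strictly between min_tokens and max_tokens."""
    unique_posts = list(dict.fromkeys(posts))  # order-preserving exact dedup
    return [p for p in unique_posts if min_tokens < count_tokens(p) < max_tokens]
```

Passing a real tokenizer is a matter of supplying a different `count_tokens` callable, for example one that returns the length of the tokenizer's encoded ID list. Both bounds are strict here, matching the `1000 < x < 128000` wording.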