README.md · QuantFactory/Einstein-v7-Qwen2-7B-GGUF at main

munish0838

Update README.md

7d0d96a verified 7 months ago

	---
	language:
	- en
	license: other
	tags:
	- axolotl
	- instruct
	- finetune
	- chatml
	- gpt4
	- synthetic data
	- science
	- physics
	- chemistry
	- biology
	- math
	- qwen
	- qwen2
	base_model: Weyaxi/Einstein-v7-Qwen2-7B
	datasets:
	- allenai/ai2_arc
	- camel-ai/physics
	- camel-ai/chemistry
	- camel-ai/biology
	- camel-ai/math
	- metaeval/reclor
	- openbookqa
	- mandyyyyii/scibench
	- derek-thomas/ScienceQA
	- TIGER-Lab/ScienceEval
	- jondurbin/airoboros-3.2
	- LDJnr/Capybara
	- Cot-Alpaca-GPT4-From-OpenHermes-2.5
	- STEM-AI-mtl/Electrical-engineering
	- knowrohit07/saraswati-stem
	- sablo/oasst2_curated
	- lmsys/lmsys-chat-1m
	- TIGER-Lab/MathInstruct
	- bigbio/med_qa
	- meta-math/MetaMathQA-40K
	- openbookqa
	- piqa
	- metaeval/reclor
	- derek-thomas/ScienceQA
	- scibench
	- sciq
	- Open-Orca/SlimOrca
	- migtissera/Synthia-v1.3
	- TIGER-Lab/ScienceEval
	- allenai/WildChat
	- microsoft/orca-math-word-problems-200k
	- openchat/openchat_sharegpt4_dataset
	- teknium/GPTeacher-General-Instruct
	- m-a-p/CodeFeedback-Filtered-Instruction
	- totally-not-an-llm/EverythingLM-data-V3
	- HuggingFaceH4/no_robots
	- OpenAssistant/oasst_top1_2023-08-25
	- WizardLM/WizardLM_evol_instruct_70k
	- abacusai/SystemChat-1.1
	- H-D-T/Buzz-V1.2
	pipeline_tag: text-generation
	---

	# 🔬 Einstein-v7-Qwen2-7B-GGUF
	This is a quantized version of [Weyaxi/Einstein-v7-Qwen2-7B](https://huggingface.co./Weyaxi/Einstein-v7-Qwen2-7B), created using llama.cpp.
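Because the files in this repository are GGUF quantizations, they are meant for llama.cpp-based runtimes rather than for `transformers`. The following is a minimal sketch of one way to query such a file, assuming the third-party `llama-cpp-python` bindings are installed; the helper names and the `.gguf` file name in the comment are hypothetical and not taken from this repository.

```python
def build_messages(system: str, user: str) -> list[dict]:
    """Return OpenAI-style chat messages: one system turn and one user turn."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


def chat_with_gguf(model_path: str, messages: list[dict], max_tokens: int = 256) -> str:
    """Run a single chat completion against a local GGUF file."""
    # Imported lazily so build_messages() works without llama-cpp-python installed.
    from llama_cpp import Llama

    # chat_format="chatml" asks the bindings to render the messages as ChatML.
    # n_ctx=8192 mirrors the training sequence length; lower it if memory is tight.
    llm = Llama(model_path=model_path, n_ctx=8192, chat_format="chatml")
    result = llm.create_chat_completion(messages=messages, max_tokens=max_tokens)
    return result["choices"][0]["message"]["content"]


# Example call (hypothetical file name; substitute the quantization you downloaded):
# reply = chat_with_gguf(
#     "Einstein-v7-Qwen2-7B.Q4_K_M.gguf",
#     build_messages("You are a helpful AI assistant.", "Hello!"),
# )
```

The model is loaded inside the function to keep the sketch short; a real application would likely create the `Llama` object once and reuse it across requests.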

	# Model Description

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/KLQP1jK-DIzpwHzYRIH-Q.png)

	This model is a fully fine-tuned version of [Qwen/Qwen2-7B](https://huggingface.co./Qwen/Qwen2-7B), trained on diverse datasets.

	It was fine-tuned on `8xMI300X` using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).

	<details><summary>See axolotl config</summary>

	axolotl version: `0.4.0`
	```yaml
	base_model: Qwen/Qwen2-7B
	model_type: AutoModelForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: false
	load_in_4bit: false
	strict: false

	chat_template: chatml
	datasets:
	  - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/buzz_unstacked_chosen_math_removed_filtered.json
	    ds_type: json
	    type: alpaca
	    conversation: chatml

	  - path: data/capybara_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/everythinglm-data-v3_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/gpt4_data_lmys_1m_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/gpteacher-instruct-special-alpaca.json
	    ds_type: json
	    type: gpteacher
	    conversation: chatml

	  - path: data/merged_all.json
	    ds_type: json
	    type: alpaca
	    conversation: chatml

	  - path: data/no_robots_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/oasst_top1_from_fusechatmixture_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/pippa_bagel_repo_3k_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/rpguild_quarter_alignment_lab_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/sharegpt_gpt4_english.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/slimorca_dedup_filtered_95k_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/soda_diaolog_longest_tenth_buzz_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/synthia-v1.3_sharegpt_12500.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/system_conversations_dolphin_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	dataset_prepared_path: last_run_prepared
	val_set_size: 0.002

	output_dir: ./Einstein-v7-Qwen2-7B-model

	sequence_len: 8192
	sample_packing: true
	pad_to_sequence_len: true
	eval_sample_packing: false

	wandb_project: Einstein
	wandb_entity:
	wandb_watch:
	wandb_name:
	wandb_log_model:
	hub_model_id: Weyaxi/Einstein-v7-Qwen2-7B

	gradient_accumulation_steps: 4
	micro_batch_size: 6
	num_epochs: 2
	optimizer: paged_adamw_8bit
	lr_scheduler: cosine
	learning_rate: 0.00001 # look

	train_on_inputs: false
	group_by_length: false
	bf16: auto
	fp16:
	tf32: false

	gradient_checkpointing: unsloth
	gradient_checkpointing_kwargs:
	  use_reentrant: true # look
	early_stopping_patience:
	resume_from_checkpoint:
	local_rank:
	logging_steps: 1
	xformers_attention:
	flash_attention: true

	warmup_steps: 10
	evals_per_epoch: 2
	eval_table_size:
	eval_max_new_tokens: 128
	saves_per_epoch: 1
	debug:

	deepspeed: deepspeed_configs/zero3_bf16.json
	weight_decay: 0.05
	fsdp:
	fsdp_config:
	special_tokens:
	  eos_token: "<|im_end|>"
	  pad_token: "<|end_of_text|>"
	tokens:
	  - "<|im_start|>"
	  - "<|im_end|>"
	```

	</details><br>
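To make the learning-rate settings in the config concrete, here is a small sketch of the schedule they would produce. It assumes that `lr_scheduler: cosine` corresponds to the usual linear-warmup plus half-cosine decay (the shape implemented by `get_cosine_schedule_with_warmup` in Transformers); that mapping is an assumption, not something this model card confirms. The total of 500 steps comes from the training notes further down, and `lr_at_step` is an illustrative name.

```python
import math


def lr_at_step(step, base_lr=1e-5, warmup_steps=10, total_steps=500):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 5, 10, 255, 500):
    print(f"step {step:>3}: lr = {lr_at_step(step):.2e}")
```

Under these assumptions the rate peaks at `1e-5` once warmup ends at step 10, is roughly half of that midway through the decay, and reaches zero at step 500.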

	# 💬 Prompt Template

	You can use the ChatML prompt template with this model:

	### ChatML

	```
	<|im_start|>system
	{system}<|im_end|>
	<|im_start|>user
	{user}<|im_end|>
	<|im_start|>assistant
	{assistant}<|im_end|>
	```
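As a rough illustration, the template above can also be rendered by hand when a runtime does not apply it automatically. The sketch below is a plain-Python approximation of the ChatML layout; `to_chatml` is a hypothetical helper, and a tokenizer's built-in chat template should be preferred whenever one is available.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml(
    [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Hello!"},
    ]
)
print(prompt)
```

Generation should stop at `<|im_end|>`, which the training config sets as the end-of-sequence token.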

	This prompt template is available as a [chat template](https://huggingface.co./docs/transformers/main/chat_templating), which means you can format messages using the
	`tokenizer.apply_chat_template()` method:

	```python
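	from transformers import AutoModelForCausalLM, AutoTokenizer
	# Load the original (unquantized) model; the GGUF files in this repo target llama.cpp.
	tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
	model = AutoModelForCausalLM.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
	messages = [
	    {"role": "system", "content": "You are a helpful AI assistant."},
	    {"role": "user", "content": "Hello!"},
	]
	# return_dict=True yields input_ids and attention_mask, so ** unpacking works.
	gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
	model.generate(**gen_input)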
	```

	# 📊 Datasets used in this model

	The datasets used to train this model are listed in the metadata section of the model card.

	Please note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria.

	The results of this filtering process are in a different repository:

	[Weyaxi/sci-datasets/main](https://huggingface.co./datasets/Weyaxi/sci-datasets/tree/main)


	# 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard)

	# 🤖 Additional information about training

	This model was fully fine-tuned for 2 epochs.

	Total number of steps was 500.
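The step count can be related to the batch settings in the axolotl config with some quick arithmetic. This sketch assumes one data-parallel worker per GPU on the `8xMI300X` node, which the model card does not state explicitly. The token figures are upper bounds: sequences are packed and padded to `sequence_len`, and with `train_on_inputs: false` only response tokens contribute to the loss.

```python
MICRO_BATCH_SIZE = 6     # micro_batch_size in the config
GRAD_ACCUM_STEPS = 4     # gradient_accumulation_steps in the config
NUM_GPUS = 8             # from "8xMI300X"; assumes one worker per GPU
SEQUENCE_LEN = 8192      # sequence_len in the config
TOTAL_STEPS = 500        # total optimizer steps reported for the run

# Packed sequences consumed by one optimizer step across all GPUs.
sequences_per_step = MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_GPUS

# Upper bound on token positions per step and over the full run.
token_capacity_per_step = sequences_per_step * SEQUENCE_LEN
token_capacity_total = token_capacity_per_step * TOTAL_STEPS

print(f"sequences per step:      {sequences_per_step}")
print(f"token capacity per step: {token_capacity_per_step:,}")
print(f"token capacity in total: {token_capacity_total:,}")
```

Under these assumptions each optimizer step covers 192 packed sequences, or about 1.57 million token positions, and the two epochs together span at most roughly 786 million token positions.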

	<details><summary>Loss graph</summary>

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/bkJGgh_JUfKeRlTLo_ZcB.png)

	</details><br>