---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
language:
- en
tags:
- KALE-LM
- science
- chemistry
pipeline_tag: text-generation
---
# Llama3-KALE-LM-Chem-8B

## Introduction

We are thrilled to present Llama3-KALE-LM-Chem-8B, our first open-source KALE-LM, which specializes in chemistry.

## Training Details

We continued pre-training the base model, meta-llama/Meta-Llama-3-8B-Instruct, on a large amount of data and then post-trained it with supervised fine-tuning.

## Benchmarks

### Open Benchmarks

| Models | ChemBench | MMLU | MMLU-Chem | SciQ | IE(Acc) | IE(LS) |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| GPT-3.5 | 47.15 | 69.75 | 53.32 | 89.60 | 52.98 | 68.28 |
| GPT-4 | 53.72 | 78.67 | 63.70 | 94.10 | 54.20 | 69.74 |
| Llama3-8B-Instruct | 46.02 | 68.30 | 51.10 | 93.30 | 45.83 | 61.22 |
| LlaSMol | 28.47 | 54.47 | 33.24 | 72.30 | 2.16 | 3.23 |
| ChemDFM | 44.44 | 58.11 | 45.60 | 86.70 | 7.61 | 11.49 |
| ChemLLM-7B-Chat | 34.16 | 61.79 | 48.39 | 94.00 | 29.66 | 39.17 |
| ChemLLM-7B-Chat-1.5-SFT | 42.75 | 63.56 | 49.63 | 95.10 | 14.96 | 19.61 |
| Llama3-KALE-LM-Chem-8B | 52.40 | 68.74 | 53.83 | 91.50 | 67.50 | 78.37 |
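To read the table relative to the base model, the gap between Llama3-KALE-LM-Chem-8B and Llama3-8B-Instruct can be computed per benchmark. The sketch below only copies those two rows from the table above; the variable names are illustrative and not part of any released tooling.

```python
# Scores copied from the Open Benchmarks table (two rows only).
benchmarks = ["ChemBench", "MMLU", "MMLU-Chem", "SciQ", "IE(Acc)", "IE(LS)"]
base = [46.02, 68.30, 51.10, 93.30, 45.83, 61.22]  # Llama3-8B-Instruct
kale = [52.40, 68.74, 53.83, 91.50, 67.50, 78.37]  # Llama3-KALE-LM-Chem-8B

# Difference in points (positive means the chemistry model scores higher).
deltas = {name: round(k - b, 2) for name, b, k in zip(benchmarks, base, kale)}

for name, delta in deltas.items():
    print(f"{name:>10}: {delta:+.2f}")
```

On these numbers the model scores higher than its base on every benchmark except SciQ, with the largest gaps on the two IE columns.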

#### ChemBench Details (Evaluated By OpenCompass)

| Models | NC | PP | M2C | C2M | PP | RS | YP | TP | SP | Average |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| GPT-3.5 | 46.93 | 56.98 | 85.28 | 38.25 | 43.67 | 42.33 | 30.33 | 42.57 | 38.00 | 47.15 |
| GPT-4 | 54.82 | 65.02 | 92.64 | 52.88 | 62.67 | 52.67 | 42.33 | 24.75 | 35.67 | 53.72 |
| Llama3-8B-Instruct | 51.31 | 27.79 | 90.30 | 40.88 | 34.00 | 30.00 | 45.33 | 60.89 | 33.67 | 46.02 |
| LlaSMol | 27.78 | 29.34 | 31.44 | 23.38 | 25.67 | 24.00 | 37.33 | 34.65 | 22.67 | 28.47 |
| ChemDFM | 36.92 | 55.57 | 83.95 | 42.00 | 40.00 | 37.33 | 39.00 | 33.17 | 32.00 | 44.44 |
| ChemLLM-7B-Chat | 41.05 | 29.76 | 85.28 | 26.12 | 26.00 | 24.00 | 20.00 | 24.26 | 31.00 | 34.16 |
| ChemLLM-7B-Chat-1.5-SFT | 50.06 | 49.51 | 85.28 | 38.75 | 38.00 | 26.67 | 28.33 | 31.68 | 33.67 | 42.44 |
| Llama3-KALE-LM-Chem-8B | 63.58 | 58.39 | 92.98 | 44.50 | 48.67 | 38.33 | 46.33 | 44.55 | 34.33 | 52.41 |
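The Average column appears to be the unweighted mean of the nine task scores. This is an inference from the numbers, not something the table states. A minimal sketch that checks it for two rows, using values copied from the table above (the function name `chembench_average` is illustrative):

```python
# Per-task ChemBench scores copied from the table above, in column order:
# NC, PP, M2C, C2M, PP, RS, YP, TP, SP.
task_scores = {
    "GPT-4": [54.82, 65.02, 92.64, 52.88, 62.67, 52.67, 42.33, 24.75, 35.67],
    "Llama3-KALE-LM-Chem-8B": [63.58, 58.39, 92.98, 44.50, 48.67, 38.33, 46.33, 44.55, 34.33],
}


def chembench_average(scores):
    """Unweighted mean over the task columns, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


averages = {model: chembench_average(s) for model, s in task_scores.items()}
print(averages)
```

Both results match the Average column for those rows (53.72 and 52.41).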

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-8B"

# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
# Move the inputs to the same device as the model
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
)
# Strip the prompt tokens so only the generated reply remains
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
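For chemistry questions, the same chat-template flow can be wrapped in small helpers. The sketch below is illustrative only: `build_chem_messages` and `answer` are hypothetical names, the default `max_new_tokens=512` is an arbitrary choice, and `model` and `tokenizer` are assumed to be the objects loaded in the Quick Start.

```python
def build_chem_messages(question, system_prompt="You are a helpful assistant."):
    """Build the message list passed to tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def answer(model, tokenizer, question, max_new_tokens=512):
    """Generate one reply; `model` and `tokenizer` come from the Quick Start."""
    text = tokenizer.apply_chat_template(
        build_chem_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated reply is decoded.
    new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Building the messages needs no model; calling answer() does.
messages = build_chem_messages("What is the molecular formula of benzene?")
print(messages[1]["content"])
```

With the model loaded, `answer(model, tokenizer, "What is the molecular formula of benzene?")` would return the decoded reply as a string.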

## Cite This Work

```
@article{dai2024kale,
  title={KALE-LM: Unleash The Power Of AI For Science Via Knowledge And Logic Enhanced Large Model},
  author={Dai, Weichen and Chen, Yezeng and Dai, Zijie and Huang, Zhijie and Liu, Yubo and Pan, Yixuan and Song, Baiyang and Zhong, Chengli and Li, Xinhe and Wang, Zeyu and others},
  journal={arXiv preprint arXiv:2409.18695},
  year={2024}
}
```