mlabonne committed
Commit 425b9bf
1 Parent(s): 420fcbc

Update README.md

Files changed (1)
  1. README.md +43 -13
README.md CHANGED
@@ -1,27 +1,33 @@
 ---
-base_model:
-- Qwen/Qwen2.5-32B-Instruct
+license: other
+license_name: tongyi-qianwen
+license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
+language:
+- en
+pipeline_tag: text-generation
 library_name: transformers
 tags:
 - mergekit
 - merge
-
+- lazymergekit
+base_model:
+- Qwen/Qwen2.5-32B-Instruct
 ---
-# merge
+# BigQwen2.5-52B-Instruct
+
+![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/98GiKtmH1AtHHbIbOUH4Y.jpeg)
 
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+BigQwen2.5-52B-Instruct is a [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) self-merge made with [MergeKit](https://github.com/arcee-ai/mergekit/tree/main).
 
-## Merge Details
-### Merge Method
+It applies the [mlabonne/Meta-Llama-3-120B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct/) recipe.
 
-This model was merged using the passthrough merge method.
+I made it due to popular demand, but I haven't tested it, so use it at your own risk. ¯\\\_(ツ)_/¯
 
-### Models Merged
+## 🔍 Applications
 
-The following models were included in the merge:
-* [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct)
+It might be good for creative writing tasks. I recommend a context length of 32k, but you can go up to 131,072 tokens in theory.
 
-### Configuration
+## 🧩 Configuration
 
 The following YAML configuration was used to produce this model:
 
@@ -50,5 +56,29 @@ slices:
     model: Qwen/Qwen2.5-32B-Instruct
 merge_method: passthrough
 dtype: bfloat16
-
 ```
+
+## 💻 Usage
+
+```python
+!pip install -qU transformers accelerate
+
+from transformers import AutoTokenizer
+import transformers
+import torch
+
+model = "mlabonne/BigQwen2.5-52B-Instruct"
+messages = [{"role": "user", "content": "What is a large language model?"}]
+
+tokenizer = AutoTokenizer.from_pretrained(model)
+prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    torch_dtype=torch.float16,
+    device_map="auto",
+)
+
+outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+print(outputs[0]["generated_text"])
+```
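
The body of the YAML block (lines 34-55 of the new README) is unchanged context, so the diff above only shows its tail. As a rough sketch of what a mergekit passthrough self-merge in the style of the Meta-Llama-3-120B-Instruct recipe looks like: overlapping layer ranges from a single base model are stacked to deepen it. The `layer_range` values below are illustrative placeholders, not the actual slices of BigQwen2.5-52B-Instruct:

```yaml
# Hypothetical passthrough self-merge config (placeholder layer ranges).
# Overlapping slices of the same base model are concatenated in order.
slices:
- sources:
  - model: Qwen/Qwen2.5-32B-Instruct
    layer_range: [0, 16]
- sources:
  - model: Qwen/Qwen2.5-32B-Instruct
    layer_range: [8, 24]
- sources:
  - model: Qwen/Qwen2.5-32B-Instruct
    layer_range: [16, 32]
- sources:
  - model: Qwen/Qwen2.5-32B-Instruct
    layer_range: [24, 40]
merge_method: passthrough
dtype: bfloat16
```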