Safetensors · mistral
h-j-han committed
Commit 191b5d9 · 1 Parent(s): ed2e3e6

Fix new line issue & Match vocab type to base model

README.md CHANGED
@@ -14,7 +14,6 @@ base_model:
 VocADT is a solution for vocabulary adaptation using adapter modules that are trained to learn the optimal linear combination of existing embeddings while keeping the model’s weights fixed.
 VocADT offers a flexible and scalable solution without requiring external resources or language constraints.
 
-
 ## New Vocabulary Adapted Models
 Only the input/output embeddings are replaced, while all other original weights of the base model remain fixed.
 These are the merged versions: after training the adapters, we merge the original embeddings with the adapter to generate the new embeddings.
@@ -29,10 +28,10 @@ These are the merged versions: after training the adapters, we merge the original
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-# model_name = "mistralai/Mistral-7B-v0.1 # Base Model
+# model_name = "mistralai/Mistral-7B-v0.1" # Base Model
 model_name = "h-j-han/Mistral-7B-VocADT-50k-Latin" # Vocabulary Adapted Model
 tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
 
 prefix = "\nEnglish: Hello!\nSwahili: Habari!\nEnglish: What's your name?\nSwahili: Jina lako ni nani?\nEnglish: "
 line = "My name is Amani."
@@ -40,6 +39,8 @@ suffix = f"\nSwahili:"
 prompt = prefix + line + suffix
 
 inputs = tokenizer(prompt, return_tensors="pt")
+for item in inputs:
+    inputs[item] = inputs[item].cuda()
 outputs = model.generate(**inputs, max_new_tokens=5)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
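The merge the README describes is easy to picture in code. Below is a minimal sketch of the linear-combination idea, not the repo's actual training code: the adapter matrix `A` and its random initialization here are hypothetical, and the dimensions are toy-sized so the sketch runs anywhere.

```python
import torch

# Toy dimensions so the sketch runs anywhere; in the real model the base
# vocabulary is 32000, the adapted vocabulary 50000, and hidden size 4096.
old_vocab, new_vocab, hidden = 320, 500, 64

E_old = torch.randn(old_vocab, hidden)  # frozen base-model embeddings
A = torch.randn(new_vocab, old_vocab)   # hypothetical trained adapter weights

# Each new embedding is a linear combination of the original ones. Merging
# the adapter bakes this into an ordinary embedding matrix, so the released
# checkpoint loads like any other causal LM, with no adapter code needed.
E_new = A @ E_old
assert E_new.shape == (new_vocab, hidden)
```

On the usage snippet itself: the added loop that calls `.cuda()` on each tensor keeps the inputs on the same device as the model; on a single GPU, calling `inputs.to(model.device)` on the tokenizer's `BatchEncoding` would have the same effect.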
 
config.json CHANGED
@@ -21,5 +21,5 @@
   "torch_dtype": "bfloat16",
   "transformers_version": "4.43.0.dev0",
   "use_cache": true,
-  "vocab_size": 50302
+  "vocab_size": 50000
 }
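The `vocab_size` change above (50302 to 50000) aligns the config with the 50k adapted vocabulary. A quick sanity check, sketched here against the published checkpoint, is to compare the tokenizer's length with the config:

```python
from transformers import AutoConfig, AutoTokenizer

model_name = "h-j-han/Mistral-7B-VocADT-50k-Latin"
config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# After this commit both values should report the 50k adapted vocabulary;
# a mismatch would desync token ids from embedding rows.
print(config.vocab_size)  # 50000
print(len(tokenizer))     # expected to match config.vocab_size
```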
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:56cd7b2917c67dbe374fffd75e4fdc234d4f0aacf3a7901bb34f06443ab09bd3
-size 4975651696
+oid sha256:98489382fe32a3163ae7d60e2b6d6705ed9854a563b78ed9f97289923b1b0f6b
+size 4973177712
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e09d26e0e83a6c87af03adb7faffd91cd4ae25337f9cac34ef3ea40838ae46d4
-size 4891790120
+oid sha256:94a489b7f407e9aabeb6cfddce9b002fea96ee02d5263d26237322b33d210997
+size 4889316136
model.safetensors.index.json CHANGED
@@ -1,6 +1,6 @@
 {
   "metadata": {
-    "total_size": 14783324160
+    "total_size": 14778376192
   },
   "weight_map": {
     "lm_head.weight": "model-00003-of-00003.safetensors",
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff