dittops committed on
Commit f55a80b · 1 Parent(s): 473233d

Update README.md

Files changed (1)
  1. README.md +66 -37
README.md CHANGED
@@ -1,57 +1,86 @@
  ---
- license: other
- base_model: microsoft/phi-1_5
  tags:
- - llama-factory
- - generated_from_trainer
- model-index:
- - name: path_to_sft_checkpoint
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # path_to_sft_checkpoint

- This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the oss-evol dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 6
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - total_train_batch_size: 48
- - total_eval_batch_size: 64
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - num_epochs: 3.0

- ### Training results

- ### Framework versions

- - Transformers 4.34.1
- - Pytorch 2.1.1+cu121
- - Datasets 2.14.7
- - Tokenizers 0.14.1
 
  ---
+ library_name: transformers
  tags:
+ - code
+
  ---

+ # Bud Code Millenials 1B
+
+ Welcome to our code model repository! This model is fine-tuned specifically for code generation. The Bud Millenials Code Gen open-source models are currently state of the art (SOTA) for code generation, beating all existing models of all sizes. We have achieved a HumanEval score of 80.48 @ Pass 1, beating proprietary models such as Gemini Ultra, Claude, and GPT-3.5 by a large margin, and on par with GPT-4 (HumanEval ~82; ref. WizardCoder). Our proprietary model (Bud Code Jr) beats GPT-4 as well, with a HumanEval score of 88.2 and a context size of 168K. We will be releasing an API for researchers, enterprises, and potential partners by the end of January 2024; if interested, please reach out to [email protected]
+
+ ### News 🔥🔥🔥
+
+ - [2024/01/03] We released **Code Millenials 34B**, which achieves **80.48 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
+ - [2024/01/02] We released **Code Millenials 13B**, which achieves **76.21 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
+
+
+ ### HumanEval
+
+ <p align="center" width="100%">
+ <a ><img src="https://raw.githubusercontent.com/BudEcosystem/code-millenials/main/assets/result.png" alt="CodeMillenials" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
+ </p>
+
+ For the Millenials models, the results above were produced with the eval script in the GitHub repo.
+
+ Note: The HumanEval values for other models are taken from the official repos of [WizardCoder](https://github.com/nlpxucan/WizardLM), [DeepseekCoder](https://github.com/deepseek-ai/deepseek-coder), [Gemini](https://deepmind.google/technologies/gemini/#capabilities), etc.
+
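+ For reference, pass@1 is typically reported either as plain accuracy over one (greedy) sample per problem or via the unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021). The sketch below only illustrates that formula; it is not the repo's eval script, and the helper name and sample counts are hypothetical.
+
+ ```python
+ import numpy as np
+
+ def pass_at_k(n: int, c: int, k: int) -> float:
+     """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k), computed stably."""
+     if n - c < k:
+         return 1.0
+     return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))
+
+ # Hypothetical example: 200 completions per problem, 164 of them pass the unit tests
+ print(pass_at_k(n=200, c=164, k=1))  # 0.82
+ ```
+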
+ ### Models
+
+ | Model | Checkpoint | HumanEval (+) | MBPP (+) |
+ |---------|-------------|---------------|----------|
+ | Code Millenials 34B | <a href="https://huggingface.co/budecosystem/code-millenials-34b" target="_blank">HF Link</a> | 80.48 (75) | 74.68 (62.9) |
+ | Code Millenials 13B | <a href="https://huggingface.co/budecosystem/code-millenials-13b" target="_blank">HF Link</a> | 76.21 (69.5) | 70.17 (57.6) |
+ | Code Millenials 3B | <a href="https://huggingface.co/budecosystem/code-millenials-3b" target="_blank">HF Link</a> | - | - |
+ | Code Millenials 1B | <a href="https://huggingface.co/budecosystem/code-millenials-1b" target="_blank">HF Link</a> | - | - |
+
+ ### 🚀 Quick Start
+
+ Inference code using the pre-trained model from the Hugging Face model hub:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load the tokenizer and model from the Hugging Face Hub
+ tokenizer = AutoTokenizer.from_pretrained("budecosystem/code-millenials-1b")
+ model = AutoModelForCausalLM.from_pretrained("budecosystem/code-millenials-1b")
+
+ # Prompt template the model was fine-tuned with
+ template = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+ ### Instruction: {instruction} ### Response:"""
+
+ instruction = "<Your code instruction here>"
+
+ prompt = template.format(instruction=instruction)
+
+ # Tokenize the prompt and generate a completion
+ inputs = tokenizer(prompt, return_tensors="pt")
+ sample = model.generate(**inputs, max_length=128)
+ print(tokenizer.decode(sample[0]))
+ ```
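+
+ The snippet above decodes greedily and prints the prompt back along with the completion. If you want sampled completions and only the generated response, a minimal variant is sketched below; it reuses `model`, `tokenizer`, and `inputs` from the snippet above, and the decoding values are illustrative assumptions rather than settings published with this model.
+
+ ```python
+ # Illustrative sampling settings (not from the model card); reuses the objects
+ # defined in the Quick Start snippet above.
+ output = model.generate(
+     **inputs,
+     max_new_tokens=256,
+     do_sample=True,
+     temperature=0.2,
+     top_p=0.95,
+     pad_token_id=tokenizer.eos_token_id,
+ )
+
+ # Drop the prompt tokens and decode only the newly generated response.
+ response = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+ print(response)
+ ```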

+ ## Training details
+
+ The model was trained on 8 A100 80GB GPUs for approximately 6 hours.
+
+ | Hyperparameters | Value |
+ | :----------------------------| :-----: |
+ | per_device_train_batch_size | 6 |
+ | gradient_accumulation_steps | 1 |
+ | epoch | 3 |
+ | steps | 11502 |
+ | learning_rate | 2e-5 |
+ | lr scheduler type | cosine |
+ | warmup ratio | 0.1 |
+ | optimizer | adamw |
+ | fp16 | True |
+ | GPU | 8 A100 80GB |
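+
+ As a rough illustration, the table maps onto Hugging Face `TrainingArguments` roughly as sketched below. This assumes a standard `Trainer`-style setup; the actual training code and dataset are not part of this card, and `output_dir` and the exact `adamw_torch` optimizer variant are placeholders.
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Illustrative mapping of the hyperparameter table above; the numeric values come
+ # from the table, but output_dir and the optimizer variant are assumptions.
+ training_args = TrainingArguments(
+     output_dir="code-millenials-1b-sft",
+     per_device_train_batch_size=6,
+     gradient_accumulation_steps=1,
+     num_train_epochs=3,
+     learning_rate=2e-5,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.1,
+     optim="adamw_torch",
+     fp16=True,
+ )
+ ```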
 
+ ### Important Note
+
+ - **Bias, Risks, and Limitations:** The model may sometimes make errors, produce misleading content, or struggle with tasks that are not related to coding.