jackal1586 committed on
Commit eb823ba · 1 Parent(s): f066ef4

Update README.md

Files changed (1)
  1. README.md +13 -17
README.md CHANGED
@@ -1,37 +1,33 @@
- [tokenizer](#tokenizer) | [model](#model) | [datasets](#datasets) | [plots](#plots) | [fine tuning](#fine-tuning)
-
- # Tokenizer {#tokenizer}
+ # Tokenizer

  We trained our tokenizer using [sentencepiece](https://github.com/google/sentencepiece)'s unigram tokenizer, then loaded it as MT5TokenizerFast.

- ## Model {#model}
+ ## Model

  We used the [MT5-base](https://huggingface.co/google/mt5-base) model.

- ## Datasets {#datasets}
+ ## Datasets

  We used [Code Search Net](https://huggingface.co/datasets/code_search_net)'s dataset and some data scraped from the internet to train the model. We maintained a list of datasets, where each dataset contained code in a single language.

- ## Plots {#plots}
-
- [train loss](#train_loss) | [evaluation loss](#eval_loss) | [evaluation accuracy](#eval_acc) | [learning rate](#lrs)
+ ## Plots

- ### Train loss {#train_loss}
+ ### Train loss

- ![train loss](train_loss.png)
+ ![train loss](https://i.ibb.co/x53Wm8n/train-loss.png)

- ### Evaluation loss {#eval_loss}
+ ### Evaluation loss

- ![eval loss](eval_loss.png)
+ ![eval loss](https://i.ibb.co/McB2jnf/eval-loss.png)

- ### Evaluation accuracy {#eval_acc}
+ ### Evaluation accuracy

- ![eval accuracy](eval_accuracy.png)
+ ![eval accuracy](https://i.ibb.co/YDGhLdn/eval-accuracy.png)

- ### Learning rate {#lrs}
+ ### Learning rate

- ![learning rate](learning_rate.png)
+ ![learning rate](https://i.ibb.co/CMStzWv/learning-rate.png)

- ## Fine tuning {#fine-tuning}
+ ## Fine tuning

  We fine-tuned the model with the [CodeXGLUE code-to-code-trans dataset](https://huggingface.co/datasets/code_x_glue_cc_code_to_code_trans) and scraped data.
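The commit carries no code, only the README text above. As a rough sketch of the tokenizer step it describes, one could train a unigram SentencePiece model and wrap it as MT5TokenizerFast roughly as below; the corpus path `corpus.txt` and `vocab_size=32000` are placeholders, since the actual corpus and settings are not given.

```python
import sentencepiece as spm
from transformers import MT5TokenizerFast

# Train a unigram SentencePiece model on a plain-text code corpus.
# "corpus.txt" and vocab_size=32000 are assumptions, not values from the README.
spm.SentencePieceTrainer.train(
    input="corpus.txt",
    model_prefix="code_unigram",
    model_type="unigram",
    vocab_size=32000,
)

# Wrap the trained SentencePiece model as an MT5TokenizerFast, as the README says was done.
tokenizer = MT5TokenizerFast(vocab_file="code_unigram.model")
tokenizer.save_pretrained("tokenizer")
```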
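Loading the base model is standard `transformers` usage. The embedding resize below is an assumption that follows from training a new tokenizer; the README does not describe that step.

```python
from transformers import MT5ForConditionalGeneration, MT5TokenizerFast

# Start from the pretrained MT5-base checkpoint named in the README.
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

# Reload the custom tokenizer from the previous sketch and, if its vocabulary
# size differs from MT5's, resize the embeddings to match (assumed step).
tokenizer = MT5TokenizerFast.from_pretrained("tokenizer")
model.resize_token_embeddings(len(tokenizer))
```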
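The per-language dataset list could be built from the Hub copy of Code Search Net, which exposes one configuration per language. The specific languages used are not stated in the README, and the scraped portion of the data is not public, so this is only an illustration; depending on the `datasets` version, loading this script-based dataset may also require `trust_remote_code=True`.

```python
from datasets import load_dataset

# One dataset per language, as the README describes.
# The six Code Search Net configurations are listed for illustration only.
languages = ["python", "java", "javascript", "go", "php", "ruby"]
per_language_train = {
    lang: load_dataset("code_search_net", lang, split="train")
    for lang in languages
}

for lang, ds in per_language_train.items():
    print(lang, len(ds))
```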
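For the fine-tuning data, the CodeXGLUE code-to-code-trans set (Java↔C# translation pairs) is available on the Hub; the additional scraped data mentioned in the README is not, so only the public part is shown here.

```python
from datasets import load_dataset

# CodeXGLUE code-to-code translation data used for fine-tuning.
finetune = load_dataset("code_x_glue_cc_code_to_code_trans")

print(finetune)               # train / validation / test splits
print(finetune["train"][0])   # one Java/C# pair
```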