Sagarkrishna committed on
Commit
899ede3
1 Parent(s): 0c71a93

Create README.md

Files changed (1)
  1. README.md +122 -0
README.md ADDED
@@ -0,0 +1,122 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - trl
+ - sft
+ - generated_from_trainer
+ base_model: meta-llama/Meta-Llama-3-8B-Instruct
+ datasets:
+ - b-mc2/sql-create-context
+ model-index:
+ - name: llama3-8b-instruct-text-to-sql
+   results: []
+ metrics:
+ - accuracy 79.90
+ language:
+ - en
+ ---
+
+
+
+ # llama3-8b-instruct-text-to-sql
+
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the generator dataset; the card metadata lists `b-mc2/sql-create-context` under `datasets`.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0002
+ - train_batch_size: 3
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 6
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: constant
+ - lr_scheduler_warmup_ratio: 0.03
+ - num_epochs: 3
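The reported `total_train_batch_size` follows from the per-device batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic, assuming a single training device (the card does not state the device count). The dictionary keys borrow `transformers.TrainingArguments` parameter names as an assumption about how the run was configured, and `effective_batch_size` is a hypothetical helper, not part of any library.

```python
# Values copied from the hyperparameter list above. The key names follow
# transformers.TrainingArguments conventions; this mapping is an assumption,
# not something the model card states.
reported = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 3,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.03,
    "num_train_epochs": 3,
}


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to a single optimizer step."""
    return per_device * accumulation_steps * num_devices


# Single device is assumed here; the card does not say how many were used.
total = effective_batch_size(
    reported["per_device_train_batch_size"],
    reported["gradient_accumulation_steps"],
)
print(total)  # 6, which matches total_train_batch_size in the list above
```

With more than one device the same formula would give a larger value, so the reported total of 6 is consistent with single-device training.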
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - PEFT 0.10.0
+ - Transformers 4.40.0
+ - PyTorch 2.2.0+cu121
+ - Datasets 2.19.0
+ - Tokenizers 0.19.1
+
+
+ ### Usage
+
+ ```python
+
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_id = "SagarKrishna/Llama_3_8b_Instruct_Text2Sql_FullPrecision_Finetuned"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+
+ messages = [
+     {"role": "system", "content": "You are an text to SQL query translator. Users will ask you questions in English and you will generate a SQL query based on the provided SCHEMA.\nSCHEMA:\nCREATE TABLE match_season (College VARCHAR, POSITION VARCHAR)"},
+     {"role": "user", "content": "Which college have both players with position midfielder and players with position defender?"},
+ ]
+
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,  # append the assistant header so the model starts its reply
+     return_tensors="pt"
+ ).to(model.device)
+
+ terminators = [  # stop on either the tokenizer EOS or the Llama 3 end-of-turn token
+     tokenizer.eos_token_id,
+     tokenizer.convert_tokens_to_ids("<|eot_id|>")
+ ]
+
+ outputs = model.generate(
+     input_ids,
+     max_new_tokens=256,
+     eos_token_id=terminators,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+ )
+ response = outputs[0]  # full sequence: prompt tokens followed by the generated tokens
+ print(tokenizer.decode(response, skip_special_tokens=True))
+
+ # Example output (sampling is enabled, so the generated query may vary):
+ # system
+ # You are an text to SQL query translator. Users will ask you questions in English and you will generate a SQL query based on the provided SCHEMA.
+ # SCHEMA:
+ # CREATE TABLE match_season (College VARCHAR, POSITION VARCHAR)
+ # user
+ # Which college have both players with position midfielder and players with position defender?
+ # assistant
+ # SELECT College FROM match_season WHERE POSITION = "Midfielder" INTERSECT SELECT College FROM match_season WHERE POSITION = "Defender"
+ #
+ ```
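The usage snippet hard-codes one schema and prints the whole decoded conversation, prompt included. The following is a minimal sketch of two helpers around it: one that builds the `messages` list for an arbitrary schema and question using the same system prompt wording, and one that pulls the query out of the decoded text. Both `build_messages` and `extract_sql` are hypothetical names introduced here, and `extract_sql` assumes the decoded text puts each role name on its own line, as in the example output above.

```python
# System prompt wording copied verbatim from the usage snippet (including its
# original phrasing), with the schema left as a placeholder.
SYSTEM_TEMPLATE = (
    "You are an text to SQL query translator. Users will ask you questions in English "
    "and you will generate a SQL query based on the provided SCHEMA.\n"
    "SCHEMA:\n{schema}"
)


def build_messages(schema: str, question: str) -> list:
    """Build the chat messages expected by tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.format(schema=schema)},
        {"role": "user", "content": question},
    ]


def extract_sql(decoded: str) -> str:
    """Return the text after the last line that is exactly 'assistant'.

    Falls back to the whole stripped string when no such line exists.
    """
    lines = decoded.splitlines()
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].strip() == "assistant":
            return "\n".join(lines[i + 1:]).strip()
    return decoded.strip()


# Illustrative decoded text shaped like the example output above.
sample = (
    "system\n\nYou are an text to SQL query translator.\n"
    "user\n\nHow many rows are there?\n"
    "assistant\n\nSELECT COUNT(*) FROM t"
)
print(extract_sql(sample))  # SELECT COUNT(*) FROM t
```

An alternative that avoids text parsing is to slice off the prompt tokens before decoding, for example `outputs[0][input_ids.shape[-1]:]`, so that only the newly generated tokens are decoded.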