hubentu commited on
Commit
2c9e240
1 Parent(s): a3640b9

Update README

Browse files
Files changed (1) hide show
  1. README.md +114 -0
README.md CHANGED
@@ -12,6 +12,120 @@ tags:
12
  - sft
13
  ---
14
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
15
  # Uploaded model
16
 
17
  - **Developed by:** hubentu
 
12
  - sft
13
  ---
14
 
15
+ # Model Information
16
+
17
+ The `cmd2cwl` model is an instruction fine-tuned version of the `unsloth/Llama-3.2-3B`. This model has been trained on a custom dataset consisting of help documentation from various command-line tools and corresponding CWL (Common Workflow Language) scripts. Its purpose is to assist users in converting command-line tool documentation into clean and well-structured CWL scripts, enhancing automation and workflow reproducibility.
18
+
19
+ # Example
20
+ ## Task
21
+ ``` python
22
+ question = """
23
+ Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
24
+
25
+ ### Instruction:
26
+ Write a cwl script for md5sum with docker image alpine.
27
+
28
+ ### Input:
29
+
30
+ With no FILE, or when FILE is -, read standard input.
31
+
32
+ -b, --binary read in binary mode
33
+ -c, --check read MD5 sums from the FILEs and check them
34
+ --tag create a BSD-style checksum
35
+ -t, --text read in text mode (default)
36
+ -z, --zero end each output line with NUL, not newline,
37
+ and disable file name escaping
38
+
39
+ The following five options are useful only when verifying checksums:
40
+ --ignore-missing don't fail or report status for missing files
41
+ --quiet don't print OK for each successfully verified file
42
+ --status don't output anything, status code shows success
43
+ --strict exit non-zero for improperly formatted checksum lines
44
+ -w, --warn warn about improperly formatted checksum lines
45
+
46
+ --help display this help and exit
47
+ --version output version information and exit
48
+
49
+ The sums are computed as described in RFC 1321. When checking, the input
50
+ should be a former output of this program. The default mode is to print a
51
+ line with checksum, a space, a character indicating input mode ('*' for binary,
52
+ ' ' for text or where binary is insignificant), and name for each FILE.
53
+
54
+
55
+ ### Response:
56
+ """
57
+ ```
58
+
59
+ ## Using unsloth
60
+
61
+ ``` python
62
+ from unsloth import FastLanguageModel
63
+ from transformers import TextStreamer
64
+
65
+ model, tokenizer = FastLanguageModel.from_pretrained(
66
+ model_name = "hubentu/cmd2cwl_Llama-3.2-3B",
67
+ load_in_4bit = False,
68
+ )
69
+ FastLanguageModel.for_inference(model)
70
+
71
+ inputs = tokenizer(
72
+ [question],
73
+ return_tensors = "pt").to("cuda")
74
+
75
+ text_streamer = TextStreamer(tokenizer)
76
+ _ = model.generate(**inputs, streamer = text_streamer)
77
+
78
+ ```
79
+
80
+ ## Using AutoModelForCausalLM
81
+ ``` python
82
+ from transformers import AutoTokenizer, AutoModelForCausalLM
83
+ from transformers import TextStreamer
84
+
85
+ model = AutoModelForCausalLM.from_pretrained("hubentu/cmd2cwl_Llama-3.2-3B")
86
+ tokenizer = AutoTokenizer.from_pretrained("hubentu/cmd2cwl_Llama-3.2-3B")
87
+ model.to('cuda')
88
+
89
+ text_streamer = TextStreamer(tokenizer)
90
+ _ = model.generate(**inputs, streamer = text_streamer, max_length=8192)
91
+ ```
92
+
93
+ ## Using generator
94
+ ``` python
95
+ from transformers import pipeline
96
+ generator = pipeline('text-generation', model="checkpoints/cmd2cwl_Llama-3.2-3B", device='cuda')
97
+ resp = generator(question, max_length=8192)
98
+ print(resp[0]['generated_text'].split("### Response:\n")[-1])
99
+ ```
100
+
101
+ ## Output
102
+ ```
103
+ cwlVersion: v1.0
104
+ class: CommandLineTool
105
+ baseCommand:
106
+ - md5sum
107
+ requirements:
108
+ - class: DockerRequirement
109
+ dockerPull: alpine:latest
110
+ label: md5sum
111
+ doc: Compute and check MD5 checksums
112
+ inputs:
113
+ files:
114
+ label: files
115
+ doc: Input files
116
+ type: File[]
117
+ inputBinding:
118
+ separate: true
119
+ outputs:
120
+ md5:
121
+ label: md5
122
+ doc: MD5 checksums
123
+ type: string[]
124
+ outputBinding:
125
+ glob: $(inputs.files.name)
126
+ ```
127
+
128
+
129
  # Uploaded model
130
 
131
  - **Developed by:** hubentu