Edit model card

Model Information

The cmd2cwl model is an instruction fine-tuned version of the unsloth/Llama-3.2-3B. This model has been trained on a custom dataset consisting of help documentation from various command-line tools and corresponding CWL (Common Workflow Language) scripts. Its purpose is to assist users in converting command-line tool documentation into clean and well-structured CWL scripts, enhancing automation and workflow reproducibility.

Example

Task

question = """
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Write a cwl script for md5sum with docker image alpine.

### Input:

    With no FILE, or when FILE is -, read standard input.

    -b, --binary         read in binary mode
    -c, --check          read MD5 sums from the FILEs and check them
    --tag            create a BSD-style checksum
    -t, --text           read in text mode (default)
    -z, --zero           end each output line with NUL, not newline,
    and disable file name escaping

    The following five options are useful only when verifying checksums:
    --ignore-missing  don't fail or report status for missing files
    --quiet          don't print OK for each successfully verified file
    --status         don't output anything, status code shows success
    --strict         exit non-zero for improperly formatted checksum lines
    -w, --warn           warn about improperly formatted checksum lines

    --help     display this help and exit
    --version  output version information and exit

    The sums are computed as described in RFC 1321.  When checking, the input
    should be a former output of this program.  The default mode is to print a
    line with checksum, a space, a character indicating input mode ('*' for binary,
    ' ' for text or where binary is insignificant), and name for each FILE.


### Response:
"""

Using unsloth

from unsloth import FastLanguageModel
from transformers import TextStreamer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "hubentu/cmd2cwl_Llama-3.2-3B",
    load_in_4bit = False,
)
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    [question],
    return_tensors = "pt").to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer)

Using AutoModelForCausalLM

from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import TextStreamer

model = AutoModelForCausalLM.from_pretrained("hubentu/cmd2cwl_Llama-3.2-3B")
tokenizer = AutoTokenizer.from_pretrained("hubentu/cmd2cwl_Llama-3.2-3B")
model.to('cuda')

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_length=8192)

Using generator

from transformers import pipeline
generator = pipeline('text-generation', model="checkpoints/cmd2cwl_Llama-3.2-3B", device='cuda')
resp = generator(question, max_length=8192)
print(resp[0]['generated_text'].split("### Response:\n")[-1])

Output

cwlVersion: v1.0
class: CommandLineTool
baseCommand:
- md5sum
requirements:
- class: DockerRequirement
  dockerPull: alpine:latest
label: md5sum
doc: Compute and check MD5 checksums
inputs:
  files:
    label: files
    doc: Input files
    type: File[]
    inputBinding:
      separate: true
outputs:
  md5:
    label: md5
    doc: MD5 checksums
    type: string[]
    outputBinding:
      glob: $(inputs.files.name)

Uploaded model

  • Developed by: hubentu
  • License: apache-2.0
  • Finetuned from model : unsloth/Llama-3.2-3B

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
2
Safetensors
Model size
3.21B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for hubentu/cmd2cwl_Llama-3.2-3B

Finetuned
(1)
this model