library_name: peft

Training procedure

Fine-tuned on the ultrachat-100k-flattened dataset for 1 epoch; training took around 40 hours on an A100 80GB.

The following bitsandbytes quantization config was used during training (a BitsAndBytesConfig sketch is shown after the list):

  • quant_method: bitsandbytes
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: float16
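
For reference, these settings map onto the transformers BitsAndBytesConfig API roughly as follows. This is a minimal sketch, not the exact training script, and the base checkpoint name is an assumption.

```python
# Minimal sketch of the 4-bit quantization config listed above.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# Base model name is an assumption (a Mistral-7B checkpoint); swap in the one you use.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```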

Framework versions

  • PEFT 0.5.0

Prompt

Use the following format for prompting:

```python
prompt = "### Human: " + instruction + "### Assistant: "
```
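
A hedged inference example follows; the model id comes from this repo's name, while the instruction text and generation settings are illustrative assumptions.

```python
# Illustrative inference example using the prompt format above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gradjitta/mistral-7b-ultrachat100k-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

instruction = "What is QLoRA fine-tuning?"
prompt = "### Human: " + instruction + "### Assistant: "

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```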

The adapter was merged into the base model following the gist https://gist.github.com/ChrisHayduk/1a53463331f52dca205e55982baf9930; a simplified sketch is shown below.
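
This sketch uses PEFT's merge_and_unload; the linked gist additionally handles dequantization of the 4-bit base weights. The base checkpoint, adapter path, and output directory here are placeholders/assumptions.

```python
# Simplified merge sketch (the linked gist covers the 4-bit dequantization details).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Base checkpoint and adapter path are placeholders.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/qlora-adapter").merge_and_unload()
merged.save_pretrained("mistral-7b-ultrachat100k-merged")
```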

Guidance from https://kaitchup.substack.com/

Work supported by https://datacrunch.io/
