KARAKURI LM 7B APM v0.2 - GGUF

This is a quantized version of karakuri-ai/karakuri-lm-7b-apm-v0.2, created using llama.cpp.

Model Details

Model Description

Usage

KARAKURI LM 7B APM v0.2 is an attribute prediction model that rates model responses on various aspects that make a response desirable.

Given a multi-turn conversation between a user and an assistant, the model rates every assistant turn on the following attributes, each on a scale from 0 to 4:

  • helpfulness: Overall helpfulness of the response to the prompt.
  • correctness: Inclusion of all pertinent facts without errors.
  • coherence: Consistency and clarity of expression.
  • complexity: Intellectual depth required to write the response (i.e., whether the response could be written by anyone with basic language competency or requires deep domain expertise).
  • verbosity: Amount of detail included in the response, relative to what is asked for in the prompt.
  • quality: Perceived goodness of the response.
  • toxicity: Undesirable elements, such as vulgar, harmful, or potentially biased content.
  • humor: Sense of humor within the response.
  • creativity: Willingness to generate an unconventional response.

The first five are derived from HelpSteer, while the remaining four are derived from OASST2.
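As a rough illustration of how these two attribute groups might be handled in client code, the sketch below parses the model's rating text into a dictionary and checks it against the expected group and the 0 to 4 range. The `name: score` text format and the group keys `helpsteer` and `oasst` are taken from the usage code in this card; the names `ATTRIBUTE_GROUPS` and `parse_ratings` are hypothetical and not part of any library.

```python
import re
from typing import Dict

# Attribute groups as listed above, keyed by the `label` values used in the
# usage code ("helpsteer" and "oasst"). Hypothetical helper names throughout.
ATTRIBUTE_GROUPS = {
    "helpsteer": ("helpfulness", "correctness", "coherence", "complexity", "verbosity"),
    "oasst": ("quality", "toxicity", "humor", "creativity"),
}

# Matches "name: score" pairs; closing tags such as [/ATTR_1] are ignored
# because they contain no lowercase word followed by a colon.
_PAIR = re.compile(r"([a-z]+):\s*(\d+)")


def parse_ratings(text: str, label: str) -> Dict[str, int]:
    """Parse generated rating text and validate it against one attribute group."""
    expected = ATTRIBUTE_GROUPS[label]
    ratings = {name: int(score) for name, score in _PAIR.findall(text)}

    missing = [name for name in expected if name not in ratings]
    unexpected = [name for name in ratings if name not in expected]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")

    # The card states that every attribute is rated between 0 and 4.
    for name, score in ratings.items():
        if not 0 <= score <= 4:
            raise ValueError(f"{name} score {score} is outside 0..4")
    return ratings
```

For example, `parse_ratings(" quality: 3 toxicity: 1 humor: 1 creativity: 1 [/ATTR_2]<eos>", "oasst")` returns a dictionary with the four OASST2-derived attributes. Raising on missing or out-of-range values is a design choice for this sketch; a caller that prefers partial results could collect the errors instead.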

You can run the original (unquantized) model using 🤗 Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "karakuri-ai/karakuri-lm-7b-apm-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
]
tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    tokenize=False,
    add_generation_prompt=True,
)
# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_1]

input_ids = tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
#  helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>

messages += [
    {"role": "label", "content": "helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1"},
    {"role": "user", "content": "Thank you!"},
    {"role": "assistant", "content": "You're welcome! I'm happy to help however I can."},
]
tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    tokenize=False,
    add_generation_prompt=True,
)
# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_1] helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>[INST] Thank you! [/INST] You're welcome! I'm happy to help however I can. [ATTR_1]

messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
]
tokenizer.apply_chat_template(
    messages,
    label="oasst",
    tokenize=False,
    add_generation_prompt=True,
)
# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_2]

input_ids = tokenizer.apply_chat_template(
    messages,
    label="oasst",
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
#  quality: 3 toxicity: 1 humor: 1 creativity: 1 [/ATTR_2]<eos>
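When running the GGUF files through a llama.cpp-based runtime, the `label` argument of the Transformers chat template may not be available, so the prompt may need to be assembled by hand. The following is a minimal sketch that reproduces the prompt strings shown in the comments above. It is inferred only from those three examples, so the exact spacing and the handling of any other message layout are assumptions. `build_apm_prompt` and `LABEL_TAGS` are hypothetical names, and `<bos>` / `<eos>` are placeholders to be replaced with the model's actual special-token strings (or with empty strings if the runtime adds them itself).

```python
from typing import Dict, List

# Tag names as they appear in the prompt comments above.
LABEL_TAGS = {"helpsteer": "ATTR_1", "oasst": "ATTR_2"}


def build_apm_prompt(
    messages: List[Dict[str, str]],
    label: str,
    bos: str = "<bos>",  # placeholder, not the verified token string
    eos: str = "<eos>",  # placeholder, not the verified token string
) -> str:
    """Assemble a rating prompt that ends with the opening attribute tag."""
    tag = LABEL_TAGS[label]
    parts = [bos]
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            parts.append(f"[INST] {content} [/INST]")
        elif role == "assistant":
            parts.append(f" {content}")
        elif role == "label":
            # Ratings already assigned to an earlier assistant turn.
            parts.append(f" [{tag}] {content} [/{tag}]{eos}")
        else:
            raise ValueError(f"unsupported role: {role!r}")
    # Open the attribute block so the model generates the ratings next.
    parts.append(f" [{tag}]")
    return "".join(parts)


prompt = build_apm_prompt(
    [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hello! How can I help you today?"},
    ],
    label="oasst",
)
print(prompt)
```

The sketch assumes every earlier assistant turn is followed by a `label` message, as in the multi-turn example above; conversations with unlabeled earlier turns are not covered by the examples in this card.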

Training Details

Training Data

Training Infrastructure

  • Hardware: The model was trained on a single Amazon EC2 trn1.32xlarge node.
  • Software: We use code based on neuronx-nemo-megatron.

Model Citation

@misc{karakuri_lm_7b_apm_v02,
    author       = { {KARAKURI} {I}nc. },
    title        = { {KARAKURI} {LM} 7{B} {APM} v0.2 },
    year         = { 2024 },
    url          = { https://huggingface.co/karakuri-ai/karakuri-lm-7b-apm-v0.2 },
    publisher    = { Hugging Face },
    journal      = { Hugging Face repository }
}

GGUF

  • Model size: 7.24B params
  • Architecture: llama
  • Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit