Announcing OLAFv2: The Next Step in Korean Language Understanding 🚀

We are thrilled to announce the release of OLAFv2, our state-of-the-art Korean language model, now available on Hugging Face! 🎉 Designed to excel in complex reasoning, mathematical problem-solving, and general language understanding, OLAFv2 represents a significant leap forward in NLP capabilities for the Korean language.


Key Features of OLAFv2 🌟

Two Model Sizes for Flexibility

OLAFv2 is available in two parameter sizes:

  • 14B (Billion) Parameters: For maximum performance. 🏋️‍♂️
  • 1.5B (Billion) Parameters: For lightweight applications and hardware-constrained environments. 🪶
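To make the hardware trade-off between the two sizes concrete, here is a rough back-of-the-envelope sketch. It assumes weight memory is simply parameter count times bytes per parameter, uses the nominal 14B and 1.5B figures rather than exact checkpoint counts, and ignores activations, the KV cache, and framework overhead, so treat the numbers as lower bounds rather than measured requirements. The function name `weight_memory_gb` is illustrative only.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Assumption: nominal sizes (14e9, 1.5e9); real checkpoints differ slightly.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Return approximate weight memory in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, params in [("14B", 14e9), ("1.5B", 1.5e9)]:
    estimates = {d: round(weight_memory_gb(params, d), 1) for d in ("float32", "bfloat16")}
    print(name, estimates)
```

Under these assumptions the 14B model needs roughly 28 GB for weights in bfloat16, while the 1.5B model needs roughly 3 GB, which is why the smaller size suits constrained hardware.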

Reasoning Mode for Complex Tasks 🤔

One of OLAFv2's standout features is its Reasoning Mode, specifically designed for:

  • Complex mathematical problem-solving. ✖️➗
  • STEM (Science, Technology, Engineering, Mathematics) applications. 🔬
  • Tasks requiring detailed step-by-step reasoning. 🧠

This mode works well with test-time scaling, which lets the model spend additional compute during inference. The extra compute improves output detail and accuracy, reaching performance levels that surpass GPT-4o. 📈
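One common way to spend extra inference compute is to sample several reasoning traces and take a majority vote over the final answers (often called self-consistency). The sketch below assumes that approach; this announcement does not say which test-time scaling method OLAFv2 was evaluated with. It relies only on the `\boxed{N}` answer format that the reasoning-mode system prompt requests, and the helper names `extract_boxed` and `majority_vote` are hypothetical.

```python
import re
from collections import Counter

# Matches \boxed{...} with no nested braces inside.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_boxed(text):
    """Return the content of the last \\boxed{...} in text, or None."""
    matches = BOXED.findall(text)
    return matches[-1].strip() if matches else None


def majority_vote(samples):
    """Pick the most frequent boxed answer across sampled completions."""
    answers = [a for a in map(extract_boxed, samples) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Illustrative completions; in practice these come from the model.
samples = [
    "Step 1 ... Step 2 ... \\boxed{42}",
    "Reasoning ... \\boxed{41}",
    "Another path ... \\boxed{42}",
]
print(majority_vote(samples))
```

In practice the completions could be drawn by calling `model.generate` with `do_sample=True` and `num_return_sequences` set to the number of samples; more samples mean more compute per question.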


Long Context Support 📜

With support for up to 32K tokens, OLAFv2 is well suited for:

  • Retrieval-Augmented Generation (RAG). 🛠️
  • Tasks requiring long-context understanding and reasoning. 🧡
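For RAG, the context window acts as a budget that retrieved passages must fit into. The sketch below shows one simple way to respect it: keep passages in ranked order until the budget is spent. The 32,000 default is a conservative reading of "32K", the `reserved` allowance for the question and the generated answer is an assumption, and `pack_passages` and `word_count` are hypothetical helper names.

```python
from typing import Callable, List


def pack_passages(
    passages: List[str],
    count_tokens: Callable[[str], int],
    max_context: int = 32_000,
    reserved: int = 1_024,
) -> List[str]:
    """Keep passages, in ranked order, until the token budget is used up."""
    budget = max_context - reserved
    kept, used = [], 0
    for passage in passages:
        cost = count_tokens(passage)
        if used + cost > budget:
            break  # stop at the first passage that no longer fits
        kept.append(passage)
        used += cost
    return kept


# Stand-in counter for illustration only; whitespace words are not real tokens.
def word_count(text: str) -> int:
    return len(text.split())


docs = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
print(pack_passages(docs, word_count, max_context=10, reserved=4))
```

With a real model, the counter would be based on the model's own tokenizer, for example `lambda t: len(tokenizer.encode(t))`, since token counts differ considerably from word counts, especially for Korean text.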

Benchmarks and Performance 📊

We share evaluation results across three benchmarks: KMMLU, HRM8K, and LogicKor.

[Figure: evaluation results on KMMLU, HRM8K, and LogicKor]

We also share results with inference-time scaling. For more details, see our blog.

[Figures: results with inference-time scaling]

Getting Started 🚀

OLAFv2 is now available on Hugging Face! You can start using it by accessing our repository:

# pip install transformers accelerate torch
# (accelerate is required for device_map="auto")
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "OLAResearch/OLAF2-14B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "introduce yourself!"
messages = [
    {"role": "system", "content": "You're name is OLAF. A large language model made by OneLineAI, specializing in Korean culture and finance."},
    # for reasoning mode
    #{"role": "system", "content": "You're name is OLAF. A large language model made by OneLineAI, specializing in Korean culture and finance.Perform two-step reasoning. Return your answers in \\boxed{N} format."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
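The snippet above switches between the default mode and Reasoning Mode by commenting one system message in or out. As a small convenience, the same choice can be wrapped in a helper. This is only a sketch: `build_messages` is a hypothetical name, and both prompt strings are copied verbatim from the example above (including their original spelling and spacing), on the assumption that the model expects them exactly as shown.

```python
# System prompts copied verbatim from the example above.
BASE_SYSTEM = (
    "You're name is OLAF. A large language model made by OneLineAI, "
    "specializing in Korean culture and finance."
)
REASONING_SUFFIX = "Perform two-step reasoning. Return your answers in \\boxed{N} format."


def build_messages(prompt: str, reasoning: bool = False) -> list:
    """Build the chat messages for default mode or Reasoning Mode."""
    system = (BASE_SYSTEM + REASONING_SUFFIX) if reasoning else BASE_SYSTEM
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]


messages = build_messages("What is 17 * 24?", reasoning=True)
print(messages[0]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.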