---
language:
- en
license: mit
library_name: transformers
tags:
- orpo
- qwen2
- sft
- chatml
base_model:
- MaziyarPanahi/calme-2.4-rys-78b
datasets:
- mlabonne/orpo-dpo-mix-40k
pipeline_tag: text-generation
inference: false
model_creator: dfurman
quantized_by: dfurman
model-index:
- name: CalmeRys-78B-Orpo-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 81.63
      name: strict accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 61.92
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 37.92
      name: exact match
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.02
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 36.37
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.8
      name: accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
---

# dfurman/CalmeRys-78B-Orpo-v0.1

This model is a finetune of `MaziyarPanahi/calme-2.4-rys-78b` on 1.5k rows of the `mlabonne/orpo-dpo-mix-40k` dataset. It was trained as a generalist language model for a variety of text generation use cases, including support of agentic capabilities, roleplaying, reasoning, multi-turn conversations, long context coherence, and more.

Thanks go out to [mlabonne](https://huggingface.co./mlabonne), [MaziyarPanahi](https://hf.xwall.us.kg.m/MaziyarPanahi), et al. for the source dataset and base model.

## 🦾 Training

You can find the experiment on W&B at this [link](https://wandb.ai/dryanfurman/huggingface/runs/1w50nu70?nw=nwuserdryanfurman). Here are a few visualizations:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/NG5WGL0ljzLsNhSBRVqnD.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/Zhk5Bpr1I2NrzX98Bhtp8.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/WgnKQnYIFWkCRSW3JPVAb.png)

## 💻 Usage

Setup

```python
# Notebook (IPython) syntax: lines starting with "!" run shell commands.
!pip install -qU transformers accelerate bitsandbytes
!huggingface-cli download dfurman/CalmeRys-78B-Orpo-v0.1
```

```python
from transformers import AutoTokenizer, BitsAndBytesConfig
import transformers
import torch

# Use FlashAttention 2 and bfloat16 on GPUs with compute capability 8.0 or higher;
# otherwise fall back to eager attention and float16.
if torch.cuda.get_device_capability()[0] >= 8:
    !pip install -qqq flash-attn
    attn_implementation = "flash_attention_2"
    torch_dtype = torch.bfloat16
else:
    attn_implementation = "eager"
    torch_dtype = torch.float16

# # quantize if necessary
# bnb_config = BitsAndBytesConfig(
#     load_in_4bit=True,
#     bnb_4bit_quant_type="nf4",
#     bnb_4bit_compute_dtype=torch_dtype,
#     bnb_4bit_use_double_quant=True,
# )

model = "dfurman/CalmeRys-78B-Orpo-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={
        "torch_dtype": torch_dtype,
        # "quantization_config": bnb_config,
        "device_map": "auto",
        "attn_implementation": attn_implementation,
    }
)
```

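Whether the commented-out 4-bit quantization config is needed depends mostly on how much GPU memory the weights require. The snippet below is a rough back-of-the-envelope sketch, not a measurement: it assumes a parameter count of 78 billion (read off the "78B" in the model name) and counts weights only, ignoring activations, the KV cache, and quantization metadata. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: parameter count taken from the "78B" in the model name.
NUM_PARAMS = 78e9

# 16 bits covers both bfloat16 and float16; 4 bits approximates NF4 storage.
for label, bits in [("bfloat16 / float16", 16), ("4-bit NF4", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB for weights")
```

Under these assumptions the half-precision weights alone exceed the memory of any single common GPU, which is why the setup relies on `device_map="auto"` and offers quantization as an option.
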
### Example 1

```python
question = "Is the number 9.11 larger than 9.9?"

messages = [
    {"role": "system", "content": "You are a helpful assistant that thinks step by step."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(
    prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95
)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt) :])
```

```
***Generation:
To compare these two numbers, it's important to look at their decimal places after the whole number part, which is 9 in both cases. Comparing the tenths place, 9.11 has a '1' and 9.9 has a '9'. Since '9' is greater than '1', 9.9 is larger than 9.11.
```

### Example 2

```python
question = """The bakers at the Beverly Hills Bakery baked 200 loaves of bread on Monday morning.
They sold 93 loaves in the morning and 39 loaves in the afternoon.
A grocery store then returned 6 unsold loaves back to the bakery.
How many loaves of bread did the bakery have left?
Respond as succinctly as possible. Format the response as a completion of this table:
|step|subquestion|procedure|result|
|:---|:----------|:--------|:-----:|"""

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt):])
```

```
***Generation:
|1|Calculate total sold|Add morning and afternoon sales|132|
|2|Subtract sold from total|200 - 132|68|
|3|Adjust for returns|Add returned loaves to remaining|74|
```

### Example 3

```python
question = "What's a good recipe for a spicy margarita?"

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt):])
```

```
***Generation:
To make a Spicy Margarita, you'll need to incorporate a chili or pepper element into your classic margarita recipe. Here’s a simple way to do it:

### Ingredients:
- 2 oz tequila (blanco or reposado)
- 1 oz fresh lime juice
- 1/2 oz triple sec (Cointreau or Grand Marnier)
- 1/2 oz agave syrup or simple syrup
- 1-2 slices of jalapeño (or more depending on how spicy you like it)
- Salt and/or chili powder for rimming the glass
- Ice
- Lime wheel for garnish

### Instructions:
1. **Muddle Jalapeño**: In a shaker, muddle the jalapeño slices slightly. This will release the oils and heat from the peppers.
2. **Add Remaining Ingredients**: Add the tequila, lime juice, triple sec, and agave syrup or simple syrup.
3. **Shake and Strain**: Fill the shaker with ice and shake vigorously until cold. Strain into a salt and/or chili powder rimmed glass filled with ice.
4. **Garnish and Serve**: Garnish with a lime wheel and enjoy.

If you prefer a smoother spiciness that doesn't overpower the drink, you could also consider making a jalapeño-infused tequila by leaving the jalapeño slices in the bottle of tequila for several hours to a couple of days, adjusting the time based on desired level of spiciness. Then use this infused tequila instead of regular tequila in the recipe above.

Another variation is to use a spicy syrup. To make this, combine equal parts water and sugar with a few sliced jalapeños in a saucepan. Bring to a boil, stirring occasionally to dissolve the sugar. Reduce heat and simmer for about 5 minutes. Let cool, strain out the jalapeños, then store in a sealed container in the refrigerator until ready to use. Use this spicy syrup instead of regular syrup in the recipe.

As always, adjust the quantity of jalapeño or the type of chili used to suit your taste. Enjoy responsibly!
```

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_dfurman__CalmeRys-78B-Orpo-v0.1)

|      Metric       |Value|
|-------------------|----:|
|Avg.               |50.78|
|IFEval (0-Shot)    |81.63|
|BBH (3-Shot)       |61.92|
|MATH Lvl 5 (4-Shot)|37.92|
|GPQA (0-shot)      |20.02|
|MuSR (0-shot)      |36.37|
|MMLU-PRO (5-shot)  |66.80|
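
As a quick sanity check on the table, the reported average can be reproduced from the six benchmark scores. This sketch assumes "Avg." is a plain unweighted mean of the listed values; the card does not state how the leaderboard computes it, and the variable names are illustrative only.

```python
# Scores copied from the results table in this card.
scores = {
    "IFEval (0-Shot)": 81.63,
    "BBH (3-Shot)": 61.92,
    "MATH Lvl 5 (4-Shot)": 37.92,
    "GPQA (0-shot)": 20.02,
    "MuSR (0-shot)": 36.37,
    "MMLU-PRO (5-shot)": 66.80,
}

# Assumption: "Avg." is the unweighted arithmetic mean of the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg.: {average:.2f}")
```

Rounded to two decimals, the unweighted mean agrees with the 50.78 reported in the table.
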