---
language:
- en
license: apache-2.0
tags:
- chat
- mistral
- roleplay
- creative-writing
base_model:
- nbeerbower/mistral-nemo-bophades-12B
- anthracite-org/magnum-v2-12b
- Sao10K/MN-12B-Lyra-v3
- Gryphe/Pantheon-RP-1.6-12b-Nemo
pipeline_tag: text-generation
model-index:
- name: StarDust-12b-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 56.29
      name: strict accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=Luni/StarDust-12b-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 34.95
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=Luni/StarDust-12b-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 5.97
      name: exact match
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=Luni/StarDust-12b-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.82
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=Luni/StarDust-12b-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 14.26
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=Luni/StarDust-12b-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 27.1
      name: accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=Luni/StarDust-12b-v2
      name: Open LLM Leaderboard
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6303fa71fc783bfc7443e7ae/c3ddWBoz-lINEykUDCoXy.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6303fa71fc783bfc7443e7ae/hOpgDxJS2sDO7HzuC9e18.png)

# StarDust-12b-v2

## Quants

- GGUF: [mradermacher/StarDust-12b-v2-GGUF](https://huggingface.co./mradermacher/StarDust-12b-v2-GGUF)
- weighted/imatrix GGUF: [mradermacher/StarDust-12b-v2-i1-GGUF](https://huggingface.co./mradermacher/StarDust-12b-v2-i1-GGUF/tree/main)
- exl2: [lucyknada/Luni_StarDust-12b-v2-exl2](https://huggingface.co./lucyknada/Luni_StarDust-12b-v2-exl2)

## Description | Usecase

- In my opinion, this merge produces more vibrant and less generic Sonnet-inspired prose; it can be gentle or harsh where asked.
- v2 uses the non-KTO magnum, which tends to show fewer "Claude-isms" (phrasings that make the story feel repetitive).
  - Note on non-KTO: opinions are sharply split between those who prefer and those who dislike the KTO variant. To keep things simple, [Luni/StarDust-12b-v1](https://huggingface.co./Luni/StarDust-12b-v1) is still available and uses the KTO version.
- In early testing, users reported a much better experience in longer roleplays and an ability to add a creative touch to the stable experience.

Just like with v1:

- This model is intended to be used as a roleplaying model.
- Its direct conversational output is... I can't even call it luck; it's just not made for it.
- To expand on that: the model is designed for roleplay. Direct instructing or general-purpose use is NOT recommended.
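For reference, the DARE-TIES merge described in this card could be expressed as a mergekit configuration along these lines. This is a sketch only: the `density` and `weight` values below are placeholders I chose for illustration, not the actual recipe used for this model.

```yaml
# Hypothetical mergekit config for a DARE-TIES merge of the listed models.
# density/weight values are illustrative placeholders, not the real recipe.
models:
  - model: nbeerbower/mistral-nemo-bophades-12B
    parameters:
      density: 0.5
      weight: 0.3
  - model: anthracite-org/magnum-v2-12b
    parameters:
      density: 0.5
      weight: 0.3
  - model: Gryphe/Pantheon-RP-1.6-12b-Nemo
    parameters:
      density: 0.5
      weight: 0.3
merge_method: dare_ties
base_model: Sao10K/MN-12B-Lyra-v3
dtype: bfloat16
```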
## Initial Feedback

- Initial feedback suggests the model is a solid "go-to" choice for creative story writing.
- The prose has been described as "amazing", with many users making it their default model.

## Prompting

### ChatML has proven to be the BEST choice.

Both the Mistral and ChatML templates should work, though I had better results with ChatML.

ChatML example:

```py
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [Sao10K/MN-12B-Lyra-v3](https://huggingface.co./Sao10K/MN-12B-Lyra-v3) as the base.

### Models Merged

The following models were included in the merge:

* [nbeerbower/mistral-nemo-bophades-12B](https://huggingface.co./nbeerbower/mistral-nemo-bophades-12B)
* [anthracite-org/magnum-v2-12b](https://huggingface.co./anthracite-org/magnum-v2-12b)
* [Gryphe/Pantheon-RP-1.6-12b-Nemo](https://huggingface.co./Gryphe/Pantheon-RP-1.6-12b-Nemo)
* [Sao10K/MN-12B-Lyra-v3](https://huggingface.co./Sao10K/MN-12B-Lyra-v3)

### Special Thanks

Special thanks to the SillyTilly, and to myself, for helping me find the energy to finish this.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_Luni__StarDust-12b-v2).

| Metric             |Value|
|--------------------|----:|
|Avg.                |24.06|
|IFEval (0-Shot)     |56.29|
|BBH (3-Shot)        |34.95|
|MATH Lvl 5 (4-Shot) | 5.97|
|GPQA (0-shot)       | 5.82|
|MuSR (0-shot)       |14.26|
|MMLU-PRO (5-shot)   |27.10|
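The ChatML prompt shown in the Prompting section can also be assembled programmatically. A minimal sketch in plain Python (the helper name `build_chatml_prompt` is my own, not part of any library shipped with this model):

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Build a ChatML-formatted prompt string.

    messages: list of {"role": ..., "content": ...} dicts.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    if add_generation_prompt:
        # Leave the final assistant turn open so the model completes it.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


prompt = build_chatml_prompt([
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
])
print(prompt)
```

This reproduces the example prompt from the Prompting section verbatim; in practice, most inference frontends (and `tokenizer.apply_chat_template` in transformers, when a chat template is configured) do this formatting for you.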