starble-dev committed "Update README.md"
> Mistral-Nemo-12B is very sensitive to the temperature sampler; try values near **0.3** first, or you may get strange results. MistralAI mentions this in the [Transformers](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407#transformers) section of their model card. <br>
> In my personal testing, Flash-Attention seems to have some odd effects with this model as well, though this is unconfirmed.

**Original Model:** [BeaverAI/mistral-doryV2-12b](https://huggingface.co/BeaverAI/mistral-doryV2-12b)

**How to Use:** [llama.cpp](https://github.com/ggerganov/llama.cpp)
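As a sketch of the temperature advice above, here is a hypothetical `llama-cli` invocation (the GGUF filename and prompt are illustrative, not files shipped with this repo; flag names are those of recent llama.cpp builds):

```shell
# Run the quant with a low temperature, per the note above.
# Replace the model path with the quant file you actually downloaded.
./llama-cli \
  -m ./mistral-doryV2-12b-Q4_K_M.gguf \
  --temp 0.3 \
  -p "Write a short greeting."
```

Given the unconfirmed Flash-Attention oddities noted above, you may also want to leave the `-fa` flag off (it is disabled by default).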
**Original Model License:** Apache 2.0

**Release Used:** [b3441](https://github.com/ggerganov/llama.cpp/releases/tag/b3441)

# Quants

| Name | Quant Type | Size |