Update README.md
README.md CHANGED
@@ -15,11 +15,11 @@ datasets:
 - philschmid/test_german_squad
 ---
 # Introduction
-This repository contains the model [jphme/Llama-2-13b-chat-german](https://huggingface.co/jphme/Llama-2-13b-chat-german) in GGUF format
-This model was created
 
-##
-The
 
 ### General Information
 |Attribute|Details|
@@ -33,10 +33,15 @@ The Model Profile (config.aiml file) describes relevant properties, configuratio
 |Attribute|Details|
 |----------------------------|--------------------------------------------------------------------------------------------------------------|
 | **Type** | Large Language Model |
-| **
 | **Architecture** | Transformers |
-| **Variables** | {"
 | **Filetype** | GGUF |
 | **Compression** | 8 Bit, 5 Bit (K_M), 4 Bit (K_M) |
 | **CompressionMethod** | llama.cpp - convert.py Script |
 | **Notes** | First, an FP16 GGUF file was generated and then quantized to 8, 4 (K_M) and 5 (K_M) Bit with llama.cpp/quantize |
@@ -44,7 +49,7 @@ The Model Profile (config.aiml file) describes relevant properties, configuratio
 ### Customization
 |Attribute|Details|
 |----------------------------|-----------------------------------------------------------------------------------------------------------------|
-| **Type** | finetune_full (e.g. none, finetune_lora, finetune_qlora, finetune_full)
 | **Class** | Instruct, Chat |
 | **Datasets** | {"[Proprietary German Conversation Dataset](https://placeholder.local/dataset)", "[German & German legal SQuAD](https://placeholder.local/dataset)"} |
 | **Notes** | The datasets were augmented with rows containing "wrong" contexts in order to improve factual RAG performance. |
@@ -15,11 +15,11 @@ datasets:
 - philschmid/test_german_squad
 ---
 # Introduction
+This repository contains the model [jphme/Llama-2-13b-chat-german](https://huggingface.co/jphme/Llama-2-13b-chat-german) in GGUF format.
+This model was created by [jphme](https://huggingface.co/jphme). It is a fine-tuned variant of Meta's [Llama2 13b Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat), trained on a compilation of multiple instruction datasets in the German language.
 
+## Model Profile
+The AIML Profile, stored in a file named "config.aiml", contains all relevant configuration parameters, properties, and rules for deploying the AI model securely.
 
 ### General Information
 |Attribute|Details|
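The config.aiml file itself is not part of this commit, so its concrete syntax is unknown; assuming a plain JSON layout (purely an illustration, with keys mirroring the General Information table), a deployment tool might read the profile like this:

```python
import json

# Hypothetical config.aiml content. The real AIML syntax is not shown in this
# diff; a JSON layout is assumed here. Keys and values are taken from the
# General Information table of the README.
PROFILE = """
{
  "Type": "Large Language Model",
  "Pipeline": "Text Generation",
  "Architecture": "Transformers",
  "Variables": {
    "llm_languages": "en,de,nl,it,fr",
    "llm_flavor": "llama",
    "llm_prompt_template": "llama2",
    "devices": "gpu[0,1,2,3],cpu[0]"
  },
  "Filetype": "GGUF"
}
"""

profile = json.loads(PROFILE)
# Comma-separated variable values unpack into lists.
languages = profile["Variables"]["llm_languages"].split(",")
print(languages)  # ['en', 'de', 'nl', 'it', 'fr']
```

Everything beyond the table contents (the JSON framing, the field nesting) is an assumption for illustration only.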
@@ -33,10 +33,15 @@ The Model Profile (config.aiml file) describes relevant properties, configuratio
 |Attribute|Details|
 |----------------------------|--------------------------------------------------------------------------------------------------------------|
 | **Type** | Large Language Model |
+| **Pipeline** | Text Generation |
 | **Architecture** | Transformers |
+| **Variables** | {"llm_languages":"en,de,nl,it,fr",
+"llm_flavor":"llama",
+"llm_prompt_template":"llama2",
+"devices":"gpu[0,1,2,3],cpu[0]",
+"key":"value"} |
 | **Filetype** | GGUF |
+| **InferenceTools** | Llama.cpp, Text Generation Inference (TGI), h2oGPT Server, KoboldCpp, Custom |
 | **Compression** | 8 Bit, 5 Bit (K_M), 4 Bit (K_M) |
 | **CompressionMethod** | llama.cpp - convert.py Script |
 | **Notes** | First, an FP16 GGUF file was generated and then quantized to 8, 4 (K_M) and 5 (K_M) Bit with llama.cpp/quantize |
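The CompressionMethod and Notes rows describe a two-step llama.cpp pipeline: convert the checkpoint to an FP16 GGUF with convert.py, then quantize that file to each target type. A sketch of the commands that pipeline implies; the directory and file names are assumptions, only the tools and quantization types come from the table:

```python
def quantize_commands(model_dir, name, quant_types=("Q8_0", "Q5_K_M", "Q4_K_M")):
    """Build the two-step pipeline from the Notes row: first convert the
    checkpoint to an FP16 GGUF, then quantize it to each target type with
    llama.cpp/quantize. Paths and output names are illustrative."""
    fp16 = f"{name}.fp16.gguf"
    cmds = [["python", "convert.py", model_dir, "--outtype", "f16", "--outfile", fp16]]
    for q in quant_types:
        # llama.cpp's quantize tool takes: input GGUF, output GGUF, quant type.
        cmds.append(["./quantize", fp16, f"{name}.{q}.gguf", q])
    return cmds

cmds = quantize_commands("Llama-2-13b-chat-german", "llama-2-13b-chat-german")
```

Each entry could be passed to `subprocess.run` from a checkout of llama.cpp; the 8, 5 (K_M) and 4 (K_M) Bit variants in the Compression row correspond to the Q8_0, Q5_K_M and Q4_K_M types.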
@@ -44,7 +49,7 @@ The Model Profile (config.aiml file) describes relevant properties, configuratio
 ### Customization
 |Attribute|Details|
 |----------------------------|-----------------------------------------------------------------------------------------------------------------|
+| **Type** | finetune_full (e.g. none, finetune_lora, finetune_qlora, finetune_full) |
 | **Class** | Instruct, Chat |
 | **Datasets** | {"[Proprietary German Conversation Dataset](https://placeholder.local/dataset)", "[German & German legal SQuAD](https://placeholder.local/dataset)"} |
 | **Notes** | The datasets were augmented with rows containing "wrong" contexts in order to improve factual RAG performance. |
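The profile's `"llm_prompt_template":"llama2"` variable together with the Instruct/Chat class suggests the standard Llama 2 chat format. A minimal sketch of assembling a single-turn prompt in that format; the system and user strings are example values, not taken from the model card:

```python
def llama2_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the standard Llama 2 chat format,
    which the profile's "llm_prompt_template":"llama2" variable appears to
    refer to: a [INST] block wrapping a <<SYS>> system section."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_prompt(
    "Du bist ein hilfreicher Assistent.",  # example system prompt (German)
    "Was ist die Hauptstadt von Deutschland?",  # example user turn
)
```

The model's completion would follow the closing `[/INST]` tag; multi-turn chats repeat the `[INST] … [/INST]` wrapping per user turn.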