freefallr committed on
Commit 56f625e
1 Parent(s): 0cc61a0

Update README.md

Files changed (1)
  1. README.md +12 -7
README.md CHANGED
@@ -15,11 +15,11 @@ datasets:
  - philschmid/test_german_squad
  ---
  # Introduction
- This repository contains the model [jphme/Llama-2-13b-chat-german](https://huggingface.co/jphme/Llama-2-13b-chat-german) in GGUF format, for fast and easy inference with llama.cpp and similar LLM inference tools.
- This model was created and trained by [jphme](https://huggingface.co/jphme). It is a fine-tuned variant of Meta's [Llama2 13b Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat) with a compilation of multiple instruction datasets in German language.

- ## AIML Profile
- The Model Profile (config.aiml file) describes relevant properties, configuration and rules for the AI model in a standardized, digestible and easy-to-read way.

  ### General Information
  |Attribute|Details|
@@ -33,10 +33,15 @@ The Model Profile (config.aiml file) describes relevant properties, configuratio
  |Attribute|Details|
  |----------------------------|--------------------------------------------------------------------------------------------------------------|
  | **Type** | Large Language Model |
- | **Function** | Text Generation |
  | **Architecture** | Transformers |
- | **Variables** | {"llm_flavor":"llama", "llm_prompt_template":"llama2", "devices":"gpu[0,1,2,3],cpu[0]", "key":"value"} |
  | **Filetype** | GGUF |
  | **Compression** | 8 Bit, 5 Bit (K_M), 4 Bit (K_M) |
  | **CompressionMethod** | llama.cpp - convert.py Script |
  | **Notes** | First, a FP16 GGUF file was generated, which was then quantized to 8, 5 (K_M) and 4 (K_M) Bit with llama.cpp/quantize |
@@ -44,7 +49,7 @@ The Model Profile (config.aiml file) describes relevant properties, configuratio
  ### Customization
  |Attribute|Details|
  |----------------------------|-----------------------------------------------------------------------------------------------------------------|
- | **Type** | finetune_full (e.g. none, finetune_lora, finetune_qlora, finetune_full) |
  | **Class** | Instruct, Chat |
  | **Datasets** | {"[Proprietary German Conversation Dataset](https://placeholder.local/dataset)", "[German & German legal SQuAD](https://placeholder.local/dataset)"} |
  | **Notes** | The datasets were augmented with rows containing "wrong" contexts, in order to improve factual RAG performance. |
 
  - philschmid/test_german_squad
  ---
  # Introduction
+ This repository contains the model [jphme/Llama-2-13b-chat-german](https://huggingface.co/jphme/Llama-2-13b-chat-german) in GGUF format.
+ This model was created by [jphme](https://huggingface.co/jphme). It is a fine-tuned variant of Meta's [Llama2 13b Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat) with a compilation of multiple instruction datasets in the German language.

+ ## Model Profile
+ The AIML Profile, stored in a file named "config.aiml", describes the configuration parameters, properties and rules needed to deploy the AI model securely and without hassle.

  ### General Information
  |Attribute|Details|
 
  |Attribute|Details|
  |----------------------------|--------------------------------------------------------------------------------------------------------------|
  | **Type** | Large Language Model |
+ | **Pipeline** | Text Generation |
  | **Architecture** | Transformers |
+ | **Variables** | {"llm_languages":"en,de,nl,it,fr", "llm_flavor":"llama", "llm_prompt_template":"llama2", "devices":"gpu[0,1,2,3],cpu[0]", "key":"value"} |
  | **Filetype** | GGUF |
+ | **InferenceTools** | Llama.cpp, Text Generation Inference (TGI), h2oGPT Server, KoboldCpp, Custom |
  | **Compression** | 8 Bit, 5 Bit (K_M), 4 Bit (K_M) |
  | **CompressionMethod** | llama.cpp - convert.py Script |
  | **Notes** | First, a FP16 GGUF file was generated, which was then quantized to 8, 5 (K_M) and 4 (K_M) Bit with llama.cpp/quantize |
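
The conversion and quantization workflow described in the **Notes** row can be sketched as shell commands. This is a sketch only, assuming a local llama.cpp checkout; the checkpoint path and output file names are illustrative, not taken from this repository:

```shell
# Sketch: convert the HF checkpoint to a FP16 GGUF file, then quantize it.
# Assumes a local llama.cpp checkout; paths and file names are illustrative.
python convert.py ../Llama-2-13b-chat-german --outtype f16 \
    --outfile llama-2-13b-chat-german.f16.gguf

# Quantize the FP16 file to the 8 bit, 5 bit (K_M) and 4 bit (K_M) variants.
./quantize llama-2-13b-chat-german.f16.gguf llama-2-13b-chat-german.Q8_0.gguf   Q8_0
./quantize llama-2-13b-chat-german.f16.gguf llama-2-13b-chat-german.Q5_K_M.gguf Q5_K_M
./quantize llama-2-13b-chat-german.f16.gguf llama-2-13b-chat-german.Q4_K_M.gguf Q4_K_M
```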
 
  ### Customization
  |Attribute|Details|
  |----------------------------|-----------------------------------------------------------------------------------------------------------------|
+ | **Type** | finetune_full (e.g. none, finetune_lora, finetune_qlora, finetune_full) |
  | **Class** | Instruct, Chat |
  | **Datasets** | {"[Proprietary German Conversation Dataset](https://placeholder.local/dataset)", "[German & German legal SQuAD](https://placeholder.local/dataset)"} |
  | **Notes** | The datasets were augmented with rows containing "wrong" contexts, in order to improve factual RAG performance. |
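
The profile above sets `llm_prompt_template` to `llama2` and the class to Instruct/Chat. A minimal sketch of that prompt format, i.e. the standard Llama 2 chat wrapping with `[INST]` and `<<SYS>>` markers (the helper function name is illustrative):

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Wrap a system and a user message in the Llama 2 chat template.

    Sketch of the standard Llama 2 format; inference tools such as
    llama.cpp expect a prompt shaped like this for chat-tuned models.
    """
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

# Example with a German instruction, matching the model's training language.
prompt = build_llama2_prompt(
    "Du bist ein hilfreicher Assistent.",
    "Was ist die Hauptstadt von Deutschland?",
)
print(prompt)
```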