Update README.md
README.md
CHANGED
@@ -23,46 +23,13 @@ This model was created by [jphme](https://huggingface.co/jphme). It's a fine-tun
 
 # Model Details
 
-The model profile is stored in a file called config.aiml and can be used by AI inference tools to obtain information about the model, such as the correct prompt template and instructions on how to deploy and run it.
-The AIML Model Profile is stored in a file named "config.aiml" and contains all relevant configuration parameters, properties and rules for deploying the AI model securely and without hassle.
-
 ### General Information
 |Attribute|Details|
 |----------------------------|--------------------------------------------------------------------------------------------------------------|
-| **ID** | 1 |
 | **Name** | [jphme/Llama-2-13b-chat-german](https://huggingface.co/jphme/Llama-2-13b-chat-german) |
 | **Creator** | [jphme](https://huggingface.co/jphme) |
-| **Source
+| **Source** | https://huggingface.co/ |
-
-### Model Specifications
-|Attribute|Details|
-|----------------------------|--------------------------------------------------------------------------------------------------------------|
-| **Type** | Large Language Model |
-| **Pipeline** | Text Generation |
-| **Architecture** | Transformers |
-| **Variables** | {"llm_languages":"en,de,nl,it,fr", "llm_flavor":"llama", "llm_prompt_template":"llama2", "devices":"gpu[0,1,2,3],cpu[0]", "key":"value"} |
-| **Filetype** | GGUF |
-| **InferenceTools** | Llama.cpp, Text Generation Inference (TGI), h2oGPT Server, KoboldCpp, Custom |
-| **Compression** | 8 Bit, 5 Bit (K_M), 4 Bit (K_M) |
-| **CompressionMethod** | llama.cpp - convert.py script |
-| **Notes** | First, an FP16 GGUF file was generated, which was then quantized to 8, 5 (K_M) and 4 (K_M) Bit with llama.cpp/quantize |
-
-### Customization
-|Attribute|Details|
-|----------------------------|-----------------------------------------------------------------------------------------------------------------|
-| **Type** | finetune_full (e.g. none, finetune_lora, finetune_qlora, finetune_full) |
-| **Class** | Instruct, Chat |
-| **Datasets** | [Proprietary German Conversation Dataset](https://placeholder.local/dataset), [German & German legal SQuAD](https://placeholder.local/dataset) |
-| **Notes** | The datasets were augmented with rows containing "wrong" contexts in order to improve factual RAG performance. |
 
-### Run Instructions
-|Attribute|Details|
-|----------------------------|-----------------------------------------------------------------------------------------------------------------|
-| **Start Model** | #!/bin/sh<br/>chmod +x run.sh && ./run.sh<br/># This is an example; a functioning run.sh script is to be published soon |
-| **Stop Model** | # Coming soon, todo |
 
 ## Deploy from source
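For orientation, the config.aiml profile described in the removed lines above is a machine-readable companion file. Its actual syntax is not shown anywhere in this README, so the following is only a hypothetical sketch, assuming a JSON-style layout and reusing the key-value pairs from the removed **Variables** row:

```
{
  "llm_languages": "en,de,nl,it,fr",
  "llm_flavor": "llama",
  "llm_prompt_template": "llama2",
  "devices": "gpu[0,1,2,3],cpu[0]",
  "key": "value"
}
```

Here "llama2" refers to the standard Llama 2 chat prompt format, i.e. turns wrapped as `[INST] <<SYS>> ... <</SYS>> ... [/INST]`.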
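The removed **Run Instructions** table only stubbed out run.sh. A minimal sketch of what such a start script might contain, assuming the llama.cpp `main` binary built in the deploy steps below; the model filename and flag values are assumptions, not the promised script:

```
#!/bin/sh
# Hypothetical run.sh sketch: start an interactive llama.cpp chat session
# with one of the quantized files. Path and flag values are assumptions.
MODEL=./converted_gguf/Llama-2-13b-chat-german-GGUF.q4_K_M.bin
./llama.cpp/main -m "$MODEL" --ctx-size 4096 --n-gpu-layers 40 --interactive
```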
@@ -77,8 +44,7 @@ cd llama.cpp && make
 
 # This command converts the original model to GGUF format with FP16 precision. Make sure to change the file paths and model names to match your setup.
 python3 llama.cpp/convert.py ./original-models/Llama-2-13b-chat-german --outtype f16 --outfile ./converted_gguf/Llama-2-13b-chat-german-GGUF.fp16.bin
 ```
-4.
-5. The converted GGUF model with FP16 precision will then be used to do further quantization to 8 Bit, 5 Bit (K_M) and 4 Bit (K_M).
+3. The converted GGUF model with FP16 precision will then be used for further quantization to 8 Bit, 5 Bit (K_M) and 4 Bit (K_M).
 
 ```
 # 2. Convert original model to GGUF format with FP16 precision
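The quantization step just mentioned maps onto llama.cpp's `quantize` tool roughly as follows; a sketch, assuming the FP16 filename from the conversion step and analogous names for the outputs:

```
# Quantize the FP16 GGUF file to the three published precisions (sketch;
# output filenames are assumptions modeled on the FP16 name above).
FP16=./converted_gguf/Llama-2-13b-chat-german-GGUF.fp16.bin
./llama.cpp/quantize "$FP16" ./converted_gguf/Llama-2-13b-chat-german-GGUF.q8_0.bin Q8_0
./llama.cpp/quantize "$FP16" ./converted_gguf/Llama-2-13b-chat-german-GGUF.q5_K_M.bin Q5_K_M
./llama.cpp/quantize "$FP16" ./converted_gguf/Llama-2-13b-chat-german-GGUF.q4_K_M.bin Q4_K_M
```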
@@ -90,33 +56,6 @@ python3 llama.cpp/convert.py ./original-models/Llama-2-13b-chat-german --outtype
 
 ```
 ___
 
-## Responsible AI Pledge
-We are utilizing open-source AI in a responsible and inclusive manner, and we encourage you to do the same. We have crafted some guidelines that we strictly follow when deploying AI services. Please take a moment to read, understand and follow these rules:
-
-### Purpose and Limitations
-The Llama 2 13b Chat German - GGUF model is aimed at processing and understanding the German language better than the original Llama 2 13b Chat model. While it has been trained on a German conversation dataset, as well as on German SQuAD and German legal SQuAD data, it does not guarantee perfect accuracy or understanding in all contexts.
-
-### Ethical Use
-We urge users to employ this technology responsibly. Avoid using the model for purposes that may harm, mislead, or discriminate against individuals or groups. Respect privacy and avoid sharing personal or confidential information when interacting with the model.
-
-### Continuous Learning
-Like all AI models, the Llama 2 13b Chat German model is the result of continuous learning and improvement. Results and responses may vary, and there may be occasional errors or inaccuracies.
-
-### No Substitution for Expertise
-While the model can offer information on a variety of topics, including legal ones, it should not be considered a substitute for professional advice. Always consult experts when making critical decisions.
-
-### Model Modifications
-The provided model has undergone various modifications and quantizations. Understand the technical details and implications of these changes before use.
-
-### Feedback and Corrections
-We encourage users to provide feedback on model inaccuracies. Collaboration and user feedback are vital for the continuous improvement of open-source AI models.
-
-### Acknowledgment
-Ensure you credit the appropriate sources when using or referring to a model. Provide links and information on the training process, the datasets, and everything you can publicly disclose.
-
-By accessing and using the Llama 2 13b Chat German GGUF model, you acknowledge the above guidelines and commit to responsible and ethical use of this technology. Let's work together to ensure AI benefits humanity while minimizing potential risks.
-Thank you.
-
 # Original Model Card
 
 This is the original model card from [jphme/Llama-2-13b-chat-german](https://huggingface.co/jphme/Llama-2-13b-chat-german):