---
language:
- ru
base_model:
- t-tech/T-lite-it-1.0
tags:
- llama-cpp
---

# T-lite-it-1.0-Q8_0-GGUF

**🚨 T-lite is designed for further fine-tuning and is not intended as a ready-to-use conversational assistant. Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.**

## Description

This repository contains the [`T-lite-it-1.0`](https://huggingface.co/t-tech/T-lite-it-1.0/) model quantized to the GGUF format with [`llama.cpp`](https://github.com/ggerganov/llama.cpp).
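
The quantized file can also be fetched directly with the Hugging Face CLI before running the local examples below. A minimal sketch, using the repository and filename from the usage examples in this card:

```bash
# Download the GGUF file from this repository into the current directory
huggingface-cli download t-tech/T-lite-it-1.0-Q8_0-GGUF t-lite-it-1.0-q8_0.gguf --local-dir .
```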

## 📊 Benchmarks

| Benchmark | T-lite-it-1.0 | T-lite-it-1.0-Q8_0 | Qwen-2.5-7B-Instruct | GigaChat Pro 1.0.26.15 | RuAdapt-Qwen-7B-Instruct-v1 | gemma-2-9b-it |
|---------------|:-------------:|:------------------:|:--------------------:|:----------------------:|:---------------------------:|:-------------:|
| Arena-Hard-Ru | **metric** | metric | 54.29 | - | 52.77 | 47.83 |

## Llama.cpp usage

### Server

From HF:

```bash
llama-server --hf-repo t-tech/T-lite-it-1.0-Q8_0-GGUF --hf-file t-lite-it-1.0-q8_0.gguf -c 8192
```

Or locally:

```bash
./build/bin/llama-server -m t-lite-it-1.0-q8_0.gguf -c 8192
```
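
For a quick one-off check without running a server, llama.cpp also ships a `llama-cli` binary that accepts the same `--hf-repo`/`--hf-file` flags. A hedged sketch, assuming a recent llama.cpp build:

```bash
# Conversation mode (-cnv) applies the chat template bundled in the GGUF metadata
llama-cli --hf-repo t-tech/T-lite-it-1.0-Q8_0-GGUF --hf-file t-lite-it-1.0-q8_0.gguf -c 8192 -cnv
```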

### POST

```bash
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "<|im_start|>user\nРасскажи мне чем отличается Python от C++?\n<|im_end|>\n<|im_start|>assistant\n",
        "n_predict": 256
    }'
```
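
Besides `/completion`, `llama-server` exposes an OpenAI-compatible endpoint that applies the chat template server-side, so the `<|im_start|>` markers do not need to be written by hand. A sketch against the same server:

```bash
curl --request POST \
    --url http://localhost:8080/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data '{
        "messages": [
            {"role": "user", "content": "Расскажи мне чем отличается Python от C++?"}
        ],
        "max_tokens": 256
    }'
```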

## Ollama usage

### Serve

```bash
ollama serve
```

### Run

From HF:

```bash
ollama run hf.co/t-tech/T-lite-it-1.0-Q8_0-GGUF
```

Or locally:

```bash
ollama create example -f Modelfile
ollama run example "Расскажи мне про отличия C++ и Python"
```

where `Modelfile` is

```bash
FROM ./t-lite-it-1.0-q8_0.gguf
```
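
The one-line `Modelfile` leaves the prompt format implicit. To pin it down explicitly, Ollama's `TEMPLATE` and `PARAMETER` directives can be added; a hedged sketch, where the template text is an assumption based on the ChatML-style markers in the POST example above, not taken from the model card:

```bash
FROM ./t-lite-it-1.0-q8_0.gguf

# ChatML-style template matching the <|im_start|>/<|im_end|> markers above (assumed)
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
# Stop generation at the end-of-turn marker
PARAMETER stop "<|im_end|>"
```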