Text Generation
Transformers
GGUF
English
llama
Inference Endpoints
aashish1904 committed on
Commit def280b
1 Parent(s): badfab8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
---
license: apache-2.0
datasets:
- gair-prox/FineWeb-pro
language:
- en
tags:
- llama
pipeline_tag: text-generation
library_name: transformers
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/FW-ProX-1.7B-GGUF
This is a quantized version of [gair-prox/FW-ProX-1.7B](https://huggingface.co/gair-prox/FW-ProX-1.7B), created using llama.cpp.
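
A minimal usage sketch (not part of the original card), assuming the `llama-cpp-python` bindings; the `filename` pattern below is an assumption, so check the repository's file list for the actual GGUF quantization names.

```python
# Minimal sketch, assuming `pip install llama-cpp-python`.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/FW-ProX-1.7B-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant name; pick one that exists in the repo
    n_ctx=2048,               # context window
)

output = llm("The quick brown fox", max_tokens=64)
print(output["choices"][0]["text"])
```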

# Original Model Card


# FW-ProX-1.7B

<p align="center">
<img src="prox-teaser.png">
</p>

[ArXiv](https://arxiv.org/abs/2409.17115) | [Models](https://huggingface.co/gair-prox/FW-ProX-1.7B) | [Data](https://huggingface.co/datasets/gair-prox/FineWeb-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

**FW-ProX-1.7B** is a small language model trained on [FineWeb-pro](https://huggingface.co/datasets/gair-prox/FineWeb-pro) for 50B tokens.
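
For the full-precision weights, a minimal loading sketch with the `transformers` library declared in the metadata above (this example is an illustration added here, not part of the original card):

```python
# Minimal sketch, assuming `pip install transformers torch`.
from transformers import pipeline

# Loads the full-precision base model; the GGUF files above are meant for llama.cpp instead.
generator = pipeline("text-generation", model="gair-prox/FW-ProX-1.7B")
print(generator("The quick brown fox", max_new_tokens=64)[0]["generated_text"])
```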

## Evaluations

ProX models are evaluated on 10 language model benchmarks in a zero-shot setting.

|      | ARC-c | ARC-e | CSQA | HellaS | MMLU | OBQA | PiQA | SIQA | WinoG | SciQ | AVG  |
|------|-------|-------|------|--------|------|------|------|------|-------|------|------|
| raw  | 28.5  | 52.6  | 33.9 | 53.2   | 29.8 | 32.6 | 72.9 | 40.2 | 53.0  | 77.1 | 47.4 |
| ours | 34.4  | 63.9  | 32.6 | 53.0   | 33.1 | 34.4 | 73.1 | 39.3 | 52.7  | 81.5 | 49.8 |

### Citation
```
@article{zhou2024programming,
  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
  journal={arXiv preprint arXiv:2409.17115},
  year={2024}
}
```