---
license: apache-2.0
datasets:
- gair-prox/FineWeb-pro
language:
- en
tags:
- llama
pipeline_tag: text-generation
library_name: transformers
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/FW-ProX-1.7B-GGUF

This is a quantized version of [gair-prox/FW-ProX-1.7B](https://huggingface.co./gair-prox/FW-ProX-1.7B) created using llama.cpp.
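
The GGUF files run on any llama.cpp-compatible runtime. Below is a minimal sketch using the `llama-cpp-python` bindings; the quantization filename pattern is an assumption, so check the repository's file list for the variants actually published.

```python
# Minimal sketch: run a GGUF quantization of FW-ProX-1.7B with llama-cpp-python.
# NOTE: the "*Q4_K_M.gguf" pattern is an assumption; substitute a filename that
# actually exists in the QuantFactory/FW-ProX-1.7B-GGUF file list.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/FW-ProX-1.7B-GGUF",
    filename="*Q4_K_M.gguf",  # glob pattern matched against the repo's files
    n_ctx=2048,               # context window for this session
)

out = llm("High-quality pre-training data", max_tokens=32)
print(out["choices"][0]["text"])
```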
# Original Model Card

# FW-ProX-1.7B

<p align="center">
<img src="prox-teaser.png">
</p>

[ArXiv](https://arxiv.org/abs/2409.17115) | [Models](https://huggingface.co./gair-prox/FW-ProX-1.7B) | [Data](https://huggingface.co./datasets/gair-prox/FineWeb-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

**FW-ProX-1.7B** is a small language model. It was trained on the [FineWeb-pro](https://huggingface.co./datasets/gair-prox/FineWeb-pro) corpus for 50B tokens.
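
Since the card declares `library_name: transformers` and a `llama` tag, the original checkpoint should load through the standard auto classes. A minimal sketch (the prompt text is only an illustration):

```python
# Minimal sketch: greedy generation with the original FW-ProX-1.7B checkpoint,
# assuming standard AutoModelForCausalLM support per the card's metadata.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gair-prox/FW-ProX-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Pre-training data quality matters because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```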
## Evaluations

ProX models are evaluated on 10 language model benchmarks in a zero-shot setting; a sketch of one way to reproduce this kind of evaluation follows the table.
| Data | ARC-c | ARC-e | CSQA | HellaS | MMLU | OBQA | PiQA | SIQA | WinoG | SciQ | AVG  |
|------|-------|-------|------|--------|------|------|------|------|-------|------|------|
| raw  | 28.5  | 52.6  | 33.9 | 53.2   | 29.8 | 32.6 | 72.9 | 40.2 | 53.0  | 77.1 | 47.4 |
| ours | 34.4  | 63.9  | 32.6 | 53.0   | 33.1 | 34.4 | 73.1 | 39.3 | 52.7  | 81.5 | 49.8 |
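
The card does not state which harness produced these numbers. As a hedged sketch, a comparable zero-shot run can be assembled with EleutherAI's `lm-evaluation-harness`; task names differ between harness releases, so treat the list below as an approximation.

```python
# Hedged sketch: zero-shot evaluation over a subset of the table's benchmarks
# using lm-evaluation-harness. The task names are assumptions tied to the
# installed harness version and may need adjusting.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=gair-prox/FW-ProX-1.7B",
    tasks=["arc_challenge", "arc_easy", "hellaswag", "piqa", "winogrande", "sciq"],
    num_fewshot=0,  # zero-shot, matching the card's setting
)
print(results["results"])
```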
### Citation

```
@article{zhou2024programming,
  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
  journal={arXiv preprint arXiv:2409.17115},
  year={2024}
}
```