	---
	library_name: transformers
	tags: []
	---

	# gemma-2-mitra-it-int8
	This is an INT8-quantized version of gemma-2-mitra-it: https://huggingface.co/buddhist-nlp/gemma-2-mitra-it
	The quantization was done with LLM Compressor: https://github.com/vllm-project/llm-compressor

	The template for prompting the model is this:
	```
	Please translate into <target_language>: <input_sentence> 🔽 Translation::
	```
	Line breaks in the input text should be replaced with the '🔽' character before running generation. '#' is used as a stop token, so generation should end at the first '#'.
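The prompt rules above can be sketched in plain Python. This is a minimal illustration, not code from the model authors: the helper names `build_prompt` and `trim_at_stop` are hypothetical, and the normalisation of `\r\n` and `\r` line endings is an added assumption.

```python
LINE_BREAK_MARKER = "🔽"  # stands in for line breaks, per the template above
STOP_TOKEN = "#"          # the model card names '#' as the stop token


def build_prompt(text: str, target_language: str) -> str:
    """Fill the prompt template (hypothetical helper, not part of any library)."""
    # Assumption: treat Windows and old Mac line endings like '\n'.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace every line break with the marker character.
    flattened = normalized.replace("\n", LINE_BREAK_MARKER)
    return (
        f"Please translate into {target_language}: "
        f"{flattened} {LINE_BREAK_MARKER} Translation::"
    )


def trim_at_stop(generated: str) -> str:
    """Keep only the text before the first stop token (hypothetical helper)."""
    return generated.split(STOP_TOKEN, 1)[0].strip()
```

For example, `build_prompt("line one\nline two", "English")` joins the two input lines with '🔽' before filling the template. `trim_at_stop` is only needed when the inference backend does not already stop at '#'.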
	## Model Details
	For details on how to run this model, please see the gemma-2-9b repository: https://huggingface.co/google/gemma-2-9b
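As one possible way to run the model, the sketch below uses vLLM. It assumes that this LLM Compressor checkpoint loads in vLLM, which this model card does not confirm. The function names, greedy decoding (`temperature=0.0`) and the 256-token limit are illustrative choices, not settings recommended by the authors.

```python
MODEL_ID = "buddhist-nlp/gemma-2-mitra-it-int8"


def sampling_settings(max_tokens: int = 256) -> dict:
    """Illustrative decoding settings; only the '#' stop string comes from the model card."""
    return {"temperature": 0.0, "max_tokens": max_tokens, "stop": ["#"]}


def translate(prompts: list[str], max_tokens: int = 256) -> list[str]:
    """Generate one completion per prompt with vLLM (assumed to support this checkpoint)."""
    # Imported lazily so the module can be loaded without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=MODEL_ID)
    params = SamplingParams(**sampling_settings(max_tokens))
    outputs = llm.generate(prompts, params)
    # Each request returns a list of candidates; take the first one.
    return [request.outputs[0].text.strip() for request in outputs]
```

Each prompt passed to `translate` should already follow the template given earlier, with line breaks replaced by '🔽'.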