	---
	license: apache-2.0
	language:
	- en
	- fr
	- es
	- de
	pipeline_tag: text-generation
	tags:
	- chat
	- conversational
	- PyTorch
	- Transformers
	- text-generation-inference
	datasets:
	- EleutherAI/pile
	---

	# Phys 3b

	Phys 3b is a 3-billion-parameter autoregressive language model trained on selected excerpts from the Pile. The training data consisted largely of English, with some Spanish, French, and German.

	## Model Details

	- Developed by: Ruben Roy
	- Parameters: 3 billion
	- Languages: English (primary), with some Spanish, French, and German
	- Model Type: Transformer-based language model

	## Usage

	You can generate a chat response from Phys 3b using the `transformers` library as follows:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "rubenroy/phys-3b"

	tokenizer = AutoTokenizer.from_pretrained(model_name)
	# Load in half precision on the GPU; this requires a CUDA-capable device.
	model = AutoModelForCausalLM.from_pretrained(model_name).half().eval().cuda()

	# Chat prompt: each turn is wrapped as <|startoftoken|>role ... <|endoftoken|>.
	input_text = """
	<|startoftoken|>system
	You are a helpful assistant<|endoftoken|><|startoftoken|>human
	How tall is the Eiffel tower?<|endoftoken|><|startoftoken|>assistant

	"""
	inputs = tokenizer(input_text, return_tensors="pt").to(0)
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=256,  # example value; tune for your use case
	    do_sample=True,
	    top_k=50,  # example value
	    temperature=0.7,  # example value
	    pad_token_id=tokenizer.eos_token_id,
	)

	# truncate_before_pattern is specific to CodeGen-style tokenizers.
	output = tokenizer.decode(outputs[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"])
	print(output)
	```
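
For multi-turn conversations it can help to assemble the prompt programmatically instead of writing it by hand. The following is a minimal sketch, assuming the chat format is exactly the one in the usage example: special tokens `<|startoftoken|>` and `<|endoftoken|>`, and the roles `system`, `human`, and `assistant`. `build_prompt` is a hypothetical helper, not part of this repository. The sketch omits the leading newline and trailing blank line of the example string; this card does not say whether they matter to the model.

```python
# Assumed special tokens, copied from the prompt in the usage example.
START = "<|startoftoken|>"
END = "<|endoftoken|>"


def build_prompt(messages):
    """Format (role, content) pairs and leave an open assistant turn.

    Hypothetical helper; the format is inferred from the usage example.
    """
    turns = [f"{START}{role}\n{content}{END}" for role, content in messages]
    # End with an unclosed assistant header so the model writes the reply.
    return "".join(turns) + f"{START}assistant\n"


prompt = build_prompt([
    ("system", "You are a helpful assistant"),
    ("human", "How tall is the Eiffel tower?"),
])
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written `input_text`.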

	Experiment with different decoding methods and parameters to get the best results for your use case. Adjusting `temperature` and `repetition_penalty` in particular can noticeably change output quality.
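
To build intuition for these two parameters, here is a simplified, pure-Python sketch of how they are commonly applied to next-token scores (logits). It is an illustration under stated assumptions, not the `transformers` implementation: the function names and toy logits are hypothetical. Temperature divides the logits before the softmax. The repetition penalty lowers the score of tokens that have already appeared.

```python
import math


def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Lower the scores of already-seen tokens (penalty > 1 discourages repeats)."""
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        score = adjusted[token_id]
        # Positive scores are divided, negative ones multiplied: both move lower.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, -1.0]  # toy scores for a 3-token vocabulary
sharp = softmax_with_temperature(logits, 0.5)  # more deterministic
flat = softmax_with_temperature(logits, 2.0)   # more diverse
penalized = apply_repetition_penalty(logits, seen_token_ids=[0, 2], penalty=1.5)
```

In this toy case the top token receives a larger share of the probability at temperature 0.5 than at 2.0, and the two previously seen tokens end up with lower scores after the penalty.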

	## Considerations for Using the Model

	### Intended Use

	This model may be used by anyone as a foundational model for application-specific fine-tuning. The model is trained specifically for conversation.

	### Limitations

	The pre-training dataset may have contained offensive or inappropriate content, and such content may be reflected in model-generated text. Users should exercise reasonable caution when using the model in production systems. Do not use it for any application that may cause harm or distress to individuals or groups.

	## Additional Information

	### Licensing Information

	The model is released under the Apache 2.0 License. Please refer to the license for usage rights and restrictions.