mosaicml/mpt-7b-chat

@@ -119,6 +119,22 @@ from transformers import AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
 ```
+The model can then be used, for example, within a text-generation pipeline.
+Note: when running PyTorch modules in lower precision, it is best practice to wrap inference in the [torch.autocast context manager](https://pytorch.org/docs/stable/amp.html).
+```python
+import torch
+from transformers import pipeline
+pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, device='cuda:0')
+with torch.autocast('cuda', dtype=torch.bfloat16):
+    print(
+        pipe('Here is a recipe for vegan banana bread:\n',
+            max_new_tokens=100,
+            do_sample=True,
+            use_cache=True))
+```
 ## Model Description
 The architecture is a modification of a standard decoder-only transformer.