patrickvonplaten
committed on
Merge branch 'main' of https://huggingface.co./facebook/opt-350m into main
README.md CHANGED
@@ -9,11 +9,11 @@ license: mit

# OPT : Open Pre-trained Transformer Language Models

- Feel free to test the whole generation capabilities here: https://transformer.huggingface.co/doc/opt-30b.
+ Feel free to test the whole generation capabilities here: https://transformer.huggingface.co/doc/opt-30b.

- The models were pretrained on the English language using a causal language modeling (CLM) objective. It was first introduced in [
+ The models were pretrained on the English language using a causal language modeling (CLM) objective. OPT was first introduced in [Open Pre-trained Transformer Language Models](https://arxiv.org/pdf/2205.01068.pdf) and first released in the [metaseq repository](https://github.com/facebookresearch/metaseq) on May 3rd, 2022 by the Meta AI team.

- Disclaimer
+ **Disclaimer**: The team releasing OPT also wrote a
[model card](https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/model_card.md) for their model, which is available in appendix D of their paper. Content from this model card
has been written by the Hugging Face team to complete the information they provided and to give specific examples of how to use the model, as well as of its various biases.

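The paragraph above describes OPT as trained with a causal language modeling (CLM) objective, i.e. it predicts each token from the tokens to its left, which is what makes it directly usable for text generation. A minimal sketch of that idea with the `OPTForCausalLM` class imported in the snippet further down might look as follows; the prompt and `max_length` value are illustrative assumptions, not part of the diff:

```python
>>> from transformers import GPT2Tokenizer, OPTForCausalLM

>>> # the card pairs the OPT-350m checkpoint with a GPT-2 style BPE tokenizer
>>> tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
>>> model = OPTForCausalLM.from_pretrained("facebook/opt-350m")

>>> # causal LM: generate a continuation token by token, left to right
>>> inputs = tokenizer("Hello, I'm a language model,", return_tensors="pt")
>>> generated_ids = model.generate(**inputs, do_sample=True, max_length=30)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
```

With `do_sample=True` the continuation varies between runs, which is why the pipeline snippet in the next hunks also imports `set_seed`.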
@@ -38,8 +38,7 @@ You can use the raw model for text generation or fine-tune it to a downstream ta

### How to use

- You can use this model directly with a pipeline for text generation.
- set a seed for reproducibility:
+ You can use this model directly with a pipeline for text generation. Generation is deterministic by default, so in order to use top-k sampling, `do_sample` is set to `True`.

```python
>>> from transformers import pipeline, set_seed, OPTForCausalLM, GPT2Tokenizer
@@ -52,15 +51,48 @@ set a seed for reproducibility:
[{'generated_text': "Hello, I'm a language model, and I'm interested in learning more about the language model.\n\nI'm a language model, and I"}]
```

- Here is how to use this model to get the
+ Here is how to use this model to get the hidden states of a given text in PyTorch:

```python
- from transformers import GPT2Tokenizer, OPTModel
- tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
- model = OPTModel.from_pretrained("facebook/opt-350m")
- text = "
- encoded_input = tokenizer(text, return_tensors='pt')
- output = model(**encoded_input)
+ >>> from transformers import GPT2Tokenizer, OPTModel
+ >>> tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
+ >>> model = OPTModel.from_pretrained("facebook/opt-350m")
+ >>> text = "I am happy to be releasing a new model!"
+ >>> encoded_input = tokenizer(text, return_tensors='pt')
+ >>> output = model(**encoded_input)
+ BaseModelOutputWithPast(last_hidden_state=tensor([[[-2.4159,  0.7136, -4.6705,  ..., -1.3857,  0.4758, -1.5518],
+          [-1.4122, -2.0026, -9.4849,  ...,  1.3589,  3.1777,  0.8622],
+          [ 0.8425, -5.9863, -5.7204,  ...,  2.2054,  4.3147,  0.2039],
+          ...,
+          [-0.5943, -0.9686, -2.3670,  ...,  6.7386, -4.5704,  3.1795],
+          [ 0.0582, -5.4449, -3.1305,  ...,  3.9461, -2.2183,  1.1721],
+          [ 0.0547, -4.1437, -0.1780,  ..., -0.1648,  0.7273,  0.7006]]],
+        grad_fn=<UnsafeViewBackward0>), past_key_values=((tensor([[[[-0.4485,  0.4126,  0.3829,  ..., -0.4228,  0.5844,  0.4145],
+           [-0.8542,  0.8587,  0.8495,  ..., -0.8048,  0.7143,  0.8142],
+           [-0.6921,  0.6961,  0.6502,  ..., -0.6523,  0.5810,  0.6708],
+           ...,
+           [-0.6822,  0.6847,  0.6880,  ..., -0.6225,  0.5817,  0.6720],
+           [-0.7208,  0.7355,  0.6723,  ..., -0.6821,  0.6895,  0.7070],
+           [-0.6217,  0.6276,  0.6367,  ..., -0.5950,  0.5609,  0.6075]],
+
+          [[-0.0373, -0.4824,  0.0290,  ..., -0.5359,  0.5350,  0.1365],
+           [ 0.8295, -0.3887, -0.7507,  ..., -0.2576, -1.1691,  0.6727],
+           [ 0.5611, -0.3490, -0.5395,  ..., -0.2822, -0.7972,  0.5236],
+           ...,
+           [ 0.4013, -0.2377, -0.3478,  ..., -0.1679, -0.5556,  0.4043],
+           [ 0.5444, -0.3821, -0.4555,  ..., -0.2781, -0.6267,  0.4551],
+           [ 0.2731, -0.1157, -0.2134,  ..., -0.0131, -0.3230,  0.2420]],
+
+          [[-0.8761,  0.8668,  0.8488,  ..., -0.7307, -0.8133,  0.7668],
+           [-0.6488,  0.7369,  0.7716,  ..., -0.8711, -0.6874,  0.7305],
+           [-0.6605,  0.7629,  0.7675,  ..., -0.7790, -0.6908,  0.7493],
+           ...,
+           [-0.6542,  0.7252,  0.7787,  ..., -0.7739, -0.6742,  0.7018],
+           [-0.7012,  0.7739,  0.8003,  ..., -0.8420, -0.7059,  0.7675],
+           [-0.5077,  0.5662,  0.6203,  ..., -0.7885, -0.5262,  0.5924]],
+
+          ...,
+          ]]], hidden_states=None, attentions=None)
```

### Limitations and bias
|