Commit 9694c14 by patrickvonplaten (2 parents: d589d0a, 6958223)

Merge branch 'main' of https://huggingface.co./facebook/opt-350m into main

Files changed (1):
  1. README.md (+44, -12)
README.md CHANGED
@@ -9,11 +9,11 @@ license: mit

# OPT : Open Pre-trained Transformer Language Models

- Feel free to test the whole generation capabilities here: https://transformer.huggingface.co/doc/opt-30b. This will be made available in the near future.
+ Feel free to test the whole generation capabilities here: https://transformer.huggingface.co/doc/opt-30b.

- The models were pretrained on the English language using a causal language modeling (CLM) objective. It was first introduced in [META AI's paper](https://arxiv.org/pdf/2205.01068.pdf) and was first released [here](https://github.com/facebookresearch/metaseq) on May 3rd 2022.
+ The models were pretrained on the English language using a causal language modeling (CLM) objective. They were first introduced in [Open Pre-trained Transformer Language Models](https://arxiv.org/pdf/2205.01068.pdf) and first released in the [metaseq repository](https://github.com/facebookresearch/metaseq) on May 3rd, 2022 by the META AI team.

- Disclaimer: The team releasing OPT also wrote a
+ **Disclaimer**: The team releasing OPT also wrote a
[model card](https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/model_card.md) for their model, which is available in appendix D of their paper. Content from this model card
has been written by the Hugging Face team to complete the information they provided and to give specific examples of how to use the model and of its various biases.

@@ -38,8 +38,7 @@ You can use the raw model for text generation or fine-tune it to a downstream task.

### How to use

- You can use this model directly with a pipeline for text generation. Since the generation relies on some randomness, we
- set a seed for reproducibility:
+ You can use this model directly with a pipeline for text generation. Generation is deterministic by default, so to use top-k sampling `do_sample` is set to `True`.

```python
>>> from transformers import pipeline, set_seed, OPTForCausalLM, GPT2Tokenizer
@@ -52,15 +51,48 @@ set a seed for reproducibility:
[{'generated_text': "Hello, I'm a language model, and I'm interested in learning more about the language model.\n\nI'm a language model, and I"}]
```

- Here is how to use this model to get the features of a given text in PyTorch:
+ Here is how to use this model to get the hidden states of a given text in PyTorch:

```python
- from transformers import GPT2Tokenizer, OPTModel
- tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
- model = OPTModel.from_pretrained("facebook/opt-350m")
- text = "Replace me by any text you'd like."
- encoded_input = tokenizer(text, return_tensors='pt')
- output = model(**encoded_input)
+ >>> from transformers import GPT2Tokenizer, OPTModel
+ >>> tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
+ >>> model = OPTModel.from_pretrained("facebook/opt-350m")
+ >>> text = "I am happy to be releasing a new model!"
+ >>> encoded_input = tokenizer(text, return_tensors='pt')
+ >>> output = model(**encoded_input)
+ BaseModelOutputWithPast(last_hidden_state=tensor([[[-2.4159, 0.7136, -4.6705, ..., -1.3857, 0.4758, -1.5518],
+ [-1.4122, -2.0026, -9.4849, ..., 1.3589, 3.1777, 0.8622],
+ [ 0.8425, -5.9863, -5.7204, ..., 2.2054, 4.3147, 0.2039],
+ ...,
+ [-0.5943, -0.9686, -2.3670, ..., 6.7386, -4.5704, 3.1795],
+ [ 0.0582, -5.4449, -3.1305, ..., 3.9461, -2.2183, 1.1721],
+ [ 0.0547, -4.1437, -0.1780, ..., -0.1648, 0.7273, 0.7006]]],
+ grad_fn=<UnsafeViewBackward0>), past_key_values=((tensor([[[[-0.4485, 0.4126, 0.3829, ..., -0.4228, 0.5844, 0.4145],
+ [-0.8542, 0.8587, 0.8495, ..., -0.8048, 0.7143, 0.8142],
+ [-0.6921, 0.6961, 0.6502, ..., -0.6523, 0.5810, 0.6708],
+ ...,
+ [-0.6822, 0.6847, 0.6880, ..., -0.6225, 0.5817, 0.6720],
+ [-0.7208, 0.7355, 0.6723, ..., -0.6821, 0.6895, 0.7070],
+ [-0.6217, 0.6276, 0.6367, ..., -0.5950, 0.5609, 0.6075]],
+
+ [[-0.0373, -0.4824, 0.0290, ..., -0.5359, 0.5350, 0.1365],
+ [ 0.8295, -0.3887, -0.7507, ..., -0.2576, -1.1691, 0.6727],
+ [ 0.5611, -0.3490, -0.5395, ..., -0.2822, -0.7972, 0.5236],
+ ...,
+ [ 0.4013, -0.2377, -0.3478, ..., -0.1679, -0.5556, 0.4043],
+ [ 0.5444, -0.3821, -0.4555, ..., -0.2781, -0.6267, 0.4551],
+ [ 0.2731, -0.1157, -0.2134, ..., -0.0131, -0.3230, 0.2420]],
+
+ [[-0.8761, 0.8668, 0.8488, ..., -0.7307, -0.8133, 0.7668],
+ [-0.6488, 0.7369, 0.7716, ..., -0.8711, -0.6874, 0.7305],
+ [-0.6605, 0.7629, 0.7675, ..., -0.7790, -0.6908, 0.7493],
+ ...,
+ [-0.6542, 0.7252, 0.7787, ..., -0.7739, -0.6742, 0.7018],
+ [-0.7012, 0.7739, 0.8003, ..., -0.8420, -0.7059, 0.7675],
+ [-0.5077, 0.5662, 0.6203, ..., -0.7885, -0.5262, 0.5924]],
+
+ ...,
+ ]]], hidden_states=None, attentions=None)
```

### Limitations and bias
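
The second hunk shows only the first line of the updated generation example, so the pipeline call itself is not visible in this diff. Below is a minimal sketch of the usage that sentence describes, assuming the standard `transformers` text-generation pipeline; the seed value and prompt are illustrative and are not taken from this commit:

```python
>>> from transformers import pipeline, set_seed

>>> # The seed only matters because do_sample=True makes generation stochastic.
>>> set_seed(32)

>>> # do_sample=True switches from deterministic (greedy) decoding to top-k sampling.
>>> generator = pipeline("text-generation", model="facebook/opt-350m", do_sample=True)
>>> generator("Hello, I'm a language model,")
```

Each call returns a list of dicts with a `'generated_text'` key, matching the example output shown in the third hunk.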
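
The import line in the diff also brings in `OPTForCausalLM`, but the visible context never exercises it. As a hedged illustration only, the sketch below follows the usual `transformers` causal-LM `generate()` API; none of these calls appear in the commit itself:

```python
>>> from transformers import GPT2Tokenizer, OPTForCausalLM

>>> # Same tokenizer repository as in the hidden-states example above.
>>> tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
>>> model = OPTForCausalLM.from_pretrained("facebook/opt-350m")

>>> # Sample a short continuation; do_sample=True mirrors the behaviour described in the README text.
>>> inputs = tokenizer("Hello, I'm a language model,", return_tensors="pt")
>>> generated_ids = model.generate(**inputs, do_sample=True, max_length=30)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
```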