AutoModels

In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained method.

AutoClasses are here to do this job for you so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary:

Instantiating one of AutoModel, AutoConfig and AutoTokenizer will directly create an object of the class for the relevant architecture (e.g., model = AutoModel.from_pretrained('bert-base-cased') will create an instance of BertModel).

AutoConfig

class transformers.AutoConfig

AutoConfig is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.

The from_pretrained() method takes care of returning the correct configuration class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)

Instantiates one of the configuration classes of the library from a pre-trained model configuration.

The configuration class to instantiate is selected based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

Parameters
  • pretrained_model_name_or_path (string) –

    Is either:
    • a string with the shortcut name of a pre-trained model configuration to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model configuration that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing a configuration file saved using the save_pretrained() method, e.g.: ./my_model_directory/.

    • a path or url to a saved configuration JSON file, e.g.: ./my_model_directory/configuration.json.

  • cache_dir (string, optional, defaults to None) – Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download (boolean, optional, defaults to False) – Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download (boolean, optional, defaults to False) – Do not delete incompletely received file. Attempt to resume the download if such a file exists.

  • proxies (Dict[str, str], optional, defaults to None) – A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. See the requests documentation for usage.

  • return_unused_kwargs (boolean, optional, defaults to False) –

    • If False, then this function returns just the final configuration object.

    • If True, then this function returns a tuple (config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes, i.e. the part of kwargs which has not been used to update config and is otherwise ignored.

  • kwargs (Dict[str, any], optional, defaults to {}) – Key/value pairs with which to update the configuration object after loading.

    • The values in kwargs of any keys which are configuration attributes will be used to override the loaded values.

    • Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
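The four accepted forms of pretrained_model_name_or_path can be told apart with ordinary path and URL checks. The following is a minimal sketch under that assumption, not the library's actual resolution logic; classify_source and its return labels are hypothetical names introduced only for illustration.

```python
import os
import tempfile
from urllib.parse import urlparse


def classify_source(name_or_path: str) -> str:
    """Label an argument according to the four forms listed above (illustrative only)."""
    if os.path.isdir(name_or_path):
        return "directory"
    if os.path.isfile(name_or_path):
        return "file"
    if urlparse(name_or_path).scheme in ("http", "https"):
        return "url"
    # Anything else is treated as a shortcut name or a user-uploaded identifier.
    return "identifier"


# Build a throwaway directory holding a configuration file to exercise the local cases.
with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "configuration.json")
    with open(json_path, "w") as handle:
        handle.write("{}")
    results = {
        "directory": classify_source(tmp),
        "file": classify_source(json_path),
    }

results["url"] = classify_source("https://example.com/configuration.json")
results["shortcut"] = classify_source("bert-base-uncased")
results["user"] = classify_source("dbmdz/bert-base-german-cased")
print(results)
```

Checking local paths first means a directory that happens to share its name with a shortcut would take precedence in this sketch; whether the library makes the same choice is not stated here.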

Examples:

from transformers import AutoConfig

config = AutoConfig.from_pretrained('bert-base-uncased')  # Download configuration from S3 and cache.
config = AutoConfig.from_pretrained('./test/bert_saved_model/')  # E.g. config (or model) was saved using `save_pretrained('./test/bert_saved_model/')`
config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')
# `output_attentions` is a configuration attribute, so it overrides the loaded value; `foo` is not, so it is ignored.
config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
assert config.output_attentions == True
config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True,
                                                   foo=False, return_unused_kwargs=True)
assert config.output_attentions == True
assert unused_kwargs == {'foo': False}
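The way kwargs are split between configuration overrides and unused leftovers can be sketched without the library. This is a simplified illustration of the behavior described for kwargs and return_unused_kwargs; ToyConfig and apply_kwargs are hypothetical names, and the real implementation may differ in detail.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class with two attributes."""

    def __init__(self) -> None:
        self.output_attentions = False
        self.hidden_size = 768


def apply_kwargs(config, return_unused_kwargs=False, **kwargs):
    """Override existing attributes; collect every other key/value pair."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key is a configuration attribute
        else:
            unused[key] = value  # key is unknown to the configuration
    if return_unused_kwargs:
        return config, unused
    return config


config, unused_kwargs = apply_kwargs(
    ToyConfig(), return_unused_kwargs=True, output_attentions=True, foo=False
)
print(config.output_attentions, unused_kwargs)
```

With return_unused_kwargs left at False, the same call returns only the configuration object and the unknown keys are dropped.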

AutoTokenizer

class transformers.AutoTokenizer

AutoTokenizer is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct tokenizer class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • t5: T5Tokenizer (T5 model)

  • distilbert: DistilBertTokenizer (DistilBert model)

  • albert: AlbertTokenizer (ALBERT model)

  • camembert: CamembertTokenizer (CamemBERT model)

  • xlm-roberta: XLMRobertaTokenizer (XLM-RoBERTa model)

  • longformer: LongformerTokenizer (AllenAI Longformer model)

  • roberta: RobertaTokenizer (RoBERTa model)

  • bert: BertTokenizer (Bert model)

  • openai-gpt: OpenAIGPTTokenizer (OpenAI GPT model)

  • gpt2: GPT2Tokenizer (OpenAI GPT-2 model)

  • transfo-xl: TransfoXLTokenizer (Transformer-XL model)

  • xlnet: XLNetTokenizer (XLNet model)

  • xlm: XLMTokenizer (XLM model)

  • ctrl: CTRLTokenizer (Salesforce CTRL model)

  • electra: ElectraTokenizer (Google ELECTRA model)

This class cannot be instantiated using __init__() (throws an error).

classmethod from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)

Instantiate one of the tokenizer classes of the library from a pre-trained model vocabulary.

The tokenizer class to instantiate is selected based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • t5: T5Tokenizer (T5 model)

  • distilbert: DistilBertTokenizer (DistilBert model)

  • albert: AlbertTokenizer (ALBERT model)

  • camembert: CamembertTokenizer (CamemBERT model)

  • xlm-roberta: XLMRobertaTokenizer (XLM-RoBERTa model)

  • longformer: LongformerTokenizer (AllenAI Longformer model)

  • roberta: RobertaTokenizer (RoBERTa model)

  • bert-base-japanese: BertJapaneseTokenizer (Bert model)

  • bert: BertTokenizer (Bert model)

  • openai-gpt: OpenAIGPTTokenizer (OpenAI GPT model)

  • gpt2: GPT2Tokenizer (OpenAI GPT-2 model)

  • transfo-xl: TransfoXLTokenizer (Transformer-XL model)

  • xlnet: XLNetTokenizer (XLNet model)

  • xlm: XLMTokenizer (XLM model)

  • ctrl: CTRLTokenizer (Salesforce CTRL model)

  • electra: ElectraTokenizer (Google ELECTRA model)
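The order of the list above matters once pattern matching is used, because several patterns are substrings of others (for example, bert appears inside distilbert, albert, camembert and roberta). A minimal sketch of an ordered, first-match lookup follows; resolve_tokenizer_name is a hypothetical helper that returns class names as strings, and the exact matching rules of the library are assumed rather than reproduced.

```python
from typing import Optional

# Ordered (pattern, tokenizer class name) pairs copied from the list above.
TOKENIZER_PATTERNS = [
    ("t5", "T5Tokenizer"),
    ("distilbert", "DistilBertTokenizer"),
    ("albert", "AlbertTokenizer"),
    ("camembert", "CamembertTokenizer"),
    ("xlm-roberta", "XLMRobertaTokenizer"),
    ("longformer", "LongformerTokenizer"),
    ("roberta", "RobertaTokenizer"),
    ("bert-base-japanese", "BertJapaneseTokenizer"),
    ("bert", "BertTokenizer"),
    ("openai-gpt", "OpenAIGPTTokenizer"),
    ("gpt2", "GPT2Tokenizer"),
    ("transfo-xl", "TransfoXLTokenizer"),
    ("xlnet", "XLNetTokenizer"),
    ("xlm", "XLMTokenizer"),
    ("ctrl", "CTRLTokenizer"),
    ("electra", "ElectraTokenizer"),
]


def resolve_tokenizer_name(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Prefer an exact model_type match, then fall back to substring matching in list order."""
    if model_type is not None:
        for pattern, class_name in TOKENIZER_PATTERNS:
            if pattern == model_type:
                return class_name
    for pattern, class_name in TOKENIZER_PATTERNS:
        if pattern in name_or_path:
            return class_name  # first match wins, so specific patterns must come first
    raise ValueError("Unrecognized model identifier: %s" % name_or_path)


for name in ("distilbert-base-uncased", "xlm-roberta-base", "dbmdz/bert-base-german-cased"):
    print(name, "->", resolve_tokenizer_name(name))
```

If bert were listed before distilbert, the first loop iteration that matched would return BertTokenizer for every name in the example, which is why the more specific patterns are placed earlier.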

Params:

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a predefined tokenizer to load from cache or download, e.g.: bert-base-uncased.

  • a string with the identifier name of a predefined tokenizer that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

  • a path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g.: ./my_model_directory/.

  • (not applicable to all derived classes) a path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (e.g. Bert, XLNet), e.g.: ./my_model_directory/vocab.txt.

cache_dir: (optional) string:

Path to a directory in which downloaded predefined tokenizer vocabulary files should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force to (re-)download the vocabulary files and override the cached versions if they exist.

resume_download: (optional) boolean, default False:

Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

use_fast: (optional) boolean, default False:

Indicate if transformers should try to load the fast version of the tokenizer (True) or use the Python one (False).

inputs: (optional) positional arguments: will be passed to the Tokenizer __init__ method.

kwargs: (optional) keyword arguments: will be passed to the Tokenizer __init__ method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the doc string of PreTrainedTokenizer for details.
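The interaction between the cache, force_download and resume_download can be pictured as a small decision function. The sketch below is an assumption-laden illustration, not the library's download code: plan_download is a hypothetical helper, the ".incomplete" suffix for partial files is an assumed naming convention, and resuming is modeled with a standard HTTP Range header.

```python
import os
import tempfile


def plan_download(cache_path, force_download=False, resume_download=False):
    """Return (action, headers) for a hypothetical cached download."""
    if os.path.exists(cache_path) and not force_download:
        return "use_cache", {}
    incomplete_path = cache_path + ".incomplete"  # assumed naming convention
    if resume_download and os.path.exists(incomplete_path):
        received = os.path.getsize(incomplete_path)
        # Ask the server only for the bytes that are still missing.
        return "resume", {"Range": "bytes=%d-" % received}
    return "download", {}


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "vocab.txt")
    first = plan_download(cache_path)

    # Simulate an interrupted transfer that left 10 bytes behind.
    with open(cache_path + ".incomplete", "wb") as handle:
        handle.write(b"x" * 10)
    resumed = plan_download(cache_path, resume_download=True)
    restarted = plan_download(cache_path, resume_download=False)

    # Simulate a completed download sitting in the cache.
    with open(cache_path, "wb") as handle:
        handle.write(b"complete")
    cached = plan_download(cache_path)
    forced = plan_download(cache_path, force_download=True)

print(first, resumed, restarted, cached, forced)
```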

Examples:

from transformers import AutoTokenizer

# Download vocabulary from S3 and cache.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Download vocabulary from S3 (user-uploaded) and cache.
tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')

# If vocabulary files are in a directory (e.g. tokenizer was saved using `save_pretrained('./test/bert_saved_model/')`)
tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')

AutoModel

class transformers.AutoModel

AutoModel is a generic model class that will be instantiated as one of the base model classes of the library when created with the AutoModel.from_pretrained(pretrained_model_name_or_path) or the AutoModel.from_config(config) class methods.

This class cannot be instantiated using __init__() (throws an error).
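The "factory only" design described here can be sketched in a few lines: the constructor refuses to run, and class methods return instances of other classes. ToyAutoModel and ToyBertModel are hypothetical names, and the exception type used is an assumption for illustration.

```python
class ToyBertModel:
    """Hypothetical concrete model class."""

    def __init__(self, config: dict) -> None:
        self.config = config


class ToyAutoModel:
    """Hypothetical factory class that is never instantiated itself."""

    _registry = {"bert": ToyBertModel}

    def __init__(self) -> None:
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using "
            "ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config: dict):
        # Look up the concrete class and return an instance of it, not of ToyAutoModel.
        return cls._registry[config["model_type"]](config)


model = ToyAutoModel.from_config({"model_type": "bert"})
print(type(model).__name__)

try:
    ToyAutoModel()
except EnvironmentError as error:
    print("direct instantiation failed:", error)
```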

classmethod from_config(config)

Instantiates one of the base model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: DistilBertModel (DistilBERT model)

  • isInstance of longformer configuration class: LongformerModel (Longformer model)

  • isInstance of roberta configuration class: RobertaModel (RoBERTa model)

  • isInstance of bert configuration class: BertModel (Bert model)

  • isInstance of openai-gpt configuration class: OpenAIGPTModel (OpenAI GPT model)

  • isInstance of gpt2 configuration class: GPT2Model (OpenAI GPT-2 model)

  • isInstance of ctrl configuration class: CTRLModel (Salesforce CTRL model)

  • isInstance of transfo-xl configuration class: TransfoXLModel (Transformer-XL model)

  • isInstance of xlnet configuration class: XLNetModel (XLNet model)

  • isInstance of xlm configuration class: XLMModel (XLM model)

  • isInstance of flaubert configuration class: FlaubertModel (Flaubert model)

  • isInstance of electra configuration class: ElectraModel (Electra model)
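Selection by isinstance is order-sensitive whenever one configuration class inherits from another. The sketch below uses toy classes whose inheritance chain (Longformer from RoBERTa from BERT) is an assumption made for illustration; select_model_name is a hypothetical helper returning class names as strings.

```python
class ToyBertConfig:
    pass


class ToyRobertaConfig(ToyBertConfig):
    pass


class ToyLongformerConfig(ToyRobertaConfig):
    pass


# Subclasses are listed before their parents, mirroring the order of the list above.
ORDERED_MAPPING = [
    (ToyLongformerConfig, "LongformerModel"),
    (ToyRobertaConfig, "RobertaModel"),
    (ToyBertConfig, "BertModel"),
]


def select_model_name(config, mapping=ORDERED_MAPPING) -> str:
    """Return the model name for the first configuration class the config is an instance of."""
    for config_class, model_name in mapping:
        if isinstance(config, config_class):
            return model_name
    raise ValueError("Unrecognized configuration class: %r" % type(config))


correct = select_model_name(ToyRobertaConfig())
# Checking the parent class first would shadow every subclass.
shadowed = select_model_name(ToyRobertaConfig(), mapping=list(reversed(ORDERED_MAPPING)))
print(correct, shadowed)
```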

Examples:

>>> from transformers import AutoModel, BertConfig
>>> config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
>>> model = AutoModel.from_config(config)  # Model is built from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the base model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
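What "Dropout modules are deactivated" means in practice can be shown with a toy layer. This is a pure-Python sketch that only mimics the train()/eval() switch; ToyDropout is a hypothetical class, not a PyTorch module, and the inverted-dropout scaling it uses is an assumption about how dropout is typically implemented.

```python
import random


class ToyDropout:
    """Hypothetical dropout layer mimicking train/eval semantics."""

    def __init__(self, p: float = 0.5, seed: int = 0) -> None:
        self.p = p
        self.training = True  # a freshly built layer starts in training mode
        self._rng = random.Random(seed)

    def train(self) -> "ToyDropout":
        self.training = True
        return self

    def eval(self) -> "ToyDropout":
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: inputs pass through unchanged
        scale = 1.0 / (1.0 - self.p)
        # Training mode: zero each value with probability p and rescale the survivors.
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0] * 8
eval_out = layer.eval()(inputs)
train_out = layer.train()(inputs)
print(eval_out)
print(train_out)
```

In evaluation mode the output is deterministic, which is why a model returned by from_pretrained() gives repeatable predictions until train() is called.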

Parameters
  • pretrained_model_name_or_path –

    either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
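The loading information returned when output_loading_info is True can be understood as a comparison between the parameter names a model expects and the names found in a state dictionary. The following is a minimal sketch of that comparison; load_with_info is a hypothetical helper, and the dictionary keys missing_keys, unexpected_keys and error_msgs are assumed names for the three pieces of information mentioned above.

```python
def load_with_info(expected_keys, state_dict):
    """Keep the entries the model expects and report the mismatches."""
    expected = set(expected_keys)
    provided = set(state_dict)
    loaded = {key: state_dict[key] for key in expected & provided}
    info = {
        "missing_keys": sorted(expected - provided),     # expected but absent from the weights
        "unexpected_keys": sorted(provided - expected),  # present in the weights but unused
        "error_msgs": [],                                # nothing is validated in this sketch
    }
    return loaded, info


expected_keys = ["embeddings.weight", "encoder.weight", "classifier.weight"]
state_dict = {"embeddings.weight": [0.1], "encoder.weight": [0.2], "pooler.weight": [0.3]}

loaded, loading_info = load_with_info(expected_keys, state_dict)
print(loading_info)
```

A non-empty list of missing keys usually signals that part of the model keeps its initial values, for example a task head that was not part of the saved weights.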

Examples:

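from transformers import AutoConfig, AutoModel

model = AutoModel.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModel.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')  # A path to a saved configuration JSON file is accepted
model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)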

AutoModelForPreTraining

class transformers.AutoModelForPreTraining

AutoModelForPreTraining is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the AutoModelForPreTraining.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).
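Each Auto class resolves the same model_type to a different concrete class, depending on the task head it stands for. The table below is a hedged sketch for the bert model type only: BertModel is named earlier in this page, while the other class names are given from general knowledge of the library and should be checked against the installed version. resolve_head_class is a hypothetical helper.

```python
# Hypothetical lookup table: Auto class name -> {model_type: concrete class name}.
HEAD_CLASSES = {
    "AutoModel": {"bert": "BertModel"},
    "AutoModelForPreTraining": {"bert": "BertForPreTraining"},
    "AutoModelWithLMHead": {"bert": "BertForMaskedLM"},
    "AutoModelForSequenceClassification": {"bert": "BertForSequenceClassification"},
    "AutoModelForQuestionAnswering": {"bert": "BertForQuestionAnswering"},
}


def resolve_head_class(auto_class: str, model_type: str) -> str:
    """Return the concrete class name an Auto class would pick for a model type."""
    try:
        return HEAD_CLASSES[auto_class][model_type]
    except KeyError:
        raise ValueError("No mapping for %s with model type %s" % (auto_class, model_type))


for auto_class in HEAD_CLASSES:
    print(auto_class, "->", resolve_head_class(auto_class, "bert"))
```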

classmethod from_config(config)

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

Examples:

>>> from transformers import AutoModelForPreTraining, BertConfig
>>> config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
>>> model = AutoModelForPreTraining.from_config(config)  # Model is built from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
  • pretrained_model_name_or_path –

    Either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

from transformers import AutoConfig, AutoModelForPreTraining

model = AutoModelForPreTraining.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForPreTraining.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')  # A path to a saved configuration JSON file is accepted
model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelWithLMHead

class transformers.AutoModelWithLMHead

AutoModelWithLMHead is a generic model class that will be instantiated as one of the language modeling model classes of the library when created with the AutoModelWithLMHead.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the language modeling model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

Examples:

from transformers import AutoModelWithLMHead, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelWithLMHead.from_config(config)  # Model is built from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the language modeling model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
  • pretrained_model_name_or_path –

    Either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

from transformers import AutoConfig, AutoModelWithLMHead

model = AutoModelWithLMHead.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelWithLMHead.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelWithLMHead.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')  # A path to a saved configuration JSON file is accepted
model = AutoModelWithLMHead.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForSequenceClassification

class transformers.AutoModelForSequenceClassification

AutoModelForSequenceClassification is a generic model class that will be instantiated as one of the sequence classification model classes of the library when created with the AutoModelForSequenceClassification.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the sequence classification model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

Examples:

from transformers import AutoModelForSequenceClassification, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForSequenceClassification.from_config(config)  # Model is built from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the sequence classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
  • pretrained_model_name_or_path –

    either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

from transformers import AutoConfig, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForSequenceClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')  # A path to a saved configuration JSON file is accepted
model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForQuestionAnswering

class transformers.AutoModelForQuestionAnswering

AutoModelForQuestionAnswering is a generic model class that will be instantiated as one of the question answering model classes of the library when created with the AutoModelForQuestionAnswering.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the question answering model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

Examples:

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForQuestionAnswering.from_config(config)  # E.g. model was saved using `save_pretrained('./test/saved_model/')`
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the question answering model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.
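The selection rule described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: the mapping is a hypothetical, trimmed-down table, `resolve_class_name` is an invented helper, and the pattern order is chosen so that longer names are tested before names they contain.

```python
# Hypothetical, trimmed-down mapping; the real library covers more architectures.
MODEL_TYPE_TO_CLASS = {
    "distilbert": "DistilBertForQuestionAnswering",
    "albert": "AlbertForQuestionAnswering",
    "bert": "BertForQuestionAnswering",
}

# Ordered so that more specific names are tested before their substrings
# ("distilbert" and "albert" both contain "bert").
NAME_PATTERNS = ["distilbert", "albert", "bert"]


def resolve_class_name(name_or_path, model_type=None):
    """Pick a class name from model_type, else from the name/path string."""
    if model_type is not None:
        if model_type not in MODEL_TYPE_TO_CLASS:
            raise ValueError(f"Unrecognized model_type: {model_type!r}")
        return MODEL_TYPE_TO_CLASS[model_type]
    # Fallback: substring matching on the name or path.
    for pattern in NAME_PATTERNS:
        if pattern in name_or_path:
            return MODEL_TYPE_TO_CLASS[pattern]
    raise ValueError(f"Could not infer an architecture from {name_or_path!r}")


print(resolve_class_name("./my_model_directory/", model_type="albert"))
print(resolve_class_name("distilbert-base-uncased-distilled-squad"))
```

Checking `model_type` first means a directory with an uninformative name still resolves correctly as long as its configuration records the architecture.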

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
  • pretrained_model_name_or_path –

    Either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
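To make the output_loading_info entry above more concrete, here is a minimal sketch of how missing and unexpected keys can be derived by comparing two sets of parameter names. The helper `loading_info` and the parameter names are hypothetical; the library's own bookkeeping may differ in detail.

```python
def loading_info(model_keys, checkpoint_keys):
    """Compare parameter names the model expects with those in a checkpoint."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical parameter names, for illustration only.
expected = ["bert.embeddings.weight", "qa_outputs.weight", "qa_outputs.bias"]
checkpoint = ["bert.embeddings.weight", "cls.predictions.bias"]

info = loading_info(expected, checkpoint)
print(info["missing_keys"])     # ['qa_outputs.bias', 'qa_outputs.weight']
print(info["unexpected_keys"])  # ['cls.predictions.bias']
```

Missing keys typically correspond to a task head that the checkpoint never trained, so those weights keep their initial values.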

Examples:

from transformers import AutoConfig, AutoModelForQuestionAnswering
model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForQuestionAnswering.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelForQuestionAnswering.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForTokenClassification¶

class transformers.AutoModelForTokenClassification[source]¶

AutoModelForTokenClassification is a generic model class that will be instantiated as one of the token classification model classes of the library when created with the AutoModelForTokenClassification.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the token classification model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: DistilBertForTokenClassification (DistilBERT model)

  • isInstance of xlm configuration class: XLMForTokenClassification (XLM model)

  • isInstance of xlm roberta configuration class: XLMRobertaForTokenClassification (XLMRoberta model)

  • isInstance of bert configuration class: BertForTokenClassification (Bert model)

  • isInstance of albert configuration class: AlbertForTokenClassification (ALBERT model)

  • isInstance of xlnet configuration class: XLNetForTokenClassification (XLNet model)

  • isInstance of camembert configuration class: CamembertForTokenClassification (Camembert model)

  • isInstance of roberta configuration class: RobertaForTokenClassification (Roberta model)

  • isInstance of electra configuration class: ElectraForTokenClassification (Electra model)
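Selecting a model class by configuration type, as in the list above, amounts to an ordered series of isinstance checks. The sketch below uses hypothetical configuration classes (`RobertaLikeConfig`, `CamembertLikeConfig`, `BertLikeConfig`) and assumes that one configuration class may subclass another, in which case the subclass has to be tested first.

```python
class RobertaLikeConfig:
    """Hypothetical parent configuration class."""


class CamembertLikeConfig(RobertaLikeConfig):
    """Hypothetical configuration that subclasses the one above."""


class BertLikeConfig:
    """Hypothetical unrelated configuration class."""


# Order matters: a subclass instance also passes isinstance() for its parent,
# so the subclass entry must come before the parent entry.
CONFIG_TO_MODEL = [
    (CamembertLikeConfig, "CamembertForTokenClassification"),
    (RobertaLikeConfig, "RobertaForTokenClassification"),
    (BertLikeConfig, "BertForTokenClassification"),
]


def model_name_for(config):
    """Return the model class name registered for this configuration."""
    for config_class, model_name in CONFIG_TO_MODEL:
        if isinstance(config, config_class):
            return model_name
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


print(model_name_for(CamembertLikeConfig()))  # CamembertForTokenClassification
print(model_name_for(RobertaLikeConfig()))    # RobertaForTokenClassification
```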

Examples:

from transformers import AutoModelForTokenClassification, BertConfig
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForTokenClassification.from_config(config)  # Build the model from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the token classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
  • pretrained_model_name_or_path –

    Either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
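The interplay between cache_dir and force_download in the parameter list above can be pictured with a small caching helper. This is only a sketch: `cached_fetch` and `fake_fetch` are hypothetical, and the real cache layout and file naming are not modelled here.

```python
import os
import tempfile


def cached_fetch(name, cache_dir, fetch, force_download=False):
    """Return the path of `name` in cache_dir, calling fetch() only when needed."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    if force_download or not os.path.exists(path):
        # Either nothing is cached yet or the caller asked to override it.
        with open(path, "wb") as f:
            f.write(fetch())
    return path


calls = []


def fake_fetch():
    """Stand-in for a network download; records each invocation."""
    calls.append(1)
    return b"weights"


with tempfile.TemporaryDirectory() as cache:
    cached_fetch("model.bin", cache, fake_fetch)                       # downloads
    cached_fetch("model.bin", cache, fake_fetch)                       # cache hit
    cached_fetch("model.bin", cache, fake_fetch, force_download=True)  # downloads again

print(len(calls))  # 2
```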

Examples:

from transformers import AutoConfig, AutoModelForTokenClassification
model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForTokenClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelForTokenClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

TFAutoModel¶

class transformers.TFAutoModel[source]¶

TFAutoModel is a generic model class that will be instantiated as one of the base model classes of the library when created with the TFAutoModel.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • t5: TFT5Model (T5 model)

  • distilbert: TFDistilBertModel (DistilBERT model)

  • roberta: TFRobertaModel (RoBERTa model)

  • bert: TFBertModel (Bert model)

  • openai-gpt: TFOpenAIGPTModel (OpenAI GPT model)

  • gpt2: TFGPT2Model (OpenAI GPT-2 model)

  • transfo-xl: TFTransfoXLModel (Transformer-XL model)

  • xlnet: TFXLNetModel (XLNet model)

  • xlm: TFXLMModel (XLM model)

  • ctrl: TFCTRLModel (CTRL model)

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the base model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config –

(optional) instance of a class derived from PretrainedConfig: The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: TFDistilBertModel (DistilBERT model)

  • isInstance of roberta configuration class: TFRobertaModel (RoBERTa model)

  • isInstance of bert configuration class: TFBertModel (Bert model)

  • isInstance of openai-gpt configuration class: TFOpenAIGPTModel (OpenAI GPT model)

  • isInstance of gpt2 configuration class: TFGPT2Model (OpenAI GPT-2 model)

  • isInstance of ctrl configuration class: TFCTRLModel (Salesforce CTRL model)

  • isInstance of transfo-xl configuration class: TFTransfoXLModel (Transformer-XL model)

  • isInstance of xlnet configuration class: TFXLNetModel (XLNet model)

  • isInstance of xlm configuration class: TFXLMModel (XLM model)

Examples:

from transformers import BertConfig, TFAutoModel
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = TFAutoModel.from_config(config)  # Build the model from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the base model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • t5: TFT5Model (T5 model)

  • distilbert: TFDistilBertModel (DistilBERT model)

  • roberta: TFRobertaModel (RoBERTa model)

  • bert: TFBertModel (Bert model)

  • openai-gpt: TFOpenAIGPTModel (OpenAI GPT model)

  • gpt2: TFGPT2Model (OpenAI GPT-2 model)

  • transfo-xl: TFTransfoXLModel (Transformer-XL model)

  • xlnet: TFXLNetModel (XLNet model)

  • ctrl: TFCTRLModel (CTRL model)

Params:

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

  • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

  • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

  • a path or url to a PyTorch, TF 1.X or TF 2.0 checkpoint file (e.g. ./tf_model/model.ckpt.index). In the case of a PyTorch checkpoint, from_pt should be set to True and a configuration object should be provided as config argument.

from_pt: (optional) boolean:

Set to True if the checkpoint is a PyTorch checkpoint.

model_args: (optional) Sequence of positional arguments:

All remaining positional arguments will be passed to the underlying model’s __init__ method.

config: (optional) instance of a class derived from PretrainedConfig:

Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

  • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

  • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict: (optional) dict:

an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir: (optional) string:

Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download: (optional) boolean, default False:

Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info: (optional) boolean:

Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

kwargs: (optional) Remaining dictionary of keyword arguments:

Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

  • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

  • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
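The second case above, where keyword arguments are split between the configuration and the model, can be sketched as follows. `ToyConfig` and `split_kwargs` are hypothetical names used for illustration; the sketch only assumes that a key counts as a configuration attribute when the configuration object already has an attribute of that name.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.num_labels = 2


def split_kwargs(config, **kwargs):
    """Apply kwargs that match config attributes; return the rest for the model."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the configuration attribute
        else:
            unused[key] = value          # left over for the model's __init__
    return config, unused


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, foo="bar")
print(config.output_attentions)  # True
print(model_kwargs)              # {'foo': 'bar'}
```

A misspelled configuration key therefore ends up in the leftover dictionary and is handed to the model's constructor, which is a common source of unexpected-argument errors.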

Examples:

from transformers import AutoConfig, TFAutoModel
model = TFAutoModel.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = TFAutoModel.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = TFAutoModel.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
model = TFAutoModel.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForPreTraining¶

class transformers.TFAutoModelForPreTraining[source]¶

TFAutoModelForPreTraining is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the TFAutoModelForPreTraining.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: TFDistilBertForMaskedLM (DistilBERT model)

  • isInstance of roberta configuration class: TFRobertaForMaskedLM (RoBERTa model)

  • isInstance of bert configuration class: TFBertForPreTraining (Bert model)

  • isInstance of openai-gpt configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)

  • isInstance of gpt2 configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)

  • isInstance of ctrl configuration class: TFCTRLLMHeadModel (Salesforce CTRL model)

  • isInstance of transfo-xl configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)

  • isInstance of xlnet configuration class: TFXLNetLMHeadModel (XLNet model)

  • isInstance of xlm configuration class: TFXLMWithLMHeadModel (XLM model)

Examples:

from transformers import BertConfig, TFAutoModelForPreTraining
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = TFAutoModelForPreTraining.from_config(config)  # Build the model from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
  • pretrained_model_name_or_path –

    Either:

    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config –

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs –

    (optional) Remaining dictionary of keyword arguments: Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
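The resume_download parameter in the list above keeps a partially received file and continues from where it stopped. The following is a simplified sketch of that idea using in-memory bytes instead of a network transfer; the `download` helper and the `.incomplete` suffix are assumptions made for this illustration.

```python
import os
import tempfile


def download(source, path, resume_download=False):
    """Write `source` bytes to `path` via a '.incomplete' temporary file.

    Returns the byte offset the transfer started from.
    """
    partial = path + ".incomplete"
    if resume_download and os.path.exists(partial):
        offset = os.path.getsize(partial)  # keep the bytes already received
    else:
        offset = 0                         # start over, discarding any partial file
    mode = "ab" if offset else "wb"
    with open(partial, mode) as f:
        f.write(source[offset:])
    os.replace(partial, path)              # publish the finished file
    return offset


payload = b"0123456789"
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "model.bin")
    with open(target + ".incomplete", "wb") as f:
        f.write(payload[:4])               # simulate an interrupted transfer
    resumed_from = download(payload, target, resume_download=True)
    with open(target, "rb") as f:
        content = f.read()

print(resumed_from)        # 4
print(content == payload)  # True
```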

Examples:

from transformers import AutoConfig, TFAutoModelForPreTraining
model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = TFAutoModelForPreTraining.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
model = TFAutoModelForPreTraining.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelWithLMHead¶

class transformers.TFAutoModelWithLMHead[source]¶

TFAutoModelWithLMHead is a generic model class that will be instantiated as one of the language modeling model classes of the library when created with the TFAutoModelWithLMHead.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • t5: TFT5ForConditionalGeneration (T5 model)

  • distilbert: TFDistilBertForMaskedLM (DistilBERT model)

  • roberta: TFRobertaForMaskedLM (RoBERTa model)

  • bert: TFBertForMaskedLM (Bert model)

  • openai-gpt: TFOpenAIGPTLMHeadModel (OpenAI GPT model)

  • gpt2: TFGPT2LMHeadModel (OpenAI GPT-2 model)

  • transfo-xl: TFTransfoXLLMHeadModel (Transformer-XL model)

  • xlnet: TFXLNetLMHeadModel (XLNet model)

  • xlm: TFXLMWithLMHeadModel (XLM model)

  • ctrl: TFCTRLLMHeadModel (CTRL model)

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the language modeling model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config –

(optional) instance of a class derived from PretrainedConfig: The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: TFDistilBertForMaskedLM (DistilBERT model)

  • isInstance of roberta configuration class: TFRobertaForMaskedLM (RoBERTa model)

  • isInstance of bert configuration class: TFBertForMaskedLM (Bert model)

  • isInstance of openai-gpt configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)

  • isInstance of gpt2 configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)

  • isInstance of ctrl configuration class: TFCTRLLMHeadModel (Salesforce CTRL model)

  • isInstance of transfo-xl configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)

  • isInstance of xlnet configuration class: TFXLNetLMHeadModel (XLNet model)

  • isInstance of xlm configuration class: TFXLMWithLMHeadModel (XLM model)

Examples:

from transformers import BertConfig, TFAutoModelWithLMHead
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = TFAutoModelWithLMHead.from_config(config)  # Build the model from the configuration only; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the language modeling model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • t5: TFT5ForConditionalGeneration (T5 model)

  • distilbert: TFDistilBertForMaskedLM (DistilBERT model)

  • roberta: TFRobertaForMaskedLM (RoBERTa model)

  • bert: TFBertForMaskedLM (Bert model)

  • openai-gpt: TFOpenAIGPTLMHeadModel (OpenAI GPT model)

  • gpt2: TFGPT2LMHeadModel (OpenAI GPT-2 model)

  • transfo-xl: TFTransfoXLLMHeadModel (Transformer-XL model)

  • xlnet: TFXLNetLMHeadModel (XLNet model)

  • xlm: TFXLMWithLMHeadModel (XLM model)

  • ctrl: TFCTRLLMHeadModel (CTRL model)

Params:

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

  • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

  • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

  • a path or url to a PyTorch, TF 1.X or TF 2.0 checkpoint file (e.g. ./tf_model/model.ckpt.index). In the case of a PyTorch checkpoint, from_pt should be set to True and a configuration object should be provided as config argument.

from_pt: (optional) boolean:

Set to True if the checkpoint is a PyTorch checkpoint.

model_args: (optional) Sequence of positional arguments:

All remaining positional arguments will be passed to the underlying model’s __init__ method.

config: (optional) instance of a class derived from PretrainedConfig:

Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

  • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

  • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict: (optional) dict:

an optional state dictionary for the model to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir: (optional) string:

Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download: (optional) boolean, default False:

Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info: (optional) boolean:

Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

kwargs: (optional) Remaining dictionary of keyword arguments:

Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

  • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

  • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
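The proxies dictionary described in the parameters above mixes protocol-wide keys (`'http'`) with endpoint-specific keys (`'http://hostname'`). The sketch below shows one plausible lookup order, assuming that an endpoint-specific entry takes precedence over a protocol-wide one; `select_proxy` here is a simplified, hypothetical helper and not the code the library or its HTTP client runs.

```python
from urllib.parse import urlparse


def select_proxy(url, proxies):
    """Return the most specific proxy configured for `url`, or None."""
    parts = urlparse(url)
    # Try "scheme://hostname" first, then fall back to the bare scheme.
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

print(select_proxy("http://hostname/model.bin", proxies))       # foo.bar:4012
print(select_proxy("http://other.example/model.bin", proxies))  # foo.bar:3128
print(select_proxy("https://hostname/model.bin", proxies))      # None
```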

Examples:

from transformers import AutoConfig, TFAutoModelWithLMHead
model = TFAutoModelWithLMHead.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = TFAutoModelWithLMHead.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = TFAutoModelWithLMHead.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
model = TFAutoModelWithLMHead.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForSequenceClassification¶

class transformers.TFAutoModelForSequenceClassification[source]¶

TFAutoModelForSequenceClassification is a generic model class that will be instantiated as one of the sequence classification model classes of the library when created with the TFAutoModelForSequenceClassification.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • distilbert: TFDistilBertForSequenceClassification (DistilBERT model)

  • roberta: TFRobertaForSequenceClassification (RoBERTa model)

  • bert: TFBertForSequenceClassification (Bert model)

  • xlnet: TFXLNetForSequenceClassification (XLNet model)

  • xlm: TFXLMForSequenceClassification (XLM model)
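The selection rule above can be sketched in plain Python. This is an illustration only, not the library's code: resolve_class_name is a hypothetical helper, and the table simply copies the list above. Note that the order matters for the fallback, because 'bert' is a substring of both 'distilbert' and 'roberta'.

```python
# Ordered (pattern, class name) pairs, copied from the list above.
# More specific names come first so that 'bert' does not shadow them.
SEQ_CLS_CLASSES = [
    ("distilbert", "TFDistilBertForSequenceClassification"),
    ("roberta", "TFRobertaForSequenceClassification"),
    ("bert", "TFBertForSequenceClassification"),
    ("xlnet", "TFXLNetForSequenceClassification"),
    ("xlm", "TFXLMForSequenceClassification"),
]


def resolve_class_name(name_or_path, model_type=None):
    """Return the class name for a checkpoint (hypothetical helper)."""
    if model_type is not None:
        # Preferred route: exact match on the config's model_type.
        for pattern, class_name in SEQ_CLS_CLASSES:
            if pattern == model_type:
                return class_name
    # Fallback: first pattern contained in the name or path.
    for pattern, class_name in SEQ_CLS_CLASSES:
        if pattern in name_or_path:
            return class_name
    raise ValueError("Unrecognized model: {!r}".format(name_or_path))


print(resolve_class_name("distilbert-base-uncased"))
print(resolve_class_name("./my_model_directory/", model_type="xlnet"))
```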

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the sequence classification model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config –

(optional) instance of a class derived from PretrainedConfig: The model class to instantiate is selected based on the configuration class:

  • isinstance of distilbert configuration class: TFDistilBertForSequenceClassification (DistilBERT model)

  • isinstance of roberta configuration class: TFRobertaForSequenceClassification (RoBERTa model)

  • isinstance of bert configuration class: TFBertForSequenceClassification (Bert model)

  • isinstance of xlnet configuration class: TFXLNetForSequenceClassification (XLNet model)

  • isinstance of xlm configuration class: TFXLMForSequenceClassification (XLM model)

Examples:

from transformers import BertConfig, TFAutoModelForSequenceClassification

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = TFAutoModelForSequenceClassification.from_config(config)  # Build the model from the configuration; no weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the sequence classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • distilbert: TFDistilBertForSequenceClassification (DistilBERT model)

  • roberta: TFRobertaForSequenceClassification (RoBERTa model)

  • bert: TFBertForSequenceClassification (Bert model)

  • xlnet: TFXLNetForSequenceClassification (XLNet model)

  • xlm: TFXLMForSequenceClassification (XLM model)

Dropout modules are deactivated by default. Unlike the PyTorch classes, TF 2.0 models have no model.eval() or model.train() methods; dropout is only active when the model is called with training=True (Keras does this automatically inside fit()).

Params:

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

  • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

  • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

  • a path or url to a PyTorch, TF 1.X or TF 2.0 checkpoint file (e.g. ./tf_model/model.ckpt.index). In the case of a PyTorch checkpoint, from_pt should be set to True and a configuration object should be provided as config argument.

from_pt: (optional) boolean:

Set to True if the checkpoint is a PyTorch checkpoint.

model_args: (optional) Sequence of positional arguments:

All remaining positional arguments will be passed to the underlying model’s __init__ method.

config: (optional) instance of a class derived from PretrainedConfig:

Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

  • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

  • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict: (optional) dict:

An optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir: (optional) string:

Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download: (optional) boolean, default False:

Do not delete an incompletely received file. Attempt to resume the download if such a file exists.
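One common way to resume a partial download is an HTTP Range request that asks only for the bytes not yet received. The sketch below assumes that approach; resume_headers and the file names are hypothetical, and the text above does not state how the library itself resumes.

```python
import os
import tempfile


def resume_headers(incomplete_path):
    """Build HTTP headers that ask a server for the missing bytes only.

    Hypothetical helper: returns an empty dict when there is nothing to
    resume, otherwise a Range header starting at the partial file's size.
    """
    if not os.path.exists(incomplete_path):
        return {}
    already_received = os.path.getsize(incomplete_path)
    if already_received == 0:
        return {}
    return {"Range": "bytes={}-".format(already_received)}


# Simulate an interrupted download that stored 1024 bytes on disk.
with tempfile.TemporaryDirectory() as tmp:
    partial = os.path.join(tmp, "weights.h5.incomplete")
    with open(partial, "wb") as handle:
        handle.write(b"\0" * 1024)
    headers = resume_headers(partial)
    missing = resume_headers(os.path.join(tmp, "absent.incomplete"))

print(headers)
```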

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info: (optional) boolean:

Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

kwargs: (optional) Remaining dictionary of keyword arguments:

Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

  • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

  • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

from transformers import AutoConfig, TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = TFAutoModelForSequenceClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions
# Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = TFAutoModelForSequenceClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForQuestionAnswering¶

class transformers.TFAutoModelForQuestionAnswering[source]¶

TFAutoModelForQuestionAnswering is a generic model class that will be instantiated as one of the question answering model classes of the library when created with the TFAutoModelForQuestionAnswering.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • distilbert: TFDistilBertForQuestionAnswering (DistilBERT model)

  • albert: TFAlbertForQuestionAnswering (ALBERT model)

  • roberta: TFRobertaForQuestionAnswering (RoBERTa model)

  • bert: TFBertForQuestionAnswering (Bert model)

  • xlnet: TFXLNetForQuestionAnswering (XLNet model)

  • xlm: TFXLMForQuestionAnswering (XLM model)

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the question answering model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config –

(optional) instance of a class derived from PretrainedConfig: The model class to instantiate is selected based on the configuration class:

  • isinstance of distilbert configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)

  • isinstance of albert configuration class: TFAlbertForQuestionAnswering (ALBERT model)

  • isinstance of roberta configuration class: TFRobertaForQuestionAnswering (RoBERTa model)

  • isinstance of bert configuration class: TFBertForQuestionAnswering (Bert model)

  • isinstance of xlnet configuration class: TFXLNetForQuestionAnswering (XLNet model)

  • isinstance of xlm configuration class: TFXLMForQuestionAnswering (XLM model)
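Selecting a class by isinstance checks is order-sensitive whenever one configuration class inherits from another. The sketch below illustrates this with hypothetical toy classes (ToyBertConfig, ToyRobertaConfig, ToyXLNetConfig) and a hypothetical class_name_for helper; the inheritance between the toy classes is an assumption made for the example, and only the returned class names come from this page.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyRobertaConfig(ToyBertConfig):
    """Hypothetical subclass: every instance is also a ToyBertConfig."""


class ToyXLNetConfig:
    """Hypothetical configuration class."""


# Ordered so that a subclass is tested before its parent class.
QA_CLASSES = [
    (ToyRobertaConfig, "TFRobertaForQuestionAnswering"),
    (ToyBertConfig, "TFBertForQuestionAnswering"),
    (ToyXLNetConfig, "TFXLNetForQuestionAnswering"),
]


def class_name_for(config):
    """Return the first class name whose configuration class matches."""
    for config_class, class_name in QA_CLASSES:
        if isinstance(config, config_class):
            return class_name
    raise ValueError(
        "Unrecognized configuration class: {}".format(type(config).__name__)
    )


print(class_name_for(ToyRobertaConfig()))
print(class_name_for(ToyBertConfig()))
```

If the BERT entry were listed first, a ToyRobertaConfig instance would match it and the wrong class would be returned.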

Examples:

from transformers import BertConfig, TFAutoModelForQuestionAnswering

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = TFAutoModelForQuestionAnswering.from_config(config)  # Build the model from the configuration; no weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the question answering model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • distilbert: TFDistilBertForQuestionAnswering (DistilBERT model)

  • albert: TFAlbertForQuestionAnswering (ALBERT model)

  • roberta: TFRobertaForQuestionAnswering (RoBERTa model)

  • bert: TFBertForQuestionAnswering (Bert model)

  • xlnet: TFXLNetForQuestionAnswering (XLNet model)

  • xlm: TFXLMForQuestionAnswering (XLM model)

Dropout modules are deactivated by default. Unlike the PyTorch classes, TF 2.0 models have no model.eval() or model.train() methods; dropout is only active when the model is called with training=True (Keras does this automatically inside fit()).

Params:

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

  • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

  • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

  • a path or url to a PyTorch, TF 1.X or TF 2.0 checkpoint file (e.g. ./tf_model/model.ckpt.index). In the case of a PyTorch checkpoint, from_pt should be set to True and a configuration object should be provided as config argument.

from_pt: (optional) boolean:

Set to True if the checkpoint is a PyTorch checkpoint.

model_args: (optional) Sequence of positional arguments:

All remaining positional arguments will be passed to the underlying model’s __init__ method.

config: (optional) instance of a class derived from PretrainedConfig:

Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

  • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

  • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict: (optional) dict:

An optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir: (optional) string:

Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download: (optional) boolean, default False:

Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info: (optional) boolean:

Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
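To make the terms concrete, missing and unexpected keys can be thought of as set differences between the weight names a model expects and the names found in a checkpoint. The sketch below assumes that reading; loading_info is a hypothetical helper, the weight names are made up, and error messages are left out.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare weight names the model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
    }


info = loading_info(
    expected_keys=["encoder.weight", "qa_outputs.weight", "qa_outputs.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info)
```

Missing keys of this kind typically belong to a task-specific head that the checkpoint was not trained with.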

kwargs: (optional) Remaining dictionary of keyword arguments:

Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

  • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

  • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

from transformers import AutoConfig, TFAutoModelForQuestionAnswering

model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = TFAutoModelForQuestionAnswering.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions
# Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = TFAutoModelForQuestionAnswering.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForTokenClassification¶

class transformers.TFAutoModelForTokenClassification[source]¶

classmethod from_config(config)[source]¶

Instantiates one of the token classification model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config –

(optional) instance of a class derived from PretrainedConfig: The model class to instantiate is selected based on the configuration class:

  • isinstance of bert configuration class: TFBertForTokenClassification (Bert model)

  • isinstance of xlnet configuration class: TFXLNetForTokenClassification (XLNet model)

  • isinstance of distilbert configuration class: TFDistilBertForTokenClassification (DistilBERT model)

  • isinstance of roberta configuration class: TFRobertaForTokenClassification (RoBERTa model)

Examples:

from transformers import BertConfig, TFAutoModelForTokenClassification

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = TFAutoModelForTokenClassification.from_config(config)  # Build the model from the configuration; no weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the token classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string:

  • bert: TFBertForTokenClassification (Bert model)

  • xlnet: TFXLNetForTokenClassification (XLNet model)

  • distilbert: TFDistilBertForTokenClassification (DistilBERT model)

  • roberta: TFRobertaForTokenClassification (RoBERTa model)

Dropout modules are deactivated by default. Unlike the PyTorch classes, TF 2.0 models have no model.eval() or model.train() methods; dropout is only active when the model is called with training=True (Keras does this automatically inside fit()).

Params:

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

  • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

  • a path or url to a PyTorch, TF 1.X or TF 2.0 checkpoint file (e.g. ./tf_model/model.ckpt.index). In the case of a PyTorch checkpoint, from_pt should be set to True and a configuration object should be provided as config argument.

model_args: (optional) Sequence of positional arguments:

All remaining positional arguments will be passed to the underlying model’s __init__ method.

config: (optional) instance of a class derived from PretrainedConfig:

Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

  • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

  • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
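The last case, a local directory that contains config.json, can be sketched as a simple file lookup. find_config_file is a hypothetical helper written for this illustration, and the JSON content is a made-up minimal example rather than a real saved configuration.

```python
import json
import os
import tempfile


def find_config_file(name_or_path):
    """Return the path to config.json inside a local directory, or None."""
    if os.path.isdir(name_or_path):
        candidate = os.path.join(name_or_path, "config.json")
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as model_dir:
    # An empty directory offers no configuration to load automatically.
    empty_result = find_config_file(model_dir)

    # Once config.json exists, it can be picked up without a config argument.
    with open(os.path.join(model_dir, "config.json"), "w") as handle:
        json.dump({"model_type": "bert"}, handle)
    found = find_config_file(model_dir)
    with open(found) as handle:
        model_type = json.load(handle)["model_type"]

print(empty_result, os.path.basename(found), model_type)
```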

state_dict: (optional) dict:

An optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

cache_dir: (optional) string:

Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

output_loading_info: (optional) boolean:

Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

kwargs: (optional) Remaining dictionary of keyword arguments:

Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

  • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).

  • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

from transformers import AutoConfig, TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = TFAutoModelForTokenClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions
# Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = TFAutoModelForTokenClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)