hezarai/t5-base-fa

Text Generation

t5-base-fa/preprocessor/tokenizer_config.yaml

Update preprocessor/tokenizer_config.yaml

ab92be3 verified 20 days ago

541 Bytes

name: sentencepiece_unigram_tokenizer
config_type: preprocessor
max_length: 512
truncation: longest_first
truncation_side: right
stride: 0
padding: longest
padding_side: right
pad_to_multiple_of: 0
pad_token_type_id: 0
bos_token: <s>
eos_token: </s>
unk_token: <unk>
sep_token: <sep>
pad_token: <pad>
cls_token: <cls>
mask_token: <mask>
continuing_subword_prefix: ''
replacement: _
add_prefix_space: true
end_of_word_suffix: ''
fuse_unk: false
vocab_size: 32103
min_frequency: 2
limit_alphabet: 1000
initial_alphabet: []
show_progress: true
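
The truncation and padding keys above (`max_length`, `truncation_side`, `padding: longest`, `padding_side`, `pad_to_multiple_of`) are easiest to read with a small example. The following is a minimal sketch of what these settings plausibly do to a batch of token ids, assuming semantics similar to those commonly used by tokenizer libraries; it is not the library's own implementation. The function name `truncate_and_pad` is hypothetical, the pad id of `0` is an assumption (the config names the pad token as `<pad>` but does not give its id), and treating `pad_to_multiple_of: 0` as "disabled" is also an assumption. For single sequences, `longest_first` truncation is taken to reduce to trimming that one sequence.

```python
from typing import List


def truncate_and_pad(
    batch: List[List[int]],
    max_length: int = 512,          # mirrors max_length in the config
    pad_id: int = 0,                # assumed id of <pad>; not stated in the config
    truncation_side: str = "right",
    padding_side: str = "right",
    pad_to_multiple_of: int = 0,    # assumed: 0 means "no rounding"
) -> List[List[int]]:
    """Illustrative only: trim each sequence, then pad to the longest one."""
    truncated = []
    for ids in batch:
        if len(ids) > max_length:
            # Drop tokens from the configured side.
            ids = ids[:max_length] if truncation_side == "right" else ids[-max_length:]
        truncated.append(list(ids))

    # padding: longest -> the target length is the longest sequence in the batch.
    target = max((len(ids) for ids in truncated), default=0)
    if pad_to_multiple_of > 0 and target % pad_to_multiple_of:
        target += pad_to_multiple_of - target % pad_to_multiple_of

    padded = []
    for ids in truncated:
        fill = [pad_id] * (target - len(ids))
        padded.append(ids + fill if padding_side == "right" else fill + ids)
    return padded


print(truncate_and_pad([[5, 6, 7, 8, 9], [5, 6]], max_length=4))
```

With `max_length=4`, the first sequence is cut on the right to four ids and the second is padded on the right to the same length. Switching `padding_side` or `truncation_side` to `left` moves the fill or the cut to the start of the sequence instead.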