thunlp/SubCharTokenization

Contributor: chen-yingfa (1 contributor)
History: 4 commits. Latest commit 202c559 ("Add more tokenization methods"), about 3 years ago.
| File | Size | Pickle scan | Last commit | Updated |
|---|---|---|---|---|
| .gitattributes | 1.18 kB | Safe | initial commit | about 3 years ago |
| cangjie.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add more tokenization methods | about 3 years ago |
| pinyin.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add pinyin | about 3 years ago |
| pinyin_no_index.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add wubi and no index model | about 3 years ago |
| stroke.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add more tokenization methods | about 3 years ago |
| wubi.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add wubi and no index model | about 3 years ago |
| wubi_no_index.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add wubi and no index model | about 3 years ago |
| zhengma.pt (LFS) | 1.05 GB | Safe, 3 pickle imports [a] | Add more tokenization methods | about 3 years ago |
| zhuyin.pt (LFS) | 1.44 GB | 13 pickle imports [b], not marked safe | Add more tokenization methods | about 3 years ago |

[a] collections.OrderedDict, torch._utils._rebuild_tensor_v2, torch.FloatStorage

[b] The three imports above plus torch.utils.data.dataloader.DataLoader, numpy.core.multiarray._reconstruct, __main__.WorkerInitObj, numpy.dtype, __main__.pretraining_dataset, numpy.ndarray, torch.utils.data.sampler.BatchSampler, torch.utils.data.sampler.RandomSampler, torch.utils.data._utils.collate.default_collate, _codecs.encode
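Each `.pt` file above was scanned for the imports its pickle stream would trigger. The general idea behind such a scan, enumerating the `module.name` pairs a pickle references without ever unpickling it, can be sketched with the standard library's `pickletools`. This is a simplified illustration, not Hugging Face's actual scanner; in particular, the string tracking used for `STACK_GLOBAL` handles common cases only and ignores memo reuse.

```python
import collections
import pickle
import pickletools

def pickle_imports(data: bytes) -> set[str]:
    """Return the module.name pairs a pickle stream would import,
    found by walking opcodes -- the stream is never unpickled."""
    imports: set[str] = set()
    strings: list[str] = []  # recent string arguments, used for STACK_GLOBAL
    for op, arg, _pos in pickletools.genops(data):
        if op.name == "GLOBAL":
            # Older protocols: arg is "module name" in a single string.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif op.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name were pushed as the two most
            # recent string opcodes (simplification: ignores memo reuse).
            imports.add(f"{strings[-2]}.{strings[-1]}")
        if isinstance(arg, str):
            strings.append(arg)
    return imports

blob = pickle.dumps(collections.OrderedDict(a=1), protocol=2)
print(sorted(pickle_imports(blob)))  # ['collections.OrderedDict']
```

A stream whose imports stay within a known-harmless set (e.g. the three imports tagged [a] above) can be flagged safe; anything referencing `__main__` or arbitrary callables, as in zhuyin.pt, cannot be loaded without running code from those modules, which is why that file carries a warning instead of a "Safe" badge.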