Spaces: CONDA-Workshop/Data-Contamination-Database
ad06fdc
Data-Contamination-Database
14 contributors
History: 17 commits
Latest commit: ad06fdc (verified) by vishaal27, 9 months ago: Add data from "Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus"
File | Scan | Size | Last commit | Updated
.gitattributes | Safe | 1.52 kB | initial commit | 11 months ago
.gitignore | Safe | 12 Bytes | Style + gitignore | 11 months ago
README.md | Safe | 352 Bytes | Initital commit | 11 months ago
app.py | Safe | 6.23 kB | Increase tab font size | 10 months ago
contamination_report.csv | Safe | 34.5 kB | Add data from "Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus" | 9 months ago
dataset.py | Safe | 9.64 kB | Add PR links to previous commits | 10 months ago
markdown.py | Safe | 9.83 kB | update urls | 10 months ago
requirements.txt | Safe | 73 Bytes | Initital commit | 11 months ago
utils.py | Safe | 6.11 kB | Get token from environment | 10 months ago