Spaces: CONDA-Workshop/Data-Contamination-Database
Ref: refs/pr/6
14 contributors
History: 17 commits
Latest commit: vishaal27, Add data from "Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus" (ad06fdc, verified), 11 months ago
| File | Scan | Size | Last commit message | Updated |
| --- | --- | --- | --- | --- |
| .gitattributes | Safe | 1.52 kB | initial commit | 12 months ago |
| .gitignore | Safe | 12 Bytes | Style + gitignore | 12 months ago |
| README.md | Safe | 352 Bytes | Initital commit | 12 months ago |
| app.py | Safe | 6.23 kB | Increase tab font size | 11 months ago |
| contamination_report.csv | Safe | 34.5 kB | Add data from "Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus" | 11 months ago |
| dataset.py | Safe | 9.64 kB | Add PR links to previous commits | 12 months ago |
| markdown.py | Safe | 9.83 kB | update urls | 11 months ago |
| requirements.txt | Safe | 73 Bytes | Initital commit | 12 months ago |
| utils.py | Safe | 6.11 kB | Get token from environment | 12 months ago |