Hugging Face
Greg Lindahl
greglindahl
4 followers · 3 following
wumpus
AI & ML interests
None yet
Recent Activity
Authored a paper about 1 month ago: Towards Best Practices for Open Datasets for LLM Training
Updated a Space 3 months ago: commoncrawl/README
Updated a dataset 5 months ago: commoncrawl/eot2024_hostlevel_logs
greglindahl's activity
Authored a paper about 1 month ago
Towards Best Practices for Open Datasets for LLM Training
Paper • 2501.08365 • Published Jan 14 • 55
Updated a Space 3 months ago
README 🌍 (Running)
Explore Common Crawl's metadata and experimental datasets
Updated 2 datasets 5 months ago
commoncrawl/eot2024_hostlevel_logs
Viewer • Updated Oct 9, 2024 • 271k • 18 • 1
commoncrawl/citations
Viewer • Updated Sep 22, 2024 • 8.48k • 209
New activity in commoncrawl/citations 5 months ago
Upload 2024.jsonl.gz
#2 opened 5 months ago by greglindahl
Updated a dataset 7 months ago
commoncrawl/citations-annotated
Viewer • Updated Aug 6, 2024 • 424 • 188
New activity in commoncrawl/README 9 months ago
start a README
#1 opened 9 months ago by greglindahl