Hugging Face

Enterprise

company

Verified

https://huggingface.co

huggingface

Activity Feed

AI & ML interests

The AI community building the future.

Recent Activity

m-ric updated a dataset 1 day ago

huggingface/documentation-images

evijit updated a dataset 1 day ago

huggingface/policy-docs

evijit new activity 1 day ago

huggingface/policy-docs:Reuploaded this file with addition of credit to Bruna


Articles

Yay! Organizations can now publish blog Articles

Jan 20

• 34

huggingface's activity

fdaudens

posted an update about 6 hours ago

Post

377

What if AI becomes as ubiquitous as the internet, but runs locally and transparently on our devices?

Fascinating TED talk by @thomwolf on open source AI and its future impact.

Imagine this for AI: instead of black box models running in distant data centers, we get transparent AI that runs locally on our phones and laptops, often without needing internet access. If the original team moves on? No problem - resilience is one of the beauties of open source. Anyone (companies, collectives, or individuals) can adapt and fix these models.

This is a compelling vision of AI's future that solves many of today's concerns around AI transparency and centralized control.

Watch the full talk here: https://www.ted.com/talks/thomas_wolf_what_if_ai_just_works

AdinaY

posted an update about 7 hours ago

Post

167

The AI race in the automotive industry is heating up 🚗
Li Auto’s research team has released their latest paper on LLMs 👇 LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation (2502.18302)

✨This paper introduces LDGen, which integrates LLMs with diffusion models to enhance text-to-image (T2I) generation capabilities.

AdinaY

posted an update about 7 hours ago

Post

148

LLaDA 🔥 an 8B diffusion model from the GSAI Lab at Renmin University
✨Fully trained from scratch, LLaDA delivers performance on par with LLaMA3 8B
Model: GSAI-ML/LLaDA-8B-Instruct
Demo: multimodalart/LLaDA
Paper: Large Language Diffusion Models (2502.09992)

davanstrien

posted an update about 9 hours ago

Post

321

📊 Introducing "Hugging Face Dataset Spotlight" 📊

I'm excited to share the first episode of our AI-generated podcast series focusing on nice datasets from the Hugging Face Hub!

This first episode explores mathematical reasoning datasets:

- SynthLabsAI/Big-Math-RL-Verified: Over 250,000 rigorously verified problems spanning multiple difficulty levels and mathematical domains.
- open-r1/OpenR1-Math-220k: 220,000 math problems with multiple reasoning traces, verified for accuracy using Math Verify and Llama-3.3-70B models.
- facebook/natural_reasoning: 1.1 million general reasoning questions carefully deduplicated and decontaminated from existing benchmarks, showing superior scaling effects when training models like Llama3.1-8B-Instruct.

Plus a bonus segment on bespokelabs/bespoke-manim!

https://www.youtube.com/watch?v=-TgmRq45tW4

m-ric

updated a dataset 1 day ago

huggingface/documentation-images

Viewer • Updated 1 day ago • 50 • 4.7M • 52

ngxson

posted an update 1 day ago

Post

1293

A comprehensive matrix showing which model format you should use on which hardware.

Read more in my blog post: https://huggingface.co/blog/ngxson/common-ai-model-formats

| Hardware        | GGUF      | PyTorch                | Safetensors              | ONNX  |
|-----------------|-----------|------------------------|--------------------------|-------|
| CPU             | ✅ (best) | 🟡                      | 🟡                       | ✅    |
| GPU             | ✅        | ✅                      | ✅                       | ✅    |
| Mobile          | ✅        | 🟡 (via executorch)     | ❌                       | ✅    |
| Apple silicon   | ✅        | 🟡                      | ✅ (via MLX framework)   | ✅    |
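
As a rough illustration, the matrix above can be encoded as a lookup table and queried per hardware target. This is only a sketch: the names `SUPPORT`, `RANK`, and `usable_formats` are hypothetical, and reading 🟡 as "partial" support is an assumption rather than something the post states. The parenthetical notes (executorch, MLX) are left out for brevity.

```python
# Support levels transcribed from the matrix above.
# Assumption: the yellow marker is read as "partial" support.
SUPPORT = {
    "CPU": {"GGUF": "best", "PyTorch": "partial", "Safetensors": "partial", "ONNX": "yes"},
    "GPU": {"GGUF": "yes", "PyTorch": "yes", "Safetensors": "yes", "ONNX": "yes"},
    "Mobile": {"GGUF": "yes", "PyTorch": "partial", "Safetensors": "no", "ONNX": "yes"},
    "Apple silicon": {"GGUF": "yes", "PyTorch": "partial", "Safetensors": "yes", "ONNX": "yes"},
}

# Lower rank sorts first; unsupported formats are filtered out before ranking.
RANK = {"best": 0, "yes": 1, "partial": 2}


def usable_formats(hardware):
    """Return the formats usable on `hardware`, strongest support first."""
    row = SUPPORT[hardware]
    candidates = [fmt for fmt, level in row.items() if level != "no"]
    # sorted() is stable, so ties keep the column order of the matrix.
    return sorted(candidates, key=lambda fmt: RANK[row[fmt]])


print(usable_formats("Mobile"))
print(usable_formats("CPU"))
```

For example, `usable_formats("Mobile")` drops Safetensors and lists the fully supported formats ahead of PyTorch.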

1 reply

davanstrien

posted an update 1 day ago

Post

1919

Quick POC: Turn a Hugging Face dataset card into a short podcast introducing the dataset using all open models.

I think I'm the only weirdo who would enjoy listening to something like this though 😅

Here is an example for eth-nlped/stepverify

1 reply

burtenshaw

posted an update 1 day ago

Post

2532

I made a real-time voice agent with FastRTC, smolagents, and Hugging Face Inference Providers. Check it out in this Space:

🔗 burtenshaw/coworking_agent

5 replies

evijit

updated a dataset 1 day ago

huggingface/policy-docs

Updated Jan 17 • 2.1k • 9

evijit

in huggingface/policy-docs 1 day ago

Reuploaded this file with addition of credit to Bruna

#14 opened 1 day ago by

evijit

Upload 2025_UK_Govt_Consultation_Copyright_and_Artificial_Intelligence.pdf

#12 opened 1 day ago by

evijit

Update README.md

#13 opened 1 day ago by

evijit

stevhliu

in huggingface/documentation-images 2 days ago

Upload marigold_einstein_albedo_uncertainty.png

#439 opened 5 days ago by

toshas

Extra files for the Marigold Intrinsic Image Decomposition pipeline (PR coming)

#438 opened 5 days ago by

toshas

fdaudens

posted an update 2 days ago

Post

2649

Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet?

Open source olmOCR just dropped and the results are impressive.

Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU. At $190 per million pages, that's 1/32 the cost of GPT-4o. Game-changer for content extraction and digital archives.
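
To make the cost comparison concrete, here is a quick back-of-the-envelope sketch. It assumes the $190 per million pages figure refers to olmOCR and that "1/32 the cost" means GPT-4o costs 32 times as much; the implied GPT-4o number is derived from those two figures, not independently reported, and the function name `cost_for_pages` is hypothetical.

```python
# Figures taken from the post; their interpretation is an assumption.
OLMOCR_COST_PER_MILLION_PAGES = 190.0  # USD per million pages
COST_RATIO = 32                        # GPT-4o cost relative to olmOCR


def cost_for_pages(pages, cost_per_million):
    """Scale a per-million-pages price to an arbitrary page count."""
    return pages * cost_per_million / 1_000_000


# Implied GPT-4o price per million pages under the stated ratio.
gpt4o_cost_per_million = OLMOCR_COST_PER_MILLION_PAGES * COST_RATIO

archive_pages = 10_000  # hypothetical archive size
print(f"Implied GPT-4o cost: ${gpt4o_cost_per_million:,.0f} per million pages")
print(f"olmOCR, {archive_pages:,} pages: ${cost_for_pages(archive_pages, OLMOCR_COST_PER_MILLION_PAGES):.2f}")
print(f"GPT-4o, {archive_pages:,} pages: ${cost_for_pages(archive_pages, gpt4o_cost_per_million):.2f}")
```

Under these assumptions, the implied GPT-4o price works out to about $6,080 per million pages.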

To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images.

Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up.

👉 Try the demo: https://olmocr.allenai.org

Going right into the AI toolkit: JournalistsonHF/ai-toolkit

3 replies

burtenshaw

posted an update 2 days ago

Post

5462

The Hugging Face agents course is getting real now, with frameworks like smolagents, LlamaIndex, and LangChain.

🔗 Follow the org for updates: https://huggingface.co/agents-course

This week we are releasing the first framework unit in the course and it’s on smolagents. This is what the unit covers:

- why you should use smolagents rather than another library
- how to build agents that use code
- how to build multi-agent systems
- how to use vision language models for browser use

The team has been working flat out on this for a few weeks. The effort is led by @sergiopaniego and supported by smolagents author @m-ric.