Hugging Face

Enterprise
company
Verified
Activity Feed

AI & ML interests

The AI community building the future.

Recent Activity

Articles

huggingface's activity

fdaudens 
posted an update about 6 hours ago
view post
Post
377
What if AI becomes as ubiquitous as the internet, but runs locally and transparently on our devices?

Fascinating TED talk by @thomwolf on open source AI and its future impact.

Imagine this for AI: instead of black box models running in distant data centers, we get transparent AI that runs locally on our phones and laptops, often without needing internet access. If the original team moves on? No problem - resilience is one of the beauties of open source. Anyone (companies, collectives, or individuals) can adapt and fix these models.

This is a compelling vision of AI's future that solves many of today's concerns around AI transparency and centralized control.

Watch the full talk here: https://www.ted.com/talks/thomas_wolf_what_if_ai_just_works
AdinaY 
posted an update about 7 hours ago
AdinaY 
posted an update about 7 hours ago
davanstrien 
posted an update about 9 hours ago
view post
Post
321
📊 Introducing "Hugging Face Dataset Spotlight" 📊

I'm excited to share the first episode of our AI-generated podcast series focusing on nice datasets from the Hugging Face Hub!

This first episode explores mathematical reasoning datasets:

- SynthLabsAI/Big-Math-RL-Verified: Over 250,000 rigorously verified problems spanning multiple difficulty levels and mathematical domains
- open-r1/OpenR1-Math-220k: 220,000 math problems with multiple reasoning traces, verified for accuracy using Math Verify and Llama-3.3-70B models.
- facebook/natural_reasoning: 1.1 million general reasoning questions carefully deduplicated and decontaminated from existing benchmarks, showing superior scaling effects when training models like Llama3.1-8B-Instruct.

Plus a bonus segment on bespokelabs/bespoke-manim!

https://www.youtube.com/watch?v=-TgmRq45tW4
ngxson 
posted an update 1 day ago
view post
Post
1293
A comprehensive matrix for which format should you use.

Read more on my blog post: https://huggingface.co./blog/ngxson/common-ai-model-formats

| Hardware        | GGUF      | PyTorch                | Safetensors              | ONNX  |
|-----------------|-----------|------------------------|--------------------------|-------|
| CPU             | ✅ (best) | 🟡                      | 🟡                       ||
| GPU             |||||
| Mobile          || 🟡 (via executorch)     |||
| Apple silicon   || 🟡                      | ✅ (via MLX framework)   ||
  • 1 reply
·
davanstrien 
posted an update 1 day ago
view post
Post
1919
Quick POC: Turn a Hugging Face dataset card into a short podcast introducing the dataset using all open models.

I think I'm the only weirdo who would enjoy listening to something like this though 😅

Here is an example for eth-nlped/stepverify
  • 1 reply
·
burtenshaw 
posted an update 1 day ago
view post
Post
2532
I made a real time voice agent with FastRTC, smolagents, and hugging face inference providers. Check it out in this space:

🔗 burtenshaw/coworking_agent
·
fdaudens 
posted an update 2 days ago
view post
Post
2649
Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet?

Open source olmOCR just dropped and the results are impressive.

Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU - that's 1/32 the cost of GPT-4o ($190/million pages). Game-changer for content extraction and digital archives.

To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images.

Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up.

👉 Try the demo: https://olmocr.allenai.org

Going right into the AI toolkit: JournalistsonHF/ai-toolkit
  • 3 replies
·
burtenshaw 
posted an update 2 days ago
view post
Post
5462
Now the Hugging Face agent course is getting real! With frameworks like smolagents, LlamaIndex, and LangChain.

🔗 Follow the org for updates https://huggingface.co./agents-course

This week we are releasing the first framework unit in the course and it’s on smolagents. This is what the unit covers:

- why should you use smolagents vs another library?
- how to build agents that use code
- build multiagents systems
- use vision language models for browser use

The team has been working flat out on this for a few weeks. Led by @sergiopaniego and supported by smolagents author @m-ric .