HFforLegal (Hugging Face for Legal)

fdaudens

posted an update about 3 hours ago

Post

124

What if AI becomes as ubiquitous as the internet, but runs locally and transparently on our devices?

Fascinating TED talk by @thomwolf on open source AI and its future impact.

Imagine this for AI: instead of black box models running in distant data centers, we get transparent AI that runs locally on our phones and laptops, often without needing internet access. If the original team moves on? No problem - resilience is one of the beauties of open source. Anyone (companies, collectives, or individuals) can adapt and fix these models.

This is a compelling vision of AI's future that solves many of today's concerns around AI transparency and centralized control.

Watch the full talk here: https://www.ted.com/talks/thomas_wolf_what_if_ai_just_works

AdinaY

posted an update about 4 hours ago

Post

102

The AI race in the automotive industry is heating up🚗
Li Auto’s research team has released their latest paper on LLMs👇 LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation (2502.18302)

✨This paper introduces LDGen, which integrates LLMs with diffusion models to enhance text-to-image (T2I) generation capabilities.

AdinaY

posted an update about 4 hours ago

Post

94

LLaDA 🔥an 8B diffusion model by GSAI Lab Renmin University
✨Fully trained from scratch, LLaDA delivers performance on par with LLaMA3 8B
Model: GSAI-ML/LLaDA-8B-Instruct
Demo: multimodalart/LLaDA
Paper: Large Language Diffusion Models (2502.09992)

fdaudens

posted an update 1 day ago

Post

2430

Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet?

Open source olmOCR just dropped and the results are impressive.

Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU. At $190 per million pages, that's 1/32 the cost of GPT-4o. Game-changer for content extraction and digital archives.

To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images.

Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up.

👉 Try the demo: https://olmocr.allenai.org

Going right into the AI toolkit: JournalistsonHF/ai-toolkit

3 replies

AdinaY

posted an update 3 days ago

Post

2574

Wan2.1 🔥📹 new OPEN video model by Alibaba Wan team!

Model: Wan-AI/Wan2.1-T2V-14B
Demo: Wan-AI/Wan2.1

✨Apache 2.0
✨8.19GB VRAM, runs on most GPUs
✨Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A
✨Text Generation: Supports Chinese & English
✨Powerful Video VAE: Encode/decode 1080P w/ temporal precision

1 reply

AdinaY

posted an update 4 days ago

Post

2822

Try QwQ-Max-Preview, Qwen's reasoning model here👉 https://chat.qwen.ai
Can't wait for the model weights to drop on the Hugging Face Hub 🔥

2 replies

fdaudens

posted an update 4 days ago

Post

3162

🚀 Just launched: A toolkit of 20 powerful AI tools that journalists can use right now - transcribe, analyze, create. 100% free & open-source.

Been testing all these tools myself and created a searchable collection of the most practical ones - from audio transcription to image generation to document analysis. No coding needed, no expensive subscriptions.

Some highlights I've tested personally:
- Private, on-device transcription with speaker ID in 100+ languages using Whisper
- Website scraping that just works - paste a URL, get structured data
- Local image editing with tools like Finegrain (impressive results)
- Document chat using Qwen 2.5 72B (handles technical papers well)

Sharing this early because the best tools come from the community. Drop your favorite tools in the comments or join the discussion on what to add next!

👉 JournalistsonHF/ai-toolkit

AdinaY

posted an update 4 days ago

Post

2395

Two AI startups, DeepSeek & Moonshot AI, keep moving in perfect sync 👇

✨ Last December: DeepSeek & Moonshot AI released their reasoning models on the SAME DAY.
DeepSeek: deepseek-ai/DeepSeek-R1
Moonshot: https://github.com/MoonshotAI/Kimi-k1.5

✨ Last week: Both teams published papers on modifying attention mechanisms on the SAME DAY AGAIN.
DeepSeek: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (2502.11089)
Moonshot: MoBA: Mixture of Block Attention for Long-Context LLMs (2502.13189)

✨ TODAY:
DeepSeek unveiled FlashMLA: an efficient MLA decoding kernel for NVIDIA Hopper GPUs, optimized for variable-length sequences.
https://github.com/deepseek-ai/FlashMLA

Moonshot AI introduces Moonlight: a 3B/16B MoE trained on 5.7T tokens using Muon, pushing the Pareto frontier with fewer FLOPs.
moonshotai/Moonlight-16B-A3B

What's next? 👀

fdaudens

posted an update 7 days ago

Post

3445

Trying something new to keep you ahead of the curve: The 5 AI stories of the week - a weekly curation of the most important AI news you need to know. Do you like it?

For more AI stories and deeper analysis, check out my newsletter: https://open.substack.com/pub/fdaudens/p/ai-competition-heats-up-grok-3-iphone

1 reply

AdinaY

posted an update 8 days ago

Post

749

VLM-R1🔥bringing DeepSeek’s R1 method to vision language models!

GitHub: https://github.com/om-ai-lab/VLM-R1
Demo: omlab/VLM-R1-Referral-Expression

fdaudens

posted an update 10 days ago

Post

5762

🎯 Perplexity drops their FIRST open-weight model on Hugging Face: A decensored DeepSeek-R1 with full reasoning capabilities. Tested on 1000+ examples for unbiased responses.

Check it out: perplexity-ai/r1-1776
Blog post: https://perplexity.ai/hub/blog/open-sourcing-r1-1776

1 reply

AdinaY

posted an update 10 days ago

Post

4179

🚀 StepFun阶跃星辰 is making BIG open moves!

Last year, their GOT-OCR 2.0 took the community by storm 🔥but many didn’t know they were also building some amazing models. Now, they’ve just dropped something huge on the hub!

📺 Step-Video-T2V: a 30B bilingual open video model that generates 204 frames (8-10s) at 540P resolution with high information density & consistency.
stepfun-ai/stepvideo-t2v

🔊 Step-Audio-TTS-3B: a TTS model trained with the LLM-Chat paradigm on a large synthetic dataset, capable of generating RAP & Humming
stepfun-ai/step-audio-67b33accf45735bb21131b0b

3 replies

AdinaY

posted an update 10 days ago

Post

2420

DeepSeek's latest paper is now available on the Daily Papers page 🚀
You can reach out to the authors directly on this page👇
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (2502.11089)

1 reply

clem

posted an update 10 days ago

Post

2682

What are the best organizations to follow on @huggingface ?

Off the top of my head:
- Deepseek (35,000 followers): https://huggingface.co./deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co./meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co./black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co./openai
- Nvidia (16,000 followers): https://huggingface.co./nvidia
- Microsoft (9,000 followers): https://huggingface.co./microsoft
- AllenAI (2,000 followers): https://huggingface.co./allenai
- Mistral (5,000 followers): https://huggingface.co./mistralai
- XAI (600 followers): https://huggingface.co./xai-org
- Stability AI (16,000 followers): https://huggingface.co./stabilityai
- Qwen (16,000 followers): https://huggingface.co./Qwen
- GoogleAI (8,000 followers): https://huggingface.co./google
- Unsloth (3,000 followers): https://huggingface.co./unsloth
- Bria AI (4,000 followers): https://huggingface.co./briaai
- NousResearch (1,300 followers): https://huggingface.co./NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co./agents-course

1 reply

clem

posted an update 11 days ago

Post

3411

We crossed 1B+ tokens routed to our inference provider partners on HF, a feature we released just a few days ago.

Just getting started of course but early users seem to like it & always happy to be able to partner with cool startups in the ecosystem.

Have you been using any integration and how can we make it better?

https://huggingface.co./blog/inference-providers

fdaudens

posted an update 12 days ago

Post

2268

Will we soon all have our own personalized AI news agents? And what does it mean for journalism?

Just built a simple prototype based on the Hugging Face course. It lets you get customized news updates on any topic.

Not perfect yet, but you can see where things could go: we'll all be able to build personalized AI agents that curate & analyze news for each of us. Users could also decide to build custom news products for their needs, such as truly personalized newsletters or podcasts.

The implications for both readers & news organizations are significant. To name a few:
- Will news articles remain the best format for informing people?
- What monetization model will work for news organizations?
- How do you create an effective conversion funnel?

👉 Try it here: fdaudens/my-news-agent (Code is open-source)
👉 Check out the course: https://huggingface.co./learn/agents-course/unit0/introduction

louisbrulenaudet

posted an update 12 days ago

Post

3051

I am pleased to introduce my first project built upon Hugging Face’s smolagents framework, integrated with Alpaca for financial market analysis automation 🦙🤗

The project implements technical indicators such as the Relative Strength Index (RSI) and Bollinger Bands to provide momentum and volatility analysis. Market data is retrieved through the Alpaca API, enabling access to historical price information across various timeframes.
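
For readers unfamiliar with the two indicators named above, here is a minimal sketch of how they are commonly computed. It is not taken from the linked project: the function names `rsi` and `bollinger_bands` are hypothetical, the RSI uses plain averages over the window rather than Wilder's smoothing, and the periods (14 and 20) are just the conventional defaults.

```python
from statistics import mean, pstdev


def rsi(closes, period=14):
    """Relative Strength Index from the last `period` price changes.

    Assumption: simple averages of gains and losses (no Wilder smoothing).
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Pair each close with the next one to get `period` consecutive changes.
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses in the window: flat prices map to 50, pure gains to 100.
        return 50.0 if avg_gain == 0 else 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def bollinger_bands(closes, period=20, num_std=2.0):
    """Return (lower, middle, upper) bands over the last `period` closes."""
    if len(closes) < period:
        raise ValueError("need at least `period` closing prices")
    window = closes[-period:]
    middle = mean(window)                # simple moving average
    spread = num_std * pstdev(window)    # population standard deviation
    return middle - spread, middle, middle + spread


if __name__ == "__main__":
    prices = [44.0, 44.5, 44.2, 45.1, 45.8, 45.3, 46.0, 46.4]
    print("RSI(5):", round(rsi(prices, period=5), 2))
    print("Bands(5):", bollinger_bands(prices, period=5))
```

RSI near 100 suggests strong recent upward momentum and near 0 the opposite, while wider Bollinger Bands indicate higher recent volatility.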

AI-powered insights are generated using Hugging Face’s inference API, which analyzes market trends through natural language processing, while DuckDuckGo search integration supplies financial news for real-time sentiment analysis 🦆

Link to the GitHub project: https://github.com/louisbrulenaudet/agentic-market-tool

fdaudens

posted an update 14 days ago

Post

2117

🔊 Meet Kokoro Web - free ML speech synthesis on your computer that'll make you ditch paid services!

28 natural voices, unlimited generations, and WebGPU acceleration. Perfect for journalists and content creators.

Test it with full articles—sounds amazingly human! 🎯🎙️

Xenova/kokoro-web

AdinaY

posted an update 15 days ago

Post

2553

Ovis2 🔥 a multimodal LLM released by Alibaba AIDC team.
AIDC-AI/ovis2-67ab36c7e497429034874464
✨1B/2B/4B/8B/16B/34B
✨Strong CoT for deeper problem solving
✨Multilingual OCR – Expanded beyond English & Chinese, with better data extraction

fdaudens

posted an update 15 days ago

Post

2681

⭐️ The AI Energy Score project just launched - this is a game-changer for making informed decisions about AI deployment.

You can now see exactly how much energy your chosen model will consume, with a simple 5-star rating system. Think appliance energy labels, but for AI.

Looking at transcription models on the leaderboard is fascinating: choosing between whisper-tiny and whisper-large-v3 can make a 7x difference in energy use. Real-time data on these tradeoffs changes everything.

166 models already evaluated across 10 different tasks, from text generation to image classification. The whole thing is public and you can submit your own models to test.

Why this matters:
- Teams can pick efficient models that still get the job done
- Developers can optimize for energy use from day one
- Organizations can finally predict their AI environmental impact

If you're building with AI at any scale, definitely worth checking out.

👉 leaderboard: https://lnkd.in/esrSxetj
👉 blog post: https://lnkd.in/eFJvzHi8

Huge work led by @sasha with @bgamazay @yjernite @sarahooker @regisss @meg

1 reply

Hugging Face for Legal

AI & ML interests

HFforLegal's activity

Team members 83
