
Julien Chaumond PRO

julien-c

AI & ML interests

<3 ML/AI for everyone, building products to propel communities fwd

Recent Activity

liked a model about 7 hours ago
deepseek-ai/DeepSeek-R1
liked a dataset about 8 hours ago
cais/hle
commented on an article about 13 hours ago
Mastering Long Contexts in LLMs with KVPress

Articles

Organizations

Hugging Face, Safetensors, Notebooks-explorers, Nbconvert-internal, BigScience Workshop, Spaces-explorers, Flax Community, Templates, Hugging Face Course, Giskard, ph-snps, Text Generation Inference, Amazon SageMaker Community, Training Transformers Together, Hugging Chat, Atmos Bank, Godot Engine Demos, Pyodide Demos, Huggingface.js, Webhooks Explorers (BETA), Workshop June 13 Classroom, HF Canonical Model Maintainers, TRL, Open-Source AI Meetup, Scanned Tokens, HF Legal, Language Tools, Stable Diffusion concepts library, Teven-projects, Banana-projects, Exbert-project, Blog-explorers, EU org, Hacktoberfest 2023, huggingPartyParis, Enterprise Explorers, ZeroGPU Explorers, OpenAI community, XLNet community, ALBERT community, Transformer-XL community, Facebook AI community, DistilBERT community, BERT community, T5 community, choosealicense.com mirror, Social Post Explorers, Dev Mode Explorers, Test, private beta for deeplinks, Paris AI Running Club, kmhf, Hugging Face Party @ PyTorch Conference, Nerdy Face, Hugging Face Science, open/ acc, DDUF, Self-serve FTW, Inference Explorers

julien-c's activity

reacted to florentgbelidji's post with 🔥 2 days ago
๐—ฃ๐—น๐—ฎ๐—ป๐—ป๐—ถ๐—ป๐—ด ๐—ฌ๐—ผ๐˜‚๐—ฟ ๐—ก๐—ฒ๐˜…๐˜ ๐—ฆ๐—ธ๐—ถ ๐—”๐—ฑ๐˜ƒ๐—ฒ๐—ป๐˜๐˜‚๐—ฟ๐—ฒ ๐—๐˜‚๐˜€๐˜ ๐—š๐—ผ๐˜ ๐—ฆ๐—บ๐—ฎ๐—ฟ๐˜๐—ฒ๐—ฟ: ๐—œ๐—ป๐˜๐—ฟ๐—ผ๐—ฑ๐˜‚๐—ฐ๐—ถ๐—ป๐—ด ๐—”๐—น๐—ฝ๐—ถ๐—ป๐—ฒ ๐—”๐—ด๐—ฒ๐—ป๐˜!๐Ÿ”๏ธโ›ท๏ธ

With the big hype around AI agents these days, I couldn't stop thinking about how AI agents could truly enhance real-world activities.
What sort of applications could we build with those AI agents: agentic RAG? Self-correcting text-to-SQL? Nah, boring…

Passionate about the outdoors, I've always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. That's why I built Alpine Agent, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.

Built using Hugging Face's smolagents library, Alpine Agent combines the power of AI with trusted resources like Skitour.fr (https://skitour.fr/) and Météo-France. Whether it's suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, the agent dynamically integrates data to deliver personalized recommendations.

In my latest blog post, I share how I developed this project, from defining tools and integrating APIs to selecting the best LLMs like Qwen2.5-Coder-32B-Instruct, Llama-3.3-70B-Instruct, or GPT-4.

⛷️ Curious how AI can enhance adventure planning? Try the app and share your thoughts: florentgbelidji/alpine-agent

👉 Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co./blog/florentgbelidji/alpine-agent

Many thanks to @m-ric for helping on building this tool with smolagents!
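Under the hood, an agent like this boils down to a tool-calling loop. Here is a minimal plain-Python sketch of that pattern; the tool, its data, and the stubbed "model" are all hypothetical, and smolagents' actual API automates this loop with a real LLM:

```python
# Hypothetical sketch of the tool-calling loop an agent library automates.
# The tool, its data, and the stubbed "model" below are illustrative only.

def avalanche_risk(massif: str) -> int:
    """Hypothetical tool: return an avalanche risk level (1-5) for a massif."""
    fake_bulletin = {"Vanoise": 3, "Mont-Blanc": 4}  # stand-in for a real API
    return fake_bulletin.get(massif, 2)

TOOLS = {"avalanche_risk": avalanche_risk}

def stub_model(query: str) -> dict:
    """Stand-in for an LLM deciding which tool to call with which arguments."""
    if "avalanche" in query.lower():
        return {"tool": "avalanche_risk", "args": {"massif": "Vanoise"}}
    return {"tool": None, "args": {}}

def run_agent(query: str) -> str:
    """One step of the observe -> decide -> act loop."""
    decision = stub_model(query)
    if decision["tool"] is None:
        return "No tool needed."
    result = TOOLS[decision["tool"]](**decision["args"])
    return f"{decision['tool']} -> {result}"

print(run_agent("What is the avalanche risk in the Vanoise today?"))
```

In the real agent, the stub is replaced by an instruction-tuned model and the tools wrap live weather and route APIs.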
replied to Nitral-AI's post 13 days ago

FWIW their PM (Chris Perry) is quite helpful on Twitter. Maybe try to ping him?

reacted to Nitral-AI's post with 😔 13 days ago
That moment when you spend 5 days babysitting training runs, only for Colab Pro+ to randomly disconnect the environment at every chance with zero error indication of any kind (it just disconnects, no error). I nuke the session from the interface, but it continues to eat my Colab credits while it reports to wandb. There's no way to save the models when this happens, since it nukes the code preset to auto-execute. And since the sessions 'exist' but at the same time don't exist, I can't close them and have to wait till they auto-timeout after 24 hrs. Guess I won't be using Colab for 'quick' test trains anymore. Thanks, Google, for burning through the very little model-training budget I had for the month.
replied to burtenshaw's post about 1 month ago
reacted to burtenshaw's post with 🤗❤️ about 1 month ago
People are flexing their end-of-year stats, so I made this app to show Hub stats in a tidy design!

Thanks @Ameeeee and @jfcalvo from Argilla for the feature!
burtenshaw/recap
replied to victor's post about 1 month ago
reacted to Kseniase's post with 🔥 about 1 month ago
TL;DR: The Story of Attention's Development by @karpathy

Origin: First proposed in 2014 by @Dzmitry Bahdanau, @KyunghyunCho, and Yoshua Bengio in Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473). Inspired by cognitive processes and later renamed from "RNNSearch."

Key Idea: A data-dependent weighted average for pooling and communication, enabling flexible and powerful neural network connections.

Breakthrough: Bahdanau's "soft search" mechanism (softmax + weighted averaging) solved encoder-decoder bottlenecks in machine translation.
Transformer Revolution: Attention Is All You Need (1706.03762) (2017) by @ashishvaswanigoogle et al. simplified architectures by stacking attention layers, introducing multi-headed attention and positional encodings.
Legacy: Attention replaced RNNs, driving modern AI systems like ChatGPT. It emerged independently but was influenced by contemporaneous work like Alex Graves's Neural Turing Machines (1410.5401) and Jason Weston's Memory Networks (1410.3916).

Attention to history: Jürgen Schmidhuber claims his 1992 Fast Weight Programmers anticipated modern attention mechanisms. While conceptually similar, the term "attention" was absent, and there's no evidence it influenced Bahdanau, Cho, and Bengio's 2014 work. Paying attention (!) to history might have brought us to genAI earlier, but credit for the breakthrough still goes to Montreal.
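The "softmax + weighted averaging" at the core of the mechanism is small enough to sketch in pure Python. A toy single-query example (real implementations batch this and scale the scores):

```python
import math

def softmax(scores):
    """Turn raw alignment scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Data-dependent weighted average: weights come from query-key similarity."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]  # dot products
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy example: the query matches the second key most strongly,
# so the output leans toward the second value.
query = [1.0, 0.0]
keys = [[0.0, 1.0], [1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
print(attend(query, keys, values))
```

This is exactly Bahdanau's "soft search": instead of picking one encoder state, the decoder blends all of them, weighted by how well each aligns with the current query.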

Referenced Papers:
Attention Origin: Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Transformers: Attention Is All You Need (1706.03762)
Alex Graves' Work: Neural Turing Machines (1410.5401), Generating Sequences With Recurrent Neural Networks (1308.0850)
Jason Weston (@spermwhale): Memory Networks (1410.3916)
Sequence to Sequence Learning with Neural Networks (1409.3215) by Ilya Sutskever (@ilyasut), Oriol Vinyals, and Quoc V. Le

Who else deserves recognition in this groundbreaking narrative of innovation? Let's ensure every contributor gets the credit they deserve. Leave a comment below 👇🏻🤗
replied to Duskfallcrew's post about 1 month ago

Public storage- y'all ... HF are you nuts?

I can neither confirm nor deny

reacted to FranckAbgrall's post with 👍 about 1 month ago
Hey!

✨ If you're using HF access tokens, we just released an overview of the permissions for fine-grained tokens: hover over the badge on the token settings page (org and user).

It will show the highest permission you've set for each entity 👀
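The "highest permission per entity" logic amounts to a max over a ranking. A sketch, assuming illustrative permission names and ordering (not the Hub's actual schema):

```python
# Illustrative sketch of "show the highest permission per entity" for a
# fine-grained token. Permission names and ranking here are assumptions
# for the example, not the Hub's actual schema.

RANK = {"read": 0, "write": 1, "admin": 2}

def highest_permission(grants):
    """Map each entity to the strongest permission granted to it."""
    best = {}
    for entity, perm in grants:
        if entity not in best or RANK[perm] > RANK[best[entity]]:
            best[entity] = perm
    return best

grants = [("org/models", "read"), ("org/models", "write"), ("user/datasets", "read")]
print(highest_permission(grants))
```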
reacted to their post with 😎🤝👍🤗❤️🔥 about 1 month ago
After some heated discussion 🔥, we clarify our intent re: storage limits on the Hub

TL;DR:
- public storage is free and (barring blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co./docs/hub/storage-limits
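The free tiers described above reduce to a simple threshold. A sketch, assuming 1 TB = 1000 GB for readability (billing specifics live in the linked docs):

```python
# Sketch of the private-storage free tiers described above: 1 TB free on a
# paid account, 100 GB otherwise; public storage is not billed at all.
# Assumes 1 TB = 1000 GB for simplicity.

def billable_private_gb(private_gb: float, paid_account: bool) -> float:
    """Return how many GB of private storage exceed the free tier."""
    free_tier_gb = 1000 if paid_account else 100
    return max(0.0, private_gb - free_tier_gb)

print(billable_private_gb(150, paid_account=False))  # 50 GB over the free tier
print(billable_private_gb(150, paid_account=True))   # 0, within the 1 TB tier
```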

We continuously optimize our infrastructure to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
reacted to burtenshaw's post with 🔥 about 1 month ago
Quick update from week 1 of smol course. The community is taking the driving seat and using the material for their own projects. If you want to do the same, join in!

- we have ongoing translation projects in Korean, Vietnamese, Portuguese, and Spanish
- 3 chapters are ready for students, on topics like instruction tuning, preference alignment, and parameter-efficient fine-tuning
- 3 chapters are in progress, on evaluation, vision language models, and synthetic data
- around 780 people have forked the repo to use it for learning, teaching, and sharing

โญ๏ธ Next step is to support people that want to use the course for teaching, content creation, internal knowledge sharing, or anything. If you're into this. Drop an issue or PR

REPO: https://buff.ly/3ZCMKX2
discord channel: https://buff.ly/4f9F8jA
reacted to bartowski's post with 👀 about 1 month ago
Looks like Q4_0_N_M file types are going away

Before you panic: there's a new "preferred" method, online repacking (I prefer the term on-the-fly). If you download Q4_0 and your setup can benefit from repacking the weights into interleaved rows (what Q4_0_4_4 was doing), it will do that automatically and give you similar performance (minor losses, I think, due to using intrinsics instead of assembly, but intrinsics are more maintainable).
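For intuition, the row-interleaving layout can be sketched in a few lines of Python. This is a simplified illustration only: real Q4_0_4_4 repacking operates on 4-bit quantized blocks, not plain lists:

```python
# Simplified illustration of row interleaving (the layout Q4_0_4_4 used):
# groups of 4 rows are stored column-first within each group, so a SIMD
# kernel can load one element from each of 4 rows contiguously.
# Real repacking works on 4-bit quantized blocks; this shows the pattern only.

def interleave_rows(matrix, group=4):
    """Repack rows in groups, emitting column-major order within each group."""
    packed = []
    for start in range(0, len(matrix), group):
        rows = matrix[start:start + group]
        for col in range(len(rows[0])):
            for row in rows:
                packed.append(row[col])
    return packed

weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(interleave_rows(weights))  # [1, 3, 5, 7, 2, 4, 6, 8]
```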

You can see the reference PR here:

https://github.com/ggerganov/llama.cpp/pull/10446

So if you update your llama.cpp past that point, you won't be able to run Q4_0_4_4 (unless they add backwards compatibility), but Q4_0 should run at the same speeds (though it may currently be bugged on some platforms).

As such, I'll stop making those newer model formats soon, probably by the end of this week unless something changes, but you should be safe to download Q4_0 quants and use those!

Also, IQ4_NL supports repacking, though not in as many shapes yet, but it should get a respectable speedup on ARM chips. The PR for that can be found here: https://github.com/ggerganov/llama.cpp/pull/10541

Remember, these are not meant for Apple silicon, since those use the GPU and don't benefit from the repacking of weights.