Louis Brulé Naudet PRO

louisbrulenaudet

AI & ML interests

Research in business taxation and development, University Dauphine-PSL 📖 | Backed by the Microsoft for Startups Hub program and Google Cloud Platform for startups program | Hugging Face for Legal 🤗

louisbrulenaudet's activity

reacted to yagilb's post with 👀 3 days ago
reacted to singhsidhukuldeep's post with 👀 4 days ago
Exciting Research Alert: Revolutionizing Dense Passage Retrieval with Entailment Tuning!

The good folks at HKUST have developed a novel approach that significantly improves information retrieval by leveraging natural language inference.

The entailment tuning approach consists of several key steps to enhance dense passage retrieval performance.

Data Preparation
- Convert questions into existence claims using rule-based transformations.
- Combine retrieval data with NLI data from SNLI and MNLI datasets.
- Unify the format of both data types using a consistent prompting framework.

Entailment Tuning Process
- Initialize the model using pre-trained language models like BERT or RoBERTa.
- Apply aggressive masking (β=0.8) specifically to the hypothesis components while preserving premise information.
- Train the model to predict the masked hypothesis tokens from the premise content.
- Run the training for 10 epochs using 8 GPUs, taking approximately 1.5-3.5 hours.
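
The masking step above can be sketched in a few lines. This is only an illustration under assumptions: the `[MASK]` and `[SEP]` markers follow the usual BERT conventions, and the function name `mask_hypothesis` and the token-list input format are hypothetical, not taken from the authors' code.

```python
import random

MASK = "[MASK]"  # assumed BERT-style mask token
SEP = "[SEP]"    # assumed separator between premise and hypothesis


def mask_hypothesis(premise, hypothesis, beta=0.8, seed=None):
    """Mask each hypothesis token with probability beta; never mask the premise."""
    rng = random.Random(seed)
    masked, labels = [], []
    for token in hypothesis:
        if rng.random() < beta:
            masked.append(MASK)
            labels.append(token)   # the model must predict this token
        else:
            masked.append(token)
            labels.append(None)    # visible token, not scored
    inputs = premise + [SEP] + masked
    # Premise and separator positions carry no prediction target.
    targets = [None] * (len(premise) + 1) + labels
    return inputs, targets
```

With β=0.8 most of the hypothesis is hidden, so the model has to rely on the premise to recover it, which is the intuition behind the approach described in the post.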

Training Arguments for Entailment Tuning (Yes! They Shared Them)
- Use a learning rate of 2e-5 with 100 warmup steps.
- Set batch size to 128.
- Apply weight decay of 0.01.
- Utilize the Adam optimizer with beta values (0.9, 0.999).
- Maintain maximum gradient norm at 1.0.
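
For reference, the hyperparameters listed above fit in a small config object. This is a sketch, not the authors' training script: the class name `EntailmentTuningConfig` and the helper `warmup_lr` are hypothetical, and the linear shape of the warmup (then a constant rate) is an assumption, since the post only gives the number of warmup steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentTuningConfig:
    learning_rate: float = 2e-5
    warmup_steps: int = 100
    batch_size: int = 128
    weight_decay: float = 0.01
    adam_betas: tuple = (0.9, 0.999)
    max_grad_norm: float = 1.0
    num_epochs: int = 10


def warmup_lr(step, cfg):
    """Assumed schedule: linear warmup to the peak rate, then constant."""
    if step < cfg.warmup_steps:
        return cfg.learning_rate * (step + 1) / cfg.warmup_steps
    return cfg.learning_rate
```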

Deployment
- Index passages using FAISS for efficient retrieval.
- Shard vector store across multiple GPUs.
- Enable sub-millisecond retrieval of the top-100 passages per query.
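
To make the retrieval step concrete, here is what "top-k passages per query" computes, written as an exhaustive inner-product search in plain Python. This is not the FAISS API and not how the authors deployed it; FAISS replaces this loop with optimized, optionally GPU-sharded indexes. The function name `top_k` is hypothetical.

```python
import heapq


def top_k(query, passages, k=100):
    """Return (score, index) pairs for the k passages with the highest inner product."""
    scored = (
        (sum(q * p for q, p in zip(query, vector)), index)
        for index, vector in enumerate(passages)
    )
    # nlargest returns the pairs sorted by descending score.
    return heapq.nlargest(k, scored)
```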

Integration with Existing Systems
- Insert entailment tuning between pre-training and fine-tuning stages.
- Maintain compatibility with current dense retrieval methods.
- Preserve existing contrastive learning approaches during fine-tuning.

Simple, intuitive, and effective!

This advancement significantly improves the quality of retrieved passages for question-answering systems and retrieval-augmented generation tasks.
reacted to reach-vb's post with 🚀 6 days ago
Smol models ftw! AMD released AMD OLMo 1B - beats OpenELM and TinyLlama on MT-Bench and AlpacaEval - Apache 2.0 licensed 🔥

> Trained with 1.3 trillion (dolma 1.7) tokens on 16 nodes, each with 4 MI250 GPUs

> Three checkpoints:

- AMD OLMo 1B: Pre-trained model
- AMD OLMo 1B SFT: Supervised fine-tuned on Tulu V2, OpenHermes-2.5, WebInstructSub, and Code-Feedback datasets
- AMD OLMo 1B SFT DPO: Aligned with human preferences using Direct Preference Optimization (DPO) on UltraFeedback dataset

Key Insights:
> Pre-trained with less than half the tokens of OLMo-1B
> Post-training steps include two-phase SFT and DPO alignment
> Data for SFT:
- Phase 1: Tulu V2
- Phase 2: OpenHermes-2.5, WebInstructSub, and Code-Feedback

> Model checkpoints on the Hub & Integrated with Transformers ⚡️

Congratulations & kudos to AMD on a brilliant smol model release! 🤗

amd/amd-olmo-6723e7d04a49116d8ec95070
replied to their post 13 days ago

Hello,

Thank you for reaching out. I'm interested in learning more about your project's potential applications and dataset specifics. To ensure we’re aligned on objectives and timelines, would you mind elaborating on the following in the Tally form? (https://tally.so/r/w2xe0A)

  • Project Goals: What are the primary objectives for your model, and how do you envision deploying it?
  • Data and Compute Requirements: Could you outline the volume and nature of data you'd like to process and any specific requirements for H100 access?
  • Finetuning Method: I'd be interested to hear more about your finetuning approach. Do you have a plan for iterations or specific benchmarks in mind?

Please submit your responses via the form to streamline our discussion. Once we have the foundational details clarified, we can determine the next steps and see how best to leverage the Azure credits together.

Looking forward to exploring the possibilities.

Best regards, Louis

replied to their post 13 days ago

Hello @Siddartha10,

Thank you for reaching out! I'm excited to hear about your work and the potential for collaboration.

To help assess how best to support your project, could you please share a bit more detail? Specifically:

  • Project Overview: A brief description of your project and its objectives.
  • Data Preparedness: Whether your data is ready for immediate use and the nature of this data.
  • Expected Outcomes: The goals or deliverables you anticipate achieving with this additional compute power.

Feel free to submit your details via this Tally form (https://tally.so/r/w2xe0A) so we can proceed efficiently.

Looking forward to learning more about your project and potentially collaborating!

Best regards,
Louis

replied to their post 13 days ago

Hi @Pankaj8922,

Thank you for reaching out and sharing your project concept! For this collaboration, I'm specifically seeking projects that already have data prepared and ready for immediate use, as the Azure credits are limited and focused on applications that can be initiated without additional data generation steps.

If you have any projects with data fully prepared, feel free to submit details through the form here: https://tally.so/r/w2xe0A.

Best of luck with your synthetic dataset project!

posted an update 13 days ago
Introducing Lemone-router, a series of classification models designed to produce an optimal multi-agent system for different branches of tax law.

Trained on a base of 49k lines comprising synthetic questions generated by GPT-4 Turbo and Llama 3.1 70B (further refined through evol-instruction tuning and manual curation) together with authority documents, these models are based on an 8-category decomposition of the classification scheme derived from the Bulletin officiel des finances publiques - impôts:

label2id = {
    "Bénéfices professionnels": 0,
    "Contrôle et contentieux": 1,
    "Dispositifs transversaux": 2,
    "Fiscalité des entreprises": 3,
    "Patrimoine et enregistrement": 4,
    "Revenus particuliers": 5,
    "Revenus patrimoniaux": 6,
    "Taxes sur la consommation": 7
}
	
id2label = {
    0: "Bénéfices professionnels",
    1: "Contrôle et contentieux",
    2: "Dispositifs transversaux",
    3: "Fiscalité des entreprises",
    4: "Patrimoine et enregistrement",
    5: "Revenus particuliers",
    6: "Revenus patrimoniaux",
    7: "Taxes sur la consommation"
}
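
As a sketch of how these mappings could drive routing: `id2label` can be derived from `label2id` so the two never drift apart, and a query is sent to the branch with the highest score. The `route` function and the idea of passing raw scores as a list are assumptions for illustration, not part of the released models.

```python
label2id = {
    "Bénéfices professionnels": 0,
    "Contrôle et contentieux": 1,
    "Dispositifs transversaux": 2,
    "Fiscalité des entreprises": 3,
    "Patrimoine et enregistrement": 4,
    "Revenus particuliers": 5,
    "Revenus patrimoniaux": 6,
    "Taxes sur la consommation": 7,
}

# Invert the mapping instead of maintaining a second hand-written dict.
id2label = {index: label for label, index in label2id.items()}


def route(scores):
    """Return the tax-law branch whose class score is highest (hypothetical helper)."""
    if len(scores) != len(id2label):
        raise ValueError(f"expected {len(id2label)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=scores.__getitem__)
    return id2label[best]
```

In a multi-agent setup, the returned label would select which specialised agent handles the question.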

It achieves the following results on the evaluation set:
- Loss: 0.4734
- Accuracy: 0.9191

Link to the collection: louisbrulenaudet/lemone-router-671cce21d6410f3570514762
reacted to albertvillanova's post with 👍 18 days ago
🚨 We’ve just released a new tool to compare the performance of models in the 🤗 Open LLM Leaderboard: the Comparator 🎉
open-llm-leaderboard/comparator

Want to see how two different versions of LLaMA stack up? Let’s walk through a step-by-step comparison of LLaMA-3.1 and LLaMA-3.2. 🦙🧵👇

1/ Load the Models' Results
- Go to the 🤗 Open LLM Leaderboard Comparator: open-llm-leaderboard/comparator
- Search for "LLaMA-3.1" and "LLaMA-3.2" in the model dropdowns.
- Press the Load button. Ready to dive into the results!

2/ Compare Metric Results in the Results Tab 📊
- Head over to the Results tab.
- Here, you’ll see the performance metrics for each model, beautifully color-coded using a gradient to highlight performance differences: greener is better! 🌟
- Want to focus on a specific task? Use the Task filter to home in on comparisons for tasks like BBH or MMLU-Pro.

3/ Check Config Alignment in the Configs Tab ⚙️
- To ensure you’re comparing apples to apples, head to the Configs tab.
- Review both models’ evaluation configurations, such as metrics, datasets, prompts, few-shot configs...
- If something looks off, it’s good to know before drawing conclusions! ✅

4/ Compare Predictions by Sample in the Details Tab 🔍
- Curious about how each model responds to specific inputs? The Details tab is your go-to!
- Select a Task (e.g., MuSR) and then a Subtask (e.g., Murder Mystery) and then press the Load Details button.
- Check out the side-by-side predictions and dive into the nuances of each model’s outputs.

5/ With this tool, it’s never been easier to explore how small changes between model versions affect performance on a wide range of tasks. Whether you’re a researcher or enthusiast, you can instantly visualize improvements and dive into detailed comparisons.

🚀 Try the 🤗 Open LLM Leaderboard Comparator now and take your model evaluations to the next level!
reacted to Taylor658's post with 🔥 18 days ago
The Mystery Bot 🕵️‍♂️ saga I posted about earlier this week has been solved...🤗

Cohere for AI has just announced its open source Aya Expanse multilingual model. The initial release supports 23 languages with more on the way soon.🌌 🌍

You can also try Aya Expanse via SMS on your mobile phone using the global WhatsApp number or one of the initial set of country-specific numbers listed below.⬇️

🌍 WhatsApp - +14313028498
Germany - (+49) 1771786365
USA - +18332746219
United Kingdom - (+44) 7418373332
Canada - (+1) 2044107115
Netherlands - (+31) 97006520757
Brazil - (+55) 11950110169
Portugal - (+351) 923249773
Italy - (+39) 3399950813
Poland - (+48) 459050281
reacted to malhajar's post with 🔥 18 days ago
🇫🇷 Official launch of the OpenLLM French Leaderboard: an open-source initiative to benchmark the evaluation of French-language LLMs

After a lot of effort and sweat with Alexandre Lavallee, we are delighted to announce that the OpenLLMFrenchLeaderboard is live on Hugging Face (space url: le-leadboard/OpenLLMFrenchLeaderboard), the very first platform dedicated to evaluating large language models (LLMs) in French. 🇫🇷✨

This long-term project is above all a labor of passion, but even more an absolute necessity. It is becoming urgent and vital to work toward more transparency in the strategic field of so-called multilingual LLMs. The first building block is therefore a systematic and systemic evaluation of current and future models.

Is your French AI model ready to stand out? Submit it in our space and see how you compare with the other models.

❓ How it works:
Submit your French LLM for evaluation, and we will test it on reference benchmarks specifically adapted for the French language. Our benchmark suite includes:

- BBH-fr: Complex reasoning
- IFEval-fr: Instruction following
- GPQA-fr: Advanced knowledge
- MUSR-fr: Narrative reasoning
- MATH_LVL5-fr: Mathematical abilities
- MMMLU-fr: Multitask understanding

The process is still manual, but we are working on automating it, with the support of the Hugging Face community.

@clem , are we getting ready for a space upgrade? 😏👀

This is not just about numbers: it is about creating an AI that truly reflects our language, our culture, and our values. OpenLLMFrenchLeaderboard is our personal contribution to shaping the future of LLMs in France.
reacted to m-ric's post with 👀 19 days ago
⚡️ This month's most important breakthrough: Differential Transformer vastly improves attention ⇒ better retrieval and fewer hallucinations!

Thought that self-attention could not be improved anymore?

Microsoft researchers have dropped a novel "differential attention" mechanism that amplifies focus on relevant context while canceling out noise. It sounds like a free lunch, but it does really seem to vastly improve LLM performance!

Key insights:

🧠 Differential attention computes the difference between two separate softmax attention maps, canceling out noise and promoting sparse attention patterns

🔥 DIFF Transformer outperforms standard Transformers while using 35-40% fewer parameters or training tokens

📏 Scales well to long contexts up to 64K tokens, leveraging increasing context length more effectively

🔎 Dramatically improves key information retrieval, enhancing in-context learning, and possibly reducing risk of hallucinations 🤯

🔢 Reduces activation outliers, potentially enabling lower-bit quantization without performance drop!

⚙️ Can be directly implemented using existing FlashAttention kernels
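
To make the first insight concrete, here is a single-head sketch of differential attention in plain Python: two softmax attention maps are computed from separate query/key projections, and the second is subtracted from the first (scaled by λ) before weighting the values. Assumptions to note: λ is passed as a fixed number `lam` (in the paper it is learned), and multi-head handling and normalization are left out, so this shows the core idea only.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(queries, keys):
    """Row-wise softmax of scaled dot products between queries and keys."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]


def diff_attention(q1, k1, q2, k2, values, lam=0.5):
    """(softmax(Q1 K1^T / sqrt(d)) - lam * softmax(Q2 K2^T / sqrt(d))) @ V."""
    map1 = attention_map(q1, k1)
    map2 = attention_map(q2, k2)
    diff = [[x - lam * y for x, y in zip(r1, r2)] for r1, r2 in zip(map1, map2)]
    width = len(values[0])
    return [
        [sum(w * v[c] for w, v in zip(row, values)) for c in range(width)]
        for row in diff
    ]
```

Attention weight that both maps assign to the same irrelevant tokens cancels in the subtraction, which is the "noise canceling" effect described above.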

This new architecture could lead to much more capable LLMs, with vastly improved strengths in long-context understanding and factual accuracy.

But they didn’t release weights on the Hub: let’s wait for the community to train the first open-weights DiffTransformer! 🚀

Read their paper 👉  Differential Transformer (2410.05258)
reacted to thomwolf's post with 🚀 19 days ago
Is it time for the open-source AI robots revolution 🚀?

With @haixuantao and @Leyo we’ve been playing with a low-cost DJI robot controlled by three local open-source AI models (Whisper, Idefics2, Parler-TTS - all Apache2) and orchestrated by dora-rs.

Links to find all the hardware/software we used in the demo:
- robot control framework – dora-rs: https://github.com/dora-rs/dora
- speech-to-text model – whisper: openai/whisper-base
- vision-text model – Idefics2: HuggingFaceM4/idefics2-8b-AWQ
- text-to-speech model – ParlerTTS mini: parler-tts/parler_tts_mini_v0.1
- robot: https://dji.com/robomaster-s1
- code gist: https://gist.github.com/haixuanTao/860e1740245dc2c8dd85b496150a9320
- Larger codebase: dora-rs/dora-idefics2
- laptop/pc: any with a recent GPU card (ours has an RTX 4090)

Enjoy!
reacted to alielfilali01's post with 👀 19 days ago
I feel like this incredible resource hasn't gotten the attention it deserves in the community!

@clefourrier and the HuggingFace evaluation team more generally put together a fantastic guidebook covering a lot about EVALUATION, from basics to advanced tips.

link: https://github.com/huggingface/evaluation-guidebook

I haven’t finished it yet, but I'm enjoying every piece of it so far. Huge thanks to @clefourrier and the team for this invaluable resource!
reacted to regisss's post with 👍 20 days ago
Interested in performing inference with an ONNX model?⚡️

The Optimum docs about model inference with ONNX Runtime are now much clearer and simpler!

You want to deploy your favorite model on the hub but you don't know how to export it to the ONNX format? You can do it in one line of code as follows:
from optimum.onnxruntime import ORTModelForSequenceClassification

# Load the model from the hub and export it to the ONNX format
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

Check out the whole guide 👉 https://huggingface.co./docs/optimum/onnxruntime/usage_guides/models
reacted to celinah's post with ❤️ 20 days ago
📣 huggingface_hub v0.26.0 is out with some new features and improvements!

✨ Top Highlights:
- 🔐 Multiple access tokens support: Easily manage multiple access tokens with new CLI commands. Perfect for handling multiple tokens with specific permissions in production or when collaborating with external teams.
- 🖼️ Conversational VLMs inference is now supported with InferenceClient's chat completion!
- 📄 Daily Papers API: Seamlessly search and retrieve detailed paper information from the Hub!

We’ve also introduced multiple bug fixes and quality-of-life improvements - thanks to the awesome contributions from our community! 🤗

Check out the release notes here: Wauplin/huggingface_hub#9

and you can try it out now 👇
pip install huggingface_hub==0.26.0

reacted to as-cle-bert's post with 🚀 20 days ago
Hi HuggingFacers!🤗

As you have probably heard, in the past weeks three tech giants (Microsoft, Amazon and Google) announced that they would bet on nuclear reactors to meet the surging energy demand of data centers, driven by increasing AI data and computational flows.

I try to explain the state of AI energy consumption, its environmental impact and the key points of "turning AI nuclear" in my latest article on the HF community blog: https://huggingface.co./blog/as-cle-bert/ai-is-turning-nuclear-a-review

Enjoy the reading!🌱
reacted to singhsidhukuldeep's post with 🤝 20 days ago
Looks like @Meta thinks we forgot they created PyTorch, so now they've open-sourced Lingua, a powerful and flexible library for training and inferencing large language models.

Things that stand out:

- Architecture: Pure PyTorch nn.Module implementation for easy customization.

- Checkpointing: Uses the new PyTorch distributed saving method (.distcp format) for flexible model reloading across different GPU configurations.

- Configuration: Utilizes data classes and YAML files for intuitive setup and modification.

- Profiling: Integrates with xFormers' profiler for automatic MFU and HFU calculation, plus memory profiling.

- Slurm Integration: Includes stool.py for seamless job launching on Slurm clusters.

Some results from @Meta to show off:

- 1B parameter models trained on 60B tokens achieve strong performance across various NLP tasks.

- 7B parameter Mamba model (trained on 200B tokens) shows competitive results with Llama 7B on benchmarks like ARC, MMLU, and BBH.

If you're working on LLM research or looking to experiment with cutting-edge language model architectures, Lingua is definitely worth exploring.
reacted to zamal's post with 🔥 20 days ago
🚀 Announcement for the Lovely community! 🚀

Just launched the zamal/DeepSeek-VL-1.3B-Chat on Hugging Face, and it's ready for YOU to explore! 💬🖼️

This full-fledged model is perfect for advanced image and text interactions, with zero GPU required. The Deepseek VL-1.3B Chat typically needs around 8 GB of VRAM and storage of almost 4 GB, but now you can experience it hassle-free right on our space!

Want something lighter? We’ve also uploaded a 4-bit quantized version (just around 1 GB!), available on my profile. Perfect for those with limited hardware. 🌍🔍

Come try it now and see what this model can do! 🚀✨

reacted to their post with 🔥 22 days ago
🚨 I have $3,500 in Azure credits, including access to an H100 (96 GB), expiring on November 12, 2024.

I won’t be able to use it all myself, so I’m reaching out to the @huggingface community: Are there any open-source projects with data ready for some compute power?

Let’s collaborate and make the most of it together 🔗