
Jean Louis

JLouisBiz

AI & ML interests

- LLM for sales, marketing, promotion
- LLM for Website Revision System
- increasing quality of communication with customers
- helping clients access information faster
- saving people from financial troubles

Recent Activity

replied to Kseniase's post about 1 hour ago
5 New implementations of Diffusion Models

Diffusion models are widely used for image and video generation but remain underexplored in text generation, where autoregressive models (ARMs) dominate. Unlike ARMs, which produce tokens sequentially, diffusion models iteratively refine noise through denoising steps, offering greater flexibility and speed. Recent advancements show a shift toward using diffusion models in place of, or alongside, ARMs. Researchers also combine strengths from both methods and integrate autoregressive concepts into diffusion.

Here are 5 new implementations of diffusion models:

1. Mercury family of diffusion LLMs (dLLMs) by Inception Labs -> https://www.inceptionlabs.ai/news
It applies diffusion to text and code data, enabling sequence generation 10x faster than today's top LLMs. Mercury Coder, now available, can run at over 1,000 tokens/sec on NVIDIA H100s.

2. Diffusion of Thoughts (DoT) -> https://huggingface.co./papers/2402.07754
Integrates diffusion models with Chain-of-Thought. DoT allows reasoning steps to diffuse gradually over time. This flexibility enables balancing between reasoning quality and computational cost.

3. LLaDA -> https://huggingface.co./papers/2502.09992
Shows diffusion models' potential in replacing ARMs. Trained with pre-training and SFT, LLaDA masks tokens, predicts them via a Transformer, and optimizes a likelihood bound. LLaDA matches key LLM skills, and surpasses GPT-4o in reversal poetry.

4. LanDiff -> https://huggingface.co./papers/2503.04606
This hybrid text-to-video model combines autoregressive and diffusion paradigms, introducing a semantic tokenizer, an LM for token generation, and a streaming diffusion model. LanDiff outperforms models like Sora.

5. General Interpolating Discrete Diffusion (GIDD) -> https://huggingface.co./papers/2503.04482
A flexible noising process with a novel diffusion ELBO enables combining masking and uniform noise, allowing diffusion models to correct mistakes, where ARMs struggle.
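
To make the contrast in the post concrete, here is a toy sketch of the two decoding styles: an autoregressive loop that emits one token per step, and a masked-diffusion-style loop that starts fully masked and reveals several positions per step. It is not the code of LLaDA, Mercury, or any other model listed above. `toy_denoiser`, `diffusion_decode`, and `autoregressive_decode` are hypothetical names, and the "denoiser" is a stand-in that simply reads the target sequence and attaches random confidences, where a real model would use a trained Transformer.

```python
import random

MASK = "<mask>"

def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a trained Transformer.

    For every masked position, return (predicted_token, confidence).
    It cheats by reading the target, so it only illustrates control flow.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }

def diffusion_decode(target, steps, seed=0):
    """Start from an all-mask sequence and reveal tokens over `steps` rounds."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]
    for step in range(steps):
        preds = toy_denoiser(tokens, target, rng)
        if not preds:
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = steps - step
        k = -(-len(preds) // remaining_steps)  # ceiling division
        # Commit the k most confident predictions; the rest stay masked.
        chosen = sorted(preds, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in chosen:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history

def autoregressive_decode(target):
    """Emit tokens strictly left to right, one per step."""
    tokens, history = [], []
    for tok in target:
        tokens.append(tok)
        history.append(list(tokens))
    return tokens, history

if __name__ == "__main__":
    sentence = "diffusion models refine all positions in parallel".split()
    _, diff_hist = diffusion_decode(sentence, steps=3)
    _, ar_hist = autoregressive_decode(sentence)
    for state in diff_hist:
        print(" ".join(state))
    print("diffusion steps:", len(diff_hist) - 1, "| autoregressive steps:", len(ar_hist))
```

In this sketch the number of decoding steps is a free parameter for the diffusion-style loop, while the autoregressive loop always needs one step per token. That is the kind of trade-off between quality and compute the post alludes to, though real systems decide which positions to commit using learned confidences, not random ones.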

Organizations

RCD Wealth LLC

JLouisBiz's activity

New activity in utter-project/EuroLLM-9B-Instruct 3 days ago
New activity in deepseek-ai/Janus-Pro-7B 10 days ago

Response is imaginary

#170 opened 10 days ago by JLouisBiz
New activity in CohereForAI/aya-expanse-32b 11 days ago