Jean Louis

JLouisBiz

https://www.StartYourOwnGoldMine.com

AI & ML interests

- LLM for sales, marketing, promotion
- LLM for Website Revision System
- increasing quality of communication with customers
- helping clients access information faster
- saving people from financial troubles

Recent Activity

replied to Kseniase's post about 1 hour ago

5 New implementations of Diffusion Models

Diffusion models are widely used for image and video generation but remain underexplored in text generation, where autoregressive models (ARMs) dominate. Unlike ARMs, which produce tokens sequentially, diffusion models iteratively refine noise through denoising steps, offering greater flexibility and speed. Recent advancements show a shift toward using diffusion models in place of, or alongside, ARMs. Researchers also combine strengths from both methods and integrate autoregressive concepts into diffusion.

Here are 5 new implementations of diffusion models:

1. Mercury family of diffusion LLMs (dLLMs) by Inception Labs -> https://www.inceptionlabs.ai/news
   It applies diffusion to text and code data, enabling sequence generation 10x faster than today's top LLMs. Mercury Coder, now available, can run at over 1,000 tokens/sec on NVIDIA H100s.

2. Diffusion of Thoughts (DoT) -> https://huggingface.co./papers/2402.07754
   Integrates diffusion models with Chain-of-Thought. DoT allows reasoning steps to diffuse gradually over time. This flexibility makes it possible to balance reasoning quality against computational cost.

3. LLaDA -> https://huggingface.co./papers/2502.09992
   Shows the potential of diffusion models to replace ARMs. Trained with pre-training and SFT, LLaDA masks tokens, predicts them via a Transformer, and optimizes a likelihood bound. LLaDA matches key LLM skills and surpasses GPT-4o in reversal poetry.

4. LanDiff -> https://huggingface.co./papers/2503.04606
   This hybrid text-to-video model combines autoregressive and diffusion paradigms, introducing a semantic tokenizer, an LM for token generation, and a streaming diffusion model. LanDiff outperforms models like Sora.

5. General Interpolating Discrete Diffusion (GIDD) -> https://huggingface.co./papers/2503.04482
   A flexible noising process with a novel diffusion ELBO makes it possible to combine masking and uniform noise, allowing diffusion models to correct their own mistakes, something ARMs struggle with.

new activity about 17 hours ago

eaddario/Watt-Tool-8B-GGUF: Problem with the license, this is not really free software

new activity about 23 hours ago

meditsolutions/medit-one-140M-9B-tokens-checkpoint: Question on meaning of parameter of this model



JLouisBiz's activity

New activity in eaddario/Watt-Tool-8B-GGUF about 17 hours ago

Problem with the license, this is not really free software

#1 opened 5 days ago by JLouisBiz

New activity in meditsolutions/medit-one-140M-9B-tokens-checkpoint about 23 hours ago

Question on meaning of parameter of this model

#2 opened 1 day ago by JLouisBiz

New activity in meditsolutions/medit-one-140M-9B-tokens-checkpoint 1 day ago

Can't install

#1 opened 3 days ago by JLouisBiz

New activity in marathi-llm/MahaMarathi-7B-v24.01-Base 3 days ago

You got a serious licensing problem

#8 opened 3 days ago by JLouisBiz

New activity in Tower-Babel/Babel-9B-Chat 3 days ago

Can you publish it under free software license like MIT, Apache 2.0 or some other?

#2 opened 3 days ago by JLouisBiz

New activity in perplexity-ai/r1-1776 3 days ago

For anyone who is wondering what is going on here with all the "reports"

#168 opened 16 days ago by ufwd1984

New activity in utter-project/EuroLLM-9B-Instruct 3 days ago

Disable "gated access", it is Apache 2

#6 opened 3 months ago by kno10

New activity in perplexity-ai/r1-1776 5 days ago

USA/West Propaganda hugging face of huggingface

#230 opened 13 days ago by devops724

New activity in perplexity-ai/r1-1776 7 days ago

This model is totally unnecessary - uncensored R1 with system prompt

#248 opened 11 days ago by ttthree

🚩 Report: Ethical issue(s)

#259 opened 10 days ago by Toseie

New activity in nvidia/canary-1b 7 days ago

This model is great and works very well for speech recognition.

#27 opened 7 days ago by JLouisBiz

New activity in perplexity-ai/r1-1776 8 days ago

Existence of this model is a faux pas, but...

#166 opened 16 days ago by MrDevolver

I thought it was about eliminating censorship, but it turned out to be fabricated data.

#33 opened 18 days ago by CatFly

New activity in deepseek-ai/Janus-Pro-7B 10 days ago

Response is imaginary

#170 opened 10 days ago by JLouisBiz

New activity in CohereForAI/aya-expanse-32b 11 days ago

Why not commercial license?

#9 opened 11 days ago by JLouisBiz

New activity in nomic-ai/nomic-embed-text-v1.5 12 days ago

I wish to tokenize the text, before submitting to the model, does this model provide API to tokenize?

#42 opened 12 days ago by JLouisBiz

New activity in HuggingFaceTB/SmolVLM2-2.2B-Instruct 16 days ago

checkpoint you are trying to load has model type `smolvlm` but Transformers does not recognize this

#7 opened 16 days ago by JLouisBiz

New activity in iandennismiller/llama-cpp-scripts 16 days ago

Where do I find llama-finetune? Is not in git of llama.cpp

#2 opened 16 days ago by JLouisBiz

New activity in Qwen/Qwen2.5-Coder-32B-Instruct 22 days ago

Requesting information about hardware resources

#28 opened 3 months ago by Ishuks

New activity in AIDC-AI/Ovis2-34B 22 days ago

Thank you for giving us freedom with your free software

#1 opened 22 days ago by JLouisBiz