Recent Activity

DDUF's activity

lysandre
posted an update 7 days ago
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.
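
As a rough sketch of what "installing from the tag" can look like, the snippet below builds the pip arguments for each tag using pip's standard `git+<url>@<ref>` syntax. The repository URL and the helper name `pip_install_args_for_tag` are assumptions made for illustration; they are not taken from the post.

```python
import shlex
from typing import List

# Assumption: the upstream GitHub repository that hosts the tags.
TRANSFORMERS_REPO = "https://github.com/huggingface/transformers"


def pip_install_args_for_tag(tag: str) -> List[str]:
    """Hypothetical helper: pip arguments that install transformers from a Git tag."""
    return ["pip", "install", f"git+{TRANSFORMERS_REPO}@{tag}"]


# Print one install command per model-specific tag named in the post.
for tag in ("v4.49.0-SmolVLM-2", "v4.49.0-SigLIP-2"):
    print(shlex.join(pip_install_args_for_tag(tag)))
```

The printed commands can be pasted into a shell, or the argument list can be passed to `subprocess.run` if the install should happen from a script.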

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP-2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue to expect software releases that follow semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements, and bug fixes. Alongside these software releases, we'll release tags offering brand-new models as fast as possible, to make them accessible to all immediately.
sayakpaul
posted an update 11 days ago
Inference-time scaling meets Flux.1-Dev (and others) 🔥

Presenting a simple re-implementation of "Inference-time scaling diffusion models beyond denoising steps" by Ma et al.

I implemented the simplest strategy, random search, but results could likely be improved with better-guided search methods.

Supports Gemini 2 Flash & Qwen2.5 as verifiers for "LLMGrading" 🤗

The steps are simple:

For each round:

1> Start by sampling 2 starting noises with different seeds.
2> Score the generations w.r.t. a metric.
3> Obtain the best generation from the current round.

If you have more compute budget, go to the next search round: scale the noise pool to 2 ** search_round noises and repeat steps 1-3.

This constitutes the random search method as done in the paper by Google DeepMind.
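
The loop described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the repository's code: `generate` and `score` are hypothetical stand-ins for the diffusion pipeline and the verifier, and the toy versions below just draw and rate a random number.

```python
import random
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def random_search(
    generate: Callable[[int], T],   # hypothetical: seed -> generation
    score: Callable[[T], float],    # hypothetical: generation -> metric (higher is better)
    search_rounds: int,
    base_seed: int = 0,
) -> List[Tuple[int, int, float, T]]:
    """Return (pool_size, best_seed, best_score, best_sample) for every round."""
    rng = random.Random(base_seed)
    results = []
    for search_round in range(1, search_rounds + 1):
        # The noise pool doubles each round: 2, 4, 8, ...
        pool_size = 2 ** search_round
        # Sampling without replacement gives distinct seeds within a round.
        seeds = rng.sample(range(2 ** 32), pool_size)
        candidates = []
        for seed in seeds:
            sample = generate(seed)
            candidates.append((score(sample), seed, sample))
        # Keep the best generation of the current round.
        best_score, best_seed, best_sample = max(candidates, key=lambda c: c[0])
        results.append((pool_size, best_seed, best_score, best_sample))
    return results


def toy_generate(seed: int) -> float:
    """Toy stand-in for a pipeline call: one Gaussian draw per seed."""
    return random.Random(seed).gauss(0.0, 1.0)


def toy_score(sample: float) -> float:
    """Toy stand-in for a verifier: samples closer to zero score higher."""
    return -abs(sample)


rounds = random_search(toy_generate, toy_score, search_rounds=3)
for pool_size, seed, best, _ in rounds:
    print(f"pool={pool_size} best_seed={seed} best_score={best:.4f}")
```

In a real setup, `generate` would run the image model from a seeded starting noise and `score` would call the chosen verifier; the search loop itself stays the same.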

Code, more results, and a bunch of other stuff are in the repository. Check it out here: https://github.com/sayakpaul/tt-scale-flux/ 🤗
sayakpaul
posted an update 29 days ago
We have been cooking a couple of fine-tuning runs on CogVideoX with finetrainers, smol datasets, and LoRA to generate cool video effects like crushing, dissolving, etc.

We are also releasing a LoRA extraction utility from a fully fine-tuned checkpoint. I know that kind of stuff has existed since eternity, but the quality on video models was nothing short of spectacular. Below are some links:

* Models and datasets: https://huggingface.co/finetrainers
* finetrainers: https://github.com/a-r-r-o-w/finetrainers
* LoRA extraction: https://github.com/huggingface/diffusers/blob/main/scripts/extract_lora_from_model.py
sayakpaul
posted an update about 1 month ago
We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the supported models, the optimization knobs our users can turn, fine-tuning, and more 🔥

5-6 GB for HunyuanVideo; the sky is the limit 🌌 🤗
https://huggingface.co/blog/video_gen
sayakpaul
posted an update 2 months ago
Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0
sayakpaul
posted an update 2 months ago
In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.
celinah
posted an update 2 months ago
🚀 We've just dropped a new release, v0.27.0, of the huggingface_hub Python library!

This release includes:
- 💾 New torch model loading utilities in the serialization module, providing a standardized way to save and load torch models with built-in support for sharding and safe serialization.
- 📦 Tooling for something exciting: if you like single-file formats for models like GGUF, you'll love what we're cooking up 👀 More coming soon!
- 🛠️ Loads of quality-of-life improvements and bug fixes!

Release notes and full details here 👇
Wauplin/huggingface_hub#10

$ pip install -U huggingface_hub
marcsun13
updated a Space 3 months ago