Ross Wightman

rwightman

AI & ML interests

Computer vision, transfer learning, semi/self supervised learning, robotics.

Recent Activity

liked a dataset about 9 hours ago

MLCommons/unsupervised_peoples_speech

upvoted an article 6 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

reacted to merve's post with 🔥 9 days ago

Oof, what a week! 🥵 So many things have happened, let's recap! https://huggingface.co./collections/merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with its retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, where the decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, a new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images

Organizations

rwightman's activity

New activity in safetensors/convert 10 days ago

Allow running conversion after closing a previous PR.

#21 opened about 1 year ago by rwightman

New activity in safetensors/convert 13 days ago

Update convert.py

#37 opened 13 days ago by rwightman

New activity in timm/efficientformer_l7.snap_dist_in1k 18 days ago

Adding `safetensors` variant of this model

#1 opened 18 days ago by SFconvertbot

New activity in timm/davit_tiny.msft_in1k 18 days ago

Adding `safetensors` variant of this model

#1 opened 18 days ago by SFconvertbot

New activity in timm/davit_base.msft_in1k 18 days ago

Adding `safetensors` variant of this model

#2 opened 18 days ago by SFconvertbot

New activity in timm/levit_128.fb_dist_in1k 18 days ago

Adding `safetensors` variant of this model

#1 opened 18 days ago by SFconvertbot

New activity in timm/davit_small.msft_in1k 18 days ago

Adding `safetensors` variant of this model

#1 opened 18 days ago by SFconvertbot

New activity in timm/mobilenetv4_conv_small.e2400_r224_in1k about 1 month ago

Can't get attribute 'UniversalInvertedResidual'

#7 opened about 1 month ago by sddwadsa

New activity in pixparse/cc3m-wds about 1 month ago

Converting Arrow to WebDataset TAR Format for Offline Use

#5 opened about 1 month ago by katie312

New activity in timm/mobilenetv4_conv_small.e2400_r224_in1k about 1 month ago

model.eval(), results are wrong?

#6 opened about 1 month ago by liyufeng

New activity in timm/efficientformerv2_s1.snap_dist_in1k about 2 months ago

Adding `safetensors` variant of this model

#1 opened about 2 months ago by SFconvertbot

New activity in timm/efficientformerv2_s2.snap_dist_in1k about 2 months ago

Adding `safetensors` variant of this model

#1 opened about 2 months ago by SFconvertbot

New activity in timm/fastvit_t8.apple_dist_in1k about 2 months ago

Update "first_conv" in config.json

#2 opened about 2 months ago by Cinq108

New activity in timm/efficientformer_l1.snap_dist_in1k about 2 months ago

Adding `safetensors` variant of this model

#1 opened about 2 months ago by SFconvertbot

New activity in timm/efficientformerv2_l.snap_dist_in1k about 2 months ago

Adding `safetensors` variant of this model

#1 opened about 2 months ago by SFconvertbot

New activity in timm/ViT-B-16-SigLIP-i18n-256 2 months ago

Are the languages that are supported documented anywhere?

#1 opened 2 months ago by Jesse-marqo

New activity in pixparse/cc12m-wds 2 months ago

Is this where all the data is?

#3 opened 2 months ago by showstarpro

New activity in laion/CLIP-ViT-B-32-xlm-roberta-base-laion5B-s13B-b90k 2 months ago

Upload pytorch_model.bin

#3 opened 3 months ago by prasadpr20

New activity in timm/efficientformerv2_s0.snap_dist_in1k 3 months ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by SFconvertbot

New activity in laion/CLIP-ViT-L-14-CommonPool.XL-s13B-b90K 3 months ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by SFconvertbot

Articles

Timm ❤️ Transformers: Use any timm model with transformers

Trick or ResNet Treat

Mamba Out

Tiny Test Models

Searching for better (Full) ImageNet ViT Baselines

MobileNet Baselines

MobileNet-V4 (now in timm)
