Steven Zheng

Steveeeeeeen

AI & ML interests

speech & audio

Steveeeeeeen's activity

liked a model about 5 hours ago

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • Updated about 16 hours ago • 7.35k • 501

updated a dataset about 11 hours ago

Steveeeeeeen/whisper-leaderboard-evals

Preview • Updated about 11 hours ago • 120

upvoted an article 1 day ago

Article

SigLIP 2: A better multilingual vision language encoder

8 days ago • 113

liked a Space 1 day ago

LLaDA

🚀

Large Language Diffusion Models

upvoted an article 2 days ago

Article

Deploying Speech-to-Speech on Hugging Face

Oct 22, 2024 • 38

liked a Space 2 days ago

118

AI Podcast Generator

🎙

Generate Podcast using Kokoro-TTS!

upvoted 2 collections 2 days ago

OWLS: Scaling Laws for Speech Recognition and Translation

Collection

🦉 A suite of Whisper-style models from 250M to 18B parameters. Trained on up to 360K hours of data. • 6 items • Updated 3 days ago • 3

Open Whisper-style Speech Models (OWSM)

Collection

Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/ • 15 items • Updated 22 days ago • 5

upvoted a paper 3 days ago

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published 9 days ago • 56

published a dataset 7 days ago

Steveeeeeeen/whisper-leaderboard-evals

Preview • Updated about 11 hours ago • 120

New activity in hf-audio/open_asr_leaderboard 7 days ago

whisper-leaderboard

#28 opened 3 months ago by Steveeeeeeen

updated a Space 7 days ago

628

Open ASR Leaderboard

🏆

Request evaluation for speech models

New activity in hf-audio/open_asr_leaderboard 7 days ago

whisper-leaderboard

#31 opened 7 days ago by Steveeeeeeen

New activity in Steveeeeeeen/Open_ASR_Leaderboard 7 days ago

Adding whisper leaderboard

#3 opened 7 days ago by Steveeeeeeen

upvoted a paper 8 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 9 days ago • 150

upvoted an article 8 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23 • 143

liked a Space 8 days ago

1.78k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

updated a Space 8 days ago

Talk To SmolVox

⚡

Talk to Fixie.ai's Ultravox with WebRTC ⚡️

upvoted a paper 8 days ago

Presumed Cultural Identity: How Names Shape LLM Responses

Paper • 2502.11995 • Published 11 days ago • 10

New activity in TTS-AGI/TTS-Arena 8 days ago

pussy

#87 opened 8 days ago by Mystique31