Aritra Roy Gosthipaty PRO
ariG23498
AI & ML interests
Deep Representation Learning
Recent Activity
published an article about 11 hours ago
Deploying OLMo-7B with Text Generation Inference (TGI) on Hugging Face Spaces
commented on their article about 12 hours ago
Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!
updated a model 4 days ago
ariG23498/layerskip-hf-smollm-135m-topv2
Articles
Organizations
ariG23498's activity
published an article about 11 hours ago
Article
Deploying OLMo-7B with Text Generation Inference (TGI) on Hugging Face Spaces
commented on
Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!
about 12 hours ago
Hi @lseanlon! Thank you for your interest in this work. The premise of this blog post was to show users how to build a POC quickly (as you mentioned), not to demonstrate production usage.
Note: It might be interesting for you to showcase some tricks to speed up inference. I would be glad to see what you come up with and add it to this article, with due credit of course. 🤗
update seeding
#7 opened 4 days ago by ariG23498
add numpy import
#6 opened 4 days ago by linoyts
Update requirements.txt
#5 opened 4 days ago by sayakpaul
Update app.py to acknowledge ByteDance
#2 opened 4 days ago by sayakpaul
add randomized seed option
#3 opened 4 days ago by linoyts
upvoted an article 4 days ago
Article
Mixture of Experts Explained
283
upvoted an article 4 days ago
Article
KV Caching Explained: Optimizing Transformer Inference Efficiency
22
upvoted a collection 4 days ago