Hugging Face Party @ PyTorch Conference

AI & ML interests

None defined yet.

Recent Activity

csabakecskemeti 
posted an update 7 days ago
csabakecskemeti 
posted an update 10 days ago
julien-c 
posted an update 14 days ago
After some heated discussion 🔥, we've clarified our intent regarding storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co./docs/hub/storage-limits

We continuously optimize our infrastructure to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
danielhanchen 
posted an update 18 days ago
julien-c 
posted an update 25 days ago
wow 😮

INTELLECT-1 is the first collaboratively trained 10-billion-parameter language model, trained from scratch on 1 trillion tokens of English text and code.

PrimeIntellect/INTELLECT-1-Instruct
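
If you want to poke at it yourself, here is a minimal sketch of loading the checkpoint with transformers, assuming it follows the standard causal-LM interface and ships a chat template (dtype and device are up to your hardware):

```python
# Hedged sketch: load INTELLECT-1-Instruct as a standard causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrimeIntellect/INTELLECT-1-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is INTELLECT-1?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```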
csabakecskemeti 
posted an update 26 days ago
csabakecskemeti 
posted an update 29 days ago
I have this small utility: no_more_typo
It runs in the background and can call an LLM to update the text on the clipboard, which makes it ideal for fixing typos and syntax.
I've just added the option to use custom prompt templates to perform different tasks.

Details, code and executable:
https://github.com/csabakecskemeti/no_more_typo

https://devquasar.com/no-more-typo/
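
Not the actual no_more_typo internals, but the core idea fits in a few lines; a minimal sketch assuming pyperclip for clipboard access and an OpenAI-compatible endpoint (the model name and prompt template are illustrative):

```python
# Hedged sketch: poll the clipboard, run its text through an LLM with a
# fix-typos prompt, and write the corrected text back.
import time
import pyperclip
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
PROMPT = "Fix typos and grammar in this text. Reply with the corrected text only:\n\n{text}"

last = ""
while True:
    text = pyperclip.paste()
    if text and text != last:  # only react to new clipboard content
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[{"role": "user", "content": PROMPT.format(text=text)}],
        )
        last = resp.choices[0].message.content
        pyperclip.copy(last)  # replace clipboard contents with the fix
    time.sleep(1.0)
```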
csabakecskemeti 
posted an update about 1 month ago
Repurposed my older AI workstation as a homelab server; it has received 2x V100 + 1x P40.
I can reach a huge 210k-token context size with MegaBeam-Mistral-7B-512k-GGUF at ~70+ tok/s, or run Llama-3.1-Nemotron-70B-Instruct-HF-GGUF with 50k context at ~10 tok/s (V100-only: 40k ctx and 15 tok/s).
I'm also able to LoRA-finetune with performance similar to an RTX 3090.
It moved to the garage so the family has no complaints about the noise. Will move to a rack soon :D
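
For anyone reproducing a setup like this, a rough llama-cpp-python sketch of loading a long-context GGUF (the file name is hypothetical, and the context you can actually fit depends on your VRAM):

```python
# Hedged sketch: load a long-context GGUF with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="MegaBeam-Mistral-7B-512k.Q8_0.gguf",  # hypothetical local file
    n_ctx=210_000,    # the ~210k context mentioned above
    n_gpu_layers=-1,  # offload all layers to the GPUs
)
print(llm("Summarize this repo:", max_tokens=64)["choices"][0]["text"])
```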
jeffboudier 
posted an update about 1 month ago
danielhanchen 
posted an update about 1 month ago
ArthurZ 
posted an update about 1 month ago
csabakecskemeti 
posted an update about 1 month ago
Some time ago, I built a predictive LLM router that routes chat requests between small and large LLMs based on prompt classification. It dynamically selects the most suitable model depending on the complexity of the user input, ensuring optimal performance while maintaining conversation context. I also fine-tuned a RoBERTa model to use with the package, but you can plug in any classifier of your choice.

Project's homepage:
https://devquasar.com/llm-predictive-router/
Pypi:
https://pypi.org/project/llm-predictive-router/
Model:
DevQuasar/roberta-prompt_classifier-v0.1
Training data:
DevQuasar/llm_router_dataset-synth
Git:
https://github.com/csabakecskemeti/llm_predictive_router_package

Feel free to check it out and/or contribute.
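
The package wraps all of this up, but the underlying routing idea fits in a few lines; a minimal sketch using a transformers pipeline with the classifier above (the label names and endpoint names here are assumptions, not the package's actual API):

```python
# Hedged sketch: classify the prompt, then dispatch to a small or large LLM.
from transformers import pipeline

classifier = pipeline(
    "text-classification", model="DevQuasar/roberta-prompt_classifier-v0.1"
)

def route(prompt: str) -> str:
    label = classifier(prompt)[0]["label"]
    # Assumption: labels distinguish small vs. large targets;
    # check the model card for the real label names.
    return "small-model-endpoint" if "small" in label.lower() else "large-model-endpoint"

print(route("hi"))                                  # simple chat -> small model
print(route("Prove that sqrt(2) is irrational."))   # complex prompt -> large model
```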
csabakecskemeti 
posted an update about 1 month ago
I've built a small open utility pip package called LLM-Forwarder that lets you inject context, such as a private RAG, into existing chat applications by routing them through the LLM-Forwarder. In the forwarder server, you can configure custom code that re-processes chat messages and alters the user prompt, for example by adding extra context.

https://pypi.org/project/llm-forwarder/
More details:
https://devquasar.com/llmforwarder/
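
Again, not the package's actual code, but the forwarding pattern looks roughly like this: an OpenAI-compatible proxy (FastAPI + httpx here) that rewrites the last user message before passing the request upstream; the endpoint and the context hook are assumptions:

```python
# Hedged sketch of the forwarding pattern. Run with: uvicorn forwarder:app
import httpx
from fastapi import FastAPI, Request

app = FastAPI()
UPSTREAM = "http://localhost:8080/v1/chat/completions"  # assumed backend

def add_context(messages):
    # Hypothetical hook: prepend retrieved context to the last user turn.
    docs = "…context fetched from your private RAG…"
    messages[-1]["content"] = f"{docs}\n\n{messages[-1]['content']}"
    return messages

@app.post("/v1/chat/completions")
async def forward(request: Request):
    body = await request.json()
    body["messages"] = add_context(body["messages"])
    async with httpx.AsyncClient() as client:
        resp = await client.post(UPSTREAM, json=body, timeout=120)
    return resp.json()
```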
Aurelien-Morgan 
posted an update about 2 months ago
I just shipped retrain-pipelines 0.1.1 today. The docs are also much improved compared to the previous release, which clearly wasn't mature yet.
I'll have to focus on another project for the next couple of weeks, but feel free to open issues on the GitHub repo and discuss any interest you'd have there (please?)!
In the meantime, you may enjoy retrying this:
https://huggingface.co./blog/Aurelien-Morgan/stateful-metaflow-on-colab
zamal 
posted an update 2 months ago
🚀 Announcement for the Lovely community! 🚀

Just launched zamal/DeepSeek-VL-1.3B-Chat on Hugging Face, and it's ready for YOU to explore! 💬🖼️

This full-fledged model is perfect for advanced image and text interactions, with zero GPU required. DeepSeek-VL-1.3B-Chat typically needs around 8 GB of VRAM and almost 4 GB of storage, but now you can experience it hassle-free right in our Space!

Want something lighter? We've also uploaded a 4-bit quantized version (just around 1 GB!), available on my profile. Perfect for those with limited hardware. 🌍🔍

Come try it now and see what this model can do! 🚀✨
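
The 4-bit trick is standard on-load quantization; a rough transformers + bitsandbytes sketch under assumed settings (the base model id, the AutoModel mapping, and the ~1 GB footprint are assumptions; DeepSeek-VL ships custom code, hence trust_remote_code):

```python
# Hedged sketch (not zamal's exact conversion): load the base DeepSeek-VL
# checkpoint in 4-bit NF4 via bitsandbytes to shrink ~4 GB of weights.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-vl-1.3b-chat",  # assumed base checkpoint
    quantization_config=bnb,
    trust_remote_code=True,
)
print(f"{model.get_memory_footprint() / 1e9:.1f} GB")  # expect roughly 1 GB in 4-bit
```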

Aurelien-Morgan 
posted an update 2 months ago
zamal 
posted an update 2 months ago
Hello, lovely community! 🌟

zamal/Molmo-4bit
Thrilled to announce that the Molmo 7B 4-bit Space is now live! 🚀 The model size has been reduced six-fold with almost no performance loss, and the results will leave you amazed!

It runs on zero GPU, making it incredibly accessible for everyone!

Check it out here and start exploring today!

Happy experimenting! 🎉
jeffboudier 
posted an update 2 months ago