AI & ML interests

Non-profit ML community

Recent Activity

pseudolab's activity

lunarflu
posted an update 20 days ago
Tonic
posted an update about 2 months ago
🙋🏻‍♂️ hey there folks,

periodic reminder: if you are experiencing ⚠️ 500 errors ⚠️ or ⚠️ abnormal spaces behavior on load or launch ⚠️

we have a thread 👉🏻 https://discord.com/channels/879548962464493619/1295847667515129877

if you can record the problem and share it there, or on the forums in your own post, please don't be shy: i'm not sure, but i do think it helps 🤗🤗🤗
Tonic
posted an update about 2 months ago
boomers still pick zenodo.org instead of huggingface??? absolutely clownish nonsense, my random datasets have 30x more downloads and views than front page zenodos... gonna write a comparison blog, but yeah... cringe.
Tonic
posted an update 2 months ago
🙋🏻‍♂️ hey there folks,

really enjoying sharing cool genomics and protein datasets on the hub these days, check out our cool new org: https://huggingface.co./seq-to-pheno

scroll down for the datasets, still figuring out how to optimize for discoverability. i do think on that part it will be better than zenodo[dot}org, and it would be nice to write a tutorial about that and compare: we already have more downloads than most zenodo datasets from famous researchers!
Tonic
posted an update 2 months ago
hey there folks,

twitter is awful, isn't it? just getting into the habit of using hf/posts for shares 🦙🦙

Tonic/on-device-granite-3.0-1b-a400m-instruct

new granite on device instruct model demo, hope you like it 🚀🚀
Tonic
posted an update 2 months ago
Tonic
posted an update 3 months ago
Tonic
posted an update 3 months ago
🙋🏻‍♂️ Hey there folks,

🦎 Salamandra release by @mvillegas and team
@BSC_CNS https://huggingface.co./BSC-LT is absolutely impressive so far!

perhaps the largest single training dataset of high-quality text to date: 7.8 trillion tokens in 35 European languages and code.

the best part: the data was correctly licensed, so it's actually future-proof!

the completions model is really creative and the instruct fine-tuned version is very good also.

now you can use such models for multilingual enterprise applications with further finetunes; long response generation and structured outputs (coding) also work.

check out 👇🏻
the collection : BSC-LT/salamandra-66fc171485944df79469043a
the repo : https://github.com/langtech-bsc/salamandra
7B-Instruct demo : Tonic/Salamandra-7B
Tonic
posted an update 3 months ago
@mlabonne hey there 🙋🏻‍♂️ I kinda got obsessed with your great model, and i found the endpoint for it in lambda labs, but basically i got rate limited / banned for trying to make my DPO dataset project. i was wondering if you all had an OpenAI-compatible solution for me to make a great "thinking" sft + dpo dataset with all the splits 🙏🏻🙏🏻 kinda desperate, it's true, but was looking forward to a nice write-up 🚀🚀🚀
Tonic
posted an update 3 months ago
Tonic
posted an update 3 months ago
🙋🏻‍♂️ Hey there folks,

stepfun-ai/GOT-OCR2_0 is in top trending and spaces of the week for the second week straight!!

This is madness 😱

🚀🚀 check out my demo here: Tonic/GOT-OCR
Tonic
posted an update 3 months ago
Tonic
posted an update 3 months ago
Tonic
posted an update 4 months ago
🙋🏻‍♂️ hey there folks,

made an image similarity demo to test out the mistral-community/pixtral-12b-240910 model.

If anyone knows how to generate captions with it, please do let me know x 🚀

here's the demo : Tonic/Pixtral

hope you like it 🤗
Tonic
posted an update 4 months ago
So awesome, now i can deploy a jupyterlab on huggingface and deploy gradio from the jupyterlab
Tonic
posted an update 4 months ago
Tonic
posted an update 4 months ago
🙋🏻‍♂️ hey there folks,

✒️ InkubaLM has been trained from scratch using 1.9 billion tokens of data for five African languages, along with English and French data, totaling 2.4 billion tokens of data. It is capable of understanding and generating content in five African languages: Swahili, Yoruba, Hausa, isiZulu, and isiXhosa, as well as English and French.

model : lelapa/InkubaLM-0.4B
demo : Tonic/Inkuba-0.4B