Michael Coppola

m18coppola

http://michaeljcoppola.com/

AI & ML interests

AI lobotomies, nlp

Recent Activity

upvoted a collection 18 days ago

Human-Like LLMs

liked a Space 2 months ago

huggingface/open-source-ai-year-in-review-2024

upvoted a collection 3 months ago

OpenCoder

m18coppola's activity

upvoted a collection 18 days ago

Human-Like LLMs

Collection

Human-Like LLMs series. • 5 items • Updated 14 days ago • 13

liked a Space 2 months ago

Running

521

😻

Open Source Ai Year In Review 2024

What happened in open-source AI this year, and what’s next?

upvoted a collection 3 months ago

OpenCoder

Collection

OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23, 2024 • 79

New activity in meta-llama/Llama-3.1-8B-Instruct 6 months ago

BUG : Using `AutoTokenizer.from_pretrained`'s `.encode()` function fails to add BOS token

#21 opened 6 months ago by

m18coppola
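
The issue title above concerns encoded output that lacks a leading BOS (beginning-of-sequence) token. The following is a minimal sketch of how one might detect and guard against that, not a fix taken from the issue thread. The helper names `starts_with_bos` and `ensure_bos` are hypothetical, and the BOS id `1` is a placeholder. With `transformers`, the ids would typically come from `tokenizer.encode(text)` and the id from `tokenizer.bos_token_id`.

```python
from typing import List


def starts_with_bos(token_ids: List[int], bos_token_id: int) -> bool:
    """Return True if the sequence is non-empty and begins with the BOS id."""
    return bool(token_ids) and token_ids[0] == bos_token_id


def ensure_bos(token_ids: List[int], bos_token_id: int) -> List[int]:
    """Return a copy of the sequence with exactly one leading BOS id.

    If the sequence already starts with BOS it is returned unchanged,
    which avoids accidentally producing a doubled BOS token.
    """
    if starts_with_bos(token_ids, bos_token_id):
        return list(token_ids)
    return [bos_token_id] + list(token_ids)


# Placeholder values for illustration only (not real Llama token ids).
BOS_ID = 1
encoded_without_bos = [42, 43, 44]
encoded_with_bos = [BOS_ID, 42, 43, 44]

fixed = ensure_bos(encoded_without_bos, BOS_ID)
unchanged = ensure_bos(encoded_with_bos, BOS_ID)
```

Checking before prepending matters because some chat templates already insert BOS, and adding it a second time can change model behavior.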

liked a model 7 months ago

facebook/chameleon-7b

Image-Text-to-Text • Updated Jul 23, 2024 • 22.1k • 173

liked a Space 7 months ago

Running

110

🏔️

Open-LLM performances are plateauing, let’s make the leaderboard steep again

liked 2 models 8 months ago

nyunai/nyun-c2-llama3-50B

Text Generation • Updated Jun 13, 2024 • 34 • 11

nvidia/NV-Embed-v1

Updated Nov 30, 2024 • 6.79k • 422

reacted to Avelina's post with ❤️😔 9 months ago

Post

1168

Found out my ECCV paper is getting rejected because of a LaTeX compile error :(

reacted to mrm8488's post with 🔥 9 months ago

Post

5936

Working on a concept GPT-2 (small) that uses KANs instead of MLPs.
The checkpoint and training code will be on the Hub soon.

6 replies

liked 2 models 10 months ago

GritLM/GritLM-7B

Text Generation • Updated Feb 16, 2024 • 72.7k • 90

mistralai/Mixtral-8x22B-Instruct-v0.1

Text Generation • Updated Oct 3, 2024 • 1.33M • 707

reacted to akhaliq's post with 🔥 10 months ago

Post

3256

LLM2Vec

Large Language Models Are Secretly Powerful Text Encoders

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders (2404.05961)

Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks. Yet, the community is only slowly adopting these models for text embedding tasks, which require rich contextualized representations. In this work, we introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder. LLM2Vec consists of three simple steps: 1) enabling bidirectional attention, 2) masked next token prediction, and 3) unsupervised contrastive learning. We demonstrate the effectiveness of LLM2Vec by applying it to 3 popular LLMs ranging from 1.3B to 7B parameters and evaluate the transformed models on English word- and sequence-level tasks. We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB). Moreover, when combining LLM2Vec with supervised contrastive learning, we achieve state-of-the-art performance on MTEB among models that train only on publicly available data. Our strong empirical results and extensive analysis demonstrate that LLMs can be effectively transformed into universal text encoders in a parameter-efficient manner without the need for expensive adaptation or synthetic GPT-4 generated data.
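
Step 1 in the abstract above (enabling bidirectional attention) can be illustrated with a toy comparison of attention masks. This is a minimal sketch in plain Python, not the LLM2Vec implementation. The function names `causal_mask`, `bidirectional_mask`, and `visible_positions` are hypothetical and exist only for this illustration.

```python
from typing import List

Mask = List[List[bool]]


def causal_mask(n: int) -> Mask:
    """Decoder-style mask: position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> Mask:
    """Encoder-style mask: every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def visible_positions(mask: Mask, i: int) -> List[int]:
    """List the positions that query position i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


seq_len = 4
causal = causal_mask(seq_len)
bidirectional = bidirectional_mask(seq_len)

# Under the causal mask, token 1 sees only tokens 0 and 1.
print(visible_positions(causal, 1))         # [0, 1]
# Under the bidirectional mask, token 1 sees the whole sequence.
print(visible_positions(bidirectional, 1))  # [0, 1, 2, 3]
```

With the causal mask, early tokens never see later context, which limits the per-token representations that embedding tasks rely on. Removing that restriction is what the abstract lists as the first of its three steps.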

liked a model 10 months ago

fireworks-ai/mixtral-8x22b-instruct-oh

Text Generation • Updated Apr 15, 2024 • 11 • 29

reacted to alielfilali01's post with 🧠 10 months ago

Post

2184

Honestly, I don't understand how we, as the open-source community, haven't surpassed GPT-4 yet. To me it looks like everything is out there and just needs to be exploited! Clearly, specialized small models outperform GPT-4 on downstream tasks! So why haven't we just trained a really strong 1B-2B general model and then continued pretraining and/or fine-tuning it on datasets for downstream tasks like math, code... well structured in textbook format or other dataset formats that have proven to be really efficient and good! Once you have 100 fine-tuned models, just wrap them all into a FrankenMoE and voilà ✨
And that's just what a NOOB like myself had in mind; I'm sure there are better, more efficient ways to do it! So the question again: why haven't we yet? I feel I'm missing something... Right?

5 replies

liked 4 models 10 months ago