Welcome FalconMamba: The first strong attention-free 7B model Article • Published Aug 12, 2024 • 108
OLMoE Collection • Artifacts for open mixture-of-experts language models • 13 items • Updated 5 days ago • 29
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models Paper • 2108.08877 • Published Aug 19, 2021 • 2
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 69
GritLM Collection • Generative Representational Instruction Tuning (GRIT) • 64 items • Updated Apr 17, 2024 • 7
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension Paper • 1910.13461 • Published Oct 29, 2019 • 3
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models Paper • 2405.05374 • Published May 8, 2024 • 2
Training data-efficient image transformers & distillation through attention Paper • 2012.12877 • Published Dec 23, 2020 • 2
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 126