Ji-Xiang

AI & ML interests

None yet

Ji-Xiang's activity

upvoted a collection 5 days ago

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 3 days ago • 89

upvoted a collection 6 days ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 8 items • Updated 6 days ago • 160

upvoted 2 papers about 1 month ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 99

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Paper • 2409.20566 • Published Sep 30 • 51

upvoted 2 collections about 1 month ago

NVLM 1.0

A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks and text-only tasks. • 1 item • Updated Oct 1 • 48

CogVideo

10 items • Updated 2 days ago • 23

upvoted 2 articles about 1 month ago

Article

A Short Summary of Chinese AI Global Expansion

Oct 1 • 13

Article

Getting Started with Sentiment Analysis using Python

Feb 2, 2022 • 28

upvoted 3 collections about 1 month ago

Whisper Release

Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 86

Molmo

Artifacts for open multimodal language models. • 5 items • Updated Sep 26 • 269

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 215

upvoted 2 collections about 2 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 17 days ago • 453

Llama3-8B-1.58

A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14 • 12

upvoted a collection 3 months ago

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59

upvoted 2 articles 4 months ago

Article

Faster fine-tuning using TRL & Unsloth

Jan 10 • 37

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Jul 23 • 213

upvoted a collection 4 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 613

upvoted an article 4 months ago

Article

WWDC 24: Running Mistral 7B with Core ML

Jul 22 • 55

upvoted 2 collections 4 months ago

DCLM

DCLM Models + Datasets • 6 items • Updated Oct 4 • 24

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 192