xMAD.ai


The xMADified Family

From xMAD.ai

Welcome to the official Hugging Face organization for xMADified models from xMAD.ai!

The repositories below contain popular open-source models xMADified with our NeurIPS 2024 methods, quantized from 16-bit floats to 4-bit integers using xMAD.ai proprietary technology.
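As a quick, unofficial sketch (the individual model cards have the authoritative instructions, and a quantization backend package may be required depending on the weight format), loading one of these repositories through the standard transformers API looks roughly like this:

```python
# Minimal sketch: loading an xMADified INT4 model via transformers' standard API.
# The repo id is one published by this organization; other sizes are listed in the table below.
# Depending on the quantization format, an extra backend package may be needed
# (see the individual model card for exact install/usage instructions).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xmadai/Llama-3.1-70B-Instruct-xMADai-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place the 4-bit weights across whatever GPUs are available
)

prompt = "Summarize what 4-bit weight quantization does."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```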

These models are fine-tunable on the same reduced (4x smaller) hardware in just three clicks.
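The three-click fine-tuning flow runs through the xMAD.ai platform (see the beta link below). For a rough open-source analogue, attaching LoRA adapters to the 4-bit weights with the peft library could look like the sketch below; the adapter settings here are illustrative defaults, not the exact xMAD.ai recipe, and the details may vary with the quantization format:

```python
# Sketch only: parameter-efficient fine-tuning of a 4-bit model with LoRA adapters
# via the open-source peft library. This is a generic recipe, not the 3-click flow;
# hyperparameters and target modules are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained(
    "xmadai/Llama-3.1-70B-Instruct-xMADai-INT4",
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # freeze base weights, prep for k-bit training

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common default for Llama
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA matrices are trained
```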

Watch our product demo here

CLICK HERE TO JOIN BETA for:

  • No-code deployment
  • Proprietary Dataset Management
  • On-Premise Fine-tuning
  • Endpoint Scaling
  • System Health Monitoring
  • Seamless API Integration

and more!

The memory and hardware requirements (the GPU memory needed both to run and to fine-tune each model) are listed in the table below:

| Model | GPU Memory Requirement (Before → After) |
|---|---|
| Llama-3.1-405B-Instruct-xMADai-INT4 | 800 GB (16 H100s) → 250 GB (8 V100s) |
| Llama-3.1-Nemotron-70B-Instruct-xMADai-INT4 | 140 GB (4 L40S) → 40 GB (1 L40S) |
| Llama-3.1-8B-Instruct-xMADai-INT4 | 16 GB → 7 GB (any laptop GPU) |
| Llama-3.2-3B-Instruct-xMADai-INT4 | 6.5 GB → 3.5 GB (any laptop GPU) |
| Llama-3.2-1B-Instruct-xMADai-4bit | 2.5 GB → 2 GB (any laptop GPU) |
| Mistral-Small-Instruct-2409-xMADai-INT4 | 44 GB → 12 GB (T4) |
| Mistral-Large-Instruct-2407-xMADai-INT4 | 250 GB → 65 GB (1 A100) |
| gemma-2-9b-it-xMADai-INT4 | 18.5 GB → 8 GB (any laptop GPU) |
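As a rough illustration, you can compare the post-xMADification figures above with your local GPU's memory before downloading a model; the numbers in the snippet below simply mirror a few rows of the table:

```python
# Illustrative helper: compare a model's post-xMADification memory figure (from the
# table above) with the total memory of the local GPU. Requires PyTorch with CUDA.
import torch

REQUIREMENTS_GB = {
    "Llama-3.1-8B-Instruct-xMADai-INT4": 7.0,
    "Llama-3.2-3B-Instruct-xMADai-INT4": 3.5,
    "gemma-2-9b-it-xMADai-INT4": 8.0,
}

def fits_locally(model_name: str, device: int = 0) -> bool:
    """Return True if GPU `device` has at least the memory listed for `model_name`."""
    if not torch.cuda.is_available():
        return False
    total_gb = torch.cuda.get_device_properties(device).total_memory / 1024**3
    return total_gb >= REQUIREMENTS_GB[model_name]

for name in REQUIREMENTS_GB:
    print(f"{name}: {'fits' if fits_locally(name) else 'does not fit'} on this GPU")
```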

xMAD.ai LinkedIn
