arxiv:2407.10994
Alistarh
d-alistarh
AI & ML interests: NLP
Recent Activity
Authored a paper 16 days ago: Model compression via distillation and quantization
Authored a paper 16 days ago: Sparse Finetuning for Inference Acceleration of Large Language Models
Authored a paper 16 days ago: Towards End-to-end 4-Bit Inference on Generative Large Language Models
Papers: 25
Models: none public yet
Datasets: none public yet