---
language: en
license: apache-2.0
library_name: transformers
---

# SQFT Base Model: sqft-phi-3-mini-4k-40-base

- Source Model: [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)
- Sparse Method: [Wanda](https://github.com/locuslab/wanda)
- Sparsity: 40%
- Quantization: No
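
For readers unfamiliar with Wanda, the following is a minimal NumPy sketch of its pruning criterion, not the code used to produce this model. Wanda scores each weight by its magnitude times the L2 norm of the corresponding input activation, then removes the lowest-scoring weights within each output row. The function name `wanda_prune` and the random toy data are illustrative assumptions.

```python
import numpy as np


def wanda_prune(weight, activations, sparsity=0.4):
    """Zero out the lowest-scoring weights in each output row.

    weight:      array of shape (out_features, in_features)
    activations: calibration inputs of shape (n_tokens, in_features)
    """
    # L2 norm of each input feature over the calibration tokens.
    feature_norm = np.linalg.norm(activations, axis=0)
    # Wanda importance score: |W_ij| * ||X_j||_2.
    score = np.abs(weight) * feature_norm[None, :]
    # Number of weights to remove per output row.
    n_prune = int(weight.shape[1] * sparsity)
    # Indices of the n_prune smallest scores in every row.
    prune_idx = np.argsort(score, axis=1)[:, :n_prune]
    pruned = weight.copy()
    np.put_along_axis(pruned, prune_idx, 0.0, axis=1)
    return pruned


# Toy data (hypothetical): one 8x10 linear layer and 32 calibration tokens.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 10))
X = rng.normal(size=(32, 10))

W_sparse = wanda_prune(W, X, sparsity=0.4)
sparsity = float((W_sparse == 0).mean())
print(f"fraction of zero weights: {sparsity:.2f}")
```

Comparing within each output row, rather than across the whole matrix, keeps the same fraction of weights in every row, which is why the measured sparsity matches the requested ratio exactly in this toy case.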

## Model Sources

- **Repository:** [https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/SQFT](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/SQFT)
- **Paper:** [SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models](https://arxiv.org/abs/2410.03750)

## How to get this model

Refer to the command in [SQFT/run_command/phi-3-mini-4k-instruct/sparse_quantization.sh#11](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/SQFT/run_command/phi-3-mini-4k-instruct/sparse_quantization.sh#11).
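
Once a sparse checkpoint is available, its sparsity can be checked with a short script. The sketch below assumes the model is published on the Hugging Face Hub under the repo id `IntelLabs/sqft-phi-3-mini-4k-40-base` (inferred from the model name above, not confirmed by this card) and that `transformers` and `torch` are installed. The helper name `linear_layer_sparsity` is hypothetical.

```python
# Assumed Hub repo id, inferred from the model name in this card; adjust if it differs.
MODEL_ID = "IntelLabs/sqft-phi-3-mini-4k-40-base"


def linear_layer_sparsity(model_id=MODEL_ID):
    """Return {layer_name: fraction of zero weights} for every nn.Linear layer."""
    # Imported lazily so that defining the function does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM

    # Downloads the checkpoint on first use.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    report = {}
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear):
            report[name] = (module.weight == 0).float().mean().item()
    return report
```

Calling `linear_layer_sparsity()` and printing the result should show roughly 40% zeros in the pruned layers. Layers that the pruning script leaves untouched, which may include the output head, would report values near zero.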

## Citation

```bibtex
@article{munoz2024sqft,
  title = {SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models},
  author={J. Pablo Munoz and Jinjie Yuan and Nilesh Jain},
  journal={The 2024 Conference on Empirical Methods in Natural Language Processing (Findings)},
  year={2024}
}
```

## Acknowledgement

Thanks to the authors of Wanda ([paper](https://arxiv.org/abs/2306.11695), [code](https://github.com/locuslab/wanda)), whose work provides a simple but effective pruning approach.

## License

Apache-2.0