nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-tensor_wts_tensor_act_int8-BitM Updated Dec 17, 2024 • 2
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-tensor_wts_tensor_act_fp8-BitM Updated Dec 17, 2024 • 8
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-tensor_wts_per_tok_dyn_act_int8-BitM Updated Dec 17, 2024 • 2
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-tensor_wts_per_tok_dyn_act_fp8-BitM Updated Dec 17, 2024 • 7
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-chnl_wts_tensor_act_int8-BitM Updated Dec 17, 2024 • 5
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-chnl_wts_tensor_act_fp8-BitM Updated Dec 17, 2024 • 10
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-tensor_wts_tensor_act_int8-BitM Updated Dec 17, 2024 • 6
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-chnl_wts_per_tok_dyn_act_int8-BitM Updated Dec 17, 2024 • 4
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-tensor_wts_tensor_act_fp8-BitM Updated Dec 17, 2024 • 5
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-tensor_wts_per_tok_dyn_act_int8-BitM Updated Dec 17, 2024 • 2
nm-testing/SparseLlama-3.1-8B-gsm8k-pruned.2of4-chnl_wts_per_tok_dyn_act_fp8-BitM Updated Dec 17, 2024 • 16
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-tensor_wts_per_tok_dyn_act_fp8-BitM Updated Dec 17, 2024 • 2
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-chnl_wts_tensor_act_int8-BitM Updated Dec 17, 2024 • 2
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-chnl_wts_tensor_act_fp8-BitM Updated Dec 17, 2024 • 4
nm-testing/TinyLlama-1.1B-Chat-v1.0-gsm8k-pruned.2of4-chnl_wts_per_tok_dyn_act_fp8-BitM Updated Dec 17, 2024 • 15