Andrew (sealad886)
1 follower · 2 following
AI & ML interests: None yet
Recent Activity
New activity about 2 hours ago in mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX: Multiple quants for MLX framework
New activity about 2 hours ago in mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX: How to get these new quantized versions of the model in LM Studio?
Updated a model about 3 hours ago: mlx-community/DeepSeek-R1-Distill-Qwen-7B-MLX
Organizations
sealad886's activity
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX about 2 hours ago:
Multiple quants for MLX framework (#25, opened about 4 hours ago by sealad886)
How to get these new quantized versions of the model in LM Studio? (#24, opened about 6 hours ago by furyzhenxi)
Updated 2 models about 3 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-7B-MLX (Text Generation, updated about 3 hours ago)
mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX (Text Generation, updated about 3 hours ago)
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX about 3 hours ago:
Multiple quants for MLX framework (#1, opened about 3 hours ago by sealad886)
Published a model about 3 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX (Text Generation, updated about 3 hours ago)
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-3B-MLX about 3 hours ago:
Multiple quants for MLX framework (#1, opened about 3 hours ago by sealad886)
Published a model about 3 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-3B-MLX (updated about 3 hours ago)
Updated a model about 4 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-14B-MLX (Text Generation, updated about 4 hours ago)
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-7B-MLX about 4 hours ago:
Multiple quants for MLX framework (#1, opened about 4 hours ago by sealad886)
Published a model about 4 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-7B-MLX (Text Generation, updated about 3 hours ago)
Updated a model about 4 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX (Text Generation, updated about 2 hours ago) · 22
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-14B-MLX about 4 hours ago:
Multiple quants for MLX framework (#2, opened about 4 hours ago by sealad886)
Multiple quants for MLX framework (#1, opened about 8 hours ago by sealad886)
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX about 6 hours ago:
Multiple quants for MLX framework (#23, opened about 8 hours ago by sealad886)
Published a model about 8 hours ago:
mlx-community/DeepSeek-R1-Distill-Qwen-14B-MLX (Text Generation, updated about 4 hours ago)
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX about 8 hours ago:
Multiple quants for MLX framework (#17, opened about 10 hours ago by sealad886)
New activity in mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX about 10 hours ago:
Parent PR: Add Multi-Quantization Support for DeepSeek-R1-Distill-Qwen-32B via MLX_LM

This PR introduces a new conversion pipeline that generates multiple quantized variants of **DeepSeek-R1-Distill-Qwen-32B** using the **MLX_LM** tool. Unlike previous methods based on llama.cpp, this implementation leverages MLX_LM's `quant_predicate` configuration to produce high-quality mixed-bit quantizations optimized specifically for MLX runs.

Key changes and features:
- **MLX_LM-based conversion:** All model conversions are performed with MLX_LM, using parameters such as `q_bits`, `q_group_size`, and the distinctive `quant_predicate` (e.g., `"mixed_3_6"`, `"mixed_2_6"`) to create finely tuned quantized models. This provides a strong balance between quality and performance tailored for MLX inference.
- **Asynchronous workflow:** The pipeline supports asynchronous conversion and upload tasks. Each quantized variant is generated concurrently and then uploaded to the designated Hugging Face repository, streamlining the overall process (a minimal sketch follows this entry).
- **Updated documentation:** The repository's README has been fully updated to reflect the MLX_LM conversion process, with clear instructions on prompt formatting, downloading individual variants, and running the models with MLX. The documentation emphasizes that these quantizations are for MLX runs only and are not intended for general GGUF deployments.
- **Enhanced user flexibility:** With multiple quantization options (including bf16, Q8_0, Q6_K, Q5_K_M, Q4_K_M, IQ4_NL, etc.), users can select the variant that best meets their hardware and performance requirements. Detailed usage and download instructions facilitate easy deployment.

Benefits:
- **Optimized for MLX runs:** The generated quantized models are designed specifically for MLX inference, ensuring performance and compatibility with MLX's specialized runtime.
- **Scalability and future-proofing:** The modular pipeline allows easy integration of additional quantization recipes and future enhancements while keeping the conversion process aligned with MLX_LM's capabilities.
- **Comprehensive documentation:** The updated README and model card provide thorough guidance on model usage, including prompt format, download instructions, and hardware-specific recommendations.

This PR represents a significant step toward making DeepSeek-R1-Distill-Qwen-32B more accessible and versatile for MLX users. It establishes a parent PR that will be referenced by every subsequent quantized model upload, ensuring consistency and traceability across all releases. Feedback and suggestions are welcome.
#22 opened about 10 hours ago by sealad886
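The conversion-and-upload flow described in the parent PR above can be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline: the recipe list, output directories, source checkpoint, and target repository layout are assumptions, and the exact `mlx_lm.convert` command-line flags (in particular `--quant-predicate`) may differ between mlx-lm versions.

```python
# Hypothetical sketch of the multi-quant conversion and upload workflow described in the PR.
# Recipe list, paths, and repo layout are assumptions; mlx_lm.convert flags may vary by version.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from huggingface_hub import HfApi

HF_MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"           # assumed source weights
TARGET_REPO = "mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX"  # assumed target repo

# One entry per quantized variant: (subfolder name, extra CLI flags).
RECIPES = [
    ("bf16",      ["--dtype", "bfloat16"]),
    ("8bit",      ["-q", "--q-bits", "8", "--q-group-size", "64"]),
    ("4bit",      ["-q", "--q-bits", "4", "--q-group-size", "64"]),
    ("mixed_3_6", ["-q", "--quant-predicate", "mixed_3_6"]),
    ("mixed_2_6", ["-q", "--quant-predicate", "mixed_2_6"]),
]

api = HfApi()

def convert_and_upload(name: str, flags: list[str]) -> str:
    out_dir = f"build/{name}"
    # Convert the Hugging Face checkpoint into an MLX checkpoint with this recipe.
    subprocess.run(
        ["python", "-m", "mlx_lm.convert",
         "--hf-path", HF_MODEL, "--mlx-path", out_dir, *flags],
        check=True,
    )
    # Upload the finished variant into its own subfolder of the target repository.
    api.upload_folder(folder_path=out_dir, repo_id=TARGET_REPO,
                      path_in_repo=name, repo_type="model")
    return name

# Run the variants concurrently, matching the "asynchronous workflow" bullet above.
with ThreadPoolExecutor(max_workers=2) as pool:
    for finished in pool.map(lambda item: convert_and_upload(*item), RECIPES):
        print(f"finished {finished}")
```

Running the conversions in a small worker pool keeps one quantization busy on the CPU/GPU while another variant uploads, which is the practical benefit the PR attributes to its asynchronous workflow.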
Parent PR: Add Multi-Quantization Support for DeepSeek-R1-Distill-Qwen-32B via MLX_LM (description identical to #22 above)
#21 opened about 10 hours ago by sealad886
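For the "running the models with MLX" step that the updated README is said to cover, a typical loading and generation call looks roughly like the snippet below. This is a generic mlx-lm usage sketch rather than text from the repository's README; the repo id, prompt, and token budget are placeholders, and if the quantized variants live in subfolders of the repo you may need to download the specific variant you want first.

```python
# Generic example of running one of the MLX quants locally with mlx-lm.
# The repo id and prompt are placeholders; pick the variant that fits your hardware.
from mlx_lm import load, generate

# Download (on first use) and load a quantized variant from the Hub.
model, tokenizer = load("mlx-community/DeepSeek-R1-Distill-Qwen-7B-MLX")

# DeepSeek-R1 distills expect chat-formatted prompts, so apply the chat template.
messages = [{"role": "user", "content": "Explain what mixed 3/6-bit quantization means."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Generate a completion; available options vary by mlx-lm version.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)
```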