metadata
library_name: transformers
license: apache-2.0
datasets:
  - HuggingFaceM4/the_cauldron
  - HuggingFaceM4/Docmatix
pipeline_tag: video-text-to-text
language:
  - en
base_model:
  - HuggingFaceTB/SmolLM2-360M-Instruct
  - google/siglip-base-patch16-512
  - HuggingFaceTB/SmolVLM2-500M-Video-Instruct
tags:
  - mlx

HuggingFaceTB/SmolVLM2-500M-Video-Instruct-mlx-8bit-skip-vision

This model was converted to MLX format from HuggingFaceTB/SmolVLM2-500M-Video-Instruct using mlx-vlm version 0.1.13. In this quantized version of the 500M model, the vision tower is left unquantized to avoid issues on iOS. Refer to the original model card for more details on the model.

Use with mlx

pip install -U mlx-vlm
python -m mlx_vlm.generate --model mlx-community/SmolVLM2-500M-Video-Instruct-mlx-8bit-skip-vision --image https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg --prompt "Can you describe this image?"
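
If you want to launch the same command from a script, a minimal sketch could look like the following. The helper name build_generate_command is hypothetical and not part of mlx-vlm; it only assembles the --model, --image and --prompt flags already used in the command above and prints the result instead of running it.

```python
import shlex
import sys


def build_generate_command(model, image, prompt, python=sys.executable):
    """Return the argv list for the mlx_vlm.generate CLI call shown above.

    Hypothetical helper (not part of mlx-vlm): it only uses the --model,
    --image and --prompt flags that appear in this README.
    """
    return [
        python, "-m", "mlx_vlm.generate",
        "--model", model,
        "--image", image,
        "--prompt", prompt,
    ]


cmd = build_generate_command(
    model="mlx-community/SmolVLM2-500M-Video-Instruct-mlx-8bit-skip-vision",
    image="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg",
    prompt="Can you describe this image?",
)

# Print a copy-pasteable, shell-quoted form of the command.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it, assuming mlx-vlm is installed in the same Python environment. Building an argv list rather than a single string avoids shell-quoting problems when the prompt contains spaces or quotes.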