license: apple-ascl

MobileCLIP CoreML Models

This repository contains the CoreML versions of the MobileCLIP models. For more details, refer to MobileCLIP on HuggingFace and MobileCLIP on GitHub.

The models are separated for each subarchitecture:

  • MobileCLIP-S0: This subarchitecture is designed for lightweight and fast inference, making it suitable for edge devices with limited computational resources.
  • MobileCLIP-S1: This subarchitecture offers a balance between model complexity and performance, providing a good trade-off for various applications.
  • MobileCLIP-S2: This subarchitecture focuses on achieving higher accuracy, ideal for applications where some inference speed can be traded for better results.
  • MobileCLIP-B: This subarchitecture aims at delivering the highest possible accuracy, optimized for environments with ample computational resources.

Each subarchitecture has a text encoder and an image encoder, each exported as a separate CoreML model:

| Model | CLIP Text | CLIP Image |
|---|---|---|
| MobileCLIP-S0 | clip_text_s0.mlpackage | clip_image_s0.mlpackage |
| MobileCLIP-S1 | clip_text_s1.mlpackage | clip_image_s1.mlpackage |
| MobileCLIP-S2 | clip_text_s2.mlpackage | clip_image_s2.mlpackage |
| MobileCLIP-B | clip_text_B.mlpackage | clip_image_B.mlpackage |
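The file names listed above can be looked up by variant with a small helper. This is only a sketch: `package_names` and `MOBILECLIP_PACKAGES` are hypothetical names introduced for illustration, and the only data used is the file names from the table. Note that the B variant uses an uppercase suffix while the S variants use lowercase.

```python
# File names copied verbatim from the table above.
# The dictionary and helper names are hypothetical, not part of this repository.
MOBILECLIP_PACKAGES = {
    "S0": ("clip_text_s0.mlpackage", "clip_image_s0.mlpackage"),
    "S1": ("clip_text_s1.mlpackage", "clip_image_s1.mlpackage"),
    "S2": ("clip_text_s2.mlpackage", "clip_image_s2.mlpackage"),
    "B": ("clip_text_B.mlpackage", "clip_image_B.mlpackage"),
}


def package_names(variant):
    """Return (text_package, image_package) for a variant such as 'MobileCLIP-S0' or 's0'."""
    key = variant.upper()
    prefix = "MOBILECLIP-"
    if key.startswith(prefix):
        key = key[len(prefix):]
    if key not in MOBILECLIP_PACKAGES:
        raise KeyError(f"unknown MobileCLIP variant: {variant!r}")
    return MOBILECLIP_PACKAGES[key]


if __name__ == "__main__":
    text_pkg, image_pkg = package_names("MobileCLIP-S0")
    print(text_pkg, image_pkg)
```

Accepting both the full name and the short suffix keeps call sites flexible without guessing at any file name that is not in the table.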

For detailed implementation and architecture specifics, refer to the MobileCLIP GitHub repository.

CoreML Parameters:

| Model | Input Name | Input Shape | Input DataType | Output Name | Output Shape | Output DataType |
|---|---|---|---|---|---|---|
| CLIP Text | input_text | (1,77) | INT32 | output_embeddings | (1,512) | FLOAT16 |

| Model | Input Name | Input Width | Input Height | Input ColorSpace | Output Name | Output Shape | Output DataType |
|---|---|---|---|---|---|---|---|
| CLIP Image | input_image | 256 | 256 | RGB | output_embeddings | (1,512) | FLOAT16 |
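The shapes above imply two pieces of glue code around the models: the text input must be a (1,77) batch of integer token IDs, and the two encoders emit 512-dimensional embeddings that are typically compared with cosine similarity. The sketch below illustrates both using only the standard library. The helper names, the padding value of 0, and the placeholder token IDs are assumptions made for illustration; the actual tokenizer and padding convention should be taken from the MobileCLIP GitHub repository.

```python
import math

CONTEXT_LENGTH = 77   # text input shape (1, 77) from the table above
EMBEDDING_DIM = 512   # output shape (1, 512) from the table above


def pad_token_ids(token_ids, pad_id=0):
    """Pad a list of token IDs to a (1, 77) nested list.

    pad_id=0 is an assumption; check the tokenizer used for conversion.
    Over-long inputs raise instead of being truncated, because the right
    truncation rule depends on the tokenizer.
    """
    if len(token_ids) > CONTEXT_LENGTH:
        raise ValueError(f"expected at most {CONTEXT_LENGTH} tokens, got {len(token_ids)}")
    row = list(token_ids) + [pad_id] * (CONTEXT_LENGTH - len(token_ids))
    return [row]


def cosine_similarity(a, b):
    """Cosine similarity between two 512-dimensional embeddings."""
    if len(a) != EMBEDDING_DIM or len(b) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dimensional embeddings")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cannot compare a zero-length embedding")
    return dot / norm


if __name__ == "__main__":
    batch = pad_token_ids([1, 2, 3])  # placeholder IDs, not real tokens
    print(len(batch), len(batch[0]))
```

Because cosine similarity divides by both vector norms, it gives the same result whether or not the exported models already normalize `output_embeddings`.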

These are example scripts for performing the conversion to CoreML:

  1. CLIPImageModel to CoreML (Colab notebook)

    • This notebook shows how to convert a CLIP image model to CoreML format.
  2. CLIPTextModel to CoreML (Colab notebook)

    • This notebook shows how to convert a CLIP text model to CoreML format.