This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution.
Any commercial use, sale, or other monetization of the H0-mini model and its derivatives, which include models trained on outputs from the H0-mini model or datasets created from the H0-mini model, is prohibited and requires prior approval.
Please note that the primary email used to sign up for your Hugging Face account must match your institutional email to receive approval. By downloading the model, you attest that all information (affiliation, research use) is correct and up-to-date. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the H0-mini model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model.
This model is provided “as-is” without warranties of any kind, express or implied. This model has not been reviewed, certified, or approved by any regulatory body, including but not limited to the FDA (U.S.), EMA (Europe), MHRA (UK), or other medical device authorities. Any application of this model in healthcare or biomedical settings must comply with relevant regulatory requirements and undergo independent validation. Users assume full responsibility for how they use this model and any resulting consequences. The authors, contributors, and distributors disclaim any liability for damages, direct or indirect, resulting from model use. Users are responsible for ensuring compliance with data protection regulations (e.g., GDPR, HIPAA) when using it in research that involves patient data.
If you are a commercial entity, please contact us at hello [at] bioptimus.com to discuss licensing options.

Model card for H0-mini

H0-mini is a lightweight foundation model for histology developed by Owkin and Bioptimus.

The model is a Vision Transformer Base/14 (ViT-B/14) distilled from H-optimus-0 [1] (ViT-g/14) using the DINOv2 [2] self-supervised distillation method on PanCancer40M, a set of 43 million histology tiles extracted from 6,093 TCGA histology slides.

H0-mini achieves performance comparable to current histology foundation models at a significantly reduced inference cost. It also demonstrates strong robustness to variations in staining and scanning protocols. Please refer to the arXiv preprint (arXiv:2501.16239) for additional details.

Figure: Assessment of model robustness to staining and scanning conditions on the PLISM dataset [3]. Median top-10 accuracy and mean cosine similarity were computed for each extractor over 4,095 slide pairs. On both axes, higher values indicate more robust models.
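
As a rough illustration of the two quantities on the figure's axes, the sketch below computes a mean cosine similarity and a top-k retrieval accuracy between features of matched tiles from one slide pair. The exact evaluation protocol is the one described in the preprint. The function names, the assumption that row i of both arrays describes the same tissue region, and the use of cosine similarity for retrieval are illustrative choices made here, not the authors' evaluation code. The random arrays only stand in for real features.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mean_cosine_similarity(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Mean cosine similarity between matched rows of two (n_tiles, dim) arrays."""
    a, b = l2_normalize(feats_a), l2_normalize(feats_b)
    return float(np.mean(np.sum(a * b, axis=1)))


def top_k_accuracy(feats_a: np.ndarray, feats_b: np.ndarray, k: int = 10) -> float:
    """Fraction of tiles in A whose matched tile in B is among its k nearest neighbours."""
    a, b = l2_normalize(feats_a), l2_normalize(feats_b)
    sims = a @ b.T  # (n_tiles, n_tiles) cosine similarities
    # Indices of the k most similar tiles of B for each tile of A.
    top_k = np.argsort(-sims, axis=1)[:, :k]
    targets = np.arange(len(a))[:, None]
    return float(np.mean(np.any(top_k == targets, axis=1)))


# Synthetic stand-in: slide B is a slightly perturbed copy of slide A.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(100, 768))
feats_b = feats_a + 0.1 * rng.normal(size=(100, 768))

print(mean_cosine_similarity(feats_a, feats_b))
print(top_k_accuracy(feats_a, feats_b, k=10))
```

Under this reading, an extractor that is insensitive to staining and scanner changes maps matched tiles to nearby features, which pushes both values towards 1.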

How to use it to extract features.

H0-mini can be used with or without fine-tuning on different downstream applications, such as slide-level classification using multiple-instance learning algorithms (e.g. using ABMIL [4]).
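
To make the multiple-instance learning setting concrete, here is a minimal sketch of an attention-based pooling head in the spirit of ABMIL [4]. It operates on a bag of 768-dimensional tile features, which are assumed to have been extracted already. The class name `AttentionMIL`, the hidden size of 128 and the two-class output are illustrative assumptions, and the random tensor merely stands in for real tile features.

```python
import torch
from torch import nn


class AttentionMIL(nn.Module):
    """Attention-based MIL pooling followed by a linear classifier (illustrative)."""

    def __init__(self, in_dim: int = 768, hidden_dim: int = 128, n_classes: int = 2):
        super().__init__()
        # Scores each tile; a softmax over tiles turns scores into attention weights.
        self.attention = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(in_dim, n_classes)

    def forward(self, bag: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # bag: (n_tiles, in_dim), one row per tile of a single slide.
        weights = torch.softmax(self.attention(bag), dim=0)  # (n_tiles, 1)
        slide_embedding = (weights * bag).sum(dim=0)  # (in_dim,)
        logits = self.classifier(slide_embedding)  # (n_classes,)
        return logits, weights.squeeze(-1)


torch.manual_seed(0)
bag = torch.randn(500, 768)  # stand-in for the features of 500 tiles
mil_head = AttentionMIL()
logits, attention_weights = mil_head(bag)
print(logits.shape, attention_weights.shape)
```

Only the small head is trained in this setup, so the extractor can stay frozen and tile features can be computed once per slide and cached.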

The following code snippet extracts features from histology images using H0-mini.

We recommend using the CLS token features (cls_features) as input features for downstream tasks. Concatenating the CLS token features with the mean of the patch token features (concatenated_features) may improve performance on some tasks.

from huggingface_hub import login
import torch
import timm
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
from torchvision import transforms


# Login to the Hugging Face hub, using your user access token that can be found here:
# https://huggingface.co/settings/tokens
login()

model = timm.create_model(
    "hf-hub:bioptimus/H0-mini",
    pretrained=True,
    mlp_layer=timm.layers.SwiGLUPacked,
    act_layer=torch.nn.SiLU,
)
model.to("cuda")
model.eval()

transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))

# Random image standing in for a real 224x224 histology tile.
# Named `image` to avoid shadowing the Python builtin `input`.
image = torch.rand(3, 224, 224)
image = transforms.ToPILImage()(image)

# We recommend using mixed precision for faster inference.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    with torch.inference_mode():
        output = model(transform(image).unsqueeze(0).to("cuda"))  # (1, 261, 768)
        # CLS token features (1, 768):
        cls_features = output[:, 0]
        # Patch token features (1, 256, 768):
        patch_token_features = output[:, model.num_prefix_tokens :]
        # Concatenate the CLS token features with the mean of the patch token
        # features (1, 1536):
        concatenated_features = torch.cat(
            [cls_features, patch_token_features.mean(1)], dim=-1
        )

assert cls_features.shape == (1, 768)
assert patch_token_features.shape == (1, 256, 768)
assert concatenated_features.shape == (1, 1536)
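
The shapes asserted above follow from simple token bookkeeping, spelled out in the short sketch below. The values 224, 14, 768 and 261 come from the snippet and the model description. Treating the four non-CLS prefix tokens as DINOv2-style register tokens is an assumption made here, not something the snippet states.

```python
# Token bookkeeping for a 224x224 input, using the shapes from the snippet above.
image_size = 224
patch_size = 14  # the "/14" in ViT-B/14
embed_dim = 768  # feature dimension of each token

patches_per_side = image_size // patch_size  # 16
num_patch_tokens = patches_per_side**2  # 256

total_tokens = 261  # sequence length of the model output
num_prefix_tokens = total_tokens - num_patch_tokens  # 5: CLS plus 4 further tokens
# Assumption: the 4 non-CLS prefix tokens are DINOv2-style register tokens.
num_register_tokens = num_prefix_tokens - 1

# CLS features concatenated with mean patch features.
concatenated_dim = 2 * embed_dim  # 1536
print(num_patch_tokens, num_prefix_tokens, concatenated_dim)
```

This is why the patch tokens are sliced with `model.num_prefix_tokens` rather than from index 1: slicing from index 1 would mix the extra prefix tokens into the patch average.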

These features can then be used for downstream applications such as ROI classification (via linear or k-NN probing), slide classification (via multiple instance learning), segmentation (via ViT-Adapter for instance), etc.
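
As an example of the simplest of these options, the sketch below runs k-NN probing on pre-extracted features using cosine similarity and majority voting. The helper `knn_predict`, the choice of k=5 and the synthetic two-class data are illustrative assumptions. In practice the arrays would hold cls_features (or concatenated_features) computed for labelled tiles.

```python
import numpy as np


def knn_predict(
    train_feats: np.ndarray,
    train_labels: np.ndarray,
    test_feats: np.ndarray,
    k: int = 5,
) -> np.ndarray:
    """Predict labels by majority vote among the k most cosine-similar training tiles."""
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = test @ train.T  # (n_test, n_train)
    neighbours = np.argsort(-sims, axis=1)[:, :k]  # (n_test, k)
    neighbour_labels = train_labels[neighbours]  # (n_test, k)
    n_classes = int(train_labels.max()) + 1
    # Count the votes for each class among the k neighbours.
    votes = np.stack(
        [(neighbour_labels == c).sum(axis=1) for c in range(n_classes)], axis=1
    )
    return votes.argmax(axis=1)


# Synthetic stand-in for extracted features: two classes with opposite means.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
centers = np.where(labels[:, None] == 1, 0.5, -0.5)
feats = rng.normal(size=(400, 768)) + centers

train_feats, test_feats = feats[:300], feats[300:]
train_labels, test_labels = labels[:300], labels[300:]

predictions = knn_predict(train_feats, train_labels, test_feats, k=5)
accuracy = float((predictions == test_labels).mean())
print(f"k-NN accuracy on synthetic data: {accuracy:.2f}")
```

Because k-NN probing has no trainable parameters, it gives a quick read on feature quality before investing in linear probing or multiple instance learning.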

Software Dependencies.

torch>=2.0.0: https://pytorch.org
torchvision>=0.15.0: https://pytorch.org/vision/stable/index.html
xformers>=0.0.18: https://github.com/facebookresearch/xformers

Citation.

If you are using this model, please cite our work:

@misc{filiot2025distillingfoundationmodelsrobust,
      title={Distilling foundation models for robust and efficient models in digital pathology}, 
      author={Alexandre Filiot and Nicolas Dop and Oussama Tchita and Auriane Riou and Thomas Peeters and Daria Valter and Marin Scalbert and Charlie Saillard and Geneviève Robin and Antoine Olivier},
      year={2025},
      eprint={2501.16239},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2501.16239}, 
}

Acknowledgements.

Computing resources.

This work was granted access to the High-Performance Computing (HPC) resources of IDRIS under the allocations 2023-A0141012519, 2024-A0161012519 and 2024-GC011015442 made by GENCI.

Code.

H0-mini was built upon the DINOv2 repository (Apache License 2.0).

Datasets.

The results published here are partly based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

References

[1] Saillard, C., Jenatton, R., Llinares-López, F., Mariet, Z., Cahané, D., Durand, E., & Vert, J.-P. (2024). H-optimus-0.
[2] Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., ... & Bojanowski, P. (2023). DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193.
[3] Ochi, M., Komura, D., Onoyama, T., Shinbo, K., Endo, H., Odaka, H., ... & Ishikawa, S. (2024). Registered multi-device/staining histology image dataset for domain-agnostic machine learning models. Scientific Data, 11(1), 330.
[4] Ilse, M., Tomczak, J., & Welling, M. (2018, July). Attention-based deep multiple instance learning. In International conference on machine learning (pp. 2127-2136). PMLR.

Model size: 85.7M parameters (Safetensors, tensor type F32).