This model was compiled explicitly for AWS Neuronx (Inferentia 2 / Trainium 1) with the following code:

from datasets import load_dataset
from transformers import AutoProcessor

from optimum.neuron import NeuronModelForCTC, pipeline


dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
dataset = dataset.sort("id")
sampling_rate = dataset.features["audio"].sampling_rate

# model_id = "hf-internal-testing/tiny-random-Wav2Vec2Model"
model_id = "facebook/wav2vec2-large-960h-lv60-self"
processor = AutoProcessor.from_pretrained(model_id)
input_shapes = {"batch_size": 1, "audio_sequence_length": 100000}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
model = NeuronModelForCTC.from_pretrained(
    model_id,
    export=True,
    disable_neuron_cache=True,
    **input_shapes,
    **compiler_args,
)
model.save_pretrained("wav2vec2_neuron")
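
Because the export above fixes static input shapes (`batch_size` of 1 and `audio_sequence_length` of 100000), audio passed to the compiled model presumably has to match that length. The following is a minimal sketch of one way to prepare a waveform, not part of the original export script. `fit_to_length` and `window_seconds` are hypothetical helper names, and the 16 kHz sampling rate is an assumption; the script above reads the actual rate from the dataset.

```python
# Hypothetical helpers, not part of optimum.neuron: fit a mono waveform to the
# static length chosen at export time (audio_sequence_length=100000 above).
AUDIO_SEQUENCE_LENGTH = 100_000
ASSUMED_SAMPLING_RATE = 16_000  # assumption: 16 kHz audio


def fit_to_length(waveform, target_length=AUDIO_SEQUENCE_LENGTH, pad_value=0.0):
    """Return a list of exactly target_length samples.

    Longer inputs are truncated; shorter inputs are right-padded with pad_value.
    """
    samples = list(waveform)[:target_length]
    samples.extend([pad_value] * (target_length - len(samples)))
    return samples


def window_seconds(target_length=AUDIO_SEQUENCE_LENGTH,
                   sampling_rate=ASSUMED_SAMPLING_RATE):
    """Duration in seconds covered by one fixed-length input window."""
    return target_length / sampling_rate
```

Under the 16 kHz assumption, one window covers 6.25 seconds of audio, so longer recordings would need to be split into chunks before inference. The processor's own padding and truncation options may achieve the same result without a separate helper.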