Edit model card
YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co./docs/hub/model-cards#model-card-metadata)

This is a copy of the original OPT weights that is more efficient to use with the DeepSpeed-MII and DeepSpeed-Inference. In this repo the original tensors are split into 8 shards to target 8 GPUs, this allows the user to run the model with DeepSpeed-inference Tensor Parallelism.

For specific details about the OPT model itself, please see the original OPT model card.

For examples on using this repo please see the following:

Downloads last month
11
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.