How to try it out? I provide a WIP attempt below

#1 by billy-ai - opened

Hi all, I installed Python 3.10.8 along with the latest versions of torch and transformers. Afterwards, I tried the following code:

from transformers import GPTJModel, GPTJConfig
import torch
configuration = GPTJConfig()

# Initializing a model from the configuration
model = GPTJModel(configuration)

# (First I downloaded the model checkpoint)
state_dict = torch.load("GPT-JT-6B-v1/pytorch_model.bin")
model.load_state_dict(state_dict)
model.eval()

but I can't actually use the model. When I call generate(), I get: TypeError: The current model class (GPTJModel) is not compatible with .generate(), as it doesn't have a language model head. Please use one of the following classes instead: {'GPTJForCausalLM'}

Any ideas on how to use the model after loading it? :)

Did you try using GPTJForCausalLM and supplying the .bin? https://huggingface.co./transformers/v4.11.3/model_doc/gptj.html

I haven't found the time to try it out myself yet.
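
Untested, but a minimal sketch of that suggestion would look like the snippet below. It assumes the default GPTJConfig matches the checkpoint, which should hold since GPT-JT is a fine-tuned GPT-J-6B, and reuses the checkpoint path from the original post:

from transformers import GPTJForCausalLM, GPTJConfig
import torch

# Assumption: the default GPTJConfig matches the GPT-J-6B architecture
configuration = GPTJConfig()
model = GPTJForCausalLM(configuration)

# Load the downloaded checkpoint (path taken from the original post)
state_dict = torch.load("GPT-JT-6B-v1/pytorch_model.bin")
model.load_state_dict(state_dict)
model.eval()

Because GPTJForCausalLM includes the LM head, generate() should work on this model.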

You can just call from_pretrained('your-local-model-dir') with the Hugging Face transformers lib.
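
For example, a sketch assuming 'your-local-model-dir' is a placeholder for a directory containing both config.json and the weights:

from transformers import GPTJForCausalLM

# from_pretrained loads the config and weights together;
# the directory name is a placeholder for your local download
model = GPTJForCausalLM.from_pretrained("your-local-model-dir")
model.eval()

AutoModelForCausalLM.from_pretrained would work the same way here, since the config tells it which model class to build.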

Together org

Hi, you can simply do:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1").eval().half().to("cuda:0")

Or, if you prefer to download and load manually, use GPTJForCausalLM instead of GPTJModel.
As the error message says, GPTJModel does not support generate() because it has no LM head, only the embeddings and the transformer layers.
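
For completeness, a short end-to-end sketch building on the snippet above; the prompt and generation parameters are illustrative, not part of the original answer:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-JT-6B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1").eval().half().to("cuda:0")

# Tokenize an example prompt and move the tensors to the same device as the model
inputs = tokenizer("Q: What is the capital of France?\nA:", return_tensors="pt").to("cuda:0")

# generate() works here because AutoModelForCausalLM resolves to GPTJForCausalLM, which has an LM head
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))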
