GGML-converted versions of EleutherAI's GPT-J model

Description

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

| Hyperparameter | Value |
|---|---|
| $n_{parameters}$ | 6053381344 |
| $n_{layers}$ | 28* |
| $d_{model}$ | 4096 |
| $d_{ff}$ | 16384 |
| $n_{heads}$ | 16 |
| $d_{head}$ | 256 |
| $n_{ctx}$ | 2048 |
| $n_{vocab}$ | 50257/50400† (same tokenizer as GPT-2/3) |
| Positional Encoding | Rotary Position Embedding (RoPE) |
| RoPE Dimensions | 64 |

* Each layer consists of one feedforward block and one self-attention block.

† Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer.
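The figures in the table and footnotes can be cross-checked with a few lines of arithmetic. The following is only a sanity-check sketch: the variable names are illustrative and the values are copied from the table, not read from a model file.

```python
# Values copied from the hyperparameter table (variable names are illustrative).
d_model = 4096
d_ff = 16384
n_heads = 16
d_head = 256
rotary_dim = 64
n_vocab_used = 50257        # entries used by the GPT-2 tokenizer
n_vocab_embedding = 50400   # rows in the embedding matrix

# The heads together should span the full model dimension.
heads_cover_model = n_heads * d_head == d_model
# Ratio of feedforward width to model width.
ff_ratio = d_ff // d_model
# Embedding rows that the tokenizer never indexes.
unused_embedding_rows = n_vocab_embedding - n_vocab_used
# Share of each head that receives rotary position information.
rotary_fraction = rotary_dim / d_head

print(heads_cover_model)      # True
print(ff_ratio)               # 4
print(unused_embedding_rows)  # 143
print(rotary_fraction)        # 0.25
```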

The model consists of 28 layers with a model dimension of 4096 and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model was trained with a tokenization vocabulary of 50257, using the same set of BPEs as GPT-2/GPT-3.
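To illustrate what "RoPE applied to 64 dimensions of each head" means, here is a minimal pure-Python sketch that rotates only the first 64 entries of a single 256-dimensional head vector and passes the remaining 192 through unchanged. The function name `apply_partial_rope`, the frequency base of 10000, and the pairing of adjacent entries are assumptions made for illustration; this card does not state them, and the sketch is not the model's actual implementation.

```python
import math

D_HEAD = 256      # per-head dimension, from the table
ROTARY_DIM = 64   # dimensions per head that receive RoPE, from the table
BASE = 10000.0    # assumed frequency base; not stated in this card


def apply_partial_rope(vec, position, rotary_dim=ROTARY_DIM, base=BASE):
    """Rotate the first `rotary_dim` entries of one head vector; keep the rest."""
    out = list(vec)
    for i in range(0, rotary_dim, 2):
        # Each adjacent pair (i, i + 1) is rotated by an angle that depends on
        # the token position and on the pair index (assumed pairing scheme).
        theta = position * base ** (-i / rotary_dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out[i] = x * cos_t - y * sin_t
        out[i + 1] = x * sin_t + y * cos_t
    return out


# A dummy head vector, rotated as if it sat at token position 5.
head = [float(k % 7) for k in range(D_HEAD)]
rotated = apply_partial_rope(head, position=5)

# Entries beyond the rotary range are passed through untouched.
print(rotated[ROTARY_DIM:] == head[ROTARY_DIM:])
```

Because each pair is only rotated, the vector's length is preserved, and at position 0 the function returns its input unchanged.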

Converted Models

Usage

Python via llm-rs:

Installation

Via pip: pip install llm-rs

Run inference

from llm_rs import AutoModel

# Load the model; set `model_file` to any file from the list above
model = AutoModel.from_pretrained("rustformers/gpt-j-ggml", model_file="gpt-j-6b-q4_0-ggjt.bin")

# Generate a completion for the prompt and print it
print(model.generate("The meaning of life is"))

Rust via Rustformers/llm:

Installation

git clone --recurse-submodules https://github.com/rustformers/llm.git
cd llm
cargo build --release

Run inference

cargo run --release -- gptj infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"