metadata

license: apache-2.0
datasets:
  - togethercomputer/RedPajama-Data-1T-Sample
language:
  - en

Landmark Attention LLaMA 33B

This model was fine-tuned with PEFT LoRA using the Landmark Attention method for 200 steps. It will likely be trained further and updated later.

Usage

This model is unlikely to work with popular frontends (e.g. KoboldAI and Oobabooga) because they lack support for landmark tokens.

PEFT Checkpoint

You can likely merge the adapter checkpoint with any other LLaMA-based model, provided it is also 33B. This repo contains the merged weights, but you can grab the adapter here.
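
As a rough illustration of merging the adapter into another 33B LLaMA-based model, here is a minimal sketch using the `peft` and `transformers` libraries. It assumes the adapter loads cleanly against a stock LLaMA architecture, which has not been verified for this checkpoint. If the adapter expects an extra landmark token in the vocabulary, the base model's embeddings may need resizing first, and that step is not shown. The function name and all paths are hypothetical placeholders.

```python
def merge_lora_adapter(base_model_path: str, adapter_path: str, output_dir: str) -> None:
    """Merge a PEFT LoRA adapter into a base model and save the merged weights.

    All three arguments are placeholders supplied by the caller; none of them
    refer to real locations in this repo.
    """
    # Imported lazily so that defining the function does not require the
    # heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    # Load the base model in half precision. A 33B model still needs roughly
    # 65 GB of memory at this precision.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_path,
        torch_dtype=torch.float16,
    )

    # Attach the LoRA adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base, adapter_path)

    # Fold the low-rank updates into the base weights and drop the PEFT wrappers.
    merged = model.merge_and_unload()

    # Write the merged model to disk in the standard Hugging Face format.
    merged.save_pretrained(output_dir)


# Example call (assumed paths, replace with your own):
# merge_lora_adapter("path/to/llama-33b-base", "path/to/landmark-adapter", "path/to/merged-output")
```

Merging only produces the weights. Running the result with landmark attention would still require inference code that understands landmark tokens, as noted in the Usage section.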