tinyllama-proteinpretrain-quinoa
Full-model fine-tuning of TinyLLaMA-1.1B on the "research" split (quinoa protein sequences) of the GreenBeing-Proteins dataset.
Notes: pretraining only on protein sequences leads the model to generate only protein sequences, and its output eventually degenerates into repeated residues such as VVVV or KKKK.
- This model may be replaced by one trained on a mix of bio/chem text and protein sequences.
- This model might need "biotokens" to represent the amino acids instead of using the existing tokenizer.
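
The repetition problem described in the notes above (output collapsing into VVVV or KKKK) can be checked on generated text with a few lines of Python. This is a minimal sketch, not part of the model's code: the function names and the threshold of four identical residues are assumptions made for illustration, and real proteins can contain short homopolymer runs, so the threshold may need tuning.

```python
import re

# Assumed threshold: 4 or more identical residues in a row counts as a
# degenerate run. Hypothetical value, chosen to match the VVVV/KKKK examples.
MIN_RUN = 4


def find_repeat_runs(sequence, min_run=MIN_RUN):
    """Return (residue, start_index, run_length) for each single-letter run."""
    # ([A-Z]) captures one uppercase letter; \1{n,} requires it to repeat
    # at least n more times, so the full match is at least min_run long.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_run - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(sequence)]


def truncate_at_first_run(sequence, min_run=MIN_RUN):
    """Cut the sequence just before the first degenerate run, if any."""
    runs = find_repeat_runs(sequence, min_run)
    return sequence[:runs[0][1]] if runs else sequence
```

A check like this could be applied to sampled generations to flag or trim degenerate output before further analysis.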
More details TBD
Repository: monsoon-nlp/tinyllama-proteinpretrain-quinoa