# Pico-OpenLAiNN-250M 🤗
Hey there, fellow researchers, developers, and AI enthusiasts! Today I'm releasing a new, slightly less *smol* open LLM. This model was trained on the full 32B tokens that the entire Pico-OpenLAiNN family is trained on.
You can find the GGUF quants of this model [here](https://huggingface.co./UUFO-Aigis/Pico-OpenLAiNN-250M-gguf).
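If you'd rather run a GGUF quant locally, here's a minimal sketch using `llama-cpp-python`; the quant filename pattern below is illustrative, so match it against the actual file names in the GGUF repo:

```python
# Minimal sketch: loading a GGUF quant with llama-cpp-python
# (pip install llama-cpp-python huggingface-hub).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="UUFO-Aigis/Pico-OpenLAiNN-250M-gguf",
    filename="*q8_0.gguf",  # illustrative pattern -- match it to a real file in the repo
)
out = llm("According to all known laws of aviation,", max_tokens=64)
print(out["choices"][0]["text"])
```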
## Models Overview
- **Pico-OpenLAiNN-100**: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are *extremely* limited.
- **Pico-OpenLAiNN-250**: The middle child of the Pico-OpenLAiNN family; it's still tiny at 250M parameters but more capable than the 100M parameter model.
- **Pico-OpenLAiNN-500**: My current heavyweight. At 500M parameters, this is the most capable of the Pico-OpenLAiNN models.
## Pretraining Details
This specific version of Pico-OpenLAiNN was trained on just 32B tokens of the FineWeb dataset.
## Other Information
- **Compatibility**: Built to be compatible with existing projects that use Llama 2's tokenizer and architecture (see the sketch after this list).
- **Ease of Use**: No need to reinvent the wheel. These models are ready to be plugged into your applications.
- **Open Source**: Fully open source, so you can tweak, tune, and twist them to your heart's content.
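Because these models reuse Llama 2's architecture and tokenizer, they should load through the Llama-specific `transformers` classes as well as the `Auto*` ones. A quick sanity-check sketch (the expected values assume the standard Llama 2 config and tokenizer):

```python
# Sanity check: the model loads via the Llama-specific classes,
# since it follows Llama 2's architecture and tokenizer.
from transformers import AutoTokenizer, LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("UUFO-Aigis/Pico-OpenLAiNN-250M")
tokenizer = AutoTokenizer.from_pretrained("UUFO-Aigis/Pico-OpenLAiNN-250M")
print(model.config.model_type)  # expected: "llama"
print(tokenizer.vocab_size)     # expected: 32000 if the stock Llama 2 tokenizer is used
```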
## Getting Started
To start using these models, you can simply load them via the Hugging Face `transformers` library:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "UUFO-Aigis/Pico-OpenLAiNN-250M"  # Replace 250M with 100M or 500M if you prefer those models.

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate_text(prompt, model, tokenizer, max_length=512, temperature=1.0, top_k=50, top_p=0.95):
    # Tokenize the prompt and sample a continuation.
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        do_sample=True,
    )
    # Decode the generated ids back into text, dropping special tokens.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

def main():
    # Define your prompt
    prompt = "According to all known laws of aviation, there is no way a bee should be able to fly."
    print(generate_text(prompt, model, tokenizer))

if __name__ == "__main__":
    main()
```
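The sampling knobs above control how adventurous the output is: lower `temperature` and a tighter `top_p` make generations more conservative, while higher values add diversity. For example:

```python
# More deterministic output: cooler temperature and a tighter nucleus (illustrative values).
print(generate_text("Once upon a time", model, tokenizer, max_length=128, temperature=0.7, top_p=0.9))
```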
## Benchy :3
| Task           |  Value |   | Stderr |
|----------------|-------:|---|-------:|
| arc_challenge  | 0.1988 | ± | 0.0117 |
| arc_easy       | 0.4503 | ± | 0.0102 |
| boolq          | 0.5907 | ± | 0.0086 |
| hellaswag      | 0.3215 | ± | 0.0047 |
| lambada_openai | 0.3280 | ± | 0.0065 |
| piqa           | 0.6594 | ± | 0.0111 |
| winogrande     | 0.5028 | ± | 0.0141 |
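If you want to run these benchmarks yourself, EleutherAI's `lm-evaluation-harness` (`pip install lm-eval`) should give comparable numbers; a sketch using its Python entry point, with exact settings such as few-shot count left at the defaults as an assumption:

```python
# Sketch: evaluating the model on the same task suite with lm-evaluation-harness.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=UUFO-Aigis/Pico-OpenLAiNN-250M",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "lambada_openai", "piqa", "winogrande"],
)
print(results["results"])
```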
## Future Plans
- **More Models**: I'm currently training the bigger siblings of these models, including a 1B-parameter version and beyond; 2-4B-parameter versions are planned. These will be released as OpenLAiNN.
- **New Architecture**: This is still up in the air and under active development; I'll release it if it turns out to be actually useful. It will likely be named FLaRE-LAiNN, so stay tuned.
- **Paper**: A detailed paper and the full source code will be made available for those interested in the details.
## Credit Where Credit's Due
If you find these models useful and decide to build on them, a link to this repository would be highly appreciated. I'm a one-man show running this. Thanks 🤗
## Contact
If you have questions, please reach out to me at [email protected]
<p align="center">
<img src="UUFO.png" alt="U.U.F.O Research Logo" width="250"/>
</p>