TinyLLM
Overview
This repository hosts a small language model developed as part of the TinyLLM framework ([arxiv link]). TinyLLM models are designed for embedded sensing applications and are trained with sensor data, so that a language model can be hosted locally on low-compute devices such as single-board computers. The models are based on the GPT-2 architecture and trained on NVIDIA H100 GPUs. This repo provides base models that can be further fine-tuned for specific downstream tasks related to embedded sensing.
Model Information
- Parameters: 124M (Hidden Size = 768)
- Architecture: Decoder-only transformer
- Training Data: Up to 10B tokens from the SHL and FineWeb datasets, combined in a 3:7 ratio
- Input and Output Modality: Text
- Context Length: 1024 tokens
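The 124M figure can be sanity-checked from the hidden size and context length listed above. The following is a minimal sketch, not the authors' code: it assumes the standard GPT-2 small configuration (12 layers, a 50,257-token vocabulary, an MLP width of 4x the hidden size, and tied input/output embeddings), none of which is stated in this card. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(hidden=768, context=1024, layers=12, vocab=50257):
    """Count parameters of a GPT-2 style decoder with tied embeddings.

    hidden and context come from the model card; layers and vocab are
    ASSUMED to match the standard GPT-2 small configuration.
    """
    token_emb = vocab * hidden            # token embedding (shared with the output head)
    pos_emb = context * hidden            # learned position embedding
    layer_norm = 2 * hidden               # scale and bias
    # Attention: fused QKV projection plus output projection, each with a bias.
    attn = (hidden * 3 * hidden + 3 * hidden) + (hidden * hidden + hidden)
    # MLP: expand to 4x hidden, then project back, each with a bias.
    mlp = (hidden * 4 * hidden + 4 * hidden) + (4 * hidden * hidden + hidden)
    block = 2 * layer_norm + attn + mlp   # two layer norms per transformer block
    return token_emb + pos_emb + layers * block + layer_norm  # plus final layer norm


if __name__ == "__main__":
    total = gpt2_param_count()
    print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the count comes to 124,439,808, which is consistent with the 124M listed above.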
Acknowledgements
We want to acknowledge the open-source frameworks llm.c and llama.cpp and the sensor dataset provided by SHL, which were instrumental in training and testing these models.
Usage
The model can be used in two primary ways:
With Hugging Face’s Transformers Library
from transformers import pipeline
import torch

path = "tinyllm/124M-0.3"
prompt = "The sea is blue but it's his red sea"

generator = pipeline(
    "text-generation",
    model=path,
    max_new_tokens=30,
    repetition_penalty=1.3,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
print(generator(prompt)[0]['generated_text'])
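Because the context length is 1024 tokens, long prompts (for example, serialized sensor readings) may need truncating so that the prompt and the generated tokens fit together. The sketch below shows one way to do this with the tokenizer and model loaded explicitly instead of through `pipeline`. It assumes the repository ships tokenizer files alongside the weights; the helper names `max_prompt_tokens` and `generate` are hypothetical and not part of TinyLLM.

```python
def max_prompt_tokens(context_length=1024, max_new_tokens=30):
    """Largest prompt length that still leaves room for the new tokens."""
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be smaller than the context length")
    return context_length - max_new_tokens


def generate(prompt, path="tinyllm/124M-0.3", max_new_tokens=30):
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16)

    # Truncate the prompt so that prompt + generated tokens fit in the context
    # window. By default the tokenizer drops tokens from the end of the prompt.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=max_prompt_tokens(max_new_tokens=max_new_tokens),
    )
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            repetition_penalty=1.3,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 style models have no pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("The sea is blue but it's his red sea"))
```

With the defaults used here, the prompt budget is 1024 minus 30 tokens; raising `max_new_tokens` shrinks that budget accordingly.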
With llama.cpp
Generate a GGUF model file with llama.cpp's convert_hf_to_gguf.py script, then use the generated GGUF file for inference.
python3 convert_hf_to_gguf.py models/mymodel/
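The conversion and a first inference run can be scripted together. The sketch below only builds the two command lines and prints them. It assumes it is run from a llama.cpp checkout, that the build provides a `llama-cli` binary on the PATH, and that an f16 output type is acceptable. The output file name `mymodel.gguf`, the prompt, and the helper `gguf_commands` are hypothetical.

```python
import shlex
import subprocess


def gguf_commands(model_dir="models/mymodel/", gguf_path="mymodel.gguf",
                  prompt="The sea is blue", n_predict=30):
    """Return the (convert, run) commands as argument lists."""
    # Convert the Hugging Face checkpoint directory into a single GGUF file.
    convert = ["python3", "convert_hf_to_gguf.py", model_dir,
               "--outfile", gguf_path, "--outtype", "f16"]
    # Run the converted model: -m model file, -p prompt, -n tokens to generate.
    run = ["llama-cli", "-m", gguf_path, "-p", prompt, "-n", str(n_predict)]
    return convert, run


def run_all(commands):
    """Execute each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in gguf_commands():
        print(shlex.join(cmd))
    # Call run_all(gguf_commands()) to execute the commands instead of printing them.
```

Building the commands as lists rather than shell strings avoids quoting problems when the prompt contains spaces or apostrophes.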
Disclaimer
This model is intended solely for research purposes.