TinyLlama-1.1B-Tele-it Model Card

Model Summary

TinyLlama-1.1B-Tele-it is the instruction-tuned version of TinyLlama-1.1B-Tele, a language model based on TinyLlama-1.1B and specialized in telecommunications. It was fine-tuned to follow instructions using supervised fine-tuning (SFT) on a combination of the Alpaca and Open-instruct datasets.

Context Length

The context length of the model is 2048 tokens.
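
Prompt tokens and generated tokens occupy the same window, so the limit bounds their sum. The helper below is a minimal sketch of that arithmetic in plain Python. The name `clamp_new_tokens` and its parameters are hypothetical (they are not part of the model or of the transformers library), and the prompt token count is assumed to come from the tokenizer.

```python
CONTEXT_LENGTH = 2048  # context length stated in this model card


def clamp_new_tokens(prompt_tokens: int, requested_new_tokens: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens fit in the window after the prompt."""
    if prompt_tokens < 0 or requested_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    remaining = context_length - prompt_tokens
    return min(requested_new_tokens, remaining)


# Hypothetical token counts, for illustration only.
print(clamp_new_tokens(48, 100))    # short prompt: the full request fits
print(clamp_new_tokens(2000, 100))  # long prompt: capped at 2048 - 2000
```

The returned value could be passed as `max_new_tokens` when generating, so that the prompt and the response together stay within the context length.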

Usage

TinyLlama-1.1B-Tele-it has been fine-tuned using pairs of instructions and responses from the Alpaca and Open-instruct datasets, separated by the "\n" delimiter. Below is an example of how to query the model using this format:

Prompt: Explain to me Shannon capacity.\n

Model: The Shannon capacity of a communication channel is the maximum amount of information that can be transmitted over the channel in a single transmission. It is a measure of the maximum amount of information that can be transmitted over a channel with a given noise level. The Shannon capacity is a fundamental limit on the amount of information that can be transmitted over a communication channel.
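
Since the model expects the instruction to end with the "\n" delimiter, it can help to normalize user input before querying. The following is a minimal sketch under that assumption; `format_prompt` and `DELIMITER` are hypothetical names introduced for illustration, not part of the released code.

```python
DELIMITER = "\n"  # delimiter named in this model card


def format_prompt(instruction: str) -> str:
    """Return the instruction terminated by exactly one delimiter."""
    # Drop any trailing whitespace or newlines, then append a single delimiter.
    return instruction.rstrip() + DELIMITER


prompt = format_prompt("Explain to me Shannon capacity.")
print(repr(prompt))
```

Stripping trailing whitespace first keeps the helper idempotent: an instruction that already ends with the delimiter is not given a second one.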

Sample Code

The snippets below show how to get started with running the model. First, install the library with pip install transformers, then copy the snippet that matches your hardware and adapt it to your use case.

Running the model on a CPU

from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("AliMaatouk/TinyLlama-1.1B-Tele-it", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/TinyLlama-1.1B-Tele-it")

prompt = "Explain to me Shannon capacity.\n"
input_ids = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**input_ids, max_new_tokens=100)

# generate() returns the prompt tokens followed by the new tokens; keep only the new ones
generated_tokens = outputs[0, len(input_ids['input_ids'][0]):]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)

Running the model on a single GPU or multiple GPUs

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AliMaatouk/TinyLlama-1.1B-Tele-it", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/TinyLlama-1.1B-Tele-it")

prompt = "Explain to me Shannon capacity.\n"
input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**input_ids, max_new_tokens=100)

# generate() returns the prompt tokens followed by the new tokens; keep only the new ones
generated_tokens = outputs[0, len(input_ids['input_ids'][0]):]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)

Citation

You can find the paper with all details about the model at https://arxiv.org/abs/2409.05314. Please cite it as follows:

@misc{maatouk2024telellmsseriesspecializedlarge,
      title={Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications}, 
      author={Ali Maatouk and Kenny Chirino Ampudia and Rex Ying and Leandros Tassiulas},
      year={2024},
      eprint={2409.05314},
      archivePrefix={arXiv},
      primaryClass={cs.IT},
      url={https://arxiv.org/abs/2409.05314}, 
}
