Gemma-2B-Tele Model Card
Model Summary
Gemma-2B-Tele is a Transformer language model with 2 billion parameters, specialized in telecommunications. It is based on Google's gemma-2b and was continually pretrained on Tele-Data, a large-scale dataset of approximately 2.5 billion tokens of telecommunications material, including articles, standards, and general web content related to the telecommunications domain.
On telecommunications benchmarks such as Tele-Eval, Gemma-2B-Tele outperforms gemma-2b by several percentage points. It also matches gemma-2b on benchmarks covering common sense, language understanding, and logical reasoning. The domain adaptation therefore comes at minimal cost to the general capabilities of the original model.
Context Length
The model was trained with a context length of 8192 tokens.
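As a minimal sketch of how the 8192-token limit might be respected in practice, the helper below trims a list of token IDs so that the prompt plus the requested number of new tokens fits in the window. The function name `fit_to_context` and the choice to keep the most recent tokens are illustrative assumptions, not part of the model or the transformers library.

```python
MAX_CONTEXT_TOKENS = 8192  # context length stated in this model card


def fit_to_context(token_ids, max_new_tokens, max_context=MAX_CONTEXT_TOKENS):
    """Trim token_ids so that prompt + generated tokens fit in the context window.

    Hypothetical helper: keeps the tail of the prompt, on the assumption that
    the most recent tokens matter most for text completion.
    """
    if max_new_tokens >= max_context:
        raise ValueError("max_new_tokens must be smaller than the context length")
    budget = max_context - max_new_tokens  # tokens left for the prompt
    if len(token_ids) <= budget:
        return token_ids
    return token_ids[-budget:]


# Example: a 10,000-token prompt with room reserved for 100 new tokens
trimmed = fit_to_context(list(range(10_000)), max_new_tokens=100)
print(len(trimmed))
```

Trimming from the left also drops any beginning-of-sequence token the tokenizer placed at the start, so it may need to be prepended again before generation.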
Usage
Gemma-2B-Tele is a base model best suited for fine-tuning on applications related to telecommunications. It has not been fine-tuned to follow instructions and operates solely within a text completion framework. An example of this completion can be found below:
Prompt: Shannon capacity is
Model: the maximum rate at which information can be reliably transmitted over a communication channel. It is named after Claude Shannon, who introduced the concept in his 1948 paper "A Mathematical Theory of Communication".
The instruction-tuned version of this model is available as Gemma-2B-Tele-it.
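Because the base model only continues text, a task is usually framed as a document for it to complete rather than as an instruction. The sketch below builds a few-shot prompt that ends with an open answer slot. The `Question:`/`Answer:` layout and the helper name `build_completion_prompt` are assumptions chosen for illustration; the model card does not prescribe a prompt format.

```python
def build_completion_prompt(examples, query):
    """Format (question, answer) pairs, then leave the final answer open.

    Hypothetical helper: the base model is expected to continue the text
    after the last "Answer:" marker.
    """
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    parts.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(parts)


prompt = build_completion_prompt(
    [("What does SNR stand for?", "Signal-to-noise ratio.")],
    "What does BER stand for?",
)
print(prompt)
```

The resulting string can be passed as `prompt` to the generation snippets in the Sample Code section.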
Sample Code
The snippets below show how to get started quickly with running the model. First, run pip install transformers, then copy the snippet corresponding to your hardware and adapt it to your use case.
Running the model on a CPU
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model and tokenizer; torch_dtype="auto" uses the dtype stored in the checkpoint
model = AutoModelForCausalLM.from_pretrained("AliMaatouk/Gemma-2B-Tele", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Gemma-2B-Tele")
prompt = "Shannon capacity is"
# The tokenizer returns a dict-like object holding input_ids and attention_mask
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
# Drop the prompt tokens so that only the completion is decoded
generated_tokens = outputs[0, inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)
Running the model on a single / multi GPU
from transformers import AutoModelForCausalLM, AutoTokenizer
# device_map="auto" places the weights on the available GPU(s); it requires the accelerate package
model = AutoModelForCausalLM.from_pretrained("AliMaatouk/Gemma-2B-Tele", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Gemma-2B-Tele")
prompt = "Shannon capacity is"
# Move the inputs to the device that holds the model's first layers
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# Drop the prompt tokens so that only the completion is decoded
generated_tokens = outputs[0, inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)
Citation
You can find the paper with all details about the model at https://arxiv.org/abs/2409.05314. Please cite it as follows:
@misc{maatouk2024telellmsseriesspecializedlarge,
title={Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications},
author={Ali Maatouk and Kenny Chirino Ampudia and Rex Ying and Leandros Tassiulas},
year={2024},
eprint={2409.05314},
archivePrefix={arXiv},
primaryClass={cs.IT},
url={https://arxiv.org/abs/2409.05314},
}