---
license: mit
datasets:
- allenai/c4
language:
- en
library_name: transformers
pipeline_tag: text-generation
base_model:
- anto18671/lumenspark
---
# Linformer-based Language Model
Efficient language modeling optimized for long sequences using the Linformer architecture. This model reduces memory and computational overhead, making it ideal for various text generation tasks.
## Table of Contents
- [Introduction](#introduction)
- [Architecture](#architecture)
- [Installation](#installation)
- [Training Progress](#training-progress)
- [Quick Start](#quick-start)
- [Inference Parameters](#inference-parameters)
- [Hyperparameters](#hyperparameters)
- [Acknowledgements](#acknowledgements)
- [Sponsorship](#sponsorship)
- [License](#license)
## Introduction
The **Linformer-based Language Model** leverages the Linformer architecture to efficiently handle long sequences in text generation and other language tasks. By optimizing the self-attention mechanism, this model maintains high performance while reducing resource consumption, making it suitable for applications like text completion and generation.
## Architecture
Built upon the **Linformer Transformer**, the model incorporates several key innovations:
1. **Efficient Attention**: Reduces self-attention complexity from quadratic to linear by projecting the attention matrix into a lower-dimensional space.
2. **Low-Rank Linear Projections**: Utilizes LowRankLinear layers to decrease dimensionality without compromising expressiveness.
3. **Self-Attention Mechanism**: Implements multi-head self-attention with full expressivity by avoiding low-rank projections in this module.
4. **Factorized Feed-Forward Layers**: Uses factorized LowRankLinear layers in the Feed-Forward Neural Network to maintain performance with fewer parameters.
5. **PreNorm with LayerNorm and LayerScale**: Applies Layer Normalization before attention and feed-forward layers, enhanced with LayerScale for better gradient flow and stability.
6. **Dropout & Residual Connections**: Incorporates dropout for regularization and residual connections to aid in gradient flow and prevent vanishing gradients.
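Items 1–2 above boil down to projecting the length-`n` key and value sequences down to `k` rows before attention, so the score matrix is `n × k` instead of `n × n`. The following is a minimal single-head NumPy sketch of that idea, not Lumenspark's actual internals; the projection matrices `E` and `F` and all shapes are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(q, k, v, E, F):
    """Single-head Linformer-style attention.

    q, k, v: (n, d) query/key/value matrices for a sequence of length n.
    E, F:    (k_proj, n) learned projections that compress the sequence axis.
    The score matrix is (n, k_proj), so cost grows linearly in n.
    """
    k_compressed = E @ k                      # (k_proj, d)
    v_compressed = F @ v                      # (k_proj, d)
    d = q.shape[-1]
    scores = q @ k_compressed.T / np.sqrt(d)  # (n, k_proj), not (n, n)
    return softmax(scores) @ v_compressed     # (n, d)

# Toy demo: sequence of 32 tokens compressed to 16 attention slots.
rng = np.random.default_rng(0)
n, d, k_proj = 32, 8, 16
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
E, F = (rng.standard_normal((k_proj, n)) for _ in range(2))
out = linformer_attention(q, k, v, E, F)
print(out.shape)  # (32, 8) — same output shape as full attention
```

With the card's hyperparameters (`seq_length` 768, `k` 384), the same scheme halves the attended-over sequence axis.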
## Installation
Install the `lumenspark` package via pip:
```bash
pip install lumenspark
```
This command installs the Linformer-based language model along with all necessary dependencies.
## Training Progress
The training loss plot below shows progress over the course of training:
![Training Loss Plot](assets/training_loss_plot.png)
## Quick Start
Load the pre-trained model from Hugging Face and generate text:
```python
from lumenspark import LumensparkModel
import torch
# 1. Set up the device (GPU if available, else CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# 2. Load the model and move it to the device
model = LumensparkModel.from_pretrained("anto18671/lumenspark").to(device)
# 3. Example input text
input_text = "Once upon a time"
# 4. Generate text
output_text = model.generate(
    input_text,
    max_length=100,          # maximum length of the generated sequence
    temperature=0.7,         # controls randomness in predictions
    top_k=50,                # top-k sampling to filter high-probability tokens
    top_p=0.9,               # nucleus sampling to control diversity
    repetition_penalty=1.2   # penalize repetition
)
# 5. Print the generated text
print(output_text)
```
## Inference Parameters
Customize text generation using the following parameters:
- **`max_length`**: Maximum length of the generated sequence.
- **`temperature`**: Controls randomness (lower = more deterministic).
- **`top_k`**: Limits sampling to top `k` tokens.
- **`top_p`**: Nucleus sampling based on cumulative probability `p`.
- **`repetition_penalty`**: Penalizes repeated tokens or phrases.
- **`no_repeat_ngram_size`**: Prevents repeated n-grams of specified size.
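To make `top_k` and `top_p` concrete, here is a small pure-Python sketch of how the two filters interact before sampling. This mirrors the standard top-k/nucleus scheme, not Lumenspark's exact implementation; the function name and ordering of the two filters are illustrative:

```python
import math

def top_k_top_p_filter(logits, top_k=50, top_p=0.9):
    """Return renormalized sampling probabilities after applying
    top-k filtering followed by nucleus (top-p) filtering."""
    # Softmax over the raw logits.
    m = max(logits)
    probs = [math.exp(x - m) for x in logits]
    total = sum(probs)
    probs = [p / total for p in probs]

    # Rank tokens by probability; keep at most top_k of them.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]

    # Nucleus: keep the smallest prefix whose cumulative mass >= top_p.
    nucleus, cumulative = set(), 0.0
    for i in order:
        nucleus.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Zero out everything else and renormalize.
    filtered = [p if i in nucleus else 0.0 for i, p in enumerate(probs)]
    z = sum(filtered)
    return [p / z for p in filtered]

# A peaked distribution: with top_k=2 and top_p=0.9, only the two
# most likely tokens survive, and their probabilities are rescaled.
print(top_k_top_p_filter([3.0, 1.0, 0.2, -1.0], top_k=2, top_p=0.9))
```

Lower `top_p` and `top_k` make generation more conservative; raising `temperature` flattens the logits before this filtering and increases diversity.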
## Hyperparameters
Optimized for performance and efficiency:
- **`vocab_size`**: 50,257
- **`embed_dim`**: 768
- **`depth`**: 8 layers
- **`heads`**: 8 attention heads
- **`seq_length`**: 768 tokens
- **`dropout`**: 1/17 (≈ 0.059)
- **`k`**: 384 (attention projection)
- **`rank`**: 256 (low-rank projections)
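The `rank` value drives the savings from the factorized feed-forward layers (item 4 in the architecture list). A quick back-of-the-envelope check, assuming the conventional 4× feed-forward expansion (the card does not state Lumenspark's actual inner dimension):

```python
embed_dim = 768
rank = 256
ffn_dim = 4 * embed_dim  # assumed 4x expansion; not stated in the card

# A full dense projection versus a rank-256 factorization
# W ≈ A @ B with A: (embed_dim, rank), B: (rank, ffn_dim).
full_params = embed_dim * ffn_dim                     # 2,359,296
factored_params = embed_dim * rank + rank * ffn_dim   # 983,040

print(full_params, factored_params, factored_params / full_params)
```

Under that assumption the factorization keeps about 42% of the weights of a dense layer (roughly 58% fewer parameters) per projection, before accounting for the attention-side savings from `k = 384`.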
## Acknowledgements
We would like to extend our gratitude to [RunPod](https://www.runpod.io) for their generous sponsorship, supporting the training and development of Lumenspark. Their contribution has been instrumental in pushing the project forward.
![RunPod Logo](assets/RunPod.webp)
## Sponsorship
Support the ongoing development of Lumenspark!
### How to Sponsor
Visit [GitHub Sponsors](https://github.com/sponsors/anto18671) and choose a sponsorship tier that suits you. Thank you for your support!
## License
This project is licensed under the [MIT License](LICENSE).