---
language: en
tags:
- transformers
- text-classification
- taxonomy
license: other
license_name: link-attribution
license_link: https://dejanmarketing.com/link-attribution/
model_name: Taxonomy Classifier
pipeline_tag: text-classification
base_model: albert-base-v2
---
# Taxonomy Classifier
This model is a hierarchical text classifier designed to categorize text into a 7-level taxonomy. It utilizes a chain of models, where the prediction at each level informs the prediction at the subsequent level. This approach reduces the classification space at each step.
## Model Details
- **Model Developers:** [DEJAN.AI](https://dejan.ai/)
- **Model Type:** Hierarchical Text Classification
- **Base Model:** [`albert/albert-base-v2`](https://huggingface.co./albert/albert-base-v2)
- **Taxonomy Structure:**
| Level | Unique Classes |
|---|---|
| 1 | 21 |
| 2 | 193 |
| 3 | 1350 |
| 4 | 2205 |
| 5 | 1387 |
| 6 | 399 |
| 7 | 50 |
- **Model Architecture:**
- **Level 1:** Standard sequence classification using `AlbertForSequenceClassification`.
- **Levels 2-7:** Custom architecture (`TaxonomyClassifier`) where the ALBERT pooled output is concatenated with a one-hot encoded representation of the predicted ID from the previous level before being fed into a linear classification layer (a minimal sketch appears after this list).
- **Language(s):** English
- **Library:** [Transformers](https://huggingface.co./docs/transformers/index)
- **License:** [link-attribution](https://dejanmarketing.com/link-attribution/)
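The snippet below is a minimal sketch of the Levels 2-7 head described above, assuming a standard PyTorch/Transformers setup; the layer names, class sizes, and exact wiring in the released checkpoints may differ.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel

class TaxonomyClassifier(nn.Module):
    """Sketch of a per-level head: ALBERT pooled output + one-hot parent ID -> linear layer."""

    def __init__(self, num_parent_classes: int, num_classes: int,
                 base_model_name: str = "albert-base-v2"):
        super().__init__()
        self.albert = AlbertModel.from_pretrained(base_model_name)
        self.num_parent_classes = num_parent_classes
        hidden_size = self.albert.config.hidden_size  # 768 for albert-base-v2
        # Linear classifier over the concatenated [pooled output ; one-hot parent ID] features
        self.classifier = nn.Linear(hidden_size + num_parent_classes, num_classes)

    def forward(self, input_ids, attention_mask, parent_ids):
        pooled = self.albert(input_ids=input_ids,
                             attention_mask=attention_mask).pooler_output
        # One-hot encode the previous level's predicted class ID
        parent_one_hot = nn.functional.one_hot(
            parent_ids, num_classes=self.num_parent_classes).float()
        features = torch.cat([pooled, parent_one_hot], dim=-1)
        return self.classifier(features)  # raw logits over this level's classes
```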
## Uses
### Direct Use
The model is intended for categorizing text into a predefined 7-level taxonomy.
### Downstream Uses
Potential applications include:
- Automated content tagging
- Product categorization
- Information organization
### Out-of-Scope Use
Performance is not guaranteed on text outside the domain of the training data, or when classifying into taxonomies with a different structure.
## Limitations
- Performance is dependent on the quality and coverage of the training data.
- Errors in earlier levels of the hierarchy can propagate to subsequent levels.
- The model's performance on unseen categories is limited.
- The model may exhibit biases present in the training data.
- The reliance on one-hot encoding for parent IDs produces high-dimensional input features at deeper levels, which can affect training efficiency and performance (observed most clearly at Level 4); a back-of-the-envelope calculation follows this list.
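To make the dimensionality concern concrete, the snippet below estimates the classifier input width per level, assuming the one-hot parent vector spans the previous level's class count (see the taxonomy table above) and the 768-dimensional pooled output of `albert-base-v2`.

```python
# Estimated classifier input width per level: pooled ALBERT output + one-hot parent ID.
classes_per_level = {1: 21, 2: 193, 3: 1350, 4: 2205, 5: 1387, 6: 399, 7: 50}
hidden_size = 768  # pooled output size of albert-base-v2

for level in range(2, 8):
    parent_dim = classes_per_level[level - 1]  # one-hot over the previous level's classes
    print(f"Level {level}: {hidden_size} + {parent_dim} = {hidden_size + parent_dim} input features")
```

The jump from a 193-dimensional parent vector at Level 3 to a 1350-dimensional one at Level 4 is where the slower convergence noted in the Evaluation section was observed.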
## Training Data
The model was trained on a dataset of 374,521 samples. Each row in the training data represents a full taxonomy path from the root level to a leaf node.
## Training Procedure
- **Levels:** Seven separate models were trained, one for each level of the taxonomy.
- **Level 1 Training:** Trained as a standard sequence classification task.
- **Levels 2-7 Training:** Trained with a custom architecture incorporating the predicted parent ID.
- **Input Format:**
- **Level 1:** Text response.
- **Levels 2-7:** Text response concatenated with a one-hot encoded vector of the predicted ID from the previous level (see the training sketch after this list).
- **Objective Function:** CrossEntropyLoss
- **Optimizer:** AdamW
- **Learning Rate:** Initially 5e-5, adjusted to 1e-5 for Level 4.
- **Training Hyperparameters:**
- **Epochs:** 10
- **Validation Split:** 0.1
- **Validation Frequency:** Every 1000 steps
- **Batch Size:** 38
- **Max Sequence Length:** 512
- **Early Stopping Patience:** 3
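The following sketch shows a single-level training step with the objective, optimizer, and learning rate listed above; `model` (either `AlbertForSequenceClassification` for Level 1 or the custom head for Levels 2-7) and `train_loader` are assumed to be constructed elsewhere, and the actual training script may differ.

```python
import torch
from torch.optim import AdamW

criterion = torch.nn.CrossEntropyLoss()
optimizer = AdamW(model.parameters(), lr=5e-5)  # 1e-5 was used for Level 4

for epoch in range(10):
    for batch in train_loader:  # batch size 38, sequences truncated to 512 tokens
        optimizer.zero_grad()
        # For Levels 2-7 the forward pass also takes the previous level's predicted IDs;
        # Level 1 uses the plain AlbertForSequenceClassification forward pass instead.
        logits = model(input_ids=batch["input_ids"],
                       attention_mask=batch["attention_mask"],
                       parent_ids=batch["parent_ids"])
        loss = criterion(logits, batch["labels"])
        loss.backward()
        optimizer.step()
```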
## Evaluation
Validation loss was used as the primary evaluation metric during training. The following validation loss trends were observed:
- **Levels 1, 2, and 3:** Showed a relatively rapid decrease in validation loss during training.
- **Level 4:** Exhibited a slower decrease in validation loss, potentially due to the significant increase in the dimensionality of the parent ID one-hot encoding and the larger number of unique classes at this level.
Further evaluation on downstream tasks is recommended to assess the model's practical performance.
## How to Use
Inference can be performed using the provided Streamlit application.
1. **Input Text:** Enter the text you want to classify.
2. **Select Checkpoints:** Choose the desired checkpoint for each level's model. Checkpoints are saved in the respective `level{n}` directories (e.g., `level1/model` or `level4/level4_step31000`).
3. **Run Inference:** Click the "Run Inference" button.
The application will output the predicted ID and the corresponding text description for each level of the taxonomy, based on the provided `mapping.csv` file.
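For programmatic use outside the Streamlit app, the chained inference roughly follows the sketch below. The tokenizer choice, the `load_taxonomy_classifier` helper, and the checkpoint paths are illustrative assumptions rather than the exact code shipped with this repository.

```python
import torch
from transformers import AlbertTokenizerFast, AlbertForSequenceClassification

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")  # assumed tokenizer
text = "Wireless noise-cancelling over-ear headphones"
inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")

# Level 1: plain sequence classification
level1 = AlbertForSequenceClassification.from_pretrained("level1/model")
with torch.no_grad():
    parent_id = level1(**inputs).logits.argmax(dim=-1)

# Levels 2-7: each model also receives the previous level's predicted ID
predicted_ids = [parent_id.item()]
for level in range(2, 8):
    model = load_taxonomy_classifier(f"level{level}")  # hypothetical loader for the custom head
    with torch.no_grad():
        logits = model(input_ids=inputs["input_ids"],
                       attention_mask=inputs["attention_mask"],
                       parent_ids=parent_id)
    parent_id = logits.argmax(dim=-1)
    predicted_ids.append(parent_id.item())
```

The final `predicted_ids` list holds one class ID per level, which can then be looked up in `mapping.csv` for human-readable labels.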
## Visualizations
### Level 1: Training Loss
![Level 1 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-1-train-loss.png)
This graph shows the training loss over the steps for Level 1, demonstrating a significant drop in loss during the initial training period.
### Level 1: Validation Loss
![Level 1 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-1-val-loss.png)
This graph illustrates the validation loss progression over training steps for Level 1, showing steady improvement.
### Level 2: Training Loss
![Level 2 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-2-train-loss.png)
Here we see the training loss for Level 2, which also shows a significant decrease early in training.
### Level 2: Validation Loss
![Level 2 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-2-val-loss.png)
The validation loss for Level 2 shows consistent reduction as training progresses.
### Level 3: Training Loss
![Level 3 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-3-train-loss.png)
This graph displays the training loss for Level 3, where training stabilizes after an initial drop.
### Level 3: Validation Loss
![Level 3 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-3-val-loss.png)
The validation loss for Level 3 demonstrates steady improvement as the model converges.
### Level 4: Training Loss
![Level 4 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-4-train-loss.png)
The training loss for Level 4 is plotted here, showing the slower convergence associated with the high-dimensional input features at this level.
![Level 4 Train Loss / Epoch](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-4-val-loss-epochs.png)
| Epoch | Average Training Loss |
|-------|------------------------|
| 1 | 5.2803 |
| 2 | 2.8285 |
| 3 | 1.5707 |
| 4 | 0.8696 |
| 5 | 0.5164 |
| 6 | 0.3384 |
| 7 | 0.2408 |
| 8 | 0.1813 |
| 9 | 0.1426 |
### Level 4: Validation Loss
![Level 4 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-4-val-loss.png)
The validation loss for Level 4 stabilizes after a longer training period than at the earlier levels.
### Level 5: Training and Validation Loss
![Level 5 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-5-train-loss.png)
Level 5 training loss.
![Level 5 Training Loss per Epoch](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-5-val-loss-epochs.png)
Average training loss / epoch.
| Epoch | Average Training Loss |
|-------|-----------------------|
| 1 | 5.9700 |
| 2 | 3.9396 |
| 3 | 2.5609 |
| 4 | 1.6004 |
| 5 | 1.0196 |
| 6 | 0.6372 |
| 7 | 0.4410 |
| 8 | 0.3169 |
| 9 | 0.2389 |
| 10 | 0.1895 |
| 11 | 0.1635 |
| 12 | 0.1232 |
| 13 | 0.1075 |
| 14 | 0.0939 |
| 15 | 0.0792 |
| 16 | 0.0632 |
| 17 | 0.0549 |
![Level 5 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-5-val-loss.png)
Level 5 validation loss.
### Level 6: Training and Validation Loss
![Level 6 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-6-train-loss.png)
![Level 6 Training Loss / Epoch](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-6-val-loss-epochs.png)
| **Epoch** | **Average Training Loss** |
|-----------|----------------------------|
| 1 | 5.5855 |
| 2 | 4.1836 |
| 3 | 3.0299 |
| 4 | 2.1331 |
| 5 | 1.4587 |
| 6 | 0.9847 |
| 7 | 0.6774 |
| 8 | 0.4990 |
| 9 | 0.3637 |
| 10 | 0.2688 |
| 11 | 0.2121 |
| 12 | 0.1697 |
| 13 | 0.1457 |
| 14 | 0.1139 |
| 15 | 0.1186 |
| 16 | 0.0753 |
| 17 | 0.0612 |
| 18 | 0.0676 |
| 19 | 0.0527 |
| 20 | 0.0399 |
| 21 | 0.0342 |
| 22 | 0.0304 |
| 23 | 0.0421 |
| 24 | 0.0280 |
| 25 | 0.0211 |
| 26 | 0.0189 |
| 27 | 0.0207 |
| 28 | 0.0337 |
| 29 | 0.0194 |
![Level 6 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-6-val-loss.png)
### Level 7: Training and Validation Loss
![Level 7 Train Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-7-train-loss.png)
![Level 7 Validation Loss / Epoch](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-7-val-loss-epochs.png)
| **Epoch** | **Average Training Loss** |
|-----------|----------------------------|
| 1 | 3.8413 |
| 2 | 3.5653 |
| 3 | 3.1193 |
| 4 | 2.5189 |
| 5 | 1.9640 |
| 6 | 1.4992 |
| 7 | 1.1322 |
| 8 | 0.8627 |
| 9 | 0.6674 |
| 10 | 0.5232 |
| 11 | 0.4235 |
| 12 | 0.3473 |
| 13 | 0.2918 |
| 14 | 0.2501 |
| 15 | 0.2166 |
![Level 7 Validation Loss](https://huggingface.co./dejanseo/ecommerce-taxonomy-classifier/resolve/main/training/metrics/level-7-val-loss.png)