---
license: mit
language:
- zh
metrics:
- accuracy
- f1 (macro)
- f1 (micro)
base_model:
- google-bert/bert-base-chinese
pipeline_tag: text-classification
tags:
- Multi-label Text Classification
datasets:
- scfengv/TVL-general-layer-dataset
library_name: adapter-transformers
model-index:
- name: scfengv/TVL_GeneralLayerClassifier
  results:
  - task:
      type: text-classification
      name: Multi-label Text Classification
    dataset:
      name: scfengv/TVL-general-layer-dataset
      type: scfengv/TVL-general-layer-dataset
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.952902
    - name: F1 score (Micro)
      type: f1
      value: 0.968717
    - name: F1 score (Macro)
      type: f1
      value: 0.970818
---

# Model Details of TVL_GeneralLayerClassifier

## Base Model

This model is fine-tuned from [google-bert/bert-base-chinese](https://huggingface.co./google-bert/bert-base-chinese).

## Model Architecture

- **Type**: BERT-based text classification model
- **Hidden Size**: 768
- **Number of Layers**: 12
- **Number of Attention Heads**: 12
- **Intermediate Size**: 3072
- **Max Sequence Length**: 512
- **Vocabulary Size**: 21,128

## Key Components

1. **Embeddings**
   - Word Embeddings
   - Position Embeddings
   - Token Type Embeddings
   - Layer Normalization
2. **Encoder**
   - 12 layers of:
     - Self-Attention Mechanism
     - Intermediate Dense Layer
     - Output Dense Layer
     - Layer Normalization
3. **Pooler**
   - Dense layer for sentence representation
4. **Classifier**
   - Output layer with 4 classes

## Training Hyperparameters

The model was trained using the following hyperparameters:

```
Learning rate: 1e-05
Batch size: 32
Number of epochs: 10
Optimizer: Adam
Loss function: torch.nn.BCEWithLogitsLoss()
```

## Training Infrastructure

- **Hardware Type:** NVIDIA Quadro RTX 8000
- **Library:** PyTorch
- **Hours used:** 2 hr 56 min

## Model Parameters

- Total parameters: ~102M (estimated)
- All parameters are in 32-bit floating point (F32) format

## Input Processing

- Uses BERT tokenization
- Supports sequences up to 512 tokens

## Output

- 4-class multi-label classification

## Performance Metrics

Evaluated on the validation split:

- Accuracy: 0.952902
- F1 score (Micro): 0.968717
- F1 score (Macro): 0.970818

## Training Dataset

This model was trained on the train split of [scfengv/TVL-general-layer-dataset](https://huggingface.co./datasets/scfengv/TVL-general-layer-dataset).

## Testing Dataset

- [scfengv/TVL-general-layer-dataset](https://huggingface.co./datasets/scfengv/TVL-general-layer-dataset) - validation
  - Remove Emoji
  - Emoji2Desc
  - Remove Punctuation

## Usage

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
tokenizer = BertTokenizer.from_pretrained("scfengv/TVL_GeneralLayerClassifier")

# Prepare your text
text = "Your text here"  # Please refer to Dataset
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Print predictions
print(predictions)
```

## Additional Notes

- This model is specifically designed for TVL general layer classification tasks.
- It is based on a Chinese BERT model and is therefore optimized for Chinese text.
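The usage example above prints raw per-class probabilities. Below is a minimal sketch of one way to turn them into discrete labels; the 0.5 threshold and the `label_names` placeholders are assumptions (this card does not list the dataset's label names or a recommended cut-off), so adjust both to match scfengv/TVL-general-layer-dataset.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
tokenizer = BertTokenizer.from_pretrained("scfengv/TVL_GeneralLayerClassifier")

# Hypothetical label names; replace with the label order actually used in
# scfengv/TVL-general-layer-dataset (the card does not specify it).
label_names = ["label_0", "label_1", "label_2", "label_3"]
threshold = 0.5  # assumed decision threshold; tune on the validation split

texts = ["範例留言一", "範例留言二"]  # placeholder Chinese inputs
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)

with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)  # shape: (batch_size, 4)

# Each class is decided independently, so a text may receive several labels.
for text, row in zip(texts, (probs > threshold).int().tolist()):
    predicted = [name for name, flag in zip(label_names, row) if flag]
    print(text, "->", predicted)
```

Because the model was trained with `BCEWithLogitsLoss`, each logit is scored independently through a sigmoid, so any subset of the four classes (including none) can be active for a given text.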
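For reference, metrics of the kind reported above can be computed with scikit-learn's multi-label metrics. This is only a sketch under stated assumptions: the original evaluation script is not included in this card, and the card does not state whether "Accuracy" means exact-match (subset) accuracy, which is what `accuracy_score` computes for multi-label inputs.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy stand-ins: gold labels and thresholded model outputs for the
# validation split, as (num_examples, 4) binary indicator arrays.
y_true = np.array([[1, 0, 0, 1],
                   [0, 1, 0, 0],
                   [1, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 1],
                   [0, 1, 1, 0],
                   [1, 1, 0, 0]])

print("Accuracy (exact match):", accuracy_score(y_true, y_pred))
print("F1 (micro):", f1_score(y_true, y_pred, average="micro"))
print("F1 (macro):", f1_score(y_true, y_pred, average="macro"))
```

Micro-F1 pools true/false positives across all four classes, while macro-F1 averages the per-class F1 scores, so the two diverge when class frequencies are imbalanced.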