MoritzLaurer/deberta-v3-large-zeroshot-v2.0-28heldout

Text Classification


MoritzLaurer (HF staff) committed on Apr 3

Commit e5eb304 (1 parent: 1487beb)

Update README.md

Files changed (1)

README.md +8 -60

@@ -6,66 +6,14 @@ base_model: microsoft/deberta-v3-large
 metrics:
 - accuracy
 model-index:
-- name: deberta-v3-large-zeroshot-v2.0-2024-03-27-14-14
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# deberta-v3-large-zeroshot-v2.0-2024-03-27-14-14
-This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.1361
-- F1 Macro: 0.5017
-- F1 Micro: 0.5316
-- Accuracy Balanced: 0.5535
-- Accuracy: 0.5316
-- Precision Macro: 0.6440
-- Recall Macro: 0.5535
-- Precision Micro: 0.5316
-- Recall Micro: 0.5316
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 9e-06
-- train_batch_size: 4
-- eval_batch_size: 32
-- seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 32
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_ratio: 0.06
-- num_epochs: 2
-### Training results
-| Training Loss | Epoch | Step  | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
-|:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-----------------:|:--------:|:---------------:|:------------:|:---------------:|:------------:|
-| 0.2087        | 1.0   | 40258 | 0.3319          | 0.8637   | 0.8789   | 0.8552            | 0.8789   | 0.8752          | 0.8552       | 0.8789          | 0.8789       |
-| 0.1369        | 2.0   | 80516 | 0.3447          | 0.8742   | 0.8858   | 0.8729            | 0.8858   | 0.8755          | 0.8729       | 0.8858          | 0.8858       |
-### Framework versions
-- Transformers 4.37.2
-- Pytorch 2.2.1+cu121
-- Datasets 2.17.1
-- Tokenizers 0.15.2

 metrics:
 - accuracy
 model-index:
+- name: deberta-v3-large-zeroshot-v2.0-28heldout
   results: []
 ---
+This model exists mostly for research purposes.
+It is essentially the same as [MoritzLaurer/deberta-v3-large-zeroshot-v2.0](https://huggingface.co/MoritzLaurer/deberta-v3-large-zeroshot-v2.0),
+except that the training data from the 28 datasets/tasks used for evaluating the model was excluded.
+Holding out this training data makes it possible to report true zero-shot metrics on those 28 datasets/tasks.
+For most practical purposes, `MoritzLaurer/deberta-v3-large-zeroshot-v2.0`
+will be more useful, as it has seen data from 28 additional tasks and will perform better on most tasks.
+Note that `MoritzLaurer/deberta-v3-large-zeroshot-v2.0` has only seen training data for these 28 tasks, not test data.
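
Reviewer note (not part of the commit): the held-out setup described in the new README can be pictured with the minimal sketch below. It is not the author's training code. The function `hold_out_eval_tasks`, the `task` field, and the toy task names are all hypothetical, chosen only to illustrate dropping training examples that belong to tasks reserved for evaluation.

```python
def hold_out_eval_tasks(train_data, eval_tasks):
    """Return the training examples whose task is NOT used for evaluation.

    Hypothetical helper: assumes each example is a dict with a "task" key.
    Also returns how many examples were dropped.
    """
    eval_tasks = set(eval_tasks)
    kept = [ex for ex in train_data if ex["task"] not in eval_tasks]
    dropped = len(train_data) - len(kept)
    return kept, dropped


# Toy data with made-up task names (assumption, not the real 28 tasks).
train_data = [
    {"task": "task_a", "text": "example one", "label": 1},
    {"task": "task_b", "text": "example two", "label": 0},
    {"task": "task_c", "text": "example three", "label": 1},
]
eval_tasks = ["task_b"]

heldout_train, n_dropped = hold_out_eval_tasks(train_data, eval_tasks)
print(f"kept {len(heldout_train)} examples, dropped {n_dropped}")
```

A model trained on `heldout_train` has never seen any example from the evaluation tasks, so its scores on those tasks are zero-shot in the strict sense described above.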