---
license: cc
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
datasets:
- cnn_dailymail
- Convosumm
widget:
- text: >
Can we say that among the Pythagoreans the “body” of the concept was
number? What do you mean by "concept body"? shell. What then is hidden
behind this shell? Definition of a concept) what definition of a concept
is ultimately hidden behind the body in the form of a number? All those
that the Pythagoreans indicated. I want to say that numbers were their
very concept. They thought in numbers as in concepts. Shape maybe?) you
can say yes, but it will need to be developed on a mug. The definitions of
thought are subject to numbers. On the one hand, numbers are pure
abstraction, which gives initial freedom of thought for the derivation of
abstract, embryonic definitions, but then for the derivation, description
of reality, more specific concepts, the abstractness of numbers, on the
contrary, limits, “leads into the darkness.” One is the object, “in
itself”;
model-index:
- name: BART-CNN-Convosumm
results:
- task:
name: Abstractive Dialogue Summarization
type: abstractive-text-summarization
dataset:
name: Reddit arg-filtered part of Convosumm
type: Convosumm
metrics:
- name: Validation ROUGE-1
type: rouge-1
value: 38.6252
- name: Validation ROUGE-L
type: rouge-l
value: 23.902
- name: Test ROUGE-1
type: rouge-1
value: 38.3642
- name: Test ROUGE-L
type: rouge-l
value: 23.7782
language:
- en
pipeline_tag: summarization
---

# BART-CNN-Convosumm
## Model description
This model is a fine-tuned version of facebook/bart-large-cnn on the argument-filtered Reddit portion of the Convosumm dataset. It was trained for use in a multilingual Telegram-bot summarizer.
## Intended uses & limitations
Expected input: an unstructured string of concatenated messages, without nickname/message indexing (see the usage sketch below).
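A minimal usage sketch with the `transformers` summarization pipeline; the repo id `BART-CNN-Convosumm` below is a placeholder for this model's actual Hub path, and the generation parameters are assumptions rather than recommended settings:

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual Hub path of this model.
summarizer = pipeline("summarization", model="BART-CNN-Convosumm")

# Input is expected as one unstructured string of concatenated messages,
# without "nickname: message" prefixes.
messages = [
    "Can we say that among the Pythagoreans the body of the concept was number?",
    "What do you mean by concept body?",
    "I want to say that numbers were their very concept.",
]
text = " ".join(messages)

summary = summarizer(text, max_length=128, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```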
## Training and evaluation data
More information needed
## Training procedure
Training runs were logged with Weights & Biases (wandb).
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 20
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 1
- num_epochs: 7
- label_smoothing_factor: 0.1
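A sketch of how these hyperparameters might be expressed with `Seq2SeqTrainingArguments`; the output directory and the `report_to`/`predict_with_generate` options are assumptions, not taken from the original training script:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above. "bart-cnn-convosumm" is an assumed
# output directory; report_to="wandb" reflects the wandb logging mentioned in
# the training procedure.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-convosumm",
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=20,   # effective train batch size 1 * 20 = 20
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="polynomial",
    warmup_steps=1,
    num_train_epochs=7,
    label_smoothing_factor=0.1,
    predict_with_generate=True,
    report_to="wandb",
)
```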
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|---------|
| 6.207  | 1.0  | 10 | 4.2651 | 32.3341 | 7.812   | 20.0411 | 29.4849 | 77.38  |
| 4.0248 | 1.99 | 20 | 3.9903 | 36.0787 | 11.0447 | 21.3596 | 33.2903 | 130.58 |
| 3.5933 | 2.99 | 30 | 3.9020 | 34.2931 | 11.2036 | 20.7935 | 30.8361 | 140.02 |
| 3.3086 | 3.98 | 40 | 3.8712 | 38.4842 | 11.9947 | 23.4913 | 34.4347 | 85.78  |
| 3.112  | 4.98 | 50 | 3.8700 | 38.652  | 11.8315 | 23.5208 | 34.5998 | 76.2   |
| 2.9933 | 5.97 | 60 | 3.8809 | 38.66   | 12.3337 | 23.4394 | 35.1976 | 83.26  |
| 2.834  | 6.97 | 70 | 3.8797 | 38.6252 | 12.2556 | 23.902  | 34.6324 | 81.28  |
It achieves the following results on the evaluation set (50 data points):
- Loss: 3.8797
- Rouge1: 38.6252
- Rouge2: 12.2556
- Rougel: 23.902
- Rougelsum: 34.6324
- Gen Len: 81.28
It achieves the following results on the test set (250 data points):
- Loss: 3.8343
- Rouge1: 38.3642
- Rouge2: 12.2056
- Rougel: 23.7782
- Rougelsum: 34.3959
- Gen Len: 84.132
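The ROUGE numbers above can be computed with the `evaluate` library; the snippet below is a generic sketch (the prediction and reference strings are placeholders), not the exact evaluation script used for this model:

```python
import evaluate

rouge = evaluate.load("rouge")

# predictions: model-generated summaries; references: gold summaries.
predictions = ["placeholder model summary of the conversation ..."]
references = ["placeholder reference summary of the conversation ..."]

scores = rouge.compute(predictions=predictions, references=references)
# Scores are fractions in [0, 1]; multiply by 100 to match the values reported above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```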
### Framework versions
- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0