---
license: cc
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
datasets:
- cnn_dailymail
- Convosumm
widget:
- text: >
Can we say that among the Pythagoreans the “body” of the concept was
number? What do you mean by "concept body"? shell. What then is hidden
behind this shell? Definition of a concept) what definition of a concept
is ultimately hidden behind the body in the form of a number? All those
that the Pythagoreans indicated. I want to say that numbers were their
very concept. They thought in numbers as in concepts. Shape maybe?) you
can say yes, but it will need to be developed on a mug. The definitions of
thought are subject to numbers. On the one hand, numbers are pure
abstraction, which gives initial freedom of thought for the derivation of
abstract, embryonic definitions, but then for the derivation, description
of reality, more specific concepts, the abstractness of numbers, on the
contrary, limits, “leads into the darkness.” One is the object, “in
itself”;
model-index:
- name: BART-CNN-Convosumm
results:
- task:
name: Abstractive Dialogue Summarization
type: abstractive-text-summarization
dataset:
name: Reddit arg-filtered part of Convosumm
type: Convosumm
metrics:
- name: Validation ROUGE-1
type: rouge-1
value: 38.6252
- name: Validation ROUGE-L
type: rouge-l
value: 23.902
- name: Test ROUGE-1
type: rouge-1
value: 38.3642
- name: Test ROUGE-L
type: rouge-l
value: 23.7782
language:
- en
pipeline_tag: summarization
---

# BART-CNN-Convosumm
## Model description
This model is a fine-tuned version of facebook/bart-large-cnn on the argument-filtered Reddit portion of the Convosumm dataset. It was trained for use in a multilingual Telegram-bot summarizer.
## Intended uses & limitations
Expected input: an unstructured string of concatenated messages, without nickname/message indexing (see the usage sketch below).
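A minimal usage sketch with the `transformers` summarization pipeline; the repo id `BART-CNN-Convosumm` below is a placeholder for this model's actual Hub path, and the generation parameters are assumptions rather than recommended settings:

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual Hub path of this model.
summarizer = pipeline("summarization", model="BART-CNN-Convosumm")

# Input is expected as one unstructured string of concatenated messages,
# without "nickname: message" prefixes.
messages = [
    "Can we say that among the Pythagoreans the body of the concept was number?",
    "What do you mean by concept body?",
    "I want to say that numbers were their very concept.",
]
text = " ".join(messages)

summary = summarizer(text, max_length=128, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```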
## Training and evaluation data
More information needed
## Training procedure
Training runs were logged with Weights & Biases (wandb).
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 20
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 1
- num_epochs: 7
- label_smoothing_factor: 0.1
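A sketch of how these hyperparameters might be expressed with `Seq2SeqTrainingArguments`; the output directory and the `report_to`/`predict_with_generate` options are assumptions, not taken from the original training script:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above. "bart-cnn-convosumm" is an assumed
# output directory; report_to="wandb" reflects the wandb logging mentioned in
# the training procedure.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-convosumm",
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=20,   # effective train batch size 1 * 20 = 20
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="polynomial",
    warmup_steps=1,
    num_train_epochs=7,
    label_smoothing_factor=0.1,
    predict_with_generate=True,
    report_to="wandb",
)
```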
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|---------|
| 6.207  | 1.0  | 10 | 4.2651 | 32.3341 | 7.812   | 20.0411 | 29.4849 | 77.38  |
| 4.0248 | 1.99 | 20 | 3.9903 | 36.0787 | 11.0447 | 21.3596 | 33.2903 | 130.58 |
| 3.5933 | 2.99 | 30 | 3.9020 | 34.2931 | 11.2036 | 20.7935 | 30.8361 | 140.02 |
| 3.3086 | 3.98 | 40 | 3.8712 | 38.4842 | 11.9947 | 23.4913 | 34.4347 | 85.78  |
| 3.112  | 4.98 | 50 | 3.8700 | 38.652  | 11.8315 | 23.5208 | 34.5998 | 76.2   |
| 2.9933 | 5.97 | 60 | 3.8809 | 38.66   | 12.3337 | 23.4394 | 35.1976 | 83.26  |
| 2.834  | 6.97 | 70 | 3.8797 | 38.6252 | 12.2556 | 23.902  | 34.6324 | 81.28  |
It achieves the following results on the evaluation set (50 data points):
- Loss: 3.8797
- Rouge1: 38.6252
- Rouge2: 12.2556
- Rougel: 23.902
- Rougelsum: 34.6324
- Gen Len: 81.28
It achieves the following results on the test set (250 data points):
- Loss: 3.8343
- Rouge1: 38.3642
- Rouge2: 12.2056
- Rougel: 23.7782
- Rougelsum: 34.3959
- Gen Len: 84.132
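The ROUGE numbers above can be computed with the `evaluate` library; the snippet below is a generic sketch (the prediction and reference strings are placeholders), not the exact evaluation script used for this model:

```python
import evaluate

rouge = evaluate.load("rouge")

# predictions: model-generated summaries; references: gold summaries.
predictions = ["placeholder model summary of the conversation ..."]
references = ["placeholder reference summary of the conversation ..."]

scores = rouge.compute(predictions=predictions, references=references)
# Scores are fractions in [0, 1]; multiply by 100 to match the values reported above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```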
### Framework versions
- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0