---
license: cc
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
datasets:
  - cnn_dailymail
  - Convosumm
widget:
  - text: >
      Can we say that among the Pythagoreans the “body” of the concept was
      number? What do you mean by "concept body"? shell. What then is hidden
      behind this shell? Definition of a concept) what definition of a concept
      is ultimately hidden behind the body in the form of a number? All those
      that the Pythagoreans indicated. I want to say that numbers were their
      very concept. They thought in numbers as in concepts. Shape maybe?) you
      can say yes, but it will need to be developed on a mug. The definitions of
      thought are subject to numbers. On the one hand, numbers are pure
      abstraction, which gives initial freedom of thought for the derivation of
      abstract, embryonic definitions, but then for the derivation, description
      of reality, more specific concepts, the abstractness of numbers, on the
      contrary, limits, “leads into the darkness.” One is the object, “in
      itself”;'
model-index:
  - name: BART-CNN-Convosumm
    results:
      - task:
          name: Abstractive Dialogue Summarization
          type: abstractive-text-summarization
        dataset:
          name: Reddit arg-filtered part of Convosumm
          type: Convosumm
        metrics:
          - name: Validation ROUGE-1
            type: rouge-1
            value: 38.6252
          - name: Validation ROUGE-L
            type: rouge-l
            value: 23.902
          - name: Test ROUGE-1
            type: rouge-1
            value: 38.3642
          - name: Test ROUGE-L
            type: rouge-l
            value: 23.7782
language:
  - en
pipeline_tag: summarization
---

# BART-CNN-Convosumm

## Model description

This model is a fine-tuned version of facebook/bart-large-cnn on the argument-filtered Reddit part of the Convosumm dataset. It was trained to serve as the summarization model behind a multilanguage Telegram-bot summarizer.

## Intended uses & limitations

Expected input: an unstructured block of concatenated chat messages, without nickname-to-message indexing or other speaker markup.
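
A minimal usage sketch, assuming the checkpoint is available under the repo id `Remeris/BART-CNN-Convosumm` (inferred from this card); the generation settings shown are illustrative, not values from the card:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; the repo id is assumed from this card.
summarizer = pipeline("summarization", model="Remeris/BART-CNN-Convosumm")

# The model expects one unstructured string of concatenated messages,
# without "nickname: message" indexing.
messages = [
    "Can we say that among the Pythagoreans the 'body' of the concept was number?",
    "What do you mean by 'concept body'?",
    "I want to say that numbers were their very concept.",
    "They thought in numbers as in concepts.",
]
conversation = " ".join(messages)

# max_length / min_length are illustrative defaults for BART-CNN-style summaries.
summary = summarizer(conversation, max_length=142, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```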

## Training and evaluation data

The model was fine-tuned and evaluated on the argument-filtered Reddit subset of Convosumm; the validation split used here contains 50 examples and the test split 250 examples.

## Training procedure

Training runs were logged with Weights & Biases (wandb).

### Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction as `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 20
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 1
- num_epochs: 7
- label_smoothing_factor: 0.1
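
As a hedged reconstruction (not the original training script), these settings map onto `transformers.Seq2SeqTrainingArguments` roughly as follows; the output directory is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the hyperparameters listed above; paths are placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-convosumm",    # placeholder output path
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=20,     # effective train batch size: 1 * 20 = 20
    lr_scheduler_type="polynomial",
    warmup_steps=1,
    num_train_epochs=7,
    label_smoothing_factor=0.1,
    # The default AdamW optimizer already uses betas=(0.9, 0.999) and epsilon=1e-08.
    predict_with_generate=True,         # generate summaries so ROUGE can be computed
    report_to="wandb",                  # runs were logged to Weights & Biases
)
```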

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 6.207         | 1.0   | 10   | 4.2651          | 32.3341 | 7.812   | 20.0411 | 29.4849   | 77.38   |
| 4.0248        | 1.99  | 20   | 3.9903          | 36.0787 | 11.0447 | 21.3596 | 33.2903   | 130.58  |
| 3.5933        | 2.99  | 30   | 3.9020          | 34.2931 | 11.2036 | 20.7935 | 30.8361   | 140.02  |
| 3.3086        | 3.98  | 40   | 3.8712          | 38.4842 | 11.9947 | 23.4913 | 34.4347   | 85.78   |
| 3.112         | 4.98  | 50   | 3.8700          | 38.652  | 11.8315 | 23.5208 | 34.5998   | 76.2    |
| 2.9933        | 5.97  | 60   | 3.8809          | 38.66   | 12.3337 | 23.4394 | 35.1976   | 83.26   |
| 2.834         | 6.97  | 70   | 3.8797          | 38.6252 | 12.2556 | 23.902  | 34.6324   | 81.28   |

It achieves the following results on the evaluation set (50 data points):

- Loss: 3.8797
- Rouge1: 38.6252
- Rouge2: 12.2556
- RougeL: 23.902
- RougeLsum: 34.6324
- Gen Len: 81.28

It achieves the following results on the test set (250 data points):

- Loss: 3.8343
- Rouge1: 38.3642
- Rouge2: 12.2056
- RougeL: 23.7782
- RougeLsum: 34.3959
- Gen Len: 84.132
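
The ROUGE scores above can be reproduced in spirit with the `evaluate` library; the sketch below is illustrative rather than the exact evaluation script (generation settings and dataset loading are not documented here), and the repo id is again assumed:

```python
import evaluate
from transformers import pipeline

rouge = evaluate.load("rouge")
summarizer = pipeline("summarization", model="Remeris/BART-CNN-Convosumm")  # repo id assumed

def rouge_scores(conversations, reference_summaries):
    """Summarize each conversation and score against the reference summaries."""
    predictions = [out["summary_text"] for out in summarizer(conversations, truncation=True)]
    # evaluate returns scores in the 0-1 range; multiply by 100 to compare
    # with the numbers reported in this card.
    return rouge.compute(predictions=predictions, references=reference_summaries)

# Placeholder data; substitute the Convosumm validation (50) or test (250) split.
print(rouge_scores(
    ["first message second message third message ..."],
    ["a short reference summary of the conversation"],
))
```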

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0