maclean-connor96
/

feedier-french-books


maclean-connor96 committed on Aug 31, 2023

Commit

4e4a1fd

·

1 Parent(s): 8e792ce

Update ReadMe

Files changed (1)

README.md +19 -1


@@ -3,4 +3,22 @@ license: apache-2.0
 datasets:
 - Abirate/french_book_reviews
 pipeline_tag: text-classification
----

+---
+## Model and approach 🤗
+#### Because I am limited by my personal computer, training was done on the distilbert-base-multilingual-cased model. DistilBERT is reported to run about 60% faster than the original BERT model while preserving over 95% of its performance.
+#### The dataset provided contains book titles, authors, reviews, and a score for each book. These columns were concatenated into one large context block per book, which was used as the input text. The labels (0, 1, and -1) were remapped to 0, 1, and 2, and then named NEUTRAL, POSITIVE, and NEGATIVE respectively, to make the predictions easier to read.
+#### As this exercise is simply meant to demonstrate my ability to train a model, the model was trained for 2 epochs using 3000 training entries and 300 test entries.
+## Notes on the three classes and the model's bias 📝
+#### The three classes are not equally represented in the dataset as a whole. Although the data is shuffled, positive reviews are the most common and are therefore the most frequently predicted category. In addition, the decision to keep the review score in the text block affected the model's biases: the model can make a prediction based on the score alone, a number between 1 and 5.
+### Positive reviews: 2081
+### Negative reviews: 224
+### Neutral reviews: 695
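
The preprocessing and class imbalance described in the README can be illustrated with a minimal sketch. It is not the author's training code: the column names (`book_title`, `author`, `reader_review`, `rating`), the helper names, and the sample row are all hypothetical, and the label mapping is inferred from the order in which the README lists the classes. Only the three class counts come from the README itself.

```python
from collections import Counter

# Assumed mapping, inferred from the order given in the README:
# raw 0 -> id 0 (NEUTRAL), raw 1 -> id 1 (POSITIVE), raw -1 -> id 2 (NEGATIVE).
RAW_TO_ID = {0: 0, 1: 1, -1: 2}
ID2LABEL = {0: "NEUTRAL", 1: "POSITIVE", 2: "NEGATIVE"}

# Hypothetical column names; adjust them to the real dataset schema.
TEXT_COLUMNS = ("book_title", "author", "reader_review", "rating")


def build_example(row):
    """Concatenate the row's columns into one text block and remap its label."""
    text = " ".join(str(row[column]) for column in TEXT_COLUMNS)
    return {"text": text, "label": RAW_TO_ID[row["label"]]}


def class_shares(counts):
    """Return each class's fraction of the total number of examples."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Class counts as reported in the README.
reported = Counter({"POSITIVE": 2081, "NEUTRAL": 695, "NEGATIVE": 224})
shares = class_shares(reported)

# Made-up row, used only to show the shape of the output.
example = build_example({
    "book_title": "Le Petit Prince",
    "author": "Antoine de Saint-Exupéry",
    "reader_review": "Un livre magnifique.",
    "rating": 5,
    "label": 1,
})

print(example["text"])
print(ID2LABEL[example["label"]])
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The reported counts add up to the 3000 training entries, and positive reviews make up roughly 69% of them, so a classifier that always answered POSITIVE would already be right about 69% of the time on data with this distribution. Because the score is the last token of each text block in this sketch, it also shows how the model could learn to rely on that number alone.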