Spaces: TheBritishLibrary/British-Library-books-genre-classifier


davanstrien (HF staff) committed on Nov 9, 2021

Commit 7745cb1 (1 parent: 0ade054)

Update app.py

Files changed (1): app.py

@@ -132,7 +132,15 @@ This demo allows you to play with a 'genre' detection model which has been train
 The model was trained with the [fastai](https://docs.fast.ai/) library on training data drawn from [digitised books](https://www.bl.uk/collection-guides/digitised-printed-books) at the British Library. These Books are mainly from the 19th Century.
 The demo also shows you which parts of the input the model is using most to make its prediction. You can hover over the words to see the attention score assigned to that word. This gives you some sense of which words are important to the model in making a prediction.
-The examples include titles from the BL books collection.
+The examples include titles from the BL books collection. You may notice that the model makes mistakes on short titles in particular. This can partly be explained by the title format in the original data. For example, the novel *'Vanity Fair'* by William Makepeace Thackeray
+is found in the training data as:
+```
+Vanity Fair. A novel without a hero ... With all the original illustrations by the author, etc
+```
+You can see that the model gets a bit of help with the genre here 😉. Since the model was trained for a very particular dataset and task, it might not work well on titles that don't match this original corpus.
 ## Background
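
The point about title format can be illustrated with a small sketch. This is not how the model makes its predictions. It only shows that the catalogue-style title contains an explicit genre word, while the short title does not. The names `GENRE_CUES` and `genre_cues` are hypothetical, and the cue list is an assumption made for illustration rather than something taken from the training data.

```python
import re

# Hypothetical cue words, chosen for illustration only.
# They are NOT derived from the model or its training data.
GENRE_CUES = ("novel", "tale", "romance", "poem", "history")


def genre_cues(title):
    """Return the cue words that occur as whole words in the title."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return [cue for cue in GENRE_CUES if cue in words]


short_title = "Vanity Fair"
catalogue_title = (
    "Vanity Fair. A novel without a hero ... "
    "With all the original illustrations by the author, etc"
)

print(genre_cues(short_title))      # []
print(genre_cues(catalogue_title))  # ['novel']
```

A short title such as `Vanity Fair` gives a text classifier far less to work with than the full catalogue entry, which is consistent with the mistakes on short titles described in the added text.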