Spaces: TheBritishLibrary/British-Library-books-genre-classifier

davanstrien (HF staff) committed on Nov 8, 2021

Commit 71b3cf3 (1 parent: f1cf4ce)

add more model information
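
Alongside the documentation text, the diff for this commit touches the `@lru_cache(maxsize=1024 * 2)` decorators on `_intrinsic_attention` and `predict_label` (a whitespace-only change). The following is a minimal sketch of what that memoisation does, not the app's code: `fake_predict` and `predict_label_cached` are hypothetical stand-ins for `learn_inf.predict` and `predict_label`, and the scoring rule is invented purely so the example runs without a model.

```python
from functools import lru_cache

# Counts how often the (hypothetical) underlying model is actually invoked.
calls = {"n": 0}


def fake_predict(title: str) -> tuple:
    """Hypothetical stand-in for a slow model call such as learn_inf.predict."""
    calls["n"] += 1
    score = min(len(title) / 100, 1.0)  # invented rule, for illustration only
    return (score, 1.0 - score)


@lru_cache(maxsize=1024 * 2)
def predict_label_cached(title: str) -> tuple:
    # Repeated calls with the same title are served from the cache
    # instead of re-running the model.
    return fake_predict(title)


first = predict_label_cached("A History of England")
second = predict_label_cached("A History of England")  # cache hit
info = predict_label_cached.cache_info()
```

Because `lru_cache` keys on the call arguments, every argument must be hashable. A cache hit also returns the same object as the first call, so returning an immutable value (a tuple here) avoids accidental sharing of mutable state between callers.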

Files changed (1)

app.py +10 -10

@@ -16,7 +16,6 @@ def _value2rgba(x, cmap=cm.RdYlGn, alpha_mult=1.0):
     return tuple(rgb.tolist() + [a])
 def _eval_dropouts(mod):
     module_name = mod.__class__.__name__
     if "Dropout" in module_name or "BatchNorm" in module_name:
@@ -25,7 +24,6 @@ def _eval_dropouts(mod):
         _eval_dropouts(module)
 def _piece_attn_html(pieces, attns, sep=" ", **kwargs):
     html_code, spans = ['<span style="font-family: monospace;">'], []
     for p, a in zip(pieces, attns):
@@ -45,8 +43,7 @@ def _show_piece_attn(*args, **kwargs):
     display(HTML(_piece_attn_html(*args, **kwargs)))
-@lru_cache(maxsize=1024*2)
+@lru_cache(maxsize=1024 * 2)
 def _intrinsic_attention(learn, text, class_id=None):
     "Calculate the intrinsic attention of the input w.r.t to an output `class_id`, or the classification given by the model if `None`."
     learn.model.train()
@@ -80,12 +77,10 @@ def intrinsic_attention(x: TextLearner, text: str, class_id: int = None, **kwarg
     return _piece_attn_html(text.split(), to_np(attn), **kwargs)
 labels = learn_inf.dls.vocab[1]
-@lru_cache(maxsize=1024*2)
+@lru_cache(maxsize=1024 * 2)
 def predict_label(title):
     *_, probs = learn_inf.predict(title)
     return probs
@@ -131,11 +126,14 @@ British Library Books genre detection model
 article = """
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5245175.svg)](https://doi.org/10.5281/zenodo.5245175)
 # British Library Books genre detection demo
 This demo alows you to play with a 'genre' detection model which has been trained to predict, from the title of a book, whether it is 'fiction' or 'non-fiction'.
-The demo also shows you which parts of the input the model is using most to make its prediction.
+The demo also shows you which parts of the input the model is using most to make its prediction. You can hover over the words to see the attenton score assigned to that word. This gives you some sense of which words are important to the model in making a prediction.
+## Background
+This model was developed as part of work by the [Living with Machines](https://livingwithmachines.ac.uk/). The process of training the model and working with the data is documented in a tutorial which will be released soon.
 ## Model description
@@ -145,12 +143,14 @@ This dataset is dominated by English language books though it includes books in
 ## Training data
-[[More information needed]]
+The model is trained on a particular collection of books digitised by the British Library. As a result the model may do less well on titles that look different to this data.
+In particular the training data, was mostly English, and mostly from the 19th Century. You can find more information about the model [here]((https://doi.org/10.5281/zenodo.5245175))
 ## Model performance
 The models performance on a held-out test set is as follows:
 ```
              precision    recall  f1-score   support