Add files

Files changed (10)

README.md ADDED

+---
+language: "en"
+datasets:
+- squad
+metrics:
+- squad
+license: apache-2.0
+---
+# DistilBERT base cased distilled SQuAD
+This model is a fine-tuned checkpoint of [DistilBERT-base-cased](https://huggingface.co/distilbert-base-cased), fine-tuned on SQuAD v1.1 using a second step of knowledge distillation.
+This model reaches an F1 score of 87.1 on the dev set (for comparison, the BERT `bert-base-cased` version reaches an F1 score of 88.7).
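
The README compares F1 scores without saying how F1 is computed. The sketch below assumes the figures refer to the token-overlap F1 used by the official SQuAD v1.1 evaluation (lowercase, strip punctuation and English articles, then compare bags of tokens). The function names are illustrative and do not come from this commit.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between one predicted answer and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most min(count) times.
    num_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, ground_truths: list[str]) -> float:
    """A question may have several reference answers; keep the best match."""
    return max(f1_score(prediction, gt) for gt in ground_truths)


if __name__ == "__main__":
    # 2 shared tokens: precision 2/2, recall 2/4.
    print(f1_score("the Eiffel Tower", "Eiffel Tower in Paris"))
```

Under this assumption, a dev-set score such as 87.1 would be the mean of `best_f1` over all questions, multiplied by 100.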

config.json ADDED

+{
+  "activation": "gelu",
+  "architectures": [
+    "DistilBertForQuestionAnswering"
+  ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
+  "initializer_range": 0.02,
+  "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
+  "output_past": true,
+  "pad_token_id": 0,
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": true,
+  "tie_weights_": true,
+  "vocab_size": 28996
+}
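
As a sanity check on the config above, the following sketch estimates the parameter count from its dimensions. The per-layer breakdown (four attention projections, a two-layer feed-forward block, two layer norms, learned-table storage for position embeddings even when `sinusoidal_pos_embds` is true, and a two-output QA head) is an assumption based on the usual DistilBERT layout, not something stated in this commit.

```python
import json

# Only the fields needed for the estimate, copied from config.json above.
CONFIG_JSON = """
{
  "dim": 768,
  "hidden_dim": 3072,
  "max_position_embeddings": 512,
  "n_layers": 6,
  "vocab_size": 28996
}
"""


def count_parameters(cfg: dict, num_labels: int = 2) -> int:
    """Estimate the weight count, assuming the usual DistilBERT layout."""
    d, h = cfg["dim"], cfg["hidden_dim"]
    layer_norm = 2 * d  # scale + bias
    embeddings = (
        cfg["vocab_size"] * d                 # word embeddings
        + cfg["max_position_embeddings"] * d  # position table (assumed stored)
        + layer_norm
    )
    attention = 4 * (d * d + d)               # q, k, v, output projections
    feed_forward = (d * h + h) + (h * d + d)  # two linear layers with biases
    per_layer = attention + feed_forward + 2 * layer_norm
    qa_head = d * num_labels + num_labels     # start/end logits (assumed)
    return embeddings + cfg["n_layers"] * per_layer + qa_head


cfg = json.loads(CONFIG_JSON)
total = count_parameters(cfg)
print(f"parameters: {total:,}")
print(f"float32 bytes: {total * 4:,}")
```

Under these assumptions the estimate is 65,192,450 parameters, or 260,769,800 bytes at 4 bytes each. That is roughly 24 KB below the 260793700 bytes recorded for `pytorch_model.bin`, which would be consistent with float32 weights plus serialization overhead.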

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:4e10bdbc83fdbb975a430fc2148c85051e55bd288334deab18db58664ef0ea13
+size 260793700

rust_model.ot ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:8a9f9b2f153ac9ff230aca4548fa3286be9d2f9ea4eb7e9169665b1a8e983f44
+size 260795580

saved_model.tar.gz ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:f7e26fe22fdeb23462ae6423fc04b7e4929212a49aa033c3a7b8f30c937c943f
+size 241487391

tf_model.h5 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:63ee5a0142067161ced524179c161c5026f47b53a34a946a5ad1a907fab35011
+size 260894952

tfjs.tar.gz ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:2e966858819faa94996263752a344fad68f858299ccdf27ccabe3d868c588186
+size 241062466
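
The five weight files above are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. A minimal sketch of how a downloaded file could be checked against such a pointer follows; the helper names are hypothetical, and the leading `+` handling only exists because the pointers are shown here as diff lines.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (diff '+' tolerated)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.lstrip("+").partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a wrong-sized file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a ~260 MB checkpoint is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """\
+version https://git-lfs.github.com/spec/v1
+oid sha256:4e10bdbc83fdbb975a430fc2148c85051e55bd288334deab18db58664ef0ea13
+size 260793700
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

With a local copy of the checkpoint, `matches_pointer("pytorch_model.bin", pointer)` would report whether the download is intact; the local path is an assumption about where the file was saved.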

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

+{
+  "do_lower_case": false
+}

vocab.txt ADDED

The diff for this file is too large to render. See raw diff