metadata
tags:
  - spacy
  - token-classification
language:
  - en
license: mit
model-index:
  - name: en_core_web_md
    results:
      - task:
          name: NER
          type: token-classification
        metrics:
          - name: NER Precision
            type: precision
            value: 0.853733758
          - name: NER Recall
            type: recall
            value: 0.8456530449
          - name: NER F Score
            type: f_score
            value: 0.8496741892
      - task:
          name: POS
          type: token-classification
        metrics:
          - name: POS Accuracy
            type: accuracy
            value: 0.9727831973
      - task:
          name: SENTER
          type: token-classification
        metrics:
          - name: SENTER Precision
            type: precision
            value: 0.9049104721
          - name: SENTER Recall
            type: recall
            value: 0.8801372122
          - name: SENTER F Score
            type: f_score
            value: 0.8923519379
      - task:
          name: UNLABELED_DEPENDENCIES
          type: token-classification
        metrics:
          - name: Unlabeled Dependencies Accuracy
            type: accuracy
            value: 0.9186878782
      - task:
          name: LABELED_DEPENDENCIES
          type: token-classification
        metrics:
          - name: Labeled Dependencies Accuracy
            type: accuracy
            value: 0.9005

Details: https://spacy.io/models/en#en_core_web_md

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.
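As a minimal sketch of how the pipeline might be used, the snippet below assumes the `en_core_web_md` package is installed alongside a compatible spaCy version. The helper names `annotate` and `demo` are illustrative and are not part of spaCy or of this package.

```python
from typing import Any, Dict, List, Tuple


def annotate(nlp: Any, text: str) -> Dict[str, List[Tuple[str, ...]]]:
    """Run a loaded spaCy pipeline on `text` and collect a few annotations."""
    doc = nlp(text)
    return {
        # (token text, coarse POS tag, dependency label, lemma) per token
        "tokens": [(t.text, t.pos_, t.dep_, t.lemma_) for t in doc],
        # (entity text, entity label) per named entity found by the ner component
        "entities": [(e.text, e.label_) for e in doc.ents],
    }


def demo(text: str = "Apple is looking at buying a U.K. startup.") -> None:
    """Load en_core_web_md and print its annotations for one sentence."""
    # Imported here so that `annotate` stays usable without spaCy installed.
    import spacy

    # Raises OSError if the en_core_web_md package is not installed.
    nlp = spacy.load("en_core_web_md")
    result = annotate(nlp, text)
    for row in result["tokens"]:
        print(*row, sep="\t")
    for ent_text, ent_label in result["entities"]:
        print(f"{ent_text} -> {ent_label}")
```

Calling `demo()` prints one row per token followed by the detected entities. The exact tags and labels depend on the installed model version.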

| Feature | Description |
| --- | --- |
| Name | `en_core_web_md` |
| Version | `3.1.0` |
| spaCy | `>=3.1.0,<3.2.0` |
| Default Pipeline | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
| Components | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
| Vectors | 684830 keys, 20000 unique vectors (300 dimensions) |
| Sources | OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston)<br />ClearNLP Constituent-to-Dependency Conversion (Emory University)<br />WordNet 3.0 (Princeton University)<br />GloVe Common Crawl (Jeffrey Pennington, Richard Socher, and Christopher D. Manning) |
| License | MIT |
| Author | Explosion |
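
Comparing the Default Pipeline and Components entries shows that `senter` ships with the package but is not active by default. The sketch below derives that difference from the two lists and, assuming an already loaded pipeline object `nlp`, swaps the parser for the lighter `senter` when only sentence boundaries are needed. `disabled_by_default` and `use_senter_only` are hypothetical helper names. `pipe_names`, `disable_pipe`, and `enable_pipe` are the spaCy v3 `Language` members this sketch relies on.

```python
from typing import Any, List

# Values copied from the "Default Pipeline" and "Components" entries above.
DEFAULT_PIPELINE = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]
COMPONENTS = ["tok2vec", "tagger", "parser", "senter", "attribute_ruler", "lemmatizer", "ner"]


def disabled_by_default(components: List[str], default_pipeline: List[str]) -> List[str]:
    """Return components that are packaged but absent from the default pipeline."""
    active = set(default_pipeline)
    return [name for name in components if name not in active]


def use_senter_only(nlp: Any) -> List[str]:
    """Disable the parser and enable senter on an already loaded pipeline."""
    if "parser" in nlp.pipe_names:
        nlp.disable_pipe("parser")
    if "senter" not in nlp.pipe_names:
        nlp.enable_pipe("senter")
    # Return a copy of the names of the components that are now active.
    return list(nlp.pipe_names)
```

With a pipeline obtained from `spacy.load("en_core_web_md")`, calling `use_senter_only(nlp)` should leave `doc.sents` available without running the dependency parser.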

Accuracy

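| Type | Score (%) |
| --- | --- |
| `TOKEN_ACC` | 99.93 |
| `TAG_ACC` | 97.28 |
| `DEP_UAS` | 91.87 |
| `DEP_LAS` | 90.05 |
| `ENTS_P` | 85.37 |
| `ENTS_R` | 84.57 |
| `ENTS_F` | 84.97 |
| `SENTS_P` | 90.49 |
| `SENTS_R` | 88.01 |
| `SENTS_F` | 89.24 |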
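
The F scores above appear consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The short check below recomputes them from the unrounded precision and recall values in the metadata block. `f_score` and `check_reported` are illustrative helpers, not spaCy functions.

```python
from typing import Dict


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the model-index metadata.
REPORTED = {
    "NER": (0.853733758, 0.8456530449, 0.8496741892),
    "SENTER": (0.9049104721, 0.8801372122, 0.8923519379),
}


def check_reported(tolerance: float = 1e-6) -> Dict[str, bool]:
    """Report, per task, whether the recomputed F matches the reported one."""
    return {
        name: abs(f_score(p, r) - f) < tolerance
        for name, (p, r, f) in REPORTED.items()
    }


for name, (p, r, f) in REPORTED.items():
    print(f"{name}: recomputed {100 * f_score(p, r):.2f}, reported {100 * f:.2f}")
```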