taishi-i committed on
Commit
0df6d6c
1 Parent(s): 70edc56

add evaluation script to README.md

Files changed (1)
  1. README.md +44 -0
README.md CHANGED
@@ -53,6 +53,50 @@ label = pipe(text)
  print(label) # [{'label': '0', 'score': 0.9986791014671326}]
  ```
 
+ # Evaluation
+
+ Please install the following library.
+
+ ```bash
+ pip install evaluate scikit-learn datasets transformers torch
+ ```
+
+ ```python
+ import evaluate
+ from datasets import load_dataset
+ from sklearn.metrics import classification_report
+ from transformers import pipeline
+
+ # Evaluation dataset
+ dataset = load_dataset("taishi-i/awesome-japanese-nlp-classification-dataset")
+
+ # Text classification model
+ pipe = pipeline(
+     "text-classification",
+     model="taishi-i/awesome-japanese-nlp-classification-model",
+ )
+
+ # Evaluation metric
+ f1 = evaluate.load("f1")
+
+ # Predict process
+ predicted_labels = []
+ for text in dataset["test"]["text"]:
+     prediction = pipe(text)
+     predicted_label = prediction[0]["label"]
+     predicted_labels.append(int(predicted_label))
+
+ score = f1.compute(
+     predictions=predicted_labels, references=dataset["test"]["label"]
+ )
+ print(score)
+
+ report = classification_report(
+     y_true=dataset["test"]["label"], y_pred=predicted_labels
+ )
+ print(report)
+ ```
+
  # License
 
  This model was trained from a dataset collected from the GitHub API under [GitHub Acceptable Use Policies - 7. Information Usage Restrictions](https://docs.github.com/en/site-policy/acceptable-use-policies/github-acceptable-use-policies#7-information-usage-restrictions) and [GitHub Terms of Service - H. API Terms](https://docs.github.com/en/site-policy/github-terms/github-terms-of-service#h-api-terms). It should be used solely for research verification purposes. Adhering to GitHub's regulations is mandatory.
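
As a reading aid for the evaluation script added in this commit, the following is a minimal sketch of what a binary F1 score measures. It assumes the metric's default binary averaging with label 1 as the positive class. The helper `binary_f1` and the label lists are hypothetical and made up for illustration; they are not part of the README and are not model outputs.

```python
def binary_f1(references, predictions, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(references, predictions))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for r, p in pairs if r == positive and p == positive)
    fp = sum(1 for r, p in pairs if r != positive and p == positive)
    fn = sum(1 for r, p in pairs if r == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only; these are not model outputs.
references = [1, 0, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 1, 0]
result = binary_f1(references, predictions)
print(result)
```

With these toy labels there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.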