emilys committed on
Commit
76eeb1a
1 Parent(s): 0b8e91f

Create README.md

Files changed (1)
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: cc-by-4.0
+ language:
+ - en
+ pipeline_tag: text-classification
+ tags:
+ - roberta-large
+ - topic
+ - news
+
+ widget:
+ - text: "Diplomatic efforts to deal with the world’s two wars — the civil war in Spain and the undeclared Chinese - Japanese conflict — received sharp setbacks today."
+ - text: "WASHINGTON. AP. A decisive development appeared in the offing in the tug-of-war between the federal government and the states over the financing of relief."
+ - text: "A frantic bride called the Rochester Gas and Electric corporation to complain that her new refrigerator “freezes ice cubes too fast.”"
+
+ ---
+
+ # Fine-tuned RoBERTa-large for detecting obituaries in news
+
+ # Model Description
+
+ This model is a fine-tuned RoBERTa-large that classifies whether a news article is an obituary.
+
+ # How to Use
+
+ ```python
+ from transformers import pipeline
+ # Load the fine-tuned classifier from the Hugging Face Hub
+ classifier = pipeline("text-classification", model="dell-research-harvard/topic-obits")
+ # Classify one text; the result is a list with one {"label", "score"} dict
+ classifier("John Smith died after a long illness")
+ ```
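When classifying many articles at once, the pipeline can be called on a list of strings and returns one `{"label": ..., "score": ...}` dictionary per input. The sketch below shows one possible way to keep only the articles predicted to be obituaries. The helper name `select_obituaries`, the label string `"LABEL_1"`, and the 0.5 threshold are assumptions made for illustration; check `classifier.model.config.id2label` for the label names this model actually emits. A stub stands in for the real pipeline so the snippet runs without downloading the model.

```python
from typing import Callable, Dict, List, Sequence, Union

Prediction = Dict[str, Union[str, float]]


def select_obituaries(
    texts: Sequence[str],
    classifier: Callable[[List[str]], List[Prediction]],
    positive_label: str = "LABEL_1",  # assumption: verify against id2label
    threshold: float = 0.5,           # assumption: illustrative cut-off
) -> List[str]:
    """Return the texts whose top prediction is the positive label."""
    predictions = classifier(list(texts))  # one dict per input text
    return [
        text
        for text, pred in zip(texts, predictions)
        if pred["label"] == positive_label and pred["score"] >= threshold
    ]


def stub_classifier(batch: List[str]) -> List[Prediction]:
    """Hypothetical stand-in for the pipeline, used only for this demo."""
    return [
        {"label": "LABEL_1" if "died" in text else "LABEL_0", "score": 0.99}
        for text in batch
    ]


articles = [
    "John Smith died after a long illness",
    "The city council approved the new budget",
]
obituaries = select_obituaries(articles, stub_classifier)
print(obituaries)
```

To run this against the real model, pass the `classifier` object created with `pipeline(...)` in place of `stub_classifier`.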
+
+ # Training data
+
+ The model was trained on a hand-labelled sample of data from the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+
+ Split|Size
+ -|-
+ Train|272
+ Dev|57
+ Test|57
+
+ # Test set results
+
+ Metric|Result
+ -|-
+ F1|1.000
+ Accuracy|1.000
+ Precision|1.000
+ Recall|1.000
+
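For reference, the four metrics above are the standard binary classification measures. The sketch below computes them from true and predicted labels in plain Python. The function name `binary_metrics` is hypothetical, the toy labels are invented for illustration and are not the model's test set, and treating `1` as the obituary class is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"f1": f1, "accuracy": accuracy, "precision": precision, "recall": recall}


# Toy labels (illustrative only): every prediction matches the true label
y_true = [1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A score of 1.000 on every metric, as in the table, means that each of the 57 test examples was classified correctly.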
+
+ # Citation Information
+
+ You can cite this model using:
+
+ ```
+ @misc{silcock2024newswirelargescalestructureddatabase,
+ title={Newswire: A Large-Scale Structured Database of a Century of Historical News},
+ author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
+ year={2024},
+ eprint={2406.09490},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2406.09490},
+ }
+ ```
+
+ # Applications
+
+ We applied this model to a century of historical news articles. All of the resulting classifications are available in the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+
+