AliHaider0343 committed on
Commit
9af390c
1 Parent(s): c4ddcf0

Create README.md

  1. README.md +278 -0
README.md ADDED
@@ -0,0 +1,278 @@
---
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- bert
- Aspects
- ABSA
- Aspects Extraction
- roberta
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

Extracting implicit and explicit aspects from restaurant reviews using a RoBERTa-Large variant, fine-tuned on a custom dataset and benchmarked against existing methods.

We present an approach to extracting implicit and explicit aspects from restaurant reviews. The method fine-tunes a RoBERTa-Large variant on a custom, domain-specific dataset.

Aspect extraction involves identifying both explicit aspects, which are mentioned directly in a review, and implicit aspects, which are only referred to indirectly. RoBERTa-Large is a pretrained language model whose contextual representations help capture this kind of nuance in text.

To assess the approach, we benchmarked the system against existing methods in the field and found that it compared favorably in precision, recall, and overall performance. Per-class results are reported under Evaluation and Results.

We also developed a custom dataset tailored to the restaurant domain, covering a diverse range of reviews from various platforms. Training on this dataset gives the model domain-specific knowledge, which improves aspect extraction.

Overall, combining the RoBERTa-Large variant with a carefully curated custom dataset yields an efficient solution for aspect extraction from restaurant reviews, with applications in sentiment analysis, opinion mining, and other natural language processing tasks in the restaurant domain.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Ali Haider
- **Shared by:** Ali Haider
- **Model type:** BERT variant (RoBERTa)
- **Language(s) (NLP):** English (restaurant-domain reviews)
- **Finetuned from model:** RoBERTa-Large

## Uses

The aspect extraction model for the restaurant domain aims to extract the implicit and explicit aspects mentioned in reviews. It can be used for various purposes, such as:
1. Extracting aspects from review sentences and classifying them under 34 aspect categories.
2. Aspect-based restaurant recommendation systems.
3. Restaurant review analysis.

### Out-of-Scope Use

The model has been tuned to classify out-of-scope sentences into the General category.


## How to Get Started with the Model

Sample sentence: The food was very delicious, elegant ambience and decoration, floors were clean and most importantly the food was affordable.

Expected output:

- Food-Taste
- Food-Price
- Restaurant-Decoration
- Restaurant-Atmosphere
- Restaurant-Hygiene

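
The sample above returns several aspects for one sentence, which suggests multi-label output. The following is a minimal sketch of how per-label scores could be turned into such a list, assuming the model emits one logit per label and that a label is kept when its sigmoid probability reaches 0.5. The label list, its order, the threshold, and the logits are all hypothetical and chosen only for illustration; the real id-to-label mapping should be read from the model's configuration.

```python
import math

# Hypothetical label subset and order, for illustration only.
LABELS = [
    "Food-Taste",
    "Food-Price",
    "Restaurant-Decoration",
    "Restaurant-Atmosphere",
    "Restaurant-Hygiene",
    "Service-General",
]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_aspects(logits, labels=LABELS, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Made-up logits standing in for the model output on the sample sentence.
example_logits = [3.1, 2.4, 1.8, 0.9, 2.2, -4.0]
predicted = decode_aspects(example_logits)
print(predicted)
```

With these made-up logits the first five labels pass the threshold and `Service-General` is dropped. A different threshold trades precision against recall per label.
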

## Training Details

A RoBERTa-Large variant was fine-tuned on 10,678 annotated sentences. Each sentence is labeled with every aspect it belongs to, and training continued until the validation loss had stopped improving for 3 epochs.

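
The stopping rule described above can be sketched as a patience counter. This is an illustrative reading, assuming "not improving" means the validation loss did not fall below the best value seen so far for 3 consecutive epochs. The function name and the loss values are hypothetical.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses, for illustration only.
losses = [0.120, 0.090, 0.070, 0.071, 0.075, 0.072, 0.060]
print(stopping_epoch(losses))  # 6
```

Here the best loss is reached at epoch 3, epochs 4 to 6 do not improve on it, and training stops at epoch 6 before the later improvement is ever seen.
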
### Training Data

Reviews are split into sentences, and 10,678 unique sentences are annotated for training.
Aspects are grouped under 4 top-level categories:

- **Restaurants** (Restaurants and Ambience merged): Atmosphere, Building, Location, Features, Hygiene, Kitchen, Recommendation, View, Decoration, Seating Plan, Options, Experience, General
- **Service** (Staff and Service merged): Behavior, Wait Time, General, Experience
- **Food** (Food and Drinks merged): Cuisine, Deals, Diet Options, Ingredients, Menu, Kitchen, Portion, Presentation, Price, Quality, Taste, Flavor, Recommendation, Experience, Dishes, General
- **General** (out-of-domain and contextless sentences): General


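
Since one sentence can carry several aspects, each training target can be represented as a multi-hot vector over the 34 category-aspect labels. The sketch below assumes the `CATEGORY-ASPECT` naming used by the class names in the classification report. The index order and the helper name `multi_hot` are hypothetical.

```python
# Aspect names follow the class names in the classification report.
ASPECTS = {
    "FOOD": [
        "CUISINE", "DEALS", "DIET_OPTION", "EXPERIENCE", "FLAVOR", "GENERAL",
        "INGREDIENT", "KITCHEN", "MEAL", "MENU", "PORTION", "PRESENTATION",
        "PRICE", "QUALITY", "RECOMMENDATION", "TASTE",
    ],
    "GENERAL": ["GENERAL"],
    "RESTAURANT": [
        "ATMOSPHERE", "BUILDING", "DECORATION", "EXPERIENCE", "FEATURES",
        "GENERAL", "HYGIENE", "KITCHEN", "LOCATION", "OPTIONS",
        "RECOMMENDATION", "SEATING_PLAN", "VIEW",
    ],
    "SERVICE": ["BEHAVIOUR", "EXPERIENCE", "GENERAL", "WAIT_TIME"],
}

# Flatten to "CATEGORY-ASPECT" labels; this index order is an assumption.
LABELS = [f"{cat}-{asp}" for cat, asps in ASPECTS.items() for asp in asps]
LABEL_TO_INDEX = {label: i for i, label in enumerate(LABELS)}


def multi_hot(active_labels):
    """Encode a set of active labels as a 0/1 vector over all labels."""
    vector = [0] * len(LABELS)
    for label in active_labels:
        vector[LABEL_TO_INDEX[label]] = 1
    return vector


target = multi_hot(["FOOD-TASTE", "FOOD-PRICE", "RESTAURANT-HYGIENE"])
print(len(LABELS), sum(target))  # 34 3
```

An unknown label raises a `KeyError`, which is a convenient way to catch annotation typos early.
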
#### Training Hyperparameters

- `lr`: 2e-5
- `eps`: 1e-8
- `batch_size`: 32

## Evaluation and Results

Classification report:

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| FOOD-CUISINE | 0.69 | 0.83 | 0.76 | 65 |
| FOOD-DEALS | 0.81 | 0.75 | 0.78 | 40 |
| FOOD-DIET_OPTION | 0.73 | 0.93 | 0.82 | 71 |
| FOOD-EXPERIENCE | 0.38 | 0.44 | 0.40 | 55 |
| FOOD-FLAVOR | 0.83 | 0.94 | 0.88 | 63 |
| FOOD-GENERAL | 0.65 | 0.78 | 0.71 | 141 |
| FOOD-INGREDIENT | 0.77 | 0.80 | 0.78 | 54 |
| FOOD-KITCHEN | 0.50 | 0.60 | 0.55 | 35 |
| FOOD-MEAL | 0.72 | 0.74 | 0.73 | 208 |
| FOOD-MENU | 0.80 | 0.89 | 0.84 | 136 |
| FOOD-PORTION | 0.90 | 0.91 | 0.90 | 76 |
| FOOD-PRESENTATION | 0.82 | 0.94 | 0.87 | 33 |
| FOOD-PRICE | 0.74 | 0.88 | 0.80 | 57 |
| FOOD-QUALITY | 0.61 | 0.66 | 0.63 | 102 |
| FOOD-RECOMMENDATION | 0.65 | 0.47 | 0.55 | 32 |
| FOOD-TASTE | 0.79 | 0.84 | 0.82 | 114 |
| GENERAL-GENERAL | 0.98 | 0.88 | 0.93 | 163 |
| RESTAURANT-ATMOSPHERE | 0.73 | 0.79 | 0.76 | 170 |
| RESTAURANT-BUILDING | 0.90 | 0.86 | 0.88 | 44 |
| RESTAURANT-DECORATION | 0.95 | 0.84 | 0.89 | 44 |
| RESTAURANT-EXPERIENCE | 0.67 | 0.60 | 0.63 | 189 |
| RESTAURANT-FEATURES | 0.55 | 0.76 | 0.64 | 75 |
| RESTAURANT-GENERAL | 0.45 | 0.49 | 0.47 | 47 |
| RESTAURANT-HYGIENE | 0.94 | 0.92 | 0.93 | 51 |
| RESTAURANT-KITCHEN | 0.82 | 0.85 | 0.84 | 33 |
| RESTAURANT-LOCATION | 0.59 | 0.78 | 0.67 | 69 |
| RESTAURANT-OPTIONS | 0.42 | 0.41 | 0.41 | 32 |
| RESTAURANT-RECOMMENDATION | 0.62 | 0.71 | 0.67 | 49 |
| RESTAURANT-SEATING_PLAN | 0.78 | 0.82 | 0.80 | 68 |
| RESTAURANT-VIEW | 0.80 | 0.88 | 0.84 | 42 |
| SERVICE-BEHAVIOUR | 0.65 | 0.87 | 0.74 | 127 |
| SERVICE-EXPERIENCE | 0.31 | 0.24 | 0.27 | 21 |
| SERVICE-GENERAL | 0.74 | 0.81 | 0.77 | 162 |
| SERVICE-WAIT_TIME | 0.86 | 0.85 | 0.86 | 94 |
| micro avg | 0.72 | 0.78 | 0.75 | 2762 |
| macro avg | 0.71 | 0.76 | 0.73 | 2762 |
| weighted avg | 0.73 | 0.78 | 0.75 | 2762 |
| samples avg | 0.75 | 0.78 | 0.75 | 2762 |

Accuracy: 0.9801993831240361 (this value matches label-wise accuracy, i.e. the fraction of correct per-label decisions across all 34 labels and the 2,136 evaluation sentences counted in the confusion matrices, rather than exact-match accuracy per sentence)

Confusion matrix (one 2×2 matrix per class, in the same order as the classification report; each matrix `[[a, b], [c, d]]` is read as `[[TN, FP], [FN, TP]]`, which is consistent with the reported precision, recall, and support):

| Class | TN | FP | FN | TP |
|---|---|---|---|---|
| FOOD-CUISINE | 2047 | 24 | 11 | 54 |
| FOOD-DEALS | 2089 | 7 | 10 | 30 |
| FOOD-DIET_OPTION | 2041 | 24 | 5 | 66 |
| FOOD-EXPERIENCE | 2041 | 40 | 31 | 24 |
| FOOD-FLAVOR | 2061 | 12 | 4 | 59 |
| FOOD-GENERAL | 1936 | 59 | 31 | 110 |
| FOOD-INGREDIENT | 2069 | 13 | 11 | 43 |
| FOOD-KITCHEN | 2080 | 21 | 14 | 21 |
| FOOD-MEAL | 1869 | 59 | 55 | 153 |
| FOOD-MENU | 1969 | 31 | 15 | 121 |
| FOOD-PORTION | 2052 | 8 | 7 | 69 |
| FOOD-PRESENTATION | 2096 | 7 | 2 | 31 |
| FOOD-PRICE | 2061 | 18 | 7 | 50 |
| FOOD-QUALITY | 1991 | 43 | 35 | 67 |
| FOOD-RECOMMENDATION | 2096 | 8 | 17 | 15 |
| FOOD-TASTE | 1997 | 25 | 18 | 96 |
| GENERAL-GENERAL | 1970 | 3 | 19 | 144 |
| RESTAURANT-ATMOSPHERE | 1917 | 49 | 36 | 134 |
| RESTAURANT-BUILDING | 2088 | 4 | 6 | 38 |
| RESTAURANT-DECORATION | 2090 | 2 | 7 | 37 |
| RESTAURANT-EXPERIENCE | 1890 | 57 | 75 | 114 |
| RESTAURANT-FEATURES | 2015 | 46 | 18 | 57 |
| RESTAURANT-GENERAL | 2061 | 28 | 24 | 23 |
| RESTAURANT-HYGIENE | 2082 | 3 | 4 | 47 |
| RESTAURANT-KITCHEN | 2097 | 6 | 5 | 28 |
| RESTAURANT-LOCATION | 2029 | 38 | 15 | 54 |
| RESTAURANT-OPTIONS | 2086 | 18 | 19 | 13 |
| RESTAURANT-RECOMMENDATION | 2066 | 21 | 14 | 35 |
| RESTAURANT-SEATING_PLAN | 2052 | 16 | 12 | 56 |
| RESTAURANT-VIEW | 2085 | 9 | 5 | 37 |
| SERVICE-BEHAVIOUR | 1950 | 59 | 17 | 110 |
| SERVICE-EXPERIENCE | 2104 | 11 | 16 | 5 |
| SERVICE-GENERAL | 1927 | 47 | 30 | 132 |
| SERVICE-WAIT_TIME | 2029 | 13 | 14 | 80 |

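
The reported accuracy and micro averages can be recomputed from the confusion matrices above. The sketch below assumes each matrix is laid out as `[[TN, FP], [FN, TP]]` and listed in the same class order as the classification report. The variable names are illustrative and not taken from the original evaluation code.

```python
# (TN, FP, FN, TP) per class, copied from the confusion matrices above.
COUNTS = [
    (2047, 24, 11, 54), (2089, 7, 10, 30), (2041, 24, 5, 66),
    (2041, 40, 31, 24), (2061, 12, 4, 59), (1936, 59, 31, 110),
    (2069, 13, 11, 43), (2080, 21, 14, 21), (1869, 59, 55, 153),
    (1969, 31, 15, 121), (2052, 8, 7, 69), (2096, 7, 2, 31),
    (2061, 18, 7, 50), (1991, 43, 35, 67), (2096, 8, 17, 15),
    (1997, 25, 18, 96), (1970, 3, 19, 144), (1917, 49, 36, 134),
    (2088, 4, 6, 38), (2090, 2, 7, 37), (1890, 57, 75, 114),
    (2015, 46, 18, 57), (2061, 28, 24, 23), (2082, 3, 4, 47),
    (2097, 6, 5, 28), (2029, 38, 15, 54), (2086, 18, 19, 13),
    (2066, 21, 14, 35), (2052, 16, 12, 56), (2085, 9, 5, 37),
    (1950, 59, 17, 110), (2104, 11, 16, 5), (1927, 47, 30, 132),
    (2029, 13, 14, 80),
]

# Pool the counts over all classes (micro averaging).
tn = sum(c[0] for c in COUNTS)
fp = sum(c[1] for c in COUNTS)
fn = sum(c[2] for c in COUNTS)
tp = sum(c[3] for c in COUNTS)

# Label-wise accuracy: correct per-label decisions over all decisions.
label_accuracy = (tn + tp) / (tn + fp + fn + tp)

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = 2 * tp / (2 * tp + fp + fn)

print(round(label_accuracy, 4))  # 0.9802
print(round(micro_precision, 2), round(micro_recall, 2), round(micro_f1, 2))  # 0.72 0.78 0.75
```

The pooled counts reproduce both the reported accuracy and the micro-average row. Because true negatives dominate every class, the 0.98 accuracy is much higher than the micro F1 of 0.75, so F1 is the more informative summary of this model.
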
Average validation loss: 0.06330019191129883

## Model Card Authors

Ali Haider

## Model Card Contact

+923068983139