AdithyaSK committed on
Commit
a95af80
1 Parent(s): 0822dde

updated content on different tabs - Adithya S K

Files changed (1): app.py (+134, -18)
app.py CHANGED
@@ -153,34 +153,150 @@ def main():
  # About tab
  with About_tab:
  st.markdown('''
- ### About Indic LLM Leaderboard
-
- ### Indic Eval
-
- ### Contribute
  ''')

  # FAQ tab
  with FAQ_tab:
  st.markdown('''
- ### FAQ
-
- ### SUBMISSIONS
-
-
- ### RESULTS
-
-
- ### EDITING SUBMISSIONS
-
-
- ### OTHER
  ''')

  # Submit tab
  with Submit_tab:
  st.markdown('''
- ### Submit Your Model
  ''')

  # About tab
  with About_tab:
  st.markdown('''
+ ## **Why an Indic LLM Leaderboard is Required**
+
+ In recent months, there has been considerable progress in the Indic large language model (LLM) space. Major startups like Sarvam and Krutrim are building LLMs in this area.
+ Simultaneously, the open-source community is adapting pretrained models, such as Llama, Mistral, and Gemma, for Indic languages.
+ Despite the influx of new models, there is no unified way to evaluate and compare them, which makes it challenging to track progress and determine what is working and what is not.
+
+ > This is the alpha release of the Indic LLM Leaderboard, and modifications will be made to the leaderboard in the future.
+ >
+
+ ## **Who We Are**
+
+ I'm [Adithya S K](https://linktr.ee/adithyaskolavi), the founder of [CognitiveLab](https://www.cognitivelab.in/). We provide AI solutions at scale and take on research-driven work.
+
+ One initiative we have taken is to create a unified platform where Indic LLMs can be compared using specially crafted datasets. Although initially developed for internal use, we are now open-sourcing this framework to further aid the Indic LLM ecosystem.
+
+ After releasing [Ambari, a 7B-parameter English-Kannada bilingual LLM](https://www.cognitivelab.in/blog/introducing-ambari), we wanted to compare it with other open-source LLMs to identify areas for improvement. As there was no existing solution, we built the Indic LLM suite, which consists of three projects:
+
+ - [Indic-llm](https://github.com/adithya-s-k/Indic-llm): An open-source framework designed to adapt pretrained LLMs, such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.
+ - [Indic-Eval](https://github.com/adithya-s-k/indic_eval): A lightweight evaluation suite tailored to assessing Indic LLMs across a diverse range of tasks, aiding performance assessment and comparison within the Indian language context.
+ - [Indic LLM Leaderboard](https://huggingface.co/spaces/Cognitive-Lab/indic_llm_leaderboard): Uses the [indic_eval](https://github.com/adithya-s-k/indic_eval) evaluation framework, incorporating state-of-the-art translated benchmarks such as ARC, HellaSwag, and MMLU, among others. Supporting seven Indic languages, it offers a comprehensive platform for assessing model performance and comparing results within the Indic language modeling landscape.
+
+ ## **Upcoming Implementations**
+
+ - [ ] vLLM support for faster evaluation and inference
+ - [ ] SkyPilot integration to quickly run indic_eval on any cloud provider
+ - [ ] On-board evaluation support, just like the Open LLM Leaderboard
+
+ ## **Contribute**
+
+ All the projects are fully open source under different licenses, so anyone can contribute.
+
+ The current leaderboard is an alpha release, and many more changes are forthcoming:
+
+ - More robust benchmarks tailored to Indic languages.
+ - Easier integration with [indic_eval](https://github.com/adithya-s-k/indic_eval).
  ''')

  # FAQ tab
  with FAQ_tab:
  st.markdown('''
+ **What is the minimum requirement for GPUs to run the evaluation?**
+
+ - The evaluation runs easily on a single A100 GPU, and the framework also supports multi-GPU evaluation to speed up the process, as sketched below.
+
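+ For illustration, a multi-GPU run might look like the following (a sketch assuming 4 GPUs on one node; `--multi_gpu` and `--num_processes` are standard `accelerate launch` flags, while the script and its arguments are the ones shown in the Submit tab):
+
+ ```bash
+ # Sketch: run the evaluation across 4 GPUs on a single node
+ accelerate launch --multi_gpu --num_processes=4 run_indic_evals_accelerate.py \\
+ --model_args="pretrained=<path to model on the hub>" \\
+ --tasks indic_llm_leaderboard \\
+ --output_dir output_dir
+ ```
+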
+ **What languages are supported by the evaluation framework?**
+
+ - The following languages are supported by default: English, Kannada, Hindi, Tamil, Telugu, Gujarati, Marathi, and Malayalam.
+
+ **How can I put my model on the leaderboard?**
+
+ - Please follow the steps shown in the Submit tab, or refer to [indic_eval](https://github.com/adithya-s-k/indic_eval) for more details.
+
+ **How does the leaderboard work?**
+
+ - After you run indic_eval on the model of your choice, the results are pushed to a server and stored in a database. The leaderboard frontend queries the server for the latest models in the database, along with their benchmark scores and metadata. The entire system is deployed in India and is kept as secure as possible.
+
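+ Conceptually, what the frontend does amounts to a request like the one below (the endpoint is purely hypothetical; the real server URL is not part of this document):
+
+ ```bash
+ # Hypothetical sketch: fetch the latest leaderboard entries as JSON
+ curl -s "https://<leaderboard-server>/api/models/latest" | python -m json.tool
+ ```
+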
+ **How is it different from the Open LLM Leaderboard?**
+
+ - This project was mainly inspired by the Open LLM Leaderboard. However, because our computational resources are limited, we standardized on a common evaluation library and set of benchmarks: you run the evaluation on your own GPUs, and the leaderboard serves as a unified platform to compare the models. We used IndicTrans2 and other translation APIs to translate the benchmark datasets into seven Indian languages, to ensure reliability and consistency in the output.
+
+ **Why does it take so much time to load the results?**
+
+ - We run the server on a serverless instance that suffers from cold starts, so loading can sometimes take a while.
+
+ **What benchmarks are offered?**
+
+ - The current Indic benchmarks offered by the indic_eval library can be found in [this collection](https://huggingface.co/collections/Cognitive-Lab/indic-llm-leaderboard-eval-suite-660ac4818695a785edee4e6f). They include ARC Easy, ARC Challenge, HellaSwag, BoolQ, and MMLU.
+
+ **How much time does it take to run the evaluation using indic_eval?**
+
+ - Evaluation time varies with the GPU you run on.
+ - In our testing, the whole evaluation takes 3 to 4 hours on a single GPU.
+ - It's much faster when using multiple GPUs.
+
+ **How does the verification step happen?**
+
+ - While running the evaluation, you are given the option to push results to the leaderboard with `--push_to_leaderboard <[email protected]>`, where you provide an email address through which we can contact you. If we find any anomaly in an evaluation score, we will contact you through this email to verify the results.
  ''')

  # Submit tab
  with Submit_tab:
  st.markdown('''
+ Here are the steps to put your model on the Indic LLM Leaderboard.
+
+ Clone the repo:
+
+ ```bash
+ git clone https://github.com/adithya-s-k/indic_eval
+ cd indic_eval
+ ```
+
+ Create a virtual environment using virtualenv or conda, depending on your preference. We require Python 3.10 or above:
+
+ ```bash
+ conda create -n indic-eval-venv python=3.10 && conda activate indic-eval-venv
+ ```
+
+ Install the dependencies. For the default installation, you just need:
+
+ ```bash
+ pip install .
+ ```
+
+ If you want to evaluate models with frameworks like `accelerate` or `peft`, you will need to specify the optional dependency groups that fit your use case (`accelerate`, `tgi`, `optimum`, `quantization`, `adapters`, `nanotron`):
+
+ ```bash
+ pip install '.[optional1,optional2]'
+ ```
+
+ The most-tested setup is:
+
+ ```bash
+ pip install '.[accelerate,quantization,adapters]'
+ ```
+
+ If you want to push your results to the Hugging Face Hub, don't forget to add your access token to the environment variable `HUGGING_FACE_HUB_TOKEN`. You can do this by running:
+
+ ```bash
+ huggingface-cli login
+ ```
+
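+ Alternatively, you can set the variable directly in your shell (a sketch; substitute your own token for the placeholder):
+
+ ```bash
+ # Make the Hugging Face token available for the current shell session
+ export HUGGING_FACE_HUB_TOKEN=<your_hf_access_token>
+ ```
+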
+ ## Command to Run Indic Eval and Push to the Indic LLM Leaderboard
+
+ ```bash
+ accelerate launch run_indic_evals_accelerate.py \\
+ --model_args="pretrained=<path to model on the hub>" \\
+ --tasks indic_llm_leaderboard \\
+ --output_dir output_dir \\
+ --push_to_leaderboard <[email protected]>
+ ```
+
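+ As a concrete usage example (the model id below is purely illustrative; substitute the Hub id of the model you want to evaluate):
+
+ ```bash
+ # Illustrative run against a public Hub checkpoint
+ accelerate launch run_indic_evals_accelerate.py \\
+ --model_args="pretrained=mistralai/Mistral-7B-v0.1" \\
+ --tasks indic_llm_leaderboard \\
+ --output_dir output_dir \\
+ --push_to_leaderboard <[email protected]>
+ ```
+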
+ It's as simple as that. 👍
+
+ For `--push_to_leaderboard`, provide an email address through which we can contact you in case verification is needed. This email won't be shared anywhere; it's required only to verify the model's scores and confirm authenticity.
+
+ Once all the required packages are installed, the command above is all you need. For multi-GPU configuration, please refer to the docs of [Indic_Eval](https://github.com/adithya-s-k/indic_eval).
  ''')
