---
license: osl-3.0
model-index:
- name: indus_1.175B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 22.7
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/ProjectIndus
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 25.04
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/indus_1.175B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 23.12
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/indus_1.175B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 0.0
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/indus_1.175B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 49.57
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/indus_1.175B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 0.0
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/indus_1.175B
      name: Open LLM Leaderboard
---

# Model Card for Indus

Indus is a model pretrained on Hindi and its dialects and subsequently instruction-tuned.
# Table of Contents

- [Model Card for Indus](#model-card-for-indus)
- [Table of Contents](#table-of-contents)
- [Model Details](#model-details)
- [Model Description](#model-description)
- [Uses](#uses)
- [Direct Use](#direct-use)
- [Downstream Use [Optional]](#downstream-use-optional)
- [Out-of-Scope Use](#out-of-scope-use)
- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
- [Recommendations](#recommendations)
- [Training Details](#training-details)
- [Training Data](#training-data)
- [Training Procedure](#training-procedure)
- [Preprocessing](#preprocessing)
- [Speeds, Sizes, Times](#speeds-sizes-times)
- [Evaluation](#evaluation)
- [Testing Data, Factors & Metrics](#testing-data-factors--metrics)
- [Testing Data](#testing-data)
- [Factors](#factors)
- [Metrics](#metrics)
- [Results](#results)
- [Model Examination](#model-examination)
- [Environmental Impact](#environmental-impact)
- [Technical Specifications [optional]](#technical-specifications-optional)
- [Model Architecture and Objective](#model-architecture-and-objective)
- [Compute Infrastructure](#compute-infrastructure)
- [Hardware](#hardware)
- [Software](#software)
- [Citation](#citation)
- [Glossary [optional]](#glossary-optional)
- [More Information [optional]](#more-information-optional)
- [Model Card Authors [optional]](#model-card-authors-optional)
- [Model Card Contact](#model-card-contact)
- [How to Get Started with the Model](#how-to-get-started-with-the-model)

# Model Details

## Model Description

Indus is a foundational language model pretrained on Hindi and its dialects and then instruction-tuned.

- **Developed by:** Nikhil Malhotra, Nilesh Brahme, Satish Mishra, Vinay Sharma (Makers Lab, TechMahindra)
- **Model type:** Foundational language model
- **Language(s) (NLP):** hin, bho, mai, doi
- **License:** other
- **Parent Model:** None; it is a ground-up model built on the GPT-2 architecture, from the tokenizer through the decoder
- **Resources for more information:** https://www.techmahindra.com/en-in/innovation/the-indus-project/

# Uses

Uses include question answering and conversation in Hindi and its dialects. The model will be reward-tuned for use across various industries:

1. Call center
2. Healthcare
3. Automotive
4. Telecom

## Direct Use

Direct use is as a foundational model for Hindi and its dialects.

## Downstream Use [Optional]

Downstream uses include question answering and conversation in Hindi and its dialects. The model will be reward-tuned for use across various industries:

1. Call center
2. Healthcare
3. Automotive
4. Telecom

## Out-of-Scope Use

The model cannot currently be used for fill-in-the-blank tasks, multiple-choice Q&A, and similar tasks.

# Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

We have tried to mitigate various biases by removing them from the training data. However, because the model is generative, it may still hallucinate. Any disturbing or harmful stereotype produced by the model is purely unintentional and coincidental.
## Recommendations

We recommend not prompting the model with biased content or negative connotations.

# Training Details

## Training Data

More information on training data needed

## Training Procedure

### Preprocessing

More information needed

### Speeds, Sizes, Times

More information needed

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

More information needed

### Factors

More information needed

### Metrics

More information needed

## Results

More information needed

# Model Examination

More information needed

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed

# Technical Specifications [optional]

## Model Architecture and Objective

More information needed

## Compute Infrastructure

More information needed

### Hardware

More information needed

### Software

More information needed

# Citation

**BibTeX:**

More information needed

**APA:**

More information needed

# Glossary [optional]

More information needed

# More Information [optional]

More information needed

# Model Card Authors [optional]

Nikhil Malhotra, Nilesh Brahme, Vinay Sharma, Satish Mishra

# Model Card Contact

More information needed

# How to Get Started with the Model

Use the code below to get started with the model.

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="nickmalhotra/Indus_1.175B")

# Load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nickmalhotra/Indus_1.175B")
model = AutoModelForCausalLM.from_pretrained("nickmalhotra/Indus_1.175B")
```
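Once the checkpoint is loaded, the usual next step is a short generation loop. The sketch below is illustrative, not an official recipe from the authors: it reuses the repository ID from the snippet above, and the Hindi prompt and decoding parameters (`max_new_tokens`, `temperature`, `top_p`) are assumptions you should tune for your own use case.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumption: same repository ID as in the loading example above.
model_id = "nickmalhotra/Indus_1.175B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative Hindi prompt ("What is the capital of India?"); replace with your own text.
prompt = "भारत की राजधानी क्या है?"
inputs = tokenizer(prompt, return_tensors="pt")

# Generic sampling settings; these are placeholders, not recommended values.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is instruction-tuned, its expected prompt format is not documented in this card; if the tokenizer ships a chat template, `tokenizer.apply_chat_template` may give better results than a raw prompt.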
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_nickmalhotra__indus_1.175B).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 20.07 |
| AI2 Reasoning Challenge (25-Shot) | 22.70 |
| HellaSwag (10-Shot)               | 25.04 |
| MMLU (5-Shot)                     | 23.12 |
| TruthfulQA (0-shot)               |  0.00 |
| Winogrande (5-shot)               | 49.57 |
| GSM8k (5-shot)                    |  0.00 |
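The `Avg.` row is consistent with a simple unweighted mean of the six benchmark scores. A minimal sketch, using only the values hard-coded from the table above:

```python
# Per-task scores copied from the leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 22.70,
    "HellaSwag (10-Shot)": 25.04,
    "MMLU (5-Shot)": 23.12,
    "TruthfulQA (0-shot)": 0.00,
    "Winogrande (5-shot)": 49.57,
    "GSM8k (5-shot)": 0.00,
}

# Unweighted mean across the six tasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 20.07
```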