---
title: Question Generator
emoji: π
colorFrom: yellow
colorTo: yellow
sdk: streamlit
sdk_version: "1.10.0"
app_file: app.py
pinned: false
---

# Internship-IVIS-labs

- The *Intelligent Question Generator* app is an easy-to-use interface built in Streamlit that uses [KeyBERT](https://github.com/MaartenGr/KeyBERT), [sense2vec](https://github.com/explosion/sense2vec), and [T5](https://huggingface.co/ramsrigouthamg/t5_paraphraser) to generate questions from text.
- For keyword extraction, it uses KeyBERT, a minimal technique that leverages multiple NLP embeddings and relies on [Transformers](https://huggingface.co/transformers/) 🤗 to create keywords/keyphrases that are most similar to a document.
- [sense2vec](https://github.com/explosion/sense2vec) (Trask et al., 2015) is a twist on word2vec that learns a separate vector for each sense of a word, which gives more interesting and detailed word vectors.
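
The keyword extraction step described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this repository: `extract_keywords` and `keywords_only` are hypothetical helper names, the n-gram range, `top_n`, and `min_score` values are arbitrary choices, and KeyBERT downloads its default embedding model the first time it is used.

```python
from typing import List, Tuple


def extract_keywords(text: str, top_n: int = 5) -> List[Tuple[str, float]]:
    """Return (keyphrase, similarity score) pairs for `text` using KeyBERT."""
    # Imported lazily so this module can be loaded without KeyBERT installed.
    from keybert import KeyBERT

    model = KeyBERT()  # assumption: the default embedding model is good enough
    return model.extract_keywords(
        text,
        keyphrase_ngram_range=(1, 2),  # single words and two-word phrases
        stop_words="english",
        top_n=top_n,
    )


def keywords_only(scored: List[Tuple[str, float]], min_score: float = 0.0) -> List[str]:
    """Keep the phrases whose score is at least `min_score`, in their original order."""
    return [phrase for phrase, score in scored if score >= min_score]
```

For example, `keywords_only(extract_keywords(document), min_score=0.3)` would keep only the keyphrases that KeyBERT scores as reasonably close to the document.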

## Repository Breakdown

### src Directory
---

- `src/Pipeline/QAhaystack.py`: This file contains the code for question answering using [Haystack](https://haystack.deepset.ai/overview/intro).
- `src/Pipeline/QuestGen.py`: This file contains the code for question generation.
- `src/Pipeline/Reader.py`: This file contains the code for reading the document.
- `src/Pipeline/TextSummariztion.py`: This file contains the code for text summarization.
- `src/PreviousVersionCode/context.py`: This file contains the code for finding the context of a paragraph.
- `src/PreviousVersionCode/QuestionGenerator.py`: This file contains the code from the first attempt at question generation.

## Installation

```shell
$ git clone https://github.com/HemanthSai7/Internship-IVIS-labs.git
```

```shell
$ cd Internship-IVIS-labs
```

```shell
$ pip install -r requirements.txt
```

- When running the app locally for the first time, uncomment the lines in `src/Pipeline/QuestGen.py` that download the models to the models directory.

```shell
$ streamlit run app.py
```

- Once the app is running, you can access it at http://localhost:8501

```shell
You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://192.168.0.103:8501
```

## Tech Stack Used

[Images: ten tech stack badges]

## Timeline

### Week 1-2:

#### Tasks

- [x] Understanding and brushing up on NLP concepts.
- [x] Extracting images and text from a PDF file and storing it in a text file.
- [x] Exploring various open-source tools for generating questions from a given text.
- [x] Reading papers related to the project (BERT, T5, RoBERTa, etc.).
- [x] Summarizing the text extracted from the PDF file using the pre-trained T5 base model.
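
The summarization task in the list above could look roughly like the sketch below. It is not taken from `src/Pipeline/TextSummariztion.py`: the helper names `chunk_words` and `summarize`, the chunk size, and the generation settings are all assumptions, and the `t5-base` checkpoint is downloaded from the Hugging Face Hub on first use. Long text is split into chunks because T5 truncates inputs beyond its maximum length.

```python
from typing import List


def chunk_words(text: str, max_words: int = 300) -> List[str]:
    """Split `text` into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize(text: str, model_name: str = "t5-base") -> str:
    """Summarize each chunk with T5 and join the partial summaries."""
    # Imported lazily so chunk_words can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    summaries = []
    for chunk in chunk_words(text):
        # T5 selects its task from a text prefix; "summarize: " requests a summary.
        inputs = tokenizer(
            "summarize: " + chunk,
            return_tensors="pt",
            truncation=True,
            max_length=512,
        )
        output_ids = model.generate(**inputs, max_length=80, num_beams=4)
        summaries.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    return " ".join(summaries)
```

Summarizing chunk by chunk keeps every part of a long document within the model's input limit, at the cost of a summary that is a concatenation of partial summaries.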

### Week 3-4:

#### Tasks

- [x] Understanding the concept of QA systems.
- [x] Created a basic script for generating questions from the text.
- [x] Created a basic script for finding the context of the paragraph.

### Week 5-6:

#### Tasks

- [x] Understanding how Transformer models work for NLP tasks such as question answering and question generation.
- [x] Understanding how to use the Haystack library for QA systems.
- [x] Understanding how to use the Haystack library for question generation.
- [x] Preprocessed the document for Haystack QA to get better results.

### Week 7-8:

#### Tasks

- [x] Understanding how to generate questions intelligently.
- [x] Explored WordNet to find synonyms.
- [x] Used BertWSD to disambiguate word senses in the provided sentence.
- [x] Used KeyBERT for finding the keywords in the document.
- [x] Used sense2vec for finding words that are highly related to the generated keywords.
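
The sense2vec task in the list above might be sketched as follows. The helper names `clean_related` and `related_words` are hypothetical, and the `s2v_old` path is an assumption about where pretrained sense2vec vectors have been unpacked. This is not the code from this repository.

```python
from typing import List, Tuple


def clean_related(keyword: str, candidates: List[Tuple[str, float]], limit: int = 5) -> List[str]:
    """Turn sense2vec keys such as 'machine_learning|NOUN' into plain words.

    Drops the keyword itself and case-insensitive duplicates, keeping input order.
    """
    seen = {keyword.lower()}
    result: List[str] = []
    for key, _score in candidates:
        if len(result) >= limit:
            break
        word = key.split("|")[0].replace("_", " ")
        if word.lower() not in seen:
            seen.add(word.lower())
            result.append(word)
    return result


def related_words(keyword: str, vectors_path: str = "s2v_old", n: int = 10) -> List[str]:
    """Look up words related to `keyword` in pretrained sense2vec vectors."""
    # Imported lazily so clean_related can be used without sense2vec installed.
    from sense2vec import Sense2Vec

    s2v = Sense2Vec().from_disk(vectors_path)
    key = s2v.get_best_sense(keyword.replace(" ", "_"))
    if key is None:  # the keyword is not in the vector table
        return []
    return clean_related(keyword, s2v.most_similar(key, n=n))
```

Filtering out case variants of the keyword matters because the nearest neighbours of a word in sense2vec often include the same word with a different capitalization or sense tag.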

### Week 9-10:

#### Tasks

- [x] Created a Streamlit app to demonstrate the project.