Tanmay Jain
committed on
Commit
·
289b1c0
1
Parent(s):
aebbdc1
minor text changes
README.md
CHANGED
@@ -10,4 +10,54 @@ pinned: false
|
|
10 |
short_description: AI tool turning Academic papers into podcasts
|
11 |
---
|
12 |
|
13 |
-
10 |
short_description: AI tool turning Academic papers into podcasts
|
11 |
---
|
12 |
|
13 |
+
# AI Research Companion - Transforming Research Papers into Podcasts
|
14 |
+
|
15 |
+
## Overview
|
16 |
+
The AI Research Companion makes academic research more accessible by turning text-heavy research papers into audio podcasts, so users can listen to academic content instead of reading it.
|
17 |
+
|
18 |
+
This project was initially developed during the Smart India Hackathon (SIH) in 2023 to address the challenge of keeping up with a large volume of research papers. It uses large language models (LLMs) to extract relevant text, generate readable transcripts, and convert them into audio podcasts.
|
19 |
+
|
20 |
+
## Features
|
21 |
+
- **Text Extraction:** Extracts content from uploaded PDFs to create clean, readable text.
|
22 |
+
- **Transcript Generation:** Uses AI to generate a coherent transcript from the extracted text.
|
23 |
+
- **TTS (Text-to-Speech):** Converts the refined transcript into an audio file.
|
24 |
+
- **Editable Transcript:** Users can modify the transcript before converting it into audio, allowing for better control over the final output.
|
25 |
+
- **Audio Output:** Plays back the final podcast generated from the research paper.
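
The text-extraction feature above mentions producing "clean, readable text". The following is a minimal sketch of what such a cleanup step might look like, using only the standard library. The function name `clean_extracted_text` and the specific rules are assumptions for illustration; the project's actual extraction code is not shown in this commit.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Normalize raw PDF text into paragraphs (hypothetical helper)."""
    # Rejoin words that were hyphenated across a line break,
    # e.g. "lan-\nguage" becomes "language".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Blank lines are treated as paragraph boundaries.
    paragraphs = re.split(r"\n\s*\n", text)
    # Within a paragraph, collapse line breaks and repeated whitespace
    # into single spaces.
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    # Drop empty paragraphs and separate the rest with one blank line.
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "Large lan-\nguage models\nare useful.\n\n\n  Second   para. "
    print(clean_extracted_text(sample))
```

Note that the hyphen rule is deliberately naive: it would also join a genuine compound that happens to break at a hyphen, so a real pipeline may need a dictionary check or a more careful heuristic.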
|
26 |
+
|
27 |
+
## Development Status
|
28 |
+
The tool is still under development, with plans to:
|
29 |
+
- Integrate web search capabilities to find related research.
|
30 |
+
- Explore additional Text-to-Speech engines to enhance the audio output.
|
31 |
+
|
32 |
+
## Requirements
|
33 |
+
- Python 3.7 or higher
|
34 |
+
- Gradio
|
35 |
+
- API access to one or more LLM providers (configured in the `config` directory)
|
36 |
+
- Edge TTS for audio generation
|
37 |
+
|
38 |
+
## Setup Instructions
|
39 |
+
1. Clone this repository to your local machine:
|
40 |
+
```bash
|
41 |
+
git clone <repository_url>
|
42 |
+
```
|
43 |
+
2. Install the required dependencies:
|
44 |
+
```bash
|
45 |
+
pip install -r requirements.txt
|
46 |
+
```
|
47 |
+
3. Set up API keys for the LLM models in the `config` directory.
|
48 |
+
|
49 |
+
## Usage
|
50 |
+
1. **Upload PDF:** Start by uploading a research paper in PDF format.
|
51 |
+
2. **Select Model:** Choose the text model for processing the document.
|
52 |
+
3. **Text Preview:** Preview the extracted text before proceeding.
|
53 |
+
4. **Transcript Preview:** Review the generated transcript and make edits if needed.
|
54 |
+
5. **TTS Output:** After finalizing the transcript, generate the audio podcast from the text.
|
55 |
+
|
56 |
+
## Note
|
57 |
+
This tool calls hosted LLM APIs by default. If a suitable GPU is available, you can switch the API base to a local model server such as Ollama and run the models locally.
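
As a rough sketch of how switching the API base might work, the helper below picks between a hosted endpoint and a local Ollama server. The environment variable names (`USE_LOCAL_LLM`, `LLM_API_BASE`, `LLM_API_KEY`) and the function `resolve_llm_endpoint` are hypothetical and not taken from this repository's `config` directory. The only external fact assumed is that Ollama serves an OpenAI-compatible API at `http://localhost:11434/v1` by default and ignores the API key value.

```python
import os

# Default address of Ollama's OpenAI-compatible endpoint on a local machine.
OLLAMA_API_BASE = "http://localhost:11434/v1"


def resolve_llm_endpoint(env=None):
    """Return the API base and key to use (hypothetical helper)."""
    env = os.environ if env is None else env
    use_local = env.get("USE_LOCAL_LLM", "").lower() in {"1", "true", "yes"}
    if use_local:
        # Ollama does not check the key, but OpenAI-style clients
        # usually require a non-empty value.
        return {
            "api_base": env.get("LLM_API_BASE", OLLAMA_API_BASE),
            "api_key": "ollama",
        }
    # Hosted mode: both settings must be provided explicitly.
    missing = [k for k in ("LLM_API_BASE", "LLM_API_KEY") if k not in env]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return {"api_base": env["LLM_API_BASE"], "api_key": env["LLM_API_KEY"]}


if __name__ == "__main__":
    print(resolve_llm_endpoint({"USE_LOCAL_LLM": "true"}))
```

Keeping the choice in one function means the rest of the app can pass the returned values to whichever OpenAI-compatible client it uses without knowing whether the model is hosted or local.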
|
58 |
+
|
59 |
+
## Acknowledgements
|
60 |
+
Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts that guide this project.
|
61 |
+
|
62 |
+
## License
|
63 |
+
This project is open source under the MIT License. Feel free to contribute and improve the tool.
|
app.py
CHANGED
@@ -104,12 +104,11 @@ with gr.Blocks(theme=custom_theme) as app:
|
|
104 |
gr.Markdown("""
|
105 |
|
106 |
## Project Background
|
107 |
-
This project
|
108 |
|
109 |
-
Development is
|
110 |
|
111 |
-
This AI Research Companion
|
112 |
-
This page allows users to upload their research papers in PDF format to initiate the conversion process.
|
113 |
""")
|
114 |
|
115 |
with gr.Row():
|
|
|
104 |
gr.Markdown("""
|
105 |
|
106 |
## Project Background
|
107 |
+
This project started during the Smart India Hackathon (SIH) as a solution to a challenge I personally faced: keeping up with the overwhelming influx of research papers. Realizing how intense and time-consuming this was, I thought an AI-driven tool could make academic content more accessible. By leveraging large language models, this tool converts dense research into easily understandable audio, offering an easier way for everyone to engage with academic material.
|
108 |
|
109 |
+
Development is ongoing, with future plans to add web search and additional TTS options for a richer experience. Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts behind this project.
|
110 |
|
111 |
+
This AI Research Companion bridges the gap between research and accessibility, transforming detailed papers into podcasts for more convenient, on-the-go learning. Just upload a PDF to start the conversion.
|
|
|
112 |
""")
|
113 |
|
114 |
with gr.Row():
|