Tanmay Jain committed on
Commit 289b1c0 · 1 Parent(s): aebbdc1

minor text changes

Files changed (2)
  1. README.md +51 -1
  2. app.py +3 -4
README.md CHANGED
@@ -10,4 +10,54 @@ pinned: false
  short_description: AI tool turning Academic papers into podcasts
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # AI Research Companion - Transforming Research Papers into Podcasts
+
+ ## Overview
+ The AI Research Companion is an innovative tool designed to make academic research more accessible. It transforms complex, text-heavy research papers into audio podcasts, enabling users to consume academic content in a more engaging and convenient way.
+
+ This project was initially developed during the Smart India Hackathon (SIH) in 2023 to address the overwhelming challenge of managing and understanding a large number of research papers. It leverages large language models (LLMs) to extract relevant text, generate readable transcripts, and convert these into audio podcasts.
+
+ ## Features
+ - **Text Extraction:** Extracts content from uploaded PDFs to create clean, readable text.
+ - **Transcript Generation:** Uses AI to generate a coherent transcript from the extracted text.
+ - **TTS (Text-to-Speech):** Converts the refined transcript into an audio file.
+ - **Editable Transcript:** Users can modify the transcript before converting it into audio, allowing for better control over the final output.
+ - **Audio Output:** Listen to the final generated podcast from the research paper.
+
+ ## Development Status
+ The tool is still under development with plans to:
+ - Integrate web search capabilities to find related research.
+ - Explore additional Text-to-Speech engines to enhance the audio output.
+
+ ## Requirements
+ - Python 3.7 or higher
+ - Gradio
+ - Various AI/LLM APIs (configured in the `config` directory)
+ - Edge TTS for audio generation
+
+ ## Setup Instructions
+ 1. Clone this repository to your local machine:
+ ```bash
+ git clone <repository_url>
+ ```
+ 2. Install the required dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+ 3. Set up API keys for the LLM models in the `config` directory.
+
+ ## Usage
+ 1. **Upload PDF:** Start by uploading a research paper in PDF format.
+ 2. **Select Model:** Choose the text model for processing the document.
+ 3. **Text Preview:** Preview the extracted text before proceeding.
+ 4. **Transcript Preview:** Review the generated transcript and make edits if needed.
+ 5. **TTS Output:** After finalizing the transcript, generate the audio podcast from the text.
+
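Review note: the five usage steps in the README could be wired together roughly as follows. This is only a sketch of the control flow, not the code in `app.py`. The names `run_pipeline`, `PodcastResult`, and the stage callables are hypothetical, and the real PDF parser, LLM client, and TTS engine are passed in as plain functions because the diff does not show how they are called.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PodcastResult:
    """Intermediate and final artifacts, mirroring the README's preview steps."""
    extracted_text: str  # step 3: text preview
    transcript: str      # step 4: transcript after any user edits
    audio_path: str      # step 5: location of the generated audio file


def run_pipeline(
    pdf_path: str,
    model_name: str,
    extract_text: Callable[[str], str],
    generate_transcript: Callable[[str, str], str],
    synthesize_audio: Callable[[str], str],
    edit_transcript: Optional[Callable[[str], str]] = None,
) -> PodcastResult:
    """Run extract -> transcript -> optional edit -> TTS for one PDF."""
    text = extract_text(pdf_path)
    if not text.strip():
        # Scanned or image-only PDFs may yield no text; fail early.
        raise ValueError(f"No text could be extracted from {pdf_path!r}")

    transcript = generate_transcript(text, model_name)

    # The README lets users revise the transcript before audio generation.
    if edit_transcript is not None:
        transcript = edit_transcript(transcript)

    audio_path = synthesize_audio(transcript)
    return PodcastResult(text, transcript, audio_path)


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without a PDF, API key, or TTS engine.
    result = run_pipeline(
        pdf_path="paper.pdf",
        model_name="demo-model",
        extract_text=lambda path: "Abstract: we study X.",
        generate_transcript=lambda text, model: f"[{model}] Today we discuss: {text}",
        synthesize_audio=lambda transcript: "podcast.mp3",
        edit_transcript=lambda t: t.replace("X", "attention mechanisms"),
    )
    print(result.transcript)
    print(result.audio_path)
```

Keeping each stage behind a callable means a different text model or TTS engine can be swapped in without touching the flow, which fits the README's stated plan to explore additional TTS engines.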
+ ## Note:
+ This tool uses APIs for LLMs, but if GPUs are available, you can easily switch the API base to local models like "ollama" for enhanced performance.
+
+ ## Acknowledgements
+ Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts that guide this project.
+
+ ## License
+ This project is open source under the MIT License. Feel free to contribute and improve the tool.
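Review note: the README's remark about switching the API base to a local runtime could be implemented along these lines. This is a sketch under several assumptions. The environment variable names (`USE_LOCAL_LLM`, `LOCAL_API_BASE`, `LOCAL_MODEL`, `LLM_API_BASE`, `LLM_API_KEY`, `LLM_MODEL`), the `LLMEndpoint` type, and the default model name are all hypothetical, and the project itself keeps its keys in the `config` directory rather than in environment variables. The one external fact relied on is that Ollama exposes an OpenAI-compatible API under `/v1` on port 11434 by default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default address of Ollama's OpenAI-compatible endpoint.
OLLAMA_BASE_URL = "http://localhost:11434/v1"


@dataclass(frozen=True)
class LLMEndpoint:
    """Connection settings an OpenAI-style client would need (hypothetical type)."""
    base_url: str
    api_key: str
    model: str


def resolve_endpoint(env: Optional[Mapping[str, str]] = None) -> LLMEndpoint:
    """Pick a hosted or local endpoint from configuration values."""
    env = os.environ if env is None else env

    if env.get("USE_LOCAL_LLM", "").lower() in {"1", "true", "yes"}:
        return LLMEndpoint(
            base_url=env.get("LOCAL_API_BASE", OLLAMA_BASE_URL),
            # OpenAI-style clients require a key; Ollama ignores its value.
            api_key="ollama",
            # Placeholder model name; use whatever model is pulled locally.
            model=env.get("LOCAL_MODEL", "llama3"),
        )

    missing = [k for k in ("LLM_API_BASE", "LLM_API_KEY", "LLM_MODEL") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing hosted LLM settings: {', '.join(missing)}")
    return LLMEndpoint(env["LLM_API_BASE"], env["LLM_API_KEY"], env["LLM_MODEL"])


if __name__ == "__main__":
    endpoint = resolve_endpoint({"USE_LOCAL_LLM": "true"})
    print(endpoint.base_url, endpoint.model)
```

Resolving the endpoint in one place keeps the rest of the app unaware of whether it is talking to a hosted API or a local server.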
app.py CHANGED
@@ -104,12 +104,11 @@ with gr.Blocks(theme=custom_theme) as app:
  gr.Markdown("""
 
  ## Project Background
- This project was initially implemented during the Smart India Hackathon (SIH) to address a real struggle I faced: managing the overwhelming flow of research papers and effectively understanding each one. The intensity of this process highlighted how valuable an AI-powered solution could be, not just for me but for others facing similar challenges in academia. By using large language models, this tool aims to make academic material more accessible and manageable, converting dense research into an audio format that’s easier to consume. And with the power of AI, I hope that this tool can transform the way we learn and engage with academic content.
+ This project started during the Smart India Hackathon (SIH) as a solution to a challenge I personally faced: keeping up with the overwhelming influx of research papers. Realizing how intense and time-consuming this was, I thought an AI-driven tool could make academic content more accessible. By leveraging large language models, this tool converts dense research into easily understandable audio, offering an easier way for everyone to engage with academic material.
 
- Development is still ongoing, with plans to integrate web search capabilities and explore additional TTS engines to enhance usability. Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts that drive this project forward.
+ Development is ongoing, with future plans to add web search and additional TTS options for a richer experience. Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts behind this project.
 
- This AI Research Companion is crafted to bridge the gap between research and accessibility, turning in-depth research papers into audio podcasts for easier, on-the-go learning.
- This page allows users to upload their research papers in PDF format to initiate the conversion process.
+ This AI Research Companion bridges the gap between research and accessibility, transforming detailed papers into podcasts for more convenient, on-the-go learning. Just upload a PDF to start the conversion.
  """)
 
  with gr.Row():