Tanmay09516/Research-Companion

Tanmay Jain committed on Nov 11, 2024

Commit

289b1c0

1 Parent(s): aebbdc1

minor text changes

Files changed (2)

README.md +51 -1
app.py +3 -4

README.md

@@ -10,4 +10,54 @@ pinned: false
 short_description: AI tool turning Academic papers into podcasts
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# AI Research Companion - Transforming Research Papers into Podcasts
+## Overview
+The AI Research Companion is an innovative tool designed to make academic research more accessible. It transforms complex, text-heavy research papers into audio podcasts, enabling users to consume academic content in a more engaging and convenient way.
+This project was initially developed during the Smart India Hackathon (SIH) in 2023 to address the overwhelming challenge of managing and understanding a large number of research papers. It leverages large language models (LLMs) to extract relevant text, generate readable transcripts, and convert these into audio podcasts.
+## Features
+- **Text Extraction:** Extracts content from uploaded PDFs to create clean, readable text.
+- **Transcript Generation:** Uses AI to generate a coherent transcript from the extracted text.
+- **TTS (Text-to-Speech):** Converts the refined transcript into an audio file.
+- **Editable Transcript:** Users can modify the transcript before converting it into audio, allowing for better control over the final output.
+- **Audio Output:** Listen to the final generated podcast from the research paper.
+## Development Status
+The tool is still under development with plans to:
+- Integrate web search capabilities to find related research.
+- Explore additional Text-to-Speech engines to enhance the audio output.
+## Requirements
+- Python 3.7 or higher
+- Gradio
+- Various AI/LLM APIs (configured in the `config` directory)
+- Edge TTS for audio generation
+## Setup Instructions
+1. Clone this repository to your local machine:
+    ```bash
+    git clone <repository_url>
+    ```
+2. Install the required dependencies:
+    ```bash
+    pip install -r requirements.txt
+    ```
+3. Set up API keys for the LLM models in the `config` directory.
+## Usage
+1. **Upload PDF:** Start by uploading a research paper in PDF format.
+2. **Select Model:** Choose the text model for processing the document.
+3. **Text Preview:** Preview the extracted text before proceeding.
+4. **Transcript Preview:** Review the generated transcript and make edits if needed.
+5. **TTS Output:** After finalizing the transcript, generate the audio podcast from the text.
+## Note
+This tool calls hosted LLM APIs by default. If a GPU is available, you can point the API base at a locally served model instead, for example through Ollama, which may improve performance.
+## Acknowledgements
+Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts that guide this project.
+## License
+This project is open source under the MIT License. Feel free to contribute and improve the tool.
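
The README's note about switching the API base to a local model can be illustrated with a small sketch. This is not the project's actual configuration code: the README says settings live in the `config` directory, whereas the sketch below reads hypothetical environment variables (`USE_LOCAL_LLM`, `LOCAL_LLM_BASE_URL`, `LOCAL_LLM_MODEL`, `LLM_API_BASE`, `LLM_API_KEY`, `LLM_MODEL`), and the default model name is only a placeholder. The one fixed detail it relies on is that Ollama exposes an OpenAI-compatible API under `/v1` on its default port 11434.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Ollama's OpenAI-compatible endpoint on its default local port.
OLLAMA_BASE_URL = "http://localhost:11434/v1"


@dataclass(frozen=True)
class LLMEndpoint:
    """Connection settings an OpenAI-style client would need."""
    base_url: str
    api_key: str
    model: str


def resolve_endpoint(env: Optional[Mapping[str, str]] = None) -> LLMEndpoint:
    """Pick a local or hosted endpoint from (hypothetical) environment variables."""
    env = os.environ if env is None else env
    use_local = env.get("USE_LOCAL_LLM", "").lower() in {"1", "true", "yes"}
    if use_local:
        return LLMEndpoint(
            base_url=env.get("LOCAL_LLM_BASE_URL", OLLAMA_BASE_URL),
            # Ollama does not check the key, but OpenAI-style clients
            # generally expect a non-empty value.
            api_key="ollama",
            # Placeholder model name; use whichever model is pulled locally.
            model=env.get("LOCAL_LLM_MODEL", "llama3"),
        )
    # Hosted case: all three values must be supplied explicitly.
    return LLMEndpoint(
        base_url=env["LLM_API_BASE"],
        api_key=env["LLM_API_KEY"],
        model=env["LLM_MODEL"],
    )
```

With this shape, the rest of the app only sees an `LLMEndpoint`, so moving between a hosted API and a local server is a configuration change rather than a code change.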

app.py

@@ -104,12 +104,11 @@ with gr.Blocks(theme=custom_theme) as app:
         gr.Markdown("""
         ## Project Background
-        This project was initially implemented during the Smart India Hackathon (SIH) to address a real struggle I faced: managing the overwhelming flow of research papers and effectively understanding each one. The intensity of this process highlighted how valuable an AI-powered solution could be, not just for me but for others facing similar challenges in academia. By using large language models, this tool aims to make academic material more accessible and manageable, converting dense research into an audio format that’s easier to consume. And with the power of AI, I hope that this tool can transform the way we learn and engage with academic content.
-        Development is still ongoing, with plans to integrate web search capabilities and explore additional TTS engines to enhance usability. Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts that drive this project forward.
-        This AI Research Companion is crafted to bridge the gap between research and accessibility, turning in-depth research papers into audio podcasts for easier, on-the-go learning.
-        This page allows users to upload their research papers in PDF format to initiate the conversion process.
         """)
         with gr.Row():

         gr.Markdown("""
         ## Project Background
+        This project started during the Smart India Hackathon (SIH) as a solution to a challenge I personally faced—keeping up with the overwhelming influx of research papers. Realizing how intense and time-consuming this was, I thought an AI-driven tool could make academic content more accessible. By leveraging large language models, this tool converts dense research into easily understandable audio, offering an easier way for everyone to engage with academic material.
+        Development is ongoing, with future plans to add web search and additional TTS options for a richer experience. Special thanks to [yasserrmd](https://huggingface.co/spaces/yasserrmd/NotebookLlama) for inspiring the structured prompts behind this project.
+        This AI Research Companion bridges the gap between research and accessibility, transforming detailed papers into podcasts for more convenient, on-the-go learning. Just upload a PDF to start the conversion.
         """)
         with gr.Row():
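
The conversion flow that the README and the app text describe (extract text from a PDF, generate a transcript, optionally let the user edit it, then synthesize audio) can be sketched as a small pipeline. This is a minimal illustration rather than the code in `app.py`: the names `PodcastResult` and `run_pipeline` are hypothetical, and each stage is passed in as a plain callable so the sketch makes no assumption about which PDF library, LLM API, or TTS engine is used.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PodcastResult:
    """Intermediate and final outputs, kept so each step can be previewed."""
    extracted_text: str
    transcript: str
    audio_path: str


def run_pipeline(
    pdf_path: str,
    extract_text: Callable[[str], str],
    generate_transcript: Callable[[str], str],
    synthesize_audio: Callable[[str], str],
    edit_transcript: Optional[Callable[[str], str]] = None,
) -> PodcastResult:
    """Run the PDF-to-podcast stages in order and return every stage's output."""
    # Step 1: pull readable text out of the uploaded PDF.
    text = extract_text(pdf_path)
    # Step 2: turn the extracted text into a podcast-style transcript.
    transcript = generate_transcript(text)
    # Step 3 (optional): apply the user's edits before any audio is made.
    if edit_transcript is not None:
        transcript = edit_transcript(transcript)
    # Step 4: convert the final transcript to audio; returns the file path.
    audio_path = synthesize_audio(transcript)
    return PodcastResult(text, transcript, audio_path)
```

Keeping the stages as injected callables means a different TTS engine, one of the README's planned extensions, could be swapped in without touching the pipeline itself, and the edit step stays between transcript generation and audio synthesis as the Usage section describes.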