title: DishDecode
emoji: 
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: It transforms unstructured recipe videos into structured

📝 Flask Audio and YouTube Video Processing API

A Flask-based API that processes audio files and YouTube video transcripts to generate structured recipe information. It uses Whisper, Deepgram, and the Gemini API for transcription and data extraction.

🚀 Overview

This API offers:

  1. Audio Processing: Download and transcribe audio files.
  2. YouTube Transcription: Extract transcripts from YouTube videos.
  3. Recipe Data Generation: Generate detailed recipe data using the Gemini API.

🧩 Features

  • Audio URL Processing: Download and transcribe audio via Deepgram.
  • YouTube Video Processing: Extract video transcripts and process them.
  • Structured Output: Recipe name, ingredients, steps, techniques, and more.
  • Logging and Error Handling: Debug logging and descriptive error responses.

⚙️ Installation

1. Clone the Repository

git clone https://github.com/your-repo-name.git
cd your-repo-name

2. Create a Virtual Environment

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Configure Environment Variables

Create a .env file with your API keys:

FIRST_API_KEY=your_gemini_api_key
SECOND_API_KEY=your_deepgram_api_key
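
As a minimal sketch of how the two keys above might be checked at startup: the helper name `load_api_keys` is hypothetical and not taken from the app, and it assumes the values from .env have already been loaded into the process environment (for example by the Dotenv dependency listed below).

```python
import os

# Variable names come from the .env example above.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def load_api_keys(env=None):
    """Return the required API keys, raising if any is missing or empty.

    `env` defaults to os.environ; passing a plain dict makes testing easier.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Failing early like this gives one clear message at startup rather than an API key error on the first request.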

📡 API Endpoints

✅ Health Check

GET /

{
  "status": "success",
  "message": "API is running successfully!"
}

🎧 Process Audio URL

POST /process-audio

  • Request:

    {
      "audioUrl": "https://example.com/audio.wav"
    }
    
  • Response:

    {
      "structured_data": {
        "Recipe Name": "Pasta Alfredo",
        "Ingredients List": ["Pasta", "Cream", "Garlic"],
        ...
      }
    }
    

📹 Process YouTube Video

POST /process-youtube

  • Request:

    {
      "youtube_url": "https://www.youtube.com/watch?v=example"
    }
    
  • Response:

    {
      "structured_data": {
        "Recipe Name": "Grilled Cheese Sandwich",
        "Ingredients List": ["Bread", "Cheese", "Butter"],
        ...
      }
    }
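
The two POST requests above could be built from Python roughly as follows. This is only a sketch using the standard library: `build_request` is a hypothetical helper, the base URL assumes the app is running locally on port 5000, and the JSON `Content-Type` header is an assumption based on the request bodies shown.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumed local address


def build_request(path, payload):
    """Build a JSON POST request for one of the endpoints (does not send it)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Field names match the request examples above.
audio_req = build_request(
    "/process-audio", {"audioUrl": "https://example.com/audio.wav"}
)
youtube_req = build_request(
    "/process-youtube", {"youtube_url": "https://www.youtube.com/watch?v=example"}
)

# To actually send a request while the server is running:
# with urllib.request.urlopen(audio_req) as resp:
#     result = json.load(resp)
```

Note that the two endpoints use different key styles (`audioUrl` and `youtube_url`), so each payload has to match its endpoint exactly.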
    

🛠️ How It Works

  1. Audio Processing:

    • Downloads the audio file.
    • Transcribes using Deepgram.
    • Sends the transcription to Gemini for structured data.
  2. YouTube Processing:

    • Extracts the video ID.
    • Retrieves the transcript.
    • Sends the transcript to Gemini for structured data.
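
The "extracts the video ID" step can be illustrated with a small standard-library sketch. The function `extract_video_id` and the set of URL shapes it accepts are assumptions for illustration; the app's own extraction logic may differ.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the YouTube video ID from a URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in ("youtube.com", "m.youtube.com"):
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embed and Shorts links: /embed/<id>, /shorts/<id>
        parts = parsed.path.split("/")
        if len(parts) >= 3 and parts[1] in ("embed", "shorts"):
            return parts[2] or None

    return None
```

Returning None for unrecognized URLs lets the caller decide how to report the problem, for instance with a 400 response.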

📦 Dependencies

  • Flask
  • Whisper
  • Deepgram
  • Google Gemini API
  • YouTube Transcript API
  • Requests
  • Dotenv

Install dependencies via:

pip install -r requirements.txt

▶️ Run the Application

  1. Activate Virtual Environment:

    source venv/bin/activate  # Windows: venv\Scripts\activate
    
  2. Run the Flask App:

    python app.py
    
  3. Access:

    http://localhost:5000
    

🐞 Error Handling

  • API Key Errors: Ensure .env contains valid API keys.
  • Invalid Input: Returns HTTP 400 when the request is missing the required URL.
  • Transcription Errors: Returns detailed error messages.
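
As a rough sketch of the missing-URL check, the helper below returns an error body with status 400 when the expected field is absent. The name `validate_url_field` and the `{"error": ...}` response shape are assumptions for illustration, not the app's documented format.

```python
def validate_url_field(payload, field):
    """Return (error_body, 400) if `field` is missing or blank, else None."""
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object"}, 400
    value = payload.get(field)
    if not isinstance(value, str) or not value.strip():
        return {"error": f"Missing required field: {field}"}, 400
    return None


# Example: a /process-audio body without "audioUrl" is rejected.
result = validate_url_field({}, "audioUrl")
```

In a Flask route, the returned tuple could be passed through `jsonify` with the status code before any download or transcription work begins.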

📝 License

MIT License.


👤 Contributors

  • Aniket

💬 Feedback or contributions? Open an issue or submit a pull request!

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference