metadata

title: DishDecode
emoji: ⚡
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: It transforms unstructured recipe videos into structured

📝 Flask Audio and YouTube Video Processing API

A Flask-based API that turns audio files and YouTube video transcripts into structured recipe information. It uses the Whisper, Deepgram, and Gemini APIs for transcription and data extraction.

🚀 Overview

This API offers:

Audio Processing: Download and transcribe audio files.
YouTube Transcription: Extract transcripts from YouTube videos.
Recipe Data Generation: Generate detailed recipe data using the Gemini API.

🧩 Features

Audio URL Processing: Download and transcribe audio via Deepgram.
YouTube Video Processing: Extract video transcripts and process them.
Structured Output: Recipe name, ingredients, steps, techniques, and more.
Logging and Error Handling: Logs for debugging and descriptive error responses.

⚙️ Installation

1. Clone the Repository

```
git clone https://github.com/your-repo-name.git
cd your-repo-name
```

2. Create a Virtual Environment

```
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
```

3. Install Dependencies

```
pip install -r requirements.txt
```

4. Configure Environment Variables

Create a .env file with your API keys:

```
FIRST_API_KEY=your_gemini_api_key
SECOND_API_KEY=your_deepgram_api_key
```
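
As a rough sketch of how the app might check these keys at startup: the helper below reads the two variable names from the `.env` example and fails early if either is missing. The function name `load_api_keys` is hypothetical and not taken from the project. It assumes the variables are already in the environment, for example after calling `load_dotenv()` from python-dotenv.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def load_api_keys(env=None):
    """Return the required API keys, or raise if any is missing or empty.

    Hypothetical helper: `env` defaults to os.environ, which is assumed to
    have been populated from the .env file beforehand.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Failing at startup gives a clearer error than a rejected Gemini or Deepgram call later on.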

📡 API Endpoints

✅ Health Check

GET /

```
{
  "status": "success",
  "message": "API is running successfully!"
}
```

🎧 Process Audio URL

POST /process-audio

Request:

```
{
  "audioUrl": "https://example.com/audio.wav"
}
```

Response:

```
{
  "structured_data": {
    "Recipe Name": "Pasta Alfredo",
    "Ingredients List": ["Pasta", "Cream", "Garlic"],
    ...
  }
}
```
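
A minimal client-side sketch for this endpoint, using only the standard library. It builds the POST request without sending it. The helper name `build_process_audio_request` is hypothetical, and the JSON `Content-Type` header is an assumption about what the server expects.

```python
import json
import urllib.request


def build_process_audio_request(base_url, audio_url):
    """Build (but do not send) a POST request for /process-audio."""
    body = json.dumps({"audioUrl": audio_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/process-audio",
        data=body,
        # Assumption: the endpoint reads a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send it against a locally running server:
# req = build_process_audio_request("http://localhost:5000",
#                                   "https://example.com/audio.wav")
# with urllib.request.urlopen(req) as resp:
#     recipe = json.load(resp)["structured_data"]
```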

📹 Process YouTube Video

POST /process-youtube

Request:

```
{
  "youtube_url": "https://www.youtube.com/watch?v=example"
}
```

Response:

```
{
  "structured_data": {
    "Recipe Name": "Grilled Cheese Sandwich",
    "Ingredients List": ["Bread", "Cheese", "Butter"],
    ...
  }
}
```
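
A small sketch of how a client might read this response. It relies only on the two keys shown in the example, `Recipe Name` and `Ingredients List`; the function name `summarize_recipe` and the fallback values are hypothetical.

```python
def summarize_recipe(response):
    """Return a one-line summary from a response shaped like the example above."""
    data = response.get("structured_data", {})
    # Fallbacks are assumptions for responses that omit a field.
    name = data.get("Recipe Name", "Unknown recipe")
    ingredients = data.get("Ingredients List", [])
    return f"{name}: {', '.join(ingredients)}"


example = {
    "structured_data": {
        "Recipe Name": "Grilled Cheese Sandwich",
        "Ingredients List": ["Bread", "Cheese", "Butter"],
    }
}
print(summarize_recipe(example))  # Grilled Cheese Sandwich: Bread, Cheese, Butter
```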

🛠️ How It Works

Audio Processing:
- Downloads the audio file.
- Transcribes using Deepgram.
- Sends the transcription to Gemini for structured data.

YouTube Processing:
- Extracts the video ID.
- Retrieves the transcript.
- Sends the transcript to Gemini for structured data.
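
The "extracts the video ID" step could look roughly like the sketch below. This is not the project's actual code: `extract_video_id` is a hypothetical helper, and the set of URL forms it accepts (`watch?v=`, `youtu.be`, `/embed/`, `/shorts/`) is an assumption.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from common YouTube URL forms, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None
    return None


print(extract_video_id("https://www.youtube.com/watch?v=example"))  # example
```

Returning `None` for unrecognized URLs lets the caller answer with a 400 instead of passing a bad ID to the transcript lookup.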

📦 Dependencies

Flask
Whisper
Deepgram
Google Gemini API
YouTube Transcript API
Requests
Dotenv

Install dependencies via:

```
pip install -r requirements.txt
```

▶️ Run the Application

Activate Virtual Environment:

```
source venv/bin/activate  # Windows: venv\Scripts\activate
```

Run the Flask App:
```
python app.py
```
Access:
```
http://localhost:5000
```

🐞 Error Handling

API Key Errors: Ensure .env contains valid API keys.
Invalid Input: Returns 400 for missing URLs.
Transcription Errors: Returns detailed error messages.
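
The 400-for-missing-URL behavior might be implemented along these lines. This is a framework-free sketch: `validate_request` is a hypothetical helper, and the shape of the error body is an assumption, not the API's documented format.

```python
def validate_request(payload, field):
    """Check that `field` is a non-empty string in the JSON payload.

    Returns (None, 200) when valid, otherwise (error_body, 400).
    """
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object"}, 400
    value = payload.get(field)
    if not isinstance(value, str) or not value.strip():
        return {"error": f"Missing required field: {field}"}, 400
    return None, 200


# Field names come from the request examples: "audioUrl" and "youtube_url".
print(validate_request({}, "audioUrl"))
# ({'error': 'Missing required field: audioUrl'}, 400)
```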

📝 License

MIT License.

👤 Contributors

Aniket

💬 Feedback or contributions? Open an issue or submit a pull request!

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference