---
title: DishDecode
emoji: ⚡
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: It transforms unstructured recipe videos into structured
---

# 📝 **Flask Audio and YouTube Video Processing API**

DishDecode is a Flask-based API that turns audio files and YouTube video transcripts into structured recipe information. It uses Whisper and Deepgram for transcription and the Gemini API for data extraction.

## 🚀 **Overview**

This API offers:

1. **Audio Processing**: Download and transcribe audio files.
2. **YouTube Transcription**: Extract transcripts from YouTube videos.
3. **Recipe Data Generation**: Generate detailed recipe data using the Gemini API.

---

## 🧩 **Features**

- **Audio URL Processing**: Download and transcribe audio via Deepgram.
- **YouTube Video Processing**: Extract video transcripts and process them.
- **Structured Output**: Recipe name, ingredients, steps, techniques, and more.
- **Logging and Error Handling**: Debug logging and descriptive error responses.

---

## ⚙️ **Installation**

### 1. Clone the Repository

```bash
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
```

### 2. Create a Virtual Environment

```bash
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
```

### 3. Install Dependencies

```bash
pip install -r requirements.txt
```

### 4. Configure Environment Variables

Create a `.env` file with your API keys:

```plaintext
FIRST_API_KEY=your_gemini_api_key
SECOND_API_KEY=your_deepgram_api_key
```
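
As a minimal sketch of how these keys might be read at startup, the helper below checks that both variables are present. It assumes the values from `.env` have already been loaded into the process environment (for example by the Dotenv package listed under Dependencies). The name `load_api_keys` is hypothetical and is not taken from the project code.

```python
import os


def load_api_keys(env=None):
    """Return both API keys, raising if either is missing or empty.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    keys = {}
    for name in ("FIRST_API_KEY", "SECOND_API_KEY"):
        value = env.get(name, "").strip()
        if not value:
            # Fail early so a missing key is reported before any request is served.
            raise RuntimeError(f"Missing environment variable: {name}")
        keys[name] = value
    return keys
```

Failing at startup, rather than on the first request, tends to make a misconfigured `.env` easier to spot.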

---

## 📡 **API Endpoints**

### ✅ Health Check

**GET /**

```json
{
  "status": "success",
  "message": "API is running successfully!"
}
```

### 🎧 Process Audio URL

**POST /process-audio**

- **Request**:

  ```json
  {
    "audioUrl": "https://example.com/audio.wav"
  }
  ```

- **Response**:

  ```json
  {
    "structured_data": {
      "Recipe Name": "Pasta Alfredo",
      "Ingredients List": ["Pasta", "Cream", "Garlic"],
      ...
    }
  }
  ```

### 📹 Process YouTube Video

**POST /process-youtube**

- **Request**:

  ```json
  {
    "youtube_url": "https://www.youtube.com/watch?v=example"
  }
  ```

- **Response**:

  ```json
  {
    "structured_data": {
      "Recipe Name": "Grilled Cheese Sandwich",
      "Ingredients List": ["Bread", "Cheese", "Butter"],
      ...
    }
  }
  ```
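
The endpoints above can be called from any HTTP client. The sketch below uses only the Python standard library and assumes the server is running locally on port 5000 and accepts a JSON request body, as the examples suggest. `BASE_URL`, `build_request`, and `call_api` are illustrative names, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: local development server


def build_request(path, payload):
    """Build a JSON POST request for one of the endpoints above."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(path, payload, timeout=120):
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `call_api("/process-youtube", {"youtube_url": "https://www.youtube.com/watch?v=example"})` should return the parsed response, whose `structured_data` field holds the recipe. The generous timeout is a guess, since transcription and generation can take a while.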

---

## 🛠️ **How It Works**

1. **Audio Processing**:
   - Downloads the audio file.
   - Transcribes using Deepgram.
   - Sends the transcription to Gemini for structured data.

2. **YouTube Processing**:
   - Extracts the video ID.
   - Retrieves the transcript.
   - Sends the transcript to Gemini for structured data.
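
The "extracts the video ID" step can be sketched with the standard library alone. This is one possible approach, not necessarily how the project does it. The function name `extract_video_id` and the set of supported URL shapes (watch, short link, embed, shorts) are assumptions.

```python
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]
    return None
```

Returning `None` for unrecognized URLs lets the caller decide how to report the problem, for instance with a 400 response.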

---

## 📦 **Dependencies**

- **Flask**
- **Whisper**
- **Deepgram**
- **Google Gemini API**
- **YouTube Transcript API**
- **Requests**
- **Dotenv**

Install dependencies via:

```bash
pip install -r requirements.txt
```

---

## ▶️ **Run the Application**

1. **Activate Virtual Environment**:

   ```bash
   source venv/bin/activate  # Windows: venv\Scripts\activate
   ```

2. **Run the Flask App**:

   ```bash
   python app.py
   ```

3. **Access**:

   ```plaintext
   http://localhost:5000
   ```

---

## 🐞 **Error Handling**

- **API Key Errors**: Ensure `.env` contains valid API keys.
- **Invalid Input**: Returns HTTP 400 when the `audioUrl` or `youtube_url` field is missing from the request.
- **Transcription Errors**: Returns detailed error messages.
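
To illustrate the invalid-input case, here is a small sketch of a validation helper that a route handler could call before doing any work. The helper name `validate_payload` and the shape of the error body are assumptions. The README only states that a missing URL results in a 400.

```python
def validate_payload(payload, field):
    """Return (error_body, status) for a bad request, or (None, 200) if valid."""
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object"}, 400
    value = payload.get(field)
    if not isinstance(value, str) or not value.strip():
        # Covers a missing key as well as empty or non-string values.
        return {"error": f"Missing required field: {field}"}, 400
    return None, 200
```

A handler for `/process-audio` would pass `"audioUrl"` as `field`, and one for `/process-youtube` would pass `"youtube_url"`.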

---

## 📝 **License**

MIT License.

---

## 👤 **Contributors**

- **Aniket**

---

💬 **Feedback or contributions?** Open an issue or submit a pull request!


Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference