metadata
title: DishDecode
emoji: ⚡
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: It transforms unstructured recipe videos into structured
📝 Flask Audio and YouTube Video Processing API
A Flask-based API that turns audio files and YouTube video transcripts into structured recipe information. It uses Whisper and Deepgram for transcription and the Gemini API for data extraction.
🚀 Overview
This API offers:
- Audio Processing: Download and transcribe audio files.
- YouTube Transcription: Extract transcripts from YouTube videos.
- Recipe Data Generation: Generate detailed recipe data using the Gemini API.
🧩 Features
- Audio URL Processing: Download and transcribe audio via Deepgram.
- YouTube Video Processing: Extract video transcripts and process them.
- Structured Output: Recipe name, ingredients, steps, techniques, and more.
- Logging and Error Handling: Debug logging and descriptive error responses.
⚙️ Installation
1. Clone the Repository
git clone https://github.com/your-repo-name.git
cd your-repo-name
2. Create a Virtual Environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
3. Install Dependencies
pip install -r requirements.txt
4. Configure Environment Variables
Create a .env file with your API keys:
FIRST_API_KEY=your_gemini_api_key
SECOND_API_KEY=your_deepgram_api_key
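As a rough sketch of how these keys might be checked at startup: the helper name `load_api_keys` is hypothetical, and the real app.py may instead call python-dotenv's `load_dotenv()` and read the variables directly. The example below assumes the variables are already in the environment and fails fast when one is missing.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def load_api_keys(env=None):
    """Return the required API keys, raising if any is missing or empty.

    Hypothetical helper: `env` defaults to os.environ, which python-dotenv's
    load_dotenv() would populate from the .env file.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.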
📡 API Endpoints
✅ Health Check
GET /
{
"status": "success",
"message": "API is running successfully!"
}
🎧 Process Audio URL
POST /process-audio
Request:
{ "audioUrl": "https://example.com/audio.wav" }
Response:
{ "structured_data": { "Recipe Name": "Pasta Alfredo", "Ingredients List": ["Pasta", "Cream", "Garlic"], ... } }
📹 Process YouTube Video
POST /process-youtube
Request:
{ "youtube_url": "https://www.youtube.com/watch?v=example" }
Response:
{ "structured_data": { "Recipe Name": "Grilled Cheese Sandwich", "Ingredients List": ["Bread", "Cheese", "Butter"], ... } }
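The endpoints above can be called from any HTTP client. The following is a minimal client sketch using only the standard library. It assumes the server runs at the default local address and accepts JSON request bodies, as the request examples suggest; `build_request` and `call_api` are illustrative names, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumed default local address


def build_request(path, payload):
    """Build a JSON POST request for one of the endpoints above."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(path, payload, timeout=120):
    """Send the request and return the decoded JSON response body."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `call_api("/process-youtube", {"youtube_url": "https://www.youtube.com/watch?v=example"})` should return a dictionary containing the `structured_data` key shown in the responses above. The generous timeout is a guess, since transcription can take a while.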
🛠️ How It Works
Audio Processing:
- Downloads the audio file.
- Transcribes using Deepgram.
- Sends the transcription to Gemini for structured data.
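The three audio steps can be pictured as a small pipeline. This is only a sketch of the control flow: `download`, `transcribe`, and `extract` are hypothetical callables standing in for the HTTP download, the Deepgram call, and the Gemini call, and the actual code in app.py may be organized differently.

```python
def process_audio(audio_url, download, transcribe, extract):
    """Run the three audio steps in order and wrap the result.

    The three callables are placeholders (assumptions), injected so the
    flow can be exercised without real API keys or network access.
    """
    audio_bytes = download(audio_url)     # 1. fetch the audio file
    transcript = transcribe(audio_bytes)  # 2. speech to text
    structured = extract(transcript)      # 3. transcript to recipe fields
    return {"structured_data": structured}
```

Injecting the steps as arguments keeps the orchestration testable with simple stand-in functions.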
YouTube Processing:
- Extracts the video ID.
- Retrieves the transcript.
- Sends the transcript to Gemini for structured data.
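The first YouTube step, extracting the video ID, might look like the sketch below. It uses only the standard library and covers a few common URL shapes; the function name `extract_video_id` and the set of accepted hosts are assumptions, and the project's own parsing may differ.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as YouTube in this sketch (an assumption, not exhaustive).
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url):
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            # Embedded and Shorts links: /embed/<id>, /shorts/<id>
            return parts[1]
    return None
```

Returning `None` for unrecognized URLs lets the caller respond with a client error instead of attempting a transcript lookup.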
📦 Dependencies
- Flask
- Whisper
- Deepgram
- Google Gemini API
- YouTube Transcript API
- Requests
- Dotenv
Install dependencies via:
pip install -r requirements.txt
▶️ Run the Application
Activate Virtual Environment:
source venv/bin/activate # Windows: venv\Scripts\activate
Run the Flask App:
python app.py
Access:
http://localhost:5000
🐞 Error Handling
- API Key Errors: Ensure .env contains valid API keys.
- Invalid Input: Returns 400 for missing URLs.
- Transcription Errors: Returns detailed error messages.
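The invalid-input rule can be illustrated with a small validation helper. This is a sketch under assumptions: the function name `validate_url_field` and the shape of the error body are hypothetical, and only the 400 status for a missing URL comes from the list above.

```python
def validate_url_field(payload, field):
    """Return (error_body, 400) if `field` is missing, else (None, 200)."""
    if not isinstance(payload, dict):
        payload = {}  # treat a missing or malformed body as empty
    value = payload.get(field)
    if not isinstance(value, str) or not value.strip():
        # The error body format is an assumption for illustration.
        return {"error": f"Missing required field: {field}"}, 400
    return None, 200
```

A route handler could call this with `"audioUrl"` or `"youtube_url"` before starting any download or transcription work.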
📝 License
MIT License.
👤 Contributors
- Aniket
💬 Feedback or contributions? Open an issue or submit a pull request!
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference