---
title: Test
emoji: 🐠
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
---

# **AI-Powered Question & Answer Generator with Voice Cloning**

---

## **Overview**

This project combines a fine-tuned language model with voice cloning to create an interactive experience in which AI-generated answers are delivered in a cloned voice. The primary components of the project are:

1. **Text Generation**: A fine-tuned Mistral-7B-v0.1 model generates realistic, human-like answers to user-provided questions.
2. **Voice Cloning**: Using the ElevenLabs API, we clone a voice and synthesize the AI-generated answers into natural-sounding speech.
3. **Deception for Interaction**: The system is designed to mislead players by making the responses appear to come from a real human.

---

## **Key Features**

1. **Fine-Tuned Model for Text Generation**:
   - The project uses the **Mistral-7B-v0.1** model, fine-tuned on a custom dataset.
   - The model generates contextually appropriate, human-like responses to a wide range of questions.

2. **Voice Cloning with ElevenLabs**:
   - ElevenLabs’ **Voice Cloning API** replicates a target voice.
   - The cloned voice delivers the AI-generated answers in a natural and believable manner.

3. **Integration for Immersion**:
   - The generated answers and synthesized speech are combined into a single, seamless interaction.
   - Designed for applications in gaming, interactive storytelling, or prank scenarios.

---

## **How It Works**

### 1. **Question Input**:

- Users provide a question in text form (e.g., "What’s the best way to prepare for a long flight?").
- Alternatively, voice input can be transcribed to text using ElevenLabs’ speech-to-text feature.

### 2. **Text Generation**:

- The Mistral-7B-v0.1 model processes the input question and generates a natural response.
- Example:
  - **Question**: "What’s your favorite place to relax?"
  - **Answer**: "My room, where I can unwind and enjoy some quiet time."

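The text-generation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the prompt template, the function names (`build_prompt`, `extract_answer`, `answer_question`), and the stub generator are all assumptions made for the example, since the README does not document the fine-tuning format.

```python
from typing import Callable

# Hypothetical prompt template; the format actually used for fine-tuning
# is not documented in this README.
PROMPT_TEMPLATE = "Question: {question}\nAnswer:"


def build_prompt(question: str) -> str:
    """Wrap the user's question in the assumed prompt template."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_answer(prompt: str, generated: str) -> str:
    """Drop the echoed prompt and keep only the first line of the completion."""
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    completion = completion.strip()
    return completion.splitlines()[0].strip() if completion else ""


def answer_question(question: str, generate: Callable[[str], str]) -> str:
    """Build the prompt, call the text generator, and clean up its output."""
    prompt = build_prompt(question)
    return extract_answer(prompt, generate(prompt))


def fake_generate(prompt: str) -> str:
    """Stand-in for the real model so the sketch runs without downloading weights."""
    return prompt + " My room, where I can unwind and enjoy some quiet time.\nQuestion: ..."


print(answer_question("What’s your favorite place to relax?", fake_generate))
# prints: My room, where I can unwind and enjoy some quiet time.
```

In a real run, `generate` would wrap the fine-tuned model, for example a `transformers` text-generation pipeline, whose output by default includes the prompt followed by the completion. Passing the generator in as a callable keeps the prompt handling testable without loading the model.
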
### 3. **Voice Cloning**:

- The generated text is sent to ElevenLabs’ API, where it is converted into speech using a cloned voice.
- The resulting voice is meant to sound human, with natural intonation and emotion.

### 4. **Output Delivery**:

- The final output is an audio response delivered in the cloned voice, intended to be difficult to distinguish from a real human speaker.

---

## **Applications**

- **Gaming**: Use in trivia or role-playing games to simulate human-like NPCs.
- **Storytelling**: Create immersive audio experiences by combining generated text with realistic voiceovers.
- **Social Experiments**: Test human reactions to AI-generated, voice-synthesized responses in various scenarios.
- **Entertainment/Pranks**: Surprise players or audiences with a system that convincingly mimics a real human.

---

## **Technologies Used**

1. **Mistral-7B-v0.1**:
   - A large language model, fine-tuned here for text generation.
   - Produces contextually appropriate and relatable answers.

2. **ElevenLabs API**:
   - **Speech-to-Text**: Converts spoken questions into text for the model to process.
   - **Voice Cloning**: Synthesizes text into speech using a cloned voice.

3. **Python**:
   - Backend logic for integrating text generation, voice synthesis, and API calls.
   - Frameworks and libraries include `transformers`, `torch`, and API wrappers for ElevenLabs.

---

## **Setup Instructions**

### 1. **Clone the Repository**:

```bash
git clone https://github.com/Lirone/NotMe.git
cd NotMe