Spaces:
Running
title: Deploy Qdrant RAG
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
Deploying RAG powered by Qdrant as vector db and fastembed for embedding and retrieval
β QUESTION #1:
Why do we want to support streaming? What about streaming is important, or useful?
ANSWER #1:
The goal of streaming in this context is to render the generated answers in chunks. Thus reducing latency specifically for answers containing a lot of tokens
β QUESTION #2:
Why are we using User Session here? What about Python makes us need to use this? Why not just store everything in a global variable?
ANSWER #2:
Users sessions are used to keep track of users activity. It can be used to retrieve contxt from previous conversations or separate conversions
β Discussion Question #1:
Upload a PDF file of the recent DeepSeek-R1 paper and ask the following questions:
- What is RL and how does it help reasoning?
- What is the difference between DeepSeek-R1 and DeepSeek-R1-Zero?
- What is this paper about?
Does this application pass your vibe check? Are there any immediate pitfalls you're noticing?
β Discussion
Yes The application passes the vibe check except for the last question but it is normal behaviour. My collection had documents from another uploaded PDF
π§ CHALLENGE MODE π§
Added Qdrant as vector db
Hugging Face Space link :