Elevate Responses: RAG with LlamaIndex & MongoDB
Introduction
Retrieval Augmented Generation (RAG) systems improve how large language models (LLMs) answer questions by grounding their responses in retrieved context. These systems connect LLMs to databases, enabling them to retrieve semantically relevant information to augment their responses. In this guide, we'll explore how to construct your own RAG system using LlamaIndex and MongoDB, so you can build dynamic, context-aware applications.
Definitions
Retrieval Augmented Generation (RAG): A system design pattern that integrates information retrieval techniques with generative AI models, enhancing the relevance and accuracy of responses to user queries by supplementing them with additional context retrieved from external data sources.
LlamaIndex: An LLM/data framework that facilitates the connection of data sources to both proprietary and open-source LLMs, abstracting complexities associated with data ingestion and RAG pipeline implementation.
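To make the RAG pattern concrete, here is a minimal, library-free sketch of the retrieve-then-augment loop. The three-dimensional toy vectors, the sample listings, and the `retrieve` and `build_prompt` helpers are all illustrative assumptions; a real system would use an embedding model and a vector database in their place.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical knowledge base: each record pairs text with a toy embedding.
KNOWLEDGE_BASE = [
    {"text": "Cozy flat near cafes and restaurants.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Remote cabin in the mountains.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Friendly host, walking distance to downtown dining.", "embedding": [0.8, 0.3, 0.1]},
]


def retrieve(query_embedding, top_k=2):
    """Return the top_k records most similar to the query embedding."""
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, retrieved):
    """Augment the user question with the retrieved context."""
    context = "\n".join(f"- {doc['text']}" for doc in retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"


# A made-up query embedding standing in for the embedded user question.
query_embedding = [1.0, 0.2, 0.0]
top_docs = retrieve(query_embedding)
prompt = build_prompt("Where can I stay close to restaurants?", top_docs)
print(prompt)
```

The resulting prompt is what would be sent to the LLM: the model answers the question using only the listings that the retrieval step judged relevant.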
Benefits of Integration
Integrating LlamaIndex with MongoDB offers several advantages:
Efficient Data Retrieval: MongoDB serves as both an operational and vector database, efficiently storing and retrieving vector embeddings and operational data required for RAG systems.
Scalability: LlamaIndex abstracts complexities associated with data ingestion and RAG pipeline implementation, enabling developers to build scalable applications that adapt to various domains swiftly.
Contextual Relevance: By leveraging MongoDB's indexing capabilities and LlamaIndex's retrieval model, developers can ensure that LLM responses are contextually relevant and accurate.
Code Implementation
Let's delve into the practical implementation steps:
Step 1: Install Libraries
!pip install llama-index
!pip install llama-index-vector-stores-mongodb
!pip install llama-index-embeddings-openai
!pip install pymongo
!pip install datasets
!pip install pandas
Step 2: OpenAI Key Setup
import os
os.environ["OPENAI_API_KEY"] = ""
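Hardcoding the key in a notebook makes it easy to leak. As one possible alternative, the sketch below reads the key from the environment and fails early with a clear message when it is missing. The `require_env` helper is a hypothetical name introduced here for illustration, not part of any library.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the notebook")
    return value


# Example usage (assumes the key was exported beforehand):
# openai_key = require_env("OPENAI_API_KEY")
```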
Step 3: Data load and Processing
from datasets import load_dataset
import pandas as pd
# https://huggingface.co/datasets/MongoDB/airbnb_embeddings
# Make sure you have a Hugging Face token (HF_TOKEN) in your development environment
dataset = load_dataset("MongoDB/airbnb_embeddings")
# Convert the dataset to a pandas dataframe
dataset_df = pd.DataFrame(dataset['train'])
# Drop the precomputed embeddings; new ones are generated in Step 6
dataset_df = dataset_df.drop(columns=['text_embeddings'])
dataset_df.head(5)
Step 4: LlamaIndex LLM Configuration
from llama_index.core.settings import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
embed_model = OpenAIEmbedding(model="text-embedding-3-small", dimensions=256)
llm = OpenAI()
Settings.llm = llm
Settings.embed_model = embed_model
Step 5: Creating LlamaIndex Custom Documents
import json
from llama_index.core import Document
from llama_index.core.schema import MetadataMode
# Convert the DataFrame to a JSON string representation
documents_json = dataset_df.to_json(orient='records')
# Load the JSON string into a Python list of dictionaries
documents_list = json.loads(documents_json)
llama_documents = []
for document in documents_list:
    # Value for metadata must be one of (str, int, float, None)
    document["amenities"] = json.dumps(document["amenities"])
    document["images"] = json.dumps(document["images"])
    document["host"] = json.dumps(document["host"])
    document["address"] = json.dumps(document["address"])
    document["availability"] = json.dumps(document["availability"])
    document["review_scores"] = json.dumps(document["review_scores"])
    document["reviews"] = json.dumps(document["reviews"])
    document["image_embeddings"] = json.dumps(document["image_embeddings"])
    # Create a Document object with the text and excluded metadata for llm and embedding models
    llama_document = Document(
        text=document["description"],
        metadata=document,
        excluded_llm_metadata_keys=["_id", "transit", "minimum_nights", "maximum_nights", "cancellation_policy", "last_scraped", "calendar_last_scraped", "first_review", "last_review", "security_deposit", "cleaning_fee", "guests_included", "host", "availability", "reviews", "image_embeddings"],
        excluded_embed_metadata_keys=["_id", "transit", "minimum_nights", "maximum_nights", "cancellation_policy", "last_scraped", "calendar_last_scraped", "first_review", "last_review", "security_deposit", "cleaning_fee", "guests_included", "host", "availability", "reviews", "image_embeddings"],
        metadata_template="{key}=>{value}",
        text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
    )
    llama_documents.append(llama_document)
# Observing an example of what the LLM and Embedding model receive as input
print(
    "\nThe LLM sees this: \n",
    llama_documents[0].get_content(metadata_mode=MetadataMode.LLM),
)
print(
    "\nThe Embedding model sees this: \n",
    llama_documents[0].get_content(metadata_mode=MetadataMode.EMBED),
)
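To see how the excluded keys and the two templates interact, the following library-free sketch approximates the rendering step. It is an illustration only, not LlamaIndex's actual implementation: the `render_for_model` helper and the sample metadata are hypothetical, and joining metadata lines with a newline is an assumption.

```python
def render_for_model(
    text,
    metadata,
    excluded_keys,
    metadata_template="{key}=>{value}",
    text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
):
    """Approximate what a model receives: visible metadata lines plus the text."""
    visible = [
        metadata_template.format(key=key, value=value)
        for key, value in metadata.items()
        if key not in excluded_keys
    ]
    # Assumption: metadata entries are joined with newlines.
    metadata_str = "\n".join(visible)
    return text_template.format(metadata_str=metadata_str, content=text)


# Hypothetical listing metadata; "_id" and "cleaning_fee" are hidden from the model.
sample_metadata = {"_id": 1, "name": "Sunny loft", "cleaning_fee": 40}
rendered = render_for_model(
    "Bright loft close to the river.",
    sample_metadata,
    excluded_keys=["_id", "cleaning_fee"],
)
print(rendered)
```

Excluding noisy fields such as IDs and scrape dates keeps them out of both the prompt and the embedding input, so they neither consume tokens nor skew similarity scores.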
Step 6: Create Nodes
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import MetadataMode
parser = SentenceSplitter(chunk_size=5000)
nodes = parser.get_nodes_from_documents(llama_documents)
for node in nodes:
    # Embed the same text view (content plus visible metadata) that the embedding model should see
    node_embedding = embed_model.get_text_embedding(
        node.get_content(metadata_mode=MetadataMode.EMBED)
    )
    node.embedding = node_embedding
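Before ingesting, it can help to confirm that every node actually carries an embedding of the expected size, since a mismatch with the vector index would make those records unsearchable. The sketch below is one way to do this; `find_bad_embeddings` is a hypothetical helper, the demo objects stand in for real nodes, and 256 is taken from the `dimensions=256` setting in Step 4.

```python
from types import SimpleNamespace

# Matches the dimensions configured for the embedding model in Step 4.
EXPECTED_DIMENSIONS = 256


def find_bad_embeddings(nodes, expected_dimensions=EXPECTED_DIMENSIONS):
    """Return positions of nodes whose embedding is missing or has the wrong length."""
    bad_positions = []
    for position, node in enumerate(nodes):
        embedding = getattr(node, "embedding", None)
        if embedding is None or len(embedding) != expected_dimensions:
            bad_positions.append(position)
    return bad_positions


# Stand-in objects for demonstration; real usage would pass the `nodes` list.
demo_nodes = [
    SimpleNamespace(embedding=[0.0] * 256),  # valid
    SimpleNamespace(embedding=None),         # missing embedding
    SimpleNamespace(embedding=[0.0] * 128),  # wrong dimension
]
print(find_bad_embeddings(demo_nodes))
```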
Step 7: MongoDB Vector Database Connection and Setup
Creating a database and collection within MongoDB is made simple with MongoDB Atlas.
1. First, register for a MongoDB Atlas account. For existing users, sign into MongoDB Atlas.
2. Follow the instructions. Select Atlas UI as the procedure to deploy your first cluster.
3. Create the database: `airbnb`.
4. Within the `airbnb` database, create the collection `listings_reviews`.
5. Create a vector search index named `vector_index` for the `listings_reviews` collection. This index enables the RAG application to retrieve records as additional context to supplement user queries via vector search. Below is the JSON definition of the data collection vector search index.
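A definition along these lines would fit the setup in this guide. Treat it as a sketch under stated assumptions rather than the exact definition used: the field path `embedding` assumes the vector store's default embedding field name, `numDimensions` mirrors the `dimensions=256` setting from Step 4, and `cosine` similarity is an assumed choice. The snippet builds the definition as a Python dictionary and prints the JSON to paste into the Atlas UI.

```python
import json

# Assumed Atlas Vector Search index definition for this guide's setup.
vector_index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",      # assumption: default field where embeddings are stored
            "numDimensions": 256,     # must match the embedding model's output size
            "similarity": "cosine",   # assumption: cosine similarity
        }
    ]
}

print(json.dumps(vector_index_definition, indent=2))
```

If the embedding model or its `dimensions` setting changes, `numDimensions` has to change with it, otherwise vector queries against the index will not match the stored vectors.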
import pymongo
from google.colab import userdata
def get_mongo_client(mongo_uri):
    """Establish a connection to MongoDB."""
    try:
        client = pymongo.MongoClient(mongo_uri)
        # MongoClient connects lazily, so ping the server to confirm the connection works
        client.admin.command("ping")
        print("Connection to MongoDB successful")
        return client
    except pymongo.errors.ConnectionFailure as e:
        print(f"Connection failed: {e}")
        return None

mongo_uri = userdata.get('MONGO_URI_2')
if not mongo_uri:
    print("MONGO_URI_2 is not set in Colab secrets")
mongo_client = get_mongo_client(mongo_uri)
# These names match the database and collection created in Atlas above
DB_NAME = "airbnb"
COLLECTION_NAME = "listings_reviews"
db = mongo_client[DB_NAME]
collection = db[COLLECTION_NAME]
The following code guarantees that the current database collection is empty by executing the delete_many() operation on the collection.
collection.delete_many({})
Step 8: Data Ingestion
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch
vector_store = MongoDBAtlasVectorSearch(mongo_client, db_name=DB_NAME, collection_name=COLLECTION_NAME, index_name="vector_index")
vector_store.add(nodes)
Step 9: Querying the Index with User Queries
from llama_index.core import VectorStoreIndex
import pprint
from llama_index.core.response.notebook_utils import display_response
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine(similarity_top_k=3)
query = "I want to stay in a place that's warm and friendly, and not too far from restaurants, can you recommend a place? Include a reason as to why you've chosen your selection"
response = query_engine.query(query)
display_response(response)
pprint.pprint(response.source_nodes)
Conclusion
By following the steps outlined in this guide, you can develop your own RAG system with LlamaIndex and MongoDB, enhancing the capabilities of your applications to provide contextually relevant and accurate responses. This integration not only streamlines the development process but also ensures scalability and efficiency in handling large datasets. Embrace the power of RAG systems to revolutionize your AI applications and deliver enhanced user experiences.
Stay connected and support my work through various platforms:
Medium: You can read my latest articles and insights on Medium at https://medium.com/@andysingal
Paypal: Enjoyed my article? Buy me a coffee! https://paypal.me/alphasingal?country.x=US&locale.x=en_US
Requests and questions: If you have a project in mind that you’d like me to work on or if you have any questions about the concepts I’ve explained, don’t hesitate to let me know. I’m always looking for new ideas for future Notebooks and I love helping to resolve any doubts you might have.
Resources: