{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# A Gentle Introduction to RAG Applications\n", "\n", "This notebook creates a simple RAG (Retrieval-Augmented Generation) system to answer questions from a PDF document using an open-source model." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "PDF_FILE = \"paul.pdf\"\n", "\n", "# We'll be using Llama 3.1 8B for this example.\n", "MODEL = \"llama3.1\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading the PDF document\n", "\n", "Let's start by loading the PDF document and breaking it down into separate pages.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of pages: 9\n", "Length of a page: 3217\n", "Content of a page: 10% a week. And while 110 may not seem much better than 100,if you keep growing at 10% a week you'll be surprised how bigthe numbers get. After a year you'll have 14,000 users, and after2 years you'll have 2 million.You'll be doing different things when you're acquiring users athousand at a time, and growth has to slow down eventually. Butif the market exists you can usually start by recruiting usersmanually and then gradually switch to less manual methods. [3]Airbnb is a classic example of this technique. Marketplaces are sohard to get rolling that you should expect to take heroic measuresat first. In Airbnb's case, these consisted of going door to door inNew York, recruiting new users and helping existing ones improvetheir listings. 
When I remember the Airbnbs during YC, I picturethem with rolly bags, because when they showed up for tuesdaydinners they'd always just flown back from somewhere.FragileAirbnb now seems like an unstoppable juggernaut, but early on itwas so fragile that about 30 days of going out and engaging inperson with users made the difference between success andfailure.That initial fragility was not a unique feature of Airbnb. Almost allstartups are fragile initially. And that's one of the biggest thingsinexperienced founders and investors (and reporters and know-it-alls on forums) get wrong about them. They unconsciously judgelarval startups by the standards of established ones. They're likesomeone looking at a newborn baby and concluding \"there's noway this tiny creature could ever accomplish anything.\"It's harmless if reporters and know-it-alls dismiss your startup.They always get things wrong. It's even ok if investors dismissyour startup; they'll change their minds when they see growth.The big danger is that you'll dismiss your startup yourself. I'veseen it happen. I often have to encourage founders who don't seethe full potential of what they're building. Even Bill Gates madethat mistake. He returned to Harvard for the fall semester afterstarting Microsoft. He didn't stay long, but he wouldn't havereturned at all if he'd realized Microsoft was going to be even afraction of the size it turned out to be. [4]The question to ask about an early stage startup is not \"is thiscompany taking over the world?\" but \"how big could this companyget if the founders did the right things?\" And the right thingsoften seem both laborious and inconsequential at the time.Microsoft can't have seemed very impressive when it was just acouple guys in Albuquerque writing Basic interpreters for amarket of a few thousand hobbyists (as they were then called),but in retrospect that was the optimal path to dominatingmicrocomputer software. 
And I know Brian Chesky and JoeGebbia didn't feel like they were en route to the big time as theywere taking \"professional\" photos of their first hosts' apartments.They were just trying to survive. But in retrospect that too wasthe optimal path to dominating a big market.How do you find users to recruit manually? If you build somethingto solve your own problems, then you only have to find yourpeers, which is usually straightforward. Otherwise you'll have to8/6/24, 11:04 AMDo Things that Don't Scale\n", "https://paulgraham.com/ds.html2/9\n" ] } ], "source": [ "from langchain_community.document_loaders import PyPDFLoader\n", "\n", "# PyPDFLoader returns one Document per page.\n", "loader = PyPDFLoader(PDF_FILE)\n", "pages = loader.load()\n", "\n", "print(f\"Number of pages: {len(pages)}\")\n", "print(f\"Length of a page: {len(pages[1].page_content)}\")\n", "print(\"Content of a page:\", pages[1].page_content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Splitting the pages into chunks\n", "\n", "Whole pages are too long to work well as retrieval units, so let's split them into smaller, slightly overlapping chunks.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of chunks: 32\n", "Length of a chunk: 1494\n", "Content of a chunk: July 2013One of the most common types of advice we give at Y Combinatoris to do things that don't scale. A lot of would-be founders believethat startups either take off or don't. You build something, makeit available, and if you've made a better mousetrap, people beat apath to your door as promised. Or they don't, in which case themarket must not exist. [1]Actually startups take off because the founders make them takeoff. There may be a handful that just grew by themselves, butusually it takes some sort of push to get them going. A goodmetaphor would be the cranks that car engines had before theygot electric starters. 
Once the engine was going, it would keepgoing, but there was a separate and laborious process to get itgoing.RecruitThe most common unscalable thing founders have to do at thestart is to recruit users manually. Nearly all startups have to. Youcan't wait for users to come to you. You have to go out and getthem.Stripe is one of the most successful startups we've funded, andthe problem they solved was an urgent one. If anyone could havesat back and waited for users, it was Stripe. But in fact they'refamous within YC for aggressive early user acquisition.Startups building things for other startups have a big pool ofpotential users in the other companies we've funded, and nonetook better advantage of it than Stripe. At YC we use the term\"Collison installation\" for the technique they invented. Morediffident founders ask \"Will you try our beta?\" and if the answer\n" ] } ], "source": [ "from langchain_text_splitters import RecursiveCharacterTextSplitter\n", "\n", "splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)\n", "\n", "chunks = splitter.split_documents(pages)\n", "print(f\"Number of chunks: {len(chunks)}\")\n", "print(f\"Length of a chunk: {len(chunks[1].page_content)}\")\n", "print(\"Content of a chunk:\", chunks[1].page_content)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Storing the chunks in a vector store\n", "\n", "We can now generate embeddings for every chunk and store them in a vector store.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "from langchain_community.vectorstores import FAISS\n", "from langchain_community.embeddings import OllamaEmbeddings\n", "\n", "embeddings = OllamaEmbeddings(model=MODEL)\n", "vectorstore = FAISS.from_documents(chunks, embeddings)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting up a retriever\n", "\n", "We can use a retriever to find chunks in the vector store that are similar to a 
supplied question.\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Document(metadata={'source': 'paul.pdf', 'page': 7}, page_content=\"started are not merely a necessary evil, but change the companypermanently for the better. If you have to be aggressive aboutuser acquisition when you're small, you'll probably still beaggressive when you're big. If you have to manufacture your ownhardware, or use your software on users's behalf, you'll learnthings you couldn't have learned otherwise. And mostimportantly, if you have to work hard to delight users when youonly have a handful of them, you'll keep doing it when you have alot.\"),\n", " Document(metadata={'source': 'paul.pdf', 'page': 0}, page_content='Want to start a startup? Get funded by Y Combinator.'),\n", " Document(metadata={'source': 'paul.pdf', 'page': 7}, page_content='Notes[1] Actually Emerson never mentioned mousetraps specifically. Hewrote \"If a man has good corn or wood, or boards, or pigs, tosell, or can make better chairs or knives, crucibles or churchorgans, than anybody else, you will find a broad hard-beaten roadto his house, though it be in the woods.\"[2] Thanks to Sam Altman for suggesting I make this explicit.And no, you can\\'t avoid doing sales by hiring someone to do it foryou. You have to do sales yourself initially. Later you can hire areal salesperson to replace you.[3] The reason this works is that as you get bigger, your sizehelps you grow. Patrick Collison wrote \"At some point, there wasa very noticeable change in how Stripe felt. 
It tipped from beingthis boulder we had to push to being a train car that in fact hadits own momentum.\"[4] One of the more subtle ways in which YC can help founders isby calibrating their ambitions, because we know exactly how a lotof successful startups looked when they were just getting started.[5] If you\\'re building something for which you can\\'t easily get asmall set of users to observe — e.g. enterprise software — and ina domain where you have no connections, you\\'ll have to rely oncold calls and introductions. But should you even be working onsuch an idea?[6] Garry Tan pointed out an interesting trap founders fall into inthe beginning. They want so much to seem big that they imitateeven the flaws of big companies, like indifference to individualusers. This seems to them more \"professional.\"'),\n", " Document(metadata={'source': 'paul.pdf', 'page': 3}, page_content=\"In that form it only had a potential market of a fewthousand people, but because they felt it was really for them, acritical mass of them signed up. After Facebook stopped being for8/6/24, 11:04 AMDo Things that Don't Scale\")]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "retriever = vectorstore.as_retriever()\n", "retriever.invoke(\"What can you get away with when you only have a small number of users?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configuring the model\n", "\n", "We'll be using Ollama to load the local model in memory. After creating the model, we can invoke it with a question to get the response back.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "AIMessage(content='As of my last update in April 2023, Joe Biden is the President of the United States. He took office on January 20, 2021, succeeding Donald Trump as the 46th President of the United States. 
Please note that this information might change over time due to elections or other political developments.', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-11T15:56:24.740654949Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 1628813229, 'load_duration': 22064348, 'prompt_eval_count': 20, 'prompt_eval_duration': 40750000, 'eval_count': 65, 'eval_duration': 1520747000}, id='run-c453f60f-b4c3-4ed8-84db-c715f94eb563-0', usage_metadata={'input_tokens': 20, 'output_tokens': 65, 'total_tokens': 85})" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from langchain_ollama import ChatOllama\n", "\n", "model = ChatOllama(model=MODEL, temperature=0)\n", "model.invoke(\"Who is the president of the United States?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Parsing the model's response\n", "\n", "The response from the model is an `AIMessage` instance containing the answer. We can extract the text answer by using the appropriate output parser. We can connect the model and the parser using a chain.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "As of my last update in April 2023, Joe Biden is the President of the United States. He took office on January 20, 2021, succeeding Donald Trump as the 46th President of the United States. 
Please note that this information might change over time due to elections or other political developments.\n" ] } ], "source": [ "from langchain_core.output_parsers import StrOutputParser\n", "\n", "parser = StrOutputParser()\n", "\n", "# The | operator pipes the model's AIMessage into the parser, which returns its text content.\n", "chain = model | parser\n", "print(chain.invoke(\"Who is the president of the United States?\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting up a prompt\n", "\n", "Besides the question itself, the model needs the relevant context from the PDF file. A prompt template lets us define the prompt once, with `{context}` and `{question}` placeholders, and reuse it for every query.\n", "\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "You are an assistant that provides answers to questions based on\n", "a given context. \n", "\n", "Answer the question based on the context. If you can't answer the\n", "question, reply \"I don't know\".\n", "\n", "Be as concise as possible and go straight to the point.\n", "\n", "Context: Here is some context\n", "\n", "Question: Here is a question\n", "\n" ] } ], "source": [ "from langchain_core.prompts import PromptTemplate\n", "\n", "template = \"\"\"\n", "You are an assistant that provides answers to questions based on\n", "a given context. \n", "\n", "Answer the question based on the context. 
If you can't answer the\n", "question, reply \"I don't know\".\n", "\n", "Be as concise as possible and go straight to the point.\n", "\n", "Context: {context}\n", "\n", "Question: {question}\n", "\"\"\"\n", "\n", "prompt = PromptTemplate.from_template(template)\n", "print(prompt.format(context=\"Here is some context\", question=\"Here is a question\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding the prompt to the chain\n", "\n", "We can now put the prompt in front of the model and the parser. The resulting chain takes a dictionary with `context` and `question` keys and returns the answer as plain text.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Anna.'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chain = prompt | model | parser\n", "\n", "chain.invoke({\n", " \"context\": \"Anna's sister is Susan\",\n", " \"question\": \"Who is Susan's sister?\"\n", "})\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding the retriever to the chain\n", "\n", "Next, we connect the retriever to the chain so that the context comes from the vector store instead of being typed in by hand.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "from operator import itemgetter\n", "\n", "chain = (\n", " {\n", " # Send the question to the retriever; the chunks it returns become the context.\n", " \"context\": itemgetter(\"question\") | retriever,\n", " # Pass the question through unchanged.\n", " \"question\": itemgetter(\"question\"),\n", " }\n", " | prompt\n", " | model\n", " | parser\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the chain to answer questions\n", "\n", "Finally, we can use the chain to ask questions that will be answered using the PDF document." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Question: What can you get away with when you only have a small number of users?\n", "Answer: Delight users.\n", "*************************\n", "\n", "Question: What's the most common unscalable thing founders have to do at the start?\n", "Answer: Delighting users individually.\n", "*************************\n", "\n", "Question: What's one of the biggest things inexperienced founders and investors get wrong about startups?\n", "Answer: They want to seem big too early and imitate flaws of big companies, like indifference to individual users.\n", "*************************\n", "\n" ] } ], "source": [ "questions = [\n", " \"What can you get away with when you only have a small number of users?\",\n", " \"What's the most common unscalable thing founders have to do at the start?\",\n", " \"What's one of the biggest things inexperienced founders and investors get wrong about startups?\",\n", "]\n", "\n", "for question in questions:\n", " print(f\"Question: {question}\")\n", " print(f\"Answer: {chain.invoke({'question': question})}\")\n", " print(\"*************************\\n\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }