Eddie Pick committed
Commit 6f80de5
1 Parent(s): d803be1

Improvements

Files changed (6)
  1. README.md +34 -47
  2. models.py +9 -18
  3. search_agent.py +15 -7
  4. search_agent_ui.py +27 -8
  5. web_crawler.py +18 -27
  6. web_rag.py +17 -1
README.md CHANGED
@@ -10,26 +10,25 @@ pinned: false
 license: apache-2.0
 ---
 
-⚠️ **This project is a demonstration / proof-of-concept and is not intended for use in production environments. It is provided as-is, without warranty or guarantee of any kind. The code and any accompanying materials are for educational, testing, or evaluation purposes only.**⚠️
+⚠️ **This project is a demonstration / proof-of-concept and is not intended for use in production environments. It is provided as-is, without warranty or guarantee of any kind. The code and any accompanying materials are for educational, testing, or evaluation purposes only.** ⚠️
 
 # Simple Search Agent
 
-This Python project provides a search agent that can perform web searches, optimize search queries, fetch and process web content, and generate responses using a language model and the retrieved information.
-Does a bit what [Perplexity AI](https://www.perplexity.ai/) does.
+This Python project provides a search agent that can perform web searches, optimize search queries, fetch and process web content, and generate responses using a language model and the retrieved information. It does a bit of what [Perplexity AI](https://www.perplexity.ai/) does.
 
 The Streamlit GUI hosted on 🤗 Spaces is [available to test](https://huggingface.co/spaces/CyranoB/search_agent)
 
-This Python script and Streamli GUI are a basic search agent that utilizes the LangChain library to perform optimized web searches, retrieve relevant content, and generate informative answers to user queries. The script supports multiple language models and providers, including OpenAI, Anthropic, and Groq.
+This Python script and Streamlit GUI are a basic search agent that utilizes the LangChain library to perform optimized web searches, retrieve relevant content, and generate informative answers to user queries. The script supports multiple language models and providers, including OpenAI, Anthropic, and Groq.
 
 The main functionality of the script can be summarized as follows:
 
 1. **Query Optimization**: The user's input query is optimized for web search by identifying the key information requested and transforming it into a concise search string using the language model's capabilities.
 2. **Web Search**: The optimized search query is used to fetch search results from the Brave Search API. The script allows limiting the search to a specific domain and setting the maximum number of pages to retrieve.
 3. **Content Extraction**: The script fetches the content of the retrieved search results, handling both HTML and PDF documents. It extracts the main text content from web pages and text from PDF files.
-4. **Vectorization**: The extracted content is split into smaller text chunks and vectorized using OpenAI's text embeddings. The vectorized data is stored in a FAISS vector store for efficient retrieval.
-5. **Query Answering**: The user's original query is answered by retrieving the most relevant text chunks from the vector store using a Multi-Query Retriever. The language model generates an informative answer by synthesizing the retrieved information, citing the sources used, and formatting the response in Markdown.
+4. **Vectorization**: The extracted content is split into smaller text chunks using a RecursiveCharacterTextSplitter and vectorized using the specified embedding model. The vectorized data is stored in a FAISS vector store for efficient retrieval.
+5. **Query Answering**: The user's original query is answered by retrieving the most relevant text chunks from the vector store. The language model generates an informative answer by synthesizing the retrieved information, citing the sources used, and formatting the response in Markdown.
 
-The script supports various options for customization, such as specifying the language model provider (OpenAI, Anthropic, Groq, or OllaMa), temperature for language model generation, and output format (text or Markdown).
+The script supports various options for customization, such as specifying the language model provider (OpenAI, Anthropic, Groq, or Ollama), temperature for language model generation, and output format (text or Markdown).
 
 Additionally, the script integrates with the LangChain Tracing V2 feature, allowing users to monitor and analyze the execution of their LangChain applications using the LangChain Studio.
@@ -48,58 +47,46 @@ To run the script, users need to provide their API keys for the desired language
 
 1. Clone this repo
 2. Install the required dependencies:
-```
+
+```bash
 pip install -r requirements.txt
 ```
+
 3. Set up API keys:
-- You will need API keys for the web search API and LLM API.
+
+- You will need API keys for the Brave Search API and LLM API.
 - Add your API keys to the `.env` file. Use `dotenv.sample` to create this file.
 
 ## Usage
 
-```
-python search_agent.py --query "your search query" --provider "provider_name" --model "model_name" --temperature 0.0
-```
-
-Replace `"your search query"` with your desired search query, `"provider_name"` with the language model provider (e.g., `bedrock`, `openai`, `groq`, `ollama`), `"model_name"` with the specific model name (optional), and `temperature` with the desired temperature value for the language model (optional).
-
-Example:
-```
-➜ python ./search_agent.py --provider groq -o text "Write a linkedin post on how Sequoia Capital AI Ascent 2024 is interesting"
-[21:44:05] Using mixtral-8x7b-32768 on groq with temperature 0.0 search_agent.py:78
-[21:44:06] Optimized search query: Sequoia Capital AI Ascent 2024 interest search_agent.py:248
-Found 10 sources search_agent.py:252
-[21:44:08] Managed to extract content from 7 sources search_agent.py:256
-[21:44:12] Filtered 21 relevant content extracts search_agent.py:263
-───────────────────────────────────── Response from groq ──────────────────────────────────────
-🚀 Sequoia Capital's AI Ascent 2024 conference brought together some of the brightest minds in
-AI, including founders, researchers, and industry leaders. The event was a unique opportunity
-to discuss the state of AI and its future, focusing on the promise of generative AI to
-revolutionize industries and provide amazing productivity gains.
-
-🌟 Highlights of the conference included talks by Sam Altman of OpenAI, Dylan Field of Figma,
-Alfred Mensch of Mistral, Daniela Amodei of Anthropic, Andrew Ng of AI Fund, CJ Desai of
-ServiceNow, and independent researcher Andrej Karpathy. Sessions covered a wide range of
-topics, from the merits of large and small models to the rise of reasoning agents, the future
-of compute, and the evolving AI ecosystem.
-
-💡 One key takeaway from the event is the recognition that we are in a 'primordial soup' phase
-of AI development. This is a crucial moment for the technology to transition from being an idea
-to solving real-world problems efficiently. Factors like cheap compute power, fast networks,
-ubiquitous supercomputers, and readily available data are enabling AI as the next significant
-technology wave.
-
-🔜 As we move forward, we can expect AI to become an even more significant part of our lives,
-revolutionizing various sectors and offering unprecedented value creation potential. Stay tuned
-for the upcoming advancements in AI, and let's continue to explore and harness its vast
-capabilities!
-
-_For more information, check out the [Sequoia Capital AI Ascent 2024 conference
-recap](https://www.sequoiacap.com/article/ai-ascent-2024/)._
-
-#AI #ArtificialIntelligence #GenerativeAI #SequoiaCapital #AIascent2024
-────────────────────────────────────────────────────────────────────────────────────────────
-```
+You can run the search agent from the command line using the following syntax:
+
+```bash
+python search_agent.py [OPTIONS] SEARCH_QUERY
+```
+
+### Options:
+
+- `-h`, `--help`: Show this help message and exit.
+- `--version`: Show the program's version number and exit.
+- `-c`, `--copywrite`: First produce a draft, review it, and rewrite for a final text.
+- `-d DOMAIN`, `--domain=DOMAIN`: Limit search to a specific domain.
+- `-t TEMP`, `--temperature=TEMP`: Set the temperature of the LLM [default: 0.0].
+- `-m MODEL`, `--model=MODEL`: Use a specific model [default: openai/gpt-4o-mini].
+- `-e MODEL`, `--embedding_model=MODEL`: Use a specific embedding model [default: same provider as model].
+- `-n NUM`, `--max_pages=NUM`: Max number of pages to retrieve [default: 10].
+- `-x NUM`, `--max_extracts=NUM`: Max number of page extracts to consider [default: 7].
+- `-s`, `--use_selenium`: Use selenium to fetch content from the web [default: False].
+- `-o TEXT`, `--output=TEXT`: Output format (choices: text, markdown) [default: markdown].
+
+### Examples
+
+```bash
+python search_agent.py -m openai/gpt-4o-mini "Write a linked post about the current state of M&A for startups. Write in the style of Russ from Silicon Valley TV show."
+```
+
+```bash
+python search_agent.py -m openai -e ollama -t 0.7 -n 20 -x 15 "Write a linked post about the state of M&A for startups in 2024. Write in the style of Russ from TV show Silicon Valley" -s
+```
 
 ## License
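The five pipeline stages listed in the README map onto functions that appear elsewhere in this commit. The sketch below shows roughly how they could be wired together; `get_sources` and the exact signatures are assumptions inferred from the diffs, not confirmed APIs.

```python
# Hypothetical end-to-end driver for the five-stage pipeline. Function names are
# taken from this commit's diffs where visible (get_model, get_embedding_model,
# optimize_search_query, get_links_contents, vectorize, query_rag); get_sources
# and all signatures are assumed for illustration only.
import models as md
import web_rag as wr
import web_crawler as wc

def answer(query: str, model: str = "openai/gpt-4o-mini",
           max_pages: int = 10, max_extracts: int = 7) -> str:
    chat = md.get_model(model, temperature=0.0)
    # Default embedding model: same provider as the chat model (see search_agent.py diff)
    embeddings = md.get_embedding_model(f"{model.split('/')[0]}/")
    search_query = wr.optimize_search_query(chat, query)          # 1. query optimization
    sources = wc.get_sources(search_query, max_pages=max_pages)   # 2. Brave search (assumed name)
    contents = wc.get_links_contents(sources)                     # 3. HTML/PDF content extraction
    store = wc.vectorize(contents, embeddings)                    # 4. chunk + embed into FAISS
    return wr.query_rag(chat, query, search_query, store, top_k=max_extracts)  # 5. answer
```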
models.py CHANGED
@@ -31,17 +31,12 @@ from langchain_together.embeddings import TogetherEmbeddings
 
 
 def get_model(provider_model, temperature=0.0):
-    provider, model = (provider_model.split('/') + [None])[:2]
+    provider, model = (provider_model.rstrip('/').split('/') + [None])[:2]
     match provider:
         case 'bedrock':
-            #credentials_profile_name=os.getenv('CREDENTIALS_PROFILE_NAME')
             if model is None:
                 model = "anthropic.claude-3-sonnet-20240229-v1:0"
-            chat_llm = ChatBedrockConverse(
-                #credentials_profile_name=credentials_profile_name,
-                model=model,
-                temperature=temperature,
-            )
+            chat_llm = ChatBedrockConverse(model=model, temperature=temperature)
         case 'cohere':
             if model is None:
                 model = 'command-r-plus'
@@ -52,7 +47,7 @@ def get_model(provider_model, temperature=0.0):
             chat_llm = ChatFireworks(model_name=model, temperature=temperature, max_tokens=120000)
         case 'googlegenerativeai':
             if model is None:
-                model = "gemini-1.5-pro"
+                model = "gemini-1.5-flash"
             chat_llm = ChatGoogleGenerativeAI(model=model, temperature=temperature,
                                               max_tokens=None, timeout=None, max_retries=2,)
         case 'groq':
@@ -82,16 +77,12 @@ def get_model(provider_model, temperature=0.0):
 
 
 def get_embedding_model(provider_embedding_model):
-    provider, model = (provider_embedding_model.split('/') + [None])[:2]
+    provider, model = (provider_embedding_model.rstrip('/').split('/') + [None])[:2]
     match provider:
         case 'bedrock':
-            #credentials_profile_name=os.getenv('CREDENTIALS_PROFILE_NAME')
             if model is None:
-                model = "cohere.embed-multilingual-v3"
-            embedding_model = BedrockEmbeddings(
-                model_id=model,
-                #credentials_profile_name=credentials_profile_name
-            )
+                model = "amazon.titan-embed-text-v2:0"
+            embedding_model = BedrockEmbeddings(model_id=model)
         case 'cohere':
             if model is None:
                 model = "embed-english-light-v3.0"
@@ -118,11 +109,11 @@ def get_embedding_model(provider_embedding_model):
             raise ValueError(f"Cannot use Perplexity for embedding model")
         case 'together':
             if model is None:
-                model = 'BAAI/bge-base-en-v1.5'
+                model = 'togethercomputer/m2-bert-80M-2k-retrieval'
             embedding_model = TogetherEmbeddings(model=model)
         case _:
             raise ValueError(f"Unknown LLM provider {provider}")
-
+
     return embedding_model
 
 
@@ -233,7 +224,7 @@ class TestGetModel(unittest.TestCase):
     @patch('models.ChatGroq')
     def test_groq_model(self, mock_groq):
        result = get_model('groq')
-        mock_groq.assert_called_once_with(model_name='llama-3.1-8b-instant', temperature=0.0)
+        mock_groq.assert_called_once_with(model_name='llama2-70b-4096', temperature=0.0)
         self.assertEqual(result, mock_groq.return_value)
 
     @patch('models.ChatOllama')
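The new `rstrip('/')` in the shared parsing idiom is what makes a bare `"provider/"` spec fall back to the provider's default model instead of producing an empty model name. A quick standalone check of how that expression behaves (`split_provider_model` is a hypothetical name for illustration):

```python
# Standalone check of the (spec.rstrip('/').split('/') + [None])[:2] idiom
# used by get_model() and get_embedding_model() after this change.
def split_provider_model(spec: str):
    provider, model = (spec.rstrip('/').split('/') + [None])[:2]
    return provider, model

assert split_provider_model("openai/gpt-4o-mini") == ("openai", "gpt-4o-mini")
assert split_provider_model("groq") == ("groq", None)         # provider only -> default model
assert split_provider_model("bedrock/") == ("bedrock", None)  # trailing slash no longer yields ""
```

One caveat worth noting: the `[:2]` slice keeps only the first segment after the provider, so a model id that itself contains a slash (like the new Together default `togethercomputer/m2-bert-80M-2k-retrieval`) would be truncated if passed explicitly in `provider/model` form.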
search_agent.py CHANGED
@@ -22,9 +22,9 @@ Options:
     -d domain --domain=domain          Limit search to a specific domain
     -t temp --temperature=temp         Set the temperature of the LLM [default: 0.0]
     -m model --model=model             Use a specific model [default: openai/gpt-4o-mini]
-    -e model --embedding_model=model   Use a specific embedding model [default: openai/text-embedding-3-small]
+    -e model --embedding_model=model   Use a specific embedding model [default: same provider as model]
     -n num --max_pages=num             Max number of pages to retrieve [default: 10]
-    -e num --max_extracts=num          Max number of page extract to consider [default: 5]
+    -x num --max_extracts=num          Max number of page extract to consider [default: 7]
     -s --use_selenium                  Use selenium to fetch content from the web [default: False]
     -o text --output=text              Output format (choices: text, markdown) [default: markdown]
 
@@ -54,10 +54,10 @@ dotenv.load_dotenv()
 def get_selenium_driver():
     from selenium import webdriver
     from selenium.webdriver.chrome.options import Options
-    from selenium.common.exceptions import TimeoutException
+    from selenium.common.exceptions import WebDriverException
 
     chrome_options = Options()
-    chrome_options.add_argument("headless")
+    chrome_options.add_argument("--headless")
     chrome_options.add_argument("--disable-extensions")
     chrome_options.add_argument("--disable-gpu")
     chrome_options.add_argument("--no-sandbox")
@@ -66,8 +66,12 @@ def get_selenium_driver():
     chrome_options.add_argument('--blink-settings=imagesEnabled=false')
     chrome_options.add_argument("--window-size=1920,1080")
 
-    driver = webdriver.Chrome(options=chrome_options)
-    return driver
+    try:
+        driver = webdriver.Chrome(options=chrome_options)
+        return driver
+    except WebDriverException as e:
+        print(f"Error creating Selenium WebDriver: {e}")
+        return None
 
 callbacks = []
 if os.getenv("LANGCHAIN_API_KEY"):
@@ -88,7 +92,11 @@ def main(arguments):
     query = arguments["SEARCH_QUERY"]
 
     chat = md.get_model(model, temperature)
-    embedding_model = md.get_embedding_model(embedding_model)
+    if embedding_model.lower() == "same provider as model":
+        provider = model.split('/')[0]
+        embedding_model = md.get_embedding_model(f"{provider}/")
+    else:
+        embedding_model = md.get_embedding_model(embedding_model)
 
     with console.status(f"[bold green]Optimizing query for search: {query}"):
         optimize_search_query = wr.optimize_search_query(chat, query)
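The docopt default `same provider as model` is a sentinel string rather than a real model id; `main()` detects it and rebuilds a `provider/` spec so `get_embedding_model()` picks that provider's default embedding model. The same rule as a pure function, for illustration (`resolve_embedding_spec` is a hypothetical helper, not part of the diff):

```python
# Sketch of the new --embedding_model default resolution, extracted from main().
def resolve_embedding_spec(model_spec: str, embedding_spec: str) -> str:
    if embedding_spec.lower() == "same provider as model":
        provider = model_spec.split('/')[0]
        return f"{provider}/"  # "provider/" makes get_embedding_model use its default
    return embedding_spec

assert resolve_embedding_spec("openai/gpt-4o-mini", "same provider as model") == "openai/"
assert resolve_embedding_spec("openai/gpt-4o-mini",
                              "cohere/embed-english-light-v3.0") == "cohere/embed-english-light-v3.0"
```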
search_agent_ui.py CHANGED
@@ -58,6 +58,8 @@ if "models" not in st.session_state:
     models = []
     if os.getenv("FIREWORKS_API_KEY"):
         models.append("fireworks")
+    if os.getenv("TOGETHER_API_KEY"):
+        models.append("together")
     if os.getenv("COHERE_API_KEY"):
         models.append("cohere")
     if os.getenv("OPENAI_API_KEY"):
@@ -74,7 +76,7 @@ with st.sidebar.expander("Options", expanded=False):
     model_provider = st.selectbox("Model provider 🧠", st.session_state["models"])
     temperature = st.slider("Model temperature 🌡️", 0.0, 1.0, 0.1, help="The higher the more creative")
     max_pages = st.slider("Max pages to retrieve 🔍", 1, 20, 10, help="How many web pages to retrive from the internet")
-    top_k_documents = st.slider("Nbr of doc extracts to consider 📄", 1, 20, 5, help="How many of the top extracts to consider")
+    top_k_documents = st.slider("Nbr of doc extracts to consider 📄", 1, 20, 10, help="How many of the top extracts to consider")
     reviewer_mode = st.checkbox("Draft / Comment / Rewrite mode ✍️", value=False, help="First generate a draft, then comments and then rewrite")
 
 with st.sidebar.expander("Links", expanded=False):
@@ -148,13 +150,30 @@ if prompt := st.chat_input("Enter you instructions..." ):
 
     with st.chat_message("assistant"):
         st_cb = StreamHandler(st.empty())
-        if hasattr(chat, 'stream'):
-            response = ""
-            for chunk in chat.stream(rag_prompt, config={"callbacks": [st_cb, ls_tracer]}):
-                response += chunk.content
-        else:
-            result = chat.invoke(rag_prompt, config={"callbacks": [st_cb, ls_tracer]})
-            response = result.content
+        response = ""
+        for chunk in chat.stream(rag_prompt, config={"callbacks": [ls_tracer]}):
+            if isinstance(chunk, dict):
+                chunk_text = chunk.get('text') or chunk.get('content', '')
+            elif isinstance(chunk, str):
+                chunk_text = chunk
+            elif hasattr(chunk, 'content'):
+                chunk_text = chunk.content
+            else:
+                chunk_text = str(chunk)
+
+            if isinstance(chunk_text, list):
+                chunk_text = ' '.join(
+                    item['text'] if isinstance(item, dict) and 'text' in item
                    else str(item)
+                    for item in chunk_text if item is not None
+                )
+            elif chunk_text is not None:
+                chunk_text = str(chunk_text)
+            else:
+                continue
+
+            response += chunk_text
+            st_cb.on_llm_new_token(chunk_text)
 
     response = response.strip()
     message_id = f"{prompt}{datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
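The new streaming loop flattens whatever `chat.stream()` yields into plain text, since providers differ in whether a chunk is a string, a dict, a message object with `.content`, or a list of content parts. The same normalization as a hypothetical standalone helper, a sketch for illustration only:

```python
# Normalize one streamed chunk to text; mirrors the branching in the diff above.
from types import SimpleNamespace

def chunk_to_text(chunk):
    if isinstance(chunk, dict):
        text = chunk.get('text') or chunk.get('content', '')
    elif isinstance(chunk, str):
        text = chunk
    elif hasattr(chunk, 'content'):
        text = chunk.content
    else:
        text = str(chunk)
    if isinstance(text, list):  # e.g. content-block lists from some providers
        return ' '.join(
            part['text'] if isinstance(part, dict) and 'text' in part else str(part)
            for part in text if part is not None
        )
    return str(text) if text is not None else None

assert chunk_to_text("hi") == "hi"
assert chunk_to_text({"text": "hi"}) == "hi"
assert chunk_to_text(SimpleNamespace(content=[{"text": "a"}, "b"])) == "a b"
```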
web_crawler.py CHANGED
@@ -8,8 +8,7 @@ from trafilatura import extract
 from selenium.common.exceptions import TimeoutException
 from langchain_core.documents.base import Document
 from langchain_experimental.text_splitter import SemanticChunker
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-from langchain_openai import OpenAIEmbeddings
+from langchain.text_splitter import RecursiveCharacterTextSplitter, TokenTextSplitter
 from langchain_community.vectorstores.faiss import FAISS
 from langsmith import traceable
 import requests
@@ -130,7 +129,6 @@ def get_links_contents(sources, get_driver_func=None, use_selenium=False):
 @traceable(run_type="embedding")
 def vectorize(contents, embedding_model):
     documents = []
-    total_content_length = 0
     for content in contents:
         try:
             page_content = content['page_content']
@@ -138,38 +136,31 @@ def vectorize(contents, embedding_model):
             metadata = {'title': content['title'], 'source': content['link']}
             doc = Document(page_content=content['page_content'], metadata=metadata)
             documents.append(doc)
-            total_content_length += len(page_content)
         except Exception as e:
-            print(f"[gray]Error processing content for {content['link']}: {e}")
+            print(f"Error processing content for {content['link']}: {e}")
 
-    # Define a threshold for when to use pre-splitting (e.g., 1 million characters)
-    pre_split_threshold = 1_000_000
+    # Initialize recursive text splitter
+    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
 
-    if total_content_length > pre_split_threshold:
-        # Use pre-splitting for large datasets
-        pre_splitter = RecursiveCharacterTextSplitter(
-            chunk_size=2000,
-            chunk_overlap=200,
-            length_function=len,
-        )
-        documents = pre_splitter.split_documents(documents)
+    # Split documents
+    split_documents = text_splitter.split_documents(documents)
 
-    semantic_chunker = SemanticChunker(embedding_model, breakpoint_threshold_type="percentile")
-
+    # Create vector store
     vector_store = None
-    batch_size = 200  # Adjust this value if needed
+    batch_size = 250  # Slightly less than 256 to be safe
 
-    for i in range(0, len(documents), batch_size):
-        batch = documents[i:i+batch_size]
-
-        # Split each document in the batch using SemanticChunker
-        chunked_docs = []
-        for doc in batch:
-            chunked_docs.extend(semantic_chunker.split_documents([doc]))
+    for i in range(0, len(split_documents), batch_size):
+        batch = split_documents[i:i+batch_size]
 
         if vector_store is None:
-            vector_store = FAISS.from_documents(chunked_docs, embedding_model)
+            vector_store = FAISS.from_documents(batch, embedding_model)
         else:
-            vector_store.add_documents(chunked_docs)
+            texts = [doc.page_content for doc in batch]
+            metadatas = [doc.metadata for doc in batch]
+            embeddings = embedding_model.embed_documents(texts)
+            vector_store.add_embeddings(
+                list(zip(texts, embeddings)),
+                metadatas
+            )
 
     return vector_store
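The commit replaces the semantic chunker with a plain recursive splitter and indexes chunks in batches. Below is a minimal offline sketch of that chunk-then-batch flow; `FakeEmbeddings` stands in for the real embedding model (an assumption so the example runs without API keys; `faiss-cpu` is assumed installed), and `add_documents` is used where the diff precomputes embeddings and calls `add_embeddings`, which has the same effect here.

```python
# Offline illustration of the new chunk-then-batch FAISS indexing in vectorize().
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores.faiss import FAISS
from langchain_core.documents.base import Document

docs = [Document(page_content="lorem ipsum " * 400,
                 metadata={"title": "demo", "source": "https://example.com"})]

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)  # ~4800 chars -> several <=1000-char chunks

embedding_model = FakeEmbeddings(size=32)  # stand-in for the real embedding model
vector_store = None
batch_size = 250  # stay under typical provider batch limits
for i in range(0, len(chunks), batch_size):
    batch = chunks[i:i + batch_size]
    if vector_store is None:
        vector_store = FAISS.from_documents(batch, embedding_model)
    else:
        vector_store.add_documents(batch)  # equivalent to the add_embeddings path

print(len(chunks), "chunks indexed:", len(vector_store.index_to_docstore_id))
```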
web_rag.py CHANGED
@@ -96,6 +96,12 @@ def get_optimized_search_messages(query):
         Exmaple:
             Question: Write a short linkedin about how the "freakeconomics" book previsions didn't pan out
             freakeconomics book predictions failed**
+        Example:
+            Question: Write an LinkedIn post about startup M&A in the style of Andrew Ng
+            startup M&A**
+        Example:
+            Question: Write a linked post about the current state of M&A for startups. Write in the style of Russ from Silicon Valley TV show.
+            startup current state M&A**
         """
     )
     human_message = HumanMessage(
@@ -293,4 +299,14 @@ def build_rag_prompt(chat_llm, question, search_query, vectorstore, top_k = 10,
 def query_rag(chat_llm, question, search_query, vectorstore, top_k = 10, callbacks = []):
     prompt = build_rag_prompt(chat_llm, question, search_query, vectorstore, top_k=top_k, callbacks = callbacks)
     response = chat_llm.invoke(prompt, config={"callbacks": callbacks})
-    return response.content
+
+    # Ensure we're returning a string
+    if isinstance(response.content, list):
+        # If it's a list, join the elements into a single string
+        return ' '.join(str(item) for item in response.content)
+    elif isinstance(response.content, str):
+        # If it's already a string, return it as is
+        return response.content
+    else:
+        # If it's neither a list nor a string, convert it to a string
+        return str(response.content)
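The string coercion added to `query_rag()` guards against the fact that some chat models return `message.content` as a list of content blocks while others return a plain string. The same logic as a hypothetical standalone helper, a sketch for illustration:

```python
# Coerce an LLM response's content to a single string, mirroring the diff above.
def content_to_str(content) -> str:
    if isinstance(content, list):
        return ' '.join(str(item) for item in content)
    if isinstance(content, str):
        return content
    return str(content)

assert content_to_str("hello") == "hello"
assert content_to_str(["hel", "lo"]) == "hel lo"
```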