{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import minsearch\n", "from groq import Groq\n", "from mistralai import Mistral\n", "from dotenv import load_dotenv\n", "import os" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading Environment Variables" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "load_dotenv()\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Access the API keys\n", "mistral_api_key = os.getenv(\"MISTRAL_API_KEY\")\n", "groq_api_key = os.getenv(\"GROQ_API_KEY\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "client = Groq(api_key=groq_api_key)\n", "client_mistral=Mistral(api_key=mistral_api_key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Loading" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "df=pd.read_csv(\"../dataset/data.csv\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Question_ID</th><th>Questions</th><th>Answers</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1590140</td><td>What does it mean to have a mental illness?</td><td>Mental illnesses are health conditions that di...</td></tr>\n",
"    <tr><th>1</th><td>2110618</td><td>Who does mental illness affect?</td><td>It is estimated that mental illness affects 1 ...</td></tr>\n",
"    <tr><th>2</th><td>6361820</td><td>What causes mental illness?</td><td>It is estimated that mental illness affects 1 ...</td></tr>\n",
"    <tr><th>3</th><td>9434130</td><td>What are some of the warning signs of mental i...</td><td>Symptoms of mental health disorders vary depen...</td></tr>\n",
"    <tr><th>4</th><td>7657263</td><td>Can people with mental illness recover?</td><td>When healing from mental illness, early identi...</td></tr>\n",
"    <tr><th>5</th><td>1619387</td><td>What should I do if I know someone who appears...</td><td>Although this website cannot substitute for pr...</td></tr>\n",
"    <tr><th>6</th><td>1030153</td><td>How can I find a mental health professional fo...</td><td>Feeling comfortable with the professional you ...</td></tr>\n",
"    <tr><th>7</th><td>8022026</td><td>What treatment options are available?</td><td>Just as there are different types of medicatio...</td></tr>\n",
"    <tr><th>8</th><td>1155199</td><td>If I become involved in treatment, what do I n...</td><td>Since beginning treatment is a big step for in...</td></tr>\n",
"    <tr><th>9</th><td>7760466</td><td>What is the difference between mental health p...</td><td>There are many types of mental health professi...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Question_ID Questions \\\n", "0 1590140 What does it mean to have a mental illness? \n", "1 2110618 Who does mental illness affect? \n", "2 6361820 What causes mental illness? \n", "3 9434130 What are some of the warning signs of mental i... \n", "4 7657263 Can people with mental illness recover? \n", "5 1619387 What should I do if I know someone who appears... \n", "6 1030153 How can I find a mental health professional fo... \n", "7 8022026 What treatment options are available? \n", "8 1155199 If I become involved in treatment, what do I n... \n", "9 7760466 What is the difference between mental health p... \n", "\n", " Answers \n", "0 Mental illnesses are health conditions that di... \n", "1 It is estimated that mental illness affects 1 ... \n", "2 It is estimated that mental illness affects 1 ... \n", "3 Symptoms of mental health disorders vary depen... \n", "4 When healing from mental illness, early identi... \n", "5 Although this website cannot substitute for pr... \n", "6 Feeling comfortable with the professional you ... \n", "7 Just as there are different types of medicatio... \n", "8 Since beginning treatment is a big step for in... \n", "9 There are many types of mental health professi... 
" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head(10)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 98 entries, 0 to 97\n", "Data columns (total 3 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Question_ID 98 non-null int64 \n", " 1 Questions 98 non-null object\n", " 2 Answers 98 non-null object\n", "dtypes: int64(1), object(2)\n", "memory usage: 2.4+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "documents=df.to_dict('records')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'Question_ID': 1590140,\n", " 'Questions': 'What does it mean to have a mental illness?',\n", " 'Answers': 'Mental illnesses are health conditions that disrupt a person’s thoughts, emotions, relationships, and daily functioning. They are associated with distress and diminished capacity to engage in the ordinary activities of daily life.\\nMental illnesses fall along a continuum of severity: some are fairly mild and only interfere with some aspects of life, such as certain phobias. On the other end of the spectrum lie serious mental illnesses, which result in major functional impairment and interference with daily life. These include such disorders as major depression, schizophrenia, and bipolar disorder, and may require that the person receives care in a hospital.\\nIt is important to know that mental illnesses are medical conditions that have nothing to do with a person’s character, intelligence, or willpower. 
Just as diabetes is a disorder of the pancreas, mental illness is a medical condition due to the brain’s biology.\\nSimilarly to how one would treat diabetes with medication and insulin, mental illness is treatable with a combination of medication and social support. These treatments are highly effective, with 70-90 percent of individuals receiving treatment experiencing a reduction in symptoms and an improved quality of life. With the proper treatment, it is very possible for a person with mental illness to be independent and successful.'}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "documents[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing the Data Using Minsearch" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "\n", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n", "100 3832 100 3832 0 0 4800 0 --:--:-- --:--:-- --:--:-- 4838\n", "100 3832 100 3832 0 0 4785 0 --:--:-- --:--:-- --:--:-- 4820\n" ] } ], "source": [ "!curl -O https://raw.githubusercontent.com/DataTalksClub/llm-zoomcamp/main/01-intro/minsearch.py\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Question_ID</th><th>Questions</th><th>Answers</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1590140</td><td>What does it mean to have a mental illness?</td><td>Mental illnesses are health conditions that di...</td></tr>\n",
"    <tr><th>1</th><td>2110618</td><td>Who does mental illness affect?</td><td>It is estimated that mental illness affects 1 ...</td></tr>\n",
"    <tr><th>2</th><td>6361820</td><td>What causes mental illness?</td><td>It is estimated that mental illness affects 1 ...</td></tr>\n",
"    <tr><th>3</th><td>9434130</td><td>What are some of the warning signs of mental i...</td><td>Symptoms of mental health disorders vary depen...</td></tr>\n",
"    <tr><th>4</th><td>7657263</td><td>Can people with mental illness recover?</td><td>When healing from mental illness, early identi...</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>93</th><td>4373204</td><td>How do I know if I'm drinking too much?</td><td>Sorting out if you are drinking too much can b...</td></tr>\n",
"    <tr><th>94</th><td>7807643</td><td>If cannabis is dangerous, why are we legalizin...</td><td>Cannabis smoke, for example, contains cancer-c...</td></tr>\n",
"    <tr><th>95</th><td>4352464</td><td>How can I convince my kids not to use drugs?</td><td>You can't. But you can influence their capacit...</td></tr>\n",
"    <tr><th>96</th><td>6521784</td><td>What is the legal status (and evidence) of CBD...</td><td>Cannabidiol or CBD is a naturally occurring co...</td></tr>\n",
"    <tr><th>97</th><td>3221856</td><td>What is the evidence on vaping?</td><td>\"Vaping\" is the term for using a device where ...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>98 rows × 3 columns</p>\n",
"</div>
" ], "text/plain": [ " Question_ID Questions \\\n", "0 1590140 What does it mean to have a mental illness? \n", "1 2110618 Who does mental illness affect? \n", "2 6361820 What causes mental illness? \n", "3 9434130 What are some of the warning signs of mental i... \n", "4 7657263 Can people with mental illness recover? \n", ".. ... ... \n", "93 4373204 How do I know if I'm drinking too much? \n", "94 7807643 If cannabis is dangerous, why are we legalizin... \n", "95 4352464 How can I convince my kids not to use drugs? \n", "96 6521784 What is the legal status (and evidence) of CBD... \n", "97 3221856 What is the evidence on vaping? \n", "\n", " Answers \n", "0 Mental illnesses are health conditions that di... \n", "1 It is estimated that mental illness affects 1 ... \n", "2 It is estimated that mental illness affects 1 ... \n", "3 Symptoms of mental health disorders vary depen... \n", "4 When healing from mental illness, early identi... \n", ".. ... \n", "93 Sorting out if you are drinking too much can b... \n", "94 Cannabis smoke, for example, contains cancer-c... \n", "95 You can't. But you can influence their capacit... \n", "96 Cannabidiol or CBD is a naturally occurring co... \n", "97 \"Vaping\" is the term for using a device where ... 
\n", "\n", "[98 rows x 3 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "df = df.rename(columns={'Question_ID': 'question_id', 'Questions': 'questions','Answers':'answers'})\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['question_id', 'questions', 'answers'], dtype='object')" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.columns" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "documents=df.to_dict('records')" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'question_id': 1590140,\n", " 'questions': 'What does it mean to have a mental illness?',\n", " 'answers': 'Mental illnesses are health conditions that disrupt a person’s thoughts, emotions, relationships, and daily functioning. They are associated with distress and diminished capacity to engage in the ordinary activities of daily life.\\nMental illnesses fall along a continuum of severity: some are fairly mild and only interfere with some aspects of life, such as certain phobias. On the other end of the spectrum lie serious mental illnesses, which result in major functional impairment and interference with daily life. These include such disorders as major depression, schizophrenia, and bipolar disorder, and may require that the person receives care in a hospital.\\nIt is important to know that mental illnesses are medical conditions that have nothing to do with a person’s character, intelligence, or willpower. 
Just as diabetes is a disorder of the pancreas, mental illness is a medical condition due to the brain’s biology.\\nSimilarly to how one would treat diabetes with medication and insulin, mental illness is treatable with a combination of medication and social support. These treatments are highly effective, with 70-90 percent of individuals receiving treatment experiencing a reduction in symptoms and an improved quality of life. With the proper treatment, it is very possible for a person with mental illness to be independent and successful.'}" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "documents[0]" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "index=minsearch.Index(\n", " text_fields=['questions', 'answers'],\n", " keyword_fields=[]\n", ")" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.fit(documents)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "query=\"What should I eat if I lost a friend\"" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'question_id': 4759773,\n", " 'questions': 'What should I do if I’m worried about a friend or relative?',\n", " 'answers': 'This may depend on your relationship with them. Gently encouraging someone to seek appropriate support would be helpful to start with.'},\n", " {'question_id': 3388962,\n", " 'questions': 'What should I know before starting a new medication?',\n", " 'answers': 'The best source of information regarding medications is the physician prescribing them. He or she should be able to answer questions such as: 1. What is the medication supposed to do? 2. When should it begin to take effect, and how will I know when it is effective? 3. 
How is the medication taken and for how long? What food, drinks, other medicines, and activities should be avoided while taking this medication? 4. What are the side effects and what should be done if they occur? 5. What do I do if a dose is missed? 6. Is there any written information available about this medication? 7. Are there other medications that might be appropriate? 8. If so, why do you prefer the one you have chosen? 9. How do you monitor medications and what symptoms indicate that they should be raised, lowered, or changed? 10. All medications should be taken as directed. Most medications for mental illnesses do not work when taken irregularly, and extra doses can cause severe, sometimes dangerous side effects. Many psychiatric medications begin to have a beneficial effect only after they have been taken for several weeks.'},\n", " {'question_id': 1619387,\n", " 'questions': 'What should I do if I know someone who appears to have the symptoms of a mental disorder?',\n", " 'answers': \"Although this website cannot substitute for professional advice, we encourage those with symptoms to talk to their friends and family members and seek the counsel of a mental health professional. The sooner the mental health condition is identified and treated, the sooner they can get on the path to recovery.\\nIf you know someone who is having problems, don't assume that the issue will resolve itself. Let them know that you care about them, and that there are treatment options available that will help them heal. Speak with a mental health professional or counselor if you think your friend or family member is experiencing the symptoms of a mental health condition. If the affected loved one knows that you support them, they will be more likely to seek out help.\"},\n", " {'question_id': 8690253,\n", " 'questions': 'What do I do if I’m worried about my mental health?',\n", " 'answers': 'The most important thing is to talk to someone you trust. 
This might be a friend, colleague, family member, or GP. In addition to talking to someone, it may be useful to find out more information about what you are experiencing. These things may help to get some perspective on what you are experiencing, and be the start of getting help.'},\n", " {'question_id': 1259439,\n", " 'questions': 'If I become involved in treatment what do I need to know?',\n", " 'answers': 'Beginning treatment is a big step for individuals and families and can be very overwhelming. It is important to continue involvement in the treatment process as much as possible. Some questions you will need to have answered include:\\nWhat is known about the cause of this particular illness?\\nAre there other diagnoses where these symptoms are common?\\nDo you normally include a physical or neurological examination?\\nAre there any additional tests or exams that you would recommend at this point?\\nWould you advise an independent opinion from another psychiatrist at this point?\\nWhat program of treatment is the most helpful with this diagnosis?\\nWill this program involve services by other specialists? If so, who will be responsible for coordinating these services?\\nWhat do you see as the family’s role in this program of treatment?\\nHow much access will the family have to the individuals who are providing the treatment?\\nWhat medications are generally used with this diagnosis? What is the biological effect of this medication, and what do you expect it to accomplish? What are the risks associated with the medication? 
How soon will we be able to tell if the medication is effective, and how will we know?\\nHow much experience do you have in treating individuals with this illness?\\nWhat can I do to help you in the treatment?'},\n", " {'question_id': 2973656,\n", " 'questions': 'How do I know if I’m unwell?',\n", " 'answers': 'If your beliefs , thoughts , feelings or behaviours have a significant impact on your ability to function in what might be considered a normal or ordinary way, it would be important to seek help.'},\n", " {'question_id': 9100298,\n", " 'questions': 'How can I maintain social connections? What if I feel lonely?',\n", " 'answers': \"A lot of people are alone right now, but we don't have to be lonely. We're all in this together. \\n While you may be physically separated from friends, family members, and other loved ones, it has never been more important to maintain those social connections. Social connections are an opportunity to seek and share support, talk through difficult feelings, share a laugh, keep up-to-date with loved ones, and help each other cope. This pandemic is a lot for one person to deal with on their own. While measures like physical distancing and self-isolation are necessary to slow the spread of the virus, the physical separation can amplify a lot of challenging emotions like loneliness and fear. \\n Think about the different ways to connect that are most meaningful for you. For example, you might prefer a video chat over a phone call, or you might prefer to text throughout the day rather than one set time for a video call. Then, work with your social networks to make a plan. You might video chat with your close friends in the evening and phone a family member once a week. \\n Remember to be mindful of people who may not be online. Check in by phone and ask how you can help. \\n The quality of your social connections matter. Mindlessly scrolling through social media and liking a few posts usually doesn't build strong social connections. 
Make sure you focus on strategies that actually make you feel included and connected. If your current strategies don't help you feel connected, problem-solve to see if you can find a solution. \\n Everyone feels lonely at times—maybe you recently moved to a new city, are changing your circle of friends, lost someone important in your life, or lost your job and also lost important social connections with coworkers. Other people may have physical connections to others but may feel like their emotional or social needs aren't met. Measures like social distancing or self-isolation can make loneliness feel worse no matter why you feel lonely now. \\n Reach out to the connections you do have. Suggest ways to keep in touch and see if you can set a regular time to connect. People may hesitate to reach out for a lot of different reasons, so don't be afraid to be the one who asks. \\n Look for local community support groups and mutual aid groups on social media. This pandemic is bringing everyone together, so look for opportunities to make new connections. These groups are a great way to share your skills and abilities or seek help and support. \\n Look for specialized support groups. Support groups are moving online, and there are a lot of different support lines to call if you need to talk to someone. To find community services in BC, call or text 211 or visit www.bc211.ca. \\n If you need extra support, you can talk with a psychologist or counsellor for free: \\n You can access a free phone call with a Registered Psychologist though the Covid-19 Psychological Support Service from the BC Psychological Association. Visit www.psychologists.bc.ca/covid-19-resources. \\n You can access free, phone-based, short-term support with a counsellor from a new group called the BC COVID-19 Mental Health Network. Email bccovidtherapists@gmail.com to receive an appointment time. \\n For youth people ages 12-24, you can talk with a counsellor for free through Foundry Virtual. 
Visit foundrybc.ca/get-support/virtual/. \\n Call the BC Mental Health Support Line at 310-6789. It’s available 24/7. \\n Chat online with a Crisis Center volunteer at www.crisiscentrechat.ca (daily between noon and 1:00am) \\n For older adults: Call the Seniors Distress Line at 604-872-123 \\n For youth and young adults: Chat online with a volunteer at www.YouthinBC.com (daily between noon and 1:00am) \\n For children and youth: Call the Kids Help Phone at 1-800-668-6868 or visit kidshelpphone.ca \\n For tips on managing loneliness, check out the following resources: \\n Coping with Loneliness from the Canadian Mental Health Association: cmha.bc.ca/documents/coping-with-loneliness/ \\n Loneliness and Social Connection issue of Visions Journal at www.heretohelp.bc.ca/visions/loneliness-and-social-connection-vol14 \\n Wellness Module 3: Social Support at www.heretohelp.bc.ca/wellness-module/wellness-module-3-social-support\"},\n", " {'question_id': 1155199,\n", " 'questions': 'If I become involved in treatment, what do I need to know?',\n", " 'answers': 'Since beginning treatment is a big step for individuals and families, it can be very overwhelming. It is important to be as involved and engaged in the treatment process as possible. Some questions you will need to have answered include:\\nWhat is known about the cause of this particular illness?\\nAre there other diagnoses where these symptoms are common?\\nDo you normally include a physical or neurological examination?\\nAre there any additional tests or exams that you would recommend at this point?\\nWould you advise an independent opinion from another psychiatrist at this point?\\nWhat program of treatment is the most helpful with this diagnosis?\\nWill this program involve services by other specialists? 
If so, who will be responsible for coordinating these services?\\nWhat do you see as the family’s role in this program of treatment?\\nHow much access will the family have to the individuals who are providing the treatment?\\nWhat medications are generally used with this diagnosis?\\nHow much experience do you have in treating individuals with this illness?\\nWhat can I do to help you in the treatment?'},\n", " {'question_id': 2447683,\n", " 'questions': 'Cannabis is legally allowed to 19+ but there are doctor groups saying it’s potentially harmful to age 25. Any use or certain use? What’s myth and what’s fact? If I’m a parent, what should I tell my young adult?',\n", " 'answers': 'Using cannabis has the potential for benefits and harms. Young people use cannabis, like other psychoactive drugs, to feel good, to feel better, to do better or to explore. Trying cannabis out of curiosity, as an experiment, or while socializing with friends, is related to moderate use and lower potential for harm. Using cannabis to cope with daily life, deal with unpleasant feelings, or fit in with a social group has higher potential for harm. This is because dealing with these kinds of issues is associated with frequent and heavier use, less thought about potential harms and little consideration of alternatives for coping such as talking with a parent or trusted adult or physical activity with friends. \\n Evidence suggests that the younger a person is when they start using cannabis and the more often they use, the greater the potential for harms. The legal age to use cannabis in BC is 19. However, our brains do not finish developing until about age 25. Delaying cannabis use until early adulthood may reduce potential harmful effects on the brain. \\n Some young people, especially those with many factors predisposing them to serious and persistent mental health issues, should probably not use cannabis. 
Cannabis has been associated with an increased risk for psychosis and schizophrenia in this small group of people. Some people with serious mental health issues have also reported that using cannabis has helped them cope with their illness by helping them feel less anxious or stressed. As in most situations, balancing potential benefits and harms of using cannabis will be key for young people who have serious mental health concerns. \\n Mixing drugs, such as cannabis and alcohol, can also increase the possibility of experiencing harms. Intoxication may be more intense and long lasting and the young person may not appreciate how impaired they are. We often suggest, “Not too much, not too often, and only in a safe context” as a simple way to gauge your use of any psychoactive substance. \\n As a parent or caring adult, an open respectful relationship with a young person is one of your best resources and ways to prevent harms from substance use. Letting the youth know they can approach you at any time to talk about cannabis, other substances, or anything else of concern to them, says they matter to you and you are ready to listen and engage in dialogue with them. This is a great place to begin addressing anything that might come the young person’s way in life! \\n The Canadian Institute for Substance Use Research, formerly CARBC, is a member of the BC Partners for Mental Health and Addictions Information. The institute is dedicated to the study of substance use in support of community-wide efforts aimed at providing all people with access to healthier lives, whether using substances or not. For more, visit www.cisur.ca.'},\n", " {'question_id': 1337085,\n", " 'questions': 'I have thoughts of suicide, or someone I care about is talking about suicide. What should I do?',\n", " 'answers': 'If you need to talk to someone or you aren’t sure how to help someone you care about, call 1-800-SUICIDE (1-800-784-2433) at any time. 
Or type your concern using live chat (like texting online) at www.crisiscentrechat.ca between noon and 1am. They can help you, and they can suggest good local resources. If you’re at risk of harm or think someone else is in danger and you need help right now, call 911. \\n It’s scary to have thoughts of suicide or hear that someone you can care about has thoughts of suicide. Thoughts of suicide don’t mean that someone will end their life, but it’s a sign that they need extra help or support. If you have thoughts of suicide, it’s important to talk with your doctor or mental health service provider. If you’re supporting someone else, encourage them to seek help. \\n Coping With Suicidal Thoughts is a good resource to help you understand and manage difficult feelings. \\n Our info sheet on suicide has information on suicide, helping someone else, and finding help. What is Suicide? is a booklet with audio in plain language for lower literacy readers. \\n The Centre for Suicide Prevention has many resource toolkits on suicide for different audiences, including people serving in the military, young people, teens, older adults, Aboriginal community members, and LGBT community members.'}]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.search(query)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluating Retrieval" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "df_questions = pd.read_csv('../dataset/ground_truth_data.csv')" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>question</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1590140</td><td>How do mental illnesses affect a person's dail...</td></tr>\n",
"    <tr><th>1</th><td>1590140</td><td>What are some examples of serious mental illne...</td></tr>\n",
"    <tr><th>2</th><td>1590140</td><td>Why is it a misconception to associate mental ...</td></tr>\n",
"    <tr><th>3</th><td>1590140</td><td>How are mental illnesses treated, and what is ...</td></tr>\n",
"    <tr><th>4</th><td>1590140</td><td>Can a person with mental illness become indepe...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " id question\n", "0 1590140 How do mental illnesses affect a person's dail...\n", "1 1590140 What are some examples of serious mental illne...\n", "2 1590140 Why is it a misconception to associate mental ...\n", "3 1590140 How are mental illnesses treated, and what is ...\n", "4 1590140 Can a person with mental illness become indepe..." ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_questions.head()" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "ground_truth=df_questions.to_dict('records')" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 1590140,\n", " 'question': \"How do mental illnesses affect a person's daily functioning and relationships?\"}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ground_truth[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def search(query):\n", "    boost = {}\n", "\n", "    results = index.search(\n", "        query=query,\n", "        filter_dict={},\n", "        boost_dict=boost,\n", "        num_results=10\n", "    )\n", "\n", "    return results" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "def hit_rate(relevance_total):\n", "    # Fraction of queries with at least one relevant result\n", "    cnt = 0\n", "\n", "    for line in relevance_total:\n", "        if True in line:\n", "            cnt = cnt + 1\n", "\n", "    return cnt / len(relevance_total)\n", "\n", "def mrr(relevance_total):\n", "    # Mean reciprocal rank of the first relevant result per query\n", "    total_score = 0.0\n", "\n", "    for line in relevance_total:\n", "        for rank in range(len(line)):\n", "            if line[rank]:\n", "                total_score = total_score + 1 / (rank + 1)\n", "                break  # count only the first relevant hit\n", "\n", "    return total_score / len(relevance_total)\n" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "def 
precision(relevance_list):\n", "    \"\"\"\n", "    Precision: Proportion of retrieved documents that are relevant.\n", "    \"\"\"\n", "    relevant_retrieved = sum(relevance_list)  # True indicates relevance\n", "    total_retrieved = len(relevance_list)  # All retrieved documents\n", "    if total_retrieved == 0:\n", "        return 0.0  # Avoid division by zero\n", "    return relevant_retrieved / total_retrieved" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [], "source": [ "def recall(relevance_list, total_relevant):\n", "    \"\"\"\n", "    Recall: Proportion of relevant documents that are retrieved.\n", "    \"\"\"\n", "    relevant_retrieved = sum(relevance_list)  # True indicates relevance\n", "    if total_relevant == 0:\n", "        return 0.0  # Avoid division by zero\n", "    return relevant_retrieved / total_relevant" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "def evaluate(ground_truth, search_function):\n", "    relevance_total = []\n", "    precision_scores = []\n", "    recall_scores = []\n", "\n", "    for q in tqdm(ground_truth):\n", "        doc_id = q['id']\n", "\n", "        # Get search results for this query\n", "        results = search_function(q)\n", "\n", "        # Check if the correct document (matching question_id) is in the results\n", "        relevance = [doc['question_id'] == doc_id for doc in results]\n", "        relevance_total.append(relevance)\n", "\n", "        # Precision: Fraction of retrieved documents that are relevant\n", "        precision_score = precision(relevance)\n", "        precision_scores.append(precision_score)\n", "\n", "        # Recall: There is only 1 relevant document per query, so recall is either 1 or 0\n", "        recall_score = recall(relevance, 1)\n", "        recall_scores.append(recall_score)\n", "\n", "    return {\n", "        'hit_rate': hit_rate(relevance_total),\n", "        'mrr': mrr(relevance_total),\n", "        'precision': sum(precision_scores) / len(precision_scores),  # Average precision\n", "        'recall': sum(recall_scores) / len(recall_scores),  # Average recall\n", "    }\n" 
] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "from tqdm.auto import tqdm\n" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a77c76ffd2434de9ac4b888899b38f21", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/366 [00:00 Dict[str, float]:\n", " \"\"\"Calculate all retrieval metrics from relevance lists\"\"\"\n", " \n", " # Calculate hit rate\n", " hit_rate = sum(any(line) for line in relevance_total) / len(relevance_total)\n", " \n", " # Calculate MRR\n", " mrr_score = 0.0\n", " for line in relevance_total:\n", " for rank, relevant in enumerate(line):\n", " if relevant:\n", " mrr_score += 1 / (rank + 1)\n", " break\n", " mrr = mrr_score / len(relevance_total)\n", " \n", " # Calculate precision and recall for each query\n", " precision_scores = []\n", " recall_scores = []\n", " \n", " for relevance_list in relevance_total:\n", " # Precision\n", " precision = sum(relevance_list) / len(relevance_list) if relevance_list else 0.0\n", " precision_scores.append(precision)\n", " \n", " # Recall (assuming 1 relevant document per query)\n", " recall = sum(relevance_list) / 1 if sum(relevance_list) > 0 else 0.0\n", " recall_scores.append(recall)\n", " \n", " # Calculate averages\n", " avg_precision = np.mean(precision_scores)\n", " avg_recall = np.mean(recall_scores)\n", " \n", " # Calculate F1 score\n", " f1_score = (2 * avg_precision * avg_recall) / (avg_precision + avg_recall) if (avg_precision + avg_recall) > 0 else 0.0\n", " \n", " return {\n", " 'hit_rate': hit_rate,\n", " 'mrr': mrr,\n", " 'precision': avg_precision,\n", " 'recall': avg_recall,\n", " }\n", "\n" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "def search_with_params(index, query: str, params: Dict[str, Any]) -> List[Dict]:\n", " \"\"\"Perform search with specific parameters\"\"\"\n", " 
results = index.search(\n", " query=query,\n", " filter_dict={},\n", " boost_dict=params.get('boost', {}),\n", " num_results=params.get('num_results', 10)\n", " )\n", " return results" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "def evaluate_params(index, ground_truth: List[Dict], params: Dict[str, Any]) -> Dict[str, float]:\n", " \"\"\"Evaluate a single parameter configuration\"\"\"\n", " relevance_total = []\n", " \n", " for q in tqdm(ground_truth, desc=\"Evaluating configuration\"):\n", " doc_id = q['id']\n", " results = search_with_params(index, q['question'], params)\n", " \n", " # Check relevance of results\n", " relevance = [doc['question_id'] == doc_id for doc in results]\n", " relevance_total.append(relevance)\n", " \n", " return calculate_metrics(relevance_total)" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "def generate_param_combinations(param_grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:\n", " \"\"\"Generate all possible combinations of parameters\"\"\"\n", " keys = param_grid.keys()\n", " values = product(*param_grid.values())\n", " return [dict(zip(keys, v)) for v in values]" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "def find_best_params(index, ground_truth: List[Dict], param_grid: Dict[str, List[Any]]) -> Dict:\n", " \"\"\"Find the best parameters through grid search\"\"\"\n", " \n", " # Generate all parameter combinations\n", " param_combinations = generate_param_combinations(param_grid)\n", " \n", " # Store all results\n", " all_results = []\n", " best_score = -float('inf')\n", " best_params = None\n", " best_metrics = None\n", " \n", " # Test each parameter combination\n", " for params in tqdm(param_combinations, desc=\"Testing parameter combinations\"):\n", " metrics = evaluate_params(index, ground_truth, params)\n", " \n", " # Calculate overall score (average of hit rate and MRR only)\n", " score 
= np.mean([\n", " metrics['hit_rate'],\n", " metrics['mrr']\n", " ])\n", " \n", " result = {\n", " 'params': params,\n", " 'metrics': metrics,\n", " 'overall_score': score\n", " }\n", " all_results.append(result)\n", " \n", " # Update best results if current score is better\n", " if score > best_score:\n", " best_score = score\n", " best_params = params\n", " best_metrics = metrics\n", " \n", " return {\n", " 'best_params': best_params,\n", " 'best_metrics': best_metrics,\n", " 'all_results': all_results\n", " }" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "# Example parameter grid\n", "default_param_grid = {\n", " 'boost': [\n", " {'questions': 1.0, 'answers': 0.5},\n", " {'questions': 2.0, 'answers': 1.0},\n", " {'questions': 1.0, 'answers': 1.0},\n", " {'questions': 3.0, 'answers': 0.5},\n", " ],\n", " 'num_results': [5, 10, 15, 20]\n", "}" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "scrolled": true }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c6e504832cd741769e878efba4afe3c7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Testing parameter combinations: 0%| | 0/16 [00:01\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
    boost                               num_results  metric_hit_rate  metric_mrr  metric_precision  metric_recall  overall_score
11  {'questions': 1.0, 'answers': 1.0}  20           0.969945         0.714573    0.048497          0.969945       0.842259
10  {'questions': 1.0, 'answers': 1.0}  15           0.950820         0.713451    0.063388          0.950820       0.832135
9   {'questions': 1.0, 'answers': 1.0}  10           0.918033         0.710774    0.091803          0.918033       0.814403
8   {'questions': 1.0, 'answers': 1.0}  5            0.855191         0.702368    0.171038          0.855191       0.778780
3   {'questions': 1.0, 'answers': 0.5}  20           0.912568         0.602561    0.045628          0.912568       0.757565
7   {'questions': 2.0, 'answers': 1.0}  20           0.912568         0.602561    0.045628          0.912568       0.757565
2   {'questions': 1.0, 'answers': 0.5}  15           0.890710         0.601329    0.059381          0.890710       0.746020
6   {'questions': 2.0, 'answers': 1.0}  15           0.890710         0.601329    0.059381          0.890710       0.746020
1   {'questions': 1.0, 'answers': 0.5}  10           0.855191         0.598455    0.085519          0.855191       0.726823
5   {'questions': 2.0, 'answers': 1.0}  10           0.855191         0.598455    0.085519          0.855191       0.726823
15  {'questions': 3.0, 'answers': 0.5}  20           0.852459         0.532316    0.042623          0.852459       0.692388
14  {'questions': 3.0, 'answers': 0.5}  15           0.814208         0.530218    0.054281          0.814208       0.672213
0   {'questions': 1.0, 'answers': 0.5}  5            0.756831         0.584791    0.151366          0.756831       0.670811
4   {'questions': 2.0, 'answers': 1.0}  5            0.756831         0.584791    0.151366          0.756831       0.670811
13  {'questions': 3.0, 'answers': 0.5}  10           0.751366         0.525145    0.075137          0.751366       0.638256
12  {'questions': 3.0, 'answers': 0.5}  5            0.661202         0.513661    0.132240          0.661202       0.587432
\n", "" ], "text/plain": [ " boost num_results metric_hit_rate \\\n", "11 {'questions': 1.0, 'answers': 1.0} 20 0.969945 \n", "10 {'questions': 1.0, 'answers': 1.0} 15 0.950820 \n", "9 {'questions': 1.0, 'answers': 1.0} 10 0.918033 \n", "8 {'questions': 1.0, 'answers': 1.0} 5 0.855191 \n", "3 {'questions': 1.0, 'answers': 0.5} 20 0.912568 \n", "7 {'questions': 2.0, 'answers': 1.0} 20 0.912568 \n", "2 {'questions': 1.0, 'answers': 0.5} 15 0.890710 \n", "6 {'questions': 2.0, 'answers': 1.0} 15 0.890710 \n", "1 {'questions': 1.0, 'answers': 0.5} 10 0.855191 \n", "5 {'questions': 2.0, 'answers': 1.0} 10 0.855191 \n", "15 {'questions': 3.0, 'answers': 0.5} 20 0.852459 \n", "14 {'questions': 3.0, 'answers': 0.5} 15 0.814208 \n", "0 {'questions': 1.0, 'answers': 0.5} 5 0.756831 \n", "4 {'questions': 2.0, 'answers': 1.0} 5 0.756831 \n", "13 {'questions': 3.0, 'answers': 0.5} 10 0.751366 \n", "12 {'questions': 3.0, 'answers': 0.5} 5 0.661202 \n", "\n", " metric_mrr metric_precision metric_recall overall_score \n", "11 0.714573 0.048497 0.969945 0.842259 \n", "10 0.713451 0.063388 0.950820 0.832135 \n", "9 0.710774 0.091803 0.918033 0.814403 \n", "8 0.702368 0.171038 0.855191 0.778780 \n", "3 0.602561 0.045628 0.912568 0.757565 \n", "7 0.602561 0.045628 0.912568 0.757565 \n", "2 0.601329 0.059381 0.890710 0.746020 \n", "6 0.601329 0.059381 0.890710 0.746020 \n", "1 0.598455 0.085519 0.855191 0.726823 \n", "5 0.598455 0.085519 0.855191 0.726823 \n", "15 0.532316 0.042623 0.852459 0.692388 \n", "14 0.530218 0.054281 0.814208 0.672213 \n", "0 0.584791 0.151366 0.756831 0.670811 \n", "4 0.584791 0.151366 0.756831 0.670811 \n", "13 0.525145 0.075137 0.751366 0.638256 \n", "12 0.513661 0.132240 0.661202 0.587432 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "\n", "# Run the optimization\n", "results = find_best_params(index, ground_truth, default_param_grid)\n", "\n", "# Print best parameters\n", "print(\"Best Parameters:\", results['best_params'])\n", 
"print(\"\\nBest Metrics:\")\n", "for metric, value in results['best_metrics'].items():\n", " print(f\"{metric}: {value:.4f}\")\n", "\n", "# Create DataFrame of all results for analysis\n", "results_df = pd.DataFrame([\n", " {\n", " **r['params'],\n", " **{f\"metric_{k}\": v for k, v in r['metrics'].items()},\n", " 'overall_score': r['overall_score']\n", " }\n", " for r in results['all_results']\n", "])\n", "\n", "# Display results sorted by overall score\n", "print(\"\\nAll configurations sorted by performance:\")\n", "display(results_df.sort_values('overall_score', ascending=False))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Rag Flow" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "prompt_template=\"\"\"\n", "You are an expert mental health assistant specialized in providing detailed and accurate answers based on the given context.\n", "Answer the QUESTION based on the CONTEXT from our meantal health database.\n", "Use only the facts from the CONTEXT when answering the QUESTION.\n", "\n", "Here is the context:\n", "\n", "Context: {context}\n", "\n", "Please answer the following question based on the provided context:\n", "\n", "Question: {question}\n", "\n", "Provide a detailed and informative response. 
Ensure that your answer is clear, concise, and directly addresses the question while being relevant to the context provided.\n", "\n", "Your response should be in plain text and should not include any code blocks or extra formatting.\n", "\n", "Answer:\n", "\"\"\".strip()\n", "\n", "# Template for one retrieved document; assumes each doc has 'questions' and 'answers' fields\n", "entry_template = \"question: {questions}\\nanswer: {answers}\"\n", "\n", "def build_prompt(query, search_results):\n", " context = \"\"\n", " \n", " for doc in search_results:\n", " # Append the rendered document (formatting the empty context string added no document text)\n", " context = context + entry_template.format(**doc) + \"\\n\\n\"\n", "\n", " prompt = prompt_template.format(question=query, context=context).strip()\n", " return prompt\n" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [], "source": [ "def llm(prompt,model):\n", " response = client.chat.completions.create(\n", " model=model,\n", " messages=[{\"role\": \"user\", \"content\": prompt}])\n", " return response.choices[0].message.content" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "def rag(query, model='mixtral-8x7b-32768'):\n", " search_results = search(query)\n", " prompt = build_prompt(query, search_results)\n", " #print(prompt)\n", " answer = llm(prompt, model=model)\n", " return answer" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "query=\"What should I eat if I lost a friend\"" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"I'm really sorry to hear that you're dealing with the loss of a friendship. It's important to remember that while food can't solve emotional pain, maintaining a balanced diet can help support your overall well-being during this difficult time.\\n\\nUnfortunately, the context provided doesn't give specific dietary recommendations for dealing with lost friendships. However, generally, it's recommended to eat a variety of nutrient-dense foods. This can include:\\n\\n1. 
Fruits and vegetables: These are high in vitamins, minerals, and fiber, which can support your physical health.\\n2. Lean proteins: Foods like chicken, turkey, fish, eggs, and tofu can help maintain muscle mass and support your body's healing processes.\\n3. Whole grains: Foods like brown rice, whole wheat bread, and quinoa can provide energy and help keep you feeling full.\\n4. Healthy fats: Foods like avocados, nuts, seeds, and olive oil can support brain health and help keep you feeling satisfied.\\n\\nRemember, it's also important to stay hydrated and to listen to your body's hunger and fullness cues. If you're finding it hard to eat, try smaller, more frequent meals. If you're turning to food for comfort, try to be mindful of your eating habits and consider healthier alternatives or seeking support from a mental health professional.\"" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rag(query)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Rag Evaluation" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "prompt2_template = \"\"\"\n", "You are an expert evaluator for a RAG system.\n", "Your task is to analyze the relevance of the generated answer to the given question.\n", "Based on the relevance of the generated answer, you will classify it\n", "as \"NON_RELEVANT\", \"PARTLY_RELEVANT\", or \"RELEVANT\".\n", "\n", "Here is the data for evaluation:\n", "\n", "Question: {question}\n", "Generated Answer: {answer_llm}\n", "\n", "Please analyze the content and context of the generated answer in relation to the question\n", "and provide your evaluation in parsable JSON without using code blocks:\n", "\n", "{{\n", " \"Relevance\": \"NON_RELEVANT\" | \"PARTLY_RELEVANT\" | \"RELEVANT\",\n", " \"Explanation\": \"[Provide a brief explanation for your evaluation]\"\n", "}}\n", "\"\"\".strip()\n" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, 
"outputs": [ { "data": { "text/plain": [ "366" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(ground_truth)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 1590140,\n", " 'question': \"How do mental illnesses affect a person's daily functioning and relationships?\"}" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ground_truth[0]" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [], "source": [ "record=ground_truth[0]\n", "question=record[\"question\"]\n", "answer_llm=rag(question)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "You are an expert evaluator for a RAG system.\n", "Your task is to analyze the relevance of the generated answer to the given question.\n", "Based on the relevance of the generated answer, you will classify it\n", "as \"NON_RELEVANT\", \"PARTLY_RELEVANT\", or \"RELEVANT\".\n", "\n", "Here is the data for evaluation:\n", "\n", "Question: How do mental illnesses affect a person's daily functioning and relationships?\n", "Generated Answer: Mental illnesses can significantly impact a person's daily functioning and relationships in various ways. According to the context provided, mental illnesses can affect a person's ability to perform routine activities, maintain employment, manage finances, and handle responsibilities at home. They may also experience difficulty in concentrating, making decisions, and regulating emotions, which can further interfere with their daily life.\n", "\n", "Furthermore, mental illnesses can strain a person's relationships with family, friends, and colleagues. The symptoms of mental illnesses, such as irritability, mood swings, and social withdrawal, can cause misunderstandings, conflicts, and tension in interpersonal relationships. 
Mental illnesses can also affect a person's communication, empathy, and trust, making it challenging for them to build and maintain meaningful connections with others.\n", "\n", "In some cases, mental illnesses can lead to social isolation, loneliness, and discrimination, further exacerbating the negative impact on a person's daily functioning and relationships. It is essential to provide support, understanding, and accommodations to individuals experiencing mental health issues to help them manage their symptoms and improve their quality of life.\n", "\n", "Please analyze the content and context of the generated answer in relation to the question\n", "and provide your evaluation in parsable JSON without using code blocks:\n", "\n", "{\n", " \"Relevance\": \"NON_RELEVANT\" | \"PARTLY_RELEVANT\" | \"RELEVANT\",\n", " \"Explanation\": \"[Provide a brief explanation for your evaluation]\"\n", "}\n" ] } ], "source": [ "prompt=prompt2_template.format(question=question,answer_llm=answer_llm)\n", "print(prompt)" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8ff19c2ce6cc44199c08e01f7d9c4ab4", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/366 [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   record                                              answer                                              evaluation
0  {'id': 1590140, 'question': 'How do mental ill...  Mental illnesses can significantly impact a pe...  {'Relevance': 'RELEVANT', 'Explanation': 'The ...
1  {'id': 1590140, 'question': 'What are some exa...  Sure, I'd be happy to help answer your questio...  {'Relevance': 'RELEVANT', 'Explanation': 'The ...
2  {'id': 1590140, 'question': 'Why is it a misco...  Mental illness is not a reflection of a person...  {'Relevance': 'RELEVANT', 'Explanation': 'The ...
3  {'id': 1590140, 'question': 'How are mental il...  Mental illnesses are treated using a variety o...  {'Relevance': 'RELEVANT', 'Explanation': 'The ...
4  {'id': 1590140, 'question': 'Can a person with...  Yes, a person with a mental illness can become...  {'Relevance': 'RELEVANT', 'Explanation': 'The ...
\n", "" ], "text/plain": [ " record \\\n", "0 {'id': 1590140, 'question': 'How do mental ill... \n", "1 {'id': 1590140, 'question': 'What are some exa... \n", "2 {'id': 1590140, 'question': 'Why is it a misco... \n", "3 {'id': 1590140, 'question': 'How are mental il... \n", "4 {'id': 1590140, 'question': 'Can a person with... \n", "\n", " answer \\\n", "0 Mental illnesses can significantly impact a pe... \n", "1 Sure, I'd be happy to help answer your questio... \n", "2 Mental illness is not a reflection of a person... \n", "3 Mental illnesses are treated using a variety o... \n", "4 Yes, a person with a mental illness can become... \n", "\n", " evaluation \n", "0 {'Relevance': 'RELEVANT', 'Explanation': 'The ... \n", "1 {'Relevance': 'RELEVANT', 'Explanation': 'The ... \n", "2 {'Relevance': 'RELEVANT', 'Explanation': 'The ... \n", "3 {'Relevance': 'RELEVANT', 'Explanation': 'The ... \n", "4 {'Relevance': 'RELEVANT', 'Explanation': 'The ... " ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_eval.head()" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [], "source": [ "df_eval = pd.DataFrame(evaluations, columns=['record', 'answer', 'evaluation'])\n", "\n", "df_eval['id'] = df_eval.record.apply(lambda d: d['id'])\n", "df_eval['question'] = df_eval.record.apply(lambda d: d['question'])\n", "\n", "df_eval['relevance'] = df_eval.evaluation.apply(lambda d: d['Relevance'])\n", "df_eval['explanation'] = df_eval.evaluation.apply(lambda d: d['Explanation'])\n", "\n", "del df_eval['record']\n", "del df_eval['evaluation']" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   answer                                              id       question                                            relevance  explanation
0  Mental illnesses can significantly impact a pe...  1590140  How do mental illnesses affect a person's dail...  RELEVANT   The generated answer is highly relevant to the...
1  Sure, I'd be happy to help answer your questio...  1590140  What are some examples of serious mental illne...  RELEVANT   The generated answer provides a detailed list ...
2  Mental illness is not a reflection of a person...  1590140  Why is it a misconception to associate mental ...  RELEVANT   The generated answer directly addresses the qu...
3  Mental illnesses are treated using a variety o...  1590140  How are mental illnesses treated, and what is ...  RELEVANT   The generated answer fully addresses the quest...
4  Yes, a person with a mental illness can become...  1590140  Can a person with mental illness become indepe...  RELEVANT   The generated answer directly addresses the qu...
\n", "
" ], "text/plain": [ " answer id \\\n", "0 Mental illnesses can significantly impact a pe... 1590140 \n", "1 Sure, I'd be happy to help answer your questio... 1590140 \n", "2 Mental illness is not a reflection of a person... 1590140 \n", "3 Mental illnesses are treated using a variety o... 1590140 \n", "4 Yes, a person with a mental illness can become... 1590140 \n", "\n", " question relevance \\\n", "0 How do mental illnesses affect a person's dail... RELEVANT \n", "1 What are some examples of serious mental illne... RELEVANT \n", "2 Why is it a misconception to associate mental ... RELEVANT \n", "3 How are mental illnesses treated, and what is ... RELEVANT \n", "4 Can a person with mental illness become indepe... RELEVANT \n", "\n", " explanation \n", "0 The generated answer is highly relevant to the... \n", "1 The generated answer provides a detailed list ... \n", "2 The generated answer directly addresses the qu... \n", "3 The generated answer fully addresses the quest... \n", "4 The generated answer directly addresses the qu... " ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_eval.head()" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "relevance\n", "RELEVANT 348\n", "PARTLY_RELEVANT 17\n", "NON_RELEVANT 1\n", "Name: count, dtype: int64\n" ] } ], "source": [ "# Count the occurrences of each relevance category\n", "relevance_counts = df_eval['relevance'].value_counts()\n", "print(relevance_counts)\n" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Installing seaborn..." 
] }, { "name": "stderr", "output_type": "stream", "text": [ "Loading .env environment variables...\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Resolving seaborn...\n", "[ ] Installing...\n", "[= ] Installing seaborn...\n", "Installation Succeeded\n", "[== ] Installing seaborn...\n", "[== ] Installing seaborn...\n", "\n", "Installing dependencies from Pipfile.lock (7b2755)...\n" ] } ], "source": [ "!pipenv install seaborn" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "relevance\n", "RELEVANT 348\n", "PARTLY_RELEVANT 17\n", "NON_RELEVANT 1\n", "Name: count, dtype: int64\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\USER\\AppData\\Local\\Temp\\ipykernel_16892\\1345086306.py:15: FutureWarning: \n", "\n", "Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.\n", "\n", " ax = sns.countplot(data=df_eval, x='relevance', palette='viridis')\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "\n", "# Calculate counts and percentages\n", "counts = df_eval['relevance'].value_counts()\n", "print(counts)\n", "percentages = counts / counts.sum() * 100\n", "\n", "# Create bar plot\n", "plt.figure(figsize=(8, 5))\n", "ax = sns.countplot(data=df_eval, x='relevance', palette='viridis')\n", "\n", "# Add percentage labels on top of the bars\n", "for p in ax.patches:\n", " height = p.get_height()\n", " ax.annotate(f'{height / counts.sum() * 100:.1f}%', \n", " (p.get_x() + p.get_width() / 2., height), \n", " ha='center', va='bottom')\n", "\n", "plt.title('Distribution of Relevance For mixtral-8x7b-32768')\n", "plt.xlabel('Relevance')\n", "plt.ylabel('Count')\n", "plt.show()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(s)" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   answer                                              id       question                                            relevance  explanation
0  Mental illnesses can significantly impact a pe...  1590140  How do mental illnesses affect a person's dail...  RELEVANT   The generated answer is highly relevant to the...
1  Sure, I'd be happy to help answer your questio...  1590140  What are some examples of serious mental illne...  RELEVANT   The generated answer provides a detailed list ...
2  Mental illness is not a reflection of a person...  1590140  Why is it a misconception to associate mental ...  RELEVANT   The generated answer directly addresses the qu...
3  Mental illnesses are treated using a variety o...  1590140  How are mental illnesses treated, and what is ...  RELEVANT   The generated answer fully addresses the quest...
4  Yes, a person with a mental illness can become...  1590140  Can a person with mental illness become indepe...  RELEVANT   The generated answer directly addresses the qu...
\n", "
" ], "text/plain": [ " answer id \\\n", "0 Mental illnesses can significantly impact a pe... 1590140 \n", "1 Sure, I'd be happy to help answer your questio... 1590140 \n", "2 Mental illness is not a reflection of a person... 1590140 \n", "3 Mental illnesses are treated using a variety o... 1590140 \n", "4 Yes, a person with a mental illness can become... 1590140 \n", "\n", " question relevance \\\n", "0 How do mental illnesses affect a person's dail... RELEVANT \n", "1 What are some examples of serious mental illne... RELEVANT \n", "2 Why is it a misconception to associate mental ... RELEVANT \n", "3 How are mental illnesses treated, and what is ... RELEVANT \n", "4 Can a person with mental illness become indepe... RELEVANT \n", "\n", " explanation \n", "0 The generated answer is highly relevant to the... \n", "1 The generated answer provides a detailed list ... \n", "2 The generated answer directly addresses the qu... \n", "3 The generated answer fully addresses the quest... \n", "4 The generated answer directly addresses the qu... " ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Filter relevant entries\n", "relevant_answers = df_eval[df_eval['relevance'] == 'RELEVANT']\n", "relevant_answers.head(5)\n" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
     answer                                              id       question                                            relevance     explanation
208  The context provided does not include informat...  3284724  What are the different income levels for provi...  NON_RELEVANT  The generated answer does not provide any info...
\n", "
" ], "text/plain": [ " answer id \\\n", "208 The context provided does not include informat... 3284724 \n", "\n", " question relevance \\\n", "208 What are the different income levels for provi... NON_RELEVANT \n", "\n", " explanation \n", "208 The generated answer does not provide any info... " ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Filter non-relevant entries\n", "non_relevant_answers = df_eval[df_eval['relevance'] == 'NON_RELEVANT']\n", "non_relevant_answers" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "non_relevant_answers[['question','answer','explanation']]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Set display options to show full content\n", "pd.set_option('display.max_colwidth', None)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analysing Partly Relevant Answers" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "partly_relevant_answers = df_eval[df_eval['relevance'] == 'PARTLY_RELEVANT']\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "partly_relevant_answers[['question','answer','explanation']]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#df_eval.to_csv('../dataset/rag-eval-mistral.csv', index=False)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluating Mistralai Model" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0e3ded5a907f4fe58c4fbe4e590c9459", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/366 [00:01\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   answer                                              id       question                                            relevance  explanation
0  Based on the provided context, mental illnesse...  1590140  How do mental illnesses affect a person's dail...  RELEVANT   The generated answer directly addresses the qu...
1  Based on the provided context, some examples o...  1590140  What are some examples of serious mental illne...  RELEVANT   The generated answer directly addresses the qu...
2  It is a misconception to associate mental illn...  1590140  Why is it a misconception to associate mental ...  RELEVANT   The generated answer directly addresses the qu...
\n", "" ], "text/plain": [ " answer id \\\n", "0 Based on the provided context, mental illnesse... 1590140 \n", "1 Based on the provided context, some examples o... 1590140 \n", "2 It is a misconception to associate mental illn... 1590140 \n", "\n", " question relevance \\\n", "0 How do mental illnesses affect a person's dail... RELEVANT \n", "1 What are some examples of serious mental illne... RELEVANT \n", "2 Why is it a misconception to associate mental ... RELEVANT \n", "\n", " explanation \n", "0 The generated answer directly addresses the qu... \n", "1 The generated answer directly addresses the qu... \n", "2 The generated answer directly addresses the qu... " ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_eval_llama.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_eval_llama.to_csv('../dataset/rag_eval_llama.csv', index=False)\n" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "relevance\n", "RELEVANT 317\n", "NON_RELEVANT 31\n", "PARTLY_RELEVANT 18\n", "Name: count, dtype: int64\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\USER\\AppData\\Local\\Temp\\ipykernel_16892\\4118569142.py:8: FutureWarning: \n", "\n", "Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.\n", "\n", " ax = sns.countplot(data=df_eval, x='relevance', palette='viridis')\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Calculate counts and percentages\n", "counts = df_eval_llama['relevance'].value_counts()\n", "print(counts)\n", "percentages = counts / counts.sum() * 100\n", "\n", "# Create bar plot\n", "plt.figure(figsize=(8, 5))\n", "ax = sns.countplot(data=df_eval, x='relevance', palette='viridis')\n", "\n", "# Add percentage labels on top of the bars\n", "for p in ax.patches:\n", " height = p.get_height()\n", " ax.annotate(f'{height / counts.sum() * 100:.1f}%', \n", " (p.get_x() + p.get_width() / 2., height), \n", " ha='center', va='bottom')\n", "\n", "plt.title('Distribution of Relevance For llama')\n", "plt.xlabel('Relevance')\n", "plt.ylabel('Count')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 4 }