{ "cells": [ { "cell_type": "markdown", "id": "68e9310f-109d-4f30-b263-d1e6c058ee80", "metadata": {}, "source": [ "# Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "6805b3b5-782b-437c-82b3-9392abb5a599", "metadata": { "tags": [] }, "outputs": [], "source": [ "# %pip install -q -r requirements.txt" ] }, { "cell_type": "markdown", "id": "94f0fcdd-1653-440e-8ebc-9c33d931163a", "metadata": {}, "source": [ "## Config" ] }, { "cell_type": "code", "execution_count": 1, "id": "5d0bd22f-293e-4c15-9dfe-8070553f42b5", "metadata": { "tags": [] }, "outputs": [], "source": [ "INPUT_DATASET = 'derek-thomas/labeled-multiple-choice-explained-falcon-reasoning'\n", "REVISION = '536f3b8'\n", "OUTPUT_DATASET = 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized'" ] }, { "cell_type": "code", "execution_count": 2, "id": "a1fc7a29-6b60-446d-b708-012f897de6a9", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6b7b5851e4944f36a31679d28c6d60e4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='
\n", " | formatted_question | \n", "combined_fact | \n", "answer_key | \n", "topic | \n", "gpt3_5_reasoning | \n", "question_text | \n", "answer_choices | \n", "falcon_reasoning | \n", "
---|---|---|---|---|---|---|---|---|
0 | \n", "what is satellite technology used for predicti... | \n", "satellite technology is used for predicting wh... | \n", "c | \n", "Technology | \n", "a) Seconds and minutes: This option is incorre... | \n", "What is satellite technology used for predicting? | \n", "(a) Seconds and minutes (b) The strength and m... | \n", "- (a) Seconds and minutes: Satellite technolog... | \n", "
1 | \n", "what does irradiating food do? (a) relieve pai... | \n", "irradiated food improves food safety. | \n", "c | \n", "Food science | \n", "(a) Relieve pain: This option is not correct b... | \n", "What does irradiating food do? | \n", "(a) Relieve pain (b) Enhance food's nutrients ... | \n", "(a) Relieve pain: Irradiating food does not ha... | \n", "
2 | \n", "what protects a mammal's skin? (a) fiber folli... | \n", "fiber follicles protect mammal skin | \n", "a | \n", "Biology | \n", "b) Exfoliation: Exfoliation is the process of ... | \n", "What protects a mammal's skin? | \n", "(a) Fiber follicles (b) Exfoliation (c) Resist... | \n", "(a) **Fiber follicles**: This is the correct a... | \n", "
3 | \n", "what do earthworms do when a segment breaks of... | \n", "earthworms can regrow segments that break off | \n", "b | \n", "Biology | \n", "a) Dies: This option is not correct because ea... | \n", "What do earthworms do when a segment breaks off? | \n", "(a) Dies (b) Regrows it (c) Reproduces (d) Sed... | \n", "1. **Option (a): Dies**\\n - Earthworms are s... | \n", "
4 | \n", "lightning can be bad for what? (a) the environ... | \n", "lightning can be bad for the environment. | \n", "a | \n", "Electricity | \n", "b) Rainstorms: Lightning is actually a natural... | \n", "Lightning can be bad for what? | \n", "(a) The environment (b) Rainstorms (c) Destruc... | \n", "(a) The environment: Lightning can release lar... | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
8408 | \n", "organisms that can cause infection do what? (a... | \n", "organisms that can cause infection make humans... | \n", "g | \n", "Biology | \n", "a) Bandaging open sores is not the correct ans... | \n", "Organisms that can cause infection do what? | \n", "(a) Bandage open sores (b) Keep flesh clean (c... | \n", "(a) Bandage open sores: This action is typical... | \n", "
8409 | \n", "fungi are living things that cannot make thei... | \n", "fungi are living things that cannot make their... | \n", "a | \n", "Biology | \n", "b) Fungi are living things that can make their... | \n", "Fungi are living things that cannot make their... | \n", "(a) Food (b) Cells (c) Energy (d) Fruits (e) H... | \n", "1. **Read the question and options carefully.*... | \n", "
8410 | \n", "an overheated body can use water for: (a) meta... | \n", "the evaporation of water from the skin cools t... | \n", "g | \n", "Biology | \n", "a) Metabolic reaction: This option is incorrec... | \n", "An overheated body can use water for:? | \n", "(a) Metabolic reaction (b) Dehydrating (c) Rai... | \n", "- (a) Metabolic reaction: This is incorrect be... | \n", "
8411 | \n", "what is essential for cellular respiration for... | \n", "plants are essential for cellular respiration ... | \n", "f | \n", "Biology | \n", "a) Electrons are involved in cellular respirat... | \n", "What is essential for cellular respiration for... | \n", "(a) Electron (b) Glucose (c) Energy (d) Energy... | \n", "1. **Glucose (b)**: Glucose is one of the reac... | \n", "
8412 | \n", "what helps insulate and protect the body? (a) ... | \n", "living cells in follicles help insulate and pr... | \n", "b | \n", "Biology | \n", "a) H2O: Water is essential for life, but it do... | \n", "What helps insulate and protect the body? | \n", "(a) H2o (b) Living cells in follicles (c) Laye... | \n", "1. **Read the question and options carefully.*... | \n", "
8413 rows × 8 columns
\n", "" ], "text/plain": [ " formatted_question \\\n", "0 what is satellite technology used for predicti... \n", "1 what does irradiating food do? (a) relieve pai... \n", "2 what protects a mammal's skin? (a) fiber folli... \n", "3 what do earthworms do when a segment breaks of... \n", "4 lightning can be bad for what? (a) the environ... \n", "... ... \n", "8408 organisms that can cause infection do what? (a... \n", "8409 fungi are living things that cannot make thei... \n", "8410 an overheated body can use water for: (a) meta... \n", "8411 what is essential for cellular respiration for... \n", "8412 what helps insulate and protect the body? (a) ... \n", "\n", " combined_fact answer_key \\\n", "0 satellite technology is used for predicting wh... c \n", "1 irradiated food improves food safety. c \n", "2 fiber follicles protect mammal skin a \n", "3 earthworms can regrow segments that break off b \n", "4 lightning can be bad for the environment. a \n", "... ... ... \n", "8408 organisms that can cause infection make humans... g \n", "8409 fungi are living things that cannot make their... a \n", "8410 the evaporation of water from the skin cools t... g \n", "8411 plants are essential for cellular respiration ... f \n", "8412 living cells in follicles help insulate and pr... b \n", "\n", " topic gpt3_5_reasoning \\\n", "0 Technology a) Seconds and minutes: This option is incorre... \n", "1 Food science (a) Relieve pain: This option is not correct b... \n", "2 Biology b) Exfoliation: Exfoliation is the process of ... \n", "3 Biology a) Dies: This option is not correct because ea... \n", "4 Electricity b) Rainstorms: Lightning is actually a natural... \n", "... ... ... \n", "8408 Biology a) Bandaging open sores is not the correct ans... \n", "8409 Biology b) Fungi are living things that can make their... \n", "8410 Biology a) Metabolic reaction: This option is incorrec... \n", "8411 Biology a) Electrons are involved in cellular respirat... 
\n", "8412 Biology a) H2O: Water is essential for life, but it do... \n", "\n", " question_text \\\n", "0 What is satellite technology used for predicting? \n", "1 What does irradiating food do? \n", "2 What protects a mammal's skin? \n", "3 What do earthworms do when a segment breaks off? \n", "4 Lightning can be bad for what? \n", "... ... \n", "8408 Organisms that can cause infection do what? \n", "8409 Fungi are living things that cannot make their... \n", "8410 An overheated body can use water for:? \n", "8411 What is essential for cellular respiration for... \n", "8412 What helps insulate and protect the body? \n", "\n", " answer_choices \\\n", "0 (a) Seconds and minutes (b) The strength and m... \n", "1 (a) Relieve pain (b) Enhance food's nutrients ... \n", "2 (a) Fiber follicles (b) Exfoliation (c) Resist... \n", "3 (a) Dies (b) Regrows it (c) Reproduces (d) Sed... \n", "4 (a) The environment (b) Rainstorms (c) Destruc... \n", "... ... \n", "8408 (a) Bandage open sores (b) Keep flesh clean (c... \n", "8409 (a) Food (b) Cells (c) Energy (d) Fruits (e) H... \n", "8410 (a) Metabolic reaction (b) Dehydrating (c) Rai... \n", "8411 (a) Electron (b) Glucose (c) Energy (d) Energy... \n", "8412 (a) H2o (b) Living cells in follicles (c) Laye... \n", "\n", " falcon_reasoning \n", "0 - (a) Seconds and minutes: Satellite technolog... \n", "1 (a) Relieve pain: Irradiating food does not ha... \n", "2 (a) **Fiber follicles**: This is the correct a... \n", "3 1. **Option (a): Dies**\\n - Earthworms are s... \n", "4 (a) The environment: Lightning can release lar... \n", "... ... \n", "8408 (a) Bandage open sores: This action is typical... \n", "8409 1. **Read the question and options carefully.*... \n", "8410 - (a) Metabolic reaction: This is incorrect be... \n", "8411 1. **Glucose (b)**: Glucose is one of the reac... \n", "8412 1. **Read the question and options carefully.*... 
\n", "\n", "[8413 rows x 8 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Load dataset from Hugging Face Hub\n", "dataset = load_dataset(INPUT_DATASET, split='train')\n", "\n", "# Convert to pandas dataframe\n", "df = dataset.to_pandas()\n", "print(f\"Before Cleaning: {len(df)} rows\")\n", "print(df.columns)\n", "\n", "# Drop the falcon_reasoning_prompt column if it exists\n", "df.drop(columns=['falcon_reasoning_prompt'], errors='ignore', inplace=True)\n", "\n", "# Rename 'explanation' to 'gpt3_5_reasoning' so the column name identifies its source model\n", "df.rename(columns={\n", " 'explanation': 'gpt3_5_reasoning',\n", "}, inplace=True)\n", "\n", "# Fix formatting: strip double quotes from the question text and\n", "# swap them for single quotes in both reasoning columns\n", "df['question_text'] = df['question_text'].str.replace('\"', '', regex=False)\n", "df['gpt3_5_reasoning'] = df['gpt3_5_reasoning'].str.replace('\"', \"'\", regex=False)\n", "df['falcon_reasoning'] = df['falcon_reasoning'].str.replace('\"', \"'\", regex=False)\n", "\n", "df" ] }, { "cell_type": "markdown", "id": "2511bc04-f611-4dc7-b3ed-e477907b0200", "metadata": {}, "source": [ "## Create Prompts from Processed Data" ] }, { "cell_type": "markdown", "id": "d124c7cf-a369-46a9-94db-069894145959", "metadata": {}, "source": [ "We need to convert each sample into a format similar to the one below for each of the scenarios. This is ideal since we can use [chat templates](https://huggingface.co/docs/transformers/en/chat_templating) to easily switch between models that might have different special tokens.\n", "\n", "```\n", "[\n", " {\"content\": system_prompt, \"role\": \"system\"},\n", " {\"content\": user_content, \"role\": \"user\"},\n", " {\"content\": assistant_response, \"role\": \"assistant\"}\n", "]\n", "```\n", "\n", "We should include a helpful `system_prompt` with a general trivia prefix and a suffix containing instructions that fit each scenario.\n", "The `user_content` will contain the question and answer choices.\n", "The `assistant_response` should reflect the scenario. 
" ] }, { "cell_type": "markdown", "id": "c85b3c11-18d7-4854-a0ba-ad0c1407fd6d", "metadata": {}, "source": [ "Its best to understand `template_blocks` in a couple layers. \n", "- The top layer (macro) allows me to decide which pieces I want to include. Sometimes I want just the `system`+`user` message, and for fine-tuning Ill want `system`+`user`+`assistant`\n", "- System+User:\n", " - Inside the this layer I use jinja to interpolate the values I want to add\n", " - I moved `user_content` out to get a feel for how it looks\n", "- Assistant:\n", " - Here we have an if statement to allow me to chose between FA, RFA and FAR\n", " - Inside that we just have the same interploation as seen elsewhere\n", "\n", "You can see in `initial` and `full` the json for the messages structure. Here Im selecting which macros I want to use." ] }, { "cell_type": "code", "execution_count": 6, "id": "f42f3c34-f736-4e1c-b904-418caf2b0de1", "metadata": {}, "outputs": [], "source": [ "from jinja2 import Environment, DictLoader\n", "\n", "template_blocks = '''\n", "{%- macro user_message(system_content, question_text, answer_choices) -%}\n", "{\n", " \"role\": \"system\",\n", " \"content\": {{ system_content }}\n", "},\n", "{\n", " \"role\": \"user\",\n", " \"content\": \"Question: {{ question_text }}\\\\nAnswer Choices: {{ answer_choices }}\"\n", "\n", "}\n", "{%- endmacro %}\n", "\n", "{% macro assistant_response(reasoning, answer_key, response_order='default') -%}\n", "{\n", " \"role\": \"assistant\",\n", " \"content\": {\n", " {% if response_order == 'rfa' -%}\n", " \"reasoning\": {{ reasoning | tojson }},\n", " \"final_answer\": \"{{ answer_key }}\"\n", " {% elif response_order == 'far' -%}\n", " \"final_answer\": \"{{ answer_key }}\",\n", " \"reasoning\": {{ reasoning | tojson }}\n", " {% else -%}\n", " \"final_answer\": \"{{ answer_key }}\"\n", " {% endif %}\n", " }\n", "}\n", "{%- endmacro %}\n", "'''\n", "\n", "# System + User only (initial template)\n", "initial = '''\n", "[\n", " 
{{ user_message(system_content, question_text, answer_choices) }}\n", "]\n", "'''\n", "\n", "# Full conversation template\n", "full = '''\n", "[\n", " {{ user_message(system_content, question_text, answer_choices) }},\n", " {{ assistant_response(reasoning, answer_key, response_order) }}\n", "]\n", "'''\n", "\n", "# Create Jinja environment and load templates\n", "env = Environment(loader=DictLoader({\n", " 'template_blocks': template_blocks,\n", " 'initial': initial,\n", " 'full': full\n", "}))\n", "\n", "# # Load the macro definitions into the environment\n", "macro_template = env.get_template('template_blocks')\n", "env.globals.update(macro_template.module.__dict__)\n", "\n", "# Compile full and initial templates\n", "full_template = env.get_template('full')\n", "initial_template = env.get_template('initial')" ] }, { "cell_type": "markdown", "id": "35b2b21b-6ecf-490d-a453-7d679e3b1877", "metadata": {}, "source": [ "### Reasoning Final Answer" ] }, { "cell_type": "code", "execution_count": 7, "id": "eccb2f71-70a9-41fc-8235-d58b8876bdf1", "metadata": {}, "outputs": [], "source": [ "rfa_system_content = 'Answer the Question and include your reasoning and the final answer in a json like: {\"reasoning\":\n", " | topic | \n", "question_text | \n", "answer_key | \n", "gpt3_5_reasoning | \n", "falcon_reasoning | \n", "answer_choices | \n", "user_prompt_RFA | \n", "conversation_RFA_gpt3_5 | \n", "conversation_RFA_falcon | \n", "user_prompt_FAR | \n", "conversation_FAR_gpt3_5 | \n", "conversation_FAR_falcon | \n", "user_prompt_FA | \n", "conversation_FA | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "Technology | \n", "What is satellite technology used for predicting? | \n", "c | \n", "a) Seconds and minutes: This option is incorre... | \n", "- (a) Seconds and minutes: Satellite technolog... | \n", "(a) Seconds and minutes (b) The strength and m... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
1 | \n", "Food science | \n", "What does irradiating food do? | \n", "c | \n", "(a) Relieve pain: This option is not correct b... | \n", "(a) Relieve pain: Irradiating food does not ha... | \n", "(a) Relieve pain (b) Enhance food's nutrients ... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
2 | \n", "Biology | \n", "What protects a mammal's skin? | \n", "a | \n", "b) Exfoliation: Exfoliation is the process of ... | \n", "(a) **Fiber follicles**: This is the correct a... | \n", "(a) Fiber follicles (b) Exfoliation (c) Resist... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
3 | \n", "Biology | \n", "What do earthworms do when a segment breaks off? | \n", "b | \n", "a) Dies: This option is not correct because ea... | \n", "1. **Option (a): Dies**\\n - Earthworms are s... | \n", "(a) Dies (b) Regrows it (c) Reproduces (d) Sed... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
4 | \n", "Electricity | \n", "Lightning can be bad for what? | \n", "a | \n", "b) Rainstorms: Lightning is actually a natural... | \n", "(a) The environment: Lightning can release lar... | \n", "(a) The environment (b) Rainstorms (c) Destruc... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
8408 | \n", "Biology | \n", "Organisms that can cause infection do what? | \n", "g | \n", "a) Bandaging open sores is not the correct ans... | \n", "(a) Bandage open sores: This action is typical... | \n", "(a) Bandage open sores (b) Keep flesh clean (c... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
8409 | \n", "Biology | \n", "Fungi are living things that cannot make their... | \n", "a | \n", "b) Fungi are living things that can make their... | \n", "1. **Read the question and options carefully.*... | \n", "(a) Food (b) Cells (c) Energy (d) Fruits (e) H... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
8410 | \n", "Biology | \n", "An overheated body can use water for:? | \n", "g | \n", "a) Metabolic reaction: This option is incorrec... | \n", "- (a) Metabolic reaction: This is incorrect be... | \n", "(a) Metabolic reaction (b) Dehydrating (c) Rai... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
8411 | \n", "Biology | \n", "What is essential for cellular respiration for... | \n", "f | \n", "a) Electrons are involved in cellular respirat... | \n", "1. **Glucose (b)**: Glucose is one of the reac... | \n", "(a) Electron (b) Glucose (c) Energy (d) Energy... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
8412 | \n", "Biology | \n", "What helps insulate and protect the body? | \n", "b | \n", "a) H2O: Water is essential for life, but it do... | \n", "1. **Read the question and options carefully.*... | \n", "(a) H2o (b) Living cells in follicles (c) Laye... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "[{'role': 'system', 'content': 'Answer the Que... | \n", "
8413 rows × 14 columns
\n", "