{ "cells": [ { "cell_type": "markdown", "id": "604ab692-a51f-4093-bf94-45b503c68d33", "metadata": {}, "source": [ "# AgentReview\n", "\n", "\n", "\n", "In this tutorial, you will explore customizing the AgentReview experiment.\n", "\n", "📑 Venue: EMNLP 2024 (Oral)\n", "\n", "🔗 arXiv: [https://arxiv.org/abs/2406.12708](https://arxiv.org/abs/2406.12708)\n", "\n", "🌐 Website: [https://agentreview.github.io/](https://agentreview.github.io/)\n", "\n", "```bibtex\n", "@inproceedings{jin2024agentreview,\n", " title={AgentReview: Exploring Peer Review Dynamics with LLM Agents},\n", " author={Jin, Yiqiao and Zhao, Qinlin and Wang, Yiyang and Chen, Hao and Zhu, Kaijie and Xiao, Yijia and Wang, Jindong},\n", " booktitle={EMNLP},\n", " year={2024}\n", "}\n", "```\n" ] }, { "cell_type": "code", "execution_count": null, "id": "bdb3190e-09cf-44e7-b539-f531dfc68446", "metadata": {}, "outputs": [], "source": [ "import os\n", "os.environ[\"OPENAI_API_VERSION\"] = \"2023-05-15\"" ] }, { "cell_type": "markdown", "id": "09de5377-25f3-4363-923a-c597ec6e52d0", "metadata": {}, "source": [ "## Specify OpenAI Keys\n", "\n", "### OpenAI\n", "\n", "If you use OpenAI client, specify your OpenAI key here" ] }, { "cell_type": "code", "execution_count": null, "id": "62906f8a-6aef-4d48-8a3e-ba0b9c3d5b4b", "metadata": {}, "outputs": [], "source": [ "# If you use either OpenAI or AzureOpenAI, specify the API key here\n", "os.environ['OPENAI_API_KEY'] = ... # Your OpenAI key here" ] }, { "cell_type": "markdown", "id": "7c2c9418-c67f-4824-b40f-40b5a7eea781", "metadata": {}, "source": [ "### AzureOpenAI\n", "\n", "If you use AzureOpenAI, specify these environment variables" ] }, { "cell_type": "code", "execution_count": null, "id": "5f85ee6f-49f0-419b-89a0-5f02e0f96200", "metadata": {}, "outputs": [], "source": [ "os.environ['AZURE_ENDPOINT'] = ... # Format: f\"https://YOUR_ENDPOINT.openai.azure.com\"\n", "os.environ['AZURE_DEPLOYMENT'] = ... 
# Your Azure OpenAI deployment here\n", "os.environ['OPENAI_API_VERSION'] = ...\n", "os.environ[\"AZURE_OPENAI_KEY\"] = ... # Your Azure OpenAI key here" ] }, { "cell_type": "markdown", "id": "2043b4f5-81a8-4886-ab9b-2ff6a794813a", "metadata": {}, "source": [ "## Overview\n", "\n", "AgentReview features a range of customizable variables, such as characteristics of reviewers, authors, area chairs (ACs), as well as the reviewing mechanisms " ] }, { "cell_type": "code", "execution_count": null, "id": "fed41214-73da-4c45-8760-55cb36f5ab9f", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "Image(filename=\"../static/img/Overview.png\")" ] }, { "cell_type": "markdown", "id": "64a27407-67d6-4506-9f84-c0d1f6c752eb", "metadata": {}, "source": [ "## Review Pipeline\n", "\n", "The simulation adopts a structured, 5-phase pipeline (Section 2 in the [paper](https://arxiv.org/abs/2406.12708)):\n", "\n", "* **I. Reviewer Assessment.** Each manuscript is evaluated by three reviewers independently.\n", "* **II. Author-Reviewer Discussion.** Authors submit rebuttals to address reviewers' concerns;\n", "* **III. Reviewer-AC Discussion.** The AC facilitates discussions among reviewers, prompting updates to their initial assessments.\n", "* **IV. Meta-Review Compilation.** The AC synthesizes the discussions into a meta-review.\n", "* **V. Paper Decision.** The AC makes the final decision on whether to accept or reject the paper, based on all gathered inputs." 
] }, { "cell_type": "code", "execution_count": null, "id": "f579fe52-2ced-408b-88a1-1b0c5da880f5", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "Image(filename=\"../static/img/ReviewPipeline.png\")" ] }, { "cell_type": "code", "execution_count": 1, "id": "274cc233-051a-444a-8170-a8b3acd30c80", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Changing the current working directory to AgentReview\n" ] } ], "source": [ "import os\n", "\n", "if os.path.basename(os.getcwd()) == \"notebooks\":\n", " os.chdir(\"..\")\n", "# Change the working directory to AgentReview\n", "print(f\"Changing the current working directory to {os.path.basename(os.getcwd())}\")" ] }, { "cell_type": "code", "execution_count": 2, "id": "664d2ade-94cb-44cc-9460-ba4092b8f311", "metadata": {}, "outputs": [], "source": [ "from argparse import Namespace\n", "\n", "args = Namespace(openai_key=None, \n", " deployment=None, \n", " openai_client_type='azure_openai', \n", " endpoint=None, \n", " api_version='2023-03-15-preview', \n", " ac_scoring_method='ranking', \n", " conference='ICLR2024', \n", " num_reviewers_per_paper=3, \n", " ignore_missing_metareviews=False, \n", " overwrite=False, \n", " num_papers_per_area_chair=10, \n", " model_name='gpt-4o', \n", " output_dir='outputs', \n", " max_num_words=16384, \n", " visual_dir='outputs/visual', \n", " device='cuda', \n", " data_dir='./data', # Directory to all paper PDF\n", " acceptance_rate=0.32, \n", " skip_logging=True, # If set, we do not log the messages in the console.\n", " task='paper_review')" ] }, { "cell_type": "code", "execution_count": 3, "id": "114d4525-3f47-4e2e-b91e-f7513ec4fa0e", "metadata": {}, "outputs": [], "source": [ "malicious_Rx1_setting = {\n", " \"AC\": [\n", " \"BASELINE\"\n", " ],\n", "\n", " \"reviewer\": [\n", " \"malicious\",\n", " \"BASELINE\",\n", " \"BASELINE\"\n", " ],\n", "\n", " \"author\": [\n", " \"BASELINE\"\n", " ],\n", " 
\"global_settings\":{\n", " \"provides_numeric_rating\": ['reviewer', 'ac'],\n", " \"persons_aware_of_authors_identities\": []\n", " }\n", "}\n", "\n", "all_settings = {\"malicious_Rx1\": malicious_Rx1_setting}\n", "args.experiment_name = \"malicious_Rx1\"" ] }, { "cell_type": "markdown", "id": "9e706786-4e0c-48f8-8d71-e1bbefeb1d8f", "metadata": {}, "source": [ "\n", "`malicious_Rx1` means 1 reviewer is a malicious reviewer, and the other reviewers are default (i.e. `BASELINE`) reviewers.\n", "\n" ] }, { "cell_type": "markdown", "id": "15ffecd4-4718-492e-b897-a5cceb6f3b6e", "metadata": {}, "source": [ "## Reviews\n", "\n", "Define the review pipeline" ] }, { "cell_type": "code", "execution_count": 4, "id": "4e22ff91-d72a-412f-8c8d-52b9251ff566", "metadata": {}, "outputs": [], "source": [ "import os\n", "import sys\n", "import numpy as np\n", "\n", "sys.path.append(os.path.abspath(os.path.join(os.getcwd(), \"agentreview\")))\n", "\n", "from agentreview.environments import PaperReview\n", "from agentreview.paper_review_arena import PaperReviewArena\n", "from agentreview.paper_review_settings import get_experiment_settings\n", "from agentreview.utility.experiment_utils import initialize_players\n", "from agentreview.utility.utils import project_setup, get_paper_decision_mapping\n", " \n", "from agentreview import const" ] }, { "cell_type": "code", "execution_count": 5, "id": "e0b7658b-742f-46d7-858a-684f3d8ce8ad", "metadata": {}, "outputs": [], "source": [ "def review_one_paper(paper_id, setting):\n", " args.task = \"paper_review\"\n", " paper_decision = paper_id2decision[paper_id]\n", "\n", " experiment_setting = get_experiment_settings(paper_id=paper_id,\n", " paper_decision=paper_decision,\n", " setting=setting)\n", " print(f\"Paper ID: {paper_id} (Decision in {args.conference}: {paper_decision})\")\n", "\n", " players = initialize_players(experiment_setting=experiment_setting, args=args)\n", "\n", " player_names = [player.name for player in players]\n", "\n", " 
env = PaperReview(player_names=player_names, paper_decision=paper_decision, paper_id=paper_id,\n", " args=args, experiment_setting=experiment_setting)\n", "\n", " arena = PaperReviewArena(players=players, environment=env, args=args)\n", " arena.launch_cli(interactive=False)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ed7cd3a8-bd7a-4c21-bfc7-fc7982a13a0c", "metadata": { "scrolled": true }, "outputs": [], "source": [ "sampled_paper_ids = [39]\n", "sampled_paper_ids = [39, 247, 289, 400]\n", "\n", "paper_id2decision, paper_decision2ids = get_paper_decision_mapping(args.data_dir, args.conference)\n", "\n", "for paper_id in sampled_paper_ids:\n", " review_one_paper(paper_id, malicious_Rx1_setting)" ] }, { "cell_type": "markdown", "id": "de642af7-af85-46a8-9570-3dd599223d00", "metadata": {}, "source": [ "Note: Sometimes metareview fails to load due to content filtering. We thus use `experimental_paper_ids` to track the paper IDs that were actually used in the experiment." ] }, { "cell_type": "code", "execution_count": 6, "id": "f1814583-3221-4a2a-a141-f20f5aae5906", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Shuffling paper IDs\n", "[1 3 0 2]\n", "[247 400 39 289]\n", "247\n", "400\n", "39\n", "289\n", "TODO\n", "[247, 400, 39, 289] 2\n" ] }, { "data": { "text/html": [ "
\n", " _ _____ _ \n", " /\\ | | | __ \\ (_) \n", " / \\ __ _ ___ _ __ | |_| |__) |_____ ___ _____ __\n", " / /\\ \\ / _` |/ _ \\ '_ \\| __| _ // _ \\ \\ / / |/ _ \\ \\ /\\ / /\n", " / ____ \\ (_| | __/ | | | |_| | \\ \\ __/\\ V /| | __/\\ V V / \n", " /_/ \\_\\__, |\\___|_| |_|\\__|_| \\_\\___| \\_/ |_|\\___| \\_/\\_/ \n", " __/ | \n", " |___/ \n", "\n", "\n" ], "text/plain": [ "\n", "\u001b[1;38;5;166m _ _____ _ \u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m\\ | | | __ \\ \u001b[0m\u001b[1;38;5;166m(\u001b[0m\u001b[1;38;5;166m_\u001b[0m\u001b[1;38;5;166m)\u001b[0m\u001b[1;38;5;166m \u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \\ __ _ ___ _ __ | |_| |__\u001b[0m\u001b[1;38;5;166m)\u001b[0m\u001b[1;38;5;166m |_____ ___ _____ __\u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m\\ \\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _` |\u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _ \\ '_ \\| __| _ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _ \\ \\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m |\u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _ \\ \\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m\\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m ____ \\ \u001b[0m\u001b[1;38;5;166m(\u001b[0m\u001b[1;38;5;166m_| | __/ | | | |_| | \\ \\ __/\\ V \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m| | __/\\ V V \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/_/\u001b[0m\u001b[1;38;5;166m \\_\\__, |\\___|_| |_|\\__|_| \\_\\___| \\_/ |_|\\___| \\_/\\_/ \u001b[0m\n", "\u001b[1;38;5;166m __/ | \u001b[0m\n", "\u001b[1;38;5;166m |___/ \u001b[0m\n", "\n" ] }, "metadata": 
{}, "output_type": "display_data" }, { "data": { "text/html": [ "
🎓AgentReview Initialized!\n",
"
\n"
],
"text/plain": [
"\u001b[1;32m🎓AgentReview Initialized!\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"name_to_color: {'AC': 'blue'}\n"
]
},
{
"data": {
"text/html": [
"Environment (paper_decision) description:\n",
"This is a realistic simulation of academic peer review.\n",
"
\n"
],
"text/plain": [
"\u001b[1;4;32mEnvironment \u001b[0m\u001b[1;4;32m(\u001b[0m\u001b[1;4;32mpaper_decision\u001b[0m\u001b[1;4;32m)\u001b[0m\u001b[1;4;32m description:\u001b[0m\n",
"This is a realistic simulation of academic peer review.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\u001b[32m\n", "========= Arena Start! ==========\n", "\u001b[0m\n", "\n" ], "text/plain": [ "\u001b\u001b[1m[\u001b[0m32m\n", "========= Arena Start! ==========\n", "\u001b\u001b[1m[\u001b[0m0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\u001b[34m[AC->all]: Paper ID: 400\n", "Willingness to accept: 1\n", "Paper ID: 247\n", "Willingness to accept: 2\u001b[0m\n", "\n" ], "text/plain": [ "\u001b\u001b[1m[\u001b[0m34m\u001b[1m[\u001b[0mAC->all\u001b[1m]\u001b[0m: Paper ID: \u001b[1;36m400\u001b[0m\n", "Willingness to accept: \u001b[1;36m1\u001b[0m\n", "Paper ID: \u001b[1;36m247\u001b[0m\n", "Willingness to accept: \u001b[1;36m2\u001b[0m\u001b\u001b[1m[\u001b[0m0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n",
"========= Arena Ended! ==========\n",
"\n",
"
\n"
],
"text/plain": [
"\n",
"\u001b[1;31m========= Arena Ended! ==========\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loaded 4 batches of existing AC decisions from outputs/decisions/ICLR2024/gpt-4o/decisions_thru_ranking/decision_malicious_Rx1.json\n"
]
},
{
"data": {
"text/html": [
"\n", " _ _____ _ \n", " /\\ | | | __ \\ (_) \n", " / \\ __ _ ___ _ __ | |_| |__) |_____ ___ _____ __\n", " / /\\ \\ / _` |/ _ \\ '_ \\| __| _ // _ \\ \\ / / |/ _ \\ \\ /\\ / /\n", " / ____ \\ (_| | __/ | | | |_| | \\ \\ __/\\ V /| | __/\\ V V / \n", " /_/ \\_\\__, |\\___|_| |_|\\__|_| \\_\\___| \\_/ |_|\\___| \\_/\\_/ \n", " __/ | \n", " |___/ \n", "\n", "\n" ], "text/plain": [ "\n", "\u001b[1;38;5;166m _ _____ _ \u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m\\ | | | __ \\ \u001b[0m\u001b[1;38;5;166m(\u001b[0m\u001b[1;38;5;166m_\u001b[0m\u001b[1;38;5;166m)\u001b[0m\u001b[1;38;5;166m \u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \\ __ _ ___ _ __ | |_| |__\u001b[0m\u001b[1;38;5;166m)\u001b[0m\u001b[1;38;5;166m |_____ ___ _____ __\u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m\\ \\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _` |\u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _ \\ '_ \\| __| _ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _ \\ \\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m |\u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m _ \\ \\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m\\ \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m ____ \\ \u001b[0m\u001b[1;38;5;166m(\u001b[0m\u001b[1;38;5;166m_| | __/ | | | |_| | \\ \\ __/\\ V \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m| | __/\\ V V \u001b[0m\u001b[1;35m/\u001b[0m\u001b[1;38;5;166m \u001b[0m\n", "\u001b[1;38;5;166m \u001b[0m\u001b[1;35m/_/\u001b[0m\u001b[1;38;5;166m \\_\\__, |\\___|_| |_|\\__|_| \\_\\___| \\_/ |_|\\___| \\_/\\_/ \u001b[0m\n", "\u001b[1;38;5;166m __/ | \u001b[0m\n", "\u001b[1;38;5;166m |___/ \u001b[0m\n", "\n" ] }, "metadata": 
{}, "output_type": "display_data" }, { "data": { "text/html": [ "
🎓AgentReview Initialized!\n",
"
\n"
],
"text/plain": [
"\u001b[1;32m🎓AgentReview Initialized!\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"name_to_color: {'AC': 'blue'}\n"
]
},
{
"data": {
"text/html": [
"Environment (paper_decision) description:\n",
"This is a realistic simulation of academic peer review.\n",
"
\n"
],
"text/plain": [
"\u001b[1;4;32mEnvironment \u001b[0m\u001b[1;4;32m(\u001b[0m\u001b[1;4;32mpaper_decision\u001b[0m\u001b[1;4;32m)\u001b[0m\u001b[1;4;32m description:\u001b[0m\n",
"This is a realistic simulation of academic peer review.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\u001b[32m\n", "========= Arena Start! ==========\n", "\u001b[0m\n", "\n" ], "text/plain": [ "\u001b\u001b[1m[\u001b[0m32m\n", "========= Arena Start! ==========\n", "\u001b\u001b[1m[\u001b[0m0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\u001b[34m[AC->all]: Paper ID: 289\n", "Willingness to accept: 1\n", "Paper ID: 39\n", "Willingness to accept: 2\u001b[0m\n", "\n" ], "text/plain": [ "\u001b\u001b[1m[\u001b[0m34m\u001b[1m[\u001b[0mAC->all\u001b[1m]\u001b[0m: Paper ID: \u001b[1;36m289\u001b[0m\n", "Willingness to accept: \u001b[1;36m1\u001b[0m\n", "Paper ID: \u001b[1;36m39\u001b[0m\n", "Willingness to accept: \u001b[1;36m2\u001b[0m\u001b\u001b[1m[\u001b[0m0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n",
"========= Arena Ended! ==========\n",
"\n",
"
\n"
],
"text/plain": [
"\n",
"\u001b[1;31m========= Arena Ended! ==========\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loaded 5 batches of existing AC decisions from outputs/decisions/ICLR2024/gpt-4o/decisions_thru_ranking/decision_malicious_Rx1.json\n"
]
}
],
"source": [
"from agentreview.environments import PaperDecision\n",
"from agentreview.utility.utils import project_setup, get_paper_decision_mapping, \\\n",
" load_metareview, load_llm_ac_decisions_as_array\n",
"\n",
"args.task = \"paper_decision\"\n",
"\n",
"sampled_paper_ids = [39, 247, 289, 400]\n",
"\n",
"# Make sure the same set of papers always go through the same AC no matter which setting we choose\n",
"NUM_PAPERS = len(sampled_paper_ids)\n",
"order = np.random.choice(range(NUM_PAPERS), size=NUM_PAPERS, replace=False)\n",
"\n",
"\n",
"# Paper IDs we actually used in experiments\n",
"experimental_paper_ids = []\n",
"\n",
"# For papers that have not been decided yet, load their metareviews\n",
"metareviews = []\n",
"print(\"Shuffling paper IDs\")\n",
"print(order)\n",
"sampled_paper_ids = np.array(sampled_paper_ids)[order]\n",
"\n",
"print(sampled_paper_ids)\n",
"for paper_id in sampled_paper_ids:\n",
" print(paper_id)\n",
" # Since we are feeding a batch of paper, the paper_id and paper_decision fields \n",
" # are not specific to one paper, thus left None\n",
" experiment_setting = get_experiment_settings(paper_id=None,\n",
" paper_decision=None,\n",
" setting=all_settings[args.experiment_name])\n",
"\n",
" # Load meta-reviews\n",
" metareview = load_metareview(output_dir=args.output_dir, paper_id=paper_id,\n",
" experiment_name=args.experiment_name,\n",
" model_name=args.model_name, conference=args.conference)\n",
"\n",
" if metareview is None:\n",
"\n",
" print(f\"Metareview for {paper_id} does not exist. This may happen because the conversation is \"\n",
" f\"completely filtered out due to content policy. \"\n",
" f\"Loading the BASELINE metareview...\")\n",
"\n",
" metareview = load_metareview(paper_id=paper_id, experiment_name=\"BASELINE\",\n",
" model_name=args.model_name, conference=args.conference)\n",
" print(metareview)\n",
"\n",
" if metareview is not None:\n",
" metareviews += [metareview]\n",
" experimental_paper_ids += [paper_id]\n",
"\n",
"print(\"TODO\")\n",
"args.num_papers_per_area_chair = 2\n",
"num_batches = len(experimental_paper_ids) // args.num_papers_per_area_chair\n",
"print(experimental_paper_ids, num_batches)\n",
"\n",
"for batch_index in range(num_batches):\n",
"\n",
" players = initialize_players(experiment_setting=experiment_setting, args=args)\n",
" player_names = [player.name for player in players]\n",
"\n",
" if batch_index >= num_batches - 1: # Last batch. Include all remaining papers\n",
" batch_paper_ids = experimental_paper_ids[batch_index * args.num_papers_per_area_chair:]\n",
"\n",
" else:\n",
" batch_paper_ids = experimental_paper_ids[batch_index * args.num_papers_per_area_chair: (batch_index + 1) *\n",
" args.num_papers_per_area_chair]\n",
"\n",
" env = PaperDecision(player_names=player_names, paper_ids=batch_paper_ids,\n",
" metareviews=metareviews,\n",
" experiment_setting=experiment_setting, ac_scoring_method=args.ac_scoring_method)\n",
"\n",
" arena = PaperReviewArena(players=players, environment=env, args=args, global_prompt=const.GLOBAL_PROMPT)\n",
" arena.launch_cli(interactive=False)\n"
]
},
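{
"cell_type": "markdown",
"id": "c0ffee00-0000-4000-8000-000000000001",
"metadata": {},
"source": [
"The loop above splits `experimental_paper_ids` into batches of `args.num_papers_per_area_chair`, with the last batch absorbing any remaining papers. A minimal sketch of that partitioning logic (the helper name `partition_into_batches` is hypothetical, not part of AgentReview):\n",
"\n",
"```python\n",
"def partition_into_batches(paper_ids, papers_per_chair):\n",
"    num_batches = len(paper_ids) // papers_per_chair\n",
"    batches = []\n",
"    for i in range(num_batches):\n",
"        start = i * papers_per_chair\n",
"        # Last batch: include all remaining papers\n",
"        end = None if i >= num_batches - 1 else start + papers_per_chair\n",
"        batches.append(paper_ids[start:end])\n",
"    return batches\n",
"```\n",
"\n",
"For the four sampled papers with 2 papers per area chair, this yields two batches, matching the `num_batches = 2` printed above.\n"
]
},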
{
"cell_type": "code",
"execution_count": 8,
"id": "0a3eb359-2814-49ac-bf2b-b13219ddb3e3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==============================\n",
"Experiment Name: malicious_Rx1\n",
"Loaded 2 batches of existing AC decisions from outputs/decisions/ICLR2024/gpt-4o/decisions_thru_ranking/decision_malicious_Rx1.json\n"
]
}
],
"source": [
"decisions, paper_ids = load_llm_ac_decisions_as_array(output_dir=args.output_dir, conference=args.conference, \n",
" model_name=args.model_name,\n",
" ac_scoring_method=args.ac_scoring_method,\n",
" experiment_name=args.experiment_name,\n",
" acceptance_rate=args.acceptance_rate,\n",
" num_papers_per_area_chair=args.num_papers_per_area_chair)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "72fe60ab-8324-4632-b84b-ac2a7b560a5a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"39\tReject\n",
"247\tReject\n",
"289\tReject\n",
"400\tAccept\n"
]
}
],
"source": [
"for paper_id, decision in zip(paper_ids, decisions):\n",
" print(f\"{paper_id}\\t{'Accept' if decision else 'Reject'}\")"
]
},
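{
"cell_type": "markdown",
"id": "c0ffee00-0000-4000-8000-000000000002",
"metadata": {},
"source": [
"With `ac_scoring_method='ranking'`, the AC ranks papers within each batch rather than scoring them independently, and the `acceptance_rate` argument (0.32 here) presumably controls what fraction of the ranked papers convert to `Accept`. A quick sanity check on the decisions above (assumes `decisions` behaves like a boolean array):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# In the run above, 1 of the 4 sampled papers was accepted\n",
"print(f\"Empirical acceptance rate: {np.mean(decisions):.2f}\")\n",
"```\n"
]
},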
{
"cell_type": "code",
"execution_count": null,
"id": "3da97ae2-33fa-4399-a1bc-8356ac65f243",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}