<a href="https://colab.research.google.com/github/vanderbilt-data-science/lo-achievement/blob/main/instructor_intr_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Instructor Grading and Assessment
This notebook executes grading of student submissions of chats with ChatGPT, exported in JSON. Run each cell should be run in order, and follow the prompts displayed when appropriate.

In [35]:
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output
import io
import zipfile
import os
import json
import pandas as pd
import glob
from getpass import getpass

In [36]:
# "global" variables modified by mutability
grade_settings = {'learning_objectives':None,
                  'json_file_path':None,
                  'json_files':None }

The `InstructorGradingConfig` holds the contents of the instantiated object including making graindg settings, extracting files from a zip archive, loading JSON files into DataFrames, and displaying relevant information in the output widget.

In [37]:
class InstructorGradingConfig:
    def __init__(self):
        # layouts to help with styling
        self.items_layout = widgets.Layout(width='auto')

        self.box_layout = widgets.Layout(display='flex',
                                          flex_flow='column',
                                          align_items='stretch',
                                          width='50%',
                                          border='solid 1px gray',
                                          padding='0px 30px 20px 30px')

        # Create all components
        self.ui_title = widgets.HTML(value="<h2>Instructor Grading Configuration</h2>")

        self.run_button = widgets.Button(description='Submit', button_style='success', icon='check')
        self.status_output = widgets.Output()
        self.status_output.append_stdout('Waiting...')

        # Setup click behavior
        self.run_button.on_click(self._setup_environment)

        # Reset rest of state
        self.reset_state()

    def reset_state(self, close_all=False):

        if close_all:
            self.learning_objectives_text.close()
            self.file_upload.close()
            self.file_upload_box.close()
            #self.ui_container.close()

        self.learning_objectives_text = widgets.Textarea(value='', description='Learning Objectives',
                                                         placeholder='Learning objectives: 1. Understand and implement classes in object-oriented programming',
                                                         layout=self.items_layout,
                                                         style={'description_width': 'initial'})
        self.file_upload = widgets.FileUpload(
            accept='.zip',  # Accepted file extension e.g. '.txt', '.pdf', 'image/*', 'image/*,.pdf'
            multiple=False  # True to accept multiple files upload else False
        )
        self.file_upload_box = widgets.HBox([widgets.Label('Upload User Files:\t'), self.file_upload])


        # Create a VBox container to arrange the widgets vertically
        self.ui_container = widgets.VBox([self.ui_title, self.learning_objectives_text,
                                           self.file_upload_box, self.run_button, self.status_output],
                                          layout=self.box_layout)


    def _setup_environment(self, btn):
        grade_settings['learning_objectives'] = self.learning_objectives_text.value
        grade_settings['json_file_path'] = self.file_upload.value

        if self.file_upload.value:
            try:
                input_file = list(self.file_upload.value.values())[0]
                extracted_zip_dir = list(grade_settings['json_file_path'].keys())[0][:-4]
            except:
                input_file = self.file_upload.value[0]
                extracted_zip_dir = self.file_upload.value[0]['name'][:-4]

            self.status_output.clear_output()
            self.status_output.append_stdout('Loading zip file...\n')

            with zipfile.ZipFile(io.BytesIO(input_file['content']), "r") as z:
                z.extractall()
                extracted_files = z.namelist()

            self.status_output.append_stdout('Extracted files and directories: {0}\n'.format(', '.join(extracted_files)))

            # load all json files
            grade_settings['json_files'] = glob.glob(''.join([extracted_zip_dir, '/**/*.json']), recursive=True)

            #status_output.clear_output()
            self.status_output.append_stdout('Loading successful!\nLearning Objectives: {0}\nExtracted JSON files: {1}'.format(grade_settings['learning_objectives'],
                                                                                                        ', '.join(grade_settings['json_files'])))

        else:
            self.status_output.clear_output()
            self.status_output.append_stdout('Please upload a zip file.')

        # Clear values so they're not saved
        self.learning_objectives_text.value = ''
        self.reset_state(close_all=True)
        self.run_ui_container()

        with self.status_output:
            print('Extracted files and directories: {0}\n'.format(', '.join(extracted_files)))
            print('Loading successful!\nLearning Objectives: {0}\nExtracted JSON files: {1}'.format(grade_settings['learning_objectives'],
                                                                                                        ', '.join(grade_settings['json_files'])))
            print('Submitted and Reset all values.')


    def run_ui_container(self):
        display(self.ui_container, clear=True)

In [None]:
#This code helps in the case that we have problems with metadata being retained.
#!jupyter nbconvert --ClearOutputPreprocessor.enabled=True --ClearMetadataPreprocessor.enabled=True --ClearMetadataPreprocessor.preserve_cell_metadata_mask "colab" --ClearMetadataPreprocessor.preserve_cell_metadata_mask "kernelspec" --ClearMetadataPreprocessor.preserve_cell_metadata_mask "language_info" --to=notebook --output=instructor_inst_notebook.ipynb instructor_intr_notebook.ipynb

# User Settings and Submission Upload
The following two cells will ask you for your OpenAI API credentials and to upload the json file of the student submission.

In [4]:
InstructorGradingConfig().run_ui_container()

VBox(children=(HTML(value='<h2>Instructor Grading Configuration</h2>'), Textarea(value='', description='Learni…

You will need an OpenAI API key in order to access the chat functionality. In the following cell, you'll see a blank box pop up - copy your API key there and press enter.

In [5]:
# setup open AI api key
openai_api_key = getpass()

··········


# Execute Grading
Run this cell set to have the generative AI assist you in grading.

## Installation and Loading

In [6]:
%%capture
# install additional packages if needed
! pip install -q langchain openai

In [7]:
# import necessary libraries here
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import SystemMessage, HumanMessage, AIMessage
import openai

In [8]:
# Helper because lines are printed too long; helps with wrapping visualization
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [9]:
# Set pandas display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', 0)

Setting of API key in environment and other settings

In [10]:
#extract info from dictionary
json_file_path = grade_settings['json_file_path']
learning_objectives = grade_settings['learning_objectives']

#set API key
os.environ["OPENAI_API_KEY"] = openai_api_key
openai.api_key = openai_api_key

Initiate the OpenAI model using Langchain.

In [11]:
llm = ChatOpenAI(model='gpt-3.5-turbo-16k')
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="")
]

## Functions to help with loading json

`file_upload_json_to_df` helps when you use the file uploader as the json is directly read in this case. `clean_keys` helps when there are errors on the keys when reading.

In [12]:
# Strip beginning and ending newlines
def clean_keys(loaded_json):
  out_json = [{key.strip():value for key, value in json_dict.items()} for json_dict in loaded_json ]
  return out_json

# Convert difficult datatypes to newlines
def file_upload_json_to_df(upload_json):

  #get middle key of json to extract content
  fname = list(upload_json.keys())[0]

  #load the json; strict allows us to get around encoding issues
  loaded_json = json.loads(upload_json[fname]['content'], strict=False)

  #clean the keys if needed
  loaded_json = clean_keys(loaded_json)

  return pd.DataFrame(loaded_json)

`create_user_dataframe` filters based on role to create a dataframe for only user responses

In [13]:
def create_user_dataframe(df):
  df_user = df.query("`author` == 'user'")

  return df_user

`load_json_as_df` helps when you use the file uploader as the json is directly read in this case. It accepts the path to the JSON to load the dataframe based on the json.

In [131]:
def load_json_as_df(fpath):
    # check if file is .json
    if not fpath.endswith('.json'):
        return None

    keys = ["timestamp", "author", "message"]

    df_out = None
    out_error = None

    try:
        # Read JSON file
        with open(fpath, "r") as f:
            json_data = f.read()

        # Load JSON data
        data = json.loads(json_data, strict=False)

        # Quick check to see if we can fix common errors in json
        # 1. JSON responses wrapped in enclosing dictionary
        if isinstance(data, dict):
            if len(data.keys()) == 1:
                data = data[list(data.keys())[0]]
            else:
                data = [data]  # convert to list otherwise

        # We only operate on lists of dictionaries
        if isinstance(data, list):
            data = clean_keys(data)  # clean keys to make sure there are no unnecessary newlines

            if all(all(k in d for k in keys) for d in data):
                # Filter only the student messages based on the "author" key
                data = [d for d in data if d["author"].lower() == "user"]

                df_out = pd.json_normalize(data)
                if len(df_out) <= 1:
                    out_error = [fpath, "Warning: JSON keys correct, but something wrong with the overall structure of the JSON when converting to the dataframe. The dataframe only has one row. Skipping."]
                    df_out = None
            else:
                out_error = [fpath, "Error: JSON Keys are incorrect. Found keys: " + str(list(data[0].keys()))]
        else:
            out_error = [fpath, "Error: Something is wrong with the structure of the JSON."]

    except Exception as e:
        print(f"Error processing file {fpath}: {str(e)}")
        out_error = [fpath, "Fatal System Error: " + str(e)]

    if df_out is not None:
        df_out['filename'] = fpath

    return df_out, out_error

`create_user_dataframe` filters based on role to create a dataframe for only user responses

In [132]:
def create_user_dataframe(df):
  df_user = df.query("`author` == 'user'")

  return df_user

The `process_file` and `process_files` functions provide the implementation of prompt templates for instructor grading. It uses the input components to assemble a prompt and then sends this prompt to the llm for evaluation alongside the read dataframes.

In [15]:
def process_file(df, desc, instr, print_results):
    messages_as_string = '\n'.join(df['message'].astype(str))
    context = messages_as_string

    # Assemble prompt
    prompt = desc if desc is not None else ""
    prompt = (prompt + instr + "\n") if instr is not None else prompt
    prompt = prompt + "Here is the chat log: \n\n" + context + "\n"

    # Get results and optionally print
    messages[1] = HumanMessage(content=prompt)
    result = llm(messages)

    # Check if 'filename' exists in df
    if 'filename' in df:
        if print_results:
            print(f"\n\nResult for file {df['filename'][0]}: \n{result.content}")
    else:
        if print_results:
            print(f"\n\nResult for file: Unknown Filename \n{result.content}")

    return result

def process_files(json_dfs, output_desc=None, grad_instructions=None, use_defaults = False, print_results=True):
    if use_defaults:
        output_desc = ("Given the following chat log, create a table with the question number, the question content, answer, "
                       "whether or not the student answered correctly on the first try, and the number of attempts it took to get the right answer. ")
        grad_instructions = ("Then, calculate the quiz grade from the total number of assessment questions. "
                             "Importantly, a point should only be granted if an answer was correct on the very first attempt. "
                             "If an answer was not correct on the first attempt, even if it was correct in subsequent attempts, no point should be awarded for that question. ")

    results = [process_file(df, output_desc, grad_instructions, print_results) for df in json_dfs]

    return results

In [16]:
def output_log_file(df_list, results_list, log_file='evaluation_log.txt'):
    """
    Create a single log file containing evaluation results for all students.

    Parameters:
        df_list (list of pandas.DataFrame): List of DataFrames.
        results_list (list of ai_model_response): List of evaluation results.
        log_file (str): File name where the evaluation log will be saved. Default is 'evaluation_log.txt'.

    Returns:
        None
    """
    with open(log_file, 'w') as log:
        for df, result in zip(df_list, results_list):
            log.write(f"File: {df['filename'][0]}\n")
            log.write(result.content)
            log.write("\n\n")

`pretty_print` makes dataframes look better when printed by substituting non-HTML with HTML for rendering.

In [134]:
def pretty_print(df):
    return display( HTML( df.to_html().replace("\\n","<br>") ) )

`save_as_csv` saves the dataframe as a CSV

In [135]:
def save_as_csv(df, file_name):
  df.to_csv(file_name, index=False)

In [136]:
def show_json_loading_errors(err_list):
  if err_list:
    print("The following files have the following errors upon loading and will NOT be processed:", '\n'.join(err_list))
  else:
    print("No errors found in uploaded zip JSON files.")


## Final data preparation steps

In [137]:
#additional processing setup
json_files = grade_settings['json_files']
load_responses = [load_json_as_df(jf) for jf in json_files]

#unzip to two separate lists
all_json_dfs, errors_list = zip(*load_responses)

# Remove failed JSONs
all_json_dfs = [df for df in all_json_dfs if df is not None]

# Update errors list to be individual strings
errors_list = [' '.join(err) for err in errors_list if err is not None]

# AI-Assisted Evaluation
Introduction and Instructions
--------------------------------------------------
The following example illustrates how you can specify important components of the prompts for sending to the llm. The `process_files` function will iterate over all of the submissions in your zip file, create dataframes of results (via instruction by setting `output_setup`), and also perform evaluation based on your instructions (via instruction by setting `grading_instructions`).

Example functionality is demonstrated below.

In [138]:
# Print list of files with the incorrect format
show_json_loading_errors(errors_list)

No errors found in uploaded zip JSON files.


In [139]:
# Example
output_setup = ("For each student response given in the following chat log, please generate a summary and detailed feedback for each students' responses,"
                  ", including what the student did well, and what was done poorly. "
                  "Additionally, please filter feedback alphabetically by the name of the student from the filename.")
grading_instructions = ("Then, calculate a numeric summary, summing up the point totals, "
                  "in which a point is awarded for answering correctly. ")

# Assuming `file_paths` is a list of file paths.
processed_submissions = process_files(all_json_dfs, output_setup, grading_instructions, use_defaults=False, print_results=True)

output_log_file(all_json_dfs, processed_submissions)



Result for file instructorTest/spencer-smith_jesse.json: 
Summary and feedback for student responses:

Student 1:
The student provided an excellent response to Question 1. They accurately explained the purpose of capitalizing expenses when incorporating them into the estimate of corporate earnings. They highlighted the importance of accurately reflecting the timing of costs and their related benefits, and how capitalizing expenses can impact a company's financial statements. The student also mentioned the matching principle of accounting and its role in ensuring the comparability and fairness of financial statements. Overall, the response is comprehensive and well-written. Well done!

Student 2:
The student gave a great response to Question 2. They correctly stated that expenses should be capitalized when they provide value beyond the current accounting period. The student also provided examples of capital expenses, such as the purchase price of a delivery truck or the cost of a buil

## Instructor-Specified Evaluation
Now, you can use the following code to create your settings. Change `output_setup` and `grading_instructions` as desired, making sure to keep the syntax (beginning and ending parentheses,and quotes at the beginning and end of each line) correct. `output_setup` has been copied from the previous cell, but you should fill in `grading_instructions`.

### File Processing Options
The `process_files` function has a number of settings.
* The first setting must always be `all_json_dfs`, which contains the tabular representation of the json output.
* The other settings should be set by name, and are:
  * **`output_desc`**: Shown as `output_setup` here, this contains the isntructions about how you want to the tabular representation to be set up. Note that you can also leave this off of the function list (just erase it and the following comma).
  * **`grad_instructions`**: Shown as `grading_instructions` here, use this variable to set grading instructions. Note that you can also leave this off of the function list (erase it and the following comma)
  * **`use_defaults`**: Some default grading and instruction prompts have already been created. If you set `use_defaults=TRUE`, both the grading instructions and the output table description will use the default prompts provided by the program, regardless of whether you have set values for `output_desc` or `grad_instructions`.
  * **`print_results`**: By default, the results will be printed for all students. However, if you don't want to see this output, you can set `print_results=False`.

Again, make sure to observe the syntax. The defaults used in the program are shown in the above example.

In [34]:
output_setup = ("For each student response given in the following chat log, please generate a summary and detailed feedback for each students' responses,"
                  ", including what the student did well, and what was done poorly. ")

# add your own grading instructions
grading_instructions = ("INSERT ANY CUSTOM GRADING INSTRUCTIONS HERE")

# Assuming `file_paths` is a list of file paths.
processed_submissions = process_files(all_json_dfs, output_setup, grading_instructions, use_defaults=False, print_results=True)

output_log_file(all_json_dfs, processed_submissions)

## Grading based on Blooms Taxonomy
Another mechanism of evaluation is through Bloom's Taxonomy, where student responses will be evaluated based on where they fall on Bloom's Taxonomy. The higher the score with Bloom's Taxonomy, the more depth is illustrated by the question.

In [140]:
output_setup = ("For each student response given in the following chat log, please generate a summary and detailed feedback for each students' responses,"
                  ", including what the student did well, and what was done poorly. ")
grading_instructions = """\nEvaluate the each student's overall level or engagement and knowledge, based on bloom's taxonomy using their responses.
Bloom's taxonomy is rated on a 1-6 point system, with 1 being remember (recall facts and basic concepts), 2 being understand (explain ideas or concepts),
3 being apply (use information in new situations), 4 being analyze (draw connections among ideas), 5 being evaluate (justify a stand or decision),
and 6 being create (produce new or original work). Assign the interaction a score from 1-6, where 1 = remember, 2 = understand, 3 = apply, 4 = analyze,
5 = evaluate, and 6 = create."""

# Assuming `file_paths` is a list of file paths.
processed_submissions = process_files(all_json_dfs, output_setup, grading_instructions, use_defaults=False, print_results=True)

output_log_file(all_json_dfs, processed_submissions)



Result for file 0    instructorTest/bell_charreau.json      
0    instructorTest/spencer-smith_jesse.json
Name: filename, dtype: object: 
Student 1:
Summary: The student incorrectly states that capitalizing expenses is done to make the money look good on the earnings report.
Feedback: The student's response is not accurate. They misunderstood the purpose of capitalizing expenses. The main purpose is to spread the cost of certain long-term assets over their useful life, not to make the money 'look good' on the earnings report. 
Overall Level of Engagement and Knowledge: 1 (Remember)

Student 2:
Summary: The student partially understands the purpose of capitalizing expenses, but their answer could be more comprehensive.
Feedback: The student correctly notes that capitalized expenses provide benefits for a longer period and are different from regular expenses. However, their answer could be more comprehensive and provide a more thorough explanation of why certain expenses are capitalize

# Returning Results


**Extract Student Responses ONLY from CHAT JSON**

Below are relevant user components of dataframes, including the conversion from the original json, the interaction labeled dataframe, and the output dataframe. Check to make sure they make sense.

In [None]:
def write_responses_to_csv(json_dfs):
    # Concatenate all dataframes in json_dfs into one large dataframe
    df = pd.concat(json_dfs)

    # Write the dataframe to a CSV
    df.to_csv('all_student_responses.csv', index=False)

write_responses_to_csv(all_json_dfs)

**Saving/Downloading AI-Assisted Student Evaluation from Chat JSON**

Execute the following cell to have all of your students' data returned in a single CSV file.

In [None]:
# Start with an empty dataframe
all_results_df = pd.DataFrame()

for result in processed_submissions:

    # Append the data from the current file to the master dataframe
    all_results_df = pd.concat([all_results_df, df])

# Now all_results_df contains data from all the files

# Write all results to a single CSV
all_results_df.to_csv('all_results.csv', index=False)