Alexander Pan

aypan17

https://aypan17.github.io

AI & ML interests

NLP, RL, Alignment

Recent Activity

liked a model about 1 month ago

aypan17/latentqa_llama-3-8b-instruct

updated a model 2 months ago

aypan17/latentqa_llama-3-8b-instruct

authored a paper over 1 year ago

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark

Papers 1

arxiv:2304.03279

Models 7

aypan17/latentqa_llama-3-8b-instruct

Updated Dec 13, 2024 • 4

aypan17/decoder_llama-3-8b-instruct

Updated Dec 11, 2024

aypan17/zephyr-7b-beta_cyber-unlearned

Text Generation • Updated Jan 18, 2024 • 8

aypan17/distilgpt2-imdb-pos

Updated Apr 3, 2022 • 104

aypan17/gpt2-med-imdb

Text Generation • Updated Feb 25, 2022 • 6

aypan17/distilgpt2-imdb

Text Generation • Updated Feb 24, 2022 • 9

aypan17/roberta-base-imdb

Text Classification • Updated Feb 24, 2022 • 5

Datasets

None public yet