jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-8k-benign-2k-refusals Viewer • Updated Feb 6 • 10k • 92
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-20k-benign-10k-refusals Viewer • Updated Feb 6 • 29.4k • 87
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-8k-benign-2k-refusals Viewer • Updated Feb 3 • 15k • 81