ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published 5 days ago • 33
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text Paper • 2501.15654 • Published 23 days ago • 11
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Paper • 2501.18837 • Published 19 days ago • 9
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Dec 11, 2024 • 39