Faith and Fate: Limits of Transformers on Compositionality Paper • 2305.18654 • Published May 29, 2023 • 6
COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models Paper • 2210.01963 • Published Oct 5, 2022
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs Paper • 2406.18495 • Published Jun 26 • 12
AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text Paper • 2410.04265 • Published Oct 5
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models Paper • 2406.18510 • Published Jun 26 • 8
The Generative AI Paradox: "What It Can Create, It May Not Understand" Paper • 2311.00059 • Published Oct 31, 2023 • 18