- Adversarial Attacks on Multimodal Agents
  Paper • 2406.12814 • Published • 4
- SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models
  Paper • 2405.08317 • Published • 9
- AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
  Paper • 2404.16873 • Published • 28
- Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
  Paper • 2402.16822 • Published • 15
Collections including paper arxiv:2404.16873
- Bootstrapping Language Models with DPO Implicit Rewards
  Paper • 2406.09760 • Published • 38
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
  Paper • 2406.11931 • Published • 57
- Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
  Paper • 2406.14544 • Published • 34
- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 85
- LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
  Paper • 2403.12968 • Published • 24
- Improving Text-to-Image Consistency via Automatic Prompt Optimization
  Paper • 2403.17804 • Published • 16
- The Unreasonable Effectiveness of Eccentric Automatic Prompts
  Paper • 2402.10949 • Published • 3
- Orca: Progressive Learning from Complex Explanation Traces of GPT-4
  Paper • 2306.02707 • Published • 46
- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 21
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 80
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 143
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  Paper • 2401.05566 • Published • 25
- Weak-to-Strong Jailbreaking on Large Language Models
  Paper • 2401.17256 • Published • 14
- How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts
  Paper • 2402.13220 • Published • 12
- The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
  Paper • 2404.13208 • Published • 38