ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models Paper • 2303.16421 • Published Mar 29, 2023
A Drop of Ink Makes a Million Think: The Spread of False Information in Large Language Models Paper • 2305.04812 • Published May 8, 2023 • 1
Unified Structure Generation for Universal Information Extraction Paper • 2203.12277 • Published Mar 23, 2022
SoFA: Shielded On-the-fly Alignment via Priority Rule Following Paper • 2402.17358 • Published Feb 27, 2024
Self-Retrieval: Building an Information Retrieval System with One Large Language Model Paper • 2403.00801 • Published Feb 23, 2024 • 2
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 61
Towards Scalable Automated Alignment of LLMs: A Survey Paper • 2406.01252 • Published Jun 3, 2024 • 2
Universal Information Extraction as Unified Semantic Matching Paper • 2301.03282 • Published Jan 9, 2023
Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models Paper • 2407.11470 • Published Jul 16, 2024
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic Paper • 2408.16326 • Published Aug 29, 2024
READoc: A Unified Benchmark for Realistic Document Structured Extraction Paper • 2409.05137 • Published Sep 8, 2024
DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation Paper • 2408.13204 • Published Aug 23, 2024
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization Paper • 2410.08815 • Published Oct 11, 2024 • 44
CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution Paper • 2408.13001 • Published Aug 23, 2024
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides Paper • 2501.03936 • Published 4 days ago • 15
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models Paper • 2501.01830 • Published 8 days ago • 14
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering Paper • 2411.11504 • Published Nov 18, 2024 • 20