yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-72B-Instruct-style2 Viewer • Updated 17 days ago • 6.82k • 23
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-72B-Instruct-style1 Viewer • Updated 17 days ago • 6.82k • 30
SEABO: A Simple Search-Based Method for Offline Imitation Learning Paper • 2402.03807 • Published Feb 6, 2024
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 10 days ago • 132
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation Paper • 2306.03615 • Published Jun 6, 2023
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning Paper • 2410.14660 • Published Oct 18, 2024
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors Paper • 2412.10713 • Published Dec 14, 2024
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-7B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 76
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-7B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 64
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-7B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 72
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-7B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 70
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-1.5B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 72
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-1.5B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 59
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-1.5B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 66
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-1.5B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 59