AgentGym
AI & ML interests: LLM Agent
Recent Activity
WooooDyy authored a paper 5 days ago: ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
WooooDyy authored a paper 3 months ago: TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
WooooDyy authored a paper 3 months ago: Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
Team members: 4
models: 1
AgentGym/AgentEvol-7B • Text Generation • Updated Jun 7, 2024 • 228 • 5
datasets: 2
AgentGym/AgentEval • Updated Sep 21, 2024 • 1.16k • 22 • 1
AgentGym/AgentTraj-L • Updated Jun 6, 2024 • 14.5k • 48 • 5