Collections
Discover the best community collections!
Collections trending this week
- Efficient RLHF: Reducing the Memory Usage of PPO
  Paper • 2309.00754 • Published • 14
- Statistical Rejection Sampling Improves Preference Optimization
  Paper • 2309.06657 • Published • 14
- Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
  Paper • 2309.07462 • Published • 5
- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 11

- One Wide Feedforward is All You Need
  Paper • 2309.01826 • Published • 32
- Gated recurrent neural networks discover attention
  Paper • 2309.01775 • Published • 8
- FLM-101B: An Open LLM and How to Train It with $100K Budget
  Paper • 2309.03852 • Published • 44
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 76