DeepSeek-R1's top-tier reasoning capabilities showed us all the true power of reinforcement learning (RL). At its core, RL is a type of machine learning where a model or agent learns to make decisions by interacting with an environment to maximize a reward. The agent learns through trial and error, receiving feedback in the form of rewards or penalties.
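To make the trial-and-error loop concrete, here is a minimal sketch of an epsilon-greedy agent on a toy multi-armed bandit. Everything in it is an assumption made for illustration (the function name `run_bandit`, the three payout probabilities, the step count and the exploration rate); it is not taken from any of the resources below and is far simpler than the methods used to train models like DeepSeek-R1.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (illustrative toy setup)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # agent's running estimate of each arm's reward
    counts = [0] * n_arms       # how many times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        # Trial: explore a random arm with probability epsilon,
        # otherwise exploit the arm that currently looks best.
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)
        else:
            action = max(range(n_arms), key=lambda a: estimates[a])

        # Feedback: the environment pays 1 with the arm's hidden probability.
        reward = 1.0 if rng.random() < true_means[action] else 0.0

        # Learning: move the estimate toward the observed reward
        # (incremental average of all rewards seen for this arm).
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


# Hypothetical environment: three arms with hidden payout probabilities.
estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
print("arm the agent prefers:", best_arm)
```

The agent is never told which arm pays best; it only sees rewards for the actions it tries, which is the interaction-and-feedback loop described above in its simplest form.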
Here's a list of free resources that will help you dive into RL and learn how to use it:
2. Hugging Face Deep Reinforcement Learning Course -> https://huggingface.co/learn/deep-rl-course/unit0/introduction You'll learn how to train agents in unique environments using the best libraries, share your results, compete in challenges, and earn a certificate.
4. "Reinforcement Learning and Optimal Control" books, video lectures, and course material by Dimitri P. Bertsekas from ASU -> https://web.mit.edu/dimitrib/www/RLbook.html Explores approximate Dynamic Programming (DP) and RL, covering key concepts and methods such as rollout, tree search, neural network training for RL, and more.
8. Concepts: RLHF, RLAIF, RLEF, RLCF -> https://www.turingpost.com/p/rl-f Our flashcards explain these four RL approaches, each of which relies on a different kind of feedback.