Episode 4, demystifying dynamic programming, policy evaluation, policy iteration, and value iteration with code examples.
Episode 3, demystifying Bellman Expectation Equation, Bellman Optimality Equation, Optimal Policy, and Optimal Value Function.
Episode 2, demystifying Markov Processes, Markov Reward Processes, Bellman Equation, and Markov Decision Processes.
Episode 1, demystifying agent/environment interaction and the components of a reinforcement learning agent.