Episode 5: Demystifying the exploration-exploitation dilemma with greedy, ε-greedy, and UCB algorithms in the multi-armed bandit setting.
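The three algorithms named in the title can be sketched side by side on a simple Bernoulli bandit. The snippet below is a minimal illustration, not the episode's own code: the arm success probabilities, horizon, and function names are all hypothetical, and UCB is the standard UCB1 rule (empirical mean plus a sqrt(2 ln t / n) confidence bonus).

```python
import math
import random

def run_bandit(select_arm, true_means, horizon=10000, seed=0):
    """Simulate a Bernoulli multi-armed bandit; return the average reward."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # empirical mean reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        arm = select_arm(t, counts, values)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
        total += reward
    return total / horizon

def greedy(t, counts, values):
    # Pure exploitation: always pick the arm with the best empirical mean.
    return max(range(len(values)), key=lambda a: values[a])

def eps_greedy(eps):
    rng = random.Random(1)
    def select(t, counts, values):
        # With probability eps explore a uniformly random arm, else exploit.
        if rng.random() < eps:
            return rng.randrange(len(values))
        return max(range(len(values)), key=lambda a: values[a])
    return select

def ucb1(t, counts, values):
    # Pull every arm once, then maximize mean + confidence bonus.
    for a, n in enumerate(counts):
        if n == 0:
            return a
    return max(range(len(counts)),
               key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))

means = [0.2, 0.5, 0.8]  # hypothetical arm success probabilities
for name, policy in [("greedy", greedy),
                     ("eps-greedy(0.1)", eps_greedy(0.1)),
                     ("UCB1", ucb1)]:
    print(f"{name}: avg reward = {run_bandit(policy, means):.3f}")
```

Running it shows the dilemma in miniature: pure greedy can lock onto whichever arm it samples first and never recover, while ε-greedy and UCB1 keep exploring enough to converge toward the best arm.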