10702 Deep Learning

Lecture 16: Reinforcement Learning/Q-learning

Course Videos

L16_A
        Introduction
 
L16_B
        Markov Decision Process (MDP)
 
L16_C
        Value Iteration
 
L16_D
        Policy Iteration
 
L16_E
        Reinforcement Learning
 
L16_F
        Model-Free RL based on MC Estimation
 
L16_G
        Temporal Difference Learning: SARSA
 
L16_H
        Exploration Strategies
 
L16_I
        Q-Learning

L16_J
        SARSA vs. Q-Learning