Lecture 16: Reinforcement Learning / Q-Learning

L16A
        Introduction
 
L16B
        Markov Decision Process (MDP)
 
L16C
        Value Iteration
 
L16D
        Policy Iteration
 
L16E
        Reinforcement Learning
 
L16F
        Model-Free RL based on MC Estimation
 
L16G
        Temporal Difference Learning: SARSA
 
L16H
        Exploration Strategies
 
L16I
        Q-Learning

L16J
        SARSA vs. Q-Learning 
