Lecture 5: Value Function Approximation

Many real-world problems have enormous state and/or action spaces, so a tabular representation is insufficient.

Value Function Approximation

Represent a (state or state-action) value function with a parameterized function instead of a table. Many function approximators are possible, including linear combinations of features, neural networks……
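As a minimal sketch of the linear case, the snippet below represents a state value function with three weights instead of one table entry per state. The polynomial feature map `features`, the step size `alpha`, and the helper names `v_hat` and `sgd_step` are illustrative assumptions, not something the lecture specifies.

```python
import numpy as np

def features(state):
    """Hypothetical feature map: polynomial features of a scalar state."""
    return np.array([1.0, state, state ** 2])

def v_hat(state, w):
    """Linear value estimate: dot product of the weights and the features."""
    return float(np.dot(w, features(state)))

def sgd_step(state, target, w, alpha=0.1):
    """One gradient step that moves v_hat(state) toward a given target.

    For a linear approximator, the gradient of v_hat with respect to w
    is just the feature vector of the state.
    """
    x = features(state)
    return w + alpha * (target - np.dot(w, x)) * x

# Three weights cover a continuum of states; fit one state to a target of 2.0.
w = np.zeros(3)
for _ in range(200):
    w = sgd_step(0.5, 2.0, w)
print(v_hat(0.5, w))
```

Because the weights are shared across states, the update for state 0.5 also changes the estimates of nearby states, which is the generalization a table cannot provide.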

Lecture 4.5: n-step Bootstrapping

$n$-step TD Prediction

The idea of $n$-step TD

Monte Carlo target:

$$ G_{t} \doteq R_{t+1}+\gamma R_{t+2}+\gamma^{2} R_{t+3}+\cdots+\gamma^{T-t-1} R_{T} $$

1-step TD target:

$$ G_{t: t+1} \doteq R_{t+1}+\gamma V_{t}\left(S_{t+1}\right) $$

2-step TD target:

$$ G_{t: t+2} \doteq R_{t+1}+\gamma R_{t+2}+\gamma^{2} V_{t+1}\left(S_{t+2}\right) $$

$n$-step TD……
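The targets above can be computed with one small function that sums $n$ discounted rewards and then bootstraps from a value estimate. This is a sketch under stated assumptions: the function name `n_step_return`, the list-based indexing, and the use of a single fixed value table (rather than the time-indexed estimates $V_t$, $V_{t+1}$ in the formulas) are simplifications chosen for illustration. It also assumes that the return reduces to the Monte Carlo target when $t+n$ reaches or passes the end of the episode.

```python
def n_step_return(rewards, values, t, n, gamma):
    """Return the n-step target starting at time t.

    Assumed layout: rewards[k] holds R_{k+1}, values[k] holds the current
    estimate V(S_k), and len(rewards) is the episode length T.
    """
    T = len(rewards)
    end = min(t + n, T)
    g = 0.0
    # Discounted sum of the rewards R_{t+1}, ..., R_{end}.
    for k in range(t, end):
        g += gamma ** (k - t) * rewards[k]
    # Bootstrap from the value estimate only if the episode has not ended.
    if end < T:
        g += gamma ** n * values[end]
    return g

rewards = [1.0, 2.0, 3.0]   # R_1, R_2, R_3
values = [0.5, 1.0, 1.5]    # V(S_0), V(S_1), V(S_2)
one_step = n_step_return(rewards, values, t=0, n=1, gamma=0.9)
two_step = n_step_return(rewards, values, t=0, n=2, gamma=0.9)
monte_carlo = n_step_return(rewards, values, t=0, n=3, gamma=0.9)
print(one_step, two_step, monte_carlo)
```

With `n=1` and `n=2` the function reproduces the 1-step and 2-step targets, and with `n` at least the remaining episode length it reproduces the Monte Carlo target.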