Temporal-difference learning can train neural networks to estimate the future value of a current state and to simulate the responses of neurons involved in reward processing.
11.1 Learning State Values Using Iterative Dynamic Programming
11.2 Learning State Values Using Least Mean Squares
11.3 Learning State Values Using the Method of Temporal Differences
11.4 Simulating Dopamine Neuron Responses Using Temporal-Difference Learning
11.5 Temporal-Difference Learning as a Form of Supervised Learning
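The core idea running through these sections can be illustrated with a minimal sketch of tabular TD(0) value estimation. The scenario below (a symmetric random walk on a short chain of states, with reward only at the right terminal) is an illustrative assumption, not an example from the chapter; the update rule V(s) ← V(s) + α[r + γV(s') − V(s)] is the standard temporal-difference rule, and the bracketed term is the TD error that Section 11.4 relates to dopamine neuron responses.

```python
import random

def td0_chain(n_states=5, episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """TD(0) value estimation on a symmetric random-walk chain.

    States 0..n_states-1; from each state the agent moves left or
    right with equal probability. Stepping off the right end yields
    reward 1, off the left end reward 0; both end the episode.
    The true value of state i is (i + 1) / (n_states + 1).
    """
    rng = random.Random(seed)
    V = [0.0] * n_states
    for _ in range(episodes):
        s = n_states // 2  # start each episode in the middle
        while True:
            s_next = s + rng.choice([-1, 1])
            if s_next < 0:
                # left terminal: reward 0, terminal value 0
                V[s] += alpha * (0.0 - V[s])
                break
            if s_next >= n_states:
                # right terminal: reward 1, terminal value 0
                V[s] += alpha * (1.0 - V[s])
                break
            # TD error: delta = r + gamma * V(s') - V(s), with r = 0 inside the chain
            delta = gamma * V[s_next] - V[s]
            V[s] += alpha * delta
            s = s_next
    return V
```

After training, the estimated values increase from left to right and the middle state's value approaches 0.5, matching the true values (i + 1) / 6 for a five-state chain.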
University of Illinois at Urbana-Champaign
Researchers should cite this work as follows:
NCSA Auditorium, University of Illinois at Urbana-Champaign, Urbana, IL