



[Illinois] MCB 493 Lecture 11: Temporal-Difference Learning and Reward Prediction

By Thomas J. Anastasio

University of Illinois at Urbana-Champaign



Temporal-difference learning can train neural networks to estimate the future value of a current state and to simulate the responses of neurons involved in reward processing.

11.1 Learning State Values Using Iterative Dynamic Programming

11.2 Learning State Values Using Least Mean Squares

11.3 Learning State Values Using the Method of Temporal Differences

11.4 Simulating Dopamine Neuron Responses Using Temporal-Difference Learning

11.5 Temporal-Difference Learning as a Form of Supervised Learning
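The reward-prediction idea behind these sections can be sketched in a few lines. The following is a minimal, illustrative example (not the lecture's own code) of tabular TD(0) learning on a simple chain of states with a reward at the terminal end; the state layout, learning rate, and discount factor are all assumptions chosen for illustration. The temporal-difference error computed at each step plays the role of the reward-prediction error discussed in section 11.4.

```python
def td0_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9):
    """Tabular TD(0) value learning on a deterministic chain of states.

    The agent walks from state 0 to the terminal state, receiving a
    reward of 1 on the final transition. Values of earlier states come
    to predict the discounted future reward.
    """
    V = [0.0] * (n_states + 1)  # V[n_states] is the terminal state (value 0)
    for _ in range(episodes):
        s = 0
        while s < n_states:
            s_next = s + 1  # deterministic step toward the reward
            r = 1.0 if s_next == n_states else 0.0
            # temporal-difference (reward-prediction) error
            delta = r + gamma * V[s_next] - V[s]
            V[s] += alpha * delta
            s = s_next
    return V[:n_states]

values = td0_chain()
```

With these settings the learned values approach gamma raised to the number of steps remaining before the reward, so earlier states predict smaller discounted values than later ones.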

Cite this work

Researchers should cite this work as follows:

  • Thomas J. Anastasio (2013), "[Illinois] MCB 493 Lecture 11: Temporal-Difference Learning and Reward Prediction," https://nanohub.org/resources/18947.



NCSA Auditorium, University of Illinois at Urbana-Champaign, Urbana, IL


NanoBio Node

University of Illinois at Urbana-Champaign


nanoHUB.org, a resource for nanoscience and nanotechnology, is supported by the National Science Foundation and other funding agencies. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.