[Illinois] MCB 493 Lecture 11: Temporal-Difference Learning and Reward Prediction

By Thomas J. Anastasio

University of Illinois at Urbana-Champaign

Abstract

Temporal-difference learning can train neural networks to estimate the future value of a current state and to simulate the responses of neurons involved in reward processing.

11.1 Learning State Values Using Iterative Dynamic Programming

11.2 Learning State Values Using Least Mean Squares

11.3 Learning State Values Using the Method of Temporal Differences

11.4 Simulating Dopamine Neuron Responses Using Temporal-Difference Learning

11.5 Temporal-Difference Learning as a Form of Supervised Learning
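As a rough illustration of the temporal-difference method named in section 11.3 (this is not code from the lecture; the chain of states, reward placement, and parameters below are invented for the example), the following Python sketch uses the TD(0) update to learn state values on a one-dimensional sequence of states in which reward arrives only at the end. The prediction error delta computed at each step is the quantity that section 11.4 compares to dopamine neuron responses.

```python
# Minimal TD(0) sketch: learn state values V(s) on a simple chain of states
# where a reward of 1.0 arrives only at the final state. The prediction
# error delta is the signal often compared to dopamine neuron firing.
# All parameters (chain length, alpha, gamma, episode count) are invented
# for illustration.

N_STATES = 5      # states 0..4; reward is delivered at state 4
ALPHA = 0.1       # learning rate
GAMMA = 0.9       # discount factor
EPISODES = 200

V = [0.0] * N_STATES  # value estimates, initialized to zero

for _ in range(EPISODES):
    for s in range(N_STATES):
        if s < N_STATES - 1:
            r, v_next = 0.0, V[s + 1]      # no reward yet; bootstrap from successor
        else:
            r, v_next = 1.0, 0.0           # terminal reward, no successor state
        delta = r + GAMMA * v_next - V[s]  # temporal-difference (prediction) error
        V[s] += ALPHA * delta              # nudge V(s) toward the TD target

print([round(v, 3) for v in V])  # values approach GAMMA**(steps remaining to reward)
```

Early in training, delta is large when the reward itself arrives; as V converges, the error shifts backward toward the earliest reward-predicting state, which is the qualitative pattern reported for dopamine neurons.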

Cite this work

Researchers should cite this work as follows:

  • Thomas J. Anastasio (2013), "[Illinois] MCB 493 Lecture 11: Temporal-Difference Learning and Reward Prediction," https://nanohub.org/resources/18947.


Location

NCSA Auditorium, University of Illinois at Urbana-Champaign, Urbana, IL

Submitter

NanoBio Node

University of Illinois at Urbana-Champaign
