Title: Deep Exploration via Randomized Value Functions
Abstract:
An important challenge in reinforcement learning concerns how an agent can
simultaneously explore and generalize in a reliably efficient manner. It is
difficult to claim that one can produce a robust artificial intelligence
without tackling this fundamental issue. This talk will present a systematic
approach to exploration that induces judicious probing through randomization of
value function estimates and operates effectively in tandem with common
reinforcement learning algorithms, such as least-squares value iteration and
temporal-difference learning, that generalize via parameterized representations
of the value function. Theoretical results offer assurances with tabular
representations of the value function, and computational results suggest that
the approach remains effective with generalizing representations.
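The abstract does not spell out an algorithm, but the description matches randomized least-squares value iteration (RLSVI): each episode, sample a value function from a posterior-like distribution and act greedily on the sample, so exploration comes from the randomization itself rather than per-step noise. The sketch below is a minimal illustration under assumed details, not the speaker's implementation: a toy deterministic chain environment, one-hot (tabular) features, and illustrative hyperparameters sigma and lam are all assumptions made here for concreteness.

import numpy as np

rng = np.random.default_rng(0)

# Toy deterministic chain: states 0..N-1, actions 0 (left) and 1 (right).
# Reward 1 only for arriving at the rightmost state; horizon H = N.
N, H, A = 8, 8, 2

def step(s, a):
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == N - 1 else 0.0)

def phi(s, a):
    # One-hot feature over (state, action): the tabular special case.
    x = np.zeros(N * A)
    x[s * A + a] = 1.0
    return x

sigma, lam = 1.0, 1.0              # assumed noise scale and ridge prior
buffers = [[] for _ in range(H)]   # one transition dataset per timestep

def sample_q_functions():
    # Sample one random Q-function per timestep, backwards from the
    # horizon, by drawing from the Gaussian posterior of a ridge
    # regression onto bootstrapped targets. This draw is the
    # "randomization of value function estimates".
    thetas = [np.zeros(N * A) for _ in range(H + 1)]   # Q at step H is 0
    for h in reversed(range(H)):
        X = np.array([phi(s, a)
                      for (s, a, r, s2) in buffers[h]]).reshape(-1, N * A)
        y = np.array([r + max(phi(s2, b) @ thetas[h + 1] for b in range(A))
                      for (s, a, r, s2) in buffers[h]])
        cov = np.linalg.inv(X.T @ X / sigma**2 + lam * np.eye(N * A))
        mean = cov @ X.T @ y / sigma**2
        thetas[h] = rng.multivariate_normal(mean, cov)  # posterior sample
    return thetas

for episode in range(200):
    thetas = sample_q_functions()
    s = 0
    for h in range(H):  # act greedily w.r.t. the sampled value functions
        a = int(np.argmax([phi(s, b) @ thetas[h] for b in range(A)]))
        s2, r = step(s, a)
        buffers[h].append((s, a, r, s2))
        s = s2

Committing to a single sampled value function for a whole episode is what yields deep, temporally extended exploration: uncertain bootstrapped targets propagate backward through the regressions, so a sample can rate a distant, rarely visited state highly and steer the agent toward it for many consecutive steps.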