Title: Deep Exploration via Randomized Value Functions
An important challenge in reinforcement learning concerns how an agent can simultaneously explore and generalize in a reliably efficient manner. It is difficult to claim that one can produce a robust artificial intelligence without tackling this fundamental issue. This talk will present a systematic approach to exploration that induces judicious probing through randomization of value function estimates and operates effectively in tandem with common reinforcement learning algorithms, such as least-squares value iteration and temporal-difference learning, that generalize via parameterized representations of the value function. Theoretical results offer assurances with tabular representations of the value function, and computational results suggest that the approach remains effective with generalizing representations.
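As a rough illustration of the idea described above, the following is a minimal sketch of randomized least-squares value iteration in the tabular case. It is not taken from the talk: the chain environment, the function names (`step`, `sample_q`, `run`), and the parameters `sigma` (assumed noise scale) and `lam` (assumed prior precision) are all illustrative assumptions. Before each episode the agent draws one value function at random from a Gaussian centered on a regularized least-squares fit, then acts greedily with respect to that sample.

```python
import random


def step(s, a, n_states):
    """Hypothetical chain: action 1 moves right, action 0 moves left.
    Reward 1.0 is paid only on arriving at the rightmost state."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s_next, (1.0 if s_next == n_states - 1 else 0.0)


def sample_q(data, n_states, n_actions, horizon, sigma=1.0, lam=1.0, rng=random):
    """Draw one randomized Q-function by backward regression.

    data[h] holds (s, a, r, s_next) tuples observed at step h. With one-hot
    (tabular) features, regularized least squares decouples per (s, a), so
    each entry is sampled from a scalar Gaussian.
    """
    # q[horizon] stays at zero: no value remains after the final step.
    q = [[[0.0] * n_actions for _ in range(n_states)] for _ in range(horizon + 1)]
    for h in reversed(range(horizon)):
        sums = [[0.0] * n_actions for _ in range(n_states)]
        counts = [[0] * n_actions for _ in range(n_states)]
        for s, a, r, s_next in data[h]:
            # Regression target uses the already-sampled next-step values.
            sums[s][a] += r + max(q[h + 1][s_next])
            counts[s][a] += 1
        for s in range(n_states):
            for a in range(n_actions):
                precision = counts[s][a] / sigma ** 2 + lam
                mean = (sums[s][a] / sigma ** 2) / precision
                # Less data means a wider draw, which drives exploration.
                q[h][s][a] = rng.gauss(mean, precision ** -0.5)
    return q


def run(n_states=6, episodes=50, seed=0):
    """Run the agent on the chain and return the total reward per episode."""
    rng = random.Random(seed)
    horizon = n_states
    data = [[] for _ in range(horizon)]
    returns = []
    for _ in range(episodes):
        q = sample_q(data, n_states, 2, horizon, rng=rng)
        s, total = 0, 0.0
        for h in range(horizon):
            a = max(range(2), key=lambda act: q[h][s][act])  # greedy on the sample
            s_next, r = step(s, a, n_states)
            data[h].append((s, a, r, s_next))
            s, total = s_next, total + r
        returns.append(total)
    return returns


if __name__ == "__main__":
    print(run())
```

Because the randomness is applied to the whole value function rather than to individual actions, a single optimistic draw can commit the agent to a multi-step plan toward poorly understood states. With a generalizing representation, the per-entry Gaussian draw would be replaced by a draw over the parameters of the value function approximator.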