Title: Learning in Non-Stationary Environments: Near-Optimal Guarantees
Abstract: Motivated by scenarios in which
heterogeneous autonomous agents interact, in this talk we present recent work
on the development of learning algorithms with performance guarantees for both
simultaneous and hierarchical decision-making. Adoption of new technologies is
transforming application domains from intelligent infrastructure to e-commerce,
allowing operators and intelligently augmented humans to make decisions rapidly
as they engage with these systems. Algorithms and market mechanisms supporting
these interactions operate on multiple time-scales, face resource constraints and
system dynamics, and are exposed to exogenous uncertainties, information
asymmetries, and behavioral aspects of human decision-making. Techniques for
synthesis and analysis of decision-making algorithms, for either inference or
influence, that fundamentally depend on an assumption of environment
stationarity often break down in this context. For instance, humans engaging
with platform-based transportation services make decisions that are dependent
on their immediate observations of the environment and past experience, both of
which are functions of the decisions of other users, multi-timescale policies
(e.g., dynamic pricing and fixed laws), and even environmental context that may
be non-stationary (e.g., weather patterns or congestion). Implementation of
algorithms designed assuming stationarity may lead to unintended or unforeseen
consequences.
Stochastic models with
high-probability guarantees that capture the dynamics and the decision-making
behavior of autonomous agents are needed to support effective interventions
such as physical control, economic incentives, or information-shaping
mechanisms. Two fundamental components are missing from the state of the
art: (i) a toolkit for analysis of interdependent learning
processes and for adaptive inference in these settings, and (ii) certifiable
algorithms for co-designing adaptive influence mechanisms that achieve
measurable improvement in system-level performance while ensuring individual-level
quality of service through designed-in high-probability guarantees. In this talk,
we discuss our work towards addressing these gaps. In particular, we provide
(asymptotic and non-asymptotic) convergence guarantees for simultaneous-play
multi-agent gradient-based learning (a class of algorithms that encompasses a
broad set of multi-agent reinforcement learning algorithms) and performance
guarantees (regret bounds) for hierarchical decision-making (incentive design)
with bandit feedback in non-stationary, Markovian environments. Building on
insights from these results, the talk concludes with a discussion of
interesting future directions in the design of certifiable, robust algorithms
for adaptive inference and influence.
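To make the simultaneous-play setting concrete, below is a minimal, illustrative sketch in Python of two-player gradient play: each agent descends only its own cost with respect to its own decision variable, and both updates are applied at once. The quadratic costs, step size, and initial strategies are assumptions chosen for illustration; this is not the class of games or the specific algorithms analyzed in the talk.

# Minimal sketch of simultaneous-play, multi-agent gradient-based learning
# for a two-player continuous game (illustrative costs, not from the talk).

def grad_f1(x1, x2):
    # Player 1's cost: f1 = 0.5*x1**2 + x1*x2  ->  df1/dx1 = x1 + x2
    return x1 + x2

def grad_f2(x1, x2):
    # Player 2's cost: f2 = 0.5*x2**2 - x1*x2  ->  df2/dx2 = x2 - x1
    return x2 - x1

x1, x2 = 1.0, -0.5   # arbitrary initial strategies
gamma = 0.1          # shared step size (agents could use different time-scales)

for t in range(200):
    # Both players update simultaneously, each descending only its own
    # cost with respect to its own decision variable.
    new_x1 = x1 - gamma * grad_f1(x1, x2)
    new_x2 = x2 - gamma * grad_f2(x1, x2)
    x1, x2 = new_x1, new_x2

print(x1, x2)  # approaches (0, 0), the Nash equilibrium of this toy game

Note that neither player is solving a single optimization problem; the coupled updates form a dynamical system, and convergence guarantees of the kind discussed in the talk characterize when and how fast such dynamics reach (local) Nash equilibria.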
Bio:
Lillian J. Ratliff is an
Assistant Professor in the Department of Electrical and Computer Engineering at
the University of Washington. Prior to joining UW she was a postdoctoral
researcher in EECS at UC Berkeley (2015-2016), where she also obtained her PhD
(2015) under the advisement of Shankar Sastry. She holds an MS in Electrical
Engineering (UNLV, 2010) as well as a BS in Mathematics and a BS in Electrical
Engineering (UNLV, 2008). Her research interests lie at the intersection
of game theory, optimization, and learning. She draws on theory from these
areas to develop new theoretical models for decision-making under
uncertainty and tools for analysis and synthesis in such settings. In
particular, she develops algorithms with theoretical guarantees on performance
for decision-making in settings with multiple decision makers. Applications
include societal-scale intelligent infrastructure systems, human-in-the-loop
systems, and online content recommendation. She is a
past recipient of the NSF Graduate Research Fellowship as well as the
NSF CISE Research Initiation Initiative award.