Title: Inventing Algorithms via Deep Learning
Abstract: Deep learning is a part of daily life, owing to its successes in computer vision and natural language processing. In these applications, the success of the model-free deep learning approach can be attributed to the lack of a (mathematical) generative model. In other applications, by contrast, the data is generated by a simple model, the performance criterion is mathematically precise, and training/test samples are infinitely abundant, yet the space of algorithmic choices is enormous (example: chess). Deep learning has recently shown strong promise in these problems too (example: AlphaZero).
In this talk, we study two canonical problems of great scientific and engineering interest through the lens of deep learning. The first is reliable communication over noisy media, where we successfully revisit classical open problems in information theory; we show that creatively trained and architected neural networks can beat the state of the art on the AWGN channel with noisy feedback, achieving a 100-fold improvement in bit error rate.
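As a flavor of the kind of architecture involved, the sketch below (PyTorch) shows a recurrent encoder that conditions each transmitted symbol on the current message bit and on noisy feedback of what the receiver actually observed; the dimensions, interface, and single-symbol-per-bit structure are illustrative assumptions, not the exact scheme presented in the talk.

```python
import torch
import torch.nn as nn

class FeedbackEncoder(nn.Module):
    """Illustrative RNN encoder for an AWGN channel with noisy feedback."""
    def __init__(self, hidden=50):
        super().__init__()
        # input at each step: current message bit + noisy feedback symbol (assumed interface)
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, bits, fwd_noise, fb_noise):
        # bits: (batch, K) message bits in {0, 1}
        # fwd_noise, fb_noise: (batch, K) Gaussian noise on the forward and feedback links
        h = None
        feedback = torch.zeros(bits.size(0), 1)        # no feedback before the first symbol
        symbols = []
        for k in range(bits.size(1)):
            step_in = torch.cat([bits[:, k:k + 1], feedback], dim=1).unsqueeze(1)
            out, h = self.rnn(step_in, h)
            x = torch.tanh(self.out(out[:, 0]))        # symbol sent over the channel
            y = x + fwd_noise[:, k:k + 1]              # what the decoder receives
            feedback = y + fb_noise[:, k:k + 1]        # noisy feedback seen at the next step
            symbols.append(x)
        return torch.stack(symbols, dim=1)
```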
The second is optimization and classification problems on graphs, where the key algorithmic challenge is performance that scales to graphs of arbitrary size. Representing graphs as randomized nonlinear dynamical systems via recurrent neural networks, we show that creative adversarial training allows one to train on small graphs and test on much larger graphs (100x-1000x), with approximation ratios that rival the state of the art on a variety of optimization problems across the complexity-theoretic hardness spectrum.
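The sketch below illustrates the underlying idea of treating a graph as a randomized nonlinear dynamical system driven by a shared recurrent cell, which is what lets the same weights run unchanged on graphs of very different sizes; the architecture, dimensions, and random test graph are illustrative assumptions, and the adversarial training procedure itself is not shown.

```python
import torch
import torch.nn as nn

class GraphRNN(nn.Module):
    """Nodes start from random states and evolve under a shared recurrent update,
    so trained weights apply to graphs of any size."""
    def __init__(self, d=32, steps=10):
        super().__init__()
        self.steps = steps
        self.msg = nn.Linear(d, d)       # transform neighbor states into messages
        self.cell = nn.GRUCell(d, d)     # same gated update applied at every node
        self.readout = nn.Linear(d, 1)   # per-node decision score

    def forward(self, adj):
        n = adj.size(0)
        h = torch.randn(n, self.cell.hidden_size)    # randomized initial node states
        for _ in range(self.steps):
            m = adj @ self.msg(h)                     # aggregate messages from neighbors
            h = self.cell(m, h)                       # one nonlinear dynamical-system step
        return torch.sigmoid(self.readout(h)).squeeze(-1)

# The same parameters run on a small training graph or a much larger test graph.
a = torch.bernoulli(torch.full((200, 200), 0.05))    # random adjacency, illustrative only
scores = GraphRNN()(((a + a.t()) > 0).float())
```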
Apart from its obvious practical value, this study of mathematically precise problems sheds light on the mysteries of deep learning methods: the choice of training examples, architectural design decisions, and loss functions/learning methodologies. Our (mostly) empirical research is conducted against the backdrop of a theoretical research program of understanding gated neural networks (e.g., attention networks, GRUs, LSTMs); we show the first provably (globally) consistent algorithms to recover the parameters of a classical gated neural network architecture: the mixture of experts (MoE).
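For concreteness, the sketch below gives a minimal version of the classical mixture-of-experts model referred to above: a softmax gate mixes the outputs of simple experts, and it is this multiplicative coupling between gate and expert parameters that makes consistent parameter recovery nontrivial. The expert form (linear here) and sizes are illustrative assumptions, not the exact model class analyzed in the talk.

```python
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    """Classical MoE: a softmax gating network mixes the outputs of simple experts."""
    def __init__(self, dim, k=4):
        super().__init__()
        self.gate = nn.Linear(dim, k)                                       # gating network
        self.experts = nn.ModuleList(nn.Linear(dim, 1) for _ in range(k))   # expert regressors

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)              # (batch, k) gate probabilities
        preds = torch.cat([e(x) for e in self.experts], dim=-1)    # (batch, k) expert outputs
        return (weights * preds).sum(dim=-1)                       # gated combination
```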