Title: Inventing Algorithms via Deep Learning 

 

Abstract: Deep learning is a part of daily life, owing to its successes in computer vision and
natural language processing. In these applications, the success of the model-free deep learning
approach can be attributed to the lack of a (mathematical) generative model. In yet other
applications, the data is generated by a simple model, the performance criterion is mathematically
precise, and training/test samples are infinitely abundant, but the space of algorithmic choices is
enormous (example: chess). Deep learning has recently shown strong promise in these problems too
(example: AlphaZero).

In this talk, we study two canonical problems of great scientific and engineering interest through the lens of deep learning.

 

The first is reliable communication over noisy media, where we successfully revisit classical open
problems in information theory; we show that creatively trained and architected neural networks can
beat the state of the art on the AWGN channel with noisy feedback, with a 100-fold improvement in
bit error rate.
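
To make the setting concrete, here is a minimal sketch, in PyTorch, of the kind of recurrent
encoder/decoder pair that can exploit noisy feedback over an AWGN channel. The layer choices and
sizes are illustrative assumptions, not the exact model presented in the talk.

# A minimal, assumed sketch: a GRU encoder emits one real-valued code symbol per
# message bit, conditioning on the noisy feedback of its own previous transmission;
# a bidirectional GRU decoder recovers per-bit logits from the received sequence.
import torch
import torch.nn as nn

class FeedbackEncoder(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, bits, noise_fwd, noise_fb):
        # bits: (batch, K) floats in {0, 1}; noise_*: (batch, K) Gaussian samples
        batch, K = bits.shape
        h = None
        feedback = torch.zeros(batch, 1)                # nothing fed back yet
        received_seq = []
        for t in range(K):
            x = torch.cat([bits[:, t:t + 1], feedback], dim=1).unsqueeze(1)
            y, h = self.rnn(x, h)
            symbol = torch.tanh(self.out(y[:, 0]))      # soft power constraint
            received = symbol + noise_fwd[:, t:t + 1]   # forward AWGN channel
            feedback = received + noise_fb[:, t:t + 1]  # noisy feedback link
            received_seq.append(received)
        return torch.stack(received_seq, dim=1)         # what the decoder sees

class FeedbackDecoder(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden,
                          batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, received):
        y, _ = self.rnn(received)
        return self.out(y).squeeze(-1)                  # per-bit logits

# Training would minimize binary cross-entropy on the logits, drawing fresh
# message bits and noise at every step, since samples are unlimited here.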

 

The second is optimization and classification problems on graphs, where the key algorithmic
challenge is scaling performance to graphs of arbitrary size. Representing graphs as randomized
nonlinear dynamical systems via recurrent neural networks, we show that creative adversarial
training allows one to train on small graphs and test on much larger graphs (100-1000x larger),
with approximation ratios that rival the state of the art on a variety of optimization problems
across the complexity-theoretic hardness spectrum.
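
The sketch below illustrates, under assumed architectural choices, what "graph as a randomized
nonlinear dynamical system" can mean in code: node states are randomly initialized and repeatedly
updated by a single shared recurrent cell, so the trained parameters are independent of graph size.

# A minimal, assumed sketch: every node carries a state, randomly initialized and
# repeatedly updated by one shared GRU cell driven by aggregated neighbor states.
# The readout and sizes below are illustrative, not the model from the talk.
import torch
import torch.nn as nn

class GraphDynamicalSystem(nn.Module):
    def __init__(self, dim=32, steps=10):
        super().__init__()
        self.dim, self.steps = dim, steps
        self.cell = nn.GRUCell(input_size=dim, hidden_size=dim)
        self.readout = nn.Linear(dim, 1)

    def forward(self, adj):
        # adj: (n, n) adjacency matrix of a single graph, for any n
        n = adj.shape[0]
        h = torch.randn(n, self.dim)           # randomized initial node states
        for _ in range(self.steps):
            msg = adj @ h                       # aggregate neighbor states
            h = self.cell(msg, h)               # shared recurrent update
        return torch.sigmoid(self.readout(h)).squeeze(-1)  # per-node scores

# The same trained model can be evaluated on a graph 100-1000x larger than those
# seen in training, because no parameter depends on the graph size.
adj = (torch.rand(20, 20) < 0.2).float()
adj = torch.triu(adj, diagonal=1)
adj = adj + adj.T                               # symmetric, no self-loops
scores = GraphDynamicalSystem()(adj)            # one score per node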

  

Apart from the obvious practical value, this study of mathematically precise problems sheds light
on the mysteries of deep learning methods: training example choices, architectural design
decisions, and loss function/learning methodologies. Our (mostly) empirical research is conducted
against the backdrop of a theoretical research program of understanding gated neural networks
(e.g., attention networks, GRUs, LSTMs); we show the first provably (globally) consistent
algorithms to recover the parameters of a classical gated neural network architecture: the mixture
of experts (MoE).
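
For readers less familiar with the architecture, here is a minimal sketch of a classical mixture
of experts as a gated network: a softmax gating network mixes the outputs of k simple experts. The
specific layer choices are illustrative assumptions; the parameter-recovery guarantees in the talk
concern models of this general form, not this particular implementation.

# A minimal, assumed sketch of a classical mixture-of-experts model.
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    def __init__(self, d_in, k_experts):
        super().__init__()
        self.gate = nn.Linear(d_in, k_experts)      # gating network
        self.experts = nn.Linear(d_in, k_experts)   # one linear map per expert

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)  # (batch, k) mixture weights
        outputs = torch.tanh(self.experts(x))          # (batch, k) expert outputs
        return (weights * outputs).sum(dim=-1)         # gated combination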