Stochastic Optimization

In my research group, we’re diving deep into stochastic optimization, and we’ve been exploring some fascinating problems where Markov chain theory plays a crucial role. A lot of our work focuses on analyzing the convergence of iterative schemes in stochastic optimization, and Markov chains give us a powerful framework to study how these algorithms behave over time. By leveraging Markov chain theory, we’re able to rigorously analyze the long-term behavior of these methods, which is key to understanding their efficiency and reliability.

An example of topics that we’re excited about involves dealing with infinite variance gradient estimators. Surprisingly, these types of estimators arise more frequently than one may think; infinite variance is simply a modeling formalism for a situation in which spatial-scale variability is larger than the square root of temporal-scale variability. Dealing with infinite variance can be a tricky problem because traditional convergence results require measuring mean squared errors (which are infinite in this setting). We’re developing new techniques to handle these infinite variance cases and still ensure that our algorithms converge effectively.

Another line of research in the group is focused on unbiased estimators of stochastic gradient descent (SGD), especially when dealing with function compositions. When you have layers of functions composed together, getting accurate and unbiased gradient estimates is convenient because then one can directly use the machinery of SGD, but finding unbiased estimators (with finite variance) becomes quite challenging. We’re working on methods to tackle this problem and improve the accuracy and efficiency of SGD in these complex settings.

We’re also pushing boundaries in gradient estimation for parametric diffusions. These estimators are particularly useful in deep learning techniques applied to solving stochastic control problems. This area is exciting because it connects sophisticated mathematical tools with real-world applications, such as financial modeling and neural SDEs. By improving these gradient estimators, we’re helping deep learning models more effectively handle the randomness and uncertainty inherent in complex control problems, enabling better decision-making in dynamic environments. This work has huge potential for advancing techniques in both theory and practice.