Dynamic measure transport for sampling and quantization
Overview
Sampling or otherwise summarizing complex probability distributions is a central task in applied mathematics, statistics, and machine learning. Many modern algorithms for this task introduce dynamics in the space of probability measures, designing these dynamics to achieve good practical performance.
We will discuss several aspects of this broad design endeavor. First is the problem of optimal scheduling of dynamic transport, i.e., with what speed should one proceed along a prescribed path of probability measures? Though many popular methods seek straight-line trajectories, i.e., trajectories with zero acceleration in a Lagrangian frame, we show how a specific class of "curved" trajectories can significantly improve approximation and learning. Specifically, we consider the unit-time interpolation of a given transport map with the identity, and introduce a "schedule" function that rescales time. We show that a schedule minimizing the spatial Lipschitz constant of the velocity field, uniformly over time, can be computed in closed form, and that the resulting Lipschitz constant is exponentially smaller than that induced by an identity schedule (corresponding to, for instance, the Wasserstein geodesic). We then discuss extensions of this idea which seek not only schedules but also paths that improve the spatial regularity of the velocity.
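To make the scheduling idea concrete, here is a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the general result of the talk. Assume the linear map T(x) = a x with a > 0 and a != 1, and the interpolation T_t(x) = (1 - tau(t)) x + tau(t) T(x). The velocity field is then linear in space with slope tau'(t) (a - 1) / (1 + tau(t) (a - 1)), so its spatial Lipschitz constant is the absolute value of that slope. Because this slope integrates to log(a) over the unit interval, no schedule can achieve a supremum below |log(a)| in this toy case, and the exponential schedule tau(t) = (a^t - 1) / (a - 1) attains it. All function names below are hypothetical.

```python
import numpy as np


def velocity_lipschitz(tau, dtau, a, num_t=1001):
    """Sup over t in [0, 1] of the spatial Lipschitz constant of the velocity.

    Assumes the toy setting T(x) = a * x (a > 0, a != 1), where the velocity
    is linear in space with slope dtau(t) * (a - 1) / (1 + tau(t) * (a - 1)).
    """
    t = np.linspace(0.0, 1.0, num_t)
    slope = dtau(t) * (a - 1.0) / (1.0 + tau(t) * (a - 1.0))
    return float(np.max(np.abs(slope)))


def identity_schedule(a):
    """tau(t) = t: the straight-line interpolation (a is unused)."""
    return (lambda t: t), (lambda t: np.ones_like(t))


def exponential_schedule(a):
    """tau(t) = (a**t - 1) / (a - 1): makes the slope constant in time."""
    tau = lambda t: (a**t - 1.0) / (a - 1.0)
    dtau = lambda t: a**t * np.log(a) / (a - 1.0)
    return tau, dtau


a = 100.0  # illustrative scaling factor, chosen arbitrarily
lip_identity = velocity_lipschitz(*identity_schedule(a), a)
lip_exponential = velocity_lipschitz(*exponential_schedule(a), a)
print(lip_identity, lip_exponential)
```

In this toy case the identity schedule gives a Lipschitz constant of a - 1 = 99, attained at t = 0, while the exponential schedule gives log(a), roughly 4.6, which is the kind of exponential gap the abstract describes.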
Second, we discuss the problem of weighted quantization, i.e., summarizing a complex distribution with a small set of weighted Dirac measures. We study this problem from the perspective of minimizing maximum mean discrepancy via gradient flow in the Wasserstein–Fisher–Rao (WFR) geometry. This gradient flow yields an ODE system from which we derive a fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the classical mean shift algorithm, used for identifying modes in kernel density estimates, and that it outperforms state-of-the-art methods for quantization. We also describe how MSIP can be used not only to quantize an empirical measure, but to generate good particle approximations given only an unnormalized density.
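For orientation, the following sketch shows the two classical ingredients named above: the mean shift fixed-point iteration for a kernel density estimate, and the squared maximum mean discrepancy (MMD) between a set of weighted Dirac measures and an empirical measure. This is not MSIP itself, whose update is not specified in this abstract. The Gaussian kernel, the bandwidth, the toy data, the equal node weights, and all function names are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between rows of x (m, d) and rows of y (n, d)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mean_shift(start, data, bandwidth, num_iters=200, tol=1e-10):
    """Classical mean shift: iterate x <- kernel-weighted mean of the data."""
    x = np.asarray(start, dtype=float)
    for _ in range(num_iters):
        weights = gaussian_kernel(x[None, :], data, bandwidth)[0]  # shape (n,)
        x_new = weights @ data / weights.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def squared_mmd(nodes, node_weights, data, bandwidth):
    """Squared MMD between sum_j w_j delta_{y_j} and the empirical measure."""
    n = data.shape[0]
    data_weights = np.full(n, 1.0 / n)
    k_nn = gaussian_kernel(nodes, nodes, bandwidth)
    k_nd = gaussian_kernel(nodes, data, bandwidth)
    k_dd = gaussian_kernel(data, data, bandwidth)
    return float(
        node_weights @ k_nn @ node_weights
        - 2.0 * node_weights @ k_nd @ data_weights
        + data_weights @ k_dd @ data_weights
    )


# Toy data: two tight clusters on the real line (purely illustrative).
data = np.array([[-2.1], [-2.0], [-1.9], [1.9], [2.0], [2.1]])
mode_left = mean_shift(np.array([-1.5]), data, bandwidth=0.5)
mode_right = mean_shift(np.array([1.5]), data, bandwidth=0.5)

# Compare a two-node summary placed at the modes with a single node at 0.
nodes = np.vstack([mode_left, mode_right])
mmd2_two_nodes = squared_mmd(nodes, np.array([0.5, 0.5]), data, bandwidth=0.5)
mmd2_one_node = squared_mmd(np.array([[0.0]]), np.array([1.0]), data, bandwidth=0.5)
print(mode_left, mode_right, mmd2_two_nodes, mmd2_one_node)
```

On this toy data, the two mean shift runs settle near the cluster centers at -2 and 2, and the two-node summary has a much smaller squared MMD than a single node at the origin. A quantization method would additionally optimize the node locations and weights jointly, which this sketch does not attempt.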
Presenters
Youssef Marzouk, Professor, MIT
Brief Biography
Youssef Marzouk is the Breene M. Kerr (1951) Professor in Aeronautics and Astronautics at MIT and Associate Dean of the MIT Schwarzman College of Computing. He is also a PI in the MIT Laboratory for Information and Decision Systems and a core member of MIT's Statistics and Data Science Center.
His research interests lie at the intersection of computational mathematics, statistical inference, and physical modeling. He develops new methodologies for uncertainty quantification, Bayesian computation, and machine learning, motivated by a broad range of engineering and science applications. His recent work has centered on algorithms for inference, with applications to data assimilation and inverse problems; dimension reduction methodologies for high-dimensional learning and surrogate modeling; optimal experimental design; and transportation of measure as a tool for inference and generative modeling.
He received his SB, SM, and PhD degrees from MIT and spent four years at Sandia National Laboratories before joining the MIT faculty in 2009. He is a Fellow of SIAM and an Associate Fellow of the AIAA. He is also an avid coffee drinker and an occasional classical pianist.