Convergence of Langevin-Simulated Annealing algorithms with multiplicative noise

Abstract

We study the convergence of Langevin-Simulated Annealing type algorithms with multiplicative noise, i.e. for a potential function V:Rd→R to minimize, we consider the stochastic equation dYt=−σσ⊤∇V(Yt)dt+a(t)σ(Yt)dWt+a(t)2Υ(Yt)dt, where (Wt) is a Brownian motion, σ:Rd→Md(R) is an adaptive (multiplicative) noise coefficient, a:R+→R+ is a function decreasing to 0, and Υ is a correction term. This setting can be applied to optimization problems arising in Machine Learning. The case where σ is a constant matrix has been extensively studied; however, little attention has been paid to the general case. We prove the convergence, in L1-Wasserstein distance, of Yt and of the associated Euler scheme Y¯t to some measure ν⋆ supported by argmin(V), and give rates of convergence to the instantaneous Gibbs measure νa(t) of density ∝exp(−2V(x)/a(t)2). To do so, we first consider the case where a is a piecewise constant function, recovering the classical schedule a(t)=Alog−1/2(t). We then prove the convergence in the general case by bounding the Wasserstein distance to the piecewise constant case using ergodicity properties.
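The Euler scheme mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function and parameter names (`grad_V`, `sigma`, `upsilon`, `A`, `h`) are hypothetical, the correction term Υ is supplied by the user, and the schedule a(t)=A·log^{-1/2}(t) is taken from the abstract (with a time shift so the logarithm stays positive).

```python
import numpy as np

def euler_langevin_sa(grad_V, sigma, upsilon, y0, n_steps, h, A=1.0, seed=0):
    """Euler-Maruyama sketch (hypothetical names) for the SDE
        dY_t = -sigma sigma^T grad_V(Y_t) dt
               + a(t) sigma(Y_t) dW_t + a(t)^2 Upsilon(Y_t) dt,
    with the annealing schedule a(t) = A * log(t)^{-1/2}."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y0, dtype=float)
    d = y.size
    for k in range(n_steps):
        t = (k + 1) * h + np.e          # shift so that log(t) >= 1
        a = A / np.sqrt(np.log(t))      # decreasing noise level a(t)
        S = sigma(y)                    # multiplicative noise matrix, d x d
        drift = -S @ S.T @ grad_V(y) + a**2 * upsilon(y)
        dW = rng.normal(scale=np.sqrt(h), size=d)  # Brownian increment
        y = y + drift * h + a * (S @ dW)
    return y

# Toy usage: quadratic potential V(x) = |x|^2 / 2, constant sigma = I,
# so the correction term Upsilon vanishes; iterates drift toward argmin(V) = {0}.
y_final = euler_langevin_sa(
    grad_V=lambda y: y,
    sigma=lambda y: np.eye(2),
    upsilon=lambda y: np.zeros(2),
    y0=[5.0, 5.0], n_steps=20_000, h=0.01, A=0.5,
)
```

With a constant σ the scheme reduces to the classical additive-noise Langevin simulated annealing; the Υ term only matters when σ varies with the state.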

Publication
To be published in Mathematics of Computation


Pierre Bras
PhD Student in Applied Mathematics

I am a PhD student under the supervision of Gilles Pagès, interested in Machine Learning, Stochastic Optimization and Numerical Probability.
