(Session 1)
Range of the displacement operator of PDHG with applications to quadratic and conic programming
         
Walaa Moursi, University of Waterloo
Primal-dual hybrid gradient (PDHG) is a first-order method for saddle-point problems and convex programming introduced
by Chambolle and Pock. Recently, Applegate et al. analyzed the behavior of PDHG when applied to an infeasible or unbounded
instance of linear programming, and in particular, showed that PDHG is able to diagnose these conditions. Their analysis
hinges on the notion of the infimal displacement vector in the closure of the range of the displacement mapping of the
splitting operator that encodes the PDHG algorithm. In this paper, we develop a novel formula for this range using monotone
operator theory. The analysis is then specialized to conic programming and further to quadratic programming (QP) and
second-order cone programming (SOCP). A consequence of our analysis is that PDHG is able to diagnose infeasible or
unbounded instances of QP and of the ellipsoid-separation problem, a subclass of SOCP.
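For context, a brief sketch in generic notation (these are standard definitions from the PDHG literature, not statements taken from the paper): applied to the saddle-point problem $\min_x \max_y\, \langle Kx, y\rangle + g(x) - h^*(y)$ with step sizes $\tau, \sigma > 0$, PDHG iterates
\[
x^{k+1} = \mathrm{prox}_{\tau g}\big(x^k - \tau K^{*} y^{k}\big), \qquad
y^{k+1} = \mathrm{prox}_{\sigma h^{*}}\big(y^k + \sigma K (2x^{k+1} - x^{k})\big).
\]
Writing one full iteration as $z^{k+1} = T z^{k}$ with $z = (x, y)$, the displacement mapping is $T - \mathrm{Id}$, and the infimal displacement vector is the minimal-norm element
\[
v = P_{\overline{\mathrm{ran}}\,(T - \mathrm{Id})}(0).
\]
If PDHG has a fixed point then $v = 0$, while a nonzero $v$ is the object Applegate et al. use to certify infeasibility or unboundedness of an LP instance.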
(Session 2)
Optimization applications in radiation treatment planning
         
Lewei Zhao, Department of Radiation Oncology, Stanford University
Radiation therapy plays a pivotal role in the treatment of cancer, offering the potential for precise and effective tumor targeting while
sparing healthy tissues. Treatment planning is naturally posed as an optimization problem: maximize the dose delivered to the tumor while minimizing the dose to, and side effects on, surrounding normal tissue.
In recent years, the field of radiation therapy has witnessed a paradigm shift, thanks to the integration of mathematical optimization methods. This
talk will be an overview of treatment plan optimization, emphasizing the role of mathematical modeling and computational algorithms in refining radiation
therapy plans to achieve the best possible balance between tumor coverage and normal tissue sparing. A series of works will be
introduced, including applications of evolutionary algorithms, the alternating direction method of multipliers, the primal-dual
active set method, and multi-criteria optimization.
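As a purely illustrative example (not the speaker's formulation), a simple fluence-map optimization can be written with a dose-influence matrix mapping nonnegative beamlet weights to voxel doses, penalizing deviation from the prescribed target dose and any dose to normal tissue; all data, names, and weights below are hypothetical.

import numpy as np
from scipy.optimize import minimize

# Hypothetical dose-influence matrix D: beamlet weights x >= 0 -> voxel doses D @ x.
rng = np.random.default_rng(0)
n_voxels, n_beamlets = 200, 50
D = rng.random((n_voxels, n_beamlets))
is_tumor = np.zeros(n_voxels, dtype=bool)
is_tumor[:60] = True          # first 60 voxels form the target volume (hypothetical)
d_rx = 60.0                   # prescribed target dose, hypothetical units
w_tumor, w_normal = 1.0, 0.1  # trade-off weights between coverage and sparing

def objective(x):
    dose = D @ x
    coverage = np.sum((dose[is_tumor] - d_rx) ** 2)   # penalize under/over-dosing the tumor
    sparing = np.sum(dose[~is_tumor] ** 2)            # penalize any dose to normal tissue
    return w_tumor * coverage + w_normal * sparing

x0 = np.ones(n_beamlets)
res = minimize(objective, x0, method="L-BFGS-B", bounds=[(0.0, None)] * n_beamlets)
print("objective at solution:", res.fun)

Sweeping w_normal traces out the trade-off curve between tumor coverage and normal-tissue sparing that multi-criteria planning explores systematically.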
(Session 3)
Exploiting Second-order Information in Conjunction with Variance Reduction Promotes Robustness
         
Sachin Garg, EECS Department, University of Michigan
Stochastic gradient methods (SG) have witnessed great success in solving convex
finite-sum (over $n$ components) optimization tasks such as empirical risk minimization, but
achieve only a sublinear convergence rate to the optimum. Incorporating variance-reduction
techniques accelerates stochastic first-order methods, yielding global linear convergence for
smooth and strongly convex functions. However, the convergence rate of first-order
variance-reduced stochastic methods deteriorates as the mini-batch size used for the
stochastic gradient grows. This suggests using small batches for the optimal convergence
rate, resulting in highly sequential algorithms.
algorithms. Moreover, the literature on the benefits of variance reduction techniques
to accelerate the popular second-order methods [Der22] is limited and requires further
investigation.
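To make the variance-reduction mechanism concrete, here is a minimal SVRG-style loop in the spirit of [JZ13] (a generic sketch; grad_i, the step size, and the loop lengths are illustrative, not the algorithm analyzed in this work):

import numpy as np

def svrg(grad_i, n, x0, step, n_epochs=10, m=None, seed=0):
    # grad_i(x, i): gradient of the i-th component function at x.
    rng = np.random.default_rng(seed)
    m = m if m is not None else n          # inner-loop length, commonly on the order of n
    x_ref = np.asarray(x0, dtype=float).copy()
    for _ in range(n_epochs):
        full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)  # one full pass
        x = x_ref.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Unbiased estimator whose variance shrinks as x and x_ref approach the optimum.
            g = grad_i(x, i) - grad_i(x_ref, i) + full_grad
            x = x - step * g
        x_ref = x                          # refresh the reference (snapshot) point
    return x_ref

The reduced variance is what upgrades the sublinear rate of plain stochastic gradient to a linear rate on smooth, strongly convex problems.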
In this work, we propose a stochastic second-order method with variance-reduced
gradients which achieves the same fast convergence regardless of the mini-batch size,
and takes advantage of variance reduction to provide improved local convergence rates
(per data pass) compared to both first-order stochastic variance-reduced methods and
stochastic second-order methods. Specifically, we prove that in the big data regime
where the number of components $n$ is much larger than the condition number $\kappa$, our
algorithm achieves a local linear convergence rate of $(C\frac{\kappa}{n}\log(n/\kappa))^t$ with high probability
after $t$ iterations, independent of the choice of mini-batch size. We compare our
method with Stochastic Variance Reduced Gradient (SVRG) [JZ13] and Subsampled
Newton (SN) [RM19], and provide evidence that the proposed method achieves a faster
convergence rate, and is more robust to the noise in the stochastic Hessian.
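A schematic of the kind of update combining a subsampled Hessian with a variance-reduced gradient (a hedged sketch of the general idea only; the batch construction, regularization, and step size below are illustrative and not the paper's exact method):

import numpy as np

def vr_newton_step(x, x_ref, full_grad, grad_i, hess_i, batch, lam=1e-6, step=1.0):
    # Variance-reduced gradient estimate at x, built from the reference point x_ref.
    g = np.mean([grad_i(x, i) - grad_i(x_ref, i) for i in batch], axis=0) + full_grad
    # Mini-batch (subsampled) Hessian, lightly regularized before solving.
    H = np.mean([hess_i(x, i) for i in batch], axis=0) + lam * np.eye(x.size)
    return x - step * np.linalg.solve(H, g)

As in SVRG, the full gradient at the reference point would be refreshed once per data pass; the claim above is that with this combination the mini-batch size can be taken large without slowing the local convergence rate.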
References:
[JZ13] Rie Johnson and Tong Zhang, Accelerating stochastic gradient descent using predictive
variance reduction, Advances in Neural Information Processing Systems 26 (2013).
[RM19] Farbod Roosta-Khorasani and Michael W. Mahoney, Sub-sampled Newton methods,
Mathematical Programming 174(1) (2019), pp. 293-326.
[Der22] Michal Derezinski, Stochastic Variance-Reduced Newton: Accelerating Finite-Sum
Minimization with Large Batches, arXiv preprint arXiv:2206.02702 (2022).