Conditional Gradient Methods

The purpose of our survey is to serve both as a gentle introduction to and a coherent overview of state-of-the-art Frank–Wolfe algorithms, also called conditional gradient algorithms, for function minimization. These are first-order methods that access the feasible region only through linear minimization; they are especially useful in convex optimization, in particular when linear optimization over the feasible region is cheaper than projection onto it.
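To illustrate the linear-minimization access model, here is a minimal sketch (not taken from the survey) of the classic Frank–Wolfe iteration over the probability simplex, where the linear minimization oracle is trivial: minimizing a linear function over the simplex just picks the vertex with the smallest gradient coordinate. The function names and the quadratic objective are illustrative choices, not part of the survey.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, steps=1000):
    """Minimize a smooth convex function over the probability simplex.

    The feasible region is accessed only through the linear minimization
    oracle (LMO): argmin_{v in simplex} <g, v>, which for the simplex is
    the standard basis vector e_i with i = argmin_i g_i.
    """
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        v = np.zeros_like(x)
        v[np.argmin(g)] = 1.0            # LMO: cheapest vertex of the simplex
        gamma = 2.0 / (t + 2.0)          # classic open-loop step size
        x = (1 - gamma) * x + gamma * v  # convex combination stays feasible
    return x

# Example: minimize f(x) = ||x - b||^2 over the simplex; since b lies
# in the simplex, the iterates approach b.
b = np.array([0.1, 0.7, 0.2])
x = frank_wolfe_simplex(lambda x: 2 * (x - b), np.ones(3) / 3)
```

Note that no projection is ever computed: each iterate is a convex combination of simplex vertices, so feasibility is maintained automatically, which is the appeal of these methods when projection is expensive.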

The selection of the material has been guided by the principle of highlighting crucial ideas as well as presenting new approaches that we believe might become important in the future, with ample citations, including older works that were instrumental in the development of newer methods. Yet, our selection is sometimes biased, for example all applications are from machine learning, and need not reflect the consensus of the research community. We have certainly missed recent important contributions. After all, the research area of Frank–Wolfe methods is very active, making it a moving target. We apologize sincerely in advance for any such distortions and we fully acknowledge: we stand on the shoulders of giants.



Cite as

Gábor Braun, Alejandro Carderera, Cyrille W Combettes, Hamed Hassani, Amin Karbasi, Aryan Mokhtari, and Sebastian Pokutta, Conditional Gradient Methods, arXiv:2211.14103 [math.OC]

or in BibTeX format

@misc{braun2022conditional,
  author =        {G{\'a}bor Braun and Alejandro Carderera and Cyrille
                   W Combettes and Hamed Hassani and Amin Karbasi and
                   Aryan Mokhtari and Sebastian Pokutta},
  title =         {Conditional Gradient Methods},
  month =         nov,
  year =          2022,
  archivePrefix = {arXiv},
  primaryClass =  {math.OC},
  eprint =        {2211.14103},
  url =           {https://arxiv.org/abs/2211.14103}
}


The Python code used in the survey to generate all comparisons and graphics will be made available soon.

Many of the more general variants are available as reference implementations in the FrankWolfe.jl Julia package.


Feedback, suggestions, and comments are very much welcome at