Part I: Online Learning (John)
Prediction with expert advice; online convex optimization; external regret; Online Gradient Descent algorithm and regret bound.
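As an illustration (not part of the course notes), here is a minimal sketch of projected Online Gradient Descent with the standard step size η_t = 1/√t; the quadratic loss stream and the feasible interval [-1, 1] are toy assumptions:

```python
import math

def ogd(grad, project, x0, steps):
    """Projected Online Gradient Descent with step size eta_t = 1/sqrt(t),
    the schedule behind the standard O(sqrt(T)) regret bound."""
    x = x0
    for t in range(1, steps + 1):
        x = project(x - grad(x, t) / math.sqrt(t))
    return x

# Toy loss stream: each round the adversary plays f_t(x) = (x - 0.5)^2 on [-1, 1].
grad = lambda x, t: 2.0 * (x - 0.5)
project = lambda x: max(-1.0, min(1.0, x))
x_final = ogd(grad, project, x0=0.0, steps=2000)
```

With this fixed loss the iterate settles at the common minimizer 0.5; against a changing sequence the same update guarantees sublinear regret.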
Family of leader-based algorithms, analysis of Follow-the-Regularized-Leader (FTRL) via coupling with Be-the-Leader/Follow-the-Leader, Multiplicative Weights Update as FTRL, lower bounds for online learning.
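The FTRL view can be made concrete: with the negative-entropy regularizer over the simplex, FTRL reduces to Multiplicative Weights. A small sketch with illustrative fixed per-round losses:

```python
import math

def mwu(loss_rows, eta):
    """Multiplicative Weights Update: FTRL with the negative-entropy
    regularizer yields w_i proportional to exp(-eta * cumulative_loss_i)."""
    n = len(loss_rows[0])
    cum = [0.0] * n
    for losses in loss_rows:
        cum = [c + l for c, l in zip(cum, losses)]
    w = [math.exp(-eta * c) for c in cum]
    z = sum(w)
    return [wi / z for wi in w]

# Three experts with fixed losses; the best expert (index 0) should
# accumulate almost all of the weight after many rounds.
T = 200
weights = mwu([[0.1, 0.5, 0.9]] * T, eta=0.5)
```

The weight on the best expert approaches 1 exponentially fast in the loss gap.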
Online Mirror Descent (OMD) analysis, relationship between OMD and FTRL, Follow-the-Perturbed-Leader (FTPL) analysis.
Bandit feedback model, expected regret and pseudo-regret, EXP3 algorithm for adversarial bandits.
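A minimal sketch of EXP3 (learning rate, uniform-exploration mixing, and the toy bandit where arm 0 is always best are all assumptions for the example):

```python
import math, random

def exp3(pull, n_arms, T, eta, gamma):
    """EXP3: exponential weights on importance-weighted loss estimates,
    mixed with a gamma fraction of uniform exploration."""
    Lhat = [0.0] * n_arms          # importance-weighted cumulative losses
    p = [1.0 / n_arms] * n_arms
    rng = random.Random(0)         # fixed seed for reproducibility
    for _ in range(T):
        w = [math.exp(-eta * L) for L in Lhat]
        z = sum(w)
        p = [(1 - gamma) * wi / z + gamma / n_arms for wi in w]
        arm = rng.choices(range(n_arms), weights=p)[0]
        loss = pull(arm)
        Lhat[arm] += loss / p[arm]  # unbiased estimate of the full loss vector
    return p

# Toy adversarial bandit: arm 0 always has loss 0, the others loss 1.
probs = exp3(lambda a: 0.0 if a == 0 else 1.0,
             n_arms=3, T=2000, eta=0.1, gamma=0.05)
```

Only the pulled arm's loss is observed; the division by p[arm] is what keeps the loss estimates unbiased under bandit feedback.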
Beyond external regret: swap-regret, internal-regret, and the Phi-regret framework. Blum-Mansour and Gordon-Greenwald-Marks algorithms.
Regret Matching (RM) and Regret Matching+ (RM+) algorithms, Blackwell's Approachability theorem.
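A sketch of Regret Matching in self-play on rock-paper-scissors (the asymmetric initialization is an arbitrary choice to break the symmetric fixed point); the time-averaged strategy should approach the uniform Nash equilibrium:

```python
def rm_strategy(regrets):
    """Regret Matching: play in proportion to positive cumulative regrets."""
    pos = [max(r, 0.0) for r in regrets]
    s = sum(pos)
    n = len(regrets)
    return [p / s for p in pos] if s > 0 else [1.0 / n] * n

# Rock-paper-scissors payoffs; the game is symmetric zero-sum, so both
# players can score their actions with the same matrix A.
A = [[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]]

def update(regrets, mine, opp):
    util = [sum(A[a][b] * opp[b] for b in range(3)) for a in range(3)]
    played = sum(util[a] * mine[a] for a in range(3))
    return [r + util[a] - played for a, r in enumerate(regrets)]

r1, r2 = [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]   # asymmetric start breaks the tie
avg1 = [0.0, 0.0, 0.0]
T = 20000
for _ in range(T):
    s1, s2 = rm_strategy(r1), rm_strategy(r2)
    avg1 = [a + s / T for a, s in zip(avg1, s1)]
    r1, r2 = update(r1, s1, s2), update(r2, s2, s1)
```

The current strategies cycle, but the averages converge, which is exactly the time-average guarantee delivered by no-regret learning in zero-sum games.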
Part II: Online Learning in Normal-Form and Stochastic Games (Anas)
Finite normal-form games, mixed extension, Nash equilibria, Nash's theorem, proof via Brouwer's fixed point theorem.
Identical interest games, potential games, existence of pure NE, online learning in games paradigm, sublinear regret in potential games, learning approximate Nash equilibria via online learning in potential games.
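A toy illustration of the potential-game guarantee (the game and its link costs are assumptions for the example): better-response dynamics in a small congestion game must terminate at a pure Nash equilibrium, since every improving move strictly decreases the potential.

```python
# Three players each choose one of two parallel links; a link's cost is its
# load. Congestion games are potential games, so better-response dynamics
# cannot cycle and stops at a pure Nash equilibrium.
def load(profile, link):
    return sum(1 for c in profile if c == link)

def better_response_dynamics(profile):
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            here = load(profile, profile[i])      # current cost (load includes me)
            other = 1 - profile[i]
            there = load(profile, other) + 1      # cost after switching links
            if there < here:
                profile[i] = other                # strictly improving move
                improved = True
    return profile

final = better_response_dynamics([0, 0, 0])       # everyone starts on link 0
loads = sorted([load(final, 0), load(final, 1)])
```

Starting from the worst profile, one player switches and the resulting 2/1 split is a pure NE: no player can lower their cost by moving.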
Zero-sum games, minmax theorem, online learning proof, learning approximate Nash equilibria via online learning in zero-sum games.
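The online-learning route to the minmax theorem can be checked numerically: two Multiplicative Weights learners in self-play on an assumed 2x2 zero-sum game, with the duality gap of the average strategies shrinking over time.

```python
import math

# An assumed 2x2 zero-sum game (row player maximizes A[i][j]); its unique
# mixed Nash equilibrium is x = (0.6, 0.4), y = (0.5, 0.5) with value 1.
A = [[3.0, -1.0], [-2.0, 4.0]]

def softmax(scores, eta):
    m = max(scores)                               # stabilize the exponent
    w = [math.exp(eta * (s - m)) for s in scores]
    z = sum(w)
    return [wi / z for wi in w]

Sx, Sy = [0.0, 0.0], [0.0, 0.0]   # cumulative rewards of each pure action
ax, ay = [0.0, 0.0], [0.0, 0.0]   # time-averaged strategies
T, eta = 20000, 0.01
for _ in range(T):
    x, y = softmax(Sx, eta), softmax(Sy, eta)
    ax = [a + xi / T for a, xi in zip(ax, x)]
    ay = [a + yi / T for a, yi in zip(ay, y)]
    Sx = [s + A[i][0] * y[0] + A[i][1] * y[1] for i, s in enumerate(Sx)]
    Sy = [s - A[0][j] * x[0] - A[1][j] * x[1] for j, s in enumerate(Sy)]

# Duality gap of the averages: best responses against them nearly coincide.
best_row = max(A[i][0] * ay[0] + A[i][1] * ay[1] for i in range(2))
best_col = min(A[0][j] * ax[0] + A[1][j] * ax[1] for j in range(2))
gap = best_row - best_col
```

The gap is bounded by the sum of the two players' average regrets, so it vanishes at the no-regret rate, which is the computational content of the minmax theorem.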
Coarse correlated equilibria, correlated equilibria, correlation device, Phi-equilibria, time-average convergence via no-Phi-regret learning.
Optimistic FTRL algorithms, RVU bounds, constant regret in zero-sum and potential games, improved regret in general-sum games, social welfare, smooth games, price of anarchy bounds, social welfare of no-regret dynamics.
Definition of stochastic games, existence of stationary Markovian Nash equilibria, characterization of Nash equilibria, zero-sum Markov games, Shapley's minmax theorem, Markov potential games, independent and decentralized learning, multi-agent policy gradient algorithm, Nash regret.
Part III: Learning in Extensive-Form and Continuous Games (Joseph)
Game trees, imperfect information, perfect recall, strategy representations, Kuhn's theorem.
Counterfactual Regret Minimization (CFR) algorithm and speedups.
Concave games, Rosen's theorem, variational inequalities, monotone games, zero-sum games and Gradient Descent Ascent (GDA), divergence of GDA in bilinear case.
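A sketch contrasting simultaneous GDA with Extragradient on the bilinear objective f(x, y) = xy (step size and starting point are arbitrary illustrative choices): GDA spirals outward, multiplying the squared norm by 1 + η² each step, while the lookahead step of Extragradient makes the iterates contract toward the saddle at the origin.

```python
def gda(x, y, eta, steps):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y."""
    for _ in range(steps):
        x, y = x - eta * y, y + eta * x
    return x, y

def extragradient(x, y, eta, steps):
    """Extragradient: take a lookahead step, then update with its gradient."""
    for _ in range(steps):
        xm, ym = x - eta * y, y + eta * x       # extrapolation (lookahead)
        x, y = x - eta * ym, y + eta * xm       # update at the lookahead point
    return x, y

eta, steps = 0.1, 300
norm = lambda p: (p[0] ** 2 + p[1] ** 2) ** 0.5
n_gda = norm(gda(1.0, 1.0, eta, steps))         # diverges from the saddle
n_eg = norm(extragradient(1.0, 1.0, eta, steps))  # converges to (0, 0)
```

Optimistic GDA behaves like Extragradient here: both damp the rotation that defeats plain GDA in the bilinear case.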
Proximal point method, Optimistic GDA and Extragradient algorithms for zero-sum games, learning equilibria in potential games, general concave games.
Braess' paradox, Pigou's network, smooth games, introduction to Price of Anarchy (PoA) bounds.
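Pigou's example can be verified numerically; the grid search below is an illustrative stand-in for the closed-form optimal split f = 1/2.

```python
# Pigou's network: one unit of traffic from s to t over two parallel roads,
# with constant latency 1 on the top road and latency x (its own flow) on
# the bottom. At equilibrium all traffic takes the bottom road (cost 1);
# the optimum splits traffic evenly (cost 3/4), so the PoA is 4/3.
def total_cost(f):
    """Total latency when a fraction f of the traffic uses the bottom road."""
    return f * f + (1 - f) * 1.0

eq_cost = total_cost(1.0)                               # selfish equilibrium
opt_cost = min(total_cost(k / 1000) for k in range(1001))
poa = eq_cost / opt_cost
```

This 4/3 is exactly the smoothness-based Price of Anarchy bound for affine latencies.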
Part IV: Special Topics