Part I: Online Learning (John)
Prediction with expert advice; online convex optimization; external regret; Online Gradient Descent algorithm and regret bound.
[notes pdf]Family of leader-based algorithms, analysis of Follow-the-Regularized-Leader (FTRL) via coupling with Be-the-Leader/Follow-the-Leader, Multiplicative Weights Update as FTRL, lower bounds for online learning.
Follow-the Perturbed-Leader (FTPL) analysis, equivalence between FTPL and FTRL, Online Mirror Descent (OMD) analysis.
Bandit feedback model, expected regret and pseudo-regret, EXP3 algorithm for adversarial bandits, Explore-the-Commit and UCB algorithms for stochastic bandits.
Beyond external regret: swap-regret, internal-regret, and the Phi-Regret framework. Blum-Mansour and Stoltz-Lugosi algorithms.
Blackwell's Approachability theorem, Regret Matching (RM) and Regret Matching+ (RM+) algorithms.
Part II: Learning in Normal-Form and Stochastic Games (Anas)
Normal-Form Games, Nash equilibria (NE), game classes (potential, zero-sum, decomposition).
Hindsight rationality, proof of minimax theorem via online learning, learning NEs in potential games.
(Coarse)-correlated equilibria, time-average convergence via no-phi-regret learning, average vs. last-iterate convergence.
Optimistic FTRL algorithms, RVU bounds, individual vs. sum of regrets, fast convergence of social welfare.
Introduction to Markov Decision Processes (MDPs) and Reinforcement Learning, definition of stochastic games, Shapley's minimax theorem, existence of Nash equilibria.
Independent and decentralized learning, zero-sum Markov games and Markov potential games, policy gradient methods.
Part III: Learning in Extensive-Form and Continuous Games (Joseph)
Game trees, imperfect information, perfect recall, strategy representations, Kuhn's theorem.
Counterfactual Regret Minimization algorithm (CFR) and speedups.
Concave games, Rosen's theorem, variational inequalities, monotone games, zero-sum games and Gradient Descent Ascent (GDA), divergence of GDA in bilinear case.
Proximal point method, Optimistic GDA and Extragradient algorithms for zero-sum games, learning equilibria in potential games, general concave games.
Braess' paradox, Pigou's network, smooth games, introduction to Price of Anarchy (PoA) bounds.
Part IV: Special Topics