Course Description

This graduate-level theory course provides a modern treatment of online learning and learning in games. It is centered on the online learning framework as a paradigm for sequential decision making in strategic and non-stationary environments. Particular attention is devoted to showing how online learning dynamics lead to equilibria in multi-agent, game-theoretic settings.

Co-instructors: Anas Barakat, John Lazarsfeld, Joseph Sakos.
Contact: sutd.glo.course@gmail.com
Time: Tues/Thurs, 10am-12pm SGT
Location: SUTD Building 1, Think Tank 11-12 (1.503)

Syllabus PDF

Schedule

Part I: Online Learning (John)

Lecture 01 -- 2025.09.16
Introduction to Online Learning

Prediction with expert advice; online convex optimization; external regret; Online Gradient Descent algorithm and regret bound.
[notes pdf] [intro slides pdf]
Lecture 02 -- 2025.09.18
Follow-the-Regularized-Leader: No-Regret via Regularization

Family of leader-based algorithms, analysis of Follow-the-Regularized-Leader (FTRL) via coupling with Be-the-Leader/Follow-the-Leader, Multiplicative Weights Update as FTRL, lower bounds for online learning.
[notes pdf]
Lecture 03 -- 2025.09.23
Online Mirror Descent and Follow-the-Perturbed-Leader: No-Regret via Penalty and Perturbation

Online Mirror Descent (OMD) analysis, relationship between OMD and FTRL, Follow-the-Perturbed-Leader (FTPL) analysis.
[notes pdf]
Lecture 04 -- 2025.09.25
Online Learning with Bandit Feedback

Bandit feedback model, expected regret and pseudo-regret, EXP3 algorithm for adversarial bandits.
[notes pdf]
Lecture 05 -- 2025.09.30
Phi-Regret Minimization

Beyond external regret: swap regret, internal regret, and the Phi-Regret framework. Blum-Mansour and Gordon-Greenwald-Marks algorithms.
[notes pdf]
Lecture 06 -- 2025.10.02
Regret Matching and Blackwell Approachability

Regret Matching (RM) and Regret Matching+ (RM+) algorithms, Blackwell's Approachability theorem.
[notes pdf]

Part II: Online Learning in Normal-Form and Stochastic Games (Anas)

Lecture 07 -- 2025.10.07
Introduction to Normal-Form Games and Nash Equilibria

Finite normal-form games, mixed extension, Nash equilibria, Nash's theorem, proof via Brouwer's fixed point theorem.
[notes pdf]
Lecture 08 -- 2025.10.09
Online Learning in Potential Games

Identical interest games, potential games, existence of pure NE, online learning in games paradigm, sublinear regret in potential games, learning approximate Nash equilibria via online learning in potential games.
[notes pdf]
Lecture 09 -- 2025.10.14
Online Learning in Zero-Sum Games

Zero-sum games, minmax theorem and its proof via online learning, learning approximate Nash equilibria via online learning in zero-sum games.
[notes pdf]
Lecture 10 -- 2025.10.16
Learning (Coarse) Correlated Equilibria in General-Sum Games

Coarse correlated equilibria, correlated equilibria, correlation device, Phi-equilibria, time-average convergence via no-Phi-regret learning.
[notes pdf]
Lecture 11 -- 2025.10.21
Optimistic Online Learning and Social Welfare of No-Regret Dynamics

Optimistic FTRL algorithms, RVU bounds, constant regret in zero-sum and potential games, improved regret in general-sum games, social welfare, smooth games, price of anarchy bounds, social welfare of no-regret dynamics.
[notes pdf]
Lecture 12 -- 2025.10.23
Introduction to Stochastic Games and Multi-Agent Reinforcement Learning

Definition of stochastic games, existence of stationary Markovian Nash equilibria, characterization of Nash equilibria, zero-sum Markov games, Shapley's minmax theorem, Markov potential games, independent and decentralized learning, multi-agent policy gradient algorithm, Nash regret.
[notes pdf]

Part III: Learning in Extensive-Form and Continuous Games (Joseph)

Lecture 13 -- 2025.11.06
Introduction to Extensive-Form Games

Game trees, imperfect information, perfect recall, strategy representations, Kuhn's theorem.
[notes pdf]
Lecture 14 -- 2025.11.11
Learning Equilibria in Extensive-Form Games

Counterfactual Regret Minimization algorithm (CFR) and speedups.
Lecture 15 -- 2025.11.13
Introduction to Continuous Games

Concave games, Rosen's theorem, variational inequalities, monotone games, zero-sum games and Gradient Descent Ascent (GDA), divergence of GDA in bilinear case.
Lecture 16 -- 2025.11.18
Learning Equilibria in Continuous Games

Proximal point method, Optimistic GDA and Extragradient algorithms for zero-sum games, learning equilibria in potential games, general concave games.
Lecture 17 -- 2025.11.20
Price of Anarchy and Equilibrium Selection

Braess' paradox, Pigou's network, smooth games, introduction to Price of Anarchy (PoA) bounds.

Part IV: Special Topics

Lecture 18 -- 2025.11.25
Online Learning in Time-Varying Games (Anas)
Lecture 19 -- 2025.11.27
Online (Multi-Agent) Nonstochastic Control (Anas)
Lecture 20 -- 2025.12.02
Bridging Continuous-Time and Discrete-Time Learning in Games (John)
Lecture 21 -- 2025.12.04
Unregularized Learning in Games (John)
Lecture 22 -- 2025.12.09
Supermodular Games (Joseph)
Lecture 23 -- 2025.12.11
Sum-of-Squares Optimization in Games (Joseph)

Assignments and Project

For SUTD students: see eDimension for all assignments and final project details.

Resources

Check back soon...

Acknowledgements

Thanks to Antonios Varvitsiotis and Georgios Piliouras for their support and feedback in designing the course. Thanks to Ryann Sim for help with the initial design of the course content.

Last Updated: 2025.10.21