MADGRAD: A high-performance deep learning optimizer
I've just open-sourced an implementation of the MADGRAD optimizer that I developed together with Samy Jelassi. It outperforms Adam on every problem I've tried, and its generalization performance is comparable to SGD, avoiding the overfitting problems of adaptive methods entirely!
Check it out here: https://github.com/facebookresearch/madgrad
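If you want to give it a spin, here's a minimal sketch of using it as a drop-in replacement for Adam or SGD in a PyTorch training loop. This assumes the package installs as `madgrad` and exposes a `MADGRAD` class following the standard PyTorch optimizer interface; the model, data, and hyperparameter values below are illustrative only.

```python
import torch

# Assumes `pip install madgrad` and that the package exposes a MADGRAD class
# with the usual PyTorch optimizer interface.
from madgrad import MADGRAD

# Toy model and loss, purely for illustration.
model = torch.nn.Linear(10, 2)
loss_fn = torch.nn.CrossEntropyLoss()

# Constructed like any other PyTorch optimizer: pass the parameters
# and a learning rate (hyperparameter values here are illustrative).
optimizer = MADGRAD(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=0)

# Dummy batch standing in for real training data.
x = torch.randn(32, 10)
y = torch.randint(0, 2, (32,))

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```

One practical note: learning rates that work well for Adam or SGD won't necessarily carry over, so it's worth retuning the learning rate (and weight decay) when switching to MADGRAD.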