Matus Telgarsky (University of Illinois)
Date and Time: Nov 07, 2018 (12:30 PM)
Location: Orchard room (3280) at the Wisconsin Institute for Discovery Building
From the confusion surrounding the optimization and generalization of deep
networks has arisen an exciting possibility: gradient descent is implicitly
regularized, meaning it not only outputs iterates of low error, but moreover
iterates of low complexity.
This talk starts with a "spectrally-normalized" generalization bound which is
small if gradient descent happens to select iterates with certain favorable
properties. These properties can be verified in practice, but the bulk of the
talk will work towards theoretical guarantees, showing first that even
stronger properties hold for logistic regression, and second that they hold
for linear networks of arbitrary depth.
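The logistic-regression phenomenon can be illustrated with a toy sketch (not code from the talk; the data, step size, and iteration count below are illustrative assumptions): on linearly separable data, gradient descent on the logistic loss drives the weight norm to grow without bound, yet the iterates' direction stabilizes toward a large-margin separator, so a normalized complexity measure such as the margin stays well behaved.

```python
import numpy as np

# Hypothetical toy data: two well-separated Gaussian clusters in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.3, size=(20, 2)),
               rng.normal([-2, -2], 0.3, size=(20, 2))])
y = np.concatenate([np.ones(20), -np.ones(20)])

# Plain gradient descent on the logistic loss, mean over the sample.
w = np.zeros(2)
lr = 0.5
directions = []
for t in range(2000):
    margins = y * (X @ w)
    # d/dw mean(log(1 + exp(-y x.w))) = -mean(y x * sigmoid(-y x.w))
    grad = -(y[:, None] * X * (1.0 / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad
    if t in (100, 1999):
        directions.append(w / np.linalg.norm(w))

# The unnormalized norm keeps growing, but the normalized margin
# (a complexity-style quantity) is positive and stable.
final_margin = (y * (X @ w)).min() / np.linalg.norm(w)
```

Here the direction recorded at iteration 100 and at iteration 1999 are nearly identical even though the norm keeps increasing, which is the "low complexity iterates" behavior the abstract alludes to.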
Joint work with Peter Bartlett, Dylan Foster, and Ziwei Ji.