The weekly SILO Seminar Series is made possible through the generous support of the 3M Company and its Advanced Technology Group.


Overcoming the Challenges of Learning in Parallel

Dimitris Papailiopoulos

Date and Time: Oct 25, 2017 (12:30 PM)
Location: Orchard room (3280) at the Wisconsin Institute for Discovery Building


Distributed implementations of popular machine learning algorithms scale poorly when deployed on more than a few tens of compute nodes. The key sources of this poor performance are communication bottlenecks and straggler nodes in the system. In this talk, I will explain why these bottlenecks are a real challenge for scaling up, and will provide insights on how to overcome them using simple algebraic ideas. I will show experiments in which simple theoretical insights lead to distributed training algorithms with significant speedups, and will conclude with several open problems that lie at the intersection of machine learning and distributed systems.