SILO



The weekly SILO Seminar Series is made possible through the generous support of the 3M Company and its Advanced Technology Group, with additional support from the Analytics Group of the Northwestern Mutual Life Insurance Company.

Linear Dueling Bandits

Sumeet Katariya, Graduate Student

Date and Time: Jun 18, 2015 (4:00 PM)
Location: Orchard room (3280) at the Wisconsin Institute for Discovery Building

Abstract:

Dueling bandits is a variant of the multi-armed bandit problem in which, instead of playing an arm and observing its reward at each time step, you duel two arms (a pairwise comparison) and observe only which of the two wins. Linear bandits is a special case of contextual bandits in which the reward of an arm equals its inner product with an unknown vector. Here we look at a setting that combines the two: you have structure in the form of linearity and inner products, but you observe only limited information (1 bit) at every time step. We are interested in the complexity of finding the best arm in this setting.
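
To make the setting concrete, here is a minimal simulation sketch of one duel. It is an illustration only, not the model used in the talk: the abstract does not say how the winner of a duel is drawn, so the logistic link below is an assumption, and the names duel, best_arm, theta, and arms are hypothetical.

```python
import math
import random


def duel(theta, x_i, x_j, rng):
    """Return True if arm i beats arm j; this single bit is all the learner sees."""
    # Linear structure: the latent reward of an arm is its inner product with theta,
    # so the gap between two arms is <theta, x_i - x_j>.
    gap = sum(t * (a - b) for t, a, b in zip(theta, x_i, x_j))
    # ASSUMPTION (not from the abstract): logistic comparison model,
    # P(i beats j) = 1 / (1 + exp(-gap)).
    p_i_wins = 1.0 / (1.0 + math.exp(-gap))
    return rng.random() < p_i_wins


def best_arm(theta, arms):
    """Index of the arm with the largest inner product with theta."""
    return max(
        range(len(arms)),
        key=lambda k: sum(t * a for t, a in zip(theta, arms[k])),
    )


rng = random.Random(0)                        # fixed seed for reproducibility
theta = [1.0, -0.5]                           # unknown to the learner in practice
arms = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # hypothetical arm vectors

# Duel arm 0 against arm 1 many times and record how often arm 0 wins.
n_duels = 10000
wins = sum(duel(theta, arms[0], arms[1], rng) for _ in range(n_duels))
win_rate = wins / n_duels
```

Under these assumed values the arms have latent rewards 1.0, -0.5, and 0.25, so arm 0 is the best arm, and its empirical win rate against arm 1 should sit near 1 / (1 + exp(-1.5)), about 0.82. A learner in this setting would see only the sequence of win/loss bits, never theta or the rewards themselves.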

This is work in progress. We ran some simulations and observed very interesting results. In this talk, I'll formulate the problem, show the simulation results, and discuss the ideas we have for proving them.