IEOR SEMINAR: Patrick Jaillet (MIT)

October 4, 2016 | 1:00pm - 2:10pm

Abstract: The Multi-Armed Bandit (MAB) problem is a benchmark model for repeated decision making in stochastic environments with very limited, but immediate, feedback on the outcomes of alternatives. The Bandits with Knapsacks (BwK) model, a recent extension of the framework that captures resource consumption, offers promising applications in many areas, particularly in dynamic pricing and real-time bidding, where advertisers arguably face a problem of this nature when bidding on advertising inventory. In this talk, after motivating this model in the context of display and search auctions, we will present recent results on the characterization of optimal regret bounds for this extension. More specifically, while asymptotically optimal regret bounds with respect to the time horizon T are now well documented for the original MAB problem, with ln(T) distribution-dependent and sqrt(T) distribution-free bounds, the BwK model lacks this clear-cut distinction. We partially bridge the gap by designing algorithms that achieve logarithmic growth rates under a non-degeneracy assumption. Joint work with Arthur Flajolet (MIT ORC).
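The abstract itself contains no algorithms, but the flavor of the BwK setting it describes can be sketched informally: a UCB-style index policy that plays arms optimistically while a shared resource budget is consumed. Everything below (the arm parameters, the reward-per-cost index, the stopping rule) is an illustrative assumption for exposition, not the algorithm presented in the talk.

```python
import math
import random

def budgeted_ucb(arms, budget, horizon, seed=0):
    """Illustrative UCB-style bandit with a single knapsack (budget) constraint.

    `arms` is a list of (reward_probability, cost) pairs -- hypothetical inputs.
    Each round, an arm is chosen by an optimistic reward-per-unit-cost index;
    play stops when the horizon or the budget is exhausted.
    """
    rng = random.Random(seed)
    n = len(arms)
    counts = [0] * n        # plays per arm
    means = [0.0] * n       # empirical mean reward per arm
    t, total_reward = 0, 0.0
    while t < horizon and budget > 0:
        t += 1
        if t <= n:
            i = t - 1  # initialization: play each arm once
        else:
            # optimism in the face of uncertainty, normalized by cost
            i = max(range(n), key=lambda a:
                    (means[a] + math.sqrt(2 * math.log(t) / counts[a]))
                    / arms[a][1])
        p, cost = arms[i]
        if cost > budget:
            break  # cannot afford this arm; a simplistic stopping rule
        reward = 1.0 if rng.random() < p else 0.0
        budget -= cost
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
        total_reward += reward
    return total_reward
```

The reward-per-cost index is only one of several plausible ways to fold resource consumption into a bandit policy; the talk's results concern regret guarantees for this class of problems, not this particular heuristic.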

Bio: Patrick Jaillet is the Dugald C. Jackson Professor in the Department of Electrical Engineering and Computer Science and a member of the Laboratory for Information and Decision Systems at MIT. He is also one of the two Directors of the MIT Operations Research Center. Before MIT, he held faculty positions at the University of Texas at Austin and at the École Nationale des Ponts et Chaussées, Paris. He received a Diplôme d'Ingénieur in France and a PhD in Operations Research from MIT. His current research interests include online and data-driven optimization under uncertainty. He is a Fellow of INFORMS and a member of SIAM.

500 W. 120th St., Mudd 315, New York, NY 10027 | 212-854-2942