IEOR-DRO Seminar

The IEOR-DRO Seminar is a joint offering from IEOR and the Decision, Risk, and Operations Division of Columbia Business School, featuring prominent research on decision-making through optimization, modeling and managing uncertainty, and all aspects of the operations function in firms.

IEOR-DRO Seminar Mailing list

Join the IEOR-DRO Seminar mailing list!

Fall 2018 Seminars

Markus Pelger (Stanford) | 9/4/18 | 1:10pm to 2:00pm

Speaker: Markus Pelger (Stanford)
Date: Tuesday, September 4, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: Estimating Latent Asset-Pricing Factors (joint with Martin Lettau from UC Berkeley)
Abstract: We develop an estimator for latent factors in a large-dimensional panel of financial data that can explain expected excess returns. Statistical factor analysis based on Principal Component Analysis (PCA) has problems identifying factors with a small variance that are important for asset pricing. We generalize PCA with a penalty term accounting for the pricing error in expected returns. Our estimator searches for factors that can explain both the expected return and covariance structure. We derive the statistical properties of the new estimator and show that our estimator can find asset-pricing factors with high Sharpe-ratios, which cannot be detected with PCA, even if a large amount of data is available. Applying the approach to portfolio data we find factors with Sharpe-ratios more than twice as large as those based on conventional PCA and with significantly smaller pricing errors.
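The core construction described in the abstract — PCA applied to the second-moment matrix augmented by a penalty on mean returns, so that factors must explain expected returns as well as covariances — can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' code; the function name, `gamma`, and the normalization are assumed choices.

```python
import numpy as np

def rp_pca(returns, n_factors, gamma=10.0):
    """PCA with a penalty on pricing errors in expected returns (sketch).

    Instead of eigendecomposing the covariance/second-moment matrix alone,
    add gamma times the outer product of mean returns, so the leading
    eigenvectors must also explain the cross-section of expected returns.
    """
    T, N = returns.shape
    mean = returns.mean(axis=0)               # expected excess returns
    second_moment = returns.T @ returns / T   # (1/T) X'X
    target = second_moment + gamma * np.outer(mean, mean)
    eigvals, eigvecs = np.linalg.eigh(target)        # ascending order
    loadings = eigvecs[:, ::-1][:, :n_factors]       # top eigenvectors
    factors = returns @ loadings                     # factor time series
    return factors, loadings
```

With `gamma = 0` this reduces to standard PCA on the uncentered second-moment matrix; larger `gamma` pushes the estimator toward factors that price the cross-section, which is how the abstract's low-variance, high-Sharpe-ratio factors can be recovered.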

Bio: Markus Pelger is an Assistant Professor in the Management Science & Engineering Department at Stanford University, where he is a Reid and Polly Anderson Faculty Fellow.

His research interests are in machine-learning, statistics, financial econometrics, asset pricing and risk management. His work includes contributions in statistical factor analysis, high-frequency statistics, credit risk modeling and management compensation. He is particularly interested in how systematic risk factors and tail risk influence the prices of assets and the incentives of agents. For this purpose he has developed various statistical tools to estimate unknown risk factors from large dimensional data sets and from high-frequency data. He uses them empirically to predict asset prices and construct trading strategies.
Markus received his Ph.D. in Economics from the University of California, Berkeley. He is a scholar of the German National Merit Foundation, and he was awarded a Fulbright Scholarship, an Institute for New Economic Thinking grant, and the Eliot J. Swan Prize. He holds two Diplomas, in Mathematics and in Economics, both with highest distinction, from the University of Bonn in Germany.


Marty Lariviere (Northwestern) | 9/11/18 | 1:10pm to 2:00pm

Speaker: Marty Lariviere (Northwestern)
Date: Tuesday, September 11, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: Priority Queues and Consumer Surplus

Abstract: We examine whether priority queues benefit or hurt customers in a setting in which customers are privately informed of their per-unit-time waiting cost. Implementing a priority queue thus means posting a menu of expected waits and out-of-pocket prices that are incentive compatible. Whether priorities increase or decrease consumer surplus relative to first-in, first-out service depends on the model of customer utility and on the distribution of customer waiting costs. If all customers have the same value of the service independent of their waiting costs, priorities essentially always lower consumer surplus. If a customer’s value of the service is an increasing function of her waiting cost, priorities lower surplus if the distribution of waiting costs has a decreasing mean residual life. If the mean residual life is increasing, then priorities make consumers better off. We show that the results across utility models are linked by an elasticity measure. If an appropriate measure of waiting cost is elastic, consumer surplus falls with priorities. We also explore how priorities impact individual customers and show that they can potentially make all customers worse off; low-priority customers may even pay a higher out-of-pocket price than they would under first-in, first-out service.
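The mean-residual-life condition that separates the two surplus regimes in the abstract is easy to check empirically for a given waiting-cost distribution. A minimal sketch, with an assumed (illustrative) empirical estimator:

```python
import numpy as np

def mean_residual_life(samples, t):
    """Empirical mean residual life E[C - t | C > t] of waiting costs C.

    Decreasing in t -> priorities lower consumer surplus (per the abstract,
    when value increases with waiting cost); increasing -> they raise it.
    """
    tail = samples[samples > t]
    return tail.mean() - t if tail.size else 0.0
```

Exponential waiting costs, whose mean residual life is constant by memorylessness, sit exactly at the boundary between the two regimes.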

Giacomo Nannicini (IBM) | 9/18/18 | 1:10pm to 2:00pm

Speaker: Giacomo Nannicini (IBM)
Date: Tuesday, September 18, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: Fully polynomial-time approximation schemes for stochastic dynamic programs: theory and applications.

Abstract: This talk presents some recent results in our quest to build approximation algorithms for stochastic dynamic programs that are both theoretically and computationally efficient. The main constructive result is an FPTAS for dynamic programs with scalar state, polyhedral action spaces, convex costs and linear transition functions. The random variables are assumed to be described compactly by an oracle to their CDFs. This type of problem finds applications in the area of optimal planning under uncertainty, and can be thought of as the problem of optimally managing a single non-discrete resource over a finite time horizon.

We show that under a value oracle model for the cost functions this result for one-dimensional state space is "best possible", because a similar dynamic programming model with two-dimensional state space does not admit a PTAS. Along the way, we will discuss hardness results regarding the inapproximability of certain two-dimensional convex functions, and, if time permits, applications to a newsvendor model. The talk is based on joint work with Nir Halman.

Bio: Giacomo Nannicini is a Research Staff Member in the Theory of Quantum Computing and Information group at the IBM T. J. Watson Research Center. Before joining IBM, he was an assistant professor in the Engineering Systems and Design pillar at the Singapore University of Technology and Design. His main research interest is optimization, broadly defined, and its applications. Giacomo has received several awards, including the 2016 COIN-OR Cup, the 2015 Robert Faure prize, and the 2012 Glover-Klingman prize. His algorithms and software are used by one of the largest real-time traffic and mobility information groups in Europe, and by IBM's Cloud AI service.

John Birge (University of Chicago) | 9/25/18 | 1:10pm to 2:00pm

Speaker: John Birge (University of Chicago)
Date: Tuesday, September 25, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Santiago Gallino (Wharton) | 10/2/18 | 1:10pm to 2:00pm

Speaker: Santiago Gallino (Wharton)
Date: Tuesday, October 2, 2018
Time: 1:10pm to 2:00pm
Location: 326 Uris

Title: Setting Retail Staffing Levels: A Methodology Validated with Implementation.

Abstract: We describe a three-step process that a retailer can use to set retail store sales staff levels. First, use historical data on revenue and planned and actual staffing levels by store to estimate how revenue varies with the staffing level at each store. We disentangle the endogeneity between revenue and staffing levels by focusing on randomly occurring deviations between planned and actual labor. Second, using historical analysis as a guide, we validate these results by changing the staffing levels in a few test stores. Finally, we implement the results chain-wide and measure the impact. We describe the successful deployment of this process with a large specialty retailer. We find that 1) the implementation validates predictions of the historical analysis, including the use of the variation between planned staffing and actual staffing as an exogenous shock, 2) implementation in 168 stores over a 6-month period produces a 4.5% revenue increase and a nearly $7.4 million annual profit increase, after accounting for the cost of the additional labor, and 3) the impact of staffing level on revenue varies greatly by store, and therefore staffing levels should also vary, with more sales staff relative to revenue assigned to those stores where sales staff have the greatest impact on revenue. Specifically, we found the largest impact of store labor in stores with the largest average basket sizes, located in regions with good growth potential, facing certain competitors (e.g., Wal-Mart), and run by long-serving managers.

This is joint work with Marshall Fisher and Serguei Netessine.
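The identification idea in step 1 — using random deviations between planned and actual labor to break the endogeneity between staffing and revenue — can be sketched as a regression that controls for the plan. A toy illustration, not the paper's actual specification:

```python
import numpy as np

def labor_effect(revenue, planned, actual):
    """Estimate the marginal revenue effect of staffing (sketch).

    The plan is endogenous (it responds to expected demand), but the
    deviation actual - planned is assumed to be random execution noise;
    regressing revenue on [1, planned, deviation] isolates the effect.
    """
    deviation = actual - planned
    X = np.column_stack([np.ones_like(planned), planned, deviation])
    coef, *_ = np.linalg.lstsq(X, revenue, rcond=None)
    return coef[2]   # revenue per unit of unplanned extra labor
```

The exogeneity of the deviation is exactly what the chain-wide implementation described in the abstract validates.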


Balasubramanian Sivan (Google) | 10/9/18 | 1:10pm to 2:00pm

Speaker: Balasubramanian Sivan (Google)
Date: Tuesday, October 9, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: How to Sell an App: Simple Pricing Schemes for Consumers With Evolving Values

Abstract: We consider a pricing problem where a buyer is interested in purchasing/using a good, such as an app or music or software, repeatedly over time. The consumer discovers his value for the good only as he uses it, and the value evolves with each use as a martingale. We provide a simple pricing scheme and show that its revenue is a constant fraction of the buyer’s expected cumulative value.
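The setting can be simulated directly: a buyer's value evolves as a martingale, and a posted per-use price collects revenue while the buyer keeps using the good. The step distribution and stopping rule below are illustrative assumptions, not the paper's mechanism or pricing scheme:

```python
import numpy as np

def simulate_app_sale(v0=1.0, price=0.5, horizon=50, n_buyers=2000, seed=0):
    """Sketch: value follows a mean-one multiplicative martingale;
    the buyer purchases each period while current value >= price.
    Returns (revenue, gross welfare) per buyer."""
    rng = np.random.default_rng(seed)
    revenue = welfare = 0.0
    for _ in range(n_buyers):
        v = v0
        for _ in range(horizon):
            if v < price:                     # buyer stops using the app
                break
            revenue += price
            welfare += v
            v *= rng.choice([0.5, 1.5])       # martingale: E[step] = 1
    return revenue / n_buyers, welfare / n_buyers
```

Sweeping `price` in such a simulation gives a feel for the constant-fraction revenue guarantee claimed in the abstract.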

Bio: Balu Sivan is a research scientist at Google Research in New York. His research interests are in algorithmic game theory, auction/mechanism design, online algorithms and online learning. Prior to this he was a postdoctoral researcher at Microsoft Research in Redmond. He obtained his PhD in Computer Science from the University of Wisconsin-Madison in 2013 and his undergraduate degree in Computer Science from the Indian Institute of Technology Madras in 2008. He is a recipient of the 2013 ACM SIGecom Doctoral Dissertation Award. 

Amin Saberi (Stanford) | 10/16/18 | 1:10pm to 2:00pm

Speaker: Amin Saberi (Stanford)
Date: Tuesday, October 16, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: Just a few seeds more: Value of network data for diffusion

Abstract: Identifying the optimal set of individuals to first receive information (“seeds”) in a social network is a widely-studied question with applications in marketing new technologies, ideas, or economic development programs like micro-finance. Numerous studies have proposed various network-centrality based heuristics to choose seeds in a way that is likely to boost diffusion. Here we show that, for some frequently studied diffusion processes and a small x, randomly seeding s plus x individuals can prompt a larger cascade than optimally targeting the best s individuals. We prove our results for large classes of random networks and also show that they hold in simulations over several real-world networks. This suggests that returns to collecting and analyzing network data to identify the optimal seeds may not be economically significant. Given these findings, practitioners interested in communicating a message to a large number of people may wish to compare the cost of network-based targeting to that of slightly expanding initial outreach.    
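The comparison in the abstract can be simulated directly. Below is a minimal independent-cascade model, one common choice of diffusion process (assumed here for illustration; the talk covers a broader class):

```python
import random

def independent_cascade(neighbors, seeds, p, rng):
    """Simulate one independent-cascade diffusion.

    Each newly activated node gets a single chance to activate each
    neighbor with probability p. `neighbors` maps node -> neighbor list.
    Returns the final cascade size."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in neighbors[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)
```

Averaging cascade sizes over many runs for the top-s nodes by some centrality versus s + x uniformly random seeds reproduces the comparison the abstract describes.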

Bio: Amin Saberi is an Associate Professor and 3COM faculty scholar at Stanford University. He received his B.Sc. from Sharif University of Technology and his Ph.D. in Computer Science from the Georgia Institute of Technology. His research interests include algorithms, approximation algorithms, and algorithmic aspects of games, markets, and networks. His research is supported by the National Science Foundation (under grants CCF 0546889, 0729586, and 0915145), the Library of Congress, Stanford Clean Slate Design for the Internet, and Google. His most recent awards include an Alfred Sloan Fellowship and best paper awards at FOCS 2011 and SODA 2010.

Sharad Goel (Stanford) | 10/23/18 | 1:10pm to 2:00pm

Speaker: Sharad Goel (Stanford)
Date: Tuesday, October 23, 2018
Time: 1:10pm to 2:00pm
Location: 333 Uris

Title: The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning.


Abstract: The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes—like race, gender, and their proxies—are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups these criteria were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area.

This is joint work with Sam Corbett-Davies (Stanford).
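Definition (2) from the abstract, classification parity, is straightforward to operationalize; a minimal sketch for false positive rates (the function name and interface are illustrative):

```python
import numpy as np

def group_false_positive_rates(y_true, y_pred, groups):
    """False positive rate per group -- the quantity that 'classification
    parity' (as defined in the abstract) requires to be equal across
    groups defined by a protected attribute."""
    rates = {}
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 0)   # true negatives in group g
        rates[g] = y_pred[mask].mean() if mask.any() else float("nan")
    return rates
```

The abstract's point is that equalizing such rates across groups can itself be harmful, so a gap between groups is a diagnostic, not automatically a defect to be corrected.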


Retsef Levi (MIT) | 10/30/18 | Cancelled

Speaker: Retsef Levi (MIT)
Date: Tuesday, October 30, 2018

This event has been cancelled.

Vasilis Syrgkanis (Microsoft) | 11/13/18 | 1:10pm to 2:00pm

Speaker: Vasilis Syrgkanis (Microsoft)
Date: Tuesday, November 13, 2018
Time: 1:10pm to 2:00pm
Location: 326 Uris

Title: Econometrics and Machine Learning via the Lens of Neyman Orthogonality
Abstract: Many statistical estimation problems that arise in economic applications and more generally in causal inference, require the estimation of nuisance quantities that are not the focus of the analyst but are simply aides to identify causal models from observational data. Examples include modeling the treatment policy when estimating treatment effects, modeling the pricing policy when estimating demand elasticity or modeling an opponent’s behavior when estimating the structural utility parameters in a game of incomplete information. I will give an overview of recent results that invoke and extend the principle of Neyman-orthogonality to develop estimation methods whose target parameter error is robust to errors in the estimation of the nuisance components of the model. This enables fast mean squared error rates and asymptotically valid confidence intervals, even when the nuisance components are fitted via arbitrary ML-based approaches. I will discuss applications in the estimation of heterogeneous treatment effects, estimation with missing data, estimation in games of incomplete information and offline policy learning.
Based on joint works with: Ilias Zadik, Lester Mackey, Denis Nekipelov, Vira Semenova, Victor Chernozhukov, Miruna Oprescu, Steven Wu
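The orthogonality principle in the talk can be illustrated with the classic "partialling out" estimator: residualize both the outcome and the treatment on nuisance predictions, then regress residual on residual. A minimal sketch; the nuisance fits are taken as given here, whereas in practice they would come from cross-fitted ML models:

```python
import numpy as np

def orthogonal_effect(y, t, x_hat_y, x_hat_t):
    """Neyman-orthogonal estimate of the effect of treatment t on outcome y.

    x_hat_y approximates E[y|x] and x_hat_t approximates E[t|x] (the
    nuisance components, fit by any ML method). The residual-on-residual
    regression is first-order insensitive to errors in those fits."""
    ry, rt = y - x_hat_y, t - x_hat_t
    return (rt @ ry) / (rt @ rt)
```

Because the moment condition is orthogonal to the nuisances, slow ML convergence rates for `x_hat_y` and `x_hat_t` still yield the fast rates and valid confidence intervals described in the abstract.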

Nathan Kallus (Cornell Tech) | 11/20/18 | 1:10pm to 2:00pm

Speaker: Nathan Kallus (Cornell Tech)
Date: Tuesday, November 20, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: Learning to Personalize from Observational Data Under Unobserved Confounding
Abstract: Recent work on counterfactual learning from observational data aims to leverage large-scale data -- much larger than any experiment can ever be -- to learn individual-level causal effects for personalized interventions. The hope is to transform electronic medical records to personalized treatment regimes, transactional records to personalized pricing strategies, and click- and "like"-streams to personalized advertising campaigns. Motivated by the richness of the data, existing approaches (including my own) make the simplifying assumption that there are no unobserved confounders: unobserved variables that affect both treatment and outcome and would induce non-causal correlations that cannot be accounted for. However, all observational data, which lacks experimental manipulation, no matter how rich, will inevitably be subject to some level of unobserved confounding and assuming otherwise can lead to personalized treatment policies that seek to exploit individual-level effects that are not really there, may intervene where not necessary, and may in fact lead to net harm rather than net good relative to current, non-personalized practices.
The question is then how to use such powerfully rich data to safely improve upon current practices. In this talk, I will present a novel approach to the problem that calibrates policy learning to realistic violations of the unverifiable assumption of unconfoundedness. Our framework for confounding-robust policy improvement optimizes the minimax regret of a candidate policy against a baseline standard-of-care policy over an uncertainty set for propensity weights motivated by sensitivity analysis in causal inference. By establishing a finite-sample generalization bound, we prove that our robust policy, when applied in practice, is (almost) guaranteed to do no worse than the baseline and improve upon it if it is possible. We characterize the adversarial optimization subproblem and use efficient algorithmic solutions to optimize over policy spaces such as hyperplanes, score cards, and decision trees. We assess our methods on a large clinical trial of acute ischaemic stroke treatment, demonstrating that hidden confounding can hinder existing approaches and lead to overeager intervention and unwarranted harm, while our robust approach guarantees safety and focuses on well-evidenced improvement, a necessity for making personalized treatment policies learned from observational data usable in practice.
Bio: Nathan Kallus is a Professor of Operations Research and Information Engineering at Cornell University and Cornell Tech, a position he began in Fall 2016. His research revolves around data-driven decision making in operations, the interplay of optimization and statistics in decision making and inference, and the analytical capacities and challenges of unstructured, large-scale, and web-based data. His work spans basic theory, effective methodology, and novel applications, and has been recognized by several awards. Nathan hails from the town of Haifa, Israel. He holds a PhD in Operations Research from MIT as well as a BA in Pure Mathematics and a BS in Computer Science, both from UC Berkeley. Previously, he was a Visiting Scholar at USC's Department of Data Sciences and Operations and a Postdoctoral Associate in MIT's Operations Research and Statistics group. Nathan is currently recruiting highly motivated and talented PhD students for his research group at Cornell Tech in NYC.

David Goldberg (Cornell) | 11/27/18 | 1:10pm to 2:00pm

Speaker: David Goldberg (Cornell)
Date: Tuesday, November 27, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Bobby Kleinberg (Cornell) | 12/4/18 | 1:10pm to 2:00pm

Speaker: Bobby Kleinberg (Cornell)
Date: Tuesday, December 4, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Vishal Gupta (USC Marshall) | 12/18/18 | 1:10pm to 2:00pm

Speaker: Vishal Gupta (USC Marshall)
Date: Tuesday, December 18, 2018
Time: 1:10pm to 2:00pm
Location: 303 Mudd

Title: Data-Pooling in Optimization
Abstract: Managing large-scale systems often involves simultaneously solving thousands of potentially unrelated stochastic optimization problems, each with limited data. Optimization intuition suggests decoupling these unrelated problems and solving them separately. Statistical intuition, however, suggests that combining the problems via some form of shrinkage may outperform decoupling, but does not offer a concrete non-parametric approach for doing so when solving complex, constrained optimization problems such as vehicle-routing, economic lot-sizing or facility location. We propose a novel data-pooling algorithm that combines both perspectives. Our approach does not require strong distributional assumptions and applies to constrained, possibly non-convex optimization problems. We show that, unlike the classical statistical setting, the potential benefits of data-pooling, in general, depend strongly on the problem structure, and, in some cases, data-pooling offers no benefit. We prove that as the number of problems grows large, our method learns if pooling is necessary and the optimal amount to pool, even if the expected amount of data per problem is fixed and bounded. We further show that pooling offers significant benefits over decoupling when there are many problems, each of which has a small amount of relevant data. We demonstrate the practical benefits of data-pooling using real data from a chain of retail drug stores in the context of inventory management. 
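The pooling-versus-decoupling tradeoff can be illustrated with the simplest form of shrinkage. The talk's algorithm learns whether and how much to pool from the data; this sketch instead takes the pooling amount as an explicit parameter:

```python
import numpy as np

def pooled_estimates(sample_means, alpha):
    """Shrink each problem's estimate toward the pooled grand mean.

    alpha = 0 is full decoupling (solve each problem on its own data);
    alpha = 1 is full pooling. Intermediate alpha trades per-problem
    noise against cross-problem bias."""
    grand = sample_means.mean()
    return (1 - alpha) * sample_means + alpha * grand
```

When the problems' true parameters are similar and per-problem data is scarce, shrinkage reduces estimation error, which is the regime where the abstract reports the largest gains from data-pooling.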
Bio: Vishal Gupta is an Assistant Professor of Data Sciences and Operations at the USC Marshall School of Business. He earned his B.A. in Mathematics and Philosophy from Yale University, graduating magna cum laude with honors, and completed Part III of the Mathematics Tripos at the University of Cambridge with distinction. After Cambridge, he spent four years working as a “quant” at Barclays Capital, focusing on commodities modeling, derivatives pricing, and risk management. Realizing how much he missed research directed toward a larger mission of impact, he left industry to complete his Ph.D. at MIT in 2014.
Vishal's research focuses on decision-making in data-scarce environments. His work has been recognized by several awards, including finalist recognition for the Pierskalla Best Paper Award, the CHOM Best Paper Award, the Service Science Best Paper Award, and the Nicholson Best Student Paper Award.