# Applied Probability and Risk Seminar

The Applied Probability and Risk Seminar (APR) is the joint seminar between IEOR, the Statistics Department, and the Center for Applied Probability (CAP).

## Fall 2018 Seminars

## Tze Lai (Stanford) | 9/13/18 | 4:10pm to 5:00pm

**Speaker:** Tze Lai (Stanford)

**Date:** Thursday, September 13, 2018

**Time:** 4:10pm to 5:00pm

**Location:** 903 SSW

**Title:** MCMC with Sequential State Substitutions: Theory and Applications

**Abstract:**

## Zachary Feinstein (Washington U in St. Louis) | 9/20/2018 | 1:10pm to 2:00pm

**Speaker:** Zachary Feinstein (Washington U in St. Louis)

**Date:** Thursday, September 20, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Pricing debt in an Eisenberg-Noe network under comonotonic endowments

**Abstract:**

## Yuval Peres (Microsoft Research) | 9/27/18 | 1:10pm to 2:00pm

**Speaker:** Yuval Peres (Microsoft Research)

**Date:** Thursday, September 27, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Trace reconstruction for the deletion channel

**Abstract:**

## Miklos Racz (Princeton) | 10/4/2018 | 1:10pm to 2:00pm

**Speaker:** Miklos Racz (Princeton)

**Date:** Thursday, October 4, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** High-dimensional random geometric graphs

**Abstract:** I will talk about two natural random geometric graph models, where connections between vertices depend on distances between latent d-dimensional labels. We are particularly interested in the high-dimensional case when d is large. We study a basic hypothesis testing problem: can we distinguish a random geometric graph from an Erdos-Renyi random graph (which has no geometry)? We show that there exists a computationally efficient procedure which is almost optimal (in an information-theoretic sense). The proofs will highlight new graph statistics as well as connections to random matrices. This is based on joint work with Sebastien Bubeck, Jian Ding, Ronen Eldan, and Jacob Richey.

## Nicolas Garcia Trillos (U Wisconsin) | 10/18/2018 | 1:10pm to 2:00pm

**Speaker:** Nicolas Garcia Trillos (U Wisconsin)

**Date:** Thursday, October 18, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Large sample asymptotics of graph-based methods in machine learning: mathematical analysis and implications.

**Abstract:** Many machine learning procedures that aim to extract information from data can be defined as precise mathematical objects constructed in terms of the data. It is often assumed that the data is “big” not only in complexity but also in quantity, and in this “large amount of data” setting, a basic mathematical concept that one can explore is the closure of a given class of statistical procedures (i.e., the limiting procedures as the number of available data points goes to infinity). In this talk, I will explore this notion in the context of graph-based methods. Examples of such methods include minimization of Cheeger cuts, spectral clustering, and graph-based Bayesian semi-supervised learning, among others. I will introduce some of the mathematical ideas needed for the analysis and show some of its implications: our results show statistical consistency of the methods, provide quantitative information in the form of scaling of parameters and rates of convergence, imply qualitative properties at the discrete level, and suggest the use of appropriate algorithms.

## Philip Ernst (Rice) | 10/25/2018 | 1:10pm to 2:00pm

**Speaker:** Philip Ernst (Rice University)

**Date:** Thursday, October 25, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Yule’s “Nonsense Correlation” Solved!

**Abstract:**

In this talk, I will discuss how I recently resolved a longstanding open statistical problem. The problem, formulated by the British statistician Udny Yule in 1926, is to mathematically prove Yule’s 1926 empirical finding of “nonsense correlation.” We solve the problem by analytically determining the second moment of the empirical correlation coefficient of two independent Wiener processes. Using tools from Fredholm integral equation theory, we calculate the second moment of the empirical correlation to obtain a value for the standard deviation of the empirical correlation of nearly 0.5. The “nonsense” correlation, which we call “volatile” correlation, is volatile in the sense that its distribution is heavily dispersed and is frequently large in absolute value. It is induced because each Wiener process is “self-correlated” in time: a Wiener process is an integral of pure noise, and thus its values at different time points are correlated. In addition to providing an explicit formula for the second moment of the empirical correlation, we offer implicit formulas for higher moments of the empirical correlation. The paper appeared in The Annals of Statistics and can be found at https://projecteuclid.org/euclid.aos/1498636874
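The dispersion described in this abstract can be checked numerically. The following is a minimal Monte Carlo sketch, not the analytical method of the paper: it approximates each Wiener process by a Gaussian random walk and estimates the standard deviation of the empirical correlation across many independent pairs. The function names, the number of pairs, the number of steps, and the seed are all illustrative assumptions.

```python
import random
import statistics


def random_walk(n_steps, rng):
    """Discretized Wiener path: cumulative sum of independent N(0, 1) steps."""
    path, level = [], 0.0
    for _ in range(n_steps):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def empirical_correlation(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_correlations(n_pairs=1000, n_steps=300, seed=0):
    """Empirical correlations of n_pairs independent pairs of random walks."""
    rng = random.Random(seed)
    return [
        empirical_correlation(random_walk(n_steps, rng), random_walk(n_steps, rng))
        for _ in range(n_pairs)
    ]


correlations = simulate_correlations()
spread = statistics.pstdev(correlations)
print(f"standard deviation of the empirical correlation: {spread:.3f}")
```

Although the two walks in each pair are independent, the estimated spread should land in the neighborhood of the "nearly 0.5" value quoted in the abstract, rather than shrinking toward zero as it would for independent white-noise samples of the same length.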

## Mark Brown (Columbia) | 10/29/2018 | 4:10pm to 5:00pm

**Joint with the Statistics Seminar**

**Speaker:** Mark Brown (Columbia)

**Date:** Monday, October 29, 2018

**Time:** 4:10pm to 5:00pm

**Location:** 903 SSW

**Title:** Taylor’s Law via Ratios, for Some Distributions with Infinite Mean

**Abstract:** Taylor’s law (TL) originated as an empirical pattern in ecology. In many sets of samples of population density, the variance of each sample was approximately proportional to a power of the mean of that sample. In a family of nonnegative random variables, TL asserts that the population variance is proportional to a power of the population mean. TL, sometimes called fluctuation scaling, holds widely in physics, ecology, finance, demography, epidemiology, and other sciences, and characterizes many classical probability distributions and stochastic processes such as branching processes and birth-and-death processes. We demonstrate analytically for the first time that a version of TL holds for a class of distributions with infinite mean. These distributions and the associated TL differ qualitatively from those of light-tailed distributions. Our results employ and contribute to methodology of Albrecher and Teugels (2006) and Albrecher, Ladoucette and Teugels (2010). This work opens a new domain of investigation for generalizations of TL. This work is joint with Professors Joel Cohen and Victor de la Pena.
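In symbols, the population form of TL stated in this abstract can be written as below. The family index $\theta$ and the constants $a > 0$ and $b$ are notation introduced here for illustration; they are not taken from the talk.

$$
\operatorname{Var}(X_\theta) = a \,\bigl(\mathbb{E}[X_\theta]\bigr)^{b} \quad \text{for every } \theta \text{ in the family.}
$$

Two standard finite-mean instances: for the Poisson family the variance equals the mean, so $a = 1$ and $b = 1$; for the exponential family the variance equals the square of the mean, so $a = 1$ and $b = 2$. When the mean is infinite this identity cannot be read literally, which is the setting the talk addresses.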

## Dan Pirjol (JP Morgan) | 11/1/2018 | 1:10pm to 2:00pm

**Speaker:** Dan Pirjol (JP Morgan)

**Date:** Thursday, November 1, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

## Subhabrata Sen (MIT) | 11/8/2018 | 1:10pm to 2:00pm

**Speaker:** Subhabrata Sen (MIT)

**Date:** Thursday, November 8, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Sampling convergence for random graphs: graphexes and multigraphexes

**Abstract:** We will look at structural properties of large, sparse random graphs through the lens of sampling convergence (Borgs, Chayes, Cohn and Veitch ’17). Sampling convergence generalizes left convergence to sparse graphs, and describes the limit in terms of a graphex. We will introduce this framework and motivate the components of a graphex. Subsequently, we will discuss the graphex limit for several sparse random (multi)graphs of practical interest.

**Bio:** Subhabrata Sen is a Schramm Postdoctoral Fellow at the Department of Mathematics, Massachusetts Institute of Technology, and Microsoft Research (New England). He completed his Ph.D. in 2017 from the Department of Statistics, Stanford University, where he was advised jointly by Prof. Amir Dembo and Prof. Andrea Montanari. He was awarded the “Probability Dissertation Award” for his thesis “Optimization, Random Graphs, and Spin Glasses”. Subhabrata’s research interests include random combinatorial optimization, random graphs, spin glasses, and hypothesis testing.

## Mathieu Rosenbaum (Ecole Polytechnique) | 11/15/2018 | 1:10pm to 2:00pm

**Speaker:** Mathieu Rosenbaum (Ecole Polytechnique)

**Date:** Thursday, November 15, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** No arbitrage implies power-law market impact and rough volatility

**Abstract:** Market impact is the link between the volume of a (large) order and the price move during and after the execution of this order. We show that under a no-arbitrage assumption, the market impact function can only be of power-law type. Furthermore, we prove that this implies that the macroscopic price is diffusive with rough volatility, with a one-to-one correspondence between the exponent of the impact function and the Hurst parameter of the volatility. Hence we simply explain the universal rough behavior of the volatility as a consequence of the no-arbitrage property. From a mathematical viewpoint, our study relies in particular on new results about hyper-rough stochastic Volterra equations. This is joint work with Paul Jusselin.

## Sebastian Engelke (U Geneva) | 11/29/2018 | 1:10pm to 2:00pm

**Speaker:** Sebastian Engelke (U Geneva)

**Date:** Thursday, November 29, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Graphical Models and Structural Learning for Extremes

**Abstract:**

**Bio:**

## Eunhye Song (Penn State) | 12/6/2018 | 1:10pm to 2:00pm

**Speaker:** Eunhye Song (Penn State)

**Date:** Thursday, December 6, 2018

**Time:** 1:10pm to 2:00pm

**Location:** MUDD 303

**Title:** Solving a large-scale discrete simulation optimization problem using Gaussian Markov random fields

**Abstract:** Gaussian Process (GP)-based inferential optimization, sometimes referred to as Bayesian optimization, has gained popularity for optimizing a function that is mildly expensive to evaluate and has no known structural properties. In this talk, we discuss solving a discrete simulation optimization problem with a combinatorially large solution space using the Gaussian Markov Improvement Algorithm (GMIA), a Gaussian Markov random field (GMRF)-based inferential optimization method. A GMRF is a GP defined on a graph of solutions whose precision matrix's sparsity pattern is decided by the edges of the graph. At each iteration, GMIA updates the conditional distribution of the GMRF based on the simulated solutions and selects the next solution to simulate based on the complete expected improvement (CEI) criterion, which can also be used as a stopping criterion. GMIA is globally convergent to the optimal solution as the simulation budget increases to infinity and shows excellent empirical finite-sample performance as it 1) simulates only a small fraction of the solution space, and 2) stops correctly when the desired optimality gap is achieved. When the solution space is large, updating the GMRF and evaluating the CEIs for all feasible solutions can be computationally challenging. We introduce two modifications of GMIA to alleviate such challenges. The first is the restricted search set scheme, which periodically forms a set of promising solutions and restricts allocating simulation effort only to those solutions for p iterations (or until a stopping criterion is met). By doing so, we avoid factorizing a large precision matrix at each iteration while computing the CEIs of the solutions included in the search set exactly. The second is the multi-resolution GMIA (MR-GMIA) based on multiple layers of GMRFs defined on the feasible solution space. The solution space is divided into subregions, where each subregion is modeled by a solution-level GMRF and the subregions become “solutions” to the region-level GMRF. As a result, both solution-level and region-level GMRFs enjoy smaller graphs compared to the original solution space.
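The remark in this abstract that the edges of the solution graph decide the sparsity pattern of the precision matrix can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the 3-by-3 grid of solutions, the helper names `grid_edges` and `precision_matrix`, and the parametrization with `theta0` on the diagonal and `-theta0 * theta1` on edges. It is not presented as the parametrization used by GMIA.

```python
from itertools import product


def grid_edges(rows, cols):
    """Edges between horizontally and vertically adjacent grid points."""
    index = {cell: k for k, cell in enumerate(product(range(rows), range(cols)))}
    edges = set()
    for (r, c), k in index.items():
        for dr, dc in ((0, 1), (1, 0)):
            neighbor = (r + dr, c + dc)
            if neighbor in index:
                edges.add((k, index[neighbor]))
    return len(index), edges


def precision_matrix(n_nodes, edges, theta0=1.0, theta1=0.2):
    """Illustrative precision matrix: nonzero only on the diagonal and on edges.

    theta0 and theta1 are hypothetical parameters; with at most four neighbors
    per node and theta1 < 0.25 the matrix is strictly diagonally dominant.
    """
    q = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        q[i][i] = theta0
    for i, j in edges:
        q[i][j] = q[j][i] = -theta0 * theta1
    return q


n_nodes, edges = grid_edges(3, 3)
Q = precision_matrix(n_nodes, edges)
nonzeros = sum(1 for row in Q for value in row if value != 0.0)
print(f"{nonzeros} nonzero entries out of {n_nodes * n_nodes}")
```

Solutions that are not neighbors in the graph get a zero entry in the precision matrix, so the number of nonzeros grows with the number of edges rather than with the square of the number of solutions. This kind of sparsity is what makes factorizing the matrix cheaper on the smaller graphs mentioned at the end of the abstract.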