APR Seminar: Mor Harchol-Balter (Carnegie Mellon University)
Schapiro Hall (CEPSR) 415
Title: Queueing with Redundant Requests: A more realistic model
Abstract: Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to replicate a request so that it joins the queue at multiple servers, where the request is considered complete as soon as any one copy of the request completes.
Redundancy is beneficial because it allows us to overcome server-side variability -- the fact that the server we choose might be temporarily slow, due to factors like background load, network interrupts, garbage collection, and so on. When server-side variability dominates runtime, replicating requests can greatly reduce their response times.
In the past few years, queueing theorists have begun to study redundancy, first via approximations and, more recently, via exact product-form analysis. Unfortunately, for analytical tractability, all of this theoretical analysis has assumed models where a job's replicas each have independent service requirements, unrelated to the job's inherent size. These unrealistic models have produced analysis that differs greatly from the results observed in computer systems implementations.
In this talk, we introduce a much more realistic model of redundancy. Our model allows us to decouple the inherent job size (X) from the server-side slowdown (S), where we track both S and X for each job. Analysis within the S&X model is, of course, much more difficult. Nevertheless, we derive a policy which is both analytically tractable within the S&X model and has provably excellent performance.
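To make the contrast between the two modeling assumptions concrete, here is a minimal simulation sketch. It is not the analysis or the policy from the talk: it ignores queueing entirely and only compares how long the fastest of d replicas runs. It assumes that a replica's runtime is the product S * X, that X is exponentially distributed, and that a server is either normal (S = 1) or slow (S = 10) with a fixed probability. The function names and parameter values are hypothetical, chosen only for illustration.

```python
import random


def independent_model(d, mean_size, rng):
    """Older assumption: every replica draws its own, unrelated service time."""
    return min(rng.expovariate(1.0 / mean_size) for _ in range(d))


def sx_model(d, mean_size, rng, slow_prob=0.2, slow_factor=10.0):
    """S&X-style assumption: one inherent size X per job, one slowdown S per server.

    Assumed for illustration: runtime on a server is S * X, and S is either
    1.0 (normal) or slow_factor (temporarily slow server).
    """
    x = rng.expovariate(1.0 / mean_size)  # inherent job size, shared by all replicas
    slowdowns = [slow_factor if rng.random() < slow_prob else 1.0 for _ in range(d)]
    return x * min(slowdowns)  # the fastest replica finishes first


def mean_runtime(model, d, trials=50_000, seed=0):
    """Average runtime of the first replica to finish, with no queueing."""
    rng = random.Random(seed)
    return sum(model(d, 1.0, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for d in (1, 2, 4):
        print(d, mean_runtime(independent_model, d), mean_runtime(sx_model, d))
```

Under the independent-replica assumption, the mean runtime keeps shrinking as d grows, because each extra replica is another chance at a small service time. Under the S&X-style assumption, replication can only remove the slowdown; the runtime never drops below the job's inherent size X.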
Joint work with Kristy Gardner and Alan Scheller-Wolf.