02:00 PM - 03:00 PM
Finding the Best Policy: The Curious Case of Approximate Dynamic Programming
Dynamic programming arises in a wide range of applications, but with rare exceptions, obtaining optimal solutions is fundamentally intractable. As a result, a plethora of algorithmic strategies has evolved within different communities under names such as stochastic search, stochastic programming, simulation optimization, policy optimization, and approximate dynamic programming. In this talk, I will argue that there are four fundamental classes of policies that can be used to produce decisions in sequential stochastic optimization problems: myopic policies, lookahead policies, policy function approximations, and policies based on value function approximations. Many other policies can be created as hybrids of these four fundamental strategies. Policy and value function approximations can themselves be performed with three fundamental strategies: lookup tables, parametric models, and nonparametric models. Not surprisingly, the richest but most difficult class of policies involves approximating value functions, and I will touch on the major algorithmic choices that need to be made for this class. This view pulls together the disparate communities that work in this field, replacing the appearance of competing communities with a coherent set of choices that can and should work together. I will illustrate these ideas using a series of applications that are well suited to different strategies. In the process, I hope to reinforce the idea that dynamic programming is an effective strategy for certain classes of very large-scale applications.
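To make the four policy classes concrete, here is a minimal illustrative sketch on a toy inventory problem. The problem, all parameters, and the function names are my own assumptions for illustration, not material from the talk; the value function approximation uses a lookup table and the policy function approximation uses a simple parametric "order-up-to" rule, matching two of the three approximation strategies mentioned above.

```python
# Toy inventory problem: state = inventory on hand, decision = order quantity.
# All constants are illustrative assumptions, not from the talk.
PRICE, COST, HOLD = 5.0, 2.0, 0.1   # sale price, unit cost, holding cost
MAX_ORDER = 10

def immediate_reward(inventory, order, demand):
    """One-period profit: revenue minus ordering and holding costs."""
    sales = min(inventory + order, demand)
    leftover = inventory + order - sales
    return PRICE * sales - COST * order - HOLD * leftover

def myopic_policy(inventory, expected_demand):
    # Myopic: choose the order that maximizes only the one-period reward.
    return max(range(MAX_ORDER + 1),
               key=lambda x: immediate_reward(inventory, x, expected_demand))

def lookahead_policy(inventory, expected_demand, horizon=3):
    # Lookahead: score each order by rolling the myopic rule forward over
    # a short deterministic horizon and summing the rewards.
    def rollout(inv, order, steps):
        r = immediate_reward(inv, order, expected_demand)
        if steps == 0:
            return r
        inv_next = max(inv + order - expected_demand, 0)
        return r + rollout(inv_next, myopic_policy(inv_next, expected_demand),
                           steps - 1)
    return max(range(MAX_ORDER + 1),
               key=lambda x: rollout(inventory, x, horizon))

def policy_function_approximation(inventory, order_up_to=8):
    # PFA: a parametric decision rule ("order up to q") whose single
    # parameter would be tuned by simulation, not optimized online.
    return max(order_up_to - inventory, 0)

def vfa_policy(inventory, expected_demand, v):
    # VFA: immediate reward plus an approximate value of the next state,
    # where v is a lookup-table approximation indexed by inventory level.
    def score(x):
        inv_next = max(inventory + x - expected_demand, 0)
        return immediate_reward(inventory, x, expected_demand) + v.get(inv_next, 0.0)
    return max(range(MAX_ORDER + 1), key=score)
```

Each function maps a state to a decision, which is the common interface that lets these four strategies, and hybrids of them, be compared on the same problem.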
This is joint work with the students and professional research staff at CASTLE Lab (www.castlelab.princeton.edu).