Part of Advances in Neural Information Processing Systems 20 (NIPS 2007)
David Wingate, Satinder Baveja
In order to represent state in controlled, partially observable, stochastic dynamical systems, some sort of sufficient statistic for history is necessary. Predictive repre- sentations of state (PSRs) capture state as statistics of the future. We introduce a new model of such systems called the “Exponential family PSR,” which defines as state the time-varying parameters of an exponential family distribution which models n sequential observations in the future. This choice of state representation explicitly connects PSRs to state-of-the-art probabilistic modeling, which allows us to take advantage of current efforts in high-dimensional density estimation, and in particular, graphical models and maximum entropy models. We present a pa- rameter learning algorithm based on maximum likelihood, and we show how a variety of current approximate inference methods apply. We evaluate the qual- ity of our model with reinforcement learning by directly evaluating the control performance of the model.