Part of Advances in Neural Information Processing Systems 10 (NIPS 1997)
We apply a general algorithm for merging prediction strategies (the Aggregating Algorithm) to the problem of linear regression with the square loss; our main assumption is that the response variable is bounded. It turns out that for this particular problem the Aggregating Algorithm resembles, but is slightly different from, the well-known ridge estimation procedure. From general results about the Aggregating Algorithm we deduce a guaranteed bound on the difference between our algorithm's performance and the best, in some sense, linear regression function's performance. We show that the AA attains the optimal constant in our bound, whereas the constant attained by the ridge regression procedure in general can be 4 times worse.
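
The abstract does not spell out how the two procedures differ, so the following is only an illustrative sketch under stated assumptions. It assumes the form in which this algorithm is commonly presented in the later literature: the prediction is the same as on-line ridge regression except that the current input is also included in the regularized second-moment term. For brevity it handles a single scalar input; the function name `online_predictions` and the regularization parameter `a` are hypothetical and not taken from the paper.

```python
def online_predictions(xs, ys, a=1.0):
    """Return a list of (ridge, aa) on-line predictions for scalar inputs.

    Assumption (not stated in the abstract): the AA-style prediction differs
    from ridge only by adding the current x*x to the denominator.
    """
    sum_xx = 0.0  # running sum of x_s^2 over past rounds
    sum_xy = 0.0  # running sum of y_s * x_s over past rounds
    out = []
    for x, y in zip(xs, ys):
        # Ridge: weight fitted on past rounds only, applied to the current x.
        ridge = x * sum_xy / (a + sum_xx)
        # AA-style: same numerator, but the current x also enters the denominator.
        aa = x * sum_xy / (a + sum_xx + x * x)
        out.append((ridge, aa))
        # The response y is revealed only after both predictions are made.
        sum_xx += x * x
        sum_xy += x * y
    return out


if __name__ == "__main__":
    for ridge, aa in online_predictions([1.0, 1.0, 2.0], [1.0, 1.0, 2.0]):
        print(ridge, aa)
```

Under this assumption the AA-style prediction is never larger in magnitude than the ridge prediction, since the two share a numerator and the AA-style denominator is at least as large. This extra shrinkage toward zero is one way to read "resembles, but is slightly different from" ridge estimation.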