Ronen Brafman, Moshe Tennenholtz
We introduce efficient learning equilibrium (ELE), a normative approach to learning in non-cooperative settings. In ELE, the learning algorithms themselves are required to be in equilibrium. In addition, the learning algorithms arrive at a desired value after polynomial time, and deviations from a prescribed ELE become irrational after polynomial time. We prove the existence of an ELE in the perfect monitoring setting, where the desired value is the expected payoff in a Nash equilibrium. We also show that an ELE does not always exist in the imperfect monitoring case. Yet, it exists in the special case of common-interest games. Finally, we extend our results to general stochastic games.