{"title": "Optimal Bayesian Recommendation Sets and Myopically Optimal Choice Query Sets", "book": "Advances in Neural Information Processing Systems", "page_first": 2352, "page_last": 2360, "abstract": "Bayesian approaches to utility elicitation typically adopt (myopic) expected value of information (EVOI) as a natural criterion for selecting queries. However, EVOI-optimization is usually computationally prohibitive. In this paper, we examine EVOI optimization using \\emph{choice queries}, queries in which a user is ask to select her most preferred product from a set. We show that, under very general assumptions, the optimal choice query w.r.t.\\ EVOI coincides with \\emph{optimal recommendation set}, that is, a set maximizing expected utility of the user selection. Since recommendation set optimization is a simpler, submodular problem, this can greatly reduce the complexity of both exact and approximate (greedy) computation of optimal choice queries. We also examine the case where user responses to choice queries are error-prone (using both constant and follow mixed multinomial logit noise models) and provide worst-case guarantees. Finally we present a local search technique that works well with large outcome spaces.", "full_text": "Optimal Bayesian Recommendation Sets and\n\nMyopically Optimal Choice Query Sets\n\nPaolo Viappiani\u2217\n\nCraig Boutilier\n\nDepartment of Computer Science\n\nDepartment of Computer Science\n\nUniversity of Toronto\n\nUniversity of Toronto\n\npaolo.viappiani@gmail.com\n\ncebly@cs.toronto.edu\n\nAbstract\n\nBayesian approaches to utility elicitation typically adopt (myopic) expected value of infor-\nmation (EVOI) as a natural criterion for selecting queries. However, EVOI-optimization is\nusually computationally prohibitive. In this paper, we examine EVOI optimization using\nchoice queries, queries in which a user is ask to select her most preferred product from a\nset. 
We show that, under very general assumptions, the optimal choice query w.r.t. EVOI\ncoincides with the optimal recommendation set, that is, a set maximizing the expected util-\nity of the user\u2019s selection. Since recommendation set optimization is a simpler, submodular\nproblem, this can greatly reduce the complexity of both exact and approximate (greedy)\ncomputation of optimal choice queries. We also examine the case where user responses\nto choice queries are error-prone (using both constant and mixed multinomial logit noise\nmodels) and provide worst-case guarantees. Finally, we present a local search technique for\nquery optimization that works extremely well with large outcome spaces.\n\n1 Introduction\n\nUtility elicitation is a key component in many decision support applications and recommender sys-\ntems, since appropriate decisions or recommendations depend critically on the preferences of the\nuser on whose behalf decisions are being made. Since full elicitation of user utility is prohibitively\nexpensive in most cases (w.r.t. time, cognitive effort, etc.), we must often rely on partial utility in-\nformation. Thus in interactive preference elicitation, one must selectively decide which queries are\nmost informative relative to the goal of making good or optimal recommendations. A variety of\nprincipled approaches have been proposed for this problem. A number of these focus directly on\n(myopically or heuristically) reducing uncertainty regarding utility parameters as quickly as possi-\nble, including max-margin [10], volumetric [12], polyhedral [22] and entropy-based [1] methods.\n\nA different class of approaches does not attempt to reduce utility uncertainty for its own sake, but\nrather focuses on discovering utility information that improves the quality of the recommendation.\nThese include regret-based [3, 23] and Bayesian [7, 6, 2, 11] models. 
We focus on Bayesian models\nin this work, assuming some prior distribution over user utility parameters and conditioning this\ndistribution on information acquired from the user (e.g., query responses or behavioral observations).\nThe most natural criterion for choosing queries is expected value of information (EVOI), which can\nbe optimized myopically [7] or sequentially [2]. However, optimization of EVOI for online query\nselection is not feasible except in the most simple cases. Hence, in practice, heuristics are used that\noffer no theoretical guarantees with respect to query quality.\n\n\u2217From 9/2010 to 12/2010 at the University of Regina; from 01/2011 onwards at Aalborg University.\n\nIn this paper we consider the problem of myopic EVOI optimization using choice queries. Such\nqueries are commonly used in conjoint analysis and product design [15], requiring a user to indicate\nwhich choice/product is most preferred from a set of k options. We show that, under very general\nassumptions, optimization of choice queries reduces to the simpler problem of choosing the opti-\nmal recommendation set, i.e., the set of k products such that, if a user were forced to choose one,\nmaximizes utility of that choice (in expectation). Not only is the optimal recommendation set prob-\nlem somewhat easier computationally, it is submodular, admitting a greedy algorithm with approx-\nimation guarantees. Thus, it can be used to determine approximately optimal choice queries. We\ndevelop this connection under several different (noisy) user response models. 
Finally, we describe\nquery iteration, a local search technique that, though it has no formal guarantees, \ufb01nds near-optimal\nrecommendation sets and queries much faster than either exact or greedy optimization.\n\n2 Background: Bayesian Recommendation and Elicitation\n\nWe assume a system is charged with the task of recommending an option to a user in some multi-\nattribute space, for instance, the space of possible product con\ufb01gurations from some domain (e.g.,\ncomputers, cars, rental apartments, etc.). Products are characterized by a \ufb01nite set of attributes\nX = {X1, ...Xn}, each with \ufb01nite domain Dom(Xi). Let X \u2286 Dom(X ) denote the set of feasible\ncon\ufb01gurations. For instance, attributes may correspond to the features of various cars, such as color,\nengine size, fuel economy, etc., with X de\ufb01ned either by constraints on attribute combinations (e.g.,\nconstraints on computer components that can be put together) or by an explicit database of feasible\ncon\ufb01gurations (e.g., a rental database). The user has a utility function u : Dom(X ) \u2192 R. The\nprecise form of u is not critical, but in our experiments we assume that u(x; w) is linear in the\nparameters (or weights) w (e.g., as in generalized additive independent (GAI) models [8, 5]). We\noften refer to w as the user\u2019s \u201cutility function\u201d for simplicity, assuming a \ufb01xed form for u. A simple\nadditive model in the car domain might be:\n\nu(Car ; w) = w1f1(MPG ) + w2f2(EngineSize) + w3f3(Color ).\n\nThe optimal product x\u2217w for a user with utility parameters w is the x \u2208 X that maximizes u(x; w).\n\nGenerally, a user\u2019s utility function w will not be known with certainty. Following recent models of\nBayesian elicitation, the system\u2019s uncertainty is re\ufb02ected in a distribution, or beliefs, P (w; \u03b8) over\nthe space W of possible utility functions [7, 6, 2]. 
Here \u03b8 denotes the parameterization of our model,\nand we often refer to \u03b8 as our belief state. Given P (\u00b7; \u03b8), we de\ufb01ne the expected utility of an option\nx to be EU (x; \u03b8) = \u222bW u(x; w)P (w; \u03b8) dw. If required to make a recommendation given belief \u03b8,\nthe optimal option x\u2217(\u03b8) is that with greatest expected utility EU \u2217(\u03b8) = maxx\u2208X EU (x; \u03b8), with\nx\u2217(\u03b8) = arg maxx\u2208X EU (x; \u03b8).\n\nIn some settings, we are able to make set-based recommendations: rather than recommending a\nsingle option, a small set of k options can be presented, from which the user selects her most pre-\nferred option [15, 20, 23]. We discuss the problem of constructing an optimal recommendation set\nS further below. Given recommendation set S with x \u2208 S, let S \u22b2 x denote that x has the greatest\nutility among those items in S (for a given utility function w). Given feasible utility space W , we\nde\ufb01ne W \u2229 S \u22b2 x \u2261 {w \u2208 W : u(x; w) \u2265 u(y; w), \u2200y \u2260 x, y \u2208 S} to be those utility functions\nsatisfying S \u22b2 x. Ignoring \u201cties\u201d over full-dimensional subsets of W (which are easily dealt with,\nbut complicate the presentation), the regions W \u2229 S \u22b2 xi, xi \u2208 S, partition utility space.\n\nA recommender system can re\ufb01ne its belief state \u03b8 by learning more about the user\u2019s utility function\nw. A reduction in uncertainty will lead to better recommendations (in expectation). 
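Concretely, under a particle-based representation of \u03b8 (as used in our experiments, Section 5), EU (x; \u03b8) and x\u2217(\u03b8) reduce to weighted sums. The following minimal sketch is our illustration, not the experimental code; names are hypothetical and linear utilities u(x; w) = w \u00b7 x are assumed:

```python
import numpy as np

def expected_utility(x, particles, weights):
    """EU(x; theta) ~ sum_i p_i * u(x; w_i), over particles w_i with weights p_i."""
    return float(np.dot(particles @ x, weights))

def best_recommendation(options, particles, weights):
    """x*(theta) = argmax_x EU(x; theta); returns (index, EU*(theta))."""
    eus = [expected_utility(x, particles, weights) for x in options]
    best = int(np.argmax(eus))
    return best, eus[best]
```

Here `particles` is an (m x d) array of sampled utility vectors and `weights` their normalized importance weights.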
While many\nsources of information can be used to assess a user\u2019s preferences\u2014including the preferences of related\nusers, as in collaborative \ufb01ltering [14], or observed user choice behavior [15, 19]\u2014we focus on\nexplicit utility elicitation, in which a user is asked questions about her preferences.\n\nThere are a variety of query types that can be used to re\ufb01ne one\u2019s knowledge of a user\u2019s utility\nfunction (we refer to [13, 3, 5] for further discussion). Comparison queries are especially natural,\nasking a user if she prefers one option x to another y. These comparisons can be localized to speci\ufb01c\n(subsets of) attributes in additive or GAI models, and such structured models allow responses w.r.t.\nspeci\ufb01c options to \u201cgeneralize,\u201d providing constraints on the utility of related options. In this work\nwe consider the extension of comparisons to choice sets of more than two options [23] as is common\nin conjoint analysis [15, 22]. Any set S can be interpreted as a query: the user states which of the k\nelements xi \u2208 S she prefers. We refer to S interchangeably as a query or a choice set.\n\nThe user\u2019s response to a choice set tells us something about her preferences; but this depends on\nthe user response model. In a noiseless model, the user correctly identi\ufb01es the preferred item in\nthe slate: the choice of xi \u2208 S re\ufb01nes the set of feasible utility functions W by imposing k \u2212 1\nlinear constraints of the form u(xi; w) \u2265 u(xj ; w), j \u2260 i, and the new belief state is obtained by\nrestricting \u03b8 to have non-zero density only on W \u2229 S \u22b2 xi and renormalizing. More generally, a noisy\nresponse model allows that a user may select an option that does not maximize her utility. For any\nchoice set S with xi \u2208 S, let S xi denote the event of the user selecting xi. 
A response model R\ndictates, for any choice set S, the probability PR(S xi; w) of any selection given utility function\nw. When the beliefs about a user\u2019s utility are uncertain, we de\ufb01ne PR(S xi; \u03b8) = \u222bW PR(S xi; w)P (w; \u03b8) dw. We discuss various response models below.\n\nWhen treating S as a query set (as opposed to a recommendation set), we are not interested in its\nexpected utility, but rather in its expected value of information (EVOI), or the (expected) degree to\nwhich a response will increase the quality of the system\u2019s recommendation. We de\ufb01ne:\n\nDe\ufb01nition 1 Given belief state \u03b8, the expected posterior utility (EPU ) of query set S under R is\n\nEPU R(S; \u03b8) = \u2211x\u2208S PR(S x; \u03b8) EU \u2217(\u03b8|S x)    (1)\n\nEVOI (S; \u03b8) is then EPU (S; \u03b8) \u2212 EU \u2217(\u03b8), the expected improvement in decision quality given S.\nAn optimal query (of \ufb01xed size k) is any S with maximal EVOI , or equivalently, maximal EPU .\n\nIn many settings, we may wish to present a set of options to a user with the dual goals of offering\na good set of recommendations and eliciting valuable information about user utility. For instance,\nproduct navigation interfaces for e-commerce sites often display a set of options from which a user\ncan select, but also give the user a chance to critique the proposed options [24]. This provides one\nmotivation for exploring the connection between optimal recommendation sets and optimal query\nsets. Moreover, even in settings where queries and recommendation are separated, we will see that\nquery optimization can be made more ef\ufb01cient by exploiting this relationship.\n\n3 Optimal Recommendation Sets\n\nWe consider \ufb01rst the problem of computing optimal recommendation sets given the system\u2019s uncer-\ntainty about the user\u2019s true utility function w. 
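Before proceeding, note that Defn. 1 translates directly into code under a particle representation: reweight by the response likelihood, renormalize, and optimize in each posterior. A hypothetical sketch of ours (response_prob stands in for PR(S x; w); linear utilities assumed):

```python
import numpy as np

def epu(S, options, particles, weights, response_prob):
    """EPU_R(S; theta): sum over responses x of P_R(S -> x; theta) * EU*(theta | S -> x)."""
    total = 0.0
    for x in S:
        # Per-particle likelihood of the user selecting x from S.
        like = np.array([response_prob(x, S, w) for w in particles])
        p_resp = float(np.dot(like, weights))   # P_R(S -> x; theta)
        if p_resp <= 0.0:
            continue                            # response has zero probability
        post = weights * like / p_resp          # Bayesian update of the belief state
        eu_star = max(float(np.dot(particles @ y, post)) for y in options)
        total += p_resp * eu_star
    return total

def evoi(S, options, particles, weights, response_prob):
    """EVOI(S; theta) = EPU(S; theta) - EU*(theta)."""
    eu_star = max(float(np.dot(particles @ y, weights)) for y in options)
    return epu(S, options, particles, weights, response_prob) - eu_star
```

The per-response posterior optimization in the inner loop is exactly what makes EPU maximization more expensive than EUS maximization (Section 5).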
Given belief state \u03b8, if a single recommendation is to\nbe made, then we should recommend the option x\u2217(\u03b8) that maximizes expected utility EU (x; \u03b8).\nHowever, there is often value in suggesting a \u201cshortlist\u201d containing multiple options and allowing\nthe user to select her most preferred option. Intuitively, such a set should offer options that are\ndiverse in the following sense: recommended options should be highly preferred relative to a wide\nrange of \u201clikely\u201d user utility functions (relative to \u03b8) [23, 20, 4]. This stands in contrast to some rec-\nommender systems that de\ufb01ne diversity relative to product attributes [21], with no direct reference\nto beliefs about user utility. It is not hard to see that \u201ctop k\u201d systems, those that present the k options\nwith highest expected utility, do not generally result in good recommendation sets [20].\n\nIn broad terms, we assume that the utility of a recommendation set S is the utility of its most\npreferred item. However, it is unrealistic to assume that users will select their most preferred item\nwith complete accuracy [17, 15]. So as with choice queries, we assume a response model R dictating\nthe probability PR(S x; \u03b8) of any choice x from S:\n\nDe\ufb01nition 2 The expected utility of selection (EUS) of recommendation set S given \u03b8 and R is:\n\nEUS R(S; \u03b8) = \u2211x\u2208S PR(S x; \u03b8) EU (x; \u03b8|S x)    (2)\n\nWe can expand the de\ufb01nition to rewrite EUS R(S; \u03b8) as:\n\nEUS R(S; \u03b8) = \u222bW [\u2211x\u2208S PR(S x; w) u(x; w)] P (w; \u03b8) dw    (3)\n\nUser behavior is largely dictated by the response model R. In the ideal setting, a user would always\nselect the option with highest utility w.r.t. her true utility function w. This noiseless model is assumed\nin [20] for example. However, this is unrealistic in general. 
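Eq. (3) shows why EUS is cheaper than EPU: no posterior optimization is needed, only an expectation of the response-weighted utility. A hypothetical particle-based sketch (our illustration, linear utilities assumed):

```python
import numpy as np

def eus(S, particles, weights, response_prob):
    """EUS_R(S; theta) via Eq. (3): E_w[ sum_{x in S} P_R(S -> x; w) * u(x; w) ]."""
    total = 0.0
    for wi, p in zip(particles, weights):
        total += p * sum(response_prob(x, S, wi) * float(np.dot(wi, x)) for x in S)
    return total
```

Under the noiseless model this reduces to the expected max criterion E_w[max over x in S of u(x; w)].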
Noisy response models admit user\n\u201cmistakes\u201d and the choice of optimal sets should re\ufb02ect this possibility (just as belief update does,\nsee Defn. 1). Possible constraints on response models include: (i) preference bias: a more preferred\noutcome in the slate given w is selected with probability greater than a less preferred outcome; and\n(ii) Luce\u2019s choice axiom [17], a form of independence of irrelevant alternatives that requires that the\nrelative probability (if not 0 or 1) of selecting any two items x and y from S is not affected by the\naddition or deletion of other items from the set. We consider three different response models:\n\n\u2022 In the noiseless response model, RNL, we have PNL(S x; w) = \u220fy\u2208S I[u(x; w) \u2265 u(y; w)]\n(with indicator function I). Then EUS becomes\n\nEUS NL(S; \u03b8) = \u222bW [maxx\u2208S u(x; w)] P (w; \u03b8) dw.\n\nThis is identical to the expected max criterion of [20]. Under RNL we have S x iff S \u22b2 x.\n\n\u2022 The constant noise model RC assumes a multinomial distribution over choices or responses where\neach option x, apart from the most preferred option x\u2217w relative to w, is selected with (small)\nconstant probability PC (S x; w) = \u03b2, with \u03b2 independent of w. We assume \u03b2 < 1/k, so the\nmost preferred option is selected with probability PC (S x\u2217w; w) = \u03b1 = 1 \u2212 (k \u2212 1)\u03b2 > \u03b2.\nThis generalizes the model used in [10, 2] to sets of any size. If x\u2217w(S) is the optimal element in S\ngiven w, and u\u2217w(S) is its utility, then EUS is:\n\nEUS C (S; \u03b8) = \u222bW [\u03b1 u\u2217w(S) + \u2211y\u2208S\u2212{x\u2217w(S)} \u03b2 u(y; w)] P (w; \u03b8) dw\n\n\u2022 The logistic response model RL is commonly used in choice modeling, and is variously known as\nthe Luce-Sheppard [16], Bradley-Terry [11], or mixed multinomial logit model. 
Selection prob-\nabilities are given by PL(S x; w) = exp(\u03b3u(x; w)) / \u2211y\u2208S exp(\u03b3u(y; w)), where \u03b3 is a temperature parameter.\nFor comparison queries (i.e., |S| = 2), RL is the logistic function of the difference in utility\nbetween the two options.\n\nWe now consider properties of the expected utility of selection EUS under these various models.\nAll three models satisfy preference bias, but only RNL and RL satisfy Luce\u2019s choice axiom. EUS\nis monotone under the noiseless response model RNL: the addition of options to a recommendation\nset S cannot decrease its expected utility EUS NL(S; \u03b8). Moreover, say that option xi dominates xj\nrelative to belief state \u03b8, if u(xi; w) > u(xj; w) for all w with nonzero density. Adding a set-wise\ndominated option x to S (i.e., an x dominated by some element of S) does not change expected\nutility under RNL: EUS NL(S \u222a {x}; \u03b8) = EUS NL(S; \u03b8). This stands in contrast to noisy response\nmodels, where adding dominated options might actually decrease expected utility.\n\nImportantly, EUS is submodular for both the noiseless and constant response models RNL and RC :\n\nTheorem 1 For R \u2208 {RNL, RC}, EUS R is a submodular function of the set S. That is, given\nrecommendation sets S \u2286 Q, option x \u2209 S, S\u2032 = S \u222a {x}, and Q\u2032 = Q \u222a {x}, we have:\n\nEUS R(S\u2032; \u03b8) \u2212 EUS R(S; \u03b8) \u2265 EUS R(Q\u2032; \u03b8) \u2212 EUS R(Q; \u03b8)    (4)\n\nThe proof is omitted, but simply shows that EUS has the required property of diminishing returns.\nSubmodularity serves as the basis for a greedy optimization algorithm (see Section 5 and worst-case\nresults on query optimization below). EUS under the commonly used logistic response model RL is\nnot submodular, but can be related to EUS under the noiseless model\u2014as we discuss next\u2014allowing\nus to exploit submodularity of the noiseless model when optimizing w.r.t. 
RL.\n\n4 The Connection between EUS and EPU\n\nWe now develop the connection between optimal recommendation sets (using EUS) and optimal\nchoice queries (using EPU/EVOI). As discussed above, we\u2019re often interested in sets that can serve\nas both good recommendations and good queries; and since EPU/EVOI can be computationally\ndif\ufb01cult, good methods for EUS-optimization can serve to generate good queries as well if we have\na tight relationship between the two.\n\nIn the following, we make use of a transformation T\u03b8,R that modi\ufb01es a set S in such a way that\nEUS usually increases (and in the case of RNL and RC cannot decrease). This transformation is\nused in two ways: (i) to prove the optimality (near-optimality in the case of RL) of EUS-optimal\nrecommendation sets when used as query sets; and (ii) directly as a computationally viable heuristic\nstrategy for generating query sets.\n\nDe\ufb01nition 3 Let S = {x1, \u00b7 \u00b7 \u00b7 , xk} be a set of options. De\ufb01ne:\n\nT\u03b8,R(S) = {x\u2217(\u03b8|S x1; R), \u00b7 \u00b7 \u00b7 , x\u2217(\u03b8|S xk; R)}\n\nwhere x\u2217(\u03b8|S xi; R) is the optimal option (in expectation) when \u03b8 is conditioned on S xi\nw.r.t. R.\n\nIntuitively, T (we drop the subscript when \u03b8, R are clear from context) re\ufb01nes a recommendation\nset S of size k by producing k updated beliefs, one for each possible user choice, and replacing each\noption in S with the optimal option under the corresponding update. Note that T generally produces\ndifferent sets under different response models. 
Indeed, one could use T to construct a set using one\nresponse model, and measure EUS or EPU of the resulting set under a different response model.\nSome of our theoretical results use this type of \u201ccross-evaluation.\u201d\n\nWe \ufb01rst show that optimal recommendation sets under both RNL and RC are optimal (i.e.,\nEPU/EVOI-maximizing) query sets.\n\nLemma 1 EUS R(T\u03b8,R(S); \u03b8) \u2265 EPU R(S; \u03b8) for R \u2208 {NL, C}\nProof: For RNL, the argument relies on partitioning W w.r.t. options in S:\n\nEPU NL(S; \u03b8) = \u2211i,j P (S \u22b2 xi, T (S) \u22b2 x\u2032j ; \u03b8) EU (x\u2032i; \u03b8[S \u22b2 xi, T (S) \u22b2 x\u2032j ])    (5)\n\nEUS NL(T (S); \u03b8) = \u2211i,j P (S \u22b2 xi, T (S) \u22b2 x\u2032j ; \u03b8) EU (x\u2032j ; \u03b8[S \u22b2 xi, T (S) \u22b2 x\u2032j ])    (6)\n\nCompare the two expressions componentwise: 1) If i = j then the components of each expression are the same. 2) If i \u2260 j, for any w with nonzero density in \u03b8[S \u22b2 xi, T (S) \u22b2 x\u2032j ], we have u(x\u2032j ; w) \u2265 u(x\u2032i; w), thus EU (x\u2032j ) \u2265 EU (x\u2032i) in the region S \u22b2 xi, T (S) \u22b2 x\u2032j . Since EUS NL(T (S); \u00b7) \u2265 EPU NL(S; \u00b7) in each component, the result follows. For RC the proof uses the same argument, along with the observation that: EUS C (S; \u03b8) = \u2211i P (S \u22b2 xi; \u03b8)(\u03b1 EU (xi; \u03b8[S \u22b2 xi]) + \u03b2 \u2211j\u2260i EU (xj ; \u03b8[S \u22b2 xi])).\n\nFrom Lemma 1 and the fact that EUS R(S; \u03b8) \u2264 EPU R(S; \u03b8), it follows that EUS R(T (S); \u03b8) \u2265\nEUS R(S; \u03b8). 
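In the particle setting, the transformation T of Defn. 3 costs one Bayesian update and one EU-maximization per possible response; a hypothetical sketch of ours (response_prob stands in for PR, linear utilities assumed):

```python
import numpy as np

def transform_T(S, options, particles, weights, response_prob):
    """T_{theta,R}(S): replace each x in S with x*(theta | S -> x; R),
    the posterior-optimal option for the corresponding user choice."""
    result = []
    for x in S:
        like = np.array([response_prob(x, S, w) for w in particles])
        z = float(np.dot(like, weights))
        post = weights * like / z if z > 0.0 else weights  # conditioned belief
        best = max(options, key=lambda y: float(np.dot(particles @ y, post)))
        result.append(best)
    return result
```

Repeated application of this operator is the basis of the query iteration strategy of Section 5.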
We now state the main theorem (we assume the size k of S is \ufb01xed):\n\nTheorem 2 Assume response model R \u2208 {NL, C} and let S\u2217 be an optimal recommendation set.\nThen S\u2217 is an optimal query set: EPU (S\u2217; \u03b8) \u2265 EPU (S; \u03b8), \u2200S \u2208 X^k.\nProof: Suppose S\u2217 is not an optimal query set, i.e., there is some S s.t. EPU (S; \u03b8) > EPU (S\u2217; \u03b8). Applying T to S gives a new query set T (S), which by the results above satis\ufb01es: EUS (T (S); \u03b8) \u2265 EPU (S; \u03b8) > EPU (S\u2217; \u03b8) \u2265 EUS (S\u2217; \u03b8). This contradicts the EUS-optimality of S\u2217.\n\nAnother consequence of Lemma 1 is that posing a query S involving an infeasible option is pointless:\nthere is always a set with only elements in X with EPU/EVOI at least as great. This is proved by\nobserving that the lemma still holds if T is rede\ufb01ned to allow sets containing infeasible options.\n\nIt is not hard to see that admitting noisy responses under the logistic response model RL can decrease\nthe value of a recommendation set, i.e., EUS L(S; \u03b8) \u2264 EUS NL(S; \u03b8). However, the loss in EUS\nunder RL can in fact be bounded. The logistic response model is such that, if the probability of\nincorrect selection of some option is high, then the utility of that option must be close to that of the\nbest item, so the relative loss in utility is small. Conversely, if the loss associated with some incorrect\nselection is great, its utility must be signi\ufb01cantly less than that of the best option, rendering such an\nevent extremely unlikely. 
This allows us to bound the difference between EUS NL and EUS L at\nsome value \u2206max that depends only on the set cardinality k and on the temperature parameter \u03b3 (we\nderive an expression for \u2206max below):\n\nTheorem 3 EUS L(S; \u03b8) \u2265 EUS NL(S; \u03b8) \u2212 \u2206max.\n\nUnder RL, our transformation TL does not, in general, improve the value EUS L(S) of a recom-\nmendation set S. However the set TL(S) is such that its value EUS NL, assuming selection under\nthe noiseless model, is greater than the expected posterior utility EPU L(S) under RL:\n\nLemma 2 EUS NL(TL(S); \u03b8) \u2265 EPU L(S; \u03b8)\n\nWe use this fact below to prove the optimal recommendation set under RL is a near-optimal query\nunder RL. It has two other consequences: First, from Thm. 3 it follows that EUS L(TL(S); \u03b8) \u2265\nEPU L(S; \u03b8) \u2212 \u2206max. Second, EPU of the optimal query under the noiseless model is at least as\ngreat as that of the optimal query under the logistic model: EPU \u2217NL(\u03b8) \u2265 EPU \u2217L(\u03b8).1 We now derive\nour main result for logistic responses: the EUS of the optimal recommendation set (and hence its\nEPU) is at most \u2206max less than the EPU of the optimal query set.\n\nTheorem 4 EUS \u2217L(\u03b8) \u2265 EPU \u2217L(\u03b8) \u2212 \u2206max.\nProof: Consider the optimal query S\u2217L and the set S\u2032 = TL(S\u2217L) obtained by applying TL. From Lemma 2, EUS NL(S\u2032; \u03b8) \u2265 EPU L(S\u2217L; \u03b8) = EPU \u2217L(\u03b8). From Thm. 3, EUS L(S\u2032; \u03b8) \u2265 EUS NL(S\u2032; \u03b8) \u2212 \u2206max; and from Thm. 2, EUS \u2217NL(\u03b8) = EPU \u2217NL(\u03b8) \u2265 EPU \u2217L(\u03b8). Thus EUS \u2217L(\u03b8) \u2265 EUS L(S\u2032; \u03b8) \u2265 EUS NL(S\u2032; \u03b8) \u2212 \u2206max \u2265 EPU \u2217L(\u03b8) \u2212 \u2206max.\n\nThe loss \u2206(S; \u03b8) = EUS NL(S; \u03b8) \u2212 EUS L(S; \u03b8) in the EUS of set S due to logistic noise can\nbe characterized as a function of the utility difference z = u(x1) \u2212 u(x2) between options x1\nand x2 of S, integrating over the possible values of z (weighted by \u03b8). For a speci\ufb01c value\nof z \u2265 0, EUS-loss is exactly the utility difference z times the probability of choosing the less\npreferred option under RL: 1 \u2212 L(\u03b3z) = L(\u2212\u03b3z), where L is the logistic function. We have\n\u2206(S; \u03b8) = \u222b+\u221e\u2212\u221e |z| \u00b7 1/(1 + e\u03b3|z|) P (z; \u03b8) dz. We derive a problem-independent upper bound on \u2206(S; \u03b8)\nfor any S, \u03b8 by maximizing f (z) = z \u00b7 1/(1 + e\u03b3z) with z \u2265 0. The maximal loss \u2206max = f (zmax) for a\nset of two hypothetical items s1 and s2 is attained by having the same utility difference u(s1; w) \u2212\nu(s2; w) = zmax for any w \u2208 W . By imposing \u2202f /\u2202z = 0, we obtain e\u2212\u03b3z \u2212 \u03b3z + 1 = 0. Numerically,\nthis yields zmax \u223c 1.279/\u03b3 and \u2206max \u223c 0.2785/\u03b3. This bound can be expressed on a scale that is\nindependent of the temperature parameter \u03b3; intuitively, \u2206max corresponds to a utility difference so\nslight that the user identi\ufb01es the best item only with probability 0.56 under RL with temperature \u03b3.\nIn other words, the maximum loss is so small that the user is unable to identify the preferred\nitem 44% of the time when asked to compare the two items in S. 
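These constants are easy to verify numerically. A minimal sketch of ours, solving the stationarity condition e^(-\u03b3z) \u2212 \u03b3z + 1 = 0 by bisection (the left-hand side is strictly decreasing in z on [0, \u221e), so the root is unique):

```python
import math

def z_max(gamma=1.0, lo=0.0, hi=10.0, iters=200):
    """Root of exp(-gamma*z) - gamma*z + 1 = 0: the loss-maximizing utility gap."""
    g = lambda z: math.exp(-gamma * z) - gamma * z + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid   # g is decreasing: root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_max(gamma=1.0):
    """Max EUS loss for a pair: f(z) = z / (1 + exp(gamma*z)) evaluated at z_max."""
    z = z_max(gamma)
    return z / (1.0 + math.exp(gamma * z))
```

For \u03b3 = 1 this yields zmax \u2248 1.279 and \u2206max \u2248 0.2785, matching the values in the text; for k = 2 the general Lambert W expression below gives LW(1/e) \u2248 0.2785 as well.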
This derivation can be generalized\nto sets of any size k, yielding \u2206^k_max = (1/\u03b3) \u00b7 LW((k\u22121)/e), where LW (\u00b7) is the Lambert W function.2\n\n5 Set Optimization Strategies\n\nWe discuss several strategies for the optimization of query/recommendation sets in this section,\nand summarize their theoretical and computational properties. In what follows, n is the number of\noptions |X|, k the size of the query/recommendation set, and l is the \u201ccost\u201d of Bayesian inference\n(e.g., the number of particles in a Monte Carlo sampling procedure).\n\nExact Methods The naive maximization of EPU is more computationally intensive than EUS-\noptimization, and is generally impractical. Given a set S of k elements, computing EPU (S; \u03b8)\nrequires Bayesian update of \u03b8 for each possible response, and expected utility optimization for each\nsuch posterior. Query optimization requires this be computed for n^k possible query sets. Thus EPU\nmaximization is O(n^(k+1) kl). Exact EUS optimization, while still quite demanding, is only O(n^k kl)\nas it does not require EU-maximization in updated distributions. Thm. 2 allows us to compute\noptimal query sets using EUS-maximization under RC and RNL, reducing complexity by a factor\nof n. Under RL, Thm. 4 allows us to use EUS-optimization to approximate the optimal query, with\na quality guarantee of EPU \u2217 \u2212 \u2206max.\n\nGreedy Optimization A simple greedy algorithm can be used to construct a recommendation\nset of size k by iteratively adding the option offering the greatest improvement in value:\narg maxx EUS R(S \u222a {x}; \u03b8). Under RNL and RC , since EUS is submodular (Thm. 1), the\ngreedy algorithm determines a set with EUS that is within \u03b7 = 1 \u2212 ((k\u22121)/k)^k of the optimal value\nEUS \u2217 = EPU \u2217 [9].3 Thm. 2 again allows us to use greedy maximization of EUS to determine a\nquery set with similar guarantees.\n\n1EPU L(S; \u03b8) is not necessarily less than EPU NL(S; \u03b8): there are sets S for which a noisy response might\nbe \u201cmore informative\u201d than a noiseless one. However, this is not the case for optimal query sets.\n\n2Lambert W, or product-log, is de\ufb01ned as the principal value of the inverse of x \u00b7 e^x. The loss-maximizing\nset Smax may contain infeasible outcomes; so in practice loss may be much lower.\n\nUnder RL, EUS L is no longer submodular. However, Lemma 2 and Thm. 3 allow us to use EUS NL,\nwhich is submodular, as a proxy. Let Sg be the set determined by greedy optimization of EUS NL. By\nsubmodularity, \u03b7 \u00b7 EUS \u2217NL \u2264 EUS NL(Sg) \u2264 EUS \u2217NL; we also have EUS \u2217L \u2264 EUS \u2217NL. Applying\nThm. 3 to Sg gives: EUS L(Sg) \u2265 EUS NL(Sg) \u2212 \u2206. Thus, we derive\n\nEUS L(Sg)/EUS \u2217L \u2265 (\u03b7 \u00b7 EUS \u2217NL \u2212 \u2206)/EUS \u2217L \u2265 (\u03b7 \u00b7 EUS \u2217NL \u2212 \u2206)/EUS \u2217NL \u2265 \u03b7 \u2212 \u2206/EUS \u2217NL    (7)\n\nSimilarly, we derive a worst-case bound for EPU w.r.t. greedy EUS-optimization (using the fact\nthat EUS is a lower bound for EPU, Thm. 3 and Thm. 2):\n\nEPU L(Sg)/EPU \u2217L \u2265 EUS L(Sg)/EPU \u2217L \u2265 (\u03b7 \u00b7 EUS \u2217NL \u2212 \u2206)/EPU \u2217NL = (\u03b7 \u00b7 EUS \u2217NL \u2212 \u2206)/EUS \u2217NL \u2265 \u03b7 \u2212 \u2206/EUS \u2217NL    (8)\n\nGreedy maximization of S w.r.t. EUS is extremely fast, O(k^2 ln), or linear in the number of options\nn: it requires O(kn) evaluations of EUS , each with cost kl.4\n\nQuery Iteration The T transformation (Defn. 3) gives rise to a natural heuristic method for com-\nputing good query/recommendation sets. Query iteration (QI) starts with an initial set S, and locally\noptimizes S by repeatedly applying operator T (S) until EUS (T (S); \u03b8) = EUS (S; \u03b8). 
QI is sensitive\nto the initial set S, which can lead to different \ufb01xed points. We consider several initialization strate-\ngies: random (randomly choose k options), sampling (include x\u2217(\u03b8), and sample k \u2212 1 points wi\nfrom P (w; \u03b8), and for each of these add the optimal item to S, while forcing distinctness) and greedy\n(initialize with the greedy set Sg).\n\nWe can bound the performance of QI relative to optimal query/recommendation sets assuming RNL\nor RC . If QI is initialized with Sg, performance is no worse than greedy optimization. If initialized\nwith an arbitrary set, we note that, because of submodularity, EU \u2217 \u2264 EUS \u2217 \u2264 kEU \u2217. The\ncondition T (S) = S implies EUS (S) = EPU (S). Also note that, for any set Q, EPU (Q) \u2265 EU \u2217.\nThus, EUS (S) \u2265 (1/k) EUS \u2217. This means for comparison queries (|S| = 2), QI achieves at least 50%\nof the optimal recommendation set value. This bound is tight and corresponds to the singleton\ndegenerate set Sd = {x\u2217(\u03b8), . . . , x\u2217(\u03b8)} = {x\u2217(\u03b8)}. This solution is problematic since T (Sd) = Sd\nand has EVOI of zero. However, under RNL, QI with sampling initialization provably avoids this\n\ufb01xed point by construction, always leading to a query set with positive EVOI.\n\nComplexity of one iteration of QI is O(nk + lk), i.e., linear in the number of options, exactly like\nGreedy. However, in practice it is much faster than Greedy since typically k \u226a l. 
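Both strategies are only a few lines given a black-box set evaluator; a hypothetical sketch of ours, where eus_fn and transform stand in for EUS_R and T_{\u03b8,R}:

```python
def greedy_set(options, k, eus_fn):
    """Greedy EUS-optimization: repeatedly add argmax_x EUS(S + [x]).
    For submodular EUS (R_NL, R_C) the result is within 1 - ((k-1)/k)^k
    of optimal (75% for k = 2)."""
    S = []
    for _ in range(k):
        best, best_val = None, float("-inf")
        for x in options:
            if x in S:
                continue
            val = eus_fn(S + [x])
            if val > best_val:
                best, best_val = x, val
        S.append(best)
    return S

def query_iteration(S0, transform, eus_fn, max_iters=100):
    """QI local search: apply T until EUS stops strictly improving; at a
    fixed point T(S) = S we have EUS(S) = EPU(S)."""
    S, val = list(S0), eus_fn(S0)
    for _ in range(max_iters):
        S2 = transform(S)
        v2 = eus_fn(S2)
        if v2 <= val:   # no strict improvement: fixed point reached
            return S, val
        S, val = S2, v2
    return S, val
```

The lazy-evaluation speedup of footnote 4 would replace the inner loop of greedy_set with a priority queue over upper bounds on the marginal EUS gain.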
While we have no theoretical results bounding the number of iterations QI requires to converge, in practice a fixed point is reached very quickly (see below).

Evaluation  We compare the strategies above empirically on choice problems with random user utility functions, using both noiseless and noisy response models.⁵

Bayesian inference is realized by a Monte Carlo method with importance sampling (particle weights are determined by applying the response model to observed responses). To overcome the problem of particle degeneration (most particles eventually have low or zero weight), we use slice sampling [18] to regenerate particles w.r.t. the response-updated belief state θ whenever the effective number of samples drops significantly (50,000 particles were used in the simulations). Figure 1(a) shows the average loss of our strategies on an apartment rental dataset, with 187 outcomes, each characterized by 10 attributes (either numeric or categorical with domain sizes 2–6), when asking pairwise comparison queries with noiseless responses. We note that greedy performs almost as well as exact optimization, and the optimal item is found in roughly 10–15 queries. Query iteration performs reasonably well when initialized with sampling, but poorly with random seeds.

³ This is 75% for comparison queries (k = 2) and at worst 63% (as k → ∞).
⁴ A practical speedup can be achieved by maintaining a priority queue of outcomes sorted by their potential EUS-contribution (monotonically decreasing due to submodularity).
When choosing the item to add to the set, we only need to evaluate a few outcomes at the top of the queue (lazy evaluation).

⁵ Utility priors are mixtures of 3 Gaussians with µ = U[0, 10] and σ = µ/3 for each component.

[Figure 1: normalized average loss vs. number of queries. (a) Average loss (187 outcomes, 30 runs, R_NL), comparing exactEUS, greedy(EUS,NL), QI(sampling), QI(rand) and random. (b) Average loss (506 outcomes, 30 runs, R_L), comparing QI(greedy,L), greedy(EUS,L), greedy(EUS,NL), QI(sampling,NL), QI(rand,L) and random.]

In the second experiment, we consider the Boston Housing dataset with 506 items (1 binary and 13 continuous attributes) and a logistic noise model for responses with γ = 1. We compare the greedy and QI strategies (exact methods are impractical on problems of this size) in Figure 1(b); we also consider a hybrid greedy(EUS,NL) strategy that optimizes "assuming" noiseless responses, but is evaluated using the true response model R_L. QI(sampling) is more efficient when using T_NL instead of T_L, and this is the version plotted. Overall, these experiments show that (greedy or exact) maximization of EUS is able to find optimal (or, when responses are noisy, near-optimal) query sets.
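The importance-sampling belief update used in these experiments (response-model likelihood as importance weight, with an effective-sample-size check to trigger particle regeneration) can be sketched as follows. This is our own simplified illustration with the logistic (Luce) choice likelihood; the slice-sampling regeneration step itself is omitted, and the function name and threshold policy are ours:

```python
import math

def reweight(weights, query_utils, choice_idx, gamma=1.0):
    """Importance-sampling update of particle weights after a choice response.

    query_utils[i][j] is the utility of queried item i under particle j.
    Each particle is reweighted by the logistic choice likelihood
        P(pick i | w_j) = exp(gamma * u_i) / sum_i' exp(gamma * u_i').
    Returns the normalized weights and the effective sample size; when the
    ESS drops significantly, the particle set should be regenerated
    (e.g. by slice sampling [18] from the updated belief).
    """
    new_w = []
    for j in range(len(weights)):
        us = [gamma * query_utils[i][j] for i in range(len(query_utils))]
        mx = max(us)                                # stabilize the softmax
        z = sum(math.exp(u - mx) for u in us)
        new_w.append(weights[j] * math.exp(us[choice_idx] - mx) / z)
    total = sum(new_w)
    new_w = [w / total for w in new_w]
    ess = 1.0 / sum(w * w for w in new_w)           # effective number of particles
    return new_w, ess
```

For example, after observing one choice, particles whose sampled utilities favour the chosen item gain weight, and the ESS falls below the original particle count.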
Finally, we compare query optimization times on the two datasets in the following table:

             exactEPU  exactEUS  greedy(EPU,L)  QI(greedy(EUS,L))  greedy(EUS,L)  greedy(EUS,NL)  QI(sampling)  QI(rand)
n=30,  k=2      47.3s     10.3s           1.5s              0.76s          0.65s           0.12s         0.11s     0.11s
n=187, k=2      1815s      405s          9.19s              2.07s          1.97s           1.02s         0.15s     0.17s
n=187, k=4          -    10000s          39.7s              7.89s          7.71s           1.86s         0.16s     0.19s
n=187, k=6          -         -          87.1s              15.7s          15.4s           2.55s         0.51s     0.64s
n=506, k=2          -         -          14.6s              4.09s          3.99s           0.93s         0.05s     0.06s
n=506, k=4          -         -          64.9s              15.4s          15.2s           1.12s         0.08s     0.10s
n=506, k=6          -         -           142s              32.9s          32.8s           1.53s         0.09s     0.13s

Among our strategies, QI is certainly the most computationally efficient, and is best suited to large outcome spaces. Interestingly, QI is often faster with sampling initialization than with random initialization because it needs fewer iterations on average before convergence (3.1 vs. 4.0).

6 Conclusions

We have provided a novel analysis of set-based recommendations in Bayesian recommender systems, and have shown how it offers a tractable means of generating myopically optimal or near-optimal choice queries for preference elicitation. We examined several user response models, showing that optimal recommendation sets are EVOI-optimal queries under noiseless and constant noise models; and that they are near-optimal under the logistic/Luce-Sheppard model (both theoretically and practically). We stress that our results are general and do not depend on the specific implementation of Bayesian update, nor on the specific form of the utility function. Our greedy strategies, which exploit the submodularity of EUS computation, perform very well in practice and have theoretical approximation guarantees.
Finally, our experimental results demonstrate that query iteration, a simple local search strategy, is especially well-suited to large decision spaces.

A number of important directions for future research remain. Further theoretical and practical investigation of local search strategies such as query iteration is important. Another direction is the development of strategies for Bayesian recommendation and elicitation in large-scale configuration problems, e.g., where outcomes are specified by a CSP, and for sequential decision problems (such as MDPs with uncertain rewards). Finally, we are interested in elicitation strategies that combine probabilistic and regret-based models.

Acknowledgements  The authors would like to thank Iain Murray and Cristina Manfredotti for helpful discussion on Monte Carlo methods, sampling techniques and particle filters. This research was supported by NSERC.

References

[1] Ali Abbas. Entropy methods for adaptive utility elicitation. IEEE Transactions on Systems, Science and Cybernetics, 34(2):169–178, 2004.

[2] Craig Boutilier. A POMDP formulation of preference elicitation problems. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI-02), pp. 239–246, Edmonton, 2002.

[3] Craig Boutilier, Relu Patrascu, Pascal Poupart, and Dale Schuurmans. Constraint-based optimization and utility elicitation using the minimax decision criterion. Artificial Intelligence, 170(8–9):686–713, 2006.

[4] Craig Boutilier, Richard S. Zemel, and Benjamin Marlin. Active collaborative filtering. In Proc. 19th Conference on Uncertainty in Artificial Intelligence (UAI-03), pp. 98–106, Acapulco, 2003.

[5] Darius Braziunas and Craig Boutilier. Minimax regret-based elicitation of generalized additive utilities. In Proc.
23rd Conference on Uncertainty in Artificial Intelligence (UAI-07), pp. 25–32, Vancouver, 2007.

[6] U. Chajewska and D. Koller. Utilities as random variables: Density estimation and structure discovery. In Proc. 16th Conference on Uncertainty in Artificial Intelligence (UAI-00), pp. 63–71, Stanford, 2000.

[7] U. Chajewska, D. Koller, and R. Parr. Making rational decisions using adaptive utility elicitation. In Proc. 17th National Conference on Artificial Intelligence (AAAI-00), pp. 363–369, Austin, TX, 2000.

[8] Peter C. Fishburn. Interdependence and additivity in multivariate, unidimensional expected utility theory. International Economic Review, 8:335–342, 1967.

[9] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, 14(1):265–294, December 1978.

[10] Krzysztof Gajos and Daniel S. Weld. Preference elicitation for interface optimization. In Patrick Baudisch, Mary Czerwinski, and Dan R. Olsen, editors, UIST, pp. 173–182. ACM, 2005.

[11] Shengbo Guo and Scott Sanner. Real-time multiattribute Bayesian preference elicitation with pairwise comparison queries. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS-10), Sardinia, Italy, 2010.

[12] V. S. Iyengar, J. Lee, and M. Campbell. Q-Eval: Evaluating multiple attribute items using queries. In Proceedings of the Third ACM Conference on Electronic Commerce, pp. 144–153, Tampa, FL, 2001.

[13] Ralph L. Keeney and Howard Raiffa. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Wiley, New York, 1976.

[14] J. A. Konstan, B. N. Miller, D. Maltz, J. L. Herlocker, L. R. Gordon, and J. Riedl. GroupLens: Applying collaborative filtering to Usenet news. Communications of the ACM, 40(3):77–87, 1997.

[15] Jordan J. Louviere, David A. Hensher, and Joffre D.
Swait. Stated Choice Methods: Analysis and Application. Cambridge University Press, Cambridge, 2000.

[16] Christopher G. Lucas, Thomas L. Griffiths, Fei Xu, and Christine Fawcett. A rational model of preference learning and choice prediction by children. In Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems, pp. 985–992, Vancouver, Canada, 2008.

[17] Robert D. Luce. Individual Choice Behavior: A Theoretical Analysis. Wiley, New York, 1959.

[18] Radford M. Neal. Slice sampling. The Annals of Statistics, 31(3):705–767, 2003.

[19] A. Ng and S. Russell. Algorithms for inverse reinforcement learning. In Proc. 17th International Conference on Machine Learning (ICML-00), pp. 663–670, Stanford, CA, 2000.

[20] Robert Price and Paul R. Messinger. Optimal recommendation sets: Covering uncertainty over user preferences. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), pp. 541–548, 2005.

[21] James Reilly, Kevin McCarthy, Lorraine McGinty, and Barry Smyth. Incremental critiquing. Knowledge-Based Systems, 18(4–5):143–151, 2005.

[22] Olivier Toubia, John Hauser, and Duncan Simester. Polyhedral methods for adaptive choice-based conjoint analysis. Journal of Marketing Research, 41:116–131, 2004.

[23] Paolo Viappiani and Craig Boutilier. Regret-based optimal recommendation sets in conversational recommender systems. In Proceedings of the 3rd ACM Conference on Recommender Systems (RecSys-09), pp. 101–108, New York, 2009.

[24] Paolo Viappiani, Boi Faltings, and Pearl Pu. Preference-based search using example-critiquing with suggestions.
Journal of Artificial Intelligence Research, 27:465–503, 2006.