{"title": "Deconvolving Feedback Loops in Recommender Systems", "book": "Advances in Neural Information Processing Systems", "page_first": 3243, "page_last": 3251, "abstract": "Collaborative filtering is a popular technique to infer users' preferences on new content based on the collective information of all users preferences. Recommender systems then use this information to make personalized suggestions to users. When users accept these recommendations it creates a feedback loop in the recommender system, and these loops iteratively influence the collaborative filtering algorithm's predictions over time. We investigate whether it is possible to identify items affected by these feedback loops. We state sufficient assumptions to deconvolve the feedback loops while keeping the inverse solution tractable. We furthermore develop a metric to unravel the recommender system's influence on the entire user-item rating matrix. We use this metric on synthetic and real-world datasets to (1) identify the extent to which the recommender system affects the final rating matrix, (2) rank frequently recommended items, and (3) distinguish whether a user's rated item was recommended or an intrinsic preference. Our results indicate that it is possible to recover the ratings matrix of intrinsic user preferences using a single snapshot of the ratings matrix without any temporal information.", "full_text": "Deconvolving Feedback Loops\n\nin Recommender Systems\n\nAyan Sinha\n\nPurdue University\nsinhayan@mit.edu\n\nDavid F. Gleich\nPurdue University\n\nKarthik Ramani\nPurdue University\n\ndgleich@purdue.edu\n\nramani@purdue.edu\n\nAbstract\n\nCollaborative \ufb01ltering is a popular technique to infer users\u2019 preferences on new\ncontent based on the collective information of all users preferences. Recommender\nsystems then use this information to make personalized suggestions to users. 
When users accept these recommendations it creates a feedback loop in the recommender system, and these loops iteratively influence the collaborative filtering algorithm's predictions over time. We investigate whether it is possible to identify items affected by these feedback loops. We state sufficient assumptions to deconvolve the feedback loops while keeping the inverse solution tractable. We furthermore develop a metric to unravel the recommender system's influence on the entire user-item rating matrix. We use this metric on synthetic and real-world datasets to (1) identify the extent to which the recommender system affects the final rating matrix, (2) rank frequently recommended items, and (3) distinguish whether a user's rated item was recommended or an intrinsic preference. Our results indicate that it is possible to recover the ratings matrix of intrinsic user preferences using a single snapshot of the ratings matrix without any temporal information.

1 Introduction

Recommender systems have been helpful to users for making decisions in diverse domains such as movies, wines, food, and news, among others [19, 23]. However, it is well known that the interface of these systems affects users' opinions, and hence, their ratings of items [7, 24]. Thus, broadly speaking, a user's rating of an item is either his or her intrinsic preference or the influence of the recommender system (RS) on the user [2]. As these ratings implicitly affect recommendations to other users through feedback, it is critical to quantify the role of feedback in content personalization [22]. Thus the primary motivating question for this paper is: given only a user-item rating matrix, is it possible to infer whether any preference values are influenced by a RS? Secondary questions include: which preference values are influenced, and to what extent, by the RS? 
Furthermore, how do we recover the true preference value of an item to a user?
We develop an algorithm to answer these questions using the singular value decomposition (SVD) of the observed ratings matrix (Section 2). The genesis of this algorithm follows from viewing the observed ratings at any point in time as the union of true ratings and recommendations:

    R_obs = R_true + R_recom    (1)

where R_obs is the observed rating matrix at a given instant of time, R_true is the rating matrix due to users' true preferences for items (along with any external influences such as ads, friends, and so on), and R_recom is the rating matrix which indicates the RS's contribution to the observed ratings. Our more formal goal is to recover R_true from R_obs. But this is impossible without strong modeling assumptions; any rating is just as likely to be a true rating as due to the system.
Thus, we make strong, but plausible, assumptions about a RS. In essence, these assumptions prescribe a precise model of the recommender and prevent its effects from completely dominating the future.

30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

With these assumptions, we are able to mathematically relate R_true and R_obs. This enables us to find the centered rating matrix R_true (up to scaling). We caution readers that these assumptions are designed to create a model that we can tractably analyze, and they should not be considered limitations of our ideas. Indeed, the strength of this simplistic model is that we can use its insights and predictions to analyze far more complex real-world data. One example of this is that the notion of R_true is a convenient fiction that represents some idealized, unperturbed version of the ratings matrix. Our model and theory suggest that R_true ought to have some relationship with the observed ratings, R_obs. 
By studying these relationships, we will show that we gain useful insights into the strength of various feedback and recommendation processes in real data.
In that light, we use our theory to develop a heuristic, but accurate, metric to quantitatively infer the influence of a RS (or any set of feedback effects) on a ratings matrix (Section 3). Additionally, we propose a metric for evaluating the influence of a recommender system on each user-item rating pair. Aggregating these scores over all users helps identify putative highly recommended items. The final metrics for a RS provide insight into the quality of recommendations and argue, for example, that Netflix had a better recommender than MovieLens. The score is also sensitive to feedback processes akin to recommenders in all cases where we have ground-truth knowledge about such processes in the data.

2 Deconvolving feedback

We first state the equations and assumptions under which the true rating matrix is recoverable (or deconvolvable) from the observed matrix, and provide an algorithm to deconvolve using the SVD.

2.1 A model recommender system

Consider a ratings matrix R of dimension m × n, where m is the number of users and n is the number of items being rated. Users are denoted by subscript u, and items are denoted by subscript i, i.e., R_{u,i} denotes user u's rating for item i. As stated after equation (1), our objective is to decouple R_true from R_recom given the matrix R_obs. 
Although this problem seems intractable, we list a series of assumptions under which a closed-form solution for R_true is deconvolvable from R_obs alone.

Figure 1: Subfigure A shows a ratings matrix with recommender-induced ratings and true ratings; Subfigure B shows the feedback loop in a RS, wherein the observed ratings are a function of the true ratings and the ratings induced by the RS.

Assumption 1 The feedback in the RS occurs through the iterative process involving the observed ratings and an item-item similarity matrix S:¹

    R_obs = R_true + H ⊙ (R_obs S).    (2)

Here ⊙ indicates the Hadamard, or entrywise, product, given as (H ⊙ R)_{u,i} = H_{u,i} · R_{u,i}. This assumption is justified because in many collaborative filtering techniques, R_recom is a function of the observed ratings R_obs and the item-item similarity matrix S. The matrix H is an indicator matrix over the set of items where the user followed the recommendation and agreed with it. This matrix is completely unknown, and essentially unknowable without direct human interviews. The model RS equation (2) then iteratively updates R_obs based on commonly rated items by users. This key idea is illustrated in Figure 1. The recursion progressively fills in missing entries of the matrix R_obs starting from R_true. The recursions do not update R_true in our model of a RS. If we explicitly consider the state of the matrix R_obs after k iterations, R^{k+1}_obs, we get:

    R^{k+1}_obs = R_true + H^{(k)} ⊙ (R^k_obs S_k)
                = R_true + H^{(k)} ⊙ ( (R_true + H^{(k-1)} ⊙ (R^{k-1}_obs S_{k-1})) S_k ) = ...    (3)

Here S_k is the item-item similarity matrix induced by the observed matrix at state k. 
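To make the iteration in equation (3) concrete, here is a small numerical sketch in which a plain cosine item-item similarity and a Bernoulli acceptance mask stand in for S_k and H^(k); all function and variable names are ours, not from the paper.

```python
import numpy as np

def item_similarity(R):
    """A simple cosine item-item similarity induced by the current ratings."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0            # guard against empty item columns
    Rn = R / norms
    return Rn.T @ Rn

def feedback_iteration(R_true, steps=3, accept_prob=0.3, seed=0):
    """Iterate R_obs^{k+1} = R_true + H^{(k)} . (R_obs^k S_k), as in eq. (3).
    H^{(k)} is drawn as an independent Bernoulli mask each step (Assumption 2
    motivates replacing it by its acceptance probability in expectation)."""
    rng = np.random.default_rng(seed)
    R_obs = R_true.copy()
    for _ in range(steps):
        S = item_similarity(R_obs)                   # S_k induced at state k
        H = rng.random(R_true.shape) < accept_prob   # Bernoulli indicator mask
        R_obs = R_true + H * (R_obs @ S)
    return R_obs

R_true = np.random.default_rng(1).random((5, 4))
R_obs = feedback_iteration(R_true)
```

The sketch illustrates the key structural point of the model: R_true is never updated, while R_obs accumulates recommender-induced terms through the similarity matrix recomputed at every step.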
The above equation (3) is naturally initialized as R¹_obs = R_true along with the constraint S₁ = S_true, i.e., the similarity matrix at the first iteration is the similarity matrix induced by the matrix of true preferences, R_true. Thus, we see that R_obs is an implicit function of R_true and the set of similarity matrices S_k, S_{k-1}, ..., S₁.

¹For user-user similarities Ŝ, the derivations in this paper can be extended by considering the expression R^T_obs = R^T_true + H^T ⊙ (R^T_obs Ŝ). We restrict ourselves to item-item similarity, which is more popular in practice.

Assumption 2 The Hadamard product with H^{(k)} is approximated by a probability parameter α_k ∈ (0, 1].

We model the selection matrix H^{(k)} and its Hadamard product in expectation, and replace the successive matrices H^{(k)} with independent Bernoulli random matrices with probability α_k. Taking the expectation allows us to replace the matrix H^{(k)} with the probability parameter α_k itself:

    R^{k+1}_obs = R_true + α_k (R^k_obs S_k) = R_true + α_k ( (R_true + α_{k-1} (R^{k-1}_obs S_{k-1})) S_k ) = ...    (4)

The matrices S_k, S_{k-1}, ... are a priori unknown. We are now faced with the task of constructing a valid similarity metric. Towards this end, we make our next assumption.

Assumption 3 The user means R̄_u in the observed and true matrices are roughly equal: R̄^(obs)_u ≈ R̄^(true)_u. The Euclidean item norms ‖R_i‖ are also roughly equal: ‖R^(obs)_i‖ ≈ ‖R^(true)_i‖.

These assumptions are justified because ultimately we are interested in relative preferences of items for a user and unbiased relative ratings of items by users. These can be achieved by centering users and normalizing item ratings, respectively, in the true and observed ratings matrices. We quantitatively investigate this assumption in the supplementary material. Using this assumption, the similarity metric then becomes:

    S(i, j) = Σ_{u∈U} (R_{u,i} − R̄_u)(R_{u,j} − R̄_u) / ( sqrt(Σ_{u∈U} (R_{u,i} − R̄_u)²) · sqrt(Σ_{u∈U} (R_{u,j} − R̄_u)²) )    (5)

This metric is known as the adjusted cosine similarity, and is preferred over cosine similarity because it mitigates the effect of differing rating schemes across users [25]. Using the relations R̃_{u,i} = R_{u,i} − R̄_u and R̂_{u,i} = R̃_{u,i}/‖R̃_i‖ = (R_{u,i} − R̄_u)/sqrt(Σ_{u∈U} (R_{u,i} − R̄_u)²), the expression for our recommender (4) becomes:

    R̂_obs = R̂_true ( I + f₁(a₁) R̂ᵀ_true R̂_true + f₂(a₂) (R̂ᵀ_true R̂_true)² + f₃(a₃) (R̂ᵀ_true R̂_true)³ + ... )    (6)

Here, f₁, f₂, f₃, ... are functions of the probability parameters a_k = [α₁, α₂, ..., α_k, ...] of the form f_z(a_z) = c·α₁^{c₁}·α₂^{c₂}···α_k^{c_k}··· such that Σ_k c_k = z, where c is a constant. The proof of equation (6) is in the supplementary material. We see that the centering and normalization result in R̂_obs being explicitly represented in terms of R̂_true and the coefficients f(a). It is now possible to recover R̂_true, but the coefficients f(a) are a priori unknown. Thus, our next assumption.

Assumption 4 f_z(a_z) = α^z, i.e., the coefficients of the series (6) are induced by powers of a constant probability parameter α ∈ (0, 1].

Note that in recommender (3), R_obs becomes denser with every iteration, and hence the higher-order Hadamard products in the series fill fewer missing terms. 
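The adjusted cosine similarity of equation (5) amounts to user-centering followed by an item-wise cosine; a minimal numpy sketch (function name is ours) in which the normalized matrix `Rn` plays the role of R̂, so that S = R̂ᵀR̂:

```python
import numpy as np

def adjusted_cosine(R):
    """Adjusted cosine similarity, eq. (5): subtract each user's mean rating,
    then take the item-item cosine of the centered columns."""
    Rc = R - R.mean(axis=1, keepdims=True)   # R_{u,i} - Rbar_u (users are rows)
    norms = np.linalg.norm(Rc, axis=0)       # ||R~_i|| per item
    norms[norms == 0] = 1.0                  # guard against constant items
    Rn = Rc / norms                          # this is the matrix R-hat in the text
    return Rn.T @ Rn                         # S = R-hat^T R-hat

R = np.random.default_rng(0).random((6, 4))
S = adjusted_cosine(R)
```

Writing S as R̂ᵀR̂ is exactly what makes the series expansion in equation (6) possible.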
The effect of absorbing the unknowable probability parameters α_k into a single probability parameter α is similar: powers of α produce successively less of an impact, just as in the true model. The governing expression now becomes:

    R̂_obs = R̂_true ( I + α R̂ᵀ_true R̂_true + α² (R̂ᵀ_true R̂_true)² + α³ (R̂ᵀ_true R̂_true)³ + ... )    (7)

In order to ensure convergence of this equation, we make our final assumption.

Assumption 5 The spectral radius of the similarity matrix α R̂ᵀ_true R̂_true is less than 1.

This assumption enables us to write the infinite series representing R̂_obs, namely R̂_true(I + α R̂ᵀ_true R̂_true + α²(R̂ᵀ_true R̂_true)² + α³(R̂ᵀ_true R̂_true)³ + ...), as R̂_true (I − α R̂ᵀ_true R̂_true)⁻¹. It states that, given α, we scale the matrix R̂ᵀ_true R̂_true such that the spectral radius of α R̂ᵀ_true R̂_true is less than 1.² We are then able to recover R̂_true up to a scaling constant.

Discussion of assumptions. We now briefly discuss the implications of our assumptions. First, assumption 1 states the recommender model. Assumption 2 states that we are modeling expected behavior rather than actual behavior. Assumptions 3-5 are key to our method working. They essentially state that the RS's effects are limited in scope so that they cannot dominate the world. This has a few interpretations on real-world data. The first would be that we are considering the impact of the RS over a short time span. The second would be that the recommender effects are essentially second-order and that there is some other true effect which dominates them. We discuss the mechanism of solving equation (7) using the above set of five assumptions next.

²See [10] for details on scaling similarity matrices to ensure convergence.

Figure 2: (a) to (f): Our procedure for scoring ratings based on the deconvolved scores, with true initial ratings in cyan and ratings due to the recommender in red. (a) The observed and deconvolved ratings. (b) The RANSAC fit to extract a straight line passing through the data points for each item. (c) Rotation and translation of the data points using the fitted line such that the scatter plot is approximately parallel to the y-axis and recommender effects are distinguishable along the x-axis. (d) Scaling of the data points used for subsequent score assignment. (e) Score assignment using the vertex of the hyperbola with asymptote slope θ = 1 that passes through the data point. (f) Increasing α deconvolves implicit feedback loops to a greater extent and better discriminates recommender effects, as illustrated by the red points, which show more pronounced deviation when α = 1.

2.2 The algorithm for deconvolving feedback loops

Theorem 1 Assuming the RS follows (7), α is between 0 and 1, and the singular value decomposition of the observed rating matrix is R̂_obs = U Σ_obs Vᵀ, the deconvolved matrix R̂_true of true ratings is given as U Σ_true Vᵀ, where Σ_true is a diagonal matrix with elements:

    σ^true_i = −1/(2α σ^obs_i) + sqrt( 1/(4α² (σ^obs_i)²) + 1/α )    (8)

The proof of the theorem is in the supplementary material. In practical applications, the feedback loops are deconvolved by taking a truncated SVD (low-rank approximation) instead of the complete decomposition. In this process, we naturally concede accuracy for performance. We consider the matrix of singular values Σ̃_obs to contain only the k largest singular values (the other singular values are replaced by zero). We now state Algorithm 1 for deconvolving feedback loops. 
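As a concrete illustration, the singular-value remapping of equation (8) takes only a few lines of numpy; this is a minimal sketch (function names ours), assuming the input matrix is already user-centered and item-normalized as R̂_obs.

```python
import numpy as np

def deconvolve(R_obs_hat, alpha=1.0, k=None):
    """Deconvolve feedback per Theorem 1: keep the singular vectors of the
    observed (centered, normalized) matrix and remap each singular value
    via eq. (8): sigma_true = -1/(2*a*s) + sqrt(1/(4*a^2*s^2) + 1/a)."""
    U, s_obs, Vt = np.linalg.svd(R_obs_hat, full_matrices=False)
    if k is not None:                       # truncated-SVD variant
        U, s_obs, Vt = U[:, :k], s_obs[:k], Vt[:k, :]
    s_true = -1.0 / (2.0 * alpha * s_obs) + np.sqrt(
        1.0 / (4.0 * alpha**2 * s_obs**2) + 1.0 / alpha)
    return U @ np.diag(s_true) @ Vt

# Round-trip check under the model of eq. (7): forward-map known singular
# values via sigma_obs = sigma_true / (1 - alpha * sigma_true^2), then invert.
rng = np.random.default_rng(0)
Q1, _ = np.linalg.qr(rng.standard_normal((6, 2)))
Q2, _ = np.linalg.qr(rng.standard_normal((4, 2)))
s = np.array([0.5, 0.3])                    # spectral radius < 1 (Assumption 5)
R_true_hat = Q1 @ np.diag(s) @ Q2.T
R_obs_hat = Q1 @ np.diag(s / (1 - s**2)) @ Q2.T   # alpha = 1 forward map
```

Note that the truncation parameter k matters in practice: numerically zero singular values would blow up the 1/σ term in equation (8), so only the leading singular values are remapped.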
The algorithm is simple to compute, as it just involves a singular value decomposition of the observed ratings matrix.

Algorithm 1 Deconvolving Feedback Loops
Input: R_obs, α, k, where R_obs is the observed ratings matrix, α is the parameter governing feedback loops, and k is the number of singular values
Output: R̂_true, the true rating matrix
1: Compute R̃_obs given R_obs, where R̃_obs is the user-centered observed matrix
2: Compute R̂_obs ← R̃_obs D_N⁻¹, where R̂_obs is the item-normalized rating matrix and D_N is the diagonal matrix of item norms, D_N(i,i) = sqrt(Σ_{u∈U} (R_{u,i} − R̄_u)²)
3: Solve U Σ_obs Vᵀ ← SVD(R̂_obs, k), the truncated SVD corresponding to the k largest singular values
4: Compute σ^true_i ← −1/(2α σ^obs_i) + sqrt(1/(4α² (σ^obs_i)²) + 1/α) for all i
5: return U, Σ_true, Vᵀ

3 Results and recommender system scoring

We tested our approach for deconvolving feedback loops on a synthetic RS, and designed a metric to identify the ratings most affected by the RS. We then use the same automated technique to study real-world ratings data, and find that the metric is able to identify items influenced by a RS.

Figure 3: Results for a synthetic RS with controllable effects. (Left to right): (a) ROC curves obtained by varying data sparsity; (b) ROC curves obtained by varying the parameter α; (c) ROC curves obtained by varying the feedback exponent; (d) score assessing the overall recommendation effects as we vary the true effect.

3.1 Synthetic data simulating a real-world recommender system

We use item response theory to generate a sparse true rating matrix R_true using a model related to that in [12]. Let a_u be the center of user u's rating scale, and b_u be the rating sensitivity of user u. Let t_i be the intrinsic score of item i. 
We generate a user-item rating matrix as:

    R_{u,i} = L[a_u + b_u t_i + η_{u,i}]    (9)

where L[ω] is the discrete levels function assigning a score in the range 1 to 5, L[ω] = max(min(round(ω), 5), 1), and η_{u,i} is a noise term. In our experiment, we draw a_u ∼ N(3, 1), b_u ∼ N(0.5, 0.5), t_i ∼ N(0.1, 1), and η_{u,i} ∼ εN(0, 1), where N is a standard normal and ε is a noise parameter. We sample these ratings uniformly at random by specifying a desired level of rating sparsity γ, which yields the input, R_true, to our RS. We then run a cosine-similarity-based RS, progressively increasing the density of the rating matrix. The unknown ratings are iteratively updated using the standard item-item collaborative filtering technique [8] as

    R^{k+1}_{u,i} = Σ_j s^k_{i,j} R^k_{u,j} / Σ_j |s^k_{i,j}|,

where k is the iteration number and R⁰ = R_true, and the similarity measure at the kth iteration is given as

    s^k_{i,j} = Σ_{u∈U} R^k_{u,i} R^k_{u,j} / ( sqrt(Σ_{u∈U} (R^k_{u,i})²) · sqrt(Σ_{u∈U} (R^k_{u,j})²) ).

After the kth iteration, each synthetic user accepts the top r recommendations with probability proportional to (R^{k+1}_{u,i})^e, where e is an exponent controlling the frequency of acceptance. We fix the number of iterative updates to be 10 and r to be 10, and the resulting rating matrix is R_obs. We deconvolve R_obs as per Algorithm 1 to output R̂_true. Recall that R̂_true is user-centered and item-normalized. In the absence of any recommender effects R_recom, the expectation is that R̂_true is perfectly correlated with R̂_obs. The absence of a linear correlation hints at factors extraneous to the user, i.e., the recommender. 
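The generative model of equation (9) can be sketched as follows; this is a minimal numpy version in which the names, defaults, and the interpretation of the sparsity level γ as the kept fraction of entries are our assumptions.

```python
import numpy as np

def synthetic_true_ratings(m=1000, n=100, sparsity=0.1, eps=1.0, seed=0):
    """Generate R_true via the item-response model of eq. (9):
    R_{u,i} = L[a_u + b_u * t_i + eta_{u,i}], with L clipping to 1..5.
    Zeros in the returned matrix stand for missing (unsampled) ratings."""
    rng = np.random.default_rng(seed)
    a = rng.normal(3.0, 1.0, size=(m, 1))    # user rating-scale centers
    b = rng.normal(0.5, 0.5, size=(m, 1))    # user rating sensitivities
    t = rng.normal(0.1, 1.0, size=(1, n))    # intrinsic item scores
    eta = eps * rng.standard_normal((m, n))  # noise term eta_{u,i}
    R = np.clip(np.rint(a + b * t + eta), 1, 5)   # discrete levels L[.]
    mask = rng.random((m, n)) < sparsity     # keep a gamma-fraction of entries
    return R * mask

R_true = synthetic_true_ratings(50, 20, sparsity=0.5)
```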
Thus, we plot R̂_true (the deconvolved ratings) against R̂_obs, and search for characteristic signals that exemplify recommender effects (see Figure 2a and inset).

Table 1: Datasets and parameters

Dataset          | Users | Items | Ratings | Min RPI | k in SVD | Score
Jester-1         | 24.9K | 100   | 615K    | 1       | 100      | 0.0487
Jester-2         | 50.6K | 140   | 1.72M   | 1       | 140      | 0.0389
MusicLab-Weak    | 7149  | 48    | 25064   | 1       | 48       | 0.1073
MusicLab-Strong  | 7192  | 48    | 23386   | 1       | 48       | 0.1509
MovieLens-100K   | 943   | 603   | 83.2K   | 50      | 603      | 0.2834
MovieLens-1M     | 6.04K | 2514  | 975K    | 50      | 2514     | 0.3033
MovieLens-10M    | 69.8K | 7259  | 9.90M   | 50      | 1500     | 0.3821
BeerAdvocate     | 31.8K | 9146  | 1.35M   | 20      | 1500     | 0.2223
RateBeer         | 28.0K | 20129 | 2.40M   | 20      | 1500     | 0.1526
Fine Foods       | 130K  | 5015  | 329K    | 20      | 1500     | 0.1209
Wine Ratings     | 21.0K | 8772  | 320K    | 20      | 1500     | 0.1601
Netflix          | 480K  | 16795 | 100M    | 100     | 1500     | 0.2661

3.2 A metric to assess a recommender system

We develop an algorithm guided by the intuition that deviations of ratings from a straight line suggest recommender effects (Algorithm 2). The procedure is visually elucidated in Figure 2. We consider fitting a line to the observed and deconvolved (equivalently, estimated true) ratings; however, our experiments indicate that a least-squares fit of a straight line in the presence of severe recommender effects is not robust. The outliers in our formulation correspond to recommended items. Hence, we use random sample consensus, or the RANSAC method [11], to fit a straight line on a per-item basis (Figure 2b). All these straight lines are translated and rotated so as to coincide with the y-axis, as displayed in Figure 2c. Observe that the data points corresponding to recommended ratings pop out as a bump along the x-axis. 
Thus, the effect of the RANSAC fit and rotation is to place the ratings into a precise location. Next, the ratings are scaled so as to make the maximum absolute values of the rotated and translated ratings, R̆_true and R̆_obs, equal (Figure 2d).
The scores we design measure "extent" along the x-axis, while allowing some vertical displacement. The final score we assign is given by fitting a hyperbola through each rating viewed as a point (R̆_true, R̆_obs). A straight line of slope θ = 1 passing through the origin is fixed as an asymptote to all hyperbolas. The vertex of this hyperbola serves as the score of the corresponding data point: the higher the value of the vertex of the hyperbola associated with a data point, the more likely the data point is to be a recommended item. Using the relationship between the slope of the asymptote and the vertex of the hyperbola, the score s(R̆_true, R̆_obs) is given by:

    s(R̆_true, R̆_obs) = real( sqrt( R̆²_true − R̆²_obs ) )    (10)

We set the slope of the asymptote θ = 1 because the maximum magnitudes of R̆_true and R̆_obs are equal (see Figure 2d,e). The overall algorithm is stated in the supplementary material. Scores are zero if the point is inside the hyperbola with vertex 0.

3.3 Identifying high recommender effects in the synthetic system

We display the ROC curve of our algorithm for identifying recommended products in our synthetic simulation by varying the sparsity γ in R_true (Figure 3a), varying α (Figure 3b), and varying the exponent e for the acceptance probability (Figure 3c). The dimensions of the rating matrix are fixed at 1000 × 100, with 1000 users and 100 items. Decreasing α as well as γ has adverse effects on the ROC curve, and hence on AUC values, as is natural. 
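The score of equation (10) reduces to a one-line computation once each rating has been rotated, translated, and scaled into the point (R̆_true, R̆_obs); here is a minimal numpy sketch (function name ours), where taking the real part corresponds to clamping a negative discriminant to zero.

```python
import numpy as np

def rs_score(r_true, r_obs):
    """Per-rating score of eq. (10): the vertex of the hyperbola with
    asymptote slope 1 through the point (r_true, r_obs); zero for points
    inside the degenerate hyperbola (|r_true| <= |r_obs|)."""
    d = np.asarray(r_true, dtype=float) ** 2 - np.asarray(r_obs, dtype=float) ** 2
    return np.sqrt(np.maximum(d, 0.0))   # real(sqrt(.)) == clamp negatives to 0
```

For example, a point far out along the x-axis such as (5, 3) scores sqrt(25 − 9) = 4, while a point like (1, 2) lies inside the asymptotic cone and scores 0.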
The fact that high values of α produce more discriminative deconvolved ratings is clearly illustrated in Figure 2f. Additionally, Figure 3d shows that the calculated score varies linearly with the true score as we change the recommender exponent e, color-coded in the legend. Overall, our algorithm is remarkably successful in extracting recommended items from R_obs without any additional information. Also, we can score the overall impact of the RS (see the upcoming section on RS scores), and it accurately tracks the true effect of the RS.

3.4 Real data

In this subsection we validate our approach for deconvolving feedback loops on real-world RS data. First, we demonstrate that the deconvolved ratings are able to distinguish datasets that use a RS from those that do not. Second, we specify a metric that reflects the extent of RS effects on the final ratings matrix. Finally, we validate that the score returned by our algorithm is indicative of the recommender effects on a per-item basis. We use α = 1 in all experiments because it models the case when the recommender effects are strong and thus produces the highest discriminative effect between the observed and true ratings (see Figure 2f). This is likely to be the most useful setting, as our model is only an approximation.

Figure 4: (Left to right) A density plot of deconvolved and observed ratings on the Jester joke dataset (left), which had no feedback loops, and on the Netflix dataset (left center), where their Cinematch algorithm was running. The Netflix data shows dispersive effects indicative of a RS, whereas the Jester data is highly correlated, indicating no feedback system. A scatter plot of deconvolved and observed ratings on the MusicLab-Weak dataset (right center), which showed no download counts, and on the MusicLab-Strong dataset (right), which displayed the download counts. 
The MusicLab-Strong scatter plot shows higher dispersive effects, indicative of feedback effects.

Datasets. Table 1 lists all the datasets we use to validate our approach for deconvolving a RS (from [21, 4, 13]). The columns detail the name of the dataset, the number of users, the number of items, the number of ratings, the lower threshold for the number of ratings per item (RPI) considered in the input ratings matrix, the number of singular vectors k (as many as possible based on the limits of computer memory), and the RS score, respectively. The datasets are briefly discussed in the supplementary material.

Classification of ratings matrices. An example of the types of insights our method enables is shown in Figure 4. This figure shows four density plots of the estimated true ratings (y-axis) compared with the observed ratings (x-axis), starting with two datasets, Jester and Netflix. Higher density is indicated by darker shades in the scatter plot of observed and deconvolved ratings. If there is no RS, then these should be highly correlated. If there is a system with feedback loops, we should see a dispersive plot. In the first plot (Jester) we see the results for a real-world system without any RS or feedback loops; the second plot (Netflix) shows the results on the Netflix ratings matrix, which did have a RS impacting the data. A similar phenomenon is observed in the third and fourth plots of Figure 4, corresponding to the MusicLab datasets. We display the density plots of observed (y-axis) vs. deconvolved, or expected true, (x-axis) ratings for all datasets considered in our evaluation in the supplementary material.

Recommender system scores. The RS scores displayed in Table 1 are based on the fraction of ratings with non-zero score (using the score metric (10)). Recall that a zero score indicates that the data point lies outside the associated hyperbola and does not suffer from recommender effects. 
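The dataset-level RS score is then just the fraction of non-zero per-rating scores. A minimal sketch, assuming the per-rating scores from equation (10) have already been computed (function name ours):

```python
import numpy as np

def overall_rs_score(scores):
    """Dataset-level RS score: the fraction of per-rating scores (eq. (10))
    that are non-zero, i.e. the fraction of ratings flagged as affected
    by the recommender."""
    s = np.asarray(scores, dtype=float)
    return float(np.count_nonzero(s) / s.size)
```

For instance, if half of the ratings receive a non-zero score, the dataset's RS score is 0.5.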
Hence, the RS score is indicative of the fraction of ratings affected by the recommender. Looking at Table 1, we see that the two Jester datasets have low RS scores, validating that the Jester dataset did not run a RS. The MusicLab datasets show a weak effect because they do not include any type of item-item recommender. Nevertheless, the strong social influence condition scored higher for a RS because the simple download-count feedback elicits comparable effects. These cases give us confidence in our scores because we have a clear understanding of the feedback processes in the true data. Interestingly, the RS score progressively increases for the three versions of the MovieLens datasets: MovieLens-100K, MovieLens-1M and MovieLens-10M. This is expected, as the RS effects would have progressively accrued over time in these datasets. Note that Netflix is also lower than MovieLens, indicating that Netflix's recommender likely correlated better with users' true tastes. The RS scores associated with the alcohol datasets (RateBeer, BeerAdvocate and Wine Ratings) are higher compared to the Fine Foods dataset. This is surprising. We conjecture that this effect is due to common features that correlate with evaluations of alcohol, such as the age of a wine or the percentage of alcohol in a beer.

Figure 5: (Top to bottom) (a) Deconvolved ranking as a bar chart for T.V. shows. (b) Deconvolved ranking as a bar chart for Indian movies.

Ranking of items based on recommendation score. We associate a RS rating with each item as the mean score of that item over all users. All items are ranked in ascending order of RS score, and we first look at items with low RS scores. The Netflix dataset comprises movies as well as television shows. We expect that television shows are less likely to be affected by a RS because each season of a T.V. 
show requires a longer time commitment, and shows have their own following. To validate this expectation, we first identify all T.V. shows in the ranked list and compute the number of occurrences of a T.V. show in equally spaced bins of size 840. Figure 5 shows a bar chart of the number of occurrences, and we see that there are ≈ 90 T.V. shows in the first bin (i.e., the top 840 items as per the score). This is the highest count compared to all bins, and the number of occurrences progressively decreases as we move further down the list, validating our expectation. Also unsurprisingly, the seasons of the popular sitcom Friends comprise 10 out of the top 20 T.V. seasons with the lowest RS scores. It is also expected that Season 1 of a T.V. show is more likely to be recommended relative to subsequent seasons. We identified the top 40 T.V. shows with multiple (at least 2) seasons, and observed that 31 of these have a higher RS score for Season 1 relative to Season 2. The 9 T.V. shows where the converse is true are mostly comedies like Coupling, That 70's Show, etc., for which the seasons can be viewed independently of each other. Next, we looked at items with high RS scores. At the time the dataset was released, Netflix operated exclusively in the U.S., and one plausible use is that immigrants might use Netflix's RS to watch movies from their native country. We specifically looked at Indian films in the ranked list to validate this expectation. Figure 5b shows a bar chart similar to the one plotted for T.V. shows, and we observe an increasing trend along the ranked list for the number of occurrences of Indian films. 
The movie with the lowest recommendation score is Lagaan, the only Indian movie to be nominated for an Oscar in the last 25 years.

4 Discussion, related work and future work

Discussion: In this paper we propose a mechanism to deconvolve feedback effects in a RS, similar in spirit to the network deconvolution method used to distinguish direct dependencies in biological networks [10, 3]. Indeed, our approach can be viewed as a generalization of those methods to general rectangular matrices. We do so by considering only a ratings matrix at a given instant of time. Our approach depends on a few reasonable assumptions that enable us to create a tractable model of a RS. When we evaluate the resulting methods on synthetic and real-world datasets, we find that we are able to assess the degree of influence that a RS has had on those ratings. This analysis is also easy to compute, as it just involves a singular value decomposition of the ratings matrix.

Related Work: User feedback in collaborative filtering systems is categorized as either explicit feedback, which includes input by users regarding their interest in products [1], or implicit feedback, such as purchase and browsing history, search patterns, etc. [14]. Both types of feedback affect the item-item or user-user similarities used in the collaborative filtering algorithm for predicting future recommendations [16]. There has been a considerable amount of work on incorporating the information from these types of user feedback mechanisms into collaborative filtering algorithms in order to improve and personalize recommendations [15, 6]. Here, we do not focus on improving collaborative filtering algorithms for recommender systems by studying user feedback; instead, our thrust is to recover each user's true preference for an item, devoid of any rating bias introduced by the recommender system due to feedback. 
Another line of work based on user feedback in recommender systems concerns the exploration-exploitation tradeoff [20] associated with the training feedback loop in collaborative filtering algorithms [9]. This line of research evaluates 'what-if' scenarios, such as evaluating the performance of alternative collaborative filtering models or adapting the algorithm based on user-click feedback to maximize reward, using approaches like the multi-armed bandit setting [17, 18] or counterfactual learning systems [5]. In contrast, we tackle the problem of recovering the true ratings matrix as it would be if feedback loops were absent.

Future Work: In the future we wish to analyze the effect of feeding the deconvolved ratings, with putative feedback effects removed, back into the RS. Some variants of our method include fixing parameters that our current approach treats as unknown (such as S) when they are known a priori. Incorporating temporal information from different snapshots in time while deconvolving the feedback loops is also an interesting line of future work. From another viewpoint, our approach can serve the active learning community as a way to unbias data and reveal additional insights regarding the feedback loops considered in this paper. Overall, we believe that deconvolving feedback loops opens new gateways for understanding ratings and recommendations.

Acknowledgements: David Gleich would like to acknowledge the support of the NSF via awards CAREER CCF-1149756, IIS-1422918, IIS-1546488, and the Center for Science of Information STC, CCF-093937, as well as the support of DARPA SIMPLEX.

References

[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng., 17(6):734-749, June 2005.

[2] X. Amatriain, J. M. Pujol, N. Tintarev, and N.
Oliver. Rate it again: Increasing recommendation accuracy by user re-rating. In RecSys, pp. 173-180, 2009.

[3] B. Barzel and A.-L. Barabási. Network link prediction by global silencing of indirect correlations. Nature Biotechnology, 31(8):720-725, 2013.

[4] J. Bennett and S. Lanning. The Netflix prize. In Proceedings of the KDD Cup Workshop, pp. 3-6, 2007.

[5] L. Bottou, J. Peters, J. Quiñonero-Candela, D. X. Charles, D. M. Chickering, E. Portugaly, D. Ray, P. Simard, and E. Snelson. Counterfactual reasoning and learning systems: The example of computational advertising. Journal of Machine Learning Research, 14:3207-3260, 2013.

[6] L. Chen, G. Chen, and F. Wang. Recommender systems based on user reviews: the state of the art. User Modeling and User-Adapted Interaction, 25(2):99-154, 2015.

[7] D. Cosley, S. K. Lam, I. Albert, J. A. Konstan, and J. Riedl. Is seeing believing? How recommender system interfaces affect users' opinions. In CHI, pp. 585-592, 2003.

[8] M. Deshpande and G. Karypis. Item-based top-n recommendation algorithms. ACM Trans. Inf. Syst., 22(1):143-177, Jan. 2004.

[9] B. Edelman, M. Ostrovsky, and M. Schwarz. Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords. American Economic Review, 97(1):242-259, 2007.

[10] S. Feizi, D. Marbach, M. Medard, and M. Kellis. Network deconvolution as a general method to distinguish direct dependencies in networks. Nature Biotechnology, 31(8):726-733, July 2013.

[11] M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381-395, June 1981.

[12] D. F. Gleich and L.-H. Lim. Rank aggregation via nuclear norm minimization. In KDD, pp. 60-68, 2011.

[13] K. Goldberg, T. Roeder, D. Gupta, and C. Perkins.
Eigentaste: A constant time collaborative filtering algorithm. Inf. Retr., 4(2):133-151, July 2001.

[14] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In ICDM, pp. 263-272, 2008.

[15] G. Jawaheer, M. Szomszor, and P. Kostkova. Comparison of implicit and explicit feedback from an online music recommendation service. In Proceedings of the Workshop on Information Heterogeneity and Fusion in Recommender Systems, pp. 47-51, 2010.

[16] N. Lathia, S. Hailes, L. Capra, and X. Amatriain. Temporal diversity in recommender systems. In SIGIR, pp. 210-217, 2010.

[17] L. Li, W. Chu, J. Langford, and R. E. Schapire. A contextual-bandit approach to personalized news article recommendation. In WWW, pp. 661-670, 2010.

[18] W. Li, X. Wang, R. Zhang, Y. Cui, J. Mao, and R. Jin. Exploitation and exploration in a performance based contextual advertising system. In KDD, pp. 27-36, 2010.

[19] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76-80, Jan. 2003.

[20] J. G. March. Exploration and exploitation in organizational learning. Organization Science, 2(1):71-87, 1991.

[21] J. J. McAuley and J. Leskovec. From amateurs to connoisseurs: Modeling the evolution of user expertise through online reviews. In WWW, pp. 897-908, 2013.

[22] R. S. Poston and C. Speier. Effective use of knowledge management systems: A process model of content ratings and credibility indicators. MIS Quarterly, 29(2):221-244, 2005.

[23] F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor. Recommender Systems Handbook. Springer-Verlag, New York, 2010.

[24] M. J. Salganik, P. S. Dodds, and D. J. Watts. Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311(5762):854-856, 2006.

[25] B. Sarwar, G. Karypis, J.
Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, pp. 285-295, 2001.