Context Effects in Category Learning: An Investigation of Four Probabilistic Models

Part of Advances in Neural Information Processing Systems 19 (NIPS 2006)

Bibtex Metadata Paper


Michael C. Mozer, Michael Shettel, Michael Holmes


Categorization is a central activity of human cognition. When an individual is asked to categorize a sequence of items, context effects arise: categorization of one item influences category decisions for subsequent items. Specifically, when experimental subjects are shown an exemplar of some target category, the category prototype appears to be pulled toward the exemplar, and the prototypes of all nontarget categories appear to be pushed away. These push and pull effects diminish with experience, and likely reflect long-term learning of category boundaries. We propose and evaluate four principled probabilistic (Bayesian) accounts of context effects in categorization. In all four accounts, the probability of an exemplar given a category is encoded as a Gaussian density in feature space, and categorization involves computing category posteriors given an exemplar. The models differ in how the uncertainty distribution of category prototypes is represented (localist or distributed), and how it is updated following each experience (using a maximum likelihood gradient ascent, or a Kalman filter update). We find that the distributed maximum-likelihood model can explain the key experimental phenomena. Further, the model predicts other phenomena that were confirmed via reanalysis of the experimental data.

Categorization is a key cognitive activity. We continually make decisions about characteristics of objects and individuals: Is the fruit ripe? Does your friend seem unhappy? Is your car tire flat? When an individual is asked to categorize a sequence of items, context effects arise: categorization of one item influences category decisions for subsequent items. Intuitive naturalistic scenarios in which context effects occur are easy to imagine. For example, if one lifts a medium-weight object after lifting a light-weight or heavy-weight object, the medium weight feels heavier following the light weight than following the heavy weight. Although the object-contrast effect might be due to fatigue of sensory-motor systems, many context effects in categorization are purely cognitive and cannot easily be attributed to neural habituation. For example, if you are reviewing a set of conference papers, and the first three in the set are dreadful, then even a mediocre paper seems like it might be above threshold for acceptance. Another example of a category boundary shift due to context is the following. Suppose you move from San Diego to Pittsburgh and notice that your neighbors repeatedly describe muggy, somewhat overcast days as "lovely." Eventually, your notion of what constitutes a lovely day accommodates to your new surroundings. As we describe shortly, experimental studies have shown a fundamental link between context effects in categorization and long-term learning of category boundaries. We believe that context effects can be viewed as a reflection of a trial-to-trial learning, and the cumulative effect of these trial-to-trial modulations corresponds to what we classically consider to be category learning. Consequently, any compelling model of category learning should also be capable of explaining context effects.

1 Experimental Studies of Context Effects in Categorization Consider a set of stimuli that vary along a single continuous dimension. Throughout this paper, we use as an illustration circles of varying diameters, and assume four categories of circles defined ranges of diameters; call them A, B, C, and D, in order from smallest to largest diameter.

In a classification paradigm, experimental subjects are given an exemplar drawn from one category and are asked to respond with the correct category label (Zotov, Jones, & Mewhort, 2003). After making their response, subjects receive feedback as to the correct label, which we'll refer to as the target. In a production paradigm, subjects are given a target category label and asked to produce an exemplar of that category, e.g., using a computer mouse to indicate the circle diameter (Jones & Mewhort, 2003). Once a response is made, subjects receive feedback as to the correct or true category label for the exemplar they produced. Neither classification nor production task has sequential structure, because the order of trial is random in both experiments. The production task provides direct information about the subjects' internal representations, because subjects are producing exemplars that they consider to be prototypes of a category, whereas the categorization task requires indirect inferences to be made about internal representations from reaction time and accuracy data. Nonetheless, the findings in the production and classification tasks mirror one another nicely, providing converging evidence as to the nature of learning. The production task reveals how mental representations shift as a function of trial-to-trial sequences, and these shifts cause the sequential pattern of errors and response times typically observed in the classification task. We focus on the production task in this paper because it provides a richer source of data. However, we address the categorization task with our models as well. Figure 1 provides a schematic depiction of the key sequential effects in categorization. The horizontal line represents the stimulus dimension, e.g., circle diameter. The dimension is cut into four regions labeled with the corresponding category. The category center, which we'll refer to as the prototype, is indicated by a vertical dashed line. The long solid vertical line marks the current exemplar--whether it is an exemplar presented to subjects in the classification task or an exemplar generated by subjects in the production task. Following an experimental trial with this exemplar, category prototypes appear to shift: the target-category prototype moves toward the exemplar, which we refer to as a pull effect, and all nontarget-category prototypes move away from the exemplar, which we refer to as a push effect. Push and pull effects are assessed in the production task by examining the exemplar produced on the following trial, and in the categorization task by examining the likelihood of an error response near category boundaries. The set of phenomena to be explained are as follows, described in terms of the production task. All numerical results referred to are from Jones and Mewhort (2003). This experiment consisted of 12 blocks of 40 trials, with each category label given as target 10 times within a block. Within-category pull: When a target category is repeated on successive trials, the exemplar generated on the second trial moves toward the exemplar generated on the first trial, with respect to the true category prototype. Across the experiment, a correlation coefficient of 0.524 is obtained, and remains fairly constant over trials. Between-category push: When the target category changes from one trial to the next, the exemplar generated on the second trial moves away from the exemplar generated on the first trial (or equivalently, from the prototype of the target category on the first trial). Figure 2a summarizes the sequential push effects from Jones and Mewhort. The diameter of the circle produced on trial t is plotted as a function of the target category on trial t - 1, with one line for each of the four trial t targets. The mean diameter for each target category is subtracted out, so the absolute vertical offset of each line is unimportant. The main feature of the data to note is that all four curves have a negative slope, which has the following meaning: the smaller that target t - 1 is (i.e., the further to the left on the x axis in Figure 1), the larger the response to target t is (further to the right in Figure 1), and vice versa, reflecting a push away from target t - 1. Interestingly and importantly, the magnitude of the push increases with the ordinal distance between targets t - 1 and t. Figure 2a is based on data from only eight subjects and is therefore noisy, though the effect is statistically reliable. As further evidence, Figure 2b shows data from a categorization task (Zotov et al., 2003), where the y-axis is a different dependent measure, but the negative slope has the same interpretation as in Figure 2a. example

Figure 1: Schematic depiction of sequential effects in categorization