{"title": "Intersecting regions: The Key to combinatorial structure in hidden unit space", "book": "Advances in Neural Information Processing Systems", "page_first": 27, "page_last": 33, "abstract": null, "full_text": "Intersecting regions: The key to combinatorial \n\nstructure in hidden unit space \n\nJanet Wiles \nDepts of Psychology and \nComputer Science, \nUniversity of Queensland \nQLD 4072 Australia. \njanetw@cs.uq.oz.au \n\nMark Ollila, \nVision Lab, CITRI \nDept of Computer Science, \nUniversity of Melbourne, \nVic 3052 Australia \nmolly@vis.citri.edu.au \n\nAbstract \n\nHidden units in multi-layer networks form a representation space in which each \nregion can be identified with a class of equivalent outputs (Elman, 1989) or a \nlogical state in a finite state machine (Cleeremans, Servan-Schreiber & \nMcClelland, 1989; Giles, Sun, Chen, Lee, & Chen, 1990). We extend the \nanalysis of the spatial structure of hidden unit space to a combinatorial task, \nbased on binding features together in a visual scene. The logical structure \nrequires a combinatorial number of states to represent all valid scenes. On \nanalysing our networks, we find that the high dimensionality of hidden unit \nspace is exploited by using the intersection of neighboring regions to represent \nconjunctions of features. These results show how combinatorial structure can \nbe based on the spatial nature of networks, and not just on their emulation of \nlogical structure. \n\n1 TECHNIQUES FOR ANALYSING THE SPATIAL AND \nLOGICAL STRUCTURE OF HIDDEN UNIT SPACE \n\nIn multi-layer networks, regions of hidden unit space can be identified with classes of \nequivalent outputs. For example, Elman (1989) showed that the hidden unit patterns for \nwords in simple grammatical sentences cluster into regions, with similar patterns \nrepresenting similar grammatical entities. 
For example, different tokens of the same word are clustered tightly, indicating that they are represented within a small region. These regions can be grouped into larger regions, reflecting a hierarchical structure. The largest \n\n27 \n\n\f28 \n\nWiles and Ollila \n\ngroups represent the abstract categories, nouns and verbs. Elman used cluster analysis to demonstrate this hierarchical grouping, and principal component analysis (PCA) to show dimensions of variation in the representation in hidden unit space. \n\nAn alternative approach to Elman's hierarchical clustering is to identify each region with a functional state. By tracing the trajectories of sequences through the different regions, an equivalent finite state machine (FSM) can be constructed. This approach has been described using Reber grammars with simple recurrent networks (Cleeremans, Servan-Schreiber & McClelland, 1989) and higher-order networks (Giles, Sun, Chen, Lee, & Chen, 1990). Giles et al. showed that the logical structure of the grammars is embedded in hidden unit space by identifying each region with a state, extracting the equivalent finite state machine from the set of states, and then reducing it to the minimal FSM. \n\nClustering and FSM extraction demonstrate different aspects of representations in hidden unit space. Elman showed that regions can be grouped hierarchically and that dimensions of variation can be identified using PCA, emphasizing how the functionality is reflected in the spatial structure. Giles et al. extracted the logical structure of the finite state machine in a way that represented the logical states independently of their spatial embedding. There is an inherent trade-off between the spatial and logical analyses: In one sense, the FSM is the idealized version of a grammar, and indeed for the Reber grammars, Giles et al. found improved performance on the extracted FSMs over the trained networks. 
However, the states of the FSM increase combinatorially with the size of the input. If there is information encoded in the hierarchical grouping of regions or relative spatial arrangement of clusters, the extracted FSM cannot exploit it. \n\nThe basis of the logical equivalence of a FSM and the hidden unit representations is that disjoint regions of hidden unit space represent separate logical states. In previous work, we reversed the process of identifying clusters with states of a FSM, by using prior knowledge of the minimal FSM to label hidden unit patterns from a network trained on sequences from three temporal functions (Wiles & Bloesch, 1992). Canonical discriminant analysis (CDA, Cliff, 1987) was then used to view the hidden unit patterns clustered into regions that corresponded to the six states of the minimal FSM. \n\nIn this paper we explore an alternative interpretation of regions. Instead of considering disjoint regions, we view each region as a sub-component lying at the intersection of two or more larger regions. For example, in the three-function simulations, the six clusters can be interpreted in terms of three large regions that identify the three possible temporal functions, overlapping with two large regions that identify the output of the network (see Figure 1). The six states can then be seen as combinations of the three function and two output classes (i.e., 5 large overlapping regions instead of 6 smaller disjoint ones). While the three-function simulation does provide a clear demonstration of the intersecting structure of regions, nonetheless, only six states are required to represent the minimal FSM and harder tasks are needed to demonstrate combinatorial representations. 
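The factorization described above can be sketched as a Cartesian product: the region labels come from the three-function task of Figure 1, while the enumeration itself is our own illustrative code, not the trained network.

```python
from itertools import product

# The three temporal-function regions and the two output regions:
# their pairwise intersections give the six states of the minimal FSM.
functions = ['XOR', 'AND', 'OR']  # vertical regions, first canonical component
outputs = [0, 1]                  # horizontal regions, third canonical component

states = list(product(functions, outputs))

# Five overlapping regions (3 + 2) suffice to encode six disjoint states.
assert len(states) == 6
assert len(functions) + len(outputs) == 5
print(states)
```

The point of the sketch is the count: the number of disjoint states grows multiplicatively, while the number of overlapping regions needed to distinguish them grows only additively.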
\n\n2 SIMULATIONS OF THE CONJUNCTION OF COLOR, SHAPE AND LOCATION \n\nThe representation of combinatorial structure is an important aspect of any computational task because of the drastic implications of combinatorial explosion for scaling. The intersection of regions is a concise way to represent all possible combinations of different items. We demonstrate this idea applied to the analysis of a hidden unit space representation of conjunctions of colors, shapes and locations. \n\nFigure 1. Intersecting regions in hidden unit space. Hidden unit patterns from the three-function task of Wiles and Bloesch (1992) are shown projected onto the first and third canonical components. Each temporal function, XOR, AND and OR, is represented by a vertical region, separated along the first canonical component. The possible outputs, 0 and 1, are represented by horizontal regions, separated down the third canonical component. The states of the finite state machine are represented by the regions in the intersections of the vertical and horizontal regions. (Adapted from Wiles & Bloesch, 1992, Figure 1b.) \n\n
In our task, a scene consists of zero or more objects, each object identified by its color, shape and location. The number of scenes, C, is given by C = (sc+1)^l, where s, c, and l are the numbers of shapes, colors and locations respectively. This problem illustrates several important components: There is no unique representation of an object in the input or output - each object is represented only by the presence of a shape and color at a given location. The task of the network is to create hidden unit representations for all possible scenes, each containing the features themselves, and the binding of features to position. \n\nThe simulations involved two locations, three possible shapes and three colors (100 legitimate scenes). A 12-20-12 encoder network was trained on the entire set of scenes and the hidden unit patterns for each scene were recorded. Analysis using CDA with 10 groups designating all possible combinations of zero, one or two colors showed that the hidden unit space was partitioned into intersecting regions corresponding to the three colors or no color (see Figure 2a). CDA was repeated using groups designating all combinations of shapes, which showed an alternative partitioning into four intersecting regions related to the component shapes (see Figure 2b). Figures 2a and 2b show alternate two-dimensional projections of the 20-dimensional space. The analyses showed that each hidden unit pattern was contained in many different groupings, such as all objects that are red, all triangles, or all red triangles. In linguistic terms, each hidden unit pattern corresponds to a token of a feature, and the region containing all tokens of a given group corresponds to its abstract type. The interesting aspect of this representation is that the network had learnt not only how to separate the groups, but also to use overlapping regions. 
Thus given a region that represents a circle and one representing a triangle, the intersection of the two regions implies a scene that has both a circle and a triangle. \n\nGiven suitable groups, the perspectives provided by CDA show many different abstract types within the hidden unit space. For example, scenes can be grouped according to the number of objects in a scene, or the number of squares in a scene. We were initially surprised that contiguous regions exist for representing scenes with zero, one and two objects, since the output units only require representations of individual features, such as square or circle, and not the abstraction to \"any shape\", or even more abstract, \"any object\". It seems plausible that the separation of these regions is due to the high dimensionality provided by 12-20-12 mappings. The excess degrees of freedom in hidden unit space can encode variation in the inputs that is not necessarily required to complete the task. With fewer hidden units, we would expect that variation in the input patterns that is not required for completing the task would be compressed or lost under the competing requirement of maximally separating functionally useful groups in the hidden unit space. This explanation found support in a second simulation, using a 12-8-12 encoder network. Whereas analysis of the 12-20-12 network showed separation of patterns into disjoint regions by number of objects, the smaller 12-8-12 network did not. Overall, our analyses showed that as the number of dimensions increases, additional aspects of scenes may be represented, even if those aspects are not required for the task that the network is learning. 
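The scene space and the conjunction-by-intersection idea can be checked by direct enumeration. This is a sketch using our own scene encoding (tuples of optional shape-color objects), not the network's 12-unit input code:

```python
from itertools import product

shapes = ['triangle', 'circle', 'square']
colors = ['red', 'green', 'blue']
locations = 2

# Each location is either empty or holds one (shape, color) object,
# giving (s*c + 1) options per location and (s*c + 1)**l scenes.
options = [None] + list(product(shapes, colors))
scenes = list(product(options, repeat=locations))
assert len(scenes) == (len(shapes) * len(colors) + 1) ** locations == 100

# A 'region' is the set of scenes containing a given feature; the
# intersection of two regions represents their conjunction.
def region(feature):
    return {i for i, scene in enumerate(scenes)
            if any(obj is not None and feature in obj for obj in scene)}

both = region('circle') & region('triangle')
print(len(both))  # scenes containing both a circle and a triangle -> 18
```

Representing regions as sets makes the analogy explicit: abstract types (all circles, all triangles) are large sets, and conjunctions are their intersections, with no extra machinery per combination.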
\n\n2A LEGEND \n\nRegion 1 Any scene with red \n2 Any scene with blue \n3 Any scene with green \n4 Any scene with 0 or 1 color \n\nA Scenes with red & green objects \nB Scenes with red & blue objects \nC Scenes with 1 red object \nD Scenes with 2 red objects \nE Scenes with green & blue objects \nF Scenes with 1 green object \nG Scenes with 2 green objects \nH Scenes with 1 blue object \nI Scenes with 2 blue objects \nJ Scenes with no objects \n\n2B LEGEND \n\nRegion 1 Any scene with a triangle \n2 Any scene with a circle \n3 Any scene with a square \n4 Scenes with 0 or 1 object \n\nA Scenes with a triangle & a circle \nB Scenes with a triangle & a square \nC Scenes with a single triangle \nD Scenes with 2 triangles \nE Scenes with a circle & a square \nF Scenes with a single circle \nG Scenes with 2 circles \nH Scenes with a single square \nI Scenes with 2 squares \nJ Scenes with no objects \n\nFigure 2. CDA plots showing the representations of features in a scene. A scene consists of zero, one or two objects, represented in terms of color, shape and location. 2a. Patterns labelled by color: Hidden unit patterns form ten distinct clusters, which have been grouped into four intersecting regions, 1-4. 
For example, the hidden unit patterns within region 1 all contain at least one red object, those in region 2 contain at least one blue one, and those in the intersection of regions 1 and 2 contain one red and one blue object. 2b. Patterns labelled by shape: Again the hidden unit patterns form ten distinct clusters, which have been grouped into four intersecting regions; however, these regions represent scenes with the same shape. 2a and 2b show alternate groupings of the same hidden unit space, projected onto different canonical components. The two projections can be combined in the mind's eye (albeit with some difficulty) to form a four-dimensional representation of the spatial structure of intersecting regions of both color and shape. \n\n3 THE SPATIAL STRUCTURE OF HIDDEN UNIT SPACE IS ISOMORPHIC TO THE COMBINATORIAL STRUCTURE OF THE VISUAL MAPPING TASK \n\nIn conclusion, the simulations demonstrate how combinatorial structure can be embedded in the spatial nature of networks in a way that is isomorphic to the combinatorial structure of the task, rather than by emulation of logical structure. In our approach, the representation of intersecting regions is the key to providing combinatorial representations. If the visual mapping task were extended by including a feature specifying the color of the background scene (e.g., blue or green) the number of possible scenes would double, as would the number of states in a FSM. By contrast, in the hidden unit representation, the additional feature would involve adding two more overlapping regions to those currently supported by the spatial structure. This could be implemented by dividing hidden unit space along an unused dimension, orthogonal to the current groups. \n\nThe task presented in this case study is extremely simplified, in order to expose the intrinsic combinatorial structure required in binding. 
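The scaling contrast above can be made concrete with a small count. The arithmetic is ours, extrapolating from the example in the text: four intersecting regions each in Figures 2a and 2b, plus a hypothetical binary background-color feature.

```python
# Base task: 3 shapes x 3 colors per location, 2 locations.
s, c, l = 3, 3, 2
scenes = (s * c + 1) ** l   # 100 scenes, hence 100 FSM states
regions = 4 + 4             # 4 color regions + 4 shape regions (Figures 2a, 2b)

# A binary background-color feature (blue or green) doubles the number
# of scenes, and so doubles the number of FSM states ...
scenes_with_background = scenes * 2
# ... but the overlapping-region representation only needs two more
# regions, one per background color, along an unused dimension.
regions_with_background = regions + 2

assert scenes_with_background == 200
assert regions_with_background == 10
```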
Despite the simplifications, it does contain elements of tasks that face real cognitive systems. In the simulations above, individual objects can be clustered by their shape or color, or whole scenes by other properties, such as the number of squares in the scene. These representations provide a concise and easily accessible structure that solves the combinatorial problem of binding several features to one object, in such a way as to represent the individual object, and yet also allow efficient access to its component features. The flexibility of such access processes is one of the main motivations for tensor models of human memory (Humphreys, Bain & Pike, 1989) and analogical reasoning (Halford et al., in press). Our analysis of spatial structure in terms of intersecting regions has a straightforward interpretation in terms of tensors, and provides a basis for future work on network implementations of the tensor memory and analogical reasoning models. \n\nAcknowledgements \n\nWe thank Simon Dennis and Steven Phillips for their canonical discriminant program. This work was supported by grants from the Australian Research Council. \n\nReferences \n\nCleeremans, A., Servan-Schreiber, D., and McClelland, J.L. (1989). Finite state automata and simple recurrent networks, Neural Computation, 1, 372-381. \n\nCliff, N. (1987). Analyzing Multivariate Data. Harcourt Brace Jovanovich, Orlando, Florida. \n\nElman, J. (1989). Representation and structure in connectionist models. CRL Technical Report 8903, Center for Research in Language, University of California, San Diego, 26pp. \n\nGiles, C. L., Sun, G. Z., Chen, H. H., Lee, Y. C., and Chen, D. (1990). Higher Order Recurrent Networks. In D.S. Touretzky (ed.) Advances in Neural Information Processing Systems 2, Morgan-Kaufmann, San Mateo, Ca., 380-387. 
\n\nHalford, G.S., Wilson, W.H., Guo, J., Wiles, J. and Stewart, J.E.M. Connectionist implications for processing capacity limitations in analogies. To appear in K.J. Holyoak & J. Barnden (Eds.), Advances in Connectionist and Neural Computation Theory, Vol 2: Analogical Connections. Norwood, NJ: Ablex, in press. \n\nHumphreys, M.S., Bain, J.D., and Pike, R. (1989). Different ways to cue a coherent memory system: A theory of episodic, semantic and procedural tasks, Psychological Review, 96 (2), 208-233. \n\nWiles, J. and Bloesch, A. (1992). Operators and curried functions: Training and analysis of simple recurrent networks. In J. E. Moody, S. J. Hanson, and R. P. Lippmann (Eds.) Advances in Neural Information Processing Systems 4, Morgan-Kaufmann, San Mateo, Ca. \n", "award": [], "sourceid": 691, "authors": [{"given_name": "Janet", "family_name": "Wiles", "institution": null}, {"given_name": "Mark", "family_name": "Ollila", "institution": null}]}