{"title": "Analyzing Cross-Connected Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 1117, "page_last": 1124, "abstract": null, "full_text": "Analyzing Cross-Connected Networks\n\nThomas R. Shultz\nDepartment of Psychology &\nMcGill Cognitive Science Centre\nMcGill University\nMontreal, Quebec, Canada H3A 1B1\nshultz@psych.mcgill.ca\n\nand\n\nJeffrey L. Elman\nCenter for Research on Language\nDepartment of Cognitive Science\nUniversity of California at San Diego\nLa Jolla, CA 92093-0126 U.S.A.\nelman@crl.ucsd.edu\n\nAbstract\n\nThe non-linear complexities of neural networks make network solutions difficult to understand. Sanger's contribution analysis is here extended to the analysis of networks automatically generated by the cascade-correlation learning algorithm. Because such networks have cross connections that supersede hidden layers, standard analyses of hidden unit activation patterns are insufficient. A contribution is defined as the product of an output weight and the associated activation on the sending unit, whether that sending unit is an input or a hidden unit, multiplied by the sign of the output target for the current input pattern. Intercorrelations among contributions, as gleaned from the matrix of contributions x input patterns, can be subjected to principal components analysis (PCA) to extract the main features of variation in the contributions. Such an analysis is applied to three problems: continuous XOR, arithmetic comparison, and distinguishing between two interlocking spirals. In all three cases, this technique yields useful insights into network solutions that are consistent across several networks.\n\n1 INTRODUCTION\n\nAlthough neural network researchers are typically impressed with the performance achieved by their learning networks, it often remains a challenge to explain or even characterize such performance. 
The latter difficulties stem principally from the complex non-linear properties of neural nets and from the fact that information is encoded in a form that is distributed across many weights and units. The problem is exacerbated by the fact that multiple nets generate unique solutions depending on variation in both starting states and training patterns.\n\nTwo techniques for network analysis have been applied with some degree of success, focusing respectively on either a network's weights or its hidden unit activations. Hinton (e.g., Hinton & Sejnowski, 1986) pioneered a diagrammatic analysis that involves plotting a network's learned weights. Occasionally, such diagrams yield interesting insights, but often, because of the highly distributed nature of network representations, the most notable features of such analyses are the complexity of the pattern of weights and its variability across multiple networks learning the same problem.\n\nStatistical analysis of the activation patterns on the hidden units of three-layered feed-forward nets has also proven somewhat effective in understanding network performance. The relations among hidden unit activations, computed from a matrix of hidden units x input patterns, can be subjected to either cluster analysis (Elman, 1990) or PCA (Elman, 1989) to determine the way in which the hidden layer represents the various inputs. However, it is not clear how this technique should be extended to multi-layer networks or to networks with cross connections.\n\nCross connections are direct connections that bypass intervening hidden layers. Cross connections typically speed up learning when used in static back-propagation networks (Lang & Witbrock, 1988) and are an obligatory and ubiquitous feature of some generative learning algorithms, such as cascade-correlation (Fahlman & Lebiere, 1990). 
Generative algorithms construct their own network topologies as they learn. In cascade-correlation, this is accomplished by recruiting new hidden units into the network, as needed, installing each on a separate layer. In addition to layer-to-layer connections, each unit in a cascade-correlation network is fully cross connected to all non-adjacent layers downstream. Because such cross connections carry so much of the workload, any analysis restricted to hidden unit activations provides a partial picture of the network solution at best.\n\nGenerative networks seem to provide a number of advantages over static networks, including more principled network design, leaner networks, faster learning, and more realistic simulations of human cognitive development (Fahlman & Lebiere, 1990; Shultz, Schmidt, Buckingham, & Mareschal, in press). Thus, it is important to understand how these networks function, even if they seem impervious to standard analytical tools.\n\n2 CONTRIBUTION ANALYSIS\n\nOne analytical technique that might be adapted for multi-layer, cross connected nets is contribution analysis (Sanger, 1989). Sanger defined a contribution as the triple product of an output weight, the activation of a sending unit, and the sign of the output target for that input. He argued that contributions are potentially more informative than either weights alone or hidden unit activations alone. A large weight may not contribute much if it is connected to a sending unit with a small activation. Likewise, a large sending activation may not contribute much if it is connected via a small weight. In contrast, considering a full contribution, using both weight and sending activation, would more likely yield valid comparisons.\n\nSanger (1989) applied contribution analysis to a small version of NETtalk, a net that learns to convert written English into spoken English (Sejnowski & Rosenberg, 1987). 
Sanger's analysis began with the construction of an output unit x hidden unit x input pattern array of contributions. Various two-dimensional slices were taken from this three-dimensional array, each representing a particular output unit or a particular hidden unit. Each two-dimensional slice was then subjected to PCA, yielding information about either distributed or local hidden unit responsibilities, depending on whether the focus was on an individual output unit or individual hidden unit, respectively.\n\n3 CONTRIBUTION ANALYSIS FOR MULTI-LAYER, CROSS CONNECTED NETS\n\nWe adapted contribution analysis for use with multi-layered, cross connected cascade-correlation nets. Assume a cascade-correlation network with j units (input units + hidden units) and k output units, being trained with i input patterns. There are j x k output weights in such a network, where an output weight is defined as any weight connected to an output unit. A contribution c for a particular ijk combination is defined as\n\ncijk = wjk aij 2tki    (1)\n\nwhere wjk is the weight connecting sending unit j with output unit k, aij is the activation of sending unit j given input pattern i, and tki is the target for output unit k given input pattern i. The term 2tki adjusts the sign of the contribution so that it provides a measure of correctness. That is, positive contributions push the output activation towards the target, whereas negative contributions push the output activation away from the target. In cascade-correlation, sigmoid output units have targets of either -0.5 or +0.5. Hence, multiplying a target by 2 yields a positive sign for positive targets and a negative sign for negative targets. Our term 2tki is analogous to Sanger's (1989) term 2tik - 1, which is appropriate for targets of 0 and 1, commonly used in back-propagation learning. 
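Equation 1 can be computed for all patterns, sending units, and output units at once. The vectorized sketch below is our own illustration, not code from the paper; the array names and shape conventions are assumptions:

```python
import numpy as np

def contributions(W, A, T):
    """Equation 1: c[i, j, k] = w[j, k] * a[i, j] * 2 * t[k, i].

    W: (j, k) output weights, sending unit j -> output unit k
    A: (i, j) sending-unit activations for each input pattern i
    T: (k, i) targets, +0.5 or -0.5 as in cascade-correlation
    """
    # The 2*t factor flips signs so that positive contributions are
    # exactly the ones pushing the output toward its target.
    return W[None, :, :] * A[:, :, None] * (2.0 * T.T[:, None, :])

# Toy check: 2 patterns, 3 sending units (inputs or hidden), 1 output unit.
W = np.array([[1.0], [-0.5], [2.0]])
A = np.array([[0.2, 0.4, 0.6],
              [0.1, 0.3, 0.5]])
T = np.array([[0.5, -0.5]])
C = contributions(W, A, T)   # shape (patterns, sending units, outputs)
```

Sending units here cover both input and hidden units, so `C` captures the cross connections as well as the layer-to-layer ones.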
\n\nIn contrast to Sanger's (1989) three-dimensional array of contributions (output unit x hidden unit x input pattern), we begin with a two-dimensional output weight (k * j) x input pattern (i) array of contributions. This is because we want to include all of the contributions coming into the output units, including the cross connections from more than one layer away. Since we begin with a two-dimensional array, we do not need to employ the somewhat cumbersome slicing technique used by Sanger to isolate particular output or hidden units. Nonetheless, as will be seen, our technique does allow the identification of the roles of specific contributions.\n\n4 PRINCIPAL COMPONENTS ANALYSIS\n\nCorrelations among the various contributions across input patterns are subjected to PCA. PCA is a statistical technique that identifies significant dimensions of variation in a multi-dimensional space (Flury, 1988). A component is a line of closest fit to a set of points in multi-dimensional space. The goal of PCA is to summarize a multivariate data set using as few components as possible. It does this by taking advantage of possible correlations among the variables (contributions, in our case).\n\nWe apply PCA to contributions, as defined in Equation 1, taken from networks learning three different problems: continuous XOR, arithmetic comparisons, and distinguishing between interlocking spirals. The contribution matrix for each net, as described in section 3, is subjected to PCA using 1.0 as the minimum eigenvalue for retention. Varimax rotation is applied to improve the interpretability of the solution. Then the scree test is applied to eliminate components that fail to account for much of the variance (Cattell, 1966). In cases where components are eliminated, the analysis is repeated with the correct number of components, again with a varimax rotation. Component scores for the retained components are plotted to provide an indication of the function of the components. 
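The core of this retention procedure (PCA on the correlations among contributions, keeping components with eigenvalues of at least 1.0) can be sketched with a plain eigendecomposition. This is our own sketch, not the authors' procedure: the varimax rotation and the scree test are omitted, and all names are assumptions.

```python
import numpy as np

def pca_of_contributions(C, min_eig=1.0):
    """PCA on the correlations among contributions across input patterns.

    C: (patterns, contributions) matrix, the (k*j) contributions for
       each input pattern laid out as columns.
    Returns (eigenvalues, loadings, scores) for components whose
    eigenvalue meets the min_eig retention rule.
    """
    Z = (C - C.mean(axis=0)) / C.std(axis=0)   # standardize each contribution
    R = np.corrcoef(C, rowvar=False)           # correlation matrix
    eigval, eigvec = np.linalg.eigh(R)
    order = np.argsort(eigval)[::-1]           # largest components first
    eigval, eigvec = eigval[order], eigvec[:, order]
    keep = eigval >= min_eig
    loadings = eigvec[:, keep] * np.sqrt(eigval[keep])  # variable loadings
    scores = Z @ eigvec[:, keep]               # component scores per pattern
    return eigval[keep], loadings, scores

# Synthetic stand-in for a contribution matrix: 100 patterns, 6 contributions,
# two of which are strongly correlated.
rng = np.random.default_rng(0)
C = rng.standard_normal((100, 6))
C[:, 1] = C[:, 0] + 0.1 * rng.standard_normal(100)
vals, loadings, scores = pca_of_contributions(C)
```

The correlated pair of columns produces a dominant first component, mimicking the way redundant contributions collapse onto shared components in the analyses reported here.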
Finally, component loadings for the various contributions are examined to determine the roles of the contributions from hidden units that had been recruited into the networks.\n\n5 APPLICATION TO THE CONTINUOUS XOR PROBLEM\n\nThe simplicity of binary XOR and the small number of training patterns (four) render application of contribution analysis superfluous. However, it is possible to construct a continuous version of the XOR problem that is more suitable for contribution analysis. We do this by dividing the input space into four quadrants. Input values are incremented in steps of 0.1 starting from 0.0 up to 1.0, yielding 100 x, y input pairs. Values of x up to 0.5 combined with values of y above 0.5 produce a positive output target (0.5), as do values of x above 0.5 combined with values of y below 0.5. Input pairs in the other two quadrants yield a negative output target (-0.5).\n\nThree cascade-correlation nets are trained on this problem. Each of the three nets generates a unique solution to the continuous XOR problem, with some variation in number of hidden units recruited. PCA of contributions yields different component loadings across the three nets and different descriptions of components. Yet with all of that variation in detail, it is apparent that all three nets make the same three distinctions that are afforded by the training patterns. The largest distinction is that which the nets are explicitly trained to make, between positive and negative outputs. Two components are sufficient to describe the representations. Plots of rotated component scores for the 100 training patterns cluster into four groups of 25 points, each cluster corresponding to one of the four quadrants described earlier. 
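The continuous XOR training set can be generated directly from the quadrant rule above. This sketch is ours, not the paper's code, and it makes two assumptions: a 0.1-1.0 grid (which yields exactly the 100 pairs described) and the boundary value 0.5 grouped with the "up to 0.5" half.

```python
import numpy as np

# 10 values per dimension (0.1 .. 1.0) -> 100 x, y training pairs.
vals = np.round(np.arange(1, 11) * 0.1, 1)
patterns, targets = [], []
for x in vals:
    for y in vals:
        # Opposite quadrants (the XOR-like cases) get the positive target.
        positive = (x <= 0.5) != (y <= 0.5)
        patterns.append((x, y))
        targets.append(0.5 if positive else -0.5)
```

Each quadrant contributes 25 patterns, matching the four clusters of 25 points seen in the rotated component scores.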
Component loadings for the various contributions on the two components indicate that the hidden units play an interactive and distributed role in separating the input patterns into their respective quadrants.\n\n6 APPLICATION TO COMPARATIVE ARITHMETIC\n\nA less well understood problem than XOR in neural net research is that of arithmetic operations, such as addition and multiplication. What has a net learned when it learns to add, or to multiply, or to do both operations? The non-linear nature of multiplication makes it particularly interesting as a network analysis problem. The fact that several psychological simulations using neural nets involve problems of linear and non-linear arithmetic operations enhances interest in this sort of problem (McClelland, 1989; Shultz et al., in press).\n\nWe designed arithmetic comparison tasks that provided interesting similarities to some of the psychological simulations. In particular, instead of simply adding or multiplying, the nets learn to compare sums or products to some value and then output whether the sum or product is greater than, less than, or equal to that comparative value.\n\nThe addition and multiplication tasks each involve three linear input units. The first two input units each code a randomly selected integer in the range from 0 to 9, inclusive. The third input unit codes a randomly selected comparison integer. For addition problems, the comparison values are in the range of 0 to 19, inclusive; for multiplication the range is 0 to 82, inclusive. Two output units code the results of the comparison. Target outputs of 0.5 and -0.5 represent that the results of the arithmetic operation are greater than the comparison value, targets of -0.5 and 0.5 represent less than, and targets of 0.5 and 0.5 represent equal to. 
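The target coding just described is easy to state as a function. This is our own sketch (the paper specifies only the targets, not any code); the `op` flag anticipates the combined-operations variant, where an extra input selects addition or multiplication:

```python
def comparison_targets(a, b, comparison, op=0):
    """Two-output-unit targets for the arithmetic comparison tasks.

    a, b: the two operand integers; comparison: the comparison value;
    op: 0 for addition, 1 for multiplication.
    """
    result = a * b if op == 1 else a + b
    if result > comparison:
        return (0.5, -0.5)   # greater than
    if result < comparison:
        return (-0.5, 0.5)   # less than
    return (0.5, 0.5)        # equal to
```

For example, `comparison_targets(3, 4, 10, op=1)` gives the greater-than pattern, since 3 * 4 = 12 exceeds 10.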
For problems involving both addition and multiplication, a fourth input unit codes the type of arithmetic operation to be performed: 0 for addition, 1 for multiplication.\n\nNets trained on either addition or multiplication have 100 randomly selected training patterns, with the restriction that 45 of them have correct answers of greater than, 45 have correct answers of less than, and 10 have correct answers of equal to. The latter constraints are designed to reduce the natural skew of comparative values in the high direction on multiplication problems. Nets trained on both addition and multiplication have 100 randomly selected addition problems and 100 randomly selected multiplication problems, subject to the constraints just described. We trained three nets on addition, three on multiplication, and three on both addition and multiplication.\n\n6.1 RESULTS FOR ADDITION\n\nPCA of contributions in all three addition nets yields two significant components. In each of the three nets, the component scores form three clusters, representing the three correct answers. In all three nets, the first component distinguishes greater than from less than answers and places equal to answers in the middle; the second component distinguishes equal to from unequal to answers. The primary role of the hidden unit in these nets is to distinguish equality from inequality. The hidden unit is not required to perform addition per se in these nets, which have additive activation functions.\n\n6.2 RESULTS FOR MULTIPLICATION\n\nPCA applied to the contributions in the three multiplication nets yields from 3 to 4 significant components. Plots of rotated component scores show that the first component separates greater than from less than outputs, placing equal to outputs in the middle. 
Other components further differentiate the problems in these categories into several smaller groups that are related to the particular values being multiplied. Rotated component loadings indicate that component 1 is associated not only with contributions coming from the bias unit and the input units, but also with contributions from some hidden units. This underscores the need for hidden units to capture the non-linearities inherent to multiplication.\n\n6.3 RESULTS FOR BOTH ADDITION AND MULTIPLICATION\n\nPCA of contributions yields three components in each of the three nets taught to do both addition and multiplication. In addition to the familiar distinctions between greater than, less than, and equal to outputs found in nets doing either addition or multiplication, it is of interest to determine whether nets doing both operations distinguish between adding and multiplying.\n\nFigure 1 shows the rotated component scores for net 1. Components 1 and 2 (accounting for 30.2% and 21.9% of the variance, respectively) together distinguish greater than answers from the rest. Component 3, accounting for 20.2% of the variance, separates equal to answers from less than answers and multiplication from addition for greater than answers. Together, components 2 and 3 separate multiplication from addition for less than answers. Results for the other two nets learning both multiplication and addition comparisons are essentially similar to those for net 1.\n\nFigure 1. Rotated component scores for a net doing both addition and multiplication. 
\n\n6.4 DISCUSSION OF COMPARATIVE ARITHMETIC\n\nAs with continuous XOR, there is considerable variation among networks learning comparative arithmetic problems. Varying numbers of hidden units are recruited by the networks and different types of components emerge from PCA of network contributions. In some cases, clear roles can be assigned to particular components, but in other cases, separation of input patterns relies on interactions among the various components.\n\nYet with all of this variation, it is apparent that the nets learn to separate arithmetic problems according to features afforded by the training set. Nets learning either addition or multiplication differentiate the problems according to answer types: greater than, less than, and equal to. Nets learning both arithmetic operations supplement these answer distinctions with the operational distinction between adding and multiplying.\n\n7 APPLICATION TO THE TWO-SPIRALS PROBLEM\n\nWe next apply contribution analysis to a particularly difficult discrimination problem requiring a relatively large number of hidden units. The two-spirals problem requires the net to distinguish between two interlocking spirals that wrap around their origin three times. The standard version of this problem has two sets of 97 continuous-valued x, y pairs, each set representing one of the spirals. The difficulty of the two-spirals problem is underscored by the finding that standard back-propagation nets are unable to learn it (Wieland, unpublished, cited in Fahlman & Lebiere, 1990). The best success to date on the two-spirals problem was reported with cascade-correlation nets, which learned in an average of 1700 epochs while recruiting from 12 to 19 hidden units (Fahlman & Lebiere, 1990). The relative difficulty of the two-spirals problem is undoubtedly due to its high degree of non-linearity. 
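The 2 x 97-point training set for this benchmark is commonly generated with the radius-and-angle scheme below (three wraps around the origin, with the second spiral the point-for-point mirror of the first). This is the benchmark's usual construction, not code from the paper:

```python
import math

def two_spirals(n=97):
    """Generate the two interlocking spirals: n points each, three wraps,
    with each spiral-2 point the 180-degree mirror (-x, -y) of a
    spiral-1 point."""
    spiral1, spiral2 = [], []
    for i in range(n):
        angle = i * math.pi / 16.0              # 96 steps -> 6*pi = 3 wraps
        radius = 6.5 * (104 - i) / 104.0        # shrinks toward the origin
        x, y = radius * math.sin(angle), radius * math.cos(angle)
        spiral1.append((x, y))    # target -0.5 in the coding used here
        spiral2.append((-x, -y))  # target +0.5
    return spiral1, spiral2

s1, s2 = two_spirals()
```

The mirror construction is worth noting: it is exactly the x, y versus -x, -y symmetry that the contribution analysis below turns out to exploit.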
It suited our need for a relatively difficult, but fairly well understood, problem on which to apply contribution analysis. We ran three nets using the 194 continuous x, y pairs as inputs and a single sigmoid output unit, signaling -0.5 for spiral 1 and 0.5 for spiral 2.\n\nBecause of the relative difficulty of interpreting plots of component scores for this problem, we focus primarily on the extreme component scores, defined as less than -1 or greater than 1. Those x, y input pairs with extreme component scores on the first two components for net 1 are plotted in Figure 2 as filled points on the two spirals. There are separate plots for the positive and negative ends of each of the two components. The filled points in each quadrant of Figure 2 define a shape resembling a tilted hourglass covering approximately one-half of the spirals. The positive end of component 1 can be seen to focus on the northeast sector of spiral 1 and the southwest sector of spiral 2. The negative end of component 1 has an opposite focus on the northeast sector of spiral 2 and the southwest sector of spiral 1. Component 2 does precisely the opposite of component 1: its positive end deals with the southeast sector of spiral 1 and the northwest sector of spiral 2, and its negative end deals with the southeast sector of spiral 2 and the northwest sector of spiral 1. Comparable plots for the other two nets show this same hourglass shape, but in a different orientation.\n\nThe networks appear to be exploiting the symmetries of the two spirals in reaching a solution. Examination of Figure 2 reveals the essential symmetries of the problem. For each x, y pair, there exists a corresponding -x, -y pair 180 degrees opposite and lying on the other spiral. Networks learn to treat these mirror image points similarly, as revealed by the fact that the plots of extreme component scores in Figure 2 are perfectly symmetrical across the two spirals. 
If a point on one spiral is plotted, then so is the corresponding point on the other spiral, 180 degrees opposite and at the same distance out from the center of the spirals. If a trained network learns that a given x, y pair is on spiral 1, then it also seems to know that the -x, -y pair is on spiral 2. Thus, it makes good sense for the network to represent these opposing pairs similarly.\n\nRecall that contributions are scaled by the sign of their targets, so that all of the products of sending activations and output weights for spiral 1 are multiplied by -1. This is to ensure that contributions bring output unit activations close to their targets in proportion to the size of the contribution. Ignoring this scaling by target, the networks possess sufficient information to separate the two spirals even though they represent points of the two spirals in similar fashion. The plot of the extreme component scores in Figure 2 suggests that the critical information for separating the two spirals derives mainly from the signs of the input activations.\n\nBecause scaling contributions by the sign of the output target appears to obscure a full picture of network solutions to the two-spirals problem, there may be some value in using unscaled contributions in network analysis. Use of unscaled contributions could also be justified on the grounds that the net has no knowledge of targets as it represents a particular problem; target information is only used in the error correction process. A disadvantage of using unscaled contributions is that one cannot distinguish contributions that facilitate vs. contributions that inhibit reaching a relatively error-free solution.\n\nThe symmetry of these network representations suggests a level of systematicity that is, on some accounts, not supposed to be possible in neural nets (Fodor & Pylyshyn, 1988). 
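Returning to the scaled versus unscaled distinction above: the unscaled variant amounts to dropping the 2t term from Equation 1. A minimal sketch, with array names that are our own assumptions:

```python
import numpy as np

def unscaled_contributions(W, A):
    """Equation 1 without the 2*t term: u[i, j, k] = w[j, k] * a[i, j].

    W: (j, k) output weights; A: (i, j) sending activations per pattern.
    Without the target factor, the sign no longer encodes correctness,
    only the direction in which each connection pushes the output.
    """
    return W[None, :, :] * A[:, :, None]

# Toy check: 2 sending units, 1 output unit, 1 input pattern.
W = np.array([[1.0], [-2.0]])
A = np.array([[0.5, 0.25]])
U = unscaled_contributions(W, A)
```

Multiplying `U` by the 2t factor for each pattern would recover the scaled contributions, which is why the two versions carry the same separating information but differ in what their signs mean.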
\nWhether this representational symmetry reflects systematicity in performance is another matter. One empirical prediction would be that as a net learns that x, y is on one spiral, it also learns at about the same time that -x, -y is on the other spiral. If confirmed, this would demonstrate a clear case of systematic cognition in neural nets.\n\n8 GENERAL DISCUSSION\n\nPerforming PCA on network contributions is here shown to be a useful technique for understanding the performance of networks constructed by the cascade-correlation learning algorithm. Because cascade-correlation nets typically possess multiple hidden layers and are fully cross connected, they are difficult to analyze with more standard methods emphasizing activation patterns on the hidden units alone. Examination of their weight patterns is also problematic, particularly in larger networks, because of the highly distributed nature of the net's representations.\n\nAnalyzing contributions, in contrast to either hidden unit activations or weights, is a naturally appealing solution. Contributions capture the influence coming into output units both from adjacent hidden units and from distant, cross connected hidden and input units. Moreover, because contributions include both sending activations and connecting weights, they are not unduly sensitive to one at the expense of the other.\n\nIn the three domains examined in the present paper, PCA of the network contributions both confirms some expected results and provides new insights into network performance. In all cases examined, the nets succeed in drawing all of the important distinctions in their representations that are afforded by the training patterns, whether these distinctions concern the type of output or the operation being performed on the input. In combination with further experimentation and analysis of network weights and activation patterns, this technique could help to provide an account of how networks accomplish whatever it is they learn to accomplish.\n\nIt might be of interest to apply the present technique at various points in the learning process to obtain a developmental trace of network performance. Would all networks learning under the same constraints progress through the same stages of development, in terms of the problem distinctions they are able to make? This would be of particular interest to network simulations of human cognitive development, which has been claimed to be stage-like in its progressions.\n\nThe present technique could also be useful in predicting the results of lesioning experiments on neural nets. If the role of a hidden unit can be identified by its association with a particular principal component, then it could be predicted that lesioning this unit would impair the function served by the component.\n\nAcknowledgments\n\nThis research was supported by the Natural Sciences and Engineering Research Council of Canada and the MacArthur Foundation. Helpful comments were provided by Scott Fahlman, Denis Mareschal, Yuriko Oshima-Takane, and Sheldon Tetewsky.\n\nReferences\n\nCattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.\n\nElman, J. L. (1989). Representation and structure in connectionist models. CRL Technical Report 8903, Center for Research in Language, University of California at San Diego.\n\nElman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.\n\nFahlman, S. E., & Lebiere, C. (1990). The Cascade-Correlation learning architecture. In D. Touretzky (Ed.), Advances in neural information processing systems 2 (pp. 524-532). Mountain View, CA: Morgan Kaufmann.\n\nFlury, B. (1988). 
Common principal components and related multivariate models. New York: Wiley.\n\nFodor, J., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.\n\nHinton, G. E., & Sejnowski, T. J. (1986). Learning and relearning in Boltzmann machines. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations, pp. 282-317. Cambridge, MA: MIT Press.\n\nLang, K. J., & Witbrock, M. J. (1988). Learning to tell two spirals apart. In D. Touretzky, G. Hinton, & T. Sejnowski (Eds.), Proceedings of the Connectionist Models Summer School (pp. 52-59). Mountain View, CA: Morgan Kaufmann.\n\nMcClelland, J. L. (1989). Parallel distributed processing: Implications for cognition and development. In Morris, R. G. M. (Ed.), Parallel distributed processing: Implications for psychology and neurobiology, pp. 8-45. Oxford University Press.\n\nRumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations, pp. 318-362. Cambridge, MA: MIT Press.\n\nSanger, D. (1989). Contribution analysis: A technique for assigning responsibilities to hidden units in connectionist networks. Connection Science, 1, 115-138.\n\nSejnowski, T. J., & Rosenberg, C. R. (1987). Parallel networks that learn to pronounce English text. Complex Systems, 1, 145-168.\n\nShultz, T. R., Schmidt, W. C., Buckingham, D., & Mareschal, D. (In press). Modeling cognitive development with a generative connectionist algorithm. In G. Halford & T. Simon (Eds.), Developing cognitive competence: New approaches to process modeling. Hillsdale, NJ: Erlbaum.\n\nFigure 2. Extreme rotated component scores for a net on the two-spirals problem.\n", "award": [], "sourceid": 812, "authors": [{"given_name": "Thomas", "family_name": "Shultz", "institution": null}, {"given_name": "Jeffrey", "family_name": "Elman", "institution": null}]}