{"title": "How the Poverty of the Stimulus Solves the Poverty of the Stimulus", "book": "Advances in Neural Information Processing Systems", "page_first": 51, "page_last": 58, "abstract": "", "full_text": "How the Poverty of the Stimulus \nSolves the Poverty of the Stimulus \n\nWilleIll ZuideIlla \n\nLanguage Evolution and Computation Research Unit \nand Institute for Cell, Animal and Population Biology \n\nUniversity of Edinburgh \n\n40 George Square, Edinburgh EH8 9LL, United Kingdom \n\njelle@ling.ed.ac.uk \n\nAbstract \n\nLanguage acquisition is a special kind of learning problem because \nthe outcome of learning of one generation is the input for the next. \nThat makes it possible for languages to adapt to the particularities \nof the learner. In this paper, I show that this type of language \nchange has important consequences for models of the evolution and \nacquisition of syntax. \n\n1 The Language Acquisition Problem \n\nFor both artificial systems and non-human animals, learning the syntax of natural \nlanguages is a notoriously hard problem. All healthy human infants, in contrast, \nlearn any of the approximately 6000 human languages rapidly, accurately and spon(cid:173)\ntaneously. Any explanation of how they accomplish this difficult task must specify \nthe (innate) inductive bias that human infants bring to bear, and the input data \nthat is available to them. Traditionally, the inductive bias is termed - somewhat un(cid:173)\nfortunately - \"Universal Grammar\", and the input data \"primary linguistic data\". \n\nOver the last 30 years or so, a view on the acquisition of the syntax of natural \nlanguage has become popular that has put much emphasis on the innate machinery. \nIn this view, that one can call the \"Principles and Parameters\" model, the Universal \nGrammar specifies most aspects of syntax in great detail [e.g. 1]. The role of \nexperience is reduced to setting a limited number (30 or so) of parameters. 
The main argument for this view is the argument from the poverty of the stimulus [2]. This argument states that children have insufficient evidence in the primary linguistic data to induce the grammar of their native language.

Mark Gold [3] provides the most well-known formal basis for this argument. Gold introduced the criterion "identification in the limit" for evaluating the success of a learning algorithm: with an infinite number of training samples, all hypotheses of the algorithm should be identical, and equivalent to the target. Gold showed that the class of context-free grammars is not learnable in this sense by any algorithm from positive samples alone (and neither is any other superfinite class). This proof is based on the fact that no matter how many samples from an infinite language a learning algorithm has seen, the algorithm cannot decide with certainty whether the samples are drawn from the infinite language or from a finite language that contains all samples. Because natural languages are thought to be at least as complex as context-free grammars, and negative feedback is assumed to be absent in the primary linguistic data, Gold's analysis, and subsequent work in learnability theory [1], is usually interpreted as strong support for the argument from the poverty of the stimulus, and, in the extreme, for the view that grammar induction is fundamentally impossible (a claim that Gold would not subscribe to).

Critics of this "nativist" approach [e.g. 4, 5] have argued for different assumptions on the appropriate grammar formalism (e.g. stochastic context-free grammars), the available primary data (e.g. semantic information) or the appropriate learnability criterion. In this paper I take a different approach. I present a model that induces context-free grammars without a priori restrictions on the search space, semantic information or negative evidence.
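To recall concretely what Gold's negative result rules out, consider a toy learner (an illustration of my own, not part of the model in this paper) that conservatively guesses that the language is exactly the finite set of strings seen so far. On any finite target it converges; on an infinite target such as {a^n : n >= 1} its hypothesis never stabilizes, because every finite sample is also consistent with the finite language containing just those strings:

```python
# Toy illustration of Gold's argument: a conservative learner whose
# hypothesis is always the finite set of samples seen so far. On any
# finite language this learner converges; on an infinite target it
# revises its hypothesis forever, so it never identifies in the limit.

def finite_guesser(samples):
    """Hypothesis: the language is exactly the strings observed."""
    return frozenset(samples)

seen = []
hypotheses = []
for n in range(1, 6):                 # target {a^n : n >= 1}, in order
    seen.append("a" * n)
    hypotheses.append(finite_guesser(seen))

# Every new sample forces a new hypothesis: the data never rule out the
# finite language consisting of just the strings observed so far.
changes = sum(h1 != h2 for h1, h2 in zip(hypotheses, hypotheses[1:]))
print(changes)  # 4 revisions in 5 presentations
```

Jumping to the infinite hypothesis early does not help either: a learner that does so will fail on the finite languages of the class, which is the dilemma behind the superfinite result.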
Gold's negative results thus apply. Nevertheless, acquisition of grammar is successful in my model, because another process is taken into account as well: the cultural evolution of language.

2 The Language Evolution Problem

Whereas in language acquisition research the central question is how a child acquires an existing language, in language evolution research the central question is how this language and its properties emerged in the first place. Within the nativist paradigm, some have suggested that the answer to this question is that Universal Grammar is the product of evolution under selection pressures for communication [e.g. 6]. Recently, several formal models have been presented to evaluate this view. For this paper, the most relevant of those is the model of Nowak et al. [7].

In that model it is assumed that there is a finite number of grammars, that newcomers (infants) learn their grammar from the population, that more successful grammars have a higher probability of being learned, and that mistakes are made in learning. The system can thus be described in terms of the changes in the relative frequencies x_i of each grammar type i in the population. The first result that Nowak et al. obtain is a "coherence threshold". This threshold is the necessary condition for grammatical coherence in a population, i.e. for a majority of individuals to use the same grammar. They show that this coherence depends on the probability that a child correctly acquires its parents' grammar. This probability is described with the parameter q. Nowak et al. show analytically that there is a minimum value of q needed to keep coherence in the population. If q is lower than this value, all possible grammar types are equally frequent in the population and the communicative success is minimal.
If q is higher than this value, one grammar type is dominant; the communicative success is much higher than before and reaches 100% if q = 1.

The second result relates this required fidelity (called q_1) to a lower bound (b_1) on the number of sample sentences that a child needs. Nowak et al. make the crucial assumption that all languages are equally expressive and equally different from each other. With that assumption they can show that b_1 is proportional to the total number of possible grammars N. Of course, the actual number of sample sentences b is finite; Nowak et al. conclude that only if N is relatively small can a stable grammar emerge in a population. That is, the population dynamics require a restrictive Universal Grammar.

The models of Gold and Nowak et al. have in common that they implicitly assume that every possible grammar is equally likely to become the target grammar for learning. If even the best possible learning algorithm cannot learn such a grammar, the set of allowed grammars must be restricted. There is, however, reason to believe that this assumption is not the most useful one for language learning. Language learning is a very particular type of learning problem, because the outcome of the learning process at one generation is the input for the next. The samples from which a child learns with its learning procedure are therefore biased by the learning of previous generations that used the same procedure [8].

In [9] and other papers, Kirby, Hurford and students have developed a framework to study the consequences of that fact. In this framework, called the "Iterated Learning Model" (ILM), a population of individuals is modeled that can each produce and interpret sentences, and that have a language acquisition procedure to learn grammar from each other.
In the ILM one individual (the parent) presents a relatively small number of examples of form-meaning pairs to the next individual (the child). The child then uses these examples to induce its own grammar. In the next iteration the child becomes the parent, and a new individual becomes the child. This process is repeated many times. Interestingly, Kirby and Hurford have found that over these iterated transmission steps the language becomes easier and easier to learn, because the language adapts to the learning algorithm by becoming more and more structured. The structure of language in these models thus emerges from the iteration of learning. The role of biological evolution, in this view, is to shape the learning algorithms, such that the complex result of iterated learning is biologically adaptive [10]. In this paper I will show that if one adopts this view on the interactions between learning, cultural evolution and biological evolution, models such as those of Gold [3] and Nowak et al. [7] can no longer be taken as evidence for an extensive, innate pre-specification of human language.

3 A Simple Model of Grammar Induction

To study the interactions between language adaptation and language acquisition, I have first designed a grammar induction algorithm that is simple, but can nevertheless deal with some non-trivial induction problems. The model uses context-free grammars to represent linguistic abilities. In particular, the representation is limited to grammars G where all rules are of one of the following forms: (1) A → t, (2) A → BC, (3) A → Bt. The nonterminals A, B, C are elements of the nonterminal alphabet V_nt, which includes the start symbol S; t is a string of terminal symbols from the terminal alphabet V_t.¹ For determining the language L of a certain grammar G I use simple depth-first exhaustive search of the derivation tree.
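A minimal sketch of such an enumeration (my own reconstruction in Python, not the paper's C++ implementation; the function and parameter names are mine, and the depth cap d and length cap l correspond to the limits described next) is the following, using as an example the grammar (c) derived in section 4 below:

```python
# Depth-first, depth- and length-limited enumeration of the string set L'
# of a context-free grammar in the restricted rule format used here.
# This is a reconstruction for illustration, not the paper's code.

def language(rules, nonterminals, start="S", d=6, l=10):
    """rules: list of (lhs, rhs) with rhs a string over terminals and
    nonterminals; returns the finite set L' of fully terminal strings."""
    out = set()

    def expand(form, depth):
        if depth > d or len(form) > l:      # prune deep or long derivations
            return
        i = next((k for k, c in enumerate(form) if c in nonterminals), -1)
        if i < 0:                           # no nonterminal left: a sentence
            out.add(form)
            return
        for lhs, rhs in rules:              # expand the leftmost nonterminal
            if lhs == form[i]:
                expand(form[:i] + rhs + form[i + 1:], depth + 1)

    expand(start, 0)
    return out

# Grammar (c) from section 4: S -> Xd, S -> Xabcd, X -> XX, X -> abc
rules = [("S", "Xd"), ("S", "Xabcd"), ("X", "XX"), ("X", "abc")]
print(sorted(language(rules, {"S", "X"}, d=6, l=12)))
```

The depth cap guarantees termination even for recursive rules such as X → XX, at the cost of making L' a finite (hence regular) subset of L.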
For computational reasons, the depth of the search is limited to a certain depth d, and the string length is limited to a length l. The set of sentences (L' ⊆ L) used in training and in communication is therefore finite (and strictly speaking not context-free, but regular); in production, strings are drawn from a uniform distribution over L'.

The grammar induction algorithm learns from a set of sample strings (sentences) that are provided by a teacher. The design of the learning algorithm is originally inspired by [11] and is similar to the algorithm in [12]. The algorithm fits within a tradition of algorithms that search for compact descriptions of the input data [e.g. 13, 14, 15]. It consists of three operations:

Incorporation: extend the language, such that it includes the encountered string; if string s is not already part of the language, add a rule S → s to the grammar.

¹Note that the restrictions on the rule types above do not limit the scope of languages that can be represented (they are essentially equivalent to Chomsky Normal Form). They are, however, relevant for the language acquisition algorithm.

Compression: substitute frequent and long substrings with a nonterminal, such that the grammar becomes smaller and the language remains unchanged; for every valid substring z of the right-hand sides of all rules, calculate the compression effect v(z) of substituting z with a nonterminal A; replace all valid occurrences of the substring z' = argmax_z v(z) with A if v(z') > 0, and add a rule A → z' to the grammar. "Valid substrings" are those substrings which can be replaced while keeping all rules in the forms 1-3 described above. The compression effect is measured as the difference between the number of symbols in the grammar before and after the substitution. The compression step is repeated until the grammar does not change anymore.
Generalization: equate two nonterminals, such that the grammar becomes smaller and the language larger; for every combination of two nonterminals A and B (B ≠ S), calculate the compression effect v of equating A and B. Equate the combination (A', B') = argmax_{A,B} v(A, B) if v(A', B') > 0, i.e. replace all occurrences of B with A. The compression effect is measured as the difference between the number of symbols before and after replacing and deleting redundant rules. The generalization step is repeated until the grammar does not change anymore.

4 Learnable and Unlearnable Classes

The algorithm described above is implemented in C++ and tested on a variety of target grammars.² I will not present a detailed analysis of the learning behavior here, but limit myself to a simple example that shows that the algorithm can learn some (recursive) grammars, while it cannot learn others. The induction algorithm receives three sentences (abcd, abcabcd, abcabcabcd). The incorporation, compression (repeated twice) and generalization steps subsequently yield the following grammars:

(a) Incorporation:
S → abcd
S → abcabcd
S → abcabcabcd

(b) Compression:
S → Yd
S → Xd
S → Xabcd
X → YY
Y → abc

(c) Generalization:
S → Xd
S → Xabcd
X → XX
X → abc

In (b) the substrings "abcabc" and "abc" are subsequently replaced by the nonterminals X and Y. In (c) the nonterminals X and Y are equated, which leads to the deletion of the second rule in (b). One can check that the total size of the grammar reduces from 24, to 19 and further down to 16 characters.

From this example it is also clear that learning is not always successful. Any of the three grammars above ((a) and (b) are equivalent) could have generated the training data, but with these three input strings the algorithm always yields grammar (c).
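The size reduction can be checked mechanically. The sketch below (a reconstruction of mine, not the paper's implementation) hard-codes the three grammars from the example and counts symbols the way the text does: one for the left-hand side plus one per right-hand-side symbol of every rule.

```python
# Check of the worked example: grammar size, counted as the total number
# of symbols (left-hand side plus right-hand side of every rule),
# shrinks from 24 to 19 to 16.

def size(rules):
    return sum(1 + len(rhs) for _, rhs in rules)

# (a) Incorporation: one flat rule per training sentence.
g_a = [("S", "abcd"), ("S", "abcabcd"), ("S", "abcabcabcd")]

# (b) Compression: substitute "abcabc" by X, then "abc" by Y.
g_b = [("S", "Yd"), ("S", "Xd"), ("S", "Xabcd"), ("X", "YY"), ("Y", "abc")]

# (c) Generalization: equate X and Y (replace Y by X, drop the duplicate rule).
g_c = [("S", "Xd"), ("S", "Xabcd"), ("X", "XX"), ("X", "abc")]

print(size(g_a), size(g_b), size(g_c))  # 24 19 16
```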
Consistent with Gold's general proof [3], many target grammars will never be learned correctly, no matter how many input strings are generated. In practice, each finite set of randomly generated strings from some target grammar might yield a different result. Thus, for some number of input strings T, some target grammars are always acquired, some are never acquired, and some are acquired only part of the time. If we can enumerate all possible grammars, we can describe this with a matrix Q, where each entry Q_ij gives the probability that the algorithm, learning from sample strings from a target grammar i, will end up with a grammar of type j. Q_ii is the probability that the algorithm finds the target grammar. To make learning successful, the target grammars that are presented to the algorithm have to be biased. The following section will show that for this we need nothing more than the assumption that the output of one learner is the input for the next.

²The source code is available at http://www.ling.ed.ac.uk/~jelle

5 Iterated Learning: the Emergence of Learnability

To study the effects of iterated learning, we extend the model with a population structure. In the new version of the model, individuals (agents, each representing a generation) are placed in a chain. The first agent induces its grammar from a number E of randomly generated strings. Every subsequent agent (the child) learns its grammar from T sample sentences that are generated by the previous one (the parent). To avoid insufficient expressiveness, we also extend the generalization step with a check whether the number E_G of different strings the grammar G can recognize is larger than or equal to E. If not, E - E_G random new strings are generated and incorporated in the grammar.
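The transmission chain just described can be written down schematically. The following is only a skeleton (the names and the stub learner are mine): the real model replaces `learn` with the induction algorithm of section 3, but the flow of data (T sample strings from parent to child, and padding with random strings up to the minimum expressiveness E) is as described above.

```python
import random

# Skeleton of the iterated-learning chain of section 5. The stub `learn`
# only memorizes the sample and pads with random strings up to E; the
# full model would induce a compressed, generalized grammar instead.

def random_string(rng, length=4, alphabet="abcd"):
    return "".join(rng.choice(alphabet) for _ in range(length))

def learn(sample, E, rng):
    lang = set(sample)                # incorporation only (stub learner)
    while len(lang) < E:              # guarantee minimum expressiveness E
        lang.add(random_string(rng))
    return lang

rng = random.Random(1)
T, E = 30, 20                         # parameters as in figure 1
lang = learn([], E, rng)              # generation 0: E random strings
for generation in range(10):          # the child of one step is the next parent
    sample = [rng.choice(sorted(lang)) for _ in range(T)]
    lang = learn(sample, E, rng)
print(len(lang))
```

With this stub learner nothing interesting happens except the maintenance of E strings; the emergence of structure described below depends on the compression and generalization operations acting inside the loop.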
Using the matrix Q from the previous section, we can formalize this iterated learning model with the following general equation, where x_i is the probability that grammar i is the grammar of the current generation:

ẋ_i = Σ_{j=0}^{N} x_j Q_ji     (1)

In simulations such as the one of figure 1, communicative success between child and parent - a measure for the learnability of a grammar - rises steadily from a low value (here 0.65) to a high value (here 1.0). In the initial stage the grammar shows no structure, and consequently almost every string that the grammar produces is idiosyncratic. A child in this stage typically hears strings like "ada", "ddac", "adba", "bcbd", or "cdca" from its parent. It cannot discover many regularities in these strings. The child therefore cannot do much better than simply reproduce the strings it heard (i.e. T random draws from at least E different strings), and generate random new strings if necessary to make sure its language obeys the minimum number (E) of strings. However, in these randomly generated strings, regularities sometimes appear. For instance, a parent may use the randomly generated strings "dcac", "bcac", "caac" and "daac". When this happens, the child tends to analyze these strings as different combinations with the building block "ac". Thus, typically, the learning algorithm generates a grammar with the rules S → dcX, S → bcX, S → caX, S → daX, and X → ac. When this happens to another set of strings as well, say with a new rule Y → b, the generalization procedure can decide to equate the nonterminals X and Y. The resulting grammar can then generalize from the observed strings to the unobserved strings "dcb", "bcb", "cab" and "dab". The child still needs to generate random new strings to reach the minimum E, but fewer than in the case considered above.
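Read as a generation-to-generation update, equation (1) describes a Markov chain on grammar types, and iterating it shows probability mass accumulating on whatever grammars the algorithm finds easy to learn. A toy example (the Q entries below are invented for illustration, not estimated from the induction algorithm):

```python
# Equation (1) as a discrete-generation update x_i' = sum_j x_j * Q[j][i],
# for three hypothetical grammar types. Q[j][i] is the probability that a
# learner exposed to grammar j ends up with grammar i (rows sum to 1).

Q = [
    [0.60, 0.20, 0.20],   # grammar 0: frequently mislearned
    [0.20, 0.60, 0.20],   # grammar 1: frequently mislearned
    [0.05, 0.05, 0.90],   # grammar 2: easy to learn, so errors are rare
]

def step(x, Q):
    n = len(x)
    return [sum(x[j] * Q[j][i] for j in range(n)) for i in range(n)]

x = [1 / 3, 1 / 3, 1 / 3]             # start with all grammars equally likely
for _ in range(200):                  # iterate the chain of learners
    x = step(x, Q)
print([round(v, 3) for v in x])       # the easily learnable grammar dominates
```

In the full model the Q matrix is not hand-specified but generated by the induction algorithm itself, which is what the generational story below illustrates.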
The interesting aspect of this becomes clear when we consider the next step in the simulation, when the child itself becomes the parent of a new child. This child is now presented with a language with more regularities than before, and has a fair chance of correctly generalizing to unseen examples. If, for instance, it only sees the strings "dcac", "bcac", "caac", "bcb", "cab" and "dab", it can, through the same procedure as above, infer that "daac" and "dcb" are also part of the target language. This means that (i) the child shares more strings with its parent than just the ones it observes, and consequently shows a higher between-generation communicative success, and (ii) regularities that appear in the language by chance have a fair chance to remain in the language. In the process of iterated learning, languages can thus become more structured and better learnable.

Figure 1: Iterated Learning: although initially the target language is unstructured and difficult to learn, over the course of 20 generations (a) the learnability (the fraction of successful communications with the parent) steadily increases, (b) the number of rules steadily decreases (combinatorial and recursive strategies are used), and (c) after an initial phase of overgeneralization, the expressiveness remains close to its minimally required level. Parameters: V_t = {a,b,c,d}, V_nt = {S,X,Y,Z,A,B,C}, T=30, E=20, l_0=3. Shown are the average values of 2 simulations.

Similar results with different formalisms were already reported before [e.g. 11, 16], but here I have used context-free grammars and the results are therefore directly relevant for the interpretation of Gold's proof [3]. Whereas in the usual interpretation of that proof [e.g. 1] it is assumed that we need
innate constraints on the search space in addition to a smart learning procedure, here I show that even a simple learning procedure can lead to successful acquisition, because restrictions on the search space automatically emerge in the iteration of learning. If one considers learnability a binary feature - as is common in generative linguistics - this is a rather trivial phenomenon: languages that are not learnable will not occur in the next generation. However, if there are gradations in learnability, the cultural evolution of language can be an intricate process where languages get shaped over many generations.

6 Language Adaptation and the Coherence Threshold

When we study this effect in a version of the model where selection does play a role, it also becomes relevant for the analysis in [7]. The model is therefore extended such that at every generation there is a population of agents, agents of one generation communicate with each other, and the expected number of offspring of an agent (its fitness) is determined by the number of successful interactions it had. Children still acquire their grammar from sample strings produced by their parent. Adapting equation 1, this system can now be described with the following equation, where x_i is now the relative fraction of grammar i in the population (assuming an infinite population size):

ẋ_i = Σ_{j=0}^{N} x_j f_j Q_ji - φ x_i     (2)

Here, f_i is the relative fitness (quality) of grammars of type i and equals f_i = Σ_j x_j F_ij, where F_ij is the expected communicative success of an interaction between an individual of type i and an individual of type j. The relative fitness f of a grammar thus depends on the frequencies of all grammar types; hence it is frequency dependent. φ is the average fitness in the population and equals φ = Σ_i x_i f_i. This
term is needed to keep the sum of all fractions at 1. This equation is essentially the model of Nowak et al. [7]. Recall that the main result of that paper is a "coherence threshold": a minimum value of the learning accuracy q needed to keep coherence in the population. In previous work [unpublished] I have reproduced this result and shown that it is robust against variations in the Q-matrix, as long as the value of q (i.e. the diagonal values) remains equal for all grammars.

Figure 2: Results from a run under fitness-proportional selection. This figure shows that there are regions of grammar space where the dynamics are apparently under the "coherence threshold" [7], while there are other regions where the dynamics are above this threshold. The parameters, including the number of sample sentences T, are still the same, but the language has adapted itself to the bias of the learning algorithm. Parameters are: V_t = {0,1,2,3}, V_nt = {S,a,b,c,d,e,f}, P=20, T=100, E=100, l_0=12. Shown are the average values of 20 agents.

Figure 2, however, shows results from a simulation with the grammar induction algorithm described above, where this condition is violated. Whereas in the simulations of figure 1 the target languages were relatively easy (the initial string length is short, i.e. 6), here the learning problem is very difficult (the initial string length is long, i.e. 12). For a long period the learning is therefore not very successful, but around generation 70 the success suddenly rises. With always the same T (number of sample sentences), and with always the same grammar space, there are regions where the dynamics are apparently under the "coherence threshold", while there are other regions where the dynamics are above this threshold. The language has adapted to the learning algorithm, and, consequently, the coherence in the population does not satisfy the prediction of Nowak et al.
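The baseline coherence threshold that this comparison relies on is easy to reproduce numerically. The sketch below uses a discrete-generation analogue of equation (2), x_i' = Σ_j x_j f_j Q_ji / φ, for the symmetric case analyzed by Nowak et al. [7]: uniform fidelity q on the diagonal of Q, uniform errors off it, and payoff F_ii = 1, F_ij = a for i ≠ j. The parameter values are illustrative choices of mine, not taken from either paper.

```python
# Discrete-generation analogue of equation (2) in the symmetric case:
# N grammars, payoff F_ii = 1 and F_ij = a otherwise, learning fidelity
# Q_ii = q and Q_ij = (1 - q) / (N - 1). Illustrative parameters only.

def equilibrate(N=10, a=0.2, q=0.95, steps=2000):
    x = [1.0 / N] * N
    x[0] += 1e-3                          # break the symmetry slightly
    s = sum(x)
    x = [v / s for v in x]
    for _ in range(steps):
        f = [sum(x[j] * (1.0 if i == j else a) for j in range(N))
             for i in range(N)]           # f_i = sum_j x_j F_ij
        phi = sum(x[i] * f[i] for i in range(N))
        x = [sum(x[j] * f[j] * (q if i == j else (1 - q) / (N - 1))
                 for j in range(N)) / phi
             for i in range(N)]           # normalized update; sum stays 1
    return x

high_q = max(equilibrate(q=0.95))   # above threshold: one grammar dominates
low_q = max(equilibrate(q=0.55))    # below threshold: near-uniform mixture
print(round(high_q, 2), round(low_q, 2))
```

With q above the threshold, the slightly perturbed grammar takes over the population; below it, the frequencies relax back to the uniform mixture, the regime the text describes as minimal communicative success.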
\n\n7 Conclusions \n\nI believe that these results have some important consequences for our thinking \nabout language acquisition. In particular, they offer a different perspective on the \nargument from the poverty of the stimulus, and thus on one of the most central \n\"problems\" of language acquisition research: the logical pmblern of lang'uage acqui(cid:173)\nsition. My results indicate that in iterated learning it is not necessary to put the \n(whole) explanatory burden on the representation bias. Although the details of the \ngrammatical formalism (context-free grammars) and the population structure are \ndeliberately close to [3] and [7] respectively, I do observe successful acquisition of \ngrammars from a class that is unlearn able by Gold's criterion. Further, I observe \ngrammatical coherence even though many more grammars are allowed in principle \nthan Nowak et al. calculate as an upper bound. The reason for these surprising \nresults is that language acquisition is a very particular type of learning problem: \nit is a problem where the target of the learning process is itself the outcome of a \nlearning process. That opens up the possibility of language itself to adapt to the \n\n\flanguage acquisition procedure of children. In such iterated learning situations [11], \nlearners are only presented with targets that other learners have been able to learn. \n\nIsn't this the traditional Universal Grammar in disguise'? Learnability is - consistent \nwith the undisputed proof of [3] - still achieved by constraining the set of targets. \nHowever, unlike in usual interpretations of this proof, these constraints are not \nstrict (some grammars are better learnable than others, allowing for an infinite \n\"Grammar Universe\"), and they are not a-priori: they are the outcome of iterated \nlearning. The poverty of the stimulus is now no longer a problem; instead, the \nancestors' poverty is the solution for the child's. 
\n\nAcknowledgIllents This work was performed while I was at the AI Laboratory \nof the Vrije Universiteit Brussel. It builds on previous work that was done in close \ncollaboration with Paulien Hogeweg of Utrecht University. I thank her and Simon \nKirby, John Batali, Aukje Zuidema and my colleagues at the AI Lab and the LEC \nfor valuable hints, questions and remarks. Funding from the Concerted Research \nAction fund of the Flemish Government and the VUB, from the Prins Bernhard \nCultuurfonds and from a Marie Curie Fellowship of the European Commission are \ngratefully acknowledged. \n\nReferences \n\n[1) Stefano Bertolo, editor. Language Acquisition and Learnability. Cambridge University \n\nPress, 200l. \n\n[2) Noam Chom::;ky. Aspects of the theor'y of syntax. MIT Pre::;::;, Cambridge, MA, 1965. \n[3) E. M. Gold. Language identification in the limit. \nInfor'mation and Contml (now \n\nInformation and Computation), 10:447- 474, 1967. \n\n[4) Michael A. Arbib and Jane C. Hill. Language acquisition: Schemas replace univer(cid:173)\n\nsal grammar. In John A. Hawkins, editor, Explaining Language Universals. Basil \nBlackwell, New York, USA, 1988. \n\n[5) J. Elman, E. Bates, et al. Rethinking innateness. MIT Press, 1996. \n[6) Steven Pinker and Paul Bloom. Natural language and natural selection. Behavioral \n\nand brain sciences, 13:707-784, 1990. \n\n[7) Martin A. Nowak, Natalia Komarova, and Partha Niyogi. Evolution of universal \n\ngrammar. Science, 291:114-118, 200l. \n\n[8) Terrence Deacon. Symbolic species, the co-e'Uol'ution of language and the h'uman brain. \n\nThe Penguin Press, 1997. \n\n[9) S. Kirby and J. Hurford. The emergence of lingui::;tic ::;tructure: An overview of the \niterated learning model. In Angelo Cangelosi and Domenico Parisi, editors, Sirn'ulating \nthe Evolution of Lang'uage, chapter 6, pages 121-148. Springer Verlag, London, 2002. \n[10) Kenny Smith. 
Natural selection and cultural selection in the evolution of communication. Adaptive Behavior, 2003. To appear.
[11] Simon Kirby. Syntax without natural selection: How compositionality emerges from vocabulary in a population of learners. In C. Knight et al., editors, The Evolutionary Emergence of Language. Cambridge University Press, 2000.
[12] J. Gerard Wolff. Language acquisition, data compression and generalization. Language & Communication, 2(1):57-89, 1982.
[13] A. Stolcke. Bayesian Learning of Probabilistic Language Models. PhD thesis, Dept. of Electrical Engineering and Computer Science, University of California at Berkeley, 1994.
[14] Menno van Zaanen and Pieter Adriaans. Comparing two unsupervised grammar induction systems: Alignment-based learning vs. EMILE. In Ben Kröse et al., editors, Proceedings of BNAIC 2001, 2001.
[15] Zach Solan, Eytan Ruppin, David Horn, and Shimon Edelman. Automatic acquisition and efficient representation of syntactic structures. This volume.
[16] Henry Brighton. Compositional syntax from cultural transmission. Artificial Life, 8(1), 2002.
", "award": [], "sourceid": 2259, "authors": [{"given_name": "Willem", "family_name": "Zuidema", "institution": null}]}