{"title": "Game Theoretic Algorithms for Protein-DNA binding", "book": "Advances in Neural Information Processing Systems", "page_first": 1081, "page_last": 1088, "abstract": null, "full_text": "Game theoretic algorithms for Protein-DNA binding\n\n Luis Perez-Breva CSAIL-MIT lpbreva@csail.mit.edu Chen-Hsiang Yeang UCSC chyeang@soe.ucsc.edu\n\nLuis E. Ortiz CSAIL - MIT leortiz@csail.mit.edu Tommi Jaakkola CSAIL - MIT tommi@csail.mit.edu\n\nAbstract\nWe develop and analyze game-theoretic algorithms for predicting coordinate binding of multiple DNA binding regulators. The allocation of proteins to local neighborhoods and to sites is carried out with resource constraints while explicating competing and coordinate binding relations among proteins with affinity to the site or region. The focus of this paper is on mathematical foundations of the approach. We also briefly demonstrate the approach in the context of the \u03bb-phage switch.\n\n1 Introduction\nTranscriptional control relies in part on coordinate operation of DNA binding regulators and their interactions with various co-factors. We believe game theory and economic models provide an appropriate modeling framework for understanding interacting regulatory processes. In particular, the problem of understanding coordinate binding of regulatory proteins has many game theoretic properties. Resource constraints, for example, are critical to understanding who binds where. At low nuclear concentrations, regulatory proteins may occupy only high affinity sites, while filling weaker sites with increasing concentration. Overlapping or close binding sites create explicit competition for the sites, the resolution of which is guided by the available concentrations around the binding sites. Similarly, explicit coordination such as formation of larger protein complexes may be required for binding or, alternatively, binding may be facilitated by the presence of another protein. 
The key advantage of games as models of binding is that they can provide causally meaningful predictions (binding arrangements) in response to various experimental perturbations or disruptions. Our approach deviates from an already substantial body of computational methods used for resolving transcriptional regulation (see, e.g., [3, 10]). From a biological perspective our work is closest in spirit to more detailed reaction equation models [5, 1], while narrower in scope. The mathematical approach is nevertheless substantially different.\n\n2 Protein-DNA binding\nWe decompose the binding problem into transport and local binding. By transport, we refer to the mechanism that transports proteins to the neighborhood of sites to which they have affinity. The biological processes underlying the transport are not well understood, although several hypotheses exist [12, 4]. We abstract the process initially by assuming separate affinities for proteins to explore neighborhoods of specific sites, modulated by whether the sites are available. This abstraction does not address the dynamics of the transport process and therefore does not distinguish (nor stand in contradiction to) underlying mechanisms that may or may not involve diffusion as a major component. We aim to capture the differentiated manner in which proteins may accumulate in the neighborhoods of sites depending on the overall nuclear concentrations and regardless of the time involved. Local binding, on the other hand, captures which proteins bind to each site as a consequence of local accumulations or concentrations around the site or a larger region. In a steady state, the local environment of the site is assumed to be closed and well-mixed. 
We therefore model the binding as being governed by chemical equilibria: for a type of protein i around site j, {free protein i} + {free site j} $\\rightleftharpoons$ {bound ij}, where concentrations involving the site should be thought of as time averages or averages across a population of cells depending on the type of predictions sought. The concentrations of various molecular species around and bound to the sites as well as the rate at which the sites are occupied are then governed by the law of mass action at chemical equilibrium: [bound ij]/([free protein i][free site j]) = $K_{ij}$, where i ranges over proteins with affinity to site j and $K_{ij}$ is a positive equilibrium constant characterizing protein i's ability to bind to site j in the absence of other proteins. Broadly speaking, the combination of transport and local binding results in an arrangement of proteins along the possible DNA binding sites. This is what we aim to predict with our game-theoretic models, not how such arrangements are reached. The predictions should be viewed as functions of the overall (nuclear) concentrations of proteins, the affinities of proteins to explore neighborhoods of individual sites, as well as the equilibrium constants characterizing the ability of proteins to bind to specific sites when in close proximity. Any perturbation of such parameters leads to a potentially different arrangement that we can predict.\n\n3 Game Theoretic formulation\nThere are two types of players in our game, proteins and sites. A protein-player refers to a type of protein, not an individual protein, and decides how its nuclear concentration is allocated to the proximity of sites (transport process). The protein-players are assumed non-cooperative and rational. In other words, their allocations are based on the transport affinities and the availability of sites rather than through some negotiation process involving multiple proteins. 
The non-cooperative nature of the protein allocations does not, however, preclude the formation of protein complexes or binding facilitated by other proteins. Such extensions can be incorporated at the sites. Each possible binding site is associated with a site-player. Site-players choose the fraction of time (or fraction of cells in a population) a specific type of protein is bound to the site. The site may also remain empty. The strategies of the site-players are guided by local chemical equilibria. Indeed, the site-players are introduced merely to reproduce this physical understanding of the binding process in a game theoretic context. The site-players are non-cooperative and self-interested, always aiming and succeeding at reproducing the local chemical equilibria. The binding game has no global objective function that serves to guide how the players choose their strategies. The players' choices are instead guided by their own utilities that depend on the choices of other players. For example, the protein-player allocates its nuclear concentration to the proximity of the sites based on how occupied the sites are, i.e., in a manner that depends on the strategies of the site-players. Similarly, the site-players reproduce the chemical equilibrium at the sites on the basis of the available local protein concentrations, i.e., depending on the choices of the protein-players. The predictions we can make based on the game theoretic formulation are equilibria of the game (not to be confused with the local chemical equilibria at the sites). At an equilibrium, no reallocation of proteins to sites is required and, conversely, the sites have reproduced the local chemical equilibria based on the current allocations of proteins. 
While games need not have equilibria in pure strategies (actions available to the players), our game will always have one.\n\n4 The binding game\nTo specify the game more formally we proceed to define players' strategies, their utilities, and the notion of an equilibrium of the game. To this end, let $f^i$ represent the (nuclear) concentration of protein i. This is the amount of protein available to be allocated to the neighborhoods of sites. The fraction of protein i allocated to site j is specified by $p^i_j$, where $\\sum_j p^i_j = 1$. The numerical values of $p^i_j$, where j ranges over the possible sites, define a possible strategy for the ith protein-player. The set of such strategies is denoted by $P^i$. The choices of which strategies to play are guided by parameters $E_{ij}$, the affinity of protein i to explore the neighborhood of site j (we will generally index proteins with i and sites with j). The utility for protein i, defined below, provides a numerical ranking of possible strategy choices and is parameterized by $E_{ij}$. Each player aims to maximize its own utility over the set of possible strategy choices. The strategy for site-player j specifies the fraction of time that each type of protein is actually bound to the site. The strategy is denoted by $s^j_i$, where i ranges over proteins with affinity to the site. Note that the values of $s^j_i$ are in principle observable from binding assays (cf. [9]). $\\sum_i s^j_i \\le 1$ since there is only one site and it may remain empty part of the time. The availability of site j is $1 - \\sum_i s^j_i$, i.e., the fraction of time that nothing is bound. We will also use $\\alpha^j = \\sum_i s^j_i$ to denote how occupied the site is. The utilities of the site-players will depend on $K_{ij}$, the chemical equilibrium constants characterizing the local binding reaction between protein i and site j. 
Utilities The utility function for protein-player i is formally defined as\n\n$$u_i(p^i, s) \\equiv \\sum_j p^i_j E_{ij} \\Big(1 - \\sum_{i'} s^j_{i'}\\Big) + \\beta H(p^i) \\qquad (1)$$\n\nwhere $H(p^i) = -\\sum_j p^i_j \\log p^i_j$ is the Shannon entropy of the strategy $p^i$ and j ranges over possible sites. The utility of the protein-player essentially states that protein i \"prefers\" to be around sites that are unbound and for which it has high affinity. The parameter $\\beta \\ge 0$ balances how much protein allocations are guided by the differentiated process, characterized by the exploration affinities $E_{ij}$, as opposed to allocated uniformly (maximizing the entropy function). Since the overall scaling of the utilities is immaterial, only the ratios $E_{ij}/\\beta$ are relevant for guiding the protein-players. Note that since the utility depends on the strategies of site-players through $(1 - \\sum_{i'} s^j_{i'})$, one cannot find the equilibrium strategy for proteins by considering $s^j_i$ to be fixed; the sites will respond to any $p^i_j$ chosen by the protein-player. As discussed earlier, the site-players always reproduce the chemical equilibrium between the site and the protein species allocated to the neighborhood of the site. The utility for site-player j is defined such that the maximizing strategy corresponds to the chemical equilibrium:\n\n$$\\frac{s^j_i}{(p^i_j f^i - s^j_i)\\big(1 - \\sum_{i'} s^j_{i'}\\big)} = K_{ij} \\qquad (2)$$\n\nwhere $s^j_i$ specifies how much protein i is bound, the first term in the denominator $(p^i_j f^i - s^j_i)$ specifies the amount of free protein i, and the second term $(1 - \\sum_{i'} s^j_{i'})$, the fraction of time the site is available. The equilibrium equation holds for all protein species around the site and for the same strategy $\\{s^j_i\\}$ of the site-player. The units of each \"concentration\" in the above equation should be interpreted as numbers of available molecules (e.g., there's only one site). 
The utility function that reproduces this chemical equilibrium when maximized over possible strategies is given by\n\n$$v_j(s^j, p) \\equiv \\sum_i \\Big[ s^j_i - K_{ij} (p^i_j f^i - s^j_i) \\Big(1 - \\sum_{i'} s^j_{i'}\\Big) \\Big] \\qquad (3)$$\n\nsubject to $K_{ij} (p^i_j f^i - s^j_i)(1 - \\sum_{i'} s^j_{i'}) \\ge s^j_i$, $s^j_i \\le p^i_j f^i$, and $\\sum_i s^j_i \\le 1$. These constraints guarantee that the utility is always non-positive and zero exactly when the chemical equilibrium holds. $s^j_i \\le p^i_j f^i$ ensures that we cannot have more protein bound than is allocated to the proximity of the site. These constraints define the set of strategies available for site-player j or $S^j(p)$. Note that the available strategies for the site-player depend on the current strategies for protein-players. The set of strategies $S^j(p)$ is not convex.\n\n4.1 The game and equilibria\n\nThe protein-DNA binding game is now fully specified by the set of parameters $\\{E_{ij}/\\beta\\}$, $\\{K_{ij}\\}$ and $\\{f^i\\}$, along with the utility functions $\\{u_i\\}$ and $\\{v_j\\}$ and the allocation constraints $\\{P^i\\}$ and $\\{S^j\\}$. We assume that the biological system being modeled reaches a steady state, at least momentarily, preserving the average allocations. In terms of our game theoretic model, this corresponds to what we call an equilibrium of the game. Informally, an equilibrium of a game is a strategy for each player such that no individual has any incentive to unilaterally deviate from their strategy. Formally, if the allocations (p, s) are such that for each protein i and each site j,\n\n$$p^i \\in \\arg\\max_{p^i \\in P^i} u_i(p^i, s), \\quad \\text{and} \\quad s^j \\in \\arg\\max_{s^j \\in S^j(p_j)} v_j(s^j, p_j), \\qquad (4)$$\n\nthen we call (p, s) an equilibrium of the protein-DNA binding game. Put another way, at an equilibrium, the current strategies of the players must be among the strategies that maximize their utilities assuming the strategies of other players are held fixed. Does the protein-DNA binding game always have an equilibrium? 
While we have already stated this in the affirmative, we emphasize that there is no reason a priori to believe that there exists an equilibrium in the pure strategies, especially since the sets of possible strategies for the site-players are non-convex (cf. [2]). The existence is guaranteed by the following theorem:\n\nTheorem 1. Every protein-DNA binding game has an equilibrium.\n\nA constructive proof is provided by the algorithm discussed below. The theorem guarantees that at least one equilibrium exists but there may be more than one. At any such equilibrium of the game, all the protein species around each site are at a chemical equilibrium; that is, if (p, s) is an equilibrium of the game, then for all sites j and proteins i, $s^j_i$ and $p^i_j$ satisfy (2). Consequently, the site utilities $v_j(s^j, p_j)$ are all zero for the equilibrium strategies.\n\n4.2 Computing equilibria\n\nThe equilibria of the binding game represent predicted binding arrangements. Our game has special structure and properties that permit us to find an equilibrium efficiently through a simple iterative algorithm. The algorithm monotonically fills the sites up to the equilibrium levels, starting with all sites empty. We begin by first expressing any joint equilibrium strategy of the game as a function of how filled the sites are, and reduce the problem of finding equilibria to finding fixed points of a monotone function. To this end, let $\\alpha^j = \\sum_i s^j_i$ denote site j occupancy, the fraction of time it is bound by any protein. The $\\alpha^j$'s are real numbers in the interval [0, 1]. If we fix $\\alpha = (\\alpha^1, \\ldots, \\alpha^m)$, i.e., the occupancies for all the m sites, then we can readily obtain the maximizing strategies for proteins expressed as a function of site occupancies: $p^i_j(\\alpha) \\propto \\exp(E_{ij}(1 - \\alpha^j)/\\beta)$, where the maximizing strategies are functions of $\\alpha$. Similarly, at the equilibrium, each site-player achieves a local chemical equilibrium specified in (2). 
By replacing $\\alpha^j = \\sum_{i'} s^j_{i'}$, and solving for $s^j_i$ in (2), we get\n\n$$s^j_i(\\alpha) = \\frac{K_{ij}(1 - \\alpha^j)}{1 + K_{ij}(1 - \\alpha^j)} \\, p^i_j(\\alpha) f^i \\qquad (5)$$\n\nSo, for example, the fraction of time the site is bound by a specific protein is proportional to the amount of that protein in the neighborhood of the site, modulated by the equilibrium constant. Note that $s^j_i(\\alpha)$ depends not only on how filled site j is but also on how occupied the other sites are through $p^i_j(\\alpha)$. The equilibrium condition can be now expressed solely in terms of $\\alpha$ and reduces to a simple consistency constraint: overall occupancy should equal the fraction of time any protein is bound or\n\n$$\\alpha^j = \\sum_i s^j_i(\\alpha) = \\sum_i \\frac{K_{ij}(1 - \\alpha^j)}{1 + K_{ij}(1 - \\alpha^j)} \\, p^i_j(\\alpha) f^i \\equiv G^j(\\alpha) \\qquad (6)$$\n\nWe have therefore reduced the problem of finding equilibria of the game to finding fixed points of the mapping $G^j(\\alpha) = \\sum_i s^j_i(\\alpha)$. This mapping has a simple but powerful monotonicity property that forms the basis for our iterative algorithm. Specifically,\n\nLemma 1. Let $\\alpha^{-j}$ denote all components $\\alpha^k$ except $\\alpha^j$. Then for each j, $G^j(\\alpha) \\equiv G^j(\\alpha^j, \\alpha^{-j})$ is a strictly decreasing function of $\\alpha^j$ for any fixed $\\alpha^{-j}$.\n\nWe omit the proof as it is straightforward. This lemma, together with the fact that $G^j(1, \\alpha^{-j}) = 0$, immediately guarantees that there is a unique solution to $\\alpha^j = G^j(\\alpha^j, \\alpha^{-j})$ for any fixed and valid $\\alpha^{-j}$. The solution $\\alpha^j$ also lies in the interval [0, 1] and can be found efficiently via binary search.\n\nThe algorithm Let $\\alpha(t)$ denote the site occupancies at the tth iteration of the algorithm. $\\alpha^j(t)$ specifies the jth component of this vector, while $\\alpha^{-j}(t)$ contains all but the jth component. The algorithm proceeds as follows:\n- Set $\\alpha^j(0) = 0$ for all $j = 1, \\ldots, m$.\n- Find each new component $\\alpha^j(t + 1)$, $j = 1, \\ldots, m$, on the basis of the corresponding $\\alpha^{-j}(t)$ such that $\\alpha^j(t + 1) = G^j(\\alpha^j(t + 1), \\alpha^{-j}(t))$.\n- Stop when $\\alpha^j(t + 1) \\approx \\alpha^j(t)$ for all $j = 1, \\ldots, m$.\n\n
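The iteration above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`protein_allocation`, `occupancy_map`, `solve_site`, `find_equilibrium`), the tolerances, and the toy parameter values are all hypothetical, and the code assumes non-negative affinities so that the monotonicity property of Lemma 1 holds.

```python
import math

def protein_allocation(E_i, alpha, beta):
    # Maximizing strategy of one protein: a softmax over sites of
    # E_ij * (1 - alpha_j) / beta (shifted by the max for numerical stability).
    scores = [e * (1.0 - a) / beta for e, a in zip(E_i, alpha)]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def occupancy_map(j, alpha, E, K, f, beta):
    # G^j(alpha) from equation (6): total bound fraction at site j.
    total = 0.0
    for i in range(len(f)):
        p = protein_allocation(E[i], alpha, beta)
        k = K[i][j] * (1.0 - alpha[j])
        total += k / (1.0 + k) * p[j] * f[i]
    return total

def solve_site(j, alpha, E, K, f, beta, tol=1e-10):
    # Binary search for alpha_j = G^j(alpha_j, alpha_-j). It relies on
    # G^j being decreasing in alpha_j and equal to zero at alpha_j = 1.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        trial = list(alpha)
        trial[j] = mid
        if occupancy_map(j, trial, E, K, f, beta) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def find_equilibrium(E, K, f, beta, tol=1e-8, max_iter=10000):
    # Start with all sites empty and update every site from the
    # previous iterate until the occupancies stop changing.
    m = len(E[0])
    alpha = [0.0] * m
    for _ in range(max_iter):
        new = [solve_site(j, alpha, E, K, f, beta) for j in range(m)]
        if max(abs(n - a) for n, a in zip(new, alpha)) < tol:
            return new
        alpha = new
    return alpha

if __name__ == '__main__':
    # Toy values chosen only for illustration (2 proteins, 2 sites).
    E = [[1.0, 0.5], [0.2, 1.0]]   # exploration affinities E_ij
    K = [[2.0, 0.5], [1.0, 3.0]]   # equilibrium constants K_ij
    f = [1.0, 0.5]                 # available amounts f^i
    print(find_equilibrium(E, K, f, beta=1.0))
```

Each site is updated from the previous iterate rather than in place, which matches the use of the occupancies of the other sites at iteration t in the update rule; the stopping test uses a numerical tolerance in place of exact equality.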
Note that the inner loop of the algorithm, i.e., finding $\\alpha^j(t + 1)$ on the basis of $\\alpha^{-j}(t)$, reduces to a simple binary search as discussed earlier. The algorithm generates a monotonically increasing sequence of $\\alpha$'s that converge to a fixed point (equilibrium) solution. We also provide a formal convergence analysis of the algorithm. To this end, we begin with the following critical lemma.\n\nLemma 2. Let $\\alpha_1$ and $\\alpha_2$ be two possible assignments to $\\alpha$. If for all $k \\ne j$, $\\alpha_1^k \\le \\alpha_2^k$, then $G^j(\\alpha^j, \\alpha_1^{-j}) \\le G^j(\\alpha^j, \\alpha_2^{-j})$ for all $\\alpha^j$.\n\nThe proof is straightforward and essentially based on the fact that $\\alpha_1^{-j}$ and $\\alpha_2^{-j}$ appear only in the normalization terms for the protein allocations. We omit further details for brevity. On the basis of this lemma, we can show that the algorithm indeed generates a monotonically increasing sequence of $\\alpha$'s.\n\nTheorem 2. $\\alpha^j(t + 1) \\ge \\alpha^j(t)$ for all j and t.\n\nProof. By induction. Since $\\alpha^j(0) = 0$ and the range of $G^j(\\alpha^j, \\alpha^{-j}(0))$ lies in [0, 1], clearly $\\alpha^j(1) \\ge \\alpha^j(0)$ for all j. Assume then that $\\alpha^j(t) \\ge \\alpha^j(t - 1)$ for all j. We extend the induction step by contradiction. Suppose $\\alpha^j(t + 1) < \\alpha^j(t)$ for some j. Then\n\n$$\\alpha^j(t + 1) < \\alpha^j(t) = G^j(\\alpha^j(t), \\alpha^{-j}(t - 1)) \\le G^j(\\alpha^j(t), \\alpha^{-j}(t)) < G^j(\\alpha^j(t + 1), \\alpha^{-j}(t)) = \\alpha^j(t + 1)$$\n\nwhich is a contradiction. The first \"$\\le$\" follows from the induction hypothesis and lemma 2, and the last \"<\" derives from lemma 1 and $\\alpha^j(t + 1) < \\alpha^j(t)$.\n\nSince $\\alpha^j(t)$ for any t will always lie in the interval [0, 1], and because of the continuity of $G^j(\\alpha^j, \\alpha^{-j})$ in the two arguments, the algorithm is guaranteed to converge to a fixed point solution. More formally, the Monotone Convergence Theorem for sequences and the continuity of the $G^j$'s imply that\n\nTheorem 3. The algorithm converges to a fixed point $\\alpha$ such that $\\alpha^j = G^j(\\alpha^j, \\alpha^{-j})$ for all j.\n\n4.3 The \u03bb-phage binding game\n\nWe use the well-known \u03bb-phage viral infection [11, 1] to illustrate the game theoretic approach. 
A genetic two-state control switch specifies whether the infection remains dormant (lysogeny) or whether the viral DNA is aggressively replicated (lysis). The components of the \u03bb-switch are 1) two adjacent genes cI and Cro that encode cI2 and Cro proteins, respectively; 2) the promoter regions PRM and PR of these genes, and 3) an operator (OR) with three binding sites OR1, OR2, and OR3. We focus on lysogeny, in which cI2 dominates over Cro. There are two relevant protein-players, RNA-polymerase and cI2, and three sites, OR1, OR2, and OR3 (arranged close together in this order). Since the presence of cI2 in either OR1 or OR3 blocks the access of RNA-polymerase to the promoter region PR, or PRM respectively, we can safely restrict ourselves to operator sites as the site-players. There are three phases of operation depending on the concentration of cI2:\n1. cI2 binds to OR1 first and blocks the Cro promoter PR\n2. Slightly higher concentrations of cI2 lead to binding at OR2 which in turn facilitates RNA-polymerase to initiate transcription at PRM\n3. At sufficiently high levels cI2 also binds to OR3 and inhibits its own transcription\n\n[Figure 1, panels (a) OR3, (b) OR2, (c) OR1: probability of binding of cI2 and RNA-polymerase at each site, plotted against the ratio $f_{\\text{cI2}}/f_{\\text{RNA-p}}$ on a logarithmic axis.]\n\nFigure 1: Predicted protein binding to sites OR3, OR2, and OR1 for increasing amounts of cI2. The rightmost figure illustrates a comparison with [1]. The shaded area indicates the range of concentrations of cI2 at which stochastic simulation predicts a decline in transcription from OR1. 
Our model predicts that cI2 begins to occupy OR1 at the same concentration.\n\nGame parameters The game requires three sets of parameters: chemical equilibrium constants, affinities, and protein concentrations. To use constants derived from experiment we assign units to these quantities. We define $f^i$ as the total number of proteins i available, and arrange the units of $K_{ij}$ accordingly:\n\n$$f^i \\equiv \\tilde f^i V_T N_A, \\qquad K_{ij} \\equiv \\tilde K_{ij} / (N_A V_S), \\qquad \\tilde K_{ij} = e^{-\\Delta G / RT} \\qquad (7)$$\n\nwhere $V_T$ and $V_S$ are the volumes of cell and site neighborhood, respectively, $N_A$ is the Avogadro number, R is the universal gas constant, T is temperature, $\\tilde f^i$ is the concentration of protein i in the cell, and $\\tilde K_{ij}$ is the equilibrium constant in units of L/mol. As we show in [6] these definitions are consistent with our previous derivation. Note that when game parameters are learned from data any dependence on the volumes will be implicit. For a typical Escherichia coli (about 2 $\\mu$m length) at room temperature, the Gibbs free energies $\\Delta G$ tabulated by [11] yield the equilibrium constants shown below; in addition, we set transport affinities in accordance with the qualitative description in [7, 8].\n\nKij    cI2     RNA-p\nOR3    .0020   .0212\nOR2    .0020   0\nOR1    .0296   .1134\n\nEij    cI2     RNA-p\nOR3    .1      .2\nOR2    .1      .01\nOR1    1       1\n\nNote that the overall scaling of the affinities is immaterial; only their relative values will guide the protein-players. Note also that we have chosen not to incorporate any protein-protein interactions in the affinities. Finally, we set $f_{\\text{RNA-p}}$ = 30 nM (cf. [11]) (around $f_{\\text{RNA-p}} \\approx 340$ copies for a typical E. coli), and varied $f_{\\text{cI2}}$ from 1 to 10,000 copies to study the dynamical behavior of the lysogeny cycle. The results are reported as a function of the ratio $f_{\\text{cI2}}/f_{\\text{RNA-p}}$. We set $\\beta = 10^{-5}$.\n\nSimulation Results The predictions from the game theoretic model exactly mirror the known behavior. Here we summarize the main results and refer the reader to [6] for a thorough analysis. 
Figure 1 illustrates how the binding at different sites changes as a function of increasing $f_{\\text{cI2}}$. The simulation mirrors the behavior of the lysogeny cycle discussed earlier. Although our model does not capture dynamics, and figure 1 does not involve time, it is nevertheless useful for assessing quantitative changes and the order of events as a function of increasing $f_{\\text{cI2}}$. Note, for example, that the levels at which cI2 occupies OR1 and OR2 rise much faster than at OR3. While the result is expected, the behavior is attributed to protein-protein interactions which are not encoded in our model. Similarly, RNA-polymerase occupation at OR3 bumps up as the probability that OR2 is bound by cI2 increases. In [6] we further discuss the implications of the simultaneous occupancy of OR1 and OR2, via simulation of OR1 knockout experiments. Finally, figure 1(c) shows a comparison with stochastic simulation (cf. [1]). Our model predicts that cI2 begins binding OR1 at the same level at which [1] predicts a decline in the transcription of Cro. While consistent, we emphasize that the methods differ in their goals; stochastic simulation focuses on the dynamics of transcription while we study the strategic allocation of proteins as a function of their concentration.\n\n4.4 A structured extension\n\nThe game theoretic formulation of the binding problem described previously involves a transport mechanism that is specific to individual sites. In other words, proteins are allocated to the proximity of sites based on parameters $E_{ij}$ and occupancies $\\alpha^j$ associated with individual sites. We generalize the game further here by assuming that the transport mechanism has a coarser spatial structure, e.g., specific to promoters (regulatory regions of genes) rather than sites. In this extension the amount of protein allocated to any promoter is shared by the sites it contains. 
The sharing creates specific challenges to the algorithms for finding the equilibria and we will address those challenges here. Let R represent possible promoter regions each of which may be bound by multiple proteins (at distinct or overlapping sites). Let $p^i = \\{p^i_r\\}_{r \\in R}$ represent an allocation of protein i into these regions in a manner that is not specific to the possible sites within each promoter. The utility for protein i is given by\n\n$$u_i(p^i) = \\sum_{r \\in R} p^i_r E_{ir}(a^r) + \\beta H(p^i)$$\n\nwhere N(r) is the set of possible binding sites within promoter region r and $a^r = \\sum_{j \\in N(r)} \\alpha^j$ is the overall occupancy of the promoter (how many proteins bound). As before, $\\alpha^j = \\sum_{i \\in P} s^j_i$, where the summation is over proteins. $N(r) \\cap N(r') = \\emptyset$ whenever $r \\ne r'$ (promoters don't share sites). We assume only that $E_{ir}(a^r)$ is a decreasing and differentiable function of $a^r$. The protein utility is based on the assumption that the attraction to the promoter decreases based on the number of proteins already bound at the promoter. The maximizing strategy for protein i given $a^r = \\sum_{j \\in N(r)} \\alpha^j$ for all r, is $p^i_r(a) \\propto \\exp(E_{ir}(a^r)/\\beta)$, where $a = \\{a^r\\}_{r \\in R}$.\n\nSites $j \\in N(r)$ within a promoter region r reproduce the following chemical equilibrium\n\n$$\\frac{s^j_i}{\\big(f^i p^i_r(a) - \\sum_{k \\in N(r)} s^k_i\\big)(1 - \\alpha^j)} = K_{ij}$$\n\nfor all proteins $i \\in P$. Note the shared protein resource within the promoter. We can find this chemical equilibrium by solving the following fixed point equations\n\n$$\\alpha^j = \\sum_{i \\in P} \\frac{K_{ij}(1 - \\alpha^j)}{1 + \\sum_{k \\in N(r)} K_{ik}(1 - \\alpha^k)} \\, f^i p^i_r(a) \\equiv G^j_r(\\alpha, a^{-r})$$\n\nThe site occupancies are now tied within the promoter as well as influencing the overall allocation of proteins across different promoters through $a = \\{a^r\\}_{r \\in R}$. The following theorem provides the basis for solving the coupled fixed point equations:\n\nTheorem 4. Let $\\{\\hat\\alpha_1^j\\}$ be the fixed point solution to $\\alpha_1^j = G^j_r(\\alpha_1, a_1^{-r})$ and $\\{\\hat\\alpha_2^j\\}$ the solution to $\\alpha_2^j = G^j_r(\\alpha_2, a_2^{-r})$. If $a_1^l \\le a_2^l$ for all $l \\ne r$ then $\\hat a_1^r \\le \\hat a_2^r$.\n\n
The proof is not straightforward but we omit it for brevity (two pages). The result guarantees that if we can solve the fixed point equations within each promoter then the overall occupancies $\\{a^r\\}_{r \\in R}$ have the same monotonicity property as in the simpler version of the game where $a^r$ consisted of a single site. In other words, any algorithm that successively solves the fixed point equations within promoters will result in a monotone and therefore convergent filling of the promoters, beginning with all empty promoters. We will redefine the notation slightly to illustrate the algorithm for finding the solution $\\alpha^j = G^j_r(\\alpha, a^{-r})$ for $j \\in N(r)$ where $a^{-r}$ is fixed. Specifically, let\n\n$$G^j_r(\\alpha^j, \\alpha^{-j}, \\tilde\\alpha^{-j}, a^{-r}) = \\sum_{i \\in P} \\frac{K_{ij}(1 - \\alpha^j)}{1 + K_{ij}(1 - \\alpha^j) + \\sum_{k \\ne j} K_{ik}(1 - \\alpha^k)} \\, f^i p^i_r(\\alpha^j, \\tilde\\alpha^{-j}, a^{-r})$$\n\nIn other words, the first argument refers to $\\alpha^j$ anywhere on the right hand side, the second argument refers to $\\alpha^{-j}$ in the denominator of the first expression in the sum, and the third argument refers to $\\alpha^{-j}$ in $p^i_r(\\cdot)$. The algorithm is now defined as follows: initialize by setting $\\underline{\\alpha}^j(0) = 0$ and $\\bar\\alpha^j(0) = 1$ for all $j \\in N(r)$, then\n\nIteration t, upper bounds: Find $\\hat\\alpha^j = G^j_r(\\hat\\alpha^j, \\bar\\alpha^{-j}(t), \\underline{\\alpha}^{-j}(t), a^{-r})$ separately for each $j \\in N(r)$. Update $\\bar\\alpha^j(t + 1) = \\hat\\alpha^j$, $j \\in N(r)$.\n\nIteration t, lower bounds: Find $\\hat\\alpha^j = G^j_r(\\hat\\alpha^j, \\underline{\\alpha}^{-j}(t), \\bar\\alpha^{-j}(t + 1), a^{-r})$ separately for each $j \\in N(r)$. Update $\\underline{\\alpha}^j(t + 1) = \\hat\\alpha^j$, $j \\in N(r)$.\n\nThe iterative optimization proceeds until$^1$ $\\bar\\alpha^j(t) - \\underline{\\alpha}^j(t) \\le \\epsilon$ for all $j \\in N(r)$. The algorithm successively narrows down the gap between upper and lower bounds. Specifically, $\\bar\\alpha^j(t + 1) \\le \\bar\\alpha^j(t)$ and $\\underline{\\alpha}^j(t + 1) \\ge \\underline{\\alpha}^j(t)$. The fact that these indeed remain upper and lower bounds follows directly from the fact that $G^j_r(\\cdot, \\alpha^{-j}, \\tilde\\alpha^{-j}, a^{-r})$, viewed as a function of the first argument, increases uniformly as we increase the components of the second argument. 
Similarly, it uniformly decreases as a function of the third argument.\n\n5 Discussion\nWe have presented a game theoretic approach to predicting protein arrangements along the DNA. The model is complete with convergent algorithms for finding equilibria on a genome-wide scale. The results from the small scale application are encouraging. Our model successfully reproduces known behavior of the \u03bb-switch on the basis of molecular level competition and resource constraints, without the need to assume protein-protein interactions between cI2 dimers or between cI2 and RNA-polymerase. Even in the context of this well-known sub-system, however, few quantitative experimental results are available about binding (see the comparison). Proper validation and use of our model therefore relies on estimating the game parameters from available protein-DNA binding data. This will be addressed in subsequent work. This work was supported in part by NIH grant GM68762 and by NSF ITR grant 0428715. Luis Perez-Breva is a \"Fundacion Rafael del Pino\" Fellow.\n\nReferences\n[1] Adam Arkin, John Ross, and Harley H. McAdams. Stochastic kinetic analysis of developmental pathway bifurcation in phage \u03bb-infected Escherichia coli cells. Genetics, 149:1633-1648, August 1998.\n[2] Kenneth J. Arrow and Gerard Debreu. Existence of an equilibrium for a competitive economy. Econometrica, 22(3):265-290, July 1954.\n[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi, J. Yoo, B. Gordon, F. Robert, E. Fraenkel, T. Jaakkola, R. Young, and D. Gifford. Computational discovery of gene modules and regulatory networks. Nature Biotechnology, 21(11):1337-1342, 2003.\n[4] Otto G. Berg, Robert B. Winter, and Peter H. von Hippel. Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry, 20(24):6929-48, November 1981.\n[5] Harley H. McAdams and Adam Arkin. Stochastic mechanisms in gene expression. PNAS, 94(3):814-819, 1997.\n
[6] Luis Perez-Breva, Luis Ortiz, Chen-Hsiang Yeang, and Tommi Jaakkola. DNA binding and games. Technical Report MIT-CSAIL-TR-2006-018, Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, March 2006.\n[7] Mark Ptashne. A Genetic Switch: Gene control and phage \u03bb. Cell Press and Blackwell Scientific Publications, 3rd edition, 1987.\n[8] Mark Ptashne and Alexander Gann. Genes and Signals. Cold Spring Harbor Laboratory Press, 1st edition, 2002.\n[9] Bing Ren, Fran\u00e7ois Robert, John J. Wyrick, Oscar Aparicio, Ezra G. Jennings, Itamar Simon, Julia Zeitlinger, J\u00f6rg Schreiber, Nancy Hannett, Elenita Kanin, Thomas L. Volkert, Christopher J. Wilson, Stephen P. Bell, and Richard A. Young. Genome-wide location and function of DNA-binding proteins. Science, 290(2306), December 2000.\n[10] E. Segal, M. Shapira, A. Regev, D. Pe'er, D. Botstein, D. Koller, and N. Friedman. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nature Genetics, 34(2):166-76, 2003.\n[11] Madeline A. Shea and Gary K. Ackers. The OR control system of bacteriophage lambda. A physical-chemical model for gene regulation. Journal of Molecular Biology, 181:211-230, 1985.\n[12] Neil P. Stanford, Mark D. Szczelkun, John F. Marko, and Stephen E. Halford. One- and three-dimensional pathways for proteins to reach specific DNA sites. EMBO, 19(23):6546-6557, December 2000.\n\n1 In the case of multiple equilibria the bounds might converge but leave a finite gap. The algorithm will identify those cases as the monotone convergence of the bounds can be assessed separately.\n", "award": [], "sourceid": 3091, "authors": [{"given_name": "Luis", "family_name": "P\u00e9rez-breva", "institution": null}, {"given_name": "Luis", "family_name": "Ortiz", "institution": null}, {"given_name": "Chen-hsiang", "family_name": "Yeang", "institution": null}, {"given_name": "Tommi", "family_name": "Jaakkola", "institution": null}]}