{"title": "Deep Dynamical Modeling and Control of Unsteady Fluid Flows", "book": "Advances in Neural Information Processing Systems", "page_first": 9258, "page_last": 9268, "abstract": "The design of flow control systems remains a challenge due to the nonlinear nature of the equations that govern fluid flow. However, recent advances in computational fluid dynamics (CFD) have enabled the simulation of complex fluid flows with high accuracy, opening the possibility of using learning-based approaches to facilitate controller design. We present a method for learning the forced and unforced dynamics of airflow over a cylinder directly from CFD data. The proposed approach, grounded in Koopman theory, is shown to produce stable dynamical models that can predict the time evolution of the cylinder system over extended time horizons. Finally, by performing model predictive control with the learned dynamical models, we are able to find a straightforward, interpretable control law for suppressing vortex shedding in the wake of the cylinder.", "full_text": "Deep Dynamical Modeling and Control of\n\nUnsteady Fluid Flows\n\nJeremy Morton\u2217\n\njmorton2@stanford.edu\n\nFreddie D. Witherden\u2217\nfdw@stanford.edu\n\nAntony Jameson \u2020\n\nantony.jameson@tamu.edu\n\nMykel J. Kochenderfer\u2217\nmykel@stanford.edu\n\nAbstract\n\nThe design of \ufb02ow control systems remains a challenge due to the nonlinear nature\nof the equations that govern \ufb02uid \ufb02ow. However, recent advances in computational\n\ufb02uid dynamics (CFD) have enabled the simulation of complex \ufb02uid \ufb02ows with high\naccuracy, opening the possibility of using learning-based approaches to facilitate\ncontroller design. We present a method for learning the forced and unforced dy-\nnamics of air\ufb02ow over a cylinder directly from CFD data. 
The proposed approach,\ngrounded in Koopman theory, is shown to produce stable dynamical models that\ncan predict the time evolution of the cylinder system over extended time horizons.\nFinally, by performing model predictive control with the learned dynamical models,\nwe are able to \ufb01nd a straightforward, interpretable control law for suppressing\nvortex shedding in the wake of the cylinder.\n\nIntroduction\n\n1\nFluid \ufb02ow control represents a signi\ufb01cant challenge, with the potential for high impact in a variety of\nsectors, most notably the automotive and aerospace industries. While the time evolution of \ufb02uid \ufb02ows\ncan be described by the Navier-Stokes equations, their nonlinear nature means that many control\ntechniques, largely derived for linear systems, prove ineffective when applied to \ufb02uid \ufb02ows. An\nilluminating test case is the canonical problem of suppressing vortex shedding in the wake of air\ufb02ow\nover a cylinder. This problem has been studied extensively experimentally and computationally,\nwith the earliest experiments dating back to the 1960s [1]\u2013[6]. These studies have shown that\ncontroller design can prove surprisingly dif\ufb01cult, as controller effectiveness is highly sensitive to\n\ufb02ow conditions, measurement con\ufb01guration, and feedback gains [3], [5]. Nonetheless, at certain\n\ufb02ow conditions vortex suppression can be achieved with a simple proportional control law based on\na single sensor measurement in the cylinder wake. Thus, while the design of \ufb02ow controllers may\npresent considerable challenges, effective controllers may in fact prove relatively easy to implement.\nRecent advances in computational \ufb02uid dynamics (CFD) have enabled the numerical simulation of\npreviously intractable \ufb02ow problems for complex geometries [7]\u2013[11]. Such simulations are generally\nrun at great computational expense, and generate vast quantities of data. 
In response, the \ufb01eld of\nreduced-order modeling (ROM) has attracted great interest, with the aim of learning ef\ufb01cient dynami-\ncal models from the generated data. This research has yielded a wide array of techniques for learning\ndata-driven dynamical models, including balanced truncation, proper orthogonal decomposition, and\ndynamic mode decomposition [12]. Recent work has sought to incorporate reduced-order models\nwith robust- and optimal-control techniques to devise controllers for nonlinear systems [6], [13], [14].\nIn parallel, the machine learning community has devoted signi\ufb01cant attention to learning-based\ncontrol of complex systems. Model-free control approaches attempt to learn control policies without\n\n\u2217Department of Aeronautics and Astronautics, Stanford University\n\u2020Department of Aerospace Engineering, Texas A&M University\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\n\fconstructing explicit models for the environment dynamics, and have achieved impressive success\nin a variety of domains [15], [16]. However, model-free methods require many interactions with\nan environment to learn effective policies, which may render them infeasible for many \ufb02ow control\napplications where simulations are computationally expensive. In contrast, model-based control\napproaches have the potential to learn effective controllers with far less data by \ufb01rst modeling the\nenvironment dynamics. Model-based methods have been successful in some domains [17], [18], but\nsuch success hinges on the ability to construct accurate dynamical models.\nIn this work, we apply recent advances in the \ufb01elds of reduced-order modeling and machine learning\nto the task of designing \ufb02ow controllers. In particular, we extend an algorithm recently proposed\nby Takeishi et al. [19], leveraging Koopman theory to learn models for the forced and unforced\ndynamics of \ufb02uid \ufb02ow. 
We show that this approach is capable of stably modeling two-dimensional\nair\ufb02ow over a cylinder for signi\ufb01cant prediction horizons. We furthermore show that the learned\ndynamical models can be incorporated into a model predictive control framework to suppress vortex\nshedding. Finally, we discuss how the actions selected by this controller provide insight into a simple and\neasy-to-implement control law that is similarly effective. While applied to \ufb02uid \ufb02ow in this work, we\nnote that the proposed approach is general enough to be applied to other domains that require\nmodeling and control of high-dimensional dynamical systems.\n\n2 Modeling unforced dynamics\nLet xt \u2208 Rn be a state vector containing all data from a single time snapshot of an unforced \ufb02uid\n\ufb02ow simulation. Our goal is to discover a model of the form xt+1 = F (xt) that describes how the\nstate will evolve in time. The function F can take many forms; one possibility is that the system\ndynamics are linear, in which case the state updates will obey xt+1 = Kxt, with K \u2208 Rn\u00d7n. If we\nobserve a sequence of time snapshots x1:T +1, we can construct the matrices\n\nX = [x1, x2, . . . , xT ] and Y = [x2, x3, . . . , xT +1]\n\n(1)\n\nand subsequently \ufb01nd the matrix A = Y X\u2020, where X\u2020 is the Moore-Penrose pseudoinverse of X.\nAs T increases, A will asymptotically approach K [19], and hence approximate the true system\ndynamics. Such an approximation will in general only be accurate for systems with linear dynamics;\nin the following section we discuss how a similar approximation can be formed for nonlinear systems.\n\n2.1 The Koopman operator\nConsider a nonlinear discrete-time dynamical system described by xt+1 = F (xt). Furthermore,\nlet the Koopman operator K be an in\ufb01nite-dimensional linear operator that acts on all observable\nfunctions g : Rn \u2192 C. 
Koopman theory asserts that a nonlinear discrete-time system can be mapped\nto a linear discrete-time system, where the Koopman operator advances observations of the state\nforward in time [20]:\n\nKg(xt) = g(F (xt)) = g(xt+1).\n\n(2)\n\nWhile Koopman theory provides a lens under which nonlinear systems can be viewed as linear, its\napplicability is limited by the fact that the Koopman operator is in\ufb01nite-dimensional. However, if\nthere exists a \ufb01nite number of observable functions {g1, . . . , gm} that span a subspace G such that\nKg \u2208 G for any g \u2208 G, then G is considered to be an invariant subspace and the Koopman operator\nbecomes a \ufb01nite-dimensional operator K. We abuse notation by de\ufb01ning the vector-valued observable\ng = [g1, . . . , gm]\u22a4, and furthermore de\ufb01ne the matrices\n\n\u02dcX = [g(x1), g(x2), . . . , g(xT )] and \u02dcY = [g(x2), g(x3), . . . , g(xT +1)] .\n\n(3)\n\nThe matrix A = \u02dcY \u02dcX\u2020 will asymptotically approach the Koopman operator K with increasing T .\nTakeishi et al. showed that the task of \ufb01nding a set of observables that span an invariant subspace\nreduces to \ufb01nding a state mapping g(xt) under which linear least-squares regression performs well\n(i.e. the loss \u2016 \u02dcY \u2212 ( \u02dcY \u02dcX\u2020) \u02dcX\u2016^2_F is minimized), and proposed learning such a mapping with deep\nneural networks [19]. In experiments, their proposed algorithm was shown to perform well in analysis\nand prediction on a number of low-dimensional dynamical systems.\n\n\f[Figure 1 diagram: X = [x1, x2, . . . , xT ] and Y = [x2, x3, . . . , xT +1] pass through the encoder to give \u02dcX = [g(x1), g(x2), . . . , g(xT )] and \u02dcY = [g(x2), g(x3), . . . , g(xT +1)]; least squares yields A; \u02dcYpred = [Ag(x1), A^2 g(x1), . . . , A^T g(x1)] passes through the decoder, \u02c6Y = g\u22121( \u02dcYpred), to give \u02c6X = [\u02c6x1, \u02c6x2, . . . , \u02c6xT ] and \u02c6Y = [\u02c6x2, \u02c6x3, . . . , \u02c6xT +1]]\n\nFigure 1: Illustration of procedure used to train Deep Koopman dynamical models.\n\n2.2 Deep Koopman dynamical model\nWe now present the Deep Koopman model, which employs a modi\ufb01ed form of the training algorithm\nproposed by Takeishi et al. to learn state mappings that approximately span a Koopman invariant\nsubspace. The training algorithm is depicted in Fig. 1. First, a sequence of time snapshots x1:T +1 is\nused to construct the matrices X and Y de\ufb01ned in Eq. (1). These matrices are fed into an encoder\nneural network, which serves as the mapping g(xt) and produces the matrices \u02dcX and \u02dcY de\ufb01ned in\nEq. (3). Subsequently, a linear least-squares \ufb01t is performed to \ufb01nd an A-matrix that can propagate\nthe state mappings forward in time. Finally, \u02dcX and the propagated state mappings are fed into a\ndecoder that functions as g\u22121 to yield the matrices \u02c6X and \u02c6Y , approximations to X and Y .\nThe Deep Koopman model is trained to minimize L = \u2016X \u2212 \u02c6X\u2016^2_F + \u2016Y \u2212 \u02c6Y \u2016^2_F , where \u02c6Y is obtained\nby running \u02dcYpred through the decoder. Minimizing the error between X and \u02c6X enforces that the\nmapping g(xt) is invertible, while minimizing the error between Y and \u02c6Y enforces that the derived\ndynamical model can accurately simulate the time evolution of the system. One main difference\nbetween our algorithm and that proposed by Takeishi et al. is that we force the model to simulate\nthe time evolution of the system during training in a manner that mirrors how the model will be\ndeployed at test time. In particular, we apply the derived A-matrix recursively to the state mapping g(x1)\nto produce the matrix \u02dcYpred de\ufb01ned in Fig. 
1, which is then mapped to \u02c6Y through the decoder.\nTo better clarify another new feature of our proposed algorithm, it is worth drawing a distinction\nbetween reconstruction and prediction. If a dynamical model is constructed based on a sequence of\nstates x1:N , then simulations generated by the dynamical model would be reconstructing the already\nobserved time evolution of the system for all time steps t \u2264 N, and predicting the time evolution of\nthe system for all time steps t > N. We would like to train a dynamical model that we can ultimately\nuse to predict the future time evolution of a given system. Thus, during training we generate the\nA-matrix based on only the \ufb01rst T /2 entries of \u02dcX and \u02dcY , thereby enforcing that the last T /2 entries\nof \u02dcYpred are purely predictions for how the system will evolve in time.\nOne of the advantages of this approach is its relative simplicity, as the neural network architecture is\nequivalent to that of a standard autoencoder. The dynamics matrix A does not need to be modeled\ndirectly; rather, it is derived by performing least squares on the learned state mappings. In our\nimplementation, the encoder consists of ResNet convolutional layers [21] with ReLU activations\nfollowed by fully connected layers, while the decoder inverts all operations performed by the encoder.\nWe applied L2 regularization to the weights in the encoder and decoder. The gradients for all\noperations are de\ufb01ned in Tensor\ufb02ow [22], and the entire model can be trained end-to-end to learn\nsuitable state mappings g(xt).\n\n2.3 Experiments\nWe now provide a description for how we train and evaluate the Deep Koopman models. 
We \ufb01rst\ndescribe the test case under study, then quantify the ability of the Deep Koopman model to learn the\nsystem dynamics.\n\n2.3.1 Test case\nThe system under consideration is a two-dimensional circular cylinder at Reynolds number 50 and\nan effectively incompressible Mach number of 0.2. This is a well-studied test case [23], [24] which\nhas been used extensively both for validation purposes and as a legitimate research case in its own\nright.\n\n\f[Figure 2 panels: (a) Density, (b) x-Momentum, (c) y-Momentum, (d) Energy]\n\nFigure 2: Format of inputs to neural network. Different physical quantities are treated as different\nchannels in the input.\n\nThe chosen Reynolds number is just above the cut-off for laminar \ufb02ow and thus results in the\nformation of a von K\u00e1rm\u00e1n vortex street, where vortices are shed from the upper and lower surface\nof the cylinder in a periodic fashion. Vortex shedding gives rise to strong transverse forces, and is\nassociated with higher drag and unsteady lift forces [2], [6].\nTo perform the \ufb02uid \ufb02ow simulations, we use a variation of the high-order accurate PyFR\nsolver [25]. The surface of the cylinder is modeled as a no-slip isothermal wall boundary condi-\ntion, and Riemann invariant boundary conditions are applied at the far-\ufb01eld. The domain is meshed\nusing 5672 unstructured, quadratically curved quadrilateral elements. All simulations are run using\nquadratic solution polynomials and an explicit fourth order Runge\u2013Kutta time stepping scheme.\nA training dataset is constructed by saving time snapshots of the system every 1500 solver steps. To\nmake the simulation data suitable for incorporation into a training algorithm, the data is formatted\ninto image-like inputs before storage. Solution quantities are sampled from a 128 \u00d7 256 grid\nof roughly equispaced points in the neighborhood of the cylinder. 
Each grid point contains four\nchannels corresponding to four physical quantities in the \ufb02ow at that point: density, x-momentum,\ny-momentum, and energy. An example snapshot can be found in Fig. 2, illustrating the qualitative\ndifferences between the four distinct input channels.\n\n2.3.2 Deep Variational Bayes Filter\nWe compare the Deep Koopman model with the Deep Variational Bayes Filter (VBF) [26] to establish a\nperformance baseline. The Variational Bayes Filter can be viewed as a form of state space model, which seeks\nto map high-dimensional inputs xt to lower-dimensional latent states zt that can be evolved forward\nin time. The VBF is a recently proposed approach that improves upon previous state space models\n(e.g. Embed to Control [17]) and evolves latent states forward linearly in time, and thus serves as a\nsuitable performance benchmark for the Deep Koopman model.\nWe use the same autoencoder architecture employed by the Deep Koopman model to perform the\nforward and inverse mappings between the inputs xt and the latent states zt. As with the Deep\nKoopman model, the inputs are time snapshots from CFD simulations. The time evolution of the\nlatent states is described by zt+1 = Atzt + Btut + Ctwt, where ut is the control input at time t and\nwt represents process noise. The matrices At, Bt, and Ct are assumed to comprise a locally linear\ndynamical model, and are determined at each time step as a function of the current latent state and\ncontrol input. Since we seek to model the unforced dynamics of the \ufb02uid \ufb02ow, we ignore the effect of\ncontrol inputs in our implementation of the Deep VBF.\n\n2.3.3 Results\nIn addition to the Deep VBF, we also benchmark the Deep Koopman model against a model trained\nusing the procedure proposed by Takeishi et al., which sets \u02dcYpred = A \u02dcX rather than calculating \u02dcYpred\nby applying A recursively to g(x1). 
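The distinction between the one-step objective and the recursive rollout can be made concrete with a minimal NumPy sketch; the latent trajectory below is a toy linear system standing in for the encoder outputs g(xt), not the paper's implementation:

```python
import numpy as np

def fit_A(X_tilde, Y_tilde):
    # Least-squares dynamics matrix A = Ytilde Xtilde^+ (Moore-Penrose pseudoinverse).
    return Y_tilde @ np.linalg.pinv(X_tilde)

def rollout(A, g_x1, T):
    # Apply A recursively to g(x1): columns are A g(x1), A^2 g(x1), ..., A^T g(x1).
    preds, g = [], g_x1
    for _ in range(T):
        g = A @ g
        preds.append(g)
    return np.column_stack(preds)

# Toy latent trajectory generated by a known linear map (illustration only).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])
g0 = rng.standard_normal(2)
traj = np.column_stack([np.linalg.matrix_power(A_true, k) @ g0 for k in range(9)])
X_tilde, Y_tilde = traj[:, :-1], traj[:, 1:]

A = fit_A(X_tilde, Y_tilde)           # one-step fit: Ytilde_pred = A Xtilde
Y_pred = rollout(A, traj[:, 0], T=8)  # recursive rollout from g(x1)
print(np.allclose(Y_pred, Y_tilde))
```

For a genuinely linear latent trajectory the two schemes coincide; the point of training through the recursive rollout is that, with learned and imperfect mappings, it penalizes the compounding of errors that the one-step objective ignores.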
Each model is trained on 32-step sequences of data extracted\nfrom two-dimensional cylinder simulations. We then use the trained models to recreate the time\nevolution of the system observed during 20 test sequences and extract the error over time. For a fair\ncomparison, g(xt) and zt are de\ufb01ned to be 32-dimensional vectors. The Koopman method constructs\nits dynamical model based on state mappings from the \ufb01rst 16 time steps, then simulates the system\nfor all time steps using the derived A-matrix. The Takeishi baseline derives its A-matrix based on\nstate mappings from the \ufb01rst 32 time steps. The VBF constructs a new locally linear dynamical\nmodel at each time step, but relies on information from the \ufb01rst 32 time steps to sample the initial\nvalue of the process noise w0.\n\n\f[Figure 3 plots: relative error vs. time step for the Koopman, Takeishi, and VBF models; (a) 32-step predictions, (b) 64-step predictions]\n\nFigure 3: Average prediction errors over time for Deep Koopman and Deep Variational Bayes Filter\nmodels. Solid lines represent the mean prediction error across 20 sequences, while the shaded regions\ncorrespond to one standard deviation about the mean.\n\nThe results of these experiments can be found in Fig. 3a and Fig. 3b, where the error metric is the\nrelative error, de\ufb01ned as the L1-norm of the prediction error normalized by the L1-norm of the\nground-truth solution. We can see in Fig. 3a that the error for the Takeishi baseline initially grows\nmore rapidly than the error for the other models. 
This illustrates the importance of training models to\ngenerate recursive predictions, since models trained to make single-step predictions tend to generate\npoor multi-step predictions due to prediction errors compounding over time [27]. Over a 32-step\nhorizon the Deep Koopman and VBF models perform comparably, with the error for the Koopman\nmodel rising slightly at later time steps as it begins generating predictions for states that it did not\nhave access to in constructing its dynamical model. A much starker contrast in model performance\ncan be observed in Fig. 3b, where the Variational Bayes Filter begins to rapidly accumulate error\nonce it surpasses the 32-step time horizon.\nFor the results shown in Fig. 3b, the Variational Bayes Filter is performing reconstruction for 32 steps\nand prediction for 32 steps. Hence, we see that the VBF is effective at reconstruction, but is unable to\nfunction stably in prediction. In contrast, we see that the Deep Koopman model, aided by its ability\nto construct state mappings that approximately span an invariant subspace, is able to generate stable\npredictions for much longer time horizons. In fact, while the prediction error of the Koopman model\ndoes grow with time, its mean prediction error remains less than 0.2% over a horizon of 128 time\nsteps, corresponding to approximately eight periods of vortex shedding. Since we ultimately want a\npredictive dynamical model that can be incorporated into a control framework, we conclude that the\nDeep Koopman model is well suited for the task.\n\n3 Modeling forced dynamics\nWe now explain how the proposed Deep Koopman algorithm can be extended to account for the\neffect that control inputs have on the time evolution of dynamical systems.\n\n3.1 Deep Koopman model with control\nWe have already demonstrated how the Deep Koopman algorithm can learn state mappings that are\nsuitable for modeling unforced dynamics. 
In accounting for control inputs, we now aim to construct\na linear dynamical model of the form g(xt+1) = Ag(xt) + But, where ut \u2208 Rm and B \u2208 Rn\u00d7m.\nDe\ufb01ning the matrix \u0393 = [u1, . . . , uT ], we would like to \ufb01nd a dynamical model (A, B) that satis\ufb01es\n\u02dcY = A \u02dcX + B\u0393. Proctor et al. presented several methods for estimating A and B given matrices\n\u02dcX and \u02dcY [28]. In this work, we choose to treat B as a known quantity, which means that A can be\nestimated through a linear least-squares \ufb01t\n\nA = ( \u02dcY \u2212 B\u0393) \u02dcX\u2020.\n\n(4)\n\nThus, the Deep Koopman training algorithm presented in Section 2.2 can be modi\ufb01ed such that A is\ngenerated through Eq. (4). While we treat B as a known quantity, in reality it is another parameter\nthat we must estimate. We account for this by de\ufb01ning a global B-matrix, whose parameters are\noptimized by gradient descent during training along with the neural network parameters.\n\n\f3.2 Modi\ufb01ed test case\nWith the ability to train Koopman models that account for control inputs, we now consider a modi\ufb01ed\nversion of the two-dimensional cylinder test case that allows for a scalar control input to affect the \ufb02uid\n\ufb02ow. In particular, the simulation is modi\ufb01ed so that the cylinder can rotate with a prescribed angular\nvelocity. Cylinder rotation is modeled by applying a spatially varying velocity to the surface of the\nwall, thus enabling the grid to remain static. The angular velocity is allowed to vary every 1500 solver\nsteps, with the value held constant during all intervening steps.\n\n3.3 Training process\nWe train Koopman models on data from the modi\ufb01ed test case to construct models of the forced\ndynamics. A training dataset is collected by simulating the two-dimensional cylinder system with\ntime-varying angular velocity. Every 1500 solver steps, a time snapshot xt is stored and the control\ninput ut is altered. 
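The estimation step of Eq. (4) can be sketched as follows; all quantities here are synthetic stand-ins for the encoder's latent mappings and the applied control inputs, not data from the paper:

```python
import numpy as np

# Sketch of the least-squares fit in Eq. (4): with B treated as known,
# A = (Ytilde - B Gamma) Xtilde^+ recovers the latent dynamics matrix.
rng = np.random.default_rng(1)
n, m, T = 3, 1, 50
A_true = 0.3 * rng.standard_normal((n, n))  # stable toy dynamics (stand-in)
B = rng.standard_normal((n, m))             # in the paper, B is learned by gradient descent

g = rng.standard_normal(n)
Gamma = rng.standard_normal((m, T))         # control inputs u_1, ..., u_T
X_cols, Y_cols = [], []
for t in range(T):
    X_cols.append(g)
    g = A_true @ g + B @ Gamma[:, t]        # forced latent dynamics
    Y_cols.append(g)
X_tilde, Y_tilde = np.column_stack(X_cols), np.column_stack(Y_cols)

A_est = (Y_tilde - B @ Gamma) @ np.linalg.pinv(X_tilde)  # Eq. (4)
print(np.allclose(A_est, A_true, atol=1e-6))
```

Because the synthetic data satisfy the model exactly, the fit recovers A up to numerical precision; with learned mappings the same expression gives the least-squares optimum instead.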
In total, the training set contains 4238 snapshots of the system. We then divide\nthese snapshots into 1600 staggered 32-step sequences for training the Deep Koopman model. As\nin the case of unforced dynamics, during training dynamical models are constructed based on\ninformation from the \ufb01rst 16 time steps, but the system is simulated for 32 time steps. Training a\nsingle model takes approximately 12 hours on a Titan X GPU.\nThe form of control inputs applied to the system in generating the training data has a strong effect\non the quality of learned models. Analogous to frequency sweeps in system identi\ufb01cation [29], we\nsubject the system to sinusoidal inputs with a frequency that increases linearly with time. These\nsinusoidal inputs are interspersed with periods of zero control input to allow the model to learn the\nunforced system dynamics from different initial conditions.\n\n4 Model predictive control\nWe evaluate the quality of the learned Deep Koopman models by studying their ability to enable\neffective control of the modeled system. In particular, we incorporate the learned dynamical models\ninto a model predictive control (MPC) framework with the aim of suppressing vortex shedding. At\neach time step in MPC, we seek to \ufb01nd a sequence of inputs that minimizes the \ufb01nite-horizon cost:\n\nJT = \u2211_{t=1}^{T} (ct \u2212 cgoal)\u22a4Q(ct \u2212 cgoal) + \u2211_{t=1}^{T\u22121} Ru_t^2,\n\n(5)\n\nwhere ct = g(xt) represents the observable of state xt, cgoal = g(xgoal) represents the observable\nof goal state xgoal, Q is a positive de\ufb01nite matrix penalizing deviation from the goal state, and R is\na nonnegative scalar penalizing nonzero control inputs. We can furthermore apply the constraints\n|ut| < umax \u2200 t, c1 = g(x1), and ct+1 = Act + But for t = 1, . . . , T \u2212 1, where A and B are generated\nby the Deep Koopman model. 
As formulated, this optimization problem is a quadratic program, which\ncan be solved ef\ufb01ciently with the CVXPY software [30].\nFor xgoal we use a snapshot of the steady \ufb02ow observed when the cylinder system is simulated at\na Reynolds number of 45, which is a suf\ufb01ciently low Reynolds number that vortex shedding does\nnot occur. While the \ufb02ow at this lower Reynolds number is qualitatively different from the \ufb02ow at a\nReynolds number of 50, we \ufb01nd that formulating the problem in this way leads to a reliable estimate\nof the cost, as demonstrated in the next section.\nWe use an MPC horizon of T = 16 time steps. This aligns with the Deep Koopman model training\nprocess, where the neural network generates predictions for 16 time steps beyond what it uses to\nconstruct its dynamical model. At each time step, we \ufb01nd state mappings for the previous 16 time\nsteps, and use those mappings in conjunction with the global B-matrix to \ufb01nd a suitable A-matrix for\npropagating the state mappings forward in time. We then solve the optimization problem described\nby Eq. (5) to \ufb01nd u\u2217_{1:T}, the optimal sequence of control inputs. We set Q = I, the identity matrix,\nand R \u223c 10^5, which accounts for the fact that \u2016ct \u2212 cgoal\u2016_2 is typically orders of magnitude larger\nthan |ut| and discourages actions that are too extreme for the dynamical model to handle accurately.\nThe \ufb01rst input, u\u2217_1, is passed to the CFD solver, which advances the simulation forward in time.\n\n4.1 MPC results\nWe now present the results of performing model predictive control on the two-dimensional cylinder\nproblem. 
To evaluate the effectiveness of the derived controller, we require a measure of closeness\nto the desired outcome; in this case, this desired outcome is to achieve a steady laminar \ufb02ow\ndevoid of vortex shedding.\n\n\f[Figure 4 panels: x-momentum snapshots at (a) t = 0, (b) t = 100, (c) t = 400, (d) t = 700]\n\nFigure 4: Snapshots of x-momentum over time as the MPC algorithm attempts to suppress vortex\nshedding.\n\n[Figure 5 plots: residual and scaled cost vs. time step (above); MPC input and 0.4 \u00b7 v vs. time step (below)]\n\nFigure 5: Above: scaled residuals plotted alongside estimated cost used for MPC action selection.\nBelow: control inputs selected by MPC plotted along with scaled y-velocity measurements.\n\nThe Navier\u2013Stokes equations are a conservation law, taking the form\n\u2202q/\u2202t = \u2212\u2207\u00b7f (q,\u2207q), where q = [\u03c1, \u03c1u, \u03c1v, E] is a vector of the density, x-momentum, y-momentum,\nand energy \ufb01elds respectively, and f (q,\u2207q) is a suitably de\ufb01ned \ufb02ux function. In the process of\nrunning CFD simulations, PyFR evaluates the right-hand side of this equation by calculating residuals.\nNote that if a steady \ufb02ow is achieved, the time derivative will be zero and in turn the residuals will be\nzero. Thus, we use residual values as a measure of closeness to the desired steady \ufb02ow. In particular,\nwe extract the norm of the residuals for x- and y-momentum over time.\nResults from the model predictive control experiments can be found in Fig. 4 and Fig. 5. In Fig. 4,\nwe get a qualitative picture of the effectiveness of the applied control, as the cylinder wake exhibits\na curved region of low x-momentum characteristic of vortex shedding at early time steps, then\n\ufb02attens out to a pro\ufb01le more characteristic of laminar \ufb02ow over time. 
In the upper plot of Fig. 5,\nwe get a quantitative picture of the controller performance, showing that the controller brings\nabout a monotonic decrease in the residuals over time. Additionally, we plot a scaled version of\n\u2016g(xt) \u2212 g(xgoal)\u2016_2. Interestingly, we note that there is a strong correspondence between a decrease\nin the residuals and a decrease in the cost that model predictive control is attempting to minimize.\nThis provides con\ufb01dence that using this measure of cost in MPC is sensible for this problem.\nThe lower plot in Fig. 5 shows the control inputs applied to the system over time. A dynamical model\ncannot be constructed until 16 states have been observed, so the control inputs are initially set to zero.\nSubsequently, the inputs appear to vary sinusoidally and decrease in amplitude over time. Remarkably,\nit is possible to \ufb01nd a location in the wake of the cylinder, denoted by d\u2217, where the\nvariations in y-velocity are in phase with the selected control inputs. When scaled by a constant value\nof 0.4, we see a strong overlap between the control inputs and velocity values. Viewed in this light,\nwe note that the controller obtained through MPC is quite interpretable, and is functionally similar to\na proportional controller performing feedback based on y-velocity measurements with a gain of 0.4.\n\n4.2 Proportional control\nWith the insights gained in the previous section, we now test the effectiveness of a simple proportional\ncontrol scheme in suppressing vortex shedding. 
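The interpretable law suggested by the MPC results amounts to a single line of feedback; as a sketch, where the y-velocity measurement at d\u2217 is a hypothetical stand-in for probing the CFD solution:

```python
import numpy as np

# Sketch of the proportional law suggested by the MPC results: set the
# cylinder's angular velocity to K_P times the y-velocity measured at the
# wake point d*. Obtaining that measurement from the solver is assumed.
K_P = 0.4  # gain identified from the MPC inputs

def proportional_input(v_y_at_d_star, u_max=None):
    """Angular-velocity command u_t = K_P * v_y(d*), optionally saturated."""
    u = K_P * v_y_at_d_star
    if u_max is not None:
        u = float(np.clip(u, -u_max, u_max))
    return u
```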
Rather than selecting inputs with MPC, we set the\nangular velocity of the cylinder by applying a gain of 0.4 to measurements of y-velocity at point d\u2217.\n\n\f[Figure 6 schematic: measurement points at d\u2217, (1/2)d\u2217, and 2d\u2217 in the cylinder wake]\n\nFigure 6: Schematic of measurement points for performing proportional control.\n\n[Figure 7 plot: residual vs. time step for measurement locations d\u2217, (1/2)d\u2217, and 2d\u2217]\n\nFigure 7: Calculated residuals over time for different measurement locations.\n\nWhile such a control law is easy to implement, we perform additional experiments to illustrate that it is\nnot easy to \ufb01nd. In these experiments, we attempt to perform proportional control with the same gain\nbased on measurements at two additional locations, (1/2)d\u2217 and 2d\u2217, as illustrated in Fig. 6.\nThe results of these experiments, summarized in Fig. 7, demonstrate that proportional control based\non measurements at d\u2217 is effective at suppressing vortex shedding. Meanwhile, proportional control\nlaws based on measurements at the other locations are unable to drive the system closer to the desired\nsteady, laminar \ufb02ow. These results are in agreement with previous studies [3], which showed that the\neffectiveness of proportional control is highly dependent upon the measurement location.\n\n5 Related work\nWhile originally introduced in the 1930s [31], the Koopman operator has attracted renewed interest\nover the last decade within the reduced-order modeling community due to its connection to the\ndynamic mode decomposition (DMD) algorithm [20], [32]. DMD \ufb01nds approximations to the\nKoopman operator under the assumption that the state variables x span an invariant subspace.\nHowever, they will not span an invariant subspace if the underlying dynamics are nonlinear. 
Extended\nDMD (eDMD) approaches build upon DMD by employing a richer set of observables g(x), which\ntypically need to be speci\ufb01ed manually [33]. A number of recent works have studied whether deep\nlearning can be used to learn this set of observables automatically, thereby circumventing the need to\nhand-specify a dictionary of functions [19], [34]\u2013[36].\nIn the machine learning community, recent work focused on learning deep dynamical models from\ndata has shown that these models can enable more sample-ef\ufb01cient learning of effective con-\ntrollers [37]\u2013[39]. Our work most closely parallels work in state representation learning (SRL) [40],\nwhich focuses on learning low-dimensional features that are useful for modeling the time evolution\nof high-dimensional systems. Recent studies have worked toward learning state space models from\nimage inputs for a variety of tasks, and have been designed to accommodate stochasticity in the\ndynamics and measurement noise [17], [26], [41]. Given that both Koopman-centric approaches\nand SRL attempt to discover state mappings that are useful for describing the time evolution of\nhigh-dimensional systems, an opportunity likely exists to bridge the gap between these \ufb01elds.\n\n6 Conclusions\nWe introduced a method for training Deep Koopman models, demonstrating that the learned models\nwere capable of stably simulating air\ufb02ow over a cylinder for signi\ufb01cant prediction horizons. Further-\nmore, we detailed how the Koopman models could be modi\ufb01ed to account for control inputs and\nthereby leveraged for \ufb02ow control in order to suppress vortex shedding. 
Because it learns sufficiently accurate dynamical models from approximately 4000 training examples, the method is highly sample efficient, which is important given the large computational cost associated with CFD simulations. Most importantly, by incorporating the Deep Koopman model into an MPC framework, we showed that the resulting control law was both interpretable and sensible, aligning with well-studied flow control approaches from the literature. Future work will focus on applying the proposed approach to flows at higher Reynolds numbers to see how its effectiveness scales to increasingly complex flows. Furthermore, we hope to apply the proposed approach to other flow control problems, studying whether it can provide similar insight into how to design controllers for other applications. The code associated with this work can be found at https://github.com/sisl/deep_flow_control.

Acknowledgments

The authors would like to thank the reviewers for their insightful feedback. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-114747. The authors would like to thank the Air Force Office of Scientific Research for their support via grant FA9550-14-1-0186.

References

[1] E. Berger, “Suppression of vortex shedding and turbulence behind oscillating cylinders,” The Physics of Fluids, vol. 10, no. 9, pp. S191–S193, 1967.
[2] K. Roussopoulos, “Feedback control of vortex shedding at low Reynolds numbers,” Journal of Fluid Mechanics, vol. 248, pp. 267–296, 1993.
[3] D. S. Park, D. M. Ladd, and E. W. Hendricks, “Feedback control of von Kármán vortex shedding behind a circular cylinder at low Reynolds numbers,” Physics of Fluids, vol. 6, no. 7, pp. 2390–2405, 1994.
[4] M. D. Gunzburger and H. C.
Lee, “Feedback control of Kármán vortex shedding,” Journal of Applied Mechanics, vol. 63, no. 3, pp. 828–835, 1996.
[5] S. J. Illingworth, H. Naito, and K. Fukagata, “Active control of vortex shedding: An explanation of the gain window,” Physical Review E, vol. 90, no. 4, p. 043014, 2014.
[6] S. J. Illingworth, “Model-based control of vortex shedding at low Reynolds numbers,” Theoretical and Computational Fluid Dynamics, vol. 30, no. 5, pp. 429–448, 2016.
[7] H. T. Huynh, “A flux reconstruction approach to high-order schemes including discontinuous Galerkin methods,” in 18th AIAA Computational Fluid Dynamics Conference, 2007, p. 4079.
[8] P. E. Vincent, P. Castonguay, and A. Jameson, “A new class of high-order energy stable flux reconstruction schemes,” Journal of Scientific Computing, vol. 47, no. 1, pp. 50–72, 2011.
[9] P. Castonguay, P. E. Vincent, and A. Jameson, “A new class of high-order energy stable flux reconstruction schemes for triangular elements,” Journal of Scientific Computing, vol. 51, no. 1, pp. 224–256, 2012.
[10] D. Williams and A. Jameson, “Energy stable flux reconstruction schemes for advection–diffusion problems on tetrahedra,” Journal of Scientific Computing, vol. 59, no. 3, pp. 721–759, 2014.
[11] A. Jameson, P. E. Vincent, and P. Castonguay, “On the non-linear stability of flux reconstruction schemes,” Journal of Scientific Computing, vol. 50, no. 2, pp. 434–445, 2012.
[12] C. W. Rowley and S. T. Dawson, “Model reduction for flow analysis and control,” Annual Review of Fluid Mechanics, vol. 49, no. 1, pp. 387–417, 2017.
[13] E. Kaiser, J. N. Kutz, and S. L. Brunton, “Data-driven discovery of Koopman eigenfunctions for control,” arXiv preprint arXiv:1707.01146, 2017.
[14] M. Korda and I.
Mezić, “Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control,” arXiv preprint arXiv:1611.03537, 2016.
[15] M. Hessel, J. Modayil, H. van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. G. Azar, and D. Silver, “Rainbow: Combining improvements in deep reinforcement learning,” arXiv preprint arXiv:1710.02298, 2017.
[16] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.
[17] M. Watter, J. Springenberg, J. Boedecker, and M. Riedmiller, “Embed to control: A locally linear latent dynamics model for control from raw images,” in Advances in Neural Information Processing Systems (NIPS), 2015.
[18] M. Deisenroth and C. Rasmussen, “PILCO: A model-based and data-efficient approach to policy search,” in International Conference on Machine Learning (ICML), 2011.
[19] N. Takeishi, Y. Kawahara, and T. Yairi, “Learning Koopman invariant subspaces for dynamic mode decomposition,” in Advances in Neural Information Processing Systems (NIPS), 2017.
[20] J. Kutz, S. Brunton, B. Brunton, and J. Proctor, Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems. Society for Industrial and Applied Mathematics, 2016.
[21] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[22] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O.
Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, TensorFlow: Large-scale machine learning on heterogeneous systems, Software available from tensorflow.org, 2015.
[23] A. Roshko, “On the wake and drag of bluff bodies,” Journal of the Aeronautical Sciences, vol. 22, no. 2, pp. 124–132, 1955.
[24] C. H. K. Williamson, “Vortex dynamics in the cylinder wake,” Annual Review of Fluid Mechanics, vol. 28, no. 1, pp. 477–539, 1996.
[25] F. D. Witherden, A. M. Farrington, and P. E. Vincent, “PyFR: An open source framework for solving advection–diffusion type problems on streaming architectures using the flux reconstruction approach,” Computer Physics Communications, vol. 185, no. 11, pp. 3028–3040, 2014.
[26] M. Karl, M. Soelch, J. Bayer, and P. van der Smagt, “Deep variational Bayes filters: Unsupervised learning of state space models from raw data,” in International Conference on Learning Representations (ICLR), 2017.
[27] A. Venkatraman, M. Hebert, and J. A. Bagnell, “Improving multi-step prediction of learned time series models,” in AAAI Conference on Artificial Intelligence, 2015.
[28] J. L. Proctor, S. L. Brunton, and J. N. Kutz, “Dynamic mode decomposition with control,” SIAM Journal on Applied Dynamical Systems, vol. 15, no. 1, pp. 142–161, 2016.
[29] B. Mettler, M. Tischler, and T. Kanade, “System identification of small-size unmanned helicopter dynamics,” Annual Forum Proceedings - American Helicopter Society, vol. 2, pp. 1706–1717, 1999.
[30] S. Diamond and S. Boyd, “CVXPY: A Python-embedded modeling language for convex optimization,” Journal of Machine Learning Research, vol. 17, no. 83, pp. 1–5, 2016.
[31] B. O. Koopman, “Hamiltonian systems and transformation in Hilbert space,” Proceedings of the National Academy of Sciences, vol. 17, no. 5, pp.
315–318, 1931.
[32] C. W. Rowley, I. Mezić, S. Bagheri, P. Schlatter, and D. S. Henningson, “Spectral analysis of nonlinear flows,” Journal of Fluid Mechanics, vol. 641, pp. 115–127, 2009.
[33] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, “A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition,” Journal of Nonlinear Science, vol. 8, no. 1, pp. 1–40, 2015.
[34] B. Lusch, J. N. Kutz, and S. L. Brunton, “Deep learning for universal linear embeddings of nonlinear dynamics,” arXiv preprint arXiv:1712.09707, 2018.
[35] Q. Li, F. Dietrich, E. M. Bollt, and I. G. Kevrekidis, “Extended dynamic mode decomposition with dictionary learning: A data-driven adaptive spectral decomposition of the Koopman operator,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 27, no. 10, p. 103111, 2017.
[36] E. Yeung, S. Kundu, and N. Hodas, “Learning deep neural network representations for Koopman operators of nonlinear dynamical systems,” arXiv preprint arXiv:1708.06850, 2017.
[37] N. Mishra, P. Abbeel, and I. Mordatch, “Prediction and control with temporal segment models,” arXiv preprint arXiv:1703.04070, 2017.
[38] A. Nagabandi, G. Kahn, R. S. Fearing, and S. Levine, “Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning,” arXiv preprint arXiv:1708.02596, 2017.
[39] A. Nagabandi, G. Yang, T. Asmar, R. Pandya, G. Kahn, S. Levine, and R. S. Fearing, “Learning image-conditioned dynamics models for control of under-actuated legged millirobots,” arXiv preprint arXiv:1711.05253, 2017.
[40] T. Lesort, N. Díaz-Rodríguez, J.-F. Goudou, and D. Filliat, “State representation learning for control: An overview,” arXiv preprint arXiv:1802.04181, 2018.
[41] M. Fraccaro, S. Kamronn, U. Paquet, and O.
Winther, “A disentangled recognition and nonlinear dynamics model for unsupervised learning,” in Advances in Neural Information Processing Systems (NIPS), 2017.