__ Summary and Contributions__: The high-level context here is excitatory-inhibitory (E/I) balance in spiking neural networks, a question of substantial interest in computational neuroscience. The earliest work in the area showed that unless a neuron's membrane potential hovers around threshold, it is very difficult to explain its Poisson-like spike-train output, a hallmark of cortical neurons. But what use is E/I balance? The recent line of work whose framework this paper most closely follows has posited that signal reconstruction (assuming a rate code) is best achieved in this regime.
This paper takes a different tack. It "assumes" that the membrane potential of each neuron should hover around threshold, and then back-calculates from this assumption an objective function (by treating the assumption as a set of KKT conditions and asking what objective would have those KKT conditions).
This objective is what the author(s) call the normative objective, even though there is really no independent reason for it to be an objective, except that if it were, a network whose neurons generate spikes to optimize it would, if it reached equilibrium, display detailed excitatory-inhibitory balance.
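For concreteness, the generic inverse-KKT construction I understand here is the following (my own schematic notation, not the paper's exact equations):

```latex
% Postulate a constrained objective over nonnegative rates r and
% read the voltage off the stationarity condition:
\begin{align*}
  \max_{r \ge 0} \; L(r)
  \quad\Longrightarrow\quad
  \underbrace{[\nabla_r L(r)]_i}_{\text{identify with } V_i} \le 0,
  \qquad r_i \ge 0,
  \qquad r_i \, [\nabla_r L(r)]_i = 0 \;\; \forall i .
\end{align*}
% Complementary slackness then says each neuron either fires
% (r_i > 0 with V_i pinned at zero, i.e., at threshold) or is
% silent (V_i < 0): the voltage hovers at threshold exactly when
% the neuron is active, which is the assumed balanced regime.
```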
The paper then goes on to show how several other objectives that have been used by other papers can be mapped to their objective.

__ Strengths__: The paper is fairly clearly written. The derivations are sound. Multiple applications are shown, including what the weights of the network should be, for a given input connectivity matrix, to minimize reconstruction error, as well as studies of fixed-point, ring, and grid attractors.

__ Weaknesses__: The biggest concern about this paper is that it is an incremental advance over "Minimax and Hamiltonian Dynamics of Excitatory-Inhibitory Networks" (1997) by Seung, Richardson, Lagarias, and Hopfield, and the paper makes only passing reference to this fact. Incidentally, one of the goals of that earlier paper was to incorporate Dale's law into Hopfield networks.
The minimax objective is identical to the previous paper's Lyapunov function, and it suffers from the same constraints as the previous paper: W^EE and W^II must be symmetric and W^IE = (W^EI)^T (something that Hopfield networks have to assume to enforce fixed-point dynamics, whereas there is no evidence that cortical networks satisfy this).
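To spell out why the symmetry assumption matters (this is the standard Hopfield-style argument as I understand it, not an excerpt from the paper):

```latex
% A quadratic energy E(r) = -\tfrac12 r^\top W r - b^\top r generates
% gradient-like dynamics \dot r \propto -\nabla E(r) with
%   \nabla E(r) = -\tfrac12 (W + W^\top) r - b ,
% so only the symmetric part of W is visible to the gradient flow;
% E is a Lyapunov function (\dot E = -\|\nabla E\|^2 \le 0 along
% trajectories) only when the dynamics really are this gradient flow,
% i.e., when W is symmetric.  With W^{IE} = (W^{EI})^\top, the E--I
% cross-coupling enters the combined objective antisymmetrically,
% which is what forces the saddle-point (minimax) formulation rather
% than pure minimization.
```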
The counterclaim would be that porting the earlier paper's results to spiking neurons is by no means easy (this, at the very least, should have been emphasized in the paper), so it is worth examining. The original paper takes the input into a neuron to be a function of the membrane potential of the presynaptic neuron, whereas this paper considers the spike train generated by the presynaptic neuron. However, at crucial places, the spike trains are replaced by their low-pass-filtered rates to push the analysis through, which reduces it to the analysis in the previous model.
The author rebuttal did not adequately address the above issue. The response was that KKT conditions have additionally been derived, which to this reviewer is a small advance.

__ Correctness__: Yes

__ Clarity__: Yes

__ Relation to Prior Work__: Discussion is only at a high level. Since the paper follows strongly in the line of "Minimax and Hamiltonian Dynamics of Excitatory-Inhibitory Networks" (1997) by Seung, Richardson, Lagarias, and Hopfield, the author(s) should have pointed out what additional contribution they have made.

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: In this work, the authors derive balanced spiking E/I networks that obey Dale's law through the optimization of minimax objectives. Their work follows from a combination of previous studies on balanced spiking networks (Deneve & Machens 2016), as well as the construction of E/I rate networks using a similar minimax objective (Seung et al. 1998). In addition to deriving the network and optimality conditions, they show applications in signal reconstruction and various types of attractor networks.

__ Strengths__: The paper is well written and, as far as I can tell, everything seems correct. Overall, I find that this work presents a quite intriguing way of extending the balanced spiking network formalism. First, the authors demonstrate how to design spiking networks from a loss function, in which excitatory and inhibitory neurons are separate. Extending these old (and somewhat forgotten) ideas to spiking networks is clearly a novel contribution and highly useful for the field. Second, the authors use their objective functions to design spiking networks that perform various computations. They thereby provide a new solution to an old and persistent problem in the balanced network literature, which is how to design balanced networks that perform nonlinear computations.

__ Weaknesses__: There are no major weaknesses in the paper, but there are a few things that could be improved or corrected:
(1) The minimax objectives are mostly just used here to design certain networks. However, is there a more general meaning that can be attributed to these objectives? In other words, assuming that this is how neural circuits work, why would they use minimax objectives?
(2) The minimax objective allows one to obtain networks that obey Dale's law, but it doesn't seem to require it (as you still have to fix the signs of the connectivity matrices). Is there any advantage to fixing those signs, or is Dale's law simply a constraint on the architecture?

__ Correctness__: Overall all claims seem correct. One minor point:
Balance is typically understood as balance of synaptic inputs, not simply the fact that voltages fluctuate around zero. In the model used here, the latter will also happen if a single neuron fires all by itself, but that wouldn't be called balance. So I'd suggest using the term 'balance' more carefully.

__ Clarity__: Yes, writing is generally very clear, given the density of the material. A few clarification questions:
(1) Reconstruction of natural image patches: How many excitatory neurons were used in this example? It does not appear to be mentioned.
(2) Fixed point attractors: There is not enough detail in this section. How many neurons participate in each attractor? How many attractors were stored in the network? Did the attractors include specific inhibitory neurons? Does the network oscillate? (it looks like it might based on the figure)
(3) Attractor examples: It would be nice if the authors could provide some brief analysis of the weight matrices for the solutions found. E.g., do the weight matrices have any structure? Are they low-rank? How do they compare to the previous rate-based solutions to these problems?

__ Relation to Prior Work__: Yes.

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: This study proposes a novel minimax objective function whose optimization could be implemented by a spiking neural network. In order to demonstrate the generality and plausibility of the framework, the authors linked the new minimax objective function with the energy functions of some canonical networks, and showed that their network could reproduce previous canonical networks. The main contribution of this study is the discovery of the minimax objective function.

__ Strengths__: This study is novel in that a minimax objective function is proposed, and the mathematical analysis and proofs underlying the objective function are solid. The authors also show that the proposed minimax objective is able to capture some canonical network models that have been widely used in computational neuroscience.

__ Weaknesses__: Although I believe the math derivation of the novel minimax objective function is correct, I have two major concerns.
1. My first concern is whether this minimax objective function provides some novel insight into network dynamics that cannot be captured by the traditional framework in which network dynamics minimize an “energy” function.
My concern stems from the fact that the minimax objective (Eq. 4) depends on the sign of each term, which in turn depends on the sign of the connection matrix W. Mathematically, if we absorb the minus sign in the network dynamics (Eq. 2) into W (excitatory connections have positive W while inhibitory connections have negative W), we could derive a quadratic objective function similar to Eq. 4 but one that needs to be minimized. It seems to me that the only difference between the minimax and the minimized objective function is that the network state converges to a saddle point in the former case, while in the latter case it converges to a stable fixed point. I really hope the authors explain this and correct me if I have misunderstood something.
Update after rebuttal: The authors’ rebuttal persuaded me that the minimax objective doesn’t depend on the sign convention for inhibitory connections, but rather results from the anti-symmetric connections between excitatory and inhibitory neurons.
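As a toy numerical illustration of this point (my own sketch, unrelated to the paper's actual simulations; the parameters `a`, `b`, `c`, `eta` are arbitrary choices): gradient descent in one variable combined with gradient ascent in the other, on a quadratic with a cross term, converges to the saddle point rather than a minimum, mimicking the antisymmetric E/I coupling.

```python
# Toy saddle objective L(x, y) = 0.5*a*x**2 - 0.5*b*y**2 + c*x*y.
# Descend in x (the "minimizing" variable) and ascend in y (the
# "maximizing" one); the cross term enters the two updates with
# opposite signs, i.e., antisymmetrically, as with W^IE = (W^EI)^T.
a, b, c = 1.0, 1.0, 0.8
x, y = 2.0, -1.5        # arbitrary initial state
eta = 0.05              # step size
for _ in range(2000):
    gx = a * x + c * y   # dL/dx
    gy = -b * y + c * x  # dL/dy
    x -= eta * gx        # gradient descent in x
    y += eta * gy        # gradient ascent in y

print(x, y)  # both converge to the saddle point at the origin
```

The same quadratic, minimized in both variables, would instead run away along the negative-curvature direction, which is the distinction between the minimax and pure-minimization readings of the objective.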
2. Another concern is about the biological plausibility of the KKT conditions in Eqs. 5-6. I certainly understand the mathematical derivation of the KKT conditions, but I have some difficulty in relating these KKT conditions to a real, biological neural circuit. The main obstacle is how we could guarantee that the KKT conditions are satisfied in a real neural circuit; that is, that the membrane potential V is zero when the firing rate is non-zero and vice versa. Moreover, the membrane potential V should be smaller than the firing threshold, which is a positive number in this study, whereas Eqs. 5-6 state that V is not larger than zero. How should I understand this discrepancy?
Update after rebuttal: I realize that the KKT conditions are not strictly satisfied due to the discontinuous spiking network dynamics. I suggest the authors explicitly state this in a revised manuscript. Moreover, to help readers understand the underlying math, the authors could state which constraint in the spiking dynamics leads to the KKT conditions. In addition, it is not quite clear how the KKT conditions lead to detailed balance and tight balance.
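For reference, the discrepancy I had in mind is the following (schematic notation; the paper's threshold convention may differ, so this is only my reading):

```latex
% Complementary slackness as I read Eqs. 5--6 (schematically):
%   V_i \le 0, \qquad r_i \ge 0, \qquad V_i \, r_i = 0 .
% If the biological firing threshold is \theta > 0, the two pictures
% are reconciled only by measuring voltage relative to threshold,
%   \tilde V_i = V_i - \theta \le 0 ,
% so "V \le 0" should presumably be read as "V at or below threshold",
% with equality exactly for the neurons that are actively firing.
```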

__ Correctness__: I have gone through all the math in the main text, and browsed some of them in Supplementary information. I believe the math derivations I have gone through are correct and reasonable.

__ Clarity__: The writing of this paper is well structured, though it could still be improved if the authors provided detailed and intuitive explanations of some key underlying assumptions. For example:
1. It is unclear what the inequality constraints underlying the KKT conditions (Eqs. 5 and 6) are. Is it the spiking threshold, i.e., that the membrane potential is always smaller than the spiking threshold? It would be very helpful if the authors explicitly stated the key underlying assumptions.
2. I suggest the authors briefly explain what the “variable substitution” trick means and what its underlying assumption is. I spent a short while inferring what this trick really means without reading ref. [34]. Moreover, I think the underlying assumption of this trick is that only the stationary responses of r^E and r^I are considered. The authors could add this intuitive explanation right above Eq. 11.
Some typos exist in the writing, and I suggest the authors proofread the manuscript again.
1. Eq.3: the integrand of x_j(t) should contain \tau_e instead of \tau, if I understood correctly.
2. Eq. 5, 2nd row: W^EI should be W^IE.
PS: I know the authors said W^EI = W^IE (line 81), but writing the notation in this form would be easier for readers and would make it consistent with the 2nd row of Eq. 2.
3. Eq. SI.1, 2nd row: I think the 2nd and the 3rd terms in this equation should be combined, in that they both sum over all neurons.
4. Eq. SI.1: is \delta_jk the j-th element of standard basis e_k? Please define the notation before using it.
5. Line 217: should “minimizing over r^I” be “maximizing over r^I”?
6. (Optional) Eqs. 1 and 3: Since you use the notation N^0 for the number of external inputs (line 73), you may consider using s_j^0 for the external input spikes to simplify notation.

__ Relation to Prior Work__: The authors discussed the connection of the current study with previous work. I think the comparison could be strengthened by stating the novel, unique insight provided by the proposed minimax objective that cannot be obtained under a traditional minimized energy function.

__ Reproducibility__: Yes

__ Additional Feedback__: Update after rebuttal: It seems that the minimax objective and the related math derivations are all based on firing rates, and some implicit _linear_ assumption is probably needed in deriving Eq. 2 from Eq. 1, because in general the membrane potential is not linear in the firing rate, especially when the network is in the pattern-forming regime. Maybe this linear relation in Eq. 2 results directly from the effect of the reset after every spike being absorbed into the diagonal elements of the recurrent connections.