The paper studies federated learning, when agents perform different number of local update steps. It shows that normalizing these updates by the effective number of steps performed allows to converge to the correct objective value (and incorrect without such a normalization). Both are valuable new insights to federated learning, and important in practice. Some concerns remained on clarifying the positioning with respect to related work, as well as hyperparameter choices, but overall consensus was positive. We hope the exceptionally detailed feedback with improvement suggestions from the 3 reviews will be implemented for the camera ready version.