__ Summary and Contributions__: This paper makes two valuable contributions.
First, to enbale spectral GCN to handle directed graphs, it starts from Eq. (1) raised by [8] and raises its weakness in high computation complexity, then it proposes personalized PageRank by introducing an auxiliary node to keep aperiodic and irreducible properties while retaining the sparsity of the adjacency. The authors proved two theorems: by controlling the value of pernality \alpha, the proposed form will degenerate to trivial-symmetric Laplacian and random-walk normalized one. Nice story and rigorous analysis.
Second, to enable larger receptive field, the authors further extend the digraph convolution to multiple scales and develop an inception module for the implementation.

__ Strengths__: 1. Well wirtten paper. I enjoy reading it.
2. Performing digraph convolution by Eq. (3) is valuable and novel. The conclusions by Theorem 1 and 2 are simple but meaningful.
3. Instead of applying the k-order adjacency matrix directly, this paper defines the k-order proximity matrix that better suits the structure of directed connections. This consideration is reasonable.
4. Experiments are good and sufficient.

__ Weaknesses__: 1. One potential weakness is the insufficient introduction of [8]: why Eq. (1) works better than simple symmetric form in Line 78? Can Eq. (1) exhibit more desired properties? Perhaps more specifications are needed.
2. Why does intersect in Line 168 mean? does it reflect that both meeting and diffusion paths meet? If so, make it clear. The definitions of the k-order proximity matrix are quite different for k=1 and k>1 in Table 1, why?

__ Correctness__: This paper is well claimed and well supported by its empirical results.

__ Clarity__: Well written paper. Some minor specifications are needed (see above).

__ Relation to Prior Work__: The differences from previous papers are generally well justified. Yet, I have found two papers that also raise the ideas of high order GCN [A] and inception module [B], which are suggested to be cited and discussed.
[A] MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing, ICML 2019.
[B] DropEdge: Towards Deep Graph Convolutional Networks on Node Classification, ICLR 2020.

__ Reproducibility__: Yes

__ Additional Feedback__: 1. Figure 1 b is not easy to be understood. More clarifications are suggested.
2. Line 199-200, why?
3. Line 141: removing the comma?
4. It is better to include the results of the model without IB and with simple symmetric adjacency matrix.
#### post rebuttal ###
The authors' responses confirm my justification. Accept.

__ Summary and Contributions__: The paper proposes a convolutional neural network for directed graphs. The network architecture considers node neighbours at different distance, inspired by the inception network.

__ Strengths__: -The problem faced by the paper is interesting and timely.
-The proposed approach seems reasonable.

__ Weaknesses__: -No details about model selection are provided

__ Correctness__: The experimental part is weak since authors did not provide details about model selection and the paper discussion suggests that the selection of the model's hyperparameters was performed on the test sets (i.e. the results with maximum average performances over the 20 splits are reported)

__ Clarity__: yes

__ Relation to Prior Work__: yes

__ Reproducibility__: Yes

__ Additional Feedback__: -No details about model selection are provided, i.e. how the network architecture and its hyperparameters were set.
-The results provided for the study of the effect of teleport probability and table 4 suggest that in table 3 authors reported the best results on the test set, i.e. they performed model selection in the test set. This would lead to a biased comparison. Authors should clarify this point.
-Time and Space Complexity:in the space complexity calculation, you use a practical consideration to derive the big-O space complexity notation. By definition, that notation should consider the worst case complexity.
Minor remarks:
l61: the consideration about the receptive field is true for all convolutions that consider k-hops neighbours at distance k greater than one.
l108 matrix that -> matrix
l123 coverage -> converge
----
I update my score after the rebuttal provided by authors, that adequately addressed my main concerns

__ Summary and Contributions__: Propose a graph neural network model for directed graphs.

__ Strengths__: The proposed model is novel. The problem is important. Experiments on node classification are conducted.

__ Weaknesses__: The experiments should be improved. More work could be compared.

__ Correctness__: The method design is reasonable.

__ Clarity__: Well written paper.

__ Relation to Prior Work__: More work could be compared.

__ Reproducibility__: No

__ Additional Feedback__: Detailed review.
--
Summary
The manuscript proposes DiGCN, a graph neural network for directed graphs. The model extends GCN to directed graph using Digraph Laplacian based on PageRank. The Digraph Inception Convolutional Networks is further presented. Experiments on several datasets demonstrate that the proposed model outperforms some baseline methods for node classification.
Pros
1 The problem is important.
2 The proposed model is novel. The overall quality of this work is good.
3 Experiments on node classification are conducted. Ablation study and some analytical experiments are also provided.
Cons/Questions
1 Experiments could be improved. Only node classification task is conducted. It is better to perform more downstream tasks such as link prediction. In addition, it is better to compare and discuss some recent graph neural network models such as:
GMNN: Graph Markov Neural Networks, ICML 2019
Deep Graph Infomax, ICLR 2019
How powerful are graph neural networks, ICLR 2019
2 I also concern about the space complexity of proposed model. The datasets used in this work are relatively small with thousands of nodes while the model faces OOM issue. It is necessary to conduct experiments using larger datasets.
To summarize, the novelty and overall quality of this work is good while the experiments could be improved.

__ Summary and Contributions__: The authors solve two problems on GCN. The first is that Vanilla GCN can only handle undirected graphs. The authors propose Digraph Laplacian based on PageRank. However, because of the computational overhead, they further reduce the computational time and memory requirement by Approximate Digraph Laplacian based on Personalized PageRank. Second is that the Fixed Receptive Field in Vanilla GCN. The authors propose kth-order Proximity and concatenate different kth-order Proximity into an inception block.

__ Strengths__: 1. This paper provides an approximation Digraph Laplacian with theoretical justification that can save both computational time and memory requirements.
2. The authors claim by combining the above two network structures their work can generalize to State-Of-The-Art models easily.

__ Weaknesses__: 1. The kth-order Proximity is similar to prior work with no significant improvement in this part. Given such, the impact becomes smaller.
2. The performance for its major competitor, DCGN, is significantly lower than what it have reported in the original DCGN paper,. If the original experiment results are used, then the proposed model does not really outperform DCGN. I wouldn't mind upgrading my grade if there is a convincing explanation on this part.

__ Correctness__: Yes, the proofs are theoretically correct and convincing.

__ Clarity__: Yes, the paper is well structure and easy to read. The mathematical notations are clear, and the writing is coherent and smooth.

__ Relation to Prior Work__: Yes, the author made a complete comparison with prior work and listed all the implemental detail between their work and similar previous work.

__ Reproducibility__: Yes

__ Additional Feedback__: