Reviews: Semi-supervisedly Co-embedding Attributed Networks

Reviewer 1

Clarity: I think that both the structure and the readability of the paper are quite good. The authors define their objective function clearly, building on the background they provide on Semi-Supervised Variational Auto-Encoders. They also support the presentation with figures illustrating the general outline of the methodology and the graphical model of the proposed method. I would like to point out some issues, mostly related to the notation:

- The paper does not contain line numbers, since the authors did not omit the final and preprint options.
- In the line just before Equation 1, it is stated that $N_1$ is the number of unlabeled data points, but in Equation 1 the upper bound of the summation over the labeled data points is also $N_1$. I think the upper bounds of the summation terms should be corrected.
- In Section 2, the word “generates” should be replaced with “generate” in the sentence starting with “More recently, a semi-supervised deep ...”.
- In Subsection 4.1, the authors may replace the expression $\textbf{A} \in \mathbb{R}^{N\times N}$ with $\textbf{A} \in \{0,1\}^{N\times N}$ to avoid confusion, since $\textbf{A}$ is the adjacency matrix.

Originality: The number of node representation learning approaches has increased significantly in recent years, but there are only a few semi-supervised methods targeting attributed networks. As the authors claim in Subsection 3.1, the paper might be the first approach to learn node representations along with attribute embeddings in a semi-supervised manner.

Quality: From a technical point of view, the paper seems to be well prepared. In particular, the authors present their approach with detailed mathematical derivations based on Semi-Supervised Variational Auto-Encoders, and they extensively evaluate the performance of the method on various downstream tasks. In the experiments, they report results only for one chosen training size; it would also be useful to examine the performance of the method for varying training set sizes.

Significance: As stated above, although there are many different types of approaches that learn node embeddings, this method seems to be the first one that learns representations of nodes and attributes in a semi-supervised manner.
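To make the notation point about $\textbf{A}$ concrete, here is a minimal sketch of the distinction between a real-valued matrix and a binary adjacency matrix. It assumes an unweighted graph; the function name `is_binary_adjacency` and the toy matrices are hypothetical and do not come from the paper.

```python
def is_binary_adjacency(matrix):
    """Return True if `matrix` is square and every entry is exactly 0 or 1."""
    n = len(matrix)
    for row in matrix:
        if len(row) != n:
            return False
        if any(entry not in (0, 1) for entry in row):
            return False
    return True


# Hypothetical unweighted toy graph on three nodes: entries are only 0 or 1.
unweighted = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Hypothetical weighted variant: the 0.5 entries take it outside {0,1}^{N x N}.
weighted = [
    [0, 0.5, 0],
    [0.5, 0, 1],
    [0, 1, 0],
]

print(is_binary_adjacency(unweighted))
print(is_binary_adjacency(weighted))
```

Both matrices would satisfy $\mathbb{R}^{N\times N}$, which is why the tighter set is the more informative notation if the networks are indeed unweighted.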

Reviewer 2

- The authors present a semi-supervised graph embedding procedure that simultaneously embeds nodes and attributes into the same semantic space. The procedure leverages SVAE on heterogeneous data and several variational inference tricks for efficient implementation. Moreover, the node/attribute representations (being modeled via Gaussians in the SVAE) naturally have measures of uncertainty built into their representations.
- Details in Section 3.2 are a bit confusing as presented. I believe you want $$\mathcal{O}=\mathcal{X}\times\mathcal{X}\times\mathcal{R}\times\mathcal{Y},$$ as you want observations to have two entities, a relation between the two entities, and possible class labels for the entities. As written, the definitions of $\mathcal{R}$ and $\mathcal{O}$ do not quite make sense. In addition, are the relations entity or class dependent; i.e., for $x_i\in X^g$ and $x_j\in X^h$, does $r_{ij}$ depend only on $g$ and $h$, or on the entities as well? Finally, I am unsure how you derived (4).
- How scalable is SCAN? It would be helpful to know the orders of the 3 data sets in the experiments and what the runtime of SCAN was in each setting.
- How are you choosing D?
- In the Attribute Inference experiment section, is SCAN making use of label semi-supervision? Are the algorithms it is compared to unsupervised? The evaluation would benefit from additional experiments on topologically diverse networks to show the effectiveness of your approach in a variety of real data settings.
- The ellipsoids in Figure 4 are too small to be easily observed.
- There are a number of grammatical errors/quirks in the manuscript which impact overall readability. For example:
  - Abstract, line 2: offers -> offer
  - Abstract, line 7: remove "the"
  - Page 1, line -3: A number of work has -> A number of works have
  - Page 2, line 6: has -> have
  - Page 2, line 6: profound probability theoretical basis -> profound basis in theoretic probability
  - Page 2, line 8: offers -> offer
  - Page 2, line 9: has -> have
  - Page 2, 2nd full paragraph, line 5: consisted -> consisting
  - etc...
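The reading of Section 3.2 suggested above, $\mathcal{O}=\mathcal{X}\times\mathcal{X}\times\mathcal{R}\times\mathcal{Y}$, can be sketched as a simple record type. This is only an illustration of the reviewer's proposed reading, not the paper's definition; the class `Observation`, its field names, and the example values are all hypothetical, and a single optional label per observation is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One element of the hypothetical space O = X x X x R x Y."""
    source: str            # first entity x_i
    target: str            # second entity x_j
    relation: str          # relation r_ij between the two entities
    label: Optional[int]   # class label, or None when no label is observed


def split_by_supervision(observations):
    """Separate observations that carry a class label from those that do not."""
    labeled = [o for o in observations if o.label is not None]
    unlabeled = [o for o in observations if o.label is None]
    return labeled, unlabeled


# Hypothetical toy data: one labeled node-node pair, one unlabeled node-attribute pair.
observations = [
    Observation("node_1", "node_2", "node-node", 0),
    Observation("node_1", "attr_7", "node-attribute", None),
]
labeled, unlabeled = split_by_supervision(observations)
print(len(labeled), len(unlabeled))
```

Whether `relation` is determined by the entity types alone or by the specific pair is exactly the open question raised in the comment; the sketch leaves it as a free field.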

Reviewer 3

This paper proposes a novel semi-supervised co-embedding model for attributed networks, based on the semi-supervised VAE. The model design is reasonable, considering node-node and node-attribute dependencies in five different cases. The inference process also makes sense. It improves on the existing unsupervised co-embedding model by also learning from partially labeled nodes in a semi-supervised way. The introduced model outperforms the state-of-the-art attributed graph embedding models. The paper is also well written and easy to follow. There are some writing errors to correct and some unclear notations to clarify. Please see the “improvements” part. Another suggestion concerns the evaluation of performance with different ratios of labeled data. It would be better to include it in the main text rather than leaving it in the supplementary document, because semi-supervised learning models should be evaluated on their capability to learn from different small portions of labeled data.
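The kind of label-ratio evaluation suggested here could be organized roughly as follows. This is a minimal sketch under stated assumptions: the function `labeled_subsets`, the ratios, and the node count are hypothetical, and the sampling is plain random rather than whatever protocol the paper's supplementary material uses.

```python
import random


def labeled_subsets(node_ids, ratios, seed=0):
    """Map each label ratio to a random subset of nodes treated as labeled."""
    rng = random.Random(seed)
    shuffled = list(node_ids)
    rng.shuffle(shuffled)
    subsets = {}
    for ratio in ratios:
        if not 0.0 < ratio <= 1.0:
            raise ValueError(f"ratio must be in (0, 1], got {ratio}")
        # Take a prefix of one fixed shuffle so smaller subsets are nested in larger ones.
        count = max(1, round(ratio * len(shuffled)))
        subsets[ratio] = shuffled[:count]
    return subsets


# Hypothetical setup: 100 nodes and four label ratios.
nodes = list(range(100))
for ratio, labeled in labeled_subsets(nodes, [0.01, 0.05, 0.1, 0.2]).items():
    # A real study would train on `labeled` and report test performance here.
    print(f"ratio={ratio:.2f} labeled_nodes={len(labeled)}")
```

Nesting the subsets means each larger ratio only adds labels, so differences between ratios reflect the extra supervision rather than a different random draw.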

Paper ID:	3507
Title:	Semi-supervisedly Co-embedding Attributed Networks
