{"title": "Adaptive GNN for Image Analysis and Editing", "book": "Advances in Neural Information Processing Systems", "page_first": 3643, "page_last": 3654, "abstract": "Graph neural network (GNN) has powerful representation ability, but optimal configurations of GNN are non-trivial to obtain due to diversity of graph structure and cascaded nonlinearities. This paper aims to understand some properties of GNN from a computer vision (CV) perspective. In mathematical analysis, we propose an adaptive GNN model by recursive definition, and derive its relation with two basic operations in CV: filtering and propagation operations. The proposed GNN model is formulated as a label propagation system with guided map, graph Laplacian and node weight. It reveals that 1) the guided map and node weight determine whether a GNN leads to filtering or propagation diffusion, and 2) the kernel of graph Laplacian controls diffusion pattern. In practical verification, we design a new regularization structure with guided feature to produce GNN-based filtering and propagation diffusion to tackle the ill-posed inverse problems of quotient image analysis (QIA), which recovers the reflectance ratio as a signature for image analysis or adjustment. A flexible QIA-GNN framework is constructed to achieve various image-based editing tasks, like face illumination synthesis and low-light image enhancement. Experiments show the effectiveness of the QIA-GNN, and provide new insights of GNN for image analysis and editing.", "full_text": "Adaptive GNN for Image Analysis and Editing\n\nLingyu Liang\n\nSouth China Univ. of Tech.\nlianglysky@gmail.com\n\nLianwen Jin\u2217\n\nSouth China Univ. of Tech.\nlianwen.jin@gmail.com\n\nSouth China Univ. 
of Tech.\n\nPeng Cheng Laboratory\n\nYong Xu\u2217\n\nyxu@scut.edu.cn\n\nAbstract\n\nGraph neural network (GNN) has powerful representation ability, but optimal\ncon\ufb01gurations of GNN are non-trivial to obtain due to diversity of graph structure\nand cascaded nonlinearities. This paper aims to understand some properties of\nGNN from a computer vision (CV) perspective. In mathematical analysis, we\npropose an adaptive GNN model by recursive de\ufb01nition, and derive its relation with\ntwo basic operations in CV: \ufb01ltering and propagation operations. The proposed\nGNN model is formulated as a label propagation system with guided map, graph\nLaplacian and node weight. It reveals that 1) the guided map and node weight\ndetermine whether a GNN leads to \ufb01ltering or propagation diffusion, and 2) the\nkernel of graph Laplacian controls diffusion pattern. In practical veri\ufb01cation, we\ndesign a new regularization structure with guided feature to produce GNN-based\n\ufb01ltering and propagation diffusion to tackle the ill-posed inverse problems of\nquotient image analysis (QIA), which recovers the re\ufb02ectance ratio as a signature\nfor image analysis or adjustment. A \ufb02exible QIA-GNN framework is constructed to\nachieve various image-based editing tasks, like face illumination synthesis and low-\nlight image enhancement. Experiments show the effectiveness of the QIA-GNN,\nand provide new insights of GNN for image analysis and editing.\n\n1\n\nIntroduction\n\nRecently, many research efforts have been devoted to graph neural network (GNN) [1\u20133], which\nis a signi\ufb01cant deep learning technique for graph data under semi-supervised learning. Despite\nits powerful representation ability, optimal con\ufb01gurations of GNN are not trivial to obtain due to\ndiversity of graph structure and cascaded nonlinearities. The layer structure or parameters are mostly\ndetermined by experimentations with expertise. 
In this paper, we intend to understand some properties\nof GNN mathematically from a computer vision (CV) perspective, and develop some GNN-based\noperations for CV problems.\nIn image analysis and synthesis, there are two basic operations. One is \ufb01ltering to suppress or\nextract feature/content in images [4\u20137]; the other is propagation that diffuses the visual feature from\nthe representative region throughout the entire image, so that similar pixels/regions have similar\nvisual appearance [8\u201312]. The reviving of neural networks with deep learning has introduced many\nCNN-based networks to achieve \ufb01ltering or propagation [6, 5, 13\u201316]. However, the properties\nof these models have not been clearly understood, since they found their inspirations in diverse\n\u2217Corresponding authors: Lianwen Jin, Yong Xu. Lingyu Liang and Lianwen Jin are supported by Natural\nScience Foundation of Guangdong Province (No. 2017A030312006, 2019A1515011045), the National Key\nResearch and Development Program of China (No. 2016YFB1001405), NSFC (Grant No.: 61673182, 61771199,\n61502176), GDSTP (No. 2017A010101027), GZSTP (No. 201704020134) and Fundamental Research Funds\nfor the Central Universities (No. 2019MS023); Yong Xu is supported by National Nature Science Foundation of\nChina (61672241, U1611461), Natural Science Foundation of Guangdong Province (2016A030308013), and\nScience and Technology Program of Guangzhou (201802010055).\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n\fcontexts and were formulated in diverse forms, like partial differential equation (PDE) [8], variational\nfunctional [10, 11] or deep neural networks (DNN) [6, 16].\nRecent research indicates that standard signal processing models with elaborate prior knowledge\ncan achieve competitive performance with the state-of-the-art DNN-based methods [4, 9, 17, 8, 10].\nIt inspires us to analyze DNN based on existing visual operations. 
Speci\ufb01cally, we focus on GNN,\nwhich is one of the semi-supervised methods in the deep learning on graphs [1]. We try to explore\ndifferent diffusion properties of GNN with its relation to \ufb01ltering or propagation models.\nIn mathematical analysis, we propose a GNN model by recursive de\ufb01nition, which is formulated as a\ngraph-based label propagation system with guided map, graph Laplacian and node weight. We derive\nits relation with the basic CV operations of \ufb01ltering (e.g. edge-aware \ufb01lter) and propagation (e.g.\nlearning to diffuse (LTD) model). It reveals that: 1) Guided map and node weight determine whether\na GNN leads to \ufb01ltering or propagation diffusion; 2) Kernel of graph Laplacian controls diffusion\npattern of a GNN.\nIn practical analysis, we apply the GNN models to design operations and systems for quotient\nimage analysis (QIA), which recovers the re\ufb02ectance ratio as a signature for illumination analysis or\nadjustment [18]. QIA is an essential component for image analysis and synthesis [19\u201330], especially\nfor face synthesis and low-light image enhancement.\nThe challenges of QIA are twofold. Firstly, QIA extracts intrinsic representations from ambiguous\nand noisy data, which is an ill-posed inverse problem [31]. Secondly, slight errors in QIA may\nlead to obvious visual artifacts, since human visual cognitive systems are highly sensitive to image\nappearance changes.\nTo tackle these problems, we impose new adaptive kernel structures with different guided features\nand priors to the GNN model, and propose GNN-based operations that achieve speci\ufb01c diffusion\nof \ufb01ltering and propagation for QIA. Then, we develop a GNN-based QIA system (QIA-GNN) for\nimage-based illumination synthesis. The system consists of three GNN subnetworks (denoted as\nQIA-GNN-L1/L2/L3). 
QIA-GNN-L1 acts as a \ufb01ltering operation, which extracts the initial re\ufb02ection\nand illumination feature from input images; QIA-GNN-L2 acts as a propagation operation, which\nadaptively propagates the initial illumination feature from the representative region to the whole\nimages with good visual consistency; QIA-GNN-L3 is the output layer that combines different\nimage layers and features to obtain the synthesized results. In this paper, we simply construct\nQIA-GNN-L1/L2 under semi-supervised learning scheme, but the GNN-based system is \ufb02exible to\nachieve various illumination synthesis tasks, including face relighting, face swapping, trans\ufb01guring\nand low-light enhancement.\nThe main contributions are summarized as follows:\n\n\u2022 We propose an adaptive GNN model for image analysis, and mathematically derive its\nrelation to \ufb01ltering and propagation models, like edge-aware \ufb01lters [32] and the LTD\nmodel [8]. It reveals that when a GNN is formulated as a graph-based label propagation\nsystem with guided map, graph Laplacian and node weight, then guided map and node\nweight determine whether it produces \ufb01ltering or propagation diffusion; kernel of graph\nLaplacian controls its diffusion pattern.\n\u2022 To tackle the inverse problems of QIA, we design a new adaptive kernel with different\nguided feature and priors, and propose GNN-based operations that achieve speci\ufb01c diffusion\nof \ufb01ltering and propagation for QIA. Then, a \ufb02exible QIA-GNN system is constructed to\nproduce various illumination synthesis, such as face relighting, face swapping, trans\ufb01guring\nand low-light enhancement.\n\n2 Proposed GNN with Adaptive Kernel\nThe original GNN was proposed in \"pre-deep-learning\" era [33]. 
For a graph G = (V, E) with N\nnodes (V = {v1, ..., vN}), a GNN can be formulated as a recursive equation:\n\n$$ u_i = \\sum_{j \\in N(i)} F(u_i, u_j, p_i, p_j, h_{i,j}), \\tag{1} $$\n\nwhere ui is the state of node vi; N (i) is the neighborhood set of node vi; p and h denote features\nof nodes and edges respectively; and F is a parametric function.\n\nThe GNN model of [33] was originally designed for classi\ufb01cation or regression problems under a\nsupervised learning scheme. This paper further extends and explores GNN in two aspects. Firstly,\nwe mathematically distinguish and analyze two intrinsic diffusion properties of GNN, i.e. \ufb01ltering\nand propagation; then we use the GNN to unify many signi\ufb01cant CV operations, as discussed in\nSec. 2.1 and Sec. 2.2. Secondly, we generalize the formulation of Scarselli\u2019s GNN [33] from data\nclassi\ufb01cation/regression to visual data manipulation, where we propose a new kernel structure for\nQIA in Sec. 2.3 and a 3-layer QIA-GNN system to achieve multi-task illumination editing in Sec. 3.\nHere, we propose an adaptive GNN based on the graph-based label propagation (LP) system [34].\nDifferent from the original LP system for labelling nodes of a graph [35], we formulate the diffusion\nprocess from a visual diffusion perspective, and it can achieve both \ufb01ltering and propagation diffusion.\nLet V be the visual element domain of an image. The image is mapped into a graph G = (V, E),\nwhere each node vi \u2208 V, i = 1, ..., N corresponds to the visual element of the image, and pi is the\nfeature of node vi. Let u be the state of visual element de\ufb01ned over V, i.e. u \u2208 R^N . The graph-based\nLP system of the GNN can be reformulated as:\n\n$$ u^{t+1} - u^t = L u^t + \\Lambda (g - u^t), \\tag{2} $$\n\nwhere:\n\n\u2022 g(p) is the guided map de\ufb01ned over V, i.e. g \u2208 R^N , which is used to guide the diffusion of\nGNN. 
The representative visual elements of the guided map are de\ufb01ned within S, where S is\na closed subset of V with boundary \u2202S for diffusion of propagation;\n\u2022 \u039b(p) = diag(\u03bb(pi))i=1,...,N \u2208 R^{N\u00d7N} with \u03bb \u2265 0 is the node weight, which determines\nthe restricted region and the level of the guidance map g for u;\n\u2022 L(p) \u2208 R^{N\u00d7N} is the graph Laplacian controlling the local diffusion pattern of GNN, where\nthe kernel function k(pi, pj) measures the similarity of a node with its neighborhood set as\nfollows:\n\n$$ L_{ij} = \\begin{cases} k(p_i, p_j), & j \\in N(i) \\\\ -\\sum_{p_j \\in N_{p_i}} k(p_i, p_j), & i = j \\\\ 0, & \\text{otherwise.} \\end{cases} \\tag{3} $$\n\nEq. 2 can be solved iteratively using the Jacobi method. It can also be proved that this LP system with\ngraph Laplacian of Eq. 3 converges to a unique solution based on the Banach \ufb01xed-point theorem [36].\nStudies indicate that GNNs can achieve state-of-the-art performance in various tasks [37\u201340], but\nthe design of new GNNs is mostly based on empirical heuristics and trial-and-error. Recently, [3]\nproposed a theoretical framework for analyzing the expressive power of GNNs based on the Weisfeiler-Lehman\n(WL) graph isomorphism test [41], and validated the theory by experiments for graph-focused\ntasks. In the following sections, we analyse the diffusion properties of GNN from a CV perspective.\nThe propagation and \ufb01ltering properties of GNN guide us to construct new GNN-based operations\nand systems for image analysis and synthesis.\n\n2.1 Propagation Properties of Proposed GNN\nWith proper setting of {g, \u039b, L}, the proposed GNN model (2) can produce propagation diffusion,\nsuch as Zhu\u2019s LP model [35] or Liu\u2019s learning to diffuse (LTD) models [8].\nIn propagation diffusion, the representative elements of g are within S, where S \u2282 V with boundary\n\u2202S. 
The value of the reaction weight \u039bii = \u03bb(pi) depends on whether vi \u2208 S or not. The\nGNN model identi\ufb01es the representative visual elements of g and propagates their values from S to V.\nWhen Eq. (2) is stable, the GNN model becomes:\n\n$$ L u + \\Lambda (g - u) = 0 \\tag{4} $$\n\nwith\n\n$$ \\begin{cases} \\lambda(p_i) \\text{ is large}, \\; g_i = s_{p_i}, & v_i \\in S \\\\ \\lambda(p_i) \\text{ is small}, \\; g_i = \\epsilon, & v_i \\in V \\setminus S, \\end{cases} \\tag{5} $$\n\nwhere spi is the value corresponding to a node vi with feature pi in representative domain S; \u03f5 is a\nsmall constant to avoid degeneration or \u03f5 = 0. The speci\ufb01c value of \u03bb(pi) is task-dependent. One\ntypical setting is \u03bb(pi) = 1 for vi \u2208 S, while \u03bb(pi) = 0 otherwise.\n\nTo clearly demonstrate the propagation properties of the GNN model, we reformulate the whole LP\nsystem (4) with setting (5) for each node vi with its neighborhood j \u2208 N (i):\n\n$$ \\left( \\sum_{j \\in N(i)} k(p_i, p_j) + \\lambda(p_i) \\right) u_i - \\sum_{j \\in N(i)} k(p_i, p_j) u_j = \\lambda(p_i) g_i $$\n$$ \\Rightarrow \\; u_i = \\frac{1}{d_{p_i} + \\lambda(p_i)} \\left( \\sum_{j \\in N(i)} k(p_i, p_j) u_j + \\lambda(p_i) g_i \\right), \\tag{6} $$\n\nwhere $d_{p_i} = \\sum_{j \\in N(i)} k(p_i, p_j)$. The value of ui in (6) is mainly controlled by \u03bb(pi):\n\n\u2022 For vi \u2208 S, \u03bb(pi) is large, then ui is dominated by the term of the guided map\n$\\frac{\\lambda(p_i)}{d_{p_i} + \\lambda(p_i)} g_i$;\n\u2022 For vi \u2208 V\\S, \u03bb(pi) is small, then ui is determined by the diffusion of Eq. (6).\n\n2.1.1 Relation to Zhu\u2019s Label Propagation (LP)\nLet $u = u^{LP} = (u^{LP}_l, u^{LP}_u)$ specify how each data point is to be labeled, where $u^{LP}_l$ denotes labeled\ndata for vi \u2208 S and $u^{LP}_u$ denotes unlabeled data for vi \u2208 V\\S. With the setting of $\\{g^{LP}, \\Lambda^{LP}, L^{LP}\\}$\nbelow, the GNN leads to Zhu\u2019s label propagation [35, 42]:\n\n\u2022 $g^{LP} = (g^{LP}_l, g^{LP}_u)$, where $g^{LP}_l$ denotes the initial label for vi \u2208 S, and $g^{LP}_u = 0$ for\nunlabeled data vi \u2208 V\\S;\n\u2022 $\\Lambda^{LP} = \\mathrm{diag}(\\lambda^{LP}(p_i))_{i=1,...,N}$, where\n\n$$ \\begin{cases} \\lambda^{LP}(p_i) \\gg d^{LP}_{p_i}, & v_i \\in S \\\\ \\lambda^{LP}(p_i) = 0, & v_i \\in V \\setminus S; \\end{cases} $$\n\n\u2022 $L^{LP}$ uses the Gaussian kernel of width \u03c3 as the similarity measurement, where\n$k^{LP}(p_i, p_j) = e^{-\\|p_i - p_j\\|^2 / (2\\sigma^2)}$. For $L^{LP}$, we make a decomposition as $L^{LP} = D^{LP} + W^{LP}$,\nwhere $D^{LP}_{ii} = d^{LP}_{p_i}$ is the diagonal component of $L^{LP}$, and $W^{LP}$ is the off-diagonal\ncomponent. Then, we obtain $D^{LP}_{ii} = \\sum_j W^{LP}_{ij}$.\n\nWith these settings, we can reformulate Eq. (4) as:\n\n$$ u^{LP}_i = \\begin{cases} g^{LP}_l, & v_i \\in S \\\\ \\frac{1}{d^{LP}_{p_i}} \\sum_{j \\in N(i)} k^{LP}(p_i, p_j) u_j, & v_i \\in V \\setminus S \\end{cases} \\tag{7} $$\n\nwhich is precisely one iteration of the label propagation [35].\n\n2.1.2 Relation to Liu\u2019s Learning to Diffuse (LTD) Model\nSimilarly, we could also derive the relation to Liu\u2019s LTD model [8]. For vi \u2208 S, let $\\lambda(p_i) \\gg d_{p_i}$,\nthen $u_i \\approx \\frac{\\lambda(p_i)}{d_{p_i} + \\lambda(p_i)} g_i \\approx g_i = s_{p_i}$. The GNN model (4) with setting (5) becomes:\n\n$$ u_i \\simeq \\begin{cases} s_{p_i}, & v_i \\in S \\\\ \\frac{1}{d_{p_i} + \\lambda(p_i)} \\left( \\sum_{j \\in N(i)} k(p_i, p_j) u_j + \\lambda(p_i) g_i \\right), & v_i \\in V \\setminus S \\end{cases} \\tag{8} $$\n\nIf $k(p_i, p_j) = \\exp(-\\beta \\|p_i - p_j\\|^2)$, the GNN system of Eq. (8) leads to the LTD model [8].\n\n2.2 Filtering Properties of Proposed GNN\n\nWith certain settings, the GNN model (2) leads to diffusion that is similar to edge-aware \ufb01lters, like\nanisotropic diffusion [43] or optimization-based \ufb01lter [44].\n\nFigure 1: The QIA-GNN consists of three subnetworks: QIA-GNN-L1 acts as \ufb01ltering operation\nfor facial quotient image extraction; QIA-GNN-L2 acts as propagation operation for facial quotient\nimage propagation; QIA-GNN-L3 produces the result that combines the image layers and feature.\n\nIn \ufb01ltering diffusion, the representative elements of the guided map g cover the whole domain S = V\nand the reaction weight is an identity matrix \u039b = E. The GNN becomes:\n\n$$ u^{t+1} - u^t = L u^t + (g - u^t). \\tag{9} $$\n\nIt can be regarded as the discrete form of biased anisotropic diffusion [45], if the kernel of L is controlled\nby the gradient of u, i.e. $k = k(\\|\\nabla u\\|)$. Furthermore, if \u039b = 0, we obtain the famous anisotropic\ndiffusion [43].\n\n2.2.1 Relation to Farbman\u2019s Optimization-Based Filter (OF)\n\nMany important edge-aware \ufb01lters [32] are de\ufb01ned implicitly by a variational formulation, and we\ncall them optimization-based \ufb01lters (OF) here. One representative OF model was proposed\nby [44], which is de\ufb01ned by the minimization of a quadratic functional:\n\n$$ u = \\arg\\min_u \\left\\{ (u - g)^\\top (u - g) + u^\\top L u \\right\\}, \\tag{10} $$\n\nwhere u is the \ufb01ltered output, g is the original image, and L encodes the \ufb01lter kernel.\nWhen Eq. (9) is stable, the GNN model becomes: (E \u2212 L)u = g, which has the same solution as the\nquadratic functional of (10).\nLet u = uOF be the \ufb01ltered output, and g = gOF be the original image. 
With the setting below, the\nGNN (9) obtains edge-aware smoothing as the OF of [44]:\n\n\u2022 LOF measures node similarity between vi and its 4-neighbor set j \u2208 N4(i) using the kernel:\n\n$$ k^{OF}(p_i, p_j) = \\beta \\left( \\|p^{OF}_i - p^{OF}_j\\|^{\\alpha} + \\varepsilon \\right)^{-1}, \\tag{11} $$\n\nwhere pOF is the log-luminance channel of gOF to guide the edge-aware diffusion, \u03b1\ncontrols the local diffusion pattern, \u03b2 controls the global smoothness, and \u03b5 is a small\nconstant to avoid division by zero.\n\n2.3 Adaptive Kernel for Quotient Image Analysis (QIA)\n\nBased on the mathematical analysis, the diffusion pattern is controlled by the kernel of L. To verify\nour analysis, we propose a new kernel with setting {d(M), G} to design \ufb01ltering and propagation\noperations for QIA, as shown below:\n\n$$ k^{QIA}(p_i, p_j) = \\frac{d(M)_p}{\\|G(p_i) - G(p_j) + \\epsilon\\|^{\\alpha}_p}, \\tag{12} $$\n\nwhere d is a spatially inhomogeneous smoothness parameter to control the smoothness of propagation\nin different regions, which is determined by the con\ufb01dence map M; M can be obtained based on the\nstructure information of an image, such as facial components or semantic segmentation of a scene;\n\u03b1 controls the sensitivity of the term to the derivatives of the guided feature; G is the feature to\nguide the propagation, $\\|\\cdot\\|_p$ represents the p-norm of guided feature space and \u03f5 is a small constant\n(typically \u03f5 = 0.001) to avoid division by zero.\n\n3 GNN for Quotient Image Analysis (QIA-GNN)\n\nWe apply the GNN with adaptive kernel to design adaptive \ufb01ltering and propagation operations for\nQIA, and construct a GNN-based system (QIA-GNN) to achieve illumination-aware facial synthesis\nand low-light image enhancement, as shown in Fig. 1. QIA-GNN contains three subnetworks, denoted\nas QIA-GNN-L1/L2/L3:\n\n1. 
QIA-GNN-L1: Quotient image extraction, where a GNN-based \ufb01ltering operation F (g, L)\nis proposed to achieve two goals: 1) separating images into multiple facial layers; 2)\nextracting quotient image Q in the representative region.\n\n2. QIA-GNN-L2: Quotient image propagation, where a GNN-based propagation operation\nP (g, \u039b, L) is constructed that adaptively propagates Q to obtain illumination map T. Note\nthat different combinations of F (g, L) and P (g, \u039b, L) operations can produce different T.\n\n3. QIA-GNN-L3: Image layer combination, which combines T and the image layers to\nproduce illumination editing.\n\nWe take face relighting as the main presentation in this paper, whose goal is to transfer the illumination\nfrom the reference image R to the input image I in a consistent manner. We construct the \ufb01ltering\nF relit(g, L) and propagation P relit(g, \u039b, L) operations of face relighting to show how to construct a\nGNN-based system with domain knowledge to solve the visual analysis problem.\n\n3.1 Quotient Image Extraction (QIA-GNN-L1)\n\nQIA-GNN-L1 is constructed by the GNN-based \ufb01ltering operation F (g, L) with facial prior, which\nseparates the target I or reference R into facial layers and obtains the initial quotient image Q. Note\nthat some pre-processing, like landmark detection or face alignment, has been done for the input\nimages. It is implemented as follows: Firstly, both the input I and R are converted into CIELAB\ncolor space, where the two chromaticity channels are regarded as color layers Ic (Rc). Secondly, the\nluminance channel is decomposed into lighting layer IL (RL) and detail layer Id (Rd) by F (g, L),\nwhere the lighting layer captures the main illumination variance and the detail layer contains facial details.\nFinally, the initial quotient image Qrelit is obtained by\n\n$$ Q^{relit} = \\frac{F^{relit}(R_L | g, L)}{F^{relit}(I_L | g, L)}. $$\n\nF relit acts as an inhomogeneous \ufb01ltering operation. 
To extract Qrelit, F relit should smooth out details\nin background, eyes and eyebrows, while preserving the information in the facial region. The setting of\nF relit(g, L) is as follows:\n\n\u2022 The guided map g is regarded as the input image to be \ufb01ltered. For example, g = IL and\ng = RL lead to inhomogeneous smoothing of lighting layer IL and RL, respectively.\n\u2022 We integrate facial prior to the kernel kQIA of L to preserve the illumination within the facial\nregion, while smoothing out the detail in eyes, eyebrows and background. Here we simply\nset j \u2208 N4(i) to obtain local \ufb01ltering. p = log(IL) is the feature to guide the diffusion.\nTypically, the parameters are set as \u03b1 = 1.2 and \u03b5 = 0.0001. d(M) is spatially determined\nby different regions, so that background, eyes and eyebrows are smoothed out, while the\ninformative illumination in the facial region is preserved.\n\n3.2 Quotient Image Propagation (QIA-GNN-L2)\nQIA-GNN-L2 is used to generate facial template T de\ufb01ned on V by propagating the values of Q from\nthe facial region S to V, i.e. Trelit = P relit(Qrelit|g, \u039b, L). Since the human visual system correlates\nwith the gradient in an image, T should \ufb01t the facial boundary closely and have a smooth transition\nbetween different regions.\nTo generate Trelit de\ufb01ned on V, we construct P relit to propagate the information of Qrelit from\nthe facial region S to the regions with missing and uncertain illumination, like eyes, eyebrows and\nbackground V\\S. The setting of P relit(g, \u039b, L) is as follows:\n\n\u2022 For guided map, g = Qrelit, where g contains illumination of the quotient image in the\nrepresentative region S.\n\u2022 The reaction weight \u039b determines which information is propagated to where. 
Therefore,\nthe values of \u039b are consistent with the spatial location of S and V\\S as \u039b =\ndiag(\u03bbrelit(pi))i=1,...,N , where\n\n$$ \\lambda^{relit}(p_i) = \\begin{cases} 1, & v_i \\in S \\\\ 0, & v_i \\in V \\setminus S; \\end{cases} $$\n\n\u2022 For kQIA, we produce the con\ufb01dence map M that is consistent with V with smooth transition\nof region boundary. The visual information of Qrelit is propagated from the representative\nfacial region S to the regions of eyes, eyebrows and background V\\S. The smoothness parameter\nd is controlled by M, so that d is large (typically d = 10) in V\\S to produce illumination\npropagation, and d is small (typically d = 0.4) in S to preserve the signi\ufb01cant illumination\ndetail.\n\n3.3 Image Layer Combination (QIA-GNN-L3)\n\nQIA-GNN-L3 is the output GNN layer, which combines T with the facial layer of original/reference\nto produce the \ufb01nal face synthesis. For face relighting, we transfer illumination of the reference to\nthe original face by multiplying Trelit and the lighting layer IL as OL = IL \u25e6 Trelit, where \u25e6 is\nan element-wise product. Finally, we recombine the other facial layers to obtain the face relighting\noutput O.\n\n4 Experiment\n\n4.1 Basic Evaluation\n\n(a) Face Relighting (FR)\n\n(b) Low-Light Image Enhancement (LIE)\n\nFigure 2: Basic evaluation of QIA-GNN, where (a) shows face relighting with single target and\nmultiple references; (b) shows low-light image enhancement with illumination maps.\n\nWe use the QIA-GNN to achieve face relighting (FR) and low-light image enhancement (LIE), as\nshown in Fig. 2. Fig. 2a shows FR of the same target with different references, and we can observe\nthere is good consistency between illumination maps and the relighted results. Fig. 
2b shows the\nLIE of different images, which indicates the effectiveness of our QIA-GNN system to capture the\nillumination feature in different scenes.\n\n4.2 Qualitative Evaluation\n\nWe verify our QIA-GNN system for different editing tasks, including face relighting, face swapping,\ntrans\ufb01guring and LIE. Fig. 3 illustrates the comparisons with the state of the art, and indicates that\nthe QIA-GNN system is competitive with related methods. Note that most of the previous systems\nare designed for speci\ufb01c tasks, while our QIA-GNN system is \ufb02exible to perform multiple image\nanalysis or adjustment with the corresponding settings.\nFace Relighting (FR). We compare our method with Li\u2019s [46] and Chen\u2019s [47] methods for face\nrelighting. The results show that our method can relight faces in two patterns. For the \ufb01rst\npattern, we perform QIA for all the RGB channels and obtain a result similar to Li\u2019s method that\ntransfers both the shading and tone to the target. For the other pattern, we perform QIA only for the\nluminance channel of inputs, and obtain a result that transfers only the shading of the reference but\npreserves the original tone of the target.\n\nFigure 3: Qualitative comparisons with related methods, including face relighting (red box) with\nLi\u2019s [46] and Chen\u2019s [47], face swapping (blue box) with Korshunova\u2019s [20], trans\ufb01guring (green\nbox) with Kemelmacher\u2019s [23] and Nirkin\u2019s [24], and low-light image enhancement (black box)\nwith CVC [48] and LIME [28].\n\nFurthermore, our method is complementary to previous methods in two aspects. For visual effects,\n[46] and [47] fail to relight the region outside the face, while ours adaptively generates the missing\nillumination in the background. 
For computation, [46] and [47] require multiple operations derived\nfrom different contexts, while our operations are based on the same GNN model, which can be\nef\ufb01cient to implement and extend.\nIllumination-Aware Face Swapping (FS). Fig. 3 also shows the comparison with the recent work\nof [20] for face swapping. [20] proposed a new face synthesis system that trains a speci\ufb01c CNN\nto transform an input (original) identity into a reference identity with preserved facial properties.\nFor example, the CageNet transforms the input identity into Nicolas Cage with the same expression.\nAlthough [20] has considered the lighting adjustment problem and integrates the lighting loss for\nthe training of the CageNet, the shading and tone consistency could still be further improved by our\nmethod. Note that the QIA-GNN-L2 is set up to propagate quotient feature within the facial region\nfor seamless blending, which is slightly different from the setting for relighting.\nTrans\ufb01guring (TF). Recently, [23] introduced a new face synthesis task, called trans\ufb01guring, which\nlets users trans\ufb01gure their appearance from images by changing hair style, hair color etc. Fig. 3\nshows the comparison with [23] and [24] for trans\ufb01guring. In some cases, parts of the face are\noccluded by hair. To tackle this problem, we integrate the region-aware mask of [10] into our\nsystem and obtain competitive results compared with the state-of-the-art methods [23, 24]. Since the\nregion-aware mask [10] is based on LP, it can be implemented by our GNN model, which indicates\nthe powerful representation of GNN and the \ufb02exibility of our QIA-GNN system.\nLow-Light Image Enhancement (LIE). Contrast enhancement has been extensively studied in\nrecent decades [25, 49, 50, 27], but the enhancement for low-light images is still an unsolved\nproblem [26, 51, 28\u201330, 52]. The main challenges are twofold. 
Firstly, the intensity of the images\nencodes many imaging factors, like illumination of the scene, re\ufb02ection of the object, and the\nviewpoint. Obtaining good low-light enhancement without over-sharpening requires recovering or estimating\nsome properties of the scene and object from image intensity [26, 51, 28\u201330, 52], but it is unfortunately\nan inherently ill-posed problem [53, 31]. Secondly, quality assessment of sharpened images in objective\nmanners is still an open problem [54, 55], and it lacks a benchmark to evaluate the performance of\ndifferent low-light enhancement methods. We focus on the \ufb01rst aspect in this paper, and apply the\nQIA-GNN with new regularization to adaptively enhance low-light images without over-sharpening.\n\nFigure 4: Objective assessment of FS with [20] and TF with [23] and [24] by GMSD [56].\n\nMethod      Img1  Img2  Img3  Img4  Img5  Img6\nCVC [48]    6.54  6.28  6.38  6.75  4.27  5.16\nLIME [28]   7.67  7.43  7.53  7.65  5.79  7.23\nOurs        7.48  7.47  7.55  7.65  6.04  7.71\n\nTable 1: Objective assessment of LIE with CVC [48], LIME [28] and Ours by DE [57].\n\nBased on the Retinex theory [53, 29], we obtain adaptive low-light enhancement via estimation and\nadjustment of the illumination maps of images using the QIA-GNN. It uses the QIA-GNN-L1 as\n\ufb01ltering operation to extract the initial illumination map, and adjusts the map adaptively with smooth\ntransition by QIA-GNN-L2 which acts as propagation operation. 
Finally, the enhancement result is\nproduced by combining the illumination map with the image layer.\nWe compare with Contextual and Variational Contrast enhancement (CVC) [48] and the\nrecently proposed LIME [28] for low-light image enhancement, as shown in Fig. 3 (black box).\nThe results indicate that our method adaptively brightens images without over-sharpening\nthe lighter regions of images in high dynamic range (HDR) manners. It also shows that our method\noutperforms CVC [48] with better tonal consistency and achieves competitive performance with the\nstate-of-the-art LIME [28].\n\n4.3 Quantitative Evaluation\n\nFor facial synthesis (FR, FS, TF), we conducted a small-scale user study to determine which is more\nconsistent with the original target with 10 volunteers (5 males and 5 females) for the results in Fig. 3,\nand our GNN-based results have a higher rank score than the other methods. A larger-scale user\nstudy for more results will be performed in our future research. In addition, we used some metrics\nof image quality assessment for objective evaluation. For FS and TF, we used gradient magnitude\nsimilarity deviation (GMSD) [56] to measure the visual similarity between the target and output pairs\n(shown in Fig. 4), where GMSD1