{"title": "Learning Parametric Sparse Models for Image Super-Resolution", "book": "Advances in Neural Information Processing Systems", "page_first": 4664, "page_last": 4672, "abstract": "Learning accurate prior knowledge of natural images is of great importance for single image super-resolution (SR). Existing SR methods either learn the prior from the low/high-resolution patch pairs or estimate the prior models from the input low-resolution (LR) image. Specifically, high-frequency details are learned in the former methods. Though effective, they are heuristic and have limitations in dealing with blurred LR images; while the latter suffers from the limitations of frequency aliasing. In this paper, we propose to combine those two lines of ideas for image super-resolution. More specifically, the parametric sparse prior of the desirable high-resolution (HR) image patches are learned from both the input low-resolution (LR) image and a training image dataset. With the learned sparse priors, the sparse codes and thus the HR image patches can be accurately recovered by solving a sparse coding problem. Experimental results show that the proposed SR method outperforms existing state-of-the-art methods in terms of both subjective and objective image qualities.", "full_text": "Learning Parametric Sparse Models for Image\n\nSuper-Resolution\n\nYongbo Li, Weisheng Dong\u2217, Xuemei Xie, Guangming Shi1, Xin Li2, Donglai Xu3\nState Key Lab. of ISN, School of Electronic Engineering, Xidian University, China\n\n1Key Lab. of IPIU (Chinese Ministry of Education), Xidian University, China\n\n2Lane Dep. of CSEE, West Virginia University, USA\n\n3Sch. of Sci. and Eng., Teesside University, UK\n\nyongboli@stu.xidian.edu.cn, {wsdong, xmxie}@mail.xidian.edu.cn\n\ngmshi@xidian.edu.cn, Xin.Li@mail.wvu.edu\n\nAbstract\n\nLearning accurate prior knowledge of natural images is of great importance for\nsingle image super-resolution (SR). 
Existing SR methods either learn the prior\nfrom the low/high-resolution patch pairs or estimate the prior models from the\ninput low-resolution (LR) image. Speci\ufb01cally, high-frequency details are learned\nin the former methods. Though effective, they are heuristic and have limitations\nin dealing with blurred LR images; while the latter suffers from the limitations\nof frequency aliasing. In this paper, we propose to combine those two lines of\nideas for image super-resolution. More speci\ufb01cally, the parametric sparse prior\nof the desirable high-resolution (HR) image patches are learned from both the\ninput low-resolution (LR) image and a training image dataset. With the learned\nsparse priors, the sparse codes and thus the HR image patches can be accurately\nrecovered by solving a sparse coding problem. Experimental results show that the\nproposed SR method outperforms existing state-of-the-art methods in terms of both\nsubjective and objective image qualities.\n\n1\n\nIntroduction\n\nImage super-resolution (SR) aiming to recover a high-resolution (HR) image from a single low-\nresolution (LR) image, has important applications in image processing and computer vision, ranging\nfrom high-de\ufb01nition (HD) televisions and surveillance to medical imaging. Due to the information\nloss in the LR image formation, image SR is a classic ill-posed inverse problem, for which strong\nprior knowledge of the underlying HR image is required. Generally, image SR methods can be\ncategorized into two types, i.e., model-based and learning-based methods.\nIn model-based image SR, the selection of image prior is of great importance. The image priors,\nranging from smoothness assumptions to sparsity and structured sparsity priors, have been exploited\nfor image SR [1][3][4][13][14][15][19]. The smoothness prior models, e.g., Tikhonov and total\nvariation (TV) regularizers[1], are effective in suppressing the noise but tend to over smooth image\ndetails. 
The sparsity-based SR methods, assuming that the HR patches have sparse representation with\nrespect to a learned dictionary, have led to promising performances. Due to the ill-posed nature of the\nSR problem, designing an appropriate sparse regularizer is critical for the success of these methods.\nGenerally, parametric sparse distributions, e.g., Laplacian and Generalized Gaussian models, which\ncorrespond to the (cid:96)1 and (cid:96)p (0 \u2264 p \u2264 1) regularizers, are widely used. It has been shown that the\nSR performance can be much boosted by exploiting the structural self-similarity of natural images\n\n\u2217Corresponding author.\n\n30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.\n\n\f[3][4][15]. Though promising SR performance can be achieved by the sparsity-based methods, it\nis rather challenging to recover high-quality HR images for a large scaling factors, as there is no\nsuf\ufb01cient information for accurate estimation of the sparse models from the input LR image.\nInstead of adopting a speci\ufb01cal prior model, learning-based SR methods learn the priors directly\nfrom a large set of LR and HR image patch pairs [2][5][6][8][18]. Speci\ufb01cally, mapping functions\nbetween the LR and the high-frequency details of the HR patches are learned. Popular learning-based\nSR methods include the sparse coding approaches[2] and the more ef\ufb01cient anchored neighborhood\nregression methods (i.e., ANR and A+)[5][6]. More recently, inspired by the great success of the\ndeep neural network (DNN)[16] for image recognition, the DNN based SR methods have also been\nproposed[8], where the DNN models is used to learn the mapping functions between the LR and\nthe high-frequency details of the HR patches. Despite the state-of-the-art performances achieved,\nthese patch-based methods [6][8] have limitations in dealing with the blurred LR images (as shown\nin Sec. 5). 
Instead of learning high-frequency details, in [12] Li et al. proposed to learn parametric\nsparse distributions (i.e., non-zero mean Laplacian distributions) of the sparse codes from retrieved\nHR images that are similar to the LR image. State-of-the-art SR results have been achieved for the\nlandmark LR images, for which similar HR images can be retrieved from a large image set. However,\nit has limitations for general LR images (i.e., it reduces to be the conventional sparsity-based SR\nmethod), for which correlated HR images cannot be found in the image database.\nIn this paper, we propose a novel image SR approach combining the ideas of sparsity-based and\nlearning-based approaches for SR. The sparse prior, i.e., the parametric sparse distributions (e.g.,\nLaplace distribution) are learned from general HR image patches. Speci\ufb01cally, a set of mapping\nfunctions between the LR image patches and the sparse codes of the HR patches are learned. In\naddition to the learned sparse prior, the learned sparse distributions are also combined with those\nestimated from the input LR image. Experimental results show that the proposed method performs\nmuch better than the current state-of-the-art SR approaches.\n\n2 Related works\n\n(cid:88)\n\n(x, \u03b1) = argmin\n\n||y \u2212 Hx||2\n\nIn model-based SR, it is often assumed that the desirable HR image/patches have sparse expansions\nwith respect to a certain dictionary. For a given LR image y = Hx + n, where H \u2208 RM\u00d7N speci\ufb01es\nthe degradation model, x \u2208 RN and n \u2208 RM denote the original image and additive Gaussian noise,\nrespectively. 
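As a concrete illustration of this degradation model (a minimal sketch, not the authors' code; the Gaussian-blur-plus-downsampling variant from Sec. 5.2 is assumed, with kernel width 1.6 and scaling factor 3):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(x, sigma=1.6, scale=3, noise_std=0.0, rng=None):
    """Simulate y = Hx + n: Gaussian blur followed by downsampling.

    A simplified stand-in for the degradation operator H; the kernel
    width and scaling factor follow the Sec. 5.2 setting.
    """
    blurred = gaussian_filter(x, sigma=sigma)
    y = blurred[::scale, ::scale]                  # decimate by `scale`
    if noise_std > 0:
        rng = np.random.default_rng(0) if rng is None else rng
        y = y + noise_std * rng.standard_normal(y.shape)
    return y

x = np.random.default_rng(0).random((96, 96))      # toy HR image
y = degrade(x, sigma=1.6, scale=3)
print(y.shape)                                     # (32, 32)
```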
Sparsity-based SR image reconstruction can be formulated as [3][4]\n2 + \u03bb\u03c8(\u03b1)},\n{||Rix \u2212 D\u03b1i||2\n\u221a\nn \u00d7 \u221a\n\nwhere Ri \u2208 Rn\u00d7N denotes the matrix extracting image patch of size\nn at position i from x,\nD \u2208 Rn\u00d7K denotes the dictionary that is an off-the-shelf basis or learned from an training dataset,\nand \u03c8(\u00b7) denotes the sparsity regularizer. As recovering x from y is an ill-posed inverse problem,\nthe selection of \u03c8(\u00b7) is critical for the SR performance. Common selection of \u03c8(\u00b7) is the (cid:96)p-norm\n(0 \u2264 p \u2264 1) regularizer, where zero-mean sparse distributions of the sparse coef\ufb01cients are assumed.\nIn [12], nonzero-mean Laplacian distributions are used, leading to the following sparsity-based SR\nmethod,\n\n2 + \u03b7\n\n(1)\n\nx,\u03b1\n\ni\n\n(x, \u03b1) = argmin\n\nx,\u03b1\n\n||y \u2212 Hx||2\n\n2 + \u03b7\n\n{||Rix \u2212 D\u03b1i||2\n\n2 + ||\u039bi(\u03b1i \u2212 \u03b2i)||1},\n\n(2)\n\n\u221a\n\n2\u03c32\nn\n\n\u03b8i,j\n\nwhere \u039b = diag( 2\n), \u03b8i and \u03b2i denote the standard derivation and expectation of \u03b1i, respectively.\nIt has been shown in [3] that by estimating {\u03b2i, \u03b8i} from the nonlocal similar image patches of\nthe input image, promising SR performance can be achieved. However, for large scaling factors,\nit is rather challenging to accurately estimate {\u03b2i, \u03b8i} from the input LR image, due to the lack\nof suf\ufb01cient information. To overcome this limitations, Li et al., propose to learn the parametric\ndistributions from retrieved similar HR images [12] via block matching, and obtain state-of-the-art\nSR performance for landmark images. However, for general LR images, for which similar HR images\ncannot be found, the sparse prior (\u03b2i, \u03b8i) cannot be learned.\nLearning-based SR methods resolve the SR problem by learning mapping functions between LR and\nHR image patches [2][6][8]. 
Popular methods include the sparse coding methods [2], where LR/HR\ndictionary pair is jointly learned from a training set. The sparse codes of the LR patches with respect\n\n(cid:88)\n\ni\n\n2\n\n\fto the LR dictionary are inferred via sparse coding and then used to reconstruct the HR patches with\nthe HR dictionary. To reduce the computational complexity, anchored neighborhood points (ANR)\nand its advanced version (i.e., A+) methods [6] have been proposed. These methods \ufb01rst divided the\npatch spaces into many clusters, then LR/HR dictionary pairs are learned for each cluster. Mapping\nfunctions between the LR/HR patches are learned for each cluster via ridge regression. Recently,\ndeep neural network (DNN) model has also been developed to learn the mapping functions between\nthe LR and HR patches [8]. The advantages of the DNN model is that the entire SR pipeline is\njointly optimized via end-to-end learning, leading to state-of-the-art SR performance. Despite the\nexcellent performances, these learning-based methods focusing on learning the mapping functions\nbetween LR and HR patches have limitations in recovering a HR image from a blurry LR image\ngenerated by \ufb01rst applying a low-pass \ufb01ltering followed by downsampling (as shown in Sec. 4). In\nthis paper, we propose a novel image SR method by taking advantages of both the sparse-based and\nthe example-based SR approaches. Speci\ufb01cally, mapping functions between the LR patches and\nthe sparse codes of the desirable HR patches are learned. Hence, sparse prior can be learned from\nboth the training patches and the input LR image. 
With the learned sparse prior, state-of-the-art SR\nperformance can be achieved.\n\n3 Learning Parametric Sparse Models\n\nIn this section, we \ufb01rst propose a novel method to learn the sparse codes of the desirable HR patches\nand then present the method to estimate the parametric distributions from both the predicted sparse\ncodes and those of the LR images.\n\n3.1 Learning the sparse codes from LR/HR patch pairs\nFor a given LR image patch yi \u2208 Rm, we aim to learn the expectation of the sparse code \u03b1i of the\ndesirable HR patch xi with respect to dictionary D. Without the loss of generality, we de\ufb01ne the\nlearning function as\n(3)\nwhere zi denotes the feature vector extracted from the LR patch yi, W \u2208 RK\u00d7m is the weighting\nmatrix and b \u2208 RK is the bias, and g(\u00b7) denotes an activation function. Now, the remaining task\nis to learn the parameters of the learning function of Eq. (3). To learn the parameters, we \ufb01rst\nconstruct a large set of LR feature vectors and HR image patch pairs {(zi, xi)}, i = 1, 2,\u00b7\u00b7\u00b7 , N.\nFor a given dictionary, the sparse codes \u03b1i of xi can be obtained by a sparse coding algorithm. Then,\nthe parameters W = {W, b} can be learned by minimizing the following objective function\n\n\u02dc\u03b1i = f (zi; W, b) = g(W \u2217 zi + b),\n\n(W, b) = argmin\n\nW,b\n\n||\u03b1i \u2212 f (zi; W, b)||2\n2.\n\n(4)\n\nN(cid:88)\n\ni=1\n\n(cid:88)\n\ni\u2208Sk\n\nThe above optimization problem can be iteratively solved by using a stochastic gradient descent\napproach.\nConsidering the highly complexity of the mapping function between the LR feature vectors and the\ndesirable sparse codes, we propose to learn a set of mapping functions for each possible local image\nstructures. Speci\ufb01cally, the K-means clustering algorithm is used to cluster the LR/HR patches into\nK clusters. Then, a mapping function is learned for each cluster. 
After clustering, the LR/HR patches\nin each cluster generally contain similar image structures, and linear mapping function would be\nsuf\ufb01cient to characterize the correlations between the LR feature vectors and the sparse codes of\nthe desirable HR patches. Therefore, for each cluster Sk, the mapping function can be learned via\nminimizing\n\n(Wk, bk) = argmin\nWk,bk\n\n||\u03b1i \u2212 (Wkzi + bk)||2\n2.\n\n(5)\n\nFor simplicity, the bias term bk in the above equation can be absorbed into Wk by rewriting Wk and\nzi as Wk = [Wk, bk] and zi = [z(cid:62)\ni ; 1](cid:62), respectively. Then, the parameters Wk can be easily solved\nvia a least-square method.\nAs the HR patches in each cluster generally have similar image structures, a compact dictionary\nshould be suf\ufb01cient to represent the various HR patches. Hence, instead of learning an overcomplete\ndictionary for all HR patches, an orthogonal basis is learned for each cluster Sk. Speci\ufb01cally, a PCA\n\n3\n\n\fAlgorithm 1 Sparse codes learning algorithm\nInitialization:\n\nconventional SR method;\n\n(a) Construct a set of LR and HR image pairs {y, x} and recover the HR images { \u02c6x} with a\n(b) Extract feature patches zi, the LR and HR patches yi and xi from { \u02c6x, y, x}, respectively;\n(c) Clustering {zi, yi, xi} into K clusters using K-means algorithm.\n\nOuter loop: Iteration on k = 1, 2,\u00b7\u00b7\u00b7 , K\n\n(a) Calculate the PCA basis Dk for each cluster using the HR patches belong to the k-th cluster;\n(b) Computer the sparse codes as \u03b1i = S\u03bb(D(cid:62)\nxi) for each xi, i \u2208 Sk;\n(c) Learn the parameters W of the mapping function via solving Eq. (5).\n\nki\n\nEnd for\n\nOutput: {Dk,Wk}.\n\nbasis, denoted as Dk \u2208 Rn\u00d7n is learned for each Sk, k = 1, 2,\u00b7\u00b7\u00b7 , K. Then, the sparse codes \u03b1i can\nbe easily obtained \u03b1i = S\u03bb(D(cid:62)\nxi), where Dki denotes the PCA basis of the ki-th cluster. 
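Steps (a)-(c) of Algorithm 1 for a single cluster can be sketched as follows (synthetic shapes and an arbitrary threshold; the paper's feature extraction and clustering are assumed to have run beforehand):

```python
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def train_cluster(X_hr, Z_lr, lam=0.05):
    """One cluster of Algorithm 1 (simplified sketch).

    X_hr: (n, N) HR patches as columns; Z_lr: (m, N) LR feature vectors.
    Returns the per-cluster PCA basis D and the mapping W with the bias
    absorbed, as in the paper's reformulation of Eq. (5).
    """
    # (a) orthogonal PCA basis from the cluster's HR patches
    U, _, _ = np.linalg.svd(X_hr - X_hr.mean(axis=1, keepdims=True),
                            full_matrices=False)
    D = U                                    # n x n orthogonal basis
    # (b) sparse codes alpha_i = S_lambda(D^T x_i)
    A = soft_threshold(D.T @ X_hr, lam)
    # (c) least-squares mapping, bias absorbed: z -> [z; 1]
    Z1 = np.vstack([Z_lr, np.ones((1, Z_lr.shape[1]))])
    W, *_ = np.linalg.lstsq(Z1.T, A.T, rcond=None)
    return D, W.T                            # predict codes as W @ [z; 1]

rng = np.random.default_rng(0)
X, Z = rng.standard_normal((49, 200)), rng.standard_normal((30, 200))
D, W = train_cluster(X, Z)                   # D: (49, 49), W: (49, 31)
```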
Regarding\nthe feature vectors zi, we extract feature vectors from an initially recovered HR image, which can be\nobtained with a conventional sparsity-based method. Similar to [5][6], the \ufb01rst- and second-order\ngradients are extracted from the initially recovered HR image as the features. However, other more\neffective features can also be used. The sparse distribution learning algorithm is summarized in\nAlgorithm 1.\n\nki\n\n3.2 Parametric sparse models estimation\n\nAfter learning linearized mapping functions, denoted as \u02dc\u03b1i, the estimates of \u03b1i can be estimated from\nLR patch via Eq. (3). Based on the observation that natural images contain abundant self-repeating\nstructures, a collection of similar patches can often be found for an exemplar patch. Then, the mean\nof \u03b1i can be estimated as a weighted average of the sparse codes of the similar patches. As the\noriginal image is unknown, an initial estimate of the desirable HR image, denoted as \u02c6x is obtained\nusing a conventional SR method, e.g., solving Eq. (2). Then, the search of similar patches can be\nconducted based on \u02c6x. Let \u02c6xi denote the patch extracted from \u02c6x at position i and \u02c6xi,l denote the\npatches similar to \u02c6xi that are within the \ufb01rst L-th closest matches, l = 1, 2,\u00b7\u00b7\u00b7 , L. Denoted by zi,l\nthe corresponding features vectors extracted from \u02c6x. Therefore, the mean of \u03b2i can be estimated by\n\nL(cid:88)\n\nl=1\n\nL(cid:88)\n\n\u02dc\u03b2i =\n\nwi,l \u02dc\u03b1i,l,\n\n(6)\n\nc exp(\u2212|| \u02c6xi,l \u2212 \u02c6xi||/h), c is the normalization constant, and h is the prede\ufb01ned\n\nwhere wi,l = 1\nparameter.\nAdditionally, we can also estimate the mean of space codes \u03b1i directly from the intermediate estimate\nof target HR image. For each initially recovered HR patch \u02c6xi, the sparse codes can be obtained\nvia a sparse coding algorithm. 
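The weighted combination of Eq. (6) can be sketched as follows (synthetic data; L = 12 neighbors as in the parameter setting of Sec. 5):

```python
import numpy as np

def nonlocal_mean_code(anchor_patch, similar_patches, predicted_codes, h=0.5):
    """Estimate beta_i as in Eq. (6): a weighted average of the predicted
    codes of the L most similar patches, with weights
    w_l = (1/c) exp(-||xhat_l - xhat_i|| / h).  A minimal sketch."""
    d = np.linalg.norm(similar_patches - anchor_patch, axis=1)
    w = np.exp(-d / h)
    w /= w.sum()                       # normalization constant c
    return predicted_codes.T @ w       # sum_l w_l * alpha_l

rng = np.random.default_rng(0)
xhat_i = rng.standard_normal(49)                          # anchor patch
xhat_l = xhat_i + 0.1 * rng.standard_normal((12, 49))     # L = 12 neighbors
alpha_l = rng.standard_normal((12, 49))                   # predicted codes
beta = nonlocal_mean_code(xhat_i, xhat_l, alpha_l)
```

When all neighbors coincide with the anchor the weights become uniform, so the estimate degenerates to a plain average of the predicted codes, as expected.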
As the patch space has been clustered into K sub-spaces and a\ncompact PCA basis is computed for each cluster, the sparse code of \u02c6xi can be easily computed as\n\u02c6\u03b1i,j = S\u03bb(D(cid:62)\n\u02c6xi,j), where S\u03bb(\u00b7) is the soft-thresholding function with threshold \u03bb, ki denote the\ncluster that \u02c6xi falls into. The sparse codes of the set of similar patches \u02c6xi,l can also be computed.\nThen, the expectation of \u03b2i can be estimated as\n\nki\n\n\u02c6\u03b2i =\n\nwi,j \u02c6\u03b1i,l.\n\nThen, an improved estimation of \u03b2i can be obtained by combining the above two estimates, i.e.,\n\nl=1\n\n\u03b2i = \u2206 \u02dc\u03b2i + (1 \u2212 \u2206) \u02c6\u03b2i.\n\n4\n\n(7)\n\n(8)\n\n\fwhere \u2206 = \u03c9diag(\u03b4j) \u2208 RK\u00d7K. Similar to [12], \u03b4j is set according to the energy ratio of \u02dc\u03b2i(j) and\n\u02c6\u03b2i(j) as\n\n, rj = \u02dc\u03b2i(j)/ \u02c6\u03b2i(j).\n\n(9)\n\nAnd \u03c9 is a prede\ufb01ned constant. After estimating \u03b2i, the variance of the sparse codes are estimated as\n\nr2\nj\n\n\u03b4j =\n\nj + 1/r2\nr2\nj\n\nL(cid:88)\n\nj=1\n\n\u03b82\ni =\n\n1\nL\n\n( \u02c6\u03b1i,j \u2212 \u03b2i)2.\n\nThe learned parametric Laplacian distributions with {\u03b2i, \u03b8i} for image patches xi are then used with\nthe MAP estimator for image SR in the next section.\n\nImage Super-Resolution with learned Parametric Sparsity Models\n\n4\nWith the learned parametric sparse distributions {(\u03b2i, \u03b8i)}, image SR problem can be formulated as\n\n(cid:88)\n\nL(cid:88)\n\n( \u02c6x, \u02c6Ai) = argmin\nxi,Ai\n\n||y \u2212 xH||2\n\n2 + \u03b7\n\n{|| \u02dcRix \u2212 DkiAi||2\n\nF + \u03bb\n\n||\u039bi(\u03b1i,l \u2212 \u03b2i)||1},\n\n(11)\n\ni\n\nl=1\n\nwhere \u02dcRix = [Ri,1x, Ri,2x,\u00b7\u00b7\u00b7 , Ri,Lx] \u2208 Rn\u00d7L denotes the matrix formed by the similar patches,\nAi = [\u03b1i,1,\u00b7\u00b7\u00b7 , \u03b1i,L], Dki denotes the selected PCA basis of the ki-th cluster, and \u039bi = 
diag( 1\n).\nIn Eq.\n(11), the group of similar patches is assumed to follow the same estimated parametric\ndistribution {\u03b2i, \u03b8i}. Eq. (11) can be approximately solved via alternative optimization. For \ufb01xed\nxi, the sets of sparse codes Ai can be solved by minimizing\n\n\u03b8i,j\n\n\u02c6Ai = argmin\n\nAi\n\n|| \u02dcRix \u2212 DkiAi||2\n\nF + \u03bb\n\n||\u039bi(\u03b1i,l \u2212 \u03b2i)||1\n\n(12)\n\nAs the orthogonal PCA basis is used, the above equation can be solved in closed-form solution, i.e.,\n\n(10)\n\n(13)\n\n(14)\n\n(15)\n\nL(cid:88)\n\nl=1\n\n(cid:88)\n\ni\n\n\u02c6\u03b1i,l = S\u03c4i(D(cid:62)\n\nkiRi,lx \u2212 \u03b2i) + \u03b2i,\n\nwhere \u03c4i = \u03bb/\u03b8i. With estimated \u02c6Ai, the whole image can be estimated by solving\n\n\u02c6x = argmin\n\nx\n\n||y \u2212 xH||2\n\n2 + \u03b7\n\n|| \u02dcRix \u2212 DkiAi||2\nF ,\n\nwhich is a quadratic optimization problem and admits a closed-form solution, as\n\n\u02c6x = (H(cid:62)H + \u03b7\n\n\u02dcRi)\u22121(H(cid:62)y + \u03b7\n\n(cid:62)\ni Dki\n\n\u02dcR\n\n\u02c6Ai),\n\n(cid:62)\ni\n\n(cid:88)\n\u02c6Ai =(cid:80)L\n\n\u02dcR\n\ni\n\n(cid:62)\ni Dki\n\n(cid:88)\n\ni\n\n\u02dcRi =(cid:80)L\n\n(cid:62)\ni\n\nl=1 R(cid:62)\n\nl Rl and \u02dcR\n\nwhere \u02dcR\nl Dki \u02c6\u03b1i,l. As the matrix to be inverted in Eq.\n(15) is very large, the conjugate gradient algorithm is used to compute Eq. (15). The proposed image\nSR algorithm is summarized in Algorithm 2. In Algorithm 2, we iteratively extract the feature\npatches from \u02c6x(t) and learn \u02dc\u03b2i from the training set, leading to further improvements in predicting\nthe sparse codes with the learned mapping functions.\n\nl=1 R(cid:62)\n\n5 Experimental results\n\nIn this section, we verify the performance of the proposed SR method. For fair comparisons, we\nuse the relative small training set of images used in [2][6]. 
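Before turning to the results, the closed-form shrinkage step of Eq. (13) can be verified numerically in isolation. Below is a per-coefficient sketch for an orthogonal basis; under the energy (c - a)^2 + tau*|a - beta| the threshold comes out as tau/2, and the paper's exact constant factors may differ:

```python
import numpy as np

def shifted_soft_threshold(c, beta, tau):
    """Closed-form minimizer of (c - a)^2 + tau * |a - beta|,
    i.e. the per-coefficient form of Eq. (13): soft-thresholding
    shifted by the learned mean beta."""
    return beta + np.sign(c - beta) * np.maximum(np.abs(c - beta) - tau / 2, 0)

# check against brute-force minimization on a fine grid
c, beta, tau = 0.8, 0.3, 0.4
grid = np.linspace(-3, 3, 600001)
energy = (c - grid) ** 2 + tau * np.abs(grid - beta)
a_star = grid[np.argmin(energy)]
a_closed = shifted_soft_threshold(c, beta, tau)
print(abs(a_star - a_closed) < 1e-4)   # True
```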
The training images are used to simulate\nthe LR images, which are recovered by a sparsity-based method (e.g., the NCSR method [3]). Total\n100, 000 features and HR patches pairs are extracted from the reconstructed HR images and the\noriginal HR images. Patches of size 7 \u00d7 7 are extracted from the feature images and HR images.\nSimilar to [5][6], the PCA technique is used to reduce the dimensions of the feature vectors. The\ntraining patches are clustered into 1000 clusters. The other major parameters of the proposed SR\n\n5\n\n\fAlgorithm 2 Image SR with Learned Sparse Representation\nInitialization:\n\n(a) Initialize \u02c6x(0) with a conventional SR method;\n(b) Set parameters \u03b7 and \u03bb;\n\nOuter loop: Iteration over t = 0, 1,\u00b7\u00b7\u00b7 , T\n\n(a) Extract feature vectors zi from \u02c6x(t) and cluster the patches into clusters;\n(b) Learn \u02dc\u03b2i for each local patch using Eq. (6);\n(c) Update the estimate of \u03b2i using Eq. (8) and estimate \u03b8i with Eq. (10);\n(d) Inner loop (solve Eq.(11)): iteration over j = 1, 2,\u00b7\u00b7\u00b7 , J;\n\ni\n\nby solving Eq.(13);\n\n(I) Compute A(j+1)\n(II) Update the whole image \u02c6x(j+1) via Eq. (15);\n(III) Set x(t+1) = x(j+1) if j = J.\nEnd for\nOutput: x(t+1).\n\nmethod are set as: L = 12, T = 8, and J = 10. The proposed SR method is compared with several\ncurrent state-of-the-art image SR methods, i.e., the sparse coding based SR method (denoted as\nSCSR)[2], the SR method based on sparse regression and natural image prior (denoted as KK) [7],\nthe A+ method [6], the recent SRCNN method [8], and the NCSR method [3]. Note that the NCSR is\nthe current sparsity-based SR method. 
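The feature pipeline described above (first- and second-order gradients, 7 x 7 patches, PCA-reduced feature vectors) can be sketched as follows; the patch stride and the reduced dimension are illustrative assumptions, not values from the paper:

```python
import numpy as np

def gradient_features(img):
    """First- and second-order gradient maps used as LR features
    (in the spirit of the A+-style features described in the text)."""
    gx = np.gradient(img, axis=1)
    gy = np.gradient(img, axis=0)
    gxx = np.gradient(gx, axis=1)
    gyy = np.gradient(gy, axis=0)
    return np.stack([gx, gy, gxx, gyy])      # (4, H, W)

def extract_patches(maps, size=7, stride=3):
    """Flatten size x size patches from each feature map into vectors."""
    _, H, W = maps.shape
    feats = []
    for i in range(0, H - size + 1, stride):
        for j in range(0, W - size + 1, stride):
            feats.append(maps[:, i:i + size, j:j + size].ravel())
    return np.array(feats)

def pca_reduce(F, dim=30):
    """Project feature vectors onto their top `dim` principal components."""
    Fc = F - F.mean(axis=0)
    _, _, Vt = np.linalg.svd(Fc, full_matrices=False)
    return Fc @ Vt[:dim].T

img = np.random.default_rng(0).random((48, 48))
F = extract_patches(gradient_features(img))   # (num_patches, 4 * 49)
Z = pca_reduce(F, dim=30)                     # reduced feature vectors
```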
Three images sets, i.e., Set5[9], Set14[10] and BSD100[11],\nwhich consists of 5, 14 and 100 images respectively, are used as the test images.\nIn this paper, we consider two types of degradation when generating the LR images, i.e., the bicubic\nimage resizing function implemented with imresize in matlab and Gaussian blurring followed by\ndownsampling with a scaling factor, both of which are commonly used in the literature of image SR.\n\n5.1\n\nImage SR for LR images generated with bicubic interpolation function\n\nIn [2][6][7][8], the LR images are generated with the bicubic interpolation function (i.e., imresize\nfunction in Matlab), i.e., y = B(x) + n, where B(\u00b7) denotes the bicubic downsampling function. To\ndeal with this type of degradation, we implement the degradation matrix H as an operator that resizes\ns and implement H(cid:62) as an operator that\na HR image using bicubic function with scaling factors of 1\nupscales a LR image using bicubic function with scaling factor s, where s = 2, 3, 4. The average\nPSNR and SSIM results of the reconstructed HR images are reported in Table 1. It can be seen that\nthe SRCNN method performs better than the A+ and the SCSR methods. It is surprising to see that\nthe NCSR method, which only exploits the internal similar samples performs comparable with the\nSRCNN method. By exploiting both the external image patches and the internal similar patches, the\nproposed method outperforms the NCSR. The average PSNR gain over SRCNN can be up to 0.64\ndB. Parts of some reconstructed HR images by the test methods are shown in Fig. 1, from which\nwe can see that the proposed method reproduces the most visually pleasant HR images than other\ncompeting methods. 
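The operator pair H and H^T described above can be mimicked with any linear resampling routine. The sketch below substitutes a box average for the bicubic resize (an assumption, since Matlab's imresize is not available here) and checks the adjoint identity <Hx, y> = <x, H^T y> that the reconstruction relies on:

```python
import numpy as np

def H(x, s=3):
    """Downsampling operator: s x s box average (a simple stand-in for
    the bicubic resize used in the paper)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def Ht(y, s=3):
    """Exact adjoint of the box-average H: replicate each LR pixel over
    an s x s block, scaled by 1/s^2."""
    return np.kron(y, np.ones((s, s))) / s**2

rng = np.random.default_rng(0)
x, y = rng.standard_normal((96, 96)), rng.standard_normal((32, 32))
lhs = np.sum(H(x) * y)         # <Hx, y>
rhs = np.sum(x * Ht(y))        # <x, H^T y>
print(np.isclose(lhs, rhs))    # True
```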
Please refer to the supplementary \ufb01le for more visual comparison results.\n\n5.2\n\nImage SR for LR images generated with Gaussian blur followed by downsampling\n\nAnother commonly used degradation process is to \ufb01rst apply a Gaussian kernel followed by down-\nsampling. In this experimental setting, the 7 \u00d7 7 Gaussian kernel of standard deviation of 1.6 is used,\nfollowed by downsampling with scaling factor s = 2, 3, 4. For these SCSR, KK, A+ and SRCNN\nmethods, which cannot deal with the Gaussian blur kernel, the iterative back-projection [17] method\nis applied to the reconstructed HR images by those methods as a post processing to remove the\nblur. The average PSNR and SSIM results on the three test image sets are reported in Table 2. It\ncan be seen that the performance of the example-based methods, i.e., SCSR[2], KK[7], A+[6] and\nSRCNN[8] methods are much worse than the NCSR [3] method. Compared with the NCSR method,\nthe average PSNR gain of the proposed method can be up to 0.46 dB, showing the effectiveness of\nthe proposed sparse codes learning method. Parts of the reconstructed HR images are shown in Fig. 
2\n\n6\n\n\fTable 1: Average PSNR and SSIM results of the test methods (LR images generated with bicubic\nresizing function)\n\nImages\nUpscaling\nSCSR[2]\n\nKK[7]\n\nA+[6]\n\nSRCNN[8]\n\nNCSR[3]\n\nProposed\n\n\u00d72\n-\n\n36.22\n0.9514\n36.55\n0.9544\n36.66\n0.9542\n36.68\n0.9550\n36.99\n0.9551\n\nSe5\n\u00d73\n31.42\n0.8821\n32.29\n0.9037\n32.59\n0.9088\n32.75\n0.9090\n33.05\n0.9149\n33.39\n0.9173\n\n\u00d74\n-\n\n30.03\n0.8544\n30.29\n0.8603\n30.49\n0.8628\n30.77\n0.8720\n31.04\n0.8779\n\n\u00d72\n-\n\n32.12\n0.9029\n32.28\n0.9056\n32.45\n0.9067\n32.26\n0.9058\n32.61\n0.9072\n\nSet14\n\u00d73\n28.31\n0.7954\n28.39\n0.8135\n29.13\n0.8188\n29.30\n0.8215\n29.30\n0.8239\n29.59\n0.8264\n\n\u00d74\n-\n\n27.15\n0.7422\n27.33\n0.7491\n27.50\n0.7513\n27.52\n0.7563\n27.77\n0.7620\n\n\u00d72\n-\n\n31.08\n0.8834\n31.21\n0.8863\n31.36\n0.8879\n31.14\n0.8863\n31.42\n0.8879\n\nBSD100\n\n\u00d73\n26.54\n0.7729\n28.15\n0.7780\n28.29\n0.7835\n28.41\n0.7863\n28.37\n0.7872\n28.56\n0.7899\n\n\u00d74\n-\n\n26.69\n0.7017\n26.82\n0.7087\n26.90\n0.7103\n26.91\n0.7143\n27.08\n0.7187\n\n(a) Original\n\n(b) Bicubic\n\n(c) SCSR / 26.01dB\n\n(d) KK / 26.49dB\n\n(e) A+ / 26.55dB\n\n(f) SRCNN / 26.71dB\n\n(g) NCSR / 27.11dB\n\n(h) Proposed / 27.35dB\n\nFigure 1: SR results on image \u201986000\u2019 of BSD100 of scaling factor 3 (LR image generated with\nbicubic interpolation function).\n\nand Fig. 3. Obviously, the proposed method can recover sharper edges and \ufb01ner details than other\ncompeting methods.\n\n6 Conclusion\n\nIn this paper, we propose a novel approach for learning parametric sparse models for image super-\nresolution. Speci\ufb01cally, mapping functions between the LR patch and the sparse codes of the desirable\nHR patches are learned from a training set. Then, parametric sparse distributions are estimated from\nthe learned sparse codes and those estimated from the input LR image. 
With the learned sparse\nmodels, the sparse codes and thus the HR image patches can be accurately recovered by solving a\nsparse coding problem. Experimental results show that the proposed SR method outperforms existing\nstate-of-the-art methods in terms of both subjective and objective image qualities.\n\nAcknowledgments\n\nThis work was supported in part by the Natural Science Foundation (NSF) of China under Grants(No.\nNo. 61622210, 61471281, 61632019, 61472301, and 61390512), in part by the Specialized Research\nFund for the Doctoral Program of Higher Education (No. 20130203130001).\n\n7\n\n\fTable 2: Average PSNR and SSIM results of the test methods of scaling factor 3 (LR images generated\nwith Gaussian kernel followed by downsampling)\nA+[6]\n29.39\n0.8502\n26.96\n0.7627\n26.59\n0.7331\n\nSCSR[2] KK[7]\n30.28\n0.8536\n27.46\n0.7640\n27.10\n0.7342\n\n33.49\n0.9165\n29.63\n0.8255\n28.60\n0.7887\n\n30.22\n0.8484\n27.51\n0.7619\n27.10\n0.7338\n\n33.03\n0.9106\n29.28\n0.8203\n28.35\n0.7841\n\nSet5\n\nSet14\n\nBSD100\n\nSRCNN[8] NCSR[3]\n\nProposed\n\n30.20\n08514\n27.48\n0.7638\n27.11\n0.7338\n\n(a) Original\n\n(b) Bicubic\n\n(c) SCSR / 29.85dB\n\n(d) KK / 29.94dB\n\n(e) A+ / 29.48dB\n\n(f) SRCNN / 29.88dB (g) NCSR / 32.97dB (h) Proposed / 33.84dB\n\nFigure 2: SR results on \u2019Monarch\u2019 from Set14 of scaling factor 3 (LR images generated with Gaussian\nblur followed downsampling).\n\n(a) Original\n\n(b) Bicubic\n\n(c) SCSR / 32.22dB\n\n(d) KK / 32.12dB\n\n(e) A+ / 30.81dB\n\n(f) SRCNN / 32.16dB (g) NCSR / 34.59dB (h) Proposed / 35.15dB\n\nFigure 3: SR results on \u2019Pepper\u2019 from Set14 of scaling factor 3 (LR images generated with Gaussian\nblur followed downsampling).\n\n8\n\n\fReferences\n\n[1] A. Marquina and S. J. Osher. Image super-resolution by TV-regularization and bregman iteration. Journal of\nScienti\ufb01c Computing, 37(3):367\u2013382, 2008.\n[2] J. Yang, J. Wright, T. S. Huang, and Y. Ma. Image super-resolution via sparse representation. 
IEEE\ntransactions on image processing, 19(11):2861\u20132873, 2010.\n[3] W. Dong, L. Zhang, G. Shi, and X. Li. Nonlocally centralized sparse representation for image restoration.\nIEEE Transactions on Image Processing, 22(4):1620\u20131630, 2013.\n[4] W. Dong, G. Shi, Y. Ma, and X. Li. Image restoration via simultaneous sparse coding: Where structured\nsparsity meets gaussian scale mixture. International Journal of Computer Vision, 114(2-3):217\u2013232, 2015.\n[5] R. Timofte, V. De Smet, and L. Van Gool. Anchored neighborhood regression for fast example-based\nsuper-resolution. In Proceedings of the IEEE International Conference on Computer Vision, pages 1920\u20131927,\n2013.\n[6] R. Timofte, V. De Smet, and L. Van Gool. A+: Adjusted anchored neighborhood regression for fast\nsuper-resolution. In Asian Conference on Computer Vision, pages 111\u2013126. Springer, 2014.\n[7] K. I. Kim and Y. Kwon. Single-image super-resolution using sparse regression and natural image prior. IEEE\nTransactions on Pattern Analysis and Machine Intelligence, 32(6):1127\u20131133, 2010.\n[8] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional networks. IEEE\ntransactions on pattern analysis and machine intelligence, 38(2):295\u2013307, 2016.\n[9] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. AlberiMorel. Low-complexity single-image super-\nresolution based on nonnegative neighbor embedding. 2012.\n[10] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In International\nconference on curves and surfaces, pages 711\u2013730. Springer, 2010.\n[11] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its\napplication to evaluating segmentation algorithms and measuring ecological statistics. In Computer Vision, 2001.\nICCV 2001. Proceedings. Eighth IEEE International Conference on, volume 2, pages 416\u2013423. IEEE, 2001.\n[12] Y. Li, W. Dong, G. Shi, and X. Xie. 
Learning parametric distributions for image super-resolution: Where\npatch matching meets sparse coding. In Proceedings of the IEEE International Conference on Computer Vision,\npages 450\u2013458, 2015.\n[13] W. Dong, L. Zhang, G. Shi, and X. Wu. Image deblurring and super-resolution by adaptive sparse domain\nselection and adaptive regularization. IEEE Transactions on Image Processing, 20(7):1838\u20131857, 2011.\n[14] W. Dong, L. Zhang, and G. Shi. Centralized sparse representation for image restoration. In 2011 Interna-\ntional Conference on Computer Vision, pages 1259\u20131266. IEEE, 2011.\n[15] G. Yu, G. Sapiro, and S. Mallat. Solving inverse problems with piecewise linear estimators: From gaussian\nmixture models to structured sparsity. IEEE Transactions on Image Processing, 21(5):2481\u20132499, 2012.\n[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classi\ufb01cation with deep convolutional neural\nnetworks. In Advances in neural information processing systems, pages 1097\u20131105, 2012.\n[17] M. Irani and S. Peleg. Motion analysis for image enhancement: Resolution, occlusion, and transparency.\nJournal of Visual Communication and Image Representation, 4(4):324\u2013335, 1993.\n[18] D. Dai, R. Timofte, and L. Van Gool. Jointly optimized regressors for image super-resolution. In Computer\nGraphics Forum, volume 34, pages 95\u2013104. Wiley Online Library, 2015.\n[19] K. Egiazarian and V. Katkovnik. Single image super-resolution via BM3D sparse coding. In Signal\nProcessing Conference (EUSIPCO), 2015 23rd European, pages 2849\u20132853. 
IEEE, 2015.\n\n9\n\n\f", "award": [], "sourceid": 2330, "authors": [{"given_name": "Yongbo", "family_name": "Li", "institution": "Xidian University"}, {"given_name": "Weisheng", "family_name": "Dong", "institution": "Xidian University"}, {"given_name": "Xuemei", "family_name": "Xie", "institution": "Xidian University"}, {"given_name": "GUANGMING", "family_name": "Shi", "institution": "Xidian University"}, {"given_name": "Xin", "family_name": "Li", "institution": "WVU"}, {"given_name": "Donglai", "family_name": "Xu", "institution": "Teesside University"}]}