{"title": "Deep Convolutional Neural Network for Image Deconvolution", "book": "Advances in Neural Information Processing Systems", "page_first": 1790, "page_last": 1798, "abstract": "Many fundamental image-related problems involve deconvolution operators. Real blur degradation seldom complies with an deal linear convolution model due to camera noise, saturation, image compression, to name a few. Instead of perfectly modeling outliers, which is rather challenging from a generative model perspective, we develop a deep convolutional neural network to capture the characteristics of degradation. We note directly applying existing deep neural networks does not produce reasonable results. Our solution is to establish the connection between traditional optimization-based schemes and a neural network architecture where a novel, separable structure is introduced as a reliable support for robust deconvolution against artifacts. Our network contains two submodules, both trained in a supervised manner with proper initialization. They yield decent performance on non-blind image deconvolution compared to previous generative-model based methods.", "full_text": "Deep Convolutional Neural Network for Image\n\nDeconvolution\n\nLi Xu \u2217\n\nLenovo Research & Technology\n\nxulihk@lenovo.com\n\nJimmy SJ. Ren\n\nLenovo Research & Technology\njimmy.sj.ren@gmail.com\n\nCe Liu\n\nJiaya Jia\n\nMicrosoft Research\n\nThe Chinese University of Hong Kong\n\nceliu@microsoft.com\n\nleojia@cse.cuhk.edu.hk\n\nAbstract\n\nMany fundamental image-related problems involve deconvolution operators. Real\nblur degradation seldom complies with an ideal linear convolution model due to\ncamera noise, saturation, image compression, to name a few. Instead of perfectly\nmodeling outliers, which is rather challenging from a generative model perspec-\ntive, we develop a deep convolutional neural network to capture the characteristics\nof degradation. 
We note that directly applying existing deep neural networks does not produce reasonable results. Our solution is to establish a connection between traditional optimization-based schemes and a neural network architecture, where a novel, separable structure is introduced as a reliable support for robust deconvolution against artifacts. Our network contains two submodules, both trained in a supervised manner with proper initialization. They yield decent performance on non-blind image deconvolution compared to previous generative-model-based methods.\n\n1 Introduction\n\nMany image and video degradation processes can be modeled as translation-invariant convolution. To restore such visual data, the inverse process, i.e., deconvolution, becomes a vital tool in motion deblurring [1, 2, 3, 4], super-resolution [5, 6], and extended depth of field [7].\n\nIn applications involving images captured by cameras, outliers such as saturation, limited image boundary, noise, and compression artifacts are unavoidable. Previous research has shown that improperly handling these problems can produce a broad set of content-related artifacts that are very difficult to remove. Accordingly, prior work was dedicated to modeling and addressing each particular type of artifact in non-blind deconvolution: suppressing ringing artifacts [8], removing noise [9], and dealing with saturated regions [9, 10]. These methods can be further refined by incorporating patch-level statistics [11] or other schemes [4]. Because each method has its own specialty as well as limitations, no solution yet uniformly addresses all these issues. One example is shown in Fig. 1: a partially saturated blurred image with compression errors already causes many existing approaches to fail.\n\nOne possibility for removing these artifacts is to employ generative models. 
However, these models are usually built on strong assumptions, such as independent and identically distributed noise, which may not hold for real images. This accounts for the fact that even advanced algorithms can be affected when the image blur properties change slightly.\n\n*Project webpage: http://www.lxu.me/projects/dcnn/. The paper is partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region (Project No. 413113).\n\nFigure 1: A challenging deconvolution example. (a) is the blurry input with partially saturated regions. (b) is the result of [3] using a hyper-Laplacian prior. (c) is our result.\n\nIn this paper, we approach natural image deconvolution without modeling the physical or mathematical characteristics of the degradation. Instead, we show a new direction: building a data-driven system using image samples that can be easily produced by cameras or collected online.\n\nWe use a convolutional neural network (CNN) to learn the deconvolution operation without needing to know the cause of visual artifacts. We also do not rely on any pre-processing to deblur the image, unlike previous learning-based approaches [12, 13]. In fact, it is non-trivial to find a proper network architecture for deconvolution. Previous denoising neural networks [14, 15, 16] cannot be directly adopted, since deconvolution may involve many neighboring pixels and results in a very complex energy function with nonlinear degradation. 
This makes parameter learning quite challenging.\n\nIn our work, we bridge the gap between an empirically determined convolutional neural network and existing generative-model approaches in the context of the pseudo-inverse of deconvolution. This enables a practical system and, more importantly, provides an empirically effective strategy to initialize the weights of the network, which otherwise cannot be easily obtained with the conventional random-initialization training procedure. Experiments show that our system outperforms previous ones, especially when the blurred input images are partially saturated.\n\n2 Related Work\n\nDeconvolution has been studied in different fields due to its fundamental role in image restoration. Most previous methods tackle the problem from a generative perspective, assuming a known image noise model and natural image gradients following certain distributions.\n\nIn the Richardson-Lucy method [17], image noise is assumed to follow a Poisson distribution. Wiener deconvolution [18] imposes an equivalent Gaussian assumption on both noise and image gradients. These early approaches suffer from overly smoothed edges and ringing artifacts.\n\nRecent developments in deconvolution show that regularization terms with sparse image priors are important to preserve sharp edges and suppress artifacts. The sparse image priors follow heavy-tailed distributions, such as a Gaussian mixture model [1, 11] or a hyper-Laplacian [7, 3], which can be efficiently optimized using half-quadratic (HQ) splitting [3]. To capture image statistics with larger spatial support, the energy is further modeled within a Conditional Random Field (CRF) framework [19] and on image patches [11]. While the last step of the HQ method is quadratic optimization, Schmidt et al. [4] showed that it is possible to directly train a Gaussian CRF from synthetic blur data.\n\nTo handle outliers such as saturation, Cho et al. 
[9] used variational EM to exclude outlier regions from a Gaussian likelihood. Whyte et al. [10] introduced an auxiliary variable into the Richardson-Lucy method. An explicit denoising pass can be added to deconvolution, where the denoising approach is carefully engineered [20] or trained from noisy data [12]. The generative approaches typically have difficulty handling complex outliers that are not independent and identically distributed.\n\nAnother trend in image restoration is to leverage deep neural network structures and big data to train the restoration function. The degradation is then no longer limited to one image noise model. Burger et al. [14] showed that plain multi-layer perceptrons can produce decent results and handle different types of noise. Xie et al. [15] showed that a stacked denoising autoencoder (SDAE) structure [21] is a good choice for denoising and inpainting. Agostinelli et al. [22] generalized it by combining multiple SDAEs to handle different types of noise. In [23] and [16], the convolutional neural network (CNN) architecture [24] was used to handle strong noise such as raindrops and lens dirt. Schuler et al. [13] added MLPs to a direct deconvolution to remove artifacts. Though these network structures work well for denoising, they do not work similarly well for deconvolution. How to adapt the architecture is the main problem addressed in this paper.\n\n3 Blur Degradation\n\nWe consider real-world image blur that suffers from several types of degradation, including clipped intensity (saturation), camera noise, and compression artifacts. The blur model is given by\n\nŷ = ψ_b[φ(αx ∗ k + n)],   (1)\n\nwhere αx represents the latent sharp image. The notation α ≥ 1 indicates that αx can have values exceeding the dynamic range of camera sensors and thus be clipped. 
k is the known convolution kernel, typically referred to as a point spread function (PSF), and n models additive camera noise. φ(·) is a clipping function to model saturation, defined as φ(z) = min(z, z_max), where z_max is a range threshold. ψ_b[·] is a nonlinear (e.g., JPEG) compression operator.\n\nWe note that even with ŷ and kernel k, restoring αx is intractable, simply because of the information loss caused by clipping. In this regard, our goal is to restore the clipped input x̂, where x̂ = φ(αx).\n\nAlthough solving for x̂ with a complex energy function that involves Eq. (1) is difficult, generating a blurry image from an input x is straightforward: we synthesize it according to the convolution model, incorporating all possible kinds of image degradation. This motivates a learning procedure for deconvolution using training image pairs {x̂_i, ŷ_i}, where index i ∈ N.\n\n4 Analysis\n\nThe goal is to train a network architecture f(·) that minimizes\n\n(1/(2|N|)) Σ_{i∈N} ‖f(ŷ_i) − x̂_i‖²,   (2)\n\nwhere |N| is the number of image pairs in the sample set.\n\nWe tried two recent deep neural networks to solve this problem, but both failed. One is the Stacked Sparse Denoising Autoencoder (SSDAE) [15] and the other is the convolutional neural network (CNN) used in [16]. Both are designed for image denoising. For SSDAE, we use patch size 17 × 17 as suggested in [14]. The CNN implementation is provided by the authors of [16]. We collect two million sharp patches together with their blurred versions for training.\n\nOne example is shown in Fig. 2, where (a) is a blurred image. Fig. 2(b) and (c) show the results of SSDAE and CNN. The result of SSDAE in (b) is still blurry. The CNN structure works relatively better, but it suffers from remaining blurry edges and strong ghosting artifacts. 
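As an aside, the degradation model of Eq. (1) is simple to implement, which is what makes training pairs cheap to synthesize. A minimal sketch follows; the box PSF, the values of α and the noise level, and the 8-bit quantization standing in for ψ_b are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.signal import fftconvolve

def synthesize_pair(x, k, alpha=1.2, sigma=0.01, z_max=1.0, rng=None):
    """Sketch of Eq. (1): y_hat = psi_b[phi(alpha*x conv k + n)].

    psi_b is approximated here by 8-bit quantization; alpha and sigma
    are illustrative values."""
    rng = np.random.default_rng(rng)
    blurred = fftconvolve(alpha * x, k, mode="same")       # (alpha*x) * k
    noisy = blurred + rng.normal(0.0, sigma, x.shape)      # + n, camera noise
    clipped = np.minimum(noisy, z_max)                     # phi(z) = min(z, z_max)
    y_hat = np.round(clipped * 255.0) / 255.0              # stand-in for psi_b[.]
    x_hat = np.minimum(alpha * x, z_max)                   # target x_hat = phi(alpha*x)
    return x_hat, y_hat

x = np.random.default_rng(0).random((64, 64))              # toy sharp image
k = np.full((7, 7), 1.0 / 49)                              # simple box PSF
x_hat, y_hat = synthesize_pair(x, k, rng=1)
```

Repeating this over many sampled patches yields the training set {x̂_i, ŷ_i} that the objective in Eq. (2) is minimized over.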
This is because these network structures are designed for denoising and do not account for the necessary properties of deconvolution. More explanation is provided from a generative perspective in what follows.\n\n4.1 Pseudo Inverse Kernels\n\nThe deconvolution task can by nature be approximated by a convolutional network. We consider the following simple linear blur model\n\ny = x ∗ k.\n\nThe spatial convolution can be transformed into a frequency-domain multiplication, yielding\n\nF(y) = F(x) · F(k).\n\n(a) input (b) SSDAE [15] (c) CNN [16] (d) Ours\n\nFigure 2: Existing stacked denoising autoencoder and convolutional neural network structures cannot solve the deconvolution problem.\n\nFigure 3: Pseudo-inverse kernel and deconvolution examples.\n\nF(·) denotes the discrete Fourier transform (DFT). Operator · is element-wise multiplication. In the Fourier domain, x can be obtained as\n\nx = F⁻¹(F(y)/F(k)) = F⁻¹(1/F(k)) ∗ y,\n\nwhere F⁻¹ is the inverse discrete Fourier transform. While the solver for x is written in the form of a spatial convolution with kernel F⁻¹(1/F(k)), this kernel is actually a repetitive signal spanning the whole spatial domain without compact support. When noise arises, regularization terms are commonly introduced to avoid division by zero in the frequency domain, which makes the pseudo-inverse fall off quickly in the spatial domain [25].\n\nThe classical Wiener deconvolution is equivalent to using a Tikhonov regularizer [2]. It can be expressed as\n\nx = F⁻¹( (1/F(k)) · |F(k)|² / (|F(k)|² + 1/SNR) ) ∗ y = k† ∗ y,\n\nwhere SNR is the signal-to-noise ratio and k† denotes the pseudo-inverse kernel. Strong noise leads to a large 1/SNR, which corresponds to a strongly regularized inversion. We note that with the introduction of SNR, k† becomes compact with a finite support. Fig. 
3(a) shows a disk blur kernel of radius 7, which is commonly used to model focal blur. The pseudo-inverse kernel k† with SNR = 1e−4 is given in Fig. 3(b). A blurred image with this kernel is shown in Fig. 3(c). Deconvolution results with k† are in (d). A level of blur is removed from the image, but noise and saturation cause visual artifacts, consistent with our understanding of Wiener deconvolution.\n\nAlthough the Wiener method is not state-of-the-art, its byproduct, an inverse kernel with finite yet large spatial support, is vastly useful in our neural network system. It shows that deconvolution can be well approximated by spatial convolution with sufficiently large kernels. This explains the unsuccessful direct application of SSDAE and CNN to deconvolution in Fig. 2 as follows.\n\n• SSDAE does not capture well the nature of convolution with its fully connected structure.\n• CNN performs better, since deconvolution can be approximated by large-kernel convolution as explained above.\n• Previous CNNs use small convolution kernels, which is not an appropriate configuration for our deconvolution problem.\n\nIn summary, using deep neural networks to perform deconvolution is by no means straightforward. Simply modifying the network to employ large convolution kernels would make training even harder. We present a new structure to update the network in what follows. Our result in Fig. 3 is shown in (e).\n\n5 Network Architecture\n\nWe transform the simple pseudo-inverse kernel for deconvolution into a convolutional network, based on the kernel separability theorem. This makes the network more expressive, with mapping to higher dimensions to accommodate nonlinearity. The system benefits from large training data.\n\n5.1 Kernel Separability\n\nKernel separability is achieved via singular value decomposition (SVD) [26]. 
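The pseudo-inverse construction above can be sketched numerically. Note that (1/F(k)) · |F(k)|² / (|F(k)|² + 1/SNR) simplifies to conj(F(k)) / (|F(k)|² + 1/SNR); below, the quoted SNR = 1e−4 is read as the regularization constant added in the denominator, and the box PSF and support size are illustrative assumptions:

```python
import numpy as np

def pseudo_inverse_kernel(k, nsr=1e-4, size=63):
    """Sketch of the Wiener pseudo-inverse kernel k†.

    Uses the conjugate form conj(F(k)) / (|F(k)|^2 + nsr), where `nsr`
    plays the role of 1/SNR; `size` (the retained spatial support) is an
    illustrative choice."""
    K = np.fft.fft2(k, s=(size, size))
    K_dag = np.conj(K) / (np.abs(K) ** 2 + nsr)
    return np.fft.fftshift(np.real(np.fft.ifft2(K_dag)))  # centered k†

k = np.full((7, 7), 1.0 / 49)        # box PSF standing in for the disk kernel
k_dagger = pseudo_inverse_kernel(k)  # finite-support inverse kernel
```

Because the PSF integrates to one, k† sums to roughly one as well (its DC gain is 1/(1 + nsr)), and away from the center its entries decay quickly, matching the compact-support observation above.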
Given the inverse kernel k†, the decomposition k† = U S Vᵀ exists. We denote by u_j and v_j the jth columns of U and V, and by s_j the jth singular value. The original pseudo deconvolution can be expressed as\n\nk† ∗ y = Σ_j s_j · u_j ∗ (v_jᵀ ∗ y),   (3)\n\nwhich shows that 2D convolution can be viewed as a weighted sum of separable 1D filters. In practice, we can approximate k† well with a small number of separable filters by dropping the kernels associated with zero or very small s_j. In experiments with real blur kernels, ignoring singular values smaller than 0.01 yields about 30 separable kernels on average [25]. Using a smaller SNR, the inverse kernel has a smaller spatial support. We also found that an inverse kernel of length 100 is typically enough to generate visually plausible deconvolution results. This is important information for designing the network architecture.\n\n5.2 Image Deconvolution CNN (DCNN)\n\nWe describe our image deconvolution convolutional neural network (DCNN), based on the separable kernels. The network is expressed as\n\nh_3 = W_3 ∗ h_2;  h_l = σ(W_l ∗ h_{l−1} + b_{l−1}), l ∈ {1, 2};  h_0 = ŷ,\n\nwhere W_l is the weight mapping the (l−1)th layer to the lth one and b_{l−1} is the vector-valued bias. σ(·) is the nonlinear function, which can be a sigmoid or hyperbolic tangent.\n\nOur network contains two hidden layers, similar to the separable kernel inversion setting. The first hidden layer h_1 is generated by applying 38 large one-dimensional kernels of size 121 × 1, according to the analysis in Section 5.1. The values 38 and 121 are empirically determined and can be altered for different inputs. The second hidden layer h_2 is generated by applying 38 convolution kernels of size 1 × 121 to each of the 38 maps in h_1. 
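The separable decomposition of Eq. (3) and the truncation rule above can be sketched as follows; the 9 × 9 random kernel is merely a stand-in for a real inverse kernel:

```python
import numpy as np
from scipy.signal import fftconvolve

def separable_terms(k_dagger, tol=0.01):
    """Eq. (3) in code: SVD the 2D kernel and keep the 1D filter pairs
    whose singular values exceed `tol` (the 0.01 threshold in the text)."""
    U, s, Vt = np.linalg.svd(k_dagger)
    r = int(np.sum(s > tol))
    return U[:, :r], s[:r], Vt[:r, :]

def separable_conv(y, U, s, Vt):
    """Apply k† * y as the weighted sum sum_j s_j u_j * (v_j^T * y)."""
    out = np.zeros_like(y)
    for j in range(len(s)):
        horiz = fftconvolve(y, Vt[j:j + 1, :], mode="same")           # row filter v_j^T
        out += s[j] * fftconvolve(horiz, U[:, j:j + 1], mode="same")  # column filter u_j
    return out

rng = np.random.default_rng(0)
k_dag = rng.normal(size=(9, 9))   # stand-in for an inverse kernel
y = rng.random((32, 32))
U, s, Vt = separable_terms(k_dag, tol=0.0)   # tol=0 keeps every term
sep = separable_conv(y, U, s, Vt)
direct = fftconvolve(y, k_dag, mode="same")
```

With tol = 0 all terms are kept and the separable form reproduces direct 2D convolution exactly; raising the threshold to 0.01 drops the weak terms, as described above.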
To generate the result, a 1 × 1 × 38 kernel is applied, analogous to the linear combination using singular values s_j.\n\nThe architecture has several advantages for deconvolution. First, it assembles separable kernel inversion for deconvolution and can therefore perform at least as well as that baseline. Second, the nonlinear terms and high-dimensional structure make the network more expressive than the traditional pseudo-inverse and reasonably robust to outliers.\n\n5.3 Training DCNN\n\nThe network can be trained either with random weight initialization or with initialization from the separable kernel inversion, since they share the exact same structure.\n\nWe experiment with both strategies on natural images, all degraded by additive white Gaussian noise (AWG) and JPEG compression. These images fall into two categories: one with strong color saturation and one without. Note that saturation strongly affects many existing deconvolution algorithms.\n\nFigure 4: PSNRs produced in different stages of our convolutional neural network architecture.\n\n(a) Separable kernel inversion (b) Random initialization (c) Separable kernel initialization (d) ODCNN output\n\nFigure 5: Result comparisons in different stages of our deconvolution CNN.\n\nThe PSNRs are shown as the first three bars in Fig. 4. We make the following observations.\n\n• The trained network has an advantage over simply performing separable kernel inversion, whether initialized randomly or from the pseudo-inverse. Our interpretation is that the network, with its high-dimensional mapping and nonlinearity, is more expressive than simple separable kernel inversion.\n\n• The method with separable kernel inversion initialization yields higher PSNRs than that with random initialization, suggesting that initial values matter for this network and can be tuned.\n\nVisual comparison is provided in Fig. 
5(a)-(c), which show the results of separable kernel inversion, training with random weights, and training with separable kernel inversion initialization. The result in (c) clearly contains sharper edges and more detail. Note that the final trained DCNN is not equivalent to any existing inverse-kernel function, even with various regularization, due to the involved high-dimensional mapping with nonlinearities.\n\nThe performance of the deconvolution CNN decreases for images with color saturation. Visual artifacts can also arise from noise and compression. In the next section, we turn to a deeper structure to address these remaining problems by incorporating a denoise CNN module.\n\n5.4 Outlier-rejection Deconvolution CNN (ODCNN)\n\nOur complete network is formed as the concatenation of the deconvolution CNN module with a denoise CNN [16]. The overall structure is shown in Fig. 6. The denoise CNN module has two hidden layers with 512 feature maps each. The input image is convolved with 512 kernels of size 16 × 16 to be fed into the hidden layer.\n\nThe two network modules are concatenated in our system by connecting the last layer of the deconvolution CNN to the input of the denoise CNN. This is done by merging the 1 × 1 × 38 kernel with the 512 kernels of size 16 × 16 to generate 512 kernels of size 16 × 16 × 38. Note that there is no nonlinearity when combining the two modules. 
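Because the junction between the modules is linear, the kernel merge described above is exact, which the following sketch verifies on a single receptive field (all weights are random placeholders, not trained values):

```python
import numpy as np

# The DCNN's final 1x1x38 mixing kernel w and the denoise CNN's 512 input
# kernels f (16x16 each) fold into 512 kernels of size 16x16x38.
rng = np.random.default_rng(0)
w = rng.normal(size=38)                    # 1x1x38 combination weights
f = rng.normal(size=(512, 16, 16))         # 512 denoise-CNN kernels, 16x16
merged = np.einsum("c,mij->mijc", w, f)    # 512 kernels of size 16x16x38

# Equivalence on one 16x16 receptive field h2 with 38 channels:
# (mix channels with w, then correlate with f) == (correlate with merged).
h2 = rng.normal(size=(38, 16, 16))
mixed = np.einsum("c,cij->ij", w, h2)          # 1x1x38 output of the DCNN
path1 = np.einsum("mij,ij->m", f, mixed)       # fed into the denoise CNN
path2 = np.einsum("mijc,cij->m", merged, h2)   # single merged layer
```

The two paths agree up to floating-point rounding, so the merge changes only the parameterization, not the function computed.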
While the number of weights grows due to the merge, this allows for a flexible procedure and achieves decent performance when fine tuning is further incorporated.\n\nFigure 6: Our complete network architecture for deep deconvolution. The deconvolution sub-network applies 121 × 1 and 1 × 121 kernels (feature maps of size 64 × 184 × 38 and 64 × 64 × 38 for a 184 × 184 input); the outlier-rejection sub-network applies 16 × 16 × 38, 1 × 1 × 512, and 8 × 8 × 512 kernels (feature maps of size 49 × 49 × 512), producing a 56 × 56 restoration.\n\n5.5 Training ODCNN\n\nWe blur natural images for training, so it is easy to obtain a large amount of data. Specifically, we use 2,500 natural images downloaded from Flickr. Two million patches are randomly sampled from them. Concatenating the two network modules describes the full deconvolution process and enhances the ability to suppress unwanted structures. We train the sub-networks separately. The deconvolution CNN is trained using the initialization from separable inversion as described before. The output of the deconvolution CNN is then taken as the input of the denoise CNN.\n\nFine tuning is performed by feeding one hundred thousand 184 × 184 patches into the whole network. The training samples contain patches possibly with noise, saturation, and compression artifacts. The statistics after adding the denoise CNN are also plotted in Fig. 4. The outlier-rejection CNN after fine tuning improves the overall performance by up to 2dB, especially in saturated regions.\n\n6 More Discussions\n\nOur approach differs from previous ones in several ways. First, we identify the necessity of a relatively large kernel support for a convolutional neural network to deal with deconvolution. To avoid rapid weight-size expansion, we advocate the use of 1D kernels. Second, we propose supervised pre-training on the sub-network that corresponds to a reinterpretation of Wiener deconvolution. 
Third, we apply traditional deconvolution to network initialization, where generative solvers can guide neural network learning and significantly improve performance.\n\nFig. 6 shows that a new convolutional neural network architecture is capable of dealing with deconvolution. Without a good understanding of the functionality of each sub-network and without supervised pre-training, however, it is difficult to make the network work very well. Training the whole network with random initialization is less preferable because the training algorithm stops halfway without further energy reduction. The corresponding results are as blurry as the input images. To understand this, we visualize intermediate results from the deconvolution CNN sub-network, which generates 38 intermediate maps. The results are shown in Fig. 7, where (a) shows three selected maps obtained by random-initialization training and (b) shows the maps at the corresponding nodes from our better-initialized process. The maps in (a) look like the high-frequency part of the blurry input, indicating that random initialization is likely to generate high-pass filters. Without proper starting values, there is little chance of reaching the component maps shown in (b), where sharper edges are present, fully usable for further denoising and artifact removal.\n\nZeiler et al. [27] showed that sparsely regularized deconvolution can extract useful mid-level representations in their deconvolutional network. Our deconvolution CNN can be used to approximate this structure, unifying the process in a deeper convolutional neural network.\n\nFigure 7: Comparison of intermediate results from the deconvolution CNN. (a) Maps from random initialization. 
(b) More informative maps with our initialization scheme.\n\nkernel type | Krishnan [3] | Levin [7] | Cho [9] | Whyte [10] | Schuler [13] | Schmidt [4] | Ours\ndisk sat.   | 24.05dB | 24.44dB | 25.35dB | 24.47dB | 23.14dB | 24.01dB | 26.23dB\ndisk        | 25.94dB | 24.54dB | 23.97dB | 22.84dB | 24.67dB | 24.71dB | 26.01dB\nmotion sat. | 24.07dB | 23.58dB | 25.65dB | 25.54dB | 24.92dB | 25.33dB | 27.76dB\nmotion      | 25.07dB | 24.47dB | 24.29dB | 23.65dB | 25.27dB | 25.49dB | 27.92dB\n\nTable 1: Quantitative comparison on the evaluation image set.\n\n(a) Input (b) Levin et al. [7] (c) Krishnan et al. [3] (d) EPLL [11] (e) Cho et al. [9] (f) Whyte et al. [10] (g) Schuler et al. [13] (h) Ours\n\nFigure 8: Visual comparison of deconvolution results.\n\n7 Experiments and Conclusion\n\nWe have presented several deconvolution results. Here we show a quantitative evaluation of our method against state-of-the-art approaches, including sparse-prior deconvolution [7], the hyper-Laplacian prior method [3], variational EM for outliers [9], the saturation-aware approach [10], the learning-based approach [13], and the discriminative approach [4]. We compare performance using both disk and motion kernels. The average PSNRs are listed in Table 1. Fig. 8 shows a visual comparison. Our method achieves decent results quantitatively and visually. The implementation, as well as the dataset, is available at the project webpage.\n\nTo conclude, we have proposed a new deep convolutional network structure for the challenging image deconvolution task. Our main contribution is to let traditional deconvolution schemes guide neural networks and to approximate deconvolution by a series of convolution steps. Our system uses two novel modules corresponding to deconvolution and artifact removal. 
While the network is difficult to train as a whole, we adopt two supervised pre-training steps to initialize the sub-networks. High-quality deconvolution results bear out the effectiveness of this approach.\n\nReferences\n\n[1] Fergus, R., Singh, B., Hertzmann, A., Roweis, S.T., Freeman, W.T.: Removing camera shake from a single photograph. ACM Trans. Graph. 25(3) (2006)\n[2] Levin, A., Weiss, Y., Durand, F., Freeman, W.T.: Understanding and evaluating blind deconvolution algorithms. In: CVPR. (2009)\n[3] Krishnan, D., Fergus, R.: Fast image deconvolution using hyper-Laplacian priors. In: NIPS. (2009)\n[4] Schmidt, U., Rother, C., Nowozin, S., Jancsary, J., Roth, S.: Discriminative non-blind deblurring. In: CVPR. (2013)\n[5] Agrawal, A.K., Raskar, R.: Resolving objects at higher resolution from a single motion-blurred image. In: CVPR. (2007)\n[6] Michaeli, T., Irani, M.: Nonparametric blind super-resolution. In: ICCV. (2013)\n[7] Levin, A., Fergus, R., Durand, F., Freeman, W.T.: Image and depth from a conventional camera with a coded aperture. ACM Trans. Graph. 26(3) (2007)\n[8] Yuan, L., Sun, J., Quan, L., Shum, H.Y.: Progressive inter-scale and intra-scale non-blind image deconvolution. ACM Trans. Graph. 27(3) (2008)\n[9] Cho, S., Wang, J., Lee, S.: Handling outliers in non-blind image deconvolution. In: ICCV. (2011)\n[10] Whyte, O., Sivic, J., Zisserman, A.: Deblurring shaken and partially saturated images. In: ICCV Workshops. (2011)\n[11] Zoran, D., Weiss, Y.: From learning models of natural image patches to whole image restoration. In: ICCV. (2011)\n[12] Kenig, T., Kam, Z., Feuer, A.: Blind image deconvolution using machine learning for three-dimensional microscopy. IEEE Trans. Pattern Anal. Mach. Intell. 32(12) (2010)\n[13] Schuler, C.J., Burger, H.C., Harmeling, S., Schölkopf, B.: A machine learning approach for non-blind image deconvolution. In: CVPR. 
(2013)\n[14] Burger, H.C., Schuler, C.J., Harmeling, S.: Image denoising: Can plain neural networks compete with BM3D? In: CVPR. (2012)\n[15] Xie, J., Xu, L., Chen, E.: Image denoising and inpainting with deep neural networks. In: NIPS. (2012)\n[16] Eigen, D., Krishnan, D., Fergus, R.: Restoring an image taken through a window covered with dirt or rain. In: ICCV. (2013)\n[17] Richardson, W.: Bayesian-based iterative method of image restoration. Journal of the Optical Society of America 62(1) (1972)\n[18] Wiener, N.: Extrapolation, interpolation, and smoothing of stationary time series: with engineering applications. Journal of the American Statistical Association 47(258) (1949)\n[19] Roth, S., Black, M.J.: Fields of experts. International Journal of Computer Vision 82(2) (2009)\n[20] Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.O.: Image restoration by sparse 3D transform-domain collaborative filtering. In: Image Processing: Algorithms and Systems. (2008)\n[21] Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A.: Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research 11 (2010)\n[22] Agostinelli, F., Anderson, M.R., Lee, H.: Adaptive multi-column deep neural networks with application to robust image denoising. In: NIPS. (2013)\n[23] Jain, V., Seung, H.S.: Natural image denoising with convolutional networks. In: NIPS. (2008)\n[24] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11) (1998)\n[25] Xu, L., Tao, X., Jia, J.: Inverse kernels for fast spatial deconvolution. In: ECCV. (2014)\n[26] Perona, P.: Deformable kernels for early vision. IEEE Trans. Pattern Anal. Mach. Intell. 17(5) (1995)\n[27] Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. 
In: CVPR. (2010)\n", "award": [], "sourceid": 951, "authors": [{"given_name": "Li", "family_name": "Xu", "institution": "Lenovo IVCL"}, {"given_name": "Jimmy", "family_name": "Ren", "institution": "Lenovo Research"}, {"given_name": "Ce", "family_name": "Liu", "institution": "Microsoft Research"}, {"given_name": "Jiaya", "family_name": "Jia", "institution": "CUHK"}]}