{"title": "Scale Adaptive Blind Deblurring", "book": "Advances in Neural Information Processing Systems", "page_first": 3005, "page_last": 3013, "abstract": "The presence of noise and small scale structures usually leads to large kernel estimation errors in blind image deblurring empirically, if not a total failure. We present a scale space perspective on blind deblurring algorithms, and introduce a cascaded scale space formulation for blind deblurring. This new formulation suggests a natural approach robust to noise and small scale structures through tying the estimation across multiple scales and balancing the contributions of different scales automatically by learning from data. The proposed formulation also allows to handle non-uniform blur with a straightforward extension. Experiments are conducted on both benchmark dataset and real-world images to validate the effectiveness of the proposed method. One surprising finding based on our approach is that blur kernel estimation is not necessarily best at the finest scale.", "full_text": "Scale Adaptive Blind Deblurring\n\nHaichao Zhang\nDuke University, NC\n\nhczhang1@gmail.com\n\nJianchao Yang\n\nAdobe Research, CA\njiayang@adobe.com\n\nAbstract\n\nThe presence of noise and small scale structures usually leads to large kernel es-\ntimation errors in blind image deblurring empirically, if not a total failure. We\npresent a scale space perspective on blind deblurring algorithms, and introduce a\ncascaded scale space formulation for blind deblurring. This new formulation sug-\ngests a natural approach robust to noise and small scale structures through tying\nthe estimation across multiple scales and balancing the contributions of different\nscales automatically by learning from data. The proposed formulation also allows\nto handle non-uniform blur with a straightforward extension. Experiments are\nconducted on both benchmark dataset and real-world images to validate the effec-\ntiveness of the proposed method. 
One surprising finding based on our approach is that blur kernel estimation is not necessarily best at the finest scale.\n\n1 Introduction\n\nBlind deconvolution is an important inverse problem that has gained increasing attention from various fields, such as neural signal analysis [3, 10] and computational imaging [6, 8]. Although some results obtained in this paper are applicable to more general bilinear estimation problems, we will use blind image deblurring as an example. Image blur is an undesirable degradation that often accompanies the image formation process due to factors such as camera shake. Blind image deblurring aims to recover a sharp image from only one blurry observed image. While significant progress has been made recently [6, 16, 14, 2, 22, 11], most of the existing blind deblurring methods do not work well in the presence of noise, leading to inaccurate blur kernel estimation, a problem that has been observed in several recent works [17, 26]. Figure 1 shows an example where the kernel recovery quality of previous methods degrades significantly even though only 5% Gaussian noise is added to the blurry input. Moreover, it has been empirically observed that even for noise-free images, image structures with scale smaller than that of the blur kernel are actually harmful for kernel estimation [22]. Therefore, various structure selection techniques, such as hard/hysteresis gradient thresholding [2, 16], selective edge maps [22], and image decomposition [24], have been incorporated into kernel estimation.\n\nIn this paper, we propose a novel formulation for blind deblurring, which explains the conventional empirical coarse-to-fine estimation scheme and reveals some novel perspectives. Our new formulation not only encompasses the conventional multi-scale estimation scheme, but also achieves robust blind deblurring in a simple but principled way. 
Our model analysis leads to several interesting and perhaps surprising observations: (i) blur kernel estimation is not necessarily best at the finest image scale, and (ii) there is no universal single image scale that can be defined a priori to maximize the performance of blind deblurring.\n\nThe remainder of the paper is structured as follows. In Section 2, we conduct an analysis to motivate our proposed scale-adaptive blind deblurring approach. Section 3 presents the proposed approach, including a generalization to noise-robust kernel estimation as well as non-uniform blur estimation. We discuss the relationship of the proposed method to several previous methods in Section 4. Experiments are carried out in Section 5, and the results are compared with those of the state-of-the-art methods in the literature. Finally, we conclude the paper in Section 6.\n\n(a) Blurry & Noisy (b) Levin et al. [13] (c) Zhang et al. [25] (d) Zhong et al. [26] (e) Proposed\n\nFigure 1: Sensitivity of blind deblurring to image noise. Random Gaussian noise (5%) is added to the observed blurry image before kernel estimation. The deblurred images are obtained with the corresponding estimated blur kernels and the noise-free blurry image, to highlight the kernel estimation accuracy.\n\n2 Motivational Analysis\nFor uniform blur, the blurry image can be modeled as\n\ny = k ∗ x + n,  (1)\n\nwhere ∗ denotes 2D convolution,1 x is the unknown sharp image, y is the observed blurry image, k is the unknown blur kernel (a.k.a. the point spread function), and n is a zero-mean Gaussian noise term [6]. As mentioned above, most blind deblurring methods are sensitive to image noise and small scale structures [17, 26, 22]. Although these effects have been empirically observed [2, 22, 24, 17], we provide a complementary analysis in the following, which motivates our proposed approach later. 
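As a concrete illustration, the observation model (1) is a single 2D convolution plus additive noise. The following numpy/scipy sketch uses a placeholder image, a normalized box kernel, and a 5% noise level; none of these come from the paper's experiments.

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)

# Illustrative sharp image x and blur kernel k (placeholders, not the paper's data).
x = np.zeros((64, 64))
x[20:44, 20:44] = 1.0                       # a simple large-scale structure
k = np.ones((5, 5)) / 25.0                  # normalized box blur kernel

# Observation model (1): y = k * x + n, with zero-mean Gaussian noise n.
noise_level = 0.05
n = noise_level * rng.standard_normal(x.shape)
y = convolve2d(x, k, mode="same", boundary="symm") + n
```

Blind deblurring is the inverse problem: recover both k and x given only y.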
Our analysis is based on the following result:\n\nTheorem 1 (Point Source Recovery [1]) For a signal x containing point sources at different locations, if the minimum distance between sources is at least 2/fc, where fc denotes the cut-off frequency of the Gaussian kernel k, then x can be recovered exactly given k and the observed signal y in the noiseless case.\n\nAlthough Theorem 1 is stated in the noiseless and non-blind case with a parametric Gaussian kernel, it is still enlightening for analyzing the general blind deblurring case we are interested in. As sparsity of the image is typically exploited in the image derivative domain for blind deblurring, Theorem 1 implies that large image structures whose gradients are distributed far from each other are likely to be recovered more accurately, which in return benefits the kernel estimation. On the contrary, small image structures with gradients distributed near each other are likely to have larger recovery errors, and are thus harmful for kernel estimation. We refer to these small image structures as small scale structures in this paper.\n\nApart from the above recoverability analysis, Theorem 1 also suggests a straightforward approach to deal with noise and small scale structures by performing blur kernel estimation after smoothing the noisy (and blurry) image y with a low-pass filter fp with a proper cut-off frequency fc:\n\nyp = fp ∗ y ⇔ yp = fp ∗ k ∗ x + fp ∗ n ⇔ yp = kp ∗ x + np,  (2)\n\nwhere kp ≜ fp ∗ k and np ≜ fp ∗ n. As fp is a low-pass filter, the noise level of yp is reduced. Also, as the small scale structures correspond to signed spikes with small separation distance in the derivative domain, applying a local averaging will make them mostly cancel out [22]; therefore, noise and small scale structures can be effectively suppressed. 
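The chain of equivalences in (2) is simply associativity of convolution. A quick 1D numpy check, with arbitrary placeholder signal and filters:

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.standard_normal(128)         # latent sharp signal (placeholder)
k = rng.random(7); k /= k.sum()      # blur kernel
fp = rng.random(5); fp /= fp.sum()   # low-pass scale filter

# Filtering the blurry observation equals blurring with the smoothed kernel:
# fp * (k * x) == (fp * k) * x, i.e. yp = kp * x with kp = fp * k, as in (2).
yp_left = np.convolve(fp, np.convolve(k, x))
yp_right = np.convolve(np.convolve(fp, k), x)

assert np.allclose(yp_left, yp_right)
```

The identity holds exactly for full convolution (up to floating-point error), which is why knowledge of fp can be folded into the model rather than treated as a lossy pre-processing step.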
However, applying the low-pass filter will also smooth the large image structures besides the noise, and as a result it will alter the profile of the edges. As the salient large scale edge structures are the crucial information for blur kernel estimation, the low-pass filtering may lead to inaccurate kernel estimation. This is the inherent limitation of linear filtering for blind deblurring. To achieve noise reduction while retaining the latent edge structures, one may resort to non-linear filtering schemes, such as anisotropic diffusion [20], bilateral filtering [19], and sparse regression [5]. These approaches typically assume the absence of motion blur, and thus can cause over-sharpening of the edge structures and over-smoothing of image details when blur is present [17], resulting in a filtered image that is no longer linear with respect to the latent sharp image and making accurate kernel estimation even more difficult.\n\n1We also overload ∗ to denote the 2D convolution followed by lexicographic ordering based on the context.\n\n[Figure 2 panels: the true and recovered signal x, blur kernel k, and scale filter fp at scales 4 (coarsest) through 1 (finest).]\n\nFigure 2: Multi-Scale Blind Sparse Recovery. The signal structures of different scales are recovered at different scales: large scale structures are recovered first and small structures are recovered later. Top: original signal and blur kernel. Bottom: the recovered signal and blur kernel progressively across different scales (scale-4 to scale-1 represent the coarsest to the finest (original) scale). The blur kernel at the i-th scale is initialized with the solution from the (i−1)-th scale.\n\n3 The Proposed Approach\n\nTo facilitate subsequent analysis, we first introduce the definition of scale space [15, 4]:\n\nDefinition 1 For an image x, its scale-space representation corresponding to a Gaussian filter Gs is defined by the convolution Gs ∗ x, where the variance s is referred to as the scale parameter.\n\nWithout loss of clarity, we also refer to the different scale levels as different scale spaces in the sequel.\n\nNatural images have a multi-scale property, meaning that different scale levels reveal different scales of image structures. According to Theorem 1, different scale spaces may play different roles for kernel estimation, due to the different recoverability of the signal components in the corresponding scale spaces. We propose a new framework for blind deblurring by introducing a variable scale filter, which defines the scale space in which the blind estimation process operates. With the scale filter, it is straightforward to come up with a blur estimation procedure similar to the conventional coarse-to-fine estimation by constructing an image pyramid. 
However, we operate deblurring in a space with the same spatial resolution as the original image, rather than in a downscaled space as conventionally done. It therefore avoids the additional estimation error caused by interpolation between spatial scales in the pyramid. To mitigate the problem of structure smoothing, we incorporate the knowledge about the filter into the deblurring model, which differs from using filtering simply as a pre-processing step. More importantly, we can formulate the deblurring problem in multiple scale spaces in this way, and learn the contribution of each scale space adaptively for each input image.\n\n3.1 Scale-Space Blind Deblurring Model\n\nOur task is to recover k and x from the filtered observation yp, obtained via (2) with a known scale filter fp. The model is derived in the derivative domain, and we use x ∈ R^m and yp ∈ R^n to denote the lexicographically ordered sharp and (filtered) blurry image derivatives, respectively.2 The final deblurred image is recovered via a non-blind deblurring step with the estimated blur kernel [26]. From the modified observation model (2), we obtain the following likelihood:\n\np(yp | x, k, λ) ∝ exp( −‖fp ∗ y − fp ∗ k ∗ x‖_2^2 / (2λ) ) = exp( −‖yp − kp ∗ x‖_2^2 / (2λ) ),  (3)\n\nwhere λ is the variance of the Gaussian noise. Maximum likelihood estimation using (3) is ill-posed, and further regularization over the unknowns is required. We use a parameterized Gaussian prior for x, p(x) = ∏_i p(x_i) ∝ ∏_i N(x_i; 0, γ_i), where the unknown scale variables γ = [γ_1, γ_2, · · · ] are closely related to the sparsity of x and will be estimated jointly with the other variables. 
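Under the likelihood (3) and this Gaussian prior, marginalizing out x gives yp ∼ N(0, λI + HpΓHp^T), the covariance used in the type-II objective (4). The sketch below builds a toy 1D convolution matrix and evaluates the resulting Gaussian cost, cross-checked against scipy; the sizes, kernel, and Γ are illustrative placeholders.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)

m, lam = 12, 0.1                        # toy signal length and noise variance
kp = np.array([0.25, 0.5, 0.25])        # filtered kernel kp (placeholder)

# Convolution matrix Hp of kp (full 1D convolution, n = m + len(kp) - 1 rows).
n = m + len(kp) - 1
Hp = toeplitz(np.r_[kp, np.zeros(n - len(kp))], np.r_[kp[0], np.zeros(m - 1)])

gamma = rng.random(m) + 0.1             # per-coefficient prior variances
Sigma_p = lam * np.eye(n) + Hp @ np.diag(gamma) @ Hp.T

yp = rng.standard_normal(n)
# Type-II cost of (4), up to an additive constant: yp' Sigma_p^{-1} yp + log|Sigma_p|.
cost = yp @ np.linalg.solve(Sigma_p, yp) + np.linalg.slogdet(Sigma_p)[1]

# Cross-check against scipy's Gaussian log-density.
ref = -2 * multivariate_normal(np.zeros(n), Sigma_p).logpdf(yp) - n * np.log(2 * np.pi)
assert np.isclose(cost, ref)
```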
Rather than computing the Maximum A Posteriori (MAP) solution, which typically requires empirical tricks to achieve success [16, 2], we use type-II maximum likelihood estimation following [13, 21, 25], marginalizing over the latent image and maximizing over the other unknowns:\n\nmax_{γ,k,λ≥0} ∫ p(yp | x, k, λ) p(x) dx  ≡  min_{γ,k,λ≥0}  yp^T Σp^{−1} yp + log |Σp|,  (4)\n\nwhere Σp ≜ λI + Hp Γ Hp^T, Hp is the convolution matrix of kp, and Γ ≜ diag[γ]. Using standard linear algebra techniques together with an upper bound over Σp,3 we can reformulate (4) as [21]\n\nmin_{λ,k≥0,x}  (1/λ) ‖fp ∗ y − fp ∗ k ∗ x‖_2^2 + rp(x, k, λ) + (n − m) log λ,  with  rp(x, k, λ) ≜ ∑_i min_{γ_i} [ x_i^2/γ_i + log(λ + γ_i ‖kp‖_2^2) ],  (5)\n\nwhich now resembles a typical regularized-regression formulation for blind deblurring when eliminating fp. The proposed objective function has one interesting property, stated in the following.\n\nTheorem 2 (Scale Space Blind Deblurring) Taking fp as a Gaussian filter, solving (5) essentially achieves estimation of x and k in the scale space defined by fp, given y in the original space.\n\nIn essence, Theorem 2 reveals the equivalence between performing blind deblurring on y directly while constraining x and k to a certain scale space and solving the proposed model (5) with the aid of the additional filter fp. This places the proposed model (5) on a sound theoretical footing.\n\n2The derivative filters used in this work are {[−1, 1], [−1, 1]^T}.\n\nCascaded Scale-Space Blind Deblurring. 
If the blur kernel k has a clear cut-off frequency and the target signal contains structures at distinct scales, then we can suppress the structures with scale smaller than that of k using a properly designed scale filter fp according to Theorem 1, and then solve (5) for kernel estimation. In practice, however, blur kernels are typically non-parametric with complex forms, and therefore do not have a clear cut-off frequency. Moreover, natural images have a multi-scale property, meaning that different scale spaces reveal different image structures. All these facts suggest that it is not easy to select a fixed scale filter fp a priori, and call for a variable scale filter. Nevertheless, based on the basic point that large scale structures are more advantageous than small scale structures for kernel estimation, a natural idea is to perform (5) separately at different scales, and pick the best estimation as the output. While this is an appealing idea, it is not applicable in practice due to the unavailability of the ground truth, which would be required for evaluating the estimation quality. A more practical approach is to perform (5) in a cascaded way, starting the estimation from a large scale and then reducing the scale for the next cascade. The kernel estimation from the previous scale is used as the starting point for the next one. With this scheme, the blur kernel is refined along with the resolution of the scale space, and may become accurate enough before reaching the finest resolution level, as shown in Figure 2 for a 1D example. The latent sparse signal in this example contains 4 point sources, with a minimum separation distance of 2, which is smaller than the support of the blur kernel. It is observed that some large elements of the blur kernel are recovered first and the smaller ones appear later, at a smaller scale. 
It can also be noticed that the kernel estimation is already fairly accurate before reaching the finest scale (i.e., the original pixel-level representation). In this case, the final estimation at the last scale is fairly stable given the initialization from the preceding scale. However, performing blind deblurring by solving (5) in the original scale directly (i.e., fp ≡ δ) cannot achieve successful kernel estimation (results not shown).\n\nA similar strategy of constructing an image pyramid has been applied successfully in many recent deblurring methods [6, 16, 2, 22, 8, 25]. It is important to emphasize that the main purpose of our scale-space perspective is more to provide complementary analysis and understanding of the empirical coarse-to-fine approach in blind deblurring algorithms than to replace it. More discussions on this point are provided in Section 4. Nevertheless, the proposed alternative approach can achieve performance on par with state-of-the-art methods, as shown in Figure 4. More importantly, this alternative formulation offers us a number of extra dimensions for generalization, such as extensions to noise-robust kernel estimation and scale-adaptive estimation, as shown in the next section.\n\n3.2 Scale-Adaptive Deblurring via Tied Scale-Space Estimation\n\nIn the above cascade procedure, a single filter fp is used at each step in a greedy way. Instead, we can define a set of scale filters P ≜ {fp}, p = 1, . . . , P, apply each of them to the observed image y to get a set of filtered observations {yp}, and then tie the estimation across all scales with the shared latent sharp image x. By constructing P as a set of Gaussian filters with decreasing radius, this is equivalent to performing blind deblurring in different scale spaces. 
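The tied multi-scale construction can be sketched directly: apply a bank of Gaussian filters of decreasing radius to one observation and keep all filtered copies. The observation and radii below are placeholders; the paper's experiments sample radii over a specified range.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(3)
y = rng.standard_normal((64, 64))      # blurry (and noisy) observation, placeholder

# A set of Gaussian scale filters with decreasing radius, applied to the same
# observation: each filtered copy y_p lives in a different scale space of y.
sigmas = [4.0, 2.0, 1.0, 0.5]          # illustrative radii
y_scales = [gaussian_filter(y, sigma) for sigma in sigmas]

# Coarser scales suppress noise and small structures: the horizontal gradient
# energy grows monotonically as the filter radius shrinks.
energy = [float(np.sum(np.diff(yp, axis=1) ** 2)) for yp in y_scales]
assert energy == sorted(energy)
```

All copies keep the original spatial resolution, so no inter-scale interpolation is needed when tying them through a shared latent image x.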
3log |Σp| ≤ ∑_i log(λ + γ_i ‖kp‖_2^2) + (n − m) log λ [25].\n\n[Figure 3 heat-maps: contribution weights over filtering radius and iterations, without and with additive noise.]\n\nFigure 3: Scale Adaptive Contribution Learning for a set of 25 Gaussian filters with radius r ∈ (0, 5] on the first image of [14]. Left: without adding noise. Right: with 5% additive noise. The values in the heat-map represent the contribution weight (λp^{−1}) for each scale filter during the iterations. The table on the right shows the performance (SSD error) of blind deblurring with different scales: original scale (org.scale), empirically optimal scale (opt.scale), multiple scales with uniform contribution weights (uni.scale), and multiple scales with adaptive weights (adaptive). SSD errors: w/o noise, org.scale 101.9, opt.scale 43.8, uni.scale 39.4, adaptive 36.7; with 5% noise, org.scale 316.3, opt.scale 63.2, uni.scale 77.6, adaptive 46.4.\n\nLarge scale space is more robust to image noise, and thus is more effective in stabilizing the estimation; however, only large scale structures are “visible” (recoverable) in this space. 
Small scale space offers the potential to recover more fine details, but is less robust to image noise. By conducting deblurring in multiple scale spaces simultaneously, we can exploit the complementary properties of different scales for robust blind deblurring in a unified framework. Furthermore, as different scales may contribute differently to the kernel estimation, we use a distinct noise level parameter λp for each scale, which reflects the relative contribution of that scale to the estimation. Concretely, the final cost function is obtained by accumulating the cost function (5) over all P filtered observations with adaptive noise parameters:4\n\nmin_{{λp},k≥0,x}  ∑_{p=1}^P (1/λp) ‖fp ∗ y − fp ∗ k ∗ x‖_2^2 + R(x, k, {λp}) + (n − m) ∑_p log λp,  where  R(x, k, {λp}) = ∑_p rp(x, k, λp) = ∑_{p,i} min_{γ_i} [ x_i^2/γ_i + log(λp + γ_i ‖kp‖_2^2) ].  (6)\n\nThe function R here is in effect a penalty term that exploits the multi-scale regularity/consistency of the solution space. The effectiveness of the proposed approach compared to other methods is illustrated in Figure 1, and more results are provided in Section 5. Formulating the deblurring problem as (6), our joint estimation framework enjoys a number of features that are particularly appropriate for blind deblurring in the presence of noise and small scale image structures: (i) it exploits both the regularization of sharing the latent sharp image x across all filtered observations and the knowledge about the set of filters {fp}. 
In this way, k is recovered directly, without the post-processing required in previous work [26]; (ii) the proposed approach can be extended to handle non-uniform blur, as discussed in Section 3.3; and (iii) there are no inherent limitations on the form of the filters we can use besides Gaussian filters; e.g., we can also use directional filters as in [26].\n\nScale Adaptiveness. With this cost function, the contribution of each filtered observation yp constructed by fp is reflected by the weight λp^{−1}. The parameters {λp^{−1}} are initialized uniformly across all filters and are then learned automatically during the kernel estimation process. In this scenario, a smaller estimated noise level indicates a larger contribution to the estimation. It is natural to expect that the distribution of the contribution weights for the same set of filters will change under different input noise levels, as shown in Figure 3. From the figure, we obtain a number of interesting observations:\n•The proposed algorithm is adaptive to observations with different noise levels. As we can see, filters with smaller radius contribute more in the noise-free case, while in the noisy case, filters with larger radius contribute more.\n•The distribution of the contribution weights evolves during the iterative estimation process. For example, in the noise-free case, starting from uniform weights, the middle-scale filters contribute the most at the beginning of the iterations, while smaller-scale filters contribute more to the estimation later on, a natural coarse-to-fine behavior. Similar trends can also be observed in the noisy case.
4This can be achieved either in an online fashion or in one shot.\n\n[Figure 4 panels: (a) bar plot of estimation errors for Fergus, Shan, Cho, Levin, Zhang, and the proposed method; (b) the 8 blur kernels; (c) the 4 images.]\n\nFigure 4: Blind Deblurring Results: Noise-free Case. (a) Performance comparison (image estimation error) on the benchmark dataset [14], which contains (b) 8 blur kernels and (c) 4 images.\n\n•While it is expected that the original scale space is not the “optimal” scale for kernel estimation in the presence of noise, it is somewhat surprising to find that this is also true in the noise-free case. This corroborates previous findings that small scale structures are harmful to kernel estimation [22]; our algorithm automatically learns the scale space to suppress the effects of small scale structures.\n•The weight distribution is flatter in the noise-free case, while it is more peaked in the noisy case.\n\nFigure 3 is obtained with the first kernel and image in Figure 4. Similar properties can be observed for different images/blurs, although the position of the empirical mode is unlikely to be the same.\n\nThe table in Figure 3 shows the estimation error using different scale space configurations. Blind deblurring in the original space directly (org.scale) fails, as indicated by the large estimation error. However, when setting the filter as fo, whose contribution λo^{−1} is empirically the largest among all filters (opt.scale), the performance is much better than in the original scale directly, with the estimation error reduced significantly. 
The proposed method, by tying multiple scales together and learning adaptive contribution weights (adaptive), performs the best across all the configurations, especially in the noisy case.\n\n3.3 Non-Uniform Blur Extension\n\nThe extension of the uniform blind deblurring model proposed above to the non-uniform blur case is achieved by using a generalized observation model [18, 9], representing the blurry image as the summation of differently transformed versions of the latent sharp image: y = Hx + n = ∑_j wj Pj x + n = Dw + n. Here Pj is the j-th projection or homography operator (a combination of rotations and translations) and wj is the corresponding combination weight, representing the proportion of time spent at that particular camera pose during exposure. D = [P1 x, P2 x, · · · , Pj x, · · · ] denotes the dictionary constructed by projectively transforming x using a set of transformation operators, and w ≜ [w1, w2, · · · ]^T denotes the combination weights of the blurry image over the dictionary. The uniform convolutional model (1) is obtained by restricting {Pj} to translations only. With derivations similar to those in Section 3.1, it can be shown that the cost function for the general non-uniform blur case is\n\nmin_{{λp},w≥0,x}  ∑_{p=1}^P (1/λp) ‖yp − Hp x‖_2^2 + ∑_{p,i} min_{γ_i} [ x_i^2/γ_i + log(λp + γ_i ‖hip‖_2^2) ] + (n − m) ∑_p log λp,  (7)\n\nwhere Hp ≜ Fp ∑_j wj Pj is the compound operator incorporating both the additional filter and the non-uniform blur, Fp is the convolutional matrix form of fp, and hip denotes the effective compound local kernel at site i in the image plane, constructed with w and the set of transformation operators.\n\n4 Discussions\nWe discuss the relationship of the proposed approach with several recent methods to help further understand its properties.\n\nImage Pyramid based Blur Kernel Estimation. Since the blind deblurring work of Fergus et al. [6], the image pyramid has been widely used as a standard architecture for blind deblurring [16, 2, 8, 22, 13, 25]. The image pyramid is constructed by resizing the observed image with a fixed ratio multiple times, until reaching a scale where the corresponding kernel is very small, e.g. 3 × 3. The blur kernel is then estimated first from the smallest image and upscaled to initialize the next level. This process is repeated until the last level is reached. While it is effective for exploiting the solution space, this greedy pyramid construction does not provide an effective way to handle image noise. Our formulation not only retains properties similar to the pyramid coarse-to-fine estimation, but also offers the extra flexibility to achieve scale-adaptive estimation, which is robust to noise and small scale structures.\n\n[Figure 5 panels: bar plots of image estimation quality for Zhong and the proposed method, and error curves versus noise level.]\n\nFigure 5: Deblurring results in the presence of noise on the benchmark dataset [14]. Performance averaged over (a) different images and (b) different kernels, with 5% additive Gaussian noise. (c) Comparison of the proposed method with Levin et al. [13], Zhang et al. [25], and Zhong et al. [26] on the first image with the first kernel, under different noise levels.\n\nNoise-Robust Blind Deblurring [17, 26]. Based on the observation that using denoising as a pre-processing step can help with blur kernel estimation in the presence of noise, Tai et al. [17] proposed to perform denoising and kernel estimation alternately, incorporating an additional image penalty function designed specially to take the blur kernel into account [17]. This approach uses separate penalty terms and introduces additional balancing parameters. Our proposed model, on the contrary, has a coupled penalty function and learns the balancing parameters from the data. Moreover, the proposed model can be generalized to non-uniform blur in a straightforward way. Another recent method [26] performs blind kernel estimation on images filtered with different directional filters separately, and then reconstructs the final kernel in a second step via the inverse Radon transform [26]. This approach is only applicable to uniform blur and directional auxiliary filters. Moreover, it treats each filtered observation independently and thus may introduce additional errors in the second, kernel reconstruction step, due to factors such as misalignment between the estimated compound kernels.\n\nSmall Scale Structures in Blur Kernel Estimation [22, 2]. 
Based on the observation that small scale structures are harmful for kernel estimation, Xu and Jia [22] designed an empirical approach for structure selection based on gradient magnitudes. Structure selection has also been incorporated into blind deblurring in various other forms, such as gradient thresholding [2, 16]. However, it is hard to determine a universal threshold for different images and kernels. Other techniques such as image decomposition have also been incorporated [24], where the observed blurry image is decomposed into structure and texture layers. However, standard image decomposition techniques do not consider image blur, and thus might not work well in the presence of blur. Another issue for this approach is again the selection of the parameter for separating texture from structure, which is image dependent in general. The proposed method achieves robustness to small scale structures by optimizing the scale contribution weights jointly with blind deblurring, in an image adaptive way.\n\nThe optimization techniques used in this paper have been used before for image deblurring [13, 21, 25], in different contexts and with different motivations.\n\n5 Experimental Results\nWe perform extensive experiments in this section to evaluate the performance of the proposed method compared with several state-of-the-art blind deblurring methods, including the two recent noise-robust deblurring methods of Tai et al. [17] and Zhong et al. [26], as well as the non-uniform deblurring method of Xu et al. [23]. We construct {fp} as Gaussian filters, with the radius uniformly sampled over a specified range, typically set as [0.1, 3] in the experiments.5 The number of iterations is used as the stopping criterion and is fixed as 15 in practice.\n\nEvaluation using the Benchmark Dataset of Levin et al. [14]. We first perform evaluation on the benchmark dataset of Levin et al. 
[14], containing 4 images and 8 blur kernels, leading to 32 blurry images in total (see Figure 4). Performance for the noise-free case is reported in Figure 4, where the proposed approach performs on par with the state of the art.

^5 The number of filters P should be large enough to characterize the scale space. We typically set P = 7.

Figure 6: Deblurring results on images with non-uniform blur (Kyoto, Building, and Elephant), compared with Tai et al. [17], Zhong et al. [26], and Xu et al. [23]. Full images are shown in the supplementary file.

To evaluate the performance of different methods in the presence of noise, we add i.i.d. Gaussian noise to the blurry images and then perform kernel estimation. The estimated kernels are used for non-blind deblurring [12] on the noise-free blurry images. The bar plots in Figure 5 show the sum-of-squared-difference (SSD) error of the deblurred images using the proposed method and the method of Zhong et al. [26] when the noise level is 5%. As the same non-blind deblurring method is used, this SSD error reflects the quality of the kernel estimation. The proposed method clearly performs better than the method of Zhong et al. [26] overall. We also show the results of different methods with increasing noise levels in Figure 5. While conventional methods (e.g., Levin et al. [13], Zhang et al. [25]) perform well when the noise level is low, their performance degrades rapidly as the noise level increases. The method of Zhong et al. [26] performs more robustly across different noise levels, but does not perform as well as the other methods when the noise level is very low. This might be caused by the loss of information during its two-step process.
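The experimental setup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes image intensities in [0, 1] and interprets the sampled filter radius as the Gaussian standard deviation (neither detail is fixed by the paper); kernel estimation and the non-blind deblurring step [12] are treated as black boxes.

```python
import numpy as np

def gaussian_filter_bank(P=7, r_min=0.1, r_max=3.0, seed=0):
    """Build P Gaussian filters {f_p} with radii drawn uniformly from
    [r_min, r_max]. The radius is treated as the standard deviation
    (an assumption), and each filter is normalized to sum to 1."""
    rng = np.random.default_rng(seed)
    filters = []
    for r in rng.uniform(r_min, r_max, size=P):
        half = int(np.ceil(3 * r))            # 3-sigma support
        x = np.arange(-half, half + 1)
        g = np.exp(-x ** 2 / (2 * r ** 2))    # 1-D Gaussian profile
        f = np.outer(g, g)                    # separable 2-D filter
        filters.append(f / f.sum())
    return filters

def add_gaussian_noise(blurry, level=0.05, seed=0):
    """Add i.i.d. Gaussian noise at a fraction `level` (e.g. 5%) of
    the assumed [0, 1] intensity range."""
    rng = np.random.default_rng(seed)
    return blurry + level * rng.standard_normal(blurry.shape)

def ssd_error(deblurred, ground_truth):
    """Sum-of-squared-difference (SSD) error; with a fixed non-blind
    deblurring step, this reflects kernel estimation quality."""
    return float(np.sum((deblurred - ground_truth) ** 2))
```

In the benchmark evaluation, noise is added only for kernel estimation, while the final non-blind deblurring is run on the noise-free blurry images, so the SSD isolates the effect of kernel accuracy.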
The proposed method outperforms the other methods at all noise levels, demonstrating its effectiveness.

Deblurring on Real-World Images. We further evaluate the performance of the proposed method on real-world images from the literature [17, 7, 8]. The results are shown in Figure 6. For the Kyoto image from [17], the deblurred image of Tai et al. [17] has some ringing artifacts, while the result of Zhong et al. [26] has ghosting effects due to inaccurate kernel estimation. The deblurred image from the proposed method has neither ghosting nor strong ringing artifacts. For the other two test images, the non-uniform deblurring method [23] produces deblurred images that are still very blurry, as its kernel estimates are close to a delta kernel for both images, due to the presence of noise. The method of Zhong et al. [26] can only handle uniform blur, and its deblurred images have strong ringing artifacts. The proposed method estimates the non-uniform blur accurately and produces higher-quality deblurring results than the other methods.

6 Conclusion

We present an analysis of blind deblurring approaches from the scale-space perspective. The analysis not only helps in understanding several empirical techniques widely used in the blind deblurring literature, but also inspires new extensions. Extensive experiments on a benchmark dataset as well as real-world images verify the effectiveness of the proposed method. For future work, we would like to investigate extensions of the proposed approach in several directions, such as blind image denoising and multi-scale dictionary learning. Learning the auxiliary filters in a blur- and image-adaptive fashion is another interesting direction for future research.

Acknowledgement The research was supported in part by Adobe Systems.

References

[1] E. J. Candès and C. Fernandez-Granda.
Towards a mathematical theory of super-resolution. CoRR, abs/1203.5871, 2012.

[2] S. Cho and S. Lee. Fast motion deblurring. In SIGGRAPH ASIA, 2009.

[3] C. Ekanadham, D. Tranchina, and E. P. Simoncelli. A blind sparse deconvolution method for neural spike identification. In NIPS, 2011.

[4] J. H. Elder and S. W. Zucker. Local scale control for edge detection and blur estimation. IEEE Trans. Pattern Anal. Mach. Intell., 20(7):699–716, 1998.

[5] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski. Edge-preserving decompositions for multi-scale tone and detail manipulation. In SIGGRAPH, 2008.

[6] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. T. Freeman. Removing camera shake from a single photograph. In SIGGRAPH, 2006.

[7] A. Gupta, N. Joshi, C. L. Zitnick, M. Cohen, and B. Curless. Single image deblurring using motion density functions. In ECCV, 2010.

[8] S. Harmeling, M. Hirsch, and B. Schölkopf. Space-variant single-image blind deconvolution for removing camera shake. In NIPS, 2010.

[9] M. Hirsch, C. J. Schuler, S. Harmeling, and B. Schölkopf. Fast removal of non-uniform camera shake. In ICCV, 2011.

[10] Y. Karklin and E. P. Simoncelli. Efficient coding of natural images with a population of noisy linear-nonlinear neurons. In NIPS, 2011.

[11] D. Krishnan, T. Tay, and R. Fergus. Blind deconvolution using a normalized sparsity measure. In CVPR, 2011.

[12] A. Levin, R. Fergus, F. Durand, and W. T. Freeman. Deconvolution using natural image priors. Technical report, MIT, 2007.

[13] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman. Efficient marginal likelihood optimization in blind deconvolution. In CVPR, 2011.

[14] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman. Understanding blind deconvolution algorithms. IEEE Trans. Pattern Anal. Mach. Intell., 33(12):2354–2367, 2011.

[15] T. Lindeberg and B. M. H. Romeny. Linear scale-space: I.
Basic theory, II. Early visual operations. In Geometry-Driven Diffusion in Computer Vision, 1994.

[16] Q. Shan, J. Jia, and A. Agarwala. High-quality motion deblurring from a single image. In SIGGRAPH, 2008.

[17] Y.-W. Tai and S. Lin. Motion-aware noise filtering for deblurring of noisy and blurry images. In CVPR, pages 17–24, 2012.

[18] Y.-W. Tai, P. Tan, and M. S. Brown. Richardson-Lucy deblurring for scenes under a projective motion path. IEEE Trans. Pattern Anal. Mach. Intell., 33(8):1603–1618, 2011.

[19] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In ICCV, 1998.

[20] D. Tschumperlé and R. Deriche. Vector-valued image regularization with PDEs: A common framework for different applications. IEEE Trans. Pattern Anal. Mach. Intell., 27(4):506–517, 2005.

[21] D. P. Wipf and H. Zhang. Revisiting Bayesian blind deconvolution. CoRR, abs/1305.2362, 2013.

[22] L. Xu and J. Jia. Two-phase kernel estimation for robust motion deblurring. In ECCV, 2010.

[23] L. Xu, S. Zheng, and J. Jia. Unnatural L0 sparse representation for natural image deblurring. In CVPR, 2013.

[24] Y. Xu, X. Hu, L. Wang, and S. Peng. Single image blind deblurring with image decomposition. In ICASSP, 2012.

[25] H. Zhang and D. Wipf. Non-uniform camera shake removal using a spatially adaptive sparse penalty. In NIPS, 2013.

[26] L. Zhong, S. Cho, D. Metaxas, S. Paris, and J. Wang. Handling noise in single image deblurring using directional filters. In CVPR, 2013.