{"title": "Fast Image Deconvolution using Hyper-Laplacian Priors", "book": "Advances in Neural Information Processing Systems", "page_first": 1033, "page_last": 1041, "abstract": "The heavy-tailed distribution of gradients in natural scenes have proven effective priors for a range of problems such as denoising, deblurring and super-resolution. However, the use of sparse distributions makes the problem non-convex and impractically slow to solve for multi-megapixel images. In this paper we describe a deconvolution approach that is several orders of magnitude faster than existing techniques that use hyper-Laplacian priors. We adopt an alternating minimization scheme where one of the two phases is a non-convex problem that is separable over pixels. This per-pixel sub-problem may be solved with a lookup table (LUT). Alternatively, for two specific values of \u03b1, 1/2 and 2/3 an analytic solution can be found, by finding the roots of a cubic and quartic polynomial, respectively. Our approach (using either LUTs or analytic formulae) is able to deconvolve a 1 megapixel image in less than \u223c3 seconds, achieving comparable quality to existing methods such as iteratively reweighted least squares (IRLS) that take \u223c20 minutes. Furthermore, our method is quite general and can easily be extended to related image processing problems, beyond the deconvolution application demonstrated.", "full_text": "Fast Image Deconvolution\n\nusing Hyper-Laplacian Priors\n\nDilip Krishnan,\n\nDept. of Computer Science,\n\nCourant Institute,\n\nNew York University\ndilip@cs.nyu.edu\n\nRob Fergus,\n\nDept. 
of Computer Science,\n\nCourant Institute,\n\nNew York University\n\nfergus@cs.nyu.edu\n\nAbstract\n\nThe heavy-tailed distribution of gradients in natural scenes has proven to be an effective\nprior for a range of problems such as denoising, deblurring and super-resolution.\nSuch distributions are well modeled by a hyper-Laplacian (p(x) \u221d e^{\u2212k|x|^\u03b1}), typically\nwith 0.5 \u2264 \u03b1 \u2264 0.8. However, the use of sparse distributions makes the\nproblem non-convex and impractically slow to solve for multi-megapixel images.\nIn this paper we describe a deconvolution approach that is several orders of magnitude\nfaster than existing techniques that use hyper-Laplacian priors. We adopt\nan alternating minimization scheme where one of the two phases is a non-convex\nproblem that is separable over pixels. This per-pixel sub-problem may be solved\nwith a lookup table (LUT). Alternatively, for two specific values of \u03b1, 1/2 and 2/3,\nan analytic solution can be found by finding the roots of a cubic and quartic polynomial,\nrespectively. Our approach (using either LUTs or analytic formulae) is\nable to deconvolve a 1 megapixel image in less than \u223c3 seconds, achieving comparable\nquality to existing methods such as iteratively reweighted least squares\n(IRLS) that take \u223c20 minutes. Furthermore, our method is quite general and can\neasily be extended to related image processing problems, beyond the deconvolution\napplication demonstrated.\n\n1 Introduction\n\nNatural image statistics are a powerful tool in image processing, computer vision and computational\nphotography. Denoising [14], deblurring [3], transparency separation [11] and super-resolution [20]\nare all tasks that are inherently ill-posed. Priors based on natural image statistics can regularize these\nproblems to yield high-quality results. However, digital cameras now have sensors that record images\nwith tens of megapixels (MP), e.g. 
the latest Canon DSLRs have over 20MP. Solving the above\ntasks for such images in a reasonable time frame (i.e. a few minutes or less) poses a severe challenge\nto existing algorithms. In this paper we focus on one particular problem: non-blind deconvolution,\nand propose an algorithm that is practical for very large images while still yielding high quality\nresults.\n\nNumerous deconvolution approaches exist, varying greatly in their speed and sophistication. Simple\nfiltering operations are very fast but typically yield poor results. Most of the best-performing\napproaches solve globally for the corrected image, encouraging the marginal statistics of a set of filter\noutputs to match those of uncorrupted images, which act as a prior to regularize the problem. For\nthese methods, a trade-off exists between accurately modeling the image statistics and being able to\nsolve the ensuing optimization problem efficiently. If the marginal distributions are assumed to be\nGaussian, a closed-form solution exists in the frequency domain and FFTs can be used to recover the\nimage very quickly. However, real-world images typically have marginals that are non-Gaussian, as\nshown in Fig. 1, and thus the output is often of mediocre quality. A common approach is to assume\nthe marginals have a Laplacian distribution. This allows a number of fast \u21131 and related TV-norm\nmethods [17, 22] to be deployed, which give good results in a reasonable time. However, studies\n\n[Figure 1 plot: log2 probability versus gradient (\u2212100 to 100), with curves labeled Empirical, Gaussian (\u03b1=2), Laplacian (\u03b1=1) and Hyper-Laplacian (\u03b1=0.66)]\n\nFigure 1: A hyper-Laplacian with exponent \u03b1 = 2/3 is a better model of image gradients than\na Laplacian or a Gaussian. Left: A typical real-world scene. 
Right: The empirical distribution\nof gradients in the scene (blue), along with a Gaussian \ufb01t (cyan), a Laplacian \ufb01t (red) and a hyper-\nLaplacian with \u03b1 = 2/3 (green). Note that the hyper-Laplacian \ufb01ts the empirical distribution closely,\nparticularly in the tails.\n\nof real-world images have shown the marginal distributions have signi\ufb01cantly heavier tails than a\nLaplacian, being well modeled by a hyper-Laplacian [4, 10, 18]. Although such priors give the best\nquality results, they are typically far slower than methods that use either Gaussian or Laplacian pri-\nors. This is a direct consequence of the problem becoming non-convex for hyper-Laplacians with\n\u03b1 < 1, meaning that many of the fast \u21131 or \u21132 tricks are no longer applicable. Instead, standard\noptimization methods such as conjugate gradient (CG) must be used. One variant that works well\nin practice is iteratively reweighted least squares (IRLS) [19] that solves a series of weighted least-\nsquares problems with CG, each one an \u21132 approximation to the non-convex problem at the current\npoint. In both cases, typically hundreds of CG iterations are needed, each involving an expensive\nconvolution of the blur kernel with the current image estimate.\n\nIn this paper we introduce an ef\ufb01cient scheme for non-blind deconvolution of images using a hyper-\nLaplacian image prior for 0 < \u03b1 \u2264 1. Our algorithm uses an alternating minimization scheme where\nthe non-convex part of the problem is solved in one phase, followed by a quadratic phase which can\nbe ef\ufb01ciently solved in the frequency domain using FFTs. We focus on the \ufb01rst phase where at each\npixel we are required to solve a non-convex separable minimization. We present two approaches to\nsolving this sub-problem. The \ufb01rst uses a lookup table (LUT); the second is an analytic approach\nspeci\ufb01c to two values of \u03b1. 
For \u03b1 = 1/2 the global minimum can be determined by finding the\nroots of a cubic polynomial analytically. In the \u03b1 = 2/3 case, the polynomial is a quartic whose\nroots can also be found efficiently in closed form. Both IRLS and our approach solve a series of\napproximations to the original problem. However, in our method each approximation is solved by\nalternating between the two phases above a few times, thus avoiding the expensive CG descent used\nby IRLS. This allows our scheme to operate several orders of magnitude faster. Although we focus\non the problem of non-blind deconvolution, it would be straightforward to adapt our algorithm to\nother related problems, such as denoising or super-resolution.\n\n1.1 Related Work\nHyper-Laplacian image priors have been used in a range of settings: super-resolution [20], transparency\nseparation [11] and motion deblurring [9]. In work directly relevant to ours, Levin et al. [10]\nand Joshi et al. [7] have applied them to non-blind deconvolution problems using IRLS to solve for\nthe deblurred image. Other types of sparse image prior include Gaussian Scale Mixtures (GSM)\n[21], which have been used for image deblurring [3] and denoising [14], and Student-t distributions\nfor denoising [25, 16]. With the exception of [14], these methods use CG and thus are slow.\n\nThe alternating minimization that we adopt is a common technique, known as half-quadratic splitting,\noriginally proposed by Geman and colleagues [5, 6]. Recently, Wang et al. [22] showed how it\ncould be used with a total-variation (TV) norm to deconvolve images. Our approach is closely related\nto this work: we also use a half-quadratic minimization, but the per-pixel sub-problem is quite\ndifferent. With the TV norm it can be solved with a straightforward shrinkage operation. 
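
For reference, the shrinkage operation mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the scalar form min over w of |w| + (β/2)(w − v)^2, which is the α = 1 case of the per-pixel sub-problem given later as Eqn. 5. The function name soft_threshold is hypothetical and is not taken from [22].

```python
import math


def soft_threshold(v, beta):
    # Minimizer of |w| + (beta / 2) * (w - v)^2 for a scalar v (alpha = 1 case).
    # The magnitude of v is reduced by 1 / beta and clipped at zero;
    # the sign of v is kept.
    magnitude = max(abs(v) - 1.0 / beta, 0.0)
    return math.copysign(magnitude, v)
```

Each pixel is handled independently, which is why this step is cheap; the non-convex case treated in this paper needs more care.
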
In our\nwork, as a consequence of using a sparse prior, the problem is non-convex and solving it efficiently\nis one of the main contributions of this paper.\n\nChartrand [1, 2] has introduced non-convex compressive sensing, where the usual \u21131 norm on the\nsignal to be recovered is replaced with an \u2113p quasi-norm, where p < 1. Similar to our approach, a\nsplitting scheme is used, resulting in a non-convex per-pixel sub-problem. To solve this, a Huber\napproximation (see [1]) to the quasi-norm is used, allowing the derivation of a generalized shrinkage\noperator to solve the sub-problem efficiently. However, this approximates the original sub-problem,\nunlike our approach.\n\n2 Algorithm\nWe now introduce the non-blind deconvolution problem. x is the original uncorrupted linear\ngrayscale image of N pixels; y is an image degraded by blur and/or noise, which we assume to\nbe produced by convolving x with a blur kernel k and adding zero mean Gaussian noise. We assume\nthat y and k are given and seek to reconstruct x. Given the ill-posed nature of the task, we\nregularize using a penalty function |.|^\u03b1 that acts on the output of a set of filters f_1, . . . , f_J applied\nto x. A weighting term \u03bb controls the strength of the regularization. From a probabilistic perspective,\nwe seek the MAP estimate of x: p(x|y, k) \u221d p(y|x, k)p(x), the first term being a Gaussian\nlikelihood and the second being the hyper-Laplacian image prior. Maximizing p(x|y, k) is equivalent\nto minimizing the cost \u2212 log p(x|y, k):\n\n$$\\min_x \\sum_{i=1}^{N} \\left( \\frac{\\lambda}{2} (x \\oplus k - y)_i^2 + \\sum_{j=1}^{J} |(x \\oplus f_j)_i|^{\\alpha} \\right) \\qquad (1)$$\n\nwhere i is the pixel index, and \u2295 is the 2-dimensional convolution operator. For simplicity, we\nuse two first-order derivative filters f_1 = [1 -1] and f_2 = [1 -1]^T, although additional ones can\neasily be added (e.g. 
learned filters [13, 16], or higher order derivatives). For brevity, we denote\nF_i^j x \u2261 (x \u2295 f_j)_i for j = 1, .., J.\n\nUsing the half-quadratic penalty method [5, 6, 22], we now introduce auxiliary variables w_i^1 and w_i^2\n(together denoted as w) at each pixel that allow us to move the F_i^j x terms outside the |.|^\u03b1 expression,\ngiving a new cost function:\n\n$$\\min_{x,w} \\sum_i \\left( \\frac{\\lambda}{2} (x \\oplus k - y)_i^2 + \\frac{\\beta}{2} \\left( \\|F_i^1 x - w_i^1\\|_2^2 + \\|F_i^2 x - w_i^2\\|_2^2 \\right) + |w_i^1|^{\\alpha} + |w_i^2|^{\\alpha} \\right) \\qquad (2)$$\n\nwhere \u03b2 is a weight that we will vary during the optimization, as described in Section 2.3. As\n\u03b2 \u2192 \u221e, the solution of Eqn. 2 converges to that of Eqn. 1. Minimizing Eqn. 2 for a fixed \u03b2 can\nbe performed by alternating between two steps, one where we solve for x given values of w, and\nvice-versa. The novel part of our algorithm lies in the w sub-problem, but first we briefly describe\nthe x sub-problem and its straightforward solution.\n\n2.1 x sub-problem\nGiven a fixed value of w from the previous iteration, Eqn. 2 is quadratic in x. The optimal x is thus:\n\n$$\\left( F^{1T} F^1 + F^{2T} F^2 + \\frac{\\lambda}{\\beta} K^T K \\right) x = F^{1T} w^1 + F^{2T} w^2 + \\frac{\\lambda}{\\beta} K^T y \\qquad (3)$$\n\nwhere Kx \u2261 x \u2295 k. Assuming circular boundary conditions, we can apply 2D FFTs, which diagonalize\nthe convolution matrices F^1, F^2, K, enabling us to find the optimal x directly:\n\n$$x = \\mathcal{F}^{-1} \\left( \\frac{\\mathcal{F}(F^1)^* \\circ \\mathcal{F}(w^1) + \\mathcal{F}(F^2)^* \\circ \\mathcal{F}(w^2) + (\\lambda/\\beta) \\mathcal{F}(K)^* \\circ \\mathcal{F}(y)}{\\mathcal{F}(F^1)^* \\circ \\mathcal{F}(F^1) + \\mathcal{F}(F^2)^* \\circ \\mathcal{F}(F^2) + (\\lambda/\\beta) \\mathcal{F}(K)^* \\circ \\mathcal{F}(K)} \\right) \\qquad (4)$$\n\nwhere \u2217 is the complex conjugate and \u25e6 denotes component-wise multiplication. The division is also\nperformed component-wise. Solving Eqn. 
4 requires only 3 FFTs at each iteration since many of\nthe terms can be precomputed. The form of this sub-problem is identical to that of [22].\n\n2.2 w sub-problem\nGiven a fixed x, finding the optimal w consists of solving 2N independent 1D problems of the form:\n\n$$w^* = \\arg\\min_w \\; |w|^{\\alpha} + \\frac{\\beta}{2} (w - v)^2 \\qquad (5)$$\n\nwhere v \u2261 F_i^j x. We now describe two approaches to finding w\u2217.\n\n2.2.1 Lookup table\n\nFor a fixed value of \u03b1, w\u2217 in Eqn. 5 only depends on two variables, \u03b2 and v, hence can easily be\ntabulated off-line to form a lookup table. We numerically solve Eqn. 5 for 10,000 different values\nof v over the range encountered in our problem (\u22120.6 \u2264 v \u2264 0.6). This is repeated for different \u03b2\nvalues, namely integer powers of \u221a2 between 1 and 256. Although the LUT gives an approximate\nsolution, it allows the w sub-problem to be solved very quickly for any \u03b1 > 0.\n\n2.2.2 Analytic solution\n\nFor some specific values of \u03b1, it is possible to derive exact analytical solutions to the w sub-problem.\nFor \u03b1 = 2, the sub-problem is quadratic and thus easily solved. If \u03b1 = 1, Eqn. 5 reduces to a 1-D\nshrinkage operation [22]. For some special cases of 1 < \u03b1 < 2, there exist analytic solutions [26].\nHere, we address the more challenging case of \u03b1 < 1 and we now describe a way to solve Eqn. 5\nfor two special cases of \u03b1 = 1/2 and \u03b1 = 2/3. For non-zero w, setting the derivative of Eqn. 5 w.r.t.\nw to zero gives:\n\n$$\\alpha |w|^{\\alpha - 1} \\mathrm{sign}(w) + \\beta (w - v) = 0 \\qquad (6)$$\n\nFor \u03b1 = 1/2, this becomes, with successive simplification:\n\n$$|w|^{-1/2} \\mathrm{sign}(w) + 2\\beta (w - v) = 0 \\qquad (7)$$\n$$|w|^{-1} = 4\\beta^2 (v - w)^2 \\qquad (8)$$\n$$w^3 - 2vw^2 + v^2 w - \\frac{\\mathrm{sign}(w)}{4\\beta^2} = 0 \\qquad (9)$$\n\nAt first sight Eqn. 
9 appears to be two different cubic equations with the \u00b11/(4\u03b2^2) term; however, we\nneed only consider one of these as v is fixed and w\u2217 must lie between 0 and v. Hence we can replace\nsign(w) with sign(v) in Eqn. 9:\n\n$$w^3 - 2vw^2 + v^2 w - \\frac{\\mathrm{sign}(v)}{4\\beta^2} = 0 \\qquad (10)$$\n\nFor the case \u03b1 = 2/3, using a similar derivation, we arrive at:\n\n$$w^4 - 3vw^3 + 3v^2 w^2 - v^3 w + \\frac{8}{27\\beta^3} = 0 \\qquad (11)$$\n\nthere being no sign(w) term as it conveniently cancels in this case. Hence w\u2217, the solution of Eqn. 5,\nis either 0 or a root of the cubic polynomial in Eqn. 10 for \u03b1 = 1/2, or equivalently a root of the\nquartic polynomial in Eqn. 11 for \u03b1 = 2/3. Although it is tempting to try the same manipulation\nfor \u03b1 = 3/4, this results in a 5th order polynomial, which can only be solved numerically.\n\nFinding the roots of the cubic and quartic polynomials: Analytic formulae exist for the roots\nof cubic and quartic polynomials [23, 24] and they form the basis of our approach, as detailed in\nAlgorithms 2 and 3. In both the cubic and quartic cases, the computational bottleneck is the cube\nroot operation. An alternative way of finding the roots of the polynomials Eqn. 10 and Eqn. 11 is\nto use a numerical root-finder such as Newton-Raphson. In our experiments, we found Newton-Raphson\nto be slower and less accurate than either the analytic method or the LUT approach (see\n[8] for further details).\n\nSelecting the correct roots: Given the roots of the polynomial, we need to determine which one\ncorresponds to the global minimum of Eqn. 5. When \u03b1 = 1/2, the resulting cubic equation can have:\n(a) 3 imaginary roots; (b) 2 imaginary roots and 1 real root, or (c) 3 real roots. In the case of (a),\nthe |w|^\u03b1 term means Eqn. 5 has positive derivatives around 0 and the lack of real roots implies the\nderivative never becomes negative, thus w\u2217 = 0. 
For (b), we need to compare the costs of the single\nreal root and w = 0, an operation that can be efficiently performed using Eqn. 13 below. In (c)\nwe have 3 real roots. Examining Eqn. 7 and Eqn. 8, we see that the squaring operation introduces\na spurious root above v when v > 0, and below v when v < 0. This root can be ignored, since\nw\u2217 must lie between 0 and v. The cost function in Eqn. 5 has a local maximum near 0 and a local\nminimum between this local maximum and v. Hence of the 2 remaining roots, the one further from\n0 will have a lower cost. Finally, we need to compare the cost of this root with that of w = 0 using\nEqn. 13.\n\nWe can use similar arguments for the \u03b1 = 2/3 case. Here we can potentially have: (a) 4 imaginary\nroots, (b) 2 imaginary and 2 real roots, or (c) 4 real roots. In (a), w\u2217 = 0 is the only solution. For\n(b), we pick the larger of the 2 real roots and compare the costs with w = 0 using Eqn. 13, similar\nto the case of 3 real roots for the cubic. Case (c) never occurs: the final quartic polynomial Eqn. 11\nwas derived with a cubing operation from the analytic derivative. This introduces 2 spurious roots\ninto the final solution, both of which are imaginary, thus only cases (a) and (b) are possible.\n\nIn both the cubic and quartic cases, we need an efficient way to pick between w = 0 and a real root\nthat is between 0 and v. We now describe a direct mechanism for doing this which does not involve\nthe expensive computation of the cost function in Eqn. 5 (see footnote 1).\nLet r be the non-zero real root. 0 must be chosen if it has lower cost in Eqn. 5. This implies:\n\n$$|r|^{\\alpha} + \\frac{\\beta}{2} (r - v)^2 > \\frac{\\beta v^2}{2}$$\n$$\\mathrm{sign}(r) |r|^{\\alpha - 1} + \\frac{\\beta}{2} (r - 2v) \\lessgtr 0, \\quad r \\lessgtr 0 \\qquad (12)$$\n\nFootnote 1: This requires the calculation of a fractional power, which is slow, particularly if \u03b1 = 2/3.\n\nSince we are only considering roots of the polynomial, we can use Eqn. 
6 to eliminate sign(r)|r|^{\u03b1\u22121}\nfrom Eqn. 12, yielding the condition:\n\n$$r \\lessgtr 2v \\frac{(\\alpha - 1)}{(\\alpha - 2)}, \\quad v \\gtrless 0 \\qquad (13)$$\n\nsince sign(r) = sign(v). So w\u2217 = r if r is between 2v/3 and v in the \u03b1 = 1/2 case or between\nv/2 and v in the \u03b1 = 2/3 case. Otherwise w\u2217 = 0. Using this result, picking w\u2217 can be efficiently\ncoded, e.g. lines 12\u201316 of Algorithm 2. Overall, the analytic approach is slower than the LUT, but\nit gives an exact solution to the w sub-problem.\n\n2.3 Summary of algorithm\n\nWe now give the overall algorithm using a LUT for the w sub-problem. As outlined in Algorithm\n1 below, we minimize Eqn. 2 by alternating the x and w sub-problems T times, before increasing\nthe value of \u03b2 and repeating. Starting with some small value \u03b20 we scale it by a factor \u03b2Inc until it\nexceeds some fixed value \u03b2Max. In practice, we find that a single inner iteration suffices (T = 1),\nalthough more can sometimes be needed when \u03b2 is small.\n\nAlgorithm 1 Fast image deconvolution using hyper-Laplacian priors\nRequire: Blurred image y, kernel k, regularization weight \u03bb, exponent \u03b1 (> 0)\nRequire: \u03b2 regime parameters: \u03b20, \u03b2Inc, \u03b2Max\nRequire: Number of inner iterations T.\n1: \u03b2 = \u03b20, x = y\n2: Precompute constant terms in Eqn. 4.\n3: while \u03b2 < \u03b2Max do\n4:   iter = 0\n5:   for i = 1 to T do\n6:     Given x, solve Eqn. 5 for all pixels using a LUT to give w\n7:     Given w, solve Eqn. 4 to give x\n8:   end for\n9:   \u03b2 = \u03b2Inc \u00b7 \u03b2\n10: end while\n11: return Deconvolved image x\n\nAs with any non-convex optimization problem, it is difficult to derive any guarantees regarding the\nconvergence of Algorithm 1. However, we can be sure that the global optimum of each sub-problem\nwill be found, given the fixed x and w from the previous iteration. 
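
As an illustration of the selection rule in Eqn. 13, the following sketch solves Eqn. 5 for α = 1/2 at a single scalar v. It is not Algorithm 2: instead of the closed-form cubic roots it uses bisection on the cubic of Eqn. 10, relying on the observation (an assumption of this sketch, which can be checked by differentiating) that w(w − |v|)^2 is decreasing between 2|v|/3 and |v|. The names cost and solve_w_half are hypothetical.

```python
import math


def cost(w, v, beta, alpha=0.5):
    # Objective of Eqn. 5: |w|^alpha + (beta / 2) * (w - v)^2
    return abs(w) ** alpha + 0.5 * beta * (w - v) ** 2


def solve_w_half(v, beta, iters=80):
    # Minimizer of Eqn. 5 for alpha = 1/2 at one scalar v (hypothetical helper).
    a = abs(v)  # solve for |v|, restore the sign at the end

    def cubic(w):
        # Left-hand side of Eqn. 10 written for a non-negative target a
        return w * (w - a) ** 2 - 1.0 / (4.0 * beta ** 2)

    lo, hi = 2.0 * a / 3.0, a
    # cubic() decreases on [2a/3, a] and cubic(a) < 0, so a root inside the
    # bound of Eqn. 13 exists only if cubic(2a/3) > 0; otherwise w* = 0.
    if cubic(lo) <= 0.0:
        return 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cubic(mid) > 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return math.copysign(0.5 * (lo + hi), v)
```

For example, with β = 64 the sketch returns 0 for v = 0.05 and a value between 0.2 and 0.3 for v = 0.3. In a full implementation this per-pixel step would be vectorized or tabulated, as the LUT variant of Algorithm 1 does.
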
As with other methods that use\nthis form of alternating minimization [5, 6, 22], there is little theoretical guidance for setting the \u03b2\nschedule. We find that the simple scheme shown in Algorithm 1 works well to minimize Eqn. 2 and\nits proxy Eqn. 1. The experiments in Section 3 show our scheme achieves very similar SNR levels\nto IRLS, but at a much lower computational cost.\n\n3 Experiments\nWe evaluate the deconvolution performance of our algorithm on images, comparing it to numerous\nother methods: (i) \u21132 (Gaussian) prior on image gradients; (ii) Lucy-Richardson [15]; (iii) the\nalgorithm of Wang et al. [22] using a total variation (TV) norm prior and (iv) a variant of [22] using\nan \u21131 (Laplacian) prior; (v) the IRLS approach of Levin et al. [10] using a hyper-Laplacian prior\nwith \u03b1 = 1/2, 2/3, 4/5. Note that only IRLS and our method use a prior with \u03b1 < 1. For the\nIRLS scheme, we used the implementation of [10] with default parameters, the only change being\nthe removal of higher order derivative filters to enable a direct comparison with other approaches.\nNote that IRLS and \u21132 directly minimize Eqn. 1, while our method, and the TV and \u21131 approaches of\n[22] minimize the cost in Eqn. 2, using T = 1, \u03b20 = 1, \u03b2Inc = 2\u221a2, \u03b2Max = 256. In our approach,\nwe use \u03b1 = 1/2 and \u03b1 = 2/3, and compare the performance of the LUT and analytic methods as\nwell. All runs were performed with multithreading enabled (over 4 CPU cores).\n\nWe evaluate the algorithms using a set of blurry images, created in the following way. 7 in-focus\ngrayscale real-world images were downloaded from the web. They were then blurred by real-world\ncamera shake kernels from [12]. 
1% Gaussian noise was added, followed by quantization to 255\ndiscrete values. In any practical deconvolution setting the blur kernel is never perfectly known.\nTherefore, the kernel passed to the algorithms was a minor perturbation of the true kernel, to mimic\nkernel estimation errors. In experiments with non-perturbed kernels (not shown), the results are\nsimilar to those in Tables 1 and 3 but with slightly higher SNR levels. See Fig. 2 for an example of a\nkernel from [12] and its perturbed version. Our evaluation metric was the SNR between the original\nimage x\u0302 and the deconvolved output x, defined as 10 log10(\u2016x\u0302 \u2212 \u00b5(x\u0302)\u2016^2 / \u2016x\u0302 \u2212 x\u2016^2), \u00b5(x\u0302) being the mean of x\u0302.\nIn Table 1 we compare the algorithms on 7 different images, all blurred with the same 19\u00d719 kernel.\nFor each algorithm we exhaustively searched over different regularization weights \u03bb to find the value\nthat gave the best SNR performance, as reported in the table. In Table 3 we evaluate the algorithms\nwith the same 512\u00d7512 image blurred by 8 different kernels (from [12]) of varying size. Again,\nthe optimal value of \u03bb for each kernel/algorithm combination was chosen from a range of values\nbased on SNR performance. Table 2 shows the running time of several algorithms on images up\nto 3072\u00d73072 pixels. Figure 2 shows a larger 27\u00d727 blur being deconvolved from two example\nimages, comparing the output of different methods.\n\nThe tables and figures show our method with \u03b1 = 2/3 and IRLS with \u03b1 = 4/5 yielding higher\nquality results than other methods. However, our algorithm is around 70 to 350 times faster than\nIRLS depending on whether the analytic or LUT method is used. This speedup factor is independent\nof image size, as shown by Table 2. The \u21131 method of [22] is the best of the other methods, being\nof comparable speed to ours but achieving lower SNR scores. 
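
The SNR metric defined earlier in this section can be sketched as follows. This is a minimal illustration on flat sequences of pixel values, assuming both images have the same length and differ in at least one pixel; the function name snr_db is hypothetical.

```python
import math


def snr_db(x_true, x_est):
    # x_true: original image (x-hat in the text), x_est: deconvolved output,
    # both given as flat sequences of pixel values of equal length.
    mu = sum(x_true) / len(x_true)  # mean of the original image
    signal = sum((t - mu) ** 2 for t in x_true)
    error = sum((t - e) ** 2 for t, e in zip(x_true, x_est))
    return 10.0 * math.log10(signal / error)
```

A higher value means the output is closer to the original. The SNR gain reported in the tables is the difference between this value for the deconvolved output and for the blurry input.
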
The SNR results for our method are\nalmost the same whether we use LUTs or the analytic approach. Hence, in practice, the LUT method is\npreferred, since it is approximately 5 times faster than the analytic method and can be used for any\nvalue of \u03b1.\n\nImage # | Blurry | \u21132 | Lucy | TV | \u21131 | IRLS \u03b1=1/2 | IRLS \u03b1=2/3 | IRLS \u03b1=4/5 | Ours \u03b1=1/2 | Ours \u03b1=2/3\n1 | 6.42 | 14.13 | 12.54 | 15.87 | 16.18 | 14.61 | 15.45 | 16.04 | 16.05 | 16.44\n2 | 10.73 | 17.56 | 15.15 | 19.37 | 19.86 | 18.43 | 19.37 | 20.00 | 19.78 | 20.26\n3 | 12.45 | 19.30 | 16.68 | 21.83 | 22.77 | 21.53 | 22.62 | 22.95 | 23.26 | 23.27\n4 | 8.51 | 16.02 | 14.27 | 17.66 | 18.02 | 16.34 | 17.31 | 17.98 | 17.70 | 18.17\n5 | 12.74 | 16.59 | 13.28 | 19.34 | 20.25 | 19.12 | 19.99 | 20.20 | 21.28 | 21.00\n6 | 10.85 | 15.46 | 12.00 | 17.13 | 17.59 | 15.59 | 16.58 | 17.04 | 17.79 | 17.89\n7 | 11.76 | 17.40 | 15.22 | 18.58 | 18.85 | 17.08 | 17.99 | 18.61 | 18.58 | 18.96\nAv. SNR gain | - | 6.14 | 3.67 | 8.05 | 8.58 | 7.03 | 7.98 | 8.48 | 8.71 | 8.93\nAv. Time (secs) | - | 79.85 | 1.55 | 0.66 | 0.75 | 354 | 354 | 354 | L:1.01 L:1.00 A:5.27 A:4.08 (both Ours columns, in source order)\n\nTable 1: Comparison of SNRs and running time of 9 different methods for the deconvolution of\n7 576\u00d7864 images, blurred with the same 19\u00d719 kernel. L=Lookup table, A=Analytic. The best\nperforming algorithm for each kernel is shown in bold. Our algorithm with \u03b1 = 2/3 beats IRLS\nwith \u03b1 = 4/5, as well as being much faster. On average, both these methods outperform \u21131, demonstrating\nthe benefits of a sparse prior.\n\nImage size | \u21131 | IRLS \u03b1=4/5 | Ours (LUT) \u03b1=2/3 | Ours (Analytic) \u03b1=2/3\n256\u00d7256 | 0.24 | 78.14 | 0.42 | 0.7\n512\u00d7512 | 0.47 | 256.87 | 0.55 | 2.28\n1024\u00d71024 | 2.34 | 1281.3 | 2.78 | 10.87\n2048\u00d72048 | 9.34 | 4935 | 10.72 | 44.64\n3072\u00d73072 | 22.40 | - | 24.07 | 100.42\n\nTable 2: Run-times of different methods for a range of image sizes, using a 13\u00d713 kernel. 
Our LUT\nalgorithm is more than 100 times faster than the IRLS method of [10].\n\n[Figure 2 panels. First image: Original; Blurred (SNR=7.31); \u21132 (SNR=14.89, t=0.1); \u21131 (SNR=18.10, t=0.8); Ours \u03b1=2/3 (SNR=18.96, t=1.2); IRLS \u03b1=4/5 (SNR=19.05, t=483.9). Second image: Original; Blurred (SNR=2.64); \u21132 (SNR=11.58, t=0.1); \u21131 (SNR=13.64, t=0.8); Ours \u03b1=2/3 (SNR=14.15, t=1.2); IRLS \u03b1=4/5 (SNR=14.28, t=482.1)]\n\nFigure 2: Crops from two images (#1 & #5) being deconvolved by 4 different algorithms, including\nours using a 27\u00d727 kernel (#7). In the bottom left inset, we show the original kernel from [12]\n(lower) and the perturbed version provided to the algorithms (upper), to make the problem more\nrealistic. This figure is best viewed on screen, rather than in print.\n\nKernel # / size | Blurry | \u21132 | Lucy | TV | \u21131 | IRLS \u03b1=1/2 | IRLS \u03b1=2/3 | IRLS \u03b1=4/5 | Ours \u03b1=1/2 | Ours \u03b1=2/3\n#1: 13\u00d713 | 10.69 | 17.22 | 14.49 | 19.21 | 19.41 | 17.20 | 18.22 | 18.87 | 19.36 | 19.66\n#2: 15\u00d715 | 11.28 | 16.14 | 13.81 | 17.94 | 18.29 | 16.17 | 17.26 | 18.02 | 18.14 | 18.64\n#3: 17\u00d717 | 8.93 | 14.94 | 12.16 | 16.50 | 16.86 | 15.34 | 16.36 | 16.99 | 16.73 | 17.25\n#4: 19\u00d719 | 10.13 | 15.27 | 12.38 | 16.83 | 17.25 | 15.97 | 16.98 | 17.57 | 17.29 | 17.67\n#5: 21\u00d721 | 9.26 | 16.55 | 13.60 | 18.72 | 18.83 | 17.23 | 18.36 | 18.88 | 19.11 | 19.34\n#6: 23\u00d723 | 7.87 | 15.40 | 13.32 | 17.01 | 17.42 | 15.66 | 16.73 | 17.40 | 17.26 | 17.77\n#7: 27\u00d727 | 6.76 | 13.81 | 11.55 | 15.42 | 15.69 | 14.59 | 15.68 | 16.38 | 15.92 | 16.29\n#8: 41\u00d741 | 6.00 | 12.80 | 11.19 | 13.53 | 13.62 | 12.68 | 13.60 | 14.25 | 13.73 | 13.68\nAv. SNR gain | - | 6.40 | 3.95 | 8.03 | 8.31 | 6.74 | 7.78 | 8.43 | 8.33 | 8.67\nAv. Time (sec) | - | 57.44 | 1.22 | 0.50 | 0.55 | 271 | 271 | 271 | L:0.81 L:0.78 A:2.15 A:2.23 (both Ours columns, in source order)\n\nTable 3: Comparison of SNRs and running time of 9 different methods for the deconvolution of a\n512\u00d7512 image blurred by 8 different kernels. L=Lookup table, A=Analytic. Our algorithm beats\nall other methods in terms of quality, with the exception of IRLS on the largest kernel size. However,\nour algorithm is far faster than IRLS, being comparable in speed to the \u21131 approach.\n\n4 Discussion\nWe have described an image deconvolution scheme that is fast, conceptually simple and yields\nhigh quality results. Our algorithm takes a novel approach to the non-convex optimization problem\narising from the use of a hyper-Laplacian prior, by using a splitting approach that allows the\nnon-convexity to become separable over pixels. Using a LUT to solve this sub-problem allows for\norders of magnitude speedup in the solution over existing methods. Our Matlab implementation is\navailable online at http://cs.nyu.edu/~dilip/wordpress/?page_id=122.\n\nA potential drawback to our method, common to the TV and \u21131 approaches of [22], is its use of\nfrequency domain operations which assume circular boundary conditions, something not present in\nreal images. These give rise to boundary artifacts which can be overcome to some extent with edge\ntapering operations. 
However, our algorithm is suitable for very large images where the boundaries\nare a small fraction of the overall image.\n\nAlthough we focus on deconvolution, our scheme can be adapted to a range of other problems which\nrely on natural image statistics. For example, by setting k = 1 the algorithm can be used to denoise,\nor if k is a defocus kernel it can be used for super-resolution. The speed offered by our algorithm\nmakes it practical to perform these operations on the multi-megapixel images from modern cameras.\n\nAlgorithm 2: Solve Eqn. 5 for \u03b1 = 1/2\nRequire: Target value v, Weight \u03b2\n1: \u03b5 = 10^{\u22126}\n2: {Compute intermediary terms m, t1, t2, t3}\n3: m = \u2212sign(v)/(4\u03b2^2)\n4: t1 = 2v/3\n5: t2 = \u221b(\u221227m \u2212 2v^3 + 3\u221a3 \u00b7 \u221a(27m^2 + 4mv^3))\n6: t3 = v^2/t2\n7: {Compute 3 roots, r1, r2, r3:}\n8: r1 = t1 + 1/(3 \u00b7 2^{1/3}) \u00b7 t2 + 2^{1/3}/3 \u00b7 t3\n9: r2 = t1 \u2212 (1 \u2212 \u221a3 i)/(6 \u00b7 2^{1/3}) \u00b7 t2 \u2212 (1 + \u221a3 i)/(3 \u00b7 2^{2/3}) \u00b7 t3\n10: r3 = t1 \u2212 (1 + \u221a3 i)/(6 \u00b7 2^{1/3}) \u00b7 t2 \u2212 (1 \u2212 \u221a3 i)/(3 \u00b7 2^{2/3}) \u00b7 t3\n11: {Pick global minimum from (0, r1, r2, r3)}\n12: r = [r1, r2, r3]\n13: c1 = (abs(imag(r)) < \u03b5) {Root must be real}\n14: c2 = real(r)sign(v) > (2/3 \u00b7 abs(v)) {Root must obey bound of Eqn. 13}\n15: c3 = real(r)sign(v) < abs(v) {Root < v}\n16: w\u2217 = max((c1&c2&c3)real(r)sign(v))sign(v)\nreturn w\u2217\n\nAlgorithm 3: Solve Eqn. 5 for \u03b1 = 2/3\nRequire: Target value v, Weight \u03b2\n1: \u03b5 = 10^{\u22126}\n2: {Compute intermediary terms m, t1, . . . , t7:}\n3: m = 8/(27\u03b2^3)\n4: t1 = \u22129/8 \u00b7 v^2\n5: t2 = v^3/4\n6: t3 = \u22121/8 \u00b7 mv^2\n7: t4 = \u2212t3/2 + \u221a(\u2212m^3/27 + m^2v^4/256)\n8: t5 = \u221b(t4)\n9: t6 = 2(\u22125/18 \u00b7 t1 + t5 + m/(3 \u00b7 t5))\n10: t7 = \u221a(t1/3 + t6)\n11: {Compute 4 roots, r1, r2, r3, r4:}\n12: r1 = 3v/4 + (t7 + \u221a(\u2212(t1 + t6 + t2/t7)))/2\n13: r2 = 3v/4 + (t7 \u2212 \u221a(\u2212(t1 + t6 + t2/t7)))/2\n14: r3 = 3v/4 + (\u2212t7 + \u221a(\u2212(t1 + t6 \u2212 t2/t7)))/2\n15: r4 = 3v/4 + (\u2212t7 \u2212 \u221a(\u2212(t1 + t6 \u2212 t2/t7)))/2\n16: {Pick global minimum from (0, r1, r2, r3, r4)}\n17: r = [r1, r2, r3, r4]\n18: c1 = (abs(imag(r)) < \u03b5) {Root must be real}\n19: c2 = real(r)sign(v) > (1/2 \u00b7 abs(v)) {Root must obey bound in Eqn. 13}\n20: c3 = real(r)sign(v) < abs(v) {Root < v}\n21: w\u2217 = max((c1&c2&c3)real(r)sign(v))sign(v)\nreturn w\u2217\n\nReferences\n\n[1] R. Chartrand. Fast algorithms for nonconvex compressive sensing: MRI reconstruction from very few data. In IEEE International Symposium on Biomedical Imaging (ISBI), 2009.\n[2] R. Chartrand and V. Staneva. Restricted isometry properties and nonconvex compressive sensing. Inverse Problems, 24:1\u201314, 2008.\n[3] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. Freeman. Removing camera shake from a single photograph. ACM TOG (Proc. SIGGRAPH), 25:787\u2013794, 2006.\n[4] D. Field. What is the goal of sensory coding? Neural Computation, 6:559\u2013601, 1994.\n[5] D. Geman and G. Reynolds. Constrained restoration and recovery of discontinuities. PAMI, 14(3):367\u2013383, 1992.\n[6] D. Geman and C. Yang. Nonlinear image recovery with half-quadratic regularization. PAMI, 4:932\u2013946, 1995.\n[7] N. Joshi, L. Zitnick, R. Szeliski, and D. Kriegman. Image deblurring and denoising using color priors. In CVPR, 2009.\n[8] D. Krishnan and R. Fergus. Fast image deconvolution using hyper-laplacian priors, supplementary material. NYU Tech. Rep. 2009, 2009.\n[9] A. Levin. 
Blind motion deblurring using image statistics. In NIPS, 2006.\n\n[10] A. Levin, R. Fergus, F. Durand, and W. Freeman. Image and depth from a conventional camera with a coded aperture. ACM TOG (Proc. SIGGRAPH), 26(3):70, 2007.\n\n[11] A. Levin and Y. Weiss. User assisted separation of reflections from a single image using a sparsity prior. PAMI, 29(9):1647\u20131654, Sept 2007.\n\n[12] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman. Understanding and evaluating blind deconvolution algorithms. In CVPR, 2009.\n\n[13] S. Osindero, M. Welling, and G. Hinton. Topographic product models applied to natural scene statistics. Neural Computation, 1995.\n\n[14] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli. Image denoising using a scale mixture of Gaussians in the wavelet domain. IEEE TIP, 12(11):1338\u20131351, November 2003.\n\n[15] W. Richardson. Bayesian-based iterative method of image restoration. 62:55\u201359, 1972.\n\n[16] S. Roth and M. J. Black. Fields of Experts: A Framework for Learning Image Priors. In CVPR, volume 2, pages 860\u2013867, 2005.\n\n[17] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259\u2013268, 1992.\n\n[18] E. Simoncelli and E. H. Adelson. Noise removal via Bayesian wavelet coring. In ICIP, pages 379\u2013382, 1996.\n\n[19] C. V. Stewart. Robust parameter estimation in computer vision. SIAM Reviews, 41(3):513\u2013537, Sept. 1999.\n\n[20] M. F. Tappen, B. C. Russell, and W. T. Freeman. Exploiting the sparse derivative prior for super-resolution and image demosaicing. In SCTV, 2003.\n\n[21] M. Wainwright and E. Simoncelli. Scale mixtures of Gaussians and the statistics of natural images. In NIPS, pages 855\u2013861, 1999.\n\n[22] Y. Wang, J. Yang, W. Yin, and Y. Zhang. A new alternating minimization algorithm for total variation image reconstruction. SIAM J. Imaging Sciences, 1(3):248\u2013272, 2008.\n\n[23] E. W. 
Weisstein. Cubic formula. http://mathworld.wolfram.com/CubicFormula.html.\n\n[24] E. W. Weisstein. Quartic equation. http://mathworld.wolfram.com/QuarticEquation.html.\n\n[25] M. Welling, G. Hinton, and S. Osindero. Learning sparse topographic representations with products of Student-t distributions. In NIPS, 2002.\n\n[26] S. Wright, R. Nowak, and M. Figueiredo. Sparse reconstruction by separable approximation. IEEE Trans. Signal Processing, to appear, 2009.\n", "award": [], "sourceid": 341, "authors": [{"given_name": "Dilip", "family_name": "Krishnan", "institution": null}, {"given_name": "Rob", "family_name": "Fergus", "institution": null}]}