{"title": "Epitome driven 3-D Diffusion Tensor image segmentation: on extracting specific structures", "book": "Advances in Neural Information Processing Systems", "page_first": 1696, "page_last": 1704, "abstract": "We study the problem of segmenting specific white matter structures of interest from Diffusion Tensor (DT-MR) images of the human brain. This is an important requirement in many Neuroimaging studies: for instance, to evaluate whether a brain structure exhibits group level differences as a function of disease in a set of images. Typically, interactive expert guided segmentation has been the method of choice for such applications, but this is tedious for large datasets common today. To address this problem, we endow an image segmentation algorithm with 'advice' encoding some global characteristics of the region(s) we want to extract. This is accomplished by constructing (using expert-segmented images) an epitome of a specific region - as a histogram over a bag of 'words' (e.g.,suitable feature descriptors). Now, given such a representation, the problem reduces to segmenting new brain image with additional constraints that enforce consistency between the segmented foreground and the pre-specified histogram over features. We present combinatorial approximation algorithms to incorporate such domain specific constraints for Markov Random Field (MRF) segmentation. Making use of recent results on image co-segmentation, we derive effective solution strategies for our problem. 
We provide an analysis of solution quality, and present promising experimental evidence showing that many structures of interest in Neuroscience can be extracted reliably from 3-D brain image volumes using our algorithm.", "full_text": "Epitome driven 3-D Diffusion Tensor image\nsegmentation: on extracting speci\ufb01c structures\u2217\n\nKamiya Motwani\u2020\u00a7 Nagesh Adluru\u00a7 Chris Hinrichs\u2020\u00a7 Andrew Alexander\u2021 Vikas Singh\u00a7\u2020\n\n\u2020Computer Sciences\nUniversity of Wisconsin\n\n\u00a7Biostatistics & Medical Informatics\n\nUniversity of Wisconsin\n\n\u2021Medical Physics\n\nUniversity of Wisconsin\n\n{kmotwani,hinrichs,vsingh}@cs.wisc.edu\n\n{adluru,alalexander2}@wisc.edu\n\nAbstract\n\nWe study the problem of segmenting speci\ufb01c white matter structures of interest\nfrom Diffusion Tensor (DT-MR) images of the human brain. This is an important\nrequirement in many Neuroimaging studies: for instance, to evaluate whether a\nbrain structure exhibits group level differences as a function of disease in a set of\nimages. Typically, interactive expert guided segmentation has been the method\nof choice for such applications, but this is tedious for large datasets common to-\nday. To address this problem, we endow an image segmentation algorithm with\n\u201cadvice\u201d encoding some global characteristics of the region(s) we want to extract.\nThis is accomplished by constructing (using expert-segmented images) an epitome\nof a speci\ufb01c region \u2013 as a histogram over a bag of \u2018words\u2019 (e.g., suitable feature de-\nscriptors). Now, given such a representation, the problem reduces to segmenting a\nnew brain image with additional constraints that enforce consistency between the\nsegmented foreground and the pre-speci\ufb01ed histogram over features. We present\ncombinatorial approximation algorithms to incorporate such domain speci\ufb01c con-\nstraints for Markov Random Field (MRF) segmentation. 
Making use of recent\nresults on image co-segmentation, we derive effective solution strategies for our\nproblem. We provide an analysis of solution quality, and present promising exper-\nimental evidence showing that many structures of interest in Neuroscience can be\nextracted reliably from 3-D brain image volumes using our algorithm.\n\n1\n\nIntroduction\n\nDiffusion Tensor Imaging (DTI or DT-MR) is an imaging modality that facilitates measurement of\nthe diffusion of water molecules in tissues. DTI has turned out to be especially useful in Neuroimag-\ning because the inherent microstructure and connectivity networks in the brain can be estimated from\nsuch data [1]. The primary motivation is to investigate how speci\ufb01c components (i.e., structures) of\nthe brain network topology respond to disease and treatment [2], and how these are affected as a\nresult of external factors such as trauma. An important challenge here is to reliably extract (i.e., seg-\nment) speci\ufb01c structures of interest from DT-MR image volumes, so that these regions can then be\nanalyzed to evaluate variations between clinically disparate groups. This paper focuses on ef\ufb01cient\nalgorithms for this application \u2013 that is, 3-D image segmentation with side constraints to preserve\n\ufb01delity of the extracted foreground with a given epitome of the brain region of interest.\nDTI data are represented as a 3 \u00d7 3 positive semide\ufb01nite tensor at each image voxel. These im-\nages provide information about connection pathways in the brain, and neuroscientists focus on the\n\u2217Supported by AG034315 (Singh), MH62015 (Alexander), UW ICTR (1UL1RR025011), and UW ADRC\n(AG033514). Hinrichs and Adluru are supported by UW-CIBM funding (via NLM 5T15LM007359). Thanks\nto Richie Davidson for assistance with the data, and Anne Bartosic and Chad Ennis for ground truth indications.\nThe authors thank Lopamudra Mukherjee, Moo K. 
Chung, and Chuck Dyer for discussions and suggestions.\n\n1\n\n\fanalysis of white-matter regions (these are known to encompass the \u2018brain axonal networks\u2019). In\ngeneral, standard segmentation methods yield reasonable results in separating white matter (WM)\nfrom gray-matter (GM), see [3]. While some of these algorithms make use of the tensor \ufb01eld di-\nrectly [4], others utilize \u2018maps\u2019 of certain scalar-valued anisotropy measures calculated from tensors\nto partition WM/GM regions [5], see Fig. 1. But different pathways play different functional roles;\nhence it is more meaningful to evaluate group differences in a population at the level of speci\ufb01c\nwhite matter structures (e.g., corpus callosum, fornix, cingulum bundle). Part of the reason is that\neven signi\ufb01cant volume differences in small structures may be overwhelmed in a pair-wise t-test\nusing volume measures of the entire white matter (obtained via WM/GM segmentation [6]). To\nanalyze variations in speci\ufb01c regions, we require segmentation of such structures as a \ufb01rst step.\nUnsupervised segmentation of speci\ufb01c regions of interest from DTI is dif\ufb01cult. Even interactive\nsegmentation (based on gray-level fractional anisotropy maps) leads to unsatisfactory results unless\nguided by a neuroanatomical expert \u2013 that is, specialized knowledge of the global appearance of the\nstructure is essential in this process. Further, this is tedious for large datasets. One alternative is to\nuse a set of already segmented images to facilitate processing of new data. Fortunately, since many\nstudies use hand indicated regions for group analysis [7], such data is readily available. However,\ndirectly applying off the shelf toolboxes to learn a classi\ufb01er (from such segmented images) does not\nwork well. Part of the reason is that the local spatial context at each tensor voxel, while useful, is\nnot suf\ufb01ciently discriminative. 
In fact, the likelihood of a voxel to be assigned as part of the fore-\nground (structure of interest) depends on whether the set of all foreground voxels (in entirety) match\nan \u2018appearance model\u2019 of the structure, in addition to being perceptually homogeneous. One strat-\negy to model the \ufb01rst requirement is to extract features, generate a codebook dictionary of feature\ndescriptors, and ask that distribution over the codebook (for foreground voxels) be consistent with\nthe distribution induced by the expert-segmented foreground (on the same codebook). Putting this\ntogether with the homogeneity requirement serves to de\ufb01ne the problem: segment a given DTI im-\nage (using MRFs, normalized cuts), while ensuring that the extracted foreground matches a known\nappearance model (over a bag of codebook features). The goal is related to recent work on simulta-\nneous segmentation of two images called Cosegmentation [8, 9, 10, 11].\nIn the following sections, we formalize the problem and then present ef\ufb01cient segmentation meth-\nods. The key contributions of this paper are: (i) We propose a new algorithm for epitome-based\ngraph-cuts segmentation, one which permits introduction of a bias to favor solutions that match a\ngiven epitome for regions of interest. (ii) We present an application to segmentation of speci\ufb01c struc-\ntures in Diffusion Tensor Images of the human brain and provide experimental evidence that many\nstructures of interest in Neuroscience can be extracted reliably from large 3-D DTI images. (iii) Our\nanalysis provides a guarantee of a constant factor approximation ratio of 4. For a deterministic\nround-up strategy to obtain integral solutions, this approximation is provably tight.\n2 Preliminaries\nWe provide a short overview of how image segmentation is expressed as \ufb01nding the maximum like-\nlihood solution to a Conditional or Markov Random Field function. 
Later, we extend the model to\ninclude an additional bias (or regularizer) so that the con\ufb01gurations that are consistent with an epit-\nome of a structure of interest turn out to be more likely (than other possibly lower energy solutions).\n\nFigure 1: Speci\ufb01c white matter structures such as Corpus Callosum, Interior Capsules, and Cingulum Bundle\nare shown in 3D (left), within the entire white matter (center), and overlaid on a Fractional Anisotropy (FA)\nimage slice (right). Our objective is to segment such structures from DTI images. Note that FA is a scalar\nanisotropy measure often used directly for WM/GM segmentation, since anisotropy is higher in white matter.\n\n2\n\n\f2.1 Markov Random Fields (MRF)\n\nMarkov Random Field based image segmentation approaches are quite popular in computer vision\n[12, 13] and neuroimaging [14]. A random \ufb01eld is assumed over the image lattice consisting of\ndiscrete random variables, x = {x1,\u00b7\u00b7\u00b7 , xn}. Each xj \u2208 x, j \u2208 {1,\u00b7\u00b7\u00b7 , n} takes a value from\na \ufb01nite label set, L = {L1,\u00b7\u00b7\u00b7 ,Lm}. The set Nj = {i|j \u223c i} lists the neighbors of xj on the\nadjacency lattice, denoted as (j \u223c i). A con\ufb01guration of the MRF is an assignment of each xj to\na label in L. Labels represent distinct image segments; each con\ufb01guration gives a segmentation,\nand the desired segmentation is the least energy MRF con\ufb01guration. 
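The least-energy view above can be made concrete with a toy energy evaluator. This is a minimal sketch of our own (not the paper's implementation): it scores a binary labeling on a 2-D 4-neighbor lattice using per-voxel data costs and a constant Potts-style disagreement penalty; the function name, array layout, and `pairwise_weight` parameter are illustrative assumptions.

```python
import numpy as np

def mrf_energy(labels, unary, pairwise_weight=1.0):
    """Energy of a binary labeling on a 2-D 4-neighbor lattice.

    labels: (H, W) array of {0, 1} label assignments (one per voxel).
    unary:  (H, W, 2) array; unary[i, j, k] is the cost of giving
            voxel (i, j) label k.
    The pairwise term is a Potts penalty: a constant cost whenever two
    lattice neighbors receive different labels.
    """
    h, w = labels.shape
    # Data term: sum the cost of each voxel's chosen label.
    data = unary[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    # Smoothness term: count label disagreements along rows and columns.
    cuts = (labels[:, 1:] != labels[:, :-1]).sum() \
         + (labels[1:, :] != labels[:-1, :]).sum()
    return data + pairwise_weight * cuts
```

The desired segmentation then corresponds to the labeling that minimizes this quantity over all configurations.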
The energy is expressed as a sum of (1) individual data log-likelihood terms (cost of assigning xj to Lk ∈ L) and (2) pairwise smoothness prior (favor voxels with similar appearance to be assigned to the same label) [12, 15, 16]:\n\nmin_{x,z} Σ_{j=1}^{n} Σ_{Lk ∈ L} wjk xjk + Σ_{(i∼j)} cij zij    (1)\n\nsubject to |xik − xjk| ≤ zij ∀ k ∈ {1, ..., m}, ∀ (i ∼ j) ∈ N where i, j ∈ {1, ..., n},    (2)\nx is binary of size n × m, z is binary of size |N|,    (3)\n\nwhere wjk is a unary term encoding the probability of j being assigned to Lk ∈ L, and cij is the pairwise smoothness prior (e.g., Generalized Potts model). The variable zij = 1 indicates that voxels i and j are assigned to different labels, and x provides the assignment of voxels to labels (i.e., segments or regions). The problem is NP-hard but good approximation algorithms (including combinatorial methods) are known [16, 15, 17, 12]. Special cases (e.g., when c is convex) are known to be poly-time solvable [15]. Next, we discuss an interesting extension of MRF segmentation, namely Cosegmentation, which deals with the simultaneous segmentation of multiple images.\n\n2.2 From Cosegmentation toward Epitome-based MRFs\n\nCosegmentation uses the observation that while global histograms of images of the same object (in different backgrounds) may differ, the histogram(s) of the respective foreground regions in the image pair (based on certain invariant features) remain relatively stable. Therefore, one may perform a concurrent segmentation of the images with a global constraint that enforces consistency between histograms of only the foreground voxels. 
We first construct a codebook of features F (e.g., using RGB intensities) for images I(1) and I(2); the histograms on this dictionary are:\n\nH(1) = {H(1)_1, ..., H(1)_β} and H(2) = {H(2)_1, ..., H(2)_β} (b indexes the histogram bins),\n\nsuch that H(u)_b(j) = 1 if voxel j ∈ I(u) is most similar to codeword Fb, where u ∈ {1, 2}. If x(1) and x(2) denote the segmentation solutions, and x(1)_j = 1 assigns voxel j of I(1) to the foreground, a measure of consistency between the foreground regions (after segmentation) is given by:\n\nΣ_{b=1}^{β} Ψ(⟨H(1)_b, x(1)⟩, ⟨H(2)_b, x(2)⟩),    (4)\n\nwhere Ψ(·, ·) is a suitable similarity (or distance) function and ⟨H(u)_b, x(u)⟩ = Σ_{j=1}^{n} H(u)_b(j) x(u)_j, a count of the number of voxels in I(u) (from Fb) assigned to the foreground for u ∈ {1, 2}. Using (4) to regularize the segmentation objective (1) biases the model to favor solutions where the foregrounds match (w.r.t. the codebook F), leading to more consistent segmentations.\nThe form of Ψ(·, ·) above has a significant impact on the hardness of the problem, and different ideas have been explored [8, 9, 10]. For example, the approach in [8] uses the ℓ1 norm to measure (and penalize) the variation, and requires a Trust Region based method for optimization. The sum of squared differences (SSD) function in [9] leads to partially optimal (half integral) solutions but requires solving a large linear program – infeasible for the image sizes we consider (which are orders of magnitude larger). Recently, [10] substituted Ψ(·, ·) with a so-called reward on histogram similarity. 
This does lead to a polynomial time solvable model, but requires the similarity function to be quite discriminative (otherwise offering a reward might be counter-productive in this setting).\n\n3 Optimization Model\n\nWe start by using the sum of squared differences (SSD) as in [9] to bias the objective function and incorporate epitome awareness within the MRF energy in (1). However, unlike [9], where one seeks a segmentation of both images, here we are provided the second histogram – the epitome (representation) of the specific region of interest. Clearly, this significantly simplifies the resultant Linear Program. Unfortunately, it remains computationally intractable for the high resolution 3-D image volumes (256 × 256 × 128) we consider here (the images are much larger than what is solvable by state of the art LP software, as in [9]). We propose a solution based on a combinatorial method, using ideas from some recent papers on Quadratic Pseudoboolean functions and their applications [18, 19]. This allows us to apply our technique on large scale image volumes, and obtain accurate results quite efficiently. Further, our analysis shows that we can obtain good constant-factor approximations (these are tight under mild conditions). We discuss our formulation next.\nWe first express the objective in (1) with an additional regularization term to penalize histogram dissimilarity using the sum of squared differences. This gives the following simple expression,\n\nmin_{x,z} Σ_{(i∼j)} cij z(1)_ij + Σ_{j=1}^{n} wj0 (1 − x(1)_j) + Σ_{j=1}^{n} wj1 x(1)_j + λ Σ_{b=1}^{β} (⟨H(1)_b, x(1)⟩ − Ĥb)²,\n\nwhere Ĥb stands in for ⟨H(2)_b, x(2)⟩ of (4). Since the epitome (histogram) is provided, the second argument of Ψ(·, ·) in (4) is replaced with Ĥ, and x(1) represents the solution vector for image I(1). 
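As a small sketch of this SSD bias term (the function and variable names are ours, not from the paper's code), the penalty λ Σ_b (⟨H_b, x⟩ − Ĥ_b)² can be evaluated directly from a voxel-to-codeword map and a candidate foreground indicator:

```python
import numpy as np

def ssd_epitome_penalty(bin_of_voxel, x, epitome, n_bins, lam=1.0):
    """SSD between the foreground histogram induced by a segmentation x
    and a given epitome histogram.

    bin_of_voxel: (n,) codeword index of each voxel.
    x:            (n,) binary foreground indicator.
    epitome:      (n_bins,) target counts (the Hhat of the text).
    """
    # <H_b, x> counts the foreground voxels falling in bin b.
    fg_hist = np.bincount(bin_of_voxel[x == 1], minlength=n_bins)
    return lam * float(((fg_hist - epitome) ** 2).sum())
```

A segmentation whose foreground histogram matches the epitome exactly incurs zero penalty; any deviation is charged quadratically, bin by bin.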
In addition, the term wj0 (and wj1) denotes the unary cost of assigning voxel j to the background (and foreground), and λ is a user-specified tunable parameter to control the influence of the histogram variation. This yields\n\nmin_{x,z} Σ_{(i∼j)} cij zij + Σ_{j=1}^{n} wj0 (1 − xj) + Σ_{j=1}^{n} wj1 xj + λ Σ_{b=1}^{β} (⟨Hb, x⟩² − 2⟨Hb, x⟩ Ĥb + Ĥb²)    (5)\n\nsubject to |xi − xj| ≤ zij ∀ (i ∼ j) where i, j ∈ {1, ..., n}, and x, z binary.\n\nThe last term in (5), Ĥb², is constant. So the model reduces to\n\nmin_{x,z} Σ_{(i∼j)} cij zij + Σ_{j=1}^{n} wj0 (1 − xj) + Σ_{j=1}^{n} wj1 xj + λ Σ_{b=1}^{β} (Σ_{j=1}^{n} Σ_{l=1}^{n} Hb(j) Hb(l) xj xl − 2 Σ_{j=1}^{n} Hb(j) xj Ĥb)    (6)\n\ns.t. |xi − xj| ≤ zij ∀ (i ∼ j) where i, j ∈ {1, ..., n}, and x, z binary.\n\nObserve that (6) can be expressed as a special case of the general form Γ(x1, ..., xn) = Σ_{S⊂U} φS Π_{j∈S} xj, where U = {1, ..., n}, x = (x1, ..., xn) ∈ Bⁿ is a binary vector, S is a subset of U, and φS denotes the coefficient of S. Such a function Γ : Bⁿ → R is called a pseudo-Boolean function [18]. If the cardinality of S is no more than two, the corresponding form is\n\nΓ(x1, x2, ..., xn) = Σ_j φj xj + Σ_{(i,j)} φij xi xj.\n\nThese functions are called Quadratic Pseudo-Boolean functions (QPB). In general, if the objective permits a representation as a QPB, an upper (or lower) bound can be derived using roof (or floor) duality [18], recently utilized in several papers [19, 20, 21]. 
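The expansion behind this quadratic form can be carried out mechanically. A minimal sketch of our own (hypothetical names; the constant Ĥ_b² term is dropped): using x_j² = x_j for binary x, the epitome term λ Σ_b (⟨H_b, x⟩ − Ĥ_b)² contributes a linear coefficient λ(1 − 2Ĥ_b) for each voxel j in bin b and a quadratic coefficient 2λ for each pair of voxels sharing a bin.

```python
from collections import defaultdict

def qpb_coefficients(bin_of_voxel, epitome, lam=1.0):
    """Collect the pseudo-Boolean coefficients contributed by
    lam * sum_b (<H_b, x> - Hhat_b)^2, up to an additive constant.

    Returns (linear, quad): {j: coeff} and {(j, l): coeff} with j < l.
    """
    linear = defaultdict(float)
    quad = defaultdict(float)
    members_of_bin = defaultdict(list)
    for j, b in enumerate(bin_of_voxel):
        members_of_bin[b].append(j)
        # x_j^2 = x_j merges the diagonal of the square into the linear part.
        linear[j] += lam * (1.0 - 2.0 * epitome[b])
    for members in members_of_bin.values():
        for a in range(len(members)):
            for c in range(a + 1, len(members)):
                quad[(members[a], members[c])] += 2.0 * lam
    return dict(linear), dict(quad)
```

Evaluating these coefficients on a labeling and adding back the constant Σ_b Ĥ_b² recovers the original SSD penalty, which is one way to sanity-check the expansion.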
Notice that the function in (6) is a QPB because it has at most two variables in each term of the expansion. An advantage of the model derived above is that (pending some additional adjustments) we will be able to leverage an extensive existing combinatorial machinery to solve the problem. We discuss these issues in more detail next.\n\n4 Reparameterization and Graph Construction\n\nNow we discuss a graph construction to optimize the above energy function by computing a maximum flow/minimum cut. We represent each variable as a pair of literals, xj and x̄j, which corresponds to a pair of nodes in a graph G. Edges are added to G based on the various terms in the corresponding QPB. The min-cut computed on G will determine the assignments of variables to 1 (or 0), i.e., foreground/background assignment. Depending on how the nodes for a pair of literals are partitioned, we either get "persistent" integral solutions (same as in the optimal) and/or obtain variables assigned ½ (half integral) values that need additional rounding to obtain a {0, 1} solution.\nWe will first reparameterize the coefficients in our objective as a vector denoted by Φ. More specifically, we express the energy by collecting the unary and pairwise costs in (6) as the coefficients of the linear and quadratic variables. For a voxel j, we denote the unary coefficient as Φj and for a pair of voxels (i, j) we give their corresponding coefficients as Φij. For presentation, we denote spatial adjacency as i ∼ j, and if i and j share a bin in the histogram we denote it as i ∼= j, i.e., ∃b : Hb(i) = Hb(j) = 1. The definition of the pairwise costs includes the following scenarios:\n\nΦij = cij if i ∼ j and i ≁= j and (i, j) are assigned to different labels;\nΦij = λ if i ≁ j and i ∼= j and (i, j) are assigned to the foreground;\nΦij = cij if i ∼ j and i ∼= j and (i, j) are assigned to different labels;\nΦij = λ if i ∼ j and i ∼= j and (i, j) are assigned to the foreground.    (7)\n\nThe above cases enumerate three possible relationships between a pair of voxels (i, j): (i) (i, j) are spatial neighbors but not bin neighbors; (ii) (i, j) are bin neighbors, but not spatial neighbors; (iii) (i, j) are bin neighbors and spatial neighbors. In addition, the cost is also a function of the label assignments to (i, j). Note that we assume i ≠ j above since if i = j, we can absorb those costs in the unary terms (because xi · xi = xi). We define the unary costs for each voxel j next:\n\nΦj = wj0 if j is assigned to the background;\nΦj = wj1 + λ − 2λĤb if j is assigned to the foreground and ∃b : Hb(j) = 1.    (8)\n\nVoxel pairs (i, j)        | i ∼ j, i ≁= j | i ≁ j, i ∼= j | i ∼ j, i ∼= j\n(vi → vj), (v̄j → v̄i)    | ½cij | 0 | ½cij\n(vj → vi), (v̄i → v̄j)    | ½cij | 0 | ½cij\n(v̄j → vi), (v̄i → vj)    | 0 | ½λ | ½λ\nTable 1: Illustration of the edge weights introduced in the graph for voxel pairs.\n\nWith the reparameterization given as Φ = [Φj Φij]ᵀ done, we follow the recipe in [18, 22] to construct a graph (briefly summarized below). 
For each voxel j ∈ I, we introduce two nodes, vj and v̄j. Hence, the size of the graph is 2|I|. We also have two special nodes s and t which denote the source and sink respectively. We connect each node to the source and/or the sink based on the unary costs, assuming that the source (and sink) partitions correspond to foreground (and background). The source is connected to the node vj with weight ½(wj1 + λ − 2λĤb), and to node v̄j with weight ½wj0. Nodes vj and v̄j are in turn connected to the sink with costs ½wj0 and ½(wj1 + λ − 2λĤb) respectively. These edges, if saturated in a max-flow, count towards the node's unary cost. Edges between node pairs (except the source and sink) give the pairwise terms of the energy. These edge weights (see Table 1) quantify all possible relationships of pairwise voxels and label assignments (Fig. 2).\n\nFigure 2: A graph to optimize (6). Nodes in the left box represent vj; nodes in the right box represent v̄j. Colors indicate spatial neighbors (orange) or bin neighbors (green).\n\nA maximum flow/minimum cut procedure on this graph gives a solution to our problem. After the cut, each node (for a voxel) is connected either to the source set or to the sink set. Using this membership, we can obtain a final solution (i.e., labeling) as follows:\n\nxj = 0 if vj ∈ s and v̄j ∈ t;  xj = 1 if vj ∈ t and v̄j ∈ s;  xj = ½ otherwise.    (9)\n\nA property of the solution obtained by (9) is that the variables assigned {0, 1} values are "persistent", i.e., they are the same as in the optimal integral solution to (6). This means that the solution from the algorithm above is partially optimal [18, 20]. We now only need to find an assignment for the ½-valued variables (to 0 or 1) by rounding. 
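A runnable sketch of the two-literal decoding step (our own simplified stand-in, not the authors' C++/QPB solver): a textbook Edmonds-Karp max-flow yields the source side of a minimum s-t cut, and voxel labels are then read off via rule (9), with undetermined (half-integral) pairs rounded up to 1 here as one simple choice. The graph encoding and names are illustrative assumptions.

```python
from collections import defaultdict, deque

def min_cut_source_side(edges, s, t):
    """Edmonds-Karp max-flow on a directed graph given as
    {(u, v): capacity}; returns the source side of a minimum s-t cut."""
    cap = defaultdict(float)
    adj = defaultdict(set)
    for (u, v), c in edges.items():
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # residual (reverse) arcs
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if cap[(u, v)] > 1e-12 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        # Push the bottleneck capacity along the path found.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        f = min(cap[e] for e in path)
        for (u, v) in path:
            cap[(u, v)] -= f
            cap[(v, u)] += f
    # Source side of the cut = nodes reachable in the residual graph.
    side, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if cap[(u, v)] > 1e-12 and v not in side:
                side.add(v)
                q.append(v)
    return side

def decode_labels(source_side, voxels):
    """Rule (9): x_j = 0 if v_j sits with the source and bar-v_j with
    the sink, x_j = 1 in the opposite case; an unsplit pair is the
    half-integral case, rounded up to 1 here (one simple choice)."""
    labels = {}
    for j in voxels:
        vj, nj = j in source_side, ('bar', j) in source_side
        if vj and not nj:
            labels[j] = 0
        elif nj and not vj:
            labels[j] = 1
        else:
            labels[j] = 1  # half-integral -> round up
    return labels
```

On a toy graph where each voxel contributes a (v_j, bar-v_j) pair with asymmetric source/sink capacities, the cut cleanly splits each pair and the decoded labels follow rule (9).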
The rounding strategy and analysis are presented next.\n\n5 Rounding and Approximation analysis\n\nIn general, any reasonable heuristic can be used to round the ½-valued variables to 0 or 1 (e.g., we can solve for and obtain a segmentation for only the ½-valued variables without the additional bias). Our experiments later make use of such a heuristic. The approximation analysis below, however, is based on a more conservative scheme of rounding all ½-valued variables up to 1. We only summarize our main results here; the longer version of the paper includes details.\nA 2-approximation for the objective function (without the epitome bias) is known [16, 12]. The rounding above gives a constant factor approximation.\n\nTheorem 1 The rounding strategy described above gives a feasible solution to Problem (6). This solution is a factor 4 approximation to (6). Further, the approximation ratio is tight for this rounding.\n\n6 Experimental Results\n\nOverview. We now empirically evaluate our algorithm for extracting specific structures of interest from DTI data, focusing on (1) the Corpus Callosum (CC), and (2) the Interior Capsule (IC) as representative examples. Our experiments were designed to answer the following main questions: (i) Can the model reliably and accurately identify the structures of interest? Note that general-purpose white matter segmentation methods do not extract specific regions (which are often obtained via intensive interactive methods instead). Solutions from our algorithm, if satisfactory, can be used directly for analysis or as a warm-start for user-guided segmentations for additional refinement. (ii) Does segmentation with a bias for fidelity with epitomes offer advantages over training a classifier on the same features? 
Clearly, the latter scheme will work nicely if the similarity between\nforeground/background voxels is suf\ufb01ciently discriminative. Our experiments provide evidence that\nepitomes indeed offer advantages. (iii) Finally, we evaluate the advantages of our method in terms of\nrelative effort expended by a user performing interactive extraction of CC and IC from 3-D volumes.\nData and Setup. We acquired 25 Diffusion Tensor brain images in 12 non-collinear diffusion encod-\ning directions (and one b = 0 reference image) with diffusion weighting factor of b = 1000s/mm2.\nStandard image processing included correcting for eddy current related distortion, distortion from\n\ufb01eld inhomogeneities (using \ufb01eld maps), and head motion. From this data, the tensor elements were\nestimated using standard toolboxes (Camino [23]). The images were then hand-segmented (slice\nby slice) by experts to serve as the gold standard segmentation. Within a leave one out cross val-\nidation scheme, we split our set into training (24 images) and test set (hold out image). Epitomes\nwere constructed using training data (by averaging tensor volumes and generating feature codeword\ndictionaries), and then speci\ufb01c structures in the hold out image were segmented using our model.\nCodewords used for the epitome also served to train a SVM classi\ufb01er (on training data), which was\nthen used to label voxels as foreground (part of structure of interest) or background, in the hold-out\nimage. 
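Within this leave-one-out scheme, the epitome itself is simply an aggregate foreground histogram over the expert-segmented training images. A minimal sketch (array layout and names are our assumptions, not the paper's code):

```python
import numpy as np

def build_epitome(codeword_maps, masks, n_bins):
    """Average foreground histogram over expert-segmented training images.

    codeword_maps: list of (n_i,) arrays; codeword index of each voxel.
    masks:         list of (n_i,) binary arrays; expert foreground.
    Returns the mean per-bin count (the epitome Hhat).
    """
    hists = [np.bincount(cw[m == 1], minlength=n_bins)
             for cw, m in zip(codeword_maps, masks)]
    return np.mean(hists, axis=0)
```

The resulting vector plays the role of Ĥ in the objective: the histogram that the segmented foreground of the hold-out image is encouraged to match.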
We present the mean segmentation accuracy over 25 realizations.\n\nFigure 3: WM/GM segmentation (without epitomes) from standard toolkits, overlaid on FA maps (axial, sagittal views shown).\n\nWM/GM DTI segmentation. To briefly elaborate on (i) above, we note that most existing DTI segmentation algorithms in the literature [24] focus on segmenting the entire white-matter (WM) from gray-matter (GM), whereas the focus here is to extract specific structures within the WM pathways, to facilitate the type of analysis being pursued in neuroscience studies [25, 2]. Fig. 3 shows results of a DTI image WM segmentation. Such methods segment WM well but are not designed to identify different components within the WM. Certain recent works [26] have reported success in identifying structures such as the cingulum bundle if a good population specific atlas is available (here, one initializes the segmentation by a sophisticated registration procedure).\nDictionary Generation. A suitable codebook of features (i.e., F from §2.2) is essential to modulate the segmentation (with an uninformative histogram, the process degenerates to an ordinary segmentation without epitomes). Results from our preliminary experiments suggested that the codeword generation must be informed by the properties/characteristics of Diffusion Tensor images. While general purpose feature extractors or interest-point detectors from Vision cannot be directly applied to tensor data, our simple scheme below is derived from these ideas. 
Briefly, by first setting up a neighborhood region around each voxel, we evaluate the local orientation context and shape information from the principal eigenvectors and eigenvalues of the tensors at each neighboring voxel. Similar to Histograms of Oriented Gradients or SIFT, each neighboring voxel casts a vote for the primary eigenvector orientation (weighted by its eigenvalue), which encodes the distribution of tensor orientations in a local neighborhood around the voxel, as a feature vector. These feature vectors are then clustered, and each voxel is 'assigned' to its closest codeword/feature to give H(u). Certain adjustments are needed for structurally sparse regions close to the periphery of the brain surface, where we use all primary eigenvectors in a (larger) neighborhood window. This dictionary generation is not rotationally invariant since the orientations of the eigenvectors are used. Our literature review suggests that there is no 'accepted' strategy for feature extraction from tensor-valued images. While the problem is interesting, the procedure here yields reasonable results for our purpose. We acknowledge that improvements may be possible using more sophisticated approaches.\nImplementation Details. Our implementation in C++ was interfaced with a QPB solver [22, 18]. We used a distance measure proposed in DTI-TK [23], which is popular in the neuroimaging literature, to obtain a similarity measure between tensors. The unary terms for the MRF component were calculated as the least DTI-TK metric distance between the voxel and a set of labels (generated by sampling from the foreground in the training data). Pairwise smoothness terms were calculated using a spatial neighborhood of 18 neighbors. The parameter λ was set to 10 for all runs.\n\n6.1 Results: User guided interactive segmentation, Segmentation with Epitomes and SVMs\n\nUser study for interactive segmentation. 
To assess the amount of effort expended in obtaining a good segmentation of the regions of interest in an interactive manner, we set up a user study with two users who were familiar with (but not experts in) neuroanatomy. The users were presented with the ground truth solution for each image. The user provided "scribbles" denoting foreground/background regions, which were incorporated into the segmentation via must-link/cannot-link constraints. Ignoring the time required for segmentation, typically 20-40 seeds were needed for each 2-D slice/image to obtain results close to the ground-truth segmentations, which required ∼60s of user participation per 3-4 slices. Representative results are presented in Figs. 4–5 (column 5).\nResults from SVM and our model. For comparison, we trained a SVM classifier on the same set of voxel-codewords used for the epitomes. For training, feature vectors for foreground/background voxels from the training images were used, and the learnt function was used to classify voxels in the hold-out image. Representative results are presented in Figs. 4–5, overlaid on 2-D slices of Fractional Anisotropy. We see good consistency between our solutions and the ground truth in Figs. 4–5, whereas the SVM results tend to oversegment, undersegment, or pick up erroneous regions with similar contextual appearance to some voxels in the epitome. It is true that such a classification experiment with better (more discriminative) features will likely perform better; however, it is not clear how to reliably extract good quality features from tensor valued images. The results also suggest that our model exploits the epitome of such features rather well within a segmentation criterion.\nQuantitative Summary. For quantitative evaluations, we computed the Dice Similarity coefficient between the segmentation solution A and the expert segmentation B, given as 2|A ∩ B| / (|A| + |B|). 
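The Dice overlap used above is straightforward to compute on binary masks; a small sketch (function name is ours):

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) between two
    binary segmentation masks (1 = foreground)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```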
On the CC and IC, the Dice coefficients of our solutions were 0.62 ± 0.04 and 0.57 ± 0.05 respectively; the corresponding values for the SVM segmentation were 0.28 ± 0.06 and 0.15 ± 0.02. Hence, using a two-sample t-test, the null hypothesis can be rejected at the α = 0.01 significance level. The running time of our algorithm was comparable to that of the SVM using Shogun (a subset of voxels was used for training): it took ∼ 2 mins for our algorithm to solve the network flow on the graph, and < 4 mins to read in the images and construct the graph. While the results from user-guided interactive segmentation are marginally better than ours, the user study above indicates that a significant level of interaction is required, which is already difficult for large 3-D volumes and becomes impractical for neuroimaging studies with tens of image volumes.

Figure 4: A segmentation of the Corpus Callosum overlaid on FA maps. Rows refer to axial and sagittal views. Columns: (1) Tensors. (2) Ground truth. (3) Our solutions. (4) SVM results. (5) User-guided segmentation.

Figure 5: A segmentation of the Internal Capsules overlaid on FA maps. Rows correspond to axial views. Columns: (1) Tensors. (2) Ground truth. (3) Our solutions. (4) SVM results. (5) User-guided segmentation.

7 Discussion and Conclusions

We present a new combinatorial algorithm for segmenting specific structures from DTI images. Our goal is to segment the structure while maintaining consistency with an epitome of the structure, generated from expert-segmented images (note that this is different from top-down segmentation approaches [27] and from algorithms which use a parametric prior [28, 11]). We see that direct application of max-margin methods does not yield satisfactory results, and inclusion of a segmentation-specific objective function seems essential. Our derived model can be optimized using a network flow procedure. We also prove a factor-4 approximation ratio, which is tight for the proposed rounding mechanism. We present experimental evaluations on a number of large-scale image volumes which show that the approach works well and is also computationally efficient (2-3 mins). Empirical improvements seem possible by designing better methods of feature extraction from tensor-valued images. The model may also serve to incorporate epitomes for general segmentation problems on other images. In summary, our approach shows that many structures of interest in neuroimaging can be accurately extracted from DTI data.

References

[1] J. Burns, D. Job, M. E. Bastin, et al. Structural disconnectivity in schizophrenia: a diffusion tensor magnetic resonance imaging study. The British J. of Psychiatry, 182(5):439–443, 2003.

[2] A. Pfefferbaum and E. Sullivan. Microstructural but not macrostructural disruption of white matter in women with chronic alcoholism. Neuroimage, 15(3):708–718, 2002.

[3] T. Liu, H. Li, K. Wong, et al. Brain tissue segmentation based on DTI data. Neuroimage, 38:114–123, 2007.

[4] Z. Wang and B. Vemuri. DTI segmentation using an information theoretic tensor dissimilarity measure. Trans. on Med. Imaging, 24:1267–1277, 2005.

[5] P. A. Yushkevich, H. Zhang, T. J. Simon, and J. C. Gee. Structure-specific statistical mapping of white matter tracts using the continuous medial representation. In Proc. of MMBIA, 2007.

[6] N. Lawes, T. Barrick, V. Murugam, et al. Atlas based segmentation of white matter tracts of the human brain using diffusion tensor tractography and comparison with classical dissection. Neuroimage, 39:62–79, 2008.

[7] C. B. Goodlett, T. P. Fletcher, J. H. Gilmore, and G. Gerig. Group analysis of DTI fiber tract statistics with application to neurodevelopment.
Neuroimage, 45(1):S133–S142, 2009.

[8] C. Rother, T. Minka, A. Blake, and V. Kolmogorov. Cosegmentation of image pairs by histogram matching: Incorporating a global constraint into MRFs. In Comp. Vision and Pattern Recog., 2006.

[9] L. Mukherjee, V. Singh, and C. Dyer. Half-integrality based algorithms for cosegmentation of images. In Comp. Vision and Pattern Recog., 2009.

[10] D. Hochbaum and V. Singh. An efficient algorithm for co-segmentation. In Intl. Conf. on Comp. Vis., 2009.

[11] D. Batra, A. Kowdle, D. Parikh, et al. iCoseg: Interactive co-segmentation with intelligent scribble guidance. In Comp. Vision and Pattern Recog., 2010.

[12] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. Trans. on Pattern Anal. and Machine Intel., 23(11):1222–1239, 2001.

[13] V. Kolmogorov, Y. Boykov, and C. Rother. Applications of parametric maxflow in computer vision. In Intl. Conf. on Comp. Vision, 2007.

[14] Y. T. Weldeselassie and G. Hamarneh. DT-MRI segmentation using graph cuts. In Medical Imaging: Image Processing, volume 6512 of Proc. SPIE, 2007.

[15] D. Hochbaum. An efficient algorithm for image segmentation, Markov random fields and related problems. J. of the ACM, 48(4):686–701, 2001.

[16] J. Kleinberg and E. Tardos. Approximation algorithms for classification problems with pairwise relationships: Metric partitioning and Markov random fields. J. of the ACM, 49(5):616–639, 2002.

[17] H. Ishikawa. Exact optimization for Markov random fields with convex priors. Trans. on Pattern Anal. and Machine Intel., 25(10):1333–1336, 2003.

[18] E. Boros and P. Hammer. Pseudo-Boolean optimization. Disc. Appl. Math., 123:155–225, 2002.

[19] C. Rother, V. Kolmogorov, V. Lempitsky, and M. Szummer. Optimizing binary MRFs via extended roof duality. In Comp.
Vision and Pattern Recog., 2007.

[20] P. Kohli, A. Shekhovtsov, C. Rother, V. Kolmogorov, et al. On partial optimality in multi-label MRFs. In Intl. Conf. on Machine Learning, 2008.

[21] A. Raj, G. Singh, and R. Zabih. MRFs for MRIs: Bayesian reconstruction of MR images via graph cuts. In Comp. Vision and Pattern Recog., 2006.

[22] V. Kolmogorov and C. Rother. Minimizing nonsubmodular functions with graph cuts - a review. Trans. on Pattern Anal. and Machine Intel., 29(7):1274, 2007.

[23] H. Zhang, P. A. Yushkevich, D. C. Alexander, and J. C. Gee. Deformable registration of diffusion tensor MR images with explicit orientation optimization. Medical Image Analysis, 10:764–785, 2006.

[24] M. Rousson, C. Lenglet, and R. Deriche. Level set and region based surface propagation for diffusion tensor MRI segmentation. In Proc. of CVAMIA-MMBIA, volume 3117 of LNCS, pages 123–134, 2004.

[25] S. M. Smith, M. Jenkinson, H. Johansen-Berg, et al. Tract-based spatial statistics: Voxelwise analysis of multi-subject diffusion data. Neuroimage, 31:1487–1505, 2006.

[26] S. Awate, H. Zhang, and J. Gee. A fuzzy, nonparametric segmentation framework for DTI and MRI analysis with applications to DTI tract extraction. Trans. on Med. Imaging, 26(11):1525–1536, 2007.

[27] E. Borenstein, E. Sharon, and S. Ullman. Combining top-down and bottom-up segmentation. In Comp. Vision and Pattern Recognition Workshop, 2004.

[28] J. Cui, Q. Yang, F. Wen, et al. Transductive object cutout. In Comp. Vision and Pattern Recog., 2008.