Given the eigendecomposition

$$\tilde{Q} = \begin{pmatrix} \tilde{y}_1 & \tilde{y}_2 & \cdots & \tilde{y}_D \\ \tilde{y}_2 & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ \tilde{y}_D & \cdots & \cdots & \tilde{y}_{D(D+1)/2} \end{pmatrix} = V \Sigma V^T$$

and the factorization $\tilde{Z} = \sqrt{\Sigma_{1:P,1:P}}\, V_{:,1:P}^T$, the solution can only be determined up to an orthonormal transformation T, since

$$(V_{:,1:P} T)\, \Sigma_{1:P,1:P}\, (V_{:,1:P} T)^T = (V_{:,1:P} T)\, \Sigma_{1:P,1:P}\, (T^T V_{:,1:P}^T) = V_{:,1:P}\, \Sigma_{1:P,1:P}\, V_{:,1:P}^T \, .$$

To overcome both sources of ambiguity, we rely on image information. Since we consider the case of rigid and non-rigid pose estimation, we can make the typical assumption that we have correspondences between 3D points on the object of interest and 2D image locations [8, 9]. The sign ambiguities result in 2^P discrete solutions. We disambiguate between them by choosing the one that yields the smallest reprojection error. Note, however, that other types of image information, such as silhouettes or texture, could also be employed. To determine the global transformation of Z̃, we similarly rely on 3D-to-2D correspondences; finding a rigid transformation that minimizes the reprojection error of 3D points is a well-studied problem in computer vision, known as the PnP problem. In practice, we employ the closed-form solution of [8] to estimate T.

Figure 2: Estimating the rotation of a plane. (Top) Mean reconstruction error and constraint violation when parameterizing the rotations with quaternions. (Bottom) Similar plots when the rotations are parameterized as rotation matrices. Note that our approach outperforms the baselines and is insensitive to the parameterization used.

3 Experimental Evaluation
In this section, we present our results on rigid and non-rigid reconstruction problems involving quadratic constraints.
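To make the factorization step above concrete, here is a minimal NumPy sketch, ours and not the authors' code, of recovering Z̃ from a predicted matrix of quadratic monomials Q̃ and enumerating the 2^P discrete sign solutions (function names are hypothetical):

```python
import numpy as np

def recover_z(q_tilde, P):
    """Eigendecompose a predicted symmetric matrix Q~ = V Sigma V^T and
    return Z~ = sqrt(Sigma_{1:P,1:P}) V_{:,1:P}^T. The result is defined
    only up to per-row signs (2^P choices) and an orthonormal transform T."""
    w, V = np.linalg.eigh(q_tilde)        # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:P]         # keep the P largest
    w, V = w[idx], V[:, idx]
    w = np.clip(w, 0.0, None)             # the prediction need not be PSD
    return np.sqrt(w)[:, None] * V.T      # P x D matrix Z~

def sign_candidates(Z):
    """Enumerate the 2^P sign solutions; in the text, the one with the
    smallest reprojection error is kept."""
    P = Z.shape[0]
    for bits in range(2 ** P):
        signs = np.array([-1.0 if (bits >> i) & 1 else 1.0 for i in range(P)])
        yield signs[:, None] * Z

# tiny usage example with an exactly rank-2 symmetric matrix
Z_true = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]])
Q = Z_true.T @ Z_true
Z_hat = recover_z(Q, P=2)
assert np.allclose(Z_hat.T @ Z_hat, Q, atol=1e-8)
```

Any of the sign candidates (and any orthonormal rotation of them) reproduces the same Q̃, which is precisely the ambiguity resolved by the reprojection error.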
Samples from the diverse datasets employed are depicted in Fig. 1. As our error measure, we report the mean point-to-point distance between the recovered 3D shape and the ground truth, averaged over 10 partitions for a fixed test set size of 500 examples. Furthermore, we also show error bars representing ± one standard deviation computed over the 10 partitions. These error bars are non-overlapping for all constraint violation plots, as well as for most of the reconstruction errors, which shows that our results are statistically significant. For all experiments, we used a covariance function that is the sum of an RBF and a noise term, and fixed the width of the RBF to the mean squared distance between the training inputs and the noise variance to σ²_n = 0.01. Furthermore, in cases where the number of training examples is smaller than the output dimensionality (i.e., for large deformable meshes and for human poses), we performed principal component analysis on the training outputs to speed up training. To avoid any loss of information, we only removed the components with zero-valued eigenvalues.

3.1 Rotation of a Plane
First, we considered the case of inferring the 3D rotation of the square in Fig. 1(a) given noisy 2D image observations of its corners. Note that this is an instance of the PnP problem. We used two different parameterizations of the rotations: quaternions and rotation matrices. In the first case, the recovered quaternion must have unit norm, i.e., ||Z̃||₂ = 1. In the second case, the recovered rotation matrix must be orthonormal, i.e., Z̃ᵀZ̃ = I.
Fig. 2(a,b) depicts the reconstruction errors obtained with quaternions (top) and rotation matrices (bottom), as a function of the Gaussian noise variance on the 2D image locations when using a training set of 100 examples (a), and as a function of the number of training examples for a Gaussian noise variance of 5 (b).
We compare the results of our approach to those obtained by a GP trained on the original variables, as well as to those of a state-of-the-art PnP method [8], which would be the standard approach to solving this problem. In all cases, our approach outperforms the baselines. More importantly, it performs equally well for both parameterizations of the rotation. Fig. 2(c,d) shows the mean constraint violation for both parameterizations. For quaternions, this error is computed as the absolute difference between the norm of the recovered quaternion and 1. For rotation matrices, it is computed as the Frobenius norm of the difference between Z̃ᵀZ̃ and the identity matrix. Note that in both cases, our approach satisfies the quadratic constraints better than the standard GP. This is especially true for unit norm quaternions, where the results obtained with the GP strongly violate the constraints.

Figure 3: Estimating the 3D shape of a 2 × 2 mesh from 2D image locations. (Top) Mean reconstruction error and constraint violation as a function of the input noise. The global transformation was estimated either (left) from the ground truth, or (middle) using a PnP method [8]. (Bottom) Similar errors shown as a function of the number of training examples.

Figure 4: Estimating the 3D shape of a 9 × 9 mesh from 2D image locations. (Top) Mean reconstruction error and constraint violation as a function of the input noise. The global transformation was estimated either (left) from the ground truth, or (middle) using a PnP method [8]. (Bottom) Similar errors shown as a function of the number of training examples. Note that the global transformations estimated with the PnP method yield poor reconstructions. However, our approach performs best among those that use these transformations.

3.2 Surface Deformations
Next, we considered the problem of estimating the shape of a deforming surface from a single image. In this context, the output space is composed of the 3D locations of the vertices of the mesh that represents the surface, and the quadratic constraints encode the fact that the length of the mesh edges should remain constant. The constraint error measure was taken to be the average over all edges of the percentage of length variation. We compare against two baselines: a GP in the original variables (i.e., the 3D locations of the mesh vertices), and the approach of [12], where the constraints are explicitly enforced at inference. Since our approach only recovers the shape up to a global transformation, we show results estimating this transformation either from the ground-truth data, which can be done by computing an SVD [20], or by applying a PnP method [8]. To make the evaluation fair, we also computed similar global transformations for the baselines.
We tested our approach on the same square as before, but allowed it to deform by letting the edge between its two facets act as a hinge, as shown in Fig. 1(b).
Doing so ensures that the length of the mesh edges remains constant.

Figure 5: Estimating the 3D shape of a 9 × 9 mesh from PHOG features. Mean reconstruction error and constraint violation obtained from (top) well-textured images (Fig. 1(d)), or (bottom) poorly-textured ones (Fig. 1(e)). The global transformation was estimated either (left) from the ground truth, or (middle) using a PnP method [8].

Figure 6: Non-rigid reconstruction from real images. Reconstructions of a piece of paper from 2D image locations. We show the recovered mesh overlaid in red on the original image, and a side view of this mesh.
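Estimating the global transformation from ground-truth data, as in the left panels of the figures above, reduces to the classical least-squares alignment of two 3D point sets, solvable in closed form with an SVD [20]. A compact NumPy sketch of that construction (ours, not the authors' code; `rigid_align` is a hypothetical name):

```python
import numpy as np

def rigid_align(src, dst):
    """Closed-form least-squares rigid transform (R, t) such that
    R @ src[i] + t best matches dst[i], via an SVD of the cross-covariance,
    in the spirit of [20], including the usual reflection guard."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)              # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.eye(3)
    S[2, 2] = np.sign(np.linalg.det(Vt.T @ U.T))   # avoid reflections
    R = Vt.T @ S @ U.T
    t = mu_d - R @ mu_s
    return R, t

# usage: recover a known 90-degree rotation about z plus a translation
pts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
Rz = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 1]])
dst = pts @ Rz.T + np.array([1.0, 2, 3])
R, t = rigid_align(pts, dst)
assert np.allclose(R, Rz) and np.allclose(t, [1, 2, 3])
```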
As before, the inputs to the GP, x, were taken to be the 2D image locations of the corners of the square. Fig. 3 depicts the reconstruction error and constraint violation as a function of the Gaussian noise variance added to the 2D image locations for training sets of 100 examples (top), and as a function of the number of training examples for a Gaussian noise variance of 10 (bottom). Note that our approach is more robust to input noise than the baselines. Furthermore, unlike the standard GP, our approach satisfies the quadratic constraints.
We then tested our approach on the larger mesh shown in Fig. 1(c). In that case, the matrix Z ∈ ℝ^{3×81}. We generated inextensible deformed mesh examples by randomly sampling the values of a subset of the angles between the facets of the mesh. Fig. 4 depicts the results obtained when using the 2D image locations as inputs. As before, our approach is more robust to input noise than the baselines¹. Note that the global transformations estimated with the PnP method tend to be inaccurate and therefore yield poor results; nonetheless, our approach performs best among the methods that utilize the PnP method. We also note that our approach satisfies the constraints better than GP prediction in the original space. The small remaining constraint violation is due to the fact that our prediction is not guaranteed to be of rank 3, so the factorization may entail some loss.
We then considered the more general case of having images as inputs instead of the 2D locations of the mesh vertices. For this purpose, we generated images such as those of Fig. 1(d,e), from which we computed PHOG features [5]. As shown in Fig. 5, our approach outperforms the baselines for all training set sizes.
To demonstrate our method's ability to deal with real images, we reconstructed the deformations of a piece of paper from a video sequence.
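The constraint error used throughout this subsection, the average percentage of edge-length variation over the mesh, is simple to compute. A NumPy sketch (ours; the data layout and names are assumptions), illustrated on a toy 2 × 2 hinge mesh where folding one facet preserves all edge lengths:

```python
import numpy as np

def edge_length_violation(vertices, edges, rest_lengths):
    """Average over all edges of the percentage of length variation,
    the constraint error measure used for the surface experiments."""
    pct = [100.0 * abs(np.linalg.norm(vertices[i] - vertices[j]) - l0) / l0
           for (i, j), l0 in zip(edges, rest_lengths)]
    return float(np.mean(pct))

# toy hinge mesh: fold the x = 1 vertices about the x = 0 edge
flat = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
rest = [1.0, 1.0, 1.0, 1.0]
theta = 0.7
folded = flat.copy()
for k in (1, 3):                       # rotate the facet about the hinge
    folded[k, 0] = flat[k, 0] * np.cos(theta)
    folded[k, 2] = flat[k, 0] * np.sin(theta)
assert edge_length_violation(folded, edges, rest) < 1e-9   # inextensible fold
```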
We used the 2D image locations of the vertices of the mesh as inputs, which were obtained by tracking the surface in 2D using template matching. In this case, the training data was obtained by deforming a piece of cardboard in front of an optical motion capture system. Results for some frames of the sequence are shown in Fig. 6. Note that, for small deformations, the problem is subject to concave-convex ambiguities arising from the insufficient perspective; as a consequence, the recovered shape is less accurate than when the deformations are larger.

¹In [12], the authors proposed to optimize either the pose directly, or the vector of kernel values k∗. The second choice requires having more training examples than the number of constraints. Since this is not always the case here, for this dataset we optimized the pose.

Figure 7: Human pose estimation from different image features.
Mean reconstruction error as a function of the number of training examples for 3 different feature types (the features of [11], PHOG, and SIFT histograms) and with the pose represented in 2 different referentials.

Figure 8: Constraint violation in human pose estimation. Mean constraint violation for 3 different image feature types. Note that the constrained GP [12] best satisfies the constraints, since it explicitly enforces them at inference. However, our approach is more stable than the standard GP.

Figure 9: Human pose estimation from real images. We show the rectified image from [11] and the pose recovered by our approach using PHOG features as input, seen from a different viewpoint.

3.3 Human Pose Estimation
We also applied our method to the problem of estimating the pose of a human skeleton. To this end, we used the HumanEva dataset [17], which consists of synchronized images and motion capture data. In particular, we used the rectified images of [11] and relied on three different image features as input: histograms of SIFT features, PHOG features, and the features of [11]. In this case, Z ∈ ℝ^{3×19}. We performed experiments with two representations of the pose: all poses aligned to a common referential, and all poses in the camera referential. We estimated the global transformation from the ground truth. As shown in Fig. 7, for all feature types our approach outperforms the baselines. Fig. 8 shows the constraint violation for the different settings. Due to our parameterization, the amount of constraint violation induced by our approach is independent of the pose referential. This is in contrast with the standard GP, which is very sensitive to the representation. In addition, we also enforced the constraints at inference, similarly to [12], but starting from our results. As can be observed from the figures, while this reduced the constraint violation, it had very little influence on the reconstruction error. Fig.
9 depicts some of our results obtained from PHOG features.
3.4 Running Time
We compared the running times of our algorithm to those of solving the non-convex constraints at inference [12]. As shown in Table 1, the running times of our algorithm are constant with respect to the overall size of the problem. This is because most of the computation time is spent on the factorization rather than on the prediction. In contrast, enforcing constraints at inference is sensitive to the dimension of the data, as well as to the number of training examples². Therefore, for large, high-dimensional training sets, our algorithm is several orders of magnitude faster than [12] and, as shown above, obtains similar or better accuracies.

Table 1: Running times comparison. Average running times per test example in milliseconds for different datasets and different numbers of training examples, for the constrained GP of [12] and for our approach. Note that, as opposed to [12], our approach is relatively insensitive to the number of training examples and to the dimension of the data.

                        Constr GP [12]            Our approach
Training size           50      250     500      50     250     500
2 × 2 mesh (D = 4)      2.0     5.1     21.3     8.0    7.9     8.0
HumanEva (D = 19)       26.1    49.6    101.0    4.9    4.9     4.8
9 × 9 mesh (D = 81)     1664.9  1625.6  1599.8   9.0    8.9     8.7

Figure 10: Robustness to output noise. Mean reconstruction error and constraint violation as a function of the output noise on the training examples. As a pre-processing step, we projected the noisy training examples to the closest shape that satisfies the constraints, and then trained all approaches with this data. Note that our approach outperforms the baselines.

3.5 Robustness to Noise in the Outputs
As shown in Section 2.2, the mean prediction of a GP satisfies linear constraints under the assumption that the training examples all satisfy these constraints. This suggests that our approach might be sensitive to noise on the training outputs, y. To study this, we added Gaussian noise with variance ranging from 2mm to 10mm on the 3D coordinates of the 2 × 2 deformable mesh of 100mm side (Fig. 1(b)). To overcome the effect of noise, we first pre-processed the training examples by projecting them to the closest shape that satisfies the constraints, in a similar manner as in [12]. We then used these rectified shapes as training data for our approach as well as for the baselines. Fig. 10 depicts the reconstruction error and constraint violation as a function of the output noise.
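A projection of this kind can be sketched, for the edge-length constraints of the mesh experiments, with a simple Gauss-Seidel relaxation that repeatedly restores each edge to its rest length. This is a generic stand-in of our own, not the actual projection procedure of [12]:

```python
import numpy as np

def project_to_edge_lengths(vertices, edges, rest_lengths, iters=100):
    """Move a noisy mesh toward the set of shapes with the prescribed edge
    lengths by relaxing one edge at a time; each correction shifts both
    endpoints symmetrically, so the centroid is preserved."""
    V = np.array(vertices, dtype=float)
    for _ in range(iters):
        for (i, j), l0 in zip(edges, rest_lengths):
            d = V[j] - V[i]
            l = np.linalg.norm(d)
            if l < 1e-12:
                continue
            corr = 0.5 * (l - l0) / l * d
            V[i] += corr
            V[j] -= corr
    return V

# usage: a single stretched edge of rest length 1 is restored exactly
noisy = np.array([[0.0, 0, 0], [2.3, 0, 0]])
proj = project_to_edge_lengths(noisy, [(0, 1)], [1.0])
assert abs(np.linalg.norm(proj[1] - proj[0]) - 1.0) < 1e-6
```

For meshes with shared vertices the edges interact, which is why the relaxation is iterated rather than applied once.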
We used the image locations of the vertices with noise variance 10 as inputs, and N = 100 training examples. Note that our approach outperforms the baselines. Furthermore, our pre-processing step improved the results of all approaches compared to using the original noisy data. Note, however, that in the case of extreme output noise, projecting the training examples onto the constraint space might yield meaningless results. This would have a negative impact on the learned predictor, and thus on the performance of all the methods.
4 Conclusion
In this paper, we have proposed an approach to implicitly enforcing constraints in discriminative prediction. We have shown that the prediction of a GP always satisfies linear constraints if the training data satisfies these constraints. From this result, we have proposed an effective method to enforce quadratic constraints by changing the parameterization of the problem. We have demonstrated on several rigid and non-rigid monocular pose estimation problems that our method outperforms GP regression, as well as enforcing the constraints at inference [12]. Furthermore, we have shown that our algorithm is very efficient, and makes real-time non-rigid reconstruction an achievable goal. In the future, we intend to investigate other types of image information to estimate the global transformation, as well as to study the use of our approach in tasks involving different constraints, such as dynamics.

²For the last dataset, the running times of [12] are independent of N.
This is due to the fact that, in this case, we optimized the pose directly (see footnote 1).

References
[1] A. Agarwal and B. Triggs. 3D human pose from silhouettes by relevance vector regression. In Conference on Computer Vision and Pattern Recognition, 2004.
[2] M. Alvarez and N. D. Lawrence. Sparse convolved Gaussian processes for multi-output regression. In Neural Information Processing Systems, pages 57–64. MIT Press, Cambridge, MA, 2009.
[3] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In ACM SIGGRAPH, pages 187–194, Los Angeles, CA, August 1999.
[4] E. Bonilla, K. M. Chai, and C. Williams. Multi-task Gaussian process prediction. In Neural Information Processing Systems, pages 153–160. MIT Press, Cambridge, MA, 2008.
[5] A. Bosch, A. Zisserman, and X. Munoz. Image classification using random forests and ferns. In International Conference on Computer Vision, 2007.
[6] P. Goovaerts. Geostatistics for Natural Resources Evaluation. Oxford University Press, 1997.
[7] L. Herda, R. Urtasun, and P. Fua. Hierarchical implicit surface joint limits to constrain video-based motion capture. In European Conference on Computer Vision, Prague, Czech Republic, May 2004.
[8] F. Moreno-Noguer, V. Lepetit, and P. Fua. Accurate non-iterative O(n) solution to the PnP problem. In International Conference on Computer Vision, Rio, Brazil, October 2007.
[9] M. Perriollat, R. Hartley, and A. Bartoli. Monocular template-based reconstruction of inextensible surfaces. In British Machine Vision Conference, 2008.
[10] J. Quinonero-Candela and C. E. Rasmussen. A unifying view of sparse approximate Gaussian process regression. Journal of Machine Learning Research, pages 1935–1959, 2006.
[11] G. Rogez, J. Rihan, S. Ramalingam, C. Orrite, and P. Torr. Randomized trees for human pose detection. In Conference on Computer Vision and Pattern Recognition, 2008.
[12] M. Salzmann and R. Urtasun. Combining discriminative and generative methods for 3D deformable surface and articulated pose reconstruction. In Conference on Computer Vision and Pattern Recognition, San Francisco, CA, June 2010.
[13] M. Salzmann, R. Urtasun, and P. Fua. Local deformation models for monocular 3D shape recovery. In Conference on Computer Vision and Pattern Recognition, Anchorage, AK, June 2008.
[14] G. Shakhnarovich, P. Viola, and T. Darrell. Fast pose estimation with parameter-sensitive hashing. In International Conference on Computer Vision, Nice, France, 2003.
[15] S. Shen, W. Shi, and Y. Liu. Monocular template-based tracking of inextensible deformable surfaces under L2-norm. In Asian Conference on Computer Vision, 2009.
[16] H. Sidenbladh, M. J. Black, and D. J. Fleet. Stochastic tracking of 3D human figures using 2D image motion. In European Conference on Computer Vision, June 2000.
[17] L. Sigal and M. J. Black. HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion. Technical Report CS-06-08, Brown University, 2006.
[18] C. Sminchisescu and B. Triggs. Kinematic jump processes for monocular 3D human tracking. In Conference on Computer Vision and Pattern Recognition, volume I, page 69, Madison, WI, June 2003.
[19] E. Snelson, C. E. Rasmussen, and Z. Ghahramani. Warped Gaussian processes. In Neural Information Processing Systems. MIT Press, Cambridge, MA, 2004.
[20] S. Umeyama. Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(4), April 1991.
[21] R. Urtasun and T. Darrell. Sparse probabilistic regression for activity-independent human pose inference. In Conference on Computer Vision and Pattern Recognition, Anchorage, AK, 2008.
[22] R. Urtasun, D. Fleet, A. Hertzmann, and P. Fua. Priors for people tracking from small training sets. In International Conference on Computer Vision, Beijing, China, October 2005.