{"title": "Probabilistic amplitude and frequency demodulation", "book": "Advances in Neural Information Processing Systems", "page_first": 981, "page_last": 989, "abstract": "A number of recent scientific and engineering problems require signals to be decomposed into a product of a slowly varying positive envelope and a quickly varying carrier whose instantaneous frequency also varies slowly over time. Although signal processing provides algorithms for so-called amplitude- and frequency-demodulation (AFD), there are well known problems with all of the existing methods. Motivated by the fact that AFD is ill-posed, we approach the problem using probabilistic inference. The new approach, called probabilistic amplitude and frequency demodulation (PAFD), models instantaneous frequency using an auto-regressive generalization of the von Mises distribution, and the envelopes using Gaussian auto-regressive dynamics with a positivity constraint. A novel form of expectation propagation is used for inference. We demonstrate that although PAFD is computationally demanding, it outperforms previous approaches on synthetic and real signals in clean, noisy and missing data settings.", "full_text": "Probabilistic amplitude and frequency demodulation

Richard E. Turner*  (ret26@cam.ac.uk)
Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ, UK

Maneesh Sahani  (maneesh@gatsby.ucl.ac.uk)
Gatsby Computational Neuroscience Unit, University College London, Alexandra House, 17 Queen Square, London, WC1N 3AR, UK

Abstract

A number of recent scientific and engineering problems require signals to be decomposed into a product of a slowly varying positive envelope and a quickly varying carrier whose instantaneous frequency also varies slowly over time. 
Although signal processing provides algorithms for so-called amplitude- and frequency-demodulation (AFD), there are well-known problems with all of the existing methods. Motivated by the fact that AFD is ill-posed, we approach the problem using probabilistic inference. The new approach, called probabilistic amplitude and frequency demodulation (PAFD), models instantaneous frequency using an auto-regressive generalization of the von Mises distribution, and the envelopes using Gaussian auto-regressive dynamics with a positivity constraint. A novel form of expectation propagation is used for inference. We demonstrate that although PAFD is computationally demanding, it outperforms previous approaches on synthetic and real signals in clean, noisy and missing data settings.

1 Introduction

Amplitude and frequency demodulation (AFD) is the process by which a signal (y_t) is decomposed into the product of a slowly varying envelope or amplitude component (a_t) and a quickly varying sinusoidal carrier (cos(φ_t)), that is y_t = a_t cos(φ_t). In its general form this is an ill-posed problem [1], and so algorithms must impose implicit or explicit assumptions about the form of the carrier and envelope to realise a solution. In this paper we make the standard assumption that the amplitude variables are slowly varying positive variables, and that the derivatives of the carrier phase, ω_t = φ_t − φ_{t−1}, called the instantaneous frequencies (IFs), are also slowly varying.

It has been argued that the subbands of speech are well characterised by such a representation [2, 3], and so AFD has found a range of applications in audio processing, including audio coding [4, 2], speech enhancement [5] and source separation [6]; it is also used in hearing devices [5]. AFD has been used as a scientific tool to investigate the perception of sounds [7]. AFD is also of importance in neural signal processing applications. 
Aggregate field measurements, such as those collected at the scalp by electroencephalography (EEG) or within tissue as local field potentials, often exhibit transient sharp spectral lines at characteristic frequencies. Within each such band, both the amplitude of the oscillation and the precise center frequency may vary with time, and both of these phenomena may reveal important elements of the mechanism by which the field oscillation arises.

*Richard Turner would like to thank the Laboratory for Computational Vision, New York University, New York, NY 10003-6603, USA, where he carried out this research.

Despite the fact that AFD has found a wide range of important applications, there are well-known problems with existing AFD algorithms [8, 1, 9, 10, 5]. Because of these problems, the Hilbert method, which recovers an amplitude from the magnitude of the analytic signal, is still considered to be the benchmark despite a number of limitations [11, 12]. In this paper, we show examples of demodulation of synthetic, audio, and hippocampal theta rhythm signals using various AFD techniques that highlight some of the anomalies associated with existing methods.

Motivated by the deficiencies in the existing methods, this paper develops a probabilistic form of AFD. This development begins in the next section, where we reinterpret two existing probabilistic algorithms in the context of AFD. 
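Since the Hilbert method serves as the benchmark throughout, it is worth stating concretely: the amplitude is the magnitude of the analytic signal, and the IF is the derivative of its phase. A minimal numpy-only sketch, building the analytic signal directly via the FFT (illustrative only, not the exact implementation used in the experiments):

```python
import numpy as np

def analytic_signal(y):
    """Analytic signal via the standard frequency-domain construction."""
    n = len(y)
    Y = np.fft.fft(y)
    h = np.zeros(n)          # weights: keep DC, double positive frequencies
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0      # Nyquist bin kept with unit weight
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(Y * h)

def hilbert_demodulate(y, fs):
    """Hilbert AFD: amplitude = |analytic signal|, IF = phase derivative."""
    z = analytic_signal(y)
    a = np.abs(z)
    phase = np.unwrap(np.angle(z))
    f_inst = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency, Hz
    return a, f_inst
```

For a clean sinusoid this recovers the amplitude and frequency essentially exactly; it is the behavior of this estimator on noisy, low-amplitude regions that motivates the probabilistic treatment developed below.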
The limitations of these methods suggest an improved model (section 2), which we demonstrate on a range of synthetic and natural signals (sections 4 and 5).

1.1 Simple models for probabilistic amplitude and frequency demodulation

In this paper, we view demodulation as an estimation problem in which a signal is fit with a sinusoid of time-varying amplitude and phase,

y_t = ℜ(a_t exp(iφ_t)) + ε_t.   (1)

The expression also includes a noise term, which will be modeled as a zero-mean Gaussian with variance σ²_y, that is p(ε_t) = Norm(ε_t; 0, σ²_y). We are interested in the situation where the IF of the sinusoid varies slowly around a mean value ω̄. In this case, the phase can be expressed in terms of the integrated mean frequency and a small perturbation, φ_t = ω̄t + θ_t.

Clearly, the problem of inferring a_t and θ_t from y_t is ill-posed, and results will depend on the specification of prior distributions over the amplitude and phase perturbation variables. Our goal in this paper is to specify such prior distributions directly, but this will require the development of new techniques to handle the resulting non-linearities. A simpler alternative is to generate the sinusoidal signal from a rotating two-dimensional phasor. For example, re-parametrizing the likelihood in terms of the components x_{1,t} = a_t cos(θ_t) and x_{2,t} = a_t sin(θ_t) yields a linear likelihood function,

y_t = a_t (cos(ω̄t) cos(θ_t) − sin(ω̄t) sin(θ_t)) + ε_t = cos(ω̄t) x_{1,t} − sin(ω̄t) x_{2,t} + ε_t = w_t^T x_t + ε_t.

Here the phasor components, which have been collected into a vector x_t^T = [x_{1,t}, x_{2,t}], are multiplied by time-varying weights, w_t^T = [cos(ω̄t), −sin(ω̄t)]. To complete the model, prior distributions can now be specified over x_t. 
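The re-parametrization above is nothing more than the cosine angle-addition identity; a quick numerical check with arbitrary illustrative values (variable names are ours):

```python
import numpy as np

a, theta = 1.7, 0.3                 # arbitrary amplitude and phase perturbation
wbar = 2 * np.pi * 50.0             # mean (angular) frequency
t = np.linspace(0.0, 1.0, 200)

# phasor components of the likelihood re-parametrization
x1, x2 = a * np.cos(theta), a * np.sin(theta)

# the two forms of the noiseless likelihood mean agree
lhs = a * np.cos(wbar * t + theta)
rhs = np.cos(wbar * t) * x1 - np.sin(wbar * t) * x2
assert np.allclose(lhs, rhs)

# amplitude and phase are recovered from the phasor components
assert np.isclose(np.hypot(x1, x2), a)
assert np.isclose(np.arctan2(x2, x1), theta)
```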
One choice that results in a particularly simple inference algorithm is a Gaussian one-step auto-regressive (AR(1)) prior,

p(x_{k,t} | x_{k,t−1}) = Norm(x_{k,t}; λ x_{k,t−1}, σ²_x).   (2)

When the dynamical parameter tends to unity (λ → 1) and the dynamical noise variance to zero (σ²_x → 0), the dynamics become very slow, and this slowness is inherited by the phase perturbations and amplitudes. This model is an instance of the Bayesian Spectrum Estimation (BSE) model [13] (when λ = 1), but re-interpreted in terms of amplitude- and frequency-modulated sinusoids, rather than fixed frequency basis functions. As the model is a linear Gaussian state space model, exact inference proceeds via the Kalman smoothing algorithm.

Before discussing the properties of BSE in the context of fitting amplitude- and frequency-modulated sinusoids, we derive an equivalent model by returning to the likelihood function (eq. 1). Now the full complex representation of the sinusoid is retained. As before, the real part corresponds to the observed data, but the imaginary part is now treated explicitly as missing data,

y_t = ℜ(x_{1,t} cos(ω̄t) − x_{2,t} sin(ω̄t) + i x_{1,t} sin(ω̄t) + i x_{2,t} cos(ω̄t)) + ε_t.   (3)

The new form of the likelihood function can be expressed in vector form, y_t = [1, 0] z_t + ε_t, using a new set of variables, z_t, which are rotated versions of the original variables, z_t = R(ω̄t) x_t, where

R(θ) = [cos(θ), −sin(θ); sin(θ), cos(θ)].   (4)

An auto-regressive expression for the new variables, z_t, can now be found using the fact that rotation matrices commute, R(θ_1 + θ_2) = R(θ_1)R(θ_2) = R(θ_2)R(θ_1), together with the expression for the dynamics of the original variables, x_t (eq. 
2),

z_t = λ R(ω̄) R(ω̄(t−1)) x_{t−1} + R(ω̄t) ε_t = λ R(ω̄) z_{t−1} + ε′_t,   (5)

where the noise ε′_t is zero-mean Gaussian with covariance ⟨ε′_t ε′_t^T⟩ = R(ω̄t) ⟨ε_t ε_t^T⟩ R^T(ω̄t) = σ²_x I. This equivalent formulation of the BSE model is called the Probabilistic Phase Vocoder (PPV) [14]. Again, exact inference is possible using the Kalman smoothing algorithm.

1.2 Problems with simple models for probabilistic amplitude and frequency demodulation

BSE-PPV is used to demodulate synthetic and natural signals in Figs. 1, 2 and 7. The decomposition is compared to the Hilbert method. These examples immediately reveal several problems with BSE-PPV. Perhaps most unsatisfactory is the fact that the IF estimates are often ill behaved, to the extent that they go negative, especially in regions where the amplitude of the signal is low. It is easy to understand why this occurs by considering the prior distribution over amplitude and phase implied by our choice of prior distribution over x_t (or equivalently over z_t),

p(a_t, φ_t | a_{t−1}, φ_{t−1}) = (a_t / 2πσ²_x) exp[ −(1/2σ²_x)(a²_t + λ² a²_{t−1}) + (λ/σ²_x) a_t a_{t−1} cos(φ_t − φ_{t−1} − ω̄) ].   (6)

Phase and amplitude are dependent in the implied distribution, which is conditionally a uniform distribution over phase when the amplitude is zero and a strongly peaked von Mises distribution [15] when the amplitude is large. Consequently, the model favors more highly variable IFs at low amplitudes. In some applications this may be desirable, but for signals like sounds it presents a problem. First, it may assign substantial probability to unphysical negative IFs. 
Second, the same noiseless signal at different intensities will yield different estimated IF content. Third, the complex coupling makes it difficult to select domain-appropriate time-scale parameters. Consideration of IF reveals yet another problem. When the phase perturbations vary slowly (λ → 1), there is no correlation between successive IFs (⟨ω_t ω_{t−1}⟩ − ⟨ω_t⟩⟨ω_{t−1}⟩ → 0). One of the main goals of the model was to capture correlated IFs through time, and the solution is to move to priors with higher order temporal dependencies.

In the next section we will propose a new model for PAFD which addresses these problems, retaining the same likelihood function but modifying the prior to include independent distributions over the phase and amplitude variables.

[Figure 1: three rows of panels (Hilbert, BSE/PPV, PAFD), each showing the signal with envelope estimates (left) and frequency estimates in Hz against time in seconds (right).]

Figure 1: Comparison of AFD methods on a sinusoidally amplitude- and frequency-modulated sinusoid in broad-band noise. Estimated values are shown in red. The gray areas show the region where the true amplitude falls below the noise floor (a < σ_y) and the estimates become less accurate. 
See section 4 for details.

2 PAFD using auto-regressive and generalized von Mises distributions

We have argued that the amplitude and phase variables in a model for PAFD should be independently parametrized, but that this introduces difficulties, as the likelihood is highly non-linear in these variables. This section and the next develop the tools necessary to handle this non-linearity.

[Figure 2: waveform with envelope estimates (top) and IF estimates superposed on the signal spectrum, in Hz against time in seconds (bottom panels).]

Figure 2: AFD of a starling song. Top: The original waveform with estimated envelopes, shifted apart vertically to aid visualization. The light gray bar indicates the problematic low amplitude region. Bottom panels: IF estimates superposed onto the spectrum of the signal. PAFD tracks the FM/AM well, but the other methods have artifacts.

An important initial consideration is whether to use a representation for phase which is wrapped, θ ∈ (−π, π], or unwrapped, θ ∈ ℝ. Although the latter has the advantage of implying simpler dynamics, it leads to a potential infinity of local modes at multiples of 2π, making inference extremely difficult. It is therefore necessary to work with wrapped phases, and a sensible starting point for a prior is thus the von Mises distribution,

p(θ | k, μ) = (1 / 2π I_0(k)) exp(k cos(θ − μ)) = vonMises(θ; k, μ).   (7)

The two parameters, the concentration (k) and the mean (μ), determine the circular variance and mean of the distribution respectively. The normalizing constant is given by a modified Bessel function of the first kind, I_0(k). 
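As a sanity check, the von Mises density of eq. 7 should integrate to one over any period; a numpy-only sketch, using np.i0 for I_0(k) and a midpoint rule on a uniform grid:

```python
import numpy as np

def von_mises_pdf(theta, k, mu):
    # eq. 7: normalizer 2*pi*I0(k); np.i0 evaluates I0
    return np.exp(k * np.cos(theta - mu)) / (2 * np.pi * np.i0(k))

# midpoint rule over one period; for smooth periodic integrands on a
# uniform grid this converges to machine precision
grid = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
for k in (0.1, 1.0, 5.0):
    Z = von_mises_pdf(grid, k, 0.4).mean() * 2 * np.pi
    assert abs(Z - 1.0) < 1e-8
```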
Crucially for our purposes, the von Mises distribution can be obtained by taking a bivariate isotropic Gaussian with an arbitrary mean and conditioning onto the unit circle (this connects with BSE-PPV, see eq. 6). The generalized von Mises distribution is formed in an identical way when the bivariate Gaussian is anisotropic [16]. These constructions suggest a simple extension to time-series data: conditioning a bivariate Gaussian time-series onto the unit circle at all sample times. For example, when two independent Gaussian AR(2) distributions are used to construct the prior, we have

p(x_{1:2,1:T}) ∝ ∏_{t=1}^T 1(x²_{1,t} + x²_{2,t} = 1) ∏_{m=1}^2 Norm(x_{m,t}; λ_1 x_{m,t−1} + λ_2 x_{m,t−2}, σ²_x),   (8)

where 1(x²_{1,t} + x²_{2,t} = 1) is an indicator function representing the unit circle constraint. Upon a change of variables, x_{1,t} = cos(θ_t) and x_{2,t} = sin(θ_t), this yields

p(θ_{1:T} | k_1, k_2) ∝ ∏_{t=1}^T exp(k_1 cos(θ_t − θ_{t−1}) + k_2 cos(θ_t − θ_{t−2})),   (9)

where k_1 = λ_1(1 − λ_2)/σ²_x and k_2 = λ_2/σ²_x. One of the attractive features of this prior is that when it is combined with the likelihood (eq. 1), the resulting posterior distribution over phase variables is a temporal version of the generalized von Mises distribution. That is, it can be expressed as a bivariate anisotropic Gaussian which is constrained to the unit circle. It is this representation which will prove essential for inference.

Having established a candidate prior over phases, we turn to the amplitude variables. 
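Before moving on, the qualitative behavior of the phase prior of eq. 9 can be illustrated by forward simulation. The sketch below samples a directed analogue: at each step the two cosine potentials are combined into a single resultant R cos(θ − φ), so the conditional is von Mises. This is an illustration of the implied slow IF drift, not an exact sampler for the undirected prior, and the parameter values are ours:

```python
import numpy as np

def sample_phase_path(T, k1, k2, seed=0):
    """Forward-sample a wrapped phase path from a directed analogue of the
    AR(2) phase prior: combine k1*cos(th - th[t-1]) + k2*cos(th - th[t-2])
    into one resultant R*cos(th - phi), then draw th[t] ~ vonMises(phi, R)."""
    rng = np.random.default_rng(seed)
    th = np.zeros(T)
    for t in range(2, T):
        c = k1 * np.cos(th[t - 1]) + k2 * np.cos(th[t - 2])
        s = k1 * np.sin(th[t - 1]) + k2 * np.sin(th[t - 2])
        R, phi = np.hypot(c, s), np.arctan2(s, c)
        th[t] = rng.vonmises(phi, R)
    return th

# k1, k2 implied by slow AR(2) dynamics (lam1, lam2, var chosen for illustration)
lam1, lam2, var = 1.9, -0.95, 0.01
k1, k2 = lam1 * (1 - lam2) / var, lam2 / var
theta = sample_phase_path(1000, k1, k2)
# wrapped one-step phase increments, i.e. the IF perturbations
dtheta = np.angle(np.exp(1j * np.diff(theta)))
```

Note that for smooth dynamics λ_2 < 0, so k_2 is negative: the second potential repels θ_t from θ_{t−2}, which is what produces persistence in the frequency perturbations rather than a simple random walk.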
With one eye upon the fact that the prior over phases can be interpreted as the product of a Gaussian and a constraint, we employ a prior of a similar form for the amplitude variables: a truncated Gaussian AR(τ) process,

p(a_{1:T} | λ_{1:τ}, σ²) ∝ ∏_{t=1}^T 1(a_t ≥ 0) Norm(a_t; ∑_{t′=1}^τ λ_{t′} a_{t−t′}, σ²).   (10)

The model formed from equations 1, 9 and 10 will be termed Probabilistic Amplitude and Frequency Demodulation. PAFD is closely related to the BSE-PPV model [13, 14]. Moreover, when the phase variables are drawn from a uniform distribution (k_1 = k_2 = 0), it reduces to the convex amplitude demodulation model [17], which itself is a form of probabilistic amplitude demodulation [18, 19, 20]. The AR prior over phases has also been used in a regression setting [21].

3 Inference via expectation propagation

The PAFD model introduced in the last section contains three separate types of non-linearity: the multiplicative interaction in the likelihood, the unit circle constraint, and the positivity constraint. Of these, it is the circular constraint which is most challenging, as the development of general-purpose machine learning methods for handling hard, non-convex constraints is an open research problem. Following [22], we propose a novel method which uses expectation propagation (EP) [23] to replace the hard constraints with soft, local Gaussian approximations which are iteratively refined.

In order to apply EP, the model is first rewritten into a simpler form. Making use of the fact that an AR(τ) process can be rewritten as an equivalent multi-dimensional AR(1) model with τ states, we concatenate the latent variables into an augmented state vector, s_t^T = [a_t, a_{t−1}, ..., a_{t−τ+1}, x_{1,t}, x_{2,t}, x_{1,t−1}, x_{2,t−1}], and express the model as a product of clique potentials in terms of this variable,

p(y_{1:T}, s_{1:T}) ∝ ∏_{t=1}^T π_t(s_t, s_{t−1}) ψ_t(s_{1,t}, s_{1+τ,t}, s_{2+τ,t}), where π_t(s_t, s_{t−1}) = Norm(s_t; Λ_s s_{t−1}, Σ_s),

ψ_t(a_t, x_{1,t}, x_{2,t}) = Norm(y_t; a_t (cos(ω̄t) x_{1,t} − sin(ω̄t) x_{2,t}), σ²_y) 1(a_t ≥ 0) 1(x²_{1,t} + x²_{2,t} = 1).

(See the supplementary material for details of the dynamical matrices Λ_s and Σ_s.) In this new form the constraints have been incorporated with the non-linear likelihood into the potential ψ_t, leaving a standard Gaussian dynamical potential π_t(s_t, s_{t−1}). Using EP, we approximate the posterior distribution using a product of forward, backward and constrained-likelihood messages [24],

q(s_{1:T}) = ∏_{t=1}^T α_t(s_t) β_t(s_t) ψ̃_t(a_t, x_{1,t}, x_{2,t}) = ∏_{t=1}^T q_t(s_t).   (11)

The messages should be interpreted as follows: α_t(s_t) is the effect of π_t(s_{t−1}, s_t) and q(s_{t−1}) on the belief q(s_t), whilst β_t(s_t) is the effect of π_{t+1}(s_t, s_{t+1}) and q(s_{t+1}) on the belief q(s_t). Finally, ψ̃_t(a_t, x_{1,t}, x_{2,t}) is the effect of the likelihood and the constraints on the belief q(s_t). All of these messages will be un-normalized Gaussians. The updates for the messages can be found by removing from q(s_{1:T}) the messages that correspond to the effect of a particular potential, replacing them with the corresponding potential itself, and then updating the deleted messages by moment matching the two distributions. The updates for the forward and backward messages are a straightforward application of EP and result in updates that are nearly identical to those used for Kalman smoothing. 
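The constrained-likelihood update that follows must integrate over the amplitude variable, which reduces to the moments of a Gaussian restricted to a ≥ 0. Since this computation is standard, here is a self-contained stdlib-only sketch (not the authors' implementation):

```python
import math

def truncated_gaussian_moments(mu, sigma):
    """Mean and variance of x ~ Norm(mu, sigma^2) conditioned on x >= 0."""
    alpha = -mu / sigma
    # standard normal pdf at alpha, and P(x >= 0) via the upper tail
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    Z = 0.5 * math.erfc(alpha / math.sqrt(2.0))
    lam = pdf / Z                       # inverse Mills ratio
    mean = mu + sigma * lam
    var = sigma ** 2 * (1.0 + alpha * lam - lam ** 2)
    return mean, var

# half-normal special case: mean = sqrt(2/pi), variance = 1 - 2/pi
mean_hn, var_hn = truncated_gaussian_moments(0.0, 1.0)
```

When the untruncated mass already sits well inside the positive half-line (e.g. mu = 5, sigma = 0.1), the truncation has negligible effect and the moments revert to (mu, sigma^2), which is a useful correctness check.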
The updates for the constrained likelihood potential are more complicated:

update ψ̃_t such that q(s_t) matches the moments of p̂_ψ(s_t) = α_t(s_t) β_t(s_t) ψ_t(a_t, x_{1,t}, x_{2,t}).   (12)

The difficulty is the moment computation, which we evaluate in two stages. First, we integrate over the amplitude variable, which involves computing the moments of a truncated Gaussian and is therefore computationally efficient. Second, we numerically integrate over the one-dimensional phase variable. For the details we again refer the reader to the supplementary material.

A standard forward-backward message update schedule was used. Adaptive damping improved the numerical stability of the algorithm substantially. The computational complexity of PAFD is O(T(N + τ³)), where N is the number of points used to compute the integral over the phase variable. For the experiments we used a second order process over the amplitude variables (τ = 2) and N = 1000 integration points. In this case, the 16-32 forward-backward passes required for convergence took one minute on a modern laptop computer for signals of length T = 1000.

4 Application to synthetic signals

One of the main challenges posed by the evaluation of AFD algorithms is that the ground truth for real-world signals is unknown. This means that a quantitative comparison of different schemes must take an indirect approach. The first set of evaluations presented here uses synthetic signals, for which the ground truth is known. In particular, we consider amplitude- and frequency-modulated sinusoids, y_t = a_t cos(θ_t), where a_t = 1 + sin(2π f_a t) and (1/2π) dθ/dt = f̄ + Δf sin(2π f_f t), which have been corrupted by Gaussian noise. Fig. 1 compares AFD of one such signal (f̄ = 50 Hz, f_a = 8 Hz, f_f = 5 Hz and Δf = 25 Hz) by the Hilbert, BSE-PPV and PAFD methods. Fig. 
3 summarizes the results at different noise levels in terms of the signal-to-noise ratio (SNR) of the estimated variables and the reconstructed signal, i.e. SNR(a) = 10 log_10 ∑_{t=1}^T a_t² − 10 log_10 ∑_{t=1}^T (a_t − â_t)². PAFD consistently outperforms the other methods by this measure. Furthermore, Fig. 4 demonstrates that PAFD can be used to accurately reconstruct missing sections of this signal, outperforming BSE-PPV.

[Figure 3: three panels plotting the SNR of â, ω̂ and ŷ against the SNR of the input signal for the Hilbert, BSE-PPV and PAFD methods.]

Figure 3: Noisy synthetic data. SNR of estimated variables as a function of the SNR of the signal. Envelopes (left), IFs (center) and denoised signal (right). Solid markers denote examples in Fig. 1.

5 Application to real world signals

Having validated PAFD on simple synthetic examples, we now consider real-world signals. Birdsong is used as a prototypical signal, as it has strong frequency-modulation content. We isolate a 300 ms component of a starling song using a bandpass filter and apply AFD. Fig. 2 shows that PAFD can track the underlying frequency modulation even though there is noise in the signal which causes the other methods to fail. This example forms the basis of two important robustness and consistency tests. In the first, spectrally matched noise is added to the signal and the IFs and amplitudes are re-estimated and compared to those derived from the clean signal. Fig. 5 shows that the PAFD method is considerably more robust to this manipulation than both the Hilbert and BSE-PPV methods. 
In the second test, regions of the signal are removed and the model's predictions for the missing regions are compared to the estimates derived from the clean signal (see Fig. 6). Once again PAFD is more accurate. As a final test of PAFD, we consider the important neuroscientific task of estimating the phase, or equivalently the IF, of theta oscillations in an EEG signal. The EEG signal typically contains broadband noise, and so a conventional analysis applies a band-pass filter before using the Hilbert method to estimate the IF. Although this improves the estimates markedly, the noise component cannot be completely eradicated, which leads to artifacts in the IF estimates (see Fig. 7). In contrast,

[Figure 4: SNR of the estimates for PAFD and BSE-PPV as a function of gap duration, with two reconstruction examples below.]

Figure 4: Missing synthetic data experiments. TOP: SNR of estimated variables as a function of gap duration in the input signal. Envelopes (left), IFs (center) and denoised signal (right). Solid markers indicate the examples shown in the bottom rows of the figure. BOTTOM: Two examples of PAFD reconstruction. 
Light gray regions indicate missing sections of the signal.

[Figure 5: SNR of â (left) and ω̂ (right), relative to the clean-signal estimates, as a function of input SNR for the Hilbert, BSE-PPV and PAFD methods.]

Figure 5: Noisy bird song experiments. SNR of estimated variables as compared to those estimated from the clean signal, as a function of the SNR of the input signal. Envelopes (left), IFs (right).

PAFD returns sensible estimates from both the filtered and original signal. Critically, both estimates are similar to one another, suggesting the new estimation scheme is reliable.

6 Conclusion

Amplitude and frequency demodulation is a difficult, ill-posed estimation problem. We have developed a new inferential solution, called probabilistic amplitude and frequency demodulation, which employs a von Mises time-series prior over phase, constructed by conditioning a bivariate Gaussian auto-regressive distribution onto the unit circle. 
The construction naturally leads to an expectation propagation inference scheme which approximates the hard constraints using soft, local Gaussians.

[Figure 6: SNR of the estimates for PAFD and BSE-PPV as a function of gap duration, with two reconstruction examples below.]

Figure 6: Missing natural data experiments. TOP: SNR of estimated variables as a function of gap duration in the input signal. Envelopes (left), IFs (center) and denoised signal (right). Solid markers indicate the examples shown in the bottom rows of the figure. BOTTOM: Two examples of PAFD reconstruction. Light gray regions indicate missing sections of the signal.

[Figure 7: envelope and IF estimates for the Hilbert, BSE-PPV and PAFD methods on raw (left) and band-pass filtered (right) EEG.]

Figure 7: Comparison of AFD methods on EEG data. 
The left hand side shows estimates derived from the raw EEG signal, whilst the right shows estimates derived from a band-pass filtered version. The gray areas show the region where the true amplitude falls below the noise floor (a < σ_y), where conventional methods fail.

We have demonstrated the utility of the new method on synthetic and natural signals, where it outperformed conventional approaches. Future research will consider extensions of the model to multiple sinusoids, and learning the model parameters so that the algorithm can adapt to novel signals.

Acknowledgments
Richard Turner was funded by the EPSRC, and Maneesh Sahani by the Gatsby Charitable Foundation.

References

[1] P. J. Loughlin and B. Tacer. On the amplitude- and frequency-modulation decomposition of signals. The Journal of the Acoustical Society of America, 100(3):1594-1601, 1996.
[2] J. L. Flanagan. Parametric coding of speech spectra. The Journal of the Acoustical Society of America, 68:412-419, 1980.
[3] P. Clark and L. E. Atlas. Time-frequency coherent modulation filtering of nonstationary signals. IEEE Transactions on Signal Processing, 57(11):4323-4332, November 2009.
[4] J. L. Flanagan and R. M. Golden. Phase vocoder. Bell System Technical Journal, pages 1493-1509, 1966.
[5] S. M. Schimmel. Theory of Modulation Frequency Analysis and Modulation Filtering, with Applications to Hearing Devices. PhD thesis, University of Washington, 2007.
[6] L. E. Atlas and C. Janssen. Coherent modulation spectral filtering for single-channel music source separation. In Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, 2005.
[7] Z. M. Smith, B. Delgutte, and A. J. Oxenham. Chimaeric sounds reveal dichotomies in auditory perception. Nature, 416(6876):87-90, 2002.
[8] J. Dugundji. Envelopes and pre-envelopes of real waveforms. 
IEEE Transactions on Information Theory, 4:53-57, 1958.
[9] O. Ghitza. On the upper cutoff frequency of the auditory critical-band envelope detectors in the context of speech perception. The Journal of the Acoustical Society of America, 110(3):1628-1640, 2001.
[10] F. G. Zeng, K. Nie, S. Liu, G. Stickney, E. Del Rio, Y. Y. Kong, and H. Chen. On the dichotomy in auditory perception between temporal envelope and fine structure cues (L). The Journal of the Acoustical Society of America, 116(3):1351-1354, 2004.
[11] D. Vakman. On the analytic signal, the Teager-Kaiser energy algorithm, and other methods for defining amplitude and frequency. IEEE Transactions on Signal Processing, 44(4):791-797, 1996.
[12] G. Girolami and D. Vakman. Instantaneous frequency estimation and measurement: a quasi-local method. Measurement Science and Technology, 13(6):909-917, 2002.
[13] Y. Qi, T. P. Minka, and R. W. Picard. Bayesian spectrum estimation of unevenly sampled nonstationary data. In International Conference on Acoustics, Speech, and Signal Processing, 2002.
[14] A. T. Cemgil and S. J. Godsill. Probabilistic phase vocoder and its application to interpolation of missing values in audio signals. In 13th European Signal Processing Conference, Antalya, Turkey, 2005.
[15] C. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[16] R. Gatto and S. R. Jammalamadaka. The generalized von Mises distribution. Statistical Methodology, 4:341-353, 2007.
[17] G. Sell and M. Slaney. Solving demodulation as an optimization problem. IEEE Transactions on Audio, Speech and Language Processing, 18:2051-2066, November 2010.
[18] R. E. Turner and M. Sahani. Probabilistic amplitude demodulation. In Independent Component Analysis and Signal Separation, pages 544-551, 2007.
[19] R. E. Turner and M. Sahani. 
Statistical inference for single- and multi-band probabilistic amplitude demodulation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5466-5469, 2010.
[20] R. E. Turner and M. Sahani. Demodulation as probabilistic inference. IEEE Transactions on Audio, Speech and Language Processing, 2011.
[21] J. Breckling. The analysis of directional time series: Application to wind speed and direction. Springer-Verlag, 1989.
[22] J. P. Cunningham. Algorithms for Understanding Motor Cortical Processing and Neural Prosthetic Systems. PhD thesis, Stanford University, Department of Electrical Engineering, Stanford, California, USA, 2009.
[23] T. Minka. A family of algorithms for approximate Bayesian inference. PhD thesis, MIT Media Lab, 2001.
[24] T. Heskes and O. Zoeter. Expectation propagation for approximate inference in dynamic Bayesian networks. In A. Darwiche and N. Friedman, editors, Uncertainty in Artificial Intelligence, pages 216-233. Morgan Kaufmann Publishers, 2002.
", "award": [], "sourceid": 602, "authors": [{"given_name": "Richard", "family_name": "Turner", "institution": null}, {"given_name": "Maneesh", "family_name": "Sahani", "institution": null}]}