Paper ID: | 2199 |
---|---|

Title: | Fast and accurate spike sorting of high-channel count probes with KiloSort |

The paper proposes a method for large-scale spike sorting in high-density electrophysiology recordings. They use a template-matching approach and show that their procedure is able to process very large-scale recordings (up to 384 channels) in much less time than an alternative state of the art method (KlustaKwik). The accuracy appears to be similar to the previous method. They also provide a novel method for automatically merging over-segmented clusters based on the continuity of their cluster densities.

The paper is well written and clearly presented. The results appear to support the author's claims that their method can accurately and efficiently perform spike sorting on large-scale, high-density multi-channel recordings. Because the fast speed of the method is a key aspect of its originality/impact, it would be helpful for the authors to analyze in more detail the dependence of processing time on data set size, number of channels, number of units, etc. The authors do not explain how they chose the hyperparameters (lambda, epsilon -- Eq. 1, 2). It's not clear why a normal distribution is used to model "spike amplitude" (Eq. 1, x_k). Would a Poisson, negative binomial, or other distribution over positive values be more appropriate?

2-Confident (read it all; understood it all reasonably well)

New neuroscience technology allows for the recording of hundreds to thousands of neurons at the same time, and this is only growing. This creates a demand for fast accurate and automated spike sorting technology. The submission adds several novel elements to traditional spike sorting, including "private" basis vectors for waveforms, a "moving average" of the cluster waveform and using GPU computing.

I imagine many members of the neuroscience community will use this algorithm, since it is the successor of the largely popular KlustaKwik. I would have liked some discussion of what happens to the algorithm around 700 GT neurons (in Figure 3). why do both algorithms catastrophically fail that point? Or is it those last few neurons are too small to properly distinguish from noise? In Benchmarks the authors mention the channel count of the recordings but do not mention the duration of the recording. There are several missing references (e.g. line 201)

2-Confident (read it all; understood it all reasonably well)

The authors provided a pipeline for spike sorting. The main innovation seems to be 1. doing SVD for each private neurons to build the template, which strikes a balance between computational efficiency and model accuracy; 2. online optimization algorithm in templates learning and nice discussion for avoiding local optimal. pre-processing steps include high-pass filtering, artifacts removal and whitening. initialization algorithm includes K-means. Core algorithm includes online optimization and matching pursuit. post-processing steps involves manual merging.

The paper is well written and mostly clear. I like the discussion in the model formulation part and I like the idea of doing private SVD. For me the key weakness seems to be little discussion for overlapping neurons/spikes. I understand that neuron spike can be rare, but seems to me that the overlapping case seems to still be worth the effort, especially when a large number of neurons are detected. Also I think the experimental result is not very well discussed. The comments below are mainly details/questions for the technical details and experimental result. figure 3: so KiloSort outperforms for easy-to-detect neurons but doesn’t do well for harder-to-detect neurons (panel bcef)? This seems surprising. What’s the intuition there? Also I would like to see some more discussion about the inferior performance of the false positive rates (panel ad). formula (1): I am surprised that the spike amplitude can vary significantly even for the same neuron. Is possible that these are just different neurons? Also if the amplitude vary significantly for the same neuron, does the shape of the waveform also vary for the same neuron? It might be good to have some figures illustrating this point. line 88-89: confused by the comments of computing whitening matrix based on its nearest 32 channels. Is the matrix still symmetric? What exactly is done? section 5: confused by the comments that in line 261-262. Seems that if the two clusters belong to two neurons that are close, we would also observe this type of behavior. figure d and h are both “continuous”, but they seem to be indicating quite different patterns. Maybe it would be good to have a counter example showing when it is a bad idea to merge? typo: formula (3): the second term should be divided by the number of spikes in batch k. (taking the mean instead of the sum)? line 63, 201: reference missing

2-Confident (read it all; understood it all reasonably well)

The authors develop a new spike sorting method. Unlike other approaches, their method is not based on dimensionality reduction. Their GPU based implementation leads to dramatic time savings as compared to KlustaKwik. Importantly, their method utilizes a novel template-wise smoothing method based on the SVD decomposition of the cluster average of the traces (templates), which is the key for obtaining the aforementioned computational gains.

Improvements in spike sorting are always appreciated because of the always-increasing electrode array capabilities (e.g number of electrodes or density) leading to new computational challenges. Methodologically, their contribution is the observation that one can take the SVD of the mean of raw traces in the cluster order to i)regularize shape of templates, ii)obtain computational gains via a GPU implementation of the operations in equation 6. This methodological contribution leads to impressive improvements in time while maintaining comparable levels of accuracy. In general it is a very nice paper. However, I think this paper has major issues and it is not suitable for publication in the NIPS conference/proceedings. 1)Overall, it gives the impression that it hasn't been fleshed out enough. We see this in i)missing references due (I suppose) errors in compilation of the pdf ii)lack of scales and labels in graphs iii)lack of enough references (beyond the ones that are missing in i). See for example section 2.1 or references supporting their optimization method) iiii)the authors don't do a very good job in graphically displaying their results/methods. 2)Their methodological contribution may be a bit incremental for a NIPS paper. It is certainly smart the SVD trick they apply, and how it entails computational gains, but the rest of the innovations are rather simple algebra tricks and don't lead to the same impressive results. 3)The way in which the authors present their method in relation to already published literature is not quite accurate: for example, in lines 63-66 it is stated that they improve the method in [1] but the motivation in [1] is actually orthogonal: they utilize continuous time modeling as a device to resolve overlapping spikes that are separated by a sub-sampling-rate time-lapse. Then, it is imprecise to call this an improvement over [1], as the author, I believe, are instead borrowing the idea from [1] and applying it to a simplified context (which ends up being useful, of course). Also, the authors state that their non-PCA approach is an innovation (line 68-70 'unlike previous approaches'). However, this has already became a common practice in the spike sorting literature ([1],[2],[3],[4]). 4)The time comparison the authors present is questionable (lines 220-225). Why didn't they attempt a GPU against GPU comparison? As a summary, it has a very nice key insight that could be the seed of a very nice work. However, the snapshot presented by the authors at this time seems still too "under-development". [1]Chaitanya Ekanadham, Daniel Tranchina, and Eero P Simoncelli. A unified framework and method for 296 automatic neural spike identification. Journal of neuroscience methods, 222:47–55, 2014. [2]Jonathan W Pillow, Jonathon Shlens, EJ Chichilnisky, and Eero P Simoncelli. A model-based spike 293 sorting algorithm for removing correlation artifacts in multi-neuron recordings. PloS one, 8(5):e62123, 294 2013. [3]D. Carlson, V. Rao, J. Vogelstein, and L. Carin. Real-Time Inference for a Gamma Process Model of NeuralSpiking.NIPS, 2013 [4].D. Carlson, J. Vogelstein, Q. Wu, W. Lian, M. Zhou, C. Stoetzner, D. Kipke, D. Weber, D. Dunson, and L. Carin.Multichannel Electrophysiological Spike Sorting via Joint Dictionary Learning & Mixture Modeling.IEEETBME, 2014

2-Confident (read it all; understood it all reasonably well)

A spike sorting algorithm, KiloSort, is presented. The algorithm performs dimensional reduction through SVD which allows better reconstruction than classical PCA method. SVD allow spike templates to be created, with optimization in parameter annealing to treat the initialization problem that plagues naive K-mean or leader algorithm, etc. The algorithm then performs matching pursuit to sort data. Then the post hoc method of merging over-partitioned clusters produces more reliable result per expert intervention.

Section 2.2. More discussion is needed regarding the claim that the proposed algorithm does not perform dimensional reduction like PCA, yet this section is all about how SVD can reduce dimension better than the classical method. Section 3.2. the initialization issue of many clustering algorithms is handled here through a rather ad hoc method through parameter annealing. Quantified comparison would improve the section. Section 5. Fig. 5bf is poorly presented. Since the data is clustered using Kilosort, there is understandably no linear boundary between the clusters when projected into classical PC space. Regular K-mean of PC would likewise produce linear boundary between clusters that should be merged. Fig.5bf is better omitted.

2-Confident (read it all; understood it all reasonably well)

The paper presents a novel algorithm for task of Spike Sorting, a common task for assigning spikes in voltage in EEG readings to specific neurons. The strong point of the approach is its ability to scale to a large amount of channels maintaining and even improving accuracy. It does so by deviating from the standard steps done in spike sorting, and using well selected local desitions that can be parallelized on GPUs. In short, the model tries to minimize a cost function that compares a generative model of the voltage and the voltage data. The parameters of the model are the descriptors of the spikes and neurons required for the spike sorting task. The parameters are: the temporal location of the spikes, the variability in amplitude for each neuron, the firing profile of each neuron (how it looks on each channel for a constant count of time samples) and the clustering of spikes into neurons. The algorithm works in a left to right fashion, greedily grouping newly discovered spikes and creating an profile average for each neuron (cluster at this point). At the same time, spike times and amplitudes are improved by calculating how much the cost function would decrease if a neuron n were to fire at time t. This is done in a matching pursuit fashion: the best (n, t) tuple is selected and then the voltage reconstruction removed from the data. After that, only the (n, t) tuples close enough to the one selected must be recalculated from the cost matrix, which is constant and can be parallelized. These two are the main places where the new approach gets its speed. Finally the algorithm is compared both in speed and accuracy against a very modern approach (2016). It excels in speed while mantaining and even improving accuracy.

The paper is clear in its presentation. Problem at task and strong points are well presented. Separation of model and processing is also very clear. Model could be understood and makes sense. Most processing steps are also rather clear; details escape (why sigma does not seem to be improved beyond greedy clustering, how does calculation of A_n interleave with spike time and amplitude inference). Evaluation seems strong enough. Plots are clear and time comparison (main strength) is sound. The algorithm against it is compared seems to be modern. Research seems quite relevant given the setting presented: new high-density electrodes being developed and previous approaches at the problem being too slow, while KiloSort being able to perform the task in a reasonable time. The setting in which KiloSort is mainly useful is not clear enough. Citations on the new electrode technology is missing (both to qualify how wide spread they are or will be used, as well as how long have they been around or will be until they can readily be used). It is not clear if other fast algorithms (even real-time) already exists (http://www.sciencedirect.com/science/article/pii/S0165027011006133). Previous work does not perfectly establish the state-of-the-art of the subject and which is the specific use case of the algorithm presented.

1-Less confident (might not have understood significant parts)