Practical: Analysis of spikes from cricket and cockroach#

Introduction to practical#

\index{practintro} We are now ready to examine the measured spikes in more detail using the analysis methods from the lectures. For this we will use Matlab, together with a number of custom routines written by the teacher, as outlined in the lectures, for direct use on the cricket and cockroach signals.

In this practical you will go through the different analyses for three different data files:

  • the spikes of the cricket (file: Cricket.aif)

  • the cockroach spikes (file: Cockroach2.aif)

  • the spikes of your own measurement (file: UwInsect.aif or UwInsect.wav)

Preparation#

First, download the necessary Matlab files from the web (download Spike Analysis from Brightspace) and place them on the desktop of your computer.

Then start Matlab and link the Spikes directory on the desktop to the Matlab workspace. To do this, enter the location of that directory in the ‘Current Folder’ window. On the Mac, for example, this is /Users/yourname/Desktop/Spikes. If this is successful you will see a list of Matlab scripts in the Current Folder window, including e.g.

  • aiffread.m

  • Cricket.aif

  • Cockroach2.aif

Raw signals#

Second, we will see if we can read and plot the raw signals. In this description we use Cockroach2.aif, but of course you should also apply the routines to the other two data files. To do this, call in Matlab:

\mcode{plotruw('Cockroach2.aif');}

(Note: don’t forget the semicolon at the end of the line! If you omit it, Matlab will (often) print all output to the screen.)

If all goes well, you will see a figure like in 6-10, only without the detection lines. The vertical dotted lines indicate time in seconds. You will see the detection lines if you call the program as follows: \mcode{N = plotruw('Cockroach2.aif', 3.4);}

In this case, you tell the program to also set the spike detection level (here 3.4 times the standard deviation (std) of the total signal), and actually detect and count the spikes. That level is shown as two dashed black horizontal lines, which roughly correspond to the densely filled red band. With this detection level, in principle every signal that protrudes above it (and does so for at least 0.6 ms) is detected as a potential spike (see the algorithm on page 23). The total number of detected spikes is displayed in the plot; it is also the output (N) of the \mcode{plotruw} program.
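The detection criterion just described can be sketched in a few lines of Matlab (a minimal illustration, not the actual code of \mcode{plotruw}; it assumes a mono signal, thresholds the absolute signal value, and uses the built-in \mcode{audioread} rather than the supplied aiffread.m):

```matlab
% Minimal sketch of the spike-detection criterion (illustrative only).
[sig, sr] = audioread('Cockroach2.aif');   % raw signal and sample rate
level  = 3.4 * std(sig);                   % detection level: 3.4 x std
minlen = round(0.6e-3 * sr);               % 0.6 ms expressed in samples

above  = abs(sig) > level;                 % samples exceeding the level
d      = diff([0; above; 0]);              % mark starts/ends of threshold crossings
starts = find(d == 1);
stops  = find(d == -1) - 1;
N = sum((stops - starts + 1) >= minlen);   % runs lasting >= 0.6 ms count as spikes
fprintf('%d spikes detected\n', N);
```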

\begin{exercise} Vary the detection level from [1.5,2.0, …, 5.0] and see how the number of detected spikes changes. Make a graph (preferably in Matlab, if you can) in which you plot the number of spikes per second against the detection level. Do this for the three data files and compare. \end{exercise}

Spike detection#

Third, we are going to detect the spikes and store them in a Matlab file. The program \mcode{nbf_detectspikes} is used for this. In the beginning this program does the same as \mcode{plotruw}, with the following differences:

  • i) the file is cut into equal segments (‘trials’) of 1.000 s;

  • ii) the spikes are stored in two ways:

      – only the moments at which the spikes occur in each trial,

      – the waveform of every detected spike.

You have to run this program as follows:

\mcode{[MOMS, SPKWAVS, SR, STD] = nbf_detectspikes('Cockroach2.aif', 3.3);}

While this program is running you will see, for each trial, a one-second picture with the detected spikes marked as thin vertical lines at the bottom.

This program therefore gives 4 outputs, whose names you can choose yourself (here: MOMS, SPKWAVS, SR and STD, but you could also call them A, B, C, D). MOMS is a long vector of numbers containing the following information: T1, N1, Spk1, Spk2, Spk3, Spk4, …, SpkN1, T2, N2, Spk1, Spk2, …, SpkN2, T3, N3, etc.

  • T1, T2, T3, … are the trial numbers

  • N1, N2, N3, etc. are the number of spikes detected in that trial

  • Spk1, Spk2, etc. are the moments (in ms) at which those N spikes occurred in each trial.

For example, when typing the following:

\mcode{MOMS(1:20)}

(this shows the first 20 values of the Matlab vector MOMS), you get the following:

\mcode{ans = 1.0000 29.0000 65.1247 66.2585 89.5238} etc.

So trial 1 had 29 spikes; the first spike came after 65.12 ms, the third after 89.52 ms. To see the spike moments of trial 2 you have to look from element 32 onwards (the 2 header values T1 and N1 plus the 29 spike moments make 31 elements for trial 1):

\mcode{MOMS(32:42)}

And this then gives

\mcode{ans = 2.0000 24.0000 5.9410 11.9274}

etc. on your screen.
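Given this layout, the spike moments of every trial can be unpacked from MOMS with a short loop (a sketch based purely on the layout described above):

```matlab
% Walk through MOMS = [T1 N1 Spk1..SpkN1  T2 N2 Spk1..SpkN2  ...]
% and collect the spike moments (in ms) of each trial in a cell array.
moments = {};
i = 1;
while i <= numel(MOMS)
    t          = MOMS(i);              % trial number
    n          = MOMS(i+1);            % number of spikes in this trial
    moments{t} = MOMS(i+2 : i+1+n);    % the n spike moments (ms)
    i          = i + 2 + n;            % jump to the next trial block
end
% moments{2}(1) now gives the first spike moment of trial 2
```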

SPKWAVS is a matrix in which each detected spike waveform is stored as a separate row. Per spike (row), the following information is stored: T S w(1) w(2) w(3) … w(23), with T the trial number that contained the spike, and S the sample number of the spike in that trial. The values w(1) to w(23) are the measured values of the spike signal. For the first spike, for example, you type:

\mcode{SPKWAVS(1,:)}

(these are all elements (columns) of the first row), yielding:

\mcode{ans = 1 1436 -3622 -8768 -9394 -17182}

etc.

The first spike in trial 1 therefore occurred at sample number 1436. This corresponds to time \mcode{t = 1436/22050 s = 65.1247 ms}. The other two outputs of the program are:

  • SR - the sampling rate of the measurement, which will be 22050 Hz

  • STD - the standard deviation of the total spike signal.

In this way all information about the detected spikes is available for further analysis. The spike waveforms can be used to apply singular value decomposition (and then cluster analysis, as discussed in the lectures).

The spike moments can in principle be used to determine statistics of the spikes (interval distributions and the like), and to look at information in the spike trains, for example by Fourier analysis. But beware: if the measured spikes come from multiple cells, we first have to know which spikes belong to which cell!

\begin{exercise} Select a fixed spike detection criterion (e.g. 3.5) and detect the spikes for the three files. Store the spike waveforms in different matrices by giving the outputs different names! \end{exercise}

Spike sorting#

\index{spikesorting}

You now have three spike-waveform matrices, containing for each spike the sample number in the file and the waveform of that spike. Suppose we have called them WAVS1, WAVS2, and WAVS3. We are now going to perform the data reduction via Singular Value Decomposition, whereby the 23 eigenvectors are calculated together with the corresponding 23 singular values. We then take only the first four eigenvectors and singular values to characterize all the spikes of the file. In short, each spike will be approximated by the following description:

\[s(t) \approx \alpha_{k1} \cdot \vec{e}_1(t)+\alpha_{k2} \cdot \vec{e}_2(t)+\alpha_{k3} \cdot \vec{e}_3(t)+\alpha_{k4} \cdot \vec{e}_4(t)\]
\[s_k \sim [\alpha_{k1},\alpha_{k2},\alpha_{k3},\alpha_{k4}]\]
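The decomposition above can be illustrated with Matlab's built-in \mcode{svd} (a sketch of the principle only; \mcode{nbf_spike_svd} additionally computes the reconstruction correlations and produces the figures):

```matlab
% WAVS1: one detected spike per row; columns 1-2 are trial and sample number,
% columns 3 onwards the 23-sample waveform. Illustration of the SVD reduction.
W = double(WAVS1(:, 3:end));
[U, S, V] = svd(W, 'econ');   % columns of V are the 23 eigenvectors e_1..e_23
svals = diag(S);              % the 23 singular values, in decreasing order
E = V(:, 1:4);                % keep only the first four eigenvectors
A = W * E;                    % alpha_k1..alpha_k4: one coefficient row per spike
Wrec = A * E';                % rank-4 reconstruction of all spike waveforms
```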

How good that approximation is can be assessed by calculating the correlation between the real spike and the approximated, reconstructed spike. This is done for all detected spikes in the \mcode{nbf_spike_svd} function. You call this program as follows:

\mcode{[EVEC1, EVAL1] = nbf_spike_svd(WAVS1);}

The EVAL1 matrix has Nspikes rows and 7 columns. Each row corresponds to one spike, with:

  • Col 1 = trial nr

  • Col 2 = Sample nr of the spike in the trial

  • Col 3 to Col 6 = the 4 highest eigenvalues of the spike

  • Col 7 = the correlation between spike waveform and its approximation.

In addition to the two matrix outputs, the program also produces four figures. These show: the four eigenvectors as a function of time (Fig. 1), all eigenvalues in decreasing order (Fig. 2), two-dimensional projections of the 4 eigenvalues for all spikes (6 panels in Fig. 3), and the correlations for all spikes (Fig. 4).

\begin{exercise} Do this analysis on the three data files and compare the different eigenvectors with each other. Are the four eigenvectors sufficient to reconstruct the spikes accurately? Pick some spike waveforms of your choice from the files, plot these, and also plot the reconstructed waveform (in the same figure as the original) to convince yourself that this reconstruction is indeed accurate. \end{exercise}

\begin{remark} For example, to extract spike 63 from file WAVS3 you type: \mcode{Spk63 = WAVS3(63,3:end);} You also need the four eigenvalues for that spike, which you get from EVAL3: \mcode{Evals63 = EVAL3(63,3:6);} Now create the reconstructed spike with these values and the eigenvectors in EVEC3 (each eigenvector is a column in this matrix, so \mcode{EV1 = EVEC3(:,1);} etc.) by entering the formula at the top of this page. You plot (e.g. in Fig. 5) a picture of a spike as follows: \begin{lstlisting} figure(5); plot(Spk63,'k-'); hold on; plot(RecSpk63,'r'); \end{lstlisting} You have now plotted the original spike as a black curve, and the reconstructed spike as a red line on the same scale as the black one. \end{remark}
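The reconstruction described in the remark can be written out as follows (a sketch; it assumes EVEC3 stores at least the first four eigenvectors as columns, as stated above):

```matlab
% Reconstruct spike 63 as the weighted sum of the four eigenvectors.
Spk63    = WAVS3(63, 3:end);               % original waveform (row vector)
Evals63  = EVAL3(63, 3:6);                 % its four coefficients
RecSpk63 = (EVEC3(:, 1:4) * Evals63')';    % alpha1*e1 + ... + alpha4*e4
figure(5); plot(Spk63, 'k-'); hold on; plot(RecSpk63, 'r');
c = corrcoef(Spk63, RecSpk63);             % compare c(1,2) with Col 7 of EVAL3
```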

Spike clustering#

\index{spikeclustering}

In Figure 3 of the program \mcode{nbf_spike_svd} you see six panels in which the four singular values are plotted against each other in pairs (a1-a2, a1-a3, a1-a4, a2-a3, a2-a4 and a3-a4). In some of these panels it is (possibly) clear that the data clouds are more or less separated from each other. The trick of cluster analysis is to find the optimal dividing lines between the data points in the four-dimensional space of singular values (see lecture). We are going to do this with the program \mcode{nbf_spikeclusters}. As described in the lecture, very advanced cluster-separation techniques exist, but applying them requires quite a bit of extra mathematics. Here we limit ourselves to the widely used K-means cluster analysis: we tell the program in advance how many clusters it should find, and the best possible cluster areas are then identified according to the iterative procedure outlined in the lectures. To do the cluster analysis, type:

\mcode{[LABELS1] = nbf_spikeclusters(WAVS1, 3);}

Just like with the SVD routine, the input is the matrix with detected spike waveforms. The second input is the number of clusters that we want (we take here: 3). The program produces six figures. Spikes that belong to a certain cluster now have their own color.
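The K-means step itself can be sketched with the built-in \mcode{kmeans} function (illustrative only; it requires the Statistics and Machine Learning Toolbox, and \mcode{nbf_spikeclusters} additionally builds the LABELS structure and all six figures):

```matlab
% Cluster the spikes in the 4-dimensional space of coefficients
% (columns 3-6 of the EVAL matrix produced by nbf_spike_svd).
A   = EVAL1(:, 3:6);
idx = kmeans(A, 3);                 % assign each spike to one of 3 clusters
gscatter(A(:,1), A(:,2), idx);      % colour-coded a1-a2 projection, as in Fig. 3
xlabel('a_1'); ylabel('a_2');
```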

The output of the program is here called LABELS1. In Matlab this is a so-called ‘structure’, which means that the stored data, in contrast to a vector or a matrix, can consist of entirely different kinds of elements (for example pieces of text, whole numbers, real numbers, sets of numbers, etc.).

If you type \mcode{LABELS1{1}} you will see the first element of this structure, which gives the total number of spikes measured in each of the three clusters: \begin{lstlisting}
ans =
    Nspks1: 158
    Nspks2: 638
    Nspks3: 361
\end{lstlisting} After this come the data for the individual trials (the 1.000 s segments from the raw data file). In the case of Cockroach2.aif we had 15 trials, so 15 structures follow. Each structure provides all data on the spikes of each of the three clusters, called Spike1, Spike2 and Spike3, respectively.

\begin{remark} For example: to see the data of the spikes from the second cluster of the third trial (i.e. between times 2.0 and 3.0 s) you type: \mcode{LABELS1{4}.Spike2}. You will see the following:

\begin{lstlisting}
ans =
          Trial: 3
        Nspikes: 15
          Label: [15x1 double]
        Moments: [15x1 double]
    Eigenvalues: [15x4 double]
         Correl: [15x1 double]
\end{lstlisting}

\end{remark}

In this way it is therefore possible to collect all relevant data in a well-organized way in one file, and to use it for further analysis.

\begin{exercise} Do this analysis on the three data files and compare the different spike shapes and clusters with each other. Try to make a figure (a separate figure for each data file) in which you plot the number of spikes of clusters 2 and 3 in each trial against the number of spikes of cluster 1 in that trial. Do you see high (positive/negative) or low correlations? What would that mean? \end{exercise}

Save the structures of the three data files, because we now need them to be able to perform the analysis on the selected spike clusters!

Spike interval distribution#

\index{spikeinterval}

Now we will look at the statistics of the spikes. For this we use the sorted spikes (with labels 1, 2, and 3) from the LABELS structure, and we determine the distribution of the interspike intervals (ISI). The ISI for spike k is defined as:

\[ISI_k = \tau_{k+1} - \tau_k\]

With the program \mcode{nbf_intspikint} the intervals for all spikes in a file can be determined and plotted in a histogram with a bin width of 2 ms. The height of the histogram indicates the number of times spikes with that ISI were found. Suppose we want to see the spike intervals for cluster 2.

Then type: \mcode{[ISI2, SPK2, DENS2] = nbf_intspikint(LABELS1, 2);}

The input is the structure file with LABELS, and the input ‘2’ indicates that you want the spikes of the second cluster. The output then gives:

  • the intervals (in ISI2)

  • the spike moments (in s) in SPK2

  • and the spike density (DENS2) of these spikes over the entire time axis (for Cockroach2.aif this is 15 s).

Figure 1 shows you the interval histogram, and Figure 2 shows you the spikes (thin vertical lines), with the spike-density function below (in red).
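The principle behind the interval histogram can be sketched as follows (illustrative; \mcode{nbf_intspikint} does this per cluster directly from the LABELS structure, and the variable name spk is chosen for this example):

```matlab
% spk: sorted spike moments in ms (e.g. the concatenated moments of one cluster).
isi   = diff(spk);                 % interspike intervals in ms
edges = 0 : 2 : max(isi) + 2;      % bin width of 2 ms, as in the handout
histogram(isi, edges);
xlabel('ISI (ms)'); ylabel('count');
CV = std(isi) / mean(isi)          % CV = 1 for an exponential distribution
```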

\begin{exercise} Do this analysis on the three data files for each of the selected spike clusters, and compare the different histograms with each other. Determine the CV for each histogram. Is it far above or below the value for an exponential distribution (CV = 1)? \end{exercise}

Spike power spectra#

\index{spikespectra}

Finally, we will perform a (simple) Fourier analysis on the spike-density functions of the selected spike clusters. For this we use the results from the previous assignment. For this purpose, the function \mcode{fft} (Fast Fourier Transform) is available in Matlab. This produces a series of complex numbers as output, which contain the amplitude and phase of each frequency component. The sample frequency is 22050 Hz, and the signal consists of N points. Because two values are needed for each frequency (i.e. amplitude and phase), the N data points yield N/2 frequency components, running from 0 Hz up to half the sample frequency (the Nyquist frequency). In other words, if the sample frequency is Fs, the maximum frequency that can be represented in the signal is \(Fs/2\), in our case 11025 Hz.

For this we have \(N/2\) points available. This means that the frequency axis after Fourier transformation runs from [0 - 11025] Hz in steps of \(df = 22050/N\) Hz. The \mcode{nbf_spikespec} function does this for us. You type (after the previous analysis for the spikes of the second cluster):

\mcode{[SPEC2] = nbf_spikespec(DENS2);}

You will then see two figures. Figure 1 shows the complete spectrum (on a double-log scale) from 0 to 11025 Hz. In general there is not much to see, although you can, for example, read off the bandwidth of the signal. The second figure zooms in on the frequencies between 1 and 100 Hz. The program also tries (automatically) to find a possible peak between 5 and 40 Hz, and places a dashed line on the peak found. You can judge for yourself whether this is indeed a peak, or whether it concerns a random fluctuation. A peak below 10 Hz belongs to the alpha band, a peak between 15 and 25 Hz to the beta band, and a peak above that to the gamma band.
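The frequency axis described above can be constructed explicitly (a sketch of what \mcode{nbf_spikespec} presumably does internally, under the stated assumptions Fs = 22050 Hz and DENS2 from the previous step):

```matlab
% Power spectrum of the spike-density function DENS2, sampled at Fs = 22050 Hz.
Fs = 22050;
N  = numel(DENS2);
X  = fft(DENS2);                       % complex spectrum
P  = abs(X(1:floor(N/2))).^2;          % power of the first N/2 components
f  = (0:floor(N/2)-1) * Fs / N;        % frequency axis: 0 .. Fs/2 in steps of Fs/N
loglog(f(2:end), P(2:end));            % double-log plot, skipping f = 0
xlabel('frequency (Hz)'); ylabel('power');
```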

\begin{exercise} View the spectra of the three data files for each of the selected spike clusters, and check whether a systematic peak can be found at a certain frequency. \end{exercise}