DATA-MEAns: An open source tool for the classification and management of neural ensemble recordings

By María Paula Bonomini

What for?

Multielectrode recordings are now a widespread laboratory tool, but the amount of information they yield can be overwhelming. A preselection tool that classifies data against chosen criteria is therefore extremely useful to the experimenter. The main goal is to obtain preclassified data so that further analysis can be carried out on better-characterized groups of units, taking advantage of what is known about the basic behaviour of these units under criteria such as autocorrelations, PSTHs, frequency histograms, and so on. The tool also offers an easy way of displaying data, which makes the identification of units more consistent. Moreover, one can dismiss units from the original dataset as well as create subpopulations according to a certain behaviour.

That's why we present open, freely available software intended to facilitate the classification and graphical representation of spikes and population responses to simple control stimuli such as flashes or moving bars, that is, the regular stimuli applied during the experiment to keep track of the state of the cells. The output can be either graphical or files containing the units that match the defined requirements for classification. The graphical outputs are raster plots, spike count plots, spatio-temporal plots (columns or rows of the electrode array along time) and colour-coded clusters that keep the spatial configuration of the array, so that you can see which class a given unit belongs to.

This software is being developed by María Paula Bonomini and any comment or suggestion will be welcomed at p.bonomini@umh.es

Downloads

DATAMEAnsv2-install.exe

Executable file, source code (Delphi 7), Matlab routines and sample files…

So far...

Animations: Spatiotemporal representations for an array layout of M by N electrodes (where M and N are user-defined), over the entire experiment or between a start time and an end time defined by the user. You will be able to see the temporal evolution of the columns or the rows of your electrode array.
Visualization of units: all the units or the chosen ones as rasters, or the spike count from all the units.
Classification according to:
Rate histograms, correlograms, rate vs. time curves, PSTHs, ISI histograms.
Edit Plot Mode: Display rasters individually, dismiss units from the dataset, show classified units.
Output files with the classified units. The number of classes for the clustering algorithm is user-defined.
User configuration of: array layout, number of graphics per Matlab figure, number of spikes under which you would like to dismiss data, number of classes for the clustering method.

Installation notes

To install it, just run the executable DATAMEAnsv2-install.exe; a C:\syncromat folder will appear containing all the needed files. Do not move any of these files except the main executable, DATAMEAnsv2.exe, which you can move wherever is most convenient for you. This folder also contains three sample files (column and pair structured) to make your first run easier!

Getting started

This program is aimed at the sharing of analytical tools for multielectrode recordings. The source code is completely free and open, so that any programmer is invited to add his/her own routine!

With this first version you will be able to run basic analytical processes such as PSTHs, correlograms, a filter based on Conditioned Spikes, and rate, frequency and ISI histograms, as well as different visualization routines. Moreover, you will be able to generate subpopulations by clustering the cells according to any of the analytical processes mentioned above, so you can end up with groups of cells that share a behaviour criterion. This is explained in detail in the Classification link.

The program works with ASCII files, so no matter which commercial system you acquired the data with, just translate it to ASCII!

The input can be multicolumn or pairs structured; for more information on the file formats, go to sample files. There is a sample of each allowed file format in the installation folder so that you can get started with these.

1- Open a file. Remember that you must define your own labels for stimulus identification on the General Settings Panel, located on the left of the main form. On this panel you can adjust different parameters that affect the processes you run. The following is a brief description of these settings:

Matlab_Figures_X/Matlab_Figures_Y: For rectangular multielectrode arrays, if this layout and Electrode_X/Electrode_y are set alike, you can infer spatio-temporal information from your recordings. Otherwise, it is simply the layout in which the Matlab figures generated by internal calls to Matlab routines are presented.

Electrode_X/Electrode_y: The layout of your multielectrode matrix. The implemented layout is a rectangular one so far.

K-Means Classes: The maximum number of classes into which the K-Means clustering method will split the original dataset. Thus, after running any Classification process, you will end up with at most N clusters if K-Means Classes was set to N.

Min spikes per channel: Any channel with fewer spikes than this will be dismissed.

Bin size: Affects the visualization and classification processes. Many routines need to sample the timestamps of the cells; Bin size is the minimum time window for that sampling.

Start time: For some processes, you can define a start time to analyse from that time on instead of the whole experiment duration.

Stop time: For some processes, you can define a stop time to analyse until that time instead of the whole experiment duration.

Maximum Lag: For the correlogram processes. It is the number of positions by which the signal is shifted in the correlogram routines.

User Identifiers: To define specific stimulus labels.
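As a rough illustration of how Min spikes per channel, Bin size, Start time, Stop time and Maximum Lag interact, here is a minimal Python sketch. It is not the program's own code (which is written in Delphi and calls Matlab routines); the function names are hypothetical, and the raw, unnormalized autocorrelogram is an assumption about what a correlogram routine might compute.

```python
import math


def drop_sparse_units(spikes_by_unit, min_spikes):
    """Dismiss units with fewer than `min_spikes` timestamps."""
    return {unit: ts for unit, ts in spikes_by_unit.items() if len(ts) >= min_spikes}


def bin_spikes(timestamps, bin_size, start_time, stop_time):
    """Count spikes per bin over the interval [start_time, stop_time)."""
    # The small tolerance avoids an extra bin caused by floating-point rounding.
    n_bins = max(0, math.ceil((stop_time - start_time) / bin_size - 1e-9))
    counts = [0] * n_bins
    for t in timestamps:
        if start_time <= t < stop_time:
            index = min(int((t - start_time) / bin_size), n_bins - 1)
            counts[index] += 1
    return counts


def autocorrelogram(counts, max_lag):
    """Raw autocorrelation of the binned signal for lags 0..max_lag (in bins)."""
    return [
        sum(a * b for a, b in zip(counts, counts[lag:]))
        for lag in range(max_lag + 1)
    ]


# Hypothetical data: unit labels follow the channel.unit convention.
spikes = {"37.1": [0.1, 0.2, 0.6, 1.1], "75.2": [0.3]}
kept = drop_sparse_units(spikes, min_spikes=2)          # "75.2" is dismissed
counts = bin_spikes(kept["37.1"], 0.5, 0.0, 1.5)        # [2, 1, 1]
lags = autocorrelogram(counts, max_lag=2)               # [6, 3, 2]
```

Here Maximum Lag is expressed in bins, so the largest time shift covered is Maximum Lag times Bin size.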

File Formats

The input file can have either of the following formats:

Pairs:

A pair of values, (timestamp, unit) or (unit, timestamp), with stimulus information labelled with numbers from 1000 on (1000 is interpreted as stimulus offset and any number above 1000 as stimulus onset, with different values for different amplitudes) or with the User Identifiers that the experimenter can define on the General Settings Panel. Warning: In the pairs structured case, the stimulus identifiers must be numeric codes of 1000 or above.

timestamp channel.unit

0.000370 37.1

0.000600 75.2

0.000900 68.0

0.003230 2000<-- flash onset

0.004870 59.5

0.007360 27.2

0.007400 81.5

0.008872 59.5

0.012360 59.5

0.014900 1000<-- flash offset

0.023340 59.5

channel.unit timestamp

51.3 0.00857

73.2 0.01263

46.2 0.01267

3000 0.01487<-- flash onset (higher intensity)

56.1 0.0149

96.4 0.01577

36.0 0.0165

85.0 0.01693

74.0 0.02073

1000 0.02137<-- flash offset

95.1 0.0215

85.1 0.02153
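For readers who want to inspect a pairs structured file outside the program, the following Python sketch separates spikes from stimulus codes. It is only an illustration under stated assumptions: the function name parse_pairs is hypothetical, the column order is passed in explicitly rather than detected, and anything after "<--" is treated as an explanatory annotation and ignored.

```python
def parse_pairs(lines, timestamp_first=True, stim_threshold=1000.0):
    """Split pairs structured lines into spikes per unit and stimulus events.

    Returns (spikes, stimuli): spikes maps a "channel.unit" label to a list of
    timestamps, stimuli is a list of (timestamp, code) tuples.
    """
    spikes = {}
    stimuli = []
    for raw in lines:
        line = raw.split("<--")[0].strip()  # drop annotations such as "<-- flash onset"
        fields = line.split()
        if len(fields) != 2:
            continue  # blank or malformed line
        t_str, id_str = fields if timestamp_first else fields[::-1]
        try:
            t, code = float(t_str), float(id_str)
        except ValueError:
            continue  # header line such as "timestamp channel.unit"
        if code >= stim_threshold:
            stimuli.append((t, int(code)))  # 1000 = offset, above 1000 = onset
        else:
            spikes.setdefault(id_str, []).append(t)
    return spikes, stimuli


sample = [
    "timestamp channel.unit",
    "0.000370 37.1",
    "0.003230 2000<-- flash onset",
    "0.004870 59.5",
    "0.008872 59.5",
    "0.014900 1000<-- flash offset",
]
spikes, stimuli = parse_pairs(sample)
```

For a file in (unit, timestamp) order, call parse_pairs(lines, timestamp_first=False).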

Columns:

Each column contains the timestamps belonging to a single cell.

sig001 sig002 sig003 start

0.000750 0.040400 0.370175 0.020775

0.000750 0.416725 0.347925 0.060775

0.416750 0.000750 0.416725 0.100775

3.421575 0.991475 0.364750 0.140775

0.040400 0.000750 0.370175 0.180775

0.370175 0.000750 0.416725 0.220775

0.021700 0.370175 0.000750 0.260775

0.370175 0.416725 0.127625 0.300775

0.370175 0.040400 0.416725 0.340775

0.370175 0.000750 0.370175 0.380775
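A column structured file can be read in a similar spirit. The sketch below is an assumption-laden illustration, not the program's reader: the names parse_columns and split_stimulus are hypothetical, and every row is assumed to carry exactly one value per header column. Stimulus columns are picked out by matching header names against the User Identifiers.

```python
def parse_columns(lines):
    """Read a column structured file into {header name: [timestamps]}."""
    rows = [line.split() for line in lines if line.strip()]
    header, data = rows[0], rows[1:]
    columns = {name: [] for name in header}
    for row in data:
        if len(row) != len(header):
            raise ValueError("expected one value per column, got: %r" % (row,))
        for name, value in zip(header, row):
            columns[name].append(float(value))
    return columns


def split_stimulus(columns, user_identifiers):
    """Separate unit columns from columns whose header is a stimulus label."""
    units = {k: v for k, v in columns.items() if k not in user_identifiers}
    stimuli = {k: v for k, v in columns.items() if k in user_identifiers}
    return units, stimuli


sample = [
    "sig001 sig002 sig003 start",
    "0.000750 0.040400 0.370175 0.020775",
    "0.000750 0.416725 0.347925 0.060775",
]
units, stimuli = split_stimulus(parse_columns(sample), user_identifiers={"start"})
```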

A sample file for each structure can be found in the 'C:\syncromat\sample_file' folder. If no stimulus is recognised because the User Identifier labels do not match the data, a message will warn you about it, since some routines need stimulus information. Moreover, when you run these routines, an additional warning message will remind you that there is no stimulus information.

2- Now you can run any Visualization, Classification or Conditioned Spikes Analysis:

There are basically two sorts of visualization:

§ Traditional visualization of spikes or CS, which is split into spike or CS count along time and raster plots.

§ Spatio-Temporal visualizations, which produce two kinds of output:

Static: builds a population vector by summing, within each time bin, the spikes of all the units in each column of the electrode array, and plots this vector over the time frame given by the user (the whole experiment or from a start-time to a stop-time), so that you can see the activity of the different columns of the electrode array along time.

Animated: it simply shows the activation level of all the electrodes in the array, bin by bin.

The bin size is a user-defined feature set on the General Settings Panel. Again, you can scan the whole recording or just focus on the interval between a start-time and a stop-time. Please note that the start and stop times must lie within the total duration of the experiment.
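The static spatio-temporal view can be sketched in Python as follows. This is a hedged illustration only: the function name column_activity is hypothetical, and the mapping from channel number to array column (channels numbered from 1, row by row) is an assumption, since the actual electrode numbering depends on your array.

```python
import math


def column_activity(spikes_by_unit, n_cols, bin_size, start_time, stop_time):
    """Sum spike counts per time bin over all units in each array column.

    Returns a list with one entry per column; each entry is a list of counts,
    one per bin, covering [start_time, stop_time).
    """
    n_bins = max(0, math.ceil((stop_time - start_time) / bin_size - 1e-9))
    activity = [[0] * n_bins for _ in range(n_cols)]
    for label, timestamps in spikes_by_unit.items():
        channel = int(label.split(".")[0])  # "37.1" -> channel 37
        col = (channel - 1) % n_cols        # ASSUMPTION: row-major numbering from 1
        for t in timestamps:
            if start_time <= t < stop_time:
                b = min(int((t - start_time) / bin_size), n_bins - 1)
                activity[col][b] += 1
    return activity


# Hypothetical 2-column array: channels 1 and 3 sit in column 0, channel 2 in column 1.
spikes = {"1.1": [0.1, 0.6], "3.1": [0.2], "2.1": [0.7]}
activity = column_activity(spikes, n_cols=2, bin_size=0.5, start_time=0.0, stop_time=1.0)
```

Plotting each inner list against time gives one trace per column; the animated view would instead keep one count per electrode and step through the bins.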

Classification

Classification runs a Matlab script, which varies according to the classification criterion (autocorrelation, rate histogram, etc.) and ends by running the K-Means clustering method in order to assign the units to a number of classes.
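To make the clustering step concrete, here is a minimal pure-Python sketch of K-Means (Lloyd's algorithm) applied to one feature vector per unit, for example a rate histogram. It is not the Matlab routine the program calls; the function names are hypothetical, and taking the first k vectors as initial centroids is a simplification chosen so the example is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Cluster feature vectors into at most k classes; returns (labels, centroids)."""
    centroids = [list(v) for v in vectors[:k]]  # ASSUMPTION: naive initialization
    labels = [None] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: each unit joins the class with the nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty class keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical rate histograms for four units (three bins each).
histograms = [[5, 0, 0], [0, 0, 5], [4, 1, 0], [0, 1, 4]]
labels, centroids = kmeans(histograms, k=2)
```

Because a class can end up with no members, the result may contain fewer than k populated classes, which matches the idea of K-Means Classes being a maximum.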

The grouped units are then represented graphically, in a colour-coded way, on the matrix layout defined by the user. The x axis is extended in order to include all the different units of each electrode. So, if a 10 by 10 electrode matrix was defined in the General Settings, an extended 10 by 50 coloured matrix will appear (there is room for 5 different units per electrode), where all the units in red belong to one class, all the units in blue belong to a different class, and so on.

After this point, one can go to the Edit Plot Mode, the second tabsheet in the main form, to get a better visualization of the processed units, to remove individual units from the dataset by right-clicking on the charts, or to display the cells' raster plots individually. On the panel on the right, one can see the grouping of the units by clicking on "Display Clusters". To zoom in, drag the mouse from left to right so that it defines a rectangle around the area of interest; to zoom out, do the same but define the rectangle from right to left.

Conditioned Spikes Analysis

Conditioned Spikes (CS) with an associated firing rate fr are defined whenever two successive spikes of a given cell occur at times t_(i-1) and t_i such that:

(fr - (delta_freq/2)) < |1/(t_i - t_(i-1))| < (fr + (delta_freq/2)) for a given delta_freq.
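The definition can be sketched in Python as below, reading the window as running from fr - delta_freq/2 up to fr + delta_freq/2. This is an illustration rather than the program's implementation: the function name is hypothetical, the timestamps are assumed to be sorted, and reporting the later spike of each qualifying pair as the conditioned spike is an assumption.

```python
def conditioned_spikes(timestamps, fr, delta_freq):
    """Return the spikes whose instantaneous rate falls inside the window.

    The instantaneous rate of a spike is 1 / (t_i - t_(i-1)), in Hz when the
    timestamps are in seconds.
    """
    low = fr - delta_freq / 2
    high = fr + delta_freq / 2
    selected = []
    for previous, current in zip(timestamps, timestamps[1:]):
        interval = current - previous
        if interval <= 0:
            continue  # duplicated or unsorted timestamps carry no rate
        if low < 1.0 / interval < high:
            selected.append(current)  # ASSUMPTION: the later spike is the CS
    return selected


# Hypothetical spike train of one cell, in seconds.
train = [0.0, 0.01, 0.02, 0.5, 0.51]
cs = conditioned_spikes(train, fr=100.0, delta_freq=20.0)  # window 90 to 110 Hz
```

In this example the 0.48 s gap corresponds to about 2 Hz and is rejected, while the 10 ms gaps correspond to about 100 Hz and are kept.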

The Sweep Windows option runs the CS method through different velocity windows and shows, in the text component, the rate of CS correlated with the stimulus for each velocity window, plus the overall number of CS.
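A sweep over several windows might look like the following Python sketch. It rests on assumptions that are not taken from the program: the function name sweep_windows is hypothetical, each window is given by its centre rate and a shared delta_freq, and "correlated with the stimulus" is interpreted simply as the fraction of CS that fall inside a stimulus ON period.

```python
def sweep_windows(timestamps, centres, delta_freq, on_periods):
    """Count CS for each window and the fraction occurring during the stimulus.

    `on_periods` is a list of (onset, offset) times. Returns one tuple
    (centre, number_of_cs, fraction_during_stimulus) per window.
    """
    results = []
    for fr in centres:
        low, high = fr - delta_freq / 2, fr + delta_freq / 2
        cs = [
            current
            for previous, current in zip(timestamps, timestamps[1:])
            if current > previous and low < 1.0 / (current - previous) < high
        ]
        during = sum(1 for t in cs if any(on <= t < off for on, off in on_periods))
        fraction = during / len(cs) if cs else 0.0
        results.append((fr, len(cs), fraction))
    return results


# Hypothetical cell firing near 100 Hz during a flash and near 10 Hz afterwards.
train = [0.0, 0.01, 0.02, 0.5, 0.6, 0.7]
table = sweep_windows(train, centres=[100.0, 10.0], delta_freq=4.0, on_periods=[(0.0, 0.3)])
```

Summing the second element of each tuple gives an overall number of CS across the windows.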

The One Window option lets you choose a maximal and a minimal velocity and performs the CS method with these settings. It also shows graphical output in the main chart, in a raster fashion, although you can also go to "Visualization/CS Population Response" to display the Conditioned Spikes Population Response.

References

More on "Events concept": Conditioned spikes: a simple and fast method to represent rates and temporal patterns in multielectrode recordings, J. Neurosci Meth, 133 (2004): 135-141
More on Matlab routines: MathWorks site

To get started, there are three sample files included in the zip files you download; it is highly recommended that you start with them. The two pairs structured files contain the same data, one in the (times, electrodes) format and the other in the (electrodes, times) format. They hold a 22-second recording with regular 0.5 Hz white-black flashes, 300 ms ON period, stimulus labels: 1000 and 2000. The third sample file is column structured: a 5-second recording with the label Event002 marking the stimulus information.