# Nodes

## Data

**These nodes are for performing general data related operations**

### LoadFile

Loads a saved Transmission file. If you have a Project open, the project path is set automatically according to the open project. Otherwise you must specify the project path. You can specify a project path other than that of the currently open project (this is untested, weird things could happen). You should not merge Transmissions originating from different projects.

Note

You can also load a saved Transmission file by dragging & dropping it into the Flowchart area. This creates a LoadFile node named after the dropped file.

| Terminal | Description |
| --- | --- |
| Out | Transmission loaded from the selected file. |

| Parameters | Description |
| --- | --- |
| load_trn | Button to choose a .trn file (Transmission) to load |
| proj_trns | Load a Transmission file located in the project’s “trns” directory |
| proj_path | Button to select the Mesmerize project that corresponds to the chosen .trn file. |

Note

The purpose of specifying the Project Path when you load a saved Transmission file is so that interactive plots and the Datapoint Tracer can find raw data that correspond to datapoints.

### LoadProjDF

Load the entire Project DataFrame (root) of the project that is currently open, or a sub-DataFrame that corresponds to a tab that you have created in the Project Browser.

Output Data Column (numerical): _RAW_CURVE

Each element in this output column contains a 1-D array representing the trace extracted from an ROI.

| Terminal | Description |
| --- | --- |
| Out | Transmission created from the Project DataFrame or sub-DataFrame. |

| Parameters | Description |
| --- | --- |
| DF_Name | DataFrame name. List corresponds to Project Browser tabs. |
| Update | Re-create Transmission from corresponding Project Browser tab. |
| Apply | Process data through this node |

Note

The `DF_Name` options do not update live with the removal or creation of tabs in the Project Browser; you must create a new node to reflect these types of changes.

### Save

Save the input Transmission to a file so that it can be re-loaded in the Flowchart for later use.

Usage: Connect an input Transmission to this node’s `In` terminal, click the button to choose a path to save a new file to, and then click the Apply checkbox to save the input Transmission to the chosen file.

| Terminal | Description |
| --- | --- |
| In | Transmission to be saved to file |

| Parameters | Description |
| --- | --- |
| saveBtn | Button to choose a filepath to save the Transmission to. |
| Apply | Process data through this node |

Note

You must always save a Transmission to a new file (pandas with hdf5 exhibits weird behavior if you overwrite; this is the easiest workaround). If you try to overwrite a file you will be presented with an error saying that the file already exists.

### Merge

Merge multiple Transmissions into a single Transmission. The DataFrames of the individual Transmissions are concatenated using pandas.concat and History Traces are also merged. The History Trace of each individual input Transmission is kept separately.

Warning

At the moment, if you create two separate data streams that originate from the same Transmission and then merge them at a later point, the analysis logs (History Traces) of the individual data streams are not maintained. See the information about data blocks in the Transmission.

| Terminal | Description |
| --- | --- |
| In | Transmissions to be merged |
| Out | Merged Transmission |
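The DataFrame side of a merge can be sketched with plain pandas. This is only an illustration of `pandas.concat` on two hypothetical DataFrames, not the node's actual implementation; the column names and the `ignore_index=True` choice are assumptions, and History Trace merging is not shown.

```python
import pandas as pd

# Two hypothetical DataFrames standing in for the DataFrames of two Transmissions
df_a = pd.DataFrame({"cell_type": ["photoreceptor", "neuron"], "peak_count": [3, 5]})
df_b = pd.DataFrame({"cell_type": ["neuron"], "peak_count": [2]})

# Concatenate row-wise; ignore_index=True (an assumption here) renumbers the rows
merged = pd.concat([df_a, df_b], ignore_index=True)
print(merged)
```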

### ViewTransmission

View the input Transmission object using the Spyder Object Editor. For example, you can explore the Transmission DataFrame and HistoryTrace.

### TextFilter

Include or Exclude Transmission DataFrame rows according to a text filter in a categorical column.

Usage Example: If you want to select all traces that are from photoreceptor cells and you have a categorical column, named cell_type for example, containing cell type labels, choose “cell_type” as the `Column` parameter, enter “photoreceptor” as the `filter` parameter, and select `Include`. If you want to select everything that is not a photoreceptor, select `Exclude`.

Note

It is recommended to filter and group your data beforehand using the Project Browser since it allows much more sophisticated filtering.

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with its DataFrame filtered according to the parameters |

| Parameters | Description |
| --- | --- |
| Column | Categorical column that contains the text filter to apply |
| filter | Text filter to apply |
| Include | Include all rows matching the text filter |
| Exclude | Exclude all rows matching the text filter |
| Apply | Process data through this node |

HistoryTrace output structure: Dict of all the parameters for this node
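The usage example above corresponds roughly to a boolean mask in pandas. The following is a minimal sketch on a hypothetical DataFrame; the exact-match rule and the `trace_id` column are assumptions, not the node's confirmed behavior.

```python
import pandas as pd

df = pd.DataFrame({
    "cell_type": ["photoreceptor", "neuron", "photoreceptor", "glia"],
    "trace_id": [0, 1, 2, 3],
})

# Rows whose categorical column equals the filter text (assumed matching rule)
mask = df["cell_type"] == "photoreceptor"

included = df[mask]    # corresponds to selecting Include
excluded = df[~mask]   # corresponds to selecting Exclude
```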

### SpliceArrays

Splice the arrays in the specified numerical data column and place the spliced output arrays in the output column.

Output Data Column (numerical): _SPLICE_ARRAYS

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with arrays from the input column spliced and placed in the output column |

| Parameters | Description |
| --- | --- |
| data_column | Numerical data column containing the arrays to be spliced |
| indices | The splice indices, “start_index:end_index” |
| Apply | Process data through this node |
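As a rough illustration of the “start_index:end_index” format, the sketch below parses such a string and slices every array in a column. The `splice` helper and the example DataFrame are hypothetical, and Python half-open slicing (end index excluded) is assumed.

```python
import numpy as np
import pandas as pd

def splice(arr: np.ndarray, indices: str) -> np.ndarray:
    """Slice arr using a 'start_index:end_index' string (hypothetical helper)."""
    start, end = (int(s) for s in indices.split(":"))
    return arr[start:end]

# Arrays of unequal length, which splicing brings to a common length
df = pd.DataFrame({"_RAW_CURVE": [np.arange(10), np.arange(12)]})
df["_SPLICE_ARRAYS"] = df["_RAW_CURVE"].apply(lambda a: splice(a, "2:8"))
```

After splicing, all arrays have the same length, which is what plots such as the Heatmap require.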

### DropNa

Drop NaNs and Nones (null) from the Transmission DataFrame. Uses DataFrame.dropna and DataFrame.isna methods.

- If you choose “row” or “column” as axis, entire rows or columns will be dropped if any or all (see params) of the values are NaN/None.
- If you choose to drop NaNs/Nones according to a specific column, it will drop the entire row if that row has a NaN/None value for the chosen column.

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with NaNs and Nones removed according to the params |

| Parameters | Description |
| --- | --- |
| axis | Choose to drop rows, columns, or rows according to a specific column. |
| how | any: Drop if any value in the row/column is NaN/None; all: Drop only if all values in the row/column are NaN/None. Ignored if the “axis” parameter is set to a specific column |
| Apply | Process data through this node |
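The options above map onto `DataFrame.dropna` roughly as follows. This is a sketch on a hypothetical DataFrame; how the node translates its `axis` choices into `dropna` arguments is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
    "c": ["x", None, "z"],
})

rows_any = df.dropna(axis="index", how="any")     # drop rows with at least one null
rows_all = df.dropna(axis="index", how="all")     # drop rows where every value is null
cols_any = df.dropna(axis="columns", how="any")   # drop columns with at least one null
by_column = df.dropna(subset=["a"])               # drop rows that are null in column "a"
```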

### NormRaw

Scale the raw data such that the min and max values are set to the min and max values derived from the raw spatial regions of the image sequences they originate from. Only for CNMFE data.

The arrays in the `_RAW_CURVE` column are scaled and the output is placed in a new column named `_NORMRAW`.

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| option | Derive the raw min & max values from one of the options listed below |
| Apply | Process data through this node |

Options:

- top_5: Top 5 brightest pixels
- top_10: Top 10 brightest pixels
- top_5p: Top 5% of brightest pixels
- top_10p: Top 10% of brightest pixels
- top_25p: Top 25% of brightest pixels
- full_mean: Full mean of the min and max array

Note

If the raw min value is higher than the raw max value the curve will be excluded in the output. You will be presented with a warning box with the number of curves that were excluded due to this.

## Display

**These nodes connect input Transmission(s) to various plots for visualization**

The actual Plot Widget instance that these nodes use can be accessed through the `plot_widget` attribute in the flowchart console.

For example

```
# Get a heatmap node that is named "Heatmap.0"
>>> hn = get_nodes()['Heatmap.0']
# the plot widget instance
>>> hn.plot_widget
<mesmerize.plotting.widgets.heatmap.widget.HeatmapTracerWidget object at 0x7f26e5d29678>
```

### BeeswarmPlots

Based on pyqtgraph Beeswarm plots.

Visualize data points as a pseudoscatter and as corresponding Violin Plots. This is commonly used to visualize peak features and compare different experimental groups.

For information on the plot widget see Beeswarm Plots

| Terminal | Description |
| --- | --- |
| In | Input Transmission. The DataFrame column(s) of interest must have single numerical values, not arrays |

### Heatmap

Used for visualizing numerical arrays in the form of a heatmap. Also used for visualizing a hierarchical clustering tree (dendrogram) along with a heatmap whose row order corresponds to the order of the leaves of the dendrogram.

For information on the plot widget see Heat Plot

| Terminal | Description |
| --- | --- |
| In | Input Transmission. The arrays in the DataFrame column(s) of interest **must** be of the same length |

Note

Arrays in the DataFrame column(s) of interest **must** be of the same length. If they are not, you must splice them using the SpliceArrays node.

### CrossCorr

Perform Cross-Correlation analysis. For information on the plot widget see CrossCorrelation Plot

### Plot

For information on the plot widget see <plot_SimplePlot>

A simple plot.

| Terminal | Description |
| --- | --- |
| In | Input Transmission |

| Parameters | Description |
| --- | --- |
| data_column | Data column to plot, must contain numerical arrays |
| Show | Show/hide the plot window |
| Apply | Process data through this node |

### Proportions

Plot stacked bar chart of one categorical variable vs. another categorical variable.

For information on the plot widget see Proportions Plot

### ScatterPlot

Create scatter plot of numerical data containing [X, Y] values

For information on the plot widget see Scatter Plot

### TimeSeries

Plot the means along with confidence intervals or standard deviation of numerical arrays representing time series data.

For more information see plot_TimeSeries

## Signal

**Routine signal processing functions**

I recommend this book by Tom O’Haver if you are unfamiliar with basic signal processing: https://terpconnect.umd.edu/~toh/spectrum/TOC.html

### Butterworth

Creates a Butterworth filter using scipy.signal.butter and applies it using scipy.signal.filtfilt.

The Wn parameter of scipy.signal.butter is calculated by dividing the sampling rate of the data by the `freq_divisor` parameter (see below).

Output Data Column (numerical): _BUTTERWORTH

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with filtered signals in the output data column |

| Parameters | Description |
| --- | --- |
| data_column | Data column containing numerical arrays to be filtered |
| order | Order of the filter |
| freq_divisor | Divisor for dividing the sampling frequency of the data to get Wn |
| Apply | Process data through this node |
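To make the relationship between the sampling rate, `freq_divisor` and Wn concrete, here is a minimal sketch with scipy. It assumes a low-pass filter and that the cutoff `fs / freq_divisor` is interpreted in Hz (by passing `fs` to `scipy.signal.butter`); the node's source may handle units differently, and all numerical values are hypothetical.

```python
import numpy as np
from scipy import signal

fs = 10.0            # sampling rate in Hz (hypothetical)
freq_divisor = 4.0   # hypothetical value
order = 3

# Cutoff derived as described above; it must stay below the Nyquist frequency fs / 2
cutoff_hz = fs / freq_divisor

# Passing fs lets scipy interpret the cutoff in Hz (an assumption about units)
b, a = signal.butter(order, cutoff_hz, btype="low", fs=fs)

t = np.arange(0, 20, 1 / fs)
clean = np.sin(2 * np.pi * 0.2 * t)
noisy = clean + 0.5 * np.sin(2 * np.pi * 4.0 * t)

# filtfilt runs the filter forward and backward, so the output has no phase shift
filtered = signal.filtfilt(b, a, noisy)
```

With these values the 4 Hz component lies above the 2.5 Hz cutoff and is strongly attenuated, while the 0.2 Hz component passes through.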

### SavitzkyGolay

Savitzky Golay filter. Uses scipy.signal.savgol_filter.

Output Data Column (numerical): _SAVITZKY_GOLAY

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with filtered signals in the output data column |

| Parameters | Description |
| --- | --- |
| data_column | Data column containing numerical arrays to be filtered |
| window_length | Size of windows for fitting the polynomials. Must be an odd number. |
| polyorder | Order of polynomials to fit into the windows. Must be less than window_length |
| Apply | Process data through this node |
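A short sketch of `scipy.signal.savgol_filter` with the two parameters above; the test signal and the parameter values are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 101)
clean = np.sin(2 * np.pi * t)
noisy = clean + rng.normal(scale=0.1, size=t.size)

# window_length must be odd and polyorder must be less than window_length
smoothed = savgol_filter(noisy, window_length=11, polyorder=3)
```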

### PowSpecDens

### Resample

Resample the data in numerical arrays. Uses scipy.signal.resample.

Output Data Column (numerical): _RESAMPLE

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with resampled signals in the output data column |

| Parameters | Description |
| --- | --- |
| data_column | Data column containing numerical arrays to be resampled |
| Rs | New sampling rate in `Tu` units of time. |
| Tu | Time unit |
| Apply | Process data through this node |

Note

If Tu = 1, then Rs is the new sampling rate in Hertz.
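The sketch below shows one way `Rs` and `Tu` could translate into the sample count that `scipy.signal.resample` expects. The original sampling rate `fs` and the formula for `n_out` are assumptions made for illustration, not taken from the node's source.

```python
import numpy as np
from scipy.signal import resample

fs = 10.0   # original sampling rate in Hz (hypothetical)
Rs = 5.0    # new sampling rate per Tu
Tu = 1.0    # time unit in seconds, so Rs / Tu is in Hz

x = np.sin(2 * np.pi * 0.5 * np.arange(100) / fs)   # 10 s of a 0.5 Hz sine

# Number of output samples implied by the new rate (assumed relationship)
n_out = int(round(x.size * (Rs / Tu) / fs))
resampled = resample(x, n_out)
```

Here 100 samples at 10 Hz become 50 samples at 5 Hz.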

### ScalerMeanVariance

Uses tslearn.preprocessing.TimeSeriesScalerMeanVariance

Output Data Column (numerical): _SCALER_MEAN_VARIANCE

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with scaled signals in the output column |

| Parameters | Description |
| --- | --- |
| data_column | Data column containing numerical arrays to be scaled |
| mu | Mean of the output time series |
| std | Standard Deviation of the output time series |
| Apply | Process data through this node |

Note

If mu = 0 and std = 1, the output is the z-score of the signal.
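The arithmetic behind this scaling can be sketched in plain NumPy. This is not a call into tslearn; `scale_mean_variance` is a hypothetical helper that only mirrors the roles of the `mu` and `std` parameters.

```python
import numpy as np

def scale_mean_variance(x: np.ndarray, mu: float = 0.0, std: float = 1.0) -> np.ndarray:
    """Rescale x to mean mu and standard deviation std (hypothetical helper)."""
    return (x - x.mean()) / x.std() * std + mu

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z = scale_mean_variance(x)                          # mu = 0, std = 1 gives the z-score
shifted = scale_mean_variance(x, mu=10.0, std=2.0)  # mean 10, standard deviation 2
```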

### Normalize

Normalize the signal so that all values are between 0 and 1 based on the min and max of the signal.

Output Data Column (numerical): _NORMALIZE

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with scaled signals in the output column |

| Parameters | Description |
| --- | --- |
| data_column | Data column containing numerical arrays to be scaled |
| Apply | Process data through this node |
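A minimal NumPy sketch of min-max normalization as described above, using a hypothetical input array:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 10.0])

# Subtract the minimum and divide by the range so values span 0 to 1
normalized = (x - x.min()) / (x.max() - x.min())
```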

### RFFT

Uses scipy.fftpack.rfft. “Discrete Fourier transform of a real sequence”

Output Data Column (numerical): _RFFT

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the RFFT of signals in the output column |

| Parameters | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| Apply | Process data through this node |

### iRFFT

Uses scipy.fftpack.irfft. “inverse discrete Fourier transform of real sequence x”

Output Data Column(numerical): _IRFFT
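The two transforms are inverses of each other, which the following sketch shows on a hypothetical array using the same scipy functions:

```python
import numpy as np
from scipy.fftpack import rfft, irfft

x = np.array([1.0, 2.0, 1.0, -1.0, 1.5, 0.5])

spectrum = rfft(x)            # real-valued array with the same length as x
recovered = irfft(spectrum)   # inverse transform restores the original samples
```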

### PeakDetect

Simple Peak Detection using derivatives. The “Differentiation” chapter of Tom O’Haver’s book has a section on Peak Detection which I recommend reading: https://terpconnect.umd.edu/~toh/spectrum/TOC.html

Output Data Column (DataFrame): peaks_bases

| Terminal | Description |
| --- | --- |
| Derivative | Transmission with derivatives of signals. Must have a `_DERIVATIVE` column. It’s recommended to use a derivative from a normalized filtered signal. |
| Normalized | Transmission containing Normalized signals, used for thresholding. See Normalize node |
| Curve | Transmission containing original signals. Usually not filtered to avoid distortions caused by filtering |
| PB_Input (optional) | Transmission containing peaks & bases data (peaks_bases column). Useful for visualizing a saved Transmission that has peaks & bases data |
| Out | Transmission with the detected peaks & bases as DataFrames in the output column |

Warning

The `PB_Input` terminal overrides all other terminals. Do not connect inputs to `PB_Input` and other terminals simultaneously.

| Parameter | Description |
| --- | --- |
| data_column | Data column of the input Curve Transmission for placing peaks & bases onto |
| Fictional_Bases | Add bases to beginning and end of signal if first or last peak is lonely |
| Edit | Open Peak Editor GUI, see Peak Editor |
| SlopeThr | Slope threshold |
| AmplThrAbs | Absolute amplitude threshold |
| AmplThrRel | Relative amplitude threshold |
| Apply | Process data through this node |

### PeakFeatures

Compute peak features. The DataFrame of the output Transmission contains one row for each peak.

| Output Data Column | Description |
| --- | --- |
| _pf_peak_curve | array representing the peak |
| _pf_ampl_rel_b_ix_l | peak amplitude relative to its left base |
| _pf_ampl_rel_b_ix_r | peak amplitude relative to its right base |
| _pf_ampl_rel_b_mean | peak amplitude relative to the mean of its bases |
| _pf_ampl_rel_zero | peak amplitude relative to zero |
| _pf_area_rel_zero | Simpson’s Rule Integral of the curve |
| _pf_area_rel_min | Simpson’s Rule Integral relative to the minimum value of the curve. Subtracts the minimum value of the peak curve before computing the integral |
| _pf_rising_slope_avg | slope of the line drawn from the left base to the peak |
| _pf_falling_slope_avg | slope of the line drawn from the right base to the peak |
| _pf_duration_base | distance between the left and right base |
| _pf_p_ix | index of the peak maximum in the parent curve |
| _pf_uuid | peak UUID |
| _pf_b_ix_l | index of the left base in the parent curve |
| _pf_b_ix_r | index of the right base in the parent curve |

See also

`mesmerize/analysis/compute_peak_features` for the code that computes the peak features.

| Terminal | Description |
| --- | --- |
| In | Input Transmission. Must contain a peaks_bases column that contains peaks_bases DataFrames. |
| Out | Transmission with peak features in various output columns |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays from which to compute peak features. |
| Apply | Process data through this node |
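A few of the features above can be illustrated on a single hypothetical peak curve. This is a sketch of the definitions as written in the table, not the code in `compute_peak_features`; it assumes the curve runs from the left base to the right base with unit sample spacing.

```python
import numpy as np
from scipy.integrate import simpson

# Hypothetical peak curve; first and last samples are the left and right bases
curve = np.array([1.0, 2.0, 5.0, 3.0, 1.5])

ampl_rel_left = curve.max() - curve[0]         # amplitude relative to the left base
ampl_rel_right = curve.max() - curve[-1]       # amplitude relative to the right base
area_rel_zero = simpson(curve)                 # Simpson's rule integral of the curve
area_rel_min = simpson(curve - curve.min())    # integral after subtracting the minimum
```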

## Math

**Nodes for performing basic Math functions**

### Derivative

Computes the first derivative.

Output Data Column (numerical): _DERIVATIVE

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the derivative placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| Apply | Process data through this node |

### TVDiff

Based on Numerical Differentiation of Noisy, Nonsmooth Data, Rick Chartrand (2011). Translated to Python by Simone Sturniolo.

### XpowerY

Raises each element of the numerical arrays in the data_column to the exponent Y

Output Data Column (numerical): _X_POWER_Y

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| Y | Exponent |
| Apply | Process data through this node |

### AbsoluteValue

Element-wise absolute values of the input arrays. Computes root mean squares if input arrays are complex.

Output Data Column (numerical): _ABSOLUTE_VALUE

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| Apply | Process data through this node |

### LogTransform

Perform Logarithmic transformation of the data.

Output Data Column (numerical): _LOG_TRANSFORM

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| transform | One of the transforms listed below |
| Apply | Process data through this node |

Transforms:

- log10: Base 10 logarithm
- ln: Natural logarithm
- modlog10: \(\operatorname{sign}(x) \cdot \log_{10}(|x| + 1)\)
- modln: \(\operatorname{sign}(x) \cdot \ln(|x| + 1)\)
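The two modified transforms are defined for zero and negative values, unlike a plain logarithm. A NumPy sketch of the formulas above, with hypothetical function names and input:

```python
import numpy as np

def modlog10(x: np.ndarray) -> np.ndarray:
    """sign(x) * log10(|x| + 1)"""
    return np.sign(x) * np.log10(np.abs(x) + 1)

def modln(x: np.ndarray) -> np.ndarray:
    """sign(x) * ln(|x| + 1)"""
    return np.sign(x) * np.log(np.abs(x) + 1)

x = np.array([-99.0, -9.0, 0.0, 9.0, 99.0])
y10 = modlog10(x)   # symmetric around zero, and zero maps to zero
yln = modln(x)
```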

### ArrayStats

Perform a few basic statistical functions.

Output Data Column (numerical): Customizable by user entry. Output data are single numbers, not arrays

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

The desired function is applied to each 1D array in the `data_column` and the output is placed in the Output Data Column.

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| function | One of the functions listed below |
| group_by (Optional) | Group by a categorical variable, for example get the mean array of a group |
| group_by_sec (Optional) | Group by a secondary categorical variable |
| output_col | Enter a name for the output column |
| Apply | Process data through this node |

Functions:

- amin: Return the minimum of the input array
- amax: Return the maximum of the input array
- nanmin: Return the minimum of the input array, ignore NaNs
- nanmax: Return the maximum of the input array, ignore NaNs
- ptp: Return the range (max - min) of the values of the input array
- median: Return the median of the input array
- mean: Return the mean of the input array
- std: Return the standard deviation of the input array
- var: Return the variance of the input array
- nanmedian: Return the median of the input array, ignore NaNs
- nanmean: Return the mean of the input array, ignore NaNs
- nanstd: Return the standard deviation of the input array, ignore NaNs
- nanvar: Return the variance of the input array, ignore NaNs
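The function names above match NumPy functions, so the per-array step can be sketched by looking the function up by name. This assumes the node maps names to NumPy this way; the DataFrame and the `CURVE_MAX` output column name are hypothetical, and grouping is not shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "_RAW_CURVE": [np.array([1.0, 4.0, 2.0]), np.array([3.0, np.nan, 9.0, 5.0])],
})

function = "nanmax"        # any of the function names listed above
output_col = "CURVE_MAX"   # hypothetical user-chosen output column name

# Look up the NumPy function by name and reduce each 1D array to a single number
func = getattr(np, function)
df[output_col] = [func(arr) for arr in df["_RAW_CURVE"]]
```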

### ArgGroupStat

Group by a categorical variable and return the value of any other column based on a statistic. Basically creates sub-dataframes for each group and then returns based on the sub-dataframe.

Group by column “group_by” and return value from column “return_col” where data in `data_column` fits “stat”

Output Data Column (Any): ARG_STAT

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing single numbers (not arrays for now) |
| group_by | Group by column (categorical variables) |
| return_col | Return value from this column (any data) |
| stat | “max” or “min” |
| Apply | Process data through this node |
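In pandas terms, the description above resembles a group-wise `idxmax` (or `idxmin`) followed by a lookup in another column. The sketch below uses hypothetical column names and is not the node's implementation.

```python
import pandas as pd

df = pd.DataFrame({
    "cell_type": ["a", "a", "b", "b"],   # group_by
    "amplitude": [1.0, 3.0, 5.0, 2.0],   # data_column
    "uuid": ["u0", "u1", "u2", "u3"],    # return_col
})

# Row label of the maximum amplitude within each group (stat = "max")
idx = df.groupby("cell_type")["amplitude"].idxmax()

# Value of return_col at those rows
arg_stat = df.loc[idx, "uuid"]
```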

### ZScore

Compute Z-Scores of the data. Uses scipy.stats.zscore. The input data are divided into groups according to the `group_by` parameter. Z-Scores are computed for the data in each group with respect to the data only in that group.

Output Data Column (numerical): _ZSCORE

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Input data column containing numerical arrays |
| group_by | Categorical data column to group by. |
| Apply | Process data through this node |
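The grouping behaviour can be sketched with pandas and `scipy.stats.zscore`. For brevity this example uses one number per row, whereas the node works on a column of arrays; the column names are hypothetical and how the node combines arrays within a group is not shown.

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "genotype": ["wt", "wt", "wt", "mut", "mut", "mut"],
    "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# Z-scores are computed within each group, using only that group's mean and std
df["_ZSCORE"] = df.groupby("genotype")["value"].transform(
    lambda s: stats.zscore(s.to_numpy())
)
```

Both groups end up with the same z-scores even though their raw values differ by a factor of ten, because each group is scaled against itself.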

### LinRegress

Basically uses scipy.stats.linregress

Performs Linear Regression on numerical arrays and returns slope, intercept, r-value, p-value and standard error

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing 1D numerical arrays. The values are used as the y values and indices as the x values for the regression |

Output Columns: Single numbers, _SLOPE, _INTERCEPT, _R-VALUE, _P-VALUE, _STDERR as described in scipy.stats.linregress
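A minimal sketch of the regression for one array, with array values as y and their indices as x as described above; the data are hypothetical.

```python
import numpy as np
from scipy import stats

y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
x = np.arange(y.size)   # indices serve as the x values

result = stats.linregress(x, y)
slope, intercept = result.slope, result.intercept
r_value, p_value, stderr = result.rvalue, result.pvalue, result.stderr
```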

## Biology

**Nodes for some biologically useful things which I couldn’t categorize elsewhere**

### ExtractStim

Extract the portions of a trace corresponding to stimuli that have been temporally mapped onto it. It outputs one row per stimulus period.

Note: Stimulus extraction is currently quite slow; it will be optimized after some planned changes in the Transmission object.

| Output Data Column | Description |
| --- | --- |
| ST_TYPE | Stimulus type, corresponds to your Project Config |
| ST_NAME | Name of the stimulus |
| _ST_CURVE | The extracted array based on the parameters |
| _ST_START_IX | Start index of the stimulus period in the parent curve |
| _ST_END_IX | End index of the stimulus period in the parent curve |
| ST_uuid | UUID assigned for the extracted stimulus period |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing the signals to be extracted based on the stimulus maps |
| Stim_Type | Type of stimulus to extract |
| start_offset | Offset the start index of the stimulus mapping by a value (in frames) |
| end_offset | Offset the end index of the stimulus mapping by a value (in frames) |
| zero_pos | Zero index of the extracted signal, one of the options listed below |

Options for zero_pos:

- start_offset: extraction begins at the `start_offset` value, stops at the `end_offset`
- stim_end: extraction begins at the end of the stimulus, stops at the `end_offset`
- stim_center: extraction begins at the midpoint of the stimulus period plus the `start_offset`, stops at `end_offset`

### DetrendDFoF

Uses the detrend_df_f function from the CaImAn library. This node does not use any of the numerical data in a Transmission DataFrame to compute the detrended \(\Delta F / F_0\). It directly uses the CNMF output data for the Samples that are present in the Transmission DataFrame.

Output Data Column(numerical): _DETREND_DF_O_F

### StaticDFoFo

Perform \(\frac{F - F_0}{F_0}\) without a rolling window. \(F\) is an input array and \(F_0\) is the minimum value of the input array.

Output Data Column (numerical): _STATIC_DF_O_F

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | Transmission with the result placed in the output column |

| Parameter | Description |
| --- | --- |
| data_column | Data column containing numerical arrays |
| Apply | Process data through this node |
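The formula above in NumPy, on a hypothetical array:

```python
import numpy as np

F = np.array([2.0, 3.0, 4.0, 2.5])

F0 = F.min()             # F_0 is the minimum of the input array
dfof = (F - F0) / F0     # (F - F_0) / F_0, computed over the whole array at once
```

Note that the result is undefined when the minimum of the array is zero.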

## Clustering

### KShape

Perform KShape clustering. For more information see KShape plot.

## Hierarchical

These nodes allow you to perform Hierarchical Clustering using scipy.cluster.hierarchy.

If you are unfamiliar with Hierarchical Clustering I recommend going through this chapter from Michael Greenacre: http://www.econ.upf.edu/~michael/stanford/maeb7.pdf

Note

**Some of these nodes do not use Transmission objects for some inputs/outputs.**

### Linkage

Compute a linkage matrix which can be used to form flat clusters using the FCluster node.

Based on scipy.cluster.hierarchy.linkage

| Terminal | Description |
| --- | --- |
| In | Input Transmission |
| Out | dict containing the Linkage matrix and parameters, not a Transmission object |

| Parameters | Description |
| --- | --- |
| data_column | Numerical data column used for computing linkage matrix |
| method | linkage method |
| metric | metric for computing distance matrix |
| optimal_order | minimize distance between successive leaves, more intuitive visualization |
| Apply | Process data through this node |

### FCluster

“Form flat clusters from the hierarchical clustering defined by the given linkage matrix.”

Based on scipy.cluster.hierarchy.fcluster

Output Data Column (categorical): FCLUSTER_LABELS

| Terminal | Description |
| --- | --- |
| Linkage | Linkage matrix, output from Linkage node. |
| Data | Input Transmission, usually the same input Transmission used for the Linkage node. |
| IncM (optional) | Inconsistency matrix, output from Inconsistent |
| Monocrit (optional) | Output from MaxIncStat or MaxInconsistent |
| Out | Transmission with clustering data that can be visualized using the Heatmap |

Parameters: Exactly as described in scipy.cluster.hierarchy.fcluster

HistoryTrace output structure: Dict of all the parameters for this node, as well as the parameters used for creating the linkage matrix and the linkage matrix itself from the Linkage node.
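The Linkage and FCluster steps correspond to two scipy calls, sketched below on hypothetical data. The method, metric and criterion values are arbitrary choices for illustration; `optimal_ordering` is the scipy argument assumed to correspond to the `optimal_order` parameter of the Linkage node.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Four hypothetical traces: two near zero and two near ten
data = np.array([
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [10.0, 10.1, 10.0],
    [10.1, 10.0, 10.1],
])

# Linkage step: hierarchical clustering encoded as a linkage matrix
Z = linkage(data, method="complete", metric="euclidean", optimal_ordering=True)

# FCluster step: cut the tree into at most two flat clusters
labels = fcluster(Z, t=2, criterion="maxclust")
```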

### Inconsistent

### MaxIncStat

### MaxInconsistent

## Transform

Nodes for transforming data

### LDA

Perform Linear Discriminant Analysis. Uses sklearn.discriminant_analysis.LinearDiscriminantAnalysis

| Terminal | Description |
| --- | --- |
| train_data | Input Transmission containing the training data |
| predict | Input Transmission containing data on which to predict |
| T | Transmission with Transformed data and decision function. Output columns: _LDA_TRANSFORM: The transformed data, can be visualized with a Scatter Plot for instance; _LDA_DFUNC: Decision function (confidence scores), can be visualized with a Heatmap |
| coef | Transmission with LDA Coefficients. Output columns: classes: The categorical labels that were trained against; _COEF: LDA Coefficients (weight vectors) for the classes, can be visualized with a Heatmap |
| means | Transmission with LDA Means. Output columns: classes: The categorical labels that were trained against; _MEANS: LDA means for the classes, can be visualized with a Heatmap |
| predicted | Transmission containing predicted class labels for the data. The class labels are placed in a column named LDA_PREDICTED_LABELS. The names of the class labels correspond to the labels from the training labels. Optional |

| Parameter | Description |
| --- | --- |
| train_data | Single or multiple data columns that contain the input features. |
| labels | Data column containing categorical labels to train to |
| solver | svd: Singular Value Decomposition; lsqr: Least Squares solution; eigen: Eigen decomposition |
| shrinkage | Can be used with the lsqr or eigen solvers. |
| shrinkage_val | Shrinkage value if shrinkage is set to “value” |
| n_components | Number of components to output |
| tol | Tolerance threshold exponent. The value used is `10^<tol>` |
| score | Displays mean score of the classification (read only) |
| predict_on | Single or multiple data columns that contain the data that are used for predicting on. Usually the same name as the data column(s) used for the training data. Optional |

HistoryTrace output structure: Dict of all the parameters for this node