# KShape

Perform KShape clustering.

I recommend reading the paper on it: Paparrizos, John, and Luis Gravano. “k-Shape: Efficient and Accurate Clustering of Time Series.” In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 1855-1870. ACM, 2015.

This GUI uses the tslearn.clustering.KShape implementation.

Note

This plot can be saved in an interactive form; see Saving plots.

**Layout**

**Left:** KShape parameters and Plot parameters

**Bottom left:** Plot of a random sample of input data from a cluster.

**Center:** Plot of the cluster mean with either a confidence interval, the standard deviation, or neither. Uses seaborn.lineplot.

**Right:** Proportions plot. Exactly the same as Proportions.

**Bottom Right:** Console

## KShape Parameters

The parameters and input data are passed directly to tslearn.clustering.KShape.

The parameters are described here as they appear in the tslearn documentation.

**data_column:** Input data for clustering.

**n_clusters:** Number of clusters to form.

**max_iter:** Maximum number of iterations of the k-Shape algorithm.

**tol:** Inertia variation threshold. If at some point, inertia varies less than this threshold between two consecutive iterations, the model is considered to have converged and the algorithm stops.

**n_init:** Number of times the k-Shape algorithm will be run with different centroid seeds. The final results will be the best output of n_init consecutive runs in terms of inertia.

**random_state:** Generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

**training subset:** The subset of the input data used for training. After training, cluster predictions are made for all of the input data.

## Plot Options

**Plot cluster:** The cluster from which to plot random samples of input data in the bottom left plot

**Show centers:** Show the centroids returned by the KShape model

Warning

There’s currently an issue where cluster centroids don’t appear to be indexed correctly. See https://github.com/rtavenar/tslearn/issues/114

**max num curves:** Maximum number of input data samples to plot
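The combination of **Plot cluster** and **max num curves** amounts to drawing a capped random sample from one cluster's members. A minimal sketch of that idea, using a hypothetical helper `sample_cluster_members` that is not part of the GUI or of tslearn:

```python
import numpy as np

def sample_cluster_members(X, labels, cluster, max_num_curves, seed=None):
    """Return up to max_num_curves randomly chosen curves from one cluster."""
    rng = np.random.default_rng(seed)
    # Indices of all curves assigned to the requested cluster.
    idx = np.flatnonzero(np.asarray(labels) == cluster)
    # Never ask for more curves than the cluster contains.
    n = min(max_num_curves, idx.size)
    chosen = rng.choice(idx, size=n, replace=False)
    return np.asarray(X)[chosen]

# Toy data: six "curves" of two points each; row i is [2*i, 2*i + 1].
X_demo = np.arange(12).reshape(6, 2)
labels_demo = [0, 1, 0, 1, 0, 0]

sample0 = sample_cluster_members(X_demo, labels_demo, cluster=0, max_num_curves=3, seed=1)
sample1 = sample_cluster_members(X_demo, labels_demo, cluster=1, max_num_curves=10, seed=1)
```

When a cluster has fewer members than the cap, as with cluster 1 here, all of its members are returned.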

**Error band:** The type of data to show for the error band in the means plot.
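To make the error band options concrete, the sketch below computes a cluster mean with a standard deviation band, a confidence interval band, or no band. It is a plain NumPy approximation, not what the GUI does internally: the GUI relies on seaborn.lineplot, and the 95% interval here uses a normal approximation as a stand-in. The function name `mean_with_band` and the option strings are hypothetical.

```python
import numpy as np

def mean_with_band(members, band="std"):
    """Pointwise mean of cluster members plus an optional error band.

    members: array of shape (n_curves, n_timepoints)
    band: "std", "ci" (normal-approximation 95% interval), or None
    """
    members = np.asarray(members, dtype=float)
    mean = members.mean(axis=0)
    if band is None:
        return mean, None, None
    # Sample standard deviation across curves at each timepoint.
    std = members.std(axis=0, ddof=1)
    if band == "std":
        half_width = std
    elif band == "ci":
        # Assumption: 1.96 standard errors approximates a 95% interval.
        half_width = 1.96 * std / np.sqrt(members.shape[0])
    else:
        raise ValueError(f"unknown band type: {band!r}")
    return mean, mean - half_width, mean + half_width

demo = [[0.0, 2.0], [2.0, 4.0]]
mean, lower, upper = mean_with_band(demo, band="std")
ci_mean, ci_lower, ci_upper = mean_with_band(demo, band="ci")
plain_mean, no_lower, no_upper = mean_with_band(demo, band=None)
```

The confidence interval band narrows as more curves are averaged, while the standard deviation band reflects the spread of the cluster members themselves.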

**set x = 0 at:** The zero position of the means plot with respect to the cluster members in the plot.