Chapter 04

Identifying Groups in Data: Analyses Based on Dissimilarities Between Sequences

TOC | Content
4.1 Clustering sequences to uncover typologies | Crisp (or hard) clustering algorithms
4.2 Illustrative application | Hierarchical clustering: Ward’s linkage; Partitional clustering: PAM; Visualizing clustering options with MDS; Comparison between different time granularities
4.3 “Construct validity” for typologies from cluster analysis to sequences | No code for this section
4.4 Using typologies as dependent and independent variables | Clusters as outcomes; Clusters as predictors

Chapter 4 considers how to use the dissimilarity matrices to identify groups in data by applying different clustering techniques. The resulting typology is then analyzed as either a categorical independent or a categorical dependent variable within a regression framework. We recommend reading Section 4.3 carefully, as it contains important considerations on how to make informed decisions when choosing the number of clusters.
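To make the idea of clustering from a dissimilarity matrix concrete, here is a minimal sketch in plain Python. It is not the code used in this chapter. It rests on several assumptions: the six toy sequences and their state codes (S, E, U) are hypothetical, the Hamming distance stands in for whichever dissimilarity measure was actually chosen, and the `k_medoids` function is a simple alternating heuristic in the spirit of partitioning around medoids, not the full PAM build-and-swap algorithm.

```python
from itertools import combinations

# Hypothetical toy sequences: S = school, E = employment, U = unemployment.
seqs = [
    "SSSSEEEE",
    "SSSEEEEE",
    "SSSSSEEE",
    "UUUUEEUU",
    "UUUEEUUU",
    "UUUUUEUU",
]


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def dissimilarity_matrix(sequences):
    """Build a symmetric matrix of pairwise Hamming distances."""
    n = len(sequences)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = hamming(sequences[i], sequences[j])
    return d


def k_medoids(d, k, max_iter=100):
    """Alternating k-medoids on a precomputed dissimilarity matrix.

    Simplified heuristic (assign cases, then update medoids), not full PAM.
    Returns (medoids, labels), where medoids are row indices of d.
    """
    n = len(d)

    def assign(meds):
        # Each case joins the cluster of its closest medoid.
        return [min(range(k), key=lambda c: d[i][meds[c]]) for i in range(n)]

    # Deterministic start: the most central case, then the case farthest
    # from the medoids chosen so far.
    medoids = [min(range(n), key=lambda i: sum(d[i]))]
    while len(medoids) < k:
        medoids.append(
            max(range(n), key=lambda i: min(d[i][m] for m in medoids))
        )

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimizes total dissimilarity within the cluster.
            new_medoids.append(
                min(members, key=lambda i: sum(d[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids

    return medoids, assign(medoids)


d = dissimilarity_matrix(seqs)
medoids, labels = k_medoids(d, k=2)
print("medoid sequences:", [seqs[m] for m in medoids])
print("cluster labels:", labels)
```

On this toy input, the first three sequences (school followed by employment) end up in one cluster and the last three (mostly unemployment) in the other. The resulting labels are the kind of categorical typology that can then enter a regression model as an outcome or as a predictor.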




Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. Source code is available at, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".