The task of dividing data points into groups based on shared traits. Data points within a cluster are similar to one another and dissimilar to data points in other clusters. It’s an unsupervised learning method, used to draw inferences from data that has no labels.
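To make the idea concrete, here is a minimal sketch using k-means, which is only one of many clustering algorithms and is chosen here purely for illustration. The function name `kmeans`, the sample points, and the choice to seed the centroids with the first `k` points are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters (plain k-means, illustrative only)."""
    # Assumption: seed the centroids with the first k points so the
    # result is deterministic. Real implementations usually randomize.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Made-up sample data: two visibly separate groups, interleaved.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point: the grouping comes only from the distances between points, which is what makes the method unsupervised.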