Clustering is used to analyze data that does not include pre-labeled classes, or even a class attribute at all [1]. Data instances are grouped together using the concept of "maximizing the intraclass similarity and minimizing the interclass similarity," as concisely described by Han, Kamber & Pei. In practice, this means the clustering algorithm identifies instances that are very similar to one another and places them in the same group, while instances in different groups are much less similar to one another. k-means clustering is perhaps the best-known example of a clustering algorithm. Because clustering does not require instance classes to be labeled in advance, it is a form of unsupervised learning: it learns by observation rather than by example.
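As a rough illustration of the grouping idea described above, the following is a minimal k-means sketch in plain Python. It is not taken from the cited sources: the function names (`k_means`, `within_cluster_ss`), the sample points, and the choice to use the first k points as starting centroids are all assumptions made for this example. Practical implementations usually pick starting centroids randomly or with a scheme such as k-means++.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids).

    Simplification for this sketch: the first k points serve as the
    initial centroids, which keeps the example deterministic.
    """
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave the centroid in place if a cluster is empty
                centroids[j] = mean_point(members)
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its own centroid.

    Smaller values mean tighter clusters (higher intraclass similarity).
    """
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical unlabeled data: two visually obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
print(within_cluster_ss(points, labels, centroids))
```

No class labels are supplied anywhere in the input; the grouping comes only from the distances between points, which is what makes this an unsupervised method.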


  1. Mayo, Matthew. “Machine Learning Key Terms, Explained.” KDnuggets, 2016.

Clustering (last edited 2018-03-12 01:00:45 by notAndrey)
