Clustering is used to analyze data that does not include pre-labeled classes, or even a class attribute at all [1]. Data instances are grouped together using the concept of "maximizing the intraclass similarity and minimizing the interclass similarity," as concisely described by Han, Kamber & Pei. In practice, a clustering algorithm identifies and groups instances that are very similar to one another, while instances placed in different groups are much less similar. k-means clustering is perhaps the best-known example of a clustering algorithm. Because clustering does not require pre-labeled instance classes, it is a form of unsupervised learning: it learns by observation rather than by example.
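To make the idea of grouping similar instances concrete, here is a minimal sketch of k-means in plain Python. It is an illustration under stated assumptions, not a reference implementation: the function name kmeans, its parameters, the use of Euclidean distance, the random choice of starting centroids, and the sample points are all assumptions made for this example and do not come from the source cited above.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Group points (tuples of floats) into k clusters.

    Hypothetical helper for illustration; uses Euclidean distance.
    Returns (labels, centroids), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # No centroid moved, so the assignment is stable.
        centroids = new_centroids
    return labels, centroids


# Made-up, unlabeled sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (8.0, 8.1), (7.9, 8.3), (8.2, 7.8)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

No class labels are supplied at any point. The first three points end up sharing one cluster label and the last three the other, purely because of how close the points are to each other. Which numeric label each group receives depends on the random starting centroids.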

Sources

  1. Mayo, Matthew. “Machine Learning Key Terms, Explained.” KDnuggets, 2016, https://www.kdnuggets.com/2016/05/machine-learning-key-terms-explained.html

Clustering (last edited 2018-03-12 01:00:45 by notAndrey)
