Hierarchical Grouping to Optimize an Objective Function
Authors: Joe H. Ward Jr.
Published: 1963 (Journal Paper)
Source: Journal of the American Statistical Association
Algorithm: Ward's Method
DOI: 10.1080/01621459.1963.10500845
Summary
Ward introduces an agglomerative hierarchical clustering procedure that starts from n groups and, at each step, merges the pair of groups whose union gives the best value of a chosen objective function. The result is both a dendrogram and a quantitative record of the loss incurred at each merge. In its most influential form, now known as Ward's method, the objective is the within-cluster sum of squares, and each step merges the pair whose union increases it least. This gives a practical variance-based clustering rule for data sets where exact global optimization over a fixed number of clusters is infeasible.
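The greedy merge loop can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`centroid`, `ess`, `ward_merge_cost`, `agglomerate`) and the toy data are assumptions made for illustration, and the objective is written as a loss to be minimized, namely the increase in within-cluster sum of squares. The brute-force search re-evaluates every pair at every step, so it is only suitable for small inputs.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def ess(points):
    """Within-group sum of squared distances to the group centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def ward_merge_cost(a, b):
    """Increase in total within-group sum of squares if a and b are merged."""
    return ess(a + b) - ess(a) - ess(b)


def agglomerate(points, cost=ward_merge_cost):
    """Greedy agglomeration from singletons down to one group.

    Returns one (group_a, group_b, merge_cost) record per merge, in order.
    `cost` is a hypothetical hook: any function of two groups can be used.
    """
    groups = [[tuple(p)] for p in points]
    history = []
    while len(groups) > 1:
        # Evaluate all n(n - 1)/2 candidate unions and keep the cheapest.
        best_cost, i, j = min(
            (cost(groups[i], groups[j]), i, j)
            for i, j in combinations(range(len(groups)), 2)
        )
        history.append((list(groups[i]), list(groups[j]), best_cost))
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append(merged)
    return history


# Toy one-dimensional data (illustrative values, not the paper's example).
data = [(0.0,), (1.0,), (5.0,), (7.0,), (20.0,)]
for left, right, loss in agglomerate(data):
    print(left, "+", right, "loss:", loss)
```

Because each recorded loss is the increase in the total within-cluster sum of squares, the losses over all merges add up to the sum of squares of the full data set about its overall mean. Passing a different `cost` function reuses the same loop with another objective.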
Abstract
A procedure for forming hierarchical groups of mutually exclusive subsets, each of which has members that are maximally similar with respect to specified characteristics, is suggested for use in large-scale (n > 100) studies when a precise optimal solution for a specified number of groups is not practical. Given n sets, this procedure permits their reduction to n - 1 mutually exclusive sets by considering the union of all possible n(n - 1)/2 pairs and selecting a union having a maximal value for the functional relation, or objective function, that reflects the criterion chosen by the investigator. By repeating this process until only one group remains, the complete hierarchical structure and a quantitative estimate of the loss associated with each stage in the grouping can be obtained. A general flowchart helpful in computer programming and a numerical example are included.
Links
Primary
Standard
Alternate
Tags
- Clustering
- Hierarchical clustering
- Agglomerative clustering
- Ward's method
- Objective function
- Minimum variance clustering
- Multivariate statistics
- Cluster analysis