This function assigns each row of a data.frame to a cluster, based on the Gower distance matrix and on either a pre-specified number of clusters or an optimally determined one.
Arguments
- df
data.frame
- v_cluster
variables used to compute Gower distances between rows (if NULL, use all)
- k
number of clusters (if NULL, determined optimally; see Details)
- weights
(numeric vector) variable weights for calculating Gower distances (default all 1)
Details
First, a distance matrix is computed using cluster::daisy() with metric="gower"
and stand=TRUE. Next, clustering is performed around medoids (a more robust version of k-means
clustering) as implemented in cluster::pam().
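
The two steps above can be sketched roughly as follows. This is a minimal illustration using only cluster::daisy() and cluster::pam(), not the internals of this function; the toy data frame, its column names, and the choice k = 2 are assumptions made for the example.

```r
library(cluster)

set.seed(1)

# Hypothetical mixed-type data (column names are illustrative only)
df <- data.frame(
  x1 = rnorm(100),
  x2 = factor(sample(c("a", "b", "c"), 100, replace = TRUE)),
  x3 = runif(100)
)

# Step 1: Gower distance matrix, with one weight per variable (all 1 here)
d <- daisy(df, metric = "gower", stand = TRUE, weights = rep(1, ncol(df)))

# Step 2: partitioning around medoids on the dissimilarities,
# with a pre-specified number of clusters (assumed k = 2)
fit <- pam(d, k = 2, diss = TRUE)

# Cluster sizes; fit$clustering holds one cluster label per row of df
table(fit$clustering)
```

Because pam() is given a dissimilarity object rather than raw data, the medoids are reported as row indices (fit$id.med) of the original data frame.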
If no number of clusters k is specified, the optimal
number of clusters is determined from the current distance matrix using NbClust::NbClust() with
method="median" and index="silhouette".
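
As a rough sketch of how such a selection could look when only a distance matrix is available: the search range min.nc = 2 to max.nc = 6, the toy data, and the conversion to a plain dist object are assumptions for this example, and the function itself may call NbClust() with different settings.

```r
library(cluster)
library(NbClust)

set.seed(1)

# Hypothetical data with two well-separated groups
df <- data.frame(
  x1 = c(rnorm(50, mean = 0), rnorm(50, mean = 5)),
  x2 = c(rnorm(50, mean = 0), rnorm(50, mean = 5))
)

# Gower distance matrix
d <- daisy(df, metric = "gower", stand = TRUE)

# Convert to a plain "dist" object (a cautious step; may be unnecessary)
d_plain <- as.dist(as.matrix(d))

# When a dissimilarity matrix is supplied, distance must be NULL.
# min.nc and max.nc are assumed bounds for the search.
res <- NbClust(
  diss = d_plain, distance = NULL,
  min.nc = 2, max.nc = 6,
  method = "median", index = "silhouette"
)

# First element of Best.nc is the proposed number of clusters
k <- as.integer(res$Best.nc[1])

# Use the chosen k for the medoid-based clustering
fit <- pam(d, k = k, diss = TRUE)
table(fit$clustering)
```

Only indices that can be computed from dissimilarities alone (such as the silhouette) are available in this setting, since no data matrix is passed to NbClust().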
Examples
df |> get_cluster() |> table()
#> x1 x2 x3 y1 y2 y3
#> 1 1 1 1 1 1
#>
#> Only frey, mcclain, cindex, sihouette and dunn can be computed. To compute the other indices, data matrix is needed
#> k chosen by NbClust(): 2
#>
#> 1 2
#> 236 264
#>
#> 0 1
#> 236 264