202412121200
Status: #idea
Tags:
State: #nascient
K-Means Clustering
We specify a
The goal is to minimize the intra-cluster variance.
In practice finding a global optima is
But there is an approximate solution that can give a local optima.
- Give labels randomly
- Compute the centroid for each each class
- Assign each observation to the class of whose the centroid is closest
- Repeat until classes stop changing
Since this is a local optimum, we need to run it multiple times and select from those we selected the one that minimizes the error.