Anand Louis - On the complexity of clustering problems
From Katie Gentilello
Euclidean k-means clustering, a problem with numerous applications, is NP-hard in the worst case but is often solved efficiently in practice using simple heuristics. A quest to understand the properties of real-world data sets that allow efficient clustering has led to the notion of perturbation resilience. In the first part of the talk, I'll describe an algorithm to recover the optimal k-means clustering in perturbation-resilient instances.
In some cases, clustering with the k-means objective may result in a few clusters of very large cost and many clusters of small cost. This can be undesirable when we have a budget constraint on the cost of each cluster. Motivated by this, we study the "min-max k-means" clustering objective. In the second part of the talk, I'll show approximation algorithms for the min-max k-means problem.
Based on joint work with Amit Deshpande and Apoorv Vikram Singh.
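
To make the difference between the two objectives concrete, here is a minimal sketch that is not taken from the talk. It assumes the cost of a cluster is the sum of squared Euclidean distances from its points to their centroid, so that the k-means objective is the sum of the cluster costs and the min-max k-means objective is the largest single cluster cost. The function name `cluster_costs` and the toy data are hypothetical and chosen only for illustration.

```python
def cluster_costs(points, labels, k):
    """Return a list with the cost of each of the k clusters.

    Assumption: a cluster's cost is the sum of squared Euclidean
    distances from its points to the cluster centroid (mean).
    """
    costs = []
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            # An empty cluster contributes no cost.
            costs.append(0.0)
            continue
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        costs.append(
            sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
        )
    return costs


# Toy data (hypothetical): two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]

costs = cluster_costs(points, labels, k=2)
kmeans_objective = sum(costs)   # total cost over all clusters
minmax_objective = max(costs)   # cost of the most expensive cluster
```

A clustering can have a small `kmeans_objective` while one cluster still dominates the total. Minimizing `minmax_objective` instead targets that most expensive cluster, which is the quantity a per-cluster budget constrains.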