Part Six PDF
Unsupervised Learning
Clustering
PART 6
◼ Simple segmentation
❑ Dividing students into different registration groups
alphabetically, by last name
◼ Results of a query
❑ Groupings are a result of an external specification
◼ Graph partitioning
❑ Some mutual relevance and synergy, but areas are not
identical
◼ Dissimilarity matrix
❑ (one mode)
$$
\begin{bmatrix}
0 \\
d(2,1) & 0 \\
d(3,1) & d(3,2) & 0 \\
\vdots & \vdots & \vdots \\
d(n,1) & d(n,2) & \cdots & \cdots & 0
\end{bmatrix}
$$
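A minimal sketch of how the lower-triangular dissimilarity matrix d(i, j) could be computed. Euclidean distance is an assumption here (the slides do not fix a measure), and the function name `dissimilarity_matrix` and the sample points are hypothetical. Note that the code indexes points from 0, while the slide notation starts at 1.

```python
import math

def dissimilarity_matrix(points):
    """Return the lower-triangular dissimilarity matrix as a list of rows.

    Row i holds d(i, 0), ..., d(i, i-1), followed by the diagonal entry 0.
    Euclidean distance is assumed; any dissimilarity measure could be used.
    """
    matrix = []
    for i in range(len(points)):
        row = [math.dist(points[i], points[j]) for j in range(i)]
        row.append(0.0)  # d(i, i) = 0 on the diagonal
        matrix.append(row)
    return matrix

# Hypothetical sample data: three points on a line through the origin.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = dissimilarity_matrix(points)
for row in D:
    print(row)
```

Only the lower triangle is stored because d(i, j) = d(j, i) for a symmetric measure, which is why the matrix on the slide is shown as a triangle.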
◼ Hierarchical clustering
❑ A set of nested clusters organized as a hierarchical tree
◼ Figure: Traditional hierarchical clustering and traditional dendrogram (tree) for points p1, p2, p3, p4
◼ Center-based clusters
◼ Contiguous clusters
◼ Density-based clusters
◼ Property or Conceptual
◼ Well-Separated Clusters:
❑ A cluster is a set of points such that any point in a cluster is
closer (or more similar) to every other point in the cluster than
to any point not in the cluster.
◼ Figure: 3 well-separated clusters
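The well-separated definition can be checked directly for a given labeling. The sketch below assumes Euclidean distance; the function name `is_well_separated` and the sample points are hypothetical.

```python
import math

def is_well_separated(points, labels):
    """Check the well-separated property for a labeled set of points.

    Every point must be strictly closer to all other points in its own
    cluster than to any point in a different cluster.
    """
    for i, p in enumerate(points):
        inside = [math.dist(p, q) for j, q in enumerate(points)
                  if j != i and labels[j] == labels[i]]
        outside = [math.dist(p, q) for j, q in enumerate(points)
                   if labels[j] != labels[i]]
        # Singleton clusters (no 'inside') or a single cluster (no 'outside')
        # satisfy the condition trivially.
        if inside and outside and max(inside) >= min(outside):
            return False
    return True

# Hypothetical data: two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = is_well_separated(points, [0, 0, 1, 1])  # pairs grouped together
bad = is_well_separated(points, [0, 1, 0, 1])   # pairs split across clusters
print(good, bad)
```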
◼ Center-based
❑ A cluster is a set of objects such that an object in a cluster is
closer (or more similar) to the “center” of its own cluster than to the
center of any other cluster
❑ The center of a cluster is often a centroid, the average of all
the points in the cluster, or a medoid, the most “representative”
point of a cluster
◼ Figure: 4 center-based clusters
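The difference between a centroid and a medoid can be illustrated with a short sketch. It assumes Euclidean distance and takes the medoid to be the cluster member with the smallest total distance to all other members (one common reading of "most representative"). The function names and the sample cluster are hypothetical.

```python
import math

def centroid(points):
    """Average of all points, coordinate by coordinate (need not be a data point)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def medoid(points):
    """Data point with the smallest total distance to all other points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

def nearest_center(point, centers):
    """Index of the center closest to the point (center-based assignment)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

# Hypothetical cluster with one outlier at x = 15.
cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (4.0, 1.0), (15.0, 1.0)]
c = centroid(cluster)
m = medoid(cluster)
print(c)  # (5.0, 1.0): pulled toward the outlier, not an actual data point
print(m)  # (3.0, 1.0): always one of the data points
```

In this example the outlier shifts the centroid away from the bulk of the points, while the medoid stays at an actual member of the cluster.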
◼ Figure: 8 contiguous clusters
◼ Density-based
❑ A cluster is a dense region of points that is separated from other
high-density regions by low-density regions.
❑ Used when the clusters are irregular or intertwined, and when
noise and outliers are present.
◼ Figure: 6 density-based clusters
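One way to make "dense region" concrete is to count neighbors within a fixed radius. The sketch below is a simplified illustration, not a specific algorithm from the slides: the parameters `eps` (radius) and `min_pts` (minimum neighbor count) are assumptions borrowed from DBSCAN-style methods, border points are not handled, and any point that is not dense is simply labeled as noise (-1).

```python
import math

def density_clusters(points, eps, min_pts):
    """Group dense points that lie within eps of each other; label the rest -1."""
    n = len(points)
    # Neighbor lists include the point itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    dense = [len(neighbors[i]) >= min_pts for i in range(n)]
    labels = [-1] * n  # -1 marks noise / low-density points
    cluster_id = 0
    for start in range(n):
        if not dense[start] or labels[start] != -1:
            continue
        # Flood-fill through dense points that are within eps of each other.
        labels[start] = cluster_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                if dense[j] and labels[j] == -1:
                    labels[j] = cluster_id
                    stack.append(j)
        cluster_id += 1
    return labels

# Hypothetical data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = density_clusters(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Because clusters grow by chaining nearby dense points, their shape is not constrained to be round, which is why this family of methods suits irregular or intertwined clusters with noise.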
◼ Figure: 2 overlapping circles
◼ Hierarchical clustering
◼ Density-based clustering
merges or splits
◼ Figure: Nested-cluster diagram and dendrogram for six points (dendrogram leaf order 1, 3, 2, 5, 4, 6; height axis from 0 to 0.2)
❑ Divisive:
◼ Start with one, all-inclusive cluster
◼ At each step, split a cluster until each cluster contains a single point (or
there are k clusters)
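The divisive procedure can be sketched as follows. The slides do not say which cluster to split or how to split it, so this sketch makes two assumptions: it splits the cluster with the largest diameter (greatest pairwise Euclidean distance), and it splits by assigning each point to the nearer of the two farthest-apart points. The function names and sample data are hypothetical.

```python
import math
from itertools import combinations

def farthest_pair(cluster):
    """Return (distance, a, b) for the two farthest-apart points in the cluster."""
    return max(((math.dist(a, b), a, b) for a, b in combinations(cluster, 2)),
               key=lambda t: t[0])

def divisive(points, k):
    """Split top-down until there are k clusters or only single points remain."""
    clusters = [list(points)]  # start with one, all-inclusive cluster
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) > 1]
        if not splittable:
            break  # every cluster already contains a single point
        # Assumed rule: split the cluster with the largest diameter.
        target = max(splittable, key=lambda c: farthest_pair(c)[0])
        _, a, b = farthest_pair(target)
        left = [p for p in target if math.dist(p, a) <= math.dist(p, b)]
        right = [p for p in target if math.dist(p, a) > math.dist(p, b)]
        if not right:  # all points identical: peel one off to make progress
            right = [left.pop()]
        clusters.remove(target)
        clusters.extend([left, right])
    return clusters

# Hypothetical data: two tight pairs and one distant point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
result = divisive(points, k=3)
print(sorted(sorted(c) for c in result))
# [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(20, 20)]]
```

Calling `divisive(points, k=len(points))` carries the process to its end, where each cluster contains a single point.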