Abstract
In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance, spectral clustering appears slightly mysterious, and it is not obvious why it works at all or what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
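Since the abstract emphasizes that spectral clustering is simple to implement with standard linear algebra software, the following minimal sketch may help make that claim concrete. It illustrates unnormalized spectral clustering on a fully connected graph with Gaussian similarities; the function name spectral_clustering, the bandwidth parameter sigma, and the choice of similarity graph are assumptions made for illustration and are not the specific variants or parameter choices derived in the tutorial.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def spectral_clustering(X, k, sigma=1.0):
    """Illustrative unnormalized spectral clustering (a sketch, not the tutorial's algorithm).

    X: (n, d) data array, k: number of clusters,
    sigma: bandwidth of the Gaussian similarity (an assumed free parameter).
    """
    # Similarity graph: fully connected, Gaussian (RBF) edge weights.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W, with D the degree matrix.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # Eigenvectors belonging to the k smallest eigenvalues of L give a new
    # k-dimensional representation of the n points (one row per point).
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]

    # Standard k-means on the rows of U yields the final clustering.
    _, labels = kmeans2(U, k, minit='++')
    return labels
```

On data with well-separated groups, for example two distant point clouds, spectral_clustering(X, 2) typically recovers the two groups; the normalized variants discussed in the tutorial differ mainly in how the Laplacian is normalized and how the rows of the eigenvector matrix are rescaled before k-means.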