3 Community Detection Methods and Mining


Social Network Analysis (3 - 8) Extraction & Mining Communities in Web Social Networks

 Modularity measures the strength of a community partition by taking into account the
degree distribution. A larger value indicates a stronger community structure.
 One advantage of modularity is that it can be computed using only connectivity of the
network, in the absence of any node labels or other information. However, this property can
also be considered a weakness because modularity is unable to incorporate metadata (e.g.
node labels) even if it is available.
 Modularity measures internal and not external connectivity, but it does so with reference to
a randomized null model.
 The modularity can be either positive or negative. Positive values indicate the possible
presence of community structure.
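The points above can be made concrete with a small calculation. The following is a minimal sketch, assuming the standard Newman-Girvan form of modularity for an undirected, unweighted graph, Q = sum over communities of (l_c / m) - (d_c / 2m)^2, where m is the number of edges, l_c the number of edges inside community c and d_c the total degree of its nodes. The function name `modularity` and the toy graph are illustrative and are not taken from the text.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition (undirected, unweighted graph).

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends inside community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    # observed internal fraction minus the fraction expected under the
    # degree-preserving random null model
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
good = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
all_in_one = {n: "a" for n in range(6)}
mixed = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a", 5: "b"}

print(round(modularity(edges, good), 3))        # 0.357
print(round(modularity(edges, all_in_one), 3))  # 0.0
print(round(modularity(edges, mixed), 3))       # -0.214
```

Only the edge list is needed, which matches the remark that modularity uses connectivity alone. The third partition cuts across both triangles and gets a negative value, illustrating that modularity can take either sign.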

 3.5 Methods for Community Detection and Mining


 The classical methods for dividing given networks into sub-networks are graph partitioning,
hierarchical clustering, and k-means clustering.
 All these methods require the number of clusters or their sizes to be specified in advance. It is
necessary to find suitable methods that can extract complete information
about the community structure of networks.
 The methods for detecting communities are roughly classified into the following categories:
1. Divisive algorithms
2. Modularity optimization
3. Spectral algorithms

 3.5.1 Divisive Algorithm


 A simple method to identify communities in a network is to find the edges that connect
vertices of different communities and remove them, so that the communities become
disconnected from each other.
 The Newman-Girvan algorithm has two key features :
1. It iteratively removes edges from the network to split it into communities,
the edges to remove being identified using a “betweenness” measure, which represents
the number of shortest paths between pairs of nodes that pass through the edge.
2. The betweenness measures are recalculated after each removal.
 Newman-Girvan algorithms are highly effective at discovering community structure in both
computer-generated and real-world network data, and they can also be applied to
networked systems with complex structure. Fig. 3.5.1 shows the detection of communities based on edge
betweenness.
 It uses the idea that “bridges” between communities must have high edge betweenness. The
edge with higher betweenness tends to be the bridge between two communities.
 The edge betweenness of an edge is the number of shortest paths between pairs of vertices
that run along it. By iteratively removing the edges with the highest betweenness, we obtain a
hierarchical tree from which the communities can be read off.
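The procedure described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes an unweighted, undirected, simple graph, computes edge betweenness with a breadth-first accumulation in the style of Brandes, and stops after the first split instead of building the full hierarchical tree. The function names and the toy graph are hypothetical.

```python
from collections import deque, defaultdict

def edge_betweenness(adj):
    """Number of shortest paths running along each edge (fractional on ties)."""
    score = defaultdict(float)
    for s in adj:
        order, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)   # shortest-path counts from s
        sigma[s] = 1.0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # push path counts back from the farthest nodes towards s
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # every unordered pair of end points was visited from both sides
    return {edge: total / 2.0 for edge, total in score.items()}

def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def split_once(edges):
    """Remove highest-betweenness edges until one more component appears."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    adj = dict(adj)
    start, removed = len(components(adj)), []
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)       # recalculated after each removal
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append({u, v})
    return components(adj), removed

# Toy graph (an assumption): two triangles joined by the bridge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities, removed = split_once(edges)
print(communities, removed)
```

On this graph the bridge carries all nine shortest paths between the two triangles, so it has the highest betweenness and is removed first. Calling `split_once` again on each resulting part would yield the next levels of the hierarchy.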
TECHNICAL PUBLICATIONS® - An up thrust for knowledge

Fig. 3.5.1 : Detecting communities based on edge betweenness

 3.5.2 Modularity Optimization


 An exhaustive optimization of modularity is infeasible since there is a huge number of
ways to partition a network. It has been proved that modularity optimization is an NP-hard
problem.
 There are currently several algorithms that can find fairly good approximations of
the modularity maximum in a reasonable time. One of the best-known algorithms for
modularity optimization is the CNM algorithm proposed by Clauset et al.
 Other approaches include greedy algorithms (CNM is itself a greedy agglomerative method) and simulated annealing.
 Simulated annealing was proposed by Kirkpatrick et al., who noted the conceptual similarity
between global optimization and finding the ground state of a physical system.
 Simulated Annealing
 Simulated annealing is a probabilistic procedure for global optimization that is used in many different
fields and problems. The procedure explores the space of possible states, looking for the
global maximum of a function F.
 The standard implementation combines two types of moves : local moves, where a single
node is shifted from one cluster to another at random; and global moves, which consist of
mergers and splits of communities.
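The idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the standard implementation: the function F is taken to be modularity, only local moves are used (global merges and splits are omitted), and the cooling schedule, parameter names and default values are arbitrary choices made for the example.

```python
import math
import random

def modularity(edges, label):
    m = len(edges)
    inside, degree = {}, {}
    for u, v in edges:
        for x in (u, v):
            degree[label[x]] = degree.get(label[x], 0) + 1
        if label[u] == label[v]:
            inside[label[u]] = inside.get(label[u], 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

def anneal(edges, steps=5000, t_start=0.1, cooling=0.999, seed=0):
    """Maximize modularity with local moves only (hypothetical parameters)."""
    rng = random.Random(seed)
    nodes = sorted({x for e in edges for x in e})
    label = {v: i for i, v in enumerate(nodes)}   # start: every node alone
    q = modularity(edges, label)
    best, best_q = dict(label), q
    t = t_start
    for _ in range(steps):
        v = rng.choice(nodes)
        old = label[v]
        label[v] = rng.randrange(len(nodes))      # local move to a random cluster
        new_q = modularity(edges, label)
        # always accept improvements; accept a worse state with a
        # probability that shrinks as the temperature t falls
        if new_q >= q or rng.random() < math.exp((new_q - q) / t):
            q = new_q
            if q > best_q:
                best, best_q = dict(label), q
        else:
            label[v] = old                        # reject and undo the move
        t *= cooling                              # geometric cooling
    return best, best_q

# Toy graph (an assumption): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best, best_q = anneal(edges)
print(best, round(best_q, 3))
```

Accepting occasional downhill moves at high temperature is what lets the search escape local maxima. Recomputing modularity from scratch at every step keeps the sketch short; a practical version would update it incrementally.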

 3.5.3 Spectral Algorithms


 Spectral algorithms cut a given network into pieces so that the number of edges that are
cut is minimized. Spectral graph bi-partitioning is an example of this category.
 There are two categories of spectral algorithms for maximizing modularity: one is based on
the modularity matrix and the other is based on the Laplacian matrix of a network.
 In one method based on spectral clustering, correctness and conductivity
functions are used to evaluate the quality of the detected communities.
 Spectral methods for community detection rely upon normalized cuts for clustering. A cut
partitions a graph into separate parts by removing edges, as shown in Fig. 3.5.2.
 Spectral clustering partitions a graph into two sub-graphs by using the best cut such that
within-community connections are high and across-community connections are low.


 It can be shown that a relaxation of this discrete optimization problem is equivalent to
examining the eigenvectors of the Laplacian of the graph. In this approach, divisive
clustering is used, recursively partitioning the graph into communities by “divide and
conquer” methods.
 One of the most common methods for community detection is the bisection method, which is
based on spectral clustering and mainly uses graph spectral theory.
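A single bisection step can be sketched without any external library. This is a minimal illustration under several assumptions: the graph is connected, simple and unweighted; the unnormalized Laplacian L = D - A is used instead of a normalized cut; and the eigenvector belonging to the second-smallest eigenvalue (the Fiedler vector) is approximated by power iteration on a shifted matrix. The function name `fiedler_bisection` and the iteration count are hypothetical.

```python
def fiedler_bisection(edges, iterations=500):
    """Split a connected graph in two by the sign of the Fiedler vector."""
    nodes = sorted({x for e in edges for x in e})
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[index[u]].append(index[v])
        neighbours[index[v]].append(index[u])
    # every eigenvalue of L is at most twice the maximum degree, so
    # (shift * I - L) has non-negative eigenvalues in reversed order
    shift = 2.0 * max(len(nb) for nb in neighbours)

    x = [float(i) for i in range(n)]   # arbitrary non-constant start vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]    # project out the constant eigenvector
        # y = (shift * I - L) x, with (L x)_i = deg_i * x_i - sum of neighbours
        y = [shift * x[i]
             - (len(neighbours[i]) * x[i] - sum(x[j] for j in neighbours[i]))
             for i in range(n)]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    part_a = {nodes[i] for i in range(n) if x[i] >= 0}
    part_b = set(nodes) - part_a
    return part_a, part_b

# Toy graph (an assumption): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
part_a, part_b = fiedler_bisection(edges)
print(sorted(part_a), sorted(part_b))
```

Applying the same function recursively to each part gives the divide and conquer scheme described above. In practice a numerical library's eigensolver would normally replace the hand-written power iteration.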

Fig. 3.5.2 : Spectral clustering of a graph relies on recursive binary partitions


 Here the proposed algorithm uses the eigenvalue distribution of the Laplacian matrix to
estimate the number of communities, and the k-means algorithm is used for clustering.
 The drawback of this algorithm is that it applies only to network graphs that can be
clearly divided into two communities; communities that share nodes
cannot be detected.
 Applications of community detection are as follows :
1. Recommendation system
2. Social network role detection
3. Functional module in biological networks
4. Graph coarsening and summarization
5. Network hierarchy inference

 3.6 Applications of Community Mining Algorithm


 Following are the applications of community mining algorithms :
1. Network reduction
2. Discovering scientific collaboration groups from small networks
3. Mining communities from dynamic and distributed networks
