

Citation/Export MLA Roshni Ali, “Cluster Optimization for Improved Web Usage Mining”, November 15 Volume 3 Issue 11 , International Journal on Recent and Innovation Trends in Computing and Communication (IJRITCC), ISSN: 2321-8169, PP: 6394 - 6399 APA Roshni Ali, November 15 Volume 3 Issue 11, “Cluster Optimization for Improved Web Usage Mining”, International Journal on Recent and Innovation Trends in Computing and Communication (IJRITCC), ISSN: 2321-8169, PP: 6394 - 6399


International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 11

ISSN: 2321-8169
6394 - 6399

________________________________________________________________________________________________________

Cluster Optimization for Improved Web Usage Mining

Roshni Ali
Department of Computer Science & Engineering
RCERT
Chandrapur, India
roshniali27@gmail.com
Abstract: Nowadays, the World Wide Web (WWW) has become a rich and powerful source of information. At the same time, its continuous
growth has made retrieving the relevant information a difficult and critical task. Web usage mining is a step-wise technique for
extracting useful user access patterns from the web. Web personalization uses web usage mining techniques for knowledge
acquisition by analyzing user navigational patterns. Web page personalization involves clustering the web pages
that show similar navigation patterns for an individual. Since clusters grow with frequent access, optimizing or shrinking the
clusters becomes a chief consideration. This paper proposes a cluster optimization approach based on swarm intelligence techniques.
Based on the recognized user access patterns, clustering is implemented using a neuro-fuzzy approach, i.e. the NEF Class algorithm, and
cluster optimization is implemented using the Ant Nest Mate approach.
Keywords: Web Usage Mining, Web personalization, NEF Class, Cluster optimization, Swarm intelligence, Pheromone Values, Scouts, Searchers

__________________________________________________*****_________________________________________________
I. INTRODUCTION

With about 30 million new web pages posted daily, the
WWW is the largest and most used knowledge source and the most
prospective marketplace. To retain users profitably
in this rapidly growing environment, a web site must be built in
such a way that it supports user personalization. To achieve
this, an organization keeps track of user activities
while they browse its web site. Although many tools
help to analyze this data using web statistics
methods, the information they provide is sufficient only
for the web site and not for the designer. One way to
overcome this shortcoming is to apply data mining
techniques to the Web. Data mining is a powerful new
technology with great potential to help companies focus on
the most important information in the data they have collected
about the behavior of their potential clients. It identifies valuable
information within the data that queries and reports cannot
effectively reveal. Data mining, or knowledge discovery, is
the computer-assisted process of digging through and analyzing
enormous sets of data and then extracting the meaning of the
data [1]. Data mining tools predict behaviors and future
trends, allowing businesses to make proactive, knowledge-driven decisions.
Web mining is the application of data mining techniques
to discover patterns from the World Wide Web. It integrates
information gathered by traditional data
mining methodologies and techniques with information
gathered over the World Wide Web, and it uses
data mining techniques to automatically discover and extract
information from Web documents and services. Web mining
can be divided into three types: Web usage
mining, Web content mining and Web structure mining [2].
The rapid growth of e-commerce has put both the business
community and customers in a critical position. Faced with strong
competition and with customers who can choose
from several alternatives, the business community has
realized the necessity of intelligent marketing strategies and
relationship management. Web servers record and accumulate
data about user interactions whenever requests for resources
are received. Analysis of Web access logs can help to
understand user behavior and the web structure. From the
point of view of business and applications, knowledge
acquired from Web usage patterns can be directly applied to
efficiently manage activities related to e-business, e-services,
e-education and so on. Accurate Web usage analysis helps to
attract new customers, retain existing customers, improve
cross-marketing and sales, increase the efficiency of promotional campaigns,
track departing customers and find the most effective logical
structure for the Web space. User profiles can be built by
combining users' navigation paths with other data features,
such as page viewing time, hyperlink structure, and page
content [4]. As different brands morph into content providers,
clients have become more and more accustomed to seeing
content that is customized, which lays the basis for the need for and
importance of personalization.
The rest of the paper is organized as follows. Section II describes
the process of Web Usage Mining in detail. Section
III presents the basics of clustering on the web and the neuro-fuzzy
approach to it. Section IV focuses on Swarm
Intelligence and the Ant Nest Mate method that has been used for
cluster optimization. Section V gives the detailed experimental
results of this work. The performance analysis
is presented in Section VI.
II. WEB USAGE MINING

Web usage mining refers to the automatic discovery and
analysis of patterns in click streams, user transactions and
other associated data collected or generated as a result of user
interactions with web resources on one or more Web sites [7].
With the continual growth and spread of e-commerce,
Web services, and Web-based information systems, the
volumes of click stream, transaction, and user profile data
collected by Web-based organizations in their daily operations
have reached astronomical proportions [8]. Proper analysis of
such web data helps organizations determine the lifetime
value of clients, and also helps in designing cross-marketing
strategies for products and services, evaluating the effectiveness

IJRITCC | November 2015, Available @ http://www.ijritcc.org

of promotional campaigns, optimizing the functionality of
Web-based applications, providing more tailored content to
visitors, and finding the most effective logical
structure for the Web space. This kind of analysis supports the
automatic discovery of significant patterns and associations
in large collections of mainly semi-structured data stored
in Web server and application server access logs and related
operational data sources. The goal is to capture, model, and
examine the behavioral patterns and profiles of users
interacting with a Web site. The discovered patterns are
usually represented as collections of pages, objects, or
resources that are frequently accessed or used by groups of
users with common needs or interests [10].
Web usage mining consists of three phases: pre-processing,
pattern discovery, and pattern analysis [11]. Fig 1
below shows the sequence of the Web usage mining process.

Fig 1: Web Usage Mining Process

1. Pre-Processing
Pre-processing converts the unstructured data into
useful information by applying some algorithm. Web usage
data sources must be integrated, filtered, cleaned, and
transformed, so that gaps are filled where possible, irrelevant
information is discarded, and user sessions and
transactions are identified. These data sources are
mainly Web server log files, agent logs and other interfaces.
The data in the log file cannot be used as it is.
Therefore, the contents of the web log file are cleaned in
this pre-processing step. The unwanted data is removed and
a reduced log file is obtained.
2. Pattern Discovery
After the data in the log file has been converted into a formatted form,
pattern discovery is performed. From the existing data in
the log files, many useful patterns are identified using user
ids, session details, time-outs, etc. This phase is the key component for
analyzing the pre-processed data. It can be carried out with
various algorithms and knowledge discovery
techniques such as
association rules, classification, clustering, sequential patterns
and statistical analysis.
3. Pattern Analysis
This phase eliminates the irrelevant rules or patterns that
were generated and extracts the interesting rules or patterns
from the output of pattern discovery. The most common
form of pattern analysis uses a knowledge query
mechanism such as SQL (Structured Query Language), or
loads the usage data into a data cube to perform OLAP (Online
Analytical Processing) operations.

III. CLUSTERING

Clustering is a data mining method that
groups sets of items with similar characteristics.
In the usage domain, there are
broadly two basic kinds of clusters, i.e. user clusters and page clusters
[12]. Clustering of user records (sessions or transactions) is
the most common analysis in Web usage mining
and Web analytics. Clustering users tends to create
groups that exhibit similar browsing or access patterns.
This knowledge is especially helpful for inferring user
demographics in order to perform market segmentation in
e-commerce applications or to provide personalized Web content
to users with similar interests [8]. Further, analysis of
user groups based on their demographic attributes can lead to
the discovery of valuable business intelligence. Moreover,
usage-based clustering has also been used to create Web-based
user communities reflecting similar interests of users,
and to study user patterns that can be used to provide dynamic
recommendations in Web personalization applications.
One straightforward approach to creating an aggregate view
of each cluster is to compute the centroid (or the mean vector)
of each cluster. The dimension value for each page view in the
mean vector is computed by finding the ratio of the sum of the
page view weights across transactions to the total number of
transactions in the cluster. If page view weights in the original
transactions are binary, then the dimension value of a page
view p in a cluster centroid represents the percentage of
transactions in the cluster in which p occurs. Thus, the
centroid dimension value of p provides a measure of its
significance in the cluster. Page views in the centroid can be
sorted according to these weights and lower weight page views
can be filtered out [8]. The resulting set of page view-weight
pairs can be viewed as an aggregate usage profile
representing the interests or behavior of a significant group of
users.
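The centroid computation described above can be sketched as follows. This is only an illustration under stated assumptions: each transaction is taken to be a dict mapping a page view to its weight, and the function name `cluster_centroid`, the `min_weight` cutoff and the sample page names are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def cluster_centroid(transactions, min_weight=0.0):
    """Return the aggregate usage profile (centroid) of one cluster.

    transactions: list of dicts mapping page view -> weight (assumed layout).
    min_weight:   hypothetical cutoff; page views below it are filtered out.
    """
    if not transactions:
        return {}
    totals = defaultdict(float)
    for transaction in transactions:
        for page, weight in transaction.items():
            totals[page] += weight
    n = len(transactions)
    # Dimension value = sum of the page view weights / number of transactions.
    centroid = {page: total / n for page, total in totals.items()}
    # Sort by descending weight, then drop the low-weight page views.
    ranked = sorted(centroid.items(), key=lambda item: (-item[1], item[0]))
    return {page: weight for page, weight in ranked if weight >= min_weight}


# Binary weights: 1 means the page view occurred in the transaction.
cluster = [
    {"/home": 1, "/products": 1},
    {"/home": 1},
    {"/home": 1, "/contact": 1},
    {"/home": 1, "/products": 1},
]
profile = cluster_centroid(cluster, min_weight=0.5)
print(profile)
```

With binary weights, each value in the resulting profile is the fraction of transactions in the cluster that contain the page view, which matches the interpretation given in the text.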
A fuzzy system describes the knowledge it encodes
well, but it cannot learn or adapt that knowledge from training
examples. In contrast, a neural network learns from training
examples but cannot explain what it has learnt, so it is not
possible to express the end result in natural language. Neural
networks and fuzzy systems thus both have their own strengths and
weaknesses. The unification of neural networks and fuzzy
logic in neuro-fuzzy models provides both learning and readability.
Hence, researchers have made many attempts to integrate
these methods into hybrid models that combine
the merits of both. In the conventional
approach to fuzzy clustering, the model designer fixes the
membership functions and the consequent models based on
prior knowledge. However, in cases where this knowledge is
unavailable and instead a set of input-output data is observed
from the process, the components of the fuzzy system, i.e. the
membership and consequent models, can be represented in a
parametric form and the parameters tuned with the help of
neural networks. In such situations the fuzzy methods turn into
neuro-fuzzy methods. Neuro-fuzzy methods combine the
uncertainty handling capability of fuzzy systems with the
learning ability of neural networks. Thus, neuro-fuzzy (NF)
computing has become a popular framework for solving
complex problems in general and clustering problems in
particular. If the knowledge about clustering or any
general problem can be expressed in linguistic rules, then a
fuzzy inference system (FIS) can be built; if it is available as data,
or can be learned from a simulation or training, then artificial
neural networks (ANNs) can be applied [16][17].

IV. OPTIMIZATION THROUGH SWARM INTELLIGENCE

Particle Swarm Optimization (PSO) was originally
designed and introduced by Eberhart and Kennedy. The PSO
algorithm is a population-based search algorithm inspired by the social
behavior of birds, bees or a school of fish [20]. Originally,
swarm intelligence focused on graphically simulating the
graceful and unpredictable choreography of a bird flock. Every
individual is represented as a vector in a multidimensional
search space. Each such vector has an associated vector, called the
velocity vector, that determines the subsequent movement of the particle.
PSO then defines how to revise
the velocity of a particle. Each particle updates its
velocity based on its present velocity and the best position
explored so far [20]. The PSO procedure is iterated for a
fixed number of times or until a minimum error based on the preferred
performance index is attained. It has been shown that this
simple model can deal with difficult optimization problems
efficiently. PSO was initially developed for real-valued
spaces, but many problems are defined for
discrete-valued spaces where the domain of the variables is
finite.
Recently, nature has inspired a family of
algorithms known as Swarm Intelligence (SI), which has
attracted a number of researchers from the areas of pattern
recognition and clustering [21]. Various clustering techniques
based on SI have reportedly outperformed many classical
methods of partitioning complex real-world datasets.
Swarm Intelligence is a relatively new interdisciplinary
field of research that has gained huge popularity in recent years.
Algorithms in this domain draw
inspiration from the collective intelligence emerging from the
behavior of a group of social insects (like bees, termites and
wasps). When acting as a community, these insects,
each with very limited individual capability, cooperatively perform
many complex tasks necessary for their survival.
Problems such as finding and storing food, or selecting and picking up
materials for future use, require detailed planning, and are
solved by insect colonies without any kind of supervisor or
controller. Particle Swarm Optimization (PSO) is another very
popular SI algorithm for global optimization over continuous
search spaces.
The complex social behavior of ants and other social
insects requires multiple levels of recognition. The Ant Nest
Mate approach builds on the observation that ants can distinguish nest mates
from non-nest mates, which allows them to limit altruism and
cooperation to members of their own colony and protect their
colony from exploitation by outsiders. Ants that have the same
odor belong to the same nest. The clusters obtained are fed
into an ant-based clustering approach that checks for the
similarity of the pheromone values of the artificial ants. This
relies on the fact that ants belonging to the same nest have a
similar odor. In this algorithm, clusters are considered as the
ants' nests and the URL combinations in each cluster are considered
as the artificial ants.
As the size of the clusters keeps increasing with the number
of users or the growth of user interests, it has become
necessary to optimize the clusters. Here we introduce a cluster
optimization methodology based on the nest mate recognition
ability of ants, which is used for eliminating the data redundancies that
may occur after the clustering done by web usage mining
methods. The Ant Nest Mate approach for cluster optimization is
presented to personalize the web page clusters of target users.
Hierarchical relationships exist within groups. These complex
behaviors are exemplified by the fact that ants can
distinguish between nest mates and non-nest mates. The level
of interaction and cooperation among ants of different colonies
is nearly nil, so as to protect the colony from exploitation by
outsiders. This recognition lets ants limit altruism and cooperation to
members of their own colony.

V. EXPERIMENTAL RESULTS

1) Creating Web log File.
The Windows Firewall log allows advanced users to
collect and identify inbound traffic. It can log dropped
packets and successful connections. Once logging is turned
on, all of the information is written to a file called
pfirewall.log. The log file is stored in the %system
root%\Windows directory. This log file contains fields such as
date, time, action, protocol, src-ip, dst-ip, src-port, dst-port,
size, tcpflags, tcpsyn, tcpack, tcpwin, icmptype, icmpcode,
info and path. This log file is first filtered and pre-processed.

Step 1: Take the pfirewall.log file as input.
Step 2: Parse pfirewall.log, selecting
only the required attributes, i.e. Src IP, Dest
IP and size (no. of packets).
Step 3: Store the selected attributes in an array.
Step 4: Check which entries have the same destination and
source IP and apply the accumulation filter /
discretion filter.
Step 5: Display the output in the console.

Fig 2: Filtered data from Web log File
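The log-filtering steps (Step 1 to Step 5) can be sketched roughly as below. The paper does not define the accumulation filter precisely, so this sketch assumes it groups entries that share the same source and destination IP and sums their sizes and hit counts. The function names, the sample log lines and the IP addresses (taken from documentation address ranges) are all hypothetical.

```python
from collections import defaultdict

# Field order as listed in the text for pfirewall.log.
FIELDS = ["date", "time", "action", "protocol", "src-ip", "dst-ip",
          "src-port", "dst-port", "size", "tcpflags", "tcpsyn", "tcpack",
          "tcpwin", "icmptype", "icmpcode", "info", "path"]


def parse_log(lines):
    """Steps 2-3: keep only (src-ip, dst-ip, size) of each data line."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and header lines
            continue
        parts = line.split()
        if len(parts) < len(FIELDS):  # skip malformed lines
            continue
        row = dict(zip(FIELDS, parts))
        if not row["size"].isdigit():  # size may be "-" for some entries
            continue
        records.append((row["src-ip"], row["dst-ip"], int(row["size"])))
    return records


def accumulate(records):
    """Step 4 (assumed meaning): merge entries with the same source and
    destination IP, summing the size and counting the hits."""
    totals = defaultdict(lambda: {"hits": 0, "size": 0})
    for src, dst, size in records:
        entry = totals[(src, dst)]
        entry["hits"] += 1
        entry["size"] += size
    return dict(totals)


# Made-up sample in the pfirewall.log layout (Step 1 would read the file).
SAMPLE = """#Version: 1.5
#Fields: date time action protocol src-ip dst-ip src-port dst-port size tcpflags tcpsyn tcpack tcpwin icmptype icmpcode info path
2015-11-01 10:00:01 ALLOW TCP 192.0.2.10 198.51.100.7 50123 443 60 S 0 0 8192 - - - SEND
2015-11-01 10:00:05 ALLOW TCP 192.0.2.10 198.51.100.7 50124 443 40 A 0 0 8192 - - - SEND
2015-11-01 10:01:12 ALLOW TCP 192.0.2.10 203.0.113.9 50130 80 1500 A 0 0 8192 - - - SEND
""".splitlines()

# Step 5: display the filtered output in the console.
for (src, dst), stats in accumulate(parse_log(SAMPLE)).items():
    print(src, dst, stats["hits"], stats["size"])
```

To run this on a real log, the `SAMPLE` list would be replaced by the lines read from the pfirewall.log file.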

2) Implementing NEF CLASS.
From this filtered data, the Src and Dest IPs are separated, and the
total access time, total accessed data packets and total hits on
each link are calculated and displayed in the console. As NEF CLASS
suggests, the total accessed data packets and total hits on a link
are calculated using the Src and Dest IP and the size of data. Each
input had 3 data values. After applying NEF
Class we obtained 2 more attributes, i.e. total accessed
time and total accessed counts. Using these input values,
a decision is taken on whether to add the data to a cluster.

Fig 3: Implementation of NEF-CLASS

Once the mean access time and mean
access count have been calculated, entries are added to the cluster by applying
the following condition.
Condition applied:
If Access Time & Access Count [...] Mean
Access Time & Access Count,
Then ADD ENTRY TO CLUSTER.
Else Reject or Discard.

Fig 4: Centroid Cluster.

Once the clusters have been created, their entries
are parsed again to identify the similarities and
differences in the data accessed. This parsing is done by applying
a swarm intelligence optimization technique to the clusters; the Ant
Nest Mate approach is used for this.
Centroid = Cluster of data from the web log file obtained
by pattern matching and analysis.
Pheromone Values = Navigation links that the user
follows by continuous browsing.
Scouts = Similarities of links, holding all relevant
contents matching the user's need.
Searchers = Dissimilarities in links, holding all irrelevant
items that do not match the data that was searched.
The clusters as a whole, referred to as the Centroid, are
divided into Scouts (Cluster 1), showing similar access patterns
of the user, and Searchers (Cluster 2), showing dissimilar ones.

Fig 5: Cluster as Scouts.

Fig 6: Cluster as Searchers.

Once the clusters are optimized and divided into Scouts and
Searchers, the user IPs are tracked on the basis of this optimized data,
the other users each of them has accessed are identified,
and the mean access, mean time and mean size are calculated and
displayed.
Fig 7: Tracked User Profiles
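The two decisions made in this section, admitting entries to the cluster by comparing them with the mean values and then dividing the cluster into Scouts and Searchers, might be sketched as follows. Several points are assumptions rather than the paper's method: the comparison is taken to be "greater than or equal to the mean" for both attributes, the pheromone value of an entry is modelled as the set of links it followed, and Jaccard similarity against a reference set of links with a 0.5 threshold stands in for the unspecified similarity check. All names and sample values are hypothetical.

```python
from statistics import mean


def admit_to_cluster(entries):
    """Keep entries whose access time and access count reach the mean.

    ASSUMPTION: the comparison is '>=' on both attributes; the source
    does not state the operator.
    """
    mean_time = mean(e["time"] for e in entries)
    mean_count = mean(e["count"] for e in entries)
    return [e for e in entries
            if e["time"] >= mean_time and e["count"] >= mean_count]


def jaccard(a, b):
    """Jaccard similarity of two link sets (stand-in similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def split_nest(cluster, reference_links, threshold=0.5):
    """Divide a cluster into Scouts (similar) and Searchers (dissimilar).

    ASSUMPTION: an entry's pheromone value is its set of followed links,
    compared against a hypothetical reference set of links.
    """
    scouts, searchers = [], []
    for entry in cluster:
        if jaccard(entry["links"], reference_links) >= threshold:
            scouts.append(entry)
        else:
            searchers.append(entry)
    return scouts, searchers


entries = [
    {"ip": "192.0.2.10", "time": 10, "count": 4, "links": {"/a", "/b"}},
    {"ip": "192.0.2.11", "time": 2, "count": 1, "links": {"/z"}},
    {"ip": "192.0.2.12", "time": 12, "count": 5, "links": {"/a", "/q", "/r"}},
]
cluster = admit_to_cluster(entries)
scouts, searchers = split_nest(cluster, reference_links={"/a", "/b"})
print([e["ip"] for e in scouts], [e["ip"] for e in searchers])
```

A different similarity measure or threshold could be substituted in `split_nest` without changing the overall flow.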
VI. PERFORMANCE ANALYSIS
Once the user profiles have been tracked, pie charts are
generated to analyze the behavior of each user across
the various IP addresses. Fig 8 below represents the user access
times, showing for how long each destination IP was accessed.
In the result, the green color represents the
maximum access time; the corresponding IP address was
179.60.192.7. The remaining IPs were not accessed for
as long as this one.

Fig 8: Pie chart for User Access Times

The user access counts pie chart in Fig
9 below shows the number of times a particular IP address was
visited by the user. From the pie chart we
can see that IP address 31.13.95.8 received the maximum number of
hits from the user in the time span observed. Next to it was
10.47.176.28, represented by a fairly broad area in green.
Other IP addresses such as 10.163.122.233, 30.45.61.160,
23.35.39.14 and others have a fair number of
access counts.

Fig 9: Pie chart for User Access Counts

The user access size pie chart in Fig 10 below shows the amount of
data accessed on a particular IP address. Here IP address 10.34.189
can be observed to have the largest access size, followed by
IPs such as 10.163.122..203, 10.45.61.160, 10.47.176.28,
etc., while the others have an average access size.

Fig 10: Pie Chart for User Access Sizes

CONCLUSION

The Internet is a broad medium for users today, where
an ample amount of data is available to them with a single click;
this data is updated day by day and keeps growing in size.
This paper deals with providing the user
better access with less complexity in a minimum
amount of time. It proposes a cluster optimization technique
based on the swarm intelligence technique of Ant Nest
Mate. As a user accesses data on the web, entries of the detailed
activity get logged into the web log file. The web log file keeps a
record of each and every event of the customer. The web log is
accessed and data cleaning is performed to remove crawler
requests and requests for graphics. Accumulation filtering is done
to filter the log file. This filtered data is then used for
clustering. Clustering is done using the NEF Class algorithm,
which is a neuro-fuzzy approach. The filtered input pattern
is thus used for clustering, and the clusters formed are fed to the Ant
Nest Mate algorithm, where cluster optimization is performed
on the IPs accessed. Optimization is done on the basis of swarm
intelligence properties: users following similar
navigational links are grouped together in one cluster,
referred to as Scouts, and those with dissimilar behavior
are placed in Searchers. Thus the cluster or centroid gets
optimized in a better way. We then track user
profiles by extracting them from each cluster as a set of
relevant URLs, and the user profiles discovered in a certain period
of time are compared with the user profiles discovered in a later
period. Based on the user profiles, the web page is
personalized. As a future enhancement, scalability can be taken
into consideration. The concepts of big data or thin data may come
into play for the storage of data in clusters. Hence an improved
algorithm could be implemented for better storage.

REFERENCES
[1] http://www.laits.utexas.edu/~anorman/BUS.FOR/course.mat/Alex/
[2] Lin, C.-W. and Hong, T.-P. (2013), A survey of fuzzy web mining. WIREs Data Mining Knowl Discov, 3: 190-199. doi: 10.1002/widm.1091
[3] Srivastava J, Desikan P and V Kumar, Web Mining-Concepts, Applications & Research Direction, in 2002 Conference.
[4] Abraham, Ajith, He Guo, and Hongbo Liu. Swarm intelligence: foundations, perspectives and applications. Springer Berlin Heidelberg, 2006.
[5] Srivastava J, Desikan P and V Kumar, Web Mining-Accomplishment Future Directions, in 2004 Conference.
[6] R. Kosala and H. Blockeel, Web Mining Research: A Survey, SIGKDD Explorations, Newsletter of the ACM Special Interest Group on Knowledge Discovery and Data Mining, Vol. 2, No. 1, pp. 1-15, 2000.
[7] Puzis, Yury, et al. "Predictive web automation assistant for people with vision impairments." Proceedings of the 22nd international conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2013.
[8] Mobasher, Bamshad. "Data mining for web personalization." The adaptive web. Springer Berlin Heidelberg, 2007. 90-135.
[9] Qingtian Han, Xiaoyan Gao, Wenguo, Study on Web Mining Algorithm based on Usage Mining, Computer Aided Industrial design and Conceptual design, 2008 CAID/CD 2008.
[10] Facca, Federico Michele, and Pier Luca Lanzi. "Mining interesting knowledge from weblogs: a survey." Data & Knowledge Engineering 53.3 (2005): 225-241.
[11] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan, Web Usage Mining: discovery and Application of usage pattern from web data, ACM SIGKDD, Jan 2000.
[12] www.springer.com/cda/content/.../cda.../9783642539640-c1.pdf
[13] WANG Tong, HE Pi-lian, Web Log Mining by an Improved AprioriAll Algorithm, Proceedings of World Academy of Science, Engineering and Technology, Volume 4, February 2005, ISSN 1307-6884, 2005 WASET.ORG.
[14] Mohd Helmy Abd Wahab, Mohd Norzali Haji Mohd, Hafizul Fahri Hanafi, and Mohamad Farhan Mohamad Mohsin, Data Pre-processing on Web Server Logs for Generalized Association Rules Mining Algorithm, World Academy of Science, Engineering and Technology, 2008.
[15] Kobra Etminani, Mohammad-R. Akbarzadeh-T, and Noorali Raeeji Yanehsari, Web Usage Mining: users' navigational patterns extraction from web logs using Ant-based Clustering Method, in Proc. IFSA-EUSFLAT, 2009.
[16] Roohi, Farhat. "NEURO FUZZY APPROACH TO DATA CLUSTERING: A FRAMEWORK FOR ANALYSIS." European Scientific Journal 9.9 (2013).
[17] Cordón, Oscar. Genetic fuzzy systems: evolutionary tuning and learning of fuzzy knowledge bases. Vol. 19. World Scientific, 2001.
[18] R. Cooley, B. Mobasher, and J. Srivastava, Web Mining: Information and Pattern Discovery on the World Wide Web, IEEE Computer Society, 2009, pp. 558
[19] R. Cooley, B. Mobasher, and J. Srivastava, Data Preparation for Mining World Wide Web Browsing Patterns, KNOWLEDGE AND INFORMATION SYSTEMS, vol. 1, 1999.
[20] He, Jie, and Hui Guo. "A modified particle swarm optimization algorithm." TELKOMNIKA Indonesian Journal of Electrical Engineering 11.10 (2013): 6209-6215.
[21] Ali, Yasir Hassan, Roslan Abd Rahman, and Raja Ishak Raja Hamzah. "Acoustic emission signal analysis and artificial intelligence techniques in machine condition monitoring and fault diagnosis: a review." Jurnal Teknologi 69.2 (2014).
[22] D. Vasumathi and A. Govardan, BC-WASPT: Web Access Sequential Pattern Tree Mining, IJCSNS International Journal of Computer Science and Network Security, Vol. 9, June 2009, pp. 569-571.
[23] S. Vijayalakshmi, V. Mohan, S. Suresh Raja, Mining Constraint-based Multidimensional Frequent Sequential Pattern in Web Logs, European Journal of Scientific Research, Vol. 36, pp. 480-490, 2009.
[24] Ming-Syan Chen, Jong Soo Park, Philip S. Yu, Efficient Data Mining for Path Traversal Patterns, IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 2, March/April 1998.
[25] F.M. Facca, P.L. Lanzi, Mining interesting knowledge from Weblogs: a survey, Data and Knowledge Engineering, Vol. 53, No. 3, June 2005, pp. 225-241.
[26] C. W. Cleverdon, The Cranfield Tests on Index Languages Devices. In Readings in Information Retrieval, Morgan Kaufmann Publisher, Inc., San Francisco, California, pp. 47-60, 1997.
[27] Qingtian Han, Xiaoyan Gao, Wenguo, Study on Web Mining Algorithm based on Usage Mining, Computer Aided Industrial design and Conceptual design, 2008 CAID/CD 2008.
[28] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan, Web Usage Mining: discovery and Application of usage pattern from web data, ACM SIGKDD, Jan 2000.
[29] Bhaskaran, V.S.a.V.M., "Data Preparation Techniques for Web Usage Mining in World Wide Web-An Approach," International Journal of Recent Trends in Engineering, Vol 2, No. 4, 2009.
[30] C, G., M. M, and D. K, "Preprocessing of Web Log Files in Web Usage Mining," The Icfai Journal of Information Technology, 35J-2008-03-06-01(35J-2008-03-06-01): pp. 55-66, 2008.
[31] Mohd Helmy Abd Wahab, M.N.H.M., Hafizul Fahri Hanafi, Mohamad Farhan and Mohamad Mohsin, "Data Pre-processing on Web Server Logs for Generalized Association Rules Mining Algorithm," World Academy of Science, Engineering and Technology 48, 2008.
[32] L.K. Joshila Grace, V.M.a.D.N., "ANALYSIS OF WEBLOGS AND WEB USER IN WEB MINING," International Journal of Network Security & Its Applications (IJNSA), Vol. 3, No. 1, 2011.
[33] K, D.a.V.v.M., "Session Reconstruction in Web Usage Analysis," The Icfai Journal of Information Technology, 2008.
[34] R. Krishnamoorthi, K.R.S.a., "Identifying User Behavior by Analyzing Web Server Access Log File," IJCSNS International Journal of Computer Science and Network Security, Vol. 9, No. 4, 2009.
[35] V. Chitraa, A.S.D., "A Survey on Preprocessing Methods for Web Usage Data," (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, 2010.
[36] Arshi Shamsi, R.N., Pankj Pratap Singh and Mahesh Kumar Tiwar, "Web Usage Mining by Data Preprocessing," International Journal of Computer Science And Technology, Vol. 3, No. 1, 2012.
[37] R, C.R.S.J.a.D., "Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data," Ph.D Thesis, University of Minnesota, 2000.
[38] M.-S. Chen, J. S. Park, and P. S. Yu. Data mining for path traversal patterns in a web environment. In Proc. of the 16th International Conference on Distributed Computing Systems, pages 385-392, May 1996.
[39] T. Nakayama, H. Kato, and Y. Yamane. Discovering the gap between web site designers' expectations and users' behavior. In Proc. of the Ninth Int'l World Wide Web Conference, Amsterdam, May 2000.
[40] J. Pei, J. Han, B. Mortazavi-asl, and H. Zhu. Mining access patterns efficiently from web logs. In Proc. of the 4th Pacific-Asia Conf. on Knowledge Discovery and Data Mining, pages 396-407, April 2000.
[41] M. Perkowitz and O. Etzioni. Adaptive web sites: Automatically synthesizing web pages. In Proc. of the Fifteenth National Conf. on Artificial Intelligence (AAAI), pages 727-732, 1998.
[42] M. Perkowitz and O. Etzioni. Towards adaptive sites: Conceptual framework and case study. In Proc. of the Eighth Int'l World Wide Web Conf, Toronto, Canada, May 1999.
[43] C. Shahabi, A. M. Zarkesh, J. Abidi, and V. Shah. Knowledge discovery from users web-page navigation. In Proc. of the 7th IEEE Intl. Workshop on Research Issues in Data Engineering (RIDE), pages 20-29, 1997.
[44] M. Spiliopoulou, L. C. Faulstich, and K. Wilkler. A data miner analyzing the navigational behaviour of web users. In Proc. of the Workshop on Machine Learning in User Modelling of the ACAI99, Greece, July 1999.

IJRITCC | November 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

