An Approach For Effective Use of Pattern Discovery For Detection of Fraudulent Patterns in Railway Reservation Dataset
Rasika Ingle1, Manali Kshirsagar2
1 Dept. of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, India
2 Dept. of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, India
Abstract:
Data mining concepts and techniques can help solve many problems. Useful knowledge may be hidden in stored data. This knowledge, if extracted, may provide good support for planners, decision makers, and legal institutions or organizations. Hence pattern discovery, as one of the powerful intelligent decision support platforms, is increasingly being applied to large-scale, complicated systems and domains. It has been shown to have the capacity to extract useful knowledge from a large data space and present it to decision makers. This contributes to the detection of illegal activities, the governance of systems, and improvements in systems. This paper proposes a mechanism that allows the system to work interactively with a user in detecting, characterizing and learning unusual and previously unknown patterns over groups of records, depending on the characteristics of the decisions. Data mining in real time could even help to alert the Railways when something untoward happens. The proposed mechanism therefore focuses on detecting anomalous and potentially fraudulent behavioural patterns within a set of railway reservation transactional data. The pattern-based analysis will include the possible detection of fake IDs, fake bookings, and unusual patterns such as a reservation for one person on trains in two different directions on a given date from the same starting city.
Keywords: Anomalous transactions, data mining, fraudulent transactions, hash map, pattern, pattern discovery, rule based discovery.
1. Introduction
Data mining is the process of discovering new correlations, patterns, and trends by digging into (mining) large amounts of data stored in warehouses, using artificial intelligence, statistical and mathematical techniques. It is the principle of sorting through large amounts of data and picking out relevant information. It has been described as "finding hidden information in a database. Alternatively, it has been called exploratory data analysis, data driven discovery, and deductive learning" [13] and as "the science of extracting useful information from large data sets or databases". The interesting patterns are presented to the user and may be stored as new knowledge in the knowledge base. According to this view, data mining is only one step in the entire process, albeit an essential one, because it uncovers hidden patterns for evaluation [14]. It is usually used by business intelligence organizations and financial analysts, but it is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods. One of the main areas where data mining can be used in industry is monitoring systems. The specific task in automated transaction monitoring systems is the identification of suspicious and unusual electronic transactions. An unusual pattern is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers or patterns is important for many applications and has recently attracted much attention in the data mining research community. Pattern-based analysis looks for anomalies indicative of fraud or error in normal patterns of data. It is growing steadily and is becoming more important as computer technologies develop quickly and the capacity to collect massive amounts of valuable data for pattern analysis increases.
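To make the notion of an observation that is "considerably dissimilar" from the rest concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes hypothetical per-agent booking counts (the agent names and numbers are invented for illustration) and flags a count when its modified z-score, based on the median absolute deviation, exceeds an assumed cut-off of 3.5.

```python
from statistics import median


def flag_unusual(counts, threshold=3.5):
    """Return the keys whose count lies far from the median of all counts.

    The score is the modified z-score 0.6745 * |x - median| / MAD.
    `threshold` is an assumed, illustrative cut-off, not a value from the paper.
    """
    values = list(counts.values())
    med = median(values)
    # Median absolute deviation: a spread estimate that one extreme
    # value cannot inflate the way it inflates a standard deviation.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median, so any value
        # that differs from it is treated as unusual.
        return [k for k, v in counts.items() if v != med]
    return [k for k, v in counts.items()
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical data: number of bookings made by each agent in one day.
bookings_per_agent = {"A1": 12, "A2": 15, "A3": 11, "A4": 14, "A5": 240}
print(flag_unusual(bookings_per_agent))
```

A median-based score is used here only because a single extreme value does not distort it; any other dissimilarity measure could be substituted.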
In real life, fraudulent transactions are interspersed with genuine transactions, and simple pattern matching is often not sufficient to detect them accurately. Discrepancies in transaction data are often missed when analysis doesn't go beyond known problems. Discrepancies may result from unanticipated behaviour that pattern-based analysis is more apt to uncover. The basic question asked by all detection systems is whether anything strange has occurred in recent events. This question requires defining what it means to be recent and what it means to be strange. What's Strange About Recent Events (WSARE) operates on discrete data sets with the aim of finding rules that characterize significant patterns of anomalies [9]. In general, anomalies can be defined as any observations that are different from the normal behaviour of the data. Many traditional anomaly detection techniques look at the data records individually, and try to determine whether each record is anomalous with respect to the historical distribution of data. A Bayesian Network likelihood model and a conditional anomaly
||Issn||2250-3005|| (Online) ||March||2013|| ||www.ijceronline.com||
2. Methodology
2.1. Problem Definition
Pattern discovery from large datasets has been an active field of research for the past two decades. These studies are driven by a desire for automated systems which can search, analyse, and extract knowledge from the massive amount of data collected in many fields. The main goal is to replace the conventional manual examination methods, which are expensive, inaccurate, error prone and limited in scope. Reservation records should also be searched for unusual patterns and undiscovered knowledge. This proposed work demonstrates that different kinds of illegal manipulation used in railway reservation transactions can be discovered by identifying particular patterns and tracking them in the datasets. Fraud indicators in railway reservation transactions are the focus of this work. The problem is formulated as recognising those indicators, the patterns associated with them, and the human behaviour underlying these patterns, together with a data mining approach to automate the discovery of the illegal activities that generate the patterns.
2.2. Significance
This study will contribute to current efforts to establish better systems to support railway reservation governance. To achieve this, two main problems are addressed: assessing the patterns hidden in reservation transaction records which can be used to point out useful knowledge, and automating the discovery of some of these patterns from reservation records by applying data mining techniques. A major problem is the lack of published work that addresses the automatic extraction of knowledge from railway reservation systems. Therefore, in some aspects, this is a pioneering study.
2.3. Research Objectives
The primary objective of this work is to explore the use of data mining in railway reservation systems, to develop knowledge of where and how data mining can be applied and integrated into these systems, and to contribute to the discovery and alleviation of fraud in railway reservation transactions. As stated, the primary objective is to provide a solution to fraudulent bookings by agents and to support railway reservation governance by detecting fraud. To serve this primary objective, four main activities or sub-objectives are set. They are:
2.3.1 Identify different fraudulent activities in reservation record datasets, in a variety of contexts where these activities may take place.
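One of the indicators named in the abstract, a person reserved on trains in two different directions on the same date from the same starting city, can be checked with a simple rule over a hash map. The sketch below is illustrative only and is not the implementation described in this paper. The record layout and the field names (`pnr`, `passenger_id`, `date`, `origin`, `direction`) are assumptions, and it presumes that a direction label has already been derived for each booking.

```python
from collections import defaultdict


def conflicting_bookings(records):
    """Flag passengers booked from one city, on one date, in several directions.

    Each record is a dict with the assumed keys:
    pnr, passenger_id, date, origin, direction.
    Returns {(passenger_id, date, origin): {direction: [pnr, ...]}}
    for every key that has bookings in more than one direction.
    """
    # Hash map: (passenger, date, origin) -> {direction: [pnr, ...]}
    index = defaultdict(lambda: defaultdict(list))
    for r in records:
        key = (r["passenger_id"], r["date"], r["origin"])
        index[key][r["direction"]].append(r["pnr"])

    flagged = {}
    for key, by_direction in index.items():
        if len(by_direction) > 1:  # rule: more than one direction
            flagged[key] = dict(by_direction)
    return flagged


# Hypothetical sample data, invented for illustration.
sample = [
    {"pnr": "P1", "passenger_id": "ID-001", "date": "2013-03-01",
     "origin": "Nagpur", "direction": "north"},
    {"pnr": "P2", "passenger_id": "ID-001", "date": "2013-03-01",
     "origin": "Nagpur", "direction": "south"},
    {"pnr": "P3", "passenger_id": "ID-002", "date": "2013-03-01",
     "origin": "Nagpur", "direction": "north"},
]

for key, hits in conflicting_bookings(sample).items():
    print(key, hits)
```

Grouping by a composite key keeps the check to a single pass over the records, which is why a hash map suits this kind of rule. A flagged entry is only a candidate for review, since a genuine passenger may have a legitimate reason for both bookings.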
References
[1] Junjie Wu, Shiwei Zhu, Hui Xiong, Jian Chen, and Jianming Zhu, Adapting the Right Measures for Pattern Discovery: A Unified View, IEEE Trans. on Systems, Man and Cybernetics-Part B, vol. 42, no. 4, Aug. 2012.
[2] Ning Zhong, Yuefeng Li, and Sheng-Tang Wu, Effective Pattern Discovery for Text Mining, IEEE Trans. Knowledge Data Eng., vol. 24, no. 1, Jan. 2012.
[3] Mehmet Koyuturk, Ananth Grama, and Naren Ramakrishnan, Compression, Clustering, and Pattern Discovery in Very High-Dimensional Discrete-Attribute Data Sets, IEEE Trans. Knowledge Data Eng., vol. 17, no. 4, Apr. 2005.
[4] Dongsong Zhang and Lina Zhou, Discovering Golden Nuggets: Data Mining in Financial Application, IEEE Trans. on Systems, Man and Cybernetics-Part C: Applications and Reviews, vol. 34, no. 4, Nov. 2004.
[5] B. Kovalerchuk and E. Vityaev, Detecting patterns of fraudulent behavior in forensic accounting, in Proc. of the Seventh International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Oxford, UK, part 1, pp. 502-509, Sept. 2003.
[6] Andrew K. C. Wong and Yang Wang, Pattern Discovery: A data driven approach to decision support, IEEE Trans. on Systems, Man and Cybernetics-Part C: Applications and Reviews, vol. 33, no. 1, Feb. 2003.
[7] T. Chau and A. K. C. Wong, Pattern discovery by residual analysis and recursive partitioning, IEEE Trans. Knowledge Data Eng., vol. 11, pp. 833-852, Nov./Dec. 1999.
[8] Nitin Jindal, Bing Liu, and Ee-Peng Lim, Finding Unusual Review Patterns Using Unexpected Rules.
[9] Weng-Keen Wong, Andrew Moore, Gregory Cooper, and Michael Wagner, Rule-Based Anomaly Pattern Detection for Detecting Disease Outbreaks.
[10] Kaustav Das, Jeff Schneider, and Daniel B. Neill, Anomaly Pattern Detection in Categorical Datasets.
[11] Kaustav Das and Jeff Schneider, Detecting Anomalous Records in Categorical Datasets.
[12] Jia Wu and Jongwoo Park, Intelligent Agents and Fraud Detection.
[13] Margaret H. Dunham, Data Mining: Introductory and Advanced Topics, Dorling Kindersley (India) Pvt. Ltd., Pearson, 2006.
[14] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2005.