Buczak Gifford Fuzzy Rules Crime
Anna L. Buczak
Johns Hopkins University Applied Physics Laboratory 11100 Johns Hopkins Rd, Laurel, MD 20723 USA Anna.Buczak@jhuapl.edu
Christopher M. Gifford
Johns Hopkins University Applied Physics Laboratory 11100 Johns Hopkins Rd, Laurel, MD 20723 USA Christopher.Gifford@jhuapl.edu
These data are typically multi-dimensional and too large to manually examine to discover salient patterns which offer significant leads. The nature and sensitivity of the data present important issues that need to be addressed, such as data storage, warehousing, and privacy. Current manual inspection of crime data by analysts and investigators is limited, primarily due to the amount of data that can be processed concurrently and in an acceptable time frame. Further, complex relationships between various crime attributes can be overlooked or misinterpreted by human analysts. Providing automated knowledge discovery tools becomes attractive to enhance and accelerate the efforts of local law enforcement. Local, regional, national, and international crime play important roles in allocating law enforcement resources and influencing investigative priorities across jurisdictions. For example, an aggravated assault is a local jurisdiction matter, whereas drug trafficking and terrorism exhibit regional and global implications. Local crime patterns may differ from surrounding communities, creating localized trends of criminal activity which are unique to a community. Similarly, certain crimes may be more probable in locations with higher populations and dense housing. Regional crime patterns can be discovered which enable law enforcement personnel and criminal investigators to address large-scale trends. Crime is typically temporally, thematically, and geospatially correlated, exhibiting complexities which make the analyst's task very challenging. Moreover, evidence can be loosely coupled while being geospatially sparse, forcing a more widespread analysis effort. Leveraging data mining techniques provides the ability to better analyze, predict, prepare for, and respond to criminal acts and potential security risks. Community-based data can be integrated to study associations between socio-economic characteristics and local law enforcement information.
Examples of such national crime data sources are the U.S. Census, U.S. FBI Uniform Crime Report, U.S. Law Enforcement Management and Administrative Statistics survey, National Criminal Record Database, and National Archive of Criminal Justice Data. In this paper, we study the application of fuzzy association rule mining for community crime pattern discovery. The following sections discuss available crime data sources, previous criminal act analysis efforts, and techniques for crime data mining. Fuzzy association rule mining is introduced as a novel means for knowledge discovery in the crime domain, supported by experimental results on the open-source Communities and Crime data set [2]. This paper concludes with a discussion on directions for further research.
ABSTRACT
Current manual inspection of crime data by analysts is limited, primarily due to the amount of data that can be processed concurrently and in a reasonable time frame. Further, complex relationships between various crime attributes can be overlooked by human analysts. Providing automated knowledge discovery tools becomes attractive to accelerate the efforts of local law enforcement. In this paper, we study the application of fuzzy association rule mining for community crime pattern discovery. Discovered rules are presented and discussed at regional and national levels. Rules found to hold in all states, in all regions, and in subsets of regions are also discussed. A relative support metric was defined to extract rare, novel rules from thousands of discovered rules. Such an approach relieves law enforcement personnel of the need to sift through uninteresting, obvious rules in order to find interesting and meaningful crime patterns of importance to their community.
General Terms
Algorithms, Performance, Experimentation, Theory.
Keywords
Crime data mining, fuzzy association rules, rule pruning, community-based crime.
1. INTRODUCTION
Crime data mining is receiving increased attention as a means of discovering underlying patterns in crime data. The need to act quickly to suppress crime activity and discover links between various data sources persists. State law enforcement agencies continue to call upon modern geographic information systems and data mining technologies to enhance crime analytics and better protect their communities and assets. Real-time solutions can save significant resources and push the capability of law enforcement closer to the pulse of criminal activity. Modern computing systems provide a unique opportunity to study this vast amount of data in ways that were previously not feasible. The volume of data being digitally recorded about crimes, suspicious activities, and suspect records is at an all-time high.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISI-KDD 2010, July 25, 2010, Washington, D.C., USA Copyright 2010 ACM ISBN 978-1-4503-0223-4/10/07 $10.00
improving criminal-intelligence analysis through the use of a co-occurrence concept space built from detailed case reports and terms of interest. The concept space utilizes five primary categories to study link analysis: Person, Organization, Location, Crime, and Vehicle. The COPLINK system was found to effectively increase operating efficiency, while improving case closure and solvability ratings. Based on experience gained from the COPLINK project, a framework for crime data mining is presented in [8]. The authors categorize and discuss levels of implication for various types of crime, established in consultation with an experienced local detective. General techniques for analyzing crime data are summarized, such as entity extraction, clustering, association rule mining, and sequential pattern analysis. Similarly, the Regional Crime Analysis Program (RECAP) [5] was created to assist Virginia law enforcement. RECAP incorporated aspects of data fusion, data mining, and geospatial clustering of crime. Crime analysis tools are now available for a wide variety of GIS software packages. The authors of [15] developed an incremental mining algorithm, called ITAR, for crime pattern discoveries via temporal association rules. As databases grow large, incremental approaches are necessary to avoid expensive database rescanning operations. Mining temporal association rules can yield new insights into crime trends for various time frames and how they change. The ITAR algorithm was applied to crime data for a district of Hong Kong, organized by offence and modus operandi (MO) with various categorizations of seriousness. For a general discussion of data mining applied to the crime domain, the reader is referred to [19].
support, confidence and lift have been fuzzified for the purpose of fuzzy association rules. Confidence can be treated as the conditional probability P(Y|X) that a transaction containing X also contains Y. A high confidence value suggests a strong association rule. However, this can be deceptive. For example, if the consequent has a high support, a rule could have a high confidence even if the antecedent and consequent were independent. This is why lift was suggested as a useful metric. The lift of a rule (X → Y) measures the deviation from independence of X and Y. A lift greater than 1.0 indicates that transactions containing the antecedent (X) tend to contain the consequent (Y) more often than transactions that do not contain the antecedent (X). The higher the lift, the more likely that the existence of X and Y together is not just a random occurrence, but rather due to the relationship between them. The fuzzy support is defined as:
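As a minimal sketch of how fuzzy support and fuzzy lift might be computed, the Python below assumes one common formulation: the vote of each transaction is the T-norm of its membership grades, fuzzy support is the average vote over the database, and lift is the joint support divided by the product of the individual supports. The function names, the string item labels, and the membership grades are hypothetical and are not taken from the study.

```python
from math import prod

def fuzzy_support(db, items, t_norm=prod):
    """Average, over all transactions, of the T-norm of the membership grades."""
    votes = [t_norm([t.get(item, 0.0) for item in items]) for t in db]
    return sum(votes) / len(db)

def fuzzy_lift(db, antecedent, consequent, t_norm=prod):
    """Joint support divided by the product of the individual supports."""
    sup_x = fuzzy_support(db, antecedent, t_norm)
    sup_y = fuzzy_support(db, consequent, t_norm)
    if sup_x == 0 or sup_y == 0:
        return 0.0
    return fuzzy_support(db, antecedent + consequent, t_norm) / (sup_x * sup_y)

# Hypothetical membership grades for four communities (illustrative only).
X, Y = "Urban=High", "Robberies=High"
db = [
    {X: 1.0, Y: 0.8},
    {X: 0.5, Y: 0.5},
    {X: 0.0, Y: 0.0},
    {X: 0.0, Y: 0.2},
]

lift_product = fuzzy_lift(db, [X], [Y])             # product T-norm
lift_minimum = fuzzy_lift(db, [X], [Y], t_norm=min)  # minimum T-norm
```

Both T-norms give a lift above 1.0 on this toy data, which is the signal that X and Y co-occur more often than independence would predict.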
Let D = {t1, t2, ..., tn} be the transaction database and let ti represent the ith transaction in D. Let us define the itemset-fuzzy set pair <X, A>, where X is the set of attributes xj and A is the set of fuzzy sets aj. A transaction satisfies <X, A> when the vote of the transaction is greater than zero. The vote of a transaction is calculated from the membership grade of each xj in that transaction. The membership_grade for attribute aj in transaction ti is defined as:
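To illustrate what a membership grade looks like in practice, the sketch below fuzzifies a single attribute value into Low, Medium, and High grades. The trapezoidal shape, the breakpoints, and the assumption that attribute values are normalized to [0, 1] are all hypothetical choices made for illustration; they are not the membership functions used in the study.

```python
def trapezoid(x, a, b, c, d):
    """Grade rises on [a, b], equals 1 on [b, c], and falls on [c, d]."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical breakpoints for an attribute normalized to [0, 1].
FUZZY_SETS = {
    "Low": (0.0, 0.0, 0.2, 0.4),
    "Medium": (0.2, 0.4, 0.6, 0.8),
    "High": (0.6, 0.8, 1.0, 1.0),
}

def fuzzify(value):
    """Return the membership grade of `value` in each fuzzy set."""
    return {name: trapezoid(value, *pts) for name, pts in FUZZY_SETS.items()}

grades = fuzzify(0.3)  # partly Low and partly Medium, not High
```

A value such as 0.3 falls between two sets and receives a partial grade in each, which is what lets a fuzzy rule avoid the sharp boundaries of crisp discretization.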
Apriori [1] is the most widely used algorithm for finding frequent k-itemsets and association rules. It exploits the downward closure property, which states that if any k-itemset is frequent, all of its subsets must be frequent as well. The Apriori algorithm proceeds as follows:
1. Calculate the support of all 1-itemsets and prune any that fall below the minimum support, specified by the user.
Loop:
2. Form candidate k-itemsets by taking each pair (p, q) of (k-1)-itemsets where all but one item match. Form each new k-itemset by adding the last item of q onto the items of p.
3. Prune the candidate k-itemsets by eliminating any itemset that contains a subset not in the frequent (k-1)-itemsets.
4. Calculate the supports of the remaining candidate k-itemsets and eliminate any that fall below the specified minimum support. The result is the frequent k-itemsets.
Here TNorm can be any of the T-norm operators [21]: product, minimum, etc. When a frequent itemset <X, A> is obtained, fuzzy association rules of the form "If X is A then Y is B" are generated, where X ⊂ Z, Z = X ∪ Y, A ⊂ C and C = A ∪ B. The fuzzy confidence value can be computed as follows:
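The Apriori steps and the T-norm-based measures described above can be combined into a small sketch of a fuzzy Apriori loop. It assumes that fuzzy support is the average T-norm of membership grades over the database and that fuzzy confidence is the support of the whole rule divided by the support of its antecedent; both are common formulations rather than definitions confirmed by the text. All function names, item labels, and grades are hypothetical.

```python
from itertools import combinations
from math import prod

def fuzzy_support(db, itemset, t_norm=prod):
    """Average T-norm of the membership grades of `itemset` over the database."""
    return sum(t_norm([t.get(i, 0.0) for i in itemset]) for t in db) / len(db)

def fuzzy_apriori(db, minsup, t_norm=prod):
    """Return {sorted item tuple: fuzzy support} for all frequent itemsets."""
    # Step 1: frequent 1-itemsets.
    frequent, level = {}, []
    for item in sorted({i for t in db for i in t}):
        s = fuzzy_support(db, (item,), t_norm)
        if s >= minsup:
            frequent[(item,)] = s
            level.append((item,))
    k = 2
    while level:
        prev, next_level = set(level), []
        for p, q in combinations(level, 2):
            # Step 2: join itemsets that differ only in their last item;
            # skip pairs that would use two fuzzy sets of one attribute.
            if p[:-1] != q[:-1] or p[-1][0] == q[-1][0]:
                continue
            cand = p + (q[-1],)
            # Step 3: prune candidates with an infrequent (k-1)-subset.
            if any(sub not in prev for sub in combinations(cand, k - 1)):
                continue
            # Step 4: keep candidates that meet the minimum support.
            s = fuzzy_support(db, cand, t_norm)
            if s >= minsup:
                frequent[cand] = s
                next_level.append(cand)
        level, k = next_level, k + 1
    return frequent

def fuzzy_confidence(frequent, antecedent, consequent):
    """Support of the whole rule divided by the support of the antecedent."""
    whole = tuple(sorted(antecedent + consequent))
    return frequent[whole] / frequent[tuple(sorted(antecedent))]

# Hypothetical (attribute, fuzzy set) items and membership grades.
U, N, R = ("Urban", "High"), ("NeverMarried", "High"), ("Robberies", "High")
db = [
    {U: 1.0, N: 0.8, R: 0.9},
    {U: 0.6, N: 0.5, R: 0.5},
    {U: 0.2, N: 0.0, R: 0.0},
    {U: 0.0, N: 0.1, R: 0.0},
]
frequent = fuzzy_apriori(db, minsup=0.1)
conf = fuzzy_confidence(frequent, (U, N), (R,))
```

Pruning in step 3 remains valid in the fuzzy setting because a T-norm never exceeds its smallest argument, so the fuzzy support of an itemset cannot exceed that of any of its subsets.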
least a specified minimum support (minsup)). The rules are called strong association rules when they meet or exceed a minimum confidence (minconf). Occasionally the rules of interest have high confidence but a low support. Such rules are called rare association rules. If one wanted to determine in a set of supermarket transactions if there was a relationship between buying a food processor and a cooking pan, this would be difficult due to the fact that each of these items is rarely purchased. Thus, even though the two items are almost always purchased together, this association is usually not found since its support is too low [13]. When dealing with rare diseases, violent crimes, machinery failure, etc., one is interested in finding rare association rules. Rare association rule mining is a newer and less well understood discipline than frequent association rule mining. One of the approaches to rare association rule mining is to use the same algorithms as for frequent item mining (such as Apriori) while selecting a low minimum support. Setting minsup very low causes a combinatorial explosion in the number of generated itemsets and rules. This necessitates the use of rule post-pruning methods, which facilitate extraction of interesting rules from a large set. One of the methods we utilize for rule pruning is the consequent-constraint rule pruning [4], in which an item constraint is used that requires the consequents of the rules to satisfy a given constraint. This method requires prior knowledge of which consequents should be interesting. Rules are additionally pruned based on their support, confidence, and lift. Confidence- and lift-based pruning methods are the same for frequent and rare rule mining. Support-based pruning must be different, since in frequent rule mining it is usually trivial to find the minimum support that is adequate for the entire data set.
In contrast, rare rule mining requires setting minsup very low, causing a combinatorial explosion in the number of rules. Yun et al. [20] proposed the Relative Support Apriori (RSAA) algorithm to generate rules in which significant rare itemsets take part. This technique uses relative support, defined as:
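A minimal sketch of relative support follows, assuming the formulation in which the support of the itemset is divided by the smallest support among its individual items (equivalently, the maximum of the itemset support divided by each item support). The function name and the supermarket-style numbers are hypothetical.

```python
def relative_support(itemset_support, item_supports):
    """Itemset support relative to its rarest item (assumed RSup formulation)."""
    ratios = [itemset_support / s for s in item_supports if s > 0]
    return max(ratios, default=0.0)

# Hypothetical supports: two rarely purchased items that almost always
# appear together (fractions of all transactions).
sup_food_processor = 0.005
sup_cooking_pan = 0.006
sup_both = 0.0045

rsup = relative_support(sup_both, [sup_food_processor, sup_cooking_pan])
```

Although the joint support of 0.45% would fail most minimum support thresholds, the relative support is high because nearly every transaction containing the rarer item also contains the other.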
were omitted based on occurrence of significant missing or known incorrect crime statistics, the majority of which were from the Midwest. Certain attributes contain a significant number of missing values for which the data were unavailable or not recorded for particular communities (e.g., Police Officers Per 100K Population, Police Requests Per Officer, Officers Assigned to Drug Units, Police Operating Budget). Attributes include information across a variety of crime-related facets, ranging from the percent of officers assigned to drug units, to population density and percent considered urban, to median household income. Also included are measures of crimes considered violent, which are murder, rape, robbery, and assault. For more detail on the attributes and their statistics, the reader is referred to [2].
This algorithm decreases the support threshold for items which have low frequency, and increases the support threshold for items that have high frequency.
People in Homeless Shelters; Homeless People Counted in Street; Foreign Born (%); Population Density (Persons Per Square Mile); People Commute Using Public Transit (%); Violent Crimes Per 100K Population; Murders; Robberies; Assaults
4.5 US Regions
Community data were grouped into five regions: Northeast, Southeast, Midwest, Southwest, and West [9]. Northeast comprises the following states: CT, DE, ME, MD, MA, NH, NJ, NY, PA, RI, and VT. This subset covers 632 communities. Southeast encompasses the states: AL, AR, FL, GA, KY, LA, MS, NC, SC, TN, VA, and WV. This subset covers 420 communities. Midwest is composed of the following states: IL, IN, IA, KS, MI, MN, MO, NE (no data), ND, OH, SD, and WI. It covers a total of 513 communities. Southwest encompasses the states: AZ, NM, OK, and TX. This subset covers 228 communities. Lastly, West is composed of the following states: CA, CO, ID, MT (no data), NV, OR, UT, WA, and WY. This represents 418 communities.
Figure 3. Violent Crimes attribute.
5. METHODOLOGY
The developed methodology has the following primary steps:
1. Variable fuzzification. This includes defining membership functions for each of the variables, and computing the membership values for each data item.
2. Running the Fuzzy Apriori algorithm on the data set. This includes initial pruning of the generated rules.
3. Rule post-pruning.
supports used for each region are as follows: Northeast (0.135%), Southeast (0.714%), Midwest (0.585%), Southwest (1.316%), and West (0.718%). As the number of communities within each region differs, minimum supports were calculated for each region based on the support of a rule occurring the same number of times within that region. This unifies support across regions for reliable comparison of rule measures, and facilitates discovery of consistent rules across all or subsets of the five US regions.
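The per-region minimum supports can be related to the region sizes with a short calculation. The sketch below assumes a fixed minimum count of three communities per rule; this count is not stated in the text and is inferred only because it reproduces the listed values for the Southeast, Midwest, Southwest, and West. The community counts are those given for each region in Section 4.5.

```python
# Community counts per region, as given in Section 4.5.
communities = {
    "Northeast": 632,
    "Southeast": 420,
    "Midwest": 513,
    "Southwest": 228,
    "West": 418,
}

MIN_COUNT = 3  # assumed, not stated in the text

def min_support_pct(n_communities, min_count=MIN_COUNT):
    """Minimum support (in percent) for a rule occurring `min_count` times."""
    return 100.0 * min_count / n_communities

supports = {region: round(min_support_pct(n), 3) for region, n in communities.items()}
all_us = round(min_support_pct(sum(communities.values())), 3)
```

Under this assumption the Northeast value would be about 0.475%. The 0.135% listed for the Northeast is closer to 3/2211, the figure for the combined data set, so the assumed count should be treated with caution.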
The initial pruning of the rules includes the consequent-constraint rule pruning method [17] mentioned in Section 3.3. We have developed a similar method that we call antecedent-constraint rule pruning, in which an item constraint is used that requires the antecedents of the rules to satisfy a given constraint. This is the second technique used in our work. This technique requires prior knowledge of which items are of interest in the antecedent. In an application such as the crime domain, the user usually knows very well which attributes are of interest as antecedents or consequents. Rule post-pruning is concerned with pruning rules after they have been generated by an algorithm, such as Fuzzy Apriori. We post-prune rules using a minimum fuzzy confidence of 60%. We have also developed a new Relative Fuzzy Support measure:
The above definition of Relative Fuzzy Support reduces the support threshold for consequents that have low frequency and increases it for consequents that have high frequency. The reduction or increase of support is significant because of the square in the denominator. The Relative Fuzzy Support differs from RSup (defined in Section 3.3) in two ways: (1) The denominator is squared, so the reduction or increase of the support is more significant compared to RSup. (2) The denominator involves only the support of the consequent. In RSup, the minimum support over all the antecedents and consequents is used. RSup increases the support of a rule if any of its antecedents or consequents is rare. In contrast, Relative Fuzzy Support increases the support only if the consequent is rare. The Relative Fuzzy Support is well suited for applications in which the user knows the consequents of interest. This is the case in this crime application, as the user is most interested in Violent Crimes, Murders, Robberies and Assaults being High.
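A minimal sketch of the measure follows, based only on the description above: the fuzzy support of the rule divided by the square of the fuzzy support of the consequent, with supports expressed as fractions. The exact formula, the function name, and the numeric values are assumptions for illustration.

```python
def relative_fuzzy_support(rule_support, consequent_support):
    """Rule support divided by the squared consequent support (assumed form)."""
    if consequent_support <= 0:
        raise ValueError("consequent support must be positive")
    return rule_support / consequent_support ** 2

# Hypothetical supports, expressed as fractions of all communities.
rare = relative_fuzzy_support(rule_support=0.004, consequent_support=0.02)
common = relative_fuzzy_support(rule_support=0.75, consequent_support=0.85)
```

The rule with a rare consequent scores roughly ten times higher than the well-supported rule even though its raw support is far lower, which is the behavior a post-pruning threshold on this measure relies on to retain rare rules.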
6. EXPERIMENTAL RESULTS
A set of fuzzy membership functions was defined for each of the 40 attributes and each attribute was fuzzified. The fuzzified data constituted the input to Fuzzy Apriori. Fuzzy Apriori was run with confidence 60% and with different supports depending on the data subset. All membership functions for attributes 1-36 were selected as antecedents, and the following membership functions for attributes 37-40 were selected as consequents: Violent Crimes (Low, Medium, High), Murders (No, Low, Medium, High), Assaults (Low, Medium, High), Robberies (Low, Medium, High). Experiments were run on each region data set, producing rules for each region. The data sets for each region were also combined into a single data set comprising all US states. The minimum
Figure 5. Number of total and pruned rules.
increased more than three times, in comparison to all rules for that consequent. The average lift of rules with membership functions No, Low, and Medium remaining after pruning is unchanged. Examples of rules which produce the highest values for each measure follow.
Support:
[People Speaking No English (Low)] & [People in Dense Housing (Low)] → [Robberies (Low)], conf=85.0, lift=1.0, rel sup=1.1, sup=75.3
[Kids Born to Never Married (Low)] & [People in Dense Housing (Low)] → [Robberies (Low)], conf=88.0, lift=1.1, rel sup=1.1, sup=73.9
Relative Support:
[People in Urban Area (High)] & [Kids Born to Never Married (High)] → [Robberies (High)], conf=63.0, lift=34.7, rel sup=11.9, sup=0.4
[Race Caucasian (Minority)] & [Kids Born to Never Married (High)] → [Robberies (High)], conf=61.0, lift=33.3, rel sup=10.9, sup=0.4
Confidence:
[Kids Born to Never Married (High)] & [People Commute Using Public Transit (High)] → [Robberies (High)], conf=96.0, lift=52.9, rel sup=4.5, sup=0.1
[Race African American (Minority)] & [People Speaking No English (Low)] → [Robberies (Low)], conf=91.0, lift=1.1, rel sup=1.0, sup=65.9
Lift:
[Kids Born to Never Married (High)] & [People Commute Using Public Transit (High)] → [Robberies (High)], conf=96.0, lift=52.9, rel sup=4.5, sup=0.1
[Houses with Kids Living with Two Parents (Low)] & [People Commute Using Public Transit (High)] → [Robberies (High)], conf=86.0, lift=47.4, rel sup=5.6, sup=0.2
It is interesting to note that, in the above sets of rules, the attributes Kids Born to Never Married and Houses with Kids Living with Two Parents feature very prominently. While their relevance is not surprising, it is unexpected that these two attributes appear in 6 of the 8 rules with the highest metric values. These discovered rules represent patterns that are of interest to law enforcement officials.
Figure 8. Average lift for all and pruned rules.
Figure 8 presents the average lift of all rules, and of rules remaining after pruning, separately for each consequent. Murders (High) and Robberies (High) have the highest lift, exceeding several times the average lift of the other consequents. The average lift of rules with consequent Violent Crimes (High)
A total of 3188 rules with at least 60% confidence were present in all regions. All rule consequents contain Assaults (Low), Assaults (Medium), Robberies (Low), Murders (Low), or Violent Crimes (Low). None of the rule consequents contain High for any of the major crime categories. This observation reinforces that patterns indicative of High major crime differ by region, which is a function of the state and community demographics within them. Table 1 shows the variation of rule measure values for those rules consistent across all US regions. Minimum, maximum, and average values for each rule measure across all consistent rules are reported. On average, rules that are present in each of the five regions from this study exhibit 21% support, 81% confidence, and 1.24 lift. Rules range from rare/novel to highly supported, and exhibit lift values approaching 9.6 in some instances.
Table 1. Rule metrics for rules consistent throughout the US.
Measure          Minimum   Maximum   Average
Support (%)      0.48      91.843    21.291
Confidence (%)   60.0      100.0     81.11
Lift             0.581     9.571     1.2447
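Finding the rules that are consistent across all regions amounts to intersecting the per-region rule sets and then averaging each measure. The sketch below shows one way this could be done; the data layout, function name, rule labels, and measure values are hypothetical and do not come from the study.

```python
from statistics import mean

MEASURES = ("support", "confidence", "lift")

def rules_in_all_regions(rules_by_region):
    """Return rules found in every region with each measure averaged."""
    per_region = list(rules_by_region.values())
    if not per_region:
        return {}
    common = set(per_region[0]).intersection(*per_region[1:])
    return {
        rule: {m: mean([region[rule][m] for region in per_region]) for m in MEASURES}
        for rule in common
    }

# Hypothetical rules keyed by (antecedent items, consequent item).
rule_a = (("People in Dense Housing (Low)",), "Robberies (Low)")
rule_b = (("People in Urban Area (High)",), "Robberies (High)")
rules_by_region = {
    "Northeast": {
        rule_a: {"support": 0.70, "confidence": 0.85, "lift": 1.1},
        rule_b: {"support": 0.01, "confidence": 0.65, "lift": 20.0},
    },
    "West": {
        rule_a: {"support": 0.80, "confidence": 0.87, "lift": 1.0},
    },
}

consistent = rules_in_all_regions(rules_by_region)
```

Only the rule present in both hypothetical regions survives; the minimum and maximum of each measure could be collected in the same pass to build a summary like Table 1.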
Examples of rules which produce the highest average values for each measure follow. These specific rules offer insight into crime patterns that are most frequent, probable, and meaningful at a national level. Moreover, these rules provide examples of characteristics which create safe neighborhoods.
Support:
[People in Dense Housing (Low)] → [Robberies (Low)]
[People Speaking No English (Low)] → [Robberies (Low)]
The rules with highest average support indicate that robberies are not a significant risk in communities where housing is not dense, English is widely spoken, and public transit systems are not used for daily commutes. This conversely suggests that communities with dense housing (e.g., apartment complexes), a large number of non-English speakers, and heavy use of public transit (e.g., subway) experience higher volumes of robberies.
Confidence:
[Houses with Retirement Income (High)] & [People in Homeless Shelters (None)] → [Robberies (Low)]
[Race Caucasian (Majority)] & [Age 12-29 (High)] → [Assaults (Low)]
The rules with highest average confidence further indicate that robberies occur less in communities with a high number of retired individuals, no homeless, and a high number of traditional family living arrangements. Retirement and family communities are typically low in crime due to better neighborhoods. Assaults are also low in predominantly Caucasian, under-30 communities.
Lift:
[Race African American (Middle)] & [Houses with Public Assistance Income (Medium)] → [Assaults (Medium)]
[Age 16-24 (Medium)] & [Homeless People Counted in Street (Low)] → [Murders (Low)]
The rules with the highest average lift illustrate that murders are lower in communities with a low number of homeless and medium number of people aged 16-24 (e.g., collegiate communities). Violent crimes are also low in under-30 communities which do not receive public assistance income.
On average, rules that are present in any four of the five regions from this study exhibit 11% support, 71% confidence, and 1.72 lift. Rules in this set exhibit decreased average support and confidence, as well as increased lift values (exceeding 13.75 in some cases), compared to those consistent across all regions. Three- and two-region rules are similar across all three measures, with increased lift values. Examples of rules exhibiting the highest average measure values within each overlapping region subset size are presented in the following sections. As before, rules corresponding to the highest average lift value involve higher consequent variable values. The fewer the regions in the subset, the more meaningful the rules become at the state level. Demographic data such as poverty level, income, population density, employment, and living situation for children are directly linked to the occurrence of violent crime. These rules also support the conjecture that high-income, family-based communities of educated individuals are less at risk of major crime. Higher income typically translates directly to better security and fewer individuals in the community who would commit such crimes.
7. CONCLUSIONS
Fuzzy association rule mining has proven useful for this crime application, and has utility for other crime-related data sets. To the knowledge of the authors, this is the first experimental study of applying fuzzy association rule mining to a crime data set. Results presented in this paper suggest that further analysis is required to gain a closer understanding of crime at both the community and national levels. Crime patterns were discovered which are consistent across all regions, subsets of regions, and all states. The attributes of interest were computed to measure their occurrence per 100K population, so as to remove the element of community and state size during the rule generation process. Rules discovered as part of this study therefore offer utility for use from the national level down to the state and community level. A novel relative support measure was proposed to prune the set of rules and to extract rare rules from the larger original set. The use of relative support achieves a 95.2% reduction in the final number of rules. These resulting 675 final rules represent a much more manageable number of rules for a crime analyst to investigate. This enables law enforcement personnel to more easily understand the discovered rules by removing the need to sift through uninteresting, obvious rules in order to find interesting and meaningful patterns. In the future, feedback from crime analysts will be utilized to determine if this is a satisfactory number of final rules, or whether additional pruning methods need to be developed to further reduce the number of rules. The data set used in this study has resolution down to the community (town) level. The generated rules are therefore general to that resolution. Several attributes of the data set contained a significant number of missing values (e.g., Police Officers Per 100K Population, Police Requests Per Officer, Officers Assigned to Drug Units, Police Operating Budget). 
Acquiring accurate data for these attributes will enable the process to produce rules which relate directly to the police force. Rules which relate to the size, budget, and jurisdiction of the police force can then be leveraged. Utilizing data that contains precise locations of crimes or block-level demographics would help produce more meaningful rules for local law enforcement jurisdictions. These higher-resolution data could then result in rules that are applicable to certain areas of a city, especially those with widespread crime of various types.
[4] R.J. Bayardo, R. Agrawal, and D. Gunopulos, Constraint-Based Rule Mining in Large, Dense Databases, Data Mining and Knowledge Discovery, 4(2/3), pp. 217-240, 2000.
[5] D. Brown, The Regional Crime Analysis Program (RECAP): A Framework for Mining Data to Catch Criminals, In Proceedings of the International Conference on Systems, Man, and Cybernetics, pp. 2848-2853, 1998.
[6] J. de Bruin, T. Cocx, W. Kosters, J. Laros, and J. Kok, Data Mining Approaches to Criminal Career Analysis, In Proceedings of the International Conference on Data Mining, pp. 171-177, 2006, Washington, D.C., IEEE Computer Society Press.
[7] M. Chau, J. Xu, and H. Chen, Extracting Meaningful Entities from Police Narrative Reports, In Proceedings of the National Conference on Digital Government Research, pp. 1-5, 2002.
[8] H. Chen, W. Chung, J. Xu, G. Wang, Y. Qin, and M. Chau, Crime Data Mining: A General Framework and Some Examples, Computer, 37(4), pp. 50-56, April 2004, Los Alamitos, CA, IEEE Computer Society Press.
[9] J. Dembsky, United States Regions, Online. Available (October 2006): http://www.dembsky.net/regions/.
[10] R. Hauck, H. Atabakhsh, P. Ongvasith, H. Gupta, and H. Chen, Using COPLINK to Analyze Criminal-Justice Data, Computer, 35(3), pp. 30-37, March 2002, Los Alamitos, CA.
[11] C. Ku, A. Iriberri, and G. Leroy, Crime Information Extraction from Police and Witness Narrative Reports, In Proceedings of the IEEE International Conference on Technologies for Homeland Security, pp. 193-198, May 2008, Boston, MA.
[12] C.M. Kuok, A. Fu, and M.H. Wong, Mining Fuzzy Association Rules in Databases, ACM SIGMOD Record, 27(1), pp. 41-46, New York, NY, 1998.
[13] B. Liu, W. Hsu, and Y. Ma, Mining association rules with multiple minimum supports, In Proceedings of the International Conference on Knowledge Discovery and Data Mining, pp. 337-341, New York, NY, 1999.
[14] S. Nath, Crime Pattern Detection Using Data Mining, In Proceedings of the International Conference on Web Intelligence and Intelligent Agent Technology, pp. 41-44, 2006, Washington, D.C., IEEE Computer Society Press.
[15] V. Ng, S. Chan, D. Lau, and C. Ying, Incremental Mining for Temporal Association Rules for Crime Pattern Discoveries, In Proceedings of the Australasian Database Conference, pp. 123-132, Ballarat, Victoria, Australia, February 2007.
[16] P. Phillips and I. Lee, Mining Top-k and Bottom-k Correlative Crime Patterns through Graph Representations, In Proceedings of the IEEE International Conference on Intelligence and Security Informatics, pp. 25-30, June 2009, Dallas, TX.
[17] M.A. Redmond and A. Baveja, A Data-Driven Software Tool for Enabling Cooperative Information Sharing Among Police Departments, European Journal of Operational Research, 141, pp. 660-678, 2002.
[18] R. Srikant and R. Agrawal, Mining Quantitative Association Rules in Large Relational Tables, In Proceedings of the International Conference on Management of Data, Montreal, Quebec, Canada, pp. 1-12, 1996.
[19] P. Thongtae and S. Srisuk, An Analysis of Data Mining Applications in Crime Domain, In Proceedings of the IEEE International Conference on Computer and Information Technology Workshops, pp. 122-126, 2006, IEEE Computer Society Press.
[20] H. Yun, D. Ha, B. Hwang, and K.H. Ryu, Mining Association Rules on Significant Rare Data using Relative Support, Journal of Systems and Software, 67(3), pp. 181-191, 2003.
[21] L.A. Zadeh, Fuzzy Sets, Information and Control, 8(3), pp. 338-353, 1965.
8. ACKNOWLEDGMENTS
The authors wish to thank Dr. Michael Redmond from La Salle University for providing the data set and patiently answering questions about it. The authors also wish to thank Mark Gabriele of Johns Hopkins University Applied Physics Laboratory for his time evaluating rules discovered as part of this study.
9. REFERENCES
[1] R. Agrawal, T. Imielinski, and A. Swami, Mining Association Rules between Sets of Items in Large Databases, In Proceedings of the International Conference on Management of Data, Washington, D.C., pp. 207-216, May 1993.
[2] A. Asuncion and D.J. Newman, UCI Machine Learning Repository, School of Information and Computer Science, University of California, Irvine, CA, 2007, URL: archive.ics.uci.edu/ml/datasets/Communities+and+Crime.
[3] S. Bagui, An Approach to Mining Crime Patterns, International Journal of Data Warehousing and Mining, 2(1), pp. 50-80, March 2006.