DBSCAN Cluster
This work looks for insights into which features of 911 calls and other emergency situations
shape our understanding of how Minneapolis responds to its residents, so that the government
can issue active AI responses. The datasets were accessed through the government open-data
website opendata.minneapolismn.gov and cover police stop data, use-of-force data and shots-fired
data. For each of these scenarios we implemented a case model that analyses the graphical
aspects of the dataset as required. The data are then represented as cluster information on a
spatial model, implemented with the DBSCAN algorithm.
INTRODUCTION:
In today's world, criminals are becoming more technologically sophisticated, and one challenge
faced by intelligence and law enforcement agencies is the difficulty of analyzing the large
volumes of data involved in criminal and terrorist activities. As a result, agencies must learn
techniques that help them catch criminals and stay ahead in the never-ending race between
criminals and law enforcement. Data mining extracts information from vast volumes of data; here
it is applied to a high-volume crime dataset, and the knowledge obtained is used to support
police forces.
Clustering is a data mining technique that groups a collection of objects in such a way that
objects in the same cluster are more similar to one another than to objects in other clusters.
Clustering algorithms vary greatly in their notion of what makes a cluster and in how they
locate clusters efficiently. In this article, a spatial cluster scan is implemented to retrieve
valuable knowledge from a large crime dataset and to classify the data, assisting police in
identifying and analyzing crime trends in order to prevent related incidents in future and to
provide information that deters crime.
RELATED WORK:
In criminology research, data mining may be divided into two categories: crime prevention and
crime containment. De Bruin et al. proposed a system for analysing crime rates based on a new
distance metric for evaluating and clustering individuals by their profiles. Manish Gupta et al.
highlight current e-governance programmes utilised by Indian police and suggest an open
query-based framework as a crime detection platform to aid police in their operations. They
proposed an interface for extracting valuable knowledge from the National Crime Record Bureau's
(NCRB) massive crime archive and locating crime hot spots using crime data mining methods such
as clustering. The efficacy of the suggested interface was shown using Indian crime reports.
Nazlena Mohamad Ali et al. describe the implementation of the Visual Interactive Malaysia Crime
News Retrieval Framework (i-JEN), including the methodology, user tests, system design, and
future plans. Their main goals were to construct crime-based events, investigate the use of
crime-based events in improving classification and clustering, develop an interactive crime news
retrieval system, visualise crime news in an effective and interactive manner, integrate these
components into a usable and robust system, and evaluate its usability and performance. Sutapat
Thiprungsri looks into how cluster analysis is used in the accounting sector, specifically for
anomaly detection in audits. The aim of that research is to examine how clustering technologies
can be used to automate fraud filtering during audits. Cluster analysis was applied to group
life insurance claims to help auditors concentrate their efforts. A. Malathi et al. look at how
missing-value and clustering algorithms can be used in a data mining method to forecast crime
trends and speed up the investigation phase. To predict crime rates, they used a model based on
clustering and classification. City crime data from the Police Department were analysed using
data analysis techniques. The information gleaned from this data mining may be useful in a
variety of ways.
Over the next several years, it may be employed to reduce or even deter violence. A study by
A. Malathi and Dr. S. Santhosh Baboo aimed to create a crime detection tool for the Indian
context, using various data processing techniques, that could aid law enforcement departments in
handling criminal investigations effectively. The proposed tool allows organisations to clean,
characterise, and interpret crime data in a simple and cost-effective manner in order to detect
actionable patterns and trends. Kadhim B. Swadi Al-Janabi proposes a method for analysing and
detecting crime and criminal data utilising Decision Tree algorithms for data classification and
the Simple K-Means algorithm for data clustering. The paper is intended to assist experts in
recognising patterns and developments, forecasting, identifying associations and plausible
causes, mapping crime networks, and identifying potential offenders. Aravindan Mahendiran et
al. mine crime data sets using a variety of methods to find evidence that is not apparent to
human observers. They present the trends found by their algorithms in a neat and intuitive
manner using state-of-the-art visualisation tools, allowing law enforcement agencies to channel
their efforts appropriately. Sutapat Thiprungsri also investigates the feasibility of auditing
with clustering technologies. Continuous audits may benefit greatly from automated fraud
filtering. The aim of that research is to see whether cluster analysis can be used as an
additional and novel anomaly detection strategy for wire transfers. K. Zakir Hussain et al. used
data mining and a simulation model to try to capture years of human knowledge in computer
simulations.
System Architecture: (Objective 3)
Following the design characteristics, we define a procedure for selecting the sets of features
to be clustered and for analysing them with the clustering algorithm. A Python-based solution
makes this procedure easier to implement and more widely applicable.
Design Procedure:
1. Load the data set from the chosen dataset.
2. Select the features and label them for cluster formation based on the chosen attributes.
3. Use Python tooling such as Anaconda to provide the required libraries and read the
attributes from the CSV file.
4. Encode each attribute with a label encoder so that the model can be fitted to the data for
the locations observed in each region considered.
5. Run the DBSCAN algorithm on the data set and record the attributes observed in each cluster
formed.
6. Plot the data with pyplot scatter plots to visualize the crime data attributes.
7. Repeat steps 1-6 for each dataset chosen.
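Steps 1-5 of this procedure can be sketched in Python roughly as follows. This is a minimal
illustration under stated assumptions, not the code used for the reported results: the inline
CSV text, its values, the function name cluster_incidents and the eps/min_samples settings are
all hypothetical, and clustering on the X and Y columns alone is a simplification chosen for
the example. It relies on pandas and scikit-learn.

```python
import io

import pandas as pd
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for a CSV export. The real files come from
# opendata.minneapolismn.gov; these rows are invented for illustration.
CSV_TEXT = """CaseNumber,X,Y,Problem,Neighborhood
A1,-93.2789,44.9792,Traffic Stop,Downtown West
A2,-93.2790,44.9793,Traffic Stop,Downtown West
A3,-93.2788,44.9791,Suspicious Person,Downtown West
A4,-93.2400,44.9500,Traffic Stop,Longfellow
A5,-93.2401,44.9501,Suspicious Vehicle,Longfellow
A6,-93.2399,44.9499,Traffic Stop,Longfellow
A7,-93.3200,45.0200,Suspicious Person,Camden
"""


def cluster_incidents(csv_source, eps=0.001, min_samples=3):
    """Load, clean, encode and cluster one dataset (steps 1-5)."""
    # Steps 1 and 3: read the attributes from the CSV source.
    df = pd.read_csv(csv_source)
    # Cleaning: drop rows without coordinates.
    df = df.dropna(subset=["X", "Y"]).copy()
    # Steps 2 and 4: turn string columns into integer codes.
    for column in ["Problem", "Neighborhood"]:
        df[column] = LabelEncoder().fit_transform(df[column].astype(str))
    # Step 5: DBSCAN on the raw coordinates. eps is therefore in degrees,
    # and 0.001 is an arbitrary illustrative value, not a tuned one.
    coords = df[["X", "Y"]].to_numpy()
    df["cluster"] = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    return df


result = cluster_incidents(io.StringIO(CSV_TEXT))
print(result[["CaseNumber", "X", "Y", "Problem", "cluster"]])
```

In this toy input the two tight groups of points form two clusters and the isolated point is
labelled -1, which is how scikit-learn marks noise. For step 6, a call such as
plt.scatter(result["X"], result["Y"], c=result["cluster"]) with matplotlib.pyplot imported as
plt would colour each point by its cluster label.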
DESIGN FLOW DIAGRAM:
[Block diagram. Box labels as extracted: FEATURE, FEATURE, ATTRIBUTE, DBSCAN, CLUSTERS,
IDENTIFIED, MODEL]
Figure: Proposed block-diagram flow for the design model.
Our design aims to provide a clustering view of the police force data described above under
System Architecture. We implemented a data cleaning process that prepares the columns as numeric
values; the attributes used are Case-id, X, Y, Problem and Neighborhood. Each clustering model
is built from the features accepted by the design, and its attributes are assigned to clusters
according to the data fitting.
The data labeling is done with “label_encoder.fit_transform”, which maps each distinct string
value to a corresponding numeric code. It is applied to each required column of Dataset_table so
that the column can be visualized as a cluster feature.
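To make the encoding step concrete, the sketch below reproduces in plain Python what a label
encoder's fit_transform does: it collects the distinct values, sorts them, and replaces each
value with its position in that sorted list (scikit-learn's LabelEncoder follows the same
convention). The function name and the sample Problem values are hypothetical.

```python
def fit_transform(values):
    """Map each distinct string to an integer code, in sorted order."""
    classes = sorted(set(values))
    code_of = {label: code for code, label in enumerate(classes)}
    return [code_of[value] for value in values], classes


# Hypothetical values for a Problem column.
problems = ["Traffic Stop", "Suspicious Person", "Traffic Stop", "Assault"]
codes, classes = fit_transform(problems)

print(classes)  # ['Assault', 'Suspicious Person', 'Traffic Stop']
print(codes)    # [2, 1, 2, 0]
```

The codes only identify categories; their numeric order follows alphabetical order and carries
no meaning of its own, which is worth keeping in mind when an encoded column is fed to a
distance-based method such as DBSCAN.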
Finally, we apply the DBSCAN algorithm to fit the selected rows and columns. For the case of
n = 1000:
     CaseNumber          X         Y  Problem  EventAge
0            29 -93.278939  44.97924       52        48
72           29 -93.278939  44.97924       52        48
144          29 -93.278939  44.97924       52        48
532          29 -93.278939  44.97924       52        48
711          29 -93.278939  44.97924       52        48
Figures 1 and 2 above show the clustering model for n = 1000 and n = 10000.
For n = 1000, we considered only the attributes [CaseNumber, X, Y, Problem, EventAge], so the
clusters are formed from these attributes alone.
Similarly, for n = 10000 the attributes are those shown in the table.
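For readers unfamiliar with how DBSCAN forms such clusters, the following from-scratch sketch
shows the mechanics. A point is a core point if at least min_samples points, itself included,
lie within distance eps of it; clusters grow outward from core points, and points reachable
from no core point are labelled as noise (-1). This is an illustrative plain-Python version
using Euclidean distance, not the library implementation used for the figures, and the sample
points and parameter values are hypothetical.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may still become a border point later
            continue
        # points[i] is a core point: start a new cluster and expand it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point joins the cluster
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is core: keep expanding
    return labels


# Hypothetical planar points: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not specified in advance; it follows from eps and
min_samples, so those two values largely determine what the cluster plots look like.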
The other plots show X and Y against the Is911call column as swarm plots and cat plots.
REFERENCES:
1. H. Chen et al., "Crime data mining: an overview and case studies", Proceedings of the 2003 annual national
conference on Digital government research, 2003.
2. C. Fan, K. Xiao, B. Xiu and G. Lv, "A fuzzy clustering algorithm to detect criminals without prior information",
Advances in Social Networks Analysis and Mining (ASONAM) 2014 IEEE/ACM International Conference on, pp.
238-243, August 2014.
3. T. Wang, C. Rudin, D. Wagner and R. Sevieri, "Detecting Patterns of Crime with Series Finder", AAAI
(Late-Breaking Developments), June 2013.
4. R. Kiani, S. Mahdavi and A. Keshavarzi, "Analysis and Prediction of Crimes by Clustering and Classification",
Analysis, vol. 4, no. 8, 2015.
5. A. Malathi and S. S. Baboo, An enhanced algorithm to predict a future crime using data mining, 2011.
6. J. J. Corcoran, I. D. Wilson and J. A. Ware, "Predicting the geo-temporal variations of crime and disorder",
International Journal of Forecasting, vol. 19, no. 4, pp. 623-634, 2003.
7. S. Lin and D. E. Brown, "An outlier-based data association method for linking criminal incidents", Decision
Support Systems, vol. 41, no. 3, pp. 604-615, 2006.
8. Anshu Sharma and Raman Kumar, "The obligatory of an algorithm for matching and predicting crime-using data
mining techniques", International journal of computer science and Technology (IJCST), vol. 4, no. 2, pp. 289-292,
2013.
9. R. Xu and B. D. Wunsch, "Survey of Clustering Algorithms", IEEE Trans. On Neural Networks, vol. 16, no. 3,
pp. 645-678, May 2005.
10. M. Steinbach, G. Karypis and V. Kumar, "A Comparison of Document Clustering Techniques", KDD Workshop
on Text Mining, 2000.
11. R. Ali, U. Ghani and A. Saeed, "Data Clustering and Its Applications", Rudjer Boskovic Institute, 2001.
12. J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2011.
13. Sang C. Sug, Practical Applications of Data Mining, Jones & Bartlett, 2012.
14. M. Kantardzic, Data Mining: Concepts Models Methods and Algorithms, Wiley-IEEE Press, August 2011.
15. The 2015 data records of the Stop Question and Frisk Report Database, City of New York Police Department
(NYPD).