
FBI Crime Analysis and Prediction Using Machine Learning

This document discusses using machine learning to analyze and predict crimes using FBI data. It proposes using random forest algorithms on crime datasets to predict:
1. The type of crime that may occur
2. The probability of different crimes happening
3. Crime distribution and patterns over areas

Random forest is identified as a suitable algorithm because it can handle both classification and regression tasks, is robust to outliers, and reduces overfitting compared to decision trees. The goal is to help law enforcement predict crimes more accurately to reduce crime rates.


Vol 11, Issue 4, April/2020

ISSN NO: 0377-9254

FBI CRIME ANALYSIS AND PREDICTION USING MACHINE LEARNING

Linga Akhila Sri¹, Kalluri Manvitha², Gorantla Amulya³, Ikkurthi Sai Sanjuna⁴, V. Pavani⁵
¹,²,³,⁴ IV B.Tech, Department of Information Technology, Vignan’s Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur-522009, Andhra Pradesh, India.
⁵ Asst. Professor, Department of Information Technology, Vignan’s Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur-522009, Andhra Pradesh, India.
manojpavani81@gmail.com

ABSTRACT

Crime is one of the biggest and most dominant problems in our society, and its prevention is an important task. A large number of crimes are committed every day. This requires keeping track of all crimes and maintaining a database of them that can be used for future reference. The current problems are maintaining a proper crime dataset and analyzing this data to help predict and solve crimes in the future. The objective of this project is to analyze a dataset consisting of numerous crimes and to predict the type of crime that may happen in the future depending on various conditions. In this project, we use machine learning and data science techniques for crime prediction on the Chicago crime data set. For this supervised classification problem, Decision Tree, Gaussian Naive Bayes, k-NN and Logistic Regression are considered. The approach involves crime prediction, classification, pattern detection and visualization with effective tools and technologies. Past crime data trends help us correlate factors that might help in understanding the future scope of crimes. In this work, various visualization techniques and machine learning algorithms are adopted for predicting the crime distribution over an area. In the first step, the raw datasets were processed and visualized as needed.

Keywords: crime analysis, prediction analysis, machine learning, decision trees, pattern detection.

1. INTRODUCTION

Crimes are a significant threat to humankind. Many crimes happen at regular intervals of time, and crime appears to be increasing and spreading at a fast rate. Crimes happen everywhere, from small villages and towns to big cities. Crimes are of different types: robbery, murder, rape, assault, battery, false imprisonment, kidnapping, homicide [1][2]. Since crimes are increasing, there is a need to solve cases much faster. Criminal activity has increased at a faster rate, and it is the responsibility of the police department to control and reduce it [3][4]. Crime prediction and criminal identification are major problems for the police department because a tremendous amount of crime data exists. There is a need for technology through which case solving could be faster. The objective is to train a model for prediction [5][6]. The model is trained on the training data set and validated on the test data set. The model is built with whichever algorithm gives the better accuracy. K-Nearest Neighbor (KNN) classification and other algorithms are used for crime prediction. The dataset is visualized to analyze the crimes that may have occurred in the country [33][34][35].

This work helps law enforcement agencies to predict and detect crimes in Chicago with improved accuracy and thus reduce the crime rate [7][8]. There has been a tremendous increase in machine learning algorithms that have made crime prediction feasible based on past data. The aim of this

www.jespublication.com Page No:441


project is to perform analysis and prediction of crimes in states using machine learning models [47][48]. It focuses on creating a model that can help to detect the number of crimes by type in a particular state [9][10].

In this project, various machine learning models such as K-NN and boosted decision trees are used to predict crimes [11][12]. Area-wise geographical analysis can be done to understand the pattern of crimes. Various visualization techniques and plots are used, which can help law enforcement agencies to detect and predict crimes with higher accuracy [30][31]. This will indirectly help reduce crime rates and can help to improve security in the areas that require it. Crimes can be predicted because criminals are active and operate in their comfort zones. Once successful, they try to replicate the crime under similar circumstances [13][14].

2. LITERATURE SURVEY

Crime prediction has been done on the Chicago data set using various machine learning models [15]. That paper compares models such as KNN, Naïve Bayes and SVM. It is seen that prediction varies depending upon the dataset and the features that have been selected [16][17][18]. The prediction accuracy reported is 78% for KNN, 64% for GaussianNB and 31% for SVC. Autoregressive integrated moving average models have been used to build machine learning models that forecast crime trends in urban areas [19][20]. One of the major problems in crime analysis is detecting and analyzing the pattern of crimes. Understanding the datasets is also important in this case. We want to predict accurately so that we do not waste resources on false signals [21][22][23]. A method has also been proposed for classifying the crime rate as high, medium or low. None of these works has classified the type of crime that can happen and its probability of happening [24][25][26]. Analysis and prediction of crime is an important activity that can be optimized using various techniques and processes. A lot of research has been done by various researchers in this domain. The existing work is limited to using the datasets to identify locations of crime [27][28][29].

3. PROPOSED SYSTEM

In the proposed system we use the random forest algorithm in order to get good results and better accuracy than the existing algorithms discussed above [41][42]. Random forest is a popular and powerful supervised machine learning algorithm capable of performing both classification and regression tasks. It operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees [43][44]. Random decision forests correct for decision trees' habit of overfitting to their training set. The data sets considered are rainfall, perception, production and temperature; the random forest, a collection of decision trees, is constructed from two-thirds of the records in the data sets [45][46]. These decision trees are applied to the remaining records for accurate classification. The accuracy score for the random forest algorithm is 86.6%.

Advantages of Random Forest

1. Random Forest is based on the bagging algorithm and uses the ensemble learning technique. It creates many trees on subsets of the data and combines the output of all the trees. In this way it reduces the overfitting problem of decision trees and also reduces the variance, and therefore improves the accuracy.
2. Random Forest can be used to solve both classification and regression problems.
3. Random Forest can automatically handle missing values.
4. Random Forest is usually robust to outliers and can handle them automatically.
5. The Random Forest algorithm is very stable. Even if a new data point is introduced in the dataset, the overall algorithm is not affected much, since the new data may impact one tree but it is very hard for it to impact all the trees.
6. Random Forest is comparatively less impacted by noise.

Our proposed system uses random forest, which gives an accuracy of 85.6%.
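The voting scheme described above can be illustrated with a small, self-contained Python sketch. It is not the implementation used in this work: the toy features (hour, district), the class names, the helper names (`fit_forest`, `predict_proba`, `predict`) and all parameter values are assumptions chosen for illustration, and in practice a library implementation such as scikit-learn's `RandomForestClassifier` would normally be used. Each tree is grown on a bootstrap sample, considers a random subset of features at every node, and the forest outputs the mode of the individual tree votes.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) pair that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, n_sub, rng, depth=0, max_depth=5):
    """Grow one tree; every node considers a random subset of n_sub features."""
    feature_ids = rng.sample(range(len(rows[0])), n_sub)
    split = best_split(rows, labels, feature_ids) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf holds the majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    l_rows, l_labels = zip(*left)
    r_rows, r_labels = zip(*right)
    return (f, t,
            build_tree(l_rows, l_labels, n_sub, rng, depth + 1, max_depth),
            build_tree(r_rows, r_labels, n_sub, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are 4-tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each tree is trained on a bootstrap sample of the records."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(rows[0]) ** 0.5))  # features tried per node
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_sub, rng))
    return forest

def predict_proba(forest, row):
    """Share of trees voting for each class."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return {c: v / len(forest) for c, v in votes.items()}

def predict(forest, row):
    """Class with the most votes (the mode of the tree predictions)."""
    return max(predict_proba(forest, row).items(), key=lambda kv: kv[1])[0]

# Hypothetical toy data: (hour of day, police district) -> crime type.
data_rng = random.Random(42)
rows, labels = [], []
for _ in range(60):
    rows.append((data_rng.randint(8, 13), data_rng.randint(1, 6)))
    labels.append("THEFT")
    rows.append((data_rng.randint(18, 23), data_rng.randint(9, 14)))
    labels.append("BATTERY")

n_train = len(rows) * 2 // 3  # two-thirds for training, the rest held out
forest = fit_forest(rows[:n_train], labels[:n_train])
hits = sum(predict(forest, r) == y for r, y in zip(rows[n_train:], labels[n_train:]))
accuracy = hits / (len(rows) - n_train)
print("held-out accuracy:", accuracy)
print("vote shares for (8, 1):", predict_proba(forest, (8, 1)))
```

The vote shares returned by `predict_proba` give a simple estimate of the probability of each crime type for a given record, which matches the goal of predicting both the type of crime and its probability of happening.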

The use of random forest gives accurate results of up to 86.6%. Visualization can be done by drawing various graphs such as bar graphs, line graphs and heat maps using the matplotlib library. Analysis of the crime dataset is done by plotting various graphs.
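As an illustration of how such graphs could be produced, the following Python sketch aggregates crime timestamps into counts per month (the data behind a line graph) and into a weekday-by-hour matrix (the data behind a heat map). The sample timestamps, variable names and output file name are hypothetical and are not taken from the data set used in this work. The plotting step is attempted only if matplotlib is installed.

```python
import os
import tempfile
from collections import Counter
from datetime import datetime

# Hypothetical sample of crime timestamps; the real data set would supply these.
timestamps = [
    datetime(2019, 1, 5, 23, 10), datetime(2019, 1, 18, 2, 45),
    datetime(2019, 2, 2, 14, 0), datetime(2019, 2, 9, 23, 30),
    datetime(2019, 2, 20, 23, 5), datetime(2019, 3, 7, 9, 15),
]

# Crimes per month: the series plotted in a line graph.
per_month = Counter(ts.strftime("%Y-%m") for ts in timestamps)
months = sorted(per_month)
counts = [per_month[m] for m in months]

# Weekday x hour matrix: the grid shown in a heat map (row 0 = Monday).
heat = [[0] * 24 for _ in range(7)]
for ts in timestamps:
    heat[ts.weekday()][ts.hour] += 1

try:
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(months, counts, marker="o")
    ax1.set(title="Crimes per month", xlabel="Month", ylabel="Number of crimes")
    im = ax2.imshow(heat, aspect="auto", cmap="hot")
    ax2.set(title="Crimes by weekday and hour",
            xlabel="Hour of day", ylabel="Weekday (0 = Monday)")
    fig.colorbar(im, ax=ax2)
    out_path = os.path.join(tempfile.gettempdir(), "crime_plots.png")
    fig.savefig(out_path)
    print("saved", out_path)
except ImportError:
    print("matplotlib not installed; monthly counts:", dict(zip(months, counts)))
```

Keeping the aggregation separate from the plotting makes it easy to reuse the same counts for bar graphs or other chart types.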

FIG 1: LINE GRAPH REPRESENTING CRIMES PER MONTH

Describe Information of crime

FIG 2: BAR GRAPH REPRESENTING COMPARISON OF CRIMES PER MONTH IN EACH YEAR

FIG 3: BAR GRAPH REPRESENTING CRIME-WISE ARRESTS

FIG 4: HEAT MAP (HOUR AT WHICH CRIME OCCURRED)

FIG 5: ACCURACY OF RANDOM FOREST ALGORITHM

4. CONCLUSION

With the help of machine learning technology, it has become easy to find relations and patterns among various data. The work in this project mainly revolves around predicting the type of crime that may happen given the location where it has occurred. Using machine learning, we have built a model from a training data set that has undergone data cleaning and data transformation. The model predicts the type of crime with an accuracy of 0.789. Data visualization helps in the analysis of the data set. The graphs include bar, pie, line and scatter graphs, each having its own characteristics. We generated many graphs and found interesting statistics that helped in understanding the Chicago crime dataset and that can help in capturing the factors that keep society safe. The tool we have developed provides a framework for visualizing crime networks and analyzing them with various machine learning algorithms using Google Maps. The project helps crime analysts to analyze these crime networks by means of various interactive visualizations. The interactive and visual features will be helpful in reporting and discovering crime patterns. Many classification models can be considered and compared in the analysis. It is evident that law enforcement agencies can take great advantage of machine learning algorithms to fight crime and save humanity. For better results, we need to update the data as early as possible by using current trends such as the web and apps.
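The data cleaning and data transformation steps mentioned above are not detailed in this paper. The following Python sketch shows one plausible form they could take. The column names (`Date`, `Primary Type`, `District`, `Arrest`), the date format and the sample rows are assumptions made for illustration and should be checked against the actual data set.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw extract; column names and the date format are assumptions.
RAW = """Date,Primary Type,District,Arrest
01/05/2019 11:10:00 PM,THEFT,11,false
01/18/2019 02:45:00 AM,BATTERY,,true
not a date,THEFT,7,false
02/02/2019 02:00:00 PM,battery ,7,true
"""

def clean(reader):
    """Drop incomplete or unparseable rows and derive numeric features."""
    rows = []
    for rec in reader:
        try:
            ts = datetime.strptime(rec["Date"], "%m/%d/%Y %I:%M:%S %p")
            district = int(rec["District"])
        except ValueError:
            continue  # malformed date or missing district: skip the record
        rows.append({
            "hour": ts.hour,
            "month": ts.month,
            "weekday": ts.weekday(),
            "district": district,
            "arrest": rec["Arrest"].strip().lower() == "true",
            "crime_type": rec["Primary Type"].strip().upper(),  # normalise labels
        })
    return rows

records = clean(csv.DictReader(io.StringIO(RAW)))

# Transformation: encode the crime type as an integer class label.
classes = sorted({r["crime_type"] for r in records})
class_to_id = {c: i for i, c in enumerate(classes)}

# Feature matrix X and target vector y, ready for a classifier.
X = [(r["hour"], r["month"], r["weekday"], r["district"], int(r["arrest"]))
     for r in records]
y = [class_to_id[r["crime_type"]] for r in records]

print(len(records), "clean records out of 4")
print("classes:", class_to_id)
```

Deriving hour, month and weekday from the timestamp turns a raw date string into features that a classifier can use directly.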

References

[1]. Lakshman Narayana Vejendla and A Peda Gopi, (2019), “Avoiding Interoperability and Delay in Healthcare Monitoring System Using Block Chain Technology”, Revue d'Intelligence Artificielle, Vol. 33, No. 1, pp. 45-48.
[2]. Gopi, A.P., Jyothi, R.N.S., Narayana, V.L. et al. (2020), “Classification of tweets data based on polarity using improved RBF kernel of SVM”, Int. j. inf. tecnol. (2020). https://doi.org/10.1007/s41870-019-00409-4.
[3]. A Peda Gopi and Lakshman Narayana Vejendla, (2019), “Certified Node Frequency in Social Network Using Parallel Diffusion Methods”, Ingénierie des Systèmes d'Information, Vol. 24, No. 1, pp. 113-117. DOI: 10.18280/isi.240117
[4]. Lakshman Narayana Vejendla and Bharathi C R, (2018), “Multi-mode Routing Algorithm with Cryptographic Techniques and Reduction of Packet Drop using 2ACK scheme in MANETs”, Smart Intelligent Computing and Applications, Vol. 1, pp. 649-658. DOI: 10.1007/978-981-13-1921-1_63
[5]. Lakshman Narayana Vejendla and Bharathi C R, (2018), “Effective multi-mode routing mechanism with master-slave technique and reduction of packet droppings using 2-ACK scheme in MANETS”, Modelling, Measurement and Control A, Vol. 91, Issue 2, pp. 73-76. DOI: 10.18280/mmc_a.910207
[6]. Lakshman Narayana Vejendla, A Peda Gopi and N. Ashok Kumar, (2018), “Different techniques for hiding the text information using text steganography techniques: A survey”, Ingénierie des Systèmes d'Information, Vol. 23, Issue 6, pp. 115-125. DOI: 10.3166/ISI.23.6.115-125
[7]. A Peda Gopi and Lakshman Narayana Vejendla, (2018), “Dynamic load balancing for client server assignment in distributed system using genetic algorithm”, Ingénierie des Systèmes d'Information, Vol. 23, Issue 6, pp. 87-98. DOI: 10.3166/ISI.23.6.87-98
[8]. Lakshman Narayana Vejendla and Bharathi C R, (2017), “Using customized Active Resource Routing and Tenable Association using Licentious Method Algorithm for secured mobile ad hoc network Management”, Advances in Modeling and Analysis B, Vol. 60, Issue 1, pp. 270-282. DOI: 10.18280/ama_b.600117
[9]. Lakshman Narayana Vejendla and Bharathi C R, (2017), “Identity Based Cryptography for Mobile ad hoc Networks”, Journal of Theoretical and Applied Information Technology, Vol. 95, Issue 5, pp. 1173-1181. EID: 2-s2.0-85015373447
[10]. Lakshman Narayana Vejendla and A Peda Gopi, (2017), “Visual cryptography for gray scale images with enhanced security mechanisms”, Traitement du Signal, Vol. 35, No. 3-4, pp. 197-208. DOI: 10.3166/ts.34.197-208
[11]. A Peda Gopi and Lakshman Narayana Vejendla, (2017), “Protected strength approach for image steganography”, Traitement du Signal, Vol. 35, No. 3-4, pp. 175-181. DOI: 10.3166/TS.34.175-181
[12]. Lakshman Narayana Vejendla and A Peda Gopi, (2020), “Design and Analysis of CMOS LNA with Extended Bandwidth For RF Applications”, Journal of Xi'an University of Architecture & Technology, Vol. 12, Issue 3, pp. 3759-3765. https://doi.org/10.37896/JXAT12.03/319.
[13]. Chaitanya, K., and S. Venkateswarlu, (2016), “DETECTION OF BLACKHOLE & GREYHOLE ATTACKS IN MANETs BASED ON ACKNOWLEDGEMENT BASED APPROACH”, Journal of Theoretical and Applied Information Technology, 89.1: 228.
[14]. Patibandla R.S.M.L., Kurra S.S., Mundukur N.B. (2012), “A Study on Scalability of Services and Privacy Issues in Cloud Computing”. In: Ramanujam R., Ramaswamy S. (eds) Distributed Computing and Internet Technology. ICDCIT 2012. Lecture Notes in Computer Science, vol 7154. Springer, Berlin, Heidelberg.
[15]. Patibandla R.S.M.L., Veeranjaneyulu N. (2018), “Survey on Clustering Algorithms for Unstructured Data”. In: Bhateja V., Coello Coello C., Satapathy S., Pattnaik P. (eds) Intelligent Engineering Informatics. Advances in Intelligent Systems and Computing, vol 695. Springer, Singapore.
[16]. Patibandla, R.S.M.L., Veeranjaneyulu, N. (2018), “Performance Analysis of Partition and Evolutionary Clustering Methods on Various Cluster Validation Criteria”, Arab J Sci Eng, Vol. 43, pp. 4379-4390.
[17]. R S M Lakshmi Patibandla, Santhi Sri Kurra and N. Veeranjaneyulu, (2015), “A Study on Real-Time Business Intelligence and Big Data”, Information Engineering, Vol. 4, pp. 1-6.
[18]. K. Santhisri and P.R.S.M. Lakshmi, (2015), “Comparative Study on Various Security Algorithms in Cloud Computing”, Recent Trends in Programming Languages, Vol. 2, No. 1, pp. 1-6.
[19]. K. Santhi Sri and PRSM Lakshmi, (2017), “DDoS Attacks, Detection Parameters and Mitigation in Cloud Environment”, IJMTST, Vol. 3, No. 1, pp. 79-82.
[20]. P.R.S.M. Lakshmi, K. Santhi Sri and Dr. N. Veeranjaneyulu, (2017), “A Study on Deployment of Web Applications Require Strong Consistency using Multiple Clouds”, IJMTST, Vol. 3, No. 1, pp. 14-17.
[21]. P.R.S.M. Lakshmi, K. Santhi Sri and M.V. Bhujanga Rao, (2017), “Workload Management through Load Balancing Algorithm in Scalable Cloud”, IJASTEMS, Vol. 3, No. 1, pp. 239-242.
[22]. K. Santhi Sri, P.R.S.M. Lakshmi, and M.V. Bhujanga Rao, (2017), “A Study of Security and Privacy Attacks in Cloud Computing Environment”, IJASTEMS, Vol. 3, No. 1, pp. 235-238.
[23]. R S M Lakshmi Patibandla and N. Veeranjaneyulu, (2018), “Explanatory & Complex Analysis of Structured Data to Enrich Data in Analytical Appliance”, International Journal for Modern Trends in Science and Technology, Vol. 04, Special Issue 01, pp. 147-151.
[24]. R S M Lakshmi Patibandla, Santhi Sri Kurra, Ande Prasad and N. Veeranjaneyulu, (2015), “Unstructured Data: Qualitative Analysis”, J. of Computation In Biosciences And Engineering, Vol. 2, No. 3, pp. 1-4.
[25]. R S M Lakshmi Patibandla, Santhi Sri Kurra and H.-J. Kim, (2014), “Electronic resource management using cloud computing for libraries”, International Journal of Applied Engineering Research, Vol. 9, pp. 18141-18147.
[26]. Ms. R.S.M. Lakshmi Patibandla, Dr. Ande Prasad and Mr. Y.R.P. Shankar, (2013), “SECURE ZONE IN CLOUD”, International Journal of Advances in Computer Networks and its Security, Vol. 3, No. 2, pp. 153-157.
[27]. Patibandla, R. S. M. Lakshmi et al., (2016), “Significance of Embedded Systems to IoT”, International Journal of Computer Science and Business Informatics, Vol. 16, No. 2, pp. 15-23.
[28]. Anveshini Dumala and S. PallamSetty, (2020), “LANMAR routing protocol to support real-time communications in MANETs using Soft computing technique”, 3rd International Conference on Data Engineering and Communication Technology (ICDECT-2019), Springer, Vol. 1079, pp. 231-243.
[29]. Anveshini Dumala and S. PallamSetty, (2019), “Investigating the Impact of Network Size on LANMAR Routing Protocol in a Multi-Hop Ad hoc Network”, i-manager’s Journal on Wireless Communication Networks (JWCN), Volume 7, No. 4, pp. 19-26.
[30]. Anveshini Dumala and S. PallamSetty, (2019), “Performance analysis of LANMAR routing protocol in SANET and MANET”, International Journal of Computer Science and Engineering (IJCSE), Vol. 7, No. 5, pp. 1237-1242.
[31]. Anveshini Dumala and S. PallamSetty, (2018), “A Comparative Study of Various Mobility Speeds of Nodes on the Performance of LANMAR in Mobile Ad hoc Network”, International Journal of Computer Science and Engineering (IJCSE), Vol. 6, No. 9, pp. 192-198.
[32]. Anveshini Dumala and S. PallamSetty, (2018), “Investigating the Impact of IEEE 802.11 Power Saving Mode on the Performance of LANMAR Routing Protocol in MANETs”, International Journal of Scientific Research in Computer Science and Management Studies (IJSRCSMS), Vol. 7, No. 4.
[33]. Anveshini Dumala and S. PallamSetty, (2016), “Analyzing the steady state behavior of RIP and OSPF routing protocols in the context of link failure and link recovery in Wide Area Network”, International Journal of Computer Science Organization Trends (IJCOT), Vol. 34, No. 2, pp. 19-22.
[34]. Anveshini Dumala and S. PallamSetty, (2016), “Investigating the Impact of Simulation Time on Convergence Activity & Duration of EIGRP, OSPF Routing Protocols under Link Failure and Link Recovery in WAN Using OPNET Modeler”, International Journal of Computer Science Trends and Technology (IJCST), Vol. 4, No. 5, pp. 38-42.
[35]. Vellalacheruvu Pavani and I. Ramesh Babu, (2019), “Three Level Cloud Storage Scheme for Providing Privacy Preserving using Edge Computing”, International Journal of Advanced Science and Technology, Vol. 28, No. 16, pp. 1929-1940.
[36]. Vellalacheruvu Pavani and I. Ramesh Babu, “A Novel Method to Optimize the Computation Overhead in Cloud Computing by Using Linear Programming”, International Journal of Research and Analytical Reviews, May 2019, Volume 6, Issue 2, pp. 820-830.
[37]. Anusha Papasani and Nagaraju Devarakonda, (2016), “Improvement of Aomdv Routing Protocol in Manet and Performance Analysis of Security Attacks”, International Journal Of Research in Computer Science & Engineering, Vol. 6, No. 5, pp. 4674-4685.
[38]. Sk. Reshmi Khadherbhi, K. Suresh Babu, “Big Data Search Space Reduction Based On User Perspective Using Map Reduce”, International Journal of Advanced Technology and Innovative Research, Volume 07, Issue No. 18, December 2015, pp. 3642-3647.
[39]. B.V. Suresh kumar, Sk. Reshmi Khadherbhi, “BIG-IOT Framework Applications and Challenges: A Survey”, Volume 7, Issue VII, July 2018, pp. 1257-1264.
[40]. P. Sandhya Krishna, Sk. Reshmi Khadherbhi, V. Pavani, “Unsupervised or Supervised Feature Finding For Study of Products Sentiment”, International Journal of Advanced Science and Technology, Vol. 28, No. 16 (2019).
[41]. K. Santhi Sri, Dr. Ande Prasad, (2013), “A Review of Cloud Computing and Security Issues at Different Levels in Cloud Computing”, International Journal on Advanced Computer Theory and Engineering, Vol. 2, pp. 67-73.
[42]. K. Santhi Sri, N. Veeranjaneyulu, (2018), “A Novel Key Management Using Elliptic and Diffie-Hellman for Managing users in Cloud Environment”, Advances in Modelling and Analysis B, Vol. 61, No. 2, pp. 106-112.
[43]. K. Santhi Sri, N. Veeranjaneyulu, (2019), “Decentralized Key Management Using Alternating Multilinear Forms for Cloud Data Sharing with Dynamic Multiprivileged Groups”, Mathematical Modelling of Engineering Problems, Vol. 6, No. 4, pp. 511-518.
[44]. S. Sasikala, P. Sudhakar, “Interpolation of CFA color Images with Hybrid image denoising”, 2014 Sixth International Conference on Computational Intelligence and Communication Networks, pp. 193-197. DOI: 10.1109/CICN.2014.53
[45]. Me. Jakeera Begum and M. Venkata Rao, (2015), “Collaborative Tagging Using CAPTCHA”, International Journal of Innovative Technology And Research, Volume No. 3, Issue No. 5, pp. 2436-2439.
[46]. L. Jagajeevan Rao, M. Venkata Rao, T. Vijaya Saradhi, (2016), “How The Smartcard Makes the Certification Verification Easy”, Journal of Theoretical and Applied Information Technology, Vol. 83, No. 2, pp. 180-186.
[47]. Venkata Rao Maddumala, R. Arunkumar, and S. Arivalagan, (2018), “An Empirical Review on Data Feature Selection and Big Data Clustering”, Asian Journal of Computer Science and Technology, Vol. 7, No. S1, pp. 96-100.
[48]. Singamaneni Kranthi Kumar, Pallela Dileep Kumar Reddy, Gajula Ramesh, Venkata Rao Maddumala, (2019), “Image Transformation Technique Using Steganography Methods Using LWT Technique”, Traitement du Signal, Vol. 36, No. 3, pp. 233-237.
