
2023 International Conferences on Data Science, Agents, and Artificial Intelligence

Legal Case Classification Using Machine Learning with NLP

2023 International Conference on Data Science, Agents & Artificial Intelligence (ICDSAAI) | 979-8-3503-4891-0/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICDSAAI59313.2023.10452618

Manjusha Singh Tomar
Department of Computer Science and Engineering
Netaji Subhas University of Technology
Delhi, India

Vishal Gupta
Department of Computer Science and Engineering
Netaji Subhas University of Technology
Delhi, India
Abstract: In India, offences occur daily; a crime can be minor or severe. The Indian judicial system provides justice for cases of every severity. However, reporting a crime to get justice requires proper steps. Cases involving murder fall under criminal cases, while cases related to a family dispute over a property fall under civil cases. Accurately dividing legal cases into civil and criminal categories is essential for efficient legal case management and decision-making processes. In recent years, deep learning models have advanced significantly on natural language processing problems. This study offers a strategy based on BERT for classifying legal cases as civil or criminal. The proposed model fine-tunes the BERT-Base model on a corpus of labelled court case documents, with hyperparameter tuning and early stopping, using BERT's ability to capture contextual information and precise relationships between words and sentences. The attention mechanism in BERT enables the model to understand the nuances of the legal dataset and distinguish between civil and criminal scenarios based on the given legal context. Experimental results on a dataset of legal cases show our method's effectiveness in terms of accuracy, precision, recall, and F1-score in categorizing legal cases. The BERT-based model exhibits promising potential for automating case categorization, enabling efficient case management, assisting legal professionals in decision-making processes, and assisting regular people in making informed decisions by providing insightful analyses and predictions based on the available data, with an accuracy of 92.9% and an F1 score of 0.93.

Index Terms: Binary Classification, BERT, LR, Naive Bayes, SVM, legal case prediction, Text classification.

I. INTRODUCTION

In India, offences occur daily; offences can be minor or severe. The Indian judicial system provides justice for cases of every severity. Nevertheless, reporting an offence to get justice requires proper steps, and not all citizens of India are adequately aware of these steps. This problem has created much dependency on lawyers. Moreover, people often need help finding the steps, and offences often go unreported. Cases where murder is involved fall under criminal cases, while cases related to a family dispute over a property fall under civil cases. When the victim first reports the offence to a lawyer or the police, the victim uses natural language only, and based on the provided information, the lawyer focuses on the keywords that readily classify the case as civil or criminal. The same approach could be used in classifying legal cases using AI techniques. There are two types of law:

A. Civil Law and Civil Cases

Civil law's primary purpose is to settle disagreements and conflicts between people, families, or groups of people by giving victims fair and appropriate compensation. In contrast to criminal law, civil law places more emphasis on compensating the victims than punishing the accused. With the assistance of skilled lawyers and judges, Civil Law Courts hear and decide civil cases. Most violations of the law are either civil or criminal, and there are different protocols to follow for each. Civil cases entail a disagreement between individuals or groups, usually over money. Civil litigation begins when a legal entity files a "complaint" with the court claiming to have been harmed by the activities of another person or entity. The well-established rules of the Code of Civil Procedure serve as guidance for the majority of civil lawsuits.

B. Criminal Law and Criminal Cases

Criminal law is a broad field that deals with issues arising from police investigations and arrests based on possible criminal conduct. In addition to court proceedings, criminal law deals with charges, accusations, and guilty pleas. Criminal law also deals with probation issues and attempts to seal or expunge criminal records. A criminal case refers to a type of court proceeding in which the accused person is charged with actions that are considered unlawful by the state legislature or government. A criminal case usually begins once the suspect has been arrested and informed of the charges (via FIR), usually during an arraignment hearing. In criminal cases, the accused is presumed innocent until the prosecution can prove his or her guilt beyond a reasonable doubt. When a crime occurs, victims are often unaware of the proper steps they should take; many of them take their time to file charges, and many refrain from filing charges if the crime is not very serious. This problem could be addressed with NLP.

Natural language processing (NLP) is critical for understanding and predicting laws on a large scale because almost all laws are expressed in natural language. NLP transforms unstructured material into a formal format that helps in analysis. Combining law and natural language processing leads to the following: (i) legal text data increases further, which enlarges the dataset and gives more effective results compared to the current scenario, (ii) we can make advancements in NLP by


Authorized licensed use limited to: MEHRAN UNIV OF ENGINEERING AND TECHNOLOGY. Downloaded on May 26,2024 at 21:40:17 UTC from IEEE Xplore. Restrictions apply.
improving algorithmic and hardware methods, and (iii) these factors can help in classifying cases as civil or criminal.

Fig. 1. Legal Case Classifier

C. Objectives

Whenever a crime is committed, many people do not know what steps to take to get justice. This is especially true when the offence is less serious. They are unaware whether it is a civil or criminal case, and there are different steps to follow for each. With the advent of artificial intelligence, its advantages can be used in various fields, including the field of law. This research focuses on solving the problem described above with the help of NLP.

D. Challenges

For the model to be accurate, we must collect real data on civil and criminal cases, which is time-consuming. In India, there is not much research on AI's benefit in law because, unlike in other countries, Indian legal data is largely stored in hard copy form. As digitization increases, data is being scanned and stored as soft copy, but this is still in progress. As Indian law and the types of cases span a vast range, it is impossible to cover every case, and attempting to do so would be time-consuming and tedious.

II. RELATED WORK

John J. Nay endeavoured to incorporate explicit legal knowledge into legal judgment prediction. Based on deep learning methods, the suggested model represents factual information about laws as first-order logic rules. These are then seamlessly integrated into a co-attention network-based model, forming an end-to-end approach. Including this knowledge imparts an inductive bias that aids in comprehending legal data [1].

In 2021, Leilei Gan created a novel dataset for predicting legal judgments in English. This dataset included cases from the European Court of Human Rights. Models trained on the new dataset outperformed the previous feature-based model. Additionally, the study investigated whether models exhibited biases towards demographic information by employing data anonymization techniques [2].

Souvik Sengupta uses the vector space model and NLP techniques to address a new problem: suggesting an appropriate IPC section from user input consisting of crime-related reports [4].

Ambrish Srivastav utilised traditional machine learning classification techniques, and leave-one-out cross-validation was used to obtain the findings. Traditional metrics are used for measuring and checking the performance of the classifier, along with the trial findings and the statistical distribution of characteristics [5].

A method for determining the essential kind of semantic link between things, known as semantic similarity, was suggested by Rafe Athar Shaikha. It is built on Google Directory, the Open Directory Project's search interface. Two applications of semantic similarity measurement are the generation of entities through knowledge acquisition and the fine-grained classification of entities [6].

Another study takes Supreme Court decisions involving Indian law and generates a scorecard counting how frequently each Justice voted in favor of the "pro-Indian" result. It links those findings to each Justice's political philosophy in order to demonstrate that, despite a general trend, a more "liberal" Justice is more likely to promote the pro-Indian interest [7].

Grant Christensen uses NLP, referring to an indexed keyword store in which multiple laws are identified. The system is modelled on positive and negative feedback: the feedback revises the reward and lets the system learn from the outcome, so the system provides a solution based on user feedback [8].

Five traditional ML models, including LR (Logistic Regression), the Bagging model, the RF (Random Forest) model and SVM, were compared by D. Sangeetha. The four machine learning models other than SVM, with different settings, and the text with semantic information were helpful for feature selection for the models used for prediction [9].

A 2017 study by Zhenyu Liu, using data from actual judicial cases, compares decision trees, SVM, KNN, and random forests. It describes the experimental setup, including the selection of features, the feature selection procedure, and the division of the train-test set. The findings are presented after a performance analysis of each model's accuracy, precision, recall, and F1 score. In the context of judicial cases, the study also addresses the advantages, disadvantages, and variables affecting prediction performance [10].

Another paper describes the methodology used to develop a question answering system, involving techniques such as natural language processing (NLP) and text classification. The authors may discuss the process of curating a dataset of IPC sections and Indian amendment laws, as well as the pre-processing steps applied to the legal texts. They present the architecture and functionality of their question answering system, explaining how it can handle user queries, identify relevant keywords, and retrieve appropriate answers from the legal corpus. They may also discuss the metrics used to evaluate how the system performs, such as precision, recall, and accuracy [12].

A further paper focuses on applying ML techniques to crime pattern detection, analysis, and prediction tasks. It covers various aspects of the process, including feature selection, model training, and evaluating the results. This may include supervised algorithms such as DT, RF, SVM, or

unsupervised learning techniques like clustering algorithms. This paper includes the selection of features and the training process of the models [13].

III. RESEARCH METHODOLOGY

A. Data Corpus Collection

This research uses a dataset of civil and criminal cases, which is processed and then used for training the model. The dataset contains 393 samples separated into two classes:
1) 205 Civil cases.
2) 188 Criminal cases.
This sample dataset was collected from the Central Bureau of Investigation and Indian Kanoon sites. Since the dataset was small, dummy data was also added.

The civil cases dataset is divided based on the following:
• Personal injury claims
• Property disputes
• Breach of contract
• Employment disputes
• Divorce and family law cases
• Debt collection cases
• Landlord-tenant disputes
• Intellectual property disputes
• Consumer protection cases
• Medical malpractice claims

The criminal cases dataset is divided based on the following:
• Murder
• Rape and sexual assault
• Theft and robbery
• Assault and battery
• Domestic violence
• Drug trafficking and possession
• Fraud and white-collar crimes
• Cybercrimes
• Kidnapping and abduction
• Terrorism and related offences

Fig. 2. Visualization of Dataset

B. Proposed Model

Bert-Base Model

There are four main types of BERT models: BERT-Base, BERT-Large, Multilingual BERT, and Domain-specific BERT. In this paper, our binary classification is implemented using the BERT-Base model, which consists of 12 encoder stacks and a hidden size of 768. BERT's contextual representation learning capabilities make it effective at understanding and capturing the nuances in text, which can be valuable for classifying civil and criminal cases. The architecture of BERT-Base is shown in Figure 3.

Fig. 3. Architecture of Bert-Base

Fig. 4. Architecture of Encoder

Bert-Base Hyper-parameter Optimization

Since our civil and criminal case dataset is small, hyperparameter tuning is the most crucial step for better performance. The hyperparameters of our proposed model include the learning rate, batch size, optimizer selection, dropout rate and number of epochs. By tuning these hyperparameters, the best performance is obtained from the proposed model. Several approaches can be explored to improve the accuracy of the model. These include adjusting the sequence length of the input, modifying the number of neurons in the layers, adding or removing layers, and tuning the learning rate of the optimizer. The Adam optimizer is utilized in our model, which offers benefits such as efficient weight updates, reduced memory usage, and faster training time. We tried to fine-tune the model and enhance its performance by experimenting with these factors.
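As a rough illustration of the optimizer mentioned above, the following sketch applies the standard Adam update rule to a one-dimensional toy objective in plain Python. The function name `adam_minimize`, the toy gradient, and the hyperparameter values (`lr`, `beta1`, `beta2`, `eps`, `steps`) are assumptions chosen for illustration; they are not the settings used in this paper, which does not report them.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimise a 1-D function with Adam, given its gradient function.

    All hyperparameter defaults here are illustrative assumptions.
    """
    x = x0
    m = 0.0  # running average of gradients (first moment)
    v = 0.0  # running average of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Parameter update scaled by the per-parameter adaptive step size.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); the minimum is at x = 3.
x_min = adam_minimize(lambda x: 2 * (x - 3.0), x0=0.0)
print(round(x_min, 2))
```

In a fine-tuning setting the same update would be applied to every model weight, with the learning rate being one of the hyperparameters tuned as described above.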

Final working architecture of Bert-Base model

Fig. 5. Process of Proposed Bert-Base

C. Traditional Models

Support Vector Machine

SVM is used for binary classification problems. It works by finding an optimal hyperplane that separates the classes while maximizing the margin between the data points. SVM can handle continuous data and, using kernel functions, data that is not linearly separable. It has an excellent ability to control outliers and is known for its ability to manage high-dimensional data.

Naive Bayes

The probabilistic algorithm Naive Bayes is based on Bayes' theorem. It assumes that the presence of one attribute does not depend on the presence of another. Naive Bayes computes each class probability in light of the features and selects the class with the highest likelihood as the prediction. Due to its effectiveness, scalability, and ability to handle large amounts of data, it is well-liked for applications such as text classification and case filtering.

Random Forest

Random Forest is an ensemble learning method that combines many decision trees to predict the result. Each tree is constructed using a random subset of the features and the data points. During prediction, the results of the individual trees are merged to get the final prediction. Random Forest often produces accurate results and is resistant to overfitting. High-dimensional data, categorical features, and missing values are all supported.

IV. RESULTS

After training the Bert-Base model, Naïve Bayes, Random Forest, and SVM on our collected dataset, the results are as follows.

TABLE I
PREDICTION TABLE

Model          Accuracy  Precision  Recall  F1-Score
Bert-Base      0.929     0.93       0.93    0.93
SVM            0.878     0.88       0.88    0.88
Naive Bayes    0.919     0.92       0.92    0.92
Random Forest  0.878     0.88       0.88    0.88

The following metrics are used for prediction and comparison.

A. Accuracy

The confusion matrix helps in visualizing the performance of the model. Taking the civil class as the positive class, the four elements of a confusion matrix are:
1) True Negative (TN): Criminal case sample predicted as a Criminal case.
2) True Positive (TP): Civil case sample predicted as a Civil case.
3) False Positive (FP): Criminal case sample predicted as a Civil case.
4) False Negative (FN): Civil case sample predicted as a Criminal case.
The accuracy of the model is defined as:

$$\mathrm{Acc} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1}$$

B. Precision

Precision is the number of correctly predicted civil cases out of all samples predicted as civil. The precision is given by:

$$\mathrm{Prec} = \frac{TP}{TP + FP} \tag{2}$$

C. Recall

Recall is the number of correctly predicted civil cases out of all actual civil samples. It is calculated with the formula below.

$$\mathrm{Rec} = \frac{TP}{TP + FN} \tag{3}$$

D. F1-Score

The F1 score combines precision and recall as their harmonic mean:

$$\mathrm{F1\text{-}Score} = \frac{2 \cdot \mathrm{Prec} \cdot \mathrm{Rec}}{\mathrm{Prec} + \mathrm{Rec}} \tag{4}$$

Analyzing the final version of the model, Bert-Base outperforms the other ML models. Table I compares the results of all four models implemented in this paper. Figure 6 shows the confusion matrix of the Bert-Base model, Figure 7 the confusion matrix of RF, Figure 8 the confusion matrix of Naïve Bayes, and Figure 9 the confusion matrix of SVM.
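The four metrics above can be computed directly from confusion-matrix counts. The sketch below is a minimal illustration in plain Python using the standard definitions, with the civil class taken as positive; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not the paper's confusion matrix).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 85/100 = 0.85, the precision is 40/45, the recall is 40/50, and the F1 score is 80/95.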

Fig. 6. Proposed Model Confusion Matrix

Fig. 7. Confusion Matrix

Fig. 8. Confusion Matrix

Fig. 9. Confusion Matrix

V. CONCLUSION AND FUTURE SCOPE

The Bert-Base model achieved the highest scores, with an accuracy of 0.929, precision of 0.93, recall of 0.93, and F1-score of 0.93. It outperformed the SVM, NB, and RF models on accuracy and the other metrics. As reported in Table I, the Naïve Bayes model came second, with an accuracy of 0.919 and precision, recall, and F1-scores of 0.92. The SVM and Random Forest models showed identical performance, with accuracy scores of 0.878 and precision, recall, and F1-scores of 0.88. Based on these results, it can be concluded that the Bert-Base model is the most effective in classifying legal cases as civil or criminal. It showed high accuracy and consistent performance across the evaluated metrics. Since the dataset used for training and evaluation was relatively small (393 samples), future work would involve acquiring a more extensive and diverse dataset of legal cases. A more extensive dataset can help improve the models' generalization and robustness. Collaborating with legal domain experts and considering their insights can enhance the models' performance and applicability, for example by indicating which IPC section would be applicable to a given case.

REFERENCES
[1] John J. Nay (2021), "Natural Language Processing for Legal Texts", Legal Informatics, Cambridge University Press, pp. 1-35.
[2] Leilei Gan, Kun Kuang, Yi Yang, Fei Wu (2021), "Judgment Prediction via Injecting Legal Knowledge into Neural Networks", AAAI Technical Track on Speech and Natural Language Processing I, vol. 35, no. 14.
[3] Ilias Chalkidis, Ion Androutsopoulos, Nikolaos Aletras (2019), "Neural Legal Judgment Prediction in English", arXiv:1906.02059v1 [cs.CL].
[4] Souvik Sengupta, Vishwang Dave (2022), "Predicting applicable law sections from judicial case reports using legislative text analysis with machine learning", Springer: Journal of Computational Social Science, pp. 503–516.
[5] Ambrish Srivastav, Shaligram Prajapat (2021), "Text similarity algorithms to determine Indian penal code sections for offence report", IAES International Journal of Artificial Intelligence (IJ-AI), March 2022, pp. 34-40.
[6] Rafe Athar Shaikha, Tirath Prasad Sahua, Veena Anand, "Predicting outcomes of Legal Cases based on Legal Factors using Classifiers", International Conference on Computational Intelligence and Data Science (ICCIDS 2019).
[7] Jiahui Liu, Larry Birnbaum (2007), "Measuring Semantic Similarity between Named Entities by Searching the Web Directory", International Conference on Web Technology.
[8] Grant Christensen (2021), "Predicting Supreme Court Behavior in Indian Law Cases", Michigan Journal of Race and Law, vol. 26.
[9] D. Sangeetha, R. Kavyashri, S. Swetha, S. Vignesh (2016), "Information retrieval system for laws", Eighth International Conference on Advanced Computing (ICoAC).
[10] Zhenyu Liu, Huanhuan Chen (2017), "A predictive performance comparison of machine learning models for judicial cases", IEEE Symposium Series on Computational Intelligence (SSCI).
[11] Afnan Iftikhar, Syed Waqar Ul Qounain Jaffry, Muhammad Kamran Malik (2019), "Information Mining from Criminal Judgments of Lahore High Court", IEEE.
[12] R. P. Kamdi, A. J. Agrawal (2015), "Keywords based Closed Domain Question Answering System for Indian Penal Code Sections and Indian Amendment Laws", International Journal of Intelligent Systems and Applications, vol. 7, no. 12, pp. 57–67.
[13] Rohit Patil, Muzamil Kacchi, Pranali Gavali, Komal Pimparia (2020), "Crime Pattern Detection, Analysis Prediction using Machine Learning", International Research Journal of Engineering and Technology (IRJET).
[14] Nikolaos Aletras, Dimitrios Tsarapatsanis, Daniel Preoţiuc-Pietro, Vasileios Lampos (2016), "Predicting judicial decisions of the European Court of Human Rights: A Natural Language Processing perspective", PeerJ Computer Science.
[15] Thaer Sahmoud, Dr.MohammadA (2022), "Spam Detection Using BERT", Computer Engineering Department, Islamic University of Gaza, Palestine, arXiv:2206.02443.
