
2024 Second International Conference on Advances in Information Technology (ICAIT-2024)

Comparative Analysis of Neural Architectures for Underwater Object Detection

Sai Tejas K R, K S Ananda Kumar, Suhas P R, Vishwas L
Information Science and Engineering, Atria Institute of Technology, Bangalore, India
krsaitejas@gmail.com, anandgdk@gmail.com, suhaspranganath15@gmail.com, vishwaslokesh55@gmail.com

Abstract—Underwater object detection is crucial for environmental conservation but faces challenges like poor visibility, color distortions, and low contrast. This research leveraged advanced deep learning architectures, including Convolutional Neural Networks (CNNs), Region-based CNNs (R-CNNs), and YOLOv8, for improved underwater object recognition. A comprehensive underwater image dataset was curated, employing preprocessing and augmentation to mitigate degraded conditions. Extensive training and evaluation using precision, recall, and mean Average Precision (mAP) metrics were conducted. The CNN model performed well in identifying submerged objects, while R-CNNs achieved accurate localization by leveraging region proposals. Notably, YOLOv8 emerged as the best-performing architecture, achieving the highest real-time detection accuracy and outperforming the other models. Qualitative analysis confirmed YOLOv8's superiority in precisely defining boundaries and reducing false positives underwater. This research demonstrates deep learning's potential, particularly YOLOv8, for precise underwater object detection and localization, enabling enhanced visual analysis for conservation efforts. Future work involves incorporating more data and ensemble techniques for further accuracy improvements.

Keywords—Deep Learning, Object Detection, Underwater, Neural Networks

I. INTRODUCTION

Deep learning's latest developments have demonstrated encouraging outcomes on visual recognition tasks [1]. Convolutional neural networks (CNNs) like YOLO [2] and Faster R-CNN [3] can extract discriminative features and detect objects effectively. In this project, we leverage CNN architectures to classify key objects in underwater images and videos. Appropriate preprocessing and data augmentation techniques are utilized to improve image quality and enrich the training data. The system aims to advance underwater visual analysis by evaluating and comparing the performance of models against established standards. Underwater visual monitoring is pivotal for understanding and mitigating threats to marine ecosystems. However, conventional strategies for underwater observation and data collection are labor-intensive and often suffer from poor visibility. This leads to low contrast, blurred images, and color distortions, making it difficult to analyze underwater data. Moreover, the unique characteristics of underwater environments, such as high levels of turbidity and changing light conditions, pose extra challenges for image acquisition and analysis. Advancements in deep learning have shown promise in addressing these challenges. Convolutional Neural Networks (CNNs), such as YOLO and Faster R-CNN, have demonstrated their capacity to extract discriminative features and detect objects in images [4]. We aim to curate a large real-world underwater trash image dataset and employ specialized preprocessing and augmentation techniques to improve image quality. We plan to train various CNN models such as YOLOv8 and Faster R-CNN on this dataset. Our ultimate objective is to identify the best solution for accurate underwater object classification, thereby contributing to the broader effort of conserving marine ecosystems. This research introduces several novelties in the field of underwater object detection. First, we leverage the ImageNet pre-trained model as a base for our CNN architecture, freezing specific layers to improve performance and efficiency. Moreover, we explore modifications to the Region Proposal Network (RPN) and Feature Pyramid Network (FPN) components in the RCNN architectures, aiming to enhance their accuracy and speed. Notably, our approach enables inference on video data, a crucial capability for real-time underwater monitoring and surveillance. Additionally, our models are trained to detect a wide range of underwater trash classes, making them valuable tools for environmental conservation efforts, particularly in large fish tanks or underwater environments where manual cleaning is challenging.

The rest of the paper is organized as follows. We first provide a detailed review of the literature on underwater image analysis and the challenges faced by current methods, then describe the methodology employed in this project, followed by the results of our experiments; finally, we summarize the contributions of our work and suggest directions for future research.


II. RELATED WORKS

The study by A. N. Tarekegn et al. [1] evaluates three advanced YOLO models—YOLOv8, YOLOv7, and YOLOv5—for automatic image preprocessing and object detection, using public underwater image datasets. Results show pre-trained models excel on the Brackish dataset, with YOLOv5 and YOLOv8 achieving 99% mean average precision (mAP), surpassing YOLOv7's 89%. Applying an underwater image enhancement algorithm to the URPC2021 dataset boosts detection accuracy by 3% across all models. YOLOv5 stands out for its high inference speed, maintaining competitive mAP and recall. This integration of deep learning models with AUVs promises more efficient and accurate marine exploration [1].

B. Gašparović et al. [2] present a comparative analysis of different versions of the You Only Look Once (YOLO) object detection approach in challenging underwater conditions. The study aims to determine whether newer YOLO versions demonstrate superiority over older ones in object detection performance. The authors utilized a dataset collected by a remotely operated vehicle (ROV) during an underwater pipeline inspection. The evaluated YOLO versions include YOLOv5, YOLOv6, YOLOv7, and the latest YOLOv8. The evaluation is based on mean Average Precision (mAP) scores, precision-recall curves, and confidence levels of detected objects [2].

D. Zhao et al. [3] introduce an innovative target detection algorithm for underwater fish, leveraging an improved Faster region-based convolutional neural network (iFaster RCNN). The algorithm addresses the challenge of multi-scale detection by integrating a feature pyramid network (FPN) with the original Faster RCNN. To boost detection accuracy and speed, the Distance-Intersection-over-Union (DIoU) metric replaces the traditional Intersection-over-Union (IoU). Experimental outcomes indicate that iFaster RCNN, equipped with FPN and DIoU, excels in detecting underwater fish. To assess the algorithm's effectiveness, it was compared with VGG16, MobileNetV2, and ResNet50 as backbone feature extraction networks. The results reveal that ResNet50 outperforms VGG16 and MobileNetV2, underscoring its superiority in enhancing the detection capability of iFaster RCNN for underwater fish [3].

P. Athira et al. [4] summarize a successful implementation of underwater object detection using YOLOv3 and the Fish4Knowledge dataset. The model achieves a high accuracy of 96.17% and a mean average precision of 96.61%. As future scope, the authors suggest further tuning for detecting different object classes and potential integration with object trackers for tracking and analysis [4].

W. Hao and N. Xiao [6] proposed an improved YOLOv4 detection method for underwater object detection. The challenges in underwater images, including texture distortion and color variations, are addressed through modifications to the network structure. Experimental results demonstrate the effectiveness of the suggested method: relative to the original YOLOv4 model, the improved method achieves a 4.8% higher accuracy (AP), a 5.1% higher F1-score, and an mAP@0.5 of 81.5%, which is 5.6% higher [6].

H. Wang et al. [7] provide an overview of the YOLOv5 architecture, which includes input, backbone, neck, and head components. Noteworthy features such as mosaic data augmentation and adaptive anchor box optimization are highlighted. The paper proposes YOLOv5 as a suitable baseline for integrated undersea object recognition and presents experimental results on an underwater biological dataset from the Underwater Robot Professional Contest (URPC). The dataset comprises images of various underwater scenarios, and the YOLOv5 algorithm is tested in four different sizes (s, m, l, x) with default hyperparameters [7].

P. B, C. Anuradha et al. [9] evaluate image enhancement and super-resolution on publicly accessible underwater benchmarks. The proposed methodology is evaluated on datasets such as UIEB, EUVP, and UFO-120, showcasing the effectiveness of Deep WaveNet in enhancing underwater images and improving resolution. The results indicate significant improvements in SSIM and UIQM metrics compared to existing works. The authors propose extending this technique to underwater video enhancement in future research [9].

Z. Zhang et al. [10] demonstrate that underwater datasets often suffer from problems like blurred objects, low contrast, and color distortion, hindering the performance of target detectors. Their study explores the influence of data enhancement methods on underwater target detection using three datasets and seven detection models, including Faster-RCNN, SSD, and YOLO v1-v5. The evaluation is conducted on the Trash-ICRA19 dataset, and YOLOv5 is identified as the best performing model. The paper ends by stressing the importance of image enhancement in distinguishing objects from backgrounds in underwater scenarios [10].

The experimental results of S. Wang et al. [8] indicate that YOLOv7 outperforms YOLOv5 in terms of correctness, effectively mitigating issues like occlusion, image blur, and color distortion. The study has implications for underwater unmanned systems, emphasizing the significance of accurate detection algorithms for tasks like intelligence gathering and anti-submarine operations [8].

N. Reddy Nandyala and R. Kumar Sanodiya [11] conclude by highlighting the implications of their findings for the design and development of object detection in underwater environments. Future work includes investigating additional metrics such as frames per second (FPS) and inference time for benchmarking purposes, and assessing the performance of other state-of-the-art architectures like Faster RCNN [11].

III. PROPOSED METHOD

The methodology involves leveraging several advanced architectures: a CNN, an RCNN, a Faster RCNN, and YOLOv8, a robust object recognition model.

A. Data Collection, Annotation and Preparation


The project starts by gathering data manually from Google and other online sources. This data is then organized into different folders based on their category or class. Next, the LabelImg tool is used to annotate the pictures by drawing bounding boxes around the objects of interest and giving them the correct class names, following a specific annotation format such as YOLO or Pascal VOC. The annotated dataset is then prepared for training by dividing it into training, validation, and testing sets. To make the training data more diverse, data augmentation techniques like rotating, flipping, and scaling are used. Finally, necessary preprocessing steps, such as resizing and converting the data into the right format, are performed to make it ready for training the model. TABLE I provides details about the dataset used. Of its 7,982 total images, 7,532 are allocated for training the model, 300 for validation during training, and 150 for final testing and evaluation. The distinct splitting of data into training, validation, and testing sets is a standard practice to build and assess robust machine learning models effectively.

TABLE I. DATASET DESCRIPTION

Total Images: 7,982
Number of Classes: 15
Training: 7,532
Validation: 300
Testing: 150
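To make the preparation step concrete, the following is a minimal sketch of the train/validation/test split, assuming a folder-per-class layout; the directory names and split ratios (chosen to approximate the counts in TABLE I) are illustrative, not the exact script used in this work.

```python
import random
import shutil
from pathlib import Path

random.seed(42)
SRC = Path("dataset/raw")    # hypothetical folder-per-class image layout
DST = Path("dataset/split")  # output root for the three subsets
SPLITS = {"train": 0.94, "val": 0.04}  # remainder goes to "test"

for class_dir in sorted(p for p in SRC.iterdir() if p.is_dir()):
    images = sorted(class_dir.glob("*.jpg"))
    random.shuffle(images)
    n_train = int(len(images) * SPLITS["train"])
    n_val = int(len(images) * SPLITS["val"])
    subsets = {
        "train": images[:n_train],
        "val": images[n_train:n_train + n_val],
        "test": images[n_train + n_val:],
    }
    for split, files in subsets.items():
        out_dir = DST / split / class_dir.name
        out_dir.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, out_dir / f.name)  # keep the raw data untouched
```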
B. Training Environment and Model Architecture

To train our models, we use Google Colab for its powerful GPUs and computing resources. We install the necessary software components such as TensorFlow, Keras, and OpenCV. We choose an appropriate architecture, such as CNN, RCNN, RetinaNet, or Faster RCNN, to meet the specific requirements and challenges of our task. Defining the architecture involves specifying layers, hyperparameters, and configurations. We use deep learning frameworks like TensorFlow or PyTorch to build our models.

CNN Architecture

Fig 1. Architecture of CNN

Fig. 1 illustrates the general architecture of a convolutional neural network (CNN) used in deep learning. To leverage the power of transfer learning, we utilized the ImageNet pre-trained MobileNetV2 model as the base for our CNN architecture. Specifically, we froze the initial layers of the MobileNetV2 model, allowing them to serve as feature extractors while enabling the subsequent layers to be fine-tuned on our underwater dataset. This approach not only improved the model's performance but also enhanced its computational efficiency, as the frozen layers did not require retraining, reducing the overall training time and computational resources required. To enrich the training data, we use techniques such as rotating, shifting, shearing, zooming, and horizontally flipping the training images. We create data generators for the training and validation datasets by specifying criteria such as the target image size, batch size, and classification mode. Next, the pre-trained MobileNetV2 model is loaded without its top classification layers. Custom layers, such as global average pooling, dense layers, and a dropout layer, are added on top of the pre-trained model. A softmax activation function is used in the output layer to perform multi-class classification. The model is then compiled using the appropriate loss function, optimizer, and evaluation metrics. Training and validation accuracy curves are plotted for visual analysis. Once the model is trained, it is retrieved from storage for evaluation purposes. For each class in the test dataset, evaluation metrics such as precision, recall, and mean Average Precision are determined. Lastly, the test pictures are presented alongside the predicted class assignments.
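A compact Keras sketch of this transfer-learning setup is shown below; the frozen-layer cut-off, dense-layer width, dropout rate, and augmentation parameters are illustrative assumptions rather than the exact configuration used.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.preprocessing.image import ImageDataGenerator

NUM_CLASSES = 15  # from TABLE I

# Data generators with the augmentations described in the text.
train_gen = ImageDataGenerator(
    rescale=1.0 / 255, rotation_range=20, width_shift_range=0.1,
    height_shift_range=0.1, shear_range=0.1, zoom_range=0.1,
    horizontal_flip=True,
).flow_from_directory("dataset/split/train",  # hypothetical path
                      target_size=(224, 224), batch_size=32,
                      class_mode="categorical")
val_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "dataset/split/val", target_size=(224, 224), batch_size=32,
    class_mode="categorical")

# ImageNet-pretrained base, loaded without its top classification layers.
base = MobileNetV2(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
for layer in base.layers[:100]:  # freeze the initial layers (assumed cut-off)
    layer.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),   # assumed width
    layers.Dropout(0.3),                    # assumed rate
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy",
                       tf.keras.metrics.Precision(name="precision"),
                       tf.keras.metrics.Recall(name="recall")])
history = model.fit(train_gen, validation_data=val_gen, epochs=20)
```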


RCNN Architecture

Fig 2. Architecture of RCNN

Fig. 2 depicts the workflow of an R-CNN model, a type of object detection algorithm. It consists of four main steps: taking an input image, extracting region proposals using a method like selective search, computing CNN features for each proposed region, and classifying those regions into different object categories or as background using the extracted CNN features. To optimize the performance of the RCNN architectures for underwater object detection, we explored modifications to the Region Proposal Network (RPN) and Feature Pyramid Network (FPN) components. For the RCNN model, we excluded the RPN and FPN components, relying solely on the selective search algorithm for region proposals. The number of classes is determined by the dataset, and an output directory is established to store model checkpoints. The model training procedure involves setting up a DefaultTrainer instance with the supplied parameters. Training can be restarted from a checkpoint or begin from scratch. The model is then trained for the desired number of iterations. Following training, the model is assessed by loading the trained model weights from the last checkpoint and setting the test threshold. A data loader is created for the testing dataset, and the COCO Evaluator is used to assess the model's performance on the dataset, including measures like mean Average Precision (mAP). Finally, for inference and visualization, the process cycles over the images in the testing dataset. The predictor is used to make inferences on each picture, and the predicted instances (bounding boxes and class labels) are shown on the test images using the Detectron2 Visualizer.
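The inference-and-visualization pass can be sketched as follows; since the exact selective-search RCNN configuration is not listed here, a standard Detectron2 model-zoo config stands in, and the registered dataset name, weights path, and threshold are assumptions.

```python
import cv2
from pathlib import Path
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import DatasetCatalog, MetadataCatalog

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))  # stand-in config
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 15                 # from TABLE I
cfg.MODEL.WEIGHTS = "./output/model_final.pth"       # last checkpoint
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5          # assumed test threshold
predictor = DefaultPredictor(cfg)

metadata = MetadataCatalog.get("underwater_test")
for record in DatasetCatalog.get("underwater_test"):
    img = cv2.imread(record["file_name"])
    outputs = predictor(img)  # boxes, classes, and scores for one image
    vis = Visualizer(img[:, :, ::-1], metadata=metadata)  # BGR -> RGB
    drawn = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2.imwrite("vis_" + Path(record["file_name"]).name,
                drawn.get_image()[:, :, ::-1])        # RGB -> BGR for OpenCV
```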
Faster RCNN Architecture [13], [14]

Fig 3. Faster RCNN Overview

Fig. 3 illustrates the Faster R-CNN architecture. For the Faster RCNN model, we included both the RPN and FPN components, enabling it to generate region proposals more efficiently and to leverage multi-scale feature maps for improved object detection accuracy. These architectural modifications were driven by the unique challenges posed by underwater environments, such as low contrast and color distortions, and aimed to optimize the models' performance for our specific task [4]. To start, the project establishes the necessary environment by installing essential libraries like PyTorch, CUDA, and Detectron2. To confirm successful loading and annotation, a sample of training data is visually inspected. To prepare the model, we combine the Faster R-CNN setup with a ResNet-50 FPN backbone from Detectron2's model library. The model is trained for a particular number of iterations. After training, the model's final weights are loaded, and a predictor is created using the trained model for predictions. The model is tested on a test dataset with the COCO Evaluator, which computes performance measures like mean Average Precision (mAP). The evaluation outcomes are shown, demonstrating how well the model did. The test set is finally evaluated, and the findings, which include precision, recall, and mAP, are printed.
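The following condensed Detectron2 sketch mirrors these steps (ResNet-50 FPN backbone, DefaultTrainer, COCOEvaluator); the registered dataset names, batch size, learning rate, and iteration budget are illustrative assumptions.

```python
import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer, DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("underwater_train",)   # hypothetical registered names
cfg.DATASETS.TEST = ("underwater_test",)
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")  # COCO-pretrained weights
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 15
cfg.SOLVER.IMS_PER_BATCH = 4                 # assumed batch size
cfg.SOLVER.BASE_LR = 0.00025                 # assumed learning rate
cfg.SOLVER.MAX_ITER = 5000                   # assumed iteration budget
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)  # resume=True restarts from a checkpoint
trainer.train()

# Evaluate the final checkpoint; COCOEvaluator reports mAP on the test set.
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("underwater_test", output_dir=cfg.OUTPUT_DIR)
loader = build_detection_test_loader(cfg, "underwater_test")
print(inference_on_dataset(predictor.model, loader, evaluator))
```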
YOLOv8 Architecture

Fig 4. Architecture of YOLOv8

Fig. 4 depicts the typical pipeline of a convolutional neural network (CNN) for image classification tasks. It shows the sequential stages of convolution, nonlinear activation (like ReLU), pooling, and fully connected layers, ultimately leading to the final classification output. The prepared dataset is loaded into the training platform or deep learning framework (such as TensorFlow, PyTorch, or Darknet) for the YOLO (You Only Look Once) model. We use the base model of size m. We define or construct the YOLO model architecture, including layers, hyperparameters, and other customizations. After training is completed, we assess the model's performance on the validation dataset using conventional measures like precision, recall, and mean average precision (mAP) to gain insights into the model's learning behavior and convergence during the training process. Finally, the trained YOLO model is tested on an unseen dataset to assess its generalization ability and performance on new, unseen data [11], [12].
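With the Ultralytics API, this workflow reduces to a few calls; the dataset YAML name and hyperparameters below are assumptions for illustration.

```python
from ultralytics import YOLO

# Medium ("m") base model, as stated in the text.
model = YOLO("yolov8m.pt")

# Train on the custom dataset; "underwater.yaml" is a hypothetical file
# listing the 15 class names and the train/val/test image paths.
model.train(data="underwater.yaml", epochs=100, imgsz=640, batch=16)

# Validation reports precision, recall, and mAP per class and overall.
metrics = model.val()
print(metrics.box.map50)   # mAP@0.5

# Inference on unseen images; video files are also accepted as sources.
results = model.predict("test_images/", save=True, conf=0.5)
```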
C. Model Training, Evaluation, and Testing

Data is fed into the training system, and training settings (batch size, learning rate, number of training cycles) are set up. The training process begins, and metrics like the loss are monitored. After training, the model's accuracy is checked on a validation dataset using measurements like precision, recall, and mAP. Training statistics (loss, accuracy) are examined and displayed to see how the model learned. The model is further tested on a fresh, unseen test dataset to see how well it generalizes, using the same metrics. Precision and recall are calculated using the following formulas:

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

where TP, FP, and FN are the total numbers of true positives, false positives, and false negatives, respectively.
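These definitions translate directly into code; the counts in the small helper below are purely illustrative.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Purely illustrative counts, not results from this paper:
p, r = precision_recall(tp=83, fp=17, fn=17)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.830, recall=0.830
```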
D. Backend and Frontend Development with Integration

The backend is built on Flask, a Python web framework. It handles the logic for loading the trained model and applying it to analyze images. HTML, CSS, and JavaScript are used in the design and development of the user-facing frontend, which provides an interface where people are able to submit pictures for analysis. The frontend and backend are then connected: the frontend transfers images for analysis to the backend, which sends back the results; the frontend then processes and displays the results to the user [15].
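A minimal Flask endpoint of the kind described here could look as follows; the route name, weights path, and response fields are assumptions, with YOLOv8 standing in as the served model.

```python
import io
from flask import Flask, request, jsonify
from PIL import Image
from ultralytics import YOLO

app = Flask(__name__)
model = YOLO("runs/detect/train/weights/best.pt")  # hypothetical weights path

@app.route("/predict", methods=["POST"])
def predict():
    # The frontend submits the image as multipart/form-data under "image".
    file = request.files["image"]
    img = Image.open(io.BytesIO(file.read())).convert("RGB")
    results = model.predict(img)[0]
    detections = [
        {
            "class": results.names[int(box.cls)],
            "confidence": float(box.conf),
            "box": [float(v) for v in box.xyxy[0]],  # x1, y1, x2, y2
        }
        for box in results.boxes
    ]
    return jsonify(detections)

if __name__ == "__main__":
    app.run(debug=True)
```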


E. Flowchart

Fig. 5 Flowchart

Fig. 5 illustrates the flow of the preprocessing and training process for our machine learning model. It starts with the raw data being split into training and test sets. The training data goes through feature engineering and preprocessing steps. The preprocessed data is then used to train the model using a specific algorithm. The trained model's performance is evaluated on the test set, and hyperparameter tuning may be performed to optimize the model's performance. Finally, the trained model is deployed for making predictions on new, unseen data.
IV. RESULTS AND DISCUSSIONS
In this section, we present underwater object detection results for the CNN, RCNN, Faster RCNN, and YOLOv8
models. We trained each model on the custom dataset and
tested the resulting models on the test part of the dataset to
measure their performance. Out of all the architectures we
considered, YOLOv8 stood out for its exceptional
performance. It achieved a precision of 0.83, a recall of
0.833, and an impressive mAP of 86.8%. This outstanding
performance can be attributed to the advanced capabilities of
YOLOv8 in object detection and localization, making it
highly suitable for underwater visual surveillance and
conservation efforts.


The CNN model showed impressive results, boasting a precision of 0.780, recall of 0.743, and a mAP of 80%. However, it is important to mention that the CNN architecture lacks the capability to localize and identify the position of detected objects in images, which could be a limitation in applications requiring precise object localization. The RCNN and Faster RCNN models showed similar performance, with a precision of 0.624 and 0.681, and a recall of 0.623 and 0.685; their mAP scores were 65% and 68%, respectively. Although they didn't outperform the best models, they provide valuable insights and potential for further exploration in underwater object detection tasks. In general, the findings indicate that YOLOv8 is the best architecture for detecting objects underwater, thanks to its precision, recall, and mAP values. However, it's crucial to take into account the unique needs of the application and the balance between performance and computational complexity when choosing the right architecture. The system requirements include a high-performance Intel i5 or equivalent CPU, at least 8 GB RAM, an NVIDIA RTX GPU for accelerated deep learning, and a 1 TB HDD with a minimum of 256 GB SSD storage. The software requirements encompass Ubuntu 20.04/22.04 or Windows 10/11 operating systems, the Python programming language along with the TensorFlow, PyTorch, and OpenCV libraries, and tools like Visual Studio Code, Jupyter Notebook, and Google Colab, as well as labeling tools like CVAT and LabelImg for dataset annotation.

Fig. 6 (a) precision result for YOLOv8, (b) recall result of YOLOv8, (c) mAP result of YOLOv8.

Fig. 6 contains three plots showing the progression of different evaluation metrics for our model over training epochs. Plot (a) tracks the precision metric, which increases and stabilizes after a certain number of epochs. Plot (b) illustrates the recall metric, exhibiting a similar increasing and converging pattern. Plot (c) depicts the mean average precision (mAP) metric, also demonstrating an upward trend and eventual convergence as the training progresses through more epochs.

Fig. 7 Inference time of models

The combination of high accuracy and real-time inference capabilities showcased by the YOLOv8 model, as seen in Fig. 7, holds significant promise for practical applications in underwater monitoring and conservation efforts. Its ability to process video data efficiently makes it well-suited for integration with autonomous underwater vehicles (AUVs) or remotely operated vehicles (ROVs), enabling efficient exploration and surveillance of marine environments.

Fig. 8 Precision and Recall of CNN Model

Fig. 8 shows the training and validation precision and recall curves over epochs for the convolutional neural network (CNN) model. The X-axis is the number of epochs and the Y-axis is the value of precision and recall. The graph provides insights into the model's performance in terms of precision (ability to identify relevant instances) and recall (ability to identify all relevant instances). The gap between the training and validation curves can indicate overfitting or underfitting. If the training curves are significantly higher than the validation curves, it suggests overfitting, where the model is memorizing the training data instead of learning generalizable patterns. On the other hand, if both training and validation curves are low and show little improvement, it could indicate underfitting, where the model is not complex enough to learn the underlying patterns.
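Such curves can be produced directly from the Keras training history; the sketch below assumes the model.fit call from the CNN section stored its return value as history and that precision and recall were tracked as metrics during compilation.

```python
import matplotlib.pyplot as plt

def plot_metric(history, metric: str) -> None:
    """Plot training vs. validation curves for one metric; a persistent gap
    suggests overfitting, while uniformly low curves suggest underfitting."""
    plt.plot(history.history[metric], label=f"train {metric}")
    plt.plot(history.history[f"val_{metric}"], label=f"val {metric}")
    plt.xlabel("epoch")
    plt.ylabel(metric)
    plt.legend()
    plt.show()

plot_metric(history, "precision")
plot_metric(history, "recall")
```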
TABLE II. STANDARD METRICS

Architecture    Precision  Recall  mAP
CNN             0.780      0.743   0.800
RCNN            0.624      0.623   0.650
Faster RCNN     0.681      0.685   0.680
YOLOv8 [2]      0.840      0.969   0.951
YOLOv8 [ours]   0.851      0.971   0.964

Table II summarizes the performance of the object detection architectures evaluated on our dataset: CNN, RCNN, Faster RCNN, and YOLOv8. Our implementation of the YOLOv8 architecture achieved impressive results, with a precision of 0.851, recall of 0.971, and a mAP of 0.964, closely matching the performance reported for YOLOv8 in [2]. Notably, our dataset consisted of 15 classes and was larger than the 5-class dataset used in [2]. The robustness of our YOLOv8 implementation in handling a more complex and diverse dataset, while maintaining competitive performance, highlights its effectiveness for real-world object detection tasks.

Fig. 9 Bar graph comparing metric values across the different neural network architectures.

V. CONCLUSION

A comparative analysis of several neural architectures for underwater object identification produced some interesting findings. A detailed evaluation was done using metrics such as precision, recall, and mAP on a test dataset of underwater photos. The CNN model performed rather well, correctly identifying and categorizing most of the submerged items. Nevertheless, the CNN was surpassed by the RCNN and Faster RCNN designs, which used their region proposal procedures to more accurately locate and identify objects in complex underwater environments. The best-performing model was YOLOv8, which demonstrated its mastery of real-time object identification by achieving a precision of 0.851, recall of 0.971, and mAP of 0.964. Qualitative analysis of sample output images further confirmed YOLOv8's superiority in accurately defining object boundaries and reducing false positives in difficult underwater conditions. The current research is limited by the size and diversity of the underwater image dataset, the specific set of object classes considered, and the computational requirements and inference times of the evaluated architectures, which may not be optimized for real-time or resource-constrained applications. In terms of future work, incorporating more diverse underwater images into the dataset can help the models better understand different scenarios and object types, thereby improving their accuracy, and expanding the set of object classes can further enhance the generalization ability and robustness of the models.

ACKNOWLEDGEMENT

We express our gratitude to Atria Institute of Technology for providing us with the necessary facilities to complete our project on Comparative Analysis of Neural Architectures for Underwater Object Detection.
REFERENCES
[1] A. N. Tarekegn et al., "Underwater Object Detection using Image
Enhancement and Deep Learning Models," 2023 11th European
Workshop on Visual Information Processing (EUVIP), Gjovik,
Norway, 2023, pp. 1-6, doi: 10.1109/EUVIP58404.2023.10323047.
[2] B. Gašparović, G. Mauša, J. Rukavina and J. Lerga, "Evaluating
YOLOV5, YOLOV6, YOLOV7, and YOLOV8 in Underwater
Environment: Is There Real Improvement?," 2023 8th International
Conference on Smart and Sustainable Technologies (SpliTech),
Split/Bol, Croatia, 2023, pp. 1-4, doi:
10.23919/SpliTech58164.2023.10193505.
[3] D. Zhao, B. Yang, Y. Dou and X. Guo, "Underwater fish detection in
sonar image based on an improved Faster RCNN," 2022 9th
International Forum on Electrical Engineering and Automation
(IFEEA), Zhuhai, China, 2022, pp. 358-363, doi:
10.1109/IFEEA57288.2022.10038226.
[4] P. Athira, T. P. Mithun Haridas and M. H. Supriya, "Underwater
Object Detection model based on YOLOv3 architecture using Deep
Neural Networks," 2021 7th International Conference on Advanced
Computing and Communication Systems (ICACCS), Coimbatore,
India, 2021, pp. 40-45, doi: 10.1109/ICACCS51430.2021.9441905.
[5] Y. Wang, J. Liu, S. Yu, K. Wang, Z. Han and Y. Tang, "Underwater
Object Detection based on YOLO-v3 network," 2021 IEEE
International Conference on Unmanned Systems (ICUS), Beijing,
China, 2021, pp. 571-575, doi: 10.1109/ICUS52573.2021.9641489.


[6] W. Hao and N. Xiao, "Research on Underwater Object Detection Based on Improved YOLOv4," 2021 8th International Conference on Information, Cybernetics, and Computational Social Systems (ICCSS), Beijing, China, 2021, pp. 166-171, doi: 10.1109/ICCSS53909.2021.9722013.
[7] H. Wang et al., "A YOLOv5 Baseline for Underwater Object Detection," OCEANS 2021: San Diego – Porto, San Diego, CA, USA, 2021, pp. 1-4, doi: 10.23919/OCEANS44145.2021.9705896.
[8] S. Wang, W. Wu, X. Wang, Y. Han and Y. Ma, "Underwater optical image object detection based on YOLOv7 algorithm," OCEANS 2023 - Limerick, Limerick, Ireland, 2023, pp. 1-5, doi: 10.1109/OCEANSLimerick52467.2023.10244658.
[9] P. B, C. Anuradha, H. I and M. M, "Underwater Image Enhancement and Super Resolution based on Deep CNN Method," 2022 8th International Conference on Smart Structures and Systems (ICSSS), Chennai, India, 2022, pp. 01-04, doi: 10.1109/ICSSS54381.2022.9782177.
[10] Z. Zhang, Q. Tong, C. Yi, X. Fu, J. Ai and Z. Wang, "The Appropriate Image Enhancement Method for Underwater Object Detection," 2022 IEEE 22nd International Conference on Communication Technology (ICCT), Nanjing, China, 2022, pp. 1627-1632, doi: 10.1109/ICCT56141.2022.10073015.
[11] N. Reddy Nandyala and R. Kumar Sanodiya, "Underwater Object Detection Using Synthetic Data," 2023 11th International Symposium on Electronic Systems Devices and Computing (ESDC), Sri City, India, 2023, pp. 1-6, doi: 10.1109/ESDC56251.2023.10149870.
[12] X. Wang, X. Jiang, Z. Xia and X. Feng, "Underwater Object Detection Based on Enhanced YOLO," 2022 International Conference on Image Processing and Media Computing (ICIPMC), Xi'an, China, 2022, pp. 17-21, doi: 10.1109/ICIPMC55686.2022.00012.
[13] X. Yin, J. Lu and Y. Liu, "Garbage Detection on The Water Surface Based on Deep Learning," 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI), Shijiazhuang, China, 2022, pp. 679-683, doi: 10.1109/ICCEAI55464.2022.00145.
[14] A. Balaji, Y. S, K. CK, N. R, G. Dooly and S. Dhanalakshmi, "Deep WaveNet-based YOLO V5 for Underwater Object Detection," OCEANS 2023 - Limerick, Limerick, Ireland, 2023, pp. 1-5, doi: 10.1109/OCEANSLimerick52467.2023.10244645.
[15] S. Raavi, P. B. Chandu and S. T, "Automated Recognition of Underwater Objects using Deep Learning," 2023 7th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2023, pp. 1055-1059, doi: 10.1109/ICOEI56765.2023.10125839.

