See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/362374571

Mushroom classification using machine-learning techniques

Conference Paper · July 2022


DOI: 10.1063/5.0174721


All content following this page was uploaded by Monther Tarawneh on 31 July 2022.

Mushroom classification using machine-learning techniques

Omar Tarawneh1, Monther Tarawneh2, Yousef Sharrab3 and Moath Husni4

1 Software Engineering Department, Amman Arab University, Amman, Jordan
2,3 Computer Science Department, Isra University, Amman, Jordan
4 Software Engineering Department, The World Islamic Sciences and Education University, Amman, Jordan
1 o.husain@aau.edu.jo
2,3 mtarawneh@iu.edu.jo, sharrab@iu.edu.jo
4 Moath.tarawneh@wise.edu.jo

Abstract—Mushrooms are an important food ingredient and a good source of nutrients. Many types of mushroom are poisonous (inedible),
so poisonous mushrooms must be reliably distinguished from edible ones. Machine learning (ML) techniques such as naïve Bayes,
decision tree, and SVM have been applied to mushroom features to classify mushrooms as edible or not. Research on mushroom
classification is limited, and existing work applies ML techniques individually, with some algorithms performing better than others
in terms of accuracy. This research proposes an integrated model that combines the decisions of the most accurate techniques into one
decision instead of treating them individually. The mushroom dataset was downloaded from the UCI repository. Results show that the
integrated model outperforms the individual techniques, with an accuracy of 94%.

Keywords—machine learning, integrated model, classification algorithm, edible mushroom, inedible mushroom, naïve Bayes, decision
tree, support vector machine, artificial neural networks.

INTRODUCTION
People eat many types of food and increasingly focus on healthy options. Mushrooms are among the healthiest foods on the planet and
grow without any effort. They can grow on or in the ground or on other plants. Mushrooms are used as an ingredient throughout the food
industry and offer great benefits to the body, as they contain potent nutrients such as calcium, phosphorus, vitamins and proteins.
Mushrooms are used to treat cancer, eradicate viruses, strengthen the immune system, support weight loss, and as part of healthy diet
programs [1]. Mushroom consumption has increased recently [2]. However, mushrooms can be classified as edible or inedible (poisonous).
There are many types of mushrooms, of which 50-100 types cannot be eaten [1], while another source states that most mushrooms cannot
be eaten [3]. Eating collected mushrooms without knowing their type is a serious mistake, and the effects of eating inedible mushrooms
range from mild symptoms to death.
People look at the physical characteristics of a mushroom, such as shape, neck length, head diameter, size, color and its
environment, to decide whether it is edible or inedible. A large amount of mushroom data has been collected over the years, and
applications have been developed to classify mushrooms. Different classification techniques have been developed and improved to give
more accurate decisions [4]. In this work, classification algorithms are compared to identify the one with the highest accuracy.

LITERATURE REVIEW
Advanced technology has had a huge impact on human life, and many tools have been developed to make life better. Machine learning
(ML) is the field of study that enables computers to learn from data [5]. ML is used to extract information from data and make
decisions, and it has been used in all types of industries. Many machine-learning approaches have been applied to find the optimal
solution from huge data sets. Data classification is a two-step process. The first step is the learning step, in which the
classification model is constructed. The second step is the classification (testing) step, in which the constructed model is used to
predict class labels for given data.
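The two-step process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single-feature rule classifier, the function names `learn`, `classify` and `accuracy`, and the toy records are all hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter, defaultdict

def learn(records, labels, feature):
    """Learning step: map each value of one feature to its majority class."""
    counts = defaultdict(Counter)
    for record, label in zip(records, labels):
        counts[record[feature]][label] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fall back to the overall majority class for values never seen in training.
    default = Counter(labels).most_common(1)[0][0]
    return rule, default

def classify(model, records, feature):
    """Classification (testing) step: predict a label for each record."""
    rule, default = model
    return [rule.get(record[feature], default) for record in records]

def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical toy data, not drawn from the UCI mushroom dataset.
train_x = [{"odor": "none"}, {"odor": "foul"}, {"odor": "none"},
           {"odor": "foul"}, {"odor": "fishy"}]
train_y = ["edible", "poisonous", "edible", "poisonous", "poisonous"]
test_x = [{"odor": "none"}, {"odor": "foul"}, {"odor": "musty"}, {"odor": "none"}]
test_y = ["edible", "poisonous", "poisonous", "poisonous"]

model = learn(train_x, train_y, "odor")          # step 1: learning
predictions = classify(model, test_x, "odor")    # step 2: classification
test_accuracy = accuracy(predictions, test_y)
print(predictions, test_accuracy)
```

Keeping the test records separate from the training records is what makes the reported accuracy an estimate of performance on unseen data.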
The approach used depends on the nature of the dataset, the number of variables, and the model. Supervised learning is an ML
approach that maps an input to an output based on example input-output pairs. The data are divided into two sets: a training set and a
test set. The training set is used to learn patterns, which are then applied to classify the test set. The unsupervised approach
learns structure from unlabeled data and then uses it to group new data. K-means is an unsupervised learning algorithm
that solves the clustering problem. It defines k centers, one for each cluster, and it is better to place the initial centers far away from each other.
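A minimal k-means sketch may help make the idea of k centers concrete. The farthest-point initialization below is one simple way to place the initial centers far apart; it is an assumption chosen for illustration, as are the function names and the toy points.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_far_apart(points, k):
    """Pick k initial centers: start with the first point, then repeatedly
    add the point that is farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(points,
                       key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(farthest)
    return centers

def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    centers = init_far_apart(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = k_means(points, k=2)
print(centers)
```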
The decision tree is a common method that represents choices and their results in a tree. It has several implementations: ID3, J48,
C4.5, Random Forest, Random Tree, ID3+, OC1 and CLOUDS. It has been used to classify mushrooms as edible or poisonous
based on their behavioral features [4]. That study used the J48 implementation and reported the same running time on the training and
test sets, with 100% accuracy. A comparative study of the most used classification methods [6] shows that the best result was obtained
by KNN. KNN has been used together with naïve Bayes to get more accurate and efficient results [7]. Naïve Bayes is a classification
technique based on Bayes' theorem. It counts the frequencies of the data and their values to calculate probabilities, and it assumes
that a feature in a class is independent of the other features [8]. Research has stated that the performance of the Bayesian
classifier is better than that of the decision tree and selected neural networks [9]. Image processing techniques have been used with
naïve Bayes and KNN algorithms to classify mushrooms [10], where naïve Bayes showed better accuracy than KNN.
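The frequency-counting idea behind naïve Bayes can be sketched for categorical features as follows. This is a minimal illustration, not the implementation used in any cited study: the add-one (Laplace) smoothing, the function names and the toy records are assumptions.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(records, labels):
    """Count how often each class occurs and how often each feature value
    occurs within each class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature, label) -> value frequencies
    feature_values = defaultdict(set)     # feature -> distinct values seen
    for record, label in zip(records, labels):
        for feature, value in record.items():
            value_counts[(feature, label)][value] += 1
            feature_values[feature].add(value)
    return class_counts, value_counts, feature_values

def predict_naive_bayes(model, record):
    """Return the class with the highest posterior, treating features as
    independent given the class (the 'naive' assumption)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_prob = math.log(count / total)            # class prior
        for feature, value in record.items():
            seen = value_counts[(feature, label)][value]
            distinct = len(feature_values[feature])
            # Add-one smoothing (an assumption) avoids zero probabilities.
            log_prob += math.log((seen + 1) / (count + distinct))
        scores[label] = log_prob
    return max(scores, key=scores.get)

# Hypothetical toy records, not drawn from the UCI mushroom dataset.
records = [
    {"odor": "none", "gill-color": "brown"},
    {"odor": "none", "gill-color": "white"},
    {"odor": "foul", "gill-color": "buff"},
    {"odor": "foul", "gill-color": "white"},
]
labels = ["edible", "edible", "poisonous", "poisonous"]

model = train_naive_bayes(records, labels)
first = predict_naive_bayes(model, {"odor": "none", "gill-color": "white"})
second = predict_naive_bayes(model, {"odor": "foul", "gill-color": "buff"})
print(first, second)
```

Working with log-probabilities avoids numerical underflow when many per-feature probabilities are multiplied together.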
Artificial neural networks (ANNs) are parallel computational models comprised of densely interconnected, adaptive processing units,
characterized by an inherent propensity for learning from experience and for discovering new knowledge [11]. The predictive power of a
neural network for classifying whether a mushroom is edible or poisonous has been examined [12], with an average predictability rate
of 99.25%. That study used the JustNN environment to build the network, a feed-forward multi-layer perceptron with one input layer,
three hidden layers and one output layer. A comparative study shows that SVM outperforms all other algorithms with 76% accuracy [13].
The study is based on the mushroom's texture to determine whether it is edible or not. A two-phase approach was used: a training
process and a determination process [9]. Naive Bayes and decision tree classifiers have also been compared: the accuracy of the
decision tree is better than that of naive Bayes, while naive Bayes needs slightly less time than the decision tree [4].
A mushroom classification model was built using ML on physical data of 22 attributes from 800 samples [14]. The study compared several
algorithms, including Naive Bayes, KNN, and SGD Text, where KNN showed the highest accuracy. However, the comparison was run on only
200 samples of edible and inedible mushrooms. Another study shows that ANFIS performed with the best accuracy [15].
Many other studies have been conducted, with a focus on the accuracy and performance of individual models. A hybrid model
that combines several techniques could achieve better results [16].

METHODOLOGY
The main aim of this research is to apply machine-learning approaches to classify mushrooms as edible or inedible. It is important to
reach a reliable decision to avoid the harmful effects of eating inedible mushrooms. We used a hybrid approach to achieve better
accuracy. The proposed model consists of three phases: data pre-processing, classification, and combination.
Dataset
The mushroom dataset was downloaded from the UCI repository [17]. The data set comprises 8124 records and 22 attributes. Each
mushroom record is labeled as edible or poisonous: 4208 records are edible mushrooms and 3916 are poisonous mushrooms.
Table 1 summarizes the attributes used for classifying mushrooms.
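As a quick check on the figures above, the class shares can be computed directly from the stated counts. The majority-class baseline is added here only as a reference point for reading accuracy figures; it is not a result reported in this paper.

```python
# Class counts as stated in the dataset description above.
edible = 4208
poisonous = 3916
total = edible + poisonous

edible_share = edible / total
poisonous_share = poisonous / total

# Accuracy of a trivial classifier that always predicts the larger class.
majority_baseline = max(edible, poisonous) / total

print(total)                                # 8124
print(round(edible_share * 100, 1))         # 51.8
print(round(poisonous_share * 100, 1))      # 48.2
print(round(majority_baseline * 100, 1))    # 51.8
```

The two classes are nearly balanced, so accuracy is a reasonable first metric for comparing classifiers on this dataset.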
Model
The first step is to prepare the data. Preparation includes removing null values and repeated features.
Python packages were used to create the dataset from the raw data. The 22 features are important for classifying mushrooms as edible
or inedible, so we examined them to decide which ones contribute to the classification process. We decided to drop two
attributes: “stalk-root” and “veil-type”. The reason is that “stalk-root” has 2480 missing values, and “veil-type” has only one
value and therefore does not help the classification. In addition, “gill-color” is expected to contribute the most to the
classification and has to be considered.
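The column-dropping step can be sketched without any external packages. The paper does not list the exact code, so everything below is an assumption made for illustration: the function names, the "?" marker for missing values, the 25% threshold and the toy rows.

```python
def columns_to_drop(rows, missing_marker="?", max_missing_share=0.25):
    """Return columns that have too many missing values or only one value.

    The "?" marker and the 25% threshold are illustrative assumptions.
    """
    drop = []
    for column in rows[0]:
        values = [row[column] for row in rows]
        missing_share = values.count(missing_marker) / len(values)
        if missing_share > max_missing_share or len(set(values)) == 1:
            drop.append(column)
    return drop

def drop_columns(rows, columns):
    """Return copies of the rows without the given columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]

# Hypothetical rows that mimic the two problems described above.
rows = [
    {"odor": "n", "stalk-root": "?", "veil-type": "p"},
    {"odor": "f", "stalk-root": "b", "veil-type": "p"},
    {"odor": "n", "stalk-root": "?", "veil-type": "p"},
    {"odor": "a", "stalk-root": "c", "veil-type": "p"},
]

dropped = columns_to_drop(rows)
cleaned = drop_columns(rows, dropped)
print(dropped)
print(cleaned)
```

A constant column such as “veil-type” carries no information for separating the classes, which is why it can be removed safely.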
TABLE 1. Mushroom dataset attribute information

FIGURE 1. Data set view and number of null values


FIGURE 2. Classifiers accuracy comparison

The second phase is classification. Several classifiers are applied to the data, and the hybrid approach then makes a decision based
on the three most accurate classifiers. More than three classifiers can be used, but we noticed that this does not change the result
much. Therefore, the decision on whether a mushroom is edible or inedible is taken by combining the decisions of these classifiers.
We applied KNN, SVM, naïve Bayes, ANN, logistic regression, and decision tree. Accuracies are rounded to the closest integer value.
KNN outperformed all other classifiers with an accuracy of 94%, ANN was close behind with 93%, and SVM followed with 91%. The proposed
model performs slightly better by combining the decisions of KNN, ANN and SVM.
The result of the proposed model is better than the individual models' performance, and it is consistent across several test runs.
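One common way to combine the decisions of three classifiers is simple majority voting, sketched below. The paper does not state which combination rule was used, so majority voting is an assumption here, and the prediction lists are hypothetical values chosen so that each classifier errs on a different sample.

```python
from collections import Counter

def majority_vote(*predictions_per_classifier):
    """Combine per-sample predictions from several classifiers by taking
    the most frequent label for each sample."""
    combined = []
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical labels: "e" = edible, "p" = poisonous.
truth = ["e", "p", "e", "p", "e"]
knn = ["e", "p", "e", "e", "e"]   # wrong on the 4th sample
ann = ["e", "p", "p", "p", "e"]   # wrong on the 3rd sample
svm = ["p", "p", "e", "p", "e"]   # wrong on the 1st sample

hybrid = majority_vote(knn, ann, svm)
print(accuracy(knn, truth), accuracy(ann, truth), accuracy(svm, truth))
print(hybrid, accuracy(hybrid, truth))
```

With three voters and two classes there is never a tie, and the combined decision improves on the individual ones only when the classifiers make their mistakes on different samples.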

CONCLUSION
Mushrooms are an important source of essential proteins and vitamins. However, many types of known mushrooms are poisonous. ML
has been employed to classify mushrooms as edible or inedible based on their characteristics. Previous research used ML techniques
individually, where some techniques perform better than others in terms of accuracy, so it is unclear which method should be used.
In this study, we applied several machine-learning classifiers to the mushroom dataset downloaded from the UCI repository. After
exploring the data, we noticed that one feature, “stalk-root”, contains many missing values and another feature, “veil-type”, has the
same value for all rows. These two features were dropped to avoid their effect on the classification. The “odor_n” feature is the most
important feature and has the most effect on the decision. Based on the results, we conclude that if a mushroom has an odor, it is
more likely to be inedible.
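A feature name such as “odor_n” typically arises when a categorical attribute is one-hot encoded, with one 0/1 indicator column per value. The sketch below assumes that this is how the feature was produced and that “n” is the dataset's code for no odor; the function name and the toy rows are hypothetical.

```python
def one_hot(rows, column):
    """Replace one categorical column with 0/1 indicator columns named
    "<column>_<value>", one per distinct value."""
    values = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for value in values:
            new_row[f"{column}_{value}"] = 1 if row[column] == value else 0
        encoded.append(new_row)
    return encoded

# Hypothetical rows: "n" = no odor, "f" = foul odor.
rows = [{"odor": "n"}, {"odor": "f"}, {"odor": "n"}]
encoded = one_hot(rows, "odor")
print(encoded)
```

Under this reading, “odor_n” equal to 1 means the mushroom has no odor, which matches the conclusion that a mushroom with an odor is more likely to be inedible.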
We noticed that most of the classifiers perform well on the data set because it is a clean set. However, the KNN, ANN and SVM
classifiers outperformed the other classifiers. We then proposed a hybrid model that combines these top classifiers to improve
accuracy. The proposed approach shows better accuracy and consistent results.
The dataset we used is a good dataset that does not need much cleaning, and comparing classifiers on a clean data set may not be
a good way to compare them. Other features may need to be included in order to build and test any model.

FUTURE WORK
It is important to continue this work and apply it to other foods. We plan to use image-processing techniques to extract mushroom
features and apply machine learning to classify them. Deep learning can also be applied in the future.

REFERENCES

1. Nagulwar, M., D. More, and L. Mandhare, Nutritional properties and value addition of mushroom: a review. The
Pharma Innovation Journal, 2020. 9(10): p. 395-398.
2. Boa, E.R., Wild edible fungi: a global overview of their use and importance to people. 2004.
3. Kousalya, K., et al. Edible Mushroom Identification Using Machine Learning. in 2022 International Conference
on Computer Communication and Informatics (ICCCI). 2022. IEEE.
4. Ismail, S., A.R. Zainal, and A. Mustapha. Behavioural features for mushroom classification. in 2018 IEEE
Symposium on Computer Applications & Industrial Electronics (ISCAIE). 2018. IEEE.
5. Mahesh, B., Machine learning algorithms-a review. International Journal of Science and Research
(IJSR).[Internet], 2020. 9: p. 381-386.
6. Ottom, M.A., N.A. Alawad, and K. Nahar, Classification of mushroom fungi using machine learning techniques.
International Journal of Advanced Trends in Computer Science and Engineering, 2019. 8(5): p. 2378-2385.
7. Hamonangan, R., M.B. Saputro, and C.B.S.D.K. Atmaja, Accuracy of classification poisonous or edible of
mushroom using naïve bayes and k-nearest neighbors. Journal of Soft Computing Exploration, 2021. 2(1): p. 53-60.
8. Halili, F. and F. Kamberi, Performance analysis of classification Algorithms: A case study of Naïve Bayes and
J48 in Big Data. Applied Mathematics and Computation. 2(2): p. 50-57.
9. Al-Mejibli, I.S. and D.H. Abd, Mushroom Diagnosis Assistance System Based on Machine Learning by Using
Mobile Devices. Journal of Al-Qadisiyah for computer science and mathematics, 2017. 9(2): p. 103-113.
10. Erkan, Y.R. and H.K. Örnek, Mushroom species detection using image processing
techniques. International Journal of Engineering and Innovative Research, 2017. 1(2): p. 71-83.
11. Aleksandrova, Y. Predicting Students Performance in Moodle Platforms Using Machine Learning Algorithms. in
Conferences of the department Informatics. 2019. Publishing house Science and Economics Varna.
12. Alkronz, E.S., et al., Prediction of whether mushroom is edible or poisonous using back-propagation neural
network. 2019.
13. Maurya, P. and N.P. Singh. Mushroom classification using feature-based machine learning approach. in
Proceedings of 3rd International Conference on Computer Vision and Image Processing. 2020. Springer.
14. Chumuang, N., et al. Mushroom Classification by Physical Characteristics by Technique of k-Nearest Neighbor.
in 2020 15th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP). 2020. IEEE.
15. Verma, S. and M. Dutta, Mushroom classification using ANN and ANFIS algorithm. IOSR Journal of
Engineering (IOSRJEN), 2018. 8(01): p. 94-100.
16. Ortega, J.H.J.C., et al., Analysis of Performance of Classification Algorithms in Mushroom Poisonous Detection
using Confusion Matrix Analysis. International Journal, 2020. 9(1.3).
17. Dua, D. and C. Graff, UCI Machine Learning Repository. 2019, University of California, Irvine, School of
Information and Computer Sciences.
