Fariha Tabasum et al. investigated the relationship between salesmen's personality traits and sales performance in the FMCG sector. The study showed a positive and significant relationship between salesman personality, consumer perception and sales: customer perception and sales of a specific product or service can be enhanced by an attractive salesman personality [6].

Joy Eberechukwu Agodi, Emmanuel Onyedikachi Ahaiwe and Aniekan Eyo Awah found a strongly positive relationship between personality traits and sales performance. Successful salesmen can be described as empathetic, assertive and ambitious [7].

Anisha Yata et al. studied personality prediction of users by applying multi-label classifiers to textual data, namely the views and opinions that users post on social networks [8].

Gauthier Doquire and Michel Verleysen proposed the use of mutual information for feature selection in multi-label classification. Their results show the interest of the approach, which sharply reduces the dimensionality of the problem and enhances the performance of the classifiers [9].

In this study, we combine Multi-Label Classification (MLC), Stacking and Feature Selection to obtain a better result.
2.1 Multi-Label Classification
Multi-label classification is the task of assigning data points to a set of classes or categories that are not mutually exclusive, which means that a point can belong to several classes simultaneously.

Multi-label classification methods can be broadly categorized into two groups:

i) Problem Transformation methods

ii) Algorithm Adaptation methods

The first group contains algorithm-independent methods, since their functioning does not depend directly on the classification method used. These methods transform the original multi-label classification problem into one or more single-label classification, regression or ranking tasks.

The second group contains methods that extend specific learning algorithms, or develop new ones, to handle multi-label problems directly.

This paper addresses multi-label classification with Problem Transformation methods. The multi-label classifiers used in this study are Binary Relevance (BR), Label Powerset (LP) and Classifier Chains (CC), all of which are Problem Transformation methods.

The first two were selected because they are the most basic approaches to multi-label classification. BR treats the prediction of each label as an independent binary classification task and constructs a binary training set for each label, while LP treats each unique set of labels that occurs in the multi-label training set as one class of a new single-label classification task. CC transforms the problem into a hierarchical chain of binary classification problems, where each one is built upon the previous prediction [10][11][12].
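As an illustration, the three transformation methods can be sketched with the scikit-multilearn library. The synthetic data, sizes and base classifier below are placeholder assumptions, not the setup of this study:

```python
# A minimal sketch of BR, LP and CC using scikit-multilearn on synthetic
# data; the base classifier here is an illustrative placeholder.
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from skmultilearn.problem_transform import (
    BinaryRelevance, LabelPowerset, ClassifierChain)

X, Y = make_multilabel_classification(n_samples=200, n_features=10,
                                      n_classes=4, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3,
                                                    random_state=0)

methods = {
    "BR": BinaryRelevance(classifier=DecisionTreeClassifier()),  # one binary model per label
    "LP": LabelPowerset(classifier=DecisionTreeClassifier()),    # one class per unique label set
    "CC": ClassifierChain(classifier=DecisionTreeClassifier()),  # chain of binary models
}

for name, model in methods.items():
    model.fit(X_train, Y_train)     # Y_train is an (n_samples, 4) 0/1 matrix
    Y_pred = model.predict(X_test)  # sparse label-indicator matrix
    print(name, Y_pred.shape)
```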
2.2 Stacking
Stacking is an ensemble learning technique in which a number of base classifiers are combined by a meta-classifier that learns from their outputs. The individual classification models are trained on the complete training set; then, instead of the original input attributes, Stacking uses the classifications predicted by the base classifiers as the input attributes. The resulting meta-classifier combines the different predictions into a final prediction [13][14].
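A minimal stacking sketch with scikit-learn's StackingClassifier, shown on a synthetic single-label problem for clarity; the choice of base and meta classifiers below is illustrative, not the paper's exact setup:

```python
# Stacking: the meta-classifier is trained on the base classifiers'
# predictions, not on the original input attributes.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

base_classifiers = [
    ("rf", RandomForestClassifier(random_state=0)),
    ("nb", GaussianNB()),
    ("svm", SVC(random_state=0)),
]
stack = StackingClassifier(estimators=base_classifiers,
                           final_estimator=LogisticRegression())
stack.fit(X, y)
print(stack.predict(X[:5]))
```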
2.3 Feature Selection
Many classification problems rely on a large set of features. Feature selection methods mitigate this by reducing data dimensionality, removing irrelevant and redundant features. There are three types of feature selection methods: filter, wrapper and embedded methods [15].

Filter methods are the simplest and most scalable: they rank the features by a relevance measure and select the k highest-ranked features according to that measure.

Wrapper methods evaluate combinations of features and select the combination that yields the best result for a specific machine learning algorithm.

Embedded methods use algorithms that incorporate feature selection as part of the training process, which gives them the best ability to discriminate among classes.

In this study we applied Backward Elimination, a wrapper method of feature selection, to select the relevant attributes; a minimal sketch follows.
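One common way to implement backward elimination is scikit-learn's SequentialFeatureSelector; the estimator, data and stopping point below are assumptions, and the study's exact elimination criterion may differ:

```python
# Backward elimination sketch: starting from the full feature set, greedily
# drop the feature whose removal costs the least cross-validated score.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=12, random_state=0)

selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=6,      # stop once 6 features remain
    direction="backward",
)
selector.fit(X, y)
print(selector.get_support(indices=True))   # indices of the kept features
```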
3. PROPOSED METHOD
We propose a method that combines MLC, Stacking and Feature Selection in three steps, as follows.

Step 1: build the MLC models by applying each MLC algorithm to the training set.

Because each employee can exhibit several performance types at once, MLC is applied to the machine learning process. Figure 1 shows the structure of Step 1. In this step, BR, LP and CC were each run with six classification models (Logistic Regression, KNN, Naïve Bayes, SVM, Random Forest and Decision Tree) as the base classifier. We used a split-test validation (70% train, 30% test) on the training set. Finally, we obtain the result of each base classifier under each MLC algorithm; the loop is sketched after the figure.

Figure 1. Building MLC model
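A sketch of how Step 1 could look in code, assuming scikit-learn base classifiers and scikit-multilearn transformation methods; the synthetic data stands in for the DISC features (X) and the four job performance types (Y):

```python
# Step 1 sketch: 3 MLC methods x 6 base classifiers = 18 models,
# evaluated on a 70/30 split as in the paper.
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import hamming_loss
from skmultilearn.problem_transform import (
    BinaryRelevance, LabelPowerset, ClassifierChain)

X, Y = make_multilabel_classification(n_samples=500, n_features=20,
                                      n_classes=4, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.30,
                                                    random_state=0)

base_classifiers = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "NaiveBayes": GaussianNB(),
    "SVM": SVC(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
}
mlc_methods = {"BR": BinaryRelevance, "LP": LabelPowerset,
               "CC": ClassifierChain}

for mlc_name, Mlc in mlc_methods.items():
    for clf_name, clf in base_classifiers.items():
        model = Mlc(classifier=clf)
        model.fit(X_train, Y_train)
        Y_pred = model.predict(X_test).toarray()
        print(mlc_name, clf_name, hamming_loss(Y_test, Y_pred))
```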
Step 2: take the process from the first step and apply a stacking model.

Nanak Chand et al. (2016) measured and compared the performance of SVM and of its stacking with nine other machine learning algorithms, and found that the stacking of SVM provided better performance [16].
Nazlia Omar et al. (2013) carried out a comparative study of the effectiveness of individual supervised classifiers and ensemble methods for subjectivity and sentiment analysis of Arabic customers' reviews. The results showed that an ensemble of classification algorithms with a meta-learner performed robustly better than every individual classifier [17].

Step 3: add feature selection and repeat the stacking process.

In order to find the relevant attributes and achieve better accuracy, we applied the Backward Elimination method of feature selection in the first process. However, the attributes selected by the feature selection process differed from label to label, so we joined (union) the selected attributes of all labels and then repeated the same process as in Step 2 (Figure 3). The per-label selection and union are sketched below.
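A sketch of the Step 3 attribute-union idea under stated assumptions: backward elimination (here scikit-learn's SequentialFeatureSelector) is run once per label, and the union of the selected feature indices is kept before repeating the stacking step. The data, estimator and number of features per label are illustrative:

```python
# Per-label backward elimination, then union of the selected features.
from sklearn.datasets import make_multilabel_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, Y = make_multilabel_classification(n_samples=300, n_features=15,
                                      n_classes=4, random_state=0)

selected = set()
for label in range(Y.shape[1]):
    sfs = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                    n_features_to_select=5,
                                    direction="backward")
    sfs.fit(X, Y[:, label])                    # select for this label only
    selected |= set(sfs.get_support(indices=True))

X_reduced = X[:, sorted(selected)]             # union of per-label selections
print("kept feature indices:", sorted(selected))
```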
4. EXPERIMENT
4.1 Dataset Collection
This study collected DISC personality test data from 2,317 respondents, comprising Manager level (564 respondents) and Operation level (1,753 respondents), together with the past three years of job performance records, which consist of four job performance types.
The classification performance was evaluated using five measures: Hamming Loss, Accuracy, Precision, Recall and F-Measure [19][20]. In the following, $m$ is the number of instances, $Y_i$ and $Z_i$ are the true and predicted label sets of instance $i$, $M$ is the set of all labels, and $\Delta$ denotes the symmetric difference.

Hamming Loss: the fraction of labels that are incorrectly predicted. A low Hamming Loss indicates better classification performance.

$$\text{Hamming Loss} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \,\Delta\, Z_i|}{|M|}$$

Accuracy: for each instance, the proportion of correctly predicted labels to the total number (predicted and actual) of labels for that instance, averaged over all instances.

$$\text{Accuracy} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \cap Z_i|}{|Y_i \cup Z_i|}$$

Precision: the percentage of true positive examples among all examples classified as positive by the classification model.

$$\text{Precision} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \cap Z_i|}{|Z_i|}$$

Recall: the percentage of the truly positive examples that are classified as positive by the classification model.

$$\text{Recall} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \cap Z_i|}{|Y_i|}$$

F-Measure: a combination of Precision and Recall; it is the harmonic mean of the two metrics and is used as an aggregated performance score.

$$\text{F-Measure} = \frac{1}{m}\sum_{i=1}^{m}\frac{2\,|Y_i \cap Z_i|}{|Y_i| + |Z_i|}$$
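These five measures can be computed directly from the formulas above; below is a small self-contained sketch, assuming 0/1 indicator arrays and that no instance has an empty true or predicted label set (so no denominator is zero):

```python
# Example-based multi-label metrics; Y_true rows are Y_i, Y_pred rows are Z_i.
import numpy as np

def multilabel_metrics(Y_true, Y_pred):
    Y_true = np.asarray(Y_true, dtype=bool)
    Y_pred = np.asarray(Y_pred, dtype=bool)
    inter = (Y_true & Y_pred).sum(axis=1)     # |Y_i intersect Z_i|
    union = (Y_true | Y_pred).sum(axis=1)     # |Y_i union Z_i|
    sym_diff = (Y_true ^ Y_pred).sum(axis=1)  # |Y_i symmetric-diff Z_i|
    n_labels = Y_true.shape[1]                # |M|
    return {
        "Hamming Loss": np.mean(sym_diff / n_labels),
        "Accuracy": np.mean(inter / union),
        "Precision": np.mean(inter / Y_pred.sum(axis=1)),
        "Recall": np.mean(inter / Y_true.sum(axis=1)),
        "F-Measure": np.mean(2 * inter /
                             (Y_true.sum(axis=1) + Y_pred.sum(axis=1))),
    }

print(multilabel_metrics([[1, 0, 1], [0, 1, 1]], [[1, 0, 0], [0, 1, 1]]))
```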
[Chart: comparison of evaluation measures (HL, A, F, Pre, Re), scale 0.000 to 1.000]

[Chart: comparison of evaluation measures (HL, A, F, Pre, Re), scale 0.000 to 1.000]
Table 3. Evaluation Measures Comparison of Step 3
[Chart: comparison of evaluation measures for Step 3 (HL, A, F, Pre, Re), scale 0.000 to 1.000]
[10] …Classification in Different Application Domains. In International Journal of Computer Applications, Volume 114.
[11] Santos, A. M., Canuto, A. M. P. and Neto, A. F. 2011. A Comparative Analysis of Classification Methods to Multi-label Tasks in Different Application Domains. In International Journal of Computer Information Systems and Industrial Management Applications, Volume 3.
[12] Ganda, D. and Buch, R. 2018. A Survey on Multi Label Classification. In Recent Trends in Programming Languages, Volume 5, Issue 1.
[13] Menahem, E., Rokach, L. and Elovici, Y. 2009. An Improved Stacking Schema for Classification Tasks. Department of Information Systems Engineering, Ben-Gurion University and Deutsche Telekom Laboratories at Ben-Gurion University, Be'er Sheva 84105, Israel.
[14] Wolpert, D. H. 1992. Stacked Generalization. Neural Networks, Volume 5, Issue 2, Pages 241-259.
[15] Spolaôr, N., Cherman, E. A., Monard, M. C. and Lee, H. D. 2013. A Comparison of Multi-label Feature Selection Methods Using the Problem Transformation Approach. In Electronic Notes in Theoretical Computer Science 292:135-151.
[16] Nanak, C., Preeti, M. C., Rama, K., Emmanuel, S. P. and Mahesh, C. G. 2016. A Comparative Analysis of SVM and its Stacking with Other Classification Algorithms for Intrusion Detection. International Conference on Advances in Computing, Communication, & Automation (ICACCA).
[17] Nazlia, O., Mohammed, A., Adel, Q. A. and Tareq, A. 2013. Ensemble of Classification Algorithms for Subjectivity and Sentiment Analysis of Arabic Customers' Reviews. In International Journal of Advancements in Computing Technology.
[18] Erik, M. S., Douglas, T. and Youngmoo, E. K. 2010. Feature Selection for Content-Based, Time-Varying Musical Emotion Regression. In Multimedia Information Retrieval.
[19] Asim, M. N., Rehman, A. and Shoaib, U. 2017. Accuracy Based Feature Ranking Metric for Multi-Label Text Classification. In International Journal of Advanced Computer Science and Applications, Vol. 8, No. 10.
[20] Santos, A. M., Canuto, A. M. P. and Neto, A. F. 2011. A Comparative Analysis of Classification Methods to Multi-label Tasks in Different Application Domains. In International Journal of Computer Information Systems and Industrial Management Applications, Vol. 3, pp. 218-227.