
Multi-Label Classification of Employee Job Performance Prediction by DISC Personality


Patompat Kamtar, Duangjai Jitkongchuen, Eakasit Pacharawongsakda
Big Data Engineering Program, College of Innovative Technology and Engineering,
Dhurakij Pundit University
110/1-4 Pracha Chuen Rd, Khwaeng Thung Song Hong, Khet Lak Si, Bangkok 10210, Thailand
605162020019@dpu.ac.th, duangjai.jit@dpu.ac.th, eakasit.pac@dpu.ac.th

ABSTRACT
The objective of this study was to automate job performance prediction based on the DISC personality test. We transformed this problem into Multi-Label Classification (MLC) by using employees' job performances as labels. Three widely used MLC techniques were employed for the prediction of job performance: Binary Relevance (BR), Label Powerset (LP) and Classifier Chains (CC). However, these traditional techniques did not show promising results. Therefore, we proposed another approach: building a stacking MLC with model selection. The proposed method has three steps: (1) building the MLC model; (2) taking the process from the first step and applying it with a stacking model; and (3) utilizing a feature selection technique to select the proper models for the final prediction. Using surveys from a large financial company in Thailand, we found that the last proposed approach performs better than the traditional MLC.

CCS Concepts
•Information systems ➝ Data management systems ➝ Database design and models ➝ Data model extensions
•Computing methodologies ➝ Modeling and simulation ➝ Simulation types and techniques ➝ Massively parallel and high-performance simulations

Keywords
Multi-label Classification; Stacking; Personality; DISC personality test; Psychology

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
ICCBD 2019, October 18–20, 2019, TAICHUNG, Taiwan
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-7290-9/19/10…$15.00
https://doi.org/10.1145/3366650.3366666

1. INTRODUCTION
The relationship between personality and job performance has been studied extensively in industrial-organizational psychology, and many studies have found that personality is one of the most important factors affecting an employee's job performance [1]. Job performance generally relates to the positive things that people do to succeed at work, including task performance, discretionary behaviors, and future-focused and improvement behaviors [2].

However, not every personality is suited to every job position, so it is important to recognize personality traits and to pair employees with the jobs that fit their personalities best. This can lead to increased productivity and job satisfaction, helping the business function more efficiently.

To put the right person in the right job, understanding which personality suits which job position is crucial.

Previous research on personality and job performance was conducted with only a single performance label, but in fact each employee can exhibit several kinds of performance at the same time. A single-label study cannot be applied to such a complicated problem [3].

The classification of personality consists of comparing a user's personality against the standard personality tests taken. DISC is one of the most popular standardized personality tests. It holds that the psychology and personality of normal people can be identified and observed, and it divides personality into four categories: Dominance, Influence, Steadiness and Compliance [4].

Section 2 reviews the related work for this study. Section 3 presents the proposed method. Section 4 describes the process flow. Sections 5 and 6 present the results and the conclusions drawn from this study.

2. RELATED WORKS
Kim Yun-Yong et al. studied the effects of the DISC behavior style of office workers on job satisfaction, organizational commitment and job performance. The study compared the differences between groups of DISC types and found that Dominance people were significantly more often recognized as having good job performance than Steadiness people [5].

Fariha Tabasum et al. investigated the relationship between salesmen's personality traits and sales performance for FMCG. The study showed a positive and significant relationship between salesman personality and both customer perception and sales. The results show that customer perception and sales of a specific product or service can be enhanced by an attractive salesman personality [6].

Joy Eberechukwu Agodi, Emmanuel Onyedikachi Ahaiwe and Aniekan Eyo Awah found a strongly positive relationship between personality traits and sales performance. A successful salesman can be described as empathetic, assertive and ambitious [7].

Anisha Yata et al. studied personality prediction of users by applying multi-label classifiers to textual data, namely the views and opinions that users posted on social networks [8].

Gauthier Doquire and Michel Verleysen proposed the use of mutual information for feature selection in multi-label classification. The results show the interest of the approach, which allows one to sharply reduce the dimension of the problem and to enhance the performance of classifiers [9].

In this study, we combined Multi-Label Classification, Stacking and Feature Selection to obtain a better result.

2.1 Multi-Label Classification
Multi-label classification is the task of assigning data points to a set of classes or categories that are not mutually exclusive, which means that a point can belong to several classes simultaneously.

Multi-label classification methods can be broadly categorized into two groups:
i) Problem Transformation methods
ii) Algorithm Adaptation methods

The first group contains algorithm-independent methods, since their functioning does not depend directly on the classification method used. These methods transform the original multi-label classification problem into one or more single-label classification, regression or ranking tasks.

The second group contains methods that extend specific learning algorithms or develop new algorithms in order to handle multi-label problems directly.

This paper aims to solve multi-label classification with the Problem Transformation approach. The multi-label classifiers used in this study are Binary Relevance (BR), Label Powerset (LP) and Classifier Chains (CC), which are all Problem Transformation methods.

The first two approaches were selected because they are the most basic approaches for multi-label classification tasks. BR treats the prediction of each label as an independent binary classification task and constructs a binary training set for each label, while LP treats each unique set of labels that exists in the multi-label training set as one class of a new single-label classification task. CC transforms the problem into a hierarchical chain of binary classification problems, where each one is built upon the previous predictions [10] [11] [12].

2.2 STACKING
Stacking is an ensemble learning technique in which a number of base classifiers are combined using one meta-classifier that learns from their outputs. The individual classification models are trained on the complete training set. Then, instead of using the original input attributes, Stacking uses the classifications predicted by the base classifiers as the input attributes. The resulting meta-classifier combines the different predictions into a final prediction [13][14].

2.3 FEATURE SELECTION
Many classification problems rely on a large set of features. Feature selection methods can effectively reduce data dimensionality by removing irrelevant and redundant features. There are three types of feature selection methods: filter, wrapper, and embedded methods [15].

Filter methods are the simplest and most scalable. They rank the features by a relevance measure and select the k highest-ranked features according to that measure.

Wrapper methods evaluate combinations of the features and select the combination that yields the best result for a specific machine learning algorithm.

Embedded methods use specific algorithms that incorporate feature selection as part of the training process. This type of method has the best ability to discriminate among classes.

In this study we applied Backward Elimination, a wrapper method of feature selection, to select the relevant attributes.

3. PROPOSED METHOD
We propose a method that combines MLC, Stacking and Feature Selection. It consists of the following three steps.

Step 1: building the MLC model by applying each algorithm with MLC to the training set.

Because each employee can exhibit several kinds of performance, MLC was applied in the machine learning process. Figure 1 shows the structure of Step 1. In this step, BR, LP and CC were run using six classification models (Logistic Regression, KNN, Naïve Bayes, SVM, Random Forest and Decision Tree) as the base classifier. We used split-test validation (70% train, 30% test) on the training set. Finally, we obtained the result of each base classifier under each MLC algorithm.

Figure 1. Building MLC model
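The three problem transformations used in Step 1 differ only in how they reshape the label matrix before a base classifier is trained. The following is a minimal sketch of those reshaping steps in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`binary_relevance`, `label_powerset`, `classifier_chain`) and the toy data are hypothetical, and no base classifier is trained here.

```python
# Illustrative sketch only: the toy data and helper names are assumptions,
# not taken from the paper's dataset or code.


def binary_relevance(X, Y):
    """BR: build one independent binary training set per label."""
    n_labels = len(Y[0])
    return [([list(x) for x in X], [row[j] for row in Y])
            for j in range(n_labels)]


def label_powerset(Y):
    """LP: treat every distinct label combination as one class."""
    class_of = {}   # label combination -> class id
    targets = []
    for row in Y:
        key = tuple(row)
        if key not in class_of:
            class_of[key] = len(class_of)
        targets.append(class_of[key])
    return targets, class_of


def classifier_chain(X, Y):
    """CC: the training set for label j adds labels 0..j-1 as features.

    At prediction time the true labels are unknown, so the predictions
    of the earlier links in the chain would be appended instead.
    """
    n_labels = len(Y[0])
    chain = []
    for j in range(n_labels):
        augmented = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
        chain.append((augmented, [y[j] for y in Y]))
    return chain


# Four toy employees, two features, three binary performance labels.
X = [[0.2, 0.7], [0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
Y = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0]]

br_sets = binary_relevance(X, Y)
lp_targets, lp_classes = label_powerset(Y)
cc_sets = classifier_chain(X, Y)

print(len(br_sets), "binary problems for BR")
print("LP targets:", lp_targets)
print("CC features for the last label:", cc_sets[2][0][0])
```

Each returned training set can then be passed to any of the six base classifiers. Note that LP can only predict label combinations that occur in the training data, whereas BR and CC can produce unseen combinations.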

Step 2: using the process from the first step and applying it with a stacking model.

Nanak Chand et al. (2016) measured and compared the performance of SVM and its stacking with nine other machine learning algorithms, and found that stacking with SVM provided better performance [16].

Nazlia Omar et al. (2013) carried out a comparative study of the effectiveness of individual supervised classifiers and ensemble methods for subjectivity and sentiment analysis of Arabic customers' reviews. The results showed that an ensemble of classification algorithms with a meta-learner performed robustly better than all the individual classifiers [17].

In this study, we applied a Stacking model to the previous step.

As discussed in the reviews above, Stacking combines multiple classifiers generated by different machine learning algorithms. In this step, we used the three best-performing algorithms as base classifiers (classification models). Each base classifier predicts results from the training set, and a new dataset is created by turning each base classifier's results into attributes. Six classification models (Logistic Regression, KNN, Naïve Bayes, SVM, Random Forest and Decision Tree) were selected as meta-classifiers and used to predict the final result in combination with BR, LP and CC (Figure 2).

Figure 2. Using the process from the first step to build a stacking model.

Step 3: we added a feature selection technique to the Step 2 process in order to select the proper models for the final prediction.

The questionnaire has 144 attributes, and such a high number of attributes used to train the models may have a large influence on performance.

Erik M. Schmidt et al. (2010) found that Stacking with a feature selection technique provided the best performance across the classification and regression tasks for automated systems that recognize the emotional content of music [18].

In order to find the relevant attributes and achieve better accuracy, we first applied Backward Elimination, one of the feature selection methods. However, the attributes selected by the feature selection process differed for each label. We therefore took the union of the attributes selected for all labels and then repeated the same process as in Step 2 (Figure 3).

Figure 3. Utilizing feature selection technique to select the proper models for final prediction

4. EXPERIMENT
4.1 Dataset Collection
For data collection, this study gathered DISC personality test data from 2,317 respondents at Manager level (564 respondents) and Operation level (1,753 respondents). It also collected the past three years of job performance records, which consist of four job performance types.

4.2 Dataset Preparation
In the data preparation process, some data were eliminated: records with a wrong employee ID, duplicated records (one ID that took the test more than once) and data of employees who had been working for less than one year. In total, 1,888 respondents remained (406 managers, 1,482 operations). The personality test data were then joined with the data from the database.

Next, we transformed the data into binary form. For job performance, the highest 2.5% and lowest 2.5% of values were eliminated as outliers, and the remaining values were binarized using the average value of each performance as the cut point (a value higher than the average becomes 1, and a value lower than the average becomes 0).

4.3 Evaluation
We performed these experiments in Python version 3.7 using the DISC and job performance dataset. This study used five measures: Hamming Loss, Accuracy, Precision, Recall and F-Measure [19][20]. In the formulas below, m is the number of instances, M is the number of labels, Y_i is the set of true labels of instance i and Z_i is the set of labels predicted for instance i.

Hamming Loss: the fraction of labels that are incorrectly predicted. A low value of Hamming Loss indicates better classification performance.

$$\text{Hamming Loss} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \,\Delta\, Z_i|}{M}$$

Accuracy: for each instance, the proportion of correctly predicted labels to the total number (predicted and actual) of labels for that instance.

$$\text{Accuracy} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \cap Z_i|}{|Y_i \cup Z_i|}$$

Precision: the percentage of true positive examples among all the examples classified as positive by the classification model.

$$\text{Precision} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \cap Z_i|}{|Z_i|}$$

Recall: the percentage of actual positive examples that are classified as positive by the classification model.

$$\text{Recall} = \frac{1}{m}\sum_{i=1}^{m}\frac{|Y_i \cap Z_i|}{|Y_i|}$$

F-Measure: a combination of Precision and Recall. It is the harmonic mean of the two metrics and is used as an aggregated performance score.

$$\text{F-Measure} = \frac{1}{m}\sum_{i=1}^{m}\frac{2\,|Y_i \cap Z_i|}{|Y_i| + |Z_i|}$$
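The five example-based measures can be computed directly from per-instance label sets. The sketch below is a minimal illustration rather than the code used in the experiments: the function name `example_based_metrics`, the toy label sets and the conventions chosen for empty label sets are all assumptions made here.

```python
# Illustrative sketch only: names, toy data and empty-set conventions
# are assumptions, not part of the paper.


def example_based_metrics(Y, Z, n_labels):
    """Average the five measures over instances.

    Y and Z are lists of sets holding the true and predicted label
    indices of each instance; n_labels is the total label count M.
    """
    m = len(Y)
    hl = acc = prec = rec = f = 0.0
    for y, z in zip(Y, Z):
        inter = len(y & z)
        union = len(y | z)
        hl += len(y ^ z) / n_labels          # symmetric difference
        # Assumed conventions when a denominator would be zero:
        acc += inter / union if union else 1.0
        prec += inter / len(z) if z else 0.0
        rec += inter / len(y) if y else 0.0
        denom = len(y) + len(z)
        f += 2 * inter / denom if denom else 1.0
    return {
        "hamming_loss": hl / m,
        "accuracy": acc / m,
        "precision": prec / m,
        "recall": rec / m,
        "f_measure": f / m,
    }


# Three toy instances over four labels (indices 0..3).
Y_true = [{0, 2}, {1}, {0, 1, 3}]
Z_pred = [{0, 2}, {1, 2}, {0}]

scores = example_based_metrics(Y_true, Z_pred, n_labels=4)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For these toy sets the second instance has one extra predicted label and the third misses two true labels, so the Hamming Loss is (0 + 1/4 + 2/4) / 3 = 0.25.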

Table 1. Evaluation Measures Comparison of Step 1
[Chart: HL, A, F, Pre and Re on a scale of 0.000 to 1.000 for BR, CC and LP, each paired with LR, KNN, NB, SVM, RF and DT.]

Table 2. Evaluation Measures Comparison of Step 2
[Chart: HL, A, F, Pre and Re on a scale of 0.000 to 1.000 for BR, CC and LP, each paired with LR, KNN, NB, SVM, RF and DT.]

Table 3. Evaluation Measures Comparison of Step 3
[Chart: HL, A, F, Pre and Re on a scale of 0.000 to 1.000 for BR, CC and LP, each paired with LR, KNN, NB, SVM, RF and DT.]

5. RESULTS
Tables 1-3 show the predictive performance on all evaluation measures for each pairing of base classifier and MLC algorithm in each step. Here, HL, A, F, Pre and Re represent Hamming Loss, Accuracy, F-Measure, Precision and Recall, respectively.

From Tables 1-3, we can conclude that Step 3, which applies the feature selection technique to MLC with Stacking, outperforms the other steps on most of the measures. Moreover, the Decision Tree base classifier gives the best overall performance (lowest Hamming Loss and highest Accuracy) when used with the Label Powerset MLC algorithm.

6. CONCLUSIONS
We proposed multi-label classification for predicting employee job performance from DISC personality. To improve MLC performance, we focused on Step 3, which combines the feature selection process with MLC and Stacking. The three best-performing algorithms (Logistic Regression, SVM and Decision Tree) were used as base classifiers with six meta-classifier algorithms and three MLC algorithms (BR, LP and CC). The results of Step 3 revealed that this step achieves better performance than traditional MLC or MLC with Stacking alone.

7. REFERENCES
[1] Cascio, F. W. and Aguinis, H. 2008. Research in Industrial and Organizational Psychology From 1963 to 2007: Changes, Choices, and Trends. In Journal of Applied Psychology, Vol. 93, No. 5, 1062–1081.
[2] Spitzmuller, M., Dyne, L. V. and Ilie, R. 2008. Organizational Citizenship Behavior: A Review and Extension of its Nomological Network.
[3] Waheed, A., Yang, J. and Webber, J. 2018. The Effect of Personality Traits on Sales Performance: An Empirical Investigation to Test the Five-Factor Model (FFM) in Pakistan. In Interdisciplinary Journal of Information, Knowledge, and Management, Vol. 12.
[4] Weiming, G. 2011. Study on the Application of DISC Behavioral Style in Talent Management in Banking Industry. Proceedings of the 8th International Conference on Innovation & Management.
[5] Yong, K. Y., Hwa, B. Y., Hyun, P. H., Hyang, Y. J., and Su, J. E. 2012. The Effects of DISC Behavior Styles of Office Workers on Job Satisfaction, Organizational Commitment and Job Performance. Korean J Occup Health Nurs. 2012 Aug;21(2):98-107.
[6] Tabasum, F., Ibrahim, M., Rabbani, M. and Asif, M. 2014. Impact of Salesmen Personality on Customer Perception and Sales. Global Journal of Management and Business Research: E Marketing.
[7] Agodi, J. E., Ahaiwe, E. O. and Awah, A. E. 2017. Salesman's Personality Trait and Its Effect on Sales Performance: Study of Fast Moving Consumer Goods (FMCG) in Abia State, Nigeria. In Journal of Economics and Sustainable Development, Vol. 8, No. 24.
[8] Yata, A., Kante, P., Sravani, T. and Malathi, B. 2018. Personality Recognition using Multi-Label Classification. International Research Journal of Engineering and Technology (IRJET), Volume 05, Issue 03.
[9] Doquire, G. and Verleysen, M. 2011. Feature Selection for Multi-label Classification Problems. Université catholique de Louvain.
[10] Kafrawy, P. E., Mausad, A. and Esmail, H. 2015. Experimental Comparison of Methods for Multi-Label Classification in Different Application Domains. In International Journal of Computer Applications, Volume 114.
[11] Santos, A. M., Canuto, A. M. P. and Neto, A. F. 2011. A Comparative Analysis of Classification Methods to Multi-label Tasks in Different Application Domains. In International Journal of Computer Information Systems and Industrial Management Applications, Volume 3.
[12] Ganda, D. and Buch, R. 2018. A Survey on Multi Label Classification. In Recent Trends in Programming Languages, Volume 5, Issue 1.
[13] Menahem, E., Rokach, L. and Elovici, Y. 2009. An improved stacking schema for classification tasks. Department of Information Systems Engineering, Ben-Gurion University and Deutsche Telekom Laboratories at Ben-Gurion University, Be'er Sheva 84105, Israel.
[14] Wolpert, D. H. 1992. Stacked generalization. Neural Networks, Volume 5, Issue 2, Pages 241-259.
[15] Spolaôr, N., Cherman, E. A., Monard, M. C. and Lee, H. D. 2013. A Comparison of Multi-label Feature Selection Methods using the Problem Transformation Approach. In Electronic Notes in Theoretical Computer Science 292:135–151.
[16] Nanak, C., Preeti, M., C., Rama, K., Emmanuel, S. P. and Mahesh, C. G. 2016. A Comparative Analysis of SVM and its Stacking with other Classification Algorithm for Intrusion Detection. International Conference on Advances in Computing, Communication, & Automation (ICACCA).
[17] Nazlia, O., Mohammed, A., Adel, Q. A., Tareq, A. 2013. Ensemble of Classification Algorithms for Subjectivity and Sentiment Analysis of Arabic Customers' Reviews. In International Journal of Advancements in Computing Technology.
[18] Erik, M. S., Douglas, T. and Youngmoo, E. K. 2010. Feature Selection for Content-Based, Time-Varying Musical Emotion Regression. Published in Multimedia Information Retrieval.
[19] Asim, M. N., Rehman, A. and Shoaib, U. 2017. Accuracy Based Feature Ranking Metric for Multi-Label Text Classification. In International Journal of Advanced Computer Science and Applications, Vol. 8, No. 10.
[20] Santos, A. M., Canuto, A. M. P. and Neto, A. F. 2011. A Comparative Analysis of Classification Methods to Multi-label Tasks in Different Application Domains. In International Journal of Computer Information Systems and Industrial Management Applications, Vol. 3, pp. 218-227.
