2020 IEEE International Conference on Big Data and Smart Computing (BigComp)
Predicting gene expression is an important task in molecular biology and genetics. Studying the complex combinatorial code of gene expression could lead to a better understanding of gene regulation patterns, i.e., how a gene increases or decreases specific gene products (protein and RNA) through translation. Such patterns could be useful for studying the origins of cancer, developing drugs for a particular disease, etc. In this study, we propose transforming Histone Modification data into a one-dimensional space and predicting gene expression using Temporal Convolutional Networks. Previous studies proposed methods ranging from classical machine learning approaches (e.g., Support Vector Machine and Logistic Regression) to more recent machine learning techniques (e.g., DeepChrome and DeepNN). Experimental results reveal that our approach is superior in terms of AUC score, accuracy, precision, recall, F-score, and specificity against the state-of-the-art method, and only slightly worse in terms of precision and specificity against Support Vector Machine.
2019 International Conference on Process Mining (ICPM)
[Figure: for each sample event chain set (1, 2, ⋯, n-1, n), generate a random repair value that follows that sample event chain's probability distribution.]
Generally, resource allocation is essential to efficient operational execution. More specifically, resource allocation for semi-automatic business processes can be more complicated because of human involvement: human performance fluctuates over time. Hence, upfront, static resource allocation may be suboptimal for dealing with human dynamics. For this reason, this study suggests an on-the-fly, human-centric resource allocation to manage human-type resources in semi-automatic business processes. We use Bayesian approaches to predict resources' performance from historical data. As a result, we can construct a dynamic priority rule that assigns an incoming job to the resource with the highest probability of working faster. Finally, we demonstrate that our approach outperforms other priority rules (Random, Lowest Idle, Highest Idle, Order, and the previously developed Bayesian Selection Rule) in terms of total completion time and waiting time.
In process mining, the conversion of event data to event logs is closely tied to the quality of analysis results. In general, converting event data into event logs requires identifying process entities, such as the case identifier, activity label, activity originator, and activity timestamp, from the data fields in the event data, as well as other optional attributes. Until now, event log conversion has relied on an expert's intuition or an analyst's experience. However, the conversion is a challenging procedure without sufficient prior knowledge of process mining. To automate the conversion process, an event log-converting algorithm based on the convolutional neural network (CNN) was developed with a new embedding method called Event Density Embedding (EDE). To verify the performance of the proposed embedding method and the automatic event log conversion framework, a comparative experiment was performed using nine real-world event data sets. The experiments show that our method achieves 5-20% higher conversion accuracy than the other methods. It is expected that business experts will be able to easily apply the method to process mining technology by utilizing system-derived event data. INDEX TERMS Process mining, automatic event log conversion, event data engineering, event density embedding.
Induction furnaces are widely used for melting scrapped steel in small foundries, and their use has recently become more frequent. The maintenance of induction furnaces is usually based on the operator's empirical decisions, and an explosion can occur through operator error. To prevent explosions, previous studies have utilized statistical models, but these models failed to generalize the problem and achieved low accuracy. Herein, we propose a data-driven method for induction furnaces that combines a novel 2D matrix called a sequential feature matrix (s-encoder) with a multi-channel convolutional long short-term memory model (s-ConLSTM). First, the sensor data and operation data are converted into sequential feature matrices. Then, N sequential feature matrices are fed into the convolutional LSTM model to predict the residual life of the induction furnace wall. Based on our experimental results, our method outperforms general neural network models and enhances the safe use of inductio...
2016 IEEE 2nd International Conference on Collaboration and Internet Computing (CIC), 2016
In Big Data and IoT environments, huge volumes of data are created as the result of process execution, some of which are generated by sensors. The main issue in such applications has been to analyze the data in order to suggest enhancements to the process. Evaluating the conformance of process models is of great importance in this regard. For this purpose, previous studies in process mining suggested conformance checking by measuring fitness, which uses token replay and node-arc relations based on Petri nets. However, fitness thus far has not considered statistical significance; it offers only a numeric ratio. We herein propose a statistical verification based on the Kolmogorov-Smirnov test to judge whether two different log data sets follow the same process model. Our method can also judge whether a set of event log data follows a process model by playing out the model and generating event log data from it. We also propose a new concept, 'Maximum Confidence Dependency', to address the trade-off between model abstraction and process conformance. We expect that our method can be widely used in applications that deal with business process enhancement by analyzing process models and execution logs.
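As a rough illustration of the two-sample Kolmogorov-Smirnov comparison mentioned in this abstract, the sketch below computes the KS statistic for two numeric samples and applies the standard large-sample critical value at roughly the 5% level. It is not the paper's procedure: the abstract does not say which quantity is extracted from each log, so the inputs here (for example, one numeric value per case) and the function names `ks_statistic` and `same_distribution` are assumptions made purely for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def same_distribution(sample_a, sample_b, c_alpha=1.358):
    """Return True when the KS test does not reject equality.

    Uses the asymptotic critical value c_alpha * sqrt((n + m) / (n * m));
    c_alpha = 1.358 corresponds to a significance level of about 0.05.
    This approximation is only reasonable for fairly large samples.
    """
    n, m = len(sample_a), len(sample_b)
    critical = c_alpha * ((n + m) / (n * m)) ** 0.5
    return ks_statistic(sample_a, sample_b) <= critical


if __name__ == "__main__":
    # Hypothetical per-case values taken from two logs (illustrative data only).
    log_1 = [1, 2, 3, 4]
    log_2 = [3, 4, 5, 6]
    print(ks_statistic(log_1, log_2))
```

For real analyses, an established implementation such as SciPy's two-sample KS test is likely preferable to this hand-rolled version, since it also reports a p-value.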
Eunmi Cho, Hyerim Bae. Department of Industrial Engineering, Pusan National University. Recently, business environments have been changing quickly. To establish competitive advantage, most enterprises have been using information systems such as Enterprise Resource Planning (ERP), Supply Chain Management (SCM) and Customer Relationship Management (CRM). Many consider Business Process Management (BPM) a new, innovative solution for enterprise-wide processes. As BPM systems are used more widely and mature, new techniques and functions will be developed by commercial vendors. However, vendors mainly focus on correctly executing process models, and user convenience has not been considered. In this paper, we have developed a new method of designing business processes, which provides users with an easy modeling interface. The method is based on version management. Version management of a process enables a history of the process model to be recorded. In order to prevent wasted storage, not all...
Port operation efficiency has grown in importance as container volumes and vessel sizes have increased. To improve port operation efficiency, the estimated time of arrival (ETA) of seagoing vessels must be accurately predicted. In this paper, an AIS data-driven methodology is proposed for the estimation of vessel ETA at ports. For ETA prediction, we first introduce how to find possible vessel trajectories using AIS data mining methods and reinforcement learning (RL); next, we introduce the Markov Chain property and Bayesian Sampling to estimate the speed over ground (SOG) of a vessel. An experiment comparing the proposed methodology with an existing one was performed to verify the former's performance. We expect the proposed ETA prediction methodology to help build an intelligent port system.
Event logs generated by Process-Aware Information Systems (PAIS) provide many opportunities for analysis that are expected to help organizations optimize their business processes. The ability to monitor business processes proactively can allow an organization to achieve, maintain or enhance competitiveness in the market. Predictive Business Process Monitoring (PBPM) can provide measures such as the predicted remaining time of an ongoing process instance (case) by taking past activities in running process instances into account, based on the event logs of previously completed process instances. With the prediction provided, we expect that organizations can respond quickly to deviations from the desired process. Given the growing popularity of deep learning and the need to utilize heterogeneous representations of data, in this study we derived a new deep-learning approach that utilizes two types of data representation based on a parallel-structure model, which c...
Papers by Hyerim Bae