Article history: Received 13 November 2017; Received in revised form 1 January 2018; Accepted 2 January 2018; Available online xxx

Keywords: Smart manufacturing; Deep learning; Computational intelligence; Data analytics

Abstract: Smart manufacturing refers to using advanced data analytics to complement physical science for improving system performance and decision making. With the widespread deployment of sensors and the Internet of Things, there is an increasing need for handling big manufacturing data characterized by high volume, high velocity, and high variety. Deep learning provides advanced analytics tools for processing and analysing big manufacturing data. This paper presents a comprehensive survey of commonly used deep learning algorithms and discusses their applications toward making manufacturing "smart". The evolution of deep learning technologies and their advantages over traditional machine learning are first discussed. Subsequently, computational methods based on deep learning that aim to improve system performance in manufacturing are presented. Several representative deep learning models are comparatively discussed. Finally, emerging topics of research on deep learning are highlighted, and future trends and challenges associated with deep learning for smart manufacturing are summarized.

© 2018 Published by Elsevier Ltd on behalf of The Society of Manufacturing Engineers.
https://doi.org/10.1016/j.jmsy.2018.01.003
Please cite this article in press as: Wang J, et al. Deep learning for smart manufacturing: Methods and applications. J Manuf Syst (2018),
https://doi.org/10.1016/j.jmsy.2018.01.003
data mining techniques are classified into five categories, including characterization and description, association, classification, prediction, clustering and evolution analysis. The barriers to data-driven decision making in manufacturing are also identified. Typical machine learning techniques are reviewed in [15,16] for intelligent manufacturing, and their strengths and weaknesses are also discussed across a wide range of manufacturing applications. A comparative study of machine learning algorithms including Artificial Neural Network, Support Vector Machine, and Random Forest is performed for machining tool wear prediction. The schemes, techniques and paradigms of developing decision making support systems are reviewed for the monitoring of machining operations, and these techniques include neural networks, fuzzy logic, genetic algorithms, and hybrid systems [17,18]. The potential benefits and successful application examples of typical machine learning techniques including Bayesian Networks, instance-based learning, Artificial Neural Network, and ensemble methods are discussed in [19]. Cloud-enabled prognosis techniques, including data-driven, physics-based and model-based approaches, are reviewed in [20], with benefits from both advanced computing capability and information sharing for intelligent decision making.

Traditional machine learning is usually designed with shallow structures, such as Artificial Neural Network, Support Vector Machine, and logistic regression. Working with limited handcrafted features, it achieves decent performance in a variety of applications. However, the massive data in smart manufacturing imposes a variety of challenges [18,19], such as the proliferation of multimodal data, high dimensionality of the feature space, and multicollinearity among data measurements. These challenges leave traditional algorithms struggling and thus greatly impede their performance.

As a breakthrough in artificial intelligence, deep learning demonstrates outstanding performance in various applications including speech recognition, image recognition, natural language processing (e.g. translation, understanding, text question answering), multimodal image-text processing, and games (e.g. AlphaGo). Deep learning allows automatic processing of data towards highly nonlinear and complex feature abstraction via a cascade of multiple layers, instead of handcrafting the optimum feature representation of the data with domain knowledge. With automatic feature learning and high-volume modelling capabilities, deep learning provides an advanced analytics tool for smart manufacturing in the big data era. It uses a cascade of layers of nonlinear processing to learn representations of data corresponding to different levels of abstraction. The hidden patterns underneath each other are then identified and predicted through end-to-end optimization. Deep learning offers great potential to boost data-driven manufacturing applications, especially in the big data era [17,21].

In light of the above challenges, this paper aims to provide a state-of-the-art review of deep learning techniques and their applications in smart manufacturing. Specifically, a deep learning enabled advanced analytics framework is proposed to meet the opportunistic needs of smart manufacturing. Typical deep learning models are briefly introduced, and their applications to manufacturing are outlined to highlight the latest advancements in relevant areas. The challenges and future trends of deep learning are discussed at the end.

The rest of the paper is organized as follows. First, data-driven artificial intelligence techniques are reviewed in Section 2, with the advantages of deep learning techniques outlined. Next, the challenges and opportunistic needs of deep learning in smart manufacturing are presented, and typical deep learning models are briefly discussed in Section 3. Then, the latest applications of deep learning techniques in the context of smart manufacturing are summarized in Section 4. Finally, the challenges as well as future trends of deep learning in smart manufacturing are discussed.

2. Overview of data driven intelligence

2.1. The evolution of data-driven artificial intelligence

Artificial intelligence is considered a fundamental way to possess intelligence, and was listed in first place in Gartner's Top 10 strategic technology trends in 2017 [22]. Artificial intelligence has experienced several lifecycles, from the infancy period (1940s), through the first upsurge period (1960s) and the second upsurge period (1980s), to the present third boom period (after the 2000s). The development trends and typical artificial intelligence models are summarized in Table 1.

The origin of the Artificial Neural Network dates back to the 1940s, when the MP model [23] and the Hebb rule [24] were proposed to discuss how neurons work in the human brain. At the workshops at Dartmouth College, significant artificial intelligence capabilities like playing chess games and solving simple logic problems were developed [24]. The pioneering work brought artificial intelligence to the first upsurge period (1960s). In 1956, a mathematical model named the Perceptron [25] was proposed to simulate the nervous system of human learning with linear optimization. Next, a network model called the Adaptive Linear Unit [26] was developed in 1959 and was successfully used in practical applications such as communication and weather forecasting. The limitations of early artificial intelligence were also criticized due to the difficulty in handling non-linear problems, such as XOR (or XNOR) classification [27].

With the development of the Hopfield network circuit [28], artificial intelligence stepped forward to the second upsurge (1980s). The Back Propagation (BP) algorithm was proposed in 1974 to solve non-linear problems in complex neural networks [29]. A random mech-
Fig. 2. Comparison between two techniques: a) traditional machine learning, b) deep learning.
Table 2
Comparison between traditional machine learning and deep learning.

Traditional machine learning. Feature engineering: explicit engineered features extracted with expert domain knowledge. Model construction: extracted features are used to construct a data-driven model, usually with shallow structures. Training: each module is trained step-by-step.

Deep learning. Feature engineering: features are learned by transforming data into abstract representations. Model construction: an end-to-end hierarchical model structure with nonlinear combination of multiple layers. Training: parameters are trained jointly.
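The contrast summarized in Table 2 can be sketched in code. The following is a minimal, illustrative Python sketch (the toy signal, the chosen statistics, and the random kernel are all assumptions for illustration, not the paper's method): handcrafted statistics feed a separately trained shallow model in the traditional pipeline, while a convolution-plus-pooling stage stands in for one learned layer of a deep model whose parameters would all be trained jointly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for a vibration sensor signal: a sine wave plus noise.
signal = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * rng.standard_normal(256)

# Traditional machine learning: features are engineered explicitly with
# domain knowledge, then handed to a separately trained shallow model.
def engineered_features(x):
    rms = np.sqrt(np.mean(x ** 2))            # root-mean-square energy
    peak = np.max(np.abs(x))                  # peak amplitude
    return np.array([rms, peak, peak / rms])  # crest factor, a classic condition indicator

features = engineered_features(signal)        # fixed, hand-designed representation

# Deep learning: representations are learned by the network itself. A single
# convolution with a (here random, in practice trained) kernel followed by
# max pooling mimics one layer; stacking such layers and training all
# parameters jointly yields increasingly abstract features end to end.
kernel = rng.standard_normal(8)
feature_map = np.convolve(signal, kernel, mode="same")
pooled = feature_map.reshape(32, 8).max(axis=1)   # max pooling over windows of 8

print(features.shape, pooled.shape)
```

In the traditional pipeline the three statistics are fixed once chosen; in the deep pipeline the kernel itself would be adjusted by backpropagation, which is the "features learned as abstract representations" row of Table 2.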
strong influence on the analysis results. On the other hand, the deep hierarchical structure in deep learning makes it easier to model nonlinear relationships using compositional functions, compared with the shallow structure, regarded as a generic function, in traditional machine learning. The superiority of deep networks has been proven mathematically in [47]. As the size and variety of datasets grow in the big data context, it becomes more difficult to create new, highly relevant features. In the context of the big data era in smart manufacturing, the ability to avoid feature engineering is regarded as a great advantage due to the challenges associated with this process.

3. Deep learning for smart manufacturing

With new technologies (e.g. IoT, big data) embraced in smart manufacturing, smart facilities focus on creating manufacturing intelligence that can have a positive impact across the entire organization. Manufacturing today is experiencing an unprecedented increase in available sensory data comprised of different formats, semantics, and structures. Sensory data is collected from different aspects across the manufacturing enterprise, including the product line, manufacturing equipment, manufacturing processes, labour activity, and environmental conditions. Data modelling and analysis are essential parts of smart manufacturing for handling the increasing high-volume data, as well as supporting real-time data processing [48].

From sensory data to manufacturing intelligence, deep learning has attracted much attention as a breakthrough in computational intelligence. By mining knowledge from aggregated data, deep learning techniques play a key role in automatically learning from data, identifying patterns, and making decisions, as shown in Fig. 3. Different levels of data analytics can be produced, including descriptive analytics, diagnostic analytics, predictive analytics, and prescriptive analytics. Descriptive analytics aims to summarize what happens by capturing the product's conditions, environment and operational parameters. When the product performance is reduced or equipment failure happens, diagnostic analytics examines the root cause and reports the reason it happens. Predictive analytics utilizes statistical models to make predictions about the possibility of future production or equipment degradation with available historical data. Prescriptive analytics goes beyond this by recommending one or more courses of action; measures can be identified to improve production outcomes or correct problems, showing the likely outcome of each decision.

With the advanced analytics provided by deep learning, manufacturing is transformed into highly optimized smart facilities. The benefits include reducing operating costs, keeping up with changing consumer demand, improving productivity, reducing downtime, gaining better visibility, and extracting more value from the operations for global competitiveness.

To date, various deep learning architectures have been developed, and the relevant research topics are fast-growing. To facilitate the investigation of manufacturing intelligence, several typical deep learning architectures are discussed, including Convolutional Neural Network, Restricted Boltzmann Machine, Auto Encoder, and Recurrent Neural Network, together with their variants. Their feature learning capability and model construction mechanisms are emphasized, since these models are the building blocks for constructing comprehensive and complex deep learning techniques.

3.1. Convolutional neural network

Convolutional Neural Network (CNN) is a multi-layer feed-forward artificial neural network that was first proposed for two-dimensional image processing [36]. It has recently also been investigated for one-dimensional sequential data analysis, including natural language processing and speech recognition [49]. In CNN, feature learning is achieved by alternating and stacking convolutional layers and pooling operations. The convolutional layers convolve the raw input data with multiple local kernel filters and generate invariant local features. The subsequent pooling layers extract the most significant features with a fixed-length
over sliding windows of the raw input data by pooling operations such as max pooling and average pooling. Max pooling selects the maximum value of one region of the feature map as the most significant feature. Average pooling calculates the mean value of one region and takes it as the pooling value of this region. Max pooling is well suited to extracting sparse features, while pooling over all samples may not be optimal.

After multi-layer feature learning, fully-connected layers convert a two-dimensional feature map into a one-dimensional vector and then feed it into a softmax function for model construction. By stacking convolutional layers, pooling layers, and fully-connected layers, a typical CNN is constructed as shown in Fig. 4. Gradient-based backpropagation is usually used to train the convolutional neural network by minimizing the mean squared error or cross-entropy loss function. CNN has advantageous properties including sparse interactions with local connectivity, parameter sharing with reduced parameter counts, and equivariant representation that is invariant to object locations.

3.2. Restricted Boltzmann machine and its variants

Restricted Boltzmann Machine (RBM) is a two-layer neural network consisting of a visible and a hidden layer. There exists a symmetric connection between visible and hidden units, but there are no connections between neurons within the same layer. It is an energy-based model in which the visible layer is used to input data while the hidden layer is used to extract features. All hidden nodes are assumed conditionally independent. The weights and offsets of these two layers are tuned over iterations in order to make the output of the visible layer approximate the original input. Finally, the hidden layers are regarded as different representations of the visible layer.

The parameters in the hidden layers are treated as features that characterize the input data, realizing data coding and dimension reduction. Then, supervised learning methods such as logistic regression, Naïve Bayes, BP Neural Network, and Support Vector Machine can be used to implement data classification and regression. RBM has the advantage of extracting the required features from training datasets automatically, which avoids local minima and has thus received growing attention. Using RBM as the basic learning module, different variant models have been developed [32].

Deep Belief Network (DBN): DBN is constructed by stacking multiple RBMs, where the output of the lth layer of hidden units is used as the input of the (l + 1)th layer of visible units. For DBN training, a fast greedy algorithm is usually used to initialize the network and
the parameters of this deep architecture are then fine-tuned by a contrastive wake-sleep algorithm [37]. A Bayesian Belief Network is applied to the area close to the visible layers, and RBMs are used for the area far from the visible layers. That is to say, the highest two layers are undirected and the other, lower layers are directed, as shown in Fig. 5.

Deep Boltzmann Machine (DBM): DBM can be regarded as deep-structured RBMs in which the hidden units are grouped into a hierarchy of layers. Full connections between two adjacent layers are enabled, but no connection is allowed within a layer or between non-neighbouring layers, as shown in Fig. 5. By stacking multiple RBMs, DBM can learn complex structures and construct high-level representations of the input data [42]. Compared to DBN, DBM is a fully undirected graphical model, while DBN is a mixed directed/undirected one. Accordingly, the DBM model is trained jointly, which is more computationally expensive; in contrast, DBN can be trained layer-wise, which is more efficient.

3.3. Auto encoder and its variants

Auto Encoder (AE) is an unsupervised learning algorithm that extracts features from input data without needing label information. It mainly consists of two parts, an encoder and a decoder, as shown in Fig. 6. The encoder can perform data compression, especially when dealing with input of high dimensionality, by mapping the input to a hidden layer [33]. The decoder can reconstruct an approximation of the input. Suppose the activation function is linear and there are fewer hidden units than the dimensionality of the input data; then the linear Auto Encoder is similar to principal component analysis (PCA). If the input data is highly nonlinear, more hidden layers are required to construct a deep Auto Encoder. Stochastic gradient descent (SGD) is often used to calculate the parameters and build the auto-encoder by minimizing the objective loss function in terms of the least-square loss or cross-entropy loss. Several variants of AE have been developed, as listed below:

1) Denoising Auto Encoder (DAE): DAE is an extended version of the basic Auto Encoder, trained to reconstruct stochastically corrupted input data by adding isotropic Gaussian noise to the input x, forcing the hidden layer to discover more robust features [43].
2) Sparse Auto Encoder (SAE): SAE keeps most of the hidden units' activations close to zero by imposing sparsity constraints on the hidden units, even when the number of hidden units is large [40,41].
3) Contractive Auto Encoder (CAE): In order to make the model resistant to small perturbations, CAE encourages learning more robust representations of the input x [50].

3.4. Recurrent neural network and its variants

Compared with traditional neural networks, Recurrent Neural Network (RNN) has the unique characteristic that the topological connections between its neurons form directed cycles for sequence data, as shown in Fig. 7. Thus, RNN is suitable for feature learning from sequence data. It allows information to persist in the hidden layers and captures previous states from a few time steps earlier. An update rule is applied in RNN to calculate the hidden states at different time steps. Taking the sequential input as a vector, the current hidden state is calculated from two parts through the same activation function (e.g. sigmoid or tanh). The first part is calculated from the current input, while the second part is obtained from the hidden state at the previous time step. Then, the target output can be calculated from the current hidden state through a softmax function. After processing the whole sequence, the hidden state is the learned representation
Fig. 7. An unrolled Recurrent Neural Network: the input layer (x1, x2, ..., xt) feeds hidden states (h0, h1, h2, ..., hn) computed by shared weights fW, and each hidden state produces an output (y1, y2, ..., yn) through g.
of the input sequential data, and a conventional multilayer perceptron (MLP) is added on top to map the obtained representation to targets.

Different from traditional neural networks, model training in RNN is performed by Backpropagation Through Time (BPTT). The RNN is first unrolled in time, and each unrolled time step is considered an additional layer. Then the backpropagation algorithm is applied to calculate the gradients. Due to the vanishing/exploding gradient problem when using BPTT for model training, RNN cannot capture long-term dependencies. In other words, RNN has difficulty dealing with long-term sequence data.

A variety of enhancements have been proposed to solve these problems, among which long short-term memory (LSTM) is widely investigated for its effectiveness [35]. The most important idea in LSTM is the cell state, which allows information to flow along with only linear interactions. Compared with the single recurrent structure of RNN, gates, including a forget gate layer, an input gate layer, and an output gate layer, are used in LSTM to control the cell state. This enables each recurrent unit to adaptively capture long-term dependencies at different time scales.

3.5. Model comparison

From the above illustration, it can be seen that CNN and RNN provide complex composition mechanisms for representation learning and model construction, while RBM and AE can be used for layer-by-layer pretraining of deep neural networks to characterize the input data. In these deep learning models, the top layers normally represent the targets. For classification, where the targets are discrete values, softmax layers are applied. For prediction with continuous targets, linear regression layers are added. According to their dependence on labelled data, DBN, AE and their variants perform unsupervised or semi-supervised learning, whereas CNN, RNN and their variants perform supervised learning. The pros and cons of these typical deep learning models are presented in Table 3.

Fortunately, a number of typical deep learning packages, including open source and commercial software, are publicly available, as summarized in Table 4. They facilitate the investigation of deep learning techniques in different manufacturing scenarios.

4. Applications to smart manufacturing

Computational intelligence is an essential part of smart manufacturing, enabling accurate insights for better decision making. Machine learning has been widely investigated in different stages of the manufacturing lifecycle covering concept, design [60], evaluation, production, operation, and sustainment [61], as shown in Fig. 8. The applications of data mining in manufacturing engineering are reviewed in [62], covering different categories of production processes, operations, fault detection, maintenance, decision support, and product quality improvement. The evolution and future of manufacturing are reviewed in [63,64], emphasizing the importance of data modelling and analysis in manufacturing intelligence. The application schemes of machine learning in manufacturing are identified and summarized in [65,66]. Smart manufacturing also requires prognostics and health management (PHM) capabilities to meet the current and future needs for efficient and reconfigurable production [67].

Deep learning, as an emerging technique, has recently been investigated for a wide range of manufacturing systems. To give an overview, the applications of state-of-the-art deep learning techniques in manufacturing are discussed in this study, especially in
Table 3
Comparison between different deep learning models.

CNN. Feature learning: abstracted features are learned by stacked convolutional and sampling layers. Pros: reduced parameter number; invariance to shift, scale and distortion. Cons: high computational complexity for hierarchical model training.

RBM. Feature learning: the hidden layer describes variable dependencies and connections between input or output layers as representative features. Pros: robust to ambiguous input; training labels are not required in the pre-training stage. Cons: time-consuming joint parameter optimization.

AE. Feature learning: unsupervised feature learning and data dimensionality reduction are achieved through encoding. Pros: irrelevance in the input is eliminated and meaningful information is preserved. Cons: errors propagate layer-by-layer, and sparse representations are not guaranteed.

RNN. Feature learning: temporal patterns stored in the recurrent neuron connections and distributed hidden states for time-series data. Pros: short-term information is retained and temporal correlations are captured in sequence data. Cons: difficult to train the model and preserve long-term dependencies.
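The RNN update rule described in Section 3.4 and summarized in the table above can be written out directly. The numpy sketch below is illustrative only: the layer sizes, the random (untrained) weights, and the toy input sequence are all assumptions; tanh serves as the shared activation and a softmax layer produces the target output.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out, n_steps = 4, 8, 3, 5   # illustrative sizes

# The same weights are applied at every time step (the recurrent structure).
W_xh = 0.1 * rng.standard_normal((n_hidden, n_in))      # input -> hidden
W_hh = 0.1 * rng.standard_normal((n_hidden, n_hidden))  # hidden -> hidden
W_hy = 0.1 * rng.standard_normal((n_out, n_hidden))     # hidden -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x_seq = rng.standard_normal((n_steps, n_in))  # a toy input sequence
h = np.zeros(n_hidden)                        # initial hidden state h0

# Update rule: the new hidden state combines the current input with the
# hidden state of the previous time step through the same activation.
for x_t in x_seq:
    h = np.tanh(W_xh @ x_t + W_hh @ h)

# After processing the whole sequence, h is the learned representation of
# the input; the target output is obtained through a softmax layer.
y = softmax(W_hy @ h)
print(y, y.sum())
```

Unrolling this loop over time, with each iteration treated as an additional layer, is exactly the view used by BPTT for training; the repeated multiplication by W_hh in the unrolled graph is also where the vanishing/exploding gradient problem noted in Section 3.4 arises.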
Table 4
A list of deep learning tools.
Table 5
A list of deep learning models with applications.
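As a concrete complement to the model listings above, the Auto Encoder of Section 3.3 can be illustrated in a few lines. The sketch below is a toy example under stated assumptions (illustrative sizes, a made-up dataset, tied encoder/decoder weights, full-batch gradient descent rather than true SGD): a linear Auto Encoder with fewer hidden units than input dimensions is trained on the least-square reconstruction loss, the regime in which Section 3.3 notes it behaves like PCA.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 100 samples in 10 dimensions lying in a 3-dimensional subspace,
# so a 3-unit hidden layer (the code) can reconstruct them well.
basis = rng.standard_normal((3, 10))
data = rng.standard_normal((100, 3)) @ basis

# Tied weights: the encoder multiplies by W, the decoder by W.T.
W = 0.1 * rng.standard_normal((10, 3))

def loss(W):
    recon = (data @ W) @ W.T                 # encode, then decode
    return np.mean((recon - data) ** 2)      # least-square reconstruction loss

before = loss(W)
lr = 0.002
for _ in range(300):                         # plain full-batch gradient descent;
    code = data @ W                          # SGD would sample mini-batches instead
    err = code @ W.T - data
    grad = 2 * (err.T @ code + data.T @ err @ W) / data.size
    W -= lr * grad

print(before, loss(W))                       # reconstruction error decreases
```

The learned columns of W end up spanning (approximately) the top principal subspace of the data; adding noise to the input before encoding, while still reconstructing the clean data, would turn this into the Denoising Auto Encoder variant of Section 3.3.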
investigated to learn high-level generic features and applied to a wide range of textured or difficult-to-detect defect cases.

Convolutional Neural Network, originally designed for image analysis, is well suited for automated defect identification in surface integration inspection. In [72], a Deep Convolutional Neural Network architecture is designed and its hyper-parameters are optimized based on backpropagation and stochastic gradient descent algorithms. A max-pooling Convolutional Neural Network is presented in [73] to perform feature extraction directly from the pixel representation of steel defect images, and it shows lower error rates compared with the multi-layer perceptron and support vector machine. Image analysis with a convolutional neural network is studied in [74] to automatically inspect dirt, scratches, burrs, and wear on surface parts. The experimental results show that CNN works properly with different types of defects on textured and non-textured surfaces. A generic CNN-based approach is proposed in [75] to extract patch features and predict defect areas via thresholding and segmenting. The results show that the pretrained CNN model works well on small datasets, with improved accuracy for automated surface inspection systems.

Extreme learning machine based Auto Encoder is investigated in [98] to learn feature representations for wind turbine fault classification. In [99], a continuous sparse Auto Encoder is presented that adds a Gaussian stochastic unit into the activation function to extract nonlinear features of the input data. To improve diagnostic analytics, comprehensive deep learning models have also been developed. In [100], a sparse-filtering-based deep neural network model is investigated for unsupervised feature learning. Synthesized deep learning models are discussed in [101,102] for signal denoising and fused feature extraction. Different deep learning models, including Deep Boltzmann Machine, Deep Belief Network, and Stacked Auto Encoder, with different preprocessing schemes, are comparatively studied in [103] for rolling bearing fault diagnosis. The results show that the Stacked Auto Encoder performs best. From the above literature, it is concluded that deep learning models outperform traditional machine learning techniques with engineered features, such as support vector machine and BP Neural Network, in terms of classification accuracy.

4.3. Predictive analytics for defect prognosis
to do with the data they have, and they lack the software and modelling to interpret and analyse them. On the other hand, manufacturers need practical guidance to improve their processes and products, while academics develop up-to-date artificial intelligence models without considering how they will be applied in practice. As the manufacturing process becomes more complex, it becomes harder to clean the data and formulate the right problems to model. Five gaps are identified in smart manufacturing innovation, including adopted strategies; improved data collection, use and sharing; predictive model design; generalized predictive models; and connected factories and control processes [48].

To meet the high demand for advanced analytics in smart manufacturing, deep learning, with its feature learning and deep networks, offers great potential and shows advantageous properties. To handle overwhelming data characterized by high volume, high velocity and high variety, there are still challenges for the manufacturing industry in adopting, implementing, and deploying deep learning for real-world applications. To address these challenges, the future development trends of deep learning for smart manufacturing are discussed in terms of data matters, model selection, model visualization, generic models, and incremental learning.

5.3. Model visualization

The analytics solutions of deep learning need to be understood by manufacturing engineers; otherwise, the generated recommendations and decisions may be ignored. Due to the underlying model complexity, a deep neural network is usually regarded as a black-box model. It is hard to explain its internal computation mechanism or to interpret the abstract feature representations physically. Visualization of the learned features and model architecture may offer some insights, and thus facilitate the construction and configuration of deep neural network models for complex problems. On the other hand, features engineered with domain expertise have demonstrated their effectiveness. Complementing the abstract features with engineered features through visualization and fusion may contribute to a more effective model. Some visualization techniques have been proposed, including the t-SNE model [114] for high-dimensional data visualization, and visualization of the activations produced by each layer of a deep neural network via regularized optimization [115].

5.4. Generic model
insightful information, deep learning gives decision-makers new visibility into their operations, as well as real-time performance measures and costs. To facilitate advanced analytics, a comprehensive overview of deep learning techniques is presented along with their applications to smart manufacturing. Four typical deep learning models, including the Convolutional Neural Network, Restricted Boltzmann Machine, Auto Encoder, and Recurrent Neural Network, are discussed in detail. The emerging research effort of deep learning in manufacturing applications is also summarized. Despite the promising results reported so far, there are still some limitations and significant challenges for further exploration.

With the evolution of computing resources (e.g., cloud computing [119–124] and fog computing [125,126]), computational intelligence, including deep learning, may be pushed into the cloud, enabling more convenient and on-demand computing services for smart manufacturing.

Acknowledgments

This research acknowledges the financial support provided by the National Key Research and Development Program of China (No. 2016YFC0802103), the National Science Foundation of China (No. 51504274), and the Science Foundation of China University of Petroleum, Beijing (No. 2462014YJRC039).

References

[1] Putnik G, Sluga A, ElMaraghy H, Teti R, Koren Y, Tolio T, et al. Scalability in manufacturing systems design and operation: state-of-the-art and future developments roadmap. CIRP Ann Manuf Technol 2013;62(2):751–74.
[2] Lee YT, Kumaraguru S, Jain S, Hatim Q, Robinson S, Helu M, et al. A classification scheme for smart manufacturing systems' performance metrics. Smart Sustain Manuf Syst 2017;1(1):52–74.
[3] Hu T, Li P, Zhang C, Liu R. Design and application of a real-time industrial Ethernet protocol under Linux using RTAI. Int J Comput Integr Manuf 2013;26(5):429–39.
[4] Ye Y, Hu T, Zhang C, Luo W. Design and development of a CNC machining process knowledge base using cloud technology. Int J Adv Manuf Technol 2016:1–13.
[5] Tao F, Qi Q. New IT driven service-oriented smart manufacturing: framework and characteristics. IEEE Trans Syst Man Cybern Syst 2017;99:1–11.
[6] Ang J, Goh C, Saldivar A, Li Y. Energy-efficient through-life smart design, manufacturing and operation of ships in an industry 4.0 environment. Energies 2017;10(5):610.
[7] Huang Z, Hu T, Peng C, Hou M, Zhang C. Research and development of industrial real-time Ethernet performance testing system used for CNC system. Int J Adv Manuf Technol 2016;83(5–8):1199–207.
[8] Lalanda P, Morand D, Chollet S. Autonomic mediation middleware for smart manufacturing. IEEE Internet Comput 2017;21(1):32–9.
[9] Smart Manufacturing Coalition. Manufacturing growth continues despite uncertain economy, according to ASQ outlook survey; 2013. https://smartmanufacturingcoalition.org/sites/default/files/12.16.13 manufacturing outlook survey.pdf. [Accessed 10 September 2017].
[10] Wang L, Törngren M, Onori M. Current status and advancement of cyber-physical systems in manufacturing. J Manuf Syst 2015;37:517–27.
[11] Wang P, Gao RX, Fan Z. Cloud computing for cloud manufacturing: benefits and limitations. J Manuf Sci Eng 2015;137:1–10.
[12] Lu Y, Xu X, Xu J. Development of a hybrid manufacturing cloud. J Manuf Syst 2014;33(4):551–66.
[13] Wu D, Rosen DW, Schaefer D. Cloud-based design and manufacturing: status and promise. Comput Aided Des 2015;59:1–14.
[14] Choudhary AK, Harding JA, Tiwari MK. Data mining in manufacturing: a review based on the kind of knowledge. J Intell Manuf 2009;20(5):501–21.
[15] Lade P, Ghosh R, Srinivasan S. Manufacturing analytics and industrial internet of things. IEEE Intell Syst 2017;32(3):74–9.
[16] Monostori L, Márkus A, Brussel HV, Westkämpfer E. Machine learning approaches to manufacturing. CIRP Ann Manuf Technol 1996;45(2):675–712.
[17] Teti R, Jemielniak K, O'Donnell G, Dornfeld D. Advanced monitoring of machining operations. CIRP Ann Manuf Technol 2010;59(2):717–39.
[18] Helu M, Libes D, Lubell J, Lyons K, Morris K. Enabling smart manufacturing technologies for decision-making support. Proceedings of the ASME international design engineering technical conferences and computers and information in engineering conference (IDETC/CIE) 2016:1–10.
[19] Wuest T, Weimer D, Irgens C, Thoben KD. Machine learning in manufacturing: advantages, challenges, and applications. Prod Manuf Res 2016;4(1):23–45.
[20] Gao R, Wang L, Teti R, Dornfeld D, Kumara S, Helu M, et al. Cloud-enabled prognosis for manufacturing. CIRP Ann Manuf Technol 2015;64(2):749–72.
[21] Wu D, Jennings C, Terpenny J, Gao RX, Kumara S. A comparative study on machine learning algorithms for smart manufacturing: tool wear prediction using random forests. J Manuf Sci Eng 2017;139(7):1–10.
[22] Gartner's top 10 strategic technology trends for 2017; 2017. http://www.gartner.com/smarterwithgartner/gartners-top-10-technology-trends-2017/. [Accessed 13 August 2017].
[23] McCulloch WS, Pitts WH. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 1943;5(4):115–33.
[24] Samuel AL. Some studies in machine learning using the game of checkers II—recent progress. Annu Rev Autom Program 2010;44(1–2):206–26.
[25] Rosenblatt F. Perceptron simulation experiments. Proc IRE 1960;48(3):301–9.
[26] Widrow B, Hoff ME. Adaptive switching circuits. Cambridge: MIT Press; 1960.
[27] Minsky M, Papert S. Perceptrons. Am J Psychol 1988;84(3):449–52.
[28] Tank DW, Hopfield JJ. Neural computation by concentrating information in time. Proc Natl Acad Sci USA 1987;84(7):1896.
[29] Werbos PJ. Backpropagation through time: what it does and how to do it. Proc IEEE 1990;78(10):1550–60.
[30] Sussmann HJ. Learning algorithms for Boltzmann machines. 27th IEEE conference on decision and control 1988;1:786–91.
[31] Vapnik VN. An overview of statistical learning theory. IEEE Trans Neural Netw 1998;10(5):988–99.
[32] Smolensky P. Information processing in dynamical systems: foundations of harmony theory. Parallel distributed processing: explorations in the microstructure of cognition. Cambridge: MIT Press; 1986.
[33] Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986;323(6088):533–6.
[34] El Hihi S, Bengio Y. Hierarchical recurrent neural networks for long-term dependencies. Adv Neural Inf Process Syst 1995;8:493–9.
[35] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80.
[36] LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE 1998;86(11):2278–324.
[37] Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science 2006;313(5786):504–7.
[38] Hinton GE, Osindero S, Teh YW. A fast learning algorithm for deep belief nets. Neural Comput 2006;18(7):1527–54.
[39] Deng L, Seltzer M, Yu D, Acero A, Mohamed A, Hinton GE. Binary coding of speech spectrograms using a deep auto-encoder. Proceedings of 11th annual conference of the international speech communication association 2010;3:1692–5.
[40] Schölkopf B, Platt J, Hofmann T. Efficient learning of sparse representations with an energy-based model. Proceedings of advances in neural information processing systems 2006:1137–44.
[41] Ranzato MA, Boureau YL, LeCun Y. Sparse feature learning for deep belief networks. Proceedings of international conference on neural information processing systems 2007;20:1185–92.
[42] Salakhutdinov RR, Hinton GE. Deep Boltzmann machines. J Mach Learn Res 2009;5(2):1967–2006.
[43] Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 2010;11(12):3371–408.
[44] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. International conference on neural information processing systems 2012;25:1097–105.
[45] Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. Int Conf Neural Inf Process Syst 2014;3:2672–80.
[46] Wang Y, Huang M, Zhao L, Zhu X. Attention-based LSTM for aspect-level sentiment classification. Proceedings of conference on empirical methods in natural language processing 2016:606–15.
[47] Poggio T, Smale S. The mathematics of learning: dealing with data. Not Am Math Soc 2003;50(5):537–44.
[48] Kusiak A. Smart manufacturing must embrace big data. Nature 2017;544(7648):23–5.
[49] Ince T, Kiranyaz S, Eren L, Askar M, Gabbouj M. Real-time motor fault detection by 1-D convolutional neural networks. IEEE Trans Ind Electron 2016;63(11):7067–75.
[50] Hassanzadeh A, Kaarna A, Kauranne T. Unsupervised multi-manifold classification of hyperspectral remote sensing images with contractive autoencoder. Neurocomputing 2017;257:67–78.
[51] Caffe2. https://caffe2.ai/. 2017 [Accessed 20 October 2017].
[52] Theano. http://deeplearning.net/software/theano/index.html#. 2017 [Accessed 20 October 2017].
[53] Google TensorFlow. https://www.tensorflow.org/. 2017 [Accessed 20 October 2017].
[54] PyTorch. http://pytorch.org/. 2017 [Accessed 20 October 2017].
[55] Microsoft Cognitive Toolkit. https://www.microsoft.com/en-us/cognitive-toolkit. 2017 [Accessed 20 October 2017].
[56] Google. Google cloud machine learning. https://cloud.google.com/products/machine-learning/. 2017 [Accessed 20 October 2017].
[57] Amazon Web Services. Amazon AI. https://aws.amazon.com/amazon-ai/. 2017 [Accessed 20 October 2017].
[58] Microsoft. Microsoft Azure; 2017. https://azure.microsoft.com/en-us/services/machine-learning-studio/. [Accessed 20 October 2017].
[59] IBM. IBM Watson ecosystem program; 2017. http://m.ibm.com/http/www-03.ibm.com/innovation/us/watson/. [Accessed 20 October 2017].
[60] Zhang W, Jia MP, Zhu L, Yan X. Comprehensive overview on computational intelligence techniques for machinery condition monitoring and fault diagnosis. Chin J Mech Eng 2017;30(4):1–14.
[61] Lee J, Lapira E, Bagheri B, Kao H. Recent advances and trends in predictive manufacturing systems in big data environment. Manuf Lett 2013;1(1):38–41.
[62] Harding JA, Shahbaz M, Srinivas, Kusiak A. Data mining in manufacturing: a review. J Manuf Sci Eng 2006;128:969–76.
[63] Esmaeilian B, Behdad S, Wang B. The evolution and future of manufacturing: a review. J Manuf Syst 2016;39:79–100.
[64] Kang HS, Ju YL, Choi SS, Kim H, Park JH. Smart manufacturing: past research, present findings, and future directions. Int J Precision Eng Manuf Green Technol 2016;3(1):111–28.
[65] Hazen BT, Boone CA, Ezell JD, Jones-Farmer LA. Data quality for data science, predictive analytics, and big data in supply chain management: an introduction to the problem and suggestions for research and applications. Int J Prod Econ 2014;154(4):72–80.
[66] Shin SJ, Woo J, Rachuri S. Predictive analytics model for power consumption in manufacturing. Procedia CIRP 2014;15:153–8.
[67] Vogl GW, Weiss BA, Helu M. A review of diagnostic and prognostic capabilities and best practice for manufacturing. J Intell Manuf 2016:1–17.
[68] Xie X. A review of recent advances in surface defect detection using texture analysis techniques. ELCVIA Electron Lett Comput Vision Image Anal 2008;7(3):1–22.
[69] Neogi N, Mohanta DK, Dutta PK. Review of vision-based steel surface inspection systems. EURASIP J Image Video Process 2014;1:1–19.
[70] Pernkopf F, O'Leary P. Visual inspection of machined metallic high-precision surfaces. EURASIP J Adv Signal Process 2002;7:667–8.
[71] Scholz-Reiter B, Weimer D, Thamer H. Automated surface inspection of cold-formed micro-parts. CIRP Ann Manuf Technol 2012;61(1):531–4.
[72] Weimer D, Scholz-Reiter B, Shpitalni M. Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. CIRP Ann Manuf Technol 2016;65(1):417–20.
[73] Ren R, Hung T, Tan KC. A generic deep-learning-based approach for automated surface inspection. IEEE Trans Cybern 2017;99:1–12.
[74] Masci J, Meier U, Ciresan D, Schmidhuber J, Fricout G, Mittal A. Steel defect classification with max-pooling convolutional neural networks. IEEE international joint conference on neural networks (IJCNN) 2012;20:1–6.
[75] Park JK, Kwon BK, Park JH, Kang DJ. Machine learning-based imaging system for surface defect inspection. Int J Precision Eng Manuf Green Technol 2016;3(3):303–10.
[76] Zhao R, Yan R, Chen Z, Chen Z, Mao K, Wang P, et al. Deep learning and its applications to machine health monitoring: a survey; 2016. https://arxiv.org/pdf/1612.07640.pdf. [Accessed 20 October 2017].
[77] Janssens O, Slavkovikj V, Vervisch B, Stockman K, Loccufier M, Verstockt S, et al. Convolutional neural network based fault detection for rotating machinery. J Sound Vib 2016;377:331–45.
[78] Lu C, Wang Z, Zhou B. Intelligent fault diagnosis of rolling bearing using hierarchical convolutional network based health state classification. Adv Eng Inf 2017;32:139–51.
[79] Guo X, Chen L, Shen C. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016;93:490–502.
[80] Verstraete D, Droguett E, Meruane V, Modarres M, Ferrada A. Deep learning enabled fault diagnosis using time-frequency image analysis of rolling element bearings. Shock Vib 2017:1–29.
[81] Chen ZQ, Li C, Sanchez RV. Gearbox fault identification and classification with convolutional neural networks. Shock Vib 2015;2:1–10.
[82] Wang P, Ananya, Yan R, Gao RX. Virtualization and deep recognition for system fault classification. J Manuf Syst 2017;44:310–6.
[83] Dong H, Yang L, Li H. Small fault diagnosis of front-end speed controlled wind generator based on deep learning. WSEAS Trans Circuits Syst 2016;15:64–72.
[84] Wang J, Zhuang J, Duan L, Cheng W. A multi-scale convolution neural network for featureless fault diagnosis. Proceedings of 2016 international symposium on flexible automation 2016:65–70.
[85] Tamilselvan P, Wang P. Failure diagnosis using deep belief learning based health state classification. Reliab Eng Syst Saf 2013;115(7):124–35.
[86] Yu H, Khan F, Garaniya V. Nonlinear Gaussian belief network based fault diagnosis for industrial processes. J Process Control 2015;35:178–200.
[87] Tran VT, Althobiani F, Ball A. An approach to fault diagnosis of reciprocating compressor valves using Teager–Kaiser energy operator and deep belief networks. Expert Syst Appl 2014;41(9):4113–22.
[88] Shao H, Jiang H, Zhang X, Niu M. Rolling bearing fault diagnosis using an optimization deep belief network. Meas Sci Technol 2015;26(11):1–17.
[89] Gan M, Wang C, Zhu C. Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech Syst Signal Process 2016;72–73(2):92–104.
[90] Yin J, Zhao W. Fault diagnosis network design for vehicle on-board equipments of high speed railway: a deep learning approach. Eng Appl Artif Intell 2016;56:250–9.
[91] Xie H, Yang Y, Wang H, Li T, Jin W. Fault diagnosis in high-speed train running gears with improved deep belief networks. J Comput Inf Syst 2015;11(24):7723–30.
[92] Li C, Sanchez RV, Zurita G, Cerrada M, Cabrera D, Vasquez RE. Gearbox fault diagnosis based on deep random forest fusion of acoustic and vibratory signals. Mech Syst Signal Process 2016;76–77:283–93.
[93] Jia F, Lei Y, Lin J, Zhou X, Lu N. Deep neural networks: a promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mech Syst Signal Process 2016;72–73:303–15.
[94] Guo J, Xie X, Bie R, Sun L. Structural health monitoring by using a sparse coding-based deep learning algorithm with wireless sensor networks. Pers Ubiquit Comput 2014;18:1977–87.
[95] Lu C, Wang Z, Qin W, Ma J. Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification. Signal Process 2017;130:377–88.
[96] Shao H, Jiang H, Wang F, Zhao H. An enhancement deep feature fusion method for rotating machinery fault diagnosis. Knowl Based Syst 2017;119:200–20.
[97] Sun W, Shao S, Zhao R, Yan R, Zhang X, Chen X. A sparse auto-encoder-based deep neural network approach for induction motor faults classification. Measurement 2016;89:171–8.
[98] Yang Z, Wang X, Zhong J. Representational learning for fault diagnosis of wind turbine equipment: a multi-layered extreme learning machines approach. Energies 2016;9(379):1–17.
[99] Wang L, Zhao X, Pei J, Tang G. Transformer fault diagnosis using continuous sparse autoencoder. SpringerPlus 2016;5(448):1–13.
[100] Lei Y, Jia F, Lin J, Xing S, Ding SX. An intelligent fault diagnosis method using unsupervised feature learning towards mechanical big data. IEEE Trans Ind Electron 2016;63(5):3137–47.
[101] Li C, Sanchez RV, Zurita G, Cerrada M, Cabrera D, Vasquez RE. Multimodal deep support vector classification with homologous features and its application to gearbox fault diagnosis. Neurocomputing 2015;168:119–27.
[102] Guo X, Shen C, Chen L. Deep fault recognizer: an integrated model to denoise and extract features for fault diagnosis in rotating machinery. Appl Sci 2017;7(41):1–17.
[103] Chen Z, Deng S, Chen X, Li C, Sanchez RV, Qin H. Deep neural network-based rolling bearing fault diagnosis. Microelectron Reliab 2017;75:327–33.
[104] Malhi A, Yan R, Gao RX. Prognosis of defect propagation based on recurrent neural networks. IEEE Trans Instrum Meas 2011;60(3):703–11.
[105] Zhao R, Wang D, Yan R, Mao K, Shen F, Wang J. Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Trans Ind Electron 2018;65(2):1539–48.
[106] Zhao R, Yan R, Wang J, Mao K. Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors 2017;17(273):1–18.
[107] Wu Y, Yuan M, Dong S, Lin L, Liu Y. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 2017;226(5):853–60.
[108] Malhotra P, Vig L, Shroff G, Agarwal P. Long short term memory networks for anomaly detection in time series. In: Proceedings of European symposium on artificial neural networks, computational intelligence, and machine learning. 2015. p. 89–94.
[109] Wang P, Gao RX, Yan R. A deep learning-based approach to material removal rate prediction in polishing. CIRP Ann Manuf Technol 2017;66:429–32.
[110] Deutsch J, He M, He D. Remaining useful life prediction of hybrid ceramic bearings using an integrated deep learning and particle filter approach. Appl Sci 2017;7(649):1–17.
[111] Qiu X, Zhang L, Ren Y, Suganthan PN, Amaratunga G. Ensemble deep learning for regression and time series forecasting. IEEE symposium series on computational intelligence 2014:1–6.
[112] Zhang W, Duan P, Yang LT, Xia F, Li Z, Lu Q, et al. Resource requests prediction in the cloud computing environment with a deep belief network. Software Pract Exp 2017;47(3):473–88.
[113] Khan SH, Hayat M, Bennamoun M, Sohel FA, Togneri R. Cost-sensitive learning of deep feature representations from imbalanced data. IEEE Trans Neural Netw Learn Syst 2017;99:1–15.
[114] Maaten LVD, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008;9(2605):2579–605.
[115] Yu D, Yao K, Su H, Li G, Seide F. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition. IEEE international conference on acoustics, speech and signal processing 2013:7893–7.
[116] Vig E, Dorr M, Cox D. Large-scale optimization of hierarchical features for saliency prediction in natural images. IEEE computer vision and pattern recognition 2014:2798–805.
[117] Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowl Data Eng 2010;22(10):1345–59.
[118] Dziugaite GK, Roy DM, Ghahramani Z. Training generative neural networks via maximum mean discrepancy optimization. Proceedings of the 31st conference on uncertainty in artificial intelligence 2015:258–67.
[119] Mell P, Grance T. The NIST definition of cloud computing. Commun ACM 2009;53(6):50.
[120] Davis J, Edgar T, Porter J, Bernaden J, Sarli M. Smart manufacturing, manufacturing intelligence and demand-dynamic performance. Comput Chem Eng 2012;47(12):145–56.
[121] Lee J, Kao HA, Yang S. Service innovation and smart analytics for industry 4.0 and big data environment. Procedia CIRP 2014;16:3–8.
[122] Lee J, Lapira E, Bagheri B, Kao H. Recent advances and trends in predictive manufacturing systems in big data environment. Manuf Lett 2013;1(1):38–41.
[123] Chen CLP, Zhang CY. Data-intensive applications, challenges, techniques and technologies: a survey on Big Data. Inf Sci 2014;275(11):314–47.
[124] Meziane F, Vadera S, Kobbacy K, Proudlove N. Intelligent systems in manufacturing: current developments and future prospects. Integr Manuf Syst 2000;11(4):218–38.
[125] O'Donovan P, Leahy K, Bruton K, O'Sullivan D. Big data in manufacturing: a systematic mapping study. J Big Data 2015;2(1):1–22.
[126] Wu D, Liu S, Zhang L, Terpenny J, Gao RX, Kurfess T, et al. A fog computing-based framework for process monitoring and prognosis in cyber-manufacturing. J Manuf Syst 2017;43(1):25–34.