Article history: Received 13 November 2017; Received in revised form 1 January 2018; Accepted 2 January 2018; Available online xxx

Keywords: Smart manufacturing; Deep learning; Computational intelligence; Data analytics

Abstract: Smart manufacturing refers to using advanced data analytics to complement physical science for improving system performance and decision making. With the widespread deployment of sensors and the Internet of Things, there is an increasing need for handling big manufacturing data characterized by high volume, high velocity, and high variety. Deep learning provides advanced analytics tools for processing and analysing big manufacturing data. This paper presents a comprehensive survey of commonly used deep learning algorithms and discusses their applications toward making manufacturing "smart". The evolution of deep learning technologies and their advantages over traditional machine learning are first discussed. Subsequently, computational methods based on deep learning that aim to improve system performance in manufacturing are presented. Several representative deep learning models are comparatively discussed. Finally, emerging topics of research on deep learning are highlighted, and future trends and challenges associated with deep learning for smart manufacturing are summarized.

© 2018 Published by Elsevier Ltd on behalf of The Society of Manufacturing Engineers.
https://doi.org/10.1016/j.jmsy.2018.01.003
Please cite this article in press as: Wang J, et al. Deep learning for smart manufacturing: Methods and applications. J Manuf Syst (2018),
https://doi.org/10.1016/j.jmsy.2018.01.003
data mining techniques are classified into five categories, including characterization and description, association, classification, prediction, clustering and evolution analysis. The barriers to data-driven decision making in manufacturing are also identified. Typical machine learning techniques are reviewed in [15,16] for intelligent manufacturing, and their strengths and weaknesses are also discussed across a wide range of manufacturing applications. A comparative study of machine learning algorithms including Artificial Neural Network, Support Vector Machine, and Random Forest is performed for machining tool wear prediction. The schemes, techniques and paradigms of developing decision making support systems are reviewed for the monitoring of machining operations, and these techniques include neural networks, fuzzy logic, genetic algorithms, and hybrid systems [17,18]. The potential benefits and successful application examples of typical machine learning techniques including Bayesian Networks, instance-based learning, Artificial Neural Network, and ensemble methods are discussed in [19]. Cloud-enabled prognosis techniques, including data-driven, physics-based and model-based approaches, are reviewed in [20], with benefits from both advanced computing capability and information sharing for intelligent decision making.

Traditional machine learning is usually designed with shallow structures, such as Artificial Neural Network, Support Vector Machine, and logistic regression. Working with limited handcrafted features, it achieves decent performance in a variety of applications. However, the massive data in smart manufacturing imposes a variety of challenges [18,19], such as the proliferation of multimodal data, high dimensionality of the feature space, and multicollinearity among data measurements. These challenges leave traditional algorithms struggling and thus greatly impede their performance.

As a breakthrough in artificial intelligence, deep learning demonstrates outstanding performance in various applications including speech recognition, image recognition, natural language processing (e.g. translation, understanding, text question answering), multimodal image-text processing, and games (e.g. AlphaGo). Deep learning allows automatic processing of data towards highly nonlinear and complex feature abstraction via a cascade of multiple layers, instead of handcrafting the optimum feature representation of the data with domain knowledge. With automatic feature learning and high-volume modelling capabilities, deep learning provides an advanced analytics tool for smart manufacturing in the big data era. It uses a cascade of layers of nonlinear processing to learn representations of data corresponding to different levels of abstraction. The hidden patterns underneath each other are then identified and predicted through end-to-end optimization. Deep learning offers great potential to boost data-driven manufacturing applications, especially in the big data era [17,21].

In light of the above challenges, this paper aims to provide a state-of-the-art review of deep learning techniques and their applications in smart manufacturing. Specifically, a deep learning enabled advanced analytics framework is proposed to meet the opportunistic needs of smart manufacturing. Typical deep learning models are briefly introduced, and their applications to manufacturing are outlined to highlight the latest advancements in relevant areas. The challenges and future trends of deep learning are discussed at the end.

The rest of the paper is organized as follows. First, data-driven artificial intelligence techniques are reviewed in Section 2, with the advantages of deep learning techniques outlined. Next, the challenges and opportunistic needs of deep learning in smart manufacturing are presented, and typical deep learning models are briefly discussed in Section 3. Then, the latest applications of deep learning techniques in the context of smart manufacturing are summarized in Section 4. Finally, the challenges as well as future trends of deep learning in smart manufacturing are discussed.

2. Overview of data driven intelligence

2.1. The evolution of data-driven artificial intelligence

Artificial intelligence is considered a fundamental way to possess intelligence, and was listed in first place in Gartner's Top 10 strategic technology trends in 2017 [22]. Artificial intelligence has experienced several lifecycles, from the infancy period (1940s), through the first upsurge period (1960s) and the second upsurge period (1980s), to the present third boom period (after the 2000s). The development trends and typical artificial intelligence models are summarized in Table 1.

The origin of the Artificial Neural Network dates back to the 1940s, when the MP model [23] and the Hebb rule [24] were proposed to discuss how neurons work in the human brain. At the workshops at Dartmouth College, significant artificial intelligence capabilities like playing chess games and solving simple logic problems were developed [24]. The pioneering work brought artificial intelligence to the first upsurge period (1960s). In 1956, a mathematical model named the Perceptron [25] was proposed to simulate the nervous system of human learning with linear optimization. Next, a network model called the Adaptive Linear Unit [26] was developed in 1959 and was successfully used in practical applications such as communication and weather forecasting. The limitations of early artificial intelligence were also criticized due to the difficulty in handling non-linear problems, such as XOR (or XNOR) classification [27].

With the development of the Hopfield network circuit [28], artificial intelligence stepped forward to the second upsurge (1980s). The Back Propagation (BP) algorithm was proposed in 1974 to solve non-linear problems in complex neural networks [29]. A random mech-
Fig. 2. Comparison between two techniques: a) traditional machine learning, b) deep learning.
Table 2
Comparison between traditional machine learning and deep learning.

Traditional machine learning. Feature engineering: explicit engineered features extracted with expert domain knowledge. Model construction: extracted features are used to construct a data-driven model, usually with shallow structures. Training: each module is trained step-by-step.

Deep learning. Feature engineering: features are learned by transforming data into abstract representations. Model construction: an end-to-end hierarchical model structure with nonlinear combination of multiple layers. Training: parameters are trained jointly.
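The contrast summarized in Table 2 can be sketched in code. The following is a minimal, illustrative Python sketch (the toy signal, the chosen statistics, and the random kernel are all assumptions for illustration, not the paper's method): handcrafted statistics feed a separately trained shallow model in the traditional pipeline, while a convolution-plus-pooling stage stands in for one learned layer of a deep model whose parameters would all be trained jointly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for a vibration sensor signal: a sine wave plus noise.
signal = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * rng.standard_normal(256)

# Traditional machine learning: features are engineered explicitly with
# domain knowledge, then handed to a separately trained shallow model.
def engineered_features(x):
    rms = np.sqrt(np.mean(x ** 2))            # root-mean-square energy
    peak = np.max(np.abs(x))                  # peak amplitude
    return np.array([rms, peak, peak / rms])  # crest factor, a classic condition indicator

features = engineered_features(signal)        # fixed, hand-designed representation

# Deep learning: representations are learned by the network itself. A single
# convolution with a (here random, in practice trained) kernel followed by
# max pooling mimics one layer; stacking such layers and training all
# parameters jointly yields increasingly abstract features end to end.
kernel = rng.standard_normal(8)
feature_map = np.convolve(signal, kernel, mode="same")
pooled = feature_map.reshape(32, 8).max(axis=1)   # max pooling over windows of 8

print(features.shape, pooled.shape)
```

In the traditional pipeline the three statistics are fixed once chosen; in the deep pipeline the kernel itself would be adjusted by backpropagation, which is the "features learned as abstract representations" row of Table 2.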
strong influence on the analysis results. On the other hand, the deep hierarchical structure in deep learning makes it easier to model nonlinear relationships using compositional functions, compared with the shallow structure, regarded as a generic function, in traditional machine learning. The superiority of deep networks has been proven mathematically in [47]. As the size and variety of datasets grow in the big data context, it becomes more difficult to create new, highly relevant features. In the context of the big data era in smart manufacturing, the ability to avoid feature engineering is regarded as a great advantage due to the challenges associated with this process.

3. Deep learning for smart manufacturing

With new technologies (e.g. IoT, big data) embraced in smart manufacturing, smart facilities focus on creating manufacturing intelligence that can have a positive impact across the entire organization. Manufacturing today is experiencing an unprecedented increase in available sensory data comprised of different formats, semantics, and structures. Sensory data is collected from different aspects across the manufacturing enterprise, including the product line, manufacturing equipment, manufacturing processes, labour activity, and environmental conditions. Data modelling and analysis are essential parts of smart manufacturing for handling the increasing high-volume data, as well as supporting real-time data processing [48].

From sensory data to manufacturing intelligence, deep learning has attracted much attention as a breakthrough in computational intelligence. By mining knowledge from aggregated data, deep learning techniques play a key role in automatically learning from data, identifying patterns, and making decisions, as shown in Fig. 3. Different levels of data analytics can be produced, including descriptive analytics, diagnostic analytics, predictive analytics, and prescriptive analytics. Descriptive analytics aims to summarize what happens by capturing the product's conditions, environment and operational parameters. When the product performance is reduced or equipment failure happens, diagnostic analytics examines the root cause and reports the reason it happens. Predictive analytics utilizes statistical models to make predictions about the possibility of future production or equipment degradation with available historical data. Prescriptive analytics goes beyond this by recommending one or more courses of action; measures can be identified to improve production outcomes or correct problems, showing the likely outcome of each decision.

With the advanced analytics provided by deep learning, manufacturing is transformed into highly optimized smart facilities. The benefits include reducing operating costs, keeping up with changing consumer demand, improving productivity, reducing downtime, gaining better visibility, and extracting more value from the operations for global competitiveness.

To date, various deep learning architectures have been developed, and the relevant research topics are fast-growing. To facilitate the investigation of manufacturing intelligence, several typical deep learning architectures are discussed, including Convolutional Neural Network, Restricted Boltzmann Machine, Auto Encoder, and Recurrent Neural Network, together with their variants. Their feature learning capability and model construction mechanisms are emphasized, since these models are the building blocks for constructing comprehensive and complex deep learning techniques.

3.1. Convolutional neural network

Convolutional Neural Network (CNN) is a multi-layer feed-forward artificial neural network that was first proposed for two-dimensional image processing [36]. It has recently also been investigated for one-dimensional sequential data analysis, including natural language processing and speech recognition [49]. In CNN, feature learning is achieved by alternating and stacking convolutional layers and pooling operations. The convolutional layers convolve the raw input data with multiple local kernel filters and generate invariant local features. The subsequent pooling layers extract the most significant features with a fixed-length
over sliding windows of the raw input data by pooling operations such as max pooling and average pooling. Max pooling selects the maximum value of one region of the feature map as the most significant feature. Average pooling calculates the mean value of one region and takes it as the pooling value of this region. Max pooling is well suited to extracting sparse features, while pooling over all samples may not be optimal.

After multi-layer feature learning, fully-connected layers convert a two-dimensional feature map into a one-dimensional vector and then feed it into a softmax function for model construction. By stacking convolutional layers, pooling layers, and fully-connected layers, a typical CNN is constructed as shown in Fig. 4. Gradient-based backpropagation is usually used to train the convolutional neural network by minimizing the mean squared error or cross-entropy loss function. CNN has advantageous properties including sparse interactions with local connectivity, parameter sharing with reduced parameter counts, and equivariant representation that is invariant to object locations.

3.2. Restricted Boltzmann machine and its variants

Restricted Boltzmann Machine (RBM) is a two-layer neural network consisting of a visible and a hidden layer. There exists a symmetric connection between visible and hidden units, but there are no connections between neurons within the same layer. It is an energy-based model in which the visible layer is used to input data while the hidden layer is used to extract features. All hidden nodes are assumed conditionally independent. The weights and offsets of these two layers are tuned over iterations in order to make the output of the visible layer approximate the original input. Finally, the hidden layers are regarded as different representations of the visible layer.

The parameters in the hidden layers are treated as features that characterize the input data, realizing data coding and dimension reduction. Then, supervised learning methods such as logistic regression, Naïve Bayes, BP Neural Network, and Support Vector Machine can be used to implement data classification and regression. RBM has the advantage of extracting the required features from training datasets automatically, which avoids local minima and has thus received growing attention. Using RBM as the basic learning module, different variant models have been developed [32].

Deep Belief Network (DBN): DBN is constructed by stacking multiple RBMs, where the output of the lth layer of hidden units is used as the input of the (l + 1)th layer of visible units. For DBN training, a fast greedy algorithm is usually used to initialize the network and
the parameters of this deep architecture are then fine-tuned by a contrastive wake-sleep algorithm [37]. A Bayesian Belief Network is applied to the area close to the visible layers, and RBMs are used for the area far from the visible layers. That is to say, the highest two layers are undirected and the other, lower layers are directed, as shown in Fig. 5.

Deep Boltzmann Machine (DBM): DBM can be regarded as deep-structured RBMs in which the hidden units are grouped into a hierarchy of layers. Full connections between two adjacent layers are enabled, but no connection is allowed within a layer or between non-neighbouring layers, as shown in Fig. 5. By stacking multiple RBMs, DBM can learn complex structures and construct high-level representations of the input data [42]. Compared to DBN, DBM is a fully undirected graphical model, while DBN is a mixed directed/undirected one. Accordingly, the DBM model is trained jointly, which is more computationally expensive; in contrast, DBN can be trained layer-wise, which is more efficient.

3.3. Auto encoder and its variants

Auto Encoder (AE) is an unsupervised learning algorithm that extracts features from input data without needing label information. It mainly consists of two parts, an encoder and a decoder, as shown in Fig. 6. The encoder can perform data compression, especially when dealing with input of high dimensionality, by mapping the input to a hidden layer [33]. The decoder can reconstruct an approximation of the input. Suppose the activation function is linear and there are fewer hidden units than the dimensionality of the input data; then the linear Auto Encoder is similar to principal component analysis (PCA). If the input data is highly nonlinear, more hidden layers are required to construct a deep Auto Encoder. Stochastic gradient descent (SGD) is often used to calculate the parameters and build the auto-encoder by minimizing the objective loss function in terms of the least-square loss or cross-entropy loss. Several variants of AE have been developed, as listed below:

1) Denoising Auto Encoder (DAE): DAE is an extended version of the basic Auto Encoder, trained to reconstruct stochastically corrupted input data by adding isotropic Gaussian noise to the input x, forcing the hidden layer to discover more robust features [43].
2) Sparse Auto Encoder (SAE): SAE keeps most of the hidden units' activations close to zero by imposing sparsity constraints on the hidden units, even when the number of hidden units is large [40,41].
3) Contractive Auto Encoder (CAE): In order to make the model resistant to small perturbations, CAE encourages learning more robust representations of the input x [50].

3.4. Recurrent neural network and its variants

Compared with traditional neural networks, Recurrent Neural Network (RNN) has the unique characteristic that the topological connections between its neurons form directed cycles for sequence data, as shown in Fig. 7. Thus, RNN is suitable for feature learning from sequence data. It allows information to persist in the hidden layers and captures previous states from a few time steps earlier. An update rule is applied in RNN to calculate the hidden states at different time steps. Taking the sequential input as a vector, the current hidden state is calculated from two parts through the same activation function (e.g. sigmoid or tanh). The first part is calculated from the current input, while the second part is obtained from the hidden state at the previous time step. Then, the target output can be calculated from the current hidden state through a softmax function. After processing the whole sequence, the hidden state is the learned representation
Fig. 7. An unrolled Recurrent Neural Network: the input layer (x1, x2, ..., xt) feeds hidden states (h0, h1, h2, ..., hn) computed by shared weights fW, and each hidden state produces an output (y1, y2, ..., yn) through g.
of the input sequential data, and a conventional multilayer perceptron (MLP) is added on top to map the obtained representation to targets.

Different from traditional neural networks, model training in RNN is performed by Backpropagation Through Time (BPTT). The RNN is first unrolled in time, and each unrolled time step is considered an additional layer. Then the backpropagation algorithm is applied to calculate the gradients. Due to the vanishing/exploding gradient problem when using BPTT for model training, RNN cannot capture long-term dependencies. In other words, RNN has difficulty dealing with long-term sequence data.

A variety of enhancements have been proposed to solve these problems, among which long short-term memory (LSTM) is widely investigated for its effectiveness [35]. The most important idea in LSTM is the cell state, which allows information to flow along with only linear interactions. Compared with the single recurrent structure of RNN, gates, including a forget gate layer, an input gate layer, and an output gate layer, are used in LSTM to control the cell state. This enables each recurrent unit to adaptively capture long-term dependencies at different time scales.

3.5. Model comparison

From the above illustration, it can be seen that CNN and RNN provide complex composition mechanisms for representation learning and model construction, while RBM and AE can be used for layer-by-layer pretraining of deep neural networks to characterize the input data. In these deep learning models, the top layers normally represent the targets. For classification, where the targets are discrete values, softmax layers are applied. For prediction with continuous targets, linear regression layers are added. According to their dependence on labelled data, DBN, AE and their variants perform unsupervised or semi-supervised learning, whereas CNN, RNN and their variants perform supervised learning. The pros and cons of these typical deep learning models are presented in Table 3.

Fortunately, a number of typical deep learning packages, including open source and commercial software, are publicly available, as summarized in Table 4. They facilitate the investigation of deep learning techniques in different manufacturing scenarios.

4. Applications to smart manufacturing

Computational intelligence is an essential part of smart manufacturing, enabling accurate insights for better decision making. Machine learning has been widely investigated in different stages of the manufacturing lifecycle covering concept, design [60], evaluation, production, operation, and sustainment [61], as shown in Fig. 8. The applications of data mining in manufacturing engineering are reviewed in [62], covering different categories of production processes, operations, fault detection, maintenance, decision support, and product quality improvement. The evolution and future of manufacturing are reviewed in [63,64], emphasizing the importance of data modelling and analysis in manufacturing intelligence. The application schemes of machine learning in manufacturing are identified and summarized in [65,66]. Smart manufacturing also requires prognostics and health management (PHM) capabilities to meet the current and future needs for efficient and reconfigurable production [67].

Deep learning, as an emerging technique, has recently been investigated for a wide range of manufacturing systems. To give an overview, the applications of state-of-the-art deep learning techniques in manufacturing are discussed in this study, especially in
Table 3
Comparison between different deep learning models.

CNN. Feature learning: abstracted features are learned by stacked convolutional and sampling layers. Pros: reduced parameter number; invariance to shift, scale and distortion. Cons: high computational complexity for hierarchical model training.

RBM. Feature learning: the hidden layer describes variable dependencies and connections between input or output layers as representative features. Pros: robust to ambiguous input; training labels are not required in the pre-training stage. Cons: time-consuming joint parameter optimization.

AE. Feature learning: unsupervised feature learning and data dimensionality reduction are achieved through encoding. Pros: irrelevance in the input is eliminated and meaningful information is preserved. Cons: errors propagate layer-by-layer, and sparse representations are not guaranteed.

RNN. Feature learning: temporal patterns stored in the recurrent neuron connections and distributed hidden states for time-series data. Pros: short-term information is retained and temporal correlations are captured in sequence data. Cons: difficult to train the model and preserve long-term dependencies.
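The RNN update rule described in Section 3.4 and summarized in the table above can be written out directly. The numpy sketch below is illustrative only: the layer sizes, the random (untrained) weights, and the toy input sequence are all assumptions; tanh serves as the shared activation and a softmax layer produces the target output.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out, n_steps = 4, 8, 3, 5   # illustrative sizes

# The same weights are applied at every time step (the recurrent structure).
W_xh = 0.1 * rng.standard_normal((n_hidden, n_in))      # input -> hidden
W_hh = 0.1 * rng.standard_normal((n_hidden, n_hidden))  # hidden -> hidden
W_hy = 0.1 * rng.standard_normal((n_out, n_hidden))     # hidden -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x_seq = rng.standard_normal((n_steps, n_in))  # a toy input sequence
h = np.zeros(n_hidden)                        # initial hidden state h0

# Update rule: the new hidden state combines the current input with the
# hidden state of the previous time step through the same activation.
for x_t in x_seq:
    h = np.tanh(W_xh @ x_t + W_hh @ h)

# After processing the whole sequence, h is the learned representation of
# the input; the target output is obtained through a softmax layer.
y = softmax(W_hy @ h)
print(y, y.sum())
```

Unrolling this loop over time, with each iteration treated as an additional layer, is exactly the view used by BPTT for training; the repeated multiplication by W_hh in the unrolled graph is also where the vanishing/exploding gradient problem noted in Section 3.4 arises.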
Table 4
A list of deep learning tools.
Table 5
A list of deep learning models with applications.
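As a concrete complement to the model listings above, the Auto Encoder of Section 3.3 can be illustrated in a few lines. The sketch below is a toy example under stated assumptions (illustrative sizes, a made-up dataset, tied encoder/decoder weights, full-batch gradient descent rather than true SGD): a linear Auto Encoder with fewer hidden units than input dimensions is trained on the least-square reconstruction loss, the regime in which Section 3.3 notes it behaves like PCA.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 100 samples in 10 dimensions lying in a 3-dimensional subspace,
# so a 3-unit hidden layer (the code) can reconstruct them well.
basis = rng.standard_normal((3, 10))
data = rng.standard_normal((100, 3)) @ basis

# Tied weights: the encoder multiplies by W, the decoder by W.T.
W = 0.1 * rng.standard_normal((10, 3))

def loss(W):
    recon = (data @ W) @ W.T                 # encode, then decode
    return np.mean((recon - data) ** 2)      # least-square reconstruction loss

before = loss(W)
lr = 0.002
for _ in range(300):                         # plain full-batch gradient descent;
    code = data @ W                          # SGD would sample mini-batches instead
    err = code @ W.T - data
    grad = 2 * (err.T @ code + data.T @ err @ W) / data.size
    W -= lr * grad

print(before, loss(W))                       # reconstruction error decreases
```

The learned columns of W end up spanning (approximately) the top principal subspace of the data; adding noise to the input before encoding, while still reconstructing the clean data, would turn this into the Denoising Auto Encoder variant of Section 3.3.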
investigated to learn high-level generic features and applied to a wide range of textured or difficult-to-detect defect cases.

Convolutional Neural Network, originally designed for image analysis, is well suited for automated defect identification in surface integration inspection. In [72], a Deep Convolutional Neural Network architecture is designed and its hyper-parameters are optimized based on backpropagation and stochastic gradient descent algorithms. A max-pooling Convolutional Neural Network is presented in [73] to perform feature extraction directly from the pixel representation of steel defect images, and it shows lower error rates compared with the multi-layer perceptron and support vector machine. Image analysis with a convolutional neural network is studied in [74] to automatically inspect dirt, scratches, burrs, and wear on surface parts. The experimental results show that CNN works properly with different types of defects on textured and non-textured surfaces. A generic CNN-based approach is proposed in [75] to extract patch features and predict defect areas via thresholding and segmenting. The results show that the pretrained CNN model works well on small datasets, with improved accuracy for automated surface inspection systems.

Extreme learning machine based Auto Encoder is investigated in [98] to learn feature representations for wind turbine fault classification. In [99], a continuous sparse Auto Encoder is presented that adds a Gaussian stochastic unit into the activation function to extract nonlinear features of the input data. To improve diagnostic analytics, comprehensive deep learning models have also been developed. In [100], a sparse-filtering-based deep neural network model is investigated for unsupervised feature learning. Synthesized deep learning models are discussed in [101,102] for signal denoising and fused feature extraction. Different deep learning models, including Deep Boltzmann Machine, Deep Belief Network, and Stacked Auto Encoder, with different preprocessing schemes, are comparatively studied in [103] for rolling bearing fault diagnosis. The results show that the Stacked Auto Encoder performs best. From the above literature, it is concluded that deep learning models outperform traditional machine learning techniques with engineered features, such as support vector machine and BP Neural Network, in terms of classification accuracy.

4.3. Predictive analytics for defect prognosis
to do with the data they have, and they lack the software and modelling to interpret and analyse them. On the other hand, manufacturers need practical guidance to improve their processes and products, while academics develop up-to-date artificial intelligence models without considering how they will be applied in practice. As the manufacturing process becomes more complex, it becomes harder to clean the data and formulate the right problems to model. Five gaps are identified in smart manufacturing innovation, including adopted strategies; improved data collection, use and sharing; predictive model design; generalized predictive models; and connected factories and control processes [48].

To meet the high demand for advanced analytics in smart manufacturing, deep learning, with its feature learning and deep networks, offers great potential and shows advantageous properties. To handle overwhelming data characterized by high volume, high velocity and high variety, there are still challenges for the manufacturing industry in adopting, implementing, and deploying deep learning for real-world applications. To address these challenges, the future development trends of deep learning for smart manufacturing are discussed in terms of data matters, model selection, model visualization, generic models, and incremental learning.

5.3. Model visualization

The analytics solutions of deep learning need to be understood by manufacturing engineers; otherwise, the generated recommendations and decisions may be ignored. Due to the underlying model complexity, a deep neural network is usually regarded as a black-box model. It is hard to explain its internal computation mechanism or to interpret the abstract feature representations physically. Visualization of the learned features and model architecture may offer some insights, and thus facilitate the construction and configuration of deep neural network models for complex problems. On the other hand, features engineered with domain expertise have demonstrated their effectiveness. Complementing the abstract features with engineered features through visualization and fusion may contribute to a more effective model. Some visualization techniques have been proposed, including the t-SNE model [114] for high-dimensional data visualization, and visualization of the activations produced by each layer of a deep neural network via regularized optimization [115].

5.4. Generic model
insightful information, deep learning gives decision-makers new visibility into their operations, as well as real-time performance measures and costs. To facilitate advanced analytics, a comprehensive overview of deep learning techniques is presented along with their applications to smart manufacturing. Four typical deep learning models, including the Convolutional Neural Network, Restricted Boltzmann Machine, Auto Encoder, and Recurrent Neural Network, are discussed in detail. The emerging research effort of deep learning in manufacturing applications is also summarized. Despite the promising results reported so far, there are still some limitations and significant challenges for further exploration.

With the evolution of computing resources (e.g., cloud computing [119–124] and fog computing [125,126]), computational intelligence, including deep learning, may be pushed into the cloud, enabling more convenient and on-demand computing services for smart manufacturing.

Acknowledgments

This research acknowledges the financial support provided by the National Key Research and Development Program of China (No. 2016YFC0802103), the National Science Foundation of China (No. 51504274), and the Science Foundation of China University of Petroleum, Beijing (No. 2462014YJRC039).

References

[1] Putnik G, Sluga A, ElMaraghy H, Teti R, Koren Y, Tolio T, et al. Scalability in manufacturing systems design and operation: state-of-the-art and future developments roadmap. CIRP Ann Manuf Technol 2013;62(2):751–74.
[2] Lee YT, Kumaraguru S, Jain S, Hatim Q, Robinson S, Helu M, et al. A classification scheme for smart manufacturing systems' performance metrics. Smart Sustain Manuf Syst 2017;1(1):52–74.
[3] Hu T, Li P, Zhang C, Liu R. Design and application of a real-time industrial Ethernet protocol under Linux using RTAI. Int J Comput Integr Manuf 2013;26(5):429–39.
[4] Ye Y, Hu T, Zhang C, Luo W. Design and development of a CNC machining process knowledge base using cloud technology. Int J Adv Manuf Technol 2016:1–13.
[5] Tao F, Qi Q. New IT driven service-oriented smart manufacturing: framework and characteristics. IEEE Trans Syst Man Cybern Syst 2017;99:1–11.
[6] Ang J, Goh C, Saldivar A, Li Y. Energy-efficient through-life smart design, manufacturing and operation of ships in an industry 4.0 environment. Energies 2017;10(5):610.
[7] Huang Z, Hu T, Peng C, Hou M, Zhang C. Research and development of industrial real-time Ethernet performance testing system used for CNC system. Int J Adv Manuf Technol 2016;83(5–8):1199–207.
[8] Lalanda P, Morand D, Chollet S. Autonomic mediation middleware for smart manufacturing. IEEE Internet Comput 2017;21(1):32–9.
[9] Smart Manufacturing Coalition. Manufacturing growth continues despite uncertain economy, according to ASQ outlook survey; 2013. https://smartmanufacturingcoalition.org/sites/default/files/12.16.13 manufacturing outlook survey.pdf. [Accessed 10 September 2017].
[10] Wang L, Törngren M, Onori M. Current status and advancement of cyber-physical systems in manufacturing. J Manuf Syst 2015;37:517–27.
[11] Wang P, Gao RX, Fan Z. Cloud computing for cloud manufacturing: benefits and limitations. J Manuf Sci Eng 2015;137:1–10.
[12] Lu Y, Xu X, Xu J. Development of a hybrid manufacturing cloud. J Manuf Syst 2014;33(4):551–66.
[13] Wu D, Rosen DW, Schaefer D. Cloud-based design and manufacturing: status and promise. Comput Aided Des 2015;59:1–14.
[14] Choudhary AK, Harding JA, Tiwari MK. Data mining in manufacturing: a review based on the kind of knowledge. J Intell Manuf 2009;20(5):501–21.
[15] Lade P, Ghosh R, Srinivasan S. Manufacturing analytics and industrial internet of things. IEEE Intell Syst 2017;32(3):74–9.
[16] Monostori L, Márkus A, Brussel HV, Westkämpfer E. Machine learning approaches to manufacturing. CIRP Ann Manuf Technol 1996;45(2):675–712.
[17] Teti R, Jemielniak K, O'Donnell G, Dornfeld D. Advanced monitoring of machining operations. CIRP Ann Manuf Technol 2010;59(2):717–39.
[18] Helu M, Libes D, Lubell J, Lyons K, Morris K. Enabling smart manufacturing technologies for decision-making support. Proceedings of the ASME international design engineering technical conferences and computers and information in engineering conference (IDETC/CIE) 2016:1–10.
[19] Wuest T, Weimer D, Irgens C, Thoben KD. Machine learning in manufacturing: advantages, challenges, and applications. Prod Manuf Res 2016;4(1):23–45.
[20] Gao R, Wang L, Teti R, Dornfeld D, Kumara S, Helu M, et al. Cloud-enabled prognosis for manufacturing. CIRP Ann Manuf Technol 2015;64(2):749–72.
[21] Wu D, Jennings C, Terpenny J, Gao RX, Kumara S. A comparative study on machine learning algorithms for smart manufacturing: tool wear prediction using random forests. J Manuf Sci Eng 2017;139(7):1–10.
[22] Gartner's top 10 strategic technology trends for 2017; 2017. http://www.gartner.com/smarterwithgartner/gartners-top-10-technology-trends-2017/. [Accessed 13 August 2017].
[23] McCulloch WS, Pitts WH. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 1943;5(4):115–33.
[24] Samuel AL. Some studies in machine learning using the game of checkers II—recent progress. Annu Rev Autom Program 2010;44(1–2):206–26.
[25] Rosenblatt F. Perceptron simulation experiments. Proc IRE 1960;48(3):301–9.
[26] Widrow B, Hoff ME. Adaptive switching circuits. Cambridge: MIT Press; 1960.
[27] Minsky M, Papert S. Perceptrons. Am J Psychol 1988;84(3):449–52.
[28] Tank DW, Hopfield JJ. Neural computation by concentrating information in time. Proc Natl Acad Sci USA 1987;84(7):1896.
[29] Werbos PJ. Backpropagation through time: what it does and how to do it. Proc IEEE 1990;78(10):1550–60.
[30] Sussmann HJ. Learning algorithms for Boltzmann machines. 27th IEEE conference on decision and control 1988;1:786–91.
[31] Vapnik VN. An overview of statistical learning theory. IEEE Trans Neural Netw 1998;10(5):988–99.
[32] Smolensky P. Information processing in dynamical systems: foundations of harmony theory. Parallel distributed processing: explorations in the microstructure of cognition. Cambridge: MIT Press; 1986.
[33] Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986;323(6088):533–6.
[34] El Hihi S, Bengio Y. Hierarchical recurrent neural networks for long-term dependencies. Adv Neural Inf Process Syst 1995;8:493–9.
[35] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80.
[36] LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE 1998;86(11):2278–324.
[37] Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science 2006;313(5786):504–7.
[38] Hinton GE, Osindero S, Teh YW. A fast learning algorithm for deep belief nets. Neural Comput 2006;18(7):1527–54.
[39] Deng L, Seltzer M, Yu D, Acero A, Mohamed A, Hinton GE. Binary coding of speech spectrograms using a deep auto-encoder. Proceedings of 11th annual conference of the international speech communication association 2010;3:1692–5.
[40] Schölkopf B, Platt J, Hofmann T. Efficient learning of sparse representations with an energy-based model. Proceedings of advances in neural information processing systems 2006:1137–44.
[41] Ranzato MA, Boureau YL, LeCun Y. Sparse feature learning for deep belief networks. Proceedings of international conference on neural information processing systems 2007;20:1185–92.
[42] Salakhutdinov RR, Hinton GE. Deep Boltzmann machines. J Mach Learn Res 2009;5(2):1967–2006.
[43] Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 2010;11(12):3371–408.
[44] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. International conference on neural information processing systems 2012;25:1097–105.
[45] Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. Int Conf Neural Inf Process Syst 2014;3:2672–80.
[46] Wang Y, Huang M, Zhao L, Zhu X. Attention-based LSTM for aspect-level sentiment classification. Proceedings of conference on empirical methods in natural language processing 2016:606–15.
[47] Poggio T, Smale S. The mathematics of learning: dealing with data. Not Am Math Soc 2003;50(5):537–44.
[48] Kusiak A. Smart manufacturing must embrace big data. Nature 2017;544(7648):23–5.
[49] Ince T, Kiranyaz S, Eren L, Askar M, Gabbouj M. Real-time motor fault detection by 1-D convolutional neural networks. IEEE Trans Ind Electron 2016;63(11):7067–75.
[50] Hassanzadeh A, Kaarna A, Kauranne T. Unsupervised multi-manifold classification of hyperspectral remote sensing images with contractive autoencoder. Neurocomputing 2017;257:67–78.
[51] Caffe2. https://caffe2.ai/. 2017 [Accessed 20 October 2017].
[52] Theano. http://deeplearning.net/software/theano/index.html#. 2017 [Accessed 20 October 2017].
[53] Google TensorFlow. https://www.tensorflow.org/. 2017 [Accessed 20 October 2017].
[54] PyTorch. http://pytorch.org/. 2017 [Accessed 20 October 2017].
[55] Microsoft Cognitive Toolkit. https://www.microsoft.com/en-us/cognitive-toolkit. 2017 [Accessed 20 October 2017].
[56] Google. Google cloud machine learning. https://cloud.google.com/products/machine-learning/. 2017 [Accessed 20 October 2017].
[57] Amazon Web Services. Amazon AI. https://aws.amazon.com/amazon-ai/. 2017 [Accessed 20 October 2017].
[58] Microsoft. Microsoft Azure; 2017. https://azure.microsoft.com/en-us/services/machine-learning-studio/. [Accessed 20 October 2017].
[59] IBM. IBM Watson ecosystem program; 2017. http://m.ibm.com/http/www-03.ibm.com/innovation/us/watson/. [Accessed 20 October 2017].
[60] Zhang W, Jia MP, Zhu L, Yan X. Comprehensive overview on computational intelligence techniques for machinery condition monitoring and fault diagnosis. Chin J Mech Eng 2017;30(4):1–14.
[61] Lee J, Lapira E, Bagheri B, Kao H. Recent advances and trends in predictive manufacturing systems in big data environment. Manuf Lett 2013;1(1):38–41.
[62] Harding JA, Shahbaz M, Srinivas, Kusiak A. Data mining in manufacturing: a review. J Manuf Sci Eng 2006;128:969–76.
[63] Esmaeilian B, Behdad S, Wang B. The evolution and future of manufacturing: a review. J Manuf Syst 2016;39:79–100.
[64] Kang HS, Ju YL, Choi SS, Kim H, Park JH. Smart manufacturing: past research, present findings, and future directions. Int J Precision Eng Manuf Green Technol 2016;3(1):111–28.
[65] Hazen BT, Boone CA, Ezell JD, Jones-Farmer LA. Data quality for data science, predictive analytics, and big data in supply chain management: an introduction to the problem and suggestions for research and applications. Int J Prod Econ 2014;154(4):72–80.
[66] Shin SJ, Woo J, Rachuri S. Predictive analytics model for power consumption in manufacturing. Procedia CIRP 2014;15:153–8.
[67] Vogl GW, Weiss BA, Helu M. A review of diagnostic and prognostic capabilities and best practice for manufacturing. J Intell Manuf 2016:1–17.
[68] Xie X. A review of recent advances in surface defect detection using texture analysis techniques. ELCVIA Electron Lett Comput Vision Image Anal 2008;7(3):1–22.
[69] Neogi N, Mohanta DK, Dutta PK. Review of vision-based steel surface inspection systems. EURASIP J Image Video Process 2014;1:1–19.
[70] Pernkopf F, O'Leary P. Visual inspection of machined metallic high-precision surfaces. EURASIP J Adv Signal Process 2002;7:667–8.
[71] Scholz-Reiter B, Weimer D, Thamer H. Automated surface inspection of cold-formed micro-parts. CIRP Ann Manuf Technol 2012;61(1):531–4.
[72] Weimer D, Scholz-Reiter B, Shpitalni M. Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. CIRP Ann Manuf Technol 2016;65(1):417–20.
[73] Ren R, Hung T, Tan KC. A generic deep-learning-based approach for automated surface inspection. IEEE Trans Cybern 2017;99:1–12.
[74] Masci J, Meier U, Ciresan D, Schmidhuber J, Fricout G, Mittal A. Steel defect classification with max-pooling convolutional neural networks. IEEE international joint conference on neural networks (IJCNN) 2012;20:1–6.
[75] Park JK, Kwon BK, Park JH, Kang DJ. Machine learning-based imaging system for surface defect inspection. Int J Precision Eng Manuf Green Technol 2016;3(3):303–10.
[76] Zhao R, Yan R, Chen Z, Chen Z, Mao K, Wang P, et al. Deep learning and its applications to machine health monitoring: a survey; 2016. https://arxiv.org/pdf/1612.07640.pdf. [Accessed 20 October 2017].
[77] Janssens O, Slavkovikj V, Vervisch B, Stockman K, Loccufier M, Verstockt S, et al. Convolutional neural network based fault detection for rotating machinery. J Sound Vib 2016;377:331–45.
[78] Lu C, Wang Z, Zhou B. Intelligent fault diagnosis of rolling bearing using hierarchical convolutional network based health state classification. Adv Eng Inf 2017;32:139–51.
[79] Guo X, Chen L, Shen C. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016;93:490–502.
[80] Verstraete D, Droguett E, Meruane V, Modarres M, Ferrada A. Deep learning enabled fault diagnosis using time-frequency image analysis of rolling element bearings. Shock Vib 2017:1–29.
[81] Chen ZQ, Li C, Sanchez RV. Gearbox fault identification and classification with convolutional neural networks. Shock Vib 2015;2:1–10.
[82] Wang P, Ananya, Yan R, Gao RX. Virtualization and deep recognition for system fault classification. J Manuf Syst 2017;44:310–6.
[83] Dong H, Yang L, Li H. Small fault diagnosis of front-end speed controlled wind generator based on deep learning. WSEAS Trans Circuits Syst 2016;15:64–72.
[84] Wang J, Zhuang J, Duan L, Cheng W. A multi-scale convolution neural network for featureless fault diagnosis. Proceedings of 2016 international symposium on flexible automation 2016:65–70.
[85] Tamilselvan P, Wang P. Failure diagnosis using deep belief learning based health state classification. Reliab Eng Syst Saf 2013;115(7):124–35.
[86] Yu H, Khan F, Garaniya V. Nonlinear Gaussian belief network based fault diagnosis for industrial processes. J Process Control 2015;35:178–200.
[87] Tran VT, Althobiani F, Ball A. An approach to fault diagnosis of reciprocating compressor valves using Teager–Kaiser energy operator and deep belief networks. Expert Syst Appl 2014;41(9):4113–22.
[88] Shao H, Jiang H, Zhang X, Niu M. Rolling bearing fault diagnosis using an optimization deep belief network. Meas Sci Technol 2015;26(11):1–17.
[89] Gan M, Wang C, Zhu C. Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech Syst Signal Process 2016;72–73(2):92–104.
[90] Yin J, Zhao W. Fault diagnosis network design for vehicle on-board equipments of high speed railway: a deep learning approach. Eng Appl Artif Intell 2016;56:250–9.
[91] Xie H, Yang Y, Wang H, Li T, Jin W. Fault diagnosis in high-speed train running gears with improved deep belief networks. J Comput Inf Syst 2015;11(24):7723–30.
[92] Li C, Sanchez RV, Zurita G, Cerrada M, Cabrera D, Vasquez RE. Gearbox fault diagnosis based on deep random forest fusion of acoustic and vibratory signals. Mech Syst Signal Process 2016;76–77:283–93.
[93] Jia F, Lei Y, Lin J, Zhou X, Lu N. Deep neural networks: a promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mech Syst Signal Process 2016;72–73:303–15.
[94] Guo J, Xie X, Bie R, Sun L. Structural health monitoring by using a sparse coding-based deep learning algorithm with wireless sensor networks. Pers Ubiquit Comput 2014;18:1977–87.
[95] Lu C, Wang Z, Qin W, Ma J. Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification. Signal Process 2017;130:377–88.
[96] Shao H, Jiang H, Wang F, Zhao H. An enhancement deep feature fusion method for rotating machinery fault diagnosis. Knowl Based Syst 2017;119:200–20.
[97] Sun W, Shao S, Zhao R, Yan R, Zhang X, Chen X. A sparse auto-encoder-based deep neural network approach for induction motor faults classification. Measurement 2016;89:171–8.
[98] Yang Z, Wang X, Zhong J. Representational learning for fault diagnosis of wind turbine equipment: a multi-layered extreme learning machines approach. Energies 2016;9(379):1–17.
[99] Wang L, Zhao X, Pei J, Tang G. Transformer fault diagnosis using continuous sparse autoencoder. SpringerPlus 2016;5(448):1–13.
[100] Lei Y, Jia F, Lin J, Xing S, Ding SX. An intelligent fault diagnosis method using unsupervised feature learning towards mechanical big data. IEEE Trans Ind Electron 2016;63(5):3137–47.
[101] Li C, Sanchez RV, Zurita G, Cerrada M, Cabrera D, Vasquez RE. Multimodal deep support vector classification with homologous features and its application to gearbox fault diagnosis. Neurocomputing 2015;168:119–27.
[102] Guo X, Shen C, Chen L. Deep fault recognizer: an integrated model to denoise and extract features for fault diagnosis in rotating machinery. Appl Sci 2017;7(41):1–17.
[103] Chen Z, Deng S, Chen X, Li C, Sanchez RV, Qin H. Deep neural network-based rolling bearing fault diagnosis. Microelectron Reliab 2017;75:327–33.
[104] Malhi A, Yan R, Gao RX. Prognosis of defect propagation based on recurrent neural networks. IEEE Trans Instrum Meas 2011;60(3):703–11.
[105] Zhao R, Wang D, Yan R, Mao K, Shen F, Wang J. Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Trans Ind Electron 2018;65(2):1539–48.
[106] Zhao R, Yan R, Wang J, Mao K. Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors 2017;17(273):1–18.
[107] Wu Y, Yuan M, Dong S, Lin L, Liu Y. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 2017;226(5):853–60.
[108] Malhotra P, Vig L, Shroff G, Agarwal P. Long short term memory networks for anomaly detection in time series. In: Proceedings of European symposium on artificial neural networks, computational intelligence, and machine learning. 2015. p. 89–94.
[109] Wang P, Gao RX, Yan R. A deep learning-based approach to material removal rate prediction in polishing. CIRP Ann Manuf Technol 2017;66:429–32.
[110] Deutsch J, He M, He D. Remaining useful life prediction of hybrid ceramic bearings using an integrated deep learning and particle filter approach. Appl Sci 2017;7(649):1–17.
[111] Qiu X, Zhang L, Ren Y, Suganthan PN, Amaratunga G. Ensemble deep learning for regression and time series forecasting. IEEE symposium series on computational intelligence 2014:1–6.
[112] Zhang W, Duan P, Yang LT, Xia F, Li Z, Lu Q, et al. Resource requests prediction in the cloud computing environment with a deep belief network. Software Pract Exp 2017;47(3):473–88.
[113] Khan SH, Hayat M, Bennamoun M, Sohel FA, Togneri R. Cost-sensitive learning of deep feature representations from imbalanced data. IEEE Trans Neural Netw Learn Syst 2017;99:1–15.
[114] Maaten LVD, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008;9(2605):2579–605.
[115] Yu D, Yao K, Su H, Li G, Seide F. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition. IEEE international conference on acoustics, speech and signal processing 2013:7893–7.
[116] Vig E, Dorr M, Cox D. Large-scale optimization of hierarchical features for saliency prediction in natural images. IEEE computer vision and pattern recognition 2014:2798–805.
[117] Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowl Data Eng 2010;22(10):1345–59.
[118] Dziugaite GK, Roy DM, Ghahramani Z. Training generative neural networks via maximum mean discrepancy optimization. Proceedings of the 31st conference on uncertainty in artificial intelligence 2015:258–67.
[119] Mell P, Grance T. The NIST definition of cloud computing. Commun ACM 2009;53(6):50.
[120] Davis J, Edgar T, Porter J, Bernaden J, Sarli M. Smart manufacturing, manufacturing intelligence and demand-dynamic performance. Comput Chem Eng 2012;47(12):145–56.
[121] Lee J, Kao HA, Yang S. Service innovation and smart analytics for industry 4.0 and big data environment. Procedia CIRP 2014;16:3–8.
[122] Lee J, Lapira E, Bagheri B, Kao H. Recent advances and trends in predictive manufacturing systems in big data environment. Manuf Lett 2013;1(1):38–41.
[123] Chen CLP, Zhang CY. Data-intensive applications, challenges, techniques and technologies: a survey on Big Data. Inf Sci 2014;275(11):314–47.
[124] Meziane F, Vadera S, Kobbacy K, Proudlove N. Intelligent systems in manufacturing: current developments and future prospects. Integr Manuf Syst 2000;11(4):218–38.
[125] O'Donovan P, Leahy K, Bruton K, O'Sullivan D. Big data in manufacturing: a systematic mapping study. J Big Data 2015;2(1):1–22.
[126] Wu D, Liu S, Zhang L, Terpenny J, Gao RX, Kurfess T, et al. A fog computing-based framework for process monitoring and prognosis in cyber-manufacturing. J Manuf Syst 2017;43(1):25–34.