
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3088875, IEEE Internet of Things Journal

JOURNAL OF IEEE INTERNET OF THINGS CLASS FILES, VOL. 14, NO. 8, AUGUST 2021

A Survey of Recent Advances in Edge-Computing-Powered Artificial Intelligence of Things
Zhuoqing Chang, Shubo Liu, Xingxing Xiong, Zhaohui Cai, and Guoqing Tu

Abstract—The Internet of Things (IoT) has created a ubiquitously connected world powered by a multitude of wired and wireless sensors generating a variety of heterogeneous data over time in a myriad of fields and applications. To extract complete information from these data, advanced Artificial Intelligence (AI) technology, especially Deep Learning (DL), has proved successful in facilitating data analytics, future prediction and decision-making. The collective integration of AI and the IoT has greatly promoted the rapid development of Artificial Intelligence of Things (AIoT) systems that analyze and respond to external stimuli more intelligently without involvement by humans. However, it is challenging or infeasible to process massive amounts of data in the cloud due to the destructive impact of the volume, velocity, and veracity of data and fatal transmission latency on networking infrastructures. These critical challenges can be adequately addressed by introducing edge computing. This paper conducts an extensive survey of an end-edge-cloud orchestrated architecture for flexible AIoT systems. Specifically, it begins by articulating fundamental concepts, including the IoT, AI and edge computing. Guided by these concepts, it explores the general AIoT architecture, presents a practical AIoT example to illustrate how AI can be applied in real-world applications and summarizes promising AIoT applications. Then, the emerging technologies for AI models regarding inference and training at the edge of the network are reviewed. Finally, the open challenges and future directions in this promising area are outlined.

Index Terms—Internet of Things, Artificial Intelligence, Machine Learning, Deep Learning, Edge Computing.

I. INTRODUCTION

Benefitting from the extensive use of the internet and the rapid development of many wired and wireless connected devices, the Internet of Things (IoT) matures promptly and plays an increasingly significant role in every aspect of life by providing many crucial services, such as information exchange and monitoring [1]. With the extensive applications of the IoT, the total number of installed IoT-based devices is projected to reach approximately 41.6 billion, and nearly 79.4 Zettabytes (ZB) of data may be generated and consumed in 2025 [2]. Thus, some nonnegligible challenges that the IoT faces are explosive data generation and reliable data collection between heterogeneous devices in a wide range of applications covering various backgrounds and requests. Cloud computing plays a crucial role in IoT systems, where the vast resources available in the cloud can provide ubiquitous on-demand computing and storage capabilities to support these devices [3]. Additionally, these data may consist of multimedia information, from images, sounds, and videos to structured data (e.g., temperature and humidity). Advanced tools are needed to glean insights from a large volume of raw data. Facilitated by the recent achievements of algorithms, computing capabilities and big data processing necessities, Artificial Intelligence (AI), especially its essential sector of Deep Learning (DL), has achieved unprecedented success in data analysis, future prediction and decision-making [4]. Clearly, the Artificial Intelligence of Things (AIoT), an integrative technology combining both AI and the IoT, is starting to garner its share of the spotlight with the support of cloud centers. In the AIoT era, large amounts of data generated by IoT devices provide perfect opportunities for training AI models to reliably mine valuable data from a noisy and complex environment for intelligent analysis and decision-making.

The cloud-centric AIoT requires the massive amount of heterogeneous data collected from IoT sensors to be transmitted to the cloud center through a Wide-Area Network (WAN) for further processing and analysis before delivering the feedback to end devices [5]. Although the cloud center has an unlimited computational capacity, such a cloud-based AIoT architecture is ill-suited for time-critical and privacy-sensitive applications due to the great pressure on network bandwidth, the inherent latency constraints of network communication and the potential to expose private and sensitive information during data offloading and remote processing [6]–[8]. Edge computing seems to be a promising technique to remedy these issues, as it brings computational resources closer to the data source with a relatively light access burden and a low transmission delay [9]. It is extremely suitable for the AIoT because AI models, especially DL models, that depend greatly on computation and storage resources can still work fluently and cooperatively by partitioning their layers into several parts and offloading the computation-intensive tasks to edge servers [3]. Such a computing paradigm coupled with AI can assist users better and more intelligently, where AI models function as a powerful tool to mine valuable information from raw data, make real-time decisions [10] and dynamically manage various resources of the edge platforms [11], [12]. Rather than transmitting all the raw data to the cloud for overall analysis, edge computing-assisted AIoT solutions essentially enable AI

Manuscript received April 10, 2020; revised May 10, 2021; accepted June 10, 2021. Date of publication XXX XX, 2021; date of current version May XX, 2021. (Corresponding author: Shubo Liu.)

Z. Chang, S. Liu, X. Xiong and Z. Cai are with the School of Computer Science, Wuhan University, Wuhan 430072, China (e-mail: changzhuoqing@whu.edu.cn; liu.shubo@whu.edu.cn; xiong xx@whu.edu.cn; zhcai@whu.edu.cn).

G. Tu is with the School of National Cybersecurity, Wuhan University, Wuhan 430072, China (tugq2000@163.com).

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.

models to work in the field. These solutions can lighten the burden of data transmission through the network backhaul, further reduce the cost of network processing and maintenance and make timely decisions by positioning computational capabilities near end devices [13]. Additionally, these methods protect sensitive data from being abused by illegal operators or hijacked by attackers [14].

This paper investigates the convergence of AI and the IoT from the perspective of end-edge-cloud collaboration, where immediate and real-time responses are enabled by using AI capacities for processing raw data at end or edge devices and higher accuracy results are gained by using cloud analytics in collaboration. The AIoT can bring numerous benefits to human beings in a spectrum of domains and form a ubiquitous intelligent collaborative environment; however, significant challenges must be overcome before fully realizing the potential of the AIoT. Thus, this paper aims to present a comprehensive survey of recent advances, open challenges and future directions for the AIoT.

A. Related Work

Several published references present AI and Machine Learning (ML) or DL methods that have been used in the domain of the IoT. For example, reference [15] proposes that integrating the IoT with AI is much more than a way of making human life easier; additionally, this integration brings security concerns and ethical issues. Reference [16] reports different IoT-based ML mechanisms in healthcare, smart grids, and vehicular communications and describes the basic aim of ML applied to the IoT. To lower the latency and bandwidth of data transmission, edge computing is also investigated in the context of the IoT. Reference [7] demonstrates that AI can not only endow edges with greater intelligence and optimality but also help run AI models at edges. Reference [17] describes the potential of ML methods for traffic profiling, device identification, system security, IoT applications, edge computing and Software-Defined Networking (SDN). In addition to reviewing the characteristics of IoT data, state-of-the-art ML and DL methods and IoT applications using different Deep Neural Network (DNN) models, reference [18] reviews approaches and technologies for running DL models on resource-constrained devices and edge servers. Efforts to devise compression and acceleration techniques that help deploy DL algorithms on resource-constrained mobile and embedded devices are further differentiated and discussed in [19] to better satisfy the requirements of real-time applications and user privacy protection. A novel offloading strategy is essential to advance the implementation of DL-based applications in IoT systems aided by edge computing and the cloud [20]. References [14] and [21] review recent and in-depth research works that deal with AI inference and training at the network edge.

AI helps more effectively handle seemingly insurmountable resource allocation challenges in edge computing scenarios. Reference [22] proposes a forward central dynamic and available approach to managing the running time of sensing and transmission processes in AI-enabled IoT devices for industrial platforms. Reference [12] employs a novel deep reinforcement learning approach for intelligently orchestrating edge computing and caching resources to maximize the system utility for vehicular networks. Survey [23] focuses on the combination of AI and edge computing in the field of the Internet of Vehicles (IoV), where AI is used mainly for resource allocation, computational task scheduling, and vehicle trajectory prediction in dynamic environments.

Surprisingly little work has been done to put forward a general concept and an architecture for the AIoT. The existing works on IoT-based AI implementation rely on cloud platforms; however, this approach is unacceptable for delay-sensitive services. Edge computing extends cloud analytics to the network edge and covers its shortcomings. In contrast to other surveys, this paper explores the combination of the IoT with AI aided by edge computing and the cloud.

B. Contributions of This Survey

By referring to previous works, this paper provides a conceptual introduction to IoT, AI and edge computing technologies. The main goal is to conduct a comprehensive analysis of the potential of integrating the IoT and AI technologies with the assistance of edge computing coordinated by the cloud. This paper puts emphasis on seven representative AIoT application scenarios and is especially interested in techniques enabling efficient and effective deployment of AI models in an end-edge-cloud cooperation mode. The contributions of this paper can be summarized as follows:

1) An overview of fundamental technologies supporting the AIoT is given in terms of the general architecture of the IoT, state-of-the-art AI methods accompanied by key characteristics, and edge computing-related paradigms along with corresponding hardware and systems.

2) The confluence of AI and the IoT is the core of this paper. In this respect, the benefits of incorporating AI into IoT systems are first illustrated. An end-edge-cloud collaborative architecture of the AIoT is then proposed. A practical example of AIoT applications is additionally given to further illustrate how AI can be applied in real-world applications.

3) Some promising applications of the AIoT are surveyed in a variety of domains, such as the IoV, smart healthcare, smart industry, smart homes, smart agriculture, smart grids and the smart environment.

4) The recent approaches and technologies for performing AI inference on an AIoT hierarchy from resource-constrained end devices to edge servers and the cloud are summarized.

5) The enabling technologies for decentralized AI training among various end devices and edge servers of an AIoT hierarchy are discussed.

6) The open challenges and future directions for constructively and fruitfully merging AI and the IoT are outlined.

C. Outline of This Survey

The remainder of this paper is arranged as follows: Section II presents the fundamentals related to end-edge-cloud collaborative AIoT systems in terms of the IoT architecture, basic AI models with an emphasis on ML and DL models, and edge computing. Section III describes the benefits of the convergence of AI and the IoT, highlights the AIoT architecture and discusses an AIoT example to further illustrate how AI can be applied in real-world applications. AIoT applications in different domains (e.g., the IoV, smart healthcare, smart industry, smart homes, smart agriculture, smart grids and the smart environment) are surveyed in Section IV. Section V investigates the enabling technologies for AI inference from the end-edge-cloud orchestrated perspective, including on-device inference, coinference at the edge and private inference technologies. Section VI taxonomically summarizes the enabling technologies for distributed model training in the AIoT, focusing on methods of decentralized AI training at the network edge, model training updates and security enhancement. In Section VII, some open research challenges and future directions for the AIoT are envisioned to foster continuous research efforts. Finally, Section VIII concludes this paper.

II. FUNDAMENTALS OF ARTIFICIAL INTELLIGENCE OF THINGS

This section reviews the general architecture of the IoT in brief, then presents basic AI models with an emphasis on ML and DL models, and additionally gives an overview of edge computing in terms of related paradigms, the relationship with the Fifth Generation (5G) of mobile networks, and hardware and systems.

A. Introduction to the Internet of Things

With the aid of sensors, wired and wireless networks and cloud computing, the IoT achieves a comprehensive perception and a ubiquitously connected environment. Although there is no unified definition of the IoT, a typical IoT architecture is largely recognized, as shown in Fig. 1. Generally, the IoT architecture is composed of three layers, namely, the perception layer, network layer and application layer.

Fig. 1. The general architecture of the IoT.

1) The Perception Layer

The perception layer provides the core capability enabling the comprehensive awareness of the environment; this layer includes such diverse devices and technologies as sensors, actuators, Radio-Frequency Identification (RFID), two-dimensional codes, and multimedia information collection devices. These devices are used mainly to sense and collect physical data, and data are generally produced in trillions of bytes with a variety of attributes, including various physical quantities, identity signs, location information, and audio and video data. Additionally, these devices can respond to the environment.

2) The Network Layer

The network layer is the most standardized among the three layers of the IoT; through it, the devices in the perception layer can communicate with IoT gateways, Wireless-Fidelity (Wi-Fi) Access Points (APs) and Base Stations (BSs) to transmit data to other parts quickly, accurately and safely. This layer enables a device to communicate over short-range to long-range distances by using a variety of communication protocols over wired and wireless networks, including Bluetooth, ZigBee, Sigfox, Long Range Radio (LoRa) and Narrowband IoT (NB-IoT). The data generated in the perception layer need to be transmitted to the server via the network layer promptly and accurately.

3) The Application Layer

The application layer is equivalent to the control and decision-making layer of the IoT, in which a mass of polymorphic and heterogeneous data with rich semantics is analyzed. The application layer can provide a myriad of applications, such as industrial control, urban management, power monitoring, and green agriculture.

B. Basics of Artificial Intelligence

From Siri to self-driving vehicles, AI has developed rapidly, been adopted in a wide range of applications and corroborated its outstanding performance. Algorithms exert a crucial function in AI, where simple algorithms are applicable to simple applications, while more complex ones can build strong AI. Typically, Machine Learning (ML), a subset of AI, is the most mainstream method used by systems to automatically learn from data, identify patterns and make decisions from experience without human intervention or assistance. There is a variety of ML models, ranging from basic algorithms to highly complex ones, such as Support Vector Machines (SVMs), Decision Trees (DTs), k-means clustering and neural networks. DL, a unique branch of ML, is quite different from classic ML algorithms: it uses a hierarchical neural network to make the model more complex and enables automatic learning by absorbing a great deal of unstructured data, such as images, sound, text and video. Many DL models, as exemplified by Multilayer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs) and Generative Adversarial Networks (GANs), have been developed to train neural networks to make classifications and predictions. This subsection introduces several traditional ML algorithms and presents some typical neural network algorithms accompanied by their corresponding functions, including a distinctive Reinforcement Learning (RL) method.

1) Traditional Machine Learning

This section delivers a brief introduction to several traditional ML algorithms, such as SVMs, DTs and k-means clustering.
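As a concrete taste of one of these traditional algorithms, the following minimal pure-Python sketch implements k-means (Lloyd's algorithm), which repeatedly assigns each point to its nearest centroid and then recomputes each centroid as the mean of its cluster. The toy data set, function name and parameters are illustrative assumptions, not taken from the survey.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm for 2-D points: alternate (1) assigning each
    point to its nearest centroid and (2) moving each centroid to the
    mean of its assigned points, until convergence or `iters` rounds."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            i = min(range(k),
                    key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            clusters[i].append((x, y))
        # Update step: centroid = mean of its cluster (keep old one if empty).
        new = [(sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
               if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:  # assignments stable: converged
            break
        centroids = new
    return centroids

# Two well-separated toy groups; k = 2 recovers one centroid per group,
# near (0.1, 0.1) and (5.0, 5.0).
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
print([(round(cx, 1), round(cy, 1)) for cx, cy in sorted(kmeans(data, 2))])
```

The distance function here is squared Euclidean distance, matching the "certain distance function" mentioned later in the k-means discussion; for well-separated groups like these, the procedure converges in a handful of iterations regardless of which points are sampled as initial centroids.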


a) Support Vector Machine of parameters and the computation time by pooling operations
SVMs are a kind of generalized linear classifier that clas- to reduce the chance of overfitting [32]. The Rectified Linear
sifies data by supervised learning. SVMs can also perform Unit (ReLU) activation function can accelerate training time
nonlinear classification by using the kernel method. The without affecting the generalization of the network. These
learning policy of an SVM is to maximize the separation characteristics make CNNs excellent at analyzing not only
between data points, which is formalized as a convex quadratic images and speech signals but also structural data similar to
programming problem. SVMs have been successfully applied images [33], [34].
in pattern recognition problems, such as in text classification c) Recurrent Neural Network
[24] and image recognition [25]. RNNs are developed to handle sequential (speech or text)
b) Decision Tree or time-series (sensor data) inputs of different lengths and
DTs are classification algorithms that predict the labels can estimate energy consumption in smart grids and so on.
of data by iterating the input data via a learning tree, the As the typical structure of RNNs depicted in Fig. 2(c), the
main advantages of which are readability, a fast classification input of each neuron consists of information from the upper
speed and understandability. However, applications of DTs layer and information from its own previous channel [4]. The
are confined to linear separable data. DTs are constructed by neuron is also furnished with a feedback loop that provides the
selecting the partition method with the largest reduction in the current output for the next step as the input. RNNs are ideal
entropy of the data set. DTs can be used in classification [26], candidates to predict future information and restore missing
medical diagnoses [27], and so on. parts of sequential data [35], [36].
c) k-means Clustering d) Long Short-Term Memory
k-means cluster analysis algorithms are based on the it-
LSTMs are considered extensions of RNNs. In contrast to
erative solutions. With advantages of linear complexity and
RNNs, LSTM units employ a gate structure and a well-defined
simple realization, k-means cluster is employed to solve node
memory cell to actively control unit states, shown in Fig. 2(d),
clustering problems [28]. Given a set of data points and the
where gradient explosion is solved by controlling (prohibiting
required number of clusters k, where k is specified by the
or allowing) the flow of information. The gates exploit sigmoid
user, the k-means algorithm divides the data into k clusters
or tanh as their activation function. Vanishing gradients caused
repeatedly according to a certain distance function. k-means
by using these activation functions during the training of other
clustering performs well in grouping the same things from a
models do not take place in LSTMs because the computations
randomly distributed set of things, such as in object detection
stored in the memory cells are not distorted over time. LSTMs
[29].
outperform RNNs when dealing with data featuring a long
2) Neural Networks
dependency over time. LSTMs have been investigated for long
With various types of DNNs, DL models are capable of
dependencies in IoT applications, such as pedestrian trajectory
extracting accurate information from raw sensor data collected
prediction [37] and traffic flow prediction [38].
by IoT devices and make contributions to more complicated
tasks ranging from image classification and retrieval to natural e) Generative Adversarial Network
language processing. It can’t be ignored that DL models can GANs, originating from game theory, consist of two neural
still work in edge computing environment due to their mul- networks, namely, a generator and discriminator, as illustrated
tilayer structure. Therefore, this subsection briefly introduces in Fig. 2(e). The generator aims to generate new data after
some DNN models in terms of their network structures and learning the data distribution, while the discriminator decides
functions. whether the input data come from the real data or the gen-
a) Multilayer Perceptron erator. Both networks constantly optimize their abilities to
MLPs are a type of feedforward Artificial Neural Network generate and distinguish data in the adversarial process until
(ANN), as shown in Fig. 2(a). Adopting the Fully Connected a Nash equilibrium is found [39]. In IoT applications, GANs
Neural Network (FCNN) technique, the input of a cell is fed can be used to generate new data beyond the available data. In
forward to a hidden cell, then activated by cells belonging to multiresident activity recognition in smart homes, GANs can
the following layer, and is finally transmitted to the output be helpful in creating more data to train DL models [40].
cell. By calculating the error between the output layer and the 3) Reinforcement Learning
previous hidden layer, the error can be reduced by adjusting RL, another undeniably important DL method, allows vari-
the internal weights between each pair of layers. MLPs are ous software agents and machines to take suitable actions by
more accurate than other techniques in healthcare applications interacting with their environment and aimed at discovering
[30]. Moreover, MLP-based models for blind entity identifica- mistakes and maximizing rewards in a particular situation.
tion are explored to help estimate the wireless link quality of Generally, the interaction between the agent’s state and actions
a network, which are successfully implemented in agricultural through the environment is described as the Markov Decision
IoT [31]. Process (MDP). Deep Reinforcement Learning (DRL) is a
b) Convolutional Neural Network type of RL that combines DL and RL where DL uses DNNs
CNNs can capture the correlations between adjacent data to fit the Q-value function, thereby addressing the explosion
blocks and extract high-level features by using convolutional of state-action space challenges. DRL has wide utilization in
and pooling operations, as depicted in Fig. 2(b). Unlike recommendation [41], wireless communication [42], resource
FCNNs, CNNs can extract features while reducing the number allocation [43] and so on. There are two typical kinds of DRL

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3088875, IEEE Internet of
Things Journal
JOURNAL OF IEEE INTERNET OF THINGS CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 5

Fig. 2. Basic structures of typical DL.

on the score from the Critic, while the Critic learns a value
function and estimates the performance of the policy updated
by the Actor. Other strategy optimization methods include
the Deep Deterministic Policy Gradient (DDPG) [46] and
Proximate Policy Optimization (PPO) [47].

C. Overview of Edge Computing


Cloud computing refers to the concept of delivering differ-
ent services on demand by using networking, virtualization,
distributed computing, utility computing and software services.
Fig. 3. Architecture of typical DRL. Cloud computing is a service-oriented architecture that pro-
vides flexible services, reduces the information technology
overhead for end users, and the total cost of ownership [48].
However, the cloud center is usually built in a remote area
algorithms, namely, value-based DRL and policy-based DRL,
far from end users, thus possibly causing data transmission
as shown in Fig. 3.
delays. With a soaring number of IoT devices, the cloud can-
a) Value-based Deep Q-Learning not satisfy the requirements of latency-sensitive and privacy-
Deep Q-Learning (DQL) is a typical representative of value- critical applications [21]. The emergence of edge computing
based DRL, where DNNs are employed to fit the Q-values of is aimed at migrating computational tasks to edge devices
actions. Experience replay is introduced to fix the instability near sensors and actuators, which can alleviate the pressure of
problem of value functions represented by nonlinear network data transmission, reduce end-to-end latency, and thus enable
and nonstatic distribution problems. However, the max opera- real-time services. A diversity of devices can serve as edge
tion is employed in a deep Q-network to select and measure computing platforms: switches, cellular BSs or IoT gateways
actions, which may result in overestimating the action values. [9], which makes edge computing flexible and scalable de-
Double Deep Q-Learning (DDQL) decouples selection and ploy various services at anywhere between end-users and the
evaluation to address this problem [44]. cloud. Typically, edge computing is considered an extension
b) Policy-based Deep Q-Learning of the cloud platform and works independently and effectively
Policy-based DRL can handle a continuous action space in some scenarios or collaborates with the cloud platform.
with better convergence; in this method, the policy parameters This subsection reviews edge computing-related paradigms,
are updated by continuously calculating the gradient of the pol- the relationship between 5G and edge computing, and edge
icy expectation reward. In such situations, DNNs are employed computing hardware and systems.
to parameterize the policy and are then optimized by the 1) Edge Computing-Related Paradigms
policy gradient method. The Actor-Critic (AC) [45] framework Many emerging technologies have been proposed to work
is widely applied in policy gradient-based DRL, where the at the edge of the network; these technologies have similar
Actor selects actions and modifies the policy distribution based but distinct paradigms, including Mobile Cloud Computing

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3088875, IEEE Internet of
Things Journal
JOURNAL OF IEEE INTERNET OF THINGS CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 6

(MCC) [49], cloudlets [50], fog computing [51] and Multiaccess Edge Computing (MEC) [8]. The term "edge computing" represents this set of burgeoning technologies. Below, some typical concepts related to edge computing are discussed and differentiated.

a) Cloudlet
Cloudlets, proposed by Carnegie Mellon University, are envisioned as mobility-enhanced data centers with certain computation and storage capabilities located near mobile devices. The cloudlet is considered the middle layer of a three-tier architecture: mobile devices, the microcloud, and the cloud. Cloudlets feature computing, connectivity and security capabilities, proximity to mobile users, virtualization management technologies and implementation of related functionality based on standard cloud technologies [50], [52]. Compared with other paradigms, cloudlets are more similar to mobile clouds and closer to mobile devices, which makes them suitable for real-time and mobile scenarios.

b) Fog Computing
Fog computing, initially proposed by Cisco, is considered an effective extension of and supplement to traditional cloud computing that places resources and services (computation, storage, networking, and processing) on the path from end devices to the cloud [51]. Fog nodes, such as APs, gateways, and BSs, can be deployed on any infrastructure in a specific geographic area [53]. Fog computing provides real-time collaborative services with low latency for numerous interconnected IoT devices via distributed fog nodes [54]. Unlike cloudlets and MEC, fog computing focuses more on the IoT and the end side.

c) Mobile Edge Computing
Mobile Edge Computing (MEC), standardized by the European Telecommunications Standards Institute (ETSI), provides services by allocating computation and storage resources at the edge of cellular networks, e.g., at wireless BSs [9]. BSs are the major access gates for IoT devices, and MEC enables users to deploy services flexibly and directly within only one hop. MEC was later extended by ETSI to Multiaccess Edge Computing by supporting additional wireless communication technologies, such as Wi-Fi. MEC is extensively used in Autonomous Driving (AD) vehicles, wearable devices, and so on.

2) The Fifth Generation Mobile Networks and Edge Computing
The 5G technology standard is seen as the most promising wireless cellular network standard to cater to the requirements of next-generation networks. Many ultra-dense edge devices will be deployed in 5G systems, mainly including small cell BSs and wireless APs. These devices are often equipped with certain computing and storage abilities, thus enabling ubiquitous mobile computing. In contrast to the main goal of the First Generation (1G) to the Fourth Generation (4G), aimed at higher wireless speeds to support the transition from voice-centric to multimedia-centric traffic, the goal of 5G is to provide services for the explosive evolution of information communication technology and all kinds of interactive applications, with latency requirements of less than 10 ms, or even 1 ms in specific situations [55]. IoT devices with limited resources for computing, communication and storage have to execute all tasks with the aid of cloud centers. However, the ambitious millisecond-scale latency services in 5G cannot be realized with only the support of the cloud. The volume, velocity, and veracity of the data will heavily burden the network bandwidth and the central cloud. MEC, a viable solution, adopts a decentralized model that allocates computing and storage resources closer to end users, which is aligned with the concept of a next-generation network in which tasks are generated and executed locally [56]. Edge computing and 5G are closely bound to realize high-bandwidth, real-time interactive AIoT services: 5G enables mass connections, while edge computing provides low-latency services by placing computing and storage capabilities near the data source.

3) Edge Computing Hardware and Systems
This subsection introduces some hardware supporting the execution of AI algorithms on both end devices and edge nodes and presents edge-cloud systems for AIoT applications (summarized in Table I).

a) Hardware for Edge Devices
Edge devices designed for executing AI models can generally be classified by their technical architecture into four types. i) Application-Specific Integrated Circuit Chip: Application-Specific Integrated Circuits (ASICs) are integrated circuits that favor a specific application rather than general functions. Because of their small size, low power consumption and good security and performance, ASICs can meet the demands of edge-computing patterns for AI algorithms. The DianNao family [58], consisting of DianNao, DaDianNao, ShiDianNao, and PuDianNao, is a set of DNN accelerators that use efficient memory access to minimize latency and energy consumption. Unlike the other DianNao family accelerators, ShiDianNao [74] is an embedded device for image applications. Edge Tensor Processing Units (TPUs) [57] are Google's custom-developed design used to facilitate ML workloads. Edge TPUs achieve high performance within a small physical area and with low energy consumption. ii) Graphics Processing Unit-based Product: Graphics Processing Units (GPUs) rely on the inherent data parallelism of a program to improve the actual throughput, thereby achieving a higher speed than Central Processing Units (CPUs). This characteristic makes GPUs suitable for implementing AI algorithms. Thus, it is a good option to design an edge device equipped with a GPU. Jetson TX1, TX2 [75] and DRIVE PX2 [76], manufactured by NVIDIA, are embedded AI computing devices with a small size, low latency and high power efficiency. iii) Field-Programmable Gate Array-based Device: Field-Programmable Gate Arrays (FPGAs) are highly flexible programmable hardware with low energy consumption, parallel computing resources and high security, on which developers familiar with a hardware description language can quickly execute AI algorithms. However, FPGAs have worse compatibility and more limited programming capabilities than GPUs. Xilinx ZYNQ7000 is a commonly used FPGA-based AI accelerator [77]. iv) Brain-Inspired Chip: Brain-inspired chips adopt a neuromorphic architecture, implementing programmable neurons in silicon and processing highly sophisticated tasks


TABLE I
SUMMARY OF THE SELECTED EDGE DEVICES AND EDGE COMPUTING SYSTEMS

Category | Production | Owner | Features/Targets
Edge devices | TPU [57] | Google | Accelerates speed and reduces power consumption with a slight accuracy loss
Edge devices | DianNao family [58] | Cambrian | Hardware accelerators that minimize memory transfers
Edge devices | Turing GPUs [59] | NVIDIA Corporation | Processors that process massive data in parallel at high speed but with high energy consumption
Edge devices | 7 Series FPGA [60] | Xilinx | Achieves high-performance computing with low energy consumption and high flexibility
Edge devices | HiSilicon Ascend series [61] | Huawei | Scenarios extended from the data center to the edge and devices with a significantly improved energy efficiency ratio
Edge devices | Exynos 9820 [62] | Samsung | Executes AI tasks for the mobile future via a tricluster CPU
Edge devices | Xeon D-2100 [63] | Intel | Supports optimized solutions for space- and power-constrained operational networks, storage, and cloud edges
Edge devices | TrueNorth [64] | IBM | Completes all kinds of learning tasks quickly with extremely low power consumption
Edge computing systems | CORD [65] | The Open Network Foundation | Delivers a cloud-native, programmable platform to build an operational edge datacenter with built-in service capabilities for network operators
Edge computing systems | EdgeX [66] | Linux Foundation | Provides users with interoperable management for many sensors or devices across industrial fields
Edge computing systems | Akraino Edge Stack [67] | Linux Foundation | Integrated edge cloud platform based on an open-source software stack
Edge computing systems | Azure IoT Edge [68] | Microsoft | Supplies distributed edge management with zero-touch provisioning of IoT devices
Edge computing systems | AWS IoT Greengrass [69] | Amazon | Allows edge devices to perform local operations and intermittently communicate with the cloud and other devices
Edge computing systems | KubeEdge [70] | Huawei | Native support for collaboration between the cloud and the edge
Edge computing systems | OpenEdge [71] | Baidu | Shielded computing framework that simplifies application production and is deployed on demand
Edge computing systems | OpenVDAP [72] | Connected and Autonomous dRiving Lab | An open full-stack edge-based platform for real-field vehicular data analysis
Edge computing systems | VideoEdge [73] | Microsoft Research | A video stream analytics system that seeks the best tradeoff between multiple resources and accuracy
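The rationale shared by the edge paradigms and platforms surveyed above can be quantified with a back-of-the-envelope offloading model: a task of C cycles and D bits finishes locally in C/f_local seconds, while offloading costs D/B seconds of transmission plus C/f_edge seconds of edge computation. A minimal sketch, with all numbers illustrative assumptions rather than measurements from the survey:

```python
def offload_latency(task_cycles, task_bits, f_local_hz, f_edge_hz, uplink_bps):
    """Compare local execution against offloading to an edge server.

    Returns (local_s, offload_s, decision). The model deliberately ignores
    queuing delay, result download and energy -- it only captures the basic
    computation-versus-transmission trade-off behind MEC.
    """
    local_s = task_cycles / f_local_hz
    offload_s = task_bits / uplink_bps + task_cycles / f_edge_hz
    decision = "offload" if offload_s < local_s else "local"
    return local_s, offload_s, decision

# Illustrative numbers: a 2-gigacycle inference task with a 1 MB input,
# a 1 GHz end device, a 10 GHz edge server and a 100 Mbit/s uplink.
local_s, offload_s, decision = offload_latency(
    task_cycles=2e9, task_bits=8e6, f_local_hz=1e9,
    f_edge_hz=10e9, uplink_bps=100e6)
# local: 2.0 s; offload: 0.08 s transmission + 0.2 s compute = 0.28 s
```

Under these assumptions offloading wins by roughly a factor of seven, but the decision flips as soon as the uplink slows or the task's input grows, which is why edge platforms keep the computation close to the data source.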

by mimicking the human brain with synapses. Brain-inspired chips support dramatically accelerated processing of neural network applications in real time with extremely low energy consumption. IBM TrueNorth [78] and Intel Loihi [79] are typical neuromorphic processor chips that can be applied well to complex AI algorithms.

b) Edge Computing Systems
Techniques for edge computing systems are blooming. Researchers focus on constructing high-performance edge computing systems with a microservice architecture to fit complex configurations and meet the resource-constrained demands of AI algorithms. Azure IoT Edge [68], released by Microsoft, is a hybrid edge-cloud system that migrates applications from the cloud to the edge. EdgeX Foundry [66] and Apache Edgent focus on IoT edge applications and are deployed on a Local Area Network (LAN). EdgeX Foundry, launched by the Linux Foundation, concentrates on controlling many end devices, while Apache Edgent [80], published by the Apache Software Foundation, aims to accelerate data analysis. In addition, Central Office Re-architected as a Datacenter (CORD) [65] and Akraino Edge Stack [67] can be deployed in a telecom infrastructure to support mobile edge services, which makes them suitable for AD and drone applications.

III. ARTIFICIAL INTELLIGENCE OF THINGS

IoT systems have built a ubiquitously connected world, where IoT devices collect millions of data sets but perform no analysis. However, many practical services require analytical techniques to initiate appropriate actions. To address this issue, AI is introduced into the IoT, heralding the era of AIoT and achieving ubiquitous intelligent collaboration. Powered by AI, AIoT enables devices to perform self-driven analytics and make smart decisions with minimal human intervention. In this section, opportunities to fuse AI and the IoT, the general AIoT architecture with the support of edge computing and the cloud, and a practical example of AIoT applications are reviewed.

A. Opportunities for Integrating Artificial Intelligence with the Internet of Things

The data generated by IoT devices have many properties, namely, polymorphism, heterogeneity, timeliness, accuracy, massive scale and rich semantics. Real-time data for all events must be processed promptly. AI can effectively and efficiently mine valuable information from these data and make decisions. Moreover, AI models can be deployed on every layer of IoT systems with enhanced performance. The synergy of AI and the IoT is named AIoT, the benefits of which are illustrated below.

1) High Flexibility
Generally, the architecture of AIoT is highly flexible. In AIoT systems, end devices equipped with certain computational and storage capacities and edge servers are utilized with the support of the cloud. More complicated AI models can be implemented on a device, edge servers and the cloud in a cooperative mode. Different Quality of Service (QoS) guarantees can be provided according to requirements, such as


services in different delay ranges and predictions with different accuracy requirements. Additionally, unintelligent IoT devices can make decisions and predictions by deploying AI models in the cloud.

2) Enhanced Interactivity
The AIoT also embraces geographical distribution and provides more opportunities for end devices to participate, thereby significantly enhancing the interactivity of devices. End devices are often of various types, such as sensors and cameras. Small AI tasks can run directly on end devices to make real-time decisions. For large AI tasks that require more computing resources, the execution of AI models needs cooperation between end devices and edge servers. Taking AD as an example, the sensors need to interact with each other, and the heterogeneous data generated by different types of sensors require data fusion to make accurate predictions. The actuators then work together to control the movement of the vehicle according to the prediction results.

3) Intelligent Decisions and Accurate Predictions
It is difficult to extract valuable information from the large-scale data collected by millions of IoT devices. AI-based algorithms are suitable for intelligently clustering data into meaningful groups and mining valuable information. AIoT combines data collected by IoT devices with intelligence offered by AI models, where AI models conduct automatic learning on seamless data, observe the environment, and make predictions and decisions with minimal human intervention.

4) Various Applications
IoT devices come in a variety of types, such as cameras, temperature sensors, glucose sensors and radar. These devices can sense and actuate the physical world. Additionally, AI algorithms can handle different types of data sets. Thus, a variety of applications, ranging from smart healthcare to the IoV and smart industry, can be realized by AIoT systems.

Fig. 4. Overview of an AIoT architecture.

B. The General Architecture of Artificial Intelligence of Things

In this section, an overview of the AIoT architecture is illustrated, which adopts a tri-tier architecture similar to that of the IoT, as shown in Fig. 4. From bottom to top are the end layer, the edge layer and the cloud layer, and the cloud layer can coordinate the end layer and the edge layer. The end layer can preprocess or analyze data on premises and make early decisions. The preprocessed data from the end layer can be aggregated in the edge or cloud layer for deeper processing. Each layer is further described below.

1) The End Layer
The end layer in the AIoT system functions similarly to the perception layer in the IoT, where millions of interconnected sensors and actuators are deployed over a wide area. Unlike the perception layer in the IoT, the end layer not only is responsible for sensing, actuating and controlling the physical world but also executes small AI computational tasks or preprocesses data. Thus, the devices, ranging from smartphones to Raspberry Pi Bare Bones devices, typically have certain computing and storage capacities. The preprocessed compact structured information can reduce the latency, network bandwidth and cost of data transmission, which is of vital importance to AIoT applications. One obstacle is that only small AI models can be implemented on resource-constrained end devices. Thus, emerging technologies designed for compressing neural networks have attracted escalating attention from researchers, as reviewed in Section V-A. Moreover, these end devices can exchange data with each other or transmit data to the edge nodes via wireless protocols for further processing.

2) The Edge Layer
The network in the edge layer is often empowered with certain computation and storage capabilities. The edge layer usually has various edge nodes, such as wireless BSs, routers, IoT gateways and APs. The bottom edge nodes are responsible for receiving data from end devices of the perception layer and delivering control flows back to the devices via various wireless interfaces [81]. The upper edge servers perform computation tasks using the received data. A computation task can be offloaded to a higher-level server with more powerful computation capabilities and memory if the complexity of the task exceeds the ability of the current server. The control flows of these servers will also be returned to the BSs or APs and finally passed to the end devices. Other functions of edge servers include authentication, authorization, offloading and storage of the data passing between networks. This kind of edge computing can lower latency, provide continuous services without the need for a stable internet connection and further protect data security and privacy by avoiding uploading data to the cloud. This is of practical value to AIoT applications that lack a stable internet connection, such as agriculture, shipping and smart grids. Compared with traditional cloud computing, the resources of the edge layer are limited. However, DNN models are extremely suitable for deployment on edge nodes, since the layers of DNN models can be partitioned into several parts to be executed on different nodes [3]. Several techniques have been explored to make model inference work on edge nodes, as is further discussed in Section V-B. In addition, model training can work on edge nodes by using Federated Learning (FL) or


Transfer Learning (TL), as reviewed in Section VI-B.

3) The Cloud Layer
The cloud layer can coordinate the end layer and the edge layer and is responsible for mining potential value from massive data and training AI models. The cloud layer empowers AIoT applications to use virtual resources through the internet, which provides flexible and scalable resources on demand, e.g., computation and storage resources. The cloud layer enables various intelligent services, such as the IoV, smart homes, smart grids, smart agriculture, smart industry and smart healthcare. Typically, the data from end devices are finally transmitted to the cloud via the internet for further processing and storage, and the AI model will be better trained with higher accuracy. Moreover, deployment of AI models on cloud platforms empowers unintelligent devices with smart control and decision-making.

C. An Example of Artificial Intelligence of Things Applications

In this subsection, HydraMini, an affordable experimental research platform for AD, is taken as an example to describe how AI models can be applied to the real world in an edge-cloud cooperation mode, as shown in Fig. 5. The following describes the corresponding system design, illustrates model training, deployment and inference, and provides a case study using an end-to-end AI model.

1) System Design of the Autonomous Driving Vehicle
The design and implementation of the AD system consist of a cloud center and the HydraMini edge computing platform, which is designed by the Center for Automotive Research (CAR) at Wayne State University [82]. The cloud layer is responsible for model training, model compression and model compilation, while the edge platform executes the AD tasks. HydraMini is equipped with a Xilinx PYNQ-Z2 board, which integrates an Advanced RISC (Reduced Instruction Set Computer) Machine (ARM) Cortex-A7 core and an Artix-7 FPGA in the same chip. All the resources are managed by the board, whose operating system is based on Ubuntu 18.04. Thus, multiple ML libraries (PyTorch, TensorFlow, the Open Source Computer Vision Library (OpenCV), etc.) can be easily installed. A producer-consumer model [83] is implemented to make the control process more efficient and easier to extend; in this process, the sensors and the decision-making models play the roles of the producers and consumers, respectively. The producers add data to a specified queue in memory, while the consumers handle the data. More producers or consumers and new kinds of data can easily be added by reading or writing the related queue in memory; consequently, various kinds of applications based on different sensors can be deployed. The consumers usually serve as the controllers, which share a clock to check whether the vehicle has received outdated commands. Moreover, a token is used to indicate who is in control at any moment, and users can build strategies for transferring control. Basic control Application Programming Interfaces (APIs) are provided to set the rotation speed of the motor and the deflection angle of the servo motor. Additionally, the movement of the vehicle can be controlled by using a keyboard, where the OpenCV library is utilized to read the keyboard signals and then call the APIs.

2) Model Training, Deployment and Inference
The movement of the AD vehicle can be controlled by the results of AI inference or by the keyboard signals. To enable AI inference to work on HydraMini, the corresponding model has to be pretrained on the remote cloud, because this resource-intensive task would easily exhaust the mobile device owing to the large amount of computation. The training data can be saved from the multiple sensors by controlling the vehicle with the keyboard, where the obtained sensor data serve as the input of the AI models and the keyboard signals as the output labels. Alternatively, the training data can also be downloaded from the Internet. After training, deploying the pretrained TensorFlow (an end-to-end AI framework) model on HydraMini requires the following steps. Concretely, model inference is accelerated with the assistance of the Deep Learning Processor Unit (DPU) accelerator in the FPGA. To make the DPU work more effectively, model compression and model compilation are necessary. The Deep Compression Tool (DECENT) [84] uses coarse-grained pruning, quantization and weight sharing to make model inference run fluently, thereby striking a good balance between latency and accuracy. Then, the Deep Neural Network Compiler (DNNC) [84] is employed to map the neural network algorithm to Xilinx DPU instructions, aimed at achieving maximum utilization of DPU resources by balancing the computing workload and memory access. Finally, AI inference can run on HydraMini, where inference is packed as a consumer thread that obtains data from the data queue and feeds them to the AI network. The model will directly generate the control commands or send information to the controller thread for decision-making.

3) A Case Study Using an End-to-End Model
The HydraMini platform can support AD using end-to-end AI models. End-to-end AI models are simple to execute because the models can directly output commands. As the model structure in Fig. 5 shows, the model consists of CNN and dense layers and maps the camera input to the control output. The CNN layers are responsible for extracting features from image inputs and are followed by fully connected layers, which can be applied in the field of AD to extract the command information. ReLU serves as the activation function. A softmax layer or a dense layer works as the output layer for classification or regression. It is easy to optimize the model if it is far from satisfactory. Fig. 5 presents the whole process. First, users are empowered to use the keyboard to control the vehicle at will and save the data from the sensors as training data. Second, the preprocessed data are fed as input to the AI model, and the keyboard signals are used as the output labels. To further improve its accuracy, the model is trained in the cloud by using TensorFlow until a satisfactory model is obtained. Third, the pretrained model is compressed and compiled, and the produced files are then copied to the vehicle. Finally, the movement of the vehicle is controlled by the output results of AI inference.
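The producer-consumer control loop described in the system design above can be sketched with the Python standard library: a sensor thread pushes frames into a shared in-memory queue, and a consumer thread (standing in for the inference model) pops them and emits control commands. The synthetic frames and the decision rule are placeholders, not HydraMini's actual API:

```python
import queue
import threading

frame_queue = queue.Queue(maxsize=16)   # shared in-memory queue
commands = []                           # commands emitted by the consumer

def producer(n_frames):
    """Stands in for a camera thread: pushes synthetic 'frames'."""
    for i in range(n_frames):
        frame_queue.put({"id": i, "pixels": [i] * 4})
    frame_queue.put(None)               # sentinel: no more data

def consumer():
    """Stands in for the inference thread: turns frames into commands."""
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        # Placeholder decision logic instead of a real AI model.
        steer = "left" if frame["id"] % 2 == 0 else "right"
        commands.append((frame["id"], steer))

p = threading.Thread(target=producer, args=(4,))
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
# commands now holds one (frame id, steering command) pair per frame
```

New producers (other sensors) or consumers (other decision models) attach by reading or writing further queues, which is what makes the pattern easy to extend.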


Fig. 5. AD using an end-to-end model [82]
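The compression stage of the deployment pipeline described above combines pruning and quantization. The generic sketch below illustrates both ideas in miniature; the threshold, bit width and weight values are arbitrary assumptions and do not reproduce the actual DECENT tool:

```python
def prune(weights, threshold):
    """Magnitude pruning: zero out weights whose |w| is below threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize(weights, bits):
    """Uniform symmetric quantization of weights to signed integers."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale                               # dequantize: q_i * scale

w = [0.91, -0.03, 0.47, 0.002, -0.66]
pruned = prune(w, threshold=0.05)                 # small weights removed
q, scale = quantize(pruned, bits=8)               # 8-bit integer codes
restored = [qi * scale for qi in q]               # approximate originals
```

Pruning removes weights that contribute little, and quantization stores the survivors as small integers plus one scale factor, which is what lets the compressed model trade a slight accuracy loss for lower latency and memory traffic on the DPU.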

IV. APPLICATIONS OF ARTIFICIAL INTELLIGENCE OF THINGS

To date, the utilization of AIoT applications has reached visionary scales, continuing to have a profound impact on both the quality of daily life and economic growth. In this section, several application scenarios, including smart homes, smart healthcare, smart agriculture, smart industry, smart grids, smart environment and the IoV, are presented to demonstrate how edge-computing-aided AIoT systems will make the real world more efficient, smarter and safer. The aforementioned application scenarios are shown in Fig. 6. Additionally, the main goals of each field are listed in Table II.

A. Internet of Vehicles
The IoV enabled by AIoT aims to enhance road safety, strengthen efficiency, decrease crash risks and lower traffic congestion in transportation systems. Currently, the IoV covers three major categories: AD, monitoring systems for safe driving, and Cooperative Vehicle Infrastructure Systems (CVISs).

1) Autonomous Driving
AD achieves the intellectualization of vehicles, which can sense the environment through on-board sensors and make smart decisions using AI algorithms to control the movement of the vehicle by automated means without human labor. An AD car produces 4000 TB of data per day, which is impossible to upload to the cloud [123]. In addition, AD cars need to make real-time decisions, and a transmission lag will result in serious consequences. Edge computing is an ideal option that directly moves computing tasks to the edge of the network (e.g., the vehicles themselves). HydraOne [85] and HydraMini [82] are typical examples of AD vehicles, where the vehicles are equipped with embedded computing platforms to support AI (e.g., CNN) inference and traditional computer vision analysis and can make real-time decisions (e.g., to shift, brake, throttle, and steer). Moreover, Advanced Driver Assistance Systems (ADASs) have been developed to assist drivers. EdgeDrive, an edge-cloud-based system, is designed to provide real-time ADAS applications, such as smart navigation, to the driver while driving [86].

2) Monitoring Systems for Safe Driving
Driving requires the cooperation of all the senses acting together, and a slight slip may cause a major accident. Drowsiness or fatigue is the main factor that leads to serious traffic accidents and deserves more attention. A somnolence detection system is fabricated on a Raspberry Pi 3 using a DL algorithm to raise a timely alert when drowsy driving is detected [87]. The proposed system analyzes images captured by a camera equipped with infrared lights using a SqueezeNet DNN to recognize patterns of the driver's facial features (eye


Fig. 6. Examples of an AIoT-driven smart world, including smart homes, smart healthcare, the IoV, smart cities, smart agriculture, smart grids and smart environment

closure, nodding/head tilting, and yawning) in both daytime and nighttime.

3) Cooperative Vehicle Infrastructure Systems
CVISs acquire a comprehensive picture of dynamic, real-time road information and share this information by connecting vehicles to vehicles, pedestrians and road infrastructure through the internet. The edge computing platform includes many distributed infrastructures, such as the vehicles themselves, nearby BSs and Roadside Units (RSUs). Directly processing V2X data on nearby devices can reduce the transmission delay, and the devices can broadcast real-time road condition messages or traffic accident information to the relevant vehicles and pedestrians near a geographical location. Users can then change their route in advance, thereby alleviating road traffic congestion and ensuring safer road traffic. Moreover, aided by Device-to-Device (D2D) communication and the advantage of a latency of less than 1 ms, 5G is suitable for providing direct connection services. For instance, a great number of road traffic deaths and disabilities are caused by untimely treatment or secondary accidents. To this end, an automatic car accident detection system based on a You Only Look Once (YOLO) DL model for car accident detection (YOLO-CA) is developed in [89]. To improve the accuracy of accident detection, the proposed system employs a dataset named Car Accident Detection for Cooperative Vehicle Infrastructure System (CAD-CVIS) that mainly consists of crash videos from public video sources and images captured by roadside intelligent devices in CVIS. The RSUs report traffic accident casualties to rescue agencies and nearby vehicles using 5G networks, which can significantly shorten the rescue response time and improve rescue efficiency.

B. Smart Healthcare
The era of edge-computing-enabled AIoT opens a new line of research in the field of medical and healthcare systems, which has already provided a myriad of applications, including health monitoring systems, disease diagnosis systems and auxiliary therapy systems.

1) Health Monitoring Systems
secondary accidents. To this end, an automatic car accident 1) Health Monitoring Systems

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3088875, IEEE Internet of
Things Journal
JOURNAL OF IEEE INTERNET OF THINGS CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 12

TABLE II
AIOT APPLICATIONS APPLIED IN DIFFERENT AREAS

Domain | Ref. | Main Goals
IoV | [82], [85] | Intellectualizes vehicles, thereby enabling vehicles to control movement themselves by AI technologies
IoV | [86] | Advanced driver assistance systems aimed at providing intelligent services, such as smart navigation
IoV | [87], [88] | Safe driving monitoring systems that prevent drivers from becoming distracted
IoV | [89] | Cooperative vehicle infrastructure systems that ensure safer road traffic and alleviate road traffic congestion
Smart Homes | [90], [91] | Remote monitoring systems
Smart Homes | [92] | Intelligent control systems that increase convenience and improve life quality
Smart Homes | [93] | Smart guard systems
Smart Agriculture | [94], [95] | Timely detection of diseases to avoid the wide spread of epidemics
Smart Agriculture | [96], [97] | Environment monitoring to help farmers make scientific decisions and ensure profitability, sustainability, and protection of the environment
Smart Agriculture | [98], [99] | Intruder detection systems that help prevent wild animals from destroying crops
Smart Healthcare | [100], [101] | Health monitoring systems that precisely monitor patients' physical and mental condition
Smart Healthcare | [102]–[104] | Disease diagnosis systems that help doctors to accurately detect diseases
Smart Healthcare | [105], [106] | Robot-assisted surgery
Smart Grids | [107] | Detects faults for real-time alerts and takes immediate action
Smart Grids | [108], [109] | Intelligent attack detection systems
Smart Grids | [110] | Energy management systems that enhance the efficiency, reliability, and economy of smart grids
Smart Grids | [111], [112] | Economy-driven vehicle charging systems that automatically determine a charging policy
Smart Environment | [113], [114] | Real-time environmental monitoring systems
Smart Environment | [115], [116] | Disaster detection systems that send early warnings
Smart Environment | [117], [118] | Waste management systems
Smart Industry | [119] | Provides high-accuracy intelligent systems, aimed at increasing efficiency and productivity in assembly/product lines and reducing maintenance expenses and operational costs
Smart Industry | [120] | Inspection systems aimed at effectively determining potential defects
Smart Industry | [121], [122] | Industrial data analysis aimed at making accurate predictions

The AIoT provides a great opportunity to develop smart health monitoring systems to track patients' health status with respect to psychological and physiological conditions. Smart health monitoring systems employ a wide variety of wearable devices (advanced sensors, smart watches, smart clothes, smartphones, etc.) to collect information (heart rate, blood pressure, blood sugar level, amount of sleep, etc.). These devices can diagnose conditions and sound a warning [100]. Edge computing is characterized by low latency, real-time responsiveness and privacy security, which makes it an ideal option for smart healthcare systems. In [101], smart socks are applied to detect falling and symptoms of Parkinson's disease; the socks, equipped with a textile-based triboelectric nanogenerator, capture personalized triboelectric output signals and provide power for data transmission, while nearby smartphones perform real-time comprehensive gait analysis using AI.

2) Disease Diagnosis Systems
Emerging applications of the AIoT have been explored in disease diagnosis systems to help doctors diagnose diseases more accurately and to reduce their burden. Currently, most disease diagnosis systems are designed under the cloud framework, where the data are transmitted to the cloud via 4G networks and Wi-Fi for further analysis using AI technologies. However, this inevitably leads to a decrease in system performance, such as increased latency and privacy leakage. HealthFog is proposed to solve this problem, aimed at real-time automatic heart disease analysis by integrating DL algorithms in edge devices [102]. Additionally, due to the emergence of new infectious viruses and diseases, early detection and remote monitoring are necessary to control infection. A fog-assisted cloud-based Chikungunya virus diagnosis system has been used to prevent virus outbreaks [103]. On their smartphones, users receive alerts from the fog layer about infections and can avoid proximity to infected regions. Furthermore, an edge-and-cloud-based DL platform using a 5G network is proposed to detect Coronavirus Disease 2019 (COVID-19) in real time by using chest X-ray or Computed Tomography (CT) scan images and to monitor social distancing, mask wearing, and body temperature [104].

3) Auxiliary Therapy Systems
Robot-Assisted Surgery (RAS), a kind of auxiliary therapy system, has been accepted by patients and surgeons; robotic surgical tools are controlled by the surgeon's real-time hand movements and operate at small scales. It is easier to perform complex motion tasks, and recovery is faster with less pain. However, RAS is confronted with the challenges of narrow vision and operating space and of holes in the organs and tissues during the operation. To address these issues, surgical tool detection based on anchor-free CNN architectures on a TITAN GPU is proposed for real-time automated surgical video analysis to assist surgeons with automatic report generation and optimized scheduling [105]. Moreover, in [106], an NVIDIA GTX 1080Ti GPU is used to build an iris tracker for the ophthalmic robotic system by using a CNN model, which helps surgeons with reference locations for incisions and protects patients.

C. Smart Industry
The fourth industrial revolution, also known as Industry 4.0, has created new opportunities for product robotization and automation, which emphasize intelligent manufacturing techniques. The edge-computing-aided AIoT caters to the requirements of smart manufacturing, where edge computing enables low-latency, secure manufacturing while AI provides more intelligent local analysis and prediction. Smart industry focuses mainly on production automation and smart data analysis.

1) Smart Manufacturing
Smart manufacturing aims to provide high-quality and low-cost production lines through the intelligent decisions of AI and the low-latency processing of edge computing. The edge-powered AIoT in smart manufacturing has been applied in a variety of applications, such as product assembly lines and fault diagnosis. An intelligent robot factory is designed and developed where routers and gateways serve as edge computing nodes to reduce network transmission time and achieve increased production during manufacturing [119]. Inspired by edge computing, an inspection system based on a CNN model coupled with an early exit technique is deployed on a fog server to effectively discover potential defects and measure their degrees, thus showing improved inspection accuracy with low latency [120].

2) Smart Industry Analysis
The edge-empowered AIoT is also suitable for industrial data analysis. An assembly prediction system aided by edge computing is formulated to address the problem of high-dimensional and imbalanced data, where the random forest algorithm is used to reduce dimensionality and extract characteristics, while a Synthetic Minority Oversampling Technique–Adaptive Boosting (SMOTE-Adaboost) method with jointly optimized hyperparameters is applied for imbalanced data classification [121]. The use of edge computing can lower the latency of data transmission and enhance the flexibility of application deployment and the efficiency of quality prediction. Faults and errors often bring great losses for factories. To reduce such losses, a predictive maintenance framework is triggered for conveyor motors to detect the early stage of faults and errors in machinery [122].

D. Smart Homes
The rapid development of the AIoT has encouraged many attractive computationally intensive applications that provide intelligent sensing and convenient control services in smart home scenarios. For home data protection, edge computing has emerged as an excellent option to execute local computation and processing, especially for some computation-intensive AI-based applications.
Home monitoring systems are developed to reduce the concerns of family members by identifying and analyzing the behavior of specific family members or house conditions. Usually, it is difficult for children to finish complex homework alone. An intelligent home environment employs a smart chair and desk to collect a child's real-time behavior information and a robotic assistant to interact with the child as a therapist would do [90]. The implementation of such a system can provide not only education for children without pathologies but also therapeutic interventions for children with various pathologies. Wi-Fi signals exist in almost every corner of a home; thus, Wi-Fi-based activity recognition has received constant attention from researchers because of its ubiquity, low cost and device-free experience advantages. A Wi-Fi-based moving human activity recognition system is explored using a combined CNN and LSTM model [91]. Such a method can be applied to the intelligent control of home devices, such as lights and air conditioners [92]. A smart floor monitoring system based on self-powered triboelectric floor mats can be used in position sensing, activity monitoring and individual recognition by using a DL-based analysis method on instant sensory data [93]. Wi-Fi-based activity recognition also demonstrates great potential in theft detection [124].

E. Smart Agriculture
Smart agriculture aims to improve crop yield and quality, reduce labor costs and protect the environment from the excessive use of pesticides and fertilizers by using modern technologies. The explosive employment of sensors and automated equipment will generate an abundance of data, thereby taxing the internet and cloud centers. Edge-computing-aided AIoT applications enable data to be processed locally or on nearby edge servers for a timely response. Generally, smart agriculture concentrates on crop production, agriculture environment monitoring and agriculture security.

1) Crop Production Analysis
Crop production analysis can promptly detect diseases to stop the wide spread of epidemics by monitoring crop growth. Deep Leaf, a quantized CNN model for timely detection of early symptoms on coffee leaves, is developed using the X-CUBE-AI tool on an edge device, such as the STM32 family [94]. A real-time apple detection system using a DL method has been designed on a Raspberry Pi 3 B+ with improved energy consumption and inference time [95]. The embedded device is equipped with various kinds of accelerators to carry out DL inference, while a dedicated workstation equipped with an NVIDIA RTX 2080Ti GPU trains the DL models, where the YOLOv3 tiny architecture is exploited to accurately recognize, count, and measure the size of apples. This kind of method can be applied on AD cars to monitor crop growth on a large scale.

2) Agriculture Environment Monitoring
Smart agriculture can accurately measure and predict the crop growth environment that influences crop productivity, thereby helping farmers to make scientific decisions. To predict the temperature of the crop growth environment in real time, an edge-computing-aided smart agriculture system is designed by measuring the change in the CPU temperature of single-board computers. Here, the edge server collects the CPU temperature, and the outdoor temperature is predicted using a combination of singular spectrum analysis and linear regression [96]. Furthermore, productive soils significantly affect crop yield. The concentrations of nutrients in soil are measured by using colorimetry, where the data sensed by the Nitrogen-Phosphorus-Potassium (NPK) sensor are directly processed by a fuzzy rule-based system on a Raspberry Pi 3 [97]. Such an approach can warn farmers of a deficiency of the N, P and K available in the soil, help to optimize soil fertility and avoid water contamination by runoff and leaching. Additionally, more AI algorithms are also applied to manage the crop growth environment.
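The regression step of the CPU-temperature scheme in [96] can be illustrated with a toy sketch. All readings and the resulting coefficients below are hypothetical, and [96] additionally applies singular spectrum analysis before regressing; this sketch shows only the linear-fit idea of mapping a board's CPU temperature to outdoor temperature.

```python
import numpy as np

# Hypothetical paired readings: single-board-computer CPU temperature (deg C)
# and the outdoor temperature measured alongside it.
cpu = np.array([45.0, 47.5, 50.0, 52.5, 55.0, 57.5, 60.0])
outdoor = np.array([10.0, 12.0, 14.1, 15.9, 18.0, 20.1, 22.0])

# Least-squares fit of outdoor = a * cpu + b.
a, b = np.polyfit(cpu, outdoor, deg=1)

def predict_outdoor(cpu_temp: float) -> float:
    """Estimate outdoor temperature from a fresh CPU temperature reading."""
    return a * cpu_temp + b

print(round(predict_outdoor(53.0), 1))  # about 16.4 deg C for this toy data
```

On the edge server, such a model is cheap enough to refit continuously as new paired readings arrive, which is what makes the approach attractive for resource-limited deployments.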


3) Agriculture Security
Agriculture is the foundation of a country's economic development and provides food and other raw materials. Thus, protecting agriculture from malicious damage is crucial. To handle animal-human cohabitation, a CNN-based early warning system is proposed with the support of the IoT and edge computing, where the sensed or processed data can be further transferred to BSs via a fiber-wireless network to alert humans of possible animal crossings [98]. In addition to the function of selecting suitable crops according to the soil conditions, an edge-computing-assisted agriculture system can also keep trespassing wild animals away from the agricultural area, where a CNN model detects animals in the captured images on a Raspberry Pi 3, and a buzzer is utilized to deter intruders by irritating them [99].

F. Smart Grids
Grid operators have deployed IoT sensors to monitor power grid devices in real time. Recently, there has been a rush to integrate AI with electrical grids to provide more stable, cost-saving and secure smart grids. The Linux Foundation's LF Energy releases a scalable and technology-agnostic industrial IoT platform, Grid eXchange Fabric (GXF), enabling grid operators to securely gather data and monitor, control and manage smart devices on the grid [125]. ISO (Independent System Operator) New England (ISO-NE) has successfully used cloud computing technology for large-scale power system simulations, which may be capable of dealing with these increasing challenges [126]. Distribution management systems, a class of smart grid applications, can also significantly benefit from the use of the cloud [127]. Moreover, attention has been drawn to edge-computing-based solutions for smart grids. An edge-assisted AIoT smart grid architecture is fully explored to support the connection and management of substantial numbers of terminals, data privacy protection, and real-time data analysis and prediction [128]. For automated fault detection, an intelligent damage classification and estimation system is proposed to automatically localize and predict damage to power distribution poles; the system processes images captured by Unmanned Aerial Vehicles (UAVs) with a CNN model [107]. A more transactive energy system is desired to promote efficiency, save cost and time, and ensure delivery; such a system is also desired for the environmental advantages gained from the increasing use of intermittent renewables [129]. For example, renewable energy management is designed using an LSTM-based model to predict solar intensity, guarantee a power balance and improve the reliability and efficiency of smart grids [110]. In addition, pricing is an important factor that significantly influences users' behaviors, such as in the case of economy-driven electric vehicle charging. Considering the electricity price and the battery energy load, TL [111] and DRL [112] methods are utilized to automatically determine vehicle charging policies.
The security of smart grids is also noteworthy. A DL-based mechanism combining a Deep Belief Network (DBN) and a Restricted Boltzmann Machine (RBM) is exploited to detect potential false data injection attacks in real time [108]. Moreover, an online attack/anomaly detection issue is formulated as a partially observable Markov decision process to accurately detect cyberattacks by using an RL-based detection algorithm [109]. Although a cloud-centric energy theft detection scheme is used to detect abnormal behavior in smart grids, the scheme still requires privacy preservation techniques during data transmission, which must be explored in edge computing scenarios [124].

G. Smart Environment
The goal of a smart environment is to provide humans with a safer and more comfortable life. Environmental monitoring systems can send early warnings to humans by making accurate predictions. To monitor air quality, an AI-based model is designed using on-chip sensing systems to predict the concentration of Particulate Matter with a diameter of 2.5 µm or less (PM 2.5) [113]. A seawater quality assessment integrating Principal Component Analysis (PCA) and a Relevance Vector Machine (RVM) is developed in an edge computing environment to predict the potential values of dissolved oxygen and pH [114]. This type of method can also be used for city water prediction. A smart environment is also applicable to accidental disaster detection. A CNN-based model is proposed on an NVIDIA GeForce GTX 1060 to detect fire and smoke and thus provide warnings in advance [115]. UAVs are also employed to increase the area of detection and are suitable for forest fire detection [116]. Many studies have explored cloud-based geological hazard detection systems designed to mitigate the effects of disasters [55], and more work should focus on edge-computing-assisted detection systems designed to provide warnings more quickly.
A smart environment can also help manage a city. For instance, much effort has been made regarding waste management systems to mitigate the side effects of waste materials. A scheme fusing multiple deep models is exploited to classify waste [117]. To reduce latency and enhance availability, the research presented in [118] implements a DL-based waste detection model on a Raspberry Pi 3 Model B+. WasNet adopts a lightweight neural network to accurately sort waste and is then transplanted to a hardware platform and transformed into a smart trash can [130].

V. ENABLING TECHNOLOGIES FOR ARTIFICIAL INTELLIGENCE INFERENCE IN ARTIFICIAL INTELLIGENCE OF THINGS
Most AI models, especially DNN models, are designed to be much deeper and trained on larger data sets to promote their accuracy. Inference in the cloud will inadvertently incur additional queuing and propagation delays from the network, which is fatal for time-critical applications. These AI models, however, are too large and computationally expensive to be directly deployed on resource-constrained end devices. To overcome this challenge, one possible approach is to simplify the models with a dramatic decrease in computation. The other effective approach is to outsource complex inference tasks to edge nodes with more resources. In this regard, the methods used to optimize on-device inference and coinference at the edge, as well as privacy-preserving techniques, are surveyed.
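The trade-off between on-device and outsourced inference can be made concrete with a back-of-the-envelope latency model. All numbers below are hypothetical illustrations, not measurements from the surveyed systems: remote inference pays a transmission cost that on-device inference avoids, so the faster option depends on input size, bandwidth and the two compute times.

```python
def total_latency_ms(input_kb: float, bandwidth_mbps: float,
                     remote_compute_ms: float, propagation_ms: float) -> float:
    """End-to-end latency of offloaded inference: upload + propagation + remote compute."""
    transfer_ms = input_kb * 8 / bandwidth_mbps  # KB -> kbit, divided by Mbit/s, yields ms
    return transfer_ms + propagation_ms + remote_compute_ms

# Hypothetical case: a 200 KB camera frame, 25 ms network propagation, 15 ms of
# server-side compute, versus roughly 100 ms for a compressed model on the device.
fast_link = total_latency_ms(200, bandwidth_mbps=50, remote_compute_ms=15, propagation_ms=25)
slow_link = total_latency_ms(200, bandwidth_mbps=5, remote_compute_ms=15, propagation_ms=25)
print(round(fast_link), round(slow_link))  # 72 vs 360: offloading only wins on the fast link
```

Under these assumed numbers, outsourcing beats the 100 ms on-device baseline on the fast link but loses badly on the slow one, which is why the offloading decision is usually made adaptively at run time.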

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3088875, IEEE Internet of
Things Journal
JOURNAL OF IEEE INTERNET OF THINGS CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 15

A. On-Device Inference
Generally, two practical methods can be used to reduce
computational costs of AI models. One is to directly design
compact and efficient neural network models with a reduced
number of parameters, such as SqueezeNet [131], Xception
[132] and ShuffleNet [133]. The other method is typically
called model compression, which achieves a smaller memory
footprint and improved operation for end devices by compress-
ing pretrained networks.
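To make the savings targeted by compact designs concrete, the sketch below (an illustration of the standard formulas, not code from the surveyed works; the 3×3 kernel with 128 input and 256 output channels is a hypothetical layer shape) counts the parameters of a standard convolution versus the depthwise separable convolution used by Xception and MobileNet, which are discussed next.

```python
# Parameter counts for one convolutional layer; formulas are standard,
# the example layer shape is hypothetical.
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    # A standard k x k convolution learns one k x k x c_in kernel per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    # Depthwise step: one k x k filter per input channel;
    # pointwise step: a 1 x 1 convolution that mixes channels.
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 128, 256)        # 294,912 weights
sep = depthwise_separable_params(3, 128, 256)  # 33,920 weights
print(std, sep, round(std / sep, 1))           # roughly an 8.7x reduction
```

The theoretical reduction factor is 1/c_out + 1/k², which for 3×3 kernels and wide layers approaches 1/9 and explains the 8-9× savings reported for such factorized designs.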
1) Designing Compact Networks
Researchers usually focus on constructing a compact DNN model to reduce the number of parameters while maintaining impressive accuracy. Many measures have been taken to achieve this purpose, and the well-accepted DNN models widely used on resource-constrained devices are presented here, including Xception [132], MobileNets [134], YOLO [135], and SqueezeNet [131]. Xception and MobileNet use depthwise separable convolutions, originally based on the idea of factorized convolution instead of the standard convolution operation, to reduce the number of computations and parameters of neural network models while maintaining high performance [134]. SqueezeNet utilizes special 1×1 convolution filters to replace the original 3×3 convolution filters and downsamples the number of input channels [131]. To overcome the disadvantage of MobileNet's group convolutions, ShuffleNet adopts a novel channel shuffle operation to help information flow across feature channels [133]. YOLO is a single-shot detector that integrates target area prediction and category prediction into a single neural network model to achieve real-time, fast target detection and recognition with high accuracy, thereby proving that integrating target area and category predictions is much faster than completing these steps sequentially [135].

2) Model Compression
Model compression reduces the complexity and resource requirements of models at inference time with only a slight accuracy loss. Below, efforts toward model compression techniques are briefly discussed.

a) Parameter Pruning and Sharing
Parameter pruning and sharing can decrease the number of redundant parameters and address the issue of overfitting. Model pruning methods are roughly divided into structural pruning and nonstructural pruning. Nonstructural pruning is generally a connection-level, fine-grained pruning method with relatively high accuracy; however, it depends on a specific algorithm library or hardware platform, such as Deep Compression [136] or Sparse-Winograd [137]. Structural pruning is a filter-level or layer-level coarse-grained pruning method with relatively low accuracy; however, the structural pruning strategy is more effective than nonstructural pruning and does not depend on a specific algorithm library or hardware platform. This pruning strategy can run directly on DL frameworks, such as ThiNet [138] and Network Slimming [139]. A channel pruning method is designed in which convolutional layers are pruned by Least Absolute Shrinkage and Selection Operator (LASSO) regression-based channel selection and least-squares reconstruction [140]. A pruned VGG-16 can reduce the number of weights by 5× with only a slight increase in the error, while this method suffers only 1.4% and 1.0% accuracy loss under a 2× speedup for the Residual Neural Network (ResNet) and Xception networks.

b) Quantization
Quantization uses a more compact format by adopting low-bit-width numbers instead of 32-bit floating-point numbers to represent each weight, thereby reducing the computational intensity and the memory footprint and further increasing energy efficiency. It is sufficient to use fixed-point networks, compared to their 32-bit full-precision floating-point counterparts, with a negligible degradation in performance [141]. Some quantization methods for DNN models are confined to limited types [141]–[143] and bound to specific hardware and DNN frameworks [142], [143]. To address this issue, libnumber, a portable and auto-tuning framework, is proposed to optimize the number representation for each layer of DNNs; the Abstract Data Type (ADT) of the portable API allows users to declare the data (e.g., inputs, weights, or both) to be quantized in a layer as a number type [144]. The auto-tuner of libnumber then uses a compact representation for the numbers to minimize the user-supplied objective function while retaining a certain accuracy. In this way, the concern of developing an effective DNN model is totally separated from the low-level optimization of the number representation.

Fig. 7. Knowledge distillation method for DNN on-device inference.

c) Knowledge Distillation
Knowledge distillation fabricates a compact DNN model that migrates the behavior of a powerful and complex DNN model. By training the smaller DNN model using the output predictions generated by the complicated model, the smaller DNN model should approach or exceed the function learned by the larger DNN as well as possible, as illustrated in Fig. 7. The framework of knowledge distillation was first introduced in Hinton's work, where a "softmax" output layer is employed and a temperature T is used in the softmax layer to produce a softer probability distribution over the classes and improve the distillation performance [145]. A hint-based training method, an extension of knowledge distillation, is carried out by using the intermediate representations learned by the teacher as hints, thus making the training of the student deeper and thinner [146].
A stand-alone compression technique may not meet the requirements of computation- and memory-intensive end devices. Thus, some efforts have focused on reasonable combinations of these model compression techniques. Deep compression is employed by using a combination of weight pruning and trained quantization to reduce the storage requirements of a neural network without loss of accuracy [147]. Additionally, an RL-based optimizer that effectively chooses a good combination of various model compression techniques achieves a balance between the application requirements and mobile resource constraints [148].
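As a rough illustration of how two compression techniques compose, the snippet below applies magnitude-based weight pruning followed by uniform 8-bit quantization to a random weight matrix. This is a sketch of the general pruning-plus-quantization idea behind deep compression, not the implementation of [147]; the sparsity level and matrix size are arbitrary choices.

```python
import numpy as np

def prune_and_quantize(w: np.ndarray, sparsity: float = 0.8, bits: int = 8):
    """Zero the smallest-magnitude weights, then uniformly quantize the survivors."""
    threshold = np.quantile(np.abs(w), sparsity)      # magnitude cutoff for pruning
    pruned = np.where(np.abs(w) < threshold, 0.0, w)  # nonstructural (weight-level) pruning
    max_abs = max(float(np.abs(pruned).max()), 1e-12)
    scale = max_abs / (2 ** (bits - 1) - 1)           # symmetric uniform quantization step
    codes = np.round(pruned / scale).astype(np.int8)  # low-bit integer codes
    return codes, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
codes, scale = prune_and_quantize(w)
recon = codes.astype(np.float32) * scale              # dequantized weights for inference
print(f"{(codes == 0).mean():.0%} zeros")             # about 80% of the weights pruned
```

Stored as 8-bit codes plus a sparse index, such a matrix needs a fraction of its original 32-bit footprint, which is the memory saving that makes on-device deployment feasible.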

B. Coinference at the Edge
The aforementioned works concentrate mainly on designing compact neural networks and model compression techniques, which can make AI model inference work directly on end devices without an apparent loss of accuracy. However, deploying large AI models, which demand high computation, power and memory capacity from the end infrastructure, remains a challenge. Therefore, it is a good option to segment DNN models into multiple partitions and offload each part to heterogeneous local end devices, more powerful distributed edge servers or remote cloud servers. In what follows, the enabling technologies for such coinference between the end layer and the edge layer are summarized.

1) Offloading
Computation offloading is a widely used distributed computing paradigm for fast inference, where an end device can migrate part of its computation to an edge node, the cloud, or both over a heterogeneous network. By this means, offloading can employ remote servers to increase computation speed and save energy. However, a compromise between the advantages of remote execution and the sacrifices of data transmission should be reached. A great number of optimization-based offloading approaches, such as DeepDecision [149] and mobile-cloud DNNs [150], have been taken into consideration, which have strict constraints on network latency, bandwidth, energy consumption, accuracy and the input size of the DNN models. Moreover, the decision of whether to offload also depends on the input data size and hardware capabilities.
Offloading technology has been widely discussed in the context of networking [151] and edge computing [152]. An irreversible trend has been to explore such technology in the AIoT. Glimpse, a real-time object recognition system on mobile devices, ships the execution of algorithms for object recognition to server machines to reduce latency [153]. In addition, offloading can also take place in the same tier. The femtocloud system configures a cluster of co-located mobile devices into an orchestrated cloud service [154]. Such an offloading approach enables resource-constrained IoT devices to outsource computation-intensive tasks to other mobile devices nearby rather than to an MEC server. It offers advantages that include better scalable computational capacity and less reliance on edge servers. However, the dynamicity and instability of mobile devices make security an unavoidable challenge, so it is still a great option to offload computation tasks to a more stable server. GigaSight, a framework for crowd-sourced video analysis, removes privacy-sensitive information in a user-specific vector machine on the cloudlet and then runs part of the computation in the cloud [155].

2) DNN Model Partitioning

Fig. 8. Model partitioning for DNN inference at the edge.

Offloading techniques shift model inference to edge servers or cloud servers, which is highly dependent on unpredictable server availability and network conditions. The idea of offloading can be extended to model partitioning, which takes advantage of the unique structure of DNNs. In this way, the layers of DNNs can be divided into several parts, where some layers are directly executed on the end device and some layers are offloaded to an edge server or the cloud for remote computation. This approach can provide improved performance with latency reduction and energy efficiency enhancement. Generally, the DNN model can be decoupled into multiple partitions that are allocated to 1) distributed mobile devices [156], 2) edge nodes [157], or 3) the cloud [3], as illustrated in Fig. 8.
Partitioning DNN models horizontally by layers is the most common method, where some layers are executed on devices and some layers are implemented on edge nodes or the cloud. This can decrease the power consumption of the IoT device and the latency by using the computation cycles of other end devices; although there is a delay in exchanging the intermediate results at the DNN partition point, the approach can still generate overall net benefits. Care must be taken regarding where to partition the DNN model. Neurosurgeon is a system that applies to DNNs and intelligently partitions DNN models at the granularity of neural network layers to obtain the best latency and end-device energy consumption [3]. An edge computing system for object tracking is conducted in [158], where the best partition point between the end device and the edge server is automatically selected to minimize the power consumption of the IoT device and the latency under dynamic network bandwidth.
CNN models can also be vertically partitioned. DeepThings, a

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3088875, IEEE Internet of
Things Journal
JOURNAL OF IEEE INTERNET OF THINGS CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 17

lightweight system for the distributed implementation of CNN inference on a resource-limited device, employs a fused tile partitioning method and divides CNN models into separate distributable subtasks to decrease the memory footprint [157].

Fig. 9. Model early exit method for DL inference at the edge.

3) Model Early Exit
A DNN model with additional layers can generally achieve higher accuracy; however, such a model requires increased computation and energy resources in feedforward inference. Therefore, it is difficult to execute such a complicated DNN model on a resource-constrained end device. Various approaches, such as offloading and model partitioning, have been adopted to make inference work on edge servers, thereby reaching a balance between accuracy and processing latency. The idea of accelerating model inference can be further promoted by the emerging model early exit method, which leverages additional side-branch layers to obtain the classification result [159]. The inference process can be completed in advance via the early classifiers with high confidence, while some complicated tasks can still use more DNN layers to complete the classification procedure.

As seen in Fig. 9, by using model early exit, a partial DNN model on a device can quickly extract features and, if it is confident, directly produce the inference result [120]. Compared to offloading DNN computation to the cloud, model early exit can decrease the communication delay [120], [160]. Moreover, this method achieves greater inference accuracy than the aforementioned model compression methods (parameter pruning and quantization) on an end device. Model early exit should not be considered independent of offloading and model partitioning but should be thought of as a cooperation framework between the end, edge and cloud, where additional classifiers deployed on the cloud further process the features and arrive at an improved result [161]. This method also provides better privacy protection because the extracted features, rather than the original data, are sent to other devices.

To meet the strict latency requirements of AIoT applications, attempts have been made to perform model inference near the data source. The approaches can be generally categorized into two types, as summarized in Table III. The first type is to directly deploy inference on the end device by designing compact networks or compressing existing ones. Compact models target reducing the number of parameters while maintaining impressive accuracy, whereas model compression techniques tend to remove redundant structures and parameters without significant loss of model performance. The parameter pruning and sharing method has proven effective in removing redundant and uncritical parameters. Quantization adopts low-bit-width numbers to reduce computational intensity and memory footprint. Knowledge distillation aims to learn a distilled model and train a more compact lightweight network to provide accurate inference. These techniques can be combined to seek further improvements in model performance. The second type is to support a cooperative mode of inference between end devices and edge servers with relatively abundant resources, such as offloading and DNN model partitioning. Offloading uses remote servers to accelerate inference at the cost of data transmission. Based on the unique structure of DNNs, the model partitioning approach segments the model into multiple parts and allocates them to each device or node for coinference. Model early exit enables inference to exit early via additional side-branch classifiers with high confidence and can employ more layers to acquire better results.

C. Private Inference
The AIoT collaborative inference techniques between end devices and servers reviewed above can greatly reduce communication costs and latency. However, this kind of cooperative inference system also faces privacy concerns when data including sensitive information are transmitted to a nearby edge server or the cloud. Additionally, end devices are too energy- and resource-constrained to execute complex data protection methods. It is therefore worth studying privacy enhancement between end devices and edge servers, and initial efforts have been made to protect data from eavesdroppers. In this subsection, secure computation through encryption or cryptography, as well as data obfuscation techniques, are reviewed for AIoT inference.

1) Secure Computation
Cryptography-based methods can be applied to AIoT privacy inference. Edge servers perform computation on data preserved by cryptographic techniques while knowing nothing about the data, and end devices receive the inference result without knowing the model.

Homomorphic encryption can be used in AIoT applications for secure computation, where DNN computation is deployed on the encrypted client data. The commonly used nonlinear functions in DNN models, however, are incompatible


TABLE III
SUMMARY OF AIoT MODEL INFERENCE TECHNOLOGIES

Architecture: On-device
• Compact Networks: Adopts a compact module in each layer, thereby reducing the number of parameters and the amount of computation [131]–[135]
• Parameter Pruning and Sharing: Removes redundant parameters that are insensitive to the performance [136]–[140]
• Quantization: Uses low-bit data computation to achieve a higher computing speed and lower power without apparent accuracy loss [141]–[144]
• Knowledge Distillation: Employs a smaller student network with distilled knowledge of a larger teacher network [145], [146]

Architecture: Cooperation between devices and servers
• Offloading: An adaptive offloading strategy that seeks a tradeoff between energy, accuracy, latency, and input size for different DNN models [149]–[155]
• DNN Model Partitioning: DNN-layer-adaptive partitioning that allows cooperation between the resources of the device, edge and cloud; latency- and energy-oriented optimization [3], [157], [158]
• Model Early Exit: Supports partial DNN model inference based on the accuracy requirements; accuracy-aware [120], [159]–[161]
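To make the early-exit control flow summarized in Table III concrete, the following minimal sketch is offered. It is illustrative only: the `stages`, `branches` and the 0.9 confidence threshold are assumptions for this example, not details of any surveyed system. Inference returns from the first side-branch classifier whose top-class probability is confident enough, and otherwise falls through to the final classifier:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_infer(x, stages, branches, threshold=0.9):
    """Run feature-extraction stages in order; after each stage, consult its
    side-branch classifier and exit as soon as the prediction is confident.
    Returns (predicted_class, exit_index)."""
    feats = x
    last = len(stages) - 1
    for i, (stage, branch) in enumerate(zip(stages, branches)):
        feats = stage(feats)            # partial DNN computation on-device
        probs = softmax(branch(feats))  # cheap side-branch classifier
        top = max(range(len(probs)), key=probs.__getitem__)
        if probs[top] >= threshold or i == last:
            return top, i               # early exit (or forced final exit)

# Toy usage: two identity "stages" with hard-coded branch logits.
stages = [lambda v: v, lambda v: v]
branches = [lambda v: [0.0, 5.0],      # very confident -> exits at branch 0
            lambda v: [3.0, 0.0]]
print(early_exit_infer(None, stages, branches))  # (1, 0): class 1, first exit
```

Raising the threshold trades latency for accuracy: fewer inputs exit at the shallow branches, and more of them traverse the full network.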

with the operations (addition and multiplication) employed in leveled homomorphic encryption. Moustafa [162] trains a model with a nonlinear function and instead uses a low-degree polynomial approximation at private inference time, inevitably leading to a loss of accuracy; thus, a novel min-max normalization strategy is introduced that limits function inputs to ranges with a low approximation error. Additionally, secure multiparty computation is applicable to AIoT inference privacy, where the multiple parties involved work together to jointly evaluate a model. The Efficient Privacy-Preserving Scheme (EPPS) [163] trains secure models on the foundation of threshold Paillier encryption, where the secret key is split across the parties, so the privacy of trusted users can be inferred only if a certain number of users collude with one another, and the scheme is unaffected by a participant exiting at any step of the process. However, secure multiparty computation often incurs a large communication overhead. More studies have therefore exploited a hybrid of homomorphic encryption and secure multiparty computation for AIoT privacy inference [164]. Gazelle [164] proposes a combination of packed additively homomorphic encryption and garbled-circuit-based multiparty computation, where a homomorphic encryption library offers efficient implementations of basic homomorphic operations and homomorphic linear algebra kernels that exploit the automorphism structure. The hybrid framework obtains much lower latency and bandwidth than purely cryptographic primitives.

2) Data Obfuscation
To avoid the heavy computation of cryptographic primitives, data obfuscation can provide a strong guarantee for sensitive data by adaptively injecting noise into the data set while retaining the servers' ability to implement AIoT inference tasks. The noise is treated as an additional trainable set of parameters whose distributions are discovered through a repetitive process of end-to-end self-supervised training. By learning the distributions of the noise tensors in this gradient-based process, SHREDDER [165] reaches an asymmetric balance between accuracy and privacy, minimizing the loss of accuracy while maximally reducing the information content of the data unloaded by devices to the servers for inference implementation. Differentially private DL algorithms also prove to be an effective method, where random noise is added to the data. Edgesinitiator [166] is proposed to protect private data from sensitive inference, where a DL model is used to reduce the data to a defined size while Local Differential Privacy (LDP) is achieved by injecting additional noise to obfuscate the learned features. To avoid the decrease in model accuracy caused by differential privacy, in [167] Gaussian and Laplacian noise is added to the input layer and the intermediate-layer output, and decryption keys are generated to remove the random noise. A two-edge-server framework is also designed for AIoT privacy-preserving inference, where the DNN layers that require much computation are outsourced, while computation-efficient layers are executed on the device.

VI. ENABLING TECHNOLOGIES FOR DISTRIBUTED ARTIFICIAL INTELLIGENCE TRAINING IN ARTIFICIAL INTELLIGENCE OF THINGS

Conventionally, the training mode of AIoT models relies on a centralized style, which may incur additional data transmission costs and privacy issues. To effectively address these issues, a decentralized training mode is proposed, where the AI model is divided into several subnetworks and each part is trained directly on an end device with local data. The trained model updates can be aggregated at edge nodes in the network or be exchanged among the interconnected end devices in the network. Both kinds of decentralized training can be realized without the support of the cloud. In this section, enabling techniques for decentralized AI model training, communication efficiency and security enhancement in AIoT are discussed.

A. Decentralized Artificial Intelligence of Things Model Training Methods

The AI inference techniques mentioned above work smoothly on end devices and edge servers of the AIoT system on the assumption that the AI model has been trained in the cloud center by using an existing data set. Increasingly more data are generated at the edge of the network, which is of great significance to AI model training. With respect to


the AIoT, it is essential to develop decentralized AI training methods that can avoid raw data transmission, thereby reducing the transmission bandwidth and enhancing privacy. In this subsection, enabling techniques for decentralized AIoT model training are introduced.

1) Federated Learning
FL is a collaborative AI setting that was originally aimed at addressing the problem of Android mobile terminal users updating models locally over unreliable and slow network connections [168]. Gradually, FL has come to assist in efficient AI training by using data distributed over a large number of end devices and edge servers while ensuring information security, protecting terminal and user privacy, and adhering to legal requirements during data exchange. Traditionally, in the distributed configuration of FL, the AI model is built without direct access to the data, while mobile devices serving as clients carry out local training. These mobile devices can be generalized to end devices, while edge nodes and cloud servers can equivalently be considered clients. A server coordinates a series of nodes, thereby enabling the clients to take responsibility for various levels of ML model training and share the individually trained models with the server [169]. The server creates a federated model by using the uploaded trained models and returns the optimized model to the clients. In addition to DNNs, other important algorithms, such as random forests, can also be trained using FL. As shown in Fig. 10, in the architecture of FL, a set of edge nodes downloads the global DL model from the aggregation server, trains local DL models based on the downloaded global model with local data, and uploads the trained models to the server for model averaging. Privacy and security can be enhanced by restricting the local data to be trained solely on end devices or nearby end devices [170]. A decentralized method of FL for CNNs and LSTMs based on iterative model averaging is presented in [169], in which the local data are left on mobile devices and the trained models are shared by aggregating locally computed updates. An extensive empirical evaluation on the basis of five different model architectures and four data sets demonstrates that FL is robust to non-independent and identically distributed (non-IID) data distributions and accelerates the process of training.

Fig. 10. Architecture of FL for decentralized model training at the edge.

2) DNN Splitting
DNN splitting exchanges partially processed data instead of raw data between end devices and edge servers, which is an effective way to protect privacy-sensitive data [21]. The successful deployment of DNN splitting lies in the mechanism by which the DNN model can be split between two successive layers, with the two partitions executed at different locations with no apparent accuracy loss. However, selecting an appropriate splitting point that meets the latency requirement remains an open research issue. To reduce the computational complexity while preserving accuracy and to introduce a bottleneck, it is proposed in [171] to employ network distillation to distill the head portion of the split model. This approach deploys lightweight models on the end side and pushes the intensive computation of the DNN to the server, minimizing both the processing load at the mobile device and the amount of wirelessly transferred data.

3) Transfer Learning
Knowledge transfer learning, also known as TL, has emerged as a practical DNN training mechanism that enables the convolution kernels to be initialized with the weights learned from a pretrained model and solves the problem of training data drawn from different distributions. TL is closely connected with DNN splitting; its goal is to reduce the energy cost of DNN model training on mobile devices, and it is suitable for general-feature image recognition. Extensive studies have explored the impact of TL techniques on DNN training on end devices. Reference [172] studies the performance (in terms of both accuracy and convergence speed) of TL and focuses on various student architectures and enabling techniques for transferring knowledge from the teacher to the student. The effect of TL varies with the architecture and with the techniques for transferring knowledge and skills from both the intermediate layers and the last layer of the teacher to a shallower student.

In this subsection, three distributed AI model training techniques for AIoT are reviewed, the highlights of which are presented in Table IV. FL preserves privacy by leaving raw data on local devices and trains a shared model on the server by uploading the computed updates. Rather than transmitting the raw data, DNN splitting selects a splitting point, thereby enabling distributed DNN models to be trained using partially processed data. TL applies the general features learned from a DNN pretrained on basic data to a specific data set or task.

B. Enabling Technologies for AIoT Model Training Updates
Distributed AI training techniques are extremely suitable for scenarios with large data streams. The performance of distributed DL training faces the challenge of communication costs, which is vital to end devices and edge nodes. Thus, it is necessary to reduce the frequency of communication or the size of the communicated data, and the corresponding methods are reviewed in the following.

1) Frequency of Training Updates
In distributed DL training, a locally trained model or preprocessed data must be uploaded to a central server. One important issue is to optimize the gradient of the shared


TABLE IV
SUMMARY OF AIoT MODEL DISTRIBUTED TRAINING TECHNOLOGIES

Architecture: Distributed training
• Federated Learning: The data remain in their original position, and the optimized model is trained on the user side by parameter exchange under an encryption mechanism, thus solving the problem of data islands, ensuring privacy and reducing communication costs [168]–[170]
• DNN Splitting: The deep network is split into a series of subnetworks, which are trained using partially processed data [171]
• Transfer Learning: Enables a system to recognize and apply the knowledge and skills learned in previous domains/tasks to novel domains/tasks by transferring labeled data or knowledge structures from related fields [172]
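As a concrete illustration of the FL aggregation step summarized in Table IV, the minimal sketch below averages client parameters weighted by local data size, in the style of federated model averaging. The two-client setup, the `"layer"` name and the sizes are hypothetical values for this example, not details taken from [168]–[170]:

```python
def federated_average(client_models, client_sizes):
    """FedAvg-style aggregation: weight each client's parameter vector by its
    share of the total training data, then sum element-wise."""
    total = sum(client_sizes)
    global_model = {}
    for name in client_models[0]:
        acc = [0.0] * len(client_models[0][name])
        for model, size in zip(client_models, client_sizes):
            weight = size / total  # client's fraction of all local samples
            for j, value in enumerate(model[name]):
                acc[j] += weight * value
        global_model[name] = acc
    return global_model

# Toy usage: two clients; the second holds three times as much local data,
# so its parameters dominate the aggregated global model.
clients = [{"layer": [1.0, 2.0]}, {"layer": [3.0, 6.0]}]
print(federated_average(clients, [1, 3]))  # {'layer': [2.5, 5.0]}
```

Weighting by local data size keeps the aggregate unbiased when clients hold unbalanced amounts of data, which is one reason this style of averaging tolerates non-IID distributions.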

model through the gradient updates on end devices. Stochastic Gradient Descent (SGD) is a widely used gradient descent method that updates the parameters with minibatch gradients over the data set. Generally, there are two kinds of SGD: synchronous and asynchronous SGD [173]. In synchronous SGD, each device updates its parameters in lockstep once all the devices have finished computing the gradients on their local training data. In asynchronous SGD, each device pushes its parameter updates to the central server as soon as it finishes computing its gradients. Synchronous SGD may converge to a good solution but is slow in practice because of the requirement of waiting for the other devices. Asynchronous SGD converges faster than synchronous SGD at the cost of additional noise, but it updates the parameters by using stale information, thereby leading to convergence to poorer solutions. Thus, more studies focus on making synchronous SGD faster or allowing asynchronous SGD to converge to better solutions. The Elastic Averaging SGD (EASGD) method [174] is introduced to reduce the communication costs of asynchronous SGD training; the basic idea is to allow each end device to carry out more local training computations and to deviate further from the central shared parameters before synchronizing its updates. This method decreases the amount of communication between the local devices and the central server.

FL also adopts SGD. In [169], the Federated Averaging (FedAvg) method for FL with a DNN established based on iterative model averaging is presented, in which end devices update the DNN model with one-step SGD and the server averages the obtained models with weights. This approach can also be applied to unbalanced and non-IID distributions. To reduce the communication costs from end devices to the central server, structured updates and sketched updates [168] are proposed and shown to enhance the efficiency of communication. The structured update method obtains the model update by restricting the parameters to a smaller number of variables, as in the low-rank and random mask methods. In the sketched update process, a full model update is obtained and compressed by combined operations of subsampling, probabilistic quantization and structured random rotation before being sent to the server. To overcome the conventional shortcomings of synchronous and asynchronous SGD, straggler effects are mitigated in synchronous stochastic optimization with backup workers [173]. The core idea is to compute the minibatch gradient updates of the decentralized devices with only a subset of the worker machines. The training process updates the full parameters and goes on to the next iteration once the server receives gradient updates from enough devices. This approach not only reduces the long latencies involved in waiting for straggler devices in synchronous SGD but also avoids the asynchronous noise introduced by the worst stragglers.

2) Size of Training Updates
In AIoT applications, communication bandwidth is required for gradient exchange, and the practical bandwidth limits the scalability of multinode training. In addition to the frequency of training updates, the size of the training updates significantly influences the transmission bandwidth. Gradient compression is usually adopted to reduce the size of model updates communicated to the central server by compressing the gradient information of the updated model. Generally, gradient quantization and gradient sparsification are recommended to perform gradient compression [175].

Gradient quantization carries out lossy compression of the gradient vectors by using finite low-bit-width numbers instead of the original floating-point gradients, which is similar to the parameter quantization used for inference. The difference between these methods lies in whether the quantization technique is applied to the model gradients or the model parameters. Most of the gradient exchange in distributed SGD has been proven to be redundant; thus, Deep Gradient Compression (DGC) is proposed to decrease the communication bandwidth by compressing the updated gradients for a wide range of CNNs and RNNs [175]. To preserve the training accuracy after compression, DGC uses four techniques: momentum correction, local gradient clipping, momentum factor masking, and warm-up training.

Gradient sparsification reduces the communication costs by dropping insignificant gradient updates and transmitting only updates that exceed a certain threshold. A convex optimization formulation [176] is employed to reduce the coding length of stochastic gradients, where the coordinates of the stochastic gradient vectors are randomly dropped and the remaining coordinates are appropriately amplified to keep the results unbiased.

Furthermore, more research has explored combinations of gradient quantization and sparsification. The convergence rate of distributed SGD for nonconvex stochastic optimization is analyzed using sparse parameter averaging and gradient quantization [177]. Adaptive Residual Gradient Compression (AdaComp) is proposed to compress the updates communicated to the server by applying the gradient sparsification method in combination with efficient gradient selection and


learning rate modulation [178].

C. Security Enhancement
Distributed learning has shown a strong trend toward large-scale model training in AIoT, where a server coordinates the computational power of end devices by sharing the trained data or aggregating local models trained on individual devices. This kind of method can stop privacy leaks caused by directly sharing the raw data collected from end devices; however, the gradient information shared by end devices still inevitably divulges private information. Thus, research and development on privacy preservation for AI model training is necessary. Generally, there are two obstacles: the first is that attackers may infer sensitive information from aggregated data or gradients; the other is that third parties are not trusted, which may cause data or model leakage. Next, this paper introduces privacy-enhancing techniques from two perspectives: data privacy and security as well as system security.

1) Data Privacy and Security
The distributed collaborative methods used to train AI models are usually vulnerable to model poisoning attacks [179], and the client's contribution during training and the sensitive information provided are susceptible to leakage through analysis of the locally trained model. To protect the privacy of the updated local models, local differential privacy is incorporated into the local gradient descent training scheme of FL [180]. A randomly distributed update scheme is also employed to remove the security threats posed by a centralized curator. Furthermore, differential privacy is used in training an FL model by injecting Laplacian noise into the features extracted by the designed CNN model on the mobile device so that an adversary cannot infer sensitive information from the learned model [181].

Cryptographic technology is also suitable for distributed model training. A multikey-based distributed DL framework is introduced to enhance the data security of clients with the aid of homomorphic reencryption in asynchronous SGD [182]. Although the proposed framework adds additional communication costs, it achieves better security properties.

Adding too much noise to the model parameters leads to poor model training performance, while the sole usage of secure multiparty computation in FL results in vulnerability to inference attacks. A hybrid approach using differential privacy and secure multiparty computation for FL systems is proposed to balance these tradeoffs, reducing the growth of noise injection as the number of parties increases [183]. This system is suitable for various AI models (e.g., CNNs, DTs and SVMs) and provides privacy guarantees with high accuracy.

2) System Security
In the distributed training mode, a third party may use the clients' data illegally or be vulnerable to hacking, thus causing unpredictable security issues [181]. Blockchain is a decentralized and distributed shared ledger and database that stores data and features traceability, consensus, transparency and fast settlement [184]. These characteristics lay a solid foundation for creating trust in the blockchain. Many works have explored the use of blockchain to improve data security in distributed model training. To guarantee the credibility of the third party, the blockchain, instead of a centralized aggregator, is employed in FL, where customers sign the hashes of encrypted models and send the locally trained models to the blockchain. The miners check the identities of the senders and create a federated model by using the downloaded locally trained models [181]. One miner is chosen as the temporary leader and takes responsibility for encrypting and returning the final model to the blockchain. Such a blockchain system can protect the security of model updates by tracing malicious model updates. Reference [185] presents a blockchain-powered differentially private data-sharing model for multiple distributed parties, where the mapped data models are shared between the multiple parties via the blockchain; consequently, the data are protected, and the trained model accuracy is enhanced.

VII. OPEN CHALLENGES AND FUTURE DIRECTIONS
The AIoT has a myriad of applications in daily life and brings considerable convenience to humans; however, it is still in its infancy with broad prospects. The AIoT also faces inevitable challenges in practical deployment, including finding a cooperative mode among end devices, edge servers, and the cloud. In this section, some of the open challenges and potential future directions in AIoT are discussed.

A. Heterogeneity and Interoperability
End devices in the perception layer of the AIoT vary from the Raspberry Pi to FPGA-based products to smartphones. Owing to the diversity of sensors and devices as well as the complexity of the physical environment to be sensed, varieties of devices need to be deployed on discrete servers for different applications or services to realize a comprehensive perception of the environment, which indicates the heterogeneity of the AIoT architecture. For example, sensing devices for AD are mounted on RSUs, and sensors for smart homes are deployed on smart gateways. To make smart decisions, data exchange and fusion among such interconnected devices are also required over heterogeneous networks, namely, Bluetooth, NB-IoT, ZigBee, Wi-Fi, the Hypertext Transfer Protocol/Transmission Control Protocol (HTTP/TCP) or the User Datagram Protocol (UDP). Thus, AIoT systems are expected to be extremely heterogeneous in terms of devices, platforms, and frameworks. Interoperability and coordination among heterogeneous devices and platforms are essential. Attempts to explore network softwarization paradigms, such as SDN [186] and Network Function Virtualization (NFV) [187], may bring many improvements by promoting efficient and flexible operation. SDN technologies can simplify management systems by using their programmability and offer a unified framework to manage various end devices or sensors. SDN can virtualize physical devices or provide customized services to address the heterogeneity of the devices. NFV virtualizes network node functions into software modules through its virtualization technology. Recently, attempts to amalgamate SDN and NFV in edge-cloud computing have


been made to improve the QoS for AIoT-driven applications [188], [189]. Furthermore, a standard communication protocol for such heterogeneous devices and sensors in the edge-cloud environment is required to support smooth communication in the network layer. OpenFlow is the standard communication protocol between an SDN controller and a switch and has attracted escalating attention from researchers [190].

DL models can be accelerated if they are implemented on a GPU edge server; however, it remains a challenge to make NFV compatible with a GPU. There is also some future work required to successfully deploy such paradigms, such as in security, resource allocation, runtime service deployment, and computational offloading.

B. Resource Management

Advancements in AIoT have created many applications, such as smart homes and the IoV. In AIoT systems, many sensors and devices are deployed distributively to collect data. Distributed sensors and devices are usually powered by batteries with limited computation and storage capacities, and it is difficult to execute latency-sensitive computation tasks on their own computing resources. To fully explore the potential of dispersive resources across edge nodes and devices, it is helpful to partition complex AI models into small subtasks and offload these subtasks to various edge nodes and devices for collaborative training. The service environments of many complex and diverse AIoT applications, such as the IoV, are highly dynamic, thus making it difficult to predict what will happen. Thus, the ability to perform online edge resource orchestration and provisioning is required to support substantial AIoT tasks. Schemes aimed at the real-time joint optimization of heterogeneous end devices’ computing, communication, and caching resource coordination at runtime according to different task requirements should be thoroughly addressed. A joint caching and computing policy has been designed to minimize the bandwidth cost of a wireless multicast channel [191]. Other research efforts aim to manage resource allocation and scheduling using AI techniques, such as DRL [43], [192].

C. Model Inference and Training

Section V presents AI inference compression and acceleration technologies involving many hyperparameters, which require empirical experiments and expert knowledge to adjust the networks. Moreover, the networks need to be retrained by fine-tuning based on many experiments. It is worthwhile to develop adaptive or automatic compression and acceleration techniques. Some studies are working in this direction [193], [194]. From a software perspective, acceleration technologies, such as pruning and quantization, may make AI models hardware friendly but decrease performance. It is a promising direction to support the execution of AI models by hardware acceleration. More hardware measures should be taken to address this problem.

Due to the limited computation, storage and network resources and the distributed heterogeneous data, it is difficult to train AI models in parallel. FL, a distributed computing architecture with the benefits of little data communication traffic, lossless model quality and data isolation, breaks through the bottlenecks of data-driven requirements and privacy protection faced by AI models. However, the bandwidth of edge nodes is limited and heterogeneous, and their computing capabilities are different. In addition, different edge nodes have different amounts of data, which are unevenly distributed. Distributed SGD often results in communication delays because edge servers have to wait for all the model parameters to be returned from the host server in every iteration. A variety of parallel communication mechanisms need to be further explored to improve efficiency [195]. In addition, the existing quantization methods are generally applied in AI inference. Fine-grained quantization-aware training, which supports forward and backward propagation, can be applied to AIoT applications [196].

D. Security and Privacy

Although the AIoT brings great convenience, it is subject to security and privacy issues, such as malicious attacks and privacy leakage. As mentioned above, AI models are more likely to be deployed at the edge of the network or directly on an end device to deliver near-zero-delay services. Edge servers and end devices are often equipped with limited computation and storage resources and are confronted with malicious attacks, namely, Distributed-Denial-of-Service (DDoS) attacks (such as Mirai). Flooding-based DDoS can still work effectively in edge computing systems. The excessively great computation and communication loads produced by the existing security methods are infeasible for end devices to use to protect their security. To address this issue, researchers can further emphasize lightweight security mechanisms for resource-constrained devices. Physical Unclonable Functions (PUFs), with the advantages of antiphysical intrusion capabilities, lower computational consumption, less resource usage, convenient implementation and unique physical attributes, will be promising research points for security authentication in edge computing environments. Furthermore, it is necessary to explore hardware-assisted protection mechanisms based on RISC-V. Thus far, some explorations have begun [197], [198].

Privacy issues are also vitally important. The AIoT may be vulnerable to data and firmware attacks. A mass of data is generated by end users and devices and is stored in local devices or edge servers, which may contain sensitive information (e.g., user location information and health or activity records). Exposure of such information will result in serious consequences. In addition, the AIoT faces transmission attacks because a mass of diverse data is required to design and train AI algorithms. The performance of algorithms will decrease if sufficient training data are not provided. Thus, data transmission between edge infrastructures will also cause privacy leakage. One promising direction is to adopt the FL method to perform privacy-preserving distributed data training. Other technologies, such as differential privacy, homomorphic encryption and secure multiparty computation, are also employed to design parameter-sharing AI models with privacy assurance in the edge computing environment. Moreover, blockchain also plays an important role in security and privacy for the IoT, which can be combined with the aforementioned techniques, such as FL, to preserve privacy.
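As a concrete illustration of how two of these techniques can be combined, the following minimal Python sketch applies a differential-privacy step (norm clipping plus Gaussian noise) to each client's model update before a FedAvg-style weighted aggregation on the server. This is our own toy illustration under stated assumptions, not a scheme proposed in the surveyed works; all function names, the toy two-parameter model, and the noise parameters are hypothetical.

```python
import random

def clip_and_noise(update, clip=1.0, sigma=0.1):
    """Clip a client's update to a bounded L2 norm, then add Gaussian noise.
    This is the differential-privacy step applied before the update leaves
    the device, so the server never sees the raw local gradient."""
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, clip / max(norm, 1e-12))
    return [u * scale + random.gauss(0.0, sigma * clip) for u in update]

def fedavg(global_w, client_updates, client_sizes):
    """Server-side federated averaging: weight each (privatized) update
    by the client's local data size and apply it to the global model."""
    total = sum(client_sizes)
    agg = [
        sum(n * upd[i] for upd, n in zip(client_updates, client_sizes)) / total
        for i in range(len(global_w))
    ]
    return [w + a for w, a in zip(global_w, agg)]

# One synchronous round with three simulated clients on a toy 2-parameter model.
random.seed(0)
global_w = [0.0, 0.0]
raw_updates = [[0.5, -0.2], [0.3, 0.1], [5.0, 5.0]]   # third client is an outlier
sizes = [100, 50, 10]
private = [clip_and_noise(u) for u in raw_updates]
global_w = fedavg(global_w, private, sizes)
print(global_w)
```

Clipping bounds each client's influence on the aggregate, which is what makes the added noise translate into a quantifiable privacy guarantee; in a real deployment the noise scale would be calibrated to a target privacy budget, and the synchronous round above is exactly the barrier that the parallel communication mechanisms discussed in Section C try to relax.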
However, blockchain technology developed for IoT networks fails in performance because of a waste of network resources (such as communication bandwidth and computational resources). Some researchers also resort to combining blockchain with the IoT in the context of the Sixth Generation (6G) communication network to reduce computational cost, which shows great potential in AI and data storage and analytics [199]. Thus, blockchain-based solutions for the AIoT should be further explored to protect users and devices from attacks.

E. Artificial Intelligence Ethics in Artificial Intelligence of Things

In AIoT systems, AI decisions may be made in milliseconds with no time for human oversight, which requires AI algorithms to autonomously learn in real-world scenarios without harming people or undermining their rights. To address the ethical problems of AI-based technology, some design principles, e.g., justice, honesty, accountability, safety and sustainability, need to be considered. In AI ethics, justice that refrains from having any prejudice or favoritism towards individuals is prioritized the highest. The realization of justice consists of the fairness, non-discrimination and diversity of data, algorithm, implementation and outcome. Honesty, the core of fulfilling the series of AI ethical issues, underpins the idea of explainability or technical transparency regarding AI systems. Achieving AI honesty requires transparency, openness and interpretability of data and technology, and acknowledging errors. Accountability cannot be neglected in AI ethics, where AI developers, designers or institutions should be accountable for AI actions and their outcomes. Accountability should be established across the whole design and implementation workflow. Safety is the destination of AI ethics, which cares more about the accuracy, reliability, security, and robustness of AI systems. To enhance security, AI designers should state that AI systems will never result in foreseeable or unintentional harm, including military war or malicious cyber hacking. Finally, sustainability in AI ethics requires protecting the environment and improving the ecosystem when developing and deploying AI systems. To reach this goal, AI-based applications should be designed, deployed and managed considering increases in their energy efficiencies and reductions in their ecological footprints.

VIII. CONCLUSION

The extensive use of the IoT faces a series of challenges, including data explosion and heterogeneity. The emergence of advanced AI technologies is a potential solution to these challenges and can help the IoT discover significant information, make accurate predictions and carry out early responses. The end-edge-cloud coordination AIoT provides flexible QoS guarantees according to various requirements, such as services in different delay ranges and high-accuracy predictions, and enables multiple users to participate. However, the edge-cloud coordination AIoT may also endanger human life when encountering faults, situations that have never occurred, or improper operation. Moreover, with various household appliances (smart lights, cars, etc.) accessing the internet, on the one hand, life becomes increasingly convenient; on the other hand, it is more convenient for hackers to attack. Additionally, people may become lazy due to enjoying such conveniences. AIoT applications can replace jobs due to their low cost and high efficiency, thereby leading to fewer job opportunities. The AIoT also cannot discover the root causes of errors that are difficult to correct and need to be modified manually.

This survey explores the edge-computing-enabled integration of AI with the IoT and comprehensively reviews the current research efforts on AIoT. Specifically, an overview of the IoT, AI technology and edge computing is given. Then, this paper discusses opportunities for the synthesis of IoT and AI, depicts a general AIoT architecture and introduces a practical AIoT example to explain how AI can be applied in real-world scenarios. Additionally, the key AIoT technologies for AI models regarding inference and training at the edge of the network are provided in detail. This paper further outlines the open challenges and future directions of AIoT.

REFERENCES

[1] L. Atzori, A. Iera, and G. Morabito, “The internet of things: A survey,” Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.
[2] M. Shirer and C. MacGillivray. The growth in connected iot devices is expected to generate 79.4zb of data in 2025, according to a new idc forecast. [Online]. Available: https://www.iotcentral.io/blog/iot-is-not-a-buzzword-but-necessity
[3] Y. Kang, J. Hauswald, C. Gao, A. Rovinski, T. Mudge, J. Mars, and L. Tang, “Neurosurgeon: Collaborative intelligence between the cloud and mobile edge,” ACM SIGARCH Computer Architecture News, vol. 45, no. 1, pp. 615–629, 2017.
[4] J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Networks, vol. 61, pp. 85–117, 2015.
[5] B. Heintz, A. Chandra, and R. K. Sitaraman, “Optimizing grouped aggregation in geo-distributed streaming analytics,” in Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015, pp. 133–144.
[6] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: The communication perspective,” IEEE Communications Surveys & Tutorials, vol. 19, no. 4, pp. 2322–2358, 2017.
[7] S. Deng, H. Zhao, W. Fang, J. Yin, S. Dustdar, and A. Y. Zomaya, “Edge intelligence: The confluence of edge computing and artificial intelligence,” IEEE Internet of Things Journal, vol. 7, no. 8, pp. 7457–7469, 2020.
[8] W. Hu, Y. Gao, K. Ha, J. Wang, B. Amos, Z. Chen, P. Pillai, and M. Satyanarayanan, “Quantifying the impact of edge computing on mobile applications,” in Proceedings of the 7th ACM SIGOPS Asia-Pacific Workshop on Systems, 2016, pp. 1–8.
[9] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, “Edge computing: Vision and challenges,” IEEE Internet of Things Journal, vol. 3, no. 5, pp. 637–646, 2016.
[10] F. Bu and X. Wang, “A smart agriculture iot system based on deep reinforcement learning,” Future Generation Computer Systems, vol. 99, pp. 500–507, 2019.
[11] W. Zhan, C. Luo, J. Wang, C. Wang, G. Min, H. Duan, and Q. Zhu, “Deep-reinforcement-learning-based offloading scheduling for vehicular edge computing,” IEEE Internet of Things Journal, vol. 7, no. 6, pp. 5449–5465, 2020.
[12] Y. Dai, D. Xu, S. Maharjan, G. Qiao, and Y. Zhang, “Artificial intelligence empowered edge computing and caching for internet of vehicles,” IEEE Wireless Communications, vol. 26, no. 3, pp. 12–18, 2019.
[13] K. Zhang, Y. Mao, S. Leng, Y. He, and Y. Zhang, “Mobile-edge computing for vehicular networks: A promising network paradigm with predictive off-loading,” IEEE Vehicular Technology Magazine, vol. 12, no. 2, pp. 36–44, 2017.
[14] J. Chen and X. Ran, “Deep learning with edge computing: A review,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1655–1674, 2019.
[15] A. Ghosh, D. Chakraborty, and A. Law, “Artificial intelligence in internet of things,” CAAI Transactions on Intelligence Technology, vol. 3, no. 4, pp. 208–218, 2018.
[16] I. U. Din, M. Guizani, J. J. Rodrigues, S. Hassan, and V. V. Korotaev, “Machine learning in the internet of things: Designed techniques for smart cities,” Future Generation Computer Systems, vol. 100, pp. 826–843, 2019.
[17] L. Cui, S. Yang, F. Chen, Z. Ming, N. Lu, and J. Qin, “A survey on application of machine learning for internet of things,” International Journal of Machine Learning and Cybernetics, vol. 9, no. 8, pp. 1399–1417, 2018.
[18] M. Mohammadi, A. Al-Fuqaha, S. Sorour, and M. Guizani, “Deep learning for iot big data and streaming analytics: A survey,” IEEE Communications Surveys & Tutorials, vol. 20, no. 4, pp. 2923–2960, 2018.
[19] Y. Chen, B. Zheng, Z. Zhang, Q. Wang, C. Shen, and Q. Zhang, “Deep learning on mobile and embedded devices: State-of-the-art, challenges, and future directions,” ACM Computing Surveys (CSUR), vol. 53, no. 4, pp. 1–37, 2020.
[20] H. Li, K. Ota, and M. Dong, “Learning iot in edge: Deep learning for the internet of things with edge computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.
[21] Z. Zhou, X. Chen, E. Li, L. Zeng, K. Luo, and J. Zhang, “Edge intelligence: Paving the last mile of artificial intelligence with edge computing,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1738–1762, 2019.
[22] A. H. Sodhro, S. Pirbhulal, and V. H. C. De Albuquerque, “Artificial intelligence-driven mechanism for edge computing-based industrial applications,” IEEE Transactions on Industrial Informatics, vol. 15, no. 7, pp. 4235–4243, 2019.
[23] H. Ji, O. Alfarraj, and A. Tolba, “Artificial intelligence-empowered edge of vehicles: Architecture, enabling technologies, and applications,” IEEE Access, vol. 8, pp. 61020–61034, 2020.
[24] S. Sankaranarayanan and S. Mookherji, “Svm-based traffic data classification for secured iot-based road signaling system,” in Research Anthology on Artificial Intelligence Applications in Security. IGI Global, 2021, pp. 1003–1030.
[25] D. Valluru and I. J. S. Jeya, “Iot with cloud based lung cancer diagnosis model using optimal support vector machine,” Health Care Management Science, pp. 1–10, 2019.
[26] S. Panda and G. Panda, “Intelligent classification of iot traffic in healthcare using machine learning techniques,” in 2020 6th International Conference on Control, Automation and Robotics (ICCAR). IEEE, 2020, pp. 581–585.
[27] A. Alabdulkarim, M. Al-Rodhaan, T. Ma, and Y. Tian, “Ppsdt: A novel privacy-preserving single decision tree algorithm for clinical decision-support systems using iot devices,” Sensors, vol. 19, no. 1, p. 142, 2019.
[28] X. Wang, C. Shao, S. Xu, S. Zhang, W. Xu, and Y. Guan, “Study on the location of private clinics based on k-means clustering method and an integrated evaluation model,” IEEE Access, vol. 8, pp. 23069–23081, 2020.
[29] X. Liu, G. Chen, X. Sun, and A. Knoll, “Ground moving vehicle detection and movement tracking based on the neuromorphic vision sensor,” IEEE Internet of Things Journal, vol. 7, no. 9, pp. 9026–9039, 2020.
[30] K. Costa, P. Ribeiro, A. Camargo, V. Rossi, H. Martins, M. Neves, R. Fabris, R. Imaisumi, and J. P. Papa, “Comparison of the techniques decision tree and mlp for data mining in spams detection to computer networks,” in Third International Conference on Innovative Computing Technology (INTECH 2013). IEEE, 2013, pp. 344–348.
[31] A. Mukherjee, S. Misra, N. S. Raghuwanshi, and S. Mitra, “Blind entity identification for agricultural iot deployments,” IEEE Internet of Things Journal, vol. 6, no. 2, pp. 3156–3163, 2018.
[32] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in European Conference on Computer Vision. Springer, 2014, pp. 818–833.
[33] S. Khan, K. Muhammad, S. Mumtaz, S. W. Baik, and V. H. C. de Albuquerque, “Energy-efficient deep cnn for smoke detection in foggy iot environment,” IEEE Internet of Things Journal, vol. 6, no. 6, pp. 9237–9245, 2019.
[34] V. Bianchi, M. Bassoli, G. Lombardo, P. Fornacciari, M. Mordonini, and I. De Munari, “Iot wearable sensor and deep learning: An integrated approach for personalized human activity recognition in a smart home environment,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 8553–8562, 2019.
[35] G. Bedi, G. K. Venayagamoorthy, and R. Singh, “Development of an iot-driven building environment for prediction of electric energy consumption,” IEEE Internet of Things Journal, vol. 7, no. 6, pp. 4912–4921, 2020.
[36] Z. Che, S. Purushotham, K. Cho, D. Sontag, and Y. Liu, “Recurrent neural networks for multivariate time series with missing values,” Scientific Reports, vol. 8, no. 1, pp. 1–12, 2018.
[37] P. Zhang, W. Ouyang, P. Zhang, J. Xue, and N. Zheng, “Sr-lstm: State refinement for lstm towards pedestrian trajectory prediction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 12085–12094.
[38] H. Zheng, F. Lin, X. Feng, and Y. Chen, “A hybrid deep learning model with attention-based conv-lstm networks for short-term traffic flow prediction,” IEEE Transactions on Intelligent Transportation Systems, 2020.
[39] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” arXiv preprint arXiv:1406.2661, 2014.
[40] A. Natani, A. Sharma, T. Peruma, and S. Sukhavasi, “Deep learning for multi-resident activity recognition in ambient sensing smart homes,” in 2019 IEEE 8th Global Conference on Consumer Electronics (GCCE). IEEE, 2019, pp. 340–341.
[41] P. Wei, S. Xia, R. Chen, J. Qian, C. Li, and X. Jiang, “A deep-reinforcement-learning-based recommender system for occupant-driven energy optimization in commercial buildings,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6402–6413, 2020.
[42] N. C. Luong, D. T. Hoang, S. Gong, D. Niyato, P. Wang, Y.-C. Liang, and D. I. Kim, “Applications of deep reinforcement learning in communications and networking: A survey,” IEEE Communications Surveys & Tutorials, vol. 21, no. 4, pp. 3133–3174, 2019.
[43] N. Cheng, F. Lyu, W. Quan, C. Zhou, H. He, W. Shi, and X. Shen, “Space/aerial-assisted computing offloading for iot applications: A learning-based approach,” IEEE Journal on Selected Areas in Communications, vol. 37, no. 5, pp. 1117–1129, 2019.
[44] H. Van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double q-learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30, no. 1, 2016.
[45] S. Tuli, S. Ilager, K. Ramamohanarao, and R. Buyya, “Dynamic scheduling for stochastic edge-cloud computing environments using a3c learning and residual recurrent neural networks,” IEEE Transactions on Mobile Computing, 2020.
[46] Y. Liang, C. Guo, Z. Ding, and H. Hua, “Agent-based modeling in electricity market using deep deterministic policy gradient algorithm,” IEEE Transactions on Power Systems, vol. 35, no. 6, pp. 4180–4192, 2020.
[47] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.
[48] M. A. Vouk, “Cloud computing–issues, research and implementations,” Journal of Computing and Information Technology, vol. 16, no. 4, pp. 235–246, 2008.
[49] M. R. Rahimi, J. Ren, C. H. Liu, A. V. Vasilakos, and N. Venkatasubramanian, “Mobile cloud computing: A survey, state of art and future directions,” Mobile Networks and Applications, vol. 19, no. 2, pp. 133–143, 2014.
[50] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case for vm-based cloudlets in mobile computing,” IEEE Pervasive Computing, vol. 8, no. 4, pp. 14–23, 2009.
[51] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, “Fog computing and its role in the internet of things,” in Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, 2012, pp. 13–16.
[52] T. Verbelen, P. Simoens, F. De Turck, and B. Dhoedt, “Cloudlets: Bringing the cloud to the mobile user,” in Proceedings of the Third ACM Workshop on Mobile Cloud Computing and Services, 2012, pp. 29–36.
[53] S. Yi, C. Li, and Q. Li, “A survey of fog computing: Concepts, applications and issues,” in Proceedings of the 2015 Workshop on Mobile Big Data, 2015, pp. 37–42.
[54] F. Bonomi, R. Milito, P. Natarajan, and J. Zhu, “Fog computing: A platform for internet of things and analytics,” in Big Data and Internet of Things: A Roadmap for Smart Environments. Springer, 2014, pp. 169–186.
[55] G. Mei, N. Xu, J. Qin, B. Wang, and P. Qi, “A survey of internet of things (iot) for geohazard prevention: Applications, technologies,
and challenges,” IEEE Internet of Things Journal, vol. 7, no. 5, pp. [82] T. Wu, Y. Wang, W. Shi, and J. Lu, “Hydramini: An fpga-based
4371–4386, 2019. affordable research and education platform for autonomous driving,” in
[56] M. Chiang and T. Zhang, “Fog and iot: An overview of research 2020 International Conference on Connected and Autonomous Driving
opportunities,” IEEE Internet of things journal, vol. 3, no. 6, pp. 854– (MetroCAD). IEEE, 2020, pp. 45–52.
864, 2016. [83] R. H. Arpaci-Dusseau and A. C. Arpaci-Dusseau, Operating systems:
[57] E. TPU. Google’s purpose-built asic designed to run inference at the Three easy pieces. Arpaci-Dusseau Books LLC, 2018.
edge. [Online]. Available: https://cloud.google.com/edge-tpu/ [84] X. Inc. Dnndk: Deep neural network development kit. [Online].
[58] Y. Chen, T. Chen, Z. Xu, N. Sun, and O. Temam, “Diannao family: Available: https://www.xilinx.com/products/design-tools/ai-inference/
energy-efficient hardware accelerators for machine learning,” Commu- edgeai-platform.htmldnndk
nications of the ACM, vol. 59, no. 11, pp. 105–112, 2016. [85] Y. Wang, L. Liu, X. Zhang, and W. Shi, “Hydraone: An indoor exper-
[59] NVIDIA. Turing gpu architecture. [Online]. Available: https://www. imental research and education platform for cavs,” in 2nd {USENIX}
nvidia.com/en-us/geforce/turing/ Workshop on Hot Topics in Edge Computing (HotEdge 19), 2019.
[60] ——. Adaptable. intelligent. [Online]. Available: https://www.xilinx. [86] S. Maheshwari, W. Zhang, I. Seskar, Y. Zhang, and D. Raychaudhuri,
com/products/silicon-devices/cost-optimized-portfolio.html “Edgedrive: Supporting advanced driver assistance systems using mo-
[61] HiSilicon. The world’s first full-stack all-scenario ai chip. [Online]. bile edge clouds networks,” in IEEE INFOCOM 2019-IEEE Confer-
Available: http://www.hisilicon.com/en/Products/ProductList/Ascend ence on Computer Communications Workshops (INFOCOM WKSHPS).
[62] Samsung. Mobile processor exynos 9820. [Online]. Available: IEEE, 2019, pp. 1–6.
http://www.hisilicon.com/en/Products/ProductList/Ascend [87] A. Villanueva, R. L. L. Benemerito, M. J. M. Cabug-Os, R. B. Chua,
[63] I. X. P. D.-. P. Brief. Advanced intelligence for high-density edge C. K. D. Rebeca, and M. Miranda, “Somnolence detection system
solution. [Online]. Available: http://www.hisilicon.com/en/Products/ utilizing deep neural network,” in 2019 International Conference on
ProductList/Ascend Information and Communications Technology (ICOIACT). IEEE,
[64] J. Hsu, “Ibm’s new brain [news],” IEEE Spectrum, vol. 51, no. 10, pp. 2019, pp. 602–607.
17–19, 2014. [88] W.-J. Tsaur and L.-Y. Yeh, “Dans: A secure and efficient driver-
[65] O. N. Foundation. Cord. [Online]. Available: https://www. abnormal notification scheme with iot devices over iov,” IEEE Systems
opennetworking.org/cord Journal, vol. 13, no. 2, pp. 1628–1639, 2018.
[66] T. L. Foundation. Edgex foundry. [Online]. Available: https: [89] D. Tian, C. Zhang, X. Duan, and X. Wang, “An automatic car accident
//www.edgexfoundry.org detection method based on cooperative vehicle infrastructure systems,”
[67] ——. Akraino edge stack. [Online]. Available: https://www.lfedge. IEEE Access, vol. 7, pp. 127 453–127 463, 2019.
org/projects/akraino/ [90] J. Berrezueta-Guzman, I. Pau, M.-L. Martı́n-Ruiz, and N. Máximo-
[68] M. Azure. Azure iot edge, extend cloud intelligence and analytics to Bocanegra, “Smart-home environment to support homework activities
edge devices. [Online]. Available: https://github.com/Azure/iotedge for children,” IEEE Access, vol. 8, pp. 160 251–160 267, 2020.
[69] A. W. Services. Aws iot greengrass. [Online]. Available: https: [91] F. Wang, W. Gong, and J. Liu, “On spatial diversity in wifi-based
//aws.amazon.com/cn/greengrass/ human activity recognition: A deep learning-based approach,” IEEE
[70] Y. Xiong, Y. Sun, L. Xing, and Y. Huang, “Extend cloud to edge with Internet of Things Journal, vol. 6, no. 2, pp. 2035–2047, 2018.
kubeedge,” in 2018 IEEE/ACM Symposium on Edge Computing (SEC). [92] H. Zou, Y. Zhou, H. Jiang, S.-C. Chien, L. Xie, and C. J. Spanos,
IEEE, 2018, pp. 373–377. “Winlight: A wifi-based occupancy-driven lighting control system for
[71] OpenEdge. Extend cloud computing, data and service seamlessly to smart building,” Energy and Buildings, vol. 158, pp. 924–938, 2018.
edge devices. [Online]. Available: https://github.com/baetyl/baetyl [93] Q. Shi, Z. Zhang, T. He, Z. Sun, B. Wang, Y. Feng, X. Shan, B. Salam,
[72] Q. Zhang, Y. Wang, X. Zhang, L. Liu, X. Wu, W. Shi, and H. Zhong, and C. Lee, “Deep learning enabled smart mats as a scalable floor
“Openvdap: An open vehicular data analytics platform for cavs,” in monitoring system,” Nature Communications, vol. 11, no. 1, pp. 1–11,
2018 IEEE 38th International Conference on Distributed Computing 2020.
Systems (ICDCS). IEEE, 2018, pp. 1310–1320. [94] F. De Vita, G. Nocera, D. Bruneo, V. Tomaselli, D. Giacalone, and
[73] C.-C. Hung, G. Ananthanarayanan, P. Bodik, L. Golubchik, M. Yu, S. K. Das, “Quantitative analysis of deep leaf: a plant disease detector
P. Bahl, and M. Philipose, “Videoedge: Processing camera streams on the smart edge,” in 2020 IEEE International Conference on Smart
using hierarchical clusters,” in 2018 IEEE/ACM Symposium on Edge Computing (SMARTCOMP). IEEE, 2020, pp. 49–56.
Computing (SEC). IEEE, 2018, pp. 115–131. [95] V. Mazzia, A. Khaliq, F. Salvetti, and M. Chiaberge, “Real-time apple
[74] Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, detection system using embedded systems with hardware accelerators:
and O. Temam, “Shidiannao: Shifting vision processing closer to the an edge ai application,” IEEE Access, vol. 8, pp. 9102–9114, 2020.
sensor,” in Proceedings of the 42nd Annual International Symposium [96] C. Krintz, R. Wolski, N. Golubovic, and F. Bakir, “Estimating outdoor
on Computer Architecture, 2015, pp. 92–104. temperature from cpu temperature for iot applications in agriculture,”
[75] N. Corporation. Jetson tx2 module. [Online]. Available: https: in Proceedings of the 8th International Conference on the Internet of
//developer.nvidia.com/embedded/buy/jetson-tx2 Things, 2018, pp. 1–8.
[76] S. Kato, S. Tokunaga, Y. Maruyama, S. Maeda, M. Hirabayashi, [97] G. Lavanya, C. Rani, and P. Ganeshkumar, “An automated low cost iot
[76] Y. Kitsukawa, A. Monrroy, T. Ando, Y. Fujii, and T. Azumi, “Autoware on board: Enabling autonomous vehicles with embedded systems,” in 2018 ACM/IEEE 9th International Conference on Cyber-Physical Systems (ICCPS). IEEE, 2018, pp. 287–296.
[77] Y. Wang, J. Shen, T.-K. Hu, P. Xu, T. Nguyen, R. Baraniuk, Z. Wang, and Y. Lin, “Dual dynamic inference: Enabling more efficient, adaptive, and controllable deep inference,” IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 4, pp. 623–633, 2020.
[78] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura et al., “A million spiking-neuron integrated circuit with a scalable communication network and interface,” Science, vol. 345, no. 6197, pp. 668–673, 2014.
[79] M. Davies, N. Srinivasa, T.-H. Lin, G. Chinya, Y. Cao, S. H. Choday, G. Dimou, P. Joshi, N. Imam, S. Jain et al., “Loihi: A neuromorphic manycore processor with on-chip learning,” IEEE Micro, vol. 38, no. 1, pp. 82–99, 2018.
[80] T. A. S. Foundation. Apache Edgent. [Online]. Available: https://github.com/apache/incubator-retired-edgent
[81] Y. Xiao, Y. Jia, C. Liu, X. Cheng, J. Yu, and W. Lv, “Edge computing security: State of the art and challenges,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1608–1631, 2019.
[97] … “…based fertilizer intimation system for smart agriculture,” Sustainable Computing: Informatics and Systems, vol. 28, p. 100300, 2020.
[98] S. K. Singh, F. Carpio, and A. Jukan, “Improving animal-human cohabitation with machine learning in fiber-wireless networks,” Journal of Sensor and Actuator Networks, vol. 7, no. 3, p. 35, 2018.
[99] R. Nikhil, B. Anisha, and R. Kumar, “Real-time monitoring of agricultural land with crop prediction and animal intrusion prevention using internet of things and machine learning at edge,” in 2020 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). IEEE, 2020, pp. 1–6.
[100] A. Ahad, M. Tahir, and K.-L. A. Yau, “5g-based smart healthcare network: Architecture, taxonomy, challenges and future research directions,” IEEE Access, vol. 7, pp. 100747–100762, 2019.
[101] Z. Zhang, T. He, M. Zhu, Q. Shi, and C. Lee, “Smart triboelectric socks for enabling artificial intelligence of things (aiot) based smart home and healthcare,” in 2020 IEEE 33rd International Conference on Micro Electro Mechanical Systems (MEMS). IEEE, 2020, pp. 80–83.
[102] S. Tuli, N. Basumatary, S. S. Gill, M. Kahani, R. C. Arya, G. S. Wander, and R. Buyya, “Healthfog: An ensemble deep learning based smart healthcare system for automatic diagnosis of heart diseases in integrated iot and fog computing environments,” Future Generation Computer Systems, vol. 104, pp. 187–200, 2020.
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: China Jiliang University. Downloaded on June 30,2021 at 08:42:36 UTC from IEEE Xplore. Restrictions apply.
[103] S. K. Sood and I. Mahajan, “A fog-based healthcare framework for chikungunya,” IEEE Internet of Things Journal, vol. 5, no. 2, pp. 794–801, 2017.
[104] M. S. Hossain, G. Muhammad, and N. Guizani, “Explainable ai and mass surveillance system-based healthcare framework to combat covid-19 like pandemics,” IEEE Network, vol. 34, no. 4, pp. 126–132, 2020.
[105] Y. Liu, Z. Zhao, F. Chang, and S. Hu, “An anchor-free convolutional neural network for real-time surgical tool detection in robot-assisted surgery,” IEEE Access, vol. 8, pp. 78193–78201, 2020.
[106] H. Qiu, Z. Li, Y. Yang, C. Xin, and G.-B. Bian, “Real-time iris tracking using deep regression networks for robotic ophthalmic surgery,” IEEE Access, vol. 8, pp. 50648–50658, 2020.
[107] M. M. Hosseini, A. Umunnakwe, M. Parvania, and T. Tasdizen, “Intelligent damage classification and estimation in power distribution poles using unmanned aerial vehicles and convolutional neural networks,” IEEE Transactions on Smart Grid, vol. 11, no. 4, pp. 3325–3333, 2020.
[108] Y. He, G. J. Mendis, and J. Wei, “Real-time detection of false data injection attacks in smart grid: A deep learning-based intelligent mechanism,” IEEE Transactions on Smart Grid, vol. 8, no. 5, pp. 2505–2516, 2017.
[109] M. N. Kurt, O. Ogundijo, C. Li, and X. Wang, “Online cyber-attack detection in smart grid: A reinforcement learning approach,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 5174–5185, 2018.
[110] Y. Wang, Y. Shen, S. Mao, X. Chen, and H. Zou, “Lasso and lstm integrated temporal model for short-term solar intensity forecasting,” IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2933–2944, 2018.
[111] M. D’Incecco, S. Squartini, and M. Zhong, “Transfer learning for non-intrusive load monitoring,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1419–1429, 2019.
[112] Z. Wan, H. Li, H. He, and D. Prokhorov, “Model-free real-time ev charging scheduling based on deep reinforcement learning,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 5246–5257, 2018.
[113] Y.-W. Lee, “A stochastic model of particulate matters with ai-enabled technique-based iot gas detectors for air quality assessment,” Microelectronic Engineering, vol. 229, p. 111346, 2020.
[114] X. Sun, X. Wang, D. Cai, Z. Li, Y. Gao, and X. Wang, “Multivariate seawater quality prediction based on pca-rvm supported by edge computing towards smart ocean,” IEEE Access, vol. 8, pp. 54506–54513, 2020.
[115] J. Gotthans, T. Gotthans, and R. Marsalek, “Deep convolutional neural network for fire detection,” in 2020 30th International Conference Radioelektronika (RADIOELEKTRONIKA). IEEE, 2020, pp. 1–6.
[116] D. Kinaneva, G. Hristov, J. Raychev, and P. Zahariev, “Early forest fire detection using drones and artificial intelligence,” in 2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO). IEEE, 2019, pp. 1060–1065.
[117] K. Ahmad, K. Khan, and A. Al-Fuqaha, “Intelligent fusion of deep features for improved waste classification,” IEEE Access, vol. 8, pp. 96495–96504, 2020.
[118] T. J. Sheng, M. S. Islam, N. Misran, M. H. Baharuddin, H. Arshad, M. R. Islam, M. E. Chowdhury, H. Rmili, and M. T. Islam, “An internet of things based smart waste management system using lora and tensorflow deep learning model,” IEEE Access, vol. 8, pp. 148793–148811, 2020.
[119] L. Hu, Y. Miao, G. Wu, M. M. Hassan, and I. Humar, “irobot-factory: An intelligent robot factory based on cognitive manufacturing and edge computing,” Future Generation Computer Systems, vol. 90, pp. 569–577, 2019.
[120] L. Li, K. Ota, and M. Dong, “Deep learning for smart industry: Efficient manufacture inspection system with fog computing,” IEEE Transactions on Industrial Informatics, vol. 14, no. 10, pp. 4665–4673, 2018.
[121] Y. Feng, T. Wang, B. Hu, C. Yang, and J. Tan, “An integrated method for high-dimensional imbalanced assembly quality prediction supported by edge computing,” IEEE Access, vol. 8, pp. 71279–71290, 2020.
[122] K. S. Kiangala and Z. Wang, “An effective predictive maintenance framework for conveyor motors using dual time-series imaging and convolutional neural network in an industry 4.0 environment,” IEEE Access, vol. 8, pp. 121033–121049, 2020.
[123] L. Liu, Y. Yao, R. Wang, B. Wu, and W. Shi, “Equinox: A roadside edge computing experimental platform for cavs,” in 2020 International Conference on Connected and Autonomous Driving (MetroCAD). IEEE, 2020, pp. 41–42.
[124] D. Yao, M. Wen, X. Liang, Z. Fu, K. Zhang, and B. Yang, “Energy theft detection with energy privacy preservation in the smart grid,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 7659–7669, 2019.
[125] T. L. Foundation. LF Energy GXF. [Online]. Available: https://www.lfenergy.org/projects/gxf/
[126] F. Ma, X. Luo, and E. Litvinov, “Cloud computing for power system simulations at iso new england—experiences and challenges,” IEEE Transactions on Smart Grid, vol. 7, no. 6, pp. 2596–2603, 2016.
[127] Ž. N. Popović, B. B. Radmilović, and V. M. Gačić, “Smart grids concept in electrical distribution system,” Thermal Science, vol. 16, no. suppl. 1, pp. 205–213, 2012.
[128] S. Chen, H. Wen, J. Wu, W. Lei, W. Hou, W. Liu, A. Xu, and Y. Jiang, “Internet of things based smart grids supported by intelligent edge computing,” IEEE Access, vol. 7, pp. 74089–74102, 2019.
[129] P. N. N. Laboratory. Transactive energy: Negotiating new terrain. [Online]. Available: https://www.pnnl.gov/news-media/transactive-energy-negotiating-new-terrain
[130] Z. Yang and D. Li, “Wasnet: A neural network-based garbage collection management system,” IEEE Access, vol. 8, pp. 103984–103993, 2020.
[131] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5 mb model size,” arXiv preprint arXiv:1602.07360, 2016.
[132] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1251–1258.
[133] X. Zhang, X. Zhou, M. Lin, and J. Sun, “Shufflenet: An extremely efficient convolutional neural network for mobile devices,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6848–6856.
[134] W. Ding, Z. Huang, Z. Huang, L. Tian, H. Wang, and S. Feng, “Designing efficient accelerator of depthwise separable convolutional neural network on fpga,” Journal of Systems Architecture, vol. 97, pp. 278–286, 2019.
[135] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.
[136] S. Han, H. Mao, and W. Dally, “Compressing deep neural networks with pruning, trained quantization and huffman coding,” arXiv preprint, 2015.
[137] X. Liu, J. Pool, S. Han, and W. J. Dally, “Efficient sparse-winograd convolutional neural networks,” arXiv preprint arXiv:1802.06367, 2018.
[138] J.-H. Luo, J. Wu, and W. Lin, “Thinet: A filter level pruning method for deep neural network compression,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 5058–5066.
[139] Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang, “Learning efficient convolutional networks through network slimming,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2736–2744.
[140] Y. He, X. Zhang, and J. Sun, “Channel pruning for accelerating very deep neural networks,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 1389–1397.
[141] K. Hwang and W. Sung, “Fixed-point feedforward deep neural network design using weights +1, 0, and −1,” in 2014 IEEE Workshop on Signal Processing Systems (SiPS). IEEE, 2014, pp. 1–6.
[142] R. Zhao, W. Song, W. Zhang, T. Xing, J.-H. Lin, M. Srivastava, R. Gupta, and Z. Zhang, “Accelerating binarized convolutional neural networks with software-programmable fpgas,” in Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017, pp. 15–24.
[143] J. Qiu, J. Wang, S. Yao, K. Guo, B. Li, E. Zhou, J. Yu, T. Tang, N. Xu, S. Song et al., “Going deeper with embedded fpga platform for convolutional neural network,” in Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016, pp. 26–35.
[144] Y. H. Oh, Q. Quan, D. Kim, S. Kim, J. Heo, S. Jung, J. Jang, and J. W. Lee, “A portable, automatic data quantizer for deep neural networks,” in Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018, pp. 1–14.
[145] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015.
[146] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, “Fitnets: Hints for thin deep nets,” arXiv preprint arXiv:1412.6550, 2014.
[147] S. Han, H. Mao, and W. J. Dally, “Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding,” arXiv preprint arXiv:1510.00149, 2015.
[148] S. Liu, Y. Lin, Z. Zhou, K. Nan, H. Liu, and J. Du, “On-demand deep model compression for mobile devices: A usage-driven model selection framework,” in Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, 2018, pp. 389–400.
[149] X. Ran, H. Chen, X. Zhu, Z. Liu, and J. Chen, “Deepdecision: A mobile deep learning framework for edge video analytics,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. IEEE, 2018, pp. 1421–1429.
[150] S. Han, H. Shen, M. Philipose, S. Agarwal, A. Wolman, and A. Krishnamurthy, “Mcdnn: An approximation-based execution framework for deep stream processing under resource constraints,” in Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services, 2016, pp. 123–136.
[151] E. Cuervo, A. Balasubramanian, D.-k. Cho, A. Wolman, S. Saroiu, R. Chandra, and P. Bahl, “Maui: Making smartphones last longer with code offload,” in Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, 2010, pp. 49–62.
[152] L. Lin, X. Liao, H. Jin, and P. Li, “Computation offloading toward edge computing,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1584–1607, 2019.
[153] T. Y.-H. Chen, L. Ravindranath, S. Deng, P. Bahl, and H. Balakrishnan, “Glimpse: Continuous, real-time object recognition on mobile devices,” in Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, 2015, pp. 155–168.
[154] K. Habak, M. Ammar, K. A. Harras, and E. Zegura, “Femto clouds: Leveraging mobile devices to provide cloud service at the edge,” in 2015 IEEE 8th International Conference on Cloud Computing. IEEE, 2015, pp. 9–16.
[155] P. Simoens, Y. Xiao, P. Pillai, Z. Chen, K. Ha, and M. Satyanarayanan, “Scalable crowd-sourcing of video from mobile devices,” in Proceedings of the 11th Annual International Conference on Mobile Systems, Applications, and Services, 2013, pp. 139–152.
[156] J. Mao, X. Chen, K. W. Nixon, C. Krieger, and Y. Chen, “Modnn: Local distributed mobile computing system for deep neural network,” in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017. IEEE, 2017, pp. 1396–1401.
[157] Z. Zhao, K. M. Barijough, and A. Gerstlauer, “Deepthings: Distributed adaptive deep learning inference on resource-constrained iot edge clusters,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 11, pp. 2348–2359, 2018.
[158] Z. Zhao, Z. Jiang, N. Ling, X. Shuai, and G. Xing, “Ecrt: An edge computing system for real-time image-based object tracking,” in Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, 2018, pp. 394–395.
[159] S. Teerapittayanon, B. McDanel, and H.-T. Kung, “Branchynet: Fast inference via early exiting from deep neural networks,” in 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2016, pp. 2464–2469.
[160] ——, “Distributed deep neural networks over the cloud, the edge and end devices,” in 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). IEEE, 2017, pp. 328–339.
[161] E. Li, Z. Zhou, and X. Chen, “Edge intelligence: On-demand deep learning model co-inference with device-edge synergy,” in Proceedings of the 2018 Workshop on Mobile Edge Communications, 2018, pp. 31–36.
[162] M. AboulAtta, M. Ossadnik, and S.-A. Ahmadi, “Stabilizing inputs to approximated nonlinear functions for inference with homomorphic encryption in deep neural networks,” arXiv preprint arXiv:1902.01870, 2019.
[163] Y. Li, H. Li, G. Xu, S. Liu, and R. Lu, “Epps: Efficient privacy-preserving scheme in distributed deep learning,” in 2019 IEEE Global Communications Conference (GLOBECOM). IEEE, 2019, pp. 1–6.
[164] C. Juvekar, V. Vaikuntanathan, and A. Chandrakasan, “Gazelle: A low latency framework for secure neural network inference,” in 27th USENIX Security Symposium (USENIX Security 18), 2018, pp. 1651–1669.
[165] F. Mireshghallah, M. Taram, P. Ramrakhyani, A. Jalali, D. Tullsen, and H. Esmaeilzadeh, “Shredder: Learning noise distributions to protect inference privacy,” in Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 2020, pp. 3–18.
[166] C. Xu, J. Ren, L. She, Y. Zhang, Z. Qin, and K. Ren, “Edgesanitizer: Locally differentially private deep inference at the edge for mobile data analytics,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 5140–5151, 2019.
[167] Z. He, T. Zhang, and R. B. Lee, “Attacking and protecting data privacy in edge-cloud collaborative inference systems,” IEEE Internet of Things Journal, 2020.
[168] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, “Federated learning: Strategies for improving communication efficiency,” arXiv preprint arXiv:1610.05492, 2016.
[169] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial Intelligence and Statistics. PMLR, 2017, pp. 1273–1282.
[170] M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, pp. 1322–1333.
[171] Y. Matsubara, S. Baidya, D. Callegaro, M. Levorato, and S. Singh, “Distilled split deep neural networks for edge-assisted real-time systems,” in Proceedings of the 2019 Workshop on Hot Topics in Video Analytics and Intelligent Edges, 2019, pp. 21–26.
[172] R. Sharma, S. Biookaghazadeh, B. Li, and M. Zhao, “Are existing knowledge transfer techniques effective for deep learning with edge devices?” in 2018 IEEE International Conference on Edge Computing (EDGE). IEEE, 2018, pp. 42–49.
[173] J. Chen, X. Pan, R. Monga, S. Bengio, and R. Jozefowicz, “Revisiting distributed synchronous sgd,” arXiv preprint arXiv:1604.00981, 2016.
[174] S. Zhang, A. Choromanska, and Y. LeCun, “Deep learning with elastic averaging sgd,” arXiv preprint arXiv:1412.6651, 2014.
[175] Y. Lin, S. Han, H. Mao, Y. Wang, and W. J. Dally, “Deep gradient compression: Reducing the communication bandwidth for distributed training,” arXiv preprint arXiv:1712.01887, 2017.
[176] J. Wangni, J. Wang, J. Liu, and T. Zhang, “Gradient sparsification for communication-efficient distributed optimization,” arXiv preprint arXiv:1710.09854, 2017.
[177] P. Jiang and G. Agrawal, “A linear speedup analysis of distributed deep learning with sparse and quantized communication,” in Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018, pp. 2530–2541.
[178] C. Hardy, E. Le Merrer, and B. Sericola, “Distributed deep learning on edge-devices: Feasibility via adaptive compression,” in 2017 IEEE 16th International Symposium on Network Computing and Applications (NCA). IEEE, 2017, pp. 1–8.
[179] C. Fung, C. J. Yoon, and I. Beschastnikh, “Mitigating sybils in federated learning poisoning,” arXiv preprint arXiv:1808.04866, 2018.
[180] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, “Differentially private asynchronous federated learning for mobile edge computing in urban informatics,” IEEE Transactions on Industrial Informatics, vol. 16, no. 3, pp. 2134–2143, 2019.
[181] Y. Zhao, J. Zhao, L. Jiang, R. Tan, and D. Niyato, “Mobile edge computing, blockchain and reputation-based crowdsourcing iot federated learning: A secure, decentralized and privacy-preserving system,” arXiv preprint arXiv:1906.10893, 2019.
[182] F. Tang, W. Wu, J. Liu, H. Wang, and M. Xian, “Privacy-preserving distributed deep learning via homomorphic re-encryption,” Electronics, vol. 8, no. 4, p. 411, 2019.
[183] S. Truex, N. Baracaldo, A. Anwar, T. Steinke, H. Ludwig, R. Zhang, and Y. Zhou, “A hybrid approach to privacy-preserving federated learning,” in Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, 2019, pp. 1–11.
[184] Z. Zheng, S. Xie, H.-N. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: A survey,” International Journal of Web and Grid Services, vol. 14, no. 4, pp. 352–375, 2018.
[185] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, “Blockchain and federated learning for privacy-preserved data sharing in industrial iot,” IEEE Transactions on Industrial Informatics, vol. 16, no. 6, pp. 4177–4186, 2019.
[186] C. Wang, Y. Zhang, X. Chen, K. Liang, and Z. Wang, “Sdn-based handover authentication scheme for mobile edge computing in cyber-physical systems,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 8692–8701, 2019.
[187] M. Huang, W. Liang, X. Shen, Y. Ma, and H. Kan, “Reliability-aware virtualized network function services provisioning in mobile edge computing,” IEEE Transactions on Mobile Computing, vol. 19, no. 11, pp. 2699–2713, 2019.
[188] Z. Lv and W. Xiu, “Interaction of edge-cloud computing based on sdn and nfv for next generation iot,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 5706–5712, 2019.
[189] M. Wang, B. Cheng, W. Feng, and J. Chen, “An efficient service function chain placement algorithm in a mec-nfv environment,” in 2019
IEEE Global Communications Conference (GLOBECOM). IEEE, 2019, pp. 1–6.
[190] A. Mondal, S. Misra, and I. Maity, “Amope: Performance analysis of openflow systems in software-defined networks,” IEEE Systems Journal, vol. 14, no. 1, pp. 124–131, 2019.
[191] Y. Sun, Z. Chen, M. Tao, and H. Liu, “Bandwidth gain from mobile edge computing and caching in wireless multicast systems,” IEEE Transactions on Wireless Communications, vol. 19, no. 6, pp. 3992–4007, 2020.
[192] F. Tang, Y. Zhou, and N. Kato, “Deep reinforcement learning for dynamic uplink/downlink resource allocation in high mobility 5g hetnet,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 12, pp. 2773–2782, 2020.
[193] N. Liu, X. Ma, Z. Xu, Y. Wang, J. Tang, and J. Ye, “Autocompress: An automatic dnn structured pruning framework for ultra-high compression rates,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 04, 2020, pp. 4876–4883.
[194] Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, and S. Han, “Amc: Automl for model compression and acceleration on mobile devices,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 784–800.
[195] D. Rothchild, A. Panda, E. Ullah, N. Ivkin, I. Stoica, V. Braverman, J. Gonzalez, and R. Arora, “Fetchsgd: Communication-efficient federated learning with sketching,” in International Conference on Machine Learning. PMLR, 2020, pp. 8253–8265.
[196] C.-C. Chung, W.-T. Chen, and Y.-C. Chang, “Using quantization-aware training technique with post-training fine-tuning quantization to implement a mobilenet hardware accelerator,” in 2020 Indo–Taiwan 2nd International Conference on Computing, Analytics and Networks (Indo-Taiwan ICAN). IEEE, 2020, pp. 28–32.
[197] J. Long, W. Liang, K.-C. Li, D. Zhang, M. Tang, and H. Luo, “Puf-based anonymous authentication scheme for hardware devices and ips in edge computing environment,” IEEE Access, vol. 7, pp. 124785–124796, 2019.
[198] A. De, A. Basu, S. Ghosh, and T. Jaeger, “Hardware assisted buffer protection mechanisms for embedded risc-v,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 12, pp. 4453–4465, 2020.
[199] R. Sekaran, R. Patan, A. Raveendran, F. Al-Turjman, M. Ramachandran, and L. Mostarda, “Survival study on blockchain based 6g-enabled mobile edge computation for iot automation,” IEEE Access, vol. 8, pp. 143453–143463, 2020.

Zhuoqing Chang (S’09) is currently pursuing the Ph.D. degree with the School of Computer Science, Wuhan University, Wuhan, China. His current research interests include the Internet of Things, deep learning, and missing data.

Shubo Liu received the Ph.D. degree in communication and information system from Wuhan University, Wuhan, China, in 2009. He is currently a Full Professor with the School of Computer Science, Wuhan University. His research interests include multimedia information processing and security, the Internet of Things, and edge computing.

Xingxing Xiong is currently pursuing the Ph.D. degree with the Department of Cyber Space Security, Wuhan University, Wuhan, China. His main research interests include privacy protection and IoT security.

Zhaohui Cai received the Ph.D. degree from Wuhan University, Wuhan, China, in 2005. She is currently an Associate Professor with the School of Computer Science, Wuhan University. She is mainly engaged in machine learning, Internet of Things technology, and big data analysis. She has published many articles and participated in the National 863 Program, National Natural Science Foundation projects, and provincial and ministerial scientific and technological research projects.

Guoqing Tu received the B.E. degree from the Department of Computer and Application, Hefei University of Technology, Hefei, China, in 1996, and the M.S. and Ph.D. degrees from the School of Computer Science, Wuhan University, Wuhan, China, in 2002 and 2008, respectively. He is currently an Associate Professor with the School of Cyber Science and Engineering, Wuhan University. His current research interests include IoT and its security, embedded systems, and hydrological monitoring systems.