Prediction-Based Sensor Nodes

Computer Communications 34 (2011) 793–802
Prediction-based data aggregation in wireless sensor networks: Combining grey model and Kalman Filter

Guiyi Wei a,*, Yun Ling a, Binfeng Guo a, Bin Xiao b, Athanasios V. Vasilakos c
a School of Computer Science and Information Engineering, Zhejiang Gongshang University, Hangzhou, China
b Department of Computing, Hong Kong Polytechnic University, Hong Kong
c Department of Computer and Telecommunications Engineering, University of Western Macedonia, Greece
Article info

Article history: Received 26 January 2010; Received in revised form 30 September 2010; Accepted 5 October 2010; Available online 12 October 2010

Keywords: Wireless sensor networks; Data collection protocol; Data aggregation; Grey model; Kalman Filter

Abstract

In many environmental monitoring applications, since the data periodically sensed by wireless sensor networks usually are of high temporal redundancy, prediction-based data aggregation is an important approach for reducing redundant data communications and saving sensor nodes' energy. In this paper, a novel prediction-based data collection protocol is proposed, in which a double-queue mechanism is designed to synchronize the prediction data series of the sensor node and the sink node, and therefore, the cumulative error of continuous predictions is reduced. Based on this protocol, three prediction-based data aggregation approaches are proposed: Grey-Model-based Data Aggregation (GMDA), Kalman-Filter-based Data Aggregation (KFDA) and Combined Grey model and Kalman Filter Data Aggregation (CoGKDA). By integrating the merit of the grey model in quick modeling with the advantage of the Kalman Filter in processing data series noise, CoGKDA presents high prediction accuracy, low communication overhead, and relatively low computational complexity. Experiments are carried out based on a real data set of a temperature and humidity monitoring application in a granary. The results show that the proposed approaches significantly reduce communication redundancy and evidently improve the lifetime of wireless sensor networks.

© 2010 Elsevier B.V. All rights reserved.
* Corresponding author. Tel.: +86 136 06619504; fax: +86 571 28008303. E-mail address: weigy@zjgsu.edu.cn (G. Wei).
0140-3664/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.comcom.2010.10.003

1. Introduction

Wireless sensor networks consist of a large number of low-cost sensor nodes which form a multi-hop ad hoc network through wireless communication [19]. In general, sensor nodes rely on battery only and once deployed, they are usually unable to be recharged. Therefore, power is a critical resource in wireless sensor networks. Reducing energy consumption is of great importance in improving the lifetime of wireless sensor networks.

In wireless sensor networks used in environmental monitoring, a large number of sensor nodes collect information and return collected information to a base station(s) where it is processed, analyzed, and used. Since the sensor node is energy constrained and its valid communication distance is limited, it is infeasible for all the sensors to transmit data directly to the base station (or sink node). In most environmental monitoring applications, sensed data may be of high temporal or spatial correlation, and applications can tolerate some loss of data accuracy. Therefore, it is possible to use a data aggregation approach to process raw data at the sensor nodes or at intermediate nodes to reduce packet transmissions and save energy [3,20].

Since the data generated by sensor nodes during continuous sensing periods usually are of high temporal coherence, it indicates there are redundant data in the continuous data sequence, which causes unnecessary data transmission and energy consumption. In many environmental sensing applications, e.g. granary monitoring, data flow is many-to-one through a reverse multicast tree, from leaf sensor nodes to a small number of sink nodes. In these cases, transmitting redundant data will incur a serious waste of communication bandwidth and energy. The efficiency of data collection will decrease when each node sends all data to the sink node. Furthermore, the difficulty of scheduling at the link layer will increase and cause more frequent collisions [20]. Data aggregation techniques which exploit temporal correlation of the sensed data are needed to resolve these two problems.

Model-driven data aggregation approaches take advantage of data coherence to remove redundancy and reduce transmissions among sensor nodes [2]. They are effective in improving energy efficiency and extending the lifetime of the wireless sensor network [2]. In-network processing usually aggregates data at intermediate nodes between the sources and the sinks by using aggregation functions (such as maximum, minimum, sum and average). However, using aggregation functions causes a loss of data resolution. Furthermore, the difference between the processed data and the original data may be too large to tolerate in
some environmental sensing applications. For example, granary monitoring must continuously gather temperature and humidity data from every sensor node with a relatively small tolerated error. Xin et al. [21] analyze some more complicated aggregation approaches, including data mining and multiple-source-queries routing. These approaches can provide higher accuracy. However, they consume a large amount of computational power and storage resources, because their pre-processing stages require $O(n^2 d)$ transmissions, where n is the number of nodes and d is the diameter of the network [21]. Therefore, these complex approaches are infeasible for most environmental monitoring applications.

This paper proposes a novel prediction-based data collection protocol to reduce redundant data transmission. A double-queue mechanism is designed to synchronize the predicted data series in the sensor node and the sink node, and therefore, the mechanism avoids the cumulative error of continuous predictions. Based on this protocol, we design three prediction-based data aggregation approaches (GMDA, KFDA, and CoGKDA). The proposed approaches are used to predict the data of the next period at both sensor and sink ends based on the same small number of recent data items. When data of the next period is sensed, the sensor node compares the predicted data with the sensed data. The sensor node does not forward the sensed data to the sink node when the prediction error is less than a pre-configured threshold value. In this case, the sink node considers the predicted data as the sensed data in the current sensing period. Therefore, unnecessary transmission is eliminated and energy is saved. The sensor node must send the sensed data to the sink node when the prediction error is out of the pre-configured threshold. The pre-configured threshold is a tunable parameter for users to control the accuracy of predicted data: the larger the threshold, the lower the data accuracy. Experiments and evaluations demonstrate that the proposed approaches can significantly reduce communication redundancy and improve the network lifetime in environmental monitoring applications.

Our contribution can be summarized as follows:

• A prediction-based data collection protocol is proposed to specify the cooperative processes between sensor node and sink node, in which a novel double-queue mechanism is designed to synchronize the prediction data series in the sensor node and the sink node, hence cumulative error in continuous predictions is avoided.
• By integrating the merits of the grey model in quick modeling with the advantages of the Kalman Filter in processing data series noise, we have designed the CoGKDA algorithm for environmental monitoring wireless sensor networks. CoGKDA exhibits high data accuracy, low communication overhead, and relatively low computational complexity. Furthermore, CoGKDA can extend the sensor nodes' lifetime by reducing data transmission redundancy and conserving power during continuous data collections.

The rest of the paper is organized as follows: In Section 2, state-of-the-art methods in data aggregation are reviewed. Section 3 presents a novel prediction-based data collection protocol. Section 4 describes the Grey-Model-based Data Aggregation approach. Another data aggregation approach based on the Kalman Filter is given in Section 5. In Section 6, a combined data aggregation approach and its concrete algorithm are presented in detail. Experiments and performance evaluation are presented in Section 7 and concluding remarks are made in Section 8.

2. Related work

There has been a lot of work done in the field of data-driven techniques for energy conservation in wireless sensor networks. Anastasi et al. [4] present a systematic and comprehensive taxonomy of the energy conservation schemes in wireless sensor networks. Prediction-based data aggregation approaches are overviewed and classified into three types: stochastic approaches, time series forecasting, and algorithmic approaches.

Stochastic approaches exploit the probabilistic and statistical properties of sensed data. Deshpande et al. [5] propose a data prediction scheme based on a probabilistic model to reduce data transmission and reduce the quantity of data acquisition. A representative stochastic approach, named KEN [6], uses a dynamic probabilistic model to minimize communication from the sensor node to the base station. The data aggregation process does not require communication between the sensor node and the base station except when the sensor node senses anomalous data. KEN naturally accommodates applications that are based on event reporting or anomaly detection. An extension of KEN is presented in [7], where a Dynamic Probabilistic Model (DPM) is exploited to implement a probabilistic database view. The main drawback of this class of techniques is that they inherently have relatively high computational cost. To improve compression of the data communicated, some stochastic models exploit sophisticated spatial correlations of data in neighboring nodes. However, the more sophisticated the model, the more communications are required among sensor nodes themselves for coordination [2]. Therefore, possible improvements in this direction may focus on deriving simplified distributed models for obtaining the desired trade-off between the energy efficiency and the data accuracy according to users' requirements.

The most representative time series methods include Moving Average (MA), Auto-Regressive (AR) and Auto-Regressive Moving Average (ARMA) models. These models are quite simple, and can be used in many practical cases. Probabilistic Adaptable Query system (PAQ) [8] uses a combination of AR models to probabilistically answer queries. This model is used globally to predict the readings of individual sensors at the sink node, and locally to detect when sensor nodes produce outlier readings or when the model ceases to properly fit the data at a sensor node. The Similarity-based Adaptive Framework (SAF) [9] uses a simple linear time series model that consists of a time-varying function, also called trend component, and a stationary AR component representing the divergence of the phenomenon from the time-varying function over time. SAF can detect both outliers and inconsistent data. Le-Borgne et al. [10] propose an adaptive multi-model selection mechanism, which uses a lightweight, online algorithm that allows a sensor node to autonomously determine a satisfactory model from a set of candidate models. As sensed data are collected, based on a weight metric, it is possible to select the model that offers at each instant the highest achievable communication savings. Time series forecasting methods can provide sufficient accuracy, and their implementation in sensor devices is simple and lightweight. However, it is difficult to find an appropriate model that can tackle the long-term trend and short-term noise of data sequences simultaneously while providing a tunable trade-off between energy efficiency and data accuracy.

Algorithmic approaches aggregate data by exploiting the heuristic or behavioral characteristics of the sensing phenomena. PREMON [10] views a snapshot of the sensor network as an image, with the readings of individual sensors corresponding to the intensity values of pixels in the image. Monitoring operations are considered as receiving a sequence of the snapshots on a continuous basis. When the sink node gets the initial reading from a sensor node, it computes the model by evaluating correlations between macro-blocks and deriving a motion vector relative to each block. After obtaining the model, the sensor node sends the model back to the sink node. From this time on, the sensor node compares each sample with the prediction derived from the model. When sensed
data are close to the prediction within a user-specified tolerance, the sensor node does not transmit the data to the sink node. The model is periodically updated. Goel et al. [11] propose a buddy protocol to extend the PREMON approach by establishing a collaborative buddy relationship between sensor and sink nodes. It is suitable for cluster-structured wireless sensor networks. By including a periodic polling scheme in cluster operations, the proposed buddy protocol can guarantee that each node in the network is reachable within the specified maximum delay constraints. Han et al. [12] present an Energy Efficient Data Collection (EEDC) mechanism for data prediction. EEDC is effective in active inquiry-based applications, in which each node associates an upper and a lower bound, whose difference represents the accuracy of the sensed data. These bounds are sent to the sink node, which stores them for each sensor node in the network. These bounds can be updated according to source-initiated and sink-initiated requests. However, the algorithmic techniques are too complex in computation and may also incur a great deal of communication overhead [2].

Compared to the above-mentioned data aggregation methods, the data collection protocol and data aggregation approaches proposed in this paper have the following advantages. (1) They can provide high prediction accuracy without a large amount of training data and a priori knowledge of the distribution of sensed data, and eliminate more redundant transmissions. (2) They are more adaptive to dynamic changes in the distribution of sensed data. In addition, they are more scalable and structure-free; therefore, they can be coupled with other route or topology-based data aggregation protocols. (3) They are relatively lightweight in terms of computational complexity for resource-constrained sensor nodes.

3. Prediction-based data collection protocol

In the application layer of a wireless sensor network, data collection can be classified into three schemes: Pull, Push, and Integration of Pull and Push. In the Pull scheme, the sensor node acquires data from the physical layer and caches it locally. The cached data is collected only when the sensor node receives a query from the sink node. In this case, the sensor network looks like a database. In the Push scheme, the sensor node periodically senses data and immediately delivers it to the sink node. The sink node acts as a passive data collector. The Integration scheme provides capabilities of active data pushing and passive data acquisition by integrating the Pull scheme with the Push scheme.

In this paper, a prediction-based data collection protocol is proposed for the Push scheme. The proposed protocol is different from data collection protocols in the MAC layer, since it only focuses on the prediction-based cooperation between the sensor node and the sink node without taking into consideration network topology, node density, link quality and radio transceiver parameters. In general, the main challenges in designing a prediction-based data collection protocol include: (1) how to keep the data series at the sink node and the sensor node synchronous. In our approaches, both sensor node and sink node must use the same data series and the same prediction algorithm. However, the sensor has real sensed data while the sink node does not, because sensed data are not sent to the sink node in periods where the prediction succeeded; (2) how to avoid cumulative error in continuous predictions. Since the data used for performing predictions may contain predicted values, cumulative error will inherently be produced; and (3) how to differentiate successful prediction and data loss when the sink node does not receive the sensed data. When the sensed data is out of threshold, it must be sent to the sink node. Nonetheless, the sink node may fail to receive the sensed data due to packet loss induced by unreliable communication. From the viewpoint of the sink node, this case is very similar to the successful prediction scenario.

To solve the above problems, the proposed cooperative data collection protocol is presented in detail as follows.

Prerequisites:

(1.1) Each sensor node's lifetime is divided into equal periods. A sensor node produces only one sensed datum in one period.
(1.2) Both the sink node and the sensor node use the same prediction algorithm. The sink node is assumed to have sufficient computing power, storage, and energy.
(1.3) A reliable data delivery is defined as an end-to-end data intercommunication in which the receiver must send an acknowledgement message back to the sender.

Initialization:

(2.1) The sink node broadcasts its acceptable prediction error threshold $\varepsilon$ and cumulative error threshold $\theta$ to all sensor nodes according to the requirement of the specific application by using reliable data deliveries. $\varepsilon$ and $\theta$ are tunable parameters, pre-configured at the sink node. When their values are modified, the fresh $\varepsilon$ and $\theta$ must be re-broadcast to all sensor nodes.
(2.2) Each sensor node constructs two data queues, the actual data queue (ADQ) and the predicted value queue in sensor end (PVQ_sensor). ADQ stores the actual data series and is used to control cumulative error. PVQ_sensor stores the data series that is used to do the same predictions in both the sensor node and the sink node. PVQ_sensor may contain predicted data values. This is called the Double Queue Mechanism. The lengths of ADQ and PVQ_sensor are equal and both are specified by the applied prediction algorithm (denoted as $l$). The sink node constructs a corresponding queue for each sensor node, called PVQ_sink, with PVQ_sink(i) = PVQ_sensor(i) for sensor node i.
(2.3) Each sensor node stores the first $l$ sensed data into its ADQ and PVQ_sensor, and sends them to the sink node to construct PVQ_sink via reliable delivery. Let $x_j$ denote the data item in a queue. In the initial stage, ADQ(i) = PVQ_sensor(i) = PVQ_sink(i) = $\{x_1, x_2, \ldots, x_l\}$ for an arbitrary sensor node i.

Prediction:

(3) Let $x_{l+1}$, $x'_{l+1}$ and $x''_{l+1}$ denote the actual sensed data, the predicted value using ADQ, and the predicted value using PVQ_sensor(i), respectively. Note that the sink node can also obtain $x''_{l+1}$ from the PVQ_sink(i) queue. If $|x''_{l+1} - x_{l+1}| < \varepsilon$, the prediction error is considered as in threshold; otherwise out of threshold. If $|x''_{l+1} - x'_{l+1}| < \theta$, the cumulative error is considered as in threshold; otherwise out of threshold. When prediction error and cumulative error are in their thresholds simultaneously, the prediction of this period is considered successful. For a successful prediction, the sensor node does not need to send $x_{l+1}$ to the sink node. The sink node considers the predicted value $x''_{l+1}$ as $x_{l+1}$ in this period. After a successful prediction, the queues are updated by the following rules: (a) ADQ(i) = $\{x_2, x_3, \ldots, x_{l+1}\}$; (b) PVQ_sensor(i) = $\{x_2, x_3, \ldots, x''_{l+1}\}$; and (c) PVQ_sink(i) = $\{x_2, x_3, \ldots, x''_{l+1}\}$.

Exceptions:

(4.1) The actual sensed data $x_{l+1}$ must be sent to the sink node using reliable delivery in the following cases: (a) a failed prediction occurs; and (b) the number of continuous successful predictions exceeds a pre-configured number.
(4.2) After an exceptional data delivery, the queues are updated by the following rules: (a) ADQ(i) = $\{x_2, x_3, \ldots, x_{l+1}\}$, (b) PVQ_sensor(i) = $\{x_2, x_3, \ldots, x_{l+1}\}$, and (c) PVQ_sink(i) = $\{x_2, x_3, \ldots, x_{l+1}\}$.
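The double-queue rules of steps (2.2) to (4.2) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the class and function names (`SensorNode`, `SinkNode`, `simulate`, `mean_predictor`) are hypothetical, the predictor is a placeholder queue average rather than any of the paper's prediction algorithms, and reliable delivery with acknowledgements is not modeled (a returned value stands for a delivered packet, `None` for a suppressed one).

```python
from collections import deque


def mean_predictor(queue):
    """Placeholder predictor (an assumption): average of the queue."""
    return sum(queue) / len(queue)


class SensorNode:
    def __init__(self, initial, eps, theta, max_run, predict=mean_predictor):
        self.adq = deque(initial, maxlen=len(initial))  # actual data queue (ADQ)
        self.pvq = deque(initial, maxlen=len(initial))  # PVQ_sensor
        self.eps = eps          # prediction error threshold
        self.theta = theta      # cumulative error threshold
        self.max_run = max_run  # cap on consecutive suppressed readings
        self.run = 0
        self.predict = predict

    def step(self, x):
        """Process one sensed value; return it if it must be sent, else None."""
        x_adq = self.predict(self.adq)  # prediction from actual data only
        x_pvq = self.predict(self.pvq)  # prediction the sink can reproduce
        ok = abs(x_pvq - x) < self.eps and abs(x_pvq - x_adq) < self.theta
        self.adq.append(x)              # ADQ always receives the real value
        if ok and self.run < self.max_run:
            self.pvq.append(x_pvq)      # successful prediction: keep predicted value
            self.run += 1
            return None
        self.pvq.append(x)              # exceptional delivery: keep real value
        self.run = 0
        return x


class SinkNode:
    def __init__(self, initial, predict=mean_predictor):
        self.pvq = deque(initial, maxlen=len(initial))  # PVQ_sink
        self.predict = predict

    def step(self, received):
        """Use the delivered value, or fall back to the shared prediction."""
        value = self.predict(self.pvq) if received is None else received
        self.pvq.append(value)
        return value


def simulate(initial, readings, eps, theta, max_run):
    """Run both ends in lockstep; return (values sent, values seen by sink)."""
    sensor = SensorNode(initial, eps, theta, max_run)
    sink = SinkNode(initial)
    sent, seen = [], []
    for x in readings:
        msg = sensor.step(x)
        sent.append(msg)
        seen.append(sink.step(msg))
        assert list(sensor.pvq) == list(sink.pvq)  # the two PVQs stay identical
    return sent, seen
```

For example, `simulate([20.0, 20.0, 20.0], [20.1, 20.2, 25.0, 25.1], eps=0.5, theta=0.5, max_run=10)` suppresses the first two readings and transmits the last two. Because both ends append the same value to their PVQ in every period, the sink can always reproduce the sensor's prediction, which is the point of the mechanism.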
4. Grey model based data aggregation (GMDA)

A system is called a white system if all information about it is known, and a black system if no information about it is known. A grey system lies between the white system and the black system: only poor, incomplete, or uncertain data is provided [1]. The grey model provides a powerful tool for modeling discrete series with a few data items and for forecasting based on determination of an exponential pattern. A sensor node can be treated as an uncertain grey system in the data aggregation process, since only a small sample and poor information is stored and provided. In this paper, the single variable first-order grey model GM(1, 1) [1] is used to capture the long-term trend of the sensed data sequence by exploring and extracting valuable information from recently sensed data.

Before predicting, a few historical sensed data should be stored in the sensor node to construct the initial data sequence for the GM(1, 1) model, denoted as $Y^{(0)}$.

$$Y^{(0)} = \{y^{(0)}(1), y^{(0)}(2), \ldots, y^{(0)}(t)\}. \tag{1}$$

In Eq. (1), $y^{(0)}(j)$, $j = 1, 2, \ldots, t$, represents a data element. $t$ denotes the number of elements in the sequence. $t$ is an invariant, which represents the length of the data sequence. The GM(1, 1) model uses data of the most recent $t$ periods. To eliminate the influence of oscillation in the initial data sequence, the natural logarithm and the exponential function are used to get the adjusted sequence for the GM(1, 1) model, as described in Eq. (2).

$$\left(\ln Y^{(0)}\right)^{1/M} = \left\{\left(\ln y^{(0)}(1)\right)^{1/M}, \left(\ln y^{(0)}(2)\right)^{1/M}, \ldots, \left(\ln y^{(0)}(t)\right)^{1/M}\right\}. \tag{2}$$

In Eq. (2), $M$ is an integer invariant. In general, $1 < M < 10$. Let $x^{(0)}(j) = \left(\ln y^{(0)}(j)\right)^{1/M}$ and let $X^{(0)}$ denote the prediction data sequence, as described in Eq. (3).

$$X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(t)\}. \tag{3}$$

Let $X^{(1)}$ be the 1-AGO (accumulated generating operator) sequence of $X^{(0)}$, that is, $x^{(1)}(k) = \sum_{j=1}^{k} x^{(0)}(j)$, as described in Eq. (4).

$$X^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(t)\}. \tag{4}$$

Therefore, the GM(1, 1) model can be established as Eq. (5) (a differential equation).

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b. \tag{5}$$

Let

$$A = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(t) \end{pmatrix}, \qquad B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(t) & 1 \end{pmatrix},$$

where $z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right)$ when $k = 2, \ldots, t$. Using the Least Squares Method, the values of the parameters $a$ and $b$ can be obtained as $(\hat{a}, \hat{b})^T = (B^T B)^{-1} B^T A$. Therefore, $\hat{x}^{(1)}(k+1)$ can be obtained by using Eq. (6).

$$\hat{x}^{(1)}(k+1) = \left(x^{(1)}(1) - \frac{\hat{b}}{\hat{a}}\right) e^{-\hat{a}k} + \frac{\hat{b}}{\hat{a}}. \tag{6}$$

Therefore, the predicted data $\hat{x}^{(0)}(k+1)$ of the data sequence $X^{(0)}$ can be computed by Eq. (7).

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k). \tag{7}$$

Finally, the final predicted data $\hat{y}^{(0)}(t+1)$ can be obtained by Eq. (8).

$$\hat{y}^{(0)}(t+1) = e^{\left(\hat{x}^{(0)}(t+1)\right)^{M}}. \tag{8}$$

Let $\Delta(t+1) = |\hat{y}^{(0)}(t+1) - y^{(0)}(t+1)|$ and $\varepsilon$ represent the prediction error and the threshold of the prediction error, respectively. For simplicity, cumulative error is not taken into consideration here. After obtaining the predicted data and the prediction error, the sensor node compares the error with $\varepsilon$. If $\Delta(t+1) < \varepsilon$, the sensor node does not need to transmit $y^{(0)}(t+1)$ to the sink node. Otherwise, it must send $y^{(0)}(t+1)$ to the sink node. At the other end, the sink node runs the same prediction program with the same prediction data sequence. Therefore, it obtains the same predicted data as the sensor node. However, the sink node cannot compute the prediction error since it does not have $y^{(0)}(t+1)$. If there is no data coming from the sensor node in a fixed time $T_0$, the sink node assumes $\Delta(t+1) < \varepsilon$ and considers $\hat{y}^{(0)}(t+1)$ as $y^{(0)}(t+1)$ in the current sensing period. $T_0$ should be longer than the maximum transmission latency, but shorter than the length of a sensing period. It is important to synchronize the prediction data sequences, PVQ_sensor and PVQ_sink. When $\Delta(t+1) < \varepsilon$, the sensor node must use the predicted data $\hat{y}^{(0)}(t+1)$ as the data of the (t + 1)th period in its next prediction sequence, because the sink node does not have $y^{(0)}(t+1)$.

5. Kalman-Filter-based Data Aggregation (KFDA)

The Kalman Filter [13] is an efficient recursive filter that estimates the state of a linear dynamic system from a series of noisy measurements. It presents high prediction accuracy based on a small quantity of information. It has been used to design adaptive routing mechanisms in mobile wireless sensor networks [14,15]. Olfati-Saber [17] proposes a peer-to-peer continuous-time distributed Kalman Filter that uses local aggregation of the sensor data but attempts to reach a consensus on estimates with other nodes in the network. Yu et al. [18] also design a distributed consensus filter, in which each sensor can communicate with the neighboring sensors, and filtering can be distributed among nodes. By using a pinning control scheme, only a small fraction of sensors need to measure the target information. In this paper, the Kalman Filter is used to estimate the data sequence for each sensor node rather than to choose sensor nodes.

5.1. Kalman-Filter-based prediction model

In a sensor node, continuous data forms a discrete time data sequence, which can be modeled by the following Linear Stochastic Difference equation:

$$X(k) = A(k)X(k-1) + B(k)U(k) + W(k). \tag{9}$$

$X(k)$ represents the predicted data at the period $k$. $A(k)$ represents the state transition model which is applied to the data of the previous period $(k-1)$. $B(k)$ represents the control-input model applied to the control vector $U(k)$. $W(k)$ represents the noise of the prediction period, which is assumed to follow a zero mean multivariate normal distribution with the covariance $Q(k)$. Let $Z(k)$ denote the actual sensed data sequence at the period $k$:

$$Z(k) = H(k)X(k) + V(k), \tag{10}$$

$H(k)$ is the observation model which maps the predicted data space into the actual sensed data sequence. $V(k)$ is the noise, which is assumed to be zero mean Gaussian white noise with covariance $R(k)$.
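The two prediction models introduced so far can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names `gm11_predict` and `kalman_predict` are hypothetical; the GM(1, 1) part follows Eqs. (1) to (8) with the standard sign convention (the model is fitted as x0(k) = -a z(k) + b) and requires readings greater than 1 so that the logarithm is positive; the Kalman part assumes a scalar state with A = 1, B = 0 and H = 1 in Eqs. (9) and (10), and the default values of `q`, `r` and `p0` are illustrative choices.

```python
import math


def gm11_predict(y, M=2):
    """One-step GM(1,1) forecast of the next reading from the last t readings."""
    t = len(y)
    if t < 3:
        raise ValueError("need at least 3 readings")
    if any(v <= 1.0 for v in y):
        raise ValueError("readings must be > 1 so that ln(y) > 0")
    # Adjusted sequence x0(j) = (ln y(j))^(1/M)
    x0 = [math.log(v) ** (1.0 / M) for v in y]
    # 1-AGO sequence: running sums of x0
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values z(k) = (x1(k) + x1(k-1)) / 2 for k = 2..t
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, t)]
    target = x0[1:]
    # Closed-form least squares fit of x0(k) = -a * z(k) + b
    n = len(z)
    sz, sx = sum(z), sum(target)
    szz = sum(v * v for v in z)
    szx = sum(u * v for u, v in zip(z, target))
    det = n * szz - sz * sz
    a = (sz * sx - n * szx) / det
    b = (szz * sx - sz * szx) / det
    if abs(a) < 1e-12:
        x0_next = b  # flat adjusted series: the model reduces to dx1/dt = b
    else:
        c = x0[0] - b / a
        # x0_hat(t+1) = x1_hat(t+1) - x1_hat(t), with x1_hat from the solved model
        x0_next = c * (math.exp(-a * t) - math.exp(-a * (t - 1)))
    # Undo the log/root adjustment
    return math.exp(x0_next ** M)


def kalman_predict(y, q=0.0, r=1.0, p0=1.0):
    """One-step forecast from a scalar Kalman filter with A = 1, B = 0, H = 1."""
    x, p = y[0], p0
    for z in y[1:]:
        p = p + q               # predict the error covariance
        gain = p / (p + r)      # Kalman gain
        x = x + gain * (z - x)  # correct the estimate with the measurement
        p = (1.0 - gain) * p    # update the error covariance
    return x                    # with A = 1 the forecast equals the estimate
```

On a slowly rising series such as `[20.0, 20.5, 21.0, 21.5, 22.0]`, `gm11_predict` extrapolates the trend, while `kalman_predict` with the defaults above behaves like a running mean and therefore smooths noise but lags a trend. The two behaviours are complementary, which is the motivation for combining the models.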
5.2. Kalman-Filter-based prediction algorithm

The Kalman Filter has two distinct phases: prediction and update. The prediction phase uses the data estimated from the previous sensing period to produce an estimation of the data at the current period. The prediction model and its covariance model are illustrated as Eqs. (11) and (12), respectively. In the update phase, measurement information at the current period is used to refine this prediction to achieve a new, more accurate data estimate, again for the current period. The updated prediction model and its covariance model are illustrated as Eqs. (13) and (14), respectively. The Optimal Kalman Gain can be computed using Eq. (15).

$$\hat{X}(k+1|k) = A(k)\hat{X}(k|k) + B(k)U(k), \tag{11}$$

$$P(k+1|k) = A(k)P(k|k)A(k)^T + Q(k), \tag{12}$$

$$\hat{X}(k+1|k+1) = \hat{X}(k+1|k) + Kg(k+1)\left[Y(k+1) - H(k+1)\hat{X}(k+1|k)\right], \tag{13}$$

$$P(k+1|k+1) = \left[I - Kg(k+1)H(k+1)\right]P(k+1|k), \tag{14}$$

$$Kg(k+1) = P(k+1|k)H(k+1)^T\left[H(k+1)P(k+1|k)H(k+1)^T + R(k)\right]^{-1}. \tag{15}$$

$\hat{X}(n|m)$ represents the estimate of $X$ at period $n$, given the sensed data sequence of the recent $m$ periods. $P(n|m)$ represents the error covariance matrix according to $\hat{X}(n|m)$.

In this section, the Kalman Filter is viewed as a single measurement of a single model for temperature prediction. To simplify the computation, we let $A(k) = 1$, $B(k) = Q(k) = 0$, $R(k) = H(k) = I$, and $P(0|0) = 1$. Let $Y(k) = \{y(1), y(2), \ldots, y(k)\}$ represent the historical sensed data sequence.

As a single-prediction-method-based approach, the KFDA is similar to the GMDA. The only difference is that they use different prediction algorithms.

6. CoGKDA

6.1. Modeling

In this section, GMDA and KFDA are combined to improve the accuracy of the prediction. Since the grey model is very effective for predicting data series with secular trends [1] and the Kalman Filter is useful for improving the prediction accuracy on data series that may oscillate frequently in the short term, the combination of the grey model and the Kalman Filter reduces randomness and improves prediction accuracy simultaneously. The main idea of CoGKDA is to combine the two independent prediction technologies to avoid potential deficiencies that a single prediction technology may lead to. In the combination, weights are used to balance the two prediction approaches. The optimal weights can be computed by minimizing the sum of prediction error squares of the meta approaches.

Without loss of generality, let $W = (w_1, w_2, \ldots, w_m)^T$ represent the weight vector of $m$ prediction models, $\sum_{i=1}^{m} w_i = 1$. Let $Y(t)$ represent the actual sensed data sequence of the recent $t$ periods and $\hat{Y}_j(t)$ ($t = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$) represent the corresponding predicted data sequences of the $m$ prediction models. The predicted data sequence of the combined approach is $\hat{Y}(t)$, $t = 1, 2, \ldots, n$. Therefore, the sum of prediction error squares can be denoted as $J(t)$ in Eq. (16).

$$J(t) = \sum_{j=1}^{m} w_j \left[ f\big(\hat{Y}(t)\big)^{p} - g\big(\hat{Y}_j(t)\big)^{p} \right]^{2}, \tag{16}$$

where $w_j \geq 0$, and $f$ and $g$ are continuous differentiable functions. To compute the extremum of $J(t)$, we let $\partial J(t) / \partial \hat{Y}(t) = 0$. The combined prediction model is defined as Eq. (17).

$$\hat{Y}(t) = f^{-1}\left[ \left( \sum_{j=1}^{m} w_j\, g\big(\hat{Y}_j(t)\big)^{p} \right)^{1/p} \right]. \tag{17}$$

In this paper, with the consideration of constrained computing resources, we let $f(\hat{Y}(t)) = \hat{Y}(t)$, $g(\hat{Y}_j(t)) = \hat{Y}_j(t)$, and $p = 1$ to simplify the model. As a result, the combined prediction model changes to the weighted arithmetic average of the meta prediction models.

6.2. Combination weights

For the $i$th prediction approach, its predicted data sequence is $\hat{y}_i = (\hat{y}_{i1}, \hat{y}_{i2}, \ldots, \hat{y}_{in})$, $i = 1, 2, \ldots, m$. Let $e_{ij}$ denote the error of the $i$th approach in predicting the $j$th datum, and let $e_i$ denote the error vector of the $i$th approach in Eq. (18).

$$e_i = (e_{i1}, e_{i2}, \ldots, e_{in}) = (y_1 - \hat{y}_{i1},\; y_2 - \hat{y}_{i2},\; \ldots,\; y_n - \hat{y}_{in}). \tag{18}$$

Using the weight vector $W$, the combined predicted data sequence can be described as Eq. (19).

$$\hat{Y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n) = \left( \sum_{i=1}^{m} w_i \hat{y}_{i1},\; \sum_{i=1}^{m} w_i \hat{y}_{i2},\; \ldots,\; \sum_{i=1}^{m} w_i \hat{y}_{in} \right). \tag{19}$$

Therefore, the errors of the combined prediction are $\hat{e}_j = y_j - \sum_{i=1}^{m} w_i \hat{y}_{ij}$, and the sum of squares of the combined prediction errors is $J(t)$:

$$J(t) = \sum_{j=1}^{n} \left( y_j - \sum_{i=1}^{m} w_i \hat{y}_{ij} \right)^{2}. \tag{20}$$

Using the Least Squares Method to minimize $J(t)$, the optimal weight vector can be obtained:

$$W = \frac{A^{-1}R}{R^T A^{-1} R},$$

where $R = (1, 1, \ldots, 1)^T$ and

$$A = \begin{pmatrix} \sum_{j=1}^{n} e_{1j}^{2} & \sum_{j=1}^{n} e_{1j}e_{2j} & \cdots & \sum_{j=1}^{n} e_{1j}e_{mj} \\ \sum_{j=1}^{n} e_{2j}e_{1j} & \sum_{j=1}^{n} e_{2j}^{2} & \cdots & \sum_{j=1}^{n} e_{2j}e_{mj} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{j=1}^{n} e_{mj}e_{1j} & \sum_{j=1}^{n} e_{mj}e_{2j} & \cdots & \sum_{j=1}^{n} e_{mj}^{2} \end{pmatrix}.$$

As described in Eq. (18), the combined prediction in CoGKDA needs the actual sensed data sequence to compute an optimal weight vector. Since the sink node does not have the actual sensed data sequence, it cannot compute the optimal weight vector. In addition, computing an optimal weight vector in every period is too expensive for a sensor node. In this paper, an empirical and periodical update mechanism is proposed to solve this problem. Sensor nodes compute and send their optimal weight vector to the sink node in their initial prediction period. After a fixed number of periods (denoted as u), sensor nodes periodically re-compute and send the fresh weight vector to the sink node to synchronize their prediction parameters.

6.3. CoGKDA algorithm

The predictions should be based on the same data sequence (PVQ) at both the sensor node and the sink node. The global thresholds of the prediction error and cumulative error are set before the predictions, denoted as $\varepsilon$ and $\theta$, respectively. According to the proposed data collection protocol, we let the length of data queues $l = t$. For all sensor nodes, the data of the first $t$ sensing periods must be transmitted to the sink node. Starting with the (t + 1)th period, the sensor node and the sink node do predictions. For example, let $y(t+1)$, $\hat{y}(t+1)$, and $\hat{y}'(t+1)$ represent the currently
sensed data, the data predicted by using the prediction value queue in the sensor node (PVQsensor), and the data predicted by using the actual data queue (ADQ), respectively. In the (t + 1)th period, $D(t+1) = \mathrm{abs}(\hat{y}(t+1) - y(t+1))$ is the prediction error and $D'(t+1) = \mathrm{abs}(\hat{y}'(t+1) - \hat{y}(t+1))$ is the cumulative error. The sensor node checks both errors against the pre-configured global thresholds. If $D(t+1) < e$ and $D'(t+1) < h$, the sensor node does not send the actual y(t + 1) to the sink node. The sink node considers $\hat{y}(t+1)$ as y(t + 1). The sensor node must set $y(t+1) = \hat{y}(t+1)$ to keep the prediction data sequence PVQsensor synchronized with the sink node's PVQsink for future predictions. Otherwise, the sensor node sends y(t + 1) to the sink node. The CoGKDA algorithm in the sensor node is described in Table 1.

According to the protocol described in Section 3, when continuous and successful predictions are made in a sensor node, the sink node will not receive any data from the sensor node for a long time. In this case, the sensor node is very similar to a dysfunctional (or failed) sensor node. In addition, as the number of continuous and successful predictions increases, the cumulative error will increase correspondingly. To distinguish the two different cases and avoid excessive cumulative error, we use another threshold v for the number of continuous and successful predictions. When the number of continuous and successful predictions reaches v, the sensor node must send the actual data to the sink node. Therefore, when the sink node receives actual data from a sensor node within v periods, it knows the sensor node is functioning. Otherwise, the sensor node will be temporarily treated as a dysfunctional node since it has not sent back data in the pre-configured time. In CoGKDA, v is a pre-configured parameter, which should be determined by the trade-off between reducing concurrent error and increasing communication overhead. It is reasonable to deduce that as v decreases, CoGKDA decreases concurrent error but increases communication overhead.

Table 1
The CoGKDA algorithm.

Input:
  Y(i): current prediction data sequence, $Y(i) = (\hat{y}_{i-t+1}, \hat{y}_{i-t+2}, \ldots, \hat{y}_i)$, $i \geq t$;
  W: static variable, the current optimal weight vector;
  $y_{i+1}$: the sensed data of the (i + 1)th period;
  r: static variable, the number of the continuous and successful predictions;
  u: static variable, the number of periods for re-computing weight vectors;
  v: static variable, a threshold for variable r. If $r \geq v$, sensor must send currently sensed data to sink node;
  s: static variable, the age of the current weight vector;
  e: static variable, the threshold of prediction error;
  h: static variable, the threshold of cumulative error;
Output:
  Y(i + 1): next prediction data sequence;

CoGKDA (Y(i), W, $y_{i+1}$, r, u, v, s, e, h)
{
  Perform GMDA prediction and obtain the predicted data $\hat{y}_g$ and its error $e_g$;
  Perform KFDA prediction and obtain the predicted data $\hat{y}_k$ and its error $e_k$;
  if s < u - 1 {
    s = s + 1;
    Perform the combination and obtain the predicted data $\hat{y}_c$, prediction error pec, and cumulative error cec;
  }
  else {
    Compute new optimal weight vector $W_{new}$;
    Send the new weight vector to the sink node;
    W = $W_{new}$;
    s = 0;
  }
  if pec < e and cec < h and r < v - 1 {
    r = r + 1;
    $\hat{y}_{i+1} = \hat{y}_c$; // Synchronize the next period prediction data sequence with the sink node.
    $Y(i+1) = (\hat{y}_{i-t+2}, \hat{y}_{i-t+3}, \ldots, \hat{y}_{i+1})$; // Refresh the prediction data sequence for future predictions.
  }
  else {
    Send $y_{i+1}$ to the sink node;
    r = 0;
    $Y(i+1) = (\hat{y}_{i-t+2}, \hat{y}_{i-t+3}, \ldots, \hat{y}_i, y_{i+1})$;
  }
  return Y(i + 1);
}

7. Experiment and performance evaluation

7.1. Experiment setup

In this paper, experiments are based on an environmental monitoring system in a granary. Since grain is liable to mildew when the humidity and temperature in the storehouse are too high, it is very important to monitor real-time humidity and temperature. The data used in our experiments are derived from a real deployed sensor network. The sensor network is used to collect the temperature and humidity of the grain in a large granary, which consists of 30 storehouses. Each storehouse is a detached building, which is divided into 24 volumes. The grain is stored in the volumes. In each volume, there are four sensor nodes buried in the grain. The sensor field of a volume is divided into four zones: top, middle-top, middle-bottom, and bottom. All sensor nodes in a storehouse form a tree-structured network with three layers: sensor layer, intermediate layer and sink layer. Nodes in the intermediate layer and the sink layer are externally powered, while nodes in the sensor layer are powered only by battery. Each intermediate node receives data from four sensor nodes in one volume and sends them to the sink node (intermediate nodes just relay data between sensor and sink nodes). One sink node is deployed in one storehouse to collect data from intermediate nodes. Sensor nodes sense and return temperature and humidity data every thirty minutes. This system has worked for three years and has collected a large volume of data.

To evaluate the proposed data aggregation approaches, these approaches are implemented in our test bed system, in which all sensor nodes are designed based on TinyOS 2 and the IEEE 802.15.4 protocol. All experiments are carried out based on a real data set. Since these prediction-based data aggregation approaches are structure-free and topology-free, a sensor node was randomly chosen for the experiments and only its temperature data is used. A data sequence (denoted as D) that includes 720 continuous data items (data for half a month) was randomly extracted from the original temperature data stream for the following experiments.

In the proposed approaches, e and h represent users' requirements on data accuracy. They are application-specific parameters. In the temperature monitoring of a granary, the tolerated error of the predicted data is relatively small, since the grain is sensitive to temperature changes. Therefore, in our experiments, we let e = 0.5 and e = 1 represent users' high and low data accuracy requirements, respectively. For simplicity, we let e = h.

7.2. GMDA

Compared to other prediction-based approaches, GMDA is very lightweight. The predicted data sequence of GMDA can be represented as $Y^{(0)} = (y^{(0)}(1), y^{(0)}(2), \ldots, y^{(0)}(t))$, where parameter t denotes the length of the sequence used for predictions. In general, longer sequences lead to more accurate predictions. However, greater length consumes more sensor storage and leads to higher computational complexity. To choose a suitable t value, we evaluate the growth rate of prediction accuracy while t changes from 3 to 9. The results are shown in Fig. 1. First, we randomly extracted three sub-sequences from the original data set. Each sub-sequence
consists of thirty continuous data items. Second, using each t value, we performed GMDA on the three sub-sequences, respectively, and then we obtained the average prediction accuracy of each t value. Finally, we computed the growth rate of prediction accuracy while t increases. The prediction accuracy is measured by e = 1 and e = 0.5, respectively. Since the growth rate of prediction accuracy achieves its maximum value when t = 5, as shown in Fig. 1, we choose t = 5 in the following GM(1, 1) algorithm experiments.

Fig. 1. Evaluation of parameter t in GMDA. (x-axis: length of data queue used for predicting (t); y-axis: growth rate of prediction accuracy; curves for e = 0.5 and e = 1.)

The experiments for GMDA are carried out based on the data sequence D. The results are shown in Figs. 2 and 3. As described in Section 3, the sensor node does not need to send actual data to the sink node when the error of the related prediction is within the threshold e, thereby reducing transmissions and power consumption. In Fig. 2, where e = 0.5, the cumulative distribution function (CDF) of the prediction errors produced by GMDA reaches 25.22% at the threshold. When we let e = 1, the CDF value of GMDA's prediction errors reaches 43.77%. Since GMDA causes no communication overhead, the CDF value of its prediction errors at the corresponding threshold is approximately equal to the percentage of communication energy saving, as shown in Table 2. As illustrated in Figs. 2 and 3, when the threshold of the prediction error decreases, energy consumption increases. The reason is that a smaller threshold causes a lower prediction success rate. In other words, more failed predictions cause more actual data communication.

Fig. 2. Cumulative distribution functions of the prediction errors of GMDA, KFDA and CoGKDA when e = 0.5. (x-axis: prediction error; y-axis: CDF.)

Fig. 3. Cumulative distribution functions of the prediction errors of GMDA, KFDA and CoGKDA when e = 1. (x-axis: prediction error; y-axis: CDF.)

Table 2
Performance comparison between GMDA, KFDA, CoGKDA and Auto-Regressive method when the threshold e = 1 and e = 0.5, where ts denotes the number of data of the training set in Auto-Regressive method.

Approaches                                  Communication energy savings
GMDA: t = 5, e = 0.5                        25.22%
GMDA: t = 5, e = 1                          43.77%
KFDA: e = 0.5                               33.62%
KFDA: e = 1                                 58.41%
Auto-Regressive method: ts = 60, e = 0.5    27.35%
Auto-Regressive method: ts = 60, e = 1      40.62%
CoGKDA: e = 0.5                             35.21%
CoGKDA: e = 1                               59.85%

7.3. KFDA

Using the same data sequence D as for GMDA, the experiments on KFDA were carried out. The results show that KFDA obtains a higher prediction success rate than GMDA, as illustrated in Figs. 2 and 3. The CDF value of the prediction errors produced by KFDA is much higher than GMDA's CDF value for both e = 0.5 and e = 1. According to the data collection protocol described in Section 3, in the experiments on KFDA, no communication overhead is produced. Table 2 shows that the percentages of communication energy savings are 33.62% and 58.41% when e = 0.5 and e = 1, respectively.

For data sequence D, KFDA behaves better than GMDA and better conserves power. Nonetheless, by comparing their prediction models (as described in Sections 6.1 and 6.2), KFDA requires more computation than GMDA in the process of data collection.

7.4. CoGKDA

To simplify the computation, we let $f(\hat{Y}(t)) = \hat{Y}(t)$, $g(\hat{Y}_j(t)) = \hat{Y}_j(t)$ and p = 1 in Eq. (16). Therefore, the CoGKDA model changes to a simple weighted arithmetic average of the two meta prediction models. The pivotal problem in CoGKDA is that the sensor node must compute optimal weights periodically and keep its weight vector synchronized with the sink node in predictions. The parameter u is used to control the interval of weight refresh. According to the proposed data collection protocol in Section 3,
the smaller u is, the higher the corresponding successful prediction rate will be and the more communication overhead CoGKDA will produce. To find a suitable u value, we carried out a simple experiment to evaluate the successful prediction rate and communication overhead while increasing u from 10 to 40 in steps of 5. In this experiment, we used the first 3u data items in D for predictions and computed prediction success rate and communication overhead for each u value. As illustrated in Fig. 4, the result shows u = 25 is the best choice for both e = 1 and e = 0.5. In CoGKDA, another parameter v is used to reduce long concurrent errors. To find a suitable v value, we carried out another simple experiment, in which we computed the percentage of overhead in communication energy consumption while changing v from 2 to 16. The result in Fig. 5 shows that communication overhead is 0 when (e = 1, v = 16) and (e = 0.5, v = 8). It is noticeable that the percentage of overhead in communication energy is less than 1% for both e = 1 and e = 0.5 when v = 10 (the percentage is approximately 0.87% when e = 1, and 0 when e = 0.5).

Fig. 4. Success prediction rate and communication overhead as the weight vector update interval (u) changes from 10 to 40. (x-axis: weight vector refresh interval (u); y-axis: success prediction rate and communication overhead; curves for e = 1 and e = 0.5, plus overhead.)

Based on the above evaluations, we let t = 5, v = 8 and u = 10. Using the same data sequence D, we carried out experiments on CoGKDA. As illustrated in Figs. 2, 3 and Table 2, the results show that CoGKDA obtained a higher prediction success rate than GMDA and KFDA. The CDF value of the prediction errors produced by CoGKDA is higher than those of KFDA and GMDA for both e = 0.5 and e = 1. From Figs. 2 and 3, it can be observed that the performance of CoGKDA is better than that of KFDA when the tolerable prediction error is in the low interval (range from 0 to 3), and close to that of KFDA when the tolerable prediction error is in the high interval (range from 3 to 5). Table 2 also shows that CoGKDA achieves energy savings of 35.21% and 59.85% when e = 0.5 and e = 1, respectively. It demonstrates that by combining KFDA and GMDA, CoGKDA improves communication energy saving with an insignificant increase in overhead.

State-of-the-art prediction-based approaches such as PAQ [8] and SAF [9] use Auto-Regressive or improved Auto-Regressive methods to aggregate data in wireless sensor networks. In SAF and PAQ, prediction models are built at each sensor to predict local readings. Sensor nodes transmit their local models to a sink node, which uses them to predict sensor values without the need for downlink communication with sensor nodes. When needed, sensor nodes send information about outlier readings and model updates to the sink node. To compare CoGKDA with PAQ and SAF, we carried out the same data collection experiments on them by using the same data sequence D. Since PAQ, SAF and AR need a training set before prediction, we let their training set contain 60 data items (the first 60 data items in D). In this experiment, a third order AR algorithm was used as a benchmark. The experiment results are shown in Figs. 6 and 7. Fig. 6 shows that CoGKDA presents a higher success rate and more energy saving than the third order Auto-Regressive model AR(3), PAQ, and SAF. To compare data accuracy of successful predictions, we analyze the mean square errors of successful predictions of these approaches. Fig. 7 shows that CoGKDA is also better than SAF and PAQ as the threshold e changes from 0.2 to 1. Since SAF and PAQ do not have a cooperation mechanism to synchronize the data series used for prediction, they produce a lower communication overhead. However, as illustrated in Figs. 6 and 7, the prediction errors and mean square errors of SAF and PAQ are much higher than those of CoGKDA. The reason is that, in SAF and PAQ, the sink node and the sensor node use different data series to predict, and different data series produce different prediction errors, which cause the cumulative error to increase.

It is remarkable that: (1) CoGKDA provides a high successful prediction rate by using the adjusted predicted data sequence PVQ, rather than the actual data sequence ADQ; and (2) CoGKDA reduces energy consumption caused by redundant communications with insignificant overhead.
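The sensor-side logic evaluated in this section can be sketched in a few lines. The following is only an illustrative sketch, not the authors' implementation: the function names (`combine`, `least_squares_weight`, `sensor_step`) are hypothetical, the two meta predictions and the cumulative error `cec` are assumed to be computed elsewhere, and the weight formula is the ordinary least-squares solution for two models under the sum-to-one constraint, which may differ in detail from Eqs. (16)-(20). The send/suppress test follows the condition given in Table 1.

```python
from collections import deque
from typing import Deque, Sequence, Tuple


def combine(y_g: float, y_k: float, w_g: float) -> float:
    """Weighted arithmetic average of the grey-model and Kalman predictions.

    The weights sum to 1, so only w_g is passed and w_k = 1 - w_g.
    """
    return w_g * y_g + (1.0 - w_g) * y_k


def least_squares_weight(y: Sequence[float],
                         y_g: Sequence[float],
                         y_k: Sequence[float]) -> float:
    """Weight w_g minimising the sum of squared errors of the combination.

    Assumption: plain least squares with w_g + w_k = 1 and no bounds on w_g.
    """
    e_g = [a - p for a, p in zip(y, y_g)]   # grey-model errors
    e_k = [a - p for a, p in zip(y, y_k)]   # Kalman errors
    den = sum((g - k) ** 2 for g, k in zip(e_g, e_k))
    if den == 0.0:                          # identical predictions: any weight works
        return 0.5
    num = sum(k * (k - g) for g, k in zip(e_g, e_k))
    return num / den


def sensor_step(pvq: Deque[float], y_sensed: float, y_c: float, cec: float,
                r: int, e: float, h: float, v: int) -> Tuple[bool, int]:
    """One sensing period at the sensor node, following the test in Table 1.

    pvq must be a deque with maxlen=t so that appending drops the oldest item.
    Returns (sent, new_r), where sent says whether y_sensed is transmitted.
    """
    pec = abs(y_c - y_sensed)               # prediction error
    if pec < e and cec < h and r < v - 1:
        pvq.append(y_c)                     # keep PVQ identical to the sink's copy
        return False, r + 1
    pvq.append(y_sensed)                    # actual value is sent and enqueued
    return True, 0


# Usage with made-up temperature values and t = 5, v = 8, e = h = 0.5.
pvq = deque([20.0, 20.1, 20.2, 20.3, 20.4], maxlen=5)
y_c = combine(20.5, 20.7, 0.5)
sent, r = sensor_step(pvq, 20.8, y_c, cec=0.1, r=0, e=0.5, h=0.5, v=8)
print(sent, r, list(pvq))
```

The sink node would run the same queue update on its own copy of PVQ, appending the predicted value when nothing arrives and the received value otherwise, which is what keeps the two sequences synchronized.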
Fig. 5. Communication overhead as v changes from 2 to 16. (x-axis: v; y-axis: overhead in communication energy consumption (%); curves for e = 0.5 and e = 1.)

Fig. 6. The comparison of the success rates as e changes from 0.2 to 1. (x-axis: prediction threshold; y-axis: the number of successful predictions; curves for AR(3), CoGKDA, SAF and PAQ.)
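The CDF values in Figs. 2 and 3 and the success counts in Fig. 6 rest on the same operation: counting the predictions whose error falls within the threshold. A minimal sketch of that count is given below. The function names and the error values are made up for illustration, and the fraction only approximates the communication energy saving when no weight-vector or forced-send overhead is incurred, as stated for GMDA in Section 7.2.

```python
from typing import Dict, Sequence


def success_counts(errors: Sequence[float],
                   thresholds: Sequence[float]) -> Dict[float, int]:
    """Number of predictions with error below each threshold e."""
    return {e: sum(1 for d in errors if d < e) for e in thresholds}


def suppressed_fraction(errors: Sequence[float], e: float) -> float:
    """Fraction of periods in which the sensor would not transmit.

    This is the empirical CDF of the prediction errors evaluated at e.
    """
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(1 for d in errors if d < e) / len(errors)


# Hypothetical absolute prediction errors for five sensing periods.
errors = [0.1, 0.3, 0.6, 0.9, 1.2]
print(success_counts(errors, [0.5, 1.0]))
print(suppressed_fraction(errors, 0.5))
```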
Fig. 7. The comparison of the mean square errors of all successful predictions as e changes from 0.2 to 1. (x-axis: prediction threshold; y-axis: mean square error; curves for CoGKDA, PAQ, SAF and AR(3).)

7.5. Complexity and scalability

The main computation in GMDA derives from the GM(1, 1) algorithm. According to Eqs. (1)-(8), it can be deduced that the computational complexity of GMDA is O(t), where t is the number of data items for the prediction algorithm. For most applications, the computational complexity of GMDA is very low when t = 5.

In the KFDA approach, as described in [16], the order of the Kalman Filter is $O(2m^2 n) + O(2mn^2) + O(m^3) + O(n^3)$. In this paper, m is the number of dimensions of the data prediction sequence at the prediction phase, and n is the number of predicted data items for one recursion. If n = am, where a > 1, then the number of computations can be transformed to $O[(1 + 2a + 2a^2 + a^3)m^3]$. As described in Eqs. (9)-(15), the computational complexity becomes $O(3m^3)$ when using the equivalent system with A(k) = 1, B(k) = Q(k) = 0, R(k) = H(k) = I and P(0|0) = 1. Therefore, the computational complexity of KFDA is $O\left(\frac{3}{1 + 2a + 2a^2 + a^3}\right)$ times that of the normal KF algorithm. For example, with a = 2 the denominator is 1 + 4 + 8 + 8 = 21, so the computational complexity of KFDA is only 3/21, or about 0.14 times, that of the normal KF algorithm. By analyzing the CoGKDA algorithm in Table 1, the computation of the CoGKDA approach mainly consists of three parts: GMDA, KFDA and computing the weight vector W. As described in Eqs. (16)-(20), the complexity for computing W is $O(mn^2)$, in which m is the number of meta prediction models and n is the number of data items in the prediction data sequence. In CoGKDA, the computational complexity for computing W is $O(2n^2)$. To unify measurements, we let m represent the number of data items in the prediction data sequence. Therefore, adding the three parts, the computational complexity of CoGKDA is $O(m) + O(3m^3) + O(6m^2)$, and the order of CoGKDA's computational complexity is approximately $O(m^3)$.

It can be seen that CoGKDA is the most complex, KFDA is simpler, and GMDA is the simplest. Although the computational complexity of CoGKDA seems high, in practice, it is acceptable for most applications when the length of data queues (ADQ and PVQ) is small, e.g. m = 5.

In the proposed approaches, all computations for data aggregation are only performed in the sensor node and the sink node. Data processing in intermediate nodes is not needed. Therefore, the proposed approaches are independent of network scale. Their performance is only determined by users' requirements, which are jointly controlled by parameters e, h, t, u and v. Furthermore, the proposed approaches can be used in tree-structured, cluster-structured and peer-structured wireless sensor networks, since the data collection protocol they use is structure-free. Therefore, they can be coupled with other route-based or topology-based data aggregation protocols. The above analysis indicates that the proposed approaches are scalable for most environmental monitoring applications.

8. Conclusion

Prediction-based data aggregation is a fundamental data-driven energy conservation approach. The prediction-based approach saves energy by reducing redundant data communications. Since the prediction-based approach is structure-free, it can be coupled with other route-based or topology-based data aggregation approaches. By analyzing energy efficiency and data accuracy, a novel prediction-based data collection protocol is proposed to specify the cooperation between the sensor node and the sink node. In the proposed protocol, a double-queue mechanism is designed to synchronize predicted data at both the sensor node and the sink node to avoid cumulative error in continuous predictions. Based on this novel data collection protocol, three prediction-based data aggregation approaches are proposed: GMDA, KFDA and CoGKDA. Experiments have been carried out based on a real data set collected from a temperature and humidity monitoring application in a granary. The results demonstrate that the proposed approaches can reduce energy consumption caused by redundant communications with minimally increased overhead. Experiments also show that CoGKDA achieves better performance compared to traditional prediction-based approaches (including SAF, PAQ and AR).

Acknowledgement

This work was supported by the National Natural Science Foundation of China under Grant No. 60803161.

References

[1] J.L. Deng, Introduction to Grey system theory, Journal of Grey System 1 (1) (1989) 1-24.
[2] R. Rajagopalan, P.K. Varshney, Data aggregation techniques in sensor networks: a survey, IEEE Communications Surveys & Tutorials 8 (4) (2006) 48-63.
[3] S. Ozdemir, Y. Xiao, Secure data aggregation in wireless sensor networks: a comprehensive overview, Computer Networks 53 (2009) 2022-2037.
[4] G. Anastasi, M. Conti, M.D. Francesco, A. Passarella, Energy conservation in wireless sensor networks: a survey, Ad Hoc Networks 7 (2009) 537-568.
[5] A. Deshpande, C. Guestrin, S. Madden, J.M. Hellerstein, W. Hong, Model-driven data acquisition in sensor networks, in: Proceedings of the 30th International Conference on Very Large Data Bases, 2004, pp. 588-599.
[6] D. Chu, A. Deshpande, J.M. Hellerstein, W. Hong, Approximate data collection in sensor networks using probabilistic models, in: Proceedings of the 22nd International Conference on Data Engineering, 2006, pp. 48-59.
[7] B. Kanagal, A. Deshpande, Online filtering, smoothing and probabilistic modeling of streaming data, in: Proceedings of the 24th International Conference on Data Engineering, 2008, pp. 1160-1169.
[8] D. Tulone, S. Madden, PAQ: time series forecasting for approximate query answering in sensor networks, in: Proceedings of the Third European Conference on Wireless Sensor Networks, 2006, pp. 21-37.
[9] D. Tulone, S. Madden, An energy-efficient querying framework in sensor networks for detecting node similarities, in: Proceedings of the Ninth International ACM Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems, 2006, pp. 291-300.
[10] Y. Le-Borgne, S. Santini, G. Bontempi, Adaptive model selection for time series prediction in wireless sensor networks, Signal Process 87 (12) (2007) 3010-3020.
[11] S. Goel, A. Passarella, T. Imielinski, Using buddies to live longer in a boring world, in: Proceedings of 2006 IEEE International Workshop on Sensor Networks and Systems for Pervasive Computing, 2006, pp. 342-346.
[12] Q. Han, S. Mehrotra, N. Venkatasubramanian, Energy efficient data collection in distributed sensor environments, in: Proceedings of the 24th IEEE International Conference on Distributed Computing Systems, 2004, pp. 590-597.
[13] R.E. Kalman, A new approach to linear filtering and prediction problems, Transactions of the ASME Journal of Basic Engineering 82 (Series D) (1960) 35-45.
[14] B. Pasztor, M. Musolesi, C. Mascolo, Opportunistic mobile sensor data collection with SCAR, in: Proceedings of MASS 2007, 2007, pp. 1-12.
[15] M. Musolesi, S. Hailes, C. Mascolo, Adaptive routing for intermittently connected mobile ad hoc networks, in: Proceedings of 2005 IEEE WoWMoM, 2005, pp. 183-189.
[16] M.J. Goris, D.A. Gray, I.M.Y. Mareels, Reducing the computational load of a Kalman filter, Electronics Letters 33 (18) (1997) 1539-1541.
[17] R. Olfati-Saber, Distributed Kalman filtering for sensor networks, in: The 46th IEEE Conference on Decision and Control, 2007, pp. 5492-5498.
[18] W. Yu, G. Chen, Z. Wang, W. Yang, Distributed consensus filtering in sensor networks, IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics 39 (6) (2009) 1568-1577.
[19] R. Kay, F. Mattern, The design space of wireless sensor networks, IEEE Wireless Communications 11 (6) (2004) 54-61.
[20] Y. Yao, B.B. Giannakis, Energy-efficient scheduling for wireless sensor networks, IEEE Transactions on Communications 53 (8) (2005) 54-61.
[21] Q. Xin, L. Gasieniec, C. Su, P. Wong, Routing via single-source and multiple-source queries in static sensor networks, in: Proceedings of IPDPS 2005, 2005, pp. 183-189.
