RFID Technologies For IoT
Min Chen
Shigang Chen
RFID Technologies
for Internet of
Things
Wireless Networks
Series editor
Xuemin (Sherman) Shen
University of Waterloo, Waterloo, Ontario, Canada
Min Chen
Department of Computer and Information Science
University of Florida
Gainesville, FL, USA

Shigang Chen
Department of Computer and Information Science
University of Florida
Gainesville, FL, USA
This work is supported in part by the National Science Foundation under grants CNS-1409797 and
STC-1562485.
1 Introduction
1.1 Internet of Things
1.2 RFID Technologies
1.3 Tag Search Problem
1.4 Anonymous RFID Authentication
1.5 Identification of Networked Tags
1.6 Outline of the Book
References
2 Efficient Tag Search in Large RFID Systems
2.1 System Model and Problem Statement
2.1.1 System Model
2.1.2 Time Slots
2.1.3 Problem Statement
2.2 Related Work
2.2.1 Tag Identification
2.2.2 Polling Protocol
2.2.3 CATS Protocol
2.3 A Fast Tag Search Protocol Based on Filtering Vectors
2.3.1 Motivation
2.3.2 Bloom Filter
2.3.3 Filtering Vectors
2.3.4 Iterative Use of Filtering Vectors
2.3.5 Generalized Approach
2.3.6 Values of m_i
2.3.7 Iterative Tag Search Protocol
2.3.8 Cardinality Estimation
2.3.9 Additional Filtering Vectors
2.3.10 Hardware Requirement
Generally, every physical object in the IoT needs to be augmented with some auto-
ID technologies such that the object can be uniquely identified. Radio Frequency
Identification (RFID) [12] is one of the most widely used auto-ID technologies.
RFID technologies integrate simple communication, storage, and computation
components in attachable tags that can communicate with readers wirelessly over
a distance. Therefore, RFID technologies provide a simple and cheap way of
connecting physical objects to the IoT—as long as an object carries a tag, it can
be identified and tracked by readers.
RFID technologies have been pervasively used in numerous applications, such
as inventory management, supply chain, product tracking, transportation, logistics,
and toll collection [1, 3, 8, 10, 16–19, 21–24, 27, 29, 32, 35, 37–39]. According to
market research conducted by IDTechEx [30], the RFID market reached
$8.89 billion in 2014 and is projected to rise to $27.31 billion over the following decade.
Typically, an RFID system consists of a large number of RFID tags, one or multiple
RFID readers, and a backend server. Today’s commercial tags can be classified into
three categories: (1) passive tags, which are powered by the radio wave from an
RFID reader and communicate with the reader through backscattering; (2) active
tags, which are powered by their own energy sources; and (3) semi-active tags,
which use internal energy sources to power their circuits while communicating
with the reader through backscattering. As specified in EPC Class-1 Gen-2 (C1G2)
protocol [12], each tag has a unique ID identifying the object it is attached to.
The object can be a vehicle, a product in a warehouse, an e-passport that carries
personal information, a medical device that records a patient’s health data, or any
other physical object in IoT. The integrated transceiver of each tag enables it to
transmit and receive radio signals. Therefore, a reader can communicate with a tag
over a distance as long as the tag is located in its interrogation area. However,
communications amongst RFID tags are generally not feasible due to their low
transmission power. The emerging networked tags [13, 14] bring a fundamental
enhancement to RFID tags by enabling tags to communicate with each other. The
networked tags are integrated with energy-harvesting components that can harvest
energy from the surrounding environment.
The widespread use of RFID tags in IoT brings about new issues on efficiency,
security, and privacy that are quite different from those in traditional networking
systems [7, 36]. This book presents several state-of-the-art RFID protocols that aim
at improving the efficiency, security, and privacy of the IoT.
Given a set of IDs for the wanted tags, the tag search problem is to identify which
wanted tags are present in an RFID system [9, 11]. Note that there may exist
other tags that do not belong to the set. As an example, a manufacturer finds that
some of its products, which have been distributed to different warehouses, may be
defective, and wants to recall them for further inspection. Since each product in the
IoT carries a tag, and the manufacturer knows all tag IDs of those defective products,
it can perform tag search in each warehouse to identify the products that need to be
recalled.
To meet the stringent delay requirements of real-world applications, time
efficiency is a critical performance metric for the RFID tag search problem. For
example, it is highly desirable to make the search quick in a busy warehouse, as a
lengthy search process may interfere with other activities that move things in
and out of the warehouse. The only prior work studying this problem is called CATS
[40], which, however, does not work well under some common conditions (e.g., if
the size of the wanted set is much larger than the number of tags in the coverage
area of the reader).
We present a fast tag search method based on a new technique called filtering
vectors. A filtering vector is a compact one-dimensional bit array constructed from
tag IDs, which can be used for filtering unwanted tags. Using the filtering vectors,
we design, analyze, and evaluate a novel iterative tag search protocol, which
progressively improves the accuracy of the search result and reduces the time of each
iteration to a minimum by using the information learned from previous iterations.
Given an accuracy requirement, the iterative protocol terminates once the
search result meets that requirement. We show that our protocol performs
much better than the CATS protocol and other alternatives used for comparison.
In addition, our protocol can be extended to work over a noisy channel with a modest
increase in execution time.
The proliferation of RFID tags in their traditional ways makes their carriers
trackable. Should future tags penetrate into everyday products in the IoT and be
carried around (oftentimes unknowingly), people’s privacy would become a serious
concern. A typical tag will automatically transmit its ID in response to the query
from a nearby reader. If we carry tags in our pockets or in our cars, these tags
will give off their IDs to any readers that query them, allowing others to track us.
As an example, a person whose car carries a tag (automatic toll payment [35]
or tagged plate [34]) may be unknowingly tracked over years by toll booths or
others who install readers at locations of interest to learn when and where he has
been. To protect the privacy of tag carriers, we need to invent ways of keeping
tags useful while keeping their carriers anonymous.
Many RFID applications such as toll payment require authentication. A reader
will accept a tag’s information only after authenticating the tag and vice versa.
Anonymous authentication should prohibit the transmission of any identifying
information, such as tag ID, key identifier, or any fixed number that may be used
for identification purposes. This raises a challenge: how can a
legitimate reader efficiently identify the right key for authentication without any
identifying information from the tag?
The importance and challenge of anonymous authentication have attracted much attention
from the RFID research community. Many anonymous authentication protocols
have been designed. However, we will show that all prior work has some potential
problems, either incurring high computation or communication overhead, or raising
security or functional concerns. Moreover, most prior work, if not all, employs
cryptographic hash functions, which require considerable hardware [2], to randomize
authentication data in order to make the tags untrackable. The high hardware
requirement makes them unsuitable for low-cost tags with limited hardware resources.
Hence, designing anonymous authentication protocols for low-cost tags remains an
open and challenging problem [4].
In our design, we make a fundamental shift from the traditional paradigm for
anonymous RFID authentication [5]. First, we release the resource-constrained
RFID tags from implementing any complicated functions (e.g., cryptographic
hashes). Since readers are not needed in as large a quantity as tags, they
can have far more hardware resources. Therefore, we follow the asymmetry
design principle to push most complexity to the readers while leaving the tags
as simple as possible. Second, we develop a novel technique to generate random
tokens on demand for anonymous authentication. Our protocol only requires O(1)
communication overhead and online computation overhead per authentication for
both readers and tags, which is a significant improvement over the prior art. Hence,
our protocol is scalable to large RFID systems. Finally, extensive theoretical analysis,
security analysis, simulations, and statistical randomness tests are provided to verify
the effectiveness of our protocol.
The rest of the book is organized as follows. Chapter 2 presents an efficient tag
search protocol based on filtering vectors and evaluates the impact of channel noise
on its performance. Chapter 3 introduces the problem of anonymous authentication
in RFID systems. We present a lightweight anonymous authentication protocol
using dynamically generated tokens. The protocol only requires constant communi-
cation and computation overhead for both readers and tags. Chapter 4 discusses the
problem of identifying networked tags. Two tag identification protocols are given
and compared in detail.
References
11. Chen, M., Luo, W., Mo, Z., Chen, S., Fang, Y.: An efficient tag search protocol in large-scale
RFID systems with noisy channel. IEEE/ACM Trans. Networking PP(99), 1–1 (2015)
12. EPC Radio-Frequency Identity Protocols Class-1 Gen-2 UHF RFID Protocol for Communications
at 860 MHz–960 MHz, EPCglobal. Available at http://www.epcglobalinc.org/uhfclg2
(2011)
13. Gorlatova, M., Kinget, P., Kymissis, I., Rubenstein, D., Wang, X., Zussman, G.: Challenge:
ultra-low-power energy-harvesting active networked tags (EnHANTs). In: Proceedings of
ACM Mobicom, pp. 253–260 (2009)
14. Gorlatova, M., Margolies, R., Sarik, J., Stanje, G., Zhu, J., Vigraham, B., Szczodrak, M.,
Carloni, L., Kinget, P., Kymissis, I., Zussman, G.: Prototyping energy harvesting active
networked tags (EnHANTs). In: Proceedings of IEEE INFOCOM mini-conference (2013)
15. Kinget, P., Kymissis, I., Rubenstein, D., Wang, X., Zussman, G.: Energy harvesting active
networked tags (EnHANTs) for ubiquitous object networking. IEEE Trans. Wirel. Commun.
17(6), 18–25 (2010)
16. Lee, C.H., Chung, C.W.: Efficient storage scheme and query processing for supply chain
management using RFID. In: Proceedings of ACM SIGMOD (2008)
17. Li, Y., Ding, X.: Protecting RFID communications in supply chains. In: Proceedings of IEEE
ASIACCS (2007)
18. Li, T., Chen, S., Ling, Y.: Efficient protocols for identifying the missing tags in a large RFID
system. IEEE/ACM Trans. Networking 21(6), 1974–1987 (2013)
19. Liu, J., Xiao, B., Bu, K., Chen, L.: Efficient distributed query processing in large RFID-enabled
supply chains. In: Proceedings of IEEE INFOCOM, pp. 163–171 (2013)
20. Liu, V., Parks, A., Talla, V., Gollakota, S., Wetherall, D., Smith, J.R.: Ambient backscatter:
wireless communication out of thin air. In: Proceedings of ACM SIGCOMM, pp. 39–50 (2013)
21. Liu, J., Chen, M., Xiao, B., Zhu, F., Chen, S., Chen, L.: Efficient RFID grouping protocols.
IEEE/ACM Trans. Networking PP(99), 1–1 (2016)
22. Luo, W., Chen, S., Li, T.: Probabilistic missing-tag detection and energy-time tradeoff in large-
scale RFID systems. In: Proceedings of ACM Mobihoc (2012)
23. Luo, W., Qiao, Y., Chen, S.: An efficient protocol for RFID multigroup threshold-based
classification. In: Proceedings of IEEE INFOCOM, pp. 890–898 (2013)
24. Luo, W., Qiao, Y., Chen, S., Chen, M.: An efficient protocol for RFID multigroup threshold-
based classification based on sampling and logical bitmap. IEEE/ACM Trans. Networking
24(1), 397–407 (2016)
25. Myung, J., Lee, W.: Adaptive splitting protocols for RFID tag collision arbitration.
In: Proceedings of ACM Mobihoc (2006)
26. Network Everything. Available at http://openinterconnect.org
27. Ni, L., Liu, Y., Lau, Y.C.: Landmarc: indoor location sensing using active RFID. In: Proceedings
of IEEE PerCom (2003)
28. Qian, C., Liu, Y., Ngan, H., Ni, L.M.: ASAP: scalable identification and counting for
contactless RFID systems. In: Proceedings of IEEE ICDCS (2010)
29. Qiao, Y., Chen, S., Li, T.: Energy-efficient polling protocols in RFID systems. In: Proceedings
of ACM Mobihoc (2011)
30. RFID Report. Available at http://www.idtechex.com/research/reports/rfid-forecasts-players-and-opportunities-2014-2024-000368.asp (2013)
31. Shahzad, M., Liu, A.X.: Probabilistic optimal tree hopping for RFID identification.
In: Proceedings of ACM SIGMETRICS, pp. 293–304 (2013)
32. Sheng, B., Tan, C., Li, Q., Mao, W.: Finding popular categories for RFID tags. In: Proceedings
of ACM Mobihoc (2008)
33. Sheng, B., Li, Q., Mao, W.: Efficient continuous scanning in RFID systems. In: Proceedings of
IEEE INFOCOM (2010)
34. SINIAV. Available at http://roadpricing.blogspot.com/2012/08/brazil-to-have-compulsory-toll-tags-by.html (2012)
35. Sun Pass. Available at https://www.sunpass.com/index
36. Xia, Y., Chen, S., Cho, C., Korgaonkar, V.: Algorithms and performance of load-balancing with
multiple hash functions in massive content distribution. Comput. Netw. 53(1), 110–125 (2009)
37. Xiao, Q., Chen, M., Chen, S., Zhou, Y.: Temporally or spatially dispersed joint RFID estimation
using snapshots of variable lengths. In: Proceedings of ACM Mobihoc (2015)
38. Xiao, Q., Chen, S., Chen, M.: Joint property estimation for multiple RFID tag sets using
snapshots of variable lengths. In: Proceedings of ACM Mobihoc (2016)
39. Zhang, Z., Chen, S., Ling, Y., Chow, R.: Capacity-aware multicast algorithms on heterogeneous
overlay networks. IEEE Trans. Parallel Distrib. Syst. 17(2), 135–147 (2006)
40. Zheng, Y., Li, M.: Fast tag searching protocol for large-scale RFID systems. IEEE/ACM Trans.
Networking 21(3), 924–934 (2012)
Chapter 2
Efficient Tag Search in Large RFID Systems
This chapter introduces the tag search problem in large RFID systems. A new
technique called the filtering vector is designed to reduce the transmission overhead
during the search process, thereby improving the time efficiency. Based on this
technique, we present an iterative tag search protocol. Some tags are filtered out in
each round, and the search process terminates when the result meets a
given accuracy requirement. Moreover, the protocol is extended to work over a noisy
channel. The simulation results demonstrate that our protocol performs much better
than the best existing work.
The rest of this chapter is organized as follows. Section 2.1 gives the system
model and the problem statement. Section 2.2 briefly introduces the related work.
Section 2.3 describes our new protocol in detail. Section 2.4 addresses the noisy
wireless channel. Section 2.5 evaluates the performance of our protocol by simulations.
Section 2.6 gives the summary.
executing the protocol in parallel. The readers in each group can be regarded as
an integrated unit, still called a reader for simplicity. Many works regarding multi-
reader coordination can be found in literature [5, 7, 17].
In practice, the tag-to-reader transmission rate and the reader-to-tag transmission
rate may be different and subject to the environment. For example, as specified
in the EPCglobal Class-1 Gen-2 standard, the tag-to-reader transmission rate is
40–640 kbps in the FM0 encoding format or 5–320 kbps in the Miller modulated
subcarrier encoding format, while the reader-to-tag transmission rate is about
26.7–128 kbps. However, to simplify our discussion, we assume the tag-to-reader
transmission rate and the reader-to-tag transmission rate are the same; it is
straightforward to adapt our protocol for asymmetric transmission rates.
The RFID reader and the tags in its coverage area use a framed slotted MAC
protocol to communicate. We assume that clocks of the reader and all tags in
the RFID system are synchronized by the reader’s signal. During each frame, the
communication is initiated by the reader in a request-and-response mode: the
reader broadcasts a request with some parameters to the tags and then waits for
the tags to reply in the subsequent time slots.
Consider an arbitrary time slot. We call it an empty slot if no tag replies in this
slot, or a busy slot if one or more tags respond in this slot. Generally, a tag just needs
to send one bit of information to make the channel busy such that the reader can sense
its existence. The reader uses “0” to represent an empty slot with an idle channel
and “1” for a busy slot with a busy channel. The length of a slot for a tag to transmit
a one-bit short response is denoted as $t_s$. Note that $t_s$ can be set larger than the time
of one-bit data transmission for better tolerance of clock drift in tags. Some prior
RFID work needs another type of slot for transmission of tag IDs, which will be
introduced shortly.
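The slot model can be illustrated with a minimal Python sketch. It assumes a hypothetical `slot_of` helper that uses SHA-256 as a stand-in for whatever lightweight hash a real tag implements, and hypothetical names `frame_status` and `frame_duration`; none of these come from the text.

```python
import hashlib

def slot_of(tag_id, seed, frame_size):
    """Map a tag ID to a slot index.

    SHA-256 is only a stand-in (assumption); real tags use far simpler hashes.
    """
    digest = hashlib.sha256(f"{seed}:{tag_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % frame_size

def frame_status(tag_ids, seed, frame_size):
    """Reader's view of one frame: 0 for an empty slot, 1 for a busy slot."""
    status = [0] * frame_size
    for tag_id in tag_ids:
        # A one-bit reply is enough to make the slot busy; the reader cannot
        # tell how many tags replied in the same slot.
        status[slot_of(tag_id, seed, frame_size)] = 1
    return status

def frame_duration(frame_size, t_s):
    """Time to listen to a frame of one-bit slots, each of length t_s."""
    return frame_size * t_s
```

A frame with no replying tag is all zeros, and several tags hashing to the same slot still produce a single "1", which is why the reader learns presence but not counts from such a frame.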
$$R_{INTS} = \frac{|W|}{\min\{|X|, |Y|\}}. \tag{2.1}$$
Exactly finding W can be expensive if X and Y are very large. It is much more
efficient to find W approximately, allowing a small bounded error [28]: all wanted
tags in the coverage area must be identified, but a few wanted ones that are not in
the coverage area may be accidentally included (see footnote 1).
Our solution works iteratively. Each round rules out some tags in X when it
becomes certain that they are not in the coverage area (i.e., Y), and it also rules out
some tags in Y when it becomes certain that they are not wanted ones in X. These
ruled-out tags are called non-candidate tags. Other tags that remain possible to be
in both X and Y are called candidate tags. At the beginning, the search result is
initialized to all wanted tags X. As our solution is iteratively executed, the search
result shrinks towards W when more and more non-candidates are ruled out.
Let $W^*$ be the final search result. We have the following two requirements:
1. All wanted tags in the coverage area must be detected, namely $W \subseteq W^*$.
2. A false positive occurs when a tag in $X - W$ is included in $W^*$, i.e., a tag not in
the coverage area is kept in the search result by the reader (see footnote 2). The false-positive
ratio is the probability for any tag in $X - W$ to be in $W^*$ after the execution of
a search protocol. We want to bound the false-positive ratio by a pre-specified
system requirement $P_{REQ}$, whose value is set by the user. In other words, we
expect
$$\frac{|W^* - W|}{|X - W|} \le P_{REQ}. \tag{2.2}$$
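As a hedged illustration of the two quantities in (2.1) and (2.2), the following Python sketch computes the intersection ratio and the false-positive ratio of a search result with ordinary set operations. The function names and the argument `w_star` (the final search result) are assumptions made for this example.

```python
def intersection_ratio(x, y):
    """R_INTS = |W| / min(|X|, |Y|), with W = X ∩ Y, as in (2.1)."""
    w = x & y
    return len(w) / min(len(x), len(y))

def false_positive_ratio(x, y, w_star):
    """|W* - W| / |X - W| for a search result w_star, as in (2.2)."""
    w = x & y
    if not w <= w_star:
        # Requirement 1: every wanted tag in the coverage area must be reported.
        raise ValueError("search result misses wanted tags that are present")
    absent_wanted = x - w
    if not absent_wanted:
        return 0.0  # no tag can be a false positive
    return len(w_star - w) / len(absent_wanted)
```

For example, with X = {1, ..., 10} and Y = {8, ..., 15}, the true result is W = {8, 9, 10}; a search result {1, 8, 9, 10} contains one falsely included tag out of seven absent wanted tags.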
Notations used in this chapter are given in Table 2.1 for quick reference.
A straightforward solution for the tag search problem is to identify all existing tags
in Y. After that, we can apply an intersection operation $X \cap Y$ to compute W. The EPC
C1G2 standard assumes that the reader can only read one tag ID at a time. Dynamic
Framed Slotted ALOHA (DFSA) [4, 8, 19–21] is implemented to deal with tag
collisions, where each frame consists of a certain number of equal-duration slots.
Footnote 1: If perfect accuracy is necessary, a post step may be taken by the reader to broadcast the identified
IDs. As the wanted tags in the coverage area reply after hearing their IDs, those mistakenly included
tags can be excluded due to non-response to these IDs.
Footnote 2: The nature of our protocol guarantees that no tag in $Y - W$ is included in $W^*$.
It is proved that the theoretical upper bound of identification throughput using DFSA
is approximately $1/e$ tags per slot ($e$ is the natural constant), which is achieved when
the frame size is set equal to the number of unidentified tags [25]. As specified in
EPC C1G2, each slot consists of the transmissions of a QueryAdjust or QueryRep
command from the reader, one tag ID, and two 16-bit random numbers: one for
the channel reservation (collision avoidance) sent by the tags, and the other for
ACK/NAK transmitted by the reader. We denote the duration of each slot for tag
identification as $t_l$. Therefore, the lower bound of identification time for tags in Y
using DFSA is
$$T_{DFSA} = e \cdot |Y| \cdot t_l. \tag{2.3}$$
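The $1/e$ bound can be checked numerically. The sketch below assumes the standard framed-ALOHA model in which each of n tags picks one of f slots uniformly at random, so a given slot holds exactly one tag with probability n(1/f)(1 - 1/f)^(n-1). The function names are hypothetical.

```python
import math

def dfsa_throughput(num_tags, frame_size):
    """Expected number of tags identified per slot.

    This is the probability that a slot is a singleton when each tag
    picks one of frame_size slots uniformly at random.
    """
    p = 1.0 / frame_size
    return num_tags * p * (1.0 - p) ** (num_tags - 1)

def dfsa_time_lower_bound(num_tags, t_l):
    """T_DFSA = e * |Y| * t_l, i.e. |Y| tags at 1/e tags per slot, as in (2.3)."""
    return math.e * num_tags * t_l
```

With 1000 tags, a frame of 1000 slots gives a throughput close to $1/e \approx 0.368$, while frames of 500 or 2000 slots give noticeably less, which is consistent with setting the frame size equal to the number of unidentified tags.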
One limitation of the current DFSA is that the information contained in collision
slots is wasted. Some recent work [3, 12, 15, 16, 24, 27] focuses on Collision
Recovery (CR) techniques, which enable the resolution of multiple tag IDs from a
collision slot. Benefiting from the CR techniques, the identification throughput can
be dramatically improved, up to 3.1 tags per slot in [16]. Suppose the throughput is
$\lambda$ tags per slot after adopting the CR techniques. The lower bound for identification
time is
$$T_{CR} = \frac{|Y|}{\lambda} \cdot t_l. \tag{2.4}$$
Note that after employing the CR techniques the real duration of each slot can be
longer than $t_l$. The reason is that the reader may need to acknowledge multiple tags
and the tags may need to send extra messages to facilitate collision recovery.
The polling protocol provides an alternative solution to the tag search problem.
Instead of collecting all IDs in Y, the reader can broadcast the IDs in X one by
one. Upon receiving an ID, each tag checks whether the received ID is identical to
its own. If so, the tag transmits a one-bit short response to notify the reader about its
presence; otherwise, the tag keeps silent. Hence, the execution time of the polling
protocol is
$$T_{Polling} = |X| \cdot (t_{id} + t_s), \tag{2.5}$$
where $t_{id}$ is the time cost for the reader to broadcast a tag ID.
The polling protocol is very efficient when $|X|$ is small. However, it also has
serious limitations. First, it does not work well when $|X| \gg |Y|$. Second, the energy
consumption of tags (particularly when active tags are used) is significant because
tags in Y have to continuously listen to the channel and receive a large number of
IDs until their own IDs are received.
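A minimal Python sketch of polling follows. It assumes that each broadcast ID is followed by exactly one short-response slot, so every wanted ID costs t_id + t_s regardless of whether a tag answers; the function name `poll` and its parameters are hypothetical.

```python
def poll(wanted_ids, present_ids, t_id, t_s):
    """Broadcast each wanted ID and listen for a one-bit reply.

    Returns the wanted tags found in the coverage area and the elapsed time.
    Assumption: one ID broadcast (t_id) plus one reply slot (t_s) per wanted ID.
    """
    present = set(present_ids)
    found = []
    elapsed = 0.0
    for tag_id in wanted_ids:
        elapsed += t_id + t_s
        if tag_id in present:  # the matching tag makes the reply slot busy
            found.append(tag_id)
    return found, elapsed
```

The result is exact, but the cost grows linearly in the number of wanted IDs and is independent of how many tags are actually present, which is why polling degrades when the wanted set is much larger than the tag population in range.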
To address the problems of the tag identification and polling protocols, Zheng et al.
design a two-phase protocol named Compact Approximator based Tag Searching
protocol (CATS) [28], which is the most efficient solution for the tag search problem
to date.
The main idea of the CATS protocol is to encode tag IDs into a Bloom filter and
then transmit the Bloom filter instead of the IDs. In its first phase, the reader encodes
all IDs of wanted tags in X into an $L_1$-bit Bloom filter, and then broadcasts this
filter together with some parameters to tags in the coverage area. Having received
this Bloom filter, each tag tests whether it belongs to the set X. If the answer is
negative, the tag is a non-candidate and will keep silent for the remaining time. After
the filtration of phase one, the number of candidate tags in Y is reduced. During the
second phase, the remaining candidate tags in Y report their presence in a second
$L_2$-bit Bloom filter constructed from a frame of time slots of length $t_s$. Each candidate tag
transmits in the $k$ slots that it is mapped to. Listening to the channel, the reader builds the
Bloom filter based on the status of the time slots: “0” for an idle slot where no tag
transmits, and “1” for a busy slot where at least one tag transmits. Using this Bloom
filter, the reader conducts filtration for the IDs in X to see which of them belong to
Y, and the result is regarded as $X \cap Y$.
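The two phases can be sketched in Python as follows. This is an illustrative simulation under stated assumptions, not the CATS implementation: SHA-256 stands in for the tags' hash functions, the filter lengths `l1`, `l2` and the hash count `k` are passed in directly rather than optimized, and all function names are hypothetical.

```python
import hashlib

def positions(tag_id, k, length, salt):
    """k pseudo-random bit positions for a tag ID (SHA-256 is a stand-in hash)."""
    out = []
    for i in range(k):
        digest = hashlib.sha256(f"{salt}:{i}:{tag_id}".encode()).digest()
        out.append(int.from_bytes(digest[:8], "big") % length)
    return out

def build_filter(ids, k, length, salt):
    """Encode a collection of IDs into a Bloom filter of the given length."""
    bits = [0] * length
    for tag_id in ids:
        for p in positions(tag_id, k, length, salt):
            bits[p] = 1
    return bits

def passes(tag_id, bits, k, salt):
    """Membership test: all k mapped bits must be ones."""
    return all(bits[p] for p in positions(tag_id, k, len(bits), salt))

def cats_search(x, y, l1, l2, k):
    """Two-phase search; returns a superset of X ∩ Y contained in X."""
    # Phase 1: the reader broadcasts a filter encoding X; tags in Y test themselves.
    phase1 = build_filter(x, k, l1, "phase1")
    candidates = [t for t in y if passes(t, phase1, k, "phase1")]
    # Phase 2: candidate tags reply in their k slots; busy/idle slots form a filter.
    phase2 = build_filter(candidates, k, l2, "phase2")
    # The reader filters X against the second filter.
    return {t for t in x if passes(t, phase2, k, "phase2")}
```

Because a Bloom filter never rejects an encoded member, every tag in X ∩ Y survives both phases; any extra tags in the result are false positives whose rate depends on the filter lengths.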
With a pre-specified false-positive ratio requirement $P_{REQ}$, the CATS protocol
uses the following optimal settings for $L_1$ and $L_2$:
$$L_1 = |X| \log_{\phi}\left(\frac{-\alpha |X|}{\beta |Y| \ln P_{REQ}}\right), \tag{2.6}$$
$$L_2 = \frac{|X|}{\ln \phi}\left(\ln P_{REQ} - \frac{\alpha}{\beta}\right), \tag{2.7}$$
where $\phi$ is a constant that equals 0.6185, and $\alpha$ and $\beta$ are constants pertaining to the
reader-to-tag transmission rate and the tag-to-reader transmission rate, respectively.
In CATS, the authors assume $t_s$ is the time needed to deliver one bit of data, and
$\alpha = \beta$, i.e., the reader-to-tag transmission rate and the tag-to-reader transmission
rate are identical. Therefore, the total search time of the CATS protocol is
$$T_{CATS} = (L_1 + L_2) \cdot t_s = |X| \left( \log_{\phi}\left(\frac{-|X|}{|Y| \ln P_{REQ}}\right) + \frac{\ln P_{REQ} - 1}{\ln \phi} \right) t_s. \tag{2.8}$$
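To get a feel for the magnitudes in (2.6) to (2.8), the following Python sketch evaluates them numerically. It reads the formulas with a minus sign inside the logarithm (so that its argument is positive when $P_{REQ} < 1$) and writes the constant 0.6185 as `PHI`; both the sign convention and the names are assumptions of this example.

```python
import math

PHI = 0.6185  # the constant in (2.6)-(2.8); numerically (1/2) ** ln 2

def cats_filter_sizes(x, y, p_req, alpha=1.0, beta=1.0):
    """Return (L1, L2) in bits for |X| = x, |Y| = y and requirement p_req."""
    arg = -alpha * x / (beta * y * math.log(p_req))  # positive because ln(p_req) < 0
    l1 = x * math.log(arg, PHI)
    l2 = x / math.log(PHI) * (math.log(p_req) - alpha / beta)
    return l1, l2

def cats_time(x, y, p_req, t_s):
    """Total search time (L1 + L2) * t_s when alpha == beta."""
    l1, l2 = cats_filter_sizes(x, y, p_req)
    return (l1 + l2) * t_s
```

For instance, with 10,000 wanted tags, 50,000 tags in range and a 5 % requirement, the two filters come out at roughly 5.6 and 8.3 bits per wanted tag. If |X| is so large relative to |Y| that the logarithm's argument exceeds 1, the formula yields a negative $L_1$, which signals that the derivation no longer applies.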
This section presents an Iterative Tag Search Protocol (ITSP) to solve the tag search
problem in large-scale RFID systems. We ignore channel errors for now and
defer this subject to Sect. 2.4.
2.3.1 Motivation
Although the CATS protocol takes a significant step forward in solving the tag
search problem, it still has several important drawbacks. First, when optimizing
the Bloom filter sizes $L_1$ and $L_2$, CATS approximates $|X \cap Y|$ simply as $|X|$.
This rough approximation may cause considerable overhead when $|X \cap Y|$ deviates
significantly from $|X|$.
Second, it assumes that $|X| < |Y|$ in its design and formula derivation. In reality,
the number of wanted tags may be far greater than the number in the coverage area of
an RFID system. For example, there may be a huge number $|X|$ of tagged products
that are under recall, but as the products are distributed to many warehouses, the
number $|Y|$ of tags in a particular warehouse may be much smaller than $|X|$.
Although CATS can still work when $|X| \gg |Y|$, it becomes
less efficient, as our simulations will demonstrate.
Third, the performance of CATS is sensitive to the false-positive ratio requirement
$P_{REQ}$. The performance deteriorates when the value of $P_{REQ}$ is very small.
While the simulations in [28] set $P_{REQ} = 5\%$, its value may have to be much smaller
in some practical cases. For example, suppose $|X| = 100{,}000$ and $|W| = 1000$. If
we set $P_{REQ} = 5\%$, the number of wanted tags that are falsely claimed to be in Y
by CATS will be up to $|X - W| \cdot P_{REQ} = 4950$, far more than the 1000 wanted tags
that are actually in Y.
We will show that an iterative way of implementing Bloom filters is much more
efficient than the classical way that the CATS protocol adopts.
A Bloom filter is a compact data structure that encodes the membership for a set of
items. To represent a set $S = \{e_1, e_2, \ldots, e_n\}$ using a Bloom filter, we need a bit
array of length $l$ in which all bits are initialized to zeros. To encode each element
$e \in S$, we use $k$ hash functions, $h_1, h_2, \ldots, h_k$, to map the element randomly to $k$ bits
in the bit array, and set those bits to ones. For membership lookup of an element $b$,
we again map the element to $k$ bits in the array and see if all of them are ones. If so,
we claim that $b$ belongs to $S$; otherwise, it must be true that $b \notin S$. A Bloom filter
may cause false positives: a non-member element is falsely claimed as a member
of $S$. The probability for a false positive to occur in a membership lookup is given
as follows [2, 23]:
$$P_B = \left(1 - \left(1 - \frac{1}{l}\right)^{kn}\right)^k \approx \left(1 - e^{-kn/l}\right)^k. \tag{2.9}$$
When $k = \ln 2 \cdot \frac{l}{n}$, $P_B$ is approximately minimized to $\left(\frac{1}{2}\right)^k = \left(\frac{1}{2}\right)^{\ln 2 \cdot \frac{l}{n}}$. In order to
achieve a target value of $P_B$, the minimum size of the filter is $-\frac{\ln P_B}{(\ln 2)^2} n$.
CATS sends one Bloom filter from the reader to tags and another Bloom filter
from tags back to the reader. Consider the first Bloom filter that encodes X. As
$n = |X|$, the filter size is $-\frac{\ln P_B}{(\ln 2)^2} |X|$. As an example, to achieve $P_B = 0.001$, the
size becomes $14.4 |X|$ bits. Similarly, the size of the second filter from tags to the
reader is also related to the target false-positive probability.
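The sizing arithmetic can be reproduced with a few lines of Python. The sketch evaluates the exact and approximate forms of the false-positive probability in (2.9), the optimal number of hash functions, and the minimum filter size; the function names are hypothetical, and `k` is allowed to be fractional for the purpose of the calculation.

```python
import math

def bloom_fp_exact(n, l, k):
    """Exact form in (2.9): (1 - (1 - 1/l)^(k*n))^k."""
    return (1.0 - (1.0 - 1.0 / l) ** (k * n)) ** k

def bloom_fp_approx(n, l, k):
    """Approximate form in (2.9): (1 - e^(-k*n/l))^k."""
    return (1.0 - math.exp(-k * n / l)) ** k

def optimal_k(n, l):
    """Number of hash functions that approximately minimizes P_B: ln 2 * l / n."""
    return math.log(2) * l / n

def min_filter_bits(n, p_b):
    """Minimum filter size for a target P_B: -ln(P_B) / (ln 2)^2 * n."""
    return -math.log(p_b) / math.log(2) ** 2 * n
```

Calling `min_filter_bits(1, 0.001)` gives about 14.4 bits per encoded ID, matching the figure quoted for $P_B = 0.001$, and plugging that size and `optimal_k` back into the approximate formula returns the target probability.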
Below we show that the overall size of the Bloom filter can be significantly
reduced by reconstructing it as filtering vectors and then iteratively applying these
vectors.
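A Bloom filter can also be implemented in a segmented way. We divide its bit array
into $k$ equal segments, and the $i$th hash function will map each element to a random
bit in the $i$th segment, for $i \in [1 \ldots k]$. We name each segment a filtering vector
(FV), which has $l/k$ bits. The following formula gives the false-positive probability
of a single filtering vector, i.e., the probability for a non-member to be mapped to a
“1” bit in the vector:
$$P_{FV} = 1 - \left(1 - \frac{k}{l}\right)^n \approx 1 - e^{-kn/l}. \tag{2.10}$$
Because a non-member is falsely claimed as a member only if it passes all $k$ filtering
vectors, the overall false-positive probability of the segmented Bloom filter is
$$P_{FP} = (P_{FV})^k \approx \left(1 - e^{-kn/l}\right)^k, \tag{2.11}$$
which is approximately the same as the result in (2.9). It means that the two ways
of implementing a Bloom filter have similar performance. The value $P_{FP}$ is also
minimized when $k = \ln 2 \cdot \frac{l}{n}$. Hence, the optimal size of each filtering vector is
$$\frac{l}{k} = \frac{n}{\ln 2}, \tag{2.12}$$
which results in
$$P_{FV} \approx \frac{1}{2}. \tag{2.13}$$
Namely, each filtering vector on average filters out half of non-members.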
Figure 2.1 illustrates the concept of filtering vectors. Suppose we have two
elements $a$ and $b$, two hash functions $h_1$ and $h_2$, and an 8-bit bit array. First,
suppose $h_1(a) \bmod 8 = 1$, $h_1(b) \bmod 8 = 7$, $h_2(a) \bmod 8 = 5$, $h_2(b) \bmod 8 = 2$,
and we construct a Bloom filter for $a$ and $b$ in the upper half of the figure. Next, we
divide the bit array into two 4-bit filtering vectors, and apply $h_1$ to the first segment
and $h_2$ to the second segment. Since $h_1(a) \bmod 4 = 1$, $h_1(b) \bmod 4 = 3$, $h_2(a) \bmod 4 = 1$,
$h_2(b) \bmod 4 = 2$, we build the two filtering vectors in the lower half of the figure.
[Fig. 2.1, lower half: 1st FV = 0101, 2nd FV = 0110]
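The example of Fig. 2.1 can be reproduced with a short Python sketch. Only the residues of the hash values are given in the text, so the raw hash outputs used here (1 and 5 for a, 7 and 2 for b) are assumed values chosen to match those residues; the function names are hypothetical.

```python
def bloom_filter(hash_values, length):
    """Classical Bloom filter: every hash function addresses the whole array.

    hash_values holds, for each element, one raw hash output per hash function.
    """
    bits = [0] * length
    for hashes in hash_values:
        for h in hashes:
            bits[h % length] = 1
    return bits

def filtering_vectors(hash_values, length, k):
    """Segmented Bloom filter: the i-th hash function addresses only segment i."""
    segment = length // k
    vectors = [[0] * segment for _ in range(k)]
    for hashes in hash_values:
        for i, h in enumerate(hashes):
            vectors[i][h % segment] = 1
    return vectors

# Assumed raw hash outputs (h1, h2) for elements a and b.
HASHES = [(1, 5), (7, 2)]
```

With these inputs, `bloom_filter(HASHES, 8)` sets bits 1, 2, 5 and 7 of the 8-bit array, while `filtering_vectors(HASHES, 8, 2)` yields the two 4-bit vectors 0101 and 0110 described for the lower half of the figure.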
Fig. 2.2 Iterative use of filtering vectors. Each arrow represents one filtering vector, and the length
of the arrow indicates the filtering vector’s size, which is specified to the right. As the size shrinks
in subsequent rounds, the total amount of data exchanged between the reader and the tags is
significantly reduced
In this work, we use filtering vectors in a novel iterative way: Bloom filters between the reader and tags are exchanged in rounds; one filtering vector is exchanged in each round, and the size of the filtering vector is continuously reduced in subsequent rounds, such that the overall size of each Bloom filter is much reduced.
Below we use a simplified example to explain the idea, which is illustrated in Fig. 2.2: Suppose there is no wanted tag in the coverage area of an RFID reader, namely $X \cap Y = \emptyset$. In round one, we first encode $X$ in a filtering vector of size $|X|/\ln 2$ through a hash function $h_1$, and broadcast the vector to filter tags in $Y$. Using the same hash function, each candidate tag in $Y$ knows which bit in the vector it is mapped to, and it only needs to check the value of that bit. If the bit is zero, the tag becomes a non-candidate and will not participate in the protocol execution further. The filtering vector reduces the number of candidate tags in $Y$ to about $|Y| \times P_{FV} \approx |Y|/2$. Then a filtering vector of size $|Y|/(2 \ln 2)$ is sent from the remaining candidate tags in $Y$ back to the reader in a way similar to [28]: Each candidate tag hashes its ID to a slot in a time frame and transmits a one-bit response in that slot. By listening to the states of the slots in the time frame, the reader constructs the filtering vector, “1” for busy slots and “0” for empty slots. The reader uses this vector to filter non-candidate tags from $X$. After filtering, the number of candidate tags remaining in $X$ is reduced to about $|X| \times P_{FV} \approx |X|/2$. Only the candidate tags in $X$ need to be encoded in the next filtering vector, using a different hash function $h_2$. Hence, in the second round, the size of the filtering vector from the reader to tags is reduced by half to $|X|/(2 \ln 2)$, and similarly the size of the filtering vector from tags to the reader is also reduced by half to $|Y|/(4 \ln 2)$. Repeating the above process, it is easy to see that in the $i$th round, the size of the filtering vector from the reader to tags is $|X|/(2^{i-1} \ln 2)$, and the size of the filtering vector from tags to the reader is $|Y|/(2^{i} \ln 2)$. After $K$ rounds, the total size of all filtering vectors from the reader to tags is
$$\frac{1}{\ln 2} \sum_{i=1}^{K} \frac{|X|}{2^{i-1}} < \frac{2|X|}{\ln 2}, \tag{2.14}$$
where $\frac{2|X|}{\ln 2}$ is an upper bound, regardless of the number $K$ of rounds (i.e., regardless of the requirement on the false-positive probability). It compares favorably to CATS, whose filter size, $\frac{-\ln P_B}{(\ln 2)^2} |X|$, grows as $P_B$ decreases and reaches $14.4 \times |X|$ bits when $P_B = 0.001$ in our earlier example.
Similarly, the total size of all filtering vectors from tags to the reader is
$$\frac{1}{\ln 2} \sum_{i=1}^{K} \frac{|Y|}{2^{i}} < \frac{|Y|}{\ln 2}, \tag{2.15}$$
and $P_{FP} = (P_{FV})^K \approx \left(\frac{1}{2}\right)^K$. We can make $P_{FP}$ as small as we like by increasing $K$, while the total transmission overhead never exceeds $\frac{1}{\ln 2}(2|X| + |Y|)$ bits. The strength of filtering vectors in bidirectional filtration lies in their ability to reduce the candidate sets during each round, thereby diminishing the sizes of filtering vectors in subsequent rounds and thus saving time. How much the subsequent filtering vectors shrink depends on $|X - W|$ and $|Y - W|$: the more tags lie outside of $W$, the more tags are filtered out in each round, and the greater the reduction.
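To make the bounds concrete, the sketch below sums the per-round vector sizes of the simplified example (where $X \cap Y = \emptyset$ and every vector halves the other side's candidates) and compares them with the single-filter size quoted for CATS. The function names and the sample values of $|X|$, $|Y|$, and $K$ are illustrative assumptions, not part of the protocol.

```python
import math

def iterative_sizes(x, y, rounds):
    """Total bits reader->tags and tags->reader when each round halves the candidates."""
    ln2 = math.log(2)
    to_tags = sum(x / (2 ** (i - 1) * ln2) for i in range(1, rounds + 1))
    to_reader = sum(y / (2 ** i * ln2) for i in range(1, rounds + 1))
    return to_tags, to_reader

def cats_filter_size(x, p_b):
    """Size in bits of the reader->tags Bloom filter quoted for CATS."""
    return -math.log(p_b) / math.log(2) ** 2 * x

x, y, rounds = 10_000, 50_000, 10          # illustrative values only
to_tags, to_reader = iterative_sizes(x, y, rounds)
print(round(to_tags), "<", round(2 * x / math.log(2)))
print(round(to_reader), "<", round(y / math.log(2)))
print(round(cats_filter_size(x, 0.001) / x, 1), "bits per wanted tag for CATS")
```

Because both sums are geometric, adding more rounds tightens the false-positive probability without pushing either total past its bound.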
Unlike the CATS protocol, our iterative approach divides the bidirectional filtration in the tag search process into multiple rounds. Before the $i$th round, the set of candidate tags in $X$ is denoted as $X_i$ ($\subseteq X$), which is also called the search result after the $(i-1)$th round. The final search result is the set of remaining candidate tags in $X$ after all rounds are completed. Before the $i$th round, the set of candidate tags in $Y$ is denoted as $Y_i$ ($\subseteq Y$). Initially, $X_1 = X$ and $Y_1 = Y$. We define $U_i = X_i - W$ and $V_i = Y_i - W$, which are the tags to be filtered out. Because $W$ is always a subset of both $X_i$ and $Y_i$, we have
Fig. 2.3 Generalized approach. Each round has two phases. In phase one, the reader transmits zero, one, or multiple filtering vectors. In phase two, the tags send exactly one filtering vector to the reader. In the example shown by the figure, $m_1 = 2$ and $m_2 = 0$, which means that the reader sends two filtering vectors in the first round and none in the second round
vector is sent from the remaining candidate tags in $Y_{i+1}$ back to the reader, which uses the received filtering vector to shrink its set of remaining candidates from $X_i$ to $X_{i+1}$, setting the stage for the next round. This process continues until the false-positive ratio meets the requirement of $P_{REQ}$.
The values of $m_i$ will be determined in the next subsection. If $m_i > 0$, the reader sends $m_i$ filtering vectors consecutively to the tags in one round. If $m_i = 0$, no filtering vector is sent from the reader in this round. When this happens, it essentially allows multiple filtering vectors to be sent consecutively from tags to the reader (across multiple rounds). An illustration is given in Fig. 2.3.
2.3.6 Values of $m_i$
Let $K$ be the total number of rounds. After all $K$ rounds, we use $X_{K+1}$ as our search result. There are in total $K$ filtering vectors sent from tags to the reader. We know from Sect. 2.3.3 that each filtering vector can filter out half of the non-members (in our case, tags in $X - W$). To meet the false-positive ratio requirement $P_{REQ}$, the following constraint should hold:
$$(P_{FV})^K \approx \left(\frac{1}{2}\right)^K \le P_{REQ}. \tag{2.17}$$
Hence, the value of $K$ is set to $\left\lceil \frac{-\ln P_{REQ}}{\ln 2} \right\rceil$. (We will discuss how to guarantee meeting the requirement $P_{REQ}$ in Sect. 2.3.9.)
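As a quick numerical illustration of this choice of $K$, the following sketch evaluates the ceiling for a few requirement values and confirms that $(1/2)^K$ does not exceed $P_{REQ}$. The function name `rounds_needed` is an assumption made for this example.

```python
import math

def rounds_needed(p_req):
    """Smallest K with (1/2)**K <= p_req, following (2.17)."""
    return math.ceil(-math.log(p_req) / math.log(2))

for p_req in (1e-2, 1e-3, 1e-6):
    k = rounds_needed(p_req)
    print(p_req, k, 0.5 ** k <= p_req)
```

The number of rounds grows only logarithmically as the requirement tightens, which is why stricter requirements add few rounds.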
Next, we discuss how to set the values of $m_i$, $1 \le i \le K$, in order to minimize the execution time of each round. We use $FV(\cdot)$ to denote the filtering vector of a set. In phase one of the $i$th round, the reader builds $m_i$ filtering vectors, denoted as $FV_{i1}(X_i)$, $FV_{i2}(X_i)$, $\ldots$, $FV_{im_i}(X_i)$, which are consecutively broadcast to the tags. From (2.12), we know the size of each filtering vector is $|X_i|/\ln 2$. After the filtration based on these vectors, the number of remaining candidate tags in $Y_{i+1}$ is on average
$$|Y_{i+1}| \approx |V_i| \times (P_{FV})^{m_i} + |W| \approx |V_i|/2^{m_i} + |W|. \tag{2.18}$$
In phase two of the $i$th round, the tags in $Y_{i+1}$ use a time frame of $\frac{1}{\ln 2}|Y_{i+1}|$ slots to report their presence. After receiving the responses, the reader builds a filtering vector, denoted as $FV_i(Y_{i+1})$. After the filtration based on $FV_i(Y_{i+1})$, the size of the search result $X_{i+1}$ is on average
$$|X_{i+1}| \approx |U_i| \times P_{FV} + |W| \approx |U_i|/2 + |W|. \tag{2.19}$$
We denote the transmission time of the $i$th round by $f(m_i)$. In order to make a fair comparison with CATS, we utilize the parameter setting that conforms with [28]. Therefore, $f(m_i) = \frac{1}{\ln 2} m_i |X_i| \times t_s + \frac{1}{\ln 2} |Y_{i+1}| \times t_s$. Substituting the average size of $Y_{i+1}$, we obtain
$$f(m_i) = \frac{t_s}{\ln 2}\left(m_i |X_i| + |V_i|/2^{m_i} + |W|\right). \tag{2.20}$$
To find the value of $m_i$ that minimizes $f(m_i)$, we take the first-order derivative and set it to zero:
$$\frac{d f(m_i)}{d m_i} = \frac{t_s}{\ln 2}\left(|X_i| - \ln 2 \, |V_i|/2^{m_i}\right) = 0. \tag{2.21}$$
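Setting the derivative to zero gives the real-valued root $m_i = \log_2(\ln 2 \, |V_i| / |X_i|)$, but $m_i$ must be a non-negative integer. The sketch below shows one way to pick it, under the assumption (not stated in the surviving text) that it suffices to compare the two integers adjacent to the root, which holds because $f$ is convex in $m_i$. The function names are illustrative.

```python
import math

def round_time(m, x_i, v_i, w, t_s=1.0):
    """f(m_i) from (2.20): phase-one cost plus phase-two cost, in units of t_s."""
    return t_s / math.log(2) * (m * x_i + v_i / 2 ** m + w)

def best_m(x_i, v_i, w):
    """Integer m_i >= 0 minimizing f; compares the neighbours of the real-valued root."""
    ratio = math.log(2) * v_i / x_i
    if ratio <= 1:                  # derivative is non-negative for all m >= 0
        return 0
    root = math.log2(ratio)         # solves |X_i| = ln2 * |V_i| / 2**m
    candidates = sorted({math.floor(root), math.ceil(root)})
    return min(candidates, key=lambda m: round_time(m, x_i, v_i, w))

print(best_m(10_000, 45_000, 5_000))   # many unwanted tags in Y: filtering pays off
print(best_m(80_000, 45_000, 5_000))   # large X: reader vectors cost more than they save
```

The second call illustrates why $m_i = 0$ can be optimal: when $|X_i|$ is large relative to $|V_i|$, broadcasting a vector costs more than the phase-two slots it saves.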
$|Y_2|$, $|X_2|$, and $|V_2|$ based on (2.18), (2.19), and (2.16), respectively. From $|X_2|$ and $|V_2|$, we can calculate the value $m_2$. Following the same procedure, we can iteratively compute all values of $m_i$ for $1 \le i \le K$.
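The round-by-round procedure can be sketched as follows. This is a hedged illustration that tracks only expected set sizes: it assumes $|U_{i+1}| \approx |U_i|/2$ and $|V_{i+1}| \approx |V_i|/2^{m_i}$, finds each $m_i$ by a small brute-force search rather than a closed form, and uses made-up input sizes.

```python
import math

def plan_rounds(x, y, w, p_req):
    """Return the list of m_i computed from average-case set sizes."""
    k = math.ceil(-math.log(p_req) / math.log(2))
    u, v = x - w, y - w                 # |U_1| and |V_1|
    plan = []
    for _ in range(k):
        x_i = u + w                     # |X_i| = |U_i| + |W|
        # integer m >= 0 minimising m*|X_i| + |V_i| / 2**m (the bracket in f)
        m_i = min(range(64), key=lambda m: m * x_i + v / 2 ** m)
        plan.append(m_i)
        v = v / 2 ** m_i                # unwanted tags left in Y after phase one
        u = u / 2                       # unwanted tags left in X after phase two
    return plan

plan = plan_rounds(x=10_000, y=50_000, w=1_000, p_req=0.001)
print(plan)
```

With these sample inputs the later entries of the plan are zeros, which matches the observation that the $m_i$ sequence often ends with consecutive zeros.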
We find it often happens that the $m_i$ sequence has several consecutive zeros at the end, that is, $\exists p < K$ such that $m_i = 0$ for $i \in [p, K]$. In this case, we may be able to further optimize the value of $m_p$ with a slight adjustment. We first explain the reason for $m_p = 0$: It costs some time for the reader to broadcast a filtering vector in phase one of the $p$th round. It is true that this filtering vector can reduce set $Y_p$, thereby reducing the frame size of phase two in the $p$th round. However, if the time cost of sending the filtering vector cannot be compensated by the time reduction of phase two, it is better to remove this filtering vector by setting $m_p = 0$. (This situation typically happens near the end of the $m_i$ sequence because the number of unwanted tags in the remaining candidate set $Y_p$ is already very small.) But if all values of $m_i$ in the subsequent rounds (after $m_p$) are zeros, increasing $m_p$ to a non-zero value $m'_p$ may help reduce the transmission time of phase two of all subsequent rounds, and the total time reduction may more than compensate for the time cost of sending those $m'_p$ filtering vectors.
Consider the transmission time of these $(K - p + 1)$ rounds as a whole, denoted by $G(m'_p, p)$. It is easy to derive
$$G(m'_p, p) = \left(\frac{m'_p}{\ln 2} |X_p| + \frac{K - p + 1}{\ln 2}\left(\frac{|V_p|}{2^{m'_p}} + |W|\right)\right) t_s. \tag{2.23}$$
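The effect of this adjustment can be checked numerically. The sketch below evaluates $G$ for integer values of $m'_p$ and picks the smallest; the search cap `m_max` and the sample set sizes are assumptions made for illustration only.

```python
import math

def tail_time(m_p, p, k, x_p, v_p, w, t_s=1.0):
    """G(m'_p, p) from (2.23): one phase one plus (K - p + 1) phase twos."""
    ln2 = math.log(2)
    return (m_p / ln2 * x_p + (k - p + 1) / ln2 * (v_p / 2 ** m_p + w)) * t_s

def best_tail_m(p, k, x_p, v_p, w, m_max=32):
    """Integer m'_p in [0, m_max] minimizing G; m_max is an arbitrary search cap."""
    return min(range(m_max + 1), key=lambda m: tail_time(m, p, k, x_p, v_p, w))

# Sample sizes for round p = 7 of K = 10, where the per-round rule gives m_p = 0.
args = dict(p=7, k=10, x_p=1140.6, v_p=1531.25, w=1000)
m_adj = best_tail_m(**args)
print(m_adj, tail_time(m_adj, **args) < tail_time(0, **args))
```

In this sample, a single round would not justify any reader vector, yet sending vectors once is worthwhile because the smaller frame is reused in all remaining rounds.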
Having calculated the values of $m_i$, we can present our iterative tag search protocol (ITSP) based on the generalized approach in Sect. 2.3.5. The protocol consists of $K$ iterative rounds. Each round consists of two phases. Consider the $i$th round, where $1 \le i \le K$.
The RFID reader constructs $m_i$ filtering vectors for $X_i$ using $m_i$ hash functions. According to (2.12), we set the size $L_{X_i}$ of each filtering vector as
$$L_{X_i} = \frac{1}{\ln 2} |X_i|. \tag{2.25}$$
The RFID reader then broadcasts those filtering vectors one by one. Once receiving a filtering vector, each tag in $Y_i$ maps its ID to a bit in the filtering vector using the same hash function that the reader uses to construct the filter. The tag checks whether this bit is “1”. If so, it remains a candidate tag; otherwise, it is excluded as a non-candidate tag and drops out of the search process immediately. The set of remaining candidate tags is $Y_{i+1}$.
If the filtering vectors are too long, the reader divides each vector into blocks of
a certain length (e.g., 96 bits) and transmits one block after another. Knowing which
bit it is mapped to, each tag only needs to record one block that contains its bit.
From (2.13), we know that the false-positive probability after using $m_i$ filtering vectors is $(P_{FV})^{m_i} \approx (1/2)^{m_i}$. Therefore, $|Y_{i+1}| \approx |V_i| \times (P_{FV})^{m_i} + |W| \approx |V_i|/2^{m_i} + |W|$.
The reader broadcasts the frame size $L_{Y_{i+1}}$ of phase two to the tags, where
$$L_{Y_{i+1}} = \frac{1}{\ln 2}\left(|V_i|/2^{m_i} + |W|\right). \tag{2.26}$$
After receiving $L_{Y_{i+1}}$, each tag in $Y_{i+1}$ randomly maps its ID to a slot in the time frame using a hash function and transmits a one-bit short response to the reader in that slot. Based on the observed state (busy or empty) of the slots in the time frame, the reader builds a filtering vector, which is used to filter non-candidates from $X_i$.
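The two phases of a round can be simulated in software to see the candidate sets shrink. The sketch below makes several simplifying assumptions: SHA-256 stands in for the tag hash, the phase-two frame is sized from the true $|Y_{i+1}|$ (a real reader would use the estimate in (2.26)), the channel is error-free, and the tag IDs are small integers invented for the example.

```python
import hashlib
import math

def slot(tag_id, seed, size):
    """Map a tag ID to a position in [0, size); SHA-256 is only a stand-in hash."""
    digest = hashlib.sha256(f"{seed}:{tag_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size

def filtering_vector(tags, seed, size):
    """Set the bit (or mark the slot busy) for every tag in the set."""
    bits = [0] * size
    for tag in tags:
        bits[slot(tag, seed, size)] = 1
    return bits

def run_round(x_i, y_i, seeds, reply_seed="reply"):
    """One round: phase one filters Y_i with len(seeds) vectors, phase two filters X_i."""
    size_x = max(1, math.ceil(len(x_i) / math.log(2)))
    y_next = set(y_i)
    for seed in seeds:                                  # phase one, reader -> tags
        fv = filtering_vector(x_i, seed, size_x)
        y_next = {t for t in y_next if fv[slot(t, seed, size_x)] == 1}
    size_y = max(1, math.ceil(len(y_next) / math.log(2)))
    frame = filtering_vector(y_next, reply_seed, size_y)  # phase two, tags -> reader
    x_next = {t for t in x_i if frame[slot(t, reply_seed, size_y)] == 1}
    return x_next, y_next

wanted = set(range(0, 2000))        # X (illustrative IDs)
present = set(range(1500, 6000))    # Y (illustrative IDs)
x_next, y_next = run_round(wanted, present, seeds=["s1", "s2"])
print(len(x_next), len(y_next))
```

Tags in both sets always survive, because each of them sets the very bit it later checks; only tags outside the intersection can be filtered out.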
The overall transmission time of all $K$ rounds in the ITSP is
$$T_{ITSP} = \sum_{i=1}^{K} \left(m_i \times L_{X_i} + L_{Y_{i+1}}\right) \times t_s. \tag{2.27}$$
Recall from Sect. 2.3.6 that we must know the values of $|X_i|$, $|W|$, and $|V_i|$ to determine $m_i$, $L_{X_i}$, and $L_{Y_{i+1}}$. It is trivial to find the value of $|X_i|$ by counting the number of tags in the search result of the $(i-1)$th round. Meanwhile, we know $|V_i| \approx |V_{i-1}|/2^{m_{i-1}}$ and $|V_1| = |Y_1| - |W|$. Therefore, we only need to estimate $|W|$ and $|Y_1|$.
Besides serving as a filter, a filtering vector can also be used for cardinality estimation, a feature that is not exploited in [28]. Since no filtering vector is available at the very beginning, the first round of the ITSP should be treated separately: We may use the efficient cardinality estimation protocol ART [26] to estimate $|Y|$ (i.e., $|Y_1|$) if its value is not known at first. As for $|W|$, it is initially assumed to be $\min\{|X|, |Y|\}$.
Next, we can take advantage of the filtering vector received by the reader in phase two of the $i$th ($i \ge 1$) round to estimate $|W|$ without any extra transmission expenditure. The estimation process is as follows: First, counting the actual number of “1” bits in the filtering vector, denoted as $N_1$, we know the actual false-positive probability of using this filtering vector, denoted by $P_i$, is
$$P_i = \frac{N_1}{L_{Y_{i+1}}}, \tag{2.28}$$
because an arbitrary unwanted tag has a chance of $N_1$ out of $L_{Y_{i+1}}$ to be mapped to a “1” bit, where $L_{Y_{i+1}}$ is the size of the vector. Meanwhile, we can record the number of tags in the search results before and after the $i$th round, i.e., $|X_i|$ and $|X_{i+1}|$, respectively. We have $|X_i| = |U_i| + |W|$, $|X_{i+1}| = |U_{i+1}| + |W|$, and $|U_{i+1}| \approx |U_i| \times P_i$. Therefore,
$$|W| \approx \frac{|X_{i+1}| - P_i |X_i|}{1 - P_i}. \tag{2.29}$$
For the purpose of accuracy, we may estimate jWj after every round, and obtain the
average value.
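Eliminating $|U_i|$ and $|U_{i+1}|$ from the three relations above gives $|W| \approx (|X_{i+1}| - P_i |X_i|)/(1 - P_i)$, which can be evaluated directly from quantities the reader already has. The sketch below is an illustration with synthetic numbers; the function name and the guard against an all-ones vector are assumptions added for the example.

```python
def estimate_w(frame_bits, x_before, x_after):
    """Estimate |W| from one phase-two vector and the search-result sizes around it."""
    p_i = sum(frame_bits) / len(frame_bits)      # N_1 / L_{Y_{i+1}}
    if p_i >= 1.0:
        raise ValueError("a vector of all ones filters nothing, so |W| cannot be estimated")
    return (x_after - p_i * x_before) / (1.0 - p_i)

# Synthetic check: |U_i| = 1000, |W| = 200, and half of the frame slots are busy.
frame = [1, 0] * 50
estimate = estimate_w(frame, x_before=1200, x_after=700)
print(estimate)
```

Averaging such estimates over several rounds, as suggested above, smooths out the randomness in any single round.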
Estimation may have errors. When the values of $m_i$ and $L_{Y_i}$ are computed from the estimated $|W|$ and $|Y_i|$, a direct consequence is that the actual false-positive ratio, denoted as $P_T$, can be greater than the requirement $P_{REQ}$. Fortunately, from (2.28), the reader is able to compute the actual false-positive ratio $P_i$, $1 \le i \le K$, of each filtering vector received in phase two of the ITSP. Thus, we have
$$P_T = \prod_{i=1}^{K} P_i. \tag{2.30}$$
If $P_T > P_{REQ}$, our protocol will automatically add additional filtering vectors to further filter $X_{K+1}$ until $P_T \le P_{REQ}$ (as described in Sect. 2.3.4).
ITSP cannot be supported by off-the-shelf tags that conform to the EPC Class-1 Gen-2 standard [9], whose limited hardware capability constrains the functions that can be supported. By our design, most of the ITSP protocol’s complexity is on the reader side, but tags also need to provide certain hardware support. Besides the mandatory commands of C1G2 (e.g., Query, Select, and Read), in order for a tag to execute the ITSP protocol, we need a new command defined in the set of optional commands, asking each awake tag to listen to the reader’s filtering vector, hash its ID to a certain slot of the vector for its bit value, keep silent and go to sleep if the value is zero, and respond in a hashed slot (by making a transmission to make the channel busy) if the value is one. Note that the tag does not need to store the entire filtering vector, but instead only needs to count to the slot it is hashed to, and retrieve the value (0/1) carried in that slot.
Hardware-efficient hash functions [1, 13, 22] can be found in the literature. A hash function may also be derived from the pseudo-random number generator required by the C1G2 standard. To keep the complexity of a tag’s circuit low, we only use one uniform hash function $h(\cdot)$, and use it to simulate multiple independent hash functions: In phase one of the $i$th round, we use $h(\cdot)$ and $m_i$ unique hash seeds $\{s_1, s_2, \ldots, s_{m_i}\}$ to achieve $m_i$ independent hash outputs. Thus, a tag $id$ is mapped to bit locations $(h(id \oplus s_1) \bmod L_{X_i})$, $(h(id \oplus s_2) \bmod L_{X_i})$, $\ldots$, $(h(id \oplus s_{m_i}) \bmod L_{X_i})$ in the $m_i$ filtering vectors, respectively. Each hash seed, together with its corresponding filtering vector, will be broadcast to the tags.
In phase two of the $i$th round, the reader generates a new hash seed $s'$ and sends it to the tags. Each candidate tag in $Y_{i+1}$ maps its $id$ to the slot of index $h(id \oplus s') \bmod L_{Y_{i+1}}$, and then transmits a one-bit short response to the reader in that slot.
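The seed trick can be sketched in a few lines. SHA-256 is used here purely as a software stand-in for the single on-tag hash $h(\cdot)$ (a real tag would use a lightweight hardware hash), and the tag ID, seeds, and vector size below are made-up values.

```python
import hashlib

def h(value):
    """Stand-in for the tag's single uniform hash h(.); returns a 64-bit integer."""
    digest = hashlib.sha256(value.to_bytes(12, "big")).digest()   # 96-bit input
    return int.from_bytes(digest[:8], "big")

def bit_locations(tag_id, seeds, vector_size):
    """Positions h(id XOR s_j) mod L, one per seed s_j."""
    return [h(tag_id ^ s) % vector_size for s in seeds]

tag_id = 0x3005FB63AC1F3841EC880467          # hypothetical 96-bit ID
seeds = [0x1A2B, 0x3C4D, 0x5E6F]             # hypothetical seeds s_1..s_3
locations = bit_locations(tag_id, seeds, vector_size=14_427)
print(locations)
```

Because the reader knows every wanted ID and chooses the seeds, it can compute exactly the same positions when it builds the vectors.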
2.4 ITSP over Noisy Channel
So far the ITSP assumes that the wireless channel between the RFID reader and tags is reliable. Note that the CATS protocol does not consider channel error, either. In practice, however, the wireless channel is often far from perfect for many reasons, a crucial one being interference noise from nearby equipment, such as motors, conveyors, robots, wireless LANs, and cordless phones. Therefore, this section enhances ITSP by making it robust against noise interference.
The reader transmits at a power level much higher than the tags (which after all backscatter the reader’s signals in the case of passive tags). It has been shown that the reader may transmit at a power level more than one million times higher than that of tag backscatter [14]. Hence, the forward link (reader to tag) communication is more resilient against channel noise than the reverse link (tag to reader). To provide additional assurance against noise for the forward link, we may use a CRC code for error detection. The C1G2 standard requires the tags to support the computation of CRC-16 (16-bit CRC) [9], which therefore can also be adopted by future tags modified for ITSP. Each filtering vector built by the reader can be regarded as a combination of many small segments with a fixed size of $l_S$ bits (e.g., $l_S = 80$). For each segment, the reader computes its 16-bit CRC and appends it to the end of that segment. Those segments are then concatenated and transmitted to tags. When a tag receives a filtering vector, it first finds the segment it hashes to and computes the CRC of that segment. If the calculated CRC matches the attached one, it will determine its candidacy by checking the bit in the segment which it maps to. If the CRC does not match, the tag knows that the segment has been corrupted, and it will remain a candidate tag regardless of the value of the bit which it maps to.
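The segment-plus-CRC framing and the tag-side rule can be sketched as below. This is an illustration under stated assumptions: `binascii.crc_hqx` (a CRC-CCITT routine from the Python standard library) stands in for the tag's CRC-16 and is not claimed to match the exact C1G2 variant, the vector length is assumed to be a multiple of 80 bits, and bit 0 is taken as the most significant bit of the first byte.

```python
import binascii

SEGMENT_BYTES = 10   # l_S = 80 bits of filtering-vector payload
CRC_BYTES = 2        # 16-bit CRC appended to each segment

def crc16(data):
    """CRC-CCITT via binascii.crc_hqx; a stand-in for the tag's CRC-16."""
    return binascii.crc_hqx(data, 0xFFFF)

def encode(vector):
    """Split the vector into 80-bit segments and append a CRC to each one."""
    out = bytearray()
    for i in range(0, len(vector), SEGMENT_BYTES):
        seg = vector[i:i + SEGMENT_BYTES]
        out += seg + crc16(seg).to_bytes(CRC_BYTES, "big")
    return bytes(out)

def tag_stays_candidate(frame, bit_index):
    """Tag-side rule: stay a candidate if the bit is 1 or the segment is corrupted."""
    seg_no, offset = divmod(bit_index, SEGMENT_BYTES * 8)
    start = seg_no * (SEGMENT_BYTES + CRC_BYTES)
    seg = frame[start:start + SEGMENT_BYTES]
    crc = frame[start + SEGMENT_BYTES:start + SEGMENT_BYTES + CRC_BYTES]
    if crc16(seg) != int.from_bytes(crc, "big"):
        return True                       # corrupted segment counts as all ones
    byte, bit = divmod(offset, 8)
    return bool((seg[byte] >> (7 - bit)) & 1)

vector = bytearray(20)                    # two 80-bit segments, all zeros
vector[0] = 0b00010000                    # set bit 3 only
frame = encode(bytes(vector))
corrupted = bytearray(frame)
corrupted[1] ^= 0x01                      # flip one bit inside the first segment
print(tag_stays_candidate(frame, 3), tag_stays_candidate(frame, 4))
print(tag_stays_candidate(bytes(corrupted), 4))
```

Treating a corrupted segment as all ones can only create false positives, never false negatives, so a wanted tag is never dropped because of forward-link noise.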
Suppose we let $l_S = 80$; then
$$L_{X_i} = \frac{\frac{1}{\ln 2} |X_i|}{l_S} (l_S + 16) = \frac{1.2 \, |X_i|}{\ln 2}. \tag{2.31}$$
We assume the probability that the noise corrupts each segment is $P_S$ ($P_S$ is expected to be very small as explained above). A corrupted segment can be thought of as consisting of all “1”s. Hence, the false-positive probability for a filtering vector sent by the reader, denoted by $P_{RT}$, is roughly
$$P_{RT} \approx \frac{\frac{L_{X_i}}{96} \times P_S \times l_S + \frac{L_{X_i}}{96} \times (1 - P_S) \times l_S \times P_{FV}}{\frac{L_{X_i}}{96} \times l_S} = \frac{1 + P_S}{2}. \tag{2.32}$$
We can also get
Now let us study the noise on the reverse link and its effect on the ITSP. Since the
backscatter from a tag is much weaker than the signal transmitted by the reader, the
reverse link is more likely to be impacted by noise.
First, channel noise may corrupt a would-be empty slot into a busy slot. The
original empty slot is supposed to be translated into a “0” bit in the filtering vector
by the reader; if a candidate tag is mapped to that bit, it is ruled out immediately.
However, if that slot is corrupted and becomes a busy slot, the corresponding
bit turns into “1”; a tag mapped to that bit will remain a candidate tag, thereby
increasing the false-positive probability of the filtering vector.
Second, noise may also occur during a busy slot. Although the noise and the
transmissions from tags may partially cancel each other in a slot if they happen
to reach the reader in opposite phase, it is extremely unlikely that they will exactly
eliminate each other. As long as the reader can still detect some energy, regardless of
its source (it may even come from the noise), that slot will be correctly determined
as a busy slot, and the corresponding bit in the filtering vector is set to “1” just as it is
supposed to be. However, if we take the propagation path loss, including reflection
loss, attenuation loss, and spreading loss [11], into account, there is still a chance
that a busy slot may not be detected by the reader. This may happen in a time-varying
channel where the reader may fail in receiving a tag’s signal during a deeply faded
slot when the tag transmits. We stress that this is not a problem unique to ITSP, but
all protocols that require communications from tags to readers will suffer from this problem if it happens that the reader cannot hear the tags. ITSP is not robust against this type of error. But there exist ways to alleviate this problem; for instance, each filtering vector from tags to the reader can be transmitted twice. As long as a slot is busy in one of the two transmissions, the slot is considered to be busy.
Next, we will investigate the reverse link with noise interference for ITSP under
two error models.
The random error model is characterized by a parameter called the error rate $P_{ERR}$, which means every slot independently has a probability $P_{ERR}$ of being corrupted by the noise. Under the influence of channel noise, the reader detects more busy slots as some empty slots turn into busy ones, which raises the false-positive probability of phase-two filtering vectors. Suppose the frame size of phase two in a certain round is $l$; the original number of busy slots is about $l \times P_{FV} \approx l/2$. At the reader’s side, however, the number of busy slots increases on average to $l/2 + l/2 \times P_{ERR} = \frac{(1 + P_{ERR}) l}{2}$. After encoding the slot status into a filtering vector, the false-positive probability of that filtering vector is
$$P'_{FV} = \frac{\frac{(1 + P_{ERR}) l}{2}}{l} = \frac{1 + P_{ERR}}{2}. \tag{2.36}$$
To satisfy the false-positive ratio requirement, $(P'_{FV})^K \le P_{REQ}$ should hold. Therefore, the search process of ITSP-rem contains at least
$$K = \left\lceil \frac{\ln P_{REQ}}{\ln[(1 + P_{ERR})/2]} \right\rceil \tag{2.37}$$
rounds.
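The extra rounds caused by random errors can be computed directly from this expression. The sketch below is illustrative; the function name is an assumption, and the sample error rates are the 5 % and 10 % cases used later in the evaluation.

```python
import math

def rounds_with_noise(p_req, p_err):
    """Minimum K with ((1 + p_err) / 2)**K <= p_req, as in (2.37)."""
    return math.ceil(math.log(p_req) / math.log((1 + p_err) / 2))

for p_err in (0.0, 0.05, 0.10):
    print(p_err, rounds_with_noise(0.001, p_err))
```

For $P_{REQ} = 0.001$, moving from a clean channel to a 10 % error rate adds only two rounds, since each noisy vector still filters out close to half of the non-members.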
describes the number of bursts during an interval and the number of incorrect symbols in each burst error, which differs greatly from the random error model. According to the burst error model presented in [6], both the number of bursts in an interval and the number of errors in each burst have Poisson distributions. Assume the expected number of bursts in an $l$-bit interval is $\lambda$; the probability distribution function for the number of bursts can be expressed as
$$h(x) = \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} e^{-\lambda} \delta_{xi}, \tag{2.39}$$
where $\delta_{xi}$ is the Kronecker delta function [18]. Meanwhile, if the mean value of errors due to a burst in the $l$ bits is $\mu$, then the probability distribution function of the number of errors is given by
$$g(y) = \sum_{j=0}^{\infty} \frac{\mu^j}{j!} e^{-\mu} \delta_{yj}. \tag{2.40}$$
Combining the two distributions, the probability of having $w$ errors in an $l$-bit interval is
$$P_l(w) = \frac{\mu^w}{w!} e^{-\lambda} \sum_{i=0}^{\infty} \frac{i^w}{i!} \lambda^i e^{-i\mu}. \tag{2.41}$$
In other words, for a frame with $l$ slots, the probability that $w$ slots will be corrupted by the burst noise is $P_l(w)$.
Now we evaluate the ITSP under the burst error model, denoted as ITSP-bem. Given a filtering vector with a size of $l$ bits, recall from (2.41) that the probability of having $w$ errors in this $l$-bit vector is $P_l(w)$. In this case, each original “0” bit has a probability $\frac{w}{l}$ of being corrupted by the errors and becoming a “1” bit. Consequently, the false-positive probability of the filtering vector is expected to be:
$$P'_{FV} \approx \frac{1}{2} + \frac{1}{2} \sum_{w=0}^{l} P_l(w) \times \frac{w}{l}. \tag{2.42}$$
After obtaining the value of $P'_{FV}$, the ITSP-bem can use (2.37) and (2.38) to determine the values of other necessary parameters.
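The burst-error false-positive probability can be evaluated numerically. The sketch below rests on assumptions: the names `lam` and `mu` denote the expected number of bursts per interval and the mean number of errors per burst (the symbols are not legible in the source), the infinite sum over bursts is truncated at `terms`, and the values 0.135 and 7.10 with a 96-bit interval are taken from the simulation settings, assigned to the two parameters in the order in which they are introduced.

```python
import math

def burst_error_pmf(w, lam, mu, terms=200):
    """Probability of w errors in one interval, summed in log space and truncated."""
    prefactor = w * math.log(mu) - math.lgamma(w + 1) - lam
    total = 0.0
    for i in range(terms):
        if i == 0:
            if w == 0:
                total += math.exp(prefactor)    # i**w equals 1 only when w = 0
            continue
        log_term = w * math.log(i) - math.lgamma(i + 1) + i * math.log(lam) - i * mu
        total += math.exp(prefactor + log_term)
    return total

def noisy_false_positive(l, lam, mu):
    """False-positive probability of an l-bit vector: 1/2 plus half the expected error fraction."""
    expected_fraction = sum(burst_error_pmf(w, lam, mu) * w / l for w in range(l + 1))
    return 0.5 + 0.5 * expected_fraction

lam, mu, interval = 0.135, 7.10, 96
p_fv_noisy = noisy_false_positive(interval, lam, mu)
print(round(p_fv_noisy, 4))
```

With these settings the result stays close to 1/2, which suggests that burst noise of this intensity degrades each filtering vector only slightly.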
2.5 Performance Evaluation
We compare our protocol ITSP with CATS [28], the polling protocol (Sect. 2.2.2), the optimal DFSA (dynamic frame slotted ALOHA), and a tag identification protocol with collision recovery [15], denoted as CR, which identifies 4.8 tags per slot on average, about 13 times the speed of the optimal DFSA. For ITSP and CATS, their Bloom filters (or filtering vectors) constitute most of the overall transmission overhead, while other transmission costs, such as the transmission of hash seeds, are comparatively negligible. Both protocols need to estimate the number of tags in the system, $|Y|$, as a pre-protocol step. According to the results presented in [28], the time for estimating $|Y|$ takes up less than 2 % of the total execution time of CATS. Hence, we do not count the estimation time of $|Y|$ in the simulation results because it is relatively small and does not affect fair comparison as both protocols need it. Consequently, the key metric concerning the time efficiency is the total size of Bloom filters or filtering vectors; (2.8) can then be used for calculating the search time required by CATS, and (2.27) for ITSP.
After the search process is completed, we calculate the false-positive ratio $P_{FP}$ using $P_{FP} = \frac{|W^* - W|}{|X - W|}$, where $W^*$ is the set of tags in the search result and $W$ is the actual set of wanted tags in the coverage area. $P_{FP}$ will be compared with $P_{REQ}$ to see whether the search result meets the false-positive ratio requirement.
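In a simulation this ratio is a pair of set differences. The sketch below assumes the ratio is the number of falsely reported tags divided by the number of wanted tags that are absent from the coverage area; the function name, argument names, and sample sets are invented for illustration.

```python
def false_positive_ratio(search_result, wanted_set, present_wanted):
    """Falsely reported tags divided by wanted tags that are not actually present."""
    false_positives = search_result - present_wanted
    non_members = wanted_set - present_wanted
    return len(false_positives) / len(non_members)

wanted_set = set(range(100))            # X
present_wanted = set(range(10))         # W, the wanted tags actually present
search_result = present_wanted | {50, 51}
ratio = false_positive_ratio(search_result, wanted_set, present_wanted)
print(ratio)
```

Here two of the 90 absent wanted tags are reported, so the ratio is 2/90.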
We evaluate the performance of our protocol and compare it with the CATS protocol. In the first set of simulations, we set $P_{REQ} = 0.001$, fix $|Y| = 50{,}000$, vary $|X|$ from 5000 to 640,000, and let $R_{INTS}$ = 0.1, 0.3, 0.5, 0.7, 0.9. In the second set of simulations, we set $P_{REQ} = 0.001$, fix $|X| = 10{,}000$, vary $|Y|$ from 1250 to 40,000 to investigate the scalability of ITSP over a wide range of tag populations, and let $R_{INTS}$ = 0.1, 0.3, 0.5, 0.7, 0.9. For simplicity, we assume $t_{id} = 96 t_s$ and $t_l = 137 t_s$, in which a 9-bit QueryAdjust or a 4-bit QueryRep command, a 96-bit ID and two 16-bit random numbers can be transmitted. Tables 2.4 and 2.5 show the number of $t_s$ slots needed by the protocols under different parameter settings. Each data point in these tables or other figures/tables in the rest of the section is the average of 500 independent simulation runs with ±5 % or less error at 95 % confidence level.
From the tables, we observe that when $R_{INTS}$ is small (which means $|W|$ is small), the ITSP performs much better than the CATS protocol. For example, in Table 2.4, when $R_{INTS} = 0.1$, the ITSP reduces the search time of the CATS protocol by as much as 90.0 %. As we increase $R_{INTS}$ (which implies larger $|W|$), the gap between the performance of the ITSP and the performance of the CATS gradually shrinks.
Table 2.4 Performance comparison of tag search protocols (number of $t_s$ slots). CR means a tag identification protocol with collision recovery techniques. $|Y| = 50{,}000$, $P_{REQ} = 0.001$

           ITSP ($R_{INTS}$)
$|X|$      0.1        0.3        0.5        0.7        0.9        CATS          Polling       CR
5,000      61,463     96,989     105,828    108,346    124,553    126,370       485,000       1,427,083
10,000     108,017    145,553    206,709    199,586    231,236    238,313       970,000       1,427,083
20,000     185,204    255,898    335,426    397,462    403,954    447,772       1,940,000     1,427,083
40,000     304,767    467,433    512,156    598,718    678,066    837,837       3,880,000     1,427,083
80,000     414,686    590,150    656,426    721,347    721,347    1,560,259     7,760,000     1,427,083
160,000    472,677    630,669    721,347    721,347    721,347    2,889,689     15,520,000    1,427,083
320,000    529,835    668,794    721,347    721,347    721,347    5,317,715     31,040,000    1,427,083
640,000    573,270    696,015    721,347    721,347    721,347    10,533,732    62,080,000    1,427,083
Table 2.5 Performance comparison of tag search protocols (number of $t_s$ slots). CR means a tag identification protocol with collision recovery techniques. $|X| = 10{,}000$, $P_{REQ} = 0.001$

           ITSP ($R_{INTS}$)
$|Y|$      0.1        0.3        0.5        0.7        0.9        CATS       Polling    CR
1,250      13,047     17,364     18,033     18,033     18,033     164,589    970,000    35,677
2,500      24,289     33,337     36,067     36,067     36,067     175,960    970,000    71,354
5,000      42,835     62,862     68,528     72,134     72,134     190,387    970,000    142,708
10,000     73,909     109,281    119,022    137,056    144,269    204,814    970,000    285,417
20,000     95,833     132,546    169,065    167,713    192,960    219,241    970,000    570,833
40,000     111,904    152,606    174,926    228,215    232,904    233,668    970,000    1,141,667
In particular, the CATS performs poorly when $|X| \gg |Y|$. But the ITSP can work efficiently in all cases. In addition, the ITSP is also much more efficient than the polling protocol, and any tag identification protocol with/without CR techniques. Even in the worst case, the ITSP only takes about half of the execution time of a tag identification protocol with CR techniques. (Note that the identification process actually takes much more time since the throughput of 4.8 tags per slot may not be achievable in practice and the duration of each slot is longer.) In practice, the wanted tags may be spatially distributed in many different RFID systems (e.g., warehouses in the example we use in the introduction), and thus $R_{INTS}$ can be small. The ITSP is a much better protocol for solving the tag search problem in these practical scenarios.
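Two columns of the tables can be cross-checked by hand. The sketch below is an inference from the reported numbers rather than a formula stated in the text: it assumes the CR column equals $|Y|/4.8$ identification slots of length $t_l = 137 t_s$, and that the value at which the ITSP columns saturate corresponds to $K$ rounds with $m_i = 0$ and a phase-two frame of about $|Y|/\ln 2$ slots each.

```python
import math

T_L = 137                  # t_l in units of t_s
CR_TAGS_PER_SLOT = 4.8     # average CR throughput quoted in the text

def cr_slots(y):
    """Inferred CR identification time for |Y| tags, in units of t_s."""
    return y / CR_TAGS_PER_SLOT * T_L

def itsp_no_phase_one(y, p_req):
    """Inferred ITSP time when every round has m_i = 0 and |Y_i| stays near |Y|."""
    k = math.ceil(-math.log(p_req) / math.log(2))
    return k * y / math.log(2)

print(round(cr_slots(50_000)), round(cr_slots(1_250)))
print(itsp_no_phase_one(50_000, 0.001), itsp_no_phase_one(1_250, 0.001))
```

Under these assumptions the computed values agree with the tabulated 1,427,083 and 35,677 for CR and with the saturation values 721,347 and 18,033 for ITSP to within one slot, and their ratio matches the "about half" worst-case comparison made above.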
Another performance issue we want to investigate is the relationship between the search time and $P_{REQ}$. The polling protocol, DFSA, and CR do not have false positives. Our focus will be on ITSP and CATS. We set $|X| = 5000$, $20{,}000$, or $80{,}000$, $|Y| = 50{,}000$, vary $R_{INTS}$ from 0.1 to 0.9, and vary $P_{REQ}$ from $10^{-6}$ to $10^{-2}$. Figure 2.4 compares the search times required by the CATS and the ITSP under different false-positive ratio requirements. Generally speaking, the gap between the search time required by the ITSP and the search time by the CATS keeps getting larger with the decrease of $P_{REQ}$, particularly when $R_{INTS}$ is small. For example, in Fig. 2.4c, when $P_{REQ} = 10^{-2}$ and $R_{INTS} = 0.1$, the search time by the ITSP is about one third of the time by the CATS; when we reduce $P_{REQ}$ to $10^{-6}$, the time by the ITSP becomes about one fifth of the time by the CATS. The reason is as follows: When $R_{INTS}$ is small, $|W|$ is small and most tags in $X$ and $Y$ are non-candidates. After several ITSP rounds, as many non-candidates are filtered out iteratively, the size of filtering vectors decreases exponentially and therefore subsequent ITSP rounds do not cause much extra time cost. This merit makes the ITSP particularly applicable in cases where the false-positive ratio requirement is very strict, requiring many ITSP rounds. On the contrary, the CATS protocol does not have this capability of exploiting low $R_{INTS}$ values.
Fig. 2.4 Relationship between search time and $P_{REQ}$. Parameter setting: $|Y| = 50{,}000$; (a) $|X| = 5000$, (b) $|X| = 20{,}000$, (c) $|X| = 80{,}000$. Horizontal axis: $P_{REQ}$ from $10^{-6}$ to $10^{-2}$; vertical axis: number of slots ($\times 10^5$)
Next, we examine whether the search results after execution of the ITSP will indeed
meet the requirement of PREQ . In this simulation, we set the false-positive ratio
requirement based on the following formula:
jWj
PREQ ; (2.43)
.jXj jWj/
We evaluate the performance of ITSP-rem and ITSP-bem. To simulate the error rate $P_{ERR}$ in ITSP-rem, we employ a pseudo-random number generator, which generates random real numbers uniformly in the range $[0, 1]$. If a bit in the filtering vector is “0” and the generated random number is in $[0, P_{ERR}]$, that bit is flipped to “1”. $P_S$ can be simulated in a similar way. As for the burst error in ITSP-bem, we first calculate the values of $P_l(w)$ with different $w$ for a given $l$. Then each $w$ is assigned a non-overlapping range in $[0, 1]$, whose length is equal to the value of $P_l(w)$. For each interval, we generate a random number and check which range the number falls in, thereby determining the number of errors in that interval.
We set $P_{REQ} = 0.001$, $P_S = 0.01$, and $R_{INTS} = 0.1, 0.5, 0.9$, respectively. The values of $|X|$ and $|Y|$ are the same as those in Tables 2.4 and 2.5. $l_S$ is set to 80 bits and a 16-bit CRC is appended to each segment on the forward link for integrity check. For ITSP-rem, we consider two cases with $P_{ERR}$ = 5 % and 10 %, respectively. For ITSP-bem, the prescribed parameters are set to $\lambda = 0.135$ and $\mu = 7.10$, with each interval being 96 bits [6].
Tables 2.6, 2.7, 2.8, 2.9, 2.10, and 2.11 show the number of $t_s$ slots needed under each parameter setting. The second column presents the results of ITSP when the channel is perfectly reliable. The third and fourth columns present the results of ITSP-rem with an error rate of 5 % or 10 %. The fifth column presents the results of ITSP-bem. It is not surprising that the search process under a noisy channel generally takes more time due to the use of CRC and the higher false-positive probability of filtering vectors, and the execution time of the ITSP-rem is usually longer in a channel with a higher error rate. An important positive observation is that the performance of ITSP degrades gracefully in all simulations. The increase in execution time for both ITSP-rem and ITSP-bem is modest, compared to ITSP with a perfect channel. For example, even when the error rate is 10 %, the execution time of ITSP-rem is only about 10–30 % higher than that of ITSP. This modest increase demonstrates the practicality of our protocol under a noisy channel.
We use the same parameter settings as in Sect. 2.5.3 to examine the accuracy of search results by ITSP-rem and ITSP-bem. For ITSP-rem, we set $P_{ERR}$ = 5 % or 10 %. For ITSP-bem, the required input parameters are $\lambda = 0.135$ and $\mu = 7.10$, with 96-bit intervals. Simulation results are shown in Fig. 2.7, where the error rate is given in the parentheses after ITSP-rem. Clearly, the false-positive ratio in the search results after executing ITSP-rem or ITSP-bem is always within the bound of $P_{REQ}$. These results confirm that the false-positive ratio requirement is met under a noisy channel.
Fig. 2.7 False-positive ratio after running ITSP-rem, ITSP-bem, and CATS. (a) $|X| = 5000$, $|Y| = 50{,}000$, $P_{REQ} = 10^{-2}$, (b) $|X| = 20{,}000$, $|Y| = 50{,}000$, $P_{REQ} = 10^{-3}$, (c) $|X| = 80{,}000$, $|Y| = 50{,}000$, $P_{REQ} = 10^{-4}$. Horizontal axis: $R_{INTS}$; vertical axis: false-positive ratio
[Figure: horizontal axis "error ratio" (0.01 to 0.05); vertical axis ticks from 0 to 300]
shows the error rate, which is defined as the fraction of slots in deep fading, causing
complete signal loss. ITSP-2 denotes the approach of transmitting each filtering
vector from tags to the reader twice. When a wanted tag in W is not identified, we
call it a false negative. The simulation results show that ITSP incurs significant false
negatives when the error rate becomes large. For example, when the error rate is 2 %,
the average number of false negatives is 90.7. ITSP-2 works very well in reducing
this number. When the error rate is 2 %, its number of false negatives is just 1.95.
2.6 Summary
This chapter discusses the tag search problem in large-scale RFID systems. We
present an iterative tag search protocol (ITSP) that improves time efficiency and
eliminates the limitations of prior solutions. Moreover, we extend ITSP to
work under a noisy channel. The main contributions of our work are summarized
as follows: (1) the iterative method of ITSP based on filtering vectors is very
effective in reducing the amount of information to be exchanged between tags and
the reader, and consequently saves time in the search process; (2) ITSP performs
much better than the existing solutions; (3) ITSP works well under all system
conditions, particularly when $|X| \gg |Y|$, where CATS works poorly; (4)
ITSP is improved to work effectively under a noisy channel.
References
1. Bogdanov, A., Leander, G., Paar, C., Poschmann, A., Robshaw, M.J.B., Seurin, Y.: Hash
functions and RFID tags: mind the gap. In: Proceedings of CHES, pp. 283–299 (2008)
2. Broder, A., Mitzenmacher, M.: Network applications of bloom filters: a survey. Internet Math.
1(4), 485–509 (2003)
3. Castiglione, P., Ricciato, F., Popovski, P.: Pseudo-random Aloha for inter-frame soft combining
in RFID systems. In: Proceedings of IEEE DSP, pp. 1–6 (2013)
4. Cha, J.R., Kim, J.H.: Dynamic framed slotted ALOHA algorithms using fast tag estimation
method for RFID systems. In: Proceedings of IEEE Consumer Communications and
Networking Conference (CCNC) (2006)
5. Choi, J., Lee, C.: A cross-layer optimization for a LP-based multi-reader coordination in RFID
systems. In: Proceedings of IEEE GLOBECOM, pp. 1–5 (2010)
6. Cornaglia, B., Spini, M.: New statistical model for burst error distribution. Eur. Trans.
Telecommun. 7, 267–272 (1996)
7. Dan, L., Wei, P., Wang, J., Tan, J.: TFDMA: a scheme to the RFID reader collision problem
based on graph coloration. In: Proceedings of IEEE SOLI, pp. 502–507 (2008)
8. Eom, J., Lee, T.: Accurate tag estimation for dynamic framed-slotted ALOHA in RFID
systems. In: Proceedings of IEEE Communication Letters, pp. 60–62 (2010)
9. EPC Radio-Frequency Identity Protocols Class-1 Gen-2 UHF RFID Protocol for Communications
at 860MHz-960MHz, EPCglobal. Available at http://www.epcglobalinc.org/uhfclg2
(2011)
10. Federal Standard 1037C. Available at http://www.its.bldrdoc.gov/fs-1037/fs-1037c.htm (1996)
11. Fletcher, R., Marti, U.P., Redemske, R.: Study of UHF RFID signal propagation through
complex media. In: IEEE Antennas and Propagation Society International Symposium, vol.
1B, pp. 747–750 (2005)
12. Fyhn, K., Jacobsen, R.M., Popovski, P., Scaglione, A., Larsen, T.: Multipacket reception of
passive UHF RFID tags: a communication theoretic approach. IEEE Trans. Signal Process.
59(9), 4225–4237 (2011)
13. Guo, J., Peyrin, T., Poschmann, A.: The PHOTON family of lightweight Hash functions. In:
Proceedings of CRYPTO, pp. 222–239 (2011)
14. NIST: RFID Communication and Interference. White paper, Grand Prix Application Series
(2007)
15. Kaitovic, J., Rupp, M.: Improved physical layer collision recovery receivers for RFID readers.
In: Proceedings of IEEE RFID, pp. 103–109 (2014)
16. Kaitovic, J., Langwieser, R., Rupp, M.: A smart collision recovery receiver for RFIDs.
EURASIP J. Embed. Syst. 2013, 1–19 (2013)
17. Kang, Y., Kim, M., Lee, H.: A hierarchical structure based reader anti-collision protocol for
dense RFID reader networks. In: Proceedings of ICACT, pp. 164–167 (2011)
18. Kronecker delta. Available at http://en.wikipedia.org/wiki/Kronecker_delta
19. Lee, S., Joo, S., Lee, C.: An enhanced dynamic framed slotted ALOHA algorithm for RFID
tag identification. In: Proceedings of IEEE MobiQuitous (2005)
20. Nguyen, C.T., Hayashi, K., Kaneko, M., Popovski, P., Sakai, H.: Probabilistic dynamic framed
slotted ALOHA for RFID tag identification. Wirel. Pers. Commun. 71, 2947–2963 (2013)
21. Onat, I., Miri, A.: A tag count estimation algorithm for dynamic framed ALOHA based RFID
MAC protocols. In: Proceedings of IEEE ICC, pp. 1–5 (2011)
22. O’Neill, M.: Low-cost SHA-1 hash function architecture for RFID tags. In: Proceedings of
RFIDSec (2008)
23. Qiao, Y., Li, T., Chen, S.: One memory access bloom filters and their generalization. In:
Proceedings of IEEE INFOCOM, pp. 1745–1753 (2011)
24. Ricciato, F., Castiglione, P.: Pseudo-random ALOHA for enhanced collision-recovery in RFID.
IEEE Commun. Lett. 17(3), 608–611 (2013)
25. Schoute, F.C.: Dynamic frame length ALOHA. IEEE Trans. Commun. 31, 565–568 (1983)
26. Shahzad, M., Liu, A.: Every bit counts - fast and scalable RFID estimation. In: Proceedings of
ACM Mobicom (2012)
27. Stefanovic, C., Popovski, P.: ALOHA random access that operates as a rateless code. IEEE
Trans. Commun. 61(11), 4653–4662 (2013)
28. Zheng, Y., Li, M.: Fast tag searching protocol for large-scale RFID systems. IEEE/ACM Trans.
Networking 21(3), 924–934 (2012)
Chapter 3
Lightweight Anonymous RFID Authentication
Consider a hierarchical distributed RFID system as shown in Fig. 3.1. Each tag
is pre-installed with some keys for authentication. The readers are deployed at
chosen locations, responsible for authenticating tags entering their coverage areas.
In addition, the readers at each location are connected to a backend server, which
supplements them with more storage and computation resources. All backend
servers are further connected to the central server, where every tag's keys are stored.
Any authorized backend server can fetch the tags' keys from the central server. Since
the keys of each tag are only stored at the central server, they are synchronized from
the view of different backend servers. Moreover, the high-speed links connecting the
central server, backend servers, and readers make the latency of transmitting small
authentication data negligible. Therefore, a reader, its connected backend server, and
the central server can be thought of as a single entity, and the three terms will be used
interchangeably.

[Fig. 3.1 (diagram): central server at the top, backend servers below it, readers and tags at the bottom]
In this chapter, we focus on low-cost RFID tags, particularly passive backscatter
tags that are ubiquitously used nowadays. The simplicity of these tags contributes
to their low prices, which in turn restricts their computation, communication, and
storage capabilities. In contrast, the readers, which are not needed in as large a
quantity as tags, can have much richer resources. Moreover, the backend server can
provide the readers with extra resources when necessary. The communication between a
reader and a tag works in the request-and-response mode. The reader initiates the
communication by sending a request. Upon receiving the request, the tag makes
an appropriate transmission in response. We divide the transmissions between the
readers and tags into two types: (1) Invariant transmissions contain content that
is invariant between any tag and any reader, such as the beacon transmission from a
reader which informs the incoming tag of what to do next. (2) Variant transmissions
contain content that may vary for different tags or for the same tag at different times,
such as the data exchanged for anonymous authentication.
identify the tag carriers. However, such unauthorized readers have no access to the
backend servers or the central server since the servers will authenticate the readers
before granting access permissions. In the sequel, a reader without further notation
means an authorized one by default. Moreover, we assume that the adversary
may compromise some tags and obtain their keys, but it cannot compromise any
authorized readers.
Anonymous Model The anonymous model requires that all variant transmissions
must be indistinguishable by the adversary, meaning that (1) any variant transmission
in the protocol should not carry a fixed value that is unchanged across multiple
authentications, and (2) the transmission content should appear totally random and
unrelated across different authentications to any eavesdropper that captures the
transmissions. Therefore, no adversary will have a non-negligible advantage in
successfully guessing the next variant transmission of a tag based on the previous
transmissions [9].
Notations used in the chapter are given in Table 3.1 for quick reference.
The hash-lock [20] leverages random hash values for anonymous authentication.
After receiving an authentication request from a reader, a tag sends back
$(r, id \oplus f_k(r))$, where $r$ is a random number, $id$ is the tag's ID, $k$ is a pre-shared secret
between the tag and the reader, and $\{f_n\}_{n \in N}$ is a pseudo-random number function
ensemble. The reader exhaustively searches its database for a tag whose ID and key
can produce a match with the received data. The hash-lock protocol has a serious
efficiency problem: the reader needs to perform $O(n)$ hash computations
online per authentication, where $n$ is the number of tags in the system. Some variants
[1, 2, 14, 19] of the hash-lock scheme try to improve the search efficiency, but they have
issues of their own. The OSK protocol [14] uses a hash chain for anonymous authentication. The
OSK/AO protocol [1, 2] leverages the time-memory tradeoff to reduce the search
complexity to $O(n^{2/3})$ (still too large) at the cost of $O(n^{2/3})$ units of memory. However,
neither OSK nor OSK/AO can guarantee anonymity under denial-of-service (DoS)
attack [9]. The YA-TRAP protocol [19] makes use of monotonically increasing
timestamps to achieve anonymous authentication. YA-TRAP is also susceptible to
DoS attack, and a tag can only be authenticated once in each time unit. The DoS
attack in OSK/AO and YA-TRAP is by nature a desynchronization attack, which
tricks a tag into updating its keys unnecessarily and makes it fail to be authenticated
by an authorized reader later.
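The exhaustive search that makes hash-lock cost O(n) per authentication can be illustrated with a minimal Python sketch. It is not taken from [20]: HMAC-SHA256 is used here only as a stand-in for the pseudo-random function f_k, and the names `prf`, `tag_response`, and `reader_identify` are hypothetical.

```python
import hashlib
import hmac
import os

def prf(key, r, out_len):
    # Stand-in for f_k(r): HMAC-SHA256 truncated to out_len bytes (an assumption).
    return hmac.new(key, r, hashlib.sha256).digest()[:out_len]

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def tag_response(tag_id, key):
    # The tag picks a fresh random r and returns (r, id XOR f_k(r)).
    r = os.urandom(8)
    return r, xor_bytes(tag_id, prf(key, r, len(tag_id)))

def reader_identify(db, r, masked):
    # Exhaustive search: one PRF evaluation per stored tag, i.e. O(n) work.
    for tag_id, key in db.items():
        if xor_bytes(tag_id, prf(key, r, len(tag_id))) == masked:
            return tag_id
    return None
```

Because r changes on every run, the masked value differs between authentications, but the reader has no shortcut: it must try every (id, key) pair until one matches.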
The LAST protocol was designed based on a weak privacy model [13]. Key
identifiers are used to help the reader identify the tags quickly. After each
authentication, the reader uploads a new $\langle identifier, key \rangle$ pair to the tag. LAST only
requires the reader and tag to compute $O(1)$ hashes per authentication, but the overhead
for the reader to search for a given key identifier is not considered. Moreover, since
the key identifier is only updated after a successful authentication, the tag keeps
sending the same key identifier between two consecutive successful authentications.
Therefore, LAST is not anonymous in the strict sense. In addition, the process of
uploading a new $\langle identifier, key \rangle$ pair to the tag after each authentication incurs extra
communication overhead.
Tree-based protocols organize the shared keys in a balanced tree to reduce the
complexity of identifying a tag to $O(\log n)$. However, the tree-based protocols
generally require each tag to store $O(\log n)$ keys, whereas non-tree-based
protocols require only $O(1)$.
In Dimitriou's protocol [6], the non-leaf nodes of the tree store auxiliary keys
that can be used to infer the path leading to leaf nodes that store the authentication
keys. For each authentication, the computation overhead for both the reader and the
tag is $O(\log n)$, and the tag needs to transmit $O(\log n)$ hash values. This protocol is
vulnerable to the compromising attack since different tags may share auxiliary keys
[11, 12].
The ECNP protocol [10] leverages a cryptographic encoding technique to compress
the authentication data transmitted by tags. ECNP can reduce the computation
overhead of the reader and the transmission overhead of the tag by multifold
compared with Dimitriou's protocol [6], but they remain $O(\log n)$ due to the use
of a tree structure. Moreover, ECNP is not resistant against the compromising attack
since the children of one node in the tree share the same group keys.
The ACTION protocol [12] was designed to be resistant against the compromising
attack. It adopts a sparse tree architecture to make the keys of each tag
independent from one another. In ACTION, each tag is randomly assigned
a path key, which is further segmented into link indices to guide the reader to walk
down the tree towards the leaf node that carries the secret key $k$ of the tag. For
each authentication, a tag needs to compute and transmit $O(\log n)$ hashes and the
reader needs to perform $O(\log n)$ hashes to locate the shared key. The key problem
of ACTION is that the size of link indices is too small after segmentation (e.g., 4
bits), rendering them easy to guess.
3.3.1 Motivation
Most prior work, if not all, employs cryptographic hash functions to randomize
authentication data for the purpose of keeping anonymity. Implementing a typical
cryptographic hash function such as MD4, MD5, and SHA-1 requires at least 7K
logic gates [3]. However, widely used passive tags only have 7K–15K logic gates,
of which 2K–5K are reserved for security purposes [16]. The hardware constraint
Table 3.2 Key table for the preliminary design
Tag      Token array                                     Token index
$t_1$    [$tk_{1,1}, tk_{1,2}, \ldots, tk_{1,m}$]        $pt_1$
$t_2$    [$tk_{2,1}, tk_{2,2}, \ldots, tk_{2,m}$]        $pt_2$
...      ...                                             ...
$t_n$    [$tk_{n,1}, tk_{n,2}, \ldots, tk_{n,m}$]        $pt_n$
To avoid the leakage of the tag’s identity, the tokens used for authentication
should look random. In addition, each token can be used only once. Hence, the
tag must be replenished with new tokens after $m/2$ mutual authentications, e.g.,
purchasing new tokens from an authorized branch. Therefore, the tag should store as
many tokens as possible to reduce the inconvenience caused by token replenishment.
A low-cost tag, however, only has a tiny memory. For example, a passive UHF tag
generally has a 512-bit user memory for storing user-specific data (tokens in our
case). Some high-end tags with large memory [8, 15] are prohibitively expensive to
be applied in large quantities. As an example, x Sky-ID tags [8] with an 8 KB user
memory cost $25 each. We will introduce the security issues of this design in the
next section.
3.4.1 Motivation
Given the memory constraint, each tag can only store a few tokens. Frequent
token replenishment brings about unacceptable inconvenience in practice. Hence,
we want to invent a way to enable dynamic token generation from the few pre-installed
tokens. In addition, the time for the reader to search for a particular token
is $O(n)$ in the preliminary design. We desire to reduce this overhead to $O(1)$.
More importantly, we hope all advantages of the preliminary design, including no
requirement of cryptographic hash functions, low computation overhead for the tag,
and low communication overhead for both the reader and the tag, can be retained in
our new design.
3.4.2 Overview
Let an arbitrary tag $t$ in the system be pre-installed with $u$ base tokens, denoted
by [$bt_1, bt_2, \ldots, bt_u$], each $a$ bits long. These base tokens can be used
to derive dynamic tokens for authentication. In addition, we introduce another
type of auxiliary keys called base indicators to generate indicators that support
the derivation of dynamic tokens. Suppose $t$ stores $v$ base indicators denoted by
[$bi_1, bi_2, \ldots, bi_v$], each $b$ bits long. Let $tk$ represent the current $a$-bit token,
and $ic$ be the current $b$-bit indicator. All the base tokens, base indicators, token,
and indicator are also stored at the central server. Our idea is to let the reader
and the tag independently generate the same random tokens by following the
instruction encoded in the indicator. TAP consists of three phases: the initialization
phase, the authentication phase, and the updating phase, which will be elaborated one by
one.
The central server stores all tags' keys in a key table, denoted by $KT$. As shown in
Table 3.3, each entry is indexed by the tag index, supporting random access in $O(1)$
time. With the tag index $idx$, the keys of $t$ can be found at $KT[idx]$.
When $t$ joins the system, the reader randomly generates an array of $u$ base tokens
[$bt_1, bt_2, \ldots, bt_u$], an array of $v$ base indicators [$bi_1, bi_2, \ldots, bi_v$], a token $tk$ and
an indicator $ic$ for $t$. After that, the reader requests the central server to store those
keys of $t$ in the database. The central server inserts the keys to the first empty entry
in $KT$. The search process for an empty entry can be sped up by maintaining a small
table recording all empty entries in $KT$ (e.g., due to tags' departure). If $KT$ is fully
occupied, the central server doubles its size to accommodate more tags.
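The key-table bookkeeping described above (index-based access, a side table of empty entries, doubling when full) can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `KeyTable` and its methods are hypothetical, and a min-heap is one possible way to keep track of the first empty entry.

```python
import heapq

class KeyTable:
    """Key table KT: access by one-based tag index, plus a heap of empty entries."""

    def __init__(self, capacity=4):
        self.entries = [None] * capacity            # entries[idx - 1] holds the keys of tag idx
        self.free = list(range(1, capacity + 1))    # a sorted list is already a valid min-heap

    def insert(self, keys):
        if not self.free:                           # KT fully occupied: double its size
            old = len(self.entries)
            self.entries.extend([None] * old)
            self.free = list(range(old + 1, 2 * old + 1))
        idx = heapq.heappop(self.free)              # smallest (first) empty entry
        self.entries[idx - 1] = keys
        return idx

    def remove(self, idx):
        # Called when a tag departs; its entry becomes reusable.
        self.entries[idx - 1] = None
        heapq.heappush(self.free, idx)

    def get(self, idx):
        return self.entries[idx - 1]                # O(1) lookup by tag index
```

With this layout a lookup by tag index stays O(1), while finding the first empty entry costs O(log n) instead of a linear scan.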
Table 3.3 Key table stored by the central server for TAP
Tag index   Tag      Base token array                         Token     Base indicator array                     Indicator
1           $t_1$    [$bt_{1,1}, \ldots, bt_{1,u}$]           $tk_1$    [$bi_{1,1}, \ldots, bi_{1,v}$]           $ic_1$
2           $t_2$    [$bt_{2,1}, \ldots, bt_{2,u}$]           $tk_2$    [$bi_{2,1}, \ldots, bi_{2,v}$]           $ic_2$
...         ...      ...                                      ...       ...                                      ...
n           $t_n$    [$bt_{n,1}, \ldots, bt_{n,u}$]           $tk_n$    [$bi_{n,1}, \ldots, bi_{n,v}$]           $ic_n$
Fig. 3.3 A hash table used by TAP. The tokens of the five tags $t_1, t_2, t_3, t_4, t_5$ are
$tk_1, tk_2, tk_3, tk_4, tk_5$, respectively. Each token is randomly mapped to a slot in the hash
table, where the corresponding tag index is stored (slot contents in the example: 2, 0, 1, 0, 0, 4, 5, 0, 3, 0)
To identify a tag based on its token in $O(1)$ time, the central server maintains a
hash table $HT$, mapping the token of each tag to its tag index. Let $HT$ consist of
$l$ slots. At first, every slot in $HT$ is initialized to zero. After $t$ joins the system, the
reader computes the hash value $h(tk)$, where the hash function $h(\cdot)$ yields random
values in $[1, l]$, and then puts the tag index $idx$ of $t$ in the $h(tk)$th slot of the hash table,
i.e., $HT[h(tk)] = idx$ (the potential problem of hash collisions will be addressed
shortly). Figure 3.3 presents an illustration of the hash table built for the tokens of
five tags.
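The mapping from a token to a tag index can be sketched as below. The hash function is not specified in the text, so SHA-256 reduced modulo l serves as a hypothetical stand-in for h, and hash collisions are deliberately ignored at this stage, as in the description above.

```python
import hashlib

def h(tk, l):
    # Hypothetical stand-in for h(.): maps a token to a slot number in [1, l].
    digest = hashlib.sha256(tk.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:8], "big") % l + 1

l = 10
HT = [0] * (l + 1)        # index 0 is unused so that slots are numbered 1..l; 0 means empty

def register(tk, idx):
    HT[h(tk, l)] = idx    # HT[h(tk)] = idx; collisions are not handled in this sketch

def lookup(tk):
    return HT[h(tk, l)]   # 0 means no tag is registered under this token
```

A lookup is a single hash evaluation and one array access, which is where the O(1) identification time comes from.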
To guarantee anonymity, the tokens exchanged between the reader and the tag
should have good randomness. Therefore, the reader (central server) and the tag
need to synchronously update their shared keys after the current token is used. We
stress that the tag will update its keys once it uses its current token. Therefore,
the same token will never be used in two consecutive authentications, which
fundamentally differs from LAST [13] where the tag only updates its key identifier
after a successful authentication (which breaks the anonymity).
Fig. 3.5 Left plot: Generating a new token using the base tokens and the selector. Right plot:
Generating a new indicator using the base indicators and the selector (in both plots the selector
bits 1, 0, 1, 0, 1, 0 select the first, third, and fifth of six base tokens or base indicators)
The tag $t$ relies on its current indicator $ic$ to update its keys. Figure 3.4 shows the
structure of an indicator, which includes two parts: the low-order $(b - 2)$ bits form
a selector, indicating which base tokens/base indicators should be used to derive
the new token/indicator, while the high-order two bits encode the update pattern.
When the updating phase begins, $t$ calculates a new token from the base tokens
according to the selector. Each of the low-order $u$ bits ($u \le b - 2$) in the selector
encodes a choice of the corresponding base token: “0” means not selected, while
“1” means selected. All selected base tokens are XORed to compute the
new token. Therefore,

$$ tk = \bigoplus_{j=1}^{u} ic[j] \, bt_j, \tag{3.1} $$

where $ic[j]$ is the $j$th bit in $ic$ (assume one-based indexes are used) and $\oplus$ is the XOR
operator. The left plot in Fig. 3.5 gives an example of token update, where $bt_1$, $bt_3$,
and $bt_5$ among the six base tokens happen to be selected. Similarly, $t$ derives a new
indicator from the base indicators as follows:

$$ ic = \bigoplus_{j=1}^{v} ic[j] \, bi_j. \tag{3.2} $$
At the server's side, the same new token and new indicator can be generated because
it shares the same keys with the tag. In addition, the server also needs to update the
hash table. First, the server sets $HT[h(tk)]$ (for the old token) to 0, and after generating
the new token, it sets $HT[h(tk)] = idx$.
After updating the token and the indicator, the central server and the tag need to
further update the selected base tokens and base indicators. The update process for
any selected base token or base indicator includes two steps: a one-bit left circular
shift, and a bit flip following the particular 2-bit update pattern:
1. Pattern $(00)_2$: no flip is performed;
2. Pattern $(01)_2$: flip the $j$th bit if $j \equiv 0 \pmod 3$;
3. Pattern $(10)_2$: flip the $j$th bit if $j \equiv 1 \pmod 3$;
4. Pattern $(11)_2$: flip the $j$th bit if $j \equiv 2 \pmod 3$.
Obviously, the $i$th and $j$th bits can be flipped together if and only if $i \equiv j \pmod 3$.
The rationale of the updating scheme is that if the parameters $a$ and $b$ are set
properly, any two bits in a base token or a base indicator have a chance to not be
flipped together, thereby reducing their mutual dependence. We will provide the
formal proof shortly. We emphasize that all keys are only stored at the central server
rather than at every single reader. Hence, the update process of a tag's keys triggered
by one reader is transparent to other readers (a tag carrier can only appear at one
location at a time).
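The updating phase can be summarized in a short Python sketch. Several details are assumptions rather than statements from the text: bit j of the indicator is counted one-based from the least significant bit, the update pattern is read from the two high-order bits of the indicator in use before the update, and the function names are hypothetical.

```python
def rotl1(x, width):
    # One-bit left circular shift within `width` bits.
    return ((x << 1) | (x >> (width - 1))) & ((1 << width) - 1)

def flip_mask(pattern, width):
    # Patterns 01, 10, 11 flip bits j (one-based) with j mod 3 == 0, 1, 2; pattern 00 flips none.
    if pattern == 0:
        return 0
    residue = pattern - 1
    return sum(1 << (j - 1) for j in range(1, width + 1) if j % 3 == residue)

def update(bt, bi, ic, a, b):
    """Return (new_tk, new_ic); the selected entries of bt and bi are updated in place."""
    def bit(j):
        return (ic >> (j - 1)) & 1        # ic[j], one-based from the LSB (assumption)

    pattern = ic >> (b - 2)               # two high-order bits of the current indicator
    new_tk = 0
    for j in range(1, len(bt) + 1):
        if bit(j):
            new_tk ^= bt[j - 1]           # Eq. (3.1): XOR of the selected base tokens
    new_ic = 0
    for j in range(1, len(bi) + 1):
        if bit(j):
            new_ic ^= bi[j - 1]           # Eq. (3.2): XOR of the selected base indicators
    for keys, width in ((bt, a), (bi, b)):
        mask = flip_mask(pattern, width)
        for j in range(1, len(keys) + 1):
            if bit(j):                    # only selected entries are shifted and flipped
                keys[j - 1] = rotl1(keys[j - 1], width) ^ mask
    return new_tk, new_ic
```

Running `update` on the tag's copy and on the server's copy of the same keys yields identical tokens and indicators, which is what keeps the two sides synchronized without any hash computation on the tag.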
which converges to $\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}$. Therefore, $X$ becomes 0 or 1 with equal probability.
Now let us further investigate two arbitrary bits in a base token, and we have the
following lemma:
Lemma 2. If the update pattern in the indicator is random, two arbitrary bits in a
base token are independent under our update scheme.
Proof. Consider two arbitrary bits, denoted by random variables $X$ and $Y$, in base
token $bt_j$. Suppose $X$ and $Y$ are initially located at the $p$th bit and $q$th bit of $bt_j$
($1 \le p < q \le a$), respectively. The transition matrices when $X$ and $Y$ cannot be
flipped together and can be flipped together are

$$ P_2 = \begin{pmatrix} \frac{1}{2} & \frac{1}{4} & \frac{1}{4} & 0 \\ \frac{1}{4} & \frac{1}{2} & 0 & \frac{1}{4} \\ \frac{1}{4} & 0 & \frac{1}{2} & \frac{1}{4} \\ 0 & \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \end{pmatrix}, \quad \text{and} \quad P_3 = \begin{pmatrix} \frac{1}{2} & 0 & 0 & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & 0 & 0 & \frac{1}{2} \end{pmatrix}, $$

each entry converging to $\frac{1}{4}$. Therefore, two arbitrary bits in a base token are pairwise
independent.
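As an illustrative numeric check (not part of the proof), the sketch below raises the first transition matrix to a high power, taking P2 to be the 4 x 4 matrix with 1/2 on the diagonal, 1/4 for transitions that flip exactly one of the two bits, and 0 for transitions that flip both, as read from the text above. It covers only the case in which X and Y are never flipped together.

```python
from fractions import Fraction as F

# States are the joint values (X, Y) in the order 00, 01, 10, 11.
P2 = [[F(1, 2), F(1, 4), F(1, 4), F(0)],
      [F(1, 4), F(1, 2), F(0),    F(1, 4)],
      [F(1, 4), F(0),    F(1, 2), F(1, 4)],
      [F(0),    F(1, 4), F(1, 4), F(1, 2)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def power(P, n):
    R = [[F(int(i == j)) for j in range(4)] for i in range(4)]   # identity matrix
    for _ in range(n):
        R = matmul(R, P)
    return R

P2_40 = power(P2, 40)
# Largest deviation of any entry from 1/4 after 40 updates.
max_dev = max(abs(float(x) - 0.25) for row in P2_40 for x in row)
```

After a few dozen updates every entry is numerically indistinguishable from 1/4, so all four joint values of (X, Y) become equally likely in this case.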
With the two lemmas above, we can prove the following theorem regarding the
randomness of the derived tokens:
Theorem 2. If the indicator is random, any bit in the derived token has an equal
probability to be 1 or 0, and two arbitrary bits in the derived token are independent
using our update scheme.
Proof. Consider the $i$th bit of the derived token $tk$, denoted by $tk[i]$ ($1 \le i \le a$).
We know $tk[i] = \bigoplus_{j=1}^{u} ic[j] \, bt_j[i]$. Let $N_0$ be the random variable of the number of
base tokens whose $i$th bit is 0, and $N_1$ be the random variable of the number of base
tokens whose $i$th bit is 1, subject to $N_0 \ge 0$, $N_1 \ge 0$ and $N_0 + N_1 = u$. According
to Lemma 1 and the independence of different base tokens, $N_0$ follows the binomial
distribution $B(u, 0.5)$, and $P(N_0 = x) = \binom{u}{x} \left(\frac{1}{2}\right)^u$, where $0 \le x \le u$. To calculate
$tk[i]$, we need to consider two possible cases:
Case 1: $N_0 = u$, namely there is no 1 in those $u$ bits. In this case, $tk[i]$ must be 0.
Case 2: $0 \le N_0 < u$. In this case, $tk[i]$ can be 0 or 1. If $tk[i] = 0$, it implies that
an even number of base tokens whose $i$th bit is 1 are chosen, and the conditional
probability is
$$ P(tk[i] = 0 \mid 0 \le N_0 < u) = \frac{2^{N_0} \sum_{x=0}^{\lceil N_1/2 \rceil} \binom{N_1}{2x} - 1}{2^u - 1} = \frac{2^{u-1} - 1}{2^u - 1}. \tag{3.3} $$
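The closed form in Eq. (3.3) can be checked by brute force for a small u. The sketch below enumerates all selectors over u base tokens and counts those that yield tk[i] = 0. It assumes that the all-zero selector is excluded, which is how the denominator 2^u - 1 is interpreted here; the function name is hypothetical.

```python
from itertools import product

def count_zero_outcomes(bits):
    """bits[j] is the i-th bit of base token j; returns (#selectors giving tk[i] = 0, #selectors)."""
    u = len(bits)
    zero = total = 0
    for sel in product((0, 1), repeat=u):
        if not any(sel):
            continue                      # all-zero selector excluded (assumption)
        total += 1
        parity = sum(s & b for s, b in zip(sel, bits)) % 2
        zero += (parity == 0)             # an even number of selected 1-bits gives tk[i] = 0
    return zero, total

# Case 2 with u = 6: three base tokens have a 1 in bit i (N1 = 3, N0 = 3).
zero, total = count_zero_outcomes([1, 0, 1, 1, 0, 0])
```

For u = 6 the count is 31 out of 63, which equals (2^(u-1) - 1) / (2^u - 1), and the same count is obtained for any bit pattern that contains at least one 1.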
3.4.7 Discussion
In this section, we present our third protocol, called Enhanced dynamic Token-based
Authentication Protocol (ETAP), to address the issues of TAP.
Since desynchronization attack and replay attack can be carried out simultaneously,
we tackle them together. Our objective is twofold: First, the valid tag can still
be successfully authenticated by a legitimate reader after some desynchronization
attacks; Second, even if the adversary has captured some tokens from the valid tag,
it cannot use those tokens to authenticate itself.
Table 3.4 Key table stored by the central server for ETAP
Tag index   Tag      Base token array                   Token array                                    Base indicator array               Indicator
1           $t_1$    [$bt_{1,1}, \ldots, bt_{1,u}$]     [$tk_{1,1}, tk_{1,2}, \ldots, tk_{1,k}$]       [$bi_{1,1}, \ldots, bi_{1,v}$]     $ic_1$
2           $t_2$    [$bt_{2,1}, \ldots, bt_{2,u}$]     [$tk_{2,1}, tk_{2,2}, \ldots, tk_{2,k}$]       [$bi_{2,1}, \ldots, bi_{2,v}$]     $ic_2$
...         ...      ...                                ...                                            ...                                ...
n           $t_n$    [$bt_{n,1}, \ldots, bt_{n,u}$]     [$tk_{n,1}, tk_{n,2}, \ldots, tk_{n,k}$]       [$bi_{n,1}, \ldots, bi_{n,v}$]     $ic_n$
[Figure: updated token array $tk^5, tk^6, tk^7, tk^8$; updated hash table slots 0, idx, 0, idx, 0, idx, idx]
1
An exponentially increasing timeout period can be enforced between unsuccessful authentications
to prevent an adversary from depleting the k tokens too quickly.
we adopt the two-step verification as illustrated in Fig. 3.7. In step 3, the reader
includes a b-bit random nonce in its message, and challenges the tag to send another
token. After the tag authenticates the reader, it updates its indicator by XORing
the indicator with the received nonce (so does the reader), which contributes to
randomizing the indicator as well. After that, the tag derives a new token based on
the updated indicator, and sends it to the reader for the second verification. Since
the adversary does not know the base tokens and the indicator, it cannot derive
the correct token to pass the second verification, rendering replay attack infeasible.
After the successful mutual authentication, the reader generates four new tokens to
replenish the token array. In addition, the reader updates $HT$ by setting the slots
corresponding to the old tokens to 0, and setting the slots corresponding to the new
tokens to $idx$. Note that the token replenishment is performed offline by the central
server, and is therefore not a performance concern.
Suppose the central server pre-computes $k$ tokens for each tag. Let $n_t = nk$ be the
number of total tokens, and $l$ be the size of the hash table. A slot in the hash table is
called an empty slot, a singleton slot, and a collision slot, respectively, if zero, one
and multiple tokens are mapped to it. When every token is mapped to a singleton
slot (no collision happens), the utilization ratio of the hash table is defined as

$$ \rho = \frac{nk}{l} = \frac{n_t}{l}, \tag{3.5} $$

which is used as the performance metric for evaluating memory efficiency.
One candidate approach for reducing hash collisions is to use a large hash table.
However, $l$ must be set prohibitively large for the purpose of totally eliminating
hash collisions, resulting in low memory utilization (small $\rho$). Figure 3.8 shows the
memory utilization ratio of the hash table when different numbers of tokens need to
be stored and a single hash function is used. We can see that the utilization ratio is very
low. Moreover, the value of $\rho$ drops dramatically when more tokens are mapped to
the hash table. For example, $\rho = 0.0023$ when $n_t = 5000$, meaning more than 99 %
of slots in the hash table are wasted.

[Fig. 3.8: utilization ratio $\rho$ (0 to 0.04) versus number of tokens $n_t$ (1000 to 5000)]
We observe that two different tokens causing a hash collision under one hash
function probably will not have a collision under another hash function. Therefore,
using multiple hash functions provides an alternative way for resolving hash
collisions.
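When a single hash function is used, the probability $p_s$ that an arbitrary slot is a
singleton slot is

$$ p_s = \binom{n_t}{1} \frac{1}{l} \left(1 - \frac{1}{l}\right)^{n_t - 1} \approx \frac{n_t}{l} e^{-\frac{n_t - 1}{l}} \approx \frac{n_t}{l} e^{-\frac{n_t}{l}}. \tag{3.6} $$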
Fig. 3.9 An example of using multiple hash functions to reduce hash collisions, where the left
plot uses one hash function, and the right plot uses three hash functions. $h$, $h_1$, $h_2$, and $h_3$ are hash
functions. “e” means an empty slot, “s” means a singleton slot, and “c” means a collision slot

[Fig. 3.10: hash table slots holding tag indexes 2, 4, 5, 1, 3 together with the token values $tk_2, tk_4, tk_5, tk_1, tk_3$]
starting from $i = 1$. If the $h_i(tk)$th slot in the hash table has not been occupied by
any token, $tk$ can be added to this slot immediately. Each slot needs to store both the
token value and the tag index associated with the token to facilitate identification
of which token indeed uses a certain slot. As in the example in Fig. 3.10, a token $tk_2$
of $t_2$ is mapped by the hash function $h_3(\cdot)$ to the first slot (a singleton slot) of $HT$.
Hence, $HT[0]$ records tag index 2 and the token value $tk_2$. Instead of implementing
$r$ independent hash functions, we can use one master hash function $H$ and a set $S$ of
random seeds, and let
where $\oplus$ is the XOR operator. When the reader receives a token $tk$ from a tag for
authentication, it computes $h_i(tk)$ ($1 \le i \le r$) until it finds that the token value in
slot $HT[h_i(tk)]$ matches $tk$, where it can obtain the correct tag index of that tag. If no
matching token is found, the tag fails authentication.
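The insertion and lookup procedure with r hash functions can be sketched in Python as below. The way each h_i is derived from the master hash is an assumption made for this sketch (the token is XORed with the i-th seed before hashing with SHA-256), and the class name `MultiHashTable` is hypothetical.

```python
import hashlib

class MultiHashTable:
    def __init__(self, l, seeds):
        self.l = l
        self.seeds = seeds                  # hypothetical seed set S, one seed per h_i
        self.slots = [None] * l             # each slot stores (token, tag_index) or None

    def _h(self, i, tk):
        # Assumed derivation of h_i from a master hash H: hash the token XORed with seed i.
        data = (tk ^ self.seeds[i]).to_bytes(8, "big")
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % self.l

    def insert(self, tk, idx):
        for i in range(len(self.seeds)):    # try h_1, h_2, ..., h_r in order
            s = self._h(i, tk)
            if self.slots[s] is None:       # first unoccupied candidate slot wins
                self.slots[s] = (tk, idx)
                return True
        return False                        # all r candidate slots are occupied

    def lookup(self, tk):
        for i in range(len(self.seeds)):
            entry = self.slots[self._h(i, tk)]
            if entry is not None and entry[0] == tk:
                return entry[1]             # stored token matches: tag index found
        return None                         # no matching token: authentication fails
```

Storing the token value next to the tag index is what allows the lookup to tell which of the r candidate slots actually belongs to the received token.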
We will shortly evaluate the effectiveness of our scheme for resolving hash
collisions caused by different tokens through simulations. The issue of token
collisions, however, may not be solved by this approach. If the two identical
tokens happen to be associated with the same tag, it will not cause any problem.
But if they are associated with different tags, the reader cannot uniquely identify
the tag from the received token. Therefore, such collided tokens cannot be used
for authentication. The central server can store those tokens in a CAM (Content
Addressable Memory) [5] or a Bloom filter for quick lookup. When the reader
receives a token, it first checks if it will cause token collision; if so, the reader
needs to request another token from the tag for identification purpose. We expect
that the chance for token collisions is small as long as the generated tokens have
good randomness.
3.5.3 Discussion
Memory Requirement The memory requirement for the tag to implement ETAP
is the same as TAP, i.e., $(u + 1)a + (v + 1)b$ bits. The memory requirement at the
central server moderately increases because of the larger key table and hash table
for storing multiple tokens for each tag.
Communication Overhead For each authentication, the tag only needs to transmit
two $a$-bit tokens, and the reader needs to send an authentication request, one $a$-bit
token, one $b$-bit nonce, and a response, both incurring $O(1)$ communication
overheads.
Online Computation Overhead For each authentication, the tag generates three
tokens and performs one comparison to authenticate the reader. ETAP requires some
extra computation overhead from the reader (server). First, the reader should check
if a received token is a collided one, which requires $O(1)$ computation. In addition,
the reader needs to calculate at most $r$ hash values to identify the tag, and perform
at most $k$ comparisons to locate the received token in the token array. Since $r$ and $k$
are small constants, the online computation overhead for the reader is still $O(1)$.
Hardware Cost The hardware for RFID tags to implement ETAP consists of a
circular shift register, two registers for storing intermediate results, some XOR
gates, and some RAM to store u base tokens, v base indicators, one token, and one
indicator. We estimate the hardware cost of ETAP following the estimated costs of
typical cryptographic hardware [4, 16] listed in Table 3.5. The circular shift register
is a group of flip-flops connected in a chain, which requires $12\max(a, b)$ logic
gates. Similarly, the two registers for intermediate results need $2 \times 12\max(a, b)$
logic gates. In addition, it takes $2.5\max(a, b)$ logic gates to implement the XOR
gates. Finally, the cost of the RAM for storing the base tokens, base indicators,
token, and indicator is about $\frac{(u+1)a}{8} \times 12 + \frac{(v+1)b}{8} \times 12$ logic gates. Therefore,
the total number of required logic gates for implementing ETAP is approximately
$38.5\max(a, b) + 1.5(u+1)a + 1.5(v+1)b$. For example, if we set $a = b = 16$,
$u = 10$, and $v = 6$ (the reason for this setting will be explained shortly), ETAP only
requires about 1K logic gates.
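The arithmetic behind this estimate can be checked with a short script. The following is a minimal sketch that simply adds up the per-component gate counts quoted above; the function names `etap_gate_count` and `etap_tag_memory_bits` are our own illustrative choices and are not part of ETAP.

```python
def etap_gate_count(a, b, u, v):
    """Approximate number of logic gates for an ETAP tag, following the text's estimate."""
    width = max(a, b)
    shift_register = 12 * width        # circular shift register (chain of flip-flops)
    temp_registers = 2 * 12 * width    # two registers for intermediate results
    xor_gates = 2.5 * width            # XOR gates
    # RAM term from the text: (bits / 8) * 12 gates for tokens and for indicators
    ram = (u + 1) * a / 8 * 12 + (v + 1) * b / 8 * 12
    return shift_register + temp_registers + xor_gates + ram


def etap_tag_memory_bits(a, b, u, v):
    """Tag memory in bits: u base tokens plus one token, v base indicators plus one indicator."""
    return (u + 1) * a + (v + 1) * b


print(etap_gate_count(16, 16, 10, 6))       # setting used in the example above
print(etap_tag_memory_bits(16, 16, 10, 4))  # memory for u = 10, v = 4
```

For $a = b = 16$, $u = 10$, and $v = 6$ the sum is 1048 gates, which matches the figure of about 1K logic gates.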
$$\mathrm{Prob}(z' = z) \le \frac{1}{2} + \frac{1}{\mathrm{poly}(s)}, \qquad (3.8)$$
the tag will XOR its current indicator with a random nonce. Even if the adversary
obtains all current keys of the tag, it does not know the previous values of the
indicator without capturing all random nonces. Therefore, the adversary cannot
perform reverse operations of the updating process to calculate the previous tokens.
In the first set of simulations, the number $n_t$ of tokens is set to 100, 500, 1000, and
5000, respectively. We vary the number r of hash functions from 1 to 20. Under each
parameter setting, we repeat the simulation 500 times and obtain the average value
of the utilization ratio. Results in Fig. 3.11 demonstrate that the utilization ratio increases
significantly with the increase of r at first, and gradually flattens out when r is sufficiently large.
Consider the case that $n_t = 5000$. When $r = 1$, less than 0.3 % of slots are used.
In contrast, when $r = 10$, more than 50 % of slots are occupied. In addition, we
observe that for larger $n_t$, the corresponding utilization ratio is slightly smaller when the same
number of hash functions is used.
Next, we investigate the effectiveness of the multi-hash approach in resolving
hash collisions. We fix the number r of hash functions to 10, and vary the number
$n_t$ of tokens from 1000 to 10,000 at steps of 1000. The number l of slots in the
hash table is set to 2, 3, and 5 times the number of tokens, respectively. Under each
parameter setting, we run 500 tests and calculate the ratio of tests in which no hash
collision occurs. The results in Fig. 3.12 show that when $l = 2n_t$, hash collisions
can occur with high probability, particularly for large $n_t$; when l is increased to $5n_t$,
there is no hash collision any more.
[Fig. 3.11 plot: utilization ratio versus the number r of hash functions (1 to 20); surviving legend entries: nt = 500, nt = 1000, nt = 5000]
3.7 Numerical Results 61
[Fig. 3.12 plot: ratio of tests without hash collision versus the number nt of tokens (×10^3, from 1 to 10); surviving legend entry: l = 5nt]
The effectiveness of ETAP relies on the randomness of the tokens and indicators.
An intuitive requirement of randomness is that any token (indicator) should have
approximately the same probability to appear. The EPC C1G2 standard [7] specifies
that for a 16-bit pseudo-random generator the probability of any 16-bit RN16 with
value x shall be bounded by $\frac{0.8}{2^{16}} < P(\mathrm{RN16} = x) < \frac{1.25}{2^{16}}$. To evaluate the randomness
of tokens and indicators generated by ETAP, we set $a = b = 16$,
produce $2^{16} \times 500$ tokens and indicators, and calculate the frequency of each token
or indicator. Note that we can concatenate multiple tokens to form a longer one if
necessary. In addition, we set $u = 10$ as suggested by Theorem 3, and vary $v$ = 2, 4,
6 to investigate its impact on randomness. Figure 3.13 presents the results, where the
dotted horizontal lines represent the bounds specified by EPC C1G2. We can see that
the indicators have better randomness with the increase of v, while the randomness
of tokens is not sensitive to the value of v since u is already set sufficiently large.
In addition, when $u = 10$ and $v = 4$, requiring only 256-bit tag memory, both the
tokens and indicators meet the randomness requirement.
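The frequency test described above amounts to counting how often each value occurs and comparing the relative frequency with the two EPC C1G2 bounds. The sketch below illustrates this check under our own assumptions: the function names are hypothetical, the sample source is left to the caller (it is not the ETAP generator), and a small bit width is used in the usage lines only to keep the example fast.

```python
from collections import Counter


def epc_frequency_bounds(bits=16):
    """Lower and upper bounds on P(value = x) for a bits-wide generator (EPC C1G2 rule)."""
    return 0.8 / 2 ** bits, 1.25 / 2 ** bits


def meets_epc_bounds(samples, bits=16):
    """Return True if every possible value has a relative frequency strictly inside the bounds."""
    lower, upper = epc_frequency_bounds(bits)
    counts = Counter(samples)
    total = len(samples)
    # A value that never appears has frequency 0 and therefore fails the lower bound.
    return all(lower < counts.get(x, 0) / total < upper for x in range(2 ** bits))


# Toy usage with 2-bit values: a perfectly even sample passes, a skewed one fails.
print(meets_epc_bounds([0, 1, 2, 3] * 100, bits=2))
print(meets_epc_bounds([0] * 200 + [1, 2, 3] * 100, bits=2))
```

With $2^{16} \times 500$ samples, each 16-bit value is expected to appear about 500 times, so the bounds correspond to roughly 400 and 625 occurrences per value.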
[Fig. 3.13 plots: three rows of panels showing frequency (×10^-5) versus token value (left) and indicator value (right), with values ranging from 0 to 60000]
Fig. 3.13 Frequency tests for tokens and indicators generated by ETAP, where $a = b = 16$. Each
point represents a token/indicator and its frequency. The two dotted horizontal lines represent the
required bounds
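Hence, $Y \sim U(0, 1)$. We divide $(0, 1)$ into ten equal-length subintervals, and denote
the numbers of p-values in each subinterval as $F_1, F_2, \ldots, F_{10}$, respectively. We have

$$\chi^2 = \sum_{i=1}^{10} \frac{\left(F_i - \frac{m_s}{10}\right)^2}{\frac{9 m_s}{100}} \sim \chi^2(9). \qquad (3.10)$$

Therefore, we can employ the $\chi^2$ test. If the observed statistic of $\chi^2$ is $\chi^2(\mathrm{obs})$, the
p-value is

$$P\left(\chi^2 \ge \chi^2(\mathrm{obs})\right) = \frac{\int_{\chi^2(\mathrm{obs})}^{\infty} e^{-x/2} x^{9/2-1}\,dx}{\Gamma(9/2)\,2^{9/2}} = \frac{\int_{\chi^2(\mathrm{obs})/2}^{\infty} e^{-x} x^{9/2-1}\,dx}{\Gamma(9/2)} = \mathrm{igamc}\left(\frac{9}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right),$$

where $\mathrm{igamc}(c, z) = \frac{1}{\Gamma(c)} \int_z^{\infty} e^{-x} x^{c-1}\,dx$. The uniformity is acceptable if
$\mathrm{igamc}\left(\frac{9}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right) \ge 0.0001$ [17].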
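The uniformity check on the p-values can be sketched in a few lines. The code below is an illustration under stated assumptions, not the test suite used for the experiments: it bins the p-values into ten subintervals, forms the statistic with the normalization of Eq. (3.10), and evaluates igamc(9/2, z) through the standard recurrence $Q(c+1, z) = Q(c, z) + z^c e^{-z}/\Gamma(c+1)$ with $Q(1/2, z) = \mathrm{erfc}(\sqrt{z})$, which avoids any dependency outside the standard library. The function names are hypothetical.

```python
import math


def igamc_9_2(z):
    """Regularized upper incomplete gamma function igamc(9/2, z) for z >= 0."""
    value = math.erfc(math.sqrt(z))           # Q(1/2, z)
    for k in range(4):                        # climb from c = 1/2 up to c = 9/2
        value += math.exp(-z) * z ** (k + 0.5) / math.gamma(k + 1.5)
    return value


def uniformity_p_value(p_values):
    """P-value of the chi-square uniformity test over ten equal-length subintervals."""
    m_s = len(p_values)
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1      # p = 1.0 falls into the last subinterval
    expected = m_s / 10
    scale = 9 * m_s / 100                     # denominator used in Eq. (3.10)
    chi2_obs = sum((f - expected) ** 2 / scale for f in counts)
    return igamc_9_2(chi2_obs / 2)


# Usage: the uniformity is considered acceptable when the result is at least 0.0001.
evenly_spread = [(i % 10) / 10 + 0.05 for i in range(500)]
print(uniformity_p_value(evenly_spread) >= 0.0001)
```

A sample whose p-values all fall into a single subinterval yields a p-value far below 0.0001 and is rejected.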
We set $a = b = 16$, $u = 10$, and $v = 4$, and convert the tokens generated by
ETAP to a bit sequence. We vary $n_s$ from 1000, 5000, 10,000 to 50,000. The NIST
suggests that $\alpha \ge 0.001$, so we set $\alpha = 0.01$. In addition, we set $m_s = 500$, in
the same order of magnitude as $\alpha^{-1}$. The block size M should be selected such that
$M \ge 20$, $M > 0.01 n_s$, and $N_B < 100$, where $N_B$ is the number of blocks. We set
$M = 0.02 n_s$, so $N_B = n_s / M = 50$.
The test results are shown in Table 3.6. We can see that the bit sequence generated
by ETAP can pass the randomness tests under all parameter settings, which again
verifies that our protocol can generate tokens with good randomness.
Table 3.6 The sample size $m_s$ is 500, and the acceptable confidence interval of the success
proportion is [0.97665, 1]

                          Length 1000        Length 5000        Length 10,000      Length 50,000
Test                      Prop.   p-value    Prop.   p-value    Prop.   p-value    Prop.   p-value
Monobit frequency         0.9920  0.002927   0.9880  0.861264   0.9920  0.719747   0.9900  0.957612
Block frequency           0.9960  0.037076   0.9880  0.538182   0.9880  0.162606   0.9940  0.055361
Cumulative sum (Mode 0)   0.9920  0.119508   0.9860  0.123755   0.9880  0.632955   0.9880  0.986227
Cumulative sum (Mode 1)   0.9920  0.798139   0.9920  0.823725   0.9900  0.877083   0.9900  0.081510
Runs                      0.9940  0.286836   0.9820  0.482707   0.9880  0.146982   0.9860  0.068571
Longest run               0.9960  0.583145   0.9880  0.554420   0.9860  0.851383   0.9900  0.889118
Matrix rank (a)           NA      NA         NA      NA         NA      NA         0.9880  0.004697

(a) Matrix rank test requires that the bit sequence consists of at least 38,912 bits. Hence, no test is
performed when $n_s < 38{,}912$, which is marked as NA
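The acceptance interval quoted in the caption of Table 3.6 can be reproduced numerically. The sketch below assumes that the interval follows the rule given in the NIST statistical test suite documentation, $\hat{p} \pm 3\sqrt{\hat{p}(1-\hat{p})/m_s}$ with $\hat{p} = 1 - \alpha$; this attribution is our assumption, and the function name is hypothetical.

```python
import math


def proportion_interval(alpha, m_s):
    """Acceptable range for the proportion of sequences passing a test (assumed NIST rule)."""
    p_hat = 1 - alpha
    half_width = 3 * math.sqrt(p_hat * (1 - p_hat) / m_s)
    # A proportion cannot exceed 1, so the upper end is capped.
    return p_hat - half_width, min(1.0, p_hat + half_width)


lower, upper = proportion_interval(alpha=0.01, m_s=500)
print(round(lower, 5), upper)

# Smallest proportion reported in Table 3.6 (Runs test, length 5000).
print(0.9820 >= lower)
```

With $\alpha = 0.01$ and $m_s = 500$ the lower end rounds to 0.97665, so every proportion listed in the table lies inside the interval.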
3.8 Summary
References
1. Avoine, G., Oechslin, P.: A scalable and provably secure hash-based RFID protocol. In: IEEE
PerCom Workshops, pp. 110–114 (2005)
2. Avoine, G., Dysli, E., Oechslin, P.: Reducing time complexity in RFID systems. In: Selected
Areas in Cryptography, pp. 291–306. Springer, Berlin/Heidelberg (2006)
3. Bogdanov, A., Leander, G., Paar, C., Poschmann, A., Robshaw, M.J.B., Seurin, Y.: Hash
functions and RFID tags: mind the gap. In: Proceedings of CHES, pp. 283–299 (2008)
4. Chen, M., Chen, S., Xiao, Q.: Pandaka: a lightweight cipher for RFID systems. In: Proceedings
of IEEE INFOCOM, pp. 172–180 (2014)
5. Chisvin, L., Duckworth, R.J.: Content-addressable and associative memory: alternatives to the
ubiquitous RAM. IEEE Comput. 22, 51–64 (1989)
6. Dimitriou, T.: A secure and efficient RFID protocol that could make big brother (partially)
obsolete. In: Proceedings of IEEE PERCOM (2006)
7. EPC Radio-Frequency Identity Protocols Class-1 Gen-2 UHF RFID Protocol for Communications
at 860MHz-960MHz, EPCglobal (2011). Available at http://www.epcglobalinc.org/uhfclg2
Traditional RFID technologies allow tags to communicate with a reader but not
among themselves. By enabling peer communications between nearby tags, the
emerging networked tags represent a significant enhancement to today’s RFID
tags. They support applications in previously infeasible scenarios where the readers
cannot cover all tags due to cost or physical limitations. This chapter introduces
a fundamental problem of identifying networked tags. To prolong the lifetime
of networked tags and make identification protocols scalable to large systems,
energy efficiency and time efficiency are most critical. We reveal that the traditional
contention-based protocol design will incur too much energy overhead in multihop
tag systems, while a reader-coordinated design that significantly serializes tag
transmissions performs much better. In addition, we show that load balancing is
important in reducing the worst-case energy cost to the tags, and we present a
solution based on serial numbers.
The rest of this chapter is organized as follows. Section 4.1 presents the
system model and the problem statement. Section 4.2 discusses the related work.
Section 4.3 describes the contention-based ID collection protocol. Section 4.4
introduces the serialized ID collection protocol. Section 4.5 presents two techniques
to improve the time efficiency of serialized ID collection. Section 4.6 evaluates the
performance of our protocols by simulations. Section 4.7 gives the summary.
A networked tag system consists of a reader and a large number of objects, each
of which has a tag attached. We will use tag, node, and networked tag
interchangeably in the sequel. Each tag has a unique ID that identifies the object it is
attached to. The reader also has a unique ID that differentiates itself from the tags.
A networked tag system is different from a traditional RFID system with a
fundamental change: Tags near each other can directly communicate. This capability
allows a multihop network to be formed amongst the tags. Developed at Columbia
University recently [2], networked tag prototypes can communicate using variants
of CSMA and slotted ALOHA. The transmission range of inter-tag communications
is usually short, about 1–10 m [5]. But the reader is a more powerful device, and
its transmission range can be much larger. Tags that can perform direct two-way
communication with a node form the neighborhood of the node.
Networked tags are expected to carry sufficient internal energy for long-term
operations or have the capability of harvesting energy from the environment where
they are deployed. Tags of the highest energy demand are located in the reader’s
neighborhood (i.e., coverage area) because they have to relay the information from
all other tags as the data converge towards the reader. Fortunately, these tags can
be powered by the reader’s radio waves, similar to what today’s passive RFID tags
do; their energy supply is ensured. In contrast, tags that are beyond the reader’s
coverage need to use their own energy. The operations of these tags must be made
energy-efficient.
The reader and the tags in the system form a connected network. In other words,
there exists at least one path between the reader and any tag such that they can
communicate by transmitting data along that path. Tags that are not reachable from
the reader are not considered to be in the system.
The problem of tag identification is for a reader to collect IDs from all networked
tags that can be reached by the reader over multiple hops with the help of
intermediate tags relaying the IDs of tags that are not in the immediate coverage
area of the reader. Our goal is to develop tag identification protocols that are
efficient in terms of energy cost and protocol execution time. We will consider both
average energy cost per tag and maximum energy cost among all tags in the system.
The average energy cost is an overall measurement of energy drain across the whole
system, and the maximum energy cost is a measurement for the worst hot spot which
may cause power-exhausted tags and network partition.
There are two types of networked tags. The stateful networked tags maintain
network state such as neighbors and routing tables and update the information
to keep it up-to-date. These tags resemble the nodes in a typical sensor network.
4.1 System Model and Problem Statement 69
In contrast, for the purpose of energy conservation, the state-free tags do
not maintain any network state prior to operation, which makes them different
from nodes in traditional networks, including sensor networks. Virtually all literature on
data-collecting sensor networks assumes the stateful model, where the sensor nodes
maintain information about who their neighbors are and/or how to route data in
the network. We consider state-free networked tags, not only because there is
little prior work on this type of networked nodes, but also because it makes more
sense for the tag identification problem: First, establishing neighborship and then
routing tables across the network is expensive and may incur much more overhead
than tag identification itself, which only requires each tag to deliver one number
(its ID) to the reader. Second, maintaining the neighbor relationship and updating
the routing tables (as tags may move between operations) require frequent network-
wide communications, which is not worthwhile for infrequent operation of tag
identification.
It is challenging to design an identification protocol for state-free networked tags.
First, because power is a scarce resource for tags, the protocol must be energy-
efficient in order to reduce the risk of network failure caused by energy depletion.
Second, we should also make the protocol time-efficient so that it can scale to a large
tag system where the communication channel works at a very low rate for energy
conservation. Third, in order to eliminate overhead of state maintenance and thus
conserve energy, tags are assumed to be state-free, which means that they do not
know who their neighbors are and there is no existing routing structure for them to
send IDs to the reader.
freely moved around. In case that the identification operation needs to be performed
during the daytime, we need to design a protocol that takes as little time as possible
to avoid significant interruption to other warehouse operations due to the stationary
requirement at the time of identification.
To conserve energy, networked tags are likely configured to sleep and wake up
periodically for operations. After wake-up, a tag will listen for a request broadcast
from the reader into the network, which either puts the tag back to sleep or asks the
tag to participate in an operation such as reporting its ID. The broadcast request will
serve the purpose of loosely re-synchronizing the tag clock. The reader will time its
next request a little later than the timeout period set by the tags to compensate for the
clock drift and the clock difference at the tags due to broadcast delay. The exact sleep
time of the tags and the inter-request interval of the reader should be set empirically
based on application needs and physical parameters of the tags.
Notations used in this chapter are given in Table 4.1 for quick reference.
The tag identification protocols for traditional RFID systems can be broadly
classified into two categories: ALOHA-based [12, 15] and tree-based [9, 14]. To run
an ALOHA-based identification protocol, the reader first broadcasts a query, which
is followed by a slotted time frame. Each tag randomly picks a time slot in the frame
to report its ID. Collision happens if a slot is chosen by multiple tags. Tags not
receiving positive acknowledgements from the reader will continue participating in
the subsequent frames. The dynamic frame slotted ALOHA (DFSA) [10, 11] adjusts
the frame size round by round.
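The frame-by-frame behavior of an ALOHA-based protocol can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions of our own: the channel is error-free, the reader acknowledges every singleton slot, and the frame size of each round is set to the number of still-unidentified tags, which a real reader does not know and would have to estimate. The function name is hypothetical.

```python
import random


def aloha_identify(tag_ids, seed=0):
    """Simulate framed slotted ALOHA; return (collected IDs, number of frames used)."""
    rng = random.Random(seed)
    pending = sorted(set(tag_ids))
    collected = []
    frames = 0
    while pending:
        frames += 1
        frame_size = len(pending)            # idealized dynamic frame size (assumption)
        slots = {}
        for tag in pending:                  # each tag picks one slot at random
            slots.setdefault(rng.randrange(frame_size), []).append(tag)
        for slot in sorted(slots):
            if len(slots[slot]) == 1:        # singleton slot: ID received and acknowledged
                collected.append(slots[slot][0])
            # slots chosen by several tags collide; those tags retry in the next frame
        pending = [tag for tag in pending if tag not in collected]
    return collected, frames


ids, frames = aloha_identify(range(50))
print(len(ids), frames)
```

Every tag is eventually identified because at least one more frame is issued as long as unacknowledged tags remain.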
4.3 Contention-Based ID Collection Protocol for Networked Tag Systems 71
The tree-based protocols organize all IDs into a tree of ID prefixes. Each in-tree
node has two child nodes that have one additional bit, “0” or “1”. The tag IDs are
leaves of the tree. The reader walks through the tree. As it reaches an in-tree node, it
queries for tags with the prefix represented by the node. When multiple tags match
the prefix, they will all respond and cause collision. Then the reader moves to a child
node by extending the prefix with one more bit. If zero or one tag responds (in the
one-tag case, the reader receives an ID), it moves up in the tree and follows the next
branch.
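The tree walk described above can be sketched as a depth-first traversal over ID prefixes. The code below is an illustration only: it assumes unique, equal-length IDs given as bit strings, an error-free channel, and a reader that visits the "0" branch before the "1" branch. The function name is hypothetical.

```python
def tree_walk(tag_ids):
    """Collect tag IDs by querying prefixes; return (collected IDs, number of queries)."""
    collected = []
    queries = 0
    stack = [""]                             # start at the root, i.e., the empty prefix
    while stack:
        prefix = stack.pop()
        queries += 1
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:             # exactly one tag answers: its ID is received
            collected.append(responders[0])
        elif len(responders) > 1:            # collision: extend the prefix by one bit
            stack.append(prefix + "1")
            stack.append(prefix + "0")       # pushed last so that it is visited first
        # no responder: nothing to do, the walk continues with the next branch
    return collected, queries


tags = ["0010", "0111", "1000", "1011", "1111"]
ids, queries = tree_walk(tags)
print(ids, queries)
```

Because the "0" branch is always visited first, the IDs are collected in lexicographic order.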
To further improve the identification efficiency, network coding and interference
cancelation techniques are used to help the reader recover IDs from collided
signals [7, 16].
We are not aware of any existing data collection protocol specifically designed
for the state-free model which makes sense in the domain of tags but was not
adopted in the mainstream literature of sensor networks or other types of wireless
systems. However, it is not hard to design an ID collection protocol for networked
tags based on techniques known in existing wireless systems. For example, in
this section, we will follow an obvious design path based on broadcast, spanning
tree, and contention-based transmission. The resulting protocol will be used as a
benchmark for performance comparison (since there is no prior work on identifying
networked tags). In the next section we will point out that the obvious techniques
are, however, inefficient and other less-obvious design choices can produce much
better performance.
4.3.1 Motivation
One straightforward approach for tags to deliver their IDs to the reader is through
flooding: As each tag broadcasts its ID and every other ID it receives for the first time
into its neighborhood, the IDs will eventually reach the reader. However, flooding
causes a lot of communication overhead. In addition, each tag has to store the
IDs that it has received in order to avoid duplicate broadcast. Due to the nature
of flooding, it means that eventually each tag will store all IDs in the system, which
demands too much memory.
Another approach is to ask tags to discover their neighbors and run a routing
protocol to form routing paths towards the reader right before sending the IDs
(even though the tags are state-free prior to operation). However, as the number
72 4 Identifying State-Free Networked Tags
The classical broadcast protocol is for each node to transmit a message when
it receives the message for the first time. But it becomes more complicated to
guarantee that all nodes receive the message: If each node knows its neighbors,
it may keep transmitting the message until receiving acknowledgements from all
neighbors. However, more care must be taken if the nodes do not know their
neighbors. Below we briefly describe a request broadcast protocol (RBP) that
guarantees delivering a request from the reader to all state-free tags.
To initiate tag identification, the reader broadcasts a request notifying the tags
to report their IDs. The request initially carries the reader’s ID, which will later
be replaced with a tag’s ID when the tag forwards the request to others. The state
transition diagram of the protocol is depicted in Fig. 4.1, which is explained below.
State of Waiting for Request Each tag begins in this state and takes action based
on one of three possible events.
[Fig. 4.1 diagram: states "Wait for request", "Forward request", and "Terminate"; transitions (event / action): idle channel / go to sleep; collision / send a NAK; request received / send an ACK; one NAK || collision / exponential backoff, broadcast request; no response || one ACK / stop broadcasting]
Fig. 4.1 State transition diagram of the RBP protocol. Each circle is a state, and each arrow is a
transition, where the event triggering the transition is above the line and the action is below the line
Theorem 4. Every tag will receive a copy of the request sent out by the reader
under RBP.
Proof. To prove by contradiction, assume there exists a tag T that does not
receive the request after executing RBP. It must be true that none of its neighbors
has received the request. Otherwise, according to the protocol, any neighbor having
received the request would continue broadcasting the request until T receives it and
acknowledges its receipt: each time the request is transmitted, if T does not receive
the request successfully, it will respond with a NAK, causing the sender to retransmit.
By the same token, the neighbors of any neighbor of T must not have received the request.
Applying this argument recursively, all nodes reachable from T must not have received the
request. By the assumption that the network is connected, at least one neighbor $T'$
of the reader is reachable from T. Therefore, $T'$ must not have received the request. This
contradicts the fact that $T'$ is located in the reader’s coverage area and should
receive the request at the very beginning when the reader broadcasts the request for
the first time. Therefore, the theorem must hold.
When a tag transmits its ID, it will include its parent’s ID in the message,
such that the parent node will receive it while other neighbors will discard the
message. This unicast transmission is performed based on the classical ALOHA
with acknowledgement and exponential backoff to resolve collision. The parent
node will forward the received ID to its parent, and so on, until the ID reaches
the reader.
The execution of ICP is performed in parallel with RBP: Once a tag knows its
parent ID from RBP, it will begin transmitting its ID to the parent. When a tag
needs to forward both an ID for ICP and a request for RBP, we give priority to ID
forwarding because it is easier for unicast to complete.
Theorem 5. The reader will receive the IDs of all tags in the system after the
execution of ICP.
4.4 Serialized ID Collection Protocol 75
Proof. From Theorem 4, each tag is guaranteed to receive the request and therefore
find a parent (from which the request is received). Consider an arbitrary tag T.
According to the design of ICP, the ID of T will be sent to its parent until positively
acknowledged. The parent will forward the ID to its parent, and as this process
repeats, the ID will eventually reach the reader at the root of the spanning tree.
4.4.1 Motivation
4.4.2 Overview
[Figure panel: tier-2 tags T5 to T11 with serial numbers 5 to 11]
[Fig. 4.4 plots: two spanning trees rooted at the reader (serial number 0) over tags with serial numbers 1 to 12]
Fig. 4.4 Left plot: a roughly balanced spanning tree; Right plot: a biased spanning tree, where tag
1 delivers its ID to the reader first and a large number of nodes in its neighborhood choose it as
their parent, causing a biased tree. The serial number of a tag is shown inside the circle. Each arrow
represents a child–parent relationship
so, all other tier-1 nodes will stay silent. As $T_2$ sends out a request for IDs, only its
children ($T_5$, $T_6$, and $T_7$) will respond. The same process as described in the previous
paragraph will repeat; only this time $T_2$ takes the role of the reader.
After T2 collects the IDs of all its children, it will forward the IDs to the reader,
which will then move to the next tier-1 node. Once it exhausts all tier-1 nodes, it
will move to tier-2 nodes, one by one and tier by tier, until the IDs of all nodes in
the network are collected.
Below we will first introduce the problem of biased energy consumption, give a
solution, and then describe recursive serialization.
When a tag is transmitting its ID to the reader, its neighbors outside of the reader’s
coverage can overhear the ID. They may use this tag as their parent. As illustrated
in the left plot of Fig. 4.4, we prefer a roughly balanced spanning tree where each
node serves as the parent for a similar number of children. In reality, however, a tag
that delivers its ID to the reader early on will tend to have many more children. An
example is given in the right plot of Fig. 4.4. Suppose tag 1 transmits its ID to the
reader first. Overhearing its ID, tags 5–9 will pick tag 1 as their parent. When tag
2 transmits its ID at a later time, no tag will be left to choose tag 2 as parent even
though tags 7–8 are in the range of tag 2—recall that they have already chosen tag
1. In this case, tag 1 will have to forward more IDs, resulting in quicker energy drain
than others. The severity of the problem grows rapidly with an increasing number
of tiers because the numerous children of tag 1 tend to acquire even more numerous
children of their own and those IDs will pass through tag 1 to the reader.
Uneven energy consumption causes some tags to run out of energy earlier, which
can result in network partition. The same problem also exists for RBP/ICP where
tags that receive and forward the request early on during the network-wide broadcast
may end up with a large number of children.
We observe that a tag may overhear multiple ID transmissions over time and
thus have multiple candidates to choose its parent from, as shown by Fig. 4.5 where
T may choose its parent from three tier-1 nodes. Ideally, the tag should choose its
parent uniformly at random from the candidates. However, because of collision,
each candidate may have to retransmit its ID for a different number of times before
the reader successfully receives it. To avoid giving more chance to a candidate that
retransmits its ID more times, the tag may keep the IDs of all known candidates
to filter out duplicate overhearing. However, a serious drawback of this approach is
that the memory cost can be high if a tag has numerous candidates for its parent in a
system where tagged objects are packed tightly together. We want to point out that
typical tags have very limited memory.
time frame. It waits until the chosen slot to report its ID to the reader. If only one tag
selects a certain slot, its ID will be correctly received by the reader, which replies with an
ACK to the tag in the same slot. The ACK carries the number of IDs that the reader
has successfully received so far. This number is assigned as the serial number of the
tag; the number is system-wide unique because it increases monotonically. A
tag can be identified either by its ID or its assigned serial number. After receiving the
ACK, the tag is required to broadcast the assigned serial number in its neighborhood.
Hence, each time slot contains an ID transmission, an ACK transmission, and a
serial-number transmission. If the ID transmission is collision-free, so are the other
two transmissions. Even though a tag may need to retransmit its ID multiple times
due to collision, it will transmit its assigned serial number once, only at the time
when an ACK is received.
If the reader observes any collision in the time frame, it will broadcast another
request with another time frame to collect more IDs. If no collision is observed,
the reader has collected all IDs from its neighborhood and it will perform recursive
serialization (to be discussed) to collect IDs outside of its neighborhood.
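The way serial numbers are handed out during ID collection in the reader’s neighborhood can be sketched as follows. This is a simplified illustration, not the protocol implementation: the channel is assumed error-free, the frame size is fixed and supplied by the caller, and the function name and arguments are hypothetical. The loop stops after the first frame in which no collision is observed, as described above.

```python
import random


def collect_neighborhood(tag_ids, frame_size, received_so_far=0, seed=0):
    """Collect IDs with framed slotted ALOHA and return {tag ID: assigned serial number}."""
    if frame_size < 2:
        raise ValueError("frame_size must be at least 2 so that collisions can be resolved")
    rng = random.Random(seed)
    pending = sorted(tag_ids)
    serials = {}
    s = received_so_far                      # number of IDs the reader has received so far
    while True:
        slots = {}
        for tag in pending:                  # each unacknowledged tag picks a slot
            slots.setdefault(rng.randrange(frame_size), []).append(tag)
        collision_seen = False
        for slot in sorted(slots):
            occupants = slots[slot]
            if len(occupants) == 1:
                s += 1                       # the ACK carries the updated count ...
                serials[occupants[0]] = s    # ... which becomes the tag's serial number
            else:
                collision_seen = True        # colliding tags retry in the next frame
        pending = [tag for tag in pending if tag not in serials]
        if not collision_seen:               # a collision-free frame ends the collection
            return serials


print(collect_neighborhood(["a", "b", "c", "d", "e"], frame_size=8))
```

Since the count only grows, no two tags can receive the same serial number, and the first acknowledged tag receives serial number 1.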
Consider an arbitrary neighbor of T, denoted as $T'$, which has not set its parent yet.
As illustrated in Fig. 4.6, $T'$ must not be in the reader’s neighborhood because the
tags in that neighborhood set the serial number 0 as their parent when they receive
the request from the reader for the first time. When $T'$ receives a serial number for
the first time, it will set the number as its parent, which is subject to change when
$T'$ receives more serial numbers from other tags (candidates for parent). Recall that
each tag broadcasts its serial number only once. This property allows us to design
the following parent selection algorithm (PSA), which guarantees every candidate
has an equal chance to be selected as the parent: Each tag maintains two values,
its parent and a counter c for the number of candidates having been discovered so
far. The counter is initialized to zero. Each time $T'$ receives a serial number
$s'$ from a neighbor, it increases c by one and then replaces the current parent with $s'$
with probability $\frac{1}{c}$. Using this PSA, we have the following theorem:
[Fig. 4.6 diagram: the reader (serial number 0), a tag T with serial number s, and its neighbor T′]
Theorem 6. Suppose a tag has m candidates for parent. Each candidate has an
equal probability of $\frac{1}{m}$ to be chosen as the tag’s parent in the end.
Proof. For the jth ($1 \le j \le m$) discovered candidate, it becomes the final parent
only if it replaces the previously selected parent, and is never substituted by the
subsequently discovered candidates. Therefore, the probability that it is chosen by
the tag as the parent in the end is
$$\frac{1}{j} \prod_{l=j+1}^{m} \left(1 - \frac{1}{l}\right) = \frac{1}{m}. \qquad (4.1)$$
After the reader collects all IDs from its neighborhood, each tag in the neighborhood
will obtain a unique serial number. Recall that these tags constitute the first tier of the
network. The reader then serializes the subsequent ID collection process by sending
the serial numbers of tier-1 tags one by one, in order to command the corresponding
tag to collect IDs from its neighbors, with other tier-1 tags staying idle.
The reader begins by transmitting another type of request, denoted by $RQST_2$,
which includes the serial number 1 and the number s of IDs it has received so
far. In response, the tag with the serial number 1, denoted as $T_1$, transmits an ID
collection request $RQST_1$, carrying its own serial number 1 and a frame size f. The
request causes the neighbors that are not tier-1 to finalize their parent selection;
these nodes are tier-2. Note that some of them may have selected nodes other than
$T_1$ as their parents. Hence, when a tier-2 node receives the request from $T_1$, it will
transmit its ID in the subsequent time frame only if its chosen parent matches the
serial number in the request; otherwise, it can sleep for a duration of f slots. If $T_1$
correctly receives an ID in a slot from a child $T_1'$, it increases the value of s by one
and sends back an ACK with s as the serial number assigned to $T_1'$, which in turn
broadcasts its serial number and tier number (i.e., 2) in its neighborhood such that
the neighbors at the next tier can discover it as one of their candidates for parent.
When a tag sets (or later replaces) its parent, it also sets its tier number as the tier
number of its parent plus one; it should never replace its current parent with one
whose tier number is larger.
It may take T1 multiple requests to finish reading all IDs from its children. It then
forwards the IDs to the reader. After acknowledging T1 , the reader sends a command
to trigger the ID collection process at the next tier-1 tag.
After the reader finishes this process with all tier-1 tags, it has collected the IDs
of all tier-2 tags. The reader also has the information to construct a spanning tree
covering tier-1 and tier-2 nodes, as illustrated in Fig. 4.3 where the assigned serial
numbers are shown inside the circles.
After the reader commands all tier-1 tags one by one to collect the IDs of tier-2
tags, it repeats this serialization process recursively to collect other IDs tier by tier.
Suppose the reader has collected the IDs from all tags at tier 1 through tier i and
the range of serial numbers at tier i is from x to y. The reader will send an $RQST_2$
to each tier-i tag in sequence. The command includes a concatenation of the serial
numbers along the path in the spanning tree from the root (excluded) to that tag, in
addition to the number s of IDs that the reader has received so far. For example, for
tag 7 in Fig. 4.3, the command will carry two serial numbers, 2 and 7. (Note that
since each serial number is of fixed size, there is no ambiguity on interpreting the
sequence of serial numbers.)
When the reader broadcasts the command in its neighborhood, any tag receiving
the command will extract and compare the first serial number with its own. If the
two serial numbers do not match, it discards the command. Otherwise, it further
checks whether there are more serial numbers in the command. If so, it broadcasts
the remaining command. This process repeats until a tag matches the last serial
number in the command. That tag will perform ID collection in a similar way as
described in Sect. 4.4.6. The collected IDs will be sent through the parent chain to
the reader.
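The forwarding rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the 4-bit serial-number width, the bit-string encoding, and the names `encode_path` and `on_command` are hypothetical, and the count s of IDs received so far, which the command also carries, is left out.

```python
SERIAL_BITS = 4  # assumed width of one serial number; the text only says it is fixed


def encode_path(serials):
    """Concatenate fixed-width serial numbers into one bit string."""
    return "".join(format(s, f"0{SERIAL_BITS}b") for s in serials)


def on_command(own_serial, path_bits):
    """Process a received command at one tag.

    Returns None if the command is discarded, the remaining bits if the
    command must be rebroadcast, or "" if this tag matched the last serial
    number and should start ID collection.
    """
    first = int(path_bits[:SERIAL_BITS], 2)
    if first != own_serial:
        return None                      # not on the path: discard
    return path_bits[SERIAL_BITS:]       # strip own serial number, keep the rest


command = encode_path([2, 7])        # path to tag 7 in Fig. 4.3
at_tag_2 = on_command(2, command)    # tag 2 matches and rebroadcasts the rest
at_tag_7 = on_command(7, at_tag_2)   # tag 7 matches the last serial number
at_tag_6 = on_command(6, at_tag_2)   # a sibling of tag 7 discards the command
```

Because every serial number has the same width, a tag can always split off the first one without any delimiter, which is the point made in the parenthetical note above.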
Theorem 7. The reader will receive the IDs of all tags in the system after the
execution of SICP.
Proof. Proving by contradiction, we assume that at least one tag T fails in delivering its
ID to the reader. T must not have a parent; we again prove this by contradiction:
Assume that T has a parent T′. According to the protocol, for T′ to be chosen as a
parent, it must be either the reader or a node that has already successfully delivered its
ID and subsequently broadcast its assigned serial number. Hence, it will receive a
command from the reader to collect IDs from its children. After the reader sends a
command to T′, T′ will broadcast requests, free of collision due to serialization, to its
children until all IDs are collected, which happens when no collision is detected in
the time frame after a request. When T′ receives the ID of T, if it is not the reader,
it will forward the ID to the reader along the path with which its own ID has been
successfully delivered, free of collision due to serialization. This contradicts the
assumption that T fails in delivering its ID to the reader. Hence, T does not have a
parent.
If T does not have a parent, all of its neighbors must fail in delivering their IDs
to the reader because otherwise any successful neighbor would broadcast its serial
number according to the protocol, which would result in T having a parent after T
receives the serial number.
If all neighbors of T fail in delivering their IDs to the reader, by the same
reasoning as above, all their neighbors must fail too. Recursively applying this
argument, all tags in the network must fail in delivering their IDs to the reader
because the network is connected, which contradicts the fact that the
reader’s immediate neighbors are able to send their IDs to the reader through the
slotted ALOHA protocol that SICP employs. Hence, the theorem is proved.
The pseudo code of SICP for an arbitrary tag T is given in Protocol 1.
When the reader or a tag tries to collect the IDs in its neighborhood, its request
carries a frame size f . Let n be the number of tags that are children of the reader or
tag sending the request. It is well known that the optimal frame size should be set as
n, such that the probability of each slot carrying a single ID (without collision) can
be maximized. This can be easily seen as follows: Consider an arbitrary slot. The
probability p that one and only one tag chooses this slot to transmit is
\[
p = \binom{n}{1}\frac{1}{f}\left(1 - \frac{1}{f}\right)^{n-1} \approx \frac{n}{f}\, e^{-\frac{n-1}{f}} \approx \frac{n}{f}\, e^{-\frac{n}{f}} \qquad (4.2)
\]
when n is large. To find the value of f that maximizes p, we take the first-order
derivative of the right side and set it to zero. Solving the resulting equation, we have
\[
f = n, \qquad (4.3)
\]
which means the maximal value of p is $e^{-1}$. In subsequent requests, as more and
more IDs have been collected, fewer and fewer tags are transmitting their IDs and
the frame sizes should be reduced accordingly.
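As a quick numerical check of (4.2) and (4.3), the following sketch evaluates the exact single-slot probability over a range of frame sizes and picks the best one. The value n = 200 and the function name `single_slot_prob` are chosen only for illustration.

```python
import math


def single_slot_prob(n, f):
    """Exact probability that exactly one of n tags picks a given slot of an f-slot frame."""
    return n * (1 / f) * (1 - 1 / f) ** (n - 1)


n = 200
# Search integer frame sizes around n for the one that maximizes p.
best_f = max(range(2, 5 * n), key=lambda f: single_slot_prob(n, f))
p_max = single_slot_prob(n, best_f)
gap_to_limit = abs(p_max - math.exp(-1))   # distance from the large-n limit 1/e
```

For this n the search returns a frame size equal to n, and the maximal probability is close to $e^{-1}$, in line with the derivation above.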
However, we do not know n. There are numerous estimation methods for n [3, 6,
13], which are, however, intended for a system with a large number of tags, in the tens
of thousands. It is known that these estimation methods will actually be inefficient if
they are applied to a relatively small number of tags such as a couple of thousand or
fewer [8]; if the number of tags is very small, the estimation time can be much larger
than the time it takes to complete the tag identification task itself. In the context of
this chapter, we expect the number of children of the reader or any tag to be relatively
small. Hence, it is not worthwhile to add the overhead of a separate component for
estimating n before the reader (tag) begins collecting IDs from its neighborhood.
Our solution is to estimate the value of n iteratively from the frame itself without
incurring additional overhead. Initially, we set f to be a small constant in the first
request. We double the value of f in each subsequent request until there exists at
least one empty slot that no tag chooses. From then on, we estimate the value
of n and set the frame size accordingly in the subsequent requests. Without loss
of generality, suppose we want to determine the frame size for the ith request. Let $f_j$
be the frame size used in the jth request, $1 \le j < i$. After the jth request, let $c_j$, $s_j$,
and $e_j$ be the numbers of slots that are chosen by multiple tags (collision), a single
tag, and zero tags, respectively. Let $m_j$ be the number of IDs that have been successfully
collected after the jth request. All these values are known to the reader (tag). The
process for a tag to randomly choose a slot in a time frame can be cast as a bins-and-balls
problem [4]. In the jth frame, $n - m_{j-1}$ tags (balls) are mapped to $f_j$ slots (bins).
The total number of different ways of putting $n - m_{j-1}$ balls into $f_j$ bins is $f_j^{\,n - m_{j-1}}$.
The number of ways of choosing $e_j$ bins from $f_j$ bins and leaving them empty is $\binom{f_j}{e_j}$.
In addition, the number of ways of choosing $s_j$ balls from $n - m_{j-1}$ balls and putting
each of them into one of the remaining $f_j - e_j$ bins is
$\binom{f_j - e_j}{s_j}\binom{n - m_{j-1}}{s_j}(s_j!)$. Finally, the
remaining $n - m_{j-1} - s_j$ balls should be thrown into the remaining $c_j$ bins, each
containing at least two balls (collision slots). We first choose $2c_j$ balls and put two
balls into each of the $c_j$ bins, which includes
$\binom{n - m_{j-1} - s_j}{2c_j}\frac{(2c_j)!}{2^{c_j}}$ possibilities. After that,
the remaining $(n - m_{j-1} - s_j - 2c_j)$ balls can be put into any of the $c_j$ bins, which
involves $c_j^{\,n - m_{j-1} - s_j - 2c_j}$ different ways. Therefore, the likelihood function for
observing these values is
\[
L(n) = \prod_{j=1}^{i-1} \frac{\binom{f_j}{e_j}\binom{f_j - e_j}{s_j}\binom{n - m_{j-1}}{s_j}(s_j!)\binom{n - m_{j-1} - s_j}{2c_j}\frac{(2c_j)!}{2^{c_j}}\; c_j^{\,n - m_{j-1} - s_j - 2c_j}}{f_j^{\,n - m_{j-1}}}. \qquad (4.4)
\]
The estimate of n is the value that maximizes L. Let this value be $\hat{n}$, which can be
found through exhaustive search since the range for n is limited in practice, rarely
going beyond tens of thousands. For the ith request, we set the frame size to be
$\hat{n} - m_{i-1}$.
The above estimator follows the general principle originally seen in [6], but it
takes the information of $c_j$, $s_j$, and $e_j$ all in the same estimator, whereas the estimators
in [6] use either $c_j$ or $e_j$.
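The exhaustive search for the maximizer of the likelihood can be sketched as follows. This is a minimal illustration, not the authors' implementation: it works with the logarithm of the likelihood for numerical stability, follows the counting argument above with the last factor taken as $c_j$ raised to the number of leftover balls, and uses hypothetical names (`log_likelihood`, `estimate_n`) and a made-up single-frame observation.

```python
import math


def log_comb(a, b):
    """Logarithm of the binomial coefficient C(a, b), for 0 <= b <= a."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)


def log_likelihood(n, frames):
    """Log-likelihood of n given frames of (f_j, e_j, s_j, c_j, m_{j-1})."""
    total = 0.0
    for f, e, s, c, m_prev in frames:
        balls = n - m_prev               # tags still transmitting in frame j
        rest = balls - s - 2 * c         # balls left after seeding each collision bin with two
        if rest < 0 or (c == 0 and rest > 0):
            return -math.inf             # the observation is impossible for this n
        total += log_comb(f, e) + log_comb(f - e, s)
        total += log_comb(balls, s) + math.lgamma(s + 1)
        total += log_comb(balls - s, 2 * c) + math.lgamma(2 * c + 1) - c * math.log(2)
        if rest > 0:
            total += rest * math.log(c)
        total -= balls * math.log(f)
    return total


def estimate_n(frames, n_max=10_000):
    """Exhaustive search for the n in [1, n_max] that maximizes the likelihood."""
    return max(range(1, n_max + 1), key=lambda n: log_likelihood(n, frames))


# One hypothetical frame: 16 slots, 6 empty, 6 singleton, 4 collision, no IDs collected before.
frames = [(16, 6, 6, 4, 0)]
n_hat = estimate_n(frames, n_max=200)
next_frame_size = n_hat - 6              # assumes the 6 singleton IDs were all collected
```

In this example the search settles on an estimate of 18 tags, so the next frame would use 12 slots under the rule stated above.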
As our analysis will show, except for the reader, the average number of children
per tag is typically very small (less than 2) for a randomly distributed tag network.
In this case, if we set the initial frame size to 4, the chance is high that a tag
successfully collects all IDs from its children in the first time frame. Therefore, only
the reader needs to use (4.4) to estimate the number of its children, while the tags
can just set the frame size to a small constant to avoid the computation overhead.
We analyze the work load of each tag in terms of how many children and
descendants it has to handle. While our load balancing approach is designed for
any tag distribution, to make the analysis tractable, we assume here that tags are
evenly distributed in an area with density $\rho$, and the tags whose distances from the
reader are no larger than R form the first tier, while those whose distances from the
reader are greater than $R + (i-2)r$ but smaller than $R + (i-1)r$ form the ith ($i \ge 2$)
tier of the network, where the transmission ranges of the reader and a tag are R and
r, respectively, with $R \gg r$. For example, Fig. 4.7 presents a network with three tiers.
The number $N_i$ of tags in the ith tier is estimated as
\[
N_i = \rho\pi\left((R + (i-1)r)^2 - (R + (i-2)r)^2\right) = \rho\pi\left(2Rr + (2i-1)r^2\right). \qquad (4.5)
\]
One exception is that $N_1$ computed from (4.5) actually includes only the portion of
tier-1 tags whose distances from the reader are larger than $R - r$; these are the tags
that can serve as parents for tier-2 tags.
The children degree of tier-i tags, denoted by $D_i$, is defined as the average number
of children that a tier-i tag has. Because tags at the ith tier only serve as parents for
tags at the $(i+1)$th tier, we have
\[
D_i = \frac{N_{i+1}}{N_i} = \frac{2R + (2i+1)r}{2R + (2i-1)r}. \qquad (4.6)
\]
We have $R \gg r$ because the reader can transmit at a much higher power level and
it has a much more sensitive antenna. This makes the values of $D_i$ very small. For
example, if $R = 3r$, Table 4.2 shows the values of $D_i$, $1 \le i < 10$, which are smaller
than 1.3 and quickly converge towards 1 as i increases. The values in the table will
be even smaller if $R > 3r$.
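The children degrees for R = 3r can be recomputed with a short sketch. It assumes that the children degree is the ratio of consecutive tier sizes, as the definition above suggests, and it uses the closed form of (4.5) with the common density factor dropped; the function names are hypothetical.

```python
def tier_size(i, R, r):
    """Quantity proportional to N_i, using the closed form in (4.5) without the density factor."""
    return 2 * R * r + (2 * i - 1) * r * r


def children_degree(i, R, r):
    """Average number of children of a tier-i tag, taken as N_{i+1} / N_i."""
    return tier_size(i + 1, R, r) / tier_size(i, R, r)


# R = 3r with r normalized to 1, for tiers 1 through 9.
degrees = [children_degree(i, R=3.0, r=1.0) for i in range(1, 10)]
```

Under these assumptions every value stays below 1.3 and the sequence decreases towards 1 as the tier number grows.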
The load factor of tier-i tags, denoted as $L_i$, is defined as the average number of
IDs that a tier-i tag has to forward, including the IDs of its tier-$(i+1)$ children as
well as other IDs that its children collect from their descendants. $L_i$ is equal to the
total number of tags beyond the ith tier divided by the number of tags at the ith tier.
\[
L_i = \frac{\sum_{j=i+1}^{l} N_j}{N_i} = \frac{\sum_{j=i+1}^{l} \left(2R + (2j-1)r\right)}{2R + (2i-1)r} = \frac{2(l-i) + \frac{r}{R}\left(l^2 - i^2\right)}{2 + (2i-1)\frac{r}{R}}, \qquad (4.7)
\]
where l is the total number of tiers and $i < l$. When $R = 3r$ and $l = 10$, Table 4.3
shows the values of $L_i$, $1 \le i < 10$, which are surprisingly small. Because tier-1 tags
can be powered by the radio wave from the reader, we are only concerned with the
power consumption of tags at other tiers. The tags at tier 2 have to forward more IDs
than those at outer tiers. From the table, a tier-2 tag forwards just 16 IDs on average,
which is modest overhead, considering that there are eight more tiers beyond tier 2.
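The figure of 16 IDs for a tier-2 tag can be reproduced from (4.7). The sketch below evaluates both the summation form and the closed form; the function names are hypothetical and r is normalized to 1.

```python
def load_factor_sum(i, l, R, r):
    """Load factor of tier i from the summation form of (4.7)."""
    beyond = sum(2 * R + (2 * j - 1) * r for j in range(i + 1, l + 1))
    return beyond / (2 * R + (2 * i - 1) * r)


def load_factor_closed(i, l, R, r):
    """Load factor of tier i from the closed form of (4.7)."""
    x = r / R
    return (2 * (l - i) + x * (l * l - i * i)) / (2 + (2 * i - 1) * x)


# R = 3r and l = 10 tiers, as in the example above.
l2 = load_factor_sum(2, 10, R=3.0, r=1.0)
l2_closed = load_factor_closed(2, 10, R=3.0, r=1.0)
```

Both forms give a load factor of 16 for tier 2, and the load factor shrinks for outer tiers because fewer tags lie beyond them.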
While the average is modest, the worst-case load factor is also important when
we evaluate overhead. SICP is designed to evenly distribute the work load among
tags by balancing the spanning tree, so that tags at a certain tier have similar
numbers of children (or descendants), which translate to similar children degrees
(or load factors). We will study the worst-case children degree and load factor by
simulations.
Recall that each tag has to receive an RQST2 request from the reader before
collecting IDs from its children. The request is forwarded over multiple hops along
the path from the reader to the tag. We first use an example to illustrate the idea of
request aggregation. Consider a subtree in Fig. 4.3 that consists of the reader, T2,
T5, T6, and T7. As shown in the upper half of Fig. 4.8, the request to T5 should be
forwarded over two hops, reader → T2 and T2 → T5. Similarly, the request targeted
to T6 or T7 needs to be forwarded two times as well. Suppose a one-hop transmission
requires one slot. It requires six slots in total to forward all three requests to T5, T6,
and T7. We observe that all three requests must be first forwarded to T2, the common
parent of T5, T6, and T7. The reader → T2 transmission is carried out three times,
which is redundant and unnecessary. We can instead aggregate the three requests
into a single one to avoid redundant transmissions. As shown in the bottom half of
Fig. 4.8, instead of sending separate commands to T5, T6, and T7 individually, the
reader first sends an aggregate request to T2, and T2 then sends requests to the three
children to perform ID collection in sequence. As a result, the total number of slots
for forwarding the requests can be reduced to four. In the aggregate request, the
reader can include the serial numbers of T2’s children such that T2 does not need
to remember which tags are its children. Alternatively, we can slightly modify
SICP by asking each tag to record the starting serial number of its children and
the number of children it has, which can be easily achieved while performing ID
collection. With those two values, each tag can recover all serial numbers of its
children when necessary.
Suppose T is a tier-i tag and it has m children. In the original design of SICP, it
takes $(i+1)$ slots to forward an RQST2 request to a child of T, which is a tier-$(i+1)$
tag. Hence, the total time cost $t_f$ for forwarding a request to every child of T is
\[
t_f = (i+1) \times m. \qquad (4.8)
\]
After applying the technique of request aggregation, only one aggregate request
needs to be sent to T. Therefore, the time cost is reduced to
\[
t_f' = i + m. \qquad (4.9)
\]
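The two cost formulas can be checked against the example of T2 and its three children. The function names below are hypothetical; the formulas are (4.8) and (4.9).

```python
def forwarding_cost_separate(i, m):
    """Slots needed to forward one request per child without aggregation, per (4.8)."""
    return (i + 1) * m


def forwarding_cost_aggregated(i, m):
    """Slots needed with one aggregate request to the parent, per (4.9)."""
    return i + m


# T2 is a tier-1 tag with three children (T5, T6, T7).
separate = forwarding_cost_separate(1, 3)
aggregated = forwarding_cost_aggregated(1, 3)
saving_deep = forwarding_cost_separate(8, 5) - forwarding_cost_aggregated(8, 5)
```

For T2 this gives six slots without aggregation and four with it, matching the example. The saving equals i × (m − 1), so it grows with both the tier number and the number of children.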
After collecting all IDs from its children, a tag forwards the collected IDs to its
parent, which may take multiple slots. Only after receiving all IDs from the tag will
the parent start to forward those received IDs. This process continues until the IDs
are finally delivered to the reader. Consider a tier-i tag. Suppose it has m children
whose IDs are collected, and k IDs can be transmitted in each slot. The time cost $t_b$,
in number of time slots needed to forward the IDs to the reader, is approximately
\[
t_b = i \times \frac{m}{k}. \qquad (4.10)
\]
This completely serialized way of ID delivery is, however, not time-efficient
since only one tag is allowed to transmit at any time. We want to exploit simul-
taneous transmissions among non-interfering tags through spatial channel reuse.
Before introducing the idea of transmission pipelining, we first prove the following
theorem:
Theorem 8. If tag T is an ancestor node but not the direct parent node of tag T′ in
the spanning tree built by SICP, T′ must not be a neighbor of T.
Proof. Proving by contradiction, we assume that T′ is a neighbor of T. Denote the
parent of T′ as $T_p$. Recall that a tag will determine its parent when receiving the first
ID collection request. Let t′ be the time when T′ determines its parent, and t be the time
when T broadcasts the ID collection request for the first time. Because T′ may hear
a request for the first time from another node, we must have
\[
t' \le t. \qquad (4.11)
\]
Let $t_p$ be the time when $T_p$ successfully delivers its ID. Since T is also an ancestor
node of $T_p$, the delivery of $T_p$’s ID must happen after T sends out its ID collection
request. Hence, $t < t_p$. From (4.11), we have
\[
t' < t_p. \qquad (4.12)
\]
On the other hand, T′ can choose $T_p$ as its parent only after $T_p$ has successfully
delivered its ID and broadcast its serial number, which gives
\[
t_p < t', \qquad (4.13)
\]
in contradiction with (4.12). Hence, T′ must not be a neighbor of T.
Fig. 4.9 Transmission pipelining of the IDs collected by node 6. Each arrow represents one transmission of IDs from a node to its parent in a given slot
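Consider the chain of nodes in Fig. 4.9, where node 6 has collected IDs from its
children. Node 6 first transmits one slot of IDs to
its parent, as shown by slot 1 in Fig. 4.9. In the following two slots, only node 5 and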
node 4 forward the received IDs, respectively, while node 6 does not perform any
transmissions to avoid collision. In slot 4, when node 3 forwards the received IDs to
node 2, node 6 will transmit another slot of IDs to node 5. According to Theorem 8,
node 3 is not a neighbor of node 5, and thus its transmission will not interfere with
node 5’s receipt of transmission from node 6. In summary, node 6 can perform one
transmission every three slots in this example. This is also true for other nodes in
the path from node 6 to the reader. The parallel transmissions effectively produce a
transmission pipeline in delivering the IDs to the reader.
We generalize the technique of transmission pipelining for any tag in the system
as follows:
1. After a tier-1 tag collects IDs from its children, the tag will directly deliver the
IDs to the reader in continuous slots without waiting.
2. After a tier-2 tag T collects IDs from its children, the tag performs one ID
transmission every two slots, allowing its parent to forward the received IDs to
the reader in the time slot immediately following each slot when T transmits.
3. After a tag T at tier 3 or higher collects IDs from its children, the tag performs
one ID transmission every three slots to support pipelining. When any tag on the
path from T to the reader receives IDs in a slot, it will transmit the received IDs
in the next slot.
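The three rules can be exercised with a small slot-by-slot simulation of a single path. This is a sketch under simplifying assumptions: only the nodes on the path from the source tag to the reader are modeled, each batch stands for one slot of IDs, and the function name `simulate_pipeline` is hypothetical.

```python
def simulate_pipeline(tier, batches):
    """Simulate forwarding `batches` slots of IDs from a tag at `tier` to the reader.

    Returns (finish_slot, schedule), where schedule[s - 1] lists the tiers
    that transmit in slot s. Tier 0 is the reader.
    """
    period = min(tier, 3)      # rules 1-3: the source sends every 1, 2, or 3 slots
    holding = set()            # relay tiers that received a batch in the previous slot
    sent = delivered = slot = 0
    schedule = []
    while delivered < batches:
        slot += 1
        transmitters = sorted(holding)         # relays forward what they just received
        if sent < batches and (slot - 1) % period == 0:
            transmitters.append(tier)          # the source injects its next batch
            sent += 1
        holding = set()
        for t in transmitters:
            if t == 1:
                delivered += 1                 # a tier-1 tag hands the batch to the reader
            else:
                holding.add(t - 1)             # the parent forwards it in the next slot
        schedule.append(transmitters)
    return slot, schedule


finish, schedule = simulate_pipeline(tier=6, batches=2)
```

For node 6 with two slots of IDs, the simulation has nodes 3 and 6 transmitting together in slot 4, as in the example above, and the last batch reaches the reader in slot 9. Concurrent transmitters are always at least three tiers apart on the path.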
Therefore, the time cost $t_b'$ for a tier-i tag to forward m collected IDs to the reader
with transmission pipelining is approximately
\[
t_b' =
\begin{cases}
i \times \dfrac{m}{k} & \text{if } 1 \le i \le 2, \\
3 \times \left(\dfrac{m}{k} - 1\right) + i & \text{if } i \ge 3.
\end{cases} \qquad (4.14)
\]
It is easy to prove that $t_b' \le t_b$. Transmission pipelining can improve the time
efficiency of ID collection, particularly for tags with large tier numbers.
We denote the SICP with request aggregation and transmission pipelining as p-
SICP in the sequel.
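The gain from pipelining can be read off by comparing (4.10) with (4.14). One way to see the inequality for tiers 3 and beyond is that the difference between the two costs equals (i − 3)(m/k − 1), which is non-negative whenever at least one slot of IDs is sent. The sketch below uses hypothetical function names and example values.

```python
def tb_serial(i, m, k):
    """Approximate slots to deliver m IDs from tier i without pipelining, per (4.10)."""
    return i * m / k


def tb_pipelined(i, m, k):
    """Approximate slots to deliver m IDs from tier i with pipelining, per (4.14)."""
    if i <= 2:
        return i * m / k
    return 3 * (m / k - 1) + i


# A tier-6 tag with 20 collected IDs and 5 IDs per slot.
serial = tb_serial(6, 20, 5)
pipelined = tb_pipelined(6, 20, 5)
```

In this example the serialized delivery takes 24 slots and the pipelined delivery 15, and the gap widens as the tier number increases.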
4.6 Evaluation
There is no prior work on tag identification for networked tag systems.¹ But
known techniques such as broadcast and contention-based transmission widely
used in other wireless systems can be used to design a state-free tag identification
protocol, CICP, which we will use as a benchmark for comparison. We evaluate the
performance of CICP, SICP, and p-SICP to demonstrate three major findings: (1)
although ALOHA-based protocols are very successful in other wireless systems
(including RFID systems), they are not suitable for networked tag systems;
(2) serialization can significantly improve the tag identification performance;
and (3) the techniques of request aggregation and ID-transmission pipelining can
significantly improve the time efficiency of serialized ID collection.
Three performance metrics are used: (1) execution time measured in number of
time slots, (2) average and maximum numbers of bits sent per tag, and (3) average
and maximum numbers of bits received per tag. The last two are indirect measures
of energy cost, where tier-1 tags are excluded because they can be powered by the
reader’s radio waves. Computation by tags in these protocols is very limited. Most
energy is spent on communication. The amount of communication data serves as an
indirect means to compare different protocols. For example, if tags in one protocol
receive and send far more than those in another protocol, it is safe to say that the
first protocol costs more energy than the second.
We vary the number N of tags in the system from 1000 to 10,000 at steps of 1000.
The tags are randomly distributed in a circular area with a radius of 50 m unless an
explicit parameter is specified. The reader, whose communication range R is set to
25 m, is located at the center of the area. For each tag, its inter-tag communication
range r is 5 m. In SICP and p-SICP, the reader sets its frame size of the ith request
to $f_i = \max\{\hat{n} - m_{i-1}, f_l\}$, where $m_{i-1}$ is the number of IDs that have been collected
and $\hat{n}$ is the estimated number of tags that maximizes (4.4). The lower bound $f_l$, fixed
to 50, prevents the frame size from being set too small or even negative due to
the estimation deviation of $\hat{n}$. The initial frame size is 50 for the reader. To relieve
the tags from estimating the numbers of children they have, we let them use a fixed
frame size with a default value of 4, but we will also vary it from 2 to 10. Each
tag ID is 96 bits long, each serial number is $\lceil \log_2 N \rceil$
bits long, and each tier number is 4 bits long. Following the specification
of the EPCglobal Class-1 Gen-2 standard [1], we set the length of any type of
request to 20 bits, and set ACK and NAK to 16 bits and 8 bits, respectively. In
SICP and p-SICP, the ACK will also include a serial number. For each data point in
the figures, we repeat the simulation 100 times and present the average result.

¹ For the special case when all networked tags are within the coverage of the reader, our protocols
naturally become the traditional protocols, literally, because we may actually adopt any existing
ALOHA-based RFID identification protocol for collecting IDs within the reader’s neighborhood in
place of the operations described in Sect. 4.4.4, as long as the serial number is embedded in ACK.
We first examine the balance of the spanning trees built by CICP and SICP (the
spanning tree in p-SICP is built in the same way as in SICP). This balance has a significant impact
on the worst-case energy cost of the tags. A tag with a larger children degree
(or a larger load factor) has to collect (or forward) more tag IDs, resulting in
additional energy expenditure. Tags that have the largest children degree or load
factor may become the energy bottleneck in the network. If the residual on-tag
energy is exhausted before the completion of the protocol, the network may even
be partitioned due to dead tags.
Figures 4.10 and 4.11 present the maximum children degree and the maximum
load factor in the spanning trees built by CICP and SICP, respectively. As the
number N of tags in the system becomes larger, these worst-case
numbers increase much faster under CICP than under SICP, indicating a much
more balanced tree for the latter. For example, when N = 10,000, the maximum children
degree and load factor in CICP are 83 and 1969, whereas those numbers in SICP are
only 14 and 165.
We compare the performance of CICP, SICP, and p-SICP in Fig. 4.12, where the
first plot shows the protocol execution time in terms of number of slots used, the
second plot shows the average number of bits sent per tag, and the third plot shows
the average number of bits received per tag. SICP uses a number of
slots comparable to that of CICP, while p-SICP needs a much smaller number of slots than SICP,
as we expect. The energy costs of SICP and p-SICP are very close. Both are
much smaller than that of CICP, thanks to serialization for collision reduction. For
example, when N = 10,000, the numbers of bits sent/received per tag in CICP
are 8783 and 412,218, whereas those numbers are just 862 and 54,871 for SICP,
which represent 90.2 % and 86.7 % reductions over CICP, respectively. Because of the
request aggregation technique, the average number of bits sent per tag in p-SICP is
slightly smaller than that of SICP. But each tag in p-SICP receives slightly more bits
on average than in SICP. The reason is that a tag in SICP can inform its non-parental
neighbors to sleep for a certain duration without receiving the IDs unnecessarily; in
contrast, the transmission pipelining of p-SICP requires every neighbor to receive
what a tag transmits in each slot. For p-SICP, the numbers of bits sent/received
per tag are 634 and 63,716, which represent 92.8 % and 84.5 % reductions over CICP,
respectively.
Figure 4.13 shows the maximum numbers of bits sent/received by a tag under the
three protocols, respectively. As expected, the most energy-consuming tags spend
much less energy under SICP and p-SICP than under CICP. For example, when N =
10,000, the maximum numbers of bits sent/received by any tag in CICP are 631,412
and 2,367,899, and those numbers in SICP are 38,273 and 159,431, which are 93.9 % and
93.3 % reductions, respectively.
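The reduction percentages quoted for N = 10,000 follow directly from the reported bit counts. The sketch below recomputes them; the helper name `reduction_percent` is hypothetical, and the inputs are the numbers stated in the text.

```python
def reduction_percent(baseline, value):
    """Percentage by which `value` is smaller than `baseline`, rounded to one decimal."""
    return round(100 * (1 - value / baseline), 1)


# Average bits sent / received per tag, relative to CICP.
sicp_avg = (reduction_percent(8783, 862), reduction_percent(412218, 54871))
psicp_avg = (reduction_percent(8783, 634), reduction_percent(412218, 63716))

# Maximum bits sent / received by any tag, SICP relative to CICP.
sicp_max = (reduction_percent(631412, 38273), reduction_percent(2367899, 159431))
```

These reproduce the stated reductions of 90.2 % and 86.7 % for SICP, 92.8 % and 84.5 % for p-SICP, and 93.9 % and 93.3 % for the worst-case tags under SICP.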
[Figure: comparison between CICP, SICP, and p-SICP versus the number of tags, N (×10³)]
The analysis in Sect. 4.5 demonstrates that the techniques of request aggregation
and transmission pipelining can reduce the time of ID collection, particularly for
tags with large tier numbers. We use simulations to verify this conclusion. We vary
the radius of the circular area, where 5000 networked tags are randomly distributed,
from 50 to 100 m at steps of 10 m. A larger radius of the distribution area means
there are more tiers in the system, resulting in a larger height of the spanning tree.
Figure 4.15 compares the execution time of SICP and p-SICP. We can see that
the gap between the execution time of SICP and p-SICP becomes larger with the
increase of the radius. When the radius is large, p-SICP cuts the execution time of
SICP by more than half.
4.7 Summary
This chapter discusses the tag identification problem in the emerging networked
tag systems. The multihop nature of networked tag systems makes this problem
different from the tag identification problem in RFID systems. Two tag identification
protocols are designed with three important findings. The first finding is that
the traditional contention-based protocol design incurs too much energy overhead
in networked tag systems due to excessive collision. The second finding is that
load imbalance causes large worst-case energy cost to the tags. We address these
problems through serialization and probabilistic parent selection based on serial
numbers. The third finding is that the techniques of request aggregation and ID-
transmission pipelining can significantly improve the time efficiency of serialized
ID collection.
[Fig. 4.15: execution time of SICP and p-SICP versus the radius of the circular area (m)]
References
1. EPC Radio-Frequency Identity Protocols Class-1 Gen-2 UHF RFID Protocol for Communications
at 860MHz-960MHz, EPCglobal (2011). Available at http://www.epcglobalinc.org/uhfclg2
2. Gorlatova, M., Margolies, R., Sarik, J., Stanje, G., Zhu, J., Vigraham, B., Szczodrak, M.,
Carloni, L., Kinget, P., Kymissis, I., Zussman, G.: Prototyping energy harvesting active
networked tags (EnHANTs). In: Proceedings of IEEE INFOCOM Mini-Conference (2013)
3. Han, H., Sheng, B., Tan, C.C., Li, Q., Mao, W., Lu, S.: Counting RFID tags efficiently and
anonymously. In: Proceedings of IEEE INFOCOM (2010)
4. Johnson, N.L., Kotz, S.: Urn Models and Their Application: An Approach to Modern Discrete
Probability Theory. Wiley, New York (1977)
5. Kinget, P., Kymissis, I., Rubenstein, D., Wang, X., Zussman, G.: Energy harvesting active
networked tags (EnHANTs) for ubiquitous object networking. IEEE Trans. Wirel. Commun.
17(6), 18–25 (2010)
6. Kodialam, M., Nandagopal, T.: Fast and reliable estimation schemes in RFID systems. In:
Proceedings of ACM MobiCom (2006)
7. Kong, L., He, L., Gu, Y., Wu, M., He, T.: A parallel identification protocol for RFID systems.
In: Proceedings of IEEE INFOCOM, pp. 154–162 (2014)
8. Luo, W., Qiao, Y., Chen, S.: An efficient protocol for RFID multigroup threshold-based
classification. In: Proceedings of IEEE INFOCOM, pp. 890–898 (2013)
9. Myung, J., Lee, W.: Adaptive splitting protocols for RFID tag collision arbitration. In:
Proceedings of ACM MobiHoc (2006)
10. Nguyen, C.T., Hayashi, K., Kaneko, M., Popovski, P., Sakai, H.: Probabilistic dynamic framed
slotted ALOHA for RFID tag identification. Wirel. Pers. Commun. 71, 2947–2963 (2013)
11. Onat, I., Miri, A.: A tag count estimation algorithm for dynamic framed ALOHA based RFID
MAC protocols. In: Proceedings of IEEE ICC, pp. 1–5 (2011)
12. Qian, C., Liu, Y., Ngan, H., Ni, L.M.: ASAP: scalable identification and counting for
contactless RFID systems. In: Proceedings of IEEE ICDCS (2010)
13. Shahzad, M., Liu, A.: Every bit counts - fast and scalable RFID estimation. In: Proceedings of
ACM MobiCom (2012)
14. Shahzad, M., Liu, A.X.: Probabilistic optimal tree hopping for RFID identification. In:
Proceedings of ACM SIGMETRICS, pp. 293–304 (2013)
15. Sheng, B., Li, Q., Mao, W.: Efficient continuous scanning in RFID systems. In: Proceedings
of IEEE INFOCOM (2010)
16. Zhang, M., Li, T., Chen, S., Li, B.: Using analog network coding to improve the RFID reading
throughput. In: Proceedings of IEEE ICDCS (2010)