Optimizing Offline Access To Social Network Content On Mobile Devices
I. INTRODUCTION
The phenomenal popularity of social networks, such as Facebook, Twitter, LinkedIn, Google+, and Instagram, has changed the way people interact today. Indeed, many people rely on these social networks to communicate with their friends, family, and community on a day-to-day basis. The ability to continue these interactions seamlessly anytime, anywhere is quickly becoming commonplace, and users of modern mobile devices expect not just to access social networks but also to exchange rich media content, such as video, audio, and images, for an enhanced user experience. It is reported that 93% of Android smartphone users in India use social networks on their smartphones [1], and often this is the reason why they purchase smartphones in the first place. In North America, a recent IDC report on smartphone users indicates that 70% of them access Facebook via smartphones and, more strikingly, that 40% of users feel connected when using Facebook, trailing only voice calls (43%) and texting (49%) [2]. In fact, the main finding of the IDC report is mobile+social=connectedness, i.e., people feel isolated without mobile access to social networks.
To ensure this constant connectedness, mobile users subscribe to (and pay for) 3G/4G data plans that are often expensive and do not work for a host of reasons, including: (a) wireless network availability is sporadic (accessibility of WiFi access points, unpredictable data rates in 3G networks), (b) mobile
1 This research has been supported in part by NSF grants 1063596, 1059436,
1352727 and 1143705.
Figure 1.
Figure 2.
$$\bar{\lambda} \triangleq \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{\lambda(\tau)\}. \tag{1}$$
Content departure and dropping. In every time slot, the scheduling algorithm chooses a subset of the fetched contents to deliver to the user. Let $r(t)$ be the set of contents scheduled to be delivered in time slot $t$, and $\mu(t)$ be the total delivered data amount, i.e., $\mu(t) = \sum_{c \in r(t)} s_c$. We assume $\mu(t) \le \mu_{max}\ \forall t$. It is not hard to see that $\mu(t)$ is limited by the mobile device's downlink bandwidth at $t$, defined as $\omega(t)$, i.e., $\mu(t) \le \omega(t)$. We also assume $\omega(t) \le \omega_{max}\ \forall t$. The time averaged data amount delivered to the user is defined as:
$$\bar{\mu} \triangleq \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{\mu(\tau)\}. \tag{2}$$
Some fetched contents are dropped by the proxy due to low viewing probability or large size (hence, high energy consumption). Let $d(t)$ and $\delta(t)$ be the set of dropped contents and the total dropped data amount in time slot $t$, i.e., $\delta(t) = \sum_{c \in d(t)} s_c$. The time averaged data amount dropped from the fetched content list is:
$$\bar{\delta} \triangleq \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{\delta(\tau)\}. \tag{3}$$
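As a rough illustration of the bookkeeping behind these definitions, the following sketch computes the per-slot delivered and dropped data amounts as sums of content sizes and replaces the limit of expectations with a finite-horizon average. The content ids, sizes, schedules, and bandwidth values are hypothetical, and the helper names are not taken from the paper.

```python
def slot_amount(sizes, chosen):
    """Total data amount of the chosen content ids: the sum of s_c over the set."""
    return sum(sizes[c] for c in chosen)


def empirical_time_average(per_slot_amounts):
    """Finite-horizon stand-in for lim (1/t) * sum_{tau=0}^{t-1} E{x(tau)}."""
    if not per_slot_amounts:
        return 0.0
    return sum(per_slot_amounts) / len(per_slot_amounts)


# Hypothetical example: three time slots, three contents.
sizes = {"a": 2.0, "b": 3.0, "c": 5.0}      # content sizes s_c
delivered_sets = [{"a"}, {"b"}, set()]      # contents delivered in each slot
dropped_sets = [set(), {"c"}, set()]        # contents dropped in each slot
bandwidth = [4.0, 3.0, 1.0]                 # downlink bandwidth per slot

delivered = [slot_amount(sizes, r) for r in delivered_sets]
dropped = [slot_amount(sizes, d) for d in dropped_sets]

# The delivered amount in a slot may not exceed that slot's bandwidth.
assert all(m <= w for m, w in zip(delivered, bandwidth))

avg_delivered = empirical_time_average(delivered)
avg_dropped = empirical_time_average(dropped)
```

With a long enough trace, such averages approximate the time averaged quantities used in the problem formulation.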
Energy budget and consumption. In our system, each user sets an energy threshold $e_t$, and nothing is prefetched if the current energy level, denoted as $e(t)$, is less than $e_t$. Moreover, we allow mobile users to set an energy budget for prefetching contents, in order to preserve battery for other daily activities, such as making phone calls and browsing Web pages. The energy budget is set for user-specified time periods, e.g., a user may set the energy budget to 10% of the current energy for the period between 8 a.m. and 6 p.m., and to 30% between 6 p.m. and 8 a.m. In particular, we let $t_s$ and $t_e$ be the starting and ending times of a period. Assume period $[t_s, t_e]$ consists of a set of time slots $\{t_0, t_1, \ldots\}$. Let us define $e_a$ as the available energy amount at the device at $t_s$, and $e_u$ as the usable energy percentage for the system to download social contents in the
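$$\begin{aligned}
\max:\quad & \bar{U} = \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{U_r(\tau) - U_d(\tau)\}; && \text{(6a)} \\
\text{s.t.:}\quad & \bar{\lambda} \le \bar{\mu} + \bar{\delta}; && \text{(6b)} \\
& \cdots \le \cdots; && \text{(6c)} \\
& \sum_{c \in r(t),\, r(t) \subseteq f(t)} s_c \le \omega(t); && \text{(6d)} \\
& \sum_{c \in r(t),\, r(t) \subseteq f(t)} \cdots && \text{(6e)}
\end{aligned}$$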
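$$\begin{aligned}
\max:\quad & \sum_{c \in r(t),\, r(t) \subseteq f(t)} r_1(c,t) + \sum_{c \in d(t),\, d(t) \subseteq f(t)} d_1(c,t) \\
\text{s.t.:}\quad & \sum_{c \in r(t),\, r(t) \subseteq f(t)} s_c \le \omega(t); && \text{(15)} \\
& \cdots
\end{aligned}$$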
r1 (c, t) = V Uc (t) + Q(t)sc +
d1 (c, t) = Q(t)sc V Uc (t).
(P (t) )c (t);
(16)
$$\bar{U} = \lim_{T\to\infty} \frac{1}{T} \sum_{\tau=0}^{T-1} E\{U_r(\tau) - U_d(\tau)\} \ge U^{*} - \frac{B}{V}, \tag{17}$$
where $B$ is a constant defined in Lemma 1.
The theorem shows that by choosing an arbitrarily large $V$, the achieved utility is arbitrarily close to the maximum value.
Theorem 2 (Real queue Q upper bound): Our algorithm restricts the storage limit for real queue Q as:
Q(t) V + max t.
(18)
3 The
$$\text{s.t.:}\quad \sum_{c \in r(t),\, r(t) \subseteq f(t)} s_c \le \omega(t). \tag{20}$$
1) For each $c \in f(t)$, calculate $w_c = \frac{r_1(c,t) - [d_1(c,t)]^+}{s_c}$.
2) Add $c$ in $f(t)$ to $r(t)$ in decreasing order of $w_c$ if $w_c > 0$ until $\sum_{c \in r(t)} s_c \le \omega(t)$ does not hold.
Step 2: Select $d(t)$ from the set $f(t) \setminus r(t)$ by iterating through each content $c$ in $f(t) \setminus r(t)$, and add $c$ to $d(t)$ if $d_1(c,t) > 0$.
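The two selection steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the per-content scores `r1` and `d1` are taken as given inputs, the weight of a content is assumed to be its delivery score minus the positive part of its dropping score, divided by its size, and "until the bandwidth constraint does not hold" is read as stopping at the first content that would exceed the bandwidth. All names and numbers are hypothetical.

```python
def positive_part(x):
    """[x]^+ = max(x, 0)."""
    return x if x > 0 else 0.0


def select_contents(fetched, r1, d1, size, bandwidth):
    """Greedy sketch of the two-step selection.

    fetched:   list of content ids in the fetched list
    r1, d1:    dicts mapping content id -> delivery / dropping score
    size:      dict mapping content id -> size s_c
    bandwidth: downlink bandwidth available in this time slot
    Returns (deliver, drop) as lists of content ids.
    """
    # Step 1: rank contents by weight (assumed form, see lead-in).
    weight = {c: (r1[c] - positive_part(d1[c])) / size[c] for c in fetched}
    deliver, used = [], 0.0
    for c in sorted(fetched, key=lambda c: weight[c], reverse=True):
        if weight[c] <= 0:
            break  # remaining contents have non-positive weight
        if used + size[c] > bandwidth:
            break  # bandwidth constraint would no longer hold
        deliver.append(c)
        used += size[c]

    # Step 2: among the contents not delivered, drop those with d1 > 0.
    chosen = set(deliver)
    drop = [c for c in fetched if c not in chosen and d1[c] > 0]
    return deliver, drop


# Hypothetical example with four fetched contents and bandwidth 5.
fetched = ["a", "b", "c", "d"]
size = {"a": 2.0, "b": 3.0, "c": 4.0, "d": 1.0}
r1 = {"a": 10.0, "b": 6.0, "c": 8.0, "d": -1.0}
d1 = {"a": -1.0, "b": 2.0, "c": -3.0, "d": 0.5}
deliver, drop = select_contents(fetched, r1, d1, size, bandwidth=5.0)
```

In this example only content a fits before the bandwidth is exhausted; b and d are dropped because their dropping scores are positive, and c stays in the queue for a later slot.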
is equivalent to the following problem:
$$\begin{aligned}
\max:\quad & \sum_{c \in r(t),\, r(t) \subseteq f(t)} \left( r_1(c,t) - [d_1(c,t)]^+ \right) + \sum_{c \in f(t)} [d_1(c,t)]^+ && \text{(21)} \\
\text{s.t.:}\quad & \sum_{c \in r(t),\, r(t) \subseteq f(t)} s_c \le \omega(t).
\end{aligned}$$
Notice that in the new objective function, $\sum_{c \in f(t)} [d_1(c,t)]^+$ is independent of $r(t)$. This allows us to select the contents for delivery, $r(t)$, first, and then derive the contents for dropping, $d(t)$. We solve the problem in Eq. (21) in two steps, as presented in Algorithm 2. The following theorems analyze the performance of the near-optimal algorithm.
Lemma 2: Using Algorithm 2 for Content Selection achieves
a bound on 1 (t):
1 opt
(22)
apx
1 (t) 1 (t)
2
Theorem 4 (Utility bound): By using Algorithm 2 for Content Selection, the system achieves the following bound on the
long term utility:
apx 1 U apx .
(23)
U
2
V
We note that using Algorithm 2 achieves the same bounds on
the real queue and virtual queue as presented in Theorems 2
and 3.
Figure 3.
Table I
ENERGY CONSUMPTION (J) AT DEVICES FOR OFFLINE ACCESS TO FACEBOOK IN 30 HOURS
Approach                WiFi            Cellular
Without Broker/Proxy    970.9 (6.9X)    16,783.4 (9.1X)
With Broker/Proxy       140.2           1,843.1
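The multipliers shown in parentheses in Table I can be checked directly from the reported energy values. The short sketch below recomputes them as the ratio of energy consumed without the broker/proxy to energy consumed with it; the dictionary layout and key names are illustrative only.

```python
# Energy consumption in joules over 30 hours, as reported in Table I.
energy_j = {
    "WiFi": {"without": 970.9, "with": 140.2},
    "Cellular": {"without": 16783.4, "with": 1843.1},
}

# Ratio of energy without the broker/proxy to energy with it, to one decimal.
ratios = {
    net: round(vals["without"] / vals["with"], 1)
    for net, vals in energy_j.items()
}
```

The computed ratios agree with the 6.9X (WiFi) and 9.1X (cellular) figures in the table.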
B. Trace-driven Simulations
1) Dataset: To evaluate the performance of our proposed algorithm, we conducted trace-driven simulations using Facebook user data collected through our Android app [17]. We released our app to 10 users in America, Asia, and Europe, and collected their traces between May 15 and 24, 2013. With their consent, we logged information on their social attributes (e.g., Facebook friends and groups), news feed attributes (e.g., author, creation time, download time, and size), device resources (e.g., battery and network conditions), and user-app interactions (e.g., timestamps of clicks to view, comment on, and like a news feed). To protect users' privacy, all Facebook objects, such as users, friends, groups, and news feeds, are one-way encoded so that they cannot be traced back to the original users and contents. Our dataset includes 12,596 multimedia contents for all 10 users. The maximum number of contents for a user is 3,919, and the minimum number is 130. A user has 310 friends on average. The maximum number of friends is 503 and the minimum is 75. Each user clicks to view contents 693 times on average. The maximum ratio between the number of clicks and the number of contents is 95%, and the minimum ratio is 17%.
2) Settings: We built a Java-based simulator driven by the collected traces: the feed stream trace was used to model Facebook's content arrival, while the network trace was used to model network conditions for the simulation. Energy consumption was simulated using PowerTutor's models [18]. An optimization solver [16] was used to solve the optimization problem in the Content Selection procedure in Algorithm 1. We run the
in Section III-C.
An interesting result in Figs. 4 and 5 is that the two algorithms, OPT and NOPT, achieve very close utility and delivery ratio (our experiments also indicate that they have very similar results for the other metrics). For example, at V = 100, NOPT achieves a utility of 79.14 while OPT has a slightly higher value, 80.94. The largest utility gap between the two algorithms is only 4.36 across all values of V. The key difference between these algorithms is the running time. Using the optimization solver CPLEX to solve the OPT problems leads to a worst-case running time of 2.4 s in our experiments, whereas the worst-case running time for NOPT is only 78 ms. This indicates that NOPT runs much faster than OPT while achieving very close performance. We thus suggest using NOPT for practical deployment, and from now on, we do not report experiment results for OPT.
Although a larger V leads to higher utility, it also leads to a larger queue delay. For example, in Fig. 6 at $V = 10$, the queue delay for NOPT is 0.5 hours, while at $V = 10^6$ the delay is more than 8 hours. One reason for the high delay is that we set the energy threshold high (at the average energy level). Nevertheless, the trend is clear: with a larger V, more contents in the queue are considered for delivery to mobile devices. This observation is in line with our analysis: queue size and queueing delay are strongly dependent on V. Transferring a larger number of contents causes a high queue delay. Note that in the social network context, queue delay in some sense indicates user satisfaction: how soon users can view social content updates from their friends.
We report effective energy as a function of V in Fig. 7. The lower the effective energy, the better the system works, because more of the energy is spent delivering the right contents. Fig. 7 demonstrates that as V increases, the effective energy first decreases and then increases. For example, at $V = 10^1$, the effective energy is 18 J per content; at $V = 10^4$, it decreases to 11 J per content; it increases back to 30 J per content when $V = 10^6$. This trend can be explained as follows. With a small V, queue size dominates content selection for delivery and dropping, and viewing probability does not have much impact on decisions. Thus, the effective energy is relatively high at small V. With a medium V, the viewing probability plays a larger role, and more of the delivered contents are ones the user clicks to view, which contributes to achieving low effective energy. With a large V, the system delivers more contents than necessary if network and energy conditions permit. Therefore, more contents that the user never clicks to view are delivered, which leads to higher effective energy.
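To make the trend concrete, the sketch below computes effective energy under one plausible reading of the metric: total energy spent on deliveries divided by the number of delivered contents that the user actually viewed. This reading, the function name, and the sample numbers are assumptions for illustration and are not taken from the traces.

```python
def effective_energy(deliveries):
    """Energy per viewed content (J per content), under the assumed definition.

    deliveries: list of (energy_joules, was_viewed) pairs, one per delivered content.
    """
    total_energy = sum(energy for energy, _ in deliveries)
    viewed = sum(1 for _, was_viewed in deliveries if was_viewed)
    if viewed == 0:
        return float("inf")  # energy was spent but nothing useful was delivered
    return total_energy / viewed


# Hypothetical schedules: a selective one and an over-eager one.
selective = [(10.0, True), (12.0, True)]
over_eager = [(10.0, True), (12.0, True), (20.0, False), (18.0, False)]

selective_cost = effective_energy(selective)
over_eager_cost = effective_energy(over_eager)
```

Delivering two extra contents that are never viewed leaves the number of useful deliveries unchanged while raising the energy bill, which is the effect described for very large V.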
Compared with other strategies. Fig. 8 shows the effective
energy to download contents viewed by users using each
algorithm under two energy thresholds. For fairness, we set
parameter = 10000 to prevent our algorithm from dropping
contents so that all algorithms download all contents (with
different schedules) under the same energy and bandwidth
constraints in the simulations. The proposed algorithms exhibit better energy efficiency than the baseline algorithms: a difference of 15% to 30% is observed.
Fig. 9 presents the hit late ratio of the downloaded contents,
which indicates the timeliness of content downloads that are
Figure 4. Utility. (Curves: NOPT, OPT.)
Figure 5. Delivery ratio. (Curves: NOPT, OPT.)
Figure 6. Queue delay.
Figure 7. Effective energy.
Figure 8. Effective energy. (Algorithms: Rand, Size, Prob, NOPT.)
Figure 9. (Algorithms: Rand, Size, Prob, NOPT.)

Table II
VIEWING PREDICTION ACCURACY (AUC) IN % OF EACH USER. THE AVERAGE ACCURACY IS 72%.
User    1    2    3    4    5    6    7    8    9    10
AUC     69   77   68   82   77   65   62   70   67   81
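The 72% average stated in the caption of Table II can be reproduced from the per-user values. In the sketch below, the pairing of user ids with AUC values is read from the table as it survives in this copy and should be treated as an assumption.

```python
# Per-user viewing prediction accuracy (AUC, in %), as read from Table II.
auc_percent = {1: 69, 2: 77, 3: 68, 4: 82, 5: 77,
               6: 65, 7: 62, 8: 70, 9: 67, 10: 81}

# Arithmetic mean over the 10 users.
mean_auc = sum(auc_percent.values()) / len(auc_percent)
rounded_mean_auc = round(mean_auc)
```

The unrounded mean is 71.8%, which rounds to the 72% reported in the caption.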