Mining Smartphone Data (2017)
Hong Cao
McLaren Applied Technologies, APAC
Miao Lin
Institute for Infocomm Research, A*STAR.
1 Fusionopolis Way #21-01, Connexis (South Tower), Singapore 138632
1.1. Background and Motivations
Understanding app usage behaviors is important since it can benefit both smartphone users and app makers. For smartphone users, existing studies show that the number of apps used on each user's smartphone ranges from 10 to more than 90, with a median of about 50 [2]. This large number makes finding a specific app on one's smartphone far from straightforward. As a fundamental interaction that occurs with high frequency, app launching should be made simple and efficient for an individual to perform. Also, for some apps such as news apps, it can annoyingly take a while to download the latest content after they are opened. With knowledge of the most likely next app to be used, both battery energy consumption and app searching time can be planned in advance and optimized. With the fast proliferation of apps, there is also an immediate need to recommend the right apps to the right users at the right time in an intelligent and automated manner. In the app market, one can typically find app information including the app description, the recent number of downloads, review ratings, comments and some introductory materials in the form of short videos and figures. Even though app developers can publicize their app information in the market, this information is static and requires users to proactively check the marketplace on a periodic basis. Searching for the right app to install is thus a non-trivial task, particularly given the extremely large pool of apps. Therefore, it is necessary to design an app recommendation mechanism that understands users' underlying app usage patterns and preferences.
In addition to app usage prediction and recommendation, discovering one's smartphone usage patterns also helps interpret one's mobility behavior and profile personal traits [3]. For instance, social context can be deduced from temporal app usage patterns [4]. The works in [1, 5, 6] interpreted users' mobility using location data, and the work in [7] studied the correlation between various app-seeking behaviors and mobility patterns. Other studies [8, 9, 10, 11, 12, 13] in social science also showed that smartphone usage patterns can be applied to differentiate personality traits [3], namely extraversion, agreeableness, conscientiousness, neuroticism, and openness to new experience.
In view of the importance of uncovering useful patterns from smartphone usage and mobile data, a few relevant competitions have been organized in recent years. In 2012, Nokia Research Center and its academic partners set up the Mobile Data Challenge (MDC) based on smartphone data collected from 200 individuals over a one-year period4. The dedicated challenge tasks of MDC included place semantics prediction, next-place prediction and demographic prediction. MDC also set up an open challenge inviting innovative applications that make use of the collected smartphone data. Other than MDC, a series of challenges named Data for Development (D4D) was organized by the telecom operator Orange in conjunction with the NetMob conference5. In these challenges, telecommunication (telco) data, including records of the cell towers that phones connect to and some auxiliary information, were released. An example task was to uncover the correlation between local residents' mobility patterns and their social networks. In 2014, another competition was organized based on large-scale smartphone usage data collected from more than 17,000 individuals globally using an Android app in Google Play, named Device Analyzer [14]. These users installed the app either to voluntarily contribute their data for research or to keep track of their own phone usage. These competitions contributed significantly to the smartphone research community, not only by providing comprehensive data sets, but also by creating awareness and unlocking new opportunities in this emerging research area.

4 https://research.nokia.com/page/12000
5 http://perso.uclouvain.be/vincent.blondel/netmob/2013/
Beyond these competitions, there have also been recent studies addressing various other smartphone usage mining issues in areas such as ubiquitous computing, pervasive computing, mobile computing and machine learning. These studies include, but are not limited to, the following topics: analyzing the impact of publishing time on an app's popularity [15], characterizing the app market [16, 17], comparing smartphone usage among different socio-economic groups [18, 19], understanding battery consumption6 [20, 21, 22, 23, 24, 25, 26], understanding human activities [27, 28, 29, 30, 31, 32, 33, 34, 35], tracking atmospheric pressure using the smartphone's barometer [36], profiling individuals based on smartphone usage patterns [37], and designing intelligent user notification mechanisms [38, 39].

6 http://carat.cs.berkeley.edu/
1.2. Scope
[Figure 1 is a taxonomy diagram organizing the survey along five dimensions: data sets and platforms (data: app logs, GPS, cell tower ID, WiFi, Bluetooth, settings, battery, charging, sensory data; platforms: Android, iOS, Nokia, Windows, BlackBerry); representation (common statistics such as time statistics, usage sessions and interaction time; explicit features such as location, time-based, phone setting and phone status; implicit features such as app sequences and transition probabilities); applications (app prediction, app recommendations, and others including app classification, user retrieval and pattern discovery); modelling (Markovian, Bayesian, topic modelling, maximum entropy, cluster-based, similarity-based, context-based, user-preference, time-based, graph-based, dynamic and privacy modelling); and design considerations and challenges (personalization, data sparseness, battery consumption, performance constraints, phone-based versus cloud-based deployment, anonymisation; a unified benchmarking data set, privacy preservation, energy efficiency, fraudulent use detection, and the transition from the syntactic to the semantic level).]

Figure 1: Overview of our survey, where app prediction and recommendations are our focused application topics.
Table 1: Comparison of data sets used in app usage pattern-mining studies.

Reference           | Software | Platform          | #User    | Duration       | Description
Falaki 2010 [2]     | Logger   | Android / Windows | 33 / 222 | 7-28 weeks     | Screen status, apps used, network traffic status, battery status
Do 2011 [58]        | Client   | Nokia             | 77       | 9 months       | App record, location data, bluetooth data
Verkasalo 2009 [59] | Client   | Nokia             | 324      | ≥3 weeks       | Cell ID, voice call, multimedia app usage, network data
Do 2010 [60]        | Client   | Nokia             | 111      | 1-8 months     | App usage time
Kang 2011 [61]      | App      | Android           | 20       | 2 months       | Voice call, screen status, network data, WiFi, battery status
Bohmer 2011 [62]    | App      | Android           | 4100     | 5 months       | Location, time, app usage log
Wagner 2014 [63]    | App      | Android           | 17000    | >1 year        | Detailed app usage information
Shepard 2011 [64]   | Logger   | iOS               | 25       | 1 year         | iPhone usage log, network utilization
Oliver 2010 [65]    | App      | BlackBerry        | 17300    | 17 days / user | Interactions, diurnal patterns
We summarize in Table 1 the major data sets that have been used to support current smartphone usage data mining studies. These data sets were collected from different mobile platforms. Among the platforms, iOS was reported to be the most difficult one for data collection, as "jailbreaking" iOS devices was typically required [58] before a data acquisition client could be installed. For this reason, the number of iOS devices sampled is much smaller than the numbers associated with other major platforms. For Nokia's platform, the data collection tool was often called a "client". During data collection, volunteers were either requested to install the client or were given Nokia smartphones with the data collection client pre-installed. Comparatively, it was fairly popular to deploy data collection apps to Android or BlackBerry end users. A few successful examples were AppSensor [59], Device Analyzer [14] and the app in [60], which were utilized to collect smartphone data from more than 4000, 17000 and 17300 devices, respectively.
To collect smartphone data over a long period, the data collection app was typically designed to run quietly in the background without impacting the phone's performance or the user's experience. For this purpose, a few challenges in designing such apps were discussed. The foremost issue was how to efficiently collect the data with minimal additional battery consumption. In [61], Do et al. used a state-machine approach that automatically adjusts the sampling frequency according to the user's current context. For instance, when the app detected that its user was staying indoors, the GPS sensor could be set to a low sampling rate; conversely, the GPS sampling rate could be raised in an outdoor context. The works in [14, 62] suggested collecting data through an event-driven approach, i.e. data are recorded only when there is a change of context. Besides app usage data, other types of data such as nearby WiFi access points, cell towers and the battery level were also sampled. With the above-mentioned measures, it was reported that the battery consumption for data collection can be kept as low as 2% to 3% for the majority of devices [14]. Other than the energy efficiency aspect, additional issues addressed include data privacy, bandwidth constraints, and event synchronization.
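As an illustration of the two ideas above, the sketch below combines a toy context-driven sampling-rate switch with event-driven logging; the states, thresholds and rates are invented placeholders, not the schemes actually deployed in [61, 14, 62].

```python
import time

# Hypothetical sampling periods: seconds between GPS fixes per context.
SAMPLING_PERIOD = {"indoor": 300.0, "outdoor": 30.0}

def detect_context(wifi_ap_count: int) -> str:
    """Toy context detector: many visible WiFi APs suggests an indoor setting."""
    return "indoor" if wifi_ap_count >= 3 else "outdoor"

def log_if_changed(last: dict, current: dict, sink: list) -> dict:
    """Event-driven logging: write a record only when some context field changes."""
    if current != last:
        sink.append({"t": time.time(), **current})
    return current

# The sampling period lengthens indoors (saving battery), and a record is
# written only on a context change.
records, last = [], {}
for ap_count in [5, 5, 1, 1, 4]:
    ctx = detect_context(ap_count)
    last = log_if_changed(last, {"context": ctx}, records)
    # A real client would now sleep SAMPLING_PERIOD[ctx] before the next fix.
print(records)
```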
Despite the differences among the data sets used, some common usage statistics are often quantified and reported. These include the duration of usage sessions and the interaction time, the spatial-temporal correlation of app usage, and the app usage dependencies.
Commonly reported statistics are associated with the usage sessions and the interaction time. A usage session was defined as a continuous smartphone usage period without any interruption [2], and interaction time was defined as the total accumulated time of the usage sessions in a fixed period, for example one day. The daily interaction time was found to range typically from a few minutes to more than 500 minutes, with 90% of users falling in the range of 20 to 100 minutes [2]. Similar results were reported elsewhere: the daily average usage time was 123 minutes in [62], 117 minutes in [63], and 74 minutes in [64]. The work in [59] used a different definition of a usage session, i.e. "the usage sequence of the apps without the device being in the standby mode for more than 30 seconds". Based on this, the reported average daily usage time was only 59 minutes. From the above studies, the daily interaction time was typically constituted by a large number (ranging from 10 to 200) of short usage sessions with durations ranging from 10 to 250 seconds. Despite the vast majority of sessions being short [2], there were typically some long sessions as well. It was observed that the number of sessions dropped exponentially as the session length increased. The work in [2] thus proposed combining an exponential distribution with a Pareto distribution to model the relation between the probability and the length of usage sessions.
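A rough illustration of such a body-plus-tail model for session lengths: an exponential component for the many short sessions and a Pareto component for the rare long ones. All parameters are invented, not the values fitted in [2].

```python
import math

def session_length_pdf(x: float, lam: float = 0.02, alpha: float = 1.5,
                       x_cut: float = 120.0, w: float = 0.9) -> float:
    """Mixture density: an exponential body for short sessions and a Pareto
    tail (support x >= x_cut) for rare long ones. Parameters are illustrative."""
    exp_part = lam * math.exp(-lam * x)
    pareto_part = alpha * (x_cut ** alpha) / (x ** (alpha + 1)) if x >= x_cut else 0.0
    return w * exp_part + (1.0 - w) * pareto_part

# Short sessions dominate, while the heavy tail keeps long sessions plausible.
for x in [10, 60, 300, 1000]:
    print(x, session_length_pdf(x))
```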
Beyond the overall statistics of usage sessions, a number of works also investigated these statistics under different contexts. Hintze et al. [63] considered two different contexts, namely when the smartphone was locked or unlocked. They found that in the former case the number of usage sessions was larger than in the latter. The mean session length and the daily interaction time in the former case, 88 seconds and 33 minutes respectively, were much shorter than those in the latter case, 285 seconds and 87 minutes respectively. This revealed that for a substantial amount of time, smartphones were used without being unlocked. Soikkeli et al. [64] further distinguished usage sessions based on location contexts such as home and office. Among all these contexts, the usage sessions at home had the longest average length and the lowest number of sessions and usage minutes per hour. As pointed out in [64], this can be explained by the observation that smartphone usage at home can be considered non-essential, but when the phone was used, the session length tended to be longer than in other contexts.
It is also worthwhile to analyze how the usage sessions are distributed among various apps. The work in [59] reported that the number of usage sessions across different apps typically follows a power-law decay, with more than 68.2% of sessions involving only one app. Falaki et al. [2] confirmed that the app usage ratio can be well fitted by an exponential distribution, implying that each user typically spent their time on only a few dominant apps. Regarding app categories, the study [59] showed that the categorical labels of the apps, e.g. finance, travel, communication, news and games, can also be correlated with the usage time. For instance, the average usage time ranged from 36 seconds for apps in the finance category to 270 seconds for those in the libraries & demos category, which was even longer than the time (114 seconds) for the games category. Among all usage sessions, about half (49.6%) were started by apps from the communication category. This suggests that communication with others often triggered the usage of various other apps.
A number of other spatial-temporal factors were also reported to have a significant impact on app usage behaviors and app popularity. Böhmer et al. [59] reported that when individuals used apps during the daytime, they tended to switch frequently between apps, leading to a short usage time for each app. When individuals used apps at night, they tended to stick to a single app with a significantly longer average usage time. App usage can also be influenced by the location context, which was represented either by geographic areas or by location semantics. In the former case, Xu et al. [41] reported that a considerable fraction of popular apps, about 20% of 1000 selected apps, were used only in a few US states. For instance, some apps designed to provide local news and TV programs were used only in one state and its vicinity, and other apps developed by universities were popular only within the university. It was also reported in [61, 59, 65] that there was a high correlation between app usage and place semantics. For instance, individuals can have different preferences in choosing communication tools, such as SMS, voice calls and communication apps, depending on whether the user is indoors, in transport or at a place of leisure. The choice of apps also depends on the availability of free WiFi, which is typically associated with the location semantics.
3. App Usage Prediction
App usage prediction refers to the task of predicting the next app that will be used by a given user at a given time. In this section, we review and discuss several essential aspects of the current app usage prediction literature, including the two types of features used, various predictive modeling methodologies, as well as some practical implementation issues.
In smartphone data mining, the features used are typically categorized into explicit and implicit types [66, 44, 67]. Following this convention, we formalize explicit features as readily available information extracted from the phone, whereas implicit features refer to subtle, derived app usage statistics. The explicit features include: 1) location and environmental data of various types, 2) time information, e.g. hour of the day and day of the week, and 3) smartphone modes extracted from the phone's settings, e.g. airplane mode, vibrate mode and silent mode. Specifically, we categorize the explicit features into four types below:
• Location features refer to both the location and other data sources that can imply locations, including GPS, WiFi access points (APs) and cell tower IDs. Other forms of data can also be translated into locations. For instance, Huang et al. [68] translated the signals from WiFi APs into locations. Also, a number of studies [69, 49, 67] furnished the locations with semantics by integrating crowdsourced location labels.

• Time-based features refer, in the majority of cases, to the hour of the day and the day of the week. The works in [43, 70, 71] suggested that features for smartphone data can be computed separately for different temporal contexts. We often call these features temporal features.
• Setting features refer to the smartphone modes configured in the phone's settings, e.g. airplane mode, vibrate mode and silent mode.

• Status features refer to the phone's runtime status, e.g. the battery level, screen state and sensor readings, which can be used to effectively filter out unlikely app candidates in the given circumstances.
Table 2: Feature and performance comparison of next-app prediction studies, where top-K denotes that the evaluation is based on the top-K selected candidate apps; MRR stands for mean reciprocal rank, MAP for mean average precision, MAR for mean average recall, NDCG for normalized discounted cumulative gain and Acc for accuracy; "-" and "?" denote unreported information.

Reference           | Location (explicit)      | Time | Setting | Status             | Implicit features                    | #User | #Apps                      | Metric              | Result
Shin 2012 [42]      | Cell tower, GPS          | √    | √       | Accel., battery    | Latest apps                          | 23    | 74 per user                | Acc.@K              | Top-5: 78%
Liao 2013 [43]      | -                        | -    | -       | -                  | Temporal app stats.                  | 80    | 41 per user                | Recall@K, MRR       | Top-5: 80% (Rec.), 42% (MRR)
Yu 2012 [49]        | Semantic locations       | √    | √       | -                  | -                                    | 443   | 665 in total               | MAP, MAR@K          | Top-5: 60% (MAP), 85% (MAR)
Zhu 2012 [69]       | Semantic locations       | √    | √       | -                  | -                                    | 443   | 680 in total               | Precision, Recall@K | Top-5: 20% (Pre.), 90% (Rec.)
Natarajan 2013 [68] | -                        | -    | -       | Screen             | App sequence                         | 17062 | 9583 in total              | Recall@K            | Top-5: 67%
Liao 2013 [44]      | GPS                      | √    | -       | Accel., battery    | App sequence                         | 50    | 300 in total, 56 per user  | Recall@K, nDCG      | Top-5: 80% (Rec.), 0.64 (NDCG)
Xu 2013 [79]        | GPS                      | √    | √       | Accel., microphone | Latest app, app category             | 4600  | 9 per user                 | Acc.@K              | Top-5: 67%
Yan 2012 [73]       | Clustered locations      | -    | -       | -                  | Trigger / follower app, app sequence | -     | -                          | -                   | -
Sun 2013 [74]       | -                        | -    | -       | -                  | App stats, sequence                  | 52    | 11 per user                | Acc.@K              | Top-4: 70%; Top-8: 80%
Zou 2013 [75]       | -                        | -    | -       | -                  | App stats, sequence                  | 80    | 59 per user                | Acc.@K              | Top-5: 86%
Zhang 2012 [88]     | WiFi                     | √    | √       | -                  | Latest apps                          | -     | -                          | -                   | -
Huang 2012 [71]     | WiFi                     | √    | √       | -                  | Latest apps                          | 38    | -                          | Acc.@K              | Top-3: 69%
Parate 2013 [72]    | Semantic locations       | √    | -       | -                  | App duration, sequence               | 34    | -                          | Acc.@K              | Top-5: 81%
Ricardo 2015 [80]   | GPS, semantic locations  | √    | -       | Audio, charging    | App sequence                         | 480   | -                          | Precision@K         | Top-?: 90%
Do 2013 [78]        | GPS, WiFi                | √    | √       | Call, bluetooth    | -                                    | 71    | 1.4m in total              | Recall@K            | Top-5: 90.8%
As mentioned above, implicit features are derived app usage statistics, which require feature engineering and data modeling, and are subtle in nature. They can be correlation statistics of app usage sequences, the distribution of app usage durations, or the probability of each app being the first to be used7. It was reported in [40, 51] that the implicit features tended to be more effective in distinguishing app usage behaviors compared with the explicit features. Table 2 overviews and compares the explicit and implicit features used in the next-app prediction literature. In the next section, we discuss how implicit features are modeled in the current works.
7 This refers to the case when the smartphones are recovered from the standby mode or
3.2. Modeling Implicit Features

learned from these subsequences. Although a first-order Markov model is able to learn the probability of switching between any two apps in an app transition matrix, we note that the model is limited, as it does not take into account the app usage duration.
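For concreteness, a first-order Markov next-app predictor can be sketched in a few lines; the usage sequence and app names below are invented for illustration.

```python
from collections import Counter, defaultdict

def train_transition_matrix(sequence):
    """Count app-to-app transitions and normalise each row into probabilities."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1
    return {a: {b: c / sum(row.values()) for b, c in row.items()}
            for a, row in counts.items()}

def predict_top_k(matrix, current_app, k=3):
    """Rank candidate next apps by their transition probability."""
    row = matrix.get(current_app, {})
    return sorted(row, key=row.get, reverse=True)[:k]

usage = ["mail", "browser", "mail", "music", "mail", "browser", "maps"]
matrix = train_transition_matrix(usage)
print(predict_top_k(matrix, "mail"))  # e.g. ['browser', 'music']
```

Note that, exactly as the text above observes, nothing in this sketch models how long each app was used.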
Liao et al. [44] learned implicit features covering both app transitions and durations using a graph-based method. For each user, a temporal app usage graph was constructed, where each node represented an app and each directed edge a transition. For each pair of apps, the transition duration was modeled by an exponential distribution. The app transition matrix was learned in the training phase under the Markovian assumption. Once the app transition matrix was learned, the next app and the other implicit features, such as duration, were jointly estimated in the prediction phase.
When modeling app usage sequences with Markovian models, training data of sufficient length is typically required, as the list of installed apps on an individual's smartphone can be long and change fairly dynamically. To address this data adequacy issue, Parate et al. [69] proposed using a text compression algorithm, named Prediction by Partial Matching [73], to model the app usage sequences. The algorithm treats each app as analogous to a character in text. In particular, Parate et al. [69] conjectured that a long prefix, i.e. a list of previously used apps, does not necessarily provide more useful information than a short prefix for predicting the next app. With this short-prefix assumption, a large amount of data was not required to model high-order implicit features. The data sparseness issue was further addressed through a weighted combination of Markov models of different orders.
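A minimal sketch of blending Markov models of different orders to ease data sparseness; the linear weighting and the weights themselves are illustrative choices, not the exact PPM variant of [69, 73].

```python
from collections import Counter, defaultdict

def train_ngram(sequence, order):
    """Count next-app frequencies conditioned on the previous `order` apps."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        counts[tuple(sequence[i - order:i])][sequence[i]] += 1
    return counts

def blended_scores(models, weights, history):
    """Linearly combine the predictive distributions of models of each order."""
    scores = Counter()
    for (order, counts), w in zip(models, weights):
        row = counts.get(tuple(history[-order:]), Counter())
        total = sum(row.values())
        for app, c in row.items():
            scores[app] += w * c / total
    return scores.most_common()

usage = ["mail", "browser", "mail", "music", "mail", "browser", "maps", "mail"]
models = [(1, train_ngram(usage, 1)), (2, train_ngram(usage, 2))]
print(blended_scores(models, weights=[0.6, 0.4], history=["browser", "mail"]))
```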
Similarly, Zou et al. [72] proposed using Markovian models to learn the app usage sequences; these models included first- and second-order Markov models, along with a weighted linear combination of the two8. The results showed that the best model was the combined model, with an accuracy of 85% in predicting the top-5 apps each time.

8 When linearly combining the two models, the second-order Markov model used was kept simple, such that the next-app prediction depended only on the current app rather than on the current app together with the two apps preceding it.
Natarajan et al. [66], on the other hand, modeled the app usage sequences using a cluster-level Markov model, which segments app usage behaviors across multiple users into a number of clusters. By clustering users based on the Kullback-Leibler divergence between their user-level app transition matrices, the algorithm was able to group users with similar app usage patterns. In the second-level learning, a Markov model was trained on the collective data of each group of users to model the group-level app usage characteristics.
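The user-clustering step relies on a divergence between transition matrices; below is a sketch of a symmetrised, smoothed KL distance that could feed any off-the-shelf clustering routine. The smoothing constant and toy matrices are assumptions, not the exact construction of [66].

```python
import math

def smoothed_rows(matrix, apps, eps=1e-3):
    """Turn raw transition counts into smoothed row distributions over `apps`."""
    rows = {}
    for a in apps:
        row = matrix.get(a, {})
        total = sum(row.values()) + eps * len(apps)
        rows[a] = {b: (row.get(b, 0) + eps) / total for b in apps}
    return rows

def kl(p, q):
    return sum(p[b] * math.log(p[b] / q[b]) for b in p)

def user_distance(m1, m2, apps):
    """Symmetrised KL divergence, averaged over the conditioning apps."""
    r1, r2 = smoothed_rows(m1, apps), smoothed_rows(m2, apps)
    return sum(kl(r1[a], r2[a]) + kl(r2[a], r1[a]) for a in apps) / (2 * len(apps))

apps = ["mail", "browser", "music"]
u1 = {"mail": {"browser": 5}, "browser": {"mail": 4, "music": 1}}
u2 = {"mail": {"music": 5}, "music": {"mail": 5}}
print(user_distance(u1, u2, apps))  # larger value = less similar usage
```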
In addition to Markov features, other complementary implicit features were also introduced. Liao et al. [43] modeled usage behaviors over different temporal periods. Specifically, three implicit app usage features were proposed, namely a global feature, a temporal feature and a periodicity feature. The global feature was defined as the probability of each app being used, aggregated over all available temporal periods. The remaining two features were time-dependent and referred to usage within predefined periods of a day. For example, the temporal feature was computed over predefined equal time intervals for each day. The periodicity feature was instead computed over periods that covered only significant app usage, detected using a dynamic cut strategy [74]. It was mentioned in [43] that merely considering the dependencies between app usage behaviors was not enough to make accurate next-app predictions, simply because app usage depends strongly on the temporal context. It was also suggested that modeling the starting time of app usage could be useful for next-app predictions.
3.2.1. Discussion
The modeling techniques for implicit features face a number of major challenges when used with practical data acquired from real life. We highlight below two such challenges, related to model complexity and the need for dynamism.
Model Complexity versus Training Data Size. The works in [68, 44] employed a fixed-order Markov model for app usage prediction. With a low order, the model typically requires fewer data points to train but can only coarsely capture the app transition dependency due to its limited expressive power. Increasing the order of the Markov model, on the other hand, significantly expands the number of parameters to be learned from the data9. This can easily lead to data sparsity issues, as the number of observed transition events between each pair of apps can be small in the absence of sufficient data, which in turn results in inaccurate modeling. The studies [66, 69] addressed this issue in two different ways, namely by reducing the number of parameters through a mixture of Markovian models, and by training the Markovian models on data aggregated from clusters of similar users.

9 When the number of unique apps is N and the order of the Markov model is K, the number of parameters is (N − 1) × N^K.
Dynamic Models. It is worth noting that, in practice, modeling the app usage sequence with Markov models is effective only over a short period of time. This is simply because new apps are released to the market at a fairly fast rate, and users can change their app usage behaviors when the list of their installed apps changes. Thus, we take the view that modeling the app usage sequence over a long period also calls for dynamic modeling algorithms that can adapt the current model to the latest app usage behaviors observed in new data in a timely fashion.

Even though implicit features were found to be highly effective for app usage prediction, their performance can be further improved by including explicit features, as tabulated in Table 2. In the next section, we survey the studies that combine explicit and implicit features for app usage prediction.
3.3. Combining Explicit and Implicit Features

Combining implicit and explicit features for app prediction is a fusion problem that calls for effective means of integrating multiple data cues and contexts. Among [42, 68, 49, 75, 76, 70], we found that the Bayesian framework has been popularly applied to combine different features in a unified manner. The underlying assumption when using the Bayesian framework is that the features are independent of each other in terms of their impact on a user's choice of the next app [68, 49, 42, 77].
One can refer to the work in [68] for a generic description of using a Bayesian framework to combine various features. Specifically, [68] integrated location features, smartphone setting features and app usage features for app prediction. The location features are a set of significant locations inferred from WiFi APs when the app is used [78]. The smartphone setting features indicate whether the smartphone is in silent mode, flight mode or another mode. The app usage features are represented as the transition probabilities between apps, inferred by a first-order Markov model as discussed in Section 3.2. This study suggested that the three types of features can be combined using a Bayesian model, which scores the next-app candidates by multiplying the posterior probabilities computed from each feature.
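Under the independence assumption above, such a combination amounts to summing per-feature log-likelihood contributions; the likelihood tables below are invented for illustration and are not taken from [68].

```python
import math

# Hypothetical per-feature likelihood tables P(feature_value | app),
# e.g. estimated from a user's history. All values are made up.
likelihoods = {
    "location": {"home":   {"mail": 0.2, "music": 0.6},
                 "office": {"mail": 0.7, "music": 0.1}},
    "last_app": {"browser": {"mail": 0.5, "music": 0.3}},
}
prior = {"mail": 0.5, "music": 0.5}

def score(app, observed):
    """Naive-Bayes score: log prior plus the log likelihood of each observed
    feature, assuming features are conditionally independent given the app."""
    s = math.log(prior[app])
    for feature, value in observed.items():
        s += math.log(likelihoods[feature][value][app])
    return s

observed = {"location": "office", "last_app": "browser"}
ranked = sorted(prior, key=lambda a: score(a, observed), reverse=True)
print(ranked)  # ['mail', 'music'] under these toy numbers
```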
The works in [49, 77] considered context information in modeling app usage. Yu et al. [49] explored a context-aware combination of different features. The idea was motivated by the observation that different users in the same spatial-temporal context tend to have similar app usage behaviours, whereas their behaviour can differ markedly across contexts. Based on this idea, Yu et al. [49] introduced a new term called the atomic context, i.e. a context common to all users. Such a context provides a basis for aggregating different users' data, which alleviates the data sparsity issue of using each individual's data alone. Specifically, user u's preference for an app a was modeled through the probability of observing the atomic contexts p in a context list C, represented as P(a|C, u) ∝ ∏_{p∈C} P(a, p|u). As pointed out in [49], each user's app usage contexts might be too few to reliably estimate P(a, p|u). Thus, a latent variable z, named the Common Context-aware Preference, was introduced to decompose the probability as P(a, p|u) = Σ_z P(a, p|z) P(z|u). The probabilities P(a, p|z) and P(z|u) were learned with the Latent Dirichlet Allocation (LDA) model [79]. The contexts and apps were modeled in a manner analogous to topics and words in topic modeling, with all users' data aggregated for parameter estimation in the LDA model.
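Once P(a, p|z) and P(z|u) are available, the user's preference follows by summing over the latent variable. A minimal numeric sketch with two latent preferences is shown below; all probabilities are invented.

```python
# Hypothetical learned parameters for two latent context-aware preferences z.
p_ap_given_z = {  # P(a, p | z): joint probability of app and atomic context
    "z1": {("news", "morning"): 0.30, ("game", "evening"): 0.05},
    "z2": {("news", "morning"): 0.05, ("game", "evening"): 0.40},
}
p_z_given_u = {"alice": {"z1": 0.8, "z2": 0.2}}  # P(z | u)

def preference(app, context, user):
    """P(a, p | u) = sum_z P(a, p | z) * P(z | u)."""
    return sum(p_ap_given_z[z].get((app, context), 0.0) * w
               for z, w in p_z_given_u[user].items())

print(preference("news", "morning", "alice"))  # 0.30*0.8 + 0.05*0.2 = 0.25
print(preference("game", "evening", "alice"))  # 0.05*0.8 + 0.40*0.2 = 0.12
```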
Similarly, Ricardo et al. [77] combined various explicit and implicit features in a parallelized Tree Augmented Naive Bayes [80] (PTAN) for app usage prediction. The features used were location features, including latitude, longitude, GPS accuracy and location semantics (either home or work); time-based features, including the hour of the day; and phone status features, such as the charging status and audio cable connection. The implicit features were a set of temporally adjacent app usage records. The set of useful features that maximized the probability of a given app was selected based on an idea from word2vec [81], and the features were then combined in PTAN via a structural learning phase and a parametric optimization phase. For two cold-start issues (some apps lack historical usage records and some users lack initial training data), the authors augmented the phone usage records either with other apps of similar usage frequency or with other users possessing similar app usage records.
The works in [42, 44] incorporated feature selection methods to combine different types of features. Shin et al. [42] proposed constructing a naïve Bayes model for each combination of app and user instead of learning a multi-app classifier. As app prediction can hardly depend equally on all available features, it was meaningful to incorporate feature selection to choose a subset of effective features for each context. Employing Greedy Thick Thinning [82] as the feature selection method, the work reported that the last used app and the user's current location, in terms of cell tower IDs, were the top two features for next-app prediction. The work further presented two related case studies, on next-app prediction and on the design of home screens, respectively. Experimentally, it showed that with the selected features, the next-app prediction given by a naïve Bayes model outperformed several baseline methods, which included choosing the most frequently used apps, choosing the most recently used apps, and using a decision tree for prediction. Secondly, a dynamic home screen app was developed using the proposed naïve Bayes model. This home screen app could not only show the most probable next apps to be used, but also give additional prominence to the apps with the largest increase in their overall probability of use.
Similarly, Liao et al. [44] incorporated a feature selection method to combine explicit and implicit features for next-app prediction. The explicit features here included data from various sensors, e.g. location, time, battery, accelerometer, WiFi signal, GSM signal and system configuration. The implicit features were the app transition characteristics modeled using a directed graph, where the transition times were modeled by an exponential distribution. Due to the high dimensionality of the combined feature set, minimum description length [83] was applied as a criterion to iteratively select the best subset of features. The selected features were then used in a multi-class k-nearest neighbours (kNN) classifier to predict the next app.
Apart from Bayesian models, Do et al. [75] also compared a few other models for jointly predicting the next place and the next app by combining a list of features. These models included linear regression, logistic regression, random forests, Markov models and a fusion technique that linearly combines the above models. Using the proposed method, Do et al. trained both generic and personalized models based on all users' data. The experimental results showed that the generic models performed well against the personalized models, especially when each user's training data was very small, defined as less than 3 weeks. As the size of the training data increased, the combined model was also found to perform about 2% better than the personalized models. This suggests that using all available users' data and fusing multiple models can help address the lack of data for individual users and enhance performance.
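A minimal sketch of fusing a generic with a personalized predictive distribution; the linear weighting schedule tied to the amount of personal data is our illustrative choice, not the exact fusion of [75].

```python
def fuse(generic_probs, personal_probs, weeks_of_personal_data):
    """Blend two predictive distributions: lean on the generic model while
    personal data is scarce (under ~3 weeks here) and shift weight as it grows."""
    w = min(weeks_of_personal_data / 3.0, 1.0)  # illustrative schedule
    apps = set(generic_probs) | set(personal_probs)
    return {a: (1 - w) * generic_probs.get(a, 0.0) + w * personal_probs.get(a, 0.0)
            for a in apps}

generic = {"mail": 0.5, "browser": 0.3, "music": 0.2}
personal = {"music": 0.7, "mail": 0.3}
print(fuse(generic, personal, weeks_of_personal_data=1))  # generic still dominates
```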
Xu et al. [76] proposed a two-phase classification model to combine three features, namely an individual's app transition probabilities, contextual features and app usage patterns aggregated from communities. The two-phase model included a personalized classification phase, which aimed to group users with similar app usage patterns, and a community-aware prediction phase, which aimed to predict the next app. In the classification phase, the apps were classified in terms of the app bag, a set of contextual features associated with using each app. The similarity between app bags was measured using a Mercer kernel [84]. Here, the classification model was constructed by learning the correlation between the contextual features and the usage of apps. In the community-aware prediction phase, the similarity of two users' app usage patterns was evaluated over all pairs of app bags, one from each user. App usage prediction was then conducted by classifying the usage context packed in the app bags. The work in [76] also pointed out that each user's app usage data can be too sparse to train a comprehensive classifier. Instead, it was recommended that the classifier be trained on collective data from a community of users presenting similar app usage patterns.
3.3.1. Discussion
When training a data model for next-app prediction, a number of the works above also touch on the issue of model personalization. We discuss it in greater detail below.
Generic Model versus Personalized Model. A basic design consideration for next-app predictive modeling is whether we should learn a generic model based on multiple users' data or a personalized model based on each user's own data. The underlying assumption of the generic model is that a group of users share common app usage patterns in similar contexts. Although the size of the training data increases as more users' data are combined, it is worth noting that the number of contexts can also increase. As some users contribute more training data than others, there is also a risk that the predictive model becomes dominated by the users who contribute the most data.

Personalized models, in contrast, are typically learned from an individual's own data. With sufficient training data, personalized models were reported to achieve better accuracy than generic models [42, 75]. However, training a personalized model often encounters cold-start and data sparseness issues, especially at the beginning of data collection for an individual. Cold start here refers to the frequent scenario in which, at the beginning of data collection for an individual, there is typically insufficient data to train a personalized model that can immediately and accurately interpret his or her live smartphone data feeds. It is thus desirable to design an adaptive mechanism that first applies a generalized model to interpret an individual's initial data, and then systematically personalizes the model to deliver better performance as more of the user's data are collected. Data sparseness instead refers to the lack of data, or of data variety, to cover the representation of all combinations and dimensions. Similar to the cold-start issue, data sparseness can be addressed by pooling data from similar individuals to represent a targeted individual. The underlying assumption is that similar users share similar app usage patterns and preferences.
Achieved Performances. Table 2 shows the accuracies achieved and the features used in the next-app prediction studies that we have discussed. We can see that the study scale varies over a large range, from 23 users to more than 17,000 users, and the number of apps per user ranges from 9 to 74. The total number of apps can be as large as 9583. For the majority of studies, the next-app prediction models were trained for each individual user and tested on an individual basis. An exception is the work reported in [76], where the authors selected 9 commonly used apps and used only 35 subjects to train a common next-app prediction model, which was then tested on 4,600 users. In this instance, the next-app predictive model is confined to the 9 selected apps.
The majority of the tabulated works (9 out of 15 in total) utilized a combination of explicit and implicit features. Of all 15 works, 11 used some form of location or its proxy, 10 directly used time, 7 used some phone settings, and 6 used some phone status as explicit features for next-app predictive modeling. This suggests that location and time are the two major types of explicit features. For implicit features, we found that 7 out of the 15 works directly modeled the app sequence, making it the most popular type of implicit feature. Beyond that, the latest app, app durations, temporal app usage statistics, and trigger/follower apps were also used in predictive modeling. Despite the different forms of implicit features used, each app usage prediction study typically leveraged some form of app sequence modeling to facilitate next-app prediction.
The most commonly used evaluation metrics are precision, recall and accuracy over the top-K selected candidates, where K typically ranges from 3 to 8. Besides these, other metrics such as rank, mean precision, mean recall and normalized discounted cumulative gain were also used.
We list in the last column of Table 2 the achieved performance as an indirect comparison among the various app usage prediction works. It is worth noting that these figures are indicators of the achievable performance under specific study settings. Given the large heterogeneity in study settings and in the adopted evaluation metrics, the comparative advantage of one study over another cannot be judged solely from their reported results.

There are also next-app studies which focused on system improvement aspects instead of reporting next-app predictive accuracy, such as the works in [70] and [85]. We do not include such studies in the table.
3.4. Practical Implementations

App usage prediction finds its immediate application in designing fast app-launching UIs. Along this line, several works [42, 70, 71] used a client-server model for the implementation. In general, intensive computation, e.g. mining app usage patterns and training the data models, is conducted on a server, while the light-weight app usage prediction can run either on the server or on the client device. There are also works [85, 69]10 which trained light-weight models directly on users' smartphones. In the following, we elaborate on the above-mentioned works.

10 In [85], although Zhang et al. also proposed a client-server model in developing a UI, it worked without needing to connect to the server, since all the statistics required for predicting the apps were cached locally.
Sun et al. [71] developed an Android app (AppRush) to predict the next app. AppRush consisted of a frontend widget, showing dynamic shortcuts to potential apps, and a background service that predicted the next app from the current app usage feed in real time. The prediction model was learned by combining usage information including recency, frequency, duration and temporal distribution, as well as the app launching sequence. The weights for combining these features were determined by the prediction accuracy of each feature applied alone. The evaluation of AppRush yielded two findings, on prediction performance and energy efficiency, respectively. On prediction performance, AppRush achieved consistent improvement over baseline methods that used only recency and frequency features. On energy efficiency, it was reported that AppRush required an average CPU usage of less than 1% and memory consumption of only about 4-6 MB.
Yan et al. [70] leveraged three features for next-app prediction, implemented in a fast app-launching UI app (FALCON) for Windows Phone OS. The three selected features were the usage sequence in terms of trigger and follower apps, the location, and temporal bursts of app usage. Trigger apps are the apps that lead a user to start using the smartphone, and follower apps are those used subsequently. The most common trigger apps were found to be SMS, email, Facebook, phone calls and web browsers such as Safari. The locations were quantized with the k-means clustering algorithm applied to GPS readings. A temporal burst indicated significant app usage over a period, detected by a wavelet-based burst detection algorithm [86]. Using these features, a list of apps to prelaunch was selected to balance the trade-off between the energy cost and the reduction in loading time.
Apart from using a server-client model to design fast app-launching UIs, two studies [85, 69] ran both model training and prediction on the user's smartphone. Due to the limited smartphone resources, such as computation power and battery capacity, two main issues were generally considered: energy efficiency, and the trade-off between loading time and battery consumption.
For energy efficiency, the work in [85] used a simple Bayesian model for app prediction [68] with only four features: the time of the day, the day of the week, the location and the last used app. The basic statistics used in training and prediction were stored locally on the smartphone. Using only three weeks of data for model training, Zhang et al. [85] demonstrated that the algorithm presented four next-app candidates with an accuracy of 76%, and the average lookup time for finding apps was reduced by 0.5 seconds. This reduction is significant given the high frequency of app searching behaviors. Parate et al. [69] proposed a variant of the Prediction by Partial Matching algorithm [73] for app usage prediction. This algorithm combined Markov models of different orders and was claimed to adapt quickly to changes in user behavior while producing good predictions. Besides next-app prediction, the future time point at which a user would use a given app was also predicted, based on a cumulative distribution model learned from the smartphone's app usage history.
To optimize the trade-off between app loading time and additional battery consumption, Yan et al. [70] formulated the problem as the classical 0-1 knapsack problem [87]. The cost was the additional battery energy consumed by prelaunching an app, and the benefit was the reduced loading time, which included both turning on the app and fetching its online content through a reliable connection11. The problem of determining which apps to prelaunch was solved as a 0-1 knapsack problem in which the total battery energy cost was constrained to 1% of the total battery energy and the objective was to minimize the overall loading time.

11 This was the optimal case the authors considered. If the smartphone was connected to a
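This prelaunch selection maps onto the textbook 0-1 knapsack; the sketch below discretises the energy budget into integer units, with all costs and savings invented rather than taken from [70].

```python
def select_prelaunch(apps, energy_budget):
    """0-1 knapsack: maximise total loading-time savings subject to a battery
    budget. `apps` maps name -> (energy_cost_units, seconds_saved)."""
    # dp[b] = (best saving within budget b, chosen apps)
    dp = [(0.0, [])] * (energy_budget + 1)
    for name, (cost, save) in apps.items():
        # Iterate the budget downwards so each app is picked at most once.
        for b in range(energy_budget, cost - 1, -1):
            cand = (dp[b - cost][0] + save, dp[b - cost][1] + [name])
            if cand[0] > dp[b][0]:
                dp[b] = cand
    return dp[energy_budget]

# A budget of 10 units stands in for "1% of total battery energy".
apps = {"news": (4, 3.0), "social": (5, 2.5), "game": (7, 4.0), "mail": (2, 1.0)}
print(select_prelaunch(apps, energy_budget=10))  # (5.5, ['news', 'social'])
```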
Along the same avenue, Parate et al. [69] proposed to optimize the above-mentioned trade-off using a prefetching strategy, which considered the time of prefetching an app, the time period of app usage, as well as the bandwidth cost. The key idea was that an app's online content is constantly updated (consider a news app), so the optimal time to update the app is a period shortly before the user turns it on. The proposed prefetching strategy aimed to maximize the freshness of the apps' content under constraints on the bandwidth cost, where freshness is defined as the time gap between the moment the app is turned on and its latest update time. In the proposed methodology, the maximal probability of updating an app was taken as the [n_i × C]-th value of the historical cumulative usage distribution, such that prefetching was guaranteed to occur at most n_i times; here C was the bandwidth cost and n_i was the number of times this app had been used in the records. For a given probability of usage of an app, its prefetching time was drawn from the app's cumulative distribution of time until usage.
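The timing side of such prefetching can be illustrated with an empirical CDF of an app's historical launch times; the history and the target probability below are invented, not values from [69].

```python
import bisect

def empirical_cdf(launch_hours):
    """Sorted launch times and their cumulative probabilities."""
    xs = sorted(launch_hours)
    return xs, [(i + 1) / len(xs) for i in range(len(xs))]

def prefetch_time(launch_hours, usage_prob):
    """Read the inverse of the empirical CDF: the earliest time by which the
    app had been launched with probability >= usage_prob in the history."""
    xs, ps = empirical_cdf(launch_hours)
    return xs[bisect.bisect_left(ps, usage_prob)]

# Historical launches of a news app (hour of day); prefetch just before the
# time by which 30% of past launches had happened.
history = [7.0, 7.5, 8.0, 8.0, 9.0, 12.5, 13.0, 18.0, 19.0, 22.0]
print(prefetch_time(history, usage_prob=0.3))  # 8.0
```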
3.4.1. Discussion
For embedding next-app prediction into real applications, such as fast-launch UI apps, there are two common architectural designs to choose from, namely the client-server model and the on-device model. The former can accommodate complex machine learning mechanisms, since the intensive computation runs on a server; however, it requires data transmission and incurs latency in receiving the computational outcomes. The latter is self-contained and adopts a personalized predictive model, with no data transmission to any external entity. Without the need to send raw data to a server and fetch the computation outcomes, the UI launch app can potentially operate quickly and efficiently without much privacy risk. However, the practical issues of battery consumption, algorithm complexity, bandwidth requirements, cost, and user experience under constrained smartphone resources need to be carefully accounted for and validated when making this design choice.
4. App Recommendations
With the large pool of apps currently available and the fast proliferation of new apps, it has become both tedious and time-consuming for users to proactively search for apps that potentially suit their needs. Although every app market provides different methods that enable users to search for an app (such as searching by keywords, listing by categories and sorting by download counts), these tools require users to proactively use them, which can be burdensome. Moreover, these methods are designed around app information aggregated from a large pool of users and are not personalized to suit an individual's traits and app usage patterns. Notably, the types of information used in app searching methods, e.g. app descriptions, categorical labels and download counts, have been suggested to be ineffective at quantifying the popularity of an app [88, 53, 40, 51, 89]. Therefore, it is necessary to discover new quantitative measures and features that truly reflect user experiences of app usage, and to design personalized app recommendation algorithms and systems that can effectively connect each app to its targeted users, or recommend to each user a personalized set of apps.

Table 3: Comparison of explicit and implicit information used in movie recommendation and app recommendation studies.

         | Movie Recommendations                              | App Recommendations
Explicit | Basic information: content, genres, cast, producer | Basic information: content, category, developer, permissions
         | Box office                                         | Download times
         | Ratings from the review forums                     | Ratings from app users
Implicit | User's watching history                            | Installed apps on users' phones / app usage records
         | Friends' preferences                               | Friends' app usage information shared on social networks
         | N.A.                                               | Spatial-temporal information
Table 3 compares the explicit and implicit features used. In what follows, we detail our comparison in three aspects.

Firstly, movies and apps are by nature very different products for users to consume. A movie is mostly produced at high cost to a high standard. It is consumed on a one-time basis in the cinema or in an on-demand home setting, where users passively receive the entertainment by watching the film. An app, by contrast, is an interactive software entity that provides certain functions to the user through a UI, repeatedly and on a day-to-day basis. The functionality, interactivity and anticipated user experience are far more important when choosing an app than when choosing a movie. Therefore, in movie recommendation, simple methodologies using a few explicit features, such as genre, cast and producer, can be fairly effective, although state-of-the-art methods also look into complex modeling approaches such as tensor factorization. In contrast, for app recommendation the description of an app may not be as important, since the user experience of an app can be unrelated to its description. Also, an app's basic information, e.g. its developer and permissions, has been suggested not to be a major factor in users' decisions on which apps to install and use [88, 53, 40, 51].
Secondly, the sources, relevance and reliability of user feedback data matter. Both box office sales and movie ratings accumulated from audiences provide trustworthy guidance on the popularity of movies. In comparison, neither the rating nor the number of downloads of an app may be an effective measure for app recommendation. Zhu et al. [88] reported that both the ranking and the rating can be forged within a short time, and that these scores can change dramatically over a long period. Shi et al. [53] also reported that the ratings for some apps occasionally manifest inconsistency with skewed distributions; an extremely low rating can be caused by the incompatibility of an app with some operating systems12. Further, we note that even when a user downloads an app, this does not imply that the user likes the app; it may simply be because some users like to test out different apps. The user's traits, as characterized by their current app choices and usage patterns, are thus important for recommending a new app.

12 This happens mostly in the Android system, which is the most popular mobile system
Thirdly, the works in [40, 51] mention that it is hard to obtain reliable explicit information for apps in general. Based on historical data in the app markets, it was reported that less than 1% of users rate the apps they have installed [40]. As an example, one popular productivity app, Evernote, was reported to have received only a few ratings even after a relatively large number of users had installed and used it [51]. This happened when the app was initially released to the market.
Relevance to Next-App Prediction. App recommendation studies have a great deal in common with the next-app usage prediction studies discussed in Section 3. App usage prediction is the task of predicting the next app that a user will choose to use on her smartphone, while app recommendation suggests new apps for the user to install for future usage. Both topics can be formulated as similar machine learning problems, where current app usage patterns and user preference traits are typically the important factors for making reliable predictions or recommendations. These important factors are often selected as the implicit features included in Table 2. In particular, the implicit features help profile each user and answer several important questions, including: which category of apps does a particular user prefer, how much time will the user spend on each installed app, and how long does it take for a user to start losing interest in a particular app? Additional context information, such as the location, the time of the day and the day of the week related to app usage, can also be utilized for making effective recommendations. A recommendation made in the right context increases its rate of adoption. For example, it would be more effective to recommend a game app to a user while he or she is traveling on the train rather than driving during the rush hours.
The key difference between app usage prediction and app recommendation is the size of the pool of potential candidates. App prediction chooses from the list of installed apps, while app recommendation chooses from the entire app market. For app recommendation, there is typically no historical usage data of the candidate apps from the individual user; the recommendation has to be based on transferable and complementary characteristics deduced from other apps and other users. App recommendation studies use not only the features in Table 2, but also other feature types, e.g. app descriptions and categories, for measuring the similarity between apps. One can refer to Table 4 for an overview of the data, features and methodologies used in current app recommendation studies. With regard to time criticality, app usage prediction is typically conducted in real time, while app recommendation can afford to be less time-critical, and its effectiveness can be measured based on near-future app installation and usage.
Having discussed the similarities and fundamental differences between app usage prediction and recommendation, our work allows researchers to draw parallels between the two streams of research and to identify new cross-topic research opportunities. For example, the explicit/implicit feature representations found useful in user modeling and app prediction will most likely be useful for app recommendation as well. For app recommendation, we frequently need to use other users' data and auxiliary data to model and infer an individual's preference level for a particular app. This involves computing user similarity and performing user clustering on the available smartphone data. A good methodology for leveraging other users' data for recommendation can easily be reapplied to app prediction, owing to the common issues of data sparseness and cold start.
Table 4: Feature and performance comparison of app recommendation studies, where the evaluation is typically performed on the top-K recommended candidates or as action ratios; nITN stands for normalized item novelty, MAP for mean average precision, P_pos@K and P_neg@K for the average number of top-K recommended candidates accepted or not accepted by a user, NDCG for normalized discounted cumulative gain, Pre. for precision and Rec. for recall.

Group             | Reference            | Data                                           | Method                                        | #Users | #Apps | Metric                    | Result
Similarity-based  | Shi 2012 [53]        | App usage info.: #days or a binary usage indicator | Combination of a memory-based model and PCA | 101k   | 7k    | Precision, recall@K       | Recall@10: 8%
Similarity-based  | Bhandari [107]       | Descriptions and app similarity                | Similarity graph, time, LDA                   | 3      | 18    | nDCG (rank position)      | -
Explicit features | Liu 2015 [104]       | Users' ratings, app permissions                | Latent factor model                           | 16k    | 6k    | Precision, recall@K       | Precision@10: 8%, recall@10: 20%
Explicit features | Zhu 2014 [112]       | Users' ratings, app permissions                | Latent factor model                           | -      | 170k  | nDCG, precision, recall@K | nDCG@25: 0.9, prec.@25: 79%, recall@25: 20%
Explicit features | Lin 2014 [105]       | Version type, description, genre, ratings      | Topic model                                   | 10k    | 9.6k  | Recall@K                  | Recall@100: 73%
Explicit features | Montenegro 2012 [46] | App descriptions, ratings and usage records    | Combination of multiple recommendation methods | -     | -     | -                         | -
Social-based      | …                    | App usage …                                    | Weighted …                                    | …      | …     | …                         | …
Shi et al. [53] made recommendations based on similar usage patterns found among a group of users. These usage patterns were described in terms of either the number of days an app was used or an indicator of whether a user had used an app at all. An eigenapp model was proposed to extract features from the user-app usage matrix via eigen decomposition. However, the computational complexity of such a decomposition is high, especially when the number of users is large. Thus, a dimensionality reduction method was proposed that applies eigen decomposition to the app-side matrix, whose dimension is controlled by the number of apps; the extracted eigenvectors were then used to transform the original user-app matrix. Recommendations were made according to app similarity, calculated using the projections onto the computed eigenvectors as app features.
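The eigen-decomposition idea can be illustrated with a small numpy sketch: decomposing the app-by-app matrix keeps the problem size tied to the number of apps rather than the number of users. The toy matrix and the use of loading rows as app features are illustrative assumptions, not the exact eigenapp construction of [53].

```python
import numpy as np

# Rows = users, columns = apps; entries = number of days each app was used.
X = np.array([[12.,  0.,  3.],
              [10.,  1.,  4.],
              [ 0.,  9.,  8.],
              [ 1., 11.,  7.]])

# Eigen-decompose the app-by-app matrix (n_apps x n_apps); this stays small
# even when the number of users is huge -- the dimensionality trick above.
vals, vecs = np.linalg.eigh(X.T @ X)
top = vecs[:, np.argsort(vals)[::-1][:2]]  # loadings on the top-2 "eigenapps"

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Each app's feature vector is its row of loadings; apps with similar usage
# patterns get similar loadings, which drives the recommendation.
apps = ["mail", "game", "news"]
for i in range(3):
    for j in range(i + 1, 3):
        print(apps[i], apps[j], round(cosine(top[i], top[j]), 2))
```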
Aside from considering app similarities, Bhandari et al. [50] also took diversity and novelty into account for recommendations. The idea was that an effective recommendation should not only be based on the user's currently installed apps and usage patterns, but should also search among the list of unseen apps that the user had rarely or never used before. Based on this idea, Bhandari et al. [50] measured the similarity of apps using features extracted from each app's descriptions and comments. Apps whose similarity exceeded a threshold were connected in a similarity graph, with the weight on each edge being the corresponding similarity value. The recommendation lists consisted of the apps lying on the shortest paths between pairs of installed apps. In this way, the recommendations were picked as the most dissimilar candidates within a pool of apps that still bore sufficient similarity to the user's currently installed apps.
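A small sketch of this graph construction follows, assuming the networkx library and toy similarity values of our own choosing; the recommended apps are the intermediate nodes on the similarity-weighted shortest paths between installed apps.

```python
import itertools
import networkx as nx

# Toy pairwise description similarities (invented values).
sim = {("chess", "checkers"): 0.9, ("chess", "puzzle"): 0.6,
       ("puzzle", "news"): 0.5, ("checkers", "news"): 0.55,
       ("news", "weather"): 0.7}
THRESHOLD = 0.4

G = nx.Graph()
for (a, b), s in sim.items():
    if s > THRESHOLD:
        G.add_edge(a, b, weight=s)           # edge weight = similarity value

installed = ["chess", "weather"]
recommended = set()
for a, b in itertools.combinations(installed, 2):
    # Minimizing summed similarity favors the most dissimilar connecting apps.
    path = nx.shortest_path(G, a, b, weight="weight")
    recommended.update(p for p in path if p not in installed)
print(recommended)                           # {'puzzle', 'news'}
```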
Yan et al. [47] presented the AppJoy system to make personalized recommendations by picking the apps with usage patterns similar to those of a user's installed apps. Similar usage patterns were evaluated based on a usage score, which was the weighted sum of three temporal app usage statistics, namely recency, frequency and duration. The scores for apps that a user had not used were estimated with a variant of the Slope-One algorithm [91], which estimated the score of an unused app by averaging the scores of similar apps. Here, similar apps were determined from the collective usage behaviors of all users. An item-based recommendation was made by constructing a usage score matrix between users and apps.
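The following sketch illustrates the flavor of such a usage score and of a Slope-One-style estimate for an unused app; the weights, the toy score matrix and the exact averaging are our assumptions, not the precise AppJoy formulas.

```python
# Illustrative usage score: weighted sum of recency, frequency and duration
# statistics (weights are assumed, not those of AppJoy [47]).
def usage_score(recency, frequency, duration, w=(0.3, 0.4, 0.3)):
    return w[0] * recency + w[1] * frequency + w[2] * duration

# Slope-One-style estimate [91]: predict a user's score for an unused app from
# the average score deviation observed among users who used both apps.
def slope_one_predict(scores, user, target):
    preds = []
    for i, s_ui in scores[user].items():
        devs = [s[target] - s[i] for s in scores.values()
                if i in s and target in s]
        if devs:
            preds.append(s_ui + sum(devs) / len(devs))
    return sum(preds) / len(preds) if preds else 0.0

scores = {"u1": {"mail": 0.8, "maps": 0.4},
          "u2": {"mail": 0.7, "maps": 0.5, "music": 0.6},
          "u3": {"mail": 0.9}}
print(slope_one_predict(scores, "u3", "music"))  # estimate u3's score for music: 0.8
```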
Wang et al. [90] recommended apps based on cooperation. The authors pointed out that in most cases, cooperation between apps was due to their complementary functionalities for the same content, e.g. apps for NBA news and NBA forums. Based on this idea, a cooperation metric was computed by combining the similarity of the apps' topics and the differences of their functions using the F-measure. Similarity in app topics was computed on their descriptions represented by term frequency-inverse document frequency (tf-idf) features [92], and the differences of their functions were measured through an Android mechanism called intent13, which indicates that one app invoked another at runtime to complete a task. This algorithm was designed to recommend new candidate apps that had topics similar to users' existing apps but complementary functionality.
13 http://developer.android.com/reference/android/content/Intent.html
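As a rough illustration, such a cooperation score can be formed as an F-measure-style harmonic mean of topic similarity and functional difference; the combination below and its input values are our simplified stand-ins for the metric in [90].

```python
def cooperation(topic_sim, func_diff, beta=1.0):
    """F-measure-style harmonic combination of topic similarity and
    functional difference; high only when both components are high."""
    if topic_sim == 0 or func_diff == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * topic_sim * func_diff / (b2 * topic_sim + func_diff)

# Apps on the same topic (e.g. NBA news vs. NBA forums) with different functions:
print(round(cooperation(0.8, 0.6), 3))  # 0.686
```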
Similar to the idea of app cooperation, Bae et al. [93] made app recommendations based on co-occurrence usage behaviors within a session. For each user, the app usage graph was constructed as a weighted graph, in which the weight on each edge was computed as the inverse of the co-occurrence count of the two apps observed in the same usage session. Given each user's app usage graph, the recommendation was made based on a fit score, which indicated the matching probability of a new node (app) fitting into the current graph. This was achieved by exploring and fitting the target user's neighborhood structure into the common nodes of other users' graphs.
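A minimal sketch of this per-user graph construction follows, with toy sessions of our own invention; as described above, each edge weight is the inverse of the co-occurrence count of the two apps.

```python
from collections import Counter
from itertools import combinations

sessions = [["mail", "browser"], ["mail", "browser", "maps"], ["maps", "music"]]

co = Counter()
for s in sessions:
    co.update(frozenset(p) for p in combinations(sorted(set(s)), 2))

# Edge weight = inverse of the co-occurrence count within the same session.
weights = {tuple(sorted(e)): 1.0 / c for e, c in co.items()}
print(weights)  # {('browser', 'mail'): 0.5, ('browser', 'maps'): 1.0, ...}
```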
Discussion. The above studies proposed to measure the similarity between the apps installed on users' smartphones and other apps in the market, and used it as the basis for app recommendations. Here, the similarity measurement can be based either on explicit information or on the apps' usage information. In the former case, the textual description of the apps is often used to measure the similarity of the apps' content. In the latter case, basic usage information, e.g. the number of days of use, an indicator of usage behavior, or the consecutive usage of two apps, is used to measure the similarity of the apps' functionality. Even though making recommendations based on app similarity has been found to be effective in some contexts, it is worth noting that such approaches are inherently limited in the following aspects. Firstly, the recommendations might not be effective because very similar apps can hardly offer complementary functionality and additional value to the user. Secondly, for particular app categories such as social networking apps, a user's decision to accept and install an app may not depend on content similarity, since social networking apps serve a similar purpose and other unaccounted-for key factors could matter, such as the number of his/her friends using the same social networking app. Therefore, there exist studies that explored latent factors that could affect users' decisions on accepting a recommended app and built these into the recommendation algorithms for improved outcomes. We discuss these studies in the following subsections.
Contextual information, e.g. a user's mobility status, location and time of the day, was also found to be useful in app recommendations [94, 95, 54, 96]. Kaji et al. [94] developed an iPhone app, named App.Locky, for context-aware app recommendations. This app requested users to select their current context from a predefined context-tag tree. The relevant apps were chosen by combining the context information from the context-tag tree with the reviews of the apps from the web, modeled by the tf-idf method [92]. Davidsson et al. [95] developed an app recommendation system by combining both the context and users' feedback. The context information, e.g. the current location, was applied to solve the cold-start problem [51] when no app usage information was initially available.
Besides these two context-based app recommendation systems, the following two studies [54, 96] applied tensor factorization [97] in app recommendations. The key idea was that the usage information, given by a user-app-context tensor $\mathcal{M} \in \mathbb{R}^{n \times r \times l}$, had high dimensionality, where each entry $p_{ijk}$ of $\mathcal{M}$ was a binary value indicating the observation of app usage in the corresponding context. Tensor factorization decomposed $\mathcal{M}$ into three low-dimensional matrices representing profiles for the users, $U \in \mathbb{R}^{n \times d}$, the apps, $V \in \mathbb{R}^{r \times d}$, and the contexts, $C \in \mathbb{R}^{l \times d}$, respectively. Thus, user $i$'s preference for app $j$ given context $k$ was the product of the corresponding profile vectors.
The difference between the above two tensor factorization studies lay in the different objective functions used to compute the three profile matrices. Specifically, Karatzoglou et al. [54] proposed an objective function based on the observation that repeatable usage patterns should be emphasized. In the user-app-context tensor, each entry $p_{ijk}$ was only a binary indicator denoting the presence of a predefined usage pattern, with no emphasis on frequently observed usage patterns over rarely observed ones. Therefore, Karatzoglou et al. [54] proposed a weighting mechanism for each observed usage pattern, assigning higher weights to more frequently observed patterns. The profile matrices $U$, $V$ and $C$ were computed by minimizing the objective function
$$\sum_{i=1}^{n}\sum_{j=1}^{r}\sum_{k=1}^{l} w_{ijk}\Big(p_{ijk} - \sum_{d=1}^{D} U_{id} V_{jd} C_{kd}\Big)^{2}$$
together with some penalty terms, where $w_{ijk}$ denotes the weight for the usage pattern $p_{ijk}$.
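To make the objective concrete, the following sketch performs plain gradient descent on $U$, $V$ and $C$ for a toy weighted tensor; the dimensions, the weighting scheme and the quadratic penalty are illustrative assumptions and simpler than the optimization actually used in [54].

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, l, D = 20, 15, 5, 4                          # users, apps, contexts, latent dim
P = (rng.random((n, r, l)) < 0.05).astype(float)   # binary usage tensor p_ijk
W = 1.0 + 4.0 * P                                  # heavier weight on observed patterns

U, V, C = (0.1 * rng.standard_normal(s) for s in ((n, D), (r, D), (l, D)))
lr, lam = 0.05, 0.01
for _ in range(200):
    R = np.einsum("id,jd,kd->ijk", U, V, C) - P    # reconstruction residual
    G = W * R                                      # weighted residual
    gU = np.einsum("ijk,jd,kd->id", G, V, C) + lam * U
    gV = np.einsum("ijk,id,kd->jd", G, U, C) + lam * V
    gC = np.einsum("ijk,id,jd->kd", G, U, V) + lam * C
    U, V, C = U - lr * gU, V - lr * gV, C - lr * gC

# User 0's predicted preference for app 1 in context 2:
print(np.einsum("d,d,d->", U[0], V[1], C[2]))
```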
Differing from the above method, Shi et al. [96] solved for the three profile matrices by optimizing an objective function defined by a smoothed Mean Average Precision (MAP) function [98]. The smoothed MAP approximated the app ranking by applying a logistic function within the original MAP [92]. A fast-learning tensor factorization MAP algorithm was proposed to solve for the three profile matrices by choosing only a subset of apps for each user-context pair in the optimization process. This subset included not only the apps possessing the expected usage patterns for a given user-context pair, but also the top-ranked apps for which the pair was not observed in the past data.
Discussion. Even though making recommendations by solving a user-app-context tensor is popular, we observe that it can encounter the issue of a large context space, which requires a large amount of available data so that the matrices are not too sparse to solve. Here, the context often refers to the explicit features in Table 2, and we can see from that table that the information sources for the context can be numerous. For instance, the contexts can include spatial data, temporal data, and various smartphone settings and statuses; moreover, each data source can be further decomposed into many subcategories. For example, smartphones can produce three sources of spatial information, namely GPS, cell towers and WiFi APs. Each of them describes the location at a different resolution, and the complexity increases further when the phone status is jointly considered. All these factors contribute to the complexity and result in a large context space. Resolving this issue requires careful categorization, discretization and interpretation of the data into a limited and meaningful number of contexts. In addition, feature selection can be employed to evaluate the merit of each context feature, so that a compact set of key context features can be identified for designing context-aware recommendations. Such an idea has been explored in [42, 44].
The works in [99, 52, 100, 101, 102] recommended apps by inferring users' app preferences from their usage behaviors, e.g. the frequency of app usage, the regularity of updating apps, and the ratings of the apps. In particular, Yu et al. [99] estimated the apps' real values, i.e. users' preference for the apps, based on installation logs. The relations between users and apps were inferred with the Hyperlink-Induced Topic Search (HITS) algorithm [103], which iteratively estimated each user's preference score for the apps and a preference score of each app among all users. In the following, we discuss app recommendation studies based on the idea of modeling users' preferences for particular new apps.
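Before that, we illustrate the HITS-style mutual reinforcement [103] just described with a small sketch that alternates between user and app scores on a toy user-app installation matrix; applying HITS to installation logs in exactly this form is our simplification of [99].

```python
import numpy as np

rng = np.random.default_rng(2)
A = (rng.random((8, 5)) < 0.4).astype(float)   # users x apps installation matrix

user = np.ones(A.shape[0])
for _ in range(50):
    app = A.T @ user                           # app score: sum of its installers' scores
    app /= np.linalg.norm(app)
    user = A @ app                             # user score: sum of installed apps' scores
    user /= np.linalg.norm(user)
print(app.round(3))                            # per-app preference scores after convergence
```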
Yin et al. [52] proposed an Actual Tempting (AT) model to characterize app usage patterns, and this model was further applied to app recommendations. The AT model aimed to estimate users' preferences for installed apps and the potential of installing new apps, where the former were defined as actual values and the latter as tempting values. Whether the user would download and install a candidate app was modeled as a competing behaviour between the candidate app and the user's installed apps. This competing behaviour was modeled as a Bernoulli process, whose parameters were determined by an existing app's actual value and a candidate app's tempting value. Given a list of installed apps from a user, the ranking of the candidate apps was determined by computing and sorting the weighted expectations of the competing results.
Böhmer et al. [100] evaluated users' app preferences based on four different interaction stages. Specifically, the first stage was viewing apps in the recommendation list. Once the user showed some level of interest in an app, she might choose to install it, which moved the interaction to the second stage. If the user launched the app within a short period of time after the installation, the interaction moved to the third stage. The last stage was the long-term usage stage, indicating a consistent interest in the installed app. Among the four interaction stages, the long-term usage stage was the only one inferred from app usage data, based on both the relative usage and the ownership days. The recommendation engine embedded different modeling methods, which were selected depending on the context. The relatively simple models included recommendation by app popularity and usage-based collaborative filtering. The context-aware methods included app-aware filtering, location-aware collaborative filtering, and time-aware filtering. The evaluation of each recommendation method was based on both the average number of apps in the recommendation list observed at each stage, and the conversion rate of apps from one stage to the next.
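The stage-conversion evaluation can be summarized in a few lines; the counts below are invented for illustration.

```python
stages = {"viewed": 200, "installed": 60, "launched": 45, "long_term": 20}
names = list(stages)
for a, b in zip(names, names[1:]):
    print(f"{a} -> {b}: {stages[b] / stages[a]:.0%}")
# viewed -> installed: 30%, installed -> launched: 75%, launched -> long_term: 44%
```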
We note that the previous studies [52, 100] assumed that users' preferences for apps were constant over time. Under this assumption, the recommendation model can effectively learn users' preferences from historical data for app recommendations. The following studies considered the evolution of users' preferences over time and adapted the recommendation model dynamically.
Jang et al. [104] recommended apps based on the user's dynamic preference over time. The authors noted that context factors were important in app usage modeling and that "time was one of the most easily measured factors in mobile environment". The basic idea was to recommend apps that were similar, based on their descriptions, to the apps that the user was actively using. The similarity of apps was measured as a combination of two components: 1) the temporal app usage statistics of existing apps against new apps, and 2) the general similarity between a new app and existing apps. The temporal app statistics were modeled by a Topic-Over-Time model, which combined a probabilistic topic model and time factors within a Bayesian framework. The general similarity was modeled by an LDA model [79] to learn users' general preferences towards installing and using the apps.
Lin et al. [102] proposed a recommendation method based on matching the topics that users were interested in with the topics that the apps related to. The key idea was that an app's topics could change across versions; thus, the app's topics in each version should be learned specifically. For each app, version-dependent features were extracted based on its textual description, its "version category" (major change, minor change, or maintenance), its genre information and its ratings. The topics were then inferred using either latent Dirichlet allocation (LDA) [79] or Labeled-LDA [105]. The importance of the latent topics associated with each genre was measured by ratings and topic probabilities. Each user's preference was measured by the latent topics present in the apps that this user had used. Thus, the score between each user and each app was calculated based on the app's topic distribution, the weights of the genres of the app's topics, and the user's preference. This score was further integrated into conventional recommendation techniques, such as collaborative filtering [106], for app recommendations.
As apps might require access to various personal data stored on a user's phone [107], this can raise privacy concerns. If an app needs to access more of a user's sensitive information than its functionality requires, a privacy-sensitive smartphone user may choose not to install it [108]. The following two studies made app recommendations by taking privacy perspectives into account, such as the data access permissions of an app and users' privacy risk [101, 109].
Liu et al. [101] jointly modeled the matching of app functionality with users' interests, and the access permissions required by the apps against users' privacy risk. The rationale is that even if certain apps match users' interests well, they should not be recommended to a user if they request access to more personal information than their functionality is perceived to require. In the proposed method, latent factors are introduced to represent a user's interest and privacy preference, respectively. For each app, one functionality latent factor and one privacy latent factor are introduced to model the app's functionality and its privacy score based on the requested data permissions. A user's preference score for an app is thus computed as the product of the user's latent factors and the app's latent factors, yielding privacy-aware app recommendations.
Zhu et al. [109] recommended apps by considering both the privacy risk and the popularity of the apps. The risk of an app is related to the permissions that the app requests, and the risk score for each app is iteratively estimated using an app-permission bipartite graph. Given each app's risk score and popularity score, the app ranking is based on a hybrid principle combining the two scores, inspired by modern portfolio theory [110]. Specifically, an app portfolio simply refers to a list of apps, each assigned a weight. These weights are collectively estimated by optimizing an objective function proportional to the difference between the expected popularity of the apps and their expected risk scores. The subsequent recommendation picks the top-ranked apps by their weights, in descending order, within a category.
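A toy rendering of this popularity-risk trade-off is shown below; the softmax weighting over (popularity − λ·risk) is our own closed-form simplification, not the portfolio optimization actually used in [109], and all numbers are invented.

```python
import numpy as np

popularity = np.array([0.9, 0.7, 0.8, 0.4])
risk       = np.array([0.8, 0.2, 0.45, 0.1])   # permission-based risk scores
lam = 1.0                                      # risk-aversion trade-off

util = popularity - lam * risk                 # expected popularity minus expected risk
w = np.exp(util) / np.exp(util).sum()          # portfolio-style weights, summing to one
print(np.argsort(w)[::-1])                     # ranked apps: [1 2 3 0]
```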
A few studies [46, 111] proposed to train app recommendation models using app usage data from a user's friends. The work in [46] proposed an app recommendation system named Which App?. In this system, app recommendations were made by combining five different filtering algorithms according to app usage information: 1) content-based recommendations, 2) collaborative recommendations, 3) social content-based recommendations, 4) social collaborative recommendations, and 5) context-aware recommendations. The data used included descriptions of apps obtained from either the Android Market or AppBrain14, individuals' preferences for the apps obtained from their ratings, and app usage statistics.
14 www.appbrain.com
Girardello et al. [111] developed an app (AppAware) to make app recommendations based on different usage behaviors, namely uninstallation, installation and app updates. These three behaviors were considered important preference indicators for a given user. By assigning the highest value to the update behavior and the lowest value to the uninstallation behavior, the apps were ranked by the mean values aggregated over all users' data. Users could also share their app usage information with their social network, such as their Twitter followers, through a built-in connection.
Lin et al. [51] proposed to use information from Twitter followers to recommend apps, in order to overcome the cold-start issue. As mentioned earlier, the cold-start issue arose at the initial stage when apps were released to the market and there were very few or even no app operational data and user ratings available. The authors [51] reported that, before being rated, some apps already had a large number of followers on their official Twitter accounts, and these followers effectively formed a social group sharing a similar interest in the same app. The LDA method [79] was then applied to model individuals' preferences for apps. Here, each user was represented by a pseudo-document comprising a list of pseudo-words, where each word was formed by a pair of values: a follower of the app's official Twitter account and a binary indicator of this follower's preference for the app. Based on the pseudo-documents, LDA was applied to infer latent groups, which were further used to estimate an individual's preference for a given app.
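The pseudo-document construction can be sketched as follows, using scikit-learn's LDA as a stand-in for the paper's own inference; the toy users and follower pairs are invented.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Each pseudo-word pairs a follower of the app's Twitter account with a
# binary preference indicator (toy data).
users = {
    "u1": [("followerA", 1), ("followerB", 1)],
    "u2": [("followerA", 1), ("followerC", 0)],
    "u3": [("followerD", 1), ("followerB", 0)],
}
docs = [" ".join(f"{f}_{liked}" for f, liked in words) for words in users.values()]

X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)          # per-user latent group memberships
print(theta.round(2))
```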
Pan et al. [112] predicted app installations using a composite network constructed from the networks underlying users' historical app usage data. The basic idea was that the underlying networks between users can help spread influence towards particular apps and promote their downloads and installations. Based on this idea, various networks were constructed from smartphone usage data, namely a call log network, a bluetooth proximity network, a friendship network, an affiliation network, and a composite network, the last of which was established as a weighted combination of the others. The conditional probability that a given user would install a given app was modeled by an exponential function with a number of parameters as input arguments. These parameters modeled the aggregated behaviors of how a user's social network connections had collectively behaved upon receiving notification of his/her installation of a particular app.
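A toy version of such an exponential model is sketched below; the network signals, the weights and the particular link function are our assumptions for illustration, not the exact parameterization in [112].

```python
import math

# Per-network evidence that a user's contacts installed the app (toy values).
signals = {"call_log": 0.2, "bluetooth": 0.5, "friendship": 1.0, "affiliation": 0.1}
weights = {"call_log": 0.8, "bluetooth": 0.5, "friendship": 1.2, "affiliation": 0.3}

z = sum(weights[k] * v for k, v in signals.items())  # composite-network score
p_install = 1 - math.exp(-z)                         # exponential link keeps p in [0, 1)
print(round(p_install, 3))                           # 0.806
```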
Discussion. Social information is generally regarded as useful for app recommendations. This is because human beings are inherently social, and our preferences for new things are naturally influenced by those to whom we are close, such as friends and relatives. When a user group shares a similar interest in using a certain type of app, e.g. gaming apps, this becomes an important cue for recommendations, because friends' app installation actions influence a user more strongly than other information does [111]. We should also note that not all users in a social setting are influencers: some users tend to be influencers, while the majority tend to be followers in terms of app installation and usage. Identifying the influencers and such social behavioral patterns, e.g. through Twitter's network, is also helpful for app recommendations. Nevertheless, directly inferring users' decisions on app installation from their social networks may be a challenging task. Some users are more open to trying out new apps by nature, while others might simply be inactive and not open to such exploration [112]. Social information can be used more effectively in app recommendations when combined with users' preference and privacy traits characterized from their historical app usage data.
Achieved Performances. Table 4 shows the performances achieved in the app recommendation studies that we have discussed. Compared with the next-app prediction studies in Table 2, we find that the data sets used for app recommendation studies typically involved significantly more users and more apps. The maximum number of smartphone users analyzed is 101K, and the maximum number of candidate apps reaches 170K. This is partly because recommendation studies often deal with a much larger set of candidate apps than next-app prediction studies. Correspondingly, the list of recommended apps for installation is often longer than the list of next-app candidates, with K typically ranging from 10 to 100. The large number of users is also necessary to allow comprehensive training and testing of a particular recommendation algorithm.
Like typical recommendation studies, precision and recall on the top-K candidates are often used for evaluating app recommendation studies. Besides these, other metrics such as the normalized item novelty measure (nITN) [50], Mean Average Precision (MAP) [54], Ppos@K [52] and the normalized discounted cumulative gain (NDCG) [109] have also been suggested for evaluating app recommendation studies.
We also note that a number of works [46, 111, 95, 94, 90, 99] did not report the exact accuracy achieved by their proposed app recommendation systems; we hence do not include these works in Table 4. The results in the last column of Table 4 show the performances achievable as reported by these state-of-the-art app recommendation algorithms. However, we do not recommend comparing these numbers directly, given the different settings, data sets and design concepts.
5. Other Studies
In addition to the main research streams on app usage prediction and app recommendations, we also briefly touch on several related topics, namely classifying apps [67] and retrieving users with similar usage patterns [113, 114, 115]. These works likewise involve discovering the underlying app usage patterns from raw smartphone records, and the techniques proposed could benefit app usage prediction and recommendations.
Zhu et al. [67] applied a Maximum Entropy (MaxEnt) model to classify apps into different categories by combining context information from the web with individuals' app usage information. The context information included both explicit and implicit feedback from the web. The explicit feedback for each app consisted of the top search results from a search engine; based on these outcomes, two measures, namely the general label confidence score and the general label entropy, were proposed to evaluate the likelihood of the app being classified into a given category. The implicit feedback consisted of latent topics, learned with an LDA model [79] by considering words with similar meanings. In addition, two pieces of contextual information, namely pseudo feedback from context vectors and frequency patterns, were extracted from individuals' app usage records; the former referred to the context feature-value pairs recorded when using the apps, and the latter to the relevance of different contexts. Two baseline methods, namely a word-vector-based app classifier [113] and a hidden-topic-based app classifier [116], were compared with the MaxEnt model using different features. Combining both the web knowledge and the contextual features, the MaxEnt model showed the highest precision and recall rates and outperformed the two baselines.
The following studies [113, 114, 115] retrieved users with similar app usage patterns, and the first two [113, 114] explicitly considered the sparseness of app usage data. Cao et al. [114] drew an analogy between mining app usage patterns and association rule mining; the major difference was that mining app usage patterns was supported by context spanning, which refers to the span of the same context over a temporally adjacent range. Ma et al. [113] addressed the sparseness of app usage patterns by reducing the feature space in two steps. Firstly, the location context was represented by two types of semantic meaning, namely home and workplace, and each app was represented by its category. Secondly, the current user behavioral patterns were mapped to common usage habits, namely hyper behavioral patterns, using a constrained Bayesian matrix factorization to further reduce the space of behavioral patterns. The resulting usage patterns could be applied to discover users who exhibited similar app usage behaviors.
Do et al. [115] modeled app usage patterns with an author-topic model [117] to analyze users' daily app usage patterns and to retrieve similar users in a database. The raw usage records were represented as a list of three-element tuples comprising the app name, the usage time in one of four time slots, and the usage frequency quantized into four levels. With this representation, an author-topic model was applied to infer hidden topics from these app usage records, which were further used to measure user similarity and retrieve similar users.
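The tuple representation can be sketched as follows; [115] quantizes into four time slots and four frequency levels as described, but the slot boundaries and frequency cut-offs below are illustrative choices of ours.

```python
def time_slot(hour):                     # four coarse day segments
    return ("night", "morning", "afternoon", "evening")[hour // 6]

def freq_level(count, cuts=(1, 5, 20)):  # quantize launch counts into four levels
    return sum(count > c for c in cuts)

record = ("maps", 14, 7)                 # app name, hour of day, daily launch count
token = (record[0], time_slot(record[1]), freq_level(record[2]))
print(token)                             # ('maps', 'afternoon', 2)
```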
6. Challenges and Opportunities
We now discuss three key challenges, each tied to promising opportunities, in this emerging field of mining smartphone usage patterns.
Unified Benchmarking Framework. Similar to many other research fields at their initial stage, current works in smartphone usage mining typically focus on the exploration and ideation of new concepts to fuel innovative applications, with initial experimental validation. However, the field has not yet converged to the point where different algorithms are directly comparable under a unified benchmarking framework. This is fairly evident in our summaries of next-app prediction studies in Table 2 and app recommendation studies in Table 4: we can observe substantial heterogeneity between the studies, as they are typically performed under unique settings with tailored evaluation metrics. This heterogeneity hinders our ability to answer deeper questions, such as which features and methods are better for which types of problems, and how large the comparative advantage is in a quantitative sense. For app prediction and recommendation, some studies are performed on fairly large-scale data sets, and their reported performances appear more trustworthy than those of small-scale studies. We expect to see more open mobile data acquisition platforms become available and more large-scale data sets shared with the research community. This will start to establish an open and unified benchmarking framework, enabling researchers in this domain to test their algorithms and compare them with others easily and in a standard way. Such a framework and its common adoption will help fuel the next major growth in this emerging area of smartphone usage data mining.
Privacy Preservation. When accessing, sharing and publishing individuals' private data, privacy preservation is always an important aspect to consider and plan for [118, 119]. For instance, in [59] and [14], more than 1000 individuals' phone usage records were collected, and the privacy of this large number of users was preserved through anonymization, which replaces personally identifiable information with unrecognizable symbols. In this way, not only are multiple sets of users' data delinked, but the possibility that personal identities get revealed through other publicly available auxiliary data is also minimized. From a user's perspective, she might be unwilling to consent to sharing her private usage information unless the privacy aspect is securely taken care of. Therefore, in practical settings, we rarely see a publicly shared smartphone data set containing personally identifiable information such as actual cell tower IDs or WiFi MAC addresses. Given only anonymized data, however, the challenging research issues become how to uncover meaningful smartphone usage patterns and how to tap into the patterns found across multiple smartphone users to build intelligent applications that can benefit individual users.
Energy Efficiency. As people spend more time with their smartphones, energy consumption becomes another issue to be considered. Even though next-app prediction and fast app-launching UIs bring convenience to smartphone interaction, they come at the cost of additional battery energy consumption. Among the app usage studies discussed, only a few works in Section 3.4 consider the energy efficiency of their proposed prediction algorithms. In contrast, studies of continuous sensing with smartphones for activity recognition have already taken energy efficiency into consideration [120, 121, 122, 123, 124, 125, 126]. We anticipate that future app usage mining works will take energy usage as a key consideration.
When deploying an app usage system or algorithm on the phone, a simple server-client model can move the computational load to a server and transmit the results back to the smartphone. However, the connection to the server or cloud consumes battery energy, and a user may not always be connected. Therefore, energy consumption has to be weighed in the choice between: 1) designing a simple yet efficient mining algorithm that runs on an individual's smartphone without substantially compromising predictive performance, or 2) optimizing predictive performance and moving the complex computational load to the server. In addition, when deciding which part of the computation should be done remotely, factors such as data transmission cost, latency and the merits of centralized processing should also be carefully considered in designing an intelligent mobile system.
phone uses. Recent works [128, 129, 130] used non-traditional biometrics, e.g. pressure on the touch screen, for mobile user authentication.
Meanwhile, users' daily behaviors typically contain regular patterns, and such patterns are repeatedly observed in the data logged by their smartphones. Along this line, the work in [131] showed that users can be identified by their spatial-temporal mobility patterns learned from smartphone data. Lin et al. [132, 133] explored mobile user verification through mining anonymized time-series location-ambience data.
The current approaches for mining smartphone data, as reviewed in our survey, are mostly at an early, syntactic level. There is also a need to interpret human behaviors from smartphone usage data at a higher, semantic level. Interesting semantic meanings include the high-level activities performed at different locations, such as dining in a restaurant or meeting friends in a coffee shop. We observe that the main challenge in interpreting smartphone usage at the semantic level is collecting smartphone usage data along with semantic labels. In this survey, we have reported different ways of collecting smartphone usage data non-intrusively, without any input from users. However, obtaining semantic information largely requires users' cooperative input, with the timing and frequency optimized. Due to the difficulty of obtaining semantic labels, existing studies [134, 135, 136, 6, 5], to the best of our knowledge, have only considered interpreting some generic location-relevant semantics, namely home, working place, and the regular mobility paths between these locations.
Acquisition of semantic labels can be done by crawling social networks, e.g. Facebook and Twitter. Active social network users publish their activities with location information, and such information can be integrated as input to interpret smartphone data, for example to infer the purpose of a visit. However, since activity sharing on social networks is entirely voluntary, data quality issues such as inconsistency, sparseness and missing data are to be expected. Analyzing and deriving high-quality semantic labels from users' social network sharing is hence a non-trivial task. Moreover, such high-level semantics have to be time-synchronized and integrated with the users' smartphone data acquisition period before semantic modeling and data interpretation can take place.
7. Conclusions
Acknowledgement
References
[1] N. Eagle, A. Pentland, Reality Mining: Sensing Complex Social Systems, Personal Ubiquitous Comput. 10 (4) (2006) 255–268.
[5] X. Bao, N. Z. Gong, B. Hu, Y. Shen, H. Jin, Connect the dots by understanding user status and transitions, UbiComp '14 Adjunct, 2014, pp. 361–366.
[11] S. Butt, J. G. Phillips, Personality and Self Reported Mobile Phone Use, Comput. Hum. Behav. 24 (2) (2008) 346–360.
[15] N. Henze, S. Boll, Release Your App on Sunday Eve: Finding the Best Time to Deploy Apps, MobileHCI, 2011, pp. 581–586.
[21] K. Athukorala, E. Lagerspetz, M. von Kügelgen, A. Jylhä, A. J. Oliner, S. Tarkoma, G. Jacucci, How Carat Affects User Behavior: Implications for Mobile Battery Awareness Applications, CHI, 2014, pp. 1029–1038.
[26] Y. Chon, W. Ryu, H. Cha, Predicting smartphone battery usage using cell tower ID monitoring, Pervasive Mob. Comput. 13 (2014) 99–110.
[32] V. Etter, M. Kafsi, E. Kazemi, M. Grossglauser, P. Thiran, Where to go from here? Mobility prediction from instantaneous information, Pervasive and Mobile Computing 9 (6) (2013) 784–797.
[40] A. Girardello, F. Michahelles, Explicit and Implicit Ratings for Mobile Applications, GI-Jahrestagung, 2010, pp. 606–612.
[42] C. Shin, J.-H. Hong, A. K. Dey, Understanding and Prediction of Mobile Application Usage for Smart Phones, UbiComp, 2012, pp. 173–182.
[43] Z.-X. Liao, Y.-C. Pan, W.-C. Peng, P.-R. Lei, On Mining Mobile Apps Usage Behavior for Predicting Apps Usage in Smartphones, CIKM, 2013, pp. 609–618.
[44] Z.-X. Liao, S.-C. Li, W.-C. Peng, P. S. Yu, T.-C. Liu, On the feature discovery for app usage prediction in smartphones, ICDM, 2013, pp. 1127–1132.
[51] J. Lin, K. Sugiyama, M.-Y. Kan, T.-S. Chua, Addressing Cold-start in App Recommendation: Latent User Models Constructed from Twitter Followers, SIGIR, 2013, pp. 283–292.
[52] P. Yin, P. Luo, W.-C. Lee, M. Wang, App recommendation: A contest between satisfaction and temptation, WSDM, 2013, pp. 395–404.
[55] M. Lin, W. Hsu, Mining GPS data for mobility patterns: A survey, Pervasive and Mobile Computing 12 (2014) 1–16.
[60] E. Oliver, The challenges in large-scale smartphone user studies, HotPlanet '10, 2010, pp. 5:1–5:5.
[61] T. M. T. Do, J. Blom, D. Gatica-Perez, Smartphone Usage in the Wild: A Large-scale Analysis of Applications and Context, ICMI, 2011, pp. 353–360.
[66] N. Natarajan, D. Shin, I. S. Dhillon, Which App Will You Use Next?: Collaborative Filtering with Interactional Context, RecSys, 2013, pp. 201–208.
[71] C. Sun, J. Zheng, H. Yao, Y. Wang, D. F. Hsu, AppRush: Using Dynamic Shortcuts to Facilitate Application Launching on Mobile Devices, Procedia Computer Science 19 (2013) 445–452.
[72] X. Zou, W. Zhang, S. Li, G. Pan, Prophet: What app you wish to use next, UbiComp '13 Adjunct, 2013, pp. 167–170.
[80] N. Friedman, D. Geiger, M. Goldszmidt, Bayesian network classifiers, Mach. Learn. 29 (2-3) (1997) 131–163.
[85] C. Zhang, X. Ding, G. Chen, K. Huang, X. Ma, B. Yan, Nihao: A Predictive Smartphone Application Launcher, MobiCASE, 2012, pp. 294–313.
[88] H. Zhu, H. Xiong, Y. Ge, E. Chen, Ranking Fraud Detection for Mobile Apps: A Holistic View, CIKM, 2013, pp. 619–628.
[89] Y.-X. Tong, J. She, L. Chen, Towards better understanding of app functions, Journal of Computer Science and Technology 30 (2015) 1130–1140.
[90] F. Wang, Z. Zhang, H. Sun, R. Zhang, X. Liu, A cooperation based metric for mobile applications recommendation, IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Vol. 3, 2013, pp. 13–16.
[92] C. D. Manning, P. Raghavan, H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
[99] P. Yu, C. man Au Yeung, App Mining: Finding the Real Value of Mobile Applications, WWW Companion '14, 2014, pp. 417–418.
[102] J. Lin, K. Sugiyama, M.-Y. Kan, T.-S. Chua, New and improved: Modeling versions to improve app recommendation, SIGIR, 2014, pp. 647–656.
[103] J. M. Kleinberg, Hubs, authorities, and communities, ACM Comput. Surv. 31 (4es) (1999) 5.
[109] H. Zhu, H. Xiong, Y. Ge, E. Chen, Mobile app recommendations with security and privacy awareness, KDD '14, 2014, pp. 951–960.
[113] H. Ma, H. Cao, Q. Yang, E. Chen, J. Tian, A Habit Mining Approach for Discovering Similar Mobile Users, WWW, 2012, pp. 231–240.
[114] H. Cao, T. Bao, Q. Yang, E. Chen, J. Tian, An Effective Approach for Mining Mobile User Habits, CIKM, 2010, pp. 1677–1680.
[115] T.-M.-T. Do, D. Gatica-Perez, By Their Apps You Shall Understand Them: Mining Large-scale Patterns of Mobile Phone Usage, MUM, 2010, pp. 27:1–27:10.
[116] X.-H. Phan, C.-T. Nguyen, D.-T. Le, L.-M. Nguyen, S. Horiguchi, Q.-T. Ha, A Hidden Topic-Based Framework toward Building Applications with Short Web Documents, IEEE Transactions on Knowledge and Data Engineering 23 (7) (2011) 961–976.
[121] S. Nath, Ace: Exploiting correlation for energy-efficient and continuous context sensing, MobiSys '12, 2012, pp. 29–42.
[124] H. Lu, J. Yang, Z. Liu, N. D. Lane, T. Choudhury, A. T. Campbell, The Jigsaw Continuous Sensing Engine for Mobile Phone Applications, SenSys '10, 2010, pp. 71–84.
[132] M. Lin, H. Cao, V. Zheng, K. C. Chang, S. Krishnaswamy, Mobile user verification/identification using statistical mobility profile, BigComp, 2015, pp. 15–18.
[134] J. Zheng, S. Liu, L. Ni, Effective routine behavior pattern discovery from sparse mobile phone data via collaborative filtering, PerCom, 2013, pp. 29–37.
[135] J. Zheng, S. Liu, L. M. Ni, Effective mobile context pattern discovery via adapted hierarchical Dirichlet processes, MDM, 2014, pp. 146–155.