SBRC 2019-2
Abstract. The city of São Paulo, the most populous in Brazil, is characterized
by urban segregation, which is responsible for numerous problems related to urban
mobility. Current efforts to solve urban mobility problems have not exploited
the potential of Social Networks. This work uses tweets to identify the bus lines
whose speed is impacted by exception events. To achieve this goal, we propose a
new methodology for detecting exception events that combines tweets published by
governmental institutions responsible for reporting such events with SPTrans's
(Transport Company of São Paulo) GTFS (Google Transit Feed Specification) and
AVL (Automatic Vehicle Location) data. We characterized 60,984 events and found
10,027 exception events that impacted 1,073 bus lines. In addition, we found that
social events have an average impact of 87.04% on the average speed of the bus
lines affected within a radius of 1,000 meters; urban events, 70.11%; accidents,
66.51%; and natural disasters, 59.77%.
1. Introduction
In São Paulo, 10% of the population lives in the Expanded Center area and 90% in
the Peripheral Belt [SÁ, T. H. et al. 2017], which characterizes an urban segregation
responsible for numerous problems related to urban mobility. Especially in segregated
cities, exception events can cause significant delays in public transport or even make
it unavailable. Exception events are events that happen sporadically or suddenly, such
as demonstrations, sporting events, floods, tree falls, fires, accidents, etc.
All the exception events mentioned above are reported by citizens and authorities
on Social Networks, which can be used by Smart City systems. As an example,
public transport can benefit from integrating Social Network content with its planning,
management and operational activities, addressing their respective sociotechnical
factors [Kuflik et al. 2017]. In this work we aim to use tweets to identify the
bus lines whose speed is impacted by exception events.
The new methodology detects messages published on social networks that refer to
exception events, automatically determines which bus lines will be affected, and
estimates how the speed of those lines will be affected. To achieve this objective we (I)
trained a model to classify exception events reported by the selected profiles and (II)
developed a process for address extraction and geolocation based on tweets. The extracted
addresses are (III) correlated with GTFS data (a format commonly used to describe public
transport data) from SPTrans (responsible for the bus lines of the municipality of São
Paulo) to find the bus lines impacted by exception events in the city of São Paulo, and
(IV) with AVL data to characterize the impact on speed. Using this methodology we
characterized 60,984 events and found 10,027 exception events that impacted 1,073 bus
lines. In addition, we found that social events have an average impact of 87.04% on the
average speed of the bus lines affected within a radius of 1,000m; urban events, 70.11%;
accidents, 66.51%; and natural disasters, 59.77%.
2. Related Work
Several works study how to process tweets in order to analyze problems related
to public transport. These studies can be classified into event impact analysis, planning
and management of public transport. For example, [Wen et al. 2016] used tweets to
analyze the impact of the terrorist attacks in Paris (2015) on mobility patterns regarding
the use of public transport. Similarly, [Itoh et al. 2016] developed a tool based on tweets
to visualize and explore the decisions of passengers of the Tokyo Metro before abnormal
events such as typhoons, fires, earthquakes, etc. In this same context, [Ni et al. 2016]
proposed a technique to predict passenger flow in the New York Metro and identify events
based on hashtags. [Chen et al. 2016] studied the relationship between traffic events and
the demand for bicycles.
Regarding public transport planning and management, [Mukherjee et al. 2015]
present a platform developed and used by the Bangalore Public Transport Agency, which
allows issues related to public transport to be reported, improving operation planning
and the service provided to the population. Analogously, [Gutev and Nenko 2016] used tweets
to identify the popularity of points of interest and age distribution, in order to determine
the best points for bicycle stations and thus encourage the use of this mode of transport.
Also related to the points of interest, [Maghrebi et al. 2015] used tweets to identify human
activities patterns and their respective impacts on the demand for public transport.
In [Gal-Tzur et al. 2014] a hierarchical approach was created to classify tweets
related to transport. They have demonstrated that it is possible to use this information for
transportation planning and management purposes. This technique was applied in a case
study associated with sporting events in the United Kingdom. The hierarchy is composed
of three levels (I) tweets classified among those that express the need for transport ser-
vices, opinions and incidents; (II) identification of the transport category and (III) topics.
Another study that contributes to the planning of public transport is the one carried
out in [Gkiotsalitis and Stathopoulos 2015, Gkiotsalitis and Stathopoulos 2016], in which
tweets were processed to identify users' willingness to make leisure trips and to suggest
activities with shorter travel times and a lower probability of delays. Another relevant
point considered was the level of access to public transport, which, when high, positively
impacts people's happiness and correlates with positive feelings, according to the sentiment
analysis carried out by [Guo et al. 2016] using tweets published in Greater London.
None of the works presented tackles the identification of different types of exception
events from tweets published by an authority in order to characterize the impact on the
speed of buses in São Paulo. In this work we propose a new method, explained below, to deal
with this problem. The cited works are connected to ours in aspects related to tweet processing
for analysis of the impact of events on public transport, planning and management.
3. Social Networks
Social Networks (SN) can be defined as networks that have many relationships,
with large connected components, high clustering coefficients and a high degree of
reciprocity. Such features are found, e.g., on Facebook1. Another SN is Twitter2, which
besides having the social networking features mentioned can also be characterized as an
Information Network. In this type of network the dominant interaction is the dissemination
of information between relationships, with a low reciprocity index [Myers et al. 2014].
On Twitter, information is published as tweets of at most 280 characters. Each tweet
can receive retweets (shares by other users), comments (directly in the tweet, as replies,
or privately via the message box) and likes (an indicator of how many users liked the
post). In addition, tweets may contain mentions of other users (@profile) and labels
(#hashtag) indicating subjects, categories, etc. Due to these characteristics, Twitter has
become an important social network for sharing information and everyday events. Such
events can be classified as social events, ranging from routine events to crisis situations
(natural disasters, social mobilizations, among others) [Zhou and Chen 2014, Atefeh and
Khreich 2015].
The concept of Smart Cities (SC) has been defined mainly as sustainable and
socially inclusive cities [Wang et al. 2016], which use Information and Communica-
tion Technologies (ICTs) to efficiently manage natural resources, energy, transportation,
waste, etc. [Ahvenniemi et al. 2017]. ICTs permeate urban systems and physical spaces,
a trend accentuated by the increasing number of sensors and devices connected
to the Internet of Things (IoT), by volunteered data, and by existing SN content about
daily events. Such heterogeneous sources generate large amounts of data, used to develop SC
services [Finger and Razaghi 2017, Ang et al. 2017].
The development of SC services has challenges related to connectivity (network
infrastructure, interoperability and standards, power consumption and scalability) and re-
lated to data (capacity and location of data storage, extraction, processing, analysis, in-
tegration and aggregation). Besides, data analysis has issues related to correlation, infer-
ence of data from different domains, machine learning, real-time processing, and new-use
proposals for data from existing infrastructures [Ang et al. 2017, Xiao et al. 2017].
In the public transport context, GTFS3 is a specification of a common format for
exchanging static information on public transport, which solves the problem of
interoperability and standards for public transport data. A feed specified in static GTFS
consists of text files (which follow requirements similar to the CSV format) compressed
in ZIP format. In this research we correlate SPTrans's static GTFS and AVL data (i.e.
location data related to each bus) with tweets from the selected accounts.
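
Because a static GTFS feed is a ZIP archive of CSV-like text files, it can be read with standard tooling. The following is a minimal sketch, not the code used in this work: it builds a tiny in-memory feed whose route values are made up, and assumes only the standard GTFS file name routes.txt and its fields route_id, route_short_name, route_long_name and route_type (where 3 denotes a bus).

```python
import csv
import io
import zipfile


def read_gtfs_table(feed, table_name):
    """Return the rows of one GTFS text file (e.g. routes.txt) as dicts."""
    with zipfile.ZipFile(feed) as archive:
        with archive.open(table_name) as raw:
            # utf-8-sig drops a byte-order mark if the file starts with one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


def build_demo_feed():
    """Build a tiny in-memory feed; the route values are hypothetical."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "routes.txt",
            "route_id,route_short_name,route_long_name,route_type\n"
            "R1,0001-10,Hypothetical Line A,3\n"
            "R2,0002-10,Hypothetical Line B,3\n",
        )
    buffer.seek(0)
    return buffer


routes = read_gtfs_table(build_demo_feed(), "routes.txt")
# route_type "3" is the GTFS code for bus routes.
bus_routes = [r["route_short_name"] for r in routes if r["route_type"] == "3"]
```

With a real feed, the path of the downloaded ZIP file would be passed instead of the in-memory buffer.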
1 https://www.facebook.com. Accessed on December 9, 2018.
2 https://twitter.com. Accessed on December 9, 2018.
3 Google Transit: https://developers.google.com/transit. Accessed on December 11, 2018.
5. Natural Language Processing
Automatic classification of exception event tweets involves Natural Language
Processing (NLP), which explores how computers can be used to understand and manipulate
text or speech in natural language [Liu et al. 2017]. Before the NLP processing, the
tweets were preprocessed by removing URLs, date and time values, mentions, emoticons
and punctuation, in order to remove noise and to reduce the dimension of the feature space.
Particular attention was paid to hashtags, which are relevant to exception event
classification but add noise to the address extraction phase. To mitigate this
problem, hashtags are identified and replaced by empty spaces in the address extraction
process. It is important to note that hashtags are not removed from the original tweets.
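
The preprocessing described above can be sketched with regular expressions. This is an illustrative approximation rather than the implementation used in this work: the patterns, the function name preprocess and the keep_hashtags flag are assumptions, mentions are interpreted as @profile references, and the sample tweet is made up.

```python
import re

URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")
HASHTAG = re.compile(r"#\w+")
# Simple date/time shapes such as 14:30, 14h30 or 09/12/2018 (assumed formats).
DATETIME = re.compile(r"\b\d{1,2}[/:h]\d{2}(?:[/:]\d{2,4})?\b")
# Anything that is not a word character, whitespace or '#': punctuation, emoticons.
NOISE = re.compile(r"[^\w\s#]")
SPACES = re.compile(r"\s+")


def preprocess(tweet, keep_hashtags=True):
    """Return a cleaned copy of the tweet; the original string is untouched."""
    text = URL.sub(" ", tweet)
    text = MENTION.sub(" ", text)
    text = DATETIME.sub(" ", text)
    if not keep_hashtags:
        # For address extraction, hashtags are replaced by empty spaces.
        text = HASHTAG.sub(" ", text)
    text = NOISE.sub(" ", text)
    return SPACES.sub(" ", text).strip()


tweet = "Alagamento na Av. Paulista 14:30 #chuva @cetsp_ https://t.co/abc :("
for_classification = preprocess(tweet)
for_address_extraction = preprocess(tweet, keep_hashtags=False)
```

Keeping the hashtag switch as a parameter mirrors the text: classification sees the hashtags, address extraction does not, and the stored tweet is never modified.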
After the preprocessing phase we applied NLP techniques to the tweets: (I)
tokenization, the process of obtaining the words of a tweet, i.e. the tokens (features
used to train the classification model), removing numbers and characters that do not
belong to the alphabet (TweetTokenizer4); and (II) morphological decomposition, which
reduces an inflected word to a base form using lemmatization (identification of the
word's lemma) or stemming (identification of the root of the word using heuristics to
locate its inflection; RSLPStemmer5). Both are used to reduce the feature space, together
with the removal of Brazilian Portuguese stopwords6 (common words without meaning)
[Setiawan et al. 2017, Nadkarni et al. 2011, Korenius et al. 2004, Roy et al. 2017,
Collobert et al. 2011].
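
To make the three steps concrete, the sketch below uses dependency-free stand-ins. It is not the NLTK pipeline cited above: the stopword set is a small hypothetical subset, and the suffix list is a toy heuristic that only imitates the idea of a stemmer such as RSLPStemmer.

```python
import re

# Small hypothetical subset; the paper uses NLTK's Brazilian Portuguese list.
STOPWORDS = {"a", "o", "as", "os", "de", "da", "do", "na", "no", "em", "e"}
# Toy suffix rules, longest first; a real stemmer applies many more rules.
SUFFIXES = ("mente", "ções", "ção", "es", "s")
# Runs of letters only, so numbers and symbols never become tokens.
TOKEN = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return TOKEN.findall(text.lower())


def stem(token):
    """Strip the first matching suffix, keeping a root of at least 3 letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text):
    """Tokenize, drop stopwords, then stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOPWORDS]


features = extract_features("Manifestações bloqueiam 2 faixas da Avenida Paulista")
```

Stemming merges inflected variants such as singular and plural forms into one feature, which is what shrinks the vocabulary.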
6. Classification model
Finding exception events involves the identification of events related to an
exception, which is possible through classification. The following classes are often used
to classify exception events (which normally occur in a city) [Itoh et al. 2016, Chen et al.
2016, Lecue et al. 2014, Gal-Tzur et al. 2014]:
1. Accidents, e.g. accidents occurred at transport stations, fire, collision of vehicles, etc.
2. Time-space, e.g. day of the week (Mondays, Fridays and holidays), time of day (peak times), etc.
3. Social Events, e.g. street fairs, festivals, sport games, marches, marathons, etc.
4. Urban Events, e.g. related to traffic (deviations), road maintenance, etc.
5. Natural disasters, e.g. storms, earthquakes, typhoons, etc.
6. Meteorological, e.g. clear day, overcast, rainy, snowing, haze, (high and low) temperatures, etc.
Using these classes, 60,984 tweets from the selected accounts were manually
classified. The labeled data was transformed into a binary representation of features, which
was used to train a model that classifies tweets into exception events. The process of
constructing these features is known as feature engineering, which iterates over the phases
of feature extraction, feature construction, and feature selection. Before this iteration, the
data can be preprocessed using standardization, normalization, noise removal, dimensionality
reduction, discretization, expansion, etc.; note that information can be lost when
performing these transformations [Guyon and Elisseeff 2006].
4 NLTK module used for the tokenization process. https://www.nltk.org/api/nltk.tokenize. Accessed on December 9, 2018.
5 NLTK module used for the stemming process. https://www.nltk.org/_modules/nltk/stem/rslp. Accessed on December 9, 2018.
6 Brazilian Portuguese stopwords were obtained from NLTK: https://www.nltk.org. Accessed on December 19, 2018.
As mentioned in Section 5, feature extraction is performed in a preprocessing phase
through an NLP function. The feature construction and selection phases are not used
because these processes do not apply to the methodology of this work. After preprocessing,
the tweets are represented as a bag-of-words, which contains feature
vectors created using the Term Frequency - Inverse Document Frequency (TF-IDF) measure.
The bag-of-words is randomly partitioned into training (60%) and test (40%) sets,
which are the inputs to the classification algorithms mentioned in Section 7.
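
As an illustration of the representation and the 60/40 partition, the sketch below computes TF-IDF by hand and splits the result randomly. It uses one common textbook variant (term frequency times the natural logarithm of N over document frequency); library implementations typically apply smoothing and normalization, so the weights obtained in this work may differ. The function names, the seed and the toy documents are assumptions.

```python
import math
import random
from collections import Counter


def tfidf_vectors(documents):
    """Map each tokenized document to a {term: tf * idf} dictionary."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        if not doc:
            vectors.append({})
            continue
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def split_train_test(items, train_fraction=0.6, seed=0):
    """Shuffle a copy of the items and cut it at the training fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


docs = [
    ["acidente", "avenida"],
    ["chuva", "avenida"],
    ["acidente", "rua"],
    ["show", "estadio"],
    ["obra", "rua"],
]
vectors = tfidf_vectors(docs)
train, test = split_train_test(list(zip(docs, vectors)))
```

Terms that occur in few documents receive a higher weight than terms spread across the corpus, which is the property the classifier relies on.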
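\[
\text{Accuracy} = \frac{TP + TN}{P + N} = \frac{TP + TN}{TP + TN + FP + FN},
\qquad
\text{Precision} = \frac{TP}{TP + FP}
\]
\[
\text{Recall} = \frac{TP}{TP + FN},
\qquad
F_1\ \text{score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2\,TP}{2\,TP + FP + FN}
\]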
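
The four metrics above can be computed directly from the confusion counts. The sketch below is a worked example with made-up counts (they are not results of this work); the function name is an assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a binary task with 100 test examples.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For these counts the F1 score computed from precision and recall equals 2TP / (2TP + FP + FN) = 80 / 95, which confirms that the two forms of the formula agree.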
8. Data set
Corpus Twitter. The Social Network Twitter was chosen as the data source for the
construction of the data set related to the exception events. The choice is due to the fact
that each publication is limited to 280 characters, which reduces the complexity of
processing the published content, and because São Paulo's public agencies use it as an
instant channel of communication with citizens.
The data set used to identify the exception events is composed of tweets, written
in Brazilian Portuguese, published by the profiles listed in Table 1. We chose to use
tweets from official public service providers to guarantee the reliability of the analyzed
data, discarding retweets and replies. Thus, the data used relate to a unidirectional
communication channel (in the context of e-participation, i.e. interaction between citizens
and public authorities). Regarding profile selection, all accounts were manually selected
according to the institutions responsible for reporting exception events. Such profiles are
public in nature, so access to their tweets does not involve privacy issues.
7 We used the algorithms implemented by scikit-learn with the standard hyper-parameters. Hyper-parameter tuning is not the focus of this work.
8 https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation. Accessed on December 26, 2018.
Corpus SPTrans. The SPTrans (São Paulo Transportation Company)9 corpus comprises
data provided by SPTrans in GTFS format, detailed in Table 2, and geolocation data
(movements) of all the buses of São Paulo for the year 2017, obtained under
the access to information law10. Regarding the AVL data set, it is important to note
inconsistencies in two AVL files of January 11: according to the SPTrans metadata each
file must have 19 fields; however, the file with data from 09h to 10h has 21 fields in line
1,075,548, and the file with data from 10h to 11h has 35 fields in line 60,025.
These inconsistent lines were ignored in processing. The original data was
converted from string to its respective type (long, double, int or string), time values were
standardized as POSIX timestamps, and latitude and longitude data were
converted to legacy coordinate pairs11. To enable geospatial queries, geospatial
indexes11 were created in the MongoDB collections containing geolocated information.
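
The cleaning steps above can be sketched as follows. This is an assumption-laden illustration: the AVL column names (bus_id, timestamp, location), the timestamp layout and the use of UTC are hypothetical, since the SPTrans file schema is not given here. Only the expected count of 19 fields comes from the text, and the coordinate order follows MongoDB's legacy coordinate pair convention of longitude first.

```python
from datetime import datetime, timezone

EXPECTED_FIELDS = 19  # field count stated in the SPTrans metadata


def has_expected_width(fields, expected=EXPECTED_FIELDS):
    """Return True when a split line has the expected number of fields."""
    return len(fields) == expected


def to_document(bus_id, local_time, latitude, longitude):
    """Convert string values to typed values (hypothetical field names)."""
    moment = datetime.strptime(local_time, "%Y-%m-%d %H:%M:%S")
    # Assumption: timestamps are treated as UTC to keep the sketch simple.
    moment = moment.replace(tzinfo=timezone.utc)
    return {
        "bus_id": int(bus_id),
        "timestamp": int(moment.timestamp()),  # POSIX seconds
        # Legacy coordinate pair: [longitude, latitude].
        "location": [float(longitude), float(latitude)],
    }


rows = [["x"] * 19, ["x"] * 21, ["x"] * 35]
valid_rows = [row for row in rows if has_expected_width(row)]
document = to_document("1234", "2017-01-11 09:00:00", "-23.5614", "-46.6559")
```

Rows with 21 or 35 fields, like the two inconsistent lines reported above, are simply skipped by the width check.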
The second set ({[a-zÀ-ÿ ]+}) represents a filter to identify a set of words after L or
S, candidates to compose the wanted addresses.
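
A hedged sketch of this candidate filter follows. The first set of the expression is not shown in this excerpt, so the sketch assumes it matches a public place type such as "avenida" or "rua"; the trailing markers used to trim the candidate are also hypothetical, as is the sample tweet.

```python
import re

# Assumption: the first set matches public place types like these.
PLACE_TYPES = ("avenida", "rua")
# Second set from the text: lowercase and accented letters plus spaces.
CANDIDATE = re.compile(r"\b(" + "|".join(PLACE_TYPES) + r")\s+([a-zÀ-ÿ ]+)")
# Hypothetical recurring patterns that follow an address in the tweets.
TRAILING_MARKERS = (" sentido ", " altura ")


def extract_address(text):
    """Return 'place_type name' for the first candidate, or None."""
    match = CANDIDATE.search(text.lower())
    if match is None:
        return None
    place_type, words = match.group(1), match.group(2)
    # Cut the candidate words at the first known trailing pattern.
    for marker in TRAILING_MARKERS:
        position = words.find(marker)
        if position != -1:
            words = words[:position]
    return f"{place_type} {words.strip()}"


address = extract_address(
    "Acidente na Avenida Paulista sentido Consolação, faixa interditada"
)
missing = extract_address("Chuva forte na cidade")
```

The greedy second set captures every word up to the first punctuation mark, which is why the surrounding patterns have to be removed afterwards.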
These words are treated as candidates because it is hard to know how many words
after L or S belong to the address. However, the selected accounts publish tweets with
visible patterns in the text before and after the addresses. Consequently, one way
to find the wanted address is to remove these patterns before and after
the address. After address extraction, we used the Google Maps Geocoding API12 to
geolocate the found address (only 1.5% of tweets are geolocated [Niu et al. 2016]).
The HTTP response from this API is processed to obtain the values of location (which
contains latitude and longitude information) and formatted address.
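
Processing the geocoding response can be sketched as below. The sketch parses a sample JSON body instead of calling the service, so no API key or network access is involved. It relies on the response fields status, results, formatted_address and geometry.location with lat and lng; the address and coordinates in the sample are illustrative values, and the function name is an assumption.

```python
import json


def parse_geocoding_response(body):
    """Extract the formatted address and coordinates of the first result."""
    payload = json.loads(body)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    best = payload["results"][0]
    location = best["geometry"]["location"]
    return {
        "formatted_address": best["formatted_address"],
        "latitude": location["lat"],
        "longitude": location["lng"],
    }


# Illustrative response body; the values are not taken from the data set.
sample_body = json.dumps({
    "status": "OK",
    "results": [{
        "formatted_address": "Av. Paulista - Bela Vista, São Paulo - SP, Brazil",
        "geometry": {"location": {"lat": -23.5614, "lng": -46.6559}},
    }],
})
geocoded = parse_geocoding_response(sample_body)
not_found = parse_geocoding_response('{"status": "ZERO_RESULTS", "results": []}')
```

Returning None for an empty or failed response lets the caller count such tweets among those without an extracted address.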
11. Results
The methodology was applied to the Corpus Twitter15, which contains 60,984
tweets. After tweet preprocessing and processing, the corpus had 414,637 words,
with a vocabulary size of 13,915 words. All tweets were manually classified according to
the identified exception events. The data set is composed of the following labels: Accident,
Irrelevant (for non-exception events), Natural Disaster, Social Event and Urban Event.
This labeled data set was used to train exception events classification models,
based on a bag-of-words, described in Section 6. According to Table 4, the model using
the Multi-layer Perceptron algorithm obtained greater accuracy for the classification task.
Of the 60,984 tweets, 10,027 were classified as exception events, and from that
subset we extracted 7,710 addresses (76.89% of the tweets classified
as exception events). The reasons for tweets without an extracted address are:
1. Tweets with only the point of interest, in other words, the address is not explicitly stated.
2. Tweets without address information.
3. Tweets with unusual public place names (for example passageway, road complex, connection to).
4. Tweets with addresses with concatenated words (for example avenidapaulista).
14 It is important to note that this work does not consider the exact start and end of the exception events, but a time range of one hour from the time in the tweet timestamp.
15 Data set publicly available at: https://github.com/fcas/mobility-analysis/blob/master/datasets/tweets.zip. Accessed on December 14, 2018.
Table 4. Metrics of the evaluations of the algorithms used to classify the tweets
in exception events
Figure 1 illustrates the addresses16 most affected by exception events and Figure 2
shows the distribution of these events in the central region of São Paulo. It is important to
note that the exception events found are concentrated in the addresses and regions where
they normally occur in São Paulo, which validates the methodology developed.
in the groups of bus lines affected within a radius of 1,000m and 100% within a radius
of 100m; this is probably due to the large number of people involved in this type of event
and the number of avenues with modified or interrupted traffic flow.
Urban events, in turn, impact 70.11% at 1,000m and 98.86% at 100m, even
though these events are carried out with alternative route planning and warning signs
on public roads. The third and fourth most affected classes are accidents and
natural disasters, with 66.51% and 59.77% at 1,000m and 98.39% and 99.80%
at 100m, respectively; these events normally cause blockages or detours on public roads
used by buses.
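
The two radii used in these results amount to a distance filter around the geolocated event. The text mentions geospatial queries over MongoDB for this kind of lookup; the sketch below is only a dependency-free stand-in that uses the haversine formula, with hypothetical coordinates and function names.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def positions_within(event, positions, radius_m):
    """Keep the (lat, lon) positions no farther than radius_m from the event."""
    return [
        p for p in positions
        if haversine_m(event[0], event[1], p[0], p[1]) <= radius_m
    ]


# Hypothetical event location and bus positions (latitude, longitude).
event = (-23.5614, -46.6559)
positions = [(-23.5619, -46.6559), (-23.5664, -46.6559), (-23.5814, -46.6559)]
near_100m = positions_within(event, positions, 100)
near_1000m = positions_within(event, positions, 1000)
```

The 100m set is always a subset of the 1,000m set, so the smaller radius isolates the bus positions closest to the event.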
In addition, January, February and March were the three months most affected by
exception events related to natural disasters; this is a period of high rainfall in São Paulo,
when landslides, tree falls and floods usually occur. Regarding social events, the year 2017
was marked by numerous political demonstrations; in this context, May was the month most
impacted by this type of exception event, mainly due to the protests against the
Temer government17. Events related to accidents usually occur in greater concentration
during vacation periods and holidays, which can be observed in the months of
January and April (the only month of 2017 with two long holidays), with a mean
impact of 83.33% and 87.50% on the average speeds, respectively. Impacts related to urban
events occur regularly throughout all months, which is why their percentages are uniform.
Table 6. Percentage of impact on the average speed of the groups of lines affected
by exception events at 1,000m and 100m distance respectively, in
the months of 2017
The months with close to 100% impact on average speeds are explained by the
small volume of events for a given class in a given month, as shown in Fig. 3; the same
happens for scenarios with geolocated data next to the exception events. Similarly, the
months and classes without impact data are months with little data for the analyzed class.
17 www1.folha.uol.com.br/poder/2017/05/1884977-manifestacao-anti-temer-reune-hundreds-of-people-in-av-paulista.shtml. Accessed on December 2, 2018.
12. Conclusions
This work presents a new methodology for classifying exception events and
analyzing their respective impacts on the speed of the public bus transport system of
the city of São Paulo. Using tweets from selected public service providers, we found that
the Multi-layer Perceptron was the best algorithm for classifying tweets into exception
events. We also showed that it is possible to extract addresses from semi-structured tweets
using only regular expressions. Classifying these events is the first step to better
understand how they impact the speed of buses. Using the methodology developed, we
found that social events have an average impact of 87.04% on the average speed of the
affected groups of bus lines, urban events 70.11%, accidents 66.51% and natural disasters
59.77%, within a distance of 1,000m.
Although validated using selected Twitter profiles written in Brazilian Portuguese,
this method can be generalized to different languages and cities. GTFS is a
ubiquitous format for public transport and tools like NLTK support several languages.
Acknowledgment
This research is part of the INCT of the Future Internet for Smart Cities funded
by CNPq proc. 465446/2014-0, Coordenação de Aperfeiçoamento de Pessoal de Nível
Superior – Brasil (CAPES) – Finance Code 001, FAPESP proc. 14/50937-1, and FAPESP
proc. 15/24485-9.
References
Ahvenniemi, H., Huovila, A., Pinto-Seppä, I., and Airaksinen, M. (2017). What are the
differences between sustainable and smart cities? Cities, 60:234–245.
Ang, L.-M., Seng, K. P., Zungeru, A., and Ijemaru, G. (2017). Big Sensor Data Systems
for Smart Cities. IEEE Internet Things J., 4(5):1–1.
Atefeh, F. and Khreich, W. (2015). A survey of techniques for event detection in Twitter.
Computational Intelligence, 31(1):132–164.
Chen, L., Zhang, D., Wang, L., Yang, D., Ma, X., Li, S., Wu, Z., Pan, G., Nguyen, T.-
M.-T., and Jakubowicz, J. (2016). Dynamic Cluster-Based Over-Demand Prediction
in Bike Sharing Systems. UBICOMP, pages 841–852.
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., and Kuksa, P. (2011).
Natural language processing (almost) from scratch. Journal of Machine Learning Re-
search, 12(Aug):2493–2537.
Dwivedi, S. K. and Arya, C. (2016). Automatic text classification in information retrieval:
A survey. In Proceedings of the Second International Conference on Information and
Communication Technology for Competitive Strategies, page 131. ACM.
Finger, M. and Razaghi, M. (2017). Conceptualizing “Smart Cities”. Informatik-
Spektrum, 40(1):6–13.
Gal-Tzur, A., Grant-Muller, S. M., Kuflik, T., Minkov, E., Nocera, S., and Shoor, I.
(2014). The potential of social media in delivering transport policy goals. Transp.
Policy, 32:115–123.
Gkiotsalitis, K. and Stathopoulos, A. (2015). A utility-maximization model for retrieving
users’ willingness to travel for participating in activities from big-data. Transp. Res.
Part C Emerg. Technol., 58:265–277.
Gkiotsalitis, K. and Stathopoulos, A. (2016). Joint leisure travel optimization with user-
generated data via perceived utility maximization. Transp. Res. Part C Emerg. Tech-
nol., 68:532–548.
Guo, W., Gupta, N., Pogrebna, G., and Jarvis, S. (2016). Understanding happiness in
cities using twitter: Jobs, children, and transport. IEEE 2nd Int. Smart Cities Conf.
Improv. Citizens Qual. Life, ISC2 2016 - Proc.
Gutev, A. and Nenko, A. (2016). Better Cycling - Better Life: Social Media Based
Parametric Modeling Advancing Governance of Public Transportation System in St.
Petersburg. Proc. Int. Conf. Electron. Gov. Open Soc. Challenges Eurasia, pages 242–
247.
Guyon, I. and Elisseeff, A. (2006). An introduction to feature extraction. Feature extrac-
tion, pages 1–25.
Itoh, M., Yokoyama, D., Toyoda, M., Tomita, Y., Kawamura, S., and Kitsuregawa, M.
(2016). Visual Exploration of Changes in Passenger Flows and Tweets on Mega-City
Metro Network. IEEE Trans. Big Data, 2(1):85–99.
Korenius, T., Laurikkala, J., Järvelin, K., and Juhola, M. (2004). Stemming and
lemmatization in the clustering of Finnish text documents. In Proceedings of the Thirteenth
ACM International Conference on Information and Knowledge Management, CIKM
’04, pages 625–633, New York, NY, USA. ACM.
Kotsiantis, S. B., Zaharakis, I., and Pintelas, P. (2007). Supervised machine learning:
A review of classification techniques. Emerging artificial intelligence applications in
computer engineering, 160:3–24.
Kuflik, T., Minkov, E., Nocera, S., Grant-Muller, S., Gal-Tzur, A., and Shoor, I. (2017).
Automating a framework to extract and analyse transport related social media content:
The potential and the challenges. Transportation Research Part C: Emerging Tech-
nologies, 77:275–291.
Lecue, F., Tallevi-Diotallevi, S., Hayes, J., Tucker, R., Bicer, V., Sbodio, M., and Tom-
masi, P. (2014). Smart traffic analytics in the semantic web with STAR-CITY: Scenar-
ios, system and lessons learned in Dublin City. J. Web Semant., 27:26–33.
Liu, D., Li, Y., and Thomas, M. A. (2017). A roadmap for natural language processing
research in information systems. In Proceedings of the 50th Hawaii International
Conference on System Sciences.
Maghrebi, M., Abbasi, A., Rashidi, T. H., and Waller, S. T. (2015). Complementing
Travel Diary Surveys with Twitter Data: Application of Text Mining Techniques on
Activity Location, Type and Time. IEEE Conf. Intell. Transp. Syst. Proceedings, ITSC,
2015-Octob:208–213.
Mukherjee, T., Chander, D., Eswaran, S., Singh, M., Varma, P., Chugh, A., and Dasgupta,
K. (2015). Janayuja: A People-centric Platform to Generate Reliable and Actionable
Insights for Civic Agencies. Acm Dev 2015, pages 137–145.
Myers, S. A., Sharma, A., Gupta, P., and Lin, J. (2014). Information network or so-
cial network?: the structure of the twitter follow graph. In Proceedings of the 23rd
International Conference on World Wide Web, pages 493–498. ACM.
Nadkarni, P. M., Ohno-Machado, L., and Chapman, W. W. (2011). Natural language
processing: an introduction. Journal of the American Medical Informatics Association,
18(5):544–551.
Narayanan, U., Unnikrishnan, A., Paul, V., and Joseph, S. (2017). A survey on vari-
ous supervised classification algorithms. In 2017 International Conference on Energy,
Communication, Data Analytics and Soft Computing (ICECDS), pages 2118–2124.
IEEE.
Ni, M., He, Q., and Gao, J. (2016). Forecasting the Subway Passenger Flow Under Event
Occurrences With Social Media. IEEE Trans. Intell. Transp. Syst., 18(6):1623–1632.
Niu, W., Caverlee, J., Lu, H., and Kamath, K. (2016). Community-based geospatial tag
estimation. In Advances in Social Networks Analysis and Mining (ASONAM), 2016
IEEE/ACM International Conference on, pages 279–286. IEEE.
Roy, A., Majumder, A. G., and Nath, A. (2017). Understanding natural language process-
ing and its primary aspects. International Journal, 5(8).
Setiawan, E. B., Widyantoro, D. H., and Surendro, K. (2017). Feature expansion us-
ing word embedding for tweet topic classification. Proceeding 2016 10th Int. Conf.
Telecommun. Syst. Serv. Appl. TSSA 2016 Spec. Issue Radar Technol., (2011).
SÁ, T. H., Tainio, M., Goodman, A., Edwards, P., Haines, A., Gouveia, N., Monteiro,
C., and Woodcock, J. (2017). Health impact modelling of different travel patterns on
physical activity, air pollution and road injuries for São Paulo, Brazil. Environment
International, 108(Supplement C):22–31.
Wang, S., Sinnott, R., and Nepal, S. (2016). Privacy-protected social media user trajec-
tories calibration. Proc. 2016 IEEE 12th Int. Conf. e-Science, e-Science 2016, pages
293–302.
Wen, X., Lin, Y.-R., and Pelechrinis, K. (2016). PairFac: Event Analytics through Dis-
criminant Tensor Factorization. Cikm, pages 519–528.
Xiao, Z., Lim, H. B., and Ponnambalam, L. (2017). Participatory Sensing for Smart
Cities: A Case Study on Transport Trip Quality Measurement. IEEE Trans. Ind. Infor-
matics, 13(2):759–770.
Zhou, X. and Chen, L. (2014). Event detection over Twitter social media streams. The
VLDB Journal, 23(3):381–400.