Earthquake Shakes Twitter Users: Analyzing Tweets for Real-Time Event Detection



Outline

Introduction
Event Detection
Model
Experiments And Evaluation
Application
Conclusions

Outline
Introduction

What's happening?

Twitter

is one of the most popular microblogging services
has received much attention recently

Microblogging

is a form of blogging that allows users to send brief text updates
is a form of micromedia that allows users to send photographs or audio clips

In this research, we focus on an important characteristic: its real-time nature

Real-time Nature of Microblogging

Twitter users write tweets several times in a single day.
There is a large number of tweets, which results in many reports related to events

social events: parties, baseball games, presidential campaigns
disastrous events: storms, fires, traffic jams, riots, heavy rainfall, earthquakes

We can know how other users are doing in real time
We can know what happens around other users in real time

Our motivation

Adam Ostrow, Editor in Chief at Mashable, wrote in his blog about the possibility of detecting earthquakes from tweets
"Japan Earthquake Shakes Twitter Users ... And Beyonce": Earthquakes are one thing you can bet on being covered on Twitter first, because, quite frankly, if the ground is shaking, you're going to tweet about it before it even registers with the USGS* and long before it gets reported by the media. That seems to be the case again today, as the third earthquake in a week has hit Japan and its surrounding islands, about an hour ago. The first user we can find that tweeted about it was Ricardo Duran of Scottsdale, AZ, who, judging from his Twitter feed, has been traveling the world, arriving in Japan yesterday.

We can learn of earthquake occurrences from tweets: this is the motivation of our research

*USGS : United States Geological Survey

Our Goals

propose an algorithm to detect a target event

do semantic analysis on tweets
to obtain tweets on the target event precisely

regard each Twitter user as a sensor
to detect the target event
to estimate the location of the target

produce a probabilistic spatio-temporal model for

event detection
location estimation

propose an Earthquake Reporting System using Japanese tweets

Twitter and Earthquakes in Japan

[Figure: a map of Twitter users worldwide]
[Figure: a map of earthquake occurrences worldwide]

The intersection is the regions with many earthquakes and a large number of Twitter users.

Other regions: Indonesia, Turkey, Iran, Italy, and Pacific coastal US cities

Outline

Event Detection

Event detection algorithms

do semantic analysis on tweets
to obtain tweets on the target event precisely

regard each Twitter user as a sensor
to detect the target event
to estimate the location of the target

Semantic Analysis on Tweets

Search tweets including keywords related to a target event

Example: in the case of earthquakes, "shaking" and "earthquake"

Classify tweets into a positive class or a negative class

Example:
"Earthquake right now!!" --- positive
"Someone is shaking hands with my boss" --- negative

Create a classifier

Semantic Analysis on Tweets

Create a classifier for tweets

use a Support Vector Machine (SVM)

Features (Example: "I am in Japan, earthquake right now!")

Statistical features (7 words, the 5th word): the number of words in a tweet message and the position of the query word within the tweet
Keyword features (I, am, in, Japan, earthquake, right, now): the words in a tweet
Word context features (Japan, right): the words before and after the query word
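The three feature groups can be illustrated on the example tweet. This is a minimal sketch, not the authors' implementation: the function name `extract_features` and the regex tokenizer are assumptions made for illustration, and real Japanese tweets would need a morphological analyzer instead of a regex.

```python
import re

def extract_features(tweet: str, query: str) -> dict:
    """Build the three feature groups for one tweet that contains `query`.

    Hypothetical helper: tokenization by regex is an assumption for
    English text only.
    """
    words = re.findall(r"[A-Za-z']+", tweet)
    lowered = [w.lower() for w in words]
    pos = lowered.index(query.lower())  # 0-based; raises ValueError if the query is absent
    return {
        # number of words, and 1-based position of the query word
        "statistical": (len(words), pos + 1),
        # every word in the tweet
        "keywords": words,
        # the word before and the word after the query word (None at the edges)
        "context": (
            words[pos - 1] if pos > 0 else None,
            words[pos + 1] if pos + 1 < len(words) else None,
        ),
    }

features = extract_features("I am in Japan, earthquake right now!", "earthquake")
print(features)
```

For the example tweet this gives 7 words with the query as the 5th word, and the context words Japan and right, matching the values on the slide. Turning these groups into numeric vectors for an SVM is a further step that the slide does not specify.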

Tweet as a Sensory Value

Event detection from Twitter: target event -> observation by Twitter users -> tweets -> classifier -> probabilistic model
Object detection in a ubiquitous environment: target object -> observation by sensors -> values -> probabilistic model

The correspondence between tweet processing and sensory data detection

Tweet as a Sensory Value

Event detection from Twitter: earthquake occurrence (target event) -> some users post "earthquake right now!!" (observation by Twitter users) -> search tweets and classify them into the positive class (classifier) -> detect an earthquake (probabilistic model)
Object detection in a ubiquitous environment: earthquake occurrence (target object) -> some earthquake sensors respond with a positive value (observation by sensors) -> detect an earthquake (probabilistic model)

We can apply methods for sensory data detection to tweet processing

Tweet as a Sensory Value

We make two assumptions in order to apply methods for observation by sensors

Assumption 1: Each Twitter user is regarded as a sensor

a tweet corresponds to a sensor reading
a sensor detects a target event and makes a report probabilistically
Example: a user makes a tweet about an earthquake occurrence, just as an earthquake sensor returns a positive value

Assumption 2: Each tweet is associated with a time and a location

time: post time
location: GPS data or location information in the user's profile

By processing time information and location information, we can detect target events and estimate their locations

Outline

Model

Probabilistic Model

Why do we need probabilistic models?

Sensor values are noisy, and sensors sometimes work incorrectly
We cannot judge whether a target event occurred from a single tweet
We have to calculate the probability of an event occurrence from a series of data

We propose probabilistic models for

event detection from time-series data
location estimation from a series of spatial information

Temporal Model

We must calculate the probability of an event occurrence from multiple sensor values
We examine the actual time-series data to create a temporal model

[Figure: number of tweets over time, Aug 9 to Aug 17]

Temporal Model

The data fit an exponential function very well

$$f(t; \lambda) = \lambda e^{-\lambda t}, \quad t > 0,\ \lambda > 0$$

$\lambda = 0.34$

We design the alarm of the target event probabilistically, based on an exponential distribution

Spatial Model

We must calculate the probability distribution of the location of a target
We apply Bayes filters, which are often used in location estimation by sensors, to this problem

Kalman Filters
Particle Filters

Bayesian Filters for Location Estimation

Kalman Filters

are the most widely used variant of Bayes filters
approximate the probability distribution with a representation that is virtually identical to a uni-modal Gaussian
advantages: computational efficiency
disadvantages: being limited to accurate sensors or sensors with high update rates

Particle Filters

represent the probability distribution by sets of samples, or particles
advantages: the ability to represent arbitrary probability densities; particle filters can converge to the true posterior even in non-Gaussian, nonlinear dynamic systems
disadvantages: the difficulty of applying them to high-dimensional estimation problems
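The sample-based idea behind particle filters can be sketched in a few lines. This is a generic sampling-importance-resampling loop under stated assumptions, not the filter used in this work: the Gaussian observation likelihood, the random-walk jitter, the parameter values, and the synthetic tweet locations are all hypothetical choices made for illustration.

```python
import math
import random

def particle_filter(observations, n_particles=1000, sigma=1.0, jitter=0.1, seed=0):
    """Estimate a 2-D location from noisy (lat, lon) observations.

    Assumptions (not from the slides): Gaussian likelihood with standard
    deviation `sigma` degrees and random-walk jitter after resampling.
    """
    rng = random.Random(seed)
    lats = [o[0] for o in observations]
    lons = [o[1] for o in observations]
    # Initialise particles uniformly over the bounding box of the observations.
    particles = [
        (rng.uniform(min(lats), max(lats)), rng.uniform(min(lons), max(lons)))
        for _ in range(n_particles)
    ]
    for lat, lon in observations:
        # Weight each particle by the likelihood of the observation given that particle.
        # The tiny constant keeps the total weight positive if every term underflows.
        weights = [
            math.exp(-((p[0] - lat) ** 2 + (p[1] - lon) ** 2) / (2 * sigma ** 2)) + 1e-300
            for p in particles
        ]
        # Resample in proportion to the weights, then jitter to keep diversity.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        particles = [
            (p[0] + rng.gauss(0, jitter), p[1] + rng.gauss(0, jitter))
            for p in particles
        ]
    # The estimate is the mean of the final particle set.
    est_lat = sum(p[0] for p in particles) / n_particles
    est_lon = sum(p[1] for p in particles) / n_particles
    return est_lat, est_lon

# Synthetic tweet locations scattered around a hypothetical center.
data_rng = random.Random(42)
true_center = (35.0, 139.0)
tweets = [
    (true_center[0] + data_rng.gauss(0, 0.5), true_center[1] + data_rng.gauss(0, 0.5))
    for _ in range(50)
]
estimate = particle_filter(tweets)
print(estimate)
```

Because the distribution is carried by samples rather than by a mean and covariance, the same loop works unchanged if the likelihood is replaced by a non-Gaussian one, which is the advantage listed above.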

Information Diffusion Related to Real-time Events

The proposed spatio-temporal models need to meet one condition:

sensors are assumed to be independent

We check whether information diffusion about target events happens, because

if information diffusion happens among users, Twitter user sensors are not independent: they affect each other

[Figure: information flow networks on Twitter for the Nintendo DS Game, an earthquake, and a typhoon]

In the case of earthquakes and typhoons, very little information diffusion takes place on Twitter compared to the Nintendo DS Game
We assume that Twitter user sensors are independent with respect to earthquakes and typhoons

Outline

Experiments And Evaluation

Experiments And Evaluation

We demonstrate the performance of

tweet classification
event detection from time-series data (we show these results in the Application section)
location estimation from a series of spatial information

Evaluation of Semantic Analysis

Queries

Earthquake queries: "shaking" and "earthquake"
Typhoon query: "typhoon"

Examples to create the classifier

597 positive examples

Evaluation of Semantic Analysis

"earthquake" query
Features      Recall    Precision   F-Value
Statistical   87.50%    63.64%      73.69%
Keywords      87.50%    38.89%      53.85%
Context       50.00%    66.67%      57.14%
All           87.50%    63.64%      73.69%

"shaking" query
Features      Recall    Precision   F-Value
Statistical   66.67%    68.57%      67.61%
Keywords      86.11%    57.41%      68.89%
Context       52.78%    86.36%      68.20%
All           80.56%    65.91%      72.50%

Discussions of Semantic Analysis

Features      Recall    Precision   F-Value
Statistical   87.50%    63.64%      73.69%
Keywords      87.50%    38.89%      53.85%
Context       50.00%    66.67%      57.14%
All           87.50%    63.64%      73.69%

We obtain the highest F-value when we use the statistical features alone or all features together
Keyword features and word context features don't contribute much to the classification performance
A user who is surprised might produce a very short tweet
The precision is clearly not as high as the recall
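The F-values in the table can be checked from the recall and precision columns. The sketch below assumes that F-value means the harmonic mean of precision and recall (the usual F1 definition, which the slides do not spell out); the function name `f_value` is chosen here for illustration.

```python
def f_value(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F-value)."""
    return 2 * precision * recall / (precision + recall)

# (recall, precision) for the "earthquake" query, taken from the table.
rows = {
    "Statistical": (0.8750, 0.6364),
    "Keywords": (0.8750, 0.3889),
    "Context": (0.5000, 0.6667),
}
for name, (recall, precision) in rows.items():
    print(f"{name}: F = {100 * f_value(recall, precision):.2f}%")
```

Under that assumption the three rows come out at the reported 73.69%, 53.85%, and 57.14%, which also shows how the low precision of the keyword features pulls their F-value down despite the high recall.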


Evaluation of Spatial Estimation

Target events

earthquakes: 25 earthquakes from August 2009 to October 2009
typhoons: name: Melor

Baseline methods

weighted average: simply takes the average of latitudes and longitudes
the median: simply takes the median of latitudes and longitudes

We evaluate methods by their distances from the actual centers

the smaller the distance from the actual center, the better the method works
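The two baselines and the distance-based evaluation can be sketched as follows. The coordinates are hypothetical, the average is unweighted because the slides do not say how the weights are chosen, and the error is taken as the Euclidean distance in degrees of latitude and longitude, which is an assumption about the metric.

```python
import math
import statistics

def mean_estimate(points):
    """Baseline 1: average of latitudes and longitudes (unweighted here)."""
    return (statistics.fmean(p[0] for p in points),
            statistics.fmean(p[1] for p in points))

def median_estimate(points):
    """Baseline 2: median of latitudes and longitudes, taken separately."""
    return (statistics.median(p[0] for p in points),
            statistics.median(p[1] for p in points))

def error(estimate, actual):
    """Distance between an estimate and the actual center, in degrees."""
    return math.hypot(estimate[0] - actual[0], estimate[1] - actual[1])

# Hypothetical tweet locations (lat, lon); the last one is far from the rest.
points = [(35.0, 139.0), (35.2, 139.4), (34.8, 138.8), (35.1, 139.1), (43.0, 141.3)]
actual_center = (35.0, 139.0)  # hypothetical actual center

mean_error = error(mean_estimate(points), actual_center)
median_error = error(median_estimate(points), actual_center)
print(f"average: {mean_error:.2f}  median: {median_error:.2f}")
```

Any other estimator, such as a Kalman filter or a particle filter, can be scored with the same `error` function, so all methods are compared on the same footing.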

Evaluation of Spatial Estimation

[Figure: map around Kyoto, Osaka, and Tokyo. Balloons mark individual tweets, colored by post time; markers show the estimation by median, the estimation by particle filter, and the actual earthquake center]

Evaluation of Spatial Estimation


Earthquakes

Mean square errors of latitudes and longitudes

          Median   Weighted Average   Kalman Filter   Particle Filter
Average   5.47     3.62               3.85            3.01

Particle filters work better than the other methods

Evaluation of Spatial Estimation

A typhoon

Mean square errors of latitudes and longitudes

          Median   Weighted Average   Kalman Filter   Particle Filter
Average   4.39     4.02               9.56            3.58

Particle filters work better than the other methods

Discussions of Experiments

Particle filters perform better than the other methods
If the center of a target event is in an oceanic area, it is more difficult to locate it precisely from tweets
It becomes more difficult to make a good estimation in less populated areas

Outline

Application

Earthquake Reporting System

Toretter (http://toretter.com)

Earthquake reporting system using the event detection algorithm
All users can see the detection of past earthquakes
Registered users can receive e-mails of earthquake detection reports

Example e-mail:
Dear Alice,
We have just detected an earthquake around Chiba. Please take care.
Toretter Alert System

[Figure: screenshot of Toretter.com]

Earthquake Reporting System

Effectiveness of alerts of this system

Alert e-mails urge users to prepare for the earthquake if they are received shortly before the earthquake actually arrives.

Is it possible to receive the e-mail before the earthquake actually arrives?

An earthquake is transmitted through the earth's crust at about 3~7 km/s.
A person has about 20~30 sec before its arrival at a point that is 100 km distant from the actual center
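The time budget follows from dividing distance by wave speed. A small sketch, assuming a constant speed along a straight path (a simplification) and a hypothetical helper name `travel_time_s`:

```python
def travel_time_s(distance_km: float, speed_km_s: float) -> float:
    """Seconds for a wave at constant speed to cover the given distance."""
    return distance_km / speed_km_s

# Speeds spanning the 3~7 km/s range quoted above, for a point 100 km away.
for speed in (3, 5, 7):
    print(f"{speed} km/s: {travel_time_s(100, speed):.1f} s")
```

At 3 km/s the wave needs about 33 s to cover 100 km, at 5 km/s about 20 s, and at 7 km/s about 14 s, so the quoted 20~30 sec corresponds to the slower part of the speed range.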

Results of Earthquake Detection

Date      Magnitude   Location      Time       E-mail sent time   Time gap [sec]   # tweets within 10 minutes   Announce of JMA
Aug. 18   4.5         Tochigi       6:58:55    7:00:30            95               35                           7:08
Aug. 18   3.1         Suruga-wan    19:22:48   19:23:14           26               17                           19:28
Aug. 21   4.1         Chiba         8:51:16    8:51:35            19               52                           8:56
Aug. 25   4.3         Uraga-oki     2:22:49    2:23:21            31               23                           2:27
Aug. 25   3.5         Fukushima     22:21:15   22:22:29           73               13                           22:26
Aug. 27   3.9         Wakayama      17:47:30   17:48:11           41               16                           17:53
Aug. 27   2.8         Suruga-wan    20:26:23   20:26:45           22               14                           20:31
Aug. 31   4.5         Fukushima     00:45:54   00:46:24           30               32                           00:51
Sep. 2    3.3         Suruga-wan    13:04:45   13:05:04           19               18                           13:10
Sep. 2    3.6         Bungo-suido   17:37:53   17:38:27           34               3                            17:43

In all cases, we sent e-mails before the announcements of JMA
In the earliest cases, we can send e-mails within 19 sec.


Results of Earthquake Detection

JMA intensity scale    2 or more     3 or more     4 or more
Num. of earthquakes    78            25            3
Detected               70 (89.7%)    24 (96.0%)    3 (100.0%)
Promptly detected*     53 (67.9%)    20 (80.0%)    3 (100.0%)

*Promptly detected: detected within a minute
JMA intensity scale: the original earthquake scale of the Japan Meteorological Agency

Period: Aug. 2009 to Sep. 2009
Tweets analyzed: 49,314 tweets
Positive tweets: 6,291 tweets by 4,218 users

We detected 96% of the earthquakes of JMA intensity scale 3 or more during the period.

Outline

Conclusions

Conclusions

We investigated the real-time nature of Twitter for event detection

Semantic analyses were applied to tweet classification
We consider each Twitter user as a sensor and set the problem as detecting an event based on sensory observations
Location estimation methods such as Kalman filters and particle filters are used to estimate the locations of events

We developed an earthquake reporting system, which is a novel approach to notifying people promptly of an earthquake event
We plan to expand our system to detect events of various kinds, such as rainbows and traffic jams

Thank you for your attention and for tweeting about earthquakes.

http://toretter.com

Takeshi Sakaki(@tksakaki)

Temporal Model

the probability of an event occurrence at time t

$$p_{occur}(t) = 1 - p_f^{\,n_0 (1 - e^{-\lambda (t+1)}) / (1 - e^{-\lambda})}$$

$p_f$: the false positive ratio of a sensor
$p_f^{n}$: the probability of all $n$ sensors returning a false alarm
$1 - p_f^{n}$: the probability of event occurrence
$n_0$ sensors at time 0 -> $n_0 e^{-\lambda t}$ sensors at time $t$
$n_0 (1 - e^{-\lambda (t+1)}) / (1 - e^{-\lambda})$: the number of sensors at time $t$

expected wait time $t_{wait}$ to deliver notification

$$t_{wait} = \frac{1 - (0.1264 / n_0)}{0.7117} - 1$$

parameters

$\lambda = 0.34$, $p_f = 0.35$, $p_{occur} = 0.99$
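The occurrence probability can be evaluated directly from the definitions on this slide. The sketch below is an illustration, not the system's code: it takes the probability as one minus the false positive ratio raised to the accumulated number of sensors, assumes integer time steps, and finds the wait time by a simple search instead of a closed-form expression. The function names are hypothetical.

```python
import math

def p_occur(t: int, n0: float, lam: float = 0.34, p_f: float = 0.35) -> float:
    """Probability that the event occurred, given n0 positive sensors at time 0."""
    # Accumulated number of sensors up to time t: n0 * (1 - e^{-lam (t+1)}) / (1 - e^{-lam}).
    n_sensors = n0 * (1 - math.exp(-lam * (t + 1))) / (1 - math.exp(-lam))
    # All of them being false alarms has probability p_f ** n_sensors.
    return 1 - p_f ** n_sensors

def wait_time(n0: float, threshold: float = 0.99, t_max: int = 1000):
    """First integer time step at which p_occur reaches the threshold, or None."""
    for t in range(t_max + 1):
        if p_occur(t, n0) >= threshold:
            return t
    return None

for n0 in (1, 4, 5):
    print(n0, wait_time(n0))
```

With the parameters above, five positive tweets at time 0 already exceed the 0.99 threshold, four reach it one step later, and a single tweet never does, because the accumulated number of sensors stays bounded by n0 / (1 - e^{-lambda}).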
