Detecting Emergency Situations by Inferring Locations in Twitter

Most methods for detecting emergency situations on Twitter rely on keywords. The problem with keyword-based methods is that they require training in specific domains for different types of events, for example earthquakes, typhoons, terrorist attacks, and tornadoes. In contrast, our proposal uses recurring mentions of country-locations in microblogging messages to identify such events without keywords, and characterizes the urgency of the situation through inter-arrival times.


INTRODUCTION
Social media has become a major channel for communication during high-impact real-world events such as elections, sports events, and emergency situations. In any such event, users act as social sensors: they share and post their mood, opinions, photos, and videos, although only three per cent of messages include an exact GPS location.
Microblogging plays a critical role during emergency situations because social media allow affected people to communicate their current status quickly and in real time when an unusual event occurs. Thus, microblogging offers an additional way to track the course of a disaster and the effectiveness of the response as perceived by the public. For this reason, researchers have studied user behaviour during these events in order to detect, summarize, and classify messages, with the goal of supporting the situational awareness of authorities and the general public.
In recent work, Ashktorab et al. (2014), Imran et al. (2014), and Kumar et al. (2011) detect, summarize, and classify messages using methods that rely on keywords over the Twitter public streaming API. The problem with these keyword-based methods is the need for training in specific domains for different types of events. In general, labelling data for classification is hard work because it requires time, human supervision, and external resources such as crowdsourcing. Furthermore, Olteanu et al. (2014) generate a set of keywords from different disaster datasets, but event-specific terms sometimes arise spontaneously, for example #eqnz for the earthquake in New Zealand or #pabloph for Typhoon Pablo in the Philippines.
We propose a method based on recurring mentions of country-locations in messages' metadata for detecting a new emergency situation without a set of crisis-related keywords. These mentions can occur in the message text, the GPS coordinates, the location in the user profile, or a combination of these features.

METHODS
We collect random messages, without keywords or a bounding box, over three days (before, during, and after the event) for the Mw 6.9 Chilean earthquake (24 April 2017, 21:38:28 UTC) using the Twitter public streaming API. Additionally, we extract locations in Chile with at least 5,000 inhabitants from the GeoNames database. We create four signals associated with country-locations in the tweet's metadata. To do this, we inspect the message text, the GPS coordinates, and the location in the user profile, seeking any mention of country-locations (see Table 1 for the number of tweets per signal):
• Countrytxt: the tweet text contains a location associated with Chile.
• Countrygeo: the tweet's GPS coordinates correspond to a location in Chile.
• Countryusr: the user who shares the message has a profile location associated with Chile.
• Countrytxt-usr: the user shares a message that contains a location associated with Chile, and the user's profile location is also associated with Chile.
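As a rough illustration of how a tweet could be assigned to these signals, the sketch below matches normalized place names against the text and the profile location. The field names (`text`, `user_location`, `geo_country`), the tiny `CHILE_LOCATIONS` sample, and the whole-phrase matching rule are all assumptions made for this example; the paper does not specify its matching procedure, and it is also an assumption here that a countrytxt-usr tweet counts toward countrytxt and countryusr as well.

```python
import re
import unicodedata

# Hypothetical sample of Chilean place names. In the paper these come from
# GeoNames (locations with at least 5,000 inhabitants).
CHILE_LOCATIONS = {"chile", "santiago", "valparaiso", "concepcion", "vina del mar"}


def normalize(text):
    """Lowercase, strip accents, and reduce the text to space-separated tokens."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(re.findall(r"[a-z0-9]+", no_accents))


def mentions_location(text, locations=CHILE_LOCATIONS):
    """True if any place name appears as a whole word or phrase in the text."""
    padded = f" {normalize(text)} "
    return any(f" {loc} " in padded for loc in locations)


def signals_for(tweet):
    """Return the set of signal names a tweet contributes to.

    `tweet` is a plain dict with the hypothetical keys `text`,
    `user_location`, and `geo_country` (an ISO country code or None).
    """
    in_text = mentions_location(tweet.get("text") or "")
    in_profile = mentions_location(tweet.get("user_location") or "")
    found = set()
    if in_text:
        found.add("countrytxt")
    if tweet.get("geo_country") == "CL":
        found.add("countrygeo")
    if in_profile:
        found.add("countryusr")
    if in_text and in_profile:
        found.add("countrytxt-usr")
    return found
```

Accent stripping is included because profile locations and tweet text are free-form, so "Valparaíso" and "Valparaiso" should match the same entry.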
For each signal, we consider both original tweets and retweets, compute the frequency per minute, and normalize it by the signal's maximum value (Figure 1).
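The per-minute frequency and max-normalization step can be sketched as follows. This is a minimal example under the assumption that each signal is available as a list of timezone-aware `datetime` objects; the function name `per_minute_signal` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def per_minute_signal(timestamps):
    """Count messages per minute and divide each count by the series maximum."""
    # Truncate every timestamp to the start of its minute, then count.
    counts = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    if not counts:
        return {}
    peak = max(counts.values())
    return {minute: n / peak for minute, n in sorted(counts.items())}


# Three illustrative timestamps (not data from the paper).
example = [
    datetime(2017, 4, 24, 21, 38, 30, tzinfo=timezone.utc),
    datetime(2017, 4, 24, 21, 38, 45, tzinfo=timezone.utc),
    datetime(2017, 4, 24, 21, 39, 10, tzinfo=timezone.utc),
]
normalized = per_minute_signal(example)
```

Minutes with no messages are simply absent from the result; a plotting step would need to fill them with zero.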

RESULTS AND DISCUSSION
We compare the signals at the time the earthquake occurs. All signals detect the new emergency situation except countrygeo, which has a lower frequency, although it represents the exact place from which the message was sent. This is because only a small portion of users attach GPS coordinates when sharing a message (about three per cent or less).
The other signals correctly detect the new emergency situation because their maximum values coincide with the date and time of the earthquake, which is consistent with an unusual high-impact event that affects the whole country.
However, the signals that represent country-locations in the message text or in the user profile location exhibit noise that can produce bursts for other types of events. In some cases, the behaviour of these signals depends on the average daily activity of users in the country.
To reduce the noise exhibited by these independent features, we combine the countrytxt and countryusr signals to generate the countrytxt-usr signal. This means that a user with an inferred location in Chile shares a message whose text contains an inferred location in Chile. Moreover, we can assume that a user who is probably in Chile and shares information about Chile cares about what happens in Chile.

Characterization of the Signals
We characterize an emergency situation using the inter-arrival times between consecutive social media messages within a sub-time series. The inter-arrival time is defined as $d_i = t_{i+1} - t_i$, where $d_i$ denotes the difference between two consecutive social media messages $i$ and $i+1$ that arrive at times $t_i$ and $t_{i+1}$, respectively.
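The definition of the inter-arrival time translates directly into code. The sketch below assumes the message arrival times are `datetime` objects and returns the differences in seconds; the function name is hypothetical.

```python
from datetime import datetime, timezone


def inter_arrival_times(timestamps):
    """Return d_i = t_{i+1} - t_i in seconds for chronologically sorted messages."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
```

A series of n messages yields n - 1 inter-arrival times, so a single message (or none) produces an empty list.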
Using the countrytxt-usr signal, which detects a new emergency situation and has less noise than the other signals, we characterize and compare two different sub-time series within the event. For this task, we extract the messages from one hour before the earthquake and the messages from the 10 minutes after the earthquake (Figure 2). Therefore, we learn that when people share messages about a situation in a country, their location usually corresponds to that country.
On the other hand, for the sub-time series before the earthquake, the bins are spread out and 28 per cent of the messages have an inter-arrival time $d_i < 10$ seconds (Figure 2.b). Thus, users in the same country are not simultaneously affected by the same spatio-temporal event and do not share their current location in the message.
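The comparison of the two sub-time series can be sketched as below: select the messages in each window around the event and compute the share of inter-arrival times under 10 seconds. The helper names `window` and `share_below`, and the half-open window boundaries, are assumptions for illustration; only the event time and the window lengths come from the text.

```python
from datetime import datetime, timedelta, timezone

# Earthquake time as reported in the Methods section.
EVENT = datetime(2017, 4, 24, 21, 38, 28, tzinfo=timezone.utc)


def window(timestamps, start, end):
    """Return the sorted timestamps t with start <= t < end."""
    return sorted(t for t in timestamps if start <= t < end)


def share_below(timestamps, threshold_seconds=10.0):
    """Fraction of consecutive gaps strictly shorter than the threshold."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0
    return sum(g < threshold_seconds for g in gaps) / len(gaps)


def compare_windows(timestamps, event=EVENT):
    """Share of short gaps one hour before and ten minutes after the event."""
    before = window(timestamps, event - timedelta(hours=1), event)
    after = window(timestamps, event, event + timedelta(minutes=10))
    return share_below(before), share_below(after)
```

A higher share after the event than before it would indicate that messages arrive in a tighter burst once the earthquake occurs.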

CONCLUSIONS
Detecting an emergency situation without a domain-specific set of keywords is important because producing training data is hard work for researchers. We presented a proposal for detecting this type of event using only locations associated with the country. These locations, unlike keywords, do not change over time and do not emerge spontaneously as new terms during an emergency situation.
In the future, we will include a fine-grained hierarchy for each location, covering the country, state, and city levels, with the aim of examining local impact in the first phase after an event. We will also compare our approach with keyword-based methods for detecting emergency situations, for example Kumar et al. (2011) and Cameron et al. (2012).
For this task, we will use the public datasets generated by Imran et al. (2014) or the TREC 2011 microblog track. Furthermore, we will inspect the most frequent key phrases and hashtags in the tweets within the burst identified by our method.

Figure 2: Inter-arrival Time Before and During Mw 6.9 Chilean Earthquake