Exploring multiple evidence to infer users’ location in Twitter
Author: | Gisele L. Pappa, Renato M. Assunção, Erica Castilho Rodrigues, Wagner Meira, Diogo Rennó |
---|---|
Year of publication: | 2016 |
Subject: |
Markov random field; Computer science; Cognitive Neuroscience; Posterior probability; Probabilistic logic; Markov chain Monte Carlo; 02 engineering and technology; Computer Science Applications; Artificial Intelligence; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Graph (abstract data type); 020201 artificial intelligence & image processing; Data mining |
Source: | Neurocomputing. 171:30-38 |
ISSN: | 0925-2312 |
DOI: | 10.1016/j.neucom.2015.05.066 |
Description: | Online social networks are valuable sources of information for monitoring real-time events, such as earthquakes and epidemics. For this type of surveillance, users' location is an essential piece of information, but a substantial number of users choose not to disclose their geographical location. However, characteristics of the users' behavior, such as the friends they associate with and the types of messages they publish, may hint at their spatial location. In this paper, we propose a method to infer the spatial location of Twitter users. Unlike the approaches proposed so far, it incorporates two sources of information to learn geographical position: the text posted by users and their friendship network. We propose a probabilistic approach that jointly models the geographical labels and Twitter texts of users, organized in the form of a graph representing the friendship network. We use the Markov random field probability model to represent the network, and learning is carried out through a Markov Chain Monte Carlo simulation technique that approximates the posterior probability distribution of the missing geographical labels. We show the accuracy of the algorithm on a large dataset of Twitter users, where the ground truth is the location given by GPS. The method presents promising results, with little sensitivity to parameters and high values of precision. |
Database: | OpenAIRE |