SILO



The weekly SILO Seminar Series is made possible through the generous support of the 3M Company and its Advanced Technology Group

3M

with additional support from the Analytics Group of the Northwestern Mutual Life Insurance Company

Northwestern Mutual

Spatio-Temporal Signal Recovery from Social Media

Junming Xu | Aniruddha Bhargava, graduate students in CS and ECE, respectively

Date and Time: May 02, 2012 (12:30 PM)
Location: Orchard room (3280) at the Wisconsin Institute for Discovery Building

Abstract:

Many real-world phenomena can be represented by a spatio-temporal signal: where, when, and how much. Social media is a tantalizing data source for those who wish to monitor such signals. Unlike most prior work, we assume that the target phenomenon is known and we are given a method to count its occurrences in social media. However, counting is plagued by sample bias, incomplete data, and, paradoxically, data scarcity, all issues inadequately addressed by prior work. We formulate signal recovery as a Poisson point process estimation problem. We explicitly incorporate human population bias, time delays and spatial distortions, and spatio-temporal regularization into the model to address the noisy count issues. We present an efficient optimization algorithm and discuss its theoretical properties. We show that our model is more accurate than commonly used baselines. Finally, we present a case study on wildlife roadkill monitoring, where our model produces qualitatively convincing results.

Part 1 (Aniruddha): An introduction to Twitter-related problems and to previous work in this area, such as earthquake detection, followed by our mathematical model of how tweets are produced. A more formal version of the model is then proposed, which leads to a graph-regularized optimization problem. Finally, we discuss some new mathematical issues that arise from our initial work.

Part 2 (Junming): The roadkill case study and synthetic data. Data collection from Twitter: natural language processing on tweets and how it can be used for our particular problem, and using the Twitter API to collect tweets automatically. Results and some questions for future work.
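
For readers unfamiliar with the setup, the following is a minimal sketch of what a graph-regularized Poisson estimation problem can look like on synthetic data. It is not the speakers' model or algorithm. It assumes a heavily simplified setting in which the count in each bin is Poisson with rate equal to a known activity level (a stand-in for population bias) times an unknown intensity, it ignores time delays and spatial distortions, and it uses plain gradient descent with backtracking rather than the optimization method presented in the talk. All names (`chain_laplacian`, `objective`, `recover_intensity`, `activity`, `lam`) are hypothetical.

```python
import numpy as np


def chain_laplacian(n):
    """Laplacian of a path graph that links each bin to its neighbor."""
    lap = np.zeros((n, n))
    for i in range(n - 1):
        lap[i, i] += 1.0
        lap[i + 1, i + 1] += 1.0
        lap[i, i + 1] -= 1.0
        lap[i + 1, i] -= 1.0
    return lap


def objective(theta, counts, activity, lap, lam):
    """Poisson negative log-likelihood (constants dropped) plus graph penalty.

    Assumed model: counts[i] ~ Poisson(activity[i] * exp(theta[i])).
    """
    rate = activity * np.exp(theta)
    return np.sum(rate - counts * theta) + lam * (theta @ lap @ theta)


def recover_intensity(counts, activity, lap, lam=1.0, iters=500):
    """Estimate the intensity exp(theta) by gradient descent with backtracking."""
    theta = np.zeros(len(counts))
    step = 1e-3
    for _ in range(iters):
        grad = activity * np.exp(theta) - counts + 2.0 * lam * (lap @ theta)
        current = objective(theta, counts, activity, lap, lam)
        for _ in range(60):  # halve the step until the objective decreases enough
            candidate = theta - step * grad
            new_value = objective(candidate, counts, activity, lap, lam)
            if new_value <= current - 1e-4 * step * (grad @ grad):
                theta = candidate
                break
            step *= 0.5
        step *= 2.0  # let the step grow again after each outer iteration
    return np.exp(theta)


# Synthetic example: a smooth bump observed through uneven activity levels.
rng = np.random.default_rng(0)
n_bins = 40
true_intensity = 0.5 + 2.0 * np.exp(-0.5 * ((np.arange(n_bins) - 20) / 5.0) ** 2)
activity = rng.uniform(20.0, 200.0, size=n_bins)
counts = rng.poisson(activity * true_intensity)
estimate = recover_intensity(counts, activity, chain_laplacian(n_bins), lam=5.0)
```

Working with the log-intensity `theta` keeps the estimate positive without explicit constraints, and the penalty `theta @ lap @ theta` discourages large jumps between neighboring bins. A simple baseline to compare against under the same assumptions is the naive normalization `counts / activity`.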