The goal of this project is twofold: to develop a software system that collects, stores, organizes, and queries Twitter messages, and to develop algorithms that process the Twitter data to extract value-added information, in particular the geolocation of Tweets. First, we will design and implement a processing and analytics system for Twitter data in the Apache Spark environment. Second, we will investigate and extend advanced algorithms that infer the geolocation of Tweets from their content.