Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation

Title: Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
Publication Type: Thesis
Year of Publication: 2016
Authors: Yan Leng
Academic Department: Department of Civil and Environmental Engineering
Degree: Master of Science in Transportation
Date Published: 06/2016
University: Massachusetts Institute of Technology
City: Cambridge, MA

Urban computing fuses computer science with other fields, such as transportation, in the context of urban spaces, connecting ubiquitous sensing technologies, analytical models and visualizations to solve challenging problems in urban environments and operation systems. This thesis focuses on Call Detail Records (CDR), an opportunistic sensing data source widely collected for billing purposes, to understand presence patterns, develop mobility prediction methods and reduce traffic congestion with location recommendations.

Understanding human mobility and presence patterns at locations is the building block for behavior prediction, service design and system improvements. The first part of this thesis focuses on 1) understanding presence patterns at user locations with a proposed metric, Normalized Hourly Presence; 2) extracting common presence patterns across the population with Principal Component Analysis; and 3) inferring home and workplaces using K-means Clustering and Fuzzy C-means Clustering. The proposed method was implemented on the MIT Reality Mining data, on which we demonstrate that, with inference rates of 56% and 82%, the method improves accuracy by 79% and 34% in home and workplace inference respectively, compared to the baseline model. In addition, it was implemented on CDR data collected in a crowded city in China to demonstrate its scalability and applicability in real-world applications. With Fuzzy C-means Clustering, we can flexibly trade off inference rate against accuracy, both to understand the interplay between the two and to apply the method for various purposes.

With an understanding of mobility patterns, the next crucial foundation in urban computing is mobility prediction, which enables transportation practitioners to take action beforehand and commercial organizations to send location-based advertisements, among other uses. Specifically, this thesis focuses on next-location prediction from Call Detail Records.
Mobility traces are treated analogously to language, mapping cell towers to words and individual location traces to sentences. The Recurrent Neural Network, a successful tool in natural language processing, is applied to mobility prediction because it accepts sequential input of variable length and can learn the 'meaning' of cell towers. By implementing the method on Call Detail Records collected in Andorra, we show that the method improves on the baseline model by more than 40%, with 67% and 78% accuracy in next-location prediction at the cell tower and merged cell tower levels respectively. The 'meanings' of cell towers can also be inferred from the embedding layer of the Recurrent Neural Network, in the same way that the meanings of words are learned from sentences.

The last project tackles the challenge of severe traffic congestion with location recommendations. The availability of large-scale longitudinal geolocation data, such as Call Detail Records, offers planners and service providers an unprecedented opportunity to understand location preferences and alleviate traffic congestion. Location recommendation is a potential tool to achieve these two objectives. Previous research on location recommendations has focused on automatically and accurately inferring users' preferences, while little attention has been devoted to the constraints of service capacity. Ignoring these constraints may lead to congestion and long waiting times. We argue that Call Detail Records could help planners and authorities make interventions by providing personalized recommendations, given the comprehensive city-wide picture of historical behaviors and preferences they offer. In this research, we propose a method to make location recommendations for system efficiency, defined as maximizing satisfaction with the recommendations subject to capacity constraints, exploiting the flexibility in travelers' choices. We infer implicit location preferences from sparse and passively collected Call Detail Records.
We then formulate an optimization model for the defined system efficiency. As a proof-of-concept experiment, we implement the method in Andorra, a small European country that relies heavily on tourism. Through extensive simulations, we demonstrate that the method can reduce the congestion-induced increase in travel time during peak hour from 11.73 minutes to 5.6 minutes with idealized trips under full compliance. We show that the average congestion-induced increase in travel time is 6.17, 6.98, 8.37 and 10.98 minutes with compliance rates of 80%, 60%, 40% and 20% respectively. Overall, our results indicate that Call Detail Records can be used to make location recommendations while reducing traffic congestion for system efficiency. The proposed method can be applied to other large-scale location traces and extended to other location or event recommendation applications.
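
The idea of maximizing satisfaction subject to capacity constraints can be illustrated with a minimal Python sketch. This is not the thesis's actual optimization model: it exhaustively searches a toy instance for the assignment of users to locations that maximizes total preference without exceeding any location's capacity. The function name `recommend`, the integer preference scores and the capacities are all hypothetical, chosen only for illustration.

```python
from itertools import product


def recommend(preferences, capacity):
    """Return (best_total, best_assignment) for a toy recommendation problem.

    preferences[u][l]: hypothetical satisfaction score of user u for location l.
    capacity[l]: hypothetical maximum number of users location l can serve.
    Exhaustive search over all assignments, so only suitable for tiny instances.
    """
    n_users = len(preferences)
    n_locs = len(capacity)
    best_total, best_assignment = float("-inf"), None

    # Try every way of recommending exactly one location to each user.
    for assignment in product(range(n_locs), repeat=n_users):
        # Count how many users are sent to each location.
        counts = [0] * n_locs
        for loc in assignment:
            counts[loc] += 1

        # Skip assignments that exceed the capacity of any location.
        if any(c > cap for c, cap in zip(counts, capacity)):
            continue

        total = sum(preferences[u][loc] for u, loc in enumerate(assignment))
        if total > best_total:
            best_total, best_assignment = total, assignment

    return best_total, best_assignment


# Illustrative data (assumed): every user likes location 0 best.
preferences = [
    [9, 6, 1],
    [8, 7, 2],
    [7, 2, 6],
]
capacity = [1, 1, 1]

best_total, best_assignment = recommend(preferences, capacity)
print(best_total, best_assignment)  # prints: 22 (0, 1, 2)
```

In this toy instance all three users prefer location 0, so recommendations that ignore capacity would send everyone there. With a capacity of one user per location, the search instead assigns users 0, 1 and 2 to locations 0, 1 and 2, for a total score of 22. Exhaustive search grows exponentially with the number of users, so a realistic instance would need a proper optimization solver.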

Short Title: Urban computing using call detail records