Uncovering Individual Mobility Patterns from Transit Smart Card Data: Trip Prediction, Activity Inference, and Change Detection

Title: Uncovering Individual Mobility Patterns from Transit Smart Card Data: Trip Prediction, Activity Inference, and Change Detection
Publication Type: Thesis
Year of Publication: 2018
Authors: Zhan Zhao
Academic Department: Department of Civil and Environmental Engineering
Date Published: September 2018
University: Massachusetts Institute of Technology
City: Cambridge, MA

While conventional travel survey data are limited in sample size and observation period, recent advances in urban sensing technologies afford the opportunity to collect traces of individual mobility at a large scale and over extended periods of time. As a result, individual mobility has become an emerging field dedicated to extracting patterns that describe individual movements in time and space. Individual mobility is the result of spatiotemporal choices (e.g., the decision to go somewhere at some time) made by individuals with diverse and dynamic preferences and lifestyles. These choices vary not only across individuals but also for the same person over time. However, our understanding of the behavioral mechanism underlying individual mobility is lacking. The objective of this dissertation is to develop statistical approaches to extract dynamic and interpretable travel-activity patterns from individual-level longitudinal travel records. Specifically, this work focuses on three problems related to the spatiotemporal behavioral structures in individual mobility: next trip prediction, latent activity inference, and pattern change detection. Transit smart card data from London’s rail network are used as a case study for the analysis.

To account for the sequential dependency between trips, a model is developed to predict the next trip based on the previous one. Each trip is defined by a combination of start time t (aggregated to hours), origin o, and destination d. To predict the next trip of an individual, we first predict whether the individual will travel again in the period of interest (trip making prediction), and, if so, predict the attributes of the next trip (trip attribute prediction). For trip attribute prediction, a Bayesian n-gram model is developed to estimate the probability distribution of the next trip conditional on the previous one.
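
As a rough illustration of the idea of conditioning the next trip on the previous one, the following is a minimal sketch of a bigram model with a Dirichlet prior. The abstract does not give the thesis's exact specification, so the prior form (concentration `alpha`, centered on the individual's overall trip frequencies), the class name `DirichletBigram`, and the station names are all assumptions made for this example.

```python
from collections import Counter, defaultdict


class DirichletBigram:
    """Bigram model over trips (t, o, d) with a Dirichlet prior.

    Assumption for illustration: the prior for each context is centered
    on the individual's overall (unigram) trip frequencies and has
    concentration `alpha`.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.unigram = Counter()             # how often each trip occurs
        self.bigram = defaultdict(Counter)   # counts of next trip given previous

    def fit(self, trips):
        """Count trips and consecutive trip pairs in one person's history."""
        self.unigram.update(trips)
        for prev, nxt in zip(trips, trips[1:]):
            self.bigram[prev][nxt] += 1
        return self

    def prob(self, prev, nxt):
        """Posterior mean of P(next trip = nxt | previous trip = prev)."""
        total = sum(self.unigram.values())
        prior = self.unigram[nxt] / total
        context = self.bigram.get(prev, Counter())
        n_context = sum(context.values())
        return (context[nxt] + self.alpha * prior) / (n_context + self.alpha)

    def predict(self, prev):
        """Most probable next trip among the trips seen so far."""
        return max(self.unigram, key=lambda trip: self.prob(prev, trip))


# Hypothetical history: a morning trip and an evening return, repeated.
morning = (8, "Station A", "Station B")
evening = (18, "Station B", "Station A")
model = DirichletBigram(alpha=1.0).fit([morning, evening] * 3)
print(model.predict(morning))
```

With few observations for a given previous trip, the estimate stays close to the individual's overall trip frequencies; as pair counts accumulate, the observed transitions dominate.
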
Based on regularized logistic regression, the trip making prediction models achieve median accuracy levels of over 80%. The prediction accuracy for trip attributes varies by attribute: around 40% for t, 70-80% for o, and 60-70% for d. The first trip of the day is more difficult to predict than later trips. Significant variations in model performance are found across individuals, implying diverse mobility patterns.

Human activities have long been recognized as the fundamental driver of travel demand. While passively collected human mobility data sources, such as transit smart card data, can accurately capture the time and location of individual movements, they do not explicitly provide any behavioral explanation of why people travel, e.g., activity types or travel purposes. Probabilistic topic models, which are widely used in natural language processing for document classification, can be adapted to uncover latent activity patterns from human mobility data in an unsupervised manner. In this case, the activity episodes (i.e., discrete activity participations between trips) of an individual are treated as words in a document, and each “topic” represents a unique distribution over space and time that corresponds to some activity type. Specifically, a classical topic model, Latent Dirichlet Allocation (LDA), is extended to incorporate multiple heterogeneous spatiotemporal attributes: the location, arrival time, day of week, and duration of stay. The model is tested with different choices of the number of activities Z, and the results demonstrate how new patterns may emerge as Z increases. The discovered latent activities reveal diverse spatiotemporal patterns and provide a new way to characterize individual activity profiles.

Although stable in the short term, individual mobility patterns are subject to change in the long term. The ability to detect such changes is critical for developing behavior models that are adaptive over time.
In this study, a travel pattern change is defined as “an abrupt, substantial, and persistent change in the underlying pattern of travel.” To detect these changes from longitudinal travel records, we specify one distribution for each of the three dimensions of travel behavior (the frequency of travel, time of travel, and origins/destinations), and interpret a change in the parameters of these distributions as a pattern change. A Bayesian method is developed to estimate the probability that a pattern change occurs at any given time for each behavioral dimension. The test results show that the method can successfully identify significant changepoints in travel patterns. Compared to the traditional generalized likelihood ratio (GLR) approach, the Bayesian method requires fewer predefined parameters and is more robust. It is generalizable and may be applied to detect changes in other aspects of travel behavior and human behavior in general.
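
To make the notion of a changepoint probability concrete for one dimension (travel frequency), the following is a simplified sketch and not the method developed in the thesis. It assumes weekly trip counts follow a Poisson distribution with a Gamma(`a`, `b`) prior on the rate, at most one change in the record, and a uniform prior over where that change falls; the function names and the example counts are hypothetical.

```python
import math


def log_marginal(counts, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts under a Gamma(a, b) rate prior
    (shape a, rate b), with the Poisson rate integrated out."""
    n, s = len(counts), sum(counts)
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + s) - (a + s) * math.log(b + n)
            - sum(math.lgamma(x + 1) for x in counts))


def changepoint_posterior(counts, a=1.0, b=1.0):
    """Posterior probability that the rate changes just before index tau,
    for tau = 1 .. len(counts) - 1, assuming exactly one change."""
    taus = range(1, len(counts))
    log_w = [log_marginal(counts[:tau], a, b) + log_marginal(counts[tau:], a, b)
             for tau in taus]
    top = max(log_w)                          # subtract max for numerical stability
    weights = [math.exp(v - top) for v in log_w]
    z = sum(weights)
    return {tau: w / z for tau, w in zip(taus, weights)}


# Hypothetical weekly trip counts with a drop in travel frequency.
weekly_trips = [10, 11, 9, 10, 12, 10, 3, 2, 4, 3, 2, 3]
posterior = changepoint_posterior(weekly_trips)
print(max(posterior, key=posterior.get))
```

The same structure would apply to the other two dimensions by swapping the Poisson-Gamma pair for a distribution suited to times of travel or to origins and destinations, for example a categorical likelihood with a Dirichlet prior.
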