Real time transit demand prediction capturing station interactions and impact of special events

Title: Real time transit demand prediction capturing station interactions and impact of special events
Publication Type: Journal Article
Year of Publication: Submitted
Authors: Peyman Noursalehi, Haris N. Koutsopoulos, Jinhua Zhao
Journal: Transportation Research Part B
Keywords: AFC data, correlation clustering, dynamic factor models, real time prediction, state-space models, station arrivals
Abstract

Demand for public transportation is strongly affected by passengers' experience and the level of service provided. It is therefore vital for transit agencies to deploy adaptive strategies that respond to changes in demand or supply in a timely manner and prevent deterioration in service quality. In this paper, a real-time prediction methodology based on univariate and multivariate state-space models is developed to predict short-term passenger arrivals at transit stations. A univariate state-space model is developed at the station level. Stations with similar arrival patterns are then identified through a hierarchical clustering algorithm with correlation distance. A dynamic factor model is proposed for each cluster, capturing station interdependencies through a set of common factors. Both approaches can model the effect of exogenous events (such as football games). Ensemble predictions are then obtained by combining the outputs of the two models, based on their respective accuracy. We evaluate these models using automated fare collection (AFC) data from the 32 stations on the Central line of the London Underground (LU), operated by Transport for London (TfL). The results indicate that the proposed methodology performs well in predicting short-term station arrivals for the set of test days. For most stations, the ensemble prediction has the lowest mean error and the smallest range of error, and exhibits more robust performance across the test days. The paper also briefly discusses the use of data from Twitter and other social media platforms to detect unplanned events in real time and to use them as inputs to the models.
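
Two steps named in the abstract, grouping stations by correlation distance and combining two forecasts according to their accuracy, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the average-linkage choice, the fixed number of clusters, and the inverse-error weighting rule are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_stations(arrivals, n_clusters):
    """Group stations whose arrival profiles are highly correlated.

    arrivals: array of shape (n_stations, n_intervals), one row per station.
    Returns an array of cluster labels, one per station.
    """
    # Correlation distance is 1 minus the Pearson correlation of two rows.
    dist = pdist(arrivals, metric="correlation")
    # Average linkage is an assumption; the abstract does not name a linkage.
    tree = linkage(dist, method="average")
    # Cut the tree so that at most n_clusters groups remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def ensemble_forecast(pred_a, pred_b, err_a, err_b):
    """Combine two forecasts with weights inversely proportional to error.

    err_a and err_b are recent error measures (for example, mean squared
    error) of each model and must be positive. Inverse-error weighting is
    an assumed rule, chosen only to illustrate accuracy-based combination.
    """
    w_a = 1.0 / err_a
    w_b = 1.0 / err_b
    return (w_a * pred_a + w_b * pred_b) / (w_a + w_b)


if __name__ == "__main__":
    # Synthetic arrival profiles: two stations follow one pattern,
    # two follow another. These numbers are made up for the example.
    t = np.linspace(0.0, 2.0 * np.pi, 48)
    arrivals = np.vstack([
        100 + 50 * np.sin(t),
        200 + 80 * np.sin(t),
        100 + 50 * np.cos(t),
        150 + 30 * np.cos(t),
    ])
    print(cluster_stations(arrivals, n_clusters=2))
    print(ensemble_forecast(100.0, 200.0, err_a=1.0, err_b=3.0))
```

In this sketch the model with the smaller recent error receives the larger weight, so the combined forecast sits closer to the more accurate model. In practice the two inputs would be the station-level state-space forecast and the cluster-level dynamic factor forecast.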