|Title||Individual-Level Trip Detection using Sparse Call Detail Record Data based on Supervised Statistical Learning|
|Publication Type||Conference Paper|
|Year of Publication||2016|
|Authors||Zhan Zhao, Jinhua Zhao, Haris Koutsopoulos|
|Conference Name||Transportation Research Board 95th Annual Meeting|
Despite a large body of literature on trip detection using Call Detail Record (CDR) data, a fundamental understanding of the data's limitations is lacking; in particular, its sparse nature is not well addressed in existing work. This paper develops a conceptual framework that makes an explicit distinction between the telecommunication patterns captured by CDRs and the travel patterns of interest to the transportation community. Motivated by the over-reliance of existing trip detection methods on heuristics and assumptions, the authors propose using data fusion to form labeled data for supervised statistical learning. In the absence of complementary data, this can be done by extracting labeled observations from the more granular cellular data access records and extracting feature vectors from voice-call and SMS records. The proposed approach is demonstrated on real-world CDR data from a Chinese city by inferring whether a hidden visit exists between two consecutive visits observed in the CDR data. Logistic regression, support vector machine (SVM), and artificial neural network (ANN) models are used to develop statistical classifiers, and all show significant improvement over the naïve rule that assumes no hidden visit. This study provides a deeper understanding of how trips in human mobility can, and should, be extracted from telecommunication CDRs. The proposed data fusion approach offers a flexible and systematic way to make inferences about individual mobility patterns, even when only CDR data is available.
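The classification task described in the abstract (deciding whether a hidden visit lies between two consecutive observed visits, and comparing against the naïve "no hidden visit" rule) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two features (time gap and distance between the observed visits), the synthetic data generator, and the hand-written gradient-descent logistic regression. It uses no real CDR data and does not reproduce the authors' features, models, or results.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_pairs(n, rng):
    """Generate synthetic (features, label) rows. NOT real CDR data.

    Each row stands for a pair of consecutive observed visits.
    Hypothetical features: time gap (hours) and distance (km).
    Label 1 means a hidden visit occurred in between.
    """
    rows = []
    for _ in range(n):
        gap_hours = rng.uniform(0.0, 12.0)
        dist_km = rng.uniform(0.0, 20.0)
        # Assumed toy relationship: longer gaps and distances make a
        # hidden visit more likely. The coefficients are arbitrary.
        p_hidden = sigmoid(1.0 * gap_hours + 0.1 * dist_km - 9.0)
        label = 1 if rng.random() < p_hidden else 0
        # Scale both features to [0, 1] so gradient descent behaves well.
        rows.append(([gap_hours / 12.0, dist_km / 20.0], label))
    return rows


def train_logistic(rows, epochs=300, lr=1.0):
    """Fit logistic regression with full-batch gradient descent."""
    w = [0.0] * len(rows[0][0])
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in rows:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(rows, predict):
    """Share of rows whose predicted label matches the true label."""
    return sum(1 for x, y in rows if predict(x) == y) / len(rows)


rng = random.Random(0)
train_rows = make_toy_pairs(600, rng)
test_rows = make_toy_pairs(400, rng)

w, b = train_logistic(train_rows)


def naive_rule(x):
    """Baseline from the abstract: always assume no hidden visit."""
    return 0


def model_rule(x):
    """Predict a hidden visit when the fitted probability is at least 0.5."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if sigmoid(z) >= 0.5 else 0


naive_acc = accuracy(test_rows, naive_rule)
model_acc = accuracy(test_rows, model_rule)
print("naive rule accuracy:", round(naive_acc, 3))
print("logistic regression accuracy:", round(model_acc, 3))
```

In the setting the abstract describes, the labels would come from the denser cellular data access records and the feature vectors from voice-call and SMS records; here a toy generator plays both roles. An SVM or ANN could replace the hand-written classifier without changing the evaluation against the naïve rule.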