Katrakazas C., Antoniou C., Yannis G., “Time series classification using imbalanced learning for real-time safety assessment”, Proceedings of the Transportation Research Board (TRB) 98th Annual Meeting, Washington, D.C., 13-17 January 2019.

Estimating the probability of a traffic collision in real time primarily depends on comparing traffic conditions just before a collision with traffic conditions during normal operations. Most studies, however, use aggregated traffic data and do not account for the dynamic nature of collisions or the imbalance of safety databases, which can lead to erroneous real-time predictions. This study addresses these limitations by using raw speed time series of varying duration (1 to 5 minutes) from a driving simulator experiment, together with imbalanced learning techniques. Two classifiers are employed to examine the proposed idea: (i) Random Forests (RFs), an ensemble classifier, and (ii) Neural Networks (NNs), a popular classifier in the literature. These classifiers are tested on the original time series data, as well as on time series treated with undersampling and with undersampling combined with oversampling. The main results demonstrate the viability of using raw speed time series data for real-time safety assessment, with 4-minute time series yielding the best classification results. Furthermore, RFs perform well even on 1-minute time series, and imbalanced learning approaches can improve the classification results by up to 40%. The classification results also outperform similar approaches in the literature. However, real-world traffic data and more sophisticated classifiers (e.g. Deep Learning) are expected to provide more effective collision predictions.
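As a rough illustration of the resampling idea described in the abstract, the sketch below undersamples the majority class and then oversamples the minority class on a set of labelled speed time series. It is not the authors' implementation: the abstract does not name the specific resampling algorithms, so plain random undersampling and random duplication are used here as stand-ins. The function name `rebalance`, its parameters, the 1 Hz sampling rate, and the toy speed values are all assumptions made for the example.

```python
import random
from collections import Counter


def rebalance(series, labels, majority_target, minority_target, seed=0):
    """Randomly undersample the majority class, then randomly oversample the rest.

    Hypothetical helper: a stand-in for the (unspecified) techniques in the study.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_label = max(counts, key=counts.get)
    out_series, out_labels = [], []
    for label in counts:
        idx = [i for i, y in enumerate(labels) if y == label]
        if label == majority_label:
            # Undersampling: keep a random subset, drawn without replacement.
            chosen = rng.sample(idx, min(majority_target, len(idx)))
        else:
            # Oversampling: keep every original and add random duplicates.
            extra = max(0, minority_target - len(idx))
            chosen = idx + [rng.choice(idx) for _ in range(extra)]
        out_series.extend(series[i] for i in chosen)
        out_labels.extend([label] * len(chosen))
    return out_series, out_labels


# Toy data (invented): 95 "normal" and 5 "pre-collision" speed series,
# each 60 samples long, i.e. 1 minute at an assumed 1 Hz sampling rate.
data_rng = random.Random(1)
normal = [[data_rng.gauss(90, 5) for _ in range(60)] for _ in range(95)]
risky = [[data_rng.gauss(70, 15) for _ in range(60)] for _ in range(5)]
X = normal + risky
y = [0] * 95 + [1] * 5

X_bal, y_bal = rebalance(X, y, majority_target=20, minority_target=20)
balanced_counts = Counter(y_bal)
```

The rebalanced set would then be passed to a classifier such as a Random Forest, with each time series treated as one feature vector. Resampling should be applied to the training split only, so that the evaluation data keeps the original class imbalance.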