Paradiso S., Ziakopoulos A., Yannis G., “Hierarchical Clustering on Graph Embeddings: A Scalable Approach to Risky Intersections”, Proceedings of the 20th Road Safety on Five Continents Conference, Leeds, UK, 3-5 September 2025.

It is well-documented that road crashes and their consequences continue to be a serious global societal concern, as 1.19 million road users lost their lives in 2021 due to involvement in crashes (WHO, 2023). To investigate their causes and to propose solutions, Machine Learning (ML) and Deep Learning (DL) techniques have gained significant attention among road safety researchers along with the increasing availability of data (Sohail et al., 2023). Researchers have extended DL methods to graph-structured thanks to the Graph Neural Network (GNN) (Scarselli et al., 2009), by incorporating the attention mechanism, enabling the model to capture graph context and relationships more effectively, as with the Graph Attention Network (GAT) (Veličković et al., 2018). Telematics data can provide insights into road safety assessments, together with the geometric data offering more context. The present study aims to explore a hierarchy of data point groups using an agglomerative hierarchical clustering, where the data points are the nodes in the analysed graph, and improve the analysis by involving edge features employing the GAT model.such as trip ID, geographic coordinates, speed, and several binary indicators of risky driving behaviour. These data have been spatially aggregated onto the road network extracted from OpenStreetMap (2025) with aggregation methods chosen according to variable type: numerical features (e.g., speed) were averaged, while discrete events (e.g., binary flags) were summed. In the present study the GAT model was used to obtain node embeddings involving edge features of a graph. The single method points merging at low heights, related to bridging points leading to the chain effect, whereas the complete method points merging at various heights. The average and centroid methods yield similar dendrograms having short stems leading to a growing main cluster. The median method highlights a small cluster separated from the central one. Ward’s method yields a good CCC, enabling a separation of the dense data points. However, it does not rely on pairwise distances, sacrificing distance accuracy in favour of forming clusters, identifying clusters even when only statistical fluctuations are present. Yet, different stem lengths in the dendrogram indicate a certain level of separation in the data (Benatti & Costa, 2024). Thus, Ward’s method was chosen for achieving a more distinct separation in the data. Discussion focuses on the selected Ward’s method applied to feature embeddings. The dendrogram encourages separation in 2 clusters, giving as output one risky cluster displaying increased speeding and mobile usage and more frequent harsh events, while the other one exhibits lower risk traits. Downwards, the number of clusters increased to 3, resulting in the previous safer cluster being split into two subgroups: one with intermediate characteristics in terms of speeding and harsh events, yet interestingly displaying the highest mobile usage, besides the highest trip frequency.