From Complexity to Clarity: Simplifying OpenStreetMap Data for Improved Active Transportation Analysis

Achituv Cohen


OpenStreetMap (OSM) provides detailed street networks essential for analyzing active transportation (AT) infrastructure. However, the granularity and inconsistencies in OSM data pose challenges in modeling AT users' movements at the street level. This study proposes a novel methodology to generate axial networks for AT users using OSM data, simplifying the network while preserving topology. Applied to cities like Turin, Tel Aviv, and San Francisco, this approach effectively streamlined pedestrian and bicycle networks. The proposed methodology can enhance AT infrastructure, contributing to safer and more efficient urban mobility.

Introduction
OpenStreetMap (OSM) offers comprehensive street networks that span nearly every city worldwide. The networks contain essential details like road type and name, and individual streets are often represented through multiple concurrent segments. The segments include separate lanes for motor vehicles, dedicated bike paths, and pedestrian walkways.
OSM's comprehensive dataset allows the creation of specialized networks tailored to pedestrian and cyclist analysis, providing users with a powerful tool for understanding and improving active transportation (AT) infrastructure. These specialized networks play a pivotal role in monitoring and understanding walking and bicycling patterns, contributing to infrastructure enhancement aligned with cities' goals of fostering AT to combat traffic congestion, obesity, and air pollution (Nelson et al., 2021).
Nevertheless, the granularity of these networks may pose challenges in modeling AT users' movements at the street level. For example, assessing AT safety or computing a walkability index per street requires significant manipulation of the data as it is currently represented in OSM. Furthermore, owing to OSM's open-editing model, the standards for mapping elements are not consistently defined, leading to contentions and variations in data quality (Haklay, 2010). Individuals often map elements based on personal needs and knowledge, introducing inconsistencies in the dataset. Consequently, in some locales, all designated lanes for various users are meticulously mapped, while in others, only a single line representing the presence of a street is depicted. Additionally, there are instances where only lanes for motor vehicles are detailed, with scant attention paid to lanes catering to other road user groups.
Previous research has endeavored to address this issue. For example, one study proposed a topology-preserving simplification of OSM network data for large-scale simulation in SUMO (Meng et al., 2022). However, that work focuses primarily on less complex areas and prioritizes vehicular considerations over those of AT users when generating the new network.
Aim
In this study, we propose an innovative solution to generate an axial network specifically designed for monitoring and analyzing AT users, using exclusively OSM data. Our approach simplifies the network while preserving its topology. We apply this solution across diverse spatial contexts, from straightforward geographic regions to complex urban environments. Furthermore, we implement our methodology in several cities worldwide: Tel Aviv (Israel), Turin (Italy), and San Francisco (United States), each characterized by unique urban structures and varying levels of economic development.

Methodology
The preliminary tasks use OSMnx (Boeing, 2017) to acquire OSM street network data, converting it into a graph while correcting topological errors. Then, the data is stored in a geodata table, including polyline geometry, names, and road types. Our algorithm filters out unsuitable roads, like motorways and trunk roads, and replaces roundabouts with their central points.
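The filtering and roundabout steps can be sketched in plain Python, using made-up edge records in place of real OSMnx output. The names `EXCLUDED_HIGHWAYS`, `is_suitable`, and `roundabout_center` are hypothetical, the `_link` variants in the exclusion list are an assumption, and taking the mean of the ring's vertices as the "central point" is only one plausible reading of the text, not necessarily the authors' implementation.

```python
# Hypothetical exclusion list: the text names motorways and trunk roads;
# including the *_link variants is an assumption.
EXCLUDED_HIGHWAYS = {"motorway", "motorway_link", "trunk", "trunk_link"}


def is_suitable(highway):
    """Return True if no highway value on the edge is in the excluded set."""
    # OSM-derived edges may carry a single value or a list of values.
    values = highway if isinstance(highway, list) else [highway]
    return not any(value in EXCLUDED_HIGHWAYS for value in values)


def roundabout_center(ring):
    """Mean of a ring's vertices, ignoring a repeated closing vertex."""
    points = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else ring
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Made-up records for illustration only.
edges = [
    {"name": "Street A", "highway": "primary"},
    {"name": "Street B", "highway": "motorway"},
    {"name": "Street C", "highway": ["trunk_link", "secondary"]},
]
kept = [edge["name"] for edge in edges if is_suitable(edge["highway"])]
center = roundabout_center([(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)])
```

Here an edge is dropped as soon as any of its highway values is excluded; a more permissive rule would be an equally defensible choice.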
The network is then ready for the multilane detection algorithm. Polylines identified as part of a multilane scenario are aggregated into a centerline and added to the Simplification OSM Data (SOD) network; all other polylines are added to the SOD network with their original geometry. A multilane scenario is identified when polylines share the same street name, have similar angles, and lie close together. Polylines are first grouped by street name. Their azimuths are folded into the 0°-180° range so that parallel lines are treated as parallel regardless of drawing direction. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) then clusters similarly angled polylines using a 10° radius and a minimum of two samples, and outliers with significantly different angles are excluded. Within each remaining cluster, a right-shifted buffer is applied to each polyline. If two or more polylines in the same cluster overlap by at least 10% in their shifted buffers, they are classified as multilane, and the entire cluster is replaced with one or more centerlines.
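The angle-based grouping can be illustrated with a small standard-library sketch. It folds azimuths into [0°, 180°) and runs a minimal DBSCAN over them with the 10° radius and two-sample minimum from the text. The function names are hypothetical, the hand-written DBSCAN stands in for whatever library implementation the authors used, and the wrap-around distance (so that 178° and 0° count as 2° apart) is an assumption the text does not state.

```python
import math


def folded_azimuth(start, end):
    """Azimuth of the segment start->end in degrees, folded into [0, 180)."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    azimuth = math.degrees(math.atan2(dx, dy)) % 360.0  # clockwise from north
    return azimuth % 180.0


def axial_difference(a, b):
    """Smallest difference between two folded azimuths, wrapping at 180."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def cluster_by_angle(angles, eps=10.0, min_samples=2):
    """Minimal DBSCAN over folded azimuths; one label per angle, -1 = noise."""
    n = len(angles)
    # Each point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if axial_difference(angles[i], angles[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        # Skip points already assigned and points that are not core points.
        if labels[i] != -1 or len(neighbors[i]) < min_samples:
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] != -1:
                continue
            labels[j] = cluster
            # Only core points extend the cluster further.
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])
        cluster += 1
    return labels


labels = cluster_by_angle([0.0, 5.0, 178.0, 90.0, 92.0, 45.0])
```

In this example the first three angles form one cluster, the pair near 90° forms a second, and the lone 45° polyline is left as an outlier. The buffer-overlap test that follows in the text needs a geometry library and is not shown.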
The core idea behind creating a new centerline is to identify its start and end vertices and then add intermediate vertices to preserve the overall shape of multilane polylines. Overlapping buffers are merged into a single polygon, and the two polygon vertices that best define the original polylines are used to establish the new centerline's start and end. Intermediate vertices are then added at regular intervals to maintain the original polylines' orientation.
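The idea of fixing a start and end and then adding intermediate vertices at regular intervals can be illustrated with a deliberately simplified stand-in. Instead of the merged-buffer polygon described above, this sketch averages two roughly parallel polylines at evenly spaced fractions of their length. The helper names and the two-line restriction are assumptions made for illustration; this is not the authors' procedure.

```python
import math


def point_at_fraction(line, t):
    """Point located at fraction t (0..1) of the polyline's total length."""
    segments = list(zip(line, line[1:]))
    lengths = [math.dist(p, q) for p, q in segments]
    target = t * sum(lengths)
    for (p, q), length in zip(segments, lengths):
        if length > 0 and target <= length:
            r = target / length
            return (p[0] + r * (q[0] - p[0]), p[1] + r * (q[1] - p[1]))
        target -= length
    return line[-1]  # rounding pushed the target past the last segment


def simple_centerline(a, b, intervals):
    """Average two roughly parallel polylines at regular length fractions."""
    # Reverse b if it was drawn in the opposite direction to a.
    same = math.dist(a[0], b[0]) + math.dist(a[-1], b[-1])
    flipped = math.dist(a[0], b[-1]) + math.dist(a[-1], b[0])
    if flipped < same:
        b = b[::-1]
    center = []
    for i in range(intervals + 1):
        t = i / intervals
        pa = point_at_fraction(a, t)
        pb = point_at_fraction(b, t)
        center.append(((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2))
    return center


# Two opposite-direction carriageways two units apart.
lane_a = [(0, 0), (10, 0)]
lane_b = [(10, 2), (0, 2)]
centerline = simple_centerline(lane_a, lane_b, intervals=2)
```

The first and last averaged points play the role of the centerline's start and end vertices, and the `intervals` parameter controls how closely the result follows the shape of the original polylines.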
The simplification process significantly changes the locations of many polylines, which creates problems of continuity, topology, accuracy, and consistency in the SOD network. Most problems can be resolved using existing methods, but connecting roundabouts requires extra effort. This process has three steps: transforming the roundabout representation to a point, connecting nearby dead-end polylines, and linking polylines in proximity. Following these steps ensures proper continuity and maintains the network's topology.
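The "connecting nearby dead-end polylines" step might look roughly like the following sketch, which snaps pairs of dead-end endpoints that lie within a tolerance onto their shared midpoint. The function name, the tolerance value, and the midpoint rule are all assumptions. The sketch also treats an endpoint as a dead end whenever no other polyline starts or ends there, ignoring contacts with interior vertices, and it assumes every polyline is longer than the tolerance.

```python
import math
from collections import Counter


def snap_dead_ends(lines, tolerance):
    """Return copies of lines with nearby dead-end endpoints merged."""
    # An endpoint used by exactly one polyline is treated as a dead end.
    degree = Counter(pt for line in lines for pt in (line[0], line[-1]))
    dead_ends = [pt for pt, count in degree.items() if count == 1]

    replacement = {}
    for i, p in enumerate(dead_ends):
        for q in dead_ends[i + 1:]:
            if p in replacement or q in replacement:
                continue  # each dead end is snapped at most once
            if math.dist(p, q) <= tolerance:
                mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
                replacement[p] = replacement[q] = mid

    return [
        [replacement.get(line[0], line[0]),
         *line[1:-1],
         replacement.get(line[-1], line[-1])]
        for line in lines
    ]


# Vertices must be tuples so they can be used as dictionary keys.
network = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(10.4, 0.0), (20.0, 0.0)],
    [(20.0, 0.0), (20.0, 10.0)],
]
connected = snap_dead_ends(network, tolerance=0.5)
```

After snapping, the first two polylines share an endpoint, so the gap left by the simplification no longer breaks continuity, while the junction at (20, 0) is untouched because it was already connected.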

Findings
In the case study of Turin, Italy, the street network was simplified from an initial 49,750 polylines to 11,807, a reduction of roughly 76%. This reduced the complexity of 411 streets and converted 138 roundabouts to single points. Despite some challenges near intersections, the final version preserved topology.
For validation, 43 streets in Turin were reviewed, revealing a 51% success rate, 28% with minor flaws, and 14% with partial success. Issues such as threshold errors, external mapping inaccuracies, and ambiguous street configurations affected the results. Similar challenges were faced in Tel Aviv, Israel, where 50% of 22 test streets matched perfectly with the reference network. Yet, 23% had minor issues, and 9% were entirely incorrect. In San Francisco, the comparison showed 80% accuracy with 20 streets evaluated. While most streets were correctly simplified, minor flaws were present in a few cases, mainly due to inconsistencies between the SOD and reference networks. Overall, the methodology successfully streamlined street networks, although it achieved varying rates of success across different cities.

Discussion
Our study offers a robust methodology for generating axial networks for AT users using OSM data, balancing data simplification with topology preservation. Its successful application across diverse cities such as Turin, Tel Aviv, and San Francisco demonstrates its versatility and effectiveness in varied urban environments. The approach simplifies complex urban network data, streamlining pedestrian and bicycle analysis while retaining essential details. This enables urban planners and policymakers to better monitor and understand AT patterns, leading to infrastructure improvements that support safer, more efficient, and environmentally friendly urban mobility.
Our research offers both scientific contributions and practical benefits. Scientifically, it has been applied to evaluate the built environment for pedestrian walkability using a spatial data clustering approach. Additionally, it has been utilized in research to evaluate safety and bike network connectivity, aiming to improve bicycle and pedestrian infrastructure in California cities. Practically, this work has been published on GitHub, making the code accessible to everyone for their specific needs and goals.

References
Boeing, G. (2017). OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks. Computers, Environment and Urban Systems, 65, 126–139.
Haklay, M. (2010). How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4), 682–703.
Meng, Z., Du, X., Sottovia, P., Foroni, D., Axenie, C., Wieder, A., Eckhoff, D., Bortoli, S., Knoll, A., & Sommer, C. (2022). Topology-Preserving Simplification of OpenStreetMap Network Data for Large-scale Simulation in SUMO. SUMO Conference Proceedings, 3, 181–197.
Nelson, T., Ferster, C., Laberee, K., Fuller, D., & Winters, M. (2021). Crowdsourced data for bicycling research and practice. Transport Reviews, 41(1), 97–114.

Creative Commons Attribution 3.0 Unported https://creativecommons.org/licenses/by/3.0/
