Ruben Verborgh

Geospatial Partitioning of Open Transit Data

Harm Delva, Julián Rojas Meléndez, Pieter-Jan Vandenberghe, Pieter Colpaert, and Ruben Verborgh

One of the guiding principles of open data is that anyone can use the raw data for any purpose. Public transit operators often publish their open data as a single data dump, but developers with limited computational resources may not be able to process all this data. Existing work has already focused on fragmenting the data by departure time, so that data consumers can be more selective in the data they process. However, each fragment still contains data from the operator’s entire service area. We build upon this idea by fragmenting geospatially as well as by departure time. Our method is robust to changes in the original data, such as the deletion or the addition of stops, which is crucial in scenarios where data publishers do not control the data itself. In this paper, we explore popular clustering methods such as k-means and METIS, alongside two simple domain-specific methods of our own. We compare the effectiveness of each for the use case of client-side route planning, focusing on the ease of use of the data and the cacheability of the data fragments. Our results show that simply clustering stops by their proximity to 8 transport hubs yields the most promising results: queries are 2.4 times faster and download 4 times less data. More than anything though, our results show that the difference between clustering methods is small, and that engineers can safely choose practical and simple solutions. We expect that this insight also holds true for publishing other geospatial data such as road networks, sensor data, or points of interest.
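
To make the idea of clustering stops by their proximity to transport hubs concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes the hubs are already known, that distance is measured along a great circle (haversine), and that each stop simply belongs to its nearest hub. The function names, identifiers, and coordinates are all hypothetical and serve only as illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def assign_to_hubs(stops, hubs):
    """Map each stop id to the id of its nearest hub."""
    return {
        stop_id: min(hubs, key=lambda hub_id: haversine_km(coords, hubs[hub_id]))
        for stop_id, coords in stops.items()
    }


# Hypothetical hubs and stops, for illustration only.
hubs = {"hub_a": (51.0, 3.7), "hub_b": (50.8, 4.3)}
stops = {
    "stop_1": (51.05, 3.72),
    "stop_2": (50.85, 4.35),
    "stop_3": (50.95, 3.9),
}

partition = assign_to_hubs(stops, hubs)
for stop_id, hub_id in partition.items():
    print(stop_id, "->", hub_id)
```

Under these assumptions, adding or deleting a stop does not change the assignment of any other stop, because each stop is assigned independently of the others. This is one plausible reading of why a hub-based partitioning can stay stable when the underlying data changes.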

Published in 2020 in Proceedings of the 20th International Conference on Web Engineering.

Cite this article in your work

Cite this article easily using its BibTeX entry:

@inproceedings{delva_icwe_2020,
  author = {Delva, Harm and Rojas Mel\'endez, Juli\'an and Vandenberghe, Pieter-Jan and Colpaert, Pieter and Verborgh, Ruben},
  title = {Geospatial Partitioning of {Open Transit Data}},
  booktitle = {Proceedings of the 20th International Conference on Web Engineering},
  editor = {Bielikova, Maria and Mikkonen, Tommi and Pautasso, Cesare},
  year = 2020,
  month = jun,
  pages = {305--320},
  volume = 12128,
  series = {Lecture Notes in Computer Science},
  publisher = {Springer},
  isbn = {978-3-030-50578-3},
  doi = {10.1007/978-3-030-50578-3_21},
}

Alternatively, pick a reference of your choice below:

ACM
Harm Delva, Julián Rojas Meléndez, Pieter-Jan Vandenberghe, Pieter Colpaert, and Ruben Verborgh. 2020. Geospatial Partitioning of Open Transit Data. In Proceedings of the 20th International Conference on Web Engineering (Lecture Notes in Computer Science), Springer, 305–320.
APA
Delva, H., Rojas Meléndez, J., Vandenberghe, P.-J., Colpaert, P., & Verborgh, R. (2020). Geospatial Partitioning of Open Transit Data. In M. Bielikova, T. Mikkonen, & C. Pautasso (Eds.), Proceedings of the 20th International Conference on Web Engineering (Vol. 12128, pp. 305–320). Springer.
IEEE
H. Delva, J. Rojas Meléndez, P.-J. Vandenberghe, P. Colpaert, and R. Verborgh, “Geospatial Partitioning of Open Transit Data,” in Proceedings of the 20th International Conference on Web Engineering, 2020, vol. 12128, pp. 305–320.
LNCS
Delva, H., Rojas Meléndez, J., Vandenberghe, P.-J., Colpaert, P., Verborgh, R.: Geospatial Partitioning of Open Transit Data. In: Bielikova, M., Mikkonen, T., and Pautasso, C. (eds.) Proceedings of the 20th International Conference on Web Engineering. pp. 305–320. Springer (2020).
MLA
Delva, Harm, et al. “Geospatial Partitioning of Open Transit Data.” Proceedings of the 20th International Conference on Web Engineering, edited by Maria Bielikova et al., vol. 12128, Springer, 2020, pp. 305–20.
