Predicting train occupancies based on query logs and external data sources
On dense railway networks, such as the one in Belgium, train travelers are frequently confronted with overcrowded trains, especially during peak hours. Crowding on trains degrades the quality of service and has a negative impact on passengers' well-being. To encourage travelers to consider less crowded trains, the iRail project wants to show an occupancy indicator, produced by predictive modelling, in its route planning applications. As no official occupancy data is available, training data is gathered by crowdsourcing through the Web app iRail.be and the Railer application for iPhone. Users can indicate their departure and arrival stations and the time at which they took a train, and classify the occupancy of that train as low, medium or high. While preliminary results on a limited data set indicate that the models do not yet perform sufficiently well, we are convinced that, with further research and a larger amount of data, our predictive model will be able to achieve higher predictive performance. To that end, all datasets used in the current research are made publicly available under an open license on the iRail website and in the form of a Kaggle competition. Moreover, an infrastructure has been set up that automatically processes new logs submitted by users, so that our model can learn continuously. Occupancy predictions for future trains are made available through an API.
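As an illustration of the crowd-sourced reports described above, the following minimal sketch shows how one such report could be represented and turned into model inputs. It is not the authors' implementation: the field names, the feature set, and the peak-hour windows are all assumptions made for this example; only the three occupancy classes come from the abstract.

```python
from dataclasses import dataclass
from datetime import datetime

# Occupancy classes named in the abstract, mapped to integer labels.
OCCUPANCY_CLASSES = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class OccupancyLog:
    """One crowd-sourced report (field names are hypothetical)."""
    departure_station: str
    arrival_station: str
    departure_time: datetime
    occupancy: str  # "low", "medium" or "high"


def to_features(log: OccupancyLog) -> dict:
    """Derive simple inputs from a report; the peak windows are assumed."""
    hour = log.departure_time.hour
    weekday = log.departure_time.weekday()  # Monday is 0
    # Assumed peak hours: weekday mornings 07-09 and evenings 16-19.
    is_peak = weekday < 5 and (7 <= hour < 9 or 16 <= hour < 19)
    return {
        "from": log.departure_station,
        "to": log.arrival_station,
        "hour": hour,
        "weekday": weekday,
        "is_peak": is_peak,
    }


def to_label(log: OccupancyLog) -> int:
    """Encode the reported occupancy class as an integer target."""
    return OCCUPANCY_CLASSES[log.occupancy]


# Example report with made-up values.
example = OccupancyLog(
    departure_station="Gent-Sint-Pieters",
    arrival_station="Brussel-Zuid",
    departure_time=datetime(2017, 3, 6, 7, 45),
    occupancy="high",
)
features = to_features(example)
label = to_label(example)
```

Features and labels in this shape could be passed to any standard classifier; which model and which external data sources the paper actually uses is not stated in the abstract.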
Published in 2017 in Proceedings of the 7th International Workshop on Location and the Web.
Keywords: Web, research
Cite this paper in your publications
- Refer to this paper as: Vandewiele, G., Colpaert, P., Janssens, O., Van Herwegen, J., Verborgh, R., Mannens, E., Ongenae, F., et al. (2017), “Predicting train occupancies based on query logs and external data sources”, in Proceedings of the 7th International Workshop on Location and the Web.