Guidelines: To run the code without having to merge all the datasets from scratch, download the entire folder (which contains ‘base_trajet_total.csv’ and ‘temps_trajet.csv’).
Keywords: Dijkstra, metro, Shiny
I/ Aim of project
The aim of the project was to find the optimal midway point for two users in a city to meet, using Dijkstra's algorithm. A visualisation tool showing the midway point and nearby meeting place suggestions (café, bar, restaurant) was integrated into a Shiny App. The midway point was computed from the public Paris metro station dataset (‘Open RATP’ website). The scope of the dataset was limited to the metro and suburban trains (RER), excluding buses.
For example, if one person is at the ‘Bastille’ metro station and the other at ‘Champs Elysées Clémenceau’, the app finds the optimal midway station and its geographical coordinates. The Shiny App then displays café, bar and restaurant recommendations around this midway point.
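The midway search can be illustrated with a minimal sketch, written here in Python for illustration only (it is not the project's code). It assumes that "optimal" means the stop that minimises the longer of the two journeys, which this document does not specify, and that every connection can be travelled in both directions. The stop names and travel times are made-up toy values, not RATP data.

```python
import heapq


def build_graph(edges):
    """Build an adjacency list from (stop_in, stop_out, weight) rows.

    Assumption: every connection is usable in both directions.
    """
    graph = {}
    for stop_in, stop_out, weight in edges:
        graph.setdefault(stop_in, []).append((stop_out, weight))
        graph.setdefault(stop_out, []).append((stop_in, weight))
    return graph


def dijkstra(graph, source):
    """Return the shortest travel time from source to every reachable stop."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, stop = heapq.heappop(queue)
        if d > dist.get(stop, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbour, weight in graph.get(stop, []):
            new_d = d + weight
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                heapq.heappush(queue, (new_d, neighbour))
    return dist


def meeting_stop(graph, start_a, start_b):
    """Pick the stop that minimises the longer of the two journeys.

    Ties are broken by total travel time, then by stop name.
    This criterion is an assumption made for the sketch.
    """
    dist_a = dijkstra(graph, start_a)
    dist_b = dijkstra(graph, start_b)
    common = set(dist_a) & set(dist_b)
    return min(
        common,
        key=lambda s: (max(dist_a[s], dist_b[s]), dist_a[s] + dist_b[s], s),
    )


# Toy network with made-up travel times in minutes.
edges = [("A", "B", 2), ("B", "C", 3), ("C", "D", 2), ("D", "E", 4), ("B", "E", 12)]
graph = build_graph(edges)
midway = meeting_stop(graph, "A", "E")
print(midway)
```

Running Dijkstra once from each user's stop keeps the search at two shortest-path computations, whatever the number of candidate stops.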
II/ Methodology
Datasets were taken from the “RATP_GTFS_FULL” folder on the following website: https://data.ratp.fr/explore/dataset/offre-transport-de-la-ratp-format-gtfs/
This folder contains separate tables for stops, trips, routes and stop_times, which we merged into a single dataset.
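The merge can be sketched as a chain of key lookups, shown below in Python for illustration (the project's own merge code is not reproduced here). The column names trip_id, route_id, stop_id, stop_sequence, route_short_name and stop_name are standard GTFS fields; the sample rows are invented placeholders, and in practice the rows would be read from the GTFS text files, for instance with csv.DictReader.

```python
def index_by(rows, key):
    """Map each row's key value to the row, for direct lookups."""
    return {row[key]: row for row in rows}


def merge_gtfs(stop_times, trips, routes, stops):
    """Join stop_times with trips, routes and stops into one flat table."""
    trips_by_id = index_by(trips, "trip_id")
    routes_by_id = index_by(routes, "route_id")
    stops_by_id = index_by(stops, "stop_id")
    merged = []
    for st in stop_times:
        trip = trips_by_id[st["trip_id"]]       # stop_times -> trips
        route = routes_by_id[trip["route_id"]]  # trips -> routes
        stop = stops_by_id[st["stop_id"]]       # stop_times -> stops
        merged.append({**st, **trip, **route, **stop})
    return merged


# Invented placeholder rows, not RATP data.
stops = [
    {"stop_id": "S1", "stop_name": "Stop one"},
    {"stop_id": "S2", "stop_name": "Stop two"},
]
routes = [{"route_id": "R1", "route_short_name": "X"}]
trips = [{"trip_id": "T1", "route_id": "R1", "direction_id": "0"}]
stop_times = [
    {"trip_id": "T1", "stop_id": "S1", "stop_sequence": "1"},
    {"trip_id": "T1", "stop_id": "S2", "stop_sequence": "2"},
]

merged = merge_gtfs(stop_times, trips, routes, stops)
print(merged[0])
```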
After merging, the combined dataset was sorted by:
- direction of the line (outbound or return)
- branch (several branches per line)
- stop sequence (order of stations along the line)
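The three sort keys above can be combined into a single ordering, as in this short Python sketch (an illustration, not the project's code). direction_id and stop_sequence are standard GTFS fields; the "branch" column is a hypothetical name for whatever identifier the project used, and the rows are invented. Values read from CSV files are strings, so the numeric keys are converted first, otherwise "10" would sort before "2".

```python
# Invented rows; "branch" is a hypothetical column name.
rows = [
    {"direction_id": "1", "branch": "b1", "stop_sequence": "2", "stop_id": "S5"},
    {"direction_id": "0", "branch": "b2", "stop_sequence": "1", "stop_id": "S3"},
    {"direction_id": "0", "branch": "b1", "stop_sequence": "10", "stop_id": "S2"},
    {"direction_id": "0", "branch": "b1", "stop_sequence": "2", "stop_id": "S1"},
    {"direction_id": "1", "branch": "b1", "stop_sequence": "1", "stop_id": "S4"},
]


def sort_key(row):
    """Order by direction, then branch, then position along the line."""
    return (int(row["direction_id"]), row["branch"], int(row["stop_sequence"]))


ordered = sorted(rows, key=sort_key)
ordered_ids = [row["stop_id"] for row in ordered]
print(ordered_ids)
```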
Some of the challenges encountered were loops on some lines and irregular stop sequences, which we corrected to enable visualisation and the computation of Dijkstra's algorithm. Visualisations were produced using the ggplot and leaflet packages, with one colour assigned per line.
To prepare for Dijkstra's algorithm, travel times between stops were calculated from each pair of input (origin) and output (destination) stops. Walking times between stops (connections on foot) were included to adjust the timing computations. The database was then simplified to a 3-column dataframe containing the stop input, the stop output and the time difference between input and output (‘weight’).
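The reduction to a 3-column table can be sketched as follows in Python (an illustration under stated assumptions, not the project's code). It assumes the weight is the arrival time at the next stop minus the departure time at the current stop, in seconds; the document does not say which of the two GTFS time columns was used. The timetable rows and the walking connection are invented values. GTFS times are written as HH:MM:SS and may exceed 24:00:00 for trips that run past midnight, so they are converted to seconds rather than parsed as clock times.

```python
def to_seconds(hms):
    """Convert a GTFS 'HH:MM:SS' time to seconds; hours may exceed 23."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def edge_weights(trip_rows):
    """Turn one trip's stop rows (already in stop order) into
    (stop_in, stop_out, weight) tuples, one per consecutive pair of stops."""
    edges = []
    for current, following in zip(trip_rows, trip_rows[1:]):
        weight = to_seconds(following["arrival_time"]) - to_seconds(
            current["departure_time"]
        )
        edges.append((current["stop_id"], following["stop_id"], weight))
    return edges


# Invented timetable rows for a single trip running past midnight.
trip = [
    {"stop_id": "S1", "arrival_time": "23:58:00", "departure_time": "23:58:30"},
    {"stop_id": "S2", "arrival_time": "24:00:30", "departure_time": "24:01:00"},
    {"stop_id": "S3", "arrival_time": "24:03:00", "departure_time": "24:03:30"},
]

# Hypothetical walking connection between two nearby stops, in seconds.
walking = [("S3", "S4", 180)]

edges = edge_weights(trip) + walking
print(edges)
```

The resulting list of tuples has the same shape as the 3-column dataframe (stop input, stop output, weight) and can be fed directly to a shortest-path routine.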
For example, the RER B line had 10 different branches; these were reduced to just 2 (KOCQ and SOIR) because of irregular stop sequences on the other branches (duplicate stop sequence numbers for one stop, stops being skipped on the line). Metro Line 7 had loops, which were split into separate journeys.
The café, restaurant and bar recommendations were retrieved from the Google Places API (https://developers.google.com/places/web-service/) and linked to the midway calculation through geographical coordinates.
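As a hedged illustration of how the midway coordinates can drive such a lookup, the Python sketch below builds request URLs for the Places API Nearby Search endpoint. The document only names the Places web service, so the choice of this endpoint, the 300 m radius, the placeholder coordinates and the "YOUR_API_KEY" value are all assumptions. No request is sent; the sketch only shows how the coordinates enter the query.

```python
from urllib.parse import urlencode

# Nearby Search endpoint of the Places web service (assumed choice of endpoint).
PLACES_NEARBY_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"


def nearby_search_url(lat, lon, place_type, api_key, radius_m=300):
    """Build a Nearby Search URL centred on the midway coordinates."""
    params = {
        "location": f"{lat},{lon}",  # latitude,longitude of the midway stop
        "radius": radius_m,          # search radius in metres (assumed value)
        "type": place_type,          # e.g. "cafe", "bar" or "restaurant"
        "key": api_key,
    }
    return f"{PLACES_NEARBY_URL}?{urlencode(params)}"


# Placeholder coordinates and key, for illustration only.
urls = [
    nearby_search_url(48.0, 2.0, place_type, "YOUR_API_KEY")
    for place_type in ("cafe", "bar", "restaurant")
]
for url in urls:
    print(url)
```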
The project was instructive in several respects:
- Narrowing the project scope (metro and RER, excluding buses)
- Arranging the database by direction, branch and stop sequence
- Excluding outliers (loops, irregular stop sequences) and sorting the databases by key variables
- Visualising the network with the leaflet package
- Integrating everything into a Shiny App through its user interface and server components