wri-cities / static-gtfs-manager
GUI interface for creating, editing, exporting of static GTFS data for a public transit authority
License: GNU General Public License v3.0
This could get a little tricky as the conventional flow is that the tabulator table is the main data holder and the map populates itself from there.
Initial thoughts on how to achieve this:
To "break out" of the map constraints, user presses the button or toggle again. At that point, need to run a function to remove filtering on the tabulator table.
Tricky parts : There may currently be triggers set to populate map based on what the table is showing. That shouldn't set off an infinite loop with both map and table going millennial on us and getting triggered by each other endlessly.
Ref:
Backend log file to capture all the print() statements, with timestamps. Maybe CSV with categorization. In Maintenance section, let user download and clear log file.
Possible categories : DB write, DB read, which page, which section, etc.
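A minimal sketch of such a log, assuming a three-column CSV layout. The names log_event, clear_log and LOG_FIELDS are hypothetical and not part of the existing code:

```python
import csv
import datetime
import os

# Hypothetical column layout for the log file.
LOG_FIELDS = ["timestamp", "category", "message"]


def log_event(log_path, category, message):
    """Append one timestamped, categorized row to a CSV log file."""
    # Write the header only when the file is new or has just been cleared.
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.datetime.now().isoformat(timespec="seconds"),
            "category": category,
            "message": message,
        })


def clear_log(log_path):
    """Empty the log file, e.g. after the user has downloaded it."""
    open(log_path, "w").close()
```

Calls like log_event(path, "DB write", "saved routes") could then stand in for the existing print() statements.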
Calendar: date picker for start and end dates
The tabulator JS library on the front end seems to have some issues with Firefox browser. Yet to do full testing, would be great if folks who are more invested in making this work in Firefox (mozilla friends?) can take this up. At my end I'm satisfied with running this on Chrome / Chromium browser.
Can use the dataFiltered callback from here: http://tabulator.info/docs/3.4#callbacks-filter
Apply the shakeIt() function in all web pages, wherever the save button (or similar) is pressed while data has not been entered properly. Right now it works in the Misc > Maintenance section and a few other places. Apply it everywhere.
Schedules page, Trips tab.
Show status messages like loading.. loaded etc next to route selector for loading trips. As this can take time.
This was an optional field in the GTFS spec, but if we want to support multiple agencies then this becomes necessary. For now, add a column "agency_id" to the routes tabulator table.
For later: make dropdown for selecting agency from existing agencies.
Sequence stop addition: exclude stops already in list
Some reservations : may there be cases of a route circling around or going into a constrained area, coming back out the same way, and having the same stop further in the sequence?
Bug seen on first time load on heroku (but not on local machine):
The shape's sequence order is messed up on loading from DB. Checking db.json, confirmed : the rows are stored as 1,10,100... ie text sorting.
"shapes": {
"1": {
"shape_dist_traveled": 0.0,
"shape_id": "R001_0",
"shape_pt_lat": 10.110608860405785,
"shape_pt_lon": 76.34918872659786,
"shape_pt_sequence": 1
},
"10": {
"shape_dist_traveled": 0.63,
"shape_id": "R001_0",
"shape_pt_lat": 10.105114690637768,
"shape_pt_lon": 76.35012437809694,
"shape_pt_sequence": 10
},
"100": {
"shape_dist_traveled": 15.880000000000006,
"shape_id": "R001_0",
"shape_pt_lat": 9.990707688393124,
"shape_pt_lon": 76.28752271426649,
"shape_pt_sequence": 100
},
"101": { ...
When we upload a fresh shapefile, the ordering is proper. It's also peculiar that just the onward direction shape's order is messed up while the return direction shape is fine. So this seems to be an artefact from the GTFS import mechanism.
In any case, the program needs to have a sorting mechanism when retrieving the data for a shape. This will probably be better done at python end itself.
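A minimal sketch of sorting on the Python side, assuming the shape rows arrive as a list of dicts (for example the values of the "shapes" object shown above). The function name sorted_shape_points is hypothetical:

```python
def sorted_shape_points(shape_rows, shape_id):
    """Return the points of one shape, ordered numerically by sequence.

    shape_rows: iterable of dicts with at least the keys
    'shape_id' and 'shape_pt_sequence'.
    """
    points = [row for row in shape_rows if row.get("shape_id") == shape_id]
    # Cast to int so that 2 sorts before 10 even if the value was stored as text.
    return sorted(points, key=lambda row: int(row["shape_pt_sequence"]))
```

Sorting on shape_pt_sequence rather than on the record keys ("1", "10", "100", ...) avoids the text-sorting artefact regardless of how the rows were imported.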
To replicate:
Uncaught TypeError: Cannot set property 'stop_id' of undefined
at initiateSequence (routes.js:453)
at XMLHttpRequest.xhr.onload (routes.js:424)
It's because the stop_id was still left in the sequence db. The program read that and tried to load data for that stop_id and errored out.
To do: delete the stop_id from sequence db as well. Link for code location:
https://github.com/WRI-Cities/static-GTFS-manager/blob/v1.0.0/GTFSserverfunctions.py#L629
Secondary to-do: Even renaming of stop_id doesn't take care of this I believe.
https://github.com/WRI-Cities/static-GTFS-manager/blob/v1.0.0/GTFSserverfunctions.py#L722
And, resilience planning : This sort of error may happen again. The sequence db is an extra thing that is not part of the GTFS feed spec, it is created and kept by this program to help standardize the stops sequence for a route and keep a template ready for when a new trip is provisioned. It should not be allowed to fail the program. Therefore, the code reading it should quietly skip a non-matching stop_id encountered and move on to next stop in sequence. This is lower priority, so moving it to another issue.
This is related to a lot of other bugs that are coming up when creating schedules from scratch. If an argument is missing in a GET or POST request received from the webpage, then instead of erroring out, the tornado handler should assign it a default value like None or '' (empty string) and handle things gracefully.
Ref: http://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.get_argument
Scan thru trips db and list or mark routes that are missing from routes db.
Find routes having no trips defined
Detect stops mentioned in stop_times but missing from stops db.
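The three consistency checks above can be sketched as set differences. This assumes each table is available as a list of dicts keyed by the GTFS field names; the function name find_orphans and the report keys are hypothetical:

```python
def find_orphans(routes, trips, stops, stop_times):
    """Cross-check ids between tables and report the mismatches."""
    route_ids = {row["route_id"] for row in routes}
    trip_route_ids = {row["route_id"] for row in trips}
    stop_ids = {row["stop_id"] for row in stops}
    used_stop_ids = {row["stop_id"] for row in stop_times}
    return {
        # Routes referenced by trips but absent from the routes table.
        "routes_missing_from_routes_db": sorted(trip_route_ids - route_ids),
        # Routes that exist but have no trips defined.
        "routes_without_trips": sorted(route_ids - trip_route_ids),
        # Stops referenced by stop_times but absent from the stops table.
        "stops_missing_from_stops_db": sorted(used_stop_ids - stop_ids),
    }
```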
The "Add" button actions next to the stop choosers were still taking their values from the older autocomplete inputs (stop2add-0, stop2add-1). Update them to take the values of the newer stop selectors (stopChooser0, stopChooser1).
In Stops page map, do the same divIcon technique used in Routes page maps.
And show the first letter of each stop or something.
If a route is only one-directional (like circular) then need to handle that. Give the user a way to specify that when deciding sequence.
For stops, routes, trips, fares, timings etc sections, give user an option to add entries in bulk by uploading a CSV file, or copy-pasting tabular data (tab-separated values) from an excel. At present, apart from full GTFS feed upload, the user has to manually add each entry. But what if they already have the data arranged in a table on their end and can name their headers to match ours?
Plus: we can additionally give them an option to download the presently loaded data in the tabulator table as CSV.
Concern : this feature will need to run diagnostics and validate the bulk data. It will also need to check for unique entries or fields.
In cases where part of the bulk data is fresh entries to be added and part seems to be an edit of existing ones, users will have to be prompted with the statistics and before/after data, and will have to give consent for the changed entries. Either that, or we give adequate disclaimers that existing data will be overwritten if the key fields are the same.
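A rough sketch of how uploaded rows could be split into fresh entries and edits before prompting the user. It assumes rows are dicts and that one field (such as stop_id) acts as the unique key. The function name classify_bulk_rows and the rule of rejecting blank or repeated keys are assumptions for illustration, not decided behaviour:

```python
def classify_bulk_rows(existing_rows, uploaded_rows, key):
    """Sort uploaded rows into new, changed, unchanged and rejected."""
    existing = {row[key]: row for row in existing_rows}
    report = {"new": [], "changed": [], "unchanged": [], "rejected": []}
    seen = set()
    for row in uploaded_rows:
        value = row.get(key)
        # Assumption: a blank key, or a key repeated within the upload, is rejected.
        if not value or value in seen:
            report["rejected"].append(row)
            continue
        seen.add(value)
        if value not in existing:
            report["new"].append(row)
        elif row != existing[value]:
            # Keep before/after so the user can review and consent to the edit.
            report["changed"].append({"before": existing[value], "after": row})
        else:
            report["unchanged"].append(row)
    return report
```

The lengths of the four lists give the statistics to show in the consent prompt.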
Rejection: Rejection of bulk added entries could be on grounds like:
All .fitBounds(...) commands need to have an options argument: {padding:[20,20], maxZoom:14}
Like:
map.fitBounds(sequenceLayer[0].getBounds(), {padding:[20,20], maxZoom:14});
Misc section : agency and calendar entries have other linkages, can't directly delete.
Need to disable the delete columns in the Agency and Calendar tabulator tables on front-end and refer users to the Maintenance section.
Calendar service is already provisioned in maintenance section; agency_id deletion and renaming need to be provisioned. This will involve re-coding in both frontend JS and backend.
Operative code is around here: https://github.com/WRI-Cities/static-GTFS-manager/blob/v1.0.0/GTFSserverfunctions.py#L44
It calls the function csvwriter which
Current limitation: To know what columns to create in the CSV, the csvwriter function only reads the first row in the table array. If there are more fields further down the data, they will not make it to the exported feed.
Proposed solution: I haven't confirmed yet, but I believe a Pandas dataframe would handle this better: it would create columns for any and all keys it encounters throughout the array (a list of dicts, to be precise). I have already changed the GTFS import mechanism over to Pandas for this same reason. It reads numbers and stores them in the db as numbers, whereas csvreader was casting everything as text, and that makes it straightforward to run numerical comparisons etc on the data.
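For comparison, here is a standard-library sketch that collects the union of keys across all rows before writing. This is not the Pandas approach proposed above, just an illustration of the same idea with csv.DictWriter; the function name rows_to_csv is hypothetical:

```python
import csv
import io


def rows_to_csv(rows):
    """Write a list of dicts to CSV text, with a column for every key seen."""
    # Collect keys from all rows, not just the first, keeping first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills in the cell when a row lacks one of the columns.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the same list of dicts to pandas.DataFrame(...) likewise yields one column per key found anywhere in the list, with missing cells left empty.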
Note: the links point to v1.0.0, though further development will happen in the master branch. That way these links (which point to line numbers in the file) don't break when the code eventually changes.
Sequences management: Load other stops greyed out or so on the map, user can click them to add to sequence. Desired UX: Compose a route by clicking stops on the map instead of searching by id/name.
schedules.html : On choosing "No Selection" option in the Route picker, backend gets an API call. Logs:
trips has 0 rows for route_id = No Selection
/API/trips GET call took 0.37 seconds.
At JS side itself if the value is "No Selection" then it should blank out the table and exit without making a GET request.
Routes : colorpicker for choosing colors of the route.
https://developers.google.com/transit/gtfs/reference/#frequenciestxt
If we want this tool to be used by bus agencies etc then we need to provision the frequencies.txt feature.
Suppose a certain 'trunk' route plies every 20 minutes, from 6am in the morning to 11.30pm at night. Both ways. For provisioning this route, we would need to make this many new entries in trips.txt (Schedules > Trips):
6am to 11.30pm => 0600 to 2330 =>1730 =>17.5 hrs.
20 mins => 3 times an hour.
trips = 3 x 17.5 = 52.5, so roughly 52 trips.
Both directions => 104 trips.
That's not all. Suppose there are 30 stops along this route. Then, number of entries to be made in stop_times.txt (Schedules > Timings) : 30 x 104 = 3120.
And that's weekdays. If for the weekend service there are say 80 trips in a day, then another 2400 entries to be made for weekend.
And for all those entries we need to calculate what time each trip will begin, and interpolate the arrival/departure times for every stop. So, 5520 calculations to be done in total.
So one route running every 20 minutes needs total:
1 entry in routes.txt
184 entries in trips.txt
5520 entries in stop_times.txt, all with different values of timings.
One-time exercise? Ok then, suppose at some point the route is bumped up to plying every 10 minutes during just the morning and evening rush hours.
Transit agency boss : "What's the trouble? We're just changing the frequency for some time period."
Person in charge of updating GTFS feed: "#W$#@$#%#!$@!#@!##"
Now you may say "so what, since it's a repeating pattern let's automate it". But if it's a repeating pattern, why can't the app reading and interpreting the GTFS feed automate it? The GTFS feed itself should be restricted to carrying information that cannot be auto-generated.
For this reason, frequencies.txt is introduced in the static GTFS standard.
Let's take the same 20-min frequency route again. Instead of several trips, only two trips (one per direction) are entered in trips.txt. And then, in frequencies.txt, make two entries for the two directions:
trip_id | start_time | end_time | headway_secs | exact_times |
---|---|---|---|---|
R1_0 | 06:00:00 | 23:30:00 | 1200 | 0 |
R1_1 | 06:00:00 | 23:30:00 | 1200 | 0 |
Then, in stop_times.txt, just one run along each direction is recorded (30 stops x 2 directions = 60 entries total) with the starting stop's arrival time set to 00:00:00 and subsequent times entered as an offset from that.
So now, the route running every 20 minutes needs total:
1 entry in routes.txt
4 entries in trips.txt (considering 2 for weekdays + 2 for weekend service)
4 entries in frequencies.txt
60 entries in stop_times.txt
Looks more manageable, right?
Then, suppose the transit agency decides to double the frequency during rush hour, say 8 to 10am and 6 to 8 pm. The only edit needed now is in frequencies.txt. The day is now split into 5 time periods :
trip_id | start_time | end_time | headway_secs | exact_times |
---|---|---|---|---|
R1_0 | 06:00:00 | 08:00:00 | 1200 | 0 |
R1_0 | 08:00:00 | 10:00:00 | 600 | 0 |
R1_0 | 10:00:00 | 18:00:00 | 1200 | 0 |
R1_0 | 18:00:00 | 20:00:00 | 600 | 0 |
R1_0 | 20:00:00 | 23:30:00 | 1200 | 0 |
... and similarly for the return journey.
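To illustrate how an app reading the feed could regenerate the individual trips, here is a rough sketch that expands one frequencies.txt row into departure times and shifts the offset-based stop_times onto a departure. The function names are hypothetical, and treating end_time as exclusive is an assumption of this sketch:

```python
def to_seconds(hhmmss):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hhmmss(total_seconds):
    """Convert seconds since midnight back to 'HH:MM:SS'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, seconds)


def expand_frequency(start_time, end_time, headway_secs):
    """List the departures implied by one frequencies.txt row.

    Assumption: a departure falling exactly on end_time is not generated.
    """
    start, end = to_seconds(start_time), to_seconds(end_time)
    return [to_hhmmss(t) for t in range(start, end, headway_secs)]


def shift_stop_times(offsets, departure):
    """Turn offsets from 00:00:00 into clock times for one departure."""
    base = to_seconds(departure)
    return [to_hhmmss(base + to_seconds(offset)) for offset in offsets]
```

For the morning rush row above, expand_frequency("08:00:00", "10:00:00", 600) yields twelve departures, from 08:00:00 through 09:50:00.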
Hence, using a frequencies feature can greatly help for transit agencies who have some or all routes running on frequencies anyways. Plus, the size of the GTFS feed would be greatly reduced and so the program works faster.
Ref #6
This sort of error (a stop_id in the sequence db is not found in the main db) may happen again through some other way. The sequence db is an extra thing that is not part of the GTFS feed spec; it is created and kept by this program to help standardize the stops sequence for a route and keep a template ready for when a new trip is provisioned. It should not be allowed to fail the program. Therefore, the code reading it should quietly skip any non-matching stop_id it encounters and move on to the next stop in the sequence. This is lower priority, so moving it to another issue.
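A minimal sketch of the skip-and-continue behaviour, assuming the stops are available as a dict keyed by stop_id. The function name resolve_sequence is hypothetical:

```python
def resolve_sequence(sequence_stop_ids, stops_by_id):
    """Look up each stop in a saved sequence, skipping ids that no longer exist.

    Returns (resolved_stops, skipped_ids) so the caller can still report
    what was dropped.
    """
    resolved, skipped = [], []
    for stop_id in sequence_stop_ids:
        stop = stops_by_id.get(stop_id)
        if stop is None:
            # Stale id left behind in the sequence db: skip it, do not fail.
            skipped.append(stop_id)
            continue
        resolved.append(stop)
    return resolved, skipped
```

Returning the skipped ids alongside the result keeps the failure quiet for the user while still leaving something to log.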
As of v1.3.0 the fare rules tab shows a pivoted table. This can be easier for editing fares, but we cannot add a new fare rule here if, for example, a new zone_id is configured.
Way forward: Create a new tab "Fare Rules - Simple". Here, load a simple linear tabulator table that shows the fare rules as they are in the GTFS spec: https://developers.google.com/transit/gtfs/reference/#fare_rulestxt
This requires provisioning a new tabulator table and related actions on the JS side, a new API call to the backend, and a new API handler endpoint on the backend side that simply reads the full fare_rules table and returns it as JSON.
Schedules: Time picker for timings entry or edit.
Routes Sequence : tell if default sequence is already saved or not
Also tell user that with the default sequence saved they can create new trips in Schedules section.
This may involve some Python side tweaks also as AFAIK currently the JS API call function has no way of telling if this is a saved sequence or auto-generated one.
Would be similar to how in Schedules > Timings, for a chosen trip, user is told if the timings data was pre-stored or has been auto-generated.
Slip in a note informing the user that saving a default sequence is needed to be able to provision new trips under the route.
See /xml2GTFS.html
Stations section. Tabulator is loading the data via ajax GET request by itself. The advantage : less coding needed in /js/xml2GTFS.js
It also has on-load functions to trigger things after the data has loaded or if it fails to load. See http://tabulator.info/docs/3.4#callbacks-ajax
We can do this on all the pages where a tabulator table has to be populated on page load.
When a new route is loaded, or a sequence saved, or a new shape uploaded, repopulate the dropdown options in the shape pickers on the onward and return sequence maps.
Top? Bottom? Under the title? Make up your mind!!
Misc > Maintenance section : include fare_ids also
If there are no matching records in a table to replace, then skip that table-key pair and move to next.
Currently it is erroring out if there are no records:
[{'table': 'calendar', 'key': 'service_id'}, {'table': 'trips', 'key': 'service_id'}]
WK
ALL
ERROR:tornado.application:Uncaught exception POST /API/replaceID?pw=kmrl&valueFrom=WK&valueTo=ALL (::1)
HTTPServerRequest(...)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tornado/web.py", line 1510, in _execute
result = method(*self.path_args, **self.path_kwargs)
File "web_wrapper.py", line 1120, in post
returnMessage = replaceIDfunc(valueFrom,valueTo,tableKeys)
File "<string>", line 731, in replaceIDfunc
File "/usr/local/lib/python3.5/dist-packages/tinydb/database.py", line 503, in write_back
if sorted(doc_ids)[-1] > self._last_id:
IndexError: list index out of range
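A sketch of the guard, using plain lists of dicts as a stand-in for the TinyDB tables. The function name replace_id and the tables argument are hypothetical; the real replaceIDfunc works against the database:

```python
def replace_id(tables, table_keys, value_from, value_to):
    """Rename an id across several table/key pairs, skipping empty matches.

    tables: dict mapping table name to a list of row dicts (stand-in for the db).
    table_keys: list like [{'table': 'calendar', 'key': 'service_id'}, ...].
    Returns the number of rows changed per table.
    """
    counts = {}
    for pair in table_keys:
        table, key = pair["table"], pair["key"]
        matches = [row for row in tables.get(table, []) if row.get(key) == value_from]
        if not matches:
            # Nothing to replace here: move on instead of writing back an empty list.
            counts[table] = 0
            continue
        for row in matches:
            row[key] = value_to
        counts[table] = len(matches)
    return counts
```

In the TinyDB version, the equivalent check would sit just before the write_back call that raises the IndexError in the traceback above.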
The program is currently (as of 12-Apr-2018) using TinyDB to manage its database. The database is a db.json file in the GTFS folder. (Actually there's another one too, sequence.json.)
Possible source for code: https://codepen.io/aurer/pen/jEGbA
Shapes: Give an option to draw on the page itself instead of uploading a shapefile.
Also, keep multiple formats shapefile upload : .kml, .gpx in addition to .geojson
Tie it to agency.
Example: KMRL_R001, KMRL_R002, ...
User has to pick the agency, then just click Add Route button.
User can go rename it from the Maintenance section.
Translations: Give dropdown of existing names from across the system.
Any translation not done yet or not done in the picked language should come in a dropdown.
Implies : don't allow them to translate any random string.
In Routes page > Sequence (Onward and Return)
After moving a stop up or down in the sequence table, the changed sequence wasn't being saved to the DB on pressing save. This was because the global variables sequence0 and sequence1 in routes.js were not picking up the changed rows.
Realized that the code doesn't need global variables to begin with. Data can be retrieved from the tabulator tables at any time. So, changing the other functions in js/routes.js to not use the global sequence variables and instead work directly with the tabulator tables.
Misc: Maintenance : updated dropdowns
https://harvesthq.github.io/chosen/options.html#triggerable-events
Use the chosen:updated trigger.
Bug: After deleting, the underlying HTML of the dropdowns is updated, but the chosen.js plugin acting on them is not, which leads to misleading selections (you choose option x, but actually another option is selected).
For brand new routes, saving sequence API call is crashing because of no shapes. Shape the code (pun) such that it is able to gracefully handle it in case there aren't any shapes allotted to the new route's directions.
Current fields list for trips table: route_id,service_id,trip_id,trip_headsign,direction_id,block_id,shape_id,wheelchair_accessible
Of these, when we create a new trip in the Schedules page, only route_id and trip_id are populated. Also, at present they are filled by text input, whereas some of these should be fixed values depending on other tables (like service_id). It can cause trouble if saved without populating properly.
Maintenance: Shape delete : zap from sequence DB also
To do this, the key shape0 / shape1 needs to be popped from the record. It won't be enough to set it to a blank string.
How a sequence is saved in DB:
{
"1": {
"0": [
"ALVA",
...
"MACE"
],
"1": [
"MACE",
...
"ALVA"
],
"route_id": "R001",
"shape0": "R001_0",
"shape1": "R001_1"
}
}
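A minimal sketch of popping the key, assuming the sequence db has been loaded into a dict shaped like the record above. The function name remove_shape_from_sequences is hypothetical:

```python
def remove_shape_from_sequences(sequence_records, shape_id):
    """Pop shape0/shape1 from every sequence record that points at shape_id.

    sequence_records: dict of record id -> record, as stored in the sequence db.
    Returns the number of keys removed.
    """
    removed = 0
    for record in sequence_records.values():
        for key in ("shape0", "shape1"):
            if record.get(key) == shape_id:
                # Remove the key entirely rather than blanking its value.
                record.pop(key)
                removed += 1
    return removed
```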
Present workflow:
Therefore, task at hand:
Things that will be needed on user interface end:
While managing of existing data is working well, lot of errors encountered when we want to create new data like routes etc. Also, Misc > Maintenance section all-IDs API call errors out when the DB has been reset to blank slate.
Wherever they are used, make them visible on the page only when the table is edited. Make them disappear when changes are saved / committed to DB (big green button).
For both deleting and zapping, skip if there are no existing records in that table for that key and value pair.
link for possible CSS: https://www.bootply.com/112999
See: https://jqueryui.com/dialog/#default
http://api.jqueryui.com/dialog/
The same technique may also be applied for creating a new trip etc.
Fares: Fare Rules : Presently loading in alphabetical order of stations. Explore if possible to load in a sequence, and decide which sequence.
Idea: Have an expandable (accordion) section for filtering down the stops. There, show a routes listing table. User selects routes (can select multiple routes) and presses a filter button. That restricts the fare rules table to only the stops that are covered by those routes.
Why multiple routes, why not a single route: For interchanges of course!
-> Multiple routes select : can achieve through chosen.js