ropensci-archive / gtfsr
:warning: ARCHIVED :warning: Package for obtaining, validating, viewing, and storing GTFS (transit) data
Apparently I cannot create a description for the package, @eamcvey! Would be useful so folks could know what the package does.
Hello,
I'm new to R. I tried to install this package and got this message: "package ‘gtfsr’ is not available (for R version 3.4.1)".
Looking forward to working with this package.
I've got a much more nascent gtfs package where I've been focusing on getting the realtime feeds into R (and integrating with data.table).
https://github.com/SymbolixAU/gtfsway
Is this something you think is worthwhile bringing into gtfsr?
Or a new maintainer team 😸
If you're interested, please comment in the issue.
For more info, see
Hi,
this gtfs feed
library("gtfsr")
gtfs <- import_gtfs("http://www.sardegnamobilita.it/opendata/dati_atpnu.zip")
is classified as ...failed. Invalid data structure.
This is because the file calendar.txt is missing. This is the relevant line:
https://github.com/ropenscilabs/gtfsr/blob/master/R/validate-gtfs-structure.R#L19
But according to the reference, you can omit calendar.txt if calendar_dates.txt includes all dates of service, see https://developers.google.com/transit/gtfs/reference/calendar_dates-file
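A minimal sketch of a structure check that accepts either calendar file. The function names and the list of required files below are assumptions made for illustration; this is not the code gtfsr uses in validate-gtfs-structure.R.

```r
# Hypothetical helpers, not part of gtfsr.
# Files treated as always required in this sketch.
required_files <- c("agency.txt", "stops.txt", "routes.txt",
                    "trips.txt", "stop_times.txt")

# TRUE if the feed describes service dates through calendar.txt,
# calendar_dates.txt, or both.
has_service_calendar <- function(feed_files) {
  any(c("calendar.txt", "calendar_dates.txt") %in% basename(feed_files))
}

# Returns which required files are missing and whether the feed
# passes the structure check.
validate_feed_files <- function(feed_files) {
  feed_files <- basename(feed_files)
  missing_required <- setdiff(required_files, feed_files)
  has_calendar <- has_service_calendar(feed_files)
  list(
    missing_required = missing_required,
    has_calendar = has_calendar,
    valid = length(missing_required) == 0 && has_calendar
  )
}
```

With a check like this, a feed that ships only calendar_dates.txt would pass, while a feed with neither calendar file would still be rejected.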
I'm running into trouble with several of the mapping functions since they depend on the presence of agency_id, which is an optional field in agency.txt and routes.txt. So even though structure, files, and vars may validate, they can't necessarily be mapped.
In Chicago, for example, CTA doesn't provide agency_id in its feed, so map_gtfs_route_shape(), map_gtfs_route_stops(), and map_gtfs_agency_routes() all fail.
A workaround is to add agency_id columns and fill them with dummy values (has to be the same for both agency_df and routes_df):
gtfs_objs$agency_df$agency_id <- "CTA"
gtfs_objs$routes_df$agency_id <- "CTA"
(This gets complicated if you have more than a couple of elements in a list)
As a fix, maybe as part of the validation, just fill agency_id with some derived value if it's not present?
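Something along these lines could do the filling automatically. It is only a sketch: fill_agency_id() and its default_id argument are hypothetical names, and it assumes the feed object is a list holding agency_df and routes_df data frames as in the workaround above.

```r
# Hypothetical helper, not part of gtfsr.
# Adds agency_id to agency_df and routes_df when the column is absent.
fill_agency_id <- function(gtfs_obj, default_id = "agency_1") {
  agency <- gtfs_obj$agency_df
  routes <- gtfs_obj$routes_df

  if (!"agency_id" %in% names(agency)) {
    agency$agency_id <- rep(default_id, nrow(agency))
  }

  if (!"agency_id" %in% names(routes)) {
    # Copying the agency's id onto every route is only safe when the
    # feed has exactly one agency.
    stopifnot(nrow(agency) == 1)
    routes$agency_id <- rep(agency$agency_id[1], nrow(routes))
  }

  gtfs_obj$agency_df <- agency
  gtfs_obj$routes_df <- routes
  gtfs_obj
}
```

Applied over a list of feeds with lapply(), this would avoid editing each element by hand.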
👋 @dantonnoriega et al! Is this package still maintained?
I see errors https://github.com/r-universe/ropensci/runs/6549341600?check_suite_focus=true via https://ropensci.r-universe.dev/ui#builds=
Make each stop/route its own overlay. This allows the user to enable/disable routes.
I know this package is intended for validation but how about a function to export stops and routes as sp objects (SpatialPointsDataFrame and SpatialLinesDataFrame respectively).
Do you feel this fits within the scope of the package?
When loading gtfs files I get a warning,
`summarise_each()` is deprecated.
Use `summarise_all()`, `summarise_at()` or `summarise_if()` instead.
To map `funs` over all variables, use `summarise_all()`
The summarise_each() method was deprecated in dplyr 0.6/0.7
When running devtools::install_github('ropensci/gtfsr', build_vignettes = TRUE), I get the following error.
Quitting from lines 59-64 (gtfsr-vignette.Rmd)
Error: processing vignette 'gtfsr-vignette.Rmd' failed with diagnostics:
API Key not found. Please set your API key using function 'set_api_key()'.
Execution halted
Installation failed: Command failed (1)
It seems obvious that I need to set my API key. However, this error shouldn't pop up during installation, because the set_api_key function is part of the package and so is not available before the package is installed.
After installing the package it's also not really clear how to get the vignettes.
Is there any in-depth documentation for this package? It has a lot of potential, but it really lacks some extra info.
I just thought it might be neat to look across previous versions and summarize the changes somehow. Not sure what would be best though.
Hey guys,
I think the Auckland GTFS feed fails when importing the non-conforming GTFS file called stops_info.txt. Here is the reproducible code:
akl_url <- get_feedlist() %>%
filter(grepl("Auckland Transport GTFS", t)) %>%
pull(url_d)
akl_gtfs <- import_gtfs(akl_url)
I assume it is as easy as adding in a check, and an option for ignoring irregular files e.g. files not in the google specification? I'm happy to go try doing this but I am relatively new to the package writing world. Happy to receive some guidance.
Cheers,
Phil Donovan
Suggestion: rename this verbosely named function or create a new generic plotting function called simply map_gtfs():
I think in most cases users will want to see the stops. Suggestion: make include_stops = TRUE the default option in this function and the yet-to-be-created generic map_gtfs().
To all who have contributed recently, please leave your preferred name and, optionally, email, as a comment below so your contributions can be recognized!
Hello,
import_gtfs fails if the encoding of a gtfs file is UTF-8 BOM. This is because the function uses readr::read_csv with the default encoding in locale set to UTF-8.
Example:
# source http://www.sardegnamobilita.it/opengovernment/opendata/
# the encoding for `stop_times.txt` is `UTF-8 BOM`
r <- import_gtfs("http://www.covimo.de/gtfs/dati_atpss.zip")
Not sure what a solution could be - maybe make locale an argument in the function call? Most of the time all files of a gtfs dataset are of one type (BOM or not BOM).
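One possible direction is to detect the byte order mark per file rather than per dataset. The sketch below swaps in base R's read.csv instead of readr::read_csv, and both function names are hypothetical. It relies on the "UTF-8-BOM" encoding name that base R connections accept for reading.

```r
# Hypothetical helpers, not part of gtfsr.
# TRUE if the file starts with the UTF-8 byte order mark (EF BB BF).
has_utf8_bom <- function(path) {
  first_bytes <- readBin(path, what = "raw", n = 3L)
  identical(first_bytes, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Reads one GTFS table as character columns, dropping a BOM if present.
read_gtfs_table <- function(path) {
  enc <- if (has_utf8_bom(path)) "UTF-8-BOM" else "UTF-8"
  read.csv(path, fileEncoding = enc, colClasses = "character",
           stringsAsFactors = FALSE)
}
```

Because the check runs on each file, a dataset that mixes BOM and non-BOM files would not need a single dataset-wide locale setting.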
Thank you for the package!
Patrick
Hi,
when I run the code from the readme using this gtfs dataset (Palermo) I'm getting an error
library('gtfsr')
library("magrittr")
library("dplyr")
url <- "http://www.comune.palermo.it/gtfs/amat_feed_gtfs_v14.zip"
amat <- url %>% import_gtfs
amat_routes <- amat[['routes_df']] %>%
slice(which(grepl('606|212', route_id, ignore.case=TRUE))) %>%
'$'('route_id')
amat %>% map_gtfs(route_ids = amat_routes)
# Error in get_agency_stops(gtfs_obj, agency_name = .) :
# No trips for Route ID 'AMAT Palermo S.p.A.' were found.
This one from Rome on the other hand runs fine!
url <- "http://www.covimo.de/gtfs_roma/gtfs_roma.zip"
roma <- url %>% import_gtfs
routes <- roma[['routes_df']] %>%
slice(which(grepl('01|011', route_id, ignore.case=TRUE))) %>%
'$'('route_id')
roma %>% map_gtfs(route_ids = routes)
Cheers
Patrick
Just thought it could be helpful sometimes when it's hard to see it on the basemap.
I think if we parse the dates in calendar.txt with strptime(foo, format = "%Y%m%d") we can test if they fit the standard.
And I think if we convert the fields for the days of the week with as.logical() we can test if they are a value other than 0 or 1.
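A rough sketch of both checks, with hypothetical function names. It departs from the suggestion in two places: a regular expression first confirms the value is exactly eight digits before the date is parsed, and the day-of-week fields are compared against 0 and 1 directly, because as.logical() turns any non-zero number into TRUE and so would not flag a value such as 2.

```r
# Hypothetical helpers, not part of gtfsr.
# TRUE for values that are eight digits and parse as a real calendar date.
is_valid_gtfs_date <- function(x) {
  x <- as.character(x)
  well_formed <- grepl("^[0-9]{8}$", x)
  parsed <- as.Date(x, format = "%Y%m%d")
  well_formed & !is.na(parsed)
}

# TRUE for day-of-week values that are exactly 0 or 1.
is_valid_day_flag <- function(x) {
  as.character(x) %in% c("0", "1")
}
```

Both functions are vectorised, so they could be applied to whole columns of calendar.txt, e.g. is_valid_gtfs_date(calendar_df$start_date), where calendar_df is an assumed name for the imported table.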
Hey! I'm trying to reproduce this visualization using my own gtfs data and one of the dependencies is gtfsr. However, the install gets hung up on a dependency called 'gtsf'. Any ideas? I've tried on Windows (R version 3.5.1) and RStudio Cloud (R version 3.5.0) so far.
Sorry if I'm missing something obvious! Thanks for reading
> devtools::install_github('ropensci/gtfsr')
Downloading GitHub repo ropensci/gtfsr@master
from URL https://api.github.com/repos/ropensci/gtfsr/zipball/master
Installing gtfsr
'/opt/R/3.5.0/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet \
CMD INSTALL '/tmp/RtmpMARBVQ/devtoolsd524ffa48f/ropensci-gtfsr-05111e1' \
--library='/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.5' --install-tests
ERROR: dependency ‘gtsf’ is not available for package ‘gtfsr’
* removing ‘/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.5/gtfsr’
Installation failed: Command failed (1)
Hi Danton.
I would like to share this snippet of code that is super fast to calculate the distance of each shape_id. The idea is to use data.table to make really quick operations.
library(data.table)
library(geosphere)
# read shapes.txt
shapes_df <- fread("shapes.txt")
# convert lat long columns to numeric
shapes_df[, shape_pt_lon := as.numeric(shape_pt_lon) ][, shape_pt_lat := as.numeric(shape_pt_lat) ]
# Pair subsequent points for each shape
shapes_df[, `:=`(next_shape_pt_sequence = shift(shape_pt_sequence, type = "lead"),
next_lat = shift(shape_pt_lat, type = "lead"),
next_lon = shift(shape_pt_lon, type = "lead")), by = .(shape_id)]
# Calculate distance between each point in the shape
shapes_df[ , shape_dist_traveled := distGeo(matrix(c(shape_pt_lon, shape_pt_lat), ncol = 2),
matrix(c(next_lon, next_lat), ncol = 2))/1000]
# sum total distance of each shape_id
shapes_df[ , .(dist_shape= sum(shape_dist_traveled, na.rm=T)), by=shape_id]
#> shape_id dist_shape
#> 1: 29647112 39.476308
#> 2: 17391142 7.941542
#> 3: 17614235 28.088435
#> 4: 29949632 14.276429
#> 5: 17576631 7.025251
What I do next is add this info to the trips.txt file. I also do a similar operation on the stop_times.txt file to estimate the travel time between stops and the travel time for each trip. The idea in the end is to combine these results to get a diagnostic of each gtfs feed with the summary statistics (min, mean, max) of distance and speed for each trip and possibly for each trip segment between pairs of stops.
I still haven't found time to work on my scripts and contribute a new function to the package that produces this summary. In the meantime, I hope this snippet will be useful for other purposes as well.
Both a question and a prompt: Any concrete plans? Schedule?
I'll be happy to help any way I can. Maybe start a git project, add some issues to it, and ask folk to jump in and help? I would like to use gtfsr in an extension to my dodgr package, but would need it on CRAN first.
I don't have permission to share the zipped gtfs in question but on opening it with read_gtfs it deletes the zip from my file system. I'm using ubuntu 14.04.
Neither the transitfeed.com nor the openmobilitydata.com sign-in works, so it is impossible to obtain an API key from them.
Do you know of any other way to get an API key? Do they still support it?
would this be more maintainable if we separate the validation from the table joins and mapping?
it seems like things are sort of blended together.
is that out of necessity?
or could, for example, a check_service_ids function be called from within a mapping function?
Currently the vignette isn't working for me because GO Durham is directed to a web page that isn't a zip file in transit feeds.
I think this should be handled in import_gtfs: look for a .zip extension first and, if it is not there, throw a warning, as a general way to handle these cases.
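A small sketch of what such a check might look like. Both function names are hypothetical, and testing the extension is only a heuristic, since a URL without .zip can still serve a zip file.

```r
# Hypothetical helpers, not part of gtfsr.
# TRUE if the URL path (ignoring any query string or fragment) ends in .zip.
looks_like_zip_url <- function(url) {
  path <- sub("[?#].*$", "", url)
  grepl("\\.zip$", path, ignore.case = TRUE)
}

# Warns when the URL does not look like a zip file; returns the result
# of the check invisibly so callers can decide whether to continue.
check_zip_url <- function(url) {
  ok <- looks_like_zip_url(url)
  if (!ok) {
    warning("URL does not end in .zip and may point to a web page: ", url,
            call. = FALSE)
  }
  invisible(ok)
}
```

A warning rather than an error keeps the import going for feeds served from extensionless URLs.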
I don't see the useR2016 talk available. Did you not want to share it?
The following are new additions to the trillium gtfs data.
calendar_attributes
directions
fare_rider_categories
farezone_attributes
rider_categories
shapes
There appear to be new fields created by trillium as well, e.g.
stops.txt - platform_code
routes.txt - route_sort_order and min_headway_minutes
Need to update meta data and validation.
Suggestion: explore options for converting gtfs data classes into spatial objects such as SpatialLinesDataFrame.
Note: stplanr has a basic function for this: gtfs2sldf().
Hi ,
I am researching NYC's subway system and came across the "gtfsr" package. It is a wonderful package; however, it wasn't able to render the shapes of many of NYC's subway lines (E, 1, C, etc.). The error that I received is:
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
missing values in 'row.names' are not allowed
I suspect there might be something incorrect in the GTFS file. Do you have any recommendations to bypass these problems? Also, is the package capable of handling realtime data feed?
Below is the reproducible code that I used:
feedlist_df <- get_feedlist() %>%
filter(grepl('NYC Subway GTFS',t, ignore.case= TRUE))
NYC <- import_gtfs(feedlist_df$url_d)
#get line A
route <- NYC[['routes_df']] %>%
slice(which(grepl('a', route_short_name, ignore.case=TRUE)))
map_gtfs_route_shapes(gtfs_obj = NYC,
route_id = route$route_id,
include_stops = FALSE)
i had to do this before import_gtfs() would work today (windows 10 R 3.3)
install.packages("pkgconfig")
install.packages("glue")
install.packages("bindrcpp")
The first line of unzip_gtfs is
ex_dir <- strsplit(path, '/')[[1]][1]
Assuming path is the filepath to gtfs_zip.zip this will assign the root directory of the user or operating system.
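To illustrate the problem and one possible alternative: the sketch below shows what the first path component is for an absolute Unix path, then derives an extraction directory next to the zip file instead. extraction_dir() is a hypothetical name and the example path is made up.

```r
# Example absolute path (made up for illustration).
path <- "/home/user/data/gtfs_zip.zip"

# The first element after splitting on "/" is the empty string that
# precedes the leading slash, not a usable directory.
first_component <- strsplit(path, "/")[[1]][1]

# Hypothetical alternative: a folder beside the zip, named after it.
extraction_dir <- function(path) {
  file.path(dirname(path), tools::file_path_sans_ext(basename(path)))
}

ex_dir <- extraction_dir(path)
```

Here ex_dir points at a gtfs_zip folder inside the same directory as the zip file, which keeps the extracted tables away from the root of the file system.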
Suggestion from robinlovelace: recommend setting the key in .Renviron, as documented in the httr vignette, e.g. with the line GTFS_API_KEY=XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX. Then it could be automatically retrieved with each new session by adding something like the following lines to get_api_key():
if(grepl("[[:alnum:]]{8}\\-[[:alnum:]]{4}\\-[[:alnum:]]{4}\\-[[:alnum:]]{4}\\-[[:alnum:]]{12}", Sys.getenv("GTFS_API_KEY")))
gtfs_api_key$set(Sys.getenv("GTFS_API_KEY"))
We use a similar technique in stplanr with cyclestreet_pat, which we should probably generalise to other API keys...
Suggestion: make simply loading and plotting GTFS data appear earlier and more prominently in the README.
Make the README shorter and self-standing, make the vignette longer with more links to existing software and documentation for understanding and working with GTFS data.
Please add a package level manual file so that when users do ?gtfsr and ?gtfsr-package they get a high level manual file that explains the package, etc.
hi @dantonnoriega, @mpadge, and @mdsumner,
i'd be grateful if you could give me some feedback on an attempt i made to refactor this package so that it can keep on chugging along.
i broke things out by function/dependency:
i think gtfsr can still live here and will require a lot less code in it. let me know what you think, if you have the time.
i think this may offer a few simple solutions to #45 and #48
i also hope having these component pieces will make issues like #29 easier for people to contribute to.
thanks for taking a look.