Comments (11)
Right, so @Robinlovelace is raising this and I am glad he is, because the problem is not limited to 2019. Here is what I have found so far, although I cannot give you a reprex just yet:
           2015 2016 2017 2018 2019
caNotINac    37   11   25   76   42
veNotINac    53   12   36   96   54
acNotINca     0    0    0    0    0
acNotINve     0    0    0    0    0
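For anyone wanting to build a table like this, here is a minimal base R sketch. It uses toy data frames rather than real stats19 downloads, and `count_not_in()` is a hypothetical helper, not a function from trafficalmr or stats19. It assumes each table carries `accident_index` and `year` columns and that rows, not unique indices, are being counted.

```r
# Toy stand-ins for the accidents (ac), casualties (ca) and vehicles (ve) tables.
# Assumption: each has accident_index and year columns.
ac = data.frame(
  accident_index = c("2018A1", "2018A2", "2019A1"),
  year = c(2018, 2018, 2019)
)
ca = data.frame(
  accident_index = c("2018A1", "2018A2", "2018A9", "2019A1", "2019A8", "2019A8"),
  year = c(2018, 2018, 2018, 2019, 2019, 2019)
)
ve = data.frame(
  accident_index = c("2018A1", "2018A2", "2019A1", "2019A7"),
  year = c(2018, 2018, 2019, 2019)
)

# Hypothetical helper: per year, count the rows of x whose
# accident_index does not appear anywhere in y.
count_not_in = function(x, y) {
  missing = !x$accident_index %in% y$accident_index
  sapply(split(missing, x$year), sum)
}

mismatches = rbind(
  caNotINac = count_not_in(ca, ac),
  veNotINac = count_not_in(ve, ac),
  acNotINca = count_not_in(ac, ca),
  acNotINve = count_not_in(ac, ve)
)
mismatches
```

Rows are counted here because the 2019 column above (42 and 54) agrees with the number of row positions returned by `which()` in the reprex further down the thread.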
from trafficalmr.
This might be the next best ticket for me.
Fantastic you're up for looking at it. I think it would be great if the function, or family of tc_join*() functions, can output data that is:
- At the crash level
- At the casualty level
- At the vehicle level
- Other???
A question is whether to make them arguments in one main function or separate functions. From a usability perspective I would err towards a separate function for each, e.g.
tc_join_stats19_ac()
tc_join_stats19_ca()
tc_join_stats19_ve()
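To make the separate-function idea concrete, here is a rough sketch of what two of these could look like. The function names come from the list above, but the bodies, the arguments and the choice of base R `merge()` are assumptions for illustration only, not the package's implementation. The toy columns are likewise illustrative.

```r
# Toy stand-ins for the stats19 tables (illustrative columns only).
ac = data.frame(accident_index = c("A1", "A2"), speed_limit = c(30, 60))
ca = data.frame(
  accident_index = c("A1", "A1", "A2"),
  casualty_severity = c("Slight", "Serious", "Slight")
)
ve = data.frame(
  accident_index = c("A1", "A2", "A2"),
  vehicle_type = c("Car", "Pedal cycle", "Car")
)

# Hypothetical casualty-level join: one row per casualty,
# with the crash attributes attached.
tc_join_stats19_ca = function(ac, ca) {
  merge(ca, ac, by = "accident_index", all.x = TRUE)
}

# Hypothetical vehicle-level join: one row per vehicle,
# with the crash attributes attached.
tc_join_stats19_ve = function(ac, ve) {
  merge(ve, ac, by = "accident_index", all.x = TRUE)
}

ca_joined = tc_join_stats19_ca(ac, ca)
ve_joined = tc_join_stats19_ve(ac, ve)
```

Using `all.x = TRUE` keeps the row count of the casualty or vehicle table unchanged, so the level of the output stays unambiguous.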
Just trying to understand this better. I see that the point of the work in tc-join as it stands is to generate a df where we can see which vehicles were involved in which crash. I suppose, @Robinlovelace would then want to see UpSet plots for casualty types and ? What I mean is there would be no tc_join_stats19_ac because that is the basis for the other two and indeed, accident_index and year were always our keys. Correct?
> I suppose, @Robinlovelace would then want to see UpSet plots for casualty types and ?
Yes it would be good to see them for number of casualties, number of vehicles and number of crash records, and there could be different combinations (e.g. Y axis being number of casualties and X axis being vehicle type) perhaps. The outputs of tc_join() functions could be useful for a range of different things, not just upset plots. One approach would be to use the dm package, but that would introduce more overheads, so I suggest we don't use it for now. It is good to be aware of alternative approaches that could be useful later: https://github.com/krlmlr/dm
> there would be no tc_join_stats19_ac because that is the basis for the other two and indeed, accident_index and year were always our keys. Correct?
I think tc_join_stats19_ac() could be useful but may need aggregating functions, e.g. to count the number of cyclists, pedestrians etc. in the casualties table who are hurt per crash. @joeytalbot has done that in previous scripts I think; please share a link to code that does that if you get a chance, Joey.
Hope that makes sense...
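As a rough illustration of the aggregation step mentioned above, the sketch below counts casualties of each type per crash and attaches the counts to the crash table. It is not the code from the earlier scripts referred to; the toy data, the simplified `casualty_type` labels and the object names are all assumptions.

```r
# Toy crash and casualty tables (labels simplified for illustration).
ac = data.frame(accident_index = c("A1", "A2", "A3", "A4"))
ca = data.frame(
  accident_index = c("A1", "A1", "A2", "A3", "A3", "A3"),
  casualty_type = c("Cyclist", "Pedestrian", "Cyclist", "Car", "Cyclist", "Cyclist")
)

# Cross-tabulate crash by casualty type: one row per crash, one column per type.
counts = as.data.frame.matrix(table(ca$accident_index, ca$casualty_type))
counts$accident_index = rownames(counts)

# Attach the counts to the crash table, keeping crashes with no casualty rows.
ac_counts = merge(ac, counts, by = "accident_index", all.x = TRUE)

# Crashes without any matching casualty rows get NA from the merge; treat as 0.
ac_counts[is.na(ac_counts)] = 0
ac_counts
```

The result stays at the crash level (one row per `accident_index`), which is what a crash-level join function would need to return.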
@Robinlovelace I won't be making a pull request out of this yet. I'd like to know what would be at least one useful function from your comment above so I can implement/improve/contribute further. As it stands, I am not quite able to translate your comment into code.
No worries, you could take a look at adding some comment to this instead. Starting with the building blocks of the ac, ca and ve tables could be a good way to decide how best to write code to automate parts of the joining process. Alternatively, it's possible that this is one of those things that is best just described and not 'over functionalised', as @rogerbeecham was alluding to with respect to the upset plot code.
Here's a section in need of content (will try to make the edit button work now but the source should be easy to find): https://saferactive.github.io/rrsrr/joining-road-crash-tables.html
OK, so I was just doing some work here and, whilst I was watching, @Robinlovelace has written a good section in the rrsrr on this. I just found out that not all indices in casualties and vehicles are in the accidents table.
Is this something @mem48 is an expert in? Anyone else?
I guess the question I must ask is: what do we do with those records when joining the tables?
library(stats19)
#> Data provided under OGL v3.0. Cite the source and link to:
#> www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
ac = get_stats19(year = 2019, type = "ac", output_format = "sf")
#> Files identified: DfTRoadSafety_Accidents_2019.zip
#> http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/DfTRoadSafety_Accidents_2019.zip
#> Attempt downloading from:
#> Data saved at /tmp/Rtmpzevr4i/DfTRoadSafety_Accidents_2019/Road Safety Data - Accidents 2019.csv
#> Reading in:
#> /tmp/Rtmpzevr4i/DfTRoadSafety_Accidents_2019/Road Safety Data - Accidents 2019.csv
#> date and time columns present, creating formatted datetime column
#> 28 rows removed with no coordinates
ca = get_stats19(year = 2019, type = "ca")
#> Files identified: DfTRoadSafety_Casualties_2019.zip
#> http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/DfTRoadSafety_Casualties_2019.zip
#> Attempt downloading from:
#> Data saved at /tmp/Rtmpzevr4i/DfTRoadSafety_Casualties_2019/Road Safety Data - Casualties 2019.csv
ve = get_stats19(year = 2019, type = "ve")
#> Files identified: DfTRoadSafety_Vehicles_2019.zip
#> http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/DfTRoadSafety_Vehicles_2019.zip
#> Attempt downloading from:
#> Data saved at /tmp/Rtmpzevr4i/DfTRoadSafety_Vehicles_2019/Road Safety Data- Vehicles 2019.csv
all(ca$accident_index %in% ac$accident_index)
#> [1] FALSE
all(ve$accident_index %in% ac$accident_index)
#> [1] FALSE
which(!ca$accident_index %in% ac$accident_index)
#> [1] 32870 35672 37513 37514 42935 43816 44878 49428 49429 49610
#> [11] 49611 49612 49634 49981 50269 50270 50329 50330 50612 50694
#> [21] 50921 50929 50930 51000 51001 51039 51040 51041 51137 53661
#> [31] 53662 60791 76150 76151 117585 126082 139245 140079 143021 143022
#> [41] 143023 145523
which(!ve$accident_index %in% ac$accident_index)
#> [1] 49394 49395 53112 55581 55582 63126 64441 64442 66036 66037
#> [11] 72305 72522 72523 72544 72545 72996 72997 72998 73396 73397
#> [21] 73485 73486 73876 73877 73989 73990 74272 74283 74284 74378
#> [31] 74379 74433 74434 74560 74561 78032 78033 87939 87940 109329
#> [41] 109330 167863 179919 179920 197938 197939 197940 199070 199071 203025
#> [51] 203026 203027 206315 206316
Created on 2020-10-07 by the reprex package (v0.3.0)
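On the question of what to do with the unmatched records when joining, two obvious options are to drop them or to keep them with missing crash attributes. The sketch below shows both on toy data using base R `merge()`. The data and object names are illustrative assumptions, and which option is right depends on the analysis.

```r
# Toy tables: casualty "A9" has no matching crash record.
ac = data.frame(accident_index = c("A1", "A2"), speed_limit = c(30, 60))
ca = data.frame(
  accident_index = c("A1", "A1", "A2", "A9"),
  casualty_severity = c("Slight", "Serious", "Slight", "Slight")
)

# Set the unmatched casualty rows aside so they can be inspected or reported.
orphan = !ca$accident_index %in% ac$accident_index
ca_orphans = ca[orphan, ]

# Option 1: inner join, which silently drops the unmatched rows.
ca_inner = merge(ca, ac, by = "accident_index")

# Option 2: left join, which keeps them with NA for the crash columns.
ca_left = merge(ca, ac, by = "accident_index", all.x = TRUE)

nrow(ca_inner)
nrow(ca_left)
```

Keeping `ca_orphans` as a separate object makes it easy to send the affected indices to whoever maintains the data.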
It is interesting actually:
nrow(ca) == sum(ac$number_of_casualties) + length(which(!ca$accident_index %in% ac$accident_index))
#> TRUE
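The check above compares totals only: the casualty rows equal the sum of `number_of_casualties` plus the unmatched rows. A finer check, sketched here on toy data with assumed object names, compares the count for each crash individually, since totals can agree even when individual crashes do not.

```r
# Toy tables: "A9" appears in casualties but not in accidents.
ac = data.frame(
  accident_index = c("A1", "A2", "A3"),
  number_of_casualties = c(2, 1, 1)
)
ca = data.frame(accident_index = c("A1", "A1", "A2", "A3", "A9"))

# Aggregate check, mirroring the one-liner above.
n_orphans = sum(!ca$accident_index %in% ac$accident_index)
totals_match = nrow(ca) == sum(ac$number_of_casualties) + n_orphans

# Per-crash check: count casualty rows for each crash in ac (unmatched
# rows fall outside the factor levels and are not counted), then compare
# with the number recorded on the crash record.
ca_rows = table(factor(ca$accident_index, levels = ac$accident_index))
per_crash_match = as.integer(ca_rows) == ac$number_of_casualties

totals_match
all(per_crash_match)
```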
Very interesting @layik. I think it's worth asking the road safety stats team about this. I suspect it's an error in the data but am not sure.
Reopen if need be