Comments (11)

layik commented on September 23, 2024

Right, so @Robinlovelace is raising this and I am glad he is, because the problem is not limited to 2019. Here is what I have found, though I cannot give you a reprex just yet:

          2015 2016 2017 2018 2019
caNotINac   37   11   25   76   42
veNotINac   53   12   36   96   54
acNotINca    0    0    0    0    0
acNotINve    0    0    0    0    0
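
No reprex was given for the table above, so the following is only a minimal sketch of how such counts could be produced, not the code that generated it. It assumes the row names mean "records in the first table whose accident_index is absent from the second" (for example, caNotINac = casualties not in accidents). The toy data frames and the helper `count_unmatched()` are hypothetical.

```r
# Toy tables standing in for the real STATS19 data (all values are made up).
ac <- data.frame(accident_index = c("a1", "a2", "b1"),
                 year = c(2018, 2018, 2019))
ca <- data.frame(accident_index = c("a1", "a2", "a9", "b1", "b8", "b9"),
                 year = c(2018, 2018, 2018, 2019, 2019, 2019))
ve <- data.frame(accident_index = c("a1", "a2", "b1", "b9"),
                 year = c(2018, 2018, 2019, 2019))

years <- sort(unique(c(ac$year, ca$year, ve$year)))

# Hypothetical helper: per year, count rows of x whose accident_index
# does not appear anywhere in y.
count_unmatched <- function(x, y) {
  unmatched <- !x$accident_index %in% y$accident_index
  counts <- vapply(years, function(yr) sum(unmatched & x$year == yr), integer(1))
  setNames(counts, years)
}

# One row per comparison, one column per year, as in the table above.
res <- rbind(
  caNotINac = count_unmatched(ca, ac),
  veNotINac = count_unmatched(ve, ac),
  acNotINca = count_unmatched(ac, ca),
  acNotINve = count_unmatched(ac, ve)
)
print(res)
```

In this toy data the two "ac not in" rows are all zero, mirroring the pattern in the real counts: every crash has casualty and vehicle records, but not the other way round.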

from trafficalmr.

layik commented on September 23, 2024

This might be the next best ticket for me.

Robinlovelace commented on September 23, 2024

Fantastic you're up for looking at it. I think it would be great if the function, or family of tc_join*() functions, can output data that is:

  • At the crash level
  • At the casualty level
  • At the vehicle level
  • Other???

A question is whether to make them arguments in one main function or separate functions. From a usability perspective I would err towards a separate function for each, e.g.

  • tc_join_stats19_ac()
  • tc_join_stats19_ca()
  • tc_join_stats19_ve()
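
To make the "separate function for each" idea concrete, here is a rough sketch of what two of the proposed functions might look like. The function names come from the list above, but the bodies, arguments and toy data are assumptions; nothing here reflects an existing implementation.

```r
# Hypothetical sketches of the proposed functions. The signatures are
# assumptions and may differ from anything eventually implemented.
tc_join_stats19_ca <- function(ac, ca) {
  # Casualty level: one row per casualty, with crash columns attached.
  merge(ca, ac, by = "accident_index", all.x = TRUE)
}

tc_join_stats19_ve <- function(ac, ve) {
  # Vehicle level: one row per vehicle, with crash columns attached.
  merge(ve, ac, by = "accident_index", all.x = TRUE)
}

# Toy tables (made-up values) to exercise the sketches.
ac <- data.frame(accident_index = c("a1", "a2"),
                 accident_severity = c("Slight", "Serious"))
ca <- data.frame(accident_index = c("a1", "a1", "a2"),
                 casualty_type = c("Cyclist", "Pedestrian", "Cyclist"))
ve <- data.frame(accident_index = c("a1", "a2", "a2"),
                 vehicle_type = c("Car", "Car", "Pedal cycle"))

ca_joined <- tc_join_stats19_ca(ac, ca)
ve_joined <- tc_join_stats19_ve(ac, ve)

# Each output keeps the row count of its "level" table.
nrow(ca_joined) == nrow(ca)
nrow(ve_joined) == nrow(ve)
```

Keeping one function per output level means each function has a single, predictable row granularity, which is the usability argument made above.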

layik commented on September 23, 2024

Just trying to understand this better. I see that the point of the work in tc-join as it stands is to generate a df where we can see which vehicles were involved in which crash. I suppose, @Robinlovelace would then want to see UpSet plots for casualty types and ? What I mean is there would be no tc_join_stats19_ac because that is the basis for the other two and indeed, accident_index and year were always our keys. Correct?

Robinlovelace commented on September 23, 2024

> I suppose, @Robinlovelace would then want to see UpSet plots for casualty types and ?

Yes, it would be good to see them for number of casualties, number of vehicles and number of crash records, and there could be different combinations (e.g. Y axis being number of casualties and X axis being vehicle type) perhaps. The outputs of tc_join() functions could be useful for a range of different things, not just UpSet plots. One approach would be to use the dm package, but that would introduce more overheads, so I suggest we don't use it for now. It is still good to be aware of alternative approaches that could be useful later: https://github.com/krlmlr/dm

> there would be no tc_join_stats19_ac because that is the basis for the other two and indeed, accident_index and year were always our keys. Correct?

I think tc_join_stats19_ac() could be useful but may need aggregating functions, e.g. to count the number of cyclists, pedestrians etc. in the casualties table who are hurt per crash. I think @joeytalbot has done that in previous scripts. Joey, please share a link to code that does that if you get a chance.
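
As a rough illustration of the aggregation described above (counting casualties of each type per crash and attaching the counts to the crash table), here is a base R sketch on made-up data. It is not the code from the earlier scripts mentioned, and the casualty type labels are simplified toy values.

```r
# Toy casualty table (made-up values).
ca <- data.frame(
  accident_index = c("a1", "a1", "a2", "a3", "a3", "a3"),
  casualty_type = c("Cyclist", "Pedestrian", "Cyclist",
                    "Pedestrian", "Pedestrian", "Motorcyclist")
)

# Cross-tabulate: one row per crash, one column per casualty type.
counts <- as.data.frame.matrix(table(ca$accident_index, ca$casualty_type))
counts$accident_index <- rownames(counts)
rownames(counts) <- NULL

# Toy crash table; "a4" has no casualty rows in this example.
ac <- data.frame(accident_index = c("a1", "a2", "a3", "a4"))

# Crash-level output: keep every crash and attach the per-type counts.
ac_counts <- merge(ac, counts, by = "accident_index", all.x = TRUE)

# Crashes with no casualty rows get NA from the join; treat them as zero.
type_cols <- setdiff(names(ac_counts), "accident_index")
ac_counts[type_cols][is.na(ac_counts[type_cols])] <- 0

print(ac_counts)
```

The result stays at the crash level (one row per accident_index), which is what a crash-level join function would need to preserve.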

Hope that makes sense...

layik commented on September 23, 2024

@Robinlovelace I won't be making a pull request out of this yet. I'd like to know what would be at least one useful function from your comment above so I can implement/improve/contribute further. As it stands, I am not quite able to translate your comment into code.

Robinlovelace commented on September 23, 2024

No worries, you could take a look at adding some content to this instead. Starting with the building blocks of the ac, ca and ve tables could be a good way of deciding how best to write code to automate parts of the joining process. Alternatively, it's possible that this is one of those things that is best just described and not 'over functionalised', as @rogerbeecham was alluding to with respect to the UpSet plot code.

Here's a section in need of content (will try to make the edit button work now but the source should be easy to find): https://saferactive.github.io/rrsrr/joining-road-crash-tables.html

layik commented on September 23, 2024

OK, so I was just doing some work here and, whilst I was watching, @Robinlovelace has written a good section in rrsrr on this. I just found out that not all accident_index values in the casualties and vehicles tables are in the accidents table.

Is this something @mem48 is an expert in? Anyone else?

I guess the question I must ask is: what do we do with those records when joining the tables?

library(stats19)
#> Data provided under OGL v3.0. Cite the source and link to:
#> www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
ac = get_stats19(year = 2019, type = "ac", output_format = "sf")
#> Files identified: DfTRoadSafety_Accidents_2019.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/DfTRoadSafety_Accidents_2019.zip
#> Attempt downloading from:
#> Data saved at /tmp/Rtmpzevr4i/DfTRoadSafety_Accidents_2019/Road Safety Data - Accidents 2019.csv
#> Reading in:
#> /tmp/Rtmpzevr4i/DfTRoadSafety_Accidents_2019/Road Safety Data - Accidents 2019.csv
#> date and time columns present, creating formatted datetime column
#> 28 rows removed with no coordinates
ca = get_stats19(year = 2019, type = "ca")
#> Files identified: DfTRoadSafety_Casualties_2019.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/DfTRoadSafety_Casualties_2019.zip
#> Attempt downloading from:
#> Data saved at /tmp/Rtmpzevr4i/DfTRoadSafety_Casualties_2019/Road Safety Data - Casualties 2019.csv
ve = get_stats19(year = 2019, type = "ve")
#> Files identified: DfTRoadSafety_Vehicles_2019.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/DfTRoadSafety_Vehicles_2019.zip
#> Attempt downloading from:
#> Data saved at /tmp/Rtmpzevr4i/DfTRoadSafety_Vehicles_2019/Road Safety Data- Vehicles 2019.csv

all(ca$accident_index %in% ac$accident_index)
#> [1] FALSE
all(ve$accident_index %in% ac$accident_index)
#> [1] FALSE

which(!ca$accident_index %in% ac$accident_index)
#>  [1]  32870  35672  37513  37514  42935  43816  44878  49428  49429  49610
#> [11]  49611  49612  49634  49981  50269  50270  50329  50330  50612  50694
#> [21]  50921  50929  50930  51000  51001  51039  51040  51041  51137  53661
#> [31]  53662  60791  76150  76151 117585 126082 139245 140079 143021 143022
#> [41] 143023 145523

which(!ve$accident_index %in% ac$accident_index)
#>  [1]  49394  49395  53112  55581  55582  63126  64441  64442  66036  66037
#> [11]  72305  72522  72523  72544  72545  72996  72997  72998  73396  73397
#> [21]  73485  73486  73876  73877  73989  73990  74272  74283  74284  74378
#> [31]  74379  74433  74434  74560  74561  78032  78033  87939  87940 109329
#> [41] 109330 167863 179919 179920 197938 197939 197940 199070 199071 203025
#> [51] 203026 203027 206315 206316

Created on 2020-10-07 by the reprex package (v0.3.0)
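
The question of what to do with the unmatched records comes down to the join type. The sketch below, on made-up data, shows the two obvious options with base R `merge()` plus a way to set the unmatched rows aside; it is an illustration of the choices, not a recommendation from the thread.

```r
# Toy tables (made-up values); casualty "a9" has no crash record.
ac <- data.frame(accident_index = c("a1", "a2"),
                 speed_limit = c(30, 60))
ca <- data.frame(accident_index = c("a1", "a2", "a9"),
                 casualty_type = c("Cyclist", "Pedestrian", "Cyclist"))

# Option 1: inner join, which silently drops casualties with no crash record.
ca_inner <- merge(ca, ac, by = "accident_index")

# Option 2: left join, which keeps every casualty; crash columns are NA
# where there is no matching crash record.
ca_left <- merge(ca, ac, by = "accident_index", all.x = TRUE)

# Either way, the unmatched records can be kept separately for inspection.
ca_orphans <- ca[!ca$accident_index %in% ac$accident_index, ]

nrow(ca_inner)  # rows with a match only
nrow(ca_left)   # same as nrow(ca)
ca_orphans
```

One thing that may be worth checking: in the reprex the accidents table was read with output_format = "sf", and the log reports "28 rows removed with no coordinates". Whether those dropped rows account for the unmatched casualty and vehicle records is not established in this thread.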

layik commented on September 23, 2024

It is interesting actually:

nrow(ca) == sum(ac$number_of_casualties) + length(which(!ca$accident_index %in% ac$accident_index))
#> TRUE
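
The identity above says that every casualty row is either counted in number_of_casualties for a crash present in ac, or belongs to an accident_index missing from ac. A small worked example on made-up data shows the same check, plus a stricter per-crash comparison that is an addition here and was not run in the thread.

```r
# Toy tables (made-up values): 3 casualties are accounted for in ac,
# and 2 belong to a crash ("a9") that is absent from ac.
ac <- data.frame(accident_index = c("a1", "a2"),
                 number_of_casualties = c(2, 1))
ca <- data.frame(accident_index = c("a1", "a1", "a2", "a9", "a9"))

n_unmatched <- sum(!ca$accident_index %in% ac$accident_index)

# Overall identity: 5 == 3 + 2
totals_agree <- nrow(ca) == sum(ac$number_of_casualties) + n_unmatched

# Stricter check: the casualty rows per crash equal number_of_casualties.
# This assumes every crash in ac has at least one casualty row.
observed <- table(ca$accident_index)[ac$accident_index]
per_crash_agree <- all(as.integer(observed) == ac$number_of_casualties)

totals_agree
per_crash_agree
```

If both checks hold on the real data, the casualties table is internally consistent with the crash counts, and the discrepancy is confined to the crash records that are missing from ac.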

Robinlovelace commented on September 23, 2024

Very interesting, @layik. I think it's worth asking the road safety stats team about this. I suspect it's an error in the data but am not sure.

layik commented on September 23, 2024

Reopen if need be
