Hi @lukebornn @stevenwu4,
I just enjoyed reading your paper and the helpful reviews by @beanumber and @seanjtaylor. You've obviously already done a lot of revision in response to those reviews, leaving mercifully little for me to do here.
It would be good if the repo had a README with instructions on, e.g., how to produce the PDF. I managed to get some version of it, but only after removing some `setwd()` code (hey: check out the solutions for this problem) and changing the `results` chunk options. I never work with `.Rnw`, so perhaps this was user error, but I just took the path of least resistance in RStudio.
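To illustrate what I mean about `setwd()`, here is a minimal sketch of building paths relative to the project root. The directory name `data` and the file name `game.csv` are placeholders I made up for illustration; they are not taken from your repo.

```r
# Hypothetical layout: the .Rnw file sits at the top of the repo and the
# data live in a "data" subdirectory. Both names are assumptions.
data_dir <- "data"

# Build the path relative to the project root instead of calling setwd()
# with a location that only exists on one machine.
game_file <- file.path(data_dir, "game.csv")

if (file.exists(game_file)) {
  game <- read.csv(game_file)
} else {
  message("Expected to find ", game_file, " relative to ", getwd())
}
```

Opening the repo as an RStudio project and knitting from there should make relative paths like this resolve for anyone who clones it. The `here` package (`here::here("data", "game.csv")`) is another option if you would rather not rely on the working directory at all.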
I'll make a few more small suggestions and, once settled, we'll get you to submit this to the TAS system.
Get ready for TAS. You'll need to make sure you're typesetting with a relevant template. Our existing thread about this is here: https://github.com/dsscollection/dsscollection/issues/39.
Some of the references only show up with title and authors. Are those the blog posts? If so, I would expect them to have URL and date, at the very least.
I like the new title. If @lukebornn feels it is appropriate, I think adding his new affiliation with the Kings would provide a nice validation of the relevance and credibility of this work!
The state and distribution channel for the data and code: We can certainly leave it "as is", but you might want to consider making a fresh repo for it, although I am not opposed to simply flipping the switch to public on this one.
- For data, I suggest you consider depositing in a proper data repository rather than leaving it on GitHub only. I'm not sure whether the sports (vs science) nature of the data makes it difficult to use one of the usual ones, such as Figshare.
- I'm glad it looks like you've removed a lot of the inline code and focused more on the high-level interface of some utility functions, as per @beanumber's suggestions. Although it's not coupled to the fate of this article, I'd encourage you to seriously consider the advice to adopt a more idiomatic R style and to make this into an R package that others can use.
About a specific comment from @beanumber:
> I'll leave it to the @jennybc and @hadley to decide whether adding two derivative columns to a data frame is an operation that will be interesting to readers.
I think there is a way to make this interesting, at least to many people. I have faced a similar challenge when analyzing ultimate frisbee data. Most people have never thought about the practical problem of using recorded game data to derive higher-level, game-play-based structure, such as possessions, and of inferring who is on offense and which direction they are heading. It's clear you can do it, but I think you show, and people will appreciate, that it's not completely automatic. That's how I would frame some of that up-front work.
Overall suggestion: use present rather than future tense. So "we will address" and "we will detail" become "we address" and "we detail", everywhere.
Table 1: "z coordinates" should not be plural -- should be "z coordinate".
In R, it's more correct to say "package" than "library". I'm thinking of the references to the raster package.
I really liked the intro of @seanjtaylor's review, e.g.:
> The authors contribute a simple model of player movement, predicting position at a time period using their velocity and position at the last time period plus some unobserved acceleration term which can be measured using a model.
I think it should inspire a few more sentences around the player movement model and what the terms mean. Stuff like: "A player's position at time t + 1 is modelled as position at time t plus ...." Basically restating the equation in words and helping the reader attach meaning to each term. Especially once I realized how critical the "empirical eta's" were to become, I really wished for some more help from the experts about how I should think about them.
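To show the kind of pairing I have in mind, here is a rough sketch. The notation is mine and is only a guess at the general shape, based on @seanjtaylor's summary above; the paper's own equation and symbols should take precedence.

$$
x_{t+1} = x_t + v_t + \eta_t
$$

In words: the position at time $t + 1$ is the position at time $t$ ($x_t$), plus the displacement implied by the current velocity ($v_t$), plus an unobserved acceleration term ($\eta_t$) that captures how the player deviates from simply continuing on the same path. One or two sentences like that for each term, written against your actual equation, would go a long way.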
I see @beanumber already pushed back on some of the basketball lingo and sports vocabulary and I gather some changes have already been made. I actually like that, towards the end, the paper gets quite serious about the basketball specifics. And yet I would still like to see more of an effort to bring the wording closer to a "general public" style (at least we're not aiming for "academic journal style"!).
Overall, let me repeat that this is a great case study of what it takes to bring raw SportVU data into an analytical environment, through a statistical model, and into useful visualizations. I look forward to a few more small revisions and submission to TAS.