nflverse-data's Introduction

nflverse

Overview

The nflverse is a set of packages dedicated to data of the National Football League. The nflverse package is designed to make it easy to install and load core packages from the nflverse in a single command. Please see the nflverse organisation repo for more information about governance, the code of conduct and possible roles.

Installation

The easiest way to get nflverse is to install it from CRAN with:

install.packages("nflverse")

To get a bug fix or to use a feature from the development version, you can install the development version of nflverse either from GitHub with:

if (!require("pak")) install.packages("pak")
pak::pak("nflverse/nflverse")

or prebuilt from the development repo with:

install.packages("nflverse", repos = c("https://nflverse.r-universe.dev", getOption("repos")))

Usage

library(nflverse) will load the following nflverse packages:

  • nflfastR, for play-by-play data back to 1999.
  • nflseedR, for season simulations.
  • nfl4th, for 4th down analysis.
  • nflreadr, for fast and efficient nflverse data downloads.
  • nflplotR, for tools to visualize NFL-related analysis.

Getting help

The best places to get help on this package are:

Contributing

Many hands make light work! Here are some ways you can contribute to this project:

Terms of Use

The R code for this package is released as open source under the MIT License. NFL data accessed by this package belong to their respective owners, and are governed by their terms of use.

nflverse-data's People

Contributors

actions-user, mrcaseb, tanho63

nflverse-data's Issues

Add GSIS IDs to Player Contracts

  1. Is your feature request related to a problem? Please describe.

Within the nflverse datasets, there is no direct way to map nflverse contract information to any other dataset as is, because there is no column in that dataset that holds GSIS ID information.

  2. Describe the solution you'd like

A column named [gsis_id] added to the nflverse contract information, to make it easier to map player contract information to other parts of the nflverse project.

  3. Describe alternatives you've considered

None.

  4. Additional context

Found this problem when trying to create a local nflverse database.

[FEATURE REQ] Update headshot_url for 2024 Draft Class in NFL Roster database

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

Player profile headshot URLs are missing for the 2024 NFL draft class in the NFL Rosters database. It would also be nice if those URLs were added to the Draft database.

Describe the solution you'd like

Add new player headshot image urls to the NFL Roster database for players who were taken in the 2024 NFL draft. Also, add a new table column for headshot url in the NFL Draft database.

Describe alternatives you've considered

I tried to access the base URL for the player profile URLs (https://static.www.nfl.com/image/private/f_auto,q_auto/league/...) to see if I could contribute, but I don't have access to that server.

Additional context

No response

NFL Preseason stats

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

It's not a problem so much as a request to add more impactful data.

Describe the solution you'd like

Look for the preseason data from the same sources

Describe alternatives you've considered

Other sites don't provide data as detailed as the PBP.

Additional context

No response

pbp_status is "failing" since 2022-07-28

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

Describe the bug

PBP data appear to have been failing since the end of July. There weren't any official games until now, but since the new season began yesterday, I was hoping to see PBP data for games once they concluded.
I'm also not sure whether the PBP data are supposed to be updated continuously or in batches.
Thank you

Reprex

load_pbp(2022)

Expected Behavior

Return PBP data for concluded 2022 games

nflverse_sitrep

Sorry cannot execute right now, but I think it is not important.

screenshots

- OS:
- Node:
- npm:

Additional context

No response

[question] NFL Data Update Schedule

Hello! I'd like to use the NFL data for weekly game statistics. When is this data updated? For instance, when can I access the play-by-play for Week 1 of the 2023 season? I couldn't find this info elsewhere. Thank you very much in advance!

include scrape source for rosters in documentation

  1. Is your feature request related to a problem? Please describe.
    The source of the data which populates the individuals on the 53-man roster for a given week is not clear.

  2. Describe the solution you'd like
    Add 1-2 lines in load_rosters() or a dependency that indicates the source for this data.

  3. Describe alternatives you've considered
    A field in the dataframe that explicitly references the datasource that was used to determine the player was on the 53-man for this week and year.

  4. Additional context
    A friend asked me about the source in a sports analytics chat I belong to, and I did not have a satisfying answer for him.

add quarterly action to archive data somewhere

This repo is now a bit more fragile than usual because of lack of versioning. Archiving all data to a backup repo every few months should be a nice fallback plan - pb_list %>% upload to backup repo

[BUG] <Roster file has bad ID for Jay Cutler in 2010 and no ID in 2011>

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

na direct pull

Describe the bug

In the roster file, Jay Cutler's 2010 record uses Rashied Davis's ID. Jay Cutler does not have a record for 2011. Both Jay Cutler and Rashied Davis have inconsistent draft data (different records say they were drafted by different teams in different rounds).

Reprex

import pandas as pd

## load roster files ##
roster_url = 'https://github.com/nflverse/nflverse-data/releases/download/rosters'
rosters = []
for season in range(2006,2018):
    ## pull roster for that season ##
    temp = pd.read_csv(
        '{0}/roster_{1}.csv?raw=true'.format(
            roster_url,
            season
        ),
        low_memory=False
    )
    rosters.append(temp)

## combine rosters ##
r = pd.concat(rosters)


## can see that jay cutler has wrong ID 2010, is missing 2011, and strange draft data ##
r[
    r['full_name'] == 'Jay Cutler'
][[
    'season','team','full_name','gsis_id',
    'espn_id','pff_id','pfr_id','esb_id',
    'entry_year','draft_club','draft_number'
]]

## if you inspect the wrong ID, you see it's Rashied Davis ##
r[
    r['gsis_id'] == '00-0023429'
][[
    'season','team','full_name','gsis_id',
    'espn_id','pff_id','pfr_id','esb_id',
    'entry_year','draft_club','draft_number'
]]

## to confirm it's not an issue with concatenation of DFs, you can see the issue is at the file ##
## level ##
r2010 = pd.read_csv(
    '{0}/roster_{1}.csv?raw=true'.format(
        roster_url,
        2010
    )
)
r2010[
    r2010['full_name'] == 'Jay Cutler'
][[
    'season','team','full_name','gsis_id',
    'espn_id','pff_id','pfr_id','esb_id',
    'entry_year','draft_club','draft_number'
]]

Expected Behavior

Expected behavior is that the roster file would have consistent information for both players and not have missing seasons

nflverse_sitrep

na did in python

Screenshots

No response

Additional context

No response
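
A check along the following lines could flag both symptoms described above: an ID shared by two names, and a gap in a player's seasons. This is only a minimal sketch on hand-written rows. The ID `00-0023429` comes from the report, while `00-PLACEHOLDER`, the row layout, and both helper functions are hypothetical and not part of any nflverse package.

```python
from collections import defaultdict

# Hypothetical roster rows: (season, full_name, gsis_id).
# Only '00-0023429' comes from the report; '00-PLACEHOLDER' is made up.
rows = [
    (2009, "Jay Cutler", "00-PLACEHOLDER"),
    (2010, "Jay Cutler", "00-0023429"),
    (2010, "Rashied Davis", "00-0023429"),
    (2012, "Jay Cutler", "00-PLACEHOLDER"),
]


def ids_with_multiple_names(rows):
    """Return {gsis_id: sorted names} for IDs attached to more than one name."""
    names_by_id = defaultdict(set)
    for _season, full_name, gsis_id in rows:
        if gsis_id:  # skip missing IDs
            names_by_id[gsis_id].add(full_name)
    return {
        gid: sorted(names)
        for gid, names in names_by_id.items()
        if len(names) > 1
    }


def missing_seasons(rows, full_name):
    """Return seasons absent between a player's first and last listed season."""
    seasons = {season for season, name, _ in rows if name == full_name}
    return sorted(set(range(min(seasons), max(seasons) + 1)) - seasons)


conflicts = ids_with_multiple_names(rows)
gaps = missing_seasons(rows, "Jay Cutler")
print(conflicts)
print(gaps)
```

A season gap is not always a data error (a player can sit out a year), so the second check is best treated as a list of candidates to inspect by hand.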

Incorrect old_game_id in participation

Hi!

When looking at the load_participation data, it seems the week 15 DAL vs NYG game doesn't share the same old_game_id with the pbp data (it's the number 14 in the lists) "2021121903" vs "2021121907"

nflreadr::load_participation(season=2021) %>% filter(possession_team =="DAL") %>% select (old_game_id) %>% distinct()
── nflverse pbp participation ──
ℹ Data updated: 2022-08-03 02:38:31 CEST
# A tibble: 18 × 1
   old_game_id
   <chr>      
 1 2021090900 
 2 2021091911 
...
13 2021121207 
14 2021121903
15 2021122611
...
> nflfastR::load_pbp(season=2021) %>% filter(posteam=="DAL")%>%select(old_game_id)%>%distinct()
── nflverse play by play  ──
ℹ Data updated: 2022-07-29 00:10:55 CEST
# A tibble: 18 × 1
   old_game_id
   <chr>      
 1 2021090900 
 2 2021091911 
...
13 2021121207 
14 2021121907
15 2021122611 
...

The data seem to be correct; only the old_game_id is wrong. load_schedules shows the same old_game_id as the pbp data.

nflreadr::load_schedules(season=2021) %>% filter(away_team =="DAL",week==15) %>% select (old_game_id)
── nflverse games and schedules  ──
ℹ Data updated: 2022-08-08 18:51:11 CEST
# A tibble: 1 × 1
  old_game_id
  <chr>      
1 2021121907 
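
One quick way to surface this kind of mismatch is a set difference between the IDs in the two tables. The sketch below uses only the `old_game_id` values quoted in the outputs above (the elided rows are left out), and `compare_ids` is a hypothetical helper name.

```python
# old_game_id values quoted in the report (elided rows omitted).
participation_ids = {
    "2021090900", "2021091911", "2021121207", "2021121903", "2021122611",
}
pbp_ids = {
    "2021090900", "2021091911", "2021121207", "2021121907", "2021122611",
}


def compare_ids(left, right):
    """Return (IDs only in left, IDs only in right), each sorted."""
    return sorted(left - right), sorted(right - left)


only_in_participation, only_in_pbp = compare_ids(participation_ids, pbp_ids)
print(only_in_participation)
print(only_in_pbp)
```

IDs that appear on only one side are the ones that would silently drop out of an inner join on `old_game_id`.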

[BUG] missing value for `air_yards` and `yards_after_catch`

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

nflfastR 4.6.0

Describe the bug

Play 1558 of 2021_02_NO_CAR, play 2072 of 2021_12_ATL_JAX, and play 1644 of 2022_04_MIN_NO
all have NA for air_yards and yards_after_catch.

Reprex

pass_d <- pbp %>%
  filter(season >= 2018, season <= 2022, week >= 1, week <= 99,
         complete_pass == 1, !is.na(receiving_yards)) %>%
  # keep only completed passes that are missing air_yards
  filter(is.na(air_yards)) %>%
  select(play_id, game_id, season, week, qtr, time, team = posteam, desc, play_type,
         yrdln,
         yardline_100, passing_yards, receiving_yards, air_yards, yards_after_catch,
         yards_gained, penalty,
         return_touchdown, touchdown, player_id = passer_player_id,
         receiver_player_id) %>%
  collect()

Expected Behavior

For complete_pass plays, air_yards and yards_after_catch should not be NA.

nflverse_sitrep

NA

Screenshots

No response

Additional context

No response

[BUG] Anthony Brown BAL QB mapped to wrong pfr_player_id

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

na

Describe the bug

Baltimore Ravens QB Anthony Brown is incorrectly mapped to Dallas Cowboys CB Anthony Brown

Reprex

Repro:
1. Download this file: `https://github.com/nflverse/nflverse-data/releases/download/snap_counts/snap_counts_2022.csv`
2. Filter to: {game_id: 2022_14_BAL_PIT, player: "Anthony Brown"}
Actual: pfr_player_id = BrowAn02

Expected Behavior

Expected: pfr_player_id = BrowAn06

nflverse_sitrep

na

Screenshots

na

Additional context

na

kick_distance only shows a value for returned kickoffs

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

1.0.2

Describe the bug

In the PBP data, kick_distance only shows a value when the kickoff is not a touchback. Please show a value for all kickoffs. If the kickoff starts on the 35 and the kick is a touchback, the kick distance would be 50 yards.

Reprex

Using the pbp releases.

Expected Behavior

If the kickoff starts on the 35 and the kick is a touchback, the kick distance would be 50 yards.

nflverse_sitrep

N/A

Screenshots

N/A

Additional context

N/A

`load_injuries` missing week 1 regular season data for 2009-2019

nflreadr::load_injuries(2009:2019) |>
    dplyr::filter(season_type == "REG" & week == 1)
#> ── nflverse injury and practice reports ────────────────────────────────────────
#> ℹ Data updated: 2022-03-10 09:50:23 PST
#> # A tibble: 0 × 16
#> # … with 16 variables: season <chr>, season_type <chr>, team <chr>, week <chr>,
#> #   gsis_id <chr>, position <chr>, full_name <chr>, first_name <chr>,
#> #   last_name <chr>, report_primary_injury <chr>,
#> #   report_secondary_injury <chr>, report_status <chr>,
#> #   practice_primary_injury <chr>, practice_secondary_injury <chr>,
#> #   practice_status <chr>, date_modified <dttm>
#> # ℹ Use `colnames()` to see all variable names

[BUG] no Jalen Hurts for Philadelphia Depth Chart (week 23)

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

1.3.1

Describe the bug

There isn't a record for Jalen Hurts as QB in the Philadelphia depth chart for week 23.

Reprex

team_target <- "PHI"

depth_chart <- nflreadr::load_depth_charts(seasons = 2022) %>% 
  
  filter(club_code==team_target) %>% 
  filter(week == max(week))

Expected Behavior

A record showing Jalen Hurts as QB for week 23.

nflverse_sitrep

── System Info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• R version 4.2.2 Patched (2022-11-10 r83330)   • Running under: Ubuntu 22.04.1 LTS
── nflverse Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• nflreadr (1.3.1)  • nflseedR (1.1.0)  • nflplotR (1.1.0)  
• nflfastR (4.5.0)  • nfl4th   (1.0.2)  • nflverse (1.0.2)  
── nflverse Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
No options set for nflreadr, nflfastR, nflseedR, nfl4th, nflplotR, and nflverse
── nflverse Dependencies ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• askpass     (1.1)     • hms        (1.1.2)     • proto        (1.0.0)    
• cachem      (1.0.6)   • httr       (1.4.4)     • purrr        (0.3.5)    
• cli         (3.4.1)   • isoband    (0.2.6)     • R6           (2.5.1)    
• codetools   (0.2-18)  • janitor    (2.1.0)     • rappdirs     (0.3.3)    
• colorspace  (2.0-3)   • jsonlite   (1.8.3)     • RColorBrewer (1.1-3)    
• cpp11       (0.4.3)   • labeling   (0.4.2)     • Rcpp         (1.0.9)    
• crayon      (1.5.2)   • lattice    (0.20-45)   • rlang        (1.0.6)    
• curl        (4.3.3)   • lifecycle  (1.0.3)     • rstudioapi   (0.14)     
• data.table  (1.14.6)  • listenv    (0.8.0)     • scales       (1.2.1)    
• digest      (0.6.30)  • lubridate  (1.9.0)     • snakecase    (0.11.0)   
• dplyr       (1.0.10)  • magick     (2.7.3)     • stringi      (1.7.8)    
• ellipsis    (0.3.2)   • magrittr   (2.0.3)     • stringr      (1.4.1)    
• fansi       (1.0.3)   • MASS       (7.3-58.1)  • sys          (3.4.1)    
• farver      (2.1.1)   • Matrix     (1.5-3)     • tibble       (3.1.8)    
• fastmap     (1.1.0)   • memoise    (2.0.1)     • tidyr        (1.2.1)    
• fastrmodels (1.0.2)   • mgcv       (1.8-41)    • tidyselect   (1.2.0)    
• furrr       (0.3.1)   • mime       (0.12)      • timechange   (0.1.1)    
• future      (1.29.0)  • munsell    (0.5.0)     • utf8         (1.2.2)    
• generics    (0.1.3)   • nlme       (3.1-160)   • vctrs        (0.5.1)    
• ggplot2     (3.4.0)   • openssl    (2.0.4)     • viridisLite  (0.4.1)    
• globals     (0.16.2)  • parallelly (1.32.1)    • withr        (2.5.0)    
• glue        (1.6.2)   • pillar     (1.8.1)     • xgboost      (1.6.0.1)  
• gsubfn      (0.7)     • pkgconfig  (2.0.3)       
• gtable      (0.3.1)   • progressr  (0.11.0)

Screenshots

No response

Additional context

No response

play_by_play_2023 has incorrect old_game_id for Week 15 2023

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

1.0.3

Describe the bug

As stated in the title, the old_game_id is incorrect for all week 15 games of the 2023 season, which prevents an easy join of the pbp_participation data to the play_by_play data. The Play By Play 2023 data has old_game_id values beginning with 2022, whereas the PBP Participation data has the correctly formatted version starting with 2023. See images provided.

Reprex

Not applicable, bad data is the source of the bug as illustrated.

Expected Behavior

A successful join of the data based on the old_game_id and play_id fields.

nflverse_sitrep

NA

Screenshots

PBP Participation for week 15 of 2023:
image

Play By Play for week 15 of 2023:
image

Additional context

No response
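
A sanity check of the year prefix could catch this before a join is attempted. The sketch below rests on assumptions: the `YYYYMMDDNN` layout is inferred from IDs quoted elsewhere in this tracker (for example 2021121907), the month cut-offs are a guess at the NFL calendar, and both the function name and the three sample IDs are hypothetical.

```python
from datetime import datetime


def old_game_id_matches_season(old_game_id: str, season: int) -> bool:
    """Check a 10-digit old_game_id against a season.

    Assumes the layout YYYYMMDDNN (game date plus a two-digit counter).
    Assumes games from August onward carry the season year, and games in
    January or February carry the following calendar year.
    """
    if len(old_game_id) != 10 or not old_game_id.isdigit():
        return False
    try:
        game_date = datetime.strptime(old_game_id[:8], "%Y%m%d")
    except ValueError:
        return False  # the first eight digits are not a real date
    if game_date.month >= 8:
        return game_date.year == season
    if game_date.month <= 2:
        return game_date.year == season + 1
    return False


# Hypothetical IDs for illustration only.
december_ok = old_game_id_matches_season("2023121700", 2023)
wrong_year = old_game_id_matches_season("2022121700", 2023)
january_ok = old_game_id_matches_season("2024010700", 2023)
print(december_ok, wrong_year, january_ok)
```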

Missing Games for PBP Participation

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

1.4.0

Describe the bug

2023 Weeks 16-19 don't seem to have every game. For example, week 17 only has 3 games.

Reprex

library(nflreadr)
library(dplyr) # provides %>% and filter()

pbp <- load_participation() %>%
  filter(startsWith(nflverse_game_id, '2023_17'))

print(length(unique(pbp$nflverse_game_id)))

* Returns 3

Expected Behavior

Should return 16 (the number of games that week)

nflverse_sitrep

── System Info ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• R version 4.3.2 (2023-10-31 ucrt) • Running under: Windows 11 x64 (build 22631)
── Package Status ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   package installed  cran      dev behind
1 nflreadr     1.4.0 1.4.0 1.4.0.12    dev
── Package Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• No options set for above packages
── Package Dependencies ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• cachem     (1.0.8)   • glue     (1.7.0)  • grDevices (4.3.2)  
• cli        (3.6.2)   • memoise  (2.0.1)  • methods   (4.3.2)  
• curl       (5.2.0)   • rappdirs (0.3.3)  • stats     (4.3.2)  
• data.table (1.15.0)  • rlang    (1.1.3)  • tools     (4.3.2)  
• fastmap    (1.1.1)   • graphics (4.3.2)  • utils     (4.3.2)  
── Not Installed ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• nflfastR  • nfl4th  • nflverse  • nflseedR  • nflplotR

Screenshots

No response

Additional context

No response

Citation

I'm using your data for a school project and I want to cite your repository. I think it would be cool to have a Citation.md.

Time of day for each play

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

I would like to find the time of day (on a wall clock) when each play started. I want to be able to predict how much wall clock time in a game is left.

Describe the solution you'd like

I want to find a dataset similar to play by play data, with an extra column for time of day

Describe alternatives you've considered

I wonder what dataset AWS uses for its "Next Gen Stats". It's probably not publicly available

Additional context

No response

[BUG] Missing `roof` values for 2021

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

1.4.0.9

Describe the bug

PBP data is missing roof values for a significant number of 2021 games. The issue appears to be limited to a subset of 2021 games; no other seasons appear to be affected.

Reprex

nflreadr::load_pbp(TRUE) |>
    dplyr::filter(is.na(roof)) |>
    dplyr::count(season) |>
    data.frame()
#>   season    n
#> 1   2021 4387

nflreadr::load_pbp(2021) |>
    dplyr::filter(is.na(roof)) |>
    dplyr::count(game_id) |>
    data.frame()
#>            game_id   n
#> 1  2021_01_JAX_HOU 206
#> 2  2021_01_PHI_ATL 191
#> 3  2021_01_SEA_IND 169
#> 4   2021_02_LA_IND 168
#> 5  2021_03_CAR_HOU 170
#> 6  2021_04_WAS_ATL 190
#> 7   2021_05_NE_HOU 169
#> 8  2021_06_HOU_IND 162
#> 9  2021_08_CAR_ATL 173
#> 10  2021_08_LA_HOU 178
#> 11 2021_08_TEN_IND 206
#> 12 2021_09_NYJ_IND 183
#> 13 2021_10_JAX_IND 182
#> 14  2021_11_NE_ATL 158
#> 15 2021_12_NYJ_HOU 168
#> 16  2021_12_TB_IND 184
#> 17 2021_13_IND_HOU 165
#> 18  2021_13_TB_ATL 186
#> 19 2021_14_SEA_HOU 191
#> 20  2021_15_NE_IND 164
#> 21 2021_16_DET_ATL 156
#> 22 2021_16_LAC_HOU 172
#> 23  2021_17_LV_IND 165
#> 24  2021_18_NO_ATL 162
#> 25 2021_18_TEN_HOU 169

Expected Behavior

The 2021 season should have complete roof values for these games.

nflverse_sitrep

blah

Screenshots

No response

Additional context

No response
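
Grouping the affected games by home team may help narrow down the cause. The sketch below reuses the 25 `game_id` values from the reprex output and assumes the ID layout is `season_week_away_home`, so the last token is the home team. That layout is an assumption drawn from how IDs are used elsewhere in this tracker.

```python
from collections import Counter

# The 25 game_id values listed in the reprex output above.
game_ids = """
2021_01_JAX_HOU 2021_01_PHI_ATL 2021_01_SEA_IND 2021_02_LA_IND
2021_03_CAR_HOU 2021_04_WAS_ATL 2021_05_NE_HOU 2021_06_HOU_IND
2021_08_CAR_ATL 2021_08_LA_HOU 2021_08_TEN_IND 2021_09_NYJ_IND
2021_10_JAX_IND 2021_11_NE_ATL 2021_12_NYJ_HOU 2021_12_TB_IND
2021_13_IND_HOU 2021_13_TB_ATL 2021_14_SEA_HOU 2021_15_NE_IND
2021_16_DET_ATL 2021_16_LAC_HOU 2021_17_LV_IND 2021_18_NO_ATL
2021_18_TEN_HOU
""".split()

# Assumed layout: season_week_away_home, so index 3 is the home team.
home_counts = Counter(game_id.split("_")[3] for game_id in game_ids)
print(dict(home_counts))
```

Under that assumption, every affected game was hosted by one of three teams (HOU, ATL, IND), which suggests the gap is tied to specific venues rather than to random games.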

load_combine() not showing data for 2023

There doesn't appear to be any NFL Combine data for 2023 (see below).

nflreadr::load_combine(seasons = 2023)
#> ── nflverse combine measurements ───────────────────────────────────────────────
#> ℹ Data updated: 2023-03-13 04:15:20 AEDT
#> Empty data.table (0 rows and 18 cols): season,draft_year,draft_team,draft_round,draft_ovr,pfr_id...

Created on 2023-05-09 with reprex v2.0.2

[BUG] draft_picks release is missing 2017 and newer

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

NA

Describe the bug

The data in the draft_picks release (specifically, parquet and csv checked) is cut off after 2016.

Reprex

from pandas import read_csv

picks = read_csv('https://github.com/nflverse/nflverse-data/releases/download/draft_picks/draft_picks.csv')
print(picks.season.max())

Expected Behavior

2022

nflverse_sitrep

NA

Screenshots

No response

Additional context

No response

Preparation for possible {arrow} removal on CRAN

It is highly likely that the R package {arrow} will be (temporarily) archived on CRAN.

This affects all nflverse releases that include .parquet files, because the corresponding workflows install arrow from CRAN.

The least invasive solution is to add the arrow r-universe repo to the list of repos when setting up R on the runner, which allows a binary to be installed. We can find the suggested repo here. The YAML should look like this:

 - uses: r-lib/actions/setup-r@v2
   with:
      extra-repositories: 'https://apache.r-universe.dev/'

Affected tags in nflverse-data

I am not 100% sure if there are releases in other repos that require arrow to save parquet files. In nflverse-data, we can list all affected tags using this code

all_assets <- piggyback::pb_list("nflverse/nflverse-data")

all_assets |> 
  dplyr::filter(
    # fixed() matches ".parquet" literally; a bare "." is a regex wildcard
    stringr::str_detect(file_name, stringr::fixed(".parquet"))
  ) |> 
  dplyr::distinct(tag)

which outputs the following list

  • ftn_charting
  • espn_data
  • weekly_rosters
  • players_components
  • players
  • pbp_participation
  • officials
  • misc
  • draft_picks
  • contracts
  • snap_counts
  • rosters
  • player_stats
  • pfr_advstats
  • pbp
  • nextgen_stats
  • injuries
  • depth_charts
  • combine

[BUG] unwanted line feeds within "injuries_YYYY.csv"

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

not relevant, I just want to use the downloaded csv

Describe the bug

Within the CSV files for injuries, there are unwanted line feeds in the column "practice_status" for line items where this field is empty.
This leads to issues when loading the file, e.g. with pandas.read_csv.

Example file: https://github.com/nflverse/nflverse-data/releases/tag/injuries/injuries_2023.csv
Line Item 26 "Robert Tonyan"

Reprex

not relevant, I just want to use the downloaded csv

Expected Behavior

I would expect that there are no line feeds within "cells" of a CSV file whatsoever.

nflverse_sitrep

not relevant, I just want to use the downloaded csv

Screenshots

No response

Additional context

No response
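
Until the files are cleaned, a reader that understands quoted fields can serve as a workaround. The sketch below assumes the stray line feed sits inside a quoted field. The sample text, player names, and column values are hypothetical and not taken from the real file. It contrasts naive line splitting with Python's standard `csv` module and then strips the line feeds.

```python
import csv
import io

# Hypothetical excerpt: the quoted practice_status field of the first
# data row contains nothing but a line feed.
raw = (
    "full_name,report_status,practice_status\n"
    '"Player A","Out","\n"\n'
    '"Player B","Questionable","Limited"\n'
)

# Naive splitting treats the embedded line feed as a record boundary.
naive_lines = raw.splitlines()

# csv.reader keeps a quoted field together even when it spans lines.
rows = list(csv.reader(io.StringIO(raw, newline="")))

# Replace embedded line breaks and trim, so empty cells become "".
cleaned = [
    [field.replace("\r", " ").replace("\n", " ").strip() for field in row]
    for row in rows
]

print(len(naive_lines), len(rows))
print(cleaned[1])
```

If the line feeds in the real file are not inside quotes, this approach would not help and the rows would need to be repaired before parsing.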

Add Preseason PBP Data

I would love for preseason play-by-play to be released for each season. This would be helpful for developing a preseason ranking and for betting during the first few weeks.

[BUG] Status badges are broken

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

NA

Describe the bug

image

Reprex

NA

Expected Behavior

NA

nflverse_sitrep

NA

Screenshots

No response

Additional context

This seems to be the cause - badges/shields#8671

[BUG] <RYOE Stats Don't Match Up With NextGenStats Website>

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

What version of the package do you have?

1.3.2.7

Describe the bug

When I try to pull up the 2022 NextGenStats for RBs, the stats come up, and most of them come up correct. However, the RYOE, RYOE/Att., and Rush % OE stats come up completely different than what the NextGenStats website has.

Reprex

next_gen_2022 <- load_nextgen_stats(seasons = 2022, stat_type = "rushing") |> 
  filter(week == 0)

Expected Behavior

I expected the overall season stats for RYOE, RYOE/Att., and Rush % OE stats to match what the NextGenStats website has for them.

nflverse_sitrep

nflverse_sitrep()
── System Info ──────────────────────────────────────────────────────────────────────────────────────────────────
• R version 4.3.1 (2023-06-16)   • Running under: macOS Ventura 13.4.1
── nflverse Packages ────────────────────────────────────────────────────────────────────────────────────────────
• nflreadr (1.3.2.07)    • nflseedR (1.2.0)       • nflplotR (1.1.0.9006)  
• nflfastR (4.5.1.9004)  • nfl4th   (1.0.2.9006)  • nflverse (1.0.2)       
── nflverse Options ─────────────────────────────────────────────────────────────────────────────────────────────
No options set for nflreadr, nflfastR, nflseedR, nfl4th, nflplotR, and nflverse
── nflverse Dependencies ────────────────────────────────────────────────────────────────────────────────────────
• askpass     (1.1)     • hms        (1.1.3)    • progressr    (0.13.0)   
• cachem      (1.0.8)   • httr       (1.4.6)    • proto        (1.0.0)    
• cli         (3.6.1)   • isoband    (0.2.7)    • purrr        (1.0.1)    
• codetools   (0.2-19)  • janitor    (2.2.0)    • R6           (2.5.1)    
• colorspace  (2.1-0)   • jsonlite   (1.8.7)    • rappdirs     (0.3.3)    
• cpp11       (0.4.4)   • labeling   (0.4.2)    • RColorBrewer (1.1-3)    
• crayon      (1.5.2)   • lattice    (0.21-8)   • Rcpp         (1.0.10)   
• curl        (5.0.1)   • lifecycle  (1.0.3)    • rlang        (1.1.1)    
• data.table  (1.14.8)  • listenv    (0.9.0)    • rstudioapi   (0.14)     
• digest      (0.6.31)  • lubridate  (1.9.2)    • scales       (1.2.1)    
• dplyr       (1.1.2)   • magick     (2.7.4)    • snakecase    (0.11.0)   
• fansi       (1.0.4)   • magrittr   (2.0.3)    • stringi      (1.7.12)   
• farver      (2.1.1)   • MASS       (7.3-60)   • stringr      (1.5.0)    
• fastmap     (1.1.1)   • Matrix     (1.5-4.1)  • sys          (3.4.2)    
• fastrmodels (1.0.2)   • memoise    (2.0.1)    • tibble       (3.2.1)    
• furrr       (0.3.1)   • mgcv       (1.8-42)   • tidyr        (1.3.0)    
• future      (1.33.0)  • mime       (0.12)     • tidyselect   (1.2.0)    
• generics    (0.1.3)   • munsell    (0.5.0)    • timechange   (0.2.0)    
• ggplot2     (3.4.2)   • nlme       (3.1-162)  • utf8         (1.2.3)    
• globals     (0.16.2)  • openssl    (2.0.6)    • vctrs        (0.6.3)    
• glue        (1.6.2)   • parallelly (1.36.0)   • viridisLite  (0.4.2)    
• gsubfn      (0.7)     • pillar     (1.9.0)    • withr        (2.5.0)    
• gtable      (0.3.3)   • pkgconfig  (2.0.3)    • xgboost      (1.7.5.1)

Screenshots

See bottom:

Additional context

What it should be:

Screenshot 2023-08-08 at 6 57 30 PM

What I get:

Screenshot 2023-08-08 at 7 34 43 PM Screenshot 2023-08-08 at 7 34 46 PM

[FEATURE REQ] Extend serialization formats that encourage backwards and forwards compatibility

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

The current file formats can be prone to breaking changes to columns and/or data types. Given the extensive use of the data in this repository, moving towards backwards and forwards compatibility would allow users to adopt new fields in their own time.

Describe the solution you'd like

Extend the current process which outputs files to also support other serialization formats, such as Protobuf or FlatBuffers.

There is also the added benefit of better deserialization performance, depending on the language used to read the data.

Describe alternatives you've considered

Maintaining my own mapping based on the CSV or Parquet file, but this would require a fair amount of intervention and would not be particularly robust.

Additional context

No response
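
For context, the "own mapping" alternative mentioned above might look roughly like the sketch below: a reader that selects columns by name, ignores columns it does not know, and fills in defaults for columns that are absent. Everything here is hypothetical (the schema, the column names, the sample files, and the placeholder ID), and it is not how nflverse reads or writes its data.

```python
import csv
import io

# Hypothetical schema: column name -> (converter, default if absent or empty).
SCHEMA = {
    "season": (int, None),
    "full_name": (str, ""),
    "gsis_id": (str, None),
}


def read_known_columns(text, schema=SCHEMA):
    """Parse CSV text into dicts containing only the columns in `schema`."""
    records = []
    for row in csv.DictReader(io.StringIO(text, newline="")):
        record = {}
        for name, (convert, default) in schema.items():
            value = row.get(name)  # None when the column is missing
            record[name] = convert(value) if value not in (None, "") else default
        records.append(record)
    return records


# An older file without gsis_id, and a newer one with reordered and extra columns.
old_file = "season,full_name\n2022,Player A\n"
new_file = "full_name,extra_col,gsis_id,season\nPlayer A,x,00-PLACEHOLDER,2023\n"

old_records = read_known_columns(old_file)
new_records = read_known_columns(new_file)
print(old_records)
print(new_records)
```

This tolerates added, removed, and reordered columns, but not renamed columns or changed data types, which is part of why a schema-carrying format is being requested.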
