
open-data's People

Contributors

cddr, deepxg, haghanim, jamesyorke77, scotty779


open-data's Issues

Check the data: Two managers are registered for the home team.

I have a question about publicly available data.

  • competition_id: 12
  • season_id: 27
  • row number: 378

This record lists two managers for the home team. Is that an error, or does the data structure allow multiple managers to be registered? Please also let me know whether your policy is to correct such cases.

df.iloc[378]['home_team']
{'home_team_id': 226, 'home_team_name': 'Hellas Verona', 'home_team_gender': 'male', 'home_team_group': None, 'country': {'id': 112, 'name': 'Italy'}, 'managers': [{'id': 1002096, 'name': 'Rafael Márquez Álvarez', 'nickname': None, 'dob': '1979-02-13', 'country': {'id': 147, 'name': 'Mexico'}}, {'id': 4071, 'name': 'Andrea Mandorlini', 'nickname': None, 'dob': '1960-07-17', 'country': {'id': 112, 'name': 'Italy'}}]}
https://raw.githubusercontent.com/statsbomb/open-data/master/data/matches/12/27.json
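
To see how widespread this is, the matches file could be scanned for teams with more than one manager entry. This is only a minimal sketch, assuming the nested structure shown in the record above (a `managers` list under `home_team`, and by assumption the same under `away_team`). The helper name `teams_with_multiple_managers` and the sample records are hypothetical.

```python
def teams_with_multiple_managers(matches):
    """Yield (row_index, side, manager_names) for every team listing 2+ managers."""
    for i, match in enumerate(matches):
        for side in ("home_team", "away_team"):
            # Tolerate a missing team dict or a missing/None managers list.
            managers = (match.get(side) or {}).get("managers") or []
            if len(managers) > 1:
                yield i, side, [m["name"] for m in managers]


# Hypothetical sample shaped like the record above, trimmed to the relevant keys.
sample = [
    {"home_team": {"managers": [{"name": "Andrea Mandorlini"}]},
     "away_team": {"managers": []}},
    {"home_team": {"managers": [{"name": "Rafael Márquez Álvarez"},
                                {"name": "Andrea Mandorlini"}]},
     "away_team": {}},
]

flagged = list(teams_with_multiple_managers(sample))
print(flagged)
```

With real data, `matches` would be the list loaded from a file such as the one linked above.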

Invalid locations for specific events: GoalKeeper, Pressure and more?

Hi,

I've found invalid locations in 360 frames (the freeze_frame.location field). Here are several examples:

  • Euro 2020 - Italy vs. Spain (competitionId = 55, seasonId = 43, matchId = 3795220):

Event index 3365 (event type: Pressure): the freeze_frame players jump from the right side of the pitch (event index 3364) to the left (event index 3365). Then, at event index 3366, they jump back to the right.

  • Euro 2020 - Italy vs. England (competitionId = 55, seasonId = 43, matchId = 3795506):

Event index 72 (event type: GoalKeeper): the freeze_frame players jump from the right side of the pitch (event index 71) to the left (event index 72). Then, at event index 73, they jump back to the right.

Is the problem with the 'x' values, the 'y' values, or both?

Do you know of any workaround I could use temporarily until this is fixed?

Thank you.

Regards,
Paco.
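
As a stopgap, a frame that looks mirrored could be flipped back through the centre of the pitch. This is only a sketch: it assumes a 120 x 80 coordinate grid and that both x and y are mirrored, neither of which is confirmed in this thread. `flip_frame` is a hypothetical helper, and deciding which frames need flipping is left to the caller.

```python
PITCH_LENGTH = 120.0  # assumed grid length (x axis)
PITCH_WIDTH = 80.0    # assumed grid width (y axis)


def flip_frame(freeze_frame):
    """Return a copy of the freeze frame rotated 180 degrees about the pitch centre."""
    flipped = []
    for player in freeze_frame:
        x, y = player["location"]
        # Copy the player dict so the original frame is left untouched.
        flipped.append({**player, "location": [PITCH_LENGTH - x, PITCH_WIDTH - y]})
    return flipped


# Hypothetical one-player frame.
frame = [{"teammate": True, "location": [100.0, 30.0]}]
print(flip_frame(frame))
```

If only one axis turns out to be affected, the other subtraction can simply be dropped.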

Match Missing

Data for Group A, match 2, Egypt v Uruguay, is missing from the dataset (it does not appear in matches, lineups, or events).

Versioning

First of all, thanks for releasing these precious data for free!

It seems that you periodically release new data or update the existing data. It would be really great if you could start versioning (and also possibly providing the info about what has changed since the last time).

Thanks!

Eager Novice - R

Hello,
I need help getting started, please. It seems like R is the recommended path, and I am following all the instructions from the StatsBomb R ppt and the https://github.com/statsbomb/StatsBombR link. From what I can tell, the only issue I am having is with the StatsBombR package portion. Below is the error message I receive. Do I have to click on any of the previously downloaded packages, such as devtools, in order to download this? Any advice will help greatly.

`> devtools::install_github("statsbomb/StatsBombR");
Downloading GitHub repo statsbomb/StatsBombR@master
Downloading git repo https://github.com/cran/SDMTools.git
√ checking for file 'C:\Users\Cris Curis' Surface\AppData\Local\Temp\Rtmp6H80rl\file247833b2144e/DESCRIPTION' (484ms)

  • preparing 'SDMTools': (851ms)
    √ checking DESCRIPTION meta-information ...
  • cleaning src
  • checking for LF line-endings in source and make files and shell scripts (377ms)
  • checking for empty or unneeded directories
  • building 'SDMTools_1.1-221.2.tar.gz'

Installing package into ‘C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0’
(as ‘lib’ is unspecified)

  • installing source package 'SDMTools' ...
    ** using staged installation
    ** libs

*** arch - i386
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c ConnectedComponentLabelling.c -o ConnectedComponentLabelling.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c getmin.c -o getmin.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c movewindow.c -o movewindow.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c patchstats.c -o patchstats.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c pointinpolygon.c -o pointinpolygon.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c slope.aspect.c -o slope.aspect.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c vincenty.geodesics.c -o vincenty.geodesics.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c writeasciidata.c -o writeasciidata.o
/mingw32/bin/gcc -shared -s -static-libgcc -o SDMTools.dll tmp.def ConnectedComponentLabelling.o getmin.o movewindow.o patchstats.o pointinpolygon.o slope.aspect.o vincenty.geodesics.o writeasciidata.o -LC:/PROGRA1/R/R-401.2/bin/i386 -lR
installing to C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0/00LOCK-SDMTools/00new/SDMTools/libs/i386

*** arch - x64
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c ConnectedComponentLabelling.c -o ConnectedComponentLabelling.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c getmin.c -o getmin.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c movewindow.c -o movewindow.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c patchstats.c -o patchstats.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c pointinpolygon.c -o pointinpolygon.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c slope.aspect.c -o slope.aspect.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c vincenty.geodesics.c -o vincenty.geodesics.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c writeasciidata.c -o writeasciidata.o
/mingw64/bin/gcc -shared -s -static-libgcc -o SDMTools.dll tmp.def ConnectedComponentLabelling.o getmin.o movewindow.o patchstats.o pointinpolygon.o slope.aspect.o vincenty.geodesics.o writeasciidata.o -LC:/PROGRA1/R/R-401.2/bin/x64 -lR
installing to C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0/00LOCK-SDMTools/00new/SDMTools/libs/x64
** R
** byte-compile and prepare package for lazy loading
Error: unexpected symbol in "tools:::makeLazyLoading("SDMTools", 'C:/Users/Cris Curis' Surface"
Execution halted
ERROR: lazy loading failed for package 'SDMTools'

  • removing 'C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0/SDMTools'
    Error: Failed to install 'StatsBombR' from GitHub:
    Failed to install 'unknown package' from Git:
    (converted from warning) installation of package ‘C:/Users/CRISCU~1/AppData/Local/Temp/Rtmp6H80rl/file247837fd279/SDMTools_1.1-221.2.tar.gz’ had non-zero exit status`

Getting error trying to download data

Using this code:

library(StatsBombR)
events <- StatsBombFreeEvents()

or

WWC <- FreeMatches(72)

I'm getting this error:

Error: parse error: trailing garbage
404: Not Found
(right here) ------^

Only data until mid-2021?

First of all, thanks for collecting such vast information across so many matches. I've registered my email; is there a way to get data more recent than mid-2021 (e.g. La Liga only goes until 2021)?

Location in meters

How can the location be translated into actual dimensions, i.e. in meters?
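
One possible approach is a linear rescale. The sketch below assumes the locations sit on a 120 x 80 grid and that the real pitch is 105 m x 68 m; actual pitch dimensions vary by stadium and are not given in this thread, so both are parameters. `to_meters` is a hypothetical helper.

```python
def to_meters(location, pitch_length_m=105.0, pitch_width_m=68.0,
              grid_length=120.0, grid_width=80.0):
    """Linearly rescale an [x, y] grid location to metres."""
    x, y = location
    return (x * pitch_length_m / grid_length, y * pitch_width_m / grid_width)


# Centre of the assumed grid maps to the centre of the assumed pitch.
print(to_meters([60.0, 40.0]))  # → (52.5, 34.0)
```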

7298.json included from 2017-18 season [Manchester City WFC vs Chelsea WFC 2018-02-24]

The events and lineups files named 7298.json, from the 2017-18 season of the FA Women's Super League, are included in the repository. There's no corresponding match information, as that season shouldn't be available in the open data.

This is a game between Manchester City WFC and Chelsea WFC from 2018-02-24.

A link to the event file: https://github.com/statsbomb/open-data/blob/master/data/events/7298.json
A link to the lineup file: https://github.com/statsbomb/open-data/blob/master/data/lineups/7298.json

Help needed for a complete noob.

Hi, can you please provide step-by-step instructions for a complete noob?

  1. How to download all the data?
  2. Take a match for example (Just say Chelsea vs Manchester United recently.)
  3. How do I read the events data for the Chelsea vs Manchester United match? (I have R and have installed statsbomb on R.)

Event coordinates flipped

It seems the coordinates of event "9c0159bb-61e0-4b62-9ac4-1cd45ae2df45" are flipped when compared to video of the event. This may be an input error for you to be aware of.

First half kick off missing - match 7480

Hi,
While working with your data I noticed that match id 7480 differs from the others at the beginning. In the first half there is no pass from the middle of the pitch and no play pattern "from kick off". It is also pretty strange that the game (after the XI and half start events) starts with a "Pressure" event.

zip file corrupted

I can't unzip the data because it says the file is corrupted, and when I open the zip directly I only see the data, readme, and license.

Can't fully download zip file

Hello
When I try to download the file, the download does not complete, and when I try to unzip it, it says the file is corrupted. I tried repairing the archive, which worked, but after unzipping many files are missing and I only get the "events" folder.
All the rest are missing.
I tried downloading on 3 different machines, with the same results.

Women's World Cup: erroneous competition stages

Women's World Cup (comp_id 72, season_id 30) has all matches from the group stage as Regular Season (id 1) instead of Group Stage (10) in the competition stage.

Moreover, the four quarter-finals (match_id 69199, 69202, 69205, 69208) are also under Regular Season (id 1) when they should be Quarter-finals (11).

Thanks for your attention, I'm looking forward to the next update.
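
Until the data is updated, the stage could be patched on load. The sketch below is a workaround built only from the ids reported above; it assumes each match record carries `match_id` and a `competition_stage` dict with an `id`, and that it is applied only to matches from this competition and season. `corrected_stage_id` and the sample records are hypothetical.

```python
REGULAR_SEASON, GROUP_STAGE, QUARTER_FINALS = 1, 10, 11
QUARTER_FINAL_MATCH_IDS = {69199, 69202, 69205, 69208}


def corrected_stage_id(match):
    """Return the stage id this report says the match should have."""
    stage_id = match["competition_stage"]["id"]
    if stage_id != REGULAR_SEASON:
        return stage_id  # already labelled as something else, leave it alone
    if match["match_id"] in QUARTER_FINAL_MATCH_IDS:
        return QUARTER_FINALS
    # Per the report, every other "Regular Season" match is a group-stage game.
    return GROUP_STAGE


# Hypothetical records; match_id 1 and stage id 26 are made up for illustration.
sample = [
    {"match_id": 69199, "competition_stage": {"id": 1}},
    {"match_id": 1, "competition_stage": {"id": 1}},
    {"match_id": 2, "competition_stage": {"id": 26}},
]
print([corrected_stage_id(m) for m in sample])
```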

Champions League Matches data missing

Hello,

Thank you for providing this data to developers and football enthusiasts (and not only).
I tried to process some information about the Champions League 2013-2014 season, but I found that there is no JSON data for that season, nor for any other Champions League season.

Is it accessible only with an account? Or is it just really missing?

The code I tried to access data with:

import json

with open('open-data-master/open-data-master/data/matches/16/76.json', encoding="utf-8") as f:
    data = json.load(f)

data
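
Before opening a matches file, it may help to check whether the competition and season pair is listed at all. This is a minimal sketch, assuming the repository's competitions index (data/competitions.json) is a list of records with `competition_id` and `season_id` keys. `is_available` is a hypothetical helper, and the sample entries only reuse id pairs mentioned in these issues.

```python
def is_available(competitions, competition_id, season_id):
    """Return True if the (competition_id, season_id) pair appears in the index."""
    return any(
        c["competition_id"] == competition_id and c["season_id"] == season_id
        for c in competitions
    )


# Hypothetical stand-in for the loaded index, trimmed to the two id keys.
competitions = [
    {"competition_id": 12, "season_id": 27},
    {"competition_id": 72, "season_id": 30},
]

print(is_available(competitions, 16, 76))
print(is_available(competitions, 12, 27))
```

With the real index, `competitions` would come from `json.load` on the competitions file, in the same way as the snippet above loads a matches file.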

Identifying non-shot-derived goals

I'm trying to infer game state from goals scored, which I can do trivially for routine shot-derived goals by identifying events where shot.outcome.name == "Goal" but not for non-shot-derived goals (e.g. own goals, deflected passes, etc..).

For example, I can't find any way of identifying Aziz Bouhaddouz's own goal in the Morocco v Iran World Cup game (competition_id == 43, match_id == 7577, id == "8040ccd5-449c-4d0c-a557-e1e2ca4d2f18"). The event is classified as type.name == "Ball Receipt*" and ball_receipt.outcome.name == "Incomplete" but I can't find any information to identify it as leading to a goal.

Thanks for any suggestions!
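
One thing to try is looking for a dedicated own-goal event type alongside the shot goals. The sketch below assumes the events file records own goals under a type named "Own Goal For", attributed to the team that benefits. That is an assumption to check against the event specification and is not confirmed in this thread. `goals_by_team` and the sample events are hypothetical.

```python
def goals_by_team(events, own_goal_type="Own Goal For"):
    """Tally goals per team from shot goals plus an assumed own-goal event type."""
    tally = {}
    for ev in events:
        is_shot_goal = ev.get("shot", {}).get("outcome", {}).get("name") == "Goal"
        # Assumption: own goals appear as their own event type, credited
        # to the benefiting team.
        is_own_goal = ev.get("type", {}).get("name") == own_goal_type
        if is_shot_goal or is_own_goal:
            team = ev["team"]["name"]
            tally[team] = tally.get(team, 0) + 1
    return tally


# Hypothetical events, trimmed to the keys used above.
events = [
    {"type": {"name": "Shot"}, "team": {"name": "A"},
     "shot": {"outcome": {"name": "Goal"}}},
    {"type": {"name": "Shot"}, "team": {"name": "B"},
     "shot": {"outcome": {"name": "Saved"}}},
    {"type": {"name": "Own Goal For"}, "team": {"name": "B"}},
]
print(goals_by_team(events))
```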

Trouble fetching Open La Liga 360 data frames

Hi,
For the open-data Barcelona season returned by sb.matches(11, 90), I'm getting an error when trying to retrieve the 360 frames, even though the matches dataframe says they are available.

Example below for match id 3773565:

from statsbombpy import sb
mid = 3773565
sb.frames(mid)

>raise HTTPError(http_error_msg, response=self)
>requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/statsbomb/open-data/master/data/three-sixty/3773565.json

Is this a known issue?

missing matches for La Liga season 2019/2020

Hello,

It appears the matches data set is missing La Liga matches before September 2019. Data for match weeks 1-4 are not found in the json file.
Will this be updated soon?

thanks

Player mismapping

There are occasions where player 8563 is mistaken for 3921 (e.g. match 7522) which suggests you have a mismapped player in your DB.

open_goal appears missing

The documentation for shot events specifies a boolean attribute 'open_goal' but in the data I have been using (WSL 2018/19 & 2019/20) this does not appear to be present in the event data

Missing counterpress column

Hi! Before the data update there was a column counterpress == TRUE in the Events open data; has that been removed from the free datasets or has it been renamed or nested in a different location?

Problem unzip

When I unzip the file, an error message pops up. I tried with different programs and the problem remains.

Thank you in advance

Clarification of event coordinates

Reading the documentation pretty thoroughly I understand the way the pitch coordinates work, however (and apologies if I've just missed something here) I can't find anything which clarifies which direction each team is playing in.

Is the home team always playing from 0 -> 120? Does the home team always start playing from 0 -> 120 but then this is flipped at half time? Is there a data point which contains which direction each team is playing in?

My particular use case is determining whether a pass is going into the attacking third.
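
For the attacking-third case, a check on the x coordinate might be enough if locations are always recorded from the acting team's point of view, attacking towards x = 120 in both halves. That is an assumption here and is exactly what this question asks to have confirmed. `enters_attacking_third` is a hypothetical helper.

```python
ATTACKING_THIRD_START = 80.0  # two thirds of an assumed 120-unit pitch length


def enters_attacking_third(start, end):
    """True if a pass starts outside the attacking third and ends inside it."""
    return start[0] < ATTACKING_THIRD_START <= end[0]


print(enters_attacking_third([70.0, 40.0], [95.0, 20.0]))
print(enters_attacking_third([85.0, 40.0], [95.0, 20.0]))
```

If the direction does flip at half time, the coordinates would need to be mirrored for one period before applying this check.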

Details of pass id 6a793934-d6d0-4e0e-a09c-7c0a69f67a0f appears incorrect

Event id "6a793934-d6d0-4e0e-a09c-7c0a69f67a0f" appears to encode the corner taken in this video: https://youtu.be/rT1pTkBGcUU?t=3328 at YouTube timestamp 55:28 and match timestamp 53:57.

The footage does not show the corner taking place as it is taken during a broadcast replay.

However, the location encoded is [115.0, 45.0] which does not represent a location near enough to the corner point. The corner in the footage is seemingly taken by Iniesta, but the pass is attributed to Xavi.

Reproducible in Python:

from statsbombpy import sb

df = sb.events(69289)

df[df['id']=='6a793934-d6d0-4e0e-a09c-7c0a69f67a0f'][['id', 'minute', 'second', 'pass_type', 'player', 'location', 'pass_end_location']]

Some data is incorrect

Specifically:

  • The Orlando Pride are incorrectly referred to as Orlando Pride SC, an error also made on Soccerway and the NWSL website for archived 2016 matches.
  • Ryan Williams is listed as being Australian, but she is from the USA and has no reported links to Australia.
  • Thaisa's name is misspelled as Thaysa.

For illustrative purposes, detailed changes can be seen in https://github.com/oznogon/open-data/pull/1, but the license on this data appears to preclude directly contributing to it.

Documentation mismatch

Some fields use hyphens instead of underscores for variable names and certain fields (e.g. 'off_camera') aren't described at all.

StatsBombFreeEvents() command bug

Copy & pasting the code from the "Working-with-R.pdf" document from StatsBomb,

library(tidyverse)
library(StatsBombR) 

Comp <- FreeCompetitions() %>%
  filter(competition_id==37 & season_name=="2020/2021") 

Matches <- FreeMatches(Comp) 

StatsBombFreeEvents(MatchesDF = Matches, Parallel = T)

produces the error

Error in if (MatchesDF == "ALL") { : the condition has length > 1

which is odd, especially since I didn't type MatchesDF=="ALL" but maybe I'm not understanding the error message. In any case, the tutorial code is not working.
