statsbomb / open-data
Free football data from StatsBomb
Home Page: https://statsbomb.com/resource-centre/
License: Other
When I filter the free data for the women's PL 2018/19 season, I only get 107 games, but the season had 110 games.
I have a question about publicly available data.
This record has two managers registered for the home team. Is that an error, or does the data structure allow multiple managers to be registered? Please also say whether your policy is to correct such cases or not.
df.iloc[378]['home_team']
{'home_team_id': 226, 'home_team_name': 'Hellas Verona', 'home_team_gender': 'male', 'home_team_group': None, 'country': {'id': 112, 'name': 'Italy'}, 'managers': [{'id': 1002096, 'name': 'Rafael Márquez Álvarez', 'nickname': None, 'dob': '1979-02-13', 'country': {'id': 147, 'name': 'Mexico'}}, {'id': 4071, 'name': 'Andrea Mandorlini', 'nickname': None, 'dob': '1960-07-17', 'country': {'id': 112, 'name': 'Italy'}}]}
https://raw.githubusercontent.com/statsbomb/open-data/master/data/matches/12/27.json
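A minimal sketch for finding other records like this one, assuming the matches file has been loaded as a list of dicts shaped like the record above. The helper name `teams_with_multiple_managers` is hypothetical, `away_team` is assumed to mirror the `home_team` layout, and the sample values other than the two manager names are placeholders.

```python
def teams_with_multiple_managers(matches):
    """Return (match_id, team_name, manager_names) for every team entry
    that lists more than one manager."""
    found = []
    for match in matches:
        for side in ("home_team", "away_team"):
            team = match.get(side) or {}
            managers = team.get("managers") or []
            if len(managers) > 1:
                team_name = team.get(f"{side}_name")
                names = [manager["name"] for manager in managers]
                found.append((match.get("match_id"), team_name, names))
    return found


# Placeholder sample: only the two manager names come from the report above.
sample = [{
    "match_id": 1,
    "home_team": {
        "home_team_name": "Hellas Verona",
        "managers": [{"name": "Rafael Márquez Álvarez"},
                     {"name": "Andrea Mandorlini"}],
    },
    "away_team": {
        "away_team_name": "Example FC",
        "managers": [{"name": "Example Manager"}],
    },
}]
print(teams_with_multiple_managers(sample))
```

Running this over a whole matches file would show whether the double entry is a one-off or a regular feature of the data structure.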
Hi,
I've found invalid locations in 360 frames (freeze_frame.location field). Here are several examples:
Event Index: 3365 (event type: Pressure): the players in freeze_frame change position from the right side of the pitch (event index: 3364) to the left (event index: 3365). After that, in event index 3366, the players change back to the right.
Event Index: 72 (event type: GoalKeeper): the players in freeze_frame change position from the right side of the pitch (event index: 71) to the left (event index: 72). After that, in event index 73, the players change back to the right.
Is it a problem with the 'x' values, the 'y' values, or both?
Do you know of any workaround I could use to fix this temporarily until you fix it?
Thank you.
Regards,
Paco.
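One possible stopgap, not a confirmed fix: if an affected frame turns out to be mirrored on StatsBomb's 120 x 80 coordinate grid, it can be mirrored back. Whether the defect touches x, y, or both is exactly the open question above, so the sketch leaves that choice to the caller through the `flip_x` and `flip_y` switches. The helper name `mirror_freeze_frame` is hypothetical.

```python
PITCH_LENGTH = 120.0  # extent of the x axis in StatsBomb coordinates
PITCH_WIDTH = 80.0    # extent of the y axis in StatsBomb coordinates


def mirror_freeze_frame(freeze_frame, flip_x=True, flip_y=True):
    """Return a copy of a freeze frame with player locations mirrored.

    The input list and its dicts are left untouched.
    """
    mirrored = []
    for player in freeze_frame:
        x, y = player["location"]
        if flip_x:
            x = PITCH_LENGTH - x
        if flip_y:
            y = PITCH_WIDTH - y
        mirrored.append({**player, "location": [x, y]})
    return mirrored


# Made-up single-player frame, for illustration only.
frame = [{"teammate": True, "location": [100.0, 30.0]}]
print(mirror_freeze_frame(frame))
```

Which frames need mirroring still has to be decided separately, for example by comparing each frame with its neighbours at the preceding and following event index.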
Data for Group A match 2, Egypt v Uruguay, is missing from the dataset (it does not appear in matches, lineups, or events).
First of all, thanks for releasing these precious data for free!
It seems that you periodically release new data or update the existing data. It would be really great if you could start versioning (and also possibly providing the info about what has changed since the last time).
Thanks!
Hi.
competitions.json has not contained any information since yesterday.
Hello,
I need help getting started, please. It seems like R is the recommended path, and I am following all the instructions from the StatsBomb R ppt and the https://github.com/statsbomb/StatsBombR link. From what I can tell, the only issue I am having is with the StatsBombR package portion. Below is the error message I receive. Do I have to click on any of the packages previously downloaded, such as devtools, in order to download this? Any advice will help greatly.
> devtools::install_github("statsbomb/StatsBombR");
Downloading GitHub repo statsbomb/StatsBombR@master
Downloading git repo https://github.com/cran/SDMTools.git
√ checking for file 'C:\Users\Cris Curis' Surface\AppData\Local\Temp\Rtmp6H80rl\file247833b2144e/DESCRIPTION' (484ms)
Installing package into ‘C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0’
(as ‘lib’ is unspecified)
*** arch - i386
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c ConnectedComponentLabelling.c -o ConnectedComponentLabelling.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c getmin.c -o getmin.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c movewindow.c -o movewindow.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c patchstats.c -o patchstats.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c pointinpolygon.c -o pointinpolygon.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c slope.aspect.c -o slope.aspect.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c vincenty.geodesics.c -o vincenty.geodesics.o
/mingw32/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c writeasciidata.c -o writeasciidata.o
/mingw32/bin/gcc -shared -s -static-libgcc -o SDMTools.dll tmp.def ConnectedComponentLabelling.o getmin.o movewindow.o patchstats.o pointinpolygon.o slope.aspect.o vincenty.geodesics.o writeasciidata.o -LC:/PROGRA1/R/R-401.2/bin/i386 -lR
installing to C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0/00LOCK-SDMTools/00new/SDMTools/libs/i386
*** arch - x64
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c ConnectedComponentLabelling.c -o ConnectedComponentLabelling.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c getmin.c -o getmin.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c movewindow.c -o movewindow.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c patchstats.c -o patchstats.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c pointinpolygon.c -o pointinpolygon.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c slope.aspect.c -o slope.aspect.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c vincenty.geodesics.c -o vincenty.geodesics.o
/mingw64/bin/gcc -I"C:/PROGRA1/R/R-401.2/include" -DNDEBUG -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c writeasciidata.c -o writeasciidata.o
/mingw64/bin/gcc -shared -s -static-libgcc -o SDMTools.dll tmp.def ConnectedComponentLabelling.o getmin.o movewindow.o patchstats.o pointinpolygon.o slope.aspect.o vincenty.geodesics.o writeasciidata.o -LC:/PROGRA1/R/R-401.2/bin/x64 -lR
installing to C:/Users/Cris Curis' Surface/Documents/R/win-library/4.0/00LOCK-SDMTools/00new/SDMTools/libs/x64
** R
** byte-compile and prepare package for lazy loading
Error: unexpected symbol in "tools:::makeLazyLoading("SDMTools", 'C:/Users/Cris Curis' Surface"
Execution halted
ERROR: lazy loading failed for package 'SDMTools'
Using this code:
library(StatsBombR)
events <- StatsBombFreeEvents()
or
WWC <- FreeMatches(72)
I'm getting this error:
Error: parse error: trailing garbage
404: Not Found
(right here) ------^
Players
Hi,
I've noticed that the matches file for the 2003/04 PL Season (data/matches/2/44.json) is missing data for the stadiums for each match.
First of all, thanks for collecting such vast information across many many matches. I've registered my email, is there a way to get more recent data than mid-2021 (ie. La Liga only goes until 2021)?
How can the location be translated into actual dimensions, i.e. in meters?
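A rough sketch of one way to do this, assuming the 120 x 80 coordinate grid can be scaled linearly onto the real pitch. The real pitch dimensions are treated as unknown here, so 105 m x 68 m is only a placeholder default that should be replaced with the actual size of the stadium's pitch. The function name `location_to_meters` is hypothetical.

```python
GRID_LENGTH = 120.0  # extent of the x axis in StatsBomb coordinates
GRID_WIDTH = 80.0    # extent of the y axis in StatsBomb coordinates


def location_to_meters(location, pitch_length_m=105.0, pitch_width_m=68.0):
    """Scale an [x, y] location from grid units to meters."""
    x, y = location
    return (x / GRID_LENGTH * pitch_length_m, y / GRID_WIDTH * pitch_width_m)


# The centre of the grid maps to the centre of the assumed pitch.
print(location_to_meters([60.0, 40.0]))  # (52.5, 34.0)
```

Because the two axes are scaled by different factors, distances and angles should be computed after the conversion, not before.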
Event and lineup files 7298.json from the 2017-18 season of the FA Women's Super League are included, but there is no match information for them, as that season shouldn't be available in the open data.
This is a game between Manchester City WFC vs Chelsea WFC from 2018-02-24.
A link to the event file: https://github.com/statsbomb/open-data/blob/master/data/events/7298.json
A link to the lineup file: https://github.com/statsbomb/open-data/blob/master/data/lineups/7298.json
Hi, can you please provide step-by-step instructions for a complete noob?
It seems the coordinates of event ("9c0159bb-61e0-4b62-9ac4-1cd45ae2df45") are flipped when compared with video of the event. Maybe an input error for you to be aware of.
Hi,
While working with your data I noticed that match id 7480 differs from the others at the beginning. In the first half there is no pass from the middle of the pitch and no "from kick off" play pattern. It is also strange that the game (after the starting XI and half start events) begins with a "Pressure" event.
Hello
When I try to download the file, the download does not complete, and when I try to unzip it, I am told the file is corrupted. Repairing the archive worked, but after unzipping, a lot of files are missing and I only get the "events" folder.
All the rest are missing.
I tried downloading on 3 different machines, with the same results.
Abc
Women's World Cup (comp_id 72, season_id 30) has all matches from the group stage as Regular Season (id 1) instead of Group Stage (10) in the competition stage.
Moreover, the four quarter-finals (match_id 69199, 69202, 69205, 69208) are also under Regular Season (id 1) when they should be Quarter-finals (11).
Thanks for your attention, I'm looking forward to the next update.
Hello,
Thank you for providing this data to developers and football enthusiasts (and not only).
I tried to process some information about the Champions League 2013-2014 season, but I found that not only is there no JSON data for that specific season, there is no data for any Champions League season.
Is it accessible only with an account? Or is it just really missing?
The code I tried to access data with:
import json
with open('open-data-master/open-data-master/data/matches/16/76.json', encoding="utf-8") as f:
    data = json.load(f)
data
I'm trying to infer game state from goals scored, which I can do trivially for routine shot-derived goals by identifying events where shot.outcome.name == "Goal", but not for non-shot-derived goals (e.g. own goals, deflected passes, etc.).
For example, I can't find any way of identifying Aziz Bouhaddouz's own goal in the Morocco v Iran World Cup game (competition_id == 43, match_id == 7577, id == "8040ccd5-449c-4d0c-a557-e1e2ca4d2f18"). The event is classified as type.name == "Ball Receipt*" and ball_receipt.outcome.name == "Incomplete", but I can't find any information to identify it as leading to a goal.
Thanks for any suggestions!
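A possible approach, sketched under an assumption that should be checked against the event specification and the match in question: own goals are recorded as separate events whose type.name is "Own Goal For" (attached to the team that benefits) alongside a matching "Own Goal Against", not on the preceding pass or ball receipt. The helper name `goals_by_team` is hypothetical and the sample events are made up.

```python
def goals_by_team(events):
    """Tally goals per team name from a list of raw event dicts."""
    tally = {}
    for event in events:
        type_name = event.get("type", {}).get("name")
        shot_outcome = event.get("shot", {}).get("outcome", {}).get("name")
        is_shot_goal = type_name == "Shot" and shot_outcome == "Goal"
        # Assumed event type name; verify it exists in the data you use.
        is_own_goal_for = type_name == "Own Goal For"
        if is_shot_goal or is_own_goal_for:
            team = event["team"]["name"]
            tally[team] = tally.get(team, 0) + 1
    return tally


# Made-up events: one scored shot, one saved shot, and one own goal pair.
events = [
    {"type": {"name": "Shot"}, "team": {"name": "Team A"},
     "shot": {"outcome": {"name": "Goal"}}},
    {"type": {"name": "Shot"}, "team": {"name": "Team A"},
     "shot": {"outcome": {"name": "Saved"}}},
    {"type": {"name": "Own Goal Against"}, "team": {"name": "Team A"}},
    {"type": {"name": "Own Goal For"}, "team": {"name": "Team B"}},
]
print(goals_by_team(events))
```

Only the "For" side of the pair is counted, so an own goal is not tallied twice. Processing the events in order and recording the tally after each one would give the running game state.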
Hi,
For matches in the open data season of Barcelona matches, sb.matches(11, 90), I'm getting an error when trying to retrieve the 360 frames, even though they are said to be available in the matches dataframe.
Example below for match id 3773565:
from statsbombpy import sb
mid = 3773565
sb.frames(mid)
>raise HTTPError(http_error_msg, response=self)
>requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/statsbomb/open-data/master/data/three-sixty/3773565.json
Is this a known issue?
Hello,
It appears the matches data set is missing La Liga matches before September 2019. Data for match weeks 1-4 are not found in the json file.
Will this be updated soon?
thanks
There are occasions where player 8563 is mistaken for 3921 (e.g. match 7522) which suggests you have a mismapped player in your DB.
The documentation for shot events specifies a boolean attribute 'open_goal' but in the data I have been using (WSL 2018/19 & 2019/20) this does not appear to be present in the event data
Hi! Before the data update there was a column counterpress == TRUE in the Events open data; has that been removed from the free datasets, or has it been renamed or nested in a different location?
When I unzip the file, an error message pops up. I tried different programs and the problem remains.
Thank you in advance
Reading the documentation pretty thoroughly I understand the way the pitch coordinates work, however (and apologies if I've just missed something here) I can't find anything which clarifies which direction each team is playing in.
Is the home team always playing from 0 -> 120? Does the home team always start playing from 0 -> 120 but then this is flipped at half time? Is there a data point which contains which direction each team is playing in?
My particular use case is determining whether a pass is going into the attacking third.
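A sketch for this use case under an unconfirmed assumption: that every event's coordinates are expressed from the acting team's point of view, attacking toward x = 120 in both halves, so that no per-half flip is needed. If that does not hold, the direction of play would have to be derived first. On the 120-unit pitch length the attacking third would then start at x = 80. The function name `pass_enters_attacking_third` is hypothetical.

```python
ATTACKING_THIRD_X = 80.0  # two thirds of the 120-unit pitch length


def pass_enters_attacking_third(start, end):
    """True if a pass starts outside the attacking third and ends inside it.

    `start` and `end` are [x, y] locations; only x matters here.
    Assumes the passing team attacks toward x = 120.
    """
    return start[0] < ATTACKING_THIRD_X <= end[0]


print(pass_enters_attacking_third([70.0, 40.0], [90.0, 35.0]))
print(pass_enters_attacking_third([85.0, 40.0], [95.0, 35.0]))
```

The second call is a pass played entirely within the final third, which this definition deliberately excludes.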
Data about matches from Ligue1 2015-2016 is empty (since last commit): master/data/matches/7/27.json
Event id "6a793934-d6d0-4e0e-a09c-7c0a69f67a0f" appears to encode the corner taken in this video: https://youtu.be/rT1pTkBGcUU?t=3328 at YouTube timestamp 55:28 and match timestamp 53:57.
The footage does not show the corner taking place as it is taken during a broadcast replay.
However, the location encoded is [115.0, 45.0] which does not represent a location near enough to the corner point. The corner in the footage is seemingly taken by Iniesta, but the pass is attributed to Xavi.
Reproducible in Python:
from statsbombpy import sb
df = sb.events(69289)
df[df['id']=='6a793934-d6d0-4e0e-a09c-7c0a69f67a0f'][['id', 'minute', 'second', 'pass_type', 'player', 'location', 'pass_end_location']]
Specifically:
For illustrative purposes, detailed changes can be seen in https://github.com/oznogon/open-data/pull/1, but the license on this data appears to preclude directly contributing to it.
Some fields use hyphens instead of underscores for variable names and certain fields (e.g. 'off_camera') aren't described at all.
Copy & pasting the code from the "Working-with-R.pdf" document from StatsBomb,
library(tidyverse)
library(StatsBombR)
Comp <- FreeCompetitions() %>%
filter(competition_id==37 & season_name=="2020/2021")
Matches <- FreeMatches(Comp)
StatsBombFreeEvents(MatchesDF = Matches, Parallel = T)
produces the error
Error in if (MatchesDF == "ALL") { : the condition has length > 1
which is odd, especially since I didn't type MatchesDF=="ALL", but maybe I'm not understanding the error message. In any case, the tutorial code is not working.
Good morning,
has any of you encountered this error while trying to load this file into Power BI?
Many thanks,
Mich
Some matches don't have information about the managers who were in charge.
FreeMatches(Comp) can't get UCL free data because the files in the repo are blank.
https://github.com/statsbomb/open-data/tree/master/data/matches/16