
Comments (4)

ElSaico commented on May 27, 2024

Those fields described by @JoGall are generated by the data cleaning functions of https://github.com/statsbomb/StatsBombR.

I've got a decent understanding of them because I'm nearly done porting them all to Python (check out https://github.com/ElSaico/pyStatsBomb in the next few days; I'll owe you all the API functionality because I lack the necessary $resources$ to access it).

Shots

Shot5, Shot6, etc. seem to be leftovers of an earlier import glitch that has already been fixed: statsbomb/StatsBombR@2e38647

All distance variables use the same unit as the positions, i.e. they're scaled to a 120x80 pitch.
DistToGoal is exactly what the name implies, but DistToKeeper refers, counter-intuitively, to the distance between the keeper and the goal (!). The distance between the shot and the keeper is in DistSGK.
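
To make the three distances concrete, here is a minimal sketch in Python. It is not the StatsBombR code: it assumes the goal centre sits at (120, 40) on the 120x80 pitch, and it assumes DistSGK is measured from the shot to the goalkeeper, as the name suggests. The function names `dist` and `shot_distances` are made up for illustration.

```python
import math

# Assumption: centre of the attacking goal on a 120x80 pitch.
GOAL_CENTRE = (120.0, 40.0)

def dist(a, b):
    """Euclidean distance between two (x, y) points, in pitch units."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def shot_distances(shot, keeper):
    """Hypothetical helper returning the three distance variables."""
    return {
        "DistToGoal": dist(shot, GOAL_CENTRE),      # shot -> goal centre
        "DistToKeeper": dist(keeper, GOAL_CENTRE),  # keeper -> goal centre
        "DistSGK": dist(shot, keeper),              # shot -> keeper (assumed)
    }

distances = shot_distances(shot=(108.0, 40.0), keeper=(118.0, 40.0))
```

For a shot at (108, 40) with the keeper at (118, 40), this gives a small DistToKeeper (2 units) next to a much larger DistToGoal (12 units), which is why DistToKeeper can look lower than expected.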

All angular variables are in degrees. AngleToGoal and AngleToKeeper are the opening angles formed by DistToGoal and DistToKeeper, respectively, while AngleDeviation is the opening angle between both.

Freeze frames

density and density.incone are both described in the README:

  • Density is calculated as the aggregated inverse distance for each defender behind the ball.
  • Density in the cone is the density filtered for only defenders who are in the cone between the shooter, and each goal post.
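
A rough Python sketch of those two definitions follows. It rests on several assumptions that the README wording does not settle: "behind the ball" is taken to mean closer to the goal line than the shot (larger x), the posts are placed at (120, 36) and (120, 44), and the cone is the triangle between the shooter and both posts. The names `in_cone` and `density` are illustrative only.

```python
import math

# Assumption: goal-post coordinates on a 120x80 pitch.
POSTS = ((120.0, 36.0), (120.0, 44.0))

def _cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_cone(shot, point):
    """True if point lies inside the triangle shot / post / post."""
    d1 = _cross(shot, POSTS[0], point)
    d2 = _cross(POSTS[0], POSTS[1], point)
    d3 = _cross(POSTS[1], shot, point)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

def density(shot, defenders, cone_only=False):
    """Sum of inverse distances to defenders behind the ball."""
    total = 0.0
    for d in defenders:
        if d[0] <= shot[0]:  # assumed meaning of "behind the ball"
            continue
        if cone_only and not in_cone(shot, d):
            continue
        total += 1.0 / math.hypot(d[0] - shot[0], d[1] - shot[1])
    return total

shot = (100.0, 40.0)
defenders = [(110.0, 40.0), (105.0, 70.0), (90.0, 40.0)]
all_density = density(shot, defenders)
cone_density = density(shot, defenders, cone_only=True)
```

In this example only the defender at (110, 40) is inside the cone, the one at (105, 70) counts towards the plain density only, and the one at (90, 40) is ignored because it is not behind the ball.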

The other variables are:

  • DefendersInCone - number of defending players between the shooter and the goal
  • distance.ToD1 - distance between shooter and nearest defending player
  • distance.ToD2 - distance between shooter and second-nearest defending player
  • InCone.GK - whether the goalkeeper is in the path between the shooter and the goal
  • AttackersBehindBall and DefendersBehindBall - self-explanatory
  • DefArea - area of the smallest square that covers all opposing defenders (which means centre-backs and full-backs only)

All variables exclude the defending goalkeeper, except obviously for InCone.GK.
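
As an illustration of distance.ToD1 and distance.ToD2, here is a small Python sketch. It assumes the caller has already removed the goalkeeper from the freeze frame, and the function name `nearest_defender_distances` is hypothetical, not something taken from StatsBombR.

```python
import math

def nearest_defender_distances(shot, defenders):
    """Return (distance.ToD1, distance.ToD2) in pitch units.

    `defenders` is a list of (x, y) positions without the goalkeeper.
    A value is None when there are not enough defenders in the frame.
    """
    dists = sorted(math.hypot(x - shot[0], y - shot[1]) for x, y in defenders)
    d1 = dists[0] if len(dists) > 0 else None
    d2 = dists[1] if len(dists) > 1 else None
    return d1, d2

to_d1, to_d2 = nearest_defender_distances(
    shot=(100.0, 40.0),
    defenders=[(103.0, 44.0), (110.0, 40.0), (100.0, 41.0)],
)
```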

Time

All extra time-related variables are in milliseconds and seem to have pretty descriptive names.

from open-data.

JoGall commented on May 27, 2024

I'd also really like to know what variables like density, density.incone, AngleDeviation, Shot5, Shot6, etc... mean, and whether variables like DistToGoal and DistToKeeper are given in metres or arbitrary pitch units.


JoGall commented on May 27, 2024

Thanks for taking the time for such a detailed reply @ElSaico!

I thought DistToKeeper was much lower than expected, so I wondered if it was given in an unexpected unit of measurement. That makes more sense! For anyone else reading, DistToKeeper is the distance from the GK to the centre of the goal (not the nearest part of the goal line).

I didn't notice density and density.incone in the documentation when I first looked -- seems they'd be very useful for xG models. I haven't seen several of the other variables (e.g. DistSGK, AttackersBehindBall, DefArea) as I don't think they're available in the free data but good to know.

Good luck with pyStatsBomb and making the data accessible to more people!


deepxg commented on May 27, 2024

At some point we'll tidy up StatsBombR and document the inner workings of @YamStats's brain, but for the most part it's provided as is to give people a bit of a leg up using the data. Happy to see issues raised in the other repo for any other improvements. In the meantime, the docs have been updated today, so there shouldn't be anything in the raw data that's not covered now.

