Transfermarkt Transfers (tmtransfers)

All soccer/football club transfers from the 1992/93 through 2020/21 seasons for 10 of the top European leagues:

  1. Premier League 🏴󠁧󠁢󠁥󠁮󠁧󠁿
  2. La Liga 🇪🇸
  3. Bundesliga 🇩🇪
  4. Serie A 🇮🇹
  5. Ligue 1 🇫🇷
  6. Primeira Liga 🇵🇹
  7. Eredivisie 🇳🇱
  8. Premier Liga* 🇷🇺
  9. Jupiler Pro League* 🇧🇪
  10. Scottish Premiership* 🏴󠁧󠁢󠁳󠁣󠁴󠁿

Data were obtained by scraping league transfer data from Transfermarkt.

* Transfermarkt does not provide data for the 2011/12 Premier Liga season, the 1992/93 and 1993/94 Jupiler Pro League seasons, or the 1992/93–2002/03 Scottish Premiership seasons.
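
The gaps listed in this footnote matter when looping over seasons. The following is a minimal sketch of how they could be encoded, assuming a season is identified by its starting year (the example output further down shows `2020` for 2020/21). The names `MISSING_SEASONS` and `available_seasons` are hypothetical and not part of tmtransfers.

```python
from typing import Dict, List, Set

# Seasons (by starting year) for which Transfermarkt provides no data,
# taken from the footnote above. The keys are display names, not script inputs.
MISSING_SEASONS: Dict[str, Set[int]] = {
    "Premier Liga": {2011},
    "Jupiler Pro League": {1992, 1993},
    "Scottish Premiership": set(range(1992, 2003)),  # 1992/93 through 2002/03
}


def available_seasons(league: str, first: int = 1992, last: int = 2020) -> List[int]:
    """Return the starting years covered for a league, skipping known gaps."""
    missing = MISSING_SEASONS.get(league, set())
    return [year for year in range(first, last + 1) if year not in missing]
```

For a league without gaps this yields every year from 1992 to 2020; for the Scottish Premiership the list starts at 2003.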

Data

All data are provided in the data directory and grouped into season subdirectories. Feel free to use this dataset for your own purposes! You can clone it or download it via DownGit. Consult the README for more information.

Usage

If you'd like to pull the raw data directly from the source or scrape data for other countries and leagues, you can use the Python script provided by tmtransfers.

Setup and running the script

Clone this repository and open a terminal in the cloned folder. First, install the dependencies:

pip install -r requirements.txt

The module can now be run as a script from the top directory:

python -m tmtransfers

This launches a series of text prompts. You should see the following output to start:

Select currency (default is euro):
[1] EUR €
[2] GBP £
[3] USD $
===>

Follow the prompts to enter your desired league parameters. The scraped data will then be written to CSV files in a data directory that the script creates.

As an example, an output CSV for the Premier League's 2020/21 season with the default options and before cleaning should look like:

| club | name | age | nationality | position | short_pos | market_value | dealing_club | dealing_country | fee | movement | window | league | season |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Arsenal FC | Thomas Partey | 27 | Ghana | Defensive Midfield | DM | €40.00m | Atlético Madrid | Spain | €50.00m | in | summer | premier-league | 2020 |
| Arsenal FC | Gabriel | 22 | Brazil | Centre-Back | CB | €20.00m | LOSC Lille | France | €26.00m | in | summer | premier-league | 2020 |
| Arsenal FC | Pablo Marí | 26 | Spain | Centre-Back | CB | €4.80m | Flamengo | Brazil | €5.00m | in | summer | premier-league | 2020 |
| Arsenal FC | Rúnar Alex Rúnarsson | 25 | Iceland | Goalkeeper | GK | €1.20m | Dijon | France | €2.00m | in | summer | premier-league | 2020 |
| Arsenal FC | Cédric Soares | 28 | Portugal | Right-Back | RB | €8.00m | Southampton | England | free transfer | in | summer | premier-league | 2020 |
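
Before cleaning, the `market_value` and `fee` columns hold display strings such as `€40.00m` or `free transfer`. The sketch below shows one way such strings could be turned into whole currency units. It is an illustration only: `parse_money` is a hypothetical helper, the `k` suffix is an assumption (only the `m` suffix appears in the sample above), and `tidy_transfers` may well handle this differently.

```python
import re
from decimal import Decimal
from typing import Optional

# Currency symbol, a decimal number, then an optional magnitude suffix.
# Assumption: values look like "€40.00m" or "€800k"; anything else is non-numeric.
_MONEY = re.compile(r"^[€£$]\s*(\d+(?:\.\d+)?)\s*(k|m)?$", re.IGNORECASE)
_MULTIPLIERS = {"": Decimal(1), "k": Decimal(1_000), "m": Decimal(1_000_000)}


def parse_money(text: str) -> Optional[int]:
    """Convert a display string to whole currency units, or None if not numeric."""
    match = _MONEY.match(text.strip())
    if match is None:
        return None  # e.g. "free transfer"
    number, suffix = match.groups()
    multiplier = _MULTIPLIERS[(suffix or "").lower()]
    # Decimal avoids binary floating-point rounding on values like 4.80.
    return int(Decimal(number) * multiplier)
```

Returning `None` for text like `free transfer` leaves the decision of how to treat non-numeric fees (zero, missing, or a separate flag) to the caller.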

Note: If you run the script again and scrape data for the same league and season, the existing CSV will be overwritten. Move or rename any files you want to keep before running the script again.
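
One way to protect an existing file before a re-run is to rename it with a timestamp. The following is a minimal sketch using only the standard library. `backup_csv` is a hypothetical helper, and the path you pass depends on where the script wrote its output.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_csv(path: Path, stamp: Optional[str] = None) -> Optional[Path]:
    """Rename an existing file to '<stem>.<stamp><suffix>' in the same folder.

    Returns the new path, or None if there was no file to back up.
    """
    path = Path(path)
    if not path.is_file():
        return None
    # Default stamp is the current local time, e.g. 20240131-235959.
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    target = path.with_name(f"{path.stem}.{stamp}{path.suffix}")
    path.rename(target)
    return target
```

Calling this on each CSV you care about before launching the script keeps the old copy next to the new one.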

Using the module

If you'd like to use this module elsewhere, install it from the top directory with

pip install .

It provides two functions, scrape_transfermarkt and tidy_transfers. Use them like so:

import pandas
import tmtransfers

# Web scrape data for a league not explicitly given in the script
# Returns a Pandas dataframe
df = tmtransfers.scrape_transfermarkt(
        league_name='championship',
        league_id='GB2',
        season_id='2005',
        write=True)

# Clean the data
# Returns another Pandas dataframe
tidy_df = tmtransfers.tidy_transfers(df)

See the documentation in tmtransfers.py for more details.
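
To collect several seasons of one league, the call above can be wrapped in a loop. The sketch below takes the scraping function as a parameter so it can be tried without network access. `scrape_seasons` and `delay_seconds` are hypothetical names, and it assumes only the keyword arguments shown in the example above (`league_name`, `league_id`, `season_id`, `write`).

```python
import time
from typing import Any, Callable, Dict, Iterable


def scrape_seasons(
    scrape: Callable[..., Any],
    league_name: str,
    league_id: str,
    season_ids: Iterable[str],
    delay_seconds: float = 1.0,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call `scrape` once per season and return the results keyed by season ID."""
    results: Dict[str, Any] = {}
    for index, season_id in enumerate(season_ids):
        # Pause between requests so the site is not hit in a tight loop.
        if index > 0 and delay_seconds > 0:
            time.sleep(delay_seconds)
        results[season_id] = scrape(
            league_name=league_name,
            league_id=league_id,
            season_id=season_id,
            **kwargs,
        )
    return results
```

With the real module this might be called as `scrape_seasons(tmtransfers.scrape_transfermarkt, 'championship', 'GB2', ['2004', '2005'], write=True)`, and the resulting dataframes could then be stacked with `pandas.concat(list(results.values()))`.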

Note: These functions have been tested only for the leagues and seasons listed above. To scrape other countries and leagues, browse Transfermarkt to find the league_name, league_id, and season_id values to pass.
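
As a starting point for finding those inputs, the sketch below pulls a league name and ID out of a competition URL copied from the browser. It rests on an assumption about Transfermarkt's URL layout: that the first path segment is the league name and that the segment after `wettbewerb` is the league ID, as in `/championship/startseite/wettbewerb/GB2`. `league_params_from_url` is a hypothetical helper, not part of tmtransfers.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def league_params_from_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (league_name, league_id) from a competition URL, or None.

    Assumes a path shaped like /<league_name>/.../wettbewerb/<league_id>.
    """
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if "wettbewerb" not in parts:
        return None
    index = parts.index("wettbewerb")
    # Need a name segment before the marker and an ID segment after it.
    if index == 0 or index + 1 >= len(parts):
        return None
    return parts[0], parts[index + 1]
```

Check the returned pair against the example above (`'championship'`, `'GB2'`) before relying on it for another league.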

Source

All data are scraped from Transfermarkt according to their terms of use.
