outside-edge / python-espncricinfo


Python wrapper for the ESPNCricInfo JSON API

License: MIT License

Python 100.00%

python-espncricinfo's Issues

Change float to int in Player class

Creating a Player object leads to a problem in the _batting_fielding_averages method, on line 84, with num_formats using the / operator, which gives a float instead of the // operator to get an int. This raises an error when it is used to create a range on line 85.
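
A minimal sketch of the difference, for anyone hitting this. The numbers below are made up for illustration; the real num_formats is computed from the parsed player page, and only the choice of division operator matters here.

```python
# Illustrative values only; the real count comes from the parsed player page.
cells = 45       # hypothetical number of table cells
per_row = 15     # hypothetical number of cells per format row

as_float = cells / per_row    # true division always returns a float
as_int = cells // per_row     # floor division of two ints returns an int

error_message = None
try:
    range(as_float)           # range() rejects floats
except TypeError as exc:
    error_message = str(exc)

# With the int result, the range (and the list built from it) works.
format_positions = [per_row * x for x in range(as_int)]
```

Switching the one operator to // should be enough, assuming the cell count is always an exact multiple of the row width.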

Getting an error

Hi,

I tried running scraper.py and got the following error: "must be unicode, not str". Please see the details below.

/Library/Python/2.7/site-packages/beautifulsoup4-4.6.0-py2.7.egg/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 30 of the file scraper.py. To get rid of this warning, change code that looks like this:

BeautifulSoup(YOUR_MARKUP})

to this:

BeautifulSoup(YOUR_MARKUP, "lxml")

Traceback (most recent call last):
File "scraper.py", line 39, in <module>
new_host = unicodedata.normalize('NFKD', new_host).encode('ascii','ignore')
TypeError: must be unicode, not str

I am not sure how best to solve this. Look forward to hearing from you. Thanks!

Hemant
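
One possible workaround, sketched under the assumption that new_host is a byte string at that point (which is what a plain str is on Python 2.7): decode it to text before normalizing. The helper name to_ascii and the UTF-8 default are assumptions, not part of scraper.py.

```python
import unicodedata


def to_ascii(value, encoding="utf-8"):
    """Return an ASCII-only text version of value (hypothetical helper)."""
    # unicodedata.normalize() only accepts text, so decode byte strings first.
    if isinstance(value, bytes):
        value = value.decode(encoding)
    # NFKD splits accented characters into a base letter plus combining marks;
    # encoding to ASCII with 'ignore' then drops the marks.
    normalized = unicodedata.normalize("NFKD", value)
    return normalized.encode("ascii", "ignore").decode("ascii")
```

With this in place, line 39 could call to_ascii(new_host) instead of calling unicodedata.normalize directly.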

How can I show live scores, results, and the upcoming match list?

Issue with match.py when passing match ID = 729303

Describe the bug
Whenever I use match.py with match ID 729303, it throws an error: Expecting value: line 1 column 1 (char 0).
It's a valid match ID. I haven't had any issues with other match IDs so far, but this one throws an error.
Digging deeper, I found that the problem is this URL:
json_url = "https://www.espncricinfo.com/matches/engine/match/{0}.json".format(str(match_id))

For this match ID, the URL returns a 500 Internal Server Error.

I tried json_url with other match IDs and it returns a proper JSON object.

I don't know what's causing the issue with this match. Is it a server-side issue, or am I doing something wrong?

To Reproduce
-> m = Match(729303)
-> m.json

Expected behavior
Match(729303) should work without any error.

Screenshots
Actual Error:

JSON_url for match_id =729303 (NOT WORKING)

JSON_url for match_id =729301 (WORKING)

Additional context
Add any other context about the problem here.
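
Until the server-side cause is known, the "Expecting value" message could at least be replaced with something clearer. The sketch below is not the library's code: MatchJsonError and parse_match_json are hypothetical names, and the function takes the status code and body as plain arguments so it can be tried without any network access.

```python
import json


class MatchJsonError(Exception):
    """Hypothetical error raised when a match's JSON cannot be loaded."""


def parse_match_json(match_id, status_code, body):
    """Parse a match JSON body, raising a descriptive error on failure."""
    if status_code != 200:
        raise MatchJsonError(
            "Match {0}: server returned HTTP {1}".format(match_id, status_code)
        )
    try:
        return json.loads(body)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        raise MatchJsonError(
            "Match {0}: response was not valid JSON".format(match_id)
        ) from exc
```

A caller would pass in the status code and text of whatever HTTP response it received for json_url.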

Series name property

To get the series name, you need to currently do this, Match(match_id).series[0]['series_name'], I think there should be a Match.series_name property instead.
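
Roughly what such a property could look like. MatchSketch is a hypothetical stand-in that only holds a series list; the real Match class fetches its data from Cricinfo, and returning None for a missing series is my assumption.

```python
class MatchSketch:
    """Hypothetical stand-in for Match, holding only a series list."""

    def __init__(self, series):
        self.series = series

    @property
    def series_name(self):
        # Same lookup as series[0]['series_name'], but returns None
        # instead of raising when the list or the key is missing.
        if not self.series:
            return None
        return self.series[0].get("series_name")
```

Usage would then be m.series_name rather than m.series[0]['series_name'].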

Player error float object cannot be interpreted as integer

Code:
player = Player('277916')

Traceback:

Traceback (most recent call last):
File "C:/Users/ehsan/PycharmProjects/ScrapPSLData/CreateCSV_AdvStats.py", line 9, in <module>
print(Player('277916').name)
File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 20, in __init__
self.batting_fielding_averages = self._batting_fielding_averages()
File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 79, in _batting_fielding_averages
format_positions = [15*x for x in range(num_formats)]
TypeError: 'float' object cannot be interpreted as an integer

[BUG] Unable to get a player's stats data

Describe the bug
get_data() method in Player class returns a 403 when it tries to access the html page

To Reproduce
p = Player(47270)
print(p.get_data(None, 1, 'batting', 'match'))

Expected behavior
Should be able to access the HTML page and continue with the parsing process.

Screenshots
I've put a print command to show the contents of the html page

This is what the output and stack trace is.

The Response <403> is the output from printing the HTML.

Additional context
I'm guessing this is espncricinfo blocking odd requests not coming from browsers? Is there a way to get around this?
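
If the 403 really is triggered by the default client User-Agent (a guess, not something confirmed here), one thing to try is sending a browser-like header. The sketch uses only the standard library and builds the request without sending it; the header value and the helper name build_request are assumptions, and there is no guarantee the site will accept it.

```python
from urllib.request import Request

# Assumed browser-like header; the exact value is arbitrary.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def build_request(url, headers=BROWSER_HEADERS):
    """Build (but do not send) a request carrying browser-like headers."""
    return Request(url, headers=dict(headers))
```

The same headers dictionary could be passed to whatever HTTP client get_data() uses internally.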

Provide values for toss decision where json doesn't have them

The JSON for some matches (an example) does not have a value for toss_decision. In these cases we'll probably need to fall back to scraping, since the match summary HTML does have that information.

[BUG] 'pip3 install python-espncricinfo' is installing 0.5.8, instead of 0.6.1 (latest)

Describe the bug
Following the documentation, I tried 'pip3 install python-espncricinfo' and the command is installing 0.5.8, instead of 0.6.1 (latest)

The package index shows the same thing:

$ pip3 index versions python-espncricinfo
WARNING: pip index is currently an experimental command. It may be removed/changed in a future release without prior warning.
python-espncricinfo (0.5.8)
Available versions: 0.5.8, 0.5.7, 0.5.6, 0.5.5, 0.5.4, 0.5.3, 0.5.2, 0.5.1, 0.5.0, 0.4.1, 0.4.0, 0.3.2, 0.3.1, 0.3, 0.2.3, 0.2.2, 0.3.macosx-10.11-intel
INSTALLED: 0.5.8
LATEST: 0.5.8

Match.all_innings no longer working.

Firstly, great repo. It has been very useful in making my small cricket website, thank you.

I have a SQLite database of all player innings in Test and ODI cricket and have been using your module to import the data, using the Match class and then .all_innings to gather all the data from the match. It was definitely working a few weeks ago but is now broken.

It used to produce a dictionary of all the innings in a match but now returns None regardless of the match.

To Reproduce
Steps to reproduce the behavior:

from espncricinfo.match import Match

m = Match(1225248)

# 1225248 is the number for the second test for Eng v Windies 2020

print(m.all_innings)

Expected behavior
It should print out all the info from each innings in the game, including info about each player's individual innings.

Screenshots
Here's a screenshot of the above code in action.

Get Recent Matches for the day

Currently we get matches from the entire week. I would like to get them by day too. I wrote a script for that, which I am currently using and will share here.

Also, summary.py fails when trying to create a Summary object.


Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\site-packages\espncricinfo\summary.py", line 7, in __init__
    self.json = self.get_json()
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\site-packages\espncricinfo\summary.py", line 13, in get_json
    return r.json()
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 896, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Code for getting matches by the day.

    import requests
    from bs4 import BeautifulSoup

    def get_recent_matches_day(date=None):
        # date should be a string in YYYYmmdd format, e.g. "20180814"
        url = "http://www.espncricinfo.com/scores/"
        # Send the date as the ?date= query parameter when one is given
        params = {"date": date} if date else None
        r = requests.get(url, params=params)
        soup = BeautifulSoup(r.text, 'html.parser')
        # Pull the match ID out of each SCORECARD link
        return [x['href'].split('/', 4)[4].split('.')[0] for x in soup.findAll('a', href=True, text='SCORECARD')]

Document Methods and Classes in Wiki and redirect here as official documentation

At the moment, documentation is very scarce for non-developers, who would find it easier to read text than code. I propose we move the current README contents here and structure the Wiki as follows:

|-Home
|-Player
|-Summary
|-Match

P.S: I'd love to work on this if possible!

Converting NoneType to float isn't handled

I found an error on a match that had no result and that ended without the second team facing a ball. I opened match.py and found that the float conversion can cause the script to fail when None is returned. Please use exception handling or whatever you deem fit.
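
A minimal sketch of the kind of guard that would avoid this. The helper name to_float and the choice of None as the fallback are assumptions; match.py may want a different default.

```python
def to_float(value, default=None):
    """Convert value to float, returning default when it cannot be converted."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # TypeError covers None; ValueError covers strings such as "" or "-".
        return default
```

Each bare float(...) call on a possibly missing field could then go through this helper.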

[BUG] Maximum Recursion Error

Describe the bug
A maximum recursion depth error is raised when calling methods within the module.

To Reproduce
Run the sample code on the module's overview page.

Expected behavior
No maximum recursion depth error.

Screenshots
N/A

Additional context
https://stackoverflow.com/questions/67165137/espn-cricinfo-maximum-recursion-depth-exceeded/67166173?noredirect=1#comment118744395_67166173

Please see the following question I asked on StackOverflow with the exact problem I encountered.

Thanks a lot to anyone that can help

Matches scraper

The Match class could use some class methods for scraping upcoming matches, with options for a weekly, monthly, or season view.

method to fetch Commentary for the match

I am trying to see if I can get the full ball-by-ball commentary of the match.

Given a match ID, I thought get_comms_json() would do the trick. Unfortunately, it doesn't return any value.

>>> from espncricinfo.match import Match
>>> m = Match('1175356')
>>> m.match_url
'http://www.espncricinfo.com/matches/engine/match/1175356.html'
>>> m.comms_json
>>>

Is it possible/feasible to get the full commentary in JSON format? If not, I guess I'll have to start working on a scraper. Thanks for your work.

[FeatureRequest] Scraping every scorecard for a given format or team

I am a statistician interested in cricket, and I want to make general claims across entire formats of the game (for example, covering every Test match ever played). Would it be possible to use python-espncricinfo to scrape every scorecard from every Test match?
There are around 2,000 in total, which is too many to do by hand but manageable if we can automatically generate the match_id for every Test match.
If another resource is more appropriate for this type of bulk analysis, I am happy to hear of it.

Improve Match API

I don't think you should pass in an int, you should pass in a string so that you can't accidentally do arithmetic on the Match object (more common than you might think)
The program seems to fail on a few match IDs: 76557 and 74539 both fail for me with a ValueError.

Add players to match instances

For a given match, a user should be able to retrieve players from both teams from the match JSON.

Add over-by-over run tally for matches

Should be possible to parse either the cricinfo html or json to construct this.
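
Once ball-by-ball data has been parsed, the tally itself is a small aggregation. The sketch assumes each ball is a dict with "over" and "runs" keys, which is a made-up shape for illustration and not the actual Cricinfo JSON layout.

```python
def runs_by_over(balls):
    """Sum runs per over from ball records shaped like {"over": int, "runs": int}."""
    tally = {}
    for ball in balls:
        tally[ball["over"]] = tally.get(ball["over"], 0) + ball["runs"]
    # Return (over, runs) pairs in over order.
    return [(over, tally[over]) for over in sorted(tally)]
```

A cumulative score per over could be derived from the returned pairs with a running sum.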

'method' object is not subscriptable

Describe the bug
When I try to get a summary of live matches by creating an instance of the Summary class, I get this error:
TypeError: 'method' object is not subscriptable

Here is my code

    s = Summary()
    print(s.match_ids)

Here is the error

[FeatureRequest] Getting all the ids of the match belonging to a specific series or league (e.g. PSL, IPL)

Add country object

Create a Country class that has methods for grabbing players by first letter of last name, grounds, fixtures (possibly use iCal format), international results, maybe domestic results?.

Add PlayerNotFound error

If 404: http://www.espncricinfo.com/england-v-pakistan-2016/content/player/45788.html

Add additional toss attributes to Match

toss_choice_team_id
toss_decision
toss_decision_name
toss_winner_team_id

Given Match ID and Player ID return statistics of Player in that match

Given a particular Match ID in the Player class, it should return the batting and bowling statistics for that player in that match.
If the player didn't play in that match, it should return a distinct value rather than the defaults of 0/0 (bowling) and 0(0) (batting).

To Decide:

What should the output format look like?
Any clues as to how to start this?

[BUG] All returned URLs need to be http (currently https)

Describe the bug
The url returned by the Match().details_url method has https. The link only works with http.

To Reproduce

>>> from espncricinfo.match import Match
>>> m = Match('1175356')
>>> m.details_url
'https://core.espnuk.org/v2/sports/cricket/leagues/8048/events/1175356/competitions/1175356/details?page_size=1000&page=1'

Expected behavior

>>> m.details_url
'http://core.espnuk.org/v2/sports/cricket/leagues/8048/events/1175356/competitions/1175356/details?page_size=1000&page=1'

Espn website used for JSON object not available

Hi! Amazing work with this project. Although I haven't quite understood how to use it entirely, I tried to read through the code, and found this in 'player.py'

self.json_url = "https://core.espnuk.org/v2/sports/cricket/athletes/{0}".format(str(player_id))

This URL isn't available anymore it seems. Would you happen to know any alternate websites or sources where we could get player data packaged in a JSON?

Scraper for player info

Create a Player class that parses details from player pages like this one and returns details about the player and performance statistics.

IndexError: list index out of range in player.py

Here's the traceback, as you can see the problem arises when trying to get player data with id '10125'.
Probably some unhandled exception.

Traceback (most recent call last):
  File "C:/Users/ehsan/PycharmProjects/ScrapPSLData/test_player.py", line 2, in <module>
    p = Player('10125')
  File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 34, in __init__
    self.recent_matches = self._recent_matches()
  File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 223, in _recent_matches
    table = self.parsed_html.findAll('table', class_='engineTable')[3]
IndexError: list index out of range
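
The traceback suggests the page for this player has fewer than four engineTable tables, so indexing with [3] fails. A guard along these lines would avoid the crash; nth_or_none is a hypothetical helper, and what _recent_matches should return when the table is missing is left open.

```python
def nth_or_none(items, index):
    """Return items[index], or None when the index is out of range."""
    if -len(items) <= index < len(items):
        return items[index]
    return None
```

The table lookup could then check for None before parsing rows.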

[BUG] Get KeyError: 'props' when trying to run Summary()

When I run the following code, I get the traceback below.

>>> from espncricinfo.summary import Summary
>>> s = Summary()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/John/opt/anaconda3/lib/python3.7/site-packages/espncricinfo/summary.py", line 12, in __init__
    self.match_ids = self._match_ids()
  File "/Users/John/opt/anaconda3/lib/python3.7/site-packages/espncricinfo/summary.py", line 30, in _match_ids
    matches = [x['id'] for x in self.summary_json()['props']['pageProps']['data']['content']['leagueEvents'][0]['matchEvents']]
KeyError: 'props'

I'm not certain what I'm doing wrong here, can someone help me out?
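
The KeyError means the JSON no longer has the nested shape that line 30 expects. A defensive lookup like the one below would at least fail softly. The helper name dig is an assumption, and the sketch does not say where the data moved to in the new page layout.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts/lists by key or index, returning default on any miss."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current
```

For example, dig(payload, 'props', 'pageProps', 'data', 'content', 'leagueEvents', 0, 'matchEvents', default=[]) returns an empty list when any level is absent.
<test>
payload = {"props": {"pageProps": {"data": {"content": {"leagueEvents": [{"matchEvents": [{"id": 1}]}]}}}}}
path = ("props", "pageProps", "data", "content", "leagueEvents", 0, "matchEvents")
assert dig(payload, *path) == [{"id": 1}]
assert dig({}, *path, default=[]) == []
assert dig({"props": None}, "props", "pageProps") is None
assert dig({"props": {"list": []}}, "props", "list", 0, default="x") == "x"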

[BUG] get_comms_json not working

from espncricinfo.match import Match
m = Match('1363448')
match_json = m.get_comms_json()
print(match_json)

I am not sure why this isn't working. This match ID produces results for all scorecard functions, and there is commentary for this match (https://www.espncricinfo.com/series/ireland-in-sri-lanka-2023-1363445/sri-lanka-vs-ireland-2nd-test-1363448/ball-by-ball-commentary)

Thanks

Error: Float object cannot be interpreted as integer

Hi,
I am trying to simulate data from the DCC model and I have to create matrices with some formulas. I am trying to create the first matrix, mQ, but I am getting the same error all the time. This is (part of) my code:

iN = 10000
dOmega = 0.1
dAlpha = 0.05
dBeta = 0.94
vEps = np.random.normal(size=iN);

def computeMQ(iN, dAlpha, dBeta, mS, vEps, mQ):
    """
    Purpose:
        Compute mQ

    Inputs:
        ...

    Return value:
        mQ     correlation driving process
    """

    # Define mQ
    mQ = np.zeros(shape=(iN,iN))

    # Compute mQ

    for i in range(iN):
        for i in range(iN):
            mQ[i,i] = np.ndarray(1 - dAlpha - dBeta) * mS + np.ndarray(dAlpha * (vEps[i-1] * np.transpose(vEps[i-1]))) + (np.ndarray(dBeta) * mQ[i-1])

I hope someone can help me, because I have no idea what to do!! And I also tried to calculate mQ without using np.ndarray but then sometimes I get the error: setting an array element with a sequence.

I can also upload my whole code if someone would like to see that in order to help me.

Thanks!!

[BUG]Kernel dead in jupyter notebook

Describe the bug

from espncricinfo.summary import Summary
s = Summary()
s.match_ids

Running the above code gives me a "kernel dead, restart" error in Jupyter Notebook every time. Can you please check and let me know? Thanks.

To Reproduce
Steps to reproduce the behavior:

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

[BUG] Player class is not working with Python 3.11.0

Describe the bug
When I use from espncricinfo.player import Player and try to execute p = Player('277916'), it gives me an error.

To Reproduce
Gives me following error:

>>> p = Player('277916')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Singh\Desktop\Alexa Automation\.venv\Lib\site-packages\espncricinfo\player.py", line 14, in __init__
    self.player_information = self._parse_player_information()
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Singh\Desktop\Alexa Automation\.venv\Lib\site-packages\espncricinfo\player.py", line 60, in _parse_player_information
    return self.parsed_html.find_all('p', class_='ciPlayerinformationtxt')
           ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'find_all'

Expected behavior
It should give player information.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

[BUG]

Describe the bug
The Match class has not been working for the past 3 to 5 days. Previously, I used it to extract all the details of a match, such as player names and player IDs.

To Reproduce
Steps to reproduce the behavior:

Expected behavior
I expect to get the details of all the players. I have attached screenshots of the code.

Screenshots

Additional context
Add any other context about the problem here.

A few Player Ids whose Object creation fails

['874261', '878035', '749957', '858807', '515870', '743645', '515904', '878055', '743647', '580662', '646213', '878025', '885815', '517573', '874173', '1008089']

The list above contains the IDs of players whose object creation fails.

My guess is the problem is in the constructor when Bowling averages are being parsed.
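
To narrow this down, it may help to record which exception each ID raises. This is a generic sketch: factory stands for whatever constructor is under test (for example the Player class), and the function name is made up.

```python
def split_by_constructible(ids, factory):
    """Try factory(id) for each id; return (ok_ids, {failed_id: error_repr})."""
    ok, failed = [], {}
    for item_id in ids:
        try:
            factory(item_id)
        except Exception as exc:  # broad on purpose: we want to see every failure
            failed[item_id] = repr(exc)
        else:
            ok.append(item_id)
    return ok, failed
```

Calling split_by_constructible(player_ids, Player) would show whether all the failing IDs share the same exception.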

[BUG] Error while accessing the player object

Describe the bug
Error while accessing the player object

To Reproduce
Steps to reproduce the behavior:

from espncricinfo.player import Player
p = Player('277916')

Expected behavior
Should not throw an error and get the player object correctly.

Screenshots

Additional context
Add any other context about the problem here.

Detect reduction in number of overs

The json returned for ball by ball commentary isn't properly managed. The ballLimit field should reduce when a stoppage occurs. But in most of the matches, even the first ball of an interrupted innings shows a reduced ballLimit.

match.toss_decision returns null

Is it just me, or is the toss_decision attribute returning a null string?

Number of 6s and 4s in match file

Hello,

I just wondered if there was a way to get the number of 4s/6s into the innings dict? Or is there a workaround I could use?

Love the package, cheers all.

[BUG] Not able to fetch player information

Describe the bug
player.py is not able to extract player information. It gives the error "'NoneType' object has no attribute 'find_all'".
To Reproduce
Just run the README sample:

from espncricinfo.player import Player
p = Player('277916')
p.name

Expected behavior
It should give name of player.

Screenshots

Additional context
AttributeError Traceback (most recent call last)
in
1 from espncricinfo.player import Player
----> 2 p = Player('277916')
3 p.name

C:\Python38\lib\site-packages\espncricinfo\player.py in __init__(self, player_id)
12 self.parsed_html = self.get_html()
13 self.json = self.get_json()
---> 14 self.player_information = self._parse_player_information()
15 self.cricinfo_id = str(player_id)
16 if self.parsed_html:

C:\Python38\lib\site-packages\espncricinfo\player.py in _parse_player_information(self)
58
59 def _parse_player_information(self):
---> 60 return self.parsed_html.find_all('p', class_='ciPlayerinformationtxt')
61
62 def _name(self):

AttributeError: 'NoneType' object has no attribute 'find_all'
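
The trace shows _parse_player_information() being called before the `if self.parsed_html:` check, so a failed page fetch surfaces as this AttributeError. A guard like the following would avoid the crash, though it does not explain why the HTML came back empty. It is written as a free function for illustration, and returning an empty list is an assumption.

```python
def parse_player_information(parsed_html):
    """Return the player-information paragraphs, or [] when no HTML was parsed."""
    if parsed_html is None:
        return []
    return parsed_html.find_all("p", class_="ciPlayerinformationtxt")
```

The underlying fetch problem would still need its own fix.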

Add Ground object

http://www.espncricinfo.com/ci/content/ground/58899.html

Convert List of Dictionaries to a Single Dictionary

When accessing a player's batting_fielding_averages, we get a list of dictionaries (each with just one key/value pair), which is very annoying because it forces you to do this:
player.batting_fielding_averages[2]['T20Is'] rather than player.batting_fielding_averages['T20Is']. I can fix this wherever I notice it, but this issue should remain open until all such cases are resolved.
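
In the meantime, the list can be flattened on the caller's side. A small sketch, assuming every dict in the list uses a distinct format name as its key (if keys repeat, later entries overwrite earlier ones); the sample values are invented.

```python
def merge_single_key_dicts(rows):
    """Merge a list of dicts into one dict; later keys overwrite earlier ones."""
    merged = {}
    for row in rows:
        merged.update(row)
    return merged


# Invented sample data in the shape described in this issue.
averages = [{"Tests": {"runs": 100}}, {"ODIs": {"runs": 50}}, {"T20Is": {"runs": 25}}]
by_format = merge_single_key_dicts(averages)
```

After this, by_format['T20Is'] works without knowing the list position.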

Python 3 support

Rewrite/update Match class

Is your feature request related to a problem? Please describe.
Cricinfo has added JSON APIs that provide additional information and data that would be useful to have. At the same time, the existing Match class is too brittle and probably contains too many methods that aren't very useful.

Describe the solution you'd like
Add support for additional data available through newer API endpoints (example here), remove any existing methods that no longer work and in general make the Match class slimmer and more maintainable. The batting & bowling stats should not rely on the existing comms_json method.

Add details to Match object

current batting and bowling figures (centre['common']['batting'] and centre['common']['bowling'])
innings summary (centre['common']['innings'] and centre['common']['innings_list'])
fall of wickets (centre['common']['fow'])
Add current_summary (match['current_summary'])
Add followon (match['followon'])
Add present_datetime_local (match['present_datetime_local']) & GMT
Add start_datetime_gmt and local
Add town_name and town_id
Add weather_location_code

Return JSON error message when no match json present

Example: http://www.espncricinfo.com/new-zealand-v-sri-lanka-2015-16/content/match/317876.json
