
python-espncricinfo's People

Contributors

dwillis, ehsan1997, jackma222, rnerd12, sbshah97, scrambldchannel, snyk-bot, soodoku, wally1002

python-espncricinfo's Issues

Change float to int in Player class

Creating a Player object leads to a problem in the _batting_fielding_averages method, on line 84, with num_formats using the / operator, which gives a float instead of the // operator to get an int. This raises an error when it is used to create a range on line 85.
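
The difference between the two operators can be sketched as follows. The name `cell_count` and the value 45 are illustrative assumptions, not taken from player.py; only `num_formats`-style division and the `15 * x` list comprehension come from the reports on this page.

```python
# Hypothetical input: 45 table cells, 15 per format (both values are assumptions).
cell_count = 45

float_formats = cell_count / 15    # true division always returns a float: 3.0
int_formats = cell_count // 15     # floor division of two ints returns an int: 3

try:
    range(float_formats)
    range_error = None
except TypeError as exc:
    # range() rejects floats, which is the error reported above.
    range_error = str(exc)

# With an int, the positions can be built as in the list comprehension from the traceback.
format_positions = [15 * x for x in range(int_formats)]
```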

Getting an error

Hi,

I tried running scraper.py and got the error "must be unicode, not str". Please see the details below.

/Library/Python/2.7/site-packages/beautifulsoup4-4.6.0-py2.7.egg/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 30 of the file scraper.py. To get rid of this warning, change code that looks like this:

BeautifulSoup(YOUR_MARKUP})

to this:

BeautifulSoup(YOUR_MARKUP, "lxml")

Traceback (most recent call last):
File "scraper.py", line 39, in <module>
new_host = unicodedata.normalize('NFKD', new_host).encode('ascii','ignore')
TypeError: must be unicode, not str

I am not sure how best to solve this. Look forward to hearing from you. Thanks!

Hemant
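
One way to avoid this TypeError is to decode byte strings before normalizing them, since `unicodedata.normalize` only accepts text. This is a minimal sketch; the helper name `to_ascii` and the UTF-8 default are assumptions and do not come from scraper.py.

```python
import unicodedata

def to_ascii(value, encoding="utf-8"):
    """Return an ASCII-only byte string for a text or byte-string input."""
    if isinstance(value, bytes):
        # normalize() only accepts text, so decode byte strings first.
        value = value.decode(encoding)
    # NFKD splits accented characters into base letter + combining mark;
    # encoding with "ignore" then drops the non-ASCII marks.
    return unicodedata.normalize("NFKD", value).encode("ascii", "ignore")
```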

Issue with match.py when passing match ID = 729303

Describe the bug
Whenever I use match.py with match ID 729303, it throws an error: Expecting value: line 1 column 1 (char 0).
It's a valid match ID. I haven't had any issue with other match IDs so far, but this one throws an error.
Digging deeper, I found that the request goes to:
json_url = "https://www.espncricinfo.com/matches/engine/match/{0}.json".format(str(match_id))

For this match ID, that URL returns a 500 Internal Server Error.

I tried json_url with other match IDs and it returns a proper JSON object.

I don't know what's causing the issue with this match. Is it any server-side issue or am I doing something wrong?
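
The "Expecting value" message appears when a non-JSON body (such as an HTML error page) is handed to the JSON parser. A caller could check the status code first and raise a clearer error. This is only a sketch: `MatchDataUnavailable` and `parse_match_json` are hypothetical names, not part of python-espncricinfo.

```python
import json

class MatchDataUnavailable(Exception):
    """Hypothetical error type; not part of python-espncricinfo."""

def parse_match_json(status_code, body):
    """Parse a response body as JSON only if the request succeeded."""
    if status_code != 200:
        raise MatchDataUnavailable("server returned HTTP {0}".format(status_code))
    try:
        return json.loads(body)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        raise MatchDataUnavailable("response was not valid JSON: {0}".format(exc))
```

With the requests library, the two arguments would be `r.status_code` and `r.text`.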

To Reproduce
-> m = Match(729303)
-> m.json

Expected behavior
Match(729303) should work without any error.

Screenshots
Actual Error:
7293303_error
JSON_url for match_id =729303 (NOT WORKING)
729303_json
JSON_url for match_id =729301 (WORKING)
729301_json

Series name property

To get the series name, you currently need to write Match(match_id).series[0]['series_name']. I think there should be a Match.series_name property instead.
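
A property along these lines might look like the sketch below. `MatchSketch` is a stand-in class written for illustration; only the `series` list and the `series_name` key are taken from the request above, and returning None for an empty list is an assumption.

```python
class MatchSketch:
    """Stand-in for Match; only the `series` attribute is modelled."""

    def __init__(self, series):
        self.series = series  # list of dicts, as in Match(match_id).series

    @property
    def series_name(self):
        # Return None instead of raising when no series data is present.
        if not self.series:
            return None
        return self.series[0].get("series_name")
```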

Player error float object cannot be interpreted as integer

Code:
player = Player('277916')

Traceback:

Traceback (most recent call last):
File "C:/Users/ehsan/PycharmProjects/ScrapPSLData/CreateCSV_AdvStats.py", line 9, in <module>
print(Player('277916').name)
File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 20, in __init__
self.batting_fielding_averages = self._batting_fielding_averages()
File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 79, in _batting_fielding_averages
format_positions = [15*x for x in range(num_formats)]
TypeError: 'float' object cannot be interpreted as an integer

[BUG] Unable to get a player's stats data

Describe the bug
get_data() method in Player class returns a 403 when it tries to access the html page

To Reproduce
p = Player(47270)
print(p.get_data(None, 1, 'batting', 'match'))

Expected behavior
Should be able to access the html page and continue with the parsing process.

Screenshots
I've put a print command to show the contents of the html page

image

This is what the output and stack trace is.

image

The Response <403> is the output from printing the HTML.

Additional context
I'm guessing this is espncricinfo blocking odd requests not coming from browsers? Is there a way to get around this?
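
If the guess about non-browser requests is right, one thing to try is sending a browser-like User-Agent header. Whether this is enough to avoid the 403 is not confirmed anywhere on this page; the sketch below only shows how such a request is built with the standard library, and the header value is an arbitrary example.

```python
from urllib.request import Request

def build_browser_like_request(url):
    """Build (but do not send) a request carrying a User-Agent header."""
    # The header value is an arbitrary example, not a known-good setting.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-client)"}
    return Request(url, headers=headers)

req = build_browser_like_request("https://www.espncricinfo.com/")
```

Passing `req` to `urllib.request.urlopen` would send it; with the requests library, the same dictionary can be passed as `requests.get(url, headers=headers)`.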

[BUG] 'pip3 install python-espncricinfo' is installing 0.5.8, instead of 0.6.1 (latest)

Describe the bug
Following the documentation, I tried 'pip3 install python-espncricinfo' and the command is installing 0.5.8, instead of 0.6.1 (latest)

The package index shows the same thing:

$ pip3 index versions python-espncricinfo
WARNING: pip index is currently an experimental command. It may be removed/changed in a future release without prior warning.
python-espncricinfo (0.5.8)
Available versions: 0.5.8, 0.5.7, 0.5.6, 0.5.5, 0.5.4, 0.5.3, 0.5.2, 0.5.1, 0.5.0, 0.4.1, 0.4.0, 0.3.2, 0.3.1, 0.3, 0.2.3, 0.2.2, 0.3.macosx-10.11-intel
INSTALLED: 0.5.8
LATEST: 0.5.8

Match.all_innings no longer working.

Firstly, this is a great repo and has been very useful in making my small cricket website. Thank you.

I have a sqlite database of all player innings in Test and ODI cricket and have been using your module to import the data, using the Match class and then .all_innings to gather all the data from the match. It was definitely working a few weeks ago but has now broken.

It used to produce a dictionary of all the innings in a match but now only returns None, regardless of the match.

To Reproduce
Steps to reproduce the behavior:

from espncricinfo.match import Match

m = Match(1225248)

# 1225248 is the number for the second test for Eng v Windies 2020

print(m.all_innings)

Expected behavior
It should print out all the info from each innings in the game, including info about each player's individual innings.

Screenshots
Here's a screenshot of the above code in action.
Screenshot 2020-09-10 at 23 00 39

Get Recent Matches for the day

Currently we get matches from the entire week. I would like to get them by day too. I wrote a script for that, which I am currently using and will share here.

Also, summary.py fails when I try to create a Summary object.


Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\site-packages\espncricinfo\summary.py", line 7, in __init__
    self.json = self.get_json()
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\site-packages\espncricinfo\summary.py", line 13, in get_json
    return r.json()
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 896, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\ehsan\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Code for getting matches by the day.

    import requests
    from bs4 import BeautifulSoup

    def get_recent_matches_day(date=None):
        if date:
            # date format should be YYYYmmdd, e.g. 20180814
            url = "http://www.espncricinfo.com/scores/?date={0}".format(date)
        else:
            url = "http://www.espncricinfo.com/scores/"
        r = requests.get(url)
        soup = BeautifulSoup(r.text, 'html.parser')
        # Take the fifth '/'-separated part of each SCORECARD link and drop its extension.
        return [x['href'].split('/', 4)[4].split('.')[0] for x in soup.findAll('a', href=True, text='SCORECARD')]
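
The YYYYmmdd value mentioned in the comment of the function above can be produced from a date object with `strftime`. This is a small sketch; `build_scores_url` and `SCORES_URL` are hypothetical names, and only the URL and the date format come from the script.

```python
from datetime import date

SCORES_URL = "http://www.espncricinfo.com/scores/"

def build_scores_url(day=None):
    """Return the scores URL, optionally restricted to one day."""
    if day is None:
        return SCORES_URL
    # %Y%m%d yields the YYYYmmdd form, e.g. 20180814.
    return "{0}?date={1}".format(SCORES_URL, day.strftime("%Y%m%d"))
```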

Trying to convert NoneType to float isn't handled

I found an error on a match that had no result and ended without the second team facing a ball. I opened match.py and found that the float conversion makes the script fail when None is returned. Please use exception handling or whatever you deem fit.
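
The exception handling asked for here could take the shape of a small conversion helper. `safe_float` is a hypothetical name and is not part of match.py; which default suits each field is left open.

```python
def safe_float(value, default=None):
    """Convert value to float, returning default when that is impossible."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # float(None) raises TypeError; float("") or float("-") raises ValueError.
        return default
```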

[BUG] Maximum Recursion Error

Describe the bug
Recursion error ("maximum recursion depth exceeded") when calling methods within the module

To Reproduce
Run the sample code in the overview page of the module

Expected behavior
No maximum recursion depth error

Screenshots
N/A

Additional context
https://stackoverflow.com/questions/67165137/espn-cricinfo-maximum-recursion-depth-exceeded/67166173?noredirect=1#comment118744395_67166173

Please see the StackOverflow question linked above, where I describe the exact problem I encountered.

Thanks a lot to anyone that can help

method to fetch Commentary for the match

I am trying to see if I can get the full ball-by-ball commentary of the match.

Given the match ID, I thought get_comms_json() would do the trick. Unfortunately, it doesn't return any value.

>>> from espncricinfo.match import Match
>>> m = Match('1175356')
>>> m.match_url
'http://www.espncricinfo.com/matches/engine/match/1175356.html'
>>> m.comms_json
>>>

Is it possible/feasible to get the full commentary in JSON format? If not, I guess I'll have to start working on a scraper. Thanks for your work.

[FeatureRequest] Scraping every scorecard for a given format or team

I am a statistician interested in cricket, and I would like to make general claims across entire formats of the game (i.e. covering every Test match ever played, for example). Would it be possible to use python-espncricinfo to scrape every scorecard from every Test match?
There are around 2,000 in total, which is too many to do by hand but manageable if we can automatically generate the match_id of every Test match.
If another resource is more appropriate for this type of bulk analysis, I am happy to hear of it.

Improve Match API

  1. I don't think you should pass in an int. You should pass in a string so that you can't accidentally do arithmetic on the Match object (more common than you might think).

  2. The program fails on a few match IDs: 76557 and 74539 both fail for me with a ValueError.
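
One possible reading of the first point is to normalize whatever the caller passes into a digit-only string at the boundary. The helper below is a sketch under that assumption; `normalize_match_id` is a hypothetical name and does not exist in the library.

```python
def normalize_match_id(match_id):
    """Return the match ID as a digit-only string, or raise ValueError."""
    text = str(match_id).strip()
    if not text.isdigit():
        raise ValueError("match ID must contain only digits: {0!r}".format(match_id))
    return text
```

Storing the result as a string means an accidental `match_id + 1` fails immediately with a TypeError instead of silently pointing at a different match.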

'method' object is not subscriptable

Describe the bug
When I try to get a summary of live matches by creating an instance of the Summary class, I get this error:
TypeError: 'method' object is not subscriptable

Here is my code

    s = Summary()
    print(s.match_ids)

Here is the error
image
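
This TypeError generally means that a method was indexed without first being called. Whether that is what happens inside summary.py is an assumption; the class below is a stand-in that only reproduces the message.

```python
class SummarySketch:
    """Stand-in class; not the library's Summary."""

    def summary_json(self):
        return {"matches": ["1", "2"]}

s = SummarySketch()

try:
    s.summary_json["matches"]      # method object indexed without calling it
    message = None
except TypeError as exc:
    message = str(exc)

match_ids = s.summary_json()["matches"]  # calling first returns the dict
```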

Given Match ID and Player ID return statistics of Player in that match

  • Given a particular match ID, the Player class should return the batting and bowling statistics for that player in that match.
  • If the player hasn't played in that match, it should return a different value rather than the default 0/0 (bowling) and 0(0) (batting).

To Decide:

  • What should the output format look like?
  • Any clues as to how to start this?
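
As one possible starting point for the open questions above, the output could be a small named tuple, with None standing for "did not play" so that it cannot be confused with real figures of 0(0) or 0/0. Every name below (`PlayerMatchStats`, its fields, `stats_for_match`, the `scorecard` mapping) is hypothetical.

```python
from typing import Dict, NamedTuple, Optional

class PlayerMatchStats(NamedTuple):
    runs: int
    balls_faced: int
    wickets: int
    runs_conceded: int

def stats_for_match(scorecard: Dict[str, PlayerMatchStats],
                    player_id: str) -> Optional[PlayerMatchStats]:
    """Return the player's figures, or None if the player did not appear."""
    return scorecard.get(player_id)
```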

[BUG] All URLs returned need to be http (currently https)

Describe the bug
The URL returned by Match().details_url uses https. The link only works with http.

To Reproduce

>>> from espncricinfo.match import Match
>>> m = Match('1175356')
>>> m.details_url
'https://core.espnuk.org/v2/sports/cricket/leagues/8048/events/1175356/competitions/1175356/details?page_size=1000&page=1'

Expected behavior

>>> m.details_url
'http://core.espnuk.org/v2/sports/cricket/leagues/8048/events/1175356/competitions/1175356/details?page_size=1000&page=1'
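
Until the library changes, a caller could rewrite the scheme on the returned URL. This is a sketch using the standard library; `force_http` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

def force_http(url):
    """Return url with an https scheme replaced by http; other URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)
```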

Espn website used for JSON object not available

Hi! Amazing work with this project. Although I haven't quite understood how to use it entirely, I tried to read through the code, and found this in 'player.py'

self.json_url = "https://core.espnuk.org/v2/sports/cricket/athletes/{0}".format(str(player_id))

This URL isn't available anymore it seems. Would you happen to know any alternate websites or sources where we could get player data packaged in a JSON?

IndexError: list index out of range in player.py

Here's the traceback. As you can see, the problem arises when trying to get player data for ID '10125'.
It is probably an unhandled exception.

Traceback (most recent call last):
  File "C:/Users/ehsan/PycharmProjects/ScrapPSLData/test_player.py", line 2, in <module>
    p = Player('10125')
  File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 34, in __init__
    self.recent_matches = self._recent_matches()
  File "C:\Users\ehsan\AppData\Roaming\Python\Python36\site-packages\espncricinfo\player.py", line 223, in _recent_matches
    table = self.parsed_html.findAll('table', class_='engineTable')[3]
IndexError: list index out of range
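
The failing line indexes the fourth `engineTable` directly, so any page with fewer tables raises. A guarded lookup is one way to handle that; `nth_or_none` is a hypothetical helper, and what the caller should do with None is left open.

```python
def nth_or_none(items, index):
    """Return items[index], or None when the list is too short."""
    return items[index] if 0 <= index < len(items) else None
```

Applied to the traceback above, the lookup would wrap the result of `findAll('table', class_='engineTable')` with index 3.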

[BUG] Get KeyError: 'props' when trying to run Summary()

When I run the following code, I get the traceback below.

>>> from espncricinfo.summary import Summary
>>> s = Summary()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/John/opt/anaconda3/lib/python3.7/site-packages/espncricinfo/summary.py", line 12, in __init__
    self.match_ids = self._match_ids()
  File "/Users/John/opt/anaconda3/lib/python3.7/site-packages/espncricinfo/summary.py", line 30, in _match_ids
    matches = [x['id'] for x in self.summary_json()['props']['pageProps']['data']['content']['leagueEvents'][0]['matchEvents']]
KeyError: 'props'

I'm not certain what I'm doing wrong here, can someone help me out?
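
The KeyError shows that the returned JSON no longer has a top-level 'props' key, so the long chain of lookups fails on its first step. A defensive lookup that returns a default on any missing step can be sketched as below; `dig` is a hypothetical helper, and the key path is copied from the traceback.

```python
def dig(data, *keys, default=None):
    """Follow keys/indexes into nested dicts and lists; return default on a miss."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current
```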

Error: Float object cannot be interpreted as integer

Hi,
I am trying to simulate data from the DCC model and I have to create matrices with some formulas. I am trying to create the first matrix, mQ, but I am getting the same error all the time. This is (part of) my code:

iN = 10000
dOmega = 0.1
dAlpha = 0.05
dBeta = 0.94
vEps = np.random.normal(size=iN);

def computeMQ(iN, dAlpha, dBeta, mS, vEps, mQ):
    """
    Purpose:
        Compute mQ

    Inputs:
        ...

    Return value:
        mQ     correlation driving process
    """

    # Define mQ
    mQ = np.zeros(shape=(iN,iN))

    # Compute mQ

    for i in range(iN):
        for i in range(iN):
            mQ[i,i] = np.ndarray(1 - dAlpha - dBeta) * mS + np.ndarray(dAlpha * (vEps[i-1] * np.transpose(vEps[i-1]))) + (np.ndarray(dBeta) * mQ[i-1])

I hope someone can help me, because I have no idea what to do! I also tried to calculate mQ without using np.ndarray, but then I sometimes get the error: setting an array element with a sequence.

I can also upload my whole code if someone would like to see that in order to help me.

Thanks!!

[BUG] Kernel dead in Jupyter notebook

Describe the bug

from espncricinfo.summary import Summary
s = Summary()
s.match_ids

Running the above code kills the kernel in Jupyter notebook every time (kernel dead, restart error). Could you please check? Thanks.

[BUG] Player class is not working with Python 3.11.0

Describe the bug
When I use from espncricinfo.player import Player and execute p = Player('277916'), it gives me an error.

To Reproduce
I get the following error:

>>> p = Player('277916')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Singh\Desktop\Alexa Automation\.venv\Lib\site-packages\espncricinfo\player.py", line 14, in __init__
    self.player_information = self._parse_player_information()
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Singh\Desktop\Alexa Automation\.venv\Lib\site-packages\espncricinfo\player.py", line 60, in _parse_player_information
    return self.parsed_html.find_all('p', class_='ciPlayerinformationtxt')
           ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'find_all'

Expected behavior
It should give player information.
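
The traceback shows `parsed_html` being None when `find_all` is called, which suggests the page could not be fetched or parsed. A guard of the following shape would at least avoid the AttributeError. `PlayerPageSketch` is a stand-in class, and returning an empty list is an assumption about the desired behaviour.

```python
class PlayerPageSketch:
    """Stand-in for the Player page wrapper; not the library's class."""

    def __init__(self, parsed_html):
        self.parsed_html = parsed_html  # None when the page could not be fetched

    def parse_player_information(self):
        if self.parsed_html is None:
            # Return an empty list rather than raising AttributeError.
            return []
        return self.parsed_html.find_all("p", class_="ciPlayerinformationtxt")
```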

[BUG]

Describe the bug
The Match class has not been working for the past 3 to 5 days. Previously, I used it to extract all the details of a match, such as player names and player IDs.

Expected behavior
I expect to get the details of all the players. I have attached screenshots of the code.

Screenshots
image
image
image
image

A few Player Ids whose Object creation fails

['874261', '878035', '749957', '858807', '515870', '743645', '515904', '878055', '743647', '580662', '646213', '878025', '885815', '517573', '874173', '1008089']

The above is the list of player IDs for which object creation fails.

My guess is that the problem is in the constructor, where the bowling averages are parsed.
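
A list like this can be gathered, along with the exception each ID raises, by a small diagnostic sweep. The sketch below is generic: `find_failing_ids` is a hypothetical helper, and `factory` is whatever callable builds the object.

```python
def find_failing_ids(ids, factory):
    """Call factory(id) for each ID and collect the ones that raise."""
    failures = {}
    for item_id in ids:
        try:
            factory(item_id)
        except Exception as exc:  # broad on purpose: this is a diagnostic sweep
            failures[item_id] = "{0}: {1}".format(type(exc).__name__, exc)
    return failures
```

With the library installed, `factory` could be the `Player` class itself, and the recorded exception names would help confirm or rule out the guess about bowling averages.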

[BUG] Error while accessing the player object

Describe the bug
Error while accessing the player object

To Reproduce
Steps to reproduce the behavior:

from espncricinfo.player import Player
p = Player('277916')

Expected behavior
Should not throw an error and get the player object correctly.

Screenshots
image

Detect reduction in number of overs

The json returned for ball by ball commentary isn't properly managed. The ballLimit field should reduce when a stoppage occurs. But in most of the matches, even the first ball of an interrupted innings shows a reduced ballLimit.
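
If the commentary were available as a list of per-delivery records, a reduction could be detected by looking for the point where ballLimit changes. The sketch below assumes a list of dicts that each carry a `ballLimit` key; that structure is an assumption, not the documented shape of the JSON.

```python
def ball_limit_changes(deliveries):
    """Return (index, old_limit, new_limit) for each change in ballLimit."""
    changes = []
    for i in range(1, len(deliveries)):
        old = deliveries[i - 1]["ballLimit"]
        new = deliveries[i]["ballLimit"]
        if new != old:
            changes.append((i, old, new))
    return changes
```

An empty result for an innings known to have been interrupted would match the behaviour described above, where the reduced limit is already present on the first ball.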

Number of 6s and 4s in match file

Hello,

I just wondered if there was a way to get the number of 4s/6s into the innings dict? Or is there a workaround I could use?

Love the package, cheers all.

[BUG] Not able to fetch player information

Describe the bug
player.py is not able to extract player information. It is giving error - "'NoneType' object has no attribute 'find_all'"
To Reproduce
Just run the README sample:

from espncricinfo.player import Player
p = Player('277916')
p.name

Expected behavior
It should give name of player.

Screenshots
image
Additional context
AttributeError Traceback (most recent call last)
in <module>
1 from espncricinfo.player import Player
----> 2 p = Player('277916')
3 p.name

C:\Python38\lib\site-packages\espncricinfo\player.py in __init__(self, player_id)
12 self.parsed_html = self.get_html()
13 self.json = self.get_json()
---> 14 self.player_information = self._parse_player_information()
15 self.cricinfo_id = str(player_id)
16 if self.parsed_html:

C:\Python38\lib\site-packages\espncricinfo\player.py in _parse_player_information(self)
58
59 def _parse_player_information(self):
---> 60 return self.parsed_html.find_all('p', class_='ciPlayerinformationtxt')
61
62 def _name(self):

AttributeError: 'NoneType' object has no attribute 'find_all'

Convert List of Dictionaries to a Single Dictionary

When accessing a player's batting_fielding_averages, we get a list of dictionaries (each with just one key-value pair), which is very annoying because it forces you to write
player.batting_fielding_averages[2]['T20Is'] rather than player.batting_fielding_averages['T20Is']. I can fix this in the places where I observe it, but this issue should remain open until all such cases are solved.
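
The conversion itself is short. The sketch below uses made-up sample data; `merge_single_key_dicts` is a hypothetical helper, and the inner `runs` values are placeholders rather than real figures.

```python
def merge_single_key_dicts(items):
    """Merge a list of dicts into one dict; later keys overwrite earlier ones."""
    merged = {}
    for item in items:
        merged.update(item)
    return merged

# Placeholder data in the list-of-dicts shape described above.
averages = [{"Tests": {"runs": 10}}, {"ODIs": {"runs": 20}}, {"T20Is": {"runs": 30}}]
by_format = merge_single_key_dicts(averages)
```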

Rewrite/update Match class

Is your feature request related to a problem? Please describe.
Cricinfo has added JSON APIs that provide additional information and data that would be useful to have. At the same time, the existing Match class is too brittle and probably contains too many methods that aren't very useful.

Describe the solution you'd like
Add support for additional data available through newer API endpoints (example here), remove any existing methods that no longer work and in general make the Match class slimmer and more maintainable. The batting & bowling stats should not rely on the existing comms_json method.

Add details to Match object

  • current batting and bowling figures (centre['common']['batting'] and centre['common']['bowling'])
  • innings summary (centre['common']['innings'] and centre['common']['innings_list'])
  • fall of wickets (centre['common']['fow'])
  • Add current_summary (match['current_summary'])
  • Add followon (match['followon'])
  • Add present_datetime_local (match['present_datetime_local']) & GMT
  • Add start_datetime_gmt and local
  • Add town_name and town_id
  • Add weather_location_code
