gingeleski / odds-portal-scraper

Sports odds and results scraping for Odds Portal (oddsportal.com).

License: The Unlicense
Rather than a single scraping project for Odds Portal, I see this shifting (if only temporarily) toward a repo of multiple example scraping projects.
As a result of issue #4
(This is for full_scraper)
There is currently zero fault tolerance if the user's Internet disconnects or is interrupted while scraping.
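One way to add basic fault tolerance would be to wrap each network-bound step in a retry helper with exponential backoff. This is a sketch with a hypothetical helper name, not code from the current repo:

```python
import time

def with_retries(fn, attempts=3, backoff_seconds=2.0):
    """Call fn(), retrying on any exception with exponential backoff.

    Hypothetical helper -- not part of the current codebase. The scraper
    could wrap each page fetch in this so a dropped connection does not
    kill the whole run.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts, surface the real error
            time.sleep(backoff_seconds * (2 ** (attempt - 1)))
```

A caller would pass a zero-argument closure, e.g. `with_retries(lambda: driver.get(url))`.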
Hello, can someone help me?
When I run the code it shows me this message:
[WARNING] Problem with link, could not find Login button - https://www.oddsportal.com/basketball/usa/nba/results/
Thank you so much!
The config directory and config file sports.json appear to be missing. Is this the case?
Got an error:
(venv) c:\Helper\odds-portal-scraper\predictions>python scraper.py
Traceback (most recent call last):
  File "scraper.py", line 52, in main
    username = os.environ['ODDS_PORTAL_USERNAME']
  File "C:\Users\1\AppData\Local\Programs\Python\Python38\lib\os.py", line 673, in __getitem__
    raise KeyError(key) from None
KeyError: 'ODDS_PORTAL_USERNAME'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scraper.py", line 135, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "C:\Users\1\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 612, in run_until_complete
    return future.result()
  File "scraper.py", line 54, in main
    raise RuntimeError('Could not read environment variable ODDS_PORTAL_USERNAME')
RuntimeError: Could not read environment variable ODDS_PORTAL_USERNAME
What should I fix?
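The KeyError means the ODDS_PORTAL_USERNAME environment variable is not set in the shell that launches the script (on Windows, `set ODDS_PORTAL_USERNAME=...` before running; on Linux/macOS, `export ODDS_PORTAL_USERNAME=...`). A friendlier lookup could be sketched like this (hypothetical helper name; check the README for the exact variables the project expects):

```python
import os

def read_credential(name):
    """Read a required credential from the environment with a clear hint.

    Hypothetical helper: the variable name comes from the traceback above.
    """
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(
            f"Environment variable {name} is not set. Set it before running, "
            f"e.g. `set {name}=...` on Windows or `export {name}=...` on Linux/macOS."
        )
    return value
```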
I've re-run some commands using FinalScraper.py which now fail, despite these same commands completing successfully in December.
This is the error I get. The script looks to be cycling through pages and getting this error every time.
Page not found
This page not exist on Oddsportal.com!
Sample commands run:
I note that the oddsportal site looks different visually and has the word "beta" in the logo.
Has the structure of the oddsportal site changed, and hence caused these scraping failures?
Cull requirements.txt of full_scraper/
... got my full pip load.
Been getting an error:
selenium.common.exceptions.WebDriverException: Message: unknown error: DevToolsActivePort file doesn't exist
I think it's because I'm on Linux. Not a huge issue, but I think this might fix it.
Might try it on my local code
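The DevToolsActivePort error usually means Chrome can't start in the environment (headless server, container, or running as root). A commonly suggested workaround is to pass extra Chrome flags; this is a sketch of which flags, assuming the project drives Chrome via Selenium:

```python
def linux_chrome_flags():
    """Flags that commonly fix the DevToolsActivePort error on Linux.

    Sketch only -- adjust for your environment. Pass each flag to
    Selenium's chrome Options via add_argument() before creating the driver.
    """
    return [
        "--headless",               # no display attached on a server
        "--no-sandbox",             # needed when running as root / in containers
        "--disable-dev-shm-usage",  # avoid the small /dev/shm inside Docker
    ]
```

Usage would look like `for flag in linux_chrome_flags(): options.add_argument(flag)` on a `selenium.webdriver.chrome.options.Options` instance.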
(This is for full_scraper)
Extend the Scraper.py and Crawler.py classes from a common base ... right now there isn't full compliance with the DRY programming principle.
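The refactor could look something like this. A sketch only: the method names below are assumptions, and the real shared logic (driver setup, page-load waits) would move out of Scraper.py and Crawler.py into the base:

```python
class BaseScraper:
    """Hypothetical common base for the Scraper and Crawler classes."""

    def __init__(self, wait_time_on_page_load=3):
        self.wait_time_on_page_load = wait_time_on_page_load
        self.driver = None

    def start_driver(self):
        # Shared Selenium driver setup would live here.
        raise NotImplementedError

    def stop_driver(self):
        # Shared teardown, safe to call twice.
        if self.driver is not None:
            self.driver.quit()
            self.driver = None


class Scraper(BaseScraper):
    def scrape_league(self, league_url):
        raise NotImplementedError


class Crawler(BaseScraper):
    def collect_season_urls(self, sport_url):
        raise NotImplementedError
```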
Hi,
Firstly, many thanks for your project.
Currently the output for games finishing after OT/penalties is either HOME or AWAY, for whoever wins the match after OT/penalties. Would it be possible to output "DRAW" for the match outcome (e.g. in hockey) when a game is tied after regulation time? The final score would still be reported, but the match outcome would be "DRAW".
Many thanks for your time and consideration!
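The requested behavior could be sketched like this (hypothetical helper, not the project's current output logic):

```python
def match_outcome(home_goals, away_goals, decided_after_regulation=False):
    """Classify a match result, treating OT/shootout wins as DRAW.

    Hypothetical helper: decided_after_regulation would be True when the
    final score was only reached in overtime or a shootout (e.g. hockey),
    meaning the game was tied at the end of regulation time.
    """
    if decided_after_regulation:
        return "DRAW"  # tied after regulation; final score still reported elsewhere
    if home_goals > away_goals:
        return "HOME"
    if away_goals > home_goals:
        return "AWAY"
    return "DRAW"
```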
In the requirements.txt files, joblib==0.13.2 needs to change to joblib==1.1.0, and Cython==0.29.30 needs to be added.

Before making the change, I got this error:
(venv) ➜ full_scraper git:(master) ✗ python op.py --help
Traceback (most recent call last):
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/op.py", line 8, in <module>
    from joblib import delayed
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/__init__.py", line 119, in <module>
    from .parallel import Parallel
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/parallel.py", line 28, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 22, in <module>
    from .executor import get_memmapping_executor
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/executor.py", line 14, in <module>
    from .externals.loky.reusable_executor import get_reusable_executor
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/loky/__init__.py", line 12, in <module>
    from .backend.reduction import set_loky_pickler
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/loky/backend/reduction.py", line 125, in <module>
    from joblib.externals import cloudpickle  # noqa: F401
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/cloudpickle/__init__.py", line 3, in <module>
    from .cloudpickle import *
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/cloudpickle/cloudpickle.py", line 152, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/cloudpickle/cloudpickle.py", line 133, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
After making the change, the error went away:
(venv) ➜ full_scraper git:(master) ✗ python op.py --help
usage: op.py [-h] [--number-of-cpus [NUMBER_OF_CPUS]]
             [--wait-time-on-page-load [WAIT_TIME_ON_PAGE_LOAD]]

oddsporter v1.0

optional arguments:
  -h, --help            show this help message and exit
  --number-of-cpus [NUMBER_OF_CPUS]
                        Number parallel CPUs for processing (default -1 for max available)
  --wait-time-on-page-load [WAIT_TIME_ON_PAGE_LOAD]
                        How many seconds to wait on page load (default 3)
Is it possible to scrape upcoming matches, for example: http://www.oddsportal.com/soccer/germany/bundesliga/ ?
Thanks
(This is for full_scraper)
Consecutive league/"sport" runs seem to fail with a Selenium exception
This issue is prompted by an email I received from a fan of this project named Dmitry.
He'd written some scraping code to grab data of a user he follows on Odds Portal who makes private predictions. The results from his code seemed unstable.
I'm tackling this use case out of curiosity for how my approach might differ from his. I haven't looked at this project in several years, really, and write scrapers differently now than I used to. 🥇
The way I see it, whether a user making predictions is public or private doesn't really affect the scraping approach. My proof-of-concept logic should apply either way.
Pseudo-code works out to...
Sign in
Go to your user profile
Go to "following"
Collect list of handicappers you're following
For each handicapper in the list...
Get that handicapper's next predictions - https://www.oddsportal.com/profile/OldTwinTowersFutbol/my-predictions/next/
For each page of predictions...
ACTION: Save screenshot
For each prediction...
ACTION: Get sport
ACTION: Get region
ACTION: Get league
ACTION: Get start time
ACTION: Get game name
ACTION: Get game specifier
ACTION: Get link to the game on Odds Portal
ACTION: Get outcome odds
ACTION: Get picked outcome
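The steps above could be carried by a small data model plus a URL helper. A sketch only: the field names are assumptions, and the URL pattern is taken from the example link above:

```python
from dataclasses import dataclass

BASE = "https://www.oddsportal.com"

@dataclass
class Prediction:
    """One scraped prediction row (field names are illustrative)."""
    sport: str
    region: str
    league: str
    start_time: str
    game_name: str
    game_specifier: str
    game_link: str
    outcome_odds: str
    picked_outcome: str

def next_predictions_url(handle):
    """Build the 'next predictions' URL for a handicapper profile."""
    return f"{BASE}/profile/{handle}/my-predictions/next/"
```

The per-handicapper loop would then call `next_predictions_url(handle)`, screenshot each page, and emit one `Prediction` per row.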
Do you have any plans to add a feature that allows a user to scrape the actual odds offered by each bookmaker (e.g. each entry here), versus the average odds across all bookmakers? Thanks.
Hi, thanks for a great resource. However, I'm having some difficulties with running the scraper. I get the following error. I've tried searching for what could be the problem, but nothing has worked so far.
(venv) 5072AB1C:odds-portal-scraper-master admin$ python /Users/admin/Desktop/odds-portal-scraper-master/run.py
Traceback (most recent call last):
  File "/Users/admin/Desktop/odds-portal-scraper-master/run.py", line 19, in <module>
    match_scraper = Scraper(json_str, initialize_db)
  File "/Users/admin/Desktop/odds-portal-scraper-master/Scraper.py", line 28, in __init__
    self.league = self.parse_json(league_json)
  File "/Users/admin/Desktop/odds-portal-scraper-master/Scraper.py", line 42, in parse_json
    return json.loads(json_str)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
(venv) 5072AB1C:odds-portal-scraper-master admin$
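Two things stand out in the traceback: it is running under the system Python 2.7 (/System/Library/Frameworks/Python.framework/Versions/2.7) rather than a Python 3 interpreter, and "No JSON object could be decoded" is what an empty or missing config file produces. A loader with clearer errors could be sketched like this (hypothetical helper, not the project's current code):

```python
import json

def load_league_config(path):
    """Load a league config JSON file, failing loudly on empty input.

    Hypothetical helper: an empty or missing sports.json is one way to
    hit the "No JSON object could be decoded" ValueError seen above.
    """
    with open(path) as f:
        text = f.read().strip()
    if not text:
        raise ValueError(f"Config file {path} is empty")
    return json.loads(text)
```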
Great Program!