a-maliarov / amazoncaptcha

Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.

License: MIT License

Python 100.00%

captcha captcha-solver amazon python3 pillow amazon-captcha amazon-scraper training-data amazoncaptcha data-extraction

amazoncaptcha's Introduction

  ______                                  ______                      __              __                
 /      \                                /      \                    |  \            |  \               
|  ▓▓▓▓▓▓\______ ____  ________ _______ |  ▓▓▓▓▓▓\ ______   ______  _| ▓▓_    _______| ▓▓____   ______  
| ▓▓__| ▓▓      \    \|        \       \| ▓▓   \▓▓|      \ /      \|   ▓▓ \  /       \ ▓▓    \ |      \
| ▓▓    ▓▓ ▓▓▓▓▓▓\▓▓▓▓\\▓▓▓▓▓▓▓▓ ▓▓▓▓▓▓▓\ ▓▓       \▓▓▓▓▓▓\  ▓▓▓▓▓▓\\▓▓▓▓▓▓ |  ▓▓▓▓▓▓▓ ▓▓▓▓▓▓▓\ \▓▓▓▓▓▓\
| ▓▓▓▓▓▓▓▓ ▓▓ | ▓▓ | ▓▓ /    ▓▓| ▓▓  | ▓▓ ▓▓   __ /      ▓▓ ▓▓  | ▓▓ | ▓▓ __| ▓▓     | ▓▓  | ▓▓/      ▓▓
| ▓▓  | ▓▓ ▓▓ | ▓▓ | ▓▓/  ▓▓▓▓_| ▓▓  | ▓▓ ▓▓__/  \  ▓▓▓▓▓▓▓ ▓▓__/ ▓▓ | ▓▓|  \ ▓▓_____| ▓▓  | ▓▓  ▓▓▓▓▓▓▓
| ▓▓  | ▓▓ ▓▓ | ▓▓ | ▓▓  ▓▓    \ ▓▓  | ▓▓\▓▓    ▓▓\▓▓    ▓▓ ▓▓    ▓▓  \▓▓  ▓▓\▓▓     \ ▓▓  | ▓▓\▓▓    ▓▓
 \▓▓   \▓▓\▓▓  \▓▓  \▓▓\▓▓▓▓▓▓▓▓\▓▓   \▓▓ \▓▓▓▓▓▓  \▓▓▓▓▓▓▓ ▓▓▓▓▓▓▓    \▓▓▓▓  \▓▓▓▓▓▓▓\▓▓   \▓▓ \▓▓▓▓▓▓▓
                                                          | ▓▓                                          
  >>>solution                                             | ▓▓                            Response 0.24s
  "AmznCaptcha"                                            \▓▓                            Accuracy 99.9%

The motivation for this library comes from a genuinely simple idea: "I don't want to use pytesseract or some other non-Amazon-specific OCR service, nor do I want to install executables just to solve a captcha. I want to get a solution with 2 lines of code, without any heavy add-ons, using pure Python."

Installation

You can simply install the library from PyPI using pip. For more methods, check the docs.

pip install amazoncaptcha

Quick Snippet

An example of constructor usage. Scroll down a bit to see some tasty class methods. For consistency across different devices, it is highly recommended to use the fromlink class method.

from amazoncaptcha import AmazonCaptcha

captcha = AmazonCaptcha('captcha.jpg')
solution = captcha.solve()

# Or: solution = AmazonCaptcha('captcha.jpg').solve()

Usage and Class Methods

Browsing Amazon with Selenium and stuck on a captcha? The class method below does all the dirty work of extracting the image from the webpage for you. In practice, it takes a screenshot from your webdriver, crops the captcha, and stores it in a bytes array, which is then used to create an AmazonCaptcha instance. This also avoids saving anything locally. For consistency across different devices, it is highly recommended to use the fromlink class method instead of fromdriver.

from amazoncaptcha import AmazonCaptcha
from selenium import webdriver

driver = webdriver.Chrome() # This is a simplified example
driver.get('https://www.amazon.com/errors/validateCaptcha')

captcha = AmazonCaptcha.fromdriver(driver)
solution = captcha.solve()

If you are not using Selenium, or the previous method does not fit your case, you can use a captcha link directly. This class method requests the URL, checks the content type, and stores the response content in a bytes array to create an AmazonCaptcha instance.

from amazoncaptcha import AmazonCaptcha

link = 'https://images-na.ssl-images-amazon.com/captcha/usvmgloq/Captcha_kwrrnqwkph.jpg'

captcha = AmazonCaptcha.fromlink(link)
solution = captcha.solve()

In addition, if you are a machine learning or neural network developer and are looking for some training data, check this repository, which was created to store images and other non-script data for the solver.

Help the Development

If you are willing to help the development, consider setting the keep_logs argument of the solve method to True. When it is True, the links of all unsolved captchas are stored, so that you can later open an issue and send the logs. Here is an example using the fromdriver class method.

from amazoncaptcha import AmazonCaptcha
from selenium import webdriver

driver = webdriver.Chrome() # This is a simplified example
driver.get('https://www.amazon.com/errors/validateCaptcha')

captcha = AmazonCaptcha.fromdriver(driver)
solution = captcha.solve(keep_logs=True)

If you have suggestions or ideas for additional instances and methods that you would like to see in this library, please feel free to contact the owner via email or fork'n'pull to the repository. Any contribution is highly appreciated!

Additional

If you want to see the History of Changes, Code of Conduct, Contributing Policy, or License, use these inline links to navigate based on your needs.
If you are facing any errors, please, report your situation via an issue.
This project is for educational and research purposes only. Any actions and/or activities related to the material contained in this GitHub repository are solely your responsibility. The author will not be held responsible in the event any criminal charges are brought against any individuals misusing the information in this GitHub repository to break the law.
Amazon is the registered trademark of Amazon.com, Inc. Amazon name used in this project is for identification purposes only. The project is not associated in any way with Amazon.com, Inc. and is not an official solution of Amazon.com, Inc.

amazoncaptcha's People

Contributors

Stargazers

Watchers

Forkers

gangstamilk vadyaizgermanii derekluo bellyfat tawawhite tim-wq longinteger017 rheelme afeng0007 zanachka merry75 ives1990 caixi wangzun67 sunnyjocker iamumairayub daramony steathy saketks2694 www805 acbocz msortur xanrag li155225626 zqlaaa r00mz ahquaa sukhcha-in dvukolovac phoenix-repo proryanator-forks offensivenomad ruanjian2007 rawandahmad698 mm86133 makrorof dda08a shobhits7 pow3632 aijx360 yoohooyoo niravrathod nodisk8800 tienthienhd elijahahianyo hezhengwei92 seidnerj hkervit asilkarbeyaz ptsonev zergey nkrutoholo kareemgamalmahmoud arpitjain799 maxasif luckyboyqing mad-cat-lon ecalose abdelrahman31 sudonorm eunhatbe kanavgupta01 phatdatpq smtamim huochequan johnduarte valujin ouguochong mr0000001 vudev zhenx harrisonpy mrfuhang sumonst21 j-verint zhl19970919 saylorzhu baimao001

amazoncaptcha's Issues

How to post the result of Captcha solved

I am looking for code that will post the captcha solution once solved. I don't see mention of how to properly submit it.

Thanks
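
The README snippets suggest that solve() only returns the solution as a string, so submitting it is left to the caller. Below is a minimal sketch of one way to do that with Selenium. The helper name submit_solution and the default element id "captchacharacters" are assumptions for illustration, not part of the library; check the id against the page you actually receive. The "Not solved" string is taken from the outputs quoted in the issues on this page.

```python
def submit_solution(driver, solution, input_id="captchacharacters"):
    """Type the solved text into the captcha field and submit its form.

    `driver` is any Selenium 4 style WebDriver. The locator strategy is
    passed as the plain string "id", which is the value of selenium's By.ID.
    `input_id` is an assumed element id; adjust it to the real page.
    """
    if solution == "Not solved":
        # The solver gave up, so there is nothing useful to submit.
        return False

    field = driver.find_element("id", input_id)
    field.clear()
    field.send_keys(solution)
    field.submit()  # submits the form that contains the input field
    return True
```

A typical call would be submit_solution(driver, AmazonCaptcha.fromdriver(driver).solve()), retrying with a fresh captcha when it returns False.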

Test Issue for the Stale Bot

Does it work on Amazon's new captcha system

It seems that Amazon has upgraded its captcha system.

Does the code still work?

Chromedriver version 86.* is coming

Update the CI
Test the package

[help]

Describe the bug
I tried to install amazoncaptcha with the pip command, but when I run my .py file I get this error:
line 1, in <module>
from amazoncaptcha import AmazonCaptcha
ModuleNotFoundError: No module named 'amazoncaptcha'

To Reproduce
Steps to reproduce the behavior:
I followed the guide to install it

Desktop (please complete the following information):

OS: [W10]
Browser [chrome]
Version [88]
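
One common cause of this error is that pip installed the package into a different interpreter than the one running the script. That is only an assumption here, since the report does not say how pip was invoked. A small sketch for checking it, using only the standard library (the helper name is_importable is made up for this example):

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python is actually running this script.
    print("Interpreter:", sys.executable)
    print("amazoncaptcha importable:", is_importable("amazoncaptcha"))
    # If this prints False, install with the same interpreter, for example:
    #   python -m pip install amazoncaptcha
```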

Support for another type of Amazon Captcha

Hello! I'm constantly getting the 'Not Solved' issue, I think because the training data used has a different style to the CAPTCHAs I'm getting from Amazon. The ones I get on Amazon have a line in the background of the text. I've attached an image.

Is this something that could be included for a future implementation do you think?
Thank-you so much!

[Feature] update to Pillow 10.0.0

I have some other dependencies that need Pillow 10.0.0, but amazoncaptcha restricts version to <9.6.0.

Any reason to not update this dependency?

Thanks in advance

Submitting solved captcha back

disregard. Thanks

Constantly Not Solved

Paste the code showing which endpoint of the AmazonCaptcha library you are using

from amazoncaptcha import AmazonCaptcha

# Raw string so the backslashes in the Windows path are not treated as escapes
captcha = AmazonCaptcha(r'src\captcha\captcha.png')
solution = captcha.solve()
print(solution)

# Or: solution = AmazonCaptcha('captcha.jpg').solve()

If your endpoint is not ".fromdriver", paste the code of how you download the image

Code goes here

Is it really constant?
How many captchas in a row have given the "Not Solved" result?
about 10 times

Environment (please complete the following information):

OS: Windows 10
Driver [e.g. chrome, safari]
Driver version [e.g. 22]

Update AmazonCaptchaCollector

The current instance of AmazonCaptchaCollector was created at the beginning of the library development and uses outdated methods.

Add NotFolderError exception.
Add the option to the endpoints to decide whether a user wants to store images or just conduct an accuracy test.
Add corresponding logs so the instance can be used as a devtool to run accuracy checks.
Update existent tests for AmazonCaptchaCollector
Add new tests
Add docstrings

Run an accuracy test for version 0.4.*

Update the readme with more precise success percentage information. (Decided to revert)
After the AmazonCaptchaCollector update (#5), run the test and check the accuracy.

Won't launch properly

The project won't work or boot with the code imported. I'm using PyCharm with a virtual environment on a Windows machine. I get the error "can't find '__main__' module in {path to project}" anytime I try to run anything, including the example code segments with selenium.

Can it bypass the AWS puzzle captcha?

The picture below shows the captcha, which is sent by AWS WAF. Is there any way to bypass or solve it?

Make Selenium dependency optional

Would it be possible to make Selenium dependency optional? Currently, the dependency tree for a project that imports amazoncaptcha shows that the majority of packages are coming from Selenium:

$ pipdeptree -p amazoncaptcha
amazoncaptcha==0.5.5
  - pillow [required: ~=9.0.1, installed: 9.0.1]
  - requests [required: ~=2.27.1, installed: 2.27.1]
    - certifi [required: >=2017.4.17, installed: 2021.10.8]
    - charset-normalizer [required: ~=2.0.0, installed: 2.0.11]
    - idna [required: >=2.5,<4, installed: 3.3]
    - urllib3 [required: >=1.21.1,<1.27, installed: 1.26.8]
  - selenium [required: >=3.141,<4.2, installed: 4.1.0]
    - trio [required: ~=0.17, installed: 0.19.0]
      - async-generator [required: >=1.9, installed: 1.10]
      - attrs [required: >=19.2.0, installed: 21.4.0]
      - cffi [required: >=1.14, installed: 1.15.0]
        - pycparser [required: Any, installed: 2.21]
      - idna [required: Any, installed: 3.3]
      - outcome [required: Any, installed: 1.1.0]
        - attrs [required: >=19.2.0, installed: 21.4.0]
      - sniffio [required: Any, installed: 1.2.0]
      - sortedcontainers [required: Any, installed: 2.4.0]
    - trio-websocket [required: ~=0.9, installed: 0.9.2]
      - async-generator [required: >=1.10, installed: 1.10]
      - trio [required: >=0.11, installed: 0.19.0]
        - async-generator [required: >=1.9, installed: 1.10]
        - attrs [required: >=19.2.0, installed: 21.4.0]
        - cffi [required: >=1.14, installed: 1.15.0]
          - pycparser [required: Any, installed: 2.21]
        - idna [required: Any, installed: 3.3]
        - outcome [required: Any, installed: 1.1.0]
          - attrs [required: >=19.2.0, installed: 21.4.0]
        - sniffio [required: Any, installed: 1.2.0]
        - sortedcontainers [required: Any, installed: 2.4.0]
      - wsproto [required: >=0.14, installed: 1.0.0]
        - h11 [required: >=0.9.0,<1, installed: 0.13.0]
    - urllib3 [required: ~=1.26, installed: 1.26.8]

Given that AmazonCaptcha is fully functional without Selenium, and that a wealth of alternatives to Selenium exists for when an automated browser is required (Playwright, pyppeteer), there seems to be an opportunity to make the package lighter. If I understand it correctly, at the moment Selenium is only imported for unit tests. In production, the expectation is for the user to install and import Selenium, then pass the driver object to AmazonCaptcha. Therefore, two separate sets of requirements for production and test (with an optional version compatibility check in production) should do the trick. Thanks!
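
One generic way to make a dependency optional is to import it lazily, only when a feature that needs it is used, and to fail with an actionable message otherwise. The sketch below shows that pattern. It is not how the library is implemented, and the extra name used in the message (amazoncaptcha[selenium]) is hypothetical: no such extra is confirmed to exist.

```python
import importlib


def require_optional(module_name, extra_name):
    """Import an optional dependency, or raise an ImportError that says how to get it.

    `extra_name` is a hypothetical packaging extra used only for the hint text.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is needed for this feature; "
            f"install it with: pip install amazoncaptcha[{extra_name}]"
        ) from exc
```

With this pattern, only code paths that actually need a webdriver would call require_optional("selenium", "selenium"), and users who rely on image files or links would never need Selenium installed.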

The solver doesn't work with modified images of captcha

Hello
I have an image as a NumPy array and save it as a JPEG through PIL. After that I pass the image to the solver, but every time I receive "Not solved".

I can't figure out what the problem is. Images downloaded directly from Amazon work well, so can anyone help me?

PIL version '9.2.0'
python version '3.9.13'

Performance enhancement

Hello

How can I enhance the performance of amazoncaptcha?

I have hundreds of thousands of images, and CPU consumption is high.

Any recommendations or suggestions for other tools?

Thanks

amazoncaptcha 0.5.0 is coming

For users:

Move the captchas folder to a separate repository to reduce the size of this one
Add Python 3.9 support
Add Chromedriver 86.0.4240.22 support
Update AmazonCaptchaCollector
Do an accuracy test after latest
Add documentation.
Add Pillow 8.0.0 support
Minor edits

For developers:

Add Stale Bot to remove stale issues
Add discord notifications to monitor the changes
Update setup.py
Workflow update

Test discord notifications on update in this repository

Not solved

I constantly receive the message "not solved". What should I do?

My Code:

from amazoncaptcha import AmazonCaptcha

captcha = AmazonCaptcha('Captcha.jpg')
solution = captcha.solve()

print(solution)

Output:
>>> Not solved

Change repository name to "amazoncaptcha"

The name of the repository should be updated to amazoncaptcha.

This will require updating all the links within the repository to exclude any incompatibilities.

Update setup.py

Add more classifiers: "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Information Technology", "Topic :: Internet :: WWW/HTTP :: Browsers".
Add a function to remove repository logo from README before pushing to PyPI, since the logo won't be displayed correctly.
Add version file with all the package information
Think about an implementation of auto tracker for the version number.
Slight code style change, defining requires and classifiers lists before usage.

'WebDriver' object has no attribute 'find_element_by_tag_name'

I have two issues, so I'll mention them both here. Using the provided examples I get errors:

fromlink:

requests.exceptions.MissingSchema: Invalid URL '<selenium.webdriver.firefox.webdriver.WebDriver (session="4ff4c007-d259-454a-8b8d-5be914964afd")>': No scheme supplied. Perhaps you meant https://<selenium.webdriver.firefox.webdriver.WebDriver (session="4ff4c007-d259-454a-8b8d-5be914964afd")>?

fromdriver:

AttributeError: 'WebDriver' object has no attribute 'find_element_by_tag_name'

To Reproduce
Steps to reproduce the behavior:

driver = webdriver.Firefox()
driver.get('https://www.amazon.com/errors/validateCaptcha')

captcha = AmazonCaptcha.fromlink(driver)
solution = captcha.solve()

Desktop (please complete the following information):

macOS Ventura
Firefox 112.0.2 (64-bit)

Additional context

Hopefully it's something simple.
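
The two errors have different causes. The MissingSchema error appears because fromlink was given the driver object, while it expects a URL string. The AttributeError appears because Selenium 4.3 removed the find_element_by_* helpers in favour of find_element(By..., value). A hedged workaround sketch follows: read the image URL from the page yourself and pass that to fromlink. The helper name captcha_image_link is made up, and it assumes the captcha is the first img element on the page, which may not hold for every page.

```python
def captcha_image_link(driver):
    """Return the src URL of the first <img> element on the current page.

    Uses the Selenium 4 locator API. "tag name" is the string value of
    selenium's By.TAG_NAME, written out so this sketch has no imports.
    """
    image = driver.find_element("tag name", "img")
    return image.get_attribute("src")
```

The result can then be passed on as AmazonCaptcha.fromlink(captcha_image_link(driver)).solve().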

Add Python 3.9 support

The initial test of whether the package is compatible with 3.9
Solving issues (none was found)
Updating PyPI package (also, grab #13 and #16 here)

No timeout can cause the script to freeze indefinitely

First off, I'd like to thank you for the useful package.

The issue I'm having is that threads in my script will randomly freeze and stop responding after hours or sometimes days. After doing some analysis with gdb, the reason seems to be AmazonCaptcha.fromlink() using requests.get(image_link) without a timeout which might cause the script to wait indefinitely if there's an issue. At its core, I guess the real culprit is network connectivity issues since I'm randomly rotating IP but I don't see any other solution than adding a timeout and error handling in AmazonCaptcha.

Maybe adding a high default timeout of say 60-180 seconds and then raising an exception might be a decent solution?

Traceback (most recent call first):
  File "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake                                                                                   
    self._sslobj.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 411, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1012, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/username/.local/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/home/username/.local/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/amazoncaptcha/solver.py", line 247, in fromlink
    response = requests.get(image_link)
  File "/home/username/script.py", line 139, in fetch_html
    solution = AmazonCaptcha.fromlink(imageurl).solve()
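
Until the library exposes a timeout, one possible workaround is to download the image yourself with an explicit timeout and hand the bytes to the solver. The sketch below is written under assumptions: fetch_image is a made-up helper, http_get stands for a callable with the same shape as requests.get, and passing a file-like object to the AmazonCaptcha constructor is inferred from the README's description of fromlink rather than confirmed.

```python
import io


def fetch_image(link, http_get, timeout=60):
    """Download `link` with an explicit timeout and return the body as BytesIO.

    `http_get` is expected to behave like requests.get: it accepts a
    `timeout` keyword and returns an object with `raise_for_status()` and
    `.content`. With requests, a stalled connection then raises
    requests.exceptions.Timeout instead of blocking forever.
    """
    response = http_get(link, timeout=timeout)
    response.raise_for_status()  # turn HTTP error codes into exceptions
    return io.BytesIO(response.content)
```

Assuming the constructor accepts file-like objects, usage would look like AmazonCaptcha(fetch_image(imageurl, requests.get)).solve(), wrapped in a try/except for requests.exceptions.RequestException so a rotated IP that fails can be retried.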

Update tests

Even though the current tests cover 100% of the code, they should be rewritten in a specific style.

Recheck imports after separating devtools

There is the import of multiprocessing within amazoncaptcha.solver module. It shouldn't be there since AmazonCaptchaCollector was moved to the module amazoncaptcha.devtools

Update PyPi readme file.

The readme file on PyPI currently contains the following information in the Additional block: Just FYI, pip will install only the module itself. However, if you are using git clone, be aware that you will also clone 50 MB of captchas currently located in the repository.

This is outdated, since all the images were moved to a separate repository.

Pillow 8.0.0 is coming on October 15

No changes required based on dev notes at https://pillow.readthedocs.io/en/latest/releasenotes/8.0.0.html, but we'll see.

Constantly Not Solved

Paste the code showing which endpoint of the AmazonCaptcha library you are using

    captcha_link = "https://user-images.githubusercontent.com/17553693/102722542-fd61b780-4301-11eb-9222-f7532a638b6f.jpg"
    captcha = AmazonCaptcha.fromlink(captcha_link)
    solution = captcha.solve()
    print("Captcha solution is: ", solution)

Is it really constant?
How many captchas in a row have given the "Not Solved" result?
All!

Environment (please complete the following information):

OS: macOS 10.15
Driver chrome
Driver version ChromeDriver 87.0.4280.88

Additional context
Hi, I am trying to solve this type of Captcha seen in the attached photo that occurs while logging into an Amazon account. I have tried loading it in different ways and the img is loaded correctly, anyway I always get a "Not Solved." response.

Can't find the training_data in the Temp\\_MEI167922 folder[Bug]

Describe the bug
Can't find the training_data in the Temp\_MEI167922 folder

To Reproduce
Steps to reproduce the behavior:
PyInstaller: 4.3
Python: 3.9.5
Platform: Windows-10-10.0.22621-SP0

Expected behavior
File "amazoncaptcha\solver.py", line 68, in __init__
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\Users\Matt\AppData\Local\Temp\_MEI167922\amazoncaptcha\training_data'

Desktop (please complete the following information):

OS: Windows11
Browser: Chrome

Additional context
When run from PyCharm, the captcha is recognized correctly.
But after packaging the script into an exe with PyInstaller, the amazoncaptcha folder cannot be found in the temporary directory at runtime.

Packaging command: pyinstaller -F "D:\Code\picshot\Picshot_amazon.py"

Code being run:
import amazoncaptcha
solution = amazoncaptcha.AmazonCaptcha('captcha.png').solve(keep_logs=True)
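
The traceback suggests the package's training_data folder was not bundled into the executable. A possible fix, not verified against this exact setup, is to tell PyInstaller to include that folder with its --add-data option, whose value has the form SRC, then os.pathsep, then DEST. The sketch below only builds that value; the helper name add_data_argument is made up, and the destination path is inferred from the path in the error message.

```python
import os


def add_data_argument(package_dir, package_name="amazoncaptcha",
                      data_dir="training_data"):
    """Build the value for PyInstaller's --add-data option.

    The source is the data folder inside the installed package; the
    destination is the same relative location inside the bundle. The two
    parts are joined with os.pathsep (';' on Windows, ':' elsewhere).
    """
    source = os.path.join(package_dir, data_dir)
    destination = f"{package_name}/{data_dir}"
    return f"{source}{os.pathsep}{destination}"


# Example usage (needs the library installed, so shown as comments):
#
# import amazoncaptcha
# value = add_data_argument(os.path.dirname(amazoncaptcha.__file__))
# print(f'pyinstaller -F --add-data "{value}" Picshot_amazon.py')
```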

Should private method AmazonCaptcha._monochrome() be moved to the utils module?

Unable to import Amazon captcha

Hi Team.
I am unable to run from amazoncaptcha import AmazonCaptcha or import amazoncaptcha in Python 3.5.0.

Is there any way to make it compatible with Python 3.5? Kindly assist.

I recently installed Python 3.6, but I still get the same error. Could you help with this?

I am getting the error below:

Could you please help with this issue?

Hey! All the dependencies issues will be fixed starting next year.

The core solving functionality works fine, but I don't have time to look into other issues at the moment.

Remove "return None" from the solver

Function: amazoncaptcha.solver.AmazonCaptcha._find_letters()

Problematic lines:

if (len(letters) == 6 and letters[0].width < MINIMUM_LETTER_LENGTH) or (len(letters) != 6 and len(letters) != 7):
   self.letters = {str(k): Image.new('L', (200, 70)) for k in range(1, 7)}
   return

Assuming the next if only changes the letters list and does not return None, it would be conceptually more consistent to avoid using return None in the block above as well.

Constantly Not Solved

The current issue is that the new version of the Amazon captcha is different from the one on the validateCaptcha site.
If I try to log in, I get this kind of image. The solver can't solve it.

Image

Constantly receiving `Not solved` while using `fromdriver`

The following keeps returning Not solved for me. I have never got any other return.

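from amazoncaptcha import AmazonCaptcha
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager  # import missing from the original report

d = webdriver.Chrome(ChromeDriverManager().install())
# Note: the report never navigates to a captcha page before calling fromdriver
captcha = AmazonCaptcha.fromdriver(d)
print(captcha.solve(keep_logs=True))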

not-solved-captcha.log:
https://images-na.ssl-images-amazon.com/captcha/bfhuzdtn/Captcha_cebmxydbrt.jpg
https://images-na.ssl-images-amazon.com/captcha/perumqgc/Captcha_gaommpndkq.jpg
https://images-na.ssl-images-amazon.com/captcha/rhnrlggh/Captcha_tijaodpupx.jpg
https://images-na.ssl-images-amazon.com/captcha/bysppkyq/Captcha_xroxbnvmrg.jpg

Add external documentation

Giving 'Not Solved' result

Using the fromlink method, I get the 'Not Solved' result repeatedly after many successful results. I tried today: it gave about 50-55 successful results, and after that only 'Not Solved'.

https://opfcaptcha-prod.s3.amazonaws.com/f34d23b16c634795a69ecc38123f5255.jpg?AWSAccessKeyId=AKIA5WBBRBBBZF7P4WZG&Expires=1721622818&Signature=HQj4bBDc%2BKZ86Afh%2FWZRrlryPEo%3D

https://opfcaptcha-prod.s3.amazonaws.com/80f8fb9cc66e4589949f67881fe12006.jpg?AWSAccessKeyId=AKIA5WBBRBBBZF7P4WZG&Expires=1721622820&Signature=lq68Of1r%2Fc%2BnZc8L2Aoa%2B6Tnlhw%3D

https://opfcaptcha-prod.s3.amazonaws.com/c5d9c5971f774ea8883b58e9927d790b.jpg?AWSAccessKeyId=AKIA5WBBRBBBZF7P4WZG&Expires=1721622822&Signature=Ht6ke7djqBNoRi0TaSKw%2BCX3Tss%3D

Question about usage

Hi, potentially silly question here: does solve() actually send the solution back to the website via selenium, or does it just return the solution to be saved into a variable and sent by the user as needed?

Include only json file in manifest for training data folder

Instead of include amazoncaptcha/training_data/*.*, it will be more correct to use include amazoncaptcha/training_data/*.json

[Feature] Could using Pillow-SIMD improve performance?

I was looking at some benchmarks and pillow-simd claims to be a drop in replacement for pillow with better performance. I was wondering if this is something that has been considered to be implemented here.

Curious about the scope of your project

First of all, I think that captcha solving is a really cool application of machine learning techniques. I am interested in making a simple web app to present the results of a data science project carried out using cached Amazon review data from a couple of years ago, when it was easier to scrape the site. The idea of the web app is that a scrape will happen in real time for a particular product, and then the model I have built will be applied to guess which reviews are the most relevant to making an informed purchasing decision. Amazon's anti-scraping defenses are such that it seems I will have to either pay for a VPN to cycle IPs or find a way to robustly solve captchas. Is your tool able to solve captchas and hand the solutions over to Amazon using pyppeteer or some other tool in a way that will satisfy the site and give me access to the source code of the page I actually want to visit? If so, do you have sample code for this? I am very new to Python, and it is not obvious to me how to do this from the sample code presented on your home page. If this is not currently feasible, is there a chance that it could be made so in the near future? I do not need this to work on an industrial scale; the app will only be used sporadically as a proof-of-concept demonstration of the machine learning I have done. If this is not something that you are interested in, no problem. In that case, I would be curious to know what applications you have in mind for your program. I suppose that the challenge of building something which can pass "Turing tests" is a worthy goal in and of itself :)

Workflow update

Add an allow_failures test on python version 3.5.* at Travis-CI
Setup a github action to automatically publish the package on release

Constantly Not Solved

Versions of amazoncaptcha and drivers
[Screenshot: version of drivers]

Is it really constant?
How many captchas in a row have given the "Not Solved" result?
Always. I tried dozens of times but it still doesn't work.

To Reproduce
Steps to reproduce the behavior:
Write this code:

from amazoncaptcha import AmazonCaptcha
from selenium import webdriver
from chromedriver_py import binary_path
import time

chrome_options = webdriver.ChromeOptions()
# Raw string so the backslashes in the Windows path are not treated as escapes
chrome_options.binary_location = r'C:\Program Files\Google\Chrome Beta\Application\chrome.exe'
driver = webdriver.Chrome(executable_path=binary_path, options=chrome_options)

while True:
    driver.get('https://www.amazon.com/errors/validateCaptcha')

    captcha = AmazonCaptcha.fromdriver(driver)
    solution = captcha.solve()
    print(solution)
    time.sleep(5)

Replace the binary location with the path to your chrome.exe file.
Run and see the results. Only the string Not solved will appear.

Screenshots
[Screenshot: version of drivers]
[Screenshot: messages]

Environment (please complete the following information):

OS: [Windows Home 10 version 19041.867]
Driver [Chrome Beta 90.0.4430.51]
Driver version [See screenshot]

Additional context
I have tried with GeckoDriver and Firefox too, but it still does not work.

This is the type of captcha that I always see.
