
amazoncaptcha's Introduction

  ______                                  ______                      __              __                
 /      \                                /      \                    |  \            |  \               
|  ▓▓▓▓▓▓\______ ____  ________ _______ |  ▓▓▓▓▓▓\ ______   ______  _| ▓▓_    _______| ▓▓____   ______  
| ▓▓__| ▓▓      \    \|        \       \| ▓▓   \▓▓|      \ /      \|   ▓▓ \  /       \ ▓▓    \ |      \
| ▓▓    ▓▓ ▓▓▓▓▓▓\▓▓▓▓\\▓▓▓▓▓▓▓▓ ▓▓▓▓▓▓▓\ ▓▓       \▓▓▓▓▓▓\  ▓▓▓▓▓▓\\▓▓▓▓▓▓ |  ▓▓▓▓▓▓▓ ▓▓▓▓▓▓▓\ \▓▓▓▓▓▓\
| ▓▓▓▓▓▓▓▓ ▓▓ | ▓▓ | ▓▓ /    ▓▓| ▓▓  | ▓▓ ▓▓   __ /      ▓▓ ▓▓  | ▓▓ | ▓▓ __| ▓▓     | ▓▓  | ▓▓/      ▓▓
| ▓▓  | ▓▓ ▓▓ | ▓▓ | ▓▓/  ▓▓▓▓_| ▓▓  | ▓▓ ▓▓__/  \  ▓▓▓▓▓▓▓ ▓▓__/ ▓▓ | ▓▓|  \ ▓▓_____| ▓▓  | ▓▓  ▓▓▓▓▓▓▓
| ▓▓  | ▓▓ ▓▓ | ▓▓ | ▓▓  ▓▓    \ ▓▓  | ▓▓\▓▓    ▓▓\▓▓    ▓▓ ▓▓    ▓▓  \▓▓  ▓▓\▓▓     \ ▓▓  | ▓▓\▓▓    ▓▓
 \▓▓   \▓▓\▓▓  \▓▓  \▓▓\▓▓▓▓▓▓▓▓\▓▓   \▓▓ \▓▓▓▓▓▓  \▓▓▓▓▓▓▓ ▓▓▓▓▓▓▓    \▓▓▓▓  \▓▓▓▓▓▓▓\▓▓   \▓▓ \▓▓▓▓▓▓▓
                                                          | ▓▓                                          
  >>>solution                                             | ▓▓                            Response 0.24s
  "AmznCaptcha"                                            \▓▓                            Accuracy 99.9%

This library grew out of a simple idea: "I don't want to use pytesseract or some other OCR service that isn't Amazon-specific, nor do I want to install executables just to solve a captcha. I want a solution in two lines of code, without heavy add-ons, in pure Python."


Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.


Recent News

  • May 5, 2023: tested and approved compatibility with Pillow 9.5.0
  • January 25, 2022: tested and approved compatibility with Python 3.10
  • January 25, 2022: dropped support for Python 3.6

Installation

You can install the library from PyPI using pip. For other methods, check the docs.

pip install amazoncaptcha

Quick Snippet

An example of constructor usage. Scroll down a bit to see the class methods. For consistency across different devices, it is highly recommended to use the fromlink class method.

from amazoncaptcha import AmazonCaptcha

captcha = AmazonCaptcha('captcha.jpg')
solution = captcha.solve()

# Or: solution = AmazonCaptcha('captcha.jpg').solve()


Usage and Class Methods

Browsing Amazon with Selenium and stuck on a captcha? The class method below does all the dirty work of extracting the image from the webpage for you. In practice, it takes a screenshot from your webdriver, crops the captcha, and stores it in a bytes array, which is then used to create an AmazonCaptcha instance. This also means nothing is saved locally. For consistency across different devices, it is highly recommended to use the fromlink class method instead of fromdriver.

from amazoncaptcha import AmazonCaptcha
from selenium import webdriver

driver = webdriver.Chrome() # This is a simplified example
driver.get('https://www.amazon.com/errors/validateCaptcha')

captcha = AmazonCaptcha.fromdriver(driver)
solution = captcha.solve()

If you are not using Selenium, or the previous method does not fit your case, you can use a captcha link directly. This class method requests the URL, checks the content type, and stores the response content in a bytes array to create an AmazonCaptcha instance.

from amazoncaptcha import AmazonCaptcha

link = 'https://images-na.ssl-images-amazon.com/captcha/usvmgloq/Captcha_kwrrnqwkph.jpg'

captcha = AmazonCaptcha.fromlink(link)
solution = captcha.solve()

In addition, if you are a machine learning or neural network developer and are looking for some training data, check this repository, which was created to store images and other non-script data for the solver.

Help the Development

If you are willing to help the development, consider setting the keep_logs argument of the solve method to True. When it is set, the links of all unsolved captchas are stored, so that you can later open an issue and attach the logs. Here is an example using the fromdriver class method.

from amazoncaptcha import AmazonCaptcha
from selenium import webdriver

driver = webdriver.Chrome() # This is a simplified example
driver.get('https://www.amazon.com/errors/validateCaptcha')

captcha = AmazonCaptcha.fromdriver(driver)
solution = captcha.solve(keep_logs=True)
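
When preparing such an issue, the logged links can be collected with a few lines of standard-library Python. The sketch below assumes the log is a plain text file named not-solved-captcha.log (the name that appears in one of the reports further down) with one link per line, located in the working directory. The helper name read_unsolved_links is hypothetical and not part of the library.

```python
from pathlib import Path


def read_unsolved_links(log_path="not-solved-captcha.log"):
    """Return the unique captcha links from the log, in first-seen order.

    Assumption: the log holds one link per line; blank lines are ignored.
    """
    path = Path(log_path)
    if not path.exists():
        return []  # nothing has been logged yet

    seen = set()
    links = []
    for line in path.read_text(encoding="utf-8").splitlines():
        link = line.strip()
        if link and link not in seen:
            seen.add(link)
            links.append(link)
    return links
```

The returned list can then be pasted into the issue body.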

If you have any suggestions or ideas for additional features and methods that you would like to see in this library, please feel free to contact the owner via email or fork the repository and open a pull request. Any contribution is highly appreciated!


Additional

  • If you want to see the History of Changes, Code of Conduct, Contributing Policy, or License, use these inline links to navigate based on your needs.
  • If you are facing any errors, please, report your situation via an issue.
  • This project is for educational and research purposes only. Any actions and/or activities related to the material contained in this GitHub repository are solely your responsibility. The author will not be held responsible in the event that any criminal charges are brought against any individuals misusing the information in this GitHub repository to break the law.
  • Amazon is the registered trademark of Amazon.com, Inc. The Amazon name is used in this project for identification purposes only. The project is not associated in any way with Amazon.com, Inc. and is not an official solution of Amazon.com, Inc.

amazoncaptcha's People

Contributors

a-maliarov, anatolii-maliarov, dependabot-preview[bot], dependabot[bot], mend-bolt-for-github[bot], rheelme, tim-wq


amazoncaptcha's Issues

[help]

Describe the bug
I tried to install amazoncaptcha with the pip command, but when I try to run my .py file I get this error: "line 1, in <module>
from amazoncaptcha import AmazonCaptcha
ModuleNotFoundError: No module named 'amazoncaptcha'
"

To Reproduce
Steps to reproduce the behavior:
I followed the guide to install it

Desktop (please complete the following information):

  • OS: [W10]
  • Browser [chrome]
  • Version [88]

Support for another type of Amazon Captcha

Hello! I'm constantly getting the 'Not Solved' result, I think because the training data has a different style from the captchas I'm getting from Amazon. The ones I get have a line in the background of the text. I've attached an image.
captcha
Do you think this could be supported in a future version?
Thank you so much!

[Feature] update to Pillow 10.0.0

I have some other dependencies that need Pillow 10.0.0, but amazoncaptcha restricts version to <9.6.0.

Any reason to not update this dependency?

Thanks in advance

Constantly Not Solved

Paste the code showing which endpoint of the AmazonCaptcha library you are using

from amazoncaptcha import AmazonCaptcha

# Raw string so the backslashes in the Windows path are not read as escapes
captcha = AmazonCaptcha(r'src\captcha\captcha.png')
solution = captcha.solve()
print(solution)

# Or: solution = AmazonCaptcha('captcha.jpg').solve()

If your endpoint is not ".fromdriver", paste the code of how you download the image

Code goes here

Is it really constant?
How many captchas in a row have given the "Not Solved" result?
about 10 times

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Screenshots
If applicable, add screenshots to help explain your problem.
image

Environment (please complete the following information):

  • OS: Windows 10
  • Driver [e.g. chrome, safari]
  • Driver version [e.g. 22]

Additional context
Add any other context about the problem here.

Update AmazonCaptchaCollector

The current instance of AmazonCaptchaCollector was created at the beginning of the library development and uses outdated methods.

  • Add NotFolderError exception.

  • Add the option to the endpoints to decide whether a user wants to store images or just conduct an accuracy test.

  • Add corresponding logs to allow to use the instance as a devtool to proceed accuracy checks.

  • Update existent tests for AmazonCaptchaCollector

  • Add new tests

  • Add docstrings

Won't launch properly

The project won't run or start once the library is imported. I'm using PyCharm with a virtual environment on a Windows machine. I get the error "can't find '__main__' module in {path to project}" any time I try to run anything, including the example code segments with Selenium.

Make Selenium dependency optional

Would it be possible to make Selenium dependency optional? Currently, the dependency tree for a project that imports amazoncaptcha shows that the majority of packages are coming from Selenium:

$ pipdeptree -p amazoncaptcha
amazoncaptcha==0.5.5
  - pillow [required: ~=9.0.1, installed: 9.0.1]
  - requests [required: ~=2.27.1, installed: 2.27.1]
    - certifi [required: >=2017.4.17, installed: 2021.10.8]
    - charset-normalizer [required: ~=2.0.0, installed: 2.0.11]
    - idna [required: >=2.5,<4, installed: 3.3]
    - urllib3 [required: >=1.21.1,<1.27, installed: 1.26.8]
  - selenium [required: >=3.141,<4.2, installed: 4.1.0]
    - trio [required: ~=0.17, installed: 0.19.0]
      - async-generator [required: >=1.9, installed: 1.10]
      - attrs [required: >=19.2.0, installed: 21.4.0]
      - cffi [required: >=1.14, installed: 1.15.0]
        - pycparser [required: Any, installed: 2.21]
      - idna [required: Any, installed: 3.3]
      - outcome [required: Any, installed: 1.1.0]
        - attrs [required: >=19.2.0, installed: 21.4.0]
      - sniffio [required: Any, installed: 1.2.0]
      - sortedcontainers [required: Any, installed: 2.4.0]
    - trio-websocket [required: ~=0.9, installed: 0.9.2]
      - async-generator [required: >=1.10, installed: 1.10]
      - trio [required: >=0.11, installed: 0.19.0]
        - async-generator [required: >=1.9, installed: 1.10]
        - attrs [required: >=19.2.0, installed: 21.4.0]
        - cffi [required: >=1.14, installed: 1.15.0]
          - pycparser [required: Any, installed: 2.21]
        - idna [required: Any, installed: 3.3]
        - outcome [required: Any, installed: 1.1.0]
          - attrs [required: >=19.2.0, installed: 21.4.0]
        - sniffio [required: Any, installed: 1.2.0]
        - sortedcontainers [required: Any, installed: 2.4.0]
      - wsproto [required: >=0.14, installed: 1.0.0]
        - h11 [required: >=0.9.0,<1, installed: 0.13.0]
    - urllib3 [required: ~=1.26, installed: 1.26.8]

Given that AmazonCaptcha is fully functional without Selenium, and that many alternatives to Selenium exist for cases where an automated browser is required (Playwright, pyppeteer), there seems to be an opportunity to make the package lighter. If I understand correctly, Selenium is currently imported only for unit tests. In production, the user is expected to install and import Selenium, then pass the driver object to AmazonCaptcha. Separate production and test requirements (with an optional version compatibility check in production) should therefore do the trick. Thanks!
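
A common pattern for an optional dependency is a lazy import: the module is loaded only when the feature that needs it is called, and a missing module produces an installation hint instead of breaking the whole package. The following is a sketch of that general pattern, not the library's current code. The helper name require_optional and the extra name used in the example message are hypothetical.

```python
import importlib


def require_optional(module_name, extra=None):
    """Import an optional dependency on demand.

    Raises ImportError with an installation hint when the module is missing,
    so importing the main package never fails because of it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        hint = f"pip install {extra}" if extra else f"pip install {module_name}"
        raise ImportError(
            f"{module_name!r} is required for this feature; try: {hint}"
        ) from exc


# Hypothetical usage inside a driver-based helper:
# selenium = require_optional("selenium", extra="amazoncaptcha[selenium]")
```

With this approach, the hard requirement could move from the install requirements to an optional extra, while the test requirements keep Selenium pinned.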

The solver doesn't work with modified images of captcha

Hello
I have an image as a NumPy array and save it as a JPEG through a PIL image. After that I pass the image to the solver, but every time I receive "Not solved".
Screenshot_4

Screenshot_5

I can't figure out what the problem is. When I downloaded images from Amazon and tried them, it worked well. Can anyone help me?

PIL version '9.2.0'
python version '3.9.13'

Performance enhancement

Hello

How can I improve the performance of amazoncaptcha?

I have hundreds of thousands of images, and CPU consumption is high.

Do you have any recommendations, or suggestions for other tools?

Thanks
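
Since solving is CPU-bound, one general option is to spread the work over several processes. The sketch below is an assumption about how this could be done, not a feature of the library. solve_one is a hypothetical per-image function (for example, a wrapper around AmazonCaptcha(path).solve()); with the default process pool it has to be defined at module top level so that it can be pickled.

```python
from concurrent.futures import ProcessPoolExecutor


def solve_many(paths, solve_one, workers=None, chunksize=64,
               executor_cls=ProcessPoolExecutor):
    """Apply solve_one to every path in parallel and return {path: result}.

    chunksize batches the work so that inter-process overhead stays small
    when there are very many images. executor_cls can be swapped for
    another concurrent.futures executor, e.g. for testing.
    """
    paths = list(paths)
    with executor_cls(max_workers=workers) as pool:
        results = pool.map(solve_one, paths, chunksize=chunksize)
        # Consume the results while the pool is still open
        return dict(zip(paths, results))
```

On Windows, code that starts a process pool should be called from under an if __name__ == "__main__": guard. Whether this helps in practice depends on how many cores are available.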

amazoncaptcha 0.5.0 is coming

For users:

  • Move the captchas folder to a separate repository to reduce the size of this one
  • Add Python 3.9 support
  • Add Chromedriver 86.0.4240.22 support
  • Update AmazonCaptchaCollector
  • Do an accuracy test after latest
  • Add documentation.
  • Add Pillow 8.0.0 support
  • Minor edits

For developers:

  • Add Stale Bot to remove stale issues
  • Add discord notifications to monitor the changes
  • Update setup.py
  • Workflow update

Not solved

I constantly receive the message "not solved". What should I do?

My Code:

from amazoncaptcha import AmazonCaptcha

captcha = AmazonCaptcha('Captcha.jpg')
solution = captcha.solve()

print(solution)

Output:
>>> Not solved

Update setup.py

  • Add more classifiers: "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Information Technology", "Topic :: Internet :: WWW/HTTP :: Browsers".

  • Add a function to remove repository logo from README before pushing to PyPI, since the logo won't be displayed correctly.

  • Add version file with all the package information

  • Think about an implementation of auto tracker for the version number.

  • Slight code style change, defining requires and classifiers lists before usage.

More documentation for how to use with requests/lxml

Selenium is awesome, but I am trying to use this with requests and lxml. It seems like it is solving things properly, but I am having trouble submitting the solution. Could you add some example usage to the readme?

This is what I am doing right now using requests/lxml:

import random
import requests
from lxml import html
from fake_useragent import UserAgent
import csv
import time
import os
from amazoncaptcha import AmazonCaptcha


amazon_captcha_xpath = '//h4[contains(text(), "Enter the characters you see below")]'
captcha_image_xpath = '//div[@class="a-row a-text-center"]/img/@src'


def get_link(url, session=None, user_agent=None, proxy=None):
    """
    Fetches the HTML content from the provided URL.
    Returns a parsed lxml HTML tree that can be used with XPath.
    """
    ua = UserAgent()
    headers = {'User-Agent': ua.google if not user_agent else user_agent}
    proxies = {'http': proxy, 'https': proxy} if proxy else {}

    if session is None:
        session = requests.Session()

    response = session.get(url, headers=headers, proxies=proxies)
    tree = html.fromstring(response.content)

    return tree, session


# code that does stuff assuming there is no captcha. Leaving it out because it's long and probably not helpful.

if tree.xpath(amazon_captcha_xpath):
    bot_check = True
    print(html.tostring(tree).decode())
    print('[ Captcha Detected! ]')

    captcha_image_link = tree.xpath(captcha_image_xpath)[0]
    print(captcha_image_link)

    solution = AmazonCaptcha.fromlink(captcha_image_link).solve()
    print(f'Solution is: {solution}')

    print('Pausing to seem human...')
    time.sleep(random.randrange(3, 15))

 
    print('Submitting solution')
    
    # THIS IS THE PART THAT SUBMITS IT AND DOES NOT SEEM TO WORK
    
    amzn = tree.xpath('//input[@name="amzn"]/@value')[0]
    amzn_r = tree.xpath('//input[@name="amzn-r"]/@value')[0]

    data = {
        'amzn': amzn,
        'amzn-r': amzn_r,
        'field-keywords': solution
    }

    response = session.post('https://www.amazon.com/errors/validateCaptcha', data=data)

    # check response
    print(response.status_code)   # always comes back as 503
    #print(response.text)
    #input('PAUSED')
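
One thing worth checking in a case like this is whether the request matches what the page's own form declares. The sketch below uses only the standard library and is not taken from this project. It reads the form's method, action, and hidden inputs from the HTML so that none of them has to be hard-coded. The names CaptchaFormParser and build_submission are hypothetical, and the real page markup may differ from what the sketch expects.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CaptchaFormParser(HTMLParser):
    """Collect method, action and hidden inputs of the first <form>."""

    def __init__(self):
        super().__init__()
        self.method = "get"   # HTML default when no method is given
        self.action = ""      # empty action means "submit to the page URL"
        self.hidden = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.method = (attrs.get("method") or "get").lower()
            self.action = attrs.get("action") or ""
        elif tag == "input" and self._in_form and attrs.get("type") == "hidden":
            if attrs.get("name"):
                self.hidden[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_submission(page_url, page_html, solution, field="field-keywords"):
    """Return (method, url, fields) describing how the form expects the answer."""
    parser = CaptchaFormParser()
    parser.feed(page_html)
    fields = dict(parser.hidden)
    fields[field] = solution
    return parser.method, urljoin(page_url, parser.action), fields
```

With requests, a "get" method would correspond to session.get(url, params=fields) and a "post" method to session.post(url, data=fields), sent through the same session and with the same headers that fetched the captcha page.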

'WebDriver' object has no attribute 'find_element_by_tag_name'

I have two issues, so I'll mention them both here. Using the provided examples I get errors:

fromlink:

requests.exceptions.MissingSchema: Invalid URL '<selenium.webdriver.firefox.webdriver.WebDriver (session="4ff4c007-d259-454a-8b8d-5be914964afd")>': No scheme supplied. Perhaps you meant https://<selenium.webdriver.firefox.webdriver.WebDriver (session="4ff4c007-d259-454a-8b8d-5be914964afd")>?

fromdriver:

AttributeError: 'WebDriver' object has no attribute 'find_element_by_tag_name'

To Reproduce
Steps to reproduce the behavior:

driver = webdriver.Firefox()
driver.get('https://www.amazon.com/errors/validateCaptcha')

captcha = AmazonCaptcha.fromlink(driver)
solution = captcha.solve()

Desktop (please complete the following information):

macOS Ventura
Firefox 112.0.2 (64-bit)

Additional context

Hopefully it's something simple.
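
The first error comes from passing the driver object to fromlink, which expects a URL string. The second is consistent with Selenium 4.3 and later, where the find_element_by_* helpers are no longer available. As a possible workaround (a sketch, not the library's own fix), the image URL can be read with the generic find_element call and then handed to fromlink. The helper name captcha_image_url is hypothetical, and taking the first img element on the page is an assumption.

```python
def captcha_image_url(driver):
    """Return the src of the first <img> element on the current page.

    Uses driver.find_element(by, value); the string "tag name" is the value
    of selenium.webdriver.common.by.By.TAG_NAME.
    """
    image = driver.find_element("tag name", "img")
    return image.get_attribute("src")


# Usage (requires selenium and amazoncaptcha to be installed):
# driver.get('https://www.amazon.com/errors/validateCaptcha')
# captcha = AmazonCaptcha.fromlink(captcha_image_url(driver))
# solution = captcha.solve()
```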

Add Python 3.9 support

  • The initial test of whether the package is compatible with 3.9

  • Solving issues (none was found)

  • Updating PyPI package (also, grab #13 and #16 here)

No timeout can cause the script to freeze indefinitely

First off, I'd like to thank you for the useful package.

The issue I'm having is that threads in my script will randomly freeze and stop responding after hours or sometimes days. After doing some analysis with gdb, the reason seems to be AmazonCaptcha.fromlink() using requests.get(image_link) without a timeout which might cause the script to wait indefinitely if there's an issue. At its core, I guess the real culprit is network connectivity issues since I'm randomly rotating IP but I don't see any other solution than adding a timeout and error handling in AmazonCaptcha.

Maybe adding a high default timeout of say 60-180 seconds and then raising an exception might be a decent solution?

Traceback (most recent call first):
  File "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 411, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1012, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/username/.local/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/home/username/.local/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/username/.local/lib/python3.10/site-packages/amazoncaptcha/solver.py", line 247, in fromlink
    response = requests.get(image_link)
  File "/home/username/script.py", line 139, in fetch_html
    solution = AmazonCaptcha.fromlink(imageurl).solve()
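
Until a timeout is available in the library, one possible workaround is to download the image yourself with an explicit timeout and hand the bytes to the solver. The sketch below uses only the standard library. fetch_image_bytes and its opener parameter are hypothetical names, and passing a BytesIO object to the AmazonCaptcha constructor is an assumption based on the README's description of images being stored in a bytes array.

```python
import io
import socket
import urllib.error
import urllib.request


def fetch_image_bytes(url, timeout=60.0, opener=urllib.request.urlopen):
    """Download url and return its body wrapped in BytesIO.

    The timeout applies to each blocking socket operation (connect,
    TLS handshake, read), so a stalled connection raises TimeoutError
    instead of blocking the calling thread forever.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return io.BytesIO(response.read())
    except socket.timeout as exc:
        raise TimeoutError(f"no response from {url} within {timeout}s") from exc
    except urllib.error.URLError as exc:
        # urlopen wraps connect-phase timeouts in URLError
        if isinstance(exc.reason, socket.timeout):
            raise TimeoutError(f"no response from {url} within {timeout}s") from exc
        raise


# Assumed usage (not verified against the library):
# solution = AmazonCaptcha(fetch_image_bytes(imageurl, timeout=60)).solve()
```

The caller can then catch TimeoutError and retry or rotate the IP, which keeps the decision about error handling in the script rather than in the solver.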

Update tests

Even though the current tests cover 100% of the code, they should be rewritten in a specific style.

Recheck imports after separating devtools

There is an import of multiprocessing in the amazoncaptcha.solver module. It shouldn't be there, since AmazonCaptchaCollector was moved to the amazoncaptcha.devtools module.

Update PyPI readme file

The readme file on PyPI currently contains the following information in the Additional block: Just FYI, pip will install only the module itself. However, if you are using git clone, be aware that you will also clone 50 MB of captchas currently located in the repository.

This is outdated, since all the images were moved to a separate repository.

Constantly Not Solved

unsolveable-captcha

Paste the code showing which endpoint of the AmazonCaptcha library you are using

    captcha_link = "https://user-images.githubusercontent.com/17553693/102722542-fd61b780-4301-11eb-9222-f7532a638b6f.jpg"
    captcha = AmazonCaptcha.fromlink(captcha_link)
    solution = captcha.solve()
    print("Captcha solution is: ", solution)

Is it really constant?
How many captchas in a row have given the "Not Solved" result?
All!

Environment (please complete the following information):

  • OS: macOS 10.15
  • Driver chrome
  • Driver version ChromeDriver 87.0.4280.88

Additional context
Hi, I am trying to solve the type of captcha shown in the attached photo, which appears while logging into an Amazon account. I have tried loading it in different ways and the image loads correctly, but I always get a "Not Solved" response.

Can't find the training_data in the Temp\\_MEI167922 folder[Bug]

Describe the bug
A clear and concise description of what the bug is.
Can't find the training_data in the Temp\_MEI167922 folder

To Reproduce
Steps to reproduce the behavior:
PyInstaller: 4.3
Python: 3.9.5
Platform: Windows-10-10.0.22621-SP0

Expected behavior
A clear and concise description of what you expected to happen.
File "amazoncaptcha\solver.py", line 68, in __init__
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\Users\Matt\AppData\Local\Temp\_MEI167922\amazoncaptcha\training_data'

Desktop (please complete the following information):

  • OS: Windows11
  • Browser: Chrome

Additional context
Add any other context about the problem here.
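When run from PyCharm, the captcha is recognized correctly.
But after packaging the script into an exe file with PyInstaller, the amazoncaptcha folder cannot be found in the temporary folder at run time.

Packaging command: pyinstaller -F "D:\Code\picshot\Picshot_amazon.py"

Code being run:
import amazoncaptcha
solution = amazoncaptcha.AmazonCaptcha('captcha.png').solve(keep_logs=True)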

Unable to import Amazon captcha

Hi team.
I am unable to run from amazoncaptcha import AmazonCaptcha or import amazoncaptcha in Python 3.5.0.

Is there any way to make the library compatible with Python 3.5? Kindly assist.

I recently installed Python 3.6, but I still get the same error. Could you help with this?

I am getting the error below:

image

Could you please help with this issue?

Remove "return None" from the solver

Function: amazoncaptcha.solver.AmazonCaptcha._find_letters()

Problematic lines:

if (len(letters) == 6 and letters[0].width < MINIMUM_LETTER_LENGTH) or (len(letters) != 6 and len(letters) != 7):
   self.letters = {str(k): Image.new('L', (200, 70)) for k in range(1, 7)}
   return

Assuming the next if statement only changes the letters list and doesn't return None, it would be conceptually more correct to avoid return None in this one as well.

Constantly Not Solved

The current issue is that the new version of the Amazon captcha differs from the one on the validateCaptcha page.
When I try to log in, I get this kind of image, and the solver can't solve it.

Image

Constantly receiving `Not solved` while using `fromdriver`

The following keeps returning Not solved for me. I have never got any other return.

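import selenium
from amazoncaptcha import AmazonCaptcha
from selenium import webdriver as webdriver
# Missing in the original report; ChromeDriverManager comes from the webdriver_manager package
from webdriver_manager.chrome import ChromeDriverManager

d = webdriver.Chrome(ChromeDriverManager().install())
captcha = AmazonCaptcha.fromdriver(d)
print(captcha.solve(keep_logs=True))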

not-solved-captcha.log:
https://images-na.ssl-images-amazon.com/captcha/bfhuzdtn/Captcha_cebmxydbrt.jpg
https://images-na.ssl-images-amazon.com/captcha/perumqgc/Captcha_gaommpndkq.jpg
https://images-na.ssl-images-amazon.com/captcha/rhnrlggh/Captcha_tijaodpupx.jpg
https://images-na.ssl-images-amazon.com/captcha/bysppkyq/Captcha_xroxbnvmrg.jpg

Giving 'Not Solved' result

Question about usage

Hi, potential silly question here: does solve() actually send the solution back to the website via selenium, or does it just return the solution to be saved into a var and sent by the user as needed?

Curious about the scope of your project

First of all, I think that captcha solving is a really cool application of machine learning techniques. I am interested in making a simple web app to present the results of a data science project carried out using cached Amazon review data from a couple of years ago, when it was easier to scrape the site. The idea of the web app is that a scrape will happen in real time for a particular product, and the model I have built will then be applied to guess which reviews are the most relevant to making an informed purchasing decision.

Amazon's anti-scraping defenses are such that it seems I will have to either pay for a VPN to cycle IPs or find a way to robustly solve captchas. Is your tool able to solve captchas and hand the solutions over to Amazon using pyppeteer or some other tool, in a way which will satisfy the site and allow me access to the source code of the page I actually want to visit? If so, do you have sample code for this? I am very new to Python and it is not obvious to me how to do this from the sample code on your home page. If this is not currently feasible, is there a chance that it could be made so in the near future? I do not need this to work on an industrial scale; the app will only be used sporadically as a proof-of-concept demonstration of the machine learning I have done.

If this is not something that you are interested in, no problem. In that case, I would be curious to know what applications you have in mind for your program. I suppose that the challenge of building something which can pass "Turing tests" is a worthy goal in and of itself :)

Workflow update

  • Add an allow_failures test on python version 3.5.* at Travis-CI
  • Setup a github action to automatically publish the package on release

Constantly Not Solved

Versions of amazoncaptcha and drivers
image: version of drivers.

Is it really constant?
How many captchas in a row have given the "Not Solved" result?
Always. I tried dozens of times, but it still doesn't work.

To Reproduce
Steps to reproduce the behavior:
Write this code:

from amazoncaptcha import AmazonCaptcha
from selenium import webdriver
from chromedriver_py import binary_path
import time

chrome_options = webdriver.ChromeOptions()
# Raw string so that backslashes in the Windows path are kept literally
chrome_options.binary_location = r'C:\Program Files\Google\Chrome Beta\Application\chrome.exe'
driver = webdriver.Chrome(executable_path=binary_path, options=chrome_options)

# Request a fresh captcha every 5 seconds and print the solver's answer
while True:
    driver.get('https://www.amazon.com/errors/validateCaptcha')

    captcha = AmazonCaptcha.fromdriver(driver)
    solution = captcha.solve()
    print(solution)
    time.sleep(5)

Replace the binary location with the directory that contains the chrome.exe file.
Run the script and check the output: only the string Not solved appears.

Screenshots
If applicable, add screenshots to help explain your problem.
image: version of drivers.
image: messages.

Environment (please complete the following information):

  • OS: [Windows Home 10 version 19041.867]
  • Driver [Chrome Beta 90.0.4430.51]
  • Driver version [See screenshot]

Additional context
I have tried with GeckoDriver and Firefox too, but it still does not work.
image
This is the type of captcha that I always see.
