Comments (4)

dcts commented on May 20, 2024

Waiting for `.cf-browser-verification` to be hidden means that you are on the Cloudflare page (cf = Cloudflare) and are not being redirected to the actual OpenSea page within 30 seconds. Most likely OpenSea is detecting that you run the scraper from a Google Cloud IP, so the Cloudflare loop kicks in: the page refreshes endlessly, asking you to wait for a check that never resolves.
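
To make that failure mode easier to spot in logs, here is a minimal sketch of a check that tells the challenge page apart from the real page. The function names `looksLikeCloudflareChallenge` and `explainTimeout` are hypothetical and not part of opensea-scraper; the two markers are the selector from the error and the interstitial text quoted in this thread, and Cloudflare may change either at any time.

```javascript
// Markers taken from this thread: the selector the scraper waits on and the
// text shown on the interstitial page. Both are assumptions that may go stale.
const CHALLENGE_MARKERS = [
  "cf-browser-verification",
  "Checking your browser before accessing",
];

// Returns true when the HTML looks like the Cloudflare interstitial
// rather than the requested page.
function looksLikeCloudflareChallenge(html) {
  return CHALLENGE_MARKERS.some((marker) => html.includes(marker));
}

// Hypothetical helper: replaces a generic selector timeout with a clearer
// error when the page is still the challenge page. Other errors pass through.
function explainTimeout(error, html) {
  const timedOut = /timeout/i.test(String(error && error.message));
  if (timedOut && looksLikeCloudflareChallenge(html)) {
    return new Error(
      "Still on the Cloudflare challenge page after the timeout; " +
        "the request was probably blocked (for example because of a cloud IP)."
    );
  }
  return error;
}
```

Assuming the scraper is driven by Puppeteer, the `html` argument could come from `await page.content()` inside the `catch` block around the selector wait.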

I have no way around that currently; deploying scrapers on cloud infrastructure is difficult.

If you (or someone else) have ideas, please share. It's a very common problem.

One solution that might work, but is costly, is using a service like Bright Data (proxy with unblocker API).

from opensea-scraper.

mlarcher commented on May 20, 2024

UPDATE: When running on GCP we now get the ``TimeoutError: waiting for selector `.cf-browser-verification` to be hidden failed: timeout 30000ms exceeded`` error less frequently, but when we don't get the error we end up with an empty offers list and empty stats, i.e.:

offers: []
stats: {}
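
Since this case fails silently, a caller may want to treat it as an error instead of passing empty data along. A minimal sketch, assuming the result is an object with an `offers` array and a `stats` object as shown above; `isEmptyResult` and `assertNonEmpty` are hypothetical helpers, not part of the library:

```javascript
// Returns true when a scrape result carries no offers and no stats,
// which is the silent failure described above.
function isEmptyResult(result) {
  const offers = Array.isArray(result?.offers) ? result.offers : [];
  const stats = result?.stats ?? {};
  return offers.length === 0 && Object.keys(stats).length === 0;
}

// Throws instead of letting an empty result flow downstream unnoticed.
function assertNonEmpty(result) {
  if (isEmptyResult(result)) {
    throw new Error("Scrape returned empty offers and stats; the page was probably blocked");
  }
  return result;
}
```

A result with empty offers but populated stats is deliberately not treated as a failure here, since a collection may legitimately have no offers.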

I hope this will be fixed by v7's new approach 🤞

from opensea-scraper.

dcts commented on May 20, 2024

REPORT FROM @mlarcher :

I dug a bit into the code and set up a test case...
It seems that on GCP I'm stuck on a page that says

Checking your browser before accessing opensea.io.
This process is automatic. Your browser will redirect to your requested content shortly.

Please allow up to 5 seconds…
DDoS protection by [Cloudflare](https://www.cloudflare.com/5xx-error-landing/)

:(

From what I gathered :

All in all this doesn't look too good, but it is not directly related to the current library. Let me know if you have expertise on the matter and know some other way to tackle the problem though :)

from opensea-scraper.

dcts commented on May 20, 2024

Bypassing Cloudflare is definitely not my expertise. I have tried to solve this problem for some time now, and it is definitely possible, but as you mentioned it's an arms race. I tried these packages:

  • cloudflare-scraper in JS: did not work for me. It seems like it's not maintained anymore.
  • cloudscraper Python package: I managed to set up a Google Cloud Run environment with Python and successfully got past Cloudflare. That was approximately 3 months ago. To make it work with OpenseaScraper you could either fetch only the HTML through Python and then extract the top 32 offers with the code provided in this repo, or rewrite everything in pyppeteer. The latter is just an idea, as I am not even sure it would work.
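
The first option (HTML fetched elsewhere, offers extracted in JS) could be wired up roughly as below. This is only a sketch of the hand-off: it assumes the Python side writes the page HTML to a file, and `extractOffers` is a hypothetical stand-in for the extraction code in this repo, whose actual signature is not shown in this thread.

```javascript
const fs = require("node:fs");

// Reads HTML that another process (e.g. a Python script) saved to disk and
// runs the supplied extractor on it, keeping at most `limit` offers.
// `extractOffers` is a placeholder: any function mapping an HTML string
// to an array of offers.
function loadOffersFromHtmlFile(htmlPath, extractOffers, limit = 32) {
  const html = fs.readFileSync(htmlPath, "utf8");
  const offers = extractOffers(html);
  return offers.slice(0, limit);
}
```

Keeping the fetch step and the extraction step separate like this means the part that has to get past Cloudflare can be swapped out without touching the parsing code.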

from opensea-scraper.
