Comments (7)

dteviot commented on August 17, 2024

@norabelle101 @MarkoPabst @Dongboy69

Seems to be same problem as #1306. Fix provided by gamebeaker.

Test versions for Firefox and Chrome have been uploaded to https://drive.google.com/drive/folders/1B_X2WcsaI_eg9yA-5bHJb8VeTZGKExl8?usp=sharing. Pick the one suitable for you, follow the "How to install from Source (for people who are not developers)" instructions at https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode#user-content-how-to-install-from-source-for-people-who-are-not-developers and let me know how it goes.
Tested with:

For my notes: no extra work. (Fixed by #1306)

from webtoepub.

MarkoPabst commented on August 17, 2024

@dteviot

Hello! I've also had this same issue with both foxaholic sources recently; it worked up until a couple of weeks ago. The site's security seems to be the same, though, since they've always had Cloudflare security and the "human verification" check (or maybe something in the backend has changed?). But this issue doesn't seem to actually implicate the WebToEpub extension itself. It might be more of a problem with the Chrome browser, since the store version on Mozilla Firefox can easily bypass the Cloudflare protection for this site and download any novel, just like Chrome was able to before. So I'm rather confused about what exactly is causing the "404 CAPTCHA error" in only Chrome...

Also, in one of the comments on issue #1304, a user reported using a "noscript" extension from the Chrome Web Store to bypass this error, but it didn't quite work for me, despite using the same configuration... Is there really no way to find a workaround to this issue to allow it to rework again in Chrome since this browser is preferable?

And after using the extension for the first time in Mozilla, the interface and overall look (style + fonts) were vastly different from the clean design of Chrome, so it was a bit harder to see where everything was :'(

EDIT: I tested the Chrome version of the extension on other browsers as well (Brave and Microsoft Edge), and received the same 404 error. It seems the extension does not work at all on either foxaholic site if using the Chrome version of the extension, but the Mozilla version is fine?

It didn't work for me either.

Dongboy69 commented on August 17, 2024

Can confirm: Firefox doesn't get this error, only Chromium browsers do.

dteviot commented on August 17, 2024

@MarkoPabst

I'm seeing more and more of this. The site is using Cloudflare for anti-scraping, which also blocks WebToEpub.

norabelle101 commented on August 17, 2024

@dteviot

Hello! I've also had this same issue with both foxaholic sources recently; it worked up until a couple of weeks ago. The site's security seems to be the same, though, since they've always had Cloudflare security and the "human verification" check (or maybe something in the backend has changed?). But this issue doesn't seem to actually implicate the WebToEpub extension itself. It might be more of a problem with the Chrome browser, since the store version on Mozilla Firefox can easily bypass the Cloudflare protection for this site and download any novel, just like Chrome was able to before. So I'm rather confused about what exactly is causing the "404 CAPTCHA error" in only Chrome...

Also, in one of the comments on issue #1304, a user reported using a "noscript" extension from the Chrome Web Store to bypass this error, but it didn't quite work for me, despite using the same configuration... Is there really no way to find a workaround to this issue to allow it to rework again in Chrome since this browser is preferable?

And after using the extension for the first time in Mozilla, the interface and overall look (style + fonts) were vastly different from the clean design of Chrome, so it was a bit harder to see where everything was :'(

EDIT: I tested the Chrome version of the extension on other browsers as well (Brave and Microsoft Edge), and received the same 404 error. It seems the extension does not work at all on either foxaholic site if using the Chrome version of the extension, but the Mozilla version is fine?

dteviot commented on August 17, 2024

@norabelle101

Chrome version of the extension on other browsers as well (Brave and Microsoft Edge)

FYI. Chrome, Brave and Edge (and most other) all use the same base (chromium) engine. As far as I know, the only two major browsers that don't use Chromium are Firefox and Safari.

I would suspect that whatever Cloudflare has done, it ignores Firefox.
Note, I SUSPECT the problem is Cloudflare, as the Firefox and Chrome versions of the extension are (I'd guesstimate) around 98% identical code. I don't remember any differences in the Fetch Chapter logic.

Is there really no way to find a workaround to this issue to allow it to rework again in Chrome since this browser is preferable?

I have a thought how this could be done. Basically, instead of trying to fetch just the wanted content, open each chapter in a tab so whatever Cloudflare is looking for WILL be there. Then fetch the content from the tab.
The problems with this plan are:

  1. It's likely to be a lot of work. I'm thinking tens of hours of work. And I just don't have the motivation (or time).
  2. Doing this is complicated, and I'm not currently sure how to do at least one part of it.
  3. I don't think it can get images.
  4. WebToEpub will be kind of annoying to use. Will be opening and closing tabs. I'm not sure it will even work on Android.

An alternative plan would be to do something similar with an actual browser. Selenium allows another program to "remote control" a browser. The problems with this plan are:

  1. It basically requires writing a whole new program from scratch. So, probably 100s of hours of work.
  2. It's not an extension, but a stand-alone program, so it would only work on Windows.
  3. Getting images is not easy.

Note, I've seen a scraper project that is going down this path https://github.com/martial-god/Benny-Scraper. Giving some consideration to assisting.
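
The Selenium plan described above could be sketched roughly as follows. This is a minimal sketch in Python, not WebToEpub or Benny-Scraper code: `fetch_chapters` and `make_firefox_driver` are hypothetical names, and only the standard Selenium WebDriver members `get()`, `page_source` and `quit()` are relied on. Nothing in this thread confirms that a driven browser actually gets past the Cloudflare check, and the sketch makes no attempt at images.

```python
from typing import Dict, Iterable


def fetch_chapters(driver, chapter_urls: Iterable[str]) -> Dict[str, str]:
    """Load each chapter in a real browser and return {url: page HTML}.

    `driver` is assumed to behave like a Selenium WebDriver: it needs a
    `get(url)` method and a `page_source` attribute. Hypothetical helper.
    """
    pages: Dict[str, str] = {}
    for url in chapter_urls:
        driver.get(url)                  # the browser loads the full page, scripts included
        pages[url] = driver.page_source  # HTML of the page as the browser currently holds it
    return pages


def make_firefox_driver():
    """Start a Firefox instance under Selenium's control.

    Assumes the third-party `selenium` package and Firefox are installed.
    The caller should call `quit()` on the returned driver when finished.
    """
    # Imported lazily so fetch_chapters has no hard dependency on selenium.
    from selenium import webdriver
    return webdriver.Firefox()
```

With Selenium installed, usage would be along the lines of `driver = make_firefox_driver()`, then `fetch_chapters(driver, urls)`, then `driver.quit()`. A real tool would still need to wait for any challenge page to finish and to extract the chapter text from the returned HTML.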

norabelle101 commented on August 17, 2024

@dteviot

FYI. Chrome, Brave and Edge (and most other) all use the same base (chromium) engine. As far as I know, the only two major browsers that don't use Chromium are Firefox and Safari.

I would suspect that whatever Cloudflare has done, it ignores Firefox.

Oh, I see! As long as Firefox remains capable of bypassing this protection, that's good then!

I have a thought how this could be done. Basically, instead of trying to fetch just the wanted content, open each chapter in a tab so whatever Cloudflare is looking for WILL be there. Then fetch the content from the tab.

As you've explained, it's probably best not to go further with implementing this if it's only going to overcomplicate the simple purpose of WebToEpub :)

Note, I've seen a scraper project that is going down this path https://github.com/martial-god/Benny-Scraper. Giving some consideration to assisting.

I've never heard of this program, but it would be really amazing if they could enlist your assistance. That said, this program only supports around five sources, so I'm not sure if they'd be willing to add enough to come close to the level of WebToEpub, though it's nice to have a working alternative :)

All in all, thank you very much for considering these methods to get around this issue! For now, it's probably best to stick to using Firefox for any sources which employ this protection :D
