Comments (15)

Philippus commented on September 4, 2024

I've updated two of those links in #1375, because they changed. Not sure why we're getting a 403 Forbidden error though. Maybe the IP the build is running from is blocked?

from scala-lang.

SethTisue commented on September 4, 2024

it's strange, I don't know what to make of it

after #1375, remaining failures are:

- ./_site/blog/2017/08/28/gsoc-connecting-contributors-with-projects.html
  *  External link https://docs.github.com/en/graphql failed: 403 No error
  *  External link https://docs.github.com/en/rest failed: 403 No error
htmlproofer 3.10.2 | Error:  HTML-Proofer found 3 failures!
- ./_site/blog/2018/06/04/scalac-profiling.html
  *  External link https://docs.github.com/en/authentication/connecting-to-github-with-ssh failed: 403 No error

SethTisue commented on September 4, 2024

there is some previous history on 403s at #945

griggt commented on September 4, 2024

How odd. I tried to curl https://docs.github.com/en/rest and I also got a 403.

I was able to get a 200 by adding to the request an Accept-Encoding header that explicitly specified at least one compression algorithm, e.g. it liked Accept-Encoding: gzip, identity but not Accept-Encoding: identity or Accept-Encoding: *

Not sure what to make of that, maybe the server has been configured to only send compressed responses?

I don't know how htmlproofer works or what request headers it sends.
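The curl experiment above could be repeated from Ruby with something like the following minimal sketch. It uses only the standard library's Net::HTTP; the helper names `build_request` and `status_for` are hypothetical, and Net::HTTP sends its own default User-Agent and Accept headers, so the status codes it gets back may not match what curl or htmlproofer see.

```ruby
require 'net/http'
require 'uri'

# Accept-Encoding values tried in the comment above.
encodings = ['gzip, identity', 'identity', '*']

# Hypothetical helper: build a GET request with an explicit Accept-Encoding header.
def build_request(url, accept_encoding)
  request = Net::HTTP::Get.new(URI(url))
  request['Accept-Encoding'] = accept_encoding
  request
end

# Hypothetical helper: send the request and return the numeric HTTP status.
# This needs network access, so it is defined here but not called.
def status_for(url, accept_encoding)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(url, accept_encoding)).code.to_i
  end
end

# Usage (commented out because it hits the network):
#   encodings.each do |enc|
#     puts "#{enc.inspect}: #{status_for('https://docs.github.com/en/rest', enc)}"
#   end
```

Keeping the request construction separate from the network call makes it easy to inspect exactly which headers are being sent before blaming the server.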

SethTisue commented on September 4, 2024

latest run: https://github.com/scala/scala-lang/runs/6250840445

and... there is a massive amount of 403s :-(

not sure what to make of that. like do we revert #1376 (and #1378) because it seems to have made matters worse?

or perhaps it's just because there were several runs in close succession and so we're getting rate-limited? normally the cron job only runs once/day

let's see how the next cron run does

Philippus commented on September 4, 2024

I think it made things worse. :(

SethTisue commented on September 4, 2024

Still tons of 403s at https://github.com/scala/scala-lang/runs/6282395653 :-/

griggt commented on September 4, 2024

I ran the check on a single directory locally:

with the recent Accept-Encoding "fix":

$ bundle exec htmlproofer ./_site/blog/2017/08/28/ --external_only --only-4xx --http-status-ignore "400,401,429" --empty-alt-ignore --allow-hash-href --url-ignore "/trends.google.com/,/pgp.mit.edu/,/www.oracle.com/,/scalafiddle.io/" --typhoeus-config='{"headers":{"Accept-Encoding":"gzip, deflate"}}'
Running ["LinkCheck", "ImageCheck", "ScriptCheck"] on ["./_site/blog/2017/08/28/"] on *.html... 

Checking 37 external links...
Ran on 1 file!

- ./_site/blog/2017/08/28/gsoc-connecting-contributors-with-projects.html
  *  External link https://index.scala-lang.org failed: 403 No error
  *  External link https://index.scala-lang.org/ failed: 403 No error
  *  External link https://index.scala-lang.org/search?q=&contributingSearch=true failed: 403 No error

HTML-Proofer found 3 failures!

and without:

$ bundle exec htmlproofer ./_site/blog/2017/08/28/ --external_only --only-4xx --http-status-ignore "400,401,429" --empty-alt-ignore --allow-hash-href --url-ignore "/trends.google.com/,/pgp.mit.edu/,/www.oracle.com/,/scalafiddle.io/" 
Running ["LinkCheck", "ImageCheck", "ScriptCheck"] on ["./_site/blog/2017/08/28/"] on *.html... 

Checking 37 external links...
Ran on 1 file!

- ./_site/blog/2017/08/28/gsoc-connecting-contributors-with-projects.html
  *  External link https://docs.github.com/en/graphql failed: 403 No error
  *  External link https://docs.github.com/en/rest failed: 403 No error

HTML-Proofer found 2 failures!

Which at least reproduces what we're seeing in CI on the GitHub Actions runner.

But why do some sites fail without the Accept-Encoding header and others fail with it? Using curl seems to work fine here on all with the header set. 🤷 I guess if I'm in the mood for a puzzle later I'll take a look.

griggt commented on September 4, 2024

Ah, so the problem seems to be that specifying --typhoeus-config on the command line discards all of html-proofer's default Typhoeus configuration. Mildly annoying that it doesn't just merge the new values into the defaults. So all the defaults need to be re-specified on the command line (as appropriate, of course). The defaults appear to be these:

https://github.com/gjtorikian/html-proofer/blob/1bab3a1a18e95a10378371ddf000df9bea01740e/lib/html-proofer/configuration.rb#L36-L44

    TYPHOEUS_DEFAULTS = {
      followlocation: true,
      headers: {
        'User-Agent' => "Mozilla/5.0 (compatible; HTML Proofer/#{HTMLProofer::VERSION}; +https://github.com/gjtorikian/html-proofer)",
        'Accept' => 'application/xml,application/xhtml+xml,text/html;q=0.9, text/plain;q=0.8,image/png,*/*;q=0.5'
      },
      connecttimeout: 10,
      timeout: 30
    }

I tried a full htmlproofer run locally with all these settings + the Accept-Encoding header and it succeeded.

$ bundle exec htmlproofer ./_site/ --external_only --only-4xx --http-status-ignore "400,401,429" --empty-alt-ignore --allow-hash-href --url-ignore "/trends.google.com/,/pgp.mit.edu/,/www.oracle.com/,/scalafiddle.io/" --typhoeus-config='{"headers":{"Accept-Encoding":"gzip, deflate", "Accept":"application/xml,application/xhtml+xml,text/html;q=0.9, text/plain;q=0.8,image/png,*/*;q=0.5", "User-Agent":"Mozilla/5.0 (compatible; HTML Proofer/#{HTMLProofer::VERSION}; +https://github.com/gjtorikian/html-proofer)"}, "followlocation":"true", "connecttimeout":"10", "timeout":"30"}' 
Running ["LinkCheck", "ImageCheck", "ScriptCheck"] on ["./_site/"] on *.html... 

Checking 4974 external links...
Ran on 367 files!

HTML-Proofer finished successfully.
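The difference between replacing the defaults and merging into them can be sketched in plain Ruby. This is an illustration only, not html-proofer's own code: `deep_merge` is a hypothetical helper, and `proofer_version` stands in for `HTMLProofer::VERSION` (3.10.2 is the version reported in the CI log earlier in this thread). Generating the JSON from Ruby also gets the version interpolated into the User-Agent, which a single-quoted shell string does not do with a literal `#{HTMLProofer::VERSION}`.

```ruby
require 'json'

# Stand-in for HTMLProofer::VERSION (assumption: the 3.10.2 seen in the CI log).
proofer_version = '3.10.2'

# The defaults quoted above, with the version interpolated by Ruby.
typhoeus_defaults = {
  followlocation: true,
  headers: {
    'User-Agent' => "Mozilla/5.0 (compatible; HTML Proofer/#{proofer_version}; +https://github.com/gjtorikian/html-proofer)",
    'Accept' => 'application/xml,application/xhtml+xml,text/html;q=0.9, text/plain;q=0.8,image/png,*/*;q=0.5'
  },
  connecttimeout: 10,
  timeout: 30
}

# Hypothetical helper: recursively merge overrides into defaults so that
# nested hashes (such as :headers) are combined instead of replaced.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# What was effectively passed on the command line at first.
overrides = { headers: { 'Accept-Encoding' => 'gzip, deflate' } }

# Replacing: only the override survives, and every default is lost.
replaced = overrides

# Merging: the defaults are kept and Accept-Encoding is added to the headers.
merged = deep_merge(typhoeus_defaults, overrides)

# JSON that could be passed as the --typhoeus-config value.
puts JSON.generate(merged)
```

With the merged hash, the User-Agent, Accept, redirect, and timeout settings all survive alongside the new header, which is what the full command line above re-specifies by hand.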

griggt commented on September 4, 2024

#1380

SethTisue commented on September 4, 2024

merged... we'll see what happens in the next scheduled run...

SethTisue commented on September 4, 2024

https://github.com/scala/scala-lang/runs/6315550685

- ./_site/2019/12/18/road-to-scala-3.html
  *  External link https://docs.scala-lang.org/scala3/reference/metaprogramming.html failed: 404 No error
- ./_site/2020/11/06/explicit-term-inference-in-scala-3.html
  *  External link https://docs.scala-lang.org/scala3/reference/contextual.html failed: 404 No error
- ./_site/community/index.html
  *  External link https://groups.google.com/g/scala-announce failed: 403 No error
  *  External link https://groups.google.com/g/scala-tools failed: 403 No error

julienrf commented on September 4, 2024

Those are due to the new reference documentation (https://docs.scala-lang.org/scala3/reference). They would be fixed by scala/scala3#15118

julienrf commented on September 4, 2024

Last night's run succeeded: https://github.com/scala/scala-lang/runs/6330776484

SethTisue commented on September 4, 2024

thanks all for the group effort here!
