Comments (10)
I like this idea. I just don't know how exactly one would pass all the possible header fields to mlc. Via a command-line argument?
from mlc.
Probably the best option would be a config file, otherwise it would be impractical to specify different headers for different URLs.
See for example:
https://github.com/orgs/github-community/discussions/14773#discussioncomment-2679987
https://github.com/tcort/markdown-link-check#config-file-format
from mlc.
I think your pipeline has been hit by this bug:
https://github.com/becheran/mlc/actions/runs/3559864946/jobs/5979511630
[Err ] ./README.md (62, 22) => https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions - 403 - Forbidden
Error: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions. 403 - Forbidden
from mlc.
@diegorondini fun fact: It does not fail when I run it locally. Does GitHub somehow block requests to docs.github.com from its own runners? You mentioned missing request parameters; which ones would those be in this case?
from mlc.
@becheran I think the first question is why the pipeline checks that link even though there's no such link in README.md:
$ grep 'docs\.github' README.md
Returning to this bug, docs.github.com requires the Accept-Encoding: zstd, br, gzip, deflate header:
$ curl -i -X GET https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions
HTTP/2 403
x-azure-ref: 0wn2EYwAAAACr4P2HgpUzTatC1/nj5XnyTU5aMjIxMDYwNjEzMDIxADU5NmQ3OGEyLWNhNWYtNDc5ZC1iY2RjLTA4MzU4MzMxNzRiMg==
accept-ranges: bytes
via: 1.1 varnish, 1.1 varnish
date: Mon, 28 Nov 2022 09:22:10 GMT
x-served-by: cache-iad-kiad7000135-IAD, cache-mrs10563-MRS
x-cache: MISS, MISS
x-cache-hits: 0, 0
x-timer: S1669627330.213655,VS0,VE92
strict-transport-security: max-age=31557600
$ curl -i -H "Accept-Encoding: zstd, br, gzip, deflate" -X GET https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions
HTTP/2 200
cache-control: public, max-age=60
content-type: text/html; charset=utf-8
access-control-allow-origin: *
content-security-policy: default-src 'none';prefetch-src 'self';connect-src 'self';font-src 'self' data: githubdocs.azureedge.net;img-src 'self' github.com *.github.com *.githubusercontent.com *.githubassets.com data: githubdocs.azureedge.net placehold.it;object-src 'self';script-src 'self' data: githubdocs.azureedge.net;frame-src 'self' github.com *.github.com *.githubusercontent.com *.githubassets.com https://www.youtube-nocookie.com;frame-ancestors 'self' github.com *.github.com *.githubusercontent.com *.githubassets.com;style-src 'self' 'unsafe-inline' data: githubdocs.azureedge.net;child-src 'self';upgrade-insecure-requests;base-uri 'self';form-action 'self';script-src-attr 'none'
cross-origin-opener-policy: same-origin
cross-origin-resource-policy: same-origin
x-dns-prefetch-control: off
x-frame-options: SAMEORIGIN
x-download-options: noopen
x-content-type-options: nosniff
origin-agent-cluster: ?1
x-permitted-cross-domain-policies: none
referrer-policy: strict-origin-when-cross-origin
x-xss-protection: 0
x-powered-by: Next.js
x-azure-ref: 0hXyEYwAAAADMF8jkAx/XToTRxIg5u1m/UEhMMzBFREdFMDMxOQA1OTZkNzhhMi1jYTVmLTQ3OWQtYmNkYy0wODM1ODMzMTc0YjI=
content-encoding: br
via: 1.1 varnish, 1.1 varnish
accept-ranges: bytes
date: Mon, 28 Nov 2022 09:22:29 GMT
age: 335
x-served-by: cache-iad-kiad7000135-IAD, cache-mrs10583-MRS
x-cache: CONFIG_NOCACHE, HIT, HIT
x-cache-hits: 3, 1
x-timer: S1669627349.305248,VS0,VE1
vary: Accept-Encoding
strict-transport-security: max-age=31557600
content-length: 38324
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
from mlc.
Sorry, I just realized I should have checked out the github-action-output branch.
Now it fails for me as well with 0.15.4:
$ mlc ./README.md
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ +
+ markup link checker - mlc v0.15.4 +
+ +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
09:31:29 [WARN] Broken reference link: Borrowed("possible values: md, html")
09:31:29 [WARN] Strip everything after #. The chapter part '#ci-pipeline-integration' is not checked.
[ OK ] ./README.md (19, 8) => #ci-pipeline-integration -
[ OK ] ./README.md (64, 1) => ./docs/FailingAnnotation.PNG -
[ OK ] ./README.md (32, 28) => https://doc.rust-lang.org/cargo/ -
[ OK ] ./README.md (4, 2) => https://badgen.net/crates/d/mlc?color=blue -
[ OK ] ./README.md (46, 56) => https://github.com/marketplace/actions/markup-link-checker-mlc -
[ OK ] ./README.md (20, 29) => https://rust-lang.github.io/async-book/ -
[ OK ] ./README.md (3, 2) => https://img.shields.io/crates/v/mlc.svg?color=orange -
[ OK ] ./README.md (9, 1) => https://asciinema.org/a/299100 -
[ OK ] ./README.md (9, 2) => https://asciinema.org/a/299100.svg -
[ OK ] ./README.md (6, 2) => https://img.shields.io/badge/License-MIT-yellow.svg -
[ OK ] ./README.md (5, 2) => https://github.com/becheran/mlc/actions/workflows/rust.yml/badge.svg -
[ OK ] ./README.md (7, 2) => https://img.shields.io/badge/PRs-welcome-brightgreen.svg -
[ OK ] ./README.md (3, 1) => https://crates.io/crates/mlc -
[ OK ] ./README.md (4, 1) => https://crates.io/crates/mlc -
[ OK ] ./README.md (32, 92) => https://crates.io/crates/mlc -
[Err ] ./README.md (62, 22) => https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions - 403 - Forbidden
[ OK ] ./README.md (144, 60) => https://github.com/becheran/mlc/blob/master/LICENSE -
[ OK ] ./README.md (75, 32) => https://github.com/becheran/ntest/blob/master/.github/workflows/ci.yml -
[ OK ] ./README.md (79, 37) => https://hub.docker.com/repository/docker/becheran/mlc -
[ OK ] ./README.md (140, 14) => https://github.com/becheran/mlc/blob/master/CHANGELOG.md -
[ OK ] ./README.md (6, 1) => https://opensource.org/licenses/MIT -
[ OK ] ./README.md (112, 221) => https://github.com/becheran/wildmatch -
[ OK ] ./README.md (40, 54) => https://github.com/becheran/mlc/releases -
[ OK ] ./README.md (5, 1) => https://github.com/becheran/mlc/actions/workflows/rust.yml -
[ OK ] ./README.md (7, 1) => https://github.com/becheran/mlc/blob/master/CONTRIBUTING.md -
Result (25 links):
OK 24
Skipped 0
Warnings 0
Errors 1
The following links could not be resolved:
./README.md (62, 22) => https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions.
from mlc.
Ah, right. I made the same mistake and ran it on the wrong branch locally 🤦♂️
from mlc.
@diegorondini would 'Accept-Encoding: *' help in this case? It might be a sane default since we don't care about the content anyway right now.
To make it configurable, I think a map of link patterns (with wildcards) to their associated headers would make sense as a config parameter. Will think about it.
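As a rough illustration of the "map of links with wildcards and associated headers" idea, here is a minimal std-only Rust sketch. Everything in it is an assumption made for illustration: the `HeaderRule` struct, the `headers_for` and `example_rules` helpers, the rule patterns, and the `<token>` placeholder are hypothetical and are not part of mlc or its config format. The hand-written `wildcard_match` only supports `*` and merely stands in for whatever matcher mlc would actually use.

```rust
use std::collections::HashMap;

/// Hypothetical config entry: a URL pattern plus the headers to send for matching links.
struct HeaderRule {
    pattern: String,
    headers: Vec<(String, String)>,
}

/// Returns true if `text` matches `pattern`, where `*` matches any run of characters.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    let mut star: Option<usize> = None; // index of the most recent `*` in the pattern
    let mut mark = 0; // text index to retry from when backtracking to that `*`
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Let the last `*` absorb one more character and try again.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern may match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Collects the headers of every rule whose pattern matches `url`.
/// Header names are lowercased; later rules override earlier ones.
fn headers_for(rules: &[HeaderRule], url: &str) -> HashMap<String, String> {
    let mut out = HashMap::new();
    for rule in rules {
        if wildcard_match(&rule.pattern, url) {
            for (name, value) in &rule.headers {
                out.insert(name.to_ascii_lowercase(), value.clone());
            }
        }
    }
    out
}

/// Example rules (assumed values, for illustration only).
fn example_rules() -> Vec<HeaderRule> {
    vec![
        HeaderRule {
            pattern: "https://docs.github.com/*".to_string(),
            headers: vec![("Accept-Encoding".to_string(), "br, gzip, deflate".to_string())],
        },
        HeaderRule {
            pattern: "https://example.com/private/*".to_string(),
            headers: vec![("Authorization".to_string(), "Bearer <token>".to_string())],
        },
    ]
}
```

With rules like these, `headers_for(&example_rules(), url)` would return the extra headers for a docs.github.com link and an empty map for links that match no pattern, so the default request stays unchanged for everything else.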
from mlc.
@becheran well, not literally:
$ curl -i -H "Accept-Encoding: *" -X GET https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions
HTTP/2 403
[...]
The official way to mean any encoding should be Accept-Encoding: */*, but I don't know how well it works in practice.
https://stackoverflow.com/questions/25182888/does-in-an-http-accepts-encoding-header-mean-gzip-is-supported
The library you're using (reqwest?) may support accepting all encodings. Libcurl does that:
https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
Not sure though if servers that don't support compression / encoding peacefully decline the "Accept-Encoding" header.
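For reference, here is a small std-only Rust sketch of how a server that follows the HTTP spec might decide whether a content coding is acceptable for a given Accept-Encoding value. It reflects my reading of RFC 9110 (section 12.5.3), not what docs.github.com actually does; the function name `is_coding_acceptable` is made up for this example, and the parsing is deliberately simplified. Under that reading the wildcard for this header is a bare `*`, while `*/*` is the media-range wildcard of the Accept header and is treated here as just an unknown coding name.

```rust
/// Simplified check, assuming RFC 9110 semantics: is `coding` acceptable
/// for the given Accept-Encoding field value?
/// An absent header (a different case, not modelled here) accepts any coding.
fn is_coding_acceptable(accept_encoding: &str, coding: &str) -> bool {
    let coding = coding.trim().to_ascii_lowercase();
    let mut wildcard_q: Option<f32> = None;

    for item in accept_encoding.split(',') {
        let mut parts = item.split(';');
        let name = parts.next().unwrap_or("").trim().to_ascii_lowercase();
        if name.is_empty() {
            continue;
        }
        // The weight defaults to 1 when no `q=` parameter is given.
        let mut q = 1.0_f32;
        for param in parts {
            let param = param.trim().to_ascii_lowercase();
            if let Some(v) = param.strip_prefix("q=") {
                if let Ok(parsed) = v.trim().parse::<f32>() {
                    q = parsed;
                }
            }
        }
        if name == coding {
            // An explicit entry wins; q=0 means "not acceptable".
            return q > 0.0;
        }
        if name == "*" {
            wildcard_q = Some(q);
        }
    }

    match wildcard_q {
        // `*` covers every coding that is not listed explicitly.
        Some(q) => q > 0.0,
        // Without a wildcard, only the unencoded form ("identity") is implied.
        None => coding == "identity",
    }
}
```

By this logic `is_coding_acceptable("*", "br")` is true, so a strictly conforming server could answer `Accept-Encoding: *` with brotli. The 403 shown above suggests the server in front of docs.github.com applies its own rules rather than this one.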
from mlc.
Yes, I am using reqwest. I turned on all supported encodings (brotli, gzip, deflate) and that did the trick for now. But I guess there are other cases where a custom request is still required, for example if an authentication token is required for a specific link.
from mlc.