Comments (13)

whymarrh commented on July 28, 2024

raw.githubusercontent.com doesn't appear to send the X-Ratelimit-Limit (and related) headers:

$ headers 'https://raw.githubusercontent.com/MetaMask/eth-phishing-detect/master/src/config.json'
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 280842
Cache-Control: max-age=300
Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
Content-Type: text/plain; charset=utf-8
ETag: W/"a05d16ca5ad05fef6b1c776695c0957a3080ac7c7b56c7b1d371d1a24b3055ed"
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
Via: 1.1 varnish (Varnish/6.0)
X-GitHub-Request-Id: A632:019D:34C114:3DD568:5EB33C0C
Accept-Ranges: bytes
Date: Wed, 06 May 2020 22:37:33 GMT
Via: 1.1 varnish
X-Served-By: cache-lga21936-LGA
X-Cache: HFM, HIT
X-Cache-Hits: 0, 1
X-Timer: S1588804654.658818,VS0,VE1
Vary: Authorization,Accept-Encoding
Access-Control-Allow-Origin: *
X-Fastly-Request-ID: 28b69360a54a75e18172f59fc5ace9ee6cf794c1
Expires: Wed, 06 May 2020 22:42:33 GMT
Source-Age: 33

Compared to:

$ headers 'https://api.github.com/repos/MetaMask/eth-phishing-detect/contents/src/config.json'
HTTP/1.1 200 OK
server: GitHub.com
date: Wed, 06 May 2020 22:37:46 GMT
content-type: application/json; charset=utf-8
status: 200 OK
cache-control: public, max-age=60, s-maxage=60
vary: Accept, Accept-Encoding, Accept, X-Requested-With
etag: W/"b750fb0d85d534231b657fc4701b389e790f2e28"
last-modified: Tue, 05 May 2020 15:33:06 GMT
x-github-media-type: github.v3; format=json
access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset
access-control-allow-origin: *
strict-transport-security: max-age=31536000; includeSubdomains; preload
x-frame-options: deny
x-content-type-options: nosniff
x-xss-protection: 1; mode=block
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
content-security-policy: default-src 'none'
X-Ratelimit-Limit: 60
X-Ratelimit-Remaining: 55
X-Ratelimit-Reset: 1588806021
Accept-Ranges: bytes
Transfer-Encoding: chunked
X-GitHub-Request-Id: CF4F:3090:1143BB:2C30CC:5EB33C3A
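
As an illustration, the rate-limit headers in the API response above could be read as follows. This is only a minimal sketch: the helper name `seconds_until_reset` is hypothetical, and it assumes `X-Ratelimit-Reset` is a Unix timestamp in seconds, as in the output shown.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return how long to wait before the next request.

    0.0 means requests remain in the current window; None means the
    response carried no rate-limit headers at all (as with the
    raw.githubusercontent.com response above).
    """
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-ratelimit-remaining" not in lowered or "x-ratelimit-reset" not in lowered:
        return None

    remaining = int(lowered["x-ratelimit-remaining"])
    reset_at = int(lowered["x-ratelimit-reset"])  # Unix epoch seconds
    if remaining > 0:
        return 0.0

    current = time.time() if now is None else now
    return float(max(0, reset_at - current))
```

With the headers shown above (55 requests remaining), this returns 0.0; for the raw.githubusercontent.com response it returns None, which is the point of the comparison.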

from core.

whymarrh commented on July 28, 2024

That's unfortunate. It does still have an ETag though, so we can at least ensure we don't waste bandwidth when nothing changed.

Interesting, sending If-None-Match works via cURL:

curl -i 'https://raw.githubusercontent.com/MetaMask/eth-phishing-detect/master/src/config.json' \
  -H 'Connection: keep-alive' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36' \
  -H 'Accept: */*' \
  -H 'Sec-Fetch-Site: none' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Accept-Language: en-US,en;q=0.9' \
  -H 'If-None-Match: W/"a05d16ca5ad05fef6b1c776695c0957a3080ac7c7b56c7b1d371d1a24b3055ed"' \
  --compressed
HTTP/1.1 304 Not Modified
Connection: keep-alive
Date: Wed, 06 May 2020 23:05:14 GMT
Via: 1.1 varnish
Cache-Control: max-age=300
ETag: W/"a05d16ca5ad05fef6b1c776695c0957a3080ac7c7b56c7b1d371d1a24b3055ed"
X-Served-By: cache-lga21969-LGA
X-Cache: HIT
X-Cache-Hits: 1
X-Timer: S1588806315.658028,VS0,VE91
Vary: Authorization,Accept-Encoding
Access-Control-Allow-Origin: *
X-Fastly-Request-ID: 3c9fecc3996e1215f9b5ef2dcd2694210ac88a2c
Expires: Wed, 06 May 2020 23:10:14 GMT
Source-Age: 0

But not via fetch. 🤔

I'm seeing the If-None-Match header being sent but the browser getting a 200 from the server instead of a 304. Can you confirm that the browser sees a 304?

whymarrh commented on July 28, 2024

We should be able to patch this via #244

whymarrh commented on July 28, 2024

We could use the Contents endpoint, but we'd have to decode the Base64-encoded body, which seems expensive.

curl -i https://api.github.com/repos/MetaMask/eth-phishing-detect/contents/src/config.json
{
  "name": "config.json",
  "path": "src/config.json",
  "sha": "b750fb0d85d534231b657fc4701b389e790f2e28",
  "size": 280842,
  "url": "https://api.github.com/repos/MetaMask/eth-phishing-detect/contents/src/config.json?ref=master",
  "html_url": "https://github.com/MetaMask/eth-phishing-detect/blob/master/src/config.json",
  "git_url": "https://api.github.com/repos/MetaMask/eth-phishing-detect/git/blobs/b750fb0d85d534231b657fc4701b389e790f2e28",
  "download_url": "https://raw.githubusercontent.com/MetaMask/eth-phishing-detect/master/src/config.json",
  "type": "file",
  "content": "ewogICJ2ZX...K\n",
  "encoding": "base64",
  "_links": {
    "self": "https://api.github.com/repos/MetaMask/eth-phishing-detect/contents/src/config.json?ref=master",
    "git": "https://api.github.com/repos/MetaMask/eth-phishing-detect/git/blobs/b750fb0d85d534231b657fc4701b389e790f2e28",
    "html": "https://github.com/MetaMask/eth-phishing-detect/blob/master/src/config.json"
  }
}
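
For reference, the decoding step would look roughly like this in Python. It is a sketch under assumptions: the function name `decode_contents_response` is hypothetical, and it relies only on the `content` and `encoding` fields visible in the response above.

```python
import base64
import json


def decode_contents_response(payload: str) -> bytes:
    """Extract the file bytes from a Contents API JSON response body."""
    doc = json.loads(payload)
    if doc.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {doc.get('encoding')!r}")
    # The "content" field is Base64 text with embedded newlines;
    # b64decode skips them in its default (non-validating) mode.
    return base64.b64decode(doc["content"])
```

The extra cost is one JSON parse of the envelope plus the Base64 decode, on top of parsing the decoded file itself.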

whymarrh commented on July 28, 2024

The https://api.infura.io/v2/blacklist endpoint is just a proxy of https://raw.githubusercontent.com/409H/EtherAddressLookup/master/blacklists/domains.json and https://raw.githubusercontent.com/MetaMask/eth-phishing-detect/master/src/config.json. We should request these files from GitHub directly to reduce our Infura traffic.

Note that https://api.infura.io/v1/blacklist is the former and https://api.infura.io/v2/blacklist is the latter. I don't know whether we use /v1/blacklist.

Gudahtt commented on July 28, 2024

I suspect that raw.githubusercontent.com is rate-limited in the same manner as unauthenticated API requests (as described here).

If raw.githubusercontent.com supports either the If-Modified-Since or If-None-Match headers (as described here), we could use raw.githubusercontent.com directly without generating much additional GitHub traffic by taking advantage of their own caching infrastructure.
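
A conditional request of this kind might be sketched as follows in Python. Everything named here (`CachedResource`, `conditional_get`, and the injected `send` callable) is hypothetical; the transport is passed in so the sketch makes no assumption about which HTTP client is used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional, Tuple

# A response is modelled as (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]
Sender = Callable[[str, Dict[str, str]], Response]


@dataclass
class CachedResource:
    etag: Optional[str] = None
    body: Optional[bytes] = None


def conditional_get(url: str, cache: CachedResource, send: Sender) -> bytes:
    """Fetch url, reusing the cached body when the server answers 304."""
    request_headers: Dict[str, str] = {}
    if cache.etag is not None and cache.body is not None:
        # Ask the server to skip the body if our copy is still current.
        request_headers["If-None-Match"] = cache.etag

    status, response_headers, body = send(url, request_headers)

    if status == 304 and cache.body is not None:
        return cache.body
    if status == 200:
        lowered = {k.lower(): v for k, v in response_headers.items()}
        cache.etag = lowered.get("etag")
        cache.body = body
        return body
    raise RuntimeError(f"unexpected status {status} for {url}")
```

The first call stores the ETag and body; later calls send `If-None-Match` and only download the file again when the server replies with 200.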

Gudahtt commented on July 28, 2024

That's unfortunate. It does still have an ETag though, so we can at least ensure we don't waste bandwidth when nothing changed.

I'd feel better if there was an explicit policy published on how we're allowed to use raw.githubusercontent.com, but it still seems like a decent option. The blacklist doesn't change that often. I doubt that traffic from our users would make much of a difference to them, and I doubt they'd block our traffic.

Gudahtt commented on July 28, 2024

Seems to work for me:

[Screenshot: etag]

(Not pictured is the ETag in the first response, but it was indeed the same value that I used for the If-None-Match header).

whymarrh commented on July 28, 2024

w3c/ServiceWorker#412 describes the behaviour I was seeing.

Gudahtt commented on July 28, 2024

I had asked GitHub support about whether content served from raw.githubusercontent.com was rate-limited, and got this response:

You should not rely on any rate-limiting behavior there. This doesn't mean that there aren't any, but instead that you could get rate limited at any time. Those endpoints are not intended for programmatic use -- if you need that, you should use the API which has well-defined rate limits and well-defined caching behavior.

For example, you could use this:

https://developer.github.com/v3/repos/contents/#get-repository-content

whymarrh commented on July 28, 2024

Yeah, that's what I would expect them to say. (And 100% the correct response.)

The only tangible thing preventing us from using the proper API is that we'd have to decode the body.

Gudahtt commented on July 28, 2024

I believe we could skip the decoding step by requesting the raw format via the Accept header, like this:

curl -H 'Accept: application/vnd.github.v3.raw+json' https://api.github.com/repos/MetaMask/eth-phishing-detect/contents/src/config.json

This is explained more here: https://developer.github.com/v3/media/
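
The same request could be sketched in Python with the standard library. The function names `build_raw_contents_request` and `fetch_raw_contents` are hypothetical, and the media type is simply the one from the curl command above.

```python
import urllib.request

RAW_MEDIA_TYPE = "application/vnd.github.v3.raw+json"


def build_raw_contents_request(owner: str, repo: str, path: str) -> urllib.request.Request:
    """Build a Contents API request that asks for the file body directly."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    return urllib.request.Request(url, headers={"Accept": RAW_MEDIA_TYPE})


def fetch_raw_contents(owner: str, repo: str, path: str) -> bytes:
    """Send the request and return the response body (performs network I/O)."""
    request = build_raw_contents_request(owner, repo, path)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```

Building the request is kept separate from sending it so the headers can be inspected without touching the network.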

whymarrh commented on July 28, 2024

Oh, nice! We should definitely use that instead.
