Check out our website at ipfs.tech.
For papers on IPFS, please see the Academic Papers section of the IPFS Docs.
MIT.
a vendor-agnostic gateway conformance test suite for implementers of IPFS Gateways to ensure compliance with https://specs.ipfs.tech/http-gateways/
Home Page: https://specs.ipfs.tech/http-gateways/
License: Other
See the changes in #87
We used fixtures + libraries to generate expectation values (like an expected payload or expected CIDs).
But in some (most?) cases, this relies on the fact that the library itself is correct.
Take a test that verifies the ordering of blocks in a CAR produced by a gateway: if we use go-ipld-xyz to load the CAR and generate the array of blocks, then we test against the library, not against the spec.
The other option is to hardcode results (like arrays of CIDs), which is okay up to the point where we have dozens of long lists of random strings and one test breaks or we have to update it.
Ideally we'd have some way to describe CIDs & other payloads like documentation and load these in the code, similar to how we do DNS link configurations.
Build
String Formatting:
Equal vs Equalf
) to help with testing URL-escaped strings: /ipfs/{{CID}}/%c4%85/%c4%99
and not have to deal with escaping strings
Usage:
Fixtures:
Debugging
Documentation
We have a system where each check on headers is registered as a Go test case:
https://github.com/ipfs/gateway-conformance/blob/17cbde1ec1968884336d5b2ceb415b63ccbfd62a/tooling/test/test.go#LL147C19-L147C19
It would be nice to have the same thing for the check on Body and Status Code
That would give us the granularity to temporarily disable a check like we do for Kubo's Content Length:
Use case:
#65
Extracting this issue out of #8
At the moment, we configure Kubo's gateway with inlining on both localhost and subdomains gateways:
This test might mean running the conformance test suite twice, once against an inlining gateway and once (with different specs) against a gateway running a different configuration.
Query is percent-encoding values, so entity-bytes=2200:* ends up as 2200%3A%2A:
GET /ipfs/QmYhmPjhFjYFyaoiuNzYv8WGavpSRDwdHWe5B4M5du5Rtk?dag-scope=entity&entity-bytes=2200%3A%2A HTTP/1.1
while it should be:
GET /ipfs/QmYhmPjhFjYFyaoiuNzYv8WGavpSRDwdHWe5B4M5du5Rtk?dag-scope=entity&entity-bytes=2200:* HTTP/1.1
@laurentsenta escaping is fine most of the time, but for this specific case, maybe we could add QueryRaw for cases like this one? I suspect this happens to work only because boxo/gateway does percent-decoding by default too, but other implementations may not do this extra step for this parameter.
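The difference between the two request lines can be reproduced with Go's standard library alone. This is a minimal sketch, not the suite's code: `encodedQuery` and `rawRequestURI` are hypothetical helper names, and the second one only illustrates what a QueryRaw-style option might emit.

```go
package main

import (
	"fmt"
	"net/url"
)

// cid is the CID used in the request lines quoted above.
const cid = "QmYhmPjhFjYFyaoiuNzYv8WGavpSRDwdHWe5B4M5du5Rtk"

// encodedQuery builds the query through url.Values, which percent-encodes
// reserved characters such as ':' and '*'.
func encodedQuery() string {
	q := url.Values{}
	q.Set("dag-scope", "entity")
	q.Set("entity-bytes", "2200:*")
	return q.Encode()
}

// rawRequestURI sets RawQuery verbatim, so the value is sent as written.
// This is what a hypothetical QueryRaw helper could do internally.
func rawRequestURI() string {
	u := url.URL{
		Path:     "/ipfs/" + cid,
		RawQuery: "dag-scope=entity&entity-bytes=2200:*",
	}
	return u.RequestURI()
}

func main() {
	fmt.Println(encodedQuery())
	fmt.Println(rawRequestURI())
}
```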
Does the workflow "write a test in the implementation, run it for a while, contribute it back to gateway conformance" work with our current setup? If so, let's describe how to do it. Otherwise, what would it take to get us there? I think it could lower the bar to start writing gateway conformance tests.
Tagging @laurentsenta to get his thoughts on this.
This is not part of the specs (but we could add this to https://specs.ipfs.tech/http-gateways/path-gateway/#appendix-notes-for-implementers), just a quality-of-life improvement we've introduced in ipfs/kubo#9785 to support non-CIDv1, legacy peerids (RSA / ed25519) encoded as Multihash in Base58 in a way that redirects the user to a canonical URL.
Brought up by @guanzo:
I'm getting this error merkledag node was not a directory or shard when I try hamtFixture.MustGetChildrenCids("sub1", "hello.txt")
The error occurs here: gateway-conformance/tooling/car/unixfs.go, line 118 in 5f2bc9c
dag structure of sub1/hello.txt
guanzo@saturn-node:~$ ipfs dag get Qmb7KRN5qCAwTYXAdTd5JHzXXQv3BDRJQhcEuMJzdiGix6 | jq
{
"Data": {
"/": {
"bytes": "CAUSHIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAoIjCAAg"
}
},
"Links": [
{
"Hash": {
"/": "QmZgfvZtoFdbJy4JmpPHc1NCXyA7Snim2L8e6zKspiUzhu"
},
"Name": "DFhello.txt",
"Tsize": 117
}
]
}
guanzo@saturn-node:~$ ipfs dag get QmZgfvZtoFdbJy4JmpPHc1NCXyA7Snim2L8e6zKspiUzhu | jq
{
"Data": {
"/": {
"bytes": "CAIYByAFIAI"
}
},
"Links": [
{
"Hash": {
"/": "QmaATBg1yioWhYHhoA8XSUqD1Ya91KKCibWVD4USQXwaVZ"
},
"Name": "",
"Tsize": 13
},
{
"Hash": {
"/": "QmdQEnYhrhgFKPCq5eKc7xb1k7rKyb3fGMitUPKvFAscVK"
},
"Name": "",
"Tsize": 10
}
]
}
I suspect this is because we're calling getChildren on a small file
https://specs.ipfs.tech/http-gateways/trustless-gateway/ does not allow paths for single blocks, but we want to support this on the deserialized path gateway as it enables apps to fetch data via relative pathing (enables things like "autocodec" in HTML/JS+WASM).
Fix here would be to move the ?format=raw tests that use a path out of #74 and run them only when path-gateway or subdomain-gateway is enabled.
We test different kinds of proxy and tunnel behavior, but we don't test HTTP/1.0 behavior.
See the discussion in: #9 (comment)
Once we reach v1 of the test suite, we'll revisit this issue.
Contributes to #8
main
To complete porting the 114 test, we also need to support the dns-link fixture.
After talking with @lidel these are excellent candidates for the conformance test suite:
See that test:
https://github.com/ipfs/gateway-conformance/actions/runs/4938039926/jobs/8827321519
CI is red as expected, but the summary doesn't show any errors despite:
=== NAME TestGatewayTar
=== RUN TestGatewayTar/GET_TAR_has_expected_root_directory
panic: test timed out after 10m0s
running tests:
TestGatewayTar (9m52s)
TestGatewayTar/GET_TAR_has_expected_root_directory (9m52s)
goroutine 1028 [running]:
testing.(*M).startAlarm.func1()
/usr/local/go/src/testing/testing.go:2241 +0x3c5
created by time.goFunc
/usr/local/go/src/time/sleep.go:176 +0x32
We'll have to prepare for test cases where we need to have requests and checks that depend on the result of other requests.
For example:
A test wants to query an endpoint, retrieve the ETag from the response, then check that subsequent calls to the same endpoint reusing the ETag value return a 304 - Not Modified.
We used a minimal approach to implement these tests:
- define empty variable "etag"
- call Run(test1), which has a Function Check which updates "etag" with a side effect
- call Run(test2, test3) which reuses "etag".
@galargh investigated a few approaches:
We used a minimal approach instead (PR), which does the side-effect explicitly, without endorsing it in the API.
That won't scale to more complex test cases, and we should be ready to implement a "better" approach, ideally, that should:
Using future values should work: a future might "taint" the values and templates where it is used (templating with a future returns a future). At evaluation time, we can do a topological sort to find the order of execution.
Refs:
We had a few back and forths with test structures, spec names, etc.
See https://github.com/ipfs/gateway-conformance/pull/79/files, #92 (comment), and #87 (comment).
Could we find a "definitive" structure for organizing tests and specs?
What if we used the spec id, or the PR to the spec repo, as the test name? Would that work?
Take cors, we could have cors-000 (before specification, tests coming from sharness), then cors-pr-423, etc.
The current "mainstream" version of cors would be a spec group with all the specs that compose a regular cors gateway.
We could even pin these to dates: cors-2023 would be all the specs active during 2023.
End goal is:
https://github.com/ipfs/bifrost-gateway/actions/runs/4511354942 <- we executed 4 subdomain gateway test cases and then stopped. We should continue with other tests.
Priority: low
The below is copied from #9 (comment)
Wouldn't it make sense to always use what we now call subdomain-url? As in, all the tests should talk to gateway-url in a way that makes the gateway think it's called subdomain-url? Theoretically, wouldn't that be more correct? So really, every test case should be executed 3 times (with host header, proxy, proxy with tunneling) as long as subdomain-url and gateway-url are different.
Looking more closely at how we execute the test cases with host header, I think we're not using the protocol part of gateway-url. Is that correct?
If so, isn't what we really need from the user these two pieces of info:
What are you calling/want to call your gateway? Please give us the full URL with a domain name. That would be https://dweb.link/ for example or http://example.com/. You could give us http://127.0.0.1:8080/ here as well BUT there's no way for you to support subdomains on that URL so you'll have to disable the subdomain-gateway spec.
Where is your gateway hosted? Please give us the host we should contact when we're making requests to the URL you provided in the previous step. If you don't, we're just going to use the host from the URL. It's fine if you give us a host with IP or a host with domain. This would be 127.0.0.1 for example or 127.0.0.1:8080 or foo.bar.example.com.
That would come together to something like this - main...gateway-host (note that I intentionally skipped modifying test/config code; I just wanted to get the idea across).
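One way to picture the two pieces of information in code: dial the host where the gateway runs, while keeping the public name in the Host header. This is a hedged sketch; `newGatewayRequest`, its parameters, and the placeholder path are hypothetical and do not reflect the linked branch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newGatewayRequest builds a GET request that is sent to gatewayHost but
// carries the host of gatewayURL in the Host header. When gatewayHost is
// empty, the host from gatewayURL is used for both.
func newGatewayRequest(gatewayURL, gatewayHost, path string) (*http.Request, error) {
	u, err := url.Parse(gatewayURL)
	if err != nil {
		return nil, err
	}
	publicHost := u.Host
	if gatewayHost != "" {
		u.Host = gatewayHost // where the connection is actually made
	}
	u.Path = path

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	req.Host = publicHost // what the gateway believes it is called
	return req, nil
}

func main() {
	req, err := newGatewayRequest("http://example.com/", "127.0.0.1:8080", "/ipfs/bafkqaaa")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String(), req.Host)
}
```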
There's absolutely no rush with replying to this. And it's completely possible that I'm missing the point a bit - still trying to wrap my head fully around this. I just want to understand what might be the most straightforward way for those running the tests to provide us all the information we need.
See for example: #105
Ideally we want to make sure the steps we document in the README are tested in CI,
the goal is to make sure the first-time user experience is stable.
(wip)
Shell
Check.
When we release the next version of gateway conformance, remember to:
Found this edge case in ipfs/bifrost-gateway#160
Content Paths with special characters are usually encoded before sending to the gateway.
However there is an edge case where a content path has already encoded parts.
We do not want it to be mangled during transport or by things like subdomain redirects.
There should be a conformance test based on the example below (we don't want to pull in the entire Wikipedia, but it's fine to reuse the single file from the /I directory as-is).
The original content path is /ipns/en.wikipedia-on-ipfs.org/I/Auditorio_de_Tenerife%2C_Santa_Cruz_de_Tenerife%2C_España%2C_2012-12-15%2C_DD_02.jpg.webp (note ñ in España%2C)
But the URL path used in HTTP request is percent-encoded:
curl -v "https://en-wikipedia--on--ipfs-org.ipns.dweb.link/I/Auditorio_de_Tenerife%252C_Santa_Cruz_de_Tenerife%252C_Espa%C3%B1a%252C_2012-12-15%252C_DD_02.jpg.webp"
Test with curl (incl. subdomain redirect):
curl -v -L "http://localhost:8081/ipns/en.wikipedia-on-ipfs.org/I/Auditorio_de_Tenerife%252C_Santa_Cruz_de_Tenerife%252C_Espa%C3%B1a%252C_2012-12-15%252C_DD_02.jpg.webp"
For now, I've added a basic unit test in ipfs/bifrost-gateway@982535a#diff-84d756d48159dd2e742a98880c23ce6d22bd80bf878409ddcf1d4dc81bc114acR38, but we should have a conformance test for this case.
We built this test suite out of Kubo's integration tests. Make sure tests are NOT overspecified for a single implementation.
ETA: 2023-12-01
See this comment from @darobin, similarly to #17 we should prepare for HTTP expectations as well.
The one (relatively small) thing I would flag is that you seem to be relying on string matching (equals/contains) but a lot of these values can be wilder (eg. case insensitive, there may be spaces, the value might have to be anchored at the start of the string, etc.). If your HTTP library normalises values, this probably helps a fair bit (worth checking), but it's likely that quite a few checks will need to go to regex land. This isn't a big deal and not a problem for the framework, I'm just flagging it as a potential problem that jumped out, probably worth addressing before too many tests are written.
reference: #2 (comment)
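As an illustration of the "regex land" point, a directive check on a header value can tolerate case and whitespace differences where plain contains would not. This is a sketch only; `hasImmutable` is a hypothetical helper and the Cache-Control values are made-up examples, not expectations from the suite.

```go
package main

import (
	"fmt"
	"regexp"
)

// immutableDirective matches the "immutable" directive as a whole,
// comma-delimited token, ignoring case and surrounding whitespace.
var immutableDirective = regexp.MustCompile(`(?i)(^|,)\s*immutable\s*(,|$)`)

// hasImmutable reports whether a Cache-Control value carries the directive.
func hasImmutable(cacheControl string) bool {
	return immutableDirective.MatchString(cacheControl)
}

func main() {
	for _, v := range []string{
		"public, max-age=29030400, immutable",
		"Public,Immutable",
		"public, max-age=60",
		"not-immutable-x",
	} {
		fmt.Printf("%q -> %v\n", v, hasImmutable(v))
	}
}
```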
We should tackle this after all the kubo sharness tests have been ported
FILL this in
This is about supporting ipfs/specs#412
This is just a stylistic nit (cc @laurentsenta @hacdias for visibility)
https://pkg.go.dev/net/http#CanonicalHeaderKey
Our tests should already use the canonical format, so that users who are in a rush and copy and paste blindly end up with the normalized version.
For example:
- Header("X-IPFS-Path")
+ Header("X-Ipfs-Path")
- Header("ETag")
+ Header("Etag")
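The canonical spellings above are what net/http produces on its own. A small sketch using http.CanonicalHeaderKey from the standard library; the `canonicalize` wrapper is just a hypothetical convenience for the example.

```go
package main

import (
	"fmt"
	"net/http"
)

// canonicalize returns the canonical form net/http uses for a header name.
func canonicalize(name string) string {
	return http.CanonicalHeaderKey(name)
}

func main() {
	for _, name := range []string{"X-IPFS-Path", "ETag", "content-type"} {
		fmt.Printf("%s -> %s\n", name, canonicalize(name))
	}
}
```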
(we had a quick sync today, but also had chat during Kubo standup, and wanted to drop info here, food for thought)
Quick feedback on building something that will last years and decades:
during the past 7+ years we had bad experiences with JS stack (NPM dependencies, to be specific), and having something leaner, ideally based on something future proof like stdlib or curl or libcurl would be easier to maintain and accept than JS libraries which will break in next 1-2 years and will suck time out of teams to fix/maintain (i know this sounds like a controversial take, but ask @SgtPooki how painful ipfs-webui update to the latest js-ipfs and ESM was :'( )
as some prior art in complexity reduction: we have some poc in https://github.com/ipfs/kubo/blob/master/test/cli/gateway_test.go where @guseggert ported sharness into go tests ( you can compare ipfs/kubo#9505).
If built-in client is not enough, we can use curl/libcurl (been around for 30 years, not going anywhere).
avoid using third-party JS libs, they are cool and maintained until they are not (request) and you are stuck with technical debt to refactor into something else.
i think it boils down to picking stable foundations, not specific language, so this is just a prompt for discussion. @darobin may have some useful lessons learned and rules of thumb around maintenance of something like WPT (which is, iirc, a bit of everything, JavaScript, Go, Python)
#56 introduced request-response tests for IPIP-402, including entity-bytes for range requests that return a minimal subset of blocks for the passed byte range:
Request().
Path("/ipfs/{{cid}}", subdirWithMixedBlockFiles.MustGetCidWithCodec(0x70, "subdir", "multiblock.txt")).
Query("format", "car").
Query("dag-scope", "entity").
Query("entity-bytes", "512:1023"),
Tests included in #56 will confirm the response includes the correct subset of blocks for the file.
entity-bytes
Because the fixture includes all blocks available for the multiblock.txt file, the compliance test will not detect when a backend is broken and downloads the entire thing from the beginning (0:1023) just to return a slice of that (512:1023).
Such an implementation will produce a valid response for cached data, but cache misses will be excruciatingly slow, and will usually time out.
If the file is a 4GiB video and the client requested 1MiB in the middle, nothing will happen for a long long time, because the backend will be busy retrieving 2GiB of data just to return the last 1MiB.
I believe the conformance suite should be testing this, because if our internal teams ended up with this footgun in Rhea/Saturn (brainstorming fix in filecoin-saturn/L1-node#415), we most likely will see more of this in the future.
We should have an additional test with partially-resolvable-1GiB-multiblock.bin, where only the root block + the child block responsible for the requested range are present in the CAR fixture.
This way, the test will hang/timeout when an incorrectly implemented backend tries to fetch blocks that are not related to the entity-bytes request.
The Kubo team needs a patch to gateway conformance testing:
#65
But we have many changes in main that are not ready to deploy. Ideally we'd have a release branch we can use to release a minor update without all the changes in main.
@hacdias suggested using http.StatusMovedPermanently instead of a raw status code like 301: #58 (comment)
Creating an issue here to agree & then rename all the status codes in a single PR if we want to.
Brought up by @lidel and @hacdias on Slack:
For the Etags specifically, the best IMO would be to move those tests to Boxo and remove them from both Kubo and the conformance. I think that's something we should agree on.
yeah, conformance is expecting boxo/kubo-specific Etag values (example) because MVP was to port sharness.
next step is to make these tests vendor-agnostic:
Conformance should not care what is inside of Etag, but must test Etag is present, and unique to specific resource+response type.
Boxo/Kubo can test specific values.
ipnsRecord := ipns.MustOpenRecord("t0124/key1.ipns-record")
ipnsRecord.TTL() // used to validate the cache output from API
ipnsRecord.Verify(bytes) // used to validate the output from API
ipnsRecord.Name() // used to construct /ipfs/NAME
Also add means to check the Body output:
https://github.com/ipfs/kubo/blob/master/test/sharness/t0124-gateway-ipns-record.sh
We need a regression test for ipfs/boxo#412
iiuc it should have _redirects
# Map SPA routes to the main index HTML file.
/* /index.html 200
and make a request for /ipfs/cid/MISSING-PAGE twice, the second time sending If-None-Match with the Etag value from the first response.
There is a test already on the linked PR, and a CAR fixture which we should reuse as-is:
Ref. https://github.com/ipfs/boxo/pull/412/files#diff-d34c7069ce5e81d45082b19eb3e869ee1a086e185dcd6630e75e3ed0d368b546
#3 relies on #22 and #10 which requires new features / larger tasks.
In parallel, we'll push more tests to the test suite to support bifrost
Other tests implemented along other tickets:
In some tests (subdomain 114) we check HTML responses for things like "does this response contain a link to another page".
We ported sharness tests, which used raw string comparisons.
It would be less fragile to provide HTML expectations like Body().Contains(A(href="")).
Context:
Not all APIs are pure right now:
tar API mutates the builder object.
We want to make sure these do not mutate the input object so that a pattern like this is correct:
req1 := SomeCheck().A().B()
req2 := req1.C().D()
See discussion in
#92 (comment)
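One common way to get this property in Go is to use value receivers and copy any reference-typed fields before changing them. The sketch below is illustrative only: `RequestBuilder` and its methods are hypothetical and are not the suite's builder types.

```go
package main

import "fmt"

// RequestBuilder is a hypothetical builder whose methods never mutate the
// receiver: each call returns a modified copy.
type RequestBuilder struct {
	path    string
	headers map[string]string
}

// Path returns a copy of the builder with the path set.
func (r RequestBuilder) Path(p string) RequestBuilder {
	r.path = p // r is already a copy because of the value receiver
	return r
}

// Header returns a copy of the builder with one more header. The map is
// cloned first, since maps are shared between struct copies.
func (r RequestBuilder) Header(key, value string) RequestBuilder {
	cloned := make(map[string]string, len(r.headers)+1)
	for k, v := range r.headers {
		cloned[k] = v
	}
	cloned[key] = value
	r.headers = cloned
	return r
}

func main() {
	req1 := RequestBuilder{}.Path("/ipfs/example").Header("Accept", "text/html")
	req2 := req1.Header("Range", "bytes=0-1")

	// req1 is unchanged by the calls that produced req2.
	fmt.Println(len(req1.headers), len(req2.headers))
}
```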
Right now, whichever finishes latest "wins"
We already have a few unit-tests in the framework, but coverage is low
ETA: 2024-12-01
As noted in ipfs/specs#402 (comment) we were unable to get consensus on what goes into that field.
The updated spec states clients MAY ignore it, so the conformance tests should no longer make any assumptions about roots present in the CAR header: we should remove checks for HasRoot+MightHaveNoRoots for now (but keep these helpers, we may reintroduce them in the future).
I know it's frustrating, as time was spent on adding the HasRoot and MightHaveNoRoots helpers for testing the previous iteration of IPIP-402, but this is the only path forward, given the cards we were dealt.
See #37 (comment)
Discussed this briefly with @laurentsenta, projects like Saturn, Rhea, Boost etc will benefit from test targets that only check compliance with the https://specs.ipfs.tech/http-gateways/trustless-gateway/ spec.
We already have some prior art for Subdomain Gateways and IPNS.
Ideally, we would provide targets for requested subsets of the spec (we know some use cases won't care about IPNS or CARs).
My thinking is to add:
- trustless-block-gateway: Just Blocks (no paths), the bare minimum, the most basic type of trustless gateway
- trustless-block-car-gateway: Blocks and CARs with paths (@aarshkshah1992 would like to run this in Rhea projects)
- trustless-block-car-ipns-gateway: Blocks, CARs, and IPNS Records, enabling end-to-end verifiability of both /ipfs and /ipns (we would run this in Kubo and bifrost-gateway)
cc @hacdias: up to you if we add this to #56 or do a follow-up PR
We added a default timeout per test of 2 minutes to catch tests that would hang the CI
#37 (comment)
We're probably going to need "expected timeouts" for requests:
#85 (comment)
Creating this issue to capture discussions/notes.
Main questions:
I'm thinking about releasing v0.2 so that the Saturn team can use the conformance tests,
see @lidel comment in #80
Ideally, we'd have a "next" version that we release on every merge to master. Teams that start setting up the test suite and need the latest changes to do so would use that branch as a temporary target. This opens up the opportunity to make small adjustments before cutting a release.
cc @galargh