indexstar's People

Contributors

dependabot[bot] · gammazero · ischasny · masih · web-flow · web3-bot · willscott


indexstar's Issues

Include cascade labels in `OPTIONS` request

caskadht correctly implements the IPNI specification when it comes to responding to OPTIONS requests: it includes X-IPNI-Allow-Cascade with the labels it supports.

However, indexstar does not propagate OPTIONS requests to the backends and therefore does not include the supported labels. Change the implementation so that the response to such requests includes the supported X-IPNI-Allow-Cascade header.
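
A minimal sketch of what this could look like, assuming the backends are held as URL values; the handler name, client wiring and status code are illustrative rather than indexstar's actual code:

package app

import (
	"net/http"
	"net/url"
	"strings"
)

// handleOptions forwards an OPTIONS request to every backend and merges the
// X-IPNI-Allow-Cascade labels they report into a single de-duplicated header.
func handleOptions(w http.ResponseWriter, r *http.Request, backends []*url.URL) {
	seen := make(map[string]struct{})
	var labels []string
	for _, b := range backends {
		req, err := http.NewRequestWithContext(r.Context(), http.MethodOptions, b.JoinPath(r.URL.Path).String(), nil)
		if err != nil {
			continue
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			continue // an unreachable backend simply contributes no labels
		}
		resp.Body.Close()
		for _, v := range resp.Header.Values("X-IPNI-Allow-Cascade") {
			for _, label := range strings.Split(v, ",") {
				label = strings.TrimSpace(label)
				if _, ok := seen[label]; label != "" && !ok {
					seen[label] = struct{}{}
					labels = append(labels, label)
				}
			}
		}
	}
	if len(labels) > 0 {
		w.Header().Set("X-IPNI-Allow-Cascade", strings.Join(labels, ", "))
	}
	// The exact status code should follow whatever the IPNI spec prescribes
	// for OPTIONS responses.
	w.WriteHeader(http.StatusOK)
}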

Do not route requests to offline services

When a backend service is being redeployed it may be momentarily unreachable. Ideally, indexstar should implement a circuit breaker pattern to avoid blindly hitting hosts that are known to be down on every incoming request.
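
As a rough illustration of the circuit breaker idea (the threshold, cooldown and type names below are placeholders, not a proposed configuration):

package app

import (
	"sync"
	"time"
)

// breaker is a minimal per-backend circuit breaker sketch: after a number of
// consecutive failures the backend is skipped until a cooldown period passes.
type breaker struct {
	mu        sync.Mutex
	failures  int
	openUntil time.Time
}

// Allow reports whether a request may currently be sent to the backend.
func (b *breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return time.Now().After(b.openUntil)
}

// Record updates the breaker with the outcome of a request to the backend.
func (b *breaker) Record(ok bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if ok {
		b.failures = 0
		return
	}
	b.failures++
	if b.failures >= 5 { // trip after 5 consecutive failures
		b.openUntil = time.Now().Add(30 * time.Second)
		b.failures = 0
	}
}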

Cannot find multiple multihashes when using `--translateNonStreaming`

When making a batch find request for multiple multihashes (using the POST request method), the query fails when indexstar is run with the --translateNonStreaming flag, but works without it.

Indexstar logs associated with the query:

{"level":"warn","ts":"2023-05-02T20:03:30.255Z","logger":"indexstar/mux","caller":"app/find_ndjson.go:202","msg":"Request processing was not successful","backend":"ber-indexer:3000","status":405,"body":"\n"}
{"level":"error","ts":"2023-05-02T20:03:30.255Z","logger":"indexstar/mux","caller":"app/scatter_gather.go:44","msg":"failed to scatter on target","target":"ber-indexer:3000","err":"status 405 response from backend ber-indexer:3000","maxWait":2}
{"level":"warn","ts":"2023-05-02T20:03:30.255Z","logger":"indexstar/mux","caller":"app/find_ndjson.go:202","msg":"Request processing was not successful","backend":"cali-indexer:3000","status":405,"body":"\n"}
{"level":"error","ts":"2023-05-02T20:03:30.255Z","logger":"indexstar/mux","caller":"app/scatter_gather.go:44","msg":"failed to scatter on target","target":"cali-indexer:3000","err":"status 405 response from backend cali-indexer:3000","maxWait":2}
{"level":"warn","ts":"2023-05-02T20:03:30.257Z","logger":"indexstar/mux","caller":"app/find_ndjson.go:202","msg":"Request processing was not successful","backend":"dhfind-cluster-ip:40080","status":400,"body":"input isn't valid multihash\n"}
{"level":"error","ts":"2023-05-02T20:03:30.257Z","logger":"indexstar/mux","caller":"app/scatter_gather.go:44","msg":"failed to scatter on target","target":"dhfind-cluster-ip:40080","err":"status 400 response from backend dhfind-cluster-ip:40080","maxWait":2}
{"level":"warn","ts":"2023-05-02T20:03:30.257Z","logger":"indexstar/mux","caller":"app/find_ndjson.go:202","msg":"Request processing was not successful","backend":"ago-indexer:3000","status":405,"body":"\n"}
{"level":"error","ts":"2023-05-02T20:03:30.258Z","logger":"indexstar/mux","caller":"app/scatter_gather.go:44","msg":"failed to scatter on target","target":"ago-indexer:3000","err":"status 405 response from backend ago-indexer:3000","maxWait":2}`

Add endpoint to produce a combined response for `/providers`

Add an endpoint that produces a response with the same schema as storetheindex's GET /providers. Note that different indexer nodes may return conflicting results. The majority of such conflicts can be resolved using the timestamp in the responses, with the combined result converging on the most recent record.
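
A sketch of the conflict resolution could look like the following; the trimmed-down provider struct and the LastAdvertisementTime field name are assumptions about the /providers response schema, used here only for illustration:

package app

import "time"

// provider is a pared-down view of a /providers entry.
type provider struct {
	AddrInfo              addrInfo  `json:"AddrInfo"`
	LastAdvertisementTime time.Time `json:"LastAdvertisementTime"`
}

type addrInfo struct {
	ID    string   `json:"ID"`
	Addrs []string `json:"Addrs"`
}

// mergeProviders combines per-backend results, keeping the record with the
// most recent timestamp for each provider ID when backends disagree.
func mergeProviders(responses ...[]provider) []provider {
	latest := make(map[string]provider)
	for _, resp := range responses {
		for _, p := range resp {
			if cur, ok := latest[p.AddrInfo.ID]; !ok || p.LastAdvertisementTime.After(cur.LastAdvertisementTime) {
				latest[p.AddrInfo.ID] = p
			}
		}
	}
	merged := make([]provider, 0, len(latest))
	for _, p := range latest {
		merged = append(merged, p)
	}
	return merged
}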

Add internal admin server that scatters commands across backends

It would be nice if indexstar also mimicked the admin API exposed by a regular indexer node and, upon receiving a request, sent it to all configured backends. Example usage: triggering a manual sync from a specific provider. Without this, the admin sync command has to be executed on each node individually.

Note the admin API should be exposed on a different port and be excluded from public ingress.
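
A minimal sketch of the fan-out, assuming backends are held as URL values; the handler shape, error handling and response code are illustrative only:

package app

import (
	"bytes"
	"io"
	"net/http"
	"net/url"
)

// scatterAdmin forwards an admin command (e.g. a manual sync trigger) to
// every configured backend.
func scatterAdmin(w http.ResponseWriter, r *http.Request, backends []*url.URL) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for _, b := range backends {
		target := b.JoinPath(r.URL.Path).String()
		req, err := http.NewRequestWithContext(r.Context(), r.Method, target, bytes.NewReader(body))
		if err != nil {
			continue
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			continue // a real implementation would report per-backend failures
		}
		resp.Body.Close()
	}
	w.WriteHeader(http.StatusAccepted)
}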

Legacy CLI JSON queries fail

curl "https://cid.contact/cid/bafybeidbjeqjovk2zdwh2dngy7tckid7l7qab5wivw2v5es4gphqxvsqqu"

fails with a 404

note that the x-ndjson variant succeeds

curl -H "Accept: application/x-ndjson" "https://cid.contact/cid/bafybeidbjeqjovk2zdwh2dngy7tckid7l7qab5wivw2v5es4gphqxvsqqu"

is okay

Switch unknown providers at /routing/v1/providers to peer schema

Currently, cid.contact returns all HTTP nft.storage announcements like this (example):

{
  "Protocol":"unknown",
  "Schema":"unknown",
  "ID":"QmUA9D3H7HeCYsirB3KmPSvZh3dNXMZas6Lwgr4fv1HTTp",
  "Addrs":["/dns4/dag.w3s.link/tcp/443/https"]
}

The problem is that results with "Schema":"unknown" are ignored by the boxo client (confirmed in ipfs/boxo#422 (comment)), and we would like Kubo and other users of the boxo client library to be able to access the ID and Addrs information present there (and to set the standard for doing the same in JS in Helia).

@masih would it be possible to change to Schema: "peer" (from IPIP-417) for these /routing/v1/providers responses?

All we need is:

{
- "Protocol":"unknown",
- "Schema":"unknown",
+ "Schema":"peer",
  "ID":"QmUA9D3H7HeCYsirB3KmPSvZh3dNXMZas6Lwgr4fv1HTTp",
  "Addrs":["/dns4/dag.w3s.link/tcp/443/https"]
}

Await completion of current requests before shutting down

Reduce downtime for the clients by handling SIGINT gracefully; the service should:

  • become "unready" to prohibit new requests arriving.
  • await completion of ongoing requests.
  • shut down once all is complete.

Once this feature is added, the termination grace period should be adjusted in the K8S manifests to give the container enough time to wait for requests to complete. Note that the default timeout is 30 seconds, which in the case of indexstar is plenty considering the current HTTP timeout configuration.
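
A minimal sketch of the shutdown sequence, assuming a setUnready hook that flips the readiness endpoint; the names and the 25-second budget are illustrative:

package app

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runWithGracefulShutdown serves until SIGINT/SIGTERM, then marks the
// service unready and waits for in-flight requests before returning.
func runWithGracefulShutdown(srv *http.Server, setUnready func()) error {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		return err
	case <-ctx.Done():
		setUnready() // readiness probe now fails, so no new traffic is routed here
		shutdownCtx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
		defer cancel()
		return srv.Shutdown(shutdownCtx) // waits for ongoing requests to complete
	}
}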

Optimise search for metadata

The pattern used for finding encrypted metadata is "find first", which is slightly different from scatter-gather. There is room for optimisation here: we do not need to wait for all results to come back before responding to the client.

For now what's here is fine; we should capture an issue to later optimise this.

Originally posted by @masih in #63 (comment)
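
One way the "find first" pattern could be expressed, as a sketch rather than indexstar's actual implementation; the generic lookup functions stand in for per-backend queries:

package app

import (
	"context"
	"errors"
)

// findFirst queries all backends concurrently and returns the first
// successful result, cancelling the remaining lookups instead of waiting
// for every backend to respond.
func findFirst[T any](ctx context.Context, lookups []func(context.Context) (T, error)) (T, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	results := make(chan T, len(lookups))
	errs := make(chan error, len(lookups))
	for _, lookup := range lookups {
		go func(lookup func(context.Context) (T, error)) {
			v, err := lookup(ctx)
			if err != nil {
				errs <- err
				return
			}
			results <- v
		}(lookup)
	}

	var zero T
	for range lookups {
		select {
		case v := <-results:
			return v, nil // first successful result wins; cancel() stops the rest
		case <-errs:
			// a failed backend just reduces the number of outstanding lookups
		case <-ctx.Done():
			return zero, ctx.Err()
		}
	}
	return zero, errors.New("no backend returned a result")
}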

Use matcher to select backend instead of type check

A set of backend types was introduced to selectively route traffic like this. But matcher functionality already exists that would select the appropriate backends depending on the incoming request.

The matcher mechanism is much more extensive since it allows us to inspect HTTP requests fully. It also results in less verbose and more readable code for selecting backends based on incoming requests. An example of this is already present for cascading backends here.

Refactor the code to use matchers instead of backend types for choosing which route to take for a given request.
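
In essence, a matcher is just a predicate over the incoming request; the sketch below illustrates the approach rather than the existing matcher code:

package app

import "net/http"

// matcher decides, per request, whether a backend should receive it.
type matcher func(*http.Request) bool

type backend struct {
	name    string
	matches matcher
}

// selectBackends returns the backends whose matcher accepts the request,
// e.g. cascading backends could match on a cascade query parameter rather
// than on a dedicated backend type.
func selectBackends(backends []backend, r *http.Request) []backend {
	var out []backend
	for _, b := range backends {
		if b.matches(r) {
			out = append(out, b)
		}
	}
	return out
}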

Reduce logging of canceled contexts

The indexstar logs are filled with an extremely high volume of messages about context Canceled and DeadlineExceeded. There are so many of them that it is difficult to find anything more meaningful. Canceled-context messages should be logged at debug level.
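
The check itself is small; a sketch of the predicate the logging path could branch on (the helper name is made up for illustration):

package app

import (
	"context"
	"errors"
)

// isExpectedContextErr reports whether an error is a canceled or expired
// context, i.e. the kind of error that should be logged at debug level
// rather than error level in the scatter-gather path.
func isExpectedContextErr(err error) bool {
	return errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded)
}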

Add health endpoint

Add a health endpoint that can be used to configure readiness checks in k8s. This will allow zero-downtime rollouts of indexstar deployments.
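
A sketch of a readiness handler that a k8s readiness probe could poll; the flag, handler name and path wiring are illustrative:

package app

import (
	"net/http"
	"sync/atomic"
)

// ready is flipped to true once the service is prepared to take traffic,
// and back to false when shutting down.
var ready atomic.Bool

// healthHandler would be registered on e.g. /health and used by the k8s
// readiness probe.
func healthHandler(w http.ResponseWriter, _ *http.Request) {
	if ready.Load() {
		w.WriteHeader(http.StatusOK)
		return
	}
	w.WriteHeader(http.StatusServiceUnavailable)
}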

Add prometheus metrics

Add metrics for:

  • Number of providers
  • Reframe response latency tagged by response code
  • Find response latency tagged by response code
  • Backend response latency tagged by backend name

Once metrics are implemented in Go, the manifests need to be updated to add a Prometheus pod monitor, and the deployment needs to be updated to expose the debug and metrics server if it is separate from the HTTP server.
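
For illustration, one of the listed metrics could be wired up roughly as follows using the Prometheus Go client; the metric name, labels and listener address are placeholders:

package app

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// findLatency records find response latency tagged by HTTP status code.
var findLatency = promauto.NewHistogramVec(prometheus.HistogramOpts{
	Name: "indexstar_find_latency_seconds",
	Help: "Find response latency by HTTP status code.",
}, []string{"code"})

// observeFind is called once per find request with the response code and
// the time the request started.
func observeFind(code string, start time.Time) {
	findLatency.WithLabelValues(code).Observe(time.Since(start).Seconds())
}

// serveMetrics exposes /metrics, typically on the debug listener rather
// than the public HTTP server.
func serveMetrics(addr string) error {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.Handler())
	return http.ListenAndServe(addr, mux)
}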

Validate find requests before forwarding to backends

Invalid requests are currently forwarded to the backends even though they are destined to fail. This causes spikes in upstream latency.

Check find requests for:

  • correctness of multihash, and
  • correctness of CID.
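
A minimal sketch of the validation, assuming the path segment carries either a base58-encoded multihash or a CID string (other multibase encodings would need extra handling); how the segment is extracted from the route is left out:

package app

import (
	"github.com/ipfs/go-cid"
	mh "github.com/multiformats/go-multihash"
)

// validateFindArg rejects malformed lookups before they reach the backends:
// the path segment must parse as a CID or as a base58-encoded multihash.
func validateFindArg(isCID bool, arg string) error {
	if isCID {
		_, err := cid.Decode(arg)
		return err
	}
	_, err := mh.FromB58String(arg)
	return err
}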

Getting a non-existing provider returns success with empty response

It should return 400 or 404 depending on the validity of the peer ID; instead we get:

$ curl -qsv https://indexstar.prod.cid.contact/providers/fish
...
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200 
< date: Tue, 18 Oct 2022 12:17:43 GMT
< content-type: application/json; charset=utf-8
< content-length: 58
< strict-transport-security: max-age=15724800; includeSubDomains
< access-control-allow-origin: *
< access-control-allow-credentials: true
< access-control-allow-methods: GET, PUT, POST, DELETE, PATCH, OPTIONS
< access-control-allow-headers: DNT,X-CustomHeader,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization
< access-control-max-age: 1728000
< 
* Connection #0 to host indexstar.prod.cid.contact left intact
{"AddrInfo":{"ID":"","Addrs":[]},"LastAdvertisement":null}

Bitswap provider protocol id is incorrect

$ curl -s https://dev.cid.contact/routing/v1/providers/bafykbzacedzugqdzfjjbd4pgsvwrpkfdpk7rzfrsgwqj2kehxv2ffjy2tlny6 | jq -r '.Providers[].Protocol'
transport-bitswap
transport-graphsync-filecoinv1

The correct protocol id for bitswap is bitswap. (These are intentionally different from multicodec names so that they can be versioned without having to add a new multicodec code.)

Combining reframe results fails with unexpected JSON content

Finding providers via /reframe when more than one backend is configured fails with:

unexpected content after end of json object
2022-08-23T15:36:48.258+0100	ERROR	service/client/delegatedrouting	proto/proto_edelweiss.go:1233	client received error response (unexpected content after end of json object)

Expose an endpoint to return estimate of total unique keys in cid.contact

Pebble exposes a key-count estimate, which is surfaced via the Pebble store implementation API.

Indexstar already knows about all of the dhstore instances, configured via --backend.

Expose a /stats endpoint in indexstar that returns the total number of unique Pebble keys. Make sure to cache the requests to Pebble, since getting the estimated count is expensive: indexstar should cache the /stats response and refresh it with an expiry of around 30 minutes.
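
A sketch of the caching layer; the 30-minute TTL and the fetch function (which would sum the estimates reported by the dhstore backends) are illustrative:

package app

import (
	"sync"
	"time"
)

// cachedStats caches the aggregated key-count estimate so the expensive
// Pebble estimation is requested at most once per expiry window.
type cachedStats struct {
	mu      sync.Mutex
	total   uint64
	expires time.Time
	fetch   func() (uint64, error) // sums estimates across dhstore backends
}

// Total returns the cached estimate, refreshing it when the cache expires.
func (c *cachedStats) Total() (uint64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if time.Now().Before(c.expires) {
		return c.total, nil
	}
	total, err := c.fetch()
	if err != nil {
		return 0, err
	}
	c.total = total
	c.expires = time.Now().Add(30 * time.Minute)
	return total, nil
}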
