ipni / index-provider
📢 Index Provider
License: Other
2022-01-18T14:04:13.499+0200 INFO provider/engine engine/linksystem.go:45 Retrieved advertisement from datastore {"cid": "baguqeeqqj2huagbnkzkxipqqi2j2vce7ya", "size": 958}
panic: Traversed Explore Recursive Edge Node With No Parent
goroutine 153112616 [running]:
github.com/ipld/go-ipld-prime/traversal/selector.ExploreRecursiveEdge.Decide(...)
/root/code/pkg/mod/github.com/ipld/[email protected]/traversal/selector/exploreRecursiveEdge.go:31
github.com/ipld/go-ipld-prime/traversal/selector.ExploreRecursive.Decide(...)
/root/code/pkg/mod/github.com/ipld/[email protected]/traversal/selector/exploreRecursive.go:165
github.com/ipld/go-ipld-prime/traversal.Progress.walkAdv(0xc038df92c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00cfc0310, ...)
/root/code/pkg/mod/github.com/ipld/[email protected]/traversal/walk.go:159 +0x83
github.com/ipld/go-ipld-prime/traversal.Progress.WalkAdv(0xc038df92c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00cfc0310, ...)
/root/code/pkg/mod/github.com/ipld/[email protected]/traversal/walk.go:147 +0xd9
github.com/ipfs/go-graphsync/ipldutil.(*traverser).start.func1(0xc011069900)
/root/code/pkg/mod/github.com/ipfs/[email protected]/ipldutil/traverser.go:230 +0x3d1
created by github.com/ipfs/go-graphsync/ipldutil.(*traverser).start
/root/code/pkg/mod/github.com/ipfs/[email protected]/ipldutil/traverser.go:198 +0x55
Write up docs that would explain how to use this repo in order to implement a provider.
Currently the MVP runs as a daemon.
We should add a code interface path for noticing a new CAR file.
We should then implement one or both of:
reference-provider add my.carv2
We need a way to specify that some content is "no longer published" by a provider, to prevent indexers from having to fetch the full chain of advertisements to sync. It may be the case that fetching the few latest updates is already enough to fully sync to the latest state.
This is linked with the idea, discussed off-band, of having a "snapshot scheme" (or something similar) to allow providers to "prune their chain" when it is getting too long, signalling that syncing up to N=some index is enough to reach the latest state (as the rest is no longer provided).
Related: ipni/storetheindex#53
Draft task list for the MVP of the indexer reference provider. It will be extended as new tasks are identified.
Feel free to take any of the tasks off the list, create an individual issue for it, and assign it to yourself.
Indexing
- go-indexer-core in a datastore (interface).
Ingestion
- Publish to pubsub channel.
- PublishLocal to make an advertisement available through an endpoint for indexers to proactively request it. Publish always assumes a subsequent PublishLocal.
- /advertisement/latest and /advertisement/<id> endpoints for indexers to be able to request new/missed advertisements for a provider.
- go-data-transfers of indexed data through advertisement selectors.
- Process
Investigate why:
2022/01/18 10:54:39 cannot delete temporary directory: CreateFile C:\Users\RUNNER~1\AppData\Local\Temp\testscript-main127752611\bin\provider.exe: Access is denied.
https://github.com/filecoin-project/index-provider/runs/4852070256?check_suite_focus=true
Expected metadata for a storage provider needs at least:
format string // "filecoin/1"
pieceCID CID
free bool
fastRetrieval bool
consider putting this in a new repo / mod
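The fields above can be sketched as a Go struct. This is a minimal illustration, not the canonical schema: the struct and field names, and the placeholder Cid type, are assumptions (the real type would be cid.Cid from github.com/ipfs/go-cid).

```go
package main

import "fmt"

// Cid stands in for the real cid.Cid type from github.com/ipfs/go-cid.
type Cid string

// FilecoinV1Metadata sketches the minimal fields an indexer would need
// from a storage provider; names here are illustrative, not canonical.
type FilecoinV1Metadata struct {
	Format        string // protocol identifier, e.g. "filecoin/1"
	PieceCID      Cid    // the piece the indexed payload belongs to
	Free          bool   // whether retrieval is free
	FastRetrieval bool   // whether an unsealed copy is kept for fast retrieval
}

func main() {
	md := FilecoinV1Metadata{
		Format:        "filecoin/1",
		PieceCID:      "baga6ea4seaq...",
		Free:          true,
		FastRetrieval: true,
	}
	fmt.Printf("%+v\n", md)
}
```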
Currently a restart would need to re-scan the full CARv1 if a retrieval or callback for its index is ever encountered.
$ ./index-provider import car -l http://localhost:3102 -i bafybeihh.carv2
Post "http://localhost:3102/admin/import/car": EOF
Sometimes after publishing an advertisement, when the indexer asks the reference provider for the content of that advertisement, the reference provider cannot find the content. It seems like there is some delay before it is available from the linksystem cache.
This can be forced to work by re-importing the car file and making the reference provider re-publish its advertisement, which triggers the indexer to ask for content again… and it usually works after one or two retries. Of course, this is not acceptable.
This problem appears to be on the reference provider side, because it appears that the indexer is asking for exactly the same content, in exactly the same way/order for both the success and failure case.
When the engine is instantiated with an empty config directly, defaults don't get set. As a result, the chunk size ends up being zero, which results in a single chunk.
Make sure that wherever constructors take a config, the config is checked for sanity.
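One way to enforce the sanity check is to reject invalid configs in the constructor rather than silently accepting a zero chunk size. This is a sketch with stand-in types; IngestConfig, Engine, and NewEngine here are illustrative, not the repo's actual API.

```go
package main

import "fmt"

// IngestConfig and Engine are stand-ins for the real engine config and
// engine type; the validation pattern is the point.
type IngestConfig struct {
	LinkedChunkSize int
	LinkCacheSize   int
}

type Engine struct {
	cfg IngestConfig
}

// NewEngine rejects configs that would silently misbehave, instead of
// letting a zero LinkedChunkSize produce degenerate chunking.
func NewEngine(cfg IngestConfig) (*Engine, error) {
	if cfg.LinkedChunkSize <= 0 {
		return nil, fmt.Errorf("invalid LinkedChunkSize %d: must be positive", cfg.LinkedChunkSize)
	}
	if cfg.LinkCacheSize <= 0 {
		return nil, fmt.Errorf("invalid LinkCacheSize %d: must be positive", cfg.LinkCacheSize)
	}
	return &Engine{cfg: cfg}, nil
}

func main() {
	// An empty config is rejected up front instead of misbehaving later.
	_, err := NewEngine(IngestConfig{})
	fmt.Println(err)
	e, _ := NewEngine(IngestConfig{LinkedChunkSize: 100, LinkCacheSize: 1024})
	fmt.Println(e.cfg.LinkedChunkSize)
}
```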
We currently generate the entries linked list used for ingestion synchronously, by triggering the callback in the linksystem if the entries are not available in the cache. The exchange can't proceed until the full linked list for the entries is generated and stored in the cache.
https://github.com/filecoin-project/indexer-reference-provider/blob/78768e2546a7f81fe7c6237644f5598a83ea5258/core/engine/linksystem.go#L68
The aim of this is to avoid having to keep redundant storage persisting the list of entries advertised. We only generate the linked list for a list of entries when it is requested. However, this process of "generating on-the-fly" is currently quite inefficient.
https://github.com/filecoin-project/indexer-reference-provider/blob/78768e2546a7f81fe7c6237644f5598a83ea5258/core/engine/linksystem.go#L76
My suggestions to make this faster would be:
- A cacheManager that keeps track of what has been asked for ingestion, what linked lists need to be generated, and which ones are already in cache and ready to be served. Even more, this would allow us to parallelize the generation of linked lists, so as soon as we have the first few nodes of a linked list, we can start serving them through the linksystem.
- The cacheManager should also include an offline algorithm to garbage collect linked lists (we currently don't have any garbage collection strategy). As a first approach we could use a naïve scheme where we only keep a small number of linked lists in cache at a time (assuming that after an advertisement publication, all indexers will be looking to ingest the same few sets of entries in order to sync).

Looks like the GitVersion var is not set when binaries are built. Remove it if always out of date.
Originally posted by @willscott in #129 (comment)
This line in the README is missing the end.
Issue: Market node crashing when starting up. ERROR:
2022-02-15T10:28:56.826+0100 INFO dagstore.migrator dagstore/wrapper.go:354 deal not ready; skipping {"deal_id": 0, "piece_cid": "baga6ea4seaqk54tcailxwsaqkfsmacu2h43e4z2ytn2xqwj762guubryalaaega"}
2022-02-15T10:28:56.826+0100 INFO dagstore.migrator dagstore/wrapper.go:354 deal not ready; skipping {"deal_id": 0, "piece_cid": "baga6ea4seaqhmohnt4uukhxf3enrgoyn5yldhe6hk3ebltr7jan57j6kkytokfq"}
2022-02-15T10:28:56.826+0100 INFO dagstore.migrator dagstore/wrapper.go:358 registering deal in dagstore with lazy init {"deal_id": 2985231, "piece_cid": "baga6ea4seaqgj2d6h7pzqlvld5qdc5kfnjuxsbxagn2qk253vwvhpxbwehrukmq"}
ERROR: creating node: starting node: failed to connect index provider host with the full node: failed to connect index provider host with the full node: failed to dial 12D3KooWNCcog7KWPsjWa1FmKyqeTeBxxc5cJf27X6vKBVqm1mW3:
* [/ip4/127.0.0.1/tcp/10231] dial tcp4 127.0.0.1:10231: connect: connection refused
* [/ip4/192.168.x.x/tcp/10231] failed to negotiate security protocol: read tcp4 192.168.x.x:35397->192.168.x.x:10231: read: connection reset by peer
* [/ip4/x.x.x.x/tcp/10231] failed to negotiate security protocol: read tcp4 192.168.x.x:35397->x.x.x.x:10231: read: connection reset by peer
* [/ip4/x.x.x.x/tcp/10231] dial tcp4 0.0.0.0:35397->x.x.x.x:10231: i/o timeout
OBS: My market node runs on a separate physical server and needs to connect to the daemon over the network.
Network connectivity is fine, and the system was running perfectly on v1.14.0.
It is only allowed to connect on the internal network, so I believe the error here is on this line:
[/ip4/x.x.x.x/tcp/10231] failed to negotiate security protocol: read tcp4 192.168.x.x:35397->x.x.x.x:10231: read: connection reset by peer
The Lotus daemon and miner are running the same version as the market node: master-spx.idxprov.rc-1
$ lotus-miner version
Daemon: 1.15.0-dev+mainnet+git.1bf7e6a40+api1.3.0
Local: lotus-miner version 1.15.0-dev+mainnet+git.1bf7e6a40
The lotus-miner is also connecting externally to the daemon, so I know for sure the daemon is reachable and is currently connected to the lotus-miner, as it is running...
Similar to the import car command, there should be a remove car command. It should invoke the admin/remove/car endpoint described in #77.
Providers will need to include additional context information about the CIDs/metadata being notified (such as a dealID or any other unique identifier) so they can inform indexers of the specific entry an update/removal refers to.
This would mean changing the reference-provider interface to something like:
NotifyPutCids(ctx context.Context, dealID cid.Cid, metadata []byte) (cid.Cid, error)
NotifyRemoveCids(ctx context.Context, dealID cid.Cid) (cid.Cid, error)
// Callback that goes from dealID to a carv2.Index used to fetch the list of CIDs to index.
type CidCallback func(dealID cid.Cid) carv2.Index
// Register Callback to fetch CIDs when triggering `NotifyPut`
RegisterCidCallback(cb CidCallback)
If this ID ends up not being the dealID, we should maybe choose some other name for it (ctxID?).
//cc @aarshkshah1992
A provider may not have the list of multihashes available when receiving a request for ingestion of a remove advertisement. In this case, instead of generating the linked list of entries that need to be removed, the linksystem should check if it is a remove advertisement, and not follow the link for the list of entries (as they won't be available and it would make the exchange fail)
https://github.com/filecoin-project/indexer-reference-provider/blob/78768e2546a7f81fe7c6237644f5598a83ea5258/core/engine/linksystem.go#L100
For advertisements with isRm == true we should avoid triggering the callback and traversing the Entries link: https://github.com/filecoin-project/indexer-reference-provider/blob/78768e2546a7f81fe7c6237644f5598a83ea5258/core/engine/linksystem.go#L109
This issue is related to ipni/storetheindex#78.
If we provide not the direct multihashes of content, but instead a hash over each multihash, then the index system will not itself know what content is being queried.
This can help limit some enumeration attacks on the network.
IngestClient expects an Identity config data structure as an argument. This means that we need to generate an ad-hoc storetheindex identity config in the reference provider to make an IndexContent call from the provider. We should consider removing this dependency on storetheindex-specific config data structures.
The only way to make a config construct with defaults just now is to initialise one with identity, take the sub-config needed, and throw the rest away. Default values are not exported.
Expose the ability to construct an instance of a config/sub-config with default values set.
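The request above amounts to an exported sub-config constructor, roughly in the shape of the config.NewIngest() call that already appears in this repo's tests. The constant names and values below are illustrative placeholders, not the repo's actual defaults.

```go
package main

import "fmt"

// Exported defaults let callers build a sub-config directly, without
// constructing a full identity-bearing config and throwing most of it
// away. The names and values here are illustrative.
const (
	DefaultLinkedChunkSize = 100
	DefaultLinkCacheSize   = 1024
)

// Ingest is a stand-in for the engine's ingest sub-config.
type Ingest struct {
	LinkedChunkSize int
	LinkCacheSize   int
}

// NewIngest returns an Ingest sub-config with defaults applied,
// mirroring the pattern requested in this issue.
func NewIngest() Ingest {
	return Ingest{
		LinkedChunkSize: DefaultLinkedChunkSize,
		LinkCacheSize:   DefaultLinkCacheSize,
	}
}

func main() {
	cfg := NewIngest()
	// Callers can then override individual fields, as the tests do.
	cfg.LinkedChunkSize = 10
	fmt.Println(cfg.LinkedChunkSize, cfg.LinkCacheSize)
}
```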
When removing by metadata context ID:
2022-01-03T08:52:29.283Z INFO provider/engine engine/engine.go:331 Generating removal list for advertisement
2022-01-03T08:52:29.287Z ERROR provider/engine engine/engine.go:380 Error generating new advertisement: storetheindex: invalid metadata: encountered protocol ID 0 on encode
2022-01-03T08:52:29.287Z ERROR adminserver http/removecar_handler.go:47 failed to remove CAR: storetheindex: invalid metadata: encountered protocol ID 0 on encode {"err": "storetheindex: invalid metadata: encountered protocol ID 0 on encode", "key": "NQe/ch+eG2aP3DE1xSMcbGF+mJGLH6WmiA8z4xB5QXg="}
The goal of this issue is to track the list of unit tests that are flaky and need some extra care to make them work.
If you see any other flaky test that needs to be investigated, please add it to the list.
- TestNotifyPublish from engine_test.go (fails on macOS with Go 1.17). It uses time.Sleep instead of an async approach, and fails under go test -race, which runs slower than vanilla go test.
@gammazero just pushed a release tag: v0.1.0.
Please manually verify validity (using gorelease), and update version.json to reflect the manually released version, if necessary.
In the future, please use the automated process.
Just a note, but I think we can, in a protocol-compatible way, say that metadata is not simply <protocol ID><payload> but rather [<protocol id><protocol-specific-payload>], so e.g.
uvarint{graphsync filecoin} || cbor{<cid>, true, true} || uvarint{bitswap}
to indicate that both protocols are supported.
Since the end of the graphsync CBOR payload can already be determined, any trailing bytes after it can be interpreted as the next protocol.
Originally posted by @willscott in #187 (comment)
Retrieval addresses are passed as constructor arguments to the engine, but they may change. Add a mechanism to update them.
During chunk generation if a single linked list is more than the cache's current capacity, the cache is resized to hold the entire linked list.
However, the final resize is always to the length of the cache itself, which means none of the entries are ever evicted and the cache grows indefinitely.
Test:
func Test_CacheIsEvicted(t *testing.T) {
	cfg := config.NewIngest()
	cfg.LinkedChunkSize = 10
	cfg.LinkCacheSize = 50
	engine := mkEngineWithConfig(t, cfg)
	mhCount := 100
	expectedChunkCount := mhCount / cfg.LinkedChunkSize
	for i := 0; i < 10; i++ {
		idx := index.NewMultihashSorted()
		cids, err := testutil.RandomCids(mhCount)
		require.NoError(t, err)
		var records []index.Record
		for i, c := range cids {
			records = append(records, index.Record{
				Cid:    c,
				Offset: uint64(i + 1),
			})
		}
		err = idx.Load(records)
		require.NoError(t, err)
		iterator, err := provider.CarMultihashIterator(idx)
		require.NoError(t, err)
		_, err = engine.generateChunks(iterator)
		require.NoError(t, err)
	}
	store, ok := engine.cache.(*lrustore.LRUStore)
	require.True(t, ok)
	require.Equal(t, expectedChunkCount, store.Cap())
}
provider:index-provider> ./index-provider rm car -l http://localhost:3102 -i /data/snip/deal-cars/bafybeihh.carv2
Not Found: no CAR file found for the given Key
provider:index-provider> ./index-provider import car -l http://localhost:3102 -i /data/snip/deal-cars/bafybeihh.carv2
Conflict: CAR already advertised
Because any JSON with a Signature field might not be an advertisement.
Each time a car file is imported, this results in a different advertisement CID. This is preventing the indexer from determining that it has already ingested the advertised indexes.
The reason that the CID is different is that each new advertisement links to the previous and the CID is calculated over the advertisement data and its link to the previous. What needs to happen is that the previous advertisement needs to be looked up by context ID, and if a previous advertisement exists for that context ID (coming from car supplier) and the previous advertisement metadata is equal to the current metadata, then the import request should be ignored.
Note: The context ID comes from the car supplier, which comes from the import request, which is a hash of the car file path. The metadata also originates from the import request.
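The dedup check described above can be sketched as follows. This is a minimal illustration of the lookup-and-compare logic only; the in-memory map stands in for the datastore, and the names shouldPublish/prevAds are assumptions, not the engine's actual API.

```go
package main

import (
	"bytes"
	"fmt"
)

// prevAds stands in for the datastore mapping a context ID to the
// metadata of the last advertisement published for that context.
var prevAds = map[string][]byte{}

// shouldPublish reports whether a new advertisement is warranted: a
// re-import with the same context ID and identical metadata is ignored
// rather than producing a fresh advertisement CID.
func shouldPublish(contextID, metadata []byte) bool {
	prev, ok := prevAds[string(contextID)]
	if ok && bytes.Equal(prev, metadata) {
		return false // already advertised; ignore the import request
	}
	prevAds[string(contextID)] = metadata
	return true
}

func main() {
	ctxID := []byte("hash-of-car-path")
	md := []byte("metadata-from-import-request")
	fmt.Println(shouldPublish(ctxID, md)) // first import publishes
	fmt.Println(shouldPublish(ctxID, md)) // identical re-import is ignored
}
```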
Remove the libp2p server exposed by the provider. The server predates go-legs, and what it provides is now satisfiable via alternative, already-exposed endpoints:
- GET_LATEST is now exposed via go-legs' head publisher over a Go stream.
- GET_AD is available via graphsync.
The server is used in end-to-end tests, which need to be refactored to use graphsync instead.
Once removed, the client library for it can also go away.
Note that storetheindex uses the libp2p server to get the latest advertisement instead of the go-legs head publisher, which also needs to be updated; captured in ipni/storetheindex#137.
The admin server needs a new endpoint for removing CAR content. This new admin/remove/car endpoint will complement the existing admin/import/car endpoint, and will take a contextID or a file path as input. The file path is used to generate the contextID if the contextID is not supplied.
Invoking this endpoint must result in a remove advertisement when previously imported content is removed from the provider.
The admin server delivers error responses using an errRes data structure that gets serialized by the respond() function. This was originally to have a common way of delivering errors over both libp2p and HTTP, but that is not needed here.
Here is an example of the current error response:
errRes := newErrorResponse("failed to supply CAR. %v", err)
respond(w, http.StatusInternalServerError, errRes)
I would prefer to see:
http.Error(w, fmt.Sprintf("failed to supply CAR: %s", err), http.StatusInternalServerError)
Even that can be refined, so that if the file is not found it returns http.StatusBadRequest.
When provider engine is restarted, the head of advertisement exposed via legs head publisher is cid.Undef
until the next advertisement is published. Internally, provider engine maintains a mapping to the latest advertisement in datastore.
legs publisher API however does not allow the head to be set without also publishing it onto gossipsub or http (depending on the publisher implementation).
engine.Start
check if mapping to latest advertisement is present and set the current head to it.Engine shutdown call always returns context cancelled error when daemon is stopped via a single SIGINT
:
./provider daemon
2021-10-15T14:17:15.180+0100 INFO command/reference-provider provider/daemon.go:87 libp2p host initialized {"host_id": "12D3KooWK9NsyCndVed4QR71wNayaXHcLnRRTq11QPEe8n1w45Jy", "multiaddr": "/ip4/0.0.0.0/tcp/3103"}
2021-10-15T14:17:15.268+0100 INFO dt-impl impl/impl.go:145 start data-transfer module
2021-10-15T14:17:15.268+0100 INFO provider/engine engine/engine.go:64 Retrieval address not configured, using /ip4/192.168.68.105/tcp/3103
2021-10-15T14:17:15.268+0100 INFO command/reference-provider provider/daemon.go:134 libp2p servers initialized {"host_id": "12D3KooWK9NsyCndVed4QR71wNayaXHcLnRRTq11QPEe8n1w45Jy"}
2021-10-15T14:17:15.268+0100 INFO command/reference-provider provider/daemon.go:154 admin server initialized {"address": "127.0.0.1:3102"}
Starting admin server on 127.0.0.1:3102 ...2021-10-15T14:17:15.268+0100 INFO adminserver http/server.go:58 admin http server listening {"addr": "127.0.0.1:3102"}
^CReceived interrupt signal, shutting down...
(Hit CTRL-C again to force-shutdown the daemon.)
2021-10-15T14:17:16.584+0100 INFO pubsub [email protected]/pubsub.go:608 pubsub processloop shutting down
2021-10-15T14:17:16.584+0100 INFO command/reference-provider provider/daemon.go:170 Shutting down daemon
2021-10-15T14:17:16.584+0100 ERROR command/reference-provider provider/daemon.go:184 Error closing provider core {"err": "context canceled"}
2021-10-15T14:17:16.585+0100 INFO adminserver http/server.go:63 admin http server shutdown
2021-10-15T14:17:16.586+0100 INFO command/reference-provider provider/daemon.go:199 node stopped
daemon did not stop gracefully
I believe the cause is that the context used during shutdown is the parent context from the CLI, and the shutdown function does not use the passed context since the underlying go-legs closer does not accept one.
When shutdown is called, the CLI context is already cancelled (otherwise shutdown would not have been triggered), hence the error; see:
Syncing removal advertisements fails because the link system tries to look up an internal mapping of the advertisement context ID, which is removed as part of publishing the removal advertisement.
This makes all syncs for such advertisements fail.
Consider checking whether the advertisement is a removal ad and not attempting to look up its entries from the callback, since they may no longer be there.
For "remove all" advertisements, the check is straightforward: check if the entries link is NoEntries and, if so, do not attempt to look up entries.
For removal advertisements with an explicit list of multihashes, rework in the engine is needed to store that list permanently, since the engine must be able to serve the advertisements it publishes.
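The "remove all" check described above can be sketched as a small guard in front of the entries lookup. All types here are stand-ins (Link for ipld.Link, noEntries for the schema's NoEntries sentinel, entriesToServe for the link-system hook); the guard logic is the point.

```go
package main

import "fmt"

// Link and noEntries stand in for ipld.Link and the NoEntries sentinel.
type Link string

const noEntries Link = "no-entries"

type Advertisement struct {
	IsRm    bool
	Entries Link
}

// entriesToServe decides whether the link system should try to look up
// the entries for an advertisement: for removal ads the context-ID
// mapping has already been deleted, so the lookup must be skipped.
func entriesToServe(ad Advertisement) (Link, bool) {
	if ad.IsRm || ad.Entries == noEntries {
		return "", false // do not follow the entries link
	}
	return ad.Entries, true
}

func main() {
	_, ok := entriesToServe(Advertisement{IsRm: true, Entries: "bafy..."})
	fmt.Println(ok) // removal ad: entries lookup skipped
	link, ok := entriesToServe(Advertisement{Entries: "bafyentries"})
	fmt.Println(link, ok) // normal ad: entries served
}
```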
The metadata schema for FilecoinV1Data metadata is defined programmatically.
Now that the schema generator work in IPLD is operational, use the generator, similar to the work done in storetheindex.
fails trying to get a block that doesn't exist
Write a utility that verifies multihashes are known by an indexer node, more specifically the storetheindex implementation of it.
The utility is to be used for verification by miners in its first iteration. It will accept as input:
and for each multihash found in the index, checks that storetheindex has it associated to a provider with the same peer ID.
The tool should also be able to get the list of advertisements from the provider's graphsync API as a source of multihashes to verify.
The metadata flag takes base64 encoded metadata bytes. Consider integrating built-in generation of metadata for known types as a convenient way for a user to specify them instead of having to deal with manual byte encoding.
Originally posted by @gammazero in #198 (comment)
The rationale is CIDs of size ~128 bytes and 4MB message size
This issue records proposed changes to make the reference provider more easily consumable by parties wishing to implement their own provider based on the reference provider, and by reusing packages from the reference provider.
- Move the provider subdirectory into cmd/provider.
- Move engine/linksystem into a separate package.
- Move engine/engine.go to the top-level directory.
- internal
- Move engine/dscache.go into a separate package; possibly move it into a package under go-datastore.
- Add a readme for this repo.
There's currently no way to query which files/items are being provided by the current provider daemon.
The legs handler should filter which selector requests are allowed, and limit them to:
When calling the indexer to verify that a given collection of multihashes is ingested, chunk the collection when it is too large, to avoid IO timeouts.
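The chunking step can be sketched as a small helper that splits a collection into bounded batches before each indexer call. The function name and the use of strings in place of real multihashes are simplifications for illustration.

```go
package main

import "fmt"

// chunk splits a list of multihashes (strings here for simplicity) into
// batches of at most n elements, so each indexer query stays small
// enough to avoid IO timeouts.
func chunk(mhs []string, n int) [][]string {
	var out [][]string
	for len(mhs) > n {
		out = append(out, mhs[:n])
		mhs = mhs[n:]
	}
	if len(mhs) > 0 {
		out = append(out, mhs)
	}
	return out
}

func main() {
	mhs := make([]string, 10)
	for i := range mhs {
		mhs[i] = fmt.Sprintf("mh-%d", i)
	}
	// 10 multihashes in batches of 4 yields batches of 4, 4, and 2.
	for _, batch := range chunk(mhs, 4) {
		fmt.Println(len(batch))
	}
}
```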
Document how much memory the index-provider engine occupies. This helps integrators understand the resource consumption.
We need to shrink the "data transfer" reservation of the full multicodec private use area.
We need a bitswap parallel to the filecoinv0 metadata in metadata.