Comments (4)
That would be a great study! We did not look into that in our paper and won't be able to deduce information about that from the corresponding datasets :/
You know that but for completeness: In that paper, we published content, resolved the provider record, fetched it from multiple locations and stopped there. At this point, as you said, the provider record should include the PeerIDs of all nodes that fetched the content. We would have had to request the data again and track from where we get it served - which we didn't do 🤷♂️
from network-measurements.
Have we verified that they will receive the EU-based copy?
IPFS has no mechanism that favors providers by geography, right? The closest thing to this, I guess, is that IPFS would favor the closer node due to lower latency and higher throughput ... over time. Not explicitly, but simply because it is faster to retrieve data from the closer node, so overall more data gets retrieved from it. Am I missing something?
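To make the "over time" effect concrete, here is a toy simulation. It is only a sketch under stated assumptions: each provider serves one block at a time, the client hands the next wanted block to whichever provider is free first, and the names `near_peer` and `far_peer` as well as the per-block times are made up. It is not a model of Bitswap's actual scheduling.

```python
import heapq

def simulate_fetch(num_blocks, ms_per_block):
    """Count how many blocks each provider ends up serving.

    ms_per_block maps a (hypothetical) provider name to the time it needs
    to deliver one block. Integer milliseconds keep the run deterministic.
    """
    served = {name: 0 for name in ms_per_block}
    # Heap of (time at which the provider is free again, provider name).
    free_at = [(0, name) for name in ms_per_block]
    heapq.heapify(free_at)
    for _ in range(num_blocks):
        now, name = heapq.heappop(free_at)  # next provider to become free
        served[name] += 1
        heapq.heappush(free_at, (now + ms_per_block[name], name))
    return served

# Assumed numbers: the nearby peer is 4x faster per block.
shares = simulate_fetch(100, {"near_peer": 20, "far_peer": 80})
print(shares)
```

Nothing in the loop looks at geography; the nearby peer simply becomes free more often and therefore serves most of the blocks.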
Thanks for the input @mxinden!
... over time
I guess here you mean that it would favour the fastest user at the Bitswap level, right? I.e., if it establishes a connection to both users and figures out one is faster than the other - but do we reach that stage? 😁
A few things to find out here:
- when publisher_2 advertises content CID_x (previously advertised by publisher_1) to what fraction of the (20) provider records initially published by publisher_1 does publisher_2's PeerID end up in.
- who does the client connect to, if they have the PeerID of both publisher_1 and publisher_2?
  - Before connecting to either of the publishers, the client would have to walk the DHT again to do the mapping PeerID -> multiaddress. Contacting both is the preferred way, performance-wise, but adds extra load to DHT servers.
- The ideal way to proceed, I would argue, is to eventually connect to both publishers and start the transfer through Bitswap. After the first few blocks, the client can identify the fastest of the two publishers, continue with that and prune the other Bitswap session. Overhead stays at low levels and content is delivered faster.
- One alternative would be to identify the closest peer geographically (from the multiaddress) and proceed with that only. This most likely (but not necessarily) means that content will be delivered faster, and it avoids an extra Bitswap connection setup.
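The "connect to both, then prune" idea can be sketched as follows. This is a simulation under assumptions, not Bitswap code: `Provider`, `probe_and_prune`, the peer names and the per-block times are all hypothetical, and the probe is modelled as a fixed per-block cost instead of a real transfer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Provider:
    peer_id: str       # placeholder name, not a real PeerID
    ms_per_block: int  # simulated time to deliver one block

def probe_and_prune(providers, probe_blocks=3):
    """Fetch the first few blocks from every provider, keep the fastest.

    Returns (winner peer_id, pruned peer_ids, duplicate blocks fetched).
    """
    if not providers:
        raise ValueError("need at least one provider")
    # Time each provider needs for the probe phase.
    elapsed = {p.peer_id: p.ms_per_block * probe_blocks for p in providers}
    winner = min(providers, key=lambda p: elapsed[p.peer_id])
    pruned = [p.peer_id for p in providers if p.peer_id != winner.peer_id]
    # Every pruned session fetched the probe blocks redundantly.
    duplicates = probe_blocks * len(pruned)
    return winner.peer_id, pruned, duplicates

winner, pruned, duplicates = probe_and_prune(
    [Provider("publisher_1", 80), Provider("publisher_2", 20)]
)
print(winner, pruned, duplicates)
```

The overhead of this approach is bounded by the probe size: with two providers and three probe blocks, at most three blocks are transferred twice.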
Curious to find out what happens in practice :)
Of course, in order for that optimisation to have a performance impact, it would require that "enough" content in IPFS is stored on more than one peer - which is what we've been asked and have been discussing @dennis-tra :) But I agree it's a great study to do. We probably should craft an RFM out of this.
After the first few blocks, the client can identify the fastest of the two publishers, continue with that and prune the other Bitswap session. Overhead stays at low levels and content is delivered faster.
I can also imagine a mechanism where the client load balances the traffic between the two if the upload bandwidth of one provider doesn't saturate the download bandwidth of the client. Like requesting a part of the graph from one provider and the other part of the graph from the other provider. Does this already happen?
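A minimal sketch of such a split, assuming the client already has an estimate of each provider's upload bandwidth. The function name, the bandwidth figures and the block labels are hypothetical, and this says nothing about whether Bitswap does anything like it today.

```python
def split_by_bandwidth(block_cids, bandwidth_mbps):
    """Assign contiguous ranges of blocks to providers, proportional to
    their (assumed, pre-measured) upload bandwidth."""
    total = sum(bandwidth_mbps.values())
    if total <= 0:
        raise ValueError("total bandwidth must be positive")
    assignment = {}
    start = 0
    cumulative = 0
    for name, bandwidth in bandwidth_mbps.items():
        cumulative += bandwidth
        # Cumulative rounding guarantees every block is assigned exactly once.
        end = round(len(block_cids) * cumulative / total)
        assignment[name] = block_cids[start:end]
        start = end
    return assignment

blocks = [f"block_{i}" for i in range(12)]
plan = split_by_bandwidth(blocks, {"publisher_1": 30, "publisher_2": 10})
print({name: len(part) for name, part in plan.items()})
```

Splitting into contiguous ranges is the simplest choice here; a real implementation would more likely split along subtrees of the DAG.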
Of course, in order for that optimisation to have a performance impact, it would require that "enough" content in IPFS is stored on more than one peer
I think the hydra-boosters could be a good source to determine the statistics around that. They store the provider records in DynamoDB and we could just count the provider records that have 1,2,3,...,n providing peers. This should be a statistically significant indicator of the distribution.
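The counting itself could look roughly like this, assuming the provider records have been exported as `(cid, peer_id)` pairs. That record shape is an assumption for illustration and not the actual hydra-booster DynamoDB schema.

```python
from collections import Counter, defaultdict

def providers_per_cid_histogram(records):
    """Map 'number of distinct providers' -> 'number of CIDs with that many'.

    records is an iterable of (cid, peer_id) pairs (assumed export format).
    """
    providers = defaultdict(set)
    for cid, peer_id in records:
        providers[cid].add(peer_id)  # a set ignores repeated records
    return Counter(len(peers) for peers in providers.values())

# Made-up sample data.
sample = [
    ("cid_a", "peer_1"),
    ("cid_a", "peer_2"),
    ("cid_a", "peer_2"),  # re-published record, counted once
    ("cid_b", "peer_1"),
    ("cid_c", "peer_3"),
]
histogram = providers_per_cid_histogram(sample)
print(sorted(histogram.items()))
```

The share of CIDs with two or more providers then follows directly from the histogram.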