Comments (13)

Stebalien commented on July 21, 2024

In general, you can't expect this to work. Basically, IPFS is a filesystem built on top of IPLD (i.e., it creates IPLD nodes to encode blocks, indirect blocks, inodes, etc.). IPFS chunks files into blocks (~256KiB by default) and builds a Merkle tree on top of these chunks (using IPLD). The CID of a file corresponds (approximately) to the hash of the root node of this Merkle tree.

In your case, I assume your file is less than 256KiB. That means it fits in a single chunk.
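
To make the chunking step concrete, here is a minimal sketch of fixed-size splitting using only the Go standard library. It is not the code IPFS actually runs: `splitFixed` and `defaultChunkSize` are names invented for this illustration, the 256KiB figure is simply the default mentioned above, and the Merkle tree that IPFS builds over the leaves is not reproduced here.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// defaultChunkSize mirrors the ~256KiB default mentioned above.
// Assumption for this sketch: a plain fixed-size chunker with no other options.
const defaultChunkSize = 256 * 1024

// splitFixed (hypothetical helper) cuts data into consecutive chunks of at
// most size bytes. Empty input yields no chunks in this sketch.
func splitFixed(data []byte, size int) [][]byte {
	var chunks [][]byte
	for len(data) > size {
		chunks = append(chunks, data[:size])
		data = data[size:]
	}
	if len(data) > 0 {
		chunks = append(chunks, data)
	}
	return chunks
}

func main() {
	small := make([]byte, 10*1024)     // 10 KiB: fits in one chunk
	large := make([]byte, 5*1024*1024) // 5 MiB: needs many chunks

	fmt.Println("small file chunks:", len(splitFixed(small, defaultChunkSize)))
	fmt.Println("large file chunks:", len(splitFixed(large, defaultChunkSize)))

	// Each chunk becomes a leaf; its hash is what the parent node refers to.
	for i, c := range splitFixed(large, defaultChunkSize)[:2] {
		fmt.Printf("leaf %d: %d bytes, sha256=%x\n", i, len(c), sha256.Sum256(c))
	}
}
```

The point of the sketch is only that a file at or below the chunk size produces exactly one leaf, which is the case discussed in the rest of this comment.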

In the CIDv0 case, IPFS is taking your file content, wrapping it in a protobuf data structure, and then hashing it. IPFS has to do this because CIDv0 only supports one IPLD "codec" (DagPB). Every IPLD object with a V0 CID uses this same DagPB format.

CIDv1, on the other hand, supports many IPLD codecs (the specific codec used is recorded in the CID itself). In this case, because your file fits in a single block, IPFS is using the Raw codec (raw binary). That's why genCID.Prefix().Sum(content) works. In this specific case, ipfs add isn't chunking, wrapping, or re-encoding your file at all. It's just taking it as-is and directly using it as a block.
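
For illustration, the single-block Raw case can be sketched with the standard library alone. This assumes the usual byte layout of a CIDv1 (version 0x01, codec 0x55 for raw, then a sha2-256 multihash with the 0x12 0x20 header) and the base32 multibase prefix "b". `rawCIDv1` is a name made up for this sketch; in real code, go-cid is the better choice.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
	"strings"
)

// rawCIDv1 (hypothetical helper) builds a CIDv1 string for one raw block:
//   0x01       CID version 1
//   0x55       codec: raw
//   0x12 0x20  multihash header: sha2-256, 32-byte digest
//   digest     sha256(block)
// The bytes are base32-encoded (lowercase, no padding) and prefixed with
// the multibase character "b".
func rawCIDv1(block []byte) string {
	digest := sha256.Sum256(block)
	buf := append([]byte{0x01, 0x55, 0x12, 0x20}, digest[:]...)
	enc := base32.StdEncoding.WithPadding(base32.NoPadding).EncodeToString(buf)
	return "b" + strings.ToLower(enc)
}

func main() {
	// The block is hashed exactly as given: no chunking, wrapping, or re-encoding.
	fmt.Println(rawCIDv1([]byte("content")))
}
```

Because the four header bytes are constant, every CID produced this way starts with the same "bafkrei" prefix; only the digest part varies with the content.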

from go-cid.

Stebalien commented on July 21, 2024

These APIs are really bad, I'm so sorry.

The (current) file format wraps a protobuf within a protobuf. You've just created the inner protobuf, but you still need to create the outer one.

You need to call merkledag.NodeWithData(unixFSWrappedSampleData).Cid() where merkledag is github.com/ipfs/go-merkledag. That should produce the correct CID.
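
To show what "a protobuf within a protobuf" means at the byte level, here is a hand-rolled sketch that uses only the standard library. It is not the go-unixfs or go-merkledag code: every helper name below is invented for the illustration, it assumes a single non-empty chunk with no links, and the output has not been checked against a real node.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

const b58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// uvarint appends n to dst as a protobuf varint.
func uvarint(dst []byte, n uint64) []byte {
	for n >= 0x80 {
		dst = append(dst, byte(n)|0x80)
		n >>= 7
	}
	return append(dst, byte(n))
}

// unixfsFile encodes the inner protobuf: a UnixFS Data message with
// Type=File (field 1), Data (field 2) and filesize (field 3).
// Assumption: data is non-empty and fits in one chunk.
func unixfsFile(data []byte) []byte {
	out := []byte{0x08, 0x02} // field 1, varint: Type = File
	out = append(out, 0x12)   // field 2, length-delimited: Data
	out = uvarint(out, uint64(len(data)))
	out = append(out, data...)
	out = append(out, 0x18) // field 3, varint: filesize
	return uvarint(out, uint64(len(data)))
}

// dagPBNode encodes the outer protobuf: a DagPB node without links whose
// Data field (field 1) carries the payload.
func dagPBNode(payload []byte) []byte {
	out := uvarint([]byte{0x0a}, uint64(len(payload)))
	return append(out, payload...)
}

// base58 encodes b with the Bitcoin alphabet; leading zero bytes become "1".
func base58(b []byte) string {
	n := new(big.Int).SetBytes(b)
	radix, mod := big.NewInt(58), new(big.Int)
	var out []byte
	for n.Sign() > 0 {
		n.DivMod(n, radix, mod)
		out = append(out, b58Alphabet[mod.Int64()])
	}
	for _, c := range b {
		if c != 0 {
			break
		}
		out = append(out, '1')
	}
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

// cidV0 hashes an already-encoded block and formats the sha2-256 multihash
// (0x12 0x20 + digest) in base58. A CIDv0 carries no codec byte.
func cidV0(block []byte) string {
	digest := sha256.Sum256(block)
	return base58(append([]byte{0x12, 0x20}, digest[:]...))
}

func main() {
	inner := unixfsFile([]byte("content"))
	fmt.Println("inner only:   ", cidV0(inner))            // hashing just the UnixFS message
	fmt.Println("inner + outer:", cidV0(dagPBNode(inner))) // hashing the DagPB node around it
}
```

The two printed CIDs differ only because of the extra outer layer, which is the step missing from the snippet in the question.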

> guaranteed to fit within a single chunk

Are you willing to change the defaults? If you are, you can use ipfs add --raw-leaves my_file. If you do that and the data fits into one "chunk", the CID will be equivalent to the hash (specifically, it'll be a CID with the codec set to "raw").


NOTE: "fits into one chunk" means <= 1MiB (ish). IPFS will refuse to transfer larger chunks over bitswap as we don't want to download too much data without verifying it.
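
As a rough sketch of "the CID will be equivalent to the hash", the following standard-library code pulls the digest back out of a base32 CIDv1 with the raw codec and compares it with sha256 of the content. The helper names are invented for this illustration, and only the raw/sha2-256 layout is handled.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/base32"
	"errors"
	"fmt"
	"strings"
)

var b32 = base32.StdEncoding.WithPadding(base32.NoPadding)

// rawHeader is: CID version 1, raw codec, sha2-256 multihash code, 32-byte length.
var rawHeader = []byte{0x01, 0x55, 0x12, 0x20}

// encodeRawCIDv1 (hypothetical helper) builds the CID for a single raw block.
func encodeRawCIDv1(block []byte) string {
	sum := sha256.Sum256(block)
	buf := append(append([]byte{}, rawHeader...), sum[:]...)
	return "b" + strings.ToLower(b32.EncodeToString(buf))
}

// digestFromRawCIDv1 (hypothetical helper) returns the sha2-256 digest
// embedded in a base32 CIDv1 that uses the raw codec.
func digestFromRawCIDv1(c string) ([]byte, error) {
	if !strings.HasPrefix(c, "b") {
		return nil, errors.New("not a base32 (multibase \"b\") CID")
	}
	raw, err := b32.DecodeString(strings.ToUpper(c[1:]))
	if err != nil {
		return nil, err
	}
	if len(raw) != len(rawHeader)+sha256.Size || !bytes.Equal(raw[:len(rawHeader)], rawHeader) {
		return nil, errors.New("not a CIDv1 with raw codec and sha2-256")
	}
	return raw[len(rawHeader):], nil
}

// matches reports whether content hashes to the digest embedded in c.
func matches(c string, content []byte) (bool, error) {
	digest, err := digestFromRawCIDv1(c)
	if err != nil {
		return false, err
	}
	sum := sha256.Sum256(content)
	return bytes.Equal(digest, sum[:]), nil
}

func main() {
	content := []byte("content")
	c := encodeRawCIDv1(content)
	ok, err := matches(c, content)
	fmt.Println(c, ok, err)
}
```

This only holds for the single-chunk raw case described above; once a file spans several chunks, the root CID refers to a tree node rather than to the file bytes.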

Stebalien commented on July 21, 2024

DRK3 commented on July 21, 2024

@Vikram710 It's been a while, but from what I recall I didn't need multiple chunks, so it may be possible but I haven't attempted it.

Stebalien commented on July 21, 2024

(Closing for tracking, please feel free to continue discussing/asking questions)

DRK3 commented on July 21, 2024

Hi @Stebalien and @AminArria,

I'm running into the same issue - I'm trying to generate a v0 CID that matches the one that an IPFS node would generate, but without running an IPFS node. Is there some way of doing this?

Stebalien commented on July 21, 2024

I believe you can use ipfs add --only-hash offline. Alternatively, you could extract that code into a separate tool. But the resulting CID depends on a lot of knobs/structures so there's no way to "predict" it other than to generate it and throw away the results as you go.

DRK3 commented on July 21, 2024

@Stebalien You're right, ipfs add --only-hash does indeed seem to work offline. However (and I realize now that I failed to specify this in my previous post...), I was looking for a way to do this in Go code (i.e. without the ipfs command installed on the system).

I was hoping that this would replicate the behaviour of the ipfs command (which wraps the content in a protobuf):

	prefix := cid.Prefix{
		Version:  0,
		Codec:    cid.DagProtobuf,
		MhType:   mh.SHA2_256,
		MhLength: -1,
	}

	contentID, err := prefix.Sum(content)

But it seems like go-cid ignores the Codec for v0 CIDs?

Stebalien commented on July 21, 2024

DRK3 commented on July 21, 2024

@Stebalien Thanks so much for all your help and for the quick responses! And thanks for the heads up on how the CID can change depending on the IPFS config. I'll be sure to take that into account as I build my solution.

DRK3 commented on July 21, 2024

Hi @Stebalien,

I have a follow-up question. I was watching https://www.youtube.com/watch?v=Z5zNPwMDYGg to learn more about how adding data to IPFS works (great video, by the way).

I have a very specific use-case: generate CIDs locally that match the ones produced by the ipfs add command, with an IPFS node running only default settings (the current defaults as of today). The data I'm dealing with is very small and is guaranteed to fit within a single chunk.

What I tried doing was wrapping the data in a UnixFS file wrapper before calculating the CID, but I'm still not getting a matching CID. I've verified that the Merkle DAG should be just a single node by using https://dag.ipfs.io/, so no chunking/node balancing should be needed. It seems I'm still missing something - do you know what?

Here is a short code snippet showing exactly what I'm doing:

import (
	"github.com/ipfs/go-cid"
	"github.com/ipfs/go-unixfs"
	mh "github.com/multiformats/go-multihash"
)

func Example() {
	sampleData := []byte("content")

	unixFSWrappedSampleData := unixfs.FilePBData(sampleData, uint64(len(sampleData)))

	prefix := cid.Prefix{
		Version:  0,
		Codec:    cid.DagProtobuf,
		MhType:   mh.SHA2_256,
		MhLength: -1, // default length
	}

	contentID, _ := prefix.Sum(unixFSWrappedSampleData)

	// The CID produced here is QmXiUR1x5tZ5zk9AySV4cmD3X72M5to3gWXcx2LnCWZDRY
	// but the CID from IPFS is QmbSnCcHziqhjNRyaunfcCvxPiV3fNL3fWL8nUrp5yqwD5
	// Is there another step I'm missing?
	println(contentID.String())
}

DRK3 commented on July 21, 2024

@Stebalien Yep, that did the trick!

> These APIs are really bad, I'm so sorry.

No need to apologize! Thanks for all your hard work on this awesome (and free) project!

> Are you willing to change the defaults? If you are, you can use ipfs add --raw-leaves my_file. If you do that and the data fits into one "chunk", the CID will be equivalent to the hash (specifically, it'll be a CID with the codec set to "raw").

For now my requirement is to support the default IPFS settings, but this is really good to know. I'll keep this in mind in case my requirements change (and/or I need to support more configurations).

Thanks again so much for your help!

Vikram710 commented on July 21, 2024

@DRK3 By any chance, were you able to extend the solution to programmatically compute the IPFS CID for multiple chunks? (My files will be around 4-5 MB.)
