
Comments (9)

iherman commented on September 27, 2024

The text proposed in the explainer (after @msporny's changes) is more specific: the hash can indeed be used to see if there is a change in the dataset. What it does is say whether there has been a change or not, not what exactly the change is. The former is absolutely o.k.; the latter is, I suspect, a very different problem.
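A minimal sketch of that distinction, assuming a toy canonicalization step: sorting N-Triples lines stands in for real RDF Dataset Canonicalization (it is only stable when no blank nodes occur), and the names below are hypothetical.

```python
import hashlib

def dataset_hash(ntriples: str) -> str:
    # Sorted-line hashing is a stand-in for RDF Dataset Canonicalization;
    # it is only stable for data that contains no blank nodes.
    canonical = "\n".join(sorted(line for line in ntriples.splitlines() if line.strip()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

before   = '<urn:a> <urn:p> "1" .\n<urn:b> <urn:p> "2" .'
shuffled = '<urn:b> <urn:p> "2" .\n<urn:a> <urn:p> "1" .'  # same triples, different order
edited   = '<urn:a> <urn:p> "3" .\n<urn:b> <urn:p> "2" .'  # one literal changed

# The hash says *whether* something changed, not *what* changed.
assert dataset_hash(before) == dataset_hash(shuffled)
assert dataset_hash(before) != dataset_hash(edited)
```

The equal hashes for `before` and `shuffled` show why canonicalization matters at all; the unequal hash for `edited` reveals only that a change exists, nothing about its nature.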

from rch-wg-charter.

iherman commented on September 27, 2024

I propose to close this issue. The cumulative changes via @msporny's PRs seem to cover the points I had...


msporny commented on September 27, 2024

@aidhog wrote:

But it becomes complicated if the partitions depend on the labels; in other words, I think the partitions would have to be well-defined without using blank nodes.

Yes, agree with everything you said.

The use case I added had more to do with "it's a tool that a human could use to help them narrow down on changes", where they might have to use their brain to reason about the changes (rather than an automated reasoner). I'll try to think more about how to reword that use case to be more accurate.


iherman commented on September 27, 2024

By the way, we would probably have to move this repository to w3c (it makes it easier to solicit reviews, etc.). I would do that when we are ready to go.


msporny commented on September 27, 2024

agree on the proposed length of the WG (the current text is 2 years)

I think 2 years is adequate... ideally, we get done sooner. The only thing that could derail us is an entirely new proposal or a big change to the existing implementations and test suite.

would be nice to have at least one co-chair settled

Agree... haven't put much thought into that yet. Agree that it should be one from the CCG/DID/VC orbit and another from a very different community, at least.

there is a clear example of usage through Verifiable Credentials. I believe another example referring to other usages of Linked Data must be added before we start, hopefully with a publicly expressed need that can be referenced (e.g., signing an ontology for XXX usage, or using an unambiguous reference/identification for a public dataset, etc.)

We do need a use cases document, which we don't have yet. Here are a few use cases that have surfaced over the past several years:

  • RDF Dataset Canonicalization Use Cases
    • Cryptographic Hashing
      • Digital Signatures and Proofs
    • Dataset Comparison (to determine if/what information has changed)
      • Cache invalidation
      • Detecting data tampering/modification
      • Debugging digital signature failures (what information changed between sender and receiver?)
  • Linked Data Proofs and Signatures
    • Verifiable Credentials
      • Cryptographically establishing source of information
    • Verifiable Presentations
      • Replay Protection
    • Protection of Arbitrary RDF Datasets

agree on the deliverables (and their titles, though that can be kept flexible)

  • RDF Dataset Normalization (REC)
  • Linked Data Proofs (REC)
  • Linked Data Signatures (REC)

have a public reference to the paper of Dave and Rachel, which should also refer to a public review

I'll contact the authors and get them to post to CCG.

list of the 'other' deliverables

  • Use Cases and Requirements
  • Ed25519Cryptosuite (NOTE), JOSE Cryptosuite (NOTE), Koblitz Cryptosuite (NOTE)

a realistic timeline for a FPWD and for a CR

  • RDF Dataset Normalization - FPWD +3 months, CR +12 months
  • Linked Data Proofs - FPWD +3 months, CR +15 months
  • Linked Data Signatures - FPWD +3 months, CR +15 months

possible list of external organizations we want to liaise with

  • All the standard W3C ones - PING, a11y, TAG
  • All W3C Security WGs
  • W3C CCG, VCWG, DIDWG
  • IETF CFRG
  • DIF SDSWG
  • Hyperledger Aries


iherman commented on September 27, 2024

Preamble: I try to be extremely cautious in not promising more than what we want to do here, and to stay very focused. We know that we will have to be very convincing...

agree on the proposed length of the WG (the current text is 2 years)

I think 2 years is adequate... ideally, we get done sooner.

Sweet dreams :-)

The only thing that could derail us is an entirely new proposal or a big change to the existing implementations and test suite.

there is a clear example of usage through Verifiable Credentials. I believe another example referring to other usages of Linked Data must be added before we start, hopefully with a publicly expressed need that can be referenced (e.g., signing an ontology for XXX usage, or using an unambiguous reference/identification for a public dataset, etc.)

In the proposed version in PR no. 7 what I propose to do is to have:

  • two very focused use cases (see that PR preview) in the charter
  • have a longer list in the explainer document (see the version in the PR)

We do need a use cases document, which we don't have yet. Here are a few use cases that have surfaced over the past several years:

  • RDF Dataset Canonicalization Use Cases

    • Cryptographic Hashing

I believe that is a fundamental one, see #6.

      • Digital Signatures and Proofs
    • Dataset Comparison (to determine if/what information has changed)

That one I do not really believe in (we have discussed this elsewhere). I do not see the role of canonicalization in that one.

      • Cache invalidation
      • Detecting data tampering/modification
      • Debugging digital signature failures (what information changed between sender and receiver?)
  • Linked Data Proofs and Signatures

    • Verifiable Credentials

      • Cryptographically establishing source of information
    • Verifiable Presentations

      • Replay Protection
    • Protection of Arbitrary RDF Datasets

Could you make a selection and add those, possibly with a PR, to the explainer? Not many, I think 3-4 should be enough.

agree on the deliverables (and their titles, though that can be kept flexible)

  • RDF Dataset Normalization (REC)

I believe the right terminology (that I also saw elsewhere) is canonicalization...

  • Linked Data Proofs (REC)

I have problems with this: it is, or suggests, something way too generic and therefore opens us up to objections. We do not do (generic) proofs; we "only" define a way of hashing and signing linked data. Better to choose a title that reflects this.

  • Linked Data Signatures (REC)

If the previous document is expressing signatures (which I believe is the case), then the third document only defines the vocabulary to express those signatures in linked data. I think it is worth separating the generic procedure (which is the previous document) from the way it is expressed via a suitable vocabulary. The generic procedure may be usable by itself (e.g., the result of the hash); it does not have to be expressed via a vocabulary...

Hence my proposal for the texts (and the description) in the charter text. I believe they are clearer (though maybe a bit convoluted) in specifying what we want to achieve. Again, we have to be defensive.

have a public reference to the paper of Dave and Rachel, which should also refer to a public review

I'll contact the authors and get them to post to CCG.

That should be fine. Don't forget a reference to the reviews.

list of the 'other' deliverables

  • Use Cases and Requirements

We have to be careful. If we explicitly refer to a document like that here, then we may get pushback saying "do a working group when you already have a UCR". I am not sure how to handle that...

  • Ed25519Cryptosuite (NOTE), JOSE Cryptosuite (NOTE), Koblitz Cryptosuite (NOTE)

The current draft is slightly more generic: "A Linked Data cryptosuite registry, containing Linked Data related cryptographic terms, including, although not restricted to, terms used for Linked Data Hash or Signatures." Do you really think we should be that specific in the charter?

a realistic timeline for a FPWD and for a CR

  • RDF Dataset Normalization - FPWD +3 months, CR +12 months
  • Linked Data Proofs - FPWD +3 months, CR +15 months
  • Linked Data Signatures - FPWD +3 months, CR +15 months

These sound about right. Although... the really tough one seems to be the first one. Isn't it possible to have the other two documents released at the same time? Why that 3 months gap?

(Note to myself: probably better to write the timeline in terms of "T+3", etc)

possible list of external organizations we want to liaise with

  • All the standard W3C ones - PING, a11y, TAG
  • All W3C Security WGs

There is now a standard boilerplate text in the charter that covers all the horizontals.

  • W3C CCG, VCWG, DIDWG

yep, they are all there

  • IETF CFRG
  • DIF SDSWG
  • Hyperledger Aries

I would need a precise link and a text for those (or a PR from you with those).

Before getting into PRs, it would be nice to agree (or decide to disagree :-) on #7


msporny commented on September 27, 2024

I have now done a complete pass and editorial suggestions in PRs #10, #11, #12, #13, and #14.

There are three high-level take-aways for my suggestions:

  • Linked Data Hashing is going to confuse people... I have eliminated the use of "Normalization"... we mean Canonicalization, that is the correct Computer Science term and we should not deviate from that word.
  • We should combine Linked Data Proofs and Linked Data Signatures into one specification called "Linked Data Security". I made the suggestion in the PRs, and if we're ok with that, then I'll make the changes to the CCG specifications.
  • Stating that the work is only about Signatures is also going to confuse people, because it's not just about Signatures. We should call the WG "Linked Data Security", and name one of the specs "Linked Data Security", and the registry should be the "Linked Data Security Registry".

Again, these are suggestions (fairly heavy suggestions -- I do think it's the right path), but would love to hear thoughts.


msporny commented on September 27, 2024

In the proposed version in PR no. 7 what I propose to do is to have:

  • two very focused use cases (see that PR preview) in the charter
  • have a longer list in the explainer document (see the version in the PR)

Agree, I will try to add some more to the explainer. We do want people to be aware that we are focused now, but the work may expand in the future.

  • Dataset Comparison (to determine if/what information has changed)

That one I do not really believe in (we have discussed this elsewhere). I do not see the role of canonicalization in that one.

This is a big use case for us, @iherman... we use it all the time to debug broken digital signatures. I'm fine w/ not putting a focus on it, but it is an important use case.

Could you make a selection and add those, possibly with a PR, to the explainer? Not many, I think 3-4 should be enough.

Yes, I can do that... also, a few more came up on the CCG mailing list yesterday (based on work that Alan Karp did in 2004).

I believe the right terminology (that I also saw elsewhere) is canonicalization...

Agreed, I am currently updating all the things to match the "Canonicalization" terminology.

I have problems with this: it is, or suggests, something way too generic and therefore open us up for objections. We do not do (generic) proofs; we "only" define a way of hashing and signing linked data. Better choose a title that reflects this.

I updated to "Linked Data Security" and clearly outlined what would be in the specification (as focused and concrete).

I'll contact the authors and get them to post to CCG.

That should be fine. Don't forget a reference to the reviews.

Done, this ball is rolling.

  • Use Cases and Requirements

We have to be careful. If we explicitly refer to a document like that here, then we may get pushback saying "do a working group when you already have a UCR". I am not sure how to handle that...

I can quickly put a Use Cases document together if that happens. For now, I think the Explainer is good enough. I'll fill more use cases out there.

  • Ed25519Cryptosuite (NOTE), JOSE Cryptosuite (NOTE), Koblitz Cryptosuite (NOTE)

The current draft is slightly more generic: "A Linked Data cryptosuite registry, containing Linked Data related cryptographic terms, including, although not restricted to, terms used for Linked Data Hash or Signatures." Do you really think we should be that specific in the charter?

I took your text and modified it slightly to not put the possibility of those NOTEs out of scope.

These sound about right. Although... the really tough one seems to be the first one. Isn't it possible to have the other two documents released at the same time? Why that 3 months gap?

Based on the reality that I've experienced throughout the last decade... these things tend to fall on a very small number of overworked people, so I'm trying to be kind to them. :)

(Note to myself: probably better to write the timeline in terms of "T+3", etc)

I updated these values in my PRs... used "WG-START + 3 months".

  • IETF CFRG
  • DIF SDSWG
  • Hyperledger Aries

I would need a precise link and a text for those (or a PR from you with those).

Done in aba5c9f.

Before getting into PRs, it would be nice to agree (or decide to disagree :-) on #7

Here's the proposal:

https://pr-preview.s3.amazonaws.com/iherman/ld-signatures-charter/pull/13.html#timeline

What do you think?


aidhog commented on September 27, 2024

  • Dataset Comparison (to determine if/what information has changed)

That one I do not really believe in (we have discussed this elsewhere). I do not see the role of canonicalization in that one.

Also a bit unsure about this part. I guess it might need to be worded carefully to avoid the impression that we can solve this in the general case.

Canonical labelling could be used, for example, to see which graphs in a dataset changed.

It could also be used to build hash structures like Merkle trees over large RDF graphs.

More generally, you could use canonical labelling to see which (pre-defined) partitions of a graph changed, by using the labels to hash each of those partitions and comparing the hashes. But it becomes complicated if the partitions depend on the labels; in other words, I think the partitions would have to be well-defined without using blank nodes. Also, it may not be very helpful for understanding what actually changed within each partition (just whether the partition changed or not).
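The per-partition idea above could be sketched as follows, using named graphs as the pre-defined partitions. Sorted-line hashing stands in for canonical labelling (so this only works when names and triples contain no blank nodes), and all names are hypothetical.

```python
import hashlib

def graph_hash(triples: set) -> str:
    # Sorted-line hashing stands in for canonical labelling; it is only
    # reliable when the graph names and triples contain no blank nodes.
    return hashlib.sha256("\n".join(sorted(triples)).encode("utf-8")).hexdigest()

def changed_partitions(old: dict, new: dict) -> set:
    """Names of graphs whose per-graph hash differs between two datasets.

    This narrows a change down to a partition, but says nothing about
    what changed inside that partition.
    """
    return {name for name in set(old) | set(new)
            if graph_hash(old.get(name, set())) != graph_hash(new.get(name, set()))}

old = {"g1": {'<urn:a> <urn:p> "1" .'}, "g2": {'<urn:b> <urn:p> "2" .'}}
new = {"g1": {'<urn:a> <urn:p> "1" .'}, "g2": {'<urn:b> <urn:p> "3" .'}}
assert changed_partitions(old, new) == {"g2"}
```

Comparing per-graph hashes pinpoints `g2` as the changed partition without diffing triples, which is exactly the "narrow down, then use your brain" workflow described above.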

