w3c / rch-wg-charter Goto Github PK

View Code? Open in Web Editor NEW

11.0 18.0 7.0 499 KB

Charter proposal for an “RDF Dataset Canonicalization and Hash Working Group”

Home Page: https://w3c.github.io/rch-wg-charter/

License: Other

HTML 100.00%

rch-wg-charter's Introduction

Proposed Charter for an RDF Dataset Canonicalization and Hash Working Group

Any text in this repository should be considered as a draft of a draft. The process to get to a charter is:

(completed) agreement among the authors of the charter for a first public version;
(completed) transfer this repo to W3C so that it becomes more 'official';
(completed) discuss the charter with the W3C strategy team as part of an internal review;
(completed) release the charter to the AC as an "advance notice";
(in progress) gather community input and seek consensus to evolve and finalize the charter;
submit the charter for approval to the W3C management;
submit the text to an official review by the AC of W3C.

Note that, in the process, the proposed name of the WG changed (from “Linked Data Signature” to “RDF Dataset Canonicalization and Hash”). This was to align with the discussion happening in step (5) above.

The properly rendered version is also available.

rch-wg-charter's People

Contributors

Stargazers

Watchers

Forkers

pchampin msporny philarcher csarven peacekeeper

rch-wg-charter's Issues

Tweak out of scope

For avoidance of doubt, I think this:

Definition of new signature or encryption functions. This Working Group will only define suitable terms to identify such functions that the community has developed, or will develop in future.

Should be changed to something like this:

Definition of new cryptographic signature or encryption primitives. This Working Group will only define suitable terms to identify such primitives or their combinations that the community has developed, or will develop in future.

These changes would avoid leading people to believe that we can't define how to serialize the data to be signed, e.g., we should be able to say that the way you produce the data to be signed for LD Signature type Foo is:

Canonize content graph CG and hash it with hash function X.
Canonize proof graph PG and hash it with hash function X.
Concatenate the result from 1 and 2 (1 + 2).
Hash the result from 3 and sign it with signature function Y.

What is definitely out of scope is defining new cryptographic primitives, i.e., the WG will not be coming up with new elliptic curves or defining new lattice-based cryptosystems.

Transmute supports the charter

We're really excited about this work, one of our areas of interests is the different ways that canonicalization can be used as input to signatures... The simple case is just signing the canonicalized representation or a hash of it.

https://github.com/w3c-ccg/lds-jws2020

More advanced cases sign parts of the canonicalization, and leverage those signatures together...

For example:

https://github.com/w3c-ccg/ldp-bbs2020

How these cryptographic signature algorithms work is out of scope for the charter... but I think describing the 2 different input approaches will be exciting, particularly in the context of BBS+.

TNO/Rieks Joosten supports the W3C RCH WG

TNO/Rieks Joosten supports the W3C LDS WG

Blockcerts support LDS WG

Having comprehensive standards to help design solutions with predictability is key to the advancement of technology. With the new SSI space emerging, having a standardized set of rules to implement new solutions will allow for more consistency between the different actors of the space. With that in mind, this technology can expand to other solutions and help improve the global digital human experience. We trust the working group can produce the right documentation for that goal.

Semmtech supports the W3C RCH WG

Semmtech would like to voice its support of the creation of the W3C Linked Data Signatures Working Group. We are a consultancy and software development company offering semantic solutions to share and exchange data effectively. We assist our clients, the majority of which in the infrastructure and construction sectors, in using Linked Data to capture domain-specific ontologies, share these with contractors, and use the commonly available terminology within to express and exchange project information on physical assets (e.g., bridges, tunnels, roads).

We see LDS as an important step in achieving a mechanism of verifying integrity of large project datasets shared between multiple parties in a decentralized manner (e.g., available on a Linked Data Platform) instead of being sent back and forth as part of the communication itself. Being able to obtain a canonical version of an RDF dataset in order to create a hash and signature for that dataset will be vital for further decentralizing data transactions in the industries we work in.

SecureKey supports the W3C RCH WG

SecureKey Technologies would like to voice its support of the creation of the W3C Linked Data Signatures Working Group. We are making use of this tech in our emerging product base, and see LDS as a critical technology in supporting an open trust and assertion ecosystem.

Spruce supports the W3C RCH WG

Spruce Systems, Inc. expresses its support for the W3C LDS WG. We believe that LDS is a key enabling technology to more secure, open, and decentralized networks. It allows us to harness new advances in cryptography and data formats across different architectures and platforms. Furthermore, it helps us port more of the tacit yet crucial guarantees around information privacy, sharing, and authenticity from real world interactions to the digital world.

Inrupt supports the W3C RCH WG

Inrupt would like to express support for the creation of a W3C Linked Data Signatures Working Group.

For Linked Data applications in which users share information with others, it is critical to build on a strong foundation of trust and security. In addition, by formalizing these mechanisms in the context of a W3C Working Group, it becomes much more possible to achieve widespread interoperability.

The deliverables of the LDS WG are of interest to Inrupt's work, for example, on consent, VC-related tooling, as well as other areas in the Solid project. Inrupt is also looking into joining the LDS WG to contribute to the progression of the specifications.

Add a reference to the RDF-DEV CG to the groups to liaise with

The RDF Dev CG is responsible for the development of RDF-star which may also become a work item for W3C. It would be good if there was a clear path for signing RDF-star datasets (even if non-normative...).

cc @pchampin @msporny @dlongley

Anticipate a future RDF specification

It seems likely there may be a new RDF and/or SPARQL working group chartered near the time that canonicalization is chartered. It would be useful for the specification to be defined in such a way that could anticipate a range of future updates without requiring canonicalization, itself, to be revisited.

For example, there is a fair amount of work behind RDF-star right now, and it would likely be considered by future RDF and/or SPARQL working groups. Language for canonicalization could be written to allow for a triple to have more than two blank nodes by not over-specifying that blank nodes are constrained to be just the subject or object of a triple. In my experience, this has worked for The RDF 1.1 description of dataset isomorphism to allow for comparing the results of rdf-star dataset evaluation tests.

The principle could be to use language that specifically allows for extension by other specifications, rather than being overly prescriptive. Describing the boundaries of such extension could be challenging to not invalidate the mathematics of the canonicalization algorithm.

Jolocom supports the W3C RCH WG

At Jolocom we make use of Linked Data Signatures in combination with JSON-LD Verifiable Credentials, as well as DID Documents.

We strongly believe that the work proposed by this working group is required for implementations such as ours to be interoperable and secure.

ED of explainer points to charter doc.

The editor's draft link for https://w3c.github.io/lds-wg-charter/explainer.html points to https://w3c.github.io/lds-wg-charter/. Probably requires an edDraftURI configuration.

Explain why signing any serialization is not sufficient?

Disclaimer: I am not a security or cryptography expert, so this may be a stupid question.

The first paragraph of §1 Scope gives examples of why signing RDF datasets is useful. However, one could imagine a poor man's solution where the an arbitrary serialization of the dataset is signed, and that serialization is then used whenever the dataset needs to be exchanged.

I am sure that there are cases where this wouldn't work, and I can vaguely imagine them. But I believe that they should be briefly spelled out in the charter's intro, to justify that their is an actual need beyond the poor's man solution above (if only for the sake of philistines like me ;).

Consider defining a canonical serialization to bytes, rather than a hash

Different systems that need to hash or sign an RDF dataset are likely to have different requirements on the security properties of those hashes and signatures. Picking a single hash function (the draft charter mentions BLAKE3 or SHA-3) will nail down a particular set of security properties rather than letting the embedding system decide. I think the RDF-specific part is about defining the series of bytes that represents the dataset, and then those bytes can be fed into any hash or signature algorithm.

I don't feel strongly about changing this, but it might reduce the amount of arguing you have to do later.

Optical/RF data carriers

I wonder whether there is scope for a non-normative NOTE in this WG about encoding of small RDF Datasets in optical and radio frequency data carriers, such as QR, Data Matrix and NFC tags. Orie has done some excellent work on creating CBOR-LD-based VCs that could meet the current stellar use case of COVID-status certificates but that same approach could be applied to other certificates. At GS1 we'd be thinking of things like certified organic, certified gluten-free etc. but the potential use cases are legion of course.

In a limited capacity data carrier, you don't have room for niceties and flexibility so a standardized approach is going to be essential

mesur.io supports the W3C LDS WG

mesur.io is highly supportive of the LDS WG and in particular of possible additions as mentioned by Transmute in #30

Proper and consistent canonicalization and signing on linked data items is key to many aspects of our work, not just around Verifiable Credentials but also in use of linked data for certain aspects of machine learning where there must be a certain level of trust around data utilized in training and validation.

Change of terminology "Canonicalization" -> "Canonical Labelling"?

I am wondering whether we could not avoid unnecessary frictions and misunderstandings by slightly changing our terminology.

The term "Canonicalization" automatically triggers, for many, a reference to the Canonical XML Specification. Without going into the details, that specification describes, fundamentally, a complex syntactic transformation of the original XML content (see the overview of the type of transformation in the Terminology Section of the aforementioned specification). Implementation of those steps are complex and many in the XML community claim that this is an unnecessary step for security purposes.

However. The case of RDF Graphs is fundamentally different and has no analogy in the XML (or JSON, for that matter) context. The problem to solve is to define a canonical blank node mapping or (canonical blank node relabelling), which happens on the abstract RDF graph and not on a specific serialization. This is deeply rooted in the RDF data model.

Comparison between the RDF blank node relabelling and XML Canonicalization is therefore comparing apples and oranges, and only the source of unnecessary frictions and discussions.

My proposal: let us do an overall change of terminology in the charter and all the other documents, replacing the term (Linked Data) Canonicalization by, say, Canonical Labelling (I am not bound to this term, if there is a better one I am fine, too).

Wdyt?

@msporny @dlongley @pchampin @samuelweiler @wseltzer @aidhog

Digital Bazaar supports W3C RCH WG

Digital Bazaar would like to add its voice to supporting the creation of the W3C Linked Data Signatures Working Group. Our company has depended on this work for years in the production systems we run for our customers.

Given Linked Data Signature adoption among governments around the world (for use in W3C Verifiable Credentials and W3C Decentralized Identifiers), as well as private industry (deployments to 152,000+ retail stores across the US)... and use in privacy-protecting digital signatures (BBS+) and other selective disclosure schemes... we think it's high time we cut an official standard that governments can use to move toward a more privacy-preserving future for their citizens.

The reasons above are in addition to all of the benefits that will be gained by verifiable data publishers -- there is a deep requirement to know where data came from on the Web today, misinformation is rampant, and this WG will standardize technologies that could be used to combat misinformation and gain trust in our Web-based data sources again.

For these reasons (and others), Digital Bazaar is supportive of the LDS WG charter:

https://w3c.github.io/lds-wg-charter/

Need a glossary for the acronyms the lds-wg-charter

... or ummm a link to one, e.g. AC => https://www.w3.org/wiki/AdvisoryCommittee. As a lurking noob, I should probably continue to just lurk. But you might want to lure in more casual readers like me. Or not, because you know, they'll ask questions. And speak up. Annoying stuff like that...

imec supports the W3C RCH WG

imec supports the W3C Working Group on Linked Data Signatures.

Our KNoWS research team has focused strongly on transforming, publishing, and accessing data in decentralized environments. Provenance is not sufficient for keeping track of whether information can be trusted; we need to provide the ability for publishers to sign data such that it can be proven that certain data is authentic.

In particular, for our work within the Solid ecosystem, we see a necessity for this technology and already had this on our own roadmap. If we want citizens to be in control of their own data, it should be possible for authorities to write signed data into citizens' data vault, such that those citizens then can share that data with other parties of their choice in a trusted manner. In particular, we are also interested in subsetting, such that parts of signed Linked Data set can also be exchanged with trust.

We want to contribute to writing, editing, and implementing this specification.

Stale expressions of support

Why are there many expressions of support for a no-longer-really-in-progress charter littering the issue list here?

Convergence.tech supports the W3C LDS WG

Convergence.tech would like to voice its support of the creation of the W3C Linked Data Signatures Working Group. Our company currently depends on this work for many of our production systems to date supporting a broad customer base.

Given Linked Data Signature adoption among governments, private industry, and growing use in privacy-protecting digital signatures (BBS+) we believe it is suitable to raise an official standard as soon as possible that we all as technologists and technology consumers can use to move toward a more safe and secure future.

Furthermore, Convergence.tech is supportive of the LDS WG charter:

https://w3c.github.io/lds-wg-charter/

Vague mentions of json-ld context work item needs clarification

The Working Group will also provide standard ways to represent this vocabulary in various RDF serialization formats, such as by providing JSON-LD contexts for JSON-LD serializations. (See the separate explainer document for more detailed technical backgrounds and for the terminology used in this context.)

and later

The specification may also define one or more JSON-LD Context documents to be used by a JSON-LD serialization.

The word "context" does not appear in the explainer, and afaik there is no normative dependency on json-ld since the work is syntax-neutral. The charter or explainer could usefully explain what's needed from a json-ld context, what ongoing maintenance, security, privacy, longevity, and uptime commitments w3c (including systems team) expects to make if the context definitions are an integral part of using json-ld for secured RDF. How will non-JSON-LD formats match whatever the context does? Is it a syntactic sugar kind of a mechanism, or a must-have? The word "may" suggests the former, in which case there should be chartered work to assess the potential costs and risks for using json-ld contexts in this way.

These mentions should be clarified. Perhaps, for example, the expectation is that the content of the context will be served or directly included locally in applications? Or signed? cached, etc?

See also https://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/ for previous W3C operational woes in this area.

My understanding is that it is widely considered proprietary or non-standard to extract RDF triples from a json-ld instance document without having up to date content of all relevant context files to hand (including the potential context alluded to in the charter). Is this a misreading? This WG is predicated on a workflow based entirely on processing RDF triples into canonicalizable graphs, so we need to understand how these pieces fit together.

Support for RCH WG

I support creation of this working group. I have implemented The Dataset Canonicalization specification and feel that it is of general import to the Linked Data/RDF ecosystem. Canonicalization is a necessary component for creating a digital signature, but useful and of itself for a number of RDF use cases.

Skolemisation as a use-case?

The RDF 1.1 spec describes the option of replacing blank nodes with (Skolem) IRIs. It seems like it could be a pretty direct use-case of canonicalisation: to compute deterministic Skolem IRIs to replace blank nodes with. Given the same RDF graph/dataset (or an isomorphic copy), different "Skolemisers" based on a canonical form could produce the same Skolem IRIs without coordination. This would be a practical way to remove blank nodes entirely from an eco-system, mint dereferenceable IRIs automatically, etc. (The fact that Skolem IRIs were included in the RDF 1.1 spec would also seem to suggest that they were deemed important.)

Vrije Universiteit Amsterdam supports the W3C RCH WG

Vrije Universiteit Amsterdam, a Dutch public research university, supports the planned W3C Working Group on Linked Data Signatures.

Within our User Centric Data Science group, we study the use of Linked Data in the form of nanopublications in the context of scientific data and scientific communication in general. We are also applying digital signatures and cryptographic hashes for making such nanopublications immutable and verifiable. Aligning this work with standards such as the ones proposed by this working group would strengthen this endeavor, ensure interoperability, and put it onto a more solid foundation.

We are also interested in actively contributing to this working group in the design as well as implementation stages.

Hash function rebranding?

TL;DR: I propose to “rebrand” the “unique identification” terminology into something like “Linked Data Hash” (or “Linked Data Hashing Function”).

Details:

At the moment we say, for example, “The scope of this Working Group is to define a Standard to canonicalize, sign, or uniquely identify RDF Datasets”. The unique identifier we are talking about is the result of a hash function applied on the ordered set of quads representing the canonical version of the graph.

I think, however, that the term “hashing” would resonate more to a number of technical people out there, rather than “unique identification” (even if the two are the same). This dawned on me when @pchampin and I discussed a perfect use case for the technology we are talking about: adding a hash of a dataset in a data or vocabulary repository (e.g., to the Linked Open Vocabularies) just like many repositories of code, applications, javascript files and modules, etc, add the hash value of the relevant resource. This has become familiar to many alongside the concept of signatures. We even have a hashlink proposal for its usage :-). I am a bit afraid that the concept of “identifier” may not resonate the same way.

Technically, we are of course talking about the same thing; this is purely a branding exercise. I am happy to propose a re-write of the text(s) if you guys agree.

@msporny @dlongley @pchampin

Digital Contract Design supports the W3C RCH WG

Most of our work at DCD relies on DIDs and Verifiable Credentials. We need interoperable signatures.

SemanticClarity/John Walker supports the W3C RCH WG

SemanticClarity would like to add its voice to supporting the creation of the W3C Linked Data Signatures Working Group.

For these reasons (and others), SemanticClarity is supportive of the LDS WG charter:

https://w3c.github.io/lds-wg-charter/

MATTR supports the W3C RCH WG

MATTR would like to express support for the creation of a W3C Linked Data Signatures working group.

Given the current environment on the web today, there is a growing need to evolve capabilities around the authenticity and provable origin of data to combat problems such as mis-information. Linked Data Signatures offer multiple novel features in the space of digital signature representation including in the area of selective disclosure capable schemes.

Proof Market Incorporated / Tony Rose Supports the W3C LDS WG

Proof Market is focused on the development of User Experience and Customer Experience leveraging the latest technology in Decentralized Digital Identity. In the past year our focus on MedCreds in order to alleviate the effects of the pandemic and the lockdown have sent us deep down the verifiable credential representation technologies (JSON, XML, etc.)

Our experience as developers is that further work to standardize on data representation that supports rich use cases, ZKPs, and linked data sets will accelerate our ability to deploy real solution into the world that interoperate with an ecosystem of solution providers.

We support this working group charter and are excited to participate in BUIDLing the future of digital identity.

A is B considered harmful

Saying that “Linked Data” is used as a synonym to “RDF” doesn't make it so but has the connotation that W3C considers it to be so. Unless W3C actually considers this to be so (and evidence of this is provided) the statement should not be in the WG charter.

Change name of charter to "Linked Data Security"

This WG could standardize the following things over multiple iterations. For the sake of this issue, let's call these phases.

Phase I Specifications

RDF Dataset Canonicalization (Algorithm) - Used to take an RDF Dataset as input and canonicalize it to a deterministic output in NQuads format.
Linked Data Proofs (Algorithms and Vocabulary) - Used create and verify abstract mathematical proofs (algorithms) on a canonicalized dataset and express the algorithm inputs/outputs (vocabulary).
Linked Data Signature (Vocabulary) - Used create and verify concrete digital signatures (algorithms) on a canonicalized dataset and express the algorithm inputs/outputs (vocabulary).

Phase II Specifications

Ed25519 Cryptosuite (Vocabulary) - A vocabulary for expressing a linked data signature for the Ed25519 elliptic curve.
JOSE Cryptosuite (Vocabulary) - A vocabulary for expressing a linked data signature using the JOSE cryptography standards.
Secp256k1 (Vocabulary) - A vocabulary for expressing a linked data signature using the Koblitz k1 curve.

Phase III Specifications

BBS+ Cryptosuite (Vocabulary) - A vocabulary for expressing a linked data signature using the BBS+ BLS pairing-friendly curve for use in selective disclosure schemes.
Authorization Capabilities (Algorithms and Vocabulary) - Used to create and verify cryptographic authorization grants, delegations, and attenuations using the Linked Data Security cryptosuites.

We should name the WG such that it can easily be rechartered to handle each of these phases.

3 Round Stones supports the W3C RCH WG

3 Round Stones, a linked data product & service company, supports an official standard for use by governments. We support greater privacy-protecting measures for government transactions, and those between the public sector & citizens.

Verifiable data is and will continue to be central to public trust and the use of Web technologies. Ideally, the proposed WG will help standardize technologies to ensure provenance and authenticity of data published on the Web.

For these reasons (and others), 3 Round Stones supports the proposed Linked Data Signatures WG charter:

https://w3c.github.io/lds-wg-charter/

Sincerely,
Bernadette Hyland-Wood
Founder & CEO

Create a registry of hash functions

It looks like every standard tries to define its own list of hash functions. As suggested in #46 (which argues for a broader kind of registry) we need a normative list of hash functions, each with URI and basic information such as names, link to specification, length of hashes etc. See list of hash functions in Wikipedia and most relevant the table of multicodec project (filter by column tag=multihash). IANA has used many lists of hash functions.

Diwala supports W3C LDS WG

Diwala would like to add its voice to supporting the creation of the W3C Linked Data Signatures Working Group.

It is crucial work for the requirement for DIDs and Verifiable Credentials, where digital signatures and other proofs over RDF datasets are used for creating/resolving/updating/deactivating DIDs, and for issuing and verifying Verifiable Credentials.

We see LDS as a critical technology in supporting an open trust and assertion ecosystem. We believe that LDS is a key enabling technology to more secure, open, and decentralized networks

We are currently building products that will use this tech going forward!

Rename "Linked Data Hash" as "RDF Dataset Hash"

The charter has 2 sections - deliverables 1 and 2 are about single RDF Datasets, 3 and 4 about linkages and "Linked Datasets" introduced in item 3 (LDI).

It would be helpful to rename deliverable 2

Linked Data Hash => RDF Dataset Hash (RDH)

to reflect it applies to one dataset and follows on from RDC.

Danube Tech supports the RCH WG

This is very important work, since it is required for DIDs and Verifiable Credentials, where digital signatures and other proofs over RDF datasets are used for creating/resolving/updating/deactivating DIDs, and for issuing and verifying Verifiable Credentials.

But this is also more than just a necessity for DIDs and VCs. It's really a new and fundamental building block for any kind of Linked Data based application, since it makes it possible to attach digital proofs to RDF graphs containing any type of data. This adds a new security and trust layer to the web itself.

Use-case idea

Hi all,
currently working on the RDF-star test-suite, I came to realize that updating the test-suite may cause problems. Suppose I detect a bug in a test, and I update the manifest accordingly.

Somebody loading an old EARL report and the new test-suite manifest might wrongly believe that the implementation described in the EARL report passes the test, while in fact in passed the earlier, buggy, version of the test. One solution would be for the EARL report to include a hash of the manifest it ran on.

Of course, for this solution to be complete, the manifest should itself include a hash for every file it points to, otherwise changing the files could pass undetected... But we don't have to figure out all the details right now.

What do you think of this use-case?

DHS/S&T/SVIP Supports the W3C RCH WG

DHS S&T has been funding, championing, refining and utilizing the W3C Verifiable Credentials and W3C Decentralized Identifiers from the beginning to ensure a Competitive, Interoperable Marketplace of Solution Providers that can support a global scale architecture that enables individuals and organizations to have agency, consent and control over their data.

For the problem sets that are being addressed by DHS/SVIP and DHS Operational Components, open standards based multi-party and multi-platform interoperability of both (1) Digital Personal Credentials (DHS/USCIS & DHS/PRIV) and (2) Digital Trade Credentials (DHS/CPB) require the use of Linked Data (JSON-LD) with Linked Data Signatures including LD Signatures that support Selective Disclosure (BBS+ Signatures).

For these reasons, DHS/S&T/SVIP is supportive of the LDS WG charter:

https://w3c.github.io/lds-wg-charter/

AUEB/MMlab supports the W3C LDS WG

The Mobile Multimedia Laboratory at the Athens University of Economics and Business supports the creation of the W3C Linked Data Signatures Working Group and plans to contribute to its work.

We believe that Verifiable Credentials, Decentralized Identifiers, Self-Sovereign Identities, selective credential disclosure, and linked trusted data in general, are important for all aspects of digital life and this is the time to enable their wide acceptance.

We agree with and support the LDS WG charter.

Prof. George C. Polyzos, Director AUEB/MMlab

bengo supports the W3C RCH WG

Linked Data Signatures is mentioned twice in the W3C TR for ActivityPub, and (AFAICT) is required for message integrity when ActivityPub messages are passed around using channels other than HTTPS (where HTTP Signatures could be used). Unfortunately, these mentions of "Linked Data Signatures" are not very easy to implement, considering there is no link/footnote to a TR for them.

This, of course, is only an example of a single TR that makes use of linked data. For the same benefits of message integrity in any other use case besides 'social', Linked Data Signatures (et al) deserves standardization and a WG.

W3C Solid CG supports the W3C RCH WG

The W3C Solid Community Group would like to express support for the creation of a W3C Linked Data Signatures Working Group.

Some of the W3C Solid CG members are looking into joining the LDS WG to contribute to the progression of the specifications.

TODO List #1: before starting the W3C journey

This is just a call for action :-)

One of the first goals is to start the W3C journey; the first step is to present this proposal to the W3C Strategy team for a first reaction, which also includes a charter review on i18n, a11y, privacy, and security. I would expect the first three to go through quickly, there may be discussion on the fourth item. More in parallel with the discussion in the W3C strategy team and advance notice should be issued to the W3C AC, asking for public comments.

There are a number of issues we should try to settle before starting the aforementioned W3C route. I attempt to make a list here:

agree on the proposed length of the WG (the current text is 2 years)
would be nice to have at least one co-chairs settled
there is a clear example of usage through Verifiable Credentials. I believe another example referring to other usages of Linked Data must be added before we start with, hopefully, a publicly expressed need that can be referenced (e.g., signing an ontology for XXX usage, or using an unambiguous reference/identification for a public dataset, etc.
agree on the deliverables (and their titles, though that can be kept flexible)
have a public reference to the paper of Dave and Rachel, which should also refer to a public review
list of the 'other' deliverables
a realistic timeline for a FPWD and for a CR
possible list of external organizations we want to liaise with

I believe its o.k. to start the journey without (1), (6), (8), and possibly (2) settled (although, I believe, having (2) by the time we send an advance notice would be really important). We may rely on the public review (via the advanced notice) to get an example for (3) by contacting, e.g., the semantic web mailing list, but I am worried of the impression that would give that the only reason this work is proposed is due to the Verifiable Credential usage.

Cc @msporny @pchampin @aidhog @dlongley

Is canonicalization single?

(Originally raised by @samuelweiler in w3c/strategy#262 (comment); moved here with permission.)

When I think of canonicalization for signing, a key property of the canonicalized form is that there is a single such form - the function always gives the same result. When I look at the charter explainer, that property isn't clear. Am I just not understanding those words? Should that language be a little clearer?

Pravici supports the W3C RCH WG

Pravici would like to voice its support of the creation of the W3C Linked Data Signatures Working Group. We see LDS as a critical technology in supporting an open trust and assertion ecosystem.

Linked Data Vocabulary as registry

(Originally raised by @samuelweiler in w3c/strategy#262 (comment); moved here with permission.)

It's good that defining algorithms is out of scope, but my experience is that the use of an algorithm requires a few words of explanation, much as https://w3c-ccg.github.io/security-vocab/#Ed25519VerificationKey2018 and https://tools.ietf.org/html/rfc8080#section-3 show "here's how we format the key". Typically I want those definitions to be in their own documents, so it's (relatively) easy to add new ones. (I also want them to be more well-defined than I see in https://w3c-ccg.github.io/security-vocab/) Which brings us to:

Registries. I haven't been following the recent Process changes around registries, but it looks like Linked Data Security Vocabulary (LDSV) is, in fact, a registry. Only it's a registry that contains the above explanation (albeit only by example). I think it might be better to create a registry - including specifying update criteria - and then create docs that populate initial values. In the IETF, it's possible to do both at once, creating the registry in the same doc that defines initial values (see the later paragraphs of https://tools.ietf.org/html/rfc5155#section-11: "This document creates a new IANA registry...")
I don't know that w3c's registry stuff will look like. Perhaps we should say, as a work/scope item: the WG will create a registry and populate initial values, without being clear what that document is going to be? @plehegar ?

Linked Data Signature or Linked Data Security?

(Originally raised by @samuelweiler in w3c/strategy#262 (comment); moved here with permission.)

It seems like the Linked Data Security (LDS) spec should instead be titled Linked Data Signatures (or signing). Please change the name or explain the choice.

Coordinate with WebAppSec

My impression is that security experts advise against canonicalizing structured data in order to hash it, and instead advise hashing the bytes that are transmitted in order to transfer the data. This WG proposes to do the thing that's not advised (with a justification in the explainer), but https://w3c.github.io/rch-wg-charter/#w3c-coordination doesn't mention working with WebAppSec to do it.

Since the problem of hashing structured data has been around a long time in the security space, I don't think it's sufficient to just assume that horizontal review includes security reviewers: they need to be actively engaged in defining and solving the problem.

W3C Web of Things (WoT) WG supports the W3C LDS WG

Web of Things (WoT) has also working on a canonical form (and signature mechanism) for WoT Thing Descriptions. The canonical form is reasonably stable at this point; see https://w3c.github.io/wot-thing-description/#canonicalization-serialization-json. However we are still working on a signature mechanism (and debating whether or not it would be better to wait).

We were hoping to be able to base this on a standard JSON-LD canonicalization and signature mechanism, but the timing was not working out. TD 1.1 is in flight and the plan is to complete it before the end of the year, too soon to adopt LDS. However, hopefully we can get at least partially aligned with LDS and intercept it with WoT TD 2.0. Even then the LDS and next-gen WoT specs (assuming the WoT charter is renewed for updates) would have similar timelines, so we can't wait for LDS to be complete first and then adopt it in WoT. Instead we will have to collaborate and align while developing parallel drafts. There are also some special concerns arising from TD peculiarities, for example, dealing with default values in TDs. Regardless, the goal of having a way to canonicalize and sign JSON-LD and TDs is important for integrity and authentication purposes.

See our discussion of this in the following issues (which have also gathered our requirements):

w3c/wot-security#166
w3c/wot-profile#55
w3c/wot-thing-description#940 (has most detailed requirements)

w3c / rch-wg-charter Goto Github PK

rch-wg-charter's Introduction

Proposed Charter for an RDF Dataset Canonicalization and Hash Working Group

rch-wg-charter's People

Contributors

Stargazers

Watchers

Forkers

rch-wg-charter's Issues

Phase I Specifications

Phase II Specifications

Phase III Specifications

Recommend Projects

Recommend Topics

Recommend Org