
scitt-ccf-ledger's Introduction

scitt-ccf-ledger


This repository contains the source code for scitt-ccf-ledger, an application that runs on top of CCF implementing draft standards developed within the IETF SCITT WG. Its purpose is to provide provenance for artefacts in digital supply chains, increasing trust in those artefacts. scitt-ccf-ledger achieves this by allowing signed claims about artefacts to be submitted to a secure immutable ledger, and returning receipts which prove claims have been stored and registration policies applied.

This research project is at an early stage and is open sourced to facilitate academic collaborations. We are keen to engage in research collaborations on this project; please reach out to discuss this by opening an issue.

Getting Started

The instructions below guide you through building and deploying a local instance of scitt-ccf-ledger for development and testing purposes.

Being a CCF application, scitt-ccf-ledger runs in SGX enclaves. However, for testing purposes, it also supports running on non-SGX hardware in what is called virtual mode.

All instructions below assume Linux as the operating system.

Using Docker

Use the following commands to start a single-node CCF network with the scitt-ccf-ledger application set up for development purposes.

Note: PLATFORM should be set to sgx or virtual to select the type of build.

export PLATFORM=<sgx|virtual>
./docker/build.sh
./docker/run-dev.sh

The node is now reachable at https://127.0.0.1:8000/.

Note that run-dev.sh configures the network in a way that is not suitable for production, in particular it generates an ad-hoc governance member key pair and it disables API authentication.
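
Once the node is running, a quick smoke test is to query CCF's built-in /node/version endpoint (mentioned again in the issues below). This is a minimal sketch using the Python requests package; TLS verification is disabled because a local dev network typically presents a self-signed service certificate.

# Minimal smoke test against the local dev node.
import requests

# run-dev.sh starts an ad-hoc dev network, so skip TLS verification here.
resp = requests.get("https://127.0.0.1:8000/node/version", verify=False)
print(resp.status_code, resp.json())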

See the demo/ folder for examples of how to interact with the application.

Development setup

See DEVELOPMENT.md for instructions on building, running, and testing scitt-ccf-ledger without Docker.

Using the CLI

To help configure an application or interact with its API, you can use the available CLI.

The pyscitt CLI is written in Python and is available on PyPI. To install it, you can use the following command:

pip install pyscitt

The CLI is also distributed through GitHub releases as a wheel file. Optionally, it can be used from within the repository via the ./pyscitt.sh script. For example:

./pyscitt.sh --help

The CLI is used extensively in the functional tests and demo scripts.

See pyscitt for more details.

Reproducing builds

See reproducibility.md for instructions.

Contributing

This project welcomes contributions and suggestions. Please see the Contribution guidelines.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

scitt-ccf-ledger's People

Contributors

andpiccione, darracott, ivarprudnikov, letmaik, plietar, prathod09, ryazhang-microsoft, sumitis14, vimauro


scitt-ccf-ledger's Issues

Update to t_cose 1.1.1

This will allow us to remove some of the recently added workarounds supporting Notary header parameters.

This has to happen in upstream CCF.

Add `iss` and `kid` field to receipts

We're headed towards using a DID for the server's identity. This will require a few parts, but the first step would be to add iss and kid fields to the receipt's protected headers.

The iss value would be copied from an optional field in the configuration file. If the field is missing, we'll omit the iss and kid altogether. The kid should represent the current service's certificate/public key and will change on disaster recovery. For now, I think we can use the same value we use for service_identity, i.e. the hash of the DER certificate, and revisit this in the future.

Even though it will be redundant with the kid, I think we shouldn't remove the service_identity field just yet, so as to not break tooling that uses it while we are still transitioning to DIDs.
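
As an illustration of the kid derivation suggested above (the hash of the DER-encoded service certificate, as already used for service_identity), here is a sketch in Python assuming the cryptography package; the helper name is hypothetical.

# Sketch: derive a kid from the service certificate, mirroring the existing
# service_identity value (SHA-256 of the DER-encoded certificate).
import hashlib
from cryptography import x509
from cryptography.hazmat.primitives import serialization

def kid_from_service_cert(pem_bytes: bytes) -> str:
    cert = x509.load_pem_x509_certificate(pem_bytes)
    der = cert.public_bytes(serialization.Encoding.DER)
    return hashlib.sha256(der).hexdigest()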

Remove dependency on `sandbox.sh`

CCF ships a script sandbox.sh to easily launch CCF apps for testing purposes. We currently use this within start.sh for manual testing and indirectly for the functional tests (by running start.sh), both when not using Docker. For Docker, we have docker/run-dev.sh which does roughly the same thing without relying on sandbox.sh.

There are a couple of disadvantages to using sandbox.sh:

  • It starts the network with a default constitution that doesn't require voting on proposals. This makes testing of client code dealing with governance hard.
  • It does a bunch of magic to get the network running. Again, this makes testing of those parts hard.
  • It's not usable with our Docker images. This means there will be two ways to run tests that will differ more than necessary.
  • It creates its own Python venv which takes a while to get ready as it's installing dependencies. This means using it from tests in a temporary folder (e.g., using pytest fixtures) would cause unnecessary overhead.

Some of these are fixable upstream, but since it is fairly simple to replicate the functionality we need, it makes sense to avoid the dependency on sandbox.sh. By doing that we'll also be forced to train our CCF muscles, which is a good thing.

Use non-`/app/`-prefixed endpoints

CCF 3.x exposes app endpoints both under the old /app/ prefix and directly under the root /. Note that apps cannot register anything under /gov/ or /node/.

We should switch to using the root endpoints in tooling etc. to avoid confusion between the C++ code (which doesn't list the /app prefix) and the actual API. We will also soon require a root endpoint for exposing a DID document via did:web.

Don't hardcode `localhost` in `did:web` resolver as callback address

auto [_host, port] =
  ccf::split_net_address(primary_interface.bind_address);
return fmt::format("https://localhost:{}/did/{}/doc", port, did);

If a node is configured to bind to 0.0.0.0 then using localhost works. But if it binds to a concrete IP address that's not 0.0.0.0, then localhost doesn't work.

The code should be changed to just use the binding address as callback address as-is. Note that this works even with 0.0.0.0 (which is mostly a Linux feature).

Add build option for experimental features

We currently have a flag for enabling the prefix tree feature. To avoid combinatorial explosion when we add more experimental features, we should consolidate this into a single flag. My suggestion would be:

CMake: ENABLE_EXPERIMENTAL
Test: --enable-experimental

Extend configuration to set maximum claim size

Currently, an upper limit of 1 MiB is hard-coded. This should be configurable, defaulting to 1 MiB.

Generally, the idea is still that the ledger should not contain large content, so setting this to 1 GiB or similar is discouraged. The solution to this is to reference content stored elsewhere (by hash) instead of embedding it.
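
For illustration, the enforcement could remain a simple size check against a configurable ceiling. The names below are hypothetical; today the 1 MiB limit is hard-coded.

# Hypothetical sketch of a configurable claim size limit (default 1 MiB).
DEFAULT_MAX_CLAIM_SIZE = 1 * 1024 * 1024  # bytes

def check_claim_size(claim: bytes, max_size: int = DEFAULT_MAX_CLAIM_SIZE) -> None:
    # Reject claims larger than the configured limit.
    if len(claim) > max_size:
        raise ValueError(f"claim is {len(claim)} bytes, limit is {max_size}")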

Log subprocess output when doing did:web resolutions

We use CCF's host subprocess feature to kick off did:web resolutions using shell scripts. Any stdout/stderr output from the subprocess is discarded, and we have a hack in place that redirects the output to a fixed file on disk, overwriting it each time. Given that CCF already tracks the execution of those subprocesses, it would make sense for CCF to capture stdout/stderr and emit CCF log messages for each line. This would help diagnose unexpected issues in those subprocesses. Implementing that requires a change in CCF.

CCF's JSON adapter exceptions are not correctly logged

Currently, those exceptions are logged as "Unhandled exception in endpoint" since they are not HTTPError instances. CCF handles those exceptions and translates them to OData errors here:

https://github.com/microsoft/CCF/blob/6f365683de8d04bc0a37c3732c1de608989fb14e/src/node/rpc/frontend.h#L606-L619

I think we should duplicate this code so that we control the HTTP response and can then log it properly.

We could wrap CCF's JSON adapter into our own JSON adapter and emit HTTPError instances, for example.

Use IEEE encoding for ECDSA receipt signatures

Currently, receipts have DER-encoded ECDSA signatures, as that's what CCF provides out of the box.

The receipts spec is silent on the encoding, which needs fixing as well, see ietf-scitt/draft-birkholz-scitt-receipts#14.

Given that receipts are within the COSE ecosystem, it makes sense to use the same format as COSE does, which is IEEE P1363 (r and s simply concatenated). See also https://www.rfc-editor.org/rfc/rfc9053.html#name-ecdsa. JWS also uses the same encoding (see https://www.rfc-editor.org/rfc/rfc7518#section-3.4).
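
A sketch of the conversion, assuming the Python cryptography package; the coordinate size follows the curve of the signing key (e.g., 48 bytes for a P-384 key).

# Convert a DER-encoded ECDSA signature to the IEEE P1363 form (r || s),
# as used by COSE and JWS.
from cryptography.hazmat.primitives.asymmetric.utils import decode_dss_signature

def der_to_p1363(der_sig: bytes, coord_size: int) -> bytes:
    # decode_dss_signature returns the (r, s) integers from the DER structure.
    r, s = decode_dss_signature(der_sig)
    return r.to_bytes(coord_size, "big") + s.to_bytes(coord_size, "big")

# e.g. p1363 = der_to_p1363(der_sig, coord_size=48) for a P-384 key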

Notary COSE profile support

We currently support SCITT's default profile which is specified in https://ietf-scitt.github.io/draft-birkholz-scitt-architecture/draft-birkholz-scitt-architecture.html#name-envelope-and-claim-format. While the spec currently doesn't account for supporting other COSE profiles it seems natural to do that, especially ones that are related to software supply chains.

This issue tracks adding support for Notary's COSE profile as specified in https://github.com/notaryproject/notaryproject/blob/main/specs/signature-envelope-cose.md.

Extending Notary's spec to align with efforts like https://github.com/thomas-fossati/draft-fossati-cose-profile is out of scope for this issue and will be tracked separately.

0.2.1 release checklist

Checklist:

  • Link this issue to the correct milestone for this version.
  • Make sure all issues/PRs for the milestone are closed/merged.
  • Create a PR to update CHANGELOG.md.
    • If this release upgrades CCF to the next major version, add the following note in the changelog:
      "In order to upgrade an existing service to this version, it must first be upgraded to the version preceding this version."
  • Create a release with the new version number as git tag.
  • Check that CI is green.
  • Close the milestone for this release.
  • Close this issue.

Improved logging

The CCF app currently only logs the following cases:

  • A non-2xx response is sent back to the user. Note that this does not apply to the historical query adapter.
  • A did:web resolution is triggered.
  • The DID document for an issuer is updated.

While that's a good start, a bit more should be logged:

  • Non-2xx responses created by the historical query adapter.
  • 202 responses when historical state is not loaded yet
  • Details of the entry being stored as part of POST /entries
    • If possible, the transaction id should be logged as well to allow correlation with follow-up requests.
  • 200 responses of all endpoints

Overall, for a given request there should always be a single log entry that logs the request/response itself (with HTTP status code, request path and query parameters, possible error details). Additional log entries may add further details or diagnostics but should be correlatable via some identifier.

Related: #74.

Extend configuration to restrict issuers

Currently, the governance-controlled SCITT-specific configuration allows setting authentication options and the set of allowed signing algorithms. For the non-DID code path (not standard in SCITT at this point), the X.509 CA trust store can be configured as well, which re-uses an existing governance table from CCF.

In addition to that, it should be possible to restrict the set of allowed DID issuers.

There are a couple of options, but I think at a minimum the following should be possible (aka requirements):

  • Opt-in to a set of DID methods, out of the ones that are supported. Default to allowing none.
  • Allow a specific set of DIDs. Default to allowing all.

I think in the first iteration we shouldn't support regular expressions. It's not clear whether it's needed or whether this is even desirable. I seem to remember that the DID spec says somewhere that you're not supposed to deconstruct a DID and infer relationships etc.
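
A minimal sketch of what such a check could look like; the configuration field names ("acceptedDidMethods", "acceptedIssuers") are hypothetical and not decided by this issue.

# Hypothetical issuer policy check against a governance-controlled config dict.
def issuer_allowed(issuer: str, config: dict) -> bool:
    # Opt-in DID methods (e.g. ["web"]); default is to allow none.
    methods = config.get("acceptedDidMethods", [])
    if not any(issuer.startswith(f"did:{m}:") for m in methods):
        return False
    # Optional explicit allow-list of DIDs; default is to allow all.
    allowed = config.get("acceptedIssuers")
    if allowed is not None and issuer not in allowed:
        return False
    return True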

Enable authn for read-only endpoints

Currently, the authn policy affects only the POST /entries endpoint which is the only write endpoint (ignoring the did:web update endpoint which validates a remote attestation).

The read-only app endpoints should use the same authn policy. In the future, we may support more granular policies.

Note that authn of built-in governance and operator CCF endpoints cannot be modified. Of relevance here would only be the /node/ endpoints (since /gov/ already requires authn), see https://microsoft.github.io/CCF/main/operations/operator_rpc_api.html.

CCF also adds a few default endpoints to the app endpoints, see https://microsoft.github.io/CCF/main/use_apps/rpc_api.html. However, those can be disabled (or manually added with authn), see https://microsoft.github.io/CCF/main/build_apps/example_cpp.html#default-endpoints. For this issue, disabling or reimplementing the default app endpoints is out of scope.

Reduce `retry-after` header value

We currently use 3 seconds as the retry-after value sent back to the client when a historical query is not yet cached in memory. This seems excessive and should be lowered to 1 second.

Add trace correlation headers

Tracing of requests across systems typically relies on HTTP headers to establish correlation ids.

There's https://www.w3.org/TR/trace-context/ which is seeing adoption but Microsoft historically also uses x-ms-client-request-id and x-ms-request-id.

For compatibility, we can start with x-ms-request-id/x-ms-client-request-id and include those in log messages. Later on we can add W3C Trace Context headers and eventually deprecate and then remove the x-ms-* headers.
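
A hedged sketch of the client side, using the x-ms-* headers named above; the service would echo the client id back, assign its own request id, and include both in log messages. The claim bytes and endpoint usage below are placeholders.

# Client-side sketch: attach a correlation id so server logs can be matched
# to this request. Assumes the Python requests package.
import uuid
import requests

signed_claim = b"<COSE_Sign1 claim bytes>"  # produced elsewhere
client_request_id = str(uuid.uuid4())

resp = requests.post(
    "https://127.0.0.1:8000/entries",
    data=signed_claim,
    headers={"x-ms-client-request-id": client_request_id},
    verify=False,
)

# The service-assigned id, if present, also goes into client-side logs.
server_request_id = resp.headers.get("x-ms-request-id")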

0.2.0 release checklist

Checklist:

  • Link this issue to the correct milestone for this version.
  • Make sure all issues/PRs for the milestone are closed/merged.
  • Create a PR to update CHANGELOG.md.
    • If this release upgrades CCF to the next major version, add the following note in the changelog:
      "In order to upgrade an existing service to this version, it must first be upgraded to the version preceding this version."
  • Create a release with the new version number as git tag.
  • Check that CI is green.
  • Close the milestone for this release.
  • Close this issue.

Create release checklist

This should be a markdown file that contains steps that must happen before cutting a new release.

Something like:


Copy the following list into a new issue named "<x.y.z> release checklist".
Go through each item, tick it off and once everything is done close the issue.

  • If the CCF version has been upgraded, read all relevant CCF changelog entries and create issues for any outstanding tasks
  • Make sure all issues/PRs for the current milestone are closed/merged
  • Create a PR to update CHANGELOG.md
  • Create a release with the new version number as git tag
  • Check that CI is green

Don't store operation contexts in the KV

In #99, we introduced long running asynchronous operations, which we use to perform DID resolution.

Asynchronous operations are implemented as two transactions, one which triggers the operation in response to a client request, and a second which completes the operation, in response to a request to the callback endpoint.

On the first request, we need to serialize the operation's state (in this case, the COSE claim that is being submitted) and propagate it to the second request one way or another. At present, we achieve this by writing the operation context into the KV, in the operations table, but this has some downsides: it pollutes the ledger with unverified data (even if in a different table), and it requires us to make a historical query to get it back.

We can replace this mechanism by passing the context data to the DID fetch script and having it give the data back to us on the operation's callback.

Passing context to the fetch script

We currently pass the requested URL and nonce to the fetch script as command line arguments. This is suitable for small values, but a 1 MB (plus base64 overhead) COSE claim would exceed the kernel's limit on command line length. The other option is to pass the context on the fetch script's standard input. This is not yet supported by CCF's host process interface, but CCF can be extended to support it.

Integrity checks

We would likely still need to check the integrity of the context, to make sure the external process has not tampered with it. The easiest way to do this would be to take a hash of the context and store it in the KV. This would be much smaller than the context itself, and can therefore be kept in memory in the existing indexing strategy, allowing immediate access.

Another option for integrity would be to add an HMAC to the context, issued with a secret key that is only known to the ledger. Through this, the callback endpoint would not need to look up any state from the previous operation at all. While this is reasonably easy to do at a single node level, it might need help from CCF to manage secrets across the cluster.
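
A sketch of the HMAC variant, assuming a per-ledger secret is available; as noted above, distributing such a secret across the cluster is the open CCF question.

# Sketch: protect the operation context round-tripped through the external
# fetch script with an HMAC, so the callback can verify it was not tampered with.
import hmac
import hashlib

def seal_context(context: bytes, secret_key: bytes) -> bytes:
    # Prepend a 32-byte HMAC-SHA256 tag to the context.
    tag = hmac.new(secret_key, context, hashlib.sha256).digest()
    return tag + context

def open_context(sealed: bytes, secret_key: bytes) -> bytes:
    tag, context = sealed[:32], sealed[32:]
    expected = hmac.new(secret_key, context, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("operation context was tampered with")
    return context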

Make `kid` a relative DID URL

The DID/VC community has generally used absolute DID URLs as kid in VC-JWTs, but it is moving to relative URLs to avoid faulty implementations that forget to check that the issuer is a prefix of the kid. See https://github.com/decentralized-identity/jwt-vc-presentation-profile/blob/main/spec/spec.md#kid-jwt-header for example.

In our code, kid is the fragment value, so not an actual relative DID URL. This should be changed to start with # to be consistent and avoid interop issues later on. This aspect is currently underspecified in the SCITT specs.

Note that DID resolution options, if any, are not meant to go into kid, so this is a restricted form of relative DID URL only allowing a fragment.
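
For illustration, resolving a relative kid against the issuer then becomes a simple concatenation, and absolute kids can be checked against the issuer prefix. A sketch, not the ledger's code:

# Sketch: handle kid as a restricted relative DID URL (fragment only).
def resolve_kid(issuer: str, kid: str) -> str:
    if kid.startswith("#"):
        # Relative form: just a fragment, appended to the issuer DID.
        return issuer + kid
    # Absolute form: reject kids that do not belong to the issuer.
    if not kid.startswith(issuer + "#"):
        raise ValueError("kid does not match issuer")
    return kid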

Create doc on upgrading CCF

The doc should contain things like:

NOTE: This could be done as an issue template like #57

Monitor performance

We should define some performance metrics and monitor them similar to how CCF does in their repository.

Metrics could be:

  • Throughput on POST /entries
    • did:web using cached resolution
    • X.509
  • Throughput on GET /entries/<txid>/receipt
    • Cached historical transaction
  • Latency as time to receipt following POST /entries
    • Uncached did:web, requires at minimum two calls to POST /entries, followed by one or more GET /entries/<txid>/receipt until receipt is ready
    • Cached did:web, requires one call to POST /entries, followed by one or more GET /entries/<txid>/receipt
    • X.509, same as cached did:web
    • Note: Measuring latency is tricky as it relies on some parameters like CCF's tree signing interval (configurable) and the client delay between retries both for POST /entries in the case of uncached did:web and for receipt fetching.
  • Memory usage after inserting a large number of entries
    • Single issuer
    • Many issuers
    • With/without prefix tree

Using a small claim payload size for now is probably fine, like 1 KB.

Remove custom COSE header decoding code

The t_cose 1.x release does not support decoding custom header parameters. This functionality is, however, part of the dev branch of t_cose, which will become 2.x.

As we need custom parameters, we currently decode the header ourselves in app/src/cose.h. Once t_cose 2.x or a pre-release of it is out, we can upgrade and remove the custom code.

Document API endpoints

Endpoints inherited from CCF already have documentation, so there is no need to rewrite it, but it can be linked and attention can be drawn to important endpoints.
New scitt-ccf-ledger endpoints are not documented at all yet.

Extend attested-fetch to return errors

https://github.com/microsoft/attested-fetch

Currently, any network error, including HTTP errors (>= 400), does not generate a result when calling attested-fetch. The error is dumped to the console and the exit code is 1. It would be useful, especially for HTTP errors, to return an attested result that includes the error. This can then be surfaced in GET /operations/<id> to provide more details when DID resolution failed.

Add an endpoint which returns the version and commit hash

When debugging issues with a running service, it can be useful to know which version is running. We should embed this at build time and expose it over the HTTP interface. This already exists in CCF as the /node/version endpoint, which returns the version of the framework.

We should make it clear what the behaviour is when multiple versions are running. We may want both a /app/node_version and a /app/leader_version endpoint, where the former never gets forwarded and the latter always does.

This is of course a self-reported value, used for debugging only. It does not replace the existing code identity managed by CCF.

Improve historical query cache

We currently use a simple LRU cache in the historical query adapter to decide when to drop cached transactions and avoid running out of memory. This is based on the transaction count and is hard-coded to 100. Given that transactions have different sizes, it would be more economical to use the transaction size to determine the cache size and then set a limit like 100 MiB, based on the available enclave memory set in oe_sign.conf. With small claim sizes this would effectively raise the cache limit by one or more orders of magnitude in terms of transaction count.
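
A sketch of a size-bounded LRU, in Python for brevity; the actual adapter is C++ and the byte limit would be derived from the enclave memory configured in oe_sign.conf.

# Size-bounded LRU cache: evict least-recently-used entries once the total
# byte size exceeds the limit, rather than capping the entry count.
from collections import OrderedDict
from typing import Optional

class SizeBoundedLRU:
    def __init__(self, max_bytes: int = 100 * 1024 * 1024):
        self.max_bytes = max_bytes
        self.total = 0
        self.entries = OrderedDict()  # key -> serialized transaction bytes

    def put(self, key: str, value: bytes) -> None:
        if key in self.entries:
            self.total -= len(self.entries.pop(key))
        self.entries[key] = value
        self.total += len(value)
        # Evict least-recently-used entries until we are under the byte limit.
        while self.total > self.max_bytes and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.total -= len(evicted)

    def get(self, key: str) -> Optional[bytes]:
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]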

Define service DID document and add `did:web` endpoint

The goal is to experiment with representing the service's receipt trust anchors across disaster recoveries through a stable DID.

#25 was a first step towards that goal which allows setting the service DID and including it in receipts. The next step is to define what the corresponding DID document would look like when using did:web and exposing it under GET <domain>/.well-known/did.json.

There are some challenges that need to be solved:

Usage of JWK

Verification material in DID documents is expressed using one or more JWKs (JSON Web Key), where each JWK represents a public key. The expectation is that the public key verifies the signature of what was signed.

In CCF, there is a hierarchy of service key and node key, both wrapped in certificates. The node key differs per node and is the one that signs the tree root and hence is used to verify receipt signatures. The service key only changes when a disaster recovery event occurs.

There are two options:

Option A: Node public keys could be exposed in the DID document which would make most sense from a DID point of view but is not economical as the list will grow too large very quickly.

Option B: Service public keys could be exposed in the DID document which makes more sense from a trust anchor and efficiency point of view but doesn't seem to be immediately compatible with existing DID practices.

To make Option B work, the details of CCF, including the indirection through the node certificate, could be hidden behind the receipt/tree algorithm, whose interface would receive the service key and internally use the node key and node certificate for verification after checking that the node certificate was signed by the service key.
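
For Option B, a minimal did:web document served from GET <domain>/.well-known/did.json could look roughly like the following. This is a sketch as a Python dict; the domain, key id, and JWK contents are illustrative placeholders, not real values from the service.

# Illustrative shape of a did:web DID document exposing a service identity
# key as a JWK. All values below are placeholders.
did = "did:web:scitt.example.com"
did_document = {
    "@context": ["https://www.w3.org/ns/did/v1"],
    "id": did,
    "verificationMethod": [
        {
            "id": f"{did}#service-key-1",
            "type": "JsonWebKey2020",
            "controller": did,
            "publicKeyJwk": {
                "kty": "EC",
                "crv": "P-384",
                "x": "<base64url x coordinate>",
                "y": "<base64url y coordinate>",
            },
        }
    ],
    "assertionMethod": [f"{did}#service-key-1"],
}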

Validation using historic keys

Receipts shall always be verifiable, even if they are old and the current signing keys of a service are not the ones used in the old receipts.

From the point of CCF, old service keys can be considered as valid only for the range of transactions for which they were in use. The current service key has no upper bound, until a disaster recovery happens.

In DID, the only standard way to restrict the use of a key to validate new signatures is by either including or not including it in the DID document. If a DID method supports retrieving old versions of DID documents via versionId or versionTime, and a signature includes that metadata, and there is a reliable way to establish the point in time at which the signature was made, then resolving an old DID document state would allow retrieving the correct key and verifying the signature. did:web does not support this.

Option A: Include all historic keys and add non-standard metadata to each key which restricts its use. This could be a transaction sequence range, though currently the sequence numbers are not exposed in receipts. It could also be a time range, though CCF does not record time in general.

Option B: ?

Remove `"x5c"` from DID document

(Follow-up from #53)

At the moment, SCITT receipts with CCF tree algorithm rely on service and node certificates. This complicates matters, especially when looking at DID as service ids.

If possible, we want to end up in a place where:

  • The service identity keys sign the receipt (tree root) directly
  • Node certificates are not part of the receipt anymore
  • Only a DID and key id is included in a receipt
  • The DID document only publishes service identity keys, without the service certificates in "x5c"
  • The key id changes from being a hash of the certificate to a hash of the public key only

This would require an extension to CCF to support signing the tree root directly instead of (or in addition to) having node keys do the signing.

Include transaction ranges in historical service identity keys

(Follow-up from #53)

The DID document contains the current but also historic service identity keys. A given service identity key is only valid for a certain range of transactions (sequence numbers, really). It would be better to include those ranges in some way for each key and then use them during receipt validation.

There are roughly two places where extra properties could go:

  • Inside the verification method object
  • Inside the JWK object

Given that the JWK object already supports a "use" field to determine intended use (e.g., signing, encryption), it seems natural to add another field in there and potentially register it at some point in https://www.iana.org/assignments/jose/jose.xhtml#web-key-parameters.
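
For illustration, such a JWK could carry a non-standard transaction range alongside the standard members. The parameter name below is hypothetical and unregistered, and the key material is a placeholder.

# Hypothetical JWK for a historic service identity key, annotated with the
# transaction (sequence number) range it was valid for. "x-txn-range" is an
# illustrative, unregistered parameter name.
historic_key_jwk = {
    "kty": "EC",
    "crv": "P-384",
    "x": "<base64url x coordinate>",
    "y": "<base64url y coordinate>",
    "use": "sig",
    "x-txn-range": {"start": 1, "end": 154223},
}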

The other piece, validation, is a little more tricky, since SCITT currently does not expose CCF's transaction id in the receipt in a sensible way.
