This specification describes a Data Integrity Cryptosuite for use when generating digital signatures using the BBS signature scheme. The Signature Suite utilizes BBS signatures to provide selective disclosure and unlinkable derived proofs.

The Working Group is actively seeking implementation feedback for this specification. In order to exit the Candidate Recommendation phase, the Working Group has set the requirement of at least two independent implementations for each feature, both mandatory and optional, in the specification. For details on the conformance testing process, see the test suites listed in the implementation report.

Introduction

This specification defines a cryptographic suite for the purpose of creating, verifying, and deriving proofs using the BBS Signature Scheme in conformance with the Data Integrity [[VC-DATA-INTEGRITY]] specification. The BBS signature scheme directly provides for selective disclosure and unlinkable proofs. It provides four high-level functions that work within the issuer, holder, verifier model. Specifically, an issuer uses the BBS `Sign` function to create a cryptographic value known as a "BBS signature" which is used in signing the original credential. A holder, on receipt of a credential signed with BBS, then verifies the credential with the BBS `Verify` function.

The holder then chooses information to selectively disclose from the received credential and uses the BBS `ProofGen` function to generate a cryptographic value, known as a "BBS proof", which is used in creating a proof for this "derived credential". The cryptographic "BBS proof" value is not linkable to the original "BBS signature" and a different, unlinkable "BBS proof" can be generated by the holder for additional "derived credentials", including any containing the exact same information. Finally, a verifier uses the BBS `ProofVerify` function to verify the derived credential received from the holder.

Applying the BBS signature scheme to verifiable credentials involves the processing specified in this document. In general, the suite uses the RDF Dataset Canonicalization Algorithm [[RDF-CANON]] to transform an input document into its canonical form. An issuer then uses selective disclosure primitives to separate the canonical form into mandatory and non-mandatory statements. These are processed separately with other information to serve as the inputs to the BBS `Sign` function along with appropriate key material. This output is used to generate a secured credential. A holder uses a set of selective disclosure functions and the BBS `Verify` function on receipt of the credential to ascertain validity.

Similarly, on receipt of a BBS signed credential, a holder uses the RDF Dataset Canonicalization Algorithm [[RDF-CANON]] to transform an input document into its canonical form, and then applies selective disclosure primitives to separate the canonical form into mandatory and selectively disclosed statements, which are appropriately processed and serve as inputs to the BBS `ProofGen` function. Suitably processed, the output of this function becomes the signed selectively disclosed credential sent to a verifier. Using canonicalization and selective disclosure primitives, the verifier can then use the BBS `ProofVerify` function to validate the credential.

Terminology

Terminology used throughout this document is defined in the Terminology section of the [[[VC-DATA-INTEGRITY]]] specification.

A conforming proof is any concrete expression of the data model that complies with the normative statements in this specification. Specifically, all relevant normative statements in Sections and of this document MUST be enforced.

A conforming processor is any algorithm realized as software and/or hardware that generates or consumes a conforming proof. Conforming processors MUST produce errors when non-conforming documents are consumed.

This document contains examples of JSON and JSON-LD data. Some of these examples are invalid JSON, as they include features such as inline comments (`//`) explaining certain portions and ellipses (`...`) indicating the omission of information that is irrelevant to the example. Such parts need to be removed if implementers want to treat the examples as valid JSON or JSON-LD.

Data Model

The following sections outline the data model that is used by this specification for verification methods and data integrity proof formats.

Verification Methods

These verification methods are used to verify Data Integrity Proofs [[VC-DATA-INTEGRITY]] produced using BLS12-381 cryptographic key material that is compliant with [[CFRG-BBS-SIGNATURE]]. The encoding formats for these key types are provided in this section. Lossless cryptographic key transformation processes that result in equivalent cryptographic key material MAY be used during the processing of digital signatures.

Multikey

The Multikey format, as defined in [[VC-DATA-INTEGRITY]], is used to express public keys for the cryptographic suites defined in this specification.

The `publicKeyMultibase` property represents a Multibase-encoded Multikey expression of a BLS12-381 public key in the G2 group. The encoding of this field is the two-byte prefix `0xeb01` followed by the 96-byte compressed public key data. The resulting 98-byte value is then encoded using base58-btc, with `z` as the multibase prefix. Any other encodings MUST NOT be allowed.
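The encoding and decoding described above can be sketched as follows. This is a non-normative JavaScript (Node.js) illustration; the minimal BigInt-based base58-btc codec and the function names are included only to keep the sketch self-contained, and a real implementation would use a tested multibase library.

```javascript
// Non-normative sketch of Multikey encoding for a BLS12-381 G2 public key:
// two-byte prefix 0xeb01, then the 96-byte compressed key, base58-btc
// encoded with a "z" multibase prefix.
const B58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

function base58btcEncode(bytes) {
  let n = 0n;
  for (const b of bytes) n = n * 256n + BigInt(b);
  let s = '';
  while (n > 0n) { s = B58[Number(n % 58n)] + s; n /= 58n; }
  for (const b of bytes) { if (b !== 0) break; s = '1' + s; } // preserve leading zeros
  return s;
}

function base58btcDecode(s) {
  let n = 0n;
  for (const c of s) {
    const i = B58.indexOf(c);
    if (i < 0) throw new Error('invalid base58-btc character');
    n = n * 58n + BigInt(i);
  }
  const out = [];
  while (n > 0n) { out.unshift(Number(n % 256n)); n /= 256n; }
  for (const c of s) { if (c !== '1') break; out.unshift(0); } // preserve leading zeros
  return Uint8Array.from(out);
}

function encodePublicKeyMultibase(compressedG2Key) {
  if (compressedG2Key.length !== 96) throw new Error('expected a 96-byte compressed G2 public key');
  const prefixed = new Uint8Array(98);
  prefixed.set([0xeb, 0x01]);            // Multikey prefix for BLS12-381 G2
  prefixed.set(compressedG2Key, 2);
  return 'z' + base58btcEncode(prefixed);
}

function decodePublicKeyMultibase(publicKeyMultibase) {
  if (!publicKeyMultibase.startsWith('z')) throw new Error('base58-btc ("z") encoding required');
  const bytes = base58btcDecode(publicKeyMultibase.slice(1));
  if (bytes.length !== 98 || bytes[0] !== 0xeb || bytes[1] !== 0x01) {
    throw new Error('expected the 0xeb01 Multikey prefix');
  }
  return bytes.subarray(2);              // the 96-byte compressed public key
}
```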

Developers are advised to take care not to accidentally publish a representation of a private key. Implementations of this specification will raise an error if a value other than `0xeb01` is used as the prefix in a `publicKeyMultibase` value.

{
  "id": "https://example.com/issuer/123#key-0",
  "type": "Multikey",
  "controller": "https://example.com/issuer/123",
  "publicKeyMultibase": "zUC7EK3ZakmukHhuncwkbySmomv3FmrkmS36E4Ks5rsb6VQSRpoCrx6
  Hb8e2Nk6UvJFSdyw9NK1scFXJp21gNNYFjVWNgaqyGnkyhtagagCpQb5B7tagJu3HDbjQ8h
  5ypoHjwBb"
}
          
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/data-integrity/v1"
  ],
  "id": "https://example.com/issuer/123",
  "verificationMethod": [{
    "id": "https://example.com/issuer/123#key-1",
    "type": "Multikey",
    "controller": "https://example.com/issuer/123",
    "publicKeyMultibase": "zUC7EK3ZakmukHhuncwkbySmomv3FmrkmS36E4Ks5rsb6VQSRpoCr
    x6Hb8e2Nk6UvJFSdyw9NK1scFXJp21gNNYFjVWNgaqyGnkyhtagagCpQb5B7tagJu3HDbjQ8h
    5ypoHjwBb"
  }]
}
          

Proof Representations

DataIntegrityProof

A proof contains the attributes specified in the Proofs section of [[VC-DATA-INTEGRITY]] with the following restrictions.

The `verificationMethod` property of the proof MUST be a URL. Dereferencing the `verificationMethod` MUST result in an object containing a `type` property with the value set to `Multikey`.

The `type` property of the proof MUST be `DataIntegrityProof`.

The `cryptosuite` property of the proof MUST be `bbs-2023`.

The value of the `proofValue` property of the proof MUST be a BBS signature or BBS proof produced according to [[CFRG-BBS-SIGNATURE]] that is serialized and encoded according to procedures in section .

Algorithms

The following algorithms describe how to use verifiable credentials with the BBS Signature Scheme [[CFRG-BBS-SIGNATURE]]. When using the BBS signature scheme the SHA-256 variant SHOULD be used.

Implementations SHOULD fetch and cache verification method information as early as possible when adding or verifying proofs. Parameters passed to functions in this section use information from the verification method, such as the public key size, to determine function parameters, such as the cryptographic hashing algorithm.

When the RDF Dataset Canonicalization Algorithm [[RDF-CANON]] is used, implementations of that algorithm will detect dataset poisoning by default, and abort processing upon detection.

Instantiate Cryptosuite

This algorithm is used to configure a cryptographic suite to be used by the Add Proof and Verify Proof functions in [[[VC-DATA-INTEGRITY]]]. The algorithm takes an options object ([=map=] |options|) as input and returns a [=data integrity cryptographic suite instance|cryptosuite instance=] ([=struct=] |cryptosuite|).

  1. Initialize |cryptosuite| to an empty [=struct=].
  2. If |options|.|type| does not equal `DataIntegrityProof`, return |cryptosuite|.
  3. If |options|.|cryptosuite| is `bbs-2023` then:
    1. Set |cryptosuite|.|createProof| to the algorithm in Section [[[#create-base-proof-bbs-2023]]].
    2. Set |cryptosuite|.|verifyProof| to the algorithm in Section [[[#verify-derived-proof-bbs-2023]]].
  4. Return |cryptosuite|.

Selective Disclosure Functions

createShuffledIdLabelMapFunction

The following algorithm creates a label map factory function that uses an HMAC to shuffle canonical blank node identifiers. The required input is an HMAC (previously initialized with a secret key), |HMAC|. A function, labelMapFactoryFunction, is produced as output.

  1. Create a function, |labelMapFactoryFunction|, with one required input (a canonical node identifier map, |canonicalIdMap|), that will return a blank node identifier map, bnodeIdMap, as output. Set the function's implementation to:
    1. Generate a new empty bnode identifier map, bnodeIdMap.
    2. For each map entry, entry, in |canonicalIdMap|:
      1. Perform an HMAC operation on the canonical identifier from the value in entry to get an HMAC digest, digest.
      2. Generate a new string value, b64urlDigest, and initialize it to "u" followed by the base64url-no-pad-encoded value of the digest.
      3. Add a new entry, |newEntry|, to bnodeIdMap using the key from entry and b64urlDigest as the value.
    3. Derive the shuffled mapping from the `bnodeIdMap` as follows:
      1. Set `hmacIds` to be the sorted array of values from the `bnodeIdMap`, and set `bnodeKeys` to be the ordered array of keys from the `bnodeIdMap`.
      2. For each key in `bnodeKeys`, replace the `bnodeIdMap` value for that key with the index position of the value in the `hmacIds` array prefixed by "b", i.e., `bnodeIdMap.set(bkey, 'b' + hmacIds.indexOf(bnodeIdMap.get(bkey)))`.
    4. Return bnodeIdMap.
  2. Return |labelMapFactoryFunction|.

It should be noted that step 1.2 in the above algorithm is identical to step 1.2 in Section 3.3.4 `createHmacIdLabelMapFunction` of [[DI-ECDSA]], so developers might be able to reuse the code or call the function if implementing both.

bbs-2023 Functions

serializeBaseProofValue

The following algorithm serializes the base proof value, including the BBS signature, HMAC key, and mandatory pointers. The required inputs are a base signature |bbsSignature|, a BBS header |bbsHeader|, a public key |publicKey|, an HMAC key |hmacKey|, and an array of |mandatoryPointers|. A single base proof string value is produced as output.

  1. Initialize a byte array, `proofValue`, that starts with the BBS base proof header bytes `0xd9`, `0x5d`, and `0x02`.
  2. Initialize |components| to an array with five elements containing the values of: |bbsSignature|, |bbsHeader|, |publicKey|, |hmacKey|, and |mandatoryPointers|.
  3. CBOR-encode |components| per [[RFC8949]] where CBOR tagging MUST NOT be used on any of the |components|. Append the produced encoded value to |proofValue|.
  4. Initialize |baseProof| to a string with the multibase-base64url-no-pad-encoding of `proofValue`. That is, return a string starting with "`u`" and ending with the base64url-no-pad-encoded value of |proofValue|.
  5. Return |baseProof| as base proof.
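The serialization above can be sketched non-normatively as follows. The hand-rolled CBOR encoder covers only the types this structure needs (byte strings, text strings, and arrays of those); a real implementation would use a full, tested CBOR library.

```javascript
// Non-normative sketch of serializeBaseProofValue.
function cborHead(majorType, length) {
  if (length < 24) return Buffer.from([(majorType << 5) | length]);
  if (length < 256) return Buffer.from([(majorType << 5) | 24, length]);
  return Buffer.from([(majorType << 5) | 25, length >> 8, length & 0xff]); // < 65536
}

function cborEncode(value) {
  if (value instanceof Uint8Array) {           // byte string (major type 2)
    return Buffer.concat([cborHead(2, value.length), Buffer.from(value)]);
  }
  if (typeof value === 'string') {             // text string (major type 3)
    const utf8 = Buffer.from(value, 'utf8');
    return Buffer.concat([cborHead(3, utf8.length), utf8]);
  }
  if (Array.isArray(value)) {                  // array (major type 4)
    return Buffer.concat([cborHead(4, value.length), ...value.map(cborEncode)]);
  }
  throw new TypeError('type not supported by this sketch');
}

function serializeBaseProofValue({ bbsSignature, bbsHeader, publicKey, hmacKey, mandatoryPointers }) {
  const header = Buffer.from([0xd9, 0x5d, 0x02]);  // BBS base proof header
  const components = [bbsSignature, bbsHeader, publicKey, hmacKey, mandatoryPointers];
  const proofValue = Buffer.concat([header, cborEncode(components)]);
  return 'u' + proofValue.toString('base64url');   // multibase-base64url-no-pad
}
```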

parseBaseProofValue

The following algorithm parses the components of a `bbs-2023` selective disclosure base proof value. The required input is a proof value (|proofValue|). A single object, parsed base proof, containing five or seven elements, using the names "bbsSignature", "bbsHeader", "publicKey", "hmacKey", "mandatoryPointers", and optional feature parameters "pid" and "signer_blind" is produced as output.

  1. Ensure the `proofValue` string starts with u (U+0075 LATIN SMALL LETTER U), indicating that it is a `multibase-base64url-no-pad-encoded` value, and throw an error if it does not.
  2. Initialize `decodedProofValue` to the result of base64url-no-pad-decoding the substring following the leading `u` in `proofValue`.
  3. Ensure that the `decodedProofValue` starts with the BBS base proof header bytes `0xd9`, `0x5d`, and `0x02`, and throw an error if it does not.
  4. Initialize `components` to an array that is the result of CBOR-decoding the bytes that follow the three-byte BBS base proof header.
  5. Return an object with properties set to the elements of `components`, using the names "bbsSignature", "bbsHeader", "publicKey", "hmacKey", "mandatoryPointers", and, if present, the optional feature parameters "pid" and "signer_blind", respectively.

createDisclosureData

The following algorithm creates data to be used to generate a derived proof. The inputs include a JSON-LD document (|document|), a BBS base proof (|proof|), an array of JSON pointers to use to selectively disclose statements (|selectivePointers|), an OPTIONAL BBS |presentationHeader| (byte array that defaults to an empty byte array if not present), an OPTIONAL |commitment_with_proof| (a byte array), an OPTIONAL |pid| value (a byte array), and any custom JSON-LD API options (such as a document loader). A single object, disclosure data, is produced as output, which contains the `bbsProof`, `labelMap`, `mandatoryIndexes`, `selectiveIndexes`, `presentationHeader`, and `revealDocument` fields.

  1. Initialize `bbsSignature`, `bbsHeader`, `publicKey`, `hmacKey`, `mandatoryPointers`, and the optional feature parameters `pid` and `signer_blind` to the values of the associated properties in the object returned when calling the algorithm in Section , passing the `proofValue` from `proof`.
  2. Initialize `hmac` to an HMAC API using `hmacKey`. The HMAC uses the same hash algorithm used in the signature algorithm, i.e., SHA-256.
  3. Initialize `labelMapFactoryFunction` to the result of calling the `createShuffledIdLabelMapFunction` algorithm passing `hmac` as `HMAC`.
  4. Initialize `combinedPointers` to the concatenation of `mandatoryPointers` and `selectivePointers`.
  5. Initialize `groupDefinitions` to a map with the following entries: key of the string `"mandatory"` and value of `mandatoryPointers`; key of the string `"selective"` and value of `selectivePointers`; and key of the string `"combined"` and value of `combinedPointers`.
  6. Initialize `groups` and `labelMap` to the result of calling the algorithm in Section 3.3.16 canonicalizeAndGroup of the [[DI-ECDSA]] specification, passing `document`, `labelMapFactoryFunction`, `groupDefinitions`, and any custom JSON-LD API options. Note: This step transforms the document into an array of canonical N-Quads whose order has been shuffled based on HMAC-applied blank node identifiers, and groups the N-Quad strings according to selections based on JSON pointers.
  7. Compute the mandatory indexes relative to their positions in the combined statement list, i.e., find the position at which a mandatory statement occurs in the list of combined statements. One method for doing this is given below.
    1. Initialize `mandatoryIndexes` to an empty array. Set `mandatoryMatch` to the `groups.mandatory.matching` map; set `combinedMatch` to the `groups.combined.matching` map; and set `combinedIndexes` to the ordered array of just the keys of the `combinedMatch` map.
    2. For each key in the `mandatoryMatch` map, find its index in the `combinedIndexes` array (e.g., `combinedIndexes.indexOf(key)`), and add this value to the `mandatoryIndexes` array.
  8. Compute the selective indexes relative to their positions in the non-mandatory statement list, i.e., find the position at which a selected statement occurs in the list of non-mandatory statements. One method for doing this is given below.
    1. Initialize `selectiveIndexes` to an empty array. Set `selectiveMatch` to the `groups.selective.matching` map; set `mandatoryNonMatch` to the `groups.mandatory.nonMatching` map; and set `nonMandatoryIndexes` to the ordered array of just the keys of the `mandatoryNonMatch` map.
    2. For each key in the `selectiveMatch` map, find its index in the `nonMandatoryIndexes` array (e.g., `nonMandatoryIndexes.indexOf(key)`), and add this value to the `selectiveIndexes` array.
  9. Initialize `bbsMessages` to an array of byte arrays containing the values in the `nonMandatory` array of strings (the ordered values of the `groups.mandatory.nonMatching` map), encoded using the UTF-8 character encoding.
  10. Set `bbsProof` to the value computed by the appropriate procedure given below based on the values of the |commitment_with_proof| and |pid| options.
    1. If both |commitment_with_proof| and |pid| options are empty, set `bbsProof` to the value computed by the `ProofGen` procedure from [[CFRG-BBS-SIGNATURE]], i.e., `ProofGen(PK, signature, header, ph, messages, disclosed_indexes)`, where `PK` is the original issuer's public key, `signature` is the `bbsSignature`, `header` is the `bbsHeader`, `ph` is the `presentationHeader`, `messages` is `bbsMessages`, and `disclosed_indexes` is `selectiveIndexes`.
    2. If |commitment_with_proof| is not empty and |pid| is empty, set `bbsProof` to the value computed by the `ProofGen` procedure from [[CFRG-Blind-BBS-Signature]], where `PK` is the original issuer's public key, `signature` is the `bbsSignature`, `header` is the `bbsHeader`, `ph` is the `presentationHeader`, `messages` is `bbsMessages`, `disclosed_indexes` is `selectiveIndexes`, and `commitment_with_proof` and `signer_blind` are passed as given. The holder also furnishes the "secret value" that was used to compute the `commitment_with_proof`. This is the "anonymous holder binding" option.
    3. If |pid| is not empty, compute the |pseudonym| according to the procedures given in [[CFRG-Pseudonym-BBS-Signature]], and set `bbsProof` to the value computed by the `ProofGen` procedure from [[CFRG-Pseudonym-BBS-Signature]], where `PK` is the original issuer's public key, `signature` is the `bbsSignature`, `header` is the `bbsHeader`, `ph` is the `presentationHeader`, `messages` is `bbsMessages`, `disclosed_indexes` is `selectiveIndexes`, and `pseudonym` is the computed |pseudonym|. This covers both the "pseudonym with issuer known pid" and "pseudonym with hidden pid" cases.
  11. Initialize |revealDocument| to the result of the "selectJsonLd" algorithm, passing `document`, and `combinedPointers` as `pointers`.
  12. Run the RDF Dataset Canonicalization Algorithm [[RDF-CANON]] on the joined `deskolemizedNQuads` of the combined group (|combinedGroup|, i.e., `groups.combined`), passing any custom options, and get the canonical bnode identifier map, |canonicalIdMap|. Note: This map includes the canonical blank node identifiers that a verifier will produce when they canonicalize the reveal document.
  13. Initialize |verifierLabelMap| to an empty map. This map will map the canonical blank node identifiers produced by the verifier when they canonicalize the revealed document, to the blank node identifiers that were originally signed in the base proof.
  14. For each key (`inputLabel`) and value (`verifierLabel`) in `canonicalIdMap`:
    1. Add an entry to `verifierLabelMap`, using `verifierLabel` as the key, and the value associated with `inputLabel` as a key in `labelMap` as the value.
  15. Return an object with properties matching `bbsProof`; `verifierLabelMap`, as `labelMap`; `mandatoryIndexes`; `selectiveIndexes`; `presentationHeader`; `revealDocument`; and, if computed, |pseudonym|.
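Steps 7 and 8 above both compute the position of each statement in a "matching" group relative to a reference ordering (the combined statements for the mandatory indexes, or the non-mandatory statements for the selective indexes). A non-normative sketch of this shared computation, with an illustrative function name:

```javascript
// Non-normative sketch of the index computations in steps 7 and 8: map
// each matching statement's (absolute) key to its position within a
// reference ordering of keys.
function relativeIndexes(matchingKeys, referenceKeys) {
  const reference = [...referenceKeys];
  return [...matchingKeys].map(key => reference.indexOf(key));
}

// Example: combined statements occupy absolute indexes 0, 1, 3, and 4;
// the mandatory statements are those at absolute indexes 1 and 4, so
// their positions relative to the combined list are 1 and 3.
const mandatoryIndexes = relativeIndexes([1, 4], [0, 1, 3, 4]);
```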

compressLabelMap

The following algorithm compresses a label map. The required input is label map (|labelMap|). The output is a compressed label map.

  1. Initialize `map` to an empty map.
  2. For each entry (`k`, `v`) in `labelMap`:
    1. Add an entry to `map`, with a key that is a base-10 integer parsed from the characters following the "c14n" prefix in `k`, and a value that is a base-10 integer parsed from the characters following the "b" prefix in `v`.
  3. Return `map` as compressed label map.

decompressLabelMap

The following algorithm decompresses a label map. The required input is a compressed label map (|compressedLabelMap|). The output is a decompressed label map.

  1. Initialize `map` to an empty map.
  2. For each entry (`k`, `v`) in `compressedLabelMap`:
    1. Add an entry to `map`, with a key that adds the prefix "c14n" to `k`, and a value that adds a prefix of "b" to `v`.
  3. Return `map` as decompressed label map.
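The two label map algorithms above are inverses of one another and can be sketched non-normatively as follows. Keys such as `c14n3` compress to the integer `3`, and values such as `b7` compress to the integer `7`.

```javascript
// Non-normative sketch of compressLabelMap and decompressLabelMap.
function compressLabelMap(labelMap) {
  const map = new Map();
  for (const [k, v] of labelMap) {
    // strip the "c14n" and "b" prefixes and parse the remainders as base-10
    map.set(parseInt(k.slice('c14n'.length), 10), parseInt(v.slice('b'.length), 10));
  }
  return map;
}

function decompressLabelMap(compressedLabelMap) {
  const map = new Map();
  for (const [k, v] of compressedLabelMap) {
    // restore the "c14n" and "b" prefixes
    map.set(`c14n${k}`, `b${v}`);
  }
  return map;
}
```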

serializeDerivedProofValue

The following algorithm serializes a derived proof value. The required inputs are a BBS proof (|bbsProof|), a label map (|labelMap|), an array of mandatory indexes (|mandatoryIndexes|), an array of selective indexes (|selectiveIndexes|), and a BBS presentation header (|presentationHeader|). Optional input is |pseudonym|. A single derived proof value, serialized as a byte string, is produced as output.

  1. Initialize `compressedLabelMap` to the result of calling the algorithm in Section , passing `labelMap` as the parameter.
  2. Initialize a byte array, `proofValue`, that starts with the BBS disclosure proof header bytes `0xd9`, `0x5d`, and `0x03`.
  3. Initialize |components| to an array with elements containing the values of |bbsProof|, |compressedLabelMap|, |mandatoryIndexes|, |selectiveIndexes|, |presentationHeader|, and, if provided, |pseudonym|.
  4. CBOR-encode |components| per [[RFC8949]] where CBOR tagging MUST NOT be used on any of the |components|. Append the produced encoded value to |proofValue|.
  5. Return the derived proof as a string with the multibase-base64url-no-pad-encoding of |proofValue|. That is, return a string starting with "`u`" and ending with the base64url-no-pad-encoded value of `proofValue`.

parseDerivedProofValue

The following algorithm parses the components of the derived proof value. The required input is a derived proof value (|proofValue|). A single derived proof value object is produced as output, which contains a set of five or six elements, having the names `bbsProof`, `labelMap`, `mandatoryIndexes`, `selectiveIndexes`, `presentationHeader`, and the optional `pseudonym` parameter.

  1. Ensure the `proofValue` string starts with u (U+0075, LATIN SMALL LETTER U), indicating that it is a `multibase-base64url-no-pad-encoded` value, and throw an error if it does not.
  2. Initialize `decodedProofValue` to the result of base64url-no-pad-decoding the substring that follows the leading `u` in `proofValue`.
  3. Ensure that the `decodedProofValue` starts with the BBS disclosure proof header bytes `0xd9`, `0x5d`, and `0x03`, and throw an error if it does not.
  4. Initialize `components` to an array that is the result of CBOR-decoding the bytes that follow the three-byte BBS disclosure proof header. Ensure the result is an array of five or six elements: a byte array, a map of integers to integers, an array of integers, another array of integers, a byte array, and, if present, a sixth element that is a byte array for the pseudonym; otherwise, throw an error.
  5. Replace the second element in `components` using the result of calling the algorithm in Section , passing the existing second element of `components` as `compressedLabelMap`.
  6. Return derived proof value as an object with properties set to the elements of `components`, using the names `bbsProof`, `labelMap`, `mandatoryIndexes`, `selectiveIndexes`, `presentationHeader`, and, if present, `pseudonym`, respectively.

createVerifyData

The following algorithm creates the data needed to perform verification of a BBS-protected verifiable credential. The inputs include a JSON-LD document (|document|), a BBS disclosure proof (|proof|), and any custom JSON-LD API options (such as a document loader). A single verify data object value is produced as output containing the following fields: `bbsProof`, `proofHash`, `mandatoryHash`, `selectiveIndexes`, `presentationHeader`, and `nonMandatory`.

  1. Initialize `proofHash` to the result of performing RDF Dataset Canonicalization [[RDF-CANON]] on the proof options, i.e., the proof portion of the document with the `proofValue` removed. The hash used is the same as that used in the signature algorithm, i.e., SHA-256. Note: This step can be performed in parallel; it only needs to be completed before this algorithm needs to use the `proofHash` value.
  2. Initialize `bbsProof`, `labelMap`, `mandatoryIndexes`, `selectiveIndexes`, `presentationHeader`, and `pseudonym` to the values associated with their property names in the object returned when calling the algorithm in Section , passing `proofValue` from `proof`.
  3. Initialize `labelMapFactoryFunction` to the result of calling the "`createLabelMapFunction`" algorithm of [[DI-ECDSA]], passing `labelMap`.
  4. Initialize `nquads` to the result of calling the "`labelReplacementCanonicalize`" algorithm of [[DI-ECDSA]], passing `document`, `labelMapFactoryFunction`, and any custom JSON-LD API options. Note: This step transforms the document into an array of canonical N-Quads with pseudorandom blank node identifiers based on `labelMap`.
  5. Initialize `mandatory` to an empty array.
  6. Initialize `nonMandatory` to an empty array.
  7. For each entry (`index`, `nq`) in `nquads`, separate the N-Quads into mandatory and non-mandatory categories:
    1. If `mandatoryIndexes` includes `index`, add `nq` to `mandatory`.
    2. Otherwise, add `nq` to `nonMandatory`.
  8. Initialize `mandatoryHash` to the result of calling the "`hashMandatory`" primitive, passing `mandatory`.
  9. Return an object with properties matching `bbsProof`, `proofHash`, `nonMandatory`, `mandatoryHash`, `selectiveIndexes`, `presentationHeader`, and `pseudonym`.
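Step 7 above partitions the canonical N-Quads using the mandatory indexes recovered from the disclosure proof. A non-normative sketch, with an illustrative function name:

```javascript
// Non-normative sketch of step 7 of createVerifyData: separate the
// N-Quads into mandatory and non-mandatory categories by index.
function splitMandatory(nquads, mandatoryIndexes) {
  const mandatory = [];
  const nonMandatory = [];
  nquads.forEach((nq, index) => {
    // If mandatoryIndexes includes this index, the quad is mandatory.
    (mandatoryIndexes.includes(index) ? mandatory : nonMandatory).push(nq);
  });
  return { mandatory, nonMandatory };
}
```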

bbs-2023

The `bbs-2023` cryptographic suite takes an input document, canonicalizes the document using the Universal RDF Dataset Canonicalization Algorithm [[RDF-CANON]], and then applies a number of transformations and cryptographic operations resulting in the production of a data integrity proof. The algorithms in this section also include the verification of such a data integrity proof.

Create Base Proof (bbs-2023)

The following algorithm specifies how to create a [=data integrity proof=] given an unsecured data document. Required inputs are an unsecured data document ([=map=] |unsecuredDocument|), and a set of proof options ([=map=] |options|). A [=data integrity proof=] ([=map=]), or an error, is produced as output.

  1. Let |proof| be a clone of the proof options, |options|.
  2. Let |proofConfig| be the result of running the algorithm in Section [[[#base-proof-configuration-bbs-2023]]] with |options| passed as a parameter.
  3. Let |transformedData| be the result of running the algorithm in Section [[[#base-proof-transformation-bbs-2023]]] with |unsecuredDocument|, |proofConfig|, and |options| passed as parameters.
  4. Let |hashData| be the result of running the algorithm in Section [[[#base-proof-hashing-bbs-2023]]] with |transformedData| and |proofConfig| passed as parameters.
  5. Let |proofBytes| be the result of running the algorithm in Section [[[#base-proof-serialization-bbs-2023]]] with |hashData| and |options| passed as parameters.
  6. Let |proof|.|proofValue| be a base64url-encoded Multibase value of the |proofBytes|.
  7. Return |proof| as the [=data integrity proof=].

Base Proof Transformation (bbs-2023)

The following algorithm specifies how to transform an unsecured input document into a transformed document that is ready to be provided as input to the hashing algorithm in Section .

Required inputs to this algorithm are an unsecured data document (|unsecuredDocument|) and transformation options (|options|). The transformation options MUST contain a type identifier for the cryptographic suite (|type|), a cryptosuite identifier (|cryptosuite|), and a verification method (|verificationMethod|). The transformation options MUST contain an array of mandatory JSON pointers (|mandatoryPointers|) and MAY contain additional options, such as a JSON-LD document loader. A transformed data document is produced as output. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding.

  1. Initialize |hmac| to an HMAC API using a locally generated and exportable HMAC key. The HMAC uses the same hash algorithm used in the signature algorithm, i.e., SHA-256. Per the recommendations of [[RFC2104]], the HMAC key MUST be the same length as the digest size; for SHA-256, this is 256 bits or 32 bytes.
  2. Initialize `labelMapFactoryFunction` to the result of calling the `createShuffledIdLabelMapFunction` algorithm passing `hmac` as `HMAC`.
  3. Initialize `groupDefinitions` to a map with an entry with a key of the string "`mandatory`" and a value of |mandatoryPointers|.
  4. Initialize `groups` to the result of calling the algorithm in Section 3.3.16 canonicalizeAndGroup of the [[DI-ECDSA]] specification, passing `labelMapFactoryFunction`, `groupDefinitions`, `unsecuredDocument` as `document`, and any custom JSON-LD API options. Note: This step transforms the document into an array of canonical N-Quads whose order has been shuffled based on HMAC-applied blank node identifiers, and groups the N-Quad strings according to selections based on JSON pointers.
  5. Initialize `mandatory` to the values in the `groups.mandatory.matching` map.
  6. Initialize `nonMandatory` to the values in the `groups.mandatory.nonMatching` map.
  7. Initialize `hmacKey` to the result of exporting the HMAC key from `hmac`.
  8. Return an object with "`mandatoryPointers`" set to `mandatoryPointers`, "`mandatory`" set to `mandatory`, "`nonMandatory`" set to `nonMandatory`, and "`hmacKey`" set to `hmacKey`.

Base Proof Hashing (bbs-2023)

The following algorithm specifies how to cryptographically hash a transformed data document and proof configuration into cryptographic hash data that is ready to be provided as input to the algorithms in Section .

The required inputs to this algorithm are a transformed data document (|transformedDocument|) and canonical proof configuration (|canonicalProofConfig|). A hash data value represented as an object is produced as output.

  1. Initialize `proofHash` to the result of calling the RDF Dataset Canonicalization algorithm [[RDF-CANON]] on `canonicalProofConfig` and then cryptographically hashing the result using the same hash that is used by the signature algorithm, i.e., SHA-256. Note: This step can be performed in parallel; it only needs to be completed before this algorithm terminates, as the result is part of the return value.
  2. Initialize `mandatoryHash` to the result of calling the algorithm in Section 3.3.17 hashMandatoryNQuads of the [[DI-ECDSA]] specification, passing |transformedDocument|.`mandatory` and using the SHA-256 algorithm.
  3. Initialize `hashData` as a deep copy of |transformedDocument|, and add `proofHash` as "`proofHash`" and `mandatoryHash` as "`mandatoryHash`" to that object.
  4. Return `hashData` as hash data.

Base Proof Configuration (bbs-2023)

The following algorithm specifies how to generate a proof configuration from a set of proof options that is used as input to the base proof hashing algorithm.

The required inputs to this algorithm are proof options (|options|). The proof options MUST contain a type identifier for the cryptographic suite (|type|) and MUST contain a cryptosuite identifier (|cryptosuite|). A proof configuration object is produced as output.

  1. Let |proofConfig| be a clone of the |options| object.
  2. If |proofConfig|.|type| is not set to `DataIntegrityProof` or |proofConfig|.|cryptosuite| is not set to `bbs-2023`, an `INVALID_PROOF_CONFIGURATION` error MUST be raised.
  3. If |proofConfig|.|created| is set and if the value is not a valid [[XMLSCHEMA11-2]] datetime, an `INVALID_PROOF_DATETIME` error MUST be raised.
  4. Set |proofConfig|.|@context| to |unsecuredDocument|.|@context|.
  5. Let |canonicalProofConfig| be the result of applying the Universal RDF Dataset Canonicalization Algorithm [[RDF-CANON]] to the |proofConfig|.
  6. Return |canonicalProofConfig|.
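The validation in steps 2 and 3 above can be sketched non-normatively as follows. `Date.parse` is used here as a loose stand-in for full [[XMLSCHEMA11-2]] datetime validation; a conforming implementation needs a stricter check, and the function name is illustrative only.

```javascript
// Non-normative sketch of proof configuration validation (steps 2 and 3).
function validateProofConfig(proofConfig) {
  if (proofConfig.type !== 'DataIntegrityProof' ||
      proofConfig.cryptosuite !== 'bbs-2023') {
    throw new Error('INVALID_PROOF_CONFIGURATION');
  }
  // Date.parse is looser than the XML Schema datetime grammar; shown here
  // only to illustrate where the datetime check occurs.
  if (proofConfig.created !== undefined &&
      Number.isNaN(Date.parse(proofConfig.created))) {
    throw new Error('INVALID_PROOF_DATETIME');
  }
}
```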

Base Proof Serialization (bbs-2023)

The following algorithm, to be called by an issuer of a BBS-protected Verifiable Credential, specifies how to create a base proof. The base proof is to be given only to the holder, who is responsible for generating a derived proof from it, exposing only selectively disclosed details in the proof to a verifier. This algorithm is designed to be used in conjunction with the algorithms defined in the Data Integrity [[VC-DATA-INTEGRITY]] specification, Section 4: Algorithms. Required inputs are cryptographic hash data (|hashData|) and proof options (|options|). Optional inputs include a |commitment_with_proof| byte array and/or a |use_pseudonyms| boolean. The proof options MUST contain a type identifier for the cryptographic suite (|type|) and MAY contain a cryptosuite identifier (|cryptosuite|). A single digital proof value represented as a series of bytes is produced as output.

  1. Initialize `proofHash`, `mandatoryPointers`, `mandatoryHash`, `nonMandatory`, and `hmacKey` to the values associated with their property names in |hashData|.
  2. Initialize `bbsHeader` to the concatenation of `proofHash` and `mandatoryHash` in that order.
  3. Initialize `bbsMessages` to an array of byte arrays containing the values in the `nonMandatory` array of strings encoded using the UTF-8 character encoding.
  4. Compute the `bbsSignature` using the procedures below, dependent on the values of |commitment_with_proof| and |use_pseudonyms| options.
    1. If |commitment_with_proof| is empty and |use_pseudonyms| is false, compute the `bbsSignature` using the `Sign` procedure of [[CFRG-BBS-Signature]], with appropriate key material, `bbsHeader` for the `header`, and `bbsMessages` for the `messages`.
    2. If |commitment_with_proof| is not empty and |use_pseudonyms| is false, compute the `bbsSignature` using the `Sign` procedure of [[CFRG-Blind-BBS-Signature]], with appropriate key material, `bbsHeader` for the `header`, and `bbsMessages` for the `messages`. If the signing procedure uses the optional |signer_blind| parameter, retain this value for use in the serialization step below. This provides for the "anonymous holder binding" feature.
    3. If |commitment_with_proof| is empty and |use_pseudonyms| is true, generate a cryptographically random 32 byte |pid| value. Compute the `bbsSignature` using the `Sign` procedure of [[CFRG-Pseudonym-BBS-Signature]], with appropriate key material, `bbsHeader` for the `header`, `bbsMessages` for the `messages`, and |pid| for the `pid`. Retain the |pid| value for use in the serialization step below. This provides for the "pseudonym with issuer known pid" feature.
    4. If |commitment_with_proof| is not empty and |use_pseudonyms| is true, compute the `bbsSignature` using the `Sign` procedure of [[CFRG-Pseudonym-BBS-Signature]], with appropriate key material, `bbsHeader` for the `header`, `bbsMessages` for the `messages`, and |commitment_with_proof| for the `commitment_with_proof`. If the signing procedure uses the optional |signer_blind| parameter, retain this value for use in the serialization step below. This provides for the "pseudonym with hidden pid" feature.
  5. Initialize `proofValue` to the result of calling the algorithm in Section , passing `bbsSignature`, `bbsHeader`, `publicKey`, `hmacKey`, `mandatoryPointers`, `pid`, and `signer_blind` values as parameters. Use empty byte arrays for `pid` and `signer_blind` if they are not used. Note that `publicKey` is a byte array of the public key, encoded according to [[CFRG-BBS-SIGNATURE]].
  6. Return `proofValue` as digital proof.
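Steps 2 and 3 above can be sketched as follows. The function name and the layout of the `hashData` dictionary are illustrative assumptions; the actual `Sign` invocation (step 4) would be provided by a BBS library.

```python
def assemble_sign_inputs(hash_data: dict) -> tuple[bytes, list[bytes]]:
    # Step 2: bbsHeader is the concatenation of proofHash and mandatoryHash,
    # in that order.
    bbs_header = hash_data["proofHash"] + hash_data["mandatoryHash"]
    # Step 3: each non-mandatory statement becomes a UTF-8 encoded BBS message.
    bbs_messages = [s.encode("utf-8") for s in hash_data["nonMandatory"]]
    return bbs_header, bbs_messages
```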

Add Derived Proof (bbs-2023)

The following algorithm, to be called by a holder of a `bbs-2023`-protected verifiable credential, creates a selective disclosure derived proof. The derived proof is to be given to the verifier. The inputs include a JSON-LD document (|document|), a BBS base proof (|proof|), an array of JSON pointers to use to selectively disclose statements (|selectivePointers|), an OPTIONAL BBS |presentationHeader| (a byte array), an OPTIONAL |commitment_with_proof| (a byte array), an OPTIONAL |pid| value (a byte array), and any custom JSON-LD API options, such as a document loader. A single selectively revealed document value, represented as an object, is produced as output.

  1. Initialize `bbsProof`, `labelMap`, `mandatoryIndexes`, `selectiveIndexes`, and `revealDocument` to the values associated with their property names in the object returned when calling the algorithm in Section , passing the `document`, `proof`, `selectivePointers`, `presentationHeader`, and any custom JSON-LD API options, such as a document loader.
  2. Initialize `newProof` to a shallow copy of `proof`.
  3. Replace `proofValue` in `newProof` with the result of calling the algorithm in Section , passing `bbsProof`, `labelMap`, `mandatoryIndexes`, `selectiveIndexes`, |commitment_with_proof|, and |pid|.
  4. Set the value of the "`proof`" property in `revealDocument` to `newProof`.
  5. Return `revealDocument` as the selectively revealed document.
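The copy-and-replace steps above might look like the following sketch. The function name is hypothetical, and the derived `proofValue` is assumed to have already been computed by the serialization algorithm referenced in step 3.

```python
def add_derived_proof(reveal_document: dict, proof: dict,
                      derived_proof_value: str) -> dict:
    # Step 2: shallow copy of the base proof options.
    new_proof = dict(proof)
    # Step 3: the derived proofValue (computed by the serialization
    # algorithm referenced above) replaces the base proofValue.
    new_proof["proofValue"] = derived_proof_value
    # Steps 4-5: attach the new proof and return the revealed document.
    reveal_document["proof"] = new_proof
    return reveal_document
```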

Verify Derived Proof (bbs-2023)

The following algorithm specifies how to verify a [=data integrity proof=] given a secured data document. The required input is a secured data document ([=map=] |securedDocument|). This algorithm returns a verification result, which is a [=struct=] whose [=struct/items=] are:

verified
`true` or `false`
verifiedDocument
Null, if [=verification result/verified=] is `false`; otherwise, an [=unsecured data document=]

To verify a derived proof, perform the following steps:

  1. Let |unsecuredDocument| be a copy of |securedDocument| with the `proof` value removed.
  2. Let |proofConfig| be a copy of |securedDocument|.|proof| with `proofValue` removed.
  3. Let |proof| be the value of |securedDocument|.|proof|.
  4. Initialize `bbsProof`, `proofHash`, `mandatoryHash`, `selectiveIndexes`, `presentationHeader`, `pseudonym`, and `nonMandatory` to the values associated with their property names in the object returned when calling the algorithm in Section , passing the |unsecuredDocument|, |proof|, and any custom JSON-LD API options (such as a document loader).
  5. Initialize `bbsHeader` to the concatenation of `proofHash` and `mandatoryHash` in that order. Initialize `disclosedMessages` to an array of byte arrays obtained from the UTF-8 encoding of the elements of the `nonMandatory` array.
  6. Initialize |verified| to the result of applying the verification algorithm below, depending on whether the |pseudonym| value is empty.
    1. If the |pseudonym| value is empty, initialize |verified| to the result of applying the verification algorithm `ProofVerify(PK, proof, header, ph, disclosed_messages, disclosed_indexes)` of [[CFRG-BBS-SIGNATURE]] with `PK` set as the public key of the original issuer, `proof` set as `bbsProof`, `header` set as `bbsHeader`, `disclosed_messages` set as `disclosedMessages`, `ph` set as `presentationHeader`, and `disclosed_indexes` set as `selectiveIndexes`. This applies to the regular BBS proof case as well as "anonymous holder binding" case.
    2. If the |pseudonym| value is not empty, initialize |verified| to the result of applying the verification algorithm `PseudonymProofVerify(PK, proof, header, ph, disclosed_messages, disclosed_indexes, pseudonym)` of [[CFRG-Pseudonym-BBS-Signature]], with `PK` set as the public key of the original issuer, `proof` set as `bbsProof`, `header` set as `bbsHeader`, `disclosed_messages` set as `disclosedMessages`, `ph` set as `presentationHeader`, `disclosed_indexes` set as `selectiveIndexes`, and `pseudonym`. This applies to the "pseudonym with issuer known pid" and "pseudonym with hidden pid" cases.
  7. Return a [=verification result=] with [=struct/items=]:
    [=verified=]
    |verified|
    [=verifiedDocument=]
    |unsecuredDocument| if |verified| is `true`, otherwise Null
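The dispatch in step 6 can be sketched as below. Here `proof_verify` and `pseudonym_proof_verify` stand in for library implementations of `ProofVerify` [[CFRG-BBS-SIGNATURE]] and `PseudonymProofVerify` [[CFRG-Pseudonym-BBS-Signature]]; all names are illustrative.

```python
def verify_derived(pk: bytes, bbs_proof: bytes, bbs_header: bytes,
                   presentation_header: bytes, disclosed_messages: list,
                   selective_indexes: list, pseudonym: bytes,
                   proof_verify, pseudonym_proof_verify) -> bool:
    # Step 6.1: empty pseudonym -> plain BBS ProofVerify (also covers the
    # "anonymous holder binding" case).
    if not pseudonym:
        return proof_verify(pk, bbs_proof, bbs_header, presentation_header,
                            disclosed_messages, selective_indexes)
    # Step 6.2: non-empty pseudonym -> PseudonymProofVerify, covering the
    # "issuer known pid" and "hidden pid" cases.
    return pseudonym_proof_verify(pk, bbs_proof, bbs_header,
                                  presentation_header, disclosed_messages,
                                  selective_indexes, pseudonym)
```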

Optional Features

The cryptographic properties of BBS signatures permit variants that can support advanced functionalities. This specification is limited to supporting only the most relevant of these enhancements, which we explain in the following sections. The variables |commitment_with_proof|, |use_pseudonyms|, |pid|, and |pseudonym| are associated with these features and are not otherwise needed for BBS signatures and proofs.

The optional BBS features described in this section, and included in the algorithms in this specification, are at risk and will be removed before the finalization of this specification if their respective specifications at the IETF do not reach RFC status on the same timeline or if there are not at least two independent implementations for each optional feature.

Anonymous Holder Binding

This feature binds, at the time of issuance, a document with base proof to a secret known only to a holder, in such a way that only that holder can generate a revealed document with derived proof that will verify. For example, if an adversary obtained the document with base proof, they could not create a revealed document with derived proof that can verify.

To provide for this functionality, a holder generates a |holder_secret| value which should generally be at least 32 bytes long and cryptographically randomly generated. This value is never shared by the holder. Instead, the holder generates a commitment along with a zero knowledge proof of knowledge of this value, using the "Commitment Generation" procedure of [[CFRG-Blind-BBS-Signature]]. This computation involves cryptographically random values and computes the |commitment_with_proof| and |secret_prover_blind| values. The |commitment_with_proof| is conveyed to the issuer while the |secret_prover_blind| is kept secret and is retained by the holder for use in generation of derived proofs. Note that a holder can run the "Commitment Generation" procedure multiple times to produce unlinkable |commitment_with_proof| values for use with different issuers.

The issuer, on receipt of the |commitment_with_proof|, follows the procedures of [[CFRG-Blind-BBS-Signature]] to produce a base proof (signature) over the document with the commitment furnished by the holder. If the issuer chooses to use the |signer_blind| parameter when creating the signature in [[CFRG-Blind-BBS-Signature]], this value needs to be conveyed to the holder as part of the base proof value.

When the holder wants to create a selectively disclosed document with derived proof, they use their |holder_secret| (as a "committed message"), the |secret_prover_blind|, and, if supplied in the base proof, the |signer_blind| in the proof generation procedure of [[CFRG-Blind-BBS-Signature]].

Verification of the revealed document with derived proof uses the "regular" BBS proof verification procedures of [[CFRG-BBS-SIGNATURE]].

Pseudonyms with Issuer-known PID

This feature is a privacy preserving enhancement that allows a verifier that has seen a selectively revealed document with derived proof from a holder to recognize that the same holder is presenting a new selectively revealed document with derived proof. Note that this may just be a new unlinkable proof (derived proof) on the same selectively revealed information. By "privacy preserving," we mean that no uniquely identifiable information is added that would allow tracking between different verifiers that may share information amongst themselves. This variant does allow for the issuer to monitor usage if verifiers share information with the issuer.

To furnish this capability, before creating the base proof for a document, an issuer generates a value known as a |pid| (prover id) which should be cryptographically random and at least 32 bytes long. This value is shared with the holder but otherwise kept secret. This value is then used in creating the base proof via the signing procedure in [[CFRG-Pseudonym-BBS-Signature]].

The holder receives the document with base proof which includes the |pid| value from the issuer. The holder obtains a |verifier_id| associated with the verifier for which they intend to create a revealed document with derived proof. Using the procedures of [[CFRG-Pseudonym-BBS-Signature]], a cryptographic |pseudonym| value is generated. The derived proof value is generated via the proof generation procedure of [[CFRG-Pseudonym-BBS-Signature]], and this value along with the |pseudonym| are given to the verifier. Note that the |pid| value cannot be recovered from the |pseudonym|.

When the verifier receives the revealed document with derived proof and |pseudonym|, they use the proof verification procedures of [[CFRG-Pseudonym-BBS-Signature]].

Pseudonyms with Hidden PID

This feature is a privacy preserving enhancement that allows a verifier that has seen a selectively revealed document with derived proof from a holder to recognize that the same holder is presenting a new selectively revealed document with derived proof. Note that this may just be a new unlinkable proof (derived proof) on the same selectively revealed information. By "privacy preserving," we mean that no uniquely identifiable information is added that would allow tracking between different verifiers that may share information amongst themselves and/or with the issuer.

To provide for this capability, a holder needs to generate a secret |pid| value that should be at least 32 bytes long and generated in a cryptographically random manner. The holder then uses the "Commitment Generation" procedure of [[CFRG-Blind-BBS-Signature]] to generate a |commitment_with_proof| value and a private |secret_prover_blind| value. The |commitment_with_proof| value needs to be conveyed to the issuer, who will use it in the issuance of a document with base proof, in accordance with [[CFRG-Pseudonym-BBS-Signature]], which is sent to the holder. The |pid| value is never shared by the holder. If the issuer chooses to use the optional |signer_blind| parameter when creating the signature, this value needs to be conveyed to the holder as part of the base proof value.

The holder obtains a |verifier_id| associated with the verifier for which they intend to create a revealed document with derived proof. Using the procedures of [[CFRG-Pseudonym-BBS-Signature]], a cryptographic |pseudonym| value is generated from their |pid| value and the |verifier_id|. The derived proof value is generated via the proof generation using the |pid|, |secret_prover_blind|, |verifier_id|, and |signer_blind| using the procedures of [[CFRG-Pseudonym-BBS-Signature]], and this value is given to the verifier along with the |pseudonym|. Note that the |pid| value cannot be recovered from the |pseudonym|.

When the verifier receives the revealed document with derived proof and |pseudonym|, they use the proof verification procedures of [[CFRG-Pseudonym-BBS-Signature]].

Security Considerations

Before reading this section, readers are urged to familiarize themselves with general security advice provided in the Security Considerations section of the Data Integrity specification.

Base Proof Security Properties

The security of the base proof is dependent on the security properties of the associated BBS signature. Digital signatures might exhibit a number of desirable cryptographic properties [[Taming_EdDSAs]]; among these are:

EUF-CMA (existential unforgeability under chosen message attacks) is usually the minimal security property required of a signature scheme. It guarantees that any efficient adversary who has the public key `pk` of the signer and has received an arbitrary number of signatures on messages of its choice (in an adaptive manner), `{(m_i, σ_i)}, i = 1, ..., N`, cannot output a valid signature `σ` for a new message `m ∉ {m_i}` (except with negligible probability). In case the attacker outputs a valid signature on a new message, `(m, σ)`, it is called an existential forgery.

SUF-CMA (strong unforgeability under chosen message attacks) is a stronger notion than EUF-CMA. It guarantees that any efficient adversary who has the public key `pk` of the signer and has received an arbitrary number of signatures on messages of its choice, `{(m_i, σ_i)}, i = 1, ..., N`, cannot output a new valid message/signature pair `(m, σ)` such that `(m, σ) ∉ {(m_i, σ_i)}` (except with negligible probability). Strong unforgeability implies that an adversary not only cannot sign new messages, but also cannot find a new signature on an old message.

In [[CDL2016]], under some reasonable assumptions, BBS signatures were proven to be EUF-CMA. Furthermore, in [[TZ2023]], under similar assumptions, BBS signatures were proven to be SUF-CMA. In both cases the assumptions are related to the hardness of the discrete logarithm problem, which is not considered secure against large-scale quantum computers.

Under non-quantum computing conditions [[CFRG-BBS-SIGNATURE]] provides additional security guidelines to BBS signature suite implementors. Further security considerations related to pairing friendly curves are discussed in [[CFRG-PAIRING-FRIENDLY]].

Derived Proof Security Properties

The security of the derived proof is dependent on the security properties of the associated BBS proof. Both [[CDL2016]] and [[TZ2023]] prove that a BBS proof is a zero knowledge proof of knowledge of a BBS signature.

As explained in [[CFRG-BBS-SIGNATURE]] this means:

a verifying party in receipt of a proof is unable to determine which signature was used to generate the proof, removing a common source of correlation. In general, each proof generated is indistinguishable from random even for two proofs generated from the same signature.

and

The proofs generated by the scheme prove to a verifier that the party who generated the proof (holder/prover or an agent of theirs) was in possession of a signature without revealing it.

More precisely, verification of a BBS proof requires the original issuer's public key as well as the unaltered, revealed BBS messages in the proper order.

Privacy Considerations

Selective Disclosure and Data Leakage

Selective disclosure permits a holder to minimize the information revealed to a verifier to achieve a particular purpose. In prescribing an overall system that enables selective disclosure, care has to be taken that additional information that was not meant to be disclosed to the verifier is minimized. Such leakage can occur through artifacts of the system. Such artifacts can come from higher layers of the system, such as in the structure of data or from the lower level cryptographic primitives.

For example, the BBS signature scheme is an extremely space efficient scheme for producing a signature on multiple messages, i.e., the cryptographic signature sent to the holder is a constant size regardless of the number of messages. The holder can then selectively disclose any of these messages to a verifier; however, as part of the signature scheme, the total number of messages signed by the issuer has to be revealed to the verifier. If such information leakage needs to be avoided, it is recommended to pad the number of messages out to a common length, as suggested in the privacy considerations section of [[CFRG-BBS-SIGNATURE]].

At the higher levels, how data gets mapped into individual statements suitable for selective disclosure, i.e., BBS messages, is a potential source of data leakage. This cryptographic suite is able to eliminate many structural artifacts used to express JSON data that might leak information (nesting, map, or array position, etc.) by using JSON-LD processing to transform inputs into RDF. RDF can then be expressed as a canonical, flat format of simple subject, property, value statements (referred to as claims in the Verifiable Credentials Data Model [[VC-DATA-MODEL-2.0]]). In the following, we examine RDF canonicalization, a general scheme for mapping a verifiable credential in JSON-LD format into a set of statements (BBS messages), for selective disclosure. We show that after this process is performed, there remains a possible source of information leakage, and we show how this leakage is mitigated via the use of a keyed pseudo random function (PRF).

RDF canonicalization can be used to flatten a JSON-LD VC into a set of statements. The algorithm is dependent on the content of the VC and also employs a cryptographic hash function to help in ordering the statements. In essence, how this happens is that each JSON object that represents the subject of claims within a JSON-LD document will be assigned an id, if it doesn't have an `@id` field defined. Such ids are known as blank node ids. These ids are needed to express claims as simple subject, property, value statements such that the subject in each claim can be differentiated. The id values are deterministically set per [[RDF-CANON]] and are based on the data in the document and the output of a cryptographic hash function such as SHA-256.

Below we show two slightly different VCs for a set of windsurf sails and their canonicalization into a set of statements that can be used for selective disclosure. By changing the year of the 6.1 size sail, we see a major change in statement ordering between these two VCs. If the holder disclosed information about just their larger sails (the 7.0 and 7.8), the verifier could tell something changed about the set of sails, i.e., information leakage.

{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    {
      "@vocab": "https://windsurf.grotto-networking.com/selective#"
    }
  ],
  "type": [
    "VerifiableCredential"
  ],
  "credentialSubject": {
    "sails": [
      {
        "size": 5.5,
        "sailName": "Kihei",
        "year": 2023
      },
      {
        "size": 6.1,
        "sailName": "Lahaina",
        "year": 2023 // Will change this to see the effect on canonicalization
      },
      {
        "size": 7.0,
        "sailName": "Lahaina",
        "year": 2020
      },
      {
        "size": 7.8,
        "sailName": "Lahaina",
        "year": 2023
      }
    ]
  }
}
        

Canonical form of the above VC. The assignment of blank node ids, i.e., the _:c14nX labels, is dependent upon the content of the VC, and this also affects the ordering of the statements.

_:c14n0 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" .
_:c14n0 <https://windsurf.grotto-networking.com/selective#size> "7.8E0"^^<http://www.w3.org/2001/XMLSchema#double> .
_:c14n0 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://www.w3.org/2018/credentials#VerifiableCredential> .
_:c14n1 <https://www.w3.org/2018/credentials#credentialSubject> _:c14n4 .
_:c14n2 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" .
_:c14n2 <https://windsurf.grotto-networking.com/selective#size> "7"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n2 <https://windsurf.grotto-networking.com/selective#year> "2020"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n3 <https://windsurf.grotto-networking.com/selective#sailName> "Kihei" .
_:c14n3 <https://windsurf.grotto-networking.com/selective#size> "5.5E0"^^<http://www.w3.org/2001/XMLSchema#double> .
_:c14n3 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n0 .
_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n2 .
_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n3 .
_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n5 .
_:c14n5 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" .
_:c14n5 <https://windsurf.grotto-networking.com/selective#size> "6.1E0"^^<http://www.w3.org/2001/XMLSchema#double> .
_:c14n5 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> .
        

Updated windsurf sail collection, i.e., the 6.1 size sail has been updated to the 2024 model. This changes the ordering of statements via the assignment of blank node ids.

{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    {
      "@vocab": "https://windsurf.grotto-networking.com/selective#"
    }
  ],
  "type": [
    "VerifiableCredential"
  ],
  "credentialSubject": {
    "sails": [
      {
        "size": 5.5,
        "sailName": "Kihei",
        "year": 2023
      },
      {
        "size": 6.1,
        "sailName": "Lahaina",
        "year": 2024 // New sail to update older model, changes canonicalization
      },
      {
        "size": 7.0,
        "sailName": "Lahaina",
        "year": 2020
      },
      {
        "size": 7.8,
        "sailName": "Lahaina",
        "year": 2023
      }
    ]
  }
}
        

Canonical form of the previous VC. Note the difference in blank node id assignment and ordering of statements.

_:c14n0 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" .
_:c14n0 <https://windsurf.grotto-networking.com/selective#size> "6.1E0"^^<http://www.w3.org/2001/XMLSchema#double> .
_:c14n0 <https://windsurf.grotto-networking.com/selective#year> "2024"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n1 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" .
_:c14n1 <https://windsurf.grotto-networking.com/selective#size> "7.8E0"^^<http://www.w3.org/2001/XMLSchema#double> .
_:c14n1 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://www.w3.org/2018/credentials#VerifiableCredential> .
_:c14n2 <https://www.w3.org/2018/credentials#credentialSubject> _:c14n5 .
_:c14n3 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" .
_:c14n3 <https://windsurf.grotto-networking.com/selective#size> "7"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n3 <https://windsurf.grotto-networking.com/selective#year> "2020"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n4 <https://windsurf.grotto-networking.com/selective#sailName> "Kihei" .
_:c14n4 <https://windsurf.grotto-networking.com/selective#size> "5.5E0"^^<http://www.w3.org/2001/XMLSchema#double> .
_:c14n4 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n0 .
_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n1 .
_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n3 .
_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n4 .
        

To prevent such information leakage from the assignment of these blank node ids and the ordering they impose on the statements, an HMAC-based PRF is run on the blank node ids. The HMAC secret key is only shared between the issuer and holder, and each base proof generated by the issuer uses a new HMAC key. An example of this can be seen in the canonical HMAC test vector of [[DI-ECDSA]]. As discussed in the next section, for BBS to preserve unlinkability, we do not use HMAC-based blank node ids, but instead produce a shuffled version of the ordering based on the HMAC, as shown in test vector . Note that this furnishes less information hiding concerning blank node ids than the ECDSA-SD approach, since information about the number of blank node ids can leak, but it prevents linkage attacks via the essentially unique identifiers that would be produced by applying an HMAC to blank node ids.
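One way to realize such an HMAC-based shuffle is sketched below. This is an illustrative sketch, not the normative algorithm of this specification, and it assumes SHA-256 as the HMAC hash; the function name is our own.

```python
import hashlib
import hmac

def shuffled_label_order(blank_node_ids: list[str],
                         hmac_key: bytes) -> list[str]:
    # Sort the canonical blank node ids by the HMAC-SHA-256 digest of each
    # id. The resulting permutation depends on the secret per-proof HMAC
    # key, so a verifier cannot recover the content-derived canonical
    # ordering, while (unlike HMAC-renamed ids) no essentially unique
    # identifiers are exposed.
    return sorted(
        blank_node_ids,
        key=lambda bid: hmac.new(hmac_key, bid.encode("utf-8"),
                                 hashlib.sha256).digest())
```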

Selective Disclosure and Unlinkability

In some uses of VCs it can be important to the privacy of a holder to prevent the tracking or linking of multiple different verifier interactions. In particular we consider two important cases (i) verifier to issuer collusion, and (ii) verifier to verifier collusion. In the first case, shown in , a verifier reports back to the original issuer of the credential on an interaction with a holder. In this situation, the issuer could track all the holder interactions with various verifiers using the issued VC. In the second situation, shown in , multiple verifiers collude to share information about holders with whom they have interacted.


Diagram showing multiple verifiers sending data back to the issuer.
The diagram is laid out top to bottom with a circle labeled issuer at the top,
connected to a circle labeled holder below. From the circle labeled holder there
are multiple arrows to additional circles labeled verifiers. From the circles
labeled verifiers there are dashed arrows back to the circle labeled issuer
showing collusion data flow.
Verifier to issuer collusion.

Diagram showing multiple verifiers sharing with each other.
The diagram is laid out top to bottom with a circle labeled issuer at the top,
connected to a circle labeled holder below. From the circle labeled holder there
are multiple arrows to additional circles labeled verifiers. From the circles
labeled verifiers there are dashed arrows to other circles labeled verifiers
to show verifier to verifier collusion data flows.
Verifier to verifier collusion.

We use the term unlinkability to describe the property of a VC system to prevent such "linkage attacks" on holder privacy. Although the term unlinkability is relatively new, section 3.3 of [[NISTIR8053]] discusses and gives a case study of re-identification through linkage attacks. A systemization of knowledge on linkage attacks on data privacy can be found in [[Powar2023]]. The most widespread use of linkage attacks on user privacy occurs via the practice of web browser fingerprinting, a survey of which can be found in [[Pugliese2020]].

To quantify the notion of linkage, [[Powar2023]] introduces the idea of an anonymity set. In the VC case we are concerned with here, the anonymity set would contain the holder of a particular VC and other holders associated with a particular issuer. The smaller the anonymity set, the more likely the holder can be tracked across verifiers. Since a signed VC contains a reference to a public key of the issuer, the starting size of the anonymity set for a holder possessing a VC from a particular issuer is the number of VCs issued by that issuer with that particular public/private key pair. Non-malicious issuers are expected to minimize the number of public/private key pairs used to issue VCs. Note that the anonymity set idea is similar to the group privacy concept in [[vc-bitstring-status-list]]. When we use the term linkage here, we generally mean any mechanism that results in a reduction in size of the anonymity set.

Sources of linkage in a VC system supporting selective disclosure:

  1. Artifacts from cryptographic primitives.
  2. Artifacts from mapping a VC into a set of statements suitable for selective disclosure.
  3. Artifacts from proof options and mandatory revealed information in the VC.
  4. Selectively revealed information in the VC.
  5. External VC system based linkage.

We discuss each of these below.

Linkage via Cryptographic Artifacts

Cryptographic hashes, HMACs, and digital signatures by their nature generate highly unique identifiers. The output of a hash function such as SHA-256, by its collision resistance properties, is guaranteed to be essentially unique for different inputs and results in a strong linkage, i.e., it reduces the anonymity set size to one. Similarly, deterministic signature algorithms such as Ed25519 and deterministic ECDSA will produce essentially unique outputs for different inputs and lead to strong linkages.

This implies that holders can be easily tracked across verifiers via digital signature, HMAC, or hash artifacts inside VCs and hence are vulnerable to verifier-verifier collusion and verifier-issuer collusion. Randomized signature algorithms such as some forms of ECDSA can permit the issuer to generate many distinct signatures on the same inputs and send these to the holder for use with different verifiers. Such an approach could be used to prevent verifier-verifier collusion based tracking but cannot help with verifier-issuer collusion.

Achieving unlinkability requires specially designed cryptographic signature schemes that allow the holder to generate what is called a zero-knowledge proof of knowledge of a signature (ZKPKS). What this means is that the holder can take a signature from the issuer in such a scheme and compute a ZKPKS to send to a verifier. This ZKPKS cannot be linked back to the original signature, but has all the desirable properties of a signature, i.e., the verifier can use it to verify that the messages were signed by the issuer's public key and that the messages have not been altered. In addition, the holder can generate as many ZKPKSs as desired for different verifiers, and these are essentially independent and unlinkable. BBS is one such signature scheme that supports this capability.

Although the ZKPKS, known as a BBS proof in this document, has guaranteed unlinkability properties, BBS, when used with selective disclosure, has two artifacts that can contribute to linkability: the total number of messages originally signed, and the index values for the revealed statements. See the privacy considerations in [[CFRG-BBS-SIGNATURE]] for a discussion and mitigation techniques.

As mentioned in the section on Issuer's Public Keys of [[CFRG-BBS-SIGNATURE]], there is the potential threat that an issuer might use multiple public keys, with some of them used to track a specific subset of users via verifier-issuer collusion. Since the issuer's public key has to be visible to the verifier, i.e., it is referenced in the BBS proof (derived proof), it can serve as a linkage point if the issuer has many different public keys, and particularly if it uses a subset of those keys with a small subset of users (holders).

Linkage via VC Processing

We saw in the section on information leakage that RDF canonicalization uses a hash function to order statements, and that a further shuffle of the statement order is performed based on an HMAC. This can leave a fingerprint that might allow for some linkage. How strong a linkage depends on the number of blank nodes (essentially JSON objects within the VC) and the number of indexes revealed. Given n blank nodes and k disclosed indexes, in the worst case this is a reduction in anonymity set size by a factor of C(n, k), i.e., the number of combinations of size k chosen from a set of n elements. One can keep this number quite low by reducing the number of blank nodes in the VC, e.g., by keeping the VC short and simple.
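The worst-case reduction factor is easy to compute; the values of n and k below are illustrative, not drawn from any particular VC:

```python
from math import comb

# With n blank nodes and k disclosed indexes, the worst-case reduction
# in anonymity set size is C(n, k): the number of k-element subsets of n.
n_blank_nodes = 20
k_disclosed = 5
reduction_factor = comb(n_blank_nodes, k_disclosed)
print(reduction_factor)  # 15504

# A shorter, simpler VC keeps this factor much smaller:
print(comb(6, 2))  # 15
```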

Linkage via JSON-LD Node Identifiers

JSON-LD is a JSON-based format for serialization of Linked Data. As such, it supports assigning a globally unambiguous `@id` attribute (node identifier) to each object ("node", in JSON-LD terminology) within a document. This allows for the linking of linked data, enabling information about the same entity to be correlated. This correlation can be desirable or undesirable, depending on the use case.

When using BBS for its unlinkability feature, globally unambiguous node identifiers cannot be used for individuals or for their personally identifiable information, since the strong linkage they provide is undesirable. Note that the use of such identifiers is acceptable when expressing statements about non-personal information (e.g., using a globally unambiguous identifier to identify a large country or a concert event). Also note that JSON-LD's use of `@context`, which maps terms to IRIs, does not generally affect unlinkability.

Linkage via Proof Options and Mandatory Reveal

In the [[vc-data-integrity]] specification, a number of properties of the `proof` attribute of a VC are given. Care has to be taken that the optional fields do not provide strong linkage across verifiers. The optional fields include: `id`, `created`, `expires`, `domain`, `challenge`, and `nonce`. For example, the optional `created` field is a `dateTimeStamp` value which can specify the creation date of the proof down to an arbitrary sub-second granularity. Such information, if present, could greatly reduce the size of the anonymity set. If the issuer wants to include such information, they ought to make it as coarse-grained as possible relative to the number of VCs being issued over time.
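As a non-normative sketch of coarse-graining, the following truncates a precise timestamp to day granularity; the date and the chosen granularity are arbitrary illustrative values:

```python
from datetime import datetime, timezone

# A sub-second `created` timestamp can nearly identify one proof among
# many; truncating it to day granularity keeps the anonymity set large.
precise = datetime(2025, 3, 14, 9, 26, 53, 589793, tzinfo=timezone.utc)

coarse = precise.replace(hour=0, minute=0, second=0, microsecond=0)
print(coarse.isoformat())  # 2025-03-14T00:00:00+00:00
```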

The issuer can also compel a holder to reveal certain statements to a verifier via the `mandatoryPointers` input used in the creation of the Base Proof. See section , , and . By compel we mean that a generated Derived Proof will not verify unless these statements are revealed to the verifier. If such information is required to be disclosed, care should be taken that the anonymity set remains sufficiently large.

Linkage via Holder Selective Reveal

As discussed in [[Powar2023]], there are many documented cases of re-identification of individuals via linkage attacks. Hence the holder is urged to reveal as little information as possible to help keep the anonymity set large. In addition, it has been shown a number of times that innocuous-seeming information can be highly unique and thus lead to re-identification or tracking. See [[NISTIR8053]] for a walkthrough of a particularly famous case involving a former governor of Massachusetts, and [[Powar2023]] for further analysis and categorization of 94 such public cases.

External VC System Based Linkage

It ought to be pointed out that maintaining unlinkability, i.e., anonymity, requires care in the systems holding and communicating the VCs. Networking artifacts such as IP addresses (layer 3) or Ethernet/MAC addresses (layer 2) are well-known sources of linkage. For example, mobile phone MAC addresses can be used to track users who revisit a particular access point; this led mobile phone manufacturers to provide a MAC address randomization feature. Public IP addresses generally provide enough information to geolocate an individual to a city or region within a country, potentially greatly reducing the anonymity set.

Test Vectors

Demonstration of selective disclosure features including mandatory disclosure, selective disclosure, and overlap between those, requires an input credential document with more content than previous test vectors. To avoid excessively long test vectors, the starting document test vector is based on a purely fictitious windsurfing (sailing) competition scenario. In addition, we break the test vectors into two groups, based on those that would be generated by the issuer (base proof) and those that would be generated by the holder (derived proof).

Base Proof

To add a selective disclosure base proof to a document, the issuer needs the following cryptographic key material:

  1. The issuer's private/public key pair, i.e., the key pair corresponding to the verification method that will be part of the proof.
  2. An HMAC key. This is used to randomize the order of the blank node IDs to avoid potential information leakage via the blank node ID ordering. This is used only once, and is shared between issuer and holder. The HMAC in this case is functioning as a pseudorandom function (PRF).
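As a rough sketch of how an HMAC can serve as a PRF for relabeling blank node IDs, the following derives replacement labels from canonical IDs; sorting under the new labels shuffles the original order. The key value, label format, and IDs here are illustrative assumptions, not the normative algorithm or real test-vector values:

```python
import base64
import hashlib
import hmac

def hmac_id(hmac_key: bytes, canonical_id: str) -> str:
    # Derive a pseudorandom replacement label for a canonical blank
    # node ID using HMAC-SHA-256 as a PRF.
    digest = hmac.new(hmac_key, canonical_id.encode(), hashlib.sha256).digest()
    return "u" + base64.urlsafe_b64encode(digest).decode().rstrip("=")

key = bytes.fromhex("00" * 32)  # illustrative key, not a test-vector value
canonical_ids = ["c14n0", "c14n1", "c14n2"]
relabeled = {cid: hmac_id(key, cid) for cid in canonical_ids}
# Statements are then re-sorted under the new labels, hiding the
# information carried by the original canonical ordering.
```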

The key material used for generating the test vectors for the add base proof algorithm is shown below. Hexadecimal representation is used for the BBS key pairs and the HMAC key.

          

In our scenario, a sailor is registering with a race organizer for a series of windsurfing races to be held over a number of days on Maui. The organizer will inspect the sailor's equipment to certify that what has been declared is accurate. The sailor's unsigned equipment inventory is shown below.


          

In addition to letting other sailors know what kinds of equipment their competitors may be sailing on, it is mandatory that each sailor disclose the year of their most recent windsurfing board and full details on two of their sails. Note that all sailors are identified by a sail number that is printed on all their equipment. This mandatory information is specified via an array of JSON pointers as shown below.
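The selection mechanics of such pointers can be sketched with a minimal JSON Pointer (RFC 6901) resolver; the document and pointers below are simplified stand-ins for the windsurfing test vectors, not the real values:

```python
def resolve(doc, pointer: str):
    # Walk a JSON Pointer, unescaping ~1 -> "/" and ~0 -> "~",
    # indexing lists by integer and objects by key.
    node = doc
    for token in pointer.lstrip("/").split("/"):
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

doc = {
    "credentialSubject": {
        "sailNumber": "Earth101",
        "sails": [{"size": 5.5, "year": 2023}, {"size": 6.1, "year": 2024}],
    }
}
mandatory_pointers = ["/credentialSubject/sailNumber",
                      "/credentialSubject/sails/1"]

selected = [resolve(doc, p) for p in mandatory_pointers]
print(selected)  # ['Earth101', {'size': 6.1, 'year': 2024}]
```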


          

The result of applying the above JSON pointers to the sailor's equipment document is shown below.


          

Transformation of the unsigned document begins with canonicalizing the document, as shown below.


          

To prevent possible information leakage from the ordering of the blank node IDs these are processed through a PRF (i.e., the HMAC) to give the canonicalized HMAC document shown below. This represents an ordered list of statements that will be subject to mandatory and selective disclosure, i.e., it is from this list that statements are grouped.


          

The above canonical document gets grouped into mandatory and non-mandatory statements. The final output of the selective disclosure transformation process is shown below. Each statement is now grouped as mandatory or non-mandatory, and its index in the previous list of statements is remembered.


          

The next step is to create the base proof configuration and canonicalize it. This is shown in the following two examples.


          

          

In the hashing step, we compute the SHA-256 hash of the canonicalized proof options to produce the `proofHash`, and we compute the SHA-256 hash of the join of all the mandatory N-Quads to produce the `mandatoryHash`. These are shown below in hexadecimal format.
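A sketch of this step, using placeholder N-Quad strings rather than the real canonicalized data:

```python
import hashlib

# proofHash is SHA-256 over the canonicalized proof options;
# mandatoryHash is SHA-256 over the join of all mandatory N-Quads.
canonical_proof_options = '_:b0 <http://example/p> "o" .\n'
mandatory_nquads = [
    '_:b1 <http://example/q> "m0" .\n',
    '_:b2 <http://example/q> "m1" .\n',
]

proof_hash = hashlib.sha256(canonical_proof_options.encode()).hexdigest()
mandatory_hash = hashlib.sha256("".join(mandatory_nquads).encode()).hexdigest()
print(len(proof_hash), len(mandatory_hash))  # 64 64
```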


          

Shown below are the computed `bbsSignature` in hexadecimal, and the `mandatoryPointers`. These are fed to the final serialization step with the `hmacKey`.


          

Finally, the values above are run through the algorithm of Section , to produce the `proofValue` which is used in the signed base document shown below.


        

Derived Proof

The creation of BBS proofs uses random numbers and can take an optional `presentationHeader` as an input. To furnish a deterministic set of test vectors, we used the Mocked Random Scalars procedure from [[CFRG-BBS-SIGNATURE]]. The `seed` and `presentationHeader` values we used for generating the derived proof test vectors are given in hexadecimal below.


          

To create a derived proof, a holder starts with a signed document containing a base proof. The base document we will use for these test vectors is the final example from Section , above. The first step is to run the algorithm of Section to recover `bbsSignature`, `hmacKey`, and `mandatoryPointers`, as shown below.


          

Next, the holder needs to indicate what else, if anything, they wish to reveal to the verifiers, by specifying JSON pointers for selective disclosure. In our windsurfing competition scenario, a sailor (the holder) has just completed their first day of racing, and wishes to reveal to the general public (the verifiers) all the details of the windsurfing boards they used in the competition. These are shown below. Note that this slightly overlaps with the mandatorily disclosed information, which included only the year of their most recent board.


          

To produce the `revealDocument` (i.e., the unsigned document that will eventually be signed and sent to the verifier), we append the selective pointers to the mandatory pointers, and input these combined pointers, along with the document without its proof, to the `selectJsonLd` algorithm of [[DI-ECDSA]], to obtain the result shown below.


          

Now that we know what the revealed document looks like, we need to furnish appropriately updated information to the verifier about which statements are mandatory, and the indexes of the selected non-mandatory statements. Running step 6 of the algorithm yields an abundance of information about various statement groups relative to the original document. Below we show a portion of the indexes for those groups.


          

The verifier needs to be able to aggregate and hash the mandatory statements. To enable this, we furnish them with a list of indexes of the mandatory statements adjusted to their positions in the reveal document (i.e., relative to the `combinedIndexes`), while the `selectiveIndexes` need to be adjusted relative to their positions within the `nonMandatoryIndexes`. These "adjusted" indexes are shown below.
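The adjustment can be sketched as follows; the index values are illustrative, not taken from the actual test vectors:

```python
# An index is "adjusted" by replacing it with its position within the
# containing sorted list of indexes.
mandatory_indexes = [0, 1, 5]
selective_indexes = [5, 6, 7]   # raw indexes chosen by the holder
non_mandatory_indexes = [2, 3, 4, 5, 6, 7, 8]
combined_indexes = sorted(set(mandatory_indexes) | set(selective_indexes))

# Mandatory indexes relative to their positions in combinedIndexes:
adjusted_mandatory = [combined_indexes.index(i) for i in mandatory_indexes]
# Selective indexes relative to their positions in nonMandatoryIndexes:
adjusted_selective = [non_mandatory_indexes.index(i) for i in selective_indexes]
print(adjusted_mandatory, adjusted_selective)  # [0, 1, 2] [3, 4, 5]
```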



          

The last important piece of disclosure data is a mapping of canonical blank node IDs to HMAC-based shuffled IDs, the `labelMap`, computed according to Section . This is shown below along with the rest of the disclosure data minus the reveal document.


          

Finally, using the disclosure data above with the algorithm of Section , we obtain the signed derived (reveal) document shown below.


        

Acknowledgements

Portions of the work on this specification have been funded by the United States Department of Homeland Security's (US DHS) Silicon Valley Innovation Program under contracts 70RSAT20T00000003, and 70RSAT20T00000033. The content of this specification does not necessarily reflect the position or the policy of the U.S. Government and no official endorsement should be inferred.