Hidden service descriptors: encryption format

Hidden service descriptors are protected by two layers of encryption. Clients need to decrypt both layers to connect to the hidden service.

The first layer of encryption provides confidentiality against entities who don't know the public key of the hidden service (e.g. HSDirs), while the second layer of encryption is only useful when restricted discovery is enabled and protects against entities that do not possess valid client credentials.

First layer of encryption

The first layer of HS descriptor encryption is designed to protect descriptor confidentiality against entities who don't know the public identity key of the hidden service.

First layer encryption logic

The encryption keys and format for the first layer of encryption are generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization parameters:

     SECRET_DATA = blinded-public-key
     STRING_CONSTANT = "hsdir-superencrypted-data"

The encryption scheme in [HS-DESC-ENCRYPTION-KEYS] uses the service credential which is derived from the public identity key (see [SUBCRED]) to ensure that only entities who know the public identity key can decrypt the first descriptor layer.

The ciphertext is placed on the "superencrypted" field of the descriptor.

Before encryption the plaintext is padded with NUL bytes to the nearest multiple of 10k bytes.

First layer plaintext format

After clients decrypt the first layer of encryption, they need to parse the plaintext to get to the second layer ciphertext which is contained in the "encrypted" field.

If restricted discovery is enabled, the hidden service generates a fresh descriptor_cookie key (N_hs_desc_enc, 32 random bytes) and encrypts it using each authorized client's identity x25519 key. Authorized clients can use the descriptor cookie (N_hs_desc_enc) to decrypt the second (inner) layer of encryption. Our encryption scheme requires the hidden service to also generate an ephemeral x25519 keypair for each new descriptor.

If restricted discovery is disabled, fake data is placed in each of the fields below to obfuscate whether restricted discovery is enabled.

Here are all the supported fields:

"desc-auth-type" SP type NL

[Exactly once]

      This field contains the type of authorization used to protect the
      descriptor. The only recognized type is "x25519" and specifies the
      encryption scheme described in this section.

      If restricted discovery is disabled, the value here should be "x25519".

     "desc-auth-ephemeral-key" SP KP_hs_desc_ephem NL

      [Exactly once]

      This field contains `KP_hss_desc_enc`, an ephemeral x25519 public
      key generated by the hidden service and encoded in base64. The key
      is used by the encryption scheme below.

      If restricted discovery is disabled, the value here should be a fresh
      x25519 pubkey that will remain unused.

     "auth-client" SP client-id SP iv SP encrypted-cookie

      [At least once]

      When restricted discovery is enabled, the hidden service inserts an
      "auth-client" line for each of its authorized clients. If client
      authorization is disabled, the fields here can be populated with random
      data of the right size (that's 8 bytes for 'client-id', 16 bytes for 'iv'
      and 16 bytes for 'encrypted-cookie' all encoded with base64).

      When restricted discovery is enabled, each "auth-client" line
      contains the descriptor cookie `N_hs_desc_enc` encrypted to each
      individual client. We assume that each authorized client possesses
      a pre-shared x25519 keypair (`KP_hsc_desc_enc`) which is used to
      decrypt the descriptor cookie.

      We now describe the descriptor cookie encryption scheme. Here is what
      the hidden service computes:

          SECRET_SEED = x25519(KS_hs_desc_ephem, KP_hsc_desc_enc)
          KEYS = KDF(N_hs_subcred | SECRET_SEED, 40)
          CLIENT-ID = fist 8 bytes of KEYS
          COOKIE-KEY = last 32 bytes of KEYS

      Here is a description of the fields in the "auth-client" line:

      - The "client-id" field is CLIENT-ID from above encoded in base64.

      - The "iv" field is 16 random bytes encoded in base64.

      - The "encrypted-cookie" field contains the descriptor cookie ciphertext
        as follows and is encoded in base64:
           encrypted-cookie = STREAM(iv, COOKIE-KEY) XOR N_hs_desc_enc.

      See section [FIRST-LAYER-CLIENT-BEHAVIOR] for the client-side logic of
      how to decrypt the descriptor cookie.

    "encrypted" NL encrypted-string

     [Exactly once]

      An encrypted blob containing the second layer ciphertext, whose format is
      discussed in [HS-DESC-SECOND-LAYER] below. The blob is base64 encoded
      and enclosed in -----BEGIN MESSAGE---- and ----END MESSAGE---- wrappers.

    Compatibility note: The C Tor implementation does not include a final
    newline when generating this first-layer-plaintext section; other
    implementations MUST accept this section even if it is missing its final
    newline.  Other implementations MAY generate this section without a final
    newline themselves, to avoid being distinguishable from C tor.

Client behavior

    The goal of clients at this stage is to decrypt the "encrypted" field as
    described in [HS-DESC-SECOND-LAYER].

    If restricted discovery is enabled, authorized clients need to extract the
    descriptor cookie to proceed with decryption of the second layer as
    follows:

    An authorized client parsing the first layer of an encrypted descriptor,
    extracts the ephemeral key from "desc-auth-ephemeral-key" and calculates
    CLIENT-ID and COOKIE-KEY as described in the section above using their
    x25519 private key. The client then uses CLIENT-ID to find the right
    "auth-client" field which contains the ciphertext of the descriptor
    cookie. The client then uses COOKIE-KEY and the iv to decrypt the
    descriptor_cookie, which is used to decrypt the second layer of descriptor
    encryption as described in [HS-DESC-SECOND-LAYER].

Hiding restricted discovery data

    Hidden services should avoid leaking whether restricted discovery is
    enabled or how many authorized clients there are.

    Hence even when restricted discovery is disabled, the hidden service adds
    fake "desc-auth-type", "desc-auth-ephemeral-key" and "auth-client" lines to
    the descriptor, as described in [HS-DESC-FIRST-LAYER].

    The hidden service also avoids leaking the number of authorized clients by
    adding fake "auth-client" entries to its descriptor. Specifically,
    descriptors always contain a number of authorized clients that is a
    multiple of 16 by adding fake "auth-client" entries if needed.
    [XXX consider randomization of the value 16]

    Clients MUST accept descriptors with any number of "auth-client" lines as
    long as the total descriptor size is within the max limit of 50k (also
    controlled with a consensus parameter).

Second layer of encryption

The second layer of descriptor encryption is designed to protect descriptor confidentiality against unauthorized clients. If restricted discovery is enabled, it's encrypted using the descriptor_cookie, and contains needed information for connecting to the hidden service, like the list of its introduction points.

If restricted discovery is disabled, then the second layer of HS encryption does not offer any additional security, but is still used.

Second layer encryption keys

The encryption keys and format for the second layer of encryption are generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization parameters as follows:

     SECRET_DATA = blinded-public-key | descriptor_cookie
     STRING_CONSTANT = "hsdir-encrypted-data"

   If restricted discovery is disabled the 'descriptor_cookie' field is left blank.

   The ciphertext is placed on the "encrypted" field of the descriptor.

Second layer plaintext format

After decrypting the second layer ciphertext, clients can finally learn the list of intro points etc. The plaintext has the following format:

"create2-formats" SP formats NL

\[Exactly once\]

      A space-separated list of integers denoting CREATE2 cell HTYPEs
      (handshake types) that the server recognizes. Must include at least
      ntor as described in tor-spec.txt. See tor-spec section 5.1 for a list
      of recognized handshake types.
     "intro-auth-required" SP types NL

      [At most once]

      A space-separated list of introduction-layer authentication types; see
      section [INTRO-AUTH] for more info. A client that does not support at
      least one of these authentication types will not be able to contact the
      host. Recognized types are: 'ed25519'.
     "single-onion-service"

      [At most once]

      If present, this line indicates that the service is a Single Onion
      Service (see prop260 for more details about that type of service). This
      field has been introduced in 0.3.0 meaning 0.2.9 service don't include
      this.
     "pow-params" SP scheme SP seed-b64 SP suggested-effort
                  SP expiration-time NL

      If present, this line provides parameters for an optional proof-of-work
      client puzzle. A client that supports an offered scheme can include a
      corresponding solution in its introduction request to improve priority
      in the service's processing queue.

      Only scheme `v1` is currently defined.
      It may appear only once.

      Unknown schemes found in a descriptor must be completely ignored:
      future schemes might have a different format (in the parts of the
      Item after the "scheme"; this could even include an Object); and
      future schemes might allow repetition, and might appear in any order.

      Introduced in tor-0.4.8.1-alpha.

        scheme: The PoW system used. We call the one specified here "v1".

        seed-b64: A random seed that should be used as the input to the PoW
                  hash function. Should be 32 random bytes encoded in base64
                  without trailing padding.

        suggested-effort: An unsigned integer specifying an effort value that
                  clients should aim for when contacting the service. Can be
                  zero to mean that PoW is available but not currently
                  suggested for a first connection attempt.

        expiration-time: A timestamp in "YYYY-MM-DDTHH:MM:SS" format (iso time
                         with no space) after which the above seed expires and
                         is no longer valid as the input for PoW.

Followed by zero or more introduction points as follows (see section [NUM_INTRO_POINT] below for accepted values):

        "introduction-point" SP link-specifiers NL

          [Exactly once per introduction point at start of introduction
            point section]

          The link-specifiers is a base64 encoding of a link specifier
          block in the format described in [BUILDING-BLOCKS] above.

          As of 0.4.1.1-alpha, services include both IPv4 and IPv6 link
          specifiers in descriptors. All available addresses SHOULD be
          included in the descriptor, regardless of the address that the
          onion service actually used to connect/extend to the intro
          point.

          The client SHOULD NOT reject any LSTYPE fields which it doesn't
          recognize; instead, it should use them verbatim in its EXTEND
          request to the introduction point.

          The client SHOULD perform the basic validity checks on the link
          specifiers in the descriptor, described in `tor-spec.txt`
          section 5.1.2. These checks SHOULD NOT leak
          detailed information about the client's version, configuration,
          or consensus. (See 3.3 for service link specifier handling.)

          When connecting to the introduction point, the client SHOULD send
          this list of link specifiers verbatim, in the same order as given
          here.

          The client MAY reject the list of link specifiers if it is
          inconsistent with relay information from the directory, but SHOULD
          NOT modify it.
        "onion-key" SP "ntor" SP key NL

          [Exactly once per introduction point]

          The key is a base64 encoded curve25519 public key which is the onion
          key of the introduction point Tor node used for the ntor handshake
          when a client extends to it.
        "onion-key" SP KeyType SP key.. NL

          [Any number of times]

          Implementations should accept other types of onion keys using this
          syntax (where "KeyType" is some string other than "ntor");
          unrecognized key types should be ignored.

        "auth-key" NL certificate NL

          [Exactly once per introduction point]

          The certificate is a proposal 220 certificate wrapped in
          "-----BEGIN ED25519 CERT-----".  It contains the introduction
          point authentication key (`KP_hs_ipt_sid`), signed by
          the descriptor signing key (`KP_hs_desc_sign`).  The
          certificate type must be [09], and the signing key extension
          is mandatory.

          NOTE: This certificate was originally intended to be
          constructed the other way around: the signing and signed keys
          are meant to be reversed.  However, C tor implemented it
          backwards, and other implementations now need to do the same
          in order to conform.  (Since this section is inside the
          descriptor, which is _already_ signed by `KP_hs_desc_sign`,
          the verification aspect of this certificate serves no point in
          its current form.)
        "enc-key" SP "ntor" SP key NL

          [Exactly once per introduction point]

          The key is a base64 encoded curve25519 public key used to encrypt
          the introduction request to service. (`KP_hss_ntor`)
        "enc-key" SP KeyType SP key.. NL

          [Any number of times]

          Implementations should accept other types of onion keys using this
          syntax (where "KeyType" is some string other than "ntor");
          unrecognized key types should be ignored.

        "enc-key-cert" NL certificate NL

          [Exactly once per introduction point]

          Cross-certification of the encryption key using the descriptor
          signing key.

          For "ntor" keys, certificate is a proposal 220 certificate
          wrapped in "-----BEGIN ED25519 CERT-----" armor.

          The subject
          key is the the ed25519 equivalent of a curve25519 public
          encryption key (`KP_hss_ntor`), with the ed25519 key
          derived using the process in proposal 228 appendix A,
          and its sign bit set to zero.

          The
          signing key is the descriptor signing key (`KP_hs_desc_sign`).
          The certificate type must be [0B], and the signing-key
          extension is mandatory.

          NOTE: As with "auth-key", this certificate was intended to be
          constructed the other way around.  However, for compatibility
          with C tor, implementations need to construct it this way.  It
          serves even less point than "auth-key", however, since the
          encryption key `KP_hss_ntor` is already available from
          the `enc-key` entry.

          ALSO NOTE: Setting the sign bit of the subject key
          to zero makes the subjected unusable for verification;
          this is also a mistake preserved for compatiblility with
          C tor.

        "legacy-key" NL key NL

          [None or at most once per introduction point]
          [This field is obsolete and should never be generated; it
           is included for historical reasons only.]

          The key is an ASN.1 encoded RSA public key in PEM format used for a
          legacy introduction point as described in [LEGACY_EST_INTRO].

          This field is only present if the introduction point only supports
          legacy protocol (v2) that is <= 0.2.9 or the protocol version value
          "HSIntro 3".

        "legacy-key-cert" NL certificate NL

          [None or at most once per introduction point]
          [This field is obsolete and should never be generated; it
           is included for historical reasons only.]

          MUST be present if "legacy-key" is present.

          The certificate is a proposal 220 RSA->Ed cross-certificate wrapped
          in "-----BEGIN CROSSCERT-----" armor, cross-certifying the RSA
          public key found in "legacy-key" using the descriptor signing key.

To remain compatible with future revisions to the descriptor format, clients should ignore unrecognized lines in the descriptor. Other encryption and authentication key formats are allowed; clients should ignore ones they do not recognize.

Clients who manage to extract the introduction points of the hidden service can proceed with the introduction protocol as specified in [INTRO-PROTOCOL].

Compatibility note: At least some versions of OnionBalance do not include a final newline when generating this inner plaintext section; other implementations MUST accept this section even if it is missing its final newline.

Deriving hidden service descriptor encryption keys

In this section we present the generic encryption format for hidden service descriptors. We use the same encryption format in both encryption layers, hence we introduce two customization parameters SECRET_DATA and STRING_CONSTANT which vary between the layers.

The SECRET_DATA parameter specifies the secret data that are used during encryption key generation, while STRING_CONSTANT is merely a string constant that is used as part of the KDF.

Here is the key generation logic:

       SALT = 16 bytes from H(random), changes each time we rebuild the
              descriptor even if the content of the descriptor hasn't changed.
              (So that we don't leak whether the intro point list etc. changed)

       secret_input = SECRET_DATA | N_hs_subcred | INT_8(revision_counter)

       keys = KDF(secret_input | salt | STRING_CONSTANT, S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN)

       SECRET_KEY = first S_KEY_LEN bytes of keys
       SECRET_IV  = next S_IV_LEN bytes of keys
       MAC_KEY    = last MAC_KEY_LEN bytes of keys

   The encrypted data has the format:

       SALT       hashed random bytes from above  [16 bytes]
       ENCRYPTED  The ciphertext                  [variable]
       MAC        D_MAC of both above fields      [32 bytes]

   The final encryption format is ENCRYPTED = STREAM(SECRET_IV,SECRET_KEY) XOR Plaintext .

   Where D_MAC = H(mac_key_len | MAC_KEY | salt_len | SALT | ENCRYPTED)
   and
    mac_key_len = htonll(len(MAC_KEY))
   and
    salt_len = htonll(len(SALT)).

Number of introduction points

This section defines how many introduction points an hidden service descriptor can have at minimum, by default and the maximum:

Minimum: 0 - Default: 3 - Maximum: 20

A value of 0 would means that the service is still alive but doesn't want to be reached by any client at the moment. Note that the descriptor size increases considerably as more introduction points are added.

The reason for a maximum value of 20 is to give enough scalability to tools like OnionBalance to be able to load balance up to 120 servers (20 x 6 HSDirs) but also in order for the descriptor size to not overwhelmed hidden service directories with user defined values that could be gigantic.