Hidden services: overview and preliminaries

Hidden services aim to provide responder anonymity for bidirectional stream-based communication on the Tor network. Unlike regular Tor connections, where the connection initiator receives anonymity but the responder does not, hidden services attempt to provide bidirectional anonymity.

Participants:

Operator -- A person running a hidden service

      Host, "Server" -- The Tor software run by the operator to provide
         a hidden service.

      User -- A person contacting a hidden service.

      Client -- The Tor software running on the User's computer

      Hidden Service Directory (HSDir) -- A Tor node that hosts signed
        statements from hidden service hosts so that users can make
        contact with them.

      Introduction Point -- A Tor node that accepts connection requests
        for hidden services and anonymously relays those requests to the
        hidden service.

      Rendezvous Point -- A Tor node to which clients and servers
        connect and which relays traffic between them.

Improvements over previous versions

Here is a list of improvements of this proposal over the legacy hidden services:

a) Better crypto (replaced SHA1/DH/RSA1024 with SHA3/ed25519/curve25519) b) Improved directory protocol leaking less to directory servers. c) Improved directory protocol with smaller surface for targeted attacks. d) Better onion address security against impersonation. e) More extensible introduction/rendezvous protocol. f) Offline keys for onion services g) Advanced client authorization

Notation and vocabulary

Unless specified otherwise, all multi-octet integers are big-endian.

We write sequences of bytes in two ways:

     1. A sequence of two-digit hexadecimal values in square brackets,
        as in [AB AD 1D EA].

     2. A string of characters enclosed in quotes, as in "Hello". The
        characters in these strings are encoded in their ascii
        representations; strings are NOT nul-terminated unless
        explicitly described as NUL terminated.

   We use the words "byte" and "octet" interchangeably.

   We use the vertical bar | to denote concatenation.

We use INT_N(val) to denote the network (big-endian) encoding of the unsigned integer "val" in N bytes. For example, INT_4(1337) is [00 00 05 39]. Values are truncated like so: val % (2 ^ (N * 8)). For example, INT_4(42) is 42 % 4294967296 (32 bit).

Cryptographic building blocks

This specification uses the following cryptographic building blocks:

      * A pseudorandom number generator backed by a strong entropy source.
        The output of the PRNG should always be hashed before being posted on
        the network to avoid leaking raw PRNG bytes to the network
        (see [PRNG-REFS]).

      * A stream cipher STREAM(iv, k) where iv is a nonce of length
        S_IV_LEN bytes and k is a key of length S_KEY_LEN bytes.

      * A public key signature system SIGN_KEYGEN()->seckey, pubkey;
        SIGN_SIGN(seckey,msg)->sig; and SIGN_CHECK(pubkey, sig, msg) ->
        { "OK", "BAD" }; where secret keys are of length SIGN_SECKEY_LEN
        bytes, public keys are of length SIGN_PUBKEY_LEN bytes, and
        signatures are of length SIGN_SIG_LEN bytes.

        This signature system must also support key blinding operations
        as discussed in appendix [KEYBLIND] and in section [SUBCRED]:
        SIGN_BLIND_SECKEY(seckey, blind)->seckey2 and
        SIGN_BLIND_PUBKEY(pubkey, blind)->pubkey2 .

      * A public key agreement system "PK", providing
        PK_KEYGEN()->seckey, pubkey; PK_VALID(pubkey) -> {"OK", "BAD"};
        and PK_HANDSHAKE(seckey, pubkey)->output; where secret keys are
        of length PK_SECKEY_LEN bytes, public keys are of length
        PK_PUBKEY_LEN bytes, and the handshake produces outputs of
        length PK_OUTPUT_LEN bytes.

      * A cryptographic hash function H(d), which should be preimage and
        collision resistant. It produces hashes of length HASH_LEN
        bytes.

      * A cryptographic message authentication code MAC(key,msg) that
        produces outputs of length MAC_LEN bytes.

      * A key derivation function KDF(message, n) that outputs n bytes.

   As a first pass, I suggest:

      * Instantiate STREAM with AES256-CTR.

      * Instantiate SIGN with Ed25519 and the blinding protocol in
        [KEYBLIND].

      * Instantiate PK with Curve25519.

      * Instantiate H with SHA3-256.

      * Instantiate KDF with SHAKE-256.

      * Instantiate MAC(key=k, message=m) with H(k_len | k | m),
        where k_len is htonll(len(k)).

When we need a particular MAC key length below, we choose MAC_KEY_LEN=32 (256 bits).

For legacy purposes, we specify compatibility with older versions of the Tor introduction point and rendezvous point protocols. These used RSA1024, DH1024, AES128, and SHA1, as discussed in rend-spec.txt.

As in [proposal 220], all signatures are generated not over strings themselves, but over those strings prefixed with a distinguishing value.

Protocol building blocks

In sections below, we need to transmit the locations and identities of Tor nodes. We do so in the link identification format used by EXTEND2 messages in the Tor protocol.

         NSPEC      (Number of link specifiers)   [1 byte]
         NSPEC times:
           LSTYPE (Link specifier type)           [1 byte]
           LSLEN  (Link specifier length)         [1 byte]
           LSPEC  (Link specifier)                [LSLEN bytes]

Link specifier types are as described in tor-spec.txt. Every set of link specifiers SHOULD include at minimum specifiers of type [00] (TLS-over-TCP, IPv4), [02] (legacy node identity) and [03] (ed25519 identity key). Sets of link specifiers without these three types SHOULD be rejected.

As of 0.4.1.1-alpha, Tor includes both IPv4 and IPv6 link specifiers in v3 onion service protocol link specifier lists. All available addresses SHOULD be included as link specifiers, regardless of the address that Tor actually used to connect/extend to the remote relay.

We also incorporate Tor's circuit extension handshakes, as used in the CREATE2 and CREATED2 cells described in tor-spec.txt. In these handshakes, a client who knows a public key for a server sends a message and receives a message from that server. Once the exchange is done, the two parties have a shared set of forward-secure key material, and the client knows that nobody else shares that key material unless they control the secret key corresponding to the server's public key.

Assigned relay message types

These relay message types are reserved for use in the hidden service protocol.

32 -- RELAY_COMMAND_ESTABLISH_INTRO

            Sent from hidden service host to introduction point;
            establishes introduction point. Discussed in
            [REG_INTRO_POINT].

      33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS

            Sent from client to rendezvous point; creates rendezvous
            point. Discussed in [EST_REND_POINT].

      34 -- RELAY_COMMAND_INTRODUCE1

            Sent from client to introduction point; requests
            introduction. Discussed in [SEND_INTRO1]

      35 -- RELAY_COMMAND_INTRODUCE2

            Sent from introduction point to hidden service host; requests
            introduction. Same format as INTRODUCE1. Discussed in
            [FMT_INTRO1] and [PROCESS_INTRO2]

      36 -- RELAY_COMMAND_RENDEZVOUS1

            Sent from hidden service host to rendezvous point;
            attempts to join host's circuit to
            client's circuit. Discussed in [JOIN_REND]

      37 -- RELAY_COMMAND_RENDEZVOUS2

            Sent from rendezvous point to client;
            reports join of host's circuit to
            client's circuit. Discussed in [JOIN_REND]

      38 -- RELAY_COMMAND_INTRO_ESTABLISHED

            Sent from introduction point to hidden service host;
            reports status of attempt to establish introduction
            point. Discussed in [INTRO_ESTABLISHED]

      39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED

            Sent from rendezvous point to client; acknowledges
            receipt of ESTABLISH_RENDEZVOUS message. Discussed in
            [EST_REND_POINT]

      40 -- RELAY_COMMAND_INTRODUCE_ACK

            Sent from introduction point to client; acknowledges
            receipt of INTRODUCE1 message and reports success/failure.
            Discussed in [INTRO_ACK]

Acknowledgments

This design includes ideas from many people, including

     Christopher Baines,
     Daniel J. Bernstein,
     Matthew Finkel,
     Ian Goldberg,
     George Kadianakis,
     Aniket Kate,
     Tanja Lange,
     Robert Ransom,
     Roger Dingledine,
     Aaron Johnson,
     Tim Wilson-Brown ("teor"),
     special (John Brooks),
     s7r

It's based on Tor's original hidden service design by Roger Dingledine, Nick Mathewson, and Paul Syverson, and on improvements to that design over the years by people including

     Tobias Kamm,
     Thomas Lauterbach,
     Karsten Loesing,
     Alessandro Preite Martinez,
     Robert Ransom,
     Ferdinand Rieger,
     Christoph Weingarten,
     Christian Wilms,

We wouldn't be able to do any of this work without good attack designs from researchers including

     Alex Biryukov,
     Lasse Ă˜verlier,
     Ivan Pustogarov,
     Paul Syverson,
     Ralf-Philipp Weinmann,

   See [ATTACK-REFS] for their papers.

   Several of these ideas have come from conversations with

      Christian Grothoff,
      Brian Warner,
      Zooko Wilcox-O'Hearn,

And if this document makes any sense at all, it's thanks to editing help from

      Matthew Finkel,
      George Kadianakis,
      Peter Palfrader,
      Tim Wilson-Brown ("teor"),

[XXX Acknowledge the huge bunch of people working on 8106.] [XXX Acknowledge the huge bunch of people working on 8244.]

Please forgive me if I've missed you; please forgive me if I've misunderstood your best ideas here too.