Outline

There is a small set (say, around 5-10) of semi-trusted directory authorities. A default list of authorities is shipped with the Tor software. Users can change this list, but are encouraged not to do so, in order to avoid partitioning attacks.

Every authority has a very-secret, long-term "Authority Identity Key". This is stored encrypted and/or offline, and is used to sign "key certificate" documents. Every key certificate contains a medium-term (3-12 months) "authority signing key", that is used by the authority to sign other directory information. (Note that the authority identity key is distinct from the router identity key that the authority uses in its role as an ordinary router.)

Routers periodically upload signed "routers descriptors" to the directory authorities describing their keys, capabilities, and other information. Routers may also upload signed "extra-info documents" containing information that is not required for the Tor protocol. Directory authorities serve server descriptors indexed by router identity, or by hash of the descriptor.

Routers may act as directory caches to reduce load on the directory authorities. They announce this in their descriptors.

Periodically, each directory authority generates a view of the current descriptors and status for known routers. They send a signed summary of this view (a "status vote") to the other authorities. The authorities compute the result of this vote, and sign a "consensus status" document containing the result of the vote.

Directory caches download, cache, and re-serve consensus documents.

Clients, directory caches, and directory authorities all use consensus documents to find out when their list of routers is out-of-date. (Directory authorities also use vote statuses.) If it is, they download any missing server descriptors. Clients download missing descriptors from caches; caches and authorities download from authorities. Descriptors are downloaded by the hash of the descriptor, not by the relay's identity key: this prevents directory servers from attacking clients by giving them descriptors nobody else uses.

All directory information is uploaded and downloaded with HTTP.

What's different from version 2?

Clients used to download multiple network status documents, corresponding roughly to "status votes" above. They would compute the result of the vote on the client side.

Authorities used to sign documents using the same private keys they used for their roles as routers. This forced them to keep these extremely sensitive keys in memory unencrypted.

All of the information in extra-info documents used to be kept in the main descriptors.

Document meta-format

Server descriptors, directories, and running-routers documents all obey the following lightweight extensible information format.

The highest level object is a Document, which consists of one or more Items. Every Item begins with a KeywordLine, followed by zero or more Objects. A KeywordLine begins with a Keyword, optionally followed by whitespace and more non-newline characters, and ends with a newline. A Keyword is a sequence of one or more characters in the set [A-Za-z0-9-], but may not start with -. An Object is a block of encoded data in pseudo-Privacy-Enhanced-Mail (PEM) style format: that is, lines of encoded data MAY be wrapped by inserting an ascii linefeed ("LF", also called newline, or "NL" here) character (cf. RFC 4648 ยง3.1). When line wrapping, implementations MUST wrap lines at 64 characters. Upon decoding, implementations MUST ignore and discard all linefeed characters.

More formally:

NL = The ascii LF character (hex value 0x0a).
Document ::= (Item | NL)+
Item ::= KeywordLine Object?
KeywordLine ::= Keyword (WS Argument)*NL
Keyword = KeywordStart KeywordChar*
KeywordStart ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9'
KeywordChar ::= KeywordStart | '-'
Argument := ArgumentChar+
ArgumentChar ::= any graphical printing ASCII character.
WS = (SP | TAB)+
Object ::= BeginLine Base64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword (" " Keyword)*"-----" NL
EndLine ::= "-----END " Keyword (" " Keyword)* "-----" NL

A Keyword may not be -----BEGIN.

The BeginLine and EndLine of an Object must use the same keyword.

When interpreting a Document, software MUST ignore any KeywordLine that starts with a keyword it doesn't recognize; future implementations MUST NOT require current clients to understand any KeywordLine not currently described.

Other implementations that want to extend Tor's directory format MAY introduce their own items. The keywords for extension items SHOULD start with the characters "x-" or "X-", to guarantee that they will not conflict with keywords used by future versions of Tor.

In our document descriptions below, we tag Items with a multiplicity in brackets. Possible tags are:

    "At start, exactly once": These items MUST occur in every instance of
      the document type, and MUST appear exactly once, and MUST be the
      first item in their documents.

    "Exactly once": These items MUST occur exactly one time in every
      instance of the document type.

    "At end, exactly once": These items MUST occur in every instance of
      the document type, and MUST appear exactly once, and MUST be the
      last item in their documents.

    "At most once": These items MAY occur zero or one times in any
      instance of the document type, but MUST NOT occur more than once.

    "Any number": These items MAY occur zero, one, or more times in any
      instance of the document type.

    "Once or more": These items MUST occur at least once in any instance
      of the document type, and MAY occur more.

For forward compatibility, each item MUST allow extra arguments at the end of the line unless otherwise noted. So if an item's description below is given as:

"thing" int int int NL

then implementations SHOULD accept this string as well:

"thing 5 9 11 13 16 12" NL

but not this string:

"thing 5" NL

and not this string:

       "thing 5 10 thing" NL
   .

Whenever an item DOES NOT allow extra arguments, we will tag it with "no extra arguments".

Signing documents

Every signable document below is signed in a similar manner, using a given "Initial Item", a final "Signature Item", a digest algorithm, and a signing key.

The Initial Item must be the first item in the document.

The Signature Item has the following format:

<signature item keyword> [arguments] NL SIGNATURE NL

The "SIGNATURE" Object contains a signature (using the signing key) of the PKCS#1 1.5 padded digest of the entire document, taken from the beginning of the Initial item, through the newline after the Signature Item's keyword and its arguments.

The signature does not include the algorithmIdentifier specified in PKCS #1.

Unless specified otherwise, the digest algorithm is SHA-1.

All documents are invalid unless signed with the correct signing key.

The "Digest" of a document, unless stated otherwise, is its digest as signed by this signature scheme.

Voting timeline

Every consensus document has a "valid-after" (VA) time, a "fresh-until" (FU) time and a "valid-until" (VU) time. VA MUST precede FU, which MUST in turn precede VU. Times are chosen so that every consensus will be "fresh" until the next consensus becomes valid, and "valid" for a while after. At least 3 consensuses should be valid at any given time.

The timeline for a given consensus is as follows:

VA-DistSeconds-VoteSeconds: The authorities exchange votes. Each authority uploads their vote to all other authorities.

VA-DistSeconds-VoteSeconds/2: The authorities try to download any votes they don't have.

Authorities SHOULD also reject any votes that other authorities try to upload after this time. (0.4.4.1-alpha was the first version to reject votes in this way.)

Note: Refusing late uploaded votes minimizes the chance of a consensus split, particular when authorities are under bandwidth pressure. If an authority is struggling to upload its vote, and finally uploads to a fraction of authorities after this period, they will compute a consensus different from the others. By refusing uploaded votes after this time, we increase the likelihood that most authorities will use the same vote set.

Rejecting late uploaded votes does not fix the problem entirely. If some authorities are able to download a specific vote, but others fail to do so, then there may still be a consensus split. However, this change does remove one common cause of consensus splits.

VA-DistSeconds: The authorities calculate the consensus and exchange signatures. (This is the earliest point at which anybody can possibly get a given consensus if they ask for it.)

VA-DistSeconds/2: The authorities try to download any signatures they don't have.

VA: All authorities have a multiply signed consensus.

   VA ... FU: Caches download the consensus.  (Note that since caches have
        no way of telling what VA and FU are until they have downloaded
        the consensus, they assume that the present consensus's VA is
        equal to the previous one's FU, and that its FU is one interval after
        that.)

   FU: The consensus is no longer the freshest consensus.

   FU ... (the current consensus's VU): Clients download the consensus.
        (See note above: clients guess that the next consensus's FU will be
        two intervals after the current VA.)

VU: The consensus is no longer valid; clients should continue to try to download a new consensus if they have not done so already.

VU + 24 hours: Clients will no longer use the consensus at all.

VoteSeconds and DistSeconds MUST each be at least 20 seconds; FU-VA and VU-FU MUST each be at least 5 minutes.