Directory cache operation

All directory caches implement this section, except as noted.

Downloading consensus status documents from directory authorities

All directory caches try to keep a recent network-status consensus document to serve to clients. A cache ALWAYS downloads a network-status consensus if any of the following are true:

  • The cache has no consensus document.
  • The cache's consensus document is no longer valid.

Otherwise, the cache downloads a new consensus document at a randomly chosen time in the first half-interval after its current consensus stops being fresh. (This time is chosen at random to avoid swarming the authorities at the start of each period. The interval size is inferred from the difference between the valid-after time and the fresh-until time on the consensus.)

   [For example, if a cache has a consensus that became valid at 1:00,
    and is fresh until 2:00, that cache will fetch a new consensus at
    a random time between 2:00 and 2:30.]
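The timing rule above can be sketched in Python as follows. This is only an illustration, not Tor's implementation: the function names are hypothetical, and the uniform distribution is an assumption, since the text says only that the time is chosen at random.

```python
import random
from datetime import datetime, timedelta

def fetch_window(valid_after, fresh_until):
    """Return (start, end) of the first half-interval after freshness ends."""
    interval = fresh_until - valid_after  # interval size inferred from the consensus
    return fresh_until, fresh_until + interval / 2

def choose_fetch_time(valid_after, fresh_until, rng=random):
    """Pick a fetch time inside the window (uniform choice is an assumption)."""
    start, end = fetch_window(valid_after, fresh_until)
    offset = rng.uniform(0.0, (end - start).total_seconds())
    return start + timedelta(seconds=offset)

# The example above: valid at 1:00, fresh until 2:00 (the date is arbitrary).
start, end = fetch_window(datetime(2024, 1, 1, 1, 0), datetime(2024, 1, 1, 2, 0))
print(start.time(), end.time())  # 02:00:00 02:30:00
```

A cache with no consensus, or with one that is no longer valid, would skip this delay and fetch immediately.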

Directory caches also fetch consensus flavors from the authorities. Caches check the correctness of consensus flavors, but do not check anything about an unrecognized consensus document beyond its digest and length. Caches serve all consensus flavors from the same locations as the directory authorities.

Downloading server descriptors from directory authorities

Periodically (currently, every 10 seconds), directory caches check whether there are any specific descriptors that they do not have and that they are not currently trying to download. Caches identify these descriptors by hash in the recent network-status consensus documents.

If so, the directory cache launches requests to the authorities for these descriptors.

If one of these downloads fails, we do not try again to download that descriptor from the authority that failed to serve it, unless we receive a newer network-status consensus that lists the same descriptor.
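The bookkeeping described above might look roughly like the following Python sketch. The class and method names are hypothetical, descriptors are represented only by their digest strings, and the periodic timer and the actual network requests are left out.

```python
class DescriptorFetcher:
    """Tracks which descriptor digests still need to be requested."""

    def __init__(self):
        self.held = set()       # digests of descriptors already stored
        self.in_flight = set()  # digests with a download in progress
        self.failed = {}        # digest -> authorities that failed to serve it

    def missing(self, listed):
        """Digests listed in recent consensuses that we lack and are not fetching."""
        return set(listed) - self.held - self.in_flight

    def may_ask(self, digest, authority):
        """False if this authority already failed to serve this digest."""
        return authority not in self.failed.get(digest, set())

    def note_failure(self, digest, authority):
        """Record a failed download so the same authority is not asked again."""
        self.in_flight.discard(digest)
        self.failed.setdefault(digest, set()).add(authority)

    def note_newer_consensus(self, listed):
        """A newer consensus listing a digest lifts the retry restriction for it."""
        for digest in listed:
            self.failed.pop(digest, None)
```

In this sketch a failure is remembered per (descriptor, authority) pair, so another authority can still be asked for the same descriptor.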

Directory caches must potentially cache multiple descriptors for each router. Caches must not discard any descriptor listed by any recent consensus. If there is enough space to store additional descriptors, caches SHOULD try to hold those which clients are likely to download the most. (Currently, this is judged based on the interval for which each descriptor seemed newest.)

[XXXX define recent]

Downloading microdescriptors from directory authorities

Directory mirrors should fetch, cache, and serve each microdescriptor from the authorities.

The microdescriptors with base64 hashes <D1>, <D2>, <D3> are available at:

http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>[.z]

<Dn> are base64 encoded with trailing =s omitted for size and for consistency with the microdescriptor consensus format. -s are used instead of +s to separate items, since the + character is used in base64 encoding.

Directory mirrors should check to make sure that the microdescriptors they're about to serve match the right hashes (either the hashes from the fetch URL or the hashes from the consensus, respectively).

(NOTE: Due to Squid proxy URL limitations, at most 92 microdescriptor hashes can be retrieved in a single request.)
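As a rough illustration of the URL format and the 92-hash limit, the following Python sketch builds request URLs from raw digest bytes. The function names and the hostname are hypothetical, and the sketch assumes the caller already holds each microdescriptor digest as bytes.

```python
import base64

MAX_HASHES_PER_REQUEST = 92  # limit stated in the note above

def encode_digest(digest):
    """Base64-encode a raw digest and drop the trailing '=' padding."""
    return base64.b64encode(digest).decode("ascii").rstrip("=")

def microdesc_urls(hostname, digests, compressed=True):
    """Build one URL per batch of at most 92 digests, joined with '-'."""
    encoded = [encode_digest(d) for d in digests]
    suffix = ".z" if compressed else ""
    urls = []
    for i in range(0, len(encoded), MAX_HASHES_PER_REQUEST):
        batch = encoded[i:i + MAX_HASHES_PER_REQUEST]
        urls.append(f"http://{hostname}/tor/micro/d/{'-'.join(batch)}{suffix}")
    return urls
```

For example, requesting 100 digests from a hypothetical host `cache.example` would yield two URLs under this sketch, one carrying 92 hashes and one carrying the remaining 8.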

Downloading extra-info documents from directory authorities

Any cache that chooses to cache extra-info documents should implement this section.

Periodically, the Tor instance checks whether it is missing any extra-info documents: in other words, if it has any server descriptors with an extra-info-digest field that does not match any of the extra-info documents currently held. If so, it downloads whatever extra-info documents are missing. Caches download from authorities. We follow the same splitting and back-off rules as in section 4.2.

Consensus diffs

Instead of downloading an entire consensus, clients may download a "diff" document containing an ed-style diff from a previous consensus document. Caches (and authorities) make these diffs as they learn about new consensuses. To do so, they must store a record of older consensuses.

(Support for consensus diffs was added in 0.3.1.1-alpha, and is advertised with the DirCache protocol version "2" or later.)

Consensus diff format

Consensus diffs are formatted as follows:

The first line is "network-status-diff-version 1" NL

The second line is

"hash" SP FromDigest SP ToDigest NL

where FromDigest is the hex-encoded SHA3-256 digest of the signed part of the consensus that the diff should be applied to, and ToDigest is the hex-encoded SHA3-256 digest of the entire consensus resulting from applying the diff. (See 3.4.1 for information on which part of a consensus is signed.)

The third and subsequent lines encode the diff from FromDigest to ToDigest in a limited subset of the ed diff format, as specified in appendix E.
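A minimal Python sketch of reading the two header lines and checking ToDigest is given below. The function names are hypothetical, the ed commands are returned unparsed (see appendix E for their format), and comparing digests without regard to hex case is an assumption of this sketch.

```python
import hashlib
import string

def parse_diff(text):
    """Split a consensus diff into (from_digest, to_digest, ed_lines)."""
    lines = text.split("\n")
    if len(lines) < 2 or lines[0] != "network-status-diff-version 1":
        raise ValueError("missing or unsupported version line")
    fields = lines[1].split(" ")
    if len(fields) != 3 or fields[0] != "hash":
        raise ValueError("malformed hash line")
    for digest in fields[1:]:
        # A SHA3-256 digest is 32 bytes, i.e. 64 hex characters.
        if len(digest) != 64 or any(c not in string.hexdigits for c in digest):
            raise ValueError("digest is not 64 hex characters")
    # Remaining lines are the ed-style commands; a trailing NL leaves a final "".
    return fields[1], fields[2], lines[2:]

def result_matches(to_digest, consensus):
    """Check the patched consensus (bytes) against ToDigest, ignoring hex case."""
    return hashlib.sha3_256(consensus).hexdigest() == to_digest.lower()
```

A client would call `result_matches` on the full document produced by applying the diff, and discard the result if the digest does not match.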

Serving and requesting diffs

When downloading the current consensus, a client may include an HTTP header of the form

X-Or-Diff-From-Consensus: HASH1, HASH2, ...

where the HASH values are hex-encoded SHA3-256 digests of the signed part of one or more consensuses that the client knows about.

If a cache knows a consensus diff from one of those consensuses to the most recent consensus of the requested flavor, it may send that diff instead of the specified consensus.
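The header value could be assembled as in the Python sketch below. The function name is hypothetical; each input is assumed to be the signed part of a consensus, already extracted as bytes, and the lowercase hex produced by `hexdigest()` is an assumption about letter case that this section does not settle.

```python
import hashlib

def diff_from_header(signed_parts):
    """Return the (name, value) pair for the X-Or-Diff-From-Consensus header."""
    # One hex-encoded SHA3-256 digest per consensus the client knows about.
    digests = [hashlib.sha3_256(part).hexdigest() for part in signed_parts]
    return "X-Or-Diff-From-Consensus", ", ".join(digests)
```

The returned pair can be added to the headers of an ordinary consensus request; a cache that has no suitable diff simply answers with the full consensus.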

Caches also serve diffs from the URIs:

/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>.z
/tor/status-vote/current/consensus-<FLAVOR>/diff/<HASH>/<FPRLIST>.z

where FLAVOR is the consensus flavor, defaulting to "ns", and FPRLIST is a +-separated list of recognized authority identity fingerprints as in appendix B.
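The two URI forms above can be generated with a small helper, sketched here in Python. The function name is hypothetical, and the hash and fingerprints are passed through as already-encoded strings.

```python
def diff_uri(from_hash, fingerprints, flavor=None):
    """Build the diff URI; omit the flavor to use the default form."""
    name = "consensus" if flavor is None else f"consensus-{flavor}"
    fprlist = "+".join(fingerprints)  # FPRLIST is +-separated
    return f"/tor/status-vote/current/{name}/diff/{from_hash}/{fprlist}.z"
```

For instance, `diff_uri("AB", ["F1", "F2"])` gives `/tor/status-vote/current/consensus/diff/AB/F1+F2.z`, where `AB`, `F1`, and `F2` are placeholder values rather than real digests or fingerprints.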

Retrying failed downloads

See section 5.5 below; it applies to caches as well as clients.