Directory cache operation
All directory caches implement this section, except as noted.
Note: Directory caches are currently in the process of being renamed to “Directory Mirrors”, in order to better reflect their purpose. You might encounter both terms in the wild.
General download behavior
For directory caches, a few general rules apply, which are outlined here.
Downloads always take place over ordinary TCP and plain-text HTTP through the ordinary dirport of the authorities. A directory cache SHOULD NOT download from another directory cache and SHOULD NOT use a Tor circuit or anything else besides ordinary TCP for that.
When downloading a consensus, a directory cache tries all directory authorities sequentially (linear, non-parallel), in a randomized round-robin order. It stops after the first successful response and remembers the authority that returned it.
For all subsequent downloads associated with the consensus, this remembered authority will be used as the primary upstream source. In other words: All subsequent attempts within the context of the just obtained consensus will try to use that authority, with the failure handling described below.
If a subsequent download within the context of the consensus fails, and the failure is a network failure, the directory cache SHOULD pick a new directory authority in the same fashion as outlined above, excluding the authority that failed.
By network failure, we mean the following:
- TCP failure or an even lower error in the OSI stack.
- All 5xx HTTP error codes.
The rationale is that a request which one authority answered with
404 SHOULD receive the same answer from every other authority,
because the authorities sign the consensus together. Retrying such
a request at a different authority is therefore not expected to help.
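The selection and failover rules above can be sketched as follows. This is a minimal illustration, not the reference implementation: the function names, the `fetch` callback, and the convention that a status of `None` means "no HTTP response at all" are assumptions made for the example, and a plain shuffle stands in for the randomized round-robin order.

```python
import random
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical convention: status None means a TCP-level (or lower) failure.
Response = Tuple[Optional[int], Optional[bytes]]


def is_network_failure(status: Optional[int]) -> bool:
    """True for TCP-or-lower failures and for any 5xx HTTP status."""
    return status is None or 500 <= status <= 599


def fetch_with_failover(
    authorities: Sequence[str],
    remembered: Optional[str],
    fetch: Callable[[str], Response],
    rng: Optional[random.Random] = None,
) -> Tuple[Optional[str], Optional[int], Optional[bytes]]:
    """Try the remembered authority first, then the others in random order.

    Only network failures move on to the next authority. Any other
    response (including 404) is returned as-is.
    """
    rng = rng or random.Random()
    others = [a for a in authorities if a != remembered]
    rng.shuffle(others)
    order = ([remembered] if remembered is not None else []) + others
    for authority in order:
        status, body = fetch(authority)
        if not is_network_failure(status):
            return authority, status, body
    return None, None, None  # every authority failed at the network level
```

The returned authority would become the new remembered upstream for the current consensus. The additional retry schedule described under Retrying failed downloads is deliberately left out of this sketch.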
Please keep in mind that further rules apply for retrying failed downloads, with this section primarily trying to outline the idea behind trying to stay consistent with a single authority for the context of a single consensus. See also Retrying failed downloads.
Downloading consensus status documents from directory authorities
All directory caches try to keep a recent network-status consensus document to serve to clients. A cache ALWAYS downloads a network-status consensus if any of the following are true:
- The cache has no consensus document.
- The cache’s consensus document is no longer valid.
Otherwise, the cache downloads a new consensus document at a randomly chosen time in the first half-interval after its current consensus stops being fresh. (This time is chosen at random to avoid swarming the authorities at the start of each period. The interval size is inferred from the difference between the valid-after time and the fresh-until time on the consensus.)
[For example, if a cache has a consensus that became valid at 1:00,
and is fresh until 2:00, that cache will fetch a new consensus at
a random time between 2:00 and 2:30.]
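As a worked illustration of the scheduling rule, the sketch below picks a fetch time uniformly within the first half-interval after the consensus stops being fresh. The function name and the use of a uniform distribution are assumptions; the text above only requires that the time be chosen at random.

```python
import random
from datetime import datetime, timedelta


def next_consensus_fetch_time(valid_after: datetime,
                              fresh_until: datetime,
                              rng=random) -> datetime:
    """Pick a random time in the first half-interval after fresh_until."""
    # The interval is inferred from the consensus's own timestamps.
    interval = fresh_until - valid_after
    offset_seconds = rng.uniform(0, interval.total_seconds() / 2)
    return fresh_until + timedelta(seconds=offset_seconds)
```

With a valid-after time of 1:00 and a fresh-until time of 2:00, the interval is one hour, so the result falls between 2:00 and 2:30.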
Directory caches also fetch consensus flavors from the authorities. Caches check the correctness of consensus flavors, but do not check anything about an unrecognized consensus document beyond its digest and length. Caches serve all consensus flavors from the same locations as the directory authorities.
Downloading server descriptors from directory authorities
Periodically (currently, every 10 seconds), directory caches check whether there are any specific descriptors that they do not have and that they are not currently trying to download. Caches identify these descriptors by hash in the recent network-status consensus documents.
If so, the directory cache launches requests to the authorities for these descriptors.
If one of these downloads fails, we do not try to download that descriptor from the authority that failed to serve it again unless we receive a newer network-status consensus that lists the same descriptor.
Directory caches must potentially cache multiple descriptors for each router. Caches must not discard any descriptor listed by any recent consensus. If there is enough space to store additional descriptors, caches SHOULD try to hold those which clients are likely to download the most. (Currently, this is judged based on the interval for which each descriptor seemed newest.)
[XXXX define recent]
Downloading microdescriptors from directory authorities
Directory mirrors should fetch, cache, and serve each microdescriptor from the authorities.
The microdescriptors with base64 SHA256 hashes <D1>, <D2>, <D3> are available
at:
http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>
<Dn> are base64 encoded with trailing =s omitted for size and for
consistency with the microdescriptor consensus format. -s are used
instead of +s to separate items, since the + character is used in
base64 encoding.
Directory mirrors should check to make sure that the microdescriptors they’re about to serve match the right hashes (either the hashes from the fetch URL or the hashes from the consensus, respectively).
(NOTE: Due to Squid proxy URL length limitations, at most 92 microdescriptor hashes can be retrieved in a single request.)
Note that these URLs here have variants ending in “.z”.
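The URL construction and hash check described above might look like the following sketch. The helper names and the hostname are hypothetical, and the sketch assumes the standard base64 alphabet. It splits a list of digests into batches of at most 92 and joins each batch with `-`.

```python
import base64
import hashlib
from typing import List, Sequence

MAX_HASHES_PER_REQUEST = 92  # limit stated in the note above


def microdesc_digest(md_bytes: bytes) -> str:
    """Base64 SHA256 digest with the trailing '=' padding omitted."""
    raw = hashlib.sha256(md_bytes).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def microdesc_urls(hostname: str, digests: Sequence[str]) -> List[str]:
    """Build one fetch URL per batch of at most 92 digests."""
    urls = []
    for start in range(0, len(digests), MAX_HASHES_PER_REQUEST):
        batch = digests[start:start + MAX_HASHES_PER_REQUEST]
        urls.append(f"http://{hostname}/tor/micro/d/" + "-".join(batch))
    return urls


def matches_expected(md_bytes: bytes, expected_digest: str) -> bool:
    """Check a microdescriptor against the digest it was requested by."""
    return microdesc_digest(md_bytes) == expected_digest
```

A mirror would run a check like `matches_expected` on each microdescriptor before serving it, using either the digest from the fetch URL or the one listed in the consensus.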
Downloading extra-info documents from directory authorities
Any cache that chooses to cache extra-info documents should implement this section.
Periodically, the Tor instance checks whether it is missing any extra-info documents: in other words, if it has any server descriptors with an extra-info-digest field that does not match any of the extra-info documents currently held. If so, it downloads whatever extra-info documents are missing. Caches download from authorities. We follow the same splitting and back-off rules as in section 4.2.
Consensus diffs
Instead of downloading an entire consensus, clients may download a “diff” document containing an ed-style diff from a previous consensus document. Caches (and authorities) make these diffs as they learn about new consensuses. To do so, they must store a record of older consensuses.
Support for consensus diffs was added in 0.3.1.1-alpha, and is
advertised with the subprotocol “DirCache=2” (DIRCACHE_CONSDIFF).
Consensus diff format
Consensus diffs are formatted as follows:
The first line is “network-status-diff-version 1” NL
The second line is
“hash” SP FromDigest SP ToDigest NL
where FromDigest is the hex-encoded SHA3-256 digest of the signed part of the consensus that the diff should be applied to, and ToDigest is the hex-encoded SHA3-256 digest of the entire consensus resulting from applying the diff. (See 3.4.1 for information on which part of a consensus is signed.)
The third and subsequent lines encode the diff from FromDigest to ToDigest in a limited subset of the ed diff format, as specified in appendix E.
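A minimal sketch of parsing the two header lines follows. The function name and error handling are illustrative assumptions. The ed-style body is returned unparsed, since its grammar is defined in appendix E. The sketch accepts either hex case, which is also an assumption.

```python
import re
from typing import Tuple

# A hex-encoded SHA3-256 digest is 64 hex characters (32 bytes).
_HEX_SHA3_256 = re.compile(r"^[0-9A-Fa-f]{64}$")


def parse_consensus_diff(text: str) -> Tuple[str, str, str]:
    """Return (FromDigest, ToDigest, ed_body) or raise ValueError."""
    lines = text.split("\n")
    if len(lines) < 2 or lines[0] != "network-status-diff-version 1":
        raise ValueError("missing or unsupported diff version line")
    parts = lines[1].split(" ")
    if len(parts) != 3 or parts[0] != "hash":
        raise ValueError("malformed hash line")
    from_digest, to_digest = parts[1], parts[2]
    for digest in (from_digest, to_digest):
        if not _HEX_SHA3_256.match(digest):
            raise ValueError("digest is not 64 hex characters")
    # Everything from the third line on is the ed-style diff itself.
    return from_digest, to_digest, "\n".join(lines[2:])
```

A consumer would compare FromDigest against the digest of the signed part of its stored consensus before applying the body, and compare ToDigest against the digest of the result afterwards.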
Serving and requesting diffs
When downloading the current consensus, a client may include an HTTP header of the form
X-Or-Diff-From-Consensus: HASH1, HASH2, …
where the HASH values are hex-encoded SHA3-256 digests of the signed part of one or more consensuses that the client knows about.
If a cache knows a consensus diff from one of those consensuses to the most recent consensus of the requested flavor, it may send that diff instead of the specified consensus.
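The header exchange can be illustrated with the sketch below. Both helper names are hypothetical, as is the in-memory dictionary of diffs keyed by FromDigest. Normalizing digests to uppercase before comparison is an assumption made for the example, not something stated above.

```python
import hashlib
from typing import Dict, Iterable, Optional

HEADER_NAME = "X-Or-Diff-From-Consensus"


def diff_request_header(signed_parts: Iterable[bytes]) -> Dict[str, str]:
    """Client side: list the digests of the consensuses we already hold."""
    digests = [hashlib.sha3_256(part).hexdigest().upper()
               for part in signed_parts]
    return {HEADER_NAME: ", ".join(digests)}


def select_diff(header_value: str,
                diffs_by_from_digest: Dict[str, bytes]) -> Optional[bytes]:
    """Cache side: return the first diff matching an offered digest."""
    for candidate in header_value.split(","):
        key = candidate.strip().upper()
        if key in diffs_by_from_digest:
            return diffs_by_from_digest[key]
    return None  # no usable diff; serve the full consensus instead
```

When `select_diff` returns `None`, the cache falls back to sending the requested consensus in full.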
Caches also serve diffs from the URIs:
/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
/tor/status-vote/current/consensus-<FLAVOR>/diff/<HASH>/<FPRLIST>
where FLAVOR is the consensus flavor, defaulting to “ns”, and FPRLIST is a +-separated list of recognized authority identity fingerprints as in appendix B.
Note that these URLs here have variants ending in “.z”.
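For illustration, the diff URIs above could be assembled as in this sketch. The function name is hypothetical, and using the unflavored path when the flavor is “ns” is an assumption based on “ns” being the default.

```python
from typing import Sequence


def consensus_diff_path(from_hash: str,
                        fingerprints: Sequence[str],
                        flavor: str = "ns",
                        compressed: bool = False) -> str:
    """Build the request path for a consensus diff."""
    # Assumption: the default flavor uses the path without a suffix.
    name = "consensus" if flavor == "ns" else f"consensus-{flavor}"
    fpr_list = "+".join(fingerprints)  # FPRLIST is +-separated
    path = f"/tor/status-vote/current/{name}/diff/{from_hash}/{fpr_list}"
    return path + ".z" if compressed else path
```

Passing `compressed=True` selects the variant ending in “.z”.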
Retrying failed downloads
See section 5.5 below; it applies to caches as well as clients.
Also, General download behavior describes some characteristics specific to directory caches that partially apply to retrying failed downloads, but that are mainly concerned with selecting a single upstream directory authority for the duration of an active consensus.