We choose the path for each new circuit before we build it, based on our current directory information. (Clients and relays use the latest directory information they have; directory authorities use their own opinions.)
We choose the exit node first, followed by the other nodes in the circuit, front to back. (In other words, for a 3-hop circuit, we first pick hop 3, then hop 1, then hop 2.)
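The selection order can be illustrated with a minimal sketch. The function name `selection_order` and the 1-based hop numbering are assumptions made for this example only; they are not taken from any Tor implementation.

```python
def selection_order(path_len: int) -> list[int]:
    """Return 1-based hop positions in the order they are picked.

    Hypothetical helper: the exit (last hop) is picked first, then the
    remaining hops are picked front to back.
    """
    if path_len < 1:
        raise ValueError("a path needs at least one hop")
    return [path_len] + list(range(1, path_len))
```

For a 3-hop circuit this yields hop 3, then hop 1, then hop 2, matching the parenthetical above.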
All paths we generate obey the following constraints:
- We do not choose the same router twice for the same path.
- We do not choose any router in the same family as another in the same path. (Two routers are in the same family if each one lists the other in the "family" entries of its descriptor.)
- We do not choose more than one router in a given network range,
which defaults to /16 for IPv4 and /32 for IPv6.
(C Tor overrides this with
EnforceDistinctSubnets; Arti overrides this with
- The first node must be a Guard (see discussion below and in the guard specification).
- XXXX Choosing the length
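As a rough illustration of the identity, family, and network-range constraints above, the following sketch checks whether a candidate router may join a partially built path. The `Router` type, its field names, and the helper names are hypothetical; the sketch assumes one address per router and does not model the Guard requirement or path length.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    # Hypothetical, simplified view of a router for this sketch.
    fingerprint: str
    addr: str
    family: frozenset = frozenset()  # fingerprints listed in its descriptor


def same_family(a: Router, b: Router) -> bool:
    # Family membership must be mutual: each router lists the other.
    return b.fingerprint in a.family and a.fingerprint in b.family


def same_network_range(a: Router, b: Router,
                       v4_prefix: int = 16, v6_prefix: int = 32) -> bool:
    ip_a = ipaddress.ip_address(a.addr)
    ip_b = ipaddress.ip_address(b.addr)
    if ip_a.version != ip_b.version:
        return False
    prefix = v4_prefix if ip_a.version == 4 else v6_prefix
    # strict=False lets us build the enclosing network from a host address.
    net = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ip_b in net


def can_add(path: list, candidate: Router) -> bool:
    """Return True if candidate conflicts with no router already in path."""
    for r in path:
        if r.fingerprint == candidate.fingerprint:
            return False
        if same_family(r, candidate):
            return False
        if same_network_range(r, candidate):
            return False
    return True
```

Note that a one-sided family listing does not create a conflict in this sketch, since the definition above requires each router to list the other.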
Additionally, we may be building circuits with one or more requests in mind. Each kind of request puts certain constraints on paths.
Most circuits need to be "Fast". For these, we only choose nodes with the Fast flag. For non-"Fast" circuits, nodes without the Fast flag are also eligible.
- TODO document which circuits (do not) need to be Fast.
Similarly, some circuits need to be "Stable". For these, we only choose nodes with the Stable flag.
- All service-side introduction circuits and all rendezvous paths should be Stable.
- All connection requests for connections that we think will need to stay open a long time require Stable circuits. Currently, Tor decides this by examining the request's target port, and comparing it to a list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, 5190, 5222, 5223, 6667, 6697, 8300.)
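The flag requirements for a connection request can be sketched as follows. The function names and the `need_fast` parameter are assumptions for illustration; the port list is the default quoted above.

```python
# Default "long-lived" ports, as listed in the text above.
LONG_LIVED_PORTS = frozenset(
    {21, 22, 706, 1863, 5050, 5190, 5222, 5223, 6667, 6697, 8300}
)


def required_flags(target_port: int, need_fast: bool = True,
                   long_lived_ports: frozenset = LONG_LIVED_PORTS) -> set:
    """Hypothetical helper: flags every node on the circuit must have."""
    flags = set()
    if need_fast:
        flags.add("Fast")
    if target_port in long_lived_ports:
        # Connections to these ports are expected to stay open a long time.
        flags.add("Stable")
    return flags


def eligible(node_flags: set, required: set) -> bool:
    # A node qualifies only if it carries every required flag.
    return required <= set(node_flags)
```

For example, a request to port 22 would require both Fast and Stable nodes here, while a request to port 80 would require only Fast.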
For all circuits, we weight node selection according to router bandwidth.
We also weight the bandwidth of Exit-flagged and Guard-flagged nodes according to the fraction of total bandwidth that they make up and the position they are being selected for.
These weights are published in the consensus, and are computed as described in "Computing Bandwidth Weights" in the directory specification. They are:
- Wgg - Weight for Guard-flagged nodes in the guard position
- Wgm - Weight for non-flagged nodes in the guard position
- Wgd - Weight for Guard+Exit-flagged nodes in the guard position
- Wmg - Weight for Guard-flagged nodes in the middle position
- Wmm - Weight for non-flagged nodes in the middle position
- Wme - Weight for Exit-flagged nodes in the middle position
- Wmd - Weight for Guard+Exit-flagged nodes in the middle position
- Weg - Weight for Guard-flagged nodes in the exit position
- Wem - Weight for non-flagged nodes in the exit position
- Wee - Weight for Exit-flagged nodes in the exit position
- Wed - Weight for Guard+Exit-flagged nodes in the exit position
- Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
- Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
- Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
- Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
- Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
- Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
- Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
- Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
If any of those weights is malformed or not present in a consensus, clients proceed with the regular path selection algorithm, setting the weights to the default value of 10000.
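A minimal sketch of position-dependent bandwidth weighting follows. Several things here are assumptions made for illustration: the node representation (a dict with `bandwidth` and `flags`), the scale of 10000 used to turn a weight into a multiplier, treating non-integer or negative values as "malformed", and applying the default per weight rather than to the whole set. Candidates are assumed to be already filtered for the position being filled.

```python
import random

DEFAULT_WEIGHT = 10000
WEIGHT_SCALE = 10000  # assumption: a weight of 10000 means "full bandwidth"


def flag_class(flags: set) -> str:
    """Map a node's flags to the last letter of the weight name."""
    is_guard = "Guard" in flags
    is_exit = "Exit" in flags
    if is_guard and is_exit:
        return "d"
    if is_guard:
        return "g"
    if is_exit:
        return "e"
    return "m"


def position_weight(weights: dict, position: str, flags: set) -> int:
    """Look up W<position><class>; position is 'g', 'm', or 'e'."""
    value = weights.get("W" + position + flag_class(flags))
    # Assumption: anything that is not a non-negative integer is malformed.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        return DEFAULT_WEIGHT
    return value


def weighted_bandwidth(node: dict, position: str, weights: dict) -> float:
    w = position_weight(weights, position, node["flags"])
    return node["bandwidth"] * w / WEIGHT_SCALE


def choose_node(nodes: list, position: str, weights: dict, rng=random) -> dict:
    """Pick one node with probability proportional to weighted bandwidth."""
    scores = [weighted_bandwidth(n, position, weights) for n in nodes]
    if sum(scores) <= 0:
        raise ValueError("no candidate has positive weighted bandwidth")
    return rng.choices(nodes, weights=scores, k=1)[0]
```

With `Wgd` set to 0, for instance, a Guard+Exit-flagged node would never be picked for the guard position in this sketch, regardless of its bandwidth.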
If we know what IP address we want to connect to, we can trivially tell whether a given router will support it by simulating its declared exit policy.
(DNS resolve requests are only sent to relays whose exit policy is not equivalent to "reject *:*".)
Because we often connect to addresses of the form hostname:port, we do not always know the target IP address when we select an exit node. In these cases, we need to pick an exit node that "might support" connections to a given port at an unknown address. An exit node "might support" such a connection if any clause that accepts any connections to that port precedes all clauses (if any) that reject all connections to that port.
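The "might support" rule can be sketched as a single pass over the policy clauses in order. The `Clause` type below is a hypothetical simplification of an exit-policy line, and the sketch assumes that only the literal wildcard address `*` counts as rejecting all connections to a port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    # Hypothetical, simplified exit-policy clause for this sketch.
    accept: bool     # True for "accept", False for "reject"
    address: str     # "*" or an address pattern such as "10.0.0.0/8"
    port_lo: int
    port_hi: int

    def covers_port(self, port: int) -> bool:
        return self.port_lo <= port <= self.port_hi


def might_support(policy: list, port: int) -> bool:
    """Return True if the policy might allow `port` at an unknown address."""
    for clause in policy:
        if not clause.covers_port(port):
            continue
        if clause.accept:
            # Some connections to this port are accepted before any
            # clause that rejects all of them.
            return True
        if clause.address == "*":
            # All connections to this port are rejected before any accept.
            return False
        # A reject limited to some addresses does not settle the question.
    return False
```

Under this reading, a policy that rejects a private range, accepts port 80 everywhere, and then rejects everything else "might support" port 80 but not port 25.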
Unless requested to do so by the user, we never choose an exit node flagged as "BadExit" by more than half of the authorities who advertise themselves as listing bad exits.
Users can alter the default behavior for path selection with configuration options.
- If "ExitNodes" is provided, then every request requires an exit node on the ExitNodes list. (If a request is supported by no nodes on that list, and StrictExitNodes is false, then Tor treats that request as if ExitNodes were not provided.)
- "EntryNodes" and "StrictEntryNodes" behave analogously.
- If a user tries to connect to or resolve a hostname of the form <target>.<servername>.exit, the request is rewritten to a request for <target>, and the request is only supported by the exit whose nickname or fingerprint is <servername>.
- When set, "HSLayer2Nodes" and "HSLayer3Nodes" relax Tor's path restrictions to allow nodes in the same /16 and node family to reappear in the path. They also allow the guard node to be chosen as the RP, IP, and HSDIR, and as the hop before those positions.
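Two of the options above lend themselves to a short sketch: the ExitNodes fallback and the `.exit` hostname rewrite. Both function names are hypothetical, nodes are represented as plain strings, and the parser handles only the `<target>.<servername>.exit` form described above.

```python
def restrict_to_exit_nodes(supporting: list, exit_nodes: set,
                           strict: bool) -> list:
    """Apply an ExitNodes list to the exits that support a request.

    `supporting` holds the exits able to serve the request; `strict`
    stands in for StrictExitNodes.
    """
    if not exit_nodes:
        return list(supporting)
    preferred = [n for n in supporting if n in exit_nodes]
    if preferred or strict:
        return preferred
    # No listed node supports the request and strictness is off:
    # behave as if ExitNodes had not been provided.
    return list(supporting)


def parse_dot_exit(hostname: str):
    """Split '<target>.<servername>.exit' into (target, servername).

    Returns None when the hostname does not have that form.
    """
    suffix = ".exit"
    if not hostname.lower().endswith(suffix):
        return None
    stem = hostname[:-len(suffix)]
    # The server name is the last label before ".exit"; the target may
    # itself contain dots.
    target, sep, servername = stem.rpartition(".")
    if not sep or not target or not servername:
        return None
    return target, servername
```

A request for `www.example.com.myrelay.exit` would be split into the target `www.example.com` and the server name `myrelay`, after which only the exit matching that name would be considered.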