Path selection and constraints

We choose the path for each new circuit before we build it, based on our current directory information. (Clients and relays use the latest directory information they have; directory authorities use their own opinions.)

We choose the exit node first, followed by the other nodes in the circuit, front to back. (In other words, for a 3-hop circuit, we first pick hop 3, then hop 1, then hop 2.)

Universal constraints

All paths we generate obey the following constraints:

  • We do not choose the same router twice for the same path.
  • We do not choose any router in the same family as another in the same path. (Two routers are in the same family if each one lists the other in the "family" entries of its descriptor.)
  • We do not choose more than one router in a given network range, which defaults to /16 for IPv4 and /32 for IPv6. (C Tor overrides this with EnforceDistinctSubnets; Arti overrides this with ipv[46]_subnet_family_prefix.)
  • The first node must be a Guard (see discussion below and in the guard specification).
  • XXXX Choosing the length

Special-purpose constraints

Additionally, we may be building circuits with one or more requests in mind. Each kind of request puts certain constraints on paths.

Most circuits need to be "Fast". For these, we only choose nodes with the Fast flag. For non-"fast" circuits, nodes without the Fast flag are eligible.

  • TODO document which circuits (do not) need to be Fast.

Similarly, some circuits need to be "Stable". For these, we only choose nodes with the Stable flag.

  • All service-side introduction circuits and all rendezvous paths should be Stable.
  • All connection requests for connections that we think will need to stay open a long time require Stable circuits. Currently, Tor decides this by examining the request's target port, and comparing it to a list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, 5190, 5222, 5223, 6667, 6697, 8300.)

Weighting node selection

For all circuits, we weight node selection according to router bandwidth.

We also weight the bandwidth of Exit and Guard flagged nodes depending on the fraction of total bandwidth that they make up and depending upon the position they are being selected for.

These weights are published in the consensus, and are computed as described in "Computing Bandwidth Weights" in the directory specification. They are:

      Wgg - Weight for Guard-flagged nodes in the guard position
      Wgm - Weight for non-flagged nodes in the guard Position
      Wgd - Weight for Guard+Exit-flagged nodes in the guard Position

      Wmg - Weight for Guard-flagged nodes in the middle Position
      Wmm - Weight for non-flagged nodes in the middle Position
      Wme - Weight for Exit-flagged nodes in the middle Position
      Wmd - Weight for Guard+Exit flagged nodes in the middle Position

      Weg - Weight for Guard flagged nodes in the exit Position
      Wem - Weight for non-flagged nodes in the exit Position
      Wee - Weight for Exit-flagged nodes in the exit Position
      Wed - Weight for Guard+Exit-flagged nodes in the exit Position

      Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
      Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
      Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
      Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes

      Wbg - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
      Wbm - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
      Wbe - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
      Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests

If any of those weights is malformed or not present in a consensus, clients proceed with the regular path selection algorithm setting the weights to the default value of 10000.

Choosing an exit

If we know what IP address we want to connect to, we can trivially tell whether a given router will support it by simulating its declared exit policy.

(DNS resolve requests are only sent to relays whose exit policy is not equivalent to "reject :".)

Because we often connect to addresses of the form hostname:port, we do not always know the target IP address when we select an exit node. In these cases, we need to pick an exit node that "might support" connections to a given address port with an unknown address. An exit node "might support" such a connection if any clause that accepts any connections to that port precedes all clauses (if any) that reject all connections to that port.

Unless requested to do so by the user, we never choose an exit node flagged as "BadExit" by more than half of the authorities who advertise themselves as listing bad exits.

User configuration

Users can alter the default behavior for path selection with configuration options.

   - If "ExitNodes" is provided, then every request requires an exit node on
     the ExitNodes list.  (If a request is supported by no nodes on that list,
     and StrictExitNodes is false, then Tor treats that request as if
     ExitNodes were not provided.)

   - "EntryNodes" and "StrictEntryNodes" behave analogously.

   - If a user tries to connect to or resolve a hostname of the form
     <target>.<servername>.exit, the request is rewritten to a request for
     <target>, and the request is only supported by the exit whose nickname
     or fingerprint is <servername>.

   - When set, "HSLayer2Nodes" and "HSLayer3Nodes" relax Tor's path
     restrictions to allow nodes in the same /16 and node family to reappear
     in the path. They also allow the guard node to be chosen as the RP, IP,
     and HSDIR, and as the hop before those positions.