Scaling bandwidths

Scaling requirements

  Tor accepts zero bandwidths, but they trigger bugs in older Tor
  implementations. Therefore, scaling methods SHOULD perform the
  following checks:
   * If the total bandwidth is zero, all relays should be given equal
   bandwidths.
   * If the scaled bandwidth is zero, it should be rounded up to one.
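
The two checks above can be sketched as follows. This is a minimal illustration, not part of any generator: the function name `apply_zero_checks`, the dict-based input, and the choice of 1 as the equal bandwidth are assumptions made for the example.

```python
def apply_zero_checks(scaled_bandwidths):
    """Apply the zero-bandwidth checks to a dict of relay -> scaled bandwidth.

    Assumes non-negative integer bandwidths. The equal value used when
    the total is zero (1) is an assumption; the text only requires that
    all relays get the same bandwidth.
    """
    if sum(scaled_bandwidths.values()) == 0:
        # Total bandwidth is zero: give every relay the same bandwidth.
        return {relay: 1 for relay in scaled_bandwidths}
    # Otherwise, round any individual zero up to one.
    return {relay: max(1, bw) for relay, bw in scaled_bandwidths.items()}
```

For example, `apply_zero_checks({"A": 0, "B": 5})` leaves relay B unchanged and raises relay A to one.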

Initial experiments indicate that scaling may not be needed for torflow and sbws, because their measured bandwidths are similar enough already.

A linear scaling method

If scaling is required, here is a simple linear bandwidth scaling method, which ensures that all bandwidth votes contain approximately the same total bandwidth:

  1. Calculate the relay quota by dividing the total measured bandwidth
     in all votes by the number of relays with measured bandwidth
     votes. In the public Tor network, this is approximately 7500 as of
     April 2018. The quota should be a consensus parameter, so it can be
     adjusted for all generators on the network.

  2. Calculate a vote quota by multiplying the relay quota by the number
     of relays this bandwidth authority has measured
     bandwidths for.

  3. Calculate a scaling factor by dividing the vote quota by the
     total unscaled measured bandwidth in this bandwidth
     authority's upcoming vote.

  4. Multiply each unscaled measured bandwidth by the scaling
     factor.

Now, the total scaled bandwidth in the upcoming vote is approximately equal to the vote quota.
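
Steps 2 to 4 can be sketched as below. This is an illustration under stated assumptions: the function name `linear_scale` and the dict-based input are hypothetical, the relay quota from step 1 is passed in as a parameter (as if read from a consensus parameter), and rounding to the nearest integer is a choice made for the example.

```python
def linear_scale(measured, relay_quota):
    """Linearly scale one bandwidth authority's measured bandwidths.

    measured:    dict of relay -> unscaled measured bandwidth
    relay_quota: relay quota from step 1 (assumed to be supplied
                 externally, for example by a consensus parameter)
    """
    total = sum(measured.values())
    if total == 0:
        # Zero total: fall back to equal bandwidths (assumed value: 1).
        return {relay: 1 for relay in measured}
    # Step 2: vote quota for the relays this authority has measured.
    vote_quota = relay_quota * len(measured)
    # Step 3: scaling factor for this authority's upcoming vote.
    factor = vote_quota / total
    # Step 4: scale each bandwidth, rounding any zero up to one.
    return {relay: max(1, round(bw * factor)) for relay, bw in measured.items()}
```

With a relay quota of 7500 and measurements `{"A": 1000, "B": 3000}`, the vote quota is 15000 and the scaling factor is 3.75, so the scaled bandwidths sum to the vote quota.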

Quota changes

If all generators are using scaling, the quota can be gradually reduced or increased as needed. Smaller quotas decrease the size of uncompressed consensuses, and may decrease the size of consensus diffs and compressed consensuses. But if the relay quota is too small, some relays may be over- or under-weighted.

Torflow aggregation

Torflow implements two methods to compute the bandwidth values from the (stream) bandwidth measurements: with and without PID control feedback. The method described here is without PID control (see Torflow specification, section 2.2).

In the following steps, a relay's measured bandwidth refers to the bandwidth that this bandwidth authority has measured for that relay. Only relays that would be included in this bandwidth authority's upcoming vote are considered.

  1. Calculate the filtered bandwidth for each relay:
    - choose the relay's measurements (`bw_j`) that are equal to or greater
      than the mean of the measurements for this relay
    - calculate the mean of those measurements

    In pseudocode:

      bw_filt_i = mean(bw_j where bw_j >= mean(bw_j))

  2. Calculate network averages:
    - calculate the filtered average by dividing the sum of all the
      relays' filtered bandwidths by the number of relays that have been
      measured (`n`), i.e., calculate the mean of the relays'
      filtered bandwidths.
    - calculate the stream average by dividing the sum of all the
      relays' measured bandwidths by the number of relays that have been
      measured (`n`), i.e., calculate the mean of the relays'
      measured bandwidths.

    In pseudocode:

      bw_avg_filt = sum(bw_filt_i) / n
      bw_avg_strm = sum(bw_i) / n

  3. Calculate ratios for each relay:
    - calculate the filtered ratio by dividing each relay's filtered
      bandwidth by the filtered average
    - calculate the stream ratio by dividing each relay's measured
      bandwidth by the stream average

    In pseudocode:

      r_filt_i = bw_filt_i / bw_avg_filt
      r_strm_i = bw_i / bw_avg_strm

  4. Calculate the final ratio for each relay:
    The final ratio is the larger of the filtered ratio and the
    stream ratio.

    In pseudocode:

      r_i = max(r_filt_i, r_strm_i)

  5. Calculate the scaled bandwidth for each relay:
    The relay's most recent descriptor observed bandwidth (`bw_obs_i`) is
    multiplied by the final ratio.

    In pseudocode:

      bw_new_i = r_i * bw_obs_i

<<In this way, the resulting network status consensus bandwidth values are effectively re-weighted proportional to how much faster the node was as compared to the rest of the network.>>
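
The five steps above can be sketched end to end as follows. This is a minimal illustration, not Torflow's code: the function name `torflow_scale` and the dict-based inputs are hypothetical, each relay's measured bandwidth `bw_i` is assumed to be the mean of its stream measurements, and both network averages are assumed to be non-zero.

```python
from statistics import mean


def torflow_scale(measurements, observed):
    """Scale bandwidths without PID control, following steps 1 to 5.

    measurements: dict of relay -> non-empty list of stream bandwidth
                  measurements (bw_j)
    observed:     dict of relay -> most recent descriptor observed
                  bandwidth (bw_obs_i)
    """
    # Assumption: bw_i is the mean of the relay's stream measurements.
    bw = {r: mean(m) for r, m in measurements.items()}
    # Step 1: mean of the measurements at or above the relay's own mean.
    bw_filt = {
        r: mean([x for x in m if x >= bw[r]])
        for r, m in measurements.items()
    }
    # Step 2: network averages over the n measured relays.
    n = len(bw)
    bw_avg_filt = sum(bw_filt.values()) / n
    bw_avg_strm = sum(bw.values()) / n
    scaled = {}
    for r in bw:
        # Step 3: per-relay ratios against the network averages.
        r_filt = bw_filt[r] / bw_avg_filt
        r_strm = bw[r] / bw_avg_strm
        # Step 4: the final ratio is the larger of the two.
        ratio = max(r_filt, r_strm)
        # Step 5: scale the descriptor observed bandwidth.
        scaled[r] = ratio * observed[r]
    return scaled
```

For instance, with measurements `{"A": [100, 300], "B": [200, 200]}` both relays have the same measured bandwidth, but relay A's filtered bandwidth is higher, so its final ratio exceeds one and its observed bandwidth is scaled up.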