Common protocol
We have made an effort to split the design of the proof-of-work subsystem into an algorithm-specific piece that can be upgraded, and a core protocol that provides queueing and effort adjustment.
Currently there is only one versioned subprotocol defined:
Overview
+----------------------------------+
| Onion Service |
+-------+ INTRO1 +-----------+ INTRO2 +--------+ |
|Client |-------->|Intro Point|------->| PoW |-----------+ |
+-------+ +-----------+ |Verifier| | |
+--------+ | |
| | |
| | |
| +----------v---------+ |
| |Intro Priority Queue| |
+---------+--------------------+---+
| | |
Rendezvous | | |
circuits | | |
v v v
The proof-of-work scheme specified in this document takes place during the introduction phase of the onion service protocol.
The system described in this proposal is not meant to be on all the time, and it can be entirely disabled for services that do not experience DoS attacks.
When the subsystem is enabled, suggested effort is continuously adjusted and the computational puzzle can be bypassed entirely when the effort reaches zero. In these cases, the proof-of-work subsystem can be dormant but still provide the necessary parameters for clients to voluntarily provide effort in order to get better placement in the priority queue.
The protocol involves the following major steps:
- Service encodes PoW parameters in descriptor:
pow-params
in the second layer plaintext format. - Client fetches descriptor and begins solving. Currently this must use the
v1
solver algorithm. - Client finishes solving and sends results using the proof-of-work extension to INTRODUCE1.
- Service verifies the proof and queues an introduction based on proven effort. This currently uses the
v1
verify algorithm only. - Requests are continuously drained from the queue, highest effort first, subject to multiple constraints on speed. See below for more on handling queued requests.
Replay protection
The service MUST NOT accept introduction requests with the same (seed, nonce) tuple. For this reason a replay protection mechanism must be employed.
The simplest way is to use a hash table to check whether a (seed, nonce) tuple has been used before for the active duration of a seed. Depending on how long a seed stays active this might be a viable solution with reasonable memory/time overhead.
If there is a worry that we might get too many introductions during the lifetime of a seed, we can use a Bloom filter or similar as our replay cache mechanism. A probabilistic filter means that we will potentially flag some connections as replays even if they are not, with this false positive probability increasing as the number of entries increase. With the right parameter tuning this probability should be negligible, and dropped requests will be retried by the client.
The introduction queue
When proof-of-work is enabled for a service, that service diverts all incoming introduction requests to a priority queue system rather than handling them immediately.
Adding introductions to the introduction queue
When PoW is enabled and an introduction request includes a verified proof, the service queues each request in a data structure sorted by effort. Requests including no proof at all MUST be assigned an effort of zero. Requests with a proof that fails to verify MUST be rejected and not enqueued.
Services MUST check whether the queue is overfull when adding to it, not just when processing requests. Floods of low-effort and zero-effort introductions need to be efficiently discarded when the queue is growing faster than it's draining.
The C implementation chooses a maximum number of queued items based on its configured dequeue rate limit multiplied by the circuit timeout. In effect, items past this threshold are expected not to be reachable by the time they will timeout. When this limit is exceeded, the queue experiences a mass trim event where the lowest effort half of all items are discarded.
Handling queued introductions
When deciding which introduction request to consider next, the service chooses the highest available effort. When efforts are equivalent, the oldest queued request is chosen.
The service should handle introductions only by pulling from the introduction queue. We call this part of introduction handling the "bottom half" because most of the computation happens in this stage.
For more on how we expect such a system to work in Tor, see the scheduler analysis and discussion section.
Effort control
Overall strategy for effort determination
Denial-of-service is a dynamic problem where the attacker's capabilities constantly change, and hence we want our proof-of-work system to be dynamic and not stuck with a static difficulty setting. Instead of forcing clients to go below a static target configured by the service operator, we ask clients to "bid" using their PoW effort. Effectively, a client gets higher priority the higher effort they put into their proof-of-work. Clients automatically increase their bid when retrying, and services regularly offer a suggested starting point based on the recent queue status.
Motivated users can spend a high amount of effort in their PoW computation, which should guarantee access to the service given reasonable adversary models.
An effective effort control algorithm will improve reachability and UX by suggesting values that reduce overall service load to tolerable values while also leaving users with a tolerable overall delay.
The service starts with a default suggested-effort value of 0, which keeps the PoW defenses dormant until we notice signs of queue overload.
The entire process of determining effort can be thought of as a set of multiple coupled feedback loops. Clients perform their own effort adjustments via timeout retry atop a base effort suggested by the service. That suggestion incorporates the service's control adjustments atop a base effort calculated using a sum of currently-queued client effort.
Each feedback loop has an opportunity to cover different time scales. Clients can make adjustments at every single circuit creation request, whereas services are limited by the extra load that frequent updates would place on HSDir nodes.
In the combined client/service system these client-side increases are expected to provide the most effective quick response to an emerging DoS attack. After early clients increase the effort using timeouts, later clients benefit from the service detecting this increased queued effort and publishing a larger suggested effort.
Effort increases and decreases both have a cost. Increasing effort will make the service more expensive to contact, and decreasing effort makes new requests likely to become backlogged behind older requests. The steady state condition is preferable to either of these side-effects, but ultimately it's expected that the control loop always oscillates to some degree.
Service-side effort control
Services keep an internal suggested effort target which updates on a regular periodic timer in response to measurements made on queue behavior in the previous update period. These internal effort changes can optionally trigger client-visible descriptor changes when the difference is great enough to warrant republication to the HSDir.
This evaluation and update period is referred to as HS_UPDATE_PERIOD
.
The service-side effort control loop takes inspiration from TCP congestion control's additive increase / multiplicative decrease approach, but unlike a typical AIMD this algorithm is fixed-rate and doesn't update immediately in response to events.
TODO: HS_UPDATE_PERIOD
is hardcoded to 300 (5 minutes) currently, but it should be configurable in some way.
Is it more appropriate to use the service's torrc here or a consensus parameter?
Per-update-period service state
During each update period, the service maintains some state:
TOTAL_EFFORT
, a sum of all effort values for rendezvous requests that were successfully validated and enqueued.REND_HANDLED
, a count of rendezvous requests that were actually launched. Requests that made it to dequeueing but were too old to launch by then are not included.HAD_QUEUE
, a flag which is set if at any time in the update period we saw the priority queue filled with more than a minimum amount of work, greater than we would expect to process in approximately 1/4 second using the configured dequeue rate.MAX_TRIMMED_EFFORT
, the largest observed single request effort that we discarded during the update period. Requests are discarded either due to age (timeout) or during culling events that discard the bottom half of the entire queue when it's too full.
Service AIMD conditions
At the end of each update period, the service may decide to increase effort, decrease effort, or make no changes, based on these accumulated state values:
- If
MAX_TRIMMED_EFFORT
> our previous internalsuggested_effort
, always INCREASE. Requests that follow our latest advice are being dropped. - If the
HAD_QUEUE
flag was set and the queue still contains at least one item with effort >= our previous internalsuggested_effort
, INCREASE. Even if we haven't yet reached the point of dropping requests, this signal indicates that our latest suggestion isn't high enough and requests will build up in the queue. - If neither condition 1 or 2 are taking place and the queue is below a level we would expect to process in approximately 1/4 second, choose to DECREASE.
- If none of these conditions match, the
suggested_effort
is unchanged.
When we INCREASE, the internal suggested_effort
is increased to either its previous value + 1, or (TOTAL_EFFORT
/ REND_HANDLED
), whichever is larger.
When we DECREASE, the internal suggested_effort
is scaled by 2/3rds.
Over time, this will continue to decrease our effort suggestion any time the service is fully processing its request queue. If the queue stays empty, the effort suggestion decreases to zero and clients should no longer submit a proof-of-work solution with their first connection attempt.
It's worth noting that the suggested_effort
is not a hard limit to the efforts that are accepted by the service, and it's only meant to serve as a guideline for clients to reduce the number of unsuccessful requests that get to the service.
When adding requests to the queue, services do accept valid solutions with efforts higher or lower than the published values from pow-params
.
Updating descriptor with new suggested effort
The service descriptors may be updated for multiple reasons including introduction point rotation common to all v3 onion services, scheduled seed rotations like the one described for v1
parameters, and updates to the effort suggestion.
Even though the internal effort value updates on a regular timer, we avoid propagating those changes into the descriptor and the HSDir hosts unless there is a significant change.
If the PoW params otherwise match but the seed has changed by less than 15 percent, services SHOULD NOT upload a new descriptor.
Client-side effort control
Clients are responsible for making their own effort adjustments in response to connection trouble, to allow the system a chance to react before the service has published new effort values. This is an important tool to uphold UX expectations without relying on excessively frequent updates through the HSDir.
TODO: This is the weak link in user experience for our current implementation. The C tor implementation does not detect and retry onion service connections as reliably as we would like. Currently our best strategy to improve retry behavior is the Arti rewrite.
Failure ambiguity
The first challenge in reacting to failure, in our case, is to even accurately and quickly understand when a failure has occurred.
This proposal introduces a bunch of new ways where a legitimate client can fail to reach the onion service. Furthermore, there is currently no end-to-end way for the onion service to inform the client that the introduction failed. The INTRODUCE_ACK message is not end-to-end (it's from the introduction point to the client) and hence it does not allow the service to inform the client that the rendezvous is never gonna occur.
From the client's perspective there's no way to attribute this failure to the service itself rather than the introduction point, so error accounting is performed separately for each introduction-point. Prior mechanisms will discard an introduction point that's required too many retries.
Clients handling timeouts
Alice can fail to reach the onion service if her introduction request gets trimmed off the priority queue when enqueueing new requests, or if the service does not get through its priority queue in time and the connection times out.
This section presents a heuristic method for the client getting service even in such scenarios.
If the rendezvous request times out, the client SHOULD fetch a new descriptor for the service to make sure that it's using the right suggested-effort for the PoW and the right PoW seed. If the fetched descriptor includes a new suggested effort or seed, it should first retry the request with these parameters.
TODO: This is not actually implemented yet, but we should do it. How often should clients at most try to fetch new descriptors? Determined by a consensus parameter? This change will also allow clients to retry effectively in cases where the service has just been reconfigured to enable PoW defenses.
Every time the client retries the connection, it will count these failures per-introduction-point. These counts of previous retries are combined with the service's suggested_effort
when calculating the actual effort to spend on any individual request to a service that advertises PoW support, even when the currently advertised suggested_effort
is zero.
On each retry, the client modifies its solver effort:
- If the effort is below
CLIENT_POW_EFFORT_DOUBLE_UNTIL
(= 1000) it will be doubled. - Otherwise, multiply the effort by
CLIENT_POW_RETRY_MULTIPLIER
(= 1.5). - Constrain the effort to no less than
CLIENT_MIN_RETRY_POW_EFFORT
(= 8). Note that this limit is specific to retries only. Clients may use a lower effort for their first connection attempt. - Apply the maximum effort limit described below.
Client-imposed effort limits
There isn't a practical upper limit on effort defined by the protocol itself, but clients may choose a maximum effort limit to enforce. It may be desirable to do this in some cases to improve responsiveness, but the main reason for this limit currently is as a workaround for weak cancellation support in our implementation.
Effort values used for both initial connections and retries are currently limited to no greater than CLIENT_MAX_POW_EFFORT
(= 10000).
TODO: This hardcoded limit should be replaced by timed limits and/or an unlimited solver with robust cancellation. This is issue 40787 in C tor.