Filename: 367-sendme-covers-more.md
Title: Better SENDMEs with fewer exceptions and queue-jumping
Author: Nick Mathewson
Created: 23 September 2025
Status: Draft

Introduction

This proposal describes a change to how Tor interprets SENDME messages, such that nearly all RELAY cells will count against the SENDME windows.

The problem

Tor's current congestion control design uses SENDME messages to tell the other end of a circuit that it can originate more RELAY cells.

But currently, only certain RELAY cells (those containing DATA messages) count towards SENDME windows. This causes a significant problem:

There is no congestion control on other message types.

Notably, INTRODUCE2 messages have the potential to get out of hand if they arrive at an introduction point faster than they can be sent to the onion service. But other types (like DROP or BEGIN) also have the potential to fill buffers and queues more than we'd like.

(This problem is significant, and probably worth solving.)

So it's likely a good idea to look for a way to make SENDME messages cover more RELAY cells—ideally, as many RELAY cells as possible. But before we can do that, there are some obstacles to consider.

Obstacles

Our primary obstacle is a deadlock situation: if SENDME windows apply to SENDME messages, then it's possible for two parties to enter a situation where both parties' windows are empty, and so neither one can send the other one a SENDME.

As another (less important) obstacle, there are some messages that we send in order to prevent the other side from sending us messages of its own. These include END, XOFF, and (if we ever implement them) TRUNCATE/TRUNCATED. If these messages have to wait before we can send them, we might find ourselves receiving more unwanted traffic than we did before.

As an implementation obstacle, our current code does not always anticipate that it might be unable to queue a non-DATA message, or that it might be unable to flush messages from a circuit queue because of current window values. We'll want to consider the amount of refactoring necessary here.

The status quo in detail

Before we come to the proposed solution, let's refresh how outbound relay cell queues work today.

When we have a RELAY cell to send or to relay, we encrypt/decrypt the cell and add it to the end of an outbound queue for that circuit. When there is space to write on the circuit's attached channel, we pull a cell from the front of that queue and write it.

Before pulling data from a stream to package into a DATA cell and write onto a circuit, we check whether our send window for that circuit is exhausted. If it is, we leave the data on the stream.
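
As a rough illustration of the status quo, here is a minimal sketch using hypothetical names and types (not Tor's actual code): only stream DATA is gated on the circuit's send window, and cells already in the outbound queue are flushed regardless of the window.

```rust
use std::collections::VecDeque;

struct EncryptedCell;

struct Circuit {
    send_window: u32,
    outbound: VecDeque<EncryptedCell>,
}

impl Circuit {
    /// Called when a stream has data ready to package into a DATA cell.
    fn maybe_package_stream_data(&mut self, data_ready: bool) {
        // If the circuit's send window is exhausted, leave the data on the stream.
        if !data_ready || self.send_window == 0 {
            return;
        }
        self.send_window -= 1;
        // Encrypt right away and append to the circuit's outbound queue.
        self.outbound.push_back(EncryptedCell);
    }

    /// Called when the attached channel has room to write another cell.
    /// Cells already queued are written regardless of the window.
    fn flush_one_cell(&mut self) -> Option<EncryptedCell> {
        self.outbound.pop_front()
    }
}
```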

Proposed solution

When this protocol is enabled, nearly all relay cells will count against circuit SENDME windows.

(This protocol is incompatible with our legacy flow-control, so we won't consider stream SENDME windows at all.)

(This protocol also requires the use of authenticated SENDMEs.)

Use of this protocol is advertised and negotiated using the SUBPROTO_REQUEST extension of proposal 346.

What will count against SENDME windows

A relay cell will count against circuit SENDME windows if it contains any part of any message of a type other than the following:

  • SENDME
  • XOFF
  • END
  • DROP

Note that this rule is phrased so that it can remain the same even after packed relay cells are implemented.

SENDME messages need to be unrestricted to prevent deadlocks; they can't be used for flooding attacks because of authenticated SENDMEs.

XOFF and END messages don't strictly need to be unrestricted, but since they result in less bandwidth usage, we exempt them anyway. They can't be used for unbounded flooding attacks, since they are limited by the number of open streams.

DROP messages need to be unrestricted because they can originate from middle hops, which do not currently have a way to send a congestion signal to the client. For now, keeping DROP messages to a reasonable number is the job of the circuit padding code.
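
As a concrete sketch of this rule: the message-type names below are real, but the enum and function are hypothetical. Note that the check is per message, so it continues to apply once a packed relay cell can carry several messages.

```rust
// Relay message types relevant to this rule; the enum itself is illustrative.
enum MsgType {
    Sendme,
    Xoff,
    End,
    Drop,
    Data,
    Begin,
    Xon,
    // ... other relay message types
}

/// A relay cell counts against the circuit SENDME window if it carries any
/// part of any message whose type is not one of the four exempt types.
fn cell_counts_against_window(msgs_in_cell: &[MsgType]) -> bool {
    msgs_in_cell.iter().any(|m| {
        !matches!(m, MsgType::Sendme | MsgType::Xoff | MsgType::End | MsgType::Drop)
    })
}
```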

Handling full queues

When SENDME windows apply to more relay cells, we will need to handle, for each type of relay message, the possibility that we will not be able to enqueue it due to a full window.

Considering each message type in turn (a summary sketch follows this list):

  • SENDME, END, XOFF, and DROP are not subject to windows.

  • DATA messages are already handled.

  • For BEGIN, BEGIN_DIR, CONNECTED, RESOLVE, and RESOLVED messages, we can do any of these:

    • Queue them as part of the associated stream's data.
    • Place them in an auxiliary queue (see below).
    • Not open the stream or launch the request until the circuit window has room.
    • For BEGIN, RESOLVE, or BEGIN_DIR, not connect a stream to a congested circuit at all. (This might have security implications.)

  • For XON messages, we can:

    • Queue them as part of the associated stream's data.
    • Place them in an auxiliary queue (see below).

  • These message types are (currently) only sent during circuit establishment, or immediately afterward, so as we use them today they will never encounter a full window: EXTEND2, EXTENDED2, CONFLUX_LINK, CONFLUX_LINKED, CONFLUX_LINKED_ACK, ESTABLISH_INTRO, INTRO_ESTABLISHED, ESTABLISH_RENDEZVOUS, RENDEZVOUS_ESTABLISHED, RENDEZVOUS1, RENDEZVOUS2, INTRODUCE1, INTRODUCE_ACK, PADDING_NEGOTIATE, PADDING_NEGOTIATED.

    If we later need to send them after circuit establishment, we can use an auxiliary queue.

  • CONFLUX_SWITCH is sent to tell the other party that we are switching the traffic we send on a conflux tunnel to the given circuit. Therefore, there should be no need to switch to a congested circuit, and this message shouldn't be sent when the window is full.

  • TRUNCATE and TRUNCATED are not currently used or implemented. If we later decide that they are necessary, they can be exempted from windows in the same way as SENDME, or they can use an auxiliary queue.

  • INTRODUCE2 is sent from introduction points to onion services; many can be sent on a single circuit.
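
The following sketch summarizes one permissible combination of the options above. It is an illustration of the decision structure, not a choice this proposal mandates: it picks the auxiliary queue wherever that is listed as an option, and leaves INTRODUCE2 handling undecided. All type and function names are hypothetical.

```rust
enum FullWindowPolicy {
    Exempt,      // never blocked by the window (SENDME, END, XOFF, DROP)
    StreamData,  // already handled: data stays on the stream until there is room
    AuxQueue,    // hold the unencrypted message on the auxiliary queue (see below)
    DontSend,    // should not be sent toward a congested circuit at all
    SetupOnly,   // only sent during circuit setup; never sees a full window today
    Undecided,   // handling not specified here
}

fn full_window_policy(msg: &str) -> FullWindowPolicy {
    match msg {
        "SENDME" | "END" | "XOFF" | "DROP" => FullWindowPolicy::Exempt,
        "DATA" => FullWindowPolicy::StreamData,
        "BEGIN" | "BEGIN_DIR" | "CONNECTED" | "RESOLVE" | "RESOLVED" | "XON" => {
            FullWindowPolicy::AuxQueue
        }
        "CONFLUX_SWITCH" => FullWindowPolicy::DontSend,
        "INTRODUCE2" => FullWindowPolicy::Undecided,
        "EXTEND2" | "EXTENDED2" | "CONFLUX_LINK" | "CONFLUX_LINKED"
        | "CONFLUX_LINKED_ACK" | "ESTABLISH_INTRO" | "INTRO_ESTABLISHED"
        | "ESTABLISH_RENDEZVOUS" | "RENDEZVOUS_ESTABLISHED" | "RENDEZVOUS1"
        | "RENDEZVOUS2" | "INTRODUCE1" | "INTRODUCE_ACK" | "PADDING_NEGOTIATE"
        | "PADDING_NEGOTIATED" => FullWindowPolicy::SetupOnly,
        _ => FullWindowPolicy::Undecided,
    }
}
```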

Auxiliary queues

If we decide that we want to hold on to a message when the window is full, we can use an auxiliary queue, as specified below:

When the window is full, we put the message on the auxiliary queue. Then, once there is space in the window again, we take messages from the auxiliary queue before taking messages from any other source.

Messages in the auxiliary queue are not encrypted; we encrypt them once we place them in relay cells.
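
A minimal sketch of that discipline, using hypothetical names and types: messages held on the auxiliary queue are still unencrypted, and are drained ahead of any other traffic source as soon as the window opens.

```rust
use std::collections::VecDeque;

struct Msg; // placeholder for an unencrypted relay message

struct CircuitOutbound {
    window: u32,              // remaining SENDME window
    aux_queue: VecDeque<Msg>, // unencrypted messages waiting for window space
}

impl CircuitOutbound {
    /// Try to send a window-limited message; if the window is exhausted,
    /// hold it (unencrypted) on the auxiliary queue instead.
    fn send_or_defer(&mut self, msg: Msg) {
        if self.window == 0 {
            self.aux_queue.push_back(msg);
        } else {
            self.window -= 1;
            encrypt_and_queue_cell(msg);
        }
    }

    /// When a SENDME from the other side opens the window, drain the
    /// auxiliary queue before pulling messages from any other source
    /// (such as stream data waiting to be packaged).
    fn on_window_opened(&mut self) {
        while self.window > 0 {
            match self.aux_queue.pop_front() {
                Some(msg) => {
                    self.window -= 1;
                    encrypt_and_queue_cell(msg);
                }
                None => break,
            }
        }
    }
}

// Placeholder for relay-encrypting the message and appending the resulting
// cell to the circuit's outbound cell queue.
fn encrypt_and_queue_cell(_msg: Msg) {}
```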

Handling XON/XOFF and BEGIN/END pairs

If we have an XON queued and we would like to send an XOFF for the same stream, we must not allow that XOFF to be sent before the XON.

Similarly, if we have any DATA or BEGIN messages for a stream, we must not allow an END for the stream to be sent before them.

There are several possible solutions to this issue:

  1. We could force XOFF and/or END to be queued, just the same as DATA messages.

  2. If we have an XON queued and we would like to queue an XOFF, we can simply remove the XON from the queue instead.

  3. We could also exempt XON from SENDME windows. (This won't work for BEGIN, since we would also have to exempt DATA.)

  4. We could allow END to jump the queue, but not farther than any other message on the same stream.

I'm going to suggest the following basic approach:

  • At first, when we want to send an XOFF or XON and the window is exhausted, we place it in the auxiliary queue.

  • Later, as an optimization, we can allow XOFF to bypass the queue; but if there is an XON for the same stream on the queue, we remove that XON instead of sending the XOFF.

  • Later, as an optimization, we can allow END to bypass the queue, but if there are any other messages for the same stream on the queue, END must follow them.
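
Here is a sketch of the XOFF optimization in the second bullet, with hypothetical names and types; the END rule would be analogous, except that END must stay behind any queued message for the same stream rather than cancelling it.

```rust
use std::collections::VecDeque;

// XOFF may bypass the window, but if an XON for the same stream is still
// waiting on the auxiliary queue, we cancel that XON instead, so the two
// can never be delivered out of order.
enum Pending {
    Xon { stream: u64 },
    Other,
}

/// Returns true if an XOFF should actually be sent (bypassing the queue);
/// returns false if a pending XON was cancelled instead, which (assuming an
/// XOFF was previously sent for this stream) leaves the stream flowed off.
fn want_xoff(aux_queue: &mut VecDeque<Pending>, stream: u64) -> bool {
    let before = aux_queue.len();
    aux_queue.retain(|m| !matches!(m, Pending::Xon { stream: s } if *s == stream));
    aux_queue.len() == before
}
```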

An additional optimization: late-encryption and queue-jumping

As we implement the design above, we can perform an additional optimization to improve responsiveness for SENDME, XOFF, and similar messages.

Instead of the current implementation, where we encrypt relay cells as early as possible, we can encrypt relay cells as we give them to the channel, or just before. This change would allow us to give relay messages to the channel in an order other than the order in which we generate them: SENDME (and XOFF, etc.) could "jump the queue" and get delivered before regular data.

We need to encrypt late in order to implement queue jumping, since our RELAY cell encryption is order-sensitive.

If we do this, we would still need to preserve the relative order of SENDME messages.

We might also want to have the ability to revoke an unsent XOFF or XON message if we no longer want to disable (or re-enable) a given stream.

When we implement packed relay cells, queue-jumping messages could be packed along with non-queue-jumping messages.
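
A sketch of queue-jumping under late encryption, with hypothetical types (a real implementation would also need the SENDME-ordering, revocation, and packing rules above): messages stay unencrypted until the channel can take a cell, so urgent messages may be moved ahead of ordinary data while the relative order of urgent messages themselves is preserved.

```rust
use std::collections::VecDeque;

struct PlainMsg {
    urgent: bool, // e.g. SENDME or XOFF
}

struct EncryptedCell;

struct Outbound {
    queue: VecDeque<PlainMsg>, // unencrypted; encryption happens at flush time
}

impl Outbound {
    fn enqueue(&mut self, msg: PlainMsg) {
        if msg.urgent {
            // Urgent messages go just after any urgent messages already queued
            // (preserving their relative order), but ahead of everything else.
            let pos = self
                .queue
                .iter()
                .position(|m| !m.urgent)
                .unwrap_or(self.queue.len());
            self.queue.insert(pos, msg);
        } else {
            self.queue.push_back(msg);
        }
    }

    /// Relay-cell encryption is order-sensitive, so we encrypt only once the
    /// final delivery order is known, i.e. when handing the cell to the channel.
    fn flush_one(&mut self) -> Option<EncryptedCell> {
        self.queue.pop_front().map(relay_encrypt)
    }
}

// Placeholder for the (order-sensitive) relay encryption step.
fn relay_encrypt(_m: PlainMsg) -> EncryptedCell {
    EncryptedCell
}
```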

A rejected solution: Reserve windows

Here is a solution that we considered and rejected in an earlier version of this proposal:

In this design, we would have made all relay cells count against SENDME windows.

To avoid the deadlock issue, we would reserve R cells out of the SENDME window for SENDME cells: If the window is W, and we have W-R cells in the queue, then we would consider the window full for all purposes other than SENDME messages.

This would suffice to prevent deadlock.
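
For illustration, the rejected check would have looked something like this (names are illustrative):

```rust
/// The rejected reserve-window rule: with window `w` and reserve `r`, the
/// window is treated as full for non-SENDME messages once `w - r` cells are
/// queued; only SENDME may use the last `r` slots.
fn window_is_full(is_sendme: bool, queued: u32, w: u32, r: u32) -> bool {
    if is_sendme {
        queued >= w
    } else {
        queued >= w.saturating_sub(r)
    }
}
```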

We've rejected this solution, however, because the correct value of R is difficult to calculate, and because under certain pathological circumstances it could become higher than the total congestion window.

Acknowledgments

Thanks to Mike Perry, who helped walk me through all the ways that the first version of this proposal would not have worked, and encouraged me to get the revised version out the door.