Relay cells

Within a circuit, the OP and the end node use the contents of RELAY packets to tunnel end-to-end commands and TCP connections ("Streams") across circuits. End-to-end commands can be initiated by either edge; streams are initiated by the OP.

End nodes that accept streams may be:

  • exit relays (RELAY_BEGIN, anonymous),
  • directory servers (RELAY_BEGIN_DIR, anonymous or non-anonymous),
  • onion services (RELAY_BEGIN, anonymous via a rendezvous point).

The payload of each unencrypted RELAY cell consists of:

  Field            Size
  Relay command    1 byte
  'Recognized'     2 bytes
  StreamID         2 bytes
  Digest           4 bytes
  Length           2 bytes
  Data             Length bytes
  Padding          PAYLOAD_LEN - 11 - Length bytes
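
As an illustration of the layout above, the following Python sketch packs and unpacks an unencrypted relay payload. It is not taken from any implementation: the function names are hypothetical, PAYLOAD_LEN = 509 and big-endian integer encoding are assumptions drawn from the rest of the spec, and the padding defaults to zeros purely for brevity.

```python
import struct

# Assumption: PAYLOAD_LEN is 509, as defined elsewhere in the spec.
PAYLOAD_LEN = 509

# command (1), recognized (2), stream ID (2), digest (4), length (2);
# ">" selects big-endian with no alignment padding (assumed byte order).
HEADER = struct.Struct(">BHH4sH")
assert HEADER.size == 11


def pack_relay_payload(command, stream_id, data, digest=bytes(4), padding=None):
    """Build an unencrypted relay payload with 'recognized' set to zero."""
    max_data = PAYLOAD_LEN - HEADER.size
    if len(data) > max_data:
        raise ValueError("data too long for one relay cell")
    pad_len = max_data - len(data)
    if padding is None:
        padding = bytes(pad_len)  # all-zero padding, for illustration only
    if len(padding) != pad_len:
        raise ValueError("padding must fill the rest of the payload")
    return HEADER.pack(command, 0, stream_id, digest, len(data)) + data + padding


def unpack_relay_payload(payload):
    """Split a payload into (command, recognized, stream_id, digest, data)."""
    if len(payload) != PAYLOAD_LEN:
        raise ValueError("unexpected payload size")
    command, recognized, stream_id, digest, length = HEADER.unpack_from(payload)
    if length > PAYLOAD_LEN - HEADER.size:
        raise ValueError("Length field exceeds available space")
    data = payload[HEADER.size:HEADER.size + length]
    return command, recognized, stream_id, digest, data
```

For example, `pack_relay_payload(2, 7, b"hello")` yields a 509-byte payload carrying a RELAY_DATA command on stream 7, with 493 bytes of padding after the data.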

The relay commands are:

  Command  Identifier                        Direction            Type
  2        RELAY_DATA                        forward or backward
  3        RELAY_END                         forward or backward
  5        RELAY_SENDME                      forward or backward  sometimes control
  10       RELAY_DROP                        forward or backward  control
  16..18   Reserved for UDP; not yet in use, see prop339.
  19..22   Reserved for Conflux, see prop329.
  32..40   Used for hidden services; see the rendezvous spec.
  41..42   Used for circuit padding; see "Circuit-level padding" in the padding spec.
  43       XON (see Sec 4 of prop324)        forward or backward
  44       XOFF (see Sec 4 of prop324)       forward or backward

Commands labelled as "forward" must only be sent by the originator of the circuit. Commands labelled as "backward" must only be sent by other nodes in the circuit back to the originator. Commands marked as either can be sent either by the originator or other nodes.

The 'recognized' field is used as a simple indication that the cell is still encrypted. It is an optimization to avoid calculating expensive digests for every cell. When sending cells, the unencrypted 'recognized' MUST be set to zero.

When receiving and decrypting cells, the 'recognized' field will always be zero if we're the endpoint that the cell is destined for. For cells that we should relay, the 'recognized' field will usually be nonzero, but will be zero by chance with probability 2^-16.

When handling a relay cell, if the 'recognized' field in a decrypted relay payload is zero, the 'digest' field is computed as the first four bytes of the running digest of all the bytes that have been destined for this hop of the circuit or originated from this hop of the circuit, seeded from Df or Db respectively (obtained in Setting circuit keys), and including this RELAY cell's entire payload (taken with the digest field set to zero). Note that these digests do include the padding bytes at the end of the cell, not only those up to 'Length'. If the digest is correct, the cell is considered "recognized" for the purposes of decryption (see Routing relay cells).

(The digest does not include any bytes from relay cells that do not start or end at this hop of the circuit. That is, it does not include forwarded data. Therefore if 'recognized' is zero but the digest does not match, the running digest at that node should not be updated, and the cell should be forwarded on.)
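
The decision described above (ours, or forward it on) can be sketched in Python as follows. This is a hedged illustration, not reference code: the function name is hypothetical, SHA-1 is assumed as the running hash, and the caller is assumed to hold a hash object already seeded with Df or Db. The point of interest is that the running digest is only advanced when the cell turns out to be ours.

```python
import hashlib


def check_recognized(payload, running_digest):
    """Return (is_ours, running_digest_to_keep) for a decrypted relay payload.

    `running_digest` is a hashlib object (assumed SHA-1) seeded for this hop.
    It is never modified in place; a fresh copy is returned on a match.
    """
    # Cheap pre-check: bytes 1..3 hold 'recognized'; nonzero means the
    # payload is still encrypted for a later hop.
    if payload[1:3] != b"\x00\x00":
        return False, running_digest

    # Recompute the digest over the payload with its digest field zeroed.
    zeroed = payload[:5] + bytes(4) + payload[9:]
    candidate = running_digest.copy()
    candidate.update(zeroed)

    if candidate.digest()[:4] == payload[5:9]:
        # The cell is for this hop: keep the advanced state.
        return True, candidate

    # 'recognized' was zero by chance: keep the old state and forward the cell.
    return False, running_digest
```

Working on a copy of the hash state is one way to satisfy the rule that forwarded data must not enter the running digest; an implementation could equally snapshot and restore the state.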

All RELAY cells pertaining to the same tunneled stream have the same StreamID. StreamIDs are chosen arbitrarily by the OP. No stream may have a StreamID of zero. Instead, RELAY cells that affect the entire circuit rather than a particular stream use a StreamID of zero; they are marked in the table above as "control" cells. (Sendme cells are marked as "sometimes control" because they can include a StreamID or not depending on their purpose; see Flow control.)

The 'Length' field of a relay cell contains the number of bytes in the relay payload which contain real payload data. The remainder of the unencrypted payload is padded with padding bytes. Implementations handle padding bytes of unencrypted relay cells as they do padding bytes for other cell types; see Cell Packet format.

The 'Padding' field is used to make relay cell contents unpredictable, to avoid certain attacks (see proposal 289 for rationale). Implementations SHOULD fill this field with four zero-valued bytes, followed by as many random bytes as will fit. (If there are fewer than 4 bytes for padding, then they should all be filled with zero.)
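
A minimal Python sketch of this padding rule is shown below. The function name is hypothetical, and `os.urandom` is used here only as a convenient source of random bytes; the spec text above does not name a particular generator.

```python
import os


def make_padding(pad_len):
    """Return `pad_len` padding bytes: up to four zeros, then random bytes."""
    zeros = min(pad_len, 4)
    # With fewer than 4 bytes available, everything is zero and nothing
    # random is appended (os.urandom(0) returns an empty byte string).
    return bytes(zeros) + os.urandom(pad_len - zeros)
```

With `pad_len` equal to PAYLOAD_LEN - 11 - Length, the result fills the remainder of the relay payload exactly.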

Implementations MUST NOT rely on the contents of the 'Padding' field.

If the RELAY cell is recognized but the relay command is not understood, the cell must be dropped and ignored. Its contents still count with respect to the digests and flow control windows, though.

Calculating the 'Digest' field

The 'Digest' field is used to check whether a cell has been fully decrypted, that is, whether all onion layers have been removed. The 'Recognized' field alone is not sufficient for this, as outlined above.

When ENCRYPTING a RELAY cell, an implementation does the following:

# Encode the cell in binary (recognized and digest set to zero)
tmp = cmd + [0, 0] + stream_id + [0, 0, 0, 0] + length + data + padding

# Update the digest with the encoded data
digest_state = hash_update(digest_state, tmp)
digest = hash_calculate(digest_state)

# The encoded data is the same as above with the digest field not being
# zero anymore
encoded = cmd + [0, 0] + stream_id + digest[0..4] + length + data + padding

# Now we can encrypt the cell by adding the onion layers ...

When DECRYPTING a RELAY cell, an implementation does the following:

decrypted = decrypt(cell)

# Replace the digest field in decrypted by zeros
tmp = decrypted[0..5] + [0, 0, 0, 0] + decrypted[9..]

# Update the running digest with the decrypted data, taken with its
# digest field set to zero
digest_state = hash_update(digest_state, tmp)
digest = hash_calculate(digest_state)

if digest[0..4] == decrypted[5..9]
  # The cell has been fully decrypted ...

Note the caveat: only the binary data with the digest bytes set to zero is fed into the running digest. The final plaintext cells (with the digest field set to their actual value) are not included in the running digest.
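
To make this caveat concrete, the Python sketch below runs two cells through a sender and two receivers, leaving out the onion encryption layers entirely. All names are hypothetical, SHA-1 and the seed bytes are placeholders chosen for illustration, and PAYLOAD_LEN is assumed to be 509. One receiver hashes the zero-digest form as described; the other deliberately hashes the final form to show the resulting desynchronization.

```python
import hashlib


def seal(zeroed, state):
    """Sender: hash the zero-digest form, then fill in the digest field."""
    state.update(zeroed)
    return zeroed[:5] + state.digest()[:4] + zeroed[9:]


def opens(payload, state, hash_final_form=False):
    """Receiver: True if the digest matches; advances `state` on a match."""
    zeroed = payload[:5] + bytes(4) + payload[9:]
    trial = state.copy()
    trial.update(zeroed)
    if trial.digest()[:4] != payload[5:9]:
        return False
    # Correct behaviour hashes `zeroed`. The flag reproduces the mistake the
    # caveat warns about: hashing the cell with its real digest value.
    state.update(payload if hash_final_form else zeroed)
    return True


def cell(data):
    """Zero-digest RELAY_DATA payload on stream 1, zero padding (illustrative)."""
    header = (bytes([2]) + bytes(2) + (1).to_bytes(2, "big")
              + bytes(4) + len(data).to_bytes(2, "big"))
    return header + data + bytes(509 - 11 - len(data))


sender = hashlib.sha1(b"placeholder seed")  # stands in for the Df seed
good = sender.copy()                        # receiver following the rule
bad = sender.copy()                         # receiver hashing the final form

c1 = seal(cell(b"first"), sender)
c2 = seal(cell(b"second"), sender)

results_good = [opens(c1, good), opens(c2, good)]
results_bad = [opens(c1, bad, True), opens(c2, bad, True)]
```

The receiver that follows the rule accepts both cells. The other one accepts the first cell but is expected to reject the second, because its running digest has diverged from the sender's.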