Relay cells

Within a circuit, the OP and the end node use the contents of relay cells to tunnel end-to-end commands and TCP connections ("Streams") across circuits. End-to-end commands can be initiated by either edge; streams are initiated by the OP.

End nodes that accept streams may be:

  • exit relays (RELAY_BEGIN, anonymous),
  • directory servers (RELAY_BEGIN_DIR, anonymous or non-anonymous),
  • onion services (RELAY_BEGIN, anonymous via a rendezvous point).

The body of each unencrypted relay cell consists of an enveloped relay message, encoded as follows:

| Field | Size |
| --- | --- |
| Relay command | 1 byte |
| 'Recognized' | 2 bytes |
| StreamID | 2 bytes |
| Digest | 4 bytes |
| Length | 2 bytes |
| Data | Length bytes |
| Padding | CELL_BODY_LEN - 11 - Length bytes |
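The layout above can be sketched as a pair of encode/decode helpers. This is a minimal illustration, not a reference implementation: the function names are hypothetical, fields are assumed to be in network byte order, and CELL_BODY_LEN is assumed to be 509 (its value is defined elsewhere in the specification).

```python
import struct

CELL_BODY_LEN = 509      # assumption: defined elsewhere in the specification
RELAY_HEADER_LEN = 11    # 1 + 2 + 2 + 4 + 2 bytes, per the table above

# Command, 'Recognized', StreamID, Digest, Length (assumed network byte order).
HEADER = struct.Struct("!BHH4sH")


def pack_relay_body(command, stream_id, data, digest=b"\x00" * 4, padding=None):
    """Encode an unencrypted relay cell body (hypothetical helper)."""
    max_data = CELL_BODY_LEN - RELAY_HEADER_LEN
    if len(data) > max_data:
        raise ValueError("data does not fit in one relay cell")
    pad_len = max_data - len(data)
    if padding is None:
        padding = b"\x00" * pad_len  # placeholder padding for this sketch
    if len(padding) != pad_len:
        raise ValueError("padding has the wrong length")
    # 'Recognized' is always zero in the unencrypted form.
    return HEADER.pack(command, 0, stream_id, digest, len(data)) + data + padding


def unpack_relay_body(body):
    """Split a decrypted body into (command, recognized, stream_id, digest, data)."""
    if len(body) != CELL_BODY_LEN:
        raise ValueError("unexpected body length")
    command, recognized, stream_id, digest, length = HEADER.unpack_from(body)
    if length > CELL_BODY_LEN - RELAY_HEADER_LEN:
        raise ValueError("Length field exceeds the available space")
    data = body[RELAY_HEADER_LEN:RELAY_HEADER_LEN + length]
    return command, recognized, stream_id, digest, data
```

For example, `pack_relay_body(2, 7, b"hello")` yields a 509-byte body whose Length field is 5 and whose remaining 493 bytes are padding.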

TODO: When we implement prop340, we should clarify which parts of the above are about the relay cell, and which are the enveloped message.

The relay commands are:

| Command | Identifier | Type | Description |
| --- | --- | --- | --- |
| Core protocol | | | |
| 1 | BEGIN | F | Open a stream |
| 2 | DATA | F/B | Transmit data |
| 3 | END | F/B | Close a stream |
| 4 | CONNECTED | B | Stream has successfully opened |
| 5 | SENDME | F/B, C? | Acknowledge traffic |
| 6 | EXTEND | F, C | Extend a circuit with TAP (obsolete) |
| 7 | EXTENDED | B, C | Finish extending a circuit with TAP (obsolete) |
| 8 | TRUNCATE | F, C | Remove nodes from a circuit (unused) |
| 9 | TRUNCATED | B, C | Report circuit truncation (unused) |
| 10 | DROP | F/B, C | Long-range padding |
| 11 | RESOLVE | F | Hostname lookup |
| 12 | RESOLVED | B | Hostname lookup reply |
| 13 | BEGIN_DIR | F | Open stream to directory cache |
| 14 | EXTEND2 | F, C | Extend a circuit |
| 15 | EXTENDED2 | B, C | Finish extending a circuit |
| 16..18 | Reserved | | For UDP; see prop339. |
| Conflux | | | |
| 19 | CONFLUX_LINK | F, C | Link circuits into a bundle |
| 20 | CONFLUX_LINKED | B, C | Acknowledge link request |
| 21 | CONFLUX_LINKED_ACK | F, C | Acknowledge CONFLUX_LINKED message (for timing) |
| 22 | CONFLUX_SWITCH | F/B, C | Switch between circuits in a bundle |
| Onion services | | | |
| 32 | ESTABLISH_INTRO | F, C | Create introduction point |
| 33 | ESTABLISH_RENDEZVOUS | F, C | Create rendezvous point |
| 34 | INTRODUCE1 | F, C | Introduction request (to intro point) |
| 35 | INTRODUCE2 | B, C | Introduction request (to service) |
| 36 | RENDEZVOUS1 | F, C | Rendezvous request (to rendezvous point) |
| 37 | RENDEZVOUS2 | B, C | Rendezvous request (to client) |
| 38 | INTRO_ESTABLISHED | B, C | Acknowledge ESTABLISH_INTRO |
| 39 | RENDEZVOUS_ESTABLISHED | B, C | Acknowledge ESTABLISH_RENDEZVOUS |
| 40 | INTRODUCE_ACK | B, C | Acknowledge INTRODUCE1 |
| Circuit padding | | | |
| 41 | PADDING_NEGOTIATE | F, C | Negotiate circuit padding |
| 42 | PADDING_NEGOTIATED | B, C | Negotiate circuit padding |
| Flow control | | | |
| 43 | XON | F/B | Stream-level flow control |
| 44 | XOFF | F/B | Stream-level flow control |

  • F (Forward): Must only be sent by the originator of the circuit.
  • B (Backward): Must only be sent by other nodes in the circuit back towards the originator.
  • F/B (Forward or backward): May be sent in either direction.
  • C (Control): Must have a zero-valued stream ID. (Other commands must have a nonzero stream ID.)
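The stream ID rule for "C" commands can be sketched as a small check. This is an illustration only: the command sets are transcribed from the table above, the function name is hypothetical, and how an implementation reacts to a violation is outside the scope of this sketch.

```python
# Command numbers transcribed from the table above.
CONTROL_COMMANDS = (
    {6, 7, 8, 9, 10, 14, 15}      # core protocol commands marked "C"
    | set(range(19, 23))          # conflux
    | set(range(32, 41))          # onion services
    | {41, 42}                    # circuit padding
)
STREAM_COMMANDS = {1, 2, 3, 4, 11, 12, 13, 43, 44}
SENDME = 5                        # marked "C?": stream ID may be zero or nonzero


def stream_id_is_valid(command, stream_id):
    """Check the stream ID rule for a known relay command (hypothetical helper)."""
    if command == SENDME:
        return True
    if command in CONTROL_COMMANDS:
        return stream_id == 0
    if command in STREAM_COMMANDS:
        return stream_id != 0
    raise ValueError("command is not in the table")
```

For example, an EXTEND2 (14) message with a stream ID of 5 fails the check, while a DATA (2) message with the same stream ID passes.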

The 'recognized' field is used as a simple indication that the cell is still encrypted. It is an optimization to avoid calculating expensive digests for every cell. When sending cells, the unencrypted 'recognized' MUST be set to zero.

When receiving and decrypting cells, the 'recognized' field will always be zero if we're the endpoint that the cell is destined for. For cells that we should relay, the 'recognized' field will usually be nonzero, but will accidentally be zero with probability 2^-16.

When handling a relay cell, if the 'recognized' field in a decrypted relay cell is zero, the 'digest' field is computed as the first four bytes of the running digest of all the bytes that have been destined for this hop of the circuit or originated from this hop of the circuit, seeded from Df or Db respectively (obtained in Setting circuit keys), and including this relay cell's entire body (taken with the digest field set to zero). Note that these digests do include the padding bytes at the end of the cell, not only those up to 'Length'. If the digest is correct, the cell is considered "recognized" for the purposes of decryption (see Routing relay cells).

(The digest does not include any bytes from relay cells that do not start or end at this hop of the circuit. That is, it does not include forwarded data. Therefore if 'recognized' is zero but the digest does not match, the running digest at that node should not be updated, and the cell should be forwarded on.)
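The recognition check described above can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-1 is used as the running hash (the actual hash depends on the circuit type), the running digest is a `hashlib` object already seeded from Df or Db, and `try_recognize` is a hypothetical name. The key point is that the running digest is only advanced when the digest matches.

```python
import hashlib


def try_recognize(body, running):
    """Return (recognized, running_digest) for a decrypted relay cell body.

    `running` is a hashlib object seeded from Df or Db (assumption of this
    sketch). It is never modified; on success a new, advanced object is
    returned, otherwise the original object is returned unchanged.
    """
    if body[1:3] != b"\x00\x00":
        return False, running          # 'recognized' is nonzero: still encrypted

    # Body with the 4-byte digest field (offsets 5..9) zeroed out.
    zeroed = body[:5] + b"\x00" * 4 + body[9:]

    candidate = running.copy()         # work on a copy so a mismatch has no effect
    candidate.update(zeroed)
    if candidate.digest()[:4] == body[5:9]:
        return True, candidate         # commit: this cell is for this hop
    return False, running              # accidental zero: forward the cell onward
```

Working on a copy is one way to satisfy the requirement that a mismatching cell leaves the running digest untouched; an implementation could equally save and restore the state.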

All relay messages pertaining to the same tunneled stream have the same stream ID. StreamIDs are chosen arbitrarily by the OP. No stream may have a StreamID of zero. Rather, relay messages that affect the entire circuit rather than a particular stream use a StreamID of zero -- they are marked in the table above as "C" ("control") style cells. (Sendme cells are marked as "sometimes control" because they can include a StreamID or not depending on their purpose -- see Flow control.)

The 'Length' field of a relay cell contains the number of bytes in the relay cell's body which contain the body of the message. The remainder of the unencrypted relay cell's body is padded with padding bytes. Implementations handle padding bytes of unencrypted relay cells as they do padding bytes for other cell types; see Cell Packet format.

The 'Padding' field is used to make relay cell contents unpredictable, to avoid certain attacks (see proposal 289 for rationale). Implementations SHOULD fill this field with four zero-valued bytes, followed by as many random bytes as will fit. (If there are fewer than 4 bytes for padding, then they should all be filled with zero.)

Implementations MUST NOT rely on the contents of the 'Padding' field.
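The recommended padding fill can be sketched in a few lines. The function name is hypothetical, and `os.urandom` stands in for whatever random source an implementation uses.

```python
import os


def make_padding(pad_len):
    """Build the 'Padding' field: up to four zero bytes, then random bytes."""
    if pad_len < 0:
        raise ValueError("padding length cannot be negative")
    zeros = min(pad_len, 4)                  # fewer than 4 bytes: all zero
    return b"\x00" * zeros + os.urandom(pad_len - zeros)
```

Here `pad_len` would be CELL_BODY_LEN - 11 - Length, per the field table above.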

If the relay cell is recognized but the relay command is not understood, the cell must be dropped and ignored. Its contents still count with respect to the digests and flow control windows, though.

Calculating the 'Digest' field

The 'Digest' field is used to check whether a cell has been fully decrypted, that is, whether all onion layers have been removed. A single field, namely 'Recognized', is not sufficient on its own, as outlined above.

When ENCRYPTING a relay cell, an implementation does the following:

# Encode the cell in binary (recognized and digest set to zero)
tmp = cmd + [0, 0] + stream_id + [0, 0, 0, 0] + length + data + padding

# Update the digest with the encoded data
digest_state = hash_update(digest_state, tmp)
digest = hash_calculate(digest_state)

# The encoded data is the same as above with the digest field not being
# zero anymore
encoded = cmd + [0, 0] + stream_id + digest[0..4] + length + data +
          padding

# Now we can encrypt the cell by adding the onion layers ...

When DECRYPTING a relay cell, an implementation does the following:

decrypted = decrypt(cell)

# Replace the digest field in decrypted by zeros
tmp = decrypted[0..5] + [0, 0, 0, 0] + decrypted[9..]

# Update the running digest with the decrypted data, taken with its
# digest field set to zero
digest_state = hash_update(digest_state, tmp)
digest = hash_calculate(digest_state)

if digest[0..4] == decrypted[5..9]
  # The cell has been fully decrypted ...

The caveat is that only the binary data with the digest bytes set to zero is taken into account when calculating the running digest. The final plaintext cells (with the digest field set to its actual value) are not fed into the running digest.
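The sending side of the procedure above can be sketched as follows, to make the caveat concrete. This is an illustration under assumptions: SHA-1 is used as the running hash, the seed value is a placeholder for Df or Db, and `set_digest` is a hypothetical name.

```python
import hashlib


def set_digest(zeroed_body, running):
    """Fill in the 'Digest' field of a body whose digest field is still zero.

    `running` is a hashlib object (assumed SHA-1, seeded from Df or Db) and is
    advanced in place with the zeroed body only.
    """
    if zeroed_body[5:9] != b"\x00" * 4:
        raise ValueError("digest field must be zero before hashing")
    running.update(zeroed_body)
    # digest() reports the current value without ending the running hash.
    return zeroed_body[:5] + running.digest()[:4] + zeroed_body[9:]
```

After two cells have been sent this way, the running digest covers the seed followed by both zeroed bodies. The four digest bytes written into each outgoing cell never enter the hash.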