Filename: 351-socks-auth-extensions.md
Title: Making SOCKS5 authentication extensions extensible
Author: Nick Mathewson
Created: 9 September 2024
Status: Closed
Implemented-In: Arti 1.2.8, Tor 0.4.9.1-alpha

Introduction

Currently, Tor implementations use the SOCKS5 username and password fields to pass parameters for stream isolation. (See the IsolateSocksAuth flag in the C tor manual, and the "Stream isolation" section (forthcoming) in our socks extensions spec.)

Tor implementations also support SOCKS4 and SOCKS4a, but they are not affected by this proposal.

The C Tor implementation also supports other proxy types besides SOCKS. They are not affected by this proposal because they either have other means to extend their protocols (as with HTTP headers in HTTP CONNECT) or no means to pass extension information (as for DNS proxies, iptables transparent proxies, etc).

Until now, the rules for interpreting these fields have been simple: all values are permitted, and streams with unequal values may not share a circuit.

But in order to integrate SOCKS connections into Arti's RPC protocol, we additionally want the ability to send RPC "Object IDs" [1] in these fields. To do this, we will need some way to tell when we have received an object ID, when we have received an isolation parameter, and to avoid confusing them with one another.

Note that some confusion will necessarily remain possible: Since current Tor clients are allowed to send any value as SOCKS username and password, any value we specify here will be one which a client in principle might have sent under the old protocol.

Additionally, since we are adding complexity to the interpretation of these fields, it's possible we'll want to change this complexity in the future. To do this, we'll want a format typing scheme to permit changes.

Proposal

(If accepted, the following can be incorporated into our socks extensions spec.)

We support a series of extensions in SOCKS5 Username/Passwords. Currently, these extensions can encode a stream isolation parameter (used to indicate that streams may share a circuit) and an RPC object ID (used to associate the stream with an entity in an RPC session).

These extensions are in use whenever the SOCKS5 Username begins with the 8-byte "magic" sequence [3c 74 6f 72 53 30 58 3e]. (This is the ASCII encoding of <torS0X>).

If the SOCKS5 Username/Password fields are present but the Username does not begin with this byte sequence, it indicates legacy isolation. New client implementations SHOULD NOT use legacy isolation. A SocksPort may be configured to reject legacy isolation.

When these extensions are in use, the next byte of the username after the "magic" sequence indicates a format type. Any implementation receiving an unrecognized or missing format type MUST reject the socks request.

  • When the format type is [30] (the ASCII encoding of 0), we interpret the rest of the Username field and the Password field as follows:

    The remainder of the Username field must be empty.

    The Password field is the stream isolation parameter. If it is empty, the stream isolation parameter is an empty string.

  • When the format type is [31] (the ASCII encoding of 1), we interpret the rest of the Username field and the Password field as follows:

    The remainder of the Username field encodes an RPC Object ID. It must not be empty.

    The Password field is the stream isolation parameter. If it is empty, the stream isolation parameter is an empty string.

All implementations SHOULD implement format type [30].

Tor began implementing format type [30] in 0.4.9.1-alpha.

Examples:

  • Username=hello; Password=world. These are legacy parameters, since hello does not begin with <torS0X>.

  • Username=<torS0X>0; Password=123. There is no associated object ID. The isolation string is 123.

  • Username=<torS0X>0; Password is empty. There is no associated object ID. The isolation string is empty.

  • Username=<torS0X>1abc; Password=123. The object ID is abc. The isolation string is 123.

  • Username=<torS0X>1abc; Password is empty. The object ID is abc. The isolation string is empty.

  • Username=<torS0X>0abc; Password=123. Error: The format type is 0 but there is extra data in the username. Implementations must reject this.

  • Username=<torS0X>1; Password=123. Error: The format type is 1 but the object ID is empty. Implementations must reject this.

  • Username=<torS0X>9; Password=123. Error: The format type 9 is unspecified. Implementations must reject this.

  • Username=<torS0X>; Password=123. Error: There is no format type. Implementations must reject this.
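
The parsing rules and examples above can be sketched as follows. This is a minimal illustration rather than code from Arti or C Tor: the names ParsedAuth, SocksAuthError, and parse_socks5_auth are hypothetical, and the sketch assumes the SOCKS5 Username/Password fields have already been extracted as byte strings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAGIC = b"<torS0X>"  # bytes 3c 74 6f 72 53 30 58 3e


class SocksAuthError(ValueError):
    """Raised when the SOCKS request must be rejected."""


@dataclass(frozen=True)
class ParsedAuth:
    # Hypothetical result type, not taken from any Tor implementation.
    legacy: Optional[Tuple[bytes, bytes]]  # (username, password) for legacy isolation
    format_type: Optional[int]             # 0x30 or 0x31 when extensions are in use
    object_id: Optional[bytes]             # RPC object ID (format type [31] only)
    isolation: Optional[bytes]             # stream isolation parameter (extensions only)


def parse_socks5_auth(username: bytes, password: bytes) -> ParsedAuth:
    if not username.startswith(MAGIC):
        # No magic prefix: legacy isolation, any values are permitted.
        return ParsedAuth((username, password), None, None, None)

    rest = username[len(MAGIC):]
    if not rest:
        raise SocksAuthError("missing format type")

    format_type, remainder = rest[0], rest[1:]
    if format_type == 0x30:  # ASCII "0"
        if remainder:
            raise SocksAuthError("format type 0 permits no further username data")
        return ParsedAuth(None, format_type, None, password)
    if format_type == 0x31:  # ASCII "1"
        if not remainder:
            raise SocksAuthError("format type 1 requires a non-empty object ID")
        return ParsedAuth(None, format_type, remainder, password)
    raise SocksAuthError(f"unrecognized format type {format_type:#04x}")


print(parse_socks5_auth(b"<torS0X>1abc", b"123").object_id)  # b'abc'
```

A caller would treat any SocksAuthError as a reason to reject the SOCKS request. A SocksPort configured to reject legacy isolation would additionally refuse results whose legacy field is set.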

Stream isolation

This replaces the corresponding part of the "Stream isolation" section (forthcoming) in our socks extensions spec.

Two streams are considered to have the same SOCKS authentication values if and only if one of the following is true:

  • They are both SOCKS4 or SOCKS4a, with the same user "ID" string.
  • They are both SOCKS5, with no authentication.
  • They are both SOCKS5 with USERNAME/PASSWORD authentication, using legacy isolation parameters, and they have identical usernames and identical passwords.
  • They are both SOCKS5 using the extensions above, with the same stream isolation parameter.

Additionally, two streams with different format types (e.g. [30] and [31]) may not share a circuit.
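
One way to picture these rules is as a comparison key: two streams pass this particular check only when their keys are equal. The sketch below is illustrative only. The function name and the protocol labels are hypothetical, it assumes the username has already been validated against the format rules above, and it treats SOCKS4 and SOCKS4a as a single family, which is one reading of the first rule.

```python
from typing import Optional, Tuple

MAGIC = b"<torS0X>"


def socks_auth_key(protocol: str,
                   username: Optional[bytes] = None,
                   password: Optional[bytes] = None) -> Tuple:
    """Return a hashable key; equal keys mean equal SOCKS authentication
    values and equal format types."""
    if protocol in ("socks4", "socks4a"):
        # For SOCKS4/4a, `username` carries the user "ID" string.
        return ("socks4", username)
    if protocol != "socks5":
        raise ValueError("unknown protocol label")
    if username is None:
        return ("socks5-noauth",)
    if not username.startswith(MAGIC):
        # Legacy isolation: both fields must match exactly.
        return ("socks5-legacy", username, password)
    # Extensions in use: only the format type and the isolation parameter
    # matter here. The RPC object ID is not part of the key.
    format_type = username[len(MAGIC)]
    return ("socks5-ext", format_type, password)


same = socks_auth_key("socks5", b"<torS0X>1abc", b"123") == \
       socks_auth_key("socks5", b"<torS0X>1xyz", b"123")
print(same)  # True
```

Including the format type in the key covers the rule that streams with different format types may not share a circuit, even when their isolation parameters match.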

For more information on stream isolation, including other factors that can prevent streams from sharing circuits, see proposal 171.

A further extension for integration with Arti SOCKS

We should add the following to a specification, though it is not clear whether it goes in the Arti RPC spec or in the socks extensions spec.

In some cases, the RPC Object ID may denote an object that already includes information about its intended stream isolation. In such cases, the stream isolation MUST be blank. Implementations MUST reject non-blank stream isolation in such cases.

In some cases, the RPC object ID may denote an object that already includes information about its intended destination address and port. In such cases, the destination address MUST be 0.0.0.0 or :: (encoded either as an IPv4 address, an IPv6 address, or a hostname) and the destination port MUST be 0. Implementations MUST reject other addresses in such cases.
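
The two checks above might look like the following sketch. The flags object_has_isolation and object_has_target are hypothetical stand-ins for whatever the RPC layer knows about the object, and the destination is assumed to be available as a string whether it arrived as an IPv4 address, an IPv6 address, or a hostname.

```python
import ipaddress

# The only destination addresses permitted when the object supplies its own target.
_UNSPECIFIED = (ipaddress.IPv4Address("0.0.0.0"), ipaddress.IPv6Address("::"))


def is_placeholder_target(host: str, port: int) -> bool:
    """True if (host, port) is 0.0.0.0:0 or [::]:0."""
    if port != 0:
        return False
    try:
        return ipaddress.ip_address(host) in _UNSPECIFIED
    except ValueError:
        # Not an IP literal at all (for example, a real hostname).
        return False


def check_rpc_object_request(object_has_isolation: bool,
                             object_has_target: bool,
                             isolation: bytes,
                             host: str,
                             port: int) -> None:
    """Raise ValueError if the request must be rejected."""
    if object_has_isolation and isolation != b"":
        raise ValueError("object carries its own isolation; parameter must be blank")
    if object_has_target and not is_placeholder_target(host, port):
        raise ValueError("object carries its own destination; use 0.0.0.0 or :: with port 0")


check_rpc_object_request(True, True, b"", "::", 0)  # accepted, no exception
```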


(Here the specifications end. The rest of this proposal is discussion.)

Design considerations

Our use of SOCKS5 Username/Passwords here (as opposed to some other, new authentication type) is based on the observation that many existing SOCKS5 implementations support Username/Password, but comparatively few support arbitrary plug-in authentication.

The magic "<torS0X>" prefix is chosen to be 8 bytes long so that existing client implementations that generate random strings will not often generate it by mistake. (A client's chances of doing so are one in 2^64 every time it generates a random username of at least 8 bytes.)
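
The collision estimate can be checked with a short calculation. It assumes each of the first 8 bytes of the username is drawn independently and uniformly from all 256 byte values. Clients that draw from a smaller alphabet would have different odds, and zero odds if the alphabet lacks "<" or ">".

```python
from fractions import Fraction

MAGIC = b"<torS0X>"

# Probability that one uniformly random byte equals a given byte.
p_byte = Fraction(1, 256)

# All 8 leading bytes must match independently.
p_match = p_byte ** len(MAGIC)

print(p_match == Fraction(1, 2**64))  # True
```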

The format type code is chosen to be an ASCII 0 or 1 rather than a raw 0 or 1 byte, for compatibility with existing SOCKS5 client implementations that do not support non-ASCII username/password values.

C Tor migration

When this proposal is accepted, we should configure C tor to implement it as follows:

  • Reject any SOCKS5 Username starting with <torS0X> unless it is exactly <torS0X>0.

This behavior is sufficient to give correct isolation behavior, to reject any connection including an RPC object ID, and to reject any as-yet-unspecified isolation mechanisms.
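
As a rough illustration, the migration rule reduces to a single predicate. The sketch below is in Python for brevity and is not taken from the C Tor source. The function name is hypothetical, and the optional SocksPort setting that rejects legacy isolation is not modeled.

```python
MAGIC = b"<torS0X>"


def username_is_acceptable(username: bytes) -> bool:
    """Apply the migration rule to a SOCKS5 Username."""
    if not username.startswith(MAGIC):
        # Legacy isolation: handled exactly as before.
        return True
    # Only format type [30] with no trailing data is understood.
    return username == MAGIC + b"0"


print(username_is_acceptable(b"<torS0X>1abc"))  # False
```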

[1] An ObjectId is used in the Arti RPC protocol to associate a SOCKS request with some existing Client object, or with a preexisting DataStream.