Proposals for changes in the Tor protocols

This "book" is a list of proposals that people have made over the years (dating back to 2007) for protocol changes in Tor. Some of these proposals have already been implemented or rejected; others are under active discussion.

If you're looking for a specific proposal, you can find it by filename in the summary bar on the left, or at this index. You can also see a list of Tor proposals by their status at BY_STATUS.md.

For information on creating a new proposal, you would ideally look at 001-process.txt. That file is a bit out of date, though, so it is probably best to contact the developers directly.

Tor proposals by number

Here we have a set of proposals for changes to the Tor protocol. Some of these proposals are implemented; some are works in progress; and some will never be implemented.

Below is a list of proposals sorted by their proposal number. See BY_STATUS.md for a list of proposals sorted by status.

Tor proposals by status

Here we have a set of proposals for changes to the Tor protocol. Some of these proposals are implemented; some are works in progress; and some will never be implemented.

Below is a list of proposals sorted by status. See BY_INDEX.md for a list of proposals sorted by number.

Active proposals by status

OPEN proposals: under discussion

These are proposals that we think are likely to be complete, and ripe for discussion.

ACCEPTED proposals: slated for implementation

These are the proposals that we agree we'd like to implement. They might or might not have a specific timeframe planned for their implementation.

FINISHED proposals: implemented, specs not merged

These proposals are implemented in some version of Tor; the proposals themselves still need to be merged into the specifications proper.

META proposals: about the proposal process

These proposals describe ongoing policies and changes to the proposals process.

INFORMATIONAL proposals: not actually specifications

These proposals describe a process or project, but aren't actually proposed changes in the Tor specifications.

Preliminary proposals

DRAFT proposals: incomplete works

These proposals have been marked as a draft by their author or the editors, indicating that they aren't yet in a complete form. They're still open for discussion.

NEEDS-REVISION proposals: ideas that we can't implement as-is

These proposals have some promise, but we can't implement them without certain changes.

NEEDS-RESEARCH proposals: blocking on research

These proposals are interesting ideas, but more research would need to happen before we can know whether to implement them, or before certain details can be filled in.

(There are no proposals in this category)

Inactive proposals by status

CLOSED proposals: implemented and specified

These proposals have been implemented in some version of Tor, and the changes from the proposals have been merged into the specifications as necessary.

RESERVE proposals: saving for later

These proposals aren't anything we plan to implement soon, but for one reason or another we think they might be a good idea in the future. We're keeping them around as a reference in case we someday confront the problems that they try to solve.

SUPERSEDED proposals: replaced by something else

These proposals were obsoleted by a later proposal before they were implemented.

DEAD, REJECTED, OBSOLETE proposals: not in our plans

These proposals are not on track for discussion or implementation. Either discussion has stalled (the proposal is DEAD), the proposal has been considered and not adopted (the proposal is REJECTED), or the proposal addresses an issue or a solution that is no longer relevant (the proposal is OBSOLETE).

Filename: 000-index.txt
Title: Index of Tor Proposals
Author: Nick Mathewson
Created: 26-Jan-2007
Status: Meta

Overview:

   This document provides an index to Tor proposals.

   This is an informational document.

   Everything in this document below the line of '=' signs is automatically
   generated by reindex.py; do not edit by hand.
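
   As an illustration of how the two generated lists relate, here is a
   minimal sketch that reads "by number" entries and regroups them by
   status. It is not reindex.py: the line shape (three-digit number,
   title, status in square brackets) is inferred from the entries in this
   index, and the names ENTRY, parse_by_number, and group_by_status are
   hypothetical.

```python
import re
from collections import defaultdict

# Assumed entry shape, inferred from this index (not taken from reindex.py):
# three-digit number, whitespace, title, then the status in square brackets.
ENTRY = re.compile(r"^(\d{3})\s+(.*\S)\s+\[([A-Z-]+)\]\s*$")


def parse_by_number(lines):
    """Yield (number, title, status) for each line that looks like an entry."""
    for line in lines:
        m = ENTRY.match(line)
        if m:
            yield m.group(1), m.group(2), m.group(3)


def group_by_status(entries):
    """Map each status to its (number, title) pairs, preserving input order."""
    groups = defaultdict(list)
    for number, title, status in entries:
        groups[status].append((number, title))
    return dict(groups)


# A few entries copied from the "Proposals by number" list.
sample = [
    "000  Index of Tor Proposals [META]",
    "212  Increase Acceptable Consensus Age [NEEDS-REVISION]",
    "239  Consensus Hash Chaining [OPEN]",
    "240  Early signing key revocation for directory authorities [OPEN]",
]

grouped = group_by_status(parse_by_number(sample))
for status, items in grouped.items():
    print(f" {status}:")
    for number, title in items:
        print(f"   {number}  {title}")
```

   The sketch drops the bracketed version notes (such as "[in 0.2.0.x]")
   that the "by status" list carries, since those do not appear in the
   "by number" entries.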

============================================================
Proposals by number:

000  Index of Tor Proposals [META]
001  The Tor Proposal Process [META]
098  Proposals that should be written [OBSOLETE]
099  Miscellaneous proposals [OBSOLETE]
100  Tor Unreliable Datagram Extension Proposal [DEAD]
101  Voting on the Tor Directory System [CLOSED]
102  Dropping "opt" from the directory format [CLOSED]
103  Splitting identity key from regularly used signing key [CLOSED]
104  Long and Short Router Descriptors [CLOSED]
105  Version negotiation for the Tor protocol [CLOSED]
106  Checking fewer things during TLS handshakes [CLOSED]
107  Uptime Sanity Checking [CLOSED]
108  Base "Stable" Flag on Mean Time Between Failures [CLOSED]
109  No more than one server per IP address [CLOSED]
110  Avoiding infinite length circuits [CLOSED]
111  Prioritizing local traffic over relayed traffic [CLOSED]
112  Bring Back Pathlen Coin Weight [SUPERSEDED]
113  Simplifying directory authority administration [SUPERSEDED]
114  Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
115  Two Hop Paths [DEAD]
116  Two hop paths from entry guards [DEAD]
117  IPv6 exits [CLOSED]
118  Advertising multiple ORPorts at once [SUPERSEDED]
119  New PROTOCOLINFO command for controllers [CLOSED]
120  Shutdown descriptors when Tor servers stop [DEAD]
121  Hidden Service Authentication [CLOSED]
122  Network status entries need a new Unnamed flag [CLOSED]
123  Naming authorities automatically create bindings [CLOSED]
124  Blocking resistant TLS certificate usage [SUPERSEDED]
125  Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
126  Getting GeoIP data and publishing usage summaries [CLOSED]
127  Relaying dirport requests to Tor download site / website [OBSOLETE]
128  Families of private bridges [DEAD]
129  Block Insecure Protocols by Default [CLOSED]
130  Version 2 Tor connection protocol [CLOSED]
131  Help users to verify they are using Tor [OBSOLETE]
132  A Tor Web Service For Verifying Correct Browser Configuration [OBSOLETE]
133  Incorporate Unreachable ORs into the Tor Network [RESERVE]
134  More robust consensus voting with diverse authority sets [REJECTED]
135  Simplify Configuration of Private Tor Networks [CLOSED]
136  Mass authority migration with legacy keys [CLOSED]
137  Keep controllers informed as Tor bootstraps [CLOSED]
138  Remove routers that are not Running from consensus documents [CLOSED]
139  Download consensus documents only when it will be trusted [CLOSED]
140  Provide diffs between consensuses [CLOSED]
141  Download server descriptors on demand [OBSOLETE]
142  Combine Introduction and Rendezvous Points [DEAD]
143  Improvements of Distributed Storage for Tor Hidden Service Descriptors [SUPERSEDED]
144  Increase the diversity of circuits by detecting nodes belonging the same provider [OBSOLETE]
145  Separate "suitable as a guard" from "suitable as a new guard" [SUPERSEDED]
146  Add new flag to reflect long-term stability [SUPERSEDED]
147  Eliminate the need for v2 directories in generating v3 directories [REJECTED]
148  Stream end reasons from the client side should be uniform [CLOSED]
149  Using data from NETINFO cells [SUPERSEDED]
150  Exclude Exit Nodes from a circuit [CLOSED]
151  Improving Tor Path Selection [CLOSED]
152  Optionally allow exit from single-hop circuits [CLOSED]
153  Automatic software update protocol [SUPERSEDED]
154  Automatic Software Update Protocol [SUPERSEDED]
155  Four Improvements of Hidden Service Performance [CLOSED]
156  Tracking blocked ports on the client side [SUPERSEDED]
157  Make certificate downloads specific [CLOSED]
158  Clients download consensus + microdescriptors [CLOSED]
159  Exit Scanning [INFORMATIONAL]
160  Authorities vote for bandwidth offsets in consensus [CLOSED]
161  Computing Bandwidth Adjustments [CLOSED]
162  Publish the consensus in multiple flavors [CLOSED]
163  Detecting whether a connection comes from a client [SUPERSEDED]
164  Reporting the status of server votes [OBSOLETE]
165  Easy migration for voting authority sets [REJECTED]
166  Including Network Statistics in Extra-Info Documents [CLOSED]
167  Vote on network parameters in consensus [CLOSED]
168  Reduce default circuit window [REJECTED]
169  Eliminate TLS renegotiation for the Tor connection handshake [SUPERSEDED]
170  Configuration options regarding circuit building [SUPERSEDED]
171  Separate streams across circuits by connection metadata [CLOSED]
172  GETINFO controller option for circuit information [RESERVE]
173  GETINFO Option Expansion [OBSOLETE]
174  Optimistic Data for Tor: Server Side [CLOSED]
175  Automatically promoting Tor clients to nodes [REJECTED]
176  Proposed version-3 link handshake for Tor [CLOSED]
177  Abstaining from votes on individual flags [RESERVE]
178  Require majority of authorities to vote for consensus parameters [CLOSED]
179  TLS certificate and parameter normalization [CLOSED]
180  Pluggable transports for circumvention [CLOSED]
181  Optimistic Data for Tor: Client Side [CLOSED]
182  Credit Bucket [OBSOLETE]
183  Refill Intervals [CLOSED]
184  Miscellaneous changes for a v3 Tor link protocol [CLOSED]
185  Directory caches without DirPort [SUPERSEDED]
186  Multiple addresses for one OR or bridge [CLOSED]
187  Reserve a cell type to allow client authorization [CLOSED]
188  Bridge Guards and other anti-enumeration defenses [RESERVE]
189  AUTHORIZE and AUTHORIZED cells [OBSOLETE]
190  Bridge Client Authorization Based on a Shared Secret [OBSOLETE]
191  Bridge Detection Resistance against MITM-capable Adversaries [OBSOLETE]
192  Automatically retrieve and store information about bridges [OBSOLETE]
193  Safe cookie authentication for Tor controllers [CLOSED]
194  Mnemonic .onion URLs [SUPERSEDED]
195  TLS certificate normalization for Tor 0.2.4.x [DEAD]
196  Extended ORPort and TransportControlPort [CLOSED]
197  Message-based Inter-Controller IPC Channel [REJECTED]
198  Restore semantics of TLS ClientHello [CLOSED]
199  Integration of BridgeFinder and BridgeFinderHelper [OBSOLETE]
200  Adding new, extensible CREATE, EXTEND, and related cells [CLOSED]
201  Make bridges report statistics on daily v3 network status requests [RESERVE]
202  Two improved relay encryption protocols for Tor cells [META]
203  Avoiding censorship by impersonating an HTTPS server [OBSOLETE]
204  Subdomain support for Hidden Service addresses [CLOSED]
205  Remove global client-side DNS caching [CLOSED]
206  Preconfigured directory sources for bootstrapping [CLOSED]
207  Directory guards [CLOSED]
208  IPv6 Exits Redux [CLOSED]
209  Tuning the Parameters for the Path Bias Defense [OBSOLETE]
210  Faster Headless Consensus Bootstrapping [SUPERSEDED]
211  Internal Mapaddress for Tor Configuration Testing [RESERVE]
212  Increase Acceptable Consensus Age [NEEDS-REVISION]
213  Remove stream-level sendmes from the design [DEAD]
214  Allow 4-byte circuit IDs in a new link protocol [CLOSED]
215  Let the minimum consensus method change with time [CLOSED]
216  Improved circuit-creation key exchange [CLOSED]
217  Tor Extended ORPort Authentication [CLOSED]
218  Controller events to better understand connection/circuit usage [CLOSED]
219  Support for full DNS and DNSSEC resolution in Tor [NEEDS-REVISION]
220  Migrate server identity keys to Ed25519 [CLOSED]
221  Stop using CREATE_FAST [CLOSED]
222  Stop sending client timestamps [CLOSED]
223  Ace: Improved circuit-creation key exchange [RESERVE]
224  Next-Generation Hidden Services in Tor [CLOSED]
225  Strawman proposal: commit-and-reveal shared rng [SUPERSEDED]
226  "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS" [RESERVE]
227  Include package fingerprints in consensus documents [CLOSED]
228  Cross-certifying identity keys with onion keys [CLOSED]
229  Further SOCKS5 extensions [REJECTED]
230  How to change RSA1024 relay identity keys [OBSOLETE]
231  Migrating authority RSA1024 identity keys [OBSOLETE]
232  Pluggable Transport through SOCKS proxy [CLOSED]
233  Making Tor2Web mode faster [REJECTED]
234  Adding remittance field to directory specification [REJECTED]
235  Stop assigning (and eventually supporting) the Named flag [CLOSED]
236  The move to a single guard node [CLOSED]
237  All relays are directory servers [CLOSED]
238  Better hidden service stats from Tor relays [CLOSED]
239  Consensus Hash Chaining [OPEN]
240  Early signing key revocation for directory authorities [OPEN]
241  Resisting guard-turnover attacks [REJECTED]
242  Better performance and usability for the MyFamily option [SUPERSEDED]
243  Give out HSDir flag only to relays with Stable flag [CLOSED]
244  Use RFC5705 Key Exporting in our AUTHENTICATE calls [CLOSED]
245  Deprecating and removing the TAP circuit extension protocol [SUPERSEDED]
246  Merging Hidden Service Directories and Introduction Points [REJECTED]
247  Defending Against Guard Discovery Attacks using Vanguards [SUPERSEDED]
248  Remove all RSA identity keys [NEEDS-REVISION]
249  Allow CREATE cells with >505 bytes of handshake data [SUPERSEDED]
250  Random Number Generation During Tor Voting [CLOSED]
251  Padding for netflow record resolution reduction [CLOSED]
252  Single Onion Services [SUPERSEDED]
253  Out of Band Circuit HMACs [DEAD]
254  Padding Negotiation [CLOSED]
255  Controller features to allow for load-balancing hidden services [RESERVE]
256  Key revocation for relays and authorities [RESERVE]
257  Refactoring authorities and making them more isolated from the net [META]
258  Denial-of-service resistance for directory authorities [DEAD]
259  New Guard Selection Behaviour [OBSOLETE]
260  Rendezvous Single Onion Services [FINISHED]
261  AEZ for relay cryptography [OBSOLETE]
262  Re-keying live circuits with new cryptographic material [RESERVE]
263  Request to change key exchange protocol for handshake v1.2 [OBSOLETE]
264  Putting version numbers on the Tor subprotocols [CLOSED]
265  Load Balancing with Overhead Parameters [OPEN]
266  Removing current obsolete clients from the Tor network [SUPERSEDED]
267  Tor Consensus Transparency [OPEN]
268  New Guard Selection Behaviour [OBSOLETE]
269  Transitionally secure hybrid handshakes [NEEDS-REVISION]
270  RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [OBSOLETE]
271  Another algorithm for guard selection [CLOSED]
272  Listed routers should be Valid, Running, and treated as such [CLOSED]
273  Exit relay pinning for web services [RESERVE]
274  Rotate onion keys less frequently [CLOSED]
275  Stop including meaningful "published" time in microdescriptor consensus [CLOSED]
276  Report bandwidth with lower granularity in consensus documents [DEAD]
277  Detect multiple relay instances running with same ID [OPEN]
278  Directory Compression Scheme Negotiation [CLOSED]
279  A Name System API for Tor Onion Services [NEEDS-REVISION]
280  Privacy-Preserving Statistics with Privcount in Tor [SUPERSEDED]
281  Downloading microdescriptors in bulk [RESERVE]
282  Remove "Named" and "Unnamed" handling from consensus voting [ACCEPTED]
283  Move IPv6 ORPorts from microdescriptors to the microdesc consensus [CLOSED]
284  Hidden Service v3 Control Port [CLOSED]
285  Directory documents should be standardized as UTF-8 [ACCEPTED]
286  Controller APIs for hibernation access on mobile [REJECTED]
287  Reduce circuit lifetime without overloading the network [OPEN]
288  Privacy-Preserving Statistics with Privcount in Tor (Shamir version) [RESERVE]
289  Authenticating sendme cells to mitigate bandwidth attacks [CLOSED]
290  Continuously update consensus methods [META]
291  The move to two guard nodes [FINISHED]
292  Mesh-based vanguards [CLOSED]
293  Other ways for relays to know when to publish [CLOSED]
294  TLS 1.3 Migration [DRAFT]
295  Using ADL for relay cryptography (solving the crypto-tagging attack) [OPEN]
296  Have Directory Authorities expose raw bandwidth list files [CLOSED]
297  Relaxing the protover-based shutdown rules [CLOSED]
298  Putting family lines in canonical form [CLOSED]
299  Preferring IPv4 or IPv6 based on IP Version Failure Count [SUPERSEDED]
300  Walking Onions: Scaling and Saving Bandwidth [INFORMATIONAL]
301  Don't include package fingerprints in consensus documents [CLOSED]
302  Hiding onion service clients using padding [CLOSED]
303  When and how to remove support for protocol versions [OPEN]
304  Extending SOCKS5 Onion Service Error Codes [CLOSED]
305  ESTABLISH_INTRO Cell DoS Defense Extension [CLOSED]
306  A Tor Implementation of IPv6 Happy Eyeballs [OPEN]
307  Onion Balance Support for Onion Service v3 [RESERVE]
308  Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography [SUPERSEDED]
309  Optimistic SOCKS Data [OPEN]
310  Towards load-balancing in Prop 271 [CLOSED]
311  Tor Relay IPv6 Reachability [ACCEPTED]
312  Tor Relay Automatic IPv6 Address Discovery [ACCEPTED]
313  Tor Relay IPv6 Statistics [ACCEPTED]
314  Allow Markdown for proposal format [CLOSED]
315  Updating the list of fields required in directory documents [CLOSED]
316  FlashFlow: A Secure Speed Test for Tor (Parent Proposal) [DRAFT]
317  Improve security aspects of DNS name resolution [NEEDS-REVISION]
318  Limit protover values to 0-63 [CLOSED]
319  RELAY_FRAGMENT cells [OBSOLETE]
320  Removing TAP usage from v2 onion services [REJECTED]
321  Better performance and usability for the MyFamily option (v2) [ACCEPTED]
322  Extending link specifiers to include the directory port [OPEN]
323  Specification for Walking Onions [OPEN]
324  RTT-based Congestion Control for Tor [FINISHED]
325  Packed relay cells: saving space on small commands [OBSOLETE]
326  The "tor-relay" Well-Known Resource Identifier [OPEN]
327  A First Take at PoW Over Introduction Circuits [CLOSED]
328  Make Relays Report When They Are Overloaded [CLOSED]
329  Overcoming Tor's Bottlenecks with Traffic Splitting [FINISHED]
330  Modernizing authority contact entries [OPEN]
331  Res tokens: Anonymous Credentials for Onion Service DoS Resilience [DRAFT]
332  Ntor protocol with extra data, version 3 [CLOSED]
333  Vanguards lite [CLOSED]
334  A Directory Authority Flag To Mark Relays As Middle-only [SUPERSEDED]
335  An authority-only design for MiddleOnly [CLOSED]
336  Randomized schedule for guard retries [CLOSED]
337  A simpler way to decide, "Is this guard usable?" [CLOSED]
338  Use an 8-byte timestamp in NETINFO cells [ACCEPTED]
339  UDP traffic over Tor [ACCEPTED]
340  Packed and fragmented relay messages [OPEN]
341  A better algorithm for out-of-sockets eviction [OPEN]
342  Decoupling hs_interval and SRV lifetime [DRAFT]
343  CAA Extensions for the Tor Rendezvous Specification [OPEN]
344  Prioritizing Protocol Information Leaks in Tor [OPEN]
345  Migrating the tor specifications to mdbook [CLOSED]
346  Clarifying and extending the use of protocol versioning [OPEN]
347  Domain separation for certificate signing keys [OPEN]
348  UDP Application Support in Tor [OPEN]
349  Client-Side Command Acceptance Validation [DRAFT]
350  A phased plan to remove TAP onion keys [ACCEPTED]


Proposals by status:

 DRAFT:
   294  TLS 1.3 Migration
   316  FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
   331  Res tokens: Anonymous Credentials for Onion Service DoS Resilience
   342  Decoupling hs_interval and SRV lifetime
   349  Client-Side Command Acceptance Validation
 NEEDS-REVISION:
   212  Increase Acceptable Consensus Age [for 0.2.4.x+]
   219  Support for full DNS and DNSSEC resolution in Tor [for 0.2.5.x]
   248  Remove all RSA identity keys
   269  Transitionally secure hybrid handshakes
   279  A Name System API for Tor Onion Services
   317  Improve security aspects of DNS name resolution
 OPEN:
   239  Consensus Hash Chaining
   240  Early signing key revocation for directory authorities
   265  Load Balancing with Overhead Parameters [for arti-dirauth]
   267  Tor Consensus Transparency
   277  Detect multiple relay instances running with same ID [for 0.3.??]
   287  Reduce circuit lifetime without overloading the network
   295  Using ADL for relay cryptography (solving the crypto-tagging attack)
   303  When and how to remove support for protocol versions
   306  A Tor Implementation of IPv6 Happy Eyeballs
   309  Optimistic SOCKS Data
   322  Extending link specifiers to include the directory port
   323  Specification for Walking Onions
   326  The "tor-relay" Well-Known Resource Identifier
   330  Modernizing authority contact entries
   340  Packed and fragmented relay messages
   341  A better algorithm for out-of-sockets eviction
   343  CAA Extensions for the Tor Rendezvous Specification
   344  Prioritizing Protocol Information Leaks in Tor
   346  Clarifying and extending the use of protocol versioning
   347  Domain separation for certificate signing keys
   348  UDP Application Support in Tor
 ACCEPTED:
   282  Remove "Named" and "Unnamed" handling from consensus voting [for arti-dirauth]
   285  Directory documents should be standardized as UTF-8 [for arti-dirauth]
   311  Tor Relay IPv6 Reachability
   312  Tor Relay Automatic IPv6 Address Discovery
   313  Tor Relay IPv6 Statistics
   321  Better performance and usability for the MyFamily option (v2)
   338  Use an 8-byte timestamp in NETINFO cells
   339  UDP traffic over Tor
   350  A phased plan to remove TAP onion keys
 META:
   000  Index of Tor Proposals
   001  The Tor Proposal Process
   202  Two improved relay encryption protocols for Tor cells
   257  Refactoring authorities and making them more isolated from the net
   290  Continuously update consensus methods
 FINISHED:
   260  Rendezvous Single Onion Services [in 0.2.9.3-alpha]
   291  The move to two guard nodes
   324  RTT-based Congestion Control for Tor
   329  Overcoming Tor's Bottlenecks with Traffic Splitting
 CLOSED:
   101  Voting on the Tor Directory System [in 0.2.0.x]
   102  Dropping "opt" from the directory format [in 0.2.0.x]
   103  Splitting identity key from regularly used signing key [in 0.2.0.x]
   104  Long and Short Router Descriptors [in 0.2.0.x]
   105  Version negotiation for the Tor protocol [in 0.2.0.x]
   106  Checking fewer things during TLS handshakes [in 0.2.0.x]
   107  Uptime Sanity Checking [in 0.2.0.x]
   108  Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x]
   109  No more than one server per IP address [in 0.2.0.x]
   110  Avoiding infinite length circuits [for 0.2.3.x] [in 0.2.1.3-alpha, 0.2.3.11-alpha]
   111  Prioritizing local traffic over relayed traffic [in 0.2.0.x]
   114  Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x]
   117  IPv6 exits [for 0.2.4.x] [in 0.2.4.7-alpha]
   119  New PROTOCOLINFO command for controllers [in 0.2.0.x]
   121  Hidden Service Authentication [in 0.2.1.x]
   122  Network status entries need a new Unnamed flag [in 0.2.0.x]
   123  Naming authorities automatically create bindings [in 0.2.0.x]
   125  Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x]
   126  Getting GeoIP data and publishing usage summaries [in 0.2.0.x]
   129  Block Insecure Protocols by Default [in 0.2.0.x]
   130  Version 2 Tor connection protocol [in 0.2.0.x]
   135  Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha]
   136  Mass authority migration with legacy keys [in 0.2.0.x]
   137  Keep controllers informed as Tor bootstraps [in 0.2.1.x]
   138  Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha]
   139  Download consensus documents only when it will be trusted [in 0.2.1.x]
   140  Provide diffs between consensuses [in 0.3.1.1-alpha]
   148  Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
   150  Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
   151  Improving Tor Path Selection [in 0.2.2.2-alpha]
   152  Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
   155  Four Improvements of Hidden Service Performance [in 0.2.1.x]
   157  Make certificate downloads specific [for 0.2.4.x]
   158  Clients download consensus + microdescriptors [in 0.2.3.1-alpha]
   160  Authorities vote for bandwidth offsets in consensus [for 0.2.1.x]
   161  Computing Bandwidth Adjustments [for 0.2.1.x]
   162  Publish the consensus in multiple flavors [in 0.2.3.1-alpha]
   166  Including Network Statistics in Extra-Info Documents [for 0.2.2]
   167  Vote on network parameters in consensus [in 0.2.2]
   171  Separate streams across circuits by connection metadata [in 0.2.3.3-alpha]
   174  Optimistic Data for Tor: Server Side [in 0.2.3.1-alpha]
   176  Proposed version-3 link handshake for Tor [for 0.2.3]
   178  Require majority of authorities to vote for consensus parameters [in 0.2.3.9-alpha]
   179  TLS certificate and parameter normalization [for 0.2.3.x]
   180  Pluggable transports for circumvention [in 0.2.3.x]
   181  Optimistic Data for Tor: Client Side [in 0.2.3.3-alpha]
   183  Refill Intervals [in 0.2.3.5-alpha]
   184  Miscellaneous changes for a v3 Tor link protocol [for 0.2.3.x]
   186  Multiple addresses for one OR or bridge [for 0.2.4.x+]
   187  Reserve a cell type to allow client authorization [for 0.2.3.x]
   193  Safe cookie authentication for Tor controllers
   196  Extended ORPort and TransportControlPort [in 0.2.5.2-alpha]
   198  Restore semantics of TLS ClientHello [for 0.2.4.x]
   200  Adding new, extensible CREATE, EXTEND, and related cells [in 0.2.4.8-alpha]
   204  Subdomain support for Hidden Service addresses
   205  Remove global client-side DNS caching [in 0.2.4.7-alpha.]
   206  Preconfigured directory sources for bootstrapping [in 0.2.4.7-alpha]
   207  Directory guards [for 0.2.4.x]
   208  IPv6 Exits Redux [for 0.2.4.x] [in 0.2.4.7-alpha]
   214  Allow 4-byte circuit IDs in a new link protocol [in 0.2.4.11-alpha]
   215  Let the minimum consensus method change with time [in 0.2.6.1-alpha]
   216  Improved circuit-creation key exchange [in 0.2.4.8-alpha]
   217  Tor Extended ORPort Authentication [for 0.2.5.x]
   218  Controller events to better understand connection/circuit usage [in 0.2.5.2-alpha]
   220  Migrate server identity keys to Ed25519 [in 0.3.0.1-alpha]
   221  Stop using CREATE_FAST [for 0.2.5.x]
   222  Stop sending client timestamps [in 0.2.4.18]
   224  Next-Generation Hidden Services in Tor [in 0.3.2.1-alpha]
   227  Include package fingerprints in consensus documents [in 0.2.6.3-alpha]
   228  Cross-certifying identity keys with onion keys
   232  Pluggable Transport through SOCKS proxy [in 0.2.6]
   235  Stop assigning (and eventually supporting) the Named flag [in 0.2.6, 0.2.7]
   236  The move to a single guard node
   237  All relays are directory servers [for 0.2.7.x]
   238  Better hidden service stats from Tor relays
   243  Give out HSDir flag only to relays with Stable flag
   244  Use RFC5705 Key Exporting in our AUTHENTICATE calls [in 0.3.0.1-alpha]
   250  Random Number Generation During Tor Voting
   251  Padding for netflow record resolution reduction [in 0.3.1.1-alpha]
   254  Padding Negotiation
   264  Putting version numbers on the Tor subprotocols [in 0.2.9.4-alpha]
   271  Another algorithm for guard selection [in 0.3.0.1-alpha]
   272  Listed routers should be Valid, Running, and treated as such [in 0.2.9.3-alpha, 0.2.9.4-alpha]
   274  Rotate onion keys less frequently [in 0.3.1.1-alpha]
   275  Stop including meaningful "published" time in microdescriptor consensus [for 0.3.1.x-alpha] [in 0.4.8.1-alpha]
   278  Directory Compression Scheme Negotiation [in 0.3.1.1-alpha]
   283  Move IPv6 ORPorts from microdescriptors to the microdesc consensus [for 0.3.3.x] [in 0.3.3.1-alpha]
   284  Hidden Service v3 Control Port
   289  Authenticating sendme cells to mitigate bandwidth attacks [in 0.4.1.1-alpha]
   292  Mesh-based vanguards
   293  Other ways for relays to know when to publish [for 0.3.5] [in 0.4.0.1-alpha]
   296  Have Directory Authorities expose raw bandwidth list files [in 0.4.0.1-alpha]
   297  Relaxing the protover-based shutdown rules [for 0.3.5.x] [in 0.4.0.x]
   298  Putting family lines in canonical form [for 0.3.6.x] [in 0.4.0.1-alpha]
   301  Don't include package fingerprints in consensus documents
   302  Hiding onion service clients using padding [in 0.4.1.1-alpha]
   304  Extending SOCKS5 Onion Service Error Codes
   305  ESTABLISH_INTRO Cell DoS Defense Extension
   310  Towards load-balancing in Prop 271
   314  Allow Markdown for proposal format
   315  Updating the list of fields required in directory documents [in 0.4.5.1-alpha]
   318  Limit protover values to 0-63 [in 0.4.5.1-alpha]
   327  A First Take at PoW Over Introduction Circuits
   328  Make Relays Report When They Are Overloaded
   332  Ntor protocol with extra data, version 3
   333  Vanguards lite [in 0.4.7.1-alpha]
   335  An authority-only design for MiddleOnly [in 0.4.7.2-alpha]
   336  Randomized schedule for guard retries
   337  A simpler way to decide, "Is this guard usable?"
   345  Migrating the tor specifications to mdbook
 SUPERSEDED:
   112  Bring Back Pathlen Coin Weight
   113  Simplifying directory authority administration
   118  Advertising multiple ORPorts at once
   124  Blocking resistant TLS certificate usage
   143  Improvements of Distributed Storage for Tor Hidden Service Descriptors
   145  Separate "suitable as a guard" from "suitable as a new guard"
   146  Add new flag to reflect long-term stability
   149  Using data from NETINFO cells
   153  Automatic software update protocol
   154  Automatic Software Update Protocol
   156  Tracking blocked ports on the client side
   163  Detecting whether a connection comes from a client
   169  Eliminate TLS renegotiation for the Tor connection handshake
   170  Configuration options regarding circuit building
   185  Directory caches without DirPort
   194  Mnemonic .onion URLs
   210  Faster Headless Consensus Bootstrapping
   225  Strawman proposal: commit-and-reveal shared rng
   242  Better performance and usability for the MyFamily option
   245  Deprecating and removing the TAP circuit extension protocol
   247  Defending Against Guard Discovery Attacks using Vanguards
   249  Allow CREATE cells with >505 bytes of handshake data
   252  Single Onion Services
   266  Removing current obsolete clients from the Tor network
   280  Privacy-Preserving Statistics with Privcount in Tor
   299  Preferring IPv4 or IPv6 based on IP Version Failure Count
   308  Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
   334  A Directory Authority Flag To Mark Relays As Middle-only
 DEAD:
   100  Tor Unreliable Datagram Extension Proposal
   115  Two Hop Paths
   116  Two hop paths from entry guards
   120  Shutdown descriptors when Tor servers stop
   128  Families of private bridges
   142  Combine Introduction and Rendezvous Points
   195  TLS certificate normalization for Tor 0.2.4.x
   213  Remove stream-level sendmes from the design
   253  Out of Band Circuit HMACs
   258  Denial-of-service resistance for directory authorities
   276  Report bandwidth with lower granularity in consensus documents
 REJECTED:
   134  More robust consensus voting with diverse authority sets
   147  Eliminate the need for v2 directories in generating v3 directories [for 0.2.4.x]
   165  Easy migration for voting authority sets
   168  Reduce default circuit window
   175  Automatically promoting Tor clients to nodes
   197  Message-based Inter-Controller IPC Channel
   229  Further SOCKS5 extensions
   233  Making Tor2Web mode faster
   234  Adding remittance field to directory specification
   241  Resisting guard-turnover attacks
   246  Merging Hidden Service Directories and Introduction Points
   286  Controller APIs for hibernation access on mobile
   320  Removing TAP usage from v2 onion services
 OBSOLETE:
   098  Proposals that should be written
   099  Miscellaneous proposals
   127  Relaying dirport requests to Tor download site / website
   131  Help users to verify they are using Tor
   132  A Tor Web Service For Verifying Correct Browser Configuration
   141  Download server descriptors on demand
   144  Increase the diversity of circuits by detecting nodes belonging the same provider
   164  Reporting the status of server votes
   173  GETINFO Option Expansion
   182  Credit Bucket
   189  AUTHORIZE and AUTHORIZED cells
   190  Bridge Client Authorization Based on a Shared Secret
   191  Bridge Detection Resistance against MITM-capable Adversaries
   192  Automatically retrieve and store information about bridges [for 0.2.[45].x]
   199  Integration of BridgeFinder and BridgeFinderHelper
   203  Avoiding censorship by impersonating an HTTPS server
   209  Tuning the Parameters for the Path Bias Defense [for 0.2.4.x+]
   230  How to change RSA1024 relay identity keys [for 0.2.?]
   231  Migrating authority RSA1024 identity keys [for 0.2.?]
   259  New Guard Selection Behaviour
   261  AEZ for relay cryptography
   263  Request to change key exchange protocol for handshake v1.2
   268  New Guard Selection Behaviour
   270  RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
   319  RELAY_FRAGMENT cells
   325  Packed relay cells: saving space on small commands
 RESERVE:
   133  Incorporate Unreachable ORs into the Tor Network
   172  GETINFO controller option for circuit information
   177  Abstaining from votes on individual flags [for 0.2.4.x]
   188  Bridge Guards and other anti-enumeration defenses
   201  Make bridges report statistics on daily v3 network status requests [for 0.2.4.x]
   211  Internal Mapaddress for Tor Configuration Testing [for 0.2.4.x+]
   223  Ace: Improved circuit-creation key exchange
   226  "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS"
   255  Controller features to allow for load-balancing hidden services
   256  Key revocation for relays and authorities
   262  Re-keying live circuits with new cryptographic material
   273  Exit relay pinning for web services [for n/a]
   281  Downloading microdescriptors in bulk
   288  Privacy-Preserving Statistics with Privcount in Tor (Shamir version)
   307  Onion Balance Support for Onion Service v3
 INFORMATIONAL:
   159  Exit Scanning
   300  Walking Onions: Scaling and Saving Bandwidth
Filename: 001-process.txt
Title: The Tor Proposal Process
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta

Overview:

   This document describes how to change the Tor specifications, how Tor
   proposals work, and the relationship between Tor proposals and the
   specifications.

   This is an informational document.

Motivation:

   Previously, our process for updating the Tor specifications was maximally
   informal: we'd patch the specification (sometimes forking first, and
   sometimes not), then discuss the patches, reach consensus, and implement
   the changes.

   This had a few problems.

   First, even at its most efficient, the old process would often have the
   spec out of sync with the code.  The worst cases were those where
   implementation was deferred: the spec and code could stay out of sync for
   versions at a time.

   Second, it was hard to participate in discussion, since you had to know
   which portions of the spec were a proposal, and which were already
   implemented.

   Third, it littered the specifications with too many inline comments.
     [This was a real problem -NM]
       [Especially when it went to multiple levels! -NM]
         [XXXX especially when they weren't signed and talked about that
          thing that you can't remember after a year]

How to change the specs now:

   First, somebody writes a proposal document.  It should describe the change
   that should be made in detail, and give some idea of how to implement it.
   Once it's fleshed out enough, it becomes a proposal.

   Like an RFC, every proposal gets a number.  Unlike RFCs, proposals can
   change over time and keep the same number, until they are finally
   accepted or rejected.  The history for each proposal
   will be stored in the Tor repository.

   Once a proposal is in the repository, we should discuss and improve it
   until we've reached consensus that it's a good idea, and that it's
   detailed enough to implement.  When this happens, we implement the
   proposal and incorporate it into the specifications.  Thus, the specs
   remain the canonical documentation for the Tor protocol: no proposal is
   ever the canonical documentation for an implemented feature.

   (This process is pretty similar to the Python Enhancement Process, with
   the major exception that Tor proposals get re-integrated into the specs
   after implementation, whereas PEPs _become_ the new spec.)

   {It's still okay to make small changes directly to the spec if the code
   can be written more or less immediately, or cosmetic changes if no code
   change is required.  This document reflects the current developers'
   _intent_, not a permanent promise to always use this process in the
   future: we reserve the right to get really excited and run off and
   implement something in a caffeine-or-m&m-fueled all-night hacking
   session.}

How new proposals get added:

  Once an idea has been proposed on the development list, a properly formatted
  (see below) draft exists, and rough consensus within the active development
  community exists that this idea warrants consideration, the proposal editors
  will officially add the proposal.

  To get your proposal in, send it to the tor-dev mailing list.

  The current proposal editors are Nick Mathewson, George Kadianakis,
  Damian Johnson, Isis Lovecruft, and David Goulet.

What should go in a proposal:

   Every proposal should have a header containing these fields:
     Filename, Title, Author, Created, Status.

   These fields are optional but recommended:
     Target, Implemented-In, Ticket**.

   The Target field should describe which version the proposal is hoped to be
   implemented in (if it's Open or Accepted).  The Implemented-In field
   should describe which version the proposal was implemented in (if it's
   Finished or Closed).  The Ticket field should be a ticket number referring
   to Tor's canonical bug tracker (e.g. "#7144" refers to
   https://bugs.torproject.org/7144) or to a publicly accessible URI where one
   may subscribe to updates and/or retrieve information on implementation
   status.

   ** Proposals with assigned numbers of prop#283 and higher are REQUIRED to
      have a Ticket field if the Status is OPEN, ACCEPTED, CLOSED, or FINISHED.
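
   As an illustration only, the header rules above could be checked
   mechanically.  The Python sketch below is not part of any Tor tooling:
   the names parse_header and check_header are hypothetical, and it assumes
   that the header is a run of "Field: value" lines ending at the first
   blank line and that the proposal number is the leading digits of the
   Filename field.

```python
REQUIRED = ("Filename", "Title", "Author", "Created", "Status")
TICKET_STATUSES = {"OPEN", "ACCEPTED", "CLOSED", "FINISHED"}


def parse_header(text):
    """Collect "Field: value" lines up to the first blank line."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def check_header(fields):
    """Return a list of problems; an empty list means the header passes."""
    problems = [f"missing {name}" for name in REQUIRED if name not in fields]
    # Assumption: the proposal number is the leading digits of Filename.
    number = fields.get("Filename", "").split("-", 1)[0]
    status = fields.get("Status", "").upper()
    if number.isdigit() and int(number) >= 283:
        if status in TICKET_STATUSES and "Ticket" not in fields:
            problems.append("missing Ticket")
    return problems
```

   For example, a header for a proposal numbered 300 with Status "Open" and
   no Ticket field would be reported as "missing Ticket", while the header
   of this document passes.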

   The body of the proposal should start with an Overview section explaining
   what the proposal's about, what it does, and about what state it's in.

   After the Overview, the proposal becomes more free-form.  Depending on its
   length and complexity, the proposal can break into sections as
   appropriate, or follow a short discursive format.  Every proposal should
   contain at least the following information before it is "ACCEPTED",
   though the information does not need to be in sections with these names.

      Motivation: What problem is the proposal trying to solve?  Why does
        this problem matter?  If several approaches are possible, why take this
        one?

      Design: A high-level view of what the new or modified features are, how
        the new or modified features work, how they interoperate with each
        other, and how they interact with the rest of Tor.  This is the main
        body of the proposal.  Some proposals will start out with only a
        Motivation and a Design, and wait for a specification until the
        Design seems approximately right.

      Security implications: What effects the proposed changes might have on
        anonymity, how well understood these effects are, and so on.

      Specification: A detailed description of what needs to be added to the
        Tor specifications in order to implement the proposal.  This should
        be in about as much detail as the specifications will eventually
        contain: it should be possible for independent programmers to write
        mutually compatible implementations of the proposal based on its
        specifications.

      Compatibility: Will versions of Tor that follow the proposal be
        compatible with versions that do not?  If so, how will compatibility
        be achieved?  Generally, we try to not drop compatibility if at
        all possible; we haven't made a "flag day" change since May 2004,
        and we don't want to do another one.

      Implementation: If the proposal will be tricky to implement in Tor's
        current architecture, the document can contain some discussion of how
        to go about making it work.  Actual patches should go on public git
        branches, or be uploaded to trac.

      Performance and scalability notes: If the feature will have an effect
        on performance (in RAM, CPU, bandwidth) or scalability, there should
        be some analysis on how significant this effect will be, so that we
        can avoid really expensive performance regressions, and so we can
        avoid wasting time on insignificant gains.

How to format proposals:

   Proposals may be written in plain text (like this one), or in Markdown.
   If using Markdown, the header must be wrapped in triple-backtick ("```")
   lines.  Whenever possible, we prefer the Commonmark dialect of Markdown.

Proposal status:

   Open: A proposal under discussion.

   Accepted: The proposal is complete, and we intend to implement it.
      After this point, substantive changes to the proposal should be
      avoided, and regarded as a sign of the process having failed
      somewhere.

   Finished: The proposal has been accepted and implemented.  After this
      point, the proposal should not be changed.

   Closed: The proposal has been accepted, implemented, and merged into the
      main specification documents.  The proposal should not be changed after
      this point.

   Rejected: We're not going to implement the feature as described here,
      though we might do some other version.  See comments in the document
      for details.  The proposal should not be changed after this point;
      to bring up some other version of the idea, write a new proposal.

   Draft: This isn't a complete proposal yet; there are definite missing
      pieces.  Please don't add any new proposals with this status; put them
      in the "ideas" sub-directory instead.

   Needs-Revision: The idea for the proposal is a good one, but the proposal
      as it stands has serious problems that keep it from being accepted.
      See comments in the document for details.

   Dead: The proposal hasn't been touched in a long time, and it doesn't look
      like anybody is going to complete it soon.  It can become "Open" again
      if it gets a new proponent.

   Needs-Research: There are research problems that need to be solved before
      it's clear whether the proposal is a good idea.

   Meta: This is not a proposal, but a document about proposals.

   Reserve: This proposal is not something we're currently planning to
      implement, but we might want to resurrect it some day if we decide to
      do something like what it proposes.

   Informational: This proposal is the last word on what it's doing.
      It isn't going to turn into a spec unless somebody copy-and-pastes
      it into a new spec for a new subsystem.

   Obsolete: This proposal was flawed and has been superseded by another
     proposal. See comments in the document for details.

   The editors maintain the correct status of proposals, based on rough
   consensus and their own discretion.

Proposal numbering:

   Numbers 000-099 are reserved for special and meta-proposals.  100 and up
   are used for actual proposals.  Numbers aren't recycled.
Filename: 098-todo.txt
Title: Proposals that should be written
Author: Nick Mathewson, Roger Dingledine
Created: 26-Jan-2007
Status: Obsolete

{Obsolete: This document has been replaced by the tor-spec issue tracker.}

Overview:

   This document lists ideas that various people have had for improving the
   Tor protocol.  These should be implemented and specified if they're
   trivial, or written up as proposals if they're not.

   This is an active document, to be edited as proposals are written and as
   we come up with new ideas for proposals.  We should take stuff out as it
   seems irrelevant.


For some later protocol version.

  - It would be great to get smarter about identity and linkability.
    It's not crazy to say, "Never use the same circuit for my SSH
    connections and my web browsing."  How far can/should we take this?
    See ideas/xxx-separate-streams-by-port.txt for a start.

  - Fix onionskin handshake scheme to be more mainstream, less nutty.
    Can we just do
        E(HMAC(g^x), g^x) rather than just E(g^x) ?
    No, that has the same flaws as before. We should send
        E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
    Better ask Ian; probably Stephen too.

  - Length on CREATE and friends

  - Versioning on circuits and create cells, so we have a clear path
    to improve the circuit protocol.

  - SHA1 is showing its age.  We should get a design for upgrading our
    hash once the AHS competition is done, or even sooner.

  - Not being able to upgrade ciphersuites or increase key lengths is
    lame.

  - Paul has some ideas about circuit creation; read his PET paper once it's
    out.

Any time:

  - Some ideas for revising the directory protocol:
    - Extend the "r" line in network-status to give a set of buckets (say,
      comma-separated) for that router.
      - Buckets are deterministic based on IP address.
      - Then clients can choose a bucket (or set of buckets) to
        download and use.
    - We need a way for the authorities to declare that nodes are in a
      family.  Also, it kinda sucks that family declarations use O(N^2) space
      in the descriptors.
  - REASON_CONNECTFAILED should include an IP.
  - Spec should incorporate some prose from tor-design to be more readable.
  - Spec when we should rotate which keys
  - Spec how to publish descriptors less often
  - Describe pros and cons of non-deterministic path lengths

  - We should use a variable-length path length by default -- 3 +/- some
    distribution. Need to think harder about allowing values less than 3,
    and there's a tradeoff between having a wide variance and performance.

  - Clients currently use certs during TLS.  Is this wise?  It does make it
    easier for servers to tell which NATted client is which. We could use a
    separate set of certs for each guard, I suppose, but generating so many
    certs could get expensive.  Omitting them entirely would make OP->OR
    easier to tell from OR->OR.

Things that should change...

B.1. ... but which will require backward-incompatible change

  - Circuit IDs should be longer.
  . IPv6 everywhere.
  - Maybe, keys should be longer.
    - Maybe, key-length should be adjustable.  How to do this without
      making anonymity suck?
  - Drop backward compatibility.
  - We should use a 128-bit subgroup of our DH prime.
  - Handshake should use HMAC.
  - Multiple cell lengths.
  - Ability to split circuits across paths (If this is useful.)
  - SENDME windows should be dynamic.

  - Directory
     - Stop ever mentioning socks ports

B.1. ... and that will require no changes

   - Advertised outbound IP?
   - Migrate streams across circuits.
   - Fix bug 469 by limiting the number of simultaneous connections per IP.

B.2. ... and that we have no idea how to do.

   - UDP (as transport)
   - UDP (as content)
   - Use a better AES mode that has built-in integrity checking,
     doesn't grow with the number of hops, is not patented, and
     is implemented and maintained by smart people.

Let onion keys be not just RSA but maybe DH too, for Paul's reply onion
design.

Filename: 099-misc.txt
Title: Miscellaneous proposals
Author: Various
Created: 26-Jan-2007
Status: Obsolete

{This document is obsolete; we only used it once, and we have implemented
its only idea.}

Overview:

   This document is for small proposal ideas that are about one paragraph in
   length.  From here, ideas can be rejected outright, expanded into full
   proposals, or specified and implemented as-is.

Proposals

1. Directory compression.

  Gzip would be easier to work with than zlib; bzip2 would result in smaller
  data lengths.  [Concretely, we're looking at about 10-15% space savings at
  the expense of 3-5x longer compression time for using bzip2.]  Doing
  on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
  Pre-compressing status documents in multiple formats would force us to use
  more memory to hold them.
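
  A rough way to check the size tradeoff on real data is sketched below in
  Python.  It is only a measurement aid, not a proposed implementation: the
  function name compare is hypothetical, and it uses the deflate algorithm
  (shared by the zlib and gzip formats) via Python's zlib module next to
  bzip2 via the bz2 module.  The figures quoted above are not reproduced by
  this sketch; results depend on the input document.

```python
import bz2
import zlib


def compare(data):
    """Return compressed sizes in bytes for one input under each codec."""
    deflated = zlib.compress(data, 9)
    bzipped = bz2.compress(data, 9)
    # Both codecs must round-trip the input exactly.
    assert zlib.decompress(deflated) == data
    assert bz2.decompress(bzipped) == data
    return {"raw": len(data), "zlib": len(deflated), "bzip2": len(bzipped)}
```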

  Status: Open

  -- Nick Mathewson


Filename: 100-tor-spec-udp.txt
Title: Tor Unreliable Datagram Extension Proposal
Author: Marc Liberatore
Created: 23 Feb 2006
Status: Dead

Overview:

   This is a modified version of the Tor specification written by Marc
   Liberatore to add UDP support to Tor.  For each TLS link, it adds a
   corresponding DTLS link: control messages and TCP data flow over TLS, and
   UDP data flows over DTLS.

   This proposal is not likely to be accepted as-is; see comments at the end
   of the document.


Contents

0. Introduction

  Tor is a distributed overlay network designed to anonymize low-latency
  TCP-based applications.  The current tor specification supports only
  TCP-based traffic.  This limitation prevents the use of tor to anonymize
  other important applications, notably voice over IP software.  This document
  is a proposal to extend the tor specification to support UDP traffic.

  The basic design philosophy of this extension is to add support for
  tunneling unreliable datagrams through tor with as few modifications to the
  protocol as possible.  As currently specified, tor cannot directly support
  such tunneling, as connections between nodes are built using transport layer
  security (TLS) atop TCP.  The latency incurred by TCP is likely unacceptable
  to the operation of most UDP-based application level protocols.

  Thus, we propose the addition of links between nodes using datagram
  transport layer security (DTLS).  These links allow packets to traverse a
  route through tor quickly, but their unreliable nature requires minor
  changes to the tor protocol.  This proposal outlines the necessary
  additions and changes to the tor specification to support UDP traffic.

  We note that a separate set of DTLS links between nodes creates a second
  overlay, distinct from that composed of TLS links.  This separation and
  resulting decrease in each anonymity set's size will make certain attacks
  easier.  However, it is our belief that VoIP support in tor will
  dramatically increase its appeal, and correspondingly, the size of its user
  base, number of deployed nodes, and total traffic relayed.  These increases
  should help offset the loss of anonymity that two distinct networks imply.

1. Overview of Tor-UDP and its complications

  As described above, this proposal extends the Tor specification to support
  UDP with as few changes as possible.  Tor's overlay network is managed
  through TLS based connections; we will re-use this control plane to set up
  and tear down circuits that relay UDP traffic.  These circuits will be built atop
  DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
  TLS.

  The unreliability of DTLS circuits creates problems for Tor at two levels:

      1. Tor's encryption of the relay layer does not allow independent
      decryption of individual records. If record N is not received, then
      record N+1 will not decrypt correctly, as the counter for AES/CTR is
      maintained implicitly.

      2. Tor's end-to-end integrity checking works under the assumption that
      all RELAY cells are delivered.  This assumption is invalid when cells
      are sent over DTLS.

  The fix for the first problem is straightforward: add an explicit sequence
  number to each cell.  To fix the second problem, we introduce a
  system of nonces and hashes to RELAY packets.

  In the following sections, we mirror the layout of the Tor Protocol
  Specification, presenting the necessary modifications to the Tor protocol as
  a series of deltas.

2. Connections

  Tor-UDP uses DTLS for encryption of some links.  All DTLS links must have
  corresponding TLS links, as all control messages are sent over TLS.  All
  implementations MUST support the DTLS ciphersuite "[TODO]".

  DTLS connections are formed using the same protocol as TLS connections.
  This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
  as detailed in section 4.6.

  Once a paired TLS/DTLS connection is established, the two sides send cells
  to one another.  All but two types of cells are sent over TLS links.  RELAY
  cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified
  below, are sent over DTLS links.  [Should all cells still be 512 bytes long?
  Perhaps upon completion of a preliminary implementation, we should do a
  performance evaluation for some class of UDP traffic, such as VoIP. - ML]
  Cells may be sent embedded in TLS or DTLS records of any size or divided
  across such records.  The framing of these records MUST NOT leak any more
  information than the above differentiation on the basis of cell type.  [I am
  uncomfortable with this leakage, but don't see any simple, elegant way
  around it. -ML]

  As with TLS connections, DTLS connections are not permanent.

3. Cell format

  Each cell contains the following fields:

        CircID                                [2 bytes]
        Command                               [1 byte]
        Sequence Number                       [2 bytes]
        Payload (padded with 0 bytes)         [507 bytes]
                                         [Total size: 512 bytes]
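
  The layout above can be sketched with Python's struct module.  This is a
  minimal illustration, not code from any Tor implementation: the names
  pack_cell and unpack_cell are hypothetical, and network (big-endian) byte
  order for the multi-byte fields is an assumption.

```python
import struct

PAYLOAD_LEN = 507
# CircID (2 bytes), Command (1 byte), Sequence Number (2 bytes), Payload.
# Assumption: network (big-endian) byte order for multi-byte fields.
CELL = struct.Struct(f">HBH{PAYLOAD_LEN}s")


def pack_cell(circ_id, command, seq, payload=b""):
    """Build one fixed-size cell; short payloads are padded with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long")
    # The "s" format pads short payloads with zero bytes.
    return CELL.pack(circ_id, command, seq, payload)


def unpack_cell(cell):
    """Return (circ_id, command, seq, padded_payload)."""
    return CELL.unpack(cell)
```

  The field widths sum to 2 + 1 + 2 + 507 = 512 bytes, matching the total
  size given above.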

  The 'Command' field holds one of the following values:
       0 -- PADDING         (Padding)                     (See Sec 6.2)
       1 -- CREATE          (Create a circuit)            (See Sec 4)
       2 -- CREATED         (Acknowledge create)          (See Sec 4)
       3 -- RELAY           (End-to-end data)             (See Sec 5)
       4 -- DESTROY         (Stop using a circuit)        (See Sec 4)
       5 -- CREATE_FAST     (Create a circuit, no PK)     (See Sec 4)
       6 -- CREATED_FAST    (Circuit created, no PK)      (See Sec 4)
       7 -- CREATE_UDP      (Create a UDP circuit)        (See Sec 4)
       8 -- CREATED_UDP     (Acknowledge UDP create)      (See Sec 4)
       9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
      10 -- CREATED_FAST_UDP(UDP circuit created, no PK)  (See Sec 4)

  The sequence number allows for AES/CTR decryption of RELAY cells
  independently of one another; this functionality is required to support
  cells sent over DTLS.  The sequence number is described in more detail in
  section 4.5.

  [Should the sequence number only appear in RELAY packets?  The overhead is
  small, and I'm hesitant to force more code paths on the implementor. -ML]
  [There's already a separate relay header that has other material in it,
  so it wouldn't be the end of the world to move it there if it's
  appropriate. -RD]

  [Having separate commands for UDP circuits seems necessary, unless we can
  assume a flag day event for a large number of tor nodes. -ML]

4. Circuit management

4.2. Setting circuit keys

  Keys are set up for UDP circuits in the same fashion as for TCP circuits.
  Each UDP circuit shares keys with its corresponding TCP circuit.

  [If the keys are used for both TCP and UDP connections, how does it
  work to mix sequence-number-less cells with sequence-numbered cells --
  how do you know you have the encryption order right? -RD]

4.3. Creating circuits

  UDP circuits are created as TCP circuits, using the *_UDP cells as
  appropriate.

4.4. Tearing down circuits

  UDP circuits are torn down as TCP circuits, using the *_UDP cells as
  appropriate.

4.5. Routing relay cells

  When an OR receives a RELAY cell, it checks the cell's circID and
  determines whether it has a corresponding circuit along that
  connection.  If not, the OR drops the RELAY cell.

  Otherwise, if the OR is not at the OP edge of the circuit (that is,
  either an 'exit node' or a non-edge node), it de/encrypts the payload
  with AES/CTR, as follows:
       'Forward' relay cell (same direction as CREATE):
           Use Kf as key; decrypt, using sequence number to synchronize
           ciphertext and keystream.
       'Back' relay cell (opposite direction from CREATE):
           Use Kb as key; encrypt, using sequence number to synchronize
           ciphertext and keystream.
  Note that in counter mode, decrypt and encrypt are the same operation.
  [Since the sequence number is only 2 bytes, what do you do when it
  rolls over? -RD]
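
  The idea of synchronizing the keystream by sequence number can be
  sketched as follows.  Python's standard library has no AES, so this
  sketch uses SHA-256 as a stand-in keystream generator; it is NOT AES/CTR
  and is not suitable for real use.  The mapping from sequence number to
  keystream position is also an assumption, since the text above does not
  fix one.  The names keystream and crypt_payload are hypothetical.

```python
import hashlib


def keystream(key, seq, length):
    """Stand-in keystream, NOT AES: block i is SHA-256(key | seq | i)."""
    out = bytearray()
    block = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + seq.to_bytes(2, "big") + block.to_bytes(4, "big")
        ).digest()
        block += 1
    return bytes(out[:length])


def crypt_payload(key, seq, payload):
    """XOR with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, seq, len(payload))
    return bytes(a ^ b for a, b in zip(payload, ks))
```

  Because the keystream for a cell depends only on the key and that cell's
  sequence number, cell N+1 can be decrypted even if cell N never arrives.
  Under this mapping a 2-byte sequence number repeats after 65536 cells,
  which would reuse keystream; the sketch does not address that.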

  Each stream encrypted by a Kf or Kb has a corresponding unique state,
  captured by a sequence number; the originator of each such stream chooses
  the initial sequence number randomly, and increments it only with RELAY
  cells.  [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
  there's no need for counting bytes directly.  Right? - ML]
  [I believe this is true. You'll find out for sure when you try to
  build it. ;) -RD]

  The OR then decides whether it recognizes the relay cell, by
  inspecting the payload as described in section 5.1 below.  If the OR
  recognizes the cell, it processes the contents of the relay cell.
  Otherwise, it passes the decrypted relay cell along the circuit if
  the circuit continues.  If the OR at the end of the circuit
  encounters an unrecognized relay cell, an error has occurred: the OR
  sends a DESTROY cell to tear down the circuit.

  When a relay cell arrives at an OP, the OP decrypts the payload
  with AES/CTR as follows:
        OP receives data cell:
           For I=N...1,
               Decrypt with Kb_I, using the sequence number as above.  If the
               payload is recognized (see section 5.1), then stop and process
               the payload.

  For more information, see section 5 below.

4.6. CREATE_UDP and CREATED_UDP cells

  Users set up UDP circuits incrementally.  The procedure is similar to that
  for TCP circuits, as described in section 4.1.  In addition to the TLS
  connection to the first node, the OP also attempts to open a DTLS
  connection.  If this succeeds, the OP sends a CREATE_UDP cell, with a
  payload in the same format as a CREATE cell.  To extend a UDP circuit past
  the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
  instructs the last node in the circuit to send a CREATE_UDP cell to extend
  the circuit.

  The relay payload for an EXTEND_UDP relay cell consists of:
         Address                       [4 bytes]
         TCP port                      [2 bytes]
         UDP port                      [2 bytes]
         Onion skin                    [186 bytes]
         Identity fingerprint          [20 bytes]

  The address field and ports denote the IPv4 address and ports of the next OR
  in the circuit.
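
  As a sketch of the EXTEND_UDP relay payload above, the fields can be
  packed with Python's struct module.  The name pack_extend_udp is
  hypothetical, and network byte order for the two ports is an assumption.

```python
import socket
import struct

# Address (4), TCP port (2), UDP port (2), onion skin (186), fingerprint (20).
# Assumption: network (big-endian) byte order for the ports.
EXTEND_UDP = struct.Struct(">4sHH186s20s")


def pack_extend_udp(ipv4, tcp_port, udp_port, onion_skin, fingerprint):
    """Build the relay payload for an EXTEND_UDP cell."""
    if len(onion_skin) != 186 or len(fingerprint) != 20:
        raise ValueError("bad onion skin or fingerprint length")
    return EXTEND_UDP.pack(
        socket.inet_aton(ipv4), tcp_port, udp_port, onion_skin, fingerprint
    )
```

  The packed payload is 4 + 2 + 2 + 186 + 20 = 214 bytes long.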

  The payload for a CREATED_UDP cell or the relay payload for a
  RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
  RELAY_EXTENDED cell.  Both circuits are established using the same key.

  Note that the existence of a UDP circuit implies the
  existence of a corresponding TCP circuit, sharing keys, sequence numbers,
  and any other relevant state.

4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells

  As above, the OP must successfully connect using DTLS before attempting to
  send a CREATE_FAST_UDP cell.  Otherwise, the procedure is the same as in
  section 4.1.1.

5. Application connections and stream management

5.1. Relay cells

  Within a circuit, the OP and the exit node use the contents of RELAY cells
  to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
  across circuits.  End-to-end commands and UDP packets can be initiated by
  either edge; streams are initiated by the OP.

  The payload of each unencrypted RELAY cell consists of:
        Relay command           [1 byte]
        'Recognized'            [2 bytes]
        StreamID                [2 bytes]
        Digest                  [4 bytes]
        Length                  [2 bytes]
        Data                    [498 bytes]

  The relay commands are:
        1 -- RELAY_BEGIN        [forward]
        2 -- RELAY_DATA         [forward or backward]
        3 -- RELAY_END          [forward or backward]
        4 -- RELAY_CONNECTED    [backward]
        5 -- RELAY_SENDME       [forward or backward]
        6 -- RELAY_EXTEND       [forward]
        7 -- RELAY_EXTENDED     [backward]
        8 -- RELAY_TRUNCATE     [forward]
        9 -- RELAY_TRUNCATED    [backward]
       10 -- RELAY_DROP         [forward or backward]
       11 -- RELAY_RESOLVE      [forward]
       12 -- RELAY_RESOLVED     [backward]
       13 -- RELAY_BEGIN_UDP    [forward]
       14 -- RELAY_DATA_UDP     [forward or backward]
       15 -- RELAY_EXTEND_UDP   [forward]
       16 -- RELAY_EXTENDED_UDP [backward]
       17 -- RELAY_DROP_UDP     [forward or backward]

  Commands labelled as "forward" must only be sent by the originator
  of the circuit. Commands labelled as "backward" must only be sent by
  other nodes in the circuit back to the originator. Commands marked
  as either can be sent either by the originator or other nodes.

  The 'recognized' field in any unencrypted relay payload is always set to
  zero. 

  The 'digest' field can have two meanings.  For all cells sent over TLS
  connections (that is, all commands and all non-UDP RELAY data), it is
  computed as the first four bytes of the running SHA-1 digest of all the
  bytes that have been sent reliably and have been destined for this hop of
  the circuit or originated from this hop of the circuit, seeded from Df or Db
  respectively (obtained in section 4.2 above), and including this RELAY
  cell's entire payload (taken with the digest field set to zero).  Cells sent
  over DTLS connections do not affect this running digest.  Each cell sent
  over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
  set to the SHA-1 digest of the current RELAY cell's entire payload, with the
  digest field set to zero.  Coupled with a randomly-chosen streamID, this
  provides per-cell integrity checking on UDP cells.
  [If you drop malformed UDP relay cells but don't close the circuit,
  then this 8 bytes of digest is not as strong as what we get in the
  TCP-circuit side. Is this a problem? -RD]
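
  The per-cell digest for cells sent over DTLS can be sketched as follows.
  The sketch follows the relay payload field list given earlier in this
  section.  Since the digest field is four bytes wide, it assumes the SHA-1
  output is truncated to its first four bytes, as for the running digest;
  the text above does not say so explicitly.  Big-endian field encoding and
  the names build_udp_relay_payload and udp_digest_ok are also assumptions.

```python
import hashlib
import struct

# Relay command, 'Recognized', StreamID, Digest, Length.
HEADER = struct.Struct(">BHH4sH")
DATA_LEN = 498


def build_udp_relay_payload(command, stream_id, data):
    """Build an unencrypted relay payload with a per-cell digest."""
    if len(data) > DATA_LEN:
        raise ValueError("data too long")
    padded = data.ljust(DATA_LEN, b"\x00")
    # Hash the payload with the digest field set to zero.
    zeroed = HEADER.pack(command, 0, stream_id, bytes(4), len(data)) + padded
    # Assumption: keep the first four bytes of SHA-1 to fit the field.
    digest = hashlib.sha1(zeroed).digest()[:4]
    return HEADER.pack(command, 0, stream_id, digest, len(data)) + padded


def udp_digest_ok(payload):
    """Check 'recognized' is zero and the per-cell digest matches."""
    command, recognized, stream_id, digest, length = HEADER.unpack_from(payload)
    zeroed = (
        HEADER.pack(command, recognized, stream_id, bytes(4), length)
        + payload[HEADER.size:]
    )
    return recognized == 0 and hashlib.sha1(zeroed).digest()[:4] == digest
```

  Unlike the running digest used on TLS links, this check depends only on
  the cell itself, so a lost cell does not affect verification of later
  cells.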

  When the 'recognized' field of a RELAY cell is zero, and the digest
  is correct, the cell is considered "recognized" for the purposes of
  decryption (see section 4.5 above).

  (The digest does not include any bytes from relay cells that do
  not start or end at this hop of the circuit. That is, it does not
  include forwarded data. Therefore if 'recognized' is zero but the
  digest does not match, the running digest at that node should
  not be updated, and the cell should be forwarded on.)

  All RELAY cells pertaining to the same tunneled TCP stream have the
  same streamID.  Such streamIDs are chosen arbitrarily by the OP.  RELAY
  cells that affect the entire circuit rather than a particular
  stream use a StreamID of zero.

  All RELAY cells pertaining to the same UDP tunnel have the same streamID.
  This streamID is chosen randomly by the OP, but cannot be zero.

  The 'Length' field of a relay cell contains the number of bytes in
  the relay payload which contain real payload data. The remainder of
  the payload is padded with NUL bytes.

  If the RELAY cell is recognized but the relay command is not
  understood, the cell must be dropped and ignored. Its contents
  still count with respect to the digests, though. [Before
  0.1.1.10, Tor closed circuits when it received an unknown relay
  command. Perhaps this will be more forward-compatible. -RD]

5.2.1.  Opening UDP tunnels and transferring data

  To open a new anonymized UDP connection, the OP chooses an open
  circuit to an exit that may be able to connect to the destination
  address, selects a random streamID not yet used on that circuit,
  and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
  and port of the destination host.  The payload format is:

        ADDRESS | ':' | PORT | [00]

  where ADDRESS can be a DNS hostname, or an IPv4 address in
  dotted-quad format, or an IPv6 address surrounded by square brackets;
  and where PORT is encoded in decimal.

  [What is the [00] for? -NM]
  [It's so the payload is easy to parse out with string funcs -RD]
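
  A minimal Python sketch of parsing this payload.  The function name
  parse_begin_udp is hypothetical, and rejecting ports outside 1..65535
  is an assumption that the text above does not state.

```python
def parse_begin_udp(payload: bytes):
    """Parse 'ADDRESS:PORT' terminated by a NUL byte; return (address, port).

    Bytes after the first NUL (cell padding) are ignored.
    """
    end = payload.find(b"\x00")
    if end < 0:
        raise ValueError("missing NUL terminator")
    text = payload[:end].decode("ascii")
    # Split on the last ':' so that colons inside a bracketed IPv6
    # address are left alone.
    address, sep, port_text = text.rpartition(":")
    if not sep or not address:
        raise ValueError("expected ADDRESS ':' PORT")
    if address.startswith("["):
        if not address.endswith("]"):
            raise ValueError("unterminated IPv6 literal")
        address = address[1:-1]
    if not port_text.isdigit():
        raise ValueError("port must be decimal")
    port = int(port_text)
    if not 0 < port <= 65535:  # assumption: port 0 is not a valid target
        raise ValueError("port out of range")
    return address, port
```

  For example, parse_begin_udp(b"[::1]:5353\x00") yields the address
  "::1" and the port 5353.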

  Upon receiving this cell, the exit node resolves the address as necessary.
  If the address cannot be resolved, the exit node replies with a RELAY_END
  cell.  (See 5.4 below.)  Otherwise, the exit node replies with a
  RELAY_CONNECTED cell, whose payload is in one of the following formats:
      The IPv4 address to which the connection was made [4 octets]
      A number of seconds (TTL) for which the address may be cached [4 octets]
   or
      Four zero-valued octets [4 octets]
      An address type (6)     [1 octet]
      The IPv6 address to which the connection was made [16 octets]
      A number of seconds (TTL) for which the address may be cached [4 octets]
  [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
  field.  No version of Tor currently generates the IPv6 format.]
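
  The two payload formats, plus the older TTL-less IPv4 form, could be
  told apart as in the following Python sketch.  It assumes that the
  payload has already been trimmed to its 'Length' bytes and that the
  TTL is in network (big-endian) byte order; parse_connected is a
  hypothetical name.

```python
import ipaddress
import struct


def parse_connected(body: bytes):
    """Return (address, ttl); ttl is None when the TTL field is absent."""
    # IPv6 form: four zero octets, type 6, 16 address octets, 4 TTL octets.
    if len(body) == 25 and body[:4] == bytes(4) and body[4] == 6:
        address = ipaddress.IPv6Address(body[5:21])
        (ttl,) = struct.unpack("!I", body[21:25])
        return address, ttl
    # IPv4 form: 4 address octets followed by 4 TTL octets.
    if len(body) == 8:
        (ttl,) = struct.unpack("!I", body[4:8])
        return ipaddress.IPv4Address(body[:4]), ttl
    # Older form without a TTL field.
    if len(body) == 4:
        return ipaddress.IPv4Address(body), None
    raise ValueError("unrecognized RELAY_CONNECTED payload")
```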

  The OP waits for a RELAY_CONNECTED cell before sending any data.
  Once a connection has been established, the OP and exit node
  package UDP data in RELAY_DATA_UDP cells, and upon receiving such
  cells, echo their contents to the corresponding socket.
  RELAY_DATA_UDP cells sent to unrecognized streams are dropped.

  RELAY_DROP_UDP cells are long-range dummies; upon receiving such
  a cell, the OR or OP must drop it.

5.3. Closing streams

  UDP tunnels are closed in a fashion corresponding to TCP connections.

6. Flow Control

  UDP streams are not subject to flow control.

7.2. Router descriptor format.

The items' formats are as follows:
   "router" nickname address ORPort SocksPort DirPort UDPPort

      Indicates the beginning of a router descriptor.  "address" must be
      an IPv4 address in dotted-quad format. The last four numbers
      indicate the ports at which this OR exposes
      functionality. ORPort is a port at which this OR accepts TLS
      connections for the main OR protocol; SocksPort is deprecated and
      should always be 0; DirPort is the port at which this OR accepts
      directory-related HTTP connections; and UDPPort is a port at which
      this OR accepts DTLS connections for UDP data.  If any port is not
      supported, the value 0 is given instead of a port number.
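
      As an illustration, the extended "router" line could be parsed
      like this in Python.  RouterLine and parse_router_line are
      hypothetical names, and the port range check is an assumption.

```python
import ipaddress
from typing import NamedTuple


class RouterLine(NamedTuple):
    nickname: str
    address: ipaddress.IPv4Address
    or_port: int
    socks_port: int
    dir_port: int
    udp_port: int


def parse_router_line(line: str) -> RouterLine:
    """Parse '"router" nickname address ORPort SocksPort DirPort UDPPort'."""
    parts = line.split()
    if len(parts) != 7 or parts[0] != "router":
        raise ValueError("not a router line with four ports")
    nickname = parts[1]
    address = ipaddress.IPv4Address(parts[2])  # raises ValueError if malformed
    ports = [int(p) for p in parts[3:]]
    if any(not 0 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    # A port value of 0 means the corresponding service is not offered.
    return RouterLine(nickname, address, *ports)
```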

Other sections:

What changes need to happen to each node's exit policy to support this? -RD

Switching to UDP means managing the queues of incoming packets better,
so we don't miss packets. How does this interact with doing large public
key operations (handshakes) in the same thread? -RD

========================================================================
COMMENTS
========================================================================

[16 May 2006]

I don't favor this approach; it makes packet traffic partitioned from
stream traffic end-to-end.  The architecture I'd like to see is:

  A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
    TCP/TLS for firewall penetration or something.  (This also gives us an
    upgrade path for routing through legacy servers.)

  B Stream traffic is handled with end-to-end per-stream acks/naks and
    retries.  On failure, the data is retransmitted in a new RELAY_DATA cell;
    a cell isn't retransmitted.

We'll need to do A anyway, to fix our behavior on packet-loss.  Once we've
done so, B is more or less inevitable, and we can support end-to-end UDP
traffic "for free".

(Also, there are some details that this draft spec doesn't address.  For
example, what happens when a UDP packet doesn't fit in a single cell?)

-NM

Filename: 101-dir-voting.txt
Title: Voting on the Tor Directory System
Author: Nick Mathewson
Created: Nov 2006
Status: Closed
Implemented-In: 0.2.0.x

Overview

  This document describes a consensus voting scheme for Tor directories;
  instead of publishing different network statuses, directories would vote on
  and publish a single "consensus" network status document.

  This is an open proposal.

Proposal:

0. Scope and preliminaries

  This document describes a consensus voting scheme for Tor directories.
  Once it's accepted, it should be merged with dir-spec.txt.  Some
  preliminaries for authority and caching support should be done during
  the 0.1.2.x series; the main deployment should come during the 0.2.0.x
  series.

0.1. Goals and motivation: voting.

  The current directory system relies on clients downloading separate
  network status statements from the caches signed by each directory.
  Clients download a new statement every 30 minutes or so, choosing to
  replace the oldest statement they currently have.

  This creates a partitioning problem: different clients have different
  "most recent" networkstatus sources, and different versions of each
  (since authorities change their statements often).

  It also creates a scaling problem: most of the downloaded networkstatus
  are probably quite similar, and the redundancy grows as we add more
  authorities.

  So if we have clients only download a single multiply signed consensus
  network status statement, we can:
       - Save bandwidth
       - Reduce client partitioning
       - Reduce client-side and cache-side storage
       - Simplify client-side voting code (by moving voting away from the
         client)

  We should try to do this without:
       - Assuming that client-side or cache-side clocks are more correct
         than we assume now.
       - Assuming that authority clocks are perfectly correct.
       - Degrading badly if a few authorities die or are offline for a bit.

  We do not have to perform well if:
      - No clique of more than half the authorities can agree about who
        the authorities are.

1. The idea.

  Instead of publishing a network status whenever something changes,
  each authority instead publishes a fresh network status only once per
  "period" (say, 60 minutes).  Authorities either upload this network
  status (or "vote") to every other authority, or download every other
  authority's "vote" (see 3.1 below for discussion on push vs pull).

  After an authority has (or has become convinced that it won't be able to
  get) every other authority's vote, it deterministically computes a
  consensus networkstatus, and signs it.  Authorities download (or are
  uploaded; see 3.1) one another's signatures, and form a multiply signed
  consensus.  This multiply-signed consensus is what caches cache and what
  clients download.

  If an authority is down, authorities vote based on what they *can*
  download/get uploaded.

  If an authority is "a little" down and only some authorities can reach
  it, authorities try to get its info from other authorities.

  If an authority computes the vote wrong, its signature isn't included on
  the consensus.

  Clients use a consensus if it is "trusted": signed by more than half the
  authorities they recognize. If clients can't find any such consensus,
  they use the most recent trusted consensus they have. If they don't
  have any trusted consensus, they warn the user and refuse to operate
  (and if DirServers is not the default, beg the user to adapt the list
  of authorities).

2. Details.

2.0. Versioning

  All documents generated here have version "3" given in their
  network-status-version entries.

2.1. Vote specifications

  Votes in v3 are similar to v2 network status documents.  We add these
  fields to the preamble:

     "vote-status" -- the word "vote".

     "valid-until" -- the time when this authority expects to publish its
        next vote.

     "known-flags" -- a space-separated list of flags that will sometimes
        be included on "s" lines later in the vote.

     "dir-source" -- as before, except the "hostname" part MUST be the
        authority's nickname, which MUST be unique among authorities, and
        MUST match the nickname in the "directory-signature" entry.

  Authorities SHOULD cache their most recently generated votes so they
  can persist them across restarts.  Authorities SHOULD NOT generate
  another document until valid-until has passed.

  Router entries in the vote MUST be sorted in ascending order by router
  identity digest.  The flags in "s" lines MUST appear in alphabetical
  order.

  Votes SHOULD be synchronized to half-hour publication intervals (one
  hour? XXX say more; be more precise.)

  XXXX some way to request older networkstatus docs?

2.2. Consensus directory specifications

  Consensuses are like v3 votes, except for the following fields:

     "vote-status" -- the word "consensus".

     "published" is the latest of all the published times on the votes.

     "valid-until" is the earliest of all the valid-until times on the
       votes.

     "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
       are included for each authority that contributed to the vote.

     "vote-digest" for each authority that contributed to the vote,
       calculated as for the digest in the signature on the vote. [XXX
       re-English this sentence]

     "client-versions" and "server-versions" are sorted in ascending
       order based on version-spec.txt.

     "dir-options" and "known-flags" are not included.
[XXX really? why not list the ones that are used in the consensus?
For example, right now BadExit is in use, but no servers would be
labelled BadExit, and it's still worth knowing that it was considered
by the authorities. -RD]

  The fields MUST occur in the following order:
     "network-status-version"
     "vote-status"
     "published"
     "valid-until"
     For each authority, sorted in ascending order of nickname, case-
     insensitively:
         "dir-source", "fingerprint", "contact", "dir-signing-key",
         "vote-digest".
     "client-versions"
     "server-versions"

  The signatures at the end of the document appear as multiple instances
  of directory-signature, sorted in ascending order by nickname,
  case-insensitively.

  A router entry should be included in the result if it is included by more
  than half of the authorities (total authorities, not just those whose votes
  we have).  A router entry has a flag set if it is included by more than
  half of the authorities who care about that flag.  [XXXX this creates an
  incentive for attackers to DOS authorities whose votes they don't like.
  Can we remember what flags people set the last time we saw them? -NM]
  [Which 'we' are we talking here? The end-users never learn which
  authority sets which flags. So you're thinking the authorities
  should record the last vote they saw from each authority and if it's
  within a week or so, count all the flags that it advertised as 'no'
  votes? Plausible. -RD]
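
  The inclusion and flag rules above can be sketched in Python as
  follows.  The vote representation is hypothetical, and the sketch
  assumes that an authority "cares about" a flag exactly when it lists
  that flag in its known-flags line.  It does not attempt the "remember
  earlier votes" idea raised in the comments.

```python
from collections import Counter


def compute_consensus(votes, n_authorities):
    """Return {identity: [flags]} in ascending identity order.

    votes: list of dicts with 'known_flags' (set of str) and
           'routers' (dict mapping identity digest -> set of flags).
    n_authorities: total number of authorities, including any whose
           votes are missing.
    """
    listed = Counter()
    for vote in votes:
        listed.update(vote["routers"].keys())
    all_flags = set().union(*(v["known_flags"] for v in votes))

    consensus = {}
    for identity in sorted(listed):
        # Included only if listed by more than half of ALL authorities.
        if 2 * listed[identity] <= n_authorities:
            continue
        flags = []
        for flag in sorted(all_flags):
            carers = [v for v in votes if flag in v["known_flags"]]
            yes = sum(1 for v in carers
                      if flag in v["routers"].get(identity, ()))
            # Set only if more than half of the caring authorities agree.
            if 2 * yes > len(carers):
                flags.append(flag)
        consensus[identity] = flags
    return consensus
```

  Note that a missing vote counts against inclusion of a router (the
  denominator is the total number of authorities) but not against a
  flag, which is the incentive problem noted above.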

  The signature hash covers from the "network-status-version" line through
  the characters "directory-signature" in the first "directory-signature"
  line.
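
  A short Python sketch of extracting the signed portion.  It assumes
  that the document begins with the "network-status-version" line and
  that the hash is SHA-1; neither is stated in the paragraph above.

```python
import hashlib

START = "network-status-version"
SIG = "directory-signature"


def signed_portion(document: str) -> str:
    """Text from the first line through the keyword 'directory-signature'
    on the first directory-signature line (the keyword itself included)."""
    if not document.startswith(START):
        raise ValueError("document must begin with " + START)
    idx = document.find("\n" + SIG)
    if idx < 0:
        raise ValueError("no directory-signature line found")
    return document[:idx + 1 + len(SIG)]


def consensus_digest(document: str) -> str:
    """Hex digest of the signed portion (SHA-1 is an assumption here)."""
    return hashlib.sha1(signed_portion(document).encode("ascii")).hexdigest()
```

  Since the covered text stops before the nickname on the first
  signature line, every authority signs the same bytes.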

  Consensus directories SHOULD be rejected if they are not signed by more
  than half of the known authorities.

2.2.1. Detached signatures

  Assuming full connectivity, every authority should compute and sign the
  same consensus directory in each period.  Therefore, it isn't necessary to
  download the consensus computed by each authority; instead, the authorities
  only push/fetch each others' signatures.  A "detached signature" document
  contains a single "consensus-digest" entry and one or more
  directory-signature entries. [XXXX specify more.]

2.3. URLs and timelines

2.3.1. URLs and timeline used for agreement

  An authority SHOULD publish its vote immediately at the start of each voting
  period.  It does this by making it available at
     http://<hostname>/tor/status-vote/current/authority.z
  and sending it in an HTTP POST request to each other authority at the URL
     http://<hostname>/tor/post/vote

  If, N minutes after the voting period has begun, an authority does not have
  a current statement from another authority, the first authority retrieves
  the other's statement.

  Once an authority has a vote from another authority, it makes it available
  at
      http://<hostname>/tor/status-vote/current/<fp>.z
  where <fp> is the fingerprint of the other authority's identity key.

  The consensus network status, along with as many signatures as the server
  currently knows, should be available at
      http://<hostname>/tor/status-vote/current/consensus.z
  All of the detached signatures it knows for consensus status should be
  available at:
      http://<hostname>/tor/status-vote/current/consensus-signatures.z

  Once an authority has computed and signed a consensus network status, it
  should send its detached signature to each other authority in an HTTP POST
  request to the URL:
      http://<hostname>/tor/post/consensus-signature


  [XXXX Store votes to disk.]

2.3.2. Serving a consensus directory

  Once the authority is done getting signatures on the consensus directory,
  it should serve it from:
      http://<hostname>/tor/status/consensus.z

  Caches SHOULD download consensus directories from an authority and serve
  them from the same URL.

2.3.3. Timeline and synchronization

  [XXXX]

2.4. Distributing routerdescs between authorities

  Consensus will be more meaningful if authorities take steps to make sure
  that they all have the same set of descriptors _before_ the voting
  starts.  This is safe, since all descriptors are self-certified and
  timestamped: it's always okay to replace a signed descriptor with a more
  recent one signed by the same identity.

  In the long run, we might want some kind of sophisticated process here.
  For now, since authorities already download one another's networkstatus
  documents and use them to determine what descriptors to download from one
  another, we can rely on this existing mechanism to keep authorities up to
  date.

  [We should do a thorough read-through of dir-spec again to make sure
  that the authorities converge on which descriptor to "prefer" for
  each router. Right now the decision happens at the client, which is
  no longer the right place for it. -RD]

3. Questions and concerns

3.1. Push or pull?

  The URLs above define a push mechanism for publishing votes and consensus
  signatures via HTTP POST requests, and a pull mechanism for downloading
  these documents via HTTP GET requests.  As specified, every authority will
  post to every other.  The "download if no copy has been received" mechanism
  exists only as a fallback.

4. Migration

     * It would be cool if caches could get ready to download consensus
       status docs, verify enough signatures, and serve them now.  That way
       once stuff works all we need to do is upgrade the authorities.  Caches
       don't need to verify the correctness of the format so long as it's
       signed (or maybe multisigned?).  We need to make sure that caches back
       off very quickly from downloading consensus docs until they're
       actually implemented.

Filename: 102-drop-opt.txt
Title: Dropping "opt" from the directory format
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document proposes a change in the format used to transmit router and
  directory information.

  This proposal has been accepted, implemented, and merged into dir-spec.txt.

Proposal:

  The "opt" keyword in Tor's directory formats was originally intended to
  mean, "it is okay to ignore this entry if you don't understand it"; the
  default behavior has been "discard a routerdesc if it contains entries you
  don't recognize."

  But so far, every new flag we have added has been marked 'opt'.  It would
  probably make sense to change the default behavior to "ignore unrecognized
  fields", and add the statement that clients SHOULD ignore fields they don't
  recognize.  As a meta-principle, we should say that clients and servers
  MUST NOT have to understand new fields in order to use directory documents
  correctly.

  Of course, this will make it impossible to say, "The format has changed a
  lot; discard this quietly if you don't understand it." We could do that by
  adding a version field.

Status:

     * We stopped requiring it as of 0.1.2.5-alpha.  We'll stop generating it
       once earlier formats are obsolete.


Filename: 103-multilevel-keys.txt
Title: Splitting identity key from regularly used signing key
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document proposes a change in the way identity keys are used, so that
  highly sensitive keys can be password-protected and seldom loaded into RAM.

  It presents options; it is not yet a complete proposal.

Proposal:

  Replacing a directory authority's identity key in the event of a compromise
  would be tremendously annoying.  We'd need to tell every client to switch
  their configuration, or update to a new version with an uploaded list.  So
  long as some weren't upgraded, they'd be at risk from whoever had
  compromised the key.

  With this in mind, it's a shame that our current protocol forces us to
  store identity keys unencrypted in RAM.  We need some kind of signing key
  stored unencrypted, since we need to generate new descriptors/directories
  and rotate link and onion keys regularly.  (And since, of course, we can't
  ask server operators to be on-hand to enter a passphrase every time we
  want to rotate keys or sign a descriptor.)

  The obvious solution seems to be to have a signing-only key that lives
  indefinitely (months or longer) and signs descriptors and link keys, and a
  separate identity key that's used to sign the signing key.  Tor servers
  could run in one of several modes:
    1. Identity key stored encrypted.  You need to pick a passphrase when
       you enable this mode, and re-enter this passphrase every time you
       rotate the signing key.
    1'. Identity key stored separate.  You save your identity key to a
       floppy, and use the floppy when you need to rotate the signing key.
    2. All keys stored unencrypted.  In this case, we might not want to even
       *have* a separate signing key.  (We'll need to support no-separate-
       signing-key mode anyway to keep old servers working.)
    3. All keys stored encrypted. You need to enter a passphrase to start
       Tor.
  (Of course, we might not want to implement all of these.)

  Case 1 is probably most usable and secure, if we assume that people don't
  forget their passphrases or lose their floppies.  We could mitigate this a
  bit by encouraging people to PGP-encrypt their passphrases to themselves,
  or keep a cleartext copy of their secret key secret-split into a few
  pieces, or something like that.

  Migration presents another difficulty, especially with the authorities.  If
  we use the current set of identity keys as the new identity keys, we're in
  the position of having sensitive keys that have been stored on
  media-of-dubious-encryption up to now.  Also, we need to keep old clients
  (who will expect descriptors to be signed by the identity keys they know
  and love, and who will not understand signing keys) happy.

A possible solution:

  One thing to consider is that router identity keys are not very sensitive:
  if an OR disappears and reappears with a new key, the network treats it as
  though an old router had disappeared and a new one had joined the network.
  The Tor network continues unharmed; this isn't a disaster.

  Thus, the ideas above are mostly relevant for authorities.

  The most straightforward solution for the authorities is probably to take
  advantage of the protocol transition that will come with proposal 101, and
  introduce a new set of signing _and_ identity keys used only to sign votes
  and consensus network-status documents.  Signing and identity keys could be
  delivered to users in a separate, rarely changing "keys" document, so that
  the consensus network-status documents wouldn't need to include N signing
  keys, N identity keys, and N certifications.

  Note also that there is no reason that the identity/signing keys used by
  directory authorities would necessarily have to be the same as the identity
  keys those authorities use in their capacity as routers.  Decoupling these
  keys would give directory authorities the following set of keys:

       Directory authority identity:
           Highly confidential; stored encrypted and/or offline.  Used to
           identify directory authorities.  Shipped with clients.  Used to
           sign Directory authority signing keys.

       Directory authority signing key:
           Stored online, accessible to regular Tor process.  Used to sign
           votes and consensus directories.  Downloaded as part of a "keys"
           document.

           [Administrators SHOULD rotate their signing keys every month or
           two, just to keep in practice and keep from forgetting the
           password to the authority identity.]

       V1-V2 directory authority identity:
           Stored online, never changed.  Used to sign legacy network-status
           and directory documents.

       Router identity:
           Stored online, seldom changed.  Used to sign server descriptors
           for this authority in its role as a router.  Implicitly certified
           by being listed in network-status documents.

       Onion key, link key:
           As in tor-spec.txt


Extensions to Proposal 101.

  Define a new document type, "Key certificate".  It contains the
  following fields, in order:

    "dir-key-certificate-version": As network-status-version.  Must be
         "3".
    "fingerprint": Hex fingerprint, with spaces, based on the directory
         authority's identity key.
    "dir-identity-key": The long-term identity key for this authority.
    "dir-key-published": The time when this directory's signing key was
         last changed.
    "dir-key-expires": A time after which this key is no longer valid.
    "dir-signing-key": As in proposal 101.
    "dir-key-certification": A signature of the above fields, in order.
         The signed material extends from the beginning of
         "dir-key-certificate-version" through the newline after
         "dir-key-certification".  The identity key is used to generate
         this signature.

      These elements together constitute a "key certificate".  These are
      generated offline when starting a v3 authority.  Private identity
      keys SHOULD be stored offline, encrypted, or both.  A running
      authority only needs access to the signing key.

      Unlike other keys currently used by Tor, the authority identity
      keys and directory signing keys MAY be longer than 1024 bits.
      (They SHOULD be 2048 bits or longer; they MUST NOT be shorter than
      1024.)

  Vote documents change as follows:

      A key certificate MUST be included in-line in every vote document.  With
      the exception of "fingerprint", its elements MUST NOT appear in consensus
      documents.

  Consensus network statuses change as follows:

      Remove dir-signing-key.

      Change "directory-signature" to take a fingerprint of the authority's
      identity key and a fingerprint of the authority's current signing key
      rather than the authority's nickname.

      Change "dir-source" to take a fingerprint of the authority's
      identity key rather than the authority's nickname or hostname.

  Add a new document type:

      A "keys" document contains all currently known key certificates.
      All authorities serve it at

          http://<hostname>/tor/status/keys.z

      Caches and clients download the keys document whenever they receive a
      consensus vote that uses a key they do not recognize.  Caches download
      from authorities; clients download from caches.

  Processing votes:

      When receiving a vote, authorities check to see if the key
      certificate for the voter is different from the one they have.  If
      the key certificate _is_ different, and its dir-key-published is
      more recent than the most recently known one, and it is
      well-formed and correctly signed with the correct identity key,
      then authorities remember it as the new canonical key certificate
      for that voter.
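
      The replacement rule can be sketched in Python as below.  The
      KeyCertificate type is hypothetical, the 'verified' argument
      stands in for the well-formedness and signature checks, and
      accepting a first certificate when none is known is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class KeyCertificate:
    fingerprint: str      # fingerprint of the authority identity key
    published: datetime   # dir-key-published
    body: bytes           # raw certificate text


def update_certificate(known: Optional[KeyCertificate],
                       received: KeyCertificate,
                       verified: bool) -> Optional[KeyCertificate]:
    """Return the certificate to treat as canonical for this voter."""
    if known is not None and received.body == known.body:
        return known                      # same certificate, nothing to do
    if not verified:
        return known                      # ill-formed or badly signed
    if known is not None and received.published <= known.published:
        return known                      # not more recent than what we have
    return received
```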

  A key certificate is invalid if any of the following hold:
      * The version is unrecognized.
      * The fingerprint does not match the identity key.
      * The identity key or the signing key is ill-formed.
      * The published date is very far in the past or future.
      * The signature is not a valid signature of the key certificate
        generated with the identity key.

  When processing the signatures on consensus, clients and caches act as
  follows:

      1. Only consider the directory-signature entries whose identity
         key hashes match trusted authorities.

      2. If any such entries have signing key hashes that match unknown
         signing keys, download a new keys document.

      3. For every entry with a known (identity key,signing key) pair,
         check the signature on the document.

      4. If the document has been signed by more than half of the
         authorities the client recognizes, treat the consensus as
         correctly signed.

         If not, but the number of entries with known identity keys but
         unknown signing keys might be enough to make the consensus
         correctly signed, do not use the consensus, but do not discard
         it until we have a new keys document.
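
      The four steps above might look like this in Python.  All names
      are hypothetical; 'signature_ok' stands in for the actual
      signature check, and the REJECT outcome for the remaining case
      follows the "more than half" rule from proposal 101.

```python
from enum import Enum


class Verdict(Enum):
    SIGNED = "signed"   # usable now
    HOLD = "hold"       # keep it, fetch a new keys document, then retry
    REJECT = "reject"   # cannot reach a majority even with new keys


def check_consensus(signatures, trusted_identities, known_pairs, signature_ok):
    """signatures: iterable of (identity_fp, signing_fp) pairs.
    trusted_identities: set of identity fingerprints the client recognizes.
    known_pairs: set of (identity_fp, signing_fp) from the keys document.
    signature_ok: callable(identity_fp, signing_fp) -> bool.
    """
    good, unknown = set(), set()
    for identity, signing in signatures:
        if identity not in trusted_identities:
            continue                                  # step 1
        if (identity, signing) not in known_pairs:
            unknown.add(identity)                     # step 2
        elif signature_ok(identity, signing):
            good.add(identity)                        # step 3
    needed = len(trusted_identities) // 2 + 1         # "more than half"
    if len(good) >= needed:
        return Verdict.SIGNED                         # step 4
    if len(good | unknown) >= needed:
        return Verdict.HOLD
    return Verdict.REJECT
```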
Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document proposes moving unused-by-clients information from regular
  router descriptors into a new "extra info" router descriptor.

Proposal:

  Some of the costliest fields in the current directory protocol are ones
  that no client actually uses.  In particular, the "read-history" and
  "write-history" fields are used only by the authorities for monitoring the
  status of the network.  If we took them out, the size of a compressed list
  of all the routers would fall by about 60%.  (No other disposable field
  would save much more than 2%.)

  We propose to remove these fields from descriptors, and have them
  uploaded to the authorities as part of a separate "extra info" document.
  This document will be signed.  A hash of this document will be included in
  the regular descriptors.

  (We considered another design, where routers would generate and upload a
  short-form and a long-form descriptor.  Only the short-form descriptor would
  ever be used by anybody for routing.  The long-form descriptor would be
  used only for analytics and other tools.   We decided against this because
  well-behaved tools would need to download short-form descriptors too (as
  these would be the only ones indexed), and hence get redundant info. Badly
  behaved tools would download only long-form descriptors, and expose
  themselves to partitioning attacks.)

Other disposable fields:

  Clients don't need these fields, but removing them doesn't help bandwidth
  enough to be worthwhile.
    contact (save about 1%)
    fingerprint (save about 3%)

  We could represent these fields more succinctly, but removing them would
  only save 1%.  (!)
    reject
    accept
  (Apparently, exit policies are highly compressible.)

  [Does size-on-disk matter to anybody? Some clients and servers don't
   have much disk, or have really slow disk (e.g. USB). And we don't
   store caches compressed right now. -RD]

Specification:

  1. Extra Info Format.

    An "extra info" descriptor contains the following fields:

    "extra-info" Nickname Fingerprint
        Identifies what router this is an extra info descriptor for.
        Fingerprint is encoded in hex (using upper-case letters), with
        no spaces.

    "published" As currently documented in dir-spec.txt.  It MUST match the
        "published" field of the descriptor published at the same time.

    "read-history"
    "write-history"
        As currently documented in dir-spec.txt.  Optional.

    "router-signature" NL Signature NL

        A signature of the PKCS1-padded hash of the entire extra info
        document, taken from the beginning of the "extra-info" line, through
        the newline after the "router-signature" line.  An extra info
        document is not valid unless the signature is performed with the
        identity key whose digest matches Fingerprint.

    The "extra-info" field is required and MUST appear first.  The
    router-signature field is required and MUST appear last.  All others are
    optional.  As for other documents, unrecognized fields must be ignored.
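
    A Python sketch of the ordering rules.  It assumes that the
    signature itself is carried in a "-----BEGIN ...-----" /
    "-----END ...-----" block, as in other directory documents, and
    that each of the two required fields appears exactly once; the
    function names are hypothetical.

```python
def extra_info_keywords(document: str) -> list:
    """Return the keyword of every item line, skipping signature blocks."""
    keywords = []
    in_object = False
    for line in document.splitlines():
        if line.startswith("-----BEGIN "):
            in_object = True
        elif line.startswith("-----END "):
            in_object = False
        elif not in_object and line.strip():
            keywords.append(line.split()[0])
    return keywords


def check_extra_info_layout(document: str) -> list:
    """Raise ValueError unless 'extra-info' is first and
    'router-signature' is last; unrecognized fields are tolerated."""
    keywords = extra_info_keywords(document)
    if not keywords or keywords[0] != "extra-info":
        raise ValueError("extra-info must appear first")
    if keywords[-1] != "router-signature":
        raise ValueError("router-signature must appear last")
    if keywords.count("extra-info") != 1 or keywords.count("router-signature") != 1:
        raise ValueError("required fields must appear exactly once")
    return keywords
```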

  2. Existing formats

     Implementations that use "read-history" and "write-history" SHOULD
     continue accepting router descriptors that contain them.  (Prior to
     0.2.0.x, this information was encoded in ordinary router descriptors;
     in any case they have always been listed as opt, so they should be
     accepted anyway.)

     Add these fields to router descriptors:

       "extra-info-digest" Digest
          "Digest" is a hex-encoded digest (using upper-case characters)
          of the router's extra-info document, as signed in the router's
          extra-info.  (If this field is absent, no extra-info-digest
          exists.)

       "caches-extra-info"
          Present if this router is a directory cache that provides
          extra-info documents, or an authority that handles extra-info
          documents.

     (Since implementations before 0.1.2.5-alpha required that the "opt"
     keyword precede any unrecognized entry, these keys MUST be preceded
     with "opt" until 0.1.2.5-alpha is obsolete.)

  3. New communications rules

     Servers SHOULD generate and upload one extra-info document after each
     descriptor they generate and upload; no more, no less.  Servers MUST
     upload the new descriptor before they upload the new extra-info.

     Authorities receiving an extra-info document SHOULD verify all of the
     following:
       * They have a router descriptor for some server with a matching
         nickname and identity fingerprint.
       * That server's identity key has been used to sign the extra-info
         document.
       * The extra-info-digest field in the router descriptor matches
         the digest of the extra-info document.
       * The published fields in the two documents match.

     Authorities SHOULD drop extra-info documents that do not meet these
     criteria.
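
     The four checks can be sketched in Python as follows.  The record
     types are hypothetical; in particular, 'signer_fingerprint' is
     assumed to be the fingerprint of the key whose signature on the
     extra-info document has already been verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    nickname: str
    fingerprint: str
    published: str
    extra_info_digest: str


@dataclass(frozen=True)
class ExtraInfo:
    nickname: str
    fingerprint: str
    published: str
    digest: str
    signer_fingerprint: str


def accept_extra_info(extra: ExtraInfo, descriptors) -> bool:
    """True if some known descriptor satisfies all four criteria."""
    for desc in descriptors:
        if (desc.nickname == extra.nickname
                and desc.fingerprint == extra.fingerprint       # matching server
                and extra.signer_fingerprint == desc.fingerprint  # signed by it
                and desc.extra_info_digest == extra.digest      # digest matches
                and desc.published == extra.published):         # same timestamp
            return True
    return False
```

     An authority following the rule above would drop any document for
     which accept_extra_info returns False.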

     Extra-info documents MAY be uploaded as part of the same HTTP post as
     the router descriptor, or separately.  Authorities MUST accept both
     methods.

     Authorities SHOULD try to fetch extra-info documents from one another if
     they do not have one matching the digest declared in a router
     descriptor.

     Caches that are running locally with a tool that needs to use extra-info
     documents MAY download and store extra-info documents.  They should do
     so when they notice that the recommended descriptor has an
     extra-info-digest not matching any extra-info document they currently
     have.  (Caches not running on a host that needs to use extra-info
     documents SHOULD NOT download or cache them.)

  4. New URLs

     http://<hostname>/tor/extra/d/...
     http://<hostname>/tor/extra/fp/...
     http://<hostname>/tor/extra/all[.z]
        (As for /tor/server/ URLs: supports fetching extra-info documents
        by their digest, by the fingerprint of their servers, or all
        at once. When serving by fingerprint, we serve the extra-info
        that corresponds to the descriptor we would serve by that
        fingerprint. Only directory authorities are guaranteed to support
        these URLs.)

     http://<hostname>/tor/extra/authority[.z]
        (The extra-info document for this router.)

     Extra-info documents are uploaded to the same URLs as regular
     router descriptors.

Migration:

  For extra info approach:
     * First:
       * Authorities should accept extra info, and support serving it.
       * Routers should upload extra info once authorities accept it.
       * Caches should support an option to download and cache it, once
         authorities serve it.
       * Tools should be updated to use locally cached information.
         These tools include:
           lefkada's exit.py script.
           tor26's noreply script and general directory cache.
           https://nighteffect.us/tns/ for its graphs
           and check with or-talk for the rest, once it's time.

     * Set a cutoff time for including bandwidth in router descriptors, so
       that tools that use bandwidth info know that they will need to fetch
       extra info documents.

     * Once tools that want bandwidth info support fetching extra info:
       * Have routers stop including bandwidth info in their router
         descriptors.
Filename: 105-handshake-revision.txt
Title: Version negotiation for the Tor protocol.
Author: Nick Mathewson, Roger Dingledine
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document was extracted from a modified version of tor-spec.txt that we
  had written before the proposal system went into place.  It adds two new
  cell types to the Tor link connection setup handshake: one used for
  version negotiation, and another to prevent MITM attacks.

  This proposal is partially implemented, and partially superseded by
  proposal 130.

Motivation: Tor versions

   Our *current* approach to versioning the Tor protocol(s) has been as
   follows:
     - All changes must be backward compatible.
     - It's okay to add new cell types, if they would be ignored by previous
       versions of Tor.
     - It's okay to add new data elements to cells, if they would be
       ignored by previous versions of Tor.
     - For forward compatibility, Tor must ignore cell types it doesn't
       recognize, and ignore data in those cells it doesn't expect.
     - Clients can inspect the version of Tor declared in the platform line
       of a router's descriptor, and use that to learn whether a server
       supports a given feature.  Servers, however, aren't assumed to all
       know about each other, and so don't know the version of who they're
       talking to.

   This system has these problems:
     - It's very hard to change fundamental aspects of the protocol, like the
       cell format, the link protocol, any of the various encryption schemes,
       and so on.
     - The router-to-router link protocol has remained more-or-less frozen
       for a long time, since we can't easily have an OR use new features
       unless it knows the other OR will understand them.

   We need to resolve these problems because:
     - Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will
       not seem like the best idea for all time.
     - There are many ideas circulating for multiple cell sizes; while it's
       not obvious whether these are safe, we can't do them at all without a
       mechanism to permit them.
     - There are many ideas circulating for alternative circuit building and
       cell relay rules: they don't work unless they can coexist in the
       current network.
     - If our protocol changes a lot, it's hard to describe any coherent
       version of it: we need to say "the version that Tor versions W through
       X use when talking to versions Y through Z".  This makes analysis
       harder.

Motivation: Preventing MITM attacks

   TLS prevents a man-in-the-middle attacker from reading or changing the
   contents of a communication.  It does not, however, prevent such an
   attacker from observing timing information.  Since timing attacks are some
   of the most effective against low-latency anonymity nets like Tor, we
   should take more care to make sure that we're not only talking to who
   we think we're talking to, but that we're using the network path we
   believe we're using.

Motivation: Signed clock information

   It's very useful for Tor instances to know how skewed they are relative
   to one another.  The only way to find out currently has been to download
   directory information, and check the Date header--but this is not
   authenticated, and hence subject to modification on the wire.  Using
   BEGIN_DIR to create an authenticated directory stream through an existing
   circuit is better, but that's an extra step and it might be nicer to
   learn the information in the course of the regular protocol.

Proposal:

1.0. Version numbers

   The node-to-node TLS-based "OR connection" protocol and the multi-hop
   "circuit" protocol are versioned quasi-independently.

   Of course, some dependencies will continue to exist: Certain versions
   of the circuit protocol may require a minimum version of the connection
   protocol to be used.  The connection protocol affects:
     - Initial connection setup, link encryption, transport guarantees,
       etc.
     - The allowable set of cell commands
     - Allowable formats for cells.

   The circuit protocol determines:
     - How circuits are established and maintained
     - How cells are decrypted and relayed
     - How streams are established and maintained.

   Version numbers are incremented for backward-incompatible protocol changes
   only.  Backward-compatible changes are generally implemented by adding
   additional fields to existing structures; implementations MUST ignore
   fields they do not expect.  Unused portions of cells MUST be set to zero.

   Though versioning the protocol will make it easier to maintain backward
   compatibility with older versions of Tor, we will nevertheless continue to
   periodically drop support for older protocols,
      - to keep the implementation from growing without bound,
      - to limit the maintenance burden of patching bugs in obsolete Tors,
      - to limit the testing burden of verifying that many old protocol
        versions continue to be implemented properly, and
      - to limit the exposure of the network to protocol versions that are
        expensive to support.

   The Tor protocol as implemented through the 0.1.2.x Tor series will be
   called "version 1" in its link protocol and "version 1" in its relay
   protocol.  Versions of the Tor protocol so old as to be incompatible with
   Tor 0.1.2.x can be considered to be version 0 of each, and are not
   supported.

2.1. VERSIONS cells

   When a Tor connection is established, both parties normally send a
   VERSIONS cell before sending any other cells.  (But see below.)

         VersionsLen          [2 bytes]
         Versions             [VersionsLen bytes]

   "Versions" is a sequence of VersionsLen bytes.  Each value between 1 and
   127 inclusive represents a single version; current implementations MUST
   ignore other bytes.  Parties should list all of the versions which they
   are able and willing to support.  Parties can only communicate if they
   have some connection protocol version in common.

   Version 0.2.0.x-alpha and earlier don't understand VERSIONS cells,
   and therefore don't support version negotiation.  Thus, waiting until
   the other side has sent a VERSIONS cell won't work for these servers:
   if the other side sends no cells back, it is impossible to tell
   whether they have sent a VERSIONS cell that has been stalled, or
   whether they have dropped our own VERSIONS cell as unrecognized.
   Therefore, we'll
   change the TLS negotiation parameters so that old parties can still
   negotiate, but new parties can recognize each other.  Immediately
   after a TLS connection has been established, the parties check
   whether the other side negotiated the connection in an "old" way or a
   "new" way.  If either party negotiated in the "old" way, we assume a
   v1 connection.  Otherwise, both parties send VERSIONS cells listing
   all their supported versions.  Upon receiving the other party's
   VERSIONS cell, the implementation begins using the highest-valued
   version common to both cells.  If the first cell from the other party
   has a recognized command, and is _not_ a VERSIONS cell, we assume a
   v1 protocol.
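
   As a rough sketch of the selection rule, the payload layout given above
   can be parsed and the highest common version chosen like this.  The
   function names are hypothetical, and the sketch assumes that
   VersionsLen is encoded in network (big-endian) byte order, which this
   section does not state.

```python
import struct


def encode_versions_payload(versions):
    """Build VersionsLen (2 bytes) followed by one byte per version."""
    body = bytes(sorted(versions))
    # Assumption: VersionsLen is big-endian.
    return struct.pack("!H", len(body)) + body


def decode_versions_payload(payload):
    """Return the set of versions listed, ignoring bytes outside 1..127."""
    (length,) = struct.unpack("!H", payload[:2])
    body = payload[2:2 + length]
    return {b for b in body if 1 <= b <= 127}


def negotiate(ours, their_payload):
    """Return the highest version common to both sides, or None."""
    common = set(ours) & decode_versions_payload(their_payload)
    return max(common) if common else None
```

   A result of None corresponds to the case where the parties have no
   connection protocol version in common and cannot communicate.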

   (For more detail on the TLS protocol change, see forthcoming draft
   proposals from Steven Murdoch.)

   Implementations MUST discard VERSIONS cells that are not the first
   recognized cells sent on a connection.

   The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1
   byte of command, 509 bytes of payload).

   [NOTE: The VERSIONS cell is assigned the command number 7.]

2.2. MITM-prevention and time checking

   If we negotiate a v2 connection or higher, the second cell we send SHOULD
   be a NETINFO cell.  Implementations SHOULD NOT send NETINFO cells at other
   times.

   A NETINFO cell contains:
         Timestamp              [4 bytes]
         Other OR's address     [variable]
         Number of addresses    [1 byte]
         This OR's addresses    [variable]

   Timestamp is the OR's current Unix time, in seconds since the epoch.  If
   an implementation receives time values from many ORs that
   indicate that its clock is skewed, it SHOULD try to warn the
   administrator. (We leave the definition of 'many' intentionally vague
   for now.)

   Before believing the timestamp in a NETINFO cell, implementations
   SHOULD compare the time at which they received the cell to the time
   when they sent their VERSIONS cell.  If the difference is very large,
   it is likely that the cell was delayed long enough that its
   contents are out of date.

   Each address contains Type/Length/Value as used in Section 6.4 of
   tor-spec.txt.  The first address is the one that the party sending
   the NETINFO cell believes the other has -- it can be used to learn
   what your IP address is if you have no other hints.
   The rest of the addresses are the advertised addresses of the party
   sending the NETINFO cell -- we include them
   to block a man-in-the-middle attack on TLS that lets an attacker bounce
   traffic through his own computers to enable timing and packet-counting
   attacks.

   A Tor instance should use the other Tor's reported address
   information as part of logic to decide whether to treat a given
   connection as suitable for extending circuits to a given address/ID
   combination.  When we get an extend request, we use an
   existing OR connection if the ID matches, and ANY of the following
   conditions hold:
       - The IP matches the requested IP.
       - We know that the IP we're using is canonical because it was
         listed in the NETINFO cell.
       - We know that the IP we're using is canonical because it was
         listed in the server descriptor.
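
   The rule above amounts to an ID match plus any one of three address
   conditions.  A minimal sketch, with hypothetical parameter names and
   addresses represented as plain strings:

```python
def can_reuse_connection(conn_id, conn_ip, want_id, want_ip,
                         netinfo_addrs, descriptor_addrs):
    """Decide whether an existing OR connection may serve an extend request.

    conn_id/conn_ip describe the existing connection; want_id/want_ip come
    from the extend request.  netinfo_addrs and descriptor_addrs are the
    addresses the peer listed in its NETINFO cell and server descriptor.
    """
    if conn_id != want_id:
        return False
    return (conn_ip == want_ip              # the IP matches the requested IP
            or conn_ip in netinfo_addrs     # canonical per the NETINFO cell
            or conn_ip in descriptor_addrs) # canonical per the descriptor
```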

   [NOTE: The NETINFO cell is assigned the command number 8.]

Discussion: Versions versus feature lists

   Many protocols negotiate lists of available features instead of (or in
   addition to) protocol versions.  While it's possible that some amount of
   feature negotiation could be supported in a later Tor, we should prefer to
   use protocol versions whenever possible, for reasons discussed in
   the "Anonymity Loves Company" paper.

Discussion: Bytes per version, versions per cell

   This document provides for a one-byte count of how many versions a Tor
   supports, and allows one byte per version.  Thus, it can support only
   254 more versions of the protocol beyond the unallocated v0 and the
   current v1.  If we ever need to split the protocol into 255 incompatible
   versions, we've probably screwed up badly somewhere.

   Nevertheless, here are two ways we could support more versions:
     - Change the version count to a two-byte field that counts the number of
       _bytes_ used, and use a UTF8-style encoding: versions 0 through 127
       take one byte to encode, versions 128 through 2047 take two bytes to
       encode, and so on.  We wouldn't need to parse any version higher than
       127 right now, since all bytes used to encode higher versions would
       have their high bit set.

       We'd still have a limit of 380 simultaneous versions that could be
       declared in any version.  This is probably okay.

     - Decide that if we need to support more versions, we can add a
       MOREVERSIONS cell that gets sent before the VERSIONS cell.  The spec
       above requires Tors to ignore unrecognized cell types that they get
       before the first VERSIONS cell, and still allows version negotiation
       to succeed.

   [Resolution: Reserve the high bit and the v0 value for later use.  If
    we ever have more live versions than we can fit in a cell, we've made a
    bad design decision somewhere along the line.]

Discussion: Reducing round-trips

   It might be appealing to see if we can cram more information in the
   initial VERSIONS cell.  For example, the contents of NETINFO will pretty
   soon be sent by everybody before any more information is exchanged, but
   decoupling them from the version exchange increases round-trips.

   Instead, we could speculatively include handshaking information at
   the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind
   up speaking VERSION 2, here's the NETINFO I'll send.  Otherwise, ignore
   this."  This could be extended to opportunistically reduce round trips
   when possible for future versions when we guess the versions right.

   Of course, we'd need to be careful about using a feature like this:
     - We don't want to include things that are expensive to compute,
       like PK signatures or proof-of-work.
     - We don't want to speculate as a mobile client: it may leak our
       experience with the server in question.

Discussion: Advertising versions in routerdescs and networkstatuses.

   In network-statuses:

     The networkstatus "v" line now has the format:
        "v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST
            "Circuit" CIRCUIT-VERSION-LIST NL

     LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of
     supported version numbers.  IMPLEMENTATION is the name of the
     implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the
     version of the implementation.

     Examples:
        v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5

        v OtherOR 2000+ Link 3 Circuit 5

     Implementations that release independently of the Tor codebase SHOULD NOT
     use "Tor" as the value of their IMPLEMENTATION.

     Additional fields on the "v" line MUST be ignored.
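
     A parser for this line might look like the sketch below.  The
     function name is hypothetical, and the way it skips additional
     fields (scanning for the "Link" and "Circuit" keywords and ignoring
     every other token) is one possible reading of the rule above, not
     something this proposal spells out.

```python
def parse_v_line(line):
    """Parse a networkstatus "v" line into its components."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[0] != "v":
        raise ValueError("not a 'v' line: %r" % line)
    parsed = {"implementation": tokens[1], "impl_version": tokens[2],
              "Link": [], "Circuit": []}
    rest = tokens[3:]
    # Look at each token together with the one that follows it; only the
    # two known keywords are acted on, everything else is ignored.
    for keyword, value in zip(rest, rest[1:]):
        if keyword in ("Link", "Circuit"):
            parsed[keyword] = [int(v) for v in value.split(",")]
    return parsed
```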

   In router descriptors:

     The router descriptor should contain a line of the form,
       "protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT-VERSION-LIST

     Additional fields on the "protocols" line MUST be ignored.

     [Versions of Tor before 0.1.2.5-alpha rejected router descriptors with
     unrecognized items; the protocols line should be preceded with an "opt"
     until these Tors are obsolete.]

Security issues:

   Client partitioning is the big danger when we introduce new versions; if a
   client supports some very unusual set of protocol versions, it will stand
   out from others no matter where it goes.  If a server supports an unusual
   version, it will get a disproportionate amount of traffic from clients who
   prefer that version.  We can mitigate this somewhat as follows:

     - Do not have clients prefer any protocol version by default until that
       version is widespread.  (First introduce the new version to servers,
       and have clients admit to using it only when configured to do so for
       testing.  Then, once many servers are running the new protocol
       version, enable its use by default.)

     - Do not multiply protocol versions needlessly.

     - Encourage protocol implementors to implement the same protocol version
       sets as some popular version of Tor.

     - Disrecommend very old/unpopular versions of Tor via the directory
       authorities' RecommendedVersions mechanism, even if it is still
       technically possible to use them.

Filename: 106-less-tls-constraint.txt
Title: Checking fewer things during TLS handshakes
Author: Nick Mathewson
Created: 9-Feb-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

    This document proposes that we relax our requirements on the context of
    X.509 certificates during initial TLS handshakes.

Motivation:

    Later, we want to try harder to avoid protocol fingerprinting attacks.
    This means that we'll need to make our connection handshake look closer
    to a regular HTTPS connection: one certificate on the server side and
    zero certificates on the client side.  For now, about the best we
    can do is to stop requiring things during handshake that we don't
    actually use.

What we check now, and where we check it:

 tor_tls_check_lifetime:
    peer has certificate
    notBefore <= now <= notAfter

 tor_tls_verify:
    peer has at least one certificate
    There is at least one certificate in the chain
    At least one of the certificates in the chain is not the one used to
        negotiate the connection.  (The "identity cert".)
    The certificate _not_ used to negotiate the connection has signed the
        link cert

 tor_tls_get_peer_cert_nickname:
    peer has a certificate.
    certificate has a subjectName.
    subjectName has a commonName.
    commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2]

 tor_tls_peer_has_cert:
    peer has a certificate.

 connection_or_check_valid_handshake:
    tor_tls_peer_has_cert [1]
    tor_tls_get_peer_cert_nickname [1]
    tor_tls_verify [1]
    If nickname in cert is a known, named router, then its identity digest
        must be as expected.
    If we initiated the connection, then we got the identity digest we
        expected.

 USEFUL THINGS WE COULD DO:

 [1] We could just not force clients to have any certificate at all, let alone
     an identity certificate.  Internally to the code, we could assign the
     identity_digest field of these or_connections to a random number, or even
     not add them to the identity_digest->or_conn map.
 [so if somebody connects with no certs, we let them. and mark them as
 a client and don't treat them as a server. great. -rd]

 [2] Instead of using a restricted nickname character set that makes our
     commonName structure look unlike typical SSL certificates, we could treat
     the nickname as extending from the start of the commonName up to but not
     including the first non-nickname character.

     Alternatively, we could stop checking commonNames entirely.  We don't
     actually _do_ anything based on the nickname in the certificate, so
     there's really no harm in letting every router have any commonName it
     wants.
 [this is the better choice -rd]
 [agreed. -nm]

REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS:

 Assuming that we removed the above requirements, and then (in a later
 release) had clients not send certificates and started making our DNs a
 little less formulaic, client->server OR connections would still be
 recognizable by:
    having a two-certificate chain sent by the server
    using a particular set of ciphersuites
    traffic patterns
    probing the server later

OTHER IMPLICATIONS:

 If we stop verifying the above requirements:

    It will be slightly (but only slightly) more common to connect to a non-Tor
    server running TLS, and believe that you're talking to a Tor server (until
    you send the first cell).

    It will be far easier for non-Tor SSL clients to accidentally connect to
    Tor servers and speak HTTPS or whatever to them.

 If, in a later release, we have clients not send certificates, and we make
 DNs less recognizable:

    If clients don't send certs, servers don't need to verify them: win!

    If we remove these restrictions, it will be easier for people to write
    clients to fuzz our protocol: sorta win!

    If clients don't send certs, they look slightly less like servers.

OTHER SPEC CHANGES:

 When a client doesn't give us an identity, we should never extend any
 circuits to it (duh), and we should allow it to set circuit ID however it
 wants.
Filename: 107-uptime-sanity-checking.txt
Title: Uptime Sanity Checking
Author: Kevin Bauer & Damon McCoy
Created: 8-March-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

   This document describes how to cap the uptime that is used when computing
   which routers are marked as stable such that highly stable routers cannot
   be displaced by malicious routers that report extremely high uptime
   values.

   This is similar to how bandwidth is capped at 1.5MB/s.

Motivation:

   It has been pointed out that an attacker can displace all stable nodes and
   entry guard nodes by reporting high uptimes. This is an easy fix that will
   prevent highly stable nodes from being displaced.

Security implications:

   It should decrease the effectiveness of routing attacks that report high
   uptimes while not impacting the normal routing algorithms.

Specification:

  So we could patch Section 3.1 of dir-spec.txt to say:

   "Stable" -- A router is 'Stable' if it is running, valid, not
   hibernating, and either its uptime is at least the median uptime for
   known running, valid, non-hibernating routers, or its uptime is at
   least 30 days. Routers are never called stable if they are running
   a version of Tor known to drop circuits stupidly.  (0.1.1.10-alpha
   through 0.1.1.16-rc are stupid this way.)
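
   To see why the 30-day alternative keeps honest routers from being
   displaced, consider the sketch below.  The function name and the input
   format are hypothetical; the input is assumed to contain only running,
   valid, non-hibernating routers, and the exclusion of known-bad Tor
   versions is left out.

```python
from statistics import median

UPTIME_CAP = 30 * 24 * 60 * 60  # 30 days, in seconds


def stable_routers(uptimes):
    """uptimes maps router name -> declared uptime in seconds."""
    if not uptimes:
        return set()
    threshold = median(uptimes.values())
    # A router qualifies by reaching the median OR by reaching the cap,
    # so inflated uptimes cannot push a 30-day router below the bar.
    return {name for name, up in uptimes.items()
            if up >= threshold or up >= UPTIME_CAP}
```

   With two routers claiming an uptime of 10**9 seconds, the median may
   rise above 30 days, but a router that has honestly been up for 35 days
   still qualifies through the second condition.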

Compatibility:

   There should be no compatibility issues due to uptime capping.

Implementation:

   Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788).

Discussion:

   Initially, this proposal set the maximum at 60 days, not 30; the 30 day
   limit and spec wording was suggested by Roger in an or-dev post on 9 March
   2007.

   This proposal also led to 108-mtbf-based-stability.txt

Filename: 108-mtbf-based-stability.txt
Title: Base "Stable" Flag on Mean Time Between Failures
Author: Nick Mathewson
Created: 10-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

   This document proposes that we change how directory authorities set the
   stability flag from inspection of a router's declared Uptime to the
   authorities' perceived mean time between failure for the router.

Motivation:

   Clients prefer nodes that the authorities call Stable.  This flag is (as
   of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for
   uptime.  This creates an opportunity for malicious nodes to declare
   falsely high uptimes in order to get more traffic.

Spec changes:

   Replace the current rule for setting the Stable flag with:

   "Stable" -- A router is 'Stable' if it is active and its observed Stability
   for the past month is at or above the median Stability for active routers.
   Routers are never called stable if they are running a version of Tor
   known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc
   are stupid this way.)

   Stability shall be defined as the weighted mean length of the runs
   observed by a given directory authority.  A run begins when an authority
   decides that the server is Running, and ends when the authority decides
   that the server is not Running.  In-progress runs are counted when
   measuring Stability.  When calculating the mean, runs are weighted by
   $\alpha ^ t$, where $t$ is time elapsed since the end of the run, and
   $0 < \alpha < 1$.  Time when an authority is down does not count toward
   the length of the run.
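
   A minimal sketch of this calculation follows.  It assumes that times
   are measured in days, that an in-progress run is treated as ending now
   (so its weight is 1), and that authority downtime has already been
   subtracted from the run lengths; the function name and the default
   value of alpha are illustrative only.

```python
def weighted_mean_run_length(runs, now, alpha=0.95):
    """runs is a list of (start, end) pairs; end is None if in progress."""
    total = 0.0
    weight_sum = 0.0
    for start, end in runs:
        end_time = now if end is None else end
        # Weight by alpha ** t, where t is the time since the run ended.
        weight = alpha ** (now - end_time)
        total += weight * (end_time - start)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```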

Rejected Alternative:

   "A router's Stability shall be defined as the sum of $\alpha ^ d$ for every
   $d$ such that the router was considered reachable for the entire day
   $d$ days ago.

   This allows a simpler implementation: every day, we multiply
   yesterday's Stability by alpha, and if the router was observed to be
   available every time we looked today, we add 1.

   Instead of "day", we could pick an arbitrary time unit.  We should
   pick alpha to be high enough that long-term stability counts, but low
   enough that the distant past is eventually forgotten.  Something
   between .8 and .95 seems right.

   (By requiring that routers be up for an entire day to get their
   stability increased, instead of counting fractions of a day, we
   capture the notion that stability is more like "probability of
   staying up for the next hour" than it is like "probability of being
   up at some randomly chosen time over the next hour."  The former
   notion of stability is far more relevant for long-lived circuits.)
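
   The daily update rule of this rejected alternative is small enough to
   write out directly.  The function name is hypothetical, and the
   default alpha is just one value from the range suggested above.

```python
def update_stability(previous, up_all_day, alpha=0.9):
    """Decay yesterday's Stability, then add 1 for a fully reachable day."""
    return previous * alpha + (1.0 if up_all_day else 0.0)
```

   Starting from 0 with alpha = 0.5, two fully reachable days followed by
   one day with an outage give 1, then 1.5, then 0.75.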

Limitations:

   Authorities can have false positives and false negatives when trying to
   tell whether a router is up or down.  So long as these aren't terribly
   wrong, and so long as they aren't significantly biased, we should be able
   to use them to estimate stability pretty well.

   Probing approaches like the above could miss short incidents of
   downtime.  If we use the router's declared uptime, we could detect
   these: but doing so would penalize routers who reported their uptime
   accurately.

Implementation:

   For now, the easiest way to store this information at authorities
   would probably be in some kind of periodically flushed flat file.
   Later, we could move to Berkeley db or something if we really had to.

   For each router, an authority will need to store:
     The router ID.
     Whether the router is up.
     The time when the current run started, if the router is up.
     The weighted sum length of all previous runs.
     The time at which the weighted sum length was last weighted down.

   Authorities should probe at random intervals to test whether servers are
   running.
Filename: 109-no-sharing-ips.txt
Title: No more than one server per IP address
Author: Kevin Bauer & Damon McCoy
Created: 9-March-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:
  This document describes a solution to a Sybil attack vulnerability in the
  directory servers. Currently, it is possible for a single IP address to
  host an arbitrarily high number of Tor routers. We propose that the
  directory servers limit the number of Tor routers that may be registered at
  a particular IP address to some small (fixed) number, perhaps just one Tor
  router per IP address.

  While Tor never uses more than one server from a given /16 in the same
  circuit, an attacker with multiple servers in the same place is still
  dangerous because he can get around the per-server bandwidth cap that is
  designed to prevent a single server from attracting too much of the overall
  traffic.

Motivation:
  Since it is possible for an attacker to register an arbitrarily large
  number of Tor routers, it is possible for malicious parties to do this
  as part of a traffic analysis attack.

Security implications:
  This countermeasure will increase the number of IP addresses that an
  attacker must control in order to carry out traffic analysis.

Specification:

  For each IP address, each directory authority tracks the number of routers
  using that IP address, along with their total observed bandwidth.  If there
  are more than MAX_SERVERS_PER_IP servers at some IP, the authority should
  "disable" all but MAX_SERVERS_PER_IP servers.  When choosing which servers
  to disable, the authority should first disable non-Running servers in
  increasing order of observed bandwidth, and then should disable Running
  servers in increasing order of bandwidth.
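
  The selection order can be sketched as below.  The record layout and
  the function name are hypothetical, and the bandwidth-based rule in the
  bracketed note that follows is not covered.

```python
from collections import defaultdict

MAX_SERVERS_PER_IP = 3


def servers_to_disable(servers):
    """servers: list of dicts with name, ip, running, bandwidth keys."""
    by_ip = defaultdict(list)
    for server in servers:
        by_ip[server["ip"]].append(server)

    disabled = set()
    for group in by_ip.values():
        excess = len(group) - MAX_SERVERS_PER_IP
        if excess <= 0:
            continue
        # Non-Running servers sort first (False < True); within each
        # class, lower observed bandwidth is disabled first.
        order = sorted(group, key=lambda s: (s["running"], s["bandwidth"]))
        disabled.update(s["name"] for s in order[:excess])
    return disabled
```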

  [[  We don't actually do this part here. -NM

  If the total observed
  bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP,
  the authority should "disable" some of the remaining servers until only one
  server remains, or until the remaining observed bandwidth of non-"disabled"
  servers is under MAX_BW_PER_IP.
  ]]

  Servers that are "disabled" MUST be marked as non-Valid and non-Running.

  MAX_SERVERS_PER_IP is 3.

  MAX_BW_PER_IP is 8 MB per s.

Compatibility:

  Upon inspection of a directory server, we found that the following IP
  addresses have more than one Tor router:

  Scruples    68.5.113.81     ip68-5-113-81.oc.oc.cox.net     443
  WiseUp      68.5.113.81     ip68-5-113-81.oc.oc.cox.net     9001
  Unnamed     62.1.196.71     pc01-megabyte-net-arkadiou.megabyte.gr  9001
  Unnamed     62.1.196.71     pc01-megabyte-net-arkadiou.megabyte.gr  9001
  Unnamed     62.1.196.71     pc01-megabyte-net-arkadiou.megabyte.gr  9001
  aurel       85.180.62.138   e180062138.adsl.alicedsl.de     9001
  sokrates    85.180.62.138   e180062138.adsl.alicedsl.de     9001
  moria1      18.244.0.188    moria.mit.edu   9001
  peacetime   18.244.0.188    moria.mit.edu   9100

  There may exist compatibility issues with this proposed fix.  Reasons why
  more than one server would share an IP address include:

  * Testing. moria1, moria2, peacetime, and other morias all run on one
    computer at MIT, because that way we get testing. Moria1 and moria2 are
    run by Roger, and peacetime is run by Nick.
  * NAT. If there are several servers but they port-forward through the same
    IP address, ... we can hope that the operators coordinate with each
    other. Also, we should recognize that while they help the network in
    terms of increased capacity, they don't help as much as they could in
    terms of location diversity. But our approach so far has been to take
    what we can get.
  * People who have more than 1.5MB/s and want to help out more. For
    example, for a while Tonga was offering 10MB/s and its Tor server
    would only make use of a bit of it. So Roger suggested that he run
    two Tor servers, to use more.

[Note Roger's tweak to this behavior, in
http://archives.seul.org/or/cvs/Oct-2007/msg00118.html]

Filename: 110-avoid-infinite-circuits.txt
Title: Avoiding infinite length circuits
Author: Roger Dingledine
Created: 13-Mar-2007
Status: Closed
Target: 0.2.3.x
Implemented-In: 0.2.1.3-alpha, 0.2.3.11-alpha

History:

  Revised 28 July 2008 by nickm: set K.
  Revised 3 July 2008 by nickm: rename from relay_extend to
     relay_early.  Revise to current migration plan.  Allow K cells
     over circuit lifetime, not just at start.

Overview:

  Right now, an attacker can add load to the Tor network by extending a
  circuit an arbitrary number of times. Every cell that goes down the
  circuit then adds N times that amount of load in overall bandwidth
  use. This vulnerability arises because servers don't know their position
  on the path, so they can't tell how many nodes there are before them
  on the path.

  We propose a new set of relay cells that are distinguishable by
  intermediate hops as permitting extend cells. This approach will allow
  us to put an upper bound on circuit length relative to the number of
  colluding adversary nodes; but there are some downsides too.

Motivation:

  The above attack can be used to generally increase load all across the
  network, or it can be used to target specific servers: by building a
  circuit back and forth between two victim servers, even a low-bandwidth
  attacker can soak up all the bandwidth offered by the fastest Tor
  servers.

  The general attacks could be used as a demonstration that Tor isn't
  perfect (leading to yet more media articles about "breaking" Tor), and
  the targeted attacks will come into play once we have a reputation
  system -- it will be trivial to DoS a server so it can't pass its
  reputation checks, in turn impacting security.

Design:

  We should split RELAY cells into two types: RELAY and RELAY_EARLY.

  Only K (say, 10) relay_early cells can be sent across a circuit, and
  only relay_early cells are allowed to contain extend requests. We
  still support obscuring the length of the circuit (if more research
  shows us what to do), because Alice can choose how many of the K to
  mark as relay_early. Note that relay_early cells *can* contain any
  sort of data cell; so in effect it's actually the relay type cells
  that are restricted. By default, she would just send the first K
  data cells over the stream as relay_early cells, regardless of their
  actual type.

  (Note that a circuit that is out of relay_early cells MUST NOT be
  cannibalized later, since it can't extend.  Note also that it's always okay
  to use regular RELAY cells when sending non-EXTEND commands targeted at
  the first hop of a circuit, since there is no intermediate hop to try to
  learn the relay command type.)

  Each intermediate server would pass on the same type of cell that it
  received (either relay or relay_early), and the cell's destination
  will be able to learn whether it's allowed to contain an Extend request.

  If an intermediate server receives more than K relay_early cells, or
  if it sees a relay cell that contains an extend request, then it
  tears down the circuit (protocol violation).
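
  The per-circuit check described above could be sketched as follows.
  This is only an illustration, not the Tor implementation: the names
  CircuitState, ProtocolViolation and on_cell are hypothetical, and K is
  taken from the "Parameters" section below. Note that only the hop a
  cell is addressed to can tell whether its payload is an extend request.

```python
from dataclasses import dataclass

K = 8  # maximum relay_early cells per circuit (see "Parameters")


class ProtocolViolation(Exception):
    """Raised when the circuit must be torn down."""


@dataclass
class CircuitState:
    """Hypothetical per-circuit bookkeeping kept by a server."""
    relay_early_seen: int = 0

    def on_cell(self, is_relay_early: bool, extend_for_us: bool = False) -> None:
        """Account for one relay or relay_early cell on this circuit.

        extend_for_us is True only when this hop is the cell's
        destination and the decrypted payload is an extend request.
        """
        if is_relay_early:
            self.relay_early_seen += 1
            if self.relay_early_seen > K:
                raise ProtocolViolation("more than K relay_early cells")
        elif extend_for_us:
            # Extend requests are only allowed inside relay_early cells.
            raise ProtocolViolation("extend request in a plain relay cell")
```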

Security implications:

  The upside is that this limits the bandwidth amplification factor to
  K: for an individual circuit to become arbitrary-length, the attacker
  would need an adversary-controlled node every K hops, and at that
  point the attack is no worse than if the attacker creates N/K separate
  K-hop circuits.

  On the other hand, we want to pick a large enough value of K that we
  don't mind the cap.

  If we ever want to take steps to hide the number of hops in the circuit
  or a node's position in the circuit, this design probably makes that
  more complex.

Migration:

  In 0.2.0, servers speaking v2 or later of the link protocol accept
  RELAY_EARLY cells, and pass them on.  If the next OR in the circuit
  is not speaking the v2 link protocol, the server relays the cell as
  a RELAY cell.

  In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2
  connections.  This functionality can be safely backported to
  0.2.0.x.  Clients should pick a random number between (say) K-2 and
  K to send.
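
  A rough sketch of the client side, assuming the default behavior from
  the Design section (mark the first cells on the circuit as relay_early
  until the budget is spent). The function names are hypothetical, and
  Python's random module stands in for whatever randomness source a real
  client would use.

```python
import random

K = 8  # see "Parameters"


def choose_relay_early_budget(rng=random):
    """Pick how many relay_early cells to send, between K-2 and K."""
    return rng.randint(K - 2, K)


def mark_cells(num_cells, budget):
    """Return one flag per outgoing cell: True means send as relay_early."""
    return [i < budget for i in range(num_cells)]
```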

  In 0.2.1.3-alpha, servers close any circuit in which more than K
  relay_early cells are sent.

  Once all versions that do not send RELAY_EARLY cells are obsolete,
  servers can begin to reject any EXTEND requests not sent in a
  RELAY_EARLY cell.

Parameters:

  Let K = 8, for no terribly good reason.

Spec:

  [We can formalize this part once we think the design is a good one.]

Acknowledgements:

  This design has been kicking around since Christian Grothoff and I came
  up with it at PET 2004. (Nathan Evans, Christian Grothoff's student,
  is working on implementing a fix based on this design in the summer
  2007 timeframe.)

Filename: 111-local-traffic-priority.txt
Title: Prioritizing local traffic over relayed traffic
Author: Roger Dingledine
Created: 14-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  We describe some ways to let Tor users operate as a relay and enforce
  rate limiting for relayed traffic without impacting their locally
  initiated traffic.

Motivation:

  Right now we encourage people who use Tor as a client to configure it
  as a relay too ("just click the button in Vidalia"). Most of these users
  are on asymmetric links, meaning they have a lot more download capacity
  than upload capacity. But if they enable rate limiting too, suddenly
  they're limited to the same download capacity as upload capacity. And
  they have to enable rate limiting, or their upstream pipe gets filled
  up, starts dropping packets, and now their net connection doesn't work
  even for non-Tor stuff. So they end up turning off the relaying part
  so they can use Tor (and other applications) again.

  So far this hasn't mattered that much: most of our fast relays are
  being operated only in relay mode, so the rate limiting makes sense
  for them. But if we want to be able to attract many more relays in
  the future, we need to let ordinary users act as relays too.

  Further, as we begin to deploy the blocking-resistance design and we
  rely on ordinary users to click the "Tor for Freedom" button, this
  limitation will become a serious stumbling block to getting volunteers
  to act as bridges.

The problem:

  Tor implements its rate limiting on the 'read' side by only reading
  a certain number of bytes from the network in each second. If it has
  emptied its token bucket, it doesn't read any more from the network;
  eventually TCP notices and stalls until we resume reading. But if we
  want to have two classes of service, we can't know what class a given
  incoming cell will be until we look at it, at which point we've already
  read it.

Some options:

  Option 1: read when our token bucket is full enough, and if it turns
  out that what we read was local traffic, then add the tokens back into
  the token bucket. This will work when local traffic load alternates
  with relayed traffic load; but it's a poor option in general, because
  when we're receiving both local and relayed traffic, there are plenty
  of cases where we'll end up with an empty token bucket, and then we're
  back where we were before.
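
  To make the weakness of option 1 concrete, here is a small sketch of a
  read-side token bucket with refunds for local cells. It is a toy model,
  not Tor's rate limiting code: TokenBucket and read_cells are
  hypothetical names, and the traffic class of each cell is given as a
  label that the reader only inspects after paying for the read.

```python
CELL_LEN = 512  # bytes per cell


class TokenBucket:
    def __init__(self, rate_bytes, burst_bytes):
        self.rate = rate_bytes
        self.burst = burst_bytes
        self.tokens = burst_bytes

    def refill(self):
        """Add one second's worth of tokens, capped at the burst size."""
        self.tokens = min(self.burst, self.tokens + self.rate)


def read_cells(bucket, incoming):
    """Read cells while tokens remain; refund tokens for local cells.

    incoming is a list of "local" / "relayed" labels. Returns the cells
    that could not be read because the bucket ran dry.
    """
    pos = 0
    while pos < len(incoming) and bucket.tokens >= CELL_LEN:
        bucket.tokens -= CELL_LEN        # pay before knowing the class
        if incoming[pos] == "local":
            bucket.tokens += CELL_LEN    # option 1: give the tokens back
        pos += 1
    return incoming[pos:]
```

  With only local traffic the bucket never drains, but as soon as relayed
  cells empty it, a local cell queued behind them is stuck until the next
  refill.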

  More generally, notice that our problem is easy when a given TCP
  connection either has entirely local circuits or entirely relayed
  circuits. In fact, even if they are both present, if one class is
  entirely idle (none of its circuits have sent or received in the past
  N seconds), we can ignore that class until it wakes up again. So it
  only gets complex when a single connection contains active circuits
  of both classes.

  Next, notice that local traffic uses only the entry guards, whereas
  relayed traffic likely doesn't. So if we're a bridge handling just
  a few users, the expected number of overlapping connections would be
  almost zero, and even if we're a full relay the number of overlapping
  connections will be quite small.

  Option 2: build separate TCP connections for local traffic and for
  relayed traffic. In practice this will actually only require a few
  extra TCP connections: we would only need redundant TCP connections
  to at most the number of entry guards in use.

  However, this approach has some drawbacks. First, if the remote side
  wants to extend a circuit to you, how does it know which TCP connection
  to send it on? We would need some extra scheme to label some connections
  "client-only" during construction. Perhaps we could do this by seeing
  whether any circuit was made via CREATE_FAST; but this still opens
  up a race condition where the other side sends a create request
  immediately. The only ways I can imagine to avoid the race entirely
  are to specify our preference in the VERSIONS cell, or to add some
  sort of "nope, not this connection, why don't you try another rather
  than failing" response to create cells, or to forbid create cells on
  connections that you didn't initiate and on which you haven't seen
  any circuit creation requests yet -- this last one would lead to a bit
  more connection bloat but doesn't seem so bad. And we already accept
  this race for the case where directory authorities establish new TCP
  connections periodically to check reachability, and then hope to hang
  up on them soon after. (In any case this issue is moot for bridges,
  since each destination will be one-way with respect to extend requests:
  either receiving extend requests from bridge users or sending extend
  requests to the Tor server, never both.)

  The second problem with option 2 is that using two TCP connections
  reveals that there are two classes of traffic (and probably quickly
  reveals which is which, based on throughput). Now, it's unclear whether
  this information is already available to the other relay -- he would
  easily be able to tell that some circuits are fast and some are rate
  limited, after all -- but it would be nice to not add even more ways to
  leak that information. Also, it's less clear that an external observer
  already has this information if the circuits are all bundled together,
  and for this case it's worth trying to protect it.

  Option 3: tell the other side about our rate limiting rules. When we
  establish the TCP connection, specify the different policy classes we
  have configured. Each time we extend a circuit, specify which policy
  class that circuit should be part of. Then hope the other side obeys
  our wishes. (If he doesn't, hang up on him.) Besides the design and
  coordination hassles involved in this approach, there's a big problem:
  our rate limiting classes apply to all our connections, not just
  pairwise connections. How does one server we're connected to know how
  much of our bucket has already been spent by another? I could imagine
  a complex and inefficient "ok, now you can send me those two more cells
  that you've got queued" protocol. I'm not sure how else we could do it.

  (Gosh. How could UDP designs possibly be compatible with rate limiting
  with multiple bucket sizes?)

  Option 4: put both classes of circuits over a single connection, and
  keep track of the last time we read or wrote a high-priority cell. If
  it's been less than N seconds, give the whole connection high priority,
  else give the whole connection low priority.

  Option 5: put both classes of circuits over a single connection, and
  play a complex juggling game by periodically telling the remote side
  what rate limits to set for that connection, so you end up giving
  priority to the right connections but still stick to roughly your
  intended bandwidthrate and relaybandwidthrate.

  Option 6: ?

Prognosis:

  Nick really didn't like option 2 because of the partitioning questions.

  I've put option 4 into place as of Tor 0.2.0.3-alpha.

  In terms of implementation, it will be easy: just add a time_t to
  or_connection_t that specifies client_used (used by the initiator
  of the connection to rate limit it differently depending on how
  recently the time_t was reset). We currently update client_used
  in three places:
    - command_process_relay_cell() when we receive a relay cell for
      an origin circuit.
    - relay_send_command_from_edge() when we send a relay cell for
      an origin circuit.
    - circuit_deliver_create_cell() when we send a create cell.
  We could probably remove the third case and it would still work,
  but hey.
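
  The idea behind option 4 can be sketched like this. It is an
  illustration in Python rather than the C code in Tor: the class
  OrConnection, its methods, and the value chosen for N
  (CLIENT_IDLE_SECS) are assumptions made for the example.

```python
import time

CLIENT_IDLE_SECS = 30  # hypothetical value for N


class OrConnection:
    """Toy stand-in for or_connection_t with a client_used timestamp."""

    def __init__(self):
        self.client_used = 0.0  # last time a local (origin) cell used us

    def note_client_activity(self, now=None):
        """Call when a cell for an origin circuit is sent or received."""
        self.client_used = time.time() if now is None else now

    def is_high_priority(self, now=None):
        """High priority if local traffic was seen less than N seconds ago."""
        now = time.time() if now is None else now
        return now - self.client_used < CLIENT_IDLE_SECS
```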

Filename: 112-bring-back-pathlencoinweight.txt
Title: Bring Back Pathlen Coin Weight
Author: Mike Perry
Created:
Status: Superseded
Superseded-By: 115


Overview:

  The idea is that users should be able to choose a weight which
  probabilistically chooses their path lengths to be 2 or 3 hops. This
  weight will essentially be a biased coin that indicates an
  additional hop (beyond 2) with probability P. The user should be
  allowed to choose 0 for this weight to always get 2 hops and 1 to
  always get 3.

  This value should be modifiable from the controller, and should be
  available from Vidalia.


Motivation:

  The Tor network is slow and overloaded. Increasingly often I hear
  stories about friends and friends of friends who are behind firewalls,
  annoying censorware, or under surveillance that interferes with their
  productivity and Internet usage, or chills their speech. These people
  know about Tor, but they choose to put up with the censorship because
  Tor is too slow to be usable for them. In fact, to download a fresh,
  complete copy of levine-timing.pdf for the Anonymity Implications
  section of this proposal over Tor took me 3 tries.

  There are many ways to improve the speed problem, and of course we
  should and will implement as many as we can. Johannes's GSoC project
  and my reputation system are longer term, higher-effort things that
  will still provide benefit independent of this proposal.

  However, reducing the path length to 2 for those who do not need the
  (questionable) extra anonymity 3 hops provide not only improves
  their Tor experience but also reduces their load on the Tor network by
  33%, and can be done in less than 10 lines of code. That's not just
  Win-Win, it's Win-Win-Win.

  Furthermore, when blocking resistance measures insert an extra relay
  hop into the equation, 4 hops will certainly be completely unusable
  for these users, especially since it will be considerably more
  difficult to balance the load across a dark relay net than balancing
  the load on Tor itself (which today is still not without its flaws).


Anonymity Implications:

  It has long been established that timing attacks against mix
  networks are extremely effective, and that regardless of path
  length, if the adversary has compromised your first and last
  hop of your path, you can assume they have compromised your
  identity for that connection.

  In [1], it is demonstrated that for all but the slowest, lossiest
  networks, error rates for false positives and false negatives were
  very near zero. Only for constant streams of traffic over slow and
  (more importantly) extremely lossy network links did the error rate
  hit 20%. For loss rates typical to the Internet, even the error rate
  for slow nodes with constant traffic streams was 13%.

  When you take into account that most Tor streams are not constant,
  but probably much more like their "HomeIP" dataset, which consists
  mostly of web traffic that exists over finite intervals at specific
  times, error rates drop to fractions of 1%, even for the "worst"
  network nodes.

  Therefore, the user has little benefit from the extra hop, assuming
  the adversary does timing correlation on their nodes. The real
  protection is the probability of getting both the first and last hop,
  and this is constant whether the client chooses 2 hops, 3 hops, or 42.

  Partitioning attacks form another concern. Since Tor uses telescoping
  to build circuits, it is possible to tell a user is constructing only
  two hop paths at the entry node. It is questionable if this data is
  actually worth anything though, especially if the majority of users
  have easy access to this option, and do actually choose their path
  lengths semi-randomly.

  Nick has postulated that exits may also be able to tell that you are
  using only 2 hops by the amount of time between sending their
  RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
  see from the OP. I doubt that they will be able to make much use
  of this timing pattern, since it will likely vary widely depending
  upon the type of node selected for that first hop, and the user's
  connection rate to that first hop. It is also questionable if this
  data is worth anything, especially if many users are using this
  option (and I imagine many will).

  Perhaps most seriously, two hop paths do allow malicious guards
  to easily fail circuits if they do not extend to their colluding peers
  for the exit hop. Since guards can detect the number of hops in a
  path, they could always fail the 3 hop circuits and focus on
  selectively failing the two hop ones until a peer was chosen.

  I believe currently guards are rotated if circuits fail, which does
  provide some protection, but this could be changed so that an entry
  guard is completely abandoned after a certain ratio of extend or
  general circuit failures with respect to non-failed circuits. This
  could possibly be gamed to increase guard turnover, but such a game
  would be much more noticeable than an individual guard failing
  circuits, since it would affect all clients, not just those who chose
  a particular guard.


Why not fix Pathlen=2?:

  The main reason I am not advocating that we always use 2 hops is that
  in some situations, timing correlation evidence by itself may not be
  considered as solid and convincing as an actual, uninterrupted, fully
  traced path. Are these timing attacks as effective on a real network
  as they are in simulation? Would an extralegal adversary or authoritarian
  government even care? In the face of these situation-dependent unknowns,
  it should be up to the user to decide if this is a concern for them or not.

  It should probably also be noted that even a false positive
  rate of 1% for a 200k concurrent-user network could mean that for a
  given node, a given stream could be confused with something like 10
  users, assuming ~200 nodes carry most of the traffic (i.e. 1000 users
  each). Though of course to really know for sure, someone needs to do
  an attack on a real network, unfortunately.
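
  The back-of-the-envelope estimate above works out as follows; the
  function name is made up for the example, and the inputs are the
  rough figures from the paragraph, not measurements.

```python
def confusable_users(total_users, busy_nodes, false_positive_rate):
    """Expected number of users a stream could be confused with at one node."""
    users_per_node = total_users / busy_nodes   # 200k / 200 = 1000
    return users_per_node * false_positive_rate  # 1000 * 1% = 10


estimate = confusable_users(200_000, 200, 0.01)
```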


Implementation:

  new_route_len() can be modified directly with a check of the
  PathlenCoinWeight option (converted to percent) and a call to
  crypto_rand_int(0,100) for the weighted coin.
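
  As a sketch of the weighted coin (in Python, not the C function in
  Tor), assuming the weight is given as a probability between 0 and 1
  and converted to a percentage as described above:

```python
import random


def new_route_len(pathlen_coin_weight, rng=random):
    """Return 2 or 3; a third hop is added with the given probability."""
    if not 0.0 <= pathlen_coin_weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    percent = int(round(pathlen_coin_weight * 100))
    # randrange(100) is uniform over 0..99, so a weight of 0 never adds
    # a hop and a weight of 1 always does.
    return 3 if rng.randrange(100) < percent else 2
```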

  The entry_guard_t structure could have num_circ_failed and
  num_circ_succeeded members such that if a guard exceeds an N% circuit
  extend failure rate to a second hop, it is removed from the entry list.
  N should be sufficiently high to avoid churn from normal Tor circuit
  failure as determined by TorFlow scans.
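
  A minimal sketch of that bookkeeping, again in Python for brevity. The
  two counters are the members named above; the min_circuits floor is an
  extra assumption added here so that a guard is not dropped after a
  single unlucky circuit.

```python
from dataclasses import dataclass


@dataclass
class EntryGuardStats:
    num_circ_failed: int = 0
    num_circ_succeeded: int = 0

    def record(self, succeeded):
        """Count one circuit extend attempt through this guard."""
        if succeeded:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def failure_rate(self):
        total = self.num_circ_failed + self.num_circ_succeeded
        return self.num_circ_failed / total if total else 0.0


def should_drop_guard(stats, max_failure_rate, min_circuits=20):
    """True if the guard exceeded the allowed failure rate (N as a fraction)."""
    total = stats.num_circ_failed + stats.num_circ_succeeded
    return total >= min_circuits and stats.failure_rate() > max_failure_rate
```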

  The Vidalia option should be presented as a boolean, to minimize confusion
  for the user. Something like a radiobutton with:
 
   * "I use Tor for Censorship Resistance, not Anonymity. Speed is more
      important to me than Anonymity."
   * "I use Tor for Anonymity. I need extra protection at the cost of speed."
  
  and then some explanation in the help for exactly what this means, and
  the risks involved with eliminating the adversary's need for timing attacks
  with respect to false positives, etc.

Migration:

  Phase one: Experiment with the proper ratio of circuit failures
  used to expire garbage or malicious guards via TorFlow.

  Phase two: Re-enable config and modify new_route_len() to add an
  extra hop if coin comes up "heads".

  Phase three: Make radiobutton in Vidalia, along with help entry
  that explains in layman's terms the risks involved.


[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf

Filename: 113-fast-authority-interface.txt
Title: Simplifying directory authority administration
Author: Nick Mathewson
Created:
Status: Superseded

Overview

The problem:

  Administering a directory authority is a pain: you need to go through
  emails and manually add new nodes as "named".  When bad things come up,
  you need to mark nodes (or whole regions) as invalid, badexit, etc.

  This means that mostly, authority admins don't: only 2/4 current authority
  admins actually bind names or list bad exits, and those two have often
  complained about how annoying it is to do so.

  Worse, name binding is a common path, but it's a pain in the neck: nobody
  has done it for a couple of months.

Digression: who knows what?

  It's trivial for Tor to automatically keep track of all of the
  following information about a server:
    name, fingerprint, IP, last-seen time, first-seen time, declared
    contact.

  All we need to have the administrator set is:
    - Is this name/fingerprint pair bound?
    - Is this fingerprint/IP a bad exit?
    - Is this fingerprint/IP an invalid node?
    - Is this fingerprint/IP to be rejected?

  The workflow for authority admins has two parts:
    - Periodically, go through tor-ops and add new names.  This doesn't
      need to be done urgently.
    - Less often, mark badly behaved servers as badly behaved.  This is more
      urgent.

Possible solution #1: Web-interface for name binding.

  Deprecate use of the tor-ops mailing list; instead, have operators go to a
  webform and enter their server info.  This would put the information in a
  standardized format, thus allowing quick, nearly-automated approval and
  reply.

Possible solution #2: Self-binding names.

  Peter Palfrader has proposed that names be assigned automatically to nodes
  that have been up and running and valid for a while.

Possible solution #3: Self-maintaining approved-routers file

  Mixminion alpha has a neat feature where whenever a new server is seen,
  a stub line gets added to a configuration file.  For Tor, it could look
  something like this:

    ## First seen with this key on 2007-04-21 13:13:14
    ## Stayed up for at least 12 hours on IP 192.168.10.10
    #RouterName AAAABBBBCCCCDDDDEFEF

  (Note that the implementation needs to parse commented lines to make sure
  that it doesn't add duplicates, but that's not so hard.)
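
  The duplicate check could look roughly like the sketch below. The
  file layout follows the example above; the function names and the
  in-memory list of lines are assumptions made to keep the example
  short.

```python
def known_fingerprints(lines):
    """Collect fingerprints from active and commented-out entries."""
    seen = set()
    for line in lines:
        text = line.strip()
        if not text or text.startswith("##"):
            continue  # blank line or "##" note line
        parts = text.lstrip("#").split()
        if len(parts) == 2:  # "<name> <fingerprint>"
            seen.add(parts[1].upper())
    return seen


def add_stub(lines, name, fingerprint, first_seen, uptime_note):
    """Return lines with a commented-out stub appended, unless already known."""
    if fingerprint.upper() in known_fingerprints(lines):
        return lines
    return lines + [
        f"## First seen with this key on {first_seen}",
        f"## {uptime_note}",
        f"#{name} {fingerprint}",
    ]
```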

  To add a router as named, administrators would only need to uncomment the
  entry.  This automatically maintained file could be kept separately from a
  manually maintained one.

  This could be combined with solution #2, such that Tor would do the hard
  work of uncommenting entries for routers that should get Named, but
  operators could override its decisions.

Possible solution #4: A separate mailing list for authority operators.

  Right now, the tor-ops list is very high volume.  There should be another
  list that's only for dealing with problems that need prompt action, like
  marking a router as !badexit.

Resolution:

  Solution #2 is described in "Proposal 123: Naming authorities
  automatically create bindings", and that approach is implemented.
  There are remaining issues in the problem statement above that need
  their own solutions.

Filename: 114-distributed-storage.txt
Title: Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 13-May-2007
Status: Closed
Implemented-In: 0.2.0.x

Change history:

  13-May-2007  Initial proposal
  14-May-2007  Added changes suggested by Lasse Øverlier
  30-May-2007  Changed descriptor format, key length discussion, typos
  09-Jul-2007  Incorporated suggestions by Roger, added status of specification
               and implementation for upcoming GSoC mid-term evaluation
  11-Aug-2007  Updated implementation statuses, included non-consecutive
               replication to descriptor format
  20-Aug-2007  Renamed config option HSDir as HidServDirectoryV2
  02-Dec-2007  Closed proposal

Overview:

  The basic idea of this proposal is to distribute the tasks of storing and
  serving hidden service descriptors from currently three authoritative
  directory nodes among a large subset of all onion routers. The three
  reasons to do this are better robustness (availability), better
  scalability, and improved security properties. Further,
  this proposal suggests changes to the hidden service descriptor format to
  prevent new security threats coming from decentralization and to gain even
  better security properties.

Status:

  As of December 2007, the new hidden service descriptor format is implemented
  and usable. However, servers and clients do not yet make use of descriptor
  cookies, because there are open usability issues of this feature that might
  be resolved in proposal 121. Further, hidden service directories do not
  perform replication by themselves, because (unauthorized) replica fetch
  requests would allow any attacker to fetch all hidden service descriptors in
  the system. As neither issue is critical to the functioning of v2
  descriptors and their distribution, this proposal is considered as Closed.
  
Motivation:

  The current design of hidden services exhibits the following performance and
  security problems:

  First, the three hidden service authoritative directories constitute a
  performance bottleneck in the system. The directory nodes are responsible for
  storing and serving all hidden service descriptors. As of May 2007 there are
  about 1000 descriptors at a time, but this number is assumed to increase in
  the future. Further, there is no replication protocol for descriptors between
  the three directory nodes, so that hidden services must ensure the
  availability of their descriptors by manually publishing them on all
  directory nodes. Whenever a fourth or fifth hidden service authoritative
  directory is added, hidden services will need to maintain an equally
  increasing number of replicas. These scalability issues have an impact on the
  current usage of hidden services and put an even higher burden on the
  development of new kinds of applications for hidden services that might
  require storing even more descriptors.

  Second, besides posing a limitation to scalability, storing all hidden
  service descriptors on three directory nodes also constitutes a security
  risk. The directory node operators could easily analyze the publish and fetch
  requests to derive information on service activity and usage and read the
  descriptor contents to determine which onion routers work as introduction
  points for a given hidden service and need to be attacked or threatened to
  shut it down. Furthermore, the contents of a hidden service descriptor offer
  only minimal security properties to the hidden service. Whoever becomes aware of
  the service ID can easily find out whether the service is active at the
  moment and which introduction points it has. This applies to (former)
  clients, (former) introduction points, and of course to the directory nodes.
  It only requires requesting the descriptor for the given service ID, which
  can be performed by anyone anonymously.

  This proposal suggests two major changes to approach the described
  performance and security problems:

  The first change affects the storage location for hidden service descriptors.
  Descriptors are distributed among a large subset of all onion routers instead
  of three fixed directory nodes. Each storing node is responsible for a subset
  of descriptors for a limited time only. It is not able to choose which
  descriptors it stores at a certain time, because this is determined by its
  onion ID which is hard to change frequently and in time (only routers which
  are stable for a given time are accepted as storing nodes). In order to
  resist single node failures and untrustworthy nodes, descriptors are
  replicated among a certain number of storing nodes. A first replication
  protocol makes sure that descriptors don't get lost when the node population
  changes; therefore, a storing node periodically requests the descriptors from
  its siblings. A second replication protocol distributes descriptors among
  non-consecutive nodes of the ID ring to prevent a group of adversaries from
  generating new onion keys until they have consecutive IDs to create a 'black
  hole' in the ring and make random services unavailable. Connections to
  storing nodes are established by extending existing circuits by one hop to
  the storing node. This also ensures that contents are encrypted. The effect
  of this first change is that the probability that a single node operator
  learns about a certain hidden service is very small and that it is very hard
  to track a service over time, even when it collaborates with other node
  operators.
  
  The second change concerns the content of hidden service descriptors.
  Obviously, security problems cannot be solved only by decentralizing storage;
  in fact, they could also get worse if done without caution. At first, a
  descriptor ID needs to change periodically in order to be stored on changing
  nodes over time. Next, the descriptor ID needs to be computable only for the
  service's clients, but should be unpredictable for all other nodes. Further,
  the storing node needs to be able to verify that the hidden service is the
  true originator of the descriptor with the given ID even though it is not a
  client. Finally, a storing node should learn as little information as
  necessary by storing a descriptor, because it might not be as trustworthy as
  a directory node; for example it does not need to know the list of
  introduction points. Therefore, a second key is applied that is only known to
  the hidden service provider and its clients and that is not included in the
  descriptor. It is used to calculate descriptor IDs and to encrypt the
  introduction points. This second key can either be given to all clients
  together with the hidden service ID, or to a group or a single client as
  an authentication token. In the future this second key could be the result of
  some key agreement protocol between the hidden service and one or more
  clients. A new text-based format is proposed for descriptors instead of an
  extension of the existing binary format for reasons of future extensibility.

Design:

  The proposed design is described by the required changes to the current
  design. These requirements are grouped by content, rather than by affected
  specification documents or code files, and numbered for reference below.

  Hidden service clients, servers, and directories:

  /1/ Create routing list

    All participants can filter the consensus status document received from the
    directory authorities to one routing list containing only those servers
    that store and serve hidden service descriptors and that have been running
    for at least 24 hours. A participant only trusts its own routing list and
    never learns about routing information from other parties.

  /2/ Determine responsible hidden service directory

    All participants can determine the hidden service directory that is
    responsible for storing and serving a given ID, as well as the hidden
    service directories that replicate its content. Every hidden service
    directory is responsible for the descriptor IDs in the interval from
    its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
    service directory holds replicas for its n predecessors, where n denotes
    the number of consecutive replicas. (requires /1/)
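
    Requirements /1/ and /2/ can be sketched together as below. This is
    an illustration only: small integers stand in for the real router
    and descriptor IDs, and the tuple layout and function names are
    assumptions of the example, not part of the specification.

```python
from bisect import bisect_left

MIN_UPTIME_SECS = 24 * 60 * 60


def build_routing_list(routers):
    """/1/: keep hidden service directories running for at least 24 hours.

    routers is an iterable of (router_id, is_hs_dir, uptime_secs).
    Returns the IDs sorted as they appear on the ID ring.
    """
    return sorted(rid for rid, is_hs_dir, uptime in routers
                  if is_hs_dir and uptime >= MIN_UPTIME_SECS)


def responsible_directories(routing_list, descriptor_id, n_replicas):
    """/2/: the responsible directory, then the n that replicate it.

    A directory is responsible for IDs in (predecessor, own ID], so the
    responsible one is the first with ID >= descriptor_id, wrapping
    around the ring. Its n successors hold the replicas.
    """
    if not routing_list:
        return []
    size = len(routing_list)
    start = bisect_left(routing_list, descriptor_id) % size
    count = min(n_replicas + 1, size)
    return [routing_list[(start + i) % size] for i in range(count)]
```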

  [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
   requests which have not been fulfilled in the course of the implementation
   of this proposal, but elsewhere.]

  Hidden service directory nodes:
    
  /5/ Advertise hidden service directory functionality

    Every onion router that has its directory port open can decide whether it
    wants to store and serve hidden service descriptors by setting a new config
    option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
    option being set includes the flag "hidden-service-dir" in its router
    descriptors that it sends to directory authorities.

  /6/ Accept v2 publish requests, parse and store v2 descriptors

    Hidden service directory nodes accept publish requests for hidden service
    descriptors and store them in local memory. (It is not necessary to
    make descriptors persistent, because after disconnecting, the onion router
    would not be accepted as a storing node anyway, because it has not been
    running for at least 24 hours.) All requests and replies are formatted as
    HTTP messages. Requests are directed to the router's directory port and are
    contained within BEGIN_DIR cells. A hidden service directory node stores a
    descriptor only when it thinks that it is responsible for storing that
    descriptor based on its own routing table. Every hidden service directory
    node is responsible for the descriptor IDs in the interval of its n-th
    predecessor in the ID circle up to its own ID (n denotes the number of
    consecutive replicas). (requires /1/)

  /7/ Accept v2 fetch requests

    Same as /6/, but with fetch requests for hidden service descriptors.
    (requires /2/)

  /8/ Replicate descriptors with neighbors

    A hidden service directory node replicates descriptors from its two
    predecessors by downloading them once an hour. Further, it checks its
    routing table periodically for changes. Whenever it realizes that a
    predecessor has left the network, it establishes a connection to the new
    n-th predecessor and requests its stored descriptors in the interval of its
    (n+1)-th predecessor and the requested n-th predecessor. Whenever it
    realizes that a new onion router has joined with an ID higher than its
    former n-th predecessor, it adds it to its predecessors and discards all
    descriptors in the interval of its (n+1)-th and its n-th predecessor.
    (requires /1/)

    [Dec 02: This function has not been implemented, because arbitrary nodes
     would have been able to download the entire set of v2 descriptors. An
     authorized replication request would be necessary. For the moment, the
     system runs without any directory-side replication. -KL]

  Authoritative directory nodes:

  /9/ Confirm a router's hidden service directory functionality

    Directory nodes include a new flag "HSDir" for routers that decided to
    provide storage for hidden service descriptors and that are running for at
    least 24 hours. The last requirement prevents a node from frequently
    changing its onion key to become responsible for an identifier it wants to
    target.

  Hidden service provider:

  /10/ Configure v2 hidden service

    Each hidden service provider that has set the config option
    "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
    descriptors and conform to the v2 connection establishment protocol. When
    configuring a hidden service, a hidden service provider checks if it has
    already created a random secret_cookie and a hostname2 file; if not, it
    creates both of them. (requires /2/)

  /11/ Establish introduction points with fresh key

    If configured to publish only v2 descriptors and no v0/v1 descriptors any
    more, a hidden service provider that is setting up the hidden service at
    introduction points does not pass its own public key, but the public key
    of a freshly generated key pair. It also includes these fresh public keys
    in the hidden service descriptor together with the other introduction point
    information. The reason is that the introduction point does not need to and
    therefore should not know for which hidden service it works, so as to
    prevent it from tracking the hidden service's activity. (If a hidden
    service provider supports both, v0/v1 and v2 descriptors, v0/v1 clients
    rely on the fact that all introduction points accept the same public key,
    so that this new feature cannot be used.)

  /12/ Encode v2 descriptors and send v2 publish requests

    If configured to publish v2 descriptors, a hidden service provider
    publishes a new descriptor whenever its content changes or a new
    publication period starts for this descriptor. If the current publication
    period would only last for less than 60 minutes (= 2 x 30 minutes to allow
    the server to be 30 minutes behind and the client 30 minutes ahead), the
    hidden service provider publishes both a current descriptor and one for
    the next period. Publication is performed by sending the descriptor to all
    hidden service directories that are responsible for keeping replicas for
    the descriptor ID. This includes two non-consecutive replicas that are
    stored at 3 consecutive nodes each. (requires /1/ and /2/)
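
    The placement rule from /6/ and /12/ can be sketched as follows. This is
    only an illustration, not Tor's implementation: the function name
    responsible_dirs and the toy integer IDs are hypothetical, and the sketch
    assumes that a directory whose ID equals the descriptor ID counts as the
    first responsible node.

```python
from bisect import bisect_left

def responsible_dirs(dir_ids, descriptor_id, n=3):
    """Return the n directory IDs that follow descriptor_id on the ID circle."""
    ring = sorted(dir_ids)
    # Index of the first directory whose ID is >= descriptor_id.
    start = bisect_left(ring, descriptor_id)
    count = min(n, len(ring))
    # Wrap around the end of the sorted list to close the circle.
    return [ring[(start + i) % len(ring)] for i in range(count)]

dirs = [10, 40, 70, 90]
print(responsible_dirs(dirs, 45))  # [70, 90, 10]
print(responsible_dirs(dirs, 95))  # [10, 40, 70]
```

    A provider would call such a function once per non-consecutive replica,
    each time with the descriptor ID computed for that replica.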

  Hidden service client:

  /13/ Send v2 fetch requests

    A hidden service client that has set the config option
    "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
    addresses by requesting a v2 descriptor from a randomly chosen hidden
    service directory that is responsible for keeping a replica for the
    descriptor ID. In total there are six replicas of which the first and the
    last three are stored on consecutive nodes. The probability of picking one
    of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
    fact that the availability will be the highest on the node with next higher
    ID. A hidden service client relies on the hidden service provider to store
    two sets of descriptors to compensate clock skew between service and
    client. (requires /1/ and /2/)
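
    A minimal sketch of the weighted choice described in /13/. Two points are
    assumptions that the text leaves open: that the client first picks one of
    the two non-consecutive replica sets uniformly, and that the weight 3/6
    goes to the node with the next higher ID after the descriptor ID. The
    name pick_directory is hypothetical.

```python
import random

def pick_directory(replica_sets, rng=random):
    """Pick one hidden service directory to query.

    replica_sets holds one list per non-consecutive replica; each list
    names its three consecutive nodes in order of increasing ID.
    """
    # Assumption: both replica sets are equally likely.
    nodes = rng.choice(replica_sets)
    # Assumption: the closest node gets 3/6, then 2/6, then 1/6.
    weights = [3, 2, 1][:len(nodes)]
    return rng.choices(nodes, weights=weights, k=1)[0]

sets = [["dirA", "dirB", "dirC"], ["dirD", "dirE", "dirF"]]
print(pick_directory(sets))
```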

  /14/ Process v2 fetch reply and parse v2 descriptors

    A hidden service client that has sent a request for a v2 descriptor can
    parse it and store it to the local cache of rendezvous service descriptors.

  /15/ Establish connection to v2 hidden service

    A hidden service client can establish a connection to a hidden service
    using a v2 descriptor. This includes using the secret cookie for decrypting
    the introduction points contained in the descriptor. When contacting an
    introduction point, the client does not use the public key of the hidden
    service provider, but the freshly-generated public key that is included in
    the hidden service descriptor. Whether or not a fresh key is used instead
    of the key of the hidden service depends on the available protocol versions
    that are included in the descriptor; by this, connection establishment is
    to a certain extent decoupled from fetching the descriptor.

  Hidden service descriptor:

  (Requirements concerning the descriptor format are contained in /6/ and /7/.)
  
    The new v2 hidden service descriptor format looks like this:

      onion-address = h(public-key) + cookie
      descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
      descriptor-content = {
        descriptor-id,
        version,
        public-key,
        h(time-period + cookie + replica),
        timestamp,
        protocol-versions,
        { introduction-points } encrypted with cookie
      } signed with private-key

    The "descriptor-id" needs to change periodically in order for the
    descriptor to be stored on changing nodes over time. It may only be
    computable by a hidden service provider and all of his clients to prevent
    unauthorized nodes from tracking the service activity by periodically
    checking whether there is a descriptor for this service. Finally, the
    hidden service directory needs to be able to verify that the hidden service
    provider is the true originator of the descriptor with the given ID.
    
    Therefore, "descriptor-id" is derived from the "public-key" of the hidden
    service provider, the current "time-period" which changes every 24 hours,
    a secret "cookie" shared between hidden service provider and clients, and
    a "replica" denoting the number of this non-consecutive replica. (The
    "time-period" is constructed in a way that time periods do not change at
    the same moment for all descriptors by deriving a value between 0:00 and
    23:59 hours from h(public-key) and making the descriptors of this hidden
    service provider expire at that time of the day.) The "descriptor-id" is
    defined to be 160 bits long. [extending the "descriptor-id" length
    suggested by LØ]
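
    The derivation can be sketched in a few lines. The following is an
    illustration under stated assumptions, not the normative encoding: h is
    taken to be SHA-1 (only because it yields 160 bits), "+" is taken to be
    byte concatenation, the time period is encoded as 4 big-endian bytes and
    the replica as 1 byte, and the per-service offset is taken from the first
    two bytes of h(public-key). The function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Assumption: SHA-1, which produces the 160-bit output the text asks for.
    return hashlib.sha1(data).digest()

def time_period(now: int, public_key: bytes) -> int:
    """Number of the 24-hour period that `now` (Unix seconds) falls into."""
    # Assumption: a minute of the day (0..1439) derived from h(public-key),
    # so that different services roll over at different times of day.
    offset_minutes = int.from_bytes(h(public_key)[:2], "big") % 1440
    return (now - offset_minutes * 60) // 86400

def descriptor_id(public_key: bytes, cookie: bytes, replica: int, now: int) -> bytes:
    period = time_period(now, public_key).to_bytes(4, "big")
    # Inner hash: only parties that know the cookie can compute it.
    secret_part = h(period + cookie + bytes([replica]))
    # Outer hash binds the secret part to the service's public key.
    return h(h(public_key) + secret_part)

example = descriptor_id(b"example public key", b"c" * 15, 0, 1_200_000_000)
print(len(example) * 8)  # 160
```

    Because the directory receives both "public-key" and the inner hash in
    the descriptor, it can recompute the outer hash without ever learning the
    cookie.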
    
    Only the hidden service provider and the clients are able to generate
    future "descriptor-id"s. To make this possible, the "onion-address" is
    extended from what is now only the hash value of "public-key" by the
    secret "cookie". The hash of the "public-key" is set to be 80 bits long,
    whereas the "cookie" is dimensioned to be 120 bits long. This makes a
    total of 200 bits or 40 base32 chars, which is quite a lot to handle for
    a human, but necessary to provide sufficient protection against an
    adversary generating a key pair with the same "public-key" hash or
    guessing the "cookie".
    
    A hidden service directory can verify that a descriptor was created by the
    hidden service provider by checking if the "descriptor-id" corresponds to
    the "public-key" and if the signature can be verified with the
    "public-key".

    The "introduction-points" that are included in the descriptor are encrypted
    using the same "cookie" that is shared between hidden service provider and
    clients. [correction to use another key than h(time-period + cookie) as
    encryption key for introduction points made by LØ]

    A new text-based format is proposed for descriptors instead of an extension
    of the existing binary format for reasons of future extensibility.

Security implications:

  The security implications of the proposed changes are grouped by the roles of
  nodes that could perform attacks or on which attacks could be performed.

  Attacks by authoritative directory nodes

    Authoritative directory nodes are no longer the single places in the
    network that know about a hidden service's activity and introduction
    points. Thus, they cannot perform attacks using this information, e.g.
    track a hidden service's activity or usage pattern or attack its
    introduction points. Formerly, it would only require a single corrupted
    authoritative directory operator to perform such an attack.

  Attacks by hidden service directory nodes

    A hidden service directory node could misuse a stored descriptor to track a
    hidden service's activity and usage pattern by clients. Though there is no
    countermeasure against this kind of attack, it is very expensive to track a
    certain hidden service over time. An attacker would need to run a large
    number of stable onion routers that work as hidden service directory nodes
    to have a good probability to become responsible for its changing
    descriptor IDs. For each period, the probability is:

      1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N as
      total number of hidden service directories, c as compromised nodes,
      and r as number of replicas
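
    To get a feeling for the numbers, the expression can be evaluated
    directly. The sketch below simply transcribes the formula; the function
    name p_tracking and the example values are hypothetical.

```python
from math import comb

def p_tracking(N, c, r):
    """Probability that at least one of r replicas is on a compromised node.

    N: total hidden service directories, c: compromised ones, r: replicas.
    """
    if N - c < r:
        # Too few honest nodes to hold all r replicas.
        return 1.0
    return 1 - comb(N - c, r) / comb(N, r)

# Example values only: 1000 directories, 10 compromised, 6 replicas.
print(p_tracking(1000, 10, 6))
```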

    The hidden service directory nodes could try to make a certain hidden
    service unavailable to its clients. To that end, they could discard all
    stored descriptors for that hidden service and reply to clients that there
    is no descriptor for the given ID or return an old or false descriptor
    content. The client would detect a false descriptor, because it could not
    contain a correct signature. But an old content or an empty reply could
    confuse the client. Therefore, the countermeasure is to replicate
    descriptors among a small number of hidden service directories, e.g. 5.
    The probability of a group of collaborating nodes to make a hidden service
    completely unavailable is in each period:

      (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, with N
      as total number of hidden service directories, c as compromised nodes,
      and r as number of replicas
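
    This expression can be transcribed the same way. As before, the function
    name p_unavailable and the example values are hypothetical.

```python
from math import comb

def p_unavailable(N, c, r):
    """Probability that all r replicas are on compromised nodes.

    N: total hidden service directories, c: compromised ones, r: replicas.
    """
    if c < r or N < r:
        # The attacker cannot hold every replica.
        return 0.0
    return comb(c, r) / comb(N, r)

# Example values only: 1000 directories, 10 compromised, 6 replicas.
print(p_unavailable(1000, 10, 6))
```

    The contrast with the tracking probability is the point of the
    countermeasure: holding one replica is enough to observe a service, but
    all r are needed to hide it.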

    A hidden service directory could try to find out which introduction points
    are working on behalf of a hidden service. In contrast to the previous
    design, this is not possible anymore, because this information is encrypted
    to the clients of a hidden service.

  Attacks on hidden service directory nodes

    An anonymous attacker could try to swamp a hidden service directory with
    false descriptors for a given descriptor ID. This is prevented by requiring
    that descriptors are signed.

    Anonymous attackers could swamp a hidden service directory with correct
    descriptors for non-existing hidden services. There is no countermeasure
    against this attack. However, the creation of valid descriptors is more
    expensive than verification and storage in local memory. This should make
    this kind of attack unattractive.

  Attacks by introduction points

    Current or former introduction points could try to gain information on the
    hidden service they serve. But due to the fresh key pair that is used by
    the hidden service, this attack is not possible anymore.

  Attacks by clients

    Current or former clients could track a hidden service's activity, attack
    its introduction points, or determine the responsible hidden service
    directory nodes and attack them. There is nothing that could prevent them
    from doing so, because honest clients need the full descriptor content to
    establish a connection to the hidden service. At the moment, the only
    countermeasure against dishonest clients is to change the secret cookie and
    pass it only to the honest clients.

Compatibility:

  The proposed design is meant to replace the current design for hidden service
  descriptors and their storage in the long run.

  There should be a first transition phase in which both the current design
  and the proposed design are served in parallel. Onion routers should start
  serving as hidden service directories, and hidden service providers and
  clients should make use of the new design if both sides support it. Hidden
  service providers should be allowed to publish descriptors of the current
  format in parallel, and authoritative directories should continue storing and
  serving these descriptors.

  After the first transition phase, hidden service providers should stop
  publishing descriptors on authoritative directories, and hidden service
  clients should not try to fetch descriptors from the authoritative
  directories. However, the authoritative directories should continue serving
  hidden service descriptors for a second transition phase. As of this point,
  all v2 config options should be set to a default value of 1.

  After the second transition phase, the authoritative directories should stop
  serving hidden service descriptors.

Filename: 115-two-hop-paths.txt
Title: Two Hop Paths
Author: Mike Perry
Created:
Status: Dead
Supersedes: 112


Overview:

  The idea is that users should be able to choose if they would like
  to have either two or three hop paths through the tor network. 

  Let us be clear: the users who would choose this option should be
  those that are concerned with IP obfuscation only: ie they would not be
  targets of a resource-intensive multi-node attack. It is sometimes said
  that these users should find some other network to use other than Tor.
  This is a foolish suggestion: more users improve the security of everyone,
  and the current small userbase size is a critical hindrance to
  anonymity, as is discussed below and in [1].

  This value should be modifiable from the controller, and should be
  available from Vidalia.


Motivation:

  The Tor network is slow and overloaded. Increasingly often I hear
  stories about friends and friends of friends who are behind firewalls,
  annoying censorware, or under surveillance that interferes with their
  productivity and Internet usage, or chills their speech. These people
  know about Tor, but they choose to put up with the censorship because
  Tor is too slow to be usable for them. In fact, to download a fresh,
  complete copy of levine-timing.pdf for the Theoretical Argument
  section of this proposal over Tor took me 3 tries.

  Furthermore, the biggest current problem with Tor's anonymity for
  those who really need it is not someone attacking the network to
  discover who they are. It's instead the extreme danger that so few
  people use Tor because it's so slow, that those who do use it have
  essentially no confusion set.

  The recent case where the professor and the rogue Tor user were the
  only Tor users on campus, and thus suspected in an incident involving
  Tor and that University underscores this point: "That was why the police
  had come to see me. They told me that only two people on our campus were
  using Tor: me and someone they suspected of engaging in an online scam.
  The detectives wanted to know whether the other user was a former
  student of mine, and why I was using Tor"[1].

  Not only does Tor provide no anonymity if you use it to be anonymous
  but are obviously from a certain institution, location or circumstance,
  it is also dangerous to use Tor for risk of being accused of having
  something significant enough to hide to be willing to put up with
  the horrible performance as opposed to using some weaker alternative.

  There are many ways to improve the speed problem, and of course we
  should and will implement as many as we can. Johannes's GSoC project
  and my reputation system are longer term, higher-effort things that
  will still provide benefit independent of this proposal.

  However, reducing the path length to 2 for those who do not need the
  extra anonymity 3 hops provide not only improves their Tor experience
  but also reduces their load on the Tor network by 33%, and should
  increase adoption of Tor by a good deal. That's not just Win-Win, it's
  Win-Win-Win.


Who will enable this option?

  This is the crux of the proposal. Admittedly, there is some anonymity
  loss and some degree of decreased investment required on the part of
  the adversary to attack 2 hop users versus 3 hop users, even if it is
  minimal and limited mostly to up-front costs and false positives.

  The key questions are:

  1. Are these users in a class such that their risk is significantly
     less than the amount of this anonymity loss?

  2. Are these users able to identify themselves?

  Many many users of Tor are not at risk for an adversary capturing c/n
  nodes of the network just to see what they do. These users use Tor to
  circumvent aggressive content filters, or simply to keep their IP out of
  marketing and search engine databases. Most content filters have no
  interest in running Tor nodes to catch violators, and marketers
  certainly would never consider such a thing, both on a cost basis and a
  legal one.

  In a sense, this represents an alternate threat model against these
  users who are not at risk for Tor's normal threat model.

  It should be evident to these users that they fall into this class. All
  that should be needed is a radio button

   * "I use Tor for local content filter circumvention and/or IP obfuscation, 
      not anonymity. Speed is more important to me than high anonymity. 
      No one will make considerable efforts to determine my real IP."
   * "I use Tor for anonymity and/or national-level, legally enforced 
      censorship. It is possible effort will be taken to identify 
      me, including but not limited to network surveillance. I need more 
      protection."
 
  and then some explanation in the help for exactly what this means, and
  the risks involved with eliminating the adversary's need for timing
  attacks with respect to false positives. Ultimately, the decision is a
  simple one that can be made without this information, however. The user
  does not need Paul Syverson to instruct them on the deep magic of Onion
  Routing to make this decision. They just need to know why they use Tor.
  If they use it just to stay out of marketing databases and/or bypass a
  local content filter, two hops is plenty. This is likely the vast
  majority of Tor users, and many non-users we would like to bring on 
  board.

  So, having established this class of users, let us now go on to
  examine theoretical and practical risks we place them at, and determine
  if these risks violate the users' needs, or introduce additional risk
  to node operators who may be subject to requests from law enforcement
  to track users who need 3 hops, but use 2 because they enjoy the
  thrill of Russian roulette.


Theoretical Argument:

  It has long been established that timing attacks against mix
  and onion networks are extremely effective, and that regardless 
  of path length, if the adversary has compromised your first and 
  last hop of your path, you can assume they have compromised your
  identity for that connection.

  In fact, it was demonstrated that for all but the slowest, lossiest
  networks, error rates for false positives and false negatives were
  very near zero[2]. Only for constant streams of traffic over slow and
  (more importantly) extremely lossy network links did the error rate
  hit 20%. For loss rates typical to the Internet, even the error rate
  for slow nodes with constant traffic streams was 13%.

  When you take into account that most Tor streams are not constant,
  but probably much more like their "HomeIP" dataset, which consists
  mostly of web traffic that exists over finite intervals at specific
  times, error rates drop to fractions of 1%, even for the "worst"
  network nodes.

  Therefore, the user has little benefit from the extra hop, assuming
  the adversary does timing correlation on their nodes. Since timing
  correlation is simply an implementation issue and is most likely
  a single up-front cost (and one that is likely quite a bit cheaper
  than the cost of the machines purchased to host the nodes to mount
  an attack), the real protection is the low probability of getting
  both the first and last hop of a client's stream.


Practical Issues:

  Theoretical issues aside, there are several practical issues with the
  implementation of Tor that need to be addressed to ensure that
  identity information is not leaked by the implementation.

  Exit policy issues:

  If a client chooses an exit with a very restrictive exit policy
  (such as an IP or IP range), the first hop then knows a good deal
  about the destination. For this reason, clients should not select
  exits that match their destination IP with anything other than "*".

  Partitioning:

  Partitioning attacks form another concern. Since Tor uses telescoping
  to build circuits, it is possible to tell a user is constructing only
  two hop paths at the entry node and on the local network. An external
  adversary can potentially differentiate 2 and 3 hop users, and decide
  that all IP addresses connecting to Tor and using 3 hops have something
  to hide, and should be scrutinized more closely or outright apprehended.

  One solution to this is to use the "leaky-circuit" method of attaching
  streams: The user always creates 3-hop circuits, but if the option
  is enabled, they always exit from their 2nd hop. The ideal solution
  would be to create a RELAY_SHISHKABOB cell which contains onion
  skins for every host along the path, but this requires protocol
  changes at the nodes to support.

  Guard nodes:

  Since guard nodes can rotate due to client relocation, network
  failure, node upgrades and other issues, if you amortize the risk a
  mobile, dialup, or otherwise intermittently connected user is exposed to
  over any reasonable duration of Tor usage (on the order of a year), it
  is the same with or without guard nodes. Assuming an adversary has
  c%/n% of network bandwidth, and guards rotate on average with period R,
  statistically speaking, it's merely a question of if the user wishes
  their risk to be concentrated with probability c/n over an expected
  period of R*c, and probability 0 over an expected period of R*(n-c),
  versus a continuous risk of (c/n)^2. So statistically speaking, guards
  only create a time-tradeoff of risk over the long run for normal Tor
  usage. Rotating guards do not reduce risk for normal client usage long
  term.[3]
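
  The time-averaging step behind this claim can be written out. This is a
  sketch of the reasoning above, not the proof referred to in [3]; the
  function names are hypothetical, and c and n are used as plain counts
  whose ratio is the adversary's share of the network.

```python
from fractions import Fraction

def risk_without_guards(c, n):
    # Entry and exit are both drawn afresh, each malicious with chance c/n.
    return Fraction(c, n) ** 2

def averaged_risk_with_guards(c, n, R=1):
    # Expected time spent behind a malicious guard versus an honest one.
    bad_time = R * c
    good_time = R * (n - c)
    # Behind a malicious guard, a circuit is exposed when the exit is
    # malicious too (c/n); behind an honest guard the risk is 0.
    exposed = bad_time * Fraction(c, n) + good_time * 0
    return exposed / (bad_time + good_time)

print(risk_without_guards(5, 100))            # 1/400
print(averaged_risk_with_guards(5, 100, 30))  # 1/400
```

  The rotation period R cancels out, which is why the long-run figure is
  the same in both cases.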

  On the other hand, assuming a more stable method of guard selection
  and preservation is devised, or a more stable client side network than 
  my own is typical (which rotates guards frequently due to network issues
  and moving about), guard nodes provide a tradeoff in the form of c/n% of
  the users being "sacrificial users" who are exposed to high risk O(c/n)
  of identification, while the rest of the network is exposed to zero
  risk.

  The nature of Tor makes it likely an adversary will take a "shock and
  awe" approach to suppressing Tor by rounding up a few users whose
  browsing activity has been observed to be made into examples, in an
  attempt to prove that Tor is not perfect.

  Since this "shock and awe" attack can be applied with or without guard
  nodes, stable guard nodes do offer a measure of accountability of sorts.
  If a user was using a small set of guard nodes and knows them well, and
  then is suddenly apprehended as a result of Tor usage, having a fixed
  set of entry points to suspect is a lot better than suspecting the whole
  network. Conversely, it can also give non-apprehended users comfort
  that they are likely to remain safe indefinitely with their set of (now
  presumably trusted) guards. This is probably the most beneficial
  property of reliable guards: they deter the adversary from mounting
  "shock and awe" attacks because the surviving users will not be
  intimidated, but instead made more confident. Of course, guards need to
  be made much more stable and users need to be encouraged to know their
  guards for this property to really take effect. 

  This beneficial property of client vigilance also carries over to an
  active adversary, except in this case instead of relying on the user
  to remember their guard nodes and somehow communicate them after
  apprehension, the code can alert them to the presence of an active
  adversary before they are apprehended. But only if they use guard nodes.

  So let's consider the active adversary: Two hop paths allow malicious
  guards to get considerably more benefit from failing circuits if they do
  not extend to their colluding peers for the exit hop. Since guards can
  detect the number of hops in a path via either timing or by statistical
  analysis of the exit policy of the 2nd hop, they can perform this attack
  predominantly against 2 hop users.

  This can be addressed by completely abandoning an entry guard after a
  certain ratio of extend or general circuit failures with respect to
  non-failed circuits. The proper value for this ratio can be determined
  experimentally with TorFlow. There is the possibility that the local
  network can abuse this feature to cause certain guards to be dropped,
  but they can do that anyways with the current Tor by just making guards
  they don't like unreachable. With this mechanism, Tor will complain
  loudly if any guard failure rate exceeds the expected in any failure
  case, local or remote.

  Eliminating guards entirely would actually not address this issue due
  to the time-tradeoff nature of risk. In fact, it would just make it
  worse. Without guard nodes, it becomes much more difficult for clients
  to notice Tor entry points that fail circuits in order to devote their
  bandwidth only to streams of which they observe both ends. Yet the
  rogue entry points are still able to significantly increase their
  success rates by failing circuits.

  For this reason, guard nodes should remain enabled for 2 hop users,
  at least until an IP-independent, undetectable guard scanner can
  be created. TorFlow can scan for failing guards, but after a while, 
  its unique behavior gives away the fact that its IP is a scanner and 
  it can be given selective service.
  
  Consideration of risks for node operators:

  There is a serious risk for two hop users in the form of guard
  profiling. If an adversary running an exit node notices that a
  particular site is always visited from a fixed previous hop, it is
  likely that this is a two hop user using a certain guard, which could be
  monitored to determine their identity. Thus, for the protection of both
  2 hop users and node operators, 2 hop users should limit their guard
  duration to a sufficient number of days to verify reliability of a node,
  but not much more. This duration can be determined experimentally by
  TorFlow.

  Considering a Tor client builds on average 144 circuits/day (10
  minutes per circuit), if the adversary owns c/n% of exits on the
  network, they can expect to see 144*c/n circuits from this user, or
  about 14 minutes of usage per day per percentage of network penetration.
  Since it will take several occurrences of user-linkable exit content
  from the same predecessor hop for the adversary to have any confidence
  this is a 2 hop user, it is very unlikely that any sort of demands made
  upon the predecessor node would be guaranteed to be effective (ie it
  actually was a guard), let alone be executed in time to apprehend the 
  user before they rotated guards.
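
  The arithmetic behind the 14-minute figure, as a small sketch. The
  function name is hypothetical; the 10-minute circuit lifetime is the one
  assumed in the paragraph above.

```python
def observed_minutes_per_day(exit_fraction, circuit_minutes=10):
    """Minutes of a user's daily traffic seen by adversary-run exits."""
    circuits_per_day = 24 * 60 / circuit_minutes  # 144 at 10 minutes each
    observed_circuits = circuits_per_day * exit_fraction
    return observed_circuits * circuit_minutes

# One percent of exits: 1.44 circuits, or roughly 14.4 minutes per day.
print(observed_minutes_per_day(0.01))
```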

  The reverse risk also warrants consideration. If a malicious guard has
  orders to surveil Mike Perry, it can determine Mike Perry is using two
  hops by observing his tendency to choose a 2nd hop with a viable exit
  policy. This can be done relatively quickly, unfortunately, and
  indicates Mike Perry should spend some of his time building real 3 hop
  circuits through the same guards, to require them to at least wait for
  him to actually use Tor to determine his style of operation, rather than
  collect this information from his passive building patterns.

  However, to actively determine where Mike Perry is going, the guard
  will need to require logging ahead of time at multiple exit nodes that
  he may use over the course of the few days while he is at that guard,
  and correlate the usage times of the exit node with Mike Perry's
  activity at that guard for the few days he uses it. At this point, the
  adversary is mounting a scale and method of attack (widespread logging,
  timing attacks) that works pretty much just as effectively against 3
  hops, so exit node operators are exposed to no additional danger than
  they otherwise normally are.


Why not fix Pathlen=2?:

  The main reason I am not advocating that we always use 2 hops is that
  in some situations, timing correlation evidence by itself may not be
  considered as solid and convincing as an actual, uninterrupted, fully
  traced path. Are these timing attacks as effective on a real network as
  they are in simulation? Maybe the circuit multiplexing of Tor can serve
  to frustrate them to a degree? Would an extralegal adversary or
  authoritarian government even care? In the face of these
  situation-dependent unknowns, it should be up to the user to decide if
  this is a concern for them or not.

  It should probably also be noted that even a false positive
  rate of 1% for a 200k concurrent-user network could mean that for a
  given node, a given stream could be confused with something like 10
  users, assuming ~200 nodes carry most of the traffic (ie 1000 users
  each). Though of course to really know for sure, someone needs to do
  an attack on a real network, unfortunately.
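
  The estimate above works out as follows. This is only the back-of-envelope
  calculation from the paragraph, with a hypothetical function name.

```python
def confusable_users(total_users, busy_nodes, false_positive_rate):
    """Users at one node whose streams could be mistaken for a given stream."""
    users_per_node = total_users / busy_nodes  # 200k / 200 = 1000
    return users_per_node * false_positive_rate

# 200k concurrent users, ~200 busy nodes, 1% false positives: about 10.
print(confusable_users(200_000, 200, 0.01))
```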

  Additionally, at some point cover traffic schemes may be implemented to
  frustrate timing attacks on the first hop. It is possible some expert
  users may do this ad-hoc already, and may wish to continue using 3 hops
  for this reason.


Implementation:

  new_route_len() can be modified directly with a check of the
  Pathlen option. However, circuit construction logic should be
  altered so that both 2 hop and 3 hop users build the same types of
  circuits, and the option should ultimately govern circuit selection,
  not construction. This improves coverage against guard nodes being
  able to passively profile users who aren't even using Tor.
  PathlenCoinWeight, anyone? :)

  The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
  needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
  ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
  circuit_is_acceptable.
  
  The leaky exit is trickier still. handle_control_attachstream
  does allow paths to exit at a given hop. Presumably something similar
  can be done in connection_ap_handshake_process_socks, and elsewhere?
  Circuit construction would also have to be performed such that the
  2nd hop's exit policy is what is considered, not the 3rd's.

  The entry_guard_t structure could have num_circ_failed and
  num_circ_succeeded members such that if it exceeds F% circuit
  extend failure rate to a second hop, it is removed from the entry list.

  F should be sufficiently high to avoid churn from normal Tor circuit
  failure as determined by TorFlow scans.
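
  The bookkeeping could look roughly like the sketch below. It is written
  in Python for brevity rather than in Tor's C, and it is not the actual
  entry_guard_t code: the class, the function prune_guards, and the
  min_circuits floor (which keeps a guard from being judged on a handful
  of circuits) are assumptions added for illustration.

```python
from dataclasses import dataclass

@dataclass
class EntryGuard:
    nickname: str
    num_circ_failed: int = 0
    num_circ_succeeded: int = 0

    def failure_rate(self):
        total = self.num_circ_failed + self.num_circ_succeeded
        return self.num_circ_failed / total if total else 0.0

def prune_guards(guards, max_failure_rate, min_circuits=20):
    """Return the guards whose extend failure rate does not exceed F."""
    kept = []
    for guard in guards:
        total = guard.num_circ_failed + guard.num_circ_succeeded
        # Assumption: only judge a guard once it has seen enough circuits.
        if total >= min_circuits and guard.failure_rate() > max_failure_rate:
            continue  # dropped; a real client would also warn the user
        kept.append(guard)
    return kept

guards = [EntryGuard("steady", 2, 98), EntryGuard("flaky", 60, 40)]
print([g.nickname for g in prune_guards(guards, 0.3)])  # ['steady']
```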

  The Vidalia option should be presented as a radio button.


Migration:

  Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky
  circuit ability, and 2-3 hop circuit selection logic governed by
  Pathlen.

  Phase 2: Experiment to determine the proper ratio of circuit
  failures used to expire garbage or malicious guards via TorFlow
  (pending Bug #440 backport+adoption).

  Phase 3: Implement guard expiration code to kick off failure-prone
  guards and warn the user. Cap 2 hop guard duration to a proper number
  of days determined sufficient to establish guard reliability (to be
  determined by TorFlow).

  Phase 4: Make radiobutton in Vidalia, along with help entry
  that explains in layman's terms the risks involved.

  Phase 5: Allow user to specify path length by HTTP URL suffix.


[1] http://p2pnet.net/story/11279
[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
[3] Proof available upon request ;)
Filename: 116-two-hop-paths-from-guard.txt
Title: Two hop paths from entry guards
Author: Michael Lieberman
Created: 26-Jun-2007
Status: Dead

This proposal is related to (but different from) Mike Perry's proposal 115
"Two Hop Paths."

Overview:

Volunteers who run entry guards should have the option of using only 2
additional tor nodes when constructing their own tor circuits.

While the option of two hop paths should perhaps be extended to every client
(as discussed in Mike Perry's thread), I believe the anonymity properties of
two hop paths are particularly well-suited to client computers that are also
serving as entry guards.

First I will describe the details of the strategy, as well as possible
avenues of attack. Then I will list advantages and disadvantages. Then, I
will discuss some possibly safer variations of the strategy, and finally
some implementation issues.

Details:

Suppose Alice is an entry guard, and wants to construct a two hop circuit.
Alice chooses a middle node at random (not using the entry guard strategy),
and gains anonymity by having her traffic look just like traffic from
someone else using her as an entry guard.

Can Alice's middle node figure out that she is the initiator of the traffic?
I can think of four possible approaches for distinguishing traffic from
Alice from traffic through Alice:

1) Notice that communication from Alice comes too fast: Experimentation is
needed to determine if traffic from Alice can be distinguished from traffic
from a computer with a decent link to Alice.

2) Monitor Alice's network traffic to discover the lack of incoming packets
at the appropriate times. If an adversary has this ability, then Alice
already has problems in the current system, because the adversary can run a
standard timing attack on Alice's traffic.

3) Notice that traffic from Alice is unique in some way such that if Alice
was just one of 3 entry guards for this traffic, then the traffic should be
coming from two other entry guards as well. An example of "unique traffic"
could be always sending 117 packets every 3 minutes to an exit node that
exits to port 4661. However, if such patterns existed with sufficient
precision, then it seems to me that Tor already has a problem. (This "unique
traffic" may not be a problem if clients often end up choosing a single
entry guard because their other two are down. Does anyone know if this is
the case?)

4) First, control the middle node *and* some other part of the traffic,
using standard attacks on a two hop circuit without entry nodes (my recent
paper on Browser-Based Attacks would work well for this
http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With
control of the circuit, we can now cause "unique traffic" as in 3).
Alternatively, if we know something about Alice independently, and we can
see what websites are being visited, we might be able to guess that she is
the kind of person that would visit those websites.

Anonymity Advantages:

-Alice never has the problem of choosing a malicious entry guard. In some
sense, Alice acts as her own entry guard.

Anonymity Disadvantages:

-If Alice's traffic is identified as originating from herself (see above for
how hard that might be), then she has the anonymity of a 2 hop circuit
without entry guards.

Additional advantages:

-A discussion of the latency advantages of two hop circuits is going on in
Mike Perry's thread already.
-Also, we can advertise this change as "Run an entry guard and decrease your
own Tor latency." This incentive has the potential to add nodes to the
network, improving the network as a whole.

Safer variations:

To solve the "unique traffic" problem, Alice could use two hop paths only
1/3 of the time, and choose 2 other entry guards for the other 2/3 of the
time. All the advantages are now 1/3 as useful (possibly more, if the other
2 entry guards are not always up).

To solve the problem that Alice's responses are too fast, Alice could delay
her responses (ideally based on some real data of response time when Alice
is used as an entry guard). This loses most of the speed advantages of the two
hop path, but if Alice is a fast entry guard, it doesn't lose everything. It
also still has the (arguable) anonymity advantage that Alice doesn't have to
worry about having a malicious entry guard.

Implementation details:

For Alice to remain anonymous using this strategy, she has to actually be
acting as an entry guard for other nodes. This means the two hop option can
only be available to whatever high-performance threshold is currently set on
entry guards. Alice may need to somehow check her own current status as an
entry guard before choosing this two hop strategy.

Another thing to consider: suppose Alice is also an exit node. If the
fraction of exit nodes in existence is too small, she may rarely or never be
chosen as an entry guard. It would be sad if we offered an incentive to run
an entry guard that didn't extend to exit nodes. I suppose clients of Exit
nodes could pull the same trick, and bypass using Tor altogether (zero hop
paths), though that has additional issues.*

Mike Lieberman
MIT

*Why we shouldn't recommend Exit nodes pull the same trick:
1) Exit nodes would suffer heavily from the problem of "unique traffic"
mentioned above.
2) It would give governments an incentive to confiscate exit nodes to see if
they are pulling this trick.
Filename: 117-ipv6-exits.txt
Title: IPv6 exits
Author: coderman
Created: 10-Jul-2007
Status: Closed
Target: 0.2.4.x
Implemented-In: 0.2.4.7-alpha

Overview

   Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6
   addresses.  This proposal does not imply any IPv6 support for OR
   traffic, only exit and name resolution.


Contents

0. Motivation

   As the IPv4 address space becomes more scarce there is increasing
   effort to provide Internet services via the IPv6 protocol.  Many
   hosts are available at IPv6 endpoints which are currently
   inaccessible for Tor users.

   Extending Tor to support IPv6 exit streams and IPv6 DNS name
   resolution will allow users of the Tor network to access these hosts.
   This capability would be present for those who do not currently have
   IPv6 access, thus increasing the utility of Tor and furthering
   adoption of IPv6.


1. Design

1.1. General design overview

   There are three main components to this proposal.  The first is a
   method for routers to advertise their ability to exit IPv6 traffic.
   The second is the manner in which routers resolve names to IPv6
   addresses.  Last but not least is the method in which clients
   communicate with Tor to resolve and connect to IPv6 endpoints
   anonymously.

1.2. Router IPv6 exit support

   In order to specify exit policies and IPv6 capability new directives
   in the Tor configuration will be needed.  If a router advertises IPv6
   exit policies in its descriptor this will signal the ability to
   provide IPv6 exit.  There are a number of additional default deny
   rules associated with this new address space which are detailed in
   the addendum.

   When Tor is started on a host it should check for the presence of a
   global unicast IPv6 address and if present include the default IPv6
   exit policies and any user specified IPv6 exit policies.

   If a user provides IPv6 exit policies but no global unicast IPv6
   address is available Tor should generate a warning and not publish the
   IPv6 policies in the router descriptor.

   It should be noted that IPv4 mapped IPv6 addresses are not valid exit
   destinations.  This mechanism is mainly used to interoperate with
   both IPv4 and IPv6 clients on the same socket.  Any attempts to use
   an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for
   IPv4, must be refused.
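
   The check described here could be sketched as below, using Python's
   standard ipaddress module. The function name is hypothetical, and
   the sketch covers only the IPv4-mapped rule, not exit policy
   evaluation as a whole.

```python
import ipaddress


def must_refuse_as_ipv4_mapped(addr: str) -> bool:
    """Return True if addr is an IPv4-mapped IPv6 address (::ffff:a.b.c.d).

    Brackets around the address, as in "[::1]", are tolerated.
    """
    ip = ipaddress.ip_address(addr.strip("[]"))
    return ip.version == 6 and ip.ipv4_mapped is not None
```

   An exit would apply a check like this before consulting its IPv6
   exit policy, so that a mapped address can never be used to reach an
   IPv4 destination that the IPv4 policy rejects.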

1.3. DNS name resolution of IPv6 addresses (AAAA records)

   In addition to exit support for IPv6 TCP connections, a method to
   resolve domain names to their respective IPv6 addresses is also
   needed.  This is accomplished in the existing DNS system via AAAA
   records.  Routers will perform both A and AAAA requests when
   resolving a name so that the client can utilize an IPv6 endpoint when
   available or preferred.

   To avoid potential problems with caching DNS servers that behave
   poorly all NXDOMAIN responses to AAAA requests should be ignored if a
   successful response is received for an A request.  This implies that
   both AAAA and A requests will always be performed for each name
   resolution.
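
   One possible reading of this rule is sketched below. The Answer
   type and the string response codes are assumptions made for
   illustration; they do not correspond to any existing Tor structure.

```python
from typing import List, NamedTuple, Optional


class Answer(NamedTuple):
    rcode: str            # "NOERROR" or "NXDOMAIN"; other codes omitted here
    addresses: List[str]  # addresses carried by the response, possibly empty


def combine_answers(a: Answer, aaaa: Answer) -> Optional[List[str]]:
    """Merge the results of the A and AAAA queries for one name.

    Returns the usable addresses, or None if neither query succeeded.
    """
    a_ok = a.rcode == "NOERROR" and bool(a.addresses)
    aaaa_ok = aaaa.rcode == "NOERROR" and bool(aaaa.addresses)
    if a_ok:
        # The A request succeeded, so an NXDOMAIN for AAAA is ignored.
        return list(a.addresses) + (list(aaaa.addresses) if aaaa_ok else [])
    if aaaa_ok:
        return list(aaaa.addresses)
    return None
```

   The point of the rule is that a misbehaving cache answering
   NXDOMAIN to the AAAA query cannot make a name with valid A records
   appear not to exist.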

   For reverse lookups on IPv6 addresses, like those used for
   RESOLVE_PTR, Tor will perform the necessary PTR requests via
   IP6.ARPA.
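
   For illustration, the name queried in such a PTR request can be
   derived with Python's standard ipaddress module, as sketched below.
   The helper name is hypothetical; the nibble-reversed ip6.arpa form
   is the standard one.

```python
import ipaddress


def ptr_name(addr: str) -> str:
    """Return the reverse-lookup name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(addr.strip("[]")).reverse_pointer
```

   For an IPv6 address the result consists of the 32 hexadecimal
   nibbles of the address in reverse order, separated by dots and
   followed by "ip6.arpa"; for IPv4 it is the familiar in-addr.arpa
   form.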

   All routers which perform DNS resolution on behalf of clients
   (RELAY_RESOLVE) should perform and respond with both A and AAAA
   resources.

   [NOTE: In a future version, when we extend the behavior of RESOLVE to
    encapsulate more of real DNS, it will make sense to allow more
    flexibility here. -nickm]

1.4. Client interaction with IPv6 exit capability

1.4.1. Usability goals

   There are a number of behaviors which Tor can provide when
   interacting with clients that will improve the usability of IPv6 exit
   capability.  These behaviors are designed to make it simple for
   clients to express a preference for IPv6 transport and utilize IPv6
   host services.

1.4.2. SOCKSv5 IPv6 client behavior

   The SOCKS version 5 protocol supports IPv6 connections.  When using
   SOCKSv5 with hostnames it is difficult to determine if a client
   wishes to use an IPv4 or IPv6 address to connect to the desired host
   if it resolves to both address types.

   In order to make this more intuitive the SOCKSv5 protocol can be
   supported on a local IPv6 endpoint, [::1] port 9050 for example.
   When a client requests a connection to the desired host via an IPv6
   SOCKS connection Tor will prefer IPv6 addresses when resolving the
   host name and connecting to the host.

   Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS
   connection will return IPv6 addresses when available, and fall back
   to IPv4 addresses if not.
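
   The preference described above amounts to ordering the resolved
   addresses by the address family of the listener the client used. A
   small sketch, with a hypothetical function name, follows.

```python
import ipaddress
from typing import List


def order_by_listener_family(addresses: List[str], listener_addr: str) -> List[str]:
    """Put addresses of the listener's family first, keeping relative order.

    If no address of the preferred family exists, the other family is
    still returned, which gives the fallback behavior.
    """
    preferred = ipaddress.ip_address(listener_addr.strip("[]")).version
    # sorted() is stable, and False sorts before True.
    return sorted(addresses,
                  key=lambda a: ipaddress.ip_address(a).version != preferred)
```

   A client connecting to the SOCKS listener on [::1] would thus get
   IPv6 addresses first and IPv4 addresses only as a fallback.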

   [NOTE: This means that SocksListenAddress and DNSListenAddress should
    support IPv6 addresses.  Perhaps there should also be a general option
    to have listeners that default to 127.0.0.1 and 0.0.0.0 listen
    additionally or instead on ::1 and :: -nickm]

1.4.3. MAPADDRESS behavior

   The MAPADDRESS capability supports clients that may not be able to
   use the SOCKSv4a or SOCKSv5 hostname support to resolve names via
   Tor.  This ability should be extended to IPv6 addresses in SOCKSv5 as
   well.

   When a client requests an address mapping from the wildcard IPv6
   address, [::0], the server will respond with a unique local IPv6
   address on success.  It is important to note that there may be two
   mappings for the same name if both an IPv4 and IPv6 address are
   associated with the host.  In this case a CONNECT to a mapped IPv6
   address should prefer IPv6 for the connection to the host, if
   available, while CONNECT to a mapped IPv4 address will prefer IPv4.

   It should be noted that IPv6 does not provide the concept of a host
   local subnet, like 127.0.0.0/8 in IPv4.  For this reason integration
   of Tor with IPv6 clients should consider a firewall or filter rule to
   drop unique local addresses to or from the network when possible.
   These packets should not be routed in any case, but keeping them off
   the subnet entirely is worthwhile.

1.4.3.1. Generating unique local IPv6 addresses

   The usual manner of generating a unique local IPv6 address is to
   select a Global ID part randomly, along with a Subnet ID, and sharing
   this prefix among the communicating parties who each have their own
   distinct Interface ID.  In this style a given Tor instance might
   select a random Global and Subnet ID and provide MAPADDRESS
   assignments with a random Interface ID as needed.  This has the
   potential to associate unique Global/Subnet identifiers with a given
   Tor instance and may expose attacks against the anonymity of Tor
   users.

   To avoid this potential problem entirely MAPADDRESS must always
   generate the Global, Subnet, and Interface IDs randomly for each
   request.  It is also highly suggested that explicitly specifying an
   IPv6 source address instead of the wildcard address not be supported
   to ensure that a good random address is used.
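
   A sketch of per-request generation follows. It assumes the
   locally assigned fd00::/8 half of the unique local range from RFC
   4193 and draws all remaining bits from a cryptographic random
   source; the function name is hypothetical.

```python
import ipaddress
import secrets


def random_unique_local_address() -> ipaddress.IPv6Address:
    """Return a fresh unique local address with every field randomized."""
    # fd00::/8 is the FC00::/7 prefix with the "locally assigned" bit set.
    prefix = 0xFD << 120
    # The remaining 120 bits hold the 40-bit Global ID, the 16-bit
    # Subnet ID and the 64-bit Interface ID, all chosen at random.
    return ipaddress.IPv6Address(prefix | secrets.randbits(120))
```

   Because nothing is reused between calls, two mappings handed out by
   the same Tor instance share no identifying prefix.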

1.4.4. DNSProxy IPv6 client behavior

   A new capability in recent Tor versions is the transparent DNS proxy.
   This feature will need to return both A and AAAA resource records
   when responding to client name resolution requests.

   The transparent DNS proxy should also support reverse lookups for
   IPv6 addresses.  It is suggested that any such requests to the
   deprecated IP6.INT domain should be translated to IP6.ARPA instead.
   This translation is not likely to be used and is of low priority.

   It would be nice to support DNS over IPv6 transport as well, however,
   this is not likely to be used and is of low priority.

1.4.5. TransPort IPv6 client behavior

   Tor also provides transparent TCP proxy support via the Trans*
   directives in the configuration.  The TransListenAddress directive
   should accept an IPv6 address in addition to IPv4 so that IPv6 TCP
   connections can be transparently proxied.

1.5. Additional changes

   The RedirectExit option should be deprecated rather than extending
   this feature to IPv6.


2. Spec changes

2.1. Tor specification

   In '6.2. Opening streams and transferring data' the following should
   be changed to indicate IPv6 exit capability:

      "No version of Tor currently generates the IPv6 format."

   In '6.4. Remote hostname lookup' the following should be updated to
   reflect use of ip6.arpa in addition to in-addr.arpa.

      "For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an
       in-addr.arpa address."

   In 'A.1. Differences between spec and implementation' the following
   should be updated to indicate IPv6 exit capability:

      "The current codebase has no IPv6 support at all."

   [NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an
    ipv6 address, but doesn't say how.  We may want a separate EXITPOLICY2
    type that can hold an ipv6 address, since the way we encode ipv6
    addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6")
    is a bit dumb. -nickm]
   [Actually, the length field lets us distinguish EXITPOLICY. -nickm]

2.2. Directory specification

   In '2.1. Router descriptor format' a new set of directives is needed
   for IPv6 exit policy.  The existing accept/reject directives should
   be clarified to indicate IPv4 or wildcard address relevance.  The new
   IPv6 directives will be in the form of:

      "accept6" exitpattern NL
      "reject6" exitpattern NL

   The section describing accept6/reject6 should explain that the
   presence of accept6 or reject6 exit policies in a router descriptor
   signals the ability of that router to exit IPv6 traffic (according to
   IPv6 exit policies).

   The "[::]/0" notation is used to represent "all IPv6 addresses".
   "[::0]/0" may also be used for this representation.

   If a user specifies a 'reject6 [::]/0:*' policy in the Tor
   configuration this will be interpreted as forcing no IPv6 exit
   support and no accept6/reject6 policies will be included in the
   published descriptor.  This will prevent IPv6 exit even if the router
   host has a global unicast IPv6 address present.

   It is important to note that a wildcard address in an accept or
   reject policy applies to both IPv4 and IPv6 addresses.

2.3. Control specification

   In '3.8. MAPADDRESS' the potential to have two addresses for a given
   name should be explained.  The method for generating unique local
   addresses for IPv6 mappings needs explanation as described above.

   When IPv6 addresses are used in this document they should include the
   brackets for consistency.  For example, the null IPv6 address should
   be written as "[::0]" and not "::0".  The control commands will
   expect the same syntax as well.

   In '3.9. GETINFO' the "address" command should return both public
   IPv4 and IPv6 addresses if present.  These addresses should be
   separated via \r\n.


2.4. Tor SOCKS extensions

   In '2. Name lookup' a description of IPv6 address resolution is
   needed for SOCKSv5 as described above.  IPv6 addresses should be
   supported in both the RESOLVE and RESOLVE_PTR extensions.

   A new section describing the ability to accept SOCKSv5 clients on a
   local IPv6 address to indicate a preference for IPv6 transport as
   described above is also needed.  The behavior of Tor SOCKSv5 proxy
   with an IPv6 preference should be explained, for example, preferring
   IPv6 transport to a named host with both IPv4 and IPv6 addresses
   available (A and AAAA records).


3. Questions and concerns

3.1. DNS A6 records

   A6 is explicitly avoided in this document.  There are potential
   reasons for implementing this, however, the inherent complexity of
   the protocol and resolvers make this unappealing.  Is there a
   compelling reason to consider A6 as part of IPv6 exit support?

   [IMO not till anybody needs it. -nickm]

3.2. IPv4 and IPv6 preference

   The design above tries to infer a preference for IPv4 or IPv6
   transport based on client interactions with Tor.  It might be useful
   to provide more explicit control over this preference.  For example,
   an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts
   in CONNECT requests while the current implementation would assume an
   IPv4 preference.  Should more explicit control be available, through
   either configuration directives or control commands?

   Many applications support an inet6-only or prefer-family type option
   that provides the user manual control over address preference.  This
   could be provided as a Tor configuration option.

   An explicit preference is still possible by resolving names and then
   CONNECTing to an IPv4 or IPv6 address as desired, however, not all
   client applications may have this option available.

3.3. Support for IPv6 only transparent proxy clients

   It may be useful to support IPv6 only transparent proxy clients using
   IPv4 mapped IPv6 like addresses.  This would require transparent DNS
   proxy using IPv6 transport and the ability to map A record responses
   into IPv4 mapped IPv6 like addresses in the manner described in the
   "NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG.  The
   transparent TCP proxy would thus need to detect these mapped addresses
   and connect to the desired IPv4 host.

   The IPv6 prefix used for this purpose must not be the actual IPv4
   mapped IPv6 address prefix, though the manner in which IPv4 addresses
   are embedded in IPv6 addresses would be the same.

   The lack of any IPv6 only hosts which would use this transparent proxy
   method makes this a lot of work for very little gain.  Is there a
   compelling reason to support this NAT-PT like capability?

3.4. IPv6 DNS and older Tor routers

   It is expected that many routers will continue to run with older
   versions of Tor when the IPv6 exit capability is released.  Clients
   who wish to use IPv6 will need to route RELAY_RESOLVE requests to the
   newer routers which will respond with both A and AAAA resource
   records when possible.

   One way to do this is to route RELAY_RESOLVE requests to routers with
   IPv6 exit policies published, however, this would not utilize current
   routers that can resolve IPv6 addresses even if they can't exit such
   traffic.

   There was also concern expressed about the ability of existing clients
   to cope with new RELAY_RESOLVE responses that contain IPv6 addresses.
   If this breaks backward compatibility, a new request type may be
   necessary, like RELAY_RESOLVE6, or some other mechanism of indicating
   the ability to parse IPv6 responses when making the request.

3.5. IPv4 and IPv6 bindings in MAPADDRESS

   It may be troublesome to try and support two distinct address mappings
   for the same name in the existing MAPADDRESS implementation.  If this
   cannot be accommodated then the behavior should replace existing
   mappings with the new address regardless of family.  A warning when
   this occurs would be useful to assist clients who encounter problems
   when both an IPv4 and IPv6 application are using MAPADDRESS for the
   same names concurrently, causing lost connections for one of them.

4. Addendum

4.1. Sample IPv6 default exit policy

   reject 0.0.0.0/8
   reject 169.254.0.0/16
   reject 127.0.0.0/8
   reject 192.168.0.0/16
   reject 10.0.0.0/8
   reject 172.16.0.0/12
   reject6 [0000::]/8
   reject6 [0100::]/8
   reject6 [0200::]/7
   reject6 [0400::]/6
   reject6 [0800::]/5
   reject6 [1000::]/4
   reject6 [4000::]/3
   reject6 [6000::]/3
   reject6 [8000::]/3
   reject6 [A000::]/3
   reject6 [C000::]/3
   reject6 [E000::]/4
   reject6 [F000::]/5
   reject6 [F800::]/6
   reject6 [FC00::]/7
   reject6 [FE00::]/9
   reject6 [FE80::]/10
   reject6 [FEC0::]/10
   reject6 [FF00::]/8
   reject *:25
   reject *:119
   reject *:135-139
   reject *:445
   reject *:1214
   reject *:4661-4666
   reject *:6346-6429
   reject *:6699
   reject *:6881-6999
   accept *:*
   # accept6 [2000::]/3:* is implied
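
   To illustrate how such a policy is applied, here is a small sketch
   of a first-match evaluator. It is not Tor's parser: the function
   names are hypothetical, a rule without a port part is assumed to
   cover all ports, and falling through to "accept" when no rule
   matches is likewise an assumption of the sketch.

```python
import ipaddress


def parse_rule(line):
    """Parse 'accept|reject|accept6|reject6 ADDR[/BITS][:PORTS]'."""
    action, pattern = line.split()
    if pattern.startswith("["):
        # Bracketed IPv6 address: "[FC00::]/7:*" -> "FC00::/7" and "*".
        host, _, rest = pattern[1:].partition("]")
        mask, _, ports = rest.partition(":")
        host += mask
    else:
        host, _, ports = pattern.partition(":")
    if ports in ("", "*"):
        lo, hi = 1, 65535
    else:
        lo, _, hi = ports.partition("-")
        lo, hi = int(lo), int(hi or lo)
    net = None if host == "*" else ipaddress.ip_network(host, strict=False)
    return action.rstrip("6"), action.endswith("6"), net, lo, hi


def exit_allowed(rules, addr, port):
    """Apply the rules in order; the first matching rule decides."""
    ip = ipaddress.ip_address(addr)
    for action, v6_only, net, lo, hi in rules:
        if v6_only and ip.version != 6:
            continue
        if net is not None and (net.version != ip.version or ip not in net):
            continue
        if not lo <= port <= hi:
            continue
        return action == "accept"
    return True  # assumption: no matching rule means accept
```

   Note that a wildcard rule such as "reject *:25" matches IPv6
   destinations as well as IPv4 ones, as required in section 2.2.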

4.2. Additional resources

   'DNS Extensions to Support IP Version 6'
   http://www.ietf.org/rfc/rfc3596.txt

   'DNS Extensions to Support IPv6 Address Aggregation and Renumbering'
   http://www.ietf.org/rfc/rfc2874.txt

   'SOCKS Protocol Version 5'
   http://www.ietf.org/rfc/rfc1928.txt

   'Unique Local IPv6 Unicast Addresses'
   http://www.ietf.org/rfc/rfc4193.txt

   'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE'
   http://www.iana.org/assignments/ipv6-address-space

   'Network Address Translation - Protocol Translation (NAT-PT)'
   http://www.ietf.org/rfc/rfc2766.txt
Filename: 118-multiple-orports.txt
Title: Advertising multiple ORPorts at once
Author: Nick Mathewson
Created: 09-Jul-2007
Status: Superseded
Superseded-By: 186-multiple-orports.txt

[Needs Revision: This proposal needs revision to come up to 2011 standards
and take microdescriptors into account.]

Overview:

   This document is a proposal for servers to advertise multiple
   address/port combinations for their ORPort.

Motivation:

   Sometimes servers want to support multiple ports for incoming
   connections, either in order to support multiple address families, to
   better use multiple interfaces, or to support a variety of
   FascistFirewallPorts settings.  This is easy to set up now, but
   there's no way to advertise it to clients.

New descriptor syntax:

   We add a new line in the router descriptor, "or-address".  This line
   can occur zero, one, or multiple times.  Its format is:

      or-address SP ADDRESS ":" PORTLIST NL

      ADDRESS = IPV6ADDR / IPV4ADDR
      IPV6ADDR = an ipv6 address, surrounded by square brackets.
      IPV4ADDR = an ipv4 address, represented as a dotted quad.
      PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
      PORTSPEC = PORT | PORT "-" PORT

   [This is the regular format for specifying sets of addresses and
   ports in Tor.]
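
   As an illustration of the syntax, a parser for this line could be
   sketched as follows. The function name and the returned shape (an
   address plus a list of inclusive port ranges) are assumptions made
   for the example.

```python
import ipaddress


def parse_or_address(line):
    """Parse 'or-address ADDRESS:PORTLIST' into (address, [(lo, hi), ...])."""
    keyword, _, rest = line.rstrip("\n").partition(" ")
    if keyword != "or-address":
        raise ValueError("not an or-address line")
    # The port list follows the last colon; IPv6 colons sit inside brackets.
    addr_text, sep, portlist = rest.rpartition(":")
    if not sep:
        raise ValueError("missing port list")
    if addr_text.startswith("[") and addr_text.endswith("]"):
        address = ipaddress.IPv6Address(addr_text[1:-1])
    else:
        address = ipaddress.IPv4Address(addr_text)
    ranges = []
    for spec in portlist.split(","):
        lo, _, hi = spec.partition("-")
        lo, hi = int(lo), int(hi or lo)  # a single PORT becomes (p, p)
        if not 1 <= lo <= hi <= 65535:
            raise ValueError("bad port spec: " + spec)
        ranges.append((lo, hi))
    return address, ranges
```

   For example, "or-address [2001:db8::1]:9001,9030-9040" yields the
   IPv6 address 2001:db8::1 with the ranges (9001, 9001) and
   (9030, 9040).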

New OR behavior:

   We add two more options to supplement ORListenAddress:
   ORPublishedListenAddress, and ORPublishAddressSet.  The former
   listens on an address-port combination and publishes it in addition
   to the regular address.  The latter advertises a set of address-port
   combinations, but does not listen on them.  [To use this option, the
   server operator should set up port forwarding to the regular ORPort,
   as for example with firewall rules.]

   Servers should extend their testing to include advertised addresses
   and ports.  No address or port should be advertised until it's been
   tested.  [This might get expensive in practice.]

New authority behavior:

   Authorities should spot-test descriptors, and reject any where a
   substantial part of the addresses can't be reached.

New client behavior:

   When connecting to another server, clients SHOULD pick an
   address-port combination at random as supported by their
   reachableaddresses.  If a client has a connection to a server at one
   address, it SHOULD use that address for any simultaneous connections
   to that server.  Clients SHOULD use the canonical address for any
   server when generating extend cells.

Not addressed here:

   * There's no reason to listen on multiple dirports; current Tors
   mostly don't connect directly to the dirport anyway.

   * It could be advantageous to list something about extra addresses in
   the network-status document.  This would, however, eat space there.
   More analysis is needed, particularly in light of proposal 141
   ("Download server descriptors on demand").

Dependencies:

   Testing for canonical connections needs to be implemented before it's
   safe to use this proposal.


Notes 3 July:
  - Write up the simple version of this.  No ranges needed yet.  No
    networkstatus changes yet.

Filename: 119-controlport-auth.txt
Title: New PROTOCOLINFO command for controllers
Author: Roger Dingledine
Created: 14-Aug-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  Here we describe how to help controllers locate the cookie
  authentication file when authenticating to Tor, so we can a) require
  authentication by default for Tor controllers and b) still keep
  things usable.  Also, we propose an extensible, general-purpose mechanism
  for controllers to learn about a Tor instance's protocol and
  authentication requirements before authenticating.

The Problem:

  When we first added the controller protocol, we wanted to make it
  easy for people to play with it, so by default we didn't require any
  authentication from controller programs. We allowed requests only from
  localhost as a stopgap measure for security.

  Due to an increasing number of vulnerabilities based on this approach,
  it's time to add authentication in default configurations.

  We have a number of goals:
  - We want the default Vidalia bundles to transparently work. That
    means we don't want the users to have to type in or know a password.
  - We want to allow multiple controller applications to connect to the
    control port. So if Vidalia is launching Tor, it can't just keep the
    secrets to itself.

  Right now there are three authentication approaches supported
  by the control protocol: NULL, CookieAuthentication, and
  HashedControlPassword. See Sec 5.1 in control-spec.txt for details.

  There are a couple of challenges here. The first is: if the controller
  launches Tor, how should we teach Tor what authentication approach
  it should require, and the secret that goes along with it? Next is:
  how should this work when the controller attaches to an existing Tor,
  rather than launching Tor itself?

  Cookie authentication seems most amenable to letting multiple controller
  applications interact with Tor. But that brings in yet another question:
  how does the controller guess where to look for the cookie file,
  without first knowing what DataDirectory Tor is using?

Design:

  We should add a new controller command PROTOCOLINFO that can be sent
  as a valid first command (the others being AUTHENTICATE and QUIT). If
  PROTOCOLINFO is sent as the first command, the second command must be
  either a successful AUTHENTICATE or a QUIT.

  If the initial command sequence is not valid, Tor closes the connection.


Spec:

  C:  "PROTOCOLINFO" *(SP PIVERSION) CRLF
  S:  "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF

    InfoLine = AuthLine / VersionLine / OtherLine

     AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
                       *(SP "COOKIEFILE=" AuthCookieFile) CRLF
     VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF

     AuthMethod =
      "NULL"           / ; No authentication is required
      "HASHEDPASSWORD" / ; A controller must supply the original password
      "COOKIE"           ; A controller must supply the contents of a cookie

     AuthCookieFile = QuotedString
     TorVersion = QuotedString

     OtherLine = "250-" Keyword [SP Arguments] CRLF

  For example:

  C: PROTOCOLINFO CRLF
  S: "250+PROTOCOLINFO 1" CRLF
  S: "250-AUTH METHODS=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF
  S: "250-VERSION Tor="0.2.0.5-alpha"" CRLF
  S: "250 OK" CRLF

  Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines
  with keywords they do not recognize.  Controllers MUST ignore extraneous
  data on any InfoLine.
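
  A controller-side parser for this reply might be sketched as below.
  This is an illustration under simplifying assumptions, not Tor or
  Vidalia code: the function and dictionary key names are hypothetical,
  and only the escapes \\ and \" are undone inside quoted strings.

```python
import re

# A double-quoted string in which a backslash escapes the next character.
_QUOTED = r'"((?:[^"\\]|\\.)*)"'


def _unquote(s):
    # Simplification: other escape sequences are left untouched.
    return re.sub(r'\\(["\\])', r'\1', s)


def parse_protocolinfo(reply):
    """Extract auth methods, cookie file and Tor version from a reply."""
    info = {"methods": [], "cookiefile": None, "tor_version": None}
    for line in reply.split("\r\n"):
        if line.startswith("250-AUTH "):
            m = re.search(r'METHODS=([A-Z,]+)', line)
            if m:
                info["methods"] = m.group(1).split(",")
            c = re.search(r'COOKIEFILE=' + _QUOTED, line)
            if c:
                info["cookiefile"] = _unquote(c.group(1))
        elif line.startswith("250-VERSION "):
            # Tolerate a version given with or without quotes.
            v = re.search(r'Tor="?([^"\s]+)"?', line)
            if v:
                info["tor_version"] = v.group(1)
        # InfoLines with unrecognized keywords are ignored.
    return info
```

  Searching for each field by name, rather than by position, means that
  reordered InfoLines and extra data on a line do not break parsing.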

  PIVERSION is there in case we drastically change the syntax one day. For
  now it should always be "1", for the controller protocol.  Controllers MAY
  provide a list of the protocol versions they support; Tor MAY select a
  version that the controller does not support.

  Right now only two "topics" (AUTH and VERSION) are included, but more
  may be included in the future. Controllers must accept lines with
  unexpected topics.

  AuthMethod is used to specify one or more control authentication
  methods that Tor currently accepts.

  AuthCookieFile specifies the absolute path and filename of the
  authentication cookie that Tor is expecting and is provided iff
  the METHODS field contains the method "COOKIE".  Controllers MUST handle
  escape sequences inside this string.

  The VERSION line contains the Tor version.

  [What else might we want to include that could be useful? -RD]

Compatibility:

  Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed
  command. Earlier Tors don't know about this command but don't hang
  up. That means controllers will need a mechanism for distinguishing
  whether they're talking to a Tor that speaks PROTOCOLINFO or not.

  I suggest that the controllers attempt a PROTOCOLINFO. Then:
    - If it works, great. Authenticate as required.
    - If they get hung up on, reconnect and do a NULL AUTHENTICATE.
    - If it's unrecognized but they're not hung up on, do a NULL
      AUTHENTICATE.
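
  The fallback logic above can be written down as a small decision
  function. The enum and the step names are hypothetical; they only
  restate the three cases listed.

```python
from enum import Enum


class ProbeResult(Enum):
    OK = "ok"                      # PROTOCOLINFO was answered
    HUNG_UP = "hung up"            # Tor closed the connection
    UNRECOGNIZED = "unrecognized"  # error reply, connection still open


def next_steps(result):
    """Return the controller's next actions after a PROTOCOLINFO attempt."""
    if result is ProbeResult.OK:
        return ["authenticate as advertised"]
    if result is ProbeResult.HUNG_UP:
        # Tor 0.1.2.16 and 0.2.0.4-alpha drop the connection.
        return ["reconnect", "null authenticate"]
    # Older Tors reject the command but keep the connection open.
    return ["null authenticate"]
```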

Unsolved problems:

  If Torbutton wants to be a Tor controller one day... talking TCP is
  bad enough, but reading from the filesystem is even harder. Is there
  a way to let simple programs work with the controller port without
  needing all the auth infrastructure?

  Once we put this approach in place, the next vulnerability we see will
  involve an attacker somehow getting read access to the victim's files
  --- and then we're back where we started. This means we still need
  to think about how to demand password-based authentication without
  bothering the user about it.

Filename: 120-shutdown-descriptors.txt
Title: Shutdown descriptors when Tor servers stop
Author: Roger Dingledine
Created: 15-Aug-2007
Status: Dead

[Proposal dead as of 11 Jul 2008. The point of this proposal was to give
routers a good way to get out of the networkstatus early, but proposal
138 (already implemented) has achieved this.]

Overview:

  Tor servers should publish a last descriptor whenever they shut down,
  to let others know that they are no longer offering service.

The Problem:

  The main reason for this is in reaction to Internet services that want
  to treat connections from the Tor network differently. Right now,
  if a user experiments with turning on the "relay" functionality, he
  is punished by being locked out of some websites, some IRC networks,
  etc --- and this lockout persists for several days even after he turns
  the server off.

Design:

  During the "slow shutdown" period if exiting, or shortly after the
  user sets his ORPort back to 0 if not exiting, Tor should publish a
  final descriptor with the following characteristics:

  1) Exit policy is listed as "reject *:*"
  2) It includes a new entry called "opt shutdown 1"

  The first step is so current blacklists will no longer list this node
  as exiting to whatever the service is.

  The second step is so directory authorities can avoid wasting time
  doing reachability testing. Authorities should automatically not list
  as Running any router whose latest descriptor says it shut down.

  [I originally had in mind a third step --- Advertised bandwidth capacity
  is listed as "0" --- so current Tor clients will skip over this node
  when building most circuits. But since clients won't fetch descriptors
  from nodes not listed as Running, this step seems pointless. -RD]

Spec:

  TBD but should be pretty straightforward.

Security issues:

  Now external people can learn exactly when a node stopped offering
  relay service. How bad is this? I can see a few minor attacks based
  on this knowledge, but on the other hand as it is we don't really take
  any steps to keep this information secret.

Overhead issues:

  We are creating more descriptors that want to be remembered. However,
  since the router won't be marked as Running, ordinary clients won't
  fetch the shutdown descriptors. Caches will, though. I hope this is ok.

Implementation:

  To make things easy, we should publish the shutdown descriptor only
  on controlled shutdown (SIGINT as opposed to SIGTERM). That would
  leave enough time for publishing that we probably wouldn't need any
  extra synchronization code.

  If that turns out to be too unintuitive for users, I could imagine doing
  it on SIGTERMs too, and just delaying exit until we had successfully
  published to at least one authority, at which point we'd hope that it
  propagated from there.

Acknowledgements:

  tup suggested this idea.

Comments:

  2) Maybe add a rule "Don't do this for hibernation if we expect to wake
     up before the next consensus is published"?
                                                      - NM 9 Oct 2007

Filename: 121-hidden-service-authentication.txt
Title: Hidden Service Authentication
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
        Christoph Weingarten
Created: 10-Sep-2007
Status: Closed
Implemented-In: 0.2.1.x

Change history:

  26-Sep-2007  Initial proposal for or-dev
  08-Dec-2007  Incorporated comments by Nick posted to or-dev on 10-Oct-2007
  15-Dec-2007  Rewrote complete proposal for better readability, modified
               authentication protocol, merged in personal notes
  24-Dec-2007  Replaced misleading term "authentication" by "authorization"
               and added some clarifications (comments by Sven Kaffille)
  28-Apr-2008  Updated most parts of the concrete authorization protocol
  04-Jul-2008  Add a simple algorithm to delay descriptor publication for
               different clients of a hidden service
  19-Jul-2008  Added INTRODUCE1V cell type (1.2), improved replay
               protection for INTRODUCE2 cells (1.3), described limitations
               for auth protocols (1.6), improved hidden service protocol
               without client authorization (2.1), added second, more
               scalable authorization protocol (2.2), rewrote existing
               authorization protocol (2.3); changes based on discussion
               with Nick
  31-Jul-2008  Limit maximum descriptor size to 20 kilobytes to prevent
               abuse.
  01-Aug-2008  Use first part of Diffie-Hellman handshake for replay
               protection instead of rendezvous cookie.
  01-Aug-2008  Remove improved hidden service protocol without client
               authorization (2.1). It might get implemented in proposal
               142.

Overview:

  This proposal deals with a general infrastructure for performing
  authorization (not necessarily implying authentication) of requests to
  hidden services at three points: (1) when downloading and decrypting
  parts of the hidden service descriptor, (2) at the introduction point,
  and (3) at Bob's Tor client before contacting the rendezvous point. A
  service provider will be able to restrict access to his service at these
  three points to authorized clients only. Further, the proposal contains
  specific authorization protocols as instances that implement the
  presented authorization infrastructure.

  This proposal is based on v2 hidden service descriptors as described in
  proposal 114 and introduced in version 0.2.0.10-alpha.

  The proposal is structured as follows: The next section motivates the
  integration of authorization mechanisms in the hidden service protocol.
  Then we describe a general infrastructure for authorization in hidden
  services, followed by specific authorization protocols for this
  infrastructure. At the end we discuss a number of attacks and non-attacks
  as well as compatibility issues.

Motivation:

  The majority of hidden services do not require client authorization
  now and won't do so in the future. To the contrary, many clients would
  not want to be (pseudonymously) identifiable by the service (though this
  is unavoidable to some extent), but rather use the service
  anonymously. These services are not addressed by this proposal.

  However, there may be certain services which are intended to be accessed
  by a limited set of clients only. A possible application might be a
  wiki or forum that should only be accessible to a closed user group.
  Another, less intuitive example might be a real-time communication
  service, where someone provides a presence and messaging service only to
  his buddies. Finally, a possible application would be a personal home
  server that should be remotely accessed by its owner.

  Performing authorization for a hidden service within the Tor network, as
  proposed here, offers a range of advantages compared to allowing all
  client connections in the first instance and deferring authorization to
  the transported protocol:

  (1) Reduced traffic: Unauthorized requests would be rejected as early as
  possible, thereby reducing the overall traffic in the network generated
  by establishing circuits and sending cells.

  (2) Better protection of service location: Unauthorized clients could not
  force Bob to create circuits to their rendezvous points, thus preventing
  the attack described by Øverlier and Syverson in their paper "Locating
  Hidden Servers" even without the need for guards.

  (3) Hiding activity: Apart from performing the actual authorization, a
  service provider could also hide the mere presence of his service from
  unauthorized clients by not providing hidden service descriptors to
  them, rejecting unauthorized requests already at the introduction
  point (ideally without leaking presence information at any of these
  points), or not answering unauthorized introduction requests.

  (4) Better protection of introduction points: When providing hidden
  service descriptors to authorized clients only and encrypting the
  introduction points as described in proposal 114, the introduction points
  would be unknown to unauthorized clients and thereby protected from DoS
  attacks.

  (5) Protocol independence: Authorization could be performed for all
  transported protocols, regardless of their own capabilities to do so.

  (6) Ease of administration: A service provider running multiple hidden
  services would be able to configure access at a single place uniformly
  instead of doing so for all services separately.

  (7) Optional QoS support: Bob could adapt his node selection algorithm
  for building the circuit to Alice's rendezvous point depending on a
  previously guaranteed QoS level, thus providing better latency or
  bandwidth for selected clients.

  A disadvantage of performing authorization within the Tor network is
  that a hidden service cannot make use of authorization data in
  the transported protocol. Tor hidden services were designed to be
  independent of the transported protocol. Therefore it's only possible to
  either grant or deny access to the whole service, but not to specific
  resources of the service.

  Authorization often implies authentication, i.e. proving one's identity.
  However, when performing authorization within the Tor network, untrusted
  points should not gain any useful information about the identities of
  communicating parties, neither server nor client. A crucial challenge is
  to remain anonymous towards directory servers and introduction points.
  However, trying to hide identity from the hidden service is a futile
  task, because a client would never know if he is the only authorized
  client and therefore perfectly identifiable. Therefore, hiding client
  identity from the hidden service is not an aim of this proposal.

  The current implementation of hidden services does not provide any kind
  of authorization. The hidden service descriptor version 2, introduced by
  proposal 114, was designed to use a descriptor cookie for downloading and
  decrypting parts of the descriptor content, but this feature is not yet
  in use. Further, most relevant cell formats specified in rend-spec
  contain fields for authorization data, but those fields are neither
  implemented nor do they suffice entirely.

Details:

  1. General infrastructure for authorization to hidden services

  We spotted three possible authorization points in the hidden service
  protocol:

    (1) when downloading and decrypting parts of the hidden service
        descriptor,
    (2) at the introduction point, and
    (3) at Bob's Tor client before contacting the rendezvous point.

  The general idea of this proposal is to allow service providers to
  restrict access to some or all of these points to authorized clients
  only.

  1.1. Client authorization at directory

  Since the implementation of proposal 114 it is possible to combine a
  hidden service descriptor with a so-called descriptor cookie. If so,
  the descriptor cookie becomes part of the descriptor ID, thus having an
  effect on the storage location of the descriptor. Someone who has learned
  about a service, but is not aware of the descriptor cookie, won't be able
  to determine the descriptor ID and download the current hidden service
  descriptor; he won't even know whether the service has uploaded a
  descriptor recently. Descriptor IDs are calculated as follows (see
  section 1.2 of rend-spec for the complete specification of v2 hidden
  service descriptors):

      descriptor-id =
          H(service-id | H(time-period | descriptor-cookie | replica))

  Currently, service-id is equivalent to permanent-id which is calculated
  as in the following formula. But in principle it could be any public
  key.

      permanent-id = H(permanent-key)[:10]
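
  A minimal sketch of the two formulas above. It assumes that H is SHA-1,
  that time-period is encoded as a 4-octet big-endian integer, and that
  replica is a single octet; these encodings are assumptions taken to be
  those of rend-spec and are not stated in this proposal. The function
  names are hypothetical.

```python
import hashlib
import struct


def permanent_id(permanent_key: bytes) -> bytes:
    """permanent-id = H(permanent-key)[:10]"""
    return hashlib.sha1(permanent_key).digest()[:10]


def descriptor_id(service_id: bytes, time_period: int,
                  descriptor_cookie: bytes, replica: int) -> bytes:
    """descriptor-id = H(service-id | H(time-period | cookie | replica))"""
    inner = hashlib.sha1(
        struct.pack(">I", time_period)   # assumed 4-octet big-endian
        + descriptor_cookie
        + bytes([replica])               # assumed single octet
    ).digest()
    return hashlib.sha1(service_id + inner).digest()
```

  Without the descriptor cookie the inner hash cannot be computed, which
  is why someone who only knows the service ID cannot locate the
  descriptor.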

  The second purpose of the descriptor cookie is to encrypt the list of
  introduction points, including optional authorization data. Hence, the
  hidden service directories won't learn any introduction information from
  storing a hidden service descriptor. This feature is implemented but
  unused at the moment. So this proposal will harness the advantages
  of proposal 114.

  The descriptor cookie can be used for authorization by keeping it secret
  from everyone but authorized clients. A service could then decide whether
  to publish hidden service descriptors using that descriptor cookie later
  on. An authorized client being aware of the descriptor cookie would be
  able to download and decrypt the hidden service descriptor.

  The number of concurrently used descriptor cookies for one hidden service
  is not restricted. A service could use a single descriptor cookie for all
  users, a distinct cookie per user, or something in between, like one
  cookie per group of users. It is up to the specific protocol and how it
  is applied by a service provider.

  Two or more hidden service descriptors for different groups or users
  should not be uploaded at the same time. A directory node could conclude
  easily that the descriptors were issued by the same hidden service, thus
  being able to link the two groups or users. Therefore, descriptors for
  different users or clients that ought to be stored on the same directory
  are delayed, so that only one descriptor is uploaded to a directory at a
  time. The remaining descriptors are uploaded with a delay of up to
  30 seconds.
  Further, descriptors for different groups or users that are to be stored
  on different directories are delayed for a random time of up to 30
  seconds to hide relations from colluding directories. Certainly, this
  does not prevent linking entirely, but it makes it somewhat harder.
  There is a conflict between hiding links between clients and making a
  service available in a timely manner.

  Although this part of the proposal is meant to describe a general
  infrastructure for authorization, changing the way of using the
  descriptor cookie to look up hidden service descriptors, e.g. applying
  some sort of asymmetric crypto system, would require in-depth changes
  that would be incompatible with v2 hidden service descriptors. In
  contrast, using another key for en-/decrypting the introduction point
  part of a hidden service descriptor, e.g. a different symmetric key or
  asymmetric encryption, would be easy to implement and compatible with v2
  hidden service descriptors as understood by hidden service directories
  (clients and services would have to be upgraded anyway for using the new
  features).

  An adversary could try to abuse the fact that introduction points can be
  encrypted by storing arbitrary, unrelated data in the hidden service
  directory. This abuse can be limited by setting a hard descriptor size
  limit, forcing the adversary to split data into multiple chunks. There
  are some limitations that make splitting data across multiple descriptors
  unattractive: 1) The adversary would not be able to choose descriptor IDs
  freely and would therefore have to implement his own indexing
  structure. 2) Validity of descriptors is limited to at most 24 hours
  after which descriptors need to be republished.

  The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data.
  A large descriptor with 7 introduction points and 5 kilobytes of
  authorization data would be 11724 bytes in size. The upper size limit of
  descriptors should be set to 20 kilobytes, which limits the effect of
  abuse while retaining enough flexibility in designing authorization
  protocols.
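
  The size estimate above can be checked with a few lines of arithmetic.
  The sketch assumes that "kilobytes" means multiples of 1024 bytes, which
  is consistent with the 11724-byte figure; the function names are
  hypothetical.

```python
MAX_DESCRIPTOR_SIZE = 20 * 1024  # assumption: 20 kilobytes = 20480 bytes


def descriptor_size(num_ipos, auth_data):
    """Regular descriptor size in bytes, using the formula from the text."""
    return 745 + num_ipos * 837 + auth_data


def within_limit(num_ipos, auth_data):
    """True if a descriptor of this shape stays under the proposed limit."""
    return descriptor_size(num_ipos, auth_data) <= MAX_DESCRIPTOR_SIZE


# The large descriptor from the text: 7 intro points, 5 kilobytes auth data.
print(descriptor_size(7, 5 * 1024))  # 11724
```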

  1.2. Client authorization at introduction point

  The next possible authorization point after downloading and decrypting
  a hidden service descriptor is the introduction point. It may be important
  for authorization, because it offers the last chance to hide the
  presence of a hidden service from unauthorized clients. Further, performing
  authorization at the introduction point might reduce traffic in the
  network, because unauthorized requests would not be passed to the
  hidden service. This applies to those clients who are aware of a
  descriptor cookie and thereby of the hidden service descriptor, but do
  not have authorization data to pass the introduction point or access the
  service (such a situation might occur when authorization data for
  authorization at the directory is not issued on a per-user basis, but
  authorization data for authorization at the introduction point is).

  It is important to note that the introduction point must be considered
  untrustworthy, and therefore cannot replace authorization at the hidden
  service itself. Nor should the introduction point learn any sensitive
  identifiable information from either the service or the client.

  In order to perform authorization at the introduction point, three
  message formats need to be modified: (1) v2 hidden service descriptors,
  (2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells.

  A v2 hidden service descriptor needs to contain authorization data that
  is introduction-point-specific and sometimes also authorization data
  that is introduction-point-independent. Therefore, v2 hidden service
  descriptors as specified in section 1.2 of rend-spec already contain two
  reserved fields "intro-authorization" and "service-authorization"
  (originally, the names of these fields were "...-authentication")
  containing an authorization type number and arbitrary authorization
  data. We propose that authorization data consists of base64 encoded
  objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and
  "-----END MESSAGE-----". This will increase the size of hidden service
  descriptors, which is acceptable as long as they stay within the size
  limit proposed in section 1.1.

  The current ESTABLISH_INTRO cells as described in section 1.3 of
  rend-spec do not contain either authorization data or version
  information. Therefore, we propose a new version 1 of the ESTABLISH_INTRO
  cell that adds both, as follows:

     V      Format byte: set to 255               [1 octet]
     V      Version byte: set to 1                [1 octet]
     KL     Key length                           [2 octets]
     PK     Bob's public key                    [KL octets]
     HS     Hash of session info                [20 octets]
     AUTHT  The auth type that is supported       [1 octet]
     AUTHL  Length of auth data                  [2 octets]
     AUTHD  Auth data                            [variable]
     SIG    Signature of above information       [variable]

  From the format it is possible to determine the maximum allowed size for
  authorization data: given the fact that cells are 512 octets long, of
  which 498 octets are usable (see section 6.1 of tor-spec), and assuming
  1024 bit = 128 octet long keys, there are 215 octets left for
  authorization data. Hence, authorization protocols are bound to use no
  more than these 215 octets, regardless of the number of clients that
  shall be authenticated at the introduction point. Otherwise, one would
  need to send multiple ESTABLISH_INTRO cells or split them up, which we do
  not specify here.
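
  The 215-octet figure can be reproduced by summing the fixed fields of
  the v1 ESTABLISH_INTRO layout. The sketch assumes that the signature is
  as long as the 1024-bit key (128 octets); the text does not state the
  signature length explicitly. The dictionary keys are labels chosen for
  this example.

```python
USABLE_CELL_OCTETS = 498  # usable octets per cell, as stated in the text
KEY_OCTETS = 128          # 1024-bit key

FIXED_FIELDS = {
    "format": 1,
    "version": 1,
    "KL": 2,
    "PK": KEY_OCTETS,
    "HS": 20,
    "AUTHT": 1,
    "AUTHL": 2,
    "SIG": KEY_OCTETS,    # assumption: signature length equals key length
}


def max_auth_data(usable=USABLE_CELL_OCTETS, fields=FIXED_FIELDS):
    """Octets left for AUTHD after all fixed-size fields are accounted for."""
    return usable - sum(fields.values())


print(max_auth_data())  # 215
```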

  In order to understand a v1 ESTABLISH_INTRO cell, a relay must run a
  sufficiently recent Tor version. Hidden services need to be able to
  identify relays that understand the new v1 cell formats and can perform
  authorization. We propose to use the version number
  that is contained in networkstatus documents to find capable
  introduction points.

  The current INTRODUCE1 cell as described in section 1.8 of rend-spec is
  not designed to carry authorization data and has no version number either.
  Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size,
  seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This
  makes it impossible to distinguish unversioned INTRODUCE1 cells from any
  later format. In particular, it is not possible to introduce some kind of
  format and version byte for newer versions of this cell. That's probably
  where the comment "[XXX011 want to put intro-level auth info here, but no
  version. crap. -RD]" that was part of rend-spec some time ago comes from.

  We propose that new versioned INTRODUCE1 cells use the new cell type 41
  RELAY_INTRODUCE1V (where V stands for versioned):

  Cleartext
     V      Version byte: set to 1                [1 octet]
     PK_ID  Identifier for Bob's PK             [20 octets]
     AUTHT  The auth type that is included        [1 octet]
     AUTHL  Length of auth data                  [2 octets]
     AUTHD  Auth data                            [variable]
  Encrypted to Bob's PK:
     (RELAY_INTRODUCE2 cell)

  The maximum length of contained authorization data depends on the length
  of the contained INTRODUCE2 cell. A calculation follows below when
  describing the INTRODUCE2 cell format we propose to use.

  1.3. Client authorization at hidden service

  The time when a hidden service receives an INTRODUCE2 cell constitutes
  the last possible authorization point during the hidden service
  protocol. Performing authorization here is easier than at the other two
  authorization points, because there are no possibly untrusted entities
  involved.

  In general, a client that is successfully authorized at the introduction
  point should be granted access at the hidden service, too. Otherwise, the
  client would receive a positive INTRODUCE_ACK cell from the introduction
  point and conclude that it may connect to the service, but the request
  will be dropped without notice. This would appear as a failure to
  clients. Therefore, the number of cases in which a client successfully
  passes the introduction point but fails at the hidden service should be
  zero. However, this does not lead to the conclusion that the
  authorization data used at the introduction point and the hidden service
  must be the same, but only that both authorization data should lead to
  the same authorization result.

  Authorization data is transmitted from client to server via an
  INTRODUCE2 cell that is forwarded by the introduction point. There are
  versions 0 to 2 specified in section 1.8 of rend-spec, but none of these
  contain fields for carrying authorization data. We propose a slightly
  modified version of v3 INTRODUCE2 cells that is specified in section
  1.8.1 and which is not implemented as of December 2007. In contrast to
  the specified v3 we avoid specifying (and implementing) IPv6 capabilities,
  because Tor relays will be required to support IPv4 addresses for a long
  time in the future, so that this seems unnecessary at the moment. The
  proposed format of v3 INTRODUCE2 cells is as follows:

     VER    Version byte: set to 3.               [1 octet]
     AUTHT  The auth type that is used            [1 octet]
     AUTHL  Length of auth data                  [2 octets]
     AUTHD  Auth data                            [variable]
     TS     Timestamp (seconds since 1-1-1970)   [4 octets]
     IP     Rendezvous point's address           [4 octets]
     PORT   Rendezvous point's OR port           [2 octets]
     ID     Rendezvous point identity ID        [20 octets]
     KLEN   Length of onion key                  [2 octets]
     KEY    Rendezvous point onion key        [KLEN octets]
     RC     Rendezvous cookie                   [20 octets]
     g^x    Diffie-Hellman data, part 1        [128 octets]

  The maximum possible length of authorization data is related to the
  enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with
  1024 bit = 128 octets long public key without any authorization data
  occupies 306 octets (AUTHL is only used when AUTHT has a value != 0),
  plus 58 octets for hybrid public key encryption (see
  section 5.1 of tor-spec on hybrid encryption of CREATE cells). The
  surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110
  of the 498 available octets free, which must be shared between
  authorization data to the introduction point _and_ to the hidden
  service.
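
  The budget calculation above, using only the figures quoted in the
  text, can be written out as follows. The constant and function names
  are hypothetical.

```python
USABLE_CELL_OCTETS = 498         # usable octets per cell
INTRODUCE2_V3_OCTETS = 306       # v3 INTRODUCE2 without auth data (from text)
HYBRID_ENCRYPTION_OCTETS = 58    # hybrid public key encryption overhead
INTRODUCE1V_HEADER_OCTETS = 24   # surrounding INTRODUCE1V cell


def shared_auth_budget():
    """Octets left for intro-point and service auth data combined."""
    return (USABLE_CELL_OCTETS - INTRODUCE2_V3_OCTETS
            - HYBRID_ENCRYPTION_OCTETS - INTRODUCE1V_HEADER_OCTETS)


def fits(intro_auth_len, service_auth_len):
    """True if both pieces of auth data fit into one cell together."""
    return intro_auth_len + service_auth_len <= shared_auth_budget()


print(shared_auth_budget())  # 110
```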

  When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has
  provided valid authorization data to him. He also requires that the
  timestamp is no more than 30 minutes in the past or future and that the
  first part of the Diffie-Hellman handshake has not been used in the past
  60 minutes to prevent replay attacks by rogue introduction points. (The
  reason for not using the rendezvous cookie to detect replays---even
  though it is only sent once in the current design---is that it might be
  desirable to re-use rendezvous cookies for multiple introduction requests
  in the future.) If all checks pass, Bob builds a circuit to the provided
  rendezvous point. Otherwise he drops the cell.
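
  The timestamp and replay checks could look roughly like the following
  sketch. It only covers the two time-based checks, not the validation of
  the authorization data itself, and the class and method names are
  hypothetical. Whether the boundaries of the two windows are inclusive
  is not specified in the text and is chosen arbitrarily here.

```python
import time

TIMESTAMP_WINDOW = 30 * 60  # timestamp at most 30 minutes off, in seconds
REPLAY_WINDOW = 60 * 60     # remember g^x values for 60 minutes


class ReplayCache:
    """Remembers recently seen first parts of the DH handshake."""

    def __init__(self):
        self._seen = {}  # g^x bytes -> time at which it was accepted

    def accept(self, timestamp, dh_part1, now=None):
        """Return True if the cell passes both checks, False to drop it."""
        now = time.time() if now is None else now
        # Forget handshake values older than the replay window.
        self._seen = {k: t for k, t in self._seen.items()
                      if now - t < REPLAY_WINDOW}
        if abs(now - timestamp) > TIMESTAMP_WINDOW:
            return False
        if dh_part1 in self._seen:
            return False
        self._seen[dh_part1] = now
        return True
```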

  1.4. Summary of authorization data fields

  In summary, the proposed descriptor format and cell formats provide the
  following fields for carrying authorization data:

  (1) The v2 hidden service descriptor contains:
      - a descriptor cookie that is used for the lookup process, and
      - an arbitrary encryption schema to ensure authorization to access
        introduction information (currently symmetric encryption with the
        descriptor cookie).

  (2) For performing authorization at the introduction point we can use:
      - the fields intro-authorization and service-authorization in
        hidden service descriptors,
      - a maximum of 215 octets in the ESTABLISH_INTRO cell, and
      - one part of 110 octets in the INTRODUCE1V cell.

  (3) For performing authorization at the hidden service we can use:
      - the fields intro-authorization and service-authorization in
        hidden service descriptors,
      - the other part of 110 octets in the INTRODUCE2 cell.

  It will also still be possible to access a hidden service without any
  authorization or only use a part of the authorization infrastructure.
  However, this requires considering all parts of the infrastructure. For
  example, authorization at the introduction point relying on confidential
  intro-authorization data transported in the hidden service descriptor
  cannot be performed without using an encryption schema for introduction
  information.

  1.5. Managing authorization data at servers and clients

  In order to provide authorization data at the hidden service and the
  authorized clients, we propose to use files---either the Tor
  configuration file or separate files. The exact format of these special
  files depends on the authorization protocol used.

  Currently, rend-spec contains the proposition to encode client-side
  authorization data in the URL, like in x.y.z.onion. This was never used
  and is also a bad idea, because in case of HTTP the requested URL may be
  contained in the Host and Referer fields.

  1.6. Limitations for authorization protocols

  There are two limitations of the current hidden service protocol for
  authorization protocols that shall be identified here.

    1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2
       restrict the amount of data that can be used for authorization.
       This forces authorization protocols that require per-user
       authorization data at the introduction point to restrict the number
       of authorized clients artificially. A possible solution could be to
       split contents among multiple cells and reassemble them at the
       introduction points.

    2. The current hidden service protocol does not specify cell types to
       perform interactive authorization between client and introduction
       point or hidden service. If there should be an authorization
       protocol that requires interaction, new cell types would have to be
       defined and integrated into the hidden service protocol.


  2. Specific authorization protocol instances

  In the following we present two specific authorization protocols that
  make use of (parts of) the new authorization infrastructure:

    1. The first protocol allows a service provider to restrict access
       to clients with a previously received secret key only, but does not
       attempt to hide service activity from others.

    2. The second protocol, albeit feasible only for a limited set of about
       16 clients, performs client authorization and hides service activity
       from everyone but the authorized clients.

  These two protocol instances extend the existing hidden service protocol
  version 2. Hidden services that perform client authorization may run in
  parallel to other services running versions 0, 2, or both.

  2.1. Service with large-scale client authorization

  The first client authorization protocol aims at performing access control
  while consuming as few additional resources as possible. A service
  provider should be able to permit access to a large number of clients
  while denying access for everyone else. However, the price for
  scalability is that the service won't be able to hide its activity from
  unauthorized or formerly authorized clients.

  The main idea of this protocol is to encrypt the introduction-point part
  in hidden service descriptors to authorized clients using symmetric keys.
  This ensures that nobody else but authorized clients can learn which
  introduction points a service currently uses, nor can someone send a
  valid INTRODUCE1 message without knowing the introduction key. Therefore,
  a subsequent authorization at the introduction point is not required.

  A service provider generates symmetric "descriptor cookies" for his
  clients and distributes them outside of Tor. The suggested key size is
  128 bits, so that descriptor cookies can be encoded in 22 base64 chars
  (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
  authorization type (here: "0") and allow a client to distinguish this
  authorization protocol from others like the one proposed below).
  Typically, the contact information for a hidden service using this
  authorization protocol looks like this:

    v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz

  When generating a hidden service descriptor, the service encrypts the
  introduction-point part with a single randomly generated symmetric
  128-bit session key using AES-CTR as described for v2 hidden service
  descriptors in rend-spec. Afterwards, the service encrypts the session
  key to all descriptor cookies using AES. An authorized client should be
  able to efficiently find the session key that is encrypted for him/her,
  so 4-octet client IDs are derived from the descriptor cookie and the
  initialization vector. Descriptors always contain a number of
  encrypted session keys that is a multiple of 16 by adding fake entries.
  Encrypted session keys are ordered by client IDs in order to conceal
  addition or removal of authorized clients by the service provider.

     ATYPE  Authorization type: set to 1.                      [1 octet]
     ALEN   Number of clients := 1 + ((clients - 1) div 16)    [1 octet]
   for each symmetric descriptor cookie:
     ID     Client ID: H(descriptor cookie | IV)[:4]          [4 octets]
     SKEY   Session key encrypted with descriptor cookie     [16 octets]
   (end of client-specific part)
     RND    Random data      [(15 - ((clients - 1) mod 16)) * 20 octets]
     IV     AES initialization vector                        [16 octets]
     IPOS   Intro points, encrypted with session key  [remaining octets]
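
  The padding rule in the layout above can be illustrated as follows. The
  sketch assumes that H is SHA-1 and treats one client entry as 20 octets
  (4 octets ID plus 16 octets SKEY); the function names are hypothetical.

```python
import hashlib

ENTRY_OCTETS = 4 + 16  # ID plus SKEY per client entry


def client_id(descriptor_cookie: bytes, iv: bytes) -> bytes:
    """Client ID: H(descriptor cookie | IV)[:4], assuming H is SHA-1."""
    return hashlib.sha1(descriptor_cookie + iv).digest()[:4]


def client_list_layout(clients: int):
    """Return (ALEN, number of fake entries, octets of RND padding)."""
    if clients < 1:
        raise ValueError("at least one client is required")
    alen = 1 + (clients - 1) // 16
    fake_entries = 15 - ((clients - 1) % 16)
    return alen, fake_entries, fake_entries * ENTRY_OCTETS
```

  For example, 17 clients give ALEN = 2 with 15 fake entries (300 octets
  of random data), so real and fake entries together always fill a
  multiple of 16 slots.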

  An authorized client needs to configure Tor to use the descriptor cookie
  when accessing the hidden service. Therefore, a user adds the contact
  information that she received from the service provider to her torrc
  file. Upon downloading a hidden service descriptor, Tor finds the
  encrypted introduction-point part and attempts to decrypt it using the
  configured descriptor cookie. (In the rare event of two or more client
  IDs being equal a client tries to decrypt all of them.)

  Upon sending the introduction, the client includes her descriptor cookie
  as auth type "1" in the INTRODUCE2 cell that she sends to the service.
  The hidden service checks whether the included descriptor cookie is
  authorized to access the service and either responds to the introduction
  request, or not.

  2.2. Authorization for limited number of clients

  A second, more sophisticated client authorization protocol goes the extra
  mile of hiding service activity from unauthorized clients. With all else
  being equal to the preceding authorization protocol, the second protocol
  publishes hidden service descriptors for each user separately and only
  needs to encrypt the introduction-point part of each descriptor to a
  single client. This allows the service to stop publishing descriptors for
  removed clients. As long as a removed client cannot link descriptors
  issued for other clients to the service, it cannot derive service
  activity any more. The downside of this approach is limited scalability.
  Even though the distributed storage of descriptors (cf. proposal 114)
  tackles the problem of limited scalability to a certain extent, this
  protocol should not be used for services with more than 16 clients. (In
  fact, Tor should refuse to advertise services for more than this number
  of clients.)

  A hidden service generates an asymmetric "client key" and a symmetric
  "descriptor cookie" for each client. The client key is used as
  replacement for the service's permanent key, so that the service uses a
  different identity for each of his clients. The descriptor cookie is used
  to store descriptors at changing directory nodes that are unpredictable
  for anyone but service and client, to encrypt the introduction-point
  part, and to be included in INTRODUCE2 cells. Once the service has
  created client key and descriptor cookie, he tells them to the client
  outside of Tor. The contact information string looks similar to the one
  used by the preceding authorization protocol (with the only difference
  that it has "1" encoded as auth-type in the remaining 4 of 132 bits
  instead of "0" as before).

  When creating a hidden service descriptor for an authorized client, the
  hidden service uses the client key and descriptor cookie to compute
  secret ID part and descriptor ID:

    secret-id-part = H(time-period | descriptor-cookie | replica)

    descriptor-id = H(client-key[:10] | secret-id-part)
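
  The two hash computations above can be sketched as follows. This is an
  illustration only: it assumes H is SHA-1, that time-period is encoded as
  a 4-octet big-endian integer and replica as a single octet, and that
  client-key is available as a byte string. None of these encodings are
  stated in this section.

```python
import hashlib


def secret_id_part(time_period: int, descriptor_cookie: bytes, replica: int) -> bytes:
    """secret-id-part = H(time-period | descriptor-cookie | replica)."""
    data = time_period.to_bytes(4, "big") + descriptor_cookie + bytes([replica])
    return hashlib.sha1(data).digest()  # SHA-1 is an assumption


def descriptor_id(client_key: bytes, time_period: int,
                  descriptor_cookie: bytes, replica: int) -> bytes:
    """descriptor-id = H(client-key[:10] | secret-id-part)."""
    secret = secret_id_part(time_period, descriptor_cookie, replica)
    return hashlib.sha1(client_key[:10] + secret).digest()
```

  Because the descriptor cookie enters the secret ID part, only parties
  that know the cookie can predict where the descriptor will be stored.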

  The hidden service also replaces permanent-key in the descriptor with
  client-key and encrypts introduction-points with the descriptor cookie.

     ATYPE  Authorization type: set to 2.                         [1 octet]
     IV     AES initialization vector                           [16 octets]
     IPOS   Intro points, encr. with descriptor cookie   [remaining octets]

  When uploading descriptors, the hidden service needs to make sure that
  descriptors for different clients are not uploaded at the same time (cf.
  Section 1.1) which is also a limiting factor for the number of clients.

  When a client is requested to establish a connection to a hidden service
  it looks up whether it has any authorization data configured for that
  service. If the user has configured authorization data for authorization
  protocol "2", the descriptor ID is determined as described in the last
  paragraph. Upon receiving a descriptor, the client decrypts the
  introduction-point part using its descriptor cookie. Further, the client
  includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
  it sends to the service.

  2.3. Hidden service configuration

  A hidden service that is meant to perform client authorization adds a
  new option HiddenServiceAuthorizeClient to its hidden service
  configuration. This option contains the authorization type which is
  either "1" for the protocol described in 2.1 or "2" for the protocol in
  2.2 and a comma-separated list of human-readable client names, so that
  Tor can create authorization data for these clients:

    HiddenServiceAuthorizeClient auth-type client-name,client-name,...

  If this option is configured, HiddenServiceVersion is automatically
  reconfigured to contain only version numbers of 2 or higher.
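
  A minimal sketch of how such an option line could be validated is shown
  below. The function name is hypothetical, and the 16-client check for
  auth type "2" follows the limit suggested in Section 2.2; rejecting
  duplicate names is an extra assumption.

```python
MAX_CLIENTS_AUTH_TYPE_2 = 16  # limit suggested in Section 2.2


def parse_authorize_client(line: str):
    """Parse 'HiddenServiceAuthorizeClient auth-type name,name,...'.

    Returns (auth_type, [client names]) or raises ValueError.
    """
    fields = line.split()
    if len(fields) != 3 or fields[0] != "HiddenServiceAuthorizeClient":
        raise ValueError("expected: HiddenServiceAuthorizeClient auth-type names")
    auth_type, names = fields[1], fields[2].split(",")
    if auth_type not in ("1", "2"):
        raise ValueError("auth-type must be 1 or 2")
    if any(not name for name in names):
        raise ValueError("empty client name")
    if len(set(names)) != len(names):
        raise ValueError("duplicate client name")  # assumption, not in the text
    if auth_type == "2" and len(names) > MAX_CLIENTS_AUTH_TYPE_2:
        raise ValueError("too many clients for authorization type 2")
    return int(auth_type), names
```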

  Tor stores all generated authorization data for the authorization
  protocols described in Sections 2.1 and 2.2 in a new file using the
  following file format:

     "client-name" human-readable client identifier NL
     "descriptor-cookie" 128-bit key ^= 22 base64 chars NL

  If the authorization protocol of Section 2.2 is used, Tor also generates
  and stores the following data:

     "client-key" NL a public key in PEM format

  2.4. Client configuration

  Clients need to make their authorization data known to Tor using another
  configuration option that contains a service name (mainly for the sake of
  convenience), the service address, and the descriptor cookie that is
  required to access a hidden service (the authorization protocol number is
  encoded in the descriptor cookie):

    HidServAuth service-name service-address descriptor-cookie
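
  The descriptor cookie is described above as a 128-bit key written as 22
  base64 characters. The sketch below shows that correspondence. It
  ignores the 4 spare bits of the last character (which Section 2.2 uses
  for the auth type), and the helper names are hypothetical.

```python
import base64


def encode_cookie(cookie: bytes) -> str:
    """Encode a 128-bit descriptor cookie as 22 base64 characters."""
    if len(cookie) != 16:
        raise ValueError("descriptor cookie must be 16 octets")
    # 16 octets encode to 24 characters ending in "=="; drop the padding.
    return base64.b64encode(cookie).decode("ascii").rstrip("=")


def decode_cookie(text: str) -> bytes:
    """Decode 22 base64 characters back into the 16-octet cookie."""
    if len(text) != 22:
        raise ValueError("expected 22 base64 characters")
    return base64.b64decode(text + "==")
```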

Security implications:

  In the following we want to discuss possible attacks by dishonest
  entities in the presented infrastructure and specific protocol. These
  security implications would have to be verified once more when adding
  another protocol. The dishonest entities (theoretically) include the
  hidden service itself, the authenticated clients, hidden service directory
  nodes, introduction points, and rendezvous points. The relays that are
  part of circuits used during protocol execution, but never learn about
  the exchanged descriptors or cells by design, are not considered.
  Obviously, this list makes no claim to be complete. The discussed attacks
  are sorted by the difficulty to perform them, in ascending order,
  starting with roles that everyone could attempt to take and ending with
  partially trusted entities abusing the trust put in them.

  (1) A hidden service directory could attempt to conclude presence of a
  service from the existence of a locally stored hidden service descriptor:
  This passive attack is possible only for a single client-service
  relation, because descriptors need to contain a publicly visible
  signature of the service using the client key.
  A possible protection would be to increase the number of hidden service
  directories in the network.

  (2) A hidden service directory could try to break the descriptor cookies
  of locally stored descriptors: This attack can be performed offline. The
  only useful countermeasure against it might be using safe passwords that
  are generated by Tor.

[passwords? where did those come in? -RD]

  (3) An introduction point could try to identify the pseudonym of the
  hidden service on behalf of which it operates: This is impossible by
  design, because the service uses a fresh public key for every
  establishment of an introduction point (see proposal 114) and the
  introduction point receives a fresh introduction cookie, so that there is
  no identifiable information about the service that the introduction point
  could learn. The introduction point cannot even tell if client accesses
  belong to the same client or not, nor can it know the total number of
  authorized clients. The only information might be the pattern of
  anonymous client accesses, but that is hardly enough to reliably identify
  a specific service.

  (4) An introduction point could want to learn the identities of accessing
  clients: This is also impossible by design, because all clients use the
  same introduction cookie for authorization at the introduction point.

  (5) An introduction point could try to replay a correct INTRODUCE1 cell
  to other introduction points of the same service, e.g. in order to force
  the service to create a huge number of useless circuits: This attack is
  not possible by design, because INTRODUCE1 cells are encrypted using a
  freshly created introduction key that is only known to authorized
  clients.

  (6) An introduction point could attempt to replay a correct INTRODUCE2
  cell to the hidden service, e.g. for the same reason as in the last
  attack: This attack is stopped by the fact that a service will drop
  INTRODUCE2 cells containing a DH handshake it has seen recently.

  (7) An introduction point could block client requests by sending either
  positive or negative INTRODUCE_ACK cells back to the client, but without
  forwarding INTRODUCE2 cells to the server: This attack is an annoyance
  for clients, because they might wait for a timeout to elapse until trying
  another introduction point. However, this attack is not introduced by
  performing authorization and it cannot be targeted towards a specific
  client. A countermeasure might be for the server to periodically perform
  introduction requests to his own service to see if introduction points
  are working correctly.

  (8) The rendezvous point could attempt to identify either server or
  client: This remains impossible as it was before, because the
  rendezvous cookie does not contain any identifiable information.

  (9) An authenticated client could swamp the server with valid INTRODUCE1
  and INTRODUCE2 cells, e.g. in order to force the service to create
  useless circuits to rendezvous points; as opposed to an introduction
  point replaying the same INTRODUCE2 cell, a client could include a new
  rendezvous cookie for every request: The countermeasure for this attack
  is the restriction to 10 connection establishments per client per hour.
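
  One way to enforce such a per-client limit is a sliding one-hour
  window, sketched below. The class and its interface are hypothetical,
  and the proposal does not say whether the hour is a sliding or a fixed
  window.

```python
from collections import defaultdict, deque


class IntroductionLimiter:
    """Allow at most `limit` introductions per client per `window` seconds."""

    def __init__(self, limit: int = 10, window: float = 3600.0):
        self.limit = limit
        self.window = window
        self._seen = defaultdict(deque)  # client -> timestamps of accepted requests

    def allow(self, client: str, now: float) -> bool:
        times = self._seen[client]
        # Forget requests that have fallen out of the window.
        while times and now - times[0] >= self.window:
            times.popleft()
        if len(times) >= self.limit:
            return False
        times.append(now)
        return True
```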

Compatibility:

  An implementation of this proposal would require changes to hidden
  services and clients to process authorization data and encode and
  understand the new formats. However, both services and clients would
  remain compatible with regular hidden services without authorization.

Implementation:

  The implementation of this proposal can be divided into a number of
  changes to hidden service and client side. There are no
  changes necessary on directory, introduction, or rendezvous nodes. All
  changes are marked with either [service] or [client] to denote on which
  side they need to be made.

  /1/ Configure client authorization [service]

  - Parse configuration option HiddenServiceAuthorizeClient containing
    authorized client names.
  - Load previously created client keys and descriptor cookies.
  - Generate missing client keys and descriptor cookies, add them to
    client_keys file.
  - Rewrite the hostname file.
  - Keep client keys and descriptor cookies of authorized clients in
    memory.
 [- In case of reconfiguration, mark which client authorizations were
    added and whether any were removed. This can be used later when
    deciding whether to rebuild introduction points and publish new
    hidden service descriptors. Not implemented yet.]

  /2/ Publish hidden service descriptors [service]

  - Create and upload hidden service descriptors for all authorized
    clients.
 [- See /1/ for the case of reconfiguration.]

  /3/ Configure permission for hidden services [client]

  - Parse configuration option HidServAuth containing service
    authorization, store authorization data in memory.

  /5/ Fetch hidden service descriptors [client]

  - Look up client authorization upon receiving a hidden service request.
  - Request hidden service descriptor ID including client key and
    descriptor cookie. Only request v2 descriptors, no v0.

  /6/ Process hidden service descriptor [client]

  - Decrypt introduction points with descriptor cookie.

  /7/ Create introduction request [client]

  - Include descriptor cookie in INTRODUCE2 cell to introduction point.
  - Pass descriptor cookie around between involved connections and
    circuits.

  /8/ Process introduction request [service]

  - Read descriptor cookie from INTRODUCE2 cell.
  - Check whether descriptor cookie is authorized for access, including
    checking access counters.
  - Log access for accountability.

Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
Implemented-In: 0.2.0.x

1. Overview:

  Tor's directory authorities can give certain servers a "Named" flag
  in the network-status entry, when they want to bind that nickname to
  that identity key. This allows clients to specify a nickname rather
  than an identity fingerprint and still be certain they're getting the
  "right" server. As dir-spec.txt describes it,

    Name X is bound to identity Y if at least one binding directory lists
    it, and no directory binds X to some other Y'.

  In practice, clients can refer to servers by nickname whether they are
  Named or not; if they refer to nicknames that aren't Named, a complaint
  shows up in the log asking them to use the identity key in the future
  --- but it still works.

  The problem? Imagine a Tor server with nickname Bob. Bob and his
  identity fingerprint are registered in tor26's approved-routers
  file, but none of the other authorities registered him. Imagine
  there are several other unregistered servers also with nickname Bob
  ("the imposters").

  While Bob is online, all is well: a) tor26 gives a Named flag to
  the real one, and refuses to list the other ones; and b) the other
  authorities list the imposters but don't give them a Named flag. Clients
  who have all the network-statuses can compute which one is the real Bob.

  But when the real Bob disappears and his descriptor expires? tor26
  continues to refuse to list any of the imposters, and the other
  authorities continue to list the imposters. Clients don't have any
  idea that there exists a Named Bob, so they can ask for server Bob and
  get one of the imposters. (A warning will also appear in their log,
  but so what.)

2. The stopgap solution:

  tor26 should start accepting and listing the imposters, but it should
  assign them a new flag: "Unnamed".

  This would produce three cases in terms of assigning flags in the consensus
  networkstatus:

  i) a router gets the Named flag in the v3 networkstatus if
    a) it's the only router with that nickname that has the Named flag
       out of all the votes, and
    b) no vote lists it as Unnamed
  else,
  ii) a router gets the Unnamed flag if
    a) some vote lists a different router with that nickname as Named, or
    b) at least one vote lists it as Unnamed, or
    c) there are other routers with the same nickname that are Unnamed
  else,
  iii) the router neither gets a Named nor an Unnamed flag.
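
  The three cases can be sketched as a function over the votes. The data
  layout is an assumption made for illustration: each vote is a dict
  mapping a router identity to a (nickname, flag) pair, where flag is
  "Named", "Unnamed" or None, and a router is assumed to use the same
  nickname in every vote.

```python
from collections import defaultdict


def consensus_name_flags(votes):
    """Return {identity: "Named" | "Unnamed" | None} for the consensus."""
    nickname_of = {}
    named_by_nick = defaultdict(set)  # nickname -> identities some vote Named
    voted_unnamed = set()             # identities some vote listed as Unnamed
    for vote in votes:
        for ident, (nick, flag) in vote.items():
            nickname_of[ident] = nick
            if flag == "Named":
                named_by_nick[nick].add(ident)
            elif flag == "Unnamed":
                voted_unnamed.add(ident)

    result = {}
    for ident, nick in nickname_of.items():
        named = named_by_nick[nick]
        if named == {ident} and ident not in voted_unnamed:
            result[ident] = "Named"      # case i
        elif (named - {ident}) or ident in voted_unnamed:
            result[ident] = "Unnamed"    # case ii a) or ii b)
        else:
            result[ident] = None         # case iii, unless ii c) applies

    # Case ii c): another router with the same nickname is Unnamed.
    unnamed_nicks = {nickname_of[i] for i, f in result.items() if f == "Unnamed"}
    for ident, flag in list(result.items()):
        if flag is None and nickname_of[ident] in unnamed_nicks:
            result[ident] = "Unnamed"
    return result
```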

  (This whole proposal is meant only for v3 dir flags; we shouldn't try
  to backport it to the v2 dir world.)

  Then client behavior is:

  a) If there's a Bob with a Named flag, pick that one.
  else b) If the Bobs don't have the Unnamed flag (notice that they should
          either all have it, or none), pick one of them and warn.
  else c) They all have the Unnamed flag -- no router found.
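
  The client rules a) to c) can be sketched as a lookup over consensus
  entries. The tuple layout and the function name are hypothetical; the
  second return value indicates whether the client should log a warning.

```python
def pick_router(nickname, routers):
    """routers: iterable of (identity, nickname, flag) consensus entries.

    Returns (identity or None, warn).
    """
    candidates = [r for r in routers if r[1] == nickname]
    # a) If there is a router with the Named flag, pick that one.
    for ident, _nick, flag in candidates:
        if flag == "Named":
            return ident, False
    # b) Otherwise pick one that lacks the Unnamed flag, and warn.
    unflagged = [r for r in candidates if r[2] != "Unnamed"]
    if unflagged:
        return unflagged[0][0], True
    # c) All candidates are Unnamed (or there are none): no router found.
    return None, False
```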

3. Problems not solved by this stopgap:

  3.1. Naming authorities can go offline.

  If tor26 is the only authority that provides a binding for Bob, when
  tor26 goes offline we're back in our previous situation -- the imposters
  can be referenced with a mere ignorable warning in the client's log.

  If some other authority Names a different Bob, and tor26 goes offline,
  then that other Bob becomes the unique Named Bob.

  So be it. We should try to solve these one day, but there's no clear way
  to do it that doesn't destroy usability in other ways, and if we want
  to get the Unnamed flag into v3 network statuses we should add it soon.

  3.2. V3 dir spec magnifies brief discrepancies.

  Another point to notice is if tor26 names Bob(1), doesn't know about
  Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag
  even if it should (and Bob(1) is not around).

  Right now, in v2 dirs, the case where an authority doesn't know about
  a server but the other authorities do know is rare. That's because
  authorities periodically ask for other networkstatuses and then fetch
  descriptors that are missing.

  With v3, if that window occurs at the wrong time, it is extended for the
  entire period. We could solve this by making the voting more complex,
  but that doesn't seem worth it.

  [3.3. Tor26 is only one tor26.

  We need more naming authorities, possibly with some kind of auto-naming
  feature.  This is out-of-scope for this proposal -NM]

4. Changes to the v2 directory

  Previously, v2 authorities that had a binding for a server named Bob did
  not list any other server named Bob.  This will change too:

  Version 2 authorities will start listing all routers they know about,
  whether they conflict with a name-binding or not:  Servers for which
  this authority has a binding will continue to be marked Named,
  additionally all other servers of that nickname will be listed without the
  Named flag (i.e. there will be no Unnamed flag in v2 status documents).

  Clients already should handle having a named Bob alongside unnamed
  Bobs correctly, and having the unnamed Bobs in the status file even
  without the named server is no worse than the current status quo where
  clients learn about those servers from other authorities.

  The benefit of this is that an authority's opinion on a server like
  Guard, Stable, Fast etc. can now be learned by clients even if that
  specific authority has reserved that server's name for somebody else.

5. Other benefits:

  This new flag will allow people to operate servers that happen to have
  the same nickname as somebody who registered their server two years ago
  and left soon after. Right now there are dozens of nicknames that are
  registered on all three binding directory authorities, yet haven't been
  running for years. While it's bad that these nicknames are effectively
  blacklisted from the network, the really bad part is that this logic
  is really unintuitive to prospective new server operators.

Filename: 123-autonaming.txt
Title: Naming authorities automatically create bindings
Author: Peter Palfrader
Created: 2007-10-11
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  Tor's directory authorities can give certain servers a "Named" flag
  in the network-status entry, when they want to bind that nickname to
  that identity key. This allows clients to specify a nickname rather
  than an identity fingerprint and still be certain they're getting the
  "right" server.

  Authority operators name a server by adding their nickname and
  identity fingerprint to the 'approved-routers' file.  Historically
  being listed in the file was required for a router, at first for being
  listed in the directory at all, and later in order to be used by
  clients as a first or last hop of a circuit.

  Adding identities to the list of named routers so far has been a
  manual, time consuming, and boring job.  Given that and the fact that
  the Tor network works just fine without named routers, the last
  authority to keep a current binding list stopped updating it well over
  half a year ago.

  Naming, if it were done, would serve a useful purpose however in that
  users can have a reasonable expectation that the exit server Bob they
  are using in their http://www.google.com.bob.exit/ URL is the same
  Bob every time.

Proposal:
  I propose that identity<->name binding be completely automated:

  New bindings should be added after the router has been around for a
  bit and its name has not been used by other routers. Similarly, names
  that have not appeared on the network for a long time should be freed
  in case a new router wants to use them.

  The following rules are suggested:
  i) If a named router has not been online for half a year, the
     identity<->name binding for that name is removed.  The nickname
     is free to be taken by other routers now.
  ii) If a router claims a certain nickname and
       a) has been on the network for at least two weeks, and
       b) that nickname is not yet linked to a different router, and
       c) no other router has wanted that nickname in the last month,
      a new binding should be created for this router and its desired
      nickname.
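
  The rules above could be run as a periodic pass over the known routers,
  for example by an external tool. The sketch below is only an
  illustration: the Router record, the exact day counts chosen for "half
  a year" and "month", and the reading of "wanted that nickname" as "was
  seen using it" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

HALF_YEAR = timedelta(days=182)  # assumed length of "half a year"
TWO_WEEKS = timedelta(days=14)
ONE_MONTH = timedelta(days=30)   # assumed length of "the last month"


@dataclass
class Router:
    identity: str
    nickname: str
    first_seen: datetime
    last_seen: datetime


def update_bindings(bindings, routers, now):
    """bindings: {nickname: identity}. Returns the updated mapping."""
    by_id = {r.identity: r for r in routers}
    updated = {}
    # Rule i: drop bindings whose router has been offline for half a year.
    for nick, ident in bindings.items():
        router = by_id.get(ident)
        if router is not None and now - router.last_seen <= HALF_YEAR:
            updated[nick] = ident
    # Rule ii: create bindings for routers that meet a), b) and c).
    for r in routers:
        if r.nickname in updated:                 # b) name already bound
            continue
        if now - r.first_seen < TWO_WEEKS:        # a) too new
            continue
        rivals = [o for o in routers
                  if o.identity != r.identity and o.nickname == r.nickname
                  and now - o.last_seen <= ONE_MONTH]
        if rivals:                                # c) contested recently
            continue
        updated[r.nickname] = r.identity
    return updated
```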

 This automaton does not necessarily need to live in the Tor code, it
 can do its job just as well when it's an external tool.

Filename: 124-tls-certificates.txt
Title: Blocking resistant TLS certificate usage
Author: Steven J. Murdoch
Created: 2007-10-25
Status: Superseded

Overview:

  To be less distinguishable from HTTPS web browsing, only Tor servers should
  present TLS certificates. This should be done whilst maintaining backwards
  compatibility with Tor nodes which present and expect client certificates, and
  while preserving existing security properties. This specification describes
  the negotiation protocol, what certificates should be presented during the TLS
  negotiation, and how to move the client authentication within the encrypted
  tunnel.

Motivation:

  In Tor's current TLS [1] handshake, both client and server present a
  two-certificate chain. Since TLS performs authentication prior to establishing
  the encrypted tunnel, the contents of these certificates are visible to an
  eavesdropper. In contrast, during normal HTTPS web browsing, the server
  presents a single certificate, signed by a root CA and the client presents no
  certificate. Hence it is possible to distinguish Tor from HTTPS by identifying
  this pattern.

  To resist blocking based on traffic identification, Tor should behave as close
  to HTTPS as possible, i.e. servers should offer a single certificate and not
  request a client certificate; clients should present no certificate. This
  presents two difficulties: clients are no longer authenticated and servers are
  authenticated by the connection key, rather than identity key. The link
  protocol must thus be modified to preserve the old security semantics.

  Finally, in order to maintain backwards compatibility, servers must correctly
  identify whether the client supports the modified certificate handling. This
  is achieved by modifying the cipher suites that clients advertise support
  for. These cipher suites are selected to be similar to those chosen by web
  browsers, in order to resist blocking based on client hello.

Terminology:

  Initiator: OP or OR which initiates a TLS connection ("client" in TLS
   terminology)
  
  Responder: OR which receives an incoming TLS connection ("server" in TLS
   terminology) 

Version negotiation and cipher suite selection:

  In the modified TLS handshake, the responder does not request a certificate
  from the initiator. This request would normally occur immediately after the
  responder receives the client hello (the first message in a TLS handshake) and
  so the responder must decide whether to request a certificate based only on
  the information in the client hello. This is achieved by examining the cipher
  suites in the client hello.

   List 1: cipher suite lists offered by version 0/1 Tor

   From src/common/tortls.c, revision 12086:
    TLS1_TXT_DHE_RSA_WITH_AES_128_SHA 
    TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
    SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA

 Client hello sent by initiator:

  Initiators supporting version 2 of the Tor connection protocol MUST
  offer a different cipher suite list from those sent by pre-version 2
  Tors, contained in List 1. To maintain compatibility with older Tor
  versions and common browsers, the cipher suite list MUST include
  support for:

   TLS_DHE_RSA_WITH_AES_256_CBC_SHA
   TLS_DHE_RSA_WITH_AES_128_CBC_SHA
   SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
   SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA

 Client hello received by responder/server hello sent by responder:

  Responders supporting version 2 of the Tor connection protocol should compare
  the cipher suite list in the client hello with those in List 1. If it matches
  any in the list then the responder should assume that the initiator supports
  version 1, and thus should maintain the version 1 behavior, i.e. send a
  two-certificate chain, request a client certificate, and not send or expect
  a VERSIONS cell [2].
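
  The comparison against List 1 can be sketched as an exact match on the
  offered cipher suite codepoints. The codepoints 0x0033 and 0x0016 are
  the ones Appendix A labels "Tor v1 default" and "Tor v1 fallback"; the
  assumption that a v1 client hello carries exactly these values and
  nothing else is a simplification.

```python
# The three cipher suite lists of List 1, as TLS codepoints.
V1_CIPHER_LISTS = {
    (0x0033,),          # DHE-RSA with AES-128
    (0x0033, 0x0016),   # AES-128, then 3DES
    (0x0016,),          # DHE-RSA with 3DES
}


def initiator_is_v1(client_hello_suites) -> bool:
    """True if the offered list matches one of the version 0/1 lists."""
    return tuple(client_hello_suites) in V1_CIPHER_LISTS
```

  A responder would keep the version 1 behavior when this returns True
  and use the version 2 handshake otherwise.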

  Otherwise, the responder should assume version 2 behavior and select a cipher
  suite following TLS [1] behavior, i.e. select the first entry from the client
  hello cipher list which is acceptable. Responders MUST NOT select any suite
  that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits,
  or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT
  allow other SSLv3 ciphersuites. 

  Should no mutually acceptable cipher suite be found, the connection MUST be
  closed.

  If the responder is implementing version 2 of the connection protocol it
  SHOULD send a server certificate with random contents. The organizationName
  field MUST NOT be "Tor", "TOR" or "t o r".

 Server certificate received by initiator:

  If the server certificate has an organizationName of "Tor", "TOR" or "t o r",
  the initiator should assume that the responder does not support version 2 of
  the connection protocol. In that case the initiator should respond following
  version 1, i.e. send a two-certificate client chain and not send or expect
  a VERSIONS cell.

  [SJM: We could also use the fact that a client certificate request was sent]
  
  If the server hello contains a ciphersuite which does not comply with the key
  length requirements above, even if it was one offered in the client hello, the
  connection MUST be closed. This will only occur if the responder is not a Tor
  server.

 Backward compatibility:

  v1 Initiator, v1 Responder: No change
  v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello
  v2 Initiator, v1 Responder: Responder accepts v2 client hello. Initiator
   detects v1 server certificate and continues with v1 protocol
  v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator
   detects v2 server certificate and continues with v2 protocol.

 Additional link authentication process:

  Following VERSION and NETINFO negotiation, both responder and
  initiator MUST send a certification chain in a CERT cell. If one
  party does not have a certificate, the CERT cell MUST still be sent,
  but with a length of zero.

  A CERT cell is a variable length cell, of the format
        CircID                                [2 bytes]
        Command                               [1 byte]
        Length                                [2 bytes]
        Payload                               [<length> bytes]

  CircID MUST be set to 0x0000
  Command is [SJM: TODO]
  Length is the length of the payload
  Payload contains 0 or more certificates, each is of the format:
        Cert_Length  [2 bytes]
        Certificate  [<cert_length> bytes]

  Each certificate MUST sign the one preceding it. The initiator MUST
  place its connection certificate first; the responder, having
  already sent its connection certificate as part of the TLS handshake
  MUST place its identity certificate first.
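
  The CERT cell layout described above can be sketched as an encoder and
  decoder. The command value is left as a TODO in this proposal, so
  CERT_COMMAND below is a purely hypothetical placeholder; big-endian
  (network order) integers are also an assumption.

```python
import struct

CERT_COMMAND = 0x80  # hypothetical placeholder; the proposal leaves this open


def encode_cert_cell(certs, command: int = CERT_COMMAND) -> bytes:
    """Build CircID(2) | Command(1) | Length(2) | Payload."""
    payload = b"".join(struct.pack("!H", len(c)) + c for c in certs)
    return struct.pack("!HBH", 0x0000, command, len(payload)) + payload


def decode_cert_cell(cell: bytes):
    """Return (command, [certificates]) or raise ValueError."""
    if len(cell) < 5:
        raise ValueError("cell too short")
    circ_id, command, length = struct.unpack("!HBH", cell[:5])
    payload = cell[5:]
    if circ_id != 0x0000 or len(payload) != length:
        raise ValueError("bad CircID or Length")
    certs, pos = [], 0
    while pos < length:
        if pos + 2 > length:
            raise ValueError("truncated Cert_Length")
        (cert_len,) = struct.unpack("!H", payload[pos:pos + 2])
        pos += 2
        if pos + cert_len > length:
            raise ValueError("truncated Certificate")
        certs.append(payload[pos:pos + cert_len])
        pos += cert_len
    return command, certs
```

  A party without a certificate would send encode_cert_cell([]), which
  yields a cell whose Length field is zero.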

  Initiators who send a CERT cell MUST follow that with a LINK_AUTH
  cell to prove that they possess the corresponding private key.

  A LINK_AUTH cell is fixed-length, of the format:
         CircID                                [2 bytes]
         Command                               [1 byte]
         Length                                [2 bytes]
         Payload (padded with 0 bytes)         [PAYLOAD_LEN - 2 bytes]

  CircID MUST be set to 0x0000
  Command is [SJM: TODO]
  Length is the valid portion of the payload
  Payload is of the format:
         Signature version                     [1 byte]
         Signature                             [<length> - 1 bytes]
         Padding                               [PAYLOAD_LEN - <length> - 2 bytes]

  Signature version: Identifies the type of signature, currently 0x00
  Signature: Digital signature under the initiator's connection key of the
   following item, in PKCS #1 block type 1 [3] format:

    HMAC-SHA1, using the TLS master secret as key, of the
    following elements concatenated:
     - The signature version (0x00)
     - The NUL terminated ASCII string: "Tor initiator certificate verification"
     - client_random, as sent in the Client Hello
     - server_random, as sent in the Server Hello
     - SHA-1 hash of the initiator connection certificate
     - SHA-1 hash of the responder connection certificate
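
  The value to be signed can be sketched as follows. The sketch covers
  only the HMAC-SHA1 computation, not the PKCS #1 signature over it, and
  it assumes that each certificate hash is taken over the certificate's
  encoded bytes. The function name is hypothetical.

```python
import hashlib
import hmac

SIGNATURE_VERSION = 0x00
LABEL = b"Tor initiator certificate verification\x00"  # NUL-terminated


def link_auth_mac(master_secret: bytes, client_random: bytes,
                  server_random: bytes, initiator_cert: bytes,
                  responder_cert: bytes) -> bytes:
    """HMAC-SHA1 keyed with the TLS master secret over the listed elements."""
    message = (
        bytes([SIGNATURE_VERSION])
        + LABEL
        + client_random
        + server_random
        + hashlib.sha1(initiator_cert).digest()
        + hashlib.sha1(responder_cert).digest()
    )
    return hmac.new(master_secret, message, hashlib.sha1).digest()
```

  Binding both random values and both certificate hashes into the MAC
  ties the signature to this particular TLS connection.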

  Security checks:

    - Before sending a LINK_AUTH cell, a node MUST ensure that the TLS
      connection is authenticated by the responder key.
    - For the handshake to have succeeded, the initiator MUST confirm:
       - That the TLS handshake was authenticated by the 
         responder connection key
       - That the responder connection key was signed by the first
         certificate in the CERT cell
       - That each certificate in the CERT cell was signed by the
         following certificate, with the exception of the last
       - That the last certificate in the CERT cell is the expected
         identity certificate for the node being connected to
    - For the handshake to have succeeded, the responder MUST confirm
      either:
       A) - A zero length CERT cell was sent and no LINK_AUTH cell was
            sent
          In which case the responder shall treat the identity of the
          initiator as unknown
        or
       B) - That the LINK_AUTH MAC contains a signature by the first
            certificate in the CERT cell
          - That the MAC signed matches the expected value
          - That each certificate in the CERT cell was signed by the
            following certificate, with the exception of the last
          In which case the responder shall treat the identity of the
          initiator as that of the last certificate in the CERT cell

  Protocol summary:

  1. I(nitiator) <-> R(esponder): TLS handshake, including responder
                               authentication under connection certificate R_c
  2. I <-> R: VERSION and NETINFO negotiation
  3. R -> I: CERT (Responder identity certificate R_i (which signs R_c))
  4. I -> R: CERT (Initiator connection certificate I_c,
                   Initiator identity certificate I_i (which signs I_c))
  5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret,
                    "Tor initiator certificate verification" ||
                    client_random || server_random ||
                    I_c hash || R_c hash))

  Notes: I -> R doesn't need to wait for R_i before sending its own
   messages (reduces round-trips).
   Certificate hash is calculated like identity hash in CREATE cells.
   Initiator signature is calculated in a similar way to Certificate
   Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7).
   If I is an OP, a zero length certificate chain may be sent in step 4,
   in which case step 5 is not performed.

  Rationale: 

  - Version and netinfo negotiation before authentication: The version cell needs
   to come before the rest of the protocol, since we may choose to alter
   the rest at some later point, e.g. switch to a different MAC/signature scheme.
   It is useful to keep the NETINFO and VERSION cells close to each other, since
   the time between them is used to check if there is a delay-attack. Still, a
   server might want to not act on NETINFO data from an initiator until the
   authentication is complete.

Appendix A: Cipher suite choices

  This specification intentionally does not put any constraints on the
  TLS ciphersuite lists presented by clients, other than a minimum
  required for compatibility. However, to maximize blocking
  resistance, ciphersuite lists should be carefully selected.

   Recommended client ciphersuite list

     Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h

     0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA  
     0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA 
     0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA 
     0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA
     0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA 
     0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA 
     0x0035: TLS_RSA_WITH_AES_256_CBC_SHA
     0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA 
     0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA 
     0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA
     0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA 
     0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA 
     0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA 
     0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA
     0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
     0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA  
     0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA 
     0x0004: SSL_RSA_WITH_RC4_128_MD5 
     0x0005: SSL_RSA_WITH_RC4_128_SHA 
     0x002f: TLS_RSA_WITH_AES_128_CBC_SHA 
     0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA 
     0xc012: TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
     0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA  
     0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA 
     0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA 
     0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
     0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC)
     0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA 

     Order specified in:
      http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47

   Recommended options:
      0x0000: Server Name Indication [4]
      0x000a: Supported Elliptic Curves [5]
      0x000b: Supported Point Formats [5]

   Recommended compression:
      0x00

   Recommended server ciphersuite selection:

     The responder should select the first entry in this list which is
     listed in the client hello:

     0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA  [ Common Firefox choice ]
     0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA  [ Tor v1 default ] 
     0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ]
     0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ]
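
     The selection rule above can be sketched in Python as follows. The
     function and constant names are illustrative only, and returning
     None when there is no overlap is an assumption: the text above does
     not say what the responder should do in that case.

```python
# Responder preference order, copied from the list above.
SERVER_PREFERENCE = [0x0039, 0x0033, 0x0016, 0x0013]


def select_ciphersuite(client_suites, preference=SERVER_PREFERENCE):
    """Return the first suite in `preference` that the client offered.

    Returns None if nothing matches (hypothetical behavior; the
    proposal does not define this case).
    """
    offered = set(client_suites)
    for suite in preference:
        if suite in offered:
            return suite
    return None
```

     Note that the order of the client's own list plays no role here;
     only the responder's preference order matters.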

References:

[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF

[2] Version negotiation for the Tor protocol, Tor proposal 105

[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1:
    RSA Cryptography Specifications Version 1.5", RFC 2313,
    March 1998.

[4] TLS Extensions, RFC 3546

[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS), RFC 4492

Filename: 125-bridges.txt
Title: Behavior for bridge users, bridge relays, and bridge authorities
Author: Roger Dingledine
Created: 11-Nov-2007
Status: Closed
Implemented-In: 0.2.0.x

0. Preface

  This document describes the design decisions around support for bridge
  users, bridge relays, and bridge authorities. It acts as an overview
  of the bridge design and deployment for developers, and it also tries
  to point out limitations in the current design and implementation.

  For more details on what all of these mean, look at blocking.tex in
  /doc/design-paper/

1. Bridge relays

  Bridge relays are just like normal Tor relays except they don't publish
  their server descriptors to the main directory authorities.

1.1. PublishServerDescriptor

  To configure your relay to be a bridge relay, just add
    BridgeRelay 1
    PublishServerDescriptor bridge
  to your torrc. This will cause your relay to publish its descriptor
  to the bridge authorities rather than to the default authorities.

  Alternatively, you can say
    BridgeRelay 1
    PublishServerDescriptor 0
  which will cause your relay to not publish anywhere. This could be
  useful for private bridges.

1.2. Exit policy

  Bridge relays should use an exit policy of "reject *:*". This is
  because they only need to relay traffic between the bridge users
  and the rest of the Tor network, so there's no need to let people
  exit directly from them.

1.3. RelayBandwidthRate / RelayBandwidthBurst

  We invented the RelayBandwidth* options for this situation: Tor clients
  who want to allow relaying too. See proposal 111 for details. Relay
  operators should feel free to rate-limit their relayed traffic.

1.4. Helping the user with port forwarding, NAT, etc.

  Just as for operating normal relays, our documentation and hints for
  how to make your ORPort reachable are inadequate for normal users.

  We need to work harder on this step, perhaps in 0.2.2.x.

1.5. Vidalia integration

  Vidalia has turned its "Relay" settings page into a tri-state
  "Don't relay" / "Relay for the Tor network" / "Help censored users".

  If you click the third choice, it forces your exit policy to reject *:*.

  If all the bridges end up on port 9001, that's not so good. On the
  other hand, putting the bridges on a low-numbered port in the Unix
  world requires jumping through extra hoops. The current compromise is
  that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
  other platforms.

  At the bottom of the relay config settings window, Vidalia displays
  the bridge identifier to the operator (see Section 3.1) so he can pass
  it on to bridge users.

1.6. What if the default ORPort is already used?

  If the user already has a webserver or some other application
  bound to port 443, then Tor will fail to bind it and complain to the
  user, probably in a cryptic way. Rather than just working on a better
  error message (though we should do this), we should consider an
  "ORPort auto" option that tells Tor to try to find something that's
  bindable and reachable. This would also help us tolerate ISPs that
  filter incoming connections on port 80 and port 443. But this should
  be a different proposal, and can wait until 0.2.2.x.

2. Bridge authorities.

  Bridge authorities are like normal directory authorities, except they
  don't create their own network-status documents or votes. So if you
  ask a bridge authority for a network-status document or consensus, it
  behaves like a directory mirror: it gives you one from one of the main
  authorities. But if you ask the bridge authority for the descriptor
  corresponding to a particular identity fingerprint, it will happily
  give you the latest descriptor for that fingerprint.

  To become a bridge authority, add these lines to your torrc:
    AuthoritativeDirectory 1
    BridgeAuthoritativeDir 1

  Right now there's one bridge authority, running on the Tonga relay.

2.1. Exporting bridge-purpose descriptors

  We've added a new purpose for server descriptors: the "bridge"
  purpose. With the new router-descriptors file format that includes
  annotations, it's easy to look through it and find the bridge-purpose
  descriptors.

  Currently we export the bridge descriptors from Tonga to the
  BridgeDB server, so it can give them out according to the policies
  in blocking.pdf.

2.2. Reachability/uptime testing

  Right now the bridge authorities do active reachability testing of
  bridges, so we know which ones to recommend for users.

  But in the design document, we suggested that bridges should publish
  anonymously (i.e. via Tor) to the bridge authority, so somebody watching
  the bridge authority can't just enumerate all the bridges. But if we're
  doing active measurement, the game is up. Perhaps we should back off on
  this goal, or perhaps we should do our active measurement anonymously?

  Answering this issue is scheduled for 0.2.1.x.

2.3. Migrating to multiple bridge authorities

  Having only one bridge authority is both a trust bottleneck (if you
  break into one place you learn about every single bridge we've got)
  and a robustness bottleneck (when it's down, bridge users become sad).

  Right now if we put up a second bridge authority, all the bridges would
  publish to it, and (assuming the code works) bridge users would query
  a random bridge authority. This resolves the robustness bottleneck,
  but makes the trust bottleneck even worse.

  In 0.2.2.x and later we should think about better ways to have multiple
  bridge authorities.

3. Bridge users.

  Bridge users are like ordinary Tor users except they use encrypted
  directory connections by default, and they use bridge relays as both
  entry guards (their first hop) and directory guards (the source of
  all their directory information).

  To become a bridge user, add the following line to your torrc:

    UseBridges 1

  and then add at least one "Bridge" line to your torrc based on the
  format below.

3.1. Format of the bridge identifier.

  The canonical format for a bridge identifier contains an IP address,
  an ORPort, and an identity fingerprint:
    bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1

  However, the identity fingerprint can be left out, in which case the
  bridge user will connect to that relay and use it as a bridge regardless
  of what identity key it presents:
    bridge 128.31.0.34:9009
  This might be useful for cases where only short bridge identifiers
  can be communicated to bridge users.

  In a future version we may also support bridge identifiers that are
  only a key fingerprint:
    bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
  and the bridge user can fetch the latest descriptor from the bridge
  authority (see Section 3.4).
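
  As a rough illustration, the two currently supported forms could be
  parsed along these lines. This is a Python sketch, not Tor's actual
  parser; the function name is made up, and it deliberately rejects the
  fingerprint-only form, which is only a possible future extension.

```python
import ipaddress
import string


def parse_bridge_line(line):
    """Parse 'bridge ADDR:PORT [FINGERPRINT]' into (addr, port, fingerprint).

    The fingerprint is returned as 40 uppercase hex digits with the
    spaces removed, or None if it was left out.
    """
    parts = line.split()
    if len(parts) < 2 or parts[0].lower() != "bridge":
        raise ValueError("not a bridge line")
    host, sep, port_text = parts[1].rpartition(":")
    if not sep:
        raise ValueError("expected ADDR:PORT")
    addr = str(ipaddress.IPv4Address(host))  # raises ValueError if invalid
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    # The fingerprint may be split into space-separated groups.
    fingerprint = "".join(parts[2:]).upper() or None
    if fingerprint is not None:
        bad = any(c not in string.hexdigits for c in fingerprint)
        if len(fingerprint) != 40 or bad:
            raise ValueError("fingerprint must be 40 hex digits")
    return addr, port, fingerprint
```

  A caller that gets None for the fingerprint knows it must accept
  whatever identity key the relay at that address presents.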

3.2. Bridges as entry guards

  For now, bridge users add their bridge relays to their list of "entry
  guards" (see path-spec.txt for background on entry guards). They are
  managed by the entry guard algorithms exactly as if they were a normal
  entry guard -- their keys and timing get cached in the "state" file,
  etc. This means that when the Tor user starts up with "UseBridges"
  disabled, he will skip past the bridge entries since they won't be
  listed as up and usable in his networkstatus consensus. But to be clear,
  the "entry_guards" list doesn't currently distinguish guards by purpose.

  Internally, each bridge user keeps a smartlist of "bridge_info_t"
  that reflects the "bridge" lines from his torrc along with a download
  schedule (see Section 3.5 below). When he starts Tor, he attempts
  to fetch a descriptor for each configured bridge (see Section 3.4
  below). When he succeeds at getting a descriptor for one of the bridges
  in his list, he adds it directly to the entry guard list using the
  normal add_an_entry_guard() interface. Once a bridge descriptor has
  been added, should_delay_dir_fetches() will stop delaying further
  directory fetches, and the user begins to bootstrap his directory
  information from that bridge (see Section 3.3).

  Currently bridge users cache their bridge descriptors to the
  "cached-descriptors" file (annotated with purpose "bridge"), but
  they don't make any attempt to reuse descriptors they find in this
  file. The theory is that either the bridge is available now, in which
  case you can get a fresh descriptor, or it's not, in which case an
  old descriptor won't do you much good.

  We could disable writing out the bridge lines to the state file, if
  we think this is a problem.

  As an exception, if we get an application request when we have one
  or more bridge descriptors but we believe none of them are running,
  we mark them all as running again. This is similar to the exception
  already in place to help long-idle Tor clients realize they should
  fetch fresh directory information rather than just refuse requests.

3.3. Bridges as directory guards

  In addition to using bridges as the first hop in their circuits, bridge
  users also use them to fetch directory updates. Other than initial
  bootstrapping to find a working bridge descriptor (see Section 3.4
  below), all further non-anonymized directory fetches will be redirected
  to the bridge.

  This means that bridge relays need to have cached answers for all
  questions the bridge user might ask. This makes the upgrade path
  tricky --- for example, if we migrate to a v4 directory design, the
  bridge user would need to keep using v3 so long as his bridge relays
  only knew how to answer v3 queries.

  In a future design, for cases where the user has enough information
  to build circuits yet the chosen bridge doesn't know how to answer a
  given query, we might teach bridge users to make an anonymized request
  to a more suitable directory server.

3.4. How bridge users get their bridge descriptor

  Bridge users can fetch bridge descriptors in two ways: by going directly
  to the bridge and asking for "/tor/server/authority", or by going to
  the bridge authority and asking for "/tor/server/fp/ID". By default,
  they will only try the direct queries. If the user sets
    UpdateBridgesFromAuthority 1
  in his config file, then he will try querying the bridge authority
  first for bridges where he knows a digest (if he only knows an IP
  address and ORPort, then his only option is a direct query).

  If the user has at least one working bridge, then he will do further
  queries to the bridge authority through a full three-hop Tor circuit.
  But when bootstrapping, he will make a direct begin_dir-style connection
  to the bridge authority.

  As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
  from the bridge authority and it returns a 404 not found, the user
  will automatically fall back to trying a direct query. Therefore it is
  recommended that bridge users always set UpdateBridgesFromAuthority,
  since at worst it will delay their fetches a little bit and notify
  the bridge authority of the identity fingerprint (but not location)
  of their intended bridges.
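
  The decision logic in this section can be summarized with a small
  Python sketch. The function name, the tuple layout, and the transport
  labels are invented for illustration; only the two URLs and the
  ordering come from the text above.

```python
def descriptor_fetch_plan(fingerprint, update_from_authority,
                          have_working_bridge):
    """Return the sources to try, in order, for one configured bridge.

    Each entry is (source, url, transport). A later entry is only tried
    if the earlier one fails (for example with a 404).
    """
    plan = []
    if update_from_authority and fingerprint is not None:
        # Once some bridge works, authority queries go through Tor;
        # while bootstrapping they are direct begin_dir connections.
        if have_working_bridge:
            via = "three-hop circuit"
        else:
            via = "direct begin_dir"
        plan.append(("bridge authority",
                     "/tor/server/fp/" + fingerprint, via))
    # The direct query is the default, and the fallback otherwise.
    plan.append(("bridge", "/tor/server/authority", "direct"))
    return plan
```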

3.5. Bridge descriptor retry schedule

  Bridge users try to fetch a descriptor for each bridge (using the
  steps in Section 3.4 above) on startup. Whenever they receive a
  bridge descriptor, they reschedule a new descriptor download for 1
  hour from then.

  If on the other hand it fails, they try again after 15 minutes for the
  first attempt, after 15 minutes for the second attempt, and after 60
  minutes for subsequent attempts.
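
  One way to read this schedule, as a Python sketch: the delay depends
  only on how many fetches in a row have failed. The function name and
  the "consecutive failures" counter are assumptions made for the
  example, as is the reading that the first two failures each wait 15
  minutes.

```python
def next_fetch_delay_minutes(consecutive_failures):
    """Minutes to wait before the next descriptor fetch for one bridge.

    consecutive_failures is 0 right after a successful download.
    """
    if consecutive_failures < 0:
        raise ValueError("failure count cannot be negative")
    if consecutive_failures == 0:
        return 60  # success: refresh the descriptor in an hour
    if consecutive_failures <= 2:
        return 15  # first and second failure: retry soon
    return 60      # later failures: back off to hourly
```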

  In 0.2.2.x we should come up with some smarter retry schedules.

3.6. Vidalia integration

  Vidalia 0.0.16 has a checkbox in its Network config window called
  "My ISP blocks connections to the Tor network." Users who click that
  box change their configuration to:
    UseBridges 1
    UpdateBridgesFromAuthority 1
  and should specify at least one Bridge identifier.

3.7. Do we need a second layer of entry guards?

  If the bridge user uses the bridge as its entry guard, then the
  triangulation attacks from Lasse and Paul's Oakland paper work to
  locate the user's bridge(s).

  Worse, this is another way to enumerate bridges: if the bridge users
  keep rotating through second hops, then if you run a few fast servers
  (and avoid getting considered an Exit or a Guard) you'll quickly get
  a list of the bridges in active use.

  That's probably the strongest reason why bridge users will need to
  pick second-layer guards. Would this mean bridge users should switch
  to four-hop circuits?

  We should figure this out in the 0.2.1.x timeframe.

Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
Implemented-In: 0.2.0.x

0. Status

  In 0.2.0.x, this proposal is implemented to the extent needed to
  address its motivations.  See notes below with the text "RESOLUTION"
  for details.

1. Background and motivation

  Right now we can keep a rough count of Tor users, both total and by
  country, by watching connections to a single directory mirror. Being
  able to get usage estimates is useful both for our funders (to
  demonstrate progress) and for our own development (so we know how
  quickly we're scaling and can design accordingly, and so we know which
  countries and communities to focus on more). This need for information
  is the only reason we haven't deployed "directory guards" (think of
  them like entry guards but for directory information; in practice,
  it would seem that Tor clients should simply use their entry guards
  as their directory guards; see also proposal 125).

  With the move toward bridges, we will no longer be able to track Tor
  clients that use bridges, since they use their bridges as directory
  guards. Further, we need to be able to learn which bridges stop seeing
  use from certain countries (and are thus likely blocked), so we can
  avoid giving them out to other users in those countries.

  Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays
  and circuits on its 'network map', and it performs anonymized GeoIP
  lookups to its central servers to know where to put the dots. Vidalia
  caches answers it gets -- to reduce delay, to reduce overhead on
  the network, and to reduce anonymity issues where users reveal their
  knowledge about the network through which IP addresses they ask about.

  But with the advent of bridges, Tor clients are asking about IP
  addresses that aren't in the main directory. In particular, bridge
  users inform the central Vidalia servers about each bridge as they
  discover it and their Vidalia tries to map it.

  Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's
  own IP address, so it can provide a more useful map.

  Finally, Vidalia's central servers leave users open to partitioning
  attacks, even if they can't target specific users. Further, as we
  start using GeoIP results for more operational or security-relevant
  goals, such as avoiding or including particular countries in circuits,
  it becomes more important that users can't be singled out in terms of
  their IP-to-country mapping beliefs.

2. The available GeoIP databases

  There are at least two classes of GeoIP database out there: "IP to
  country", which tells us the country code for the IP address but
  no more details, and "IP to city", which tells us the country code,
  the name of the city, and some basic latitude/longitude guesses.

  A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252
  bytes. A typical line is:
    "205500992","208605279","US","USA","UNITED STATES"
  http://ip-to-country.webhosting.info/node/view/5
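
  To make the format concrete, here is a Python sketch of a local
  lookup against a file in this format. It assumes that the first two
  fields are the inclusive start and end of an IPv4 range written as
  32-bit integers, and that ranges do not overlap; the function names
  are made up for the example.

```python
import bisect
import csv
import io
import ipaddress


def load_ip_to_country(text):
    """Parse ip-to-country.csv text into sorted (start, end, code) rows."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        rows.append((int(fields[0]), int(fields[1]), fields[2]))
    rows.sort()
    return rows


def lookup_country(rows, address):
    """Return the two-letter code for a dotted-quad address, or None."""
    value = int(ipaddress.IPv4Address(address))
    # Rebuilt on every call for brevity; real code would cache this.
    starts = [row[0] for row in rows]
    index = bisect.bisect_right(starts, value) - 1
    if index >= 0 and rows[index][0] <= value <= rows[index][1]:
        return rows[index][2]
    return None
```

  With the typical line above loaded, an address such as 12.100.0.1
  falls inside the range and maps to "US", while an address outside
  every range yields None.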

  Similarly, the maxmind GeoLite Country database is also about 500KB
  compressed.
  http://www.maxmind.com/app/geolitecountry

  The maxmind GeoLite City database gives more fine-grained detail like
  geo coordinates and city name. Vidalia currently makes use of this
  information. On the other hand it's 16MB compressed. A typical line is:
    206.124.149.146,Bellevue,WA,US,47.6051,-122.1134
  http://www.maxmind.com/app/geolitecity

  There are other databases out there, like
  http://www.hostip.info/faq.html
  http://www.webconfs.com/ip-to-city.php
  that want more attention, but for now let's assume that all the db's
  are around this size.

3. What we'd like to solve

  Goal #1a: Tor relays collect IP-to-country user stats and publish
  sanitized versions.
  Goal #1b: Tor bridges collect IP-to-country user stats and publish
  sanitized versions.

  Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better
  mapping.
  Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user
  can pick countries for her paths.

  Goal #3: Vidalia doesn't do external lookups on bridge relay addresses.

  Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city
  for better mapping.

  Goal #5: Reduce partitioning opportunities where Vidalia central
  servers can give different (distinguishing) responses.

4. Solution overview

  Our goal is to allow Tor relays, bridges, and clients to learn enough
  GeoIP information so they can do local private queries.

4.1. The IP-to-country db

  Directory authorities should publish a "geoip" file that contains
  IP-to-country mappings. Directory caches will mirror it, and Tor clients
  and relays (including bridge relays) will fetch it. Thus we can solve
  goals 1a and 1b (publish sanitized usage info). Controllers could also
  use this to solve goal 2b (choosing path by country attributes). It
  also solves goal 4 (learning the Tor client's country), though for
  huge countries like the US we'd still need to decide where the "middle"
  should be when we're mapping that address.

  The IP-to-country details are described further in Sections 5 and
  6 below.

  [RESOLUTION: The geoip file in 0.2.0.x is not distributed through
  Tor.  Instead, it is shipped with the bundle.]

4.2. The IP-to-city db

  In an ideal world, the IP-to-city db would be small enough that we
  could distribute it in the above manner too. But for now, it is too
  large. Here's where the design choice forks.

  Option A: Vidalia should continue doing its anonymized IP-to-city
  queries. Thus we can achieve goals 2a and 2b. We would solve goal
  3 by only doing lookups on descriptors that are purpose "general"
  (see Section 4.2.1 for how). We would leave goal 5 unsolved.

  Option B: Each directory authority should keep an IP-to-city db,
  look up the value for each router it lists, and include that line in
  the router's network-status entry. The network-status consensus would
  then use the line that appears in the majority of votes. This approach
  also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups
  at all now), and goal 5 (reduced partitioning risks).

  Option B has the advantage that Vidalia can simplify its operation,
  and the advantage that this consensus IP-to-city data is available to
  other controllers besides just Vidalia. But it has the disadvantage
  that the networkstatus consensus becomes larger, even though most of
  the GeoIP information won't change from one consensus to the next. Is
  there another reasonable location for it that can provide similar
  consensus security properties?

  [RESOLUTION: IP-to-city is not supported.]

4.2.1. Controllers can query for router annotations

  Vidalia needs to stop doing queries on bridge relay IP addresses.
  It could do that by only doing lookups on descriptors that are in
  the networkstatus consensus, but that precludes designs like Blossom
  that might want to map its relay locations. The best answer is that it
  should learn the router annotations, with a new controller 'getinfo'
  command:
    "GETINFO desc-annotations/id/<OR identity>"
  which would respond with something like
    @downloaded-at 2007-11-29 08:06:38
    @source "128.31.0.34"
    @purpose bridge

  [We could also make the answer include the digest for the router in
  question, which would enable us to ask GETINFO router-annotations/all.
  Is this worth it? -RD]

  Then Vidalia can avoid doing lookups on descriptors with purpose
  "bridge". Even better would be to add a new annotation "@private true"
  so Vidalia can know how to handle new purposes that we haven't created
  yet. Vidalia could special-case "bridge" for now, for compatibility
  with the current 0.2.0.x-alphas.
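
  A controller-side check along these lines might look like the
  following Python sketch. It only handles the annotation text shown
  above, not the controller protocol framing around it. The function
  names are invented, "@private true" is the annotation proposed in
  this section rather than an existing one, and treating a missing
  purpose as "general" is an assumption.

```python
def parse_annotations(reply_text):
    """Turn '@key value' lines into a dict of strings.

    Surrounding double quotes on a value are removed.
    """
    annotations = {}
    for line in reply_text.splitlines():
        line = line.strip()
        if not line.startswith("@"):
            continue
        key, _, value = line[1:].partition(" ")
        annotations[key] = value.strip().strip('"')
    return annotations


def may_lookup_externally(annotations):
    """True if the router's address may be sent to a GeoIP server."""
    if annotations.get("private") == "true":  # proposed annotation
        return False
    # Special-case "bridge" for compatibility with 0.2.0.x-alphas.
    return annotations.get("purpose", "general") != "bridge"
```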

4.3. Recommendation

  My overall recommendation is that we should implement 4.1 soon
  (e.g. early in 0.2.1.x), and we can go with 4.2 option A for now,
  with the hope that later we discover a better way to distribute the
  IP-to-city info and can switch to 4.2 option B.

  Below we discuss more how to go about achieving 4.1.

5. Publishing and caching the GeoIP (IP-to-country) database

  Each v3 directory authority should put a copy of the "geoip" file in
  its datadirectory. Then its network-status votes should include a hash
  of this file (Recommended-geoip-hash: %s), and the resulting consensus
  directory should specify the consensus hash.

  There should be a new URL for fetching this geoip db (by "current.z"
  for testing purposes, and by hash.z for typical downloads). Authorities
  should fetch and serve the one listed in the consensus, even when they
  vote for their own. This would argue for storing the cached version
  in a better filename than "geoip".

  Directory mirrors should keep a copy of this file available via the
  same URLs.

  We assume that the file would change at most a few times a month. Should
  Tor ship with a bootstrap geoip file? An out-of-date geoip file may
  open you up to partitioning attacks, but for the most part it won't
  be that different.

  There should be a config option to disable updating the geoip file,
  in case users want to use their own file (e.g. they have a proprietary
  GeoIP file they prefer to use). In that case we leave it up to the
  user to update his geoip file out-of-band.

  [XXX Should consider forward/backward compatibility, e.g. if we want
  to move to a new geoip file format. -RD]

  [RESOLUTION: Not done over Tor.]

6. Controllers use the IP-to-country db for mapping and for path building

  Down the road, Vidalia could use the IP-to-country mappings for placing
  on its map:
  - The location of the client
  - The location of the bridges, or other relays not in the
    networkstatus.
  - Any relays that it doesn't yet have an IP-to-city answer for.

  Other controllers can also use it to set EntryNodes, ExitNodes, etc
  in a per-country way.

  To support these features, we need to export the IP-to-country data
  via the Tor controller protocol.

  Is it sufficient just to add a new GETINFO command?
    GETINFO ip-to-country/128.31.0.34
    250+ip-to-country/128.31.0.34="US","USA","UNITED STATES"

  [RESOLUTION: Not done now, except for the getinfo command.]

6.1. Other interfaces

  Robert Hogan has also suggested a

    GETINFO relays-by-country/cn

  as well as torrc options for ExitCountryCodes, EntryCountryCodes,
  ExcludeCountryCodes, etc.

  [RESOLUTION: Not implemented in 0.2.0.x.  Fodder for a future proposal.]

7. Relays and bridges use the IP-to-country db for usage summaries

  Once bridges have a GeoIP database locally, they can start to publish
  sanitized summaries of client usage -- how many users they see and from
  what countries. This might also be a more useful way for ordinary Tor
  relays to convey the level of usage they see, which would allow us to
  switch to using directory guards for all users by default.

  But how to safely summarize this information without opening too many
  anonymity leaks?

7.1. Attacks to think about

  First, note that we need to have a large enough time window that we're
  not aiding correlation attacks much. I hope 24 hours is enough. So
  that means no publishing stats until you've been up at least 24 hours.
  And you can't publish follow-up stats more often than every 24 hours,
  or people could look at the differential.

  Second, note that we need to be sufficiently vague about the IP
  addresses we're reporting. We are hoping that just specifying the
  country will be vague enough. But a) what about active attacks where
  we convince a bridge to use a GeoIP db that labels each suspect IP
  address as a unique country? We have to assume that the consensus GeoIP
  db won't be malicious in this way. And b) could such singling-out
  attacks occur naturally, for example because of countries that have
  a very small IP space? We should investigate that.

7.2. Granularity of users

  Do we only want to report countries that have a sufficient anonymity set
  (that is, number of users) for the day? For example, we might avoid
  listing any countries that have seen fewer than five addresses over
  the 24 hour period. This approach would be helpful in reducing the
  singling-out opportunities -- in the extreme case, we could imagine a
  situation where one blogger from the Sudan used Tor on a given day, and
  we can discover which entry guard she used.

  But I fear that especially for bridges, seeing only one hit from a
  given country in a given day may be quite common.

  As a compromise, we should start out with an "Other" category in
  the reported stats, which is the sum of unlisted countries; if that
  category is consistently interesting, we can think harder about how
  to get the right data from it safely.

  But note that bridge summaries will not be made public individually,
  since doing so would help people enumerate bridges. Whereas summaries
  from normal relays will be public. So perhaps that means we can afford
  to be more specific in bridge summaries? In particular, I'm thinking the
  "other" category should be used by public relays but not for bridges
  (or if it is, used with a lower threshold).

  Even for countries that have many Tor users, we might not want to be
  too specific about how many users we've seen. For example, we might
  round down the number of users we report to the nearest multiple of 5.
  My instinct for now is that this won't be that useful.

7.3. Other issues

  Another note: we'll likely be overreporting in the case of users with
  dynamic IP addresses: if they rotate to a new address over the course
  of the day, we'll count them twice. So be it.

7.4. Where to publish the summaries?

  We designed extrainfo documents for information like this. So they
  should just be more entries in the extrainfo doc.

  But if we want to publish summaries every 24 hours (no more often,
  no less often), aren't we tied to the router descriptor publishing
  schedule? That is, if we publish a new router descriptor at the 18
  hour mark, and nothing much has changed at the 24 hour mark, won't
  the new descriptor get dropped as being "cosmetically similar", and
  then nobody will know to ask about the new extrainfo document?

  One solution would be to make and remember the 24 hour summary at the
  24 hour mark, but not actually publish it anywhere until we happen to
  publish a new descriptor for other reasons. If we happen to go down
  before publishing a new descriptor, then so be it, at least we tried.

7.5. What if the relay is unreachable or goes to sleep?

  Even if you've been up for 24 hours, if you were hibernating for 18
  of them, then we're not getting as much fuzziness as we'd like. So
  I guess that means that we need a 24-hour period of being "awake"
  before we're willing to publish a summary. A similar attack works if
  you've been awake but unreachable for the first 18 of the 24 hours. As
  another example, a bridge that's on a laptop might be suspended for
  some of each day.

  This implies that some relays and bridges will never publish summary
  stats, because they're not ever reliably working for 24 hours in
  a row. If a significant percentage of our reporters end up being in
  this boat, we should investigate whether we can accumulate 24 hours of
  "usefulness", even if there are holes in the middle, and publish based
  on that.

  What other issues are like this? It seems that just moving to a new
  IP address shouldn't be a reason to cancel stats publishing, assuming
  we were usable at each address.

7.6. IP addresses that aren't in the geoip db

  Some IP addresses aren't in the public geoip databases. In particular,
  I've found that a lot of African countries are missing, but there
  are also some common ones in the US that are missing, like parts of
  Comcast. We could just lump unknown IP addresses into the "other"
  category, but it might be useful to gather a general sense of how many
  lookups are failing entirely, by adding a separate "Unknown" category.

  We could also contribute back to the geoip db, by letting bridges set
  a config option to report the actual IP addresses that failed their
  lookup. Then the bridge authority operators can manually make sure
  the correct answer will be in later geoip files. This config option
  should be disabled by default.

7.7. Bringing it all together

  So here's the plan:

  24 hours after starting up (modulo Section 7.5 above), bridges and
  relays should construct a daily summary of client countries they've
  seen, including the above "Unknown" category (Section 7.6) as well.

  Non-bridge relays lump all countries with fewer than K (e.g. K=5) users
  into the "Other" category (see Sec 7.2 above), whereas bridge relays are
  willing to list a country even when it has only one user for the day.

  Whenever we have a daily summary on record, we include it in our
  extrainfo document whenever we publish one. The daily summary we
  remember locally gets replaced with a newer one when another 24
  hours pass.
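
  The summary construction could be sketched in Python roughly as
  follows. The function name, the dict output, and the `lookup`
  callable are illustrative assumptions; in particular, the sketch
  keeps "Unknown" as its own category even when it is below K, which
  the plan above does not spell out.

```python
from collections import Counter


def daily_summary(addresses, lookup, is_bridge, k=5):
    """Build {country: unique-address count} for one 24-hour period.

    addresses: iterable of client address strings seen this period.
    lookup: callable mapping an address to a country code, or None.
    """
    counts = Counter()
    for address in set(addresses):  # count each address only once
        counts[lookup(address) or "Unknown"] += 1
    if is_bridge:
        return dict(counts)  # bridges list every country, even with 1 user
    summary = {}
    for country, n in counts.items():
        if country != "Unknown" and n < k:
            summary["Other"] = summary.get("Other", 0) + n
        else:
            summary[country] = n
    return summary
```

  The same input thus produces a coarser summary on a public relay
  than on a bridge, which matches the trade-off discussed in Section
  7.2.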

7.8. Some forward secrecy

  How should we remember addresses locally? If we convert them into
  country-codes immediately, we will count them again if we see them
  again. On the other hand, we don't really want to keep a list hanging
  around of all IP addresses we've seen in the past 24 hours.

  Step one is that we should never write this stuff to disk. Keeping it
  only in ram will make things somewhat better. Step two is to avoid
  keeping any timestamps associated with it: rather than a rolling
  24-hour window, which would require us to remember the various times
  we've seen that address, we can instead just throw out the whole list
  every 24 hours and start over.
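
  The two steps above might look roughly like the following Python
  sketch. The class and method names are hypothetical; the only
  timestamp kept is the start of the current period, and nothing is
  written to disk.

```python
import time


class SeenAddresses:
    """RAM-only set of client addresses, discarded wholesale every 24 hours."""

    WINDOW = 24 * 60 * 60  # seconds

    def __init__(self, now=None):
        self._started = time.time() if now is None else now
        self._seen = set()

    def note(self, address, now=None):
        """Record an address; return True if it is new in this period."""
        now = time.time() if now is None else now
        if now - self._started >= self.WINDOW:
            # No per-address timestamps: drop everything and start over.
            self._seen.clear()
            self._started = now
        is_new = address not in self._seen
        self._seen.add(address)
        return is_new
```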

  We could hash the addresses, and then compare hashes when deciding if
  we've seen a given address before. We could even do keyed hashes. Or
  Bloom filters. But if our goal is to defend against an adversary
  who steals a copy of our ram while we're running and then does
  guess-and-check on whatever blob we're keeping, we're in bad shape.

  We could drop the last octet of the IP address as soon as we see
  it. That would cause us to undercount some users from cablemodem and
  DSL networks that have a high density of Tor users. And it wouldn't
  really help that much -- indeed, the extent to which it does help is
  exactly the extent to which it makes our stats less useful.
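
  To make the undercounting concrete, here is a small Python sketch of
  dropping the last octet. It handles IPv4 only, and the helper name is
  made up for the example.

```python
import ipaddress


def truncate_ipv4(address):
    """Zero the last octet of an IPv4 address, keeping only its /24."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    return str(network.network_address)
```

  Two clients at 198.51.100.7 and 198.51.100.200 both become
  198.51.100.0 and are counted once, which is exactly the loss of
  precision described above.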

  Other ideas?

Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
Author: Roger Dingledine
Created: 2007-12-02
Status: Obsolete

1. Overview

  Some countries and networks block connections to the Tor website. As
  time goes by, this will remain a problem and it may even become worse.

  We have a big pile of mirrors (google for "Tor mirrors"), but few of
  our users think to try a search like that. Also, many of these mirrors
  might be automatically blocked since their pages contain words that
  might cause them to get banned. And lastly, we can imagine a future
  where the blockers are aware of the mirror list too.

  Here we describe a new set of URLs for Tor's DirPort that will relay
  connections from users to the official Tor download site. Rather than
  trying to cache a bunch of new Tor packages (which is a hassle in terms
  of keeping them up to date, and a hassle in terms of drive space used),
  we instead just proxy the requests directly to Tor's /dist page.

  Specifically, we should support

    GET /tor/dist/$1

  and

    GET /tor/website/$1
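
  A minimal Python sketch of recognizing these two URL families
  follows. The function name and return shape are hypothetical, and the
  ".." check is an extra precaution assumed for the example rather than
  something this proposal specifies.

```python
DIST_PREFIX = "/tor/dist/"
WEBSITE_PREFIX = "/tor/website/"


def classify_mirror_request(path):
    """Return (kind, remainder) for a mirror URL, or None otherwise.

    kind is "dist" or "website"; remainder is the "$1" part of the URL.
    """
    for kind, prefix in (("dist", DIST_PREFIX), ("website", WEBSITE_PREFIX)):
        if path.startswith(prefix):
            remainder = path[len(prefix):]
            # Assumption: refuse ".." so a request cannot escape the
            # mirrored tree.
            if ".." in remainder.split("/"):
                return None
            return kind, remainder
    return None
```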

2. Direct connections, one-hop circuits, or three-hop circuits?

  We could relay the connections directly to the download site -- but
  this produces recognizable outgoing traffic on the bridge or cache's
  network, which will probably surprise our nice volunteers. (Is this
  a good enough reason to discard the direct connection idea?)

  Even if we don't do direct connections, should we do a one-hop
  begindir-style connection to the mirror site (make a one-hop circuit
  to it, then send a 'begindir' cell down the circuit), or should we do
  a normal three-hop anonymized connection?

  If these mirrors are mainly bridges, doing either a direct or a one-hop
  connection creates another way to enumerate bridges. That would argue
  for three-hop. On the other hand, downloading a 10+ megabyte installer
  through a normal Tor circuit can't be fun. But if you're already getting
  throttled a lot because you're in the "relayed traffic" bucket, you're
  going to have to accept a slow transfer anyway. So three-hop it is.

  Speaking of which, we would want to label this connection
  as "relay" traffic for the purposes of rate limiting; see
  connection_counts_as_relayed_traffic() and or_conn->client_used. This
  will be a bit tricky though, because these connections will use the
  bridge's guards.

3. Scanning resistance

  One other goal we'd like to achieve, or at least not hinder, is making
  it hard to scan large swaths of the Internet to look for responses
  that indicate a bridge.

  In general this is a really hard problem, so we shouldn't demand to
  solve it here. But we can note that some bridges should open their
  DirPort (and offer this functionality), and others shouldn't. Then
  some bridges provide a download mirror while others can remain
  scanning-resistant.

4. Integrity checking

  If we serve this stuff in plaintext from the bridge, anybody in between
  the user and the bridge can intercept and modify it. The bridge can too.

  If we do an anonymized three-hop connection, the exit node can also
  intercept and modify the exe it sends back.

  Are we setting ourselves up for rogue exit relays, or rogue bridges,
  that trojan our users?

  Answer #1: Users need to do pgp signature checking. Not a very good
  answer, a) because it's complex, and b) because they don't know the
  right signing keys in the first place.

  Answer #2: The mirrors could exit from a specific Tor relay, using the
  '.exit' notation. This would make connections a bit more brittle, but
  would resolve the rogue exit relay issue. We could even round-robin
  among several, and the list could be dynamic -- for example, all the
  relays with an Authority flag that allow exits to the Tor website.

  Answer #3: The mirrors should connect to the main distribution site
  via SSL. That way the exit relay can't influence anything.

  Answer #4: We could suggest that users only use trusted bridges for
  fetching a copy of Tor. Hopefully they heard about the bridge from a
  trusted source rather than from the adversary.

  Answer #5: What if the adversary is trawling for Tor downloads by
  network signature -- either by looking for known bytes in the binary,
  or by looking for "GET /tor/dist/"? It would be nice to encrypt the
  connection from the bridge user to the bridge. And we can! The bridge
  already supports TLS. Rather than initiating a TLS renegotiation after
  connecting to the ORPort, the user should actually request a URL. Then
  the ORPort can either pass the connection off as a linked conn to the
  dirport, or renegotiate and become a Tor connection, depending on how
  the client behaves.

5. Linked connections: at what level should we proxy?

  Check out the connection_ap_make_link() function, as called from
  directory.c. Tor clients use this to create a "fake" socks connection
  back to themselves, and then they attach a directory request to it,
  so they can launch directory fetches via Tor. We can piggyback on
  this feature.

  We need to decide if we're going to be passing the bytes back and
  forth between the web browser and the main distribution site, or if
  we're going to be actually acting like a proxy (parsing out the file
  they want, fetching that file, and serving it back).

  Advantages of proxying without looking inside:
    - We don't need to build any sort of http support (including
      continues, partial fetches, etc etc).
  Disadvantages:
    - If the browser thinks it's speaking http, are there easy ways
      to pass the bytes to an https server and have everything work
      correctly? At the least, it would seem that the browser would
      complain about the cert. More generally, ssl wants to be negotiated
      before the URL and headers are sent, yet we need to read the URL
      and headers to know that this is a mirror request; so we have an
      ordering problem here.
    - Makes it harder to do caching later on, if we don't look at what
      we're relaying. (It might be useful down the road to cache the
      answers to popular requests, so we don't have to keep getting
      them again.)

6. Outstanding problems

  1) HTTP proxies already exist.  Why waste our time cloning one
  badly? When we clone existing stuff, we usually regret it.

  2) It's overbroad.  We only seem to need a secure get-a-tor feature,
  and instead we're contemplating building a locked-down HTTP proxy.

  3) It's going to add a fair bit of complexity to our code.  We do
  not currently implement HTTPS.  We'd need to refactor lots of the
  low-level connection stuff so that "SSL" and "Cell-based" were no
  longer synonymous.

  4) It's still unclear how effective this proposal would be in
  practice. You need to know that this feature exists, which means
  somebody needs to tell you about a bridge (mirror) address and tell
  you how to use it. And if they're doing that, they could (e.g.) tell
  you about a gmail autoresponder address just as easily, and then you'd
  get better authentication of the Tor program to boot.

Filename: 128-bridge-families.txt
Title: Families of private bridges
Author: Roger Dingledine
Created: 2007-12-xx
Status: Dead

1. Overview

  Proposal 125 introduced the basic notion of how bridge authorities,
  bridge relays, and bridge users should behave. But it doesn't get into
  the various mechanisms of how to distribute bridge relay addresses to
  bridge users.

  One of the mechanisms we have in mind is called 'families of bridges'.
  If a bridge user knows about only one private bridge, and that bridge
  shuts off for the night or gets a new dynamic IP address, the bridge
  user is out of luck and needs to re-bootstrap manually or wait and
  hope it comes back. On the other hand, if the bridge user knows about
  a family of bridges, then as long as one of those bridges is still
  reachable his Tor client can automatically learn about where the
  other bridges have gone.

  So in this design, a single volunteer could run multiple coordinated
  bridges, or a group of volunteers could each run a bridge. We abstract
  out the details of how these volunteers find each other and decide to
  set up a family.

2. Other notes.

  somebody needs to run a bridge authority

  it needs to have a torrc option to publish networkstatuses of its bridges

  it should also do reachability testing just of those bridges

  people ask for the bridge networkstatus by asking for a url that
  contains a password. (it's safe to do this because of begin_dir.)

  so the bridge users need to know a) a password, and b) a bridge
  authority line.

  the bridge users need to know the bridge authority line.

  the bridge authority needs to know the password.

3. Current state

  I implemented a BridgePassword config option. Bridge authorities
  should set it, and users who want to use those bridge authorities
  should set it.

  Now there is a new directory URL "/tor/networkstatus-bridges" that
  directory mirrors serve if BridgeAuthoritativeDir is set and it's a
  begin_dir connection. It looks for the header
    Authorization: Basic %s
  where %s is the base-64 bridge password.
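
  For illustration, the header could be built and checked as in the
  Python sketch below. The helper names are hypothetical. Note that
  this follows the text above and encodes the bare password; ordinary
  HTTP Basic authentication encodes "user:password" instead.

```python
import base64
import hmac


def bridge_auth_header(password):
    """Build the header line described above for a bridge password."""
    token = base64.b64encode(password.encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def header_matches(header_line, password):
    """Compare a received header line with the configured password."""
    expected = bridge_auth_header(password).encode("ascii")
    received = header_line.strip().encode("utf-8")
    # Constant-time comparison, so timing does not leak the password.
    return hmac.compare_digest(received, expected)
```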

  I never got around to teaching clients how to set the header though,
  so it may or may not work, and may or may not do what we ultimately want.

  I've marked this proposal dead; it really never should have left the
  ideas/ directory. Somebody should pick it up sometime and finish the
  design and implementation.

Filename: 129-reject-plaintext-ports.txt
Title: Block Insecure Protocols by Default
Author: Kevin Bauer & Damon McCoy
Created: 2008-01-15
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  Below is a proposal to mitigate insecure protocol use over Tor.

  This document 1) demonstrates the extent to which insecure protocols are
  currently used within the Tor network, and 2) proposes a simple solution
  to prevent users from unknowingly using these insecure protocols. By
  insecure, we mean protocols that explicitly leak sensitive user names
  and/or passwords, such as POP, IMAP, Telnet, and FTP.

Motivation:

  As part of a general study of Tor use in 2006/2007 [1], we attempted to
  understand what types of protocols are used over Tor. While we observed an
  enormous volume of Web and Peer-to-peer traffic, we were surprised by the
  number of insecure protocols that were used over Tor. For example, over an
  8 day observation period, we observed the following number of connections
  over insecure protocols:

    POP and IMAP: 10,326 connections
    Telnet: 8,401 connections
    FTP: 3,788 connections

  Each of the above listed protocols exchanges user name and password
  information in plain-text. As an upper bound, we could have observed
  22,515 user names and passwords. This observation echoes the reports of
  a Tor router logging and posting e-mail passwords in August 2007 [2]. The
  response from the Tor community has been to further educate users
  about the dangers of using insecure protocols over Tor. However, we
  recently repeated our Tor usage study from last year and noticed that the
  trend in insecure protocol use has not declined. Therefore, we propose that
  additional steps be taken to protect naive Tor users from inadvertently
  exposing their identities (and even passwords) over Tor.

Security Implications:

  This proposal is intended to improve Tor's security by limiting the
  use of insecure protocols.

  Roger added: By adding these warnings for only some of the risky
  behavior, users may do other risky behavior, not get a warning, and
  believe that it is therefore safe. But overall, I think it's better
  to warn for some of it than to warn for none of it.

Specification:

  As an initial step towards mitigating the use of the above-mentioned
  insecure protocols, we propose that the default ports for each respective
  insecure service be blocked at the Tor client's socks proxy. These default
  ports include:

    23 - Telnet
    109 - POP2
    110 - POP3
    143 - IMAP

  Notice that FTP is not included in the proposed list of ports to block. This
  is because FTP is often used anonymously, i.e., without any identifying
  user name or password.

  This blocking scheme can be implemented as a set of flags in the client's
  torrc configuration file:

    BlockInsecureProtocols 0|1
    WarnInsecureProtocols 0|1

  When the warning flag is activated, a message should be displayed to
  the user similar to the message given when Tor's socks proxy is given an IP
  address rather than resolving a host name.

  We recommend that the default torrc configuration file block insecure
  protocols and provide a warning to the user to explain the behavior.
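
  The intended behavior of the two flags can be sketched in Python as
  follows. This is only an illustration of the decision, not Tor's
  implementation; the function name and the warning wording are
  hypothetical, while the port list is the one proposed above.

```python
INSECURE_PORTS = {23: "Telnet", 109: "POP2", 110: "POP3", 143: "IMAP"}


def check_socks_request(port, block_insecure=True, warn_insecure=True):
    """Return (allowed, warning) for a SOCKS request to the given port.

    block_insecure and warn_insecure stand in for the
    BlockInsecureProtocols and WarnInsecureProtocols torrc flags.
    """
    protocol = INSECURE_PORTS.get(port)
    if protocol is None:
        return True, None
    warning = None
    if warn_insecure:
        warning = (f"Port {port} ({protocol}) usually sends user names "
                   "and passwords in plain text.")
    return (not block_insecure), warning
```

  With both flags at their recommended defaults, a request to port 110
  is refused with a warning, while a request to port 21 (FTP) passes
  through untouched.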

  Finally, there are many popular web pages that do not offer secure
  login features, such as MySpace, and it would be prudent to provide
  additional rules to Privoxy to attempt to protect users from unknowingly
  submitting their login credentials in plain-text.

Compatibility:

  None, as the proposed changes are to be implemented in the client.

References:

  [1] Shining Light in Dark Places: A Study of Anonymous Network Usage.
      University of Colorado Technical Report CU-CS-1032-07. August 2007.

  [2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise.
      http://www.wired.com/politics/security/news/2007/09/embassy_hacks.
      Wired. September 10, 2007.

Implementation:

  Roger added this feature in
  http://archives.seul.org/or/cvs/Jan-2008/msg00182.html
  He also added a status event for Vidalia to recognize attempts to use
  vulnerable-plaintext ports, so it can help the user understand what's
  going on and how to fix it.

Next steps:

  a) Vidalia should learn to recognize this controller status event,
  so we don't leave users out in the cold when we enable this feature.

  b) We should decide which ports to reject by default. The current
  consensus is 23,109,110,143 -- the same set that we warn for now.

Filename: 130-v2-conn-protocol.txt
Title: Version 2 Tor connection protocol
Author: Nick Mathewson
Created: 2007-10-25
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This proposal describes the significant changes to be made in the v2
  Tor connection protocol.

  This proposal relates to other proposals as follows:

    It refers to and supersedes:
       Proposal 124: Blocking resistant TLS certificate usage
    It refers to aspects of:
       Proposal 105: Version negotiation for the Tor protocol


  In summary, the Tor connection protocol has been in need of a redesign
  for a while.  This proposal describes how we can add to the Tor
  protocol:

     - A new TLS handshake (to achieve blocking resistance without
       breaking backward compatibility)
     - Version negotiation (so that future connection protocol changes
       can happen without breaking compatibility)
     - The actual changes in the v2 Tor connection protocol.

Motivation:

  For motivation, see proposal 124.

Proposal:

0. Terminology

  The version of the Tor connection protocol implemented up to now is
  "version 1".  This proposal describes "version 2".

  "Old" or "Older" versions of Tor are ones not aware that version 2
  of this protocol exists;
  "New" or "Newer" versions are ones that are.

  The connection initiator is referred to below as the Client; the
  connection responder is referred to below as the Server.

1. The revised TLS handshake.

  For motivation, see proposal 124.  This is a simplified version of the
  handshake that uses TLS's renegotiation capability in order to avoid
  some of the extraneous steps in proposal 124.

  The Client connects to the Server and, as in ordinary TLS, sends a
  list of ciphers.  Older versions of Tor will send only ciphers from
  the list:
    TLS_DHE_RSA_WITH_AES_256_CBC_SHA
    TLS_DHE_RSA_WITH_AES_128_CBC_SHA
    SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
    SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
  Clients that support the revised handshake will send the recommended
  list of ciphers from proposal 124, in order to emulate the behavior of
  a web browser.

  If the server notices that the list of ciphers contains only ciphers
  from this list, it proceeds with Tor's version 1 TLS handshake as
  documented in tor-spec.txt.

  (The server may also notice cipher lists used by other implementations
  of the Tor protocol (in particular, the BouncyCastle default cipher
  list as used by some Java-based implementations), and whitelist them.)

  On the other hand, if the server sees a list of ciphers that could not
  have been sent from an older implementation (because it includes other
  ciphers, and does not match any known-old list), the server sends a
  reply containing a single connection certificate, constructed as for
  the link certificate in the v1 Tor protocol.  The subject names in
  this certificate SHOULD NOT have any strings to identify them as
  coming from a Tor server.  The server does not ask the client for
  certificates.

  Old Servers will (mostly) ignore the cipher list and respond as in the v1
  protocol, sending back a two-certificate chain.

  After the Client gets a response from the server, it checks for the
  number of certificates it received.  If there are two certificates,
  the client assumes a V1 connection and proceeds as in tor-spec.txt.
  But if there is only one certificate, the client assumes a V2 or later
  protocol and continues.
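
  The decisions described so far in this section can be summarized in
  a short Python sketch. It models only the choice of handshake, not
  TLS itself; the function names and the known_old_lists parameter are
  hypothetical.

```python
V1_CIPHERS = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
})


def server_handshake_version(client_ciphers, known_old_lists=()):
    """Server side: return 1 for the v1 handshake, 2 for the revised one."""
    offered = frozenset(client_ciphers)
    if offered <= V1_CIPHERS:
        return 1
    # Whitelisted lists from other old implementations also get v1.
    if any(offered == frozenset(old) for old in known_old_lists):
        return 1
    return 2


def client_handshake_version(num_server_certs):
    """Client side: two certificates mean v1, one means v2 or later."""
    if num_server_certs == 2:
        return 1
    if num_server_certs == 1:
        return 2
    raise ValueError("unexpected number of certificates")
```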

  At this point, the client has established a TLS connection with the
  server, but the parties have not been authenticated: the server hasn't
  sent its identity certificate, and the client hasn't sent any
  certificates at all.  To fix this, the client begins a TLS session
  renegotiation.  This time, the server continues with two certificates
  as usual, and asks for certificates so that the client will send
  certificates of its own.  Because the TLS connection has been
  established, all of this is encrypted.  (The certificate sent by the
  server in the renegotiated connection need not be the same as that
  sent in the original connection.)

  The server MUST NOT write any data until the client has renegotiated.

  Once the renegotiation is finished, the server and client check one
  another's certificates as in V1.  Now they are mutually authenticated.

1.1. Revised TLS handshake: implementation notes.

  It isn't so easy to adjust server behavior based on the client's
  ciphersuite list.  Here's how we can do it using OpenSSL.  This is a
  bit of an abuse of the OpenSSL APIs, but it's the best we can do, and
  we won't have to do it forever.

  We can use OpenSSL's SSL_set_info_callback() to register a function to
  be called when the state changes.  The type/state tuple of
     SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A
  happens when we have completely parsed the client hello, and are about
  to send a response.  From this callback, we can check the cipherlist
  and act accordingly:

     * If the ciphersuite list indicates a v1 protocol, we set the
       verify mode to SSL_VERIFY_PEER with a callback (so we get
       certificates).

     * If the ciphersuite list indicates a v2 protocol, we set the
       verify mode to SSL_VERIFY_NONE with no callback (so we get
       no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that
       we send only 1 certificate in the response).

  Once the handshake is done, the server clears the
  SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1
  protocol.  It then starts reading.

  The other problem to take care of is missing ciphers and OpenSSL's
  cipher sorting algorithms. The two main issues are a) OpenSSL doesn't
  support some of the default ciphers that Firefox advertises, and b)
  OpenSSL sorts the list of ciphers it offers in a different way than
  Firefox sorts them, so unless we fix that Tor will still look different
  than Firefox.
  [XXXX more on this.]


1.2. Compatibility for clients using libraries less hackable than OpenSSL.

  As discussed in proposal 105, servers advertise which protocol
  versions they support in their router descriptors.  Clients can simply
  behave as v1 clients when connecting to servers that do not support
  link version 2 or higher, and as v2 clients when connecting to servers
  that do support link version 2 or higher.

  (Servers can't use this strategy because we do not assume that servers
  know one another's capabilities when connecting.)

2. Version negotiation.

  Version negotiation proceeds as described in proposal 105, except as
  follows:

   * Version negotiation only happens if the TLS handshake as described
     above completes.

   * The TLS renegotiation must be finished before the client sends a
     VERSIONS cell; the server sends its VERSIONS cell in response.

   * The VERSIONS cell uses the following variable-width format:
         Circuit  [2 octets; set to 0]
         Command  [1 octet; set to 7 for VERSIONS]
         Length   [2 octets; big-endian]
         Data     [Length bytes]

     The Data in the cell is a series of big-endian two-byte integers.

   * It is not allowed to negotiate V1 connections once the v2 protocol
     has been used.  If this happens, Tor instances should close the
     connection.
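
  The variable-width VERSIONS cell layout given above can be exercised
  with a small Python sketch. The function names are hypothetical; the
  field sizes, the command value 7, and the big-endian encoding come
  from the format above.

```python
import struct

VERSIONS_COMMAND = 7


def pack_versions_cell(versions):
    """Encode a VERSIONS cell: Circuit=0, Command=7, Length, Data."""
    data = b"".join(struct.pack(">H", v) for v in versions)
    return struct.pack(">HBH", 0, VERSIONS_COMMAND, len(data)) + data


def unpack_versions_cell(cell):
    """Decode a VERSIONS cell into a list of version numbers."""
    circ_id, command, length = struct.unpack(">HBH", cell[:5])
    if circ_id != 0 or command != VERSIONS_COMMAND:
        raise ValueError("not a VERSIONS cell")
    data = cell[5:5 + length]
    if len(data) != length or length % 2:
        raise ValueError("truncated or odd-length VERSIONS payload")
    # Data is a series of big-endian two-byte integers.
    return [version for (version,) in struct.iter_unpack(">H", data)]
```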

3. The rest of the "v2" protocol

   Once a v2 protocol has been negotiated, NETINFO cells are exchanged
   as in proposal 105, and communications begin as per tor-spec.txt.
   Until NETINFO cells have been exchanged, the connection is not open.


Filename: 131-verify-tor-usage.txt
Title: Help users to verify they are using Tor
Author: Steven J. Murdoch
Created: 2008-01-25
Status: Obsolete

Overview:

  Websites for checking whether a user is accessing them via Tor are a
  very helpful aid to configuring web browsers correctly. Existing
  solutions have both false positives and false negatives when
  checking if Tor is being used. This proposal will discuss how to
  modify Tor so as to make testing more reliable.

Motivation:

  Currently deployed websites for detecting Tor use work by comparing
  the client IP address for a request with a list of known Tor nodes.
  This approach is generally effective, but suffers from both false
  positives and false negatives. 

  If a user has a Tor exit node installed, or just happens to have
  been allocated an IP address previously used by a Tor exit node, any
  web requests will be incorrectly flagged as coming from Tor. If any
  customer of an ISP which implements a transparent proxy runs an exit
  node, all other users of the ISP will be flagged as Tor users.

  Conversely, if the exit node chosen by a Tor user has not yet been
  recorded by the Tor checking website, requests will be incorrectly
  flagged as not coming via Tor.
  
  The only reliable way to tell whether Tor is being used or not is for
  the Tor client to flag this to the browser.

Proposal:

  A DNS name should be registered and point to an IP address 
  controlled by the Tor project and likely to remain so for the
  useful lifetime of a Tor client. A web server should be placed
  at this IP address.
  
  Tor should be modified to treat requests to port 80, at the
  specified DNS name or IP address specially. Instead of opening a
  circuit, it should respond to a HTTP request with a helpful web
  page:

  - If the request to open a connection was to the domain name, the web
    page should state that Tor is working properly.
  - If the request was to the IP address, the web page should state
    that there is a DNS-leakage vulnerability.

  If the request goes through to the real web server, the page
  should state that Tor has not been set up properly.
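
  The interception rule might be sketched in Python as below. The
  hostname and address are placeholders invented for the example (the
  proposal does not name them), and the function name is hypothetical.

```python
CHECK_HOSTNAME = "check.example.org"  # hypothetical DNS name
CHECK_ADDRESS = "192.0.2.80"          # hypothetical IP address


def intercepted_page(target_host, target_port):
    """Return which page Tor serves itself, or None to open a circuit."""
    if target_port != 80:
        return None
    if target_host.lower() == CHECK_HOSTNAME:
        return "success"   # the browser handed Tor the name: all is well
    if target_host == CHECK_ADDRESS:
        return "dnsleak"   # the browser resolved the name outside Tor
    return None
```

  A request that never reaches Tor at all arrives at the real web
  server, which serves the "not set up properly" page.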

Extensions:

  Identifying proxy server:

  If needed, other applications between the web browser and Tor (e.g.
  Polipo and Privoxy) could piggyback on the same mechanism to flag
  whether they are in use. All three possible web pages should include
  a machine-readable placeholder, into which another program could
  insert their own message.

  For example, the webpage returned by Tor to indicate a successful
  configuration could include the following HTML:
   <h2>Connection chain</h2>
   <ul>
   <li>Tor 0.1.2.14-alpha</li>
   <!-- Tor Connectivity Check: success -->
   </ul>

  When the proxy server observes this string, in response to a request
  for the Tor connectivity check web page, it would prepend its own
  message, resulting in the following being returned to the web
  browser:
   <h2>Connection chain</h2>
   <ul>
   <li>Tor 0.1.2.14-alpha</li>
   <li>Polipo version 1.0.4</li>
   <!-- Tor Connectivity Check: success -->
   </ul>
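
  A proxy could perform this rewrite along the lines of the following
  Python sketch. The function name is hypothetical, and only the
  "success" placeholder from the example above is handled.

```python
import html

MARKER = "<!-- Tor Connectivity Check: success -->"


def add_proxy_entry(page, proxy_name):
    """Insert a list item for this proxy just before the placeholder."""
    if MARKER not in page:
        return page  # not the check page; pass it through unchanged
    entry = f"<li>{html.escape(proxy_name)}</li>\n   "
    return page.replace(MARKER, entry + MARKER, 1)
```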

  Checking external connectivity:

  If Tor intercepts a request, and returns a response itself, the user
  will not actually confirm whether Tor is able to build a successful
  circuit. It may then be advantageous to include an image in the web
  page which is loaded from a different domain. If this is able to be
  loaded then the user will know that external connectivity through
  Tor works.

  Automatic Firefox Notification:

  All forms of the website should return valid XHTML and have a
  hidden link with an id attribute "TorCheckResult" and a target
  property that can be queried to determine the result. For example,   
  a hidden link would convey success like this: 

  <a id="TorCheckResult" target="success" href="/"></a>

  failure like this:

  <a id="TorCheckResult" target="failure" href="/"></a>

  and DNS leaks like this:

  <a id="TorCheckResult" target="dnsleak" href="/"></a>

  Firefox extensions such as Torbutton would then be able to 
  issue an XMLHttpRequest for the page and query the result
  with resultXML.getElementById("TorCheckResult").target
  to automatically report the Tor status to the user when
  they first attempt to enable Tor activity, or whenever
  they request a check from the extension preferences window.
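
  The same lookup can be illustrated outside the browser. The Python
  sketch below extracts the target attribute of the "TorCheckResult"
  link from a page; it stands in for the XMLHttpRequest logic an
  extension would use, and the class and function names are
  hypothetical.

```python
from html.parser import HTMLParser


class _CheckResultParser(HTMLParser):
    """Remember the target attribute of the TorCheckResult link."""

    def __init__(self):
        super().__init__()
        self.result = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("id") == "TorCheckResult":
            self.result = attributes.get("target")


def tor_check_result(page):
    """Return "success", "failure", "dnsleak", or None if absent."""
    parser = _CheckResultParser()
    parser.feed(page)
    return parser.result
```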

  If the check website is to be themed with heavy graphics and/or
  extensive documentation, the check result itself should be
  contained in a separate lightweight iframe that extensions can
  request via an alternate url.

Security and resiliency implications:

  What attacks are possible?

  If the IP address used for this feature moves there will be two
  consequences:
   - A new website at this IP address will remain inaccessible over
     Tor
   - Tor users who are leaking DNS will be informed that Tor is not
     working, rather than that it is active but leaking DNS
  We should thus attempt to find an IP address which we reasonably
  believe can remain static.

Open issues:

  If a Tor version which does not support this extra feature is used,
  the webpage returned will indicate that Tor is not being used. Can
  this be safely fixed?

Related work:

  The proposed mechanism is very similar to config.privoxy.org. The
  most significant difference is that if the web browser is
  misconfigured, Tor will only get an IP address. Even in this case,
  Tor should be able to respond with a webpage to notify the user of how
  to fix the problem. This also implies that Tor must be told of the
  special IP address, and so that address must be effectively permanent.

Filename: 132-browser-check-tor-service.txt
Title: A Tor Web Service For Verifying Correct Browser Configuration
Author: Robert Hogan
Created: 2008-03-08
Status: Obsolete

Overview:

  Tor should operate a primitive web service on the loopback network device
  that tests the operation of the user's browser, privacy proxy and Tor client.
  The tests are performed by serving unique, randomly generated elements in
  image URLs embedded in static HTML. The images are only displayed if the DNS
  and HTTP requests for them are routed through Tor, otherwise the 'alt' text
  may be displayed. The proposal assumes that 'alt' text is not displayed on
  all browsers so suggests that text and links should accompany each image
  advising the user on next steps in case the test fails.

  The service is primarily for the use of controllers, since presumably users
  aren't going to want to edit text files and then type something exotic like
  127.0.0.1:9999 into their address bar. In the main use case the controller
  will have configured the actual port for the webservice so will know where
  to direct the request. It would also be the responsibility of the controller
  to ensure the webservice is available, and tor is running, before allowing
  the user to access the page through their browser.

Motivation:

  This is a complementary approach to proposal 131. It overcomes some of the
  limitations of the approach described in proposal 131: reliance
  on a permanent, real IP address and compatibility with older versions of
  Tor. Unlike 131, it is not as useful to Tor users who are not running a
  controller.

Objective:

  Provide a reliable means of helping users to determine if their Tor
  installation, privacy proxy and browser are properly configured for
  anonymous browsing.

Proposal:

  When configured to do so, Tor should run a basic web service available
  on a configured port on 127.0.0.1. The purpose of this web service is to
  serve a number of basic test images that will allow the user to determine
  if their browser is properly configured and that Tor is working normally.

  The service can consist of a single web page with two columns. The left
  column contains images, the right column contains advice on what the
  display/non-display of each image means.

  The rest of this proposal assumes that the service is running on port
  9999. The port should be configurable, and configuring the port enables the
  service. The service must run on 127.0.0.1.

  In all the examples below [uniquesessionid] refers to a random, base64
  encoded string that is unique to the URL it is contained in. Tor only ever
  stores the most recently generated [uniquesessionid] for each URL, storing 3
  in total. Tor should generate a [uniquesessionid] for each of the test URLs
  below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm.
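
  One way to manage the three identifiers is sketched below in Python.
  The class name and test labels are hypothetical. The sketch also
  assumes the URL-safe base64 alphabet without padding, since "+", "/"
  and "=" would be awkward inside a hostname or file name; the proposal
  itself only says "base64".

```python
import base64
import secrets

TEST_NAMES = ("dns", "proxy", "connectivity")  # hypothetical labels


class SessionIds:
    """Keep only the most recent [uniquesessionid] for each test URL."""

    def __init__(self):
        self._current = {}

    def regenerate(self):
        """Call on each GET for index.htm; replaces all three ids."""
        for name in TEST_NAMES:
            raw = secrets.token_bytes(16)
            encoded = base64.urlsafe_b64encode(raw).decode("ascii")
            self._current[name] = encoded.rstrip("=")
        return dict(self._current)

    def matches(self, name, candidate):
        """True only if candidate is the current id for this test."""
        expected = self._current.get(name)
        if expected is None or not candidate.isascii():
            return False
        return secrets.compare_digest(expected, candidate)
```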

  The most suitable image for each test case is an implementation decision.
  Tor will need to store and serve images for the first and second test
  images, and possibly the third (see 'Open Issues').

  1. DNS Request Test Image
  
  This is a HTML element embedded in the page served by Tor at
  http://127.0.0.1:9999:

  <IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see
  this text, your browser's DNS requests are not being routed through Tor."
  width="200" height="200" align="middle" border="2">

  If the browser's DNS request for [uniquesessionid] is routed through Tor,
  Tor will intercept the request and return 127.0.0.1 as the resolved IP
  address. This will shortly be followed by a HTTP request from the browser
  for http://127.0.0.1:9999/torlogo.jpg. This request should be served with
  the appropriate image.

  If the browser's DNS request for [uniquesessionid] is not routed through Tor
  the browser may display the 'alt' text specified in the html element. The
  HTML served by Tor should also contain text accompanying the image to advise
  users what it means if they do not see an image. It should also provide a
  link to click that provides information on how to remedy the problem. This
  behaviour also applies to the images described in 2. and 3. below, so should
  be assumed there as well.


  2. Proxy Configuration Test Image

  This is a HTML element embedded in the page served by Tor at
  http://127.0.0.1:9999:

  <IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see
  this text, your browser is not configured to work with Tor." width="200"
  height="200" align="middle" border="2">

  If the HTTP request for the resource [uniquesessionid].jpg is received by
  Tor it will serve the appropriate image in response. It should serve this
  image itself, without attempting to retrieve anything from the Internet.

  If Tor can identify the name of the proxy application requesting the
  resource then it could store and serve an image identifying the proxy to the
  user.

  3. Tor Connectivity Test Image

  This is a HTML element embedded in the page served by Tor at
  http://127.0.0.1:9999:

  <IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you
  can see this text, your Tor installation cannot connect to the Internet."
  width="200" height="200" align="middle" border="2">

  The referenced image should actually exist on the Tor project website. If
  Tor receives the request for the above resource it should remove the random
  base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt
  to retrieve the real image.
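
  A minimal sketch of the rewrite step, assuming the id is matched as a
  literal path prefix. The function name and the decision to return None on
  a mismatch are illustrative choices, not something the proposal specifies.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_session_id(url, session_id):
    """Return the URL of the real image, or None if the id does not match."""
    parts = urlsplit(url)
    prefix = "/" + session_id + "-"
    if not parts.path.startswith(prefix):
        return None
    # Drop "[uniquesessionid]-" and keep the rest of the path unchanged.
    real_path = "/" + parts.path[len(prefix):]
    return urlunsplit(
        (parts.scheme, parts.netloc, real_path, parts.query, parts.fragment)
    )
```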

  Even on a fully operational Tor client this test may not always succeed. The
  user should be advised that one or more attempts to retrieve this image may
  be necessary to confirm a genuine problem.

Open Issues:

  The final connectivity test relies on an externally maintained resource; if
  this resource becomes unavailable the connectivity test will always fail.
  Either the text accompanying the test should advise of this possibility or
  Tor clients should be advised of the location of the test resource in the
  main network directory listings.

  Any number of misconfigurations may make the web service unreachable; it is
  the responsibility of the user's controller to recognize these and assist
  the user in eliminating them. Tor can mitigate the specific
  misconfiguration of routing HTTP traffic destined for 127.0.0.1 to Tor
  itself by serving such requests through the SOCKS port as well as the
  configured web service port.

  Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping'
  them. It already inspects for raw IP addresses (to warn of DNS leaks) but
  maybe the behaviour proposed here is qualitatively different. Maybe this is
  an unwelcome precedent that can be used to beat the project over the head in
  future. Or maybe it's not such a bad thing, Tor is merely attempting to make
  normally invalid resource requests valid for a given purpose.

Filename: 133-unreachable-ors.txt
Title: Incorporate Unreachable ORs into the Tor Network
Author: Robert Hogan
Created: 2008-03-08
Status: Reserve

Overview:

  Propose a scheme for harnessing the bandwidth of ORs who cannot currently
  participate in the Tor network because they can only make outbound
  TCP connections.

Motivation: 

  Restrictive local and remote firewalls are preventing many willing
  candidates from becoming ORs on the Tor network. These candidates have
  a casual interest in joining the network, but their operators are not
  sufficiently motivated or adept to complete the necessary router or firewall
  configuration. The Tor network is losing out on their bandwidth. At the
  moment we don't even know how many such 'candidate' ORs there are.


Objective:

  1. Establish how many ORs are unable to qualify for publication because
     they cannot establish that their ORPort is reachable.

  2. Devise a method for making such ORs available to clients for circuit
     building without prejudicing their anonymity.

Proposal:

  ORs whose ORPort reachability testing fails a specified number of
  consecutive times should:
  1. Enlist themselves with the authorities, setting a 'Fallback' flag. This
      flag indicates that the OR is up and running but cannot connect to
      itself.
  2. Open an orconn with all ORs whose fingerprint begins with the same
      byte as their own. The management of this orconn will be transferred
      entirely to the OR at the other end.
  3. Update their router status to contain the 'Running' flag once they
      have managed to open an orconn with 3/4 of the ORs with an FP
      beginning with the same byte as their own.

  Tor ORs who are contacted by fallback ORs requesting an orconn should:
   1. Accept the orconn until they have reached a defined limit of orconn
      connections with fallback ORs.
   2. Only accept such orconn requests from listed fallback ORs who
      have an FP beginning with the same byte as their own.
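
  The relay-side rules above can be sketched as follows. This is a rough
  illustration under stated assumptions: fingerprints are hex strings, "3/4
  of the ORs" is read as "at least 3/4", a fallback OR with no matching peers
  never claims 'Running', and all function names are hypothetical.

```python
def first_byte(fingerprint_hex):
    """First byte of a hex-encoded fingerprint, as an int in 0..255."""
    return int(fingerprint_hex[:2], 16)


def peers_for(fallback_fp, published_fps):
    """Published ORs whose fingerprint shares the fallback OR's first byte."""
    wanted = first_byte(fallback_fp)
    return [fp for fp in published_fps if first_byte(fp) == wanted]


def may_claim_running(fallback_fp, published_fps, connected_fps):
    """True once orconns are open to at least 3/4 of the matching ORs."""
    peers = peers_for(fallback_fp, published_fps)
    if not peers:
        return False  # assumption: nothing to connect to means not Running
    connected = sum(1 for fp in peers if fp in connected_fps)
    return 4 * connected >= 3 * len(peers)


def should_accept(own_fp, fallback_fp, listed_fallbacks, current_count, limit):
    """Published OR's decision on an orconn request from a fallback OR."""
    return (
        fallback_fp in listed_fallbacks
        and first_byte(fallback_fp) == first_byte(own_fp)
        and current_count < limit
    )
```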

  Tor clients can include fallback ORs in the network by doing the
  following:
   1. When building a circuit, observe the fingerprint of each node they
      wish to connect to.
   2. When randomly selecting a node from the set of all eligible nodes,
      add all published, running fallback nodes to the set where the first
      byte of the fingerprint matches the previous node in the circuit.
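
  A sketch of the client-side selection step, for illustration only.
  Fingerprints are assumed to be hex strings, the function names are
  hypothetical, and the uniform random choice is a simplification that
  ignores any weighting the client would normally apply.

```python
import random


def candidates_for_next_hop(eligible, fallbacks, previous_fp):
    """Eligible nodes plus fallback nodes matching the previous hop.

    eligible:    fingerprints of nodes eligible under the usual rules
    fallbacks:   fingerprints of published, running fallback nodes
    previous_fp: fingerprint of the previous node in the circuit
    """
    prev = previous_fp[:2].upper()  # two hex digits = first byte
    extra = [fp for fp in fallbacks if fp[:2].upper() == prev]
    return list(eligible) + extra


def pick_next_hop(eligible, fallbacks, previous_fp, rng=random):
    """Pick uniformly from the extended candidate set (an assumption)."""
    return rng.choice(candidates_for_next_hop(eligible, fallbacks, previous_fp))
```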

Anonymity Implications:

  At least some, and possibly all, nodes on the network will have a set
  of nodes that only they and a few others can build circuits on.

    1. This means that fallback ORs might be unsuitable for use as middlemen
       nodes, because if the exit node is the attacker it knows that the
       number of nodes that could be the entry guard in the circuit is
       reduced to roughly 1/256th of the network, or worse 1/256th of all
       nodes listed as Guards. For the same reason, fallback nodes would
       appear to be unsuitable for two-hop circuits.

    2. This is not a problem if fallback ORs are always exit nodes. If
       the fallback OR is an attacker it will not be able to reduce the
       set of possible nodes for the entry guard any further than a normal,
       published OR.

Possible Attacks/Open Issues:

  1. Gaming Node Selection
    Does running a fallback OR customized for a specific set of published ORs
    improve an attacker's chances of seeing traffic from that set of published
    ORs? Would such a strategy be any more effective than running published
    ORs with other 'attractive' properties?

  2. DOS Attack
    An attacker could prevent all other legitimate fallback ORs with a
    given byte-1 in their FP from functioning by running 20 or 30 fallback ORs
    and monopolizing all available fallback slots on the published ORs. 
    This same attacker would then be in a position to monopolize all the
    traffic of the fallback ORs on that byte-1 network segment. I'm not sure
    what this would allow such an attacker to do.

  3. Circuit-Sniffing
    An observer watching exit traffic from a fallback server will know that the
    previous node in the circuit is one of a very small, identifiable
    subset of the total ORs in the network. To establish the full path of the
    circuit they would only have to watch the exit traffic from the fallback
    OR and all the traffic from the 20 or 30 ORs it is likely to be connected
    to. This means it is substantially easier to establish all members of a
    circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e.
    1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560
    or so ORs on the network). The same mechanism that allows the client to
    expect a specific fallback OR to be available from a specific published OR
    allows an attacker to prepare his ground.

    Mitigant:
    In terms of the resources and access required to monitor 2000 to 3000
    nodes, the effort of the adversary is not significantly diminished when he
    is only interested in 20 or 30. It is hard to see how an adversary who can
    obtain access to a randomly selected portion of the Tor network would face
    any new or qualitatively different obstacles in attempting to access much
    of the rest of it.


Implementation Issues:

  The number of ORs this proposal would add to the Tor network is not known.
  This is because there is no mechanism at present for recording unsuccessful
  attempts to become an OR. If the proposal is considered promising it may be
  worthwhile to issue an alpha series release where candidate ORs post a
  primitive fallback descriptor to the authority directories. This fallback
  descriptor would not contain any other flag that would make it eligible for
  selection by clients. It would act solely as a means of sizing the number of
  Tor instances that try and fail to become ORs.

  The upper limit on the number of orconns from fallback ORs a normal,
  published OR should be willing to accept is an open question. Is one
  hundred, mostly idle, such orconns too onerous?

Filename: 134-robust-voting.txt
Title: More robust consensus voting with diverse authority sets
Author: Peter Palfrader
Created: 2008-04-01
Status: Rejected

History:
  2009 May 27: Added note on rejecting this proposal -- Nick

Overview:

  A means to arrive at a valid directory consensus even when voters
  disagree on who is an authority.


Motivation:

  Right now there are about five authoritative directory servers in the
  Tor network, tho this number is expected to rise to about 15 eventually.

  Adding a new authority requires synchronized action from all operators of
  directory authorities so that at any time during the update at least half of
  all authorities are running and agree on who is an authority.  The latter
  requirement is there so that the authorities can arrive at a common
  consensus:  Each authority builds the consensus based on the votes from
  all authorities it recognizes, and so a different set of recognized
  authorities will lead to a different consensus document.


Objective:

  The modified voting procedure outlined in this proposal obsoletes the
  requirement for most authorities to exactly agree on the list of
  authorities.


Proposal:

  The vote document each authority generates contains a list of 
  authorities recognized by the generating authority.  This will be 
  a list of authority identity fingerprints.

  Authorities will accept votes from, and serve/mirror votes for,
  authorities they do not recognize.  (Votes contain the signing key,
  the authority key, and the certificate linking them, so they can be
  verified even without knowing the authority beforehand.)

  Before building the consensus we will check which votes to use for
  building:

   1) We build a directed graph of which authority/vote recognizes
      whom.
   2) (Parts of the graph that aren't reachable, directly or
      indirectly, from any authorities we recognize can be discarded
      immediately.)
   3) We find the largest fully connected subgraph.
      (Should there be more than one subgraph of the same size there
      needs to be some arbitrary ordering so we always pick the same.
      E.g. pick the one who has the smaller (XOR of all votes' digests)
      or something.)
   4) If we are part of that subgraph, great.  This is the list of 
      votes we build our consensus with.
   5) If we are not part of that subgraph, remove all the nodes that
      are part of it and go to 3.
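
  The five steps above can be sketched as follows. This is an illustration
  under several assumptions: "fully connected" is read as mutual
  recognition, the search is brute force (adequate for a handful of
  authorities), and ties are broken by picking the lexicographically
  smallest set of names rather than by the XOR of vote digests suggested in
  step 3. All names are hypothetical.

```python
from itertools import combinations


def reachable_from(recognizes, roots):
    """Step 2: authorities reachable from the ones we recognize."""
    seen, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node in seen or node not in recognizes:
            continue  # already visited, or we hold no vote from this node
        seen.add(node)
        stack.extend(recognizes[node])
    return seen


def largest_clique(recognizes, nodes):
    """Step 3: largest set whose members all recognize each other."""
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        cliques = [
            c for c in combinations(ordered, size)
            if all(b in recognizes[a] and a in recognizes[b]
                   for a, b in combinations(c, 2))
        ]
        if cliques:
            return set(min(cliques))  # arbitrary but deterministic tie-break
    return set()


def votes_to_use(recognizes, me):
    """Steps 1-5: recognizes maps each voter to the set it recognizes."""
    nodes = reachable_from(recognizes, recognizes[me] | {me})
    while nodes:
        clique = largest_clique(recognizes, nodes)
        if me in clique:
            return clique      # step 4
        nodes -= clique        # step 5, then back to step 3
    return set()
```

  For example, if A, B and C recognize each other, A has additionally been
  updated to recognize a new authority D, and D recognizes everyone, then
  A, B and C all still select {A, B, C}, while D is left voting alone.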

  Using this procedure authorities that are updated to recognize a
  new authority will continue voting with the old group until a
  sufficient number has been updated to arrive at a consensus with
  the recently added authority.

  In fact, the old set of authorities will probably be voting among
  themselves until all but one has been updated to recognize the
  new authority.  Then which set of votes is used for consensus 
  building depends on which of the two equally large sets gets 
  ordered before the other in step (3) above.

  It is necessary to continue with the process in (5) even if we
  are not in the largest subgraph.  Otherwise one rogue authority
  could create a number of extra votes (by new authorities) so that
  everybody stops at 5 and no consensus is built, even tho it would
  be trusted by all clients.


Anonymity Implications:

  The author does not believe this proposal to have anonymity
  implications.


Possible Attacks/Open Issues/Some thinking required:

 Q: Can a number (at most half) of the authorities cause an honest
    authority to vote for "their" consensus rather than the one that would
    result were all authorities taken into account?


 Q: Can a set of votes from external authorities, i.e. of whom we trust either
    none or at least not all, cause us to change the set of consensus makers we
    pick?
 A: Yes, if other authorities decide they would rather build a consensus with
    them, then they'll be thrown out in step 3.  But that's ok since those
    other authorities will never vote with us anyway.
    If we trust none of them then we throw them out even sooner, so no harm done.

 Q: Can this ever force us to build a consensus with authorities we do not
    recognize?
 A: No, we can never build a fully connected set with them in step 3.

------------------------------

I'm rejecting this proposal as insecure.

Suppose that we have a clique of size N, and M hostile members in the
clique.  If these hostile members stop declaring trust for up to M-1
good members of the clique, the clique with the hostile members in it
will be larger than the one without them.

The M hostile members will constitute a majority of this new clique
when M > (N-(M-1)) / 2, or when M > (N + 1) / 3.  This breaks our
requirement that an adversary must compromise a majority of authorities
in order to control the consensus.

-- Nick
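
The arithmetic in the rejection note can be checked with a short sketch.
This is an editorial illustration rather than part of the note; the
function name is hypothetical, and it assumes every hostile member drops
the same M-1 good members.

```python
def attack_outcome(n, m):
    """Clique of n authorities, m hostile, who distrust m-1 good members."""
    new_clique = n - (m - 1)   # hostile members plus the good ones they keep
    honest_clique = n - m      # the good members on their own
    return {
        "new_clique": new_clique,
        "honest_clique": honest_clique,
        # Hostile members control the new clique when m > (n - (m - 1)) / 2.
        "hostile_control": 2 * m > new_clique,
        # For comparison: do they hold a majority of all n authorities?
        "hostile_majority_overall": 2 * m > n,
    }
```

With N = 7 and M = 3, the new clique has 5 members against 4 in the honest
one, and the 3 hostile members form a majority of it without being a
majority of all 7 authorities.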

Filename: 135-private-tor-networks.txt
Title: Simplify Configuration of Private Tor Networks
Author: Karsten Loesing
Created: 29-Apr-2008
Status: Closed
Target: 0.2.1.x
Implemented-In: 0.2.1.2-alpha

Change history:

  29-Apr-2008  Initial proposal for or-dev
  19-May-2008  Included changes based on comments by Nick to or-dev and
               added a section for test cases.
  18-Jun-2008  Changed testing-network-only configuration option names.

Overview:

  Configuring a private Tor network has become a time-consuming and
  error-prone task with the introduction of the v3 directory protocol. In
  addition to that, operators of private Tor networks need to set an
  increasing number of non-trivial configuration options, and it is hard
  to keep FAQ entries describing this task up-to-date. In this proposal we
  (1) suggest (optionally) accelerating the timing of the v3 directory voting
  process and (2) introduce an umbrella config option specifically aimed at
  creating private Tor networks.

Design:

  1. Accelerate Timing of v3 Directory Voting Process

  Tor has reasonable defaults for setting up a large, Internet-scale
  network with comparably high latencies and possibly wrong server clocks.
  However, those defaults are bad when it comes to quickly setting up a
  private Tor network for testing, either on a single node or LAN (things
  might be different when creating a test network on PlanetLab or
  something). Some time constraints should be made configurable for private
  networks. The general idea is to accelerate everything that has to do
  with propagation of directory information, but nothing else, so that a
  private network is available as soon as possible. (As a possible
  safeguard, changing these configuration values could be made dependent on
  the umbrella configuration option introduced in 2.)

  1.1. Initial Voting Schedule

  When a v3 directory does not know any consensus, it assumes an initial,
  hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and
  DistDelay of 5 minutes. This is important for multiple, simultaneously
  restarted directory authorities to meet at a common time and create an
  initial consensus. Unfortunately, this means that it may take up to half
  an hour (or even more) for a private Tor network to bootstrap.

  We propose to make these three time constants configurable (note that
  V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an
  effect on the _initial_ voting schedule, but only on the schedule that a
  directory authority votes for). This can be achieved by introducing three
  new configuration options: TestingV3AuthInitialVotingInterval,
  TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay.

  As a first safeguard, Tor should only accept configuration values for
  TestingV3AuthInitialVotingInterval that divide evenly into the default
  value of 30 minutes. The effect is that even if people misconfigured
  their directory authorities, they would meet at the default values at the
  latest. The second safeguard is to allow configuration only when the
  umbrella configuration option TestingTorNetwork is set.
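
  The two safeguards can be sketched as a validation routine. This is only
  an illustration: the function name and error messages are hypothetical,
  and values are taken to be in seconds.

```python
DEFAULT_INITIAL_VOTING_INTERVAL = 30 * 60  # hard-coded default, in seconds


def validate_initial_voting_interval(seconds, testing_tor_network):
    """Raise ValueError if the configured interval breaks a safeguard."""
    if seconds == DEFAULT_INITIAL_VOTING_INTERVAL:
        return  # the default is always acceptable
    # Second safeguard: non-default values need the umbrella option.
    if not testing_tor_network:
        raise ValueError("may only be changed in testing Tor networks")
    # First safeguard: the value must divide evenly into 30 minutes.
    if seconds <= 0 or DEFAULT_INITIAL_VOTING_INTERVAL % seconds != 0:
        raise ValueError("must divide evenly into 30 minutes")
```

  A value of 5 minutes passes because 30 minutes is a multiple of it; a
  value of 7 minutes is rejected, since authorities using it would not
  reliably meet on the default 30-minute boundary.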

  1.2. Immediately Provide Reachability Information (Running flag)

  The default behavior of a directory authority is to provide the Running
  flag only after the authority is available for at least 30 minutes. The
  rationale is that before that time, an authority simply cannot deliver
  useful information about other running nodes. But for private Tor
  networks this may be different. This is currently implemented in the code
  as:

  /** If we've been around for less than this amount of time, our
   * reachability information is not accurate. */
  #define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60)

  There should be another configuration option
  TestingAuthDirTimeToLearnReachability with a default value of 30 minutes
  that can be changed when running testing Tor networks, e.g. to 0 minutes.
  The configuration value would simply replace the quoted constant. Again,
  changing this option could be safeguarded by requiring the umbrella
  configuration option TestingTorNetwork to be set.

  1.3. Reduce Estimated Descriptor Propagation Time

  Tor currently assumes that it takes up to 10 minutes until router
  descriptors are propagated from the authorities to directory caches.
  This is not very useful for private Tor networks, and we want to be able
  to reduce this time, so that clients can download router descriptors in a
  timely manner.

  /** Clients don't download any descriptor this recent, since it will
   * probably not have propagated to enough caches. */
  #define ESTIMATED_PROPAGATION_TIME (10*60)

  We suggest introducing a new config option
  TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes,
  but that can be set to any lower non-negative value, e.g. 0 minutes. The
  same safeguards as in 1.2 could be used here, too.

  2. Umbrella Option for Setting Up Private Tor Networks

  Setting up a private Tor network requires a number of specific settings
  that are not required or useful when running Tor in the public Tor
  network. Instead of writing down these options in a FAQ entry, there
  should be a single configuration option, e.g. TestingTorNetwork, that
  changes all required settings at once. Newer Tor versions would keep the
  set of configuration options up-to-date. It should still remain possible
  to manually overwrite the settings that the umbrella configuration option
  affects.

  The following configuration options are set by TestingTorNetwork:

  - ServerDNSAllowBrokenResolvConf 1
      Ignore the situation that private relays are not aware of any name
      servers.

  - DirAllowPrivateAddresses 1
      Allow router descriptors containing private IP addresses.

  - EnforceDistinctSubnets 0
      Permit building circuits with relays in the same subnet.

  - AssumeReachable 1
      Omit self-testing for reachability.

  - AuthDirMaxServersPerAddr 0
  - AuthDirMaxServersPerAuthAddr 0
      Permit an unlimited number of nodes on the same IP address.

  - ClientDNSRejectInternalAddresses 0
      Believe in DNS responses resolving to private IP addresses.

  - ExitPolicyRejectPrivate 0
      Allow exiting to private IP addresses. (This one is a matter of
      taste---it might be dangerous to make this a default in a private
      network, although people setting up private Tor networks should know
      what they are doing.)

  - V3AuthVotingInterval 5 minutes
  - V3AuthVoteDelay 20 seconds
  - V3AuthDistDelay 20 seconds
      Accelerate voting schedule after first consensus has been reached.

  - TestingV3AuthInitialVotingInterval 5 minutes
  - TestingV3AuthInitialVoteDelay 20 seconds
  - TestingV3AuthInitialDistDelay 20 seconds
      Accelerate initial voting schedule until first consensus is reached.

  - TestingAuthDirTimeToLearnReachability 0 minutes
      Consider routers as Running from the start of running an authority.

  - TestingEstimatedDescriptorPropagationTime 0 minutes
      Clients try downloading router descriptors from directory caches,
      even when they are not 10 minutes old.

  In addition to changing the defaults for these configuration options,
  TestingTorNetwork can only be set when a user has manually configured
  DirServer lines.
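
  The interaction between the umbrella option, its dependent defaults and
  explicit user settings can be sketched as follows. This is a simplified
  model, not Tor's configuration code: only two dependent options are
  modelled, values are in seconds, and the function and dictionary names
  are hypothetical.

```python
NORMAL_DEFAULTS = {
    "TestingAuthDirTimeToLearnReachability": 30 * 60,
    "TestingEstimatedDescriptorPropagationTime": 10 * 60,
}

TESTING_DEFAULTS = {
    "TestingAuthDirTimeToLearnReachability": 0,
    "TestingEstimatedDescriptorPropagationTime": 0,
}


def effective_options(user_options):
    """Resolve option values: explicit setting, then applicable default."""
    testing = user_options.get("TestingTorNetwork", 0) == 1
    if testing and not user_options.get("DirServer"):
        raise ValueError("TestingTorNetwork requires non-default DirServers")

    defaults = dict(NORMAL_DEFAULTS)
    if testing:
        defaults.update(TESTING_DEFAULTS)

    result = {"TestingTorNetwork": int(testing)}
    for name, default in defaults.items():
        if name not in user_options:
            result[name] = default
            continue
        value = user_options[name]
        # Outside a testing network these options must keep their defaults.
        if not testing and value != NORMAL_DEFAULTS[name]:
            raise ValueError(name + " may only be changed in testing Tor networks")
        result[name] = value
    return result
```

  An explicitly configured value still wins over the testing default, which
  keeps manual overrides possible.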

Test:

  The implementation of this proposal must pass the following tests:

  1. Set TestingTorNetwork and see if dependent configuration options are
     correctly changed.

     tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
     telnet 127.0.0.1 9051
     AUTHENTICATE
     GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
     250-TestingTorNetwork=1
     250 TestingAuthDirTimeToLearnReachability=0
     QUIT

  2. Set TestingTorNetwork and a dependent configuration value to see if
     the provided value is used for the dependent option.

     tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
       TestingAuthDirTimeToLearnReachability 5
     telnet 127.0.0.1 9051
     AUTHENTICATE
     GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
     250-TestingTorNetwork=1
     250 TestingAuthDirTimeToLearnReachability=5
     QUIT

  3. Start with TestingTorNetwork set and change a dependent configuration
     option later on.

     tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
     telnet 127.0.0.1 9051
     AUTHENTICATE
     SETCONF TestingAuthDirTimeToLearnReachability=5
     GETCONF TestingAuthDirTimeToLearnReachability
     250 TestingAuthDirTimeToLearnReachability=5
     QUIT

  4. Start with TestingTorNetwork set and a dependent configuration value,
     and reset that dependent configuration value. The result should be
     the testing-network specific default value.

     tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
       TestingAuthDirTimeToLearnReachability 5
     telnet 127.0.0.1 9051
     AUTHENTICATE
     GETCONF TestingAuthDirTimeToLearnReachability
     250 TestingAuthDirTimeToLearnReachability=5
     RESETCONF TestingAuthDirTimeToLearnReachability
     GETCONF TestingAuthDirTimeToLearnReachability
     250 TestingAuthDirTimeToLearnReachability=0
     QUIT

  5. Leave TestingTorNetwork unset and check if dependent configuration
     options are left unchanged.

     tor DataDirectory . ControlPort 9051 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
     telnet 127.0.0.1 9051
     AUTHENTICATE
     GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
     250-TestingTorNetwork=0
     250 TestingAuthDirTimeToLearnReachability=1800
     QUIT

  6. Leave TestingTorNetwork unset, but set dependent configuration option
     which should fail.

     tor DataDirectory . ControlPort 9051 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
       TestingAuthDirTimeToLearnReachability 0
     [warn] Failed to parse/validate config:
     TestingAuthDirTimeToLearnReachability may only be changed in testing
     Tor networks!

  7. Start with TestingTorNetwork unset and change dependent configuration
     option later on which should fail.

     tor DataDirectory . ControlPort 9051 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
     telnet 127.0.0.1 9051
     AUTHENTICATE
     SETCONF TestingAuthDirTimeToLearnReachability=0
     513 Unacceptable option value: TestingAuthDirTimeToLearnReachability
     may only be changed in testing Tor networks!

  8. Start with TestingTorNetwork unset and set it later on which should
     fail.

     tor DataDirectory . ControlPort 9051 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
     telnet 127.0.0.1 9051
     AUTHENTICATE
     SETCONF TestingTorNetwork=1
     553 Transition not allowed: While Tor is running, changing
     TestingTorNetwork is not allowed.

  9. Start with TestingTorNetwork set and unset it later on which should
     fail.

     tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
       "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
     telnet 127.0.0.1 9051
     AUTHENTICATE
     RESETCONF TestingTorNetwork
     513 Unacceptable option value: TestingV3AuthInitialVotingInterval may
     only be changed in testing Tor networks!

 10. Set TestingTorNetwork, but do not provide an alternate DirServer
     which should fail.

     tor DataDirectory . ControlPort 9051 TestingTorNetwork 1
     [warn] Failed to parse/validate config: TestingTorNetwork may only be
     configured in combination with a non-default set of DirServers.

Filename: 136-legacy-keys.txt
Title: Mass authority migration with legacy keys
Author: Nick Mathewson
Created: 13-May-2008
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document describes a mechanism to change the keys of more than
  half of the directory servers at once without breaking old clients
  and caches immediately.

Motivation:

  If a single authority's identity key is believed to be compromised,
  the solution is obvious: remove that authority from the list,
  generate a new certificate, and treat the new cert as belonging to a
  new authority.  This approach works fine so long as less than 1/2 of
  the authority identity keys are bad.

  Unfortunately, the mass-compromise case is possible if there is a
  sufficiently bad bug in Tor or in any OS used by a majority of v3
  authorities.  Let's be prepared for it!

  We could simply stop using the old keys and start using new ones,
  and tell all clients running insecure versions to upgrade.
  Unfortunately, this breaks our caching system pretty badly, since
  caches won't cache a consensus that they don't believe in.  It would
  be nice to have everybody become secure the moment they upgrade to a
  version listing the new authority keys, _without_ breaking upgraded
  clients until the caches upgrade.

  So, let's come up with a way to provide a time window where the
  consensuses are signed with the new keys and with the old.

Design:

  We allow directory authorities to list a single "legacy key"
  fingerprint in their votes.  Each authority may add a single legacy
  key.  The format for this line is:

     legacy-dir-key FINGERPRINT

  We describe a new consensus method for generating directory
  consensuses.  This method is consensus method "3".

  When the authorities decide to use method "3" (as described in 3.4.1
  of dir-spec.txt), for every included vote with a legacy-dir-key line,
  the consensus includes an extra dir-source line.  The fingerprint in
  this extra line is as in the legacy-dir-key line.  The ports and
  addresses are in the dir-source line.  The nickname is as in the
  dir-source line, with the string "-legacy" appended.

      [We need to include this new dir-source line because the code
      won't accept or preserve signatures from authorities not listed
      as contributing to the consensus.]
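
  How the extra entries are derived can be sketched as follows. This is an
  illustration only: the DirSource record and its field names are
  hypothetical stand-ins for the contents of a dir-source line, and placing
  the legacy entry directly after the real one is an assumption.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DirSource:
    nickname: str
    fingerprint: str
    address: str
    dir_port: int
    or_port: int


def consensus_dir_sources(
    votes: List[Tuple[DirSource, Optional[str]]]
) -> List[DirSource]:
    """votes pairs each voter's dir-source with its legacy-dir-key, if any."""
    out = []
    for source, legacy_fp in votes:
        out.append(source)
        if legacy_fp is not None:
            # Same address and ports; legacy fingerprint; "-legacy" nickname.
            out.append(replace(
                source,
                nickname=source.nickname + "-legacy",
                fingerprint=legacy_fp,
            ))
    return out
```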

  Authorities using legacy dir keys include two signatures on their
  consensuses: one generated with a signing key signed with their real
  signing key, and another generated with a signing key signed with
  another signing key attested to by their identity key.  These
  signing keys MUST be different.  Authorities MUST serve both
  certificates if asked.

Process:

  In the event of a mass key failure, we'll use the following
  (ugly) procedure:
     - All affected authorities generate new certificates and identity
       keys, and circulate their new dirserver lines.  They copy their old
       certificates and old broken keys, but put them in new "legacy
       key files".
     - At the earliest time that can be arranged, the authorities
       replace their signing keys, identity keys, and certificates
       with the new uncompromised versions, and update to the new list
       of dirserver lines.
     - They add a "V3DirAdvertiseLegacyKey 1" option to their torrc.
     - Now, new consensuses will be generated using the new keys, but
       the results will also be signed with the old keys.
     - Clients and caches are told they need to upgrade, and given a
       time window to do so.
     - At the end of the time window, authorities remove the
       V3DirAdvertiseLegacyKey option.

Notes:

  It might be good to get caches to cache consensuses that they do not
  believe in.  I'm not sure of the best way to do this.

  It's a superficially neat idea to have new signing keys and have
  them signed by the new and by the old authority identity keys.  This
  breaks some code, though, and doesn't actually gain us anything,
  since we'd still need to include each signature twice.

  It's also a superficially neat idea, if identity keys and signing
  keys are compromised, to at least replace all the signing keys.
  I don't think this gains us anything either, though.


Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
Implemented-In: 0.2.1.x

1. Overview.

  Tor has many steps to bootstrapping directory information and
  initial circuits, but from the controller's perspective we just have
  a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
  slow connections or with connectivity problems can wait a long time
  staring at the yellow onion, wondering if it will ever change color.

  This proposal describes a new client status event so Tor can give
  more details to the controller. Section 2 describes the changes to the
  controller protocol; Section 3 describes Tor's internal bootstrapping
  phases when everything is going correctly; Section 4 describes when
  Tor detects a problem and issues a bootstrap warning; Section 5 covers
  suggestions for how controllers should display the results.

2. Controller event syntax.

  The generic status event is:

    "650" SP StatusType SP StatusSeverity SP StatusAction
                                        [SP StatusArguments] CRLF

  So in this case we send
  650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
  PROGRESS=num TAG=Keyword SUMMARY=String \
  [WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]

  The arguments MAY appear in any order. Controllers MUST accept unrecognized
  arguments.

  "Progress" gives a number between 0 and 100 for how far through
  the bootstrapping process we are. "Summary" is a string that can be
  displayed to the user to describe the *next* task that Tor will tackle,
  i.e., the task it is working on after sending the status event. "Tag"
  is an optional string that controllers can use to recognize bootstrap
  phases from Section 3, if they want to do something smarter than just
  blindly displaying the summary string.
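
  As an illustration of the syntax above, a controller might split a
  BOOTSTRAP event into its arguments roughly as follows.  This is only a
  sketch: the function name parse_bootstrap_event is hypothetical, and it
  assumes that string values are double-quoted in a way that Python's
  shlex module can split.  Argument order is not assumed, and arguments
  the controller does not recognize are kept rather than rejected.

```python
import shlex


def parse_bootstrap_event(line):
    """Parse one BOOTSTRAP status event line; return None if it is not one."""
    parts = shlex.split(line.strip())
    # Expected prefix: 650 STATUS_CLIENT <severity> BOOTSTRAP
    if len(parts) < 4 or parts[0] != "650" or parts[1] != "STATUS_CLIENT":
        return None
    if parts[3] != "BOOTSTRAP":
        return None
    event = {"severity": parts[2], "args": {}}
    for token in parts[4:]:
        key, sep, value = token.partition("=")
        if sep:
            # Keep every key=value pair, including ones we do not know.
            event["args"][key.upper()] = value
    return event
```

  Because the arguments end up in a dictionary, a controller can look up
  TAG or PROGRESS without caring about their position in the line.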

  The severity describes whether this is a normal bootstrap phase
  (severity notice) or an indication of a bootstrapping problem
  (severity warn). If severity warn, it should also include a "warning"
  argument string with any hints Tor has to offer about why it's having
  troubles bootstrapping, a "reason" string that lists one of the reasons
  allowed in the ORConn event, a "count" number that tells how many
  bootstrap problems there have been so far at this phase, and a
  "recommendation" keyword to indicate how the controller ought to react.

3. The bootstrap phases.

  This section describes the various phases currently reported by
  Tor. Controllers should not assume that the percentages and tags listed
  here will continue to match up, or even that the tags will stay in
  the same order. Some phases might also be skipped (not reported) if the
  associated bootstrap step is already complete, or if the phase no longer
  is necessary.  Only "starting" and "done" are guaranteed to exist in all
  future versions.

  Current Tor versions enter these phases in order, monotonically;
  future Tors MAY revisit earlier stages.

  Phase 0:
  tag=starting summary="starting"

  Tor starts out in this phase.

  Phase 5:
  tag=conn_dir summary="Connecting to directory mirror"

  Tor sends this event as soon as Tor has chosen a directory mirror ---
  one of the authorities if bootstrapping for the first time or after
  a long downtime, or one of the relays listed in its cached directory
  information otherwise.

  Tor will stay at this phase until it has successfully established
  a TCP connection with some directory mirror. Problems in this phase
  generally happen because Tor doesn't have a network connection, or
  because the local firewall is dropping SYN packets.

  Phase 10:
  tag=handshake_dir summary="Finishing handshake with directory mirror"

  This event occurs when Tor establishes a TCP connection with a relay used
  as a directory mirror (or its https proxy if it's using one). Tor remains
  in this phase until the TLS handshake with the relay is finished.

  Problems in this phase generally happen because a firewall on Tor's
  network path is performing a more sophisticated MITM attack on the
  connection, or doing packet-level keyword recognition of Tor's
  handshake.

  Phase 15:
  tag=onehop_create summary="Establishing one-hop circuit for dir info"

  Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
  to establish a one-hop circuit for retrieving directory information.
  It will remain in this phase until it receives the CREATED_FAST cell
  back, indicating that the circuit is ready.

  Phase 20:
  tag=requesting_status summary="Asking for networkstatus consensus"

  Once we've finished our one-hop circuit, we will start a new stream
  for fetching the networkstatus consensus. We'll stay in this phase
  until we get the 'connected' relay cell back, indicating that we've
  established a directory connection.

  Phase 25:
  tag=loading_status summary="Loading networkstatus consensus"

  Once we've established a directory connection, we will start fetching
  the networkstatus consensus document. This could take a while; this
  phase is a good opportunity for using the "progress" keyword to indicate
  partial progress.

  This phase could stall if the directory mirror we picked doesn't
  have a copy of the networkstatus consensus so we have to ask another,
  or it does give us a copy but we don't find it valid.

  Phase 40:
  tag=loading_keys summary="Loading authority key certs"

  Sometimes when we've finished loading the networkstatus consensus,
  we find that we don't have all the authority key certificates for the
  keys that signed the consensus. At that point we put the consensus we
  fetched on hold and fetch the keys so we can verify the signatures.

  Phase 45:
  tag=requesting_descriptors summary="Asking for relay descriptors"

  Once we have a valid networkstatus consensus and we've checked all
  its signatures, we start asking for relay descriptors. We stay in this
  phase until we have received a 'connected' relay cell in response to
  a request for descriptors.

  Phase 50:
  tag=loading_descriptors summary="Loading relay descriptors"

  We will ask for relay descriptors from several different locations,
  so this step will probably make up the bulk of the bootstrapping,
  especially for users with slow connections. We stay in this phase until
  we have descriptors for at least 1/4 of the usable relays listed in
  the networkstatus consensus. This phase is also a good opportunity to
  use the "progress" keyword to indicate partial steps.

  Phase 80:
  tag=conn_or summary="Connecting to entry guard"

  Once we have a valid consensus and enough relay descriptors, we choose
  some entry guards and start trying to build some circuits. This step
  is similar to the "conn_dir" phase above; the only difference is
  the context.

  If a Tor starts with enough recent cached directory information,
  its first bootstrap status event will be for the conn_or phase.

  Phase 85:
  tag=handshake_or summary="Finishing handshake with entry guard"

  This phase is similar to the "handshake_dir" phase, but it gets reached
  if we finish a TCP connection to a Tor relay and we have already reached
  the "conn_or" phase. We'll stay in this phase until we complete a TLS
  handshake with a Tor relay.

  Phase 90:
  tag=circuit_create summary="Establishing circuits"

  Once we've finished our TLS handshake with an entry guard, we will
  set about trying to make some 3-hop circuits in case we need them soon.

  Phase 100:
  tag=done summary="Done"

  A full 3-hop circuit has been established. Tor is ready to handle
  application connections now.

4. Bootstrap problem events.

  When an OR Conn fails, we send a "bootstrap problem" status event, which
  is like the standard bootstrap status event except with severity warn.
  We include the same progress, tag, and summary values as we would for
  a normal bootstrap event, but we also include "warning", "reason",
  "count", and "recommendation" key/value combos.

  The "reason" values are long-term-stable controller-facing tags to
  identify particular issues in a bootstrapping step.  The warning
  strings, on the other hand, are human-readable. Controllers SHOULD
  NOT rely on the format of any warning string. Currently the possible
  values for "recommendation" are either "ignore" or "warn" -- if ignore,
  the controller can accumulate the string in a pile of problems to show
  the user if the user asks; if warn, the controller should alert the
  user that Tor is pretty sure there's a bootstrapping problem.

  Currently Tor uses recommendation=ignore for the first nine bootstrap
  problem reports for a given phase, and then uses recommendation=warn
  for subsequent problems at that phase. Hopefully this is a good
  balance between tolerating occasional errors and reporting serious
  problems quickly.
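
  The rule in the previous paragraph can be sketched as a per-phase
  counter.  The class name BootstrapProblemTracker and its interface are
  hypothetical and not taken from Tor's source; the sketch also assumes
  that "count" starts at 1 for the first problem in a phase.

```python
from collections import defaultdict


class BootstrapProblemTracker:
    """Counts bootstrap problems per phase and picks a recommendation."""

    def __init__(self, ignore_first=9):
        # ignore_first=9 mirrors "the first nine bootstrap problem reports".
        self.ignore_first = ignore_first
        self.counts = defaultdict(int)

    def report(self, tag):
        """Record one problem at phase `tag`; return (count, recommendation)."""
        self.counts[tag] += 1
        count = self.counts[tag]
        recommendation = "ignore" if count <= self.ignore_first else "warn"
        return count, recommendation
```

  Counts are kept per tag, so a problem in a new phase starts again at
  recommendation=ignore.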

5. Suggested controller behavior.

  Controllers should start out with a yellow onion or the equivalent
  ("starting"), and then watch for either a bootstrap status event
  (meaning the Tor they're using is sufficiently new to produce them,
  and they should load up the progress bar or whatever they plan to use
  to indicate progress) or a circuit_established status event (meaning
  bootstrapping is finished).

  In addition to a progress bar in the display, controllers should also
  have some way to indicate progress even when no controller window is
  open. For example, folks using Tor Browser Bundle in hostile Internet
  cafes don't want a big splashy screen up. One way to let the user keep
  informed of progress in a more subtle way is to change the task tray
  icon and/or tooltip string as more bootstrap events come in.

  Controllers should also have some mechanism to alert their user when
  bootstrapping problems are reported. Perhaps we should gather a set of
  help texts and the controller can send the user to the right anchor in a
  "bootstrapping problems" page in the controller's help subsystem?

6. Getting up to speed when the controller connects.

  There's a new "GETINFO /status/bootstrap-phase" option, which returns
  the most recent bootstrap phase status event sent. Specifically,
  it returns a string starting with either "NOTICE BOOTSTRAP ..." or
  "WARN BOOTSTRAP ...".

  Controllers should use this getinfo when they connect or attach to
  Tor to learn its current state.

Filename: 138-remove-down-routers-from-consensus.txt
Title: Remove routers that are not Running from consensus documents
Author: Peter Palfrader
Created: 11-Jun-2008
Status: Closed
Implemented-In: 0.2.1.2-alpha

1. Overview.

  Tor directory authorities hourly vote and agree on a consensus document
  which lists all the routers on the network together with some of their
  basic properties, like if a router is an exit node, whether it is
  stable or whether it is a version 2 directory mirror.

  One of the properties given with each router is the 'Running' flag.
  Clients do not use routers that are not listed as running.

  This proposal suggests that routers without the Running flag are not
  listed at all.

2. Current status

  At a typical bootstrap a client downloads a 140KB consensus, about
  10KB of certificates to verify that consensus, and about 1.6MB of
  server descriptors, about 1/4 of which it requires before it will
  start building circuits.

  Another proposal deals with how to get that huge 1.6MB fraction to
  effectively zero (by downloading only individual descriptors, on
  demand).  Should that get successfully implemented that will leave the
  140KB compressed consensus as a large fraction of what a client needs
  to get in order to work.

  About one third of the routers listed in a consensus are not running
  and will therefore never be used by clients who use this consensus.
  Not listing those routers will save about 30% to 40% in size.

3. Proposed change

  Authority directory servers produce vote documents that include all
  the servers they know about, running or not, like they currently
  do.  In addition these vote documents also state that the authority
  supports a new consensus forming method (method number 4).

  If more than two thirds of votes that an authority has received claim
  they support method 4 then this new method will be used:  The
  consensus document is formed like before but a new last step removes
  all routers from the listing that are not marked as Running.
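
  A minimal sketch of this decision and of the extra last step follows.
  The data shapes are assumptions made for illustration: each vote is
  represented as a set of supported method numbers, and each router as
  a dictionary with a "flags" set.

```python
def use_method_4(votes):
    """votes: list of sets of supported consensus method numbers."""
    supporting = sum(1 for methods in votes if 4 in methods)
    # Strictly more than two thirds, computed without floating point.
    return 3 * supporting > 2 * len(votes)


def form_listing(routers, votes):
    """Return the router listing, dropping non-Running routers under method 4."""
    if not use_method_4(votes):
        return list(routers)
    return [r for r in routers if "Running" in r["flags"]]
```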

Filename: 139-conditional-consensus-download.txt
Title: Download consensus documents only when it will be trusted
Author: Peter Palfrader
Created: 2008-04-13
Status: Closed
Implemented-In: 0.2.1.x

Overview:

  Servers only provide consensus documents to clients when it is known that
  the client will trust them.

Motivation:

  When clients[1] want a new network status consensus they request it
  from a Tor server using the URL path /tor/status-vote/current/consensus.
  Then after downloading the client checks if this consensus can be
  trusted.  Whether the client trusts the consensus depends on the
  authorities that the client trusts and how many of those
  authorities signed the consensus document.

  If the client cannot trust the consensus document it is disregarded
  and a new download is tried at a later time.  Several hundred
  kilobytes of server bandwidth were wasted by this single client's
  request.

  With hundreds of thousands of clients this will have undesirable
  consequences when the list of authorities has changed so much that a
  large number of established clients no longer can trust any consensus
  document formed.

Objective:

  The objective of this proposal is to make clients not download
  consensuses they will not trust.

Proposal:

  The list of authorities that are trusted by a client is encoded in
  the URL it sends to the directory server when requesting a consensus
  document.

  The directory server then only sends back the consensus when more than
  half of the authorities listed in the request have signed the
  consensus.  If it is known that the consensus will not be trusted
  a 404 error code is sent back to the client.
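
  The server-side check might look like the sketch below.  The function
  names are hypothetical.  Treating an empty fingerprint list as "not
  trusted" is an assumption of this sketch, not something the proposal
  states.

```python
def should_serve_consensus(requested_fprs, signer_fprs):
    """True if more than half of the requested authorities signed."""
    requested = {f.upper() for f in requested_fprs}
    signers = {f.upper() for f in signer_fprs}
    matching = len(requested & signers)
    return 2 * matching > len(requested)


def response_status(requested_fprs, signer_fprs):
    """HTTP status code a directory server would answer with."""
    return 200 if should_serve_consensus(requested_fprs, signer_fprs) else 404
```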

  This proposal does not require directory caches to keep more than one
  consensus document.  This proposal also does not require authorities
  to verify the signature on the consensus document of authorities they
  do not recognize.

  The new URL scheme to download a consensus is
  /tor/status-vote/current/consensus/<F> where F is a list of
  fingerprints, sorted in ascending order, and concatenated using a +
  sign.

  Fingerprints are uppercase hexadecimal encodings of the authority
  identity key's digest.  Servers should also accept requests that
  use lower case or mixed case hexadecimal encodings.

  A .z URL for compressed versions of the consensus will be provided
  similarly to existing resources and is the URL that usually should
  be used by clients.
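
  For illustration, a client could build the request path as follows.
  The function name is hypothetical, and the short fingerprints in the
  usage note are placeholders rather than real authority fingerprints.

```python
CONSENSUS_PATH = "/tor/status-vote/current/consensus"


def conditional_consensus_path(fingerprints, compressed=True):
    """Build the request path for the given trusted-authority fingerprints."""
    # Uppercase hex, sorted in ascending order, joined with "+".
    fprs = sorted(f.upper() for f in fingerprints)
    path = CONSENSUS_PATH
    if fprs:
        path = path + "/" + "+".join(fprs)
    if compressed:
        path = path + ".z"
    return path
```

  For example, conditional_consensus_path(["bb22", "AA11"]) yields
  "/tor/status-vote/current/consensus/AA11+BB22.z".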

Migration:

  The old location of the consensus should continue to work
  indefinitely.  Not only is it used by old clients, but it is a useful
  resource for automated tools that do not particularly care which
  authorities have signed the consensus.

  Authorities that are known to the client a priori by being shipped
  with the Tor code are assumed to handle this format.

  When downloading a consensus document from caches that do not support this
  new format, clients fall back to the old download location.

  Caches support the new format starting with Tor version 0.2.1.1-alpha.

Anonymity Implications:

  By supplying the list of authorities a client trusts to the directory
  server we leak information (such as the likely version of the Tor
  client) to the directory server.  In the current system we also leak
  that we are very old, by re-downloading the consensus over and over
  again, but only when we are so old that we can no longer trust the
  consensus.



Footnotes:
 1. For the purpose of this proposal a client can be any Tor instance
    that downloads a consensus document.  This includes relays,
    directory caches as well as end users.

Filename: 140-consensus-diffs.txt
Title: Provide diffs between consensuses
Author: Peter Palfrader
Created: 13-Jun-2008
Implemented-In: 0.3.1.1-alpha
Status: Closed
Ticket: https://bugs.torproject.org/13339

0. History

  22-May-2009: Restricted the ed format even more strictly for ease of
  implementation. -nickm

  25-May-2014: Adapted to the new dir-spec version 3 and made the diff urls
  backwards-compatible. -mvdan

  1-Mar-2017: Update to new stats, note newer proposals, note flavors,
  diffs, add parameters, restore diff-only URLs, say what "Digest"
  means. -nickm

  3-May-2017: Add a notion of "digest-as-signed" vs "full digest", since
  otherwise the fact that there are multiple encodings of the same valid
  consensus signatures would make clients identify which encodings they
  had been given as they asked for diffs.

  4-May-2017: Remove support for truncated digest prefixes.

1. Overview.

  Tor clients and servers need a list of which relays are on the
  network.  This list, the consensus, is created by authorities
  hourly and clients fetch a copy of it, with some delay, hourly.

  This proposal suggests that clients download diffs of consensuses
  once they have a consensus instead of hourly downloading a full
  consensus.

  This applies not only to ordinary directory consensuses, but also to
  the newer microdescriptor consensuses added in the third version of the
  directory specification.

2. Numbers

  After implementing proposal 138, which removed nodes that are not
  running from the list, a consensus document was about 92 kilobytes
  in size after compression... back in 2008 when this proposal was first
  written.

  But now in March 2017, that figure is more like 625 kilobytes.

  The diff between two consecutive consensuses, in ed format, is on
  average 37 kilobytes compressed.  So by making this change, we could
  save something like 94% of our consensus download bandwidth.

3. Proposal

3.0. Preliminaries.

  Unless otherwise specified, all digests in this document are SHA3-256
  digests, encoded in base64.  This document also uses "hash" as
  synonymous with "digest".

  A "full digest" of a consensus document covers the entire document,
  from the "network-status-version" through the newline after the final
  "-----END SIGNATURE-----".

  A "digest as signed" of a consensus document covers the same part that
  the signatures cover: the "network-status-version" through the space
  immediately after the "directory-signature" keyword on the first
  "directory-signature" line.

3.1 Clients

  If a client has a consensus that is recent enough it SHOULD
  try to download a diff to get the latest consensus rather than
  fetching a full one.

  [XXX: what is recent enough?
     time delta in hours / size of compressed diff

      1:   38177
      2:   66955
      3:   93502
      4:  118959
      5:  143450
      6:  167136
     12:  291354
     18:  404008
     24:  416663
     30:  431240
     36:  443858
     42:  454849
     48:  464677
     54:  476716
     60:  487755
     66:  497502
     72:  506421

   Data suggests that for the first few hours, diffs are very useful,
   saving at least 50% for the first 12 hours.  After that, returns seem to
   be more marginal.  But note the savings from proposals like 274-276, which
   make diffs smaller over a much longer timeframe. ]
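
  The savings claimed in the note above can be checked with a little
  arithmetic.  The sketch assumes that the sizes in the table are in
  bytes and that a full compressed consensus is 625,000 bytes (the
  "625 kilobytes" figure from section 2); both are assumptions, since
  the table does not state its units.

```python
FULL_CONSENSUS_BYTES = 625 * 1000  # assumption: section 2's March 2017 figure

# Hours between consensuses -> size of the compressed diff (from the table).
DIFF_BYTES = {
    1: 38177, 2: 66955, 3: 93502, 4: 118959, 5: 143450, 6: 167136,
    12: 291354, 18: 404008, 24: 416663, 30: 431240, 36: 443858,
    42: 454849, 48: 464677, 54: 476716, 60: 487755, 66: 497502,
    72: 506421,
}


def savings(hours):
    """Fraction of bandwidth saved by fetching a diff instead of a full copy."""
    return 1.0 - DIFF_BYTES[hours] / FULL_CONSENSUS_BYTES


if __name__ == "__main__":
    for hours in sorted(DIFF_BYTES):
        print("%2d h: %4.1f%% saved" % (hours, 100 * savings(hours)))
```

  Under these assumptions the saving stays above 50% up to the 12-hour
  row and drops below it at the 18-hour row.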


3.2 Servers

  Directory authorities and servers need to keep a number of old consensus
  documents so they can build diffs.  (See section 5 below.)  They should
  offer a diff to the most recent consensus at the following request:

  HTTP/1.0 GET /tor/status-vote/current/consensus{-Flavor}/<FPRLIST>.z
  X-Or-Diff-From-Consensus: HASH1 HASH2...

  where the hashes are the digests-as-signed of the consensuses the client
  currently has, and FPRLIST is a list of (abbreviated) fingerprints of
  authorities the client trusts.
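
  A client-side sketch of building this request follows.  It computes
  each digest-as-signed as defined in section 3.0 (SHA3-256, base64).
  The function names are hypothetical, and keeping the base64 "="
  padding is an assumption; this document does not say whether padding
  is used.

```python
import base64
import hashlib

SIGNATURE_KEYWORD = b"\ndirectory-signature "


def digest_as_signed(consensus):
    """Digest of the document up to and including the space that follows
    the keyword on the first directory-signature line."""
    idx = consensus.find(SIGNATURE_KEYWORD)
    if idx < 0:
        raise ValueError("no directory-signature line found")
    signed_part = consensus[: idx + len(SIGNATURE_KEYWORD)]
    raw = hashlib.sha3_256(signed_part).digest()
    return base64.b64encode(raw).decode("ascii")


def diff_request(fingerprints, held_consensuses, flavor=None):
    """Return (path, headers) for a consensus request that allows a diff."""
    name = "consensus" if flavor is None else "consensus-" + flavor
    fprlist = "+".join(sorted(f.upper() for f in fingerprints))
    path = "/tor/status-vote/current/%s/%s.z" % (name, fprlist)
    digests = [digest_as_signed(doc) for doc in held_consensuses]
    headers = {"X-Or-Diff-From-Consensus": " ".join(digests)}
    return path, headers
```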

  Servers will only return a consensus if more than half of the requested
  authorities have signed the document. Otherwise, a 404 error will be sent
  back.

  The advantage of using the same URL that is currently used for
  consensuses is that the client doesn't need to know whether a server
  supports consensus diffs.  If it doesn't, it will simply ignore the
  extra header and return the full consensus.

  If a server cannot offer a diff from one of the consensuses identified
  by one of the hashes but has a current consensus it MUST return the
  full consensus.

  [XXX: what should we do when the client already has the latest
  consensus?  I can think of the following options:
    - send back 3xx not modified
    - send back 200 ok and an empty diff
    - send back 404 nothing newer here.

    I currently lean towards the empty diff.]

  Additionally, a specific diff for a given consensus digest-as-signed
  should be available at a URL of the form:

    /tor/status-vote/current/consensus{-Flavor}/diff/<HASH>/<FPRLIST>.z

  This differs from the previous request type in that it should never
  return a whole consensus: if a diff is not available, it should return
  404.

4. Diff Format

  Diffs start with the token "network-status-diff-version" followed by a
  space and the version number, currently "1".

  If a document does not start with network-status-diff it is assumed
  to be a full consensus download and would therefore currently start
  with "network-status-version 3".

  Following the network-status-diff line is another header line,
  starting with the token "hash" followed by the digest-as-signed of the
  consensus that this diff applies to, and the full digest that the
  resulting consensus should have.
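
  The two header lines could be recognized as in the sketch below.  The
  function name and the keys of the returned dictionary are
  hypothetical.  The sketch assumes the two digests on the "hash" line
  are separated by single spaces.

```python
def parse_diff_document(text):
    """Split a diff into its header fields and ed commands.

    Returns None when the document is not a diff, in which case it is
    assumed to be a full consensus.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("network-status-diff-version "):
        return None
    version = lines[0].split(" ", 1)[1]
    if version != "1":
        raise ValueError("unsupported diff version: %r" % version)
    if len(lines) < 2:
        raise ValueError("missing hash line")
    fields = lines[1].split(" ")
    if len(fields) != 3 or fields[0] != "hash":
        raise ValueError("malformed hash line")
    return {
        "from_digest_as_signed": fields[1],  # consensus the diff applies to
        "to_full_digest": fields[2],         # expected digest of the result
        "commands": lines[2:],               # ed commands and text blocks
    }
```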

  Following the network-status-diff header lines is a diff, or patch, in
  limited ed format.  We choose this format because it is easy to create
  and process with standard tools (patch, diff -e, ed).  This will help
  us in developing and testing this proposal and it should make future
  debugging easier.

  [ If at one point in the future we decide that the space benefits from
    a custom diff format outweigh these benefits we can always
    introduce a new diff format and offer it at, for instance,
    ../diff2/... ]

  We support the following ed commands, each on a line by itself:
   - "<n1>d"          Delete line n1
   - "<n1>,<n2>d"     Delete lines n1 through n2, inclusive
   - "<n1>,$d"        Delete line n1 through the end of the file, inclusive.
   - "<n1>c"          Replace line n1 with the following block
   - "<n1>,<n2>c"     Replace lines n1 through n2, inclusive, with the
                      following block.
   - "<n1>a"          Append the following block after line n1.
   - "a"              Append the following block after the current line.

  Note that line numbers always apply to the file after all previous
  commands have already been applied.  Note also that line numbers
  are 1-indexed.

  The commands MUST apply to the file from back to front, such that
  lines are only ever referred to by their position in the original
  file.

  If there are any directory signatures on the original document, the
  first command MUST be a "<n1>,$d" form to remove all of the directory
  signatures.  Using this format ensures that the client will
  successfully apply the diff even if they have an unusual encoding for
  the signatures.

  The "current line" is either the first line of the file, if this is
  the first command, the last line of a block we added in an append or
  change command, or the line immediately following a set of lines we just
  deleted (or the last line of the file if there are no lines after
  that).

  The replace and append commands take blocks.  These blocks are simply
  appended to the diff after the line with the command.  A line with
  just a period (".") ends the block (and is not part of the lines
  to add).  Note that it is impossible to insert a line with just
  a single dot.
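
  The command set above is small enough to apply with a short routine.
  The following is a sketch under stated assumptions, not the code Tor
  uses: the function name is hypothetical, both the document and the
  diff are given as lists of lines without trailing newlines, "0a" is
  accepted as "insert before the first line" (as in ed), and the
  back-to-front ordering requirement is not enforced.

```python
import re

# "<n1>", optionally ",<n2>" or ",$", then one of a/c/d; a bare "a" is allowed.
_COMMAND = re.compile(r"(?:(\d+)(?:,(\d+|\$))?)?([acd])")


def apply_ed_diff(original_lines, diff_lines):
    """Apply the limited ed commands in diff_lines; return the new lines."""
    lines = list(original_lines)
    current = 1  # 1-indexed "current line"; starts at the first line
    i = 0
    while i < len(diff_lines):
        match = _COMMAND.fullmatch(diff_lines[i])
        if match is None:
            raise ValueError("unsupported command: %r" % diff_lines[i])
        i += 1
        first, last, cmd = match.groups()
        if first is None and cmd != "a":
            raise ValueError("only 'a' may omit its line number")
        if last is not None and (cmd == "a" or (last == "$" and cmd != "d")):
            raise ValueError("unsupported range for %r" % cmd)
        start = current if first is None else int(first)
        if last is None:
            end = start
        elif last == "$":
            end = len(lines)
        else:
            end = int(last)

        block = []
        if cmd in "ac":
            # Collect the text block up to the terminating "." line.
            while True:
                if i >= len(diff_lines):
                    raise ValueError("unterminated block")
                text = diff_lines[i]
                i += 1
                if text == ".":
                    break
                block.append(text)

        if cmd == "a":
            if not 0 <= start <= len(lines):
                raise ValueError("line number out of range")
            lines[start:start] = block
            current = start + len(block)
        else:
            if not 1 <= start <= end <= len(lines):
                raise ValueError("line range out of range")
            if cmd == "d":
                del lines[start - 1:end]
                current = min(start, len(lines))
            else:
                lines[start - 1:end] = block
                current = start - 1 + len(block)
    return lines
```

  With a five-line document, the diff "3,$d" followed by a bare "a"
  block removes lines 3 to 5 and then appends the block after what is
  now the last line, which is the pattern used to replace signatures.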

4.1. Concatenating multiple diffs

  Directory caches may, at their discretion, return the concatenation of
  multiple diffs using the format above.  Such diffs are to be applied from
  first to last.  This allows the caches to cache a smaller number of
  compressed diffs, at the expense of some loss in bandwidth efficiency.


5. Networkstatus parameters

  The following parameters govern how relays and clients use this protocol.

     min-consensuses-age-to-cache-for-diff
       (min 0, max 744, default 6)
     max-consensuses-age-to-cache-for-diff
       (min 0, max 8192, default 72)

       These two parameters determine how much consensus history (in
       hours) relays should try to cache in order to serve diffs.

     try-diff-for-consensus-newer-than
       (min 0, max 8192, default 72)

       This parameter determines how old a consensus can be (in hours)
       before a client should no longer try to find a diff for it.
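
  As a sketch of how a client might consult the last parameter: the
  helper names are hypothetical, the params dictionary stands in for the
  parameters read from the consensus, and treating an age exactly equal
  to the limit as still eligible is an assumption.

```python
def clamp_param(params, name, low, high, default):
    """Read a networkstatus parameter, clamped to its allowed range."""
    value = params.get(name, default)
    return max(low, min(high, value))


def should_try_diff(consensus_age_hours, params):
    """True if the held consensus is recent enough to ask for a diff."""
    limit = clamp_param(params, "try-diff-for-consensus-newer-than",
                        0, 8192, 72)
    return consensus_age_hours <= limit
```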

Filename: 141-jit-sd-downloads.txt
Title: Download server descriptors on demand
Author: Peter Palfrader
Created: 15-Jun-2008
Status: Obsolete

1. Overview

  Downloading all server descriptors is the most expensive part
  of bootstrapping a Tor client.  These server descriptors currently
  amount to about 1.5 Megabytes of data, and this size will grow
  linearly with network size.

  Fetching all these server descriptors takes a long while for people
  behind slow network connections.  It is also a considerable load on
  our network of directory mirrors.

  This document describes proposed changes to the Tor network and
  directory protocol so that clients will no longer need to download
  all server descriptors.

  These changes consist of moving load balancing information into
  network status documents, implementing a means to download server
  descriptors on demand in an anonymity-preserving way, and dealing
  with exit node selection.

2. What is in a server descriptor

  When a Tor client starts the first thing it will try to get is a
  current network status document: a consensus signed by a majority
  of directory authorities.  This document is currently about 100
  Kilobytes in size, though it will grow linearly with network size.
  This document lists all servers currently running on the network.
  The Tor client will then try to get a server descriptor for each
  of the running servers.  All server descriptors currently amount
  to about 1.5 Megabytes of downloads.

  A Tor client learns several things about a server from its descriptor.
  Some of these it already learned from the network status document
  published by the authorities, but the server descriptor contains it
  again in a single statement signed by the server itself, not just by
  the directory authorities.

  Tor clients use the information from server descriptors for
  different purposes, which are considered in the following sections.

  #three ways:  One, to determine if a server will be able to handle
  #this client's request; two, to actually communicate or use the server;
  #three, for load balancing decisions.
  #
  #These three points are considered in the following subsections.

2.1 Load balancing

  The Tor load balancing mechanism is quite complex in its details, but
  it has a simple goal: The more traffic a server can handle the more
  traffic it should get.  That means the more traffic a server can
  handle the more likely a client will use it.

  For this purpose each server descriptor has bandwidth information
  which tries to convey a server's capacity to clients.

  Currently we weigh servers differently for different purposes.  There
  is a weight for when we use a server as a guard node (our entry to the
  Tor network), there is one weight we assign servers for exit duties,
  and a third for when we need intermediate (middle) nodes.

2.2 Exit information

  When a Tor wants to exit to some resource on the internet it will
  build a circuit to an exit node that allows access to that resource's
  IP address and TCP Port.

  When building that circuit the client can make sure that the circuit
  ends at a server that will be able to fulfill the request because the
  client already learned of all the servers' exit policies from their
  descriptors.

2.3 Capability information

  Server descriptors contain information about the specific version of
  the Tor protocol they understand [proposal 105].

  Furthermore the server descriptor also contains the exact version of
  the Tor software that the server is running and some decisions are
  made based on the server version number (for instance a Tor client
  will only make conditional consensus requests [proposal 139] when
  talking to Tor servers version 0.2.1.1-alpha or later).

2.4 Contact/key information

  A server descriptor lists a server's IP address and TCP ports on which
  it accepts onion and directory connections.  Furthermore it contains
  the onion key (a short lived RSA key to which clients encrypt CREATE
  cells).

2.5 Identity information

  A Tor client learns the digest of a server's key from the network
  status document.  Once it has a server descriptor this descriptor
  contains the full RSA identity key of the server.  Clients verify
  that 1) the digest of the identity key matches the expected digest
  it got from the consensus, and 2) that the signature on the descriptor
  from that key is valid.


3. No longer require clients to have copies of all SDs

3.1 Load balancing info in consensus documents

  One of the reasons why clients download all server descriptors is for
  doing proper load balancing as described in 2.1.  In order for
  clients to not require all server descriptors this information will
  have to move into the network status document.

  Consensus documents will have a new line per router similar
  to the "r", "s", and "v" lines that already exist.  This line
  will convey weight information to clients.

   "w Bandwidth=193"

  The bandwidth number is the lesser of observed bandwidth and bandwidth
  rate limit from the server descriptor that the "r" line referenced by
  digest (1st and 3rd field of the bandwidth line in the descriptor).
  It is given in kilobytes per second so the byte value in the
  descriptor has to be divided by 1024 (and is then truncated, i.e.
  rounded down).

  Authorities will cap the bandwidth number at some arbitrary value,
  currently 10MB/sec.  If a router claims a larger bandwidth an
  authority's vote will still only show Bandwidth=10240.

  The consensus value for bandwidth is the median of all bandwidth
  numbers given in votes.  In case of an even number of votes we use
  the lower median.  (Using this procedure allows us to change the
  cap value more easily.)

  Clients should believe the bandwidth as presented in the consensus,
  not capping it again.
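
  The arithmetic in this section can be sketched as follows.  The
  function names are hypothetical.  Inputs to vote_bandwidth are the
  byte-per-second values from the descriptor's bandwidth line.

```python
def vote_bandwidth(rate_bytes, observed_bytes, cap_kb=10240):
    """Bandwidth value an authority would put in its vote, in KB/s."""
    # Lesser of rate limit and observed bandwidth, truncated to kilobytes.
    kilobytes = min(rate_bytes, observed_bytes) // 1024
    # Authorities cap the value (10MB/sec = 10240 KB/s in the text).
    return min(kilobytes, cap_kb)


def consensus_bandwidth(vote_values):
    """Median of the voted values; the lower median for an even count."""
    ordered = sorted(vote_values)
    return ordered[(len(ordered) - 1) // 2]
```

  For example, a descriptor with a rate of 198000 and an observed
  bandwidth of 250000 bytes per second gives vote_bandwidth(198000,
  250000) == 193, matching the "w Bandwidth=193" line above.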

3.2 Fetching descriptors on demand

  As described in 2.4 a descriptor lists IP address, OR- and Dir-Port,
  and the onion key for a server.

  A client already knows the IP address and the ports from the consensus
  documents, but without the onion key it will not be able to send
  CREATE/EXTEND cells for that server.  Since the client needs the onion
  key it needs the descriptor.

  If a client only downloaded a few descriptors in an observable manner
  then that would leak which nodes it was going to use.

  This proposal suggests the following:

  1) when connecting to a guard node for which the client does not
     yet have a cached descriptor it requests the descriptor it
     expects by hash.  (The consensus document that the client holds
     has a hash for the descriptor of this server.  We want exactly
     that descriptor, not a different one.)

     It does that by sending a RELAY_REQUEST_SD cell.

     A client MAY cache the descriptor of the guard node so that it does
     not need to request it every single time it contacts the guard.

  2) when a client wants to extend a circuit that currently ends in
     server B to a new next server C, the client will send a
     RELAY_REQUEST_SD cell to server B.  This cell contains in its
     payload the hash of a server descriptor the client would like
     to obtain (C's server descriptor).  The server sends back the
     descriptor and the client can now form a valid EXTEND/CREATE cell
     encrypted to C's onion key.

     Clients MUST NOT cache such descriptors.  If they did they might
     leak that they already extended to that server at least once
     before.

  Replies to RELAY_REQUEST_SD requests need to be padded to some
  constant upper limit in order to conceal a client's destination
  from anybody who might be counting cells/bytes.

  RELAY_REQUEST_SD cells contain the following information:
    - hash of the server descriptor requested
    - hash of the identity digest of the server for which we want the SD
    - IP address and OR-port of the server for which we want the SD
    - padding factor - the number of cells we want the answer
      padded to.
      [XXX this just occurred to me and it might be smart.  or it might
       be stupid.  clients would learn the padding factor they want
       to use from the consensus document.  This allows us to grow
       the replies later on should SDs become larger.]
  [XXX: figure out a decent padding size]

3.3 Protocol versions

  Server descriptors contain optional information of supported
  link-level and circuit-level protocols in the form of
  "opt protocols Link 1 2 Circuit 1".  These are not currently needed
  and will probably eventually move into the "v" (version) line in
  the consensus.  This proposal does not deal with them.

  Similarly a server descriptor contains the version number of
  a Tor node.  This information is already present in the consensus
  and is thus available to all clients immediately.

3.4 Exit selection

  Currently finding an appropriate exit node for a user's request is
  easy for a client because it has complete knowledge of all the exit
  policies of all servers on the network.

  The consensus document will once again be extended to contain the
  information required by clients.  This information will be a summary
  of each node's exit policy.  The exit policy summary will only contain
  the list of ports to which a node exits to most destination IP
  addresses.

  A summary should claim a router exits to a specific TCP port if,
  ignoring private IP addresses, the exit policy indicates that the
  router would exit to this port to most IP addresses (that is, to all
  but at most the equivalent of two /8 netblocks: either two /8
  netblocks, or one /8 and a couple of /12s or any other combination).
  The exact algorithm used is this:  Going through all exit policy items
   - ignore any accept that is not for all IP addresses ("*"),
   - ignore rejects for these netblocks (exactly, no subnetting):
     0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8,
     and 172.16.0.0/12,
   - for each reject count the number of IP addresses rejected against
     the affected ports,
   - once we hit an accept for all IP addresses ("*") add the ports in
     that policy item to the list of accepted ports, if they don't have
     more than 2^25 IP addresses (that's two /8 networks) counted
     against them (i.e. if the router exits to a port to everywhere but
     at most two /8 networks).
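
  The counting rule above can be sketched as follows.  This is a minimal
  illustration, not the authorities' implementation: the tuple
  representation of policy items, the name summarize_accepted_ports, and
  the choice to count a reject for all addresses ("*") as 2^32 addresses
  are assumptions made for the example, and only IPv4 is considered.

```python
import ipaddress

# Netblocks whose rejects are ignored (exact matches only, no subnetting).
IGNORED_REJECTS = {
    ipaddress.ip_network(n)
    for n in ("0.0.0.0/8", "169.254.0.0/16", "127.0.0.0/8",
              "192.168.0.0/16", "10.0.0.0/8", "172.16.0.0/12")
}
MAX_REJECTED = 2 ** 25  # two /8 networks


def summarize_accepted_ports(policy):
    """Return the set of ports a summary would list as accepted.

    policy is a list of (action, address, first_port, last_port) tuples
    (a hypothetical representation); address is "*" or an IPv4 netblock.
    """
    rejected = [0] * 65536  # addresses rejected so far, per port
    accepted = set()
    for action, address, first, last in policy:
        ports = range(first, last + 1)
        if action == "accept":
            if address != "*":
                continue  # accepts for only some addresses are ignored
            accepted.update(p for p in ports if rejected[p] <= MAX_REJECTED)
        else:
            if address == "*":
                count = 2 ** 32  # assumption: reject-all counts everything
            else:
                net = ipaddress.ip_network(address)
                if net in IGNORED_REJECTS:
                    continue
                count = net.num_addresses
            for p in ports:
                rejected[p] += count
    return accepted
```

  With this sketch, a port that is rejected for three /8 networks before
  the final accept is left out of the summary, while a port rejected for
  a single /8 is still listed as accepted.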

  An exit policy summary will be included in votes and consensus as a
  new line attached to each exit node.  The line will have the format
   "p" <space> "accept"|"reject" <portlist>
  where portlist is a comma separated list of single port numbers or
  portranges (e.g.  "22,80-88,1024-6000,6667").

  Whether the summary shows the list of accepted ports or the list of
  rejected ports depends on which list is shorter (has a shorter string
  representation).  In case of ties we choose the list of accepted
  ports.  As an exception to this rule an allow-all policy is
  represented as "accept 1-65535" instead of "reject " and a reject-all
  policy is similarly given as "reject 1-65535".

  Summary items are compressed, that is instead of "80-88,89-100" there
  only is a single item of "80-100", similarly instead of "20,21" a
  summary will say "20-21".

  Port lists are sorted in ascending order.

  The maximum allowed length of a policy summary (including the "accept "
  or "reject ") is 1000 characters.  If a summary exceeds that length we
  use an accept-style summary and list as much of the port list as is
  possible within these 1000 bytes.
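
  The formatting rules above (merging adjacent ports, picking the shorter
  of the accept and reject forms, and the 1000-character cap) can be
  sketched like this.  The helper names compress, fit and policy_summary
  are hypothetical, and the sketch assumes the set of accepted ports has
  already been computed.

```python
MAX_SUMMARY_LEN = 1000
ALL_PORTS = frozenset(range(1, 65536))


def compress(ports):
    """Return sorted, merged items such as ["20-21", "80-100"]."""
    items = []
    start = end = None
    for port in sorted(ports):
        if end is not None and port == end + 1:
            end = port  # extend the current range
            continue
        if start is not None:
            items.append(str(start) if start == end else f"{start}-{end}")
        start = end = port
    if start is not None:
        items.append(str(start) if start == end else f"{start}-{end}")
    return items


def fit(keyword, items, limit):
    """Join as many whole items as fit into limit characters."""
    out = keyword + " "
    for i, item in enumerate(items):
        piece = item if i == 0 else "," + item
        if len(out) + len(piece) > limit:
            break
        out += piece
    return out


def policy_summary(accepted):
    """Build the text that follows "p " for a set of accepted ports."""
    accepted = ALL_PORTS & set(accepted)
    if accepted == ALL_PORTS:
        return "accept 1-65535"
    if not accepted:
        return "reject 1-65535"
    acc_items = compress(accepted)
    acc = "accept " + ",".join(acc_items)
    rej = "reject " + ",".join(compress(ALL_PORTS - accepted))
    summary = acc if len(acc) <= len(rej) else rej  # ties go to accept
    if len(summary) > MAX_SUMMARY_LEN:
        # Too long: fall back to a truncated accept-style summary.
        summary = fit("accept", acc_items, MAX_SUMMARY_LEN)
    return summary
```

  For example, policy_summary(range(80, 101)) gives "accept 80-100", and
  a node that accepts everything except port 25 gets "reject 25".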

3.4.1 Consensus selection

  When building a consensus, authorities have to agree on a digest of
  the server descriptor to list in the router line for each router.
  This is documented in dir-spec section 3.4.

  All authorities that listed that agreed upon descriptor digest in
  their vote should also list the same exit policy summary - or list
  none at all if the authority has not been upgraded to list that
  information in their vote.

  If we have votes with matching server descriptor digest of which at
  least one of them has an exit policy then we distinguish two cases:
   a) all authorities agree (or abstained) on the policy summary, and we
      use the exit policy summary that they all listed in their vote,
   b) something went wrong (or some authority is playing foul) and we
      have different policy summaries.  In that case we pick the one
      that is most commonly listed in votes with the matching
      descriptor.  We break ties in favour of the lexicographically
      larger vote.

  If none of the votes with a matching server descriptor digest has
  an exit policy summary we use the most commonly listed one in all
  votes, breaking ties like in case b above.
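
  A minimal sketch of this selection rule follows.  The vote
  representation (a list of (descriptor digest, summary) pairs, with
  None for authorities that list no summary) and the function name are
  assumptions, as is the reading that "lexicographically larger" compares
  the summary strings themselves.

```python
from collections import Counter


def choose_policy_summary(votes, agreed_digest):
    """Pick the exit policy summary to put into the consensus.

    votes is a list of (descriptor_digest, summary_or_None) pairs, one
    per authority (a hypothetical representation).
    """
    # Prefer summaries from votes that list the agreed-upon descriptor.
    matching = [s for d, s in votes if d == agreed_digest and s is not None]
    # Otherwise fall back to summaries from all votes.
    candidates = matching or [s for _, s in votes if s is not None]
    if not candidates:
        return None
    counts = Counter(candidates)
    best = max(counts.values())
    # Most common summary wins; ties go to the larger string.
    return max(s for s, c in counts.items() if c == best)
```

  When all upgraded authorities agree, the Counter holds a single entry
  and case a) falls out of the same code path as case b).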

3.4.2 Client behaviour

  When choosing an exit node for a specific request a Tor client will
  choose from the list of nodes that exit to the requested port as given
  by the consensus document.  If a client has additional knowledge (like
  cached full descriptors) that indicates the so chosen exit node will
  reject the request then it MAY use that knowledge (or not include such
  nodes in the selection to begin with).  However, clients MUST NOT use
  nodes that do not list the port as accepted in the summary (but for
  which they know that the node would exit to that address from other
  sources, like a cached descriptor).

  An exception to this is exit enclave behaviour: A client MAY use the
  node at a specific IP address to exit to any port on the same address
  even if that node is not listed as exiting to the port in the summary.
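
  The client-side check against a "p" line can be sketched as below.
  The function names are hypothetical, and the sketch leaves out the
  exit enclave exception and any knowledge from cached descriptors.

```python
def parse_summary(line):
    """Parse a line like "p accept 22,80-88" into (action, ranges)."""
    keyword, action, portlist = line.split(" ", 2)
    if keyword != "p" or action not in ("accept", "reject"):
        raise ValueError("not an exit policy summary line")
    ranges = []
    for item in portlist.split(","):
        low, _, high = item.partition("-")
        ranges.append((int(low), int(high or low)))
    return action, ranges


def summary_allows(line, port):
    """Return True if the summary lists the port as accepted."""
    action, ranges = parse_summary(line)
    listed = any(low <= port <= high for low, high in ranges)
    return listed if action == "accept" else not listed
```

  A client would only consider nodes for which summary_allows returns
  True for the requested port.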

4. Migration

4.1 Consensus document changes.

  The consensus will need to include
    - bandwidth information (see 3.1)
    - exit policy summaries (3.4)

  A new consensus method (number TBD) will be chosen for this.

5. Future possibilities

  This proposal still requires that all servers have the descriptors of
  every other node in the network in order to answer RELAY_REQUEST_SD
  cells.  These cells are sent when a circuit is extended from ending at
  node B to a new node C.  In that case B would have to answer a
  RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).

  In order to answer that request B obviously needs a copy of C's server
  descriptor.  The RELAY_REQUEST_SD cell already has all the info that
  B needs to contact C so it can ask about the descriptor before passing it
  back to the client.

Filename: 142-combine-intro-and-rend-points.txt
Title: Combine Introduction and Rendezvous Points
Author: Karsten Loesing, Christian Wilms
Created: 27-Jun-2008
Status: Dead

Change history:

  27-Jun-2008  Initial proposal for or-dev
  04-Jul-2008  Give first security property the new name "Responsibility"
               and change new cell formats according to rendezvous protocol
               version 3 draft.
  19-Jul-2008  Added comment by Nick (but no solution, yet) that sharing of
               circuits between multiple clients is not supported by Tor.

Overview:

  Establishing a connection to a hidden service currently involves two Tor
  relays, introduction and rendezvous point, and 10 more relays distributed
  over four circuits to connect to them. The introduction point is
  established in the mid-term by a hidden service to transfer introduction
  requests from client to the hidden service. The rendezvous point is set
  up by the client for a single hidden service request and actually
  transfers end-to-end encrypted application data between client and hidden
  service.

  There are some reasons for separating the two roles of introduction and
  rendezvous point: (1) Responsibility: A relay shall not be made
  responsible that it relays data for a certain hidden service; in the
  original design as described in [1] an introduction point relays no
  application data, and a rendezvous point neither knows the hidden
  service nor can it decrypt the data. (2) Scalability: The hidden service
  shall not have to maintain a number of open circuits proportional to the
  expected number of client requests. (3) Attack resistance: The effect of
  an attack on the only visible parts of a hidden service, its introduction
  points, shall be as small as possible.

  However, elimination of a separate rendezvous connection as proposed by
  Øverlier and Syverson [2] is the most promising approach to improve the
  delay in connection establishment. Of all substeps of connection
  establishment, extending a circuit by only a single hop is responsible for
  a major part of the delay. Reducing on-demand circuit extensions from two to
  one results in a decrease of mean connection establishment times from 39
  to 29 seconds [3]. Particularly, eliminating the delay on hidden-service
  side allows the client to better observe progress of connection
  establishment, thus allowing it to use smaller timeouts. Proposal 114
  introduced new introduction keys for introduction points and provides for
  user authorization data in hidden service descriptors; it will be shown
  in this proposal that introduction keys in combination with new
  introduction cookies provide for the first security property
  responsibility. Further, eliminating the need for a separate introduction
  connection benefits the overall network load by decreasing the number of
  circuit extensions. After all, having only one connection between client
  and hidden service reduces the overall protocol complexity.

Design:

  1. Hidden Service Configuration

  Hidden services should be able to choose whether they would like to use
  this protocol. This might be opt-in for 0.2.1.x and opt-out for later
  major releases.

  2. Contact Point Establishment

  When preparing a hidden service, a Tor client selects a set of relays to
  act as contact points instead of introduction points. The contact point
  combines both roles of introduction and rendezvous point as proposed in
  [2]. The only requirement for a relay to be picked as contact point is
  its capability of performing this role. This can be determined from the
  Tor version number that needs to be equal or higher than the first
  version that implements this proposal.

  The easiest way to implement establishment of contact points is to
  introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes
  version 2 ESTABLISH_INTRO cells as requests to establish a contact point
  rather than an introduction point.

     V      Format byte: set to 255               [1 octet]
     V      Version byte: set to 2                [1 octet]
     KLEN   Key length                           [2 octets]
     PK     Public introduction key           [KLEN octets]
     HS     Hash of session info                [20 octets]
     SIG    Signature of above information       [variable]
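
  As an illustration of this layout, the fields could be serialized as
  shown below.  This is only a sketch: the function name is
  hypothetical, network byte order for KLEN is an assumption, and the
  session hash and signature are taken as already-computed inputs.

```python
import struct


def pack_establish_intro_v2(public_key, session_hash, signature):
    """Serialize the fields of a v2 ESTABLISH_INTRO cell body."""
    if len(session_hash) != 20:
        raise ValueError("HS must be 20 octets")
    # Format byte (255), version byte (2), key length (2 octets).
    header = struct.pack("!BBH", 255, 2, len(public_key))
    return header + public_key + session_hash + signature
```

  A relay receiving such a body could read the first two octets to tell
  a contact point request from a plain introduction point request.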

  The hidden service does not create a fixed number of contact points, like
  3 in the current protocol. It uses a minimum of 3 contact points, but
  increases this number depending on the history of client requests within
  the last hour. The hidden service also increases this number depending on
  the frequency of failing contact points in order to defend against
  attacks on its contact points. When client authorization as described in
  proposal 121 is used, a hidden service can also use the number of
  authorized clients as first estimate for the required number of contact
  points.

  3. Hidden Service Descriptor Creation

  A hidden service needs to issue a fresh introduction cookie for each
  established introduction point. By requiring clients to use this cookie
  in a later connection establishment, an introduction point cannot access
  the hidden service that it works for. Together with the fresh
  introduction key that was introduced in proposal 114, this reduces
  responsibility of a contact point for a specific hidden service.

  The v2 hidden service descriptor format contains an
  "intro-authentication" field that may contain introduction-point specific
  keys. The hidden service creates a random string, comparable to the
  rendezvous cookie, and includes it in the descriptor as introduction
  cookie for auth-type "1". By convention, clients recognize existence of
  auth-type 1 as possibility to connect to a hidden service via a contact
  point rather than an introduction point. Older clients that do not
  understand this new protocol simply ignore that cookie.

  4. Connection Establishment

  When establishing a connection to a hidden service a client learns about
  the capability of using the new protocol from the hidden service
  descriptor. It may choose whether to use this new protocol or not,
  whereas older clients cannot understand the new capability and can only
  use the current protocol. Clients using version 0.2.1.x should be able to
  opt-in for using the new protocol, which should change to opt-out for
  later major releases.

  When using the new capability the client creates a v2 INTRODUCE1 cell
  that extends an unversioned INTRODUCE1 cell by adding the content of an
  ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the
  new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point,
  because unversioned and versioned INTRODUCE1 cells are indistinguishable:

  Cleartext
     V      Version byte: set to 2                [1 octet]
     PK_ID  Identifier for Bob's PK             [20 octets]
     RC     Rendezvous cookie                   [20 octets]
  Encrypted to introduction key:
     VER    Version byte: set to 3.               [1 octet]
     AUTHT  The auth type that is supported       [1 octet]
     AUTHL  Length of auth data                  [2 octets]
     AUTHD  Auth data                            [variable]
     RC     Rendezvous cookie                   [20 octets]
     g^x    Diffie-Hellman data, part 1        [128 octets]

  The cleartext part contains the rendezvous cookie that the contact point
  remembers just as a rendezvous point would do.

  The encrypted part contains the introduction cookie as auth data for the
  auth type 1. The rendezvous cookie is contained as before, but there is
  no further rendezvous point information, as there is no separate
  rendezvous point.

  5. Rendezvous Establishment

  The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a
  request to be used in the new protocol. It remembers the contained
  rendezvous cookie, replies to the client with an INTRODUCE_ACK cell
  (omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted
  part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service.

  6. Introduction at Hidden Service

  The hidden service recognizes an INTRODUCE2 cell containing an
  introduction cookie as authorization data. In this case, it does not
  extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell
  directly back to its contact point as usual.

  7. Rendezvous at Contact Point

  The contact point processes a RENDEZVOUS1 cell just as a rendezvous point
  does. The only difference is that the hidden-service-side circuit is not
  exclusive for the client connection, but shared among multiple client
  connections.

  [Tor does not allow sharing of a single circuit among multiple client
   connections easily. We need to think about a smart and efficient way to
   implement this. Comment by Nick. -KL]

Security Implications:

  (1) Responsibility

  One of the original reasons for the separation of introduction and
  rendezvous points is that a relay shall not be made responsible that it
  relays data for a certain hidden service. In the current design an
  introduction point relays no application data and a rendezvous point
  neither knows the hidden service nor can it decrypt the data.

  This property is also fulfilled in this new design. A contact point only
  learns a fresh introduction key instead of the hidden service key, so
  that it cannot recognize a hidden service. Further, the introduction
  cookie, which is unknown to the contact point, prevents it from accessing
  the hidden service itself. The only way for a contact point to access a
  hidden service is to look up whether it is contained in the descriptors
  of known hidden services. A contact point cannot directly be made
  responsible for which hidden service it is working. In addition to that,
  it cannot learn the data that it transfers, because all communication
  between client and hidden service is end-to-end encrypted.

  (2) Scalability

  Another goal of the existing hidden service protocol is that a hidden
  service does not have to maintain a number of open circuits proportional
  to the expected number of client requests. The rationale behind this is
  better scalability.

  The new protocol eliminates the need for a hidden service to extend
  circuits on demand, which has a positive effect on circuit establishment
  times and overall network load. The solution presented here to establish
  a number of contact points proportional to the history of connection
  requests reduces the number of circuits to a minimum number that fits the
  hidden service's needs.

  (3) Attack resistance

  The third goal of separating introduction and rendezvous points is to
  limit the effect of an attack on the only visible parts of a hidden
  service which are the contact points in this protocol.

  In theory, the new protocol is more vulnerable to this attack. An
  attacker who can take down a contact point does not only eliminate an
  access point to the hidden service, but also breaks current client
  connections to the hidden service using that contact point.

  Øverlier and Syverson proposed the concept of valet nodes as additional
  safeguard for introduction/contact points [4]. Unfortunately, this
  increases hidden service protocol complexity conceptually and from an
  implementation point of view. Therefore, it is not included in this
  proposal.

  However, in practice attacking a contact point (or introduction point) is
  not as rewarding as it might appear. The cost for a hidden service to set
  up a new contact point and publish a new hidden service descriptor is
  minimal compared to the efforts necessary for an attacker to take a Tor
  relay down. As a countermeasure to further frustrate this attack, the
  hidden service raises the number of contact points as a function of
  previous contact point failures.

  Further, the probability of breaking client connections due to attacking
  a contact point is minimal. It can be assumed that the probability of one
  of the other five involved relays in a hidden service connection failing
  or being shut down is higher than that of a successful attack on a
  contact point.

  (4) Resistance against Locating Attacks

  Clients are no longer able to force a hidden service to create or extend
  circuits. This further reduces an attacker's capabilities of locating a
  hidden server as described by Øverlier and Syverson [5].

Compatibility:

  The presented protocol does not raise compatibility issues with current
  Tor versions. New relay versions support both the existing and the
  proposed protocol as introduction/rendezvous/contact points. A contact
  point acts as introduction point simultaneously. Hidden services and
  clients can opt-in to use the new protocol which might change to opt-out
  some time in the future.

References:

  [1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The
  Second-Generation Onion Router. In the Proceedings of the 13th USENIX
  Security Symposium, August 2004.

  [2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity
  of Tor Circuit Establishment and Hidden Services. In the Proceedings of
  the Seventh Workshop on Privacy Enhancing Technologies (PET 2007),
  Ottawa, Canada, June 2007.

  [3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at
  Better Performance, diploma thesis, June 2008, University of Bamberg.

  [4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden
  Servers with a Personal Touch. In the Proceedings of the Sixth Workshop
  on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006.

  [5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the
  Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.

Filename: 143-distributed-storage-improvements.txt
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 28-Jun-2008
Status: Superseded

Change history:

  28-Jun-2008  Initial proposal for or-dev

Overview:

  An evaluation of the distributed storage for Tor hidden service
  descriptors and subsequent discussions have brought up a few improvements
  to proposal 114. All improvements are backwards compatible to the
  implementation of proposal 114.

Design:

  1. Report Bad Directory Nodes

  Bad hidden service directory nodes could deny existence of previously
  stored descriptors. A bad directory node that does this with all stored
  descriptors causes harm to the distributed storage in general, but
  replication will cope with this problem in most cases. However, an
  adversary that attempts to make a specific hidden service unavailable by
  running relays that become responsible for all of a service's
  descriptors poses a more serious threat. The distributed storage needs to
  defend against this attack by detecting and removing bad directory nodes.

  As a countermeasure hidden services try to download their descriptors
  every hour at random times from the hidden service directories that are
  responsible for storing it. If a directory node replies with 404 (Not
  found), the hidden service reports the supposedly bad directory node to
  a random selection of half of the directory authorities (with version
  numbers equal to or higher than the first version that implements this
  proposal). The hidden service posts a complaint message using HTTP 'POST'
  to a URL "/tor/rendezvous/complain" with the following message format:

    "hidden-service-directory-complaint" identifier NL

      [At start, exactly once]

      The identifier of the hidden service directory node to be
      investigated.

    "rendezvous-service-descriptor" descriptor NL

      [At end, exactly once]

      The hidden service descriptor that the supposedly bad directory node
      does not serve.

  The directory authority checks that the descriptor is valid and that the
  hidden service directory is responsible for it. It waits for a random time
  of up to 30 minutes before posting the descriptor to the hidden service
  directory. If the publication is acknowledged, the directory authority
  waits another random time of up to 30 minutes before attempting to
  request the descriptor that it has posted. If the directory node replies
  with 404 (Not found), it will be blacklisted for being a hidden service
  directory node for the next 48 hours.

  A blacklisted hidden service directory is assigned the new flag BadHSDir
  instead of the HSDir flag in the vote that a directory authority creates.
  In a consensus a relay is only assigned a HSDir flag if the majority of
  votes contains a HSDir flag and no more than one third of votes contains
  a BadHSDir flag. As a result, clients do not have to learn about the
  BadHSDir flag. A blacklisted directory node will simply not be assigned
  the HSDir flag in the consensus.
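
  The flag rule above can be sketched as a small predicate.  The
  representation of votes as sets of flag names and the function name
  are assumptions made for the example.

```python
def consensus_assigns_hsdir(vote_flags):
    """Decide whether a relay gets the HSDir flag in the consensus.

    vote_flags holds one set of flag names per vote (hypothetical
    representation).
    """
    total = len(vote_flags)
    hsdir = sum("HSDir" in flags for flags in vote_flags)
    bad = sum("BadHSDir" in flags for flags in vote_flags)
    # Majority votes HSDir, and no more than one third vote BadHSDir.
    return 2 * hsdir > total and 3 * bad <= total
```

  With nine votes, six HSDir and three BadHSDir still yields the flag,
  while five HSDir and four BadHSDir does not.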

  In order to prevent an attacker from setting up new nodes as replacement
  for blacklisted directory nodes, all directory nodes in the same /24
  subnet are blacklisted, too. Furthermore, if two or more directory nodes
  are blacklisted in the same /16 subnet concurrently, all other directory
  nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at
  most 48 hours.
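
  The subnet rule can be sketched as follows.  The function and
  parameter names are hypothetical, only IPv4 is handled, the 48-hour
  expiry is not modelled, and the sketch assumes that only nodes caught
  directly (not those blacklisted through their /24) count towards the
  "two or more in the same /16" condition.

```python
import ipaddress
from collections import Counter


def blacklisted_nodes(all_nodes, confirmed_bad):
    """Return the names of all directory nodes to blacklist.

    all_nodes maps node name to IPv4 address string; confirmed_bad is
    the set of names that failed the authority's test.
    """
    def net(addr, prefix):
        return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

    bad24 = {net(all_nodes[n], 24) for n in confirmed_bad}
    per16 = Counter(net(all_nodes[n], 16) for n in confirmed_bad)
    bad16 = {n16 for n16, count in per16.items() if count >= 2}
    return {name for name, addr in all_nodes.items()
            if net(addr, 24) in bad24 or net(addr, 16) in bad16}
```

  One bad node takes its /24 neighbours with it; two bad nodes in the
  same /16 take the whole /16.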

  2. Publish Fewer Replicas

  The evaluation has shown that the probability of a directory node to
  serve a previously stored descriptor is 85.7% (more precisely, this is
  the 0.001-quantile of the empirical distribution with the rationale that
  it holds for 99.9% of all empirical cases). If descriptors are replicated
  to x directory nodes, the probability of at least one of the replicas to
  be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an
  overall availability of 99.9%, x = 3.55 replicas need to be stored. From
  this follows that 4 replicas are sufficient, rather than the currently
  stored 6 replicas.
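
  The figure of 3.55 replicas comes from solving 1 - (1 - p)^x = 0.999
  for x with p = 85.7%.  A short check of that arithmetic (the function
  name is hypothetical):

```python
import math


def replicas_needed(p_node, target):
    """Solve 1 - (1 - p_node)**x = target for x; also return ceil(x)."""
    x = math.log(1 - target) / math.log(1 - p_node)
    return x, math.ceil(x)


x, replicas = replicas_needed(0.857, 0.999)
print(round(x, 2), replicas)
```

  Rounding up gives 4 replicas.  With 3 replicas the availability would
  be about 99.7%, which falls short of the 99.9% target.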

  Further, the current design stores 2 sets of descriptors on 3 directory
  nodes with consecutive identities. Originally, this was meant to
  facilitate replication between directory nodes, which has not been and
  will not be implemented (the selection criterion of 24 hours uptime does
  not make it necessary). As a result, storing descriptors on directory
  nodes with consecutive identities is not required. In fact, it should be
  avoided, so that an attacker cannot create "black holes" in the identifier
  ring.

  Hidden services should store their descriptors on 4 non-consecutive
  directory nodes, and clients should request descriptors from these
  directory nodes only. For compatibility reasons, hidden services also
  store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x
  clients will be able to retrieve 4 out of 6 descriptors, but will fail
  for the remaining 2 descriptors, which is sufficient for reliability. As
  soon as 0.2.0.x is deprecated, hidden services can stop publishing the
  additional 2 replicas.

  3. Change Default Value of Being Hidden Service Directory

  The requirements for becoming a hidden service directory node are an open
  directory port and an uptime of at least 24 hours. The evaluation has
  shown that there are 300 hidden service directory candidates in the mean,
  but only 6 of them are configured to act as hidden service directories.
  This is bad, because those 6 nodes need to serve a large share of all
  hidden service descriptors. Optimally, there should be hundreds of hidden
  service directories. Having a large number of 0.2.1.x directory nodes
  also has a positive effect on 0.2.0.x hidden services and clients.

  Therefore, the new default of HidServDirectoryV2 should be 1, so that a
  Tor relay that has an open directory port automatically accepts and
  serves v2 hidden service descriptors. A relay operator can still opt-out
  running a hidden service directory by changing HidServDirectoryV2 to 0.
  The additional bandwidth requirements for running a hidden service
  directory node in addition to being a directory cache are negligible.

  4. Make Descriptors Persistent on Directory Nodes

  Hidden service directories that are restarted by their operators or after
  a failure will not be selected as hidden service directories within the
  next 24 hours. However, some clients might still think that these nodes
  are responsible for certain descriptors, because they work on the basis
  of network consensuses that are up to three hours old. The directory
  nodes should be able to serve the previously received descriptors to
  these clients. Therefore, directory nodes make all received descriptors
  persistent and load previously received descriptors on startup.

  5. Store and Serve Descriptors Regardless of Responsibility

  Currently, directory nodes only accept descriptors for which they think
  they are responsible. This may lead to problems when a directory node
  uses an older or newer network consensus than hidden service or client
  or when a directory node has been restarted recently. In fact, there are
  no security issues in storing or serving descriptors for which a
  directory node thinks it is not responsible. To the contrary, doing so
  may improve reliability in border cases. As a result, a directory node
  does not pay attention to responsibility when receiving a publication or
  fetch request, but stores or serves the requested descriptor. Likewise,
  the directory node does not remove descriptors when it thinks it is not
  responsible for them any more.

  6. Avoid Periodic Descriptor Re-Publication

  In the current implementation a hidden service re-publishes its
  descriptor either when its content changes or an hour elapses. However,
  the evaluation has shown that failures of hidden service directory nodes,
  i.e. of nodes that have not failed within the last 24 hours, are very
  rare. Together with making descriptors persistent on directory nodes,
  there is no necessity to re-publish descriptors hourly.

  The only two events leading to descriptor re-publication should be a
  change of the descriptor content and a new directory node becoming
  responsible for the descriptor. Hidden services should therefore consider
  re-publication every time they learn about a new network consensus
  instead of hourly.

  7. Discard Expired Descriptors

  The current implementation lets directory nodes keep a descriptor for two
  days before discarding it. However, with the v2 design, descriptors are
  only valid for at most one day. Directory nodes should determine the
  validity of stored descriptors and discard them one hour after they have
  expired (to compensate for wrong clocks on clients).

  8. Shorten Client-Side Descriptor Fetch History

  When clients try to download a hidden service descriptor, they memorize
  fetch requests to directory nodes for up to 15 minutes. This allows them
  to request all replicas of a descriptor to avoid bad or failing directory
  nodes, but without querying the same directory node twice.

  The downside is that a client that has requested a descriptor without
  success, will not be able to find a hidden service that has been started
  during the following 15 minutes after the client's last request.

  This can be improved by shortening the fetch history to only 5 minutes.
  This time should be sufficient to complete requests for all replicas of a
  descriptor, but without ending in an infinite request loop.

Compatibility:

  All proposed improvements are compatible to the currently implemented
  design as described in proposal 114.

Filename: 144-enforce-distinct-providers.txt
Title: Increase the diversity of circuits by detecting nodes belonging the
   same provider
Author: Mfr
Created: 2008-06-15
Status: Obsolete

Overview:

  Increase network security by reducing the capacity of relay operators
  or ISPs, whether monitoring on their own or under requisition, to
  observe a large part of Tor traffic in an attempt to break circuit
  privacy.  This is a way to increase the diversity of circuits without
  killing the network performance.

Motivation:

  Since Roger and Nick's 2004 publication about diversity [1], very fast
  Tor relays have become concentrated among half a dozen providers,
  which control the traffic of some dozens of routers [2].

  In the same way, the spread of clonable VMs paid for by the hour makes
  it possible to start, within a few minutes and at small cost, a set of
  very high-speed relays that can attract a large amount of traffic
  within a few hours.  That traffic can then be analyzed, increasing the
  vulnerability of the network.

  Whether they are ISPs or domU providers, these usually hold several
  class B IP ranges.  The existing EnforceDistinctSubnets restriction,
  which automatically excludes relays in the same class B subnet, is
  therefore only partially effective.  By contrast, a restriction at
  the class A level would be too restrictive.

  Therefore it seems necessary to consider another approach.

Proposal:

  Add a provider control based on the AS number, which the router adds
  to its descriptor, the Directory Authorities verify, and clients use
  like the declarative family field when creating circuits.

Design:

Step 1 :

  Add to the router descriptor provider information that the router
  itself obtains by request [4].

         "provider" name NL

            'name' is the AS number of the router, formatted like this:
            'ASxxxxxx' where AS is fixed and xxxxxx is the AS number,
            left aligned (e.g. AS98304, AS4096, AS1).  If the AS number
            is missing, the network class A number is used instead:
            'ANxxx' where AN is fixed and xxx is the first 3 digits of
            the IP (e.g. AN1 for the IP 1.1.1.2).  An 'L' value is set
            if it's a local network IP.

            If two ORs list one another in their "provider" entries,
            then OPs should treat them as a single OR for the purpose
            of path selection.

            For example, if node A's descriptor contains "provider B",
            and node B's descriptor contains "provider A", then node A
            and node B should never be used on the same circuit.
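
  The naming rule for the "provider" entry can be sketched as below.
  The function name is hypothetical, only IPv4 is handled, and two
  readings are assumed: that the local-network check takes precedence
  over the AS number, and that "first 3 digits of the IP" means the
  first octet (as the AN1 example suggests).  How the AS number is
  looked up is left out.

```python
import ipaddress


def provider_name(ip, as_number=None):
    """Return the value for a router's "provider" descriptor entry."""
    addr = ipaddress.ip_address(ip)
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        return "L"  # local network IP
    if as_number is not None:
        return f"AS{as_number}"
    # No AS number known: fall back to the class A number.
    return "AN" + str(addr).split(".")[0]
```

  For instance, provider_name("1.1.1.2") yields "AN1" when no AS number
  is available.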

    Add the corresponding config option to torrc

            EnforceDistinctProviders is set to 1 by default.
            Building circuits with relays in the same provider is
            permitted if it is set to 0.
            With regard to proposal 135, if TestingTorNetwork is set,
            EnforceDistinctProviders needs to be unset.

    Control of the AS numbers by the Directory Authorities

         The Directory Authorities check the AS number in each newly
         uploaded node descriptor.

            If the node runs an old version, this test is bypassed.

            If the AS number obtained by request differs from the one
            in the descriptor, the router is flagged as non-Valid by
            the testing Authority for the voting process.

Step 2     When a 'significant number of nodes' among valid routers are
generating descriptors with provider information.

        Add a function for the circuit user that gets missing provider
information by DNS request:

                While computing a circuit, the OP first applies the
                family check and the EnforceDistinctSubnets directive,
                for performance.  Then, if provider info is needed and
                missing from a router descriptor, it tries to get the
                AS provider info by DNS request [4].  This information
                could be cached by DNS.  AN (class A number) is never
                generated during this process, to prevent DNS block
                problems.  If the DNS request fails, ignore the failure
                and continue building the circuit.
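
The order of checks in step 2 can be sketched as below. This is a
minimal illustration, not Tor code: the `Relay` type, `same_slash16`,
`may_share_circuit`, and the `lookup_provider` callback (standing in
for the DNS request of [4], returning None on failure) are all
hypothetical names, and the /16 comparison is assumed to be what
EnforceDistinctSubnets checks.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass
class Relay:
    nickname: str
    ip: str
    family: Set[str] = field(default_factory=set)
    provider: Optional[str] = None  # e.g. "AS4096"; None if absent from descriptor


def same_slash16(a: str, b: str) -> bool:
    """True if both IPv4 addresses fall in the same /16 (class B) network."""
    net = ipaddress.ip_network(f"{a}/16", strict=False)
    return ipaddress.ip_address(b) in net


def may_share_circuit(a: Relay, b: Relay,
                      lookup_provider: Callable[[str], Optional[str]],
                      enforce_distinct_providers: bool = True) -> bool:
    # 1. Cheap checks first: mutual family declaration, then subnet.
    if b.nickname in a.family and a.nickname in b.family:
        return False
    if same_slash16(a.ip, b.ip):
        return False
    if not enforce_distinct_providers:
        return True
    # 2. Provider check; fetch missing info only when it is needed.
    pa = a.provider or lookup_provider(a.ip)
    pb = b.provider or lookup_provider(b.ip)
    if pa is None or pb is None:
        return True  # lookup failed: ignore and continue building
    return pa != pb
```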

Step 3 When the 'whole majority' of valid Tor clients are able to make
the DNS request.

        Older versions are deprecated and marked as not Valid.

  EnforceDistinctProviders replaces the EnforceDistinctSubnets functionality.

        EnforceDistinctSubnets is removed.

        Functionalities deployed in step 2 are removed.

Security implications:

      This provider measure will increase the number of provider
      addresses that an attacker must use in order to carry out
      traffic analysis.

Compatibility:

        The presented protocol does not raise compatibility issues
        with current Tor versions. The compatibility is preserved by
        implementing this functionality in 3 steps, giving time to
        network users to upgrade clients and routers.

Performance and scalability notes:

        Changing provider for every router could slightly reduce
        performance if the circuit is too long.

        During step 2, getting missing provider information could
        increase path-building time, and should have a timeout.

Possible Attacks/Open Issues/Some thinking required:

        This proposal seems to be compatible with proposal 135,
        Simplify Configuration of Private Tor Networks.

        This proposal does not resolve attacks involving owners of
        multiple ASes or traffic monitoring by top providers [5].

        Unresolved AS numbers are treated as a Class A network. Perhaps
        they should be marked as invalid.  But there were only five
        such items on the last check; see [2].

        Need to define what a 'significant number of nodes' and a
        'whole majority' are ;-)

References:
[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger
Dingledine.
In the Proceedings of the Workshop on Privacy in the Electronic Society
(WPES 2004), Washington, DC, USA, October 2004
http://freehaven.net/anonbib/#feamster:wpes2004
[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt
[3] see Goodell Tor Exit Page
http://cassandra.eecs.harvard.edu/cgi-bin/exit.py
[4] see the great IP to ASN DNS Tool
http://www.team-cymru.org/Services/ip-to-asn.html
[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by
Steven J. Murdoch and Piotr Zielinski.
In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies
(PET 2007), Ottawa, Canada, June 2007.
http://freehaven.net/anonbib/#murdoch-pet2007
[5] http://bugs.noreply.org/flyspray/index.php?do=details&id=690
Filename: 145-newguard-flag.txt
Title: Separate "suitable as a guard" from "suitable as a new guard"
Author: Nick Mathewson
Created: 1-Jul-2008
Status: Superseded

[This could be obsoleted by proposal 141, which could replace NewGuard
with a Guard weight.]

[This _is_ superseded by 236, which adds guard weights for real.]

Overview

   Right now, Tor has one flag that clients use both to tell which
   nodes should be kept as guards, and which nodes should be picked
   when choosing new guards.  This proposal separates this flag into
   two.

Motivation

   Balancing clients among guards is not done well by our current
   algorithm.  When a new guard appears, it is chosen by clients
   looking for a new guard with the same probability as all existing
   guards... but new guards are likelier to be under capacity, whereas
   old guards are likelier to be under more use.

Implementation

   We add a new flag, NewGuard.  Clients will change so that when they
   are choosing new guards, they only consider nodes with the NewGuard
   flag set.

   For now, authorities will always set NewGuard if they are setting
   the Guard flag.  Later, it will be easy to migrate authorities to
   set NewGuard for underused guards.
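
   As a rough sketch of the client-side change, assuming nodes are
   represented as dicts with a "flags" set (a hypothetical
   representation) and ignoring bandwidth weighting for simplicity:

```python
import random


def keep_as_guard(node) -> bool:
    """An existing guard is kept as long as it still has the Guard flag."""
    return "Guard" in node["flags"]


def pick_new_guard(nodes, rng=random):
    """When choosing a new guard, only consider nodes with NewGuard set."""
    candidates = [n for n in nodes if "NewGuard" in n["flags"]]
    # Uniform choice is a simplification made for this sketch.
    return rng.choice(candidates) if candidates else None
```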

Alternatives

   We might instead have authorities list weights with which nodes
   should be picked as guards.
Filename: 146-long-term-stability.txt
Title: Add new flag to reflect long-term stability
Author: Nick Mathewson
Created: 19-Jun-2008
Status: Superseded
Superseded-by: 206

Status:

  The applications of this design are achieved by proposal 206 instead.
  Instead of having the authorities track long-term stability for nodes
  that might be useful as directories in a fallback consensus, we
  eliminated the idea of a fallback consensus, and just have a DirSource
  configuration option.  (Nov 2013)


Overview

  This document proposes a new flag to indicate that a router has
  existed at the same address for a long time, describes how to
  implement it, and explains what it's good for.

Motivation

  Tor has had three notions of "stability" for servers.  Older
  directory protocols based a server's stability on its
  (self-reported) uptime: a server that had been running for a day was
  more stable than a server that had been running for five minutes,
  regardless of their past history.  Current directory protocols track
  weighted mean time between failure (WMTBF) and weighted fractional
  uptime (WFU).  WFU is computed as the fraction of time for which the
  server is running, with measurements weighted to exponentially
  decay such that old days count less.  WMTBF is computed as the
  average length of intervals for which the server runs between
  downtime, with old intervals weighted to count less.

  WMTBF is useful in answering the question: "If a server is running
  now, how long is it likely to stay running?"  This makes it a good
  choice for picking servers for streams that need to be long-lived.
  WFU is useful in answering the question: "If I try connecting to
  this server at an arbitrary time, is it likely to be running?"  This
  makes it an important factor for picking guard nodes, since we want
  guard nodes to be usually-up.

  There are other questions that clients want to answer, however, for
  which the current flags aren't very useful.   The one that this
  proposal addresses is,

       "If I found this server in an old consensus, is it likely to
       still be running at the same address?"

  This one is useful when we're trying to find directory mirrors in a
  fallback-consensus file.  This property is equivalent to,

       "If I find this server in a current consensus, how long is it
       likely to exist on the network?"

  This one is useful if we're trying to pick introduction points or
  something and care more about churn rate than about whether every IP
  will be up all the time.

Implementation:

  I propose we add a new flag, called "Longterm."  Authorities should
  set this flag for routers if their Longevity is in the upper
  quartile of all routers.  A router's Longevity is computed as the
  total number of days in the last year or so[*] on which the router has
  been Running at least once at its current IP:orport pair.
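
  One possible way to compute the flag is sketched below. The function
  name and the input format (a dict from router identity to Longevity
  in days) are hypothetical, and the exact quartile convention is an
  assumption, since the proposal leaves it open.

```python
def longterm_routers(longevity_days: dict) -> set:
    """Return the routers whose Longevity is in the upper quartile.

    longevity_days maps a router id to the number of days in the last
    year on which it was Running at its current IP:ORPort.
    """
    if not longevity_days:
        return set()
    values = sorted(longevity_days.values())
    # Assumed convention: the threshold is the value three quarters of
    # the way through the sorted list; ties at the threshold all qualify.
    threshold = values[(3 * len(values)) // 4]
    return {r for r, days in longevity_days.items() if days >= threshold}
```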

  Clients should use directory servers from a fallback-consensus only
  if they have the Longterm flag set.

  Authority ops should be able to mark particular routers as not
  Longterm, regardless of history.  (For instance, it makes sense to
  remove the Longterm flag from a router whose op says that it will
  need to shut down in a month.)

  [*] This is deliberately vague, to permit efficient implementations.

Compatibility and migration issues:

  The voting protocol already acts gracefully when new flags are
  added, so no change to the voting protocol is needed.

  Tor won't have collected this data, however.  It might be desirable
  to bootstrap it from historical consensuses.  Alternatively, we can
  just let the algorithm run for a month or two.

Issues and future possibilities:

  Longterm is a really awkward name.


Filename: 147-prevoting-opinions.txt
Title: Eliminate the need for v2 directories in generating v3 directories
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Rejected
Target: 0.2.4.x

Overview

  We propose a new v3 vote document type to replace the role of v2
  networkstatus information in generating v3 consensuses.

Motivation

  When authorities vote on which descriptors are to be listed in the
  next consensus, it helps if they all know about the same descriptors
  as one another.  But a hostile, confused, or out-of-date server may
  upload a descriptor to only some authorities.  In the current v3
  directory design, the authorities don't have a good way to tell one
  another about the new descriptor until they exchange votes... but by
  the time this happens, they are already committed to their votes,
  and they can't add anybody they learn about from other authorities
  until the next voting cycle.  That's no good!

  The current Tor implementation avoids this problem by having
  authorities also look at v2 networkstatus documents, but we'd like
  in the long term to eliminate these, once 0.1.2.x is obsolete.

Design:

  We add a new value for vote-status in v3 consensus documents in
  addition to "consensus" and "vote": "opinion".  Authorities generate
  and sign an opinion document as if they were generating a vote,
  except that they generate opinions earlier than they generate votes.

  [This proposal doesn't say what lines must be contained in opinion
   documents.  It seems that an authority that parses an opinion
   document is only interested in a) relay fingerprint, b) descriptor
   publication time, and c) descriptor digest; unless there's more
   information that helps authorities decide whether "they might
   accept" a descriptor.  If not, opinion documents only need to
   contain a small subset of headers and all the "r" lines that would
   be contained in a later vote. -KL]
  [This seems okay.  It would however mean that we can't use the same
   parsing logic as we use for regular votes. -NM]

  [Authorities should use the same "valid-after", "fresh-until",
   and "valid-until" lines in opinion documents as they are going to
   use in their next vote. -KL]
  [Maybe these lines should just get ignored on opinions.  Or
   omitted. -NM]

  Authorities don't need to generate more than one opinion document
  per voting interval, but may.  They should send it to the other
  authorities they know about, at
     http://<hostname>/tor/post/opinion ,
  before the authorities begin voting, so that enough time remains for
  the authorities to fetch new descriptors.

  Additionally, authorities make their opinions available at
     http://<hostname>/tor/status-vote/next/opinion.z
  and download opinions from authorities they haven't heard from in a
  while.

  Authorities SHOULD send their opinion document to all other
  authorities OpinionSeconds seconds before voting and request
  missing opinion documents OpinionSeconds/2 seconds before voting.
  OpinionSeconds SHOULD be defined as part of "voting-delay" lines
  and otherwise default to the same number of seconds as VoteSeconds.
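
  The timing rule above amounts to the following small computation. The
  function and key names are hypothetical, and `vote_start` is assumed
  to be the time (in seconds) at which voting begins.

```python
def opinion_schedule(vote_start, vote_seconds, opinion_seconds=None):
    """Return when to send opinions and when to request missing ones."""
    if opinion_seconds is None:
        # Default to the same number of seconds as VoteSeconds.
        opinion_seconds = vote_seconds
    return {
        "send_opinion_at": vote_start - opinion_seconds,
        "fetch_missing_at": vote_start - opinion_seconds / 2,
    }
```

  For instance, with voting starting at t=1000 and VoteSeconds of 300,
  opinions are sent at t=700 and missing ones are requested at t=850.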

  Authorities MAY generate opinions on demand.

  Upon receiving an opinion document, authorities scan it for any
  descriptors that:
     - They might accept.
     - Are for routers they don't know about, or are published more
       recently than any descriptor they have for that router.
  Authorities then begin downloading such descriptors from authorities
  that claim to have them.
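
  The scan can be sketched as a simple filter. The tuple layout
  (fingerprint, publication time, digest) follows the fields named in
  the comments above; the function name, the `known` dict, and the
  `might_accept` callback are hypothetical.

```python
def descriptors_to_fetch(opinion_entries, known, might_accept):
    """Pick the descriptor digests worth downloading from an opinion.

    opinion_entries: iterable of (fingerprint, published, digest)
    known: dict mapping fingerprint -> newest publication time we hold
    might_accept: callable deciding whether we might accept the descriptor
    """
    wanted = []
    for fingerprint, published, digest in opinion_entries:
        if not might_accept(fingerprint, published, digest):
            continue
        # Unknown router, or a more recent descriptor than any we have.
        if fingerprint not in known or published > known[fingerprint]:
            wanted.append(digest)
    return wanted
```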

  Authorities also download corresponding extra-info descriptors for
  any router descriptor they learned from parsing an opinion document.

  Authorities MAY cache opinion documents, but don't need to.

Reasons for rejection:

  1. Authorities learn about new relays from each other's vote documents.

  See git commits 2e692bd8 and eaf5487d, which went into 0.2.2.12-alpha:
  o Major bugfixes:
    - Many relays have been falling out of the consensus lately because
      not enough authorities know about their descriptor for them to get
      a majority of votes. When we deprecated the v2 directory protocol,
      we got rid of the only way that v3 authorities can hear from each
      other about other descriptors. Now authorities examine every v3
      vote for new descriptors, and fetch them from that authority. Bugfix
      on 0.2.1.23.

  2. Authorities don't serve version 2 statuses anymore.

  Since January 2013, there was only a single version 3 directory
  authority left that served version 2 statuses: dizum.  moria1 and tor26
  have been rejecting version 2 requests for a long time, and it was
  mostly an oversight that dizum still served them.  As of January 2014,
  dizum does not serve version 2 statuses anymore.  The other six
  authorities have never generated version 2 statuses for others to be
  used as pre-voting opinions.

  3. Vote documents indicate that pre-voting opinions wouldn't help much.

  From January 1 to 7, 2014, only 0.4 relays on average were not included
  in a consensus because they were listed in fewer than 5 votes.  These 0.4
  relays could probably have been included with pre-voting opinions.

  (Here's how to find out: extract the votes-2014-01.tar.bz2 tarball, run
  `grep -R "^r " 0[1-7] | cut -c 4-22,112- | cut -d" " -f1,3 | sort | uniq
  -c | sort | grep " [1-4] " | wc -l`, result is 63, divide by 7*24
  published consensuses, obtain 0.375 as end result.)

Filename: 148-uniform-client-end-reason.txt
Title: Stream end reasons from the client side should be uniform
Author: Roger Dingledine
Created: 2-Jul-2008
Status: Closed
Implemented-In: 0.2.1.9-alpha

Overview

  When a stream closes before it's finished, the end relay cell that's
  sent includes an "end stream reason" to tell the other end why it
  closed. It's useful for the exit relay to send a reason to the client,
  so the client can choose a different circuit, inform the user, etc. But
  there's no reason to include it from the client to the exit relay,
  and in some cases it can even harm anonymity.

  We should pick a single reason for the client-to-exit-relay direction
  and always just send that.

Motivation

  Back when I first deployed the Tor network, it was useful to have
  the Tor relays learn why a stream closed, so I could debug both ends
  of the stream at once. Now that streams have worked for many years,
  there's no need to continue telling the exit relay whether the client
  gave up on a stream because of "timeout" or "misc" or what.

  Then in Tor 0.2.0.28-rc, I fixed this bug:
    - Fix a bug where, when we were choosing the 'end stream reason' to
      put in our relay end cell that we send to the exit relay, Tor
      clients on Windows were sometimes sending the wrong 'reason'. The
      anonymity problem is that exit relays may be able to guess whether
      the client is running Windows, thus helping partition the anonymity
      set. Down the road we should stop sending reasons to exit relays,
      or otherwise prevent future versions of this bug.

  It turned out that non-Windows clients were choosing their reason
  correctly, whereas Windows clients were potentially looking at errno
  wrong and so always choosing 'misc'.

  I fixed that particular bug, but I think we should prevent future
  versions of the bug too.

  (We already fixed it so *circuit* end reasons don't get sent from
  the client to the exit relay. But we appear to have skipped over
  stream end reasons thus far.)

Design:

  One option would be to no longer include any 'reason' field in end
  relay cells. But that would introduce a partitioning attack ("users
  running the old version" vs "users running the new version").

  Instead I suggest that clients all switch to sending the "misc" reason,
  like most of the Windows clients currently do and like the non-Windows
  clients already do sometimes.
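
  In code, the suggested behavior is a one-line rule. The sketch below
  uses a hypothetical function name; the numeric reason values are
  assumed to follow the stream end reasons in tor-spec.

```python
REASON_MISC = 1      # assumed value, per tor-spec
REASON_TIMEOUT = 7   # assumed value, per tor-spec


def end_reason_to_send(local_reason: int, is_client: bool) -> int:
    """Reason to put in an end relay cell.

    Clients always send "misc" toward the exit relay; exit relays keep
    sending the real reason so clients can react to it.
    """
    return REASON_MISC if is_client else local_reason
```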

Filename: 149-using-netinfo-data.txt
Title: Using data from NETINFO cells
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Superseded
Target: 0.2.1.x

[Partially done: we do the anti-MITM part.  Not entirely done: we don't do
the time part.]

Overview

   Current Tor versions send signed IP and timestamp information in
   NETINFO cells, but don't use them to their fullest.  This proposal
   describes how they should start using this info in 0.2.1.x.

Motivation

   Our directory system relies on clients and routers having
   reasonably accurate clocks to detect replayed directory info, and
   to set accurate timestamps on directory info they publish
   themselves.  NETINFO cells contain timestamps.

   Also, the directory system relies on routers having a reasonable
   idea of their own IP addresses, so they can publish correct
   descriptors.  This is also in NETINFO cells.

Learning the time and IP address

   We need to think about attackers here.  Just because a router tells
   us that we have a given IP or a given clock skew doesn't mean that
   it's true.  We believe this information only if we've heard it from
   a majority of the routers we've connected to recently, including at
   least 3 routers.  Routers only believe this information if the
   majority includes at least one authority.
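
   One reading of this rule is sketched below. The function name and
   the report format are hypothetical, and the interpretation that the
   agreeing majority itself must contain at least 3 routers is an
   assumption.

```python
from collections import Counter


def believed_value(reports, we_are_router=False, min_reports=3):
    """Return the IP or skew value we believe, or None.

    reports: list of (value, from_authority) pairs, one per router we
    connected to recently.
    """
    if not reports:
        return None
    counts = Counter(value for value, _ in reports)
    value, n = counts.most_common(1)[0]
    # Need a strict majority that includes at least min_reports routers.
    if n < min_reports or 2 * n <= len(reports):
        return None
    # Routers additionally require an authority within the majority.
    if we_are_router and not any(auth for v, auth in reports if v == value):
        return None
    return value
```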

Avoiding MITM attacks

   Current Tors use the IP addresses published in the other router's
   NETINFO cells to see whether the connection is "canonical".  Right
   now, we prefer to extend circuits over "canonical" connections.  In
   0.2.1.x, we should refuse to extend circuits over non-canonical
   connections without first trying to build a canonical one.


Filename: 150-exclude-exit-nodes.txt
Title: Exclude Exit Nodes from a circuit
Author: Mfr
Created: 2008-06-15
Status: Closed
Implemented-In: 0.2.1.3-alpha

Overview

   Right now, Tor users can manually exclude a node from all positions
   in their circuits created using the directive ExcludeNodes.
   This proposal makes this exclusion less restrictive, allowing users to
   exclude a node only from the exit part of a circuit.

Motivation

   This feature would help the integration into Vidalia (tor exit
   branch) or other tools of features that exclude a country for exit
   without reducing circuit possibilities or privacy.  It could help
   people from a country where many sites are blocked to exclude that
   country for browsing, giving them more stable navigation.  It could
   also let the user exclude a currently used exit node.

Implementation

   ExcludeExitNodes is similar to ExcludeNodes, except that the listed
   nodes are excluded only from the exit position when building
   circuits.

   Tor doesn't warn if a node in this list is not an exit node.

Security implications:

   This also opens possibilities for future user reporting of bad exits.

Risks:

   Use of this option can make users partitionable under certain attack
   assumptions.  However, ExitNodes already creates this possibility,
   so there isn't much increased risk in ExcludeExitNodes.

   We should still encourage people who exclude an exit node because
   of bad behavior to report it instead of just adding it to their
   ExcludeExitNodes list.  It would be unfortunate if we didn't find out
   about broken exits because of this option.  This issue can probably
   be addressed sufficiently with documentation.

Filename: 151-path-selection-improvements.txt
Title: Improving Tor Path Selection
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
Status: Closed
In-Spec: path-spec.txt
Implemented-In: 0.2.2.2-alpha

Overview

  The performance of paths selected can be improved by adjusting the
  CircuitBuildTimeout and avoiding failing guard nodes. This proposal
  describes a method of tracking buildtime statistics at the client, and
  using those statistics to adjust the CircuitBuildTimeout.

Motivation

  Tor's performance can be improved by excluding those circuits that
  have long buildtimes (and by extension, high latency). For those Tor
  users who require better performance and have lower requirements for
  anonymity, this would be a very useful option to have.

Implementation

  Gathering Build Times

    Circuit build times are stored in the circular array
    'circuit_build_times' consisting of uint32_t elements as milliseconds.
    The total size of this array is based on the number of circuits
    it takes to converge on a good fit of the long term distribution of
    the circuit builds for a fixed link. We do not want this value to be
    too large, because it will make it difficult for clients to adapt to
    moving between different links.

    From our observations, the minimum value for a reasonable fit appears
    to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
    a good fit over the long term, we store 5000 most recent circuits in
    the array (NCIRCUITS_TO_OBSERVE).

    The Tor client will build test circuits at a rate of one per
    minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
    MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
    a CircuitBuildTimeout estimated within 8 hours after install,
    upgrade, or network change (see below).

  Long Term Storage

    The long-term storage representation is implemented by storing a
    histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
    writing out the statistics to disk. The format this takes in the
    state file is 'CircuitBuildTimeBin <bin-ms> <count>', with the total
    specified as 'TotalBuildTimes <total>'.
    Example:

    TotalBuildTimes 100
    CircuitBuildTimeBin 25 50
    CircuitBuildTimeBin 75 25
    CircuitBuildTimeBin 125 13
    ...

    Reading the histogram in will entail inserting <count> values
    into the circuit_build_times array each with the value of
    <bin-ms> milliseconds. In order to evenly distribute the values
    in the circular array, the Fisher-Yates shuffle will be performed
    after reading values from the bins.
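
    A minimal sketch of writing and reading this histogram follows. The
    function names are hypothetical, and labeling each bin by its
    midpoint (25, 75, 125, ...) is inferred from the example above
    rather than stated in the text. Python's `random.shuffle` performs
    a Fisher-Yates shuffle.

```python
import random
from collections import Counter

BUILDTIME_BIN_WIDTH = 50  # milliseconds, default from the text


def to_state_lines(build_times_ms):
    """Serialize build times as histogram lines for the state file."""
    half = BUILDTIME_BIN_WIDTH // 2
    bins = Counter((t // BUILDTIME_BIN_WIDTH) * BUILDTIME_BIN_WIDTH + half
                   for t in build_times_ms)
    lines = [f"TotalBuildTimes {len(build_times_ms)}"]
    lines += [f"CircuitBuildTimeBin {mid} {count}"
              for mid, count in sorted(bins.items())]
    return lines


def from_state_lines(lines, rng=random):
    """Rebuild the build-time array: <count> copies of each <bin-ms>."""
    times = []
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "CircuitBuildTimeBin":
            times.extend([int(parts[1])] * int(parts[2]))
    rng.shuffle(times)  # spread the values evenly through the array
    return times
```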

  Learning the CircuitBuildTimeout

    Based on studies of build times, we found that the distribution of
    circuit buildtimes appears to be a Frechet distribution. However,
    estimators and quantile functions of the Frechet distribution are
    difficult to work with and slow to converge. So instead, since we
    are only interested in the accuracy of the tail, we approximate
    the tail of the distribution with a Pareto curve starting at
    the mode of the circuit build time sample set.

    We will calculate the parameters for a Pareto distribution
    fitting the data using the estimators at
    http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.

    The timeout itself is calculated by using the quantile function (the
    inverse CDF) to give us the value such that
    BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
    below the timeout value.

    Thus, we expect that the Tor client will accept the fastest 80% of
    the total number of paths on the network.
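
    The calculation can be sketched as follows. This is an
    illustration under stated assumptions, not the Tor implementation:
    the function names are hypothetical, the mode is taken as the
    midpoint of the fullest 50 ms bin, and the shape parameter uses the
    standard maximum-likelihood estimator restricted to samples at or
    above the mode.

```python
import math
from collections import Counter

BUILDTIME_PERCENT_CUTOFF = 0.80  # from the text above


def histogram_mode(times_ms, bin_width=50):
    """Midpoint of the most populated bin (the lowest bin wins ties)."""
    bins = Counter(t // bin_width for t in times_ms)
    best = max(bins, key=lambda b: (bins[b], -b))
    return best * bin_width + bin_width / 2


def pareto_alpha(times_ms, xm):
    """Shape estimate: n / sum(ln(x_i / xm)) over samples x_i >= xm."""
    tail = [t for t in times_ms if t >= xm]
    log_sum = sum(math.log(t / xm) for t in tail)
    if log_sum <= 0:
        raise ValueError("need samples strictly above xm to fit the tail")
    return len(tail) / log_sum


def pareto_quantile(xm, alpha, q=BUILDTIME_PERCENT_CUTOFF):
    """Inverse CDF of Pareto(xm, alpha): fraction q of the mass lies below."""
    return xm / (1.0 - q) ** (1.0 / alpha)
```

    For example, with a mode of 100 ms and a shape parameter of 1, the
    80% cutoff is 100 / 0.2 = 500 ms.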

  Detecting Changing Network Conditions

    We attempt to detect both network connectivity loss and drastic
    changes in the timeout characteristics.

    We assume that we've had network connectivity loss if 3 circuits
    time out and we've received no cells or TLS handshakes since those
    circuits began. We then set the timeout to 60 seconds and stop
    counting timeouts.

    If 3 more circuits time out and the network still has not been
    live within this new 60 second timeout window, we then discard
    the previous timeouts during this period from our history.

    To detect changing network conditions, we keep a history of
    the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
    circuits that successfully completed at least one hop. If more than
    75% of these circuits time out, we discard all buildtimes history,
    reset the timeout to 60 seconds, and then begin recomputing the
    timeout.
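
    The RECENT_CIRCUITS check can be sketched with a bounded queue. The
    class and attribute names are hypothetical, and waiting for a full
    window of 20 circuits before evaluating the 75% rule is an
    assumption of this sketch.

```python
from collections import deque

RECENT_CIRCUITS = 20


class TimeoutMonitor:
    def __init__(self):
        self.recent = deque(maxlen=RECENT_CIRCUITS)  # True = timed out
        self.build_times = []                        # build-time history
        self.timeout_s = None                        # current timeout

    def record(self, timed_out: bool) -> bool:
        """Record a circuit that completed at least one hop.

        Returns True if the history was discarded and the timeout reset.
        """
        self.recent.append(timed_out)
        window_full = len(self.recent) == RECENT_CIRCUITS
        if window_full and sum(self.recent) > 0.75 * RECENT_CIRCUITS:
            self.build_times.clear()
            self.recent.clear()
            self.timeout_s = 60  # seconds; recomputation starts from here
            return True
        return False
```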

  Testing

    After circuit build times, storage, and learning are implemented,
    the resulting histogram should be checked for consistency by
    verifying it persists across successive Tor invocations where
    no circuits are built. In addition, we can also use the existing
    buildtime scripts to record build times, and verify that the histogram
    the Python scripts produce matches that which is output to the state
    file in Tor, and verify that the Pareto parameters and cutoff points
    also match.

    We will also verify that there are no unexpected large deviations from
    node selection, such as nodes from distant geographical locations being
    completely excluded.

  Dealing with Timeouts

    Timeouts should be counted as the expectation of the region
    of the Pareto distribution beyond the cutoff. This is done by
    generating a random sample for each timeout at points on the
    curve beyond the current timeout cutoff.
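
    Such a sample can be drawn by inverse transform sampling restricted
    to the tail, as in the sketch below. The function name is
    hypothetical, and `xm` and `alpha` are assumed to be the fitted
    Pareto parameters.

```python
import random


def sample_timeout(xm, alpha, cutoff_quantile=0.80, rng=random):
    """Draw a build time from the Pareto tail beyond the cutoff."""
    # Pick a CDF value in the tail, then map it through the inverse CDF.
    u = rng.uniform(cutoff_quantile, 1.0)
    u = min(u, 1.0 - 1e-12)  # avoid dividing by zero if u lands on 1.0
    return xm / (1.0 - u) ** (1.0 / alpha)
```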

  Future Work

    At some point, it may be desirable to change the cutoff from a
    single hard cutoff that destroys the circuit to a soft cutoff and
    a hard cutoff, where the soft cutoff merely triggers the building
    of a new circuit, and the hard cutoff triggers destruction of the
    circuit.

    It may also be beneficial to learn separate timeouts for each
    guard node, as they will have slightly different distributions.
    This will take longer to generate initial values though.

Issues

  Impact on anonymity

    Since this follows a Pareto distribution, large reductions on the
    timeout can be achieved without cutting off a great number of the
    total paths. This will eliminate a great deal of the performance
    variation of Tor usage.
Filename: 152-single-hop-circuits.txt
Title: Optionally allow exit from single-hop circuits 
Author: Geoff Goodell
Created: 13-Jul-2008
Status: Closed
Implemented-In: 0.2.1.6-alpha

Overview

    Provide a special configuration option that adds a line to descriptors
    indicating that a router can be used as an exit for one-hop circuits,
    and allow clients to attach streams to one-hop circuits provided
    that the descriptor for the router in the circuit includes this
    configuration option.

Motivation

    At some point, code was added to restrict the attachment of streams
    to one-hop circuits.

    The idea seems to be that we can use the cost of forking and
    maintaining a patch as a lever to prevent people from writing
    controllers that jeopardize the operational security of routers
    and the anonymity properties of the Tor network by creating and
    using one-hop circuits rather than the standard three-hop circuits.
    It may be, for example, that some users do not actually seek true
    anonymity but simply reachability through network perspectives
    afforded by the Tor network, and since anonymity is stronger in
    numbers, forcing users to contribute to anonymity and decrease the
    risk to server operators by using full-length paths may be reasonable.

    As presently implemented, the sweeping restriction of one-hop circuits
    for all routers limits the usefulness of Tor as a general-purpose
    technology for building circuits.  In particular, we should allow
    for controllers, such as Blossom, that create and use single-hop
    circuits involving routers that are not part of the Tor network.

Design

    Introduce a configuration option for Tor servers that, when set,
    indicates that a router is willing to provide exit from one-hop
    circuits.  Routers with this policy will not require that a circuit
    has at least two hops when it is used as an exit.

    In addition, routers for which this configuration option
    has been set will have a line in their descriptors, "opt
    exit-from-single-hop-circuits".  Clients will keep track of which
    routers have this option and allow streams to be attached to
    single-hop circuits that include such routers.
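
    The client-side rule might look like the sketch below, where a
    circuit is a list of router names and `descriptors` maps each name
    to its descriptor lines. Both the representation and the function
    name are hypothetical.

```python
SINGLE_HOP_LINE = "opt exit-from-single-hop-circuits"


def may_attach_stream(circuit_path, descriptors) -> bool:
    """Decide whether a stream may be attached to this circuit."""
    if len(circuit_path) >= 2:
        return True   # multi-hop circuits are unaffected by this option
    if not circuit_path:
        return False
    # One-hop circuit: allowed only if the router advertises the option.
    return SINGLE_HOP_LINE in descriptors.get(circuit_path[0], [])
```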

Security Considerations

    This approach seems to eliminate the worry about operational router
    security, since server operators will not set the configuration
    option unless they are willing to take on such risk.

    To reduce the impact on anonymity of the network resulting
    from including such "risky" routers in regular Tor path
    selection, clients may systematically exclude routers with "opt
    exit-from-single-hop-circuits" when choosing random paths through
    the Tor network.

Filename: 153-automatic-software-update-protocol.txt
Title: Automatic software update protocol
Author: Jacob Appelbaum 
Created: 14-July-2008
Status: Superseded

[Superseded by thandy-spec.txt]


                      Automatic Software Update Protocol Proposal

0.0 Introduction

The Tor project and its users require a robust method to update shipped
software bundles. The software bundles often include Vidalia, Privoxy, Polipo,
Torbutton and of course Tor itself. It is not inconceivable that an update
could include all of the Tor Browser Bundle. It seems reasonable to make this 
a standalone program that can be called in shell scripts, cronjobs or by
various Tor controllers.

0.1 Minimal Tasks To Implement Automatic Updating

At the most minimal, an update must be able to do the following: 

    0 - Detect the current Tor version, note the working status of Tor.
    1 - Detect the latest Tor version.
    2 - Fetch the latest version in the form of a platform specific package(s).
    3 - Verify the integrity of the downloaded package(s).
    4 - Install the verified package(s).
    5 - Test that the new package(s) works properly.

0.2 Specific Enumeration Of Minimal Tasks

To implement requirement 0, we need to detect the current Tor version of both 
the updater and the current running Tor. The update program itself should be 
versioned internally. This requirement should also test connecting through Tor 
itself and note if such connections are possible.

To implement requirement 1, we need to learn the consensus from the directory
authorities or fall back to a known good URL with cryptographically signed
content.

To implement requirement 2, we need to download Tor - hopefully over Tor.

To implement requirement 3, we need to verify the package signature.

To implement requirement 4, we need to use a platform specific method of
installation. The Tor controller performing the update performs these platform
specific methods.

To implement requirement 5, we need to be able to extend circuits and reach 
the internet through Tor.
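
The six requirements form a strictly ordered pipeline, which could be
driven as in the sketch below. Everything here is hypothetical: the
step names, the `run_update` function, and the convention that each
step is a callable returning True on success. The point illustrated is
only that a later step (such as installation) never runs unless every
earlier step (such as verification) succeeded.

```python
STEP_ORDER = [
    "detect_current",  # requirement 0
    "detect_latest",   # requirement 1
    "fetch",           # requirement 2
    "verify",          # requirement 3
    "install",         # requirement 4
    "test",            # requirement 5
]


def run_update(steps):
    """Run the steps in order; stop at the first one that fails.

    steps: dict mapping each name in STEP_ORDER to a callable -> bool.
    Returns (completed_step_names, failed_step_name_or_None).
    """
    completed = []
    for name in STEP_ORDER:
        if not steps[name]():
            return completed, name
        completed.append(name)
    return completed, None
```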

0.x Implementation Goals

The update system will be cross platform and rely on as little external code
as possible. If the update system uses external code, that code must be
updated by the update system itself. It will consist only of free software and
will not rely on any
non-free components until the actual installation phase. If a package manager 
is in use, it will be platform specific and thus only invoked by the update 
system implementing the update protocol.

The update system itself will attempt to perform update related network
activity over Tor. Possibly it will attempt to use a hidden service first.
It will attempt to use novel and not so novel caching
when possible. It will always verify cryptographic signatures before any
remotely fetched code is executed. In the event of an unusable Tor system,
it will be able to attempt to fetch updates without Tor. This should be user
configurable: some users will be unwilling to update without the protection of
using Tor, while others will simply be unable to because of blocking of the
main Tor website.

The update system will track current version numbers of Tor and supporting 
software. The update system will also track known working versions to assist 
with automatic The update system itself will be a standalone library. It will be 
strongly versioned internally to match the Tor bundle it was shipped with. The
update system will keep track of the given platform, CPU architecture, lsb_release,
package management functionality and any other platform-specific metadata.

We have referenced two popular automatic update systems. Though neither fits
our needs, both are useful as an idea of what others are doing in the same
area.

The first is Sparkle[0], but it is sadly only available for Cocoa
environments and is written in Objective-C. This doesn't meet our requirements
because it is directly tied into the private Apple framework.

The second is the Mozilla Automatic Update System[1]. It is possibly useful 
as an idea of how other free software projects automatically update. It is 
however not useful in its currently documented form.


    [0] http://sparkle.andymatuschak.org/documentation/
    [1] http://wiki.mozilla.org/AUS:Manual

0.x Previous methods of Tor and related software update

Previously, Tor users updated their Tor related software by hand. There has
been no fully automatic method for any user to update. In addition, there
hasn't been any specific way to find out the most current stable version of Tor
or related software as voted on by the directory authority consensus.

0.x Changes to the directory specification

We will want to supplement client-versions and server-versions in the
consensus voting with another version identifier known as
'auto-update-versions'. This will keep track of the current consensus of
specific versions that are best per platform and per architecture. It should
be noted that while the Mac OS X universal binary may be the best for x86
processors with Tiger, it may not be the best for PPC users on Panther. This
goes for all of the package updates. We want to prevent updates that cause Tor
to break even if the updating program can recover gracefully.

x.x Assumptions About Operating System Package Management

It is assumed that users will use their package manager unless they are on 
Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows 
users will have integration with the normal "add/remove program" functionality 
that said users would expect.

x.x Package Update System Failure Modes

The package update will try to ensure that a user always has a working Tor at 
the very least. It will keep state to remember versions of Tor that were able 
to bootstrap properly and reach the rest of the Tor network. It will also keep 
note of which versions broke. It will select the best Tor that works for the 
user. It will also allow for anonymized bug reporting on the packages 
available and tested by the auto-update system.

x.x Package Signature Verification

The update system will be aware of replay attacks against the update signature 
system itself. It will not allow package update signatures that are radically 
out of date. It will be a multi-key system to prevent any single party from 
forging an update. The key will be updated regularly. This is similar to how
authority keys are used (see proposal 103).

x.x Package Caching

The update system will iterate over different update methods. Whichever method 
is picked will have caching functionality. Each Tor server itself should be 
able to serve cached update files. This will be an option that friendly server 
administrators can turn on should they wish to support caching. In addition, 
it is possible to cache the full contents of a package in an
authoritative DNS zone. Users can then query the DNS zone for their package.
If we wish to further distribute the update load, we can also offer packages 
with encrypted bittorrent. Clients who wish to share the updates but do not 
wish to be a server can help distribute Tor updates. This can be tied together 
with the DNS caching[2][3] if needed.

    [2] http://www.netrogenic.com/dnstorrent/
    [3] http://www.doxpara.com/ozymandns_src_0.1.tgz

x.x Helping Our Users Spread Tor

There should be a way for a user to participate in the package caching as
described in section x.x. This option should be presented by the Tor 
controller.

x.x Simple HTTP Proxy To The Tor Project Website

It has been suggested that we should provide a simple proxy that allows a user 
to visit the main Tor website to download packages. This was part of a 
previous proposal and has not been closely examined.

x.x Package Installation

Platform-specific methods for proper package installation will be left to the
controller that is calling for an update. Each platform is different; the
installation options and user interface will be specific to the controller in
question.

x.x Other Things

Other things should be added to this proposal. What are they?

Filename: 154-automatic-updates.txt
Title: Automatic Software Update Protocol
Author: Matt Edman
Created: 30-July-2008
Status: Superseded
Target: 0.2.1.x

Superseded by thandy-spec.txt

Scope

  This proposal specifies the method by which an automatic update client can
  determine the most recent recommended Tor installation package for the
  user's platform, download the package, and then verify that the package was
  downloaded successfully. While this proposal focuses on only the Tor
  software, the protocol defined is sufficiently extensible such that other
  components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be
  managed and updated by the automatic update client as well.

  The initial target platform for the automatic update framework is Windows,
  given that it is the platform used by a majority of our users and that it
  lacks the sane package management system that many Linux distributions
  already have.
  Our second target platform will be Mac OS X, and so the protocol will be
  designed with this near-future direction in mind.

  Other client-side aspects of the automatic update process, such as user
  interaction, the interface presented, and actual package installation
  procedure, are outside the scope of this proposal.


Motivation

  Tor releases new versions frequently, often with important security,
  anonymity, and stability fixes. Thus, it is important for users to be able
  to promptly recognize when new versions are available and to easily
  download, authenticate, and install updated Tor and Tor-related software
  packages.

  Tor's control protocol [2] provides a method by which controllers can
  identify when the user's Tor software is obsolete or otherwise no longer
  recommended. Currently, however, no mechanism exists for clients to
  automatically download and install updated Tor and Tor-related software for
  the user.


Design Overview

  The core of the automatic update framework is a well-defined file called a
  "recommended-packages" file. The recommended-packages file is accessible via
  HTTP[S] at one or more well-defined URLs. An example recommended-packages
  URL may be:

    https://updates.torproject.org/recommended-packages

  The recommended-packages document is formatted according to Section 1.2
  below and specifies the most recent recommended installation package
  versions for Tor or Tor-related software, as well as URLs at which the
  packages and their signatures can be downloaded.

  An automatic update client process runs on the Tor user's computer and
  periodically retrieves the recommended-packages file according to the method
  described in Section 2. As described further in Section 1.2, the
  recommended-packages file is signed and can be verified by the automatic
  update client with one or more public keys included in the client software.
  Since it is signed, the recommended-packages file can be mirrored by
  multiple hosts (e.g., Tor directory authorities), whose URLs are included in
  the automatic update client's configuration.

  After retrieving and verifying the recommended-packages file, the automatic
  update client compares the versions of the recommended software packages
  listed in the file with those currently installed on the end-user's
  computer. If one or more of the installed packages is determined to be out
  of date, an updated package and its signature will be downloaded from one of
  the package URLs listed in the recommended-packages file as described in
  Section 2.2.

  The automatic update system uses a multilevel signing key scheme for package
  signatures. There are a small number of entities we call "packaging
  authorities" that each have their own signing key. A packaging authority is
  responsible for signing and publishing the recommended-packages file.
  Additionally, each individual packager responsible for producing an
  installation package for one or more platforms has their own signing key.
  Every packager's signing key must be signed by at least one of the packaging
  authority keys.


Specification

  1. recommended-packages Specification

  In this section we formally specify the format of the published
  recommended-packages file.

  1.1. Document Meta-format

  The recommended-packages document follows the lightweight extensible
  information format defined in Tor's directory protocol specification [1]. In
  the interest of self-containment, we have reproduced the relevant portions
  of that format's specification in this section. (Credits to Nick Mathewson
  for much of the original format definition language.)

  The highest level object is a Document, which consists of one or more
  Items.  Every Item begins with a KeywordLine, followed by zero or more
  Objects. A KeywordLine begins with a Keyword, optionally followed by
  whitespace and more non-newline characters, and ends with a newline.  A
  Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
  An Object is a block of encoded data in pseudo-Open-PGP-style
  armor. (cf. RFC 2440)

  More formally:

    Document     ::= (Item | NL)+
    Item         ::= KeywordLine Object*
    KeywordLine  ::= Keyword NL | Keyword WS ArgumentChar+ NL
    Keyword      ::= KeywordChar+
    KeywordChar  ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
    ArgumentChar ::= any printing ASCII character except NL.
    WS           ::= (SP | TAB)+
    Object       ::= BeginLine Base-64-encoded-data EndLine
    BeginLine    ::= "-----BEGIN " Keyword "-----" NL
    EndLine      ::= "-----END " Keyword "-----" NL

    The BeginLine and EndLine of an Object must use the same keyword.
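
  As an illustration only, the grammar above could be parsed with a short
  Python sketch like the one below. The name parse_document and the lenient
  treatment of blank lines and trailing whitespace are assumptions made for
  this example; they are not taken from the directory specification.

```python
import re

# BeginLine/EndLine are tested before KeywordLine because '-' is itself a
# KeywordChar, so "-----BEGIN" would otherwise look like a Keyword.
BEGIN_LINE = re.compile(r"^-----BEGIN ([A-Za-z0-9-]+)-----$")
END_LINE = re.compile(r"^-----END ([A-Za-z0-9-]+)-----$")
KEYWORD_LINE = re.compile(r"^([A-Za-z0-9-]+)(?:[ \t]+(.*))?$")


def parse_document(text):
    """Return a list of (keyword, arguments, objects) tuples.

    Each entry in `objects` is a (tag, base64_text) pair.
    """
    items = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # Document ::= (Item | NL)+ permits bare newlines
        begin = BEGIN_LINE.match(line)
        if begin:
            if not items:
                raise ValueError("Object before any KeywordLine")
            tag = begin.group(1)
            body = []
            while i < len(lines) and not END_LINE.match(lines[i]):
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError("unterminated Object")
            if END_LINE.match(lines[i]).group(1) != tag:
                raise ValueError("BeginLine and EndLine keywords differ")
            i += 1
            items[-1][2].append((tag, "".join(body)))
            continue
        keyword_line = KEYWORD_LINE.match(line)
        if not keyword_line:
            raise ValueError("malformed KeywordLine: %r" % line)
        items.append((keyword_line.group(1), keyword_line.group(2) or "", []))
    return items
```

  The sketch does not decode or validate the base-64 payload; it only splits
  the document into Items and attaches each Object to the Item before it.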

  In our Document description below, we also tag Items with a multiplicity in
  brackets. Possible tags are:

    "At start, exactly once": These items MUST occur in every instance of the
    document type, and MUST appear exactly once, and MUST be the first item in
    their documents.

    "Exactly once": These items MUST occur exactly one time in every
    instance of the document type.

    "Once or more": These items MUST occur at least once in any instance
    of the document type, and MAY occur more than once.

    "At end, exactly once": These items MUST occur in every instance of
    the document type, and MUST appear exactly once, and MUST be the
    last item in their documents.

  1.2. recommended-packages Document Format

  When interpreting a recommended-packages Document, software MUST ignore
  any KeywordLine that starts with a keyword it doesn't recognize; future
  implementations MUST NOT require current automatic update clients to
  understand any KeywordLine not currently described.

  In lines that take multiple arguments, extra arguments SHOULD be
  accepted and ignored.

  The currently defined Items contained in a recommended-packages document
  are:

    "recommended-packages-format" SP number NL

      [Exactly once]

      This Item specifies the version of the recommended-packages format that
      is contained in the subsequent document. The version defined in this
      proposal is version "1". Subsequent iterations of this protocol MUST
      increment this value if they introduce incompatible changes to the
      document format and MAY increment this value if they only introduce
      additional Keywords.

    "published" SP YYYY-MM-DD SP HH:MM:SS NL

      [Exactly once]

      The time, in GMT, when this recommended-packages document was generated.
      Automatic update clients SHOULD ignore Documents over 60 days old.

    "tor-stable-win32-version" SP TorVersion NL

      [Exactly once]

      This Item specifies the latest recommended release of Tor's "stable"
      branch for the Windows platform that has an installation package
      available. Note that this version does not necessarily correspond to the
      most recently tagged stable Tor version, since that version may not yet
      have an installer package available, or may have known issues on
      Windows.

      The TorVersion field is formatted according to Section 2 of Tor's
      version specification [3].

    "tor-stable-win32-package" SP Url NL

      [Once or more]

      This Item specifies the location from which the most recent
      recommended Windows installation package for Tor's stable branch can be
      downloaded.

      When this Item appears multiple times within the Document, automatic
      update clients SHOULD select randomly from the available package
      mirrors.

    "tor-dev-win32-version" SP TorVersion NL

      [Exactly once]

      This Item specifies the latest recommended release of Tor's
      "development" branch for the Windows platform that has an installation
      package available. The same caveats from the description of
      "tor-stable-win32-version" also apply to this Item.

      The TorVersion field is formatted according to Section 2 of Tor's
      version specification [3].

    "tor-dev-win32-package" SP Url NL

      [Once or more]

      This Item specifies the location from which the most recent recommended
      Windows installation package and its signature for Tor's development
      branch can be downloaded.

      When this Item appears multiple times within the Document, automatic
      update clients SHOULD select randomly from the available package
      mirrors.

    "signature" NL SIGNATURE NL

      [At end, exactly once]

      The "SIGNATURE" Object contains a PGP signature (using a packaging
      authority signing key) of the entire document, taken from the beginning
      of the "recommended-packages-format" keyword, through the newline after
      the "signature" Keyword.
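
  The multiplicity tags and the 60-day rule above lend themselves to a
  mechanical check. The following Python sketch assumes the document has
  already been split into (keyword, argument) pairs and that its signature is
  verified elsewhere; check_items and the wording of the problem strings are
  hypothetical names chosen for this example.

```python
from datetime import datetime, timedelta, timezone

EXACTLY_ONCE = (
    "recommended-packages-format",
    "published",
    "tor-stable-win32-version",
    "tor-dev-win32-version",
)
ONCE_OR_MORE = ("tor-stable-win32-package", "tor-dev-win32-package")
MAX_AGE = timedelta(days=60)


def check_items(items, now):
    """Check Item multiplicities and document age.

    items: (keyword, argument) pairs in document order.
    now:   timezone-aware datetime used as the current time.
    Returns a list of problem strings; an empty list means none were found.
    Unrecognized keywords are ignored, as Section 1.2 requires.
    """
    counts = {}
    for keyword, _ in items:
        counts[keyword] = counts.get(keyword, 0) + 1

    problems = []
    for keyword in EXACTLY_ONCE:
        if counts.get(keyword, 0) != 1:
            problems.append(keyword + " must appear exactly once")
    for keyword in ONCE_OR_MORE:
        if counts.get(keyword, 0) < 1:
            problems.append(keyword + " must appear at least once")
    if counts.get("signature", 0) != 1 or items[-1][0] != "signature":
        problems.append("signature must appear exactly once, at the end")

    if counts.get("published", 0) == 1:
        argument = next(arg for kw, arg in items if kw == "published")
        # Keep the date and time fields; extra arguments are ignored.
        stamp = " ".join(argument.split()[:2])
        try:
            published = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            problems.append("published is not YYYY-MM-DD HH:MM:SS")
        else:
            published = published.replace(tzinfo=timezone.utc)  # GMT
            if now - published > MAX_AGE:
                problems.append("document is over 60 days old")
    return problems
```

  A client following this sketch would discard a document whenever the
  returned list is non-empty.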


  2. Automatic Update Client Behavior

  The client-side component of the automatic update framework is an
  application that runs on the end-user's machine. It is responsible for
  fetching and verifying a recommended-packages document, as well as
  downloading, verifying, and subsequently installing any necessary updated
  software packages.

  2.1. Download and verify a recommended-packages document

  The first step in the automatic update process is for the client to download
  a copy of the recommended-packages file. The automatic update client
  contains a (hardcoded and/or user-configurable) list of URLs from which it
  will attempt to retrieve a recommended-packages file.

  Connections to each of the recommended-packages URLs SHOULD be attempted in
  the following order:

    1) HTTPS over Tor
    2) HTTP over Tor
    3) Direct HTTPS
    4) Direct HTTP

  If the client fails to retrieve a recommended-packages document via any of
  the above connection methods from any of the configured URLs, the client
  SHOULD retry its download attempts following an exponential back-off
  algorithm. After the first failed attempt, the client SHOULD delay one hour
  before attempting again, up to a maximum of 24 hours delay between retry
  attempts.
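
  The text above fixes the first delay (one hour) and the ceiling (24 hours)
  but not the growth factor. The Python sketch below assumes the delay doubles
  after every failed round, and reads the connection order as "try all four
  methods for one URL before moving to the next URL". Both are assumptions of
  this example, as are the function names.

```python
TRANSPORTS = ("HTTPS over Tor", "HTTP over Tor", "Direct HTTPS", "Direct HTTP")
FIRST_DELAY = 60 * 60        # one hour, in seconds
MAX_DELAY = 24 * 60 * 60     # 24 hours, in seconds


def attempt_order(urls):
    """List every (url, transport) pair in the order it would be tried."""
    return [(url, transport) for url in urls for transport in TRANSPORTS]


def retry_delay(failed_rounds):
    """Seconds to wait after `failed_rounds` consecutive failed rounds.

    A round is one pass over every URL with every transport.
    """
    if failed_rounds < 1:
        raise ValueError("failed_rounds must be at least 1")
    return min(FIRST_DELAY * 2 ** (failed_rounds - 1), MAX_DELAY)
```

  With doubling, the delay reaches the 24-hour ceiling on the sixth
  consecutive failed round and stays there.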

  After successfully downloading a recommended-packages file, the automatic
  update client will verify the signature using one of the public keys
  distributed with the client software. If more than one recommended-packages
  file is downloaded and verified, the file with the most recent "published"
  date that is verified will be retained and the rest discarded.
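
  Choosing among several verified documents reduces to taking the maximum
  "published" time. A minimal Python sketch, assuming signature verification
  has already happened and using the hypothetical name newest_verified:

```python
from datetime import datetime


def newest_verified(documents):
    """Return the text of the most recently published document.

    documents: (published, text) pairs, where `published` is the
    "YYYY-MM-DD HH:MM:SS" argument of the "published" Item and every
    document's signature has already been verified by the caller.
    Returns None if the list is empty.
    """
    if not documents:
        return None

    def published_time(pair):
        return datetime.strptime(pair[0], "%Y-%m-%d %H:%M:%S")

    return max(documents, key=published_time)[1]
```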

  2.2. Download and verify the updated packages

  The automatic update client next compares the latest recommended package
  version from the recommended-packages document with the currently installed
  Tor version. If the user currently has installed a Tor version from Tor's
  "development" branch, then the version specified in the "tor-dev-*-version"
  Item is used for comparison. Similarly, if the user currently has installed a
  Tor version from Tor's "stable" branch, then the version specified in the
  "tor-stable-*-version" Item is used for comparison. Version comparisons are
  done according to Tor's version specification [3].

  If the automatic update client determines an installation package newer than
  the user's currently installed version is available, it will attempt to
  download a package appropriate for the user's platform and Tor branch from a
  URL specified by a "tor-[branch]-[platform]-package" Item. If more than one
  mirror for the selected package is available, a mirror will be chosen at
  random from all those available.

  The automatic update client must also download a ".asc" signature file for
  the retrieved package. The URL for the package signature is the same as that
  for the package itself, except with the extension ".asc" appended to the
  package URL.
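
  Mirror selection and the signature URL rule can be sketched in a few lines
  of Python. The function name and the (keyword, argument) input shape are
  assumptions of this example; only the first argument of each "*-package"
  Item is used, so extra arguments are ignored.

```python
import random


def choose_package_urls(items, branch, platform, rng=random):
    """Pick a package mirror at random and derive its signature URL.

    items: (keyword, argument) pairs from a verified recommended-packages
    document. Returns (package_url, signature_url), or None if the document
    lists no package for this branch and platform.
    """
    keyword = "tor-%s-%s-package" % (branch, platform)
    mirrors = []
    for kw, argument in items:
        fields = argument.split()
        if kw == keyword and fields:
            mirrors.append(fields[0])
    if not mirrors:
        return None
    url = rng.choice(mirrors)
    return url, url + ".asc"
```

  For example, choose_package_urls(items, "stable", "win32") looks only at
  "tor-stable-win32-package" Items.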

  Connections to download the updated package and its signature SHOULD be
  attempted in the same order described in Section 2.1.

  After completing the steps described in Sections 2.1 and 2.2, the automatic
  update client will have downloaded and verified a copy of the latest Tor
  installation package. It can then take whatever subsequent platform-specific
  steps are necessary to install the downloaded software updates.

  2.3. Periodic checking for updates

  The automatic update client SHOULD maintain a local state file in which it
  records (at a minimum) the timestamp at which it last retrieved a
  recommended-packages file and the timestamp at which the client last
  successfully downloaded and installed a software update.

  Automatic update clients SHOULD check for an updated recommended-packages
  document at most once per day but at least once every 30 days.
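
  The two bounds above can be read as a simple decision on the time since the
  last fetch. The Python sketch below keeps the state as a dictionary with a
  hypothetical "last_fetch" key holding a Unix timestamp; the on-disk format
  of the state file is left open by this proposal and is not modeled here.

```python
DAY = 24 * 60 * 60
MIN_INTERVAL = 1 * DAY    # check at most once per day
MAX_INTERVAL = 30 * DAY   # check at least once every 30 days


def check_status(state, now):
    """Classify whether a fetch is "too-soon", "allowed", or "overdue"."""
    last_fetch = state.get("last_fetch")
    if last_fetch is None:
        return "overdue"  # never fetched: nothing to wait for
    elapsed = now - last_fetch
    if elapsed < MIN_INTERVAL:
        return "too-soon"
    if elapsed >= MAX_INTERVAL:
        return "overdue"
    return "allowed"


def record_fetch(state, now):
    """Return a copy of the state with the last-fetch timestamp updated."""
    updated = dict(state)
    updated["last_fetch"] = now
    return updated
```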


  3. Future Extensions

  There are several possible areas for future extensions of this framework.
  The extensions below are merely suggestions and should be the subject of
  their own proposal before being implemented.

  3.1. Additional Software Updates

  There are several software packages often included in Tor bundles besides
  Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and
  download locations of updated installation packages for these bundle
  components can be easily added to the recommended-packages document
  specification above.

  3.2. Including ChangeLog Information

  It may be useful for automatic update clients to be able to display for
  users a summary of the changes made in the latest Tor or Tor-related
  software release, before the user chooses to install the update. In the
  future, we can add keywords to the specification in Section 1.2 that specify
  the location of a ChangeLog file for the latest recommended package
  versions. It may also be desirable to allow localized ChangeLog information,
  so that the automatic update client can fetch release notes in the
  end-user's preferred language.

  3.3. Weighted Package Mirror Selection

  We defined in Section 1.2 a method by which automatic update clients can
  select from multiple available package mirrors. We may want to add a Weight
  argument to the "*-package" Items that allows the recommended-packages file
  to suggest to clients the probability with which a package mirror should be
  chosen. This will allow clients to more appropriately distribute package
  downloads across available mirrors proportional to their approximate
  bandwidth.
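
  Since the Weight argument is only suggested here, its syntax is undefined.
  The Python sketch below assumes each mirror has already been parsed into a
  (url, weight) pair with a non-negative numeric weight, and shows one way a
  client might draw a mirror with probability proportional to that weight.

```python
import random


def choose_weighted_mirror(mirrors, rng=random):
    """Pick one URL with probability proportional to its weight.

    mirrors: (url, weight) pairs; entries with weight <= 0 are never chosen.
    Returns None if no mirror has a positive weight.
    """
    usable = [(url, weight) for url, weight in mirrors if weight > 0]
    if not usable:
        return None
    urls, weights = zip(*usable)
    return rng.choices(urls, weights=weights, k=1)[0]
```

  Passing a seeded random.Random instance as rng makes the choice
  reproducible for testing.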


Implementation

  Implementation of this proposal will consist of two separate components.

  The first component is a small "au-publish" tool that takes as input a
  configuration file specifying the information described in Section 1.2 and a
  private key. The tool is run by a "packaging authority" (someone responsible
  for publishing updated installation packages), who will be prompted to enter
  the passphrase for the private key used to sign the recommended-packages
  document. The output of the tool is a document formatted according to
  Section 1.2, with a signature appended at the end. The resulting document
  can then be published to any of the update mirrors.

  The second component is an "au-client" tool that is run on the end-user's
  machine. It periodically checks for updated installation packages according
  to Section 2 and fetches the packages if necessary. The public keys used
  to sign the recommended-packages file and any of the published packages are
  included in the "au-client" tool.


References

  [1] Tor directory protocol (version 3),
  https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt

  [2] Tor control protocol (version 2),
  https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt

  [3] Tor version specification,
  https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt

Filename: 155-four-hidden-service-improvements.txt
Title: Four Improvements of Hidden Service Performance
Author: Karsten Loesing, Christian Wilms
Created: 25-Sep-2008
Status: Closed
Implemented-In: 0.2.1.x

Change history:

  25-Sep-2008  Initial proposal for or-dev

Overview:

  A performance analysis of hidden services [1] has brought up a few
  possible design changes to reduce advertisement time of a hidden service
  in the network as well as connection establishment time. Some of these
  design changes have side-effects on anonymity or overall network load
  which had to be weighed up against individual performance gains. A
  discussion of seven possible design changes [2] has led to a selection
  of four changes [3] that are proposed to be implemented here.

Design:

  1. Shorter Circuit Extension Timeout

  When establishing a connection to a hidden service a client cannibalizes
  an existing circuit and extends it by one hop to one of the service's
  introduction points. In most cases this can be accomplished within a few
  seconds. Therefore, the current timeout of 60 seconds for extending a
  circuit is far too high.

  Assuming that the timeout would be reduced to a lower value, for example
  30 seconds, a second (or third) attempt to cannibalize and extend would
  be started earlier. With the current timeout of 60 seconds, 93.42% of all
  circuits can be established, whereas this fraction would have been only
  0.87 percentage points smaller, at 92.55%, with a timeout of 30 seconds.

  For a timeout of 30 seconds the performance gain would be approximately 2
  seconds in the mean as opposed to the current timeout of 60 seconds. At
  the same time a smaller timeout leads to discarding an increasing number
  of circuits that might have been completed within the current timeout of
  60 seconds.

  Measurements with simulated low-bandwidth connectivity have shown that
  there is no significant effect of client connectivity on circuit
  extension times. The reason for this might be that extension messages are
  small and thereby independent of the client bandwidth. Further, the
  connection between client and entry node only constitutes a single hop of
  a circuit, so that its influence on the whole circuit is limited.

  The exact value of the new timeout does not necessarily have to be 30
  seconds, but might also depend on the results of circuit build timeout
  measurements as described in proposal 151.

  2. Parallel Connections to Introduction Points

  An additional approach to accelerate extension of introduction circuits
  is to extend a second circuit in parallel to a different introduction
  point. Such parallel extension attempts should be started after a short
  delay of, e.g., 15 seconds in order to prevent unnecessary circuit
  extensions and thereby save network resources. Whichever circuit
  extension succeeds first is used for introduction, while the other
  attempt is aborted.

  An evaluation has been performed for the more resource-intensive approach
  of starting two parallel circuits immediately instead of waiting for a
  short delay. The result was a reduction of connection establishment times
  from 27.4 seconds in the original protocol to 22.5 seconds.

  While the effect of the proposed approach of delayed parallelization on
  mean connection establishment times is expected to be smaller,
  variability of connection attempt times can be reduced significantly.

  3. Increase Count of Internal Circuits

  Hidden services need to create or cannibalize and extend a circuit to a
  rendezvous point for every client request. Really popular hidden services
  require more than two internal circuits in the pool to answer multiple
  client requests at the same time. This scenario was not yet analyzed, but
  will probably exhibit worse performance than measured in the previous
  analysis. The number of preemptively built internal circuits should be a
  function of connection requests in the past to adapt to changing needs.
  Furthermore, an increased number of internal circuits on client side
  would allow clients to establish connections to more than one hidden
  service at a time.

  Under the assumption that a popular hidden service cannot make use of
  cannibalization for connecting to rendezvous points, the circuit creation
  time needs to be added to the current results. In the mean, the
  connection establishment time to a popular hidden service would increase
  by 4.7 seconds.

  4. Build More Introduction Circuits

  When establishing introduction points, a hidden service should launch 5
  instead of 3 introduction circuits at the same time and use only the
  first 3 that could be established. The remaining two circuits could still
  be used for other purposes afterwards.

  The effect has been simulated using previously measured data, too. To
  this end, circuit establishment times were derived from log files and
  written to an array. Afterwards, a simulation with 10,000 runs was
  performed picking 5 (4, 6) random values and using the 3 lowest values in
  contrast to picking only 3 values at random. The result is that the mean
  time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of
  the 3-out-of-5 approach is 4.4 seconds.
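
  The resampling procedure described above can be sketched in Python as
  follows. The measured log data is not reproduced here, so the caller must
  supply its own list of circuit establishment times. The sketch also assumes
  that the reported time is the moment the third circuit completes, which the
  text does not state explicitly; the function name is hypothetical.

```python
import random


def mean_time_until_ready(samples, launched, needed=3, runs=10000, rng=random):
    """Estimate the mean time until `needed` of `launched` circuits are built.

    samples: observed circuit establishment times, in seconds.
    Each run draws `launched` times at random (with replacement) and records
    the time at which the `needed`-th fastest circuit completes.
    """
    if launched < needed:
        raise ValueError("cannot get more circuits than were launched")
    total = 0.0
    for _ in range(runs):
        draws = sorted(rng.choice(samples) for _ in range(launched))
        total += draws[needed - 1]
    return total / runs
```

  Calling the function with launched=3 and launched=5 on the same samples
  gives the two figures being compared; launched=4 and launched=6 cover the
  variants mentioned in parentheses.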

  The effect on network load is minimal, because the hidden service can
  reuse the slower internal circuits for other purposes, e.g., rendezvous
  circuits. The only change is that a hidden service starts establishing
  more circuits at once instead of one after another.

References:

  [1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf

  [2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf

  [3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf

Filename: 156-tracking-blocked-ports.txt
Title: Tracking blocked ports on the client side
Author: Robert Hogan
Created: 14-Oct-2008
Status: Superseded

[Superseded by 156, which recognizes the security issues here.]


Motivation:
Tor clients that are behind extremely restrictive firewalls can end up
waiting a while for their first successful OR connection to a node on the
network.  Worse, the more restrictive their firewall the more susceptible
they are to an attacker guessing their entry nodes. Tor routers that
are behind extremely restrictive firewalls can only offer a limited,
'partitioned' service to other routers and clients on the network. Exit
nodes behind extremely restrictive firewalls may advertise ports that they
are actually not able to connect to, wasting network resources in circuit
constructions that are doomed to fail at the last hop on first use.

Proposal:

When a client attempts to connect to an entry guard it should avoid
further attempts on ports that fail once until it has connected to at
least one entry guard successfully. (Maybe it should wait for more than
one failure to reduce the skew on the first node selection.) Thereafter
it should select entry guards regardless of port and warn the user if
it observes that connections to a given port have failed a multiple
of 5 times, either without any success or since the last success.

Tor should warn the operators of exit, middleman and entry nodes if it
observes that connections to a given port have failed a multiple of 5
times, either without any success or since the last success. If attempts
on a port fail 20 or more times, again without any success or since the
last success, Tor should add the port to a 'blocked-ports' entry in its
descriptor's extra-info. Some thought
needs to be given to what the authorities might do with this information.
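
The counting rules above can be illustrated with a small Python sketch. It is
not the attached C patch: the class and method names are hypothetical, and it
tracks only a per-port failure count that resets on success.

```python
WARN_EVERY = 5          # warn on every fifth consecutive failure
BLOCKED_THRESHOLD = 20  # report the port as blocked at this many failures


class PortFailureTracker:
    """Count connection failures per port since the last success."""

    def __init__(self):
        self.failures = {}  # port -> failures since last success (or ever)

    def note_success(self, port):
        """A connection to `port` succeeded, so its count starts over."""
        self.failures[port] = 0

    def note_failure(self, port):
        """Record a failure; return True if a warning is due now."""
        count = self.failures.get(port, 0) + 1
        self.failures[port] = count
        return count % WARN_EVERY == 0

    def blocked_ports(self):
        """Ports that would go into the 'blocked-ports' entry."""
        return sorted(
            port for port, count in self.failures.items()
            if count >= BLOCKED_THRESHOLD
        )
```

A relay using this sketch would consult blocked_ports() when building its
extra-info document.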

Related TODO item:
    "- Automatically determine what ports are reachable and start using
      those, if circuits aren't working and it's a pattern we
      recognize ("port 443 worked once and port 9001 keeps not
      working")."


I've had a go at implementing all of this in the attached.

Addendum:
Just a note on the patch: storing the digest of each router that uses the port
is a bit of a memory hog, and its only real purpose is to provide a count of
routers using that port when warning the user. That could be achieved when
warning the user by iterating through the routerlist instead.

Index: src/or/connection_or.c
===================================================================
--- src/or/connection_or.c	(revision 17104)
+++ src/or/connection_or.c	(working copy)
@@ -502,6 +502,9 @@
 connection_or_connect_failed(or_connection_t *conn,
                              int reason, const char *msg)
 {
+  if ((reason == END_OR_CONN_REASON_NO_ROUTE) ||
+      (reason == END_OR_CONN_REASON_REFUSED))
+    or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port);
   control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason);
   if (!authdir_mode_tests_reachability(get_options()))
     control_event_bootstrap_problem(msg, reason);
@@ -580,6 +583,7 @@
     /* already marked for close */
     return NULL;
   }
+
   return conn;
 }
 
@@ -909,6 +913,7 @@
   control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0);
 
   if (started_here) {
+    or_port_hist_success(TO_CONN(conn)->port);
     rep_hist_note_connect_succeeded(conn->identity_digest, now);
     if (entry_guard_register_connect_status(conn->identity_digest,
                                             1, now) < 0) {
Index: src/or/rephist.c
===================================================================
--- src/or/rephist.c	(revision 17104)
+++ src/or/rephist.c	(working copy)
@@ -18,6 +18,7 @@
 static void bw_arrays_init(void);
 static void predicted_ports_init(void);
 static void hs_usage_init(void);
+static void or_port_hist_init(void);
 
 /** Total number of bytes currently allocated in fields used by rephist.c. */
 uint64_t rephist_total_alloc=0;
@@ -89,6 +90,25 @@
   digestmap_t *link_history_map;
 } or_history_t;
 
+/** or_port_hist_t contains our router/client's knowledge of
+    all OR ports offered on the network, and how many servers with each port we
+    have succeeded or failed to connect to. */
+typedef struct {
+  /** The port this entry is tracking. */
+  uint16_t or_port;
+  /** Have we ever connected to this port on another OR?. */
+  unsigned int success:1;
+  /** The ORs using this port. */
+  digestmap_t *ids;
+  /** The ORs using this port we have failed to connect to. */
+  digestmap_t *failure_ids;
+  /** Are we excluding ORs with this port during entry selection?*/
+  unsigned int excluded;
+} or_port_hist_t;
+
+static unsigned int still_searching = 0;
+static smartlist_t *or_port_hists;
+
 /** When did we last multiply all routers' weighted_run_length and
  * total_run_weights by STABILITY_ALPHA? */
 static time_t stability_last_downrated = 0;
@@ -164,6 +184,16 @@
   tor_free(hist);
 }
 
+/** Helper: free storage held by a single OR port history entry. */
+static void
+or_port_hist_free(or_port_hist_t *p)
+{
+  tor_assert(p);
+  digestmap_free(p->ids,NULL);
+  digestmap_free(p->failure_ids,NULL);
+  tor_free(p);
+}
+
 /** Update an or_history_t object <b>hist</b> so that its uptime/downtime
  * count is up-to-date as of <b>when</b>.
  */
@@ -1639,7 +1669,7 @@
     tmp_time = smartlist_get(predicted_ports_times, i);
     if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) {
       tmp_port = smartlist_get(predicted_ports_list, i);
-      log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port);
+      log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port);
       smartlist_del(predicted_ports_list, i);
       smartlist_del(predicted_ports_times, i);
       rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t);
@@ -1821,6 +1851,12 @@
   tor_free(last_stability_doc);
   built_last_stability_doc_at = 0;
   predicted_ports_free();
+  if (or_port_hists) {
+    SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p,
+                      or_port_hist_free(p));
+    smartlist_free(or_port_hists);
+    or_port_hists = NULL;
+  }
 }
 
 /****************** hidden service usage statistics ******************/
@@ -2356,3 +2392,225 @@
   tor_free(fname);
 }
 
+/** Create a new entry in the port tracking cache for the or_port in
+  * <b>ri</b>. */
+void
+or_port_hist_new(const routerinfo_t *ri)
+{
+  or_port_hist_t *result;
+  const char *id=ri->cache_info.identity_digest;
+
+  if (!or_port_hists)
+    or_port_hist_init();
+
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+    {
+      /* Cope with routers that change their advertised OR port or are
+         dropped from the networkstatus. We don't discard the failures of
+         dropped routers because they are still valid when counting
+         consecutive failures on a port.*/
+      if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) {
+        digestmap_remove(tp->ids, id);
+      }
+      if (tp->or_port == ri->or_port) {
+        if (!(digestmap_get(tp->ids, id)))
+          digestmap_set(tp->ids, id, (void*)1);
+        return;
+      }
+    });
+
+  result = tor_malloc_zero(sizeof(or_port_hist_t));
+  result->or_port=ri->or_port;
+  result->success=0;
+  result->ids=digestmap_new();
+  digestmap_set(result->ids, id, (void*)1);
+  result->failure_ids=digestmap_new();
+  result->excluded=0;
+  smartlist_add(or_port_hists, result);
+}
+
+/** Create the port tracking cache. */
+/*XXX: need to call this when we rebuild/update our network status */
+static void
+or_port_hist_init(void)
+{
+  routerlist_t *rl = router_get_routerlist();
+
+  if (!or_port_hists)
+    or_port_hists=smartlist_create();
+
+  if (rl && rl->routers) {
+    SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri,
+    {
+      or_port_hist_new(ri);
+    });
+  }
+}
+
+#define NOT_BLOCKED 0
+#define FAILURES_OBSERVED 1
+#define POSSIBLY_BLOCKED 5
+#define PROBABLY_BLOCKED 10
+/** Return the list of blocked ports for our router's extra-info.*/
+char *
+or_port_hist_get_blocked_ports(void)
+{
+  char blocked_ports[2048];
+  char *bp;
+  
+  tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports");
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+    {
+      if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED)
+        tor_snprintf(blocked_ports+strlen(blocked_ports),
+                     sizeof(blocked_ports)-strlen(blocked_ports)," %u,",tp->or_port);
+    });
+  if (strlen(blocked_ports) == 13)
+    return NULL;
+  bp=tor_strdup(blocked_ports);
+  bp[strlen(bp)-1]='\n';
+  bp[strlen(bp)]='\0';
+  return bp;
+}
+
+/** Revert to client-only mode if we have seen too many failures on a port or
+  * range of ports.*/
+static void
+or_port_hist_report_block(unsigned int min_severity)
+{
+  or_options_t *options=get_options();
+  char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048];
+  char port[1024];
+
+  memset(failures_observed,0,sizeof(failures_observed));
+  memset(possibly_blocked,0,sizeof(possibly_blocked));
+  memset(probably_blocked,0,sizeof(probably_blocked));
+
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+    {
+      unsigned int failures = digestmap_size(tp->failure_ids);
+      if (failures >= min_severity) {
+        tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the"
+                     " network)",tp->or_port,failures,
+                     (!tp->success)?"and no successes": "since last success",
+                     digestmap_size(tp->ids));
+        if (failures >= PROBABLY_BLOCKED) {
+          strlcat(probably_blocked, port, sizeof(probably_blocked));
+        } else if (failures >= POSSIBLY_BLOCKED)
+          strlcat(possibly_blocked, port, sizeof(possibly_blocked));
+        else if (failures >= FAILURES_OBSERVED)
+          strlcat(failures_observed, port, sizeof(failures_observed));
+      }
+    });
+
+  log_warn(LD_HIST,"%s%s%s%s%s%s%s%s",
+           server_mode(options) &&
+           ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))?
+           "You should consider disabling your Tor server.":"",
+           (min_severity==FAILURES_OBSERVED)?
+           "Tor appears to be blocked from connecting to a range of ports "
+           "with the result that it cannot connect to one tenth of the Tor "
+           "network. ":"",
+           strlen(failures_observed)?
+           "Tor has observed failures on the following ports: ":"",
+           failures_observed,
+           strlen(possibly_blocked)?
+           "Tor is possibly blocked on the following ports: ":"",
+           possibly_blocked,
+           strlen(probably_blocked)?
+           "Tor is almost certainly blocked on the following ports: ":"",
+           probably_blocked);
+
+}
+
+/** Record the success of our connection to an OR listening on
+  * <b>or_port</b>. */
+void
+or_port_hist_success(uint16_t or_port)
+{
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+    {
+      if (tp->or_port != or_port)
+        continue;
+      /*Reset our failure stats so we can notice if this port ever gets
+        blocked again.*/
+      tp->success=1;
+      if (digestmap_size(tp->failure_ids)) {
+        digestmap_free(tp->failure_ids,NULL);
+        tp->failure_ids=digestmap_new();
+      }
+      if (still_searching) {
+        still_searching=0;
+        SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;);
+      }
+      return;
+    });
+}
+/** Record the failure of our connection to <b>digest</b>'s
+  * OR port. Warn, exclude the port from future entry guard selection, or
+  * add port to blocked-ports in our server's extra-info as appropriate. */
+void
+or_port_hist_failure(const char *digest, uint16_t or_port)
+{
+  int total_failures=0, ports_excluded=0, report_block=0;
+  int total_routers=smartlist_len(router_get_routerlist()->routers);
+
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+    {
+      ports_excluded += tp->excluded;
+      total_failures+=digestmap_size(tp->failure_ids);
+      if (tp->or_port != or_port)
+        continue;
+      /* We're only interested in unique failures */
+      if (digestmap_get(tp->failure_ids, digest))
+        return;
+
+      total_failures++;
+      digestmap_set(tp->failure_ids, digest, (void*)1);
+      if (still_searching && !tp->success) {
+        tp->excluded=1;
+        ports_excluded++;
+      }
+      if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) &&
+         !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED))
+        report_block=POSSIBLY_BLOCKED;
+    });
+
+  if (total_failures >= (int)(total_routers/10))
+    or_port_hist_report_block(FAILURES_OBSERVED);
+  else if (report_block)
+    or_port_hist_report_block(report_block);
+
+  if (ports_excluded >= smartlist_len(or_port_hists)) {
+    log_warn(LD_HIST,"During entry node selection Tor tried every port "
+             "offered on the network on at least one server "
+             "and didn't manage a single "
+             "successful connection. This suggests you are behind an "
+             "extremely restrictive firewall. Tor will keep trying to find "
+             "a reachable entry node.");
+    SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;);
+  }
+}
+
+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */
+void
+or_port_hist_exclude(routerset_t *rt)
+{
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+    {
+      char portpolicy[9];
+      if (tp->excluded) {
+        tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port);
+        log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily "
+                          "from entry guard selection.", tp->or_port);
+        routerset_parse(rt, portpolicy, "Ports");
+      }
+    });
+}
+
+/** Allow the exclusion of ports during our search for an entry node. */
+void
+or_port_hist_search_again(void)
+{
+    still_searching=1;
+}
Index: src/or/or.h
===================================================================
--- src/or/or.h	(revision 17104)
+++ src/or/or.h	(working copy)
@@ -3864,6 +3864,13 @@
 int any_predicted_circuits(time_t now);
 int rep_hist_circbuilding_dormant(time_t now);
 
+void or_port_hist_failure(const char *digest, uint16_t or_port);
+void or_port_hist_success(uint16_t or_port);
+void or_port_hist_new(const routerinfo_t *ri);
+void or_port_hist_exclude(routerset_t *rt);
+void or_port_hist_search_again(void);
+char *or_port_hist_get_blocked_ports(void);
+
 /** Possible public/private key operations in Tor: used to keep track of where
  * we're spending our time. */
 typedef enum {
Index: src/or/routerparse.c
===================================================================
--- src/or/routerparse.c	(revision 17104)
+++ src/or/routerparse.c	(working copy)
@@ -1401,6 +1401,8 @@
     goto err;
   }
 
+  or_port_hist_new(router);
+
   if (!router->platform) {
     router->platform = tor_strdup("<unknown>");
   }
Index: src/or/router.c
===================================================================
--- src/or/router.c	(revision 17104)
+++ src/or/router.c	(working copy)
@@ -1818,6 +1818,7 @@
   char published[ISO_TIME_LEN+1];
   char digest[DIGEST_LEN];
   char *bandwidth_usage;
+  char *blocked_ports;
   int result;
   size_t len;
 
@@ -1825,7 +1826,6 @@
                 extrainfo->cache_info.identity_digest, DIGEST_LEN);
   format_iso_time(published, extrainfo->cache_info.published_on);
   bandwidth_usage = rep_hist_get_bandwidth_lines(1);
-
   result = tor_snprintf(s, maxlen,
                         "extra-info %s %s\n"
                         "published %s\n%s",
@@ -1835,6 +1835,16 @@
   if (result<0)
     return -1;
 
+  blocked_ports = or_port_hist_get_blocked_ports();
+  if (blocked_ports) {
+      result = tor_snprintf(s+strlen(s), maxlen-strlen(s),
+                            "%s",
+                            blocked_ports);
+      tor_free(blocked_ports);
+      if (result<0)
+        return -1;
+  }
+
   if (should_record_bridge_info(options)) {
     static time_t last_purged_at = 0;
     char *geoip_summary;
Index: src/or/circuitbuild.c
===================================================================
--- src/or/circuitbuild.c	(revision 17104)
+++ src/or/circuitbuild.c	(working copy)
@@ -62,6 +62,7 @@
 
 static void entry_guards_changed(void);
 static time_t start_of_month(time_t when);
+static int num_live_entry_guards(void);
 
 /** Iterate over values of circ_id, starting from conn-\>next_circ_id,
  * and with the high bit specified by conn-\>circ_id_type, until we get
@@ -1627,12 +1628,14 @@
   smartlist_t *excluded;
   or_options_t *options = get_options();
   router_crn_flags_t flags = 0;
+  routerset_t *_ExcludeNodes;
 
   if (state && options->UseEntryGuards &&
       (purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) {
     return choose_random_entry(state);
   }
 
+  _ExcludeNodes = routerset_new();
   excluded = smartlist_create();
 
   if (state && (r = build_state_get_exit_router(state))) {
@@ -1670,12 +1673,18 @@
   if (options->_AllowInvalid & ALLOW_INVALID_ENTRY)
     flags |= CRN_ALLOW_INVALID;
 
+  if (options->ExcludeNodes)
+    routerset_union(_ExcludeNodes,options->ExcludeNodes);
+
+  or_port_hist_exclude(_ExcludeNodes);
+
   choice = router_choose_random_node(
            NULL,
            excluded,
-           options->ExcludeNodes,
+           _ExcludeNodes,
            flags);
   smartlist_free(excluded);
+  routerset_free(_ExcludeNodes);
   return choice;
 }
 
@@ -2727,6 +2736,7 @@
 entry_guards_update_state(or_state_t *state)
 {
   config_line_t **next, *line;
+  unsigned int have_reachable_entry=0;
   if (! entry_guards_dirty)
     return;
 
@@ -2740,6 +2750,7 @@
       char dbuf[HEX_DIGEST_LEN+1];
       if (!e->made_contact)
         continue; /* don't write this one to disk */
+      have_reachable_entry=1;
       *next = line = tor_malloc_zero(sizeof(config_line_t));
       line->key = tor_strdup("EntryGuard");
       line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2);
@@ -2785,6 +2796,11 @@
   if (!get_options()->AvoidDiskWrites)
     or_state_mark_dirty(get_or_state(), 0);
   entry_guards_dirty = 0;
+
+  /* XXX: Is this the place to decide that we no longer have any reachable
+    guards? */
+  if (!have_reachable_entry)
+    or_port_hist_search_again();
 }
 
 /** If <b>question</b> is the string "entry-guards", then dump

Filename: 157-specific-cert-download.txt
Title: Make certificate downloads specific
Author: Nick Mathewson
Created: 2-Dec-2008
Status: Closed
Target: 0.2.4.x

History:

  2008 Dec 2, 22:34
     Changed name of cross certification field to match the other authority
     certificate fields.

Status:

  As of 0.2.1.9-alpha:
    Cross-certification is implemented for new certificates, but not yet
    required.  Directories support the tor/keys/fp-sk urls.

Overview:

  Tor's directory specification gives two ways to download a certificate:
  by its identity fingerprint, or by the digest of its signing key.  Both
  are error-prone.  We propose a new download mechanism to make sure that
  clients get the certificates they want.

Motivation:

  When a client wants a certificate to verify a consensus, it has two choices
  currently:
     - Download by identity key fingerprint.  In this case, the client risks
       getting a certificate for the same authority, but with a different
       signing key than the one used to sign the consensus.

     - Download by signing key fingerprint.  In this case, the client risks
       getting a forged certificate that contains the right signing key
       signed with the wrong identity key.  (Since caches are willing to
       cache certs from authorities they do not themselves recognize, the
       attacker wouldn't need to compromise an authority's key to do this.)

Current solution:

  Clients fetch by identity keys, and re-fetch with backoff if they don't get
  certs with the signing key they want.

Proposed solution:

  Phase 1: Add a URL type for clients to download certs by identity _and_
  signing key fingerprint.  Unless both fields match, the client doesn't
  accept the certificate(s).  Clients begin using this method when their
  randomly chosen directory cache supports it.

  Phase 1A: Simultaneously, add a cross-certification element to
  certificates.

  Phase 2: Once many directory caches support phase 1, clients should prefer
  to fetch certificates using that protocol when available.

  Phase 2A: Once all authorities are generating cross-certified certificates
  as in phase 1A, require cross-certification.

Specification additions:

  The key certificate whose identity key fingerprint is <F> and whose signing
  key fingerprint is <S> should be available at:

      http://<hostname>/tor/keys/fp-sk/<F>-<S>.z

  As usual, clients may request multiple certificates using:

      http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z

  Clients SHOULD use this format whenever they know both key fingerprints for
  a desired certificate.
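
  As an illustration, the two URL forms above could be assembled as in the
  following Python sketch.  The function name fp_sk_url and the
  (identity, signing) tuple layout are assumptions made for this example;
  only the tor/keys/fp-sk/ path and the "-" and "+" separators come from
  the text above.

```python
def fp_sk_url(hostname, pairs):
    """Build a tor/keys/fp-sk URL for one or more certificates.

    `pairs` is a sequence of (identity_fp, signing_fp) hex strings.
    (Hypothetical helper; not part of any Tor codebase.)
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (identity, signing) pair")
    # Each certificate is named <F>-<S>; several are joined with "+".
    items = "+".join(f"{identity}-{signing}" for identity, signing in pairs)
    return f"http://{hostname}/tor/keys/fp-sk/{items}.z"
```

  A client that knows both fingerprints for each certificate it wants would
  pass them all in one call and issue a single request.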


  Certificates SHOULD contain the following field (at most once):

  "dir-key-crosscert" NL CrossSignature NL

  where CrossSignature is a signature, made using the certificate's signing
  key, of the digest of the PKCS1-padded hash of the certificate's identity
  key.  For backward compatibility with broken versions of the parser, we
  wrap the base64-encoded signature in -----BEGIN ID SIGNATURE----- and
  -----END ID SIGNATURE----- tags.  (See bug 880.) Implementations MUST allow
  the "ID " portion to be omitted, however.

  When encountering a certificate with a dir-key-crosscert entry,
  implementations MUST verify that the signature is a correct signature of
  the hash of the identity key using the signing key.

  (In a future version of this specification, dir-key-crosscert entries will
  be required.)

Why cross-certify too?

  Cross-certification protects clients who haven't updated yet, by reducing
  the number of caches that are willing to hold and serve bogus certificates.

References:

  This is related to part 2 of bug 854.

Filename: 158-microdescriptors.txt
Title: Clients download consensus + microdescriptors
Author: Roger Dingledine
Created: 17-Jan-2009
Status: Closed
Implemented-In: 0.2.3.1-alpha

0. History

  15 May 2009: Substantially revised based on discussions on or-dev
  from late January.  Removed the notion of voting on how to choose
  microdescriptors; made it just a function of the consensus method.
  (This lets us avoid the possibility of "desynchronization.")
  Added suggestion to use a new consensus flavor.  Specified use of
  SHA256 for new hashes. -nickm

  15 June 2009: Cleaned up based on comments from Roger. -nickm

1. Overview

  This proposal replaces section 3.2 of proposal 141, which was
  called "Fetching descriptors on demand". Rather than modifying the
  circuit-building protocol to fetch a server descriptor inline at each
  circuit extend, we instead put all of the information that clients need
  either into the consensus itself, or into a new set of data about each
  relay called a microdescriptor.

  Descriptor elements that are small and frequently changing should go
  in the consensus itself, and descriptor elements that are small and
  relatively static should go in the microdescriptor. If we ever end up
  with descriptor elements that aren't small yet clients need to know
  them, we'll need to resume considering some design like the one in
  proposal 141.

  Note also that any descriptor element which clients need to use to
  decide which servers to fetch info about, or which servers to fetch
  info from, needs to stay in the consensus.

2. Motivation

  See
  http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
  http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
  http://archives.seul.org/or/dev/Nov-2008/msg00007.html
  for a discussion of the options and why this is currently the best
  approach.

3. Design

  There are three pieces to the proposal. First, authorities will list in
  their votes (and thus in the consensus) the expected hash of the
  microdescriptor for each relay. Second, authorities will serve
  microdescriptors, and directory mirrors will cache and serve
  them. Third, clients will ask for them and cache them.

3.1. Consensus changes

  If the authorities choose a consensus method of a given version or
  later, a microdescriptor format is implicit in that version.
  A microdescriptor should in every case be a pure function of the
  router descriptor and the consensus method.

  In votes, we need to include the hash of each expected microdescriptor
  in the routerstatus section. I suggest a new "m" line for each stanza,
  with the base64 of the SHA256 hash of the router's microdescriptor.

  For every consensus method that an authority supports, it includes a
  separate "m" line in each router section of its vote, containing:
    "m" SP methods 1*(SP AlgorithmName "=" digest) NL
  where methods is a comma-separated list of the consensus methods
  that the authority believes will produce "digest".

  (As with base64 encoding of SHA1 hashes in consensuses, let's
  omit the trailing =s)
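
  A rough Python sketch of the vote "m" line follows.  It assumes that
  AlgorithmName is the string "sha256" and that the consensus method
  numbers are plain integers; the helper names and the example method
  numbers are hypothetical.

```python
import base64
import hashlib

def md_digest_b64(microdesc: bytes) -> str:
    """Base64 of the SHA256 hash, with the trailing '=' padding omitted."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")

def format_m_line(methods, digests):
    """Format: "m" SP methods 1*(SP AlgorithmName "=" digest) NL."""
    if not methods or not digests:
        raise ValueError("need at least one method and one digest")
    method_list = ",".join(str(m) for m in methods)
    pairs = "".join(f" {alg}={digest}" for alg, digest in digests.items())
    return f"m {method_list}{pairs}\n"

def parse_m_line(line):
    """Inverse of format_m_line; returns (methods, {algorithm: digest})."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 3 or parts[0] != "m":
        raise ValueError("not a vote m line")
    methods = [int(m) for m in parts[1].split(",")]
    digests = {}
    for item in parts[2:]:
        alg, sep, digest = item.partition("=")
        if not (alg and sep and digest):
            raise ValueError("malformed digest item: %r" % item)
        digests[alg] = digest
    return methods, digests
```

  Because the padding is stripped, the digest itself never contains "=",
  so splitting each item at its first "=" is unambiguous.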

  The consensus microdescriptor-elements and "m" lines are then computed
  as described in Section 3.1.2 below.

  (This means we need a new consensus-method that knows
  how to compute the microdescriptor-elements and add "m" lines.)

  The microdescriptor consensus uses the directory-signature format from
  proposal 162, with the "sha256" algorithm.


3.1.1. Descriptor elements to include for now

  In the first version, the microdescriptor should contain the
  onion-key element, and the family element from the router descriptor,
  and the exit policy summary as currently specified in dir-spec.txt.

3.1.2. Computing consensus for microdescriptor-elements and "m" lines

  When we are generating a consensus, we use whichever m line
  unambiguously corresponds to the descriptor digest that will be
  included in the consensus.

  (If different votes have different microdescriptor digests for a
  single <descriptor-digest, consensus-method> pair, then at least one
  of the authorities is broken.  If this happens, the consensus should
  contain whichever microdescriptor digest is most common.  If there is
  no winner, we break ties in favor of the lexically earliest digest.
  Either way, we should log a warning: there is definitely a bug.)

  The "m" lines in a consensus contain only the digest, not a list of
  consensus methods.
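
  The selection rule for disagreeing votes can be sketched as below.  This
  is an illustration only: the function name is hypothetical, and the
  input is assumed to be the list of digests that the votes gave for one
  <descriptor-digest, consensus-method> pair.

```python
from collections import Counter

def choose_md_digest(voted_digests):
    """Pick the microdescriptor digest to put in the consensus.

    Returns (digest, disagreement).  `disagreement` is True when the
    votes were not unanimous, in which case the caller should log a
    warning, since at least one authority is broken.
    """
    if not voted_digests:
        return None, False
    counts = Counter(voted_digests)
    top = max(counts.values())
    # Most common digest wins; ties go to the lexically earliest one.
    winners = sorted(d for d, n in counts.items() if n == top)
    return winners[0], len(counts) > 1
```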

3.1.3. A new flavor of consensus

  Rather than inserting "m" lines in the current consensus format,
  they should be included in a new consensus flavor (see proposal
  162).

  This flavor can safely omit descriptor digests.

  When we implement this voting method, we can remove the exit policy
  summaries from the current "ns" flavor of consensus, since no current
  clients use them, and they take up about 5% of the compressed
  consensus.

  This new consensus flavor should be signed with the sha256 signature
  format as documented in proposal 162.

3.2. Directory mirrors fetch, cache, and serve microdescriptors

  Directory mirrors should fetch, cache, and serve each microdescriptor
  from the authorities.  (They need to continue to serve normal relay
  descriptors too, to handle old clients.)

  The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
  available at:
    http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
  (We use base64 for size and for consistency with the consensus
  format. We use -s instead of +s to separate these items, since
  the + character is used in base64 encoding.)
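
  For illustration, a request URL of this form could be built as follows.
  The helper name is hypothetical, and the digests are assumed to be
  base64 strings that have already had their "=" padding removed.

```python
def micro_url(hostname, digests):
    """Build a tor/micro/d URL naming the wanted microdescriptors."""
    digests = list(digests)
    if not digests:
        raise ValueError("need at least one digest")
    for d in digests:
        # "-" is the separator, so it must not appear inside a digest.
        # Standard base64 never produces it, but check anyway.
        if "-" in d:
            raise ValueError("digest contains separator: %r" % d)
    return "http://%s/tor/micro/d/%s.z" % (hostname, "-".join(digests))
```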

  All the microdescriptors from the current consensus should also be
  available at:
    http://<hostname>/tor/micro/all.z
  so a client that's bootstrapping doesn't need to send a 70KB URL just
  to name every microdescriptor it's looking for.

  Microdescriptors have no header or footer.
  The hash of the microdescriptor is simply the hash of the concatenated
  elements.

  Directory mirrors should check to make sure that the microdescriptors
  they're about to serve match the right hashes (either the hashes from
  the fetch URL or the hashes from the consensus, respectively).

  We will probably want to consider some sort of smart data structure to
  be able to quickly convert microdescriptor hashes into the appropriate
  microdescriptor. Clients will want this anyway when they load their
  microdescriptor cache and want to match it up with the consensus to
  see what's missing.

3.3. Clients fetch them and cache them

  When a client gets a new consensus, it looks to see if there are any
  microdescriptors it needs to learn. If it needs to learn more than
  some threshold of the microdescriptors (half?), it requests 'all',
  else it requests only the missing ones.  Clients MAY try to
  determine whether the upload bandwidth for listing the
  microdescriptors they want is more or less than the download
  bandwidth for the microdescriptors they do not want.
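
  One possible shape for this decision is sketched below.  The function
  name, the return convention, and the default threshold of one half are
  assumptions for the example; the proposal itself only suggests "half?".

```python
def plan_fetch(wanted, cached, threshold=0.5):
    """Decide how to fetch microdescriptors after a new consensus.

    wanted: digests listed in the consensus; cached: digests we hold.
    Returns ("none", []), ("all", []) or ("some", sorted_missing).
    """
    wanted = set(wanted)
    missing = wanted - set(cached)
    if not missing:
        return ("none", [])
    # Requesting 'all' avoids sending a very long list of digests.
    if len(missing) > threshold * len(wanted):
        return ("all", [])
    return ("some", sorted(missing))
```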

  Clients maintain a cache of microdescriptors along with metadata such as
  when each one was last referenced by a consensus, and which identity key
  it corresponds to.  They keep a microdescriptor
  until it hasn't been mentioned in any consensus for a week. Future
  clients might cache them for longer or shorter times.
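
  The expiry rule might look like the following sketch, which assumes the
  cache is a mapping from digest to the time (in seconds) that the
  microdescriptor was last listed in a consensus.  The names are
  hypothetical.

```python
ONE_WEEK = 7 * 24 * 60 * 60  # seconds

def expire_microdescs(last_listed, now, max_age=ONE_WEEK):
    """Return the cache entries that are still young enough to keep.

    An entry is dropped once it has gone `max_age` seconds without being
    mentioned in any consensus.
    """
    return {d: t for d, t in last_listed.items() if now - t < max_age}
```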

3.3.1. Information leaks from clients

  If a client asks you for a set of microdescs, then you know she didn't
  have them cached before. How much does that leak? What about when
  we're all using our entry guards as directory guards, and we've seen
  that user make a bunch of circuits already?

  Fetching "all" when you need at least half is a good first order fix,
  but might not be all there is to it.

  Another future option would be to fetch some of the microdescriptors
  anonymously (via a Tor circuit).

  Another crazy option (Roger's phrasing) is to do decoy fetches as
  well.

4. Transition and deployment

  Phase one, the directory authorities should start voting on
  microdescriptors, and putting them in the consensus.

  Phase two, directory mirrors should learn how to serve them, and learn
  how to read the consensus to find out what they should be serving.

  Phase three, clients should start fetching and caching them instead
  of normal descriptors.

Filename: 159-exit-scanning.txt
Title: Exit Scanning
Author: Mike Perry
Created: 13-Feb-2009
Status: Informational

Overview:

This proposal describes the implementation and integration of an
automated exit node scanner for scanning the Tor network for malicious,
misconfigured, firewalled or filtered nodes.

Motivation:

Tor exit nodes can be run by anyone with an Internet connection. Often,
these users aren't fully aware of limitations of their networking
setup.  Content filters, antivirus software, advertisements injected by
their service providers, malicious upstream providers, and the resource
limitations of their computer or networking equipment have all been
observed on the current Tor network.

It is also possible that some nodes exist purely for malicious
purposes.  In the past, there have been intermittent instances of
nodes spoofing SSH keys, as well as nodes being used for purposes of
plaintext surveillance.

While it is not realistic to expect to catch extremely targeted or
completely passive malicious adversaries, the goal is to prevent
malicious adversaries from deploying dragnet attacks against large
segments of the Tor userbase.


Scanning methodology:

The first scans to be implemented are HTTP, HTML, Javascript, and
SSL scans.

The HTTP scan scrapes Google for URLs of common file types such as exe, msi,
doc, and dmg. It then fetches these URLs both directly (Non-Tor) and through
Tor, and compares the SHA1 hashes of the resulting content.

The SSL scan downloads certificates for all IPs a domain will locally
resolve to and compares these certificates to those seen over Tor. The
scanner notes if a domain had rotated certificates locally in the
results for each scan.

The HTML scan checks HTML, Javascript, and plugin content for
modifications. Because of the dynamic nature of most of the web, the
scanner has a number of mechanisms built in to filter out false
positives that are used when a change is noticed between Tor and
Non-Tor.

All tests also share a URL-based false positive filter that
automatically removes results retroactively if the number of failures
exceeds a certain percentage of nodes tested with the URL.
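
As a rough illustration, the hash comparison used by the HTTP scan and the
shared URL-based filter could be sketched as follows. This is not the
scanner's actual code: the function names, the tuple layout of the results,
and the max_fail_fraction parameter are assumptions, and the proposal does
not state what percentage the filter uses.

```python
import hashlib
from collections import defaultdict

def content_differs(direct_body: bytes, tor_body: bytes) -> bool:
    """HTTP scan check: compare SHA1 hashes of the two fetched bodies."""
    return hashlib.sha1(direct_body).digest() != hashlib.sha1(tor_body).digest()

def filter_false_positives(results, max_fail_fraction):
    """Drop failures for URLs that fail on too many of the exits tested.

    results: iterable of (url, exit_id, failed) tuples.
    Returns the (url, exit_id) failures that survive the filter.
    """
    results = list(results)
    tested = defaultdict(set)
    failed = defaultdict(set)
    for url, exit_id, did_fail in results:
        tested[url].add(exit_id)
        if did_fail:
            failed[url].add(exit_id)
    kept = []
    for url, exit_id, did_fail in results:
        if not did_fail:
            continue
        # A URL that fails on many exits is probably dynamic content,
        # so all of its failures are discarded retroactively.
        if len(failed[url]) > max_fail_fraction * len(tested[url]):
            continue
        kept.append((url, exit_id))
    return kept
```

Running the filter over the whole result set, rather than per exit, is what
lets it remove earlier results once a URL turns out to be unreliable.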


Deployment Stages:

To avoid instances where bugs cause us to mark exit nodes as BadExit
improperly, it is proposed that we begin use of the scanner in stages.

1. Manual Review:

  In the first stage, basic scans will be run by a small number of
  people while we stabilize the scanner. The scanner has the ability
  to resume crashed scans, and to rescan nodes that fail various
  tests.

2. Human Review:

  In the second stage, results will be automatically mailed to
  an email list of interested parties for review. We will also begin
  classifying failure types into three to four different severity
  levels, based on both the reliability of the test and the nature of
  the failure.

3. Automatic BadExit Marking:

  In the final stage, the scanner will begin marking exits depending
  on the failure severity level in one of three different ways: by
  node idhex, by node IP, or by node IP mask. A potential fourth, less
  severe category of results may still be delivered via email only for
  review.

  BadExit markings will be delivered in batches upon completion
  of whole-network scans, so that the final false positive
  filter has an opportunity to filter out URLs that exhibit
  dynamic content beyond what we can filter.


Specification of Exit Marking:

Technically, BadExit could be marked via SETCONF AuthDirBadExit over
the control port, but this would allow full access to the directory
authority configuration and operation.

The approved-routers file could also be used, but currently it only
supports fingerprints, and it also contains other data unrelated to
exit scanning that would be difficult to coordinate.

Instead, we propose a new badexit-routers file with the following
keywords:

  BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
  BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]

BadExitNet lines would follow the codepaths used by AuthDirBadExit to
set authdir_badexit_policy, and BadExitFP would follow the codepaths
used for the !badexit lines in approved-routers.

The scanner would have exclusive ability to write, append, rewrite,
and modify this file. Prior to building a new consensus vote, a
participating Tor authority would read in a fresh copy.
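
A minimal sketch of a reader for such a file is shown below. It assumes that
each line holds one keyword followed by one or more whitespace-separated
items, and that a hexdigest is 40 hexadecimal characters. Exit patterns are
kept as opaque strings rather than validated. The function name is
hypothetical.

```python
import string

def parse_badexit_routers(text):
    """Return (exit_patterns, fingerprints) from badexit-routers content."""
    patterns, fingerprints = [], []
    for lineno, line in enumerate(text.splitlines(), 1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        keyword, items = fields[0], fields[1:]
        if not items:
            raise ValueError("line %d: %s needs at least one item" % (lineno, keyword))
        if keyword == "BadExitNet":
            patterns.extend(items)
        elif keyword == "BadExitFP":
            for fp in items:
                # Assumed format: 40 hex characters.
                if len(fp) != 40 or any(c not in string.hexdigits for c in fp):
                    raise ValueError("line %d: bad fingerprint %r" % (lineno, fp))
                fingerprints.append(fp.upper())
        else:
            raise ValueError("line %d: unknown keyword %r" % (lineno, keyword))
    return patterns, fingerprints
```

Rejecting unknown keywords outright keeps a compromised or buggy scanner from
passing anything other than these two kinds of entries to the authority.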


Security Implications:

Aside from evading the scanner's detection, there are two additional
high-level security considerations:

1. Ensure nodes cannot be marked BadExit by an adversary at will

It is possible individual website owners will be able to target certain
Tor nodes, but once they begin to attempt to fail more than the URL
filter percentage of the exits, their sites will be automatically
discarded.

Failing specific nodes is possible, but scanned results are fully
reproducible, and BadExits should be rare enough that humans are never
fully removed from the loop.

State (cookies, cache, etc.) does not otherwise persist in the scanner
between exit nodes, so one exit node cannot bias the results of a
later one.

2. Ensure that scanner compromise does not yield authority compromise

Having a separate file that is under the exclusive control of the
scanner allows us to heavily isolate the scanner from the Tor
authority, potentially even running them on separate machines.

Filename: 160-bandwidth-offset.txt
Title: Authorities vote for bandwidth offsets in consensus
Author: Roger Dingledine
Created: 4-May-2009
Status: Closed
Target: 0.2.1.x

1. Motivation

  As part of proposal 141, we moved the bandwidth value for each relay
  into the consensus. Now clients can know how they should load balance
  even before they've fetched the corresponding relay descriptors.

  Putting the bandwidth in the consensus also lets the directory
  authorities choose more accurate numbers to advertise, if we come up
  with a better algorithm for deciding weightings.

  Our original plan was to teach directory authorities how to measure
  bandwidth themselves; then every authority would vote for the bandwidth
  it prefers, and we'd take the median of votes as usual.

  The problem comes when we have 7 authorities, and only a few of them
  have smarter bandwidth allocation algorithms. So long as the majority
  of them are voting for the number in the relay descriptor, the minority
  that have better numbers will be ignored.

2. Options

  One fix would be to demand that every authority also run the
  new bandwidth measurement algorithms: in that case, part of the
  responsibility of being an authority operator is that you need to run
  this code too. But in practice we can't really require all current
  authority operators to do that; and if we want to expand the set of
  authority operators even further, it will become even more impractical.
  Also, bandwidth testing adds load to the network, so we don't really
  want to require that the number of concurrent bandwidth tests match
  the number of authorities we have.

  The better fix is to allow certain authorities to specify that they are
  voting on bandwidth measurements: more accurate bandwidth values that
  have actually been evaluated. In this way, authorities can vote on 
  the median measured value if sufficient measured votes exist for a router,
  and otherwise fall back to the median value taken from the published router
  descriptors.

3. Security implications

  If only some authorities choose to vote on an offset, then a majority of
  those voting authorities can arbitrarily change the bandwidth weighting
  for the relay. At the extreme, if there's only one offset-voting
  authority, then that authority can dictate which relays clients will
  find attractive.

  This problem isn't entirely new: we already have the worry wrt
  the subset of authorities that vote for BadExit.

  To make it not so bad, we should deploy at least three offset-voting
  authorities.

  Also, authorities that know how to vote for offsets should vote for
  an offset of zero for new nodes, rather than choosing not to vote on
  any offset in those cases.

4. Design

  First, we need a new consensus method to support this new calculation.

  Now v3 votes can have an additional value on the "w" line:
    "w Bandwidth=X Measured=" INT.

  Once we're using the new consensus method, the new way to compute the
  Bandwidth weight is by checking if there are at least 3 "Measured"
  votes. If so, the median of these is taken. Otherwise, the median
  of the "Bandwidth=" values is taken, as described in Proposal 141.

  Then the actual consensus looks just the same as it did before,
  so clients never have to know that this additional calculation is
  happening.
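
  The selection rule above can be sketched as follows. This is only an
  illustration: the function name and the (bandwidth, measured) input
  shape are hypothetical, and the use of the low median for even-sized
  lists is an assumption, since the text above says only "median".

```python
from statistics import median_low

MIN_MEASURED_VOTES = 3  # "at least 3" Measured votes, per the text above


def consensus_bandwidth(votes):
    """Pick the consensus bandwidth for one relay.

    `votes` is a list of (bandwidth, measured) pairs, one per authority
    vote. `measured` is None when the vote has no "Measured=" value.
    """
    measured = [m for _, m in votes if m is not None]
    if len(measured) >= MIN_MEASURED_VOTES:
        # Enough measuring authorities: use the median of their values.
        return median_low(measured)
    # Otherwise fall back to the median of the "Bandwidth=" values.
    return median_low([b for b, _ in votes])
```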

5. Implementation

  The Measured values will be read from a file provided by the scanners
  described in proposal 161. Files with a timestamp older than 3 days
  will be ignored.

  The file will be read in from dirserv_generate_networkstatus_vote_obj()
  in a location specified by a new config option "V3MeasuredBandwidths".
  A helper function will be called to populate new 'measured' and
  'has_measured' fields of the routerstatus_t 'routerstatuses' list with 
  values read from this file.

  An additional for_vote flag will be passed to 
  routerstatus_format_entry() from format_networkstatus_vote(), which will 
  indicate that the "Measured=" string should be appended to the "w Bandwidth="
  line with the measured value in the struct.

  routerstatus_parse_entry_from_string() will be modified to parse the
  "Measured=" lines into routerstatus_t struct fields.

  Finally, networkstatus_compute_consensus() will set rs_out.bandwidth
  to the median of the measured values if there are at least 3; otherwise
  it will use the bandwidth value median as normal.



Title: Computing Bandwidth Adjustments
Filename: 161-computing-bandwidth-adjustments.txt
Author: Mike Perry
Created: 12-May-2009
Target: 0.2.1.x
Status: Closed


1. Motivation

  There is high variance in the performance of the Tor network. Despite
  our efforts to balance load evenly across the Tor nodes, some nodes are
  significantly slower and more overloaded than others.

  Proposal 160 describes how we can augment the directory authorities to
  vote on measured bandwidths for routers. This proposal describes what
  goes into the measuring process.


2. Measurement Selection

  The general idea is to determine a load factor representing the ratio
  of the capacity of measured nodes to the rest of the network. This load
  factor could be computed from three potentially relevant statistics:
  circuit failure rates, circuit extend times, or stream capacity.

  Circuit failure rates and circuit extend times appear to be
  non-linearly proportional to node load. We've observed that the same
  nodes when scanned at US nighttime hours (when load is presumably
  lower) exhibit almost no circuit failure, and significantly faster
  extend times than when scanned during the day.

  Stream capacity, however, is much more uniform, even during US
  nighttime hours. Moreover, it is a more intuitive representation of
  node capacity, and also less dependent upon distance and latency
  if amortized over large stream fetches.


3. Average Stream Bandwidth Calculation

  The average stream bandwidths are obtained by dividing the network into
  slices of 50 nodes each, grouped according to advertised node bandwidth.

  Two hop circuits are built using nodes from the same slice, and a large
  file is downloaded via these circuits. The file sizes are set based
  on node percentile rank as follows:
    
     0-10: 2M
     10-20: 1M
     20-30: 512k
     30-50: 256k
     50-100: 128k

  These sizes are based on measurements performed during test scans.
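
  As an illustration, the table above could be applied with a lookup
  like the one below. The function and table names are hypothetical.
  Two details are assumptions, because the table does not settle them:
  each range is treated as half-open (a rank of exactly 10 falls in
  10-20), and "M" and "k" are read as binary multiples.

```python
# (exclusive upper bound of the percentile range, file size in bytes)
SIZE_BY_PERCENTILE = [
    (10, 2 * 1024 * 1024),   # 0-10:   2M
    (20, 1024 * 1024),       # 10-20:  1M
    (30, 512 * 1024),        # 20-30:  512k
    (50, 256 * 1024),        # 30-50:  256k
    (100, 128 * 1024),       # 50-100: 128k
]


def fetch_size(percentile):
    """Return the download size in bytes for a node's percentile rank."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    for upper, size in SIZE_BY_PERCENTILE:
        if percentile < upper:
            return size
    # Only a rank of exactly 100 reaches this point.
    return SIZE_BY_PERCENTILE[-1][1]
```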

  This process is repeated until each node has been chosen to participate
  in at least 5 circuits.


4. Ratio Calculation

  The ratios are calculated by dividing each measured value by the 
  network-wide average.


5. Ratio Filtering

  After the base ratios are calculated, a second pass is performed
  to remove any streams with nodes of ratios less than X=0.5 from
  the results of other nodes. In addition, all outlying streams
  with capacity of one standard deviation below a node's average
  are also removed.

  The final ratio result will be the greater of the unfiltered ratio
  and the filtered ratio.


6. Pseudocode for Ratio Calculation Algorithm

  Here is the complete pseudocode for the ratio algorithm:

    Slices = {S | S is 50 nodes of similar consensus capacity}
    for S in Slices:
      while exists node N in S with circ_chosen(N) < 7:
        fetch_slice_file(build_2hop_circuit(N, (exit in S)))
      for N in S:
        BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
        Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
      Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)  
      for N in S:
        Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)} 
        BW_Norm_measured(N) =  MEAN(b | b is a bandwidth of Normal_Streams(N))

    Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
    Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)

    for N in all Slices:
      Bw_net_ratio(N) = BW_measured(N)/Bw_net_avg(Slices)
      Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)

      ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
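
  The arithmetic of the pseudocode above can be sketched in Python as
  shown below. This is a minimal sketch, not the scanner's code: the
  function name and the input mapping are hypothetical, circuit
  building and fetching are left out, every node is assumed to have at
  least one measured stream, and only the filtering that appears in
  the pseudocode is applied.

```python
from statistics import mean


def result_ratios(streams):
    """Compute ResultRatio(N) for every node.

    `streams` maps a node id to the list of stream bandwidths measured
    through that node, gathered across all slices.
    """
    # BW_measured(N): mean bandwidth of all streams through N.
    bw_measured = {n: mean(bws) for n, bws in streams.items()}
    # BW_Norm_measured(N): mean of the streams at or above N's own mean.
    bw_norm = {
        n: mean([b for b in bws if b >= bw_measured[n]])
        for n, bws in streams.items()
    }
    # Network-wide averages of both quantities.
    net_avg = mean(list(bw_measured.values()))
    norm_net_avg = mean(list(bw_norm.values()))
    # ResultRatio(N) is the greater of the unfiltered and filtered ratios.
    return {
        n: max(bw_measured[n] / net_avg, bw_norm[n] / norm_net_avg)
        for n in streams
    }
```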


7. Security implications

  The ratio filtering will deal with cases of sabotage by dropping
  both very slow outliers in stream average calculations, as well
  as dropping streams that used very slow nodes from the calculation
  of other nodes.

  This scheme will not address nodes that try to game the system by
  providing better service to scanners. The scanners can be detected
  at the entry by IP address, and at the exit by the destination fetch
  IP.

  Measures can be taken to obfuscate and separate the scanners' source
  IP address from the directory authority IP address. For instance,
  scans can happen offsite and the results can be rsynced into the
  authorities. The destination server IP can also change.
 
  Neither of these methods is foolproof, but such nodes can already
  lie about their bandwidth to attract more traffic, so this solution
  does not set us back any in that regard.


8. Parallelization

  Because each slice takes as long as 6 hours to complete, we will want
  to parallelize as much as possible. This will be done by concurrently
  running multiple scanners from each authority to deal with different
  segments of the network. Each scanner piece will continually loop 
  over a portion of the network, outputting files of the form:

   node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
         filt_bw=<BW_Norm_measured(N)> SP ns_bw=<CurrentConsensusBw(N)> NL

  The most recent file from each scanner will be periodically gathered 
  by another script that uses them to produce network-wide averages 
  and calculate ratios as per the algorithm in section 6. Because nodes 
  may shift in capacity, they may appear in more than one slice and/or 
  appear more than once in the file set. The most recently measured
  line will be chosen in this case.


9. Integration with Proposal 160

  The final results will be produced for the voting mechanism
  described in Proposal 160 by multiplying the derived ratio by
  the average published consensus bandwidth during the course of the
  scan, and taking the weighted average with the previous consensus
  bandwidth:

     Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))

  The Alpha parameter is a smoothing parameter intended to prevent
  rapid oscillation between loaded and unloaded conditions. It is
  currently fixed at 0.333.

  The Round() step consists of rounding to the 3 most significant figures
  in base 10, and then rounding that result to the nearest 1000, with
  a minimum value of 1000.
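
  The smoothing formula and the Round() step can be sketched as
  follows. The function names are hypothetical. How exact ties are
  rounded is not stated above, so this sketch simply inherits the
  behavior of Python's built-in round(); treat that as an assumption.

```python
from math import floor, log10

ALPHA = 0.333  # smoothing parameter, "currently fixed at 0.333"


def round_bw(value):
    """Round to 3 significant figures, then to the nearest 1000 (min 1000)."""
    if value <= 0:
        return 1000
    # Number of digits before the decimal point.
    digits = floor(log10(value)) + 1
    # Keep the 3 most significant base-10 figures.
    sig3 = round(value, 3 - digits)
    # Round to the nearest 1000 and enforce the minimum value.
    return max(1000, int(round(sig3 / 1000.0)) * 1000)


def new_bandwidth(bw_current, bw_scan_avg, bw_ratio, alpha=ALPHA):
    """Weighted average of the current value and the scan-derived value."""
    blended = (bw_current * alpha + bw_scan_avg * bw_ratio) / (alpha + 1)
    return round_bw(blended)
```

  For example, a relay whose current and scan-average consensus
  bandwidth are both 100000 and whose ratio is 2.0 moves only part of
  the way toward 200000 in a single step.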

  This will produce a new bandwidth value that will be output into a 
  file consisting of lines of the form:

     node_id=<idhex> SP bw=<Bw_new> NL
 
  The first line of the file will contain a timestamp in UNIX time()
  seconds. This will be used by the authority to decide if the 
  measured values are too old to use.
 
  This file can be either copied or rsynced into a directory readable
  by the directory authority.
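
  A reader for this file could look like the sketch below. It is not
  the authority's implementation: the function name is hypothetical,
  error handling for malformed lines is omitted, and a stale file is
  assumed to yield no measurements at all. The 3-day limit comes from
  proposal 160.

```python
import time

MAX_AGE_SECONDS = 3 * 24 * 60 * 60  # proposal 160: ignore files older than 3 days


def parse_measured_file(text, now=None):
    """Return {node_id: bw} from the file contents, or {} if it is too old."""
    now = time.time() if now is None else now
    lines = text.splitlines()
    if not lines:
        return {}
    # The first line is a timestamp in UNIX time() seconds.
    timestamp = int(lines[0])
    if now - timestamp > MAX_AGE_SECONDS:
        return {}
    measured = {}
    for line in lines[1:]:
        if not line.strip():
            continue
        # Each remaining line has the form: node_id=<idhex> SP bw=<Bw_new>
        fields = dict(item.split("=", 1) for item in line.split())
        measured[fields["node_id"]] = int(fields["bw"])
    return measured
```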

Filename: 162-consensus-flavors.txt
Title: Publish the consensus in multiple flavors
Author: Nick Mathewson
Created: 14-May-2009
Implemented-In: 0.2.3.1-alpha
Status: Closed

[Implementation notes: the 'consensus index' feature never got implemented.]

Overview:

   This proposal describes a way to publish each consensus in
   multiple simultaneous formats, or "flavors".  This will reduce the
   amount of time needed to deploy new consensus-like documents, and
   reduce the size of consensus documents in the long term.

Motivation:

   In the future, we will almost surely want different fields and
   data in the network-status document.  Examples include:
      - Publishing hashes of microdescriptors instead of hashes of
        full descriptors (Proposal 158).
      - Including different digests of descriptors, instead of the
        perhaps-soon-to-be-totally-broken SHA1.

   Note that in both cases, from the client's point of view, this
   information _replaces_ older information.  If we're using a
   SHA256 hash, we don't need to see the SHA1.  If clients only want
   microdescriptors, they don't (necessarily) need to see hashes of
   other things.

   Our past approach to cases like this has been to shovel all of
   the data into the consensus document.  But this is rather poor
   for bandwidth.  Adding a single SHA256 hash to a consensus for
   each router increases the compressed consensus size by 47%.  In
   comparison, replacing a single SHA1 hash with a SHA256 hash for
   each listed router increases the consensus size by only 18%.

Design in brief:

   Let the voting process remain as it is, until a consensus is
   generated.  With future versions of the voting algorithm, instead
   of just a single consensus being generated, multiple consensus
   "flavors" are produced.

   Consensuses (all of them) include a list of which flavors are
   being generated.  Caches fetch and serve all flavors of consensus
   that are listed, regardless of whether they can parse or validate
   them, and serve them to clients.  Thus, once this design is in
   place, we won't need to deploy more cache changes in order to get
   new flavors of consensus to be cached.

   Clients download only the consensus flavor they want.

A note on hashes:

   Everything in this document is specified to use SHA256, and to be
   upgradeable to use better hashes in the future.

Spec modifications:

   1. URLs and changes to the current consensus format.

   Every consensus flavor has a name consisting of a sequence of one
   or more alphanumeric characters and dashes.  For compatibility, the
   current descriptor flavor is called "ns".

   The supported consensus flavors are defined as part of the
   authorities' consensus method.

   For each supported flavor, every authority calculates another
   consensus document of as-yet-unspecified format, and exchanges
   detached signatures for these documents as in the current consensus
   design.

   In addition to the consensus currently served at
   /tor/status-vote/(current|next)/consensus.z  and
   /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z ,
   authorities serve another consensus of each flavor "F" from the
   locations /tor/status-vote/(current|next)/consensus-F.z and
   /tor/status-vote/(current|next)/consensus-F/<FP1>+....z.

   When caches serve these documents, they do so from the same
   locations.
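
   The URL scheme above can be summarized with a small helper. This is
   an illustrative sketch; the function name and its parameters are
   hypothetical, and passing no flavor is assumed to mean the existing
   unflavored consensus URL.

```python
def consensus_url(flavor=None, when="current", fingerprints=()):
    """Build the path for a consensus, optionally flavored.

    `fingerprints` is the optional list of authority fingerprints that
    is joined with "+" in the longer URL form.
    """
    if when not in ("current", "next"):
        raise ValueError("when must be 'current' or 'next'")
    name = "consensus" if flavor is None else "consensus-" + flavor
    path = "/tor/status-vote/%s/%s" % (when, name)
    if fingerprints:
        path += "/" + "+".join(fingerprints)
    return path + ".z"
```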

   2. Document format: generic consensus.

   The format of a flavored consensus is as-yet-unspecified, except
   that the first line is:
      "network-status-version" SP version SP flavor NL

   where version is 3 or higher, and the flavor is a string
   consisting of alphanumeric characters and dashes, matching the
   corresponding flavor listed in the unflavored consensus.
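
   A check of this first line might look like the sketch below. The
   function name is hypothetical, and returning None for anything that
   does not match (including an unflavored first line) is a choice
   made for this example only.

```python
import re

# "network-status-version" SP version SP flavor NL
_FIRST_LINE = re.compile(r"network-status-version ([0-9]+) ([A-Za-z0-9-]+)\n?")


def parse_flavored_first_line(line):
    """Return (version, flavor), or None if the line is not acceptable."""
    m = _FIRST_LINE.fullmatch(line)
    if m is None:
        return None
    version = int(m.group(1))
    if version < 3:
        # The version must be 3 or higher.
        return None
    return version, m.group(2)
```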

   3. Document format: detached signatures.

   We amend the detached signature format to include more than one
   consensus-digest line, and more than one set of signatures.

   After the consensus-digest line, we allow more lines of the form:
      "additional-digest" SP flavor SP algname SP digest NL

   Before the directory-signature lines, we allow more entries of the form:
      "additional-signature" SP flavor SP algname SP identity SP
           signing-key-digest NL signature.

   [We do not use "consensus-digest" or "directory-signature" for flavored
   consensuses, since this could confuse older Tors.]

   The consensus-signatures URL should contain the signatures
   for _all_ flavors of consensus.

   4. The consensus index:

   Authorities additionally generate and serve a consensus-index
   document.  Its format is:

       Header ValidAfter ValidUntil Documents Signatures

       Header = "consensus-index" SP version NL
       ValidAfter = as in a consensus
       ValidUntil = as in a consensus
       Documents = Document*
       Document = "document" SP flavor SP SignedLength
                                    1*(SP AlgorithmName "=" Digest) NL
       Signatures = Signature*
       Signature = "directory-signature" SP algname SP identity
                           SP signing-key-digest NL signature

    There must be one Document line for each generated consensus flavor.
    Each Document line describes the length of the signed portion of
    a consensus (the signatures themselves are not included), along
    with one or more digests of that signed portion.  Digests are
    given in hex.  The algorithm "sha256" MUST be included; others
    are allowed.

    The algname part of a signature describes what algorithm was
    used to hash the identity and signing keys, and to compute the
    signature.  The algorithm "sha256" MUST be recognized;
    signatures with unrecognized algorithms MUST be ignored.
    (See below).

    The consensus index is made available at
       /tor/status-vote/(current|next)/consensus-index.z.

    Caches should fetch this document so they can check the
    correctness of the different consensus documents they fetch.
    They do not need to check anything about an unrecognized
    consensus document beyond its digest and length.
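
    The cache-side check described above (digest and length only) can
    be sketched as follows. The function names are hypothetical, and
    only the mandatory "sha256" digest is checked; verifying the
    signatures on the index itself is out of scope for this sketch.

```python
import hashlib


def parse_document_line(line):
    """Parse: "document" SP flavor SP SignedLength 1*(SP Alg "=" Digest) NL"""
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 4 or parts[0] != "document":
        raise ValueError("not a document line")
    flavor, length = parts[1], int(parts[2])
    digests = dict(p.split("=", 1) for p in parts[3:])
    if "sha256" not in digests:
        raise ValueError("the sha256 digest MUST be included")
    return flavor, length, digests


def matches_index(signed_portion, length, digests):
    """Check the signed portion of a consensus against its index entry."""
    if len(signed_portion) != length:
        return False
    # Digests are given in hex.
    return hashlib.sha256(signed_portion).hexdigest() == digests["sha256"].lower()
```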

    4.1. The "sha256" signature format.

    The 'SHA256' signature format for directory objects is defined as
    the RSA signature of the OAEP+-padded SHA256 digest of the item to
    be signed.  When checking signatures, the signature MUST be treated
    as valid if the signature material begins with SHA256(document);
    this allows us to add other data later.

Considerations:

    - We should not create a new flavor of consensus when adding a
      field instead wouldn't be too onerous.

    - We should not proliferate flavors lightly: clients will be
      distinguishable based on which flavor they download.

Migration:

    - Stage one: authorities begin generating and serving
      consensus-index files.

    - Stage two: Caches begin downloading consensus-index files,
      validating them, and using them to decide what flavors of
      consensus documents to cache.  They download all listed
      documents, and compare them to the digests given in the
      consensus.

    - Stage three: Once we want to make a significant change to the
      consensus format, we deploy another flavor of consensus at the
      authorities.  This will immediately start getting cached by the
      caches, and clients can start fetching the new flavor without
      waiting a version or two for enough caches to begin supporting
      it.

Acknowledgements:

    Aspects of this design and its applications to hash migration were
    heavily influenced by IRC conversations with Marian.

Filename: 163-detecting-clients.txt
Title: Detecting whether a connection comes from a client
Author: Nick Mathewson
Created: 22-May-2009
Target: 0.2.2
Status: Superseded

[Note: Actually, this is partially done, partially superseded
       -nickm, 9 May 2011]


Overview:

   Some aspects of Tor's design require relays to distinguish
   connections that come from clients from those that come from
   relays.  The existing means for doing this is easy to spoof.  We
   propose a better approach.

Motivation:

   There are at least two reasons for which Tor servers want to tell
   which connections come from clients and which come from other
   servers:

     1) Some exits, proposal 152 notwithstanding, want to disallow
        their use as single-hop proxies.
     2) Some performance-related proposals involve prioritizing
        traffic from relays, or limiting traffic per client (but not
        per relay).

   Right now, we detect client vs server status based on how the
   client opens circuits.  (Check out the code that implements the
   AllowSingleHopExits option if you want all the details.)  This
   method is depressingly easy to fake, though.  This document
   proposes better means.

Goals:

   To make grabbing relay privileges at least as difficult as just
   running a relay.

   In the analysis below, "using server privileges" means taking any
   action that only servers are supposed to do, like delivering a
   BEGIN cell to an exit node that doesn't allow single hop exits,
   or claiming server-like amounts of bandwidth.

Passive detection:

   A connection is definitely a client connection if it takes one of
   the TLS methods during setup that does not establish an identity
   key.

   A circuit is definitely a client circuit if it is initiated with
   a CREATE_FAST cell, though the node could be a client or a server.

   A node that's listed in a recent consensus is probably a server.

   A node to which we have successfully extended circuits from
   multiple origins is probably a server.

Active detection:

   If a node doesn't try to use server privileges at all, we never
   need to care whether it's a server.

   When a node or circuit tries to use server privileges, if it is
   "definitely a client" as per above, we can refuse it immediately.

   If it's "probably a server" as per above, we can accept it.

   Otherwise, we have either a client, or a server that is neither
   listed in any consensus nor used by any other clients -- in other
   words, a new or private server.

   For these servers, we should attempt to build one or more test
   circuits through them.  If enough of the circuits succeed, the
   node is a real relay.  If not, it is probably a client.

   While we are waiting for the test circuits to succeed, we should
   allow a short grace period in which server privileges are
   permitted.  When a test is done, we should remember its outcome
   for a while, so we don't need to do it again.
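
   The decision made when a node or circuit first tries to use server
   privileges can be sketched as below. All names are hypothetical,
   the four boolean inputs stand in for the passive signals listed
   earlier, and the grace period and the caching of test outcomes are
   left out.

```python
from enum import Enum


class Verdict(Enum):
    REFUSE = "refuse"   # definitely a client
    ACCEPT = "accept"   # probably a server
    TEST = "test"       # unknown: build test circuits through it


def classify(has_identity_key, used_create_fast,
             in_recent_consensus, extended_from_multiple_origins):
    """Decide how to treat an attempt to use server privileges."""
    # "Definitely a client": no identity key was established during the
    # TLS setup, or the circuit was initiated with CREATE_FAST.
    if not has_identity_key or used_create_fast:
        return Verdict.REFUSE
    # "Probably a server": listed in a recent consensus, or we have
    # extended circuits to it from multiple origins.
    if in_recent_consensus or extended_from_multiple_origins:
        return Verdict.ACCEPT
    # Otherwise this is a client, or a new or private server.
    return Verdict.TEST
```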

Why it's hard to do good testing:

   Doing a test circuit starting with an unlisted router requires
   only that we have an open connection for it.  Doing a test
   circuit starting elsewhere _through_ an unlisted router--though
   more reliable-- would require that we have a known address, port,
   identity key, and onion key for the router.  Only the address and
   identity key are easily available via the current Tor protocol in
   all cases.

   We could fix this part by requiring that all servers support
   BEGIN_DIR and support downloading at least a current descriptor
   for themselves.

Open questions:

   What are the thresholds for the needed numbers of circuits
   for us to decide that a node is a relay?

      [Suggested answer: two circuits from two distinct hosts.]

   How do we pick grace periods?  How long do we remember the
   outcome of a test?

      [Suggested answer: 10 minute grace period; 48 hour memory of
      test outcomes.]

   If we can build circuits starting at a suspect node, but we don't
   have enough information to try extending circuits elsewhere
   through the node, should we conclude that the node is
   "server-like" or not?

      [Suggested answer: for now, just try making circuits through
      the node.  Extend this to extending circuits as needed.]

Filename: 164-reporting-server-status.txt
Title: Reporting the status of server votes
Author: Nick Mathewson
Created: 22-May-2009
Status: Obsolete

Notes: This doesn't work with the current things authorities do,
 though we could revise it to work if we ever want to do this.

Overview:

   When a given node isn't listed in the directory, it isn't always easy
   to tell why.  This proposal suggests a quick-and-dirty way for
   authorities to export not only how they voted, but why, and a way to
   collate the information.

Motivation:

   Right now, if you want to know the reason why your server was listed
   a certain way in the Tor directory, the following steps are
   recommended:

       - Look through your log for reports of what the authority said
         when you tried to upload.

       - Look at the consensus; see if you're listed.

       - Wait a while, see if things get better.

       - Download the votes from all the authorities, and see how they
         voted.  Try to figure out why.

       - If you think they'll listen to you, ask some authority
         operators to look you up in their mtbf files and logs to see
         why they voted as they did.

   This is far too hard.

Solution:

   We should add a new vote-like information-only document that
   authorities serve on request.  Call it a "vote info".  It is
   generated at the same time as a vote, but used only for
   determining why a server voted as it did.  It is served from
   /tor/status-vote-info/current/authority[.z]

   It differs from a vote in that:

   * Its vote-status field is 'vote-info'.

   * It includes routers that the authority would not include
     in its vote.

     For these, it includes an "omitted" line with an English
     message explaining why they were omitted.

   * For each router, it includes a line describing its WFU and
     MTBF.  The format is:

       "stability <mtbf> up-since='date'"
       "uptime <wfu> down-since='date'"

   * It describes the WFU and MTBF thresholds it requires to
     vote for a given router in various roles in the header.
     The format is:

       "flag-requirement <flag-name> <field> <op> <value>"

     e.g.

       "flag-requirement Guard uptime > 80"

   * It includes info on routers all of whose descriptors were
     uploaded but rejected over the past few hours.  The
     "r" lines for these are the same as for regular routers.
     The other lines are omitted for these routers, and are
     replaced with a single "rejected" line, explaining (in
     English) why the router was rejected.


   A status site (like Torweather or Torstatus or another
   tool) can poll these files when they are generated, collate
   the data, and make it available to server operators.

Risks:

   This document makes no provisions for caching these "vote
   info" documents.  If many people wind up fetching them
   aggressively from the authorities, that would be bad.



Filename: 165-simple-robust-voting.txt
Title: Easy migration for voting authority sets
Author: Nick Mathewson
Created: 2009-05-28
Status: Rejected


Status: rejected as too complex.

Overview:

  This proposal describes an easy-to-implement, easy-to-verify way to
  change the set of authorities without creating a "flag day" situation.

Motivation:

  From proposal 134 ("More robust consensus voting with diverse
  authority sets") by Peter Palfrader:

      Right now there are about five authoritative directory servers
      in the Tor network, tho this number is expected to rise to about
      15 eventually.

      Adding a new authority requires synchronized action from all
      operators of directory authorities so that at any time during the
      update at least half of all authorities are running and agree on
      who is an authority.  The latter requirement is there so that the
      authorities can arrive at a common consensus: Each authority
      builds the consensus based on the votes from all authorities it
      recognizes, and so a different set of recognized authorities will
      lead to a different consensus document.

  In response to this problem, proposal 134 suggested that every
  candidate authority list in its vote whom it believes to be an
  authority.  These A-says-B-is-an-authority relationships form a
  directed graph.  Each authority then iteratively finds the largest
  clique in the graph and removes it, until it finds one containing
  itself.  It votes with this clique.

  Proposal 134 had some problems:

    - It had a security problem in that M hostile authorities in a
      clique could effectively kick out M-1 honest authorities.  This
      could enable a minority of the original authorities to take over.

    - It was too complex in its implications to analyze well: it took us
      over a year to realize that it was insecure.

    - It tried to solve a bigger problem: general fragmentation of
      authority trust.  Really, all we wanted to have was the ability to
      add and remove authorities without forcing a flag day.

Proposed protocol design:

   A "Voting Set" is a set of authorities.  Each authority has a list of
   the voting sets it considers acceptable.  These sets are chosen
   manually by the authority operators. They must always contain the
   authority itself.  Each authority lists all of these voting sets in
   its votes.

   Authorities exchange votes with every other authority in any of their
   voting sets.

   When it is time to calculate a consensus, an authority picks votes from
   whichever voting set it lists that is listed by the most members of
   that set.  In other words, given two sets S1 and S2 that an authority
   lists, that authority will prefer to vote with S1 over S2 whenever
   the number of other authorities in S1 that themselves list S1 is
   higher than the number of other authorities in S2 that themselves
   list S2.

   For example, suppose authority A recognizes two sets, "A B C D" and
   "A E F G H".  Suppose that the first set is recognized by all of A,
   B, C, and D, whereas the second set is recognized only by A, E, and
   F.  Because the first set is recognized by more of the authorities in
   it than the other one, A will vote with the first set.

   Ties are broken in favor of some arbitrary function of the identity
   keys of the authorities in the set.
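
   The selection rule can be sketched as follows, using the example
   above as a check. The function name and data shapes are
   hypothetical. The tie-break shown (comparing the sorted member
   identifiers) is only one possible "arbitrary function" and is an
   assumption of this sketch.

```python
def choose_voting_set(me, my_sets, listed_by):
    """Pick the voting set that `me` should vote with.

    `my_sets` is the list of frozensets that `me` lists; `listed_by`
    maps each other authority to the collection of frozensets that it
    lists in its own vote.
    """
    def support(voting_set):
        # Count the other members of the set that themselves list it.
        return sum(
            1 for auth in voting_set
            if auth != me and voting_set in listed_by.get(auth, ())
        )

    # Prefer the most-supported set; break ties deterministically.
    return max(my_sets, key=lambda s: (support(s), tuple(sorted(s))))
```

   With the sets from the example, the first set has three other
   supporters (B, C, D) and the second has two (E, F), so A picks the
   first.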

How to migrate authority sets:

   In steady state, each authority operator should list only the current
   actual voting set as accepted.

   When we want to add an authority, each authority operator configures
   his or her server to list two voting sets: one containing all the old
   authorities, and one containing the old authorities and the new
   authority too.  Once all authorities are listing the new set of
   authorities, they will start voting with that set because of its
   size.

   What if one or two authority operators are slow to list the new set?
   Then the other operators can stop listing the old set once there are
   enough authorities listing the new set to make its voting successful.
   (Note that these authorities not listing the new set will still have
   their votes counted, since they themselves will be members of the new
   set.  They will only fail to sign the consensus generated by the
   other authorities who are using the new set.)

   When we want to remove an authority, the operators list two voting
   sets: one containing all the authorities, and one omitting the
   authority we want to remove.  Once enough authorities list the new
   set as acceptable, we start having authority operators stop listing
   the old set.  Once there are more listing the new set than the old
   set, the new set will win.

Data format changes:

   Add a new 'voting-set' line to the vote document format.  Allow it to
   occur any number of times.  Its format is:

      voting-set SP 'fingerprint' SP 'fingerprint' ... NL

   where each fingerprint is the hex fingerprint of an identity key of
   an authority.  Sort fingerprints in ascending order.
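
   Producing such a line is straightforward; a sketch follows. The
   function name is hypothetical, and normalizing fingerprints to
   upper-case hex before sorting is an assumption, since the text
   above does not specify a case.

```python
def format_voting_set(fingerprints):
    """Format one voting-set line with fingerprints in ascending order."""
    ordered = sorted(fp.upper() for fp in fingerprints)
    return "voting-set " + " ".join(ordered) + "\n"
```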

   When the consensus method is at least 'X' (decide this when we
   implement the proposal), add this line to the consensus format as
   well, before the first dir-source line.  [This information is not
   redundant with the dir-source sections in the consensus: If an
   authority is recognized but didn't vote, that authority will appear in
   the voting-set line but not in the dir-source sections.]

   We don't need to list other information about authorities in our
   vote.

Migration issues:

   We should keep track somewhere of which Tor client versions
   recognized which authorities.

Acknowledgments:

   The design came out of an IRC conversation with Peter Palfrader.  He
   had the basic idea first.

Filename: 166-statistics-extra-info-docs.txt
Title: Including Network Statistics in Extra-Info Documents
Author: Karsten Loesing
Created: 21-Jul-2009
Target: 0.2.2
Status: Closed

Change history:

  21-Jul-2009  Initial proposal for or-dev


Overview:

  The Tor network has grown to almost two thousand relays and millions
  of casual users over the past few years. With growth has come
  increasing performance problems and attempts by some countries to
  block access to the Tor network. In order to address these problems,
  we need to learn more about the Tor network. This proposal suggests
  measuring additional statistics and including them in extra-info
  documents to help us understand the Tor network better.


Introduction:

  As of May 2009, relays, bridges, and directories gather the following
  data for statistical purposes:

  - Relays and bridges count the number of bytes that they have pushed
    in 15-minute intervals over the past 24 hours. Relays and bridges
    include these data in extra-info documents that they send to the
    directory authorities whenever they publish their server descriptor.

  - Bridges further include a rough number of clients per country that
    they have seen in the past 48 hours in their extra-info documents.

  - Directories can be configured to count the number of clients they
    see per country in the past 24 hours and to write them to a local
    file.

  Since then, we have extended the network statistics in Tor. The new
  statistics include:

  - Directories now gather more precise statistics about connecting
    clients. Fixes include measuring in intervals of exactly 24 hours,
    counting unsuccessful requests, measuring download times, etc. The
    directories append their statistics to a local file every 24 hours.

  - Entry guards count the number of clients per country per day like
    bridges do and write them to a local file every 24 hours.

  - Relays measure statistics of the number of cells in their circuit
    queues and how much time these cells spend waiting there. Relays
    write these statistics to a local file every 24 hours.

  - Exit nodes count the number of read and written bytes on exit
    connections per port as well as the number of opened exit streams
    per port in 24-hour intervals. Exit nodes write their statistics to
    a local file.

  The following four sections contain descriptions for adding these
  statistics to the relays' extra-info documents.


Directory request statistics:

  The first type of statistics aims at measuring directory requests sent
  by clients to a directory mirror or directory authority. More
  precisely, these statistics aim at requests for v2 and v3 network
  statuses only. These directory requests are sent non-anonymously,
  either via HTTP-like requests to a directory's Dir port or tunneled
  over a 1-hop circuit.

  Measuring directory request statistics is useful for several reasons:
  First, the number of locally seen directory requests can be used to
  estimate the total number of clients in the Tor network. Second, the
  country-wise classification of requests using a GeoIP database can
  help count the relative and absolute number of users per country.
  Third, the download times can give hints on the available bandwidth
  capacity at clients.

  Directory requests do not give any hints on the contents that clients
  send or receive over the Tor network. Every client requests network
  statuses from the directories, so there are no anonymity-related
  concerns with gathering these statistics. It might be, though, that
  clients wish to hide the fact that they are connecting to the Tor
  network.
  Therefore, IP addresses are resolved to country codes in memory,
  events are accumulated over 24 hours, and numbers are rounded up to
  multiples of 4 or 8.

   "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
      [At most once.]

      YYYY-MM-DD HH:MM:SS defines the end of the included measurement
      interval of length NSEC seconds (86400 seconds by default).

      A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
      is only added when the relay has opened its Dir port and after 24
      hours of measuring directory requests.
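
  A minimal sketch of how a consumer of extra-info documents might read
  a "*-stats-end" line and recover the measurement interval. The
  function name and the regular expression are illustrative assumptions,
  not part of this proposal; the timestamp is treated as a naive
  datetime.

```python
import re
from datetime import datetime, timedelta

# Matches e.g. "dirreq-stats-end 2009-07-21 12:00:00 (86400 s)".
STATS_END_RE = re.compile(
    r"^(?P<keyword>[a-z]+-stats-end) "
    r"(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<nsec>\d+) s\)$"
)

def parse_stats_end(line):
    """Return (keyword, interval_start, interval_end) for a *-stats-end line.

    Hypothetical helper: the proposal defines only the line format.
    """
    match = STATS_END_RE.match(line.strip())
    if match is None:
        raise ValueError("not a *-stats-end line: %r" % line)
    end = datetime.strptime(match.group("end"), "%Y-%m-%d %H:%M:%S")
    length = timedelta(seconds=int(match.group("nsec")))
    # The line gives the END of the interval; the start is derived.
    return match.group("keyword"), end - length, end
```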

   "dirreq-v2-ips" CC=N,CC=N,... NL
      [At most once.]
   "dirreq-v3-ips" CC=N,CC=N,... NL
      [At most once.]

      List of mappings from two-letter country codes to the number of
      unique IP addresses that have connected from that country to
      request a v2/v3 network status, rounded up to the nearest multiple
      of 8. Only those IP addresses are counted that the directory can
      answer with a 200 OK status code.

   "dirreq-v2-reqs" CC=N,CC=N,... NL
      [At most once.]
   "dirreq-v3-reqs" CC=N,CC=N,... NL
      [At most once.]

      List of mappings from two-letter country codes to the number of
      requests for v2/v3 network statuses from that country, rounded up
      to the nearest multiple of 8. Only those requests are counted that
      the directory can answer with a 200 OK status code.
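
  The rounding and the CC=N list format can be sketched as follows. The
  helper names are hypothetical. Sorting by country code and omitting
  countries with a zero count are assumptions of this sketch; the
  proposal does not specify either.

```python
def round_up(n, multiple):
    """Round n up to the next multiple of `multiple` (n if already one)."""
    return -(-n // multiple) * multiple

def format_country_line(keyword, counts, multiple=8):
    """Build e.g. 'dirreq-v3-ips cn=8,de=8,us=16' from raw per-country counts.

    `counts` maps two-letter country codes to exact counts held in memory;
    only the rounded values leave the relay.
    """
    pairs = ",".join(
        "%s=%d" % (cc, round_up(n, multiple))
        for cc, n in sorted(counts.items())  # ordering is an assumption
        if n > 0
    )
    return "%s %s" % (keyword, pairs)

print(format_country_line("dirreq-v3-ips", {"us": 13, "de": 8, "cn": 1}))
```

  Rounding up rather than to the nearest multiple means that a country
  with a single client is reported as 8, so small counts are never
  published exactly.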

   "dirreq-v2-share" num% NL
      [At most once.]
   "dirreq-v3-share" num% NL
      [At most once.]

      The share of v2/v3 network status requests that the directory
      expects to receive from clients based on its advertised bandwidth
      compared to the overall network bandwidth capacity. Shares are
      formatted in percent with two decimal places. Shares are
      calculated as means over the whole 24-hour interval.

   "dirreq-v2-resp" status=num,... NL
      [At most once.]
   "dirreq-v3-resp" status=num,... NL
      [At most once.]

      List of mappings from response statuses to the number of requests
      for v2/v3 network statuses that were answered with that response
      status, rounded up to the nearest multiple of 4. Only response
      statuses with at least 1 response are reported. New response
      statuses can be added at any time. The current list of response
      statuses is as follows:

      "ok": a network status request is answered; this number
         corresponds to the sum of all requests as reported in
         "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
         rounding up.
      "not-enough-sigs": a version 3 network status is not signed by a
         sufficient number of requested authorities.
      "unavailable": a requested network status object is unavailable.
      "not-found": a requested network status is not found.
      "not-modified": a network status has not been modified since the
         If-Modified-Since time that is included in the request.
      "busy": the directory is busy.

   "dirreq-v2-direct-dl" key=val,... NL
      [At most once.]
   "dirreq-v3-direct-dl" key=val,... NL
      [At most once.]
   "dirreq-v2-tunneled-dl" key=val,... NL
      [At most once.]
   "dirreq-v3-tunneled-dl" key=val,... NL
      [At most once.]

      List of statistics about possible failures in the download process
      of v2/v3 network statuses. Requests are either "direct"
      HTTP-encoded requests over the relay's directory port, or
      "tunneled" requests using a BEGIN_DIR cell over the relay's OR
      port. The list of possible statistics can change, and statistics
      can be left out from reporting. The current list of statistics is
      as follows:

      Successful downloads and failures:

      "complete": a client has finished the download successfully.
      "timeout": a download did not finish within 10 minutes after
         starting to send the response.
      "running": a download is still running at the end of the
         measurement period, having started to send the response less
         than 10 minutes earlier.

      Download times:

      "min", "max": smallest and largest measured bandwidth in B/s.
      "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
         bandwidth in B/s. For a given decile i, i/10 of all downloads
         had a smaller bandwidth than di, and (10-i)/10 of all downloads
         had a larger bandwidth than di.
      "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
         fourth of all downloads had a smaller bandwidth than q1, one
         fourth of all downloads had a larger bandwidth than q3, and the
         remaining half of all downloads had a bandwidth between q1 and
         q3.
      "md": median of measured bandwidth in B/s. Half of the downloads
         had a smaller bandwidth than md, the other half had a larger
         bandwidth than md.
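
  A sketch of how the bandwidth percentiles above could be computed from
  a list of per-download bandwidths. The proposal does not say which
  quantile method to use; this sketch assumes the nearest-rank method,
  and the function names are hypothetical.

```python
def nearest_rank(sorted_values, num, den):
    """Nearest-rank quantile: the value at rank ceil(num/den * n).

    Integer arithmetic avoids floating-point surprises in the rank.
    """
    rank = max(1, -(-num * len(sorted_values) // den))
    return sorted_values[rank - 1]

def download_stats(bandwidths):
    """Map the keys min, max, d1-d4, d6-d9, q1, q3, md to values in B/s."""
    values = sorted(bandwidths)
    if not values:
        return {}
    stats = {"min": values[0], "max": values[-1]}
    for i in (1, 2, 3, 4, 6, 7, 8, 9):   # d5 is reported as "md" instead
        stats["d%d" % i] = nearest_rank(values, i, 10)
    stats["q1"] = nearest_rank(values, 1, 4)
    stats["md"] = nearest_rank(values, 1, 2)
    stats["q3"] = nearest_rank(values, 3, 4)
    return stats
```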


Entry guard statistics:

  Entry guard statistics include the number of clients per country and
  per day that are connecting directly to an entry guard.

  Entry guard statistics are important to learn more about the
  distribution of clients to countries. In the future, this knowledge
  can be useful to detect if there are or start to be any restrictions
  for clients connecting from specific countries.

  Information about which clients connect to a given entry guard is very
  sensitive. It must not be combined with information about what content
  is leaving the network at the exit nodes. Therefore,
  entry guard statistics need to be aggregated to prevent them from
  becoming useful for de-anonymization. Aggregation includes resolving
  IP addresses to country codes, counting events over 24-hour intervals,
  and rounding up numbers to the next multiple of 8.

   "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
      [At most once.]

      YYYY-MM-DD HH:MM:SS defines the end of the included measurement
      interval of length NSEC seconds (86400 seconds by default).

      An "entry-stats-end" line, as well as any other "entry-*"
      line, is first added after the relay has been running for at least
      24 hours.

   "entry-ips" CC=N,CC=N,... NL
      [At most once.]

      List of mappings from two-letter country codes to the number of
      unique IP addresses that have connected from that country to the
      relay and that are not known to be other relays, rounded up to the
      nearest multiple of 8.
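
  The aggregation steps for "entry-ips" can be sketched as below. The
  names entry_ips_line, known_relay_ips, and country_of are
  hypothetical; country_of stands in for a GeoIP lookup, and the sorted
  output order is an assumption.

```python
def entry_ips_line(connections, known_relay_ips, country_of):
    """Aggregate connecting IP addresses into an 'entry-ips' line.

    connections:     iterable of IP address strings seen by the relay
    known_relay_ips: set of addresses known to belong to other relays
    country_of:      callable mapping an IP address to a country code
    """
    seen = set()
    per_country = {}
    for ip in connections:
        if ip in known_relay_ips or ip in seen:
            continue  # skip other relays and already-counted clients
        seen.add(ip)
        cc = country_of(ip)
        per_country[cc] = per_country.get(cc, 0) + 1
    # Round every count up to the next multiple of 8 before reporting.
    pairs = ",".join(
        "%s=%d" % (cc, -(-n // 8) * 8) for cc, n in sorted(per_country.items())
    )
    return "entry-ips " + pairs
```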


Cell statistics:

  The third type of statistics has to do with the time that cells spend
  in circuit queues. In order to gather these statistics, the relay
  memorizes when it puts a given cell in a circuit queue and when this
  cell is flushed. The relay further notes the lifetime of the circuit.
  These data are sufficient to determine the mean number of cells in a
  queue over time and the mean time that cells spend in a queue.

  Cell statistics are necessary to learn more about possible reasons for
  the poor network performance of the Tor network, especially high
  latencies. The same statistics are also useful to determine the
  effects of design changes by comparing today's data with future data.

  There are basically no privacy concerns from measuring cell
  statistics, regardless of a node being an entry, middle, or exit node.

   "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
      [At most once.]

      YYYY-MM-DD HH:MM:SS defines the end of the included measurement
      interval of length NSEC seconds (86400 seconds by default).

      A "cell-stats-end" line, as well as any other "cell-*" line,
      is first added after the relay has been running for at least 24
      hours.

   "cell-processed-cells" num,...,num NL
      [At most once.]

      Mean number of processed cells per circuit, subdivided into
      deciles of circuits by the number of cells they have processed in
      descending order from loudest to quietest circuits.

   "cell-queued-cells" num,...,num NL
      [At most once.]

      Mean number of cells contained in queues by circuit decile. These
      means are calculated by 1) determining the mean number of cells in
      a single circuit between its creation and its termination and 2)
      calculating the mean for all circuits in a given decile as
      determined in "cell-processed-cells". Numbers have a precision of
      two decimal places.

   "cell-time-in-queue" num,...,num NL
      [At most once.]

      Mean time cells spend in circuit queues in milliseconds. Times are
      calculated by 1) determining the mean time cells spend in the
      queue of a single circuit and 2) calculating the mean for all
      circuits in a given decile as determined in
      "cell-processed-cells".

   "cell-circuits-per-decile" num NL
      [At most once.]

      Mean number of circuits that are included in any of the deciles,
      rounded up to the next integer.
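
  A sketch of the decile computation described above, assuming each
  circuit has already been reduced to three numbers: cells processed,
  mean cells queued, and mean time in queue. The function name and the
  input layout are hypothetical. As a simplification, the sketch puts
  ceil(n/10) circuits in each group, so it yields fewer than ten groups
  when there are few circuits.

```python
def mean(values):
    return sum(values) / len(values)

def cell_stats(circuits):
    """circuits: list of (processed_cells, mean_queued_cells, mean_ms_in_queue)."""
    if not circuits:
        return None
    # Loudest circuits first, as in "cell-processed-cells".
    ordered = sorted(circuits, key=lambda c: c[0], reverse=True)
    per_decile = -(-len(ordered) // 10)  # rounded up to the next integer
    deciles = [ordered[i:i + per_decile]
               for i in range(0, len(ordered), per_decile)]
    return {
        "cell-processed-cells": [mean([c[0] for c in d]) for d in deciles],
        "cell-queued-cells": [round(mean([c[1] for c in d]), 2) for d in deciles],
        "cell-time-in-queue": [mean([c[2] for c in d]) for d in deciles],
        "cell-circuits-per-decile": per_decile,
    }
```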


Exit statistics:

  The last type of statistics concerns exit nodes, which count the
  number of bytes written and read and the number of streams opened per
  port in each 24-hour interval. Exit port statistics can be measured by
  looking at
  headers of BEGIN and DATA cells. A BEGIN cell contains the exit port
  that is required for the exit node to open a new exit stream.
  Subsequent DATA cells coming from the client or being sent back to the
  client contain a length field stating how many bytes of application
  data are contained in the cell.

  Exit port statistics are important to measure in order to identify
  possible load-balancing problems with respect to exit policies. Exit
  nodes that permit more ports than others are very likely overloaded
  with traffic for those ports plus traffic for other ports. Improving
  load balancing in the Tor network improves the overall utilization of
  bandwidth capacity.

  Exit traffic is one of the most sensitive parts of network data in the
  Tor network. Even though these statistics do not require looking at
  traffic contents, statistics are aggregated so that they are not
  useful for de-anonymizing users. Only those ports are reported that
  have seen at least 0.1% of exiting or incoming bytes, numbers of bytes
  are rounded up to full kibibytes (KiB), and stream numbers are rounded
  up to the next multiple of 4.

   "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
      [At most once.]

      YYYY-MM-DD HH:MM:SS defines the end of the included measurement
      interval of length NSEC seconds (86400 seconds by default).

      An "exit-stats-end" line, as well as any other "exit-*" line, is
      first added after the relay has been running for at least 24 hours
      and only if the relay permits exiting (where exiting to a single
      port and IP address is sufficient).

   "exit-kibibytes-written" port=N,port=N,... NL
      [At most once.]
   "exit-kibibytes-read" port=N,port=N,... NL
      [At most once.]

      List of mappings from ports to the number of kibibytes that the
      relay has written to or read from exit connections to that port,
      rounded up to the next full kibibyte.

   "exit-streams-opened" port=N,port=N,... NL
      [At most once.]

      List of mappings from ports to the number of opened exit streams
      to that port, rounded up to the nearest multiple of 4.
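
  The exit aggregation rules can be sketched as follows. This sketch
  reads the 0.1% threshold as "at least 0.1% of written bytes or at
  least 0.1% of read bytes", which is one interpretation of the text
  above. The function name and the ascending port order are assumptions.

```python
def round_up(n, multiple):
    return -(-n // multiple) * multiple

def exit_stats_lines(written, read, streams):
    """written/read map port -> bytes; streams maps port -> opened streams."""
    total_written = sum(written.values())
    total_read = sum(read.values())

    def significant(port):
        w, r = written.get(port, 0), read.get(port, 0)
        # 0.1% threshold, checked with integers: x >= total / 1000.
        return ((w > 0 and w * 1000 >= total_written) or
                (r > 0 and r * 1000 >= total_read))

    ports = sorted(p for p in set(written) | set(read) if significant(p))

    def line(keyword, values):
        return keyword + " " + ",".join("%d=%d" % (p, values[p]) for p in ports)

    # Bytes are rounded up to full KiB, streams to the next multiple of 4.
    kib_written = {p: round_up(written.get(p, 0), 1024) // 1024 for p in ports}
    kib_read = {p: round_up(read.get(p, 0), 1024) // 1024 for p in ports}
    opened = {p: round_up(streams.get(p, 0), 4) for p in ports}
    return [
        line("exit-kibibytes-written", kib_written),
        line("exit-kibibytes-read", kib_read),
        line("exit-streams-opened", opened),
    ]
```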


Implementation notes:

  Right now, relays that are configured accordingly write similar
  statistics to those described in this proposal to disk every 24 hours.
  With this proposal being implemented, relays include the contents of
  these files in extra-info documents.

  The following steps are necessary to implement this proposal:

  1. The current format of [dirreq|entry|buffer|exit]-stats files needs
     to be adapted to the description in this proposal. This step
     basically means renaming keywords.

  2. The timing of writing the four *-stats files should be unified, so
     that they are written exactly 24 hours after starting the
     relay. Right now, the measurement intervals for dirreq, entry, and
     exit stats start with the first observed request, and files are
     written when observing the first request that occurs more than 24
     hours after the beginning of the measurement interval. With this
     proposal, the measurement intervals should all start at the same
     time, and files should be written exactly 24 hours later.

  3. It is advantageous to cache statistics in local files in the data
     directory until they are included in extra-info documents. The
     reason is that the 24-hour measurement interval can be very
     different from the 18-hour publication interval of extra-info
     documents. When a relay crashes after finishing a measurement
     interval, but before publishing the next extra-info document,
     statistics would get lost. Therefore, statistics are written to
     disk when finishing a measurement interval and read from disk when
     generating an extra-info document. Only the statistics that were
     appended to the *-stats files within the past 24 hours are included
     in extra-info documents. Further, the contents of the *-stats files
     need to be checked in the process of generating extra-info documents.

  4. With the statistics patches being tested, the ./configure options
     should be removed and the statistics code be compiled by default.
     It is still required for relay operators to add configuration
     options (DirReqStatistics, ExitPortStatistics, etc.) to enable
     gathering statistics. However, in the near future, statistics shall
     be gathered by all relays by default, where requiring a
     ./configure option would be a barrier for many relay operators.
Filename: 167-params-in-consensus.txt
Title: Vote on network parameters in consensus
Author: Roger Dingledine
Created: 18-Aug-2009
Status: Closed
Implemented-In: 0.2.2

0. History


1. Overview

  Several of our new performance plans involve guessing how to tune
  clients and relays, yet we won't be able to learn whether we guessed
  the right tuning parameters until many people have upgraded. Instead,
  we should have directory authorities vote on the parameters, and teach
  Tors to read the currently recommended values out of the consensus.

2. Design

  V3 votes should include a new "params" line after the known-flags
  line. It contains key=value pairs, where value is an integer.

  Consensus documents that are generated with a sufficiently new consensus
  method (7?) then include a params line that includes every key listed
  in any vote, and the median value for that key (in case of ties,
  we use the median closer to zero).
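
  A minimal sketch of the median rule, assuming each vote has already
  been parsed into a dict of integer parameters. The function names are
  hypothetical, and the space-separated, key-sorted output format is an
  assumption of this sketch.

```python
def median_closer_to_zero(values):
    """Median of integers; with an even count, pick the middle value
    that is closer to zero, so the result is always a voted value."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    low, high = ordered[n // 2 - 1], ordered[n // 2]
    return low if abs(low) <= abs(high) else high

def consensus_params(votes):
    """votes: list of dicts, one per authority, mapping key -> int."""
    keys = set().union(*votes)  # every key listed in any vote
    pairs = []
    for key in sorted(keys):
        values = [vote[key] for vote in votes if key in vote]
        pairs.append("%s=%d" % (key, median_closer_to_zero(values)))
    return "params " + " ".join(pairs)
```

  Picking one of the two middle values instead of averaging them keeps
  the result an integer that some authority actually voted for.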

2.1. Planned keys.

  The first planned parameter is "circwindow=101", which is the initial
  circuit packaging window that clients and relays should use. Putting
  it in the consensus will let us perform experiments with different
  values once enough Tors have upgraded -- see proposal 168.

  Later parameters might include a weighting for how much to favor quiet
  circuits over loud circuits in our round-robin algorithm; a weighting
  for how much to prioritize relays over clients if we use an incentive
  scheme like the gold-star design; and what fraction of circuits we
  should throw out from proposal 151.

2.2. What about non-integers?

  I'm not sure how we would do median on non-integer values. Further,
  I don't have any non-integer values in mind yet. So I say we cross
  that bridge when we get to it.

Filename: 168-reduce-circwindow.txt
Title: Reduce default circuit window
Author: Roger Dingledine
Created: 12-Aug-2009
Status: Rejected


0. History


1. Overview

  We should reduce the starting circuit "package window" from 1000 to
  101. The lower package window will mean that clients will only be able
  to receive 101 cells (~50KB) on a circuit before they need to send a
  'sendme' acknowledgement cell to request 100 more.

  Starting with a lower package window on exit relays should save on
  buffer sizes (and thus memory requirements for the exit relay), and
  should save on queue sizes (and thus latency for users).

  Lowering the package window will induce an extra round-trip for every
  additional 50298 bytes of the circuit. This extra step is clearly a
  slow-down for large streams, but ultimately we hope that a) clients
  fetching smaller streams will see better response, and b) slowing
  down the large streams in this way will produce lower e2e latencies,
  so the round-trips won't be so bad.
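
  The 50298-byte figure is consistent with 101 cells that each carry 498
  bytes of stream data. The sketch below uses that figure to estimate
  the extra round-trips for a stream of a given size. It follows the
  approximation in the text above and ignores that later windows are
  100 cells rather than 101; the names are hypothetical.

```python
CELLS_PER_WINDOW = 101   # proposed initial circuit package window
RELAY_DATA_BYTES = 498   # stream bytes per relay data cell (assumed here)
WINDOW_BYTES = CELLS_PER_WINDOW * RELAY_DATA_BYTES  # 50298

def extra_round_trips(stream_bytes):
    """Rough count of sendme round-trips beyond the first window."""
    if stream_bytes <= 0:
        return 0
    windows = -(-stream_bytes // WINDOW_BYTES)  # ceiling division
    return windows - 1

for size in (50298, 1000000, 5000000):
    print(size, extra_round_trips(size))
```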

2. Motivation

  Karsten's torperf graphs show that the median download time for a 50KB
  file over Tor in mid 2009 is 7.7 seconds, whereas the median download
  time for 1MB and 5MB are around 50s and 150s respectively. The 7.7
  second figure is way too high, whereas the 50s and 150s figures are
  surprisingly low.

  The median round-trip latency appears to be around 2s, with 25% of
  the data points taking more than 5s. That's a lot of variance.

  We designed Tor originally with the goal of maximizing
  throughput. We figured that would also optimize other network properties
  like round-trip latency. Looks like we were wrong.

3. Design

  Wherever we initialize the circuit package window, initialize it to
  101 rather than 1000. Reducing it should be safe even when interacting
  with old Tors: the old Tors will receive the 101 cells and send back
  a sendme ack cell. They'll still have much higher deliver windows,
  but the rest of their deliver window will go unused.

  You can find the patch at arma/circwindow. It seems to work.

3.1. Why not 100?

  Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
  ack cell after 101 cells rather than the intended 100 cells.

  Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
  hopefully we'll have moved to some datagram protocol long before
  0.2.1.19 becomes obsolete.

3.2. What about stream packaging windows?

  Right now the stream packaging windows start at 500. The goal was to
  set the stream window to half the circuit window, to provide a crude
  load balancing between streams on the same circuit. Once we lower
  the circuit packaging window, the stream packaging window basically
  becomes redundant.

  We could leave it in -- it isn't hurting much in either case. Or we
  could take it out -- people building other Tor clients would thank us
  for that step. Alas, people building other Tor clients are going to
  have to be compatible with current Tor clients, so in practice there's
  no point taking out the stream packaging windows.

3.3. What about variable circuit windows?

  Once upon a time we imagined adapting the circuit package window to
  the network conditions. That is, we would start the window small,
  and raise it based on the latency and throughput we see.

  In theory that crude imitation of TCP's windowing system would allow
  us to adapt to fill the network better. In practice, I think we want
  to stick with the small window and never raise it. The low cap reduces
  the total throughput you can get from Tor for a given circuit. But
  that's a feature, not a bug.

4. Evaluation

  How do we know this change is actually smart? It seems intuitive that
  it's helpful, and some smart systems people have agreed that it's
  a good idea (or said another way, they were shocked at how big the
  default package window was before).

  To get a more concrete sense of the benefit, though, Karsten has been
  running torperf side-by-side on exit relays with the old package window
  vs the new one. The results are mixed currently -- it is slightly faster
  for fetching 40KB files, and slightly slower for fetching 50KB files.

  I think it's going to be tough to get a clear conclusion that this is
  a good design just by comparing one exit relay running the patch. The
  trouble is that the other hops in the circuits are still getting bogged
  down by other clients introducing too much traffic into the network.

  Ultimately, we'll want to put the circwindow parameter into the
  consensus so we can test a broader range of values once enough relays
  have upgraded.

5. Transition and deployment

  We should put the circwindow in the consensus (see proposal 167),
  with an initial value of 101. Then as more exit relays upgrade,
  clients should seamlessly get the better behavior.

  Note that upgrading the exit relay will only affect the "download"
  package window. An old client that's uploading lots of bytes will
  continue to use the old package window at the client side, and we
  can't throttle that window at the exit side without breaking protocol.

  The real question then is what we should backport to 0.2.1. Assuming
  this could be a big performance win, we can't afford to wait until
  0.2.2.x comes out before starting to see the changes here. So we have
  two options as I see them:
  a) once clients in 0.2.2.x know how to read the value out of the
  consensus, and it's been tested for a bit, backport that part to
  0.2.1.x.
  b) if it's too complex to backport, just pick a number, like 101, and
  backport that number.

  Clearly choice (a) is the better one if the consensus parsing part
  isn't very complex. Let's shoot for that, and fall back to (b) if the
  patch turns out to be so big that we reconsider.

Filename: 169-eliminating-renegotiation.txt
Title: Eliminate TLS renegotiation for the Tor connection handshake
Author: Nick Mathewson
Created: 27-Jan-2010
Status: Superseded
Target: 0.2.2
Superseded-By: 176

1. Overview

   I propose a backward-compatible change to the Tor connection
   establishment protocol to avoid the use of TLS renegotiation.

   Rather than doing a TLS renegotiation to exchange certificates
   and authenticate the original handshake, this proposal takes an
   approach similar to Steven Murdoch's proposal 124, and uses Tor
   cells to finish authenticating the parties' identities once the
   initial TLS handshake is finished.

   Terminological note: I use "client" below to mean the Tor
   instance (a client or a relay) that initiates a TLS connection,
   and "server" to mean the Tor instance (a relay) that accepts it.

2. Motivation and history

   In the original Tor TLS connection handshake protocol ("V1", or
   "two-cert"), parties that wanted to authenticate provided a
   two-cert chain of X.509 certificates during the handshake setup
   phase.  Every party that wanted to authenticate sent these
   certificates.

   In the current Tor TLS connection handshake protocol ("V2", or
   "renegotiating"), the parties begin with a single certificate
   sent from the server (responder) to the client (initiator), and
   then renegotiate to a two-certs-from-each-authenticating-party
   handshake.
   We made this change to make Tor's handshake look like a browser
   speaking SSL to a webserver.  (See proposal 130, and
   tor-spec.txt.)  To tell whether to use the V1 or V2 handshake,
   servers look at the list of ciphers sent by the client.  (This is
   ugly, but there's not much else in the ClientHello that they can
   look at.) If the list contains any cipher not used by the V1
   protocol, the server sends back a single cert and expects a
   renegotiation.  If the client gets back a single cert, then it
   withholds its own certificates until the TLS renegotiation phase.

   In other words, initiator behavior now looks like this:

      - Begin TLS negotiation with V2 cipher list; wait for
        certificate(s).
      - If we get a certificate chain:
         - Then we are using the V1 handshake.  Send our own
           certificate chain as part of this initial TLS handshake
           if we want to authenticate; otherwise, send no
           certificates.  When the handshake completes, check
           certificates.  We are now mutually authenticated.

        Otherwise, if we get just a single certificate:
         - Then we are using the V2 handshake.  Do not send any
           certificates during this handshake.
         - When the handshake is done, immediately start a TLS
           renegotiation.  During the renegotiation, expect
           a certificate chain from the server; send a certificate
           chain of our own if we want to authenticate ourselves.
         - After the renegotiation, check the certificates. Then
           send (and expect) a VERSIONS cell from the other side to
           establish the link protocol version.

   And V2 responder behavior now looks like this:

      - When we get a TLS ClientHello request, look at the cipher
        list.
      - If the cipher list contains only the V1 ciphersuites:
         - Then we're doing a V1 handshake.  Send a certificate
           chain.  Expect a possible client certificate chain in
           response.
        Otherwise, if we get other ciphersuites:
         - We're using the V2 handshake.  Send back a single
           certificate and let the handshake complete.
         - Do not accept any data until the client has renegotiated.
         - When the client is renegotiating, send a certificate
           chain, and expect (possibly multiple) certificates in
           reply.
         - Check the certificates when the renegotiation is done.
           Then exchange VERSIONS cells.

   Late in 2009, researchers found a flaw in most applications' use
   of TLS renegotiation: Although TLS renegotiation does not
   reauthenticate any information exchanged before the renegotiation
   takes place, many applications were treating it as though it did,
   and assuming that data sent _before_ the renegotiation was
   authenticated with the credentials negotiated _during_ the
   renegotiation.  This problem was exacerbated by the fact that
   most TLS libraries don't actually give you an obvious good way to
   tell where the renegotiation occurred relative to the datastream.
   Tor wasn't directly affected by this vulnerability, but its
   aftermath hurts us in a few ways:

      1) OpenSSL has disabled renegotiation by default, and created
         a "yes we know what we're doing" option we need to set to
         turn it back on.  (Two options, actually: one for openssl
         0.9.8l and one for 0.9.8m and later.)

      2) Some vendors have removed all renegotiation support from
         their versions of OpenSSL entirely, forcing us to tell
         users to either replace their versions of OpenSSL or to
         link Tor against a hand-built one.

      3) Because of 1 and 2, I'd expect TLS renegotiation to become
         rarer and rarer in the wild, making our own use stand out
         more.

3. Design

3.1. The view in the large

   Taking a cue from Steven Murdoch's proposal 124, I propose that
   we move the work currently done by the TLS renegotiation step
   (that is, authenticating the parties to one another) and do it
   with Tor cells instead of with TLS.

   Using _yet another_ variant response from the responder (server),
   we allow the client to learn that it doesn't need to rehandshake
   and can instead use a cell-based authentication system.  Once the
   TLS handshake is done, the client and server exchange VERSIONS
   cells to determine link protocol version (including
   handshake version).  If they're using the handshake version
   specified here, the client and server arrive at link protocol
   version 3 (or higher), and use cells to exchange further
   authentication information.

3.2. New TLS handshake variant

   We already used the list of ciphers from the ClientHello to
   indicate whether the client can speak the V2 ("renegotiating")
   handshake or later, so we can't encode more information there.

   We can, however, change the DN in the certificate passed by the
   server back to the client.  Currently, all V2 certificates are
   generated with CN values ending with ".net".  I propose that we
   have the ".net" commonName ending reserved to indicate the V2
   protocol, and use commonName values ending with ".com" to
   indicate the V3 ("minimal") handshake described herein.

   Now, once the initial TLS handshake is done, the client can look
   at the server's certificate(s).  If there is a certificate chain,
   the handshake is V1.  If there is a single certificate whose
   subject commonName ends in ".net", the handshake is V2 and the
   client should try to renegotiate as it would currently.
   Otherwise, the client should assume that the handshake is V3+.
   [Servers should _only_ send ".com" addresses, to allow room for
   more signaling in the future.]
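
   The client-side decision above can be sketched as follows. The
   function name is hypothetical, and the input is assumed to be the
   list of subject commonName values from the certificates the server
   presented in the initial TLS handshake.

```python
def classify_handshake(server_cert_common_names):
    """Return "V1", "V2", or "V3+" from the server's certificate(s)."""
    names = list(server_cert_common_names)
    if not names:
        raise ValueError("server sent no certificate")
    if len(names) > 1:
        return "V1"    # a certificate chain means the two-cert handshake
    if names[0].endswith(".net"):
        return "V2"    # single ".net" cert: renegotiate as before
    return "V3+"       # anything else, including ".com"

assert classify_handshake(["www.example.net", "www.id.example.net"]) == "V1"
assert classify_handshake(["www.example.net"]) == "V2"
assert classify_handshake(["www.example.com"]) == "V3+"
```

   Treating every non-".net" single certificate as V3+ is what leaves
   room for later signaling through other commonName endings.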

3.3. Authenticating inside Tor

   Once the TLS handshake is finished, if the client renegotiates,
   then the server should go on as it does currently.

   If the client implements this proposal, however, and the server
   has shown it can understand the V3+ handshake protocol, the
   client immediately sends a VERSIONS cell to the server
   and waits to receive a VERSIONS cell in return.  We negotiate
   the Tor link protocol version _before_ we proceed with the
   negotiation, in case we need to change the authentication
   protocol in the future.

   Once either party has seen the VERSIONS cell from the other, it
   knows which version they will pick (that is, the highest version
   shared by both parties' VERSIONS cells).  All Tor instances using
   the handshake protocol described in 3.2 MUST support at least
   link protocol version 3 as described here.
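
   Version selection is symmetric, so both sides can run the same
   computation. A minimal sketch, assuming the versions from each
   VERSIONS cell have already been decoded into integers; the function
   name is hypothetical.

```python
def negotiate_link_version(ours, theirs):
    """Return the highest link protocol version both sides listed,
    or None if the two VERSIONS cells share no version."""
    shared = set(ours) & set(theirs)
    if not shared:
        return None
    return max(shared)

assert negotiate_link_version([1, 2, 3], [2, 3, 4]) == 3
assert negotiate_link_version([1, 2], [3]) is None
```

   How a caller handles None, or a result below 3 after a V3+ handshake,
   is left open in this sketch.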

   On learning the link protocol, the server then sends the client a
   CERT cell and a NETINFO cell.  If the client wants to
   authenticate to the server, it sends a CERT cell, an AUTHENTICATE
   cell, and a NETINFO cell; or it may simply send a NETINFO cell if
   it does not want to authenticate.

   The CERT cell describes the keys that a Tor instance is claiming
   to have.  It is a variable-length cell.  Its payload format is:

        N: Number of certs in cell            [1 octet]
        N times:
           CLEN                               [2 octets]
           Certificate                        [CLEN octets]

   Any extra octets at the end of a CERT cell MUST be ignored.
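
   A rough sketch of how a receiver might split a CERT cell payload
   into its certificates is given below.  It assumes that CLEN is
   encoded in network (big-endian) byte order, which the layout above
   does not state explicitly; the function name is hypothetical.

```python
import struct
from typing import List


def split_cert_cell(payload: bytes) -> List[bytes]:
    """Split a CERT cell payload into its N length-prefixed certificates."""
    if len(payload) < 1:
        raise ValueError("empty CERT cell payload")
    count = payload[0]            # N: number of certs in cell (1 octet)
    offset = 1
    certs = []
    for _ in range(count):
        if offset + 2 > len(payload):
            raise ValueError("truncated CLEN")
        # Assumption: CLEN is a big-endian 16-bit integer.
        (clen,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        if offset + clen > len(payload):
            raise ValueError("truncated certificate")
        certs.append(payload[offset:offset + clen])
        offset += clen
    # Any extra octets after the last certificate are ignored.
    return certs
```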

   Each certificate has the form:

        CertType                              [1 octet]
        CertPurpose                           [1 octet]
        PublicKeyLen                          [2 octets]
        PublicKey                             [PublicKeyLen octets]
        NotBefore                             [4 octets]
        NotAfter                              [4 octets]
        SignerID                              [HASH256_LEN octets]
        SignatureLen                          [2 octets]
        Signature                             [SignatureLen octets]

   where CertType is 1 (meaning "RSA/SHA256")
         CertPurpose is 1 (meaning "link certificate")
         PublicKey is the DER encoding of the ASN.1 representation
            of the RSA key of the subject of this certificate
         NotBefore is a time in HOURS since January 1, 1970, 00:00
            UTC before which this certificate should not be
            considered valid.
         NotAfter is a time in HOURS since January 1, 1970, 00:00
            UTC after which this certificate should not be
            considered valid.
         SignerID is the SHA-256 digest of the public key signing
            this certificate
         and Signature is the signature of all the other fields in
            this certificate, using SHA256 as described in proposal
            158.

   While authenticating, a server need send only a self-signed
   certificate for its identity key.  (Its TLS certificate already
   contains its link key signed by its identity key.)  A client that
   wants to authenticate MUST send two certificates: one containing
   a public link key signed by its identity key, and one self-signed
   cert for its identity.

   Tor instances MUST ignore any certificate with an unrecognized
   CertType or CertPurpose, and MUST ignore extra bytes in the cert.
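
   The certificate layout and the two "MUST ignore" rules above could
   be handled along the following lines.  This is a sketch under
   assumptions: multi-octet integers are taken to be big-endian,
   HASH256_LEN is taken to be 32, and the names LinkCert, parse_cert,
   and is_valid_at are hypothetical.  Signature checking is left out.

```python
import struct
from dataclasses import dataclass
from typing import Optional

HASH256_LEN = 32  # assumption: length of a SHA-256 digest in octets


@dataclass
class LinkCert:
    public_key: bytes
    not_before_hours: int
    not_after_hours: int
    signer_id: bytes
    signature: bytes


def _take(data: bytes, offset: int, length: int) -> bytes:
    chunk = data[offset:offset + length]
    if len(chunk) != length:
        raise ValueError("truncated certificate")
    return chunk


def parse_cert(data: bytes) -> Optional[LinkCert]:
    """Parse one certificate; return None if its type or purpose is unknown."""
    if len(data) < 2:
        raise ValueError("truncated certificate")
    if data[0] != 1 or data[1] != 1:
        return None               # unrecognized CertType/CertPurpose: ignore
    try:
        (key_len,) = struct.unpack_from(">H", data, 2)
        offset = 4
        public_key = _take(data, offset, key_len)
        offset += key_len
        not_before, not_after = struct.unpack_from(">II", data, offset)
        offset += 8
        signer_id = _take(data, offset, HASH256_LEN)
        offset += HASH256_LEN
        (sig_len,) = struct.unpack_from(">H", data, offset)
        offset += 2
        signature = _take(data, offset, sig_len)
    except struct.error as exc:
        raise ValueError("truncated certificate") from exc
    # Extra bytes after the signature are ignored.
    return LinkCert(public_key, not_before, not_after, signer_id, signature)


def is_valid_at(cert: LinkCert, unix_seconds: int) -> bool:
    """NotBefore and NotAfter are counted in HOURS since the Unix epoch."""
    hours = unix_seconds // 3600
    return cert.not_before_hours <= hours <= cert.not_after_hours
```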

   The AUTHENTICATE cell proves to the server that the client with
   whom it completed the initial TLS handshake is the one possessing
   the link public key in its certificate.  It is a variable-length
   cell.  Its contents are:

        SignatureType                         [2 octets]
        SignatureLen                          [2 octets]
        Signature                             [SignatureLen octets]

   where SignatureType is 1 (meaning "RSA-SHA256") and Signature is
   an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master
   secret key as its key, of the following elements:

     - The SignatureType field (0x00 0x01)
     - The NUL terminated ASCII string: "Tor certificate verification"
     - client_random, as sent in the Client Hello
     - server_random, as sent in the Server Hello
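
   The HMAC step can be sketched as below.  We assume here that the
   four elements are simply concatenated in the order listed, which
   the text implies but does not spell out; the function name is
   hypothetical.  The resulting 32-octet value is what would then be
   signed with the link key (the RSA signing step is not shown).

```python
import hashlib
import hmac

SIGNATURE_TYPE_RSA_SHA256 = b"\x00\x01"
LABEL = b"Tor certificate verification\x00"   # NUL-terminated ASCII string


def authenticate_digest(master_secret: bytes,
                        client_random: bytes,
                        server_random: bytes) -> bytes:
    """HMAC-SHA256, keyed with the TLS master secret, over the listed elements."""
    # Assumption: plain concatenation, in the order the elements are listed.
    message = (SIGNATURE_TYPE_RSA_SHA256 + LABEL
               + client_random + server_random)
    return hmac.new(master_secret, message, hashlib.sha256).digest()
```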

   Once the above handshake is complete, the client knows (from the
   initial TLS handshake) that it has a secure connection to an
   entity that controls a given link public key, and knows (from the
   CERT cell) that the link public key is a valid public key for a
   given Tor identity.

   If the client authenticates, the server learns from the CERT cell
   that a given Tor identity has a given current public link key.
   From the AUTHENTICATE cell, it knows that an entity with that
   link key knows the master secret for the TLS connection, and
   hence must be the party with whom it's talking, if TLS works.

3.4. Security checks

   If the TLS handshake indicates a V2 or V3+ connection, the server
   MUST reject any connection from the client that does not begin
   with either a renegotiation attempt or a VERSIONS cell containing
   at least link protocol version "3".  If the TLS handshake
   indicates a V3+ connection, the client MUST reject any connection
   where the server sends anything before the client has sent a
   VERSIONS cell, and any connection where the VERSIONS cell does
   not contain at least link protocol version "3".

   If link protocol version 3 is chosen:

     Clients and servers MUST check that all digests and signatures
     on the certificates in CERT cells they are given are as
     described above.

     After the VERSIONS cell, clients and servers MUST close the
     connection if anything besides a CERT or AUTHENTICATE cell is
     sent before the first NETINFO cell.

     CERT or AUTHENTICATE cells anywhere after the first NETINFO
     cell must be rejected.

   ... [write more here.  What else?] ...

3.5. Summary

   We now revisit the protocol outlines from section 2 to incorporate
   our changes.  New or modified steps are marked with a *.

   The new initiator behavior now looks like this:

      - Begin TLS negotiation with V2 cipher list; wait for
        certificate(s).
      - If we get a certificate chain:
         - Then we are using the V1 handshake.  Send our own
           certificate chain as part of this initial TLS handshake
           if we want to authenticate; otherwise, send no
           certificates.  When the handshake completes, check
           certificates.  We are now mutually authenticated.
        Otherwise, if we get just a single certificate:
         - Then we are using the V2 or the V3+ handshake.  Do not
           send any certificates during this handshake.
         * When the handshake is done, look at the server's
           certificate's subject commonName.
           * If it ends with ".net", we're doing a V2 handshake:
             - Immediately start a TLS renegotiation.  During the
               renegotiation, expect a certificate chain from the
               server; send a certificate chain of our own if we
               want to authenticate ourselves.
             - After the renegotiation, check the certificates. Then
               send (and expect) a VERSIONS cell from the other side
               to establish the link protocol version.
           * If it ends with anything else, assume a V3 or later
             handshake:
             * Send a VERSIONS cell, and wait for a VERSIONS cell
               from the server.
             * If we are authenticating, send CERT and AUTHENTICATE
               cells.
             * Send a NETINFO cell.  Wait for a CERT and a NETINFO
               cell from the server.
             * If the CERT cell contains a valid self-identity cert,
               and the identity key in the cert can be used to check
               the signature on the x.509 certificate we got during
               the TLS handshake, then we know we connected to the
               server with that identity.  If any of these checks
               fail, or the identity key was not what we expected,
               then we close the connection.
             * Once the NETINFO cell arrives, continue as before.

   And V3+ responder behavior now looks like this:

      - When we get a TLS ClientHello request, look at the cipher
        list.

      - If the cipher list contains only the V1 ciphersuites:
         - Then we're doing a V1 handshake.  Send a certificate
           chain.  Expect a possible client certificate chain in
           response.
        Otherwise, if we get other ciphersuites:
         - We're using the V2 handshake.  Send back a single
           certificate whose subject commonName ends with ".com",
           and let the handshake complete.
         * If the client does anything besides renegotiate or send a
           VERSIONS cell, drop the connection.
         - If the client renegotiates immediately, it's a V2
           connection:
           - When the client is renegotiating, send a certificate
             chain, and expect (possibly multiple) certificates in
             reply.
           - Check the certificates when the renegotiation is done.
             Then exchange VERSIONS cells.
         * Otherwise we got a VERSIONS cell and it's a V3 handshake.
           * Send a VERSIONS cell, a CERT cell, an AUTHENTICATE
             cell, and a NETINFO cell.
           * Wait for the client to send cells in reply.  If the
             client sends a CERT and an AUTHENTICATE and a NETINFO,
             use them to authenticate the client.  If the client
             sends a NETINFO, it is unauthenticated.  If it sends
             anything else before its NETINFO, it's rejected.
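
   The last responder step might be sketched like this.  Cells are
   represented only by their command names, the function name and its
   return values are hypothetical, and we assume the client sends CERT
   before AUTHENTICATE, in the order the text lists them.

```python
from typing import Sequence


def classify_client_reply(cells: Sequence[str]) -> str:
    """Classify the cells a V3 client sent after its VERSIONS cell."""
    cells = list(cells)
    if "NETINFO" not in cells:
        return "incomplete"           # still waiting for the NETINFO cell
    before_netinfo = cells[:cells.index("NETINFO")]
    if not before_netinfo:
        return "unauthenticated"      # a bare NETINFO: no client auth
    if before_netinfo == ["CERT", "AUTHENTICATE"]:
        return "verify"               # go on to check the CERT and signature
    return "reject"                   # anything else before NETINFO
```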

4. Numbers to assign

   We need a version number for this link protocol.  I've been
   calling it "3".

   We need to reserve command numbers for CERT and AUTH cells.  I
   suggest that in link protocol 3 and higher, we reserve command
   numbers 128..240 for variable-length cells.  (241-256 we can hold
   for future extensions.)

5. Efficiency

   This protocol adds a round-trip step when the client sends a
   VERSIONS cell to the server and waits for the {VERSIONS, CERT,
   NETINFO} response in turn.  (The server then waits for the
   client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
   but it would have already been waiting for the client's NETINFO,
   so that's not an additional wait.)

   This is actually fewer round-trip steps than required before for
   TLS renegotiation, so that's a win.

6. Open questions:

  - Should we use X.509 certificates instead of the certificate-ish
    things we describe here?  They are more standard, but more ugly.

  - May we cache which certificates we've already verified?  It
    might leak in timing whether we've connected with a given server
    before, and how recently.

  - Is there a better secret than the master secret to use in the
    AUTHENTICATE cell?  Say, a portable one?  Can we get at it for
    other libraries besides OpenSSL?

  - Does using the client_random and server_random data in the
    AUTHENTICATE message actually help us?  How hard is it to pull
    them out of the OpenSSL data structure?

  - Can we give some way for clients to signal "I want to use the
    V3 protocol if possible, but I can't renegotiate, so don't give
    me the V2"?  Clients currently have a fair idea of server
    versions, so they could potentially do the V3+ handshake with
    servers that support it, and fall back to V1 otherwise.

  - What should servers that don't have TLS renegotiation do?  For
    now, I think they should just get it.  Eventually we can
    deprecate the V2 handshake as we did with the V1 handshake.

Title: Configuration options regarding circuit building
Filename: 170-user-path-config.txt
Author: Sebastian Hahn
Created: 01-March-2010
Status: Superseded

Overview:

    This document outlines how Tor handles the user configuration
    options to influence the circuit building process.

Motivation:

    Tor's treatment of the configuration *Nodes options was surprising
    to many users, and quite a few conspiracy theories have crept up. We
    should update our specification and code to better describe and
    communicate what is going on during circuit building, and how we're
    honoring configuration. So far, we've been tracking a bugreport
    about this behaviour (
    https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 )
    and Nick replied in a thread on or-talk (
    http://archives.seul.org/or/talk/Feb-2010/msg00117.html ).

    This proposal tries to document our intention for those configuration
    options.

Design:

    Five configuration options are available to users to influence Tor's
    circuit building. EntryNodes and ExitNodes define a list of nodes
    that are for the Entry/Exit position in all circuits. ExcludeNodes
    is a list of nodes that are used for no circuit, and
    ExcludeExitNodes is a list of nodes that aren't used as the last
    hop. StrictNodes defines Tor's behaviour in case of a conflict, for
    example when a node that is excluded is the only available
    introduction point. Setting StrictNodes to 1 breaks Tor's
    functionality in that case, and it will refuse to build such a
    circuit.

    Neither Nick's email nor bug 1090 has clear suggestions for how we
    should behave in each case, so I tried to come up with something
    that made sense to me.

Security implications:

    Deviating from normal circuit building can break one's anonymity, so
    the documentation of the above options should contain a warning to
    make users aware of the pitfalls.

Specification:

    It is proposed that the "User configuration" part of path-spec
    (section 2.2.2) be replaced with this:

    Users can alter the default behavior for path selection with
    configuration options. In case of conflicts (excluding and requiring
    the same node) the "StrictNodes" option is used to determine
    behaviour. If a node is both excluded and required via a
    configuration option, the exclusion takes precedence.

    - If "ExitNodes" is provided, then every request requires an exit
      node on the ExitNodes list. If a request is supported by no nodes
      on that list, and "StrictNodes" is false, then Tor treats that
      request as if ExitNodes were not provided.

    - "EntryNodes" behaves analogously.

    - If "ExcludeNodes" is provided, then no circuit uses any of the
      nodes listed. If a circuit requires an excluded node to be used,
      and "StrictNodes" is false, then Tor uses the node in that
      position while not using any other of the excluded nodes.

    - If "ExcludeExitNodes" is provided, then Tor will not use the nodes
      listed for the exit position in a circuit. If a circuit requires
      an excluded node to be used in the exit position and "StrictNodes"
      is false, then Tor builds that circuit as if ExcludeExitNodes were
      not provided.

    - If a user tries to connect to or resolve a hostname of the form
      <target>.<servername>.exit and the "AllowDotExit" configuration
      option is set to 1, the request is rewritten to a request for
      <target>, and the request is only supported by the exit whose
      nickname or fingerprint is <servername>. If "AllowDotExit" is set
      to 0 (default), any request for <anything>.exit is denied.

    - When any of the *Nodes settings are changed, all circuits are
      expired immediately, to prevent a situation where a previously
      built circuit is used even though some of its nodes are now
      excluded.
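
    One possible reading of the ExitNodes rule, combined with the
    "exclusion takes precedence" rule, is sketched below. The function
    and parameter names are hypothetical, and `excluded` stands for the
    union of ExcludeNodes and ExcludeExitNodes. The non-strict fallback
    for a circuit that requires one specific excluded node is not
    modelled here.

```python
from typing import Collection, List


def exit_candidates(supporting: List[str],
                    exit_nodes: Collection[str],
                    excluded: Collection[str],
                    strict_nodes: bool) -> List[str]:
    """Pick the exits that may serve a request.

    supporting: nodes whose exit policy supports the request.
    """
    # Exclusion takes precedence over ExitNodes.
    allowed = [n for n in supporting if n not in excluded]
    if not exit_nodes:
        return allowed
    preferred = [n for n in allowed if n in exit_nodes]
    if preferred:
        return preferred
    # No node on the ExitNodes list supports the request.
    # With StrictNodes the request fails; otherwise ExitNodes is ignored.
    return [] if strict_nodes else allowed
```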


Compatibility:

    The old Strict*Nodes options are deprecated, and the StrictNodes
    option is new. Tor users may need to update their configuration file.
Filename: 171-separate-streams.txt
Title: Separate streams across circuits by connection metadata
Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson
Created: 21-Oct-2008
Modified: 7-Dec-2010
Status: Closed
Implemented-In: 0.2.3.3-alpha

Summary:

  We propose a new set of options to isolate unrelated streams from one
  another, putting them on separate circuits so that semantically
  unrelated traffic is not inadvertently made linkable.

Motivation:

  Currently, Tor attaches regular streams (that is, ones not carrying
  rendezvous or directory traffic) to circuits based only on whether the
  circuit's current exit node supports the destination, and whether the
  circuit has been dirty (that is, in use) for too long.

  This means that traffic that would otherwise be unrelated sometimes
  gets sent over the same circuit, allowing the exit node to link such
  streams with certainty, and allowing other parties to link such
  streams probabilistically.

  Older versions of onion routing tried to address this problem by
  sending every stream over a separate circuit; performance issues made
  this unfeasible. Moreover, in the presence of a localized adversary,
  separating streams by circuits increases the odds that, for any given
  linked set of streams, at least one will go over a compromised
  circuit.

  Therefore we ought to look for ways to allow streams that ought to be
  linked to travel over a single circuit, while keeping streams that
  ought not be linked isolated to separate circuits.

Discussion:

  Let's call a series of inherently-linked streams (like a set of
  streams downloading objects from the same webpage, or a browsing
  session where the user requests several related webpages) a "Session".

  "Sessions" are necessarily a fuzzy concept.  While users typically
  consider some activities as wholly unrelated to each other ("My IM
  session has nothing to do with my web browsing!"), the boundaries
  between activities are sometimes hard to determine.  If I'm reading
  lolcats in one browser tab and reading about treatments for an
  embarrassing disease in another, those are probably separate sessions.
  If I search for a forum, log in, read it for a while, and post a few
  messages on unrelated topics, that's probably all the same session.

  So with the proviso that no automated process can identify sessions
  100% accurately, let's see which options we have available.

  Generally, all the streams on a session come from a single
  application.  Unfortunately, isolating streams by application
  automatically isn't feasible, given the lack of any nice
  cross-platform way to tell which local process originated a given
  connection.  (Yes, lsof works.  But a quick review of the lsof code
  should be sufficient to scare you away from thinking there is a
  portable option, much less a portable O(1) option.)  So instead, we'll
  have to use some other aspect of a Tor request as a proxy for the
  application.

  Generally, traffic from separate applications is not in the same
  session.

  With some applications (IRC, for example), each stream is a session.

  Some applications (most notably web browsing) can't be meaningfully
  split into sessions without inspecting the traffic itself and
  maintaining a lot of state.

  How well do ports correspond to sessions?  Early versions of this
  proposal focused on using destination ports as a proxy for
  application, since a connection to port 22 for SSH is probably not in
  the same session as one to port 80. This works better with some
  applications than with others, though: while SSH users typically
  know when they're on port 22 and when they aren't, a web browser can
  be coaxed (through img URLs or any number of related tricks) into
  connecting to any port at all.  Moreover, when Tor gets a DNS lookup
  request, it doesn't know in advance which port the resulting address
  will be used to connect to.

  So in summary, each kind of traffic wants to follow different rules,
  and assuming the existence of a web browser and a hostile web page or
  exit node, we can't tell one kind of traffic from another by simply
  looking at the destination:port of the traffic.

  Fortunately, we're not doomed.

Design:

  When a stream arrives at Tor, we have the following data to examine:
    1) The destination address
    2) The destination port (unless this is a DNS lookup)
    3) The protocol used by the application to send the stream to Tor:
       SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy"
       mechanism the kernel gives us.
    4) The port used by the application to send the stream to Tor --
       that is, the SOCKSListenAddress or TransListenAddress that the
       application used, if we have more than one.
    5) The SOCKS username and password, if any.
    6) The source address and port for the application.

  We propose to use 3, 4, and 5 as a backchannel for applications to
  tell Tor about different sessions.  Rather than running only one
  SOCKSPort, a Tor user who would prefer better session isolation should
  run multiple SOCKSPorts/TransPorts, and configure different
  applications to use separate ports. Applications that support SOCKS
  authentication can further be separated on a single port by their
  choice of username/password.  Streams sent to separate ports or using
  different authentication information should never be sent over the
  same circuit.  We allow each port to have its own settings for
  isolation based on destination port, destination address, or both.

  Handling DNS can be a challenge.  We can get hostnames by one of three
  means:

    A) A SOCKS4a request, or a SOCKS5 request with a hostname.  This
       case is handled trivially using the rules above.
    B) A RESOLVE request on a SOCKSPort.  This case is handled using the
       rules above, except that port isolation can't work to isolate
       RESOLVE requests into a proper session, since we don't know which
       port will eventually be used when we connect to the returned
       address.
    C) A request on a DNSPort.  We have no way of knowing which
       address/port will be used to connect to the requested address.

  When B or C is required but problematic, we could favor the use of
  AutomapHostsOnResolve.

Interface:

  We propose that {SOCKS,Natd,Trans,DNS}ListenAddr be deprecated in
  favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax:

  ClientPortLine = OptionName SP (Addr ":")? Port (SP Options?)
  OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort"
  Addr = An IPv4 address / an IPv6 address surrounded by brackets.
         If optional, we default to 127.0.0.1
  Port = An integer from 1 through 65535 inclusive
  Options = Option
  Options = Options SP Option
  Option = IsolateOption / GroupOption
  GroupOption = "SessionGroup=" UINT
  IsolateOption =  OptNo ("IsolateDestPort" / "IsolateDestAddr" /
         "IsolateSOCKSUser"/ "IsolateClientProtocol" /
         "IsolateClientAddr") OptPlural
  OptNo = "No" ?
  OptPlural = "s" ?
  SP = " "
  UINT = An unsigned integer

  All options are case-insensitive.

  The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by
  default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively
  turn them off.  The IsolateDestPort and IsolateDestAddr and
  IsolateClientProtocol options are off by default.  NoIsolateDestPort and
  NoIsolateDestAddr and NoIsolateClientProtocol have no effect.
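
  To make the grammar and defaults above concrete, here is a small
  parser sketch.  It is not Tor's configuration parser: the names
  ClientPort and parse_client_port_line are hypothetical, and treating
  the OptionName itself as case-insensitive is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

OPTION_NAMES = {"socksport": "SOCKSPort", "natdport": "NatdPort",
                "transport": "TransPort", "dnsport": "DNSPort"}
# Defaults as stated above: SOCKSUser and ClientAddr isolation are on.
DEFAULTS = {"IsolateDestPort": False, "IsolateDestAddr": False,
            "IsolateSOCKSUser": True, "IsolateClientProtocol": False,
            "IsolateClientAddr": True}


@dataclass
class ClientPort:
    kind: str
    addr: str
    port: int
    session_group: Optional[int]
    isolate: Dict[str, bool]


def parse_client_port_line(line: str) -> ClientPort:
    words = line.split()
    if len(words) < 2:
        raise ValueError("expected: OptionName [Addr:]Port [Options]")
    kind = OPTION_NAMES.get(words[0].lower())
    if kind is None:
        raise ValueError("unknown option name: " + words[0])
    # rpartition keeps a bracketed IPv6 address such as "[::1]" intact.
    addr, sep, port_text = words[1].rpartition(":")
    if not sep:
        addr = "127.0.0.1"
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    isolate = dict(DEFAULTS)
    session_group = None
    for word in words[2:]:
        lower = word.lower()                     # options are case-insensitive
        if lower.startswith("sessiongroup="):
            value = lower[len("sessiongroup="):]
            if not re.fullmatch(r"[0-9]+", value):
                raise ValueError("SessionGroup needs an unsigned integer")
            session_group = int(value)
            continue
        negated = lower.startswith("no")         # OptNo
        name = lower[2:] if negated else lower
        for flag in DEFAULTS:
            if name in (flag.lower(), flag.lower() + "s"):   # OptPlural
                isolate[flag] = not negated
                break
        else:
            raise ValueError("unknown option: " + word)
    return ClientPort(kind, addr, port, session_group, isolate)
```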

  Given a set of ClientPortLines, streams must NOT be placed on the same
  circuit if ANY of the following hold:

    * They were sent to two different client ports, unless the two
      client ports both specify a "SessionGroup" option with the same
      integer value.
    * At least one was sent to a client port with the IsolateDestPort
      active, and they have different destination ports.
    * At least one was sent to a client port with IsolateDestAddr
      active, and they have different destination addresses.
    * At least one was sent to a client port with IsolateClientProtocol
      active, and they use different protocols (where SOCKS4, SOCKS4a,
      SOCKS5, TransPort, NatdPort, and DNS are the protocols in question)
    * At least one was sent to a client port with IsolateSOCKSUser
      active, and they have different SOCKS username/password
      configurations.  (For the purposes of this option, the
      username/password pair of ""/"" is distinct from SOCKS without
      authentication, and both are distinct from any non-SOCKS client's
      non-authentication.)
    * At least one was sent to a client port with IsolateClientAddr
      active, and they came from different client addresses.  (For the
      purpose of this option, any local interface counts as the same
      address.  So if the host is configured with addresses 10.0.0.1,
      192.0.32.10, and 127.0.0.1, then traffic from those addresses can
      leave on the same circuit, but traffic from 10.0.0.2 (for
      example) could not share a circuit with any of them.)

  These rules apply regardless of whether the streams are active at the
  same time.  In other words, if the rules say that streams A and B must
  not be on the same circuit, and stream A is attached to circuit X,
  then stream B must never be attached to circuit X, even if stream A is
  closed first.
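
  The rules above amount to a pairwise compatibility test between
  streams.  A sketch follows; the Stream record, its field names, and
  the local_addrs parameter are all hypothetical, chosen only to
  mirror the list above.  Because the rules hold even after a stream
  closes, a real implementation would have to compare a new stream
  against every stream a circuit has ever carried.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Stream:
    listener: str                    # client port it arrived on
    session_group: Optional[int]     # SessionGroup of that port, if any
    isolate: FrozenSet[str]          # Isolate* options active on that port
    dest_addr: str
    dest_port: Optional[int]
    protocol: str                    # "SOCKS4", "SOCKS5", "TransPort", ...
    socks_auth: Optional[Tuple[str, str]]   # None means no SOCKS auth
    client_addr: str


def _auth_key(s: Stream) -> tuple:
    # ""/"" differs from SOCKS without auth, and both differ from non-SOCKS.
    if not s.protocol.upper().startswith("SOCKS"):
        return ("non-socks",)
    if s.socks_auth is None:
        return ("socks-noauth",)
    return ("socks-auth",) + s.socks_auth


def may_share_circuit(a: Stream, b: Stream,
                      local_addrs: FrozenSet[str] = frozenset({"127.0.0.1"})
                      ) -> bool:
    if a.listener != b.listener:
        if a.session_group is None or a.session_group != b.session_group:
            return False
    active = a.isolate | b.isolate   # "at least one" port has the option
    if "IsolateDestPort" in active and a.dest_port != b.dest_port:
        return False
    if "IsolateDestAddr" in active and a.dest_addr != b.dest_addr:
        return False
    if "IsolateClientProtocol" in active and a.protocol != b.protocol:
        return False
    if "IsolateSOCKSUser" in active and _auth_key(a) != _auth_key(b):
        return False
    if "IsolateClientAddr" in active:
        # Any local interface counts as the same address.
        src_a = "local" if a.client_addr in local_addrs else a.client_addr
        src_b = "local" if b.client_addr in local_addrs else b.client_addr
        if src_a != src_b:
            return False
    return True
```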

Alternative Interface:

  We're cramming a lot onto one line in the design above.  Perhaps
  instead it would be a better idea to have grouped lines of the form:

    StreamGroup 1
    SOCKSPort 9050
    TransPort 9051
    IsolateDestPort 1
    IsolateClientProtocol 0
    EndStreamGroup

    StreamGroup 2
    SOCKSPort 9052
    DNSPort 9053
    IsolateDestAddr 1
    EndStreamGroup

  This would be equivalent to:
   SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
   TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
   SOCKSPort 9052 SessionGroup=2 IsolateDestAddr
   DNSPort   9053 SessionGroup=2 IsolateDestAddr

  But it would let us extend the range of allowed options later without
  having client port lines grow without bound.  For example, we might
  give different circuit building parameters to different session
  groups.
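
  The equivalence between the two syntaxes can be illustrated with a
  short translator from the grouped form to single-line form.  This is
  a sketch only: the grouped syntax is itself just a suggestion, and
  the function name is hypothetical.

```python
from typing import List

PORT_KEYS = ("SOCKSPort", "NatdPort", "TransPort", "DNSPort")


def expand_stream_groups(text: str) -> List[str]:
    """Rewrite StreamGroup ... EndStreamGroup blocks as client port lines."""
    out: List[str] = []
    group = None
    ports: List[tuple] = []
    options: List[str] = []
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue
        key = words[0]
        if key == "EndStreamGroup":
            if group is None:
                raise ValueError("EndStreamGroup without StreamGroup")
            suffix = " ".join(["SessionGroup=" + group] + options)
            out.extend(f"{kind} {port} {suffix}" for kind, port in ports)
            group = None
            continue
        if len(words) != 2:
            raise ValueError("expected a key and one value: " + raw.strip())
        value = words[1]
        if key == "StreamGroup":
            group, ports, options = value, [], []
        elif group is None:
            raise ValueError("line outside a StreamGroup block")
        elif key in PORT_KEYS:
            ports.append((key, value))
        elif key.startswith("Isolate"):
            # "IsolateX 1" turns the option on; "IsolateX 0" becomes "NoIsolateX".
            options.append(key if value == "1" else "No" + key)
        else:
            raise ValueError("unknown key: " + key)
    if group is not None:
        raise ValueError("missing EndStreamGroup")
    return out
```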

Example of use:

  Suppose that we want to use a web browser, an IRC client, and an SSH
  client all at the same time.  Let's assume that we want web traffic to
  be isolated from all other traffic, even if the browser makes
  connections to ports usually used for IRC or SSH.  Let's also assume
  that IRC and SSH are both used for relatively long-lived connections,
  and we want to keep all IRC/SSH sessions separate from one another.

  In this case, we could say:

    SOCKSPort 9050
    SOCKSPort 9051 IsolateDestAddr IsolateDestPort

  We would then configure our browser to use 9050 and our IRC/SSH
  clients to use 9051.

Advanced example of use, #1:

  Suppose that we have a bunch of applications, and we launch them all
  using torsocks, and we want to keep the applications isolated from
  one another.  We just create a shell script, "torlaunch":
    #!/bin/bash
    export TORSOCKS_USERNAME="$1"
    exec torsocks "$@"
  And we configure our SOCKSPort with IsolateSOCKSUser.

  Or if we're on Linux and we want to isolate by application invocation,
  we would change the TORSOCKS_USERNAME line to:

    export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`"

Advanced example of use, #2:

  Now suppose that we want to achieve the benefits of the first example
  of use, but we are stuck using transparent proxies.  Let's suppose
  this is Linux.

    TransPort 9090
    TransPort 9091 IsolateDestAddr IsolateDestPort
    DNSPort 5353
    AutomapHostsOnResolve 1

  Here we use the iptables --cmd-owner filter to distinguish which
  command is originating the packets, directing traffic from our irc
  client and our SSH client to port 9091, and directing other traffic to
  9090.  Using AutomapHostsOnResolve will confuse ssh in its default
  configuration; we'll need to find a way around that.

Security Risks:

  Disabling IsolateClientAddr is a pretty bad idea.

  Setting up a set of applications to use this system effectively is a
  big problem.  It's likely that lots of people who try to do this will
  mess it up.  We should try to see which setups are sensible, and see
  if we can provide good feedback to explain which streams are isolated
  how.

Performance Risks:

  This proposal will result in clients building many more circuits than
  they do today.  To avoid accidentally hammering the network, we should
  have in-process limits on the maximum circuit creation rate and the
  total maximum client circuits.

Specification:

  The Tor client circuit selection process is not entirely specified.
  Any client circuit specification must take these changes into account.

Implementation notes:

  The more obvious ways to implement the "find a good circuit to attach
  to" part of this proposal involve doing an O(n_circuits) operation
  every time we have a stream to attach.  We already do such an
  operation, so it's not as if we need to hunt for fancy ways to make it
  O(1).  What will be harder is implementing the "launch circuits as
  needed" part of the proposal.  Still, it should come down to "a simple
  matter of programming."

  The SOCKS4 spec has the client provide authentication info when it
  connects; accepting such info is no problem.  But the SOCKS5 spec has
  the client send a list of known auth methods, then has the server send
  back the authentication method it chooses.  We'll need to update the
  SOCKS5 implementation so it can accept user/password authentication if
  it's offered.
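
  For illustration, the SOCKS5 method selection might look like the
  sketch below.  The method codes are the ones defined in RFC 1928;
  preferring username/password over "no authentication" whenever both
  are offered is our assumption, and the function name is hypothetical.

```python
NO_AUTH = 0x00                 # "no authentication required" (RFC 1928)
USERNAME_PASSWORD = 0x02       # username/password (RFC 1928)
NO_ACCEPTABLE_METHODS = 0xFF   # server rejects all offered methods


def choose_socks5_method(offered: bytes) -> int:
    """Pick a method from the list sent in the client's greeting."""
    # Assumption: take username/password if offered, so that the
    # credentials can be used for stream isolation.
    if USERNAME_PASSWORD in offered:
        return USERNAME_PASSWORD
    if NO_AUTH in offered:
        return NO_AUTH
    return NO_ACCEPTABLE_METHODS
```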

  If we use the second syntax for describing these options, we'll want
  to add a new "section-based" entry type for the configuration parser.
  Not a huge deal; we already have kludged up something similar for
  hidden service configurations.

  Opening circuits for predicted ports has the potential to get a little
  more complicated; we can probably get away with the existing
  algorithm, though, to see where its weak points are and look for
  better ones.

  Perhaps we can get our next-gen HTTP proxy to communicate browser tab
  or session info to Tor via authentication, or have torbutton do it
  directly.  More design is needed here, though.

Alternative designs:

  The implementation of this option may want to consider cases where the
  same exit node is shared by two or more circuits and
  IsolateStreamsByPort is in force.  Since one possible use of the option
  is to reduce the opportunity of Exit Nodes to attack traffic from the
  same source on multiple ports, the implementation may need to ensure
  that circuits reserved for the exclusive use of given ports do not
  share the same exit node.  On the other hand, if our goal is only that
  streams should be unlinkable, deliberately shunting them to different
  exit nodes is unnecessary and slightly counterproductive.

  Earlier versions of this design included a mechanism to isolate
  _particular_ destination ports and addresses, so that traffic sent to,
  say, port 22 would never share a circuit with any traffic *not* sent to
  port 22.  You can achieve this here by having all applications that
  send traffic to one of these ports use a separate SOCKSPort, and
  then setting IsolateDestPorts on that SOCKSPort.

Future work:

  Nikita Borisov suggests that different session profiles -- so long as
  there aren't too many of them -- could well get different guard node
  allocations in order to prevent guard profiling.  This can be done
  orthogonally to the rest of this proposal.

Lingering questions:

  I suspect there are issues remaining with DNS and TransPort users, and
  that my "just use AutomapHostsOnResolve" suggestion may be
  insufficient.
Filename: 172-circ-getinfo-option.txt
Title: GETINFO controller option for circuit information
Author: Damian Johnson
Created: 03-June-2010
Status: Reserve

Overview:

    This details an additional GETINFO option that would provide information
    concerning a relay's current circuits.

Motivation:

    The original proposal was for connection related information, but Jake
    made the excellent point that any information retrieved from the control
    port is...
    
      1. completely ineffectual for auditing purposes since either (a) these
      results can be fetched from netstat already or (b) the information would
      only be provided via tor and can't be validated.
      
      2. The more useful uses for connection information can be achieved with
      much less (and safer) information.
    
    Hence the proposal is now for circuit based rather than connection based
    information. This would strip the most controversial and sensitive data
    entirely (ip addresses, ports, and connection based bandwidth breakdowns)
    while still being useful for the following purposes:

    - Basic Relay Usage Questions
    How is the bandwidth I'm contributing broken down? Is it being evenly
    distributed or is someone hogging most of it? Do these circuits belong to
    the hidden service I'm running or something else? Now that I'm using exit
    policy X am I desirable as an exit, or are most people just using me as a
    relay?

    - Debugging
    Say a relay operator has a restrictive firewall policy for outbound
    connections, with the ORPort whitelisted, but doesn't realize that tor
    needs random high ports. Tor would report success ("your orport is
    reachable - excellent") yet the relay would be nonfunctional. This
    proposed information would reveal numerous RELAY -> YOU -> UNESTABLISHED
    circuits, giving a good indicator of what's wrong.

    - Visualization
    A nice benefit of visualizing tor's behavior is that it becomes a helpful
    tool in puzzling out how tor works. For instance, tor spawns numerous
    client connections at startup (even if unused as a client). As a newcomer
    to tor these asymmetric (outbound only) connections mystified me for quite
    a while until Roger explained their use to me. The proposed
    TYPE_FLAGS would let controllers clearly label them as being client
    related, making their purpose a bit clearer.

    At the moment connection data can only be retrieved via commands like
    netstat, ss, and lsof. However, providing an alternative via the control
    port provides several advantages:

      - scrubbing for private data
          Raw connection data has no notion of what's sensitive and what is
          not. The relay's flags and cached consensus can be used to take
          educated guesses concerning which connections could possibly belong
          to client or exit traffic, but this is both difficult and inaccurate.
          Anything provided via the control port can be scrubbed to make sure we
          aren't providing anything we think relay operators should not see.
     
      - additional information
          All connection querying commands strictly provide the ip address and
          port of connections, and nothing else. However, for the uses listed
          above the far more interesting attributes are the circuit's type,
          bandwidth usage and uptime.
     
      - improved performance
          Querying connection data is an expensive activity, especially for
          busy relays or low end processors (such as mobile devices). Tor
          already internally knows its circuits, allowing for vastly quicker
          lookups.
     
      - cross platform capability
          The connection querying utilities mentioned above not only aren't
          available under Windows, but differ widely among different *nix
          platforms. FreeBSD in particular takes a very unique approach,
          dropping important options from netstat and assigning ss to a
          spreadsheet application instead. A controller interface, however,
          would provide a uniform means of retrieving this information.

Security Implications:

    This is an open question. This proposal lacks the most controversial pieces
    of information (ip addresses and ports), and insight into the potential
    threats it would pose would be very welcome!

Specification:

   The following addition would be made to the control-spec's GETINFO section:

  "rcirc/id/<Circuit identity>" -- Provides entry for the associated relay
    circuit, formatted as:
      CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag>
        READ=<bytes> WRITE=<bytes>

    None of the parameters contain whitespace, and additional results must be
    ignored to allow for future expansion. Parameters are defined as follows:
      CIRC_ID - Unique numeric identifier for the circuit this belongs to.
      CREATED - Unix timestamp (as seconds since the Epoch) for when the
          circuit was created.
      UPDATED - Unix timestamp for when this information was last updated.
      TYPE - Single character flags indicating attributes in the circuit:
          (E)ntry : has a connection that doesn't belong to a known Tor server,
            indicating that this is either the first hop or bridged
          E(X)it : has been used for at least one exit stream
          (R)elay : has been extended
          Rende(Z)vous : is being used for a rendezvous point
          (I)ntroduction : is being used for a hidden service introduction
          (N)one of the above: none of the above have happened yet.
      READ - Total bytes transmitted toward the exit over the circuit.
      WRITE - Total bytes transmitted toward the client over the circuit.

  "rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by
    newlines.
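
    As an illustration of the "ignore additional results" rule, a controller
    might parse an rcirc entry roughly as follows. This is only a sketch: the
    function name and the sample values are made up, and it assumes that the
    timestamps and byte counts are plain integers.

```python
# Hypothetical controller-side helper; not part of any existing library.
KNOWN_FIELDS = {
    "CIRC_ID": int,
    "CREATED": int,   # assumed to be whole seconds since the Epoch
    "UPDATED": int,
    "TYPE": set,      # one character per flag, e.g. "EX" -> {"E", "X"}
    "READ": int,
    "WRITE": int,
}


def parse_rcirc_entry(line):
    """Parse one 'KEY=value KEY=value ...' rcirc entry into a dict.

    Tokens with unknown keys, or without '=', are skipped so that
    fields added in the future do not break the parser.
    """
    entry = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and key in KNOWN_FIELDS:
            entry[key] = KNOWN_FIELDS[key](value)
    return entry
```

    An entry carrying an unknown trailing field (say FUTURE=1) parses to the
    same dictionary as the entry without it.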

   The following would be included for circ info update events.

4.1.X. Relay circuit status changed

  The syntax is:
     "650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP
          Read SP Write] CRLF
     
     Notice =
            "NEW"    / ; first information being provided for this circuit
            "UPDATE" / ; update for a previously reported circuit
            "CLOSED"   ; notice that the circuit no longer exists
    
  Notice indicating that queryable information on a relay related circuit has
  changed. If the Notice parameter is either "NEW" or "UPDATE" then this
  provides the same fields that would be given by calling "GETINFO rcirc/id/"
  with the CircID.

Filename: 173-getinfo-option-expansion.txt
Title: GETINFO Option Expansion
Author: Damian Johnson
Created: 02-June-2010
Status: Obsolete

Overview:

    Over the course of developing arm there have been numerous hacks and
    workarounds to glean pieces of basic, desirable information about the tor
    process. As per Roger's request I've compiled a list of these pain points
    to try and improve the control protocol interface.

Motivation:

    The purpose of this proposal is to expose additional process and relay
    related information that is currently unavailable in a convenient,
    dependable, and/or platform independent way. Examples are:

      - The relay's total contributed bandwidth. This is a highly requested
        piece of information and, based on the following patch from pipe, looks
        trivial to include.
        http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html

      - The process ID of the tor process. There is a high degree of guess work
        in obtaining this. Arm for instance uses pidof, netstat, and ps yet
        still fails on some platforms, and Orbot recently got a ticket about
        its own attempt to fetch it with ps:
        https://trac.torproject.org/projects/tor/ticket/1388

    This just includes the pieces of missing information I've noticed
    (suggestions or questions of their usefulness are welcome!).

Security Implications:

    None that I'm aware of. From a security standpoint this seems decently
    innocuous.

Specification:

    The following addition would be made to the control-spec's GETINFO section:

    "relay/bw-limit" -- Effective relayed bandwidth limit.

    "relay/burst-limit" -- Effective relayed burst limit.

    "relay/read-total" -- Total bytes relayed (download).

    "relay/write-total" -- Total bytes relayed (upload).

    "relay/flags" -- Space separated listing of flags currently held by the
    relay as reported by the currently cached consensus.

    "process/user" -- Username under which the tor process is running,
    or an empty string if none exists.
    [what do we mean 'if none exists'?]
      [Implemented in 0.2.3.1-alpha.]

    "process/pid" -- Process id belonging to the main tor process, -1 if none
    exists for the platform.
      [Implemented in 0.2.3.1-alpha.]

    "process/uptime" -- Total uptime of the tor process (in seconds).

    "process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD
    signal, in seconds). [should clarify exactly which events cause an
    uptime reset]

    "process/descriptors-used" -- Count of file descriptors used.

    "process/descriptor-limit" -- File descriptor limit (getrlimit results).

    "ns/authority" -- Router status info (v2 directory style) for all
    recognized directory authorities, joined by newlines.

    "state/names" -- A space-separated list of all the keys supported by this
    version of Tor's state.

    "state/val/<key>" -- Provides the current state value belonging to the
    given key. If undefined, this provides the key's default value.

    "status/ports-seen" -- A summary of which ports we've seen connections'
    circuits connect to recently, formatted the same as the EXITS_SEEN status
    event described in Section 4.1.XX. This GETINFO option is currently
    available only for exit relays.

4.1.XX. Per-port exit stats

  The syntax is:
     "650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF

  We just generated a new summary of which ports we've seen exiting circuits
  connecting to recently. The controller could display this for the user, e.g.
  in their "relay" configuration window, to give them a sense of how they're
  being used (popularity of the various ports they exit to). Currently only
  exit relays will receive this event.

  TimeStarted is a quoted string indicating when the reported summary
  counts from (in GMT).

  The PortSummary keyword has as its argument a comma-separated, possibly
  empty set of "port=count" pairs. For example (without linebreak),
  650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43"
  PortSummary=80=16,443=8
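
  A controller could turn such an event into a port-to-count mapping along
  these lines. This is a sketch only: the function name is hypothetical, and
  it assumes the keyword form shown in the example above, accepting either
  "650-" or "650 " as the prefix.

```python
import re

# Matches the example form: 650-EXITS_SEEN TimeStarted="..." PortSummary=...
_EXITS_SEEN = re.compile(
    r'^650[- ]EXITS_SEEN TimeStarted="([^"]*)" PortSummary=(\S*)$'
)


def parse_exits_seen(line):
    """Return (time_started, {port: count}) for an EXITS_SEEN event line."""
    match = _EXITS_SEEN.match(line.strip())
    if match is None:
        raise ValueError("not an EXITS_SEEN event: %r" % line)
    time_started, summary = match.groups()
    counts = {}
    if summary:  # the set of pairs may be empty
        for pair in summary.split(","):
            port, _, count = pair.partition("=")
            counts[int(port)] = int(count)
    return time_started, counts
```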

Filename: 174-optimistic-data-server.txt
Title: Optimistic Data for Tor: Server Side
Author: Ian Goldberg
Created: 2-Aug-2010
Status: Closed
Implemented-In: 0.2.3.1-alpha

Overview:

When a SOCKS client opens a TCP connection through Tor (for an HTTP
request, for example), the query latency is about 1.5x higher than it
needs to be.  Simply, the problem is that the sequence of data flows
is this:

1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command
3. The OP sends a BEGIN cell to the Exit
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
7. The SOCKS client sends some data (the GET request, for example)
8. The OP sends a DATA cell to the Exit
9. The Exit sends the GET to the server
10. The Server returns the HTTP result to the Exit
11. The Exit sends the DATA cells to the OP
12. The OP returns the HTTP result to the SOCKS client

Note that the Exit node knows that the connection to the Server was
successful at the end of step 4, but is unable to send the HTTP query to
the server until step 9.

This proposal (as well as its upcoming sibling concerning the client
side) aims to reduce the latency by allowing:
1. SOCKS clients to optimistically send data before they are notified
    that the SOCKS connection has completed successfully
2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
    state
3. Exit nodes to accept and queue DATA cells while in the
    EXIT_CONN_STATE_CONNECTING state

This particular proposal deals with #3.

In this way, the flow would be as follows:

1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
    by data (such as the GET request)
3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
    cells
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
    request to the Server
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
    and the Server returns the HTTP result to the Exit
7. The Exit sends the DATA cells to the OP
8. The OP returns the HTTP result to the SOCKS client

Motivation:

This change will save one OP<->Exit round trip (down to one from two).
There are still two SOCKS Client<->OP round trips (negligible time) and
two Exit<->Server round trips.  Depending on the ratio of the
Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
decrease the latency by 25 to 50 percent.  Experiments validate these
predictions. [Goldberg, PETS 2010 rump session; see
https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
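
The 25 to 50 percent range follows from simple round-trip arithmetic. The
sketch below uses a hypothetical helper name, ignores the SOCKS
Client<->OP time as negligible, and treats each round trip as a fixed
cost: the current flow costs two Tor round trips plus two Internet round
trips, and the optimistic flow costs one Tor round trip plus two Internet
round trips.

```python
def optimistic_latency_saving(tor_rtt, server_rtt):
    """Fraction of request latency saved by optimistic data.

    tor_rtt    -- OP<->Exit round-trip time
    server_rtt -- Exit<->Server round-trip time (same unit as tor_rtt)
    """
    before = 2 * tor_rtt + 2 * server_rtt  # current flow
    after = 1 * tor_rtt + 2 * server_rtt   # optimistic flow
    return (before - after) / before
```

With equal round-trip times the saving is 25 percent; as the Exit<->Server
round trip becomes small relative to the Tor round trip, the saving
approaches 50 percent.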

Design:

The current code actually correctly handles queued data at the Exit; if
there is queued data in an EXIT_CONN_STATE_CONNECTING stream, that data
will be immediately sent when the connection succeeds.  If the
connection fails, the data will be correctly ignored and freed.  The
problem with the current server code is that the server currently
drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.
Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state,
bad things happen because streams in that state don't yet have
conn->write_event set, and so some existing sanity checks (any stream
with queued data is at least potentially writable) are no longer sound.

The solution is to simply not drop received DATA cells while in the
EXIT_CONN_STATE_CONNECTING state.  Also do not send SENDME cells in this
state, so that the OP cannot send more than one window's worth of data
to be queued at the Exit.  Finally, patch the sanity checks so that
streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
can pass.

If no clients ever send such optimistic data, the new code will never be
executed, and the behaviour of Tor will not change.  When clients begin
to send optimistic data, the performance of those clients' streams will
improve.

After discussion with nickm, it seems best to just have the server
version number be the indicator of whether a particular Exit supports
optimistic data.  (If a client sends optimistic data to an Exit which
does not support it, the data will be dropped, and the client's request
will fail to complete.)  What do version numbers for hypothetical future
protocol-compatible implementations look like, though?

Security implications:

Servers (for sure the Exit, and possibly others, by watching the
pattern of packets) will be able to tell that a particular client
is using optimistic data.  This will be discussed more in the sibling
proposal.

On the Exit side, servers will be queueing a little bit extra data, but
no more than one window.  Clients today can cause Exits to queue that
much data anyway, simply by establishing a Tor connection to a slow
machine, and sending one window of data.

Specification:

tor-spec section 6.2 currently says:

    The OP waits for a RELAY_CONNECTED cell before sending any data.
    Once a connection has been established, the OP and exit node
    package stream data in RELAY_DATA cells, and upon receiving such
    cells, echo their contents to the corresponding TCP stream.
    RELAY_DATA cells sent to unrecognized streams are dropped.

It is not clear exactly what an "unrecognized" stream is, but this last
sentence would be changed to say that RELAY_DATA cells received on a
stream that has processed a RELAY_BEGIN cell and has not yet issued a
RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
immediately after a RELAY_CONNECTED cell is issued for the stream, or
freed after a RELAY_END cell is issued for the stream.
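
The proposed queueing rule can be modelled in a few lines. This is a toy
model, not Tor's implementation: the class, its state names and its
method names are invented for illustration, and it leaves out the
one-window limit and SENDME handling discussed in the Design section.

```python
class ExitStreamModel:
    """Toy model of one exit-side stream that has processed RELAY_BEGIN."""

    def __init__(self):
        self.state = "CONNECTING"  # no RELAY_CONNECTED or RELAY_END yet
        self.queue = []            # optimistic data waiting for the connect
        self.sent_to_server = []   # data written to the TCP stream

    def on_relay_data(self, payload):
        if self.state == "CONNECTING":
            self.queue.append(payload)           # queue instead of dropping
        elif self.state == "OPEN":
            self.sent_to_server.append(payload)  # normal delivery
        # In the CLOSED state the data is simply discarded.

    def on_connected_issued(self):
        """RELAY_CONNECTED was issued: flush the queue immediately."""
        self.state = "OPEN"
        self.sent_to_server.extend(self.queue)
        self.queue.clear()

    def on_end_issued(self):
        """RELAY_END was issued: free anything still queued."""
        self.state = "CLOSED"
        self.queue.clear()
```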

The earlier part of this section will be addressed in the sibling
proposal.

Compatibility:

There are compatibility issues, as mentioned above.  OPs MUST NOT send
optimistic data to Exit nodes whose version numbers predate (something).
OPs MAY send optimistic data to Exit nodes whose version numbers match
or follow that value.  (But see the question about independent server
reimplementations, above.)

Implementation:

Here is a simple patch.  It seems to work with both regular streams and
hidden services, but there may be other corner cases I'm not aware of.
(Do streams used for directory fetches, hidden services, etc. take a
different code path?)

diff --git a/src/or/connection.c b/src/or/connection.c
index 7b1493b..f80cd6e 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
     return;
   }
 
-  connection_start_writing(conn);
+  /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
+   * state, we don't want to try to write it right away, since
+   * conn->write_event won't be set yet.  Otherwise, write data from
+   * this conn as the socket is available. */
+  if (conn->state != EXIT_CONN_STATE_RESOLVING) {
+      connection_start_writing(conn);
+  }
   if (zlib) {
     conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
   } else {
@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
     tor_assert(conn->s < 0);
 
   if (conn->outbuf_flushlen > 0) {
-    tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
+    /* With optimistic data, we may have queued data in
+     * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
+     * */
+    tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
+	    connection_is_writing(conn) || conn->write_blocked_on_bw ||
             (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
   }
 
diff --git a/src/or/relay.c b/src/or/relay.c
index fab2d88..e45ff70 100644
--- a/src/or/relay.c
+++ b/src/or/relay.c
@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
   relay_header_t rh;
   unsigned domain = layer_hint?LD_APP:LD_EXIT;
   int reason;
+  int optimistic_data = 0;  /* Set to 1 if we receive data on a stream
+			       that's in the EXIT_CONN_STATE_RESOLVING
+			       or EXIT_CONN_STATE_CONNECTING states.*/
 
   tor_assert(cell);
   tor_assert(circ);
@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
   /* either conn is NULL, in which case we've got a control cell, or else
    * conn points to the recognized stream. */
 
-  if (conn && !connection_state_is_open(TO_CONN(conn)))
-    return connection_edge_process_relay_cell_not_open(
-             &rh, cell, circ, conn, layer_hint);
+  if (conn && !connection_state_is_open(TO_CONN(conn))) {
+    if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
+	    conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
+	rh.command == RELAY_COMMAND_DATA) {
+	/* We're going to allow DATA cells to be delivered to an exit
+	 * node in state EXIT_CONN_STATE_CONNECTING or
+	 * EXIT_CONN_STATE_RESOLVING.  This speeds up HTTP, for example. */
+	log_warn(domain, "Optimistic data received.");
+	optimistic_data = 1;
+    } else {
+	return connection_edge_process_relay_cell_not_open(
+		 &rh, cell, circ, conn, layer_hint);
+    }
+  }
 
   switch (rh.command) {
     case RELAY_COMMAND_DROP:
@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
       log_debug(domain,"circ deliver_window now %d.", layer_hint ?
                 layer_hint->deliver_window : circ->deliver_window);
 
-      circuit_consider_sending_sendme(circ, layer_hint);
+      if (!optimistic_data) {
+	  circuit_consider_sending_sendme(circ, layer_hint);
+      }
 
       if (!conn) {
         log_info(domain,"data cell dropped, unknown stream (streamid %d).",
@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
       stats_n_data_bytes_received += rh.length;
       connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
                               rh.length, TO_CONN(conn));
-      connection_edge_consider_sending_sendme(conn);
+      if (!optimistic_data) {
+	  connection_edge_consider_sending_sendme(conn);
+      }
       return 0;
     case RELAY_COMMAND_END:
       reason = rh.length > 0 ?

Performance and scalability notes:

There may be more RAM used at Exit nodes, as mentioned above, but it is
transient.
Filename: 175-automatic-node-promotion.txt
Title: Automatically promoting Tor clients to nodes
Author: Steven Murdoch
Created: 12-Mar-2010
Status: Rejected

1. Overview

   This proposal describes how Tor clients could determine when they
   have sufficient bandwidth capacity and are sufficiently reliable to
   become either bridges or Tor relays. When they meet these
   criteria, they will automatically promote themselves, based on user
   preferences. The proposal also defines the new controller messages
   and options which will control this process.

   Note that for the moment, only transitions between client and
   bridge are being considered. Transitions to public relay will
   be considered at a future date, but will use the same
   infrastructure for measuring capacity and reliability.

2. Motivation and history

   Tor has a growing user-base and one of the major impediments to the
   quality of service offered is the lack of network capacity. This is
   particularly the case for bridges, because these are gradually
   being blocked, and thus no longer of use to people within some
   countries. By automatically promoting Tor clients to bridges, and
   perhaps also to full public relays, this proposal aims to solve
   these problems.

   Only Tor clients which are sufficiently useful should be promoted,
   and the process of determining usefulness should be performed
   without reporting the existence of the client to the central
   authorities. The criteria used for determining usefulness will be
   in terms of bandwidth capacity and uptime, but parameters should be
   specified in the directory consensus. State stored at the client
   should be in no more detail than necessary, to prevent sensitive
   information being recorded.

3. Design

3.x Opt-in state model

   Tor can be in one of five node-promotion states:

   - off (O): Currently a client, and will stay as such
   - auto (A): Currently a client, but will consider promotion
   - bridge (B): Currently a bridge, and will stay as such
   - auto-bridge (AB): Currently a bridge, but will consider promotion
   - relay (R): Currently a public relay, and will stay as such

   The state can be fully controlled from the configuration file or
   controller, but the normal state transitions are as follows:

   Any state -> off: User has opted out of node promotion
   Off -> any state: Only permitted with user consent

   Auto -> auto-bridge: Tor has detected that it is sufficiently
    reliable to be a *bridge*
   Auto -> bridge: Tor has detected that it is sufficiently reliable
    to be a *relay*, but the user has chosen to remain a *bridge*
   Auto -> relay: Tor has detected that it is sufficiently reliable
    to be a *relay*, and will skip being a *bridge*
   Auto-bridge -> relay: Tor has detected that it is sufficiently
    reliable to be a *relay*

   Note that this model does not support automatic demotion. If this
   is desirable, there should be some memory as to whether the
   previous state was relay, bridge, or auto-bridge. Otherwise the
   user may be prompted to become a relay, although he has opted to
   only be a bridge.
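
   The normal transitions listed above can be written down as a small
   lookup table. The following is a sketch with invented names
   (PromotionState, transition_allowed); it covers only the "normal"
   transitions, not arbitrary changes made through the configuration file
   or controller.

```python
from enum import Enum


class PromotionState(Enum):
    OFF = "O"
    AUTO = "A"
    BRIDGE = "B"
    AUTO_BRIDGE = "AB"
    RELAY = "R"


S = PromotionState

# Automatic promotions named in the state model.
AUTOMATIC_TRANSITIONS = {
    (S.AUTO, S.AUTO_BRIDGE),
    (S.AUTO, S.BRIDGE),
    (S.AUTO, S.RELAY),
    (S.AUTO_BRIDGE, S.RELAY),
}


def transition_allowed(old, new, user_consent=False):
    """Return True if old -> new is one of the normal transitions."""
    if new is S.OFF:
        return True          # any state -> off: the user opted out
    if old is S.OFF:
        return user_consent  # off -> any state: needs user consent
    return (old, new) in AUTOMATIC_TRANSITIONS
```

   Because there is no automatic demotion, a pair such as
   (RELAY, AUTO_BRIDGE) is rejected by this table.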

3.x User interaction policy

   There are a variety of options in how to involve the user in the
   decision as to whether and when to perform node promotion. The
   choice also may be different when Tor is running from Vidalia (and
   thus can readily prompt the user for information), and standalone
   (where Tor can only log messages, which may or may not be read).

   The option requiring minimal user interaction is to automatically
   promote nodes according to reliability, and allow the user to opt
   out, by changing settings in the configuration file or Vidalia user
   interface.

   Alternatively, if a user interface is available, Tor could prompt
   the user when it detects that a transition is available, and allow
   the user to choose which of the available options to select. If
   Vidalia is not available, it still may be possible to solicit an
   email address on install, and contact the operator to ask whether
   a transition to bridge or relay is permitted.

   Finally, Tor could by default not make any transition, and the user
   would need to opt in by stating the maximum level (bridge or
   relay) to which the node may automatically promote itself.

3.x Performance monitoring model

   To prevent a large number of clients activating as relays, but
   being too unreliable to be useful, clients should measure their
   performance. If this performance meets the parameterized acceptance
   criteria, a client should consider promotion. To measure
   reliability, this proposal adopts a simple user model:

    - A user decides to use Tor at times which follow a Poisson
      distribution
    - At each time, the user will be happy if the bridge chosen has
      adequate bandwidth and is reachable
    - If the chosen bridge is down or slow too many times, the user
      will consider Tor to be bad

   If we additionally assume that the recent history of relay
   performance matches the current performance, we can measure
   reliability by simulating this simple user.

   The following parameters are distributed to clients in the
   directory consensus:

     - min_bandwidth: Minimum self-measured bandwidth for a node to be
       considered useful, in bytes per second
     - check_period: How long, in seconds, to wait between checking
       reachability and bandwidth (on average)
     - num_samples: Number of recent samples to keep
     - num_useful: Minimum number of recent samples where the node was
       reachable and had at least min_bandwidth capacity, for a client
       to consider promoting to a bridge

   A different set of parameters may be used for considering when to
   promote a bridge to a full relay, but this will be the subject of a
   future revision of the proposal.

3.x Performance monitoring algorithm

   The simulation described above can be implemented as follows:

   Every 60 seconds:
     1. Tor generates a random floating point number x in
        the interval [0, 1).
     2. If x > (1 / (check_period / 60)) GOTO end; otherwise:
     3. Tor sets the value last_check to the current_time (in seconds)
     4. Tor measures reachability
     5. If the client is reachable, Tor measures its bandwidth
     6. If the client is reachable and the bandwidth is >=
        min_bandwidth, the test has succeeded, otherwise it has failed.
     7. Tor adds the test result to the end of a ring-buffer containing
        the last num_samples results: measurements_results
     8. Tor saves last_check and measurements_results to disk
     9. If the length of measurements_results == num_samples and
        the number of successes >= num_useful, Tor should consider
        promotion to a bridge
   end.
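
   One way to express a single 60-second iteration in code is sketched
   below. The class and function names are hypothetical, the reachability
   and bandwidth measurements are passed in as callables, and step 8
   (saving to disk) is left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ConsensusParams:
    min_bandwidth: int  # bytes per second
    check_period: int   # seconds, average time between checks
    num_samples: int
    num_useful: int


@dataclass
class MonitorState:
    last_check: float = 0.0
    measurements_results: deque = field(default_factory=deque)


def tick(state, params, now, rng, is_reachable, measure_bandwidth):
    """Run one iteration; return True if promotion should be considered."""
    x = rng.random()                                   # step 1
    if x > 1 / (params.check_period / 60):             # step 2
        return False
    state.last_check = now                             # step 3
    reachable = bool(is_reachable())                   # step 4
    bandwidth = measure_bandwidth() if reachable else 0    # step 5
    success = reachable and bandwidth >= params.min_bandwidth  # step 6
    results = state.measurements_results
    results.append(success)                            # step 7
    while len(results) > params.num_samples:
        results.popleft()                              # keep newest samples
    return (len(results) == params.num_samples
            and sum(results) >= params.num_useful)     # step 9
```

   Passing the random number generator as an argument makes the sampling
   decision easy to test with a stub.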

   When Tor starts, it must fill in the samples for which it was not
   running. This can only happen once the consensus has downloaded,
   because the value of check_period is needed.

      1. Tor generates a random number y from the Poisson distribution [1]
         with lambda = (current_time - last_check) * (1 / check_period)
      2. Tor sets the value last_check to the current_time (in seconds)
      3. Add y test failures to the ring buffer measurements_results
      4. Tor saves last_check and measurements_results to disk
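
   A possible sketch of this start-up step follows. The function name is
   made up, and saving to disk is omitted. Rather than one of the
   generators referenced in [1], it draws the Poisson count by counting
   unit-rate exponential arrivals that fall before lambda, and it stops
   counting once num_samples failures have been generated, since older
   entries would be pushed out of the ring buffer anyway.

```python
import random
from collections import deque


def backfill_failures(results, last_check, now, check_period,
                      num_samples, rng):
    """Add failures for the time Tor was not running.

    results is the ring buffer (a deque of booleans); the new value of
    last_check is returned.
    """
    lam = (now - last_check) * (1 / check_period)

    # Number of arrivals of a rate-1 Poisson process before time lam,
    # capped at num_samples.
    y = 0
    t = rng.expovariate(1.0)
    while t < lam and y < num_samples:
        y += 1
        t += rng.expovariate(1.0)

    results.extend([False] * y)         # step 3: y test failures
    while len(results) > num_samples:
        results.popleft()               # keep only the newest samples
    return now                          # step 2: new last_check
```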

   In this way, a Tor client will measure its bandwidth and
   reachability every check_period seconds, on average. Provided
   check_period is sufficiently greater than a minute (say, at least an
   hour), the times of check will follow a Poisson distribution. [2]

   While this does require Tor to record the state of a client
   over time, this does not leak much information. Only a binary
   reachable/non-reachable is stored, and the timing of samples becomes
   increasingly fuzzy as the data becomes less recent.

   On IP address changes, Tor should clear the ring-buffer, because
   from the perspective of users with the old IP address, this node
   might as well be a new one with no history. This policy may change
   once we start allowing the bridge authority to hand out new IP
   addresses given the fingerprint.
   [Perhaps another consensus param? Also, this means we save previous
    IP address in our state file, yes? -RD]

3.x Bandwidth measurement

   Tor needs to measure its bandwidth to test its usefulness as a
   bridge. A non-intrusive way to do this would be to passively measure
   the peak data transfer rate since the last reachability test. Once
   this exceeds min_bandwidth, Tor can set a flag that this node
   currently has sufficient bandwidth to pass the bandwidth component
   of the upcoming performance measurement.

   For the first version we may simply skip the bandwidth test,
   because the existing reachability test sends 500 kB over several
   circuits, and checks whether the node can transfer at least 50
   kB/s.  This is probably good enough for a bridge, so this test
   might be sufficient to record a success in the ring buffer.

3.x New options

3.x New controller message

4. Migration plan

   We should start by setting a high bandwidth and uptime requirement
   in the consensus, so as to avoid overloading the bridge authority
   with too many bridges. Once we are confident our systems can scale,
   the criteria can be gradually shifted down to gain more bridges.

5. Related proposals

6. Open questions:

   - What user interaction policy should we take?

   - When (if ever) should we turn a relay into an exit relay?

   - What should the rate limits be for auto-promoted bridges/relays?
     Should we prompt the user for this?

   - Perhaps the bridge authority should tell potential bridges
     whether to enable themselves, by taking into account whether
     their IP address is blocked

   - How do we explain the possible risks of running a bridge/relay?
     * Use of bandwidth/congestion
     * Publication of IP address
     * Blocking from IRC (even for non-exit relays)

   - What feedback should we give to bridge relays, to encourage them
     e.g. number of recent users (what about reserve bridges)?

   - Can clients back off from doing these tests? (Yes, we should do
     this.)

[1] For algorithms to generate random numbers from the Poisson
    distribution, see: http://en.wikipedia.org/wiki/Poisson_distribution#Generating_Poisson-distributed_random_variables
[2] "The sample size n should be equal to or larger than 20 and the
     probability of a single success, p, should be smaller than or equal to
     .05. If n >= 100, the approximation is excellent if np is also <= 10."
    http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm (e-Handbook of Statistical Methods)

% vim: spell ai et:
Filename: 176-revising-handshake.txt
Title: Proposed version-3 link handshake for Tor
Author: Nick Mathewson
Created: 31-Jan-2011
Status: Closed
Target: 0.2.3
Supersedes: 169

1. Overview

   I propose a (mostly) backward-compatible change to the Tor
   connection establishment protocol to avoid the use of TLS
   renegotiation, to avoid certain protocol fingerprinting attacks,
   and to make it easier to write Tor clients and servers.

   Rather than doing a TLS renegotiation to exchange certificates
   and authenticate the original handshake, this proposal takes an
   approach similar to Steven Murdoch's proposal 124 and my old
   proposal 169, and uses Tor cells to finish authenticating the
   parties' identities once the initial TLS handshake is finished.

   I discuss some alternative design choices and why I didn't make
   them in section 7; please have a quick look there before
   telling me that something is pointless or makes no sense.

   Terminological note: I use "client" or "initiator" below to mean
   the Tor instance (a client or a bridge or a relay) that initiates a
   TLS connection, and "server" or "responder" to mean the Tor
   instance (a bridge or a relay) that accepts it.

2. History and Motivation

   The _goals_ of the Tor link handshake have remained basically uniform
   since our earliest versions.  They are:

      * Provide data confidentiality, data integrity
      * Provide forward secrecy
      * Allow responder authentication or bidirectional authentication.
      * Try to look like some popular too-important-to-block-at-whim
        encryption protocol, to avoid fingerprinting and censorship.
      * Try to be implementable -- on the client side at least! --
        by as many TLS implementations as possible.

   When we added the v2 handshake, we added another goal:

      * Remain compatible with older versions of the handshake
        protocol.

   In the original Tor TLS connection handshake protocol ("V1", or
   "two-cert"), parties that wanted to authenticate provided a
   two-cert chain of X.509 certificates during the handshake setup
   phase.  Every party that wanted to authenticate sent these
   certificates.  The security properties of this protocol are just
   fine; the problem was that our behavior of sending
   two-certificate chains made Tor easy to identify.

   In the current Tor TLS connection handshake protocol ("V2", or
   "renegotiating"), the parties begin with a single certificate
   sent from the server (responder) to the client (initiator), and
   then renegotiate to a two-certs-from-each-authenticating-party
   handshake.
   We made this change to make Tor's handshake look like a browser
   speaking SSL to a webserver.  (See proposal 130, and
   tor-spec.txt.)  So from an observer's point of view, two parties
   performing the V2 handshake begin by making a regular TLS
   handshake with a single certificate, then renegotiate
   immediately.

   To tell whether to use the V1 or V2 handshake, the servers look
   at the list of ciphers sent by the client.  (This is ugly, but
   there's not much else in the ClientHello that they can look at.)
   If the list contains any cipher not used by the V1 protocol, the
   server sends back a single cert and expects a renegotiation.  If
   the client gets back a single cert, then it withholds its own
   certificates until the TLS renegotiation phase.

   In other words, V2-supporting initiator behavior currently looks
   like this:

      - Begin TLS negotiation with V2 cipher list; wait for
        certificate(s).
      - If we get a certificate chain:
         - Then we are using the V1 handshake.  Send our own
           certificate chain as part of this initial TLS handshake
           if we want to authenticate; otherwise, send no
           certificates.  When the handshake completes, check
           certificates.  We are now mutually authenticated.

        Otherwise, if we get just a single certificate:
         - Then we are using the V2 handshake.  Do not send any
           certificates during this handshake.
         - When the handshake is done, immediately start a TLS
           renegotiation.  During the renegotiation, expect
           a certificate chain from the server; send a certificate
           chain of our own if we want to authenticate ourselves.
         - After the renegotiation, check the certificates. Then
           send (and expect) a VERSIONS cell from the other side to
           establish the link protocol version.

   And V2-supporting responder behavior now looks like this:

      - When we get a TLS ClientHello request, look at the cipher
        list.
      - If the cipher list contains only the V1 ciphersuites:
         - Then we're doing a V1 handshake.  Send a certificate
           chain.  Expect a possible client certificate chain in
           response.
        Otherwise, if we get other ciphersuites:
         - We're using the V2 handshake.  Send back a single
           certificate and let the handshake complete.
         - Do not accept any data until the client has renegotiated.
         - When the client is renegotiating, send a certificate
           chain, and expect (possibly multiple) certificates in
           reply.
         - Check the certificates when the renegotiation is done.
           Then exchange VERSIONS cells.

   Late in 2009, researchers found a flaw in most applications' use
   of TLS renegotiation: Although TLS renegotiation does not
   reauthenticate any information exchanged before the renegotiation
   takes place, many applications were treating it as though it did,
   and assuming that data sent _before_ the renegotiation was
   authenticated with the credentials negotiated _during_ the
   renegotiation.  This problem was exacerbated by the fact that
   most TLS libraries don't actually give you an obvious good way to
   tell where the renegotiation occurred relative to the datastream.
   Tor wasn't directly affected by this vulnerability, but the
   aftermath hurts us in a few ways:

      1) OpenSSL has disabled renegotiation by default, and created
         a "yes we know what we're doing" option we need to set to
         turn it back on.  (Two options, actually: one for openssl
         0.9.8l and one for 0.9.8m and later.)

      2) Some vendors have removed all renegotiation support from
         their versions of OpenSSL entirely, forcing us to tell
         users to either replace their versions of OpenSSL or to
         link Tor against a hand-built one.

      3) Because of 1 and 2, I'd expect TLS renegotiation to become
         rarer and rarer in the wild, making our own use stand out
         more.

   Furthermore, there are other issues related to TLS and
   fingerprinting that we want to fix in any revised handshake:

      1) We should make it easier to use self-signed certs, or maybe
         even existing HTTPS certificates, for the server side
         handshake, since most non-Tor SSL handshakes use either
         self-signed certificates or CA-signed certificates.

      2) We should allow other changes in our use of TLS and in our
         certificates so as to resist fingerprinting based on how
         our certificates look.  (See proposal 179.)

3. Design

3.1. The view in the large

   Taking a cue from Steven Murdoch's proposal 124 and my old
   proposal 169, I propose that we move the work currently done by
   the TLS renegotiation step (that is, authenticating the parties
   to one another) and do it with Tor cells instead of with TLS
   alone.

   This section outlines the protocol; we go into more detail below.

   To tell the client that it can use the new cell-based
   authentication system, the server sends a "V3 certificate" during
   the initial TLS handshake.  (More on what makes a certificate
   "v3" below.)  If the client recognizes the format of the
   certificate and decides to pursue the V3 handshake, then instead
   of renegotiating immediately on completion of the initial TLS
   handshake, the client instead sends a VERSIONS cell (and the
   negotiation begins).

   So the flowchart on the server side is:

      Wait for a ClientHello.
      If the client sends a ClientHello that indicates V1:
          - Send a certificate chain.
          - When the TLS handshake is done, if the client sent us a
            certificate chain, then check it.
      If the client sends a ClientHello that indicates V2 or V3:
          - Send a self-signed certificate or a CA-signed certificate
          - When the TLS handshake is done, wait for renegotiation or data.
            - If renegotiation occurs, the client is V2: send a
              certificate chain and maybe receive one.  Check the
              certificate chain as in V1.
            - If the client sends data without renegotiating, it is
              starting the V3 handshake.  Proceed with the V3
              handshake as below.
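
   As an illustration only, the server-side decision above could be
   sketched in Python as follows.  The names Handshake and
   server_handshake_version, the string values for the client's first
   action, and the cipher names used in any example call are all
   hypothetical; the set of V1 ciphersuites is passed in rather than
   assumed.

```python
from enum import Enum
from typing import Iterable, Optional


class Handshake(Enum):
    V1 = 1
    V2 = 2
    V3 = 3


def server_handshake_version(client_ciphers: Iterable[str],
                             v1_ciphers: Iterable[str],
                             first_client_action: Optional[str] = None) -> Handshake:
    """Classify a connection from the server's point of view.

    client_ciphers: ciphersuites listed in the ClientHello.
    v1_ciphers: the ciphersuites used by the V1 protocol (supplied by caller).
    first_client_action: "renegotiate" or "data", i.e. what the client did
        first once the initial TLS handshake completed (hypothetical labels).
    """
    # A ClientHello listing only V1 ciphersuites indicates V1.
    if set(client_ciphers) <= set(v1_ciphers):
        return Handshake.V1
    # Otherwise the client is V2 or V3; its next move tells us which.
    if first_client_action == "renegotiate":
        return Handshake.V2
    if first_client_action == "data":
        return Handshake.V3
    raise ValueError("need to see renegotiation or data before deciding")
```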

   And the client-side flowchart is:

      - Send a ClientHello with a set of ciphers that indicates V2/V3.
      - After the handshake is done:
        - If the server sent us a certificate chain, check it: we
          are using the V1 handshake.
        - If the server sent us a single "V2 certificate", we are
          using the v2 handshake: the client begins to renegotiate
          and proceeds as before.
        - Finally, if the server sent us a "v3 certificate", we are
          doing the V3 handshake below.

   And the cell-based part of the V3 handshake, in summary, is:

    C<->S: TLS handshake where S sends a "v3 certificate"

    In TLS:

       C->S: VERSIONS cell
       S->C: VERSIONS cell, CERT cell, AUTH_CHALLENGE cell, NETINFO cell

       C->S: Optionally: CERT cell, AUTHENTICATE cell
       C->S: NETINFO cell

   A "CERT" cell contains a set of certificates; an "AUTHENTICATE"
   cell authenticates the client to the server.  More on these
   later.

3.2. Distinguishing V2 and V3 certificates

   In the protocol outline above, we require that the client can
   distinguish between v2 certificates (that is, those sent by
   current servers) and v3 certificates.  We further require that
   existing clients will accept v3 certificates as they currently
   accept v2 certificates.

   Fortunately, current certificates have a few characteristics that
   make them fairly well-mannered as it is.  We say that a certificate
   indicates a V2-only server if ALL of the following hold:
      * The certificate is not self-signed.
      * There is no DN field set in the certificate's issuer or
        subject other than "commonName".
      * The commonNames of the issuer and subject both end with
        ".net"
      * The public modulus is at most 1024 bits long.

   Otherwise, the client should assume that the server supports the
   V3 handshake.
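
   A minimal sketch of this test, assuming the certificate has
   already been parsed into a small summary record.  The CertSummary
   type and its field names are hypothetical and stand in for
   whatever the TLS library exposes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CertSummary:
    """Hypothetical pre-parsed view of the server's TLS certificate."""
    self_signed: bool
    issuer_dn_fields: Dict[str, str]    # e.g. {"commonName": "www.example.net"}
    subject_dn_fields: Dict[str, str]
    modulus_bits: int


def indicates_v2_only(cert: CertSummary) -> bool:
    """Return True only if ALL four V2-only conditions hold."""
    only_common_name = (set(cert.issuer_dn_fields) <= {"commonName"}
                        and set(cert.subject_dn_fields) <= {"commonName"})
    both_end_in_net = (
        cert.issuer_dn_fields.get("commonName", "").endswith(".net")
        and cert.subject_dn_fields.get("commonName", "").endswith(".net"))
    return (not cert.self_signed
            and only_common_name
            and both_end_in_net
            and cert.modulus_bits <= 1024)


def server_supports_v3(cert: CertSummary) -> bool:
    """Anything that is not a V2-only certificate is treated as V3."""
    return not indicates_v2_only(cert)
```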

   To the best of my knowledge, current clients will behave properly
   on receiving non-v2 certs during the initial TLS handshake so
   long as they eventually get the correct V2 cert chain during the
   renegotiation.

   The v3 requirements are easy to meet: any certificate designed to
   resist fingerprinting will likely be self-signed, or if it's
   signed by a CA, then the issuer will surely have more DN fields
   set.  Certificates that aren't trying to resist fingerprinting
   can trivially become v3 by using a CN that doesn't end with .net,
   or using a key longer than 1024 bits.

3.3. Authenticating via Tor cells: server authentication

   Once the TLS handshake is finished, if the client renegotiates,
   then the server should go on as it does currently.

   If the client implements this proposal, however, and the server
   has shown it can understand the V3+ handshake protocol, the
   client immediately sends a VERSIONS cell to the server
   and waits to receive a VERSIONS cell in return.  We negotiate
   the Tor link protocol version _before_ we proceed with the
   negotiation, in case we need to change the authentication
   protocol in the future.

   Once either party has seen the VERSIONS cell from the other, it
   knows which version they will pick (that is, the highest version
   shared by both parties' VERSIONS cells).  All Tor instances using
   the handshake protocol described in 3.2 MUST support at least
   link protocol version 3 as described here.  If a version lower
   than 3 is negotiated with the V3 handshake in place, a Tor
   instance MUST close the connection.
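
   The version selection rule can be sketched as below.  The function
   name and the convention of returning None to mean "close the
   connection" are assumptions made for the example.

```python
from typing import Iterable, Optional

MIN_LINK_VERSION_FOR_V3_HANDSHAKE = 3


def negotiate_link_version(ours: Iterable[int],
                           theirs: Iterable[int]) -> Optional[int]:
    """Pick the highest link protocol version listed in both VERSIONS cells.

    Returns None when the connection must be closed: either no version is
    shared, or the best shared version is below 3 even though the V3
    handshake is in use.
    """
    shared = set(ours) & set(theirs)
    if not shared:
        return None
    best = max(shared)
    if best < MIN_LINK_VERSION_FOR_V3_HANDSHAKE:
        return None
    return best
```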

   On learning the link protocol, the server then sends the client a
   CERT cell, an AUTH_CHALLENGE cell, and a NETINFO cell.  If the
   client wants to authenticate to the server, it sends a CERT cell,
   an AUTHENTICATE cell, and a NETINFO cell; or it may simply send a
   NETINFO cell if it does not want to authenticate.

   The CERT cell describes the keys that a Tor instance is claiming
   to have.  It is a variable-length cell.  Its payload format is:

        N: Number of certs in cell            [1 octet]
        N times:
           CertType                           [1 octet]
           CLEN                               [2 octets]
           Certificate                        [CLEN octets]

   Any extra octets at the end of a CERT cell MUST be ignored.

     CertType values are:
        1: Link key certificate from RSA1024 identity
        2: RSA1024 Identity certificate
        3: RSA1024 AUTHENTICATE cell link certificate

   The certificate format is X509.
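
   For illustration, the CERT cell payload could be encoded and
   parsed as below.  The sketch assumes that CLEN is big-endian
   (network order), which this proposal does not state explicitly,
   and it treats each certificate as an opaque byte string.  The
   function names are hypothetical.

```python
import struct
from typing import List, Tuple


def encode_cert_cell(certs: List[Tuple[int, bytes]]) -> bytes:
    """Build a CERT cell payload from (CertType, certificate bytes) pairs."""
    out = [struct.pack("!B", len(certs))]           # N: 1 octet
    for cert_type, cert in certs:
        out.append(struct.pack("!BH", cert_type, len(cert)))  # CertType, CLEN
        out.append(cert)
    return b"".join(out)


def parse_cert_cell(payload: bytes) -> List[Tuple[int, bytes]]:
    """Parse a CERT cell payload; extra octets at the end are ignored."""
    if len(payload) < 1:
        raise ValueError("empty CERT cell")
    n = payload[0]
    offset = 1
    certs = []
    for _ in range(n):
        if offset + 3 > len(payload):
            raise ValueError("truncated certificate header")
        cert_type, clen = struct.unpack_from("!BH", payload, offset)
        offset += 3
        if offset + clen > len(payload):
            raise ValueError("truncated certificate body")
        certs.append((cert_type, payload[offset:offset + clen]))
        offset += clen
    return certs
```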

   To authenticate the server, the client MUST check the following:
     * The CERT cell contains exactly one CertType 1 "Link" certificate.
     * The CERT cell contains exactly one CertType 2 "ID" certificate.
     * Both certificates have validAfter and validUntil dates that
       are not expired.
     * The certified key in the Link certificate matches the
       link key that was used to negotiate the TLS connection.
     * The certified key in the ID certificate is a 1024-bit RSA key.
     * The certified key in the ID certificate was used to sign both
       certificates.
     * The link certificate is correctly signed with the key in the
       ID certificate
     * The ID certificate is correctly self-signed.

   If all of these conditions hold, then the client knows that it is
   connected to the server whose identity key is certified in the ID
   certificate.  If any condition does not hold, the client closes
   the connection.  If the client wanted to connect to a server with
   a different identity key, the client closes the connection.
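
   The checks above can be sketched as follows, assuming that X509
   parsing and signature verification have already been done by a
   library.  ParsedCert and all of its fields are hypothetical; in
   particular, verified_signer_key stands for "the public key under
   which this certificate's signature was found to verify".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsedCert:
    """Hypothetical result of parsing and verifying one certificate."""
    cert_type: int              # CertType from the CERT cell
    certified_key: bytes        # public key certified by this certificate
    verified_signer_key: bytes  # key whose signature on it verified
    key_bits: int               # size of certified_key
    not_before: int             # validity window, seconds since the epoch
    not_after: int


def authenticated_server_identity(certs: List[ParsedCert],
                                  tls_link_key: bytes,
                                  now: int) -> Optional[bytes]:
    """Return the server's identity key if every check passes, else None."""
    link = [c for c in certs if c.cert_type == 1]
    ident = [c for c in certs if c.cert_type == 2]
    if len(link) != 1 or len(ident) != 1:
        return None
    link_cert, id_cert = link[0], ident[0]
    for cert in (link_cert, id_cert):
        if not (cert.not_before <= now <= cert.not_after):
            return None
    if link_cert.certified_key != tls_link_key:
        return None
    if id_cert.key_bits != 1024:
        return None
    if link_cert.verified_signer_key != id_cert.certified_key:
        return None
    if id_cert.verified_signer_key != id_cert.certified_key:
        return None
    return id_cert.certified_key
```

   A caller that expected a particular identity key would compare it
   with the returned value and close the connection on a mismatch or
   on None.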

   An AUTH_CHALLENGE cell is a variable-length cell with the following
   fields:
       Challenge [32 octets]
       N_Methods [2 octets]
       Methods   [2 * N_Methods octets]

   It is sent from the server to the client.  Clients MUST ignore
   unexpected bytes at the end of the cell.  Servers MUST generate
   every challenge using a strong RNG or PRNG.

   The Challenge field is a randomly generated string that the
   client must sign (a hash of) as part of authenticating.  The
   methods are the authentication methods that the server will
   accept.  Only one authentication method is defined right now; see
   3.4 below.
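
   A sketch of building and parsing the AUTH_CHALLENGE payload.  It
   assumes big-endian 2-octet integers, which this proposal does not
   state explicitly, and uses os.urandom as the strong random source.
   The function names are hypothetical.

```python
import os
import struct
from typing import List, Optional, Tuple

CHALLENGE_LEN = 32


def build_auth_challenge(methods: List[int],
                         challenge: Optional[bytes] = None) -> bytes:
    """Build the payload: Challenge, N_Methods, then the method list."""
    if challenge is None:
        challenge = os.urandom(CHALLENGE_LEN)   # fresh for every connection
    if len(challenge) != CHALLENGE_LEN:
        raise ValueError("challenge must be 32 octets")
    body = struct.pack("!H", len(methods))
    body += b"".join(struct.pack("!H", m) for m in methods)
    return challenge + body


def parse_auth_challenge(payload: bytes) -> Tuple[bytes, List[int]]:
    """Parse the payload; unexpected bytes at the end are ignored."""
    if len(payload) < CHALLENGE_LEN + 2:
        raise ValueError("truncated AUTH_CHALLENGE cell")
    challenge = payload[:CHALLENGE_LEN]
    (n_methods,) = struct.unpack_from("!H", payload, CHALLENGE_LEN)
    end = CHALLENGE_LEN + 2 + 2 * n_methods
    if len(payload) < end:
        raise ValueError("truncated method list")
    methods = list(struct.unpack_from("!%dH" % n_methods, payload,
                                      CHALLENGE_LEN + 2))
    return challenge, methods
```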

3.4. Authenticating via Tor cells: Client authentication

   A client does not need to authenticate to the server.  If it
   does not wish to, it responds to the server's valid CERT cell by
   sending a NETINFO cell: once it has gotten a valid NETINFO cell,
   the client should consider the connection open, and the
   server should consider the connection as opened by an
   unauthenticated client.

   If a client wants to authenticate, it responds to the
   AUTH_CHALLENGE cell with a CERT cell and an AUTHENTICATE cell.
   The CERT cell is as a server would send, except that instead of
   sending a CertType 1 cert for an arbitrary link certificate, the
   client sends a CertType 3 cert for an RSA AUTHENTICATE key.
   (This difference is because we allow any link key type on a TLS
   link, but the protocol described here will only work for 1024-bit
   RSA keys.  A later protocol version should extend the protocol
   here to work with non-1024-bit, non-RSA keys.)

   An AUTHENTICATE cell contains the following fields:

        AuthType                              [2 octets]
        AuthLen                               [2 octets]
        Authentication                        [AuthLen octets]

   Servers MUST ignore extra bytes at the end of an AUTHENTICATE
   cell.  If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the
   Authentication contains the following:

       TYPE: The characters "AUTH0001" [8 octets]
       CID: A SHA256 hash of the client's RSA1024 identity key [32 octets]
       SID: A SHA256 hash of the server's RSA1024 identity key [32 octets]
       SLOG: A SHA256 hash of all bytes sent from the server to the client
         as part of the negotiation up to and including the
         AUTH_CHALLENGE cell; that is, the VERSIONS cell,
         the CERT cell, the AUTH_CHALLENGE cell, and any padding cells.
         [32 octets]
       CLOG: A SHA256 hash of all bytes sent from the client to the
         server as part of the negotiation so far; that is, the
         VERSIONS cell and the CERT cell and any padding cells. [32 octets]
       SCERT: A SHA256 hash of the server's TLS link
         certificate. [32 octets]
       TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the
         secret key, of the following:
           - client_random, as sent in the TLS Client Hello
           - server_random, as sent in the TLS Server Hello
           - the NUL terminated ASCII string:
             "Tor V3 handshake TLS cross-certification"
          [32 octets]
       TIME: The time of day in seconds since the POSIX epoch. [8 octets]
       RAND: A 16 byte value, randomly chosen by the client [16 octets]
       SIG: A signature of a SHA256 hash of all the previous fields
         using the client's "Authenticate" key as presented.  (As
         always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt
         section 0.3.)
          [variable length]

   To check the AUTHENTICATE cell, a server checks that all fields
   from TYPE through TLSSECRETS contain their unique correct values
   as described above, and then verifies the signature.  The server
   MUST ignore any extra bytes in the signed data after the SHA256
   hash.
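
   As a sketch, the fixed-length fields of an AuthType 1
   Authentication (everything before SIG) could be assembled as
   below.  Several things here are assumptions rather than statements
   of this proposal: that the three HMAC inputs are simply
   concatenated in the order listed, that integers are big-endian,
   and that keys and certificates are hashed in whatever byte
   encoding the caller supplies.  The RSA signature itself is left
   out and passed in as opaque bytes.  All function names are
   hypothetical.

```python
import hashlib
import hmac
import os
import struct
import time
from typing import Optional

TLS_LABEL = b"Tor V3 handshake TLS cross-certification\x00"  # NUL terminated


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def tls_secrets(master_secret: bytes, client_random: bytes,
                server_random: bytes) -> bytes:
    """SHA256 HMAC keyed with the TLS master secret (assumed concatenation)."""
    message = client_random + server_random + TLS_LABEL
    return hmac.new(master_secret, message, hashlib.sha256).digest()


def build_signed_fields(client_id_key: bytes, server_id_key: bytes,
                        server_log: bytes, client_log: bytes,
                        server_link_cert: bytes, master_secret: bytes,
                        client_random: bytes, server_random: bytes,
                        now: Optional[int] = None,
                        rand: Optional[bytes] = None) -> bytes:
    """Return TYPE through RAND, i.e. the data that SIG covers."""
    now = int(time.time()) if now is None else now
    rand = os.urandom(16) if rand is None else rand
    return b"".join([
        b"AUTH0001",                                  # TYPE
        sha256(client_id_key),                        # CID
        sha256(server_id_key),                        # SID
        sha256(server_log),                           # SLOG
        sha256(client_log),                           # CLOG
        sha256(server_link_cert),                     # SCERT
        tls_secrets(master_secret, client_random, server_random),
        struct.pack("!Q", now),                       # TIME, 8 octets
        rand,                                         # RAND, 16 octets
    ])


def wrap_authenticate(signed_fields: bytes, signature: bytes) -> bytes:
    """Prefix AuthType 1 and AuthLen to the Authentication body."""
    authentication = signed_fields + signature
    return struct.pack("!HH", 1, len(authentication)) + authentication
```

   With these field sizes the signed portion is 8 + 6*32 + 8 + 16 =
   224 octets long.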

3.5. Responding to extra cells, and other security checks.

   If the handshake is a V3 TLS handshake, both parties MUST reject
   any negotiated link version less than 3.  Both parties MUST check
   this and close the connection if it is violated.

   If the handshake is not a V3 TLS handshake, both parties MUST
   still advertise all link protocols they support in their versions
   cell.  Both parties MUST close the link if it turns out they both
   would have supported version 3 or higher, but they somehow wound
   up using a v2 or v1 handshake.  (More on this in section 6.4.)

   Either party may send a VPADDING cell at any time during the
   handshake, except as the first cell. (See proposal 184.)

   A server SHOULD NOT send any sequence of cells when starting a v3
   negotiation other than "VERSIONS, CERT, AUTH_CHALLENGE,
   NETINFO".  A client SHOULD drop a CERT, AUTH_CHALLENGE, or
   NETINFO cell that appears at any other time or out of sequence.

   A client SHOULD NOT begin a v3 negotiation with any sequence
   other than "VERSIONS, NETINFO" or "VERSIONS, CERT, AUTHENTICATE,
   NETINFO".   A server SHOULD drop a CERT, AUTH_CHALLENGE, or
   NETINFO cell that appears at any other time or out of sequence.
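
   The permitted cell orderings can be written down as data.  The
   sketch below only checks a completed sequence of cell names and
   disregards VPADDING cells; how an implementation drops individual
   out-of-sequence cells is not modeled.  The function names are
   hypothetical.

```python
from typing import Iterable, Tuple

SERVER_SEQUENCE = ("VERSIONS", "CERT", "AUTH_CHALLENGE", "NETINFO")
CLIENT_SEQUENCES = (
    ("VERSIONS", "NETINFO"),                          # unauthenticated
    ("VERSIONS", "CERT", "AUTHENTICATE", "NETINFO"),  # authenticated
)


def without_padding(cells: Iterable[str]) -> Tuple[str, ...]:
    """Remove VPADDING cells, which may appear anywhere except first."""
    cells = tuple(cells)
    if cells and cells[0] == "VPADDING":
        raise ValueError("VPADDING may not be the first cell")
    return tuple(c for c in cells if c != "VPADDING")


def server_sequence_ok(cells: Iterable[str]) -> bool:
    return without_padding(cells) == SERVER_SEQUENCE


def client_sequence_ok(cells: Iterable[str]) -> bool:
    return without_padding(cells) in CLIENT_SEQUENCES
```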

4. Numbers to assign

   We need a version number for this link protocol.  I've been
   calling it "3".

   We need to reserve command numbers for CERT, AUTH_CHALLENGE, and
   AUTHENTICATE.  I suggest that in link protocol 3 and higher, we
   reserve a separate range of commands for variable-length cells.
   See proposal 184 for more there.

5. Efficiency

   This protocol adds a round-trip step when the client sends a
   VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
   AUTH_CHALLENGE, NETINFO} response in turn.  (The server then
   waits for the client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO}
   reply, but it would have already been waiting for the client's
   NETINFO, so that's not an additional wait.)

   This is actually fewer round-trip steps than required before for
   TLS renegotiation, so that's a win over v2.

6. Security argument

   These aren't crypto proofs, since I don't write those.  They are
   meant to be reasonably convincing.

6.1. The server is authenticated

   TLS guarantees that if the TLS handshake completes successfully,
   the client knows that it is speaking to somebody who knows the
   private key corresponding to the public link key that was used in
   the TLS handshake.

   Because this public link key is signed by the server's identity
   key in the CERT cell, the client knows that somebody who holds
   the server's private identity key says that the server's public
   link key corresponds to the server's public identity key.

   Therefore, if the crypto works, and if TLS works, and if the keys
   aren't compromised, then the client is talking to somebody who
   holds the server's private identity key.

6.2. The client is authenticated

   Once the server has checked the client's certificates, the server
   knows that somebody who knows the client's private identity key
   says that he is the one holding the private key corresponding to
   the client's presented link-authentication public key.

   Once the server has checked the signature in the AUTHENTICATE
   cell, the server knows that somebody holding the client's
   link-authentication private key signed the data in question.  By
   the standard certification argument above, the server knows that
   somebody holding the client's private identity key signed the
   data in question.

   So the server's remaining question is: am I really talking to
   somebody holding the client's identity key, or am I getting a
   replayed or MITM'd AUTHENTICATE cell that was previously sent by
   the client?

   Because the client includes a TLSSECRET component, and the
   server is able to verify it, then the answer is easy: the server
   knows for certain that it is talking to the party with whom it
   did the TLS handshake, since if somebody else generated a correct
   TLSSECRET, they would have to know the master secret of the TLS
   connection, which would require them to have broken TLS.

   Even if the protocol didn't contain the TLSSECRET component, the
   server could still check the client's authentication, but it's a
   little trickier.  The server knows that it is not getting a replayed
   AUTHENTICATE cell, since the cell authenticates (among other
   stuff) the server's AUTH_CHALLENGE cell, which it has never used
   before.  The server knows that it is not getting a MITM'd
   AUTHENTICATE cell, since the cell includes a hash of the server's
   link certificate, which nobody else should have been able to use
   in a successful TLS negotiation.

6.3. MITM attacks won't work any better than they do against TLS

   TLS guarantees that a man-in-the-middle attacker can't read the
   content of a successfully negotiated encrypted connection, nor
   alter the content in any way other than truncating it, unless he
   compromises the session keys or one of the key-exchange secret
   keys used to establish that connection.  Let's make sure we do at
   least that well.

   Suppose that a client Alice connects to an MITM attacker Mallory,
   thinking that she is connecting to some server Bob.  Let's assume
   that the TLS handshake between Alice and Mallory finishes
   successfully and the v3 protocol is chosen.  [If the v1 or v2
   protocol is chosen, those already resist MITM.  If the TLS
   handshake doesn't complete, then Alice isn't connected to anybody.]

   During the v3 handshake, Mallory can't convince Alice that she is
   talking to Bob, since she should not be able to produce a CERT
   cell containing a certificate chain signed by Bob's identity key
   and used to authenticate the link key that Mallory used during
   TLS.  (If Mallory used her own link key for the TLS handshake, it
   won't match anything Bob signed unless Bob is compromised.
   Mallory can't use any key that Bob _did_ produce a certificate
   for, since she doesn't know the private key.)

   Even if Alice fails to check the certificates from Bob, Mallory
   still can't convince Bob that she is really Alice.  Assuming that
   Alice's keys aren't compromised, Mallory can't send a CERT cell
   with a cert chain from Alice's identity key to a key that Mallory
   controls, so if Mallory wants to impersonate Alice's identity
   key, she can only do so by sending an AUTHENTICATE cell really
   generated by Alice.  Because Bob will check that the random bytes
   in the AUTH_CHALLENGE cell will influence the SLOG hash, Mallory
   needs to send Bob's challenge to Alice, and can't use any other
   AUTHENTICATE cell that Alice generated before.  But because the
   AUTHENTICATE cell Alice will generate will include in the SCERT
   field a hash of the link certificate used by Mallory, Bob will
   reject it as not being valid to connect to him.

6.4. Protocol downgrade attacks won't work.

   Assuming that Alice checks the certificates from Bob, she knows
   that Bob really sent her the VERSIONS cell that she received.

   Because the AUTHENTICATE cell from Alice includes signed hashes
   of the VERSIONS cells from Alice and Bob, Bob knows that Alice
   got the VERSIONS cell he sent and sent the VERSIONS cell that he
   received.

   But what about attempts to downgrade the protocol earlier in the
   handshake?  Here TLS comes to the rescue: because the TLS
   Finished handshake message includes an authenticated digest of
   everything previously said during the handshake, an attacker
   can't replace the client's ciphersuite list (to trigger a
   downgrade to the v1 protocol) or the server's certificate [chain]
   (to trigger a downgrade to the v1 or v2 protocol).

7. Design considerations

   I previously considered adding our own certificate format in
   order to avoid the pain associated with X509, but decided instead
   to simply use X509 since a correct Tor implementation will
   already need to have X509 code to handle the other handshake
   versions and to use TLS.

   The trickiest part of the design here is deciding what to stick
   in the AUTHENTICATE cell.  Some of it is strictly necessary, and
   some of it is left there for security margin in case my other
   security arguments fail.  Because of the CID and SID elements
   you can't use an AUTHENTICATE cell for anything other than
   authenticating a client ID to a server with an appropriate
   server ID.  The SLOG and CLOG elements are there mostly to
   authenticate the VERSIONS cells and resist downgrade attacks
   once there are two versions of this.  The presence of the
   AUTH_CHALLENGE field in the stuff authenticated in SLOG
   prevents replays and ensures that the AUTHENTICATE cell was
   really generated by somebody who is reading what the server is
   sending over the TLS connection.  The SCERT element is meant to
   prevent MITM attacks.  When the TLSSECRET field is
   used, it should prevent the use of the AUTHENTICATE cell for
   anything other than the TLS connection the client had in mind.

   A signature of the TLSSECRET element on its own should also be
   sufficient to prevent the attacks we care about.  The redundancy
   here should come in handy if I've made a mistake somewhere else in
   my analysis.

   If the client checks the server's certificates and matches them
   to the TLS connection link key before proceeding with the
   handshake, then signing the contents of the AUTH_CHALLENGE cell
   would be sufficient to authenticate the client.  But implementers
   of allegedly compatible Tor clients have in the past skipped
   certificate verification steps, and I didn't want a client's
   failure to verify certificates to mean that a server couldn't
   trust that he was really talking to the client.  To prevent this,
   I added the TLS link certificate to the authenticated data: even
   if the Tor client code doesn't check any certificates, the TLS
   library code will still check that the certificate used in the
   handshake contains a link key that matches the one used in the
   handshake.

8. Open questions:

  - May we cache which certificates we've already verified?  It
    might leak in timing whether we've connected with a given server
    before, and how recently.

  - With which TLS libraries is it feasible to yoink client_random,
    server_random, and the master secret?  If the answer is "All
    free C TLS libraries", great.  If the answer is "OpenSSL only",
    not so great.

  - Should we do anything to check the timestamp in the AUTHENTICATE
    cell?

  - Can we give some way for clients to signal "I want to use the
    V3 protocol if possible, but I can't renegotiate, so don't give
    me the V2"?  Clients currently have a fair idea of server
    versions, so they could potentially do the V3 handshake with
    servers that support it, and fall back to V1 otherwise.

  - What should servers that don't have TLS renegotiation do?  For
    now, I think they should just stick with V1.  Eventually we can
    deprecate the V2 handshake as we did with the V1 handshake.
    When that happens, servers can be V3-only.
Filename: 177-flag-abstention.txt
Title: Abstaining from votes on individual flags
Author: Nick Mathewson
Created: 14 Feb 2011
Status: Reserve
Target: 0.2.4.x

Overview:

   We should have a way for authorities to vote on flags in
   particular instances, without having to vote on that flag for all
   servers.

Motivation:

   Suppose that the status of some router becomes controversial, and
   an authority wants to vote for or against the BadExit status of
   that router.  Suppose also that the authority is not currently
   voting on the BadExit flag.  If the authority wants to say that
   the router is or is not "BadExit", it cannot currently do so
   without voting yea or nay on the BadExit status of all other
   routers.

   Suppose that an authority wants to vote "Valid" or "Invalid" on a
   large number of routers, but does not have an opinion on some of
   them.  Currently, it cannot do so: if it votes for the Valid flag
   anywhere, it votes for it everywhere.

Design:

   We add a new line "extra-flags" in directory votes, to appear
   after "known-flags".  It lists zero or more flags that an
   authority has occasional opinions on, but for which the authority
   will usually abstain.  No flag may appear in both extra-flags and
   known-flags.

   In the router-status section for each directory vote, we allow an
   optional "s2" line to appear after the "s" line.  It contains
   zero or more flag votes.  A flag vote is of the form of one of
   "+", "-", or "/" followed by the name of a flag.  "+" denotes a
   yea vote, "-" denotes a nay vote, and "/" denotes an
   abstention.  Authorities may omit most abstentions, except as
   noted below.  No flag may appear in an s2 line unless it appears
   in the known-flags or extra-flags line.  We retain the rule that no
   flag may appear in an s line unless it appears in the known-flags
   line.

   When using an appropriate consensus method to vote, we use these
   new rules to determine flags:

   A flag is listed in the consensus if it is in the known-flags
   section of at least one voter, and in the known-flags or
   extra-flags section of at least three voters (or half the
   authorities, whichever set is smaller).

   A single authority's vote for a given flag on a given router is
   interpreted as follows:

      - If the authority votes +Flag or -Flag or /Flag in the s2 line for
        that router, the vote is "yea" or "nay" or "abstain" respectively.
      - Otherwise, if the flag is listed on the "s" line for the
        router, then the vote is "yea".
      - Otherwise, if the flag is listed in the known-flags line,
        then the vote is "nay".
      - Otherwise, the vote is "abstain".

   A router is assigned a flag in the consensus iff the total "yeas"
   outnumber the total "nays".

   As an exception, this proposal does not affect the behavior of
   the "Named" and "Unnamed" flags; these are still treated as
   before.  (An authority can already abstain from a single naming
   decision by not voting Named on any router with a given name.)
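
   A sketch of the per-authority interpretation and the tally, for
   flags other than Named and Unnamed.  The function names and the
   representation of votes as the strings "yea", "nay", and "abstain"
   are choices made for the example.

```python
from typing import Dict, Iterable, List, Set


def parse_s2(line: str) -> Dict[str, str]:
    """Parse an "s2" line into a map from flag name to '+', '-' or '/'."""
    parts = line.split()
    if not parts or parts[0] != "s2":
        raise ValueError("not an s2 line")
    votes = {}
    for token in parts[1:]:
        if len(token) < 2 or token[0] not in "+-/":
            raise ValueError("bad flag vote: %r" % token)
        votes[token[1:]] = token[0]
    return votes


def authority_vote(flag: str, s_flags: Iterable[str],
                   s2_votes: Dict[str, str],
                   known_flags: Set[str]) -> str:
    """Interpret one authority's vote on one flag for one router."""
    if flag in s2_votes:                      # explicit s2 vote wins
        return {"+": "yea", "-": "nay", "/": "abstain"}[s2_votes[flag]]
    if flag in s_flags:                       # listed on the "s" line
        return "yea"
    if flag in known_flags:                   # known but not listed
        return "nay"
    return "abstain"                          # extra-flags or unknown


def router_gets_flag(votes: List[str]) -> bool:
    """A flag is assigned iff yeas strictly outnumber nays."""
    return votes.count("yea") > votes.count("nay")
```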

Examples:

   Suppose that it becomes important to know which Tor servers are
   operated by burrowing marsupials.  Some authority operators
   diligently research this question; others want to vote about
   individual routers on an ad hoc basis when they learn about a
   particular router's being e.g. located underground in New South
   Wales.

   If an authority usually has no opinions on the RunByWombats flag,
   it should list it in the "extra-flags" of its votes.  If it
   occasionally wants to vote that a router is (or is not) run by
   wombats, it should list "s2 +RunByWombats" or "s2 -RunByWombats"
   for the routers in question.  Otherwise it can omit the flag from
   its s and s2 lines entirely.

   If an authority usually has an opinion on the RunByWombats flag,
   but wants to abstain in some cases, it should list "RunByWombats"
   in the "known-flags" part of its votes, and include
   "RunByWombats" in the s line for every router that it believes is
   run by wombats. When it wants to vote that a router is not run
   by wombats, it should list the RunByWombats flag in neither the s
   nor the s2 line.  When it wants to abstain, it should list "s2
   /RunByWombats".

   In both cases, when the new consensus method is used, a router
   will get listed as "RunByWombats" if there are more authorities
   that say it is run by wombats than there are authorities saying
   it is not run by wombats.  (As now, "no" votes win ties.)


Filename: 178-param-voting.txt
Title: Require majority of authorities to vote for consensus parameters
Author: Sebastian Hahn
Created: 16-Feb-2011
Status: Closed
Implemented-In: 0.2.3.9-alpha

Overview:

The consensus that the directory authorities create may contain one or
more parameters (32-bit signed integers) that influence the behavior
of Tor nodes (see proposal 167, "Vote on network parameters in
consensus" for more details).

Currently (as of consensus method 11), a consensus will end up
containing a parameter if at least one directory authority votes for
that parameter. The value of the parameter will be the low-median of
all the votes for this parameter.

This proposal aims at changing this voting process to be more secure
against tampering by a small fraction of directory authorities.

Motivation:

To prevent a small fraction of the directory authorities from
influencing the value of a parameter unduly, a big enough fraction
of all directory authorities has to vote for that
parameter. This is not currently happening, and it is in fact not
uncommon for a single authority to govern the value of a consensus
parameter.

Design:

When the consensus is generated, the directory authorities ensure that
a param is only included in the list of params if at least three of the
authorities (or a simple majority, whichever is the smaller number)
vote for that param. The value chosen is the low-median of all the
votes. We don't mandate that the authorities have to vote on exactly
the same value for it to be included because some consensus parameters
could be the result of active measurements that individual authorities
make.
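
The rule in this section can be sketched as follows.  This is a minimal
illustration, not the code from the safer_params branch: the function
names and the representation of a vote as a dict from keyword to integer
are assumptions made for the example.

```python
def low_median(values):
    """Return the low-median: for an even count, the lower middle value."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def consensus_params(votes, total_authorities):
    """Compute the params to list in the consensus.

    votes: one dict per participating authority, mapping keyword -> int.
    total_authorities: all known authorities, not just those voting.
    """
    majority = total_authorities // 2 + 1
    # "at least three ... or a simple majority, whichever is smaller"
    threshold = min(3, majority)

    by_keyword = {}
    for vote in votes:
        for keyword, value in vote.items():
            by_keyword.setdefault(keyword, []).append(value)

    return {
        keyword: low_median(values)
        for keyword, values in sorted(by_keyword.items())
        if len(values) >= threshold
    }
```

For example, with nine authorities, a keyword that four of them vote on
as 1, 3, 4 and 10 is included with the value 3, while a keyword that
only one authority votes on is dropped.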

Security implications:

This change is aimed at improving the security of Tor nodes against
attacks carried out by a small fraction of directory authorities. It
is possible that a consensus parameter that would be helpful to the
network is not included because not enough directory authorities
voted for it, but since clients are required to have sane defaults
in case the parameter is absent this does not carry a security risk.

This proposal makes a security vs coordination effort tradeoff. When
considering only the security of the design, it would be better to
require a simple majority of directory authorities to agree on
voting on a parameter, but it would involve requiring more
directory authority operators to coordinate their actions to set the
parameter successfully.

Specification:

dir-spec section 3.4 currently says:

     Entries are given on the "params" line for every keyword on which any
     authority voted.  The values given are the low-median of all votes on
     that keyword.

It is proposed that the above is changed to:

     Entries are given on the "params" line for every keyword on which a
     majority of authorities (total authorities, not just those
     participating in this vote) voted on, or if at least three
     authorities voted for that parameter. The values given are the
     low-median of all votes on that keyword.

     For consensus methods 11 and before, entries are given on the "params"
     line for every keyword on which any authority voted, the value given
     being the low-median of all votes on that keyword.

The following should be added to the bottom of section 3.4.:

        * If consensus method 12 or later is used, only consensus
          parameters that more than half of the total number of
          authorities voted for are included in the consensus.

The following line should be added to the bottom of section 3.4.1.:

     "12" -- Params are only included if enough auths voted for them

Compatibility:

A sufficient number of directory authorities must upgrade to the new
consensus method used to calculate the params in the way this proposal
calls for, otherwise the old mechanism is used. Nodes that do not act
as directory authorities do not need to be upgraded and should
experience no change in behaviour.

Implementation:

An example implementation of this feature can be found in
https://gitweb.torproject.org/sebastian/tor.git, branch safer_params.

Filename: 179-TLS-cert-and-parameter-normalization.txt
Title: TLS certificate and parameter normalization
Author: Jacob Appelbaum, Gladys Shufflebottom
Created: 16-Feb-2011
Status: Closed
Target: 0.2.3.x


        Draft spec for TLS certificate and handshake normalization


                                    Overview

     STATUS NOTE:

     This document is implemented in part in 0.2.3.x, deferred in part, and
     rejected in part.  See indented bracketed comments in individual
     sections below for more information. -NM

Scope

This document proposes fixes for problems with Tor's current TLS
(Transport Layer Security) certificates and handshake; the fixes will
reduce the distinguishability of Tor traffic from other encrypted traffic that
uses TLS.  It also addresses some of the fingerprinting attacks that are
possible against the current Tor TLS protocol setup process.

Motivation and history

Censorship is an arms race and this is a step forward in the defense
of Tor.  This proposal outlines ideas to make it more difficult to
fingerprint and block Tor traffic.

Goals

This proposal intends to normalize or remove easy-to-predict or static
values in the Tor TLS certificates and in the Tor TLS setup process.
These values can be used as criteria for the automated classification of
encrypted traffic as Tor traffic. Network observers should not be able
to trivially detect Tor merely by receiving or observing the certificate
used or advertised by a Tor relay. I also propose the creation of
a hard-to-detect covert channel through which a server can signal that it
supports the third version ("V3") of the Tor handshake protocol.

Non-Goals

This document is not intended to solve all of the possible active or passive
Tor fingerprinting problems. This document focuses on removing distinctive
and predictable features of TLS protocol negotiation; we do not attempt to
make guarantees about resisting other kinds of fingerprinting of Tor
traffic, such as fingerprinting techniques related to timing or volume of
transmitted data.

                                Implementation details


Certificate Issues

The CN or commonName ASN1 field

Tor generates certificates with a predictable commonName field; the
field is within a given range of values that is specific to Tor.
Additionally, the generated host names have other undesirable properties.
The host names typically do not resolve in the DNS because the domain
names referred to are generated at random. Although they are syntactically
valid, they usually refer to domains that have never been registered by
any domain name registrar.

An example of the current commonName field: CN=www.s4ku5skci.net

An example of OpenSSL’s asn1parse over a typical Tor certificate:

   0:d=0  hl=4 l= 438 cons: SEQUENCE
    4:d=1  hl=4 l= 287 cons: SEQUENCE
    8:d=2  hl=2 l=   3 cons: cont [ 0 ]
   10:d=3  hl=2 l=   1 prim: INTEGER           :02
   13:d=2  hl=2 l=   4 prim: INTEGER           :4D3C763A
   19:d=2  hl=2 l=  13 cons: SEQUENCE
   21:d=3  hl=2 l=   9 prim: OBJECT            :sha1WithRSAEncryption
   32:d=3  hl=2 l=   0 prim: NULL
   34:d=2  hl=2 l=  35 cons: SEQUENCE
   36:d=3  hl=2 l=  33 cons: SET
   38:d=4  hl=2 l=  31 cons: SEQUENCE
   40:d=5  hl=2 l=   3 prim: OBJECT            :commonName
   45:d=5  hl=2 l=  24 prim: PRINTABLESTRING   :www.vsbsvwu5b4soh4wg.net
   71:d=2  hl=2 l=  30 cons: SEQUENCE
   73:d=3  hl=2 l=  13 prim: UTCTIME           :110123184058Z
   88:d=3  hl=2 l=  13 prim: UTCTIME           :110123204058Z
  103:d=2  hl=2 l=  28 cons: SEQUENCE
  105:d=3  hl=2 l=  26 cons: SET
  107:d=4  hl=2 l=  24 cons: SEQUENCE
  109:d=5  hl=2 l=   3 prim: OBJECT            :commonName
  114:d=5  hl=2 l=  17 prim: PRINTABLESTRING   :www.s4ku5skci.net
  133:d=2  hl=3 l= 159 cons: SEQUENCE
  136:d=3  hl=2 l=  13 cons: SEQUENCE
  138:d=4  hl=2 l=   9 prim: OBJECT            :rsaEncryption
  149:d=4  hl=2 l=   0 prim: NULL
  151:d=3  hl=3 l= 141 prim: BIT STRING
  295:d=1  hl=2 l=  13 cons: SEQUENCE
  297:d=2  hl=2 l=   9 prim: OBJECT            :sha1WithRSAEncryption
  308:d=2  hl=2 l=   0 prim: NULL
  310:d=1  hl=3 l= 129 prim: BIT STRING

I propose that we match OpenSSL's default self-signed certificates. I hypothesise
that they are the most common self-signed certificates. If this turns out not
to be the case, then we should use whatever the most common turns out to be.

Certificate serial numbers

Currently our generated certificate serial number is set to the number of
seconds since the epoch at the time of the certificate's creation. I propose
that we should ensure that our serial numbers are unrelated to the epoch,
since the generation methods are potentially recognizable as Tor-related.

Instead, I propose that we use a randomly generated number that is
subsequently hashed with SHA-512 and then truncated to eight bytes[1].

Random sixteen byte values appear to be the high bound for serial numbers as
issued by Verisign and DigiCert.  RapidSSL serial numbers appear to be three
bytes in length.  Other common byte lengths appear to be between one and four
bytes. The default OpenSSL certificates use eight bytes, and we should use this
length with our self-signed certificates.

This randomly generated serial number field may now serve as a covert channel
that signals to the client that the OR will not support TLS renegotiation; this
means that the client can expect to perform a V3 TLS handshake setup.
Otherwise, if the serial number is a reasonable time since the epoch, we should
assume the OR is using an earlier protocol version and hence that it expects
renegotiation.

We also have a need to signal properties with our certificates for a possible
v3 handshake in the future. Therefore I propose that we match OpenSSL default
self-signed certificates (a 64-bit random number), but reserve the two least-
significant bits for signaling. For the moment, these two bits will be zero.

This means that an attacker may be able to identify Tor certificates from default
OpenSSL certificates with a 75% probability.
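
A sketch of the serial number scheme described above, assuming a 64-bit
random value with the two least-significant bits reserved.  The names
new_serial and could_be_tor are hypothetical, and Python's secrets module
stands in for whatever random source Tor would use.

```python
import secrets

SERIAL_BITS = 64
SIGNAL_MASK = 0b11  # the two least-significant bits, reserved for signaling


def new_serial(signal=0):
    """Return a random 64-bit serial whose two low bits carry `signal`."""
    if signal & ~SIGNAL_MASK:
        raise ValueError("signal must fit in two bits")
    random_part = secrets.randbits(SERIAL_BITS) & ~SIGNAL_MASK
    return random_part | signal


def could_be_tor(serial):
    """True if both reserved bits are zero, as this proposal would set them."""
    return (serial & SIGNAL_MASK) == 0
```

Only one in four uniformly random serial numbers has both low bits clear,
so an observer applying could_be_tor() can rule out about 75% of default
OpenSSL certificates as not being Tor certificates.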

As a security note, care must be taken to ensure that supporting this
covert channel will not lead to an attacker having a method to downgrade client
behavior. This shouldn't be a risk because the TLS Finished message hashes over
all the bytes of the handshake, including the certificates.

     [Randomized serial numbers are implemented in 0.2.3.9-alpha. We probably
     shouldn't do certificate tagging by a covert channel in serial numbers,
     since doing so would mean we could never have an externally signed
     cert. -NM]

Certificate fingerprinting issues expressed as base64 encoding

It appears that all deployed Tor certificates have the following strings in
common:

MIIB
CCA
gAwIBAgIETU
ANBgkqhkiG9w0BAQUFADA
YDVQQDEx
3d3cu

As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID)
properties (sha1WithRSAEncryption, commonName, etc) of how we generate our
certificates.

As an illustrated example of the common bytes of all certificates used within
the Tor network within a single one hour window, I have replaced the actual
value with a wild card ('.') character here:

-----BEGIN CERTIFICATE-----
MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3
d3cu............................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
........................... <--- Variable length and padding
-----END CERTIFICATE-----

This fine ASCII art only illustrates the bytes that absolutely match in all
cases.  In many cases, it's likely that a given byte takes only one of a small
subset of possible values.

Using the above strings, the EFF's certificate observatory may trivially
discover all known relays, known bridges and unknown bridges in a single SQL
query.  I propose that we test our certificates to ensure that any statistical
similarities of this kind are shared with a very large cross section of the
internet's certificates.
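
To show how little work such classification takes, here is a sketch that
reuses the wildcard illustration above directly as a regular expression.
The function name matches_tor_pattern is hypothetical; the pattern itself
is copied from the illustration, where '.' marks a varying character.

```python
import re

# Fixed characters from the illustration; "." is a position that varies.
# All literal characters are alphanumeric, so the string is already a
# valid regular expression in which "." matches any one character.
COMMON_PREFIX = (
    "MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3"
    "d3cu"
)
TOR_CERT_RE = re.compile(COMMON_PREFIX)


def matches_tor_pattern(pem_text):
    """Return True if the base64 body of a PEM certificate fits the pattern."""
    body = "".join(
        line.strip()
        for line in pem_text.splitlines()
        if line.strip() and not line.strip().startswith("-----")
    )
    return TOR_CERT_RE.match(body) is not None
```

The same kind of test could be run against our own generated
certificates to check whether they still share a recognizable prefix.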

Certificate dating and validity issues

TLS certificates found in the wild are generally found to be long-lived;
they are frequently old and often even expired. The current Tor certificate
validity time is a very small time window starting at generation time and
ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME
(2*60*60).

I propose that the certificate validity time length is extended to a period of
twelve Earth months, possibly with a small random skew to be determined by the
implementer. Tor should randomly set the start date in the past, within some
currently unspecified window of time before the current date. This would
more closely track the typical distribution of non-Tor TLS certificate
expiration times.

The certificate values, such as expiration, should not be used for anything
relating to security; for example, if the OR presents an expired TLS
certificate, this does not imply that the client should terminate the
connection (as would be appropriate for an ordinary TLS implementation).
Rather, I propose we use a TOFU style expiration policy - the certificate
should never be trusted for more than a two hour window from first sighting.

This policy should have two major impacts. The first is that an adversary will
have to perform a differential analysis of all certificates for a given IP
address rather than a single check. The second is that the server expiration
time is enforced by the client and confirmed by keys rotating in the consensus.

The expiration time should not be a fixed time that is simple to calculate by
any Deep Packet Inspection device or it will become a new Tor TLS setup
fingerprint.
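
The client-side TOFU policy could be sketched as below.  This is an
illustration only: the class name, the use of a certificate digest as the
lookup key, and the in-memory table are all assumptions, and nothing here
is taken from the Tor source.

```python
import time

TRUST_WINDOW = 2 * 60 * 60  # seconds: the two hour window proposed above


class CertSightings:
    """Remember when each certificate was first seen."""

    def __init__(self):
        self._first_seen = {}

    def is_trusted(self, cert_digest, now=None):
        """Record the first sighting; trust only within TRUST_WINDOW of it."""
        now = time.time() if now is None else now
        first = self._first_seen.setdefault(cert_digest, now)
        return now - first <= TRUST_WINDOW
```

Note that the notAfter field of the certificate is never consulted; only
the time of first sighting matters.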

   [Deferred and needs revision; see proposal XXX. -NM]

Proposed certificate form

The following output from openssl asn1parse results from the proposed
certificate generation algorithm. It matches the results of generating a
default self-signed certificate:

    0:d=0  hl=4 l= 513 cons: SEQUENCE          
    4:d=1  hl=4 l= 362 cons: SEQUENCE          
    8:d=2  hl=2 l=   9 prim: INTEGER           :DBF6B3B864FF7478
   19:d=2  hl=2 l=  13 cons: SEQUENCE          
   21:d=3  hl=2 l=   9 prim: OBJECT            :sha1WithRSAEncryption
   32:d=3  hl=2 l=   0 prim: NULL              
   34:d=2  hl=2 l=  69 cons: SEQUENCE          
   36:d=3  hl=2 l=  11 cons: SET               
   38:d=4  hl=2 l=   9 cons: SEQUENCE          
   40:d=5  hl=2 l=   3 prim: OBJECT            :countryName
   45:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :AU
   49:d=3  hl=2 l=  19 cons: SET               
   51:d=4  hl=2 l=  17 cons: SEQUENCE          
   53:d=5  hl=2 l=   3 prim: OBJECT            :stateOrProvinceName
   58:d=5  hl=2 l=  10 prim: PRINTABLESTRING   :Some-State
   70:d=3  hl=2 l=  33 cons: SET               
   72:d=4  hl=2 l=  31 cons: SEQUENCE          
   74:d=5  hl=2 l=   3 prim: OBJECT            :organizationName
   79:d=5  hl=2 l=  24 prim: PRINTABLESTRING   :Internet Widgits Pty Ltd
  105:d=2  hl=2 l=  30 cons: SEQUENCE          
  107:d=3  hl=2 l=  13 prim: UTCTIME           :110217011237Z
  122:d=3  hl=2 l=  13 prim: UTCTIME           :120217011237Z
  137:d=2  hl=2 l=  69 cons: SEQUENCE          
  139:d=3  hl=2 l=  11 cons: SET               
  141:d=4  hl=2 l=   9 cons: SEQUENCE          
  143:d=5  hl=2 l=   3 prim: OBJECT            :countryName
  148:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :AU
  152:d=3  hl=2 l=  19 cons: SET               
  154:d=4  hl=2 l=  17 cons: SEQUENCE          
  156:d=5  hl=2 l=   3 prim: OBJECT            :stateOrProvinceName
  161:d=5  hl=2 l=  10 prim: PRINTABLESTRING   :Some-State
  173:d=3  hl=2 l=  33 cons: SET               
  175:d=4  hl=2 l=  31 cons: SEQUENCE          
  177:d=5  hl=2 l=   3 prim: OBJECT            :organizationName
  182:d=5  hl=2 l=  24 prim: PRINTABLESTRING   :Internet Widgits Pty Ltd
  208:d=2  hl=3 l= 159 cons: SEQUENCE          
  211:d=3  hl=2 l=  13 cons: SEQUENCE          
  213:d=4  hl=2 l=   9 prim: OBJECT            :rsaEncryption
  224:d=4  hl=2 l=   0 prim: NULL              
  226:d=3  hl=3 l= 141 prim: BIT STRING        
  370:d=1  hl=2 l=  13 cons: SEQUENCE          
  372:d=2  hl=2 l=   9 prim: OBJECT            :sha1WithRSAEncryption
  383:d=2  hl=2 l=   0 prim: NULL              
  385:d=1  hl=3 l= 129 prim: BIT STRING        

    [Rejected pending more evidence; this pattern is trivially detectable,
    and there is just not enough reason at the moment to think that this
    particular certificate pattern is common enough for sites that matter
    that the censors wouldn't be willing to block it. -NM]

Custom Certificates

It should be possible for a Tor relay operator to use a specifically supplied
certificate and secret key. This will allow a relay or bridge operator to use a
certificate signed by any member of any geographically relevant certificate
authority racket; it will also allow for any other user-supplied certificate.
This may be desirable in some kinds of filtered networks or when attempting to
avoid attracting suspicion by blending in with the TLS web server certificate
crowd.

    [Deferred; see proposal XXX]

Problematic Diffie–Hellman parameters

We currently send a static Diffie–Hellman parameter, prime p (or “prime p
outlaw”) as specified in RFC2409 as part of the TLS Server Hello response.

The use of this prime in TLS negotiations may, as a result, be filtered and
effectively banned by certain networks. We do not have to use this particular
prime in all cases.

While amusing to have the power to make specific prime numbers into a new class
of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p
outlaw is not required.

I propose that the function to initialize and generate DH parameters be
split into two functions.

First, init_dh_param() should be used only for OR-to-OR DH setup and
communication. Second, it is proposed that we create a new function
init_tls_dh_param() that will have a two-stage development process.

The first stage init_tls_dh_param() will use the same prime that
Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be
made immediately. This is a known good and safe prime number ((p-1)/2
is also prime) that is currently not known to be blocked.

The second stage init_tls_dh_param() should randomly generate a new prime on a
regular basis; this is designed to make the prime difficult to outlaw or
filter.  Call this a shape-shifting or "Rakshasa" prime.  This should be added
to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution
time and probably does not need to be stored on disk. Rakshasa primes only
need to be generated by Tor relays as Tor clients will never send them. Such
a prime should absolutely not be shared between different Tor relays nor
should it ever be static after the 0.2.3.x release.

As a security precaution, care must be taken to ensure that we do not generate
weak primes or known filtered primes. Both weak and filtered primes will
undermine the TLS connection security properties. OpenSSH solves this issue
dynamically in RFC 4419 [5] and may provide a solution that works reasonably
well for Tor. More research in this area is needed, including analysis of the
applicability of Miller-Rabin or AKS primality tests[6], which will probably
need to be added to Tor.

      [Randomized DH groups are implemented in 0.2.3.9-alpha. -NM]
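
As an illustration of the kind of check discussed above, here is a
sketch of a Miller-Rabin test and a safe-prime test built on it.  The
function names and the round count are assumptions; this is not the code
Tor uses, and it does not address filtered primes.

```python
import random

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in SMALL_PRIMES:
        if n % small == 0:
            return n == small
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def is_safe_prime(p):
    """True if p and (p-1)/2 both pass the primality test."""
    return is_probable_prime(p) and is_probable_prime((p - 1) // 2)
```

For instance, 23 is a safe prime because 11 is also prime, while 13 is
not because 6 is composite.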

Practical key size

Currently we use a 1024 bit long RSA modulus. I propose that we increase the
RSA key size to 2048 as an additional channel to signal support for the V3
handshake setup.  2048 appears to be the most common key size[0] above 1024.
Additionally, the increase in modulus size provides a reasonable security boost
with regard to key security properties.

The implementer should increase the 1024 bit RSA modulus to 2048 bits.

     [Deferred and needs performance analysis.  See proposal
     XXX. Additionally, DH group strength seems far more crucial. Still, this
     is out-of-scope for a "normalization" question. -NM]

Possible future filtering nightmares

At some point it may be cost effective or politically feasible for a network
filter to simply block all signed or self-signed certificates without a known
valid CA trust chain. This will break many applications on the internet and
hopefully, our option for custom certificates will ensure that this step is
simply avoided by the censors.

The Rakshasa prime approach may cause censors to specifically allow only
certain known and accepted DH parameters.


Appendix: Other issues

What other obvious TLS certificate issues exist? What other static values are
present in the Tor TLS setup process?

[0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html
[1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html
[2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html
[3] To be fair this is hardly a new class of numbers. History is rife with
    similar examples of inane authoritarian attempts at mathematical secrecy.
    Probably the most dramatic example is the story of Hippasus of
    Metapontum, pupil of the famous Pythagoras, who, legend goes, proved the
    fact that Root2 cannot be expressed as a fraction of whole numbers (now
    called an irrational number) and was assassinated for revealing this
    secret.  Further reading on the subject may be found on the Wikipedia:
    http://en.wikipedia.org/wiki/Hippasus

[4] httpd-2.2.17/modules/ssl/ssl_engine_dh.c
[5] http://tools.ietf.org/html/rfc4419
[6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html

Filename: 180-pluggable-transport.txt
Title: Pluggable transports for circumvention
Author: Jacob Appelbaum, Nick Mathewson
Created: 15-Oct-2010
Status: Closed
Implemented-In: 0.2.3.x

Overview

  This proposal describes a way to decouple protocol-level obfuscation
  from the core Tor protocol in order to better resist client-bridge
  censorship.  Our approach is to specify a means to add pluggable
  transport implementations to Tor clients and bridges so that they can
  negotiate a superencipherment for the Tor protocol.

Scope

  This is a document about transport plugins; it does not cover
  discovery improvements, or bridgedb improvements.  While these
  requirements might be solved by a program that also functions as a
  transport plugin, this proposal only covers the requirements and
  operation of transport plugins.

Motivation

  Frequently, people want to try a novel circumvention method to help
  users connect to Tor bridges.  Some of these methods are already
  pretty easy to deploy: if the user knows an unblocked VPN or open
  SOCKS proxy, they can just use that with the Tor client today.

  Less easy to deploy are methods that require participation by both the
  client and the bridge.  In order of increasing sophistication, we
  might want to support:

  1. A protocol obfuscation tool that transforms the output of a TLS
     connection into something that looks like HTTP as it leaves the
     client, and back to TLS as it arrives at the bridge.
  2. An additional authentication step that a client would need to
     perform for a given bridge before being allowed to connect.
  3. An information passing system that uses a side-channel in some
     existing protocol to convey traffic between a client and a bridge
     without the two of them ever communicating directly.
  4. A set of clients to tunnel client->bridge traffic over an existing
     large p2p network, such that the bridge is known by an identifier
     in that network rather than by an IP address.

  We could in theory support these fairly well with Tor as it stands
  today: every Tor client can take a SOCKS proxy to use for its outgoing
  traffic, so a suitable client proxy could handle the client's traffic
  and connections on its behalf, while a corresponding program on the
  bridge side could handle the bridge's side of the protocol
  transformation.  Nevertheless, there are some reasons to add support
  for transport plugins to Tor itself:

  1. It would be good for bridges to have a standard way to advertise
     which transports they support, so that clients can have multiple
     local transport proxies, and automatically use the right one for
     the right bridge.

  2. There are some changes to our architecture that we'll need for a
     system like this to work.  For testing purposes, if a bridge blocks
     off its regular ORPort and instead has an obfuscated ORPort, the
     bridge authority has no way to test it.  Also, unless the bridge
     has some way to tell that the bridge-side proxy at 127.0.0.1 is not
     the origin of all the connections it is relaying, it might decide
     that there are too many connections from 127.0.0.1, and start
     paring them down to avoid a DoS.

  3. Censorship and anticensorship techniques often evolve faster than
     the typical Tor release cycle.  As such, it's a good idea to
     provide ways to test out new anticensorship mechanisms on a more
     rapid basis.

  4. Transport obfuscation is a relatively distinct problem
     from the other privacy problems that Tor tries to solve, and it
     requires a fairly distinct skill-set from hacking the rest of Tor.
     By decoupling transport obfuscation from the Tor core, we hope to
     encourage people working on transport obfuscation who would
     otherwise not be interested in hacking Tor.

  5. Finally, we hope that defining a generic transport obfuscation plugin
     mechanism will be useful to other anticensorship projects.

Non-Goals

  We're not going to talk about automatic verification of plugin
  correctness and safety via sandboxing, proof-carrying code, or
  whatever.

  We need to do more with discovery and distribution, but that's not
  what this proposal is about.  We're pretty convinced that the problems
  are sufficiently orthogonal that we should be fine so long as we don't
  preclude a single program from implementing both transport and
  discovery extensions.

  This proposal is not about what transport plugins are the best ones
  for people to write.  We do, however, make some general
  recommendations for plugin authors in an appendix.

  We've considered issues involved with completely replacing Tor's TLS
  with another encryption layer, rather than layering it inside the
  obfuscation layer.  We describe how to do this in an appendix to the
  current proposal, though we are not currently sure whether it's a good
  idea to implement.

  We deliberately reject any design that would involve linking the
  transport plugins into Tor's process space.

Design overview

  To write a new transport protocol, an implementer must provide two
  pieces: a "Client Proxy" to run at the initiator side, and a "Server
  Proxy" to run at the server side.  These two pieces may or may not be
  implemented by the same program.

  Each client may run any number of Client Proxies.  Each one acts like
  a SOCKS proxy that accepts connections on localhost.  Each one
  runs on a different port, and implements one or more transport
  methods.  If the protocol has any parameters, they are passed from Tor
  inside the regular username/password parts of the SOCKS protocol.

  Bridges (and maybe relays) may run any number of Server Proxies: these
  programs provide an interface like stunnel: they get connections from the
  network (typically by listening for connections on the network) and relay
  them to the Bridge's real ORPort.

  To configure one of these programs, it should be sufficient simply to
  list it in your torrc.  The program tells Tor which transports it
  provides.  The Tor consensus should carry a new approved version number that
  is specific for pluggable transport; this will allow Tor to know when a
  particular transport is known to be unsafe, safe, or non-functional.

  Bridges (and maybe relays) report in their descriptors which transport
  protocols they support.  This information can be copied into bridge
  lines.  Bridges using a transport protocol may have multiple bridge
  lines.

  Any methods that are wildly successful, we can bake into Tor.

Specifications: Client behavior

  We extend the bridge line format to allow you to say which method
  to use to connect to a bridge.

  The new format is:
     Bridge method address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]

  To connect to such a bridge, the Tor program needs to know which
  SOCKS proxy will support the transport called "method".  It
  then connects to this proxy, and asks it to connect to
  address:port.  If [id-fingerprint] is provided, Tor should expect
  the public identity key on the TLS connection to match the digest
  provided in [id-fingerprint].  If any [k=v] items are provided,
  they are configuration parameters for the proxy: Tor should
  separate them with semicolons and put them in the user and
  password fields of the request, splitting them across the fields
  as necessary.  If a key or value must contain a semicolon or
  a backslash, it is escaped with a backslash.

  Method names must be C identifiers.

  For reference, the old bridge format was
    Bridge address[:port] [id-fingerprint]
  where port defaults to 443 and the id-fingerprint is optional. The
  new format can be distinguished from the old one by checking if the
  first argument has any non-C-identifier characters. (Looking for a
  period should be a simple way.) Also, while the id-fingerprint could
  optionally include whitespace in the old format, whitespace in the
  id-fingerprint is not permitted in the new format.
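
  The format check could be sketched like this, assuming the input is
  the text that follows the "Bridge" keyword.  The function name
  uses_new_format is hypothetical.

```python
import re

# A C identifier: a letter or underscore, then letters, digits, underscores.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def uses_new_format(bridge_args):
    """Return True if the first argument is a method name (new format)."""
    parts = bridge_args.split()
    if not parts:
        raise ValueError("empty bridge line")
    return C_IDENTIFIER.match(parts[0]) is not None
```

  An old-style first argument such as "192.0.2.1:443" contains periods
  and a colon, so it fails the identifier test.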

  Example: if the bridge line is "bridge trebuchet www.example.com:3333
     keyid=09F911029D74E35BD84156C5635688C009F909F9 rocks=20 height=5.6m"
     AND if the Tor client knows that the 'trebuchet' method is supported,
     the client should connect to the proxy that provides the 'trebuchet'
     method, ask it to connect to www.example.com:3333, and provide the string
     "rocks=20;height=5.6m" as the username, the password, or split
     across the username and password.
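
  The parameter encoding can be sketched as follows.  The function names
  are hypothetical, and the splitting strategy (fill the username first,
  then spill into the password) is an assumption: the text above only
  says to split "as necessary".  The 255-byte field limit comes from the
  SOCKS5 username/password sub-negotiation (RFC 1929).

```python
SOCKS5_FIELD_MAX = 255  # RFC 1929: username and password are at most 255 bytes


def escape(text):
    """Backslash-escape backslashes and semicolons."""
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_params(params):
    """Join (key, value) pairs as k=v items separated by semicolons."""
    return ";".join(f"{escape(k)}={escape(v)}" for k, v in params)


def split_for_socks5(encoded):
    """Split the encoded string across the username and password fields."""
    data = encoded.encode("utf-8")
    if len(data) > 2 * SOCKS5_FIELD_MAX:
        raise ValueError("parameters do not fit in the SOCKS5 auth fields")
    return data[:SOCKS5_FIELD_MAX], data[SOCKS5_FIELD_MAX:]
```

  For the trebuchet example, encode_params([("rocks", "20"),
  ("height", "5.6m")]) yields "rocks=20;height=5.6m", which fits
  entirely in the username field.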

  There are two ways to tell Tor clients about protocol proxies:
  external proxies and managed proxies.  An external proxy is configured
  with
     ClientTransportPlugin <method> socks4 <address:port> [auth=X]
  or
     ClientTransportPlugin <method> socks5 <address:port> [username=X] [password=Y]
  as in
     "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999".
  This example tells Tor that another program is already running to handle
  'trebuchet' connections, and Tor doesn't need to worry about it.

  A managed proxy is configured with
     ClientTransportPlugin <methods> exec <path> [options]
  as in
    "ClientTransportPlugin trebuchet exec /usr/libexec/trebuchet --managed".
  This example tells Tor to launch an external program to provide a
  socks proxy for 'trebuchet' connections. The Tor client only
  launches one instance of each external program with a given set of
  options, even if the same executable and options are listed for
  more than one method.

  In managed proxies, <methods> can be a comma-separated list of
  pluggable transport method names, as in:
    "ClientTransportPlugin pawn,bishop,rook exec /bin/ptproxy --managed".

  If instead of a transport method, the torrc lists "*" for a managed
  proxy, Tor uses that proxy for all transport methods that the plugin
  supports. So "ClientTransportPlugin * exec /usr/libexec/tor/foobar"
  tells Tor that Tor should use the foobar plugin for every method that
  the proxy supports. See the "Managed proxy interface" section below
  for details on how Tor learns which methods a plugin supports.

  If two plugins support the same method, Tor should use whichever
  one is listed first.

  The same program can implement a managed or an external proxy: it just
  needs to take an argument saying which one to be.

Server behavior

  Server proxies are configured similarly to client proxies.  When
  launching a proxy, the server must tell it what ORPort it has
  configured, and what address (if any) it can listen on.  The
  server must tell the proxy which (if any) methods it should
  provide if it can; the proxy needs to tell the server which
  methods it is actually providing, and on what ports.

  When a client connects to the proxy, the proxy may need a way to
  tell the server some identifier for the client address.  It does
  this in-band.

  As before, the server lists proxies in its torrc.  These can be
  external proxies that run on their own, or managed proxies that Tor
  launches.

  An external server proxy is configured as
     ServerTransportPlugin <method> proxy <address:port> <param=val> ...
  as in
     "ServerTransportPlugin trebuchet proxy 127.0.0.1:999 rocks=heavy".
  The param=val pairs and the address are used to make the bridge
  configuration information that we'll tell users.

  A managed proxy is configured as
     ServerTransportPlugin <methods> exec </path/to/binary> [options]
  or
     ServerTransportPlugin * exec </path/to/binary> [options]

  When possible, Tor should launch only one instance of each binary/option
  pair configured.  So if the torrc contains

     ClientTransportPlugin foo exec /usr/bin/megaproxy --foo
     ClientTransportPlugin bar exec /usr/bin/megaproxy --bar
     ServerTransportPlugin * exec /usr/bin/megaproxy --foo

  then Tor will launch the megaproxy binary twice: once with the option
  --foo and once with the option --bar.
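
  A minimal sketch of this grouping, assuming a hypothetical helper
  that receives the relevant torrc lines as strings and keys each
  launch on the (binary, options) pair:

```python
def plan_launches(torrc_lines):
    """Map each distinct (binary, options) pair to the methods it serves."""
    launches = {}
    for line in torrc_lines:
        words = line.split()
        if len(words) < 4 or words[2] != "exec":
            continue  # not a managed-proxy line
        if words[0] == "ClientTransportPlugin":
            side = "client"
        elif words[0] == "ServerTransportPlugin":
            side = "server"
        else:
            continue
        key = (words[3], tuple(words[4:]))   # binary path plus its options
        for method in words[1].split(","):
            launches.setdefault(key, []).append((side, method))
    return launches
```

  With the three torrc lines above, this yields two entries: one for
  "/usr/bin/megaproxy --foo" (serving client method foo and all server
  methods) and one for "/usr/bin/megaproxy --bar".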

Managed proxy interface

   When the Tor client or relay launches a managed proxy, it communicates
   via environment variables.  At a minimum, it sets (in addition to the
   normal environment variables inherited from Tor):

      {Client and server}

      "TOR_PT_STATE_LOCATION" -- A filesystem directory path where the
       proxy should store state if it wants to.  This directory is not
       required to exist, but the proxy SHOULD be able to create it if
       it doesn't.  The proxy MUST NOT store state elsewhere.
      Example: TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/

      "TOR_PT_MANAGED_TRANSPORT_VER" -- To tell the proxy which
       versions of this configuration protocol Tor supports.  Future
       versions will give a comma-separated list.  Clients MUST accept
       comma-separated lists containing any version that they
       recognize, and MUST work correctly even if some of the versions
       they don't recognize are non-numeric.  Valid version characters
       are non-space, non-comma printing ASCII characters.
      Example: TOR_PT_MANAGED_TRANSPORT_VER=1,1a,2,4B

      {Client only}

      "TOR_PT_CLIENT_TRANSPORTS" -- A comma-separated list of which
       methods this client should enable, or * if all methods should
       be enabled.  The proxy SHOULD ignore methods that it doesn't
       recognize.
      Example: TOR_PT_CLIENT_TRANSPORTS=trebuchet,battering_ram,ballista

      {Server only}

      "TOR_PT_EXTENDED_SERVER_PORT" -- An <address>:<port> where tor
       should be listening for connections speaking the extended
       ORPort protocol (See the "The extended ORPort protocol" section
       below). If tor does not support the extended ORPort protocol,
       it MUST use the empty string as the value of this environment
       variable.
      Example: TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:4200

      "TOR_PT_ORPORT" -- Our regular ORPort in a form suitable
       for local connections, i.e. connections from the proxy to
       the ORPort.
      Example: TOR_PT_ORPORT=127.0.0.1:9001

      "TOR_PT_SERVER_BINDADDR" -- A comma-separated list of
       <key>-<value> pairs, where <key> is a transport name and
       <value> is the address:port on which it should listen for client
       proxy connections.
       The keys holding transport names must appear in the same order
       as they appear in TOR_PT_SERVER_TRANSPORTS.
       This might be the advertised address, or might be a local
       address that Tor will forward ports to.  It MUST be an address
       that will work with bind().
      Example:
        TOR_PT_SERVER_BINDADDR=trebuchet-127.0.0.1:1984,ballista-127.0.0.1:4891

      "TOR_PT_SERVER_TRANSPORTS" -- A comma-separated list of server
       methods that the proxy should support, or * if all methods
       should be enabled.  The proxy SHOULD ignore methods that it
       doesn't recognize.
      Example: TOR_PT_SERVER_TRANSPORTS=trebuchet,ballista
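
  As an illustration, a server proxy might read the two server-side
  variables as sketched below.  The function name is hypothetical, and
  the sketch assumes that transport names contain no '-' and that,
  unless TOR_PT_SERVER_TRANSPORTS is "*", both variables name exactly
  the same transports.  A real proxy would report a ValueError raised
  here with an ENV-ERROR line.

```python
def parse_server_bindaddr(env):
    """Return [(transport, host, port), ...] from the server environment."""
    transports = env["TOR_PT_SERVER_TRANSPORTS"].split(",")
    result = []
    for pair in env["TOR_PT_SERVER_BINDADDR"].split(","):
        # Assumption: transport names contain no '-', so the first '-'
        # separates the key from the address:port value.
        name, sep, addrport = pair.partition("-")
        host, colon, port = addrport.rpartition(":")
        if not (name and sep and host and colon and port.isdigit()):
            raise ValueError("malformed pair: %r" % pair)
        result.append((name, host, int(port)))
    names = [name for name, _, _ in result]
    if transports != ["*"] and names != transports:
        raise ValueError("BINDADDR keys do not follow TRANSPORTS order")
    return result
```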

  The transport proxy replies by writing NL-terminated lines to
  stdout.  The line metaformat is

      <Line> ::= <Keyword> <OptArgs> <NL>
      <Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar>
      <KeywordChar> ::= <any US-ASCII alphanumeric, dash, or underscore>
      <OptArgs> ::= <Args>*
      <Args> ::= <SP> <ArgChar> | <Args> <ArgChar>
      <ArgChar> ::= <any US-ASCII character but NUL or NL>
      <SP> ::= <US-ASCII whitespace symbol (32)>
      <NL> ::= <US-ASCII newline (line feed) character (10)>

  Tor MUST ignore lines with keywords that it doesn't recognize.
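
  One possible reading of this metaformat as a regular expression is
  sketched below.  The names are hypothetical, and the set of
  recognized keywords is simply collected from the lines described in
  this section.

```python
import re

# Keyword, then optionally one space followed by the argument text.
LINE_RE = re.compile(r"([A-Za-z0-9_-]+)(?: ([\x01-\x09\x0b-\x7f]+))?\n")

RECOGNIZED = {
    "ENV-ERROR", "VERSION", "VERSION-ERROR",
    "CMETHOD", "CMETHOD-ERROR", "CMETHODS",
    "SMETHOD", "SMETHOD-ERROR", "SMETHODS",
}


def parse_line(line):
    """Return (keyword, args) for a well-formed line, else None."""
    match = LINE_RE.fullmatch(line)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


def handle_line(line):
    """Like parse_line, but also ignore unrecognized keywords."""
    parsed = parse_line(line)
    if parsed is None or parsed[0] not in RECOGNIZED:
        return None
    return parsed
```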

  First, if there's an error parsing the environment variables, the
  proxy should write:
    ENV-ERROR <errormessage>
  and exit.

  If the environment variables were correctly formatted, the proxy
  should write:
    VERSION <configuration protocol version>
  to say that it supports this configuration protocol version (example
  "VERSION 1"). It must either pick a version that Tor told it about
  in TOR_PT_MANAGED_TRANSPORT_VER, or pick no version at all, say:
     VERSION-ERROR no-version
  and exit.
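
  The version choice can be sketched as follows, for a hypothetical
  proxy that only implements version "1".  Unrecognized entries,
  including non-numeric ones, are simply skipped.

```python
def negotiate_version(env_value, supported=("1",)):
    """Return the line the proxy should print for the offered versions."""
    offered = env_value.split(",")
    for version in supported:
        if version in offered:
            return "VERSION %s" % version
    return "VERSION-ERROR no-version"
```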

  The proxy should then open its ports.  If running as a client
  proxy, it should not use fixed ports; instead it should autoselect
  ports to avoid conflicts.  A client proxy should by default only
  listen on localhost for connections.

  A server proxy SHOULD try to listen at a consistent port, though it
  SHOULD pick a different one if the port it last used is now allocated.

  A client or server proxy then should tell which methods it has
  made available and how.  It does this by printing zero or more
  CMETHOD and SMETHOD lines to its stdout.  These lines look like:

   CMETHOD <methodname> socks4/socks5 <address:port> [ARGS=arglist] \
        [OPT-ARGS=arglist]

  as in

   CMETHOD trebuchet socks5 127.0.0.1:19999 ARGS=rocks,height \
              OPT-ARGS=tensile-strength

  The ARGS field lists mandatory parameters that must appear in
  every bridge line for this method. The OPT-ARGS field lists
  optional parameters.  If no ARGS or OPT-ARGS field is provided,
  Tor should not check the parameters in bridge lines for this
  method.
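
  A sketch of how Tor might apply this check is shown below.  The
  function names are hypothetical.  The sketch also assumes that, once
  either field is present, parameters listed in neither field are
  rejected; the text above does not say this explicitly.

```python
def parse_cmethod(line):
    """Parse a CMETHOD line into a dict (illustrative only)."""
    words = line.split()
    if len(words) < 4 or words[0] != "CMETHOD":
        raise ValueError("not a CMETHOD line")
    info = {"method": words[1], "protocol": words[2], "addrport": words[3],
            "args": None, "opt_args": None}
    for word in words[4:]:
        if word.startswith("ARGS="):
            info["args"] = word[len("ARGS="):].split(",")
        elif word.startswith("OPT-ARGS="):
            info["opt_args"] = word[len("OPT-ARGS="):].split(",")
    return info


def bridge_params_ok(info, params):
    """Check the K=V parameters of a bridge line against a CMETHOD line."""
    if info["args"] is None and info["opt_args"] is None:
        return True  # nothing declared, so Tor does not check
    required = set(info["args"] or [])
    allowed = required | set(info["opt_args"] or [])
    given = set(params)
    # Assumption: parameters outside ARGS and OPT-ARGS are rejected.
    return required <= given and given <= allowed
```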

  The proxy should print a single "CMETHODS DONE" line after it is
  finished telling Tor about the client methods it provides.  If it
  tries to supply a client method but can't for some reason, it
  should say:
    CMETHOD-ERROR <methodname> <errormessage>

  A proxy should also tell Tor about the server methods it is providing
  by printing zero or more SMETHOD lines.  These lines look like:

    SMETHOD <methodname> <address:port> [options]

  If there's an error setting up a configured server method, the
  proxy should say:
    SMETHOD-ERROR <methodname> <errormessage>
  as in
    SMETHOD-ERROR trebuchet could not setup 'trebuchet' method

  The 'address:port' part of an SMETHOD line is the address to put
  in the bridge line.  The Options part is a list of space-separated
  K:V flags that Tor should know about.  Recognized options are:

      - FORWARD:1

        If this option is set (for example, because address:port is not
        a publicly accessible address), then Tor needs to forward some
        other address:port to address:port via upnp-helper. Tor would
        then advertise that other address:port in the bridge line instead.

      - ARGS:K=V,K=V,K=V

        If this option is set, the K=V arguments are added to Tor's
        extrainfo document.

      - DECLARE:K=V,...

        If this option is set, the K=V options should be added as
        extension entries to the router descriptor, so clients and other
        relays can make use of it. See ideas/xxx-triangleboy-transport.txt
        for an example situation where the plugin would want to declare
        parameters to other Tors.

      - USE-EXTENDED-PORT:1

        If this option is set, the server plugin is planning to connect
        to Tor's extended server port.

  SMETHOD and CMETHOD lines may be interspersed, to allow the proxies to
  report methods as they become available, even when some methods may
  require probing your network, connecting to some kind of peers, etc.,
  before they are set up. After the final SMETHOD line, the proxy says
  "SMETHODS DONE".

  The proxy SHOULD NOT tell Tor about a server or client method
  unless it is actually open and ready to use.

  Tor clients SHOULD NOT use any method from a client proxy or
  advertise any method from a server proxy UNLESS it is listed as a
  possible method for that proxy in torrc, and it is listed by the
  proxy as a method it supports.

  Proxies should respond to a single INT signal by closing their
  listener ports and not accepting any new connections, but keeping
  all connections open, then terminating when connections are all
  closed.  Proxies should respond to a second INT signal by shutting
  down cleanly.
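
  The two-stage shutdown can be sketched as a small state holder.  The
  class and method names are hypothetical, and actual listener and
  connection handling is left out.

```python
import signal


class ProxyShutdown:
    """Track the shutdown state described above (illustrative only)."""

    def __init__(self, open_connections=0):
        self.accepting = True
        self.open_connections = open_connections
        self.should_exit = False
        self.int_seen = 0

    def on_int(self, signum=None, frame=None):
        self.int_seen += 1
        self.accepting = False              # close listener ports
        if self.int_seen >= 2 or self.open_connections == 0:
            self.should_exit = True         # second INT, or nothing left open

    def connection_closed(self):
        self.open_connections -= 1
        if not self.accepting and self.open_connections == 0:
            self.should_exit = True         # last connection has drained


def install(state):
    """Register the handler for SIGINT."""
    signal.signal(signal.SIGINT, state.on_int)
```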

  The managed proxy configuration protocol version defined in this
  section is "1".
  So, for example, if tor supports this configuration protocol it
  should set the environment variable:
    TOR_PT_MANAGED_TRANSPORT_VER=1

The Extended ORPort protocol

  The Extended ORPort protocol is described in proposal 196.

Advertising bridge methods

  Bridges put the 'method' lines in their extra-info documents.

     transport SP <transportname> SP <address:port> [SP arglist] NL

  The address:port are as returned from an SMETHOD line (unless they are
  replaced by the FORWARD: directive).  The arglist is a K=V,... list as
  returned in the ARGS: part of the SMETHOD line's Options component.

  If the SMETHOD line includes a DECLARE: part, the router descriptor gets
  a new line:

     transport-info SP <transportname> [SP arglist] NL
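
  As a sketch, the descriptor lines could be derived from an SMETHOD
  line as shown below.  The function names and the sample SMETHOD line
  in the usage note are hypothetical, and the forwarded address is
  assumed to be supplied by whatever performs the port forwarding.

```python
def parse_smethod(line):
    """Split an SMETHOD line into name, address:port and K:V options."""
    words = line.split()
    if len(words) < 3 or words[0] != "SMETHOD":
        raise ValueError("not an SMETHOD line")
    options = {}
    for word in words[3:]:
        key, _, value = word.partition(":")
        options[key] = value
    return words[1], words[2], options


def descriptor_lines(smethod_line, forwarded_addrport=None):
    """Build the 'transport' and optional 'transport-info' lines."""
    name, addrport, options = parse_smethod(smethod_line)
    if options.get("FORWARD") == "1" and forwarded_addrport:
        addrport = forwarded_addrport
    transport = "transport %s %s" % (name, addrport)
    if "ARGS" in options:
        transport += " " + options["ARGS"]
    lines = {"extra-info": transport + "\n"}
    if "DECLARE" in options:
        lines["router"] = "transport-info %s %s\n" % (name, options["DECLARE"])
    return lines
```

  For instance, "SMETHOD trebuchet 127.0.0.1:1984 ARGS:rocks=20" would
  give the extra-info line "transport trebuchet 127.0.0.1:1984 rocks=20".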

Bridge authority behavior

  We need to specify a way to test different transport methods that
  bridges claim to support.  We should test as many as possible.  We
  should NOT require that we have a way to test every possible
  transport method before we allow its use: the point of this design
  is to remove bottlenecks in transport deployment.

Bridgedb behavior

  Bridgedb can, given a set of router descriptors and their
  corresponding extrainfo documents, generate a set of bridge lines
  for each bridge.  Bridgedb may want to avoid handing out
  methods that seem to get bridges blocked quickly.

Implementation plan

  First, we should implement per-bridge proxies via the "external
  proxy" method described in "Specifications: Client behavior".  Also,
  we'll want to build the extended-server-port mechanism.  This will
  let bridges run transport proxies such that they can generate bridge
  lines to give to clients for testing, so long as the user configures
  and launches their proxies on their own.

  Once that's done, we can see if we need any managed proxies, or if
  the whole idea there is silly.

  If we do, the next most important part seems to be getting
  the client-side automation part written.  And once that's done, we
  can evaluate how much of the server side is easy for people to do
  and how much is hard.

  The "obfsproxy" obfuscating proxy is a likely candidate for an
  initial transport (trac entry #2760), as is Steven Murdoch's http
  thing (trac entry #2759) or something similar.

Notes on plugins to write

   We should ship a couple of null plugin implementations in one or two
   popular, portable languages so that people get an idea of how to
   write the stuff.

   1. We should have one that's just a proof of concept that does
      nothing but transfer bytes back and forth.

   2. We should implement DNS or HTTP using other software (as Geoff Goodell
      did years ago with DNS) as an example of wrapping existing code into
      our plugin model.

   3. The obfuscated-ssh superencipherment is pretty trivial and pretty
      useful.  It makes the protocol stringwise unfingerprintable.

   4. If we do a raw-traffic proxy, openssh tunnels would be the logical
      choice.

Appendix: recommendations for transports

  Be free/open-source software.  Also, if you think your code might
  someday do so well at circumvention that it should be implemented
  inside Tor, it should use the same license as Tor.

  Tor already uses OpenSSL, Libevent, and zlib.  Before you go and decide
  to use crypto++ in your transport plugin, ask yourself whether OpenSSL
  wouldn't be a nicer choice.

  Be portable: most Tor users are on Windows, and most Tor developers
  are not, so designing your code for just one of these platforms will
  make it either get a small userbase, or poor auditing.

  Think secure: if your code is in a C-like language, and it's hard to
  read it and become convinced it's safe, then it's probably not safe.

  Think small: we want to minimize the bytes that a Windows user needs
  to download for a transport client.

  Avoid security-through-obscurity if possible.  Specify.

  Resist trivial fingerprinting: There should be no good string or regex
  to search for to distinguish your protocol from protocols permitted by
  censors.

  Imitate a real profile: There are many ways to implement most
  protocols -- and in many cases, most possible variants of a given
  protocol won't actually exist in the wild.

Filename: 181-optimistic-data-client.txt
Title: Optimistic Data for Tor: Client Side
Author: Ian Goldberg
Created: 2-Jun-2011
Status: Closed
Implemented-In: 0.2.3.3-alpha

Overview:

This proposal (as well as its already-implemented sibling concerning the
server side) aims to reduce the latency of HTTP requests in particular
by allowing:
1. SOCKS clients to optimistically send data before they are notified
    that the SOCKS connection has completed successfully
2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
    state
3. Exit nodes to accept and queue DATA cells while in the
    EXIT_CONN_STATE_CONNECTING state

This particular proposal deals with #1 and #2.

For more details (in general and for #3), see the sibling proposal 174
(Optimistic Data for Tor: Server Side), which has been implemented in
0.2.3.1-alpha.

Motivation:

This change will save one OP<->Exit round trip (down to one from two).
There are still two SOCKS Client<->OP round trips (negligible time) and
two Exit<->Server round trips.  Depending on the ratio of the
Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
decrease the latency by 25 to 50 percent.  Experiments validate these
predictions. [Goldberg, PETS 2010 rump session; see
https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
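
The 25 to 50 percent range can be reproduced with a rough model, under
the simplifying assumption that latency is just the sum of the round
trips listed above and that the SOCKS round trips cost nothing.  The
function name is hypothetical.

```python
def latency_reduction(tor_rtt, internet_rtt):
    """Fraction of latency saved by dropping one OP<->Exit round trip."""
    before = 2 * tor_rtt + 2 * internet_rtt
    after = 1 * tor_rtt + 2 * internet_rtt
    return (before - after) / before
```

In this model the saving is 50 percent when the Exit<->Server RTT is
negligible, and 25 percent when it equals the OP<->Exit RTT.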

Design:

Currently, data arriving on the SOCKS connection to the OP on a stream
in AP_CONN_STATE_CONNECT_WAIT is queued, and transmitted when the state
transitions to AP_CONN_STATE_OPEN.  Instead, when data arrives on the
SOCKS connection to the OP on a stream in AP_CONN_STATE_CONNECT_WAIT
(connection_edge_process_inbuf):

- Check to see whether optimistic data is allowed at all (see below).
- Check to see whether the exit node for this stream supports optimistic
  data (according to tor-spec.txt section 6.2, this means that the
  exit node's version number is at least 0.2.3.1-alpha).  If you don't
  know the exit node's version number (because it's not in your
  hashtable of fingerprints, for example), assume it does *not* support
  optimistic data.
- If both are true, transmit the data on the stream.

Also, when a stream transitions *to* AP_CONN_STATE_CONNECT_WAIT
(connection_ap_handshake_send_begin), do the above checks, and
immediately send any already-queued data if they pass.
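
The checks above can be sketched as follows.  The function names are
hypothetical, and the version comparison is deliberately simplified: it
compares only the dotted numeric part and ignores status tags such as
"-alpha", which is not how Tor compares versions in general.

```python
def numeric_version(version):
    """Turn '0.2.3.1-alpha' into (0, 2, 3, 1), ignoring the status tag."""
    return tuple(int(part) for part in version.split("-")[0].split("."))


def exit_supports_optimistic_data(exit_version):
    """Unknown versions are treated as not supporting optimistic data."""
    if exit_version is None:
        return False
    return numeric_version(exit_version) >= (0, 2, 3, 1)


def may_send_optimistically(optimistic_data_allowed, exit_version):
    """Both checks must pass before data is sent in CONNECT_WAIT."""
    return bool(optimistic_data_allowed) and exit_supports_optimistic_data(exit_version)
```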

SOCKS clients (e.g. polipo) will also need to be patched to take
advantage of optimistic data.  The simplest solution would seem to be to
just start sending data immediately after sending the SOCKS CONNECT
command, without waiting for the SOCKS server reply.  When the SOCKS
client starts reading data back from the SOCKS server, it will first
receive the SOCKS server reply, which may indicate success or failure.
If success, it just continues reading the stream as normal.  If failure,
it does whatever it used to do when a SOCKS connection failed.
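
A minimal sketch of such a client is shown below.  It assumes SOCKS4a,
chosen here only because it has no negotiation step before CONNECT; the
text above does not prescribe a SOCKS version, and the function names
are hypothetical.

```python
import socket
import struct


def socks4a_connect_request(host, port):
    """Build a SOCKS4a CONNECT request for a hostname (empty user ID)."""
    return (struct.pack(">BBH", 4, 1, port)   # version 4, command CONNECT
            + b"\x00\x00\x00\x01"             # 0.0.0.x marks a SOCKS4a hostname
            + b"\x00"                         # empty, NUL-terminated user ID
            + host.encode("ascii") + b"\x00")


def recv_exactly(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed during SOCKS reply")
        data += chunk
    return data


def optimistic_connect(sock, host, port, first_data):
    """Send CONNECT and the first application data without waiting."""
    sock.sendall(socks4a_connect_request(host, port) + first_data)
    reply = recv_exactly(sock, 8)      # the SOCKS reply arrives first
    if reply[1] != 90:                 # 90 means "request granted"
        raise ConnectionError("SOCKS request failed, code %d" % reply[1])
    return sock                        # now read the stream as normal
```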

Security implications:

ORs (for sure the Exit, and possibly others, by watching the
pattern of packets), as well as possibly end servers, will be able to
tell that a particular client is using optimistic data.  This of course
has the potential to fingerprint clients, dividing the anonymity set.
The usual kind of solution is suggested:

- There is a boolean consensus parameter UseOptimisticData.
- There is a 3-state (-1, 0, 1) configuration parameter
  UseOptimisticData (or give it a distinct name if you like)
  defaulting to -1.
- If the configuration parameter is -1, the OP obeys the consensus
  value; otherwise, it obeys the configuration parameter.
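
The interaction of the two parameters can be sketched as a single
function (the function name is hypothetical):

```python
def use_optimistic_data(config_value, consensus_value):
    """Resolve the 3-state configuration value against the consensus."""
    if config_value not in (-1, 0, 1):
        raise ValueError("UseOptimisticData must be -1, 0 or 1")
    if config_value == -1:
        return bool(consensus_value)   # defer to the consensus parameter
    return bool(config_value)          # explicit local override
```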

It may be wise to set the consensus parameter to 1 at the same time as
similar other client protocol changes are made (for example, a new
circuit construction protocol) in order to not further subdivide the
anonymity set.

Specification:

The current tor-spec has already been updated by proposal 174 to handle
optimistic data.  It says, in part:

    If the exit node does not support optimistic data (i.e. its version
    number is before 0.2.3.1-alpha), then the OP MUST wait for a
    RELAY_CONNECTED cell before sending any data.  If the exit node
    supports optimistic data (i.e. its version number is 0.2.3.1-alpha
    or later), then the OP MAY send RELAY_DATA cells immediately after
    sending the RELAY_BEGIN cell (and before receiving either a
    RELAY_CONNECTED or RELAY_END cell).

Should the "MAY" be more specific, referring to the consensus
parameters?  Or does the existence of the configuration parameter
override mean it's really "MAY", regardless?

Compatibility:

There are compatibility issues, as mentioned above.  OPs MUST NOT send
optimistic data to Exit nodes whose version numbers predate
0.2.3.1-alpha.  OPs MAY send optimistic data to Exit nodes whose version
numbers match or follow that value.

Implementation:

My git diff is 42 lines long (+17 lines, -1 line), changing only the two
functions mentioned above (connection_edge_process_inbuf and
connection_ap_handshake_send_begin).  This diff does not, however,
handle the configuration options, or check the version number of the
exit node.

I have patched a command-line SOCKS client (webfetch) to use optimistic
data.  I have not attempted to patch polipo, but I have looked at it a
bit, and it seems pretty straightforward.  (Of course, if and when
polipo is deprecated, whatever else speaks SOCKS to the OP should take
advantage of optimistic data.)

Performance and scalability notes:

OPs may queue a little more data, if the SOCKS client pushes it faster
than the OP can write it out.  But that's also true today after the
SOCKS CONNECT returns success, right?
Filename: 182-creditbucket.txt
Title: Credit Bucket
Author: Florian Tschorsch and Björn Scheuermann
Created: 22 Jun 2011
Status: Obsolete

Note: Obsolete because we no longer have a once-per-second bucket refill.

Overview:

  The following proposal targets the reduction of queuing times in onion
  routers. In particular, we focus on the token bucket algorithm in Tor and
  point out that current usage unnecessarily locks cells for long time spans.
  We propose a non-intrusive change in Tor's design which overcomes the
  deficiencies.

Motivation and Background:

  Cell statistics from the Tor network [1] reveal that cells reside in
  individual onion routers' cell queues for up to several seconds. These
  queuing times increase the end-to-end delay very significantly and are
  apparently the largest contributor to overall cell latency in Tor.

  In Tor there exist multiple token buckets on different logical levels. They 
  all work independently. They are used to limit the up- and downstream of an
  onion router. All token buckets are refilled every second with a constant
  amount of tokens that depends on the configured bandwidth limits. For
  example, the so-called RelayedTokenBucket limits relay traffic only. All
  read data of incoming connections are bound to a dedicated read token
  bucket. An analogous mechanism exists for written data leaving the onion
  router. We were able to identify the specific usage and implementation of
  the token bucket algorithm as one cause for very high (and unnecessary)
  queuing times in an onion router.

  We observe that the token buckets in Tor are (surprisingly at a first
  glance) allowed to take on negative fill levels. This is justified by the
  TLS connections between onion routers where whole TLS records need to be
  processed. The token bucket on the incoming side (i.e., the one which
  determines at which rate it is allowed to read from incoming TCP
  connections) in particular often runs into non-negligible negative fill
  levels. As a consequence of this behavior, sometimes slightly more data is
  read than would be admissible under a strict interpretation of the token
  bucket concept.

  However, the token bucket for limiting the outgoing rate does not take on
  negative fill levels equally often. Consequently, it regularly happens
  that somewhat more data are read on the incoming side than the outgoing
  token bucket allows to be written during the same cycle, even if their
  configured data rates are the same. The respective cells will thus not be
  allowed to leave the onion router immediately. They will thus necessarily
  be queued for at least as long as it takes until the token bucket on the
  outgoing side is refilled again. The refill interval currently is, as
  mentioned before, one second -- so, these cells are delayed for a very
  substantial time. In summary, one could say that the two buckets, on the
  incoming and outgoing side, work like a double door system and frequently
  lock cells for a full token bucket refill interval length.

General Design:

  In order to overcome the described problem, we propose the following 
  changes related to the token bucket algorithm.

  We observe that the token bucket on the outgoing connections with its
  current design is counterproductive with respect to queuing times. We
  therefore propose modifications to the token bucket algorithm that will
  eliminate the "double door effect" discussed above.

  Let us start from Tor's current approach: we have a regular token
  bucket on the reading side with a certain rate and a certain burst size.
  Let x denote the current amount of tokens in the bucket. On the outgoing 
  side we need something appropriate that monitors and constrains the 
  outgoing rate, but at the same time avoids holding back cells (cf. double 
  door effects) whenever possible.

  Here we propose something that adopts the role of a token bucket, but 
  realizes this functionality in a slightly different way. We call it a 
  "credit bucket". Like a token bucket, the credit bucket also has a current 
  fill level, denoted by y. However, the credit bucket is refilled in a 
  different way.

  To understand how it works, let us look at the possible operations:

  As said, x is the fill level of a regular token bucket on the incoming
  side and thus gets incremented periodically according to the configured
  rate. No changes here.

  If x<=0, we are obviously not allowed to read. If x>0, we are allowed to 
  read up to x bytes of incoming data. If k bytes are read (k<=x), then we 
  update x and y as follows:

    x = x - k        (1)
    y = y + k        (2)

  (1) is the standard token bucket operation on the incoming side. Whenever 
  data is admitted in, though, an additional operation is performed: (2) 
  allocates the same number of bytes on the outgoing side, which will later 
  on allow the same number of bytes to leave the onion router without any 
  delays.

  If y + x > -M, we are allowed to write up to y + x + M bytes on the
  outgoing side, where M is a positive constant. M specifies a burst size for
  the outgoing side. M should be higher than the number of tokens that get
  refilled during a refill interval; we suggest choosing M on the order
  of a few seconds' "worth" of data. Now if k bytes are written on the
  outgoing side, we proceed as follows:

    If k <= y then y = y - k

  In this case we use "saved" credits, previously allocated on the incoming 
  side when incoming data has been processed.

    If k > y then x = x - (k-y) and y = 0

  We generated additional traffic in the onion router, so that more data is 
  to be sent than has been read (the credit is not sufficient). We therefore
  "steal" tokens from the token bucket on the incoming side to compensate for
  the additionally generated data. This will result in correspondingly less 
  data being read on the incoming side subsequently. As a result of such an 
  operation, the token bucket fill level x on the incoming side may become 
  negative (but it can never fall below -M).

  If y + x <= -M then outgoing data will be held back. This may lead to 
  double-door effects, but only in extreme cases where the outgoing traffic 
  largely exceeds the incoming traffic, so that the outgoing burst size M is
  exceeded.

  Aside from short-term bursts of configurable size (as with every token 
  bucket), this procedure guarantees that the configured rate may never be 
  exceeded (on the application layer, that is; as with the current 
  implementation, an attacker may easily cause the onion router to 
  arbitrarily exceed the limits on the lower layers). Over time, we never 
  send more data than the configured rate: every sent byte needs a 
  corresponding token on the incoming side; this token must either have been
  consumed by an incoming byte before (it then became a "credit"), or it is 
  "stolen" from the incoming bucket to compensate for data generated within 
  the onion router.
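
  The operations above can be sketched as a small Python class.  This
  is an illustration only: the class and method names are hypothetical,
  x starts at zero, and the cap on x (the incoming burst size) is an
  assumption about the regular token bucket rather than part of the
  credit bucket itself.

```python
class CreditBucket:
    """Incoming token bucket (x) coupled with an outgoing credit bucket (y)."""

    def __init__(self, refill_amount, burst, m):
        self.refill_amount = refill_amount  # tokens added to x per refill
        self.burst = burst                  # cap on x (incoming burst size)
        self.m = m                          # outgoing burst size M
        self.x = 0
        self.y = 0

    def refill(self):
        self.x = min(self.x + self.refill_amount, self.burst)

    def read_limit(self):
        return max(self.x, 0)               # no reading while x <= 0

    def on_read(self, k):
        if not 0 <= k <= self.read_limit():
            raise ValueError("read exceeds token bucket")
        self.x -= k                         # (1)
        self.y += k                         # (2)

    def write_limit(self):
        return max(self.y + self.x + self.m, 0)

    def on_write(self, k):
        if not 0 <= k <= self.write_limit():
            raise ValueError("write exceeds credit bucket")
        if k <= self.y:
            self.y -= k                     # spend saved credits
        else:
            self.x -= k - self.y            # steal tokens from the incoming side
            self.y = 0
```

  Because a write is limited to y + x + M bytes, x cannot drop below -M
  in on_write.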

Specific Design Changes: 

  In the following we briefly point out the specific changes that need to be
  made in Tor's source code. This shows how non-intrusive our
  modifications are.
  
  First, we need to address the bucket increment and decrement operations.
  According to the logic described above, this should be done in the functions
  connection_bucket_refill and connection_buckets_decrement respectively. In
  particular, allocating, saving and "stealing" of tokens need to be
  considered here.
  
  Second, the rate limiting, i.e. the amount we are allowed to write
  (connection_bucket_write_limit), needs to be adapted to the credit
  bucket logic. That is, in order to avoid the unnecessary queuing of
  cells identified here, we need to consider the new burst parameter M.
  Here we also need to take connections that are not rate limited, such
  as those from localhost, into account. The rate limiting on the
  reading side remains the same.

  Finally, we need to find good values/ratios for the parameter M such
  that the trade-off between avoiding "double door effects" and
  maintaining strict rate limits works as expected. As future work, once
  we have insights about the performance gain of the proposal described
  here, we need to find a way to implement this both using bufferevent
  rate limiting with libevent 2.3.x and Tor's rate limiting code.

Conclusion:

  This proposal can be implemented with moderate effort and requires changes 
  only at the points where currently the token bucket operations are 
  performed.

  We feel that this is not the be-all and end-all solution, because it again 
  introduces a feedback loop between the incoming and the outgoing side. We 
  therefore still hope that we will be able to come to a both simpler and 
  more effective design in the future. However, we believe that what we 
  proposed here is a good compromise between avoiding double-door effects to 
  the furthest possible extent, strictly enforcing an application-layer data 
  rate, and keeping the extent of changes to the code small.

  Feedback is highly appreciated.

References:

  [1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009.
  [2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011
Filename: 183-refillintervals.txt
Title: Refill Intervals
Author: Florian Tschorsch and Björn Scheuermann
Created: 03-Dec-2010
Status: Closed
Implemented-In: 0.2.3.5-alpha

Overview:

  In order to avoid additional queuing and bursty traffic, the refill 
  interval of the token bucket algorithm should be shortened. Thus we 
  propose a configurable parameter that sets the refill interval 
  accordingly. 

Motivation and Background:

  In Tor there exist multiple token buckets on different logical levels. They 
  all work independently. They are used to limit the up- and downstream of an
  onion router. All token buckets are refilled every second with a constant
  amount of tokens that depends on the configured bandwidth limits. The very
  coarse-grained refill interval of one second has detrimental effects. 

  First, consider an onion router with multiple TLS connections over which 
  cells arrive. If there is high activity (i.e., many incoming cells in
  total), then the coarse refill interval will cause unfairness. Consider a
  circuit C and assume (just for simplicity) that C doesn't share its TLS
  connection with any other circuit. Moreover, assume that C hasn't
  transmitted any data for some time (e.g., due to a typical bursty HTTP
  traffic pattern). Consequently, there are
  no cells from this circuit in the incoming socket buffers. When the buckets
  are refilled, the incoming token bucket will immediately spend all its
  tokens on other incoming connections. Now assume that cells from C arrive
  soon after. For fairness' sake, these cells should be serviced promptly --
  circuit C hasn't received any bandwidth for a significant time before.
  However, it will take a very long time (one refill interval) before the
  current implementation will fetch these cells from the incoming TLS
  connection, because the token bucket will remain empty for a long time. Just
  because the cells happened to arrive at the "wrong" point in time, they must
  wait. Such situations may occur even though the configured admissible
  incoming data rate is not exceeded by incoming cells: the long refill
  intervals often lead to an operational state where all the cells that were
  admissible during a given one-second period are queued until the end of this
  second, before the onion router even just starts processing them. This
  results in unnecessary, long queuing delays in the incoming socket buffers.
  These delays are not visible in the Tor circuit queue delay statistics [1]. 

  Finally, the coarse-grained refill intervals result in a very bursty outgoing
  traffic pattern at the onion routers (one large chunk of data once per
  second, instead of smooth transmission progress). This is undesirable, since
  such a traffic pattern can interfere with TCP's control mechanisms and can
  be the source of suboptimal TCP performance on the TLS links between onion
  routers.  

Specific Changes: 

  The token buckets should be refilled more often, with a correspondingly 
  smaller amount of tokens. For instance, the buckets might be refilled every
  10 milliseconds with one-hundredth of the amount of data admissible per 
  second. This will help to overcome the problem of unfairness when reading 
  from the incoming socket buffers. At the same time it smooths the traffic
  leaving the onion routers. We are aware that this latter change has 
  apparently been discussed before [2]; we are not sure why this change has
  not been implemented yet.

  In particular we need to change the current implementation in Tor, which
  always triggers refilling after exactly one second. Instead the refill event
  should fire more frequently. The smaller time intervals between each refill 
  action need to be taken into account for the number of tokens that are added 
  to the bucket. 
  
  With libevent 2.x and bufferevents enabled, smaller refill intervals are 
  already considered but hard coded. This should be changed to a configurable 
  parameter, too.   
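
  As a rough illustration of the proposed change, the following Python
  sketch refills a bucket every few milliseconds with a proportional share
  of the per-second rate. The class name, the interval_ms parameter, and the
  remainder handling are assumptions made for this example; they are not
  taken from the Tor source code.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Token bucket refilled every `interval_ms` instead of once per second."""
    rate_per_sec: int      # admissible bytes per second
    burst: int             # maximum bucket size in bytes
    interval_ms: int = 10  # hypothetical configurable refill interval
    tokens: int = 0
    _carry: int = 0        # remainder in byte-milliseconds, so rounding loses nothing

    def refill(self) -> None:
        """Add the share of the per-second rate that belongs to one interval."""
        total = self.rate_per_sec * self.interval_ms + self._carry
        add, self._carry = divmod(total, 1000)
        self.tokens = min(self.burst, self.tokens + add)

    def consume(self, nbytes: int) -> int:
        """Spend up to `nbytes` tokens and return how many bytes may be read."""
        granted = min(nbytes, self.tokens)
        self.tokens -= granted
        return granted
```

  Carrying the remainder between refills keeps the long-run rate equal to
  the configured rate even when it is not a multiple of the number of
  refills per second.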

Conclusion:

  This proposal can be implemented with moderate effort and requires changes 
  only at the points where the token bucket operations are currently
  performed.
  
  This change will also be a good starting point for further enhancements 
  to improve queuing times in Tor. That is, it will pave the way for other
  means of tackling this problem.

  Feedback is highly appreciated.

References:

  [1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009.
  [2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011
  
Filename: 184-v3-link-protocol.txt
Title: Miscellaneous changes for a v3 Tor link protocol
Author: Nick Mathewson
Created: 19-Sep-2011
Status: Closed
Target: 0.2.3.x

Overview:

  When proposals 176 and 179 are implemented, Tor will have a new
  link protocol.  I propose two simple improvements for the v3 link
  protocol: a more partitioned set of which types indicate
  variable-length cells, and a better way to handle link padding if
  and when we come up with a decent scheme for it.

Motivation:

  We're getting a new link protocol in 0.2.3.x, thanks (again) to
  TLS fingerprinting concerns.  When we do, it'd be nice to take
  care of some small issues that require a link protocol version
  increment.

  First, our system for introducing new variable-length cell types
  has required a protocol increment for each one.  Unlike
  fixed-length (512 byte) cells, we can't add new variable-length
  cells in the existing link protocols and just let older clients
  ignore them, because unless the recipient knows which cells are
  variable-length, it will treat them as 512-byte cells and discard
  too much of the stream or too little.  In the past, it's been
  useful to be able to introduce new cell types without having to
  increment the link protocol version.

  Second, once we have our new TLS handshake in place, we will want
  a good way to address the remaining fingerprinting opportunities.
  Some of those will likely involve traffic volume.  We can't fix
  that easily with our existing PADDING cell type, since PADDING
  cells are fixed-length, and wouldn't be so easy to use to break up
  our TLS record sizes.

Design: Indicating variable-length cells.

  Beginning with the v3 link protocol, we specify that all cell
  types in the range 128..255 indicate variable-length cells.
  Cell types in the range 0..127 are still used for 512-byte
  cells, except that the VERSIONS cell type (7) also indicates a
  variable-length cell (for backward compatibility).

  As before, all Tor instances must ignore cells with types that
  they don't recognize.
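
  The classification rule above could be sketched in Python as follows. The
  function name is hypothetical, and treating VERSIONS as the only
  variable-length cell before v3 is this sketch's reading of the older link
  protocols.

```python
VERSIONS = 7  # cell type named in the text above


def is_variable_length(command: int, link_version: int) -> bool:
    """Return True if a cell with this command byte is variable-length."""
    if not 0 <= command <= 255:
        raise ValueError("cell command must fit in one byte")
    if command == VERSIONS:
        return True  # kept variable-length for backward compatibility
    # The 128..255 rule only applies from link protocol v3 onward.
    return link_version >= 3 and command >= 128
```

  With this rule a v3 recipient can skip an unrecognized cell of either
  kind, because the command byte alone tells it how to find the cell's end.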

Design: Variable-length padding.

  We add a new variable-length cell type, "VPADDING", to be used for
  padding.  All Tor instances may send a VPADDING cell at any point that
  a VERSIONS cell is not required; a VPADDING cell's body may be any
  length; the body of a VPADDING cell MAY have any content.  Upon
  receiving a VPADDING cell, the recipient should drop it, as with a
  PADDING cell.

  (This does not give a way to send fewer than 5 bytes of padding.
  We could add this in the future, in a new link protocol.)

  Implementations SHOULD fill the content of all padding cells
  randomly.
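
  A minimal Python sketch of building such a cell follows. It assumes the
  v3 variable-length header layout (2-byte circuit ID, 1-byte command,
  2-byte length) and uses 128 as the VPADDING command value; this proposal
  itself does not assign a number, so treat that value as an assumption.

```python
import os
import struct

VPADDING = 128  # assumed command value; this proposal does not assign one


def make_vpadding_cell(body_len: int, circ_id: int = 0) -> bytes:
    """Build a variable-length padding cell with a random body."""
    if not 0 <= body_len <= 0xFFFF:
        raise ValueError("body length must fit in the 2-byte length field")
    # Header: CircID (2 bytes), command (1 byte), body length (2 bytes).
    header = struct.pack("!HBH", circ_id, VPADDING, body_len)
    return header + os.urandom(body_len)
```

  The 5-byte header is why a single cell cannot carry fewer than 5 bytes of
  padding on the wire.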

A note on padding:

  We do not specify any situation in which a node ought to generate
  a VPADDING cell; that's left for future work.  Implementors should
  be aware that many schemes have been proposed for link padding
  that do not in fact work as well as one would expect.  We
  recommend that no mainstream implementation should produce padding
  in an attempt to resist traffic analysis, without real research
  showing that it helps.

Interaction with proposal 176:

  Proposal 176 says that during the v3 handshake, no cells other
  than VERSIONS, AUTHENTICATE, AUTH_CHALLENGE, CERT, and NETINFO are
  allowed, and those are only allowed in their standard order.  If
  this proposal is accepted, then VPADDING cells should also be
  allowed in the handshake at any point after the VERSIONS cell.
  They should be included when computing the "SLOG" and "CLOG"
  handshake-digest fields of the AUTHENTICATE cell.

Notes on future-proofing:

  It may be that in the future we need a new cell format that is neither the
  original 512-byte format nor the variable-length format.  If we
  do, we can just increment the link protocol version number again.

  Right now we have 10 cell types; with this proposal and proposal
  176, we will have 14.  It's unlikely that we'll run out any time
  soon, but if we start to approach the number 64 with fixed-length
  cell types or 196 with var-length cell types, we should consider
  tweaking the link protocol to have a variable-length cell type
  encoding.

Filename: 185-dir-without-dirport.txt
Title: Directory caches without DirPort
Author: Nick Mathewson
Created: 20-Sep-2011
Status: Superseded
Superseded-by: 237

Overview:

  Exposing a directory port is no longer necessary for running as a
  directory cache.  This proposal suggests that we eliminate that
  requirement, and describes how.

Motivation:

  Now that we tunnel directory connections by default, it is no
  longer necessary to have a DirPort to be a directory cache.  In
  fact, bridges act as directory caches but do not actually have a
  DirPort exposed.  It would be nice and tidy to expand that
  property to the rest of the network.

Configuration:

  Add a new torrc option, "DirCache".  Its values can be "0", "1",
  and "auto".  If it is 0, we never act as a directory cache, even
  if DirPort is set.  If it is 1, then we act as a directory cache
  according to the same rules as those used for nodes that set a
  DirPort.  If it is "auto", then Tor decides whether to act as a
  directory cache based on some future intelligent algorithm. "Auto"
  should be the new default.

Advertising cache status:

  Nodes that are running as a directory cache should set the entry
  "dir-cache 1" in their router descriptors.  If they do not have a
  DirPort set, or do not have a working DirPort, they should give
  their directory port as 0 in their router lines.  (Nodes that have
  a working directory port advertise it as usual, and also include a
  "dir-cache" line.  Nodes that do not serve directory information
  should set their directory port to 0, and not include any
  dir-cache line.  Implementations should accept and ignore
  dir-cache lines with values other than "dir-cache 1".)
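
  The advertising rules can be summarized in a short Python sketch. Both
  function names are hypothetical, and descriptor lines are modelled as
  plain strings.

```python
def advertised_directory_fields(is_cache: bool, working_dirport: int = 0):
    """Return (DirPort for the router line, extra descriptor lines)."""
    if not is_cache:
        # Nodes that serve no directory information: port 0, no dir-cache line.
        return 0, []
    # Caches always add the line; the port stays 0 unless a DirPort works.
    return working_dirport, ["dir-cache 1"]


def is_dir_cache(descriptor_lines) -> bool:
    """Accept "dir-cache 1" and ignore dir-cache lines with any other value."""
    return any(line.split() == ["dir-cache", "1"] for line in descriptor_lines)
```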

Consensus:

  Authorities should assign a "DirCache" flag to all nodes running
  as a directory cache.

  This does not require a new version of the consensus algorithm.

Filename: 186-multiple-orports.txt
Title: Multiple addresses for one OR or bridge
Author: Nick Mathewson
Created: 19-Sep-2011
Supersedes: 118
Status: Closed
Target: 0.2.4.x+

Status:

  This proposal is partially implemented to the extent needed to allow nodes
  to have one IPv4 and one IPv6 address.

Overview:

  This document is a proposal for servers to advertise multiple
  address/port combinations for their ORPort.

  It supersedes proposal 118.

Motivation:

  Sometimes servers want to support multiple ports for incoming
  connections, either in order to support multiple address families
  (i.e., to add IPv6 support), to better use multiple interfaces, or
  to support a variety of FascistFirewallPorts settings.  This is
  easy to set up now, but there's no way to advertise it to clients.

Configuring additional addresses and ports:

  In consonance with our changes to the (Socks|Trans|NATD|DNS)Port
  options made in 0.2.3.x for proposal 171, I make a corresponding
  change to allow multiple ORPort options and deprecate
  ORListenAddress.

  The new syntax will be:

      "ORPort" PortDescription Option*

      Option = "NoAdvertise" | "NoListen" | "AllAddrs" | "IPv4Only"
          | "IPv6Only"

      PortDescription = PORTLIST |
                        ADDRESS ":" PORTLIST |
                        Hostname ":" PORTLIST

      (PORTLIST and ADDRESS are defined below.)

  The 'NoAdvertise' option performs the function of the old
  ORListenAddress option.  If it is set, we bind a port, but
  don't put it in our descriptor.

  The 'NoListen' option tells Tor to advertise an address, but not
  bind to it.  The operator needs to use some other mechanism to
  ensure that ports are redirected to ports that _are_ listened on.

  The 'AllAddrs' option tells Tor that if no address is given in the
  PortDescription part, we should bind/advertise every one of our
  publicly visible unicast addresses; and that if a hostname address
  is given in the PortDescription, we should bind/advertise every
  publicly visible unicast address that the hostname resolves to.
  (Q: Should this be on by default?)   The 'IPv4Only' and 'IPv6Only'
  options tell Tor to interpret such situations as applying only to
  IPv4 addresses or to IPv6 addresses.

  As with the client *Port options, the old format and the new format
  cannot be mixed: a configuration has either a single numeric ORPort and
  zero or more ORListenAddress options, or a set of one or more ORPorts in
  the new extended format.

  In current operating systems (unless we get into crazy nonportable
  tricks) we need to use one socket for every address:port that Tor
  binds on.  As a sanity check, we can limit the number of such sockets
  we use to, say, something between 8 and 64.  If you want to bind lots
  of address:port combinations, you'll want to do it at the
  firewall/routing level.

  Example: We want to bind on 0.0.0.0:9001

     ORPort 9001

  Example: Our firewall is redirecting ports 80, 443, and 7000
  on all hosts in 18.244.2.0 onto our port 2929.

     ORPort 2929 noadvertise
     ORPort 18.244.2.0:80,443,7000 nolisten

  Example: We have a dynamic DNS provider that maps
  tornode.example.com to our current external IPv4 and IPv6
  addresses.  Our firewall forwards port 443 on those addresses to our
  port 1337.

     ORPort 1337 noadvertise alladdrs
     ORPort tornode.example.com:443 nolisten alladdrs
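
  A rough Python sketch of how a line in the new format could be split
  apart. The function name and the returned dictionary are hypothetical.
  Treating option names as case-insensitive is an assumption suggested by
  the lowercase examples above. Hostname validation and the AllAddrs
  expansion are left out.

```python
OPTIONS = {"noadvertise", "nolisten", "alladdrs", "ipv4only", "ipv6only"}


def parse_orport(line: str) -> dict:
    """Parse '"ORPort" PortDescription Option*' into its parts."""
    keyword, *rest = line.split()
    if keyword.lower() != "orport" or not rest:
        raise ValueError("expected: ORPort PortDescription Option*")
    desc, *opts = rest
    # Split on the last colon so bracketed IPv6 addresses stay intact.
    host, sep, portlist = desc.rpartition(":")
    if not sep:
        host, portlist = None, desc  # bare PORTLIST, no address or hostname
    ports = [int(p) for p in portlist.split(",")]
    if any(not 1 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    options = {o.lower() for o in opts}
    unknown = options - OPTIONS
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    return {"host": host, "ports": ports, "options": options}
```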

Self-testing:

  Right now, Tor nodes need to check every port that they advertise
  before they declare themselves reachable.  If a Tor node has
  a lot of advertised ports, that could be prohibitive.
  Instead, it should try a sample of ports for each address.  It should
  not advertise any given ORPort line until it has tried
  extending to or connecting to a sample of the address/port
  combinations.

  It will now be possible for a Tor node to find that some addresses
  work and others do not.  In this case, the node should only advertise
  ORPort lines that have been checked.  (As a consequence, the node
  should not advertise any address unless at least one ORPort without
  nolisten has been specified.)
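
  One possible shape for the sampling and filtering described above,
  sketched in Python. The sample size of 3 and both function names are
  assumptions; the proposal does not fix how large a sample should be.

```python
import random


def ports_to_probe(ports, sample_size=3, rng=None):
    """Pick which advertised ports of one address to self-test."""
    rng = rng or random.Random()
    if len(ports) <= sample_size:
        return list(ports)  # few ports: test them all
    return rng.sample(list(ports), sample_size)


def advertisable(lines, reachable):
    """Keep only the ORPort lines whose probes all succeeded.

    lines     -- mapping of ORPort line -> list of (address, port) probed
    reachable -- set of (address, port) pairs that answered
    """
    return [line for line, probes in lines.items()
            if probes and all(p in reachable for p in probes)]
```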

  {Until support is added for extend cells to IPv6 addresses, it
  will only be possible to test IPv6 addresses by connecting
  directly.  We might want to just skip self-testing those until we
  have IPv6 extend support.}

New descriptor syntax:

  We add a new line in the router descriptor, "or-address".  This line
  can occur zero, one, or multiple times.  Its format is:

      or-address SP ADDRESS ":" PORTLIST NL

      ADDRESS = IPV6ADDR | IPV4ADDR
      IPV6ADDR = an ipv6 address, surrounded by square brackets.
      IPV4ADDR = an ipv4 address, represented as a dotted quad.
      PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
      PORTSPEC = PORT
      PORT = a number between 1 and 65535 inclusive.

  [This is the regular format for specifying sets of addresses and
  ports in Tor.]

  A descriptor should not include an or-address line that does
  nothing but duplicate the address:port pair from its "router"
  line.

  A node must not list more than 8 or-address lines.

  A PORTLIST must have no more than 16 PORTSPEC entries, and its entries must
  be disjoint.
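
  The grammar and the limits above could be checked with a small Python
  parser along these lines. The function name is hypothetical, and the
  per-node limit of 8 or-address lines is left to the caller.

```python
import ipaddress

MAX_PORTS = 16  # limit on PORTSPEC entries per PORTLIST, from the text above


def parse_or_address(line: str):
    """Parse 'or-address SP ADDRESS ":" PORTLIST' into (address, ports)."""
    keyword, _, rest = line.rstrip("\n").partition(" ")
    if keyword != "or-address":
        raise ValueError("not an or-address line")
    addr_text, sep, portlist = rest.rpartition(":")
    if not sep:
        raise ValueError("missing ':' before PORTLIST")
    if addr_text.startswith("[") and addr_text.endswith("]"):
        address = ipaddress.IPv6Address(addr_text[1:-1])  # IPV6ADDR
    else:
        address = ipaddress.IPv4Address(addr_text)  # IPV4ADDR, dotted quad
    ports = [int(p) for p in portlist.split(",")]
    if len(ports) > MAX_PORTS:
        raise ValueError("too many PORTSPEC entries")
    if len(set(ports)) != len(ports):
        raise ValueError("PORTLIST entries must be disjoint")
    if any(not 1 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    return address, ports
```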

  (Q: Any reason to allow more than 2?  Multiple interfaces, I guess.)

New authority behavior:

  The same rationale applies as for self-testing.  An authority
  needs to test the main address:port from the router line, and
  every or-address line.  For or-address lines that contain
  multiple ports, it needs to test all of them if they are few, or a
  sample if they are not.

  An authority shouldn't list a node as Running unless every
  or-address line it advertises looks like it will work.

Consensus directories and microdescriptors:

  We introduce a new line type for microdescriptors and consensuses,
  "a".  Each "a" line has the same format as an or-address line.
  The "a" lines (if any) appear immediately after the "r" line for a
  router in the consensus, and immediately after the "onion-key"
  entry in a microdescriptor.

  Clients that use microdescriptors should consider a node's
  addresses to be the address:port listed in the "r" line of a
  consensus, plus all "a" lines for that node in the consensus, plus
  all "a" lines for that node in its microdescriptor.  Clients
  that use full descriptors should consider a node's addresses to be
  everything listed in its descriptor.

  We will have to define a new voting algorithm version; when using
  this version or later, votes should include a single "a" line for
  every relay that has an IPv6 address, to include the first IPv6
  line in its descriptor.  (If there are no IPv6 or-address lines, then
  they shouldn't include any "a" lines.)  The remaining or-address
  lines will turn into "a" lines in the microdescriptor.

  As with other data in the vote derived from the descriptor, the
  consensus will include whichever set of "a" lines are given by the
  most authorities who voted for the descriptor digest that will be
  used for the router.
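
  The split between vote and microdescriptor "a" lines might look like this
  in Python. The function name is hypothetical, and each entry is assumed
  to be the text that follows "or-address " in the descriptor.

```python
def split_or_addresses(or_addresses):
    """Split or-address values into (vote "a" lines, microdescriptor "a" lines).

    Each entry is the text after "or-address ", e.g. "[2001:db8::1]:9001".
    """
    # IPV6ADDR is the only ADDRESS form written in square brackets.
    first_v6 = next((a for a in or_addresses if a.startswith("[")), None)
    if first_v6 is None:
        return [], list(or_addresses)
    remaining = list(or_addresses)
    remaining.remove(first_v6)  # drops only the first matching entry
    return [first_v6], remaining
```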

Directory authorities with more addresses:

  We need a way for a client to configure a TrustedDirServer as
  having multiple OR addresses, specifically so that we can give at
  least one default authority an IPv6 address for bootstrapping
  purposes.

  (Q: Do any of the current authorities have stable IPv6 addresses?)

  We will want to allow the address in a "dir-source" line in a vote
  to contain an IPv6 address, and/or allow voters to list themselves
  with more addresses in votes/consensuses.  But right now, nothing
  actually uses the addresses listed for voters in dir-source lines
  for anything besides log messages.

Client behavior:

  I propose that initially we shouldn't change client behavior too
  much here.

  (Q: Is there any advantage to having a client choose a random
  address?  If so we can do it later.  If not, why list any more
  than one IPv4 and one IPv6 address?)

  Tor clients not running with bridges, and running with IPv4
  support, should still use the address and ORPort as advertised in
  the "router" or "r" line of the appropriate directory object.

  Tor clients not running with bridges, and running without IPv4
  support, should use the first listed IPv6 address for a node,
  using the lowest-numbered listed port for that address.  They
  should only connect to nodes with an IPv6 address.

  Clients should accept Bridge lines with IPv6 addresses, and
  address:port sets, in addition to the lines they currently accept.

  Clients, for now, should only use the address:port from the router
  line when making EXTEND cells; see below.
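
  For non-bridge clients, the selection rules above reduce to something
  like the following Python sketch. The function name and the data shapes
  are assumptions made for illustration.

```python
def choose_connect_address(router_addr, or_addresses, have_ipv4):
    """Pick the (address, port) a non-bridge client should connect to.

    router_addr  -- (ipv4_address, orport) from the "router" or "r" line
    or_addresses -- list of (address, [ports]); IPv6 addresses are bracketed
    Returns None if the node cannot be used by this client.
    """
    if have_ipv4:
        return router_addr  # unchanged behavior for clients with IPv4
    for address, ports in or_addresses:
        if address.startswith("["):  # first listed IPv6 address
            return address, min(ports)  # lowest-numbered listed port
    return None  # IPv6-only clients skip nodes without an IPv6 address
```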

Nodes without IPv4 addresses:

  Currently Tor requires every node or bridge to have an IPv4
  address.  We will want to maintain this property for the
  foreseeable future, but we should define how a node without an IPv4
  address would advertise itself.

  Right now, there's no way to do that: if anything but an IPv4
  address appears in a router line of a routerdesc, or the "r" line of
  a consensus, then it won't parse.  If something that looks like an
  IPv4 address appears there, clients will (I believe) try to
  connect to it.

  We can make this work, though: let's allow nodes to list themselves
  with a magic IPv4 address (say, 127.1.1.1) if they have
  or-address entries containing only IPv6 addresses.  We could give
  these nodes a new flag other than Running to indicate that they're
  up, and not give them the Running flag.  That way, old clients
  would never try to use them, but new clients could know to treat
  the new flag as indicating that the node is running, and know not
  to connect to a node listed with address 127.1.1.1.

Interaction with EXTEND and NETINFO:

  Currently, EXTEND cells only support IPv4 addresses, so we should
  use only those.  There is a proposal draft to support more address
  types.

  A server's NETINFO cells must list all configured addresses for a
  server.

Why not extend DirPort this way too?

  Because clients are all using BEGINDIR these days.

  That is, clients tunnel their directory requests inside OR
  connections, and don't generally connect to DirPorts at all.

Why not have address and port ranges?

  Earlier drafts of this proposal suggested that servers should provide
  ranges of addresses, specified with bitmasks.  That's a neat idea for
  circumvention, but if we did that, you wouldn't want to advertise
  publicly that you have an entire address range.

  Port ranges are out because I don't think they would actually get used
  much, and they add a fair bit of complexity.

Coding impact:

  In addition to the obvious changes, we need to audit everything
  that looks up or compares OR connections and nodes by address:port
  under the assumptions that each node has only a single address or
  ORPort.

TODO:

  * Make it so that authorities can vote on which addresses are working
    somehow.

  * Specify some way to say "I only want to connect to v4/v6 addresses".

  * Come up with a better alternative to running6 for the long term?

Filename: 187-allow-client-auth.txt
Title: Reserve a cell type to allow client authorization
Author: Nick Mathewson
Created: 16-Oct-2011
Status: Closed
Target: 0.2.3.x

Overview:

  Proposals 176 and 184 introduce a new "v3" handshake, coupled with
  a new version 3 link protocol.  This is a good time to introduce
  other stuff we might need.

  One thing we might want is a scanning resistance feature for
  bridges.  This proposal suggests a change we should make right
  away to enable us to deploy such a feature in future versions of
  Tor.

Motivation:

  If an adversary has a suspected bridge address/port combination,
  the easiest way for them to confirm or disconfirm their suspicion
  is to connect to the address and see whether they can do a Tor
  handshake.  The easiest way to fix this problem seems to be to
  give out bridge addresses along with some secret that clients
  should know, but which an adversary shouldn't be able to learn
  easily.  The client should prove to the bridge that it's
  authorized to know about the bridge, before the bridge acts like a
  bridge.  If the client doesn't show knowledge of the proper
  secret, the bridge should act like an HTTPS server or a bittorrent
  tracker or something.

  This proposal *does not* specify a way for clients to authorize
  themselves at bridges; rather, it specifies changes that we should
  make now in order to allow this kind of authorization in the
  future.

Design:

  Currently, now that proposal 176 is implemented, if a server
  provides a certificate that indicates a v3 handshake, and the
  client understands how to do a V3 handshake, we specify that the
  client's first cell must be a VERSIONS cell.

  Instead, we make the following specification changes:

  We reserve a new variable-length cell type, "AUTHORIZE".

  We specify that any number of PADDING or VPADDING or AUTHORIZE
  cells may be sent by the client before it sends a VERSIONS cell.
  Servers that do not require client authorization MUST ignore such
  cells, except to include them when calculating the HMAC that will
  appear in the CLOG part of a client's AUTHENTICATE cell.

  We still specify that clients SHOULD send VERSIONS as their first
  cell; only in some future version of Tor will an AUTHORIZE cell be sent
  first.
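
  A server that does not require client authorization could follow these
  rules roughly as in the Python sketch below. The class is hypothetical,
  and the running SHA-256 is only a stand-in for the CLOG computation,
  whose exact construction is defined by proposal 176 rather than here.

```python
import hashlib

PRE_VERSIONS_IGNORED = {"PADDING", "VPADDING", "AUTHORIZE"}


class HandshakeLog:
    """Track client cells seen before VERSIONS on a server without client auth."""

    def __init__(self):
        self.digest = hashlib.sha256()  # stand-in for the CLOG computation
        self.versions_seen = False

    def on_client_cell(self, name: str, raw: bytes) -> bool:
        """Return True if the cell should be processed, False if ignored."""
        self.digest.update(raw)  # cells in this phase feed the digest
        if self.versions_seen:
            return True
        if name in PRE_VERSIONS_IGNORED:
            return False  # ignored, but already included in the digest
        if name != "VERSIONS":
            raise ValueError("unexpected cell before VERSIONS")
        self.versions_seen = True
        return True
```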

Discussion:

  This change allows future versions of the Tor client to know that
  some bridges need authorization, and to send them authentication
  before sending them anything recognizably Tor-like.

  The authorization cell needs to be received before the server can
  send any Tor cells, so we can't just patch it in after the
  VERSIONS cell exchange: the server's VERSIONS cell is unsendable
  until after the AUTHORIZE has been accepted.

  Note that to avoid scanning attacks, it's not sufficient to wait
  for a single cell, and then either handle it as authorization or
  reject the connection.  Instead, we need to decide what kind of
  server we're impersonating, and respond once the client has
  provided *either* an authorization cell, *or* a recognizably valid
  or invalid command in the impersonated protocol.


Alternative design: Just use pluggable transports

  Pluggable transports can do this too, but in general, we want to
  avoid designing the Tor protocol so that any particular desirable
  feature can only be done with a pluggable transport.  That is, any
  feature that *every* bridge should want, should be doable in Tor
  proper.

  Also, as of 16 Oct 2011, pluggable transports aren't in general
  use.  Past experience IMO suggests that we shouldn't offload
  architectural responsibilities to our chickens until they've
  hatched.

Alternative design: Out-of-TLS authorization

  There are features (like port-knocking) designed to allow a client
  to show that it's authorized to use a bridge before the TLS
  handshake even happens.  These are appropriate for bunches of
  applications, but they're trickier with an adversary who is
  MITMing the client.

Alternative design: Just use padding.

  Arguably, we could only add the "VPADDING" cell type to the list
  of those allowed before VERSIONS cells, and say that any client
  authorization we specify later on will be sent as a VPADDING
  cell.  But that design is kludgy: padding should be padding, not
  semantically significant.  Besides, cell types are still fairly
  plentiful.

Counterargument: specify it later

  We could, later on, say that if a client learns that a bridge
  needs authorization, it should send an AUTHORIZE cell.  So long as
  a client never sends an AUTHORIZE to anything other than a bridge that
  needs authorization, it'll never violate the spec.

  But all things considered, it seems easier (just a few lines of
  spec and code) to let bridges eat unexpected authorization now
  than it does to have stuff fail later when clients think that a
  bridge needs authorization but it doesn't.

Counterargument: it's too late!

  We've already got the prop176 branch merged and running on a few
  servers.  But as of this writing, it isn't in any Tor version.

  Even if it *is* out in an alpha before we can get this proposal
  accepted and implemented, that's not a big disaster.  In the worst
  case, where future clients don't know whom to send authorization
  to so they need to send it to _all_ v3 servers, they will
  break their connections only to a couple of alpha versions which
  one hopes by then will be long-deprecated already.

Filename: 188-bridge-guards.txt
Title: Bridge Guards and other anti-enumeration defenses
Author: Nick Mathewson, Isis Lovecruft
Created: 14 Oct 2011
Modified: 10 Sep 2015
Status: Reserve

   [NOTE: This proposal is marked as "reserve" because the enumeration
   technique it addresses does not currently seem to be in use. See
   ticket tor#7144 for more information. (2020 July 31)]


1. Overview

   Bridges are useful against censors only so long as the adversary
   cannot easily enumerate their addresses. I propose a design to make
   it harder for an adversary who controls or observes only a few
   nodes to enumerate a large number of bridges.

   Briefly: bridges should choose guard nodes, and use the Tor
   protocol's "Loose source routing" feature to re-route all extend
   requests from clients through an additional layer of guard nodes
   chosen by the bridge.  This way, only a bridge's guard nodes can
   tell that it is a bridge, and the attacker needs to run many more
   nodes in order to enumerate a large number of bridges.

   I also discuss other ways to avoid enumeration, recommending some.

   These ideas are due to a discussion at the 2011 Tor Developers'
   Meeting in Waterloo, Ontario.  Practically none of the ideas here
   are mine; I'm just writing up what I remember.

2. History and Motivation

   Under the current bridge design, an attacker who runs a node can
   identify bridges by seeing which "clients" make a large number of
   connections to it, or which "clients" make connections to it in the
   same way clients do.  This has been a known attack since early
   versions {XXXX check} of the design document; let's try to fix it.

2.1. Related idea: Guard nodes

   The idea of guard nodes isn't new: since 0.1.1, Tor has used guard
   nodes (first designed as "Helper" nodes by Wright et al in {XXXX})
   to make it harder for an adversary who controls a smaller number of
   nodes to eavesdrop on clients.  The rationale was: an adversary who
   controls or observes only one entry and one exit will have a low
   probability of correlating any single circuit, but over time, if
   clients choose a random entry and exit for each circuit, such an
   adversary will eventually see some circuits from each client with a
   probability of 1, thereby building a statistical profile of the
   client's activities.  Therefore, let each client choose its entry
   node only from among a small number of client-selected "guard"
   nodes: the client is still correlated with the same probability as
   before, but now the client has a nonzero chance of remaining
   unprofiled.

2.2. Related idea: Loose source routing

   Since the earliest versions of Onion Routing, the protocol has
   provided "loose source routing".  In strict source routing, the
   source of a message chooses every hop on the message's path.  But
   in loose source routing, the message traverses the selected nodes,
   but may also traverse other nodes as well.  In other words, the
   client selects nodes N_a, N_b, and N_c, but the message may in fact
   traverse any sequence of nodes N_1...N_j, so long as N_1=N_a,
   N_x=N_b, and N_y=N_c, for 1 < x < y.
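
   The constraint above can be stated as a short Python check. The function
   name is hypothetical; nodes are modelled as plain strings.

```python
def satisfies_loose_route(selected, traversed):
    """Check that `traversed` visits the `selected` nodes in order.

    The first traversed node must be the first selected node; extra nodes
    may appear anywhere after it.
    """
    if not selected or not traversed or traversed[0] != selected[0]:
        return False
    remaining = iter(traversed[1:])
    # `node in remaining` consumes the iterator up to the match, which
    # enforces the ordering constraint (1 < x < y in the text).
    return all(node in remaining for node in selected[1:])
```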

   Tor has retained this feature, but has not yet made use of it.

3. Design

   Every bridge currently chooses a set of guard nodes for its
   circuits.  Bridges should also re-route client circuits through
   these circuits.

   Specifically, when a bridge receives a request from a client to
   extend a circuit, it should first create a circuit to its guard,
   and then relay that extend cell through the guard.  The bridge
   should add an additional layer of encryption to outgoing cells on
   that circuit corresponding to the encryption that the guard will
   remove, and remove a layer of encryption on incoming cells on that
   circuit corresponding to the encryption that the guard will add.
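
   The layer handling can be illustrated with a toy Python sketch. It uses
   an XOR keystream derived with SHAKE-256 purely as a stand-in for Tor's
   real relay encryption, and a single key per hop instead of separate
   forward and reverse keys. All names are hypothetical, and nothing here
   is suitable for real use.

```python
import hashlib


def toy_layer(key: bytes, data: bytes) -> bytes:
    """XOR `data` with a SHAKE-256 keystream; applying it twice removes it."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def bridge_forward(cell: bytes, k_client: bytes, k_guard: bytes) -> bytes:
    """Outgoing cell at the bridge: strip the client's layer, add the guard's."""
    return toy_layer(k_guard, toy_layer(k_client, cell))


def bridge_backward(cell: bytes, k_client: bytes, k_guard: bytes) -> bytes:
    """Incoming cell at the bridge: strip the guard's layer, add the client's."""
    return toy_layer(k_client, toy_layer(k_guard, cell))
```

   After the guard removes its own layer, the next hop sees exactly what it
   would have seen had the bridge forwarded the client's cell directly.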

3.1. Loose-Source Routed Circuit Construction

   Alice, an OP, is using a bridge, Bob, and she has chosen the
   following path through the network:

       Alice -> Bob -> Charlie -> Deidra

   However, Bob has decided to take advantage of the loose-source
   routing circuit characteristic (for example, in order to use a bridge
   guard), and Bob has chosen N additional loose-source routed hop(s),
   through which he will transparently relay cells.

   NOTE: For the purposes of bridge guards, N is always 1.  However, for
   completeness' sake, the following details of the circuit construction
   are generalized to include N > 1.  Additionally, the following steps
   should hold for a hop at any position in Alice's circuit that has
   decided to take advantage of the loose-source routing feature, not
   only for bridge ORs.

   From Alice's perspective, her circuit path matches the one diagrammed
   above.  However, the overall path of the circuit is:

       Alice -> Bob -> Guillaume -> Charlie -> Deidra

   From Bob's perspective, the circuit's path is:

       Alice -> Bob -> Guillaume -> Charlie -> UNKNOWN

   Interestingly, because Bob's behaviour towards Guillaume and choices
   of cell types is that of a normal OP, Guillaume's perspective of the
   circuit's path is:

       Bob -> Guillaume -> Charlie -> UNKNOWN

   That is, to Guillaume, Bob appears (for the most part) to be a
   normally connecting client.  (See §4.1 for more detailed analysis.)

3.1.1. Detailed Steps of Loose-Source Routed Circuit Construction

   1. Connection from OP

      Alice has connected to Bob, and she has sent to Bob either a
      CREATE/CREATE_FAST or CREATE2 cell.

   2. Loose-Source Path Selection

      In anticipation of Alice's first RELAY_EARLY cell (which will
      contain an EXTEND cell to Alice's next hop), Bob begins
      constructing a loose-source routed circuit.  To do so, Bob chooses
      N additional hop(s):

      2.a. For the first additional hop, H_1, Bob chooses a suitable
           entry guard node, Guillaume, using the same algorithm as OPs.
           See "§5 Guard nodes" of path-spec.txt for additional
           information on the selection algorithm.

      2.b. Each additional hop, [H_2, ..., H_N], is chosen at random
           from a list of suitable, non-excluded ORs.

   3. Loose-Source Routed Circuit Extension and Cell Types

      Bob now follows the same procedure as OPs use to complete the key
      exchanges with his chosen additional hop(s).

      While undergoing these following substeps, Bob SHOULD continue to
      proceed with Step 4, below, in parallel, as an optimization for
      speeding up circuit construction.

      3.a. Create Cells

           Bob sends the appropriate type of create cell to Guillaume.
           For ORs new enough to support the NTor handshake (nearly all
           of them at this point), Bob sends a CREATE2 cell.  Otherwise,
           for ORs which only support the older TAP handshake, Bob sends
           either a CREATE or CREATE_FAST cell, using the same
           decision-making logic as OPs.

           See §4.1 for more information on the distinguishability of
           bridges based upon whether they use CREATE versus
           CREATE_FAST.  Also note that the CREATE2 cell has become
           ubiquitous since this proposal was originally drafted.
           Thus, because we prefer ORs which use loose-source routing to
           behave (as much as possible) like OPs, we now prefer to use
           CREATE2.

      3.b. Created Cells

           Later, when Bob receives a corresponding CREATED/CREATED_FAST
           or CREATED2 cell from Guillaume, Bob extracts key material
           for the shared forward and reverse keys, KG_f and KG_b,
           respectively.

      3.c. Extend Cells

           When N > 1, for each additional hop, H_i, in [H_2, ..., H_N],
           Bob chooses the appropriate type of extend cell for H_i, and
           sends this extend cell to H_i-1, who transforms it into a
           create cell in order to perform the extension.  To choose
           which type of extend cell to send, Bob uses the same
           algorithm as an OP to determine whether to use EXTEND or
           EXTEND2.  Similar to the CREATE* cells above, for most modern
           ORs, this will very likely mean an EXTEND2 cell.

      3.d. Extended Cells

           When a corresponding EXTENDED/EXTENDED2 cell is received for
           an additional hop, H_i, Bob extracts the shared forward and
           reverse keys, Ki_f and Ki_b, respectively.

   4. Responding to the OP

      Now that the additional hops in Bob's loose-source routed circuit
      are chosen, and construction of the loose-source routed circuit
      has begun, Bob answers Alice's original CREATE/CREATE_FAST or
      CREATE2 cell (from Step 1) by sending the corresponding created
      cell type.

      Alice has now built a circuit through Bob, and the two share the
      negotiated forward and reverse keys, KB_n and KB_p, respectively.

      Note that Bob SHOULD do this step in tandem with the loose-source
      routed circuit construction procedure outlined in Step 3, above.

   5. OP Circuit Extension

      Alice then wants to extend the circuit to node Charlie.  She makes
      a hybrid-encrypted onionskin, encrypted to Charlie's public key,
      containing her chosen g^x value.  She puts this in an extend cell:
      "Extend (Charlie's address) (Charlie's OR Port) (Onionskin)
      (Charlie's ID)".  She encrypts this with KB_n and sends it as a
      RELAY_EARLY cell to Bob.

      Bob's behaviour is now dependent on whether the loose-source
      routed circuit construction steps (as outlined in Step 3, above)
      have already completed.

      5.a. The Loose-Source Routed Circuit Construction is Incomplete

           If Bob has not yet finished the loose-source routed circuit
           construction, then Bob MUST store the first outgoing
           (i.e. exitward) RELAY_EARLY cell received from Alice until
           the loose-source routed circuit construction has been
           completed.

           If any incoming (i.e. toward the OP) RELAY* cell is received
           while the loose-source routed circuit is not fully
           constructed, Bob MUST drop the cell.

           If Bob has already stored Alice's first RELAY_EARLY cell, and
           Alice sends any additional RELAY* cell, then Bob SHOULD mark
           the entire circuit for close with END_CIRC_REASON_TORPROTOCOL.

      5.b. The Loose-Source Routed Circuit Construction is Completed

           Later, when the loose-source routed circuit is fully
           constructed, Bob MUST send any stored cells from Alice
           outward by following the procedure described in Step 6.a.
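
           The buffering rules in Steps 5.a. and 5.b. can be sketched
           as a small state machine.  This is only an illustration: the
           class and method names are hypothetical, cells are modelled
           as opaque byte strings, and the sketch does not check that
           the first stored cell really is a RELAY_EARLY cell.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Symbolic stand-in for the real circuit-close reason code.
END_CIRC_REASON_TORPROTOCOL = "TORPROTOCOL"


@dataclass
class BridgeCircuit:
    """Hypothetical model of Bob's state for one client circuit."""
    lsr_complete: bool = False            # are the additional hops built?
    stored_cell: Optional[bytes] = None   # Alice's first RELAY_EARLY cell
    sent_outward: List[bytes] = field(default_factory=list)
    close_reason: Optional[str] = None

    def on_outgoing_cell(self, cell: bytes) -> None:
        """A RELAY* cell arrives from Alice (exitward direction)."""
        if self.close_reason is not None:
            return
        if self.lsr_complete:
            self.sent_outward.append(cell)    # Step 6.a. would run here
        elif self.stored_cell is None:
            self.stored_cell = cell           # MUST store the first cell
        else:
            # A further cell before completion: SHOULD close the circuit.
            self.close_reason = END_CIRC_REASON_TORPROTOCOL

    def on_incoming_cell(self, cell: bytes) -> Optional[bytes]:
        """A RELAY* cell arrives from the additional hops (toward the OP)."""
        if not self.lsr_complete:
            return None                       # MUST drop the cell
        return cell                           # Step 6.b. would run here

    def on_lsr_complete(self) -> None:
        """The loose-source routed circuit is now fully constructed."""
        self.lsr_complete = True
        if self.stored_cell is not None and self.close_reason is None:
            self.sent_outward.append(self.stored_cell)
            self.stored_cell = None
```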

   6. Relay Cells

      When receiving a RELAY* cell in either direction, Bob MAY keep
      statistics on the number of relay cells encountered, as well as
      the number of relay cells relayed.

      6.a. Outgoing Relay Cells

           Bob decrypts the RELAY* cell with KB_n.  If the cell becomes
           recognized, Bob should now follow the relay command checks
           described in Step 6.c.

           Bob MUST encrypt the relay cell's underlying payload to each
           additional hop in the loose-source routed circuit, in
           reverse: for each additional hop, H_i, in [H_N, ..., H_1],
           Bob encrypts the relay cell payload to Ki_f, the shared
           forward key for the hop H_i.

           Bob MUST update the forward digest, DG_f, of the relay cell,
           regardless of whether or not the cell is recognized.  See
           6.c. for additional information on recognized cells.

           Bob now sends the cell outwards through the additional hops.
           At each hop, H_i, the hop removes a layer of the onionskin by
           decrypting the cell with Ki_f, and then hop H_i forwards the
           cell to the next additional hop, H_i+1.  When the final
           additional hop, H_N, receives the cell, the OP's cell command
           and payload should be processed by H_N in the normal manner
           for an OR.

      6.b. Incoming Relay Cells

           Bob MUST decrypt the relay cell's underlying payload from
           each additional hop in the loose-source routed circuit (in
           forward order, this time): For each additional hop, H_i, in
           [H_1, ..., H_N], Bob decrypts the relay cell payload with
           Ki_b, the shared backward key for the hop H_i.

           If the cell becomes recognized after all decryptions, Bob
           should now follow the relay command checks described in Step
           6.c.

           Bob MUST update the backward digest, DG_b, of the relay cell,
           regardless of whether or not the cell is recognized.  See
           6.c. for additional information on recognized cells.

           Bob encrypts the cell towards the OP with KB_p, and sends the
           cell inwards.
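
           The layering order in Steps 6.a. and 6.b. can be sketched as
           follows.  This is a toy model under stated assumptions: the
           keystream built from SHA-256 is a stand-in for Tor's real
           stream cipher, the per-cell index replaces the cipher's
           running state, the function names are hypothetical, and the
           digest updates (DG_f, DG_b) are omitted.

```python
import hashlib
from typing import List


def keystream(key: bytes, cell_index: int, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (NOT Tor's real cipher)."""
    out = b""
    block = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + cell_index.to_bytes(8, "big") + block.to_bytes(4, "big")
        ).digest()
        block += 1
    return out[:length]


def crypt(key: bytes, cell_index: int, payload: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, cell_index, len(payload))
    return bytes(a ^ b for a, b in zip(payload, ks))


def bob_outgoing(payload: bytes, kb_n: bytes,
                 hop_fwd_keys: List[bytes], idx: int) -> bytes:
    """Step 6.a.: remove Alice's layer, then add layers for H_N ... H_1."""
    cell = crypt(kb_n, idx, payload)
    for key in reversed(hop_fwd_keys):      # order [H_N, ..., H_1]
        cell = crypt(key, idx, cell)
    return cell


def bob_incoming(payload: bytes, kb_p: bytes,
                 hop_bwd_keys: List[bytes], idx: int) -> bytes:
    """Step 6.b.: remove layers for H_1 ... H_N, then add Alice's layer."""
    cell = payload
    for key in hop_bwd_keys:                # order [H_1, ..., H_N]
        cell = crypt(key, idx, cell)
    return crypt(kb_p, idx, cell)
```

           Here hop_fwd_keys and hop_bwd_keys hold [K1_f, ..., KN_f] and
           [K1_b, ..., KN_b].  Because the toy cipher is a plain XOR,
           the layers happen to commute; the loops still follow the
           order given in the text, which does not rely on that
           property.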

      6.c. Recognized Cells

           If a relay cell, either incoming or outgoing, becomes
           recognized (i.e. Bob sees that the cell was intended for him)
           after decryption, and there is no stream attached to the
           circuit, then Bob SHOULD mark the circuit for close if the
           relay command contained within the cell is any of the
           following types:

               - RELAY_BEGIN
               - RELAY_CONNECTED
               - RELAY_END
               - RELAY_RESOLVE
               - RELAY_RESOLVED
               - RELAY_BEGIN_DIR

           Apart from the above checks, Bob SHOULD essentially treat
           every cell as "unrecognized" by following the en-/de-cryption
           procedures in Steps 6.a. and 6.b. regardless of whether the
           cell is actually recognized or not.  That is, since this is a
           loose-source routed circuit, Bob SHOULD relay cells not
           intended for him *and* cells intended for him through the
           leaky pipe, no matter what the cell's underlying payload and
           command are.
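
           The check in Step 6.c. reduces to a small predicate.  The
           following sketch uses the relay command names from the list
           above as plain strings; the function name and its arguments
           are hypothetical.

```python
# Relay commands that make no sense on a circuit with no attached stream.
CLOSE_IF_NO_STREAM = frozenset({
    "RELAY_BEGIN",
    "RELAY_CONNECTED",
    "RELAY_END",
    "RELAY_RESOLVE",
    "RELAY_RESOLVED",
    "RELAY_BEGIN_DIR",
})


def should_close_circuit(recognized: bool, command: str,
                         has_stream: bool) -> bool:
    """Return True if Bob SHOULD mark the circuit for close (Step 6.c.)."""
    return recognized and not has_stream and command in CLOSE_IF_NO_STREAM
```

           In every other case the cell is relayed as described in
           Steps 6.a. and 6.b., whether or not it was recognized.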

3.1.2. Example Loose-Source Circuit Construction

   For example, given the following circuit path chosen by Alice:

       Alice -> Bob -> Charlie -> Deidra

   when Alice wishes to extend to node Charlie, and Bob the bridge is
   using only one additional loose-source routed hop, Guillaume, as his
   bridge guard, the following steps are taken:

       - Alice packages the extend into a RELAY_EARLY cell and encrypts
         the RELAY_EARLY cell with KB_n to Bob.

       - Bob receives the RELAY_EARLY cell from Alice, and he follows
         the procedure (outlined in §3.1.1. Step 6.a.) by:

           * Decrypting the cell with KB_n,
           * Encrypting the cell to the forward key, KG_f, which Bob
             shares with his guard node, Guillaume,
           * Updating the cell forward digest, DG_f, and
           * Sending the cell as a RELAY_EARLY cell to Guillaume.

       - When Guillaume receives the cell from Bob, he processes it by:

           * Decrypting the cell with KG_f.  Guillaume now sees that it
             is a RELAY_EARLY cell containing an extend cell "intended"
             for him, containing: "Extend (Charlie's address) (Charlie's
             OR Port) (Onionskin) (Charlie's ID)".
           * Performing the circuit extension to the specified node,
             Charlie, by acting accordingly: creating a connection to
             Charlie if he doesn't have one, ensuring that the ID is as
             expected, and then sending the onionskin in a create cell
             on that connection.  Note that Guillaume is behaving
             exactly as a regular node would upon receiving an Extend
             cell.
           * Now the handshake finishes.  Charlie receives the onionskin
             and sends Guillaume "CREATED g^y,KH".
           * Making an extended cell for Bob which contains
             "E(KG_b, EXTENDED g^y KH)", and
           * Sending the extended cell to Bob.  Note that Charlie and
             Guillaume are both still behaving in a manner identical to
             regular ORs.

       - Bob receives the extended cell from Guillaume, and he follows
         the procedure (outlined in §3.1.1. Step 6.b.) by:

           * Decrypting the cell with KG_b,
           * Encrypting the cell to Alice with KB_p,
           * Updating the cell backward digest, DG_b, and
           * Sending the cell to Alice.

       - Alice receives the cell, and she decrypts it with KB_p, just
         as she would have if Bob had extended to Charlie directly.
         She then processes the extended cell contained within to
         extract shared keys with Charlie.  Note that Alice's behaviour
         is identical to that of a regular OP.

3.2. Additional Notes on the Construction

   Note that this design does not require that our stream cipher
   operations be commutative, even though they are.

   Note also that this design requires no change in behavior from any
   node other than Bob, and as we can see in the above example in §3.1.2
   for Alice's circuit extension, Alice, Guillaume, and Charlie behave
   identically to a normal OP and normal ORs.

   Finally, observe that even though the circuit is N hops longer than
   it would be otherwise, no relay's count of permissible RELAY_EARLY
   cells falls lower than it otherwise would.  This is because Bob
   builds the extra hops with RELAY_EARLY cells and then continues to
   relay Alice's cells as RELAY_EARLY until the appropriate maximum
   number of RELAY_EARLY cells is reached.  Afterwards, further
   RELAY_EARLY cells from Alice are repackaged by Bob as normal RELAY
   cells.
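
   The RELAY_EARLY bookkeeping described above can be sketched as a
   simple counter.  The class name is hypothetical, and the default
   limit of 8 is an assumption taken from the per-circuit RELAY_EARLY
   limit in tor-spec; this proposal only says "the appropriate maximum
   number".

```python
class RelayEarlyBudget:
    """Hypothetical counter for RELAY_EARLY cells Bob may still send outward."""

    def __init__(self, max_relay_early: int = 8, used_by_bridge: int = 0) -> None:
        # used_by_bridge: RELAY_EARLY cells Bob already spent extending
        # to his own additional hops.
        self.remaining = max(0, max_relay_early - used_by_bridge)

    def outgoing_command(self, command_from_alice: str) -> str:
        """Pick the command Bob uses when relaying one of Alice's cells."""
        if command_from_alice != "RELAY_EARLY":
            return command_from_alice
        if self.remaining > 0:
            self.remaining -= 1
            return "RELAY_EARLY"
        return "RELAY"      # budget exhausted: repackage as a normal RELAY
```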

4. Alternative designs

4.1. Client-enforced bridge guards

   What if Tor didn't have loose source routing?  We could have
   bridges tell clients what guards to use by advertising those guards
   in their descriptors, and then refusing to extend circuits to any
   other nodes.  This change would require all clients to upgrade in
   order to be able to use the newer bridges, and would quite possibly
   cause a fair amount of pain along the way.

   Fortunately, we don't need to go down this path.  So let's not!

4.2. Separate bridge-guards and client-guards

   In the design above, I specify that bridges should use the same
   guard nodes for extending client circuits as they use for their own
   circuits.  It's not immediately clear whether this is a good idea
   or not.  Having separate sets would seem to make the two kinds of
   circuits more easily distinguishable (even though we already assume
   they are distinguishable).  Having different sets of guards would
   also seem like a way to keep the nodes who guard our own traffic
   from learning that we're a bridge... but another set of nodes will
   learn that anyway, so it's not clear what we'd gain.

   One good reason to keep separate guard lists is to prevent the
   *client* of the bridge from being able to enumerate the guards that
   the bridge uses to protect its own traffic (by extending a circuit
   through the bridge to a node it controls, and finding out where the
   extend request arrives from).

5. Additional bridge enumeration methods and protections

   In addition to the design above, there are more ways to try to
   prevent enumeration.

   Right now, there are multiple ways for the node after a bridge to
   distinguish a circuit extended through the bridge from one
   originating at the bridge.  (This lets the node after the bridge
   tell that a bridge is talking to it.)

5.1. Make it harder to tell clients from bridges

   When using the older TAP circuit handshake protocol, one of the
   giveaways is that the first hop in a circuit is created with
   CREATE_FAST cells, but all subsequent hops are created with CREATE
   cells.

   However, because nearly everything in the network now uses the newer
   NTor circuit handshake protocol, clients send CREATE2 cells to all
   hops, regardless of position.  Therefore, in the above design, it's
   no longer quite so simple to distinguish an OP connecting through a
   bridge from an actual OP, since all of the circuits that extend
   through a bridge now reach its guards through CREATE2 cells (whether
   the bridge originated them or not), and only as a fallback (e.g. if
   an additional node in the loose-source routed path does not support
   NTor) will the bridge ever use CREATE/CREATE_FAST.  (Additionally,
   when using the fallback method, the behaviour for choosing either
   CREATE or CREATE_FAST is identical to normal OP behaviour.)

   The CREATE/CREATE_FAST distinction is not the only way for a
   bridge's guard to tell bridges from ordinary clients, however.
   Most importantly, a busy bridge will open far more circuits than a
   client would.  More subtly, the response latency from the client
   will be higher and more variable than it would be with an
   ordinary client.  I don't think we can make bridges behave wholly
   indistinguishably from clients: that's why we should go with guard
   nodes for bridges.

   [XXX For further research: we should study the methods by which a
   bridge guard can determine that they are acting as a guard for a
   bridge, rather than for a normal OP, and which methods are likely to
   be more accurate or efficient than others. -IL]

5.2. Bridge Reachability Testing

   Currently, a bridge's reachability is tested both by the bridge
   itself (called "self-testing") and by the BridgeAuthority.

5.2.1. Bridge Reachability Self-Testing

   Before a bridge uploads its descriptors to the BridgeAuthority, it
   creates a special type of testing circuit which ends at itself:

       Bob -> Guillaume -> Charlie -> Bob

   Thus, the benefit of later using loose-source routing to relay
   Alice's traffic through Guillaume (rather than connecting directly
   to Charlie, as Alice intended) is diminished by the fact that
   Charlie can still passively enumerate bridges by waiting to be
   asked to connect to a node which is not contained within the
   consensus.

   We could get around this problem by disabling self-testing for bridges
   entirely, by automatically setting "AssumeReachable 1" for all bridge
   relays… although I am not sure if this is wise.

   Our best idea thus far, for bridge reachability self-testing, is to create
   a circuit like so:

       Bridge → Guard → Middle → OtherMiddle → Guard → Bridge

   While, clearly, that circuit is just a little bit insane, it must be that
   way because we cannot simply do:

       Bridge → Guard → Middle → Guard → Bridge

   because the Middle would refuse to extend back to the previous node
   (all ORs follow this rule).  Similarly, it would be inane to do:

       Bridge → Guard → Middle → OtherMiddle → Bridge

   because, obviously, that merely shifts the problem to OtherMiddle and
   accomplishes nothing.  [XXX Is there something smarter we could do? —IL]

5.2.2. Bridge Reachability Testing by the BridgeAuthority

   After receiving Bob's descriptors, the BridgeAuthority attempts to
   connect to Bob's ORPort by making a direct TLS connection to the
   bridge's advertised ORPort.

   Should we change this behaviour?  On the one hand, at least this
   does not enable any random OR in the entire network to enumerate
   bridges.  On the other hand, any adversary who can observe packets
   from the BridgeAuthority is capable of enumeration.

6. Other considerations

   What fraction of our traffic is bridge traffic?  Will this alter
   our circuit selection weights?

Filename: 189-authorize-cell.txt
Title: AUTHORIZE and AUTHORIZED cells
Author: George Kadianakis
Created: 04 Nov 2011
Status: Obsolete

1. Overview

   Proposal 187 introduced the concept of the AUTHORIZE cell, a cell
   whose purpose is to make Tor bridges resistant to scanning attacks.

   This is achieved by having the bridge and the client share a secret
   out-of-band and then use AUTHORIZE cells to validate that the
   client indeed knows that secret before proceeding with the Tor
   protocol.

   This proposal specifies the format of the AUTHORIZE cell and also
   introduces the AUTHORIZED cell, a way for bridges to announce to
   clients that the authorization process is complete and successful.

2. Motivation

   AUTHORIZE cells should be able to perform a variety of
   authorization protocols based on a variety of shared secrets. This
   forces the AUTHORIZE cell to have a dynamic format based on the
   authorization method used.

   AUTHORIZED cells are used by bridges to signal the end of a
   successful bridge client authorization and the beginning of the
   actual link handshake. AUTHORIZED cells have no other use and for
   this reason their format is very simple.

   Both AUTHORIZE and AUTHORIZED cells are to be used under censorship
   conditions and they should look innocuous to any adversary capable
   of monitoring network traffic.

   As an attack example, an adversary could passively monitor the
   traffic of a bridge host, looking at the packets directly after the
   TLS handshake and trying to deduce from their packet size if they
   are AUTHORIZE and AUTHORIZED cells. For this reason, AUTHORIZE and
   AUTHORIZED cells are padded with a random amount of padding before
   sending.

3. Design

3.1. AUTHORIZE cell

   The AUTHORIZE cell is a variable-sized cell.

   The generic AUTHORIZE cell format is:

         AuthMethod                       [1 octet]
         MethodFields                     [...]
         PadLen                           [2 octets]
         Padding                          ['PadLen' octets]

   where:

   'AuthMethod', is the authorization method to be used.

   'MethodFields', is dependent on the authorization Method used. It's
                   a meta-field hosting an arbitrary amount of fields.

   'PadLen', specifies the amount of padding in octets.
   Implementations SHOULD pick 'PadLen' to be a random integer from 1
   to 3141 inclusive.

   'Padding', is 'PadLen' octets of random content.

3.2. AUTHORIZED cell format

   The AUTHORIZED cell is a variable-sized cell.

   The AUTHORIZED cell format is:

         'AuthMethod'                       [1 octet]
         'PadLen'                           [2 octets]
         'Padding'                          ['PadLen' octets]

   where all fields have the same meaning as in section 3.1.

3.3. Cell parsing

   Implementations MUST ignore the contents of 'Padding'.

   Implementations MUST reject an AUTHORIZE or AUTHORIZED cell where
   the 'Padding' field is not 'PadLen' octets long.

   Implementations MUST reject an AUTHORIZE cell with an 'AuthMethod'
   they don't recognize.
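
   The formats in sections 3.1 and 3.2 and the parsing rules above can
   be sketched as follows.  This is a minimal illustration under stated
   assumptions: it handles only the cell body (not the variable-length
   cell header), it assumes 'PadLen' is in network byte order, the
   method number 0x01 is made up, and the caller is assumed to know the
   length of 'MethodFields' for the chosen method.

```python
import secrets
import struct
from typing import Tuple

MAX_PAD_LEN = 3141
KNOWN_AUTH_METHODS = {0x01}   # hypothetical method number, for illustration


def build_authorize(auth_method: int, method_fields: bytes) -> bytes:
    """Build an AUTHORIZE body; pass b"" as method_fields for AUTHORIZED."""
    pad_len = 1 + secrets.randbelow(MAX_PAD_LEN)   # random integer in [1, 3141]
    return (struct.pack("!B", auth_method)
            + method_fields
            + struct.pack("!H", pad_len)
            + secrets.token_bytes(pad_len))


def parse_authorize(body: bytes, fields_len: int) -> Tuple[int, bytes]:
    """Parse a body; use fields_len=0 for an AUTHORIZED cell."""
    header_len = 1 + fields_len + 2
    if len(body) < header_len:
        raise ValueError("truncated cell")
    auth_method = body[0]
    if auth_method not in KNOWN_AUTH_METHODS:
        raise ValueError("unrecognized AuthMethod")
    method_fields = body[1:1 + fields_len]
    (pad_len,) = struct.unpack("!H", body[1 + fields_len:header_len])
    if len(body) - header_len != pad_len:
        raise ValueError("Padding is not PadLen octets long")
    # The contents of Padding are deliberately ignored.
    return auth_method, method_fields
```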

4. Discussion

4.1. What's up with the [1,3141] padding bytes range?

   The upper limit is larger than the Ethernet MTU so that AUTHORIZE
   and AUTHORIZED cells are not always transmitted in a single
   packet. Other than that, it's indeed pretty much arbitrary.

4.2. Why not let the pluggable transports do the padding, like they
     are supposed to do for the rest of the Tor protocol?

   The arguments of section "Alternative design: Just use pluggable
   transports" of proposal 187 apply here as well:

   All bridges who use client authorization will also need padded
   AUTHORIZE and AUTHORIZED cells.

4.3. How should multiple round-trip authorization protocols be handled?

   Protocols that require multiple round trips between the client and
   the bridge should use AUTHORIZE cells for communication.

   The format of the AUTHORIZE cell is flexible enough to support
   messages from the client to the bridge and the reverse.

   At the end of a successful multiple-round-trip protocol, an
   AUTHORIZED cell must be issued from the bridge to the client.

4.4. AUTHORIZED seems useless. Why not use VPADDING instead?

   As noted in proposal 187, the Tor protocol uses VPADDING cells for
   padding; any other use of VPADDING makes the Tor protocol kludgy.

   In the future, and in the example case of a v3 handshake, a client
   can optimistically send a VERSIONS cell along with the final
   AUTHORIZE cell of an authorization protocol. That allows the
   bridge, in the case of successful authorization, to also process
   the VERSIONS cell and begin the v3 handshake promptly.

4.5. What should actually happen when a bridge rejects an AUTHORIZE
     cell?

   When a bridge detects a badly formed or malicious AUTHORIZE cell,
   it should assume that the other side is an adversary scanning for
   bridges. The bridge should then act accordingly to avoid detection.

   This proposal does not try to specify how a bridge can avoid
   detection by an adversary.

Filename: 190-shared-secret-bridge-authorization.txt
Title: Bridge Client Authorization Based on a Shared Secret
Author: George Kadianakis
Created: 04 Nov 2011
Status: Obsolete

Notes: This is obsoleted by pluggable transports.

1. Overview

   Proposals 187 and 189 introduced AUTHORIZE and AUTHORIZED cells.
   Their purpose is to make bridge relays scanning-resistant against
   censoring adversaries capable of probing hosts to observe whether
   they speak the Tor protocol.

   This proposal specifies a bridge client authorization scheme based
   on a shared secret between the bridge user and bridge operator.

2. Motivation

   A bridge client authorization scheme should only allow clients who
   show knowledge of a shared secret to talk Tor to the bridge.

3. Shared-secret-based authorization

3.1. Where do shared secrets come from?

   A shared secret is a piece of data known only to the bridge
   operator and the bridge client.

   It's meant to be automatically generated by the bridge
   implementation to avoid issues with insecure and weak passwords.

   Bridge implementations SHOULD create shared secrets by generating
   random data using a strong RNG or PRNG.

3.2. AUTHORIZE cell format

   In shared-secret-based authorization, the MethodFields field of the
   AUTHORIZE cell becomes:

       'shared_secret'               [10 octets]

   where:

   'shared_secret', is the shared secret between the bridge operator
                    and the bridge client.

3.3. Cell parsing

   Bridge implementations MUST reject any AUTHORIZE cells whose
   'shared_secret' field does not match the shared secret negotiated
   between the bridge operator and authorized bridge clients.

4. Tor implementation

4.1. Bridge side

   Tor bridge implementations MUST create the bridge shared secret by
   generating 10 octets of random data using a strong RNG or PRNG.

   Tor bridge implementations MUST store the shared secret in
   'DataDirectory/keys/bridge_auth_ss_key' in hexadecimal encoding.

   Tor bridge implementations MUST support the boolean
   'BridgeRequireClientSharedSecretAuthorization' configuration file
   option which enables bridge client authorization based on a shared
   secret.

   If 'BridgeRequireClientSharedSecretAuthorization' is set, bridge
   implementations MUST generate a new shared secret, if
   'DataDirectory/keys/bridge_auth_ss_key' does not already exist.
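
   The bridge-side behaviour above can be sketched as follows.  The
   function names are hypothetical, and two details are assumptions
   that the proposal does not specify: the restrictive file
   permissions and the constant-time comparison.

```python
import hmac
import os
import secrets

SECRET_LEN = 10  # octets, as required in this section


def load_or_create_shared_secret(data_dir: str) -> bytes:
    """Return the stored shared secret, generating it if it is missing."""
    path = os.path.join(data_dir, "keys", "bridge_auth_ss_key")
    if os.path.exists(path):
        with open(path, "r", encoding="ascii") as f:
            secret = bytes.fromhex(f.read().strip())
        if len(secret) != SECRET_LEN:
            raise ValueError("stored shared secret has the wrong length")
        return secret
    secret = secrets.token_bytes(SECRET_LEN)     # strong RNG
    os.makedirs(os.path.dirname(path), mode=0o700, exist_ok=True)
    # Assumption: owner-only permissions; O_EXCL avoids clobbering a file
    # created concurrently.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="ascii") as f:
        f.write(secret.hex() + "\n")             # hexadecimal encoding
    return secret


def secret_matches(expected: bytes, received: bytes) -> bool:
    """Compare the 'shared_secret' field without leaking timing details."""
    return hmac.compare_digest(expected, received)
```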

4.2. Client side

   Tor client implementations must extend their Bridge line format to
   support bridge shared secrets. The new format is:
     Bridge [<method>] <address[:port]> [["keyid="]<id-fingerprint>] ["shared_secret="<shared_secret>]

   where <shared_secret> is the bridge shared secret in hexadecimal
   encoding.

   Tor clients who use bridges with shared-secret-based client
   authorization must specify the bridge's shared secret as in:
     Bridge 12.34.56.78 shared_secret=934caff420aa7852b855

5. Discussion

5.1. What should actually happen when a bridge rejects an AUTHORIZE
     cell?

   When a bridge detects a badly formed or malicious AUTHORIZE cell,
   it should assume that the other side is an adversary scanning for
   bridges. The bridge should then act accordingly to avoid detection.

   This proposal does not try to specify how a bridge can avoid
   detection by an adversary.

6. Acknowledgements

   Thanks to Nick Mathewson and Robert Ransom for the help and
   suggestions while writing this proposal.

Filename: 191-mitm-bridge-detection-resistance.txt
Title: Bridge Detection Resistance against MITM-capable Adversaries
Author: George Kadianakis
Created: 07 Nov 2011
Status: Obsolete

1. Overview

   Proposals 187, 189 and 190 make the first steps toward scanning
   resistant bridges. They attempt to block attacks from censoring
   adversaries who provoke bridges into speaking the Tor protocol.

   An attack vector that hasn't been explored in those previous
   proposals is that of an adversary capable of performing Man In The
   Middle attacks against Tor clients. At the moment, Tor clients
   using the v3 link protocol have no way to detect such an MITM
   attack, and will gladly send a VERSIONS or AUTHORIZE cell to the
   MITMed connection, thereby revealing the Tor protocol and thus the
   bridge.

   This proposal introduces a way for clients to detect an MITMed SSL
   connection, allowing them to protect against the above attack.

2. Motivation

   When the v3 link handshake protocol is performed, Tor's SSL
   handshake is performed with the server sending a self-signed
   certificate and the client blindly accepting it. This allows the
   adversary to perform an MITM attack.

   A Tor client must detect the MITM attack before he initiates the
   Tor protocol by sending a VERSIONS or AUTHORIZE cell. A good
   moment to detect such an MITM attack is during the SSL handshake.

   To achieve that, bridge operators provide their bridge users with a
   hash digest of the public-key certificate their bridge is using for
   SSL. Bridge clients store that hash digest locally and associate it
   with that specific bridge. Bridge clients who have "pinned" a
   bridge to a certificate "fingerprint" can thereafter validate that
   their SSL connection peer is the intended bridge.

   Of course, the hash digest must be provided to users out-of-band
   and before the actual SSL handshake. Usually, the bridge operator
   gives the hash digest to her bridge users along with the rest of
   the bridge credentials, like the bridge's address and port.

3. Security implications

   Bridge clients who have pinned a bridge to a certificate
   fingerprint will be able to detect an MITMing adversary in time.
   If after detection they act as an innocuous Internet
   client, they can successfully remove suspicion from the SSL
   connection and subvert bridge detection.

   Pinning a certificate fingerprint and detecting an MITMing attacker
   does not automatically alleviate suspicions from the bridge or the
   client. Clients must have a behavior to follow after detecting the
   MITM attack so that they look like innocent Netizens. This proposal
   does not try to specify such a behavior.

   Implementation and use of this scheme does not render bridges and
   clients immune to scanning or DPI attacks. This scheme should be
   used along with bridge client authorization schemes like the ones
   detailed in proposal 190.

4. Tor Implementation

4.1. Certificate fingerprint creation

   The certificate fingerprints used in this scheme MUST be computed
   by applying the SHA256 cryptographic hash function upon the ASN.1
   DER encoding of a public-key certificate, then truncating the hash
   output to 12 bytes, encoding it to RFC4648 Base32 and omitting any
   trailing padding '='.
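
   A minimal sketch of this computation follows; the function names
   are hypothetical.  Read literally, a 12-byte truncation encodes to
   20 Base32 characters.  The 39-character example in section 4.3
   decodes to ASCII hex digits, which suggests (as an inference, not
   something this proposal states) that it was produced by
   Base32-encoding the 24-character hex text of the truncated digest.
   The second helper reproduces that variant for comparison.

```python
import base64
import hashlib


def link_cert_fingerprint(cert_der: bytes) -> str:
    """Literal reading: Base32 of the first 12 bytes of SHA256(DER)."""
    digest = hashlib.sha256(cert_der).digest()[:12]
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def fingerprint_of_hex_text(cert_der: bytes) -> str:
    """Inferred variant: Base32 of the hex text of the truncated digest."""
    hex_text = hashlib.sha256(cert_der).digest()[:12].hex().encode("ascii")
    return base64.b32encode(hex_text).decode("ascii").rstrip("=")
```

   The first helper always returns 20 characters and the second always
   returns 39, since 12 bytes are 96 bits and 24 bytes are 192 bits,
   at 5 bits per Base32 character.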

4.2. Bridge side implementation

   Tor bridge implementations SHOULD provide a command line option
   that exports a fully equipped Bridge line containing the bridge
   address and port, the link certificate fingerprint, and any other
   enabled Bridge options, so that bridge operators can easily send it
   to their users.

   In the case of expiring SSL certificates, Tor bridge
   implementations SHOULD warn the bridge operator a sensible amount
   of time before the expiration, so that she can warn her clients and
   potentially rotate the certificate herself.

4.3. Client side implementation

   Tor client implementations MUST extend their Bridge line format to
   support bridge SSL certificate fingerprints. The new format is:
     Bridge [<method>] <address[:port]> [["keyid="]<id-fingerprint>] \
       ["shared_secret="<shared_secret>] ["link_cert_fpr="<fingerprint>]

   where <fingerprint> is the bridge's SSL certificate fingerprint.

   Tor clients who use bridges and want to pin their SSL certificates
   must specify the bridge's SSL certificate fingerprint as in:
     Bridge 12.34.56.78 shared_secret=934caff420aa7852b855 \
         link_cert_fpr=GM4GEMBXGEZGKOJQMJSWINZSHFSGMOBRMYZGCMQ
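
   A client could pick the named options out of such a Bridge line
   roughly as follows.  This is a sketch, not Tor's actual config
   parser: the function name is hypothetical, and positional tokens
   (method, address, bare fingerprint) are returned unparsed because
   distinguishing them is outside the scope of this example.

```python
from typing import Dict, List, Tuple

KNOWN_OPTIONS = ("keyid", "shared_secret", "link_cert_fpr")


def parse_bridge_line(line: str) -> Tuple[List[str], Dict[str, str]]:
    """Split a Bridge line into positional tokens and key=value options."""
    # Join backslash-continued lines, then split on whitespace.
    tokens = line.replace("\\\n", " ").split()
    if not tokens or tokens[0] != "Bridge":
        raise ValueError("not a Bridge line")
    positional: List[str] = []
    options: Dict[str, str] = {}
    for tok in tokens[1:]:
        key, sep, value = tok.partition("=")
        if sep and key in KNOWN_OPTIONS:
            options[key] = value
        else:
            positional.append(tok)
    secret = options.get("shared_secret")
    # bytes.fromhex raises ValueError on non-hex input.
    if secret is not None and len(bytes.fromhex(secret)) != 10:
        raise ValueError("shared_secret must be 10 octets in hexadecimal")
    return positional, options
```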

4.4. Implementation prerequisites

   Tor bridges currently rotate their SSL certificates every 2
   hours. This not only acts as a fingerprint for the bridges, but it
   also acts as a blocker for this proposal.

   Tor trac ticket #4390 and proposal YYY were created to resolve this
   issue.

5. Other ideas

5.1. Certificate tagging using a shared secret

   Another idea worth considering is having the bridge use the shared
   secret from proposal 190 to embed a "secret message" on her
   certificate, which could only be understood by a client who knows
   that shared secret, essentially authenticating the bridge.

   Specifically, the bridge would "tag" the Serial Number (or any
   other covert field) of her certificate with the (potentially
   truncated) HMAC of her link public key, using the shared secret of
   proposal 190 as the key: HMAC(shared_secret, link_public_key).

   A client knowing the shared secret would be able to verify the
   'link_public_key' and authenticate the bridge, and since the Serial
   Number field is usually composed of random bytes a probing attacker
   would not notice the "tagging" of the certificate.
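
   The tagging idea can be sketched as follows.  The proposal does not
   name a hash function or a tag length, so HMAC-SHA256 and an 8-octet
   truncation are assumptions made for illustration, and the function
   names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 8  # octets; assumed, sized to fit an 8-byte serial number


def certificate_tag(shared_secret: bytes, link_public_key: bytes) -> bytes:
    """Truncated HMAC(shared_secret, link_public_key) for the serial field."""
    mac = hmac.new(shared_secret, link_public_key, hashlib.sha256).digest()
    return mac[:TAG_LEN]


def verify_certificate_tag(shared_secret: bytes, link_public_key: bytes,
                           serial_number: bytes) -> bool:
    """Client-side check that the serial number carries the expected tag."""
    expected = certificate_tag(shared_secret, link_public_key)
    return hmac.compare_digest(expected, serial_number)
```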

   Arguments for this scheme are that it:
   a) doesn't need extra bridge credentials apart from the shared secret
      of prop190.
   b) doesn't need any maintenance in case of certificate expiration.

   Arguments against this scheme are:
   a) In the case of self-signed certificates, OpenSSL creates an
      8-byte random serial number, and we would probably need
      something more than 8 bytes to tag. There are not many other
      covert fields in SSL certificates mutable by vanilla OpenSSL.
   b) It complicates the scheme, and if not implemented and researched
      wisely it might also make it fingerprintable.
   c) We most probably won't be able to tag CA-signed certificates.

6. Discussion

6.1. In section 4.1, why do you truncate the SHA256 output to 12 bytes?!

   Bridge credentials are frequently propagated by word of mouth or
   are physically written down, which renders the occult Base64
   encoding unsatisfactory. The 104-character Base32 encoding or the
   64-character hex representation of the SHA256 output would also be
   too much bloat.

   By truncating the SHA256 output to 12 bytes and encoding it with
   Base32, we get 39 characters of readable and easy to transcribe
   output, and sufficient security. Finally, dividing '39' by the
   golden ratio gives us about 24.10!

7. Acknowledgements

   Thanks to Robert Ransom for his great help and suggestions on
   devising this scheme and writing this proposal!

Filename: 192-store-bridge-information.txt
Title: Automatically retrieve and store information about bridges
Author: Sebastian Hahn
Created: 16-Nov-2011
Status: Obsolete
Target: 0.2.[45].x

Overview:
Currently, tor already stores some information about the bridges it is
configured to use locally, but doesn't make great use of the stored
data. This data is the Tor configuration information about the bridge
(IP address, port, and optionally fingerprint) and the bridge descriptor
which gets stored along with the other descriptors a Tor client fetches,
as well as an "EntryGuard" line in the state file. That line includes
the Tor version we used to add the bridge, and a slightly randomized
timestamp (up to a month earlier than the real date). The descriptor
data also includes some more accurate timestamps about when the
descriptor was fetched.

The information we give out about bridges via bridgedb currently only
includes the IP address and port, because giving out the fingerprint as
well might mean that Tor clients make direct connections to the bridge
authority, since we didn't design Tor's UpdateBridgesFromAuthority
behaviour correctly.

Motivation:

The only way to let Tor know about a change affecting the bridge (IP
address or port change) is to either ask the bridge authority directly,
or reconfigure Tor. The former requires making a non-anonymized direct
connection to the bridge authority Tonga and asking it for the current
descriptor of the bridge with a given fingerprint - this is unsafe and
also requires prior knowledge of the fingerprint. The latter requires
user intervention, first to learn that there was an update and second to
actually teach Tor about the change.

This is way too complicated for most users, and should be unnecessary
while the user has at least one bridge that remains working: Tonga can
give out bridge descriptors when asked for the descriptor for a certain
fingerprint, and Tor clients learn the fingerprint either from their
torrc file or from the first connection they make to a bridge.

For some users, however, this option is not what they want: They might
use private bridges or have special security concerns, which would make
them want to connect to the IP addresses specified in their
configuration only, and not tell Tonga about the set of bridges they
know about, even through a Tor circuit. Also see
https://blog.torproject.org/blog/different-ways-use-bridge for more
information about the different types of bridge users.

Design:

Tor should provide a new configuration option that allows bridge users
to indicate that they wish to contact Tonga anonymously and learn about
updates for the bridges that they know about, but can't currently reach.
Once those updates have been received, the clients would then hold on to
the new information in their state file, and use it across restarts for
connection attempts.

The option UpdateBridgesFromAuthority should be removed or recycled for
this purpose, as it is currently dangerous to set (it makes direct
connections to the bridge authority, thus leaking that a user is about
to use bridges). Recycling the option is probably the better choice,
because current users of the option get a surprising and never useful
behaviour. On the other hand, users who downgrade their Tors might get
the old behaviour by accident.

If configured with this option, tor would make an anonymized connection
to Tonga to ask for the descriptors of bridges that it cannot currently
connect to, once every few hours. Making more frequent requests would
likely not help, as bridge information doesn't typically change that
frequently, and may overload Tonga.

This information needs to be stored in the state file:

- An exact copy of the Bridge stanza in the torrc file, so that tor can
  detect when the bridge is unconfigured/the configuration is changed

- The IP address, port, and fingerprint we last used when making a
  successful connection to the bridge, if this differs from/supplements
  the configured data.

- The IP address, port, and fingerprint we learned from the bridge
  authority, if this differs from both the configured data and the data
  we used for the last successful connection.

We don't store more data in the state file to avoid leaking too much if
the state file falls into the hands of an adversary.
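
The three items above could be modelled roughly as follows. This is
only a sketch under stated assumptions: the class name, field names and
methods are hypothetical, and it does not reflect Tor's actual state
file syntax. It only shows the "store it only if it differs" rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (IP address, port, fingerprint or None if unknown)
Endpoint = Tuple[str, int, Optional[str]]


@dataclass
class BridgeState:
    """Hypothetical per-bridge record; not Tor's real state file format."""
    configured_line: str                       # exact copy of the torrc Bridge stanza
    configured: Endpoint                       # data parsed from that stanza
    last_used: Optional[Endpoint] = None       # kept only if it differs from `configured`
    from_authority: Optional[Endpoint] = None  # kept only if it differs from both

    def matches_torrc(self, line: str) -> bool:
        # Lets tor detect that the bridge was unconfigured or changed.
        return line.strip() == self.configured_line.strip()

    def record_success(self, used: Endpoint) -> None:
        self.last_used = used if used != self.configured else None
        if self.from_authority in (self.configured, self.last_used):
            self.from_authority = None

    def record_authority(self, learned: Endpoint) -> None:
        if learned in (self.configured, self.last_used):
            self.from_authority = None
        else:
            self.from_authority = learned
```

Comparing whole tuples means a newly learned fingerprint counts as a
difference, which covers the "supplements the configured data" case.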

Security implications:

Storing sensitive data on disk is risky when the computer one uses gets
into the wrong hands, and state file entries can be used to identify
times the user was online. This is already a problem for the Bridge
lines in a user's configuration file, but by storing more information
about bridges some timings can be deduced.

Another risk is that this allows long-term tracking of users when the
set of bridges a user knows about is known to the attacker, and the set
is unique.  This is not very hard to achieve for bridgedb, as users
typically make requests to it non-anonymized and bridgedb can
selectively pick bridges to report. By combining the data about
descriptor fetches on Tonga and this fingerprint, a usage pattern can be
established. Also, bridgedb could give out a made-up fingerprint to a
user that requested bridges, thus easily creating a unique set.

Users of private bridges should not set this option, as it will leak the
fingerprints of their bridges to Tonga. This is not a huge concern, as
Tonga doesn't know about those descriptors, but private bridge users
will likely want to avoid leaking the existence of their bridge. We
might want to figure out a way to indicate that a bridge is private on
the Bridge line in the configuration, so fetching the descriptor from
Tonga is disabled for those automatically. This warrants more discussion
to find a solution that doesn't require bridge users to understand the
trade-offs of setting a configuration option.

One idea is to indicate that a bridge is private by a special flag in
its bridge descriptor, so clients can avoid leaking those to the bridge
authority automatically. Also, Bridge lines for private bridges
shouldn't include the fingerprint so that users don't accidentally leak
the fingerprint to the bridge authority before they have talked to the
bridge.

Specification:

No change/addition to the current specification is necessary, as the
data that gets stored at clients is not covered by the specification.
This document is supposed to serve as a basis for discussion and to
provide hints for implementors.

Compatibility:

Tonga is already set up to send out descriptors requested by clients, so
the bridge authority side doesn't need any changes. The new
configuration options governing the behaviour of Tor would be
incompatible with previous versions, so the torrc needs to be adapted.
The state file changes should not affect older versions.

Filename: 193-safe-cookie-authentication.txt
Title: Safe cookie authentication for Tor controllers
Author: Robert Ransom
Created: 2012-02-04
Status: Closed

Overview:

  Not long ago, all Tor controllers which automatically attempted
  'cookie authentication' were vulnerable to an information-disclosure
  attack.  (See https://bugs.torproject.org/4303 for slightly more
  information.)

  Now, some Tor controllers which automatically attempt cookie
  authentication are only vulnerable to an information-disclosure
  attack on any 32-byte files they can read.  But the Ed25519
  signature scheme (among other cryptosystems) has 32-byte secret
  keys, and we would like to not worry about Tor controllers leaking
  our secret keys to whatever can listen on what the controller thinks
  is Tor's control port.

  Additionally, we would like to not have to remodel Tor's innards and
  rewrite all of our Tor controllers to use TLS on Tor's control port
  this week (or deal with the many design issues which that would
  raise).

Design:

From af6bf472d59162428a1d7f1d77e6e77bda827414 Mon Sep 17 00:00:00 2001
From: Robert Ransom <rransom.8774@gmail.com>
Date: Sun, 5 Feb 2012 04:02:23 -0800
Subject: [PATCH] Add SAFECOOKIE control-port authentication method

---
 control-spec.txt |   59 ++++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/control-spec.txt b/control-spec.txt
index 66088f7..3651c86 100644
--- a/control-spec.txt
+++ b/control-spec.txt
@@ -323,11 +323,12 @@
   For information on how the implementation securely stores authentication
   information on disk, see section 5.1.
 
-  Before the client has authenticated, no command other than PROTOCOLINFO,
-  AUTHENTICATE, or QUIT is valid.  If the controller sends any other command,
-  or sends a malformed command, or sends an unsuccessful AUTHENTICATE
-  command, or sends PROTOCOLINFO more than once, Tor sends an error reply and
-  closes the connection.
+  Before the client has authenticated, no command other than
+  PROTOCOLINFO, AUTHCHALLENGE, AUTHENTICATE, or QUIT is valid.  If the
+  controller sends any other command, or sends a malformed command, or
+  sends an unsuccessful AUTHENTICATE command, or sends PROTOCOLINFO or
+  AUTHCHALLENGE more than once, Tor sends an error reply and closes
+  the connection.
 
   To prevent some cross-protocol attacks, the AUTHENTICATE command is still
   required even if all authentication methods in Tor are disabled.  In this
@@ -949,6 +950,7 @@
       "NULL"           / ; No authentication is required
       "HASHEDPASSWORD" / ; A controller must supply the original password
       "COOKIE"         / ; A controller must supply the contents of a cookie
+      "SAFECOOKIE"       ; A controller must prove knowledge of a cookie
 
      AuthCookieFile = QuotedString
      TorVersion = QuotedString
@@ -970,9 +972,9 @@
   methods that Tor currently accepts.
 
   AuthCookieFile specifies the absolute path and filename of the
-  authentication cookie that Tor is expecting and is provided iff
-  the METHODS field contains the method "COOKIE".  Controllers MUST handle
-  escape sequences inside this string.
+  authentication cookie that Tor is expecting and is provided iff the
+  METHODS field contains the method "COOKIE" and/or "SAFECOOKIE".
+  Controllers MUST handle escape sequences inside this string.
 
   The VERSION line contains the Tor version.
 
@@ -1033,6 +1035,47 @@
 
   [TAKEOWNERSHIP was added in Tor 0.2.2.28-beta.]
 
+3.24. AUTHCHALLENGE
+
+  The syntax is:
+    "AUTHCHALLENGE" SP "AUTHMETHOD=SAFECOOKIE"
+                    SP "COOKIEFILE=" AuthCookieFile
+                    SP "CLIENTCHALLENGE=" 2*HEXDIG / QuotedString
+                    CRLF
+
+  The server will reject this command with error code 512, then close
+  the connection, if Tor is not using the file specified in the
+  AuthCookieFile argument as a controller authentication cookie file.
+
+  If the server accepts the command, the server reply format is:
+    "250-AUTHCHALLENGE"
+            SP "CLIENTRESPONSE=" 64*64HEXDIG
+            SP "SERVERCHALLENGE=" 2*HEXDIG
+            CRLF
+
+  The CLIENTCHALLENGE, CLIENTRESPONSE, and SERVERCHALLENGE values are
+  encoded/decoded in the same way as the argument passed to the
+  AUTHENTICATE command.
+
+  The CLIENTRESPONSE value is computed as:
+    HMAC-SHA256(HMAC-SHA256("Tor server-to-controller cookie authenticator",
+                            CookieString)
+                ClientChallengeString)
+  (with the HMAC key as its first argument)
+
+  After a controller sends a successful AUTHCHALLENGE command, the
+  next command sent on the connection must be an AUTHENTICATE command,
+  and the only authentication string which that AUTHENTICATE command
+  will accept is:
+    HMAC-SHA256(HMAC-SHA256("Tor controller-to-server cookie authenticator",
+                            CookieString)
+                ServerChallengeString)
+
+  [Unlike other commands besides AUTHENTICATE, AUTHCHALLENGE may be
+  used (but only once!) before AUTHENTICATE.]
+
+  [AUTHCHALLENGE was added in Tor FIXME.]
+
 4. Replies
 
   Reply codes follow the same 3-character format as used by SMTP, with the
-- 
1.7.8.3

Rationale:

  The weird inner HMAC was meant to ensure that whatever impersonates
  Tor's control port cannot even abuse a secret key meant to be used
  with HMAC-SHA256.

  Then I added the server-to-controller challenge-response
  authentication step, to ensure that the server can only use a
  controller as an HMAC oracle if it already knows the contents of the
  cookie file.  Now, the inner HMAC is just a not-very-efficient way
  to keep controllers from using the server as an oracle for its own
  challenges (it could be replaced with a hash function).
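
  The two nested HMAC computations from the patch above can be
  sketched in Python as follows. This follows the patch text in this
  proposal only; the function names are made up for the example, and
  the construction that was eventually merged into control-spec.txt
  may differ. In each HMAC the key is the first argument, so the fixed
  string keys the inner HMAC and the inner result keys the outer one.

```python
import hashlib
import hmac
import os

SERVER_TO_CONTROLLER = b"Tor server-to-controller cookie authenticator"
CONTROLLER_TO_SERVER = b"Tor controller-to-server cookie authenticator"


def _nested_hmac(label: bytes, cookie: bytes, challenge: bytes) -> bytes:
    # Inner HMAC: key = fixed label, message = cookie file contents.
    inner = hmac.new(label, cookie, hashlib.sha256).digest()
    # Outer HMAC: key = inner result, message = the challenge.
    return hmac.new(inner, challenge, hashlib.sha256).digest()


def client_response(cookie: bytes, client_challenge: bytes) -> bytes:
    """Value the server returns as CLIENTRESPONSE."""
    return _nested_hmac(SERVER_TO_CONTROLLER, cookie, client_challenge)


def authenticate_string(cookie: bytes, server_challenge: bytes) -> bytes:
    """Only string the following AUTHENTICATE command may carry."""
    return _nested_hmac(CONTROLLER_TO_SERVER, cookie, server_challenge)


def server_knows_cookie(cookie: bytes, client_challenge: bytes,
                        received: bytes) -> bool:
    """Controller-side check, done before it answers the server's challenge."""
    expected = client_response(cookie, client_challenge)
    return hmac.compare_digest(expected, received)


if __name__ == "__main__":
    cookie = os.urandom(32)            # stand-in for the cookie file contents
    client_challenge = os.urandom(32)  # the challenge length here is an assumption
    reply = client_response(cookie, client_challenge)
    print(server_knows_cookie(cookie, client_challenge, reply))
```

  A controller that runs the check in server_knows_cookie() first
  never sends anything derived from the cookie to a peer that cannot
  already read the cookie file.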

Filename: 194-mnemonic-urls.txt
Title: Mnemonic .onion URLs
Author: Sai, Alex Fink
Created: 29-Feb-2012
Status: Superseded

1. Overview

  Currently, canonical Tor .onion URLs consist of a naked 80-bit hash[1]. This
  is not something that users can even recognize for validity, let alone produce
  directly. It is vulnerable to partial-match fuzzing attacks[2], where a
  would-be MITM attacker generates a very similar hash and uses various social
  engineering, wiki poisoning, or other methods to trick the user into visiting
  the spoof site.

  This proposal gives an alternative method for displaying and entering .onion
  and other URLs, such that they will be easily remembered and generated by end
  users, and easily published by hidden service websites, without any dependency
  on a full domain name type system such as Namecoin[3]. This makes it easier
  to implement (requiring only a change in the proxy).

  This proposal could equally be used for IPv4, IPv6, etc, if normal DNS is for
  some reason untrusted.

  This is not a petname system[4], in that it does not allow service providers
  or users[5] to associate a name of their choosing to an address[6]. Rather, it
  is a mnemonic system that encodes the 80-bit .onion address into a
  meaningful[7] and memorable sentence. A full petname system (based on
  registration of some kind, and allowing for shorter, service-chosen URLs) can
  be implemented in parallel[8].

  This system has the three properties of being secure, distributed, and
  human-meaningful — it just doesn't also have choice of name (except of course
  by brute force creation of multiple keys to see if one has an encoding the
  operator likes).

  This is inspired by Jonathan Ackerman's "Four Little Words" proposal[9] for
  doing the same thing with IPv4 addresses. We just need to handle 80+ bits, not
  just 32 bits.

  It is similar to Markus Jakobsson & Ruj Akavipat's FastWord system[10], except
  that it does not permit user choice of passphrase, does not know what URL a
  user will enter (vs verifying against a single stored password), and again has
  to encode significantly more data.

  This is also similar to RFC1751[11], RFC2289[12], and multiple other
  fingerprint encoding systems[13] (e.g.  PGPfone[14] using the PGP
  wordlist[15], and Arturo Filatsò's OnionURL[16]), but we aim to make something
  that's as easy as possible for users to remember — and significantly easier
  than just a list of words or pseudowords, which we consider only useful as an
  active confirmation tool, not as something that can be fully memorized and
  recalled, like a normal domain name.

2. Requirements

2.1. encodes at least 80 bits of random data (preferably more, eg for a
checksum)

2.2. valid, visualizable English sentence — not just a series of words[17]

2.3. words are common enough that non-native speakers and bad spellers will have
minimum difficulty remembering and producing (perhaps with some spellcheck help)

2.4. not syntactically confusable (e.g. order should not matter)

2.5. short enough to be easily memorized and fully recalled at will, not just
recognized

2.6. no dependency on an external service

2.7. dictionary size small enough to be reasonable for end users to download as
part of the onion package

2.8. consistent across users (so that websites can e.g. reinforce their random
hash's phrase with a clever drawing)

2.9. not create offensive sentences that service providers will reject

2.10. resistant against semantic fuzzing (e.g. by having uniqueness against
WordNet synsets[18])

3. Possible implementations

  This section is intentionally left unfinished; full listing of template
  sentences and the details of their parser and generating implementation is
  co-dependent on the creation of word class dictionaries fulfilling the above
  criteria. Since that's fairly labor-intensive, we're pausing at this stage for
  input first, to avoid wasting work.

3.1. Have a fixed number of template sentences, such as:

  1. Adj subj adv vtrans adj obj
  2. Subj and subj vtrans adj obj
  3. … etc

  For a 6 word sentence, with 8 (3b) templates, we need ~12b (4k word)
  dictionaries for each word category.

  If multiple words of the same category are used, they must either play
  different grammatical roles (eg subj vs obj, or adj on a different item), be
  chosen from different dictionaries, or there needs to be an order-agnostic way
  to join them at the bit level. Preferably this should be avoided, just to
  prevent users forgetting the order.
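
  To make the bit budget concrete, the Python sketch below splits an
  80-bit value into a template index and six word indices, and joins
  them back. All names and sizes in it are assumptions for the
  example. In particular, 3 template bits plus six 12-bit words carry
  only 75 bits, so the sketch uses 13-bit (8192-word) dictionaries,
  which gives 3 + 6*13 = 81 bits.

```python
from typing import List, Tuple

HASH_BITS = 80
TEMPLATE_BITS = 3        # 8 templates, as in the text above
WORDS_PER_SENTENCE = 6
WORD_BITS = 13           # assumption: 8192-word dictionaries


def hash_to_indices(value: int) -> Tuple[int, List[int]]:
    """Split an 80-bit value into a template index and six word indices."""
    if not 0 <= value < (1 << HASH_BITS):
        raise ValueError("expected an 80-bit value")
    template = value & ((1 << TEMPLATE_BITS) - 1)
    value >>= TEMPLATE_BITS
    indices = []
    for _ in range(WORDS_PER_SENTENCE):
        indices.append(value & ((1 << WORD_BITS) - 1))
        value >>= WORD_BITS
    return template, indices


def indices_to_hash(template: int, indices: List[int]) -> int:
    """Inverse of hash_to_indices()."""
    value = 0
    for index in reversed(indices):
        value = (value << WORD_BITS) | index
    return (value << TEMPLATE_BITS) | template
```

  Each index would then select a word from the dictionary for the
  grammatical slot that the chosen template assigns to that position.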

3.2. As 3.1, but treat sentence generation as decoding a prefix code, and have
  a Huffman code for each word class.

  We suppose it’s okay if the generated sentence has a few more words than it
  might, as long as they’re common lean words.  E.g., for adjectives, "good"
  might cost only six bits while "unfortunate" costs twelve.

  Choice between different sentence syntaxes could be worked into the prefix
  code as well, and potentially done separately for each syntactic constituent.
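
  A toy version of the prefix-code idea is sketched below in Python.
  The two code tables are invented for the example (a real table would
  be derived from word frequencies); the point is only that common
  words consume fewer bits of the hash than rare ones.

```python
from typing import Dict, Iterator, List

# Hypothetical prefix codes: shorter codewords for more common words.
ADJECTIVES: Dict[str, str] = {
    "0": "good", "10": "small", "110": "yellow", "111": "unfortunate",
}
NOUNS: Dict[str, str] = {"0": "cat", "10": "camel", "11": "teapot"}


def take_word(bits: Iterator[str], code: Dict[str, str]) -> str:
    """Consume bits until they spell a codeword of `code`."""
    prefix = ""
    for bit in bits:
        prefix += bit
        if prefix in code:
            return code[prefix]
    raise ValueError("bit stream ended in the middle of a codeword")


def bits_to_words(bitstring: str, classes: List[Dict[str, str]]) -> List[str]:
    """Decode one word per word class, in order, from a string of '0'/'1'."""
    bits = iter(bitstring)
    return [take_word(bits, code) for code in classes]
```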

4. Usage

  To form a mnemonic .onion URL, just join the words with dashes or underscores,
  stripping minimal words like 'a', 'the', 'and' etc., and append '.onion'. This
  can be readily distinguished from standard hash-style .onion URLs by form.

  Translation should take place at the client — though hidden service servers
  should also be able to output the mnemonic form of hashes, to assist
  website operators in publishing them (e.g. by posting an amusing drawing of
  the described situation on their website to reinforce the mnemonic).

  After the translation stage of name resolution, everything proceeds as normal
  for an 80-bit hash onion URL.

  The user should be notified of the mnemonic form of hash URL in some way, and
  have an easy way in the client UI to translate mnemonics to hashes and vice
  versa. For the purposes of browser URLs and the like though, the mnemonic
  should be treated on par with the hash; if the user enters a mnemonic URL they
  should not be redirected to the hash version. (If anything, the opposite
  may be true, so that users become used to seeing and verifying the mnemonic
  version of hash URLs, and gain the security benefits against partial-match
  fuzzing.)

  Ideally, inputs that don't validly resolve should have a response page served
  by the proxy that uses a simple spell-check system to suggest alternate domain
  names that are valid hash encodings. This could hypothetically be done inline
  in URL input, but would require changes on the browser (normally domain names
  aren't subject to spellcheck), and this avoids that implementation problem.

5. International support

  It is not possible for this scheme to support non-English languages without
  a) (usually) Unicode in domains (which is not yet well supported by browsers),
  and
  b) fully customized dictionaries and phrase patterns per language

  The scheme must not be used in an attempted 'translation' by simply replacing
  English words with glosses in the target language. Several of the necessary
  features would be completely mangled by this (e.g. other languages have
  different synonym, homonym, etc groupings, not to mention completely different
  grammar).

  It is unlikely a priori that URLs constructed using a non-English
  dictionary/pattern setup would in any sense 'translate' semantically to
  English; more likely is that each language would have completely unrelated
  encodings for a given hash.

  We intend to only make an English version at first, to avoid these issues
  during testing.

________________

[1] https://trac.torproject.org/projects/tor/wiki/doc/HiddenServiceNames
https://gitweb.torproject.org/torspec.git/blob/HEAD:/address-spec.txt
[2] http://www.thc.org/papers/ffp.html
[3] http://dot-bit.org/Namecoin
[4] https://en.wikipedia.org/wiki/Zooko's_triangle
[5] https://addons.mozilla.org/en-US/firefox/addon/petname-tool/
[6] However, service operators can generate a large number of hidden service
descriptors and check whether their hashes result in a desirable phrasal
encoding (much like certain hidden services currently use brute force generated
hashes to ensure their name is the prefix of their raw hash). This won't get you
whatever phrase you want, but will at least improve the likelihood that it's
something amusing and acceptable.
[7] "Meaningful" here inasmuch as e.g. "Barnaby thoughtfully mangles simplistic
yellow camels" is an absurdist but meaningful sentence. Absurdness is a feature,
not a bug; it decreases the probability of mistakes if the scenario described is
not one that the user would try to fit into a template of things they have
previously encountered IRL. See research into linguistic schema for further
details.
[8] https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt
[9] http://blog.rabidgremlin.com/2010/11/28/4-little-words/
[10] http://fastword.me/
[11] https://tools.ietf.org/html/rfc1751
[12] http://tools.ietf.org/html/rfc2289
[13] https://github.com/singpolyma/mnemonicode
http://mysteryrobot.com
https://github.com/zacharyvoase/humanhash
[14] http://www.mathcs.duq.edu/~juola/papers.d/icslp96.pdf
[15] http://en.wikipedia.org/wiki/PGP_word_list
[16] https://github.com/hellais/Onion-url
https://github.com/hellais/Onion-url/blob/master/dev/mnemonic.py
[17] http://www.reddit.com/r/technology/comments/ecllk
[18] http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html

Filename: 195-TLS-normalization-for-024.txt
Title: TLS certificate normalization for Tor 0.2.4.x
Author: Jacob Appelbaum, Gladys Shufflebottom, Nick Mathewson, Tim Wilde
Created: 6-Mar-2012
Status: Dead
Target: 0.2.4.x


0. Introduction

   The TLS (Transport Layer Security) protocol was designed for security
   and extensibility, not for uniformity.  Because of this, it's not
   hard for an attacker to tell one application's use of TLS from
   another's.

   We propose improvements to Tor's current TLS certificates to
   reduce the distinguishability of Tor traffic.

0.1. History

   This draft is based on parts of Proposal 179, by Jacob Appelbaum
   and Gladys Shufflebottom, but removes some already implemented parts
   and replaces others.

0.2. Non-Goals

   We do not address making TLS harder to distinguish after the
   handshake is done.  We also do not discuss TLS improvements not
   related to distinguishability (such as increased key size, algorithm
   choice, and so on).

1. Certificate Issues

   Currently, Tor generates certificates according to a fixed pattern,
   where lifetime is fairly small, the certificate Subject DN is a
   single randomly generated CN, and the certificate Issuer DN is a
   different single randomly generated CN.

   We propose several ways to improve this below.

1.1. Separate initial certificate from link certificate

   When Tor is using the v2 or v3 link handshake (see tor-spec.txt), it
   currently presents an initial handshake authenticating the link key
   with the identity key.

   We propose instead that Tor should be able to present an arbitrary
   initial certificate (so long as its key matches the link key used in
   the actual TLS handshake), and then present the real certificate
   authenticating the link key during the Tor handshake.  (That is,
   during the v2 handshake's renegotiation step, or in the v3
   handshake's CERTS cell.)

   The TLS protocol and the Tor handshake protocol both allow this, and
   doing so will give us more freedom for the alternative certificate
   presentation ideas below.

1.2. Allow externally generated certificates

   It should be possible for a Tor relay operator to generate and
   provide their own certificate and secret key.  This will allow a relay or
   bridge operator to use a certificate signed by any member of the "SSL
   mafia,"[*] to generate their own self-signed certificate, and so on.

   For compatibility, we need to require that the key be an RSA secret
   key, of at least 1024 bits, generated with e=65537.

   As a proposed interface, let's require that the certificate be stored
   in ${DataDir}/tls_cert/tls_certificate.crt , that the secret key be
   stored in ${DataDir}/tls_cert/private_tls_key.key , and that they be
   used instead of generating our own certificate whenever the new
   boolean option "ProvidedTLSCert" is set to true.

   (Alternative interface: Allow the cert and key to be stored
   wherever, and have the user provide their respective locations with
   TLSCertificateFile and TLSCertificateKeyFile options.)

1.3. Longer certificate lifetimes

   Tor's current certificates aren't long-lived, which makes them
   different from most other certificates in the wild.

   Typically, certificates are valid for a year, so let's use that as
   our default lifetime.  [TODO: investigate whether "a year" for most
   CAs and self-signed certs have their validity dates running for a
   calendar year ending at the second of issue, one calendar year
   ending at midnight, or 86400*(365.5 +/- .5) seconds, or what.]

   There are two ways to approach this.  We could continue our current
   certificate management approach where we frequently generate new
   certificates (albeit with longer lifetimes), or we could make a cert,
   store it to disk, and use it for all or most of its declared
   lifetime.

   If we continue to use fairly short lifetimes for the _true_ link
   certificates (the ones presented during the Tor handshake), then
   presenting long-lived certificates doesn't hurt us much: in the event
   of a link-key-only compromise, the adversary still couldn't actually
   impersonate a server for long.[**]

   Using shorter-lived certificates with long nominal lifetimes doesn't
   seem to buy us much.  It would let us rotate link keys more
   frequently, but we're already getting forward secrecy from our use of
   Diffie-Hellman key agreement.  Further, it would make our behavior
   look less like regular TLS behavior, where certificates are typically
   used for most of their nominal lifetime.  Therefore, let's store and
   use certs and link keys for the full year.

1.4. Self-signed certificates with better DNs

   When we generate our own certificates, we currently set no DN fields
   other than the commonName.  This behavior isn't terribly common:
   users of self-signed certs usually/often set other fields too.
   [TODO: find out frequency.]

   Unfortunately, it appears that no particular other set of fields or
   way of filling them out _is_ universal for self-signed certificates,
   or even particularly common.  The most common schema seem to be for
   things most censors wouldn't mind blocking, like embedded devices.
   Even the default openssl schema, though common, doesn't appear to
   represent a terribly large fraction of self-signed websites.  [TODO:
   get numbers here.]

   So the best we can do here is probably to reproduce the process that
   results in self-signed certificates originally: let the bridge and relay
   operators pick the DN fields themselves.  This is an annoying
   interface issue, and wants a better solution.

1.5. Better commonName values

   Our current certificates set the commonName to a randomly generated
   field like www.rmf4h4h.net.  This is also a weird behavior: nearly
   all TLS certs used for web purposes will have a hostname that
   resolves to their IP.

   The simplest way to get a plausible commonName here would be to do a
   reverse lookup on our IP and try to find a good hostname.  It's not
   clear whether this would actually work out in practice, or whether
   we'd just get dynamic-IP-pool hostnames everywhere blocked when they
   appear in certificates.

   Alternatively, if we are told a hostname in our Torrc (possibly in
   the Address field), we could try to use that.

2. TLS handshake issues

2.1. Session ID.

   Currently we do not send an SSL session ID, as we do not support session
   resumption.  However, Apache (and likely other major SSL servers) do have
   this support, and do send a 32 byte SSLv3/TLSv1 session ID in their Server
   Hello cleartext.  We should do the same to avoid an easy fingerprinting
   opportunity.  It may be necessary to lie to OpenSSL to claim that we are
   tracking session IDs to cause it to generate them for us.

   (We should not actually support session resumption.)




[*] "Hey buddy, it's a nice website you've got there.  Sure would be a
    shame if somebody started poppin' up warnings on all your users'
    browsers, tellin' everybody that you're _insecure_..."

[**] Furthermore, a link-key-only compromise isn't very realistic atm;
     nearly any attack that would let an adversary learn a link key would
     probably let the adversary learn the identity key too.  The most
     plausible way would probably be an implementation bug in OpenSSL or
     something.

Filename: 196-transport-control-ports.txt
Title: Extended ORPort and TransportControlPort
Author: George Kadianakis, Nick Mathewson
Created: 14 Mar 2012
Status: Closed
Implemented-In: 0.2.5.2-alpha

1. Overview

  Proposal 180 defined Tor pluggable transports, a way to decouple
  protocol-level obfuscation from the core Tor protocol in order to
  better resist client-bridge censorship. This is achieved by
  introducing pluggable transport proxies, programs that obfuscate Tor
  traffic to resist DPI detection.

  Proposal 180 defined a way for pluggable transport proxies to
  communicate with local Tor clients and bridges, so as to exchange
  traffic. This document extends this communication protocol, so that
  pluggable transport proxies can exchange arbitrary operational
  information and metadata with Tor clients and bridges.

2. Motivation

  The communication protocol specified in Proposal 180 gives a way
  for transport proxies to announce the IP address of their clients
  to tor. Still, modern pluggable transports might have more (?)
  needs than this. For example:

  1. Tor might want to inform pluggable transport proxies on how to
     rate-limit incoming or outgoing connections.

  2. Server pluggable transport proxies might want to pass client
     information to an anti-active-probing system controlled by tor.

  3. Tor might want to temporarily stop a transport proxy from
     obfuscating traffic.

  To satisfy the above use cases, there must be real-time
  communication between the tor process and the pluggable transport
  proxy. To achieve this, this proposal refactors the Extended ORPort
  protocol specified in Proposal 180, and introduces a new port,
  TransportControlPort, whose sole role is the exchange of control
  information between transport proxies and tor.

  Specifically, transport proxies deliver each connection to the
  "Extended ORPort", where they provide metadata and agree on an
  identifier for each tunneled connection.  Once this handshake
  occurs, the OR protocol proceeds unchanged.

  Additionally, each transport maintains a single connection to Tor's
  "TransportControlPort", where it receives instructions from Tor
  about rate-limiting on individual connections.

3. The new port protocols

3.1. The new extended ORPort protocol

3.1.1. Protocol

  The extended server port protocol is as follows:

     COMMAND [2 bytes, big-endian]
     BODYLEN [2 bytes, big-endian]
     BODY [BODYLEN bytes]

     Commands sent from the transport proxy to the bridge are:

     [0x0000] DONE: There is no more information to give. The next
       bytes sent by the transport will be those tunneled over it.
       (body ignored)

     [0x0001] USERADDR: an address:port string that represents the
       client's address.

     [0x0002] TRANSPORT: a string of the name of the pluggable
       transport currently in effect on the connection.

     Replies sent from tor to the proxy are:

     [0x1000] OKAY: Send the user's traffic. (body ignored)

     [0x1001] DENY: Tor would prefer not to get more traffic from
       this address for a while. (body ignored)

     [0x1002] CONTROL: a NUL-terminated "identifier" string. The
       pluggable transport proxy must use the "identifier" to access
       the TransportControlPort. See the 'Association and identifier
       creation' section below.

  Parties MUST ignore command codes that they do not understand.

  If the server receives a recognized command that does not parse, it
  MUST close the connection to the client.
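
  As an illustration only, the framing above could be handled along
  the following lines. This is a minimal Python sketch, not part of
  the protocol: the function names pack_message and unpack_message and
  the EXT_OR_CMD_* constant names are hypothetical; only the numeric
  codes and field sizes come from the text above.

```python
import struct

# COMMAND and BODYLEN are 2 bytes each, big-endian.
HEADER = struct.Struct(">HH")

# Command codes listed in section 3.1.1 (constant names are illustrative).
EXT_OR_CMD_DONE = 0x0000
EXT_OR_CMD_USERADDR = 0x0001
EXT_OR_CMD_TRANSPORT = 0x0002
EXT_OR_CMD_OKAY = 0x1000
EXT_OR_CMD_DENY = 0x1001
EXT_OR_CMD_CONTROL = 0x1002


def pack_message(command: int, body: bytes = b"") -> bytes:
    """Serialize one message as COMMAND | BODYLEN | BODY."""
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a 2-byte BODYLEN")
    return HEADER.pack(command, len(body)) + body


def unpack_message(buf: bytes):
    """Return (command, body, rest), or None if buf holds no full message yet."""
    if len(buf) < HEADER.size:
        return None
    command, bodylen = HEADER.unpack_from(buf)
    end = HEADER.size + bodylen
    if len(buf) < end:
        return None
    return command, buf[HEADER.size:end], buf[end:]
```

  A caller would dispatch on the returned command and simply skip any
  code it does not recognize, since BODYLEN tells it how many bytes to
  discard.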

3.1.2. Command descriptions

3.1.2.1. USERADDR

  An ASCII string holding the TCP/IP address of the client of the
  pluggable transport proxy. A Tor bridge SHOULD use that address to
  collect statistics about its clients.  Recognized formats are:
    1.2.3.4:5678
    [1:2::3:4]:5678

  (Current Tor versions may accept other formats, but this is a bug:
  transports MUST NOT send them.)

  The string MUST NOT be NUL-terminated.
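
  A receiver might check a USERADDR body roughly as follows. This is a
  sketch under stated assumptions: parse_useraddr is a hypothetical
  helper, and rejecting ports above 65535 is our own choice rather
  than something this proposal specifies.

```python
import ipaddress


def parse_useraddr(body: bytes):
    """Parse a USERADDR body into (address, port); raise ValueError if malformed."""
    if b"\x00" in body:
        raise ValueError("USERADDR must not contain NUL")
    text = body.decode("ascii")  # UnicodeDecodeError is a ValueError subclass
    host, sep, port_text = text.rpartition(":")
    if not sep or not port_text.isdigit():
        raise ValueError("expected address:port")
    port = int(port_text)
    if port > 65535:  # assumption: the spec does not state a port range
        raise ValueError("port out of range")
    if host.startswith("[") and host.endswith("]"):
        # Bracketed form, e.g. [1:2::3:4]:5678
        addr = ipaddress.IPv6Address(host[1:-1])
    else:
        # Dotted-quad form, e.g. 1.2.3.4:5678
        addr = ipaddress.IPv4Address(host)
    return addr, port
```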

3.1.2.2. TRANSPORT

  An ASCII string holding the name of the pluggable transport used by
  the client of the pluggable transport proxy. A Tor bridge that
  supports multiple transports SHOULD use that information to collect
  statistics about the popularity of individual pluggable transports.

  The string MUST NOT be NUL-terminated.

  Pluggable transport names are C-identifiers and Tor MUST check them
  for correctness.
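
  The C-identifier check could be as small as the following sketch
  (is_valid_transport_name is a hypothetical name; the pattern is the
  usual ASCII definition of a C identifier):

```python
import re

# A C identifier: a letter or underscore, then letters, digits, or underscores.
_C_IDENTIFIER = re.compile(rb"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_transport_name(body: bytes) -> bool:
    """True if the TRANSPORT body is a non-empty ASCII C identifier."""
    # fullmatch also rejects a trailing NUL or newline.
    return _C_IDENTIFIER.fullmatch(body) is not None
```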

3.2. The new TransportControlPort protocol

  The TransportControlPort protocol is as follows:

     CONNECTIONID [16 bytes, big-endian]
     COMMAND [2 bytes, big-endian]
     BODYLEN [2 bytes, big-endian]
     BODY [BODYLEN bytes]

     Commands sent from the transport proxy to the bridge:

     [0x0001] RATE_LIMITED: Message confirming that the rate limiting
       request of the bridge was carried out successfully (body
       ignored). See the 'Rate Limiting' section below.

     [0x0002] NOT_RATE_LIMITED: Message notifying that the transport
       proxy failed to carry out the rate limiting request of the
       bridge (body ignored). See the 'Rate Limiting' section below.

     Configuration commands sent from the bridge to the transport
     proxy are:

     [0x1001] NOT_ALLOWED: Message notifying that the CONNECTIONID
       could not be matched with an authorized connection ID. The
       bridge SHOULD shut down the connection.

     [0x1001] RATE_LIMIT: Carries information on how the pluggable
       transport proxy should rate-limit its traffic. See the 'Rate
       Limiting' section below.

  CONNECTIONID should carry the connection identifier described in the
  'Association and identifier creation' section.

  Parties should ignore command codes that they do not understand.
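
  To make the message layout and the "ignore unknown commands" rule
  concrete, here is a hedged Python sketch of how a bridge might read
  messages arriving from a transport proxy. The names
  pack_control_message and recognized_messages are hypothetical, and
  only the two proxy-to-bridge command codes are used.

```python
import struct

# CONNECTIONID (16 bytes), COMMAND (2 bytes), BODYLEN (2 bytes), big-endian.
TCP_HEADER = struct.Struct(">16sHH")

CMD_RATE_LIMITED = 0x0001
CMD_NOT_RATE_LIMITED = 0x0002
KNOWN_FROM_PROXY = frozenset({CMD_RATE_LIMITED, CMD_NOT_RATE_LIMITED})


def pack_control_message(connection_id: bytes, command: int, body: bytes = b"") -> bytes:
    """Serialize CONNECTIONID | COMMAND | BODYLEN | BODY."""
    if len(connection_id) != 16:
        raise ValueError("CONNECTIONID must be exactly 16 bytes")
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a 2-byte BODYLEN")
    return TCP_HEADER.pack(connection_id, command, len(body)) + body


def recognized_messages(buf: bytes, known=KNOWN_FROM_PROXY):
    """Yield (connection_id, command, body) for each complete, recognized message."""
    offset = 0
    while len(buf) - offset >= TCP_HEADER.size:
        connection_id, command, bodylen = TCP_HEADER.unpack_from(buf, offset)
        end = offset + TCP_HEADER.size + bodylen
        if end > len(buf):
            break  # incomplete message; wait for more data
        if command in known:
            yield connection_id, command, buf[offset + TCP_HEADER.size:end]
        # Unknown command codes are skipped using BODYLEN.
        offset = end
```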

3.3. Association and identifier creation

  For Tor and a transport proxy to communicate using the
  TransportControlPort, an identifier must have already been negotiated
  using the 'CONTROL' command of Extended ORPort.

  The TransportControlPort identifier should not be predictable by a
  user who hasn't received a 'CONTROL' command from the Extended
  ORPort. For this reason, the TransportControlPort identifier should
  not be cryptographically-weak or deterministically created.

  Tor MUST create its identifiers by generating 16 bytes of random
  data.
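
  A sketch of identifier creation and lookup, assuming Python's
  secrets module as the source of random bytes. The IdentifierTable
  class and its method names are hypothetical and only illustrate the
  bookkeeping tor would need.

```python
import secrets

CONNECTION_ID_LEN = 16  # bytes, as required above


class IdentifierTable:
    """Tracks the identifiers handed out in CONTROL replies."""

    def __init__(self):
        self._issued = set()

    def issue(self) -> bytes:
        # secrets draws from the operating system's CSPRNG, so the
        # identifier is neither deterministic nor guessable.
        connection_id = secrets.token_bytes(CONNECTION_ID_LEN)
        self._issued.add(connection_id)
        return connection_id

    def is_authorized(self, candidate: bytes) -> bool:
        return candidate in self._issued

    def revoke(self, connection_id: bytes) -> None:
        self._issued.discard(connection_id)
```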

4. Configuration commands

4.1. Rate Limiting

  A Tor relay should be able to inform a transport proxy in real-time
  about its rate-limiting needs.

  This can be achieved by using the TransportControlPort and sending a
  'RATE_LIMIT' command to the transport proxy.

  The body of the 'RATE_LIMIT' command should contain two integers,
  4 bytes each, in big-endian format. The two numbers should represent
  the bandwidth rate and bandwidth burst respectively in 'bytes per
  second' which the transport proxy must set as its overall
  rate-limiting setting.

  When the transport proxy sets the appropriate rate limiting, it
  should send back a 'RATE_LIMITED' command. If it fails while setting
  up rate limiting, it should send back a 'NOT_RATE_LIMITED' command.

  After sending a 'RATE_LIMIT' command, the tor bridge MAY want to
  stop pushing data to the transport proxy until it receives a
  'RATE_LIMITED' command. If, instead, it receives a 'NOT_RATE_LIMITED'
  command, it MAY want to shut down its connections to the transport
  proxy.
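
  The 'RATE_LIMIT' body described above could be encoded as in this
  sketch. The helper names are hypothetical, and treating both
  integers as unsigned is an assumption: the text only says "4 bytes
  each, in big-endian format".

```python
import struct

# Two 4-byte big-endian integers: bandwidth rate, then bandwidth burst.
# Assumption: both are unsigned.
RATE_LIMIT_BODY = struct.Struct(">II")


def pack_rate_limit_body(rate: int, burst: int) -> bytes:
    """Build the body of a RATE_LIMIT command (values in bytes per second)."""
    return RATE_LIMIT_BODY.pack(rate, burst)


def unpack_rate_limit_body(body: bytes):
    """Return (rate, burst); raise ValueError if the body has the wrong size."""
    if len(body) != RATE_LIMIT_BODY.size:
        raise ValueError("RATE_LIMIT body must be exactly 8 bytes")
    return RATE_LIMIT_BODY.unpack(body)
```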

5. Authentication

  To defend against cross-protocol attacks on the Extended ORPort,
  proposal 213 defines an authentication scheme that should be used to
  protect it.

  If the Extended ORPort is enabled, Tor should regenerate the cookie
  file of proposal 213 on startup and store it in
  $DataDirectory/extended_orport_auth_cookie.

  The location of the cookie can be overridden by using the
  configuration file parameter ExtORPortCookieAuthFile, which is
  defined as:

    ExtORPortCookieAuthFile <path>

  where <path> is a filesystem path.

  XXX should we also add an ExtORPortCookieFileGroupReadable torrc option?

6. Security Considerations

  Extended ORPort and TransportControlPort do _not_ provide link
  confidentiality, authentication or integrity. Sensitive data, like
  cryptographic material, should not be transferred through them.

  An attacker with superuser access is able to sniff network traffic,
  and capture TransportControlPort identifiers and any data passed
  through those ports.

  Tor SHOULD issue a warning if the bridge operator tries to bind
  Extended ORPort or TransportControlPort to a non-localhost address.

  Pluggable transport proxies SHOULD issue a warning if they are
  instructed to connect to a non-localhost Extended ORPort or
  TransportControlPort.

7. Future

  In the future, we might have pluggable transports which require the
  _client_ transport proxy to use the TransportControlPort and exchange
  control information with the Tor client. The current proposal doesn't
  yet support this, but we should not add functionality that will
  prevent it from being possible.

Filename: 197-postmessage-ipc.txt
Title: Message-based Inter-Controller IPC Channel
Author: Mike Perry
Created: 16-03-2012
Status: REJECTED


Overview

  This proposal seeks to create a means for inter-controller
  communication using the Tor Control Port.

Motivation

  With the advent of pluggable transports, bridge discovery mechanisms,
  and tighter browser-Vidalia integration, we're going to have an
  increasing number of collaborating Tor controller programs
  communicating with each other. Rather than define new pairwise IPC
  mechanisms for each case, we will instead create a generalized
  message-passing mechanism through the Tor Control Port.

Control Protocol Specification Changes

  CONTROLLERNAME command

    Sent from the client to the server. The syntax is:

      "CONTROLLERNAME" SP ControllerID
        ControllerID = 1*(ALNUM / "_")

    Server returns "250 OK" and records the ControllerID to use for
    this control port connection for messaging information if successful,
    or "553 Controller name already in use" if the name is in use by
    another controller, or if an attempt is made to register the special
    names "all" or "unset".

    [CONTROLLERNAME need not be issued to send POSTMESSAGE commands,
     and CONTROLLERNAME may be unsupported by initial POSTMESSAGE
     implementations in Tor.]
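
    For illustration, the server-side bookkeeping for CONTROLLERNAME
    might look like the sketch below. ControllerRegistry is a
    hypothetical class; reading ALNUM as ASCII letters and digits,
    raising ValueError on a malformed name, and letting a connection
    re-register its own name are all assumptions the text above does
    not settle.

```python
import re

# ControllerID = 1*(ALNUM / "_"); ALNUM taken to mean ASCII letters and digits.
_CONTROLLER_ID = re.compile(r"[A-Za-z0-9_]+")
_RESERVED = frozenset({"all", "unset"})


class ControllerRegistry:
    def __init__(self):
        self._names = {}  # control connection handle -> ControllerID

    def register(self, conn, name: str) -> str:
        """Return the reply line for `CONTROLLERNAME name` received on conn."""
        if not _CONTROLLER_ID.fullmatch(name):
            raise ValueError("malformed ControllerID")
        taken = any(n == name and c != conn for c, n in self._names.items())
        if name in _RESERVED or taken:
            return "553 Controller name already in use"
        self._names[conn] = name
        return "250 OK"

    def name_of(self, conn) -> str:
        """ControllerID for conn, or "unset" if it never registered one."""
        return self._names.get(conn, "unset")

    def disconnect(self, conn) -> None:
        self._names.pop(conn, None)
```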

  POSTMESSAGE command

    Sent from the client to the server. The syntax is:

      "POSTMESSAGE" SP "@" DestControllerID SP LineItem CRLF
         DestControllerID = "all" / 1*(ALNUM / "_")

    If DestControllerID is "all", the message will be posted to all
    controllers that have "SETEVENTS POSTMESSAGE" set. Otherwise, the
    message should be posted to the controller with the appropriate
    ControllerID.

    Server returns "250 OK" if successful, or "552 Invalid destination
    controller name" if the name is not registered.

    [Initial implementations may require DestControllerID always be
     "all"]

  POSTMESSAGE event

      "650" SP "POSTMESSAGE" SP MessageID SP SourceControllerID SP
                        "@" DestControllerID SP LineItem CRLF
         MessageID = 1*DIGIT
         SourceControllerID = "unset" / 1*(ALNUM / "_")
         DestControllerID = "all" / 1*(ALNUM / "_")

      MessageID is an incrementing integer identifier that uniquely
      identifies this message to all controllers.

      The SourceControllerID is the value from the sending
      controller's CONTROLLERNAME command, or "unset" if the
      CONTROLLERNAME command was not used or is unimplemented.
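
    A receiving controller could split the event into its fields as
    sketched here. The PostMessageEvent type and the
    parse_postmessage_event function are hypothetical, and ALNUM is
    again assumed to mean ASCII letters and digits.

```python
import re
from typing import NamedTuple, Optional


class PostMessageEvent(NamedTuple):
    message_id: int
    source: str   # "unset" or a registered ControllerID
    dest: str     # "all" or a ControllerID
    line_item: str


# "650" SP "POSTMESSAGE" SP MessageID SP SourceControllerID SP
#       "@" DestControllerID SP LineItem CRLF
_EVENT = re.compile(
    r"650 POSTMESSAGE ([0-9]+) ([A-Za-z0-9_]+) @([A-Za-z0-9_]+) ([^\r\n]*)\r\n"
)


def parse_postmessage_event(line: str) -> Optional[PostMessageEvent]:
    """Parse one event line; return None if it is not a POSTMESSAGE event."""
    m = _EVENT.fullmatch(line)
    if m is None:
        return None
    return PostMessageEvent(int(m.group(1)), m.group(2), m.group(3), m.group(4))
```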

  GETINFO commands
    "recent-messages" -- Retrieves messages
      sent to ControllerIDs that match the current controller
      in POSTMESSAGE event format. This list should be generated
      on the fly, to handle disconnecting controllers.

    "new-messages" -- Retrieves the last 10 "unread" messages
      sent to this controller, in POSTMESSAGE event format. If
      SETEVENTS POSTMESSAGE was set, this command should always return
      nothing.

    "list-controllers" -- Retrieves a list of all connected controllers
      with either their registered ControllerID or "unset".

Implementation plan

  The POSTMESSAGE protocol is designed to be incrementally deployable.
  Initial implementations are only expected to implement broadcast
  capabilities and SETEVENTS-based delivery. Neither CONTROLLERNAME nor
  non-"@all" POSTMESSAGE destinations need be supported.

  To support command-based controllers (which do not use SETEVENTS) such
  as Torbutton, at minimum the "GETINFO recent-messages" command is
  needed.  However, Torbutton does not have immediate need for this
  protocol.

Filename: 198-restore-clienthello-semantics.txt
Title: Restore semantics of TLS ClientHello
Author: Nick Mathewson
Created: 19-Mar-2012
Status: Closed
Target: 0.2.4.x

Status:

   Tor 0.2.3.17-beta implements the client-side changes, and no longer
   advertises openssl-supported TLS ciphersuites we don't have.

Overview:

   Currently, all supported Tor versions try to imitate an older version
   of Firefox when advertising ciphers in their TLS ClientHello.  This
   feature is intended to make it harder for a censor to distinguish a
   Tor client from other TLS traffic.  Unfortunately, it makes the
   contents of the ClientHello unreliable: a server cannot conclude that
   a cipher is really supported by a Tor client simply because it is
   advertised in the ClientHello.

   This proposal suggests an approach for restoring sanity to our use of
   ClientHello, so that we still avoid ciphersuite-based fingerprinting,
   but allow nodes to negotiate better ciphersuites than they are
   allowed to negotiate today.

Background reading:

   Section 2 of tor-spec.txt describes our current baroque link
   negotiation scheme.  Proposals 176 and 184 describe more information
   about how it got that way.

   Bug 4744 is a big part of the motivation for this proposal: we want
   to allow Tors to advertise even more ciphers, some of which we would
   actually prefer to the ones we are using now.

   What you need to know about the TLS handshake is that the client
   sends a list of all the ciphersuites that it supports in its
   ClientHello message, and then the server chooses one and tells the
   client which one it picked.

Motivation and constraints:

   We'd like to use some of the ECDHE TLS ciphersuites, since they allow
   us to get better forward-secrecy at lower cost than our current
   DH-1024 usage.  But right now, we can't ever use them, since Tor will
   advertise them whether or not it has a version of OpenSSL that
   supports them.

   (OpenSSL before 1.0.0 did not support ECDHE ciphersuites; OpenSSL
   before 1.0.0e or so had some security issues with them.)

   We cannot have the rule be "Tors must only advertise ciphersuites
   that they can use", since current Tors will advertise such
   ciphersuites anyway.

   We cannot have the rule be "Tors must support every ECDHE ciphersuite
   on the following list", since current Tors don't do all that, and
   since one prominent Linux distribution builds OpenSSL without ECC
   support because of patent/freedom fears.

   Fortunately, nearly every ciphersuite that we would like to advertise
   to imitate FF8 (see bug 4744) is currently supported by OpenSSL 1.0.0
   and later.  This enables the following proposal to work:

Proposed spec changes:

   I propose that the rules for handling ciphersuites at the server side
   become the following:

   If the ciphersuites in the ClientHello contain no ciphers other than
   the following[*], they indicate that the Tor v1 link protocol is in use.

     TLS_DHE_RSA_WITH_AES_256_CBC_SHA
     TLS_DHE_RSA_WITH_AES_128_CBC_SHA
     SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA

   If the advertised ciphersuites in the ClientHello are _exactly_[*]
   the following, they indicate that the Tor v2+ link protocol is in
   use, AND that the ClientHello may have unsupported ciphers.  In this
   case, the server may choose DHE_RSA_WITH_AES_128_CBC_SHA or
   DHE_RSA_WITH_AES_256_SHA, but may not choose any other cipher.

     TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
     TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA
     TLS1_DHE_RSA_WITH_AES_256_SHA
     TLS1_DHE_DSS_WITH_AES_256_SHA
     TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA
     TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA
     TLS1_RSA_WITH_AES_256_SHA
     TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA
     TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
     TLS1_ECDHE_RSA_WITH_RC4_128_SHA
     TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA
     TLS1_DHE_RSA_WITH_AES_128_SHA
     TLS1_DHE_DSS_WITH_AES_128_SHA
     TLS1_ECDH_RSA_WITH_RC4_128_SHA
     TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA
     TLS1_ECDH_ECDSA_WITH_RC4_128_SHA
     TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA
     SSL3_RSA_RC4_128_MD5
     SSL3_RSA_RC4_128_SHA
     TLS1_RSA_WITH_AES_128_SHA
     TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA
     TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA
     SSL3_EDH_RSA_DES_192_CBC3_SHA
     SSL3_EDH_DSS_DES_192_CBC3_SHA
     TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA
     TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA
     SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA
     SSL3_RSA_DES_192_CBC3_SHA

  [*] The "extended renegotiation is supported" ciphersuite, 0x00ff, is
      not counted when checking the list of ciphersuites.

  Otherwise, the ClientHello has these semantics: The inclusion of any
  cipher supported by OpenSSL 1.0.0 means that the client supports it,
  with the exception of
      SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA
  which is never supported. Clients MUST advertise support for at least one of
  TLS_DHE_RSA_WITH_AES_256_CBC_SHA or TLS_DHE_RSA_WITH_AES_128_CBC_SHA.

  The server MUST choose a ciphersuite with ephemeral keys for forward
  secrecy; MUST NOT choose a weak or null ciphersuite; and SHOULD NOT
  choose any cipher other than AES or 3DES.
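
  The three-way decision above can be summarized in code. The
  following Python sketch is illustrative only: classify_client_hello
  and its return labels are hypothetical, the 28-entry list is passed
  in by the caller as code points rather than reproduced, and reading
  "exactly" as "same ciphersuites in the same order" is an assumption.

```python
SCSV_RENEGOTIATION = 0x00FF  # "extended renegotiation is supported"; never counted

# IANA code points for the three v1 ciphersuites named above.
V1_CIPHERS = frozenset({
    0x0039,  # TLS_DHE_RSA_WITH_AES_256_CBC_SHA
    0x0033,  # TLS_DHE_RSA_WITH_AES_128_CBC_SHA
    0x0016,  # SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
})


def classify_client_hello(ciphers, v2_legacy_list):
    """Return 'v1', 'v2-unreliable', or 'reliable' for a ClientHello cipher list.

    v2_legacy_list is the fixed list given above, supplied by the caller.
    """
    counted = [c for c in ciphers if c != SCSV_RENEGOTIATION]
    if all(c in V1_CIPHERS for c in counted):
        return "v1"
    if counted == list(v2_legacy_list):
        # The list may name ciphers the client cannot actually use.
        return "v2-unreliable"
    # Any other list is taken at face value.
    return "reliable"
```

  In the 'v2-unreliable' case the server would then restrict itself to
  the two DHE_RSA AES ciphersuites named above; in the 'reliable' case
  it is free to pick any advertised suite that meets the MUST and
  SHOULD rules.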

Discussion and consequences:


  Currently, OpenSSL 1.0.0 (in its default configuration) supports every
  cipher that we would need in order to give the same list as Firefox
  versions 8 through 11 give in their default configuration, with the
  exception of the FIPS ciphersuite above.  Therefore, we will be able
  to fake the new ciphersuite list correctly in all of our bundles that
  include OpenSSL, and on every version of Unix that keeps up-to-date.

  However, versions of Tor compiled to use older versions of OpenSSL, or
  versions of OpenSSL with some ciphersuites disabled, will no
  longer give the same ciphersuite lists as other versions of Tor.  On
  these platforms, Tor clients will no longer impersonate Firefox.
  Users who need to do so will have to download one of our bundles, or
  use a non-system OpenSSL.


  The proposed spec change above tries to future-proof ourselves by not
  declaring that we support every declared cipher, in case we someday
  need to handle a new Firefox version.  If a new Firefox version
  comes out that uses ciphers not supported by OpenSSL 1.0.0, we will
  need to define whether clients may advertise its ciphers without
  supporting them; but existing servers will continue working whether
  we decide yes or no.


  The restriction to "servers SHOULD only pick AES or 3DES" is meant to
  reflect our current behavior, not to represent a permanent refusal to
  support other ciphers.  We can revisit it later as appropriate, if for
  some bizarre reason Camellia or Seed or Aria becomes a better bet than
  AES.

Open questions:

  Should the client drop connections if the server chooses a bad
  cipher, or a suite without forward secrecy?

  Can we get OpenSSL to support the dubious FIPS suite excluded above,
  in order to remove a distinguishing opportunity?  It is not so simple
  as just editing the SSL_CIPHER list in s3_lib.c, since the nonstandard
  SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA cipher is (IIUC) defined to use the
  TLS1 KDF, while declaring itself to be an SSL cipher (!).

  Can we do anything to eventually allow the IE7+[**] cipher list as
  well?  IE does not support TLS_DHE_RSA_WITH_AES_{256,128}_SHA or
  SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, and so wouldn't work with current
  Tor servers, which _only_ support those.  It looks like the only
  forward-secure ciphersuites that IE7+ *does* support are ECDHE ones,
  and DHE+DSS ones.  So if we want this flexibility, we could mandate
  server-side ECDHE, or somehow get DHE+DSS support (which would play
  havoc with our current certificate generation code IIUC), or say that
  it is sometimes acceptable to have a non-forward-secure link
  protocol[***].  None of these answers seems like a great one.  Is one
  best?  Are there other options?

  [**] Actually, I think it's the Windows SChannel cipher list we
  should be looking at here.
  [***] If we did _that_, we'd want to specify that CREATE_FAST could
  never be used on a non-forward-secure link.  Even so, I don't like the
  implications of leaking cell types and circuit IDs to a future
  compromise.

Filename: 199-bridgefinder-integration.txt
Title: Integration of BridgeFinder and BridgeFinderHelper
Author: Mike Perry
Created: 18-03-2012
Status: OBSOLETE


Overview

  This proposal describes how the Tor client software can interact with
  an external program that performs bridge discovery based on user input
  or information extracted from a web page, QR Code, online game, or
  other transmission medium.


Scope and Audience

  This document describes how all of the components involved in bridge
  discovery communicate this information to the rest of the Tor
  software. The mechanisms of bridge discovery are not discussed, though
  the design aims to be generalized enough to allow arbitrary new
  discovery mechanisms to be added at any time.
  
  This document is also written with the hope that those who wish to
  implement BridgeFinder components and BridgeFinderHelpers can get
  started immediately after a read of this proposal, so that development
  of bridge discovery mechanisms can proceed in parallel to supporting
  functionality improvements in the Tor client software.


Components and Responsibilities

 0. Tor Client
 
    The Tor Client is the piece of software that connects to the Tor
    network (optionally using bridges) and provides a SOCKS proxy for
    use by the user.
 
    In initial implementations, the Tor Client will support only
    standard bridges. In later implementations, it is expected to
    support pluggable transports as defined by Proposal 180.

 1. Tor Control Port
 
    The Tor Control Port provides commands to perform operations,
    configuration, and to obtain status information. It also optionally
    provides event driven status updates.
    
    In initial implementations, it will be used directly by BridgeFinder
    to configure bridge information via GETINFO and SETCONF. It is covered
    by control-spec.txt in the tor-specs git repository.

    In later implementations, it will support the inter-controller
    POSTMESSAGE IPC protocol as defined by Proposal 197 for use
    in conveying bridge information to the Primary Controller.
 
 2. Primary Controller
 
    The Primary Controller is the program that launches and configures the
    Tor client, and monitors its status.
    
    On desktop platforms, this program is Vidalia, and it also launches
    the Tor Browser. On Android, this program is Orbot. Orbot does not
    launch a browser.
    
    On all platforms, this proposal requires that the Primary Controller
    will launch one or more BridgeFinder child processes and provide
    them with authentication information through the environment variables
    TOR_CONTROL_PORT and TOR_CONTROL_PASSWD.

    In later implementations, the Primary Controller will be expected
    to receive Bridge configuration information via the free-form
    POSTMESSAGE protocol from Proposal 197, validate that information,
    and hold that information for user approval.
 
 3. BridgeFinder
 
    A BridgeFinder is a program that discovers bridges and configures
    Tor to use them.
    
    In initial implementations, it is likely to be very dumb, and its main
    purpose will be to serve as a layer of abstraction that should free
    the Primary Controller from having to directly implement numerous ways
    of retrieving bridges for various pluggable transports.
    
    In later implementations, it may perform arbitrary network operations
    to discover, authenticate to, and/or verify bridges, possibly using
    informational hints provided by one or more external
    BridgeFinderHelpers (see next component). It could even go so far as
    to download new pluggable transport plugins and/or transform
    definition files from arbitrary URLs.
    
    It will be launched by the Primary Controller and given access to the
    Tor Control Port via the environment variables TOR_CONTROL_PORT and
    TOR_CONTROL_PASSWD.
    
    Initial control port interactions can be command driven via GETINFO
    and SETCONF, and do not need to subscribe to or process control port
    events. Later implementations will use POSTMESSAGE as defined in
    Proposal 197 to pass command requests to Vidalia, which will parse
    them and ask for user confirmation before deploying them. Use of
    POSTMESSAGE may or may not require event driven operation, depending
    on POSTMESSAGE implementation status (POSTMESSAGE is designed to
    support both command and event driven operation, but it is possible 
    event driven operation will happen first).
 
 4. BridgeFinderHelper
 
    Each BridgeFinder implementation can optionally communicate with one
    or more BridgeFinderHelpers. BridgeFinderHelpers are plugins to
    external 3rd party applications that can inspect traffic, handle MIME
    types, or implement protocol handlers for accepting bridge discovery
    information to pass to BridgeFinder. Example 3rd party applications
    include Chrome, World of Warcraft, QR Code readers, or simple cut
    and paste.
    
    Due to the arbitrary nature of sandboxing that may be present in
    various BridgeFinderHelper host applications, we do not mandate the
    exact nature of the IPC between BridgeFinder instances and external
    BridgeFinderHelper addons. However, please see the "Security Concerns"
    section for common pitfalls to avoid. 
 
 5. Tor Browser
 
    This is the browser the user uses with Tor. It is not useful until Tor
    is properly configured to use bridges. It fails closed.
    
    It is not expected to run BridgeFinderHelper plugin instances, unless
    those plugin instances exist to ensure the user always has a pool of
    working bridges available after successfully configuring an
    initial bridge. Once all bridges fail, the Tor Browser is useless.
 
 6. Non-Tor Browser (aka BridgeFinderHelper host)
 
    This is the program the user uses for normal Internet activity to
    obtain bridges via a BridgeFinderHelper plugin. It does not have to be
    a browser: in advanced scenarios, it may be a program such as World
    of Warcraft instead.


Incremental Deployability

  The system is designed to be incrementally deployable: Simple designs
  should be possible to develop and test immediately. The design is
  flexible enough to be easily upgraded as more advanced features become
  available from both Tor and new pluggable transports.

Initial Implementation

  In the simplest possible initial implementation, BridgeFinder will
  only discover Tor Bridges as they are deployed today. It will use the
  Tor Control Port to configure these bridges directly via the SETCONF
  command. It may or may not receive bridge information from a
  BridgeFinderHelper. In an even more degenerate case,
  BridgeFinderHelper may even be Vidalia or Orbot itself, acting upon
  user input from cut and paste.

 Initial Implementation: BridgeFinder Launch
 
   In the initial implementation, the Primary Controller will launch one
   or more BridgeFinders, providing control port authentication
   information to them through the environment variables TOR_CONTROL_PORT
   and TOR_CONTROL_PASSWD.
   
   BridgeFinder will then directly connect to the control port and
   authenticate. Initial implementations should be able to function
   without using SETEVENTS, and instead only using command-based
   status inquiries and configuration (GETINFO and SETCONF).
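
   As a rough sketch of the BridgeFinder side of this launch sequence,
   the following Python reads the two environment variables and builds
   the control-port AUTHENTICATE line. The helper names are
   hypothetical, and treating TOR_CONTROL_PORT as a bare port number
   is an assumption, since this proposal does not define its format.

```python
import os


def control_port_settings(environ=os.environ):
    """Read the control port and password handed over by the Primary Controller."""
    # Assumption: TOR_CONTROL_PORT holds a bare TCP port number.
    port = int(environ["TOR_CONTROL_PORT"])
    password = environ["TOR_CONTROL_PASSWD"]
    return port, password


def authenticate_command(password: str) -> str:
    """Build an AUTHENTICATE command carrying the password as a quoted string."""
    # Backslash-escape backslashes first, then double quotes.
    escaped = password.replace("\\", "\\\\").replace('"', '\\"')
    return f'AUTHENTICATE "{escaped}"\r\n'
```

   BridgeFinder would send the resulting line over a TCP connection to
   the control port and wait for a "250 OK" reply before issuing
   GETINFO or SETCONF.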
 
 Initial Implementation: Obtaining Bridge Hint Information
 
   In the initial implementation, to test functionality,
   BridgeFinderHelper can simply scrape bridges directly from
   https://bridges.torproject.org.
   
   In slightly more advanced implementations, a BridgeFinderHelper
   instance may be written for use in the user's Non-Tor Browser. This
   plugin could extract bridges from images, html comments, and other
   material present in ad banners and slack space on unrelated pages.
 
   BridgeFinderHelper would then communicate with the appropriate
   BridgeFinder instance over an acceptable IPC mechanism. This proposal
   does not seek to specify the nature of that IPC channel (because
   BridgeFinderHelper may be arbitrarily constrained due to host
   application sandboxing), but we do make several security
   recommendations under the section "Security Concerns: BridgeFinder and
   BridgeFinderHelper".
 
 Initial Implementation: Configuring New Bridges
 
   In the initial implementation, Bridge configuration will be done
   directly through the control port using the SETCONF command.
   
   Initial implementations will support only retrieval and configuration
   of standard Tor Bridges. These are configured using SETCONF on the Tor
   Control Port as follows:
     SETCONF Bridge="IP:ORPort [fingerprint]"
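
   A BridgeFinder might assemble that command as in the sketch below.
   The setconf_bridge helper is hypothetical; bracketing IPv6
   addresses and requiring a 40-hex-digit fingerprint are assumptions
   layered on the "IP:ORPort [fingerprint]" form shown above.

```python
import ipaddress
import re

_FINGERPRINT = re.compile(r"[0-9A-Fa-f]{40}")


def setconf_bridge(ip: str, or_port: int, fingerprint=None) -> str:
    """Build a SETCONF Bridge command line for a standard bridge."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if not 0 < or_port <= 65535:
        raise ValueError("ORPort out of range")
    # Assumption: IPv6 addresses are written in brackets.
    host = f"[{address}]" if address.version == 6 else str(address)
    value = f"{host}:{or_port}"
    if fingerprint is not None:
        if not _FINGERPRINT.fullmatch(fingerprint):
            raise ValueError("fingerprint must be 40 hex digits")
        value += f" {fingerprint}"
    return f'SETCONF Bridge="{value}"\r\n'
```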


Future Implementations

  In future implementations, the system can incrementally evolve in a
  few different directions. As new pluggable transports are created, it
  is conceivable that BridgeFinder may want to download new plugin
  binaries (and/or new transport transform definition files) and
  provide them to Tor.
  
  Furthermore, it may prove simpler to deploy multiple concurrent
  BridgeFinder+BridgeFinderHelper pairs as opposed to adding new
  functionality to existing prototypes.
  
  Finally, it is desirable for BridgeFinder to obtain approval
  from the user before updating bridge configuration, especially for
  cases where BridgeFinderHelper is automatically discovering bridges
  in-band during Non-Tor activity.

  The exact mechanisms for accomplishing these improvements are
  described in the following subsections.

 Future Implementations: BridgeFinder Launch and POSTMESSAGE handshake
 
   The nature of the BridgeFinder launch and the environment variables
   provided is not expected to change. However, future Primary Controller
   implementations may decide to launch more than one BridgeFinder
   instance side by side.
 
   Additionally, to negotiate the IPC channel created by Proposal 197
   for purposes of providing user confirmation, it is recommended that
   BridgeFinder and the Primary Controller perform a handshake using
   POSTMESSAGE upon launch, to establish that all parties properly
   support the feature:
 
     Primary Controller: "POSTMESSAGE @all Controller wants POSTMESSAGE v1.1"
     BridgeFinder: "POSTMESSAGE @all BridgeFinder has POSTMESSAGE v1.0"
     Primary Controller: "POSTMESSAGE @all Controller expects POSTMESSAGE v1.0"
     BridgeFinder: "POSTMESSAGE @all BridgeFinder will POSTMESSAGE v1.0"
 
   If this 4-step handshake proceeds with an acceptable version,
   BridgeFinder must use POSTMESSAGE to transmit SETCONF Bridge lines
   (see "Future Implementations: Configuring New Bridges" below). If
   POSTMESSAGE support is expected, but the handshake does not complete
   for any reason, BridgeFinder should either exit or go dormant.
 
   The exact nature of the version negotiation and exactly how much
   backwards compatibility must be tolerated is unspecified.
   "All-or-nothing" is a safe assumption to get started.
 
 Future Implementations: Obtaining Bridge Hint Information
 
   Future BridgeFinder implementations may download additional
   information based on what is provided by BridgeFinderHelper. They
   may fetch pluggable transport plugins, transformation parameters,
   and other material.
 
 Future Implementations: Configuring New Bridges
 
   Future implementations will be concerned with providing two new pieces
   of functionality with respect to configuring bridges: configuring
   pluggable transports, and properly prompting the user before altering
   Tor configuration.
 
   There are two ways to tell Tor clients about pluggable transports
   (as defined in Proposal 180).
 
   On the control port, an external Proposal 180 transport will be
   configured with
     SETCONF ClientTransportPlugin=<method> socks5 <addr:port> [auth=X]
   as in
     SETCONF ClientTransportPlugin="trebuchet socks5 127.0.0.1:9999".
 
   A managed proxy is configured with
     SETCONF ClientTransportPlugin=<methods> exec <path> [options]
   as in
     SETCONF ClientTransportPlugin="trebuchet exec /usr/libexec/trebuchet --managed".
 
   This example tells Tor to launch an external program to provide a
   socks proxy for 'trebuchet' connections. The Tor client only
   launches one instance of each external program with a given set of
   options, even if the same executable and options are listed for
   more than one method.
 
   Pluggable transport bridges discovered for this transport by
   BridgeFinder would then be set with:
     SETCONF Bridge="trebuchet 3.2.4.1:8080 keyid=09F911029D74E35BD84156C5635688C009F909F9 rocks=20 height=5.6m".

   For more information on pluggable transports and supporting Tor
   configuration commands, see Proposal 180.
 
 Future Implementations: POSTMESSAGE and User Confirmation
 
   Because configuring even normal bridges alone can expose the user to
   attacks, it is strongly desired to provide some mechanism to allow
   the user to approve new bridges prior to their use, especially for
   situations where BridgeFinderHelper is extracting them transparently
   while the user performs unrelated activity.
 
   If BridgeFinderHelper grows to the point where it is downloading new
   transform definitions or plugins, user confirmation becomes
   absolutely required.
 
   To achieve user confirmation, we depend upon the POSTMESSAGE command
   defined in Proposal 197. 
 
   If the POSTMESSAGE handshake succeeds, instead of sending SETCONF
   commands directly to the control port, the commands will be wrapped
   inside a POSTMESSAGE:
     POSTMESSAGE @all SETCONF Bridge="www.example.com:8284"
 
   Upon receiving this POSTMESSAGE, the Primary Controller will
   validate it, evaluate it, store it to be later enabled by the
   user, and alert the user that new bridges are available for
   approval. It is only after the user has approved the new bridges
   that the Primary Controller should then re-issue the SETCONF commands
   to configure and deploy them in the tor client.
 
   Additionally, see "Security Concerns: Primary Controller" for more
   discussion on potential pitfalls with POSTMESSAGE.

Security Concerns

  While automatic bridge discovery and configuration is quite compelling
  and powerful, there are several serious security concerns that warrant
  extreme care. We've broken them down by component.
  
 Security Concerns: Primary Controller
 
   In the initial implementation, Orbot and Vidalia must take care to
   transmit the Tor Control password to BridgeFinder in such a way that
   it does not end up in system logs or the process list, and is not
   viewable by other system users. The best known strategy for doing this
   is to pass the information through exported environment variables.
   
   Additionally, in future implementations, Orbot and Vidalia will need
   to validate Proposal 197 POSTMESSAGE input before prompting the user.
   POSTMESSAGE is a free-form message-passing mechanism. All sorts of
   unexpected input may be passed through it by any other authenticated
   Tor Controllers for their own unrelated communication purposes.

   Minimal validation includes verifying that the POSTMESSAGE data is a
   valid Bridge or ClientTransportPlugin line and is acceptable input for
   SETCONF. All unexpected characters should be removed using a
   whitelist, and format and structure should be checked against a
   regular expression. Additionally, the POSTMESSAGE string should not be
   passed through any string processing engines that automatically decode
   character escape encodings, to avoid arbitrary control port execution.
   
   At the same time, POSTMESSAGE validation should be light. While fully
   untrusted input is not expected due to the need for control port
   authentication and BridgeFinder sanitization, complicated manual string
   parsing techniques during validation should be avoided. Perform simple
   easy-to-verify whitelist-based checks, and ignore unrecognized input.
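
   As a minimal sketch of such a check (not taken from any existing
   controller), the Python below applies a character whitelist and then
   one regular expression to a candidate Bridge value. The allowed
   character set, the regular expression, the length limit, and the
   function name are all assumptions made for illustration; the real
   grammar for Bridge lines is whatever Tor's configuration parser and
   Proposal 180 define. This sketch accepts only IPv4 literal addresses,
   and it rejects unexpected input outright rather than trying to
   repair it.

```python
import re

# Hypothetical whitelist: the only characters this sketch tolerates.
_ALLOWED = re.compile(r"[A-Za-z0-9 .:=_\-\[\]]+")

# Hypothetical shape: optional transport name, IPv4 address and port,
# then zero or more fingerprint or key=value arguments.
_BRIDGE = re.compile(
    r"(?:[A-Za-z_][A-Za-z0-9_]* )?"               # transport name
    r"\d{1,3}(?:\.\d{1,3}){3}:\d{1,5}"            # IPv4 address:port
    r"(?: [A-Za-z0-9_]+(?:=[A-Za-z0-9._\-]+)?)*"  # fingerprint / k=v args
)

def is_acceptable_bridge_value(value: str) -> bool:
    """Return True only if value passes both checks."""
    if len(value) > 512:                     # arbitrary illustrative limit
        return False
    if _ALLOWED.fullmatch(value) is None:    # character whitelist
        return False
    return _BRIDGE.fullmatch(value) is not None  # format and structure
```

   Because both checks use fullmatch(), embedded quotes, newlines, or
   escape sequences cause the whole value to be rejected before it gets
   anywhere near SETCONF.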
   
   Beyond POSTMESSAGE validation, the manner in which the Primary
   Controller achieves consent from the user is absolutely crucial to
   security under this scheme. A simple "OK/Cancel" dialog is
   insufficient to protect the user from the dangers of switching
   bridges and running new plugins automatically.
   
   Newly discovered bridge lines from POSTMESSAGE should be added to a
   disabled set that the user must navigate to as an independent window
   apart from any confirmation dialog. The user must then explicitly
   enable recently added plugins by checking them off individually. We
   need the user's brain to be fully engaged and aware that it is
   interacting with Tor during this step.  If they get an "OK/Cancel"
   popup that interrupts their online game play, they will almost
   certainly simply click "OK" just to get back to the game quickly.
 
   The Primary Controller should transmit the POSTMESSAGE content to the
   control port only after obtaining this out-of-band approval.

Security Concerns: BridgeFinder and BridgeFinderHelper

  The unspecified nature of the IPC channel between BridgeFinder and
  BridgeFinderHelper makes it difficult to make concrete security
  suggestions. However, from past experience, the following best
  practices must be employed to avoid security vulnerabilities:

  1. Define a non-webby handshake and/or perform authentication

     The biggest risk is that unexpected applications will be manipulated
     into posting malformed data to the BridgeFinder's IPC channel as if it
     were from BridgeFinderHelper. The best way to defend against this is
     to require a handshake to properly complete before accepting input. If
     the handshake fails at any point, the IPC channel must be abandoned
     and closed. Do not continue scanning for good input after any bad
     input has been encountered.
     
     Additionally, if possible, it is wise to establish a shared secret
     between BridgeFinder and BridgeFinderHelper through the filesystem or
     any other means available for use in authentication. For a good
     example of how to use such a shared secret properly for
     authentication, see Trac Ticket #5185 and/or the SafeCookie Tor
     Control Port authentication mechanism.

  2. Perform validation before parsing 

     Care must be taken before converting BridgeFinderHelper data into
     Bridge lines, especially for cases where the BridgeFinderHelper data
     is fed directly to the control port after passing through
     BridgeFinder.

     The input should be subjected to a character whitelist and possibly
     also validated against a regular expression to verify format, and if
     any unexpected or poorly-formed data is encountered, the IPC channel
     must be closed.

  3. Fail closed on unexpected input

     If the handshake fails, or if any other part of the BridgeFinderHelper
     input is invalid, the IPC channel must be abandoned and closed. Do
     *not* continue scanning for good input after any bad input has been
     encountered.
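
  The three practices above can be combined roughly as follows. This is
  a sketch under stated assumptions, not the SafeCookie protocol and not
  an existing BridgeFinder interface: the HMAC-SHA256 challenge-response,
  the "helper-to-finder" label, and all function names are hypothetical,
  and the caller is assumed to close the IPC channel whenever None is
  returned.

```python
import hashlib
import hmac
import os

def make_challenge() -> bytes:
    """Fresh random nonce that BridgeFinder sends to the helper."""
    return os.urandom(32)

def helper_response(secret: bytes, challenge: bytes) -> bytes:
    """What an honest BridgeFinderHelper would send back (hypothetical)."""
    return hmac.new(secret, b"helper-to-finder" + challenge,
                    hashlib.sha256).digest()

def accept_lines(secret, challenge, response, lines, is_valid):
    """Return the validated lines, or None if the channel must be closed.

    Any failure (a bad handshake or a single invalid line) discards
    everything; we never keep scanning for good input after bad input.
    """
    expected = helper_response(secret, challenge)
    if not hmac.compare_digest(expected, response):
        return None
    accepted = []
    for line in lines:
        if not is_valid(line):   # validate before any parsing
            return None
        accepted.append(line)
    return accepted
```

  Returning None for the whole batch, rather than the subset of lines
  that happened to validate, is what makes the sketch fail closed.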


Filename: 200-new-create-and-extend-cells.txt
Title: Adding new, extensible CREATE, EXTEND, and related cells
Author: Robert Ransom
Created: 2012-03-22
Status: Closed
Implemented-In: 0.2.4.8-alpha

History

  The original draft of this proposal was from 2010-12-27; nickm revised
  it slightly on 2012-03-22 and added it as proposal 200.

Overview and Motivation:

  In Tor's current circuit protocol, every field, including the 'onion
  skin', in the EXTEND relay cell has a fixed meaning and length.
  This prevents us from extending the current EXTEND cell to support
  IPv6 relays, efficient UDP-based link protocols, larger 'onion
  keys', new circuit-extension handshake protocols, or larger
  identity-key fingerprints.  We will need to support all of these
  extensions in the near future.  This proposal specifies a
  replacement EXTEND2 cell and related cells that provide more room
  for future extension.

Design:

  FIXME - allocate command ID numbers (non-RELAY commands for CREATE2 and
  CREATED2; RELAY commands for EXTEND2 and EXTENDED2)

  The CREATE2 cell contains the following payload:

        Handshake type                        [2 bytes]
        Handshake data length                 [2 bytes]
        Handshake data                        [variable]

  The payload of an EXTEND2 relay cell contains the following
  fields:

        Number of link specifiers             [1 byte]
           N times:
            Link specifier type               [1 byte]
            Link specifier length             [1 byte]
            Link specifier                    [variable]
        Handshake type                        [2 bytes]
        Handshake data length                 [2 bytes]
        Handshake data                        [variable]

  The CREATED2 cell and EXTENDED2 relay cell both contain the following
  payload:

        Handshake data length                 [2 bytes]
        Handshake data                        [variable]

  All four cell types are padded to 512-byte cells.
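
  To make the CREATE2 and CREATED2 layouts concrete, here is a minimal
  encoding sketch. The function names are hypothetical, padding out to
  the full cell size is left to the caller, and big-endian byte order is
  taken from the "network order" rule stated later in this proposal.

```python
import struct

def pack_create2(handshake_type: int, handshake_data: bytes) -> bytes:
    """Handshake type [2 bytes], data length [2 bytes], then the data."""
    return struct.pack(">HH", handshake_type, len(handshake_data)) + handshake_data

def unpack_create2(payload: bytes):
    """Parse a CREATE2 payload, ignoring any trailing padding."""
    handshake_type, length = struct.unpack_from(">HH", payload, 0)
    data = payload[4:4 + length]
    if len(data) != length:
        raise ValueError("truncated handshake data")
    return handshake_type, data

def pack_created2(handshake_data: bytes) -> bytes:
    """Data length [2 bytes], then the data (CREATED2 and EXTENDED2)."""
    return struct.pack(">H", len(handshake_data)) + handshake_data
```

  The explicit length field is what lets a receiver separate handshake
  data from the padding that follows it.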

  When a relay X receives an EXTEND2 relay cell:

  * X finds or opens a link to the relay Y using the link target
    specifiers in the EXTEND2 relay cell; if X fails to open a link, it
    replies with a TRUNCATED relay cell. (FIXME: what do we do now?)

  * X copies the handshake type and data into a CREATE2 cell and sends
    it along the link to Y.

  * If the handshake data is valid, Y replies by sending a CREATED2
    cell along the link to X; otherwise, Y replies with a TRUNCATED
    relay cell. (XXX: we currently use a DESTROY cell?)

  * X copies the contents of the CREATED2 cell into an EXTENDED2 relay
    cell and sends it along the circuit to the OP.


Link target specifiers:

  The list of link target specifiers must include at least one address and
  at least one identity fingerprint, in a format that the extending node is
  known to recognize.

  The extending node MUST NOT accept the connection unless at least one
  identity matches, and should follow the current rules for making sure that
  addresses match.

  [00] TLS-over-TCP, IPv4 address
       A four-byte IPv4 address plus two-byte ORPort
  [01] TLS-over-TCP, IPv6 address
       A sixteen-byte IPv6 address plus two-byte ORPort
  [02] Legacy identity
       A 20-byte SHA1 identity fingerprint. At most one may be listed.

  As always, values are sent in network (big-endian) order.
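
  As an illustration of how the link specifiers fit into an EXTEND2
  payload, the following sketch builds one from an IPv4 address and a
  legacy identity. The helper names are hypothetical, and the sketch
  does not check the "at least one address and one identity" rule or
  the overall cell size.

```python
import ipaddress
import struct

LS_IPV4, LS_IPV6, LS_LEGACY_ID = 0x00, 0x01, 0x02

def link_spec(ls_type: int, body: bytes) -> bytes:
    """Link specifier type [1 byte], length [1 byte], then the body."""
    return struct.pack(">BB", ls_type, len(body)) + body

def ipv4_spec(addr: str, orport: int) -> bytes:
    """Four-byte IPv4 address plus two-byte ORPort."""
    packed = ipaddress.IPv4Address(addr).packed
    return link_spec(LS_IPV4, packed + struct.pack(">H", orport))

def legacy_id_spec(fingerprint: bytes) -> bytes:
    """20-byte SHA1 identity fingerprint."""
    if len(fingerprint) != 20:
        raise ValueError("legacy identity must be 20 bytes")
    return link_spec(LS_LEGACY_ID, fingerprint)

def pack_extend2(specs, handshake_type: int, handshake_data: bytes) -> bytes:
    """Specifier count, the specifiers, then the CREATE2-style fields."""
    if not 0 < len(specs) <= 255:
        raise ValueError("need between 1 and 255 link specifiers")
    return (struct.pack(">B", len(specs)) + b"".join(specs)
            + struct.pack(">HH", handshake_type, len(handshake_data))
            + handshake_data)
```

  For example, pack_extend2([ipv4_spec("3.2.4.1", 8080),
  legacy_id_spec(fp)], 0, onionskin) would describe a legacy handshake
  with a relay reachable over IPv4, where fp and onionskin are supplied
  by the caller.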

Legacy handshake type:

  The current "onionskin" handshake type is defined to be handshake type
  [00 00], or "legacy".

  The first (client->relay) message in a handshake of type “legacy”
  contains the following data:

        ‘Onion skin’ (as in CREATE cell)      [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]

  This value is generated and processed as sections 5.1 and 5.2 of
  tor-spec.txt specify for the current CREATE cell.

  The second (relay->client) message in a handshake of type “legacy”
  contains the following data:

        Relay DH public key                   [DH_LEN bytes]
        KH (see section 5.2 of tor-spec.txt)  [HASH_LEN bytes]

  These values are generated and processed as sections 5.1 and 5.2 of
  tor-spec.txt specify for the current CREATED cell.

  After successfully completing a handshake of type “legacy”, the
  client and relay use the current relay cryptography protocol.

Bugs:

  This specification does not accommodate:

  * circuit-extension handshakes requiring more than one round

    No circuit-extension handshake should ever require more than one
    round (i.e. more than one message from the client and one reply
    from the relay).  We can easily extend the protocol to handle
    this, but we will never need to.

  * circuit-extension handshakes in which either message cannot fit in
    a single 512-byte cell along with the other required fields

    This can be handled by specifying a dummy handshake type whose
    data (sent from the client) consists of another handshake type and
    the beginning of the data required by that handshake type, and
    then using several (newly defined) HANDSHAKE_COMPLETION relay
    cells sent in each direction to transport the remaining handshake
    data.

    The specification of a HANDSHAKE_COMPLETION relay cell and its
    associated dummy handshake type can safely be postponed until we
    develop a circuit-extension handshake protocol that would require
    it.

  * link target specifiers that cause EXTEND2 cells to exceed 512
    bytes

    This can be handled by specifying a LONG_COMMAND relay cell type
    that can be used to transport a large ‘virtual cell’ in multiple
    512-byte cells.

    The specification of a LONG_COMMAND relay cell can safely be
    postponed until we develop a link target specifier, a RELAY_BEGIN2
    relay cell and stream target specifier, or some other relay cell
    type that would require it.


Filename: 201-bridge-v3-reqs-stats.txt
Title: Make bridges report statistics on daily v3 network status requests
Author: Karsten Loesing
Created: 10-May-2012
Status: Reserve
Target: 0.2.4.x

Overview:

  Our current approach [0] to estimate daily bridge users is based on
  unique IP addresses reported by bridges, and it is very likely broken.
  A bridge user can connect to two or more bridges, so that unique IP address
  sets overlap to an unknown extent.  We should instead count requests for
  v3 network statuses, sum them up for all bridges, and divide by the
  average number of requests that a bridge client makes per day.  This
  approach is similar to how we estimate directly connecting users.  This
  proposal describes how bridges would report v3 network status requests
  in their extra-info descriptors.

Specification:

  Bridges include a new keyword line in their extra-info descriptors that
  contains the number of v3 network status requests by country they saw
  over a period of 24 hours.  The reported numbers refer to the period
  stated in the "bridge-stats-end" line.  The new keyword line would go
  after the "bridge-ips" line in dir-spec.txt:

  "bridge-v3-reqs" CC=N,CC=N,... NL
      [At most once.]

      List of mappings from two-letter country codes to the number of
      requests for v3 network statuses from that country as seen by the
      bridge, rounded up to the nearest multiple of 8. Only those requests
      are counted that the directory can answer with a 200 OK status code.
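
  The rounding and line format can be sketched as below. The function
  names are hypothetical, and two details are assumptions of this
  sketch rather than part of the specification: entries are emitted in
  alphabetical order of country code, and countries with no requests
  are omitted. The trailing NL is left to the caller.

```python
def round_up_to_multiple(n: int, multiple: int = 8) -> int:
    """Round n up to the nearest multiple (ceiling division)."""
    return -(-n // multiple) * multiple

def format_bridge_v3_reqs(counts: dict) -> str:
    """Build a 'bridge-v3-reqs' line from {country_code: request_count}."""
    parts = [
        f"{cc}={round_up_to_multiple(n)}"
        for cc, n in sorted(counts.items())
        if n > 0
    ]
    return "bridge-v3-reqs " + ",".join(parts)
```

  With made-up counts {"us": 9, "cn": 1}, this yields
  "bridge-v3-reqs cn=8,us=16".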


[0] https://metrics.torproject.org/papers/countingusers-2010-11-30.pdf

Filename: 202-improved-relay-crypto.txt
Title: Two improved relay encryption protocols for Tor cells
Author: Nick Mathewson
Created: 19 Jun 2012
Status: Meta

Note: This is an important development step in improving our relay
  crypto, but it doesn't actually say how to do this.


Overview:

   This document describes two candidate designs for a better Tor
   relay encryption/decryption protocol, designed to stymie tagging
   attacks and better accord with best practices in protocol design.

   My hope is that readers will examine these protocols, evaluate their
   security and practicality, improve on them, and help to pick one for
   implementation in Tor.

   In section 1, I describe Tor's current relay crypto protocol, its
   drawbacks, how it fits in with the rest of Tor, and some
   requirements/desiderata for a replacement.  In sections 2 and 3, I
   propose two alternative replacements for this protocol.  In section
   4, I discuss their pros and cons.

1. Background and Motivation

1.0. A short overview of Tor's current protocols

   The core pieces of the Tor protocol are the link protocol,
   the circuit extend protocol, the relay protocol, and the stream
   protocol.  All are documented in [TorSpec].

   Briefly:

     - The link protocol handles all direct pairwise communication
       between nodes.  Everything else is transmitted over it.  It
       uses TLS.

     - The circuit extend protocol uses public-key crypto to set up
       multi-node virtual tunnels, called "circuits", from a client
       through one or more nodes.

   *** The relay protocol uses established circuits to communicate
       from a client to a node on a circuit.  That's the one we'll
       be talking about here. ***

     - The stream protocol is tunneled over the relay protocol; clients
       use it to tell servers to open anonymous TCP connections, to
       send data, and so forth.  Servers use it to report success or
       failure opening anonymous TCP connections, to send data from
       TCP connections back to clients, and so forth.

   In more detail: The link protocol's job is to connect two
   computers with an encrypted, authenticated stream, to
   authenticate one or both of them to the other, and to provide a
   channel for passing "cells" between them.  The circuit extend
   protocol's job is to set up circuits: persistent tunnels that
   connect a Tor client to an exit node through a series of
   (typically three) hops, each of which knows only the previous and
   next hop, and each of which has a set of keys that they share
   only with the client.  Finally, the relay protocol's job is to
   allow a client to communicate bidirectionally with the node(s) on
   the circuit, once their shared keys have been established.

   (We'll ignore the stream protocol for the rest of this document.)

   Note on terminology: Tor nodes are sometimes called "servers",
   "relays", or "routers".  I'll use all these terms more or less
   interchangeably.  For simplicity's sake, I will call the party
   who is constructing and using a circuit "the client" or "Alice",
   even though nodes sometimes construct circuits too.

   Tor's internal packets are called "cells".  All the cells we deal
   with here are 512 bytes long.

   The nodes in a circuit are called its "hops"; most circuits are 3
   hops long.  This doesn't include the client: when Alice builds a
   circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is
   "Bob_1" and the last hop is "Bob_3".

1.1. The current relay protocol and its drawbacks

   [This section describes Tor's current relay protocol.  It is not a
   proposal; rather it is what we do now.  Sections 2 and 3 have my
   proposed replacements for it.]

   A "relay cell" is a cell that is generated by the client to send
   to a node, or by a node to send to the client.  It's called a
   "relay" cell because a node that receives one may need to relay
   it to the next or previous node in the circuit (or to handle the
   cell itself).

   A relay cell moving towards the client is called "inbound"; a
   cell moving away is called "outbound".

   When a relay cell is constructed by the client, the client adds one
   layer of crypto for each node that will process the cell, and gives
   the cell to the first node in the circuit.  Each node in turn then
   removes one layer of crypto, and either forwards the cell to the next
   node in the circuit or acts on that cell itself.

   When a relay cell is constructed by a node, it encrypts it and sends
   it to the preceding node in the circuit.  Each node between the
   originating node and the client then encrypts the cell and passes it
   back to the preceding node.  When the client receives the cell, it
   removes layers of crypto until it has an unencrypted cell, which it
   then acts on.

   In the current protocol, the body of each relay cell contains, in
   its unencrypted form:

        Relay command     [1 byte]
        Zeros             [2 bytes]
        StreamID          [2 bytes]
        Digest            [4 bytes]
        Length            [2 bytes]
        Data              [498 bytes]

   (This only adds up to 509 bytes.  That's because the Tor link
   protocol transfers 512-byte cells, and has a 3 byte overhead per
   cell.  Not how we would do it if we were starting over at this
   point.)

   At every node of a circuit, the node relaying a cell
   encrypts/decrypts it with AES128-CTR.  If the cell is outbound
   and the "zeros" field is set to all-zeros, the node additionally
   checks whether 'digest' is correct.  A correct digest is the
   first 4 bytes of the running SHA1 digest of: a shared secret,
   concatenated with all the relay cells destined for this node on
   this circuit so far, including this cell.  If _that's_ true, then
   the node accepts this cell.  (See section 6 of [TorSpec] for full
   detail; see section A.1 for a more rigorous description.)
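
   The layout and the running-digest check can be sketched as follows.
   This is an illustration only, not Tor's implementation: the class
   and function names are hypothetical, the layer encryption is left
   out entirely, and the sketch assumes that the digest field is
   hashed as four zero bytes and that a failed check leaves the
   running state untouched.

```python
import hashlib
import struct

RELAY_DATA_LEN = 498
_FMT = ">BHH4sH498s"   # command, zeros, stream ID, digest, length, data
_NO_DIGEST = b"\x00" * 4

def pack_relay_body(command, stream_id, data, digest=_NO_DIGEST):
    """Build the 509-byte unencrypted relay cell body."""
    if len(data) > RELAY_DATA_LEN:
        raise ValueError("relay data too long")
    return struct.pack(_FMT, command, 0, stream_id, digest, len(data), data)

class RunningDigest:
    """Running SHA1 over a shared secret plus every cell seen so far."""

    def __init__(self, shared_secret: bytes):
        self._h = hashlib.sha1(shared_secret)

    def seal(self, command, stream_id, data) -> bytes:
        """Sender side: fill in the 4-byte digest for the next cell."""
        self._h.update(pack_relay_body(command, stream_id, data))
        return pack_relay_body(command, stream_id, data, self._h.digest()[:4])

    def check(self, body: bytes) -> bool:
        """Receiver side: accept only if zeros and digest both match."""
        command, zeros, stream_id, digest, length, data = struct.unpack(_FMT, body)
        if zeros != 0:
            return False
        trial = self._h.copy()
        trial.update(struct.pack(_FMT, command, 0, stream_id,
                                 _NO_DIGEST, length, data))
        if trial.digest()[:4] != digest:
            return False
        self._h = trial   # commit the new running state
        return True
```

   Because the digest runs over every earlier cell as well, a receiver
   that accepts cell N has implicitly checked cells 1 through N-1.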

   The current approach has some actual problems.  Notably:

      * It permits tagging attacks. Because AES_CTR is an XOR-based
        stream cipher, an adversary who controls the first node in a
        circuit can XOR anything he likes into the relay cell, and
        then see whether he encounters a correspondingly
        defaced cell at some exit that he also controls.

        That is, the attacker picks some pattern P, and when he
        would transmit some outbound relay cell C at hop 1, he
        instead transmits C xor P.  If an honest exit receives the
        cell, the digest will probably be wrong, and the honest exit
        will reject it.  But if the attacker controls the exit, he
        will notice that he has received a cell C' where the digest
        doesn't match, but where C' xor P _does_ have a good digest.
        The attacker will then know that his two nodes are on the
        same circuit, and thereby be able to link the user (whom the
        first node sees) to her activities (which the last node sees).

        Back when we did the Tor design, this didn't seem like a
        big deal, since an adversary who controls both the first and
        last node in a circuit is presumed to win already based on
        traffic correlation attacks.  This attack seemed strictly
        worse than that, since it was trivially detectable in the
        case where the attacker _didn't_ control both ends.  See
        section 4.4 of the Tor paper [TorDesign] for our early
        thoughts here; see Xinwen Fu et al's 2009 work for a more
        full explanation of the in-circuit tagging attack [XF]; and
        see "The 23 Raccoons'" March 2012 "Analysis of the Relative
        Severity of Tagging Attacks" mail on tor-dev (and the
        ensuing thread) for a discussion of why we may want to care
        after all, due to attacks that use tagging to amplify route
        capture. [23R]
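
        The malleability behind this attack is easy to demonstrate.
        The sketch below is a toy, not Tor code: it stands in for
        AES-CTR with a SHA-256-based keystream (an assumption made
        only so that the example runs with the standard library),
        ignores the digest check, and uses made-up keys. It shows
        that a pattern P XORed in at the first hop survives every
        later layer and can be removed again at the exit.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Toy counter-mode keystream; a stand-in for AES-CTR."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt one layer (the same operation for XOR ciphers)."""
    return xor(data, keystream(key, len(data)))

def client_wrap(keys, plaintext: bytes) -> bytes:
    """Client adds one layer per hop, innermost layer for the last hop."""
    cell = plaintext
    for key in reversed(keys):
        cell = crypt(key, cell)
    return cell

def tagged_path(keys, plaintext: bytes, pattern: bytes) -> bytes:
    """What the last hop sees if the first hop XORs in a pattern."""
    cell = crypt(keys[0], client_wrap(keys, plaintext))  # hop 1 decrypts
    cell = xor(cell, pattern)                            # hop 1 tags: C xor P
    for key in keys[1:]:                                 # later hops decrypt
        cell = crypt(key, cell)
    return cell
```

        The value returned by tagged_path() equals the plaintext XOR
        P, so a colluding exit that knows P recovers the original
        cell exactly.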

   It also has some less practical issues.

      * The digest portion is too short.  Yes, if you're an attacker
        trying to (say) change an "ls *" into an "rm *", you can
        only expect to get one altered cell out of 2^32 accepted --
        and all future cells on the circuit will be rejected with
        similar probability due to the change in the running hash
        -- but 1/2^32 is a pretty high success rate for crypto attacks.

      * It does MAC-then-encrypt.  That makes smart people cringe.

      * Its approach to MAC is H(Key | Msg), which is vulnerable to
        length extension attack if you've got a Merkle-Damgard hash
        (which we do).  This isn't relevant in practice right now,
        since the only parties who see the digest are the two
        parties that rely on it (because of the MAC-then-encrypt).


1.2. Some feature requirements

   Relay cells can originate at the client or at a relay.  Relay cells
   that originate at the client are given to the first node in the
   circuit, and constructed so that they will be decrypted and forwarded
   by the first M-1 nodes in the circuit, and finally decrypted and
   processed by the Mth node, where the client chooses M. (Usually, the
   Mth node is the last one, which will be an exit node.) Relay
   cells that originate at a hop in the circuit are given to the
   preceding node, and eventually delivered to the client.

   Tor provides a so-called "leaky pipe" circuit topology
   [TorDesign]: a client can send a cell to any node in the circuit,
   not just the last node. I'll try to keep that property, although
   historically we haven't really made use of it.

   In order to implement telescoping circuit construction (where the
   circuit is built by iteratively asking the last node in the
   circuit to extend the circuit one hop more), it's vital that the
   last hop of the circuit be able to change.

   Proposal 188 [Prop188] suggests that we implement a "bridge guards"
   feature: making some (non-public) nodes insert an extra hop into
   the circuit after themselves, in a way that will make it harder
   for other nodes in the network to enumerate them.  We
   therefore want our circuits to be one-hop re-extensible: when the
   client extends a circuit from Bob1 to Bob2, we want Bob1 to be
   able to introduce a new node "Bob1.5" into the circuit such that
   the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has
   been called "loose source routing".)

   Any new approach should be able to coexist on a circuit
   with the old approach.  That is, if Alice wants to build a
   circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a
   revised relay protocol, then Alice should be able to build a
   circuit such that she can have Bob1 and Bob3 process each cell
   with the current protocol, and Bob2 process it with a revised
   protocol.  (Why?  Because if all nodes in a circuit needed to use
   the same relay protocol, then each node could learn information
   about the other nodes in the circuit from which relay protocol
   was chosen.  For example, if Bob1 supports the new protocol, and
   sees that the old relay protocol is in use, and knows that Bob2
   supports the new one, then Bob1 has learned that Bob3 is some
   node that does not support the new relay protocol.)

   Cell length needs to be constant as cells move through the
   network.  For historical reasons mentioned above in section 1.1,
   the length of the encrypted part of a relay cell needs to be 509
   bytes.

1.3. Some security requirements

   Two adjacent nodes on a circuit can trivially tell that they are
   on the same circuit, and the first node can trivially tell who
   the client is. Other than that, we'd like any attacker who
   controls a node on the circuit not to have a good way to learn
   any other nodes, even if he/she controls those nodes. [*]

   Relay cells should not be malleable: no relay should be able to
   alter a cell between an honest sender and an honest recipient in
   a way that they cannot detect.

   Relay cells should be secret: nobody but the sender and recipient
   of a relay cell should be able to learn what it says.

   Circuits should resist transparent, recoverable tagging attacks:
   if an attacker controls one node in a circuit and alters a relay
   cell there, no non-adjacent point in the circuit should be able
   to recover the relay cell as it would have received it had the
   attacker not altered it.

   The above properties should apply to sequences of cells too:
   an attacker shouldn't be able to change what sequence of cells
   arrives at a destination (for example, by removing, replaying, or
   reordering one or more cells) without being detected.

   (Feel free to substitute whatever formalization of the above
   requirements makes you happiest, and add whatever caveats are
   necessary to make you comfortable.  I probably missed at least
   two critical properties.)

   [*] Of course, an attacker who controls two points on a circuit
       can probably confirm this through traffic correlation.  But
       we'd prefer that the cryptography not create other, easier
       ways for them to do this.

1.4. A note on algorithms

   This document is deliberately agnostic concerning the choice of
   cryptographic primitives -- not because I have no opinions about
   good ciphers, MACs, and modes of operation -- but because
   experience has taught me that mentioning any particular
   cryptographic primitive will prevent discussion of anything else.

   Please DO NOT suggest algorithms to use in implementing these
   protocols yet.  It will distract!  There will be time later!

   If somebody _else_ suggests algorithms to use, for goodness' sake
   DON'T ARGUE WITH THEM!  There will be time later!


2. Design 1: Large-block encryption

   In this approach, we use a tweakable large-block cipher for
   encryption and decryption, and a tweak-chaining function TC.

2.1. Chained large-block what now?

   We assume the existence of a primitive that provides the desired
   properties of a tweakable[Tweak] block cipher, taking blocks of any
   desired size.  (In our case, the block size is 509 bytes[*].)

   It also takes a Key, and a per-block "tweak" parameter that plays
   the same role that an IV plays in CBC, or that the counter plays
   in counter mode.

   The Tweak-chaining function TC takes as input a previous tweak, a
   tweak chaining key, and a cell; it outputs a new tweak.  Its
   purpose is to make future cells undecryptable unless you have
   received all previous cells.  It could probably be something like
   a MAC of the old tweak and the cell using the tweak chaining key
   as the MAC key.

   (If the initial tweak is secret, I am not sure that TC needs to
   be keyed.)

   [*] Some large-block cipher constructions use blocks whose size is
       a multiple of some underlying cipher's block size.  If we wind
       up having to use one of those, we can use 496-byte blocks instead
       at the cost of 2.5% wasted space.

2.2. The protocol

2.2.1. Setup phase

   The circuit construction algorithm needs to produce forward and
   backward keys Kf and Kb, the forward and backward tweak chaining
   keys TCKf and TCKb, as well as initial tweak values Tf and Tb.

2.2.2. The cell format

   We replace the "Digest" and "Zeros" fields of the cell with a
   single Z-byte "Zeros" field to determine when the cell is
   recognized and correctly decrypted; its length is a security
   parameter.

2.2.3. The decryption operations

   For a relay to handle an inbound RELAY cell, it sets Tb_next to
   TC(TCKb, Tb, Cell).  Then it encrypts the cell using the large
   block cipher keyed with Kb and tweaked with Tb.  Then it sets Tb
   to Tb_next.

   For a relay to handle an outbound RELAY cell, it sets Tf_next to
   TC(TCKf, Tf, Cell).  Then it decrypts the cell using the large
   block cipher keyed with Kf and tweaked with Tf.  Then it sets Tf
   to Tf_next.  Then it checks the 'Zeros' field on the cell;
   if that field is all [00] bytes, the cell is for us.
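
   The tweak bookkeeping in these steps can be sketched without
   committing to any cipher. In the sketch below the large-block
   cipher is omitted, and TC is instantiated with HMAC-SHA256 purely
   as a placeholder so that the example runs; that choice is an
   assumption of the sketch and not a suggested algorithm. The class
   and function names are hypothetical.

```python
import hashlib
import hmac

def tc(tck: bytes, tweak: bytes, cell: bytes) -> bytes:
    """Placeholder tweak-chaining function: a MAC of old tweak and cell."""
    return hmac.new(tck, tweak + cell, hashlib.sha256).digest()

class TweakChain:
    """Tracks one direction's tweak (Tf or Tb) for one hop."""

    def __init__(self, tck: bytes, initial_tweak: bytes):
        self.tck = tck
        self.tweak = initial_tweak

    def step(self, cell: bytes) -> bytes:
        """Return the tweak to use for this cell, then advance the chain.

        Mirrors the order in the text: compute T_next from the cell as
        received, process the cell under the current T, then set T to
        T_next.
        """
        current = self.tweak
        self.tweak = tc(self.tck, current, cell)
        return current
```

   Two parties that see the same cells in the same order derive the
   same tweaks; if one of them misses a cell, every later tweak
   differs, which is what makes later cells undecryptable.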

2.3. Security discussion

   This approach is fairly simple (at least, no more complex than
   its primitives) and achieves some of our security goals.  Because
   of the large block cipher approach, any change to a cell will
   render that cell undecryptable, and indistinguishable from random
   junk.  Because of the tweak chaining approach, if even one cell
   is missed or corrupted or reordered, future cells will also
   decrypt into random junk.

   The tagging attack in this case is turned into a circuit-junking
   attack: an adversary who tries to mount it can probably confirm
   that he was indeed first and last node on a target circuit
   (assuming that circuits which turn to junk in this way are rare),
   but cannot recover the circuit after that point.

   As a neat side observation, note that this approach improves upon
   Tor's current forward secrecy, by providing forward secrecy while
   circuits are still operational, since each change to the tweak
   should make previous cells undecryptable if the old tweak value
   isn't recoverable.

   The length of Zeros is a parameter for what fraction of "random
   junk" cells will potentially be accepted by a router or client.
   If Zeros is Z bytes long, then junk cells will be accepted with
   P < 2^-(8*Z + 7).  (The '+7' is there because the top 7 bits of
   the Length field must also be 0.)
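
   As a quick worked instance of that bound, the following helper
   (a hypothetical name, shown only to make the arithmetic concrete)
   evaluates 2^-(8*Z + 7) exactly for a given Z.

```python
from fractions import Fraction

def junk_accept_bound(z_bytes: int) -> Fraction:
    """Upper bound 2^-(8*Z + 7) on accepting a random junk cell."""
    if z_bytes < 0:
        raise ValueError("Z must be non-negative")
    return Fraction(1, 2 ** (8 * z_bytes + 7))
```

   For example, Z = 4 gives a bound of 2^-39, and each additional
   byte of Zeros lowers the bound by a further factor of 256.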

   There's no trouble using this protocol in a mixed circuit, where
   some nodes speak the old protocol and some speak the
   large-block-encryption protocol.

3. Design 2: short-MAC-and-pad

   In this design, we behave more similarly to a mix-net design
   (such as Mixmaster or Mixminion's headers).  Each node checks a
   MAC, and then re-pads the cell to its chosen length before
   decoding the cell.

   This design uses as a primitive a MAC and a stream cipher.  It
   might also be possible to use an authenticating cipher mode,
   if we can find one that works like a stream cipher and allows us
   to efficiently output authenticators for the stream so far.

   NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be
   replaced with an authenticated encryption mode without too much
   loss of generality.

3.1. The protocol

3.1.1. Setup phase

   The circuit construction algorithm needs to produce forward and
   backward keys Kf and Kb, forward and backward stream cipher IVs
   IVf and IVb, and forward and backward MAC keys Mf and Mb.

   Additionally, the circuit construction algorithm needs a way for
   the client to securely (and secretly? XXX) tell each hop in the
   circuit a value 'bf' for the number of bytes of MAC it should
   expect on outbound cells and 'bb' for the number of bytes it
   should use on cells it's generating.   Each node can get a
   different 'bf' and 'bb'.  These values can be 0: if a node's bf
   is 0, it doesn't authenticate cells; if its bb is 0, it doesn't
   originate them.

   The circuit construction algorithm also needs a way for the
   client to securely (and secretly? XXX) tell each hop in the
   circuit whether it is allowed to be the final destination for
   relay cells.

   Set the stream Sf and the stream Sb to empty values.

3.1.2. The cell format

   The Zeros and Digest field of the cell format are removed.

3.1.3. The relay operations

   Upon receiving an outbound cell, a node removes the first bf bytes
   of the cell, and puts them aside as 'M'.  The node then computes
   between 0 and 2 MACs of the stream consisting of all previously
   MAC'd data, plus the remainder of the cell:

      If bf>0 and there is a next hop, the node computes M_relay.

      If this node was told to deliver traffic, or it is the last
      node in the circuit so far, the node computes M_receive.

   M_relay is computed as MAC(stream | "relay"); M_receive is
   computed as MAC(stream | "receive").

   If M = M_receive, this cell is for the node; it should process
   it.

   If M = M_relay, or if bf == 0, this cell should be relayed.

   If a MAC was computed and neither of the above cases was met,
   then the cell is bogus; the node should discard it and destroy
   the circuit.

   The node then removes the first bf bytes of the cell, and pads the
   cell at the end with bf zero bytes.  Finally, the node decrypts
   the whole remaining padded cell with the stream cipher.

   To handle an inbound cell, the node simply applies the stream
   cipher, with no checking.

3.1.4. Generating inbound cells.

   To generate an inbound cell, a node makes a 509-bb byte RELAY
   cell C, encrypts that cell with Kb, appends the encrypted cell to
   Sb, and prepends M_receive(Sb) to the cell.
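
   The inbound generation step can be sketched in Python as follows.
   This is an illustration only, under assumptions that are not part
   of the proposal: HMAC-SHA256 stands in for the MAC, a SHAKE-256
   output stands in for the stream cipher keyed with Kb, and the
   names InboundState, make_cell and keystream are made up for the
   example.

```python
import hashlib
import hmac

BODY_LEN = 509  # relay cell body length used in this proposal


def keystream(key: bytes, offset: int, n: int) -> bytes:
    """Toy keystream (SHAKE-256 output); an assumed stand-in for a real stream cipher."""
    return hashlib.shake_256(key).digest(offset + n)[offset:]


class InboundState:
    """Hypothetical per-circuit backward state at one relay: Kb, Mb, bb and Sb."""

    def __init__(self, kb: bytes, mb: bytes, bb: int):
        self.kb, self.bb = kb, bb
        # Running MAC over Sb; Sb starts out empty.
        self._sb_mac = hmac.new(mb, digestmod=hashlib.sha256)
        self._offset = 0  # position in the keystream

    def make_cell(self, body: bytes) -> bytes:
        """Encrypt a (509 - bb)-byte body, append it to Sb, prepend M_receive(Sb)."""
        assert len(body) == BODY_LEN - self.bb
        ks = keystream(self.kb, self._offset, len(body))
        self._offset += len(body)
        enc = bytes(a ^ b for a, b in zip(body, ks))
        self._sb_mac.update(enc)           # Sb = Sb | encrypted cell
        tag = self._sb_mac.copy()          # leave the running state untouched
        tag.update(b"receive")             # M_receive(Sb) = MAC(Sb | "receive")
        return tag.digest()[: self.bb] + enc
```

   Keeping a running MAC object and copying it for each tag avoids
   storing all of Sb; whether a real MAC would allow that cheaply is
   the same question raised above for authenticating cipher modes.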

3.1.5. Generating outbound cells

   Generating an outbound cell is harder, since we need to know what
   padding the earlier nodes will generate in order to know what
   padding the later nodes will receive and compute their MACs, but
   we also need to know what MACs we'll send to the later nodes in
   order to compute which MACs we'll send to the earlier ones.

   Mixnet clients have needed to do this for ages, though, so the
   algorithms are pretty well settled.  I'll give one below in A.3.

3.2. Security discussion

   This approach is also simple and (given good parameter choices)
   can achieve our goals.  The trickiest part is the algorithm that
   clients must follow to package cells, but that's not so bad.

   It's not necessary to check MACs on inbound traffic, because
   nobody but the client can tell scrambled messages from good ones,
   and the client can be trusted to keep the client's own secrets.

   With this protocol, if the attacker tries to do a tagging attack,
   the circuit should get destroyed by the next node in the circuit
   that has a nonzero "bf" value, with probability == 1-2^-(8*bf).
   (If there are further intervening honest nodes, they also have a
   chance to detect the attack.)
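
   To get a feel for the numbers, the following Python sketch
   evaluates 1-2^-(8*bf) exactly for a few MAC lengths.  It assumes
   that a truncated MAC on a modified cell is as good as a uniformly
   random guess; the function names are made up for the example.

```python
from fractions import Fraction


def miss_probability(bf: int) -> Fraction:
    """Chance that a modified cell passes a bf-byte MAC check (assumed 2^-(8*bf))."""
    return Fraction(1, 2 ** (8 * bf))


def detection_probability(bf: int) -> Fraction:
    """Chance that a node with this bf destroys the circuit on a tagged cell."""
    return 1 - miss_probability(bf)


for bf in (0, 1, 2, 4, 8):
    print(bf, float(detection_probability(bf)))
```

   With bf = 0 the node detects nothing, and with bf = 1 it already
   catches 255 out of 256 tagged cells.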

   Any attempt to replay or reorder outbound cells should fail in
   the same way.

   The "bf" values could reveal to each node its position in the
   circuit and the client preferences, depending on how we set them;
   on the other hand, having a fixed bf value would reveal to the
   last node the length of the circuit.  Neither option seems great.

   This protocol doesn't provide any additional forward secrecy
   beyond what Tor has today.  We could fix that by changing our use
   of the stream cipher so that instead of running in counter mode
   between cells, we use a tweaked stream cipher and change the
   tweak with each cell (possibly based on the unused portion of the
   MAC).

   This protocol does support loose source routing, so long as
   no padding bytes are added by any router-added nodes.

   In a circuit, every node has at least one relay cell sent to it:
   even non-exit nodes get a RELAY_EXTEND cell.

4. Discussion

   I'm not currently seeing a reason to strongly prefer one of these
   approaches over another.

   In favor of large-block encryption:
     - The space overhead seems smaller: we need to use up fewer
       bytes in order to get equivalent looking security.

       (For example, if we want to have P < 2^-64 that a cell altered
       by hop 1 could be accepted by hop 2 or hop 3, *and* we want P
       < 2^-64 that a cell altered by hop 2 could be accepted by hop
       3, the large-block approach needs about 8 bytes for the Zeros
       field, whereas the short-MAC-and-pad approach needs 16 bytes
       worth of MACs.)

     - We get forward secrecy pretty easily.

     - The format doesn't leak anything about the length of the
       circuit, or limit it.

     - We don't need complicated logic to set the 'b' parameters.

     - It doesn't need tricky padding code.

   In favor of short-MAC-and-pad:
     - All of the primitives used are much better analyzed and
       understood.  There's nothing esoteric there.  The format
       itself is similar to older, well-analyzed formats.

     - Most of the constructions for the large-block-cipher approach
       seem less efficient in CPU cycles than a good stream cipher
       and a MAC. (But I don't want to discuss that now; see section
       1.4 above!)

   Unclear:

     - Suppose that an attacker controls the first and last hop of a
       circuit, and tries an end-to-end tagging attack.  With
       large-block encryption, the tagged cell and all future cells
       on the circuit turn to junk after the tagging attack, with
       P~~100%.  With short-MAC-and-pad, the circuit is destroyed at
       the second hop, with P ~~ 1- 2^(-8*bf).  Is one of these
       approaches materially worse for the attacker?

     - Can we do better than the "compute two MACs" approach for
       distinguishing the relay and receive cases of the
       short-MAC-and-pad protocol?

     - To what extent can we improve these protocols?

     - If we do short-MAC-and-pad, should we apply the forward
       security hack alluded to in section 3.2?

5. Acknowledgments

   Thanks to the many reviewers of the initial drafts of this
   proposal.  If you can make any sense of what I'm saying, they
   deserve much of the credit.

A. Formal description

   Note that in all these cases, more efficient descriptions exist.

A.1. The current Tor relay protocol.

   Relay cell format:

     Relay command     [1 byte]
     Zeros             [2 bytes]
     StreamID          [2 bytes]
     Digest            [4 bytes]
     Length            [2 bytes]
     Data              [498 bytes]

   Circuit setup:

     (Specified elsewhere; the client negotiates with each router in
     a circuit the secret AES keys Kf, Kb, and the secret 'digest
     keys' Df, and Db.  They initialize AES counters Cf and Cb to
     0.  They initialize the digest stream Sf to Df, and Sb to Db.)

   HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]):

     1. If the Zeros field of Cell is not [00 00], return False.
     2. Let Cell' = Cell with its Digest field set to [00 00 00 00].
     3. Let S' = Stream | Cell'.
     4. If SHA1(S') = the Digest field of Cell, set Stream to S',
        and return True.
     5. Otherwise return False.

   HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])

     1. Set the Zeros and Digest fields of Cell' to [00] bytes.
     2. Set Stream to Stream | Cell'.
     3. Construct Cell from Cell' by setting the Digest field to
        SHA1(Stream), and taking all other fields as-is.
     4. Return Cell.
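
   As an illustration, CHECK and CONSTRUCT can be sketched in Python
   with a running SHA-1 state.  Since the Digest field is 4 bytes
   long, the sketch compares against the first 4 bytes of the SHA-1
   output.  The class name DigestStream and the slice constants are
   made up for the example.

```python
import hashlib

BODY_LEN = 509
ZEROS = slice(1, 3)    # Zeros field: 2 bytes after the relay command
DIGEST = slice(5, 9)   # Digest field: 4 bytes after the StreamID


class DigestStream:
    """Hypothetical holder for one digest stream (Sf or Sb)."""

    def __init__(self, seed: bytes):
        self._h = hashlib.sha1(seed)  # Stream is initialized to Df or Db

    def construct(self, cell: bytes) -> bytes:
        """CONSTRUCT: zero Zeros and Digest, extend the stream, fill in Digest."""
        out = bytearray(cell)
        out[ZEROS] = bytes(2)
        out[DIGEST] = bytes(4)
        self._h.update(out)
        out[DIGEST] = self._h.digest()[:4]
        return bytes(out)

    def check(self, cell: bytes) -> bool:
        """CHECK: accept only if Zeros is clear and the digest matches."""
        if cell[ZEROS] != bytes(2):
            return False
        blank = bytearray(cell)
        blank[DIGEST] = bytes(4)
        trial = self._h.copy()        # S' = Stream | Cell'
        trial.update(blank)
        if trial.digest()[:4] != cell[DIGEST]:
            return False
        self._h = trial               # commit: Stream = S'
        return True
```

   Working on a copy of the hash state matches step 4 of CHECK: the
   stream only advances when the cell is recognized.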

   HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
     1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
        counter 'Ctr'.  Increment 'Ctr' by the length of the cell.

   HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
     1. Same as ENCRYPT.


   Router operation, upon receiving an inbound cell -- that is, one
   sent towards the client.

     1. ENCRYPT(cell, Kb, Cb)
     2. Send the encrypted contents towards the client.

   Router operation, upon receiving an outbound cell -- that is, one
   sent away from the client.

     1. DECRYPT(cell, Kf, Cf)
     2. If CHECK(Cell, Sf) is true, this cell is for us.  Do not
        relay the cell.
     3. Otherwise, this cell is not for us.  Send the decrypted cell
        to the next hop on the circuit, or discard it if there is no
        next hop.

   Router operation, to create a relay cell that will be delivered
   to the client.

     1. Construct a Relay cell Cell' with the relay command, length,
        stream ID, and body fields set as appropriate.
     2. Let Cell = CONSTRUCT(Cell', Sb).
     3. ENCRYPT(Cell, Kb, Cb)
     4. Send the encrypted cell towards the client.

   Client operation, receiving an inbound cell.

     For each hop H in a circuit, starting with the first hop and
     ending (possibly) with the last:

        1. DECRYPT(Cell, Kb_H, Cb_H)

        2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop
           H.  Exit the loop, and return the cell in its current
           form.

     If we reach the end of the loop without finding the right hop,
     the cell is bogus or corrupted.

   Client operation, sending an outbound cell to hop H.

     1. Construct a Relay cell Cell' with the relay command, length,
        stream ID, and body fields set as appropriate.
     2. Let Cell = CONSTRUCT(Cell', Sf_H)
     3. For i = H..1:
          A. ENCRYPT(Cell, Kf_i, Cf_i)
     4. Deliver Cell to the first hop in the circuit.

A.2. The large-block-cipher protocol

   Same as A.1, except for the following changes.

   The cell format is now:
        Zeros             [Z bytes]
        Length            [2 bytes]
        StreamID          [2 bytes]
        Relay command     [1 byte]
        Data              [504-Z bytes]

   Ctr is no longer a counter, but a Tweak value.

   Each key is now a tuple of (Key_Crypt, Key_TC)

   Streams are no longer used.

   HELPER FUNCTION: CHECK(Cell [in], Stream [in,out])
        1. If the Zeros field of Cell contains only [00] bytes,
           return True.
        2. Otherwise return false.

   HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
        1. Let Cell be Cell', with its "Zeros" field set to Z [00]
           bytes.
        2. Return Cell.

   HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out])
        1. Encrypt Cell using Key and Tweak
        2. Let Tweak' = TC(Key_TC, Tweak, Cell)
        3. Set Tweak to Tweak'.

   HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out])
        1. Let Tweak' = TC(Key_TC, Tweak, Cell)
        2. Decrypt Cell using Key and Tweak
        3. Set Tweak to Tweak'.

A.3. The short-MAC-and-pad protocol.

   Define M_relay(K,S) as MAC(K, S|"relay").
   Define M_receive(K,S) as MAC(K, S|"receive").
   Define Z(n) as a series of n [00] bytes.
   Define BODY_LEN as 509

   The cell message format is now:

     Relay command     [1 byte]
     StreamID          [2 bytes]
     Length            [2 bytes]
     Data              [variable bytes]

   Helper function: CHECK(Cell [in], b [in], K [in], S [in,out]):
       Let M = Cell[0:b]
       Let Rest = Cell[b:...]
       If b == 0:
          Return (nil, Rest)
       Let S' = S | Rest
       If M == M_relay(K,S')[0:b]:
          Set S = S'
          Return ("relay", Rest)
       If M == M_receive(K,S')[0:b]:
          Set S = S'
          Return ("receive", Rest)
       Return ("BAD", nil)

   HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
     1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
        counter 'Ctr'.  Increment 'Ctr' by the length of the cell.

   HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
     1. Same as ENCRYPT.

   Router operation, upon receiving an inbound cell:
     1. ENCRYPT(cell, Kb, Cb)
     2. Send the encrypted contents towards the client.

   Router operation, upon receiving an outbound cell:
     1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf)
     2. If Status == "BAD", drop the cell and destroy the circuit.
     3. Let Cell' = Msg | Z(BODY_LEN - len(Msg))
     4. DECRYPT(Cell', Kf, Cf) [*]
     5. If Status == "receive" or (Status == nil and there is no
        next hop), Cell' is for us: process it.
     6. Otherwise, send Cell' to the next node.

   Router operation, sending a cell towards the client:
     1. Let Body = a relay cell body of BODY_LEN-bb bytes.
     2. Let Cell' = ENCRYPT(Body, Kb, Cb)
     3. Let Sb = Sb | Cell'
     4. Let M = M_receive(Mb, Sb)[0:bb]
     5. Send the cell M | Cell' back towards the client.

   Client operation, upon receiving an inbound cell:

     For each hop i in the circuit, from first to last:

        1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i)
        2. If Status = "relay", drop the cell and destroy
           the circuit.  (BAD is okay; it means that this hop didn't
           originate the cell.)
        3. DECRYPT(Msg, Kb_i, Cb_i)
        4. If Status = "receive", this cell is from hop i; process
           it.
        5. Otherwise, set Cell = Msg.

   Client operation, sending an outbound cell:

        Let BF = the total of all bf_i values.

        1. Construct a relay cell body Msg of BODY_LEN-BF bytes.
        2. For each hop i, let Stream_i = ENCRYPT(Z(BODY_LEN),Kf_i,Cf_i)
        3. Let Pad_0 = "".
        4. For i in range 1..N, where N is the number of hops:
             Let Pad_i = Pad_{i-1} | Z(bf_i)
             Let S_last = the last len(Pad_i) bytes of Stream_i.
             Let Pad_i = Pad_i xor S_last
           Now Pad_i is the padding as it will stand after node i
           has processed it.

        5. For i in range N..1, where N is the number of hops:
             If this is the last hop, let M_* = M_receive. Else let
             M_* = M_relay.

             Let Body = Msg xor the first len(Msg) bytes of Stream_i

             Let M = M_*(Mf_i, Body | Pad_(i-1))

             Set Msg = M[:bf_i] | Body

        6. Send Msg outbound to the first relay in the circuit.


   [*] Strictly speaking, we could omit the pad-and-decrypt
       operation once we know we're the final hop.
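
   The client packaging steps and the router's outbound handling can
   be sketched together in Python for a single cell on a fresh
   circuit.  This is an illustration under several assumptions that
   are not part of the proposal: HMAC-SHA256 stands in for the MAC, a
   SHAKE-256 output stands in for the AES128_CTR keystream, and the
   running streams S and counters C are left out because only one
   cell is sent.  The names client_package and relay_process are made
   up for the example.

```python
import hashlib
import hmac

BODY_LEN = 509


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream; an assumed stand-in for AES128_CTR applied to zeros."""
    return hashlib.shake_256(key).digest(n)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def mac(key: bytes, stream: bytes, label: bytes) -> bytes:
    """M_relay / M_receive, modeled as HMAC-SHA256(key, stream | label)."""
    return hmac.new(key, stream + label, hashlib.sha256).digest()


def client_package(msg: bytes, hops) -> bytes:
    """hops is a list of (Kf_i, Mf_i, bf_i) tuples, first hop first."""
    assert len(msg) == BODY_LEN - sum(bf for _, _, bf in hops)
    streams = [keystream(kf, BODY_LEN) for kf, _, _ in hops]
    # pads[i] is the padding as it stands after the first i hops.
    pads = [b""]
    for (_, _, bf), stream in zip(hops, streams):
        pad = pads[-1] + bytes(bf)
        pads.append(xor(pad, stream[BODY_LEN - len(pad):]))
    # Wrap from the last hop back to the first.
    for i in range(len(hops) - 1, -1, -1):
        _, mf, bf = hops[i]
        label = b"receive" if i == len(hops) - 1 else b"relay"
        body = xor(msg, streams[i][: len(msg)])
        msg = mac(mf, body + pads[i], label)[:bf] + body
    return msg


def relay_process(cell: bytes, kf: bytes, mf: bytes, bf: int):
    """Check the MAC, re-pad to BODY_LEN, then decrypt the whole cell."""
    m, rest = cell[:bf], cell[bf:]
    if bf == 0:
        status = None
    elif hmac.compare_digest(m, mac(mf, rest, b"relay")[:bf]):
        status = "relay"
    elif hmac.compare_digest(m, mac(mf, rest, b"receive")[:bf]):
        status = "receive"
    else:
        raise ValueError("BAD cell: drop it and destroy the circuit")
    padded = rest + bytes(BODY_LEN - len(rest))
    return status, xor(padded, keystream(kf, BODY_LEN))
```

   Passing the output of client_package through relay_process once
   per hop should yield "relay" at every hop but the last, "receive"
   at the last hop, and the original message in the first
   BODY_LEN-BF bytes of the final plaintext.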



R. References

[Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses
     https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-guards.txt

[TorSpec] The Tor Protocol Specification
     https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.txt

[TorDesign] Dingledine et al, "Tor: The Second Generation Onion
     Router",
     https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf

[Tweak] Liskov et al, "Tweakable Block Ciphers",
        http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf

[XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity"

[23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging
      Attacks"  http://archives.seul.org/or/dev/Mar-2012/msg00019.html
      (You'll want to read the rest of the thread too.)
Filename: 203-https-frontend.txt
Title: Avoiding censorship by impersonating an HTTPS server
Author: Nick Mathewson
Created: 24 Jun 2012
Status: Obsolete

Note: Obsoleted-by pluggable transports.


Overview:

   One frequently proposed approach for censorship resistance is that
   Tor bridges ought to act like another TLS-based service, and deliver
   traffic to Tor only if the client can demonstrate some shared
   knowledge with the bridge.

   In this document, I discuss some design considerations for building
   such systems, and propose a few possible architectures and designs.

Background:

   Most of our previous work on censorship resistance has focused on
   preventing passive attackers from identifying Tor bridges, or from
   doing so cheaply.  But active attackers exist, and exist in the wild:
   right now, the most sophisticated censors use their anti-Tor passive
   attacks only as a first round of filtering before launching a
   secondary active attack to confirm suspected Tor nodes.

   One idea we've been talking about for a while is that of having a
   service that looks like an HTTPS service unless a client does some
   particular secret thing to prove it is allowed to use it as a Tor
   bridge.  Such a system would still succumb to passive traffic
   analysis attacks (since the packet timings and sizes for HTTPS don't
   look that much like Tor), but it would be enough to beat many current
   censors.

Goals and requirements:

   We should make it impossible for a passive attacker who examines only
   a few packets at a time to distinguish Tor->Bridge traffic from an
   HTTPS client talking to an HTTPS server.

   We should make it impossible for an active attacker talking to the
   server to tell a Tor bridge server from a regular HTTPS server.

   We should make it impossible for an active attacker who can MITM the
   server to learn from the client whether it thought it was connecting
   to an HTTPS server or a Tor bridge.  (This implies that an MITM
   attacker shouldn't be able to learn anything that would help it
   convince the server to act like a bridge.)

   It would be nice to minimize the required code changes to Tor, and
   the required code changes to any other software.

   It would be good to avoid any requirement of close integration with
   any particular HTTP or HTTPS implementation.

   If we're replacing our own profile with that of an HTTPS service, we
   should do so in a way that lets us use the profile of a popular
   HTTPS implementation.

   Efficiency would be good: layering TLS inside TLS is best avoided if
   we can.

Discussion:

   We need an actual web server; HTTP and HTTPS are so complicated that
   there's no practical way to behave in a bug-compatible way with any
   popular webserver short of running that webserver.

   More obviously, we need a TLS implementation (or we can't implement
   HTTPS), and we need a Tor bridge (since that's the whole point of
   this exercise).

   So from a top-level point of view, the question becomes: how shall we
   wire these together?

   There are three obvious ways; I'll discuss them in turn below.

Design #1: TLS in Tor

   Under this design, Tor accepts HTTPS connections, decides which ones
   don't look like the Tor protocol, and relays them to a webserver.

                   +--------------------------------------+
     +------+  TLS |  +------------+  http +-----------+  |
     | User |<------> | Tor Bridge |<----->| Webserver |  |
     +------+      |  +------------+       +-----------+  |
                   |     trusted host/network             |
                   +--------------------------------------+

   This approach would let us use a completely unmodified webserver
   implementation, but would require the most extensive changes in Tor:
   we'd need to add yet another flavor to Tor's TLS ice cream parlor,
   and try to emulate a popular webserver's TLS behavior even more
   thoroughly.

   To authenticate, we would need to take a hybrid approach, and begin
   forwarding traffic to the webserver as soon as a webserver
   might respond to the traffic.  This could be pretty complicated,
   since it requires us to have a model of how the webserver would
   respond to any given set of bytes.  As a workaround, we might try
   relaying _all_ input to the webserver, and only replying as Tor in
   the cases where the webserver hasn't replied.  (This would likely
   create recognizable timing patterns, though.)

   The authentication itself could use a system akin to Tor proposals
   189/190, where an early AUTHORIZE cell shows knowledge of a shared
   secret if the client is a Tor client.

Design #2: TLS in the web server

                   +----------------------------------+
     +------+  TLS |  +------------+  tor0   +-----+  |
     | User |<------> | Webserver  |<------->| Tor |  |
     +------+      |  +------------+         +-----+  |
                   |     trusted host/network         |
                   +----------------------------------+

   In this design, we write an Apache module or something that can
   recognize an authenticator of some kind in an HTTPS header, or
   recognize a valid AUTHORIZE cell, and respond by forwarding the
   traffic to a Tor instance.

   To avoid the efficiency issue of doing an extra local
   encrypt/decrypt, we need to have the webserver talk to Tor over a
   local unencrypted connection. (I've denoted this as "tor0" in the
   diagram above.)  For implementation convenience, we might want to
   implement that as a NULL TLS connection, so that the Tor server code
   wouldn't have to change except to allow local NULL TLS connections in
   this configuration.

   For the Tor handshake to work properly here, we'll need a way for the
   Tor instance to know which public key the webserver is configured to
   use.

   We wouldn't need to support the parts of the Tor link protocol used
   to authenticate clients to servers: relays shouldn't be using this
   subsystem at all.

   The Tor client would need to connect and prove its status as a Tor
   client.  If the client uses some means other than AUTHORIZE cells, or
   if we want to do the authentication in a pluggable transport, and we
   therefore decided to offload the responsibility for TLS itself to the
   pluggable transport, that would scare me: Supporting pluggable
   transports that have the responsibility for TLS would make it fairly
   easy to mess up the crypto, and I'd rather not have it be so easy to
   write a pluggable transport that accidentally makes Tor less secure.

Design #3: Reverse proxy


                   +----------------------------------+
                   |  +-------+  http  +-----------+  |
                   |  |       |<------>| Webserver |  |
     +------+  TLS |  |       |        +-----------+  |
     | User |<------> | Proxy |                       |
     +------+      |  |       |  tor0  +-----------+  |
                   |  |       |<------>|    Tor    |  |
                   |  +-------+        +-----------+  |
                   |     trusted host/network         |
                   +----------------------------------+

   In this design, we write a server-side proxy to sit in front of Tor
   and a webserver, or repurpose some existing HTTPS proxy. Its role
   will be to do TLS, and then forward connections to Tor or the
   webserver as appropriate.  (In the web world, this kind of thing is
   called a "reverse proxy", so that's the term I'm using here.)

   To avoid fingerprinting, we should choose a proxy that's already in
   common use as a TLS front-end for webservers -- nginx, perhaps.
   Unfortunately, the more popular tools here seem to be pretty complex,
   and the simpler tools less widely deployed.  More investigation would
   be needed.

   The authorization considerations would be as in Design #2 above; for
   the reasons discussed there, it's probably a good idea to build the
   necessary authorization into Tor itself.

   I generally like this design best: it lets us isolate the "Check for
   a valid authenticator and/or a valid or invalid HTTP header, and
   react accordingly" question to a single program.

How to authenticate: The easiest way

   Designing a good MITM-resistant AUTHORIZE cell, or an equivalent
   HTTP header, is an open problem that we should solve in proposals
   190 and 191 and their successors.  I'm calling it out-of-scope here;
   please see those proposals, their attendant discussion, and their
   eventual successors.

How to authenticate: a slightly harder way

   Some proposals in this vein have in the past suggested a special
   HTTP header to distinguish Tor connections from non-Tor connections.
   This could work too, though it would require substantially larger
   changes on the Tor client's part, would still require the client
   take measures to avoid MITM attacks, and would also require the
   client to implement a particular browser's HTTP profile.

Some considerations on distinguishability

   Against a passive eavesdropper, the easiest way to avoid
   distinguishability in server responses will be to use an actual web
   server or reverse web proxy's TLS implementation.
   (Distinguishability based on client TLS use is another topic
   entirely.)

   Against an active non-MITM attacker, the best probing attacks will be
   ones designed to provoke the system into acting in ways different from
   those in which a webserver would act: responding earlier than a web
   server would respond, or later, or differently.  We need to make sure
   that, whatever the front-end program is, it answers anything that
   would qualify as a well-formed or ill-formed HTTP request whenever
   the web server would.  This must mean, for example, that whatever the
   correct form of client authorization turns out to be, no prefix of
   that authorization is ever something that the webserver would respond
   to.  With some web servers (I believe), that's as easy as making sure
   that any valid authenticator isn't too long, and doesn't contain a CR
   or LF character.  With others, the authenticator would need to be a
   valid HTTP request, with all the attendant difficulty that would
   raise.
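
   For the simple case mentioned above (an authenticator that is
   short and contains no CR or LF), the constraint could be checked
   with something like the following Python sketch.  The function
   name and the 64-byte limit are placeholders invented for the
   example; the real bound would depend on the web server in
   question, and other servers would need a different rule entirely.

```python
def looks_safe_as_authenticator(auth: bytes, max_len: int = 64) -> bool:
    """Return True if auth is non-empty, short, and has no CR or LF byte.

    max_len is a hypothetical limit, not a value taken from any server.
    """
    if not auth or len(auth) > max_len:
        return False
    return b"\r" not in auth and b"\n" not in auth
```

   A check like this only covers the authenticator's own bytes; it
   says nothing about timing, which still has to match what the web
   server would do.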

   Against an attacker who can MITM the bridge, the best attacks will be
   to wait for clients to connect and see how they behave.  In this
   case, the client probably needs to be able to authenticate the bridge
   certificate as presented in the initial TLS handshake -- or some
   other aspect of the TLS handshake if we're feeling insane.  If the
   certificate or handshake isn't as expected, the client should behave
   as a web browser that's just received a bad TLS certificate.  (The
   alternative there would be to try to impersonate an HTTPS client that
   has just accepted a self-signed certificate.  But that would probably
   require the Tor client to impersonate a full web browser, which isn't
   realistic.)

Side note: What to put on the webserver?

   To credibly pretend not to be ourselves, we must pretend to be
   something else in particular -- and something not easily identifiable
   or inherently worthless.  We should not, for example, have all
   deployments of this kind use a fixed website, even if that website is
   the default "Welcome to Apache" configuration: A censor would
   probably feel that they weren't breaking anything important by
   blocking all unconfigured websites with nothing on them.

   Therefore, we should probably conceive of a system like this as
   "Something to add to your HTTPS website" rather than as a standalone
   installation.

Related work:

   meek [1] is a pluggable transport that uses HTTP for carrying bytes
   and TLS for obfuscation. Traffic is relayed through a third-party
   server (Google App Engine). It uses a trick to talk to the third
   party so that it looks like it is talking to an unblocked server.

   meek itself is not really about HTTP at all. It uses HTTP only
   because it's convenient and the big Internet services we use as cover
   also use HTTP. meek uses HTTP as a transport, and TLS for
   obfuscation, but the key idea is really "domain fronting," where it
   appears to the censor you are talking to one domain (www.google.com),
   but behind the scenes you are talking to another
   (meek-reflect.appspot.com). The meek-server program is an ordinary
   HTTP (not necessarily even HTTPS!) server, whose communication is
   easily fingerprintable; but that doesn't matter because the censor
   never sees that part of the communication, only the communication
   between the client and CDN.

   One way to think about the difference: if a censor (somehow) learns
   the IP address of a bridge as described in this proposal, it's easy
   and low-cost for the censor to block that bridge by IP address. meek
   aims to make it much more expensive: even if you know a domain is
   being used (in part) for circumvention, in order to block it you
   have to block something important like the Google frontend or
   CloudFlare (high collateral damage).

1. https://trac.torproject.org/projects/tor/wiki/doc/meek
Filename: 204-hidserv-subdomains.txt
Title: Subdomain support for Hidden Service addresses
Author: Alessandro Preite Martinez
Created: 6 July 2012
Status: Closed


1. Overview

  This proposal aims to extend the .onion naming scheme for Hidden
  Service addresses with sub-domain components, which will be ignored
  by the Tor layer but will appear in HTTP Host headers, allowing
  subdomain-based virtual hosting.

2. Motivation

  Sites doing large-scale HTTP virtual hosting on subdomains currently
  do not have a good option for exposure via Hidden Services, short of
  creating a separate HS for every subdomain (which in some cases is
  simply not possible due to the subdomains not being fully known
  beforehand).

3. Implementation

  Tor should ignore any subdomain components besides the Hidden
  Service key, i.e. "foo.aaaaaaaaaaaaaaaa.onion" should be treated
  simply as "aaaaaaaaaaaaaaaa.onion".
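
  A minimal Python sketch of this rule, assuming the hostname has
  already been identified as a candidate for .onion handling (the
  function name is made up for the example):

```python
def strip_onion_subdomains(hostname: str) -> str:
    """Drop every label to the left of the Hidden Service key.

    Hostnames that do not end in .onion are returned unchanged.
    """
    labels = hostname.rstrip(".").split(".")
    if len(labels) >= 2 and labels[-1].lower() == "onion":
        return ".".join(labels[-2:])
    return hostname
```

  The full name, subdomains included, would still be what the
  application sends in its HTTP Host header; only Tor's lookup uses
  the stripped form.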


Filename: 205-local-dnscache.txt
Title: Remove global client-side DNS caching
Author: Nick Mathewson
Created: 20 July 2012
Implemented-In: 0.2.4.7-alpha.
Status: Closed


-1. STATUS

   In 0.2.4.7-alpha, client-side DNS caching is off by default; there
   didn't seem to be much benefit in having per-circuit caches.  I'm
   leaving the original proposal below intact for historical reasons.
     -Nick

0. Overview

   This proposal suggests that, for reasons of security, we move
   client-side DNS caching from a global cache to a set of per-circuit
   caches.

   This will break some things that used to work.  I'll explain how to
   fix them.

1. Background and Motivation

   Since the earliest Tor releases, we've kept a client-side DNS
   cache.  This lets us implement exit policies and exit enclaves --
   if we remember that www.mit.edu is 18.9.22.169 the first time we
   see it, then we can avoid making future requests for www.mit.edu
   via any node whose exit policy refuses net 18.  Also, if there
   happened to be a Tor node at 18.9.22.169, we could use that node as
   an exit enclave.

   But there are security issues with DNS caches.  A malicious exit
   node or DNS server can lie.  And unlike other traffic, where the
   effect of a lie is confined to the request in question, a malicious
   exit node can affect the behavior of future circuits when it gives
   a false DNS reply.  This false reply could be used to keep a client
   connecting to an MITM'd target, or to make a client use a chosen
   node as an exit enclave for that node, or so on.

   With IPv6, tracking attacks will become even more possible: A
   hostile exit node can give every client a different IPv6 address
   for every hostname they want to resolve, such that every one of
   those addresses is under the attacker's control.

   And even if the exit node is honest, having a cached DNS result can
   cause Tor clients to build their future circuits distinguishably:
   the exit on any subsequent circuit can tell whether the client knew
   the IP for the address yet or not.  Further, if the site's DNS
   provides different answers to clients from different parts of the
   world, then the client's cached choice of IP will reveal where it
   first learned about the website.

   So client-side DNS caching needs to go away.

2. Design

2.1. The basic idea

   I propose that clients should cache DNS results in per-circuit DNS
   caches, not in the global address map.

2.2. What about exit policies?

   Microdescriptor-based clients have already dropped the ability to
   track which nodes declare which exit policies, without much ill
   effect.  As we go forward, I think that remembering the IP address
   of each request so that we can match it to exit policies will be
   even less effective, especially if proposals to allow AS-based exit
   policies can succeed.

2.3. What about exit enclaves?

   Exit enclaves are already broken.  They need to move towards a
   cross-certification solution where a node advertises that it can
   exit to a hostname or domain X.Y.Z, and a signed record at X.Y.Z
   advertises that the node is an enclave exit for X.Y.Z.  That's
   out-of-scope for this proposal, except to note that nothing
   proposed here keeps that design from working.

2.4. What about address mapping?

   Our current address map algorithm is, more or less:

     N = 0
     while  N < MAX_MAPPING && exists map[address]:
         address = map[address]
         N = N + 1
     if N == MAX_MAPPING:
         Give up, it's a loop.

   Where 'map' is the union of all mapping entries derived from the
   controller, the configuration file, trackhostexits maps,
   virtual-address maps, DNS replies, and so on.

   With this proposed design, the DNS cache will not be part of the address
   map.  That means that entries in the address map which relied on
   happening after the DNS cache entries can no longer work so well.
   These would include:

       A) Mappings from an IP address to a particular exit, either
          manually declared or inserted by TrackHostExits.
       B) Mappings from IP addresses to other IP addresses.
       C) Mappings from IP addresses to hostnames.

   We can try to solve these by introducing an extra step of address
   mapping after the DNS cache is applied.  In other words, we should
   apply the address map, then see if we can attach to a circuit.  If
   we can, we try to apply that circuit's dns cache, then apply the
   address map again.
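
   The two-pass lookup described above can be sketched in Python as
   follows.  This is only an illustration: apply_address_map,
   resolve_for_circuit, and the dictionary-based maps are hypothetical
   names, not Tor's actual data structures, and MAX_MAPPING = 16 is an
   arbitrary value chosen for the example.

```python
MAX_MAPPING = 16  # assumed limit for this sketch; not taken from the Tor source


def apply_address_map(address, address_map):
    """Follow address_map rewrites until none applies.

    Returns None if MAX_MAPPING rewrites were applied, which is
    treated as a mapping loop (as in the pseudocode above).
    """
    n = 0
    while n < MAX_MAPPING and address in address_map:
        address = address_map[address]
        n += 1
    if n == MAX_MAPPING:
        return None  # give up, it's a loop
    return address


def resolve_for_circuit(address, address_map, circuit_dns_cache):
    """Apply the address map, then the circuit's DNS cache, then the map again."""
    address = apply_address_map(address, address_map)
    if address is None:
        return None
    # Hypothetical per-circuit cache: hostname -> IP learned on this circuit.
    address = circuit_dns_cache.get(address, address)
    return apply_address_map(address, address_map)
```

   With this shape, a mapping keyed on an IP address (cases A through C
   above) still gets a chance to fire after the circuit's cache has
   turned the hostname into that IP.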


2.5. What about the performance impact?

   That all depends on application behavior.

   If the application continues to make all of its requests with the
   hostname, there shouldn't be much trouble.  Exit-side DNS caches and
   exit-side DNS will avoid any additional round trips across the Tor
   network; compared to that, the time to do a DNS resolution at the
   exit node *should* be small.

   That said, this will hurt performance a little in the case where
   the exit node and its resolver don't have the answer cached, and it
   takes a long time to resolve the hostname.


   If the application is doing "resolve, then connect to an IP", see
   2.6 below.

2.6. What about DNSPort?

   If the application is doing its own DNS caching, it won't get
   much security benefit from this change.

   If the application is doing a resolve before each connect, there
   will be a performance hit when the resolver is using a circuit that
   hadn't previously resolved the address.

   Also, DNSPort users: AutomapHostsOnResolve is your friend.

3. Alternate designs and future directions

3.1. Why keep client-side DNS caching at all?

   A fine question!  I am not sure it actually buys us anything any
   longer, since exits also have DNS caching.  Shall we discuss that?
   It would sure simplify matters.

3.2. The impact of DNSSec

   Once we get DNSSec support, clients will be able to verify whether
   an exit's answers are correctly signed or not.  When that happens,
   we could get most of the benefits of global DNS caching back,
   without most of the security issues, if we restrict it to
   DNSSec-signed answers.

Filename: 206-directory-sources.txt
Title: Preconfigured directory sources for bootstrapping
Author: Nick Mathewson
Created: 10-Oct-2012
Status: Closed
Implemented-In: 0.2.4.7-alpha


Motivation and History:

   We've long wanted a way for clients to do their initial
   bootstrapping not from the directory authorities, but from some
   other set of nodes expected to probably be up when future clients are
   starting.

   We tried to solve this a while ago by adding a feature where we could
   ship a 'fallback' networkstatus file -- one that would get parsed
   when we had no current networkstatus file, and which we would use to
   learn about possible directory sources.  But we couldn't actually use
   it, since it turns out that a randomly chosen list of directory
   caches from 4-5 months ago is a terrible place to go when
   bootstrapping.

   Then for a while we considered an "Extra-Stable" flag so that clients
   could use only nodes with a long history of existence from these
   fallback networkstatus files.  We never built it, though.

   Instead, we can do this so much more simply.  If we want to ship Tor
   with a list of initial locations to go for directory information, why
   not just do so?

Proposal:

   In the same way that Tor currently ships with a list of directory
   authorities, Tor should also ship with a list of directory sources --
   places to go for an initial consensus if you don't have a somewhat
   recent one.

   These need to include an address for the cache's ORPort, and its
   identity key.  Additionally, they should include a selection weight.

   They can be configured with a torrc option, just like directory
   authorities are now.

   Whenever Tor is starting without a consensus, if it would currently
   ask a directory authority for a consensus, it should instead ask one
   of these preconfigured directory sources.

   I have code for this (see git branch fallback_dirsource_v2) in my
   public repository.

   When we deploy this, we can (and should) rip out the Fallback
   Networkstatus File logic.


How to find nodes to make into directory sources:

   We could take any of three approaches for selecting these initial
   directory sources.

   First, we could try to vet them a little, with a light variant of the
   process we use for authorities.  We'd want to look for nodes where we knew
   the operators, verify that they were okay with keeping the same IP for a
   very long time, and so forth.

   Second, we could try to pick nodes for listing with each Tor release
   based entirely on how long those nodes have been up.  Anything that's
   been a high-reliability directory for a long time on the same IP
   (like, say, a year) could be a good choice.

   Third, we could blend the approaches and start by looking for
   up-for-a-long-time nodes, and then also ask the operators whether
   their nodes are likely to stay running for a long time.

   I think the third model is best.


Some notes on security:

   Directory source nodes have an opportunity to learn about new users
   connecting to the network for the first time.  Once we have directory
   guards, that's going to be a fairly uncommon ability.  We should be
   careful in any directory guard design to make sure that we don't fall
   back to the directory sources any more than we need to.  See proposal 207.





Filename: 207-directory-guards.txt
Title: Directory guards
Author: Nick Mathewson
Created: 10-Oct-2012
Status: Closed
Target: 0.2.4.x


Motivation:

   When we added guard nodes to resist profiling attacks, we made it so
   that clients won't build general-purpose circuits through just any
   node.  But clients don't use their guard nodes when downloading
   general-purpose directory information from the Tor network.  This
   allows a directory cache, over time, to learn a large number of IPs
   for non-bridge-using users of the Tor network.

Proposal:

   In the same way as they currently pick guard nodes as needed, adding more
   guards as those nodes are down, clients should also pick a small-ish set
   of directory guard nodes, to persist in Tor's state file.

   Clients should, as much as possible, use their regular guards as their
   directory guards.

   When downloading a regular directory object (that is, not a hidden
   service descriptor), clients should prefer their directory guards
   first.  Then they should try more directories from a recent consensus
   (if they have one) and pick one of those as a new guard if the
   existing guards are down and a new one is up.  Failing that, they
   should fall back to a directory authority (or a directory source, if
   those get implemented-- see proposal 206).

   If a client has only one directory guard running, they should add new
   guards and try them, and then use their directory guards to fetch multiple
   descriptors in parallel.

Open questions and notes:

   What properties does a node need to be a suitable directory guard?
   If we require that it have the Guard flag, we'll lose some nodes:
   only 74% of the directory caches have it (weighted by bandwidth).

   We may want to tune the algorithm used to update guards.

   For future-proofing, we may want to have the DirCache flag from 185
   be the one that nodes must have in order to be directory guards.  For
   now, we could have authorities set it to Guard && DirPort!=0, with a
   better algorithm to follow.  Authorities should never get the
   DirCache flag.



Filename: 208-ipv6-exits-redux.txt
Title: IPv6 Exits Redux
Author: Nick Mathewson
Created: 10-Oct-2012
Status: Closed
Target: 0.2.4.x
Implemented-In: 0.2.4.7-alpha

1. Obligatory Motivation Section

   [Insert motivations for IPv6 here.  Mention IPv4 address exhaustion.

   Insert official timeline for official IPv6 adoption here.

   Insert general desirability of being able to connect to whatever
   address there is here.

   Insert profession of firm conviction that eventually there will be
   something somebody wants to connect to which requires the ability to
   connect to an IPv6 address.]

2. Proposal

   Proposal 117 has been there since coderman wrote it in 2007, and it's
   still mostly right.  Rather than replicate it in full, I'll describe
   this proposal as a patch to it.

2.1. Exit policies

   Rather than specify IPv6 policies in full, we should move (as we have
   been moving with IPv4 addresses) to summaries of which IPv6 ports
   are generally permitted.  So let's allow server descriptors to include
   a list of accepted IPv6 ports, using the same format as the "p" line
   in microdescriptors, using the "ipv6-policy" keyword.

        "ipv6-policy" SP ("accept" / "reject") SP PortList NL

   Exits should still, of course, be able to configure more complex
   policies, but they should no longer need to tell the whole world
   about them.

   After this ipv6-policy line is validated, its numeric ports and ranges
   should get copied into a "p6" line in microdescriptors.

   This change breaks the existing exit enclave idea for IPv6, but the
   existing exit enclave implementation never worked right in the first
   place.  If we can come up with a good way to support it, we can add
   that back in.

2.2. Which addresses should we connect to?

   One issue that's tripped us up a few times is how to decide whether
   we can use IPv6 addresses.  You can't use them with SOCKS4 or
   SOCKS4a, IIUC.  With SOCKS5, there's no way to indicate that you
   prefer IPv4 or IPv6.  It's possible that some SOCKS5 users won't
   understand IPv6 addresses.

   With this in mind, I'm going to suggest that with SOCKS4 or SOCKS4a,
   clients should always require IPv4.  With SOCKS5, clients should
   accept IPv6.

   If it proves necessary, we can also add per-SOCKSPort configuration
   flags to override the above default behavior.

   See also partitioning discussion in Security Notes below.

2.3. Extending BEGIN cells.

   Prop117 (and the section above) says that clients should prefer one
   address or another, but doesn't give them a means to tell the exit to
   do so.  Here's one.

   We define an extension to the BEGIN cell as follows.  After the
   ADDRESS | ':' | PORT | [00] portion, the cell currently contains all
   [00] bytes.  We add a 32-bit flags field, stored as an unsigned 32
   bit value, after the [00].  All these flags default to 0, obviously.
   We define the following flags:

     bit
      1 -- IPv6 okay.  We support learning about IPv6 addresses and
           connecting to IPv6 addresses.
      2 -- IPv4 not okay.  We don't want to learn about IPv4 addresses
           or connect to them.
      3 -- IPv6 preferred.  If there are both IPv4 and IPv6 addresses,
           we want to connect to the IPv6 one.  (By default, we connect
           to the IPv4 address.)
      4..32 -- Reserved.

   As with so much else, clients should look at the platform version of
   the exit they're using to see if it supports these flags before
   sending them.
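
   As a rough illustration of the layout above, here is a Python sketch
   that appends and reads the flags field.  Two details are assumptions
   made for the example, not statements from this proposal: that "bit 1"
   is the least significant bit, and that the 32-bit value is stored in
   big-endian (network) order.  The function and constant names are
   hypothetical.

```python
import struct

# Flag values for this sketch.  The proposal numbers bits from 1; mapping
# "bit 1" to the least significant bit is an assumption made here.
BEGIN_FLAG_IPV6_OK = 1 << 0         # bit 1
BEGIN_FLAG_IPV4_NOT_OK = 1 << 1     # bit 2
BEGIN_FLAG_IPV6_PREFERRED = 1 << 2  # bit 3


def encode_begin_body(address, port, flags=0):
    """Build ADDRESS ':' PORT [00] followed by a 32-bit flags field."""
    addrport = "{}:{}".format(address, port).encode("ascii")
    # ">I" packs an unsigned 32-bit integer in big-endian order (assumed).
    return addrport + b"\x00" + struct.pack(">I", flags)


def decode_begin_flags(body):
    """Return the flags field, or 0 if fewer than 4 bytes follow the NUL."""
    _, _, rest = body.partition(b"\x00")
    if len(rest) < 4:
        return 0  # missing or short field: all flags default to 0
    return struct.unpack(">I", rest[:4])[0]
```

   Because the bytes after the [00] are currently all zero, a body from
   a client that knows nothing about flags decodes to 0 here, which
   matches the stated defaults.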

2.4. Minor changes to proposal 117

   GETINFO commands that return an address, and which should return two,
   should not in fact begin returning two addresses separated by CRLF.
   They should retain their current behavior, and there should be a new
   "all my addresses" GETINFO target.

3. Security notes:

   Letting clients signal that they want or will accept IPv6 addresses
   creates two partitioning issues that didn't exist before.  One is the
   version partitioning issue: anybody who supports IPv6 addresses is
   obviously running the new software.  Another is option partitioning:
   anybody who is using a SOCKS4a application will look different from
   somebody who is using a SOCKS5 application.

   We can't do much about version partitioning, I think.  If we felt
   especially clever, we could have a flag day.  Is that necessary?

   For option partitioning, are there many applications whose behavior
   is indistinguishable except that they are sometimes configured to use
   SOCKS4a and sometimes to use SOCKS5?  If so, the answer may well be
   to persuade as many users as possible to switch those to SOCKS5, so
   that they get IPv6 support and have a large anonymity set.



   IPv6 addresses are plentiful, which makes caching them dangerous
   if you're hoping to avoid tracking over time.  (With IPv4 addresses,
   it's harder to give every user a different IPv4 address for a target
   hostname with a long TTL, and then accept connections to those IPv4
   addresses from different exits over time.  With IPv6, it's easy.)
   This makes proposal 205 especially necessary here.


Filename: 209-path-bias-tuning.txt
Title: Tuning the Parameters for the Path Bias Defense
Author: Mike Perry
Created: 01-10-2012
Status: Obsolete
Target: 0.2.4.x+


Overview

 This proposal describes how we can use the results of simulations in
 combination with network scans to set reasonable limits for the Path
 Bias defense, which causes clients to be informed about and ideally
 rotate away from Guards that provide extremely low circuit success
 rates.

Motivation

 The Path Bias defense is designed to defend against a type of route
 capture where malicious Guard nodes deliberately fail circuits that
 extend to non-colluding Exit nodes to maximize their network
 utilization in favor of carrying only compromised traffic.

 This attack was explored in the academic literature in [1], and a
 variant involving cryptographic tagging was posted to tor-dev[2] in
 March.

 In the extreme, the attack allows an adversary that carries c/n
 of the network capacity to deanonymize c/n of the network
 connections, breaking the O((c/n)^2) property of Tor's original
 threat model.

 In this case, however, the adversary is only carrying circuits for
 which either the entry and exit are compromised, or all three nodes are
 compromised.  This means that the adversary's Guards will fail all but 
 (c/n) + (c/n)^2 of their circuits for clients that select it. For 10%
 c/n compromise, such an adversary succeeds only 11% of their circuits
 that start at their compromised Guards. For 20% c/n compromise, such
 an adversary would only succeed 24% of their circuit attempts.
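
 The percentages above can be reproduced with a few lines of Python.
 The function name is hypothetical; the expression is the
 (c/n) + (c/n)^2 term quoted in the text.

```python
def attacker_guard_success_rate(c_over_n):
    """Fraction of circuits a selectively failing Guard lets through,
    using the (c/n) + (c/n)^2 expression from the text."""
    return c_over_n + c_over_n ** 2
```

 For c/n = 0.10 this evaluates to 0.11, and for c/n = 0.20 to 0.24.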

 It is this property which leads me to believe that a simple local
 accounting defense is indeed possible and worthwhile.

Design Description

 The Path Bias defense is a client-side accounting mechanism in Tor that
 tracks the circuit failure rate for each of the client's guards.

 Clients maintain two integers for each of their guards: a count of the
 number of times a circuit was extended at least one hop through that
 guard, and a count of the number of circuits that successfully complete
 through that guard. The ratio of these two numbers is used to determine
 a circuit success rate for that Guard.

 The system should issue a notice log message when Guard success rate
 falls below 70%, a warn when Guard success rate falls below 50%, and
 should drop the Guard when the success rate falls below 30%.

 Circuit build timeouts are only counted as path failures if the
 circuit fails to complete before the 95% "right-censored" (aka
 "MEASUREMENT_EXPIRED") timeout interval, not the 80% timeout
 condition[5]. This was done based on the assumption that destructive
 cryptographic tagging is the primary vector for the path bias attack,
 until such time as Tor's circuit crypto can be upgraded. Therefore,
 being more lenient with timeout makes us more resilient to network
 conditions.

 To ensure correctness, checks are performed to ensure that
 we do not count successes without also counting the first hop (see
 usage of path_state_t as well as pathbias_* in the source).

 Similarly, to provide a moving average of recent Guard activity while
 still preserving the ability to ensure correctness, we periodically
 "scale" the success counts by first multiplying by a numerator
 (currently 1) and then dividing by an integer divisor (currently 2). 

 Scaling is performed when the counts exceed the moving average
 window (300) and when the division does not produce integer truncation.

 No log messages should be displayed, nor should any Guard be
 dropped until it has completed at least 150 first hops (inclusive).
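
 The accounting rules in this section can be sketched in Python as
 below.  This is not Tor's implementation: the class and method names
 are hypothetical, the constants simply mirror the defaults listed in
 this proposal, and scaling both counters together (rather than only
 the success count) is an assumption made so that the ratio is
 preserved.  When to call maybe_scale() is left to the caller.

```python
MIN_CIRCS = 150    # pb_mincircs
NOTICE_PCT = 70    # pb_noticepct
WARN_PCT = 50      # pb_warnpct
DISABLE_PCT = 30   # pb_disablepct
SCALE_CIRCS = 300  # pb_scalecircs
MULT_FACTOR = 1    # pb_multfactor
SCALE_FACTOR = 2   # pb_scalefactor


class GuardPathBias:
    """Per-guard counters; a hypothetical stand-in for Tor's real state."""

    def __init__(self, first_hops=0, successes=0):
        self.first_hops = first_hops
        self.successes = successes

    def record_first_hop(self):
        self.first_hops += 1

    def record_success(self):
        # Never let successes exceed the counted first hops.
        if self.successes < self.first_hops:
            self.successes += 1

    def maybe_scale(self):
        """Scale both counts once the window is exceeded; return True if scaled."""
        if self.first_hops <= SCALE_CIRCS:
            return False
        hops = self.first_hops * MULT_FACTOR
        wins = self.successes * MULT_FACTOR
        if hops % SCALE_FACTOR or wins % SCALE_FACTOR:
            return False  # division would truncate; try again later
        self.first_hops = hops // SCALE_FACTOR
        self.successes = wins // SCALE_FACTOR
        return True

    def status(self):
        """Return None, "notice", "warn", or "drop"."""
        if self.first_hops < MIN_CIRCS:
            return None  # too few first hops to judge this Guard
        pct = 100.0 * self.successes / self.first_hops
        if pct < DISABLE_PCT:
            return "drop"
        if pct < WARN_PCT:
            return "warn"
        if pct < NOTICE_PCT:
            return "notice"
        return None
```

 For example, a Guard with 200 first hops and 90 successes (45%) would
 report "warn", while one with only 149 first hops reports nothing
 regardless of its success count.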

Analysis: Simulation

 To test the defense in the face of various types of malicious and
 non-malicious Guard behavior, I wrote a simulation program in
 Python[3].

 The simulation confirmed that without any defense, an adversary
 that provides c/n of the network capacity is able to observe c/n
 of the network flows using circuit failure attacks.

 It also showed that with the defense, an adversary that wishes to
 evade detection has compromise rates bounded by:

   P(compromise) <= (c/n)^2 * (100/CUTOFF_PERCENT)
   circs_per_client <= circuit_attempts*(c/n)

 In this way, the defense restores the O((c/n)^2) compromise property,
 but unfortunately only over long periods of time (see Security
 Considerations below).
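
 To get a feel for the first bound, it can be evaluated directly.  The
 helper below is a hypothetical convenience, not part of the
 simulation code referenced in [3].

```python
def compromise_bound(c_over_n, cutoff_percent):
    """Upper bound on P(compromise): (c/n)^2 * (100/CUTOFF_PERCENT)."""
    return (c_over_n ** 2) * (100.0 / cutoff_percent)
```

 With c/n = 0.10 and a 30% cutoff the bound is about 0.033; raising
 the cutoff to 50% tightens it to 0.02.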

 The spread between the cutoff values and the normal rate of circuit
 success has a substantial effect on false positives. From the
 simulation's results, the sweet spot for the size of this spread
 appears to be 10%. In other words, we want to set the cutoffs such that
 they are 10% below the success rate we expect to see in normal usage.

 The simulation also demonstrates that larger "scaling window" sizes
 reduce false positives for instances where non-malicious guards
 experience some ambient rate of circuit failure.

Analysis: Live Scan

 Preliminary Guard node scanning using the txtorcon circuit scanner[4]
 shows normal circuit completion rates between 80-90% for most Guard
 nodes.
 
 However, it also showed that CPU overload conditions can easily push
 success rates as low as 45%. Even more concerning is that for a brief
 period during the live scan, success rates dropped to 50-60%
 network-wide (regardless of Guard node choice).

 Based on these results, the notice condition should be 70%, the warn 
 condition should be 50%, and the drop condition should be 30%.

 However, see the Security Considerations sections for reasons
 to choose more lenient bounds.

Future Analysis: Deployed Clients

 It's my belief that further analysis should be done by deploying 
 loglines for all three thresholds in clients in the live network
 to utilize user reports on how often high rates of circuit failure
 are seen before we deploy changes to rotate away from failing Guards.

 I believe these log lines should be deployed in 0.2.3.x clients,
 to maximize the exposure of the code to varying network conditions,
 so that we have enough data to consider deploying the Guard-dropping
 cutoff in 0.2.4.x.

Security Considerations: DoS Conditions

 While the scaling window does provide freshness and can help mitigate
 "bait-and-switch" attacks, it also creates the possibility of conditions
 where clients can be forced off their Guards due to temporary
 network-wide CPU DoS. This provides another reason beyond false positive
 concerns to set the scaling window as large as is reasonable.

 A DoS directed at specific Guard nodes is unlikely to allow an
 adversary to cause clients to rotate away from that Guard, because it
 is unlikely that the DoS can be precise enough to allow first hops to
 that Guard to succeed, but also cause extends to fail. This leaves
 network-wide DoS as the primary vector for influencing clients.

 Simulation results show that in order to cause clients to rotate away
 from a Guard node that previously succeeded 80% of its circuits, an
 adversary would need to induce a 25% success rate for around 350 circuit
 attempts before the client would reject it or a 5% success rate
 for around 215 attempts, both using a scaling window of 300 circuits.
 
 Assuming one circuit per Guard per 10 minutes of active client
 activity, this is a sustained network-wide DoS attack of 60 hours
 for the 25% case, or 38 hours for the 5% case.

 Presumably this is enough time for the directory authorities to respond by
 altering the pb_disablepct consensus parameter before clients rotate,
 especially given that most clients are not active for even 38 hours on end,
 and will tend to stop building circuits while idle.

 If we raised the scaling window to 500 circuits, it would require 1050
 circuits if the DoS brought circuit success down to 25% (175 hours), and
 415 circuits if the DoS brought the circuit success down to 5% (69 hours).
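
 The durations in this section come from converting circuit attempts
 to wall-clock time under the one-circuit-per-10-minutes assumption
 stated above.  A minimal sketch of that conversion (the function name
 is hypothetical):

```python
MINUTES_PER_CIRCUIT = 10  # one circuit per Guard per 10 minutes, per the text


def dos_hours(circuit_attempts):
    """Hours of sustained DoS needed to cover the given circuit attempts."""
    return circuit_attempts * MINUTES_PER_CIRCUIT / 60.0
```

 This gives 175 hours for 1050 circuits and about 69 hours for 415
 circuits.  The same conversion yields roughly 58 and 36 hours for the
 350 and 215 attempt cases, which the text quotes as approximately 60
 and 38 hours.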

 The tradeoff, though, is that larger scaling window values allow Guard nodes
 to compromise clients for duty cycles of around the size of this window (up to
 the (c/n)^2 * 100/CUTOFF_PERCENT limit in aggregate), so we do have to find
 balance between these concerns.

Security Considerations: Targeted Failure Attacks

 If an adversary controls a significant portion of the network, they
 may be able to target a Guard node by failing their circuits. In the
 context of cryptographic tagging, both the Middle node and the Exit
 node are able to recognize their colluding peers. The Middle node sees
 the Guard directly, and the Exit node simply reverses a non-existent
 tag, causing a failure.

 P(EvilMiddle) || P(EvilExit) = 1.0 - P(HonestMiddle) && P(HonestExit)
                              = 1.0 - (1.0-(c/n))*(1.0-(c/n))

 For 10% compromise, this works out to the ability to fail an
 additional 19% of honest Guard circuits, and for 20% compromise,
 it works out to 36%.

 When added to the ambient circuit failure rates (10-20%), this is
 within range of the notice and warn conditions, but not the guard
 failure condition.

 However, this attack does become feasible if a network-wide DoS
 (or simply CPU load) is able to elevate the ambient failure
 rate to 51% for the 10% compromise case, or 34% for the 20%
 compromise case.
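
 The figures in this section follow from the formula above plus simple
 addition against the 30% drop threshold.  A sketch of the arithmetic,
 using hypothetical helper names:

```python
DISABLE_PCT = 30  # pb_disablepct: drop the Guard below this success rate


def targeted_failure_fraction(c_over_n):
    """P(evil middle or evil exit) = 1 - (1 - c/n)^2."""
    return 1.0 - (1.0 - c_over_n) ** 2


def ambient_failure_needed(c_over_n):
    """Ambient failure rate that, added to the targeted failures,
    brings the success rate down to the drop threshold."""
    return (100 - DISABLE_PCT) / 100.0 - targeted_failure_fraction(c_over_n)
```

 For c/n = 0.10 these return 0.19 and 0.51; for c/n = 0.20 they return
 0.36 and 0.34.  Like the text, this treats the two failure sources as
 additive, which is a simplification.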

 Since both conditions would elicit notices and/or warns from *all*
 clients, this attack should be detectable. It can also be detected
 through the bandwidth authorities (who could possibly even
 set pathbias parameters directly based on measured ambient circuit
 failure rates), should we deploy #7023.

Implementation Notes: Log Messages

 Log messages need to be chosen with care to avoid alarming users.
 I suggest:

 Notice: "Your Guard %s is failing more circuits than usual. Most likely
 this means the Tor network is overloaded. Success counts are %d/%d."

 Warn: "Your Guard %s is failing a very large amount of circuits. Most likely
 this means the Tor network is overloaded, but it could also mean an attack
 against you or potentially the Guard itself. Success counts are %d/%d."

 Drop: "Your Guard %s is failing an extremely large amount of circuits. [Tor
 has disabled use of this Guard.] Success counts are %d/%d."

 The second piece of the Drop message would not be present in 0.2.3.x,
 since the Guard won't actually be dropped.

Implementation Notes: Consensus Parameters

 The following consensus parameters reflect the constants listed
 in the proposal. These parameters should also be available 
 for override in torrc.

 pb_mincircs=150
   The minimum number of first hops before we log or drop Guards.

 pb_noticepct=70
   The threshold of circuit success below which we display a notice.

 pb_warnpct=50
   The threshold of circuit success below which we display a warn.

 pb_disablepct=30
   The threshold of circuit success below which we disable the guard.

 pb_scalecircs=300
   The number of first hops at which we scale the counts down.

 pb_multfactor=1
   The integer numerator by which we scale.

 pb_scalefactor=2
   The integer divisor by which we scale.

 pb_dropguards=0
   If non-zero, we should actually drop guards as opposed to warning.

Implementation Notes: Differences between proposal and current source

 This proposal adds a few changes over the implementation currently
 deployed in origin/master.

 The log messages suggested above are different than those in the
 source.

 The following consensus parameters had changes to their default
 values, based on results from simulation and scanning:
   pb_mincircs=150
   pb_noticepct=70
   pb_disablepct=30
   pb_scalecircs=300

 Also, the following consensus parameters are additions:
   pb_multfactor=1
   pb_warnpct=50
   pb_dropguards=0

 Finally, 0.2.3.x needs to be synced with origin/master, but should
 also ignore the pb_dropguards parameter (but ideally still provide
 the equivalent pb_dropguards torrc option).


1. http://freehaven.net/anonbib/cache/ccs07-doa.pdf
2. https://lists.torproject.org/pipermail/tor-dev/2012-March/003347.html
3. https://gitweb.torproject.org/torflow.git/tree/HEAD:/CircuitAnalysis/PathBias
4. https://github.com/meejah/txtorcon/blob/exit_scanner/apps/exit_scanner/failure-rate-scanner.py
5. See 2.4.1 of path-spec.txt for further details on circuit timeout calculations.

Filename: 210-faster-headless-consensus-bootstrap.txt
Title: Faster Headless Consensus Bootstrapping
Author: Mike Perry, Tim Wilson-Brown, Peter Palfrader
Created: 01-10-2012
Last Modified: 02-10-2015
Status: Superseded
Target: 0.2.8.x+

Status-notes:

   * This has been partially superseded by the fallback directory code,
     and partially by the exponential-backoff code.

Overview and Motivation

 This proposal describes a way for clients to fetch the initial
 consensus more quickly in situations where some or all of the directory
 authorities are unreachable. This proposal is meant to describe a
 solution for bug #4483.

Design: Bootstrap Process Changes

 The core idea is to attempt to establish bootstrap connections in
 parallel during the bootstrap process, and download the consensus from
 the first connection that completes.

 Connection attempts will be performed on an exponential backoff basis.
 Initially, connections will be performed to a randomly chosen hard
 coded directory mirror and a randomly chosen canonical directory
 authority. If neither of these connections complete, additional mirror
 and authority connections are tried. Mirror connections are tried at
 a faster rate than authority connections.

 Clients represent the majority of the load on the network. They can use
 directory mirrors to download their documents, as the mirrors download
 their documents from the authorities early in the consensus validity
 period.

 We specify that client mirror connections retry after one second, and
 then double the retry time with every connection attempt:
 0, 1, 2, 4, 8, 16, 32, ...
 (The timers currently implemented in Tor increment with every
 connection failure.)

 We specify that client directory authority connections retry after
 10 seconds, and then double the retry time with every connection:
 0, 10, 20, ...

 If a client has both an IPv4 and IPv6 address, it will try IPv4 and
 IPv6 mirrors and authorities on the following schedule:
 IPv4, IPv6, IPv4, IPv6, ...
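
 The combined schedule can be sketched in Python as below.  The
 sketch reads the sequences above as offsets in seconds from the start
 of bootstrapping, which is an interpretation rather than something
 the text states outright.  The function names are hypothetical, and
 the maximum retry time and outstanding-connection limit described
 later in this section are not modelled.

```python
import itertools


def attempt_times(first_retry):
    """Yield attempt times in seconds: 0, first_retry, then doubling."""
    yield 0
    t = first_retry
    while True:
        yield t
        t *= 2


def schedule(kind, first_retry, count):
    """Return `count` (time, kind, ip_version) attempts for one timer."""
    times = itertools.islice(attempt_times(first_retry), count)
    # Each timer alternates IPv4, IPv6, IPv4, ... independently.
    return [(t, kind, "IPv4" if i % 2 == 0 else "IPv6")
            for i, t in enumerate(times)]


def merged_schedule(count):
    """Return the first `count` attempts across both timers."""
    attempts = schedule("mirror", 1, count) + schedule("authority", 10, count)
    # Sort by time; at equal times the mirror attempt is listed first.
    attempts.sort(key=lambda a: (a[0], a[1] != "mirror"))
    return attempts[:count]
```

 Under that reading, merged_schedule(10) contains mirror attempts at
 0, 1, 2, 4, 8, 16, and 32 seconds and authority attempts at 0, 10,
 and 20 seconds, with the authority attempt at 10 seconds using IPv6.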

 [ TODO: should we add random noise to these scheduled times? - teor
         Tor doesn’t add random noise to the current failure-based
         timers, but as failures are a network event, they are
         somewhat random/arbitrary already. These attempt-based timers
         will go off every few seconds, exactly on the second. ]

 (Relays can’t use directory mirrors to download their documents,
 as they *are* the directory mirrors.)

 The maximum retry time for all these timers is 3 days + 1 hour. This
 places a small load on the mirrors and authorities, while allowing a
 client that regains a network connection to eventually download a
 consensus.

 We try IPv4 first to avoid overloading IPv6-enabled authorities and
 mirrors. Each timing schedule uses a separate IPv4/IPv6 schedule.
 This ensures that clients try an IPv6 authority within the first
 10 seconds. This helps implement #8374 and related tickets.

 We don't want to keep on trying an IP version that always fails.
 Therefore, once sufficient IPv4 and IPv6 connections have been
 attempted, we select an IP version for new connections based on the ratio
 of their failure rates, up to a maximum of 1:5. This may not make a
 substantial difference to consensus downloads, as we only need one
 successful consensus download to bootstrap. However, it is important for
 future features like #17217, where clients try to automatically determine
 if they can use IPv4 or IPv6 to contact the Tor network.

 The retry timers and IP version schedules must reset on HUP and any
 network reachability events, so that clients that have unreliable networks
 can recover from network failures.
 [ TODO: Do we do this for any other timers?
         I think this needs another proposal, it’s out of scope here.
         - teor ]

 The first connection to complete will be used to download the consensus
 document and the others will be closed, after which bootstrapping will
 proceed as normal.

 We expect the vast majority of clients to succeed within 4 seconds,
 after making up to 4 connection attempts to mirrors and 1 connection
 attempt to an authority. Clients that can't connect in the first
 10 seconds will try 1 more mirror, then try to contact another
 directory authority. We expect almost all clients to succeed within
 10 seconds. This is a much better success rate than the current Tor
 implementation, which fails k/n of clients if k of the n directory
 authorities are down. (Or, if the connection fails in certain ways,
 it will retry once, failing 1-(1-(k/n)^2).)

 If at any time, the total outstanding bootstrap connection attempts
 exceeds 10, no new connection attempts are to be launched until an
 existing connection attempt experiences full timeout. The retry time
 is not doubled when a connection is skipped.

 A benefit of connecting to directory authorities is that clients are
 warned if their clock is wrong. Starting the authority and fallback
 schedules at the same time should ensure that some clients check their
 clock with an authority at each bootstrap.

Design: Fallback Dir Mirror Selection

 The set of hard coded directory mirrors from #572 shall be chosen using
 the 100 Guard nodes with the longest uptime.

 The fallback weights will be set using each mirror's fraction of
 consensus bandwidth out of the total of all 100 mirrors, adjusted to
 ensure no fallback directory sees more than 10% of clients. We will
 also exclude fallback directories that are less than 1/1000 of the
 consensus weight, as they are not large enough to make it worthwhile
 including them.

 This list of fallback dir mirrors should be updated with every
 major Tor release. In future releases, the number of dir mirrors
 should be set at 20% of the current Guard nodes (approximately 200 as
 of October 2015), rather than fixed at 100.

 [TODO: change the script to dynamically calculate an upper limit.]

Performance: Additional Load with Current Parameter Choices

 This design and the connection count parameters were chosen such that
 no additional bandwidth load would be placed on the directory
 authorities. In fact, the directory authorities should experience less
 load, because they will not need to serve the entire consensus document
 for a connection in the event that one of the directory mirrors completes
 its connection before the directory authority does.

 However, the scheme does place additional TLS connection load on the
 fallback dir mirrors. Because bootstrapping is rare, and all but one of
 the TLS connections will be very short-lived and unused, this should not
 be a substantial issue.

 The dangerous case is in the event of a prolonged consensus failure
 that induces all clients to enter into the bootstrap process. In this
 case, the number of TLS connections to the fallback dir mirrors within
 the first second would be 2*C/100, or 40,000 for C=2,000,000 users. If
 no connections complete before the 10 retries, 7 of which go to
 mirrors, this could reach as high as 140,000 connection attempts, but
 this is extremely unlikely to happen in full aggregate.

 However, in the no-consensus scenario today, the directory authorities
 would already experience 2*C/9 or 444,444 connection attempts. (Tor
 currently tries 2 authorities, before delaying the next attempt.) The
 10-retry scheme, 3 of which go to authorities, increases their total
 maximum load to about 666,666 connection attempts, but again this is
 unlikely to be reached in aggregate. Additionally, with this scheme,
 even if the dirauths are taken down by this load, the dir mirrors
 should be able to survive it.
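
 The load figures above can be reproduced with a few lines of Python. This
 is only a worked-arithmetic sketch: it assumes that attempts are spread
 evenly over the 100 mirrors or the 9 authorities (the real fallback
 weights are bandwidth-based), and the helper name is hypothetical.

```python
# Worked arithmetic for the connection-load estimates in the text.
C = 2_000_000      # clients entering bootstrap at once (figure from the text)
MIRRORS = 100      # fallback directory mirrors
AUTHORITIES = 9    # directory authorities


def per_server_attempts(attempts_per_client: int, servers: int) -> int:
    """Attempts seen by each server, assuming an even spread (an assumption)."""
    return attempts_per_client * C // servers


mirror_first_second = per_server_attempts(2, MIRRORS)    # 2*C/100
mirror_worst_case = per_server_attempts(7, MIRRORS)      # 7 of the 10 retries
auth_today = per_server_attempts(2, AUTHORITIES)         # 2*C/9
auth_worst_case = per_server_attempts(3, AUTHORITIES)    # 3 of the 10 retries

print(mirror_first_second, mirror_worst_case, auth_today, auth_worst_case)
```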

Implementation Notes: Code Modifications

 The implementation of the bootstrap process is unfortunately mixed 
 in with many types of directory activity.

 The process starts in update_consensus_networkstatus_downloads(),
 which initiates a single directory connection through
 directory_get_from_dirserver(). Depending on bootstrap state,
 a single directory server is selected and a connection is
 eventually made through directory_initiate_command_rend().

 There appear to be a few options for altering this code to retry multiple
 simultaneous connections. It looks like we can modify
 update_consensus_networkstatus_downloads() to make connections more often
 if the purpose is DIR_PURPOSE_FETCH_CONSENSUS and there is no valid
 (reasonably live) consensus. We can make multiple connections from
 update_consensus_networkstatus_downloads(), as the sockets are non-blocking.
 (This socket appears to be non-blocking on Unixes (SOCK_NONBLOCK & O_NONBLOCK)
 and Windows (FIONBIO).) As long as we can tolerate a timer resolution of
 ~1 second (due to the use of second_elapsed_callback and time_t), this
 requires no additional timers or callbacks. We can make 1 connection for each
 schedule per second, for a maximum of 2 per second.

 The schedules can be specified in:
 TestingClientBootstrapConsensusAuthorityDownloadSchedule
 TestingClientBootstrapConsensusFallbackDownloadSchedule
 (Similar to the existing TestingClientConsensusDownloadSchedule.)

 TestingServerIPVersionPreferenceSchedule
 (Consisting of a CSV like “4,6,4,6”, or perhaps “0,1,0,1”.)

 update_consensus_networkstatus_downloads() checks the list of pending
 connections and, if there are 10 or more, skips the connection attempt
 and leaves the retry time unchanged.

 The code in directory_send_command() and connection_finished_connecting()
 would need to be altered to check that we are not already downloading the
 consensus. If we’re not, then download the consensus on this connection, and
 close any other pending consensus dircons.

 We might also need to make similar changes in authority_certs_fetch_missing(),
 as we can’t use a consensus until we have enough authority certificates.
 However, Tor already makes multiple requests (one per certificate), and only
 needs a majority of certificates to validate a consensus. Therefore, we will
 only need to modify authority_certs_fetch_missing() if clients download a
 consensus, then end up getting stuck downloading certificates. (Current tests
 show bootstrapping working well without any changes to authority certificate
 fetches.)

Reliability Analysis

 We make the pessimistic assumptions that 50% of connections to directory
 mirrors fail, and that 20% of connections to authorities fail. (Actual
 figures depend on relay churn, age of the fallback list, and authority
 uptime.)

 We expect the first 10 connection retry times to be:
 (Research shows users tend to lose interest after 40 seconds.)
 Mirror:   0s  1s  2s    4s    8s           16s             32s
 Auth:     0s                        10s            20s
 Success: 90% 95% 97% 98.7% 99.4% 99.88% 99.94% 99.988% 99.994%

 97%    of clients succeed in the first 2 seconds.
 99.4%  of clients succeed without trying a second authority.
 99.88% of clients succeed in the first 10 seconds.
  0.12% of clients remain, but in this scenario, 2 authorities are
        unreachable, so the client is most likely blocked from the Tor
        network. Alternately, they will likely succeed on relaunch.
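
 The success row follows from the two failure assumptions if we treat
 every connection attempt as independent (an assumption of this sketch,
 not something the design guarantees). A minimal Python version of the
 calculation, with hypothetical names:

```python
# Cumulative bootstrap success under the stated failure assumptions.
MIRROR_FAIL = 0.50
AUTH_FAIL = 0.20

# (time in seconds, kind) for the first 10 connection attempts in the table.
ATTEMPTS = [
    (0, "mirror"), (0, "auth"), (1, "mirror"), (2, "mirror"), (4, "mirror"),
    (8, "mirror"), (10, "auth"), (16, "mirror"), (20, "auth"), (32, "mirror"),
]


def cumulative_success(attempts):
    """Return [(time, P(at least one attempt launched so far succeeded))]."""
    all_failed = 1.0
    by_time = {}
    for when, kind in attempts:
        all_failed *= MIRROR_FAIL if kind == "mirror" else AUTH_FAIL
        by_time[when] = 1.0 - all_failed
    return sorted(by_time.items())


for when, p in cumulative_success(ATTEMPTS):
    print(f"{when:>2}s  {100 * p:.4f}%")
```

 The unrounded value at 2 seconds is 97.5%, and at 10 seconds it is
 99.875%.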

 The current implementation makes 1 or 2 authority connections within the
 first second, depending on exactly how the first connection fails. Under
 the 20% authority failure assumption, these clients would have a success
 rate of either 80% or 96% within a few seconds. The scheme above has a
 greater success rate in the first few seconds, while spreading the load
 among a larger number of directory mirrors. In addition, if all the
 authorities are blocked, current clients will inevitably fail, as they
 do not have a list of directory mirrors.
Filename: 211-mapaddress-tor-status.txt
Title: Internal Mapaddress for Tor Configuration Testing
Author: Mike Perry
Created: 08-10-2012
Status: Reserve
Target: 0.2.4.x+


Overview

 This proposal describes a method by which we can replace the
 https://check.torproject.org/ testing service with an internal XML
 document provided by the Tor client.

Motivation

 The Tor Check service is a central point of failure in terms of Tor
 usability. If it is ever out of sync with the set of exit nodes on the
 Tor network or down, user experience is degraded considerably. Moreover,
 the check itself is very time-consuming. Users must wait seconds or more
 for the result to come back. Worse still, if the user's software *was*
 in fact misconfigured, the check.torproject.org DNS resolution and
 request leaks out on to the network.

Design Overview

 The system will have three parts: an internal hard-coded IP address
 mapping (127.84.111.114:80), a hard-coded mapaddress to a DNS name
 (selftest.torproject.org:80), and a DirPortFrontPage-style simple
 HTTP server that serves an XML document for both addresses.

 Upon receipt of a request to the IP address mapping, the system will 
 create a new 128 bit randomly generated nonce and provide it
 in the XML document.
 
 Requests to http://selftest.torproject.org/ must include a valid,
 recent nonce as the GET url path. Upon receipt of a valid nonce,
 it is removed from the list of valid nonces. Nonces are only valid
 for 60 seconds or until SIGNAL NEWNYM, whichever comes first.

 The list of pending nonces should not be allowed to grow beyond 10
 entries. 

 The timeout period and nonce limit should be configurable in torrc.
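
 As an illustration of the nonce rules above, here is a minimal Python
 sketch of a pending-nonce list. The class and method names are
 hypothetical and are not part of Tor. The text does not say what happens
 when an eleventh nonce is requested; evicting the oldest entry is an
 assumption made here.

```python
import secrets
import time

NONCE_LIFETIME = 60   # seconds; the text suggests a torrc option for this
MAX_PENDING = 10      # likewise meant to be configurable


class NonceStore:
    """Pending self-test nonces (hypothetical helper, not a Tor API)."""

    def __init__(self, lifetime=NONCE_LIFETIME, max_pending=MAX_PENDING,
                 clock=time.monotonic):
        self.lifetime = lifetime
        self.max_pending = max_pending
        self.clock = clock
        self.pending = {}  # nonce (hex) -> creation time, in insertion order

    def _expire(self):
        now = self.clock()
        for nonce, created in list(self.pending.items()):
            if now - created >= self.lifetime:
                del self.pending[nonce]

    def issue(self):
        """Create a 128-bit nonce; evict the oldest if the list is full."""
        self._expire()
        while len(self.pending) >= self.max_pending:
            del self.pending[next(iter(self.pending))]  # assumption: drop oldest
        nonce = secrets.token_hex(16)  # 16 random bytes, encoded in base16
        self.pending[nonce] = self.clock()
        return nonce

    def redeem(self, nonce):
        """Return True once for a valid, recent nonce, then forget it."""
        self._expire()
        return self.pending.pop(nonce, None) is not None

    def newnym(self):
        """SIGNAL NEWNYM invalidates every pending nonce."""
        self.pending.clear()
```

 The clock is injectable so that expiry can be exercised without waiting
 for real time to pass.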

Design: XML document format for http://127.84.111.114

 To avoid the need to localize the message in Tor, Tor will only provide
 an XML object with connectivity information. Here is an example form:

 <tor-test>
  <tor-bootstrap-percent>100</tor-bootstrap-percent>
  <tor-version-current>true</tor-version-current>
  <dns-nonce>4977eb4842c7c59fa5b830ac4da896d9</dns-nonce>
 </tor-test>

 The tor-bootstrap-percent field represents the results of the Tor client
 bootstrap status as integer percentages from bootstrap_status_t.

 The tor-version-current field represents the results of the Tor client
 consensus version check. If the bootstrap process has not yet
 downloaded a consensus document, this field will have the value
 null.

 The dns-nonce field contains a 128-bit secret, encoded in base16. This
 field is only present for requests that list the Host: header as
 127.84.111.114.

Design: XML document format for http://selftest.torproject.org/nonce

 <tor-test>
  <tor-bootstrap-percent>100</tor-bootstrap-percent>
  <tor-version-current>true</tor-version-current>
  <dns-nonce-valid>true</dns-nonce-valid>
 </tor-test>

 The first two fields are the same as for the IP address version.

 The dns-nonce-valid field is only true if the Host header matches
 selftest.torproject.org and the nonce is current and valid. Upon
 receipt of a valid nonce, that nonce is removed from the list of
 valid nonces.

Design: Request Servicing

 Care must be taken with dns-nonce generation and usage, to prevent
 users from being tracked through leakage of the nonce value to
 application content. While the use of XML appears to make this
 impossible due to stricter same-origin policy enforcement than JSON,
 same-origin enforcement is still fraught with exceptions and loopholes.

 In particular: 

 Any requests that contain the Origin: header MUST be ignored,
 as the Origin: header is only included for third party web content
 (CORS).

 dns-nonce fields MUST be omitted if the HTTP Host: header does not
 match the IP address 127.84.111.114.

 Requests to selftest.torproject.org MUST return false for the
 dns-nonce-valid field if the HTTP Host: header does not match
 selftest.torproject.org, regardless of nonce value.

 Further, requests to selftest.torproject.org MUST validate that
 'selftest.torproject.org' was the actual hostname provided to
 SOCKS4A, and not some alternate address mapping (due to DNS rebinding
 attacks, for example).

Design: Application Usage

 Applications will use the system in two steps. First, they will make an
 HTTP request to http://127.84.111.114:80/ over Tor's SOCKS port and
 parse the resulting XML, if any.

 If the request at this stage fails, the application should inform the
 user that either their Tor client is too old, or that it is
 misconfigured, depending upon the nature of the failure.

 If the request succeeds and valid XML is returned, the application
 will record the value of the dns-nonce field, and then perform a second
 request to http://selftest.torproject.org/nonce_value. If the second
 request succeeds, and the dns-nonce-valid field is true, the application
 may inform the user that their Tor settings are valid.

 If the second request fails, or the dns-nonce-valid field is not true,
 the application will inform the user that their Tor DNS proxy settings
 are incorrect.
 
 If either tor-bootstrap-percent is not 100, or tor-version-current is
 false, applications may choose to inform the user of these facts using
 properly localized strings and appropriate UI.
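
 The two-step check can be sketched as a pure function over the two
 response bodies. This Python sketch leaves the HTTP-over-SOCKS requests
 to the application. The verdict strings are hypothetical labels, and
 treating a first response without a dns-nonce field as a failure is an
 assumption.

```python
import xml.etree.ElementTree as ET


def read_field(xml_text, name):
    """Return the text of element `name` under <tor-test>, or None."""
    if xml_text is None:
        return None
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return root.findtext(name) if root.tag == "tor-test" else None


def evaluate_selftest(first_xml, second_xml):
    """Return (verdict, notes) for the two response bodies.

    Pass None for a request that failed. A real application would map the
    (hypothetical) labels to properly localized UI strings.
    """
    if read_field(first_xml, "dns-nonce") is None:
        return "tor-too-old-or-misconfigured", []
    if read_field(second_xml, "dns-nonce-valid") != "true":
        return "dns-proxy-misconfigured", []
    notes = []
    if read_field(second_xml, "tor-bootstrap-percent") != "100":
        notes.append("bootstrap-incomplete")
    if read_field(second_xml, "tor-version-current") == "false":
        notes.append("tor-version-outdated")
    return "ok", notes
```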

Security Considerations

 XML was chosen over JSON due to the risks of the identifier leaking
 in a way that could enable websites to track the user[1].

 Because there are many exceptions and circumvention techniques
 to the same-origin policy, we have also opted for strict controls
 on dns-nonce lifetimes and usage, as well as validation of the Host
 header and SOCKS4A request hostnames.


1. http://www.hpenterprisesecurity.com/vulncat/en/vulncat/dotnet/javascript_hijacking_vulnerable_framework.html
Filename: 212-using-old-consensus.txt
Title: Increase Acceptable Consensus Age
Author: Mike Perry
Created: 01-10-2012
Status: Needs-Revision
Target: 0.2.4.x+

Overview

  This proposal aims to extend the duration that clients will accept
  old consensus material under conditions where the directory authorities
  are either down or fail to produce a valid consensus for an extended
  period of time.

Motivation

  Currently, if the directory authorities are down or fail to consense
  for 24 hours, the entire Tor network will cease to function. Worse,
  clients will enter into a state where they all need to re-bootstrap
  directly from the directory authorities, which will likely exacerbate
  any potential DoS condition that may have triggered the downtime in the
  first place.

  The Tor network has had such close calls before. In the past, we've
  been able to mobilize a majority of the directory authority operators
  within this 24 hour window, but that is only because we've been
  exceedingly lucky and the DoS conditions were accidental rather than
  deliberate.

  If a DoS attack were deliberately timed to coincide with a major US
  and European combined holiday such as Christmas Eve, New Year's Eve, or
  Easter, it is very unlikely we would be able to muster the resources to
  diagnose and deploy a fix to the authorities in time to prevent network
  collapse.

Description

  Based on the need to survive multi-day holidays and long weekends
  balanced with the need to ensure clients can't be captured on an old
  consensus forever, I propose that the consensus liveness constants be
  set at 3 days rather than 24 hours.

  This requires updating two consensus defines in the source, and one
  descriptor freshness variable. The descriptor freshness should be
  set to a function of the consensus freshness.

  See Implementation Notes for further details.

Security Concerns: Using an Old Consensus

  Clients should not trust old consensus data without an attempt to
  download fresher data from a directory mirror.

  As far as I could tell, the code already does this. The minimum
  consensus age before we try to download new data is two hours.

  However, the ability to accept old consensus documents does introduce
  the ability of malicious directory mirrors to feed their favorite old
  consensus document to clients to alter their paths until they
  download a fresher consensus from elsewhere. Directory guards
  (Proposal 207) may exacerbate this ability.

  This proposal does not address such attacks, and seeks only a modest
  increase in the valid timespan as a compromise.

  Future consideration of these and other targeted-consensus attacks
  will be left to proposals related to ticket #7126[1]. Once those
  proposals are complete and implemented, raising the freshness limit
  beyond 3 days should be possible.

Implementation Notes

  There appear to be at least four constants in the code involved with
  using potentially expired consensus data. Two of them
  (REASONABLY_LIVE_TIME and NS_EXPIRY_SLOP) involve the consensus itself,
  and two (OLD_ROUTER_DESC_MAX_AGE and TOLERATE_MICRODESC_AGE) deal with
  descriptor liveness.

  Two additional constants ROUTER_MAX_AGE and ROUTER_MAX_AGE_TO_PUBLISH
  are only used when submitting descriptors for consensus voting.

  FORCE_REGENERATE_DESCRIPTOR_INTERVAL is the maximum age a router
  descriptor will get before a relay will re-publish. It is set to 18
  hours.

  OLD_ROUTER_DESC_MAX_AGE is set at 5 days. TOLERATE_MICRODESC_AGE
  is set at 7 days.

  The consensus timestamps are used in
  networkstatus_get_reasonably_live_consensus() and 
  networkstatus_set_current_consensus().

  OLD_ROUTER_DESC_MAX_AGE is checked in routerlist_remove_old_routers(), 
  router_add_to_routerlist(), and client_would_use_router().

  It is my opinion that we should combine REASONABLY_LIVE_TIME and
  NS_EXPIRY_SLOP into a single define, and make OLD_ROUTER_DESC_MAX_AGE a
  function of REASONABLY_LIVE_TIME and FORCE_REGENERATE_DESCRIPTOR_INTERVAL:

  #define REASONABLY_LIVE_TIME           (3*24*60*60)
  #define NS_EXPIRY_SLOP                 REASONABLY_LIVE_TIME
  #define OLD_ROUTER_DESC_MAX_AGE        \
          (REASONABLY_LIVE_TIME+FORCE_REGENERATE_DESCRIPTOR_INTERVAL)

  Based on my review of the above code paths, these changes should be all
  we need to enable clients to use older consensuses for longer while
  still attempting to fetch new ones.
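
  For reference, the proposed values work out as follows. This is only
  the arithmetic behind the defines, written in Python; the 18-hour
  figure for FORCE_REGENERATE_DESCRIPTOR_INTERVAL is the one quoted
  earlier in this section.

```python
# Worked values for the proposed consensus-liveness constants, in seconds.
HOUR = 60 * 60
DAY = 24 * HOUR

REASONABLY_LIVE_TIME = 3 * DAY
NS_EXPIRY_SLOP = REASONABLY_LIVE_TIME
FORCE_REGENERATE_DESCRIPTOR_INTERVAL = 18 * HOUR
OLD_ROUTER_DESC_MAX_AGE = (REASONABLY_LIVE_TIME
                           + FORCE_REGENERATE_DESCRIPTOR_INTERVAL)

print(REASONABLY_LIVE_TIME, OLD_ROUTER_DESC_MAX_AGE,
      OLD_ROUTER_DESC_MAX_AGE / DAY)
```

  Under these definitions OLD_ROUTER_DESC_MAX_AGE comes to 3.75 days,
  which is shorter than the 5 days it is set to at present.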

1. https://trac.torproject.org/projects/tor/ticket/7126
Filename: 213-remove-stream-sendmes.txt
Title: Remove stream-level sendmes from the design
Author: Roger Dingledine
Created: 4-Nov-2012
Status: Dead

1. Motivation

  Tor uses circuit-level sendme cells to handle congestion / flow
  fairness at the circuit level, but it has a second stream-level
  flow/congestion/fairness layer under that to share a given circuit
  between multiple streams.

  The circuit-level flow control, or something like it, is needed
  because different users are competing for the same resources. But the
  stream-level flow control has a different threat model, since all the
  streams belong to the same user.

  When the circuit has only one active stream, the downsides are a)
  that we waste 2% of our bandwidth sending stream-level sendmes, and b)
  because of the circuit-level and stream-level window parameters we
  picked, we end up sending only half the cells we might otherwise send.

  When the circuit has two active streams, they each get to send 500
  cells for their window, because the circuit window is 1000. We still
  spend the 2% overhead.

  When the circuit has three or more active streams, they're all typically
  limited by the circuit window, since the stream-level window won't
  kick in. We still spend the 2% overhead though. And depending on their
  sending pattern, we could experience cases where a given stream might
  be able to send more data on the circuit, but it chooses not to because
  its stream-level window is empty.

  More generally, we don't have a good handle on the interactions between
  all the layers of congestion control in Tor. It would behoove us to
  simplify in the case where we're not clear on what it buys us.

2. Design

  We should strip all aspects of this stream-level flow control from
  the Tor design and code.

2.1. But doesn't having a lower stream window than circuit window save
     room for new streams?

  It could be that a feature of the stream window is that there's always
  space in the circuit window for another begin cell, so new streams
  will open faster than otherwise. But first, if there are two or more
  active streams going, there won't be any extra space. Second, since
  begin cells are client-to-exit, and typical circuits don't fill their
  outbound circuit windows very often anyway, and also since we're hoping
  to move to a world where we isolate more activities between circuits,
  I'm not inclined to worry much about losing this maybe-feature.

  See also proposal 168, "reduce default circuit window" -- it's
  interesting to note that proposal 168 was unknowingly dabbling in
  exactly this question, since reducing the default circuit window to
  500 or less made stream windows moot. It might be worth resurrecting
  the proposal 168 experiments once this proposal is implemented.

2.2. If we dump stream windows, we're effectively doubling them.

  Right now the circuit window starts at 1000, and the stream window
  starts at 500. So if we just rip out stream windows, we'll effectively
  change the stream window default to 1000, doubling the amount of data
  in flight and potentially clogging up the network more.

  We could either live with that, or we could change the default circuit
  window to 500 (which is easy to do even in a backward compatible way,
  since the edge connection can simply choose to not send as many cells).

3. Evaluation

  It would be wise to have some plan for making sure we didn't screw
  up the network too much with this change. The main trouble there is
  that torperf et al only do one stream at a time, so we really have no
  good baseline, or measurement tools, to capture network performance
  for multiple parallel streams.

  Maybe we should resolve task 7168 before the transition, so we're
  more prepared.

4. Transition

  Option one is to do a two-phase transition. In the first phase,
  edges stop enforcing the deliver window (i.e. stop closing circuits
  when the stream deliver window goes negative, but otherwise they send
  and receive stream-level sendmes as now). In the second phase (once all
  old versions are gone), we can start disobeying the deliver window,
  and also stop sending stream-level sendmes back.

  That approach takes a while before it will matter. As an optimization,
  since clients can know which relay versions support the new behavior,
  we could have relays interpret violating the deliver window as signaling
  support for removed stream-level sendmes: the relay would then stop
  sending or expecting sendmes. That optimization is somewhat klunky
  though, first because web-browsing clients don't generally finish out
  a stream window in the upstream direction (so the klunky trick will
  probably never happen by accident), and second because if we lower
  the circuit window to 500 (see Sec 2.2), there's now no way to violate
  stream deliver windows.

  Option two is to introduce another relay cell type, which the client
  sends before opening any streams to let the other side know that
  it shouldn't use or expect stream-level sendmes. A variation here
  is to extend either the create cell or the begin cell (ha -- and they
  thought I was crazy when I included the explicit \0 at the end of the
  current begin cell payload), so we can specify our circuit preferences
  without any extra overhead.

  Option three is to wait until we switch to a new circuit protocol
  (e.g. when we move to ntor or ace), and use that as the signal to
  drop stream-level sendmes from the design. And hey, if we're lucky,
  by then we'll have sorted out the n23 questions (see ticket 4506)
  and we might be dumping circuit-level sendmes at that point too.

  Options two or three seem way better than option one.

  And since it's not super-urgent, I suggest we hold off on option two
  to see if option three makes sense.

5. Discussion

  Based on feedback from Andreas Krey on tor-dev, I believe this proposal
  is flawed, and should likely move to Status: Dead.

  Looking at it from the exit relay's perspective (which is where it matters
  most, since most use of Tor is sending a little bit and receiving a lot):
  when a create cell shows up to establish a circuit, that circuit is
  allowed to send back at most 1000 cells. When a begin relay cell shows
  up to ask that circuit to open a new stream, that stream is allowed to
  send back at most 500 cells.

  Whenever the Tor client has received 100 cells on that circuit, she
  immediately sends a circuit-level sendme back towards the exit, to let
  it know to increment its "number of cells it's allowed to send on the
  circuit" by 100.

  However, a stream-level sendme is only sent when both a) the Tor client
  has received 50 cells on a particular stream, *and* b) the application
  that initiated the stream is willing to accept more data.
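
  To make the window arithmetic concrete, here is a small Python model of
  the exit side's send windows. It is a sketch for illustration only: the
  class, the function, and the round-robin sending order are assumptions,
  and only the four numbers come from the text.

```python
CIRCWINDOW_START, CIRCWINDOW_INCREMENT = 1000, 100
STREAMWINDOW_START, STREAMWINDOW_INCREMENT = 500, 50


class ExitSideWindows:
    """Cells the exit may still send toward the client (illustrative)."""

    def __init__(self, n_streams):
        self.circuit = CIRCWINDOW_START
        self.streams = [STREAMWINDOW_START] * n_streams

    def can_send(self, stream):
        return self.circuit > 0 and self.streams[stream] > 0

    def send_cell(self, stream):
        assert self.can_send(stream)
        self.circuit -= 1
        self.streams[stream] -= 1

    def circuit_sendme(self):
        self.circuit += CIRCWINDOW_INCREMENT

    def stream_sendme(self, stream):
        self.streams[stream] += STREAMWINDOW_INCREMENT


def cells_before_stall(n_streams):
    """Cells sent round-robin, with no sendmes arriving, until all block."""
    windows = ExitSideWindows(n_streams)
    sent = 0
    progress = True
    while progress:
        progress = False
        for stream in range(n_streams):
            if windows.can_send(stream):
                windows.send_cell(stream)
                sent += 1
                progress = True
    return sent


# One stream-level sendme per 50 data cells is the 2% overhead.
STREAM_SENDME_OVERHEAD = 1 / STREAMWINDOW_INCREMENT
```

  With one stream the model stalls after 500 cells, limited by the stream
  window. With two or more streams it stalls after 1000 cells, limited by
  the circuit window.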

  If we ripped out stream-level sendmes, then as you say, we'd have to
  choose between "queue all the data for the stream, no matter how big it
  gets" and "tell the whole circuit to shut up".

  I believe you have just poked a hole in the n23 ("defenestrator") design
  as well: http://freehaven.net/anonbib/#pets2011-defenestrator
  since it lacks any stream-level pushback for streams that are blocking
  on writes. Nicely done!

Filename: 214-longer-circids.txt
Title: Allow 4-byte circuit IDs in a new link protocol
Author: Nick Mathewson
Created: 6 Nov 2012
Status: Closed
Implemented-In: 0.2.4.11-alpha


0. Overview

   Relays are running out of circuit IDs.  It's time to make the field
   bigger.

1. Background and Motivation

   Long ago, we thought that 65535 circuit IDs would be enough for anybody.
   It wasn't.  But our cell format in link protocols is still:

    Cell [512 bytes]
      CircuitID [2 bytes]
      Command [1 byte]
      Payload [509 bytes]

    Variable-length cell [Length+5 bytes]
       CircID   [2 bytes]
       Command  [1 byte]
       Length   [2 bytes]
       Payload  [Length bytes]

   This means that a relay can run out of circuit IDs pretty easily.

2. Design

   I propose a new link cell format for relays that support it.  It should
   be:

    Cell [514 bytes]
       CircuitID [4 bytes]
       Command [1 byte]
       Payload [509 bytes]

    Variable-length cell [Length+7 bytes]
       CircID   [4 bytes]
       Command  [1 byte]
       Length   [2 bytes]
       Payload  [Length bytes]

   We need to keep the payload size in fixed-length cells to its current
   value, since otherwise the relay protocol won't work.

   This new cell format should be used only when the link protocol is 4.
   (To negotiate link protocol 4, both sides need to use the "v3"
   handshake, and include "4" in their version cells.  If version 4 or
   later is negotiated, this is the cell format to use.)
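
   A minimal Python sketch of the two layouts follows. The function names
   are hypothetical, and the command numbers in the usage lines are
   placeholders. Big-endian field encoding and zero padding of short
   payloads are assumptions that the text above does not state.

```python
import struct

PAYLOAD_LEN = 509  # fixed-length cell payload, unchanged by this proposal


def pack_fixed_cell(link_protocol, circ_id, command, payload=b""):
    """Build a fixed-length cell: 512 bytes before v4, 514 bytes from v4."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long for a fixed-length cell")
    header = ">IB" if link_protocol >= 4 else ">HB"  # 4- or 2-byte circ ID
    body = payload.ljust(PAYLOAD_LEN, b"\x00")       # assumption: zero padding
    return struct.pack(header, circ_id, command) + body


def pack_var_cell(link_protocol, circ_id, command, payload):
    """Build a variable-length cell: Length+5 bytes before v4, Length+7 after."""
    header = ">IBH" if link_protocol >= 4 else ">HBH"
    return struct.pack(header, circ_id, command, len(payload)) + payload


print(len(pack_fixed_cell(3, 0x1234, 1)), len(pack_fixed_cell(4, 0x12345678, 1)))
print(len(pack_var_cell(3, 0, 2, b"abc")), len(pack_var_cell(4, 0, 2, b"abc")))
```

   struct.pack() raises struct.error if a circuit ID does not fit the
   field, so a 4-byte ID cannot be written into a pre-v4 cell by accident.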

2.1. Better allocation of circuitID space

   In the current Tor design, circuit ID allocation is determined by
   whose RSA public key has the lower modulus.  How ridiculous!
   Instead, I propose that when the version 4 link protocol is in use,
   the connection initiator use the low half of the circuit ID space,
   and the responder use the high half of the circuit ID space.
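
   One way to read "low half" and "high half" is by the most significant
   bit of the 4-byte ID. The Python sketch below takes that reading. The
   function name is hypothetical, and skipping the value 0 (treated here
   as reserved) is an assumption of the sketch.

```python
import secrets

HIGH_BIT = 0x80000000


def new_circ_id(is_initiator, in_use):
    """Pick an unused, nonzero 4-byte circuit ID from our half of the space."""
    while True:
        low31 = secrets.randbits(31)
        # Initiator: most significant bit clear. Responder: bit set.
        circ_id = low31 if is_initiator else low31 | HIGH_BIT
        if circ_id != 0 and circ_id not in in_use:
            return circ_id


in_use = set()
ours = new_circ_id(True, in_use)
in_use.add(ours)
theirs = new_circ_id(False, in_use)
print(hex(ours), hex(theirs))
```

   Because the two sides draw from disjoint halves, they can never pick
   the same ID for different circuits on one connection.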

3. Discussion

   * Why 4 bytes?

     Because 3 would result in an odd cell size, and 8 seems like
     overkill.

   * Will this be distinguishable from the v3 protocol?

     Yes. Anybody who knows they're seeing the Tor protocol can probably
     tell by the TLS record sizes which version of the protocol is in
     use.  Probably not a huge deal though; which approximate range of
     versions of Tor a client or server is running is not something
     we've done much to hide in the past.

   * Why a new link protocol and not a new cell type?

     Because pretty much every cell has a meaningful circuit ID.

   * Okay, why a new link protocol and not a new _set of_ cell types?

     Because it's a bad idea to mix short and long circIDs on the same
     channel.  (That would leak which cells go with what kind of
     circuit ID, potentially.)

   * How hard is this to implement?

     I wasn't sure, so I coded it up.  I've got a probably-buggy
     implementation in branch "wide_circ_ids" in my public repository.
     Be afraid!  More testing is needed!

Filename: 215-update-min-consensus-ver.txt
Title: Let the minimum consensus method change with time
Author: Nick Mathewson
Created: 15 Nov 2012
Status: Closed
Implemented-In: 0.2.6.1-alpha


0. Overview

   This proposal suggests that we drop the requirement that
   authorities support the very old consensus method "1", and instead
   move to a wider window of recognized consensus methods as Tor
   evolves.

1. Background and Motivation

   When we designed the directory voting system, we added the notion
   of "consensus method" so that we could smoothly upgrade the voting
   process over time.  We also said that all authorities must support
   the consensus method '1', and must fall back to it if they don't
   support the method that the supermajority of authorities will
   choose.

   Consensus method 1 is no longer viable for the Tor network.  It
   doesn't result in a microdescriptor consensus, and omits other
   fields that clients need in order to work well.  Consensus methods
   under 12 have security issues, since they let a single authority
   set a consensus parameter.

   In the future, new consensus methods will be needed so that
   microdescriptor-using clients can use IPv6 exits and ECC
   onion-keys.  Rolling back from those would degrade functionality.

   We need a way to change the minimum consensus method over time.

2. Design

   I propose that we change the minimum consensus method about once
   per release cycle, or once per every other release cycle.

   As a rule of thumb, let the minimum consensus method in Tor series
   X be the highest method supported by the oldest version that
   "anybody reasonable" would use for running an authority.
   Typically, that's the stable version of the previous release
   series.

   For flexibility, it might make sense to choose a slightly older
   method, if falling back to that method wouldn't cause security
   problems.


   For example, while Tor 0.2.4.x is under development, authorities
   should really not be running anything before Tor 0.2.3.x.  Tor
   0.2.3.x has supported consensus method 13 since 0.2.3.21-rc, so
   it's okay for 0.2.4.x to require 13 as the minimum method.  We even
   might go back to method 12, since the worst outcome of not using 13
   would be some warnings in client logs.  Consensus method 12 was a
   security improvement, so we don't want to roll back before that.

2.1. Behavior when the method used is one we don't know

   The spec currently says that if an authority sees that a method
   will be used that it doesn't support, it should act as if the
   consensus method will be "1".  This attempt will be doomed, since
   the other authorities will be computing the consensus with a more
   recent method, and any attempt to use method "1" won't get enough
   signatures.

   Instead, let's say that authorities fall back to the most recent
   method that they *do* support.  This isn't any likelier to reach
   consensus, but it is less likely to result in anybody signing
   something they don't like.
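
   The fallback rule can be sketched in a few lines of Python. The
   function names are hypothetical. The sketch also assumes that the
   method the other authorities will use is the highest one supported by
   more than two-thirds of the voters; the text above only calls this the
   choice of the supermajority.

```python
def method_others_will_use(votes):
    """Highest method supported by more than 2/3 of voters (assumed rule).

    `votes` is a list of sets, one per authority, of supported methods.
    """
    n = len(votes)
    candidates = set().union(*votes)
    popular = [m for m in candidates
               if 3 * sum(m in vote for vote in votes) > 2 * n]
    return max(popular) if popular else None


def method_to_use(votes, supported_by_us):
    """Use the chosen method if we know it; else our most recent method."""
    chosen = method_others_will_use(votes)
    if chosen in supported_by_us:
        return chosen
    return max(supported_by_us)  # proposed fallback, instead of method "1"


votes = [{13, 14, 15}] * 7 + [{13, 14}] * 2
print(method_to_use(votes, {13, 14, 15}), method_to_use(votes, {12, 13}))
```

   In the second call the authority cannot compute method 15, so it
   produces a method-13 consensus that will not gather enough signatures,
   rather than a method-1 consensus that nobody wants.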


3. Likely outcomes

   If a bunch of authorities were to downgrade to a much older
   version, all at once, then newer authorities would not be able to
   sign the consensus they made.  That's probably for the best: if a
   bunch of authorities were to suddenly start running 0.2.0.x,
   consensing along with them would be a poor idea.

4. Alternatives

   We might choose a less narrow window of allowable methods, when we
   can do so securely.  Maybe two release series, rather than one,
   would be a good interval to do when the consensus format isn't
   changing rapidly.

   We might want to have the behavior when we see that everybody else
   will be using a method we don't support be "Don't make a consensus
   at all."  That's harder to program, though.


Filename: 216-ntor-handshake.txt
Title: Improved circuit-creation key exchange
Author:  Nick Mathewson
Created: 11-May-2011
Status: Closed
Implemented-In: 0.2.4.8-alpha

Summary:

  This is an attempt to translate the proposed circuit handshake from
  "Anonymity and one-way authentication in key-exchange protocols" by
  Goldberg, Stebila, and Ustaoglu, into a Tor proposal format.

  It assumes that proposal 200 is implemented, to provide an extended CREATE
  cell format that can indicate what type of handshake is in use.

Notation:

  Let a|b be the concatenation of a with b.

  Let H(x,t) be a tweakable hash function of output width H_LENGTH bytes.

  Let t_mac, t_key, and t_verify be a set of arbitrarily-chosen tweaks
  for the hash function.

  Let EXP(a,b) be a^b in some appropriate group G where the appropriate DH
  parameters hold.  Let's say elements of this group, when represented as
  byte strings, are all G_LENGTH bytes long.  Let's say we are using a
  generator g for this group.

  Let a,A=KEYGEN() yield a new private-public keypair in G, where a is the
  secret key and A = EXP(g,a).  If additional checks are needed to ensure
  a valid keypair, they should be performed.

  Let PROTOID be a string designating this variant of the protocol.

  Let KEYID be a collision-resistant (but not necessarily preimage-resistant)
     hash function on members of G, of output length H_LENGTH bytes.

  Let each node have a unique identifier, ID_LENGTH bytes in length.

Instantiation:

  Let's call this PROTOID "ntor-curve25519-sha256-1"  (We might want to make
  this shorter if it turns out to save us a block of hashing somewhere.)

  Set H(x,t) == HMAC_SHA256 with message x and key t. So H_LENGTH == 32.
  Set t_mac   == PROTOID | ":mac"
      t_key  == PROTOID | ":key_extract"
      t_verify  == PROTOID | ":verify"

  Set EXP(a,b) == curve25519(.,b,a), and g == 9 .  Let KEYGEN() do the
  appropriate manipulations when generating the secret key (clearing the
  low bits, twiddling the high bits).

  Set KEYID(B) == B.  (We don't need to use a hash function here, since our
     keys are already very short.  It is trivially collision-resistant, since
     KEYID(A)==KEYID(B) iff A==B.)

  When representing an element of the curve25519 subgroup as a byte string,
  use the standard (32-byte, little-endian, x-coordinate-only) representation
  for curve25519 points.

Protocol:

  Take a router with identity key digest ID.

  As setup, the router generates a secret key b, and a public onion key
  B with b, B = KEYGEN().  The router publishes B in its server descriptor.

  To send a create cell, the client generates a keypair x,X = KEYGEN(), and
  sends a CREATE cell with contents:

    NODEID:     ID             -- ID_LENGTH bytes
    KEYID:      KEYID(B)       -- H_LENGTH bytes
    CLIENT_PK:  X              -- G_LENGTH bytes

  The server generates a keypair of y,Y = KEYGEN(), and computes

    secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID
    KEY_SEED = H(secret_input, t_key)
    verify = H(secret_input, t_verify)
    auth_input = verify | ID | B | Y | X | PROTOID | "Server"

  The server sends a CREATED cell containing:

    SERVER_PK:  Y                     -- G_LENGTH bytes
    AUTH:       H(auth_input, t_mac)  -- H_LENGTH bytes

  The client then checks Y is in G^* [see NOTE below], and computes

    secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID
    KEY_SEED = H(secret_input, t_key)
    verify = H(secret_input, t_verify)
    auth_input = verify | ID | B | Y | X | PROTOID | "Server"

    The client verifies that AUTH == H(auth_input, t_mac).

  Both parties check that none of the EXP() operations produced the point at
  infinity. [NOTE: This is an adequate replacement for checking Y for group
  membership, if the group is curve25519.]

  Both parties now have a shared value for KEY_SEED.  They expand this into
  the keys needed for the Tor relay protocol.
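
  The hash steps above can be sketched as follows.  This is only an
  illustration: the curve25519 operations are assumed to have been done
  elsewhere, so the two EXP() outputs are passed in as byte strings, and
  the function name ntor_outputs is hypothetical.  The sketch also
  assumes that the point at infinity shows up as an all-zero 32-byte
  output.

```python
import hashlib
import hmac
from typing import Tuple

PROTOID = b"ntor-curve25519-sha256-1"
T_MAC = PROTOID + b":mac"
T_KEY = PROTOID + b":key_extract"
T_VERIFY = PROTOID + b":verify"


def H(x: bytes, t: bytes) -> bytes:
    """H(x,t): HMAC-SHA256 with message x and key t."""
    return hmac.new(t, x, hashlib.sha256).digest()


def ntor_outputs(exp_1: bytes, exp_2: bytes, node_id: bytes,
                 B: bytes, X: bytes, Y: bytes) -> Tuple[bytes, bytes]:
    """Return (KEY_SEED, AUTH).

    exp_1 is EXP(X,y) on the server or EXP(Y,x) on the client;
    exp_2 is EXP(X,b) on the server or EXP(B,x) on the client.
    """
    # Assumption: an all-zero output stands for the point at infinity.
    if exp_1 == bytes(32) or exp_2 == bytes(32):
        raise ValueError("EXP() produced the point at infinity")
    secret_input = exp_1 + exp_2 + node_id + B + X + Y + PROTOID
    key_seed = H(secret_input, T_KEY)
    verify = H(secret_input, T_VERIFY)
    auth_input = verify + node_id + B + Y + X + PROTOID + b"Server"
    return key_seed, H(auth_input, T_MAC)
```

  The server would send the second value as AUTH; the client would
  compute the same pair and compare its AUTH with the received one,
  preferably using a constant-time comparison such as
  hmac.compare_digest.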

Key expansion:

  Currently, the key expansion formula used by Tor here is

       K = SHA(K0 | [00]) | SHA(K0 | [01]) | SHA(K0 | [02]) | ...

       where K0==g^xy, and K is divvied up into Df, Db, Kf, and Kb portions.

  Instead, let's have it be HKDF-SHA256 as defined in RFC5869:

       K = K_1 | K_2 | K_3 | ...

       Where K_1     = H(m_expand | INT8(1) , KEY_SEED )
         and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED )
         and m_expand is an arbitrarily chosen value,
         and INT8(i) is an octet with the value "i".

  Ian says this is due to a construction from Krawczyk at
  http://eprint.iacr.org/2010/264 .

  Let m_expand be PROTOID | ":key_expand"

  In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand,
  salt == t_key, and IKM == secret_input.
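
  A minimal sketch of this expansion, written with RFC5869's counter
  numbering (block n is computed with the octet n).  The function name
  expand_key_material is hypothetical; the constants are the ones
  defined in this proposal.

```python
import hashlib
import hmac

PROTOID = b"ntor-curve25519-sha256-1"
T_KEY = PROTOID + b":key_extract"
M_EXPAND = PROTOID + b":key_expand"


def H(x: bytes, t: bytes) -> bytes:
    """H(x,t): HMAC-SHA256 with message x and key t."""
    return hmac.new(t, x, hashlib.sha256).digest()


def expand_key_material(secret_input: bytes, n_bytes: int) -> bytes:
    """HKDF-SHA256 with salt == t_key, IKM == secret_input, info == m_expand."""
    key_seed = H(secret_input, T_KEY)          # extract step
    out = b""
    block = b""                                # no previous block before K_1
    counter = 1
    while len(out) < n_bytes:
        if counter > 255:
            raise ValueError("HKDF cannot produce this much output")
        block = H(block + M_EXPAND + bytes([counter]), key_seed)
        out += block
        counter += 1
    return out[:n_bytes]
```

  Calling expand_key_material(b"Tor", 100) yields 100 bytes that can be
  compared against a published vector for the same input.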

Performance notes:

  In Tor's current circuit creation handshake, the client does:
     One RSA public-key encryption
     A full DH handshake in Z_p
     A short AES encryption
     Five SHA1s for key expansion
  And the server does:
     One RSA private-key decryption
     A full DH handshake in Z_p
     A short AES decryption
     Five SHA1s for key expansion

  While in the revised handshake, the client does:
     A full DH handshake
     A public-half of a DH handshake
     3 H operations for the handshake
     3 H operations for the key expansion
  and the server does:
     A full DH handshake
     A private-half of a DH handshake
     3 H operations for the handshake
     3 H operations for the key expansion

Integrating with the rest of Tor:

  Add a new optional entry to router descriptors and microdescriptors:

     "ntor-onion-key" SP Base64Key NL

  where Base64Key is a base-64 encoded 32-byte value, with padding
  omitted.
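
  As an illustration, a parser for this line has to restore the omitted
  padding before decoding.  The sketch below is not taken from any
  implementation, and the function name is hypothetical.

```python
import base64


def parse_ntor_onion_key(line: str) -> bytes:
    """Parse an '"ntor-onion-key" SP Base64Key NL' line into 32 raw bytes."""
    keyword, _, b64 = line.strip().partition(" ")
    if keyword != "ntor-onion-key" or not b64:
        raise ValueError("not an ntor-onion-key line")
    # The descriptor omits padding; add it back so the length is a
    # multiple of four characters.
    padded = b64 + "=" * (-len(b64) % 4)
    key = base64.b64decode(padded, validate=True)
    if len(key) != 32:
        raise ValueError("ntor onion key must be 32 bytes")
    return key
```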

  Add a new consensus method to tell servers to copy "ntor-onion-key"
  entries from router descriptors to microdescriptors.

  In microdescriptors, "ntor-onion-key" can go right after the "onion-key"
  line.

  Add a "UseNTorHandshake" configuration option and a corresponding
  consensus parameter to control whether clients use the ntor
  handshake.  If the configuration option is "auto", clients should
  obey the consensus parameter.  Have the configuration default be
  "auto" and the consensus value initially be "0".

  Reserve the handshake type [00 02] for this handshake in CREATE2 and
  EXTEND2 cells.

  Specify that this handshake type can be used in EXTEND/EXTENDED/
  CREATE/CREATED cells as follows: instead of a 186-byte TAP onionskin, send
  the 16-byte string "ntorNTORntorNTOR", followed by the client's ntor
  message.  Instead of a 148-byte TAP response, send the server's ntor
  response.  (We need this so that a client can extend from an 0.2.3 server,
  which doesn't know about CREATE2/CREATED2/EXTEND2/EXTENDED2.)

Test vectors for HKDF-SHA256:

 These are some test vectors for HKDF-SHA256 using the values for M_EXPAND
 and T_KEY above, taking 100 bytes of key material.

  INPUT: "" (The empty string)
  OUTPUT: d3490ed48b12a48f9547861583573fe3f19aafe3
          f81dc7fc75eeed96d741b3290f941576c1f9f0b2
          d463d1ec7ab2c6bf71cdd7f826c6298c00dbfe67
          11635d7005f0269493edf6046cc7e7dcf6abe0d2
          0c77cf363e8ffe358927817a3d3e73712cee28d8

  INPUT: "Tor" (546f72)
  OUTPUT: 5521492a85139a8d9107a2d5c0d9c91610d0f959
          89975ebee6c02a4f8d622a6cfdf9b7c7edd3832e
          2760ded1eac309b76f8d66c4a3c4d6225429b3a0
          16e3c3d45911152fc87bc2de9630c3961be9fdb9
          f93197ea8e5977180801926d3321fa21513e59ac

  INPUT: "AN ALARMING ITEM TO FIND ON YOUR CREDIT-RATING STATEMENT"
         (414e20414c41524d494e47204954454d20544f2046494e44204f4e20
          594f5552204352454449542d524154494e472053544154454d454e54)
  OUTPUT: a2aa9b50da7e481d30463adb8f233ff06e9571a0
          ca6ab6df0fb206fa34e5bc78d063fc291501beec
          53b36e5a0e434561200c5f8bd13e0f88b3459600
          b4dc21d69363e2895321c06184879d94b18f0784
          11be70b767c7fc40679a9440a0c95ea83a23efbf

Filename: 217-ext-orport-auth.txt
Title: Tor Extended ORPort Authentication
Author: George Kadianakis
Created: 28-11-2012
Status: Closed
Target: 0.2.5.x

1. Overview

  This proposal defines a scheme for Tor components to authenticate to
  each other using a shared-secret.

2. Motivation

  Proposal 196 introduced new ways for pluggable transport proxies to
  communicate with Tor. The communication happens using TCP in the same
  fashion that controllers speak to the ControlPort.

  To defend against cross-protocol attacks [0] on the transport ports,
  we need to define an authentication scheme that denies access to
  unknown clients.

  Tor's ControlPort uses an authentication scheme called safe-cookie
  authentication [1]. Unfortunately, the design of the safe-cookie
  authentication was influenced by the protocol structure of the
  ControlPort and the need for backwards compatibility of the
  cookie-file and can't be easily reused in other use cases.

3. Goals

  The general goal of Extended ORPort authentication is to authenticate
  the client based on a shared-secret that only authorized clients
  should know.

  Furthermore, its implementation should be flexible and easy to reuse,
  so that it can be used as the authentication mechanism in front of
  future Tor helper ports (for example, in proposal 199).

  Finally, the protocol is able to support multiple authentication
  schemes and each of them has different goals.

4. Protocol Specification

4.1. Initial handshake

  When a client connects to the Extended ORPort, the server sends:

    AuthTypes                                   [variable]
    EndAuthTypes                                [1 octet]

  Where,

  + AuthTypes are the authentication schemes that the server supports
    for this session. They are multiple concatenated 1-octet values that
    take values from 1 to 255.
  + EndAuthTypes is the special value 0.

  The client reads the list of supported authentication schemes and
  replies with the one it prefers to use:

    AuthType                                    [1 octet]

  Where,

  + AuthType is the authentication scheme that the client wants to use
    for this session. A valid authentication type takes values from 1 to
    255. A value of 0 means that the client did not like the
    authentication types offered by the server.

  If the client sent an AuthType of value 0, or an AuthType that the
  server does not support, the server MUST close the connection.
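
  The client side of this exchange might look like the sketch below.  It
  assumes the server's bytes have already been read from the socket up
  to and including EndAuthTypes; the function name and the "supported"
  parameter are hypothetical.

```python
SAFE_COOKIE = 1        # authentication type 1
END_AUTH_TYPES = 0     # terminator of the server's list


def choose_auth_type(server_bytes: bytes, supported=(SAFE_COOKIE,)) -> int:
    """Return the AuthType octet the client should reply with.

    Returns 0 if none of the offered types is acceptable, which tells
    the server that the client did not like any of them.
    """
    end = server_bytes.find(bytes([END_AUTH_TYPES]))
    if end < 0:
        raise ValueError("EndAuthTypes not received yet")
    offered = server_bytes[:end]       # each octet is one AuthType (1..255)
    for auth_type in supported:        # client preference order
        if auth_type in offered:
            return auth_type
    return 0
```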

4.2. Authentication types

4.2.1 SAFE_COOKIE handshake

  Authentication type 1 is called SAFE_COOKIE.

4.2.1.1. Motivation and goals

  The SAFE_COOKIE scheme is pretty much identical to the authentication
  scheme that was introduced for the ControlPort in proposal 193.

  An additional goal of the SAFE_COOKIE authentication scheme (apart
  from the goals of section 3) is that it should not leak the contents
  of the cookie-file to untrusted parties.

  Specifically, the SAFE_COOKIE protocol will never leak the actual
  contents of the file. Instead, it uses a challenge-response protocol
  (similar to the HTTP digest authentication of RFC2617) to ensure that
  both parties know the cookie without leaking it.

4.2.1.2. Cookie-file format

  The format of the cookie-file is:

     StaticHeader                                [32 octets]
     Cookie                                      [32 octets]

  Where,
  + StaticHeader is the following string:
    "! Extended ORPort Auth Cookie !\x0a"
  + Cookie is the shared-secret. During the SAFE_COOKIE protocol, the
    cookie is called CookieString.

  Extended ORPort clients MUST make sure that the StaticHeader is
  present in the cookie file, before proceeding with the
  authentication protocol.

  Details on how Tor locates the cookie file can be found in section 5
  of proposal 196. Details on how transport proxies locate the cookie
  file can be found in pt-spec.txt.
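
  A client-side check of the cookie-file format could be sketched like
  this.  The function takes the file contents as bytes; how the file is
  located and read is left out, and the function name is hypothetical.

```python
STATIC_HEADER = b"! Extended ORPort Auth Cookie !\x0a"   # 32 octets
COOKIE_LEN = 32


def parse_cookie_file(contents: bytes) -> bytes:
    """Validate the StaticHeader and return the 32-octet Cookie."""
    if len(contents) != len(STATIC_HEADER) + COOKIE_LEN:
        raise ValueError("cookie file has the wrong length")
    if not contents.startswith(STATIC_HEADER):
        raise ValueError("StaticHeader is missing")
    return contents[len(STATIC_HEADER):]
```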

4.2.1.3. Protocol specification

  A client that performs the SAFE_COOKIE handshake begins by sending:

     ClientNonce                                 [32 octets]

  Where,
  + ClientNonce is 32 octets of random data.

  Then, the server replies with:

     ServerHash                                  [32 octets]
     ServerNonce                                 [32 octets]

  Where,
  + ServerHash is computed as:
      HMAC-SHA256(CookieString,
        "ExtORPort authentication server-to-client hash" | ClientNonce | ServerNonce)
  + ServerNonce is 32 random octets.

  Upon receiving that data, the client computes ServerHash itself and
  validates it against the ServerHash provided by the server.

  If the server-provided ServerHash is invalid, the client MUST
  terminate the connection.

  Otherwise the client replies with:

     ClientHash                                  [32 octets]

  Where,
  + ClientHash is computed as:
      HMAC-SHA256(CookieString,
        "ExtORPort authentication client-to-server hash" | ClientNonce | ServerNonce)

  Upon receiving that data, the server computes ClientHash itself and
  validates it against the ClientHash provided by the client.

  Finally, the server replies with:

     Status                                      [1 octet]

  Where,
  + Status is 1 if the authentication was successful. If the
    authentication failed, Status is 0.
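
  The two hashes can be computed as in the sketch below.  It assumes
  that the first argument of HMAC-SHA256 above (CookieString) is the
  HMAC key and the second is the message; the function names are
  hypothetical.

```python
import hashlib
import hmac

SERVER_LABEL = b"ExtORPort authentication server-to-client hash"
CLIENT_LABEL = b"ExtORPort authentication client-to-server hash"


def server_hash(cookie: bytes, client_nonce: bytes, server_nonce: bytes) -> bytes:
    msg = SERVER_LABEL + client_nonce + server_nonce
    return hmac.new(cookie, msg, hashlib.sha256).digest()


def client_hash(cookie: bytes, client_nonce: bytes, server_nonce: bytes) -> bytes:
    msg = CLIENT_LABEL + client_nonce + server_nonce
    return hmac.new(cookie, msg, hashlib.sha256).digest()


def client_reply(cookie: bytes, client_nonce: bytes, server_msg: bytes) -> bytes:
    """Check ServerHash | ServerNonce and return ClientHash.

    Raises ValueError if ServerHash is invalid, in which case the
    client MUST terminate the connection.
    """
    if len(server_msg) != 64:
        raise ValueError("expected ServerHash and ServerNonce (64 octets)")
    received_hash, server_nonce = server_msg[:32], server_msg[32:]
    expected = server_hash(cookie, client_nonce, server_nonce)
    if not hmac.compare_digest(received_hash, expected):
        raise ValueError("invalid ServerHash")
    return client_hash(cookie, client_nonce, server_nonce)
```

  The comparison uses hmac.compare_digest so that the check does not
  leak timing information about the expected value.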

4.3. Post-authentication

  After completing the Extended ORPort authentication successfully, the
  two parties should proceed with the Extended ORPort protocol on the
  same TCP connection.

5. Acknowledgments

  Thanks to Robert Ransom for helping with the proposal and designing
  the original safe-cookie authentication scheme. Thanks to Nick
  Mathewson for advice and reviews of the proposal.

[0]:
http://archives.seul.org/or/announce/Sep-2007/msg00000.html

[1]:
https://gitweb.torproject.org/torspec.git/blob/79f488c32c43562522e5592f2c19952dc7681a65:/control-spec.txt#l1069

Filename: 218-usage-controller-events.txt
Title: Controller events to better understand connection/circuit usage
Author: Rob Jansen, Karsten Loesing
Created: 2013-02-06
Status: Closed
Implemented-In: 0.2.5.2-alpha

1. Overview

  This proposal defines three new controller events that shall help
  understand connection and circuit usage.  These events are designed
  to be emitted in private Tor networks only.  This proposal also
  defines a tweak to an existing event for the same purpose.

2. Motivation

  We need to better understand connection and circuit usage in order to
  better simulate Tor networks.  Existing controller events are a fine
  start, but we need more detailed information about per-connection
  bandwidth, processed cells by circuit, and token bucket refills.  This
  proposal defines controller events containing the desired information.

  Most of these usage data are too sensitive to be captured in the
  public network, unless aggregated sufficiently.  That is why we're
  focusing on private Tor networks first, that is, relays that have
  TestingTorNetwork set.  The new controller events described in this
  proposal shall all be restricted to private Tor networks.  In the next
  step we might define aggregate statistics to be gathered by public
  relays, but that will require a new proposal.

3. Design

  The proposed new event types use Tor's asynchronous event mechanism
  where a controller registers for events by type and processes events
  received from the Tor process.

  Tor controllers can register for any of the new event types, but
  events will only be emitted if the Tor process is running in
  TestingTorNetwork mode.

4. Security implications

  There should be no security implications from the new event types,
  because they are only emitted in private Tor networks.

5. Specification

5.1. ConnID Token

  Addition for section 2.4 of the control-spec (General-use tokens).

  ; Unique identifiers for connections or queues.  Only included in
  ; TestingTorNetwork mode.

  ConnID = 1*16 IDChar
  QueueID = 1*16 IDChar

5.2. Adding an ID field to ORCONN events

  The new syntax for ORCONN events is:

    "650" SP "ORCONN" SP (LongName / Target) SP ORStatus
             [ SP "ID=" ConnID ] [ SP "REASON=" Reason ]
             [ SP "NCIRCS=" NumCircuits ] CRLF

  The remaining specification of that event type stays unchanged.

5.3. Bandwidth used on an OR or DIR or EXIT connection

  The syntax is:
     "650" SP "CONN_BW" SP "ID=" ConnID SP "TYPE=" ConnType
              SP "READ=" BytesRead SP "WRITTEN=" BytesWritten CRLF
     ConnType = "OR" / "DIR" / "EXIT"
     BytesRead = 1*DIGIT
     BytesWritten = 1*DIGIT

  Controllers MUST tolerate unrecognized connection types.

  BytesWritten and BytesRead are the number of bytes written and read
  by Tor since the last CONN_BW event on this connection.

  These events are generated about once per second per connection; no
  events are generated for connections that have not read or written.
  These events are only generated if TestingTorNetwork is set.
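
  A controller could parse this event roughly as follows.  The sketch is
  illustrative only; the function name and the sample line in the usage
  note are made up.

```python
def parse_conn_bw(line: str) -> dict:
    """Parse one CONN_BW event line into a dictionary."""
    parts = line.split()
    if parts[:2] != ["650", "CONN_BW"]:
        raise ValueError("not a CONN_BW event")
    fields = dict(p.split("=", 1) for p in parts[2:])
    return {
        "id": fields["ID"],
        # Unrecognized connection types are kept as-is, not rejected.
        "type": fields["TYPE"],
        "read": int(fields["READ"]),
        "written": int(fields["WRITTEN"]),
    }
```

  For example, parse_conn_bw("650 CONN_BW ID=123 TYPE=EXIT READ=234
  WRITTEN=345") gives a dictionary whose "read" entry is the integer 234.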

5.4. Bandwidth used by all streams attached to a circuit

  The syntax is:
     "650" SP "CIRC_BW" SP "ID=" CircuitID SP "READ=" BytesRead SP
              "WRITTEN=" BytesWritten CRLF
     BytesRead = 1*DIGIT
     BytesWritten = 1*DIGIT

  BytesRead and BytesWritten are the number of bytes read and written by
  all applications with streams attached to this circuit since the last
  CIRC_BW event.

  These events are generated about once per second per circuit; no events
  are generated for circuits that had no attached stream writing or
  reading.

5.5. Per-circuit cell stats

  The syntax is:
     "650" SP "CELL_STATS"
              [ SP "ID=" CircuitID ]
              [ SP "InboundQueue=" QueueID SP "InboundConn=" ConnID ]
              [ SP "InboundAdded=" CellsByType ]
              [ SP "InboundRemoved=" CellsByType SP
                   "InboundTime=" MsecByType ]
              [ SP "OutboundQueue=" QueueID SP "OutboundConn=" ConnID ]
              [ SP "OutboundAdded=" CellsByType ]
              [ SP "OutboundRemoved=" CellsByType SP
                   "OutboundTime=" MsecByType ] CRLF
     CellsByType, MsecByType = CellType ":" 1*DIGIT
                               0*( "," CellType ":" 1*DIGIT )
     CellType = 1*( "a" - "z" / "0" - "9" / "_" )

  Examples are:
     650 CELL_STATS ID=14 OutboundQueue=19403 OutboundConn=15
         OutboundAdded=create_fast:1,relay_early:2
         OutboundRemoved=create_fast:1,relay_early:2
         OutboundTime=create_fast:0,relay_early:0
     650 CELL_STATS InboundQueue=19403 InboundConn=32
         InboundAdded=relay:1,created_fast:1
         InboundRemoved=relay:1,created_fast:1
         InboundTime=relay:0,created_fast:0
         OutboundQueue=6710 OutboundConn=18
         OutboundAdded=create:1,relay_early:1
         OutboundRemoved=create:1,relay_early:1
         OutboundTime=create:0,relay_early:0

  ID is the locally unique circuit identifier that is only included if the
  circuit originates at this node.

  Inbound and outbound refer to the direction of cell flow through the
  circuit which is either to origin (inbound) or from origin (outbound).

  InboundQueue and OutboundQueue are identifiers of the inbound and
  outbound circuit queues of this circuit.  These identifiers are only
  unique per OR connection.  OutboundQueue is chosen by this node and
  matches InboundQueue of the next node in the circuit.

  InboundConn and OutboundConn are locally unique IDs of the inbound and
  outbound OR connections.  OutboundConn does not necessarily match
  InboundConn of the next node in the circuit.

  InboundQueue and InboundConn are not present if the circuit originates
  at this node.  OutboundQueue and OutboundConn are not present if the
  circuit (currently) ends at this node.

  InboundAdded and OutboundAdded are the total numbers of cells by cell
  type added to the inbound and outbound queues.  They are only present if
  at least one cell was added to a queue.

  InboundRemoved and OutboundRemoved are the total numbers of cells by
  cell type processed from the inbound and outbound queues.  InboundTime
  and OutboundTime are the total waiting times in milliseconds of all
  processed cells by cell type.  They are only present if at least one
  cell was removed from a queue.

  These events are generated about once per second per circuit; no
  events are generated for circuits that have not added or processed any
  cell.  These events are only generated if TestingTorNetwork is set.
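
  The CellsByType and MsecByType values lend themselves to a small
  parser.  The following sketch assumes the event arrives as a single
  line (the examples above are wrapped for readability); the function
  names are hypothetical.

```python
def parse_by_type(value: str) -> dict:
    """Parse 'create_fast:1,relay_early:2' into {'create_fast': 1, ...}."""
    result = {}
    for item in value.split(","):
        cell_type, _, number = item.partition(":")
        result[cell_type] = int(number)
    return result


def parse_cell_stats(line: str) -> dict:
    """Parse one CELL_STATS event; all fields are optional."""
    parts = line.split()
    if parts[:2] != ["650", "CELL_STATS"]:
        raise ValueError("not a CELL_STATS event")
    event = {}
    for part in parts[2:]:
        key, _, value = part.partition("=")
        if key.endswith(("Added", "Removed", "Time")):
            event[key] = parse_by_type(value)   # per-cell-type counters
        else:
            event[key] = value                  # ID, *Queue, *Conn
    return event
```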

5.6. Token buckets refilled

  The syntax is:
     "650" SP "TB_EMPTY" SP BucketName [ SP "ID=" ConnID ] SP
              "READ=" ReadBucketEmpty SP "WRITTEN=" WriteBucketEmpty SP
              "LAST=" LastRefill CRLF

     BucketName = "GLOBAL" / "RELAY" / "ORCONN"
     ReadBucketEmpty = 1*DIGIT
     WriteBucketEmpty = 1*DIGIT
     LastRefill = 1*DIGIT

  Examples are:
     650 TB_EMPTY ORCONN ID=16 READ=0 WRITTEN=0 LAST=100
     650 TB_EMPTY GLOBAL READ=93 WRITTEN=93 LAST=100
     650 TB_EMPTY RELAY READ=93 WRITTEN=93 LAST=100

  This event is generated when refilling a previously empty token
  bucket.  The BucketName values "GLOBAL" and "RELAY" are used for the
  global and relay token buckets; BucketName "ORCONN" is used for the
  token buckets of an OR connection.  Controllers MUST tolerate
  unrecognized bucket names.

  ConnID is only included if the BucketName is "ORCONN".

  If both global and relay buckets and/or the buckets of one or more OR
  connections run out of tokens at the same time, multiple separate
  events are generated.

  ReadBucketEmpty (WriteBucketEmpty) is the time in milliseconds that the
  read (write) bucket was empty since the last refill.  LastRefill is the
  time in milliseconds since the last refill.

  If a bucket went negative and if refilling tokens didn't make it go
  positive again, there will be multiple consecutive TB_EMPTY events for
  each refill interval during which the bucket contained zero tokens or
  less.  In such a case, ReadBucketEmpty or WriteBucketEmpty are capped
  at LastRefill in order not to report empty times more than once.

  These events are only generated if TestingTorNetwork is set.
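
  A sketch of a TB_EMPTY parser follows.  It treats the ID field as
  optional and does not reject unknown bucket names; the function name
  and the keys of the returned dictionary are hypothetical.

```python
def parse_tb_empty(line: str) -> dict:
    """Parse one TB_EMPTY event line into a dictionary."""
    parts = line.split()
    if len(parts) < 3 or parts[:2] != ["650", "TB_EMPTY"]:
        raise ValueError("not a TB_EMPTY event")
    fields = dict(p.split("=", 1) for p in parts[3:])
    return {
        "bucket": parts[2],              # unknown names are tolerated
        "conn_id": fields.get("ID"),     # only present for ORCONN
        "read_ms": int(fields["READ"]),
        "written_ms": int(fields["WRITTEN"]),
        "last_ms": int(fields["LAST"]),
    }
```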

6. Compatibility

  There should not be any compatibility issues with other Tor versions.

7. Implementation

  Most of the implementation should be straightforward.

8. Performance and scalability notes

  Most of the new code won't be executed in normal Tor mode.  Wherever
  we needed new fields in existing structs, we tried hard to keep them
  as small as possible.  Still, we should make sure that memory
  requirements won't grow significantly on busy relays.

Filename: 219-expanded-dns.txt
Title: Support for full DNS and DNSSEC resolution in Tor
Authors: Ondrej Mikle
Created: 4 February 2012
Modified: 2 August 2013
Target: 0.2.5.x
Status: Needs-Revision

0. Overview

  Adding support for any DNS query type to Tor.

0.1. Motivation

  Many applications running over Tor need more than just resolving FQDN to
  IPv4 and vice versa. Sometimes, to prevent DNS leaks, applications have to
  be hacked so that the necessary data is supplied by hand (e.g. SRV records
  in XMPP). TLS connections will benefit from the planned TLSA record, which
  provides certificate pinning to avoid another Diginotar-like fiasco.

0.2. What about DNSSEC?

  Routine DNSSEC resolution is not practical with this proposal alone,
  because of round-trip issues: a single name lookup can require
  dozens of round trips across a circuit, rendering it very slow. (We
  don't want to add minutes to every webpage load time!)

  For records like TLSA that need extra signing, this might not be an
  unacceptable amount of overhead, but for routine hostname lookup, it's
  probably overkill.

  [Further, thanks to the changes of proposal 205, DNSSEC for routine
  hostname lookup is less useful in Tor than it might have been back
  when we cached IPv4 and IPv6 addresses and used them across multiple
  circuits and exit nodes.]

  See section 8 below for more discussion of DNSSEC issues.

1. Design

1.1 New cells

  There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll
  use DNS_BEGIN and DNS_RESPONSE for short below).

1.1.1. DNS_BEGIN

  DNS_BEGIN payload:

    FLAGS        [2 octets]
    DNS packet data (variable length, up to length of relay cell.)

  The DNS packet must be generated internally by Tor to avoid
  fingerprinting users by differences in client resolvers' behavior.

  [XXXX We need to specify the exact behavior here: saying "Just do what
  Libunbound does!" would make it impossible to implement a
  Tor-compatible client without reverse-engineering libunbound. - NM]

  The FLAGS field is reserved, and should be set to 0 by all clients.

  Because of the maximum length of the RELAY cell, the DNS packet may
  not be longer than 496 bytes. [XXXX Is this enough? -NM]

  Some fields in the query must be omitted or set to zero: see section 3
  below.

1.1.2. DNS_RESPONSE

  DNS_RESPONSE payload:

    STATUS [1 octet]
    CONTENT [variable, up to length of relay cell]

  If the low bit of STATUS is set, this is the last DNS_RESPONSE that
  the server will send in response to the given DNS_BEGIN.  Otherwise,
  there will be more DNS_RESPONSE packets.  The other bits are reserved,
  and should be set to zero for now.

  The CONTENT fields of the DNS_RESPONSE cells contain a DNS record,
  split across multiple cells as needed, encoded as:


    total length (2 octets)
    data         (variable)

  So for example, if the DNS record R1 is only 300 bytes long, then it
  is sent in a single DNS_RESPONSE cell with payload [01 01 2C] R1.  But
  if the DNS record R2 is 1024 bytes long, it's sent in 3 DNS_RESPONSE
  cells, with contents: [00 04 00] R2[0:495], [00] R2[495:992], and
  [01] R2[992:1024] respectively.
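
  The splitting in this example can be reproduced with a short sketch.
  It assumes a 498-byte relay cell payload (the 496-byte DNS packet
  limit plus the 2-byte FLAGS field of DNS_BEGIN) and a big-endian
  length field, as in the [01 2C] example; the function name is
  hypothetical.

```python
RELAY_PAYLOAD_LEN = 498   # assumed usable payload of one relay cell


def fragment_dns_response(record: bytes) -> list:
    """Split one DNS record into DNS_RESPONSE cell payloads."""
    # The 2-octet total length is sent once, in front of the record data.
    data = len(record).to_bytes(2, "big") + record
    room = RELAY_PAYLOAD_LEN - 1          # one octet is used by STATUS
    chunks = [data[i:i + room] for i in range(0, len(data), room)]
    cells = []
    for n, chunk in enumerate(chunks):
        is_last = (n == len(chunks) - 1)
        status = 1 if is_last else 0      # low bit marks the last cell
        cells.append(bytes([status]) + chunk)
    return cells
```

  With a 1024-byte record this produces three payloads that begin with
  [00 04 00], [00] and [01], carrying 495, 497 and 32 bytes of the
  record respectively.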

  [NOTE: I'm using the length field and the is-this-the-last-cell
  field to allow multi-packet responses in the future. -NM]

  AXFR and IXFR are not supported in this cell by design (see the
  specialized tool in section 5 below).

1.1.3. Matching queries to responses.

  DNS_BEGIN must use a non-zero, distinct StreamID.  The client MUST NOT
  re-use the same stream ID until it has received a complete response
  from the server or a RELAY_END cell.

  The client may cancel a DNS_BEGIN request by sending a RELAY_END cell.
  The server may refuse to answer, or abort answering, a DNS_BEGIN cell
  by sending a RELAY_END cell.

2. Interfaces to applications

  DNSPort evdns - existing implementation will be updated to use
  DNS_BEGIN.

  [XXXX we should add a dig-like tool that can work over the socksport
  via some extension, as tor-resolve does now. -NM]

3. Limitations on DNS query

  Clients must only set query class to IN (INTERNET), since the only
  other useful class, CHAOS, is practical only for directly querying
  authoritative servers (OR in this case acts as a recursive resolver).
  Servers MUST return REFUSED for any class other than IN.

  Multiple questions in a single packet are not supported and OR will
  respond with REFUSED as the DNS error code.

  All query RR types are allowed.

  [XXXX I originally thought about some exit policy like "basic RR types" and
  "all RRs", but managing such list in deployed nodes with extra directory
  flags outweighs the benefit. Maybe disallow ANY RR type? -OM]

  Client as well as OR MUST block attempts to resolve local RFC 1918,
  4193, or 4291 addresses (PTR). REFUSED will be returned as DNS error
  code from OR.  [XXXX Must they also refuse to report addresses that
  resolve to these? -NM]

  [XXX I don't think so. People often use public DNS
  records that map to private addresses. We can't effectively separate
  "truly public" records from the ones client's dnsmasq or similar DNS
  resolver returns. - OM]

  [XXX Then do you mean "must be returned as the DNS error from the OP"?]

  Requests for special names (.onion, .exit, .noconnect) must never be
  sent, and will return REFUSED.

  The DNS transaction ID field MUST be set to zero in all requests and
  replies; the stream ID field plays the same function in Tor.
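
  Some of the limitations in this section can be checked mechanically on
  the wire-format query.  The sketch below covers only the size limit,
  the zero transaction ID, the single-question rule and the IN class; it
  does not check for private addresses or special names, and the
  function name is hypothetical.

```python
import struct

MAX_DNS_PACKET = 496   # limit from section 1.1.1
CLASS_IN = 1


def is_acceptable_query(packet: bytes) -> bool:
    """Return True if the query passes the basic checks above."""
    if len(packet) < 12 or len(packet) > MAX_DNS_PACKET:
        return False
    txid, _flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    if txid != 0 or qdcount != 1:
        return False
    # Walk over the QNAME labels of the single question.
    pos = 12
    while True:
        if pos >= len(packet):
            return False
        label_len = packet[pos]
        pos += 1
        if label_len == 0:
            break
        if label_len > 63:        # compression pointer or invalid label
            return False
        pos += label_len
    if pos + 4 > len(packet):
        return False
    _qtype, qclass = struct.unpack("!HH", packet[pos:pos + 4])
    return qclass == CLASS_IN
```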

4. Implementation notes

  Client will periodically purge incomplete DNS replies. Any unexpected
  DNS_RESPONSE will be dropped.

  AD flag must be zeroed out on client unless validation is performed.

  [XXXX libunbound lowlevel API, Tor+libunbound libevent loop

  libunbound doesn't publicly expose all the necessary parts of low-level API.
  It can return the received DNS packet, but not let you construct a packet
  and get it in wire-format, for example.

  Options I see:

  a) patch libunbound to be able to feed wire-format DNS packets and add API
  to obtain constructed packets instead of sending over network

  b) replace bufferevents for sockets in unbound with something like
  libevent's paired bufferevents. This means that data extracted from
  DNS_RESPONSE/DNS_BEGIN cells would be fed directly to some evbuffers that
  would be picked up by libunbound. It could possibly result in avoiding
  background thread of libunbound's ub_resolve_async running separate libevent
  loop.

  c) bind to some arbitrary local address like 127.1.2.3:53 and use it as
  forwarder for libunbound. The code there would pack/unpack the DNS packets
  from/to libunbound into DNS_BEGIN/DNS_RESPONSE cells. It wouldn't require
  modification of libunbound code, but it's not pretty either. Also the bind
  port must be 53 which usually requires superuser privileges.

  The code of libunbound is too complex for me to see outright what the
  best approach would be.
  ]

5. Separate tool for AXFR

  The AXFR tool will have an interface similar to tor-resolve, but will
  return raw DNS data.

  Parameters are: query domain, server IP of authoritative DNS.

  The tool will transfer the data through "ordinary" tunnel using RELAY_BEGIN
  and related cells.

  This design decision serves two goals:

  - DNS_BEGIN and DNS_RESPONSE will be simpler to implement (lower chance of
    bugs)
  - in practice it's often useful to do AXFR queries on secondary
    authoritative DNS servers

  IXFR will not be supported (infrequent corner case, can be done by manual
  tunnel creation over Tor if truly necessary).

6. Security implications

  As proposal 171 mentions, we need to mitigate circuit correlation. One
  solution would be keeping multiple streams to multiple exit nodes and
  picking one at random for DNS resolution. Another would be keeping the
  DNS-resolving circuit open only for a short time (e.g. 1-2 minutes).
  Randomly changing the circuits, however, would probably incur additional
  latency, since there would likely be a few cache misses on the newly
  selected exits.

  [This needs more analysis; We need to consider the possible attacks
  here.  It would be good to have a way to tie requests to
  SocksPorts, perhaps? -NM]

7. TTL normalization idea

  A bit complex on implementation, because it requires parsing DNS packets at
  exit node.

  TTL in the reply DNS packet MUST be normalized at the exit node so that a
  client won't learn what other clients queried. The normalization is done
  in the following way:

  - for a RR, the original TTL value received from authoritative DNS server
    should be used when sending DNS_RESPONSE, trimming the values to interval
    [5, 600]
  - does not pose "ghost-cache-attack", since once RR is flushed from
    libunbound's cache, it must be fetched anew
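
  The trimming step amounts to clamping the TTL, as in this small
  sketch (the function and constant names are hypothetical):

```python
TTL_MIN = 5
TTL_MAX = 600


def normalize_ttl(original_ttl: int) -> int:
    """Trim the TTL from the authoritative server to [5, 600]."""
    return max(TTL_MIN, min(TTL_MAX, original_ttl))
```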

8. DNSSEC notes

8.1. Where to do the resolution?

  DNSSEC is part of the DNS protocol and the most appropriate place for DNSSEC
  API would be probably in OS libraries (e.g. libc). However that will
  probably take time until it becomes widespread.

  On Tor's side (as opposed to application's side), DNSSEC will provide
  protection against DNS cache-poisoning attacks (provided that exit is not
  malicious itself, but still reduces attack surface).

8.2. Round trips and serialization

  The following are two examples of resolving two A records. The one for
  addons.mozilla.org is an example of a "common" RR without CNAME/DNAME, the
  other for www.gov.cn an extreme example chained through 5 CNAMEs and 3 TLDs.
  The examples below are shown for resolving that started with an empty DNS
  cache.

  Note that multiple queries are made by libunbound as it tries to adjust for
  the latency of network. "Standard query response" below that does not list
  RR type is a negative NOERROR reply with NSEC/NSEC3 (usually reply to DS
  query).

  The effect of DNS cache plays a great role - once DS/DNSKEY for root and a
  TLD is cached, at most 3 records usually need to be fetched for a record
  that does not utilize CNAME/DNAME (3 roundtrips for DS, DNSKEY and the
  record itself if there are no zone cuts below).

  Query for addons.mozilla.org, 6 roundtrips (not counting retries):

    Standard query A addons.mozilla.org
    Standard query A addons.mozilla.org
    Standard query A addons.mozilla.org
    Standard query A addons.mozilla.org
    Standard query A addons.mozilla.org
    Standard query response A 63.245.217.112 RRSIG
    Standard query response A 63.245.217.112 RRSIG
    Standard query response A 63.245.217.112 RRSIG
    Standard query A addons.mozilla.org
    Standard query response A 63.245.217.112 RRSIG
    Standard query response A 63.245.217.112 RRSIG
    Standard query A addons.mozilla.org
    Standard query response A 63.245.217.112 RRSIG
    Standard query response A 63.245.217.112 RRSIG
    Standard query DNSKEY <Root>
    Standard query DNSKEY <Root>
    Standard query response DNSKEY DNSKEY RRSIG
    Standard query response DNSKEY DNSKEY RRSIG
    Standard query DS org
    Standard query response DS DS RRSIG
    Standard query DNSKEY org
    Standard query response DNSKEY DNSKEY DNSKEY DNSKEY RRSIG RRSIG
    Standard query DS mozilla.org
    Standard query response DS RRSIG
    Standard query DNSKEY mozilla.org
    Standard query response DNSKEY DNSKEY DNSKEY RRSIG RRSIG

  Query for www.gov.cn, 16 roundtrips (not counting retries):

    Standard query A www.gov.cn
    Standard query A www.gov.cn
    Standard query A www.gov.cn
    Standard query A www.gov.cn
    Standard query A www.gov.cn
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A www.gov.cn
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A www.gov.cn
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A www.gov.chinacache.net
    Standard query response CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A www.gov.cncssr.chinacache.net
    Standard query response CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A www.gov.foreign.ccgslb.com
    Standard query response CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A wac.0b51.edgecastcdn.net
    Standard query response CNAME gp1.wac.v2cdn.net A 68.232.35.119
    Standard query A gp1.wac.v2cdn.net
    Standard query response A 68.232.35.119
    Standard query DNSKEY <Root>
    Standard query response DNSKEY DNSKEY RRSIG
    Standard query DS cn
    Standard query response
    Standard query DS net
    Standard query response DS RRSIG
    Standard query DNSKEY net
    Standard query response DNSKEY DNSKEY RRSIG
    Standard query DS chinacache.net
    Standard query response
    Standard query DS com
    Standard query response DS RRSIG
    Standard query DNSKEY com
    Standard query response DNSKEY DNSKEY RRSIG
    Standard query DS ccgslb.com
    Standard query response
    Standard query DS edgecastcdn.net
    Standard query response
    Standard query DS v2cdn.net
    Standard query response

  An obvious idea to avoid so many roundtrips is to serialize them together.
  There has been an attempt to standardize such "DNSSEC stapling" [1], however
  it's incomplete for the general case, mainly due to various intricacies -
  proofs of non-existence, NSEC3 opt-out zones, TTL handling (see RFC 4035
  section 5).

References:

  [1] https://www.ietf.org/mail-archive/web/dane/current/msg02823.html
Filename: 220-ecc-id-keys.txt
Title: Migrate server identity keys to Ed25519
Authors: Nick Mathewson
Created: 12 August 2013
Implemented-In: 0.3.0.1-alpha
Status: Closed

   [Note: This is a draft proposal; I've probably made some important
   mistakes, and there are parts that need more thinking.  I'm
   publishing it now so that we can do the thinking together.]

   (Sections 0-5 are currently implemented, except for section 2.3.  Sections
   6-8 are a work in progress, and may require revision.)

0. Introduction

   In current Tor designs, router identity keys are limited to
   1024-bit RSA keys.

   Clearly, that should change, because RSA doesn't represent a good
   performance-security tradeoff nowadays, and because 1024-bit RSA is
   just plain too short.

   We've already got an improved circuit extension handshake protocol
   that uses curve25519 in place of RSA1024, and we're using (where
   supported) P256 ECDHE in our TLS handshakes, but there are more uses
   of RSA1024 to replace, including:

      * Router identity keys
      * TLS link keys
      * Hidden service keys

   This proposal describes how we'll migrate away from using 1024-bit
   RSA in the first two, since they're tightly coupled.  Hidden service
   crypto changes will be complex, and will merit their own proposal.

   In this proposal, we'll also (incidentally) be extirpating a number
   of SHA1 usages.

1. Overview

   When this proposal is implemented, every router will have an Ed25519
   identity key in addition to its current RSA1024 public key.

   Ed25519 (specifically, Ed25519-SHA-512 as described and specified at
   http://ed25519.cr.yp.to/) is a desirable choice here: it's secure,
   fast, has small keys and small signatures, is bulletproof in several
   important ways, and supports fast batch verification. (It isn't quite
   as fast as RSA1024 when it comes to public key operations, since RSA
   gets to take advantage of small exponents when generating public
   keys.)

   (For reference: In Ed25519 public keys are 32 bytes long, private keys
   are 64 bytes long, and signatures are 64 bytes long.)

   To mirror the way that authority identity keys work, we'll fully
   support keeping Ed25519 identity keys offline; they'll be used to
   sign long-ish term signing keys, which in turn will do all of the
   heavy lifting.  A signing key will get used to sign the things that
   RSA1024 identity keys currently sign.

1.1. 'Personalized' signatures

   Each of the keys introduced here is used to sign more than one kind
   of document. While these documents should be unambiguous, I'm going
   to forward-proof the signatures by specifying each signature to be
   generated, not on the document itself, but on the document prefixed
   with some distinguishing string.

2. Certificates and Router descriptors.

2.1. Certificates

   When generating a signing key, we also generate a certificate for it.
   Unlike the certificates for authorities' signing keys, these
   certificates need to be sent around frequently, in significant
   numbers.  So we'll choose a compact representation.

         VERSION         [1 Byte]
         CERT_TYPE       [1 Byte]
         EXPIRATION_DATE [4 Bytes]
         CERT_KEY_TYPE   [1 byte]
         CERTIFIED_KEY   [32 Bytes]
         N_EXTENSIONS    [1 byte]
         EXTENSIONS      [N_EXTENSIONS times]
         SIGNATURE       [64 Bytes]

   The "VERSION" field holds the value [01].  The "CERT_TYPE" field
   holds a value depending on the type of certificate. (See appendix
   A.1.) The CERTIFIED_KEY field is an Ed25519 public key if
   CERT_KEY_TYPE is [01], or a SHA256 hash of some other key type
   depending on the value of CERT_KEY_TYPE. The EXPIRATION_DATE is a
   date, given in HOURS since the epoch, after which this
   certificate isn't valid. (A four-byte field here will work fine
   until 10136 A.D.)

   The EXTENSIONS field contains zero or more extensions, each of
   the format:

         ExtLength [2 bytes]
         ExtType   [1 byte]
         ExtFlags  [1 byte]
         ExtData   [ExtLength bytes]

   The meaning of the ExtData field in an extension is type-dependent.

   The ExtFlags field holds flags; this flag is currently defined:

      1 -- AFFECTS_VALIDATION. If this flag is present, then the
           extension affects whether the certificate is valid; clients
           must not accept the certificate as valid unless they
           understand the extension.

   It is an error for an extension to be truncated; such a
   certificate is invalid.

   Before processing any certificate, parties MUST know which
   identity key it is supposed to be signed by, and then check the
   signature.  The signature is formed by signing the first N-64
   bytes of the certificate prefixed with the string "Tor node
   signing key certificate v1".
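
   A minimal Python sketch of a parser for the layout above may help
   make it concrete.  It assumes big-endian (network order) integers,
   which the text doesn't state explicitly.  The names parse_cert,
   Cert, and Extension are made up for illustration, and the actual
   Ed25519 signature check is left to whatever library you use; the
   sketch only computes the bytes that should have been signed.

```python
import struct
from dataclasses import dataclass
from datetime import datetime, timezone

SIG_LEN = 64
CERT_PREFIX = b"Tor node signing key certificate v1"
AFFECTS_VALIDATION = 0x01
KNOWN_EXT_TYPES = {0x04}  # signed-with-ed25519-key (section 2.2.1)
HEADER = struct.Struct(">BBIB32sB")  # VERSION .. N_EXTENSIONS (assumed big-endian)


@dataclass
class Extension:
    ext_type: int
    flags: int
    data: bytes


@dataclass
class Cert:
    version: int
    cert_type: int
    expiration_hours: int
    cert_key_type: int
    certified_key: bytes
    extensions: list
    signature: bytes
    signed_input: bytes  # prefix + first N-64 bytes of the certificate

    def expires_at(self) -> datetime:
        # EXPIRATION_DATE counts hours since the epoch.
        return datetime.fromtimestamp(self.expiration_hours * 3600,
                                      tz=timezone.utc)


def parse_cert(raw: bytes) -> Cert:
    if len(raw) < HEADER.size + SIG_LEN:
        raise ValueError("certificate too short")
    version, cert_type, exp, key_type, key, n_ext = HEADER.unpack_from(raw, 0)
    if version != 1:
        raise ValueError("unsupported certificate version")
    body_end = len(raw) - SIG_LEN
    pos = HEADER.size
    extensions = []
    for _ in range(n_ext):
        if pos + 4 > body_end:
            raise ValueError("truncated extension header")
        ext_len, ext_type, flags = struct.unpack_from(">HBB", raw, pos)
        pos += 4
        if pos + ext_len > body_end:
            raise ValueError("truncated extension data")
        if flags & AFFECTS_VALIDATION and ext_type not in KNOWN_EXT_TYPES:
            raise ValueError("unrecognized extension affects validation")
        extensions.append(Extension(ext_type, flags, raw[pos:pos + ext_len]))
        pos += ext_len
    if pos != body_end:
        raise ValueError("unexpected bytes before signature")
    return Cert(version, cert_type, exp, key_type, key, extensions,
                raw[body_end:], CERT_PREFIX + raw[:body_end])
```

   A caller that already knows the expected identity key would verify
   Cert.signature over Cert.signed_input with that key before trusting
   any other field.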

2.2. Basic extensions

2.2.1. Signed-with-ed25519-key extension [type 04]

   In several places, it's desirable to bundle the key signing a
   certificate along with the certificate.  We do so with this
   extension.

        ExtLength = 32
        ExtData =
           An ed25519 key    [32 bytes]

   When this extension is present, it MUST match the key used to
   sign the certificate.

2.3. Revoking keys.

   We also specify a revocation document for revoking a signing key or an
   identity key.  Its format is:
         FIXED_PREFIX    [8 Bytes]
         VERSION         [1 Byte]
         KEYTYPE         [1 Byte]
         IDENTITY_KEY    [32 Bytes]
         REVOKED_KEY     [32 Bytes]
         PUBLISHED       [8 Bytes]
         N_EXTENSIONS    [1 Byte]
           N_EXTENSIONS_TIMES:
           EXTENSIONS      [N_EXTENSIONS times]
         SIGNATURE       [64 Bytes]

   FIXED_PREFIX is "REVOKEID" or "REVOKESK". VERSION is [01]. KEYTYPE is
   [01] for revoking a signing key, [02] for revoking an identity key,
   or [03] for revoking an RSA identity key.
   REVOKED_KEY is the key being revoked or a SHA256 hash of the key if
   it is an RSA identity key; IDENTITY_KEY is the node's
   Ed25519 identity key. PUBLISHED is the time that the document was
   generated, in seconds since the epoch. EXTENSIONS is left for a
   future version of this document.  The SIGNATURE is generated with
   the same key as in IDENTITY_KEY, and covers the entire revocation,
   prefixed with "Tor key revocation v1".

   Using these revocation documents is left for a later specification.

2.4. Managing keys

   By default, we can keep the easy-to-setup key management properties
   that Tor has now, so that node operators aren't required to have
   offline identity keys:

        * When a Tor node starts up with no Ed25519 identity keys, it
          generates a new identity keypair.
        * When a Tor node has an Ed25519 identity keypair, and it has
          no signing key, or its signing key is going to expire within
          the next 48 hours, it generates a new signing key to last
          30 days.

   But we also support offline identity keys:

        * When a Tor node starts with an Ed25519 public identity key
          but no private identity key, it checks whether it has
          a currently valid certified signing keypair.  If it does,
          it starts.  Otherwise, it refuses to start.
        * If a Tor node's signing key is going to expire soon, it starts
          warning the user.  If it is expired, then the node shuts down.
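
   The startup rules above can be sketched as a small decision
   function.  This is only an illustration: the Action names and the
   startup_action signature are invented here, and the "warn the user"
   step for soon-to-expire offline signing keys is left out.

```python
from enum import Enum

HOUR = 3600
REGENERATE_WINDOW = 48 * HOUR          # replace signing keys expiring this soon
SIGNING_KEY_LIFETIME = 30 * 24 * HOUR  # lifetime of a freshly made signing key


class Action(Enum):
    GENERATE_IDENTITY_AND_SIGNING_KEY = 1
    GENERATE_SIGNING_KEY = 2
    START = 3
    REFUSE_TO_START = 4


def new_signing_key_expiry(now: int) -> int:
    """Expiry time for a signing key generated at 'now'."""
    return now + SIGNING_KEY_LIFETIME


def startup_action(has_id_public, has_id_secret, signing_key_expiry, now):
    """Decide what a node does at startup.

    signing_key_expiry is a Unix timestamp, or None when there is no
    certified signing keypair on disk.
    """
    if not has_id_public and not has_id_secret:
        # No identity at all: make one, then a signing key under it.
        return Action.GENERATE_IDENTITY_AND_SIGNING_KEY
    if has_id_secret:
        if (signing_key_expiry is None
                or signing_key_expiry - now <= REGENERATE_WINDOW):
            return Action.GENERATE_SIGNING_KEY
        return Action.START
    # Offline identity key: we can only use what the operator provided.
    if signing_key_expiry is not None and signing_key_expiry > now:
        return Action.START
    return Action.REFUSE_TO_START
```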

2.5. Router descriptors

   We specify the following element that may appear at most once in
   each router descriptor:
      "identity-ed25519" NL "-----BEGIN ED25519 CERT-----" NL certificate
           "-----END ED25519 CERT-----" NL

   The certificate is base64-encoded with terminating =s removed.  When
   this element is present, it MUST appear as the first or second
   element in the router descriptor.
   [XXX The rationale here is to allow extracting the identity key and
   signing key and checking the signature before fully parsing the rest
   of the document. -NM]

   The certificate has CERT_TYPE of [04].  It must include a
   signed-with-ed25519-key extension (see section 2.2.1), so that we
   can extract the identity key.

   When an identity-ed25519 element is present, there must also be a
   "router-sig-ed25519" element.  It MUST be the next-to-last element in
   the descriptor, appearing immediately before the RSA signature.  (In
   future versions of the descriptor format that do not require an RSA
   identity key, it MUST be last.)  It MUST contain an ed25519 signature
   of a SHA256 digest of the entire document, from the first character
   up to and including the first space after the "router-sig-ed25519"
   string, prefixed with the string
   "Tor router descriptor signature v1".  Its format is:

      "router-sig-ed25519" SP signature NL

   Where 'signature' is encoded in base64 with terminating =s removed.

   The signing key in the certificate MUST be the one used to sign the
   document.
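
   Here is a rough Python sketch of how the signed digest might be
   computed.  It takes one reading of the text above, namely that the
   prefix string is prepended to the covered part of the descriptor
   before hashing, and that the Ed25519 signature is then made over
   the resulting SHA256 digest.  The helper names are hypothetical.

```python
import base64
import hashlib

DESC_SIG_PREFIX = b"Tor router descriptor signature v1"
SIG_KEYWORD = b"\nrouter-sig-ed25519 "  # keyword plus the first space after it


def descriptor_digest(descriptor: bytes) -> bytes:
    """SHA256 over the prefix and the descriptor up to the signature."""
    idx = descriptor.find(SIG_KEYWORD)
    if idx < 0:
        raise ValueError("no router-sig-ed25519 element")
    covered = descriptor[:idx + len(SIG_KEYWORD)]
    return hashlib.sha256(DESC_SIG_PREFIX + covered).digest()


def b64_unpadded(data: bytes) -> str:
    """Base64 with the terminating =s removed."""
    return base64.b64encode(data).rstrip(b"=").decode("ascii")


def b64_decode_unpadded(text: str) -> bytes:
    """Restore the padding that was stripped, then decode."""
    return base64.b64decode(text + "=" * (-len(text) % 4))
```

   With this encoding a 64-byte signature takes 86 characters and a
   32-byte key takes 43.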

   Note that these keys cross-certify as follows: the ed25519 identity
   key signs the ed25519 signing key in the certificate.  The ed25519
   signing key signs itself and the ed25519 identity key and the RSA
   identity key as part of signing the descriptor.  And the RSA identity
   key also signs all three keys as part of signing the descriptor.


   When an ed25519 signature is present, there MAY be a "master-key-ed25519"
   element containing the base64 encoded ed25519 master key as a single
   argument.  If it is present, it MUST match the identity key in
   the certificate.

2.5.1. Checking descriptor signatures.

   Current versions of Tor will handle these new formats by ignoring the
   new fields, and not checking any ed25519 information.

   New versions of Tor will have a flag that tells them whether to check
   ed25519 information.  When it is set, they must check:

      * All RSA information and signatures that Tor implementations
        currently check.
      * If the identity-ed25519 line is present, it must be well-formed,
        and the certificate must be well-formed and correctly signed,
        and there must be a valid router-sig-ed25519 signature.
      * If we require an ed25519 key for this node (see 3.1 below), the
        ed25519 key must be present.

   Authorities and directory caches will have this flag always-on.  For
   clients, it will be controlled by a torrc option and consensus
   option, to be set to "always-on" in the future once enough clients
   support it.

2.5.2. Extra-info documents

   Extra-info documents now include "identity-ed25519" and
   "router-sig-ed25519" fields in the same positions in which they
   appear in router descriptors.

   Additionally, we add the base64-encoded, =-stripped SHA256 digest of
   a node's extra-info document field to the extra-info-digest line in
   the router descriptor. (All versions of Tor that recognize this line
   allow an extra field there.)

2.5.3. A note on signature verification

   Here and elsewhere, we're receiving a certificate and a document
   signed with the key certified by that certificate in the same step.
   This is a fine time to use the batch signature checking capability of
   Ed25519, so that we can check both signatures at once without (much)
   additional overhead over checking a single signature.

3. Consensus documents and authority operation

3.1. Handling router identity at the authority

   When receiving router descriptors, authorities must track mappings
   between RSA and Ed25519 keys.

   Rule 1: Once an authority has seen an Ed25519 identity key and an RSA
   identity key together on the same (valid) descriptor, it should no
   longer accept any descriptor signed by that RSA key with a different
   Ed25519 key, or that Ed25519 key with a different RSA key.

   Rule 2: Once an authority has seen an Ed25519 identity key and an RSA
   identity key on the same descriptor, it should no longer accept any
   descriptor signed by that RSA key unless it also has that Ed25519
   key.


   These rules together should enforce the property that, even if an
   attacker manages to steal or factor a node's RSA identity key, the
   attacker can't impersonate that node to the authorities, even when
   that node is identified by its RSA key.


   Enforcement of Rule 1 should be advisory-only for a little while (a
   release or two) while node operators get experience having Ed25519
   keys, in case there are any bugs that cause or force identity key
   replacement.  Enforcement of Rule 2 should be advisory-only for a
   little while, so that node operators can try 0.2.5 but downgrade to
   0.2.4 without being de-listed from the consensus.
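
   As an illustration, Rules 1 and 2 could be tracked with a pair of
   maps.  This is a sketch only: the class name is made up, it assumes
   the descriptor has already passed signature checks, and it always
   enforces both rules rather than supporting an advisory-only mode.

```python
class KeyPinJournal:
    """Remembers which RSA and Ed25519 identities were seen together."""

    def __init__(self):
        self.ed_for_rsa = {}
        self.rsa_for_ed = {}

    def check_and_record(self, rsa_id, ed_id=None) -> bool:
        """Return True if a valid descriptor with these keys is acceptable."""
        pinned_ed = self.ed_for_rsa.get(rsa_id)
        if ed_id is None:
            # Rule 2: an RSA key once seen with an Ed25519 key may not
            # show up again without it.
            return pinned_ed is None
        pinned_rsa = self.rsa_for_ed.get(ed_id)
        # Rule 1: neither key may be paired with a different partner.
        if pinned_ed is not None and pinned_ed != ed_id:
            return False
        if pinned_rsa is not None and pinned_rsa != rsa_id:
            return False
        self.ed_for_rsa[rsa_id] = ed_id
        self.rsa_for_ed[ed_id] = rsa_id
        return True
```

   An advisory-only deployment would log the rejections instead of
   acting on them.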


3.2. Formats

   Vote and microdescriptor documents now contain an optional "id"
   field for each routerstatus section.  Its format is:

       "id" SP "ed25519" SP ed25519-identity NL

   where ed25519-identity is base64-encoded, with trailing = characters
   omitted.  In vote documents, it may be replaced by the format:

       "id" SP "ed25519" SP "none" NL

   which indicates that the node does not have an ed25519 identity.  (In
   a microdescriptor, a lack of "id" line means that the node has no ed25519
   identity.)
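
   A small Python sketch of parsing this line, for illustration.  The
   function name is hypothetical; it returns the 32-byte key, or None
   for the "none" form used in votes.

```python
import base64


def parse_id_line(line: str):
    """Parse an 'id ed25519 ...' line from a vote or microdescriptor."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 3 or parts[0] != "id" or parts[1] != "ed25519":
        raise ValueError("not an ed25519 id line")
    if parts[2] == "none":
        return None
    # Trailing = characters are omitted on the wire; restore them.
    padded = parts[2] + "=" * (-len(parts[2]) % 4)
    key = base64.b64decode(padded, validate=True)
    if len(key) != 32:
        raise ValueError("ed25519 identity must be 32 bytes")
    return key
```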

   A vote or consensus document is ill-formed if it includes the same
   ed25519 identity key twice.

   A vote listing ed25519 identities must also include a new entry in its
   "r" lines, containing a base64-encoded SHA256 digest of the entire
   descriptor (including signature).  This kills off another place where
   we rely on SHA1.  The format for 'r' lines is now:

    "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort
        SP DirPort [ SP digest-sha256 ] NL

3.3. Generating votes

   An authority should pick which descriptor to choose for a node as
   before, and include the ed25519 identity key for the descriptor if
   it's present.

   As a transition, before Rule 1 and Rule 2 in 3.1 are fully enforced,
   authorities need a way to deal with the possibility that there might
   be two nodes with the same ed25519 key but different RSA keys.  In
   that case, an authority votes for the one with the most recent
   publication date.

   (The existing rules already prevent an authority from voting for two
   servers with the same RSA identity key.)

3.4. Generating a consensus from votes

   This proposal requires a new consensus vote method.  When we deploy
   it, we'll pick the next available vote method in sequence to use for
   this.

   When the new consensus method is in use, we must choose nodes first by ECC
   key, then by RSA key.  [This procedure is analogous to the current one,
   except that it is aware of multiple kinds of keys.]

3.4.1. Notation for voting

  We have a set of votes.  Each contains either 'old tuples' or 'new tuples'.

  Old tuples are:
    <id-RSA, descriptor-digest, published, nickname, IP, ports>

  New tuples are:
    <id-Ed, id-RSA, descriptor-digest, dd256, published, nickname, IP, ports>


3.4.2. Validating votes

  It is an error for a vote to have the same id-RSA or the same id-Ed listed
  twice.  Throw it away if it does.

3.4.3. Decide which ids to include.

  For each <id-Ed, id-RSA> that is listed by more than half of the total
    authorities (not just total votes), include it.  (No other <id-Ed, id-RSA'>
    can have as many votes.)

  Log any other id-RSA values corresponding to an id-Ed we included, and any
    other id-Ed values corresponding to an id-RSA we included.

  For each <id-RSA> that is not yet included, if it is listed by more than
    half of the total authorities, and we do not already have it listed with
    some <id-Ed>, include it without an id-Ed.
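
  One way to read the rule above, sketched in Python.  Each vote is
  modelled as a list of (id_ed, id_rsa) pairs, with id_ed set to None
  for old tuples.  The sketch assumes that a vote "lists" an id-RSA
  whether or not it also gives an id-Ed for it, and that votes have
  already passed the check in 3.4.2; the function name is made up.

```python
from collections import Counter


def choose_identities(votes, n_authorities):
    """Return the (id_ed, id_rsa) identities to include, sorted by id_rsa."""
    pair_count = Counter()
    rsa_count = Counter()
    for vote in votes:
        for id_ed, id_rsa in set(vote):
            rsa_count[id_rsa] += 1
            if id_ed is not None:
                pair_count[(id_ed, id_rsa)] += 1

    # "More than half of the total authorities", not of the votes cast.
    included = [pair for pair, n in pair_count.items()
                if 2 * n > n_authorities]

    # RSA-only identities, where no <id-Ed, id-RSA> pair was included.
    rsa_with_ed = {id_rsa for _, id_rsa in included}
    for id_rsa, n in rsa_count.items():
        if 2 * n > n_authorities and id_rsa not in rsa_with_ed:
            included.append((None, id_rsa))
    return sorted(included, key=lambda pair: pair[1])
```

  Logging of the conflicting id-RSA and id-Ed values is omitted here.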

3.4.4. Decide which descriptors to include.

   A tuple belongs to an <id-RSA, id-Ed> identity if it is a new tuple that
   matches both ID parts, or if it is an old tuple that matches the RSA part.
   A tuple belongs to an <id-RSA> identity if its RSA identity matches.

   A tuple matches another tuple if all the fields that are present in both
   tuples are the same.

   For every included identity, consider the tuples belonging to that
   identity.  Group them into sets of matching tuples.  Include the tuple
   that matches the largest set, breaking ties in favor of the most recently
   published, and then in favor of the smaller server descriptor digest.
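
   A possible Python sketch of this selection step.  Tuples are
   modelled as dicts in which old tuples simply lack the "id_ed" and
   "dd256" keys.  Since "matches" is not transitive, the sketch counts,
   for each candidate, how many tuples match it, and takes that count
   as the size of its set; that is an interpretation, and all names
   here are hypothetical.

```python
def matches(a: dict, b: dict) -> bool:
    """Two tuples match if every field present in both is equal."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def belongs(t: dict, id_rsa, id_ed=None) -> bool:
    """Does tuple t belong to <id_ed, id_rsa>, or to <id_rsa> alone?"""
    if t["id_rsa"] != id_rsa:
        return False
    # Old tuples carry no "id_ed" field and match on the RSA part alone.
    return id_ed is None or t.get("id_ed", id_ed) == id_ed


def choose_tuple(tuples, id_rsa, id_ed=None):
    """Pick the tuple to include for one identity, or None."""
    candidates = [t for t in tuples if belongs(t, id_rsa, id_ed)]
    if not candidates:
        return None
    # Sort by digest first: max() returns the first maximal item, so a
    # full tie is broken in favor of the smaller descriptor digest.
    candidates.sort(key=lambda t: t["digest"])

    def rank(t):
        support = sum(1 for other in candidates if matches(t, other))
        return (support, t["published"])

    return max(candidates, key=rank)
```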

4. The link protocol

4.1. Overview of the status quo

   This section won't make much sense unless you grok the v3
   link protocol as described in tor-spec.txt, first proposed in
   proposal 195. So let's review.

   In the v3 link protocol, the client completes a TLS handshake
   with the server, in which the server uses an arbitrary
   certificate signed with an RSA key.  The client then sends a
   VERSIONS cell.  The server replies with a VERSIONS cell to
   negotiate version 3 or higher.  The server also sends a CERTS
   cell and an AUTH_CHALLENGE cell and a NETINFO cell.

   The CERTS cell from the server contains a set of one or more
   certificates that authenticate the RSA key used in the TLS
   handshake.  (Right now there's one self-signed RSA identity key
   certificate, and one certificate signing the RSA link key with
   the identity key.  These certificates are X509.)

   Having received a CERTS cell, the client has enough information
   to authenticate the server.  At this point, the client may send a
   NETINFO cell to finish the handshake.  But if the client wants to
   authenticate as well, it can send a CERTS cell and an AUTHENTICATE
   cell.

   The client's CERTS cell also contains certs of the same general
   kinds as the server's CERTS cell: a self-signed identity
   certificate, and an authentication certificate signed with the
   identity key.  The AUTHENTICATE cell contains a signature of
   various fields, including the contents of the AUTH_CHALLENGE
   which the server sent, using the client's authentication
   key.  These cells allow the client to authenticate to the server.


4.2. Link protocol changes for ECC ID keys

   We add four new CertType values for use in CERTS cells:
        4: Ed25519 signing key
        5: Link key certificate certified by Ed25519 signing key
        6: Ed25519 TLS authentication key certified by Ed25519 signing key
        7: RSA cross-certificate for Ed25519 identity key
   These correspond to types used in the CERT_TYPE field of
   the certificates.

   The content of certificate type [04] (Ed25519 signing key)
   is as in section 2.5 above, containing an identity key and the
   signing key, both signed by the identity key.

   Certificate type [05] (Link certificate signed with Ed25519
   signing key) contains a SHA256 digest of the X.509 link
   certificate used on the TLS connection in its key field; it is
   signed with the signing key.

   Certificate type [06] (Ed25519 TLS authentication signed with
   Ed25519 signing key) has the signing key used to sign the
   AUTHENTICATE cell described later in this section.

   Certificate type [07] (Cross-certification of Ed25519 identity
   with RSA key) contains the following data:
       ED25519_KEY                       [32 bytes]
       EXPIRATION_DATE                   [4 bytes]
       SIGLEN                            [1 byte]
       SIGNATURE                         [SIGLEN bytes]
   Here, the Ed25519 identity key is signed with the router's RSA
   identity key, to indicate that authenticating with a key
   certified by the Ed25519 key counts as certifying with the RSA
   identity key.  (The signature is computed on the SHA256 hash of
   the non-signature parts of the certificate, prefixed with the
   string "Tor TLS RSA/Ed25519 cross-certificate".)

   (There's no reason to have a corresponding Ed25519-signed-RSA-key
   certificate here, since we do not treat authenticating with an RSA
   key as proving ownership of the Ed25519 identity.)
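
   For illustration, a Python sketch of splitting a type [07]
   certificate and computing the digest that the RSA signature should
   cover.  Two things here are assumptions rather than statements from
   the text: that integers are big-endian, and that the
   "non-signature parts" are ED25519_KEY and EXPIRATION_DATE only,
   hashed together with the prefix string.  The RSA verification itself
   is not shown.

```python
import hashlib
import struct

CROSSCERT_PREFIX = b"Tor TLS RSA/Ed25519 cross-certificate"
FIXED = struct.Struct(">32sIB")  # ED25519_KEY, EXPIRATION_DATE, SIGLEN
SIGNED_PART_LEN = 32 + 4         # assumed: SIGLEN is not part of the hash


def parse_rsa_crosscert(raw: bytes):
    """Return (ed25519_key, expiration_hours, signature, digest_to_verify)."""
    if len(raw) < FIXED.size:
        raise ValueError("cross-certificate too short")
    ed_key, expiration_hours, siglen = FIXED.unpack_from(raw, 0)
    signature = raw[FIXED.size:]
    if len(signature) != siglen:
        raise ValueError("SIGLEN does not match the signature length")
    digest = hashlib.sha256(CROSSCERT_PREFIX + raw[:SIGNED_PART_LEN]).digest()
    return ed_key, expiration_hours, signature, digest
```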

   Relays with Ed25519 keys should always send these certificate types
   in addition to their other certificate types.

   Non-bridge relays with Ed25519 keys should generate TLS link keys of
   appropriate strength, so that the certificate chain from the Ed25519
   key to the link key is strong enough.


   We add a new authentication type for AUTHENTICATE cells:
   "Ed25519-TLSSecret", with AuthType value 2. Its format is the same as
   "RSA-SHA256-TLSSecret", except that the CID and SID fields support
   more key types; some strings are different, and the signature is
   performed with Ed25519 using the authentication key from a type-6
   cert.  Clients can send this AUTHENTICATE type if the server
   lists it in its AUTH_CHALLENGE cell.

   Modified values and new fields below are marked with asterisks.

       TYPE: The characters "AUTH0002"* [8 octets]
       CID: A SHA256 hash of the initiator's RSA1024 identity key [32 octets]
       SID: A SHA256 hash of the responder's RSA1024 identity key [32 octets]
       *CID_ED: The initiator's Ed25519 identity key [32 octets]
       *SID_ED: The responder's Ed25519 identity key, or all-zero. [32 octets]
       SLOG: A SHA256 hash of all bytes sent from the responder to the
         initiator as part of the negotiation up to and including the
         AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERTS cell,
         the AUTH_CHALLENGE cell, and any padding cells.  [32 octets]
       CLOG: A SHA256 hash of all bytes sent from the initiator to the
         responder as part of the negotiation so far; that is, the
         VERSIONS cell and the CERTS cell and any padding cells. [32
         octets]
       SCERT: A SHA256 hash of the responder's TLS link certificate. [32
         octets]
       TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the
         secret key, of the following:
           - client_random, as sent in the TLS Client Hello
           - server_random, as sent in the TLS Server Hello
           - the NUL terminated ASCII string:
             "Tor V3 handshake TLS cross-certification with Ed25519"*
          [32 octets]
       RAND: A 24 byte value, randomly chosen by the initiator. [24 octets]
       *SIG: A signature of all previous fields using the initiator's
          Ed25519 authentication key.
          [variable length]
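
   The field list above can be sketched in Python as follows.  This
   only assembles the bytes that SIG would cover; the Ed25519 signing
   step is left out.  The function and parameter names are invented
   for the example, and the RSA keys and link certificate are assumed
   to be passed in already encoded as bytes.

```python
import hashlib
import hmac
import os

AUTH_TYPE = b"AUTH0002"
TLS_LABEL = b"Tor V3 handshake TLS cross-certification with Ed25519\x00"


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def tls_secrets(master_secret, client_random, server_random):
    """TLSSECRETS: HMAC-SHA256 keyed with the TLS master secret."""
    message = client_random + server_random + TLS_LABEL
    return hmac.new(master_secret, message, hashlib.sha256).digest()


def auth_body_to_sign(initiator_rsa_key, responder_rsa_key, initiator_ed_id,
                      responder_ed_id, responder_log, initiator_log,
                      responder_link_cert, master_secret, client_random,
                      server_random, rand=None):
    """Concatenate every field that precedes SIG."""
    if responder_ed_id is None:
        responder_ed_id = bytes(32)      # SID_ED may be all-zero
    if rand is None:
        rand = os.urandom(24)            # RAND
    if len(initiator_ed_id) != 32 or len(responder_ed_id) != 32:
        raise ValueError("Ed25519 identity keys are 32 bytes")
    if len(rand) != 24:
        raise ValueError("RAND is 24 bytes")
    return b"".join([
        AUTH_TYPE,                       # TYPE
        sha256(initiator_rsa_key),       # CID
        sha256(responder_rsa_key),       # SID
        initiator_ed_id,                 # CID_ED
        responder_ed_id,                 # SID_ED
        sha256(responder_log),           # SLOG
        sha256(initiator_log),           # CLOG
        sha256(responder_link_cert),     # SCERT
        tls_secrets(master_secret, client_random, server_random),
        rand,
    ])
```

   The result is 288 bytes long: 8 for TYPE, eight 32-byte fields, and
   24 for RAND.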

   If you've got a consensus that lists an ECC key for a node, but the
   node doesn't give you an ECC key, then refuse this connection.

5. The extend protocol

   We add a new NSPEC node specifier for use in EXTEND2 cells, with
   LSTYPE value [03].  Its length must be 32 bytes; its content is the
   Ed25519 identity key of the target node.

   Clients should use this type only when:
     * They know an Ed25519 identity key for the destination node.
     * The source node supports EXTEND2 cells.
     * A torrc option is set, _or_ a consensus value is set.

   We'll leave the consensus value off for a while until more clients
   support this, and then turn it on.
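
   A short Python sketch of the new specifier and of the conditions
   for sending it.  The one-byte type, one-byte length, then data
   layout is the EXTEND2 link specifier encoding from tor-spec as I
   understand it; the function names are made up for this example.

```python
LSTYPE_ED25519_ID = 0x03


def ed25519_link_specifier(ed_id: bytes) -> bytes:
    """Encode LSTYPE, LSLEN, and the 32-byte Ed25519 identity key."""
    if len(ed_id) != 32:
        raise ValueError("Ed25519 identity key must be 32 bytes")
    return bytes([LSTYPE_ED25519_ID, len(ed_id)]) + ed_id


def should_send_ed25519_specifier(known_ed_id, source_supports_extend2,
                                  torrc_enabled, consensus_enabled) -> bool:
    """All three conditions from the list above must hold."""
    return bool(known_ed_id is not None
                and source_supports_extend2
                and (torrc_enabled or consensus_enabled))
```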

   When picking a channel for a circuit, if this NSPEC value is
   provided, then the RSA identity *and* the Ed25519 identity must
   match.

   If we have a channel with a given Ed25519 ID and RSA identity, and we
   have a request for that Ed25519 ID and a different RSA identity, we
   do not attempt to make another connection: we just fail and DESTROY
   the circuit.

   If we receive an EXTEND or EXTEND2 request for a node listed in the
   consensus, but that EXTEND/EXTEND2 request does not include an
   Ed25519 identity key, the node SHOULD treat the connection as failed
   if the Ed25519 identity key it receives does not match the one in the
   consensus.

   For testing, clients may have the ability to configure whether to
   include Ed25519 identities in EXTEND2 cells.  By default, this should
   be governed by the boolean "ExtendByEd25519ID" consensus parameter,
   with default value '0'.

6. Naming nodes in the interface

   Anywhere in the interface that takes an $identity should be able to
   take an ECC identity too.  ECC identities are case-sensitive base64
   encodings of Ed25519 identity keys. You can use $ to indicate them as
   well; we distinguish RSA identity digests by length.

   When we need to indicate an Ed25519 identity key in a hostname
   format (as in a .exit address), we use the lowercased version of the
   name, and perform a case-insensitive match.  (This loses us a little
   less than one bit per byte of name, leaving plenty of bits to make
   sure we choose the right node.)

   Nodes must not list Ed25519 identities in their family lines; clients and
   authorities must not honor them there.  (Doing so would make different
   clients change paths differently in a possibly manipulatable way.)

   Clients shouldn't accept .exit addresses with Ed25519 names on SOCKS
   or DNS ports by default, even when AllowDotExit is set.  We can add
   another option for them later if there's a good reason to have this.

   We need an identity-to-node map for ECC identity and for RSA
   identity.

   The controller interface will need to accept and report Ed25519
   identity keys as well as (or instead of) RSA identity keys.  That's a
   separate proposal, though.

7. Hidden service changes out of scope

   Hidden services need to be able to identify nodes by ECC keys, just as
   they will need to include ntor keys as well as TAP keys.  Not just
   yet though.  This needs to be part of a bigger hidden service
   revamping strategy.

8. Proposed migration steps

   Once a few versions have shipped with Ed25519 key support, turn on
   "Rule 1" on the authorities.  (Don't allow an Ed25519<->RSA pairing
   to change.)

   Once the release with these changes is in beta or rc, turn on the
   consensus option for everyone who receives descriptors with
   Ed25519 identity keys to check them.

   Once the release with these changes is in beta or rc, turn on the
   consensus option for clients to generate EXTEND2 requests with
   Ed25519 identity keys.

   Once the release with these changes has been stable for a month
   or two, turn on "Rule 2" on authorities.  (Don't allow nodes that
   have advertised an Ed25519 key to stop.)

9. Future proposals

   * Ed25519 identity support on the controller interface
   * Supporting nodes without RSA keys
   * Remove support for nodes without Ed25519 keys
   * Ed25519 support for hidden services
   * Bridge identity support
   * Ed25519-aware family support

A.1. List of certificate types

   The values marked with asterisks are not types corresponding to
   the certificate format of section 2.1.  Instead, they are
   reserved for RSA-signed certificates to avoid conflicts between
   the certificate type enumeration of the CERTS cell and the
   certificate type enumeration of our Ed25519 certificates.


   **[00],[01],[02],[03] - Reserved to avoid conflict with types used
          in CERTS cells.

   [04] - signing a signing key with an identity key (Section 2.5)

   [05] - TLS link certificate signed with ed25519 signing key
         (Section 4.2)

   [06] - Ed25519 authentication key signed with ed25519 signing key
          (Section 4.2)

   **[07] - reserved for RSA identity cross-certification (Section 4.2)


A.2. List of extension types


   [04] - signed-with-ed25519-key (section 2.2.1)

A.3. List of signature prefixes

   We describe various documents as being signed with a prefix. Here
   are those prefixes:

      "Tor router descriptor signature v1" (section 2.5)
      "Tor node signing key certificate v1" (section 2.1)

A.4. List of certified key types

   [01] ed25519 key
   [02] SHA256 hash of an RSA key
   [03] SHA256 hash of an X.509 certificate

A.5. Reserved numbers

   We need a new consensus algorithm number to encompass checking
   ed25519 keys and putting them in microdescriptors.


   We need new CertType values for use in CERTS cells.  We reserved
   the following in section 4.2:

        4: Ed25519 signing key
        5: Link key certificate certified by Ed25519 signing key
        6: TLS authentication key certified by Ed25519 signing key
        7: RSA cross-certificate for Ed25519 identity key


A.6. Related changes

   As we merge this proposal, we should also extend link key size to
   2048 bits, and use SHA256 as the x509 cert algorithm for our link
   keys. This will improve link security, and deliver better
   fingerprinting resistance.  See proposal 179 for an older discussion
   of this issue.
Filename: 221-stop-using-create-fast.txt
Title: Stop using CREATE_FAST
Authors: Nick Mathewson
Created: 12 August 2013
Target: 0.2.5.x
Status: Closed

0. Summary

   I propose that in 0.2.5.x, Tor clients stop sending CREATE_FAST
   cells, and use CREATE or CREATE2 cells instead as appropriate.

1. Introduction

   The CREATE_FAST cell was created to avoid the performance hit of
   using the TAP handshake on a TLS session that already provided what
   TAP provided: authentication with RSA1024 and forward secrecy with
   DH1024.  But thanks to the introduction of the ntor onionskin
   handshake in Tor 0.2.4.x, for nodes with older versions of OpenSSL,
   the TLS handshake strength lags behind the strength of the onion
   handshake, and the arguments against CREATE no longer apply.

   Similarly, it's good to have an argument for circuit security that
   survives possible breakdowns in TLS. But when CREATE_FAST is in use,
   this is impossible: we can only argue forward-secrecy at the first
   hop of each circuit by assuming that TLS has succeeded.

   So let's simply stop sending CREATE_FAST cells.

2. Proposed design

   Currently, only clients will send CREATE_FAST, and only when they
   have FastFirstHopPK set to its default value, 1.

   I propose that we change "FastFirstHopPK" from a boolean to also
   allow a new default "auto" value that tells Tor to take a value from
   the consensus.  I propose a new consensus parameter, "usecreatefast",
   default value taken to be 1.

   Once enough versions of Tor support this proposal, the authorities
   should set the value for "usecreatefast" to be 0.

   In the series after that (0.2.6.x?), the default value for
   "FastFirstHopPK" should be 0.

   (Note that CREATE_FAST must still be used in the case where a client
   has connected to a guard node or bridge without knowing any onion
   keys for it, and wants to fetch directory information from it.)
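
   The decision described in this section can be sketched as below.
   This is a minimal illustration only: the function name, the
   argument names, and the representation of the option as 0, 1, or
   "auto" are assumptions made for the example, not Tor's actual code.

```python
from typing import Mapping, Union


def should_send_create_fast(
    fast_first_hop_pk: Union[int, str],
    consensus_params: Mapping[str, int],
    have_onion_key: bool,
) -> bool:
    """Decide whether a client's first-hop handshake uses CREATE_FAST.

    Hypothetical helper: fast_first_hop_pk is 0, 1, or "auto".
    """
    # Without an onion key for the first hop, CREATE/CREATE2 cannot be
    # used, so CREATE_FAST remains the only option.
    if not have_onion_key:
        return True
    if fast_first_hop_pk == "auto":
        # "usecreatefast" is taken to be 1 when the consensus omits it.
        return consensus_params.get("usecreatefast", 1) != 0
    return int(fast_first_hop_pk) != 0
```

   With the option left at "auto", the authorities can switch clients
   away from CREATE_FAST by publishing "usecreatefast=0", without a new
   client release.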

3. Alternative designs

   We might make some choices to preserve CREATE_FAST under some
   circumstances.  For example, we could say that CREATE_FAST is okay if
   we have a TLS connection with a cipher, public key, and ephemeral key
   algorithm of a given strength.

   We might try to trust the TLS handshake for authentication but not
   forward secrecy, and come up with a first-hop handshake that did a
   simple curve25519 diffie-hellman.

   We might use CREATE_FAST only whenever ntor is not available.

   I'm rejecting all of the above for complexity reasons.

   We might just change the default for FastFirstHopPK to 0 in
   0.2.5.x-alpha.  But that would make early users of that alpha easy
   for their guards to distinguish.

4. Performance considerations

   This will increase the CPU requirements on guard nodes; their
   cpuworkers would become more heavily loaded as 0.2.5.x is more
   widely adopted.

   I believe that, if guards upgrade to 0.2.4.x as 0.2.5.x is under
   development, the commensurate benefits of ntor will outweigh the
   problems here.  This holds even more if we wind up with a better ntor
   implementation or replacement.

5. Considerations on client detection

   Right now, in a few places, Tor nodes assume that any connection on
   which they have received a CREATE_FAST cell is probably from a
   non-relay node, since relays never do that.  Implementing this
   proposal would make that signal unreliable.

   We should do this proposal anyway.  CREATE_FAST has never been a
   reliable signal, since "FastFirstHopPK 0" is easy enough to type, and
   the source code is easy enough to edit.  Proposal 163 and its
   successors have better ideas here anyway.
Filename: 222-remove-client-timestamps.txt
Title: Stop sending client timestamps
Authors: Nick Mathewson
Created: 22 August 2013
Status: Closed
Implemented-In: 0.2.4.18

0. Summary

   There are a few places in Tor where clients and servers send
   timestamps.  I list them and discuss how to eliminate them.

1. Introduction

   Despite this late date, many hosts aren't running NTP and
   don't have very well synchronized clocks. Even more hosts
   aren't running a secure NTP; it's probably easy to
   desynchronize target hosts.

   Given all of this, it's probably a fingerprinting opportunity
   whenever clients send their view of the current time.
   Let's try to avoid that.

   I'm also going to list the places where servers send their
   view of the current time, and propose that we eliminate some
   of those.

   Scope: This proposal is about eliminating passive timestamp
   exposure, not about tricky active detection mechanisms where
   you do something like offering a client a large number of
   about-to-expire/just-expired certificates to see which ones
   they accept.

2. The Tor link protocol

2.1. NETINFO (client and server)

   NETINFO cells specify that both parties include a 4-byte
   timestamp.

   Instead, let's say that clients should set this timestamp to
   0.  Nothing currently looks at a client's setting for this
   field, so this change should be safe.

2.2. AUTHENTICATE (server)

   The AUTHENTICATE cell is not ordinarily sent by clients. It
   contains an 8-byte timestamp and a 16-byte random value.
   Instead, let's just send a 24-byte random value.

   (An earlier version of this proposal suggested that we replace
   them both with a 24-byte (truncated) HMAC of the current time,
   using a random key, in an attempt to retain the allegedly
   desirable property of avoiding nonce duplication in the event of
   a bad RNG. But really, a Tor process with a bad RNG is not going
   to get security in any case, so let's KISS.)

2.3. TLS

2.3.1. ClientRandom in the TLS handshake

   See TLS proposal in appendix A.

   This presents a TLS fingerprinting/censorship opportunity. I
   propose that we investigate whether "random" or "zero" is
   more common on the wire, choose that, and lobby for changes to
   TLS implementations.

2.3.2. Certificate validity intervals

   Servers use the current time in setting certificate validity
   for their initial certificates.  They randomize this value
   somewhat.  I propose that we don't change this, since it's a
   server-only issue, and already somewhat mitigated.

3. Directory protocol

3.1. Published

  This field in descriptors is generated by servers only; I
  propose no change.

3.2. The Date header

  This HTTP header is sent by directory servers only; I propose
  no change.

4. The hidden service protocol

4.1. Descriptor publication time

  Hidden service descriptors include a publication time.  I
  propose that we round this time down to the nearest N minutes,
  where N=60.
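
  As a small illustration of the rounding above (the function name is
  made up for this example, and times are assumed to be integer Unix
  timestamps in seconds):

```python
def round_down_to_minutes(unix_time: int, n_minutes: int = 60) -> int:
    """Round a Unix timestamp down to a multiple of n_minutes minutes."""
    granularity = n_minutes * 60  # seconds per rounding bucket
    return unix_time - (unix_time % granularity)
```

  With N=60, every descriptor published within the same hour carries
  the same publication time.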

4.2. INTRODUCE2 cell timestamp

  INTRODUCE2 cells once limited the duration of their replay
  caches by including a timestamp in the INTRODUCE2 cells.  Since
  0.2.3.9-alpha, this timestamp is ignored, and key lifetime is
  used instead.

  When we determine that no hidden services are running on
  0.2.2.x (and really, no hidden services should be running on
  0.2.2.x!), we can simply send 0 instead.  (See ticket #7803).

  We can control this behavior with a consensus parameter
  (Support022HiddenServices) and a tristate (0/1/auto) torrc option of
  the same name.

  When the timestamp is not completely disabled, it should be
  rounded to the closest 10 minutes.

  I claim this would be suitable for backport to 0.2.4.
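
  The INTRODUCE2 behavior above might look roughly like the following
  sketch.  The function name, the representation of the torrc option
  as 0, 1, or "auto", and the fallback of 1 when the consensus
  parameter is absent are all assumptions made for illustration; the
  proposal does not specify that default.

```python
from typing import Mapping, Union

TEN_MINUTES = 10 * 60  # seconds


def introduce2_timestamp(
    now: int,
    support_022: Union[int, str],
    consensus_params: Mapping[str, int],
) -> int:
    """Pick the timestamp value to place in an INTRODUCE2 cell."""
    if support_022 == "auto":
        # Assumed default (not from the proposal): keep supporting
        # 0.2.2.x services unless the consensus says otherwise.
        enabled = consensus_params.get("Support022HiddenServices", 1) != 0
    else:
        enabled = int(support_022) != 0
    if not enabled:
        return 0
    # Round to the closest multiple of ten minutes (ties round up).
    return ((now + TEN_MINUTES // 2) // TEN_MINUTES) * TEN_MINUTES
```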

5. The application layer

  The application layer is mostly out of scope for this proposal,
  except:

  TorBrowser already (I hear) drops the timestamp from the
  ClientRandom field in TLS.  We should encourage other TLS
  applications to do so.  (See Appendix A.)



=================================================================
APPENDIX A:  "Let's replace gmt_unix_time in TLS"

PROBLEM:

The gmt_unix_time field in the Random field in the TLS handshake
provides a way for an observer to fingerprint clients.

Despite the late date, much of the world is still not
synchronized to the second via an ntp-like service. This means
that different clients have different views of the current time,
which provides a fingerprint that helps to track and distinguish
them.  This fingerprint is useful for tracking clients as they
move around.  It can also distinguish clients using a single VPN,
NAT, or privacy network.  (Tor's modified Firefox avoids this by
not sending the time.)

Worse, some implementations don't send the current time, but the
process time, or the computer's uptime, both of which are far
more distinguishing than the current time() value.

The information fingerprint here is strong enough to uniquely
identify some TLS users (the ones whose clocks are hours off).
Even for the ones whose clocks are mostly right (within a second
or two), the field leaks a bit of information, and it only takes
so many bits to make a user unique.


WHY gmt_unix_time IN THE FIRST PLACE?

According to third-hand reports (and correct me if I'm wrong!),
it was introduced in SSL 3.0 to prevent complete failure in cases
where the PRNG was completely broken, by making a part of the
Random field that would definitely vary between TLS handshakes.

I doubt that this goal is really achieved: on modern desktop
environments, it's not really so strange to start two TLS
connections within the same second.

WHY ELSE IS gmt_unix_time USED?

The consensus among implementors seems to be that it's unwise to
depend on any particular value or interpretation for the field.
The TLS 1.2 standard, RFC 5246, says that "Clocks are not
required to be set correctly by the basic TLS protocol;
higher-level or application protocols may define additional
requirements."

Some implementations set the entire field randomly; this appears
not to have broken TLS on the internet.

At least one tool (tlsdate) uses the server-side value of the
field as an authenticated view of the current time.



PROPOSAL 1:

Declare that implementations MAY replace gmt_unix_time either
with four more random bytes, or four bytes of zeroes.

Make your implementation just do that.

(Rationale: some implementations (like TorBrowser) are already
doing this in practice.  It's sensible and simple.  You're
unlikely to mess it up, or cause trouble.)



PROPOSAL 2:

Okay, if you really want to preserve the security allegedly
provided by gmt_unix_time, allow the following approach instead:

Set the Random field, not to 32 bytes from your PRNG, but to the
HMAC-SHA256 of any high resolution timer that you have, using 32
bytes from your PRNG as a key.  In other words, replace this:

   Random.gmt_unix_time = time();
   Random.random_bytes = get_random_bytes(28);

with this:

   now = hires_time(); // clock_gettime(), or concatenate time()
                       // with a CPU timer, or process
                       // uptime, or whatever.
   key = get_random_bytes(32);
   Random = hmac_sha256(key, now);

This approach is better than the status quo on the following
counts:

   * It doesn't leak your view of the current time, assuming that
     your PRNG isn't busted.

   * It actually fixes the problem that gmt_unix_time purported to
     fix, by using a high-resolution time that's much less likely to
     be used twice.  Even if the PRNG is broken, the value is still
     nonrepeating.

It is not worse than the status quo:

   * It is unpredictable from an attacker's POV, assuming that the
     PRNG works.  (Because an HMAC, even of known data, with an
     unknown random key is supposed to look random).
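
For concreteness, here is one way the Proposal 2 construction could
look using only the Python standard library.  It is a sketch, not code
from any TLS implementation: the function name is hypothetical, and
combining time.time_ns() with time.perf_counter_ns() is just one
possible choice of "high resolution timer".

```python
import hashlib
import hmac
import os
import time


def tls_random_from_hmac() -> bytes:
    """Build a 32-byte TLS Random as HMAC-SHA256(key, hires_time)."""
    # Wall-clock nanoseconds concatenated with a monotonic counter;
    # any high-resolution timer source would do here.
    now = (time.time_ns().to_bytes(8, "big")
           + time.perf_counter_ns().to_bytes(8, "big"))
    key = os.urandom(32)  # fresh random key for every handshake
    return hmac.new(key, now, hashlib.sha256).digest()
```

The output is always 32 bytes, so it fills the whole Random field,
replacing both gmt_unix_time and random_bytes.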


CONSIDERATIONS:

I'd personally suggest proposal 1 (just set the field at random) for
most users.  Yes, it makes things a little worse if your PRNG can
generate repeat values... but nearly everything in cryptography
fails if your PRNG is broken.


You might want to apply this fix on clients only.  With a few
exceptions (like hidden services) the server's view of the current
time is not sensitive.


Implementors might want to make this feature optional and
on-by-default, just in case some higher-level application protocol
really does depend on it.
==================================================================
Filename: 223-ace-handshake.txt
Title: Ace: Improved circuit-creation key exchange
Author: Esfandiar Mohammadi, Aniket Kate, Michael Backes
Created: 22-July-2013
Status: Reserve

History:

   22-July-2013 -- Submitted
   20-Nov-2013  -- Reformatted slightly, wrapped lines, added
                   references, adjusted the KDF [nickm]
   20-Nov-2013  -- Clarified that there's only one group here [nickm]

Summary:

This is an attempt to translate the proposed circuit handshake from
"Ace: An Efficient Key-Exchange Protocol for Onion Routing" by
Backes, Kate, and Mohammadi into a Tor proposal format.

The specification assumes an implementation of scalar multiplication
and addition of two curve elements, as in Robert Ransom's celator
library.

Notation:

  Let a|b be the concatenation of a with b.

  Let H(x,t) be a tweakable hash function of output width H_LENGTH
  bytes.

  Let t_mac, t_key, and t_verify be a set of arbitrarily-chosen
  tweaks for the hash function.

  Let EXP(a,b) be a^b in some appropriate group G where the
  appropriate DH parameters hold.  Let's say elements of this group,
  when represented as byte strings, are all G_LENGTH bytes long.
  Let's say we are using a generator g for this group.

  Let MUTLIEXPONEN(a,b,c,d) be (a^b)*(c^d) in the same group G as above.

  Let PROTOID be a string designating this variant of the protocol.

  Let KEYID be a collision-resistant (but not necessarily preimage-resistant)
     hash function on members of G, of output length H_LENGTH bytes.

Instantiation:

  Let's call this PROTOID "ace-curve25519-ed-uncompressed-sha256-1"

  Set H(x,t) == HMAC_SHA256 with message x and key t. So H_LENGTH == 32.
  Set t_mac   == PROTOID | ":mac"
      t_key  == PROTOID | ":key"
      t_verify  == PROTOID | ":verify"
  Set EXP(a,b) == scalar_mult_curve25519(a,b),
      MUTLIEXPONEN(a,b,c,d) == dblscalarmult_curve25519(a,b,c,d), and g == 9 .

  Set KEYID(B) == B.  (We don't need to use a hash function here, since our
     keys are already very short.  It is trivially collision-resistant, since
     KEYID(A)==KEYID(B) iff A==B.)

Protocol:

  Take a router with identity key digest ID.

  As setup, the router generates a secret key b, and a public onion key
  B = EXP(g,b).  The router publishes B in its server descriptor.

  To send a create cell, the client generates two keypairs of x_1,
  X_1=EXP(g,x_1) and x_2, X_2=EXP(g,x_2) and sends a CREATE cell
  with contents:

    NODEID:     ID             -- H_LENGTH bytes
    KEYID:      KEYID(B)       -- H_LENGTH bytes
    CLIENT_PK:  X_1, X_2           -- 2 x G_LENGTH bytes

  The server checks X_1, X_2, generates a keypair of y, Y=EXP(g,y)
  and computes

    point = MUTLIEXPONEN(X_1,y,X_2,b)
    secret_input = point | ID | B | X_1 | X_2 | Y | PROTOID
    KEY_SEED = H(secret_input | "Key Seed", t_key)
    KEY_VERIFY = H(secret_input | "HMac Seed", t_verify)
    auth_input = ID | B | Y | X_1 | X_2 | PROTOID | "Server"

  The server sends a CREATED cell containing:

    SERVER_PK:  Y                     -- G_LENGTH bytes
    AUTH:       H(auth_input, KEY_VERIFY)  -- H_LENGTH bytes

  The client then checks Y, and computes

    point = MUTLIEXPONEN(Y,x_1,B,x_2)
    secret_input = point | ID | B | X_1 | X_2 | Y | PROTOID
    KEY_SEED = H(secret_input | "Key Seed", t_key)
    KEY_VERIFY = H(secret_input | "HMac Seed", t_verify)
    auth_input = ID | B | Y | X_1 | X_2 | PROTOID | "Server"

    The client verifies that AUTH == H(auth_input, KEY_VERIFY).

  Both parties now have a shared value for KEY_SEED.  They expand
  this into the keys needed for the Tor relay protocol.
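
  To see why both sides arrive at the same KEY_SEED, note that
  (X_1^y)*(X_2^b) and (Y^x_1)*(B^x_2) both equal g^(x_1*y + x_2*b).
  The sketch below runs the computation end to end in a toy group
  (integers modulo a Mersenne prime) instead of curve25519, so it is
  neither secure nor wire-compatible; the group, the PROTOID string,
  and all function names are assumptions chosen only to make the
  algebra runnable.  The validity checks on X_1, X_2 and Y are omitted.

```python
import hashlib
import hmac
import secrets

# Toy group: multiplication modulo the Mersenne prime 2^127 - 1.
# NOT curve25519 and NOT secure; for illustrating the algebra only.
P = 2**127 - 1
G = 3
G_LENGTH = 16
PROTOID = b"ace-toy-modp-sha256-1"  # hypothetical PROTOID for the toy group
T_KEY = PROTOID + b":key"
T_VERIFY = PROTOID + b":verify"


def H(x: bytes, t: bytes) -> bytes:
    """H(x,t): HMAC-SHA256 with message x and key t."""
    return hmac.new(t, x, hashlib.sha256).digest()


def multiexp(a: int, b: int, c: int, d: int) -> int:
    """(a^b)*(c^d) in the toy group; the text calls this MUTLIEXPONEN."""
    return (pow(a, b, P) * pow(c, d, P)) % P


def enc(element: int) -> bytes:
    """Encode a group element as G_LENGTH big-endian bytes."""
    return element.to_bytes(G_LENGTH, "big")


def keygen():
    """Return (secret, public) with public = g^secret."""
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)


def derive(point, ID, B, X_1, X_2, Y):
    """Compute (KEY_SEED, AUTH) from the shared point, as either party."""
    secret_input = (enc(point) + ID + enc(B) + enc(X_1) + enc(X_2)
                    + enc(Y) + PROTOID)
    key_seed = H(secret_input + b"Key Seed", T_KEY)
    key_verify = H(secret_input + b"HMac Seed", T_VERIFY)
    auth_input = (ID + enc(B) + enc(Y) + enc(X_1) + enc(X_2)
                  + PROTOID + b"Server")
    return key_seed, H(auth_input, key_verify)


def handshake(ID: bytes):
    """Run both sides of the handshake; return (client_seed, server_seed)."""
    b, B = keygen()        # router onion key
    x_1, X_1 = keygen()    # client ephemeral keys
    x_2, X_2 = keygen()
    y, Y = keygen()        # server ephemeral key
    server_seed, auth = derive(multiexp(X_1, y, X_2, b), ID, B, X_1, X_2, Y)
    client_seed, expected = derive(multiexp(Y, x_1, B, x_2), ID, B, X_1, X_2, Y)
    if not hmac.compare_digest(auth, expected):
        raise ValueError("AUTH check failed")
    return client_seed, server_seed
```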

Key expansion:

  When using this handshake, clients and servers should expand keys
  using HKDF as with the ntor handshake today.
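
  A rough sketch of such an expansion follows, using the HKDF-Expand
  step of RFC 5869 with KEY_SEED as the pseudorandom key.  The function
  name and the "info" string passed by the caller are assumptions; this
  section does not fix them, and a real implementation should take them
  from the ntor key-derivation text in tor-spec.txt.

```python
import hashlib
import hmac


def expand_key_seed(key_seed: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256, using key_seed as the PRK."""
    if length > 255 * 32:
        raise ValueError("HKDF-SHA256 can produce at most 8160 bytes")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(key_seed, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]
```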

See also:

  http://www.infsec.cs.uni-saarland.de/~mohammadi/ace/ace.html
  for implementations, academic paper, and benchmarking code.

Filename: 224-rend-spec-ng.txt
Title: Next-Generation Hidden Services in Tor
Author: David Goulet, George Kadianakis, Nick Mathewson
Created: 2013-11-29
Status: Closed
Implemented-In: 0.3.2.1-alpha

Table of contents:

    0. Hidden services: overview and preliminaries.
        0.1. Improvements over previous versions.
        0.2. Notation and vocabulary
        0.3. Cryptographic building blocks
        0.4. Protocol building blocks [BUILDING-BLOCKS]
        0.5. Assigned relay cell types
        0.6. Acknowledgments
    1. Protocol overview
        1.1. View from 10,000 feet
        1.2. In more detail: naming hidden services [NAMING]
        1.3. In more detail: Access control [IMD:AC]
        1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST]
        1.5. In more detail: Scaling to multiple hosts
        1.6. In more detail: Backward compatibility with older hidden service
        1.7. In more detail: Keeping crypto keys offline
        1.8. In more detail: Encryption Keys And Replay Resistance
        1.9. In more detail: A menagerie of keys
            1.9.1. In even more detail: Client authorization [CLIENT-AUTH]
    2. Generating and publishing hidden service descriptors [HSDIR]
        2.1. Deriving blinded keys and subcredentials [SUBCRED]
        2.2. Locating, uploading, and downloading hidden service descriptors
            2.2.1. Dividing time into periods [TIME-PERIODS]
            2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC]
            2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC]
            2.2.4. Using time periods and SRVs to fetch/upload HS descriptors
            2.2.5. Expiring hidden service descriptors [EXPIRE-DESC]
            2.2.6. URLs for anonymous uploading and downloading
        2.3. Publishing shared random values [PUB-SHAREDRANDOM]
            2.3.1. Client behavior in the absence of shared random values
            2.3.2. Hidden services and changing shared random values
        2.4. Hidden service descriptors: outer wrapper [DESC-OUTER]
        2.5. Hidden service descriptors: encryption format [HS-DESC-ENC]
            2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER]
                2.5.1.1. First layer encryption logic
                2.5.1.2. First layer plaintext format
                2.5.1.3. Client behavior
                2.5.1.4. Obfuscating the number of authorized clients
            2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER]
                2.5.2.1. Second layer encryption keys
                2.5.2.2. Second layer plaintext format
            2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS]
    3. The introduction protocol [INTRO-PROTOCOL]
        3.1. Registering an introduction point [REG_INTRO_POINT]
            3.1.1. Extensible ESTABLISH_INTRO protocol. [EST_INTRO]
            3.1.2. Registering an introduction point on a legacy Tor node [LEGACY_EST_INTRO]
            3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED]
        3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1]
            3.2.1. INTRODUCE1 cell format [FMT_INTRO1]
            3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK]
        3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2]
            3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS]
            3.3.2. Example encryption handshake: ntor with extra data [NTOR-WITH-EXTRA-DATA]
        3.4. Authentication during the introduction phase. [INTRO-AUTH]
            3.4.1. Ed25519-based authentication.
    4. The rendezvous protocol
        4.1. Establishing a rendezvous point [EST_REND_POINT]
        4.2. Joining to a rendezvous point [JOIN_REND]
            4.2.1. Key expansion
        4.3. Using legacy hosts as rendezvous points
    5. Encrypting data between client and host
    6. Encoding onion addresses [ONIONADDRESS]
    7. Open Questions:

-1. Draft notes

   This document describes a proposed design and specification for
   hidden services in Tor version 0.2.5.x or later. It's a replacement
   for the current rend-spec.txt, rewritten for clarity and for improved
   design.

   Look for the string "TODO" below: it describes gaps or uncertainties
   in the design.

   Change history:

       2013-11-29: Proposal first numbered. Some TODO and XXX items remain.

       2014-01-04: Clarify some unclear sections.

       2014-01-21: Fix a typo.

       2014-02-20: Move more things to the revised certificate format in the
           new updated proposal 220.

       2015-05-26: Fix two typos.


0. Hidden services: overview and preliminaries.

   Hidden services aim to provide responder anonymity for bidirectional
   stream-based communication on the Tor network. Unlike regular Tor
   connections, where the connection initiator receives anonymity but
   the responder does not, hidden services attempt to provide
   bidirectional anonymity.

   Participants:

      Operator -- A person running a hidden service

      Host, "Server" -- The Tor software run by the operator to provide
         a hidden service.

      User -- A person contacting a hidden service.

      Client -- The Tor software running on the User's computer

      Hidden Service Directory (HSDir) -- A Tor node that hosts signed
        statements from hidden service hosts so that users can make
        contact with them.

      Introduction Point -- A Tor node that accepts connection requests
        for hidden services and anonymously relays those requests to the
        hidden service.

      Rendezvous Point -- A Tor node to which clients and servers
        connect and which relays traffic between them.

0.1. Improvements over previous versions.

   Here is a list of improvements of this proposal over the legacy hidden
   services:

   a) Better crypto (replaced SHA1/DH/RSA1024 with SHA3/ed25519/curve25519)
   b) Improved directory protocol leaking less to directory servers.
   c) Improved directory protocol with smaller surface for targeted attacks.
   d) Better onion address security against impersonation.
   e) More extensible introduction/rendezvous protocol.
   f) Offline keys for onion services
   g) Advanced client authorization

0.2. Notation and vocabulary

   Unless specified otherwise, all multi-octet integers are big-endian.

   We write sequences of bytes in two ways:

     1. A sequence of two-digit hexadecimal values in square brackets,
        as in [AB AD 1D EA].

     2. A string of characters enclosed in quotes, as in "Hello". The
        characters in these strings are encoded in their ASCII
        representations; strings are NOT NUL-terminated unless
        explicitly described as NUL-terminated.

   We use the words "byte" and "octet" interchangeably.

   We use the vertical bar | to denote concatenation.

   We use INT_N(val) to denote the network (big-endian) encoding of the
   unsigned integer "val" in N bytes. For example, INT_4(1337) is [00 00
   05 39]. Values are truncated like so: val % (2 ^ (N * 8)). For example,
   INT_4(42) is 42 % 4294967296 (32 bit).
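
   The INT_N encoding can be written in a few lines.  This is only an
   illustration of the definition above; the Python function simply
   reuses the name INT_N with the width passed as a second argument.

```python
def INT_N(val: int, n: int) -> bytes:
    """Encode unsigned integer val as n big-endian bytes, truncating."""
    truncated = val % (1 << (n * 8))  # val % (2 ^ (N * 8))
    return truncated.to_bytes(n, "big")
```

   For example, INT_N(1337, 4) yields the bytes [00 00 05 39].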

0.3. Cryptographic building blocks

   This specification uses the following cryptographic building blocks:

      * A pseudorandom number generator backed by a strong entropy source.
        The output of the PRNG should always be hashed before being posted on
        the network to avoid leaking raw PRNG bytes to the network
        (see [PRNG-REFS]).

      * A stream cipher STREAM(iv, k) where iv is a nonce of length
        S_IV_LEN bytes and k is a key of length S_KEY_LEN bytes.

      * A public key signature system SIGN_KEYGEN()->seckey, pubkey;
        SIGN_SIGN(seckey,msg)->sig; and SIGN_CHECK(pubkey, sig, msg) ->
        { "OK", "BAD" }; where secret keys are of length SIGN_SECKEY_LEN
        bytes, public keys are of length SIGN_PUBKEY_LEN bytes, and
        signatures are of length SIGN_SIG_LEN bytes.

        This signature system must also support key blinding operations
        as discussed in appendix [KEYBLIND] and in section [SUBCRED]:
        SIGN_BLIND_SECKEY(seckey, blind)->seckey2 and
        SIGN_BLIND_PUBKEY(pubkey, blind)->pubkey2 .

      * A public key agreement system "PK", providing
        PK_KEYGEN()->seckey, pubkey; PK_VALID(pubkey) -> {"OK", "BAD"};
        and PK_HANDSHAKE(seckey, pubkey)->output; where secret keys are
        of length PK_SECKEY_LEN bytes, public keys are of length
        PK_PUBKEY_LEN bytes, and the handshake produces outputs of
        length PK_OUTPUT_LEN bytes.

      * A cryptographic hash function H(d), which should be preimage and
        collision resistant. It produces hashes of length HASH_LEN
        bytes.

      * A cryptographic message authentication code MAC(key,msg) that
        produces outputs of length MAC_LEN bytes.

      * A key derivation function KDF(message, n) that outputs n bytes.

   As a first pass, I suggest:

      * Instantiate STREAM with AES256-CTR.

      * Instantiate SIGN with Ed25519 and the blinding protocol in
        [KEYBLIND].

      * Instantiate PK with Curve25519.

      * Instantiate H with SHA3-256.

      * Instantiate KDF with SHAKE-256.

      * Instantiate MAC(key=k, message=m) with H(k_len | k | m),
        where k_len is htonll(len(k)).
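
   The suggested instantiations of H, KDF and MAC map directly onto
   Python's hashlib.  The sketch below assumes that htonll produces an
   8-byte big-endian length; the function names simply mirror the
   notation above.

```python
import hashlib


def H(d: bytes) -> bytes:
    """H instantiated with SHA3-256 (HASH_LEN = 32)."""
    return hashlib.sha3_256(d).digest()


def KDF(message: bytes, n: int) -> bytes:
    """KDF instantiated with SHAKE-256, producing n output bytes."""
    return hashlib.shake_256(message).digest(n)


def MAC(key: bytes, msg: bytes) -> bytes:
    """MAC(key=k, message=m) = H(k_len | k | m)."""
    k_len = len(key).to_bytes(8, "big")  # htonll(len(k)), assumed 8 bytes
    return H(k_len + key + msg)
```

   Prefixing the key with its length keeps (k, m) pairs unambiguous:
   moving bytes from the end of the key to the start of the message
   changes the input to H.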

   For legacy purposes, we specify compatibility with older versions of
   the Tor introduction point and rendezvous point protocols. These used
   RSA1024, DH1024, AES128, and SHA1, as discussed in
   rend-spec.txt.

   As in [proposal 220], all signatures are generated not over strings
   themselves, but over those strings prefixed with a distinguishing
   value.

0.4. Protocol building blocks [BUILDING-BLOCKS]

   In sections below, we need to transmit the locations and identities
   of Tor nodes. We do so in the link identification format used by
   EXTEND2 cells in the Tor protocol.

         NSPEC      (Number of link specifiers)   [1 byte]
         NSPEC times:
           LSTYPE (Link specifier type)           [1 byte]
           LSLEN  (Link specifier length)         [1 byte]
           LSPEC  (Link specifier)                [LSLEN bytes]

   Link specifier types are as described in tor-spec.txt. Every set of
   link specifiers MUST include at minimum specifiers of type [00]
   (TLS-over-TCP, IPv4), [02] (legacy node identity) and [03] (ed25519
   identity key).
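
   As an illustration, the format above could be serialized as follows.
   The payload layouts used in the example (4-byte address plus 2-byte
   port for type [00], 20 bytes for type [02], 32 bytes for type [03])
   reflect my reading of tor-spec.txt and should be checked against it;
   the function and constant names are made up for this sketch.

```python
import ipaddress
import struct

LS_IPV4 = 0x00        # TLS-over-TCP, IPv4
LS_LEGACY_ID = 0x02   # legacy node identity
LS_ED25519_ID = 0x03  # ed25519 identity key


def ipv4_specifier(addr: str, port: int):
    """Build an (LSTYPE, LSPEC) pair for an IPv4 address and port."""
    return LS_IPV4, ipaddress.IPv4Address(addr).packed + struct.pack("!H", port)


def encode_link_specifiers(specs) -> bytes:
    """Encode a list of (LSTYPE, LSPEC) pairs as NSPEC | specifiers."""
    if len(specs) > 255:
        raise ValueError("NSPEC must fit in one byte")
    out = bytearray([len(specs)])
    for lstype, lspec in specs:
        if len(lspec) > 255:
            raise ValueError("LSLEN must fit in one byte")
        out += bytes([lstype, len(lspec)]) + lspec
    return bytes(out)
```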

   We also incorporate Tor's circuit extension handshakes, as used in
   the CREATE2 and CREATED2 cells described in tor-spec.txt. In these
   handshakes, a client who knows a public key for a server sends a
   message and receives a message from that server. Once the exchange is
   done, the two parties have a shared set of forward-secure key
   material, and the client knows that nobody else shares that key
   material unless they control the secret key corresponding to the
   server's public key.

0.5. Assigned relay cell types

   These relay cell types are reserved for use in the hidden service
   protocol.

      32 -- RELAY_COMMAND_ESTABLISH_INTRO

            Sent from hidden service host to introduction point;
            establishes introduction point. Discussed in
            [REG_INTRO_POINT].

      33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS

            Sent from client to rendezvous point; creates rendezvous
            point. Discussed in [EST_REND_POINT].

      34 -- RELAY_COMMAND_INTRODUCE1

            Sent from client to introduction point; requests
            introduction. Discussed in [SEND_INTRO1]

      35 -- RELAY_COMMAND_INTRODUCE2

            Sent from introduction point to hidden service host; requests
            introduction. Same format as INTRODUCE1. Discussed in
            [FMT_INTRO1] and [PROCESS_INTRO2]

      36 -- RELAY_COMMAND_RENDEZVOUS1

            Sent from hidden service host to rendezvous point;
            attempts to join host's circuit to
            client's circuit. Discussed in [JOIN_REND]

      37 -- RELAY_COMMAND_RENDEZVOUS2

            Sent from rendezvous point to client;
            reports join of host's circuit to
            client's circuit. Discussed in [JOIN_REND]

      38 -- RELAY_COMMAND_INTRO_ESTABLISHED

            Sent from introduction point to hidden service host;
            reports status of attempt to establish introduction
            point. Discussed in [INTRO_ESTABLISHED]

      39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED

            Sent from rendezvous point to client; acknowledges
            receipt of ESTABLISH_RENDEZVOUS cell. Discussed in
            [EST_REND_POINT]

      40 -- RELAY_COMMAND_INTRODUCE_ACK

            Sent from introduction point to client; acknowledges
            receipt of INTRODUCE1 cell and reports success/failure.
            Discussed in [INTRO_ACK]
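
   For reference, an implementation might hold these assignments in a
   single enumeration.  The class name below is hypothetical; the
   values are the ones listed above.

```python
from enum import IntEnum


class HSRelayCommand(IntEnum):
    """Relay cell types reserved for the hidden service protocol."""
    ESTABLISH_INTRO = 32
    ESTABLISH_RENDEZVOUS = 33
    INTRODUCE1 = 34
    INTRODUCE2 = 35
    RENDEZVOUS1 = 36
    RENDEZVOUS2 = 37
    INTRO_ESTABLISHED = 38
    RENDEZVOUS_ESTABLISHED = 39
    INTRODUCE_ACK = 40
```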

0.6. Acknowledgments

   This design includes ideas from many people, including
     Christopher Baines,
     Daniel J. Bernstein,
     Matthew Finkel,
     Ian Goldberg,
     George Kadianakis,
     Aniket Kate,
     Tanja Lange,
     Robert Ransom,
     Roger Dingledine,
     Aaron Johnson,
     Tim Wilson-Brown ("teor"),
     special (John Brooks),
     s7r

   It's based on Tor's original hidden service design by Roger
   Dingledine, Nick Mathewson, and Paul Syverson, and on improvements to
   that design over the years by people including
     Tobias Kamm,
     Thomas Lauterbach,
     Karsten Loesing,
     Alessandro Preite Martinez,
     Robert Ransom,
     Ferdinand Rieger,
     Christoph Weingarten,
     Christian Wilms,

   We wouldn't be able to do any of this work without good attack
   designs from researchers including
     Alex Biryukov,
     Lasse Øverlier,
     Ivan Pustogarov,
     Paul Syverson,
     Ralf-Philipp Weinmann.
   See [ATTACK-REFS] for their papers.

   Several of these ideas have come from conversations with
      Christian Grothoff,
      Brian Warner,
      Zooko Wilcox-O'Hearn,

   And if this document makes any sense at all, it's thanks to
   editing help from
      Matthew Finkel,
      George Kadianakis,
      Peter Palfrader,
      Tim Wilson-Brown ("teor"),


   [XXX  Acknowledge the huge bunch of people working on 8106.]
   [XXX  Acknowledge the huge bunch of people working on 8244.]


   Please forgive me if I've missed you; please forgive me if I've
   misunderstood your best ideas here too.


1. Protocol overview

   In this section, we outline the hidden service protocol. This section
   omits some details in the name of simplicity; those are given more
   fully below, when we specify the protocol in more detail.

1.1. View from 10,000 feet

   A hidden service host prepares to offer a hidden service by choosing
   several Tor nodes to serve as its introduction points. It builds
   circuits to those nodes, and tells them to forward introduction
   requests to it using those circuits.

   Once introduction points have been picked, the host builds a set of
   documents called "hidden service descriptors" (or just "descriptors"
   for short) and uploads them to a set of HSDir nodes. These documents
   list the hidden service's current introduction points and describe
   how to make contact with the hidden service.

   When a client wants to connect to a hidden service, it first chooses
   a Tor node at random to be its "rendezvous point" and builds a
   circuit to that rendezvous point. If the client does not have an
   up-to-date descriptor for the service, it contacts an appropriate
   HSDir and requests such a descriptor.

   The client then builds an anonymous circuit to one of the hidden
   service's introduction points listed in its descriptor, and gives the
   introduction point an introduction request to pass to the hidden
   service. This introduction request includes the target rendezvous
   point and the first part of a cryptographic handshake.

   Upon receiving the introduction request, the hidden service host
   makes an anonymous circuit to the rendezvous point and completes the
   cryptographic handshake. The rendezvous point connects the two
   circuits, and the cryptographic handshake gives the two parties a
   shared key and proves to the client that it is indeed talking to the
   hidden service.

   Once the two circuits are joined, the client can send Tor RELAY cells
   to the server. RELAY_BEGIN cells open streams to an external process
   or processes configured by the server; RELAY_DATA cells are used to
   communicate data on those streams, and so forth.

1.2. In more detail: naming hidden services [NAMING]

   A hidden service's name is its long term master identity key.  This is
   encoded as a hostname by encoding the entire key in Base 32, including a
   version byte and a checksum, and then appending the string ".onion" at the
   end. The result is a 56-character domain name.

   (This is a change from older versions of the hidden service protocol,
   where we used an 80-bit truncated SHA1 hash of a 1024-bit RSA key.)

   The names in this format are distinct from earlier names because of
   their length. An older name might look like:

        unlikelynamefora.onion
        yyhws9optuwiwsns.onion

   And a new name following this specification might look like:

        l5satjgud6gucryazcyvyvhuxhr74u6ygigiuyixe3a6ysis67ororad.onion

   Please see section [ONIONADDRESS] for the encoding specification.
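
   As a rough illustration of the naming scheme above, the following
   Python sketch builds an address from a 32-byte ed25519 public key.
   The layout used here (public key, then a 2-byte SHA3-256 checksum
   over ".onion checksum" | key | version, then a version byte of 3) is
   assumed to match [ONIONADDRESS]; that section remains authoritative,
   and the function name is illustrative only.

```python
import base64
import hashlib

VERSION = b"\x03"  # assumed version byte for this protocol version


def onion_address(pubkey: bytes) -> str:
    """Encode a 32-byte ed25519 public key as a 56-character .onion name."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte public key")
    # Assumed checksum: first two bytes of SHA3-256 over prefix, key, version.
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]
    # 32 + 2 + 1 = 35 bytes = 280 bits = exactly 56 Base 32 characters,
    # so no "=" padding is ever produced.
    label = base64.b32encode(pubkey + checksum + VERSION).decode("ascii").lower()
    return label + ".onion"
```

   Because the version byte comes last, every name produced this way
   ends in "d.onion", as the example name above does.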

1.3. In more detail: Access control [IMD:AC]

   Access control for a hidden service is imposed at multiple points through
   the process above. Furthermore, there is also the option to impose
   additional client authorization access control using pre-shared secrets
   exchanged out-of-band between the hidden service and its clients.

   The first stage of access control happens when downloading HS descriptors.
   Specifically, in order to download a descriptor, clients must know which
   blinded signing key was used to sign it. (See the next section for more info
   on key blinding.)

   To learn the introduction points, clients must decrypt the body of the
   hidden service descriptor. To do so, clients must know the _unblinded_
   public key of the service, which makes the descriptor unusable by entities
   without that knowledge (e.g. HSDirs that don't know the onion address).

   Also, if optional client authorization is enabled, hidden service
   descriptors are superencrypted using each authorized user's identity x25519
   key, to further ensure that unauthorized entities cannot decrypt it.

   In order to make the introduction point send a rendezvous request to the
   service, the client needs to use the per-introduction-point authentication
   key found in the hidden service descriptor.

   The final level of access control happens at the server itself, which may
   decide to respond or not respond to the client's request depending on the
   contents of the request. The protocol is extensible at this point: at a
   minimum, the server requires that the client demonstrate knowledge of the
   contents of the encrypted portion of the hidden service descriptor. If
   optional client authorization is enabled, the service may additionally
   require the client to prove knowledge of a pre-shared private key.

1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST]

   Hidden service descriptors are periodically moved to different
   locations, so that no single directory or small set of directories
   becomes a good DoS target for removing a hidden service.

   For each period, the Tor directory authorities agree upon a
   collaboratively generated random value. (See section 2.3 for a
   description of how to incorporate this value into the voting
   practice; generating the value is described in other proposals,
   including [SHAREDRANDOM-REFS].) That value, combined with hidden service
   directories' public identity keys, determines each HSDir's position
   in the hash ring for descriptors made in that period.

   Each hidden service's descriptors are placed into the ring in
   positions based on the key that was used to sign them. Note that
   hidden service descriptors are not signed with the services' public
   keys directly. Instead, we use a key-blinding system [KEYBLIND] to
   create a new key-of-the-day for each hidden service. Any client that
   knows the hidden service's credential can derive these blinded
   signing keys for a given period. It should be impossible to derive
   the blinded signing key lacking that credential.

   The body of each descriptor is also encrypted with a key derived from
   the credential.

   To avoid a "thundering herd" problem where every service generates
   and uploads a new descriptor at the start of each period, each
   descriptor comes online at a time during the period that depends on
   its blinded signing key. The keys for the last period remain valid
   until the new keys come online.

1.5. In more detail: Scaling to multiple hosts

   This design is compatible with our current approaches for scaling hidden
   services. Specifically, hidden service operators can use onionbalance to
   achieve high availability between multiple nodes on the HSDir
   layer. Furthermore, operators can use proposal 255 to load balance their
   hidden services on the introduction layer. See [SCALING-REFS] for further
   discussions on this topic and alternative designs.

1.6. In more detail: Backward compatibility with older hidden service
      protocols

   This design is incompatible with the clients, server, and hsdir node
   protocols from older versions of the hidden service protocol as
   described in rend-spec.txt. On the other hand, it is designed to
   enable the use of older Tor nodes as rendezvous points and
   introduction points.

1.7. In more detail: Keeping crypto keys offline

   In this design, a hidden service's secret identity key may be
   stored offline.  It's used only to generate blinded signing keys,
   which are used to sign descriptor signing keys.

   In order to operate a hidden service, the operator can generate in
   advance a number of blinded signing keys and descriptor signing
   keys (and their credentials; see [DESC-OUTER] and [HS-DESC-ENC]
   below), and their corresponding descriptor encryption keys, and
   export those to the hidden service hosts.

   As a result, in the scenario where the Hidden Service gets
   compromised, the adversary can only impersonate it for a limited
   period of time (depending on how many signing keys were generated
   in advance).

   It's important to not send the private part of the blinded signing
   key to the Hidden Service since an attacker can derive from it the
   secret master identity key. The secret blinded signing key should
   only be used to create credentials for the descriptor signing keys.

1.8. In more detail: Encryption Keys And Replay Resistance

   To avoid replays of an introduction request by an introduction point,
   a hidden service host must never accept the same request
   twice. Earlier versions of the hidden service design used an
   authenticated timestamp here, but including a view of the current
   time can create a problematic fingerprint. (See proposal 222 for more
   discussion.)

1.9. In more detail: A menagerie of keys

   [In the text below, an "encryption keypair" is roughly "a keypair you
   can do Diffie-Hellman with" and a "signing keypair" is roughly "a
   keypair you can do EdDSA with."]

   Public/private keypairs defined in this document:

      Master (hidden service) identity key -- A master signing keypair
        used as the identity for a hidden service.  This key is long
        term and not used on its own to sign anything; it is only used
        to generate blinded signing keys as described in [KEYBLIND]
        and [SUBCRED]. The public key is encoded in the ".onion"
        address according to [NAMING].

      Blinded signing key -- A keypair derived from the identity key,
        used to sign descriptor signing keys. It changes periodically for
        each service. Clients who know a 'credential' consisting of the
        service's public identity key and an optional secret can derive
        the public blinded identity key for a service.  This key is used
        as an index in the DHT-like structure of the directory system
        (see [SUBCRED]).

      Descriptor signing key -- A key used to sign hidden service
        descriptors.  This is signed by blinded signing keys. Unlike
        blinded signing keys and master identity keys, the secret part
        of this key must be stored online by hidden service hosts. The
        public part of this key is included in the unencrypted section
        of HS descriptors (see [DESC-OUTER]).

      Introduction point authentication key -- A short-term signing
        keypair used to identify a hidden service to a given
        introduction point. A fresh keypair is made for each
        introduction point; these are used to sign the request that a
        hidden service host makes when establishing an introduction
        point, so that clients who know the public component of this key
        can get their introduction requests sent to the right
        service. No keypair is ever used with more than one introduction
        point. (previously called a "service key" in rend-spec.txt)

      Introduction point encryption key -- A short-term encryption
        keypair used when establishing connections via an introduction
        point. Plays a role analogous to Tor nodes' onion keys. A fresh
        keypair is made for each introduction point.

   Symmetric keys defined in this document:

      Descriptor encryption keys -- A symmetric encryption key used to
        encrypt the body of hidden service descriptors. Derived from the
        current period and the hidden service credential.

   Public/private keypairs defined elsewhere:

      Onion key -- Short-term encryption keypair

      (Node) identity key

   Symmetric key-like things defined elsewhere:

      KH from circuit handshake -- An unpredictable value derived as
      part of the Tor circuit extension handshake, used to tie a request
      to a particular circuit.

1.9.1. In even more detail: Client authorization keys [CLIENT-AUTH]

   When client authorization is enabled, each authorized client of a hidden
   service has two more asymmetric keypairs which are shared with the hidden
   service. An entity without those keys is not able to use the hidden
   service. Throughout this document, we assume that these pre-shared keys are
   exchanged between the hidden service and its clients in a secure out-of-band
   fashion.

   Specifically, each authorized client possesses:

   - An x25519 keypair used to compute decryption keys that allow the client to
     decrypt the hidden service descriptor. See [HS-DESC-ENC].

   - An ed25519 keypair which allows the client to compute signatures which
     prove to the hidden service that the client is authorized. These
     signatures are inserted into the INTRODUCE1 cell, and without them the
     introduction to the hidden service cannot be completed. See [INTRO-AUTH].

   The right way to exchange these keys is to have the client generate keys and
   send the corresponding public keys to the hidden service out-of-band. An
   easier but less secure way of doing this exchange would be to have the
   hidden service generate the keypairs and pass the corresponding private keys
   to its clients. See section [CLIENT-AUTH-MGMT] for more details on how these
   keys should be managed.

   [TODO: Also specify stealth client authorization.]

2. Generating and publishing hidden service descriptors [HSDIR]

   Hidden service descriptors follow the same metaformat as other Tor
   directory objects. They are published anonymously to Tor servers with the
   HSDir flag, HSDir=2 protocol version and tor version >= 0.3.0.8 (because a
   bug was fixed in this version).

2.1. Deriving blinded keys and subcredentials [SUBCRED]

   In each time period (see [TIME-PERIODS] for a definition of time
   periods), a hidden service host uses a different blinded private key
   to sign its directory information, and clients use a different
   blinded public key as the index for fetching that information.

   For a candidate for a key derivation method, see Appendix [KEYBLIND].

   Additionally, clients and hosts derive a subcredential for each
   period. Knowledge of the subcredential is needed to decrypt hidden
   service descriptors for each period and to authenticate with the
   hidden service host in the introduction process. Unlike the
   credential, it changes each period. Knowing the subcredential, even
   in combination with the blinded private key, does not enable the
   hidden service host to derive the main credential--therefore, it is
   safe to put the subcredential on the hidden service host while
   leaving the hidden service's private key offline.

   The subcredential for a period is derived as:

       subcredential = H("subcredential" | credential | blinded-public-key).

   In the above formula, credential corresponds to:

       credential = H("credential" | public-identity-key)

   where public-identity-key is the public identity master key of the hidden
   service.
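
   The two formulas above can be sketched in Python as follows. This
   assumes that H is SHA3-256 and that "|" is plain byte concatenation,
   neither of which is defined in this section; the function names are
   illustrative.

```python
import hashlib


def H(*parts: bytes) -> bytes:
    """Assumed hash function: SHA3-256 over the concatenated parts."""
    return hashlib.sha3_256(b"".join(parts)).digest()


def credential(public_identity_key: bytes) -> bytes:
    # credential = H("credential" | public-identity-key)
    return H(b"credential", public_identity_key)


def subcredential(public_identity_key: bytes, blinded_public_key: bytes) -> bytes:
    # subcredential = H("subcredential" | credential | blinded-public-key)
    return H(b"subcredential", credential(public_identity_key), blinded_public_key)
```

   Since the blinded public key changes every period, the subcredential
   changes with it even though the credential stays fixed.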

2.2. Locating, uploading, and downloading hidden service descriptors
       [HASHRING]

   To avoid attacks where a hidden service's descriptor is easily
   targeted for censorship, we store them at different directories over
   time, and use shared random values to prevent those directories from
   being predictable far in advance.

   Which Tor servers host a hidden service's descriptor depends on:

         * the current time period,
         * the daily subcredential,
         * the hidden service directories' public keys,
         * a shared random value that changes in each time period,
         * a set of network-wide networkstatus consensus parameters.
           (Consensus parameters are integer values voted on by authorities
           and published in the consensus documents, described in
           dir-spec.txt, section 3.3.)

   Below we explain in more detail.

2.2.1. Dividing time into periods [TIME-PERIODS]

   To prevent a single set of hidden service directories from becoming a
   target for adversaries looking to permanently censor a hidden service,
   hidden service descriptors are uploaded to different locations that
   change over time.

   The length of a "time period" is controlled by the consensus
   parameter 'hsdir-interval', and is a number of minutes between 30 and
   14400 (10 days). The default time period length is 1440 (one day).

   Time periods start at the Unix epoch (Jan 1, 1970), and are computed by
   taking the number of minutes since the epoch and dividing by the time
   period. However, we want our time periods to start at 12:00UTC every day, so
   we subtract a "rotation time offset" of 12*60 minutes from the number of
   minutes since the epoch, before dividing by the time period (effectively
   making "our" epoch start at Jan 1, 1970 12:00UTC).

   Example: If the current time is 2016-04-13 11:15:01 UTC, then the seconds
   since the epoch are 1460546101, and the number of minutes since the epoch is
   24342435.  We then subtract the "rotation time offset" of 12*60 minutes from
   the minutes since the epoch, to get 24341715. If the current time period
   length is 1440 minutes, by doing the division we see that we are currently
   in time period number 16903.

   Specifically, time period #16903 began 16903*1440*60 + (12*60*60) seconds
   after the epoch, at 2016-04-12 12:00 UTC, and ended at 16904*1440*60 +
   (12*60*60) seconds after the epoch, at 2016-04-13 12:00 UTC.
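
   The computation above can be written as a short Python sketch. The
   function names are illustrative; the arithmetic follows the text and
   reproduces the worked example.

```python
ROTATION_OFFSET_MINUTES = 12 * 60  # shifts period boundaries to 12:00 UTC


def time_period_num(unix_time: int, period_length: int = 1440) -> int:
    """Return the time period number for a time given in seconds since the epoch."""
    minutes_since_epoch = unix_time // 60
    return (minutes_since_epoch - ROTATION_OFFSET_MINUTES) // period_length


def time_period_start(period_num: int, period_length: int = 1440) -> int:
    """Return the first second (since the epoch) of the given time period."""
    return (period_num * period_length + ROTATION_OFFSET_MINUTES) * 60
```

   For 2016-04-13 11:15:01 UTC (1460546101), time_period_num() yields
   16903, and time_period_start(16903) is 1460462400, i.e. 2016-04-12
   12:00 UTC.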

2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC]

   Hidden services periodically publish their descriptor to the responsible
   HSDirs. The set of responsible HSDirs is determined as specified in
   [WHERE-HSDESC].

   Specifically, every time a hidden service publishes its descriptor, it also
   sets up a timer for a random time between 60 minutes and 120 minutes in the
   future. When the timer triggers, the hidden service needs to publish its
   descriptor again to the responsible HSDirs for that time period.
   [TODO: Control republish period using a consensus parameter?]
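
   A minimal sketch of the republish timer, assuming a uniform choice
   over the 60 to 120 minute window (the text only says "random") and
   an illustrative function name:

```python
import random

MIN_REPUBLISH_SECONDS = 60 * 60
MAX_REPUBLISH_SECONDS = 120 * 60


def next_republish_delay(rng=random) -> float:
    """Pick the delay, in seconds, until the next descriptor upload."""
    # Assumption: the delay is drawn uniformly from the allowed window.
    return rng.uniform(MIN_REPUBLISH_SECONDS, MAX_REPUBLISH_SECONDS)
```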

2.2.2.1. Overlapping descriptors

   Hidden services need to upload multiple descriptors so that they can be
   reachable to clients with older or newer consensuses than them. Services
   need to upload their descriptors to the HSDirs _before_ the beginning of
   each upcoming time period, so that they are readily available for clients to
   fetch them. Furthermore, services should keep uploading their old descriptor
   even after the end of a time period, so that they can be reachable by
   clients that still have consensuses from the previous time period.

   Hence, services maintain two active descriptors at every point. Clients, on
   the other hand, don't have a notion of overlapping descriptors, and instead
   always download the descriptor for the current time period and shared random
   value. It's the job of the service to ensure that descriptors will be
   available for all clients. See section [FETCHUPLOADDESC] for how this is
   achieved.

   [TODO: What to do when we run multiple hidden services in a single host?]

2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC]

   This section specifies how the HSDir hash ring is formed at any given
   time. Whenever a time value is needed (e.g. to get the current time period
   number), we assume that clients and services use the valid-after time from
   their latest live consensus.

   The following consensus parameters control where a hidden service
   descriptor is stored:

        hsdir_n_replicas = an integer in range [1,16] with default value 2.
        hsdir_spread_fetch = an integer in range [1,128] with default value 3.
        hsdir_spread_store = an integer in range [1,128] with default value 3.

   To determine where a given hidden service descriptor will be stored
   in a given period, after the blinded public key for that period is
   derived, the uploading or downloading party calculates:

        for replicanum in 1...hsdir_n_replicas:
            hs_index(replicanum) = H("store-at-idx" |
                                     blinded_public_key |
                                     INT_8(replicanum) |
                                     INT_8(period_length) |
                                     INT_8(period_num) )

   where blinded_public_key is specified in section [KEYBLIND], period_length
   is the length of the time period in minutes, and period_num is calculated
   using the current consensus "valid-after" as specified in section
   [TIME-PERIODS].

   Then, for each node listed in the current consensus with the HSDirV3 flag,
   we compute a directory index for that node as:

           hsdir_index(node) = H("node-idx" | node_identity |
                                 shared_random_value |
                                 INT_8(period_num) |
                                 INT_8(period_length) )

   where shared_random_value is the shared value generated by the authorities
   in section [PUB-SHAREDRANDOM], and node_identity is the ed25519 identity
   key of the node.

   Finally, for replicanum in 1...hsdir_n_replicas, the hidden service
   host uploads descriptors to the first hsdir_spread_store nodes whose
   indices immediately follow hs_index(replicanum). If any of those
   nodes have already been selected for a lower-numbered replica of the
   service, any nodes already chosen are disregarded (i.e. skipped over)
   when choosing a replica's hsdir_spread_store nodes.

   When choosing an HSDir to download from, clients choose randomly from
   among the first hsdir_spread_fetch nodes after the indices.  (Note
   that, in order to make the system better tolerate disappearing
   HSDirs, hsdir_spread_fetch may be less than hsdir_spread_store.)
   Again, nodes from lower-numbered replicas are disregarded when
   choosing the spread for a replica.
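
   The selection procedure above can be sketched in Python as follows.
   This assumes that H is SHA3-256, that INT_8 is an 8-byte big-endian
   encoding, and that indices are compared as byte strings; none of
   these is defined in this section. The function responsible_hsdirs()
   and its parameters are illustrative. Passing hsdir_spread_store as
   "spread" gives the upload set; a client would instead pass
   hsdir_spread_fetch and pick randomly within each replica's nodes.

```python
import bisect
import hashlib


def INT_8(n: int) -> bytes:
    return n.to_bytes(8, "big")  # assumed encoding: 8 bytes, big-endian


def H(*parts: bytes) -> bytes:
    return hashlib.sha3_256(b"".join(parts)).digest()  # assumed hash


def hs_index(blinded_pk, replicanum, period_length, period_num):
    return H(b"store-at-idx", blinded_pk, INT_8(replicanum),
             INT_8(period_length), INT_8(period_num))


def hsdir_index(node_identity, srv, period_num, period_length):
    return H(b"node-idx", node_identity, srv,
             INT_8(period_num), INT_8(period_length))


def responsible_hsdirs(blinded_pk, node_identities, srv, period_num,
                       period_length=1440, n_replicas=2, spread=3):
    """Return node identities in selection order, replica 1 first."""
    # Place every HSDir on the ring, sorted by its directory index.
    ring = sorted((hsdir_index(n, srv, period_num, period_length), n)
                  for n in node_identities)
    positions = [idx for idx, _ in ring]
    chosen = []
    for replicanum in range(1, n_replicas + 1):
        target = hs_index(blinded_pk, replicanum, period_length, period_num)
        # First node whose index follows the descriptor's index.
        start = bisect.bisect_right(positions, target)
        picked = 0
        for offset in range(len(ring)):
            if picked == spread:
                break
            node = ring[(start + offset) % len(ring)][1]  # wrap around the ring
            if node not in chosen:  # skip nodes used by lower-numbered replicas
                chosen.append(node)
                picked += 1
    return chosen
```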

2.2.4. Using time periods and SRVs to fetch/upload HS descriptors [FETCHUPLOADDESC]

   Hidden services and clients need to make correct use of time periods (TP)
   and shared random values (SRVs) to successfully fetch and upload
   descriptors. Furthermore, to avoid problems with skewed clocks, both clients
   and services use the 'valid-after' time of a live consensus as a way to take
   decisions with regards to uploading and fetching descriptors. By using the
   consensus times as the ground truth here, we minimize the desynchronization
   of clients and services due to system clock. Whenever time-based decisions
   are taken in this section, assume that they are consensus times and not
   system times.

   As [PUB-SHAREDRANDOM] specifies, consensuses contain two shared random
   values (the current one and the previous one). Hidden services and clients
   are asked to match these shared random values with descriptor time periods
   and use the right SRV when fetching/uploading descriptors. This section
   attempts to precisely specify how this works.

   Let's start with an illustration of the system:

      +------------------------------------------------------------------+
      |                                                                  |
      | 00:00      12:00       00:00       12:00       00:00       12:00 |
      | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
      |                                                                  |
      |  $==========|-----------$===========|-----------$===========|    |
      |                                                                  |
      |                                                                  |
      +------------------------------------------------------------------+

                                      Legend: [TP#1 = Time Period #1]
                                              [SRV#1 = Shared Random Value #1]
                                              ["$" = descriptor rotation moment]

2.2.4.1. Client behavior for fetching descriptors [CLIENTFETCH]

   And here is how clients use TPs and SRVs to fetch descriptors:

   Clients always aim to synchronize their TP with SRV, so they always want to
   use TP#N with SRV#N. For time periods, clients always use the current time
   period when fetching descriptors. For SRVs, if a client is in the time
   segment between a new time period and a new SRV (i.e. the segments drawn
   with "-"), it uses the current SRV; if it is in a time segment between a
   new SRV and a new time period (i.e. the segments drawn with "="), it uses
   the previous SRV.

   Example:

   +------------------------------------------------------------------+
   |                                                                  |
   | 00:00      12:00       00:00       12:00       00:00       12:00 |
   | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
   |                                                                  |
   |  $==========|-----------$===========|-----------$===========|    |
   |              ^           ^                                       |
   |              C1          C2                                      |
   +------------------------------------------------------------------+

   If a client (C1) is at 13:00 right after TP#1, then it will use TP#1 and
   SRV#1 for fetching descriptors. Also, if a client (C2) is at 01:00 right
   after SRV#2, it will still use TP#1 and SRV#1.
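
   The client rule can be sketched in Python as follows. The sketch
   assumes the default schedule shown in the illustration (a 1440-minute
   time period starting at 12:00 UTC and a new SRV at 00:00 UTC), and
   takes the consensus valid-after time in seconds since the epoch. The
   returned strings "current" and "previous" refer to the two SRVs
   carried in that consensus; all names are illustrative.

```python
def time_period_num(valid_after: int, period_length: int = 1440) -> int:
    return (valid_after // 60 - 12 * 60) // period_length


def in_tp_to_srv_segment(valid_after: int) -> bool:
    """True in the "-" segments (12:00 to 00:00 UTC), False in the "=" segments."""
    return valid_after % 86400 >= 12 * 3600


def client_fetch_params(valid_after: int):
    """Return (time period number, which consensus SRV to use)."""
    tp = time_period_num(valid_after)  # clients always use the current TP
    srv = "current" if in_tp_to_srv_segment(valid_after) else "previous"
    return tp, srv
```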

2.2.4.2. Service behavior for uploading descriptors [SERVICEUPLOAD]

   As discussed above, services maintain two active descriptors at any time. We
   call these the "first" and "second" service descriptors. Services rotate
   their descriptor every time they receive a consensus with a valid_after time
   past the next SRV calculation time. They rotate their descriptors by
   discarding their first descriptor, pushing the second descriptor to the
   first, and rebuilding their second descriptor with the latest data.

   Services, like clients, also employ a different logic for picking SRV and TP
   values based on their position in the graph above. Here is the logic:

2.2.4.2.1. First descriptor upload logic [FIRSTDESCUPLOAD]

   Here is the service logic for uploading its first descriptor:

   When a service is in the time segment between a new time period and a new SRV
   (i.e. the segments drawn with "-"), it uses the previous time period and
   previous SRV for uploading its first descriptor: that's meant to cover
   for clients that have a consensus that is still in the previous time period.

   Example: Consider in the above illustration that the service is at 13:00
   right after TP#1. It will upload its first descriptor using TP#0 and SRV#0.
   So if a client still has a 11:00 consensus it will be able to access it
   based on the client logic above.

   Now if a service is in the time segment between a new SRV and a new time
   period (i.e. the segments drawn with "=") it uses the current time period
   and the previous SRV for its first descriptor: that's meant to cover clients
   with an up-to-date consensus in the same time period as the service.

   Example:

   +------------------------------------------------------------------+
   |                                                                  |
   | 00:00      12:00       00:00       12:00       00:00       12:00 |
   | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
   |                                                                  |
   |  $==========|-----------$===========|-----------$===========|    |
   |                          ^                                       |
   |                          S                                       |
   +------------------------------------------------------------------+

   Consider that the service is at 01:00 right after SRV#2: it will upload its
   first descriptor using TP#1 and SRV#1.

2.2.4.2.2. Second descriptor upload logic [SECONDDESCUPLOAD]

   Here is the service logic for uploading its second descriptor:

   When a service is in the time segment between a new time period and a new SRV
   (i.e. the segments drawn with "-"), it uses the current time period and
   current SRV for uploading its second descriptor: that's meant to cover for
   clients that have an up-to-date consensus on the same TP as the service.

   Example: Consider in the above illustration that the service is at 13:00
   right after TP#1: it will upload its second descriptor using TP#1 and SRV#1.

   Now if a service is in the time segment between a new SRV and a new time
   period (i.e. the segments drawn with "=") it uses the next time period and
   the current SRV for its second descriptor: that's meant to cover clients
   with a newer consensus than the service (in the next time period).

   Example:

   +------------------------------------------------------------------+
   |                                                                  |
   | 00:00      12:00       00:00       12:00       00:00       12:00 |
   | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
   |                                                                  |
   |  $==========|-----------$===========|-----------$===========|    |
   |                          ^                                       |
   |                          S                                       |
   +------------------------------------------------------------------+

   Consider that the service is at 01:00 right after SRV#2: it will upload its
   second descriptor using TP#2 and SRV#2.
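
   The first and second descriptor rules from [FIRSTDESCUPLOAD] and
   [SECONDDESCUPLOAD] can be summarized in one Python sketch. As in the
   illustrations, it assumes a 1440-minute time period starting at
   12:00 UTC and a new SRV at 00:00 UTC; "current" and "previous" name
   the two SRVs in the service's consensus, and the function names are
   illustrative.

```python
def time_period_num(valid_after: int, period_length: int = 1440) -> int:
    return (valid_after // 60 - 12 * 60) // period_length


def in_tp_to_srv_segment(valid_after: int) -> bool:
    """True in the "-" segments (12:00 to 00:00 UTC), False in the "=" segments."""
    return valid_after % 86400 >= 12 * 3600


def service_descriptor_params(valid_after: int):
    """Return ((TP, SRV) for the first descriptor, (TP, SRV) for the second)."""
    tp = time_period_num(valid_after)
    if in_tp_to_srv_segment(valid_after):
        # "-" segment: cover the previous period and the current one.
        first = (tp - 1, "previous")
        second = (tp, "current")
    else:
        # "=" segment: cover the current period and the upcoming one.
        first = (tp, "previous")
        second = (tp + 1, "current")
    return first, second
```

   In both segments the two descriptors cover adjacent time periods, so
   a client whose consensus is one period behind or ahead of the
   service's can still find a matching descriptor.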

2.2.5. Expiring hidden service descriptors [EXPIRE-DESC]

   Hidden services set their descriptor's "descriptor-lifetime" field to 180
   minutes (3 hours). Hidden services ensure that their descriptor will remain
   valid in the HSDir caches, by republishing their descriptors periodically as
   specified in [WHEN-HSDESC].

   Hidden services MUST also keep their introduction circuits alive for as long
   as descriptors including those intro points are valid (even if that's after
   the time period has changed).

2.2.6. URLs for anonymous uploading and downloading

   Hidden service descriptors conforming to this specification are uploaded
   with an HTTP POST request to the URL /tor/hs/<version>/publish relative to
   the hidden service directory's root, and downloaded with an HTTP GET
   request for the URL /tor/hs/<version>/<z> where <z> is a base64 encoding of
   the hidden service's blinded public key and <version> is the protocol
   version which is "3" in this case.

   These requests must be made anonymously, on circuits not used for
   anything else.
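
   As a small illustration, the two URL paths can be built like this in
   Python. The sketch uses the standard base64 alphabet with padding;
   whether padding is kept or stripped is not stated in this section,
   so treat that detail, and the function names, as assumptions.

```python
import base64

HS_VERSION = "3"  # protocol version used in the URL


def publish_path() -> str:
    """Path for the HTTP POST that uploads a descriptor."""
    return f"/tor/hs/{HS_VERSION}/publish"


def fetch_path(blinded_public_key: bytes) -> str:
    """Path for the HTTP GET that downloads a descriptor."""
    # Assumption: standard base64 with "=" padding.
    z = base64.b64encode(blinded_public_key).decode("ascii")
    return f"/tor/hs/{HS_VERSION}/{z}"
```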

2.2.7. Client-side validation of onion addresses

   When a Tor client receives a prop224 onion address from the user, it
   MUST first validate the onion address before attempting to connect or
   fetch its descriptor. If the validation fails, the client MUST
   refuse to connect.

   As part of the address validation, Tor clients should check that the
   underlying ed25519 key does not have a torsion component. If Tor accepted
   ed25519 keys with torsion components, attackers could create multiple
   equivalent onion addresses for a single ed25519 key, which would map to the
   same service. We want to avoid that because it could lead to phishing
   attacks and surprising behaviors (e.g. imagine a browser plugin that blocks
   onion addresses, but could be bypassed using an equivalent onion address
   with a torsion component).

   The right way for clients to detect such fraudulent addresses (which should
   only occur malevolently and never naturally) is to extract the ed25519
   public key from the onion address and multiply it by the ed25519 group order
   and ensure that the result is the ed25519 identity element. For more
   details, please see [TORSION-REFS].
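
   The check described above can be sketched in pure Python. This is an
   unoptimized, non-constant-time illustration using the standard
   ed25519 curve parameters and affine point arithmetic; a real
   implementation would rely on its ed25519 library. The function names
   are illustrative.

```python
P = 2**255 - 19                                      # field prime
L = 2**252 + 27742317777372353535851937790883648493  # order of the prime subgroup
D = -121665 * pow(121666, P - 2, P) % P              # curve constant d
IDENTITY = (0, 1)


def inv(x: int) -> int:
    return pow(x % P, P - 2, P)


def decompress(encoded: bytes):
    """Decode a 32-byte public key into affine (x, y); raise if invalid."""
    if len(encoded) != 32:
        raise ValueError("expected 32 bytes")
    y = int.from_bytes(encoded, "little")
    sign = y >> 255
    y &= (1 << 255) - 1
    if y >= P:
        raise ValueError("non-canonical y coordinate")
    x2 = (y * y - 1) * inv(D * y * y + 1) % P
    x = pow(x2, (P + 3) // 8, P)
    if (x * x - x2) % P != 0:
        x = x * pow(2, (P - 1) // 4, P) % P
    if (x * x - x2) % P != 0:
        raise ValueError("not a point on the curve")
    if x == 0 and sign:
        raise ValueError("invalid sign bit")
    if x & 1 != sign:
        x = P - x
    return x, y


def add(a, b):
    """Twisted Edwards addition for -x^2 + y^2 = 1 + d*x^2*y^2."""
    (x1, y1), (x2, y2) = a, b
    t = D * x1 * x2 * y1 * y2 % P
    x3 = (x1 * y2 + x2 * y1) * inv(1 + t) % P
    y3 = (y1 * y2 + x1 * x2) * inv(1 - t) % P
    return x3, y3


def scalar_mult(k: int, point):
    result = IDENTITY
    while k > 0:  # plain double-and-add
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def has_torsion_component(encoded: bytes) -> bool:
    """True if multiplying the key by the group order misses the identity."""
    return scalar_mult(L, decompress(encoded)) != IDENTITY
```

   For example, the ed25519 base point passes the check, while the
   order-2 point (0, -1) is flagged.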

2.3. Publishing shared random values [PUB-SHAREDRANDOM]

   Our design for limiting the predictability of HSDir upload locations
   relies on a shared random value (SRV) that isn't predictable in advance or
   too influenceable by an attacker. The authorities must run a protocol
   to generate such a value at least once per hsdir period. Here we
   describe how they publish these values; the procedure they use to
   generate them can change independently of the rest of this
   specification. For more information see [SHAREDRANDOM-REFS].

   According to proposal 250, we add two new lines in consensuses:

     "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL
     "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL

2.3.1. Client behavior in the absence of shared random values

   If the previous or current shared random value cannot be found in a
   consensus, then Tor clients and services need to generate their own random
   value for use when choosing HSDirs.

   To do so, Tor clients and services use:

     SRV = H("shared-random-disaster" | INT_8(period_length) | INT_8(period_num))

   where period_length is the length of a time period in minutes, and
   period_num is calculated as specified in [TIME-PERIODS] for the wanted
   shared random value that could not be found.
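
   A minimal Python sketch of this fallback, assuming that H is SHA3-256
   and that INT_8 is an 8-byte big-endian encoding (neither is defined
   in this section); the function name is illustrative:

```python
import hashlib


def INT_8(n: int) -> bytes:
    return n.to_bytes(8, "big")  # assumed encoding: 8 bytes, big-endian


def disaster_srv(period_length: int, period_num: int) -> bytes:
    """Fallback SRV used when the consensus lacks the wanted value."""
    data = b"shared-random-disaster" + INT_8(period_length) + INT_8(period_num)
    return hashlib.sha3_256(data).digest()  # assumed hash: SHA3-256
```

   Since the inputs are public, every client and service derives the
   same value for a given period, which keeps them on the same hash
   ring, though the value is predictable in advance.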

2.3.2. Hidden services and changing shared random values

   It's theoretically possible that the consensus shared random values will
   change or disappear in the middle of a time period because of directory
   authorities dropping offline or misbehaving.

   To avoid client reachability issues in this rare event, hidden services
   should use the new shared random values to find the new responsible HSDirs
   and upload their descriptors there.

   XXX How long should they upload descriptors there for?

2.4. Hidden service descriptors: outer wrapper [DESC-OUTER]

   The format for a hidden service descriptor is as follows, using the
   meta-format from dir-spec.txt.

     "hs-descriptor" SP version-number NL

       [At start, exactly once.]

       The version-number is a 32 bit unsigned integer indicating the version
       of the descriptor. Current version is "3".

     "descriptor-lifetime" SP LifetimeMinutes NL

       [Exactly once.]

       The lifetime of a descriptor in minutes. An HSDir SHOULD expire the
       hidden service descriptor at least LifetimeMinutes after it was
       uploaded.

       The LifetimeMinutes field can take values between 30 and 3000 (50 hours).

    "descriptor-signing-key-cert" NL certificate NL

       [Exactly once.]

       The 'certificate' field contains a certificate in the format from
       proposal 220, wrapped with "-----BEGIN ED25519 CERT-----".  The
       certificate cross-certifies the short-term descriptor signing key with
       the blinded public key.  The certificate type must be [08], and the
       blinded public key must be present as the signing-key extension.

     "revision-counter" SP Integer NL

       [Exactly once.]

       The revision number of the descriptor. If an HSDir receives a
       second descriptor for a key that it already has a descriptor for,
       it should retain and serve the descriptor with the higher
       revision-counter.

       (Checking for monotonically increasing revision-counter values
       prevents an attacker from replacing a newer descriptor signed by
       a given key with a copy of an older version.)

     "superencrypted" NL encrypted-string

       [Exactly once.]

       An encrypted blob, whose format is discussed in [HS-DESC-ENC] below. The
       blob is base64 encoded and enclosed in -----BEGIN MESSAGE----- and
       -----END MESSAGE----- wrappers.

     "signature" SP signature NL

       [Exactly once, at end.]

       A signature of all previous fields, using the signing key in the
       descriptor-signing-key-cert line, prefixed by the string "Tor onion
       service descriptor sig v3". We use a separate key for signing, so that
       the hidden service host does not need to have its private blinded key
       online.

   HSDirs accept hidden service descriptors of up to 50k bytes (a consensus
   parameter should also be introduced to control this value).
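
   The HSDir-side rules above (size limit, lifetime range, and keeping the
   descriptor with the higher revision-counter) can be sketched as follows.
   This is only an illustration: the class and function names are
   hypothetical, "50k bytes" is read as 50,000 bytes, and signature and
   certificate checks are omitted entirely.

```python
from dataclasses import dataclass

MAX_DESCRIPTOR_BYTES = 50_000   # assumption: "50k bytes" read as 50,000
MIN_LIFETIME_MINUTES = 30
MAX_LIFETIME_MINUTES = 3000


@dataclass
class StoredDescriptor:
    revision_counter: int
    lifetime_minutes: int
    uploaded_at: float      # seconds since the epoch
    raw: bytes

    def expired(self, now: float) -> bool:
        """True once LifetimeMinutes have passed since the upload."""
        return now >= self.uploaded_at + self.lifetime_minutes * 60


class DescriptorCache:
    """Hypothetical HSDir-side store, keyed by blinded public key."""

    def __init__(self) -> None:
        self._by_key = {}   # maps blinded key bytes -> StoredDescriptor

    def offer(self, blinded_key: bytes, desc: StoredDescriptor) -> bool:
        """Return True if the descriptor was stored, False if rejected."""
        if len(desc.raw) > MAX_DESCRIPTOR_BYTES:
            return False
        if not MIN_LIFETIME_MINUTES <= desc.lifetime_minutes <= MAX_LIFETIME_MINUTES:
            return False
        current = self._by_key.get(blinded_key)
        # Keep whichever descriptor has the higher revision-counter.
        if current is not None and desc.revision_counter <= current.revision_counter:
            return False
        self._by_key[blinded_key] = desc
        return True
```

   Rejecting an equal revision-counter is a choice made for this sketch; the
   text above only says that the higher value wins.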

2.5. Hidden service descriptors: encryption format [HS-DESC-ENC]

   Hidden service descriptors are protected by two layers of encryption.
   Clients need to decrypt both layers to connect to the hidden service.

   The first layer of encryption provides confidentiality against entities who
   don't know the public key of the hidden service (e.g. HSDirs), while the
   second layer of encryption is only useful when client authorization is enabled
   and protects against entities that do not possess valid client credentials.

2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER]

   The first layer of HS descriptor encryption is designed to protect
   descriptor confidentiality against entities who don't know the blinded
   public key of the hidden service.

2.5.1.1. First layer encryption logic

   The encryption keys and format for the first layer of encryption are
   generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization
   parameters:

     SECRET_DATA = blinded-public-key
     STRING_CONSTANT = "hsdir-superencrypted-data"

   The ciphertext is placed on the "superencrypted" field of the descriptor.

   Before encryption the plaintext is padded with NUL bytes to the nearest
   multiple of 10k bytes.
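
   A minimal sketch of the padding step, assuming that "10k bytes" means
   10,000 bytes (10,240 would be an equally plausible reading) and that
   padding always rounds up. The function name is hypothetical.

```python
PAD_BLOCK = 10_000  # assumption: "10k bytes" read as 10,000


def pad_plaintext(plaintext: bytes, block: int = PAD_BLOCK) -> bytes:
    """Append NUL bytes until the length is a multiple of `block`."""
    remainder = len(plaintext) % block
    if remainder == 0:
        return plaintext
    return plaintext + b"\x00" * (block - remainder)
```

   With this reading, a 10,001-byte plaintext grows to 20,000 bytes, while a
   plaintext that is already a multiple of the block size is left unchanged.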

2.5.1.2. First layer plaintext format

   After clients decrypt the first layer of encryption, they need to parse the
   plaintext to get to the second layer ciphertext which is contained in the
   "encrypted" field.

   If client auth is enabled, the hidden service generates a fresh
   descriptor_cookie key (32 random bytes) and encrypts it using each
   authorized client's identity x25519 key. Authorized clients can use the
   descriptor cookie to decrypt the second layer of encryption. Our encryption
   scheme requires the hidden service to also generate an ephemeral x25519
   keypair for each new descriptor.

   If client auth is disabled, fake data is placed in each of the fields below
   to obfuscate whether client authorization is enabled.

   Here are all the supported fields:

     "desc-auth-type" SP type NL

      [Exactly once]

      This field contains the type of authorization used to protect the
      descriptor. The only recognized type is "x25519" and specifies the
      encryption scheme described in this section.

      If client authorization is disabled, the value here should be "x25519".

     "desc-auth-ephemeral-key" SP key NL

      [Exactly once]

      This field contains an ephemeral x25519 public key generated by the
      hidden service and encoded in base64. The key is used by the encryption
      scheme below.

      If client authorization is disabled, the value here should be a fresh
      x25519 pubkey that will remain unused.

     "auth-client" SP client-id SP iv SP encrypted-cookie

      [Any number]

      When client authorization is enabled, the hidden service inserts an
      "auth-client" line for each of its authorized clients. If client
      authorization is disabled, the fields here can be populated with random
      data of the right size (that's 8 bytes for 'client-id', 16 bytes for 'iv'
      and 16 bytes for 'encrypted-cookie' all encoded with base64).

      When client authorization is enabled, each "auth-client" line contains
      the descriptor cookie encrypted to each individual client. We assume that
      each authorized client possesses a pre-shared x25519 keypair which is
      used to decrypt the descriptor cookie.

      We now describe the descriptor cookie encryption scheme. Here are the
      relevant keys:

          client_x = private x25519 key of authorized client
          client_X = public x25519 key of authorized client
          hs_y = private key of ephemeral x25519 keypair of hidden service
          hs_Y = public key of ephemeral x25519 keypair of hidden service
          descriptor_cookie = descriptor cookie used to encrypt the descriptor

      And here is what the hidden service computes:

          SECRET_SEED = x25519(hs_y, client_X)
          KEYS = KDF(SECRET_SEED, 40)
          CLIENT-ID = first 8 bytes of KEYS
          COOKIE-KEY = last 32 bytes of KEYS

      Here is a description of the fields in the "auth-client" line:

      - The "client-id" field is CLIENT-ID from above encoded in base64.

      - The "iv" field is 16 random bytes encoded in base64.

      - The "encrypted-cookie" field contains the descriptor cookie ciphertext
        as follows and is encoded in base64:
           encrypted-cookie = STREAM(iv, COOKIE-KEY) XOR descriptor_cookie

      See section [FIRST-LAYER-CLIENT-BEHAVIOR] for the client-side logic of
      how to decrypt the descriptor cookie.

     "encrypted" NL encrypted-string

      [Exactly once]

      An encrypted blob containing the second layer ciphertext, whose format is
      discussed in [HS-DESC-SECOND-LAYER] below. The blob is base64 encoded
      and enclosed in -----BEGIN MESSAGE----- and -----END MESSAGE----- wrappers.

2.5.1.3. Client behavior [FIRST-LAYER-CLIENT-BEHAVIOR]

    The goal of clients at this stage is to decrypt the "encrypted" field as
    described in [HS-DESC-SECOND-LAYER].

    If client authorization is enabled, authorized clients need to extract the
    descriptor cookie to proceed with decryption of the second layer as
    follows:

    An authorized client parsing the first layer of an encrypted descriptor
    extracts the ephemeral key from "desc-auth-ephemeral-key" and calculates
    CLIENT-ID and COOKIE-KEY as described in the section above using their
    x25519 private key. The client then uses CLIENT-ID to find the right
    "auth-client" field which contains the ciphertext of the descriptor
    cookie. The client then uses COOKIE-KEY and the iv to decrypt the
    descriptor_cookie, which is used to decrypt the second layer of descriptor
    encryption as described in [HS-DESC-SECOND-LAYER].
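
    The cookie handling on both sides can be sketched as below. Several
    things here are assumptions and not taken from this section: KDF is
    instantiated as SHAKE-256, SECRET_SEED is passed in as bytes (the x25519
    computation needs a curve25519 library and is not shown), and
    toy_stream() is a stand-in keystream used only so the example runs with
    the standard library. It is NOT the STREAM cipher of this proposal. All
    function names are hypothetical.

```python
import hashlib
import os
from typing import Callable, Iterable, Optional, Tuple

AuthClientLine = Tuple[bytes, bytes, bytes]  # (client-id, iv, encrypted-cookie)


def kdf(data: bytes, n: int) -> bytes:
    """Assumption: KDF(x, n) is SHAKE-256 with n bytes of output."""
    return hashlib.shake_256(data).digest(n)


def toy_stream(iv: bytes, key: bytes, n: int) -> bytes:
    """Demo-only keystream; replace with the real STREAM cipher."""
    return hashlib.shake_256(b"toy-stream" + key + iv).digest(n)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def client_id_and_cookie_key(secret_seed: bytes) -> Tuple[bytes, bytes]:
    keys = kdf(secret_seed, 40)
    return keys[:8], keys[-32:]          # CLIENT-ID, COOKIE-KEY


def make_auth_client(secret_seed: bytes, descriptor_cookie: bytes,
                     stream: Callable[[bytes, bytes, int], bytes] = toy_stream
                     ) -> AuthClientLine:
    """Service side: secret_seed = x25519(hs_y, client_X)."""
    client_id, cookie_key = client_id_and_cookie_key(secret_seed)
    iv = os.urandom(16)
    keystream = stream(iv, cookie_key, len(descriptor_cookie))
    return client_id, iv, xor(keystream, descriptor_cookie)


def recover_cookie(secret_seed: bytes, lines: Iterable[AuthClientLine],
                   stream: Callable[[bytes, bytes, int], bytes] = toy_stream
                   ) -> Optional[bytes]:
    """Client side: secret_seed = x25519(client_x, hs_Y)."""
    client_id, cookie_key = client_id_and_cookie_key(secret_seed)
    for line_id, iv, encrypted in lines:
        if line_id == client_id:
            return xor(stream(iv, cookie_key, len(encrypted)), encrypted)
    return None   # no matching "auth-client" line: not an authorized client
```

    Both sides arrive at the same SECRET_SEED through the x25519 exchange,
    which is why the client can recompute CLIENT-ID and COOKIE-KEY without any
    extra data in the descriptor.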

2.5.1.4. Hiding client authorization data

    Hidden services should avoid leaking whether client authorization is
    enabled or how many authorized clients there are.

    Hence even when client authorization is disabled, the hidden service adds
    fake "desc-auth-type", "desc-auth-ephemeral-key" and "auth-client" lines to
    the descriptor, as described in [HS-DESC-FIRST-LAYER].

    The hidden service also avoids leaking the number of authorized clients by
    adding fake "auth-client" entries to its descriptor. Specifically,
    descriptors always contain a number of authorized clients that is a
    multiple of 16 by adding fake "auth-client" entries if needed.
    [XXX consider randomization of the value 16]

    Clients MUST accept descriptors with any number of "auth-client" lines as
    long as the total descriptor size is within the max limit of 50k (also
    controlled with a consensus parameter).
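
    The padding rule for "auth-client" lines amounts to rounding the count up
    to a multiple of 16. The sketch below assumes that a service with no real
    clients still emits one full block of 16 fake lines, since emitting none
    would itself reveal that authorization is disabled; that detail is an
    assumption of this example and the function name is hypothetical.

```python
def fake_entries_needed(real_clients: int, multiple: int = 16) -> int:
    """Number of fake "auth-client" lines to add to the descriptor."""
    if real_clients < 0:
        raise ValueError("client count cannot be negative")
    if real_clients == 0:
        return multiple          # assumption: never emit zero lines
    return (-real_clients) % multiple
```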

2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER]

   The second layer of descriptor encryption is designed to protect descriptor
   confidentiality against unauthorized clients. If client authorization is
   enabled, it's encrypted using the descriptor_cookie, and contains needed
   information for connecting to the hidden service, like the list of its
   introduction points.

   If client authorization is disabled, then the second layer of HS encryption
   does not offer any additional security, but is still used.

2.5.2.1. Second layer encryption keys

   The encryption keys and format for the second layer of encryption are
   generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization
   parameters as follows:

     SECRET_DATA = blinded-public-key | descriptor_cookie
     STRING_CONSTANT = "hsdir-encrypted-data"

   If client authorization is disabled the 'descriptor_cookie' field is left blank.

   The ciphertext is placed on the "encrypted" field of the descriptor.

2.5.2.2. Second layer plaintext format

   After decrypting the second layer ciphertext, clients can finally learn the
   list of intro points etc. The plaintext has the following format:

     "create2-formats" SP formats NL

      [Exactly once]

      A space-separated list of integers denoting CREATE2 cell format numbers
      that the server recognizes. Must include at least ntor as described in
      tor-spec.txt. See tor-spec section 5.1 for a list of recognized
      handshake types.

     "intro-auth-required" SP types NL

      [At most once]

      A space-separated list of introduction-layer authentication types; see
      section [INTRO-AUTH] for more info. A client that does not support at
      least one of these authentication types will not be able to contact the
      host. Recognized types are: 'password' and 'ed25519'.

     "single-onion-service"

      [None or at most once]

      If present, this line indicates that the service is a Single Onion
      Service (see prop260 for more details about that type of service). This
      field was introduced in 0.3.0, so services running 0.2.9 do not
      include it.

     Followed by zero or more introduction points as follows (see section
     [NUM_INTRO_POINT] below for accepted values):

        "introduction-point" SP link-specifiers NL

          [Exactly once per introduction point at start of introduction
            point section]

          The link-specifiers is a base64 encoding of a link specifier
          block in the format described in BUILDING-BLOCKS.

        "onion-key" SP "ntor" SP key NL

          [Exactly once per introduction point]

          The key is a base64 encoded curve25519 public key which is the onion
          key of the introduction point Tor node used for the ntor handshake
          when a client extends to it.

        "auth-key" NL certificate NL

          [Exactly once per introduction point]

          The certificate is a proposal 220 certificate wrapped in
          "-----BEGIN ED25519 CERT-----", cross-certifying the descriptor
          signing key with the introduction point authentication key, which
          is included in the mandatory signing-key extension.  The certificate
          type must be [09].

        "enc-key" SP "ntor" SP key NL

          [Exactly once per introduction point]

          The key is a base64 encoded curve25519 public key used to encrypt
          the introduction request to service.

        "enc-key-cert" NL certificate NL

          [Exactly once per introduction point]

          Cross-certification of the descriptor signing key by the encryption
          key.

          For "ntor" keys, certificate is a proposal 220 certificate wrapped
          in "-----BEGIN ED25519 CERT-----" armor, cross-certifying the
          descriptor signing key with the ed25519 equivalent of a curve25519
          public encryption key derived using the process in proposal 228
          appendix A. The certificate type must be [0B], and the signing-key
          extension is mandatory.

        "legacy-key" NL key NL

          [None or at most once per introduction point]

          The key is an ASN.1 encoded RSA public key in PEM format used for a
          legacy introduction point as described in [LEGACY_EST_INTRO].

          This field is only present if the introduction point only supports
          the legacy protocol (v2), that is, <= 0.2.9 or the protocol version
          value "HSIntro 3".

        "legacy-key-cert" NL certificate NL

          [None or at most once per introduction point]

          MUST be present if "legacy-key" is present.

          The certificate is a proposal 220 RSA->Ed cross-certificate wrapped
          in "-----BEGIN CROSSCERT-----" armor, cross-certifying the
          descriptor signing key with the RSA public key found in
          "legacy-key".

   To remain compatible with future revisions to the descriptor format,
   clients should ignore unrecognized lines in the descriptor.
   Other encryption and authentication key formats are allowed; clients
   should ignore ones they do not recognize.

   Clients who manage to extract the introduction points of the hidden service
   can proceed with the introduction protocol as specified in [INTRO-PROTOCOL].
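
   A rough sketch of how a client might walk the second layer plaintext,
   grouping lines into introduction points and skipping anything it does not
   recognize. It performs no certificate or base64 validation, and the
   function names and the returned dictionary layout are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

OBJECT_KEYWORDS = ("auth-key", "enc-key-cert", "legacy-key", "legacy-key-cert")


def read_object(lines: Iterator[str]) -> str:
    """Collect an armored object up to and including its -----END line."""
    collected = []
    for line in lines:
        collected.append(line)
        if line.startswith("-----END "):
            break
    return "\n".join(collected)


def parse_second_layer(text: str) -> Tuple[Dict[str, object], List[Dict[str, str]]]:
    header: Dict[str, object] = {}
    intro_points: List[Dict[str, str]] = []
    current = None                      # the introduction point being filled
    lines = iter(text.splitlines())
    for line in lines:
        keyword, _, rest = line.partition(" ")
        if keyword == "introduction-point":
            current = {"link-specifiers": rest}
            intro_points.append(current)
        elif keyword in OBJECT_KEYWORDS:
            obj = read_object(lines)
            if current is not None:
                current[keyword] = obj
        elif keyword in ("onion-key", "enc-key"):
            key_type, _, key = rest.partition(" ")
            # Key formats other than "ntor" are skipped, not rejected.
            if current is not None and key_type == "ntor":
                current[keyword] = key
        elif current is None and keyword in ("create2-formats", "intro-auth-required"):
            header[keyword] = rest.split()
        elif current is None and keyword == "single-onion-service":
            header[keyword] = True
        # Any other line is unrecognized and ignored for forward compatibility.
    return header, intro_points
```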

2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS]

   In this section we present the generic encryption format for hidden service
   descriptors. We use the same encryption format in both encryption layers,
   hence we introduce two customization parameters SECRET_DATA and
   STRING_CONSTANT which vary between the layers.

   The SECRET_DATA parameter specifies the secret data that are used during
   encryption key generation, while STRING_CONSTANT is merely a string constant
   that is used as part of the KDF.

   Here is the key generation logic:

       SALT = 16 bytes from H(random), changes each time we rebuild the
              descriptor even if the content of the descriptor hasn't changed.
              (So that we don't leak whether the intro point list etc. changed)

       secret_input = SECRET_DATA | subcredential | INT_8(revision_counter)

       keys = KDF(secret_input | SALT | STRING_CONSTANT, S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN)

       SECRET_KEY = first S_KEY_LEN bytes of keys
       SECRET_IV  = next S_IV_LEN bytes of keys
       MAC_KEY    = last MAC_KEY_LEN bytes of keys

   The encrypted data has the format:

       SALT       hashed random bytes from above  [16 bytes]
       ENCRYPTED  The ciphertext                  [variable]
       MAC        MAC of both above fields        [32 bytes]

   The final encryption format is ENCRYPTED = STREAM(SECRET_IV,SECRET_KEY) XOR Plaintext
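
   The key generation logic above can be sketched as follows. The
   instantiations are assumptions of this example and not defined in this
   section: KDF as SHAKE-256, H as SHA3-256, INT_8 as an 8-byte big-endian
   integer, and key lengths of 32/16/32 bytes. The function names are
   hypothetical, and the MAC and STREAM steps are not shown.

```python
import hashlib
import os
import struct
from typing import Tuple

S_KEY_LEN = 32     # assumption
S_IV_LEN = 16      # assumption
MAC_KEY_LEN = 32   # assumption


def fresh_salt() -> bytes:
    """16 bytes taken from H(random); regenerated on every rebuild."""
    return hashlib.sha3_256(os.urandom(32)).digest()[:16]


def derive_desc_keys(secret_data: bytes, subcredential: bytes,
                     revision_counter: int, salt: bytes,
                     string_constant: bytes) -> Tuple[bytes, bytes, bytes]:
    """Return (SECRET_KEY, SECRET_IV, MAC_KEY) for one encryption layer."""
    secret_input = secret_data + subcredential + struct.pack(">Q", revision_counter)
    total = S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN
    keys = hashlib.shake_256(secret_input + salt + string_constant).digest(total)
    secret_key = keys[:S_KEY_LEN]
    secret_iv = keys[S_KEY_LEN:S_KEY_LEN + S_IV_LEN]
    mac_key = keys[-MAC_KEY_LEN:]
    return secret_key, secret_iv, mac_key
```

   For the first layer the caller would pass the blinded public key as
   secret_data with b"hsdir-superencrypted-data"; for the second layer, the
   blinded public key concatenated with the descriptor cookie, with
   b"hsdir-encrypted-data".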

2.5.4. Number of introduction points [NUM_INTRO_POINT]

   This section defines the minimum, default, and maximum number of
   introduction points that a hidden service descriptor can have:

      Minimum: 0 - Default: 3 - Maximum: 20

   A value of 0 means that the service is still alive but doesn't want
   to be reached by any client at the moment. Note that the descriptor size
   increases considerably as more introduction points are added.

   The reason for a maximum value of 20 is to give enough scalability to tools
   like OnionBalance to be able to load balance up to 120 servers (20 x 6
   HSDirs), while also keeping user-defined values from producing gigantic
   descriptors that would overwhelm hidden service directories.

3. The introduction protocol [INTRO-PROTOCOL]

   The introduction protocol proceeds in three steps.

   First, a hidden service host builds an anonymous circuit to a Tor
   node and registers that circuit as an introduction point.

        [After 'First' and before 'Second', the hidden service publishes its
        introduction points and associated keys, and the client fetches
        them as described in section [HSDIR] above.]

   Second, a client builds an anonymous circuit to the introduction
   point, and sends an introduction request.

   Third, the introduction point relays the introduction request along
   the introduction circuit to the hidden service host, and acknowledges
   the introduction request to the client.

3.1. Registering an introduction point [REG_INTRO_POINT]

3.1.1. Extensible ESTABLISH_INTRO protocol. [EST_INTRO]

   When a hidden service is establishing a new introduction point, it
   sends an ESTABLISH_INTRO cell with the following contents:

     AUTH_KEY_TYPE    [1 byte]
     AUTH_KEY_LEN     [2 bytes]
     AUTH_KEY         [AUTH_KEY_LEN bytes]
     N_EXTENSIONS     [1 byte]
     N_EXTENSIONS times:
        EXT_FIELD_TYPE [1 byte]
        EXT_FIELD_LEN  [1 byte]
        EXT_FIELD      [EXT_FIELD_LEN bytes]
     HANDSHAKE_AUTH   [MAC_LEN bytes]
     SIG_LEN          [2 bytes]
     SIG              [SIG_LEN bytes]

   The AUTH_KEY_TYPE field indicates the type of the introduction point
   authentication key and the type of the MAC to use in
   HANDSHAKE_AUTH. Recognized types are:

       [00, 01] -- Reserved for legacy introduction cells; see
                   [LEGACY_EST_INTRO] below
       [02] -- Ed25519; SHA3-256.

   The AUTH_KEY_LEN field determines the length of the AUTH_KEY
   field. The AUTH_KEY field contains the public introduction point
   authentication key.

   The EXT_FIELD_TYPE, EXT_FIELD_LEN, EXT_FIELD entries are reserved for
   future extensions to the introduction protocol. Extensions with
   unrecognized EXT_FIELD_TYPE values must be ignored.

   The HANDSHAKE_AUTH field contains the MAC of all earlier fields in
   the cell using as its key the shared per-circuit material ("KH")
   generated during the circuit extension protocol; see tor-spec.txt
   section 5.2, "Setting circuit keys". It prevents replays of
   ESTABLISH_INTRO cells.

   SIG_LEN is the length of the signature.

   SIG is a signature, using AUTH_KEY, of all contents of the cell, up
   to but not including SIG. These contents are prefixed with the string
   "Tor establish-intro cell v1".

   Upon receiving an ESTABLISH_INTRO cell, a Tor node first decodes the
   key and the signature, and checks the signature. The node must reject
   the ESTABLISH_INTRO cell and destroy the circuit in these cases:

        * If the key type is unrecognized
        * If the key is ill-formatted
        * If the signature is incorrect
        * If the HANDSHAKE_AUTH value is incorrect

        * If the circuit is already a rendezvous circuit.
        * If the circuit is already an introduction circuit.
          [TODO: some scalability designs fail there.]
        * If the key is already in use by another circuit.

   Otherwise, the node must associate the key with the circuit, for use
   later in INTRODUCE1 cells.
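
   The cell layout above can be encoded and parsed as in the sketch below. It
   assumes network byte order for multi-byte fields and MAC_LEN = 32 (the
   SHA3-256 output size). It only handles framing: checking HANDSHAKE_AUTH and
   SIG requires the circuit key material and an Ed25519 library, so the parser
   returns the byte ranges those checks would cover. All names are
   hypothetical.

```python
import struct
from typing import List, NamedTuple, Tuple

MAC_LEN = 32            # assumption: SHA3-256 output length
AUTH_KEY_ED25519 = 2    # the [02] key type


class EstablishIntro(NamedTuple):
    auth_key: bytes
    extensions: List[Tuple[int, bytes]]
    handshake_auth: bytes
    sig: bytes
    mac_covered: bytes   # all fields before HANDSHAKE_AUTH
    sig_covered: bytes   # all fields up to, but not including, SIG


def encode_establish_intro(auth_key: bytes, extensions: List[Tuple[int, bytes]],
                           handshake_auth: bytes, sig: bytes) -> bytes:
    out = struct.pack(">BH", AUTH_KEY_ED25519, len(auth_key)) + auth_key
    out += struct.pack(">B", len(extensions))
    for ext_type, ext in extensions:
        out += struct.pack(">BB", ext_type, len(ext)) + ext
    return out + handshake_auth + struct.pack(">H", len(sig)) + sig


def parse_establish_intro(body: bytes) -> EstablishIntro:
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        if pos + n > len(body):
            raise ValueError("truncated ESTABLISH_INTRO cell")
        chunk = body[pos:pos + n]
        pos += n
        return chunk

    if take(1)[0] != AUTH_KEY_ED25519:
        raise ValueError("unrecognized AUTH_KEY_TYPE")
    (key_len,) = struct.unpack(">H", take(2))
    if key_len != 32:                      # Ed25519 public keys are 32 bytes
        raise ValueError("ill-formatted AUTH_KEY")
    auth_key = take(key_len)
    extensions = []
    for _ in range(take(1)[0]):
        ext_type = take(1)[0]
        extensions.append((ext_type, take(take(1)[0])))
    mac_covered = body[:pos]
    handshake_auth = take(MAC_LEN)
    (sig_len,) = struct.unpack(">H", take(2))
    sig_covered = body[:pos]
    sig = take(sig_len)
    return EstablishIntro(auth_key, extensions, handshake_auth, sig,
                          mac_covered, sig_covered)
```

   Bytes after SIG are ignored here on the assumption that the relay payload
   may be padded; a ValueError corresponds to the "reject and destroy the
   circuit" cases listed above.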

3.1.2. Registering an introduction point on a legacy Tor node
       [LEGACY_EST_INTRO]

   Tor nodes should also support an older version of the ESTABLISH_INTRO
   cell, first documented in rend-spec.txt. New hidden service hosts
   must use this format when establishing introduction points at older
   Tor nodes that do not support the format above in [EST_INTRO].

   In this older protocol, an ESTABLISH_INTRO cell contains:

        KEY_LEN        [2 bytes]
        KEY            [KEY_LEN bytes]
        HANDSHAKE_AUTH [20 bytes]
        SIG            [variable, up to end of relay payload]

   The KEY_LEN variable determines the length of the KEY field.

   The KEY field is the ASN1-encoded legacy RSA public key that was also
   included in the hidden service descriptor.

   The HANDSHAKE_AUTH field contains the SHA1 digest of (KH | "INTRODUCE").

   The SIG field contains an RSA signature, using PKCS1 padding, of all
   earlier fields.

   Older versions of Tor always use a 1024-bit RSA key for these introduction
   authentication keys.
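
   For the legacy format, the HANDSHAKE_AUTH value and the bytes covered by
   the RSA signature can be computed as in this sketch. It assumes KEY_LEN is
   encoded in network byte order; the RSA signature itself is not shown, and
   the function names are hypothetical.

```python
import hashlib
import struct


def legacy_handshake_auth(kh: bytes) -> bytes:
    """SHA1 digest of (KH | "INTRODUCE"), 20 bytes."""
    return hashlib.sha1(kh + b"INTRODUCE").digest()


def legacy_signed_prefix(key_asn1: bytes, kh: bytes) -> bytes:
    """KEY_LEN | KEY | HANDSHAKE_AUTH: everything that precedes SIG."""
    return struct.pack(">H", len(key_asn1)) + key_asn1 + legacy_handshake_auth(kh)
```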

3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED]

   After setting up an introduction circuit, the introduction point reports its
   status back to the hidden service host with an INTRO_ESTABLISHED cell.

   The INTRO_ESTABLISHED cell has the following contents:

     N_EXTENSIONS [1 byte]
     N_EXTENSIONS times:
       EXT_FIELD_TYPE [1 byte]
       EXT_FIELD_LEN  [1 byte]
       EXT_FIELD      [EXT_FIELD_LEN bytes]

   Older versions of Tor send back an empty INTRO_ESTABLISHED cell instead.
   Services must accept an empty INTRO_ESTABLISHED cell from a legacy relay.

3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1]

   In order to participate in the introduction protocol, a client must
   know the following:

     * An introduction point for a service.
     * The introduction authentication key for that introduction point.
     * The introduction encryption key for that introduction point.

   The client sends an INTRODUCE1 cell to the introduction point,
   containing an identifier for the service, an identifier for the
   encryption key that the client intends to use, and an opaque blob to
   be relayed to the hidden service host.

   In reply, the introduction point sends an INTRODUCE_ACK cell back to
   the client, either informing it that its request has been delivered,
   or that its request will not succeed.

   [TODO: specify what tor should do when receiving a malformed cell. Drop it?
          Kill circuit? This goes for all possible cells.]

3.2.1. INTRODUCE1 cell format [FMT_INTRO1]

   When a client is connecting to an introduction point, INTRODUCE1 cells
   should be of the form:

     LEGACY_KEY_ID   [20 bytes]
     AUTH_KEY_TYPE   [1 byte]
     AUTH_KEY_LEN    [2 bytes]
     AUTH_KEY        [AUTH_KEY_LEN bytes]
     N_EXTENSIONS    [1 byte]
     N_EXTENSIONS times:
       EXT_FIELD_TYPE [1 byte]
       EXT_FIELD_LEN  [1 byte]
       EXT_FIELD      [EXT_FIELD_LEN bytes]
     ENCRYPTED        [Up to end of relay payload]

   AUTH_KEY_TYPE is defined as in [EST_INTRO]. Currently, the only value of
   AUTH_KEY_TYPE for this cell is an Ed25519 public key [02].

   The LEGACY_KEY_ID field is used to distinguish between legacy and new style
   INTRODUCE1 cells. In new style INTRODUCE1 cells, LEGACY_KEY_ID is 20 zero
   bytes. Upon receiving an INTRODUCE1 cell, the introduction point checks the
   LEGACY_KEY_ID field. If LEGACY_KEY_ID is non-zero, the INTRODUCE1 cell
   should be handled as a legacy INTRODUCE1 cell by the intro point.

   Upon receiving an INTRODUCE1 cell, the introduction point checks
   whether AUTH_KEY matches the introduction point authentication key for an
   active introduction circuit.  If so, the introduction point sends an
   INTRODUCE2 cell with exactly the same contents to the service, and sends an
   INTRODUCE_ACK response to the client.
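
   The introduction point's handling of a new style INTRODUCE1 cell can be
   sketched as below. The mapping from AUTH_KEY to circuit is a hypothetical
   stand-in for whatever state a relay keeps for its active introduction
   circuits, and the function names are illustrative only.

```python
import struct
from typing import Dict, Optional, Tuple

LEGACY_KEY_ID_LEN = 20
AUTH_KEY_ED25519 = 2


def extract_auth_key(body: bytes) -> bytes:
    """Return AUTH_KEY from a new style INTRODUCE1 cell body."""
    if len(body) < LEGACY_KEY_ID_LEN + 3:
        raise ValueError("truncated INTRODUCE1 cell")
    if body[:LEGACY_KEY_ID_LEN] != bytes(LEGACY_KEY_ID_LEN):
        raise ValueError("legacy INTRODUCE1 cell; handle with the legacy code path")
    key_type, key_len = struct.unpack(">BH", body[20:23])
    if key_type != AUTH_KEY_ED25519:
        raise ValueError("unrecognized AUTH_KEY_TYPE")
    auth_key = body[23:23 + key_len]
    if len(auth_key) != key_len:
        raise ValueError("truncated AUTH_KEY")
    return auth_key


def route_introduce1(body: bytes, circuits: Dict[bytes, str]
                     ) -> Optional[Tuple[str, bytes]]:
    """Return (circuit, INTRODUCE2 body), or None if no circuit matches."""
    circuit = circuits.get(extract_auth_key(body))
    if circuit is None:
        return None
    return circuit, body    # INTRODUCE2 carries exactly the same contents
```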

3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK]

   An INTRODUCE_ACK cell has the following fields:

     STATUS       [2 bytes]
     N_EXTENSIONS [1 byte]
     N_EXTENSIONS times:
       EXT_FIELD_TYPE [1 byte]
       EXT_FIELD_LEN  [1 byte]
       EXT_FIELD      [EXT_FIELD_LEN bytes]

   Recognized status values are:
     [00 00] -- Success: cell relayed to hidden service host.
     [00 01] -- Failure: service ID not recognized
     [00 02] -- Bad message format
     [00 03] -- Can't relay cell to service
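
   A small encoder and parser for this cell might look like the following.
   The enum member names are invented for the example, network byte order is
   assumed, and an unrecognized status is reported as a ValueError, which is
   a choice of this sketch and not something the text prescribes.

```python
import enum
import struct
from typing import List, Sequence, Tuple


class IntroAckStatus(enum.IntEnum):
    SUCCESS = 0x0000
    SERVICE_ID_NOT_RECOGNIZED = 0x0001
    BAD_MESSAGE_FORMAT = 0x0002
    CANT_RELAY = 0x0003


def encode_introduce_ack(status: IntroAckStatus,
                         extensions: Sequence[Tuple[int, bytes]] = ()) -> bytes:
    out = struct.pack(">HB", int(status), len(extensions))
    for ext_type, data in extensions:
        out += struct.pack(">BB", ext_type, len(data)) + data
    return out


def parse_introduce_ack(body: bytes) -> Tuple[IntroAckStatus, List[Tuple[int, bytes]]]:
    if len(body) < 3:
        raise ValueError("truncated INTRODUCE_ACK cell")
    raw_status, n_extensions = struct.unpack(">HB", body[:3])
    status = IntroAckStatus(raw_status)   # ValueError if not a recognized value
    pos = 3
    extensions = []
    for _ in range(n_extensions):
        if pos + 2 > len(body):
            raise ValueError("truncated extension header")
        ext_type, ext_len = body[pos], body[pos + 1]
        data = body[pos + 2:pos + 2 + ext_len]
        if len(data) != ext_len:
            raise ValueError("truncated extension body")
        extensions.append((ext_type, data))
        pos += 2 + ext_len
    return status, extensions
```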

3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2]

   Upon receiving an INTRODUCE2 cell, the hidden service host checks whether
   the AUTH_KEY or LEGACY_KEY_ID field matches the keys for this
   introduction circuit.

   The service host then checks whether it has received a cell with these
   contents or rendezvous cookie before. If it has, it silently drops it as a
   replay. (It must maintain a replay cache for as long as it accepts cells
   with the same encryption key. Note that the encryption format below should
   be non-malleable.)

   If the cell is not a replay, it decrypts the ENCRYPTED field,
   establishes a shared key with the client, and authenticates the whole
   contents of the cell as having been unmodified since they left the
   client. There may be multiple ways of decrypting the ENCRYPTED field,
   depending on the chosen type of the encryption key. Requirements for
   an introduction handshake protocol are described in
   [INTRO-HANDSHAKE-REQS]. We specify one below in section
   [NTOR-WITH-EXTRA-DATA].

   The decrypted plaintext must have the form:

      RENDEZVOUS_COOKIE                          [20 bytes]
      N_EXTENSIONS                               [1 byte]
      N_EXTENSIONS times:
          EXT_FIELD_TYPE                         [1 byte]
          EXT_FIELD_LEN                          [1 byte]
          EXT_FIELD                              [EXT_FIELD_LEN bytes]
      ONION_KEY_TYPE                             [1 byte]
      ONION_KEY_LEN                              [2 bytes]
      ONION_KEY                                  [ONION_KEY_LEN bytes]
      NSPEC      (Number of link specifiers)     [1 byte]
      NSPEC times:
          LSTYPE (Link specifier type)           [1 byte]
          LSLEN  (Link specifier length)         [1 byte]
          LSPEC  (Link specifier)                [LSLEN bytes]
      PAD        (optional padding)              [up to end of plaintext]

   Upon processing this plaintext, the hidden service makes sure that
   any required authentication is present in the extension fields, and
   then extends a rendezvous circuit to the node described in the LSPEC
   fields, using the ONION_KEY to complete the extension. As mentioned
   in [BUILDING-BLOCKS], the "TLS-over-TCP, IPv4" and "Legacy node
   identity" specifiers must be present.

   The hidden service SHOULD NOT reject any LSTYPE fields which it
   doesn't recognize; instead, it should use them verbatim in its EXTEND
   request to the rendezvous point.

   The ONION_KEY_TYPE field is:

      [01] NTOR:          ONION_KEY is 32 bytes long.

   The ONION_KEY field describes the onion key that must be used when
   extending to the rendezvous point. It must be of a type listed as
   supported in the hidden service descriptor.

   When using a legacy introduction point, the INTRODUCE cells must be padded
   to a certain length using the PAD field in the encrypted portion.

   Upon receiving a well-formed INTRODUCE2 cell, the hidden service host
   will have:

     * The information needed to connect to the client's chosen
       rendezvous point.
     * The second half of a handshake to authenticate and establish a
       shared key with the hidden service client.
     * A set of shared keys to use for end-to-end encryption.
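
   Parsing the decrypted plaintext can be sketched as follows. The numeric
   link specifier types (0 for "TLS-over-TCP, IPv4" and 2 for "Legacy node
   identity") are taken from tor-spec.txt and are assumptions as far as this
   section is concerned. Unrecognized link specifiers are kept verbatim, and
   trailing PAD bytes are ignored. The class and function names are
   hypothetical.

```python
import struct
from typing import Dict, List, Tuple

ONION_KEY_NTOR = 1
LS_IPV4 = 0        # assumption: type number from tor-spec.txt
LS_LEGACY_ID = 2   # assumption: type number from tor-spec.txt


class Cursor:
    """Sequential reader that raises ValueError on truncated input."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        if self.pos + n > len(self.data):
            raise ValueError("truncated INTRODUCE2 plaintext")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def u8(self) -> int:
        return self.take(1)[0]

    def u16(self) -> int:
        return struct.unpack(">H", self.take(2))[0]


def read_tlv_list(cur: Cursor) -> List[Tuple[int, bytes]]:
    """Read a 1-byte count followed by (type, 1-byte length, value) items."""
    items = []
    for _ in range(cur.u8()):
        item_type = cur.u8()
        items.append((item_type, cur.take(cur.u8())))
    return items


def parse_introduce2_plaintext(plaintext: bytes) -> Dict[str, object]:
    cur = Cursor(plaintext)
    cookie = cur.take(20)
    extensions = read_tlv_list(cur)
    key_type = cur.u8()
    onion_key = cur.take(cur.u16())
    if key_type != ONION_KEY_NTOR or len(onion_key) != 32:
        raise ValueError("unsupported ONION_KEY")
    link_specifiers = read_tlv_list(cur)
    present = {ls_type for ls_type, _ in link_specifiers}
    if not {LS_IPV4, LS_LEGACY_ID} <= present:
        raise ValueError("missing mandatory link specifier")
    return {
        "rendezvous_cookie": cookie,
        "extensions": extensions,
        "onion_key": onion_key,
        "link_specifiers": link_specifiers,
    }
```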

3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS]

   When decoding the encrypted information in an INTRODUCE2 cell, a
   hidden service host must be able to:

     * Decrypt additional information included in the INTRODUCE2 cell,
       to include the rendezvous token and the information needed to
       extend to the rendezvous point.

     * Establish a set of shared keys for use with the client.

     * Authenticate that the cell has not been modified since the client
       generated it.

   Note that the old TAP-derived protocol of the previous hidden service
   design achieved the first two requirements, but not the third.

3.3.2. Example encryption handshake: ntor with extra data
       [NTOR-WITH-EXTRA-DATA]

   [TODO: relocate this]

   This is a variant of the ntor handshake (see tor-spec.txt, section
   5.1.4; see proposal 216; and see "Anonymity and one-way
   authentication in key-exchange protocols" by Goldberg, Stebila, and
   Ustaoglu).

   It behaves the same as the ntor handshake, except that, in addition
   to negotiating forward secure keys, it also provides a means for
   encrypting non-forward-secure data to the server (in this case, to
   the hidden service host) as part of the handshake.

   Notation here is as in section 5.1.4 of tor-spec.txt, which defines
   the ntor handshake.

   The PROTOID for this variant is "tor-hs-ntor-curve25519-sha3-256-1".
   We also use the following tweak values:

      t_hsenc    = PROTOID | ":hs_key_extract"
      t_hsverify = PROTOID | ":hs_verify"
      t_hsmac    = PROTOID | ":hs_mac"
      m_hsexpand = PROTOID | ":hs_key_expand"

   To make an INTRODUCE1 cell, the client must know a public encryption
   key B for the hidden service on this introduction circuit. The client
   generates a single-use keypair:
             x,X = KEYGEN()
   and computes:
             intro_secret_hs_input = EXP(B,x) | AUTH_KEY | X | B | PROTOID
             info = m_hsexpand | subcredential
             hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN)
             ENC_KEY = hs_keys[0:S_KEY_LEN]
             MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN]

   and sends, as the ENCRYPTED part of the INTRODUCE1 cell:

          CLIENT_PK                [PK_PUBKEY_LEN bytes]
          ENCRYPTED_DATA           [Padded to length of plaintext]
          MAC                      [MAC_LEN bytes]


   Substituting those fields into the INTRODUCE1 cell body format
   described in [FMT_INTRO1] above, we have

            LEGACY_KEY_ID               [20 bytes]
            AUTH_KEY_TYPE               [1 byte]
            AUTH_KEY_LEN                [2 bytes]
            AUTH_KEY                    [AUTH_KEY_LEN bytes]
            N_EXTENSIONS                [1 byte]
            N_EXTENSIONS times:
               EXT_FIELD_TYPE           [1 byte]
               EXT_FIELD_LEN            [1 byte]
               EXT_FIELD                [EXT_FIELD_LEN bytes]
            ENCRYPTED:
               CLIENT_PK                [PK_PUBKEY_LEN bytes]
               ENCRYPTED_DATA           [Padded to length of plaintext]
               MAC                      [MAC_LEN bytes]


   (This format is as documented in [FMT_INTRO1] above, except that here
   we describe how to build the ENCRYPTED portion.)

   Here, the encryption key plays the role of B in the regular ntor
   handshake, and the AUTH_KEY field plays the role of the node ID.
   The CLIENT_PK field is the public key X. The ENCRYPTED_DATA field is
   the message plaintext, encrypted with the symmetric key ENC_KEY. The
   MAC field is a MAC of all of the cell from the AUTH_KEY through the
   end of ENCRYPTED_DATA, using the MAC_KEY value as its key.

   To process this format, the hidden service checks PK_VALID(CLIENT_PK)
   as necessary, and then computes ENC_KEY and MAC_KEY as the client did
   above, except using EXP(CLIENT_PK,b) in the calculation of
   intro_secret_hs_input. The service host then checks whether the MAC is
   correct. If it is invalid, it drops the cell. Otherwise, it computes
   the plaintext by decrypting ENCRYPTED_DATA.

   The hidden service host now completes the service side of the
   extended ntor handshake, as described in tor-spec.txt section 5.1.4,
   with the modified PROTOID as given above. To be explicit, the hidden
   service host generates a keypair of y,Y = KEYGEN(), and uses its
   introduction point encryption key 'b' to compute:

      intro_secret_hs_input = EXP(X,b) | AUTH_KEY | X | B | PROTOID
      info = m_hsexpand | subcredential
      hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN)
      HS_DEC_KEY = hs_keys[0:S_KEY_LEN]
      HS_MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN]

      (The above are used to check the MAC and then decrypt the
      encrypted data.)

      rend_secret_hs_input = EXP(X,y) | EXP(X,b) | AUTH_KEY | B | X | Y | PROTOID
      NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc)
      verify = MAC(rend_secret_hs_input, t_hsverify)
      auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server"
      AUTH_INPUT_MAC = MAC(auth_input, t_hsmac)

      (The above are used to finish the ntor handshake.)

   The server's handshake reply is:
       SERVER_PK   Y                         [PK_PUBKEY_LEN bytes]
       AUTH        AUTH_INPUT_MAC            [MAC_LEN bytes]

   These fields will be sent to the client in a RENDEZVOUS1 cell using the
   HANDSHAKE_INFO element (see [JOIN_REND]).

   The hidden service host now also knows the keys generated by the
   handshake, which it will use to encrypt and authenticate data
   end-to-end between the client and the server. These keys are as
   computed in tor-spec.txt section 5.1.4.

3.4. Authentication during the introduction phase. [INTRO-AUTH]

   Hidden services may restrict access only to authorized users.
   One mechanism to do so is the credential mechanism, where only users who
   know the credential for a hidden service may connect at all.

3.4.1. Ed25519-based authentication.

   To authenticate with an Ed25519 private key, the user must include an
   extension field in the encrypted part of the INTRODUCE1 cell with an
   EXT_FIELD_TYPE type of [02] and the contents:

        Nonce     [16 bytes]
        Pubkey    [32 bytes]
        Signature [64 bytes]

   Nonce is a random value. Pubkey is the public key that will be used
   to authenticate. [TODO: should this be an identifier for the public
   key instead?]  Signature is the signature, using Ed25519, of:

        "hidserv-userauth-ed25519"
        Nonce       (same as above)
        Pubkey      (same as above)
        AUTH_KEY    (As in the INTRODUCE1 cell)

   The hidden service host checks this by seeing whether it recognizes
   and would accept a signature from the provided public key. If it
   would, then it checks whether the signature is correct. If it is,
   then the correct user has authenticated.
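
   The checks above can be sketched roughly as follows. This is only an
   illustration: the function names are hypothetical, the extension body is
   assumed to contain exactly the three fields listed above, and `verify`
   stands for an Ed25519 verification routine supplied by the caller (the
   Python standard library does not provide one).

```python
AUTH_PREFIX = b"hidserv-userauth-ed25519"
NONCE_LEN, PUBKEY_LEN, SIG_LEN = 16, 32, 64


def signed_message(nonce: bytes, pubkey: bytes, auth_key: bytes) -> bytes:
    """Return the byte string that the client signs with Ed25519."""
    if len(nonce) != NONCE_LEN or len(pubkey) != PUBKEY_LEN:
        raise ValueError("bad nonce or pubkey length")
    return AUTH_PREFIX + nonce + pubkey + auth_key


def parse_auth_extension(body: bytes):
    """Split the extension body into (nonce, pubkey, signature)."""
    if len(body) != NONCE_LEN + PUBKEY_LEN + SIG_LEN:
        raise ValueError("extension body must be 112 bytes")
    nonce = body[:NONCE_LEN]
    pubkey = body[NONCE_LEN:NONCE_LEN + PUBKEY_LEN]
    signature = body[NONCE_LEN + PUBKEY_LEN:]
    return nonce, pubkey, signature


def service_accepts(body: bytes, auth_key: bytes, authorized: set, verify) -> bool:
    """Apply the two checks from the text: known key first, then signature.

    `verify(pubkey, message, signature)` is a hypothetical caller-supplied
    Ed25519 verifier returning True or False.
    """
    nonce, pubkey, signature = parse_auth_extension(body)
    if pubkey not in authorized:
        return False
    message = signed_message(nonce, pubkey, auth_key)
    return bool(verify(pubkey, message, signature))
```

   Checking the key against the authorized set before verifying the
   signature mirrors the order given in the text and avoids signature
   verification for unknown keys.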

   Replay prevention on the whole cell is sufficient to prevent replays
   on the authentication.

   Users SHOULD NOT use the same public key with multiple hidden
   services.

4. The rendezvous protocol

   Before connecting to a hidden service, the client first builds a
   circuit to an arbitrarily chosen Tor node (known as the rendezvous
   point), and sends an ESTABLISH_RENDEZVOUS cell. The hidden service
   later connects to the same node and sends a RENDEZVOUS1 cell. Once
   this has occurred, the relay forwards the contents of that cell to the
   client in a RENDEZVOUS2 cell, and joins the two circuits together.

4.1. Establishing a rendezvous point [EST_REND_POINT]

   The client sends the rendezvous point a RELAY_COMMAND_ESTABLISH_RENDEZVOUS
   cell containing a 20-byte value.

            RENDEZVOUS_COOKIE         [20 bytes]

   Rendezvous points MUST ignore any extra bytes in an
   ESTABLISH_RENDEZVOUS cell. (Older versions of Tor did not.)

   The rendezvous cookie is an arbitrary 20-byte value, chosen randomly
   by the client. The client SHOULD choose a new rendezvous cookie for
   each new connection attempt. If the rendezvous cookie is already in
   use on an existing circuit, the rendezvous point should reject it and
   destroy the circuit.

   Upon receiving an ESTABLISH_RENDEZVOUS cell, the rendezvous point associates
   the cookie with the circuit on which it was sent. It replies to the client
   with an empty RENDEZVOUS_ESTABLISHED cell to indicate success. Clients MUST
   ignore any extra bytes in a RENDEZVOUS_ESTABLISHED cell.
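
   A toy model of the cookie bookkeeping described above might look like
   this. The class and method names are hypothetical, circuits are
   represented by opaque identifiers, and treating a payload shorter than
   20 bytes as an error is an assumption that the text does not spell out.

```python
COOKIE_LEN = 20


class RendezvousPoint:
    """Toy model of the cookie table kept by a rendezvous point."""

    def __init__(self):
        # Maps each pending rendezvous cookie to the client circuit
        # that registered it.
        self.waiting = {}

    def establish(self, circuit_id, payload: bytes) -> str:
        """Handle an ESTABLISH_RENDEZVOUS payload; return the action taken."""
        if len(payload) < COOKIE_LEN:
            return "destroy"  # assumption: too short to hold a cookie
        cookie = payload[:COOKIE_LEN]  # any extra bytes are ignored
        if cookie in self.waiting:
            return "destroy"  # cookie already in use on another circuit
        self.waiting[cookie] = circuit_id
        return "RENDEZVOUS_ESTABLISHED"
```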

   The client MUST NOT use the circuit which sent the cell for any
   purpose other than rendezvous with the given location-hidden service.

   The client should establish a rendezvous point BEFORE trying to
   connect to a hidden service.

4.2. Joining to a rendezvous point [JOIN_REND]

   To complete a rendezvous, the hidden service host builds a circuit to
   the rendezvous point and sends a RENDEZVOUS1 cell containing:

       RENDEZVOUS_COOKIE          [20 bytes]
       HANDSHAKE_INFO             [variable; depends on handshake type
                                   used.]

   where RENDEZVOUS_COOKIE is the cookie suggested by the client during the
   introduction (see [PROCESS_INTRO2]) and HANDSHAKE_INFO is defined in
   [NTOR-WITH-EXTRA-DATA].

   If the cookie matches the rendezvous cookie set on any
   not-yet-connected circuit on the rendezvous point, the rendezvous
   point connects the two circuits, and sends a RENDEZVOUS2 cell to the
   client containing the HANDSHAKE_INFO field of the RENDEZVOUS1 cell.

   Upon receiving the RENDEZVOUS2 cell, the client verifies that HANDSHAKE_INFO
   correctly completes a handshake. To do so, the client parses SERVER_PK from
   HANDSHAKE_INFO and reverses the final operations of section
   [NTOR-WITH-EXTRA-DATA] as shown here:

      rend_secret_hs_input = EXP(Y,x) | EXP(B,x) | AUTH_KEY | B | X | Y | PROTOID
      NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc)
      verify = MAC(rend_secret_hs_input, t_hsverify)
      auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server"
      AUTH_INPUT_MAC = MAC(auth_input, t_hsmac)

   Finally the client verifies that the received AUTH field of HANDSHAKE_INFO
   is equal to the computed AUTH_INPUT_MAC.
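
   As a rough sketch, the final steps can be written as below, starting from
   an already-computed rend_secret_hs_input (the curve operations are left
   out). The PROTOID and tweak strings, and the construction of MAC as
   SHA3-256 over an 8-byte big-endian key length, the key, and the message,
   are assumptions here; they are defined outside this section and should be
   checked against those definitions.

```python
import hashlib
import hmac

# Assumed constants; the authoritative values are defined elsewhere.
PROTOID = b"tor-hs-ntor-curve25519-sha3-256-1"
T_HSENC = PROTOID + b":hs_key_extract"
T_HSVERIFY = PROTOID + b":hs_verify"
T_HSMAC = PROTOID + b":hs_mac"


def mac(key: bytes, msg: bytes) -> bytes:
    """Assumed MAC: SHA3-256(len(key) as 8 bytes big-endian | key | msg)."""
    return hashlib.sha3_256(len(key).to_bytes(8, "big") + key + msg).digest()


def finish_handshake(rend_secret_hs_input, auth_key, B, X, Y):
    """Return (NTOR_KEY_SEED, AUTH_INPUT_MAC) from the shared secret input."""
    ntor_key_seed = mac(rend_secret_hs_input, T_HSENC)
    verify = mac(rend_secret_hs_input, T_HSVERIFY)
    auth_input = verify + auth_key + B + Y + X + PROTOID + b"Server"
    return ntor_key_seed, mac(auth_input, T_HSMAC)


def client_check(received_auth, rend_secret_hs_input, auth_key, B, X, Y):
    """Return NTOR_KEY_SEED if the received AUTH matches, else None."""
    seed, expected = finish_handshake(rend_secret_hs_input, auth_key, B, X, Y)
    # Constant-time comparison avoids leaking how many bytes matched.
    return seed if hmac.compare_digest(received_auth, expected) else None
```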

   Now both parties use the handshake output to derive shared keys for use on
   the circuit as specified in the section below:

4.2.1. Key expansion

   The hidden service and its client need to derive crypto keys from the
   NTOR_KEY_SEED part of the handshake output. To do so, they use the KDF
   construction as follows:

       K = KDF(NTOR_KEY_SEED | m_hsexpand,    HASH_LEN * 2 + S_KEY_LEN * 2)

   The first HASH_LEN bytes of K form the forward digest Df; the next HASH_LEN
   bytes form the backward digest Db; the next S_KEY_LEN bytes form Kf, and the
   final S_KEY_LEN bytes form Kb.  Excess bytes from K are discarded.
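
   The split of K can be sketched as follows. This assumes that KDF is
   SHAKE-256, that HASH_LEN and S_KEY_LEN are both 32 bytes, and that
   m_hsexpand has the value shown; all three are defined outside this
   section and are assumptions of this sketch.

```python
import hashlib

HASH_LEN = 32   # assumed
S_KEY_LEN = 32  # assumed
M_HSEXPAND = b"tor-hs-ntor-curve25519-sha3-256-1:hs_key_expand"  # assumed


def expand_keys(ntor_key_seed: bytes):
    """Derive (Df, Db, Kf, Kb) from NTOR_KEY_SEED."""
    total = HASH_LEN * 2 + S_KEY_LEN * 2
    k = hashlib.shake_256(ntor_key_seed + M_HSEXPAND).digest(total)
    df = k[:HASH_LEN]                                   # forward digest
    db = k[HASH_LEN:2 * HASH_LEN]                       # backward digest
    kf = k[2 * HASH_LEN:2 * HASH_LEN + S_KEY_LEN]       # forward key
    kb = k[2 * HASH_LEN + S_KEY_LEN:total]              # backward key
    return df, db, kf, kb
```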

   Subsequently, the rendezvous point passes relay cells, unchanged, from each
   of the two circuits to the other.  When Alice's OP sends RELAY cells along
   the circuit, it authenticates them with Df, and encrypts them with Kf, then
   with all of the keys for the ORs in Alice's side of the circuit; and when
   Alice's OP receives RELAY cells from the circuit, it decrypts them with the
   keys for the ORs in Alice's side of the circuit, then decrypts them with Kb,
   and checks integrity with Db.  Bob's OP does the same, with Kf and Kb
   interchanged.

   [TODO: Should we encrypt HANDSHAKE_INFO as we did INTRODUCE2
   contents? It's not necessary, but it could be wise. Similarly, we
   should make it extensible.]

4.3. Using legacy hosts as rendezvous points

   The behavior of ESTABLISH_RENDEZVOUS is unchanged from older versions
   of this protocol, except that relays should now ignore unexpected
   bytes at the end.

   Old versions of Tor required that RENDEZVOUS cell payloads be exactly
   168 bytes long. All shorter rendezvous payloads should be padded to
   this length with random bytes, to make them difficult to distinguish from
   older protocols at the rendezvous point.
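
   The padding rule above amounts to something like the following sketch
   (the function name is hypothetical, and payloads that are already 168
   bytes or longer are simply returned unchanged here):

```python
import os

LEGACY_RENDEZVOUS_LEN = 168


def pad_rendezvous_payload(payload: bytes) -> bytes:
    """Pad a short rendezvous payload to 168 bytes with random bytes."""
    if len(payload) >= LEGACY_RENDEZVOUS_LEN:
        return payload
    return payload + os.urandom(LEGACY_RENDEZVOUS_LEN - len(payload))
```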

   Relays older than 0.2.9.1 should not be used as rendezvous points by next
   generation onion services because they enforce too-strict length checks on
   rendezvous cells. Hence the "HSRend" protocol from proposal#264 should be
   used to select relays for rendezvous points.

5. Encrypting data between client and host

   A successfully completed handshake, as embedded in the
   INTRODUCE/RENDEZVOUS cells, gives the client and hidden service host
   a shared set of keys Kf, Kb, Df, Db, which they use for end-to-end
   traffic encryption and authentication as in the regular Tor relay
   encryption protocol, applying encryption with these keys before other
   encryption, and decrypting with these keys after other decryption
   (see section 4.2.1). The client encrypts with Kf and decrypts with Kb;
   the service host does the opposite.

6. Encoding onion addresses [ONIONADDRESS]

   The onion address of a hidden service includes its identity public key, a
   version field and a basic checksum. All this information is then base32
   encoded as shown below:

     onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
     CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]

     where:
       - PUBKEY is the 32-byte ed25519 master pubkey of the hidden service.
       - VERSION is a one-byte version field (default value '\x03')
       - ".onion checksum" is a constant string
       - CHECKSUM is truncated to two bytes before inserting it in onion_address

   Here are a few example addresses:

       pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd.onion
       sp3k262uwy4r2k3ycr5awluarykdpag6a7y33jxop4cs2lu5uz5sseqd.onion
       xa4r2iadxm55fbnqgwwi5mymqdcofiu3w6rpbtqn7b2dyn7mgwj64jyd.onion

   For more information about this encoding, please see our discussion thread
   at [ONIONADDRESS-REFS].
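
   A minimal sketch of this encoding, assuming that H is SHA3-256 and that
   addresses are written in lower case (the function names are
   hypothetical):

```python
import base64
import hashlib

VERSION = b"\x03"
SUFFIX = ".onion"


def _checksum(pubkey: bytes, version: bytes) -> bytes:
    # H is assumed to be SHA3-256; the result is truncated to two bytes.
    return hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]


def encode_onion_address(pubkey: bytes, version: bytes = VERSION) -> str:
    """Encode a 32-byte ed25519 public key as an onion address."""
    if len(pubkey) != 32 or len(version) != 1:
        raise ValueError("expected a 32-byte key and a 1-byte version")
    raw = pubkey + _checksum(pubkey, version) + version
    return base64.b32encode(raw).decode("ascii").lower() + SUFFIX


def decode_onion_address(address: str) -> bytes:
    """Return the public key from an address, or raise ValueError."""
    if not address.endswith(SUFFIX):
        raise ValueError("not an onion address")
    raw = base64.b32decode(address[:-len(SUFFIX)].upper())
    if len(raw) != 35:
        raise ValueError("unexpected address length")
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    if checksum != _checksum(pubkey, version):
        raise ValueError("bad checksum")
    return pubkey
```

   Since PUBKEY | CHECKSUM | VERSION is 35 bytes, the base32 form is always
   56 characters long with no padding.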

7. Open Questions:

   Scaling hidden services is hard. There are on-going discussions that
   you might be able to help with. See [SCALING-REFS].

   How can we improve the HSDir unpredictability design proposed in
   [SHAREDRANDOM]? See [SHAREDRANDOM-REFS] for discussion.

   How can hidden service addresses become memorable while retaining
   their self-authenticating and decentralized nature? See
   [HUMANE-HSADDRESSES-REFS] for some proposals; many more are possible.

   Hidden services are pretty slow, both because of the lengthy setup
   procedure and because the final circuit has 6 hops. How can we make
   the hidden service protocol faster? See [PERFORMANCE-REFS] for some
   suggestions.

References:

[KEYBLIND-REFS]:
        https://trac.torproject.org/projects/tor/ticket/8106
        https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html

[KEYBLIND-PROOF]:
        https://lists.torproject.org/pipermail/tor-dev/2013-December/005943.html

[SHAREDRANDOM-REFS]:
        https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-consensus.txt
        https://trac.torproject.org/projects/tor/ticket/8244

[SCALING-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html

[HUMANE-HSADDRESSES-REFS]:
        https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt
        http://archives.seul.org/or/dev/Dec-2011/msg00034.html

[PERFORMANCE-REFS]:
        "Improving Efficiency and Simplicity of Tor circuit
        establishment and hidden services" by Overlier, L., and
        P. Syverson

        [TODO: Need more here! Do we have any? :( ]

[ATTACK-REFS]:
        "Trawling for Tor Hidden Services: Detection, Measurement,
        Deanonymization" by Alex Biryukov, Ivan Pustogarov,
        Ralf-Philipp Weinmann

        "Locating Hidden Servers" by Lasse Øverlier and Paul
        Syverson

[ED25519-REFS]:
        "High-speed high-security signatures" by Daniel
        J. Bernstein, Niels Duif, Tanja Lange, Peter Schwabe, and
        Bo-Yin Yang. http://cr.yp.to/papers.html#ed25519

[ED25519-B-REF]:
        https://tools.ietf.org/html/draft-josefsson-eddsa-ed25519-03#section-5

[PRNG-REFS]:
        http://projectbullrun.org/dual-ec/ext-rand.html
        https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html

[SRV-TP-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2016-April/010759.html

[VANITY-REFS]:
        https://github.com/Yawning/horse25519

[ONIONADDRESS-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2017-January/011816.html

[TORSION-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2017-April/012164.html
        https://getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html

Appendix A. Signature scheme with key blinding [KEYBLIND]

A.1. Key derivation overview

  As described in [IMD:DIST] and [SUBCRED] above, we require a "key
  blinding" system that works (roughly) as follows:

        There is a master keypair (sk, pk).

        Given the keypair and a nonce n, there is a derivation function
        that gives a new blinded keypair (sk_n, pk_n).  This keypair can
        be used for signing.

        Given only the public key and the nonce, there is a function
        that gives pk_n.

        Without knowing pk, it is not possible to derive pk_n; without
        knowing sk, it is not possible to derive sk_n.

        It's possible to check that a signature was made with sk_n while
        knowing only pk_n.

        Someone who sees a large number of blinded public keys and
        signatures made using those public keys can't tell which
        signatures and which blinded keys were derived from the same
        master keypair.

        You can't forge signatures.

        [TODO: Insert a more rigorous definition and better references.]

A.2. Tor's key derivation scheme

  We propose the following scheme for key blinding, based on Ed25519.

  (This is an ECC group, so remember that scalar multiplication is the
  trapdoor function, and it's defined in terms of iterated point
  addition. See the Ed25519 paper [Reference ED25519-REFS] for a fairly
  clear writeup.)

  Let B be the ed25519 basepoint as found in section 5 of [ED25519-B-REF]:
      B = (15112221349535400772501151409588531511454012693041857206046113283949847762202,
           46316835694926478169428394003475163141307993866256225615783033603165251855960)

  Assume B has prime order l, so lB=0. Let a master keypair be written as
  (a,A), where a is the private key and A is the public key (A=aB).

  To derive the key for a nonce N and an optional secret s, compute the
  blinding factor like this:

           h = H(BLIND_STRING | A | s | B | N)
           BLIND_STRING = "Derive temporary signing key"
           N = "key-blind" | INT_8(period-number) | INT_8(period_length)

  then clamp the blinding factor 'h' according to the ed25519 spec:

           h[0] &= 248;
           h[31] &= 127;
           h[31] |= 64;

  and do the key derivation as follows:

      private key for the period:   a' = h a
      public key for the period:    A' = h A = (ha)B
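
  The computation of the clamped blinding factor can be sketched as below.
  Several details are assumptions of this sketch rather than statements of
  the scheme: H is taken to be SHA3-256, INT_8 is taken to be an 8-byte
  big-endian integer, and B is fed to the hash as the decimal string form
  of the basepoint shown above. The curve multiplications themselves are
  not shown.

```python
import hashlib

BLIND_STRING = b"Derive temporary signing key"
# Assumed byte encoding of the basepoint B: its decimal string form.
B_STR = (
    b"(15112221349535400772501151409588531511454012693041857206046113283949847762202, "
    b"46316835694926478169428394003475163141307993866256225615783033603165251855960)"
)


def blinding_factor(A: bytes, period_number: int, period_length: int,
                    s: bytes = b"") -> bytes:
    """Return the clamped 32-byte blinding factor h for public key A."""
    n = (b"key-blind"
         + period_number.to_bytes(8, "big")
         + period_length.to_bytes(8, "big"))
    h = bytearray(hashlib.sha3_256(BLIND_STRING + A + s + B_STR + n).digest())
    # Clamp as for an ed25519 secret scalar.
    h[0] &= 248
    h[31] &= 127
    h[31] |= 64
    return bytes(h)
```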

  Generating a signature of M: given a deterministic random-looking r
  (see EdDSA paper), take R=rB, S=r+hash(R,A',M)ah mod l. Send signature
  (R,S) and public key A'.

  Verifying the signature: Check whether SB = R+hash(R,A',M)A'.

  (If the signature is valid,
       SB = (r + hash(R,A',M)ah)B
          = rB + (hash(R,A',M)ah)B
          = R + hash(R,A',M)A' )

  See [KEYBLIND-REFS] for an extensive discussion on this scheme and
  possible alternatives. Also, see [KEYBLIND-PROOF] for a security
  proof of this scheme.

Appendix B. Selecting nodes [PICKNODES]

  Picking introduction points
  Picking rendezvous points
  Building paths
  Reusing circuits

  (TODO: This needs a writeup)

Appendix C. Recommendations for searching for vanity .onions [VANITY]

  EDITORIAL NOTE: The author thinks that it's silly to brute-force the
  keyspace for a key that, when base-32 encoded, spells out the name of
  your website. It also feels a bit dangerous to me. If you train your
  users to connect to

      llamanymityx4fi3l6x2gyzmtmgxjyqyorj9qsb5r543izcwymle.onion

  I worry that you're making it easier for somebody to trick them into
  connecting to

      llamanymityb4sqi0ta0tsw6uovyhwlezkcrmczeuzdvfauuemle.onion

  Nevertheless, people are probably going to try to do this, so here's a
  decent algorithm to use.

  To search for a public key with some criterion X:

        Generate a random (sk,pk) pair.

        While pk does not satisfy X:

            Add the number 8 to sk
            Add the point 8*B to pk

        Return sk, pk.

  We add 8 and 8*B, rather than 1 and B, so that sk is always a valid
  Curve25519 private key, with the lowest 3 bits equal to 0.

  This algorithm is safe [source: djb, personal communication] [TODO:
  Make sure I understood correctly!] so long as only the final (sk,pk)
  pair is used, and all previous values are discarded.

  To parallelize this algorithm, start with an independent (sk,pk) pair
  generated for each independent thread, and let each search proceed
  independently.

  See [VANITY-REFS] for a reference implementation of this vanity .onion
  search scheme.
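
  For illustration only, the search loop can be sketched with a small,
  slow, non-constant-time implementation of the curve arithmetic. Here sk
  is treated directly as the secret scalar (a real Ed25519 private key is
  a seed from which the scalar is derived), and the function names and the
  criterion callback are hypothetical. Do not use this code to generate
  real keys.

```python
import secrets

P = 2**255 - 19
D = (-121665 * pow(121666, -1, P)) % P
BASE = (15112221349535400772501151409588531511454012693041857206046113283949847762202,
        46316835694926478169428394003475163141307993866256225615783033603165251855960)
IDENTITY = (0, 1)


def point_add(p1, p2):
    """Add two points on the twisted Edwards curve (affine coordinates)."""
    x1, y1 = p1
    x2, y2 = p2
    t = D * x1 * x2 * y1 * y2 % P
    x3 = (x1 * y2 + x2 * y1) * pow((1 + t) % P, -1, P) % P
    y3 = (y1 * y2 + x1 * x2) * pow((1 - t) % P, -1, P) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k*point by double-and-add."""
    result = IDENTITY
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def encode_point(point):
    """Encode a point as 32 bytes: little-endian y, with x's low bit on top."""
    x, y = point
    return (y | ((x & 1) << 255)).to_bytes(32, "little")


def vanity_search(criterion, start_sk=None):
    """Step sk by 8 and pk by 8*B until criterion(encoded pk) holds."""
    if start_sk is None:
        # Random scalar with bit 254 set and the lowest 3 bits clear.
        start_sk = 2**254 + 8 * secrets.randbelow(2**251)
    sk = start_sk
    pk = scalar_mult(sk, BASE)
    step = scalar_mult(8, BASE)
    while not criterion(encode_point(pk)):
        sk += 8
        pk = point_add(pk, step)
    return sk, pk
```

  Each iteration costs one point addition instead of a full scalar
  multiplication, which is the point of stepping pk by 8*B.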

Appendix D. Numeric values reserved in this document

  [TODO: collect all the lists of commands and values mentioned above]

Appendix E. Reserved numbers

  We reserve these certificate type values for Ed25519 certificates:

      [08] short-term descriptor signing key, signed with blinded
           public key. (Section 2.4)
      [09] intro point authentication key, cross-certifying the descriptor
           signing key. (Section 2.5)
      [0B] ed25519 key derived from the curve25519 intro point encryption key,
           cross-certifying the descriptor signing key. (Section 2.5)

      Note: The value "0A" is skipped because it's reserved for the onion key
            cross-certifying ntor identity key from proposal 228.

Appendix F. Hidden service directory format [HIDSERVDIR-FORMAT]

  This appendix section specifies the contents of the HiddenServiceDir directory:

  - "hostname"                                       [FILE]

   This file contains the onion address of the onion service.

  - "private_key_ed25519"                            [FILE]

   This file contains the private master ed25519 key of the onion service.
   [TODO: Offline keys]

  - "client_authorized_pubkeys"                      [FILE]

   If client authorization is _enabled_, this is a newline-separated file of
   "<client name> <pubkeys>" entries for authorized clients. You can think of it
   as the ~/.ssh/authorized_keys of onion services. See [CLIENT-AUTH-MGMT] for
   more details.

  - "./client_authorized_privkeys/"                  [DIRECTORY]
    "./client_authorized_privkeys/alice.privkey"     [FILE]
    "./client_authorized_privkeys/bob.privkey"       [FILE]
    "./client_authorized_privkeys/charlie.privkey"   [FILE]

   If client authorization is _enabled_ _AND_ if the hidden service is
   responsible for generating and distributing private keys for its clients,
   then this directory contains files with clients' private keys. See
   [CLIENT-AUTH-MGMT] for more details.

Appendix G. Managing authorized client data [CLIENT-AUTH-MGMT]

  Hidden services and clients can configure their authorized client data either
  using the torrc, or using the control port. This section presents a suggested
  scheme for configuring client authorization. Please see appendix
  [HIDSERVDIR-FORMAT] for more information about relevant hidden service files.

  G.1. Configuring client authorization using torrc

  G.1.1. Hidden Service side

     A hidden service that wants to perform client authorization adds a new
     option HiddenServiceAuthorizeClient to its torrc file:

        HiddenServiceAuthorizeClient auth-type client-name,client-name,...

     The only recognized auth-type value is "basic" which describes the scheme in
     section [CLIENT-AUTH]. The rest of the line is a comma-separated list of
     human-readable authorized client names.

     Let's consider that one of the listed client names is "alice". In this
     case, Tor checks the "client_authorized_pubkeys" file for any entries
     with client_name being "alice". If an "alice" entry is found, we use the
     relevant pubkeys to authenticate Alice.

     If no "alice" entry is found in the "client_authorized_pubkeys" file, Tor
     is tasked with generating public/private keys for Alice. To do so, Tor
     generates x25519 and ed25519 keypairs for Alice, then makes a
     "client_authorized_privkeys/alice.privkey" file and writes the private
     keys inside; it also adds an entry for alice to the
     "client_authorized_pubkeys" file.

     In this last case, the hidden service operator has the responsibility to
     pass the .privkey file to Alice in a secure out-of-band way. After the
     file is passed to Alice, it can be shredded from the filesystem, as only
     the public keys are required for the hidden service to function.

  G.1.2. Client side

     A client who wants to register client authorization data for a hidden service
     needs to add the following line to their torrc:

           HidServAuth onion-address x25519-private-key ed25519-private-key

     The keys above are either generated by Alice using a key generation utility,
     or they are extracted from a .privkey file provided by the hidden service.

     In the former case, the client is also tasked with transferring the public
     keys to the hidden service in a secure out-of-band way.

  G.2. Configuring client authorization using the control port

  G.2.1. Service side

     A hidden service also has the option to configure authorized clients
     using the control port. The idea is that hidden service operators can use
     controller utilities that manage their access control instead of using
     the filesystem to register client keys.

     Specifically, we require a new control port command ADD_ONION_CLIENT_AUTH
     which is able to register x25519/ed25519 public keys tied to a specific
     authorized client.
      [XXX figure out control port command format]

     Hidden services that use the control port interface for client auth need
     to perform their own key management.

  G.2.2. Client side

     There should also be a control port interface for clients to register
     authorization data for hidden services without having to use the
     torrc. It should allow both the generation of client authorization
     private keys and the import of client authorization data provided by a
     hidden service.

     This way, Tor Browser can present "Generate client auth keys" and "Import
     client auth keys" dialogs to users when they try to visit a hidden service
     that is protected by client authorization.

     Specifically, we require two new control port commands:
                   IMPORT_ONION_CLIENT_AUTH_DATA
                   GENERATE_ONION_CLIENT_AUTH_DATA
     which import and generate client authorization data respectively.

     [XXX how does key management work here?]
     [XXX what happens when people use both the control port interface and the
          filesystem interface?]

Filename: 225-strawman-shared-rand.txt
Title: Strawman proposal: commit-and-reveal shared rng
Author: Nick Mathewson
Created: 2013-11-29
Status: Superseded
Superseded-by: 250

1. Introduction

  This is a strawman proposal: I don't think we should actually build
  it.  It's just a simple writeup of the more trivial commit-then-reveal
  protocol for generating a shared random value.  It's insecure to the
  extent that an adversary who controls b of the authorities gets to
  choose among 2^b outcomes for the result of the protocol.

  See proposal 224, section HASHRING for some motivation of why we want
  one of these in Tor.

  Let's do better!

  [TODO: Are we really stuck with Tor's nasty metaformat here?]

2. The protocol

  Here's a protocol for producing a shared random value. It should run
  less frequently than the directory consensus algorithm. It runs in
  these phases.

    1. COMMITMENT
    2. REVEAL
    3. COMPUTE SHARED RANDOM

  It should be implemented by software other than Tor, which should be
  okay for authorities.

  Note: This is not a great protocol. It has a number of failure
  modes. Better protocols seem hard to implement, though, and it ought
  to be possible to drop in a replacement here, if we do it right.

  At the start of phase 1, each participating authority publishes a
  statement of the form:

      shared-random 1
      shared-random-type commit
      signing-key-certification (certification here; see proposal 220)
      commitment-key-certification (certification here; see proposal 220)
      published YYYY-MM-DD HH:MM:SS
      period-start YYYY-MM-DD HH:MM:SS
      attempt INT
      commitment sha512 C
      signature (made with commitment key; see proposal 220)

  The signing key is the one used for consensus votes, signed by the
  directory authority identity key. The commitment key is used for this
  protocol only. The signature is made with the commitment key. The
  period-start value is the start of the period for which the shared
  random value should be in use. The attempt value starts at 1, and
  increments by 1 for each time that the protocol fails.

  The other fields should be self-explanatory.

  The commitment value C is a base64-encoded SHA-512 hash of a 256-bit
  random value R.

  During the rest of phase 1, every authority collects the commitments
  from other authorities, and publishes them to other authorities, as
  they do today with directory votes.

  At the start of phase 2, each participating authority publishes:

      shared-random 1
      shared-random-type reveal
      signing-key-certification (certification here; see proposal 220)
      commitment-key-certification (certification here; see proposal 220)
      received-commitment ID sig
      received-commitment ID sig
      published YYYY-MM-DD HH:MM:SS
      period-start YYYY-MM-DD HH:MM:SS
      attempt INT
      commitment sha512 C
      reveal R
      signature (made with commitment key; see proposal 220)

  The R value is the one used to generate C. The received-commitment
  lines are the signatures on the documents from other authorities in
  phase 1. All other fields are as in the commitments.

  During the rest of phase 2, every authority collects the
  reveals from other authorities, as above with commitments.

  At the start of phase 3, each participating authority either has a
  reveal from every authority that it received a commitment from, or it
  does not. Each participating authority then says

      shared-random 1
      shared-random-type finish
      signing-key-certification (certification here; see proposal 220)
      commitment-key-certification (certification here; see proposal 220)
      received-commitment ID sig R
      received-commitment ID sig R ...
      published YYYY-MM-DD HH:MM:SS
      period-start YYYY-MM-DD HH:MM:SS
      attempt INT
      consensus C
      signature (made with commitment key; see proposal 220)

  Where C = SHA256(ID | R | ID | R | ID | R | ...) where the ID
  values appear in ascending order and the R values appear after
  their corresponding ID values.
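
  As a rough sketch, the commitment, the reveal check, and the final
  computation look like this. The function names are hypothetical, and
  representing each authority ID as a byte string that is concatenated
  directly with its R value is an assumption; the strawman does not pin
  down the encoding.

```python
import base64
import hashlib
import secrets


def make_commitment():
    """Return (R, C): a 256-bit random R and C = base64(SHA-512(R))."""
    r = secrets.token_bytes(32)
    c = base64.b64encode(hashlib.sha512(r).digest()).decode("ascii")
    return r, c


def reveal_matches(c: str, r: bytes) -> bool:
    """Check that a revealed R corresponds to the earlier commitment C."""
    return base64.b64encode(hashlib.sha512(r).digest()).decode("ascii") == c


def shared_random(reveals: dict) -> bytes:
    """Compute SHA256(ID | R | ID | R | ...) with IDs in ascending order.

    `reveals` maps each authority ID (bytes) to its revealed R (bytes).
    """
    h = hashlib.sha256()
    for auth_id in sorted(reveals):
        h.update(auth_id + reveals[auth_id])
    return h.digest()
```

  An authority that dislikes the outcome can withhold its reveal, which is
  the source of the 2^b choice mentioned in the introduction.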

  See [SHAREDRANDOM-REFS] for more discussion here.

  (TODO: should this be its own spec?  If so, does it have to use our
  regular metaformat or can it use something less sucky?)

Filename: 226-bridgedb-database-improvements.txt
Title: "Scalability and Stability Improvements to BridgeDB: Switching to a
        Distributed Database System and RDBMS"
Author: Isis Agora Lovecruft
Created: 12 Oct 2013
Related Proposals: XXX-social-bridge-distribution.txt
Status: Reserve

*  I.     Overview

   BridgeDB is Tor's Bridge Distribution system, which currently has two major
   Bridge Distribution mechanisms: the HTTPS Distributor and an Email
   Distributor. [0]

   BridgeDB is written largely in Twisted Python, and uses Python2's builtin
   sqlite3 as its database backend.  Unfortunately, this backend system is
   already showing strain through increased times for queries, and sqlite's
   memory usage is not up-to-par with modern, more efficient, NoSQL databases.

   In order to better facilitate the implementation of newer, more complex
   Bridge Distribution mechanisms, several improvements should be made to the
   underlying database system of BridgeDB.  Additionally, I propose that a
   clear distinction in terms, as well as a modularisation of the codebase, be
   drawn between the mechanisms for Bridge Distribution versus the backend
   Bridge Database (BridgeDB) storage system.

   This proposal covers the design and implementation of a scalable NoSQL ―
   Document-Based and Key-Value Relational ― database backend for storing data
   on Tor Bridge relays, in an efficient manner that is amenable to
   interfacing with the Twisted Python asynchronous networking code of current
   and future Bridge Distribution mechanisms.

*  II.   Terminology

   BridgeDistributor := A program which decides when and how to hand out
                        information on a Tor Bridge relay, and to whom.

   BridgeDB := The backend system of databases and object-relational mapping
               servers, which interfaces with the BridgeDistributor in order
               to hand out bridges to clients, and to obtain and process new,
               incoming ``@type bridge-server-descriptors``,
               ``@type bridge-networkstatus`` documents, and
               ``@type bridge-extrainfo`` descriptors. [3]

   BridgeFinder := A client-side program for an Onion Proxy (OP) which handles
                   interfacing with a BridgeDistributor in order to obtain new
                   Bridge relays for a client.  A BridgeFinder also interfaces
                   with a local Tor Controller (such as TorButton or ARM) to
                   handle automatic, transparent Bridge configuration (no more
                   copy+pasting into a torrc) without being given any
                   additional privileges over the Tor process, [1] and relies
                   on the Tor Controller to interface with the user for
                   control input and displaying up-to-date information
                   regarding available Bridges, Pluggable Transport methods,
                   and potentially Invite Tickets and Credits (a cryptographic
                   currency without fiat value which is generated
                   automatically by clients whose Bridges remain largely
                   uncensored, and is used to purchase new Bridges), should a
                   Social Bridge Distributor be implemented. [2]

*  III.   Databases
** III.A. Scalability Requirements

   Databases SHOULD be implemented in a manner which is amenable to using a
   distributed storage system; this is necessary because many potential
   datatypes required by future BridgeDistributors MUST be stored permanently.
   For example, in the designs for the Social Bridge Distributor, the list of
   hash digests of spent Credits, and the list of hash digests of redeemed
   Invite Tickets MUST be stored forever to prevent either from being replayed
   ― or double-spent ― by a malicious user who wishes to block bridges faster.
   Designing the BridgeDB backend system such that additional nodes may be
   added in the future will allow the system to freely scale in relation to
   the storage requirements of future BridgeDistributors.

   Additionally, requiring that the implementation allow for distributed
   database backends promotes modularisation of the components of BridgeDB,
   such that BridgeDistributors can be separated from the backend storage
   system, BridgeDB, as all queries will be issued through a simplified,
   common API, regardless of the number of nodes in the system, or the
   design of future BridgeDistributors.

***   1.  Distributed Database System

   A distributed database system SHOULD be used for BridgeDB, in order to
   scale resources as the number of Tor bridge users grows. This database
   system is hereafter referred to as the DDBS.

   The DDBS MUST be capable of working within Twisted's asynchronous
   framework. If possible, an Object-Relational Mapper (ORM) SHOULD be used to
   abstract the database backend's structure and query syntax from the Twisted
   Python classes which interact with it, so that the type of database may be
   swapped out for another with less code refactoring.

   The DDBS SHALL be used for persistent storage of complex data structures
   such as the bridges, which MAY include additional information from both the
   `@type bridge-server-descriptor`s and the `@type bridge-extra-info`
   descriptors. [3]

**** 1.a. Choice of DDBS

   CouchDB is chosen for its simple HTTP API, ease of use, speed, and official
   support for Twisted Python applications. [4] Additionally, its
   document-based data model is very similar to the current architecture of
   Tor's Directory Server/Mirror system, in that an HTTP API is used to
   retrieve data stored within virtual directories.  Internally, it uses JSON
   to store data and JavaScript as its query language, both of which are
   likely friendlier to various other components of the Tor Metrics
   infrastructure which sanitise and analyse portions of the Bridge
   descriptors.  At the very least, friendlier than hardcoding raw SQL queries
   as Python strings.

**** 1.b. Data Structures which should be stored in a DDBS:

   - RedactedDB - The Database of Blocked Bridges

     The RedactedDB will hold entries of bridges which have been discovered to
     be unreachable from BridgeDB's network vantage point, or have been
     reported unreachable by clients.

   - BridgeDB - The Database of Bridges

     BridgeDB holds information on available Bridges, obtained via bridge
     descriptors and networkstatus documents from the BridgeAuthority. A
     Bridge may have multiple `ORPort`s and multiple
     `ServerTransportListenAddress`es, and each of these addresses MAY have
     the following information on a blocking event attached to it:
         - Geolocational country code of the reported blocking event
         - Timestamp for when the blocking event was first reported
         - The method used for discovery of the block
         - The believed mechanism which is causing the block
     Because attaching this additional data to every address would quickly
     become unwieldy, the RedactedDB and BridgeDB SHOULD be kept separate.

   - User Credentials

     For the Social BridgeDistributor, these are rather complex,
     increasingly-growing, concatenations (or tuples) of several datatypes,
     including Non-Interactive Proofs-of-Knowledge (NIPK) of Commitments to
     k-TAA Blind Signatures, and NIPK of Commitments to a User's current
     number of Credits and timestamps of requests for Invite Tickets.

***   2.  Key-Value Relational Database Mapping Server

    For simpler data structures which must be persistently stored, such as the
    list of hashes of previously seen Invite Tickets, or the list of
    previously spent Tokens, a Relational Database Mapping Server (RDBMS)
    SHALL be used for optimisation of queries.

    Redis and Memcached are two examples of RDBMS which are well tested and
    are known to work well with Twisted. The major difference between the two
    is that Memcached stores data only within volatile memory, while Redis
    additionally supports commands for transferring objects into persistent,
    on-disk storage.

    There are several support modules for interfacing with both Memcached and
    Redis from Twisted Python, see Twisted's MemCacheProtocol class [5] [6] or
    txyam [7] for Memcached, and txredis [8] or txredisapi [9] for
    Redis. Additionally, numerous big name projects both use Redis as part of
    their backend systems, and also provide helpful documentation on their own
    experience of the process of switching over to the new systems. [17] For
    non-Twisted Python Redis APIs, there is redis-py, which provides a
    connection pool that could likely be interfaced with from Twisted Python
    without too much difficulty. [10] [11]

**** 2.a. Data Structures which should be stored in a RDBMS

    Simple, mostly-flat datatypes, and data which must be frequently indexed
    should be stored in a RDBMS, such as large lists of hashes, or arbitrary
    strings with assigned point-values (i.e. the "Uniform Mapping" for the
    current HTTPS BridgeDistributor).

    For the Social BridgeDistributor, hash digests of the following datatypes
    SHOULD be stored in the RDBMS, in order to prevent double-spending and
    replay attacks:

      - Invite Tickets

        These are anonymous, unlinkable, unforgeable, and verifiable tokens
        which are occasionally handed out to well-behaved Users by the Social
        BridgeDistributor to permit new Users to be invited into the system.
        When they are redeemed, the Social BridgeDistributor MUST store a hash
        digest of their contents to prevent replayed Invite Tickets.

      - Spent Credits

        These are Credits which have already been redeemed for new Bridges.
        The Social BridgeDistributor MUST also store a hash digest of Spent
        Credits to prevent double-spending.
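
    The replay and double-spend checks described above can be sketched as
    follows. This is only an illustration under stated assumptions: a Python
    set stands in for the key-value backend, SHA-256 is assumed as the hash
    (the proposal only says "hash digest"), and the class and method names
    are hypothetical.

```python
import hashlib


class SpentTokenStore:
    """In-memory stand-in for the store of redeemed-token digests."""

    def __init__(self):
        # A real deployment would keep these digests in the key-value
        # backend; a set is used here so the sketch is self-contained.
        self._seen = set()

    @staticmethod
    def digest(token: bytes) -> str:
        # Assumption: SHA-256. Only the digest is stored, never the token.
        return hashlib.sha256(token).hexdigest()

    def redeem(self, token: bytes) -> bool:
        """Record a token; return False if it was already redeemed."""
        d = self.digest(token)
        if d in self._seen:
            return False
        self._seen.add(d)
        return True
```

    A distributor would call redeem() once per presented Invite Ticket or
    Credit and refuse the request whenever it returns False.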

***   3.  Bloom Filters and Other Database Optimisations

    In order to further decrease the need for lookups in the backend
    databases, Bloom Filters can be used to eliminate extraneous
    queries. However, this optimization would only be beneficial for string
    lookups, i.e. querying for a User's Credential, and SHOULD NOT be used for
    queries within any of the hash lists, i.e. the list of hashes of
    previously seen Invite Tickets. [12]

****  3.a. Bloom Filters within Redis

    It might be possible to use Redis' GETBIT and SETBIT commands to store a
    Bloom Filter within a Redis cache system; [13] doing so would offload the
    severe memory requirements of loading the Bloom Filter into memory in
    Python when inserting new entries, reducing the time complexity from some
    polynomial time complexity that is proportional to the integral of the
    number of bridge users over the rate of change of bridge users over time,
    to a time complexity of order O(1).
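
    To make the idea concrete, here is a minimal sketch of a Bloom Filter
    built only from set-bit and get-bit operations. The BitString class is a
    local, in-memory model of a bit string with SETBIT/GETBIT-style access,
    not a Redis client; the filter size, the number of hash functions, and
    the use of salted SHA-256 are all assumptions made for illustration.

```python
import hashlib


class BitString:
    """In-memory model of a bit string with SETBIT/GETBIT-style access."""

    def __init__(self, nbits: int):
        self.nbits = nbits
        self._bytes = bytearray((nbits + 7) // 8)

    def setbit(self, offset: int, value: int) -> int:
        """Set the bit at offset and return its previous value."""
        byte, bit = divmod(offset, 8)
        mask = 0x80 >> bit  # offset 0 is the most significant bit
        old = 1 if self._bytes[byte] & mask else 0
        if value:
            self._bytes[byte] |= mask
        else:
            self._bytes[byte] &= ~mask & 0xFF
        return old

    def getbit(self, offset: int) -> int:
        byte, bit = divmod(offset, 8)
        return 1 if self._bytes[byte] & (0x80 >> bit) else 0


class BloomFilter:
    def __init__(self, nbits: int = 1 << 16, nhashes: int = 4):
        self.bits = BitString(nbits)
        self.nhashes = nhashes

    def _offsets(self, item: bytes):
        # Derive one bit offset per hash function from salted SHA-256.
        for i in range(self.nhashes):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.bits.nbits

    def add(self, item: bytes) -> None:
        for off in self._offsets(item):
            self.bits.setbit(off, 1)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits.getbit(off) for off in self._offsets(item))
```

    Each insertion or lookup touches a fixed number of bits, so a backend
    lookup is only needed when might_contain() returns True.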

****  3.b. Expiration of Stale Data

    Some types of data SHOULD be safe to expire, such as User Credentials
    which have not been updated within a certain timeframe. This idea should
    be further explored to assess the safety and potential drawbacks to
    removing old data.

    If there is data which SHOULD expire, the PEXPIREAT command provided by
    Redis for the key datatype would allow the RDBMS itself to handle cleanup
    of stale data automatically. [14]
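
    PEXPIREAT takes an absolute expiry time in milliseconds since the Unix
    epoch. A small sketch of computing that value from a record's last
    update time follows; the function name and the idea of a fixed
    time-to-live are assumptions, since the proposal does not fix an expiry
    policy.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def expiry_timestamp_ms(last_updated: datetime, ttl: timedelta) -> int:
    """Return the absolute expiry time in milliseconds since the epoch."""
    expires = last_updated + ttl
    # Integer division of timedeltas avoids floating-point rounding.
    return (expires - EPOCH) // timedelta(milliseconds=1)
```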

***   4.  Other potential uses of the improved Bridge database system

    Redis provides mechanisms for evaluations to be made on data by calling
    the sha1 for a serverside Lua script. [15] While not required in the
    slightest, it is a rather neat feature, as it would allow Tor's Metrics
    infrastructure to offload some of the computational overhead of gathering
    data on Bridge usage to BridgeDB (as well as diminish the security
    implications of storing Bridge descriptors).

    Also, if Twisted's IProducer and IConsumer interfaces do not provide
    needed interface functionality, or it is desired that other components of
    the Tor software ecosystem be capable of scheduling jobs for BridgeDB,
    there are well-tested mechanisms for using Redis as a message
    queue/scheduling system. [16]

*  References

[0]: https://bridges.torproject.org
     mailto:bridges@bridges.torproject.org
[1]: See proposal 199-bridgefinder-integration.txt at
     https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/199-bridgefinder-integration.txt
[2]: See XXX-social-bridge-distribution.txt at
     https://gitweb.torproject.org/user/isis/bridgedb.git/blob/refs/heads/feature/7520-social-dist-design:/doc/proposals/XXX-bridgedb-social-distribution.txt
[3]: https://metrics.torproject.org/formats.html#descriptortypes
[4]: https://github.com/couchbase/couchbase-python-client#twisted-api
[5]: https://twistedmatrix.com/documents/current/api/twisted.protocols.memcache.MemCacheProtocol.html
[6]: http://stackoverflow.com/a/5162203
[7]: http://findingscience.com/twisted/python/memcache/2012/06/09/txyam:-yet-another-memcached-twisted-client.html
[8]: https://pypi.python.org/pypi/txredis
[9]: https://github.com/fiorix/txredisapi
[10]: https://github.com/andymccurdy/redis-py/
[11]: http://degizmo.com/2010/03/22/getting-started-redis-and-python/
[12]: http://www.dr-josiah.com/2012/03/why-we-didnt-use-bloom-filter.html
[13]: http://redis.io/topics/data-types §"Strings"
[14]: http://redis.io/commands/pexpireat
[15]: http://redis.io/commands/evalsha
[16]: http://www.restmq.com/
[17]: https://www.mediawiki.org/wiki/Redis
Filename: 227-vote-on-package-fingerprints.txt
Title: Include package fingerprints in consensus documents
Author: Nick Mathewson, Mike Perry
Created: 2014-02-14
Status: Closed
Implemented-In: 0.2.6.3-alpha

0. Abstract

   We propose extending the Tor consensus document to include
   digests of the latest versions of one or more package files, to
   allow software using Tor to determine its up-to-dateness, and
   help users verify that they are getting the correct software.

1. Introduction

   To improve the integrity and security of updates, it would be
   useful to provide a way to authenticate the latest versions of
   core Tor software through the consensus. By listing a location
   with this information for each version of each package, we can
   augment the update process of Tor software to authenticate the
   packages it downloads through the Tor consensus.

2. Proposal

   We introduce a new line for inclusion in votes and consensuses.
   Its format is:

     "package" SP PACKAGENAME SP VERSION SP URL SP DIGESTS NL

      PACKAGENAME = NONSPACE
      VERSION = NONSPACE
      URL = NONSPACE
      DIGESTS = DIGEST | DIGESTS SP DIGEST
      DIGEST = DIGESTTYPE "=" DIGESTVAL

      NONSPACE = one or more non-space printing characters

      DIGESTVAL = DIGESTTYPE = one or more non-=, non-" " characters.

      SP = " "
      NL = a newline

   Votes and consensuses may include any number of "package" lines,
   but no vote or consensus may include more than one "package" line
   with the same PACKAGENAME and VERSION values.  All "package"
   lines must be sorted by "PACKAGENAME VERSION", in
   lexical (strcmp) order.

   (If a vote contains multiple entries with the same PACKAGENAME and
   VERSION, then only the last one is considered.)
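
   As an illustration of the grammar above, a parser for "package" lines
   might look like the following sketch. The type and function names are
   hypothetical, and the check that fields are printing characters is
   omitted for brevity.

```python
from typing import Dict, List, NamedTuple, Tuple


class PackageLine(NamedTuple):
    name: str
    version: str
    url: str
    digests: Dict[str, str]


def parse_package_line(line: str) -> PackageLine:
    """Parse: "package" SP PACKAGENAME SP VERSION SP URL SP DIGESTS NL"""
    fields = line.rstrip("\n").split(" ")
    # Need the keyword, three NONSPACE fields, and at least one DIGEST.
    if len(fields) < 5 or fields[0] != "package" or "" in fields:
        raise ValueError("malformed package line")
    name, version, url = fields[1:4]
    digests = {}
    for item in fields[4:]:
        dtype, sep, dval = item.partition("=")
        # DIGESTTYPE and DIGESTVAL are non-empty and contain no "=".
        if not sep or not dtype or not dval or "=" in dval:
            raise ValueError("malformed digest")
        digests[dtype] = dval
    return PackageLine(name, version, url, digests)


def packages_in_vote(lines: List[str]) -> Dict[Tuple[str, str], PackageLine]:
    """Keep only the last entry for each (PACKAGENAME, VERSION) pair."""
    result = {}
    for line in lines:
        pkg = parse_package_line(line)
        result[(pkg.name, pkg.version)] = pkg
    return result
```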

   If the consensus-method is at least 19, then when computing
   the consensus, package lines for a given PACKAGENAME/VERSION pair
   should be included if at least three authorities list such a
   package in their votes.  (Call these lines the "input" lines for
   PACKAGENAME.)  That consensus should contain every "package" line
   that is listed verbatim by more than half of the authorities
   listing a line for the PACKAGENAME/VERSION pair, and no
   others.
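
   One way to read the voting rule above is sketched below. It assumes each
   vote has already been reduced to a mapping from a (PACKAGENAME, VERSION)
   pair to the full text of that vote's "package" line; the names are
   hypothetical and the consensus-method check is left out.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# One vote: (PACKAGENAME, VERSION) -> full "package" line text.
Vote = Dict[Tuple[str, str], str]


def consensus_package_lines(votes: List[Vote]) -> List[str]:
    # Count, per pair, how many authorities listed each exact line.
    lines_by_pair = defaultdict(Counter)
    for vote in votes:
        for pair, line in vote.items():
            lines_by_pair[pair][line] += 1

    chosen = []
    for (name, version), counts in lines_by_pair.items():
        n_listing = sum(counts.values())
        if n_listing < 3:
            continue  # fewer than three authorities list this pair
        for line, n in counts.items():
            # Listed verbatim by more than half of those authorities.
            if 2 * n > n_listing:
                chosen.append((name + " " + version, line))
    # Sort by "PACKAGENAME VERSION" in lexical order.
    return [line for _, line in sorted(chosen)]
```

   Since a line must be listed by more than half of the authorities that
   list the pair, at most one line per pair can be selected.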

   These lines appear immediately following the client-versions and
   server-versions lines.

3. Recommended usage

   Programs that want to use this facility should pick their
   PACKAGENAME values, and arrange to have their versions listed in
   the consensus by at least three friendly authority operators.

   Programs may want to have multiple PACKAGENAME values in order to
   keep separate lists. These lists could correspond to how the
   software is used (as tor has client-versions and
   server-versions); or to a release series (as in tbb-alpha,
   tbb-beta, and tbb-stable); or to how bad it is to use versions
   not listed (as in foo-noknownexploits, foo-recommended).

   Programs MUST NOT use "package" lines from consensuses that have
   not been verified and accepted as valid according to the rules in
   dir-spec.txt, and SHOULD NOT fetch their own consensuses if there
   is a tor process also running that can fetch the consensus
   itself.

   For safety, programs MAY want to disable functionality until
   confirming that their versions are acceptable.

   To avoid synchronization problems, programs that use the DIGEST
   field to store a digest of the contents of the URL SHOULD NOT use
   any URLs whose contents are expected to change while any valid
   consensus lists them.

3.1. Intended usage by the Tor Browser Bundle

   Tor Browser Bundle packages will be listed with package names
   'tbb-stable', 'tbb-beta', and 'tbb-alpha'. We will list a line for
   the latest version of each release series.

   When the updater downloads a new update, it always downloads the
   latest version of the Tor Browser Bundle. Because of this, and
   because we will only use these lines to authenticate updates, we
   should not need to list more than one version per series in the
   consensus.

   After completing a package download and verifying the download
   signatures (which are handled independently from the Tor
   Consensus), it will consult the appropriate current consensus
   document through the control port.

   If the current consensus timestamp is not yet more recent than
   the proposed update timestamp, the updater will delay installing
   the package until a consensus timestamp that is more recent than
   the update timestamp has been obtained by the Tor client.

   If the consensus document has a package line for the current
   release series with a matching version, it will then download the
   file at the specified URL, and then compute its hash to make sure
   it matches the value in the consensus.

   If the hash matches, the Tor Browser will download the file and
   parse its contents, which will be a JSON file which lists
   information needed to verify the hashes of the downloaded update
   file.

   If the hash does not match, the Tor Browser Bundle should display
   an error to the user and not install the package.

   If there are no package lines in the consensus for the expected
   version, the updater will delay installing the update (but the
   bundle should still inform the user they are out of date and may
   update manually).

   If there are no package lines in the consensus for the current
   release series at all, the updater should install the package
   using only normal signature verification.
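
   The decision steps above can be summarised in a short sketch. It assumes
   the update has already been downloaded and its signatures verified, that
   timestamps are plain integers, and that the digest in the consensus is a
   hex-encoded SHA-256; none of these details are fixed by this proposal.
   The fetch callback and all names are hypothetical.

```python
import hashlib
from enum import Enum
from typing import Callable, List, NamedTuple


class Action(Enum):
    DELAY = "delay installing"
    REJECT = "show error, do not install"
    INSTALL_SIGNATURE_ONLY = "install using normal signature verification"
    VERIFY_WITH_MANIFEST = "parse the JSON file and verify update hashes"


class Package(NamedTuple):
    name: str
    version: str
    url: str
    sha256: str


def decide(consensus_time: int, update_time: int, packages: List[Package],
           series: str, version: str,
           fetch: Callable[[str], bytes]) -> Action:
    # Wait for a consensus newer than the proposed update.
    if consensus_time <= update_time:
        return Action.DELAY
    in_series = [p for p in packages if p.name == series]
    if not in_series:
        # No package lines for this release series at all.
        return Action.INSTALL_SIGNATURE_ONLY
    matching = [p for p in in_series if p.version == version]
    if not matching:
        # Series is listed, but not the expected version.
        return Action.DELAY
    body = fetch(matching[0].url)
    if hashlib.sha256(body).hexdigest() != matching[0].sha256:
        return Action.REJECT
    return Action.VERIFY_WITH_MANIFEST
```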

4. Limitations and open questions

   This proposal won't tell users how to upgrade, or even exactly
   what version to upgrade to.

   If software is so broken that it won't start at all, or shouldn't
   be started at all, this proposal can't help with that.

   This proposal is not a substitute for a proper software update
   tool.


Filename: 228-cross-certification-onionkeys.txt
Title: Cross-certifying identity keys with onion keys
Author: Nick Mathewson
Created: 25 February 2014
Status: Closed


0. Abstract

   In current Tor router descriptor designs, routers prove ownership
   of an identity key (by signing the router descriptors), but not
   of their onion keys.  This document describes a method for them
   to do so.

1. Introduction.

   Signing router descriptors with identity keys prevents attackers
   from impersonating a server and advertising their own onion keys
   and IP addresses.  That's good.

   But there's nothing in Tor right now that effectively stops you
   (an attacker) from listing somebody else's public onion key in
   your descriptor.  If you do, you can't actually recover any keys
   negotiated using that key, and you can't MITM circuits made with
   that key (since you don't have the private key).

   (You _could_ do something weird in the TAP protocol where you
   receive an onionskin that you can't process, relay it to the
   party who can process it, and receive a valid reply that you
   could send back to the user.  But this makes you a less effective
   man-in-the-middle than you would be if you had just generated
   your own onion key.  The ntor protocol shuts down this
   possibility by including the router identity in the material to
   be hashed, so that you can't complete an ntor handshake unless
   the client agrees with you about what identity goes with your
   ntor onion key.)

   Nonetheless, it's probably undesirable that this is possible at
   all.  Just because it isn't obvious today how to exploit this
   doesn't mean it will never be possible.

2. Cross-certifying identities with onion keys

2.1. What to certify

   Once proposal 220 is implemented, we'll sign our Ed25519 identity
   key as described in proposal 220.  Since the Ed25519 identity key
   certifies the RSA key, there's no strict need to certify both
   separately.

   On the other hand, this proposal may be implemented before proposal
   220.  If it is, we'll need a way for it to certify the RSA1024 key
   too.

2.2. TAP onion keys

   We add to each router descriptor a new element,
   "onion-key-crosscert", containing an RSA signature of:

       A SHA1 hash of the identity key  [20 bytes]
       The Ed25519 identity key, if any [32 bytes]

   If there is no ed25519 identity key, or if in some future version
   there is no RSA identity key, the corresponding field must be
   zero-filled.

   Parties verifying this signature MUST allow additional data beyond
   the 52 bytes listed above.
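
   As a sketch, the 52 bytes to be signed could be assembled like this.
   The function name is hypothetical, and the exact byte encoding of the
   RSA identity key that is fed to SHA1 is an assumption, since it is not
   spelled out here.

```python
import hashlib
from typing import Optional


def tap_crosscert_payload(rsa_identity_key: Optional[bytes],
                          ed25519_identity_key: Optional[bytes]) -> bytes:
    """Build the data covered by the onion-key-crosscert signature."""
    if rsa_identity_key is None:
        rsa_part = bytes(20)  # zero-filled when there is no RSA identity
    else:
        # Assumption: the caller passes the encoded RSA identity key.
        rsa_part = hashlib.sha1(rsa_identity_key).digest()
    if ed25519_identity_key is None:
        ed_part = bytes(32)  # zero-filled when there is no ed25519 identity
    else:
        if len(ed25519_identity_key) != 32:
            raise ValueError("ed25519 identity key must be 32 bytes")
        ed_part = ed25519_identity_key
    return rsa_part + ed_part
```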

2.3. ntor onion keys

   Here, we need to convert the ntor key to an ed25519 key for
   signing.  See appendix A for how to do that.  We'll also need
   to transmit a sign bit.

   We can add an element "ntor-onion-key-crosscert", containing an
   Ed25519 certificate in the format from proposal 220 section 2.1,
   with a sign indicator to indicate which ed25519 public key to use
   to check the key:

      "ntor-onion-key-crosscert" SP SIGNBIT SP CERT NL

      SIGNBIT = "0" / "1"

   Note that this cert format has 32 bytes of redundant data, since it
   includes the identity key an extra time.  That seems okay to me.

   The signed key here is the master identity key.

   The TYPE field in this certificate should be set to
      [0A] - ntor onion key cross-certifying ntor identity key

3. Authority behavior

   Authorities should reject any router descriptor with an invalid
   onion-key-crosscert element or ntor-onion-key-crosscert element.

   Both elements should be required on any cert containing an
   ed25519 identity key.

   See section 3.1 of proposal 220 for rules requiring routers to
   eventually have ed25519 keys.

4. Performance impact

   Routers do not generate new descriptors frequently enough for the
   extra signing operations required here to have an appreciable effect
   on their performance.

   Checking an extra ed25519 signature when parsing a descriptor is
   very cheap, since we can use batch signature checking.

   The point decompression algorithm will require us to calculate
   1/(u+1), which costs as much as an exponentiation in
   GF(2^255-19).

   Checking an RSA1024 signature is also cheap, since we use small
   public exponents.

   Adding an extra RSA signature and an extra ed25519 signature to
   each descriptor will make each descriptor, after compression,
   about 128+100 bytes longer.  (Compressed base64-encoded random
   bytes are about as long as the original random bytes.) Most
   clients don't download raw descriptors, though, so it shouldn't
   matter too much.


A. Converting a curve25519 public key to an ed25519 public key

   Given a curve25519 x-coordinate (u), we can get the y coordinate
   of the ed25519 key using

         y = (u-1)/(u+1)

   and then we can apply the usual ed25519 point decompression
   algorithm to find the x coordinate of the ed25519 point to check
   signatures with.

   Note that we need the sign of the X coordinate to do this
   operation; otherwise, we'll have two possible X coordinates that
   might correspond to the key.  Therefore, we need the 'sign'
   of the X coordinate, as used by the ed25519 key expansion
   algorithm.

   To get the sign, the easiest way is to take the same private key,
   feed it to the ed25519 public key generation algorithm, and see
   what the sign is.
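
   A minimal sketch of the conversion follows, using Python's built-in
   modular exponentiation for the field inversion. The function names are
   hypothetical. The encoding step assumes the usual ed25519 public key
   format: y as a 255-bit little-endian integer, with the sign of X in the
   top bit of the last byte.

```python
P = 2**255 - 19


def curve25519_u_to_ed25519_y(u: int) -> int:
    """Return y = (u - 1) / (u + 1) in GF(2^255 - 19)."""
    denom = (u + 1) % P
    if denom == 0:
        raise ValueError("u = -1 has no corresponding y")
    # Invert with Fermat's little theorem: a^(P-2) = 1/a (mod P).
    return (u - 1) * pow(denom, P - 2, P) % P


def encode_ed25519_public_key(u_bytes: bytes, signbit: int) -> bytes:
    """Build the 32-byte ed25519 key for a curve25519 key and sign bit."""
    if len(u_bytes) != 32 or signbit not in (0, 1):
        raise ValueError("bad input")
    # The top bit of a curve25519 public key is ignored.
    u = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)
    y = curve25519_u_to_ed25519_y(u)
    return (y | (signbit << 255)).to_bytes(32, "little")
```

   For example, the curve25519 base point u = 9 maps to y = 4/5, which is
   the y coordinate of the ed25519 base point.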


B. Security notes

   It would be very bad for security if we provided a diffie-hellman
   oracle for our curve25519 ntor keys.  Fortunately, we don't, since
   nobody else can influence the certificate contents.

C. Implementation notes

   As implemented in Tor, I've decided to make this proposal cross-dependent
   on proposal 220. A router descriptor must have ALL or NONE
   of the following:
            * An Ed25519 identity key
            * A TAP cross-certification
            * An ntor cross-certification

   Further, if it has the above, it must also have:
            * An ntor onion key.


Filename: 229-further-socks5-extensions.txt
Title: Further SOCKS5 extensions
Author: Yawning Angel
Created: 25-Feb-2014
Status: Rejected

Note: These are good ideas, but it's better not to hack SOCKS any further
  now that we support HTTP CONNECT tunnels.

0. Abstract

   We propose extending the SOCKS5 protocol to allow passing more
   per-session metadata, and to allow returning more meaningful
   response failure codes back to the client.

1. Introduction

   The SOCKS5 protocol is used by Tor both as the primary interface
   for applications to transfer data, and as the interface by which
   Tor communicates with pluggable transport implementations.

   While the current specifications allow for passing a limited
   amount of per-session metadata via hijacking the
   Username/Password authentication method fields, this solution is
   limited in that the amount of payload that can be conveyed is
   restricted to 510 bytes, does not allow the SOCKS server to
   return a response, and precludes using authentication on the
   SOCKS port.

   The first part of this proposal defines a new authentication
   method to overcome these limitations.

   The second part of this proposal defines a range of SOCKS5
   response codes that can be used to signal Tor specific error
   conditions when processing SOCKS requests.

2. Proposal

2.1. Tor Extended SOCKS5 Authentication

   We introduce a new authentication method to the SOCKS5 protocol.

   The METHOD number to be returned to indicate support for or
   select this method is X'97', which belongs to the "RESERVED FOR
   PRIVATE METHODS" range in RFC 1928.

   After the authentication method has been negotiated following the
   standard SOCKS5 protocol, the actual authentication phase begins.

   If any requirement labeled with a "MUST" below in this protocol
   is violated, the party receiving the violation MUST close the
   connection.

   All multibyte numeric values in this protocol MUST be transmitted
   in network (big-endian) byte order.

   The initiator will send an Extended Authentication request:

    +----+----------+-------+-------------+-------+-------------+---
    |VER | NR PAIRS | KLEN1 |    KEY1     | VLEN1 |   VALUE1    | ...
    +----+----------+-------+-------------+-------+-------------+---
    | 1  |    2     |   2   | KLEN1 bytes |   2   | VLEN1 bytes | ...
    +----+----------+-------+-------------+-------+-------------+---

    VER: 8 bits (unsigned integer)

       This field specifies the version of the authentication
       method.  It MUST be set to X'01'.

    NR PAIRS: 16 bits (unsigned integer)

       This field specifies the number of key/value pairs to follow.

    KLEN: 16 bits (unsigned integer)

       This field specifies the length of the key in bytes.  It MUST
       be greater than 0.

    KEY: variable length

       This field contains the key associated with the subsequent
       VALUE field as an ASCII string, without a NUL terminator.

    VLEN: 16 bits (unsigned integer)

       This field specifies the length of the value in bytes. It MAY
       be X'0000', in which case the corresponding VALUE field is
       omitted.

    VALUE: variable length, optional

       The value corresponding to the KEY.
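
   The request layout above can be illustrated with a short encoder and
   decoder. This is a sketch only: the function names are hypothetical, and
   a real implementation would close the connection on a violation rather
   than raise an exception.

```python
import struct
from typing import List, Tuple

Pairs = List[Tuple[bytes, bytes]]


def encode_request(pairs: Pairs) -> bytes:
    """Encode VER, NR PAIRS, then KLEN/KEY/VLEN/VALUE for each pair."""
    out = [b"\x01", struct.pack("!H", len(pairs))]
    for key, value in pairs:
        if not key:
            raise ValueError("KLEN MUST be greater than 0")
        # "!H" is a 16-bit unsigned integer in network byte order.
        out.append(struct.pack("!H", len(key)) + key)
        out.append(struct.pack("!H", len(value)) + value)
    return b"".join(out)


def decode_request(data: bytes) -> Pairs:
    if len(data) < 3 or data[0] != 0x01:
        raise ValueError("bad version or truncated header")
    (nr_pairs,) = struct.unpack_from("!H", data, 1)
    pos = 3
    pairs = []
    for _ in range(nr_pairs):
        fields = []
        for _ in range(2):  # first the key, then the value
            if pos + 2 > len(data):
                raise ValueError("truncated length field")
            (length,) = struct.unpack_from("!H", data, pos)
            pos += 2
            if pos + length > len(data):
                raise ValueError("truncated field")
            fields.append(data[pos:pos + length])
            pos += length
        if not fields[0]:
            raise ValueError("KLEN MUST be greater than 0")
        pairs.append((fields[0], fields[1]))
    return pairs
```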


   The responder will verify the contents of the Extended
   Authentication request and send the following response:

    +----+--------+----------+-------+-------------+-------+-------------+---
    |VER | STATUS | NR PAIRS | KLEN1 |    KEY1     | VLEN1 |   VALUE1    | ...
    +----+--------+----------+-------+-------------+-------+-------------+---
    | 1  |   1    |    2     |   2   | KLEN1 bytes |   2   | VLEN1 bytes | ...
    +----+--------+----------+-------+-------------+-------+-------------+---

    VER: 8 bits (unsigned integer)

       This field specifies the version of the authentication
       method.  It MUST be set to X'01'.

    STATUS: 8 bits (unsigned integer)

       The status of the Extended Authentication request where:

        * X'00' SUCCESS
        * X'01' AUTHENTICATION FAILED
        * X'02' INVALID ARGUMENTS

       If a server sends a response indicating failure (STATUS value
       other than X'00') it MUST close the connection.

       [XXXX What should a client do if it gets a value here it does
       not recognize?]

    NR PAIRS, KLEN, KEY, VLEN, VALUE:

       These fields have the same format as they do in Extended
       Authentication requests.


   The currently defined KEYs are:

    * "USERNAME" The username for authentication.
    * "PASSWD" The password for authentication.

    [XXXX What do these do?  What is their behavior?  Are they
      client-only? Right now, Tor uses SOCKS5 usernames and
      passwords in two ways:

            1) as a way to control isolation, when receiving them
               from a SOCKS client.
            2) as a way to encode arbitrary data, when sending data
               to a PT.

      Neither of these seem necessary any more.  We can turn 1 into
       a new KEY, and we can turn 2 into a new set of keys.  -NM]

    [XXX - Add some more here, Stream isolation? -YA]

    [XXXX What should a client do if it gets a value here it does
     not recognize? -NM]

    [XXXX Should we recommend any namespace conventions for these? -NM]


2.2. Tor Extended SOCKS5 Reply Codes

   We introduce the following additional SOCKS5 reply codes to be
   sent in the REP field of a SOCKS5 message.  Implementations MUST
   NOT send any of the extended codes unless the initiator has
   indicated that it understands the "Tor Extended SOCKS5
   Authentication" as part of the version identifier/method
   selection SOCKS5 message.

    [Actually, should this perhaps be controlled by additional KEY?
       (I'm not sure.) -NM]

   Where:

    * X'E0' Hidden Service Not Found

      The requested Tor Hidden Service was not found.

    * X'E1' Hidden Service Not Reachable

      The requested Tor Hidden Service was not reachable.

    * X'F0' Temporary Pluggable Transport failure, retry immediately

      Pluggable transports SHOULD return this status code if the
      connection attempt failed, but the pluggable transport
      believes that subsequent connections with the same parameters
      are likely to succeed.

      Example:

         The ScrambleSuit Session Ticket handshake failed, but
         reconnecting is likely to succeed as it will use the
         UniformDH handshake.

    * X'F1' Pluggable transport protocol failure, invalid bridge

      Pluggable transports MUST return this status code if the
      connection attempt failed in a manner that indicates that the
      remote peer is not likely to accept connections at a later
      time.

      Example:

         The obfs3 handshake failed.

    * X'F2' Pluggable transport internal error

      Pluggable transports SHOULD return this status code if the
      connection attempt failed due to an internal error in the
      pluggable transport implementation.

      Tor might wish to restart the pluggable transport executable,
      or retry after a delay.
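
   A sketch of how a responder might gate these codes on the initiator's
   method selection message is shown below. The enum member names and
   helper functions are hypothetical; the fallback to X'01' (general SOCKS
   server failure, from RFC 1928) is an assumption, since this proposal
   does not say which standard code to send instead.

```python
from enum import IntEnum

EXTENDED_AUTH_METHOD = 0x97
GENERAL_FAILURE = 0x01  # RFC 1928: general SOCKS server failure


class TorReply(IntEnum):
    HS_NOT_FOUND = 0xE0
    HS_NOT_REACHABLE = 0xE1
    PT_TEMPORARY_FAILURE = 0xF0
    PT_INVALID_BRIDGE = 0xF1
    PT_INTERNAL_ERROR = 0xF2


def offers_extended_auth(method_selection: bytes) -> bool:
    """Check a VER, NMETHODS, METHODS message for method X'97'."""
    if len(method_selection) < 2 or method_selection[0] != 0x05:
        return False
    nmethods = method_selection[1]
    return EXTENDED_AUTH_METHOD in method_selection[2:2 + nmethods]


def reply_code(wanted: TorReply, extended_ok: bool) -> int:
    """Only send an extended code if the initiator understands it."""
    return int(wanted) if extended_ok else GENERAL_FAILURE
```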

3. Compatibility

   SOCKS5 negotiates authentication methods so backward and forward
   compatibility is obtained for free, assuming a non-broken SOCKS5
   implementation on the responder side that ignores unrecognised
   authentication methods in the negotiation phase.

4. Security Considerations

   Identical security considerations to RFC 1929 Username/Password
   authentication apply when doing Username/Password
   authentication using the keys reserved for such.  As SOCKS5 is
   sent in cleartext, this extension (like the rest of the SOCKS5
   protocol) MUST NOT be used in scenarios where sniffing is possible.

   The authors of this proposal note that binding any of the Tor
   (and associated) SOCKS5 servers to non-loopback interfaces is
   strongly discouraged currently, so in the current model this is
   believed to be acceptable.

5. References

   Leech, M., Ganis, M., Lee, Y., Kuris, R., Koblas, D., Jones L., "SOCKS
   Protocol Version 5", RFC 1928, March 1996.

   Tor Project, "Tor's extensions to the SOCKS protocol"

   Leech, M. "Username/Password Authentication for SOCKS V5", RFC 1929,
   March 1996.

   Appelbaum, J., Mathewson, N., "Pluggable Transport Specification",
   June 2012.

[XXX -  Changelog (Remove when accepted) -YA]

   2014-02-28 (Thanks to nickm/arma)
    * Generalize to also support tor
      * Add status codes for bug #6031
    * Switch from always having a username/password field to making them just
      predefined keys.
    * Change the auth method number to 0x97

   2014-02-28 (nickm's fault)
    * check it into git
    * clean text a little, fix redundancy
    * ask some questions
Filename: 230-rsa1024-relay-id-migration.txt
Title: How to change RSA1024 relay identity keys
Authors: Nick Mathewson
Created: 7 April 2014
Target: 0.2.?
Status: Obsolete

Note: Obsoleted by Ed25519 ID keys; superseded by 240 and 256.

1. Intro and motivation

   Sometimes, a relay would like to migrate from one RSA1024
   identity key to another without losing its previous status.

   This is especially important because proposal 220 ("Migrate
   server identity keys to Ed25519") is not yet implemented, and so
   server identity keys are not kept offline.  So when an OpenSSL
   bug like CVE-2014-0160 makes memory-reading attacks a threat to
   identity keys, we need a way for routers to migrate ASAP.

   This proposal does not cover migrating RSA1024 OR identity keys
   for authorities.

2. Design

   I propose that when a relay changes its identity key, it should
   include an "old-identity" field in its server descriptor for 60 days
   after the migration.  This old-identity would include the
   old RSA1024 identity, a signature of the new identity key
   with the old one, and the date when the migration occurred.

   This field would appear as an "old-id" field in microdescriptors,
   containing a SHA1 fingerprint of the old identity key, if the
   signature turned out to be valid.

   Authorities would store old-identity => new-identity mappings,
   and:

      * Treat history information (wfu, mtbf, [and what else?]) from
        old identities as applying to new identities instead.

      * No longer accept any router descriptors signed by the old
        identity.

   Clients would migrate any guard entries for the old identity to
   the new identity.

   (This will break connections for clients who try to
   connect to the old identity key before learning about the new
   one, but the window there won't be large for any single router.)

3. Descriptor format details

   Router descriptors may contain these new elements:

      "old-rsa1024-id-key" NL RSA_KEY NL

        Contains an old RSA1024 identity key. If this appears,
        old-rsa1024-id-migration must also appear. [At most once]

      "old-rsa1024-id-migration" SP ISO-TIME NL SIGNATURE NL

        Contains a signature of:
         The bytes "RSA1024 ID MIGRATION"               [20 bytes]
         The ISO-TIME field above as an 8 byte field    [8 bytes]
         A SHA256 hash of the new identity              [32 bytes]

        If this appears, "old-rsa1024-id-key" must also appear.
        [At most once].
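
   For illustration, the 60 bytes covered by the signature could be built
   as in the sketch below. Two details are assumptions, because the text
   above does not pin them down: the 8-byte time field is taken to be a
   big-endian count of seconds since the Unix epoch, and the new identity
   is hashed in whatever encoded form the caller supplies. The function
   name is hypothetical.

```python
import hashlib
import struct
from datetime import datetime, timezone

MAGIC = b"RSA1024 ID MIGRATION"  # exactly 20 bytes


def migration_signed_body(iso_time: str, new_identity_key: bytes) -> bytes:
    """Build the data signed in the old-rsa1024-id-migration element."""
    # ISO-TIME is "YYYY-MM-DD HH:MM:SS", interpreted as UTC.
    when = datetime.strptime(iso_time, "%Y-%m-%d %H:%M:%S")
    seconds = int(when.replace(tzinfo=timezone.utc).timestamp())
    # Assumption: 64-bit unsigned, network byte order.
    time_field = struct.pack("!Q", seconds)
    return MAGIC + time_field + hashlib.sha256(new_identity_key).digest()
```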

4. Interface

   To use this feature, a router should rename its secret_id_key
   file to secret_id_key_OLD.  The first time that Tor starts and
   finds a secret_id_key_OLD file, it generates a new ID key if one
   is not present, and generates the text of the old-rsa1024-id-key
   and old-rsa1024-id-migration fields above.  It stores them in a
   new "old_id_key_migration" file, and deletes the
   secret_id_key_OLD file.  It includes them in its descriptors.

   Sixty days after the stored timestamp, the router deletes the
   "old_id_key_migration" file and stops including its contents in
   the descriptor.


Filename: 231-migrate-authority-rsa1024-ids.txt
Title: Migrating authority RSA1024 identity keys
Authors: Nick Mathewson
Created: 8 April 2014
Target: 0.2.?
Status: Obsolete

Note: Obsoleted by Ed25519 ID keys; superseded by 240 and 256.

1. Intro and motivation

   We'd like for RSA1024 identity keys to die out entirely.  But we
   may need to migrate authority identity keys before that happens.

   This is especially important because proposal 220 ("Migrate
   server identity keys to Ed25519") is not yet implemented, and so
   server identity keys are not kept offline.  So when an OpenSSL
   bug like CVE-2014-0160 makes memory-reading attacks a threat to
   identity keys, we need a way for authorities to migrate ASAP.

   Migrating authority ID keys is a trickier problem than migrating
   router ID keys, since the authority RSA1024 keys are hardwired in the
   source.  We use them to authenticate encrypted OR connections to
   authorities that we use to publish and retrieve directory
   information.

   This proposal does not cover migrating RSA1024 OR identity keys for
   other nodes; for that, see proposal 230.

2. Design

   When an authority is using a new RSA1024 key, it retains the old one
   in a "legacy_link_id_key" file.  It uses this key to perform link
   protocol handshakes at its old address:port, and it uses the new key
   to perform link protocol handshakes at a new address:port.

   This should be sufficient for all clients that expect the old
   address:port:fingerprint to work, while allowing new clients to use
   the correct address:port:fingerprint.

   Authorities will sign their own router descriptors with their new
   identity key, and won't advertise the old port or fingerprint at all
   in their descriptors.  This shouldn't break anything, so far as I
   know.

3. Implementation

   We'll have a new flag on an ORPort: "LegacyIDKey". It implies
   NoAdvertise.  If it is present, we use our LegacyIDKey for that
   ORPort, and only that ORPort, for all of:

     * The TLS certificate chains used in the v1 and v2 link protocol
       handshake.

     * The certificate chains and declared identity in the v3 link
       handshake.

     * Accepting ntor cells.

4. Open questions

   On ticket #11448, Robert Ransom suggests that authorities may need to
   publish extra server descriptors for themselves, signed with the old
   identity key too.  We should investigate whether clients will
   misbehave if they can't find such descriptors.

   If that's the case, authorities should generate these descriptors,
   but not include them in votes or the consensus; or if they are
   included, don't assign them flags that will get them used.

Filename: 232-pluggable-transports-through-proxy.txt
Title: Pluggable Transport through SOCKS proxy
Author: Arturo Filastò
Created: 28 February 2012
Status: Closed
Implemented-In: 0.2.6

Overview

  Tor introduced Pluggable Transports in proposal "180 Pluggable
  Transports for circumvention".

  The problem is that Tor currently cannot use a pluggable transport
  proxy and a normal (SOCKS/HTTP) proxy at the same time. Users
  reported this in #5195, where Tor fails with the message
  "Unacceptable option value: You have configured more than one proxy
  type".

Trivia

  This comes from a discussion that came up with Nick and I promised
  to write a proposal for it if I wanted to hear what he had to say.
  Nick spoke and I am writing this proposal.

Acknowledgments

  Most of the credit goes to Nick Mathewson for the main idea and
  the rest of it goes to George Kadianakis for helping me out in writing
  it.

Motivation

  After looking at some options we decided to go for this solution
  since it guarantees backwards compatibility and is not particularly
  costly to implement.

Design overview

  When Tor is configured to use both a pluggable transport proxy and a
  normal proxy it should delegate the proxying to the pluggable
  transport proxy.

  This can be achieved by specifying the address and port of the normal
  proxy to the pluggable transport proxy using environment variables:
  When both a normal proxy and the ClientTransportPlugin directives
  are set in the torrc, Tor should put the address of the normal proxy
  in an environment variable and start the pluggable transport
  proxy. When the pluggable transport proxy starts, it should read the
  address of the normal proxy and route all its traffic through it.

  After connecting to the normal proxy, the pluggable transport proxy
  notifies Tor whether it managed to connect or not.

  The environment variables also contain the authentication
  credentials for accessing the proxy.

Specifications: Tor Pluggable Transport communication

  When Tor detects a normal proxy directive and a pluggable transport
  proxy directive, it sets the environment variable:

    "TOR_PT_PROXY" -- This is the address of the proxy to be used by
    the pluggable transport proxy. It is in the format:
    <proxy_type>://[<user_name>][:<password>][@]<ip>:<port>
    ex. socks5://tor:test1234@198.51.100.1:8000
        socks4a://198.51.100.2:8001

  Acceptable values for <proxy_type> are: 'socks5', 'socks4a' and 'http'.
  If no <password> can be specified (e.g. in 'socks4a'), it is left out.

  If the pluggable transport proxy detects that the TOR_PT_PROXY
  environment variable is set, it attempts to connect to that proxy.
  On success it writes to stdout: "PROXY DONE".
  On failure it writes: "PROXY-ERROR <errormessage>".

  If Tor is configured to use both a normal proxy and a pluggable
  transport, and it either reads no PROXY line or reads a PROXY-ERROR
  line from the transport proxy's stdout, it should kill the transport
  proxy.
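
  As an illustration of the transport-proxy side, the sketch below
  parses a TOR_PT_PROXY value and produces the status line described
  above.  The function names and the caller-supplied `connect` callable
  are hypothetical; a real transport would perform the actual
  SOCKS/HTTP handshake where `connect` is called.

```python
from urllib.parse import urlsplit

ALLOWED_PROXY_TYPES = {"socks5", "socks4a", "http"}


def parse_tor_pt_proxy(value: str) -> dict:
    """Split a TOR_PT_PROXY value into its components."""
    parts = urlsplit(value)
    if parts.scheme not in ALLOWED_PROXY_TYPES:
        raise ValueError("unsupported proxy type: %r" % parts.scheme)
    if parts.hostname is None or parts.port is None:
        raise ValueError("proxy address must be <ip>:<port>")
    return {
        "type": parts.scheme,
        "username": parts.username,  # None when absent
        "password": parts.password,  # None when absent (e.g. socks4a)
        "host": parts.hostname,
        "port": parts.port,
    }


def proxy_status_line(value: str, connect) -> str:
    """Return the line the transport proxy would write to stdout.

    `connect` is a hypothetical callable that takes the parsed proxy
    description and raises an exception if the proxy is unreachable.
    """
    try:
        connect(parse_tor_pt_proxy(value))
    except Exception as exc:  # any failure is reported back to Tor
        return "PROXY-ERROR %s" % exc
    return "PROXY DONE"
```

  A transport would call, for example,
  proxy_status_line(os.environ["TOR_PT_PROXY"], connect) and print the
  result on its stdout.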

Filename: 233-quicken-tor2web-mode.txt
Title: Making Tor2Web mode faster
Author: Virgil Griffith, Fabio Pietrosanti, Giovanni Pellerano
Created: 2014-03-27
Status: Rejected


1. Introduction

   While chatting with the Tor archons at the winter meeting, two speed
   optimizations for tor2web mode [1] were put forward.  This proposal
   concretizes these two optimizations.  As with the current
   tor2web mode, this quickened tor2web mode negates any client
   anonymity.

2. Tor2web optimizations

2.1. Self-rendezvous

  In the current tor2web mode, the client establishes a 1-hop circuit
  (direct connection) to a chosen rendezvous point.  We propose that,
  alternatively, the client set *itself* as the rendezvous point. This
  coincides with ticket #9685[2].

2.2. Direct introduction

  In the current tor2web mode, just as in non-tor2web mode, the
  client establishes a 3-hop circuit to the introduction point.  We
  propose that, alternatively, the client builds a 1-hop circuit to the
  introduction point.

4. References

  [1] Tor2web mode: https://trac.torproject.org/projects/tor/ticket/2553
  [2] Self-rendezvous: https://trac.torproject.org/projects/tor/ticket/9685
Filename: 234-remittance-addresses.txt
Title: Adding remittance field to directory specification
Author: Virgil Griffith, Leif Ryge, Rob Jansen
Created: 2014-03-27
Status: Rejected

Note: Rejected. People are doing this with ContactInfo lines.

1. Motivation

  We wish to add the ability for individual users to donate to relay
  operators using a cryptocurrency.  We propose adding
  an optional line to the torrc file which will be published in the
  directory consensus and listed on https://compass.torproject.org.


2. Proposal

  Allow an optional "RemittanceAddresses" line in the torrc file
  containing comma-delimited cryptocurrency URIs.  The format is:

    RemittanceAddresses <currency1>:<address1>,<currency2>:<address2>

  For an example using an actual bitcoin and namecoin address, this is:

    RemittanceAddresses bitcoin:19mP9FKrXqL46Si58pHdhGKow88SUPy1V8,namecoin:NAMEuWT2icj3ef8HWJwetZyZbXaZUJ5hFT

  The contents of a relay's RemittanceAddresses line will be mirrored in
  the relay's router descriptor (which is then published in the
  directory consensus).  This line will be treated akin to the
  ContactInfo field.  A cryptocurrency address may not contain a colon,
  comma, whitespace, or other nonprintable ASCII.

  Like the ContactInfo line, there is no explicit length limit for
  RemittanceAddresses---the only limit is the length of the entire
  descriptor.  If the relay lists multiple addresses of the same
  currency type (e.g., two bitcoin addresses), only the first
  (left-most) one of each currency is published in the directory
  consensus.
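
  The parsing rules above could be sketched as follows.  This is only
  an illustration of the rules as stated (comma-delimited entries, no
  colon, whitespace or nonprintable ASCII in an address, first entry
  per currency wins); the function name is hypothetical.

```python
def parse_remittance_addresses(value: str) -> dict:
    """Map each currency to the first (left-most) address listed for it."""
    published = {}
    for item in value.split(","):
        currency, sep, address = item.partition(":")
        if not sep or not currency or not address:
            raise ValueError("expected <currency>:<address>, got %r" % item)
        # Addresses may contain only printable, non-space ASCII and no colon.
        if any(ch == ":" or not ("!" <= ch <= "~") for ch in address):
            raise ValueError("illegal character in address %r" % address)
        published.setdefault(currency, address)  # later duplicates are dropped
    return published
```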

Filename: 235-kill-named-flag.txt
Title: Stop assigning (and eventually supporting) the Named flag
Authors: Sebastian Hahn
Created: 10 April 2014
Implemented-In: 0.2.6, 0.2.7
Status: Closed

1. Intro and motivation

  Currently, Tor supports the concept of linking a Tor relay's nickname
  to its identity key. This happens automatically as a new relay joins
  the network with a unique nickname, and keeps it for a while. To
  indicate that a nickname is linked to the presented identity, the
  directory authorities vote on a Named flag for all relays where they
  have such a link. Not all directory authorities are currently doing
  this - in fact, there are only two, gabelmoo and tor26.

  For a long time, we've been telling everyone to not rely on relay
  nicknames, even if the Named flag is assigned. There are two reasons
  for this: first, it adds another trust requirement on the directory
  authorities; second, naming may change over time as relays go
  offline for substantial amounts of time.

  Now that a significant portion of the network is required to rotate
  their identity keys, few relays will keep their Named flag. We should
  use this chance to stop assigning Named flags.

2. Design

  None so far, but we should review older-but-still-supported Tor
  versions (down to 0.2.2.x) for potential issues. In theory, Tor
  clients already support consensuses without Named flags, and testing
  in private Tor networks has never revealed any issues in this regard,
  but we're unsure if there might be some functionality that isn't
  typically tested with private networks and could get broken now.

3. Implementation

  The gabelmoo and tor26 directory authorities can simply remove the
  NamingAuthoritativeDirectory configuration option to stop giving out
  Named flags. This will mean the consensus won't include Named and
  Unnamed flags any longer. The code collecting naming statistics is
  independent of Tor, so it can run a while longer to ensure Naming can
  be switched on if unforeseen issues arise.

  Once this has been shown to not cause any issues, support for the
  Named flag can be removed from the Tor client implementation, and
  support for the NamingAuthoritativeDirectory can be removed from the
  Tor directory authority implementation.

4. Open questions

  None.
Filename: 236-single-guard-node.txt
Title: The move to a single guard node
Author: George Kadianakis, Nicholas Hopper
Created: 2014-03-22
Status: Closed

-1. Implementation-status

   Partially implemented, and partially superseded by proposal 271.

0. Introduction

   It has been suggested that reducing the number of guard nodes of
   each user and increasing the guard node rotation period will make
   Tor more resistant against certain attacks [0].

   For example, an attacker who sets up guard nodes and hopes for a
   client to eventually choose them as their guard will have much less
   probability of succeeding in the long term.

   Currently, every client picks 3 guard nodes and keeps them for 2 to
   3 months (since 0.2.4.12-alpha) before rotating them. In this
   document, we propose the move to a single guard per client and an
   increase of the rotation period to 9 to 10 months.

1. Proposed changes

1.1. Switch to one guard per client

   When this proposal becomes effective, clients will switch to using
   a single guard node.

   That is, on its first startup, Tor picks one guard and stores its
   identity persistently to disk. Tor uses that guard node as the
   first hop of its circuits from then on.

   If that Guard node ever becomes unusable, rather than replacing it,
   Tor picks a new guard and adds it to the end of the list. When
   choosing the first hop of a circuit, Tor tries all guard nodes from
   the top of the list sequentially till it finds a usable guard node.
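
   One possible reading of this selection rule is sketched below.  The
   function name and the two callables are hypothetical: `is_usable`
   stands in for the path-spec.txt usability rules and `pick_new_guard`
   for whatever sampling Tor uses when it needs a fresh guard.

```python
def choose_first_hop(guard_list, is_usable, pick_new_guard):
    """Return the first usable guard, appending a new one if none is usable.

    guard_list is the persistent, ordered list of guards; it is modified
    in place when a new guard has to be added.
    """
    for guard in guard_list:  # walk the list from the top
        if is_usable(guard):
            return guard
    new_guard = pick_new_guard()
    guard_list.append(new_guard)  # old guards are kept, not replaced
    return new_guard
```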

   A Guard node is considered unusable according to section "5. Guard
   nodes" in path-spec.txt. The rest of the rules from that section
   apply here too. XXX which rules specifically? -asn
   XXX Probably the rules about how to add a new guard (only after
       contact), when to re-try a guard for reachability, and when to
       discard a guard?  -nickhopper

   XXX Do we need to specify how already existing clients migrate?

1.1.1. Alternative behavior to section 1.1

   Here is an alternative to the behavior specified in the
   previous section. It's unclear which one is better.

   Instead of picking a new guard when the old guard becomes unusable,
   we pick a number of guards in the beginning but only use the top
   usable guard each time. When our guard becomes unusable, we move to
   the guard below it in the list.

   This behavior _might_ make some attacks harder; for example, an
   attacker who shoots down your guard in the hope that you will pick
   his guard next, is now forced to have evil guards in the network at
   the time you first picked your guards.

   However, this behavior might also influence performance, since a
   guard that was fast enough 7 months ago, might not be this fast
   today. Should we reevaluate our opinion based on the last
   consensus, when we have to pick a new guard? Also, a guard that was
   up 7 months ago might be down today, so we might end up sampling
   from the current network anyway.

1.2. Increase guard rotation period

   When this proposal becomes effective, Tor clients will set the
   lifetime of each guard to a random time between 9 and 10 months.

   If Tor tries to use a guard whose age is over its lifetime value,
   the guard gets discarded (also from persistent storage) and a new
   one is picked in its place.
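
   A minimal sketch of this rule follows.  The names are hypothetical,
   and treating a month as 30 days and drawing the lifetime uniformly
   are assumptions; the proposal only says "a random time between 9
   and 10 months".

```python
import random

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a month is 30 days


def pick_guard_lifetime(rng=random):
    """Draw a guard lifetime, in seconds, between 9 and 10 months."""
    return rng.uniform(9 * SECONDS_PER_MONTH, 10 * SECONDS_PER_MONTH)


def guard_expired(chosen_at, lifetime, now):
    """True if the guard's age exceeds its lifetime (all values in seconds)."""
    return now - chosen_at > lifetime
```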

   XXX We didn't do any analysis on extending the rotation period.
       For example, we don't even know the average age of guards, and
       whether all guards stay around for less than 9 months anyway.
       Maybe we should do some analysis before proceeding?

   XXX The guard lifetime should be controlled using the
       (undocumented?) GuardLifetime consensus option, right?

1.2.1. Alternative behavior to section 1.2

   Here is an alternative to the behavior specified in the
   previous section. It's unclear which one is better.

   Similar to section 1.2, but instead of rotating to completely new
   guard nodes after 9 months, we pick a few extra guard nodes in the
   beginning, and after 9 months we delete the already used guard
   nodes and use the one after them.

   This has approximately the same tradeoffs as section 1.1.1.

   Also, should we check the age of all of our guards periodically, or
   only check them when we try to use them?

1.3. Age of guard as a factor on guard probabilities

   Increasing the guard rotation period also worsens the
   under-utilization of young guards, since clients will now rotate
   guards even less frequently (see 'Phase three' of [1]).

   We can mitigate this phenomenon by treating these recent guards as
   "fractional" guards:

   To do so, every time an authority needs to vote for a guard, it
   reads a set of consensus documents spanning the past NNN months,
   where NNN is the number of months in the guard rotation period (10
   months if this proposal is adopted in full) and counts the
   consensuses in which the relay had the Guard flag.

   Then, in their votes, the authorities include the Guard Fraction of
   each guard by appending '[SP "GuardFraction=" INT]' in the guard's
   "w" line. Its value is an integer between 0 and 100, with 0 meaning
   that it's a brand new guard, and 100 that it has been present in
   all the inspected consensuses.

   A guard N that has been visible for V out of NNN*30*24 consensuses
   has had the opportunity to be chosen as a guard by approximately
   F = V/(NNN*30*24) of the clients in the network, and the remaining
   1-F fraction of the clients have not noticed this change.  So when
   being chosen for middle or exit positions on a circuit, clients
   should treat N as if F fraction of its bandwidth is a guard
   (respectively, dual) node and (1-F) is a middle (resp, exit) node.
   Let Wpf denote the weight from the 'bandwidth-weights' line a
   client would apply to N for position p if it had the guard
   flag, Wpn the weight if it did not have the guard flag, and B the
   measured bandwidth of N in the consensus.  Then instead of choosing
   N for position p proportionally to Wpf*B or Wpn*B, clients should
   choose N proportionally to F*Wpf*B + (1-F)*Wpn*B.
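
   The two computations above can be sketched as follows.  The
   function names are hypothetical, and rounding the percentage down
   is an assumption; the text only requires an integer between 0 and
   100.

```python
CONSENSUSES_PER_MONTH = 30 * 24  # hourly consensuses, 30-day months


def guard_fraction_percent(consensuses_with_guard_flag, rotation_months):
    """GuardFraction value (0..100) for an authority's vote."""
    total = rotation_months * CONSENSUSES_PER_MONTH
    percent = consensuses_with_guard_flag * 100 // total  # assumed: round down
    return max(0, min(100, percent))


def selection_weight(fraction, w_with_flag, w_without_flag, bandwidth):
    """Weight for choosing N at position p: F*Wpf*B + (1-F)*Wpn*B.

    fraction is F as a number in [0, 1], i.e. GuardFraction / 100.
    """
    return (fraction * w_with_flag * bandwidth
            + (1 - fraction) * w_without_flag * bandwidth)
```

   For instance, a relay that had the Guard flag in 3600 of the 7200
   consensuses of a 10-month period gets GuardFraction=50, and is then
   weighted halfway between its with-flag and without-flag weights.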

   Similarly, when calculating the bandwidth-weights line as in
   section 3.8.3 of dir-spec.txt, directory authorities should treat N
   as if fraction F of its bandwidth has the guard flag and (1-F) does
   not.  So when computing the totals G,M,E,D, each relay N with guard
   visibility fraction F and bandwidth B should be added as follows:

   G' = G + F*B, if N does not have the exit flag
   M' = M + (1-F)*B, if N does not have the exit flag
   D' = D + F*B, if N has the exit flag
   E' = E + (1-F)*B, if N has the exit flag
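
   As a sketch, the four update rules can be applied to a running set
   of totals like this.  The dictionary layout and function name are
   hypothetical; only the arithmetic comes from the rules above.

```python
def add_guard_to_totals(totals, fraction, bandwidth, has_exit_flag):
    """Return new G, M, E, D totals after adding one relay with the Guard flag.

    totals is a dict with keys "G", "M", "E", "D"; fraction is F in [0, 1].
    """
    updated = dict(totals)
    if has_exit_flag:
        updated["D"] += fraction * bandwidth        # guard+exit share
        updated["E"] += (1 - fraction) * bandwidth  # exit-only share
    else:
        updated["G"] += fraction * bandwidth        # guard share
        updated["M"] += (1 - fraction) * bandwidth  # middle share
    return updated
```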

1.3.1. Guard Fraction voting

  To pass that information to clients, we introduce consensus method
  19, where if 3 or more authorities provided GuardFraction values in
  their votes, the authorities produce a consensus containing a
  GuardFraction keyword equal to the low-median of the GuardFraction
  votes.

  The GuardFraction keyword is appended in the 'w' line of each router
  in the consensus, after the optional 'Unmeasured' keyword. Example:
    w Bandwidth=20 Unmeasured=1 GuardFraction=66
  or
    w Bandwidth=53600 GuardFraction=99
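
  The voting rule and the resulting "w" line could be sketched as
  below.  The function names are hypothetical; the low-median is taken
  with Python's statistics.median_low, which returns the lower of the
  two middle values when the number of votes is even.

```python
import statistics

MIN_GUARDFRACTION_VOTES = 3  # "3 or more authorities"


def consensus_guard_fraction(votes):
    """Low-median of the GuardFraction votes, or None with too few votes."""
    if len(votes) < MIN_GUARDFRACTION_VOTES:
        return None
    return statistics.median_low(votes)


def format_w_line(bandwidth, unmeasured=False, guard_fraction=None):
    """Build a consensus 'w' line with the optional keywords in order."""
    fields = ["w", "Bandwidth=%d" % bandwidth]
    if unmeasured:
        fields.append("Unmeasured=1")
    if guard_fraction is not None:
        fields.append("GuardFraction=%d" % guard_fraction)
    return " ".join(fields)
```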

1.4. Raise the bandwidth threshold for being a guard

   From dir-spec.txt:
      "Guard" -- A router is a possible 'Guard' if its Weighted Fractional
       Uptime is at least the median for "familiar" active routers, and if
       its bandwidth is at least median or at least 250KB/s.

   When this proposal becomes effective, authorities should change the
   bandwidth threshold for being a guard node to 2000KB/s instead of
   250KB/s.

   Implications of raising the bandwidth threshold are discussed in
   section 2.3.

   XXX Is this insane? It's an 8-fold increase.

2. Discussion

2.1. Guard node set fingerprinting

   With the old behavior of three guard nodes per user, it was
   extremely unlikely for two users to have the same guard node
   set. Hence the set of guard nodes acted as a fingerprint to each
   user.

   When this proposal becomes effective, each user will have one guard
   node. We believe that this slightly reduces the effectiveness of
   this fingerprint since users who pick a popular guard node will now
   blend in with thousands of other users. However, clients who pick a
   slow guard will still have a small anonymity set [2].

   All in all, this proposal slightly improves the situation of guard
   node fingerprinting, but does not solve it. See the next section
   for a suggested scheme that would further fix the guard node set
   fingerprinting problem.

2.1.1. Potential fingerprinting solution: Guard buckets

   One suggested alternative that moves us closer to solving the
   guard node fingerprinting problem would be to split the list of N
   guard nodes into buckets of k guards, and have each client pick a
   bucket [3].

   This reduces the fingerprint from N-choose-k to N/k guard set
   choices; it also allows users to have multiple guard nodes which
   provides reliability and performance.
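
   To get a feel for the size of this reduction, the two counts can be
   computed directly.  The values N = 2000 (the guard count quoted in
   section 2.3) and k = 3 (the current number of guards per client)
   are used purely as illustrative inputs, and the function names are
   hypothetical.

```python
import math


def guard_set_choices(n_guards, k):
    """Number of distinct k-guard sets a client can end up with today."""
    return math.comb(n_guards, k)


def bucket_choices(n_guards, k):
    """Number of whole buckets when the guard list is split into groups of k."""
    return n_guards // k


print(guard_set_choices(2000, 3), bucket_choices(2000, 3))
```

   With these inputs the possible guard sets drop from over a billion
   to a few hundred buckets, so each bucket is shared by far more
   clients.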

   Unfortunately, the implementation of this idea is not as easy and
   its anonymity effects are not well understood so we had to reject
   this alternative for now.

2.2. What about 'multipath' schemes like Conflux?

   By switching to one guard, we rule out the deployment of
   'multipath' systems like Conflux [4] which build multiple circuits
   through the Tor network and attempt to detect and use the most
   efficient circuits.

   On the other hand, the 'Guard buckets' idea outlined in section
   2.1.1 works well with Conflux-type schemes so it's still worth
   considering.

2.3. Implications of raising the bandwidth threshold for guards

   By raising the bandwidth threshold for being a guard we directly
   affect the performance and anonymity of Tor clients. We performed a
   brief analysis of the implications of switching to one guard and
   the results imply that the changes are not tragic [2].

   Specifically, it seems that the performance of about half of the
   clients will degrade slightly, but the performance of the other
   half will remain the same or even improve.

   Also, it seems that the powerful guard nodes of the Tor network
   have enough total bandwidth capacity to handle client traffic even
   if some slow guard nodes get discarded.

   On the anonymity side, by increasing the bandwidth threshold to
   2MB/s we halve our guard nodes; we discard 1000 out of 2000
   guards. Even if this seems like a substantial diversity loss, it
   seems that the 1000 discarded guard nodes had a very small chance
   of being selected in the first place (7% chance of any of them
   being selected).

   However, it's worth noting that the performed analysis was quite
   brief and the implications of this proposal are complex, so we
   should be prepared for surprises.

2.4. Should we stop building circuits after a number of guard failures?

   As academic work like the Sniper attack [5] shows, a powerful
   attacker can choose to shut down guard nodes till a client is
   forced to pick an attacker-controlled guard node. Similarly, a
   local network attacker can kill all connections towards all guards
   except the ones she controls.

   This is a very powerful attack that is hard to defend against. A
   naive way of defending against it would be for Tor to refuse to
   build any more circuits after a number of guard node failures have
   been experienced.

   Unfortunately, we believe that this is not a sufficiently strong
   countermeasure since puzzled users will not comprehend the
   confusing warning message about guard node failures and they will
   instead just uninstall and reinstall TBB to fix the issue.

2.5. What this proposal does not propose

   Finally, this proposal does not aim to solve all the problems with
   guard nodes. This proposal only tries to solve some of the problems
   whose solution is analyzed sufficiently and seems harmless enough
   to us.

   For example, this proposal does not try to solve:
   - Guard enumeration attacks. We need guard layers or virtual
     circuits for this [6].
   - The guard node set fingerprinting problem [7]
   - The fact that each isolation profile or virtual identity should
     have its own guards.

XXX It would also be nice to have some way to easily revert back to 3
    guards if we later decide that a single guard was a very stupid
    idea.

References:

[0]: https://blog.torproject.org/blog/improving-tors-anonymity-changing-guard-parameters
     http://freehaven.net/anonbib/#wpes12-cogs

[1]: https://blog.torproject.org/blog/lifecycle-of-a-new-relay

[2]: https://lists.torproject.org/pipermail/tor-dev/2014-March/006458.html

[3]: https://trac.torproject.org/projects/tor/ticket/9273#comment:4

[4]: http://freehaven.net/anonbib/#pets13-splitting

[5]: https://blog.torproject.org/blog/new-tor-denial-service-attacks-and-defenses

[6]: https://trac.torproject.org/projects/tor/ticket/9001

[7]: https://trac.torproject.org/projects/tor/ticket/10969
Filename: 237-directory-servers-for-all.txt
Title: All relays are directory servers
Author: Matthew Finkel
Created: 29-Jul-2014
Status: Closed
Target: 0.2.7.x
Implemented-in: 0.2.8.1-alpha
Supersedes: 185

Overview:

      This proposal aims at simplifying how users interact directly with
  the Tor network by turning all relays into directory servers (also
  known as directory caches), too.  Currently an operator has the
  options of running a relay, a directory server, or both.  With the
  acceptance (and implementation) of this proposal the options will be
  simplified by having (nearly) all relays cache and serve directory
  documents, without additional configuration.

Motivation:

      Fetching directory documents and descriptors is not always a
  simple operation for a client. This is especially true and potentially
  dangerous when the client would prefer querying its guard but its
  guard is not a directory server. When this is the case, the client
  must choose and query a distinct directory server. At best this should
  not be necessary and at worst, it seems, this adds another position
  within the network for profiling and partitioning users. With the
  orthogonally proposed move to clients using a single guard, the
  resulting benefits could be reduced by clients using distinct
  directory servers. In addition, in the case where the client does not
  use guards, it is important to have the largest possible amount of
  diversity in the set of directory servers. In a network where (almost)
  every relay is a directory server, the profiling and partitioning
  attack vector is reduced to the guard (for clients who use them),
  which is already in a privileged position for this. In addition, with
  the increased set size, relay descriptors and documents are more
  readily available and it diversifies the providers.

Design:

      The changes needed to achieve this should be simple. Currently all
  relays download and cache the majority of relay documents in any case,
  so the slightly increased memory usage from downloading all of them should
  have minimal consequences. There will be necessary logical changes in
  the client, router, and directory code.

      Currently directory servers are defined as such if they advertise
  having an open directory port. We can no longer assume this is true. To
  this end, we will introduce a new server descriptor line.

  	"tunnelled-dir-server" NL
        [At most once]
        [No extra arguments]


      The presence of this line indicates that the relay accepts
  tunnelled directory requests. For a relay that implements this
  proposal, this line MUST be added to its descriptor if it does not
  advertise a directory port, and the line MAY be added if it also
  advertises an open directory port. In addition to this, relays will
  now download and cache all descriptors and documents listed in the
  consensus, regardless of whether they are deemed useful or usable,
  exactly like the current directory server behavior. All relays will
  also accept directory requests when they are tunnelled over a
  connection established with a BEGIN_DIR cell, the same way these
  connections are already accepted by bridges and directory servers with
  an open DirPort.

      Directory Authorities will now assign the V2Dir flag to a server if
  it supports a version of the directory protocol which is useful to
  clients and it has at least an open directory port or it has an open
  and reachable OR port and advertises "tunnelled-dir-server" in its
  server descriptor.
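
      The flag assignment rule above can be written as a single
  boolean expression.  The sketch below is only an illustration; the
  function and parameter names are hypothetical and do not correspond
  to identifiers in the Tor source.

```python
def should_assign_v2dir(supports_useful_dir_protocol,
                        has_open_dirport,
                        orport_open_and_reachable,
                        advertises_tunnelled_dir_server):
    """Decide whether an authority gives a relay the V2Dir flag."""
    tunnelled_ok = (orport_open_and_reachable
                    and advertises_tunnelled_dir_server)
    return supports_useful_dir_protocol and (has_open_dirport or tunnelled_ok)
```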

      Clients choose a directory by using the current criteria with the
  additional criterion that a server only needs the V2Dir status flag
  instead of requiring an open DirPort.

Security Considerations and Implications:

      Currently all directory servers are explicitly configured. This is
  necessary because they must have a configured and reachable external
  port. However, within Tor, this requires additional configuration and
  results in a reduced number of directory servers in the network. As a
  consequence, this could allow an adversary to control a non-negligible
  fraction of the servers. By increasing the number of directory servers
  in the network the likelihood of selecting one that is malicious is
  reduced. Also, with this proposal, it will be more likely that a
  client's entry guard is also a directory server (as alluded to in
  Proposal 207). However, the reduced anonymity set created when the
  guard does not have, or is unwilling to distribute, a specific
  document still exists. With the increased diversity in the available
  servers, the impact of this should be reduced.

      Another question that may need further consideration is whether we
  trust bad directories to be good guards and exits.

Specification:

  	The version 3 directory protocol specification does not
  currently document the use of directory guards. This spec should be
  updated to mention the preferred use of directory guards during
  directory requests. In addition, the new criteria for assigning the
  V2Dir flag should be documented.

Impact on local resources:

      Should relays attempt to download documents from another mirror
  before asking an authority? All relays, with minor exceptions, will
  now contact the authorities for documents, but this will not scale
  well and will partition users from relays.

      If all relays become directory servers, they will choose to
  download all documents, regardless of whether they are useful, in case
  another client does want them. This will have very little impact on
  most relays; however, on memory-constrained relays (BeagleBone,
  Raspberry Pi, and similar), every megabyte allocated to directory
  documents is not available for new circuits. For this reason, a new
  configuration option will be introduced within Tor for these systems,
  named DirCache, which the operator may choose to set as 0, thus
  disabling caching of directory documents and denying client directory
  requests.

Future Considerations:

      Should the DirPort be deprecated at some point in the future?

      Write a proposal requiring that a relay must have the V2Dir flag
  as a criterion for being a guard.

      Is V2Dir a good name for this? It's the name we currently use, but
  that's a silly reason to continue using it.
Filename: 238-hs-relay-stats.txt
Title: Better hidden service stats from Tor relays
Author: George Kadianakis, David Goulet, Karsten Loesing, Aaron Johnson
Created: 2014-11-17
Status: Closed

0. Motivation

   Hidden services are one of the least understood parts of the Tor
   network. We don't really know how many hidden services there are
   and how much they are used.

   This proposal suggests that Tor relays include some hidden-service
   related stats in their extra-info descriptors. No stats are
   collected from Tor hidden services or clients.

   While uncertainty might be a good thing in a hidden network,
   learning more information about the usage of hidden services can be
   helpful.

   For example, learning how many cells are sent for hidden service
   purposes tells us whether hidden service traffic is 2% of the Tor
   network traffic or 90% of the Tor network traffic. This info can
   also help us during load balancing, for example if we change the
   path building of hidden services to mitigate guard discovery
   attacks [GUARD-DISCOVERY].

   Also, learning the number of hidden services can give us an
   understanding of how widespread hidden services are. It will also
   help us understand approximately how much load is put on the
   network by hidden service logistics, like introduction point
   circuits etc.


1. Design

   Tor relays shall add some fields related to hidden service
   statistics in their extra-info descriptors.

   Tor relays collect these statistics by keeping track of their
   hidden service directory or rendezvous point activities, slightly
   obfuscating the numbers and posting them to the directory
   authorities. Extra-info descriptors are posted to directory
   authorities every 24 hours.


2. Implementation

2.1. Hidden service statistics interval

   We want relays to report hidden-service statistics over a long-enough
   time period to not put users at risk.  Similar to other statistics, we
   suggest a 24-hour statistics interval.  All related statistics are
   collected at the end of that interval and included in the next
   extra-info descriptors published by the relay.

   Tor relays will add the following line to their extra-info descriptor:

    "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

        YYYY-MM-DD HH:MM:SS defines the end of the included measurement
        interval of length NSEC seconds (86400 seconds by default).

        A "hidserv-stats-end" line, as well as any other "hidserv-*" line,
        is first added after the relay has been running for at least 24
        hours.

2.2. Hidden service traffic statistics

   We want to learn how much of the total Tor network traffic is
   caused by hidden service usage.  More precisely, we measure hidden
   service traffic by counting RELAY cells seen on a rendezvous point
   after receiving a RENDEZVOUS1 cell.  These RELAY cells include
   commands to open or close application streams, and they include
   application data.

   Tor relays will add the following line to their extra-info descriptor:

    "hidserv-rend-relayed-cells" SP num SP key=val SP key=val ... NL
        [At most once.]

        Where 'num' is the number of RELAY cells seen in either
        direction on a circuit after receiving and successfully
        processing a RENDEZVOUS1 cell.

        The actual number is obfuscated as detailed in
        [STAT-OBFUSCATION]. The parameters of the obfuscation are
        included in the key=val part of the line.

   The obfuscatory parameters for this statistic are:
     * delta_f = 2048
     * epsilon = 0.3
     * bin_size = 1024
   (Also see [CELL-LAPLACE-GRAPH] for a graph of the Laplace distribution.)

   So, an example line could be:
     hidserv-rend-relayed-cells 19456 delta_f=2048 epsilon=0.30 bin_size=1024
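
   A minimal sketch of how a consumer of extra-info descriptors might
   split such a line into its value and its obfuscation parameters.
   The function name is hypothetical, and the sketch assumes that every
   key=val parameter is numeric.

```python
def parse_hidserv_line(line):
    """Split a "hidserv-*" statistics line into keyword, value, and parameters."""
    keyword, num, *pairs = line.split()
    params = {}
    for pair in pairs:
        # Each parameter has the form key=val; values are assumed numeric.
        key, _, val = pair.partition("=")
        params[key] = float(val)
    # The published value may be negative once noise has been added.
    return keyword, int(num), params


keyword, num, params = parse_hidserv_line(
    "hidserv-rend-relayed-cells 19456 delta_f=2048 epsilon=0.30 bin_size=1024")
```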

2.3. HSDir hidden service counting

   We also want to learn how many hidden services exist in the
   network.  The best place to learn this is at hidden service
   directories where hidden services publish their descriptors.

   Tor relays will add the following line to their extra-info descriptor:

    "hidserv-dir-onions-seen" SP num SP key=val SP key=val ... NL
        [At most once.]

        Approximate number of unique hidden-service identities seen in
        descriptors published to and accepted by this hidden-service
        directory.

        The actual number is obfuscated as detailed in
        [STAT-OBFUSCATION]. The parameters of the obfuscation are
        included in the key=val part of the line.

   The obfuscatory parameters for this statistic are:
     * delta_f = 8
     * epsilon = 0.3
     * bin_size = 8
   (Also see [ONIONS-LAPLACE-GRAPH] for a graph of the Laplace distribution.)

   So, an example line could be:
    hidserv-dir-onions-seen 112 delta_f=8 epsilon=0.30 bin_size=8

2.4. Statistics obfuscation [STAT-OBFUSCATION]

  We believe that publishing the actual measurement values in such a
  system might have unpredictable effects, so we obfuscate these
  statistics before publishing:

                   +-----------+    +--------------+
   actual value -> |  binning  | -> |additive noise| -> public statistic
                   +-----------+    +--------------+

  We are using two obfuscation methods to better hide the actual
  numbers even if they remain the same over multiple measurement
  periods.

  Specifically, given the actual measurement value, we first apply
  data binning to it (basically we round it up to the nearest multiple
  of an integer, see [DATA-BINNING]). And then we apply additive noise
  to the binned value in a fashion similar to differential privacy.

  More information about the obfuscation methods follows:

2.4.1. Data binning

  The first thing we do to the original measurement value is to round
  it up to the nearest multiple of 'bin_size'.  'bin_size' is an
  integer security parameter and can be found in the respective
  statistics sections.

  This is similar to how Tor keeps bridge user statistics. As an
  example, if the measurement value is 9 and bin_size is 8, then the
  final value will be rounded up to 16. This also works for negative
  values, so for example, if the measurement value is -9 and bin_size
  is 8, the value will be rounded up to -8.
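
  The rounding rule can be sketched as a ceiling to the next multiple
  of bin_size. The function name is hypothetical; the two examples
  from the paragraph above are reproduced in the comments.

```python
def bin_value(value, bin_size):
    """Round value up to the nearest multiple of bin_size."""
    # Negating before and after floor division yields ceiling division,
    # which also rounds negative values towards zero.
    return -(-value // bin_size) * bin_size


bin_value(9, 8)    # 16
bin_value(-9, 8)   # -8
```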

2.4.2. Additive noise

  Then, before publishing the statistics, we apply additive noise to
  the binned value by adding to it a random value sampled from a
  Laplace distribution. Following the differential privacy
  methodology [DIFF-PRIVACY], our obfuscatory Laplace distribution has
  mu = 0 and b = (delta_f / epsilon).

  The precise values of delta_f and epsilon are different for each
  statistic and are defined in the respective statistics sections.
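
  As an illustration only, the noise step might look like the sketch
  below. It samples the Laplace distribution as the difference of two
  exponential variates, which is one standard construction and not
  necessarily what a relay does; the function names and the rounding of
  the noise to an integer are assumptions of the sketch.

```python
import random


def laplace_scale(delta_f, epsilon):
    """Scale parameter b of the obfuscatory Laplace distribution."""
    return delta_f / epsilon


def sample_laplace_noise(delta_f, epsilon, rng=random):
    """Draw one noise value from Laplace(mu=0, b=delta_f/epsilon)."""
    rate = 1.0 / laplace_scale(delta_f, epsilon)
    # The difference of two independent exponential variates with the
    # same rate is Laplace-distributed with mu = 0 and b = 1/rate.
    return rng.expovariate(rate) - rng.expovariate(rate)


def obfuscate(binned_value, delta_f, epsilon, rng=random):
    """Add Laplace noise to an already-binned value.

    Rounding the noise to an integer is an assumption of this sketch.
    """
    return binned_value + round(sample_laplace_noise(delta_f, epsilon, rng))
```

  With delta_f = 2048 and epsilon = 0.3 the scale is b = 6826.67, and
  with delta_f = 8 it is b = 26.67, matching the values in the names of
  the referenced Laplace graphs.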


3. Security

   The main security considerations that need discussion are what an
   adversary could do with reported statistics that they couldn't do
   without them.  In the following, we're going through things the
   adversary could learn, how plausible that is, and how much we care.
   (All these things refer to hidden-service traffic, not to
   hidden-service counting.  We should think about the latter, too.)

3.1. Identify rendezvous point of high-volume and long-lived connection

   The adversary could identify the rendezvous point of a very large and
   very long-lived HS connection by observing a relay with unexpectedly
   large relay cell count.

3.2. Identify number of users of a hidden service

   The adversary may be able to identify the number of users
   of an HS if he knows the amount of traffic on a connection to that HS
   (which he potentially can determine himself) and knows when that
   service goes up or down. He can look at the change in the total
   reported RP traffic to determine about how many fewer HS users there
   are when that HS is down.


4. Discussion

4.1. Why count only RP cells? Why not count IP cells too?

   There are three phases in the rendezvous protocol where traffic is
   generated: (1) when hidden services make themselves available in
   the network, (2) when clients open connections to hidden services,
   and (3) when clients exchange application data with hidden
   services.  We expect (3), that is the RP cells, to consume most
   bytes here, so we're focusing on this only.

   Furthermore, introduction points correspond to specific HSes, so
   publishing IP cell stats could reveal the popularity of specific
   HSes.

4.2. How to use these stats?

 4.2.1. How to use rendezvous cell statistics

   We plan to extrapolate reported values to network totals by dividing
   each value by the probability of clients picking the reporting relay
   as rendezvous point.  This approach should become more precise for
   faster relays and as more relays report these statistics.

   We also plan to compare reported values with "cell-*" statistics to
   learn what fraction of traffic can be attributed to hidden services.

   Ideally, we'd be able to compare values to "write-history" and
   "read-history" lines to compute similar fractions of traffic used for
   hidden services.  The goal would be to avoid enabling "cell-*"
   statistics by default.  In order for this to work we'll have to
   multiply reported cell numbers with the default cell size of 512 bytes
   (we cannot infer the actual number of bytes, because cells are
   end-to-end encrypted between client and service).

 4.2.2. How to use HSDir HS statistics

   We plan to extrapolate this value to network totals by calculating what
   fraction of hidden-service identities this relay was supposed to see.
   This extrapolation will be very rough, because each hidden-service
   directory is only responsible for a tiny share of hidden-service
   descriptors, and there is no way to increase that share significantly.

   Here are some numbers: there are about 3000 directories, and each
   descriptor is stored on three directories.  So, each directory is
   responsible for roughly 1/1000 of descriptor identifiers.  There are
   two replicas for each descriptor (that is, each descriptor is stored
   under two descriptor identifiers), and descriptor identifiers change
   once per day (which means that, during a 24-hour period, there are two
   opportunities for each directory to see a descriptor).  Hence, each
   descriptor is stored to four places in identifier space throughout a
   24-hour period.  The probability that any given directory sees a
   given hidden-service identity is 1-(1-1/1000)^4 = 0.00399, or about
   1/250.  This approximation constitutes an upper bound, because it
   assumes that services are running all day.
   An extrapolation based on this formula will lead to undercounting the
   total number of hidden services.
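
   The arithmetic above can be reproduced with a short sketch. The
   function and parameter names are hypothetical; the default values are
   the rough figures quoted in the text, not measurements.

```python
def hsdir_seen_probability(num_dirs=3000, dirs_per_descriptor=3,
                           replicas=2, periods_per_day=2):
    """Probability that one directory sees a given service within 24 hours."""
    share = dirs_per_descriptor / num_dirs       # roughly 1/1000
    placements = replicas * periods_per_day      # four places per 24 hours
    return 1 - (1 - share) ** placements


def extrapolate_onions(reported, probability):
    """Scale one directory's (obfuscated) count up to a network total."""
    return reported / probability


p = hsdir_seen_probability()            # about 0.00399, i.e. roughly 1/250
estimate = extrapolate_onions(112, p)   # very rough, as discussed above
```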

   A possible inaccuracy in the estimation algorithm comes from the fact
   that a relay may not be acting as hidden-service directory during the
   full statistics interval.  We'll have to look at consensuses to
   determine when the relay first received the "HSDir" flag, and only
   consider the part of the statistics interval following the valid-after
   time of that consensus.

4.3. Why does the obfuscation work?

   By applying data binning, we smudge the original value, making it
   harder for attackers to guess it. Specifically, an attacker who
   knows the bin can only guess the underlying value with probability
   1/bin_size.

   By applying additive noise, we make it harder for the adversary to
   find out the current bin, which makes it even harder to get the
   original value. If additive noise were not applied, an adversary
   could try to detect changes in the original value by checking when
   we switch bins.

5. Acknowledgements

   Thanks go to 'pfm' for the helpful Laplace graphs.

6. References

[GUARD-DISCOVERY]: https://lists.torproject.org/pipermail/tor-dev/2014-September/007474.html

[DIFF-PRIVACY]: http://research.microsoft.com/en-us/projects/databaseprivacy/dwork.pdf

[DATA-BINNING]: https://en.wikipedia.org/wiki/Data_binning

[CELL-LAPLACE-GRAPH]: https://raw.githubusercontent.com/corcra/pioton/master/vis/laplacePDF_mu0.0_b6826.67.png
                      https://raw.githubusercontent.com/corcra/pioton/master/vis/laplaceCDF_mu0.0_b6826.67.png

[ONIONS-LAPLACE-GRAPH]: https://raw.githubusercontent.com/corcra/pioton/master/vis/laplacePDF_mu0.0_b26.67.png
                        https://raw.githubusercontent.com/corcra/pioton/master/vis/laplaceCDF_mu0.0_b26.67.png
Filename: 239-consensus-hash-chaining.txt
Title: Consensus Hash Chaining
Author: Nick Mathewson, Andrea Shepard
Created: 06-Jan-2015
Status: Open

1. Introduction and overview

  To avoid some categories of attacks against directory authorities
  and their keys, it would be handy to have an explicit hash chain in
  consensuses.

2. Directory authority operation

  We add the following field to votes and consensuses:

          previous-consensus ISOTIME [SP HashName "=" Base16]* NL

  where HashName is any keyword.

  This field may occur any number of times.

  The date in a previous-consensus line in a vote is the valid-after
  date of the consensus the line refers to.  The hash should be
  computed over the signed portion of the consensus document. A
  directory authority should include a previous-consensus line for a
  consensus using all hashes it supports for all consensuses it knows
  which are still valid, together with the two most recently expired
  ones.

  When this proposal is implemented, a new consensus method should be
  allocated for adding previous-consensus lines to the consensus.

  A previous-consensus line is included in the consensus if and only
  if a line with that date was listed by more than half of the
  authorities whose votes are under consideration.  A hash is included
  in that line if the hash was listed by more than half of the
  authorities whose votes are under consideration.  Hashes are sorted
  lexically within a line by hashname; dates are sorted in temporal
  order.

  If, when computing a consensus, the authorities find that any
  previous-consensus line is *incompatible* with another, they must
  issue a loud warning.  Two lines are incompatible if they have the
  same ISOTIME, but different values for the same HashName.

  The hash "sha256" is mandatory.
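
  A rough sketch of the voting rule and the incompatibility check
  follows. The data layout (one dict per vote, mapping ISOTIME to a
  dict of HashName to Base16 digest) and the function names are
  assumptions made for illustration; digests are assumed to be
  case-normalized already.

```python
from collections import Counter


def merge_previous_consensus(votes):
    """Build the previous-consensus lines for a consensus from the votes."""
    threshold = len(votes) / 2
    date_counts = Counter(t for vote in votes for t in vote)
    hash_counts = Counter(
        (t, name, value)
        for vote in votes
        for t, hashes in vote.items()
        for name, value in hashes.items())
    lines = []
    # ISOTIME strings sort lexically in temporal order.
    for t in sorted(d for d, count in date_counts.items() if count > threshold):
        kept = sorted((name, value)
                      for (date, name, value), count in hash_counts.items()
                      if date == t and count > threshold)
        parts = ["previous-consensus", t]
        parts += [f"{name}={value}" for name, value in kept]
        lines.append(" ".join(parts))
    return lines


def find_incompatible(votes):
    """Return (ISOTIME, HashName) pairs that appear with differing values."""
    seen, conflicts = {}, set()
    for vote in votes:
        for t, hashes in vote.items():
            for name, value in hashes.items():
                if seen.setdefault((t, name), value) != value:
                    conflicts.add((t, name))
    return sorted(conflicts)
```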

3. Client and cache operation

  All parties receiving consensus documents should validate
  previous-consensus lines, and complain loudly if a hash fails to
  match.

  When a party receives a consensus document, it SHOULD check all
  previous-consensus lines against any previous consensuses it has
  retained, and if a hash fails to match it SHOULD warn loudly in the
  log mentioning the specific hashes and valid-after times in
  question, and store both the new consensus containing the
  mismatching hashes and the old consensus being checked for later
  analysis.  An option SHOULD be provided to disable operation as a
  client or as a hidden service if this occurs.

  All relying parties SHOULD by default retain all valid consensuses
  they download, plus the two most recently expired ones; but see
  "Security considerations" below.

  If a hash is not mismatched, the relying party may nonetheless be
  unable to validate the chain: either because there is a gap in the
  chain itself, or because the relying party does not have any of the
  consensuses that the latest consensus mentions.  If this happens,
  the relying party should log a warning stating the specific cause,
  the hashes and valid-after time of the consensus containing the
  unverifiable previous-consensus line, and the hashes and valid-after
  time listed in each such line, and it should retain a copy of the
  consensus document in question.  A relying party MAY provide an
  option to disable operation as a client or hidden service in this
  event, but due to the risk that breaks in the chain may occur
  accidentally, such an option SHOULD be disabled by default if
  provided.

  If a relying party starts up and finds only very old consensuses
  such that no previous-consensus lines can be verified, it should log
  a notice of the gap along the lines of "consensus (date, hash) is
  quite new.  Can't chain back to old consensus (date, hash)".  If it
  has no old consensuses at all, it should log an info-level message
  of the form "we got consensus (date, hash).  We haven't got any
  older consensuses, so we won't do any hash chain verification".
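
  The three outcomes described in this section (a mismatch, a verified
  link, or an unverifiable chain) could be distinguished roughly as in
  the sketch below. The function name, return values, and data layout
  are hypothetical, and only the mandatory "sha256" hash is checked.

```python
import hashlib


def check_previous_consensus(prev_lines, retained):
    """Classify a new consensus against locally retained consensuses.

    prev_lines: {ISOTIME: {HashName: Base16}} parsed from the new consensus.
    retained:   {ISOTIME: bytes} holding the signed portion of each
                consensus we still have, keyed by valid-after time.
    """
    mismatches, verified = [], []
    for valid_after, hashes in prev_lines.items():
        if valid_after not in retained or "sha256" not in hashes:
            continue  # nothing to compare this line against
        actual = hashlib.sha256(retained[valid_after]).hexdigest()
        if actual == hashes["sha256"].lower():
            verified.append(valid_after)
        else:
            mismatches.append(valid_after)
    if mismatches:
        return "mismatch", sorted(mismatches)   # warn loudly, keep both documents
    if verified:
        return "ok", sorted(verified)
    return "unverifiable", []                   # gap in the chain: log a warning
```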

4. Security Considerations:

   * Retaining consensus documents on clients might leak information
     about when the client was active if a disk is later stolen or the
     client compromised.  This should be documented somewhere, and an
     option to disable retention (and thereby also disable verification
     of previous-consensus hashes) should be provided.

   * Clients MAY offer the option to retain previous consensuses in
     memory only to allow for validation without the potential disk
     leak.
Filename: 240-auth-cert-revocation.txt
Title: Early signing key revocation for directory authorities
Author: Nick Mathewson
Created: 09-Jan-2015
Status: Open

1. Overview

   This proposal describes a simple way for directory authorities to
   perform signing key revocation.

2. Specification

   We add the following lines to the authority signing certificate
   format:

     revoked-signing-key SP algname SP FINGERPRINT NL

   This line may appear zero or more times.

   It indicates that a particular not-yet-expired signing key should not
   be used.

3. Client and cache operation

   No client or cache should retain, use, or serve any certificate whose
   signing key is described in a revoked-signing-key line in a
   certificate with the same authority identity key.  (If the signing
   key fingerprint appears in a cert with a different identity key, it
   has no effect: you aren't allowed to revoke other people's keys.)

   No Tor instance should download a certificate whose (signing key,
   identity key) combination is known to be revoked.
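
   A small sketch of the filtering rule, under the assumption that each
   certificate has already been parsed into a dict with hypothetical
   keys 'identity', 'signing_key', and 'revoked' (a list of revoked
   signing-key fingerprints). The algname field is ignored here for
   brevity.

```python
def usable_certificates(certs):
    """Drop certificates whose signing key was revoked by the same authority."""
    # A revocation only counts when it comes from a certificate with the
    # same authority identity key as the certificate being revoked.
    revoked = {(cert["identity"], fingerprint)
               for cert in certs
               for fingerprint in cert["revoked"]}
    return [cert for cert in certs
            if (cert["identity"], cert["signing_key"]) not in revoked]
```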

4. Authority operator interface.

   The 'tor-gencert' command will take a number of older certificates to
   revoke as optional command-line arguments.  It will include their
   keys in revoked-signing-key lines only if they are still valid, or
   have been expired for no more than a month.

5. Circular revocation

   My first attempt at writing a proposal here included a lengthy
   section about how to handle cases where certificate A revokes the key
   of certificate B, and certificate B revokes the key of certificate A.

   Instead, I am inclined to say that this is a MUST NOT.
Filename: 241-suspicious-guard-turnover.txt
Title: Resisting guard-turnover attacks
Author: Aaron Johnson, Nick Mathewson
Created: 2015-01-27
Status: Rejected


This proposal was made obsolete by the introduction of Proposal #259.
Some of the ideas here have been incorporated into Proposal #259.


1. Introduction

  Tor uses entry guards to prevent an attacker who controls some
  fraction of the network from observing a fraction of every user's
  traffic. If users chose their entries and exits uniformly at
  random from the list of servers every time they build a circuit,
  then an adversary who had (k/N) of the network would deanonymize
  F=(k/N)^2 of all circuits... and after a given user had built C
  circuits, the attacker would see them at least once with
  probability 1-(1-F)^C.  With large C, the attacker would get a
  sample of every user's traffic with probability 1.
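
  To get a feel for these numbers, the formulas above can be evaluated
  directly. The function name and the example figures (an adversary
  with 10% of the network) are illustrative assumptions.

```python
def compromise_probability(k, n, circuits):
    """Chance that at least one of `circuits` circuits is deanonymized."""
    f = (k / n) ** 2                  # F = (k/N)^2, per circuit
    return 1 - (1 - f) ** circuits    # 1 - (1 - F)^C


one_circuit = compromise_probability(100, 1000, 1)        # about 0.01
many_circuits = compromise_probability(100, 1000, 1000)   # very close to 1
```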

  To prevent this from happening, Tor clients choose a small number
  of guard nodes (currently 1: see proposal 236).  These guard nodes
  are the only nodes that the client will connect to directly.  If
  they are not compromised, the user's paths are not compromised.

  But attacks remain.  Consider an attacker who can run a firewall
  between a target user and the Tor network, and make
  many of the guards they don't control appear to be unreachable.
  Or consider an attacker who can identify a user's guards, and mount
  denial-of-service attacks on them until the user picks a guard
  that the attacker controls.

  In the presence of these attacks, we can't continue to connect to
  the Tor network unconditionally.  Doing so would eventually result
  in the user choosing a hostile node as their guard, and losing
  anonymity.




2. Proposed behavior

   Keep a record of all the guards we've tried to connect to,
   connected to, or extended circuits through in the last PERIOD
   days.

   (We have connected to a guard if we authenticate its identity.
   We have extended a circuit through a guard if we built a
   multi-hop circuit with it.)

   If the number of guards we have *tried* to connect to in the last
   PERIOD days is greater than CANDIDATE_THRESHOLD, do not attempt
   to connect to any other guards; only attempt the ones we have
   previously *tried* to connect to.

   If the number of guards we *have* connected to in the last PERIOD
   days is greater than CONNECTED_THRESHOLD, do not attempt to
   connect to any other guards; only attempt ones we have already
   *successfully* connected to.

   If we fail to connect to NET_THRESHOLD guards in a row, conclude
   that the network is likely down. Stop/notify the user; retry
   later; add no new guards for consideration.

 [[ optional
   If we notice that USE_THRESHOLD guards that we *used for
   circuits* in the last FAST_REACT_PERIOD days are not working, but
   some other guards are, assume that an attack is in progress, and
   stop/notify the user.
  ]]

2.1. Suggested parameter thresholds.

  PERIOD -- 60 days

  FAST_REACT_PERIOD -- 10 days

  CONNECTED_THRESHOLD -- 8

  CANDIDATE_THRESHOLD -- 20

  NET_THRESHOLD -- 10 (< CANDIDATE_THRESHOLD)

 [[ optional
  USE_THRESHOLD -- 3 (< CONNECTED_THRESHOLD)
 ]]
  (Each of the above should have a corresponding consensus parameter.)
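
  One possible reading of the rules in section 2, using the suggested
  values above, is sketched below. The function name, the return
  values, and the order in which the rules are checked are assumptions;
  the optional USE_THRESHOLD rule is left out.

```python
CONNECTED_THRESHOLD = 8
CANDIDATE_THRESHOLD = 20
NET_THRESHOLD = 10


def guard_policy(tried, connected, consecutive_failures):
    """Decide which guards may be attempted next.

    tried, connected: sets of guard IDs from the last PERIOD days.
    Returns (state, allowed); allowed is None when any guard may be tried.
    """
    if consecutive_failures >= NET_THRESHOLD:
        # The network is likely down: stop/notify, add no new guards.
        return "network-down", set()
    if len(connected) > CONNECTED_THRESHOLD:
        # Only guards we have successfully connected to.
        return "restrict", set(connected)
    if len(tried) > CANDIDATE_THRESHOLD:
        # Only guards we have previously tried.
        return "restrict", set(tried)
    return "open", None
```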

2.2. What do we mean by "Stop/notify"?

  By default, we should probably give warnings in most of the above
  cases for the first version that deploys them.  We can have an
  on/off/auto setting for whether we will build circuits at all if we're
  in a "stopped" mode.  Default should be auto, meaning off for now.

  The warning needs to be carefully chosen, and suggest a workaround
  better than "get a better network" or "clear your state file".

2.3. What's with making USE_THRESHOLD optional?

  Aaron thinks that getting rid of it might help in the fascistfirewall
  case.  I'm a little unclear whether that makes any of the attacks
  easier.

3. State storage requirements

Right now, we save for each guard that we have made contact with:

   ID
   Added
   is dircache?
   down-since
   last-attempted
   bad-since
   chosen-on-date, chosen-by-version
   path bias info (circ_attempts, successes, close_success)

To implement the above proposal, we'll need to add, for each guard
*or guard candidate*:
   when did we first decide to try connecting to it?
   when did we last do one of:
       decide to try connecting to it?
       connect to it?
       build a multihop circuit through it?
   which one was it?

Probably round these to the nearest day or so.

4. Future work

   We need to make this play nicely with mobility.  When a user has
   three guards on port 9001 and they move to a firewall that only
   allows 80/443, we'd prefer that they not simply grind to a halt.  If
   nodes are configured to stop when too many of their guards have gone
   away, this will confuse them.

   If people need to turn FascistFirewall on and off, great.  But if
   they just clear their state file as a workaround, that's not so good.


   If we could tie guard choice to location, that would help a great
   deal, but we'd need to answer the question, "Where am I on the
   network", which is not so easy to do passively if you're behind a
   NAT.



Appendix A. Scenario analysis

A.1. Example attacks

 * Filter Alice's connection so they can only talk to your guards.

 * Whenever Alice is using a guard you don't control, DOS it.

A.2. Example non-attacks

 * Alice's guard goes down.

 * Alice is on a laptop that is sometimes behind a firewall that
   blocks a guard, and sometimes is not.

 * Alice is on a laptop that's behind a firewall that blocks a lot
   of the Tor network (like everything not on 80/443).

 * Alice has a network connection that sometimes turns off and turns
   on again.

 * Alice reboots her computer periodically, and tor starts a little
   while before the network is live.

Appendix B. Acknowledgements

  Thanks to Rob Jansen and David Goulet for comments on earlier versions of
  this draft.

Appendix C. Desirable revisions

  Incorporate ideas from proposal 156.
Filename: 242-better-families.txt
Title: Better performance and usability for the MyFamily option
Author: Nick Mathewson
Created: 2015-02-27
Status: Superseded
Superseded-by: 321-happy-families.md

1. Problem statement.

   The current family interface allows well-behaved relays to
   identify that they all belong to the same 'family', and should
   not be used in the same circuits.

   Right now, this interface works by having every family member
   list every other family member in its server descriptor.  This
   winds up using O(n^2) space in microdescriptors, server
   descriptors, and RAM.  Adding or removing a server from the
   family requires all the other servers to change their torrc
   settings.

   One proposal is to eliminate the use of the Family option
   entirely; see ticket #6676.  But if we don't, let's come up with
   a way to make it better.  (I'm writing this down mainly to get it
   out of my head.)

2. Design overview.

   In this design, every family has a master ed25519 key.  A node is
   in the family iff its server descriptor includes a certificate of
   its ed25519 identity key with the master ed25519 key.  The
   certificate format is as in proposal 220 section 2.1.

   Note that because server descriptors are signed with the node's
   ed25519 signing key, this creates a bidirectional relationship
   where nodes can't be put in families without their consent.

3. Changes to server descriptors

   We add a new entry to server descriptors:
      "family-cert"

   This line contains a base64-encoded certificate as described
   above.  It may appear any number of times.

4. Changes to microdescriptors

   We add a new entry to microdescriptors:
      "family-keys"

   This line contains one or more space-separated strings describing
   families to which the node belongs.  These strings MUST be
   between 1 and 64 characters long, and sorted in lexical order.
   Clients MUST NOT depend on any particular property of these
   strings.

5. Changes to voting algorithm

   We allocate a new consensus method number for voting on these keys.

   When generating microdescriptors using a suitable consensus
   method, the authorities include a "family-keys" line if the
   underlying server descriptor contains any family-cert lines.
   For each family-cert in the server descriptor, they add a
   base64-encoded string of that family-cert's signing key.

6. Client behavior

   Clients should treat node A and node B as belonging to the same
   family if ANY of these is true:

       * The client has server descriptors or microdescriptors for A
         and B, and A's descriptor lists B in its family line, and
         B's descriptor lists A in its family line.

       * The client has a server descriptor for A and one for B, and
         they both contain valid family-cert lines whose certs are
         signed by the same family key.

       * The client has microdescriptors for A and B, and they both
         contain some string in common on their family-keys line.
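
   The three rules could be combined roughly as follows. This sketch
   assumes each node is a dict with hypothetical fields 'id', 'family'
   (the legacy family line), and 'family_keys' (family key strings,
   taken either from already-validated family-cert lines or from the
   microdescriptor).

```python
def same_family(a, b):
    """Return True if nodes a and b should be treated as one family."""
    # Rule 1: each node lists the other in its legacy family line.
    if b["id"] in a.get("family", ()) and a["id"] in b.get("family", ()):
        return True
    # Rules 2 and 3: both nodes share at least one family key string.
    return bool(set(a.get("family_keys", ())) & set(b.get("family_keys", ())))
```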

7. Deprecating the old family lines.

   Once all clients that support the old family line format are
   deprecated, servers can stop including family lines in their
   descriptors, and authorities can stop including them in their
   microdescriptors.

8. Open questions

   The rules in section 6 above leave open the possibility of old
   clients and new clients reaching different decisions about who is
   in a family.  We should evaluate this for anonymity implications.

   It's possible that families are a bad idea entirely; see ticket
   #6676.

Filename: 243-hsdir-flag-need-stable.txt
Title: Give out HSDir flag only to relays with Stable flag
Author: George Kadianakis
Created: 2015-03-23
Status: Closed
Implemented-in: 0.2.7

1. Introduction

   The descriptors of hidden services are stored by hidden service
   directories. Those are chosen by directory authorities who assign
   the "HSDir" flag to those relays according to their uptime.

   It's important for new relays to not be able to get the HSDir flag
   too easily, because a few correctly placed HSDirs can launch a
   denial of service attack on a hidden service. We should make sure
   that a naive Sybil attacker that injects thousands of new Tor
   relays to the network cannot position herself like this.

2. Motivation

   Currently, directory authorities give out the HSDir flag to relays
   that volunteer to be hidden service directories by sending a
   "hidden-service-dir" line in their relay descriptor, which is the
   default relay behavior. Furthermore, the HSDir flag is only given
   to relays that have been up for more than MinUptimeHidServDirectoryV2 hours.
   MinUptimeHidServDirectoryV2 is a parameter locally set at the
   directory authorities and it's somewhere between 25 and 96 hours.

   We propose changing that last requirement, and instead giving the
   HSDir flag only to relays that have the Stable flag. We believe
   that this will result in a few benefits:

   - We stop using the ad-hoc uptime calculation that we are currently
     doing (see dirserv_thinks_router_is_hs_dir()). Instead, we use
     the MTBF uptime calculation that is performed for the Stable flag
     which is more robust.

   - We increase the time required to get the HSDir flag, making it
     harder for naive adversaries that flood the network with relays
     to actually get the HSDir flag.

   - After implementing non-deterministic HSDir picks (#8244) we also
     make it harder for sophisticated adversaries to DoS a hidden
     service, since at that point their main attack strategy is to
     flood the network with relays.

   - By increasing the stability of HSDirs, we reduce the misses
     during descriptor fetching that get caused by natural churn of
     relays on the list of HSDirs.

3. Specification

   We are suggesting changing the criteria that directory authorities
   use to vote for HSDirs to the following:

   - The relay has included the "hidden-service-dir\n" line in its
     descriptor.

   - The relay is eligible for having the "Stable" flag.

4. Security considerations

   As it currently is, a router is 'Stable' if it is active, and
   either its Weighted MTBF is at least the median for known active
   routers or its Weighted MTBF corresponds to at least 7 days. These
   are stricter criteria than what's currently required for HSDir,
   which means that the number of HSDirs will decrease after the
   suggested changes.
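
   As a sketch of the proposed voting rule, with the Stable criterion
   restated from the paragraph above. The function names are
   hypothetical, and Weighted MTBF values are assumed to be given in
   seconds.

```python
import statistics

SEVEN_DAYS = 7 * 24 * 60 * 60  # seconds


def is_stable(wmtbf, active_wmtbfs, active=True):
    """Stable: active, and WMTBF at least the median or at least 7 days."""
    if not active:
        return False
    return wmtbf >= statistics.median(active_wmtbfs) or wmtbf >= SEVEN_DAYS


def votes_hsdir(has_hidden_service_dir_line, stable):
    """Proposed rule: HSDir needs the descriptor line and Stable eligibility."""
    return has_hidden_service_dir_line and stable
```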

   Currently there are about 2400 HSDirs in the consensus, and about
   2300 of them are Stable, which means that we will lose about 100 HSDirs.
   We believe that this is an acceptable temporary loss. In the
   near future, the number of HSDirs will greatly increase as
   more directory authorities upgrade to #14202 and more relays
   upgrade to #12538.

5. Future

   Should we give out the HSDir flag only to relays that are Fast? Is
   being an HSDir a demanding job bandwidth-wise?

   With the upcoming keyblinding scheme (#8106) and non-deterministic
   HSDir selection (#8244), are there any other criteria that we
   should use when assigning HSDir flags?
Filename: 244-use-rfc5705-for-tls-binding.txt
Title: Use RFC5705 Key Exporting in our AUTHENTICATE calls
Author: Nick Mathewson
Created: 2015-05-14
Status: Closed
Implemented-In: 0.3.0.1-alpha

0. IMPLEMENTATION-NOTES

  We decided to implement this proposal for the Ed25519 handshake only.

1. Proposal

  We use AUTHENTICATE cells to bind the connection-initiator's Tor
  identity to a TLS session.  Our current type of authentication
  ("RSA-SHA256-TLSSecret", see tor-spec.txt section 4.4) does this by
  signing a document that includes an HMAC of client_random and
  server_random, using the TLS master secret as a secret key.

  There is a more standard way to get at this information, by using the
  facility defined in RFC5705.  Further, it is likely to continue to
  work with more TLS libraries, including TLS libraries like OpenSSL 1.1
  that make master secrets and session data opaque.

  I propose that we introduce a new authentication type, with AuthType
  and TYPE field to be determined, that works the same as our current
  "RSA-SHA256-TLSSecret" authentication, except for these fields:

    TYPE is a different constant string, "AUTH0002".

    TLSSECRETS is replaced by the output of the Exporter function in
    RFC5705, using as its inputs:
        * The label string "EXPORTER FOR TOR TLS CLIENT BINDING " + TYPE
        * The context value equal to the client's identity key digest.
        * The length 32.

  I propose that proposal 220's section on authenticating with ed25519
  keys be amended accordingly:

    TYPE is a different constant string, "AUTH0003".

    TLSSECRETS is replaced by the output of the Exporter function in
    RFC5705, using as its inputs:
        * The label string "EXPORTER FOR TOR TLS CLIENT BINDING " + TYPE
        * The context value equal to the client's Ed25519 identity key.
        * The length 32.
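
  For illustration, the three exporter inputs can be assembled as in
  the sketch below. The function name is hypothetical, and the call
  into the TLS library's RFC5705 exporter is deliberately left out,
  since it depends on the library in use.

```python
EXPORTER_PREFIX = b"EXPORTER FOR TOR TLS CLIENT BINDING "


def exporter_inputs(auth_type, context):
    """Return (label, context, length) for the RFC5705 exporter.

    auth_type: the TYPE constant, e.g. b"AUTH0002" or b"AUTH0003".
    context:   the client's identity key digest (AUTH0002) or its
               Ed25519 identity key (AUTH0003), as bytes.
    """
    return EXPORTER_PREFIX + auth_type, context, 32


label, context, length = exporter_inputs(b"AUTH0003", b"\x00" * 32)
```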
Filename: 245-tap-out.txt
Title: Deprecating and removing the TAP circuit extension protocol
Author: Nick Mathewson
Created: 2015-06-02
Status: Superseded
Superseded-by: 350

0. Introduction

  This proposal describes a series of steps necessary for deprecating
  TAP without breaking functionality.

  TAP is the original protocol for one-way authenticated key negotiation
  used by Tor.  Before Tor version 0.2.4, it was the only supported
  protocol.  Its key length is unpleasantly short, however, and it had
  some design warts.  Moreover, it had no name, until Ian Goldberg wrote
  a paper about the design warts.

  Why deprecate and remove it?  Because ntor is better in basically
  every way.  It's actually got a proper security proof, the key
  strength seems to be 20th-century secure, and so on.  Meanwhile, TAP
  is lingering as a zombie, taking up space in descriptors and
  microdescriptors.

1. TAP is still in (limited) use today for hidden service hops.

  The original hidden service protocol only describes a way to tell
  clients and servers about an introduction point's or a rendezvous
  point's TAP onion key.

  We can do a bit better (see section 4), but we can't break TAP
  completely until current clients and hidden services are obsolete.

2. The step-by-step process.

  Step 1. Adjust the parsing algorithm for descriptors and microdescriptors
  on servers so that it accepts MDs without a TAP key.  See section 3 below.
  Target: 0.2.7.

  Step 1b. Optionally, when connecting to a known IP/RP, extend by ntor.
  (See section 4 below.)

  Step 2. Wait until proposal 224 is implemented.  (Clients and hidden
  services implementing 224 won't need TAP for anything.)

  Step 3. Begin throttling TAP answers even more aggressively at relays.
  Target: prop224 is stable.

  Step 4. Wait until all versions of Tor without prop224 support are
  obsolete/deprecated.

  Step 5. Stop generating TAP keys; stop answering TAP requests; stop
  advertising TAP keys in descriptors; stop including them in
  microdescriptors.
  Target: prop224 has been stable for 12-18 months, and 0.2.7 has been stable
  for 2-3 years.


3. Accepting descriptors without TAP keys. (Step 1)

  Our microdescriptor parsing code uses the string "onion-key" at the
  start of the line to identify the boundary between microdescriptors,
  so we can't remove it entirely.  Instead, we will make the body
  optional.

  We will make the following changes to dir-spec:

   - In router descriptors, make the onion-key field "at most once"
     instead of "exactly once."

   - In microdescriptors, make the body of "onion-key" optional.

  Until Step 4, authorities MUST still reject any descriptor without a
  TAP key.

  If we do step 1 before proposal 224 is implemented, we'll need to make
  sure that we never choose a relay without a TAP key as an introduction
  point or a rendezvous point.
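
  The parsing change described above could look roughly like the sketch
  below: microdescriptors are still split at lines starting with
  "onion-key", but the key body that follows is optional. The function
  name, the dictionary layout, and the placeholder key material are all
  assumptions made for illustration; this is not Tor's actual parser.

```python
BEGIN = "-----BEGIN RSA PUBLIC KEY-----"
END = "-----END RSA PUBLIC KEY-----"

def split_microdescriptors(text):
    """Split concatenated microdescriptors at "onion-key" lines.

    Each entry's "tap_key" is None when no key body follows "onion-key".
    """
    entries, current, key_lines = [], None, None
    for line in text.splitlines():
        if line.startswith("onion-key"):
            # Boundary line: start a new microdescriptor.
            current = {"tap_key": None, "rest": []}
            entries.append(current)
            key_lines = None
        elif current is None:
            continue  # ignore anything before the first boundary
        elif key_lines is not None:
            # Inside the optional key body: collect until the END line.
            key_lines.append(line)
            if line == END:
                current["tap_key"] = "\n".join(key_lines)
                key_lines = None
        elif line == BEGIN and current["tap_key"] is None and not current["rest"]:
            key_lines = [line]
        else:
            current["rest"].append(line)
    return entries

# Placeholder contents, not real keys.
sample = "\n".join([
    "onion-key", BEGIN, "PLACEHOLDERKEYBODY", END,
    "ntor-onion-key PLACEHOLDERA",
    "onion-key",
    "ntor-onion-key PLACEHOLDERB",
])
mds = split_microdescriptors(sample)
```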

4. Avoiding TAP earlier for HS usage (Step 1b)

  We could begin to move more circuits off TAP now by adjusting our
  behavior for extending circuits to Introduction Points and Rendezvous
  Points.  The new rule would be:

     If you've been told to extend to an IP/RP, and you know a directory
     entry for that relay (matching by identity), you extend using the
     node_t you have instead.

  This would improve cryptographic security a bit, at the expense of
  making it possible to probe for whether a given hidden service has an
  up-to-date consensus or not, and learn whether each client has an
  up-to-date consensus or not. We need to figure out whether that
  enables an attack.

  (For reference, the functions to patch would be
  rend_client_get_random_intro_impl and find_rp_for_intro.)
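
  A minimal sketch of the rule above, assuming a hypothetical lookup table
  keyed by relay identity. The names `choose_extend_method`, `known_relays`,
  and `ntor_key` are invented for this example and do not correspond to
  Tor's internal data structures.

```python
def choose_extend_method(identity, known_relays, supplied_tap_key):
    """Prefer our own directory entry for the relay when we have one."""
    node = known_relays.get(identity)
    if node is not None and node.get("ntor_key") is not None:
        # We know this relay from the directory: extend by ntor.
        return ("ntor", node["ntor_key"])
    # Otherwise fall back to the TAP key we were told about.
    return ("tap", supplied_tap_key)

known_relays = {"ID_A": {"ntor_key": "NTOR_A"}}
known = choose_extend_method("ID_A", known_relays, "TAP_A")
unknown = choose_extend_method("ID_B", known_relays, "TAP_B")
```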
Filename: 246-merge-hsdir-and-intro.txt
Title: Merging Hidden Service Directories and Introduction Points
Author: John Brooks, George Kadianakis
Created: 2015-07-12
Status: Rejected

Change history:

   18-Jan-2016  Changed status to "Needs-Research" after discussion in email
                thread [1].

1. Overview and Motivation

   This document describes a modification to proposal 224 ("Next-Generation
   Hidden Services in Tor"), which simplifies and improves the architecture by
   combining hidden service directories and introduction points at the same
   relays.

   A reader will want to be familiar with the existing hidden service design,
   and with the changes in proposal 224. If accepted, this proposal should be
   combined with proposal 224 to make a superseding specification.

1.1. Overview

   In the existing hidden service design and proposal 224, there are three
   distinct steps in building a connection: fetching the descriptor from a
   directory, contacting an introduction point listed in the descriptor, and
   rendezvous as specified during the introduction. The hidden service
   directories are selected algorithmically, and introduction points are
   selected at random by the service.

   We propose to combine the responsibilities of the introduction point and
   hidden service directory. The list of introduction points responsible for a
   service will be selected using the algorithm specified for HSDirs [proposal
   224, section 2.2.3]. The service builds a long-term introduction circuit to
   each of these, identified by its blinded public key. Clients can calculate
   the same set of relays, build an introduction circuit, retrieve the
   ephemeral keys, and proceed with sending an introduction to the service in
   the same ways as before.

1.2. Benefits over proposal 224

   With this change, client connections become more efficient: they need only
   two circuits (for introduction and rendezvous) instead of the three needed
   previously, and they contact fewer relays. Clients also no longer cache
   descriptors, which substantially simplifies code and removes a common source
   of bugs and reliability issues.

   Hidden services are able to stay online by simply maintaining their
   introduction circuits; there is no longer a need to periodically update
   descriptors. This reduces network load and traffic fingerprinting
   opportunities for a hidden service.

   The number and churn of relays a hidden service depends on is also reduced.
   In particular, prior hidden service designs may frequently choose new
   introduction points, and each of these has an opportunity to observe the
   popularity or connection behavior of clients.

1.3. Other effects on proposal 224

   An adversarial introduction point is not significantly more capable than a
   hidden service directory under proposal 224. The differences are:

     1. The introduction point maintains a long-lived circuit with the service
     2. The introduction point can break that circuit and cause the service to
        rebuild it

   See section 4 ("Discussion") for other impacts and open discussion
   questions.

2. Specification

2.1. Picking introduction points for a service

   Instead of picking HSDirs, hidden services pick their introduction points
   using the same algorithm as defined in proposal 224 section 2.2.3 [HASHRING].

   To be used as an introduction point, a relay must have the Stable flag in
   the consensus and an uptime of at least twice the shared random period
   defined in proposal 224 section 2.3.

   This also specifies the lifetime of introduction points, since they will be
   rotated with the change of time period and shared randomness.
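
   The eligibility rule above can be sketched as a simple filter applied
   before the hash ring selection (which is not shown here). The relay
   records, the field names, and the 24-hour period are placeholders
   assumed for this example.

```python
def eligible_intro_points(relays, shared_random_period):
    """Keep relays with the Stable flag and uptime >= 2 * the period."""
    min_uptime = 2 * shared_random_period
    return [r["nickname"] for r in relays
            if "Stable" in r["flags"] and r["uptime"] >= min_uptime]

PERIOD = 24 * 3600  # placeholder period in seconds, assumed for illustration
relays = [
    {"nickname": "a", "flags": {"Stable", "Fast"}, "uptime": 3 * PERIOD},
    {"nickname": "b", "flags": {"Fast"}, "uptime": 10 * PERIOD},
    {"nickname": "c", "flags": {"Stable"}, "uptime": PERIOD},
    {"nickname": "d", "flags": {"Stable"}, "uptime": 2 * PERIOD},
]
eligible = eligible_intro_points(relays, PERIOD)
```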

2.2. Hidden service sets up introduction points

   After a hidden service has picked its intro points, it needs to establish
   long-term introduction circuits to them and also send them an encrypted
   descriptor that should be forwarded to potential clients. The descriptor
   contains a service key that should be used by clients to encrypt the
   INTRODUCE1 cell that will be sent to the hidden service. The encrypted parts
   of the descriptor are encrypted with the symmetric keys specified in prop224
   section [ENCRYPTED-DATA].

2.2.1. Hidden service uploads a descriptor

   Services post a descriptor by opening a directory stream with BEGIN_DIR, and
   sending an HTTP POST request as described in proposal 224, section 2.2.4.

   The relay must verify the signatures of the descriptor, and check whether it
   is responsible for that blinded public key in the hash ring. Relays should
   connect the descriptor to the circuit used to upload it, which will be
   repurposed as the service introduction circuit. The descriptor does not need
   to be cached by the introduction point after that introduction circuit has
   closed.

   It is unexpected and invalid to send more than one descriptor on the same
   introduction circuit.

2.2.2. Descriptor format

   The format for the hidden service descriptor is as described in proposal 224
   sections 2.4 and 2.5, with the following modifications:

       * The "revision-counter" field is removed
       * The introduction-point section is removed
       * The "auth-key" field is removed
       * The "enc-key legacy" field is removed
       * The "enc-key ntor" field must be specified exactly once per descriptor

   Unlike previous versions, the descriptor does not encode the entire list of
   introduction points. The descriptor only contains a key for the particular
   introduction point it was sent to.

2.2.3. ESTABLISH_INTRO cell

   When a hidden service is establishing a new introduction point, it sends the
   ESTABLISH_INTRO cell, which is formatted as described by proposal 224
   section 3.1.1, except for the following:

   The AUTH_KEY_TYPE value 02 is changed to:

     [02] -- Signing key certificate cross-certified with the blinded key, in
             the same format as in the hidden service descriptor.

   In this case, SIG is a signature of the cell with the signing key specified
   in AUTH_KEY. The relay must verify this signature, as well as the
   certification with the blinded key. The relay should also verify that it has
   received a valid descriptor with this blinded key.

   [XXX: Other options include putting only the blinded key, or only the
   signing key in this cell. In either of these cases, we must look up the
   descriptor to fully validate the contents, but we require the descriptor
   to be present anyway. -special]

   [XXX: What happens with the MAINT_INTRO process defined in proposal 224
   section 3.1.3? -special]

2.3. Client connection to a service

   A client that wants to connect to a hidden service should first calculate
   the responsible introduction points for the onion address as described in
   section 2.1 above.

   The client chooses one introduction point at random, builds a circuit, and
   fetches the descriptor. Once it has received, verified, and decrypted the
   descriptor, the client can use the same circuit to send the INTRODUCE1 cell.

2.3.1. Client requests a descriptor

   Clients can request a descriptor by opening a directory stream with
   BEGIN_DIR, and sending an HTTP GET request as described in proposal 224,
   section 2.2.4.

   The client must verify the signatures of the descriptor, and decrypt the
   encrypted portion to access the "enc-key". This key is used to encrypt the
   contents of the INTRODUCE1 cell to the service.

   Because the descriptor is specific to each introduction point, client-side
   descriptor caching changes significantly. There is little point in caching
   these descriptors, because they are inexpensive to request and will always
   be available when a service-side introduction circuit is available. A client
   that does caching must be prepared to handle INTRODUCE1 failures due to
   rotated keys.

2.3.2. Client sends INTRODUCE1

   After requesting the descriptor, the client can use the same circuit to send
   an INTRODUCE1 cell, which is forwarded to the service and begins the
   rendezvous process.

   The INTRODUCE1 cell is the same as proposal 224 section 3.2.1, except that
   the AUTH_KEYID is the blinded public key, instead of the now-removed
   introduction point authentication key.

   The relay must permit this circuit to change purpose from the directory
   request to a client or server introduction.

3. Other changes to proposal 224

3.1. Removing proposal 224 legacy relay support

   Proposal 224 defines a process for using legacy relays as introduction
   points; see section 3.1.2 [LEGACY_EST_INTRO], and 3.2.3 [LEGACY-INTRODUCE1].
   With the changes to the introduction point in this proposal, it's no longer
   possible to maintain support for legacy introduction points.

   These sections of proposal 224 are removed, along with other references to
   legacy introduction points and RSA introduction point keys. We will need to
   handle the migration process to ensure that sufficient relays are available
   as introduction points. See the discussion in section 4.1 for more details.

3.2. Removing the "introduction point authentication key"

   The "introduction point authentication key" defined in proposal 224 is
   removed.  The "descriptor signing key" is used to sign descriptors and the
   ESTABLISH_INTRO2 cell. Descriptors are unique for each introduction point,
   and there is no point in generating a new key used only to sign the
   ESTABLISH_INTRO2 cell.

4. Discussion

4.1. No backwards compatibility with legacy relays

   By changing the introduction procedure in such a way, we are unable to
   maintain backwards compatibility. That is, hidden services will be unable to
   use old relays as their introduction points, and similarly clients will be
   unable to introduce through old relays.

   To maintain an adequate anonymity set of intro points, clients and hidden
   services should perform this introduction method only after most relays have
   upgraded. For this reason we introduce the consensus parameter
   HSMergedIntroduction which controls whether hidden services should perform
   this merged introduction or fall back to the old one.

   [XXX: Do we? This sounds like we have to implement both in the client, which
   I thought we wanted to avoid. An alternative is to make sure that the intro
   point side is done early enough, and that clients know not to rely on the
   security of 224 services until enough relays are upgraded and the
   implementation is done. -special]

4.2. Restriction on the number of intro points and impact on load balancing

   One drawback of this proposal is that the number of introduction points of a
   hidden service is now a constant global parameter. Hence, a hidden service
   can no longer adjust how many introduction points it uses, or select the
   nodes that will serve as its introduction points.

   While bad, we don't consider this a major drawback since we don't believe
   that introduction points are a significant bottleneck on hidden services
   performance.

   However, our system significantly impacts the way some load balancing
   schemes for hidden services work. For example, onionbalance is a third-party
   application that manages the introduction points of a hidden service in a
   way that allows traffic load-balancing.  This is achieved by compiling a
   master descriptor that mixes and matches the introduction points of
   underlying hidden service instances.

   With our system there are no descriptors that onionbalance can use to mix
   and match introduction points. A variant of the onionbalance idea that could
   work with our system would involve onionbalance starting a hidden service,
   not establishing any intro points, and then ordering the underlying hidden
   service load-balancing instances to establish intro points to all the right
   introduction points.

4.3. Behavior when introduction points go offline or misbehave

   In this new system, it's the Tor network that decides which relays should be
   used as the intro points of a hidden service for every time period. This
   means that a hidden service is forced to use those relays as intro points
   if it wants clients to connect to it.

   This brings up the topic of what should happen when the designated relays go
   offline or refuse connections. Our behavior here should block guard
   discovery attacks (as in #8239) while allowing maximum reachability for
   clients.

   We should also make sure that an adversary cannot manipulate the hash ring
   in such a way that forces us to rotate introduction points quickly. This is
   enforced by the uptime check that is necessary for acquiring the HSDir flag
   (#8243).

   For this reason we propose the following rules:

    - After every consensus and when the blinded public key changes as a result
      of the time period, hidden services need to recalculate their
      introduction points and adjust themselves by establishing intro points to
      the new relays.

    - When an introduction point goes offline or drops connections, we attempt
      to re-establish the circuit to it INTRO_RETRIES times per consensus. If
      the intro point fails more than INTRO_RETRIES times in a consensus
      period, we abandon it and stay with one fewer intro point.

      If a new consensus is released and that relay is still listed as online,
      then we reset our retry counter and start trying again.

   [XXX: Is this crazy? -asn]
   [XXX: INTRO_RETRIES = 3? -asn]
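
   The retry rule above might be tracked per introduction point along the
   following lines. This is only a sketch: the class name is hypothetical,
   and INTRO_RETRIES = 3 is the value floated in the note above, not a
   settled constant.

```python
INTRO_RETRIES = 3  # tentative value from the XXX note; not settled

class IntroPointRetryState:
    """Per-intro-point failure counter, reset on each new consensus."""

    def __init__(self):
        self.failures = 0
        self.abandoned = False

    def on_failure(self):
        """Record a failure; return True if we should try again."""
        self.failures += 1
        if self.failures > INTRO_RETRIES:
            self.abandoned = True
        return not self.abandoned

    def on_new_consensus(self, relay_listed_online):
        """Reset the counter if the relay is still listed as online."""
        if relay_listed_online:
            self.failures = 0
            self.abandoned = False

state = IntroPointRetryState()
outcomes = [state.on_failure() for _ in range(4)]
```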

4.4. Defining constants; how many introduction points for a service?

   We keep the same intro point configuration as in proposal 224. That is, each
   hidden service uses 6 relays and keeps them for a whole time period.

   [XXX: Are these good constants? We don't have a chance to change them
   in the future!! -asn]
   [XXX: 224 makes them consensus parameters, which we can keep, but they
   can still only be changed on a network-wide basis. -special]

References:

[1] : https://lists.torproject.org/pipermail/tor-dev/2016-January/010203.html
Filename: 247-hs-guard-discovery.txt
Title: Defending Against Guard Discovery Attacks using Vanguards
Authors: George Kadianakis and Mike Perry
Created: 2015-07-10
Status: Superseded
Superseded-by: 292-mesh-vanguards.txt

[This proposal is superseded by proposal 292-mesh-vanguards.txt based on our
analysis and experiences while implementing and simulating the vanguard design.]

0. Motivation

  A guard discovery attack allows attackers to determine the guard
  node of a Tor client. The hidden service rendezvous protocol
  provides an attack vector for a guard discovery attack since anyone
  can force an HS to construct a 3-hop circuit to a relay (#9001).

  Following the guard discovery attack with a compromise and/or
  coercion of the guard node can lead to the deanonymization of a
  hidden service.

1. Overview

  This document tries to make the above guard discovery + compromise
  attack harder to launch. It introduces a configuration
  option which makes the hidden service also pin the second and third
  hops of its circuits for a longer duration.

  With this new path selection, we force the adversary to perform a
  Sybil attack and two compromise attacks before succeeding. This is
  an improvement over the current state where the Sybil attack is
  trivial to pull off, and only a single compromise attack is required.

  With this new path selection, an attacker is forced to
  compromise one or more nodes before learning the guard node of a hidden
  service. This increases the uncertainty of the attacker, since
  compromise attacks are costly and potentially detectable, so an
  attacker will have to think twice before beginning a chain of node
  compromise attacks that he might not be able to complete.

1.1. Visuals

  Here is what a hidden service rendezvous circuit currently looks like:

                     -> middle_1 -> middle_A
                     -> middle_2 -> middle_B
                     -> middle_3 -> middle_C
                     -> middle_4 -> middle_D
       HS -> guard   -> middle_5 -> middle_E -> Rendezvous Point
                     -> middle_6 -> middle_F
                     -> middle_7 -> middle_G
                     -> middle_8 -> middle_H
                     ->   ...    ->  ...
                     -> middle_n -> middle_n

  This proposal pins the two middle nodes to a much more restricted
  set, as follows:

                                  -> guard_3A_A
                     -> guard_2_A -> guard_3A_B
                                  -> guard_3A_C -> Rendezvous Point
       HS -> guard_1
                                  -> guard_3B_D
                     -> guard_2_B -> guard_3B_E
                                  -> guard_3B_F -> Rendezvous Point


  Note that the third level guards are partitioned into buckets such that
  they are only used with one specific second-level guard. In this way,
  we ensure that even if an adversary is able to execute a Sybil attack
  against the third layer, they only get to learn one of the second-layer
  Guards, and not all of them. This prevents the adversary from gaining
  the ability to take their pick of the weakest of the second-level
  guards for further attack.

2. Design

  This feature requires the HiddenServiceGuardDiscovery torrc option
  to be enabled.

  When a hidden service picks its guard nodes, it also picks an
  additional NUM_SECOND_GUARDS-sized set of middle nodes for its
  `second_guard_set`. For each of those middle layer guards, it
  picks NUM_THIRD_GUARDS that will be used only with a specific
  middle node. These sets are unique to each hidden service created
  by a single Tor client, and must be kept separate and distinct.

  When a hidden service needs to establish a circuit to an HSDir,
  introduction point or a rendezvous point, it uses nodes from
  `second_guard_set` as the second hop of the circuit and nodes from
  that second hop's corresponding `third_guard_set` as third hops of
  the circuit.

  A hidden service rotates nodes from the `second_guard_set` at a random
  time between MIN_SECOND_GUARD_LIFETIME hours and
  MAX_SECOND_GUARD_LIFETIME hours.

  A hidden service rotates nodes from the `third_guard_set` at a random
  time between MIN_THIRD_GUARD_LIFETIME and MAX_THIRD_GUARD_LIFETIME
  hours.

  These extra guard nodes should be picked with the same path selection
  procedure that is used for regular middle nodes (though see Section 4.3
  and Section 5.1 for reasons to restrict this slightly beyond the current
  path selection rules).

  Each node's rotation time is tracked independently, to avoid disclosing
  the rotation times of the primary and second-level guards.
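
  The layered selection described above can be sketched as follows. The
  sketch picks nodes uniformly at random, which is a simplification (the
  text calls for the regular middle-node path selection procedure), and
  the function names and relay names are assumptions made for this example.

```python
import random

NUM_SECOND_GUARDS = 4
NUM_THIRD_GUARDS = 4

def build_vanguard_sets(middle_nodes, rng):
    """Map each second-layer guard to its own disjoint third-layer bucket."""
    needed = NUM_SECOND_GUARDS * (1 + NUM_THIRD_GUARDS)
    picked = rng.sample(middle_nodes, needed)  # all distinct nodes
    second = picked[:NUM_SECOND_GUARDS]
    rest = picked[NUM_SECOND_GUARDS:]
    return {
        g2: rest[i * NUM_THIRD_GUARDS:(i + 1) * NUM_THIRD_GUARDS]
        for i, g2 in enumerate(second)
    }

def pick_path(first_guard, vanguards, rng):
    """Choose a second hop, then a third hop from that hop's bucket only."""
    g2 = rng.choice(list(vanguards))
    g3 = rng.choice(vanguards[g2])
    return [first_guard, g2, g3]

rng = random.Random(0)
nodes = ["relay%d" % i for i in range(100)]  # placeholder relay names
sets = build_vanguard_sets(nodes, rng)
path = pick_path("guard_1", sets, rng)
```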

  XXX: IP and RP actually need to be separate 4th hops. On the server side,
  IP should be separate to better unlink IP from the 3rd layer guards,
  and on the client side, the RP needs to come from the full network to
  avoid cross-visit linkability. So it's seven proxies all the time...

  XXX: What about hsdir fetch? to avoid targeting and visit linkability,
  it needs an ephemeral hop too.. Unless we believe that linkability is low?
  It is lower than IP linkability, since the hsdescs can be cached for a bit.
  But if we are worried about visit linkability, then client should also add
  an extra ephemeral hop during IP visits, making that circuit 8 hops long...

  XXX: Ephemeral hops for service side before RP?

  XXX: Really crazy idea: We can provide multiple path security levels.
  We could have full 4 hops, or combine Layer2+Layer3, or combine Layer1+Layer2
  and Layer3+Layer4 for lower-security HS circs..

  XXX: update the load balancing proposal with the outcome of this :/

  XXX how should proposal 241 ("Resisting guard-turnover attacks") be
      applied here?

2.1. Security parameters

  We set NUM_SECOND_GUARDS to 4 nodes and NUM_THIRD_GUARDS to 4 nodes (i.e.,
  four sets of four). However, see Section 5.2 for some performance
  versus security tradeoffs and discussion.

  We set MIN_SECOND_GUARD_LIFETIME to 1 day, and
  MAX_SECOND_GUARD_LIFETIME to 32 days inclusive, for an average rotation
  rate of ~11 days, using the min(X,X) distribution specified in Section
  3.2.3.

  We set MIN_THIRD_GUARD_LIFETIME to 1 hour, and
  MAX_THIRD_GUARD_LIFETIME to 18 hours inclusive, for an average rotation
  rate of ~12 hours, using the max(X,X) distribution specified in Section
  3.2.3.

  The above parameters should be configurable in the Tor consensus and
  torrc.

  See Section 3 for more analysis on these constants.


3. Rationale and Security Parameter Selection

3.1. Threat model, Assumptions, and Goals

  Consider an adversary with the following powers:

     - Can launch a Sybil guard discovery attack against any node of a
       rendezvous circuit. The slower the rotation period of the node,
       the longer the attack takes. Similarly, the higher the percentage
       of the network that is compromised, the faster the attack runs.

     - Can compromise any node on the network, but this compromise takes
       time and potentially even coercive action, and also carries risk
       of discovery.

  We also make the following assumptions about the types of attacks:

  1. A Sybil attack is observable by both people monitoring the network
     for large numbers of new nodes, as well as vigilant hidden service
     operators. It will require either large amounts of traffic sent
     towards the hidden service, multiple test circuits, or both.

  2. A Sybil attack against the second or first layer Guards will be
     more noisy than a Sybil attack against the third layer guard, since the
     second and first layer Sybil attack requires a timing side channel in
     order to determine success, whereas the Sybil success is almost
     immediately obvious to the third layer guard, since it will be instructed
     to connect to a cooperating malicious rend point by the adversary.

  3. As soon as the adversary is confident they have won the Sybil attack,
     an even more aggressive circuit building attack will allow them to
     determine the next node very fast (an hour or less).

  4. The adversary is strongly disincentivized from compromising nodes that
     may prove useless, as node compromise is even more risky for the
     adversary than a Sybil attack in terms of being noticed.

  Given this threat model, our security parameters were selected so that
  the first two layers of guards should be hard to attack using a Sybil
  guard discovery attack and hence require a node compromise attack. Ideally,
  we want the node compromise attacks to carry a non-negligible probability of
  being useless to the adversary by the time they complete.

  On the other hand, the outermost layer of guards should rotate fast enough to
  _require_ a Sybil attack.

3.2. Parameter Tuning

3.2.1. Sybil rotation counts for a given number of Guards

  The probability of Sybil success for Guard discovery can be modeled as
  the probability of choosing 1 or more malicious middle nodes for a
  sensitive circuit over some period of time.

  P(At least 1 bad middle) = 1 - P(All Good Middles)
                           = 1 - P(One Good middle)^(num_middles)
                           = 1 - (1 - c/n)^(num_middles)

  Here c/n is the fraction of the network that the adversary has compromised.

  In the case of Vanguards, num_middles is the number of Guards you rotate
  through in a given time period. This is a function of the number of vanguards
  in that position (v), as well as the number of rotations (r).

  P(At least one bad middle) = 1 - (1 - c/n)^(v*r)
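
  As a quick numeric check of the formula above, the sketch below evaluates
  it for a 1% adversary against 16 vanguards. The function name is chosen
  for this example, and `c` is passed directly as the compromised fraction
  c/n.

```python
def sybil_success(c, v, r):
    """P(at least one bad middle) after r rotations of v vanguards."""
    return 1.0 - (1.0 - c) ** (v * r)

# 1% adversary, 16 vanguards: 29 rotations cross the 99% threshold,
# while 28 rotations fall just short of it.
p29 = sybil_success(0.01, 16, 29)
p28 = sybil_success(0.01, 16, 28)
```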

  Here are detailed tables of the number of rotations required for
  a given Sybil success rate, for a given number of guards.

  1.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%            11     6     4     3     3     2     2     2     2     1       1
    15%            17     9     6     5     4     3     3     2     2     2       2
    25%            29    15    10     8     6     5     4     4     3     3       2
    50%            69    35    23    18    14    12     9     8     7     6       5
    60%            92    46    31    23    19    16    12    11    10     8       6
    75%           138    69    46    35    28    23    18    16    14    12       9
    85%           189    95    63    48    38    32    24    21    19    16      12
    90%           230   115    77    58    46    39    29    26    23    20      15
    95%           299   150   100    75    60    50    38    34    30    25      19
    99%           459   230   153   115    92    77    58    51    46    39      29

  5.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%             3     2     1     1     1     1     1     1     1     1       1
    15%             4     2     2     1     1     1     1     1     1     1       1
    25%             6     3     2     2     2     1     1     1     1     1       1
    50%            14     7     5     4     3     3     2     2     2     2       1
    60%            18     9     6     5     4     3     3     2     2     2       2
    75%            28    14    10     7     6     5     4     4     3     3       2
    85%            37    19    13    10     8     7     5     5     4     4       3
    90%            45    23    15    12     9     8     6     5     5     4       3
    95%            59    30    20    15    12    10     8     7     6     5       4
    99%            90    45    30    23    18    15    12    10     9     8       6

  10.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%             2     1     1     1     1     1     1     1     1     1       1
    15%             2     1     1     1     1     1     1     1     1     1       1
    25%             3     2     1     1     1     1     1     1     1     1       1
    50%             7     4     3     2     2     2     1     1     1     1       1
    60%             9     5     3     3     2     2     2     1     1     1       1
    75%            14     7     5     4     3     3     2     2     2     2       1
    85%            19    10     7     5     4     4     3     3     2     2       2
    90%            22    11     8     6     5     4     3     3     3     2       2
    95%            29    15    10     8     6     5     4     4     3     3       2
    99%            44    22    15    11     9     8     6     5     5     4       3

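  The rotation counts in these tables were generated with:
     import math

     def num_rotations(c, v, success):
       # c: compromised fraction of the network (e.g. 0.01 for 1%)
       # v: number of guards in the position; success: target probability
       r = 0
       while 1-math.pow((1-c), v*r) < success: r += 1
       return r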

3.2.2. Rotation Period

  As specified in Section 3.1, the primary driving force for the third
  layer selection was to ensure that these nodes rotate fast enough that
  it is not worth trying to compromise them, because it is unlikely for
  compromise to succeed and yield useful information before the nodes stop
  being used. For this reason we chose 1 to 18 hours, with a weighted
  distribution (Section 3.2.3) causing the expected average to be 12 hours.

  From the table in Section 3.2.1, with NUM_SECOND_GUARDS=4 and
  NUM_THIRD_GUARDS=4, it can be seen that this means that the Sybil attack
  will complete with near-certainty (99%) in 29*12 hours (14.5 days) for
  the 1% adversary, 3 days for the 5% adversary, and 1.5 days for the 10%
  adversary.
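
  The durations quoted above follow from multiplying a rotation count by
  the 12-hour average third-layer lifetime. The sketch below redoes that
  arithmetic; the rotation counts are copied from the 99% rows of the
  "Sixteen" column in Section 3.2.1, and the names are chosen for this
  example.

```python
AVERAGE_THIRD_GUARD_LIFETIME_HOURS = 12

# Rotations needed for 99% Sybil success with sixteen guards, keyed by
# the adversary's compromised fraction (values copied from the tables).
ROTATIONS_FOR_99_PERCENT = {0.01: 29, 0.05: 6, 0.10: 3}

def sybil_days(rotations, hours_per_rotation=AVERAGE_THIRD_GUARD_LIFETIME_HOURS):
    """Convert a rotation count into days of attack time."""
    return rotations * hours_per_rotation / 24.0

days = {c: sybil_days(r) for c, r in ROTATIONS_FOR_99_PERCENT.items()}
```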

  Since rotation of each node happens independently, the distribution of
  when the adversary expects to win this Sybil attack in order to discover
  the next node up is uniform. This means that on average, the adversary
  should expect that half of the rotation period of the next node is already
  over by the time that they win the Sybil.

  With this fact, we choose our range and distribution for the second
  layer rotation to be short enough to cause the adversary to risk
  compromising nodes that are useless, yet long enough to require a
  Sybil attack to be noticeable in terms of client activity. For this
  reason, we choose a minimum second-layer guard lifetime of 1 day,
  since this gives the adversary a minimum expected value of 12 hours
  during which they can compromise a guard before it might be rotated.
  If the total expected rotation rate is 11 days, then the adversary can
  expect overall to have 5.5 days remaining after completing their Sybil
  attack before a second-layer guard rotates away.

3.2.3. Rotation distributions

  In order to skew the distribution of the third layer guard towards
  higher values, we use max(X,X) for the distribution, where X is a
  random variable that takes on values from the uniform distribution;
  that is, we take the larger of two independent uniform draws.

  In order to skew the distribution of the second layer guard towards
  lower values (to increase the risk of compromising useless nodes), we
  use min(X,X), the smaller of two independent uniform draws.
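
  One way to draw lifetimes from these skewed distributions is sketched
  below, using the ranges from Section 2.1. The function name, the `skew`
  argument, and the fixed seed are assumptions made for this example.

```python
import random

def sample_lifetime(min_life, max_life, skew, rng):
    """Draw two uniform values in [min_life, max_life]; keep max or min."""
    a = rng.randint(min_life, max_life)
    b = rng.randint(min_life, max_life)
    return max(a, b) if skew == "max" else min(a, b)

rng = random.Random(1)  # fixed seed so the example is repeatable
# Third layer: 1..18 hours, skewed high. Second layer: 1..32 days, skewed low.
third = [sample_lifetime(1, 18, "max", rng) for _ in range(10000)]
second = [sample_lifetime(1, 32, "min", rng) for _ in range(10000)]
```

  The sample means should sit above the uniform mean for the third layer
  and below it for the second layer.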

  Here's a table of expectations (arithmetic means) for relevant
  ranges of X (sampled from 0..N-1). The table was generated with the
  following python functions:

  # Probability that the min (resp. max) of two independent uniform
  # draws from 0..N-1 equals i.
  def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
  def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

  def ExpFn(N, ProbFunc):
    exp = 0.0
    for i in range(N): exp += i*ProbFunc(N, i)
    return exp

  The current choice for second-layer guards is noted with **, and
  the current choice for third-layer guards is noted with ***.

   Range  Exp[Min(X,X)]   Exp[Max(X,X)]
   10        2.85            6.15
   11        3.18            6.82
   12        3.51            7.49
   13        3.85            8.15
   14        4.18            8.82
   15        4.51            9.49
   16        4.84            10.16
   17        5.18            10.82***
   18        5.51            11.49
   19        5.84            12.16
   20        6.18            12.82
   21        6.51            13.49
   22        6.84            14.16
   23        7.17            14.83
   24        7.51            15.49
   25        7.84            16.16
   26        8.17            16.83
   27        8.51            17.49
   28        8.84            18.16
   29        9.17            18.83
   30        9.51            19.49
   31        9.84            20.16
   32        10.17**         20.83
   33        10.51           21.49
   34        10.84           22.16
   35        11.17           22.83
   36        11.50           23.50
   37        11.84           24.16
   38        12.17           24.83
   39        12.50           25.50

  The Cumulative Distribution Function (CDF) tells us the probability that a
  guard will no longer be in use after a given number of time units have
  passed.

  Because the Sybil attack on the third node is expected to complete at any
  point in the second node's rotation period with uniform probability, if we
  want to know the probability that a second-level Guard node will still be in
  use after t days, we first need to compute the probability distribution of
  the rotation duration of the second-level guard at a uniformly random point
  in time. Let's call this P(R=r).

  For P(R=r), the probability of the rotation duration depends on the selection
  probability of a rotation duration, and the fraction of total time that
  rotation is likely to be in use. This can be written as:

  P(R=r) = ProbMinXX(X=r)*r / \sum_{i=1}^N ProbMinXX(X=i)*i

  or in Python:

  def ProbR(N, r, ProbFunc=ProbMinXX):
     return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

  For the full CDF, we simply sum up the fractional probability density for
  all rotation durations. For rotation durations less than t days, we add the
  entire probability mass for that period to the density function. For
  durations d greater than t days, we take the fraction of that rotation
  period's selection probability and multiply it by t/d and add it to the
  density. In other words:

  def FullCDF(N, t, ProbFunc=ProbR):
    density = 0.0
    for d in range(N):
      if t >= d: density += ProbFunc(N, d)
      # The +1's below compensate for 0-indexed arrays:
      else: density += ProbFunc(N, d)*(float(t+1))/(d+1)
    return density

  Computing this yields the following distribution for our current parameters:

   t          P(SECOND_ROTATION <= t)
   1               0.07701
   2               0.15403
   3               0.22829
   4               0.29900
   5               0.36584
   6               0.42869
   7               0.48754
   8               0.54241
   9               0.59338
  10               0.64055
  11               0.68402
  12               0.72392
  13               0.76036
  14               0.79350
  15               0.82348
  16               0.85043
  17               0.87452
  18               0.89589
  19               0.91471
  20               0.93112
  21               0.94529
  22               0.95738
  23               0.96754
  24               0.97596
  25               0.98278
  26               0.98817
  27               0.99231
  28               0.99535
  29               0.99746
  30               0.99881
  31               0.99958
  32               0.99992
  33               1.00000
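
  A quick way to sanity-check the table is sketched below in Python 3. It
  restates the functions from this section under lowercase names (the
  renaming is ours) and checks that P(R=r) sums to 1 and that the CDF never
  decreases, using N=33 as in Appendix A.

```python
def prob_min_xx(n, i):
    return (2.0 * (n - i) - 1) / (n * n)

def exp_fn(n, prob_func):
    return sum(i * prob_func(n, i) for i in range(n))

def prob_r(n, r, prob_func=prob_min_xx):
    # Selection probability weighted by how long that duration stays in use.
    return prob_func(n, r) * r / exp_fn(n, prob_func)

def full_cdf(n, t, prob_func=prob_r):
    density = 0.0
    for d in range(n):
        if t >= d:
            density += prob_func(n, d)
        else:
            # The +1's compensate for 0-indexed durations.
            density += prob_func(n, d) * (t + 1) / (d + 1)
    return density

N = 33
cdf = [full_cdf(N, t - 1) for t in range(1, N + 1)]

# P(R=r) must be a probability distribution, and the CDF must not decrease.
assert abs(sum(prob_r(N, r) for r in range(N)) - 1.0) < 1e-9
assert all(a <= b for a, b in zip(cdf, cdf[1:]))
print("t=1: %.5f   t=33: %.5f" % (cdf[0], cdf[-1]))
```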

  This CDF tells us that for the second-level Guard rotation, the
  adversary can expect that 7.7% of the time, their third-level Sybil
  attack will provide them with a second-level guard node that has only
  1 day or less remaining before it rotates. 15.4% of the time, there will
  be only 2 days or less remaining, and 22.8% of the time, 3 days or less.

  Note that this distribution is still a day-resolution approximation. The
  actual numbers are likely even more biased towards lower values.

  In this way, we achieve our goal of ensuring that the adversary must
  do the prep work to compromise multiple second-level nodes before
  likely being successful, or be extremely fast in compromising a
  second-level guard after winning the Sybil attack.


4. Security concerns and mitigations

4.1. Mitigating fingerprinting of new HS circuits

  By pinning the middle nodes of rendezvous circuits, we make it
  easier for all hops of the circuit to detect that they are part of a
  special hidden service circuit with varying degrees of certainty.

  The Guard node is able to recognize a Vanguard client with a high
  degree of certainty because it will observe a client IP creating the
  overwhelming majority of its circuits to just a few middle nodes in
  any given 10-18 day time period.

  The middle nodes will be able to tell with a variable certainty that
  depends on both their traffic volume and the popularity of the
  service, because they will see a large number of circuits that tend to
  pick the same Guard and Exit.

  The final nodes will be able to tell with a similar level of certainty
  that depends on their capacity and the service popularity, because they
  will see a lot of rend handshakes that all tend to have the same second
  hop. The final nodes can also actively confirm that they have been
  selected for the third hop by creating multiple Rend circuits to a
  target hidden service, and seeing if they are chosen for the Rend point.

  The most serious of these is the Guard fingerprinting issue. When
  proposal 254-padding-negotiation is implemented, services that enable
  this feature should use those padding primitives to create fake circuits
  to random middle nodes that are not their guards, in an attempt to look
  more like a client.

  Additionally, if Tor Browser implements "virtual circuits" based on
  SOCKS username+password isolation in order to enforce the re-use of
  paths when SOCKS username+passwords are re-used, then the number of
  middle nodes in use during a typical user's browsing session will be
  proportional to the number of sites they are viewing at any one time.
  This is likely to be much lower than one new middle node every ten
  minutes, and for some users, may be close to the number of Vanguards
  we're considering.

  This same reasoning is also an argument for increasing the number of
  second-level guards beyond just two, as it will spread the hidden
  service's traffic over a wider set of middle nodes, making it both
  easier to cover and closer in behavior to a client using SOCKS virtual
  circuit isolation.

4.2. Hidden service linkability

  Multiple hidden services on the same Tor instance should use separate
  second and third level guard sets; otherwise an adversary is trivially
  able to determine that the two hidden services are co-located by
  inspecting their current chosen rend point nodes.

  Unfortunately, if the adversary is still able to determine that two or
  more hidden services are run on the same Tor instance through some other
  means, then they are able to take advantage of this fact to execute a
  Sybil attack more effectively, since there will now be an extra set of
  guard nodes for each hidden service in use.

  For this reason, if Vanguards are enabled, and more than one hidden
  service is configured, the user should be advised to ensure that they do
  not accidentally leak that the two hidden services are from the same Tor
  instance.

  For cases where the user or application wants to deliberately link multiple
  different hidden services together (for example, to support concurrent file
  transfer and chat for the same identity), this behavior should be
  configurable. A torrc option DisjointHSVanguards should be provided that
  defaults to keeping the Vanguards separate for each hidden service.

4.3. Long term information leaks

  Due to Tor's path selection constraints, the client will never choose
  its primary guard node for later positions in the circuit. Over time,
  the absence of this node will give away information to the adversary.

  Unfortunately, the current solution (from bug #14917) of simply creating
  a temporary second guard connection to allow the primary guard to appear
  in some paths will make the hidden service fingerprinting problem worse,
  since only hidden services will exhibit this behavior on the local
  network. The simplest mitigation is to require that no Guard-flagged nodes
  be used for the second and third-level nodes at all, and to allow the
  primary guard to be chosen as a rend point.

  XXX: Dgoulet suggested using arbitrary subsets here rather than the
  no Guard-flag restriction, esp since Layer2 inference is still a
  possibility.

  XXX: If a Guard-flagged node is chosen for the alls IP or RP, raise
  protocolerror. Refuse connection. Or allow our guard/other nodes in
  IP/RP..

  Additionally, in order to further limit the exposure of secondary guards
  to Sybil attacks, the bin position of the third-level guards should be
  stable over long periods of time. When choosing third-level guards, these
  guards should be given a fixed bin number so that if they are selected
  at a later point in the future, they are placed after the same
  second-level guard, and not a different one. A potential stateless way
  of accomplishing this is to assign third-level guards to a bin number
  such that H(bin_number | HS addr) is closest to the key for the
  third-level relay.
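
  One possible reading of this stateless assignment is sketched below in
  Python 3. The proposal does not fix the details, so everything concrete
  here is an assumption: H is taken to be SHA-256, bin_number is encoded as
  a single byte, "|" is byte concatenation, the relay key is a 32-byte
  value, and "closest" means the smallest absolute difference between the
  two values read as big-endian integers.

```python
import hashlib

def bin_for_relay(relay_key, hs_addr, num_bins):
    """Pick the bin whose H(bin_number | HS addr) is closest to relay_key.

    Assumptions (not from the proposal): SHA-256 as H, one-byte bin
    numbers, and integer distance as the closeness metric.
    """
    key_value = int.from_bytes(relay_key, "big")

    def distance(bin_number):
        digest = hashlib.sha256(bytes([bin_number]) + hs_addr).digest()
        return abs(int.from_bytes(digest, "big") - key_value)

    return min(range(num_bins), key=distance)

# Hypothetical inputs: a stand-in 32-byte relay key and a placeholder address.
relay_key = hashlib.sha256(b"example relay identity").digest()
hs_addr = b"exampleonionaddress"
print(bin_for_relay(relay_key, hs_addr, num_bins=4))
```

  Because the result depends only on the relay key, the service address and
  the number of bins, a relay that is selected again later lands in the same
  bin without any stored state.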

4.4. Denial of service

  Since it will be fairly trivial for the adversary to enumerate the
  current set of third-layer guards for a hidden service, denial of
  service becomes a serious risk for Vanguard users.

  For this reason, it is important to support a large number of
  third-level guards, to increase the amount of resources required to
  bring a hidden service offline by DoSing just a few Tor nodes.

  Even with multiple third-level guards, an adversary is still able to
  degrade either performance or user experience significantly, simply by
  taking out a fraction of them. The solution to this is to make use
  of the circuit build timeout code (Section 5.2) to have the hidden
  service retry the rend connection multiple times. Unfortunately, it is
  unwise to simply replace unresponsive third-level guards that fail to
  complete circuits, as this will accelerate the Sybil attack.

4.5. Path Bias

XXX: Re-use Prop#259 here.


5. Performance considerations

  The switch to a restricted set of nodes will very likely cause
  significant performance issues, especially for high-traffic hidden
  services. If any of the nodes they select happen to be temporarily
  overloaded, performance will suffer dramatically until the next
  rotation period.

5.1. Load Balancing

  Since the second and third level "guards" are chosen from the set of all
  nodes eligible for use in the "middle" hop (as per hidden services
  today), this proposal should not significantly affect the long-term load
  on various classes of the Tor network, and should not require any
  changes to either the node weight equations, or the bandwidth
  authorities.

  Unfortunately, transient load is another matter, as mentioned
  previously. It is very likely that this scheme will increase instances
  of transient overload at nodes selected by high-traffic hidden services.

  One option to reduce the impact of this transient overload is to
  restrict the set of middle nodes that we choose from to some percentage
  of the fastest middle-capable relays in the network. This may have
  some impact on load balancing, but since the total volume of hidden
  service traffic is low, it may be unlikely to matter.
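
  A minimal sketch of such a restriction, in Python 3, is shown below. The
  relay list, the bandwidth figures and the 30% cutoff are all made up for
  illustration; the proposal does not specify a percentage.

```python
def fastest_percent(relays, percent):
    """Return the fastest `percent` of (name, bandwidth) relays.

    At least one relay is always kept so the candidate set is never empty.
    """
    ranked = sorted(relays, key=lambda relay: relay[1], reverse=True)
    keep = max(1, len(ranked) * percent // 100)
    return ranked[:keep]

# Hypothetical middle-capable relays with made-up bandwidth weights.
relays = [("r0", 900), ("r1", 120), ("r2", 4500), ("r3", 60), ("r4", 2300),
          ("r5", 750), ("r6", 3100), ("r7", 15), ("r8", 480), ("r9", 1800)]

candidates = fastest_percent(relays, 30)
print([name for name, _ in candidates])
```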

5.2. Circuit build timeout and topology

  The adaptive circuit build timeout mechanism in Tor is what corrects
  for instances of transient node overload right now.

  The timeout will naturally tend to select the current fastest and
  least-loaded paths even through this set of restricted routes, but it
  may fail to behave correctly if there is only a very small set of nodes in
  each guard set, as it is based upon assumptions about the current path
  selection algorithm, and it may need to be tuned specifically for
  Vanguards, especially if the set of possible routes is small.

  It turns out that a fully-connected/mesh (aka non-binned) second guard to
  third guard mapping topology is a better option for CBT for performance,
  because it will create a larger total set of paths for CBT to choose
  from while using fewer nodes.

  This comes at the expense of exposing all second-layer guards to a
  single Sybil attack, but for small numbers of guard sets, it may be
  worth the tradeoff. However, this decision need not block
  implementation: in the worst case, the data structures and storage
  used for bins can support a fully connected mesh topology by simply
  replicating the same set of third-layer guards for each second-layer
  guard bin.

  Since we only expect this tradeoff to be worth it when the sets are
  small, this replication should not be expensive in practice.
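
  The path-count argument and the replication trick can be illustrated
  with a small Python 3 sketch. The node names and set sizes are
  hypothetical; the point is only that a mesh expressed as replicated bins
  yields more distinct (second, third) pairs from fewer relays.

```python
def paths(bins):
    """All (second-layer, third-layer) pairs a circuit could use."""
    return [(l2, l3) for l2, l3s in bins.items() for l3 in l3s]

def node_count(bins):
    """Number of distinct relays used across both layers."""
    third = {l3 for l3s in bins.values() for l3 in l3s}
    return len(bins) + len(third)

def mesh_as_bins(second_layer, third_layer):
    """Express a full mesh in the binned structure by replicating the
    same third-layer set under every second-layer guard."""
    return {l2: list(third_layer) for l2 in second_layer}

# Binned: each second-layer guard has its own two third-layer guards.
binned = {"L2a": ["L3a", "L3b"], "L2b": ["L3c", "L3d"]}
# Mesh: both second-layer guards share the same three third-layer guards.
mesh = mesh_as_bins(["L2a", "L2b"], ["L3a", "L3b", "L3c"])

for label, topo in (("binned", binned), ("mesh", mesh)):
    print("%-6s nodes=%d paths=%d" % (label, node_count(topo), len(paths(topo))))
```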

5.3. OnionBalance

  At first glance, it seems that this scheme makes multi-homed hidden
  services such as OnionBalance[1] even more important for high-traffic
  hidden services.

  Unfortunately, if it is equally damaging to the user for any of their
  multi-homed hidden service locations to be discovered, then OnionBalance
  is strictly equivalent to simply increasing the number of second-level
  guard nodes in use, because an active adversary can perform simultaneous
  Sybil attacks against all of the rend points offered by the multi-homed
  OnionBalance introduction points.

  XXX: This actually matters for high-perf censorship resistant publishing.
  It is better for those users to use onionbalance than to up their guards,
  since redundancy is useful for them.

5.4. Default vs optional behavior

  We suggest that this torrc option be optional because it changes path
  selection in a way that may seriously impact hidden service performance,
  especially for high traffic services that happen to pick slow guard
  nodes.

  However, by having this setting be disabled by default, we make hidden
  services that use it stand out a lot. For this reason, we should in fact
  enable this feature globally, but only after we verify its viability for
  high-traffic hidden services, and ensure that it is free of second-order
  load balancing effects.

  Even after that point, until Single Onion Services are implemented,
  there will likely still be classes of very high traffic hidden services
  for whom some degree of location anonymity is desired, but for which
  performance is much more important than the benefit of Vanguards, so there
  should always remain a way to turn this option off.


6. Future directions

  Here are some more ideas for improvements that should be done sooner
  or later:

  - Do we want to consider using Tor's GeoIP country database (if present)
    to ensure that the second-layer guards are chosen from a different
    country than the first-layer guards, or does this leak too much
    information to the adversary?

  - What does the security vs performance tradeoff actually look like
    for different amounts of bins? Or for mesh vs bins? We may need
    to simulate or run CBT tests to learn this.

  - With this tradeoff information, do we want to provide the user
    (or application) with a choice of 3 different Vanguard sets?
    One could imagine "small", "medium", and "large", for example.


7. Acknowledgments

 Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else
 who helped with this idea.

 This research was supported in part by NSF grants CNS-1111539,
 CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.


Appendix A: Full Python program for generating tables in this proposal

#!/usr/bin/env python3
import math

############ Section 3.2.1 #################
# Smallest i such that 1-(1-c)^(v*i) >= success.
def num_rotations(c, v, success):
  i = 0
  while 1-math.pow((1-c), v*i) < success: i += 1
  return i

def rotation_line(c, pct):
  counts = [num_rotations(c, v, pct/100.0)
            for v in (1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 16)]
  print("    %2d%%        %6d%6d%6d%6d%6d%6d%6d%6d%6d%6d%8d" %
        ((pct,) + tuple(counts)))

def rotation_table_321():
  for c in [1,5,10]:
    print("\n  %2.1f%% Network Compromise: " % c)
    print("   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen")
    for success in [10,15,25,50,60,75,85,90,95,99]:
      rotation_line(c/100.0, success)

############ Section 3.2.3 #################
def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

def ExpFn(N, ProbFunc):
  exp = 0.0
  for i in range(N): exp += i*ProbFunc(N, i)
  return exp

def ProbR(N, r, ProbFunc=ProbMinXX):
  return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

def FullCDF(N, t, ProbFunc=ProbR):
  density = 0.0
  for d in range(N):
    if t >= d: density += ProbFunc(N, d)
    # The +1's below compensate for 0-indexed arrays:
    else: density += ProbFunc(N, d)*float(t+1)/(d+1)
  return density

def expectation_table_323():
  print("\n   Range  Min(X,X)   Max(X,X)")
  for i in range(10,40):
    print("   %2d      %2.2f       %2.2f" % (i, ExpFn(i,ProbMinXX), ExpFn(i, ProbMaxXX)))

def CDF_table_323():
  print("\n   t          P(SECOND_ROTATION <= t)")
  for i in range(1,34):
    print("  %2d               %2.5f" % (i, FullCDF(33, i-1)))

########### Output ############

# Section 3.2.1
rotation_table_321()

# Section 3.2.3
expectation_table_323()
CDF_table_323()


----------------------

1. https://onionbalance.readthedocs.org/en/latest/design.html#overview
Filename: 248-removing-rsa-identities.txt
Title: Remove all RSA identity keys
Authors: Nick Mathewson
Created: 15 July 2015
Status: Needs-Revision

1. Summary

   With 0.2.7.2-alpha, all relays will have Ed25519 identity keys.  Old
   identity keys are 1024-bit RSA, which should not really be considered
   adequate.  In proposal 220, we describe a migration path to start
   using Ed25519 keys.  This proposal describes an additional migration
   path, for finally removing our old RSA identity keys.

   See also proposal 245, which describes a migration path away from the
   old TAP RSA1024-based circuit extension protocol.

1.1. Steps of migration

   Phase 1. Prepare for routers that do not advertise their RSA
     identities, by teaching clients and relays and other dependent
     software how to handle them.  Reject such routers at the authority
     level.

   Phase 2. Once all supported routers and clients are updated to phase
     1, we can accept routers at the authority level which lack RSA
     keys.

   Phase 3. Once all authorities accept routers without RSA keys, we can
     finally remove RSA keys from relays.

2. Accepting descriptors without RSA identities

   We make the following changes to the descriptor format:

   If an ed25519 key and signature are present, then these elements may
   be omitted: "fingerprint", "signing-key", "router-signature".  They
   must either be all present or all absent.  If they are all absent,
   then the router has no RSA identity key.

   Authorities MUST NOT accept router descriptors of this form in phase
   1.
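
   The all-or-nothing rule and the phase 1 restriction could be checked
   along the following lines. This Python 3 sketch assumes a descriptor has
   already been reduced to the set of keywords it contains plus a flag
   saying whether a valid ed25519 key and signature were found; that
   representation and the function names are hypothetical.

```python
RSA_ELEMENTS = ("fingerprint", "signing-key", "router-signature")

def rsa_identity_status(keywords, has_ed25519):
    """Return "rsa" or "no-rsa", or raise ValueError for a bad mix.

    keywords: set of keyword names present in the descriptor (assumed
    to be extracted by a parser that is not shown here).
    """
    present = [name for name in RSA_ELEMENTS if name in keywords]
    if len(present) == len(RSA_ELEMENTS):
        return "rsa"
    if not present and has_ed25519:
        return "no-rsa"
    raise ValueError("RSA elements must be all present or all absent, "
                     "and may be absent only with an ed25519 identity")

def authority_accepts(keywords, has_ed25519, phase):
    """Authorities reject descriptors without RSA identities in phase 1."""
    status = rsa_identity_status(keywords, has_ed25519)
    return status == "rsa" or phase >= 2

ed_only = {"router", "published"}
print(authority_accepts(ed_only, has_ed25519=True, phase=1))
print(authority_accepts(ed_only, has_ed25519=True, phase=2))
```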

3. Accepting handshakes without RSA identities

   When performing a new version of our link handshake, only the Ed25519
   key, certificates, and authentication need to be presented.  If the
   link handshake is performed this way, it is accepted as
   authenticating the router with an ed25519 key but no RSA key.

   A circuit extension EXTEND2 cell may contain an Ed25519 identity but
   not an RSA identity.  In this case, the relay should connect the
   circuit to any connection with the correct ed25519 identity,
   regardless of RSA identity.  If an EXTEND2 cell contains an RSA
   identity fingerprint, however, the relay receiving it should not
   connect to any relay that has a different RSA identity or that has no
   RSA identity, even if the Ed25519 identity does match.
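
   The matching rule for EXTEND2 can be written as a small predicate. The
   Python 3 sketch below assumes the cell always carries an Ed25519
   identity and uses None to mean "no RSA identity"; the function and
   parameter names are hypothetical.

```python
def may_extend_over(want_ed, want_rsa, conn_ed, conn_rsa):
    """Decide whether an existing connection satisfies an EXTEND2 request.

    want_ed / want_rsa: identities named in the EXTEND2 cell
                        (want_rsa is None if the cell has no RSA identity).
    conn_ed / conn_rsa: identities authenticated on the connection
                        (conn_rsa is None if the relay has no RSA identity).
    """
    if conn_ed != want_ed:
        return False
    if want_rsa is None:
        # No RSA identity requested: the Ed25519 match is sufficient.
        return True
    # An RSA identity was requested: it must be present and must match.
    return conn_rsa is not None and conn_rsa == want_rsa

print(may_extend_over(b"ed-Y", None, b"ed-Y", b"rsa-Y"))
print(may_extend_over(b"ed-Y", b"rsa-Y", b"ed-Y", None))
```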

4. UI updates

   In phase 1 we can update our UIs to refer to all relays that have
   Ed25519 keys by their Ed25519 keys.  We can update our configuration
   and control port interfaces so that they accept Ed keys as well as
   RSA keys.

   During phase 1, we should warn about identifying any dual-identity
   relays by their Ed identity alone.

   For backward compatibility, we should consider a default that refers
   to Ed25519 relays by the first 160 bits of their key.
   This would allow many controller-based tools to work transparently
   with the new key types.
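
   As a rough illustration of that default, the Python 3 sketch below
   truncates a 32-byte Ed25519 key to its first 160 bits and renders it as
   40 hexadecimal characters. The uppercase hex rendering is an assumption
   made so that the result resembles an existing RSA fingerprint; the
   proposal does not specify an encoding.

```python
def legacy_style_id(ed25519_key):
    """First 160 bits (20 bytes) of an Ed25519 key as 40 hex characters."""
    if len(ed25519_key) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    return ed25519_key[:20].hex().upper()

# Placeholder key bytes, not a real relay identity.
example_key = bytes(range(32))
print(legacy_style_id(example_key))
```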

5. Changes to external tools

   This is the big one.  We need a relatively comprehensive list of
   tools we can break with the above changes.  Anything that refers to
   relays by SHA1(RSA1024_id) will need to be able to receive, store,
   and use an Ed25519 key instead.

6. Testing

   Before going forward with phase 2 and phase 3, we need to verify that
   we did phase 1 correctly.  To do so, we should create a small
   temporary testing network, and verify that it works correctly as we
   make the phase 2 and phase 3 changes.


Filename: 249-large-create-cells.txt
Title: Allow CREATE cells with >505 bytes of handshake data
Authors: Nick Mathewson, Isis Lovecruft
Created: 23 July 15
Updated: 13 December 2017
Status: Superseded
Superseded-By: 319-wide-everything.md

1. Summary

   There have been multiple proposals over the last year or so for
   adding post-quantum cryptography to Tor's circuit extension
   handshakes.  (See for example https://eprint.iacr.org/2015/008 or
   https://eprint.iacr.org/2015/287 .) These proposals share the property
   that the request and reply for a handshake message do not fit in a
   single RELAY cell.

   In this proposal I describe a new CREATE2V cell for handshakes that
   don't fit in a 505-byte CREATE2 cell's HDATA section, and a means for
   fragmenting these CREATE2V cells across multiple EXTEND2 cells.  I
   also discuss replies, migration, and DoS-mitigation strategies.

2. CREATE2V and CREATED2V

   First, we add two variable-width cell types, CREATE2V and CREATED2V.

   These cell formats are nearly the same as CREATE2 and CREATED2.  (Here
   specified using Trunnel.)

     struct create2v_cell_body {
        /* Handshake type */
        u16 htype;
        /* Length of handshake data */
        u16 hlen;
        /* Handshake data */
        u8 hdata[hlen];
        /* Padding data to be ignored */
        u8 ignored[];
     };

     struct created2v_cell_body {
        /* Handshake reply length */
        u16 hlen;
        /* Handshake reply data */
        u8 hdata[hlen];
        /* Padding data to be ignored */
        u8 ignored[];
     };

   The 'ignored' fields, which extend to the end of the variable-length
   cells, are reserved.  Initiators MAY set them to any length, and MUST
   fill them with either zero-valued bytes or pseudo-random bytes.
   Responders MUST ignore them, regardless of what they contain.  When a
   CREATE2V cell is generated in response to a set of EXTEND2 cells, these
   fields are set by the relay that receives the EXTEND2 cells.

   (The purpose of the 'ignored' fields here is future-proofing and
   padding.)

   Protocols MAY wish to pad to a certain multiple of bytes, or wish to pad
   the initiator/receiver payloads to be of equal length.  This is
   encouraged but NOT REQUIRED.
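
   For illustration, the CREATE2V body above could be encoded and parsed
   as in the following Python 3 sketch. It assumes network (big-endian)
   byte order for the u16 fields and fills the 'ignored' field with
   zero-valued bytes; the function names are hypothetical.

```python
import struct

def encode_create2v_body(htype, hdata, pad_len=0):
    """htype (u16), hlen (u16), hdata, then pad_len zero 'ignored' bytes."""
    return struct.pack("!HH", htype, len(hdata)) + hdata + bytes(pad_len)

def parse_create2v_body(body):
    """Return (htype, hdata); anything after hdata is ignored."""
    if len(body) < 4:
        raise ValueError("truncated CREATE2V header")
    htype, hlen = struct.unpack_from("!HH", body)
    if len(body) < 4 + hlen:
        raise ValueError("truncated handshake data")
    return htype, body[4:4 + hlen]

# Placeholder handshake type and data, padded with 5 ignored bytes.
body = encode_create2v_body(0x0002, b"handshake-bytes", pad_len=5)
print(parse_create2v_body(body))
```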

3. Fragmented EXTEND2 cells

   Without changing the current EXTEND2 cell format, we change its
   semantics:

   If the 'HLEN' field in an EXTEND2 cell describes a handshake data
   section that would be too long to fit in the EXTEND2 cell's payload,
   the handshake data of the EXTEND2 cell is to be continued in one or
   more subsequent EXTEND2 cells.  These subsequent cells MUST have zero
   link specifiers, handshake type 0xFFFF, and handshake data length
   field set to zero.

   Similarly, if the 'HLEN' field in an EXTENDED2 cell describes a
   handshake reply that would be too long to fit into the EXTENDED2
   cell's payload, the handshake reply data of the EXTENDED2 cell is to
   be continued in one or more subsequent EXTENDED2 cells.  These
   subsequent cells must have the handshake data length field set to
   zero.

   These cells must be sent on the circuit with no intervening cells.
   If any intervening cells are received, the receiver SHOULD destroy
   the circuit.

   Protocols which make use of CREATE(D)2V cells SHOULD send an equal number
   of cells in either direction, to avoid trivially disclosing information
   about the direction of the circuit: for example a relay might use the
   fact that it saw five EXTEND2 cells in one direction and three in the
   other to easily determine whether it is the middle relay on the onion
   service-side or the middle relay on the client-side of a rendezvous
   circuit.

4. Interacting with RELAY_EARLY cells

   The first EXTEND2 cell in a batch must arrive in a RELAY_EARLY cell.
   The others MAY arrive in RELAY_EARLY cells.  For many handshakes, and
   for many possible circuit lengths, sending all EXTEND2 cells
   inside RELAY_EARLY cells will not be possible.  For example, for a
   fragmented EXTEND2 cell with parts A B C D E, A is the only fragment that
   MUST be sent within a RELAY_EARLY.  Parts B C D E are merely
   sent as EXTEND2{CREATE2V} cells.

   Note that this change leaks the size of the handshake being used to
   intermediate relays.  We should analyze this and see whether it matters.
   Clients and relays MAY send RELAY_DROP cells during circuit
   construction in order to hide the true size of their handshakes
   (but they can't send these drop cells inside a train of EXTEND2 or
   EXTENDED2 cells for a given handshake).

5. Example

   So for example, if we are a client, and we need to send a 2000-byte
   handshake to extend a circuit from relay X to relay Y, we might send
   cells as follows:

      EXTEND2 {
        nspec = 2;
          lstype = [0x01 || 0x02];          (IPv4 or IPv6 node address)
          lslen =  [0x04 || 0x16];
          lspec =  { node address for Y, taking 8 bytes or 16 bytes};
          lstype = 0x03;                    (An ed25519 node identity)
          lslen = 32;
          lspec = { ed25519 node ID for Y, taking 32 bytes }
        htype = {whatever the handshake type is.}
        hlen = 2000
        hdata = { the first 462 bytes of the handshake }
      }
      EXTEND2 {
        nspec = 0;
        htype = 0xffff;
        hlen = 0;
        hdata = { the next 492 bytes of the handshake }
      }
      EXTEND2 {
        nspec = 0;
        htype = 0xffff;
        hlen = 0;
        hdata = { the next 492 bytes of the handshake }
      }
      EXTEND2 {
        nspec = 0;
        htype = 0xffff;
        hlen = 0;
        hdata = { the next 492 bytes of the handshake }
      }
      EXTEND2 {
        nspec = 0;
        htype = 0xffff;
        hlen = 0;
        hdata = { the final 62 bytes of the handshake }
      }

   Upon receiving this last cell, the relay X would send a CREATE2V cell
   to Y, containing the entire handshake.
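
   The splitting and reassembly in this example can be sketched in
   Python 3 as follows. Link specifiers are left out, cells are modelled as
   plain dictionaries, and the per-cell capacities (462 bytes in the first
   cell, 492 in each continuation) are simply copied from the example
   rather than derived from the cell format.

```python
CONTINUATION_HTYPE = 0xFFFF

def fragment_handshake(htype, hdata, first_capacity, cont_capacity):
    """Split hdata into one initial EXTEND2 body plus continuations."""
    cells = [{"htype": htype, "hlen": len(hdata),
              "hdata": hdata[:first_capacity]}]
    for start in range(first_capacity, len(hdata), cont_capacity):
        cells.append({"nspec": 0, "htype": CONTINUATION_HTYPE, "hlen": 0,
                      "hdata": hdata[start:start + cont_capacity]})
    return cells

def reassemble(cells):
    """Rebuild the handshake; the first cell's hlen is the total length."""
    first, rest = cells[0], cells[1:]
    data = first["hdata"]
    for cell in rest:
        if cell["htype"] != CONTINUATION_HTYPE or cell["hlen"] != 0:
            raise ValueError("not a continuation cell")
        data += cell["hdata"]
    if len(data) != first["hlen"]:
        raise ValueError("handshake length does not match hlen")
    return data

handshake = bytes(2000)  # placeholder 2000-byte handshake
cells = fragment_handshake(0x0002, handshake, 462, 492)
print([len(cell["hdata"]) for cell in cells])
```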

6. Migration

   We can and should implement the EXTEND2 fragmentation feature before
   we implement anything that uses it.  If we can get it widely deployed
   before it's needed, we can use the new handshake types whenever both
   of the involved relays support this proposal.

   Clients MUST NOT send fragmented EXTEND2 cells to relays that don't
   support them, since this would cause them to close the circuit.

   Relays MAY send CREATE2V and CREATED2V cells to relays that don't
   support them, since unrecognized cell types are ignored.

6.1. New Subprotocols and Subprotocol Versions

   This proposal introduces, following prop#264, the following new
   subprotocol numbers and their uses.

6.1.1. Relay Subprotocol

     "Relay 3" -- The OP supports all of "Relay 2", plus support for CREATE2V
       and CREATED2V cells and their above specification for link-layer
       authentication specifiers.

6.1.2. Link Subprotocol

     "Link 5": The OP supports all of "Link 1-4", plus support for the new
       EXTEND2 semantics.  Namely, it understands that an EXTEND2 cell whose
       "hlen" field is greater than 505 will be followed by further "hdata"
       in fragmented EXTEND2 cells which MUST follow.  It also understands
       that the following combination of EXTEND2 payload specifiers
       indicates that the cell is a continuation of the earlier payload
       portions:

           nspec = 0;
           htype = 0xffff;
           hlen = 0;

6.1.3. Handshake Subprotocol

   Additionally, we introduce a new subprotocol, "Handshake" and the
   following number assignments for previously occurring instances:

     "Handshake 1" -- The OP supports the TAP handshake.

     "Handshake 2" -- The OP supports the ntor handshake.

   We also reserve the following assignments for future use:

     "Handshake 3" -- The OP supports the "hybrid+null" ntor-like handshake
       from prop#269.

     "Handshake 4" -- The OP supports a(n as yet unspecified) post-quantum
       secure hybrid handshake, that is, the "hybrid+null" handshake from
       "Handshake 3", except with "null" part replaced with another (as yet
       unspecified) protocol to be composed with the ntor-like ECDH-based
       handshake.

   Further handshakes MUST be specified with "Handshake" subprotocol
   numbers, and MUST NOT be specified with "Relay" subprotocol numbers.  The
   "Relay" subprotocol SHALL be used in the future to denote changes to
   handshake protocol handling of CREATE* and EXTEND* cells, i.e. CREATE,
   CREATED, CREATE_FAST, CREATED_FAST, CREATE2, CREATED2, CREATE2V,
   CREATED2V, EXTEND, EXTENDED, EXTEND2, and EXTENDED2.

   Thus, "Handshake 1" is taken to be synonymous with "Relay 1", and
   likewise "Handshake 2" with "Relay 2".

6.2. Subprotocol Recommendations

   After the subprotocol additions above, we change to recommending the
   following in the consensus:

      recommended-client-protocols […] Link=5 Relay=3 Handshake=2
      recommended-relay-protocols  […] Link=5 Relay=3 Handshake=2
      required-client-protocols    […] Link=4-5 Relay=2-3 Handshake=1-2
      required-relay-protocols     […] Link=3-5 Relay=1-3 Handshake=1-2
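
   To show how such lines might be evaluated, the Python 3 sketch below
   parses the "Name=versions" entries and reports which required versions
   an implementation lacks. It assumes that every listed required version
   must be supported, and it ignores the elided "[…]" part of the lines;
   the function names are hypothetical.

```python
def parse_protocols(text):
    """Parse 'Link=4-5 Relay=2-3' into {'Link': {4, 5}, 'Relay': {2, 3}}."""
    result = {}
    for entry in text.split():
        name, _, versions = entry.partition("=")
        supported = set()
        for part in versions.split(","):
            low, _, high = part.partition("-")
            supported.update(range(int(low), int(high or low) + 1))
        result[name] = supported
    return result

def missing_protocols(required, supported):
    """Versions in `required` that `supported` does not provide."""
    missing = {}
    for name, versions in required.items():
        gap = versions - supported.get(name, set())
        if gap:
            missing[name] = gap
    return missing

required = parse_protocols("Link=4-5 Relay=2-3 Handshake=1-2")
old_client = parse_protocols("Link=1-4 Relay=1-2 Handshake=1-2")
print(missing_protocols(required, old_client))
```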

6.3. New Consensus Parameters

   We introduce the following new consensus parameters:

     Create2VMaximumData SP int
        The maximum amount of "hlen" data, in bytes, which may be carried in
        either direction within a set of CREATE(D)2V cells.  (default: 10240)

7. Resource management issues

   This feature requires relays and clients to buffer EXTEND2 cell
   bodies for incoming cells until the entire CREATE2V/CREATED2V body
   has arrived.  To avoid memory-related denial-of-service attacks,
   the buffers allocated for this data need to be counted against the
   total data usage of the circuit.

   Further, circuits which receive and buffer CREATE(D)2V cells MUST store
   the time the first buffer chunk was allocated, and use it to inform the
   OOM manager w.r.t. the amount of data used and its staleness.
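
   A minimal sketch of this bookkeeping, in Python 3, might look as
   follows. The class, its method names, and the way an OOM manager would
   query it are all hypothetical; only the Create2VMaximumData default of
   10240 comes from this proposal.

```python
import time

CREATE2V_MAXIMUM_DATA = 10240  # default of the Create2VMaximumData parameter

class HandshakeBuffer:
    """Accumulates handshake fragments for one pending CREATE(D)2V body."""

    def __init__(self, expected_len, max_data=CREATE2V_MAXIMUM_DATA,
                 clock=time.monotonic):
        if expected_len > max_data:
            raise ValueError("handshake exceeds Create2VMaximumData")
        self.expected_len = expected_len
        self.buffered = 0
        self.chunks = []
        self.first_chunk_time = None  # set when the first chunk is stored
        self._clock = clock

    def add(self, chunk):
        """Store one fragment; return True once the body is complete."""
        if self.buffered + len(chunk) > self.expected_len:
            raise ValueError("more data than the announced hlen")
        if self.first_chunk_time is None:
            self.first_chunk_time = self._clock()
        self.chunks.append(chunk)
        self.buffered += len(chunk)
        return self.buffered == self.expected_len

    def oom_report(self):
        """(bytes held, seconds since the first chunk) for a memory manager."""
        if self.first_chunk_time is None:
            return self.buffered, 0.0
        return self.buffered, self._clock() - self.first_chunk_time

buf = HandshakeBuffer(expected_len=2000)
print(buf.add(bytes(462)), buf.oom_report()[0])
```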


Appendix A. A rejected idea for migration

   In section 6 above, I gave up on the idea of allowing relay A to
   extend to relay B with a large CREATE cell when relay A does not
   support this proposal.

   There are other ways to do this, but they are impressively kludgey.
   For example, we could have a fake CREATE cell for new handshake types
   that always elicits a "yes, keep going!" CREATED cell.  Then the
   client could send the rest of the handshake and receive the rest of
   the CREATED cell as RELAY cells inside the circuit.

   This design would add an extra round-trip to circuit extension
   whenever it was used, however, and would violate a number of Tor's
   assumptions about circuits (e.g., by having half-created circuits,
   where authentication hasn't actually been performed).  So I'm
   guessing we shouldn't do that.

Appendix B. Acknowledgements

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

Filename: 250-commit-reveal-consensus.txt
Title: Random Number Generation During Tor Voting
Authors: David Goulet, George Kadianakis
Created: 2015-08-03
Status: Closed
Supersedes: 225

UPDATE 2017/01/26: This proposal now has its own specification file as srv-spec.txt.

   Table Of Contents:

      1. Introduction
         1.1. Motivation
         1.2. Previous work
      2. Overview
         2.1. Introduction to our commit-and-reveal protocol
         2.2. Ten thousand feet view of the protocol
         2.3. How we use the consensus [CONS]
            2.3.1. Inserting Shared Random Values in the consensus
         2.4. Persistent State of the Protocol [STATE]
         2.5. Protocol Illustration
      3. Protocol
         3.1 Commitment Phase [COMMITMENTPHASE]
            3.1.1. Voting During Commitment Phase
            3.1.2. Persistent State During Commitment Phase [STATECOMMIT]
         3.2 Reveal Phase
            3.2.1. Voting During Reveal Phase
            3.2.2. Persistent State During Reveal Phase [STATEREVEAL]
         3.3. Shared Random Value Calculation At 00:00UTC
            3.3.1. Shared Randomness Calculation [SRCALC]
         3.4. Bootstrapping Procedure
         3.5. Rebooting Directory Authorities [REBOOT]
      4. Specification [SPEC]
         4.1. Voting
            4.1.1. Computing commitments and reveals [COMMITREVEAL]
            4.1.2. Validating commitments and reveals [VALIDATEVALUES]
            4.1.4. Encoding commit/reveal values in votes [COMMITVOTE]
            4.1.5. Shared Random Value [SRVOTE]
         4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS]
         4.3. Persistent state format [STATEFORMAT]
      5. Security Analysis
         5.1. Security of commit-and-reveal and future directions
         5.2. Predicting the shared random value during reveal phase
         5.3. Partition attacks
            5.3.1. Partition attacks during commit phase
            5.3.2. Partition attacks during reveal phase
      6. Discussion
         6.1. Why the added complexity from proposal 225?
         6.2. Why do you do a commit-and-reveal protocol in 24 rounds?
         6.3. Why can't we recover if the 00:00UTC consensus fails?
      7. Acknowledgements


1. Introduction

1.1. Motivation

   For the next generation hidden services project, we need the Tor network to
   produce a fresh random value every day in such a way that it cannot be
   predicted in advance or influenced by an attacker.

   Currently we need this random value to make the HSDir hash ring
   unpredictable (#8244), which should resolve a wide class of hidden service
   DoS attacks and should make it harder for people to gauge the popularity
   and activity of target hidden services. Furthermore this random value can
   be used by other systems in need of fresh global randomness like
   Tor-related protocols (e.g. OnioNS) or even non-Tor-related (e.g. warrant
   canaries).

1.2. Previous work

   Proposal 225 specifies a commit-and-reveal protocol that can be run as an
   external script and have the results be fed to the directory authorities.
   However, directory authority operators feel unsafe running a third-party
   script that opens TCP ports and accepts connections from the Internet.
   Hence, this proposal aims to embed the commit-and-reveal idea in the Tor
   voting process which should make it smoother to deploy and maintain.

   Another idea proposed specifically for Tor is Nick Hopper's "A threshold
   signature-based proposal for a shared RNG" which was never turned into an
   actual Tor proposal.

2. Overview

   This proposal alters the Tor consensus protocol such that a random number is
   generated every midnight by the directory authorities during the regular voting
   process. The distributed random generator scheme is based on the
   commit-and-reveal technique.

   The proposal also specifies how the final shared random value is embedded
   in consensus documents so that clients who need it can get it.

2.1. Introduction to our commit-and-reveal protocol

   Every day, before voting for the consensus at 00:00UTC each authority
   generates a new random value and keeps it for the whole day. The authority
   cryptographically hashes the random value and calls the output its
   "commitment" value. The original random value is called the "reveal" value.

   The idea is that given a reveal value you can cryptographically confirm that
   it corresponds to a given commitment value (by hashing it). However given a
   commitment value you should not be able to derive the underlying reveal
   value. The construction of these values is specified in section [COMMITREVEAL].

2.2. Ten thousand feet view of the protocol

   Our commit-and-reveal protocol aims to produce a fresh shared random value
   every day at 00:00UTC. The final fresh random value is embedded in the
   consensus document at that time.

   Our protocol has two phases and uses the hourly voting procedure of Tor.
   Each phase lasts 12 hours, which means that 12 voting rounds happen in
   between. In short, the protocol works as follows:

      Commit phase:

        Starting at 00:00UTC and for a period of 12 hours, authorities every
        hour include their commitment in their votes. They also include any
        received commitments from other authorities, if available.

      Reveal phase:

        At 12:00UTC, the reveal phase starts and lasts till the end of the
        protocol at 00:00UTC. In this stage, authorities must reveal the value
        they committed to in the previous phase. The commitment and revealed
        values from other authorities, when available, are also added to the
        vote.

      Shared Randomness Calculation:

        At 00:00UTC, the shared random value is computed from the agreed
        revealed values and added to the consensus.

   This concludes the commit-and-reveal protocol at 00:00UTC every day.
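
   As a rough sketch of the schedule above, the phase and round of a vote
   can be derived from its valid-after time. The function names are
   hypothetical, and the sketch assumes hourly voting rounds aligned to the
   UTC hour.

```python
from datetime import datetime, timezone

ROUNDS_PER_PHASE = 12  # one voting round per hour, 12 hours per phase


def sr_phase(valid_after: datetime) -> str:
    """Return "commit" or "reveal" for a timezone-aware valid-after time."""
    hour = valid_after.astimezone(timezone.utc).hour
    return "commit" if hour < ROUNDS_PER_PHASE else "reveal"


def round_in_phase(valid_after: datetime) -> int:
    """Return the 1-based voting round within the current phase."""
    hour = valid_after.astimezone(timezone.utc).hour
    return hour % ROUNDS_PER_PHASE + 1
```

   Under these assumptions the 00:00UTC vote is round 1 of the commit phase
   and the 12:00UTC vote is round 1 of the reveal phase.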

2.3. How we use the consensus [CONS]

   The produced shared random values need to be readily available to
   clients. For this reason we include them in the consensus documents.

   Every hour the consensus documents need to include the shared random value
   of the day, as well as the shared random value of the previous day. That's
   because either of these values might be needed at a given time for a Tor
   client to access a hidden service according to section [TIME-OVERLAP] of
   proposal 224. This means that both of these two values need to be included
   in votes as well.

   Hence, consensuses need to include:

      (a) The shared random value of the current time period.
      (b) The shared random value of the previous time period.

   For this, a new SR consensus method will be needed to indicate which
   authorities support this new protocol.

2.3.1. Inserting Shared Random Values in the consensus

   After voting happens, we need to be careful about how we pick which shared
   random values (SRV) to put in the consensus, to avoid breaking the consensus
   because of authorities having different views of the commit-and-reveal
   protocol (because maybe they missed some rounds of the protocol).

   For this reason, authorities look at the received votes before creating a
   consensus and employ the following logic:

   - First of all, they make sure that the agreed upon consensus method is
     above the SR consensus method.

   - Authorities include an SRV in the consensus if and only if the SRV has
     been voted by at least the majority of authorities.

   - For the consensus at 00:00UTC, authorities include an SRV in the consensus
     if and only if the SRV has been voted by at least AuthDirNumAgreements
     authorities (where AuthDirNumAgreements is a newly introduced consensus
     parameter).

   Authorities include in the consensus the most popular SRV that also
   satisfies the above constraints. Otherwise, no SRV should be included.
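
   The selection logic above might be sketched as follows. The function
   and argument names are hypothetical. The sketch assumes that the
   consensus-method check has already passed, and it reads the 00:00UTC
   rule as applying in addition to the majority rule; that reading is an
   assumption, not something this proposal states explicitly.

```python
from collections import Counter


def pick_srv(voted_srvs, num_authorities, is_midnight_round, num_agreements):
    """Pick the SRV to put in the consensus, or None.

    voted_srvs: one SRV value per vote that lists an SRV.
    num_agreements: the AuthDirNumAgreements consensus parameter.
    """
    if not voted_srvs:
        return None
    majority = num_authorities // 2 + 1
    # Assumption: at 00:00UTC both thresholds must be met.
    threshold = max(majority, num_agreements) if is_midnight_round else majority
    value, count = Counter(voted_srvs).most_common(1)[0]
    return value if count >= threshold else None
```

   Because the threshold is always at least a strict majority, at most one
   SRV can satisfy it, so no tie-breaking rule is needed in this sketch.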

   The above logic is used to make it harder to break the consensus by natural
   partitioning causes.

   We use the AuthDirNumAgreements consensus parameter to enforce that a
   _supermajority_ of dirauths supports the SR protocol during SRV creation, so
   that even if a few of those dirauths drop offline in the middle of the run
   the SR protocol does not get disturbed. We go to extra lengths to ensure
   this because changing SRVs in the middle of the day has terrible
   reachability consequences for hidden service clients.

2.4. Persistent State of the Protocol [STATE]

   A directory authority needs to keep a persistent state on disk of the
   ongoing protocol run. This allows an authority to join the protocol
   seamlessly in the case of a reboot.

   During the commitment phase, it is populated with the commitments of all
   authorities. Then during the reveal phase, the reveal values are also
   stored in the state.

   As discussed previously, the shared random values from the current and
   previous time period must also be present in the state at all times if they
   are available.

2.5. Protocol Illustration

   An illustration for better understanding the protocol can be found here:
         https://people.torproject.org/~asn/hs_notes/shared_rand.jpg

   It reads left-to-right.

   The illustration displays what the authorities (A_1, A_2, A_3) put in their
   votes. A chain 'A_1 -> c_1 -> r_1' denotes that authority A_1 committed to
   the value c_1 which corresponds to the reveal value r_1.

   The illustration depicts only a few rounds of the whole protocol. It starts
   with the first three rounds of the commit phase, then it jumps to the last
   round of the commit phase. It continues with the first two rounds of the
   reveal phase and then it jumps to the final round of the protocol run. It
   finally shows the first round of the commit phase of the next protocol run
   (00:00UTC) where the final Shared Random Value is computed. In our fictional
   example, the SRV was computed with 3 authority contributions and its value
   is "a56fg39h".

   We advise you to revisit this after you have read the whole document.

3. Protocol

   In this section we give a detailed specification of the protocol. We
   describe the protocol participants' logic and the messages they send. The
   encoding of the messages is specified in the next section ([SPEC]).

   Now we go through the phases of the protocol:

3.1 Commitment Phase [COMMITMENTPHASE]

   The commit phase lasts from 00:00UTC to 12:00UTC.

   During this phase, an authority commits a value in its vote and
   saves it to the permanent state as well.

   Authorities also save any received authoritative commits by other authorities
   in their permanent state. We call a commit by Alice "authoritative" if it was
   included in Alice's vote.

3.1.1. Voting During Commitment Phase

   During the commit phase, each authority includes in its votes:

    - The commitment value for this protocol run.
    - Any authoritative commitments received from other authorities.
    - The two previous shared random values produced by the protocol (if any).

   The commit phase lasts for 12 hours, so authorities have multiple chances to
   commit their values. An authority MUST NOT commit a second value during a
   subsequent round of the commit phase.

   If an authority publishes a second commitment value in the same commit
   phase, only the first commitment should be taken into account by other
   authorities. Any subsequent commitments MUST be ignored.

3.1.2. Persistent State During Commitment Phase [STATECOMMIT]

   During the commitment phase, authorities save in their persistent state the
   authoritative commits they have received from each authority. Only one commit
   per authority must be considered trusted and active at a given time.
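
   A minimal sketch of the "first commitment wins" rule from the two
   subsections above. The CommitState class is hypothetical and keeps its
   state in memory only; a real authority would also write it to disk.

```python
class CommitState:
    """Hypothetical store of authoritative commits for one protocol run."""

    def __init__(self):
        self.commits = {}  # authority identity fingerprint -> commit value

    def record_commit(self, identity: str, commit: str) -> bool:
        """Store a commit; return False if it was ignored.

        Only the first commit seen for an authority is kept. Any later
        commit from the same authority is ignored, whether or not it
        carries the same value.
        """
        if identity in self.commits:
            return False
        self.commits[identity] = commit
        return True
```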

3.2 Reveal Phase

   The reveal phase lasts from 12:00UTC to 00:00UTC.

   Now that the commitments have been agreed on, it's time for authorities to
   reveal their random values.

3.2.1. Voting During Reveal Phase

   During the reveal phase, each authority includes in its votes:

    - Its reveal value that was previously committed in the commit phase.
    - All the commitments and reveals received from other authorities.
    - The two previous shared random values produced by the protocol (if any).

   The set of commitments has been decided during the commitment
   phase and must remain the same. If an authority tries to change its
   commitment during the reveal phase or introduce a new commitment,
   the new commitment MUST be ignored.

3.2.2. Persistent State During Reveal Phase [STATEREVEAL]

   During the reveal phase, authorities keep the authoritative commits from the
   commit phase in their persistent state. They also save any received reveals
   that correspond to authoritative commits and are valid (as specified in
   [VALIDATEVALUES]).

   An authority that just received a reveal value from another authority's vote,
   MUST wait till the next voting round before including that reveal value in
   its votes.

3.3. Shared Random Value Calculation At 00:00UTC

   Finally, at 00:00UTC every day, authorities compute a fresh shared random
   value and this value must be added to the consensus so clients can use it.

   Authorities calculate the shared random value using the reveal values in
   their state as specified in subsection [SRCALC].

   Authorities at 00:00UTC start including this new shared random value in
   their votes, replacing the one from two protocol runs ago. Authorities also
   start including this new shared random value in the consensus as well.

   Apart from that, authorities at 00:00UTC proceed voting normally as they
   would in the first round of the commitment phase (section [COMMITMENTPHASE]).

3.3.1. Shared Randomness Calculation [SRCALC]

   An authority that wants to derive the shared random value SRV, should use
   the appropriate reveal values for that time period and calculate SRV as
   follows.

      HASHED_REVEALS = H(ID_a | R_a | ID_b | R_b | ..)

      SRV = SHA3-256("shared-random" | INT_8(REVEAL_NUM) | INT_4(VERSION) |
                     HASHED_REVEALS | PREVIOUS_SRV)

   where the ID_a value is the identity key fingerprint of authority 'a' and R_a
   is the corresponding reveal value of that authority for the current period.

   Also, REVEAL_NUM is the number of revealed values in this construction,
   VERSION is the protocol version number and PREVIOUS_SRV is the previous
   shared random value. If no previous shared random value is known, then
   PREVIOUS_SRV is set to 32 NUL (\x00) bytes.

   To maintain consistent ordering in HASHED_REVEALS, all the ID_a | R_a pairs
   are ordered based on the R_a value in ascending order.
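
   The calculation above can be sketched as follows. Several details are
   assumptions made for illustration: H is taken to be SHA3-256 (the
   ALGNAME for version 1), INT_8 and INT_4 are taken to be big-endian
   unsigned integers of 8 and 4 bytes, and the identity fingerprints and
   reveal values are passed as raw bytes. The function name is
   hypothetical.

```python
import hashlib
import struct

PROTOCOL_VERSION = 1


def compute_srv(reveals, previous_srv=None, version=PROTOCOL_VERSION):
    """Compute the SRV from a dict of identity (bytes) -> reveal (bytes)."""
    if previous_srv is None:
        previous_srv = b"\x00" * 32  # no previous SRV known: 32 NUL bytes
    # Order the ID | R pairs by reveal value, ascending.
    ordered = sorted(reveals.items(), key=lambda pair: pair[1])
    hashed_reveals = hashlib.sha3_256(
        b"".join(identity + reveal for identity, reveal in ordered)
    ).digest()
    return hashlib.sha3_256(
        b"shared-random"
        + struct.pack(">Q", len(reveals))  # INT_8(REVEAL_NUM)
        + struct.pack(">I", version)       # INT_4(VERSION)
        + hashed_reveals
        + previous_srv
    ).digest()
```

   Sorting before hashing means that two authorities holding the same set
   of reveals derive the same value regardless of the order in which they
   received them.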

3.4. Bootstrapping Procedure

   As described in [CONS], two shared random values are required for the HSDir
   overlay periods to work properly as specified in proposal 224. Hence
   clients MUST NOT use the randomness of this system till it has bootstrapped
   completely; that is, until two shared random values are included in a
   consensus. This should happen after three 00:00UTC consensuses have been
   produced, which takes 48 hours.

3.5. Rebooting Directory Authorities [REBOOT]

   The shared randomness protocol must be able to support directory
   authorities who leave or join in the middle of the protocol execution.

   An authority that commits in the Commitment Phase and then leaves MUST have
   stored its reveal value on disk so that it continues participating in the
   protocol if it returns before or during the Reveal Phase. The reveal value
   MUST be stored timestamped to avoid sending it on wrong protocol runs.

   An authority that misses the Commitment Phase cannot commit anymore, so it's
   unable to participate in the protocol for that run. Same goes for an
   authority that misses the Reveal phase. Authorities who do not participate in
   the protocol SHOULD still carry commits and reveals of others in their vote.

   Finally, authorities MUST implement their persistent state in such a way that they
   will never commit two different values in the same protocol run, even if they
   have to reboot in the middle (assuming that their persistent state file is
   kept). A suggested way to structure the persistent state is found at [STATEFORMAT].

4. Specification [SPEC]

4.1. Voting

   This section describes how commitments, reveals and SR values are encoded in
   votes. We describe how to encode both the authority's own
   commitments/reveals and also the commitments/reveals received from the other
   authorities. Commitments and reveals share the same line, but reveals are
   optional.

   Participating authorities need to include the line:
                 "shared-rand-participate"
   in their votes to announce that they take part in the protocol.

4.1.1. Computing commitments and reveals [COMMITREVEAL]

   A directory authority that wants to participate in this protocol needs to
   create a new pair of commitment/reveal values for every protocol
   run. Authorities SHOULD generate a fresh pair of such values right before the
   first commitment phase of the day (at 00:00UTC).

   The value REVEAL is computed as follows:

      REVEAL = base64-encode( TIMESTAMP || H(RN) )

      where RN is the SHA3 hashed value of a 256-bit random value. We hash the
      random value to avoid exposing raw bytes from our PRNG to the network (see
      [RANDOM-REFS]).

      TIMESTAMP is an 8-byte network-endian time_t value. Authorities SHOULD
      set TIMESTAMP to the valid-after time of the vote document they first plan
      to publish their commit into (so usually at 00:00UTC, except if they start
      up in a later commit round).

   The value COMMIT is computed as follows:

      COMMIT = base64-encode( TIMESTAMP || H(REVEAL) )
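
   A sketch of the two formulas above, read literally. The helper names
   are hypothetical. The sketch assumes that H is SHA3-256, that TIMESTAMP
   is packed as an 8-byte big-endian unsigned integer, and that H(REVEAL)
   is computed over the base64-encoded REVEAL; the text does not say which
   form of REVEAL is hashed.

```python
import base64
import hashlib
import os
import struct
from typing import Optional


def h(data: bytes) -> bytes:
    """H: assumed to be SHA3-256."""
    return hashlib.sha3_256(data).digest()


def make_reveal(timestamp: int, random_value: Optional[bytes] = None) -> bytes:
    """REVEAL = base64-encode( TIMESTAMP || H(RN) )."""
    if random_value is None:
        random_value = os.urandom(32)  # 256-bit random value
    rn = h(random_value)               # RN is the hashed random value
    return base64.b64encode(struct.pack(">Q", timestamp) + h(rn))


def make_commit(timestamp: int, reveal: bytes) -> bytes:
    """COMMIT = base64-encode( TIMESTAMP || H(REVEAL) )."""
    return base64.b64encode(struct.pack(">Q", timestamp) + h(reveal))
```

   Both values decode to 40 bytes: the 8-byte timestamp followed by a
   32-byte digest.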

4.1.2. Validating commitments and reveals [VALIDATEVALUES]

   Given a COMMIT message and a REVEAL message it should be possible to verify
   that they indeed correspond. To do so, the client extracts the random value
   H(RN) from the REVEAL message, hashes it, and compares it with the H(H(RN))
   from the COMMIT message. We say that the COMMIT and REVEAL messages
   correspond, if the comparison was successful.

   Participants MUST also check that corresponding COMMIT and REVEAL values
   have the same timestamp value.

   Authorities should ignore reveal values during the Reveal Phase that don't
   correspond to commit values published during the Commitment Phase.
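
   The checks in this section might look like the sketch below. It assumes
   the literal construction of [COMMITREVEAL]: both values are base64
   encodings of an 8-byte timestamp followed by a SHA3-256 digest, and the
   digest in COMMIT is taken over the base64-encoded REVEAL. The function
   name is hypothetical.

```python
import base64
import hashlib
import hmac


def commit_matches_reveal(commit: bytes, reveal: bytes) -> bool:
    """Return True if REVEAL corresponds to COMMIT and timestamps agree."""
    try:
        raw_commit = base64.b64decode(commit, validate=True)
        raw_reveal = base64.b64decode(reveal, validate=True)
    except ValueError:
        return False  # not valid base64
    if len(raw_commit) != 40 or len(raw_reveal) != 40:
        return False
    # Corresponding values must carry the same timestamp.
    if raw_commit[:8] != raw_reveal[:8]:
        return False
    # Assumption: the commit digest covers the encoded REVEAL.
    expected = hashlib.sha3_256(reveal).digest()
    return hmac.compare_digest(raw_commit[8:], expected)
```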

4.1.4. Encoding commit/reveal values in votes [COMMITVOTE]

   An authority puts in its vote the commitments and reveals it has produced and
   seen from the other authorities. To do so, it includes the following in its
   votes:

      "shared-rand-commit" SP VERSION SP ALGNAME SP IDENTITY SP COMMIT [SP REVEAL] NL

   where VERSION is the version of the protocol the commit was created with.
   IDENTITY is the authority's SHA1 identity fingerprint and COMMIT is the
   encoded commit [COMMITREVEAL].  Authorities during the reveal phase can
   also optionally include an encoded reveal value REVEAL.  There MUST be only
   one line per authority else the vote is considered invalid. Finally, the
   ALGNAME is the hash algorithm that should be used to compute COMMIT and
   REVEAL which is "sha3-256" for version 1.
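
   For illustration, the line format above could be produced and parsed as
   in the sketch below. The function names are hypothetical, and the
   parser checks only the shape of the line, not the validity of the
   values it carries.

```python
def format_commit_line(version, algname, identity, commit, reveal=None):
    """Build a "shared-rand-commit" vote line, ending in NL."""
    fields = ["shared-rand-commit", str(version), algname, identity, commit]
    if reveal is not None:
        fields.append(reveal)  # REVEAL is optional
    return " ".join(fields) + "\n"


def parse_commit_line(line):
    """Return (version, algname, identity, commit, reveal_or_None)."""
    if not line.endswith("\n"):
        raise ValueError("missing NL")
    fields = line[:-1].split(" ")
    if len(fields) not in (5, 6) or fields[0] != "shared-rand-commit":
        raise ValueError("not a shared-rand-commit line")
    _keyword, version, algname, identity, commit = fields[:5]
    reveal = fields[5] if len(fields) == 6 else None
    return int(version), algname, identity, commit, reveal
```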

4.1.5. Shared Random Value [SRVOTE]

  Authorities include a shared random value (SRV) in their votes using the
  following encoding for the previous and current value respectively:

     "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL
     "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL

  where VALUE is the actual shared random value encoded in hex (computed as
  specified in section [SRCALC]). NUM_REVEALS is the number of reveal values
  used to generate this SRV.

  To maintain consistent ordering, the shared random values of the previous
  period should be listed before the values of the current period.

4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS]

   Authorities insert the two active shared random values in the consensus
   following the same encoding format as in [SRVOTE].

4.3. Persistent state format [STATEFORMAT]

   As a way to keep ground truth state in this protocol, an authority MUST
   keep a persistent state of the protocol. The next sub-section suggests a
   format for this state which is the same as the current state file format.

   It contains a preamble, a commitment and reveal section and a list of
   shared random values.

   The preamble (or header) contains the following items. They MUST occur in
   the order given here:

    "Version" SP version NL

        [At start, exactly once.]

        A document format version. For this specification, version is "1".

    "ValidUntil" SP YYYY-MM-DD SP HH:MM:SS NL

        [Exactly once]

        After this time, this state is expired and shouldn't be used nor
        trusted. The validity time period is till the end of the current
        protocol run (the upcoming 00:00UTC).

   The following details the commitment and reveal section. They are encoded
   the same as in the vote. This makes it easier for implementation purposes.

     "Commit" SP version SP algname SP identity SP commit [SP reveal] NL

        [Exactly once per authority]

        The values are the same as detailed in section [COMMITVOTE].

        This line is also used by an authority to store its own value.

   Finally, there is the shared random value section.

     "SharedRandPreviousValue" SP num_reveals SP value NL

        [At most once]

        This is the previous shared random value agreed on at the previous
        period. The fields are the same as in section [SRVOTE].

     "SharedRandCurrentValue" SP num_reveals SP value NL

        [At most once]

        This is the latest shared random value. The fields are the same as in
        section [SRVOTE].
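
   A sketch of a writer for the suggested format. The function and
   argument names are hypothetical; only the keywords and their order come
   from this section.

```python
def serialize_state(valid_until, commits, previous_srv=None, current_srv=None):
    """Serialize the persistent state as text.

    valid_until: "YYYY-MM-DD HH:MM:SS" string.
    commits: list of (version, algname, identity, commit, reveal_or_None).
    previous_srv, current_srv: (num_reveals, hex_value) tuples or None.
    """
    lines = ["Version 1", "ValidUntil " + valid_until]
    seen = set()
    for version, algname, identity, commit, reveal in commits:
        # "Commit" appears exactly once per authority.
        if identity in seen:
            raise ValueError("more than one Commit line for " + identity)
        seen.add(identity)
        fields = ["Commit", str(version), algname, identity, commit]
        if reveal is not None:
            fields.append(reveal)
        lines.append(" ".join(fields))
    if previous_srv is not None:
        lines.append("SharedRandPreviousValue %d %s" % previous_srv)
    if current_srv is not None:
        lines.append("SharedRandCurrentValue %d %s" % current_srv)
    return "".join(line + "\n" for line in lines)
```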

5. Security Analysis

5.1. Security of commit-and-reveal and future directions

   The security of commit-and-reveal protocols is well understood, and has
   certain flaws. Basically, the protocol is insecure to the extent that an
   adversary who controls b of the authorities gets to choose among 2^b
   outcomes for the result of the protocol. However, an attacker who is not a
   dirauth should not be able to influence the outcome at all.
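
   The 2^b figure follows from each malicious authority being free to
   either reveal or withhold its committed value. The sketch below
   enumerates those choices. The outcome() function is a simplified
   stand-in for the calculation in [SRCALC], used only to count distinct
   results; it is not the real SRV formula, and all names here are
   hypothetical.

```python
import hashlib
from itertools import combinations


def outcome(reveals):
    """Stand-in for the SRV calculation: hash of the sorted reveals."""
    return hashlib.sha3_256(b"".join(sorted(reveals))).hexdigest()


def adversary_outcomes(honest_reveals, adversary_reveals):
    """Return every result the adversary can force by withholding reveals."""
    results = set()
    for k in range(len(adversary_reveals) + 1):
        for subset in combinations(adversary_reveals, k):
            results.add(outcome(list(honest_reveals) + list(subset)))
    return results
```

   With b adversarial reveals there are 2^b subsets to try, so the
   adversary picks among at most 2^b candidate values.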

   We believe that this system offers sufficient security especially compared
   to the current situation. More secure solutions require much more advanced
   crypto and more complex protocols so this seems like an acceptable solution
   for now.

   For alternative approaches on collaborative random number generation also
   see the discussion at [RNGMESSAGING].

5.2. Predicting the shared random value during reveal phase

   The reveal phase lasts 12 hours, and most authorities will send their
   reveal value on the first round of the reveal phase. This means that an
   attacker can predict the final shared random value about 12 hours before
   it's generated.

   This does not pose a problem for the HSDir hash ring, since we impose a
   higher uptime restriction on HSDir nodes, so 12 hours of predictability is
   not an issue.

   Any other protocols using the shared random value from this system should
   be aware of this property.

5.3. Partition attacks

   This design is not immune to certain partition attacks.  We believe they
   don't offer much gain to an attacker as they are very easy to detect and
   difficult to pull off since an attacker would need to compromise a directory
   authority at the very least. Also, because of the Byzantine generals
   problem, it's very hard (even impossible in some cases) to protect against
   all such attacks. Nevertheless, this section describes the possible
   partition attacks and how to detect them.

5.3.1. Partition attacks during commit phase

   A malicious directory authority could send only its commit to one single
   authority which results in that authority having an extra commit value for
   the shared random calculation that the others don't have. Since the
   consensus needs majority, this won't affect the final SRV value. However,
   the attacker, using this attack, could remove a single directory authority
   from the consensus decision at 00:00UTC when the SRV is computed.

   An attacker could also partition the authorities by sending two different
   commitment values to different authorities during the commit phase.

   All of the above is fairly easy to detect. Commitment values in the vote
   coming from an authority should NEVER be different between authorities. If
   so, this means an attack is ongoing or a very bad bug (highly unlikely).

5.3.2. Partition attacks during reveal phase

   Let's consider Alice, a malicious directory authority. Alice could wait
   until the last reveal round, and reveal its value to half of the
   authorities. That would partition the authorities into two sets: the ones
   who think that the shared random value should contain this new reveal, and
   the rest who don't know about it. This would result in a tie and two
   different shared random values.

   A similar attack is possible. For example, two rounds before the end of the
   reveal phase, Alice could advertise her reveal value to only half of the
   dirauths. This way, in the last reveal phase round, half of the dirauths
   will include that reveal value in their votes and the others will not. In
   the end of the reveal phase, half of the dirauths will calculate a
   different shared randomness value than the others.

   We claim that this attack is not particularly fruitful: Alice ends up
   having two shared random values to choose from which is a fundamental
   problem of commit-and-reveal protocols as well (since the last person can
   always abort or reveal). The attacker can also sabotage the consensus, but
   there are other ways this can be done with the current voting system.

   Furthermore, we claim that such an attack is very noisy and detectable.
   First of all, it requires the authority to sabotage two consensuses which
   will cause quite some noise. Furthermore, the authority needs to send
   different votes to different auths which is detectable. Like the commit
   phase attack, the detection here is to make sure that the commitment values
   in a vote coming from an authority are always the same for each authority.

6. Discussion

6.1. Why the added complexity from proposal 225?

   The complexity difference between this proposal and prop225 is in part
   because prop225 doesn't specify how the shared random value gets to the
   clients. This proposal spends lots of effort specifying how the two shared
   random values can always be readily accessible to clients.

6.2. Why do you do a commit-and-reveal protocol in 24 rounds?

   The reader might be wondering why we span the protocol over the course of a
   whole day (24 hours), when only 3 rounds would be sufficient to generate a
   shared random value.

   We decided to do it this way, because we piggyback on the Tor voting
   protocol which also happens every hour.

   We could instead only do the shared randomness protocol from 21:00 to 00:00
   every day, or do it multiple times a day.

   However, we decided that since the shared random value needs to be in every
   consensus anyway, carrying the commitments/reveals as well will not be a
   big problem. Also, this way we give more chances for a failing dirauth to
   recover and rejoin the protocol.

6.3. Why can't we recover if the 00:00UTC consensus fails?

   If the 00:00UTC consensus fails, there will be no shared random value for
   the whole day. In theory, we could recover by calculating the shared
   randomness of the day at 01:00UTC instead. However, the engineering issues
   with adding such recovery logic are too great. For example, it's not easy
   for an authority who just booted to learn whether a specific consensus
   failed to be created.

7. Acknowledgements

   Thanks to everyone who has contributed to this design with feedback and
   discussion.

   Thanks go to arma, ioerror, kernelcorn, nickm, s7r, Sebastian, teor, weasel
   and everyone else!

References:

[RANDOM-REFS]:
   http://projectbullrun.org/dual-ec/ext-rand.html
   https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html

[RNGMESSAGING]:
   https://moderncrypto.org/mail-archive/messaging/2015/002032.html

Filename: 251-netflow-padding.txt
Title: Padding for netflow record resolution reduction
Authors: Mike Perry
Created: 20 August 2015
Status: Closed
Implemented-In: 0.3.1.1-alpha

NOTE: Please look at section 2 of padding-spec.txt now, not this document.

0. Motivation

 It is common practice by many ISPs to record data about the activity of
 endpoints that use their uplink, if nothing else for billing purposes, but
 sometimes also for monitoring for attacks and general failure.

 Unfortunately, Tor node operators typically have no control over the data
 recorded and retained by their ISP. They are often not even informed about
 their ISP's retention policy, or the associated data sharing policy of those
 records (which tends to be "give them to whoever asks" in practice[1]).

 It is also likely that defenses for this problem will prove useful against
 proposed data retention plans in the EU and elsewhere, since these schemes
 will likely rely on the same technology.

0.1. Background

 At the ISP level, this data typically takes the form of Netflow, jFlow,
 Netstream, or IPFIX flow records. These records are emitted by gateway
 routers in a raw form and then exported (often over plaintext) to a
 "collector" that either records them verbatim, or reduces their granularity
 further[2].

 Netflow records and the associated data collection and retention tools are
 very configurable, and have many modes of operation, especially when
 configured to handle high throughput. However, at ISP scale, per-flow records
 are very likely to be employed, since they are the default, and also provide
 very high resolution in terms of endpoint activity, second only to full packet
 and/or header capture.

 Per-flow records record the endpoint connection 5-tuple, as well as the
 total number of bytes sent and received by that 5-tuple during a particular
 time period. They can store additional fields as well, but it is primarily
 timing and bytecount information that concern us.

 When configured to provide per-flow data, routers emit these raw flow
 records periodically for all active connections passing through them
 based on two parameters: the "active flow timeout" and the "inactive
 flow timeout".

 The "active flow timeout" causes the router to emit a new record
 periodically for every active TCP session that continuously sends data. The
 default active flow timeout for most routers is 30 minutes, meaning that a
 new record is created for every TCP session at least every 30 minutes, no
 matter what. This value can be configured to be from 1 minute to 60 minutes
 on major routers.

 The "inactive flow timeout" is used by routers to create a new record if a
 TCP session is inactive for some number of seconds. It allows routers to
 avoid the need to track a large number of idle connections in memory, and
 instead emit a separate record only when there is activity. This value
 ranges from 10 seconds to 600 seconds on common routers. It appears as
 though no routers support a value lower than 10 seconds.

0.2. Default timeout values of major routers

 For reference, here are default values and ranges (in parentheses when
 known) for common routers, along with citations to their manuals.

 Some routers speak other collection protocols than Netflow, and in the
 case of Juniper, use different timeouts for these protocols. Where this
 is known to happen, it has been noted.

                         Inactive Timeout              Active Timeout
 Cisco IOS[3]              15s (10-600s)               30min (1-60min)
 Cisco Catalyst[4]         5min                        32min
 Juniper (jFlow)[5]        15s (10-600s)               30min (1-60min)
 Juniper (Netflow)[6,7]    60s (10-600s)               30min (1-30min)
 H3C (Netstream)[8]        60s (60-600s)               30min (1-60min)
 Fortinet[9]               15s                         30min
 MikroTik[10]              15s                         30min
 nProbe[14]                30s                         120s
 Alcatel-Lucent[15]        15s (10-600s)               30min (1-600min)

1. Proposal Overview

 The combination of the active and inactive netflow record timeouts allows us
 to devise a low-cost padding defense that causes what would otherwise be
 split records to "collapse" at the router even before they are exported to
 the collector for storage. So long as a connection transmits data before the
 "inactive flow timeout" expires, then the router will continue to count the
 total bytes on that flow before finally emitting a record at the "active
 flow timeout".

 This means that for a minimal amount of padding that prevents the "inactive
 flow timeout" from expiring, it is possible to reduce the resolution of raw
 per-flow netflow data to the total amount of bytes sent and received in a 30
 minute window. This is a vast reduction in resolution for HTTP, IRC, XMPP,
 SSH, and other intermittent interactive traffic, especially when all
 user traffic in that time period is multiplexed over a single connection
 (as it is with Tor).
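
 The "collapse" effect can be illustrated with a deliberately simplified
 model of a flow exporter. This is a sketch, not any vendor's algorithm: the
 function name flow_records, the record layout, and the fixed 9 second
 keepalive interval (standing in for the randomized padding timer) are all
 assumptions made for illustration.

```python
# Simplified model of per-flow record emission for a single 5-tuple.
# All times are in seconds. Illustrative only.

def flow_records(packet_times, inactive_timeout=15, active_timeout=1800):
    """Return a list of (start, end, packet_count) records."""
    records = []
    start = last = None
    count = 0
    for t in sorted(packet_times):
        if start is not None and (
            t - last > inactive_timeout or t - start >= active_timeout
        ):
            # The flow was idle past the inactive timeout, or has lasted the
            # full active timeout: export the record and begin a new one.
            records.append((start, last, count))
            start, count = None, 0
        if start is None:
            start = t
        last = t
        count += 1
    if start is not None:
        records.append((start, last, count))
    return records


# Bursty traffic: three short bursts separated by long idle gaps.
bursts = [0, 1, 2, 300, 301, 900, 901, 902]
unpadded = flow_records(bursts)

# The same traffic, with at least one packet every 9 seconds.
padded_times = sorted(set(bursts) | set(range(0, 903, 9)))
padded = flow_records(padded_times)

print(len(unpadded), "records without padding")
print(len(padded), "record(s) with padding")
```

 In this toy run the unpadded connection yields one record per burst, while
 the padded connection yields a single record covering the whole period.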


2. Implementation

 Tor clients currently maintain one TLS connection to their Guard node to
 carry actual application traffic, and make up to 3 additional connections to
 other nodes to retrieve directory information.

 We propose to pad only the client's connection to the Guard node, and not
 any other connection. We propose to treat Bridge node connections to the Tor
 network as client connections, and pad them, but otherwise not pad between
 normal relays.

 Both clients and Guards will maintain a timer for all application (i.e.,
 non-directory) TLS connections. Every time a non-padding packet is sent or
 received by either end, that endpoint will sample a timeout value from
 between 1.5 seconds and 9.5 seconds. If the connection becomes active for
 any reason before this timer expires, the timer is reset to a new random
 value between 1.5 and 9.5 seconds. If the connection remains inactive until
 the timer expires, a single CELL_PADDING cell will be sent on that connection.

 In this way, the connection will only be padded in the event that it is
 idle, and will always transmit a packet before the minimum 10 second inactive
 timeout.
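
 The timer behavior described above might be sketched as follows. The class
 and method names are hypothetical, and two points are assumptions rather
 than part of the text: the timer is re-armed after a padding cell is sent
 (so that an idle link keeps being padded), and the timeout is sampled
 uniformly (Appendix A describes an adjusted distribution).

```python
import random


class IdlePaddingTimer:
    """Hypothetical per-connection padding timer; times in milliseconds."""

    def __init__(self, low_ms=1500, high_ms=9500, rng=None):
        self.low_ms = low_ms
        self.high_ms = high_ms
        self.rng = rng or random.Random()
        self.deadline = None  # no timer armed until first activity

    def _arm(self, now_ms):
        # Sample a fresh timeout from the configured range.
        self.deadline = now_ms + self.rng.uniform(self.low_ms, self.high_ms)

    def on_activity(self, now_ms):
        """Call whenever a non-padding packet is sent or received."""
        self._arm(now_ms)

    def poll(self, now_ms):
        """Return True if a single CELL_PADDING cell should be sent now."""
        if self.deadline is None or now_ms < self.deadline:
            return False
        # Assumption: re-arm after padding so the idle link stays padded.
        self._arm(now_ms)
        return True
```

 A caller would invoke on_activity() from its read and write paths and
 poll() from its event loop, sending one padding cell each time poll()
 returns True.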

2.1. Tunable parameters

 We propose that the defense be controlled by the following consensus
 parameters:

   * nf_ito_low
     - The low end of the range to send padding when inactive, in ms.
     - Default: 1500
   * nf_ito_high
     - The high end of the range to send padding, in ms.
     - Default: 9500

   * nf_pad_relays
     - If set to 1, we also pad inactive relay-to-relay connections
     - Default: 0

   * conn_timeout_low
     - The low end of the range to decide when we should close an idle
       connection (not counting padding).
     - Default: 900 seconds after last circuit closes

   * conn_timeout_high
     - The high end of the range to decide when we should close an idle
       connection.
     - Default: 1800 seconds after last circuit closes

 If nf_ito_low == nf_ito_high == 0, padding will be disabled.

2.2. Maximum overhead bounds

 With the default parameters, we expect a padded connection to send one
 padding cell every 5.5 seconds (see Appendix A for the statistical analysis
 of expected padding packet rate on an idle link). This averages to 103 bytes
 per second full duplex (~52 bytes/sec in each direction), assuming a 512 byte
 cell and 55 bytes of TLS+TCP+IP headers. For a connection that remains idle
 for a full 30 minutes of inactivity, this is about 92KB of overhead in each
 direction.

 With 2.5M completely idle clients connected simultaneously, 52 bytes per
 second still amounts to only 130MB/second in each direction network-wide,
 which is roughly the current amount of Tor directory traffic[11]. Of course,
 our 2.5M daily users will neither be connected simultaneously, nor entirely
 idle, so we expect the actual overhead to be much lower than this.
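
 The figures in this section can be reproduced with a few lines of
 arithmetic. The variable names below are only for illustration; the input
 values are the ones stated in the text.

```python
CELL_BYTES = 512          # one Tor cell
HEADER_BYTES = 55         # TLS+TCP+IP overhead assumed in the text
MEAN_INTERVAL_S = 5.5     # expected time between padding cells when idle

# Each padding event costs one cell plus headers, in one direction.
full_duplex_rate = (CELL_BYTES + HEADER_BYTES) / MEAN_INTERVAL_S
per_direction_rate = full_duplex_rate / 2

# Overhead per direction for a connection that stays idle for 30 minutes.
per_direction_30min = per_direction_rate * 30 * 60

# Network-wide worst case, using the rounded 52 bytes/sec figure.
network_rate = 2.5e6 * 52

print(f"{full_duplex_rate:.0f} bytes/sec full duplex")
print(f"{per_direction_rate:.0f} bytes/sec per direction")
print(f"{per_direction_30min / 1000:.1f} KB per direction per 30 minutes")
print(f"{network_rate / 1e6:.0f} MB/sec per direction network-wide")
```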

2.3. Measuring actual overhead

 To measure the actual padding overhead in practice, we propose to export
 the following statistics in extra-info descriptors for the previous (fixed,
 non-rolling) 24 hour period:

    * Total cells read (padding and non-padding)
    * Total cells written (padding and non-padding)
    * Total CELL_PADDING cells read
    * Total CELL_PADDING cells written
    * Total RELAY_COMMAND_DROP cells read
    * Total RELAY_COMMAND_DROP cells written

 These values will be rounded to 100 cells each, and no values are reported if
 the relay has read or written fewer than 10000 cells in the previous period.
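
 A minimal sketch of this reporting rule follows. The text does not say
 whether counts are rounded up, down, or to the nearest multiple, nor
 whether the 10000 cell threshold applies to reads and writes separately;
 the sketch assumes round-to-nearest and suppresses the report if either
 total is below the threshold. The function name and dictionary keys are
 hypothetical.

```python
def reported_cell_counts(counts, granularity=100, min_total=10000):
    """Return the counters to publish, or None if nothing is reported.

    counts maps counter names to raw cell counts and must include the
    totals under "cells_read" and "cells_written".
    """
    # Assumption: too little traffic in either direction suppresses the
    # whole report for this period.
    if counts["cells_read"] < min_total or counts["cells_written"] < min_total:
        return None
    # Assumption: round each counter to the nearest multiple of granularity.
    return {
        name: (value + granularity // 2) // granularity * granularity
        for name, value in counts.items()
    }
```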

 RELAY_COMMAND_DROP cells are circuit-level padding not used by this defense,
 but we may as well start recording statistics about them now, too, to aid in
 the development of future defenses.

2.4. Load balancing considerations

 Eventually, we will likely want to update the consensus weights to properly
 load balance the selection of Guard nodes that must carry this overhead.

 We propose that we use the extra-info documents to get a more accurate value
 for the total average Guard and Guard+Exit node overhead of this defense in
 practice, and then use that value to fractionally reduce the consensus
 selection weights for Guard nodes and Guard+Exit nodes, to reflect their
 reduced capacity relative to middle nodes.


3. Threat model and adversarial considerations

 This defense does not assume fully adversarial behavior on the part of the
 upstream network administrator, as that administrator typically has no
 specific interest in trying to deanonymize Tor, but only in monitoring their
 own network for signs of overusage, attack, or failure.

 Therefore, in a manner closer to the "honest but curious" threat model, we
 assume that the netflow collector will be using standard equipment not
 specifically tuned to capturing Tor traffic. We want to reduce the resolution
 of logs that are collected incidentally, so that if they happen to fall into
 the wrong hands, we can be more certain they will not be useful.

 We feel that this assumption is a fair one because correlation attacks (and
 statistical attacks in general) will tend to accumulate false positives very
 quickly if the adversary loses resolution at any observation points. It is
 especially unlikely for the attacker to benefit from only a few
 high-resolution collection points while the remainder of the Tor network
 is only subject to connection-level/per-flow netflow data retention, or even
 less data retention than that.

 Nonetheless, it is still worthwhile to consider what the adversary is capable
 of, especially in light of looming data retention regulation.

 Because no major router appears to have the ability to set the inactive
 flow timeout below 10 seconds, it would seem as though the adversary is left
 with three main options: reduce the active record timeout to the minimum (1
 minute), begin logging full packet and/or header data, or develop a custom
 solution.

 It is an open question to what degree these approaches would help the
 adversary, especially if only some of its observation points implemented
 these changes.

3.1. What about sampled data?

 At scale, it is known that some Internet backbone routers at AS boundaries
 and exchanges perform sampled packet header collection and/or produce
 netflow records based on a subset of the packets that pass through their
 infrastructure.

 The effect of such sampling on Tor has been studied before, on the (much
 smaller) Tor network as it was in 2007[12]. At a sampling rate of 1 out of
 every 2000 packets, the attack did not achieve high accuracy until over
 100MB of data were transmitted, even when correlating only 500 flows in
 a closed-world lab setting.

 We suspect that this type of attack is unlikely to be effective at scale on
 the Tor network today, but we make no claims that this defense will make any
 impact upon sampled correlation, primarily because the amount of padding
 that this defense introduces is comparatively low relative to the amount of
 transmitted traffic that sampled correlation attacks require to attain
 any accuracy.

3.2. What about long-term statistical disclosure?

 This defense similarly does not claim to defeat long-term correlation
 attacks involving many observations over large amounts of time.

 However, we do believe it will significantly increase the amount of traffic
 and the number of independent observations required to attain the same
 accuracy if the adversary uses default per-flow netflow records.

3.3. What about prior information/confirmation?

 In truth, the most dangerous aspect of these netflow logs is not actually
 correlation at all, but confirmation.

 If the adversary has prior information about the location of a target, and/or
 when and how that target is expected to be using Tor, then the effectiveness
 of this defense will be very situation-dependent (on factors such as the
 number of other Tor users in the area at that time, etc).

 In any case, the odds that there is other concurrent activity (to
 create a false positive) within a single 30 minute record are much higher
 than the odds that there is concurrent activity that aligns with a
 subset of a series of smaller, more frequent inactive timeout records.


4. Synergistic effects with future padding and other changes

 Because this defense only sends padding when the OR connection is completely
 idle, it should still operate optimally when combined with other forms of
 padding (such as padding for website traffic fingerprinting and hidden service
 circuit fingerprinting). If those future defenses choose to send padding for
 any reason at any layer of Tor, then this defense automatically will not.

 In addition to interoperating optimally with any future padding defenses,
 simple changes to the Tor network usage can serve to further reduce the
 usefulness of any data retention, as well as reduce the overhead from this
 defense.

 For example, if all directory traffic were also tunneled through the main
 Guard node instead of independent directory guards, then the adversary
 would lose additional resolution in terms of the ability to differentiate
 directory traffic from normal usage, especially when it occurs within
 the same netflow record. As written and specified, the defense will pad
 such tunneled directory traffic optimally.

 Similarly, if bridge guards[13] are implemented such that bridges use their
 own guard node to route all of their connecting client traffic through, then
 users who run bridges will also benefit from blending their own client traffic
 with the concurrent traffic of their connected clients, the sum total of
 which will also be optimally padded such that it only transmits padding when
 the connection to the bridge's guard is completely idle.


Appendix A: Padding Cell Timeout Distribution Statistics

 It turns out that because the padding is bidirectional, and because both
 endpoints are maintaining timers, this creates the situation where the time
 before sending a padding packet in either direction is actually
 min(client_timeout, server_timeout).

 If client_timeout and server_timeout are uniformly sampled, then the
 distribution of min(client_timeout,server_timeout) is no longer uniform, and
 the resulting average timeout (Exp[min(X,X)]) is much lower than the
 midpoint of the timeout range.

 To compensate for this, rather than sampling each endpoint timeout
 uniformly, we sample it from max(X,X), where X is uniformly distributed.

 If X is a random variable uniform from 0..R-1 (where R=high-low), then the
 random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).

 Then, when both sides apply timeouts sampled from Y, the resulting
 bidirectional padding packet rate is now a third random variable:
 Z = min(Y,Y).

 The distribution of Z is slightly bell-shaped, but mostly flat around the
 mean. It also turns out that Exp[Z] ~= Exp[X]. Here's a table of average
 values for each random variable:

    R       Exp[X]    Exp[Z]    Exp[min(X,X)]   Exp[Y=max(X,X)]
    2000     999.5    1066        666.2           1332.8
    3000    1499.5    1599.5      999.5           1999.5
    5000    2499.5    2666       1666.2           3332.8
    6000    2999.5    3199.5     1999.5           3999.5
    7000    3499.5    3732.8     2332.8           4666.2
    8000    3999.5    4266.2     2666.2           5332.8
    10000   4999.5    5328       3332.8           6666.2
    15000   7499.5    7995       4999.5           9999.5
    20000   9999.5    10661      6666.2           13332.8

 In this way, we maintain the property that the midpoint of the timeout range
 is the expected mean time before a padding packet is sent in either
 direction.
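
 The means above can be checked by direct summation. The sketch below
 assumes X is uniform on the integers 0..R-1, as stated, and uses the
 identity E[V] = sum over i >= 1 of Prob(V >= i) for non-negative integer
 V. The function name is hypothetical, and the results should land close
 to (not necessarily exactly on) the corresponding table row.

```python
def expectations(R):
    """Return (Exp[X], Exp[min(X,X)], Exp[Y], Exp[Z]) for X on 0..R-1."""
    e_x = (R - 1) / 2
    # min(X,X) >= i requires both uniform draws to be >= i.
    e_min = sum(((R - i) / R) ** 2 for i in range(1, R))
    # Y = max(X,X), with Prob(Y == i) = (2i + 1) / R^2.
    e_y = sum(i * (2 * i + 1) for i in range(R)) / (R * R)
    # Z = min(Y,Y) >= i requires both Y draws to be >= i, and
    # Prob(Y >= i) = 1 - (i/R)^2.
    e_z = sum((1 - (i / R) ** 2) ** 2 for i in range(1, R))
    return e_x, e_min, e_y, e_z


e_x, e_min, e_y, e_z = expectations(2000)
print(f"Exp[X]={e_x:.1f} Exp[Z]={e_z:.1f} "
      f"Exp[min(X,X)]={e_min:.1f} Exp[Y]={e_y:.1f}")
```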


1. https://lists.torproject.org/pipermail/tor-relays/2015-August/007575.html
2. https://en.wikipedia.org/wiki/NetFlow
3. http://www.cisco.com/en/US/docs/ios/12_3t/netflow/command/reference/nfl_a1gt_ps5207_TSD_Products_Command_Reference_Chapter.html#wp1185203
4. http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/70974-netflow-catalyst6500.html#opconf
5. https://www.juniper.net/techpubs/software/erx/junose60/swconfig-routing-vol1/html/ip-jflow-stats-config4.html#560916
6. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html
7. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html
8. http://www.h3c.com/portal/Technical_Support___Documents/Technical_Documents/Switches/H3C_S9500_Series_Switches/Command/Command/H3C_S9500_CM-Release1648%5Bv1.24%5D-System_Volume/200901/624854_1285_0.htm#_Toc217704193
9. http://docs-legacy.fortinet.com/fgt/handbook/cli52_html/FortiOS%205.2%20CLI/config_system.23.046.html
10. http://wiki.mikrotik.com/wiki/Manual:IP/Traffic_Flow
11. https://metrics.torproject.org/dirbytes.html
12. http://freehaven.net/anonbib/cache/murdoch-pet2007.pdf
13. https://gitweb.torproject.org/torspec.git/tree/proposals/188-bridge-guards.txt
14. http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf
15. http://infodoc.alcatel-lucent.com/html/0_add-h-f/93-0073-10-01/7750_SR_OS_Router_Configuration_Guide/Cflowd-CLI.html
Filename: 252-single-onion.txt
Title: Single Onion Services
Author: John Brooks, Paul Syverson, Roger Dingledine
Created: 2015-07-13
Status: Superseded
Superseded-by: 260

1. Overview

   Single onion services are a modified form of onion services, which trade
   service-side location privacy for improved performance, reliability, and
   scalability.

   Single onion services have a .onion address identical to any other onion
   service. The descriptor contains information sufficient to do a relay
   extend of a circuit to the onion service and to open a stream for the onion
   address. The introduction point and rendezvous protocols are bypassed for
   these services.

   We also specify behavior for a tor instance to publish a single onion
   service, which requires a reachable OR port, without necessarily acting
   as a public relay in the network.

2. Motivation

   Single onion services have a few benefits over double onion services:

      * Connection latency is much lower by skipping rendezvous
      * Stream latency is reduced on a 4-hop circuit
      * Removing rendezvous circuits improves service scalability
      * A single onion service can use multiple relays for load balancing

   Single onion services are not location hidden on the service side,
   but clients retain all of the benefits and privacy of onion
   services. More details, relation to double onion services, and the
   rationale for the 'single' and 'double' nomenclature are further
   described in section 7.4.

   We believe these improvements, along with the other benefits of onion
   services, will be a significant incentive for website and other internet
   service operators to provide these portals to preserve the privacy of their
   users.

3. Onion descriptors

   The onion descriptor format is extended to add:

     "service-extend-locations" NL encrypted-string
       [At most once]

       A list of relay extend info, which is used instead of introduction
       points and rendezvous for single onion services. This field is
       encoded and optionally encrypted in the same way as the
       "introduction-points" field.

       The encoded contents of this field contain no more than 10 entries,
       each containing the following data:

         "service-extend-location" SP link-specifiers NL
            [At start, exactly once]
            link-specifiers is a base64 encoded link specifier block, in
            the format described by proposal 224 [BUILDING-BLOCKS] and the
            EXTEND2 cell.

          "onion-key" SP key-type NL onion-key
            [Exactly once]
            Describes the onion key that must be used when extending to the
            single onion service relay.

            The key-type field is one of:
               "tap"
                  onion-key is a PEM-encoded RSA relay onion key
               "ntor"
                  onion-key is a base64-encoded NTOR relay onion key

   [XXX: Should there be some kind of cookie to prove that we have the desc?
   See also section 7.1. -special]

   A descriptor may contain either or both of "introduction-points" and
   "service-extend-locations"; see section 5.2.

   [XXX: What kind of backwards compatibility issues exist here? Will existing
   relays accept one of those descriptors? -special]

4. Reaching a single onion service as a client

   Single onion services use normal onion hostnames, so the client will first
   request the service's descriptor. If the descriptor contains a
   "service-extend-locations" field, the client should ignore the introduction
   points and rendezvous process in favor of the process defined here.

   The descriptor's "service-extend-locations" information is sufficient for a
   client to extend a circuit to the onion service, regardless of whether it
   is also listed as a relay in the network consensus. This extend info must
   not be used for any other purpose. If multiple extend locations are
   specified, the client should randomly select one.

   The client uses a 3-hop circuit to extend to the service location from the
   descriptor. Once this circuit is built, the client sends a BEGIN cell to
   the relay, with the onion address as hostname and the desired TCP port.

   If the circuit or stream fails, the client should retry using another
   extend location from the descriptor. If all extend locations fail, and the
   descriptor contains an "introduction-points" field, the client may fall
   back to a full rendezvous operation.
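
   The client-side selection and retry logic in this section could be
   sketched as below. The function name, the return values, and the
   try_extend callback (standing in for building the 3-hop circuit,
   extending to the location, and sending the BEGIN cell) are hypothetical
   and not part of this proposal.

```python
import random


def connect_single_onion(extend_locations, try_extend, has_intro_points,
                         rng=random):
    """Return ("direct", location), ("rendezvous", None) or ("failed", None).

    try_extend(location) is a caller-supplied callback that returns True
    if the circuit and stream to that extend location succeeded.
    """
    # Shuffling gives a random first choice and a retry order for the rest.
    candidates = list(extend_locations)
    rng.shuffle(candidates)
    for location in candidates:
        if try_extend(location):
            return ("direct", location)
    # Every extend location failed; the client may fall back to a full
    # rendezvous if the descriptor also lists introduction points.
    if has_intro_points:
        return ("rendezvous", None)
    return ("failed", None)
```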

5. Publishing a single onion service

   To act as a single onion service, a tor instance (or cooperating group of
   tor instances) must:

      * Have a publicly accessible OR port
      * Publish onion descriptors in the same manner as any onion service
      * Include a "service-extend-locations" section in the onion descriptor
      * Accept RELAY_BEGIN cells for the service as defined in section 5.3

5.1. Configuration options

   The tor server operating a single onion service must accept connections as
   a tor relay, but is not required to be published in the consensus or to
   allow extending circuits. To enable this, we propose the following
   configuration option:

      RelayAllowExtend 0|1
         If set, allow clients to extend circuits from this relay. Otherwise,
         refuse all extend cells. PublishServerDescriptor must also be disabled
         if this option is disabled. If ExitRelay is also disabled, this relay
         will not pass through any traffic.

5.2. Publishing descriptors

   A single onion service must publish descriptors in the same manner as any
   onion service, as defined by rend-spec and section 3 of this proposal.

   Optionally, a set of introduction points may be included in the descriptor
   to provide backwards compatibility with clients that don't support single
   onion services, or to provide a fallback when the extend locations fail.

5.3. RELAY_BEGIN

   When a RELAY_BEGIN cell is received with a configured single onion hostname
   as the destination, the stream should be connected to the configured
   backend server in the same manner as a service-side rendezvous stream.

   All relays must reject any RELAY_BEGIN cell with an address ending in
   ".onion" that does not match a locally configured single onion service.

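   As a rough illustration of the two rules in section 5.3, a relay's
   handling of the RELAY_BEGIN address might look like the following. The
   function name, the return strings, and the case-insensitive comparison
   are assumptions made for the sketch.

```python
def classify_relay_begin(address, local_single_onions):
    """Decide how to treat the address from a RELAY_BEGIN cell.

    local_single_onions is the set of locally configured single onion
    hostnames, in lowercase.
    """
    host = address.lower()  # assumption: hostnames compare case-insensitively
    if host.endswith(".onion"):
        if host in local_single_onions:
            # Connect to the configured backend, as for a service-side
            # rendezvous stream.
            return "connect-backend"
        # Any other .onion address must be rejected.
        return "reject"
    return "normal-handling"
```
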
6. Other considerations

6.1. Load balancing

   High capacity services can distribute load by including multiple entries in
   the "service-extend-locations" section of the descriptor, or by publishing
   several descriptors to different onion service directories, or by a
   combination of these methods.

6.2. Benefits of also running a Tor relay

   If a single onion service also acts as a published tor relay, it will keep
   connections to many other tor relays. This can significantly reduce the
   latency of connections to the single onion service, and also helps the tor
   network.

6.3. Proposal 224 ("Next-Generation Hidden Services")

   This proposal is compatible with proposal 224, with small changes to the
   service descriptor format. In particular:

   The "service-extend-location" sections are included in the encrypted
   portion of the descriptor, adjacent to any "introduction-point" sections.
   The "service-extend-locations" field is no longer present. An onion service
   is also a single onion service if any "service-extend-location" field is
   present.

6.4. Proposal 246 ("Merging Hidden Service Directories and Intro Points")

   This proposal is compatible with proposal 246. The onion service will
   publish its descriptor to the introduction points in the same manner as any
   other onion service. The client may choose to build a circuit to the
   specified relays, or to continue with the rendezvous protocol.

   The client should not extend from the introduction point to the single
   onion service's relay, to avoid overloading the introduction point. The
   client may truncate the circuit and extend through a new relay.

7. Discussion

7.1. Authorization

   Client authorization for a single onion service is possible through
   encryption of the service-extend-locations section in the descriptor, or
   "stealth" publication under a new onion address, as with traditional onion
   services.

   One problem with this is that if you suspect a relay is also serving a
   single onion service, you can connect to it and send RELAY_BEGIN without
   any further authorization. To prevent this, we would need to include a
   cookie from the descriptor in the RELAY_BEGIN information.

7.2. Preventing relays from being unintentionally published

   Many single onion servers will not want to relay other traffic, and will
   set 'PublishServerDescriptor 0' to prevent it. Even when they do, they will
   still generate a relay descriptor, which could be downloaded and published
   to a directory authority without the relay's consent. To prevent this, we
   should insert a field in the relay descriptor when PublishServerDescriptor
   is disabled that instructs relays to never include it as part of a
   consensus.

   [XXX: Also see task #16564]

7.3. Ephemeral single onion services (ADD_ONION)

   The ADD_ONION control port command could be extended to support ephemerally
   configured single onion services. We encourage this, but specifying its
   behavior is out of the scope of this proposal.

7.4. Onion service taxonomy and nomenclature

   Onion services in general provide several benefits. First, by requiring a
   connection via Tor they provide the client the protections of Tor and make
   it much more difficult to inadvertently bypass those protections than when
   connecting to a non .onion site.  Second, because .onion addresses are
   self-authenticating, onion services have look-up, routing, and
   authentication protections not provided by sites with standard domain
   addresses. These benefits apply to all onion services.

   Onion services as originally introduced also provide network location
   hiding of the service itself: because the client only ever connects through
   the end of a Tor circuit created by the onion service, the IP address of
   the onion service also remains protected.

   Applications and services already exist that use existing onion service
   protocols for the above described general benefits without the need for
   network location hiding. This Proposal is accordingly motivated by a desire
   to provide the general benefits, without the complexity and overhead of
   also protecting the location of the service.

   Further, as with what had originally been called 'location hidden
   services', there may be useful and valid applications of this design that
   are not reflected in our current intent. Just as 'location hidden service'
   is a misleading name for many current onion service applications, we prefer
   a name that is descriptive of the system but flexible with respect to
   applications of it. We also prefer a nomenclature that consistently works
   for the different types of onion services.

   It is also important to have short, simple names lest usage efficiencies
   evolve easier names for us. For example, 'hidden service' has replaced the
   original 'location hidden service' in Tor Proposals and other writings.

   For these reasons, we have chosen 'onion services' to refer to both those
   as set out in this Proposal and those with the client-side and server-side
   protections of the original---also for referring indiscriminately to any
   and all onion services. We use 'double-onion service' to refer to services
   that join two Tor circuits, one from the server and one from the client. We
   use 'single-onion' when referring to services that use only a client-side
   Tor circuit. In speech we sometimes use the even briefer, 'two-nion' and
   'one-ion' respectively.

Filename: 253-oob-hmac.txt
Title: Out of Band Circuit HMACs
Authors: Mike Perry
Created: 01 September 2015
Status: Dead


0. Motivation


It is currently possible for Guard nodes (and MITM adversaries that
steal their identity keys) to perform a "tagging" attack to influence
circuit construction and resulting relay usage[1].

Because Tor uses AES as a stream cipher, malicious or intercepted Guard
nodes can simply XOR a unique identifier into the circuit cipherstream
during circuit setup and usage. If this identifier is not removed by a
colluding exit (either by performing another XOR, or making use of known
plaintext regions of a cell to directly extract a complete side-channel
value), then the circuit will fail. In this way, malicious or
intercepted Guard nodes can ensure that all client traffic is directed
only to colluding exit nodes, who can observe the destinations and
deanonymize users.
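
The malleability that makes this possible can be shown with a toy stream
cipher. The sketch below is only an illustration: it uses a SHA-256 based
keystream as a stand-in for AES in counter mode, models a single encryption
layer rather than Tor's layered circuit crypto, and all names and values in
it are hypothetical.

```python
import hashlib


def keystream(key, length):
    """Toy counter-mode keystream built from SHA-256 (stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"relay cell payload"
key = b"hypothetical circuit key"
ciphertext = xor(plaintext, keystream(key, len(plaintext)))

# A malicious hop XORs a tag into the ciphertext without knowing the key.
tag = bytes([0x01]) + bytes(len(plaintext) - 1)
tagged = xor(ciphertext, tag)

# An honest endpoint decrypts a corrupted payload (plaintext XOR tag) ...
honest_view = xor(tagged, keystream(key, len(plaintext)))
# ... while a colluding endpoint that knows the tag strips it first.
colluder_view = xor(xor(tagged, tag), keystream(key, len(plaintext)))

print(honest_view != plaintext, colluder_view == plaintext)
```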

Most code paths in the Tor relay source code will emit loud warnings for
the most obvious instances of circuit failure caused by this attack.
However, it is very difficult to ensure that all such error conditions
are properly covered such that warnings will be emitted.

This proposal aims to provide a mechanism to ensure that tagging and
related malleability attacks are cryptographically detectable when they
happen.


1. Overview

Since Tor Relays are already storing a running hash of all data
transmitted on their circuits (via the or_circuit_t::n_digest and
or_circuit_t::p_digest properties), it is possible to compute an
out-of-band HMAC on circuit data, and verify that it is as expected.

This proposal first defines an OOB_HMAC primitive that can be included
standalone in a new relay cell command type, and additionally in other
cell types.

Use of the standalone relay cell command serves to ensure that circuits
that are successfully built and used were not manipulated at a previous
point.

By altering the RELAY_COMMAND_TRUNCATED and CELL_DESTROY cells to also
include the OOB_HMAC information, it is similarly possible to detect
alteration of circuit contents that cause failures before the point of
usage.


2. The OOB_HMAC primitive

The OOB_HMAC primitive uses the existing rolling hashes present in
or_circuit_t to provide a Tor OP (aka client) with the hash history of
the traffic that a given relay has seen so far.

Note that to avoid storing an additional 64 bytes of SHA256 digest for
every circuit at every relay, we use SHA1 for the hash logs, since the
circuits are already storing SHA1 hashes. It's not immediately clear how
to upgrade the existing SHA1 digests to SHA256 with the current circuit
protocol, either, since matching hash algorithms are essential to the
'recognized' relay cell forwarding behavior. The version field exists
primarily for this reason, should the rolling circuit hashes ever
upgrade to SHA256.

The OOB_HMAC primitive is specified in Trunnel as follows:

     struct oob_hmac_body {
        /* Version of this section. Must be 1 */
        u8 version;
       
        /* SHA1 hash of all client-originating data on this circuit
           (obtained from or_circuit_t::n_digest). */
        u8 client_hash_log[20];
        /* Number of cells processed in this hash, mod 2^32. Used
           to spot-check hash position */
        u32 client_cell_count;
        
        /* SHA1 hash of all server-originating data on this circuit
           (obtained from or_circuit_t::p_digest). */
        u8 server_hash_log[20];
        /* Number of cells processed in this hash, mod 2^32. Used
           to spot-check hash position.
           XXX: Technically the server-side is not needed. */
        u32 server_cell_count;

        /* HMAC-SHA-256 of the entire cell contents up to this point,
           using or_circuit_t::p_crypto as the hmac key.
           XXX: Should we use a KDF here instead of p_crypto directly? */   
        u8 cell_hmac_256[32];
     };
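
As an informal illustration of the layout above, the following sketch packs
and checks an oob_hmac_body. It assumes network byte order for the u32
fields (as Trunnel uses), takes the HMAC key as an opaque byte string (the
open question about deriving it from p_crypto is not addressed), and treats
"the entire cell contents up to this point" as a caller-supplied prefix.
The function names are hypothetical.

```python
import hashlib
import hmac
import struct

# version, client_hash_log, client_cell_count, server_hash_log,
# server_cell_count: 1 + 20 + 4 + 20 + 4 = 49 bytes before the HMAC.
OOB_HMAC_FIELDS = struct.Struct(">B20sI20sI")


def build_oob_hmac_body(client_hash, client_count, server_hash,
                        server_count, key, preceding=b""):
    """Serialize the fields and append HMAC-SHA-256 over what precedes it."""
    fields = OOB_HMAC_FIELDS.pack(1, client_hash, client_count % 2**32,
                                  server_hash, server_count % 2**32)
    mac = hmac.new(key, preceding + fields, hashlib.sha256).digest()
    return fields + mac


def verify_oob_hmac_body(body, key, preceding=b""):
    """Recompute the HMAC and compare it in constant time."""
    fields, mac = body[:-32], body[-32:]
    expected = hmac.new(key, preceding + fields, hashlib.sha256).digest()
    return hmac.compare_digest(mac, expected)
```

With these field sizes a serialized oob_hmac_body is 81 bytes long.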


3. Usage of OOB_HMAC

The OOB_HMAC body will be included in three places:

 1. In a new relay cell command RELAY_COMMAND_HMAC_SEND, which is sent in
    response to a client-originating RELAY_COMMAND_HMAC_GET on stream 0.
 2. In CELL_DESTROY, immediately after the error code
 3. In RELAY_COMMAND_TRUNCATED, immediately after the CELL_DESTROY
    contents

3.1. RELAY_COMMAND_HMAC_GET/SEND relay commands

Clients should use leaky-pipe topology to send RELAY_COMMAND_HMAC_GET to
the second-to-last node (typically the middle node) in the circuit at
three points during circuit construction and usage:

  1. Immediately after the last RELAY_EARLY cell is sent
  2. Upon any stream detachment, timeout, or failure.
  3. Upon any OP-initiated circuit teardown (including timed-out partially
     built circuits).

We use RELAY_EARLY as the point at which to send these cells to avoid
leaking the path length to the middle hop.

3.2. Alteration of CELL_DESTROY and RELAY_COMMAND_TRUNCATED

In order to provide an HMAC even when a circuit is torn down before use
due to failure, the behavior for generating and handling CELL_DESTROY
and RELAY_COMMAND_TRUNCATED should be modified as follows:

Whenever an OR sends a CELL_DESTROY for a circuit towards the OP, if
that circuit was already properly established, the OR should include the
contents of oob_hmac_body immediately after the reason field. The HMAC
must cover the error code from CELL_DESTROY.

Upon receipt of a CELL_DESTROY, and in any other case where an OR would
generate a RELAY_COMMAND_TRUNCATED due to error, a conformant relay
should include the CELL_DESTROY oob_hmac_body (if any), as well as its own
locally created oob_hmac_body. The locally created oob_hmac_body must
cover the entire payload contents of RELAY_COMMAND_TRUNCATED, including 
the error code and the CELL_DESTROY oob_hmac_body.

Here is a new Trunnel specification for RELAY_COMMAND_TRUNCATED:

     struct relay_command_truncated {
        /* Error code */
        u8 error_code;

        /* Number of oob_hmacs. Must be 0, 1, or 2 */
        u8 num_hmac;

        /* If there are 2 hmacs, the first one is from the CELL_DESTROY,
           and the second one is from the truncating relay. If num_hmac
           is 0, then this came from a relay without support for
           oob_hmac. */
        struct oob_hmac_body[num_hmac];
     };
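A minimal parsing sketch for this payload, in Python, might look like the
following. It assumes each oob_hmac_body is 81 bytes (1 + 20 + 4 + 20 + 4
+ 32) and that trailing cell padding may follow the last body; the
function name is hypothetical.

```python
OOB_HMAC_BODY_LEN = 81  # 1 + 20 + 4 + 20 + 4 + 32 bytes


def parse_truncated(payload):
    """Split a RELAY_COMMAND_TRUNCATED payload into (error_code, bodies)."""
    if len(payload) < 2:
        raise ValueError("payload too short")
    error_code, num_hmac = payload[0], payload[1]
    if num_hmac > 2:
        raise ValueError("num_hmac must be 0, 1, or 2")
    end = 2 + num_hmac * OOB_HMAC_BODY_LEN
    if len(payload) < end:
        raise ValueError("payload ends inside an oob_hmac_body")
    # With two bodies, the first is from the CELL_DESTROY and the second
    # is from the truncating relay.
    bodies = [payload[2 + i * OOB_HMAC_BODY_LEN:2 + (i + 1) * OOB_HMAC_BODY_LEN]
              for i in range(num_hmac)]
    return error_code, bodies
```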

The usage of a strong HMAC to cover the entire CELL_DESTROY contents
also allows an OP to properly authenticate the reason a remote node
needed to close a circuit, without relying on the previous hop to be
honest about it.


4. Ensuring proper ordering with respect to hashes

4.1. RELAY_COMMAND_HMAC_GET/SEND

The in-order delivery guarantee of circuits will mean that the incoming
hashes will match upon receipt of the RELAY_COMMAND_HMAC_SEND cell, but
any outgoing traffic the OP sent since RELAY_COMMAND_HMAC_GET will
not have been seen by the responding OR.

Therefore, immediately upon sending a RELAY_COMMAND_HMAC_GET, the OP
must record and store its current outgoing hash state for that circuit,
until the RELAY_COMMAND_HMAC_SEND arrives, and use that stored hash
value for comparison against the oob_hmac_body's client_hash_log field.

The server_hash_log should be checked against the corresponding
crypt_path_t entry in origin_circuit_t for the relay that the command
was sent to.
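The snapshot step can be sketched in Python as follows. This is a
simplification: it hashes whole cells with a single SHA1 state, whereas
the real running digest is maintained by the relay crypto code, and the
class and method names are hypothetical. Whether the HMAC_GET cell itself
is hashed before the snapshot is left to the caller.

```python
import hashlib


class HopDigests:
    """Tracks the outgoing running hash for one hop of a circuit."""

    def __init__(self):
        self.outgoing = hashlib.sha1()
        self.pending_snapshot = None

    def on_cell_sent(self, cell_bytes):
        self.outgoing.update(cell_bytes)

    def on_hmac_get_sent(self):
        # Freeze the current state; later cells must not affect it.
        self.pending_snapshot = self.outgoing.copy().digest()

    def check_client_hash_log(self, client_hash_log):
        """Compare the stored snapshot with the value from HMAC_SEND."""
        snapshot, self.pending_snapshot = self.pending_snapshot, None
        return snapshot is not None and snapshot == client_hash_log
```

Copying the hash object (rather than calling digest() later) is what
keeps cells sent after the HMAC_GET from changing the compared value.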

4.2. RELAY_COMMAND_TRUNCATED

Since RELAY_COMMAND_TRUNCATED may be sent in response to any error
condition generated by a cell in either direction, the OP must check
that its local cell counts match those present in the oob_hmac_body for
that hop.

If the counts do not match, the OP may generate a RELAY_COMMAND_HMAC_GET
to the hop that sent RELAY_COMMAND_TRUNCATED, prior to tearing down the
circuit.

4.3. CELL_DESTROY

If the cell counts of the destroy cell's oob_hmac_body do not match what
the client sent for that hop, unfortunately that hash must be discarded.
Otherwise, it may be checked against values held from before processing
the RELAY_COMMAND_TRUNCATED envelope.


5. Security concerns and mitigations

5.1. Silent circuit failure attacks

The primary way to game this oob-hmac is to omit or block cells
containing HMACs from reaching the OP, or otherwise tear down circuits
before responses arrive with proof of tampering.

If a large fraction of circuits somehow fail without any
RELAY_COMMAND_TRUNCATED oob_hmac_body payloads present, and without any
responses to RELAY_COMMAND_HMAC_GET requests, the user should be alerted
of this fact as well.

This rate of silent circuit failure should be kept as an additional,
separate per-Guard Path Bias statistic, and the user should be warned if
this failure rate exceeds some (low) threshold for circuits containing
relays that should have supported this proposal.
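One possible shape for such a statistic is sketched below in Python. The
threshold and minimum sample size are placeholder values chosen for
illustration only; this proposal does not specify them, and the class
name is hypothetical.

```python
class SilentFailureStats:
    """Per-Guard count of circuits that failed with no HMAC evidence."""

    def __init__(self, threshold=0.05, min_circuits=20):
        self.threshold = threshold        # assumed (low) warning threshold
        self.min_circuits = min_circuits  # assumed minimum sample size
        self.attempts = 0
        self.silent_failures = 0

    def record(self, failed_silently):
        """Record one circuit through this Guard whose relays should
        have supported oob-hmac."""
        self.attempts += 1
        if failed_silently:
            self.silent_failures += 1

    def should_warn(self):
        if self.attempts < self.min_circuits:
            return False
        return self.silent_failures / self.attempts > self.threshold
```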

5.2. Malicious/colluding middle nodes

If the adversary is prevented from causing silent circuit failure
without the client being able to notice and react, their next available
vector is to ensure that circuits are only built to middle nodes that
are malicious and colluding with them (or that do not support this
proposal), so that they may lie about the proper hash values that they
see (or omit them).

The current path bias code also does not count circuit failures to the
middle hop as circuit attempts. This was done to reduce the effect of
ambient circuit failure on the path bias accounting (since an average
ambient circuit failure rate of X per hop causes the total failure rate
of middle+exit circuits to be approximately 2X). Unfortunately, not counting
middle hop failure allows the adversary to only allow circuits to
colluding middle hops to complete, so that they may lie about their hash
logs. All failed circuits to non-colluding middle nodes could be torn
down before RELAY_COMMAND_TRUNCATED is sent.

For this reason, the per-Guard Path Bias counts should be augmented to
additionally track middle-node-only failure as a separate statistic,
and the user should be warned if the middle-node failure rate exceeds a
threshold similar to the one used for end-to-end failure.

5.3. Side channel issues, mitigations, and limitations

Unfortunately, leaking information about circuit usage to the middle
node prevents us from sending RELAY_COMMAND_HMAC_GET cells at more
optimal points in circuit usage (such as immediately upon open,
immediately after stream usage, etc).

As such, we are limited to waiting until RELAY_EARLY cells stop being
sent. It is debatable if we should send hashes periodically (perhaps
with windowing information updates?) instead.


6. Alternatives

A handful of alternatives to this proposal have already been discussed,
but have been dismissed for various reasons. Per-hop cell HMACs were
ruled out because they will leak the total path length, as well as the
current hop's position in the circuit.

Wide-block ciphers have been discussed, which would provide the property
that attempts to alter a cell at a previous hop would render it
completely corrupted upon its final destination, thus preventing
untagging and recovery, even by a colluding malicious peer.

Unfortunately, performance analysis of modern provably secure versions
of wide-block ciphers has shown them to be at least 10X slower than
AES-NI[2].


1. https://lists.torproject.org/pipermail/tor-dev/2012-March/003347.html
2. http://archives.seul.org/tor/dev/Mar-2015/msg00137.html
Filename: 254-padding-negotiation.txt
Title: Padding Negotiation
Authors: Mike Perry
Created: 01 September 2015
Status: Closed

   [See padding-spec.txt for the implemented version of this proposal.]


0. Overview

This proposal aims to describe mechanisms for requesting various types
of padding from relays.

These padding primitives are general enough to use to defend against
both website traffic fingerprinting as well as hidden service circuit
setup fingerprinting.


1. Motivation

Tor already supports both link-level padding (via CELL_PADDING cell
types) and circuit-level padding (via RELAY_COMMAND_DROP relay
cells).

Unfortunately, there is no way for clients to request padding from
relays, or request that relays not send them padding to conserve
bandwidth. This proposal aims to create a mechanism for clients to do
both of these.

It also establishes consensus parameters to limit the amount of padding
that relays will send, to prevent custom wingnut clients from requesting
too much.


2. Link-level padding

Padding is most useful if it can defend against a malicious or
compromised guard node. However, link-level padding is still useful to
defend against an adversary that can merely observe a Guard node
externally, such as for low-resolution netflow-based attacks (see
Proposal 251[1]).

In that scenario, the primary negotiation mechanism we need is a way for
mobile clients to tell their Guards to stop padding, or to pad less
often. The following Trunnel payload should cover the needed
parameters:

  const CHANNELPADDING_COMMAND_STOP = 1;
  const CHANNELPADDING_COMMAND_START = 2;

  /* The start command tells the relay to alter its min and max netflow
     timeout range values, and send padding at that rate (resuming
     if stopped). The stop command tells the relay to stop sending
     link-level padding. */
  struct channelpadding_negotiate {
    u8 version IN [0];
    u8 command IN [CHANNELPADDING_COMMAND_START, CHANNELPADDING_COMMAND_STOP];

    /* Min must not be lower than the current consensus parameter
       nf_ito_low. Ignored if command is stop. */
    u16 ito_low_ms;

    /* Max must not be lower than ito_low_ms. Ignored if command is stop. */
    u16 ito_high_ms;
  };

After the above cell is received, the guard should use the 'ito_low_ms' and
'ito_high_ms' values as the minimum and maximum values (respectively) for
inactivity before it decides to pad the channel. The actual timeout value is
randomly chosen between those two values through an appropriate probability
distribution (see Proposal 251[1] for the netflow padding protocol).
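The following Python sketch shows how a client might encode this payload
and how a relay might pick a timeout from the negotiated range. The
uniform draw is only a stand-in for the distribution defined in Proposal
251, and the function names and the nf_ito_low argument are hypothetical.

```python
import random
import struct

CHANNELPADDING_COMMAND_STOP = 1
CHANNELPADDING_COMMAND_START = 2


def pack_channelpadding_negotiate(command, ito_low_ms=0, ito_high_ms=0,
                                  nf_ito_low=0):
    """Encode channelpadding_negotiate (version 0, network byte order)."""
    if command not in (CHANNELPADDING_COMMAND_STOP,
                       CHANNELPADDING_COMMAND_START):
        raise ValueError("unknown command")
    if command == CHANNELPADDING_COMMAND_START:
        if ito_low_ms < nf_ito_low:
            raise ValueError("ito_low_ms is below consensus nf_ito_low")
        if ito_high_ms < ito_low_ms:
            raise ValueError("ito_high_ms is below ito_low_ms")
    return struct.pack(">BBHH", 0, command, ito_low_ms, ito_high_ms)


def sample_idle_timeout_ms(ito_low_ms, ito_high_ms, rng=random):
    """Placeholder: draw a timeout uniformly from the negotiated range."""
    return rng.uniform(ito_low_ms, ito_high_ms)
```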

More complicated forms of link-level padding can still be specified
using the primitives in Section 3, by using "leaky pipe" topology to
send the RELAY commands to the Guard node instead of to later nodes in
the circuit.

Because the above link-level padding only sends padding cells if the link is
idle, it can be used in combination with the more complicated circuit-level
padding below, without compounding overhead effects.


3. End-to-end circuit padding

For circuit-level padding, we need the ability to schedule a statistical
distribution of arbitrary padding to overlay on top of non-padding
traffic (aka "Adaptive Padding").

The statistical mechanisms that define padding are known as padding
machines. Padding machines can be hardcoded in Tor or specified in the
consensus, and custom research machines can be listed in the torrc.

3.1. Padding Machines

Circuits can have either one or two state machines at both the origin and at a
specified middle hop.

Each state machine can contain up to three states ("Start", "Burst" and "Gap")
governing their behavior, as well as an "END" state. Not all states need to be
used.

Each state of a padding machine specifies either:
  * A histogram describing inter-arrival cell delays; OR
  * A parameterized delay probability distribution for inter-arrival cell delays

In either case, the lower bound of the delay probability distribution can be
specified as the start_usec parameter, and/or it can be learned by measuring
the RTT of the circuit at the middle node. For client-side machines, RTT
measurement is always set to 0. RTT measurement at the middle node is
calculated by measuring the difference between the time of arrival of a
received cell (ie: away from origin) and the time of arrival of a sent cell
(ie: towards origin). The RTT is continually updated so long as two cells do
not arrive back-to-back in either direction. If the most recent measured RTT
value is larger than our measured value so far, this larger value is used. If
the most recent measured RTT value is lower than our measured value so far, it
is averaged with our current measured value. (We favor longer RTTs slightly in
this way, because circuits are growing away from the middle node and becoming
longer).
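The update rule described above can be sketched in a few lines of Python.
The function name is hypothetical, and integer microseconds are assumed.

```python
def update_rtt_estimate(current_usec, measured_usec):
    """Fold a new RTT measurement into the running estimate.

    Larger measurements replace the estimate outright; smaller ones are
    averaged in, which biases the estimate toward longer RTTs.
    """
    if measured_usec > current_usec:
        return measured_usec
    return (current_usec + measured_usec) // 2
```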

If the histogram is used, it has an additional special "infinity" bin that
means "infinite delay".

The state can also provide an optional parameterized distribution that
specifies how many total cells (or how many padding cells) can be sent on the
circuit while the machine is in this state, before it transitions to a new
state.

Each state of a padding machine can react to the following cell events:
  * Non-padding cell received
  * Padding cell received
  * Non-padding cell sent
  * Padding cell sent

Additionally, padding machines emit the following internal events to themselves:
  * Infinity bin was selected
  * The histogram bins are empty
  * The length count for this state was exceeded

Each state of the padding machine specifies a set of these events that cause
it to cancel any pending padding, and a set of events that cause it to
transition to another state, or transition back to itself.

When an event causes a transition to a state (or back to the same state), a
delay is sampled from the histogram or delay distribution, and a padding cell
is scheduled to be sent after that delay.

If a non-padding cell is sent before the timer, the timer is canceled and a
new padding delay is chosen.

3.1.1. Histogram Specification

If a histogram is used by a state (as opposed to a fixed parameterized
distribution), then each of the histogram's fields represents a probability
distribution that is encoded into bins of exponentially increasing width.

The first bin of the histogram (bin 0) has 0 width, with a delay value of
start_usec+rtt_estimate (from the machine definition, and rtt estimate above).

The remaining bins are exponentially spaced, starting at this offset and
covering the range of the histogram, which is range_usec.

The intermediate bins thus divide the timespan range_usec with offset
start_usec+rtt_estimate, so that smaller bin indexes represent narrower time
ranges, doubling up until the last bin. The last bin before the "infinity bin"
thus covers [start_usec+rtt_estimate+range_usec/2,
start_usec+rtt_estimate+range_usec).

This exponentially increasing bin width allows the histograms to most
accurately represent small interpacket delay (where accuracy is needed), and
devote less accuracy to larger timescales (where accuracy is not as
important).

To sample the delay time to send a padding packet, perform the
following:
  * Select a bin weighted by the number of tokens in its index compared to
    the total.
  * If the infinity bin is selected, do not schedule padding.
  * If bin 0 is selected, schedule padding at exactly its time value.
  * For other bins, uniformly sample a time value between this bin and
    the next bin, and schedule padding then.
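The sampling steps above can be sketched in Python as follows. The exact
bin boundaries are an assumption made for illustration: bins 0 and 1 both
start at the offset, and each later bin starts where the previous one
ends, so that the last finite bin starts at offset + range_usec/2. The
function names are hypothetical.

```python
import random


def bin_left_edges(start_usec, rtt_estimate, range_usec, num_bins):
    """Return (left edges of the finite bins, upper bound of the last one).

    Bin num_bins-1 is the infinity bin and has no edge.
    """
    if num_bins < 4:
        raise ValueError("this sketch assumes at least 4 bins")
    offset = start_usec + rtt_estimate
    finite = num_bins - 1
    edges = [offset, offset]  # bin 0 has zero width
    for i in range(2, finite):
        edges.append(offset + range_usec / 2 ** (finite - i))
    return edges, offset + range_usec


def sample_delay(tokens, edges, upper, rng):
    """Pick a padding delay in usec, or None if no padding is scheduled.

    tokens[i] is the token count of bin i; the last entry is the
    infinity bin.
    """
    if sum(tokens) == 0:
        return None
    idx = rng.choices(range(len(tokens)), weights=tokens)[0]
    if idx == len(tokens) - 1:
        return None          # infinity bin: do not schedule padding
    if idx == 0:
        return edges[0]      # bin 0: exactly its time value
    right = edges[idx + 1] if idx + 1 < len(edges) else upper
    return rng.uniform(edges[idx], right)
```

For example, with start_usec=1000, rtt_estimate=500, range_usec=8000 and
six bins, this layout gives finite bins starting at 1500, 1500, 2500,
3500 and 5500 usec, with the last one ending at 9500 usec.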

3.1.1.1. Histogram Token Removal

Tokens can be optionally removed from histogram bins whenever a padding or
non-padding packet is sent. With this token removal, the histogram functions
as an overall target delay distribution for the machine while it is in that
state.

If token removal is enabled, when a padding packet is sent, a token is removed
from the bin corresponding to the target delay. When a non-padding packet is
sent, the actual delay from the previous packet is calculated, and the
histogram bin corresponding to that delay is inspected.  If that bin has
tokens remaining, it is decremented.

If the bin has no tokens left, the state removes a token from a different bin,
as specified in its token removal rule. The following token removal options
are defined:
  * None -- Never remove any tokens
  * Exact -- Only remove from the target bin; if it is empty, ignore it.
  * Higher -- Remove from the next higher non-empty bin
  * Lower -- Remove from the next lower non-empty bin
  * Closest -- Remove from the closest non-empty bin by index
  * Closest_time -- Remove from the closest non-empty bin by time

When all bins except the infinity bin are empty in a histogram, the padding
machine emits the internal "bins empty" event to itself.

Bin 0 and the bin before the infinity bin both have special rules for purposes
of token removal. While removing tokens, all values less than bin 0 are
treated as part of bin 0, and all values greater than
start_usec+rtt_estimate+range_usec are treated as part of the bin before the
infinity bin. Tokens are not removed from the infinity bin when non-padding is
sent. (They are only removed when an "infinite" delay is chosen).
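A sketch of the index-based removal rules, in Python, is given below. It
operates only on the finite bins (the infinity bin is excluded), assumes
the caller has already mapped the observed delay to a bin index, and
omits the time-based "Closest_time" rule. The function name and the rule
strings are hypothetical, and ties under "closest" are resolved toward
the lower bin as an arbitrary choice.

```python
def remove_token(tokens, target, rule):
    """Remove one token according to `rule`; return the bin used or None.

    `tokens` is a mutable list of token counts for the finite bins.
    """
    if rule == "none":
        return None
    # Out-of-range targets fall into the first or last finite bin.
    target = max(0, min(target, len(tokens) - 1))
    if tokens[target] > 0:
        tokens[target] -= 1
        return target
    if rule == "exact":
        return None
    higher = next((i for i in range(target + 1, len(tokens))
                   if tokens[i] > 0), None)
    lower = next((i for i in range(target - 1, -1, -1)
                  if tokens[i] > 0), None)
    if rule == "higher":
        pick = higher
    elif rule == "lower":
        pick = lower
    elif rule == "closest":
        candidates = [i for i in (lower, higher) if i is not None]
        pick = min(candidates, key=lambda i: abs(i - target), default=None)
    else:
        raise ValueError("unknown token removal rule")
    if pick is not None:
        tokens[pick] -= 1
    return pick
```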

3.1.2. Delay Probability Distribution

Alternatively, a delay probability distribution can be used instead of a
histogram, to sample padding delays.

In this case, the designer also needs to specify the appropriate distribution
parameters, and when a padding cell needs to be scheduled, the padding
subsystem will sample a positive delay value (in microseconds) from that
distribution (where the minimum delay is start_usec+rtt_estimate as is the case
for histograms).

We currently support the following probability distributions:
   Uniform, Logistic, Log-logistic, Geometric, Weibull, Pareto

3.2. State Machine Selection

Clients will select which of the defined available padding machines to use
based on the conditions that these machines specify. These conditions include:
  * How many hops the circuit must be in order for the machine to apply
  * If the machine requires vanguards to be enabled to apply
  * The state the circuit must be in for machines to apply (building,
    relay early cells remaining, opened, streams currently attached).
  * If the circuit purpose matches a set of purposes for the machine.
  * If the target hop of the machine supports circuit padding.

Clients will only select machines whose conditions fully match given circuits.

A machine is represented by a positive number that can be thought of as a "menu
option" through the list of padding machines. The currently supported padding
state machines are:

        [1]: CIRCPAD_MACHINE_CIRC_SETUP

             A padding machine that obscures the initial circuit setup in an
             attempt to hide onion services.

3.3. Machine Negotiation

When a machine is selected, the client uses leaky-pipe delivery to send a
RELAY_COMMAND_PADDING_NEGOTIATE to the target hop of the machine, using the
following trunnel relay cell payload format:

  /**
   * This command tells the relay to start or stop the padding machine
   * identified by machine_type. */
  struct circpad_negotiate {
    u8 version IN [0];
    u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP];

    /** Machine type is left unbounded because we can specify
     * new machines in the consensus */
    u8 machine_type;
  };

Upon receipt of a RELAY_COMMAND_PADDING_NEGOTIATE cell, the middle node sends
a RELAY_COMMAND_PADDING_NEGOTIATED with the following format:

  /**
   * This response tells the client whether the requested command for
   * the machine identified by machine_type succeeded. */
  struct circpad_negotiated {
    u8 version IN [0];
    u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP];
    u8 response IN [CIRCPAD_RESPONSE_OK, CIRCPAD_RESPONSE_ERR];

    /** Machine type is left unbounded because we can specify
     * new machines in the consensus */
    u8 machine_type;
  };

The 'machine_type' field should be the same as the one from the
PADDING_NEGOTIATE cell. This is because, as an optimization, new machines can
be installed at the client side immediately after tearing down an old machine.
If the response machine type does not match the current machine type, the
response was for a previous machine, and can be ignored.

If the response field is CIRCPAD_RESPONSE_OK, padding was successfully
negotiated. If it is CIRCPAD_RESPONSE_ERR, the machine is torn down and we do
not pad.
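The negotiation exchange can be sketched in Python as below. The numeric
values of the CIRCPAD_* constants are assumptions made for this example
(the proposal text does not define them), and the function names are
hypothetical.

```python
import struct

# Assumed numeric values, for illustration only.
CIRCPAD_COMMAND_STOP = 1
CIRCPAD_COMMAND_START = 2
CIRCPAD_RESPONSE_OK = 1
CIRCPAD_RESPONSE_ERR = 2


def pack_circpad_negotiate(command, machine_type):
    """Encode a circpad_negotiate payload (version 0)."""
    if command not in (CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP):
        raise ValueError("unknown command")
    return struct.pack(">BBB", 0, command, machine_type)


def handle_circpad_negotiated(payload, current_machine_type):
    """Return 'ignore', 'ok', or 'teardown' for a circpad_negotiated cell."""
    version, command, response, machine_type = struct.unpack(
        ">BBBB", payload[:4])
    if version != 0:
        raise ValueError("unsupported version")
    if response not in (CIRCPAD_RESPONSE_OK, CIRCPAD_RESPONSE_ERR):
        raise ValueError("unknown response")
    if machine_type != current_machine_type:
        return "ignore"    # response was for a previous machine
    if response == CIRCPAD_RESPONSE_OK:
        return "ok"
    return "teardown"      # tear down the machine and do not pad
```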


4. Examples of Padding Machines

In the original WTF-PAD design[2], the state machines are used as follows:

The "Burst" histogram specifies the delay probabilities for sending a
padding packet after the arrival of a non-padding data packet.

The "Gap" histogram specifies the delay probabilities for sending
another padding packet after a padding packet was just sent from this
node. This self-triggering property of the "Gap" histogram allows the
construction of multi-packet padding trains using a simple statistical
distribution.

The "Gap" and "Burst" histograms each have a special "Infinity" bin,
which means "We have decided not to send a packet".

Intuitively, the burst state is used to detect when the line is idle
(and should therefore have few or no tokens in low histogram bins). The
lack of tokens in the low histogram bins causes the system to remain in
the burst state until the actual application traffic either slows,
stalls, or has a gap.

The gap state is used to fill in otherwise idle periods with artificial
payloads from the server (and should have many tokens in low bins, and
possibly some also at higher bins). In this way, the gap state either
generates entirely fake streams of cells, or extends real streams with
additional cells.

The Adaptive Padding Early implementation[3] uses parameterized distributions
instead of histograms, but otherwise uses the states in the same way.

It should be noted that due to our generalization of these states and their
transition possibilities, more complicated interactions are also possible. For
example, it is possible to simulate circuit extension, so that all circuits
appear to continue to extend up until the RELAY_EARLY cell count is depleted.

It is also possible to create machines that simulate traffic on unused
circuits, or mimic onion service activity on clients that aren't otherwise
using onion services.


5. Security considerations and mitigations

The risks from this proposal are primarily DoS/resource exhaustion, and
side channels.

5.1. Rate limiting

Current research[2,3] indicates that padding should be effective against
website traffic fingerprinting at overhead rates of 50-60%. Circuit setup
behavior can be concealed with far less overhead.

We recommend that two consensus parameters be used in the event that
the network is being overloaded from padding to such a degree that
padding requests should be ignored:

  * circpad_global_max_padding_pct
    - The maximum percent of sent padding traffic out of total traffic
      to allow in a tor process before ceasing to pad. Ex: 75 means
      75 padding packets for every 100 non-padding+padding packets.
      This definition is consistent with the overhead values in Proposal
      #265, though it does not take node position into account.
  * circpad_global_allowed_cells
    - The number of padding cells that must be transmitted before the
      global ratio limit is applied.

Additionally, each machine can specify its own per-machine limits for
the allowed cell counters and padding overhead percentages.

When either global or machine limits are reached, padding is no longer
scheduled. The machine simply becomes idle until the overhead drops below
the threshold.
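A minimal sketch of this check, in Python, is shown below. The same
function could be applied once with the global parameters and once with a
machine's own limits. Treating a ratio exactly equal to the maximum as
still allowed is an assumption, as is the function name.

```python
def padding_allowed(padding_sent, nonpadding_sent, max_padding_pct,
                    allowed_cells):
    """Return True if another padding cell may be scheduled.

    max_padding_pct plays the role of circpad_global_max_padding_pct and
    allowed_cells the role of circpad_global_allowed_cells.
    """
    if padding_sent < allowed_cells:
        return True  # the ratio limit does not apply yet
    total = padding_sent + nonpadding_sent
    if total == 0:
        return True
    # Integer form of: 100 * padding / total <= max_padding_pct
    return 100 * padding_sent <= max_padding_pct * total
```

With max_padding_pct=75, 750 padding cells out of 1000 total cells is
still within the limit, while 760 out of 1000 is not.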

Finally, the consensus can also be used to specify that clients should
use only machines that are flagged as reduced padding, or disable circuit
padding entirely, with the following two parameters:

   * circpad_padding_reduced=1
     - Tells clients to only use padding machines with the
       'reduced_padding_ok' machine condition flag set.
   * circpad_padding_disabled=1
     - Tells clients to stop circuit padding immediately, and not negotiate
       any further padding machines.

5.2. Overhead accounting

In order to monitor the quantity of padding to decide if we should alter
these limits in the consensus, every node will publish the following
values in a padding-counts line in its extra-info descriptor:

 * read_drop_cell_count
   - The number of RELAY_DROP cells read by this relay.
 * write_drop_cell_count
   - The number of RELAY_DROP cells sent by this relay.

Each of these counters will be rounded to the nearest 10,000 cells. This
rounding parameter will also be listed in the extra-info descriptor line, in
case we change it in a later release.
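The rounding step could be sketched in Python as follows. The dictionary
keys mirror the counter names above; the "bin_size" key and the function
names are hypothetical, and the actual extra-info line syntax is not
shown here.

```python
def round_to_nearest(count, bin_size=10_000):
    """Round a non-negative counter to the nearest multiple of bin_size."""
    return (count + bin_size // 2) // bin_size * bin_size


def padding_counts(read_drop, write_drop, bin_size=10_000):
    """Return the rounded values a relay would report."""
    return {
        "bin_size": bin_size,
        "read_drop_cell_count": round_to_nearest(read_drop, bin_size),
        "write_drop_cell_count": round_to_nearest(write_drop, bin_size),
    }
```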

In the future, we may decide to introduce Laplace Noise in a similar
manner to the hidden service statistics, to further obscure padding
quantities.

5.3. Side channels

In order to prevent relays from introducing side channels by requesting
padding from clients, all of the padding negotiation commands are only
valid in the outgoing (from the client/OP) direction.

Similarly, to prevent relays from sending malicious padding from arbitrary
circuit positions, if RELAY_DROP cells arrive from a hop other than the one
with which padding was negotiated, those cells are counted as invalid for
purposes of CIRC_BW control port fields, allowing the vanguards addon to
close the circuit upon detecting this activity.

-------------------

1. https://gitweb.torproject.org/torspec.git/tree/proposals/251-netflow-padding.txt
2. https://www.cs.kau.se/pulls/hot/thebasketcase-wtfpad/
3. https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
Filename: 255-hs-load-balancing.txt
Title: Controller features to allow for load-balancing hidden services
Author: Tom van der Woerdt
Created: 2015-10-12
Status: Reserve

1. Overview and motivation

To address scaling concerns with the onion web, we want to be able to
spread the load of hidden services across multiple machines.
OnionBalance is a great stab at this, and it can currently give us 60x
the capacity by publishing 6 separate descriptors, each with 10
introduction points, but more is better. This proposal aims to address
hidden service scaling up to a point where we can handle millions of
concurrent connections.

The basic idea involves splitting the 'introduce' from the
'rendezvous', in the tor implementation, and adding new events and
commands to the control specification to allow intercepting
introductions and transmitting them to different nodes, which will then
take care of the actual rendezvous. External controller code could
relay the data to another node or a pool of nodes, all which are run by
the hidden service operator, effectively distributing the load of
hidden services over multiple processes.

By cleverly utilizing the current descriptor methods through
OnionBalance, we could publish up to sixty unique introduction points,
which could translate to many thousands of parallel tor workers after
implementing this proposal. This should allow hidden services to go
multi-threaded with a few small changes, and continue scaling for a
long time.


2. Specification

We propose two additions to the control specification, of which one is
an event and the other is a new command. We also introduce two new
configuration options.


2.1. HiddenServiceAutomaticRendezvous configuration option

The syntax is:
    "HiddenServiceAutomaticRendezvous" SP [1|0] CRLF

This configuration option is defined to be a boolean toggle which, if
zero, stops the tor implementation from automatically doing a rendezvous
when an INTRODUCE2 cell is received. Instead, an event will be sent to
the controllers. If no controllers are present, the introduction cell
should be dropped, as acting on it instead of dropping it could open a
window for a DoS.

This configuration option can be specified on a per-hidden service
level, and can be set through the controller for ephemeral hidden
services as well.


2.2. HiddenServiceTag configuration option

The syntax is:
    "HiddenServiceTag" SP [a-zA-Z0-9] CRLF

To identify groups of hidden services more easily across nodes, a
name/tag can be given to a hidden service. Defaults to the storage path
of the hidden service (HiddenServiceDir).


2.3. The "INTRODUCE" event

The syntax is:
    "650" SP "INTRODUCE" SP HSTag SP RendezvousData CRLF

    HSTag = the tag of the hidden service
    RendezvousData = implementation-specific, but must not contain
                     whitespace, must only contain human-readable
                     characters, and should be no longer than 2048 bytes

The INTRODUCE event should contain sufficient data to allow continuing
the rendezvous from another Tor instance. The exact format is left up
to the implementation. It follows that only matching versions can be
used safely to coordinate the rendezvous of hidden service connections.


2.4. "PERFORM-RENDEZVOUS" command

The syntax is:
  "PERFORM-RENDEZVOUS" SP HSTag SP RendezvousData CRLF

This command allows a controller to perform a rendezvous using data
received through an INTRODUCE event. The format of RendezvousData is
not specified other than that it must not contain whitespace, and
should be no longer than 2048 bytes.
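To illustrate how a controller might bridge the event and the command,
here is a small Python sketch. It interprets "human-readable characters"
as printable ASCII and enforces the 2048-byte limit strictly; both
choices, and the function names, are assumptions of this example. The
transport between the two Tor instances is out of scope.

```python
import re

# Printable ASCII without whitespace, at most 2048 characters (assumed).
_RENDEZVOUS_DATA = re.compile(r"[\x21-\x7e]{1,2048}")


def parse_introduce_event(line):
    """Parse '650 INTRODUCE HSTag RendezvousData' into (tag, data)."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 4 or parts[0] != "650" or parts[1] != "INTRODUCE":
        raise ValueError("not an INTRODUCE event")
    tag, data = parts[2], parts[3]
    if not _RENDEZVOUS_DATA.fullmatch(data):
        raise ValueError("malformed RendezvousData")
    return tag, data


def build_perform_rendezvous(tag, data):
    """Build the command to send to the Tor instance doing the rendezvous."""
    if not _RENDEZVOUS_DATA.fullmatch(data):
        raise ValueError("malformed RendezvousData")
    return f"PERFORM-RENDEZVOUS {tag} {data}\r\n"
```

The controller treats RendezvousData as opaque and forwards it unchanged.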


2.5. The RendezvousData blob

The "RendezvousData" blob is opaque to the controller; the tor
implementation, however, must know how to interpret it. Its contents
are the minimal amount of data required to process the INTRODUCE2 cell
on another machine.

Before proposal 224 is implemented, this could consist of the
INTRODUCE2 cell payload, the key to decrypt the cell if the cell
is not already decrypted (which may be preferable, for performance
reasons), and data necessary for other machines to recognize what to do
with the cell.

After proposal 224 is implemented, the blob would contain any
additional keys needed to perform the rendezvous handshake.

Implementations do not need to handle blobs generated by other versions
of the software. Because of this, it is recommended to include a
version number which can be used to verify that the blob is from a
compatible implementation.


3. Compatibility and security

The implementation of these methods should, ideally, not change
anything in the network, and all control changes are opt-in, so this
proposal is fully backwards compatible.

Controllers handling this data must be careful to not leak rendezvous
data to untrusted parties, as it could be used to intercept and
manipulate hidden services traffic.


4. Example

Let's take an example where a client (Alice) tries to contact Bob's
hidden service. To do this, Bob follows the normal hidden service
specification, except he sets up ten servers to do this. One of these
publishes the descriptor, the others have this disabled. When the
INTRODUCE2 cell arrives at the node which published the descriptor, it
does not immediately try to perform the rendezvous, but instead outputs
this to the controller. Through an out-of-band process this message is
relayed to a controller of another node of Bob's, and this transmits
the "PERFORM-RENDEZVOUS" command to that node. This node
performs the rendezvous, and will continue to serve data to Alice,
whose client will now not have to talk to the introduction point
anymore.


5. Other considerations

We have left the actual format of the rendezvous data in the control
protocol unspecified, so that controllers do not need to worry about
the various types of hidden service connections, most notably proposal
224.

The decision to not implement the actual cell relaying in the tor
implementation itself was taken to allow more advanced configurations,
and to leave the actual load-balancing algorithm to the implementor of
the controller. The developer of the tor implementation should not
have to choose between a round-robin algorithm and something that could
pull CPU load averages from a centralized monitoring system.

Filename: 256-key-revocation.txt
Title: Key revocation for relays and authorities
Authors: Nick Mathewson
Created: 27 October 2015
Status: Reserve

1. Introduction

   This document examines the different kinds of long-lived public keys
   in Tor, and discusses a way to revoke each.

   The kind of keys at issue are:

      * Authority identity keys
      * Authority signing keys

      * OR identity keys (ed25519)
      * OR signing keys (ed25519)
      * OR identity keys (RSA)

   Additionally, we need to make sure that all other key types, if they
   are compromised, can be replaced or rendered unusable.

2. When to revoke keys

   Revoking keys should happen when the operator of an authority or
   relay believes that the key has been compromised, or has a
   significant risk of having been compromised.  In this proposal we
   deliberately leave this decision up to the authority/relay operators.

   (If a third party believes that a key has been compromised, they
   should attempt to get the key-issuing party to revoke their key.  If
   that can't be done, the uncompromised authorities should block the
   relay or de-list the authority in question.)

   Our key-generation code (for authorities and relays) should generate
   preemptive revocation documents at the same time it generates the
   original keys, so that operators can retain those documents in the
   event that access to the original keys is lost.  The operators should
   keep these revocation documents private and available enough so that
   they can issue the revocation if necessary, but nobody else can.

   Additionally, the key generation code should be able to generate
   retrospective revocation documents for existing keys and
   certificates.  (This approach can be more useful when a subkey is
   revoked, but the operator still has ready access to the issuing key.)

3. Authority keys

   Authority identity keys are the most important keys in Tor.  They are
   kept offline and encrypted.  They are used to sign authority signing
   keys, and for no other purpose.

   Authority signing keys are kept online.  They are authenticated by
   authority identity keys, and used to sign votes and consensus
   documents.

   (For the rest of section 3, all keys mentioned will be authority keys.)

3.1. Revocation certificates for authorities

   We add the following extensions to authority key certificates (see
   dir-spec.txt section 3.1), for use in key revocation.

   "dir-key-revocation-type" SP "master" | "signing" NL

      Specifies which kind of revocation document this is.

      If dir-key-revocation-type is absent, this is not a revocation.

      [At most once]

   "dir-key-revocation-notes" SP (any non-NL text) NL

      An explanation of why the key was revoked.

      Must be absent unless dir-key-revocation-type is set.

      [Any number of times]

   "dir-key-revocation-signing-key-unusable" NL

      Present if the signing key in this document will not actually be
      used to sign votes and consensuses.

      [At most once]

   "dir-key-revoked-signing-key" SP DIGEST NL

      Fingerprints of signing keys being explicitly revoked by this
      certificate. (All keys published before this one are _implicitly_
      revoked.)

      [Any number of times]

   "dir-key-revocation-published" SP YYYY-MM-DD SP HH:MM:SS NL

      The actual time when the revocation was generated. (Used since the
      'published' field in the certificate will lie; see below.)

      [At most once]
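
   As an illustration only, the multiplicity rules for the items above
   could be checked with a small parser like the following Python sketch.
   The function name and the layout of the returned dictionary are
   hypothetical, and the sketch ignores signatures and every other item
   of the key certificate.

```python
# Multiplicity limits taken from the item descriptions above;
# None means "any number of times".
LIMITS = {
    "dir-key-revocation-type": 1,
    "dir-key-revocation-notes": None,
    "dir-key-revocation-signing-key-unusable": 1,
    "dir-key-revoked-signing-key": None,
    "dir-key-revocation-published": 1,
}


def parse_revocation_items(text):
    """Collect the revocation items from a key certificate body.

    Returns None when the certificate is not a revocation.
    """
    items = {}
    for line in text.splitlines():
        keyword, _, arg = line.partition(" ")
        if keyword in LIMITS:
            items.setdefault(keyword, []).append(arg)

    for keyword, values in items.items():
        limit = LIMITS[keyword]
        if limit is not None and len(values) > limit:
            raise ValueError(f"{keyword} appears more than {limit} time(s)")

    rtype = items.get("dir-key-revocation-type")
    if rtype is None:
        # Notes must be absent unless the revocation type is set.
        if "dir-key-revocation-notes" in items:
            raise ValueError("notes present without dir-key-revocation-type")
        return None
    if rtype[0] not in ("master", "signing"):
        raise ValueError("unknown revocation type")

    return {
        "type": rtype[0],
        "notes": items.get("dir-key-revocation-notes", []),
        "signing_key_unusable":
            "dir-key-revocation-signing-key-unusable" in items,
        "revoked_signing_keys": items.get("dir-key-revoked-signing-key", []),
        "published": items.get("dir-key-revocation-published", [None])[0],
    }
```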


3.2. Generating revocations

   Revocations for signing keys should be generated with:
     * A 'published' time immediately following the published date on
       the key that they are revoking.
     * An 'expires' time at least 48 hours after the expires date on
       the key that they are revoking, and at least one week in the
       future.

   (Note that this ensures as-correct-as-possible behavior for existing
   Tor clients and servers.  For Tor versions before 0.2.6, having a
   more recent published date than the older key will cause the revoked
   key certificate to be removed by trusted_dirs_remove_old_certs() if
   it is published at least 7 days in the past.  For Tor versions 0.2.6
   or later, the interval is reduced to 2 days.)

   If generating a signing key revocation ahead of time, the revocation
   document should include a dummy signing key, to be thrown away
   immediately after it is generated and used to make the revocation
   document.  The "dir-key-revocation-signing-key-unusable" element
   should be present.

   If generating a signing key revocation in response to an event,
   the revocation document should include the new signing key to be
   used.   The "dir-key-revocation-signing-key-unusable" element
   must be absent.

   All replacement certificates generated for the lifetime of the
   original revoked certificate should be generated as revocations.



   Revocations for master keys should be generated with:
     * A 'published' time immediately following the published date on
       the most recently generated certificate, if possible.
     * An 'expires' time equal to 18 Jan 2038. (The next-to-last day
       encodeable in time_t, to avoid Y2038 problems.)
     * A dummy signing key, as above.
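
   The timestamp rules in this section can be sketched as follows.  This
   is a minimal Python illustration, not a definitive implementation:
   the function names are hypothetical, and modelling "immediately
   following" as one second later is an assumption.

```python
from datetime import datetime, timedelta, timezone

# 'expires' value for master key revocations, as given above.
MASTER_REVOCATION_EXPIRES = datetime(2038, 1, 18, tzinfo=timezone.utc)


def signing_revocation_times(key_published, key_expires, now):
    """Return (published, expires) for a signing key revocation."""
    # Assumption: "immediately following" means one second later.
    published = key_published + timedelta(seconds=1)
    # At least 48 hours after the revoked key's expiry, and at least
    # one week in the future: take whichever bound is later.
    expires = max(key_expires + timedelta(hours=48),
                  now + timedelta(weeks=1))
    return published, expires


def master_revocation_times(latest_cert_published):
    """Return (published, expires) for a master key revocation."""
    published = latest_cert_published + timedelta(seconds=1)
    return published, MASTER_REVOCATION_EXPIRES
```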

3.3. Submitting revocations

  In the event of a key revocation, authority operators should
  upload the revocation document to every other authority.

  If there is a replacement signing key, it should be included in the
  authority's votes (as any new key certificate would be).

3.4. Handling revocations

  We add these additional rules for caching and storing revocations on
  Tor servers and clients.
     * Master key revocations should be stored indefinitely.
     * If we have a master key revocation, no other certificates for
       that key should be fetched, stored, or served.
     * If we have a master key revocation, we should replace any
       DirAuthority entry for that master key with a 'null' entry -- an
       authority with no address and no keys, from which nothing can be
       downloaded and nothing can be trusted, but which still counts against
       the total number of authorities.

     * Signing key revocations should be retained until their 'expires'
       date.
     * If we have a signing key revocation document, we should not trust
       any signature generated with any key in an older signing key
       certificate for the same master key.  We should not serve such
       key certificates.
     * We should not attempt to fetch any certificate document matching
       an <identity, signing> pair for which a revocation document exists.
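
  One way to read the trust rules above is sketched below in Python.
  The dictionary fields are hypothetical, and using the certificate
  'published' field to decide which certificates are "older" is an
  assumption made for illustration.

```python
def signing_cert_is_trusted(cert, revocations):
    """Decide whether a signing key certificate may still be trusted.

    cert:        dict with 'identity', 'signing', and 'published'.
    revocations: list of dicts with 'type' ("master" or "signing"),
                 'identity', 'published', and 'revoked' (a set of
                 explicitly revoked signing key fingerprints).
    """
    for rev in revocations:
        if rev["identity"] != cert["identity"]:
            continue
        if rev["type"] == "master":
            # A master key revocation invalidates every certificate.
            return False
        if cert["signing"] in rev["revoked"]:
            # Explicitly revoked signing key.
            return False
        if cert["published"] < rev["published"]:
            # Older certificates are implicitly revoked.
            return False
    return True
```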

  We add these additional rules for directory authorities:

     * When generating or serving a consensus document, an authority
       should include a dir-source entry based on the most recent
       revocation cert it has from an authority, unless that authority
       has a more recent valid key cert.

       (This will require a new consensus method.)

     * When generating or serving a consensus document, if no valid
       signature exists from a given authority, and that authority has
       a currently valid key revocation document with a signing key in
       it, it should include a bogus signature purporting to be made
       with that signing key.  (All-zeros is suggested.)

       (Doing this will make old Tor clients download the revocation
       certificates.)

4. Router identity key revocations

4.1. RSA identity keys

   If the RSA key is compromised but the ed25519 identity and signing
   keys are not, simply disable the router.  Key pinning should take
   care of the rest.

   (This isn't ideal when key pinning isn't deployed yet, but I'm
   betting that key pinning comes online before this proposal does.)

4.2. Ed25519 master identity keys

   (We use the data format from proposal 220, section 2.3 here.)

   To revoke a master identity key, create a revocation for the master
   key and upload it to every authority.

   Authorities should accept these documents as meaning that the signing
   key should never be allowed to appear on the Tor network.  This can
   be enforced with the key pinning mechanism.

4.3. Ed25519 signing keys

   (We use the data format from proposal 220, section 2.3 here.)

   To revoke a signing key, include the revocation for every
   not-yet-expired signing key in your routerinfo document, as in:

      "revoked-signing-key" SP Base64-Ed25519-Key NL

        Note that this doesn't need to be authenticated, since the newer
        signing key certificate creates a trust path from the master
        identity key to the revocation.

        [Up to 32 times.]

   Upon receiving these entries, authorities store up to 32 such entries
   per router per year.  (If you have more than 32 keys compromised,
   give up and take your router down. Start it with a new master key.)

   When voting, authorities include a "rev" line in the
   microdescriptor for every revoked-signing-key in the routerinfo:

      "rev" SP "ed25519" SP Base64-Ed25519-Key NL

      (This will require a new microdescriptor version.)

   Upon receiving such a line in the microdescriptor, Tor instances MUST
   NOT trust any signing key certificate with a matching key.
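
   A minimal Python sketch of the "rev" line handling described above.
   The function names are hypothetical, and the per-year bookkeeping on
   the authority side is left out for brevity.

```python
MAX_REVOKED_KEYS = 32  # limit on revoked-signing-key entries, from the text


def rev_lines(revoked_keys):
    """Build microdescriptor "rev" lines for one routerinfo.

    revoked_keys: list of Base64-encoded Ed25519 keys taken from the
    routerinfo's "revoked-signing-key" entries.
    """
    if len(revoked_keys) > MAX_REVOKED_KEYS:
        raise ValueError("too many revoked signing keys")
    return [f"rev ed25519 {key}" for key in revoked_keys]


def signing_key_allowed(signing_key, microdescriptor_lines):
    """Return False if the microdescriptor revokes this signing key."""
    prefix = "rev ed25519 "
    revoked = {
        line[len(prefix):]
        for line in microdescriptor_lines
        if line.startswith(prefix)
    }
    return signing_key not in revoked
```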


Filename: 257-hiding-authorities.txt
Title: Refactoring authorities and making them more isolated from the net
Authors: Nick Mathewson, Andrea Shepard
Created: 2015-10-27
Status: Meta

0. Meta status

   This proposal is 'accepted' as an outline for future proposals, though
   it doesn't actually specify itself in enough detail to be implemented as
   it stands.


1. Introduction

   Directory authorities are critical to the Tor network, and represent
   a DoS target to anybody trying to disable the network. This document
   describes a strategy for making directory authorities in general less
   vulnerable to DoS by separating the different parts of their
   functionality.

2. Design

2.1. The status quo

   This proposal is about splitting up the roles of directory
   authorities.  But, what are these roles?  Currently, directory
   authorities perform the following functions.

   Some of these functions require large amounts of bandwidth; I've noted
   that with a (BW).  Some of these functions require a publicly known
   address; I've marked those with a (PUB). Some of these functions
   inevitably leak the location from which they are performed. I've marked
   those with a (LOC).

   Not everything in this list is something that needs to be done by an
   authority permanently!  This list is, again, just what authorities do now.

     * Authorities receive uploaded server descriptors and extrainfo
       descriptors from regular Tor servers and from each other. (BW, PUB?)

     * Authorities periodically probe the routers they know about in
       order to determine whether they are running or not.  By
       remembering the past behavior of nodes, they also build a view of
       each node's fractional uptime and mean time between
       failures. (LOC, unless we're clever)

     * Authorities perform the consensus protocol by:

          * Generating 'vote' documents describing their view of the
            network, along with a set of microdescriptors for later
            client use.

          * Uploading these votes to one another.

          * Computing a 'consensus' of these votes.

     * Authorities serve as a location for distributing consensus
       documents, descriptors, extrainfo documents, and
       microdescriptors...

          * To directory mirrors. (BW?, PUB?, LOC?)

          * To clients that do not yet know a directory mirror. (BW!!, PUB)

   These functions are tied to directory authorities, but done
   out-of-process:

     * Bandwidth measurement (BW)

     * Sybil detection

     * 'Guardiness' measurement, possibly.

2.2. Design goals

   Principally, we attempt to isolate the security-critical,
   high-resource, and availability-critical pieces of our directory
   infrastructure from one another.

   We would like to make the security-critical pieces of our
   infrastructure easy to relocate, and the communications between them
   easy to secure.

   We require that the Tor network remain able to bootstrap itself in
   the event of catastrophic failure.  So, while we can _use_ a running
   Tor network to communicate, we should not _require_ that a running
   Tor network exist in order to perform the voting process.

2.3. Division of responsibility

   We propose dividing directory authority operations into these modules:


      ----------       ----------    --------------    ----------------
      | Upload |======>| Voting |===>| Publishing |===>| Distribution |
      ----------       ----------    --------------    ----------------
          I                 ^
          I    -----------  I
          ====>| Metrics |===
               -----------

   A number of 'upload' servers are responsible for receiving
   router descriptors.  These are publicly known, and responsible for
   collecting descriptors.

   Information from these servers is used by 'metrics' modules
   (which check Tor servers for reliability and measure their history),
   and fed into the voting process.

   The voting process involves only communication (indirectly) from
   authorities to authorities, to produce a consensus and a set of
   microdescriptors.

   When voting is complete, consensuses, descriptors, and microdescriptors
   must be made available to the rest of the world.  This is done by
   the 'publishing' module.  The consensuses, descriptors, and mds are then
   taken up by the directory caches, and distributed.

   The rest of this proposal will describe means of communication between
   these modules.

3. The modules in more detail

   This section will outline possibilities for communication between the
   various parts of the system to isolate them.  There will be plenty of
   "may"s and "could"s and "might"s here: these are possibilities, in
   need of further consideration.

3.1. Sending descriptors to the Upload module

   We retain the current mechanism: a set of well-known IP
   addresses with well-known OR keys to which each relay should upload a
   copy of its descriptors.

   The worst that a hostile upload server can do is to drop descriptors.
   (It could also generate large numbers of spurious descriptors in
   order to increase the load on the metrics system. But an attacker
   could do that without running an upload server.)

   With respect to dropping, upload servers can use an anytrust model:
   so long as a single server receives and honestly reports descriptors
   to the rest of the system, those descriptors will arrive correctly.

   To avoid DoS attacks, we can require that any relay not previously known
   to an upload module perform some kind of a proof of work as it first
   registers itself.  (Details TBD)

   If we're using TLS here, we should also consider a check-then-start TLS
   design as described in A.1 below.

   The location of Upload servers can change over time; they can be
   published in the consensus.

   (Note also that as an alternative, we could distribute this functionality
   across the whole network.)

3.2. Transferring descriptors to the metrics server and the voters

   The simplest option here would be for the authorities and metrics
   servers to mirror them using Tor.  rsync-over-ssh-over-Tor is a
   possibility, if we don't feel like building more machinery.

   (We could use hidden services here, but it is probably okay for
   upload servers to be known by the voters and metrics servers.)

   A fallback to a non-Tor connection could be done manually, or could
   require explicit buy-in from the voter/metrics operator.

3.3. Transferring information from metrics server to voters

   The same approaches as 3.2 should work fine.

3.4. Communication between voters

   Voters can, we hope, communicate to each other over authenticated
   hidden services.  But we'll need a fallback mechanism here.

   Another option is to have public ledgers available for voters to talk
   to anonymously.  This is probably a better idea.  We could re-use the
   upload servers for this purpose, perhaps.

   Giving voters each others' addresses seems like a bad idea.

3.5. Communication from voters to directory nodes

   We should design a flood-distribution mechanism for microdescriptors,
   listed descriptors, and consensuses so that authorities can each
   upload to a few targets anonymously, and have them propagate through
   the rest of the network.

4. Migration

   To support old clients and old servers, the current authority IP
   addresses should remain as Upload and Distribution points.  The
   current authority identity keys should remain as the keys for
   voters.

A.1. Check-then-start TLS

   Current TLS protocols are quite susceptible to denial-of-service
   attacks, with large asymmetries in resource consumption.  (Client
   sends junk, forcing server to perform private-key operation on junk.)

   We could hope for a good TLS replacement to someday emerge, or for
   TLS to improve its properties.  But as a replacement, I suggest that
   we wrap TLS in a preliminary challenge-response protocol to establish
   that the use is authorized before we allow the TLS handshake to
   begin.

   (We shouldn't do this for all the TLS in Tor: only for the cases
   where we want to restrict the users of a given TLS server.)
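
   The proposal does not specify the challenge-response step.  Purely as
   an illustration, and assuming that the server and each authorized
   client share a secret key in advance (an assumption not made in the
   text), the check could look like this Python sketch.  All function
   names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: produce a fresh random challenge."""
    return secrets.token_bytes(32)


def answer_challenge(shared_key, challenge):
    """Client side: prove knowledge of the shared key."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def may_start_tls(shared_key, challenge, response):
    """Server side: allow the TLS handshake only for a valid answer.

    Checking an HMAC is far cheaper than a private-key operation, so
    junk from unauthorized clients is rejected at low cost.
    """
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```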

Filename: 258-dirauth-dos.txt
Title: Denial-of-service resistance for directory authorities
Author: Andrea Shepard
Created: 2015-10-27
Status: Dead

1. Problem statement

   The directory authorities are few in number and vital for the
   functioning of the Tor network; threats of denial of service
   attacks against them have occurred in the past.  They should be
   more resistant to unreasonably large connection volumes.

2. Design overview

   There are two possible ways a new connection to a directory
   authority can be established: directly by a TCP connection to the
   DirPort, or tunneled inside a Tor circuit and initiated with a
   begindir cell.  The client can originate the former as direct
   connections or from a Tor exit, and the latter either as fully
   anonymized circuits or one-hop links to the dirauth's ORPort.

   The dirauth will try to heuristically classify incoming requests
   as one of these four indirection types, and then in the two
   non-anonymized cases further sort them into hash buckets on the
   basis of source IP.  It will use an exponentially-weighted moving
   average to measure the rate of connection attempts in each
   bucket, and also separately limit the number of begindir cells
   permitted on each circuit.  It will periodically scan the hash
   tables and forget counters which have fallen below a threshold to
   prevent memory exhaustion.

3. Classification of incoming connections

   Clients can originate connections as one of four indirection
   types:


    - DIRIND_ONEHOP: begindir cell on a single-hop Tor circuit
    - DIRIND_ANONYMOUS: begindir cell on a fully anonymized Tor
      circuit
    - DIRIND_DIRECT_CONN: direct TCP connection to dirport
    - DIRIND_ANON_DIRPORT: TCP connection to dirport from an exit
      relay

   The directory authority can always tell a dirport connection from
   a begindir, but it must use its knowledge of the current
   consensus and exit policies to disambiguate whether the
   connection is anonymized.

   It should treat a begindir as DIRIND_ANONYMOUS when the previous
   hop in the circuit it appears on is in the current consensus, and
   as DIRIND_ONEHOP otherwise; it should treat a dirport connection
   as DIRIND_ANON_DIRPORT if the source address appears in the
   consensus and allows exits to the dirport in question, or as
   DIRIND_DIRECT_CONN otherwise.  In the case of relays which also
   act as clients, these heuristics may falsely classify
   direct/onehop connections as anonymous, but will never falsely
   classify anonymous connections as direct/onehop.
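
   The classification heuristic above can be sketched in Python as
   follows.  The function and parameter names are hypothetical; the
   caller is assumed to have already derived the two sets from the
   current consensus and exit policies.

```python
def classify_request(kind, peer, consensus_ids, exit_addrs):
    """Classify an incoming directory request.

    kind:          "begindir" or "dirport".
    peer:          for begindir, the identity of the previous hop;
                   for dirport, the source address.
    consensus_ids: identities of relays in the current consensus.
    exit_addrs:    addresses of consensus relays whose exit policy
                   allows exits to the dirport in question.
    """
    if kind == "begindir":
        if peer in consensus_ids:
            return "DIRIND_ANONYMOUS"
        return "DIRIND_ONEHOP"
    if kind == "dirport":
        if peer in exit_addrs:
            return "DIRIND_ANON_DIRPORT"
        return "DIRIND_DIRECT_CONN"
    raise ValueError("unknown request kind")
```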

4. Exponentially-weighted moving average counters and hash table

   The directory authority implements a set of
   exponentially-weighted moving averages to measure the rate of
   incoming connections in each bucket.  The two anonymous
   connection types are each a single bucket, but the two non-
   anonymous cases get a single bucket per source IP each, stored in
   a hash table.  The directory authority must periodically scan
   this hash table for counters which have decayed close to zero and
   free them to avoid permitting memory exhaustion.

   This introduces five new configuration parameters:

    - DirDoSFilterEWMATimeConstant: the time for an EWMA counter to
      decay by a factor of 1/e, in seconds.

    - DirDoSFilterMaxAnonConnectRate: the threshold to trigger the
      DoS filter on DIRIND_ANONYMOUS connections.

    - DirDoSFilterMaxAnonDirportConnectRate: the threshold to
      trigger the DoS filter on DIRIND_ANON_DIRPORT connections.

    - DirDoSFilterMaxBegindirRatePerIP: the threshold per source IP
      to trigger the DoS filter on DIRIND_ONEHOP connections.

    - DirDoSFilterMaxDirectConnRatePerIP: the threshold per source
      IP to trigger the DoS filter on DIRIND_DIRECT_CONN
      connections.

   When incrementing a counter would put it over the relevant
   threshold, the filter is said to be triggered.  In this case, the
   directory authority does not update the counter, but instead
   suppresses the incoming request.  In the DIRIND_ONEHOP and
   DIRIND_ANONYMOUS cases, the directory authority must kill the
   circuit rather than merely refusing the request, to prevent an
   unending stream of client retries on the same circuit.
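
   A minimal Python sketch of a per-IP EWMA filter along these lines.
   The class names, the prune threshold, and the choice to compare the
   decayed event count directly against the threshold are assumptions;
   the time constant plays the role of DirDoSFilterEWMATimeConstant.

```python
import math


class EwmaCounter:
    """Event counter that decays by a factor of 1/e per time constant."""

    def __init__(self, time_constant, now):
        self.tau = time_constant
        self.value = 0.0
        self.last = now

    def decayed(self, now):
        """Counter value after decaying from the last update to 'now'."""
        return self.value * math.exp(-(now - self.last) / self.tau)

    def try_increment(self, now, threshold):
        """Count one event unless that would exceed the threshold."""
        current = self.decayed(now)
        if current + 1.0 > threshold:
            return False  # filter triggered: the counter is not updated
        self.value = current + 1.0
        self.last = now
        return True


class PerIpFilter:
    """Hash table of EWMA counters, one bucket per source IP."""

    def __init__(self, time_constant, threshold, prune_below=0.01):
        self.tau = time_constant
        self.threshold = threshold
        self.prune_below = prune_below
        self.counters = {}

    def allow(self, ip, now):
        counter = self.counters.get(ip)
        if counter is None:
            counter = self.counters[ip] = EwmaCounter(self.tau, now)
        return counter.try_increment(now, self.threshold)

    def prune(self, now):
        """Forget counters that have decayed close to zero."""
        stale = [ip for ip, c in self.counters.items()
                 if c.decayed(now) < self.prune_below]
        for ip in stale:
            del self.counters[ip]
```

   The two anonymous connection types would each use a single
   EwmaCounter, while the per-IP cases use a PerIpFilter that is pruned
   periodically.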

5. Begindir cap

   Directory authorities limit the number of begindir cells
   permitted in the lifetime of a particular circuit, separately
   from the EWMA counters.  This can only affect the
   DIRIND_ANONYMOUS and DIRIND_ONEHOP connection types.  A sixth
   configuration variable, DirDoSFilterMaxBegindirPerCircuit,
   controls this feature.

6. Limitations

   Widely distributed DoS attacks with many source IPs may still be
   able to avoid raising any single DIRIND_ONEHOP or
   DIRIND_DIRECT_CONN counter above threshold.

Filename: 259-guard-selection.txt
Title: New Guard Selection Behaviour
Author: Isis Lovecruft, George Kadianakis
Created: 2015-10-28
Status: Obsolete
Extends: 241-suspicious-guard-turnover.txt

This proposal was made obsolete by proposal #271.

§1. Overview

  In addition to the concerns regarding path bias attacks, namely that the
  space from which guards are selected by some specific client should not
  consist of the entirety of nodes with the Guard flag (cf. §1 of proposal
  #247), several additional concerns with respect to guard selection behaviour
  remain.  This proposal outlines a new entry guard selection algorithm, which
  additionally addresses the following concerns:

    - Heuristics and algorithms for determining how and which guard(s)
      is(/are) chosen should be kept as simple and easy to understand as
      possible.

    - Clients in censored regions or who are behind a fascist firewall who
      connect to the Tor network should not experience any significant
      disadvantage in terms of reachability or usability.

    - Tor should make a best attempt at discovering the most appropriate
      behaviour, with as little user input and configuration as possible.


§2. Design

  Alice, an OP attempting to connect to the Tor network, should undertake the
  following steps to determine information about the local network and to
  select (some) appropriate entry guards.  In the following scenario, it is
  assumed that Alice has already obtained a recent, valid, and verifiable
  consensus document.

  Before attempting the guard selection procedure, Alice initialises the guard
  data structures and prepopulates the guardlist structures, including the
  UTOPIC_GUARDLIST and DYSTOPIC_GUARDLIST (cf. §XXX).  Additionally, the
  structures have been designed to make updates efficient both in terms of
  memory and time, in order that these and other portions of the code which
  require an up-to-date guard structure are capable of obtaining such.

    0. Determine if the local network is potentially accessible.

       Alice should attempt to discover if the local network is up or down,
       based upon information such as the availability of network interfaces
       and configured routing tables.  See #16120. [0]

       [XXX: This section needs to be fleshed out more.  I'm ignoring it for
       now, but since others have expressed interest in doing this, I've added
       this preliminary step. —isis]

    1. Check that we have not already attempted to add too many guards
       (cf. proposal #241).

    2. Then, if the PRIMARY_GUARDS on our list are marked offline, the
       algorithm attempts to retry them, to ensure that they were not flagged
       offline erroneously when the network was down. This retry attempt
       happens only once every 20 minutes to avoid infinite loops.

       [Should we do an exponential decay on the retry as s7r suggested? —isis]

    3. Take the list of all available and fitting entry guards and return the
       top one in the list.

    4. If there were no available entry guards, the algorithm adds a new entry
       guard and returns it.  [XXX detail what "adding" means]

    5. Go through steps 1-4 of the algorithm above, using the UTOPIC_GUARDLIST.

       5.a. When the GUARDLIST_FAILOVER_THRESHOLD of the UTOPIC_GUARDLIST has
            been tried (without success), Alice should begin trying steps 1-4
            with entry guards from the DYSTOPIC_GUARDLIST as well.  Further,
            if no nodes from UTOPIC_GUARDLIST work, and it appears that the
            DYSTOPIC_GUARDLIST nodes are accessible, Alice should make a note
            to herself that she is possibly behind a fascist firewall.

       5.b. If no nodes from either the UTOPIC_GUARDLIST or the
            DYSTOPIC_GUARDLIST are working, Alice should make a note to
            herself that the network has potentially gone down.  Alice should
            then schedule, at exponentially decaying times, to rerun steps 0-5.
           
            [XXX Should we do step 0? Or just 1-4?  Should we retain any
            previous assumptions about FascistFirewall?  —isis]

    6. [XXX Insert potential other fallback mechanisms, e.g. switching to
       using bridges? —isis]


§3. New Data Structures, Consensus Parameters, & Configurable Variables

§3.1. Consensus Parameters & Configurable Variables

    Variables marked with an asterisk (*) SHOULD be consensus parameters.

    DYSTOPIC_GUARDS ¹ 
        All nodes listed in the most recent consensus which are marked with
        the Guard flag and which advertise their ORPort(s) on 80, 443, or any
        other addresses and/or ports controllable via the FirewallPorts and
        ReachableAddresses configuration options.

    UTOPIC_GUARDS
        All nodes listed in the most recent consensus which are marked with
        the Guard flag and which do NOT advertise their ORPort(s) on 80, 443,
        or any other addresses and/or ports controllable via the FirewallPorts
        and ReachableAddresses configuration options.

    PRIMARY_GUARDS * 
       The number of first, active guards on either the UTOPIC_GUARDLIST
       or DYSTOPIC_GUARDLIST that we treat as "primary". We will go to
       extra lengths to ensure that we connect to one of our primary guards,
       before we fall back to a lower priority guard. By "active" we mean that
       we only consider guards that are present in the latest consensus as
       primary.

    UTOPIC_GUARDS_ATTEMPTED_THRESHOLD *
    DYSTOPIC_GUARDS_ATTEMPTED_THRESHOLD *
       These thresholds limit the number of guards from the UTOPIC_GUARDS and
       DYSTOPIC_GUARDS which should be partitioned into a single
       UTOPIC_GUARDLIST or DYSTOPIC_GUARDLIST respectively.  Thus, this
       represents the maximum percentage of each of UTOPIC_GUARDS and
       DYSTOPIC_GUARDS respectively which we will attempt to connect to.  If
       this threshold is hit we assume that we are offline, filtered, or under
       a path bias attack by a LAN adversary.

       There are currently 1600 guards in the network.  We allow the user to
       attempt 80 of them before failing (5% of the guards).  With regards to
       filternet reachability, there are 450 guards on ports 80 or 443, so the
       probability of picking such a guard here should be high.

       This logic is not based on bandwidth, but rather on the number of
       relays which possess the Guard flag.  This is for three reasons: First,
       because each possible *_GUARDLIST is roughly equivalent to others of
       the same category in terms of bandwidth, it should be unlikely [XXX How
       unlikely? —isis] for an OP to select a guardset which contains fewer
       nodes of high bandwidth (or vice versa).  Second, the path-bias attacks
       detailed in proposal #241 are best mitigated through limiting the
       number of possible entry guards which an OP might attempt to use, and
       varying the level of security an OP can expect based solely upon the
       fact that the OP picked a higher number of low-bandwidth entry guards
       rather than a lower number of high-bandwidth entry guards seems like a
       rather cruel and unusual punishment in addition to the misfortune of
       already having slower entry guards.  Third, we favour simplicity in the
       redesign of the guard selection algorithm, and introducing bandwidth
       weight fraction computations seems like an excellent way to
       overcomplicate the design and implementation.
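
       As a worked illustration of the definitions above, the following
       Python sketch partitions Guard-flagged relays by ORPort and turns
       a percentage threshold into a count.  The function names and the
       input layout are hypothetical, and the default port set stands in
       for whatever FirewallPorts and ReachableAddresses would allow.

```python
def partition_guards(guards, reachable_ports=frozenset({80, 443})):
    """Split Guard-flagged relays into (utopic, dystopic) lists.

    guards: iterable of (identity, orports) pairs, where orports is
            the collection of ports on which the relay advertises an
            ORPort.
    """
    utopic, dystopic = [], []
    for identity, orports in guards:
        if reachable_ports & set(orports):
            dystopic.append(identity)
        else:
            utopic.append(identity)
    return utopic, dystopic


def max_attempts(n_guards, threshold_percent):
    """Number of guards we may attempt before giving up."""
    return n_guards * threshold_percent // 100


# With the figures quoted above: 5% of 1600 guards is 80 attempts.
assert max_attempts(1600, 5) == 80
```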
       

§3.2. Data Structures

    UTOPIC_GUARDLIST
    DYSTOPIC_GUARDLIST
        These lists consist of a subset of UTOPIC_GUARDS and DYSTOPIC_GUARDS
        respectively.  The guards in these guardlists are the only guards to
        which we will attempt connecting.

        When an OP is attempting to connect to the network, she will construct
        a hashring structure containing all potential guard nodes from both
        UTOPIC_GUARDS and DYSTOPIC_GUARDS.  The nodes SHOULD BE inserted into
        the structure some number of times proportional to their consensus
        bandwidth weight. From this, the client will hash some information
        about themselves [XXX what info should we use? —isis] and, from that,
        choose #P number of points on the ring, where #P is
        {UTOPIC,DYSTOPIC}_GUARDS_ATTEMPTED_THRESHOLD proportion of the
        total number of unique relays inserted (if a duplicate is selected, it
        is discarded).  These selected nodes comprise the
        {UTOPIC,DYSTOPIC}_GUARDLIST for (first) entry guards.  (We say "first"
        in order to distinguish between entry guards and the vanguards
        proposed for hidden services in proposal #247.)

        [Perhaps we want some better terminology for this.  Suggestions
        welcome. —isis]

        Each GUARDLIST SHOULD have the property that the total sum of
        bandwidth weights for the nodes contained within it is roughly equal
        to each other guardlist of the same type (i.e. one UTOPIC_GUARDLIST is
        roughly equivalent in terms of bandwidth to another UTOPIC_GUARDLIST,
        but not necessarily equivalent to a DYSTOPIC_GUARDLIST).

        For space and time efficiency reasons, implementations of the
        GUARDLISTs SHOULD support prepopulation(), update(), insert(), and
        remove() functions.  A second data structure design consideration is
        that the amount of "shifting" — that is, the differential between
        constructed hashrings as nodes are inserted or removed (read: ORs
        falling in and out of the network consensus) — SHOULD be minimised in
        order to reduce the resources required for hashring update upon
        receiving a newer consensus.

        The implementation we propose is to use a Consistent Hashring,
        modified to dynamically allocate replications in proportion to
        fraction of total bandwidth weight.  As with a normal Consistent
        Hashring, replications determine the number of times the relay is
        inserted into the hashring.  The algorithm goes like this:

          router          ← ⊥
          key             ← 0
          replications    ← 0
          bw_weight_total ← 0
          for router ∈ GUARDLIST:
           | bw_weight_total ← bw_weight_total + BW(router)
          for router ∈ GUARDLIST:
           | replications ← FLOOR(CONSENSUS_WEIGHT_FRACTION(BW(router), bw_weight_total) * T)
           | factor ← (S / replications)
           | while replications != 0:
           |  | key ← (TOINT(HMAC(ID)[:X]) * replications * factor) mod S
           |  | INSERT(key, router)
           |  | replications ← replications - 1

        where:
 
          - BW is a function for extracting the value of an OR's `w Bandwidth=`
            weight line from the consensus,
          - GUARDLIST is either UTOPIC_GUARDLIST or DYSTOPIC_GUARDLIST,
          - CONSENSUS_WEIGHT_FRACTION is a function for computing a router's
            consensus weight in relation to the summation of consensus weights
            (bw_weight_total),
          - T is some arbitrary number for translating a router's consensus
            weight fraction into the number of replications,
          - H is some collision-resistant hash digest,
          - S is the total possible hash space of H (e.g. for SHA-1, with
            digest sizes of 160 bits, this would be 2^160),
          - HMAC is a keyed message authentication code which utilises H,
          - ID is a hexadecimal string containing the hash of the router's
            public identity key,
          - X is some (arbitrary) number of bytes to (optionally) truncate the
            output of the HMAC to,
          - S[:X] signifies truncation of S, some array of bytes, to a
            sub-array containing X bytes, starting from the first byte and
            continuing up to and including the Xth byte, such that the
            returned sub-array is X bytes in length,
          - INSERT is an algorithm for inserting items into the hashring,
          - TOINT converts a hexadecimal string to an integer.
 
        For routers A and B, where B has a little bit more bandwidth than A,
        this gets you a hashring which looks like this:

                           B-´¯¯`-BA
                        A,`        `.
                        /            \
                       B|            |B
                        \            /
                         `.        ,´A
                          AB--__--´B
 
        When B disappears, A remains in the same positions:

                           _-´¯¯`-_A
                        A,`        `.
                        /            \
                        |            |
                        \            /
                         `.        ,´A
                          A`--__--´
                                
        And similarly if A disappears:

                           B-´¯¯`-B
                         ,`        `.
                        /            \
                       B|            |B
                        \            /
                         `.        ,´
                           B--__--´B
 
        Thus, no "shifting" problems, and recalculation of the hashring when a
        new consensus arrives via the update() function is much more time
        efficient.

        Alternatively, for a faster and simpler algorithm, but non-uniform
        distribution of the keys, one could remove the "factor" and replace
        the derivation of "key" in the algorithm above with:

                key ← HMAC(ID || replications)[:X]

        A reference implementation in Python is available². [1]
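
        Separately from that reference implementation, the following is a
        minimal Python sketch of the simpler variant
        (key ← HMAC(ID || replications)[:X]), written only to illustrate why
        existing positions do not shift.  It is not the factor-based
        algorithm above.  The function names, the HMAC key, the choice of
        SHA-1, X = 8, and T = 10 are all assumptions made for illustration.

```python
import hashlib
import hmac


def ring_keys(secret, router_id, replications, x=8):
    """Return one ring position per replication, using the simpler
    variant: key = HMAC(ID || replications)[:X]."""
    keys = []
    for n in range(replications, 0, -1):
        message = router_id + str(n).encode("ascii")
        digest = hmac.new(secret, message, hashlib.sha1).digest()
        keys.append(int.from_bytes(digest[:x], "big"))
    return keys


def build_ring(secret, guardlist, t=100):
    """Build a sorted list of (key, router_id) pairs.

    guardlist maps router_id (bytes) to its bandwidth weight; t plays the
    role of T in the algorithm above.
    """
    bw_weight_total = sum(guardlist.values())
    ring = []
    for router_id, bw in guardlist.items():
        # FLOOR(weight fraction * T), done in integer arithmetic.
        replications = (bw * t) // bw_weight_total
        for key in ring_keys(secret, router_id, replications):
            ring.append((key, router_id))
    ring.sort()
    return ring


# Hypothetical guardlists: B has a little more bandwidth than A.
secret = b"hypothetical-hmac-key"
both = build_ring(secret, {b"A": 40, b"B": 60}, t=10)
only_a = build_ring(secret, {b"A": 40}, t=10)

a_before = {key for key, rid in both if rid == b"A"}
a_after = {key for key, rid in only_a}
print(a_before <= a_after)
```

        In this run, A's positions before B's removal are a subset of its
        positions afterwards.  A gains replications because its share of the
        total weight grows, but none of its existing keys move.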


§4. Footnotes

¹ "Dystopic" was chosen because those are the guards you should choose from if
  you're behind a FascistFirewall.

² One tiny caveat being that the ConsistentHashring class doesn't dynamically
  assign replication count by bandwidth weight; it gets initialised with the
  number of replications.  However, nothing in the current implementation
  prevents you from doing:
      >>> h = ConsistentHashring('SuperSecureKey', replications=6)
      >>> h.insert(A)
      >>> h.replications = 23
      >>> h.insert(B)
      >>> h.replications = 42
      >>> h.insert(C)


§5. References

  [0]: https://trac.torproject.org/projects/tor/ticket/16120
  [1]: https://gitweb.torproject.org/user/isis/bridgedb.git/tree/bridgedb/hashring.py?id=949d33e8#n481


-*- coding: utf-8 -*-
Filename: 260-rend-single-onion.txt
Title: Rendezvous Single Onion Services
Author: Tim Wilson-Brown, John Brooks, Aaron Johnson, Rob Jansen, George Kadianakis, Paul Syverson, Roger Dingledine
Created: 2015-10-17
Status: Finished
Implemented-In: 0.2.9.3-alpha

1. Overview

   Rendezvous single onion services are an alternative design for single onion
   services, which trade service-side location privacy for improved
   performance, reliability, and scalability.

   Rendezvous single onion services have a .onion address identical to any
   other onion service. The descriptor contains the same information as the
   existing double onion (hidden) service descriptors. The introduction point
   and rendezvous protocols occur as in double onion services, with one
   modification: one-hop connections are made from the onion server to the
   introduction and rendezvous points.

   This proposal is a revision of the unnumbered proposal Direct Onion
   Services: Fast-but-not-hidden services by Roger Dingledine, and George
   Kadianakis at
   https://lists.torproject.org/pipermail/tor-dev/2015-April/008625.html

   It incorporates much of the discussion around hidden services since April
   2015, including content from Single Onion Services (Proposal #252) by John
   Brooks, Paul Syverson, and Roger Dingledine.

2. Motivation

   Rendezvous single onion services are best used by sites which:
      * Don't require location anonymity
      * Would appreciate lower latency or self-authenticated addresses
      * Would like to work with existing tor clients and relays
      * Can't accept connections to an open ORPort

   Rendezvous single onion services have a few benefits over double onion
   services:

      * Connection latency is lower, as one-hop circuits are built to the
        introduction and rendezvous points, rather than three-hop circuits
      * Stream latency is reduced on a four-hop circuit
      * Less Tor network capacity is consumed by the service, as there are
        fewer hops (4 rather than 6) between the client and server via the
        rendezvous point

   Rendezvous single onion services have a few benefits over single onion
   services:

      * A rendezvous single onion service can load-balance over multiple
        rendezvous backends (see proposal #255)
      * A rendezvous single onion service doesn't need an accessible ORPort
        (it works behind a NAT, and in server enclaves that only allow
        outward connections)
      * A rendezvous single onion service is compatible with existing tor
        clients, hidden service directories, introduction points, and
        rendezvous points

   Rendezvous single onion services have a few drawbacks compared with single
   onion services:

      * Connection latency is higher, as one-hop circuits are built to the
        introduction and rendezvous points. Single onion services perform one
        extend to the single onion service's ORPort only

   It should also be noted that, while single onion services receive many
   incoming connections from different relays, rendezvous single onion
   services make many outgoing connections to different relays. This should
   be taken into account when planning the connection capacity of the
   infrastructure supporting the onion service.

   Rendezvous single onion services are not location hidden on the service
   side, but clients retain all of the benefits and privacy of onion
   services. (The rationale for the 'single' and 'double' nomenclature is
   described in section 7.4 of proposal #252.)

   We believe that it is important for the Tor community to be aware of the
   alternative single onion service designs, so that we can reach consensus
   on the features and tradeoffs of each design. However, we recognise that
   each additional flavour of onion service splits the anonymity set of onion
   service users. Therefore, it may be best for user anonymity that not all
   designs are adopted, or that mitigations are implemented along with each
   additional flavour. (See sections 8 & 9 for a further discussion.)

3. Onion descriptors

   The rendezvous single onion descriptor format is identical to the double
   onion descriptor format.

4. Reaching a rendezvous single onion service as a client

   Clients reach rendezvous single onion services in an identical fashion
   to double onion services. The rendezvous design means that clients do not
   know whether they are talking to a double or rendezvous single onion
   service, unless that service tells them. (This may be a security issue.)

   However, the use of a four-hop path between client and rendezvous single
   onion service may be statistically distinguishable. (See section 8 for
   further discussion of security issues.)

   (Please note that this proposal follows the hop counting conventions in the
   tor source code. A circuit with a single connection between the client and
   the endpoint is one-hop; a circuit with 4 connections (and 3 nodes) between
   the client and endpoint is four-hop.)

5. Publishing a rendezvous single onion service

   To act as a rendezvous single onion service, a tor instance (or cooperating
   group of tor instances) must:

      * Publish onion descriptors in the same manner as any onion service,
        using three-hop circuits. This avoids service blocking by IP address.
        Proposal #224 (next-generation hidden services) avoids blocking by
        onion address.
      * Perform the rendezvous protocol in the same manner as a double
        onion service, but make the intro and rendezvous circuits one-hop.
        (This may allow intro and rendezvous points to block the service.)

5.1. Configuration options

5.1.1 RendezvousSingleOnionServiceNonAnonymousServer

   The tor instance operating a rendezvous single onion service must make
   one-hop circuits to the introduction and rendezvous points:

      RendezvousSingleOnionServiceNonAnonymousServer 0|1
        If set, make one-hop circuits between the Rendezvous Single Onion
        Service server, and the introduction and rendezvous points. This
        option makes every onion service instance hosted by this tor instance
        a Rendezvous Single Onion Service. (Default: 0)

   Because of the grave consequences of misconfiguration here, we have added
   'NonAnonymous' to the name of the torrc option. Furthermore, Tor MUST issue
   a startup warning message to operators of the onion service if this feature
   is enabled.
   [Should the name start with 'NonAnonymous' instead?]

   As RendezvousSingleOnionServiceNonAnonymousServer modifies the behaviour
   of every onion service on a tor instance, it is impossible to run hidden
   (double onion) services and rendezvous single onion services on the same
   tor instance. This is considered a feature, as it prevents hidden services
   from being discovered via rendezvous single onion services on the same tor
   instance.

5.1.2 Recommended Additional Options: Correctness

   Based on the experiences of Tor2Web with one-hop paths, operators should
   consider using the following options with every rendezvous single onion
   service, and every single onion service:
     
      UseEntryGuards 0
        One-hop paths do not use entry guards. This also deactivates the entry
        guard pathbias code, which is not compatible with one-hop paths. Entry
        guards are a security measure against Sybil attacks. Unfortunately,
        they also act as the bottleneck of busy onion services and overload
        those Tor relays.

      LearnCircuitBuildTimeout 0
        Learning circuit build timeouts is incompatible with one-hop paths.
        It also creates additional, unnecessary connections.

   Perhaps these options should be set automatically on (rendezvous) single
   onion services. Tor2Web sets these options automatically:
      UseEntryGuards 0
      LearnCircuitBuildTimeout 0

5.1.3 Recommended Additional Options: Performance

      LongLivedPorts
        The default LongLivedPorts setting creates additional, unnecessary
        connections. This specifies no long-lived ports (the empty list).

      PredictedPortsRelevanceTime 0 seconds
        The default PredictedPortsRelevanceTime setting creates additional,
        unnecessary connections.

   High-churn / quick-failover RSOS using descriptor competition strategies
   should consider setting the following option:

      RendPostPeriod 600 seconds
        Refresh onion service descriptors, choosing an interval between
        0 and 2*RendPostPeriod. Tor also posts descriptors on bootstrap, and
        when they change.
        (Strictly, 30 seconds after they first change, for descriptor
        stability.)

        XX - Reduce the minimum RendPostPeriod for RSOS to 1 minute?
        XX - Make the initial post 30 + rand(1*rendpostperiod) ?
             (Avoid thundering herd, but don't hide startup time)

   However, we do NOT recommend setting the following option to 1, unless bug
   #17359 is resolved so tor onion services can bootstrap without predicted
   circuits.

      __DisablePredictedCircuits 0
        This option disables all predicted circuits. It is equivalent to:
          LearnCircuitBuildTimeout 0
          LongLivedPorts
          PredictedPortsRelevanceTime 0 seconds
          And turning off hidden service server preemptive circuits, which is
          currently unimplemented (#17360)

5.1.4 Recommended Additional Options: Security

   We recommend that no other services are run on a rendezvous single onion
   service tor instance. Since tor runs as a client (and not a relay) by
   default, rendezvous single onion service operators should set:

      XX - George says we don't allow operators to run HS/Relay any more,
           or that we warn them.

      SocksPort 0
        Disallow connections from client applications to the tor network
        via this tor instance.

      ClientOnly 1
        Even if the defaults file configures this instance to be a relay,
        never relay any traffic or serve any descriptors.
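
   The options recommended in sections 5.1.1 to 5.1.4 can be collected into
   one torrc fragment.  The sketch below is a small Python helper that
   renders such a fragment; the helper and its names are hypothetical, and
   the option names and values are copied from this proposal (released tor
   versions may name the single onion option differently).
   __DisablePredictedCircuits is left out, as recommended above, and the
   usual onion service directory and port options are not shown.

```python
# Option names and values are taken from sections 5.1.1 to 5.1.4 above.
RSOS_OPTIONS = [
    ("RendezvousSingleOnionServiceNonAnonymousServer", "1"),
    ("UseEntryGuards", "0"),
    ("LearnCircuitBuildTimeout", "0"),
    ("LongLivedPorts", ""),  # empty list: no long-lived ports
    ("PredictedPortsRelevanceTime", "0 seconds"),
    ("SocksPort", "0"),
    ("ClientOnly", "1"),
]


def render_torrc(options):
    """Render (name, value) pairs as torrc lines, one option per line."""
    lines = [f"{name} {value}".rstrip() for name, value in options]
    return "\n".join(lines) + "\n"


print(render_torrc(RSOS_OPTIONS), end="")
```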

5.2. Publishing descriptors

   A single onion service must publish descriptors in the same manner as any
   onion service, as defined by rend-spec.

5.3. Authorization

   Client authorization for a rendezvous single onion service is possible via
   the same methods used for double onion services.

6. Related Proposals, Tools, and Features

6.1. Load balancing

   High capacity services can distribute load and implement failover by:
      * running multiple instances that publish to the same onion service
        directories,
      * publishing descriptors containing multiple introduction points
        (OnionBalance),
      * publishing different introduction points to different onion service
        directories (OnionBalance upcoming(?) feature),
      * handing off rendezvous to a different tor instance via control port
        messages (proposal #255),
   or by a combination of these methods.

6.2. Ephemeral single onion services (ADD_ONION)

   The ADD_ONION control port command could be extended to support ephemerally
   configured rendezvous single onion services. Given that
   RendezvousSingleOnionServiceNonAnonymousServer modifies the behaviour of
   all onion services on a tor instance, if it is set, any ephemerally
   configured onion service should become a rendezvous single onion service.

6.3. Proposal 224 ("Next-Generation Hidden Services")

   This proposal is compatible with proposal 224, with onion services
   acting just like a next-generation hidden service, but making one-hop
   paths to the introduction and rendezvous points.

6.4. Proposal 246 ("Merging Hidden Service Directories and Intro Points")

   This proposal is compatible with proposal 246. The onion service will
   publish its descriptor to the introduction points in the same manner as any
   other onion service. Clients will use the merged hidden service directory
   and introduction point just as they do for other onion services.

6.5. Proposal 252 ("Single Onion Services")

   This proposal is compatible with proposal 252. The onion service will
   publish its descriptor to the introduction points in the same manner as any
   other onion service. Clients can then choose to extend to the single onion
   service, or continue with the rendezvous protocol.

   Running a rendezvous single onion service and single onion service allows
   older clients to connect via rendezvous, and newer clients to connect via
   extend. This is useful for the transition period where not all clients
   support single onion services.

6.6. Proposal 255 ("Hidden Service Load Balancing")

   This proposal is compatible with proposal 255. The onion service will
   perform the rendezvous protocol in the same manner as any other onion
   service. Controllers can then choose to handoff the rendezvous point
   connection to another tor instance, which should also be configured
   as a rendezvous single onion service.

7. Considerations

7.1 Modifying RendezvousSingleOnionServiceNonAnonymousServer at runtime
   
   Implementations should not reuse introduction points or introduction point
   circuits if the value of RendezvousSingleOnionServiceNonAnonymousServer is
   different than it was when the introduction point was selected. This is
   because these circuits will have an undesirable length.

   There is specific code in tor that preserves introduction points on a HUP.
   If RendezvousSingleOnionServiceNonAnonymousServer has changed, all circuits
   should be closed, and all introduction points must be discarded.

7.2 Delaying connection expiry

   Tor clients typically expire connections much faster than tor relays
   [citation needed].

   (Rendezvous) single onion service operators may find that keeping
   connections open saves on connection latency. However, it may also place an
   additional load on the service. (This could be implemented by increasing the
   configured connection expiry time.)

7.3. (No) Benefit to also running a Tor relay

   In tor Trac ticket #8742, running a relay and hidden onion service on the
   same tor instance was disabled for security reasons. While there may be
   benefits to running a relay on the same instance as a rendezvous single
   onion service (existing connections mean lower latency, it helps the tor
   network overall), a security analysis of this configuration has not yet
   been performed. In addition, a potential drawback is overloading a busy
   single onion service.

7.4 Predicted circuits

   We should look at whether we can further optimize the predicted circuits
   that Tor makes as an onion service for this mode.

8. Security Implications

8.1 Splitting the Anonymity Set

   Each additional flavour of onion service, and each additional externally
   visible onion service feature, provides opportunities for fingerprinting.

   Also, each additional type of onion service shrinks the anonymity set for
   users of double onion (hidden) services who require server location
   anonymity. These users benefit from the cover provided by current users of
   onion services, who use them for client anonymity, self-authentication,
   NAT-punching, or other benefits.

   For this reason, features that shrink the double onion service anonymity
   set should be carefully considered. The benefits and drawbacks of
   additional features also often depend on a particular threat model.

   It may be that a significant number of users and sites adopt (rendezvous)
   single onion services due to their benefits. This could increase the
   traffic on the tor network, therefore increasing anonymity overall.
   However, the unique behaviour of each type of onion service may still be
   distinguishable on both the client and server ends of the connection.

8.2 Hidden Service Designs can potentially be more secure

   As a side-effect, by optimizing for performance in this feature, it
   allows us to lean more heavily towards security decisions for
   regular onion services.

8.3 One-hop onion service paths may encourage more attacks

   There's a possible second-order effect here since both RSOS
   and double onion services will have foo.onion addresses and it's
   not clear based on the address which one the service uses:
   if *some* .onion addresses are easy to track down, are we encouraging
   adversaries to attack all rendezvous points just in case?

9. Further Work

Further proposals or research could attempt to mitigate the anonymity-set
splitting described in section 8. Here are some initial ideas.

9.1 Making Client Exit connections look like Client Onion Service Connections

   A mitigation to this fingerprinting is to make each (or some) exit
   connections look like onion service connections. This provides cover for
   particular types of onion service connections. Unfortunately, it is not
   possible to make onion service connections look like exit connections,
   as there are no suitable dummy servers to exit to on the Internet.

9.1.1 Making Client Exit connections perform Descriptor Downloads

   (Some) exit connections could perform a dummy descriptor download.
   (However, descriptors for recently accessed onion services are cached, so
   dummy downloads should only be performed occasionally.)

   Exit connections already involve a four-hop "circuit" to the server
   (including the connection between the exit and the server on the Internet).
   The server on the Internet is not included in the consensus. Therefore,
   this mitigation would effectively cover single onion services which are not
   relays.

9.1.2 Making Client Exit connections perform the Rendezvous Protocol

   (Some) exit connections could perform a dummy rendezvous protocol.

   Exit connections already involve a four-hop "circuit" to the server
   (including the connection between the exit and the server on the Internet).
   Therefore, this mitigation would effectively cover rendezvous single onion
   services, as long as a dummy descriptor download was also performed
   occasionally.

9.1.3 Making Single Onion Service rendezvous points perform name resolution

   Currently, Exits perform DNS name resolution, and changing this behaviour
   would cause unacceptable connection latency. Therefore, we could make
   onion service connections look like exit connections by making the
   rendezvous point do name resolution (that is, descriptor fetching), and, if
   needed, the introduction part of the protocol. This could potentially
   *reduce* the latency of single onion service connections, depending on the
   length of the paths used by the rendezvous point.

   However, this change makes rendezvous points almost as powerful as Exits,
   so a careful security analysis will need to be performed before this is
   implemented.

   There is also a design issue with rendezvous name resolution: a client
   wants to leave resolution (descriptor download) to the RP, but it doesn't
   know whether it can use the exit-like protocol with an RP until it has
   downloaded the descriptor. This might mean that single onion services of
   both flavours need a different address style or address namespace. We could
   use .single.onion or something. (This would require an update to the HSDir
   code.)

9.2 Performing automated and common queries over onion services

   Tor could create cover traffic for a flavour of onion service by performing
   automated or common queries via an onion service of that type. In addition,
   onion service-based checks have security benefits over DNS-based checks.
   See Genuine Onion, Syverson and Boyce, 2015, at
   http://www.nrl.navy.mil/itd/chacs/syverson-genuine-onion-simple-fast-flexible-and-cheap-website-authentication

   Here are some examples of automated queries that could be performed over
   an onion service:

9.2.1 torcheck over onion service

   torcheck ("Congratulations! This browser is configured to use Tor.") could
   be retrieved from an onion service.

   Incidentally, this would resolve the exitmap issues in #17297, but it
   would also fail to check that exit connections work, which is important for
   many Tor Browser users.

9.2.2 Tor Browser version checks over onion service

   Running tor browser version checks over an onion service seems to be an
   excellent use-case for onion services. It would also have the Tor Project
   "eating its own dogfood", that is, using onion services for its essential
   services.

9.2.3 Tor Browser downloads over onion service

   Running tor browser downloads over an onion service might require some work
   on the onion service codebase to support high loads, load-balancing, and
   failover. It is a good use case for a (rendezvous) single onion service,
   as the traffic over the tor network is only slightly higher than for
   Tor Browser downloads over tor. (4 hops for [R]SOS, 3 hops for Exit.)

9.2.4 SSL Observatory submissions over onion service

   HTTPS certificates could be submitted to HTTPS Everywhere's SSL Observatory
   over an onion service.

   This option is disabled in Tor Browser by default. Perhaps some users would
   be more comfortable enabling submission over an onion service, due to the
   additional security benefits.

Filename: 261-aez-crypto.txt
Title: AEZ for relay cryptography
Author: Nick Mathewson
Created: 28 Oct 2015
Status: Obsolete

0. History

   I wrote the first draft of this around October.  This draft takes a
   more concrete approach to the open questions from last time around.

1. Summary and preliminaries

   This proposal describes an improved algorithm for circuit
   encryption, based on the wide-block SPRP AEZ. I also describe the
   attendant bookkeeping, including CREATE cells, and several
   variants of the proposal.

   For more information about AEZ, see
           http://web.cs.ucdavis.edu/~rogaway/aez/

   For motivations, see proposal 202.

2. Specifications

2.1. New CREATE cell types.

   We add a new CREATE cell type that behaves as an ntor cell but which
   specifies that the circuit will be created to use this mode of
   encryption.

   [TODO: Can/should we make this unobservable?]

   The ntor handshake is performed as usual, but a different PROTOID is
   used:
        "ntor-curve25519-sha256-aez-1"

   To derive keys under this handshake, we use SHAKE256 to derive the
   following output:

     struct shake_output {
         u8 aez_key[48];
         u8 chain_key[32];
         u8 chain_val_forward[16];
         u8 chain_val_backward[16];
     };

   The first two fields are constant for the lifetime of the
   circuit.
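
   As a rough sketch of this derivation, the Python below expands a secret
   with SHAKE256 and splits the output into the four fields of
   shake_output.  What exactly is fed into SHAKE256 is not stated here, so
   the secret_input argument is a placeholder assumption, and the function
   and type names are hypothetical.

```python
import hashlib
from collections import namedtuple

ShakeOutput = namedtuple(
    "ShakeOutput",
    ["aez_key", "chain_key", "chain_val_forward", "chain_val_backward"],
)

# Field sizes in bytes, in the order they appear in struct shake_output.
FIELD_SIZES = (48, 32, 16, 16)


def derive_circuit_keys(secret_input):
    """Expand secret_input with SHAKE256 and split it into the four fields.

    secret_input is an assumption: the handshake output used as SHAKE256
    input is not specified in this section.
    """
    stream = hashlib.shake_256(secret_input).digest(sum(FIELD_SIZES))
    fields = []
    offset = 0
    for size in FIELD_SIZES:
        fields.append(stream[offset:offset + size])
        offset += size
    return ShakeOutput(*fields)


keys = derive_circuit_keys(b"placeholder handshake secret")
print([len(field) for field in keys])
```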

2.2. New relay cell payload

   We specify the following relay cell payload format, to be used when
   the exit node circuit hop was created with the CREATE format in 2.1
   above:

     struct relay_cell_payload {
        u32 zero_1;
        u16 zero_2;
        u16 stream_id;
        u16 length IN [0..498];
        u8 command;
        u8 data[498]; // payload_len - 11
     };

   Note that the payload length is unchanged.  The fields are now
   rearranged to be aligned.  The 'recognized' and 'length' fields are
   replaced with zero_1, zero_2, and the high 7 bits of length, for a
   minimum of 55 bits of unambiguous verification.  (Additional
   verification can be done by checking the other fields for
   correctness; AEZ users can exploit plaintext redundancy for
   additional cryptographic checking.)
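
   The layout and the zero-field check can be sketched in Python as follows.
   This is an illustration under assumptions, not a wire-format definition:
   big-endian byte order is assumed, and the function names are
   hypothetical.

```python
import struct

PAYLOAD_LEN = 509
MAX_DATA_LEN = PAYLOAD_LEN - 11  # 498

# zero_1 (u32), zero_2 (u16), stream_id (u16), length (u16),
# command (u8), data[498].  Big-endian order is an assumption.
CELL_FORMAT = struct.Struct(">IHHHB498s")


def pack_payload(stream_id, command, data):
    """Build a 509-byte payload; data is zero-padded to 498 bytes."""
    if len(data) > MAX_DATA_LEN:
        raise ValueError("data too long for one relay cell")
    return CELL_FORMAT.pack(0, 0, stream_id, len(data), command, data)


def unpack_payload(payload):
    """Return (stream_id, command, data), or None if not recognized."""
    zero_1, zero_2, stream_id, length, command, data = \
        CELL_FORMAT.unpack(payload)
    # length <= 498 implies that the high 7 bits of length are zero.
    if zero_1 != 0 or zero_2 != 0 or length > MAX_DATA_LEN:
        return None
    return stream_id, command, data[:length]


payload = pack_payload(stream_id=7, command=2, data=b"hello")
print(len(payload), unpack_payload(payload))
```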

   When encrypting a cell for a hop that was created using one of these
   circuits, clients and relays encrypt them using the AEZ algorithm
   with the following parameters:

       Let Chain denote chain_val_forward if this is a forward cell
          or chain_val_backward otherwise.

       tau = 0

       # We set tau=0 because we want no per-hop ciphertext expansion.  Instead
       # we use redundancy in the plaintext to authenticate the data.

       Nonce =
         struct {
           u64 cell_number;
           u8 is_forward;
           u8 is_early;
         }

       # The cell number is the number of relay cells that have
       # traveled in this direction on this circuit before this cell.
       # ie, it's zero for the first cell, one for the second, etc.
       #
       # is_forward is 1 for outbound cells, 0 for inbound cells.
       # is_early is 1 for cells packaged as RELAY_EARLY, 0 for
       #   cells packaged as RELAY.
       #
       # Technically these two values would be more at home in AD
       # than in Nonce; but AEZ doesn't actually distinguish N and AD
       # internally.

       Define CELL_CHAIN_BYTES = 32

       AD = [ XOR(prev_plaintext[:CELL_CHAIN_BYTES],
                  prev_ciphertext[:CELL_CHAIN_BYTES]),
              Chain ]

       # Using the previous cell's plaintext/ciphertext as additional data
       # guarantees that any corrupt ciphertext received will corrupt the
       # plaintext, which will corrupt all future plaintexts.

       Set Chain = AES(chain_key, Chain) xor Chain.

       # This 'chain' construction is meant to provide forward
       # secrecy.  Each chain value is replaced after each cell with a
       # (hopefully!) hard-to-reverse construction.

   This instantiates a wide-block cipher, tweaked based on the cell
   index and direction.  It authenticates part of the previous cell's
   plaintext, thereby ensuring that if the previous cell was corrupted,
   this cell will be unrecoverable.
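
   The per-cell bookkeeping (Nonce, AD, and the Chain update) can be
   sketched in Python as below.  The AEZ call itself is omitted.  The byte
   order of the nonce is an assumption, the function names are
   hypothetical, and encrypt_block stands for a caller-supplied AES
   single-block encryption function, since the Python standard library
   does not provide AES.

```python
import struct

CELL_CHAIN_BYTES = 32


def make_nonce(cell_number, is_forward, is_early):
    """Pack u64 cell_number, u8 is_forward, u8 is_early.

    Big-endian byte order is an assumption.
    """
    return struct.pack(">QBB", cell_number, int(is_forward), int(is_early))


def xor_bytes(a, b):
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_ad(prev_plaintext, prev_ciphertext, chain):
    """AD = [ XOR of the previous cell's leading bytes, Chain ]."""
    mixed = xor_bytes(prev_plaintext[:CELL_CHAIN_BYTES],
                      prev_ciphertext[:CELL_CHAIN_BYTES])
    return [mixed, chain]


def next_chain(encrypt_block, chain_key, chain):
    """Chain = AES(chain_key, Chain) xor Chain.

    encrypt_block(key, block) must be supplied by the caller and is
    expected to encrypt one 16-byte block.
    """
    return xor_bytes(encrypt_block(chain_key, chain), chain)
```

   Passing the block cipher in as a parameter keeps the sketch independent
   of any particular crypto library.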

3. Design considerations

3.1. Wide-block pros and cons?

   See proposal 202, section 4.

3.2. Given wide-block, why AEZ?

   It's a reasonably fast, probably secure wide-block cipher.  In
   particular, it's performance-competitive with AES_CTR, and far better
   than what we're doing now.  See performance appendix.

   It seems secure-ish too.  Several cryptographers I know seem to
   think it's likely secure enough, and almost surely at least as
   good as AES.

   [There are many other competing wide-block SPRP constructions if
   you like.  Many require that inputs be an integer number of blocks, or
   aren't tweakable.  Some are slow.  Do you know a good one?]

3.3. Why _not_ AEZ?

   There are also some reasons to consider avoiding AEZ, even if we do
   decide to use a wide-block cipher.

   FIRST, it is complicated to implement.  As the specification says,
   "The easiness claim for AEZ is with respect to ease and versatility
   of use, not implementation."

   SECOND, it's still more complicated to implement well (fast,
   side-channel-free) on systems without AES acceleration.  We'll need
   to pull the round functions out of fast assembly AES, which is
   everybody's favorite hobby.

   THIRD, it's really horrible to try to do it in hardware.

   FOURTH, it is comparatively new.  Although several cryptographers
   like it, and it is closely related to a system with a security proof,
   you never know.

   FIFTH, something better may come along.


4. Alternative designs

4.1. Two keys.

   We could have a separate AEZ key for forward and backward encryption.
   This would use more space, however.

4.2. Authenticating things differently

   In computing the AD, we could replace xor with concat.

   In computing the AD, we could replace CELL_CHAIN_BYTES with 16, or
   509.

   (Another thing we might dislike about the current proposal is
   that it appears to require us to remember 32 bytes of plaintext
   until we get another cell.  But that part is fixable: note that
   in the structure of AEZ, the AD is processed in the AEZ-hash()
   function, and then no longer used.  We can compute the AEZ-hash()
   to be used for the next cell after each cell is en/de crypted.)

4.3. Other hashes.

   We could update the ntor definition used in this to use a better hash
   than SHA256 inside.

4.4. Less predictable plaintext.

   A positively silly option would be to reserve the last X bytes of
   each relay cell's plaintext for random bytes, if they are not used
   for payload.  This might help a little, in a really doofy way.


A.1. Performance notes: memory requirements

  Let's ignore Tor overhead here, but not openssl overhead.

  IN THE CURRENT PROTOCOL, the total memory required at each relay is: 2
  sha1 states, 2 aes states.

  Each sha1 state uses 96 bytes.  Each aes state uses 244 bytes.  (Plus
  32 bytes counter-mode overhead.)  This works out to 704 bytes at each
  hop.

  IN THE PROTOCOL ABOVE, using an optimized AEZ implementation, we'll
  need 128 bytes for the expanded AEZ key schedule.  We'll need another
  244 bytes for the AES key schedule for the chain key.  And there's 32
  bytes of chaining values.  This gives us 404 bytes at each hop, for a
  savings of 42%.

  If we used separate AES and AEZ keys in each direction, we would be
  looking at 776 bytes, for a space increase of 10%.

A.2. Performance notes: CPU requirements on AESNI hosts

  The cell_ops benchmark in bench.c purports to tell us how long it
  takes to encrypt a tor cell.  But it wasn't really telling the truth,
  since it only did one SHA1 operation every 2^16 cells, when
  entries and exits really do one SHA1 operation every end-to-end cell.

  I expanded it to consider the slow (SHA1) case as well.  I ran this on
  my friendly OSX laptop (2.9 GHz Intel Core i5) with AESNI support:

                  Inbound cells: 169.72 ns per cell.
                 Outbound cells: 175.74 ns per cell.
       Inbound cells, slow case: 934.42 ns per cell.
      Outbound cells, slow case: 928.23 ns per cell.


  Note that for an n-hop circuit, each cell does the slow case and
  (n-1) fast cases at the entry; the slow case at the exit, and the fast
  case at each middle relay.

  So for 3 hop circuits, the total current load on the network is
  roughly 425 ns per hop, concentrated at the exit.
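
  As a rough check of that figure, the sketch below averages the
  benchmark numbers above over the three relays of a circuit.  It
  assumes (my reading, not something the benchmark states) that the
  exit pays the slow case and the other two relays pay the fast case;
  the function name is made up for illustration.

```python
# Figures copied from the benchmark output above (ns per cell).
FAST = {"inbound": 169.72, "outbound": 175.74}
SLOW = {"inbound": 934.42, "outbound": 928.23}


def per_hop_relay_cost(fast_ns, slow_ns, hops=3):
    """Average relay-side cost per hop.

    Assumption: one relay (the exit) does the slow case and the
    remaining (hops - 1) relays do the fast case.
    """
    return (slow_ns + (hops - 1) * fast_ns) / hops


# Per-direction averages, then the mean of the two directions.
costs = {d: per_hop_relay_cost(FAST[d], SLOW[d]) for d in FAST}
average = sum(costs.values()) / len(costs)
print(f"inbound {costs['inbound']:.1f} ns, "
      f"outbound {costs['outbound']:.1f} ns, mean {average:.1f} ns")
```

  The same function applied to the non-AESNI figures in A.3 should
  land near the 1520 ns estimate given there.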

  Then I started messing around with AEZ benchmarks, using the
  aesni-optimized version of AEZ on the AEZ website.  (Further
  optimizations are probably possible.)  For the AES256, I
  used the usual aesni-based aes construction.

  Assuming xor is free in comparison to other operations, and
  CELL_CHAIN_BYTES=32, I get roughly 270 ns per cell for the entire
  operation.

  If we were to pick CELL_CHAIN_BYTES=509, we'd be looking at around 303
  ns per cell.

  If we were to pick CELL_CHAIN_BYTES=509 and replace XOR with CONCAT,
  it would be around 355 ns per cell.

  If we were to keep CELL_CHAIN_BYTES=32, and remove the
  AES256-chaining, I see values around 260 ns per cell.

  (These are all very spotty measurements, with some xors left off, and
  not much effort done at optimization beyond what the default
  optimized AEZ does today.)

A.3. Performance notes: what if we don't have AESNI?

  Here I'll test on a host with sse2 and ssse3, but no aesni instructions.
  From Tor's benchmarks I see:

               Inbound cells: 1218.96 ns per cell.
              Outbound cells: 1230.12 ns per cell.
    Inbound cells, slow case: 2099.97 ns per cell.
   Outbound cells, slow case: 2107.45 ns per cell.

  For a 3-hop circuit, that gives on average 1520 ns per cell.

  [XXXX Do a real benchmark with a fast AEZ backend. First, write one.
   Preliminary results are a bit disappointing, though, so I'll
   need to investigate alternatives as well.]

Filename: 262-rekey-circuits.txt
Title: Re-keying live circuits with new cryptographic material
Author: Nick Mathewson
Created: 28 Dec 2015
Status: Reserve

   [NOTE: This proposal is in "Reserve" status because the issue it
   addresses should be solved by any future relay encryption
   protocol. (2020 July 31)]

1. Summary and Motivation

   Cryptographic primitives have an upper limit of how much data should
   be encrypted with the same key.  But currently Tor circuits have no
   upper limit of how much data they will deliver.

   While the upper limit of our AES-CTR crypto is ridiculously high
   (on the order of hundreds of exabytes), the AEZ crypto we're
   considering suggests we should rekey after the equivalent in cells
   of around 280 TB.  280 TB is still high, but not ridiculously
   high.

   So in this proposal I explain a general mechanism for rekeying a
   circuit.  We shouldn't actually build this unless we settle on

2. RELAY_REKEY cell operation

   To rekey, the circuit initiator ("client") can send a new RELAY_REKEY cell
   type:

        struct relay_rekey {
          u16 rekey_method IN [0, 1];
          u8 rekey_data[];
        }

        const REKEY_METHOD_ACK = 0;
        const REKEY_METHOD_SHAKE256_CLIENT = 1;

   This cell means "I am changing the key." The new key material will be
   derived from SHAKE256 of the aez_key concatenated with the rekey_data
   field, to fill a new shake_output structure.  The client should set
   rekey_data at random.
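
   A minimal sketch of that derivation, using Python's hashlib.  The
   48-byte key length and the 80-byte output length used below are
   placeholders of mine; the real sizes come from the shake_output
   structure of whichever relay cipher is in use.  The function name
   is likewise hypothetical.

```python
import hashlib
import os


def derive_rekeyed_material(aez_key: bytes, rekey_data: bytes,
                            out_len: int) -> bytes:
    """Return SHAKE256(aez_key || rekey_data), truncated to out_len bytes.

    Both endpoints can run this once they hold the same aez_key and the
    same rekey_data, so no extra round trip is needed to agree on keys.
    """
    return hashlib.shake_256(aez_key + rekey_data).digest(out_len)


# Client side: pick rekey_data at random and derive the new material.
old_key = os.urandom(48)      # placeholder length, see lead-in
rekey_data = os.urandom(32)   # 32 random bytes, as in section 2.1
client_new = derive_rekeyed_material(old_key, rekey_data, 80)

# Relay side: same inputs (taken from the RELAY_REKEY cell), same output.
relay_new = derive_rekeyed_material(old_key, rekey_data, 80)
```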

   After sending one of these RELAY_REKEY cells, the client uses the new
   aez_key to encrypt all of its data to this hop, but retains the old
   aez_key for decrypting the data coming back from the relay.

   When the relay receives a RELAY_REKEY cell, it sends a RELAY_REKEY
   cell back towards the client, with empty rekey_data, and
   rekey_method==0, and then updates its own key material for all
   additional data it sends and receives to the client.

   When the client receives this reply, it can discard the old AEZ key, and
   begin decrypting subsequent inbound cells with the new key.


   So in summary: the client sends a series of cells encrypted with the
   old key, and then sends a REKEY cell, followed by relay cells
   encrypted with the new key:

     OldKey[data data data ... data rekey] NewKey[data data data...]

   And after the server receives the REKEY cell, it stops sending relay
   cells encrypted with the old keys, sends its own REKEY cell with the
   ACK method, and starts sending cells encrypted with the new key.

       REKEY arrives somewhere in here
                   I
                   V
     OldKey[data data data data rekey-ack] NewKey[data data data ...]

2.1. Supporting other cryptography types

   Each relay cipher must specify its own behavior in the presence of a
   REKEY cell of each type that it supports.  In general, the behavior
   of method 1 ("shake256-client") is "regenerate keys as if we were
   calling the original KDF after a CREATE handshake, using SHAKE256 on
   our current static key material and on a 32-byte random input."

   The behavior of any unsupported REKEY method must be to close the
   circuit with an error.

   The AES-CTR relay cell crypto doesn't support rekeying. See 3.2 below
   if you disagree.

2.2. How often to re-key?

   Clients should follow a deterministic algorithm in deciding when to
   re-key, so as not to leak client differences.  This algorithm should
   be type-specific.  For AEZ, I recommend that clients conservatively
   rekey every 2**32 cells (about 2 TB).  And to make sure that this
   code actually works, the schedule should be after 2**15 cells, and
   then every 2**32 cells thereafter.

   It may be beneficial to randomize these numbers.  If so, let's try
   subtracting between 0 and 25% at random.
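
   The schedule above could be sketched like this.  The function and
   constant names are illustrative only, and the optional rng argument
   is my way of showing the "subtract between 0 and 25%" variant; a
   real client would want a cryptographically strong source there
   (random.SystemRandom is used in the usage line for that reason).

```python
import random

FIRST_REKEY_CELLS = 2**15      # early rekey, to exercise the code path
REKEY_INTERVAL_CELLS = 2**32   # conservative AEZ interval (about 2 TB)


def next_rekey_interval(rekeys_done, rng=None):
    """Return how many cells to send before the next RELAY_REKEY.

    rekeys_done: number of rekeys already performed on this circuit.
    rng: optional random source; if given, subtract a uniformly chosen
         amount between 0 and 25% of the base interval.
    """
    base = FIRST_REKEY_CELLS if rekeys_done == 0 else REKEY_INTERVAL_CELLS
    if rng is None:
        return base
    return base - rng.randrange(0, base // 4 + 1)


deterministic = [next_rekey_interval(n) for n in range(3)]
randomized = next_rekey_interval(1, rng=random.SystemRandom())
```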

2.3. How often to allow re-keying?

   We could define a lower bound to prevent too-frequent rekeying.  I'm
   not sure I see the point here; the process described above is not
   that costly.

3.  Alternative designs

3.1. Should we add some public key cryptography here?

   We could change the body of a REKEY cell and its ack to be more like
   CREATE/CREATED.  Then we'd have to add a third step from the client
   to the server to acknowledge receipt of the 'CREATED' cell and
   changing of the key.

   So, what would this added complexity and computational load buy us?
   It would defend against the case where an adversary had compromised
   the shared key material for a circuit, but was not able to compromise
   the rekey process.  I'm not sure that this is reasonable; the
   likeliest cases I can think of here seem to be "get compromised, stay
   compromised" for a circuit.

3.2. Hey, could we use this for forward secrecy with AES-CTR?

   We could, but the best solution to AES-CTR's limitations right now is
   to stop using our AES-CTR setup.  Anything that supports REKEY will
   also presumably support AEZ or something better.

3.3. We could upgrade ciphers with this!

   Yes we could.  We could define this not only to change the key, but
   to upgrade to a better ciphersuite.  For example, we could start by
   negotiating AES-CTR, and then "secretly" upgrade to AEZ.  I'm not
   sure that's worth the complexity, or that it would really be secret
   in the presence of traffic analysis.


Filename: 263-ntru-for-pq-handshake.txt
Title: Request to change key exchange protocol for handshake v1.2
Author: John SCHANCK, William WHYTE and Zhenfei ZHANG
Created: 29 Aug 2015
Updated: 4 Feb 2016
Status: Obsolete

This proposal was made obsolete by proposal #269.


1. Introduction

  Recognized handshake types are:
    0x0000  TAP         --  the original Tor handshake;
    0x0001  reserved
    0x0002  ntor        --  the ntor+curve25519+sha256 handshake;

  Request for a new (set of) handshake type:
    0x010X  ntor+qsh    --  the hybrid of ntor+curve25519+sha3 handshake
                            and a quantum-safe key encapsulation mechanism

  where
    0x0101  ntor+qsh    --  refers to this modular design; no specific Key
                            Encapsulation Mechanism (KEM) is assigned.

    0x0102  ntor+ntru   --  the quantum safe KEM is based on NTRUEncrypt, with
                            parameter ntrueess443ep2

    0x0103  ntor+rlwe   --  the quantum safe KEM is based on ring learning with
                            error encryption scheme; parameter not specified

        DEPENDENCY:
          Proposal 249: Allow CREATE cells with >505 bytes of handshake data

  1.1 Motivation: Quantum-safe forward-secure key agreement

    We are trying to add Quantum-safe forward-secrecy to the key agreement in
    the Tor handshake. (Classical) forward-secrecy means that if the long-term
    key is compromised, the communication prior to this compromise still stays
    secure. Similarly, Quantum-safe forward-secrecy implies that if the
    long-term key is compromised by attackers with quantum-computing
    capabilities, the prior communication still remains secure.

    Current approaches for handling key agreement, for instance the ntor
    handshake protocol, do not have this feature. ntor uses ECC, which will be
    broken when quantum computers become available. This allows the simple yet
    very effective harvest-then-decrypt attack, where an adversary with
    significant storage capabilities harvests Tor handshakes now and decrypts
    them in the future.

    The proposed handshake protocol achieves quantum-safe forward-secrecy and
    stops those attacks by introducing a secondary short-term pre-master secret
    that is transported via a quantum-safe method. In the case where the long-term
    key is compromised via a quantum algorithm, the attacker still needs to recover
    the second pre-master secret to be able to decrypt the communication.

  1.2 Motivation: Allowing plug & play for quantum-safe encryption algorithms

    We would like to be conservative in the selection of a quantum-safe
    encryption algorithm. For this purpose, we propose a modular design that
    allows any quantum-safe encryption algorithm to be included in this
    handshake framework. We will illustrate the proposal with the NTRUEncrypt
    encryption algorithm.

2. Proposal

  2.1 Overview

    In Tor, authentication is one-way in the authenticated key-exchange
    protocol. This proposed new handshake protocol is consistent with that
    approach.

    We aim to provide quantum-safe forward-secrecy and modular design to the Tor
    handshake, with the minimum impact on the current version. We aim to use
    as many existing mechanisms as possible.

    For purposes of comparison, proposed modifications are indicated with * at
    the beginning of the corresponding line, the original approaches in ntor
    are marked with # when applicable.

    In order to enable various quantum-safe algorithms for the Tor handshake, we
    propose a modular approach that allows any quantum-safe encryption algorithm
    to be adopted in this framework. Our approach is a hybridization of the ntor
    protocol and a KEM. We instantiate this framework with NTRUEncrypt, a
    lattice-based encryption scheme that is believed to be quantum resistant.
    This framework is expandable to other quantum-safe encryption schemes, such
    as those based on Ring Learning with Errors (R-LWE).

    2.1.1 Achieved Property:

      1)  The proposed key exchange method is quantum-safe forward-secure: two
      secrets are exchanged, one protected by ECC, one protected by NTRUEncrypt,
      and then put through the native Tor Key Derivation Function (KDF) to
      derive the encryption and authentication keys. Both secrets are protected
      with one-time keys for their respective public key algorithms.

      2)  The proposed key exchange method provides one-way authentication: The
      server is authenticated, while the client remains anonymous.

      3)  The protocol is at least as secure as ntor. In the case where the
      quantum-safe encryption algorithm fails, the protocol is identical to
      the ntor protocol.

    2.1.2 General idea:

      When a client wishes to establish a one-way authenticated key K with a
      server, a session key is established through the following steps:
      1)  Establish a common secret E (classical cryptography, i.e., ECC) using
      a one-way authenticated key exchange protocol.
      #ntor currently uses this approach#;
      2)  Establish a common "parallel" secret P using a key encapsulation
      mechanism similar to TLS_RSA. In this feature request we use NTRUEncrypt
      as an example.
      3)  Establish a new session key k = KDF(E|P, info, i), where KDF is a Key
      Derivation Function.

    2.1.3 Building Blocks

      1)  ntor: ECDH-type key agreement protocol with one-way authentication;
      ##existing approach: See 5.1.4 tor-spec.txt##

      2)  A quantum-safe encryption algorithm: we use QSE to refer to the
      quantum-safe encryption algorithm, and use NTRUEncrypt as our example;
      **new approach**

      3)  SHA3-256 hash function (see FIPS 202), and SHAKE256 KDF;
      ##previous approach: HMAC-based Extract-and-Expand KDF-RFC5869##

  2.2 The protocol

    2.2.1 Initialization

      H(x,t) is SHA3-256 with message x and key t.
      H_LENGTH      = 32
      ID_LENGTH     = 20
      G_LENGTH      = 32

*     QSPK_LENGTH   = XXX           length of QSE public key
*     QSC_LENGTH    = XXX           length of QSE cipher

*     PROTOID       = "ntor-curve25519-sha3-1-[qseid]"
#pre  PROTOID       = "ntor-curve25519-sha256-1"

      t_mac         = PROTOID | ":mac"
      t_key         = PROTOID | ":key_extract"
      t_verify      = PROTOID | ":verify"

      These three variables define three different cryptographic hash functions:
      hash1         = H(*, t_mac);
      hash2         = H(*, t_key);
      hash3         = H(*, t_verify);

      MULT(A,b)     = the multiplication of the curve25519 point 'A' by the
                      scalar 'b'.
      G             = The preferred base point for curve25519
      KEYGEN()      = The curve25519 key generation algorithm,
                      returning a private/public keypair.
      m_expand      = PROTOID | ":key_expand"

      curve25519
        b, B        = KEYGEN();

*     QSH
*       QSSK,QSPK   = QSKEYGEN();
*       cipher      = QSENCRYPT (*, PK);
*       message     = QSDECRYPT (*, SK);

    2.2.2 Handshake

      To perform the handshake, the client needs to know an identity key digest
      for the server, and an ntor onion key (a curve25519 public key) for that
      server. Call the ntor onion key "B".

      The client generates a temporary key pair:
        x, X        = KEYGEN();

      and a QSE temporary key pair:
*       QSSK, QSPK  = QSKEYGEN();

================================================================================
      and generates a client-side handshake with contents:
        NODEID      Server identity digest  [ID_LENGTH   bytes]
        KEYID       KEYID(B)                [H_LENGTH    bytes]
        CLIENT_PK   X                       [G_LENGTH    bytes]
*       QSPK        QSPK                    [QSPK_LENGTH bytes]
================================================================================

      The server generates an ephemeral curve25519 keypair:
        y, Y        = KEYGEN();

      and an ephemeral "parallel" secret for encryption with QSE:
*       PAR_SEC     P                       [H_LENGTH    bytes]

      and computes:
*       C           = ENCRYPT( P | B | Y, QSPK);

      Then it uses its ntor private key 'b' to compute an ECC secret
        E           = EXP(X,y) | EXP(X,b) | B | X | Y

      and computes:

*       secret_input    = E | P | QSPK | ID | PROTOID
#pre    secret_input    = E | ID | PROTOID

        KEY_SEED        = H(secret_input, t_key)
        verify          = H(secret_input, t_verify)
*       auth_input      = verify | B | Y | X | C | QSPK
                          | ID | PROTOID | "Server"
#pre    auth_input      = verify | B | Y | X | ID | PROTOID | "Server"

================================================================================
      The server's handshake reply is:
        AUTH            H(auth_input, t_mac)    [H_LENGTH     bytes]
*       QSCIPHER        C                       [QSC_LENGTH   bytes]

      Note: in previous ntor protocol the server also needs to send
#pre    SERVER_PK       Y                       [G_LENGTH     bytes]
      This value is now encrypted in C, so the server does not need to send Y.

================================================================================
      The client decrypts C, then checks Y is in G^*, and computes

        E               = EXP(Y,x) | EXP(B,x) | B | X | Y
*       P'              = DECRYPT(C, QSSK)

      extracts P, B, and Y from P' (P' = P | B | Y), verifies B, and computes

*       secret_input    = E | P | QSPK | ID | PROTOID
#pre    secret_input    = E | ID | PROTOID

        KEY_SEED        = H(secret_input, t_key)
        verify          = H(secret_input, t_verify)
*       auth_input      = verify | B | Y | X | C | QSPK
                          | ID | PROTOID | "Server"
#pre    auth_input      = verify | B | Y | X | ID | PROTOID | "Server"

      The client verifies that AUTH == H(auth_input, t_mac).

      Both parties now have a shared value for KEY_SEED. This value will be used
      during Key Derivation Function.
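
      The derivation of KEY_SEED and AUTH can be sketched as follows. This is
      only an illustration under stated assumptions: the text does not say how
      SHA3-256 is keyed, so the sketch assumes HMAC; the inputs are dummy byte
      strings rather than real curve25519 or NTRU outputs; auth_input follows
      the server-side formula; and the derive() helper is a name made up here.

```python
import hashlib
import hmac

PROTOID = b"ntor-curve25519-sha3-1-ntrueess443ep2"
T_MAC = PROTOID + b":mac"
T_KEY = PROTOID + b":key_extract"
T_VERIFY = PROTOID + b":verify"


def H(x: bytes, t: bytes) -> bytes:
    # ASSUMPTION: "SHA3-256 with message x and key t" is realized as
    # HMAC-SHA3-256; the proposal does not pin down the construction.
    return hmac.new(t, x, hashlib.sha3_256).digest()


def derive(E, P, QSPK, ID, B, X, Y, C):
    """Return (KEY_SEED, AUTH) from the shared handshake values."""
    secret_input = E + P + QSPK + ID + PROTOID
    key_seed = H(secret_input, T_KEY)
    verify = H(secret_input, T_VERIFY)
    auth_input = verify + B + Y + X + C + QSPK + ID + PROTOID + b"Server"
    return key_seed, H(auth_input, T_MAC)


# Dummy stand-ins with the lengths used in the text (not real keys).
B, X, Y = b"\x01" * 32, b"\x02" * 32, b"\x03" * 32
E = b"\x04" * 32 + b"\x05" * 32 + B + X + Y   # EXP | EXP | B | X | Y
P, ID = b"\x06" * 32, b"\x07" * 20
QSPK, C = b"\x08" * 610, b"\x09" * 610

server_seed, server_auth = derive(E, P, QSPK, ID, B, X, Y, C)
client_seed, client_auth = derive(E, P, QSPK, ID, B, X, Y, C)
# The client accepts only if its recomputed AUTH matches the server's.
accepted = hmac.compare_digest(server_auth, client_auth)
```

      Because P enters secret_input alongside E, an attacker who later
      recovers E alone still cannot recompute KEY_SEED.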

  2.3 Instantiation with NTRUEncrypt

    The example uses the NTRU parameter set NTRU_EESS443EP2. This has keys
    and ciphertexts of length 610 bytes. This parameter set delivers 128 bits
    of classical and quantum security, and uses product-form NTRU polynomials.
    For 256 bits of classical and quantum security, use NTRU_EESS743EP2.

    We adjust the following parameters:

    handshake type:
    0x0102  ntor+ntru       the quantum safe KEM is based on NTRUEncrypt, with
                            parameter ntrueess443ep2
    PROTOID       = "ntor-curve25519-sha3-1-ntrueess443ep2"
    QSPK_LENGTH   = 610     length of NTRU_EESS443EP2 public key
    QSC_LENGTH    = 610     length of NTRU_EESS443EP2 cipher

    NTRUEncrypt can be adopted in our framework without further modification.

3. Security Concerns

  The proof of security can be found at https://eprint.iacr.org/2015/287
  We highlight some desired features.

  3.1 One-way Authentication
    The one-way authentication feature is inherited from the ntor protocol.

  3.2 Multiple Encryption
    The technique to combine two encryption schemes used in 2.2.4 is named
    Multiple Encryption. Discussion of appropriate security models can be
    found in [DK05]. Proof that the proposed handshake is secure under this
    model can be found at https://eprint.iacr.org/2015/287.

  3.3 Cryptographic hash function
    The default hash function HMAC_SHA256 from Tor suffices to provide
    desired security for the present day. However, to be more future proof, we
    propose to use SHA3 when Tor starts to migrate to SHA3.

  3.4 Key Encapsulation Mechanism
    The KEM in our protocol can be proved to be KEM-CCA-2 secure.

  3.5 Quantum-safe Forward Secrecy
    Quantum-safe forward secrecy is achieved.

  3.6 Quantum-safe authentication
    The proposed protocol is secure only until a quantum computer is developed
    that is capable of breaking the onion keys in real time. Such a computer can
    compromise the authentication of ntor online; the security of this approach
    depends on the authentication being secure at the time the handshake is
    executed. This approach is intended to provide security against the
    harvest-then-decrypt attack while an acceptable quantum-safe approach with
    security against an active attacker is developed.

4. Candidate quantum-safe encryption algorithms

  Two candidate quantum-safe encryption algorithms are under consideration.

  NTRUEncrypt, with parameter set ntrueess443ep2, provides 128 bits of classical
  and quantum security. The parameter set is available for use now.

  LWE-based key exchange, based on Peikert's idea [Pei14]. Parameter sets
  suitable for this framework (the newerhop variant) are still under
  development.

5. Bibliography

[DK05]  Y. Dodis, J. Katz, "Chosen-Ciphertext Security of Multiple Encryption",
    Theory of Cryptography Conference, 2005.
    http://link.springer.com/chapter/10.1007%2F978-3-540-30576-7_11
    (conference version) or http://cs.nyu.edu/~dodis/ps/2enc.pdf (preprint)

[Pei14] C. Peikert, "Lattice Cryptography for the Internet", PQCrypto 2014.




Filename: 264-subprotocol-versions.txt
Title: Putting version numbers on the Tor subprotocols
Author: Nick Mathewson
Created: 6 Jan 2016
Status: Closed
Implemented-In: 0.2.9.4-alpha


1. Introduction

   At various points in Tor's history, we've needed to migrate from one
   protocol to another.  In the past, we've mostly done this by allowing
   relays to advertise support for various features.  We've done this in
   an ad-hoc way, though. In some cases, we've done it entirely based on
   the relays' advertised Tor version.

   That's a pattern we shouldn't continue.  We'd like to support more
   live Tor relay implementations, and that means that tying "features"
   to "tor version" won't work going forwards.

   This proposal describes an alternative method that we can use to
   simplify the advertisement and discovery of features, and the
   transition from one set of features to another.

1.1. History: "Protocols" vs version-based gating.

   For ages, we've specified a "protocols" line in relay descriptors,
   with a list of supported versions for "relay" and "link" protocols.
   But we've never actually looked at it, and instead we've relied on
   tor version numbers to determine which features we could rely upon.
   We haven't kept the relay and link protocols listed there up-to-date
   either.

   Clients have used version checks for three purposes historically:
   checking relays for bugs, checking relays for features, and
   implementing bug-workarounds on their own state files.

   In this design, feature checks are now performed directly with
   subprotocol versions. We only need to keep using Tor versions
   specifically for bug workarounds.

2. Design: Advertising protocols.

   We revive the "Protocols" design above, in a new form.

   "proto" SP Entries NL

     Entries =
     Entries = Entry
     Entries = Entry SP Entries

     Entry = Keyword "=" Values

     Values = Value
     Values = Value "," Values

     Value = Int
     Value = Int "-" Int

     Int = NON_ZERO_DIGIT
     Int = Int DIGIT


   Each 'Entry' in the "proto" line indicates that the Tor relay
   supports one or more versions of the protocol in question.  Entries
   should be sorted by keyword.  Values should be numerically ascending
   within each entry.  (This implies that there should be no overlapping
   ranges.)  Ranges should be represented as compactly as possible. Ints
   must be no more than 2^32 - 1.
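
   As an illustration of the grammar, here is a small parser sketch.
   It keeps ranges as (low, high) pairs instead of expanding them,
   since an Int may be as large as 2^32 - 1.  The function names are
   mine, and the sketch deliberately does not enforce the "should"
   rules about sorting and compactness.

```python
MAX_VERSION = 2**32 - 1


def _parse_int(text):
    # Int = NON_ZERO_DIGIT | Int DIGIT: ASCII digits with no leading zero.
    if not (text.isascii() and text.isdigit()) or text[0] == "0":
        raise ValueError(f"bad version number: {text!r}")
    value = int(text)
    if value > MAX_VERSION:
        raise ValueError(f"version too large: {text!r}")
    return value


def parse_proto_line(line):
    """Parse a "proto" (or "pr") line into {keyword: [(low, high), ...]}."""
    keyword, _, rest = line.strip("\n").partition(" ")
    if keyword not in ("proto", "pr"):
        raise ValueError("not a proto/pr line")
    protocols = {}
    for entry in rest.split(" ") if rest else []:
        name, sep, values = entry.partition("=")
        if not name or not sep:
            raise ValueError(f"bad entry: {entry!r}")
        ranges = []
        for value in values.split(","):
            low_text, dash, high_text = value.partition("-")
            low = _parse_int(low_text)
            high = _parse_int(high_text) if dash else low
            if low > high:
                raise ValueError(f"descending range: {value!r}")
            ranges.append((low, high))
        protocols[name] = ranges
    return protocols


def supports(protocols, name, version):
    """True if `version` of protocol `name` falls in an advertised range."""
    return any(lo <= version <= hi for lo, hi in protocols.get(name, []))


example = parse_proto_line("proto Cons=1-2 Link=1-4 Relay=1-2")
```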

   The semantics for each keyword must be defined in a Tor
   specification.  Extension keywords are allowed if they begin with
   "x-" or "X-".  Keywords are case-sensitive.

   During voting, authorities copy these lines immediately below the "v"
   lines, using "pr" as the keyword instead of "proto".
   When a descriptor does not contain a "proto" entry, the
   authorities should reconstruct it using the approach described below
   in section A.1.  The resulting "pr" lines are included in the
   consensus using the same rules as currently used for "v" lines, if a
   sufficiently late consensus method is in use.

2.1. An alternative: Moving 'v' lines into microdescriptors.

   [[[[[
   Here's an alternative: we could put "v" and "proto" lines into
   microdescriptors.

   When building microdescriptors, authorities could copy all valid
   "proto" entries verbatim if a sufficiently late consensus method is
   in use.  When a descriptor does not contain a "proto" entry, the
   authorities should reconstruct it using the approach described below
   in section A.1.

   Tor clients that want to use "v" lines should prefer those in
   microdescriptors if present, and ignore those in the consensus.

   (Existing maintained client versions can be adapted to never look at
   "v" lines at all; the only versions that they still check for are
   ones not allowed on the network.  The "v" line can be dropped
   from the consensus entirely when current clients have upgraded.)
   ]]]]]

   [I am rejecting this alternative for now, since proto lines should
   compress very well, given that the proto line is completely
   inferrable from the v line.  Removing all the v lines from the
   current consensus would save only 1.7% after gzip compression.]

3. Using "proto"/"pr" and "v" lines

   Whenever possible, clients and relays should use the list of
   advertised protocols instead of version numbers.  Version numbers
   should only be used when implementing bug-workarounds for specific
   Tor versions.

   Every new feature in tor-spec.txt, dir-spec.txt, and rend-spec.txt
   should be gated on a particular protocol version.

4. Required protocols

   The consensus may contain four lines:
      "recommended-relay-protocols",
      "required-relay-protocols",
      "recommended-client-protocols", and
      "required-client-protocols".

   Each has the same format as the "proto" line.  To vote on these
   entries, a protocol/version combination is included only if it is
   listed by a majority of the voters.

   When a relay lacks a protocol listed in recommended-relay-protocols, it
   should warn its operator that the relay is obsolete.

   When a relay lacks a protocol listed in required-relay-protocols, it
   must not attempt to join the network.

   When a client lacks a protocol listed in recommended-client-protocols,
   it should warn the user that the client is obsolete.

   When a client lacks a protocol listed in required-client-protocols, it
   must not connect to the network.  This implements a "safe
   forward shutdown" mechanism for zombie clients.

   If a client or relay has a cached consensus telling it that a given
   protocol is required, and it does not implement that protocol, it
   SHOULD NOT try to fetch a newer consensus.
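
   The rules above amount to a small decision procedure, sketched
   here.  Protocol sets are given as already-parsed dictionaries from
   keyword to a set of version numbers; the function names and the
   "ok"/"warn"/"refuse" labels are illustrative, not part of the
   design.

```python
def missing(own, wanted):
    """Return {keyword: versions} that `wanted` lists but `own` lacks."""
    gaps = {}
    for name, versions in wanted.items():
        lacking = set(versions) - set(own.get(name, ()))
        if lacking:
            gaps[name] = lacking
    return gaps


def protocol_status(own, recommended, required):
    """Classify an implementation against the consensus protocol lines.

    "refuse": a required protocol is missing; do not join or connect.
    "warn":   only recommended protocols are missing; tell the operator
              or user that the software is obsolete.
    "ok":     nothing is missing.
    """
    if missing(own, required):
        return "refuse"
    if missing(own, recommended):
        return "warn"
    return "ok"


# Example: an implementation that lacks Link=4.
own = {"Link": {1, 2, 3}, "Relay": {1, 2}}
status_required = protocol_status(own, recommended={}, required={"Link": {4}})
status_recommended = protocol_status(own, recommended={"Link": {4}},
                                     required={"Relay": {2}})
```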

   [[XXX I propose we remove this idea:

    The above features should be backported to 0.2.4 and later, or all the
    versions we expect to continue supporting.]]

   These lines should be voted on.  A majority of votes is sufficient to
   make a protocol un-supported and it should require a supermajority of
   authorities (2/3) to make a protocol required.  The required protocols
   should not be torrc-configurable, but rather should be hardwired in the
   Tor code.
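
   A sketch of how such a tally might look.  The vote format (one
   dictionary per authority, keyword to set of versions) and the exact
   rounding of the thresholds are assumptions of mine; the text only
   says "majority" and "supermajority (2/3)".

```python
from collections import Counter


def majority(n_voters):
    # Assumption: strictly more than half of the voters.
    return n_voters // 2 + 1


def supermajority(n_voters):
    # Assumption: at least two thirds of the voters, rounded up.
    return (2 * n_voters + 2) // 3


def tally(votes, min_votes):
    """Keep each protocol/version pair listed by at least min_votes votes."""
    counts = Counter()
    for vote in votes:
        for name, versions in vote.items():
            for version in set(versions):
                counts[(name, version)] += 1
    result = {}
    for (name, version), n in counts.items():
        if n >= min_votes:
            result.setdefault(name, set()).add(version)
    return result


votes = [
    {"Link": {3, 4}},
    {"Link": {4}},
    {"Link": {4}, "Relay": {2}},
]
by_majority = tally(votes, majority(len(votes)))
by_supermajority = tally(votes, supermajority(len(votes)))
```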


5. Current protocols

   (See "6. Maintaining the protocol lists" below for information about
   how I got these, and why version 0.2.4.19 comes up so often.)

5.1. "Link"

   The "link" protocols are those used by clients and relays to initiate
   and receive OR connections and to handle cells on OR connections.
   The "link" protocol versions correspond 1:1 to those versions.

   Two Tor instances can make a connection to each other only if they
   have at least one link protocol in common.

   The current "link" versions are: "1" through "4"; see tor-spec.txt
   for more information.  All current Tor versions support "1-3";
   versions from 0.2.4.11-alpha onward support "1-4".  Eventually we
   will drop "1" and "2".

5.2. "LinkAuth"

   LinkAuth protocols correspond to varieties of Authenticate cells used
   for the v3+ link protocols.

   The current version is "1".

   "2" is unused, and reserved by proposal 244.

   "3" is the ed25519 link handshake of proposal 220.

5.3. "Relay"

   The "relay" protocols are those used to handle CREATE cells, and
   those that handle the various RELAY cell types received after a
   CREATE cell.  (Except, relay cells used to manage introduction and
   rendezvous points are managed with the "HSIntro" and "HSRend" protocols
   respectively.)

      "1" -- supports the TAP key exchange, with all features in Tor
         0.2.3.  Support for CREATE and CREATED and CREATE_FAST and
         CREATED_FAST and EXTEND and EXTENDED.

      "2" -- supports the ntor key exchange, and all features in Tor
         0.2.4.19.  Includes support for CREATE2 and CREATED2 and
         EXTEND2 and EXTENDED2.

5.4. "HSIntro"

   The "HSIntro" protocol handles introduction points.

      "3" -- supports authentication as of proposal 121 in Tor
             0.2.1.6-alpha.

5.5. "HSRend"

   The "HSRend" protocol handles rendezvous points.

      "1" -- supports all features in Tor 0.0.6.

      "2" -- supports RENDEZVOUS2 cells of arbitrary length as long as they
             have 20 bytes of cookie in Tor 0.2.9.1-alpha.

5.6. "HSDir"

   The HSDir protocols are the set of hidden service document types
   that can be uploaded to, understood by, and downloaded from a tor
   relay, and the set of URLs available to fetch them.

      "1" -- supports all features in Tor 0.2.0.10-alpha.

5.7. "DirCache"

   The "DirCache" protocols are the set of documents available for
   download from a directory cache via BEGIN_DIR, and the set of URLs
   available to fetch them.  (This excludes URLs for hidden service
   objects.)

      "1" -- supports all features in Tor 0.2.4.19.

5.8. "Desc"

   Describes features present or absent in descriptors.

   Most features in descriptors don't require a "Desc" update -- only
   those that need to someday be required.  For example, someday clients
   will need to understand ed25519 identities.

      "1" -- supports all features in Tor 0.2.4.19.

      "2" -- cross-signing with onion-keys, signing with ed25519
             identities.

5.9. "Microdesc"

   Describes features present or absent in microdescriptors.

   Most features in microdescriptors don't require a "Microdesc" update --
   only those that need to someday be required.
   These correspond more or less with consensus methods.

      "1" -- consensus methods 9 through 20.

      "2" -- consensus method 21 (adds ed25519 keys to microdescs).

5.10. "Cons"

   Describes features present or absent in consensus documents.

   Most features in consensus documents don't require a "Cons" update --
   only those that need to someday be required.

   These correspond more or less with consensus methods.

      "1" -- consensus methods 9 through 20.

      "2" -- consensus method 21 (adds ed25519 keys to microdescs).


6. Maintaining the protocol lists

   What makes a good fit for a "protocol" type?  Generally, it's a set
   of communications functionality that tends to upgrade in tandem, and
   in isolation from other parts of the Tor protocols.  It tends to be
   functionality where it doesn't make sense to implement only part of
   it -- though omitting the whole thing might make sense.

   (Looking through our suite of protocols, you might make a case for
   splitting DirCache into sub-protocols.)

   We shouldn't add protocols for features where others can remain
   oblivious to their presence or absence.  For example, if some
   directory caches start supporting a new header, and clients can
   safely send that header without knowing whether the directory cache
   will understand it, then a new protocol version is not required.

   Because all relays currently on the network are 0.2.4.19 or later, we
   can require 0.2.4.19, and use 0.2.4.19 as the minimal version so that
   we don't need to do code archaeology to determine how many
   no-longer-relevant versions of each protocol once existed.

   Adding new protocol types is pretty cheap, given compression.

A.1.  Inferring missing proto lines

   The directory authorities no longer allow versions of Tor before
   0.2.4.18-rc.  But right now, there is no version of Tor in the
   consensus before 0.2.4.19.  Therefore, we should disallow versions of
   Tor earlier than 0.2.4.19, so that we can have the protocol list for
   all current Tor versions include:

     Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4
     LinkAuth=1 Microdesc=1-2 Relay=1-2

   For Desc, Tor versions before 0.2.7.stable should be taken to have
   Desc=1 and versions 0.2.7.stable or later should have Desc=1-2.

   For Microdesc and Cons, Tor versions before 0.2.7.stable should be
   taken to support version 1; 0.2.7.stable and later should have
   1-2.
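
   The inference rule above can be sketched as follows.  This is only an
   illustration: the function name and the (series, is_stable) input
   shape are assumptions of this sketch, and "0.2.7.stable or later" is
   read as "a stable release in the 0.2.7 series, or any release in a
   later series".

```python
# Protocol versions assumed for every supported Tor version, copied from
# the list above, minus the three entries that depend on 0.2.7.
BASE_PROTOCOLS = {
    "DirCache": "1", "HSDir": "1", "HSIntro": "3", "HSRend": "1-2",
    "Link": "1-4", "LinkAuth": "1", "Relay": "1-2",
}


def infer_protocols(series, is_stable):
    """Infer a proto line for a Tor release that did not publish one.

    `series` is a (major, minor, micro) tuple such as (0, 2, 7);
    `is_stable` says whether the release is a stable one.  Both
    parameters are hypothetical inputs chosen for this sketch.
    """
    newer = series > (0, 2, 7) or (series == (0, 2, 7) and is_stable)
    doc_versions = "1-2" if newer else "1"
    protos = dict(BASE_PROTOCOLS, Cons=doc_versions, Desc=doc_versions,
                  Microdesc=doc_versions)
    # Emit the entries sorted by name, matching the lists in this appendix.
    return " ".join(f"{name}={protos[name]}" for name in sorted(protos))
```

   For example, infer_protocols((0, 2, 4), True) yields a line with
   Cons=1, Desc=1 and Microdesc=1, while any 0.2.8 release yields 1-2
   for all three.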

A.2. Initial required protocols

   For clients we will Recommend and Require these.

        Cons=1-2 Desc=1-2 DirCache=1 HSDir=2 HSIntro=3 HSRend=1 Link=4
        LinkAuth=1 Microdesc=1-2 Relay=2

   For relays we will Require:

        Cons=1 Desc=1 DirCache=1 HSDir=2 HSIntro=3 HSRend=1 Link=3-4
        LinkAuth=1 Microdesc=1 Relay=1-2

   For relays, we will additionally Recommend all protocols which we
   recommend for clients.
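
   As a rough illustration of how such lists could be checked, the
   sketch below parses a protocol list and reports which required
   versions an implementation lacks.  The function names are
   hypothetical, and the sketch assumes that every listed version of a
   required protocol must be supported.

```python
def parse_protocols(line):
    """Parse 'Name=1-2,4 Other=3' into {name: set of version numbers}."""
    result = {}
    for entry in line.split():
        name, _, versions = entry.partition("=")
        supported = set()
        for part in versions.split(","):
            low, _, high = part.partition("-")
            # A bare number such as "3" is a range of length one.
            supported.update(range(int(low), int(high or low) + 1))
        result[name] = supported
    return result


def missing_protocols(advertised, required):
    """Return {name: [versions]} for required versions not advertised."""
    have = parse_protocols(advertised)
    missing = {}
    for name, needed in parse_protocols(required).items():
        lacking = needed - have.get(name, set())
        if lacking:
            missing[name] = sorted(lacking)
    return missing
```

   With a made-up advertised list "Link=1-3 Relay=1" and a required list
   "Link=3-4 Relay=1-2", the result names Link version 4 and Relay
   version 2 as missing.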

A.3. Example integration with other open proposals

   In this appendix, I try to show that this proposal is viable by
   showing how it can integrate with other open proposals to avoid
   version-gating.  I'm looking at open/draft/accepted proposals only.

    140  Provide diffs between consensuses

         This proposal doesn't affect interoperability, though we could
         add a DirCache protocol version for it if we think we might
         want to require it someday.

    164  Reporting the status of server votes

         Interoperability not affected; no new protocol.

    165  Easy migration for voting authority sets

         Authority-only; no new protocol.

    168  Reduce default circuit window

         Interoperability slightly affected; could be a new Relay
         protocol.

    172  GETINFO controller option for circuit information
    173  GETINFO Option Expansion

         Client/Relay interop not affected; no new protocol.

    177  Abstaining from votes on individual flags

         Authority-only; no new protocol.

    182  Credit Bucket

         No new protocol.

    188  Bridge Guards and other anti-enumeration defenses

         No new protocol.

    189  AUTHORIZE and AUTHORIZED cells

         This would be a new protocol, either a Link protocol or a new
         LinkAuth protocol.

    191  Bridge Detection Resistance against MITM-capable Adversaries

         No new protocol.

    192  Automatically retrieve and store information about bridges

         No new protocol.

    195  TLS certificate normalization for Tor 0.2.4.x

         Interop not affected; no new protocol.

    201  Make bridges report statistics on daily v3 network status
         requests

         No new protocol.

    202  Two improved relay encryption protocols for Tor cells

         This would be a new Relay protocol.

    203  Avoiding censorship by impersonating an HTTPS server

         Bridge/PT only; no new protocol.

    209  Tuning the Parameters for the Path Bias Defense

         Client behavior only; no new protocol.

    210  Faster Headless Consensus Bootstrapping

         Client behavior only; no new protocol.

    212  Increase Acceptable Consensus Age

         Possibly add a new DirCache protocol version to describe the
         "will hold older descriptors" property.

    219  Support for full DNS and DNSSEC resolution in Tor

         New Relay protocol, or new protocol class (DNS=2?)

    220  Migrate server identity keys to Ed25519

         Once link authentication is supported, that's a new LinkAuth
         protocol version.

         No new protocol version is required for circuit extension,
         since it's a backward-compatible change.

    224  Next-Generation Hidden Services in Tor

         Adds new HSDir and HSIntro and HSRend protocols.

    226 "Scalability and Stability Improvements to BridgeDB: Switching
         to a Distributed Database System and RDBMS"

         No new protocol.

    229  Further SOCKS5 extensions

         Client-only; no new protocol.

    233  Making Tor2Web mode faster

         No new protocol.

    234  Adding remittance field to directory specification

         Could be a new protocol; or not.

    237  All relays are directory servers

         No new protocol.

    239  Consensus Hash Chaining

         No new protocol.

    242  Better performance and usability for the MyFamily option

         New Desc protocol.

    244  Use RFC5705 Key Exporting in our AUTHENTICATE calls

         Part of prop220.  Also adds another LinkAuth protocol version.

    245  Deprecating and removing the TAP circuit extension protocol

         Removes LinkAuth protocol 1.

         Removes a Desc protocol.

    246  Merging Hidden Service Directories and Introduction Points

         Possibly adds a new HSIntro or HSDir protocol.

    247  Defending Against Guard Discovery Attacks using Vanguards

         No new protocol.

    248  Remove all RSA identity keys

         Adds a new Desc protocol version and a new Cons protocol
         version; eventually removes a version of each.

    249  Allow CREATE cells with >505 bytes of handshake data

         Adds a new Link protocol version for CREATE2V.

         Adds a new Relay protocol version for new EXTEND2 semantics.

    250  Random Number Generation During Tor Voting

         No new protocol.

    251  Padding for netflow record resolution reduction

         No new protocol.

    252  Single Onion Services

         No new protocol.

    253  Out of Band Circuit HMACs

         New Relay protocol.

    254  Padding Negotiation

         New Link protocol, new Relay protocol.

    255  Controller features to allow for load-balancing hidden services

         No new protocol.

    256  Key revocation for relays and authorities

         New Desc protocol.

    257  Refactoring authorities and taking parts offline

         No new protocol.

    258  Denial-of-service resistance for directory authorities

         No new protocol.

    259  New Guard Selection Behaviour

         No new protocol.

    260  Rendezvous Single Onion Services

         No new protocol.

    261  AEZ for relay cryptography

         New Relay protocol version.

    262  Re-keying live circuits with new cryptographic material

         New Relay protocol version.

    263  Request to change key exchange protocol for handshake

         New Relay protocol version.

Filename: 265-load-balancing-with-overhead.txt
Title: Load Balancing with Overhead Parameters
Authors: Mike Perry
Created: 01 January 2016
Status: Open
Target: arti-dirauth

NOTE: This is one way to address several load balancing problems in Tor,
including padding overhead and Exit+Guard issues. However, before attempting
this, we should see if we can simplify the equations further by changing how
we assign Guard, Fast and Stable flags in the first place. If we assign Guard
flags such that Guards are properly allocated wrt Middle and Fast, and avoid
assigning Guard to Exit, this will become simpler. Unfortunately, this is
literally impossible to fix with C-Tor. In addition to numerous overrides and
disparate safety checks that prevent changes, several bugs mean that Guard,
Stable, and Fast flags are randomly assigned. See:
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40230
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40395
  https://gitlab.torproject.org/tpo/core/tor/-/issues/19162
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40733
  https://gitlab.torproject.org/tpo/network-health/analysis/-/issues/45
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/100
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/160
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/158

Other approaches to flag equations that have been proposed:
  https://github.com/frochet/wf_proposal/blob/master/waterfilling-balancing-with-max-diversity.txt
  https://petsymposium.org/popets/2023/popets-2023-0127.pdf



0. Motivation

In order to properly load balance in the presence of padding and
non-negligible amounts of directory and hidden service traffic, the load
balancing equations in Section 3.8.3 of dir-spec.txt are in need of some
modifications.

In addition to supporting the idea of overhead, the load balancing
equations can also be simplified by treating Guard+Exit nodes as Exit
nodes in all cases. This causes the 9 sub-cases of the current load
balancing equations to consolidate into a single solution, which also
will greatly simplify the consensus process, and eliminate edge cases
such as #16255[1].


1. Overview

For padding overhead due to Proposals 251 and 254, and changes to hidden
service path selection in Proposal 247, it will be useful to be able to
specify a pair of parameters that represents the additional traffic
present on Guard and Middle nodes due to these changes.

The current load balancing equations unfortunately make this excessively
complicated. With overhead factors included, each of the 9 subcases goes
from being a short solution to over a page of calculations for each
subcase.

Moreover, out of 8751 hourly consensus documents produced in 2015[2],
only 78 of them had a non-zero weight for using Guard+Exit nodes in the
Guard position (weight Wgd), and most of those were well under 1%. The
highest weight for using Guard+Exits in the Guard position recorded in
2015 was 2.62% (on December 10th, 2015). This means clients that chose a
Guard node during that particular hour used only 2.62% of Guard+Exit
flagged nodes' bandwidth when performing a bandwidth-weighted Guard
selection. All clients that chose a Guard node during any other hour did
not consider Guard+Exit nodes at all as potential candidates for their
Guards.

This indicates that we can greatly simplify these load balancing
equations with little to no change in diversity to the network.


2. Simplified Load Balancing Equations

Recall that the point of the load balancing equations in section 3.8.3
of dir-spec.txt is to ensure that an equal amount of client traffic is
distributed between Guards, Middles, Exits, and Guard+Exits, where each
flag type can occupy one or more positions in a path. This allocation is
accomplished by solving a system of equations for weights for flag
position selection to ensure equal allocation of client traffic for each
position in a circuit.

If we ignore overhead for the moment and treat Guard+Exit nodes as Exit
nodes, then this allows the simplified system of equations to become:

  Wgg*G == M + Wme*E + Wmg*G    # Guard position == middle position
  Wgg*G == Wee*E                # Guard position == exit position
  Wmg*G + Wgg*G == G            # Guard allocation weights sum to 1
  Wme*E + Wee*E == E            # Exit allocation weights sum to 1

This system has four equations and four unknowns, and by transitivity we
ensure that the capacities allocated to the guard, middle, and exit
positions are all equal. Unlike the equations in 3.8.3 of dir-spec.txt,
there are no special cases to the solutions of these equations because
there is no shortage of constraints and no decision points for allocation
based on scarcity. Thus, there is only one solution. Using SymPy's
symbolic equation solver (see Appendix A) we obtain:

       E + G + M       E + G + M       2*E - G - M       2*G - E - M
  Wee: ---------, Wgg: ---------, Wme: -----------, Wmg: ------------
          3*E             3*G              3*E               3*G

For the rest of the flag weights, we will do the following:

  Dual-flagged (Guard+Exit) nodes should be treated as Exits:
     Wgd = 0, Wmd = Wme, Wed = Wee

  Directory requests use middle weights:
     Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm

  Handle bridges and strange exit policies:
     Wgm=Wgg, Wem=Wee, Weg=Wed
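
The closed-form solution above can be evaluated directly. The following
is a minimal sketch, assuming integer bandwidth totals G, M and E; the
function name is hypothetical, and only the four solved weights plus the
dual-flag assignments are computed (the directory and bridge weights are
left out).

```python
from fractions import Fraction


def simplified_weights(G, M, E):
    """Evaluate the Section 2 solution for integer bandwidth totals."""
    total = Fraction(G + M + E)
    w = {
        "Wgg": total / (3 * G),
        "Wee": total / (3 * E),
        "Wmg": Fraction(2 * G - E - M, 3 * G),
        "Wme": Fraction(2 * E - G - M, 3 * E),
    }
    # Dual-flagged (Guard+Exit) nodes are treated as Exits.
    w.update(Wgd=Fraction(0), Wmd=w["Wme"], Wed=w["Wee"])
    return w
```

With the made-up totals G=50, M=10, E=45, this gives Wgg=7/10,
Wmg=3/10, Wee=7/9 and Wme=2/9, so each position is allocated 35 units
of capacity. No clipping is applied here, so scarce Guards or Exits
produce values outside [0, 1].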

2.1. Checking for underflow and overflow

In the old load balancing equations, we required a case-by-case proof to
guard against overflow and underflow, and to decide what to do in the
event of various overflow and underflow conditions[3]. Even still, the
system proved fragile to changes, such as the implementation of Guard
uptime fractions[1].

Here, with the simplified equations, we can plainly see that the only
time that a negative weight can arise is in Wme and Wmg, when 2*E < G+M
or when 2*G < E+M. In other words, only when Exits or Guards are scarce.

Similarly, the only time that a weight exceeding 1.0 can arise is in Wee
and Wgg, which also happens when 2*E < G+M or 2*G < E+M. This means that
parameters will always overflow in pairs (Wee and Wme, and/or Wgg and
Wmg).

In both these cases, simply clipping the parameters at 1 and 0 provides
as close to a balanced allocation as is possible, given the scarcity.
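
The pairing follows from the constraints themselves: since
Wme*E + Wee*E == E, we have Wee + Wme == 1, so Wee exceeds 1 exactly
when Wme is negative (and likewise for Wgg and Wmg). A minimal sketch of
the clipping step, with a hypothetical function name:

```python
from fractions import Fraction


def clip_weights(weights):
    """Clamp every weight into the range [0, 1]."""
    return {name: min(max(value, 0), 1) for name, value in weights.items()}


# Made-up example with scarce Exits (G=40, M=30, E=20), where the
# Section 2 formulas give Wee = 3/2 and Wme = -1/2.
scarce_exits = {
    "Wgg": Fraction(3, 4), "Wmg": Fraction(1, 4),
    "Wee": Fraction(3, 2), "Wme": Fraction(-1, 2),
}
clipped = clip_weights(scarce_exits)
```

After clipping, Wee becomes 1 and Wme becomes 0: all Exit capacity goes
to the exit position, which is the most that scarce Exits can provide.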


3. Load balancing with Overhead Parameters

Intuitively, overhead due to padding and path selection changes can be
represented as missing capacity in the relevant position. This means
that in the presence of a Guard overhead fraction of G_o and a Middle
overhead fraction of M_o, the total fraction of actual client traffic
carried in those positions is (1-G_o) and (1-M_o), respectively.

Then, to achieve a balanced allocation of traffic, we consider only the
actual client capacity carried in each position:

  # Guard position minus overhead matches middle position minus overhead:
  (1-G_o)*(Wgg*G) == (1-M_o)*(M + Wme*E + Wmg*G)
  # Guard position minus overhead matches exit position:
  (1-G_o)*(Wgg*G) == 1*(Wee*E)
  # Guard weights still sum to 1:
  Wmg*G + Wgg*G == G
  # Exit weights still sum to 1:
  Wme*E + Wee*E == E

Solving this system with SymPy unfortunately yields some unintuitively
simplified results. For each weight, we first show the SymPy solution,
and then factor that solution into a form analogous to Section 2:

         -(G_o - 1)*(M_o - 1)*(E + G + M)
 Wee: ---------------------------------------
      E*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)


         (1 - G_o)*(1 - M_o)*(E + G + M)
 Wee: ---------------------------------------
      E*(2 - G_o - M_o + (1 - G_o)*(1 - M_o))



               (M_o - 1)*(E + G + M)
 Wgg: ---------------------------------------
      G*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

               (1 - M_o)*(E + G + M)
 Wgg: ---------------------------------------
      G*(2 - G_o - M_o + (1 - G_o)*(1- M_o))



      -E*(M_o - 1) + G*(G_o - 1)*(-M_o + 2) - M*(M_o - 1)
 Wmg: ---------------------------------------------------
           G*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

      (2 - M_o)*G*(1 - G_o) - M*(1 - M_o) - E*(1 - M_o)
 Wmg: ---------------------------------------------------
           G*(2 - G_o - M_o + (1 - G_o )*(1 - M_o))



      E*(G_o + M_o - 2) + G*(G_o - 1)*(M_o - 1) + M*(G_o - 1)*(M_o - 1)
 Wme: -----------------------------------------------------------------
                   E*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

      (2 - G_o - M_o)*E - G*(1 - G_o)*(1 - M_o) - M*(1 - G_o)*(1 - M_o)
 Wme: -----------------------------------------------------------------
                   E*(2 - G_o - M_o + (1 - G_o)*(1 - M_o))


A simple spot check with G_o = M_o = 0 shows us that with zero overhead,
these solutions become identical to the solutions in Section 2 of this
proposal.
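
The factored solutions above can be evaluated and checked numerically.
This is a sketch under stated assumptions: the function name is
hypothetical, bandwidth totals are integers, and the overhead fractions
are given as exact rationals so that the balance equations can be
verified without rounding error.

```python
from fractions import Fraction


def overhead_weights(G, M, E, G_o=0, M_o=0):
    """Evaluate the factored Section 3 solutions."""
    G_o, M_o = Fraction(G_o), Fraction(M_o)
    a, b = 1 - G_o, 1 - M_o          # client-traffic fractions
    denom = (2 - G_o - M_o) + a * b  # common denominator factor
    total = G + M + E
    return {
        "Wee": a * b * total / (E * denom),
        "Wgg": b * total / (G * denom),
        "Wmg": ((2 - M_o) * G * a - M * b - E * b) / (G * denom),
        "Wme": ((2 - G_o - M_o) * E - G * a * b - M * a * b) / (E * denom),
    }
```

With G_o = M_o = 0 and the made-up totals G=50, M=10, E=45, the result
matches the Section 2 values (Wgg=7/10, Wee=7/9). With nonzero
overheads, the returned weights satisfy the four equations at the start
of this section exactly.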

The final condition that we need to ensure is that these weight values
never become negative or greater than 1.0[3].

3.1. Ensuring against underflow and overflow

Note that if M_o = G_o = 0, then the solutions and the overflow
conditions are the same as in Section 2.

Unfortunately, SymPy is unable to solve multivariate inequalities, which
prevents us from directly deriving overflow conditions for each variable
independently (at least easily and without mistakes). Wolfram Alpha is
able to derive closed form solutions to some degree for this, but they
are more complicated than checking the weights for underflow and
overflow directly.

However, checking the computed weights directly, and warning whenever
one of the solutions above falls outside [0, 1.0], is equivalent to
checking such closed-form conditions. As in Section 2.1, clipping the
resulting solutions to [0, 1.0] should still give the best load
balancing possible under the scarcity.

It will be wise in the implementation to test the overflow conditions
with M_o = G_o = 0, and with their actual values. This will allow us to
know if the overflow is a result of inherent unbalancing, or due to
input overhead values that are too large (and need to be reduced by, for
example, reducing padding).
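
The two-pass test described above might look like the following sketch.
The function names and the returned strings are hypothetical. The
weights are computed from the intermediate quantity
x = Wgg*G = (1-M_o)*(E+G+M) / (2 - G_o - M_o + (1-G_o)*(1-M_o)), which
is an equivalent rearrangement of the Section 3 solutions.

```python
from fractions import Fraction


def weights(G, M, E, G_o=0, M_o=0):
    """Section 3 weights, expressed through x = Wgg*G."""
    a, b = 1 - Fraction(G_o), 1 - Fraction(M_o)
    x = b * (G + M + E) / (a + b + a * b)
    return {"Wgg": x / G, "Wmg": 1 - x / G,
            "Wee": a * x / E, "Wme": 1 - a * x / E}


def out_of_range(w):
    """Names of the weights that fall outside [0, 1]."""
    return sorted(name for name, value in w.items() if not 0 <= value <= 1)


def diagnose(G, M, E, G_o, M_o):
    """Classify an overflow as inherent or as caused by the overheads."""
    if out_of_range(weights(G, M, E)):
        return "inherent imbalance"
    if out_of_range(weights(G, M, E, G_o, M_o)):
        return "overhead too large"
    return "ok"
```

For instance, with the made-up totals G=40, M=20, E=30 the zero-overhead
weights are in range, but an (unrealistically large) G_o of 1/2 pushes
Wgg to 9/8, so the overflow is attributed to the overhead input.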


4. Consensus integration

4.1. Making use of the Overhead Factors

In order to keep the consensus process simple on the Directory
Authorities, the overhead parameters represent the combined overhead
from many factors.

The G_o variable is meant to account for the sum of directory overhead,
netflow padding overhead, future two-hop padding overhead, and future
hidden service overhead (for cases where Guard->Middle->Exit circuits
are not used).

The M_o variable is meant to account for multi-hop padding overhead,
hidden service overhead, as well as an overhead for any future two-hop
directory connections (so that we can consolidate Guards and Directory
guard functionality into a single Guard node).

There is no need for an E_o variable, because even if there were
Exit-specific overhead, it could be represented by equivalent
reductions in both G_o and M_o instead.

Since all relevant padding and directory overhead information is
included in the extra-info documents for each relay, the M_o and G_o
variables could be computed automatically from these extra-info
documents during the consensus process. However, it is probably wiser to
keep humans in the loop and set them manually as consensus parameters
instead, especially since we have not previously had to deal with
serious adversarial consequences from malicious extra-info reporting.

For clarity, though, it may be a good idea to separate all of the
components of M_o and G_o into separate consensus parameters, and
combine them (via addition) in the final equations. That way it will be
easier to pinpoint the source of any potential overflow issues. This
separation will also enable us to potentially govern padding's
contribution to the overhead via a single tunable value.
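
A minimal sketch of combining separate components by addition follows.
The component names in the usage note are invented for illustration and
are not proposed consensus parameter names; the requirement that the
combined overhead lie in [0, 1) is also an assumption of this sketch.

```python
from fractions import Fraction


def combine_overhead(components):
    """Sum per-source overhead fractions into a single G_o or M_o value."""
    total = sum(components.values(), Fraction(0))
    if not 0 <= total < 1:
        # Name the largest contributor to help pinpoint the source.
        largest = max(components, key=components.get)
        raise ValueError(
            f"combined overhead {total} out of range; "
            f"largest component: {largest}")
    return total
```

For example, hypothetical components {"dir": 1/50, "netflow_padding":
3/100} combine to an overhead of 1/20.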

4.2. Accounting for hidden service overhead with Prop 247

XXX: Hidden service path selection and 247 complicate this. With 247, we
want paths only of G M M, where the Ms exclude Guard-flagged nodes. This means
that M_o needs to add the total hidden service *network bytecount* overhead
(2X the hidden service end-to-end traffic bytecount). We also need to
*subtract* 4*Wmg*hs_e2e_bytecount from the G_o overhead, to account for not using
Guard-flagged nodes for the four M's in full prop-247 G M M M M G circuits.

4.3. Accounting for RSOS overhead

XXX: We also need to separately account for RSOS (and maybe SOS?) path usage
in M_o. This will require separate accounting for these service types in
extra-info descriptors.

4.4. Integration with GuardFraction

The GuardFraction changes in Proposal 236 and #16255 should continue to
work with these new equations, so long as the total T, G, and M values
are counted after the GuardFraction multiplier has been applied.

4.5. Guard flag assignment

Ideally, the Guard flag assignment process would also not count
Exit-flagged nodes when determining the Guard flag uptime and bandwidth
cutoffs, since we will not be using Guard+Exit flagged nodes as Guard
nodes at all when this change is applied. This will result in more
accurate thresholds for Guard node status, as well as better control
over the true total amount of Guard bandwidth in the consensus.

4.6. Cannibalization

XXX: It sucks and complicates everything. kill it, except for hsdirs.


1. https://trac.torproject.org/projects/tor/ticket/16255
2. https://collector.torproject.org/archive/relay-descriptors/consensuses/
3. http://tor-dev.torproject.narkive.com/17H9FewJ/correctness-proof-for-new-bandwidth-weights-bug-1952


Appendix A: SymPy Script for Balancing Equation Solutions
#!/usr/bin/env python3
from sympy import Symbol, pprint
from sympy.solvers import solve

# SymPy variable declarations.
# G, M, E, D: total bandwidth of Guard, unflagged (Middle), Exit, and
# dual-flagged (Guard+Exit) nodes.
(G,M,E,D) = (Symbol('G'),Symbol('M'),Symbol('E'),Symbol('D'))
(Wgd,Wmd,Wed,Wme,Wmg,Wgg,Wee) = (Symbol('Wgd'),Symbol('Wmd'),Symbol('Wed'),
                                 Symbol('Wme'),Symbol('Wmg'),Symbol('Wgg'),
                                 Symbol('Wee'))
(G_o, M_o) = (Symbol('G_o'),Symbol('M_o'))

# Each expression passed to solve() is implicitly set equal to zero.
print("Current Load Balancing Equation Solutions, Case 1:")
pprint(solve(
      [Wgg*G + Wgd*D - (M + Wmd*D + Wme*E + Wmg*G),
       Wgg*G + Wgd*D - (Wee*E + Wed*D),
       Wed*D + Wmd*D + Wgd*D - D,
       Wmg*G + Wgg*G - G,
       Wme*E + Wee*E - E,
       Wmg - Wmd,
       3*Wed - 1],
       Wgd, Wmd, Wed, Wme, Wmg, Wgg, Wee))

print()
print("Case 1 with guard and middle overhead: ")
pprint(solve(
        [(1-G_o)*(Wgg*G + Wgd*D) - (1-M_o)*(M + Wmd*D + Wme*E + Wmg*G),
         (1-G_o)*(Wgg*G + Wgd*D) - (Wee*E + Wed*D),
         Wed*D + Wmd*D + Wgd*D - D,
         Wmg*G + Wgg*G - G,
         Wme*E + Wee*E - E,
         Wmg - Wmd,
         3*Wed - 1],
         Wgd, Wmd, Wed, Wme, Wmg, Wgg, Wee))

print("\n\n")
print("Elimination of combined Guard+Exit flags (no overhead): ")
pprint(solve(
      [(Wgg*G) - (M + Wme*E + Wmg*G),
       (Wgg*G) - 1*(Wee*E),
       Wmg*G + Wgg*G - G,
       Wme*E + Wee*E - E],
       Wme, Wmg, Wgg, Wee))

print()
print("Elimination of combined Guard+Exit flags (Guard+middle overhead): ")
combined = solve(
      [(1-G_o)*(Wgg*G) - (1-M_o)*(M + Wme*E + Wmg*G),
       (1-G_o)*(Wgg*G) - 1*(Wee*E),
       Wmg*G + Wgg*G - G,
       Wme*E + Wee*E - E],
       Wme, Wmg, Wgg, Wee)
pprint(combined)

Filename: 266-removing-current-obsolete-clients.txt
Title: Removing current obsolete clients from the Tor network
Author: Nick Mathewson
Created: 14 Jan 2016
Status: Superseded
Superseded-by: 264, 272.

1. Introduction

   Frequently, we find that very old versions of Tor should no longer be
   supported on the network.  To remove relays is easy enough: we
   simply update the directory authorities to stop listing relays that
   advertise versions that are too old.

   But to disable clients is harder.

   In another proposal I describe a system for letting future clients go
   gracefully obsolete.  This proposal explains how we can safely
   disable the obsolete clients we have today (and all other client
   versions of Tor to date, assuming that they will someday become
   obsolete).

1.1. Why disable clients?

   * Security.  Anybody who hasn't updated their Tor client in 5
     years is probably vulnerable to who-knows-what attacks.  They
     aren't likely to get much anonymity either.

   * Withstand zombie installations. Some Tors out there were once
     configured to start-on-boot systems that are now unmaintained.
     (See 1.4 below.)  They put needless load on the network, and help
     nobody.

   * Be able to remove backward-compatibility code.  Currently, Tor
     supports some truly ancient protocols in order to avoid breaking
     ancient versions of Tor.  This code needs to be maintained and
     tested. Some of it depends on undocumented or deprecated or
     non-portable OpenSSL features, and makes it hard to produce a
     conforming Tor server implementation.

   * Make it easier to write a conforming Tor relay.  If a Tor relay
     needs to support every Tor client back through the beginning of
     time, that makes it harder to develop and test compatible
     implementations.

1.2. Is this dangerous?

   I don't think so.  This proposal describes a way to make older
   clients gracefully disconnect from the network only when a majority
   of authorities agree that they should.  A majority of authorities
   already have the ability to inflict arbitrary degrees of sabotage on
   the consensus document.

1.3. History

   The earliest versions of Tor checked the recommended-versions field
   in the directory to see whether they should keep running.  If they
   saw that their version wasn't recommended, they'd shut down.  There
   was an "IgnoreVersion" option that let you keep running anyway.

   Later, around 2004, the rule changed to "shut down if the version is
   _obsolete_", where obsolete was defined as "not recommended, and
   older than a version that is recommended."

   In 0.1.1.7-alpha, we made obsolete versions only produce a warning,
   and removed IgnoreVersion.  (See 3ac34ae3293ceb0f2b8c49.)

   We have still disabled old tor versions.  With Tor 0.2.0.5-alpha,
   we disabled Tor versions before 0.1.1.6-alpha by having the v1
   authorities begin publishing empty directories only.

   In version 0.2.5.2-alpha, we completely removed support for the v2
   directory protocol used before Tor 0.2.0; there are no longer any v2
   authorities on the network.

   Tor versions before 0.2.1 will currently not progress past fetching
   an initial directory, because they believe in a number of directory
   authority identity keys that no longer sign the directory.

   Tor versions before 0.2.4 are (lightly) throttled in multihop
   circuit creation, because we prioritize ntor CREATE cells over
   TAP ones when under load.

1.4. The big problem: slow zombies and fast zombies

   It would be easy enough to 'disable' old clients by simply removing
   server support for the obsolete protocols that they use.  But there's
   a problem with that approach: what will the clients do when they fail
   to make connections, or to extend circuits, or whatever else they are
   no longer able to do?

     * Ideally, I'd like such clients to stop functioning _quietly_.  If
       they stop contacting the network, that would be best.

     * Next best would be if these clients contacted the network only
       occasionally and at different times.  I'll call these clients
       "slow zombies".

     * Worse would be if the clients contact the network frequently,
       over and over.  I'll call these clients "fast zombies".  They
       would be at their worst when they focus on authorities, or when
       they act in synchrony to all strike at once.

   One goal of this proposal is to ensure that future clients do not
   become zombies at all; and that ancient clients become slow zombies
   at worst.


2. Some ideas that don't work.

2.1. Dropping connections based on link protocols.

   Tor versions before 0.2.3.6-alpha use a renegotiation-based
   handshake instead of our current handshake.  We could detect these
   handshakes and close the connection at the relay side if the client
   attempts to renegotiate.

   I've tested these changes on versions maint-0.2.0 through
   maint-0.2.2.  They result in zombies with the following behavior:

      The client contacts each authority it knows about, attempting to
      make a one-hop directory connection.  It fails, detects a failure,
      then reconnects more and more slowly ... but one hour later, it
      resets its connection schedule and starts again.

   In the steady state this appears to result in about two connections
   per client per authority per hour.  That is probably too many.

   (Most authorities would be affected: of the authorities that existed
   in 0.2.2, gabelmoo has moved and turtles has shut down.  The
   authorities Faravahar and longclaw are new. The authorities moria1,
   tor26, dizum, dannenberg, urras, maatuska and maatuska would all get
   hit here.) [two maatuskas? -RD]

   (We could simply remove the renegotiation-detection code entirely,
   and reply to all connections with an immediate VERSIONS cell.  The
   behavior would probably be the same, though.)

   If we throttled connections rather than closing them, we'd only get
   one connection per authority per hour, but authorities would have to
   keep open a potentially huge number of sockets.

2.2. Blocking circuit creation under certain circumstances

   In tor 0.2.5.1-alpha, we began ignoring the UseNTorHandshake option,
   and always preferring the ntor handshake where available.

   Unfortunately, we can't simply drop all TAP handshakes, since clients
   and relays can still use them in the hidden service protocol.  But
   we could detect these versions by:

        Looking for use of a TAP handshake from an IP not associated
        with any known relay, or on a connection where the client
        did not authenticate.  (This could be from a bridge, but clients
        don't build circuits that go to an IntroPoint or RendPoint
        directly after a bridge.)

   This would still result in clients not having directories, however,
   and retrying once an hour.

3. Ideas that might work

3.1. Move all authorities to new ports

   We could have each authority known to older clients start listening
   for connections at a new port P. We'd forward the old port to the new
   port.  Once sufficiently many clients were using the new ports, we
   could disable the forwarding.

   This would result in the old clients turning into zombies as above,
   but they would only be scrabbling at nonexistent ports, causing less
   load on the authorities.

   [This proposal would probably be easiest to implement.]

3.2. Start disabling old link protocols on relays

   We could have new relays start dropping support for the old link
   protocols, while maintaining support on the authorities and older
   relays.

   The result here would be a degradation of older client performance
   over time.  They'd still behave zombieishly if the authorities
   dropped support, however.

3.3. Changing the consensus format.

   We could allow 'f' (short for "flag") as a synonym for 's' in
   consensus documents.  Later, if we want to disable all Tor versions
   before today, we can change the consensus algorithm so that the
   consensus (or perhaps only the microdesc consensus) is spelled with
   'f' lines instead of 's' lines.  This will create a consensus which
   older clients and relays parse as having all nodes down, which will
   make them not connect to the network at all.

   We could similarly replace "r" with "n", or replace Running with
   Online, or so on.

   In doing this, we could also rename fresh-until and valid-until, so
   that new clients would have the real expiration date, and old clients
   would see "this consensus never expires".  This would prevent them
   from downloading new consensuses.

   [This proposal would result in the quietest shutdown.]
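
   The effect on old parsers can be illustrated with a toy model.  This
   is not Tor's real consensus parser: the function name is
   hypothetical, the sample entries are heavily abbreviated, and the
   sketch assumes that a parser silently skips line keywords it does
   not recognize.

```python
def running_relays(consensus_text, flag_keywords=("s",)):
    """Return nicknames of relays whose flag line lists Running."""
    running, nickname = [], None
    for line in consensus_text.splitlines():
        keyword, _, rest = line.partition(" ")
        if keyword == "r":
            # Start of a new router entry; the nickname is the first field.
            nickname = rest.split()[0]
        elif keyword in flag_keywords and nickname and "Running" in rest.split():
            running.append(nickname)
        # Any other keyword is ignored, as an old parser is assumed to do.
    return running


# Abbreviated, made-up consensus fragment that uses 'f' lines for flags.
SAMPLE = "r relayA\nf Fast Running\nr relayB\nf Running Stable"

old_view = running_relays(SAMPLE)                            # knows 's' only
new_view = running_relays(SAMPLE, flag_keywords=("s", "f"))  # accepts both
```

   Here old_view is empty, since the old parser never sees a flag line
   and so treats every relay as down, while new_view lists both relays.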

A. How to "pull the switch."

   This is an example timeline of how we could implement 3.3 above,
   along with proposal 264.

     TIME 0:
        Implement the client/relay side of proposal 264, backported
        to every currently extant Tor version that we still
        support.

        At the same time, add support for the new consensus type to
        all the same Tor versions.

        Don't disable anything yet.

     TIME 1....N:
        Encourage all distributions shipping packages for those old
        tor versions to upgrade to ones released at Time 0 or later.

        Keep informed of the upgrade status of the clients and
        relays on the Tor network.


     LATER:
        At some point after nearly all clients and relays have
        upgraded to the versions released at Time 0 or later, we
        could make the switchover to publishing the new consensus
        type.


B. Next steps.

   We should verify what happens when currently extant client
   versions get an empty consensus.  This will determine whether
   3.3 will work.  Will they try to fetch a new one from the
   authorities at the end of the validity period?

   Another option is from Roger: we could add a flag meaning "ignore
   this consensus; it is a poison consensus to kill old Tor
   versions."  And maybe we could have it signed only by keys that
   the current clients won't accept.  And we could serve it to old
   clients rather than serving them the real consensus.  And we
   could give it a really high expiration time.  New clients
   wouldn't believe it.  We'd need to flesh this out.

   Another option is also from Roger:  Tell new clients about new
   locations to fetch directories from.  Keep the old locations working
   for as long as we want to support them.  We'd need to flesh this
   out too.

   The timeline above requires us to keep informed of the status of
   the different clients and relays attempting to connect to the tor
   network.  We should make sure we'll actually be able to do so.

   http://meetbot.debian.net/tor-dev/2016/tor-dev.2016-02-12-15.01.log.html
   has a more full discussion of the above ideas.
Filename: 267-tor-consensus-transparency.txt
Title: Tor Consensus Transparency
Author: Linus Nordberg
Created: 2014-06-28
Status: Open

0. Introduction

   This document describes how to provide and use public, append-only,
   verifiable logs containing Tor consensus and vote status documents,
   much like what Certificate Transparency [CT] does for TLS
   certificates, making it possible for log monitors to detect false
   consensuses and votes.

   Tor clients and relays can refuse to use a consensus not present in
   a set of logs of their choosing, as well as provide possible
   evidence of misissuance by submitting such a consensus to any
   number of logs.

1. Overview

   Tor status documents, consensuses as well as votes, are stored in
   one or more public, append-only, externally verifiable logs using a
   history tree like the one described in [CrosbyWallach].

   Consensus-users, i.e. Tor clients and relays, expect to receive one
   or more "proofs of inclusion" with new consensus documents. A proof
   of inclusion is a hash sum representing the tree head of a log,
   signed by the log's private key, and an audit path listing the nodes
   in the tree needed to recreate the tree head. Consensus-users are
   configured to use one or more logs by listing a log address and a
   public key for each log. This is enough for verifying that a given
   consensus document is present in a given log.

   Submission of status documents to a log can be done by anyone with
   an internet connection (and the Tor network, in case of logs only
   on a .onion address). The submitter gets a signed tree head and a
   proof of inclusion in return. Directory authorities are expected to
   submit to one or more logs and include the proofs when serving
   consensus documents. Directory caches and consensus-users receiving
   a consensus not including a proof of inclusion may submit the
   document and use the proof they receive in return.

   Auditing log behaviour and monitoring the contents of logs is
   performed in cooperation between the Tor network and external
   services. Relays act as log auditors with help from Tor clients
   gossiping about what they see. Directory authorities are good
   candidates for monitoring log content since they know what votes
   they have sent and received as well as what consensus documents
   they have issued. Anybody can run both an auditor and a monitor
   though, which is an important property of the proposed system.

2. Motivation

   Popping a handful of boxes (currently five) or factoring the same
   number of RSA keys should not be ruled out as a possible attack
   against a subset of Tor users. An attacker controlling a majority
   of the directory authorities' signing keys can, using
   man-in-the-middle or man-on-the-side attacks, serve consensus
   documents listing relays under their control. If mounted on a small
   subset of Tor users on the internet, the chance of detection is
   probably low. Implementation of this proposal increases the cost
   for such an attack by raising the chances of it being detected.

   Note that while the proposed solution gives each individual some
   degree of protection against using a false consensus this is not
   the primary goal but more of a nice side effect. The primary goal
   is to detect correctly signed consensus documents which differ from
   the consensus of the directory authorities. This raises the risk
   of exposure of an attacker capable of producing a consensus and
   feeding it to users.

   The complexity of the proposed solution is motivated by the fact
   that the log key is not just another key on top of the directory
   authority keys since the log doesn't have to be trusted. Another
   value is the decentralisation given -- anybody can run their own
   log and use it. Anybody can audit all existing logs and verify
   their correct behaviour. This empowers people outside the group of
   Tor directory authority operators and the people who trust them for
   one reason or the other.

3. Design

   Communication with logs is done over HTTP using TLS or Tor onion
   services for transport, similar to what is defined in
   [rfc6962-bis-12]. Parameters for POSTs and all responses are
   encoded as name/value pairs in JSON objects [RFC4627].

   Summary of proposed changes to Tor:

   - Configuration is added for listing known logs and for describing
     policy for using them.

   - Directory authorities start submitting newly created consensuses
     to at least one public log.

   - Tor clients and relays receiving a consensus not accompanied by a
     proof of inclusion start submitting that consensus to at least
     one public log.

   - Consensus-users start rejecting consensuses accompanied by an
     invalid proof of inclusion.

   - A new cell type LOG_STH is defined, for clients and relays to
     exchange information about seen tree heads and their validity.

   - Consensus-users send seen tree heads to relays acting as log
     auditors.

   - Relays acting as log auditors validate tree heads (section 3.2.2)
     received from consensus-users and send results back.

   - Consensus-users start rejecting consensuses for which valid
     proofs of inclusion can not be obtained.

   Definitions:

   - Log id: The SHA-256 hash of the log's public key, to be treated
     as an opaque byte string identifying the log.
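
   As a minimal sketch, a log id can be computed as below. The
   function name log_id() and the placeholder key bytes are
   hypothetical, and the exact byte encoding of the public key that
   gets hashed is an assumption, since this proposal does not fix one.

```python
import hashlib


def log_id(public_key_bytes: bytes) -> bytes:
    """SHA-256 of the log's public key, used as an opaque 32-byte id."""
    return hashlib.sha256(public_key_bytes).digest()


# Placeholder bytes standing in for an encoded public key (not a real key).
example_key = b"hypothetical encoded log public key"
print(log_id(example_key).hex())
```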

3.1. Consensus submission

   Logs accept consensus submissions from anyone as long as the
   consensus is signed by a majority of the Tor directory authorities
   of the Tor network that it's logging.

   Consensus documents are POST:ed to a well-known URL as defined in
   section 5.2.

   The output is what we call a proof of inclusion.

3.2. Verification

3.2.1. Log entry membership verification

   Calculate a tree head from the hash of the received consensus and
   the audit path in the accompanying proof. Verify that the
   calculated tree head is identical to the tree head in the
   proof. This can easily be done by consensus-users for each received
   consensus.

   We now know that the consensus is part of a tree which the log
   claims to be The Tree. Whether this tree is the same tree that
   everybody else sees is unknown at this point.
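
   The membership check can be sketched as follows. The sketch assumes
   the Merkle tree hashing used by Certificate Transparency (leaf
   prefix 0x00, node prefix 0x01, SHA-256) and treats the raw
   consensus bytes as the log entry; the real entry encoding would
   follow [rfc6962-bis-12]. tree_head() and audit_path() play the role
   of the log and exist only so the example can be run end to end.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly smaller than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def tree_head(entries) -> bytes:
    """Log side: Merkle tree hash over a non-empty list of entries."""
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = _split(len(entries))
    return node_hash(tree_head(entries[:k]), tree_head(entries[k:]))


def audit_path(index: int, entries) -> list:
    """Log side: sibling hashes needed to rebuild the head from one leaf."""
    if len(entries) == 1:
        return []
    k = _split(len(entries))
    if index < k:
        return audit_path(index, entries[:k]) + [tree_head(entries[k:])]
    return audit_path(index - k, entries[k:]) + [tree_head(entries[:k])]


def verify_inclusion(consensus, index, tree_size, path, expected_head) -> bool:
    """Consensus-user side: recompute the tree head and compare."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(consensus)
    for p in path:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)
            # Skip levels where this node has no right-hand sibling.
            while fn and not (fn & 1):
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == expected_head


log = [b"consensus-%d" % i for i in range(7)]
head = tree_head(log)
print(verify_inclusion(log[4], 4, len(log), audit_path(4, log), head))
```

   A forged document fails the same check, because its leaf hash
   leads to a different tree head.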

3.2.2. Log consistency verification

   Ask the log for a consistency proof between the tree head to verify
   and a previously known good tree head from the pool. Section 5.3
   specifies how to fetch a consistency proof.

   [[TBD require auditors to fetch and store the tree head for the
   empty tree as part of bootstrapping, in order to avoid the case
   where there's no older tree to verify against?]]

   [[TODO description of verification of consistency goes here]]

   Relays acting as auditors cache results to minimise calculations
   and communication with log servers.

   [[TBD have clients verify consistency as well? NOTE: we still want
   relays to see tree heads in order to catch a lying log (the
   split-view attack)]]

   We now know that the verified tree is a superset of a known good
   tree.

3.3. Log auditing

   A log auditor verifies two things:

   - A log's append-only property, i.e. that no entries once accepted
   by a log are ever altered or removed.

   - That a log presents the same view to all of its users [[TODO
   describe the Tor network's role in auditing more than what's found
   in section 3.2.2]]

   A log auditor typically doesn't care about the contents of the log
   entries, other than calculating their hash sums for auditing
   purposes.

   Tor relays should act as log auditors.

3.4. Log monitoring

   A log monitor downloads and investigates each entry in a log
   searching for anomalies according to its monitoring policy.

   This document doesn't define monitoring policies but does outline a
   few strategies for monitoring in section [[TBD]].

   Note that there can be more than one valid consensus document for
   a given point in time. One reason for this is that the number of
   signatures can differ due to consensus voting timing
   details. [[TODO Are there more reasons?]]

   [[TODO expand on monitoring strategies -- even if this is not part
   of the proposed extensions to the Tor network it's good for
   understanding. a) dirauths can verify consensus documents byte for
   byte; b) anyone can look for diffs larger than D per time T, where
   "diffs" certainly can be smarter than a plain text diff]]

3.5. Consensus-user behaviour

   [[TODO move most of this to section 5]]

   Keep an on-disk cache of consensus documents. Mark them as being in
   one of three states:

   LOG_STATE_UNKNOWN -- don't know whether it's present in enough logs
                        or not
   LOG_STATE_LOGGED -- have seen good proof(s) of inclusion
   LOG_STATE_LOGGED_GOOD -- confident about the tree head representing
                            a good tree

   Newly arrived consensus documents start in UNKNOWN or LOGGED
   depending on whether they are accompanied by enough proofs or
   not. There are two possible state transitions:

   - UNKNOWN --> LOGGED: When enough correctly verifying proofs of
     inclusion (section 3.2.1) have been seen. The number of good
     proofs required is a policy setting in the configuration of the
     consensus-user.

   - LOGGED --> LOGGED_GOOD: When the tree heads in enough of the
     inclusion proofs have been verified (section 3.2.2) or enough
     LOG_STH cells vouching for the same tree heads have been
     seen. The number of verifications required is a policy setting in
     the configuration of the consensus-user.

   Consensuses in state UNKNOWN are not used but are instead submitted
   to one or more logs. If the submission succeeds, this will take the
   consensus to state LOGGED.

   Consensuses in state LOGGED are used despite not being fully
   verified with regard to logging. LOG_STH cells containing
   tree heads from received proofs are being sent to relays for
   verification. Clients send to all relays that they have a circuit
   to, i.e. their guard relay(s). Relays send to three random relays
   that they have a circuit to.
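
   The states and transitions above can be sketched as a small state
   machine. The class and attribute names (CachedConsensus,
   good_proofs, verified_heads) are hypothetical, and the two
   thresholds stand for the policy settings mentioned above.

```python
from enum import Enum


class LogState(Enum):
    UNKNOWN = "LOG_STATE_UNKNOWN"
    LOGGED = "LOG_STATE_LOGGED"
    LOGGED_GOOD = "LOG_STATE_LOGGED_GOOD"


class CachedConsensus:
    """One cached consensus document and its logging state."""

    def __init__(self, good_proofs=0, verified_heads=0):
        self.good_proofs = good_proofs        # proofs that verified (3.2.1)
        self.verified_heads = verified_heads  # heads verified or vouched for
        self.state = LogState.UNKNOWN

    def update(self, proofs_required, verifications_required):
        """Apply the two allowed transitions; states never move backwards."""
        if (self.state is LogState.UNKNOWN
                and self.good_proofs >= proofs_required):
            self.state = LogState.LOGGED
        if (self.state is LogState.LOGGED
                and self.verified_heads >= verifications_required):
            self.state = LogState.LOGGED_GOOD
        return self.state

    def usable(self):
        """UNKNOWN documents are submitted to logs rather than used."""
        return self.state is not LogState.UNKNOWN


doc = CachedConsensus(good_proofs=2)
print(doc.update(proofs_required=2, verifications_required=3))
```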

3.6. Relay behaviour when acting as an auditor

   In order to verify the append-only property of a log, relays acting
   as log auditors verify the consistency of tree heads received in
   LOG_STH cells. An auditor keeps a copy of 2+N known good tree heads
   in a pool stored on persistent media [[TBD where N is either a
   fixed number in the range 32-128 or is a function of the log
   size]]. Two of them are the oldest and newest tree heads seen,
   respectively. The rest, N, are randomly chosen from the tree heads
   seen.

   [[TODO describe or refer to an algorithm for "randomly chosen",
   hopefully not subject to flushing attacks (or other attacks)]].
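
   One possible shape for such a pool is sketched below. It is not
   the algorithm this proposal will specify: it uses plain reservoir
   sampling for the N random tree heads, which makes no attempt to
   resist flushing attacks. TreeHeadPool and its methods are
   hypothetical names, and a tree head is modelled as a
   (tree_size, root_hash) pair.

```python
import random


class TreeHeadPool:
    """Keeps the oldest and newest tree heads plus N random ones."""

    def __init__(self, n, rng=None):
        self.n = n
        self.rng = rng or random.Random()
        self.oldest = None
        self.newest = None
        self.sample = []   # reservoir of at most n heads
        self.seen = 0      # number of heads offered so far

    def add(self, tree_size, root_hash):
        head = (tree_size, root_hash)
        if self.oldest is None or tree_size < self.oldest[0]:
            self.oldest = head
        if self.newest is None or tree_size > self.newest[0]:
            self.newest = head
        # Reservoir sampling: every head seen so far is kept with
        # equal probability n / seen.
        self.seen += 1
        if len(self.sample) < self.n:
            self.sample.append(head)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.n:
                self.sample[j] = head

    def known_good(self):
        """The 2+N heads (fewer while the pool is still filling up)."""
        ends = [h for h in (self.oldest, self.newest) if h is not None]
        return ends + list(self.sample)


pool = TreeHeadPool(n=32)
for size in range(1, 1001):
    pool.add(size, b"root-%d" % size)
print(len(pool.known_good()))
```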

3.7. Notable differences from Certificate Transparency

   - The data logged is "strictly time-stamped", i.e. ordered.

   - Much shorter lifetime of logged data -- a day rather than a
     year. Are the effects of this difference of importance only for
     "one-shot attacks"?

   - Directory authorities have consensus about what they're
     signing -- there are no "web sites knowing better".

   - Submitters are not in the same hurry as CA:s and can wait minutes
     rather than seconds for a proof of inclusion.

4. Security implications

  TODO

5. Specification

5.0. Data structures

   Data structures are defined as described in [RFC5246] section 4,
   i.e. TLS 1.2 presentation language. While it is tempting to try to
   avoid yet another format, the cost of redefining the data
   structures in [rfc6962-bis-12] outweighs this consideration. The
   burden of redefining, reimplementing and testing is extra true for
   those structures which need precise definitions because they are to
   be signed.

5.1. Signed Tree Head (STH)

   An STH is a TransItem structure of type "signed_tree_head" as
   defined in [rfc6962-bis-12] section 5.8.

5.2. Submitting a consensus document to a log

   POST https://<log server>/tct/v1/add-consensus

   Input:

     consensus: A consensus status document as defined in [dir-spec]
       section 3.4.1 [[TBD gzipped and base64 encoded to save 50%?]]

   Output:

     sth: A signed tree head as defined in section 5.1 referring to a
     tree in which the submitted document is included.

     inclusion: An inclusion proof as specified for the "inclusion"
     output in [rfc6962-bis-12] section 6.5.
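
   A submission request could be built roughly as follows. This is a
   sketch only: the host name "log.example" is made up, the helper
   build_add_consensus_request() is hypothetical, and base64 without
   compression is just one of the encodings left open by the TBD
   above. The request is constructed but not sent.

```python
import base64
import json
import urllib.request


def build_add_consensus_request(log_server: str, consensus: bytes):
    """Build (but do not send) the POST described in this section."""
    body = json.dumps(
        # Assumption: the document is base64 encoded inside the JSON object.
        {"consensus": base64.b64encode(consensus).decode("ascii")}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{log_server}/tct/v1/add-consensus",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_add_consensus_request("log.example", b"network-status-version 3\n")
print(req.get_method(), req.full_url)
```

   A client would pass the request to urllib.request.urlopen() (or
   send it over an onion service) and read "sth" and "inclusion"
   from the JSON response.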

5.3. Getting a consistency proof from a log

   GET https://<log server>/tct/v1/get-sth-consistency

   Input and output as specified in [rfc6962-bis-12] section 6.4.

5.x. LOG_STH cells

   A LOG_STH cell is a variable-length cell with the following
   fields:

     TBDname [TBD octets]
     TBDname [TBD octets]
     TBDname [TBD octets]

6. Compatibility

   TBD

7. Implementation

   TBD

8. Performance and scalability notes

   TBD

A. Open issues / TODOs

   - TODO: Add SCTs from CT, at least as a practical "cookie" (i.e. no
     need to send them around or include them anywhere). Logs should
     be given more time for distributing than we're willing to wait on
     an HTTP response for.

   - TODO: explain why no hash function and signing algorithm agility,
     [rfc6962-bis-12] section 10

   - TODO: add a blurb about the values of publishing logs as onion
     services

   - TODO: discuss compromise of log keys

B. Acknowledgements

   This proposal leans heavily on [rfc6962-bis-12]. Some definitions
   are copied verbatim from that document. Valuable feedback has been
   received from Ben Laurie, Karsten Loesing and Ximin Luo.

C. References

   [CrosbyWallach] http://static.usenix.org/event/sec09/tech/full_papers/crosby.pdf
   [dir-spec] https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt
   [RFC4627] https://tools.ietf.org/html/rfc4627
   [RFC5246] https://tools.ietf.org/html/rfc5246
   [rfc6962-bis-12] https://datatracker.ietf.org/doc/draft-ietf-trans-rfc6962-bis/12
   [CT] https://www.certificate-transparency.org/
Filename: 268-guard-selection.txt
Title: New Guard Selection Behaviour
Author: Isis Lovecruft, George Kadianakis, [Ola Bini]
Created: 2015-10-28
Status: Obsolete

  (Editorial note: this was originally written as a revision of
  proposal 259, but it diverges so substantially that it seemed
  better to assign it a new number for reference, so that we
  aren't always talking about "The old 259" and "the new 259". -NM)

  This proposal has been obsoleted by proposal #271.

§1. Overview

  Tor uses entry guards to prevent an attacker who controls some
  fraction of the network from observing a fraction of every user's
  traffic. If users chose their entries and exits uniformly at
  random from the list of servers every time they build a circuit,
  then an adversary who had (k/N) of the network would deanonymize
  F=(k/N)^2 of all circuits... and after a given user had built C
  circuits, the attacker would see them at least once with
  probability 1-(1-F)^C.  With large C, the attacker would get a
  sample of every user's traffic with probability 1.
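
  The two formulas above can be evaluated directly. In this sketch
  the function names are hypothetical and the example figures (an
  adversary holding 100 of 1000 relays) are chosen only for
  illustration.

```python
def compromised_fraction(k, n):
    """F = (k/N)^2: the adversary holds both entry and exit."""
    return (k / n) ** 2


def prob_seen_at_least_once(k, n, circuits):
    """1 - (1 - F)^C after the user has built C circuits."""
    f = compromised_fraction(k, n)
    return 1 - (1 - f) ** circuits


for c in (1, 10, 100, 1000):
    print(c, round(prob_seen_at_least_once(100, 1000, c), 4))
```

  Even with F at one percent, the probability approaches 1 as C grows,
  which is the problem entry guards are meant to address.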

  To prevent this from happening, Tor clients choose a small number of
  guard nodes (currently 3).  These guard nodes are the only nodes
  that the client will connect to directly.  If they are not
  compromised, the user's paths are not compromised.

  But attacks remain.  Consider an attacker who can run a firewall
  between a target user and the Tor network, and make
  many of the guards they don't control appear to be unreachable.
  Or consider an attacker who can identify a user's guards, and mount
  denial-of-service attacks on them until the user picks a guard
  that the attacker controls.

  In the presence of these attacks, we can't continue to connect to
  the Tor network unconditionally.  Doing so would eventually result
  in the user choosing a hostile node as their guard, and losing
  anonymity.

  This proposal outlines a new entry guard selection algorithm, which
  addresses the following concerns:

    - Heuristics and algorithms for determining how and which guard(s)
      is(/are) chosen should be kept as simple and easy to understand
      as possible.

    - Clients in censored regions or who are behind a fascist firewall
      who connect to the Tor network should not experience any
      significant disadvantage in terms of reachability or usability.

    - Tor should make a best attempt at discovering the most
      appropriate behaviour, with as little user input and
      configuration as possible.


§2. Design

  Alice, an OP attempting to connect to the Tor network, should
  undertake the following steps to determine information about the
  local network and to select (some) appropriate entry guards.  In the
  following scenario, it is assumed that Alice has already obtained a
  recent, valid, and verifiable consensus document.

  The algorithm is divided into four components such that the full
  algorithm is implemented by first invoking START, then repeatedly
  calling NEXT while advised that it SHOULD_CONTINUE and finally calling
  END. For an example usage see §A. Appendix.

  Several components of NEXT can be invoked asynchronously. SHOULD_CONTINUE
  is used for the algorithm to be able to tell the caller whether we
  consider the work done or not - this can be used to retry primary
  guards when we finally are able to connect to a guard after a long
  network outage, for example.

  This algorithm keeps track of the unreachability status for guards
  in state global to the system, so that repeated runs will not have
  to rediscover unreachability over and over again. However, this
  state does not need to be persisted permanently - it is purely an
  optimization.

  The algorithm expects several arguments to guide its behavior. These
  will be defined in §2.1.

  The goal of this algorithm is to strongly prefer connecting to the
  same guards we have connected to before, while also trying to detect
  conditions such as a network outage. The way it does this is by keeping
  track of how many guards we have exposed ourselves to, and if we have
  connected to too many we will fall back to only retrying the ones we have
  already tried. The algorithm also decides on a sample set that should
  be persisted - in order to minimize the risk of an attacker forcing
  enumeration of the whole network by triggering rebuilding of
  circuits.


§2.1. Definitions

  Bad guard: a guard is considered bad if it conforms with the function IS_BAD
  (see §G. Appendix for details).

  Dead guard: a guard is considered dead if it conforms with the function
  IS_DEAD (see §H. Appendix for details).

  Obsolete guard: a guard is considered obsolete if it conforms with the
  function IS_OBSOLETE (see §I. Appendix for details).

  Live entry guard: a guard is considered live if it conforms with the function
  IS_LIVE (see §D. Appendix for details).

§2.1. The START algorithm

  In order to start choosing an entry guard, use the START
  algorithm. This takes four arguments that can be used to fine tune
  the workings:

  USED_GUARDS
      This is a list that contains all the guards that have been used
      before by this client. We will prioritize using guards from this
      list in order to minimize our exposure. The list is expected to
      be sorted based on priority, where the first entry will have the
      highest priority.

  SAMPLED_GUARDS
      This is a set that contains all guards that should be considered
      for connection. This set should be persisted between runs. It
      should be filled by using NEXT_BY_BANDWIDTH with GUARDS as an
      argument if it's empty, or if it contains fewer than SAMPLE_SET_THRESHOLD
      guards after winnowing out older guards.

  N_PRIMARY_GUARDS
      The number of guards we should consider our primary
      guards. These guards will be retried more frequently and will
      take precedence in most situations. By default the primary
      guards will be the first N_PRIMARY_GUARDS guards from USED_GUARDS.
      When the algorithm is used in constrained mode (have bridges or entry
      nodes in the configuration file), this value should be 1; otherwise the
      proposed value is 3.

  DIR
      If this argument is set, we should only consider guards that can
      be directory guards. If not set, we will consider all guards.

  The primary work of START is to initialize the state machine depicted
  in §2.2. The initial state of the machine is defined by:

  GUARDS
      This is a set of all guards from the consensus. It will primarily be used
      to fill in SAMPLED_GUARDS.

  FILTERED_SAMPLED
      This is a set that contains all guards that we are willing to connect to.
      It will be obtained from calling FILTER_SET with SAMPLED_GUARDS as
      argument.

  REMAINING_GUARDS
      This is a running set of the guards we have not yet tried to connect to.
      It should be initialized to be FILTERED_SAMPLED without USED_GUARDS.

  STATE
      A variable that keeps track of which state in the state
      machine we are currently in. It should be initialized to
      STATE_PRIMARY_GUARDS.

  PRIMARY_GUARDS
      This list keeps track of our primary guards. These are guards
      that we will prioritize when trying to connect, and will also
      retry more often in case of failure with other guards.
      It should be initialized by calling algorithm
      NEXT_PRIMARY_GUARD repeatedly until PRIMARY_GUARDS contains
      N_PRIMARY_GUARDS elements.
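
  The initial state built by START can be sketched as below. Guards
  are modelled as plain strings, bandwidth is a hypothetical dict of
  weights, and is_live stands in for FILTER_SET without its expansion
  step. The DIR argument is left out. One assumption goes beyond the
  text: when USED_GUARDS has no further eligible entry (not only when
  it is empty), the sketch falls back to NEXT_BY_BANDWIDTH.

```python
import random
from dataclasses import dataclass

STATE_PRIMARY_GUARDS = "STATE_PRIMARY_GUARDS"


@dataclass
class Context:
    used_guards: list
    sampled_guards: set
    filtered_sampled: set
    remaining_guards: set
    primary_guards: list
    state: str = STATE_PRIMARY_GUARDS


def next_primary_guard(used, primary, guards, remaining, bandwidth, rng):
    # First choice: a used guard that is still in the consensus.
    for g in used:
        if g not in primary and g in guards:
            return g
    # Fallback: bandwidth-weighted pick from the remaining guards.
    candidates = sorted(g for g in remaining if g not in primary)
    if not candidates:
        return None
    weights = [bandwidth[g] for g in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def start(used_guards, sampled_guards, n_primary_guards, guards, bandwidth,
          is_live=lambda g: True, rng=random):
    guards = set(guards)
    filtered = {g for g in sampled_guards if is_live(g)}
    remaining = filtered - set(used_guards)
    primary = []
    while len(primary) < n_primary_guards:
        g = next_primary_guard(used_guards, primary, guards, remaining,
                               bandwidth, rng)
        if g is None:
            break  # not enough candidates to fill the primary list
        primary.append(g)
    return Context(list(used_guards), set(sampled_guards), filtered,
                   remaining, primary)


all_guards = {"g1", "g2", "g3", "g4", "g5"}
ctx = start(["g1", "g2"], all_guards, 3, all_guards,
            bandwidth={g: 100 for g in all_guards})
print(ctx.primary_guards, ctx.state)
```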


§2.2. The NEXT algorithm

  The NEXT algorithm is composed of several different possible flows. The
  first one is a simple state machine that can transfer between two
  different states. Every time NEXT is invoked, it will resume at the
  state where it left off previously. In the course of selecting an
  entry guard, a new consensus can arrive. When that happens we need
  to update the data structures used, but nothing else should change.

  Before jumping in to the state machine, we should first check whether
  at least PRIMARY_GUARDS_RETRY_INTERVAL minutes have passed since we
  tried any of the PRIMARY_GUARDS. If this is the case, and we are not in
  STATE_PRIMARY_GUARDS, we should save the previous state and set the
  state to STATE_PRIMARY_GUARDS.


§2.2.1. The STATE_PRIMARY_GUARDS state

  Return each entry in PRIMARY_GUARDS in turn. For each entry, if the
  guard should be retried and is considered suitable, use it. A guard is
  considered eligible to retry if it is marked for retry or is live
  and is not bad. Also, a guard is considered to be suitable if it is
  live and, if it is a directory, it should not be a cache.

  If all entries have been tried, transition to STATE_TRY_REMAINING.

§2.2.2. The STATE_TRY_REMAINING state

  Return each entry in USED_GUARDS that is not in PRIMARY_GUARDS in
  turn. For each entry, if a guard is found return it.

  Return each entry from REMAINING_GUARDS in turn.
  For each entry, if the guard should be retried and is considered
  suitable, use it and mark it as unreachable. A guard is
  considered eligible to retry if it is marked for retry or is live
  and is not bad. Also, a guard is considered to be suitable if it is
  live and, if it is a directory, it should not be a cache.

  If no entries remain in REMAINING_GUARDS, transition to
  STATE_PRIMARY_GUARDS.


§2.2.3. ON_NEW_CONSENSUS

  First, ensure that all guard profiles are updated with information
  about whether they were in the newest consensus or not.

  Update the bad status for all guards in USED_GUARDS and SAMPLED_GUARDS.
  Remove all dead guards from USED_GUARDS and SAMPLED_GUARDS.
  Remove all obsolete guards from USED_GUARDS and SAMPLED_GUARDS.

§2.3. The SHOULD_CONTINUE algorithm

  This algorithm takes as an argument a boolean indicating whether the
  circuit was successfully built or not.

  After the caller has tried to build a circuit with a returned
  guard, they should invoke SHOULD_CONTINUE to understand if the
  algorithm is finished or not. SHOULD_CONTINUE will always return
  true if the circuit failed. If the circuit succeeded,
  SHOULD_CONTINUE will always return false, unless the guard that
  succeeded was the first guard to succeed after
  INTERNET_LIKELY_DOWN_INTERVAL minutes - in that case it will set the
  state to STATE_PRIMARY_GUARDS and return true.
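
  The decision rule above can be sketched as follows. Time is
  modelled as a plain number of minutes, and GuardContext with its
  started_trying_at field is a hypothetical structure. Resetting the
  timer on every success, and treating "after" as "at least", are
  assumptions of this sketch rather than something the text specifies.

```python
STATE_PRIMARY_GUARDS = "STATE_PRIMARY_GUARDS"
STATE_TRY_REMAINING = "STATE_TRY_REMAINING"
INTERNET_LIKELY_DOWN_INTERVAL = 5  # minutes, the value proposed in §3


class GuardContext:
    def __init__(self, started_trying_at):
        self.state = STATE_TRY_REMAINING
        # Minute at which we began looking for a working guard.
        self.started_trying_at = started_trying_at


def should_continue(ctx, success, now):
    """Return True if the caller should call NEXT again."""
    if not success:
        return True
    elapsed = now - ctx.started_trying_at
    ctx.started_trying_at = now  # assumption: a success restarts the clock
    if elapsed >= INTERNET_LIKELY_DOWN_INTERVAL:
        # First success after a likely outage: go back and retry the
        # primary guards before settling for this one.
        ctx.state = STATE_PRIMARY_GUARDS
        return True
    return False


ctx = GuardContext(started_trying_at=0)
print(should_continue(ctx, success=False, now=1))  # True
print(should_continue(ctx, success=True, now=2))   # False
```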


§2.4. The END algorithm

  The goal of this algorithm is simply to make sure that we keep track
  of successful connections made. This algorithm should be invoked
  with the guard that was used to correctly set up a circuit.

  Once invoked, this algorithm will mark the guard as used, and make
  sure it is in USED_GUARDS, by adding it at the end if it was not there.


§2.5. Helper algorithms

  These algorithms are used in the above algorithms, but have been
  separated out here in order to make the flow clearer.

  NEXT_PRIMARY_GUARD
      - Return the first entry from USED_GUARDS that is not in
        PRIMARY_GUARDS and that is in the most recent consensus.
      - If USED_GUARDS is empty, use NEXT_BY_BANDWIDTH with
        REMAINING_GUARDS as the argument.

  NEXT_BY_BANDWIDTH
      - Takes G as an argument, which should be a set of guards to
        choose from.
      - Return a randomly selected element from G, weighted by bandwidth.

  FILTER_SET
      - Takes G as an argument, which should be a set of guards to filter.
      - Filter out guards in G that don't comply with IS_LIVE (see
        §D. Appendix for details).
      - If the filtered set is smaller than MINIMUM_FILTERED_SAMPLE_SIZE and G
        is smaller than MAXIMUM_SAMPLE_SIZE_THRESHOLD, expand G and try to
        filter out again. G is expanded by adding one new guard at a time using
        NEXT_BY_BANDWIDTH with GUARDS as an argument.
      - If G is not smaller than MAXIMUM_SAMPLE_SIZE_THRESHOLD, G should not be
        expanded. Abort execution of this function by returning null and report
        an error to the user.
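
  NEXT_BY_BANDWIDTH and FILTER_SET can be sketched as below. Guards
  are plain strings and bandwidth is a hypothetical dict of positive
  weights. Two simplifications are assumptions of this sketch: the
  upper bound is passed as an absolute count rather than as a
  fraction of GUARDS, and new guards are drawn only from those not
  yet sampled so that the loop always terminates.

```python
import random


def next_by_bandwidth(guards, bandwidth, rng=random):
    """Pick one guard from a non-empty set, weighted by bandwidth."""
    candidates = sorted(guards)
    weights = [bandwidth[g] for g in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def filter_set(sampled, all_guards, is_live, bandwidth,
               min_filtered, max_sample_size, rng=random):
    """Return the live subset of the sample, expanding it if too small.

    Returns None when the sample cannot be expanded any further.
    """
    sampled = set(sampled)
    while True:
        filtered = {g for g in sampled if is_live(g)}
        if len(filtered) >= min_filtered:
            return filtered
        if len(sampled) >= max_sample_size:
            return None  # the caller should report an error to the user
        unsampled = set(all_guards) - sampled
        if not unsampled:
            return None  # nothing left to add
        sampled.add(next_by_bandwidth(unsampled, bandwidth, rng))


guards = {"g%d" % i for i in range(10)}
bw = {g: 100 for g in guards}
down = {"g0", "g1"}
result = filter_set({"g0", "g1", "g2"}, guards, lambda g: g not in down,
                    bw, min_filtered=3, max_sample_size=8)
print(sorted(result))
```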


§3. Consensus Parameters, & Configurable Variables

  This proposal introduces several new parameters that ideally should
  be set in the consensus but that should also be possible to
  set or override in the client configuration file. Some of these have
  proposed values, but for others more simulation and trial needs to
  happen.

  PRIMARY_GUARDS_RETRY_INTERVAL
      In order to make it more likely we connect to a primary guard,
      we would like to retry the primary guards more often than other
      types of guards. This parameter controls how many minutes should
      pass before we consider retrying primary guards again. The
      proposed value is 3.

  SAMPLE_SET_THRESHOLD
      In order to allow us to recognize a completely unreachable network,
      we would like to avoid connecting to too many guards before switching
      modes. We also want to avoid exposing ourselves to too many nodes in a
      potentially hostile situation. This parameter, expressed as a
      fraction, determines the number of guards we should keep as the
      sampled set of the only guards we will consider connecting
      to. It will be used as a fraction for the sampled set.
      If we assume there are 1900 guards, a setting of 0.02
      means we will have a sample set of 38 guards.
      This limits our total exposure. Proposed value is 0.02.

  MINIMUM_FILTERED_SAMPLE_SIZE
      The minimum size of the sampled set after filtering out nodes based on
      client configuration (FILTERED_SAMPLED). Proposed value is ???.

  MAXIMUM_SAMPLE_SIZE_THRESHOLD
      In order to guarantee a minimum size of guards after filtering,
      we expand SAMPLED_GUARDS until a limit.  This fraction of GUARDS will be
      used as an upper bound when expanding SAMPLED_GUARDS.
      Proposed value is 0.03.

  INTERNET_LIKELY_DOWN_INTERVAL
      The number of minutes since we started trying to find an entry
      guard before we should consider the network down and consider
      retrying primary guards before using a functioning guard
      found. Proposed value 5.
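
  The two fractional parameters translate into guard counts as in
  this small sketch. The helper name and the choice to round to the
  nearest integer are assumptions; the proposal does not say how a
  fractional result should be rounded.

```python
SAMPLE_SET_THRESHOLD = 0.02
MAXIMUM_SAMPLE_SIZE_THRESHOLD = 0.03


def sample_set_size(n_guards, fraction):
    """Number of guards corresponding to a fraction of GUARDS."""
    return round(n_guards * fraction)  # rounding rule is an assumption


n_guards = 1900  # the figure used in the SAMPLE_SET_THRESHOLD example
print(sample_set_size(n_guards, SAMPLE_SET_THRESHOLD))           # 38
print(sample_set_size(n_guards, MAXIMUM_SAMPLE_SIZE_THRESHOLD))  # 57
```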

§4.  Security properties and behavior under various conditions

  Under normal conditions, this algorithm will allow us to quickly
  connect and use guards we have used before with high likelihood of
  working. Assuming the first primary guard is reachable and in the
  consensus, this algorithm will deterministically always return that
  guard.

  Under dystopic conditions (when a firewall is in place that blocks
  all ports except for potentially ports 80 and 443), this algorithm
  will try to connect to 2% of all guards before switching modes to try
  dystopic guards. Currently, that means trying to connect to circa 40
  guards before getting a successful connection. If we assume a
  connection try will take at most 10 seconds, that means it will take
  up to 6 minutes to get a working connection.

  When the network is completely down, we will try to connect to 2% of
  all guards plus 2% of all dystopic guards before realizing we are
  down. This means circa 50 guards tried assuming there are 1900 guards
  in the network.

  In terms of exposure, we will connect to a maximum of 2% of all
  guards plus 2% of all dystopic guards, or 3% of all guards,
  whichever is lower. If N is the number of guards, and k is the
  number of guards an attacker controls, that means an attacker would
  have a probability of 1-(1-(k/N)^2)^(N * 0.03) to have one of their
  guards selected before we fall back. In real terms, this means an
  attacker would need to control over 10% of all guards in order to
  have a larger than 50% chance of controlling a guard for any given client.
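
  The exposure formula above can be evaluated numerically. This
  sketch takes the formula exactly as written, with
  attacker_fraction standing for k/N; the function names and the
  figure of 1900 guards (borrowed from §3) are illustrative choices.

```python
def exposure_probability(attacker_fraction, n_guards, max_fraction=0.03):
    """1-(1-(k/N)^2)^(N * 0.03), with attacker_fraction = k/N."""
    attempts = n_guards * max_fraction
    return 1 - (1 - attacker_fraction ** 2) ** attempts


def smallest_percent_above(target, n_guards):
    """Smallest whole percentage of guards giving a chance above target."""
    for percent in range(101):
        if exposure_probability(percent / 100, n_guards) > target:
            return percent
    return None


for percent in (5, 10, 15, 20):
    print(percent, round(exposure_probability(percent / 100, 1900), 3))
print(smallest_percent_above(0.5, 1900))
```

  With these inputs the probability at 10% stays below one half,
  which matches the claim that an attacker needs more than 10% of
  all guards.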

  In addition, since the sampled set changes slowly (the suggestion
  here is that guards in it expire every month) it is not possible for
  an attacker to force a connection to an entry guard that isn't
  already in the user's sampled set.


§A. Appendix: An example usage

  In order to clarify how this algorithm is supposed to be used, this
  pseudo code illustrates the building of a circuit:

    ESTABLISH_CIRCUIT:

      if chosen_entry_node = NULL
        if context = NULL
          context = ALGO_CHOOSE_ENTRY_GUARD_START(used_guards,
                                    sampled_guards=[],
                                    options,
                                    n_primary_guards=3,
                                    dir=false,
                                    guards_in_consensus)

        chosen_entry_node = ALGO_CHOOSE_ENTRY_GUARD_NEXT(context)
        if not IS_SUITABLE(chosen_entry_node)
          try another entry guard

      circuit = composeCircuit(chosen_entry_node)
      return circuit

    ON_FIRST_HOP_CALLBACK(channel):

      if !SHOULD_CONTINUE:
            ALGO_CHOOSE_ENTRY_GUARD_END(entryGuard)
      else
            chosen_entry_node = NULL


§B. Appendix: Entry Points in Tor

  In order to clarify how this algorithm is supposed to be integrated with
  Tor, here are some entry points to trigger actions mentioned in spec:

    When establish_circuit:

        If *chosen_entry_node* doesn't exist
            If a *context* already exists, reuse it as *context*.
            Otherwise, use ALGO_CHOOSE_ENTRY_GUARD_START to initialize a new *context*.

            After this when we want to choose_good_entry_server, we will use
            ALGO_CHOOSE_ENTRY_GUARD_NEXT to get a candidate.

        Use chosen_entry_node to build_circuit and handle_first_hop,
        return this circuit

    When entry_guard_register_connect_status(should_continue):

        if !should_continue:
           Call ALGO_CHOOSE_ENTRY_GUARD_END(chosen_entry_node)
        else:
           Set chosen_entry_node to NULL

    When new directory_info_has_arrived:

        Do ON_NEW_CONSENSUS


§C. Appendix: IS_SUITABLE helper function

    A guard is suitable if it satisfies all of the following conditions:
      - It's considered to be live, according to IS_LIVE.
      - It's a directory cache if a directory guard is requested.
      - It's not the chosen exit node.
      - It's not in the family of the chosen exit node.

    This conforms to the existing conditions in "populate_live_entry_guards()".


§D. Appendix: IS_LIVE helper function

    A guard is considered live if it satisfies all of the following conditions:
      - It's not disabled because of path bias issues (path_bias_disabled).
      - It was not observed to become unusable according to the directory or
        the user configuration (bad_since).
      - It's marked for retry (can_retry) or it's been unreachable for some
        time (unreachable_since) but enough time has passed since we last tried
        to connect to it (entry_is_time_to_retry).
      - It's in our node list, meaning it's present in the latest consensus.
      - It has a usable descriptor (either a routerdescriptor or a
        microdescriptor) unless a directory guard is requested.
      - It's a general-purpose router unless UseBridges is configured.
      - It's reachable by the configuration (fascist_firewall_allows_node).

    This conforms to the existing conditions in "entry_is_live()".
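
    The conditions above can be read as a single predicate. The sketch
    below is an assumption-laden illustration: the Guard record and its
    field names are ours (modelled on the parenthesised hints in the list),
    and we assume that a guard which was never marked unreachable passes
    the retry condition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Guard:
    """Illustrative guard record; not Tor's actual data structure."""
    path_bias_disabled: bool = False
    bad_since: Optional[float] = None
    can_retry: bool = False
    unreachable_since: Optional[float] = None
    time_to_retry: bool = False        # stands for entry_is_time_to_retry
    in_node_list: bool = True
    has_usable_descriptor: bool = True
    is_general_purpose: bool = True
    firewall_allows: bool = True       # stands for fascist_firewall_allows_node


def is_live(guard: Guard, *, for_directory: bool = False,
            use_bridges: bool = False) -> bool:
    # Assumption: never having been unreachable satisfies the retry condition.
    retry_ok = (guard.unreachable_since is None
                or guard.can_retry
                or guard.time_to_retry)
    return (not guard.path_bias_disabled
            and guard.bad_since is None
            and retry_ok
            and guard.in_node_list
            and (for_directory or guard.has_usable_descriptor)
            and (use_bridges or guard.is_general_purpose)
            and guard.firewall_allows)
```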

    A guard is observed to become unusable according to the directory or the
    user configuration if it satisfies any of the following conditions:
      - It's not in our node list, meaning it's not present in the latest consensus.
      - It's not currently running (is_running).
      - It's not a bridge and not a configured bridge
        (node_is_a_configured_bridge) and UseBridges is True.
      - It's not a possible guard and is not in EntryNodes and UseBridges is
        False.
      - It's in ExcludeNodes. Nevertheless this is ignored when
        loading from config.
      - It's not reachable by the configuration (fascist_firewall_allows_node).
      - It's disabled because of path bias issues (path_bias_disabled).

    This conforms to the existing conditions in "entry_guards_compute_status()".

§E. Appendix: UseBridges and Bridges configurations

  This is mutually exclusive with EntryNodes.

  If options->UseBridges OR options->EntryNodes:
    - guards = populate_live_entry_guards() - this is the "bridge flavour" of
      IS_SUITABLE as mentioned before.
    - return node_sl_choose_by_bandwidth(guards, WEIGHT_FOR_GUARD)
  This is "choose a guard from S by bandwidth weight".

  UseBridges and Bridges must be set together. Bridges go to bridge_list (via
  bridge_add_from_config()), but how is it used?
  learned_bridge_descriptor() adds the bridge to the global entry_guards if
  UseBridges = True.

  We either keep the existing global entry_guards OR incorporate bridges in the
  proposal (remove non bridges from USED_GUARDS, and REMAINING_GUARDS = bridges?)

  If UseBridges is set as true, we need to fill the SAMPLED_GUARDS
  with bridges specified and learned from consensus.

§F. Appendix: EntryNodes configuration

  This is mutually exclusive with Bridges.

  The global entry_guards will be updated with entries in EntryNodes
  (see entry_guards_set_from_config()).

  If EntryNodes is set, we need to fill the SAMPLED_GUARDS with
  EntryNodes specified in options.

§G. Appendix: IS_BAD helper function

  A guard is considered bad if it is not included in the newest
  consensus.

§H. Appendix: IS_DEAD helper function

  A guard is considered dead if it has been marked as bad for the
  ENTRY_GUARD_REMOVE_AFTER period (30 days), unless it has been disabled
  because of path bias issues (path_bias_disabled).

§I. Appendix: IS_OBSOLETE helper function

  A guard is considered obsolete if it was chosen by a Tor
  version we can't recognize or it was chosen more than GUARD_LIFETIME ago.
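
  The three helpers IS_BAD, IS_DEAD and IS_OBSOLETE can be sketched
  together. The record type and field names below are hypothetical, times
  are plain seconds, and GUARD_LIFETIME is passed in as a parameter because
  its value is not stated in this appendix.

```python
from dataclasses import dataclass
from typing import Optional

ENTRY_GUARD_REMOVE_AFTER = 30 * 24 * 60 * 60  # 30 days, in seconds


@dataclass
class GuardRecord:
    """Illustrative bookkeeping for one guard; field names are assumptions."""
    in_latest_consensus: bool
    chosen_on: float
    bad_since: Optional[float] = None     # when the guard was first marked bad
    chosen_by_known_version: bool = True
    path_bias_disabled: bool = False


def is_bad(guard: GuardRecord) -> bool:
    return not guard.in_latest_consensus


def is_dead(guard: GuardRecord, now: float) -> bool:
    # Guards disabled for path bias reasons are exempt from removal.
    if guard.path_bias_disabled:
        return False
    return (guard.bad_since is not None
            and now - guard.bad_since >= ENTRY_GUARD_REMOVE_AFTER)


def is_obsolete(guard: GuardRecord, now: float, guard_lifetime: float) -> bool:
    return (not guard.chosen_by_known_version
            or now - guard.chosen_on > guard_lifetime)
```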

-*- coding: utf-8 -*-
Filename: 269-hybrid-handshake.txt
Title: Transitionally secure hybrid handshakes
Author: John Schanck, William Whyte, Zhenfei Zhang,
        Nick Mathewson, Isis Lovecruft, Peter Schwabe
Created: 7 June 2016
Updated: 2 Sept 2016
Status: Needs-Revision


1. Introduction

  This document describes a generic method for integrating a post-quantum key
  encapsulation mechanism (KEM) into an ntor-like handshake.  A full discussion
  of the protocol and its proof of security may be found in [SWZ16].

  1.1 Motivation: Transitional forward-secret key agreement

    All currently deployed forward-secret key agreement protocols are
    vulnerable to quantum cryptanalysis. The obvious countermeasure is to
    switch to a key agreement mechanism that uses post-quantum primitives for
    both authentication and confidentiality.

    This option should be explored, but providing post-quantum router
    authentication in Tor would require a new consensus method and new
    microdescriptor elements. Since post-quantum public keys and signatures can
    be quite large, this may be a very expensive modification.

    In the near future it will suffice to use a "transitional" key agreement
    protocol -- one that provides pre-quantum authentication and post-quantum
    confidentiality. Such a protocol is secure in the transition between pre-
    and post-quantum settings and provides forward secrecy against adversaries
    who gain quantum computing capabilities after session negotiation.

  1.2 Motivation: Fail-safe plug & play for post-quantum KEMs

    We propose a modular design that allows any post-quantum KEM to be included
    in the handshake. As there may be some uncertainty as to the security of
    the currently available post-quantum KEMs, and their implementations, we
    ensure that the scheme safely degrades to ntor in the event of a complete
    break on the KEM.


2. Proposal

  2.1 Overview

    We re-use the public key infrastructure currently used by ntor.  Each
    server publishes a static Diffie-Hellman (DH) onion key. Each client is
    assumed to have a certified copy of each server's public onion key and each
    server's "identity digest". To establish a session key, we propose that the
    client send two ephemeral public keys to the server. The first is an
    ephemeral DH key, the second is an ephemeral public key for a post-quantum
    KEM. The server responds with an ephemeral DH public key and an
    encapsulation of a random secret under the client's ephemeral KEM key.  The
    two parties then derive a shared secret from: 1) the static-ephemeral DH
    share, 2) the ephemeral-ephemeral DH share, 3) the encapsulated secret, 4)
    the transcript of their communication.

  2.2 Notation

    Public, non-secret, values are denoted in UPPER CASE.
    Private, secret, values are denoted in lower case.
    We use multiplicative notation for Diffie-Hellman operations.

  2.3 Parameters

    DH                           A Diffie-Hellman primitive
    KEM                          A post-quantum key encapsulation mechanism
    H                            A cryptographic hash function

    LAMBDA            (bits)     Pre-quantum bit security parameter
    MU                (bits)     2*LAMBDA
    KEY_LEN           (bits)     Length of session key material to output

    H_LEN             (bytes)    Length of output of H
    ID_LEN            (bytes)    Length of server identity digest
    DH_LEN            (bytes)    Length of DH public key
    KEM_PK_LEN        (bytes)    Length of KEM public key
    KEM_C_LEN         (bytes)    Length of KEM ciphertext

    PROTOID           (string)   "hybrid-[DH]-[KEM]-[H]-[revision]"
    T_KEY             (string)   PROTOID | ":key"
    T_AUTH            (string)   PROTOID | ":auth"

    Note: [DH], [KEM], and [H] are strings that uniquely identify
          the primitive, e.g. "x25519"
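
    As a small illustration of how the three strings are assembled, the
    helper below (a hypothetical name, not part of the proposal) builds
    PROTOID, T_KEY and T_AUTH from the primitive identifiers. The example
    values are the ones used by the NTRUEncrypt instantiation in A1.

```python
from typing import Dict


def make_context_strings(dh: str, kem: str, h: str, revision: int) -> Dict[str, str]:
    """Build PROTOID and the two derived context strings."""
    protoid = f"hybrid-{dh}-{kem}-{h}-{revision}"
    return {
        "PROTOID": protoid,
        "T_KEY": protoid + ":key",
        "T_AUTH": protoid + ":auth",
    }


strings = make_context_strings("x25519", "ees443ep2", "shake128", 1)
print(strings["PROTOID"])
```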

  2.4 Subroutines

    HMAC(key, msg):
      The pseudorandom function defined in [RFC2104] with H
      as the underlying hash function.

    EXTRACT(salt, secret):
      A randomness extractor with output of length >= MU bits.

      For most choices of H one should use the HMAC based
      randomness extractor defined in [RFC5869]:
        EXTRACT(salt, secret) := HMAC(salt, secret).

      If MU = 256 and H is SHAKE-128 with MU bit output, or
      if MU = 512 and H is SHAKE-256 with MU bit output, then
      one may instead define:
        EXTRACT(salt, secret) := H(salt | secret).

    EXPAND(seed, context, len):
      The HMAC based key expansion function defined in [RFC5869].
      Outputs the first len bits of
        K = K_1 | K_2 | K_3 | ...
      where
        K_0 = empty string (zero bits)
        K_i = HMAC(seed, K_(i-1) | context | INT8(i)).

      Alternatively, an eXtendable Output Function (XOF) may be used.
      In which case,
      EXPAND(seed, context, len) = XOF(seed | context, len)

    DH_GEN() -> (x, X):
      Diffie-Hellman keypair generation. Secret key x, public key X.

    DH_MUL(P,x) -> xP:
      Scalar multiplication in the DH group of the base point P by
      the scalar x.

    KEM_GEN() -> (sk, PK):
      Key generation for KEM.

    KEM_ENC(PK) -> (m, C):
      Encapsulation, C, of a uniform random message m under public key PK.

    KEM_DEC(C, sk):
      Decapsulation of the ciphertext C with respect to the secret key sk.

    KEYID(A) -> A or H(A):
      For DH groups with long element representations it may be desirable to
      identify a key by its hash. For typical elliptic curve groups this should
      be the identity map.
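
    The HMAC-based definitions of EXTRACT and EXPAND above can be sketched
    with the Python standard library. We assume H = SHA-256 here purely for
    illustration; the proposal leaves H as a parameter. Lengths are given in
    bits, as in the definition of EXPAND, and the sketch only accepts whole
    bytes.

```python
import hashlib
import hmac


def HMAC(key: bytes, msg: bytes) -> bytes:
    # Assumption: H is SHA-256 for the purposes of this sketch.
    return hmac.new(key, msg, hashlib.sha256).digest()


def EXTRACT(salt: bytes, secret: bytes) -> bytes:
    return HMAC(salt, secret)


def EXPAND(seed: bytes, context: bytes, length_bits: int) -> bytes:
    if length_bits % 8 != 0:
        raise ValueError("this sketch only handles whole bytes")
    needed = length_bits // 8
    if needed > 255 * hashlib.sha256().digest_size:
        raise ValueError("RFC 5869 allows at most 255 output blocks")
    output = b""
    block = b""      # K_0 is the empty string
    counter = 1
    while len(output) < needed:
        # K_i = HMAC(seed, K_(i-1) | context | INT8(i))
        block = HMAC(seed, block + context + bytes([counter]))
        output += block
        counter += 1
    return output[:needed]


seed = EXTRACT(b"example salt", b"example secret")
short = EXPAND(seed, b"example context", 256)
longer = EXPAND(seed, b"example context", 512)
```

    Because the output is a prefix of K_1 | K_2 | ..., asking for more bits
    never changes the bits already produced.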

  2.5 Handshake

    To perform the handshake, the client needs to know the identity digest and
    an onion key for the router. The onion key must be for the specified DH
    scheme (e.g. x25519). Call the router's identity digest "ID" and its public
    onion key "A". The following Client Init / Server Response / Client Finish
    sequence defines the hybrid-DH-KEM protocol. See Fig. 1 for a schematic
    depiction of the same operations.

    - Client Init ------------------------------------------------------------

    The client generates ephemeral key pairs:
      x, X            := DH_GEN()
      esk, EPK        := KEM_GEN()

    The client sends a CREATE cell with contents:
      ID              [ID_LEN       bytes]
      KEYID(A)        [H_LEN        bytes]
      X               [DH_LEN       bytes]
      EPK             [KEM_PK_LEN   bytes]

    - Server Response --------------------------------------------------------

    The server generates an ephemeral DH keypair:
      y, Y            := DH_GEN()

    The server computes the three secret shares:
      s0              := H(DH_MUL(X,a))
      s1              := DH_MUL(X,y)
      s2, C           := KEM_ENC(EPK)

    The server extracts the seed:
      SALT            := ID | A | X | EPK
      secret          := s0 | s1 | s2
      seed            := EXTRACT(SALT, secret)

    The server derives the authentication tag:
      verify          := EXPAND(seed, T_AUTH, MU)
      TRANSCRIPT      := ID | A | X | EPK | Y | C | PROTOID
      AUTH            := HMAC(verify, TRANSCRIPT)

    The server sends a CREATED cell with contents:
      Y               [DH_LEN     bytes]
      C               [KEM_C_LEN  bytes]
      AUTH            [CEIL(MU/8) bytes]

    - Client Finish ----------------------------------------------------------

    The client computes the three secret shares:
      s0              := H(DH_MUL(A,x))
      s1              := DH_MUL(Y,x)
      s2              := KEM_DEC(C, esk)

    The client then derives the seed:
      SALT            := ID | A | X | EPK
      secret          := s0 | s1 | s2
      seed            := EXTRACT(SALT, secret)

    The client derives the authentication tag:
      verify          := EXPAND(seed, T_AUTH, MU)
      TRANSCRIPT      := ID | A | X | EPK | Y | C | PROTOID
      AUTH            := HMAC(verify, TRANSCRIPT)

    The client verifies that AUTH matches the tag received from the server.

    If the authentication check fails the client aborts the session.

    - Key derivation ---------------------------------------------------------

    Both parties derive the shared key from the seed:
      key             := EXPAND(seed, T_KEY, KEY_LEN).
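
    The message flow above can be exercised end to end with toy primitives.
    Everything in the sketch below that is not named in this section is an
    assumption made for illustration: DH is modelled as exponentiation
    modulo the prime 2^255 - 19 (this is NOT x25519 and not a secure group
    choice), the "KEM" is built from that same toy DH and is therefore not
    post-quantum, H is SHA-256, and the PROTOID string is made up.

```python
import hashlib
import hmac
import secrets

P, G = 2**255 - 19, 2                       # toy group, illustration only
PROTOID = b"hybrid-toydh-toykem-sha256-0"   # hypothetical identifier
T_KEY, T_AUTH = PROTOID + b":key", PROTOID + b":auth"
MU, KEY_LEN = 256, 256                      # bits


def H(data):
    return hashlib.sha256(data).digest()

def HMAC(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()

def ENC(n):
    return n.to_bytes(32, "big")            # fixed-width encoding of group elements

def EXTRACT(salt, secret):
    return HMAC(salt, secret)

def EXPAND(seed, context, nbits):
    out, block, i = b"", b"", 1
    while len(out) * 8 < nbits:
        block = HMAC(seed, block + context + bytes([i]))
        out, i = out + block, i + 1
    return out[: nbits // 8]

def DH_GEN():
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)

def DH_MUL(point, scalar):
    return ENC(pow(point, scalar, P))

def KEM_GEN():
    return DH_GEN()

def KEM_ENC(pk):
    r, c = DH_GEN()
    return H(DH_MUL(pk, r)), c

def KEM_DEC(c, sk):
    return H(DH_MUL(c, sk))

def derive(ident, A, X, EPK, Y, C, s0, s1, s2):
    salt = ident + ENC(A) + ENC(X) + ENC(EPK)            # ID | A | X | EPK
    seed = EXTRACT(salt, s0 + s1 + s2)
    verify = EXPAND(seed, T_AUTH, MU)
    transcript = salt + ENC(Y) + ENC(C) + PROTOID        # ... | Y | C | PROTOID
    return HMAC(verify, transcript), EXPAND(seed, T_KEY, KEY_LEN)

def server_response(ident, a, A, X, EPK):
    y, Y = DH_GEN()
    s2, C = KEM_ENC(EPK)
    auth, key = derive(ident, A, X, EPK, Y, C,
                       H(DH_MUL(X, a)), DH_MUL(X, y), s2)
    return (Y, C, auth), key

def client_finish(ident, A, x, X, esk, EPK, created):
    Y, C, auth = created
    expected, key = derive(ident, A, X, EPK, Y, C,
                           H(DH_MUL(A, x)), DH_MUL(Y, x), KEM_DEC(C, esk))
    if not hmac.compare_digest(auth, expected):
        raise ValueError("authentication check failed; abort the session")
    return key


ident = bytes(20)                  # placeholder identity digest
a, A = DH_GEN()                    # server's static onion key
x, X = DH_GEN()                    # Client Init
esk, EPK = KEM_GEN()
created, server_key = server_response(ident, a, A, X, EPK)
client_key = client_finish(ident, A, x, X, esk, EPK, created)
```

    Both sides end up with the same key only if all three shares and the
    full transcript agree, which is what the AUTH check enforces.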

  .--------------------------------------------------------------------------.
  | Fig. 1: The hybrid-DH-KEM handshake.                                     |
  .--------------------------------------------------------------------------.
  |                                                                          |
  | Initiator                             Responder with identity key ID     |
  | ---------                             --------- and onion key A          |
  |                                                                          |
  | x, X         := DH_GEN()                                                 |
  | esk, EPK     := KEM_GEN()                                                |
  | CREATE_DATA  := ID | A | X | EPK                                         |
  |                                                                          |
  |               --- CREATE_DATA --->                                       |
  |                                                                          |
  |                       y, Y         := DH_GEN()                           |
  |                       s0           := H(DH_MUL(X,a))                     |
  |                       s1           := DH_MUL(X,y)                        |
  |                       s2, C        := KEM_ENC(EPK)                       |
  |                       SALT         := ID | A | X | EPK                   |
  |                       secret       := s0 | s1 | s2                       |
  |                       seed         := EXTRACT(SALT, secret)              |
  |                       verify       := EXPAND(seed, T_AUTH, MU)           |
  |                       TRANSCRIPT   := ID | A | X | EPK | Y | C | PROTOID |
  |                       AUTH         := HMAC(verify, TRANSCRIPT)           |
  |                       key          := EXPAND(seed, T_KEY, KEY_LEN)       |
  |                       CREATED_DATA := Y | C | AUTH                       |
  |                                                                          |
  |               <-- CREATED_DATA ---                                       |
  |                                                                          |
  | s0           := H(DH_MUL(A,x))                                           |
  | s1           := DH_MUL(Y,x)                                              |
  | s2           := KEM_DEC(C, esk)                                          |
  | SALT         := ID | A | X | EPK                                         |
  | secret       := s0 | s1 | s2                                             |
  | seed         := EXTRACT(SALT, secret)                                    |
  | verify       := EXPAND(seed, T_AUTH, MU)                                 |
  | TRANSCRIPT   := ID | A | X | EPK | Y | C | PROTOID                       |
  |                                                                          |
  | assert AUTH == HMAC(verify, TRANSCRIPT)                                  |
  | key := EXPAND(seed, T_KEY, KEY_LEN)                                      |
  '--------------------------------------------------------------------------'


3. Changes from ntor

  The hybrid-null handshake differs from ntor in a few ways.

  First there are some superficial differences.
  The protocol IDs differ:
    ntor                PROTOID         "ntor-curve25519-sha256-1",
    hybrid-null         PROTOID         "hybrid-x25519-null-sha256-1",
  and the context strings differ:
    ntor                T_MAC           PROTOID | ":mac",
    ntor                T_KEY           PROTOID | ":key_extract",
    ntor                T_VERIFY        PROTOID | ":verify",
    ntor                M_EXPAND        PROTOID | ":key_expand",
    hybrid-null         T_KEY           PROTOID | ":key",
    hybrid-null         T_AUTH          PROTOID | ":auth".

  Then there are significant differences in how the authentication tag
  (AUTH) and key (key) are derived. The following description uses the
  HMAC based definitions of EXTRACT and EXPAND.

  In ntor the server computes
    secret_input        := EXP(X,y) | EXP(X,a) | ID | A | X | Y | PROTOID
    seed                := HMAC(T_KEY, secret_input)
    verify              := HMAC(T_VERIFY, seed)
    auth_input          := verify | ID | A | Y | X | PROTOID | "Server"
    AUTH                := HMAC(T_MAC, auth_input)
    key                 := EXPAND(seed, M_EXPAND, KEY_LEN)

  In hybrid-null the server computes
    SALT                := ID | A | X
    secret_input        := H(EXP(X,a)) | EXP(X,y)
    seed                := EXTRACT(SALT, secret_input)
    verify              := EXPAND(seed, T_AUTH, MU)
    TRANSCRIPT          := ID | A | X | Y | PROTOID
    AUTH                := HMAC(verify, TRANSCRIPT)
    key                 := EXPAND(seed, T_KEY, KEY_LEN)

  First, note that hybrid-null hashes EXP(X,a). This is due to
  the fact that weaker assumptions were used to prove the security
  of hybrid-null than were used to prove the security of ntor. While
  this may seem artificial we recommend keeping it.

  Second, ntor uses fixed HMAC keys for all sessions. This is unlikely
  to be a real-world security issue, but it requires stronger assumptions
  about HMAC than if the order of the arguments were reversed.

  Finally, ntor uses a mixture of public and secret data in auth_input,
  whereas the equivalent term in hybrid-null is the public transcript.



4. Versions

  [XXX rewrite section w/ new versioning proposal]

  Recognized handshake types are:
    0x0000  TAP         --  the original Tor handshake;
    0x0001  reserved
    0x0002  ntor        --  the ntor-x25519-sha256 handshake;

  Request for new handshake types:
    0x010X  hybrid-XX   --  a hybrid of a x25519 handshake
                            and a post-quantum key encapsulation mechanism

  where
    0x0101  hybrid-null      -- No post-quantum key encapsulation mechanism.

    0x0102  hybrid-ees443ep2 -- Using NTRUEncrypt parameter set EES443EP2

    0x0103  hybrid-newhope   -- Using the New Hope R-LWE scheme

        DEPENDENCY:
          Proposal 249: Allow CREATE cells with >505 bytes of handshake data



5. Bibliography

[SWZ16]   Schanck, J., Whyte, W., and Z. Zhang, "Circuit extension handshakes
          for Tor achieving forward secrecy in a quantum world", PETS 2016,
          DOI 10.1515/popets-2016-0037, June 2016.
[RFC2104] Krawczyk, H., Bellare, M., and R. Canetti,
          "HMAC: Keyed-Hashing for Message Authentication",
          RFC 2104, DOI 10.17487/RFC2104, February 1997
[RFC5869] Krawczyk, H. and P. Eronen,
          "HMAC-based Extract-and-Expand Key Derivation Function (HKDF)",
          RFC 5869, DOI 10.17487/RFC5869, May 2010


A1. Instantiation with NTRUEncrypt

  This example uses the NTRU parameter set EES443EP2 [XXX cite] which is
  estimated at the 128-bit security level for both pre- and post-quantum
  settings.

  EES443EP2 specifies three algorithms:
    EES443EP2_GEN()          -> (sk, PK),
    EES443EP2_ENCRYPT(m, PK) -> C,
    EES443EP2_DECRYPT(C, sk) -> m.

  The m parameter for EES443EP2_ENCRYPT can be at most 49 bytes.
  We define EES443EP2_MAX_M_LEN := 49.

  0x0102  hybrid-x25519-ees443ep2-shake128-1
  --------------------
    DH            := x25519
    KEM           := EES443EP2
    H             := SHAKE-128 with 256 bit output

    LAMBDA        := 128
    MU            := 256

    H_LEN         := 32
    ID_LEN        := 20
    DH_LEN        := 32
    KEM_PK_LEN    := 615
    KEM_C_LEN     := 610
    KEY_LEN       := XXX

    PROTOID       := "hybrid-x25519-ees443ep2-shake128-1"
    T_KEY         := "hybrid-x25519-ees443ep2-shake128-1:key"
    T_AUTH        := "hybrid-x25519-ees443ep2-shake128-1:auth"

    Subroutines
    -----------
      HMAC(key, message)         := SHAKE-128(key | message, MU)
      EXTRACT(salt, secret)      := SHAKE-128(salt | secret, MU)
      EXPAND(seed, context, len) := SHAKE-128(seed | context, len)
      KEM_GEN()                  := EES443EP2_GEN()
      KEM_ENC(PK)                := (s, C)
                                    where s = RANDOMBYTES(EES443EP2_MAX_M_LEN)
                                      and C = EES443EP2_ENCRYPT(s, PK)
      KEM_DEC(C, sk)             := EES443EP2_DECRYPT(C, sk)
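
  The three SHAKE-128 based subroutines above map directly onto Python's
  hashlib. This is a minimal sketch: lengths are in bits as in the
  parameter table, only whole bytes are handled, and the helper name
  shake128 is ours. The KEM subroutines are omitted because they need an
  NTRUEncrypt implementation.

```python
import hashlib

MU = 256  # bits


def shake128(data: bytes, out_bits: int) -> bytes:
    """SHAKE-128 with the output length given in bits (whole bytes only)."""
    if out_bits % 8 != 0:
        raise ValueError("this sketch only handles whole bytes")
    return hashlib.shake_128(data).digest(out_bits // 8)


def HMAC(key: bytes, message: bytes) -> bytes:
    return shake128(key + message, MU)


def EXTRACT(salt: bytes, secret: bytes) -> bytes:
    return shake128(salt + secret, MU)


def EXPAND(seed: bytes, context: bytes, length_bits: int) -> bytes:
    return shake128(seed + context, length_bits)


tag = HMAC(b"example key", b"example message")
okm = EXPAND(b"example seed", b"example context", 512)
```

  Note that with these definitions HMAC and EXTRACT compute the same
  function of their concatenated inputs; they differ only in how the
  handshake uses them.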


A2. Instantiation with NewHope

  [XXX write intro]

  0x0103  hybrid-x25519-newhope-shake128-1
  --------------------
    DH            := x25519
    KEM           := NEWHOPE
    H             := SHAKE-128 with 256 bit output

    LAMBDA        := 128
    MU            := 256

    H_LEN         := 32
    ID_LEN        := 20
    DH_LEN        := 32
    KEM_PK_LEN    := 1824
    KEM_C_LEN     := 2048
    KEY_LEN       := XXX

    PROTOID       := "hybrid-x25519-newhope-shake128-1"
    T_KEY         := "hybrid-x25519-newhope-shake128-1:key"
    T_AUTH        := "hybrid-x25519-newhope-shake128-1:auth"

    Subroutines
    -----------
      HMAC(key, message)         := SHAKE-128(key | message, MU)
      EXTRACT(salt, secret)      := SHAKE-128(salt | secret, MU)
      EXPAND(seed, context, len) := SHAKE-128(seed | context, len)
      KEM_GEN()                  := (sk, PK)
                                    where A_SEED := RANDOMBYTES(MU/8)
                                          (sk,B) := NEWHOPE_KEYGEN(A_SEED)
                                          PK     := B | A_SEED
      KEM_ENC(PK)                := NEWHOPE_ENCAPS(PK)
      KEM_DEC(C, sk)             := NEWHOPE_DECAPS(C, sk)
Filename: 270-newhope-hybrid-handshake.txt
Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
Author: Isis Lovecruft, Peter Schwabe
Created: 16 Apr 2016
Updated: 22 Jul 2016
Status: Obsolete
Depends: prop#220 prop#249 prop#264 prop#270

§0. Introduction

  RebelAlliance is a post-quantum secure hybrid handshake, comprised of an
  alliance between the X25519 and NewHope key exchanges.

  NewHope is a post-quantum-secure lattice-based key-exchange protocol based
  on the ring-learning-with-errors (Ring-LWE) problem.  We propose a hybrid
  handshake for Tor, based on a combination of Tor's current NTor handshake
  and a shared key derived through a NewHope ephemeral key exchange.

  For further details on the NewHope key exchange, the reader is referred to
  "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and
  Schwabe [0][1].

  For the purposes of brevity, we consider that NTor is currently the only
  handshake protocol in Tor; the older TAP protocol is ignored completely, due
  to the fact that it is currently deprecated and nearly entirely unused.


§1. Motivation

  An attacker currently monitoring and storing circuit-layer NTor handshakes
  who later has the ability to run Shor's algorithm on a quantum computer will
  be able to break Tor's current handshake protocol and decrypt previous
  communications.

  It is unclear if and when attackers equipped with large quantum
  computers will exist, but researchers in quantum physics and quantum
  engineering have estimated that this could happen within only 1 to 2 decades.
  Clearly, the security requirements of many Tor users include secrecy of
  their messages beyond this time span, which means that Tor needs to update
  the key exchange to protect against such attackers as soon as possible.


§2. Design

  An initiator and responder, in parallel, conduct two handshakes:

  - An X25519 key exchange, as described in the description of the NTor
    handshake in Tor proposal #216.
  - A NewHope key exchange.

  The shared keys derived from these two handshakes are then concatenated and
  used as input to the SHAKE-256 extendable output function (XOF), as described
  in FIPS-PUB-202 [2], in order to produce a shared key of the desired length.
  The test vectors in §C assume that this key has a length of 32 bytes, but the
  use of an XOF allows arbitrary lengths to easily support future updates of
  the symmetric primitives using the key. See also §3.3.1.
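
  The combination step can be sketched with Python's hashlib. The function
  name is ours and the two input keys below are random placeholders; only
  the concatenate-then-SHAKE-256 structure is taken from the text.

```python
import hashlib
import secrets


def combine_shared_keys(ntor_key: bytes, newhope_key: bytes,
                        out_len: int = 32) -> bytes:
    """Derive out_len bytes from the two concatenated shared keys."""
    return hashlib.shake_256(ntor_key + newhope_key).digest(out_len)


ntor_key = secrets.token_bytes(32)      # placeholder for the X25519/NTor output
newhope_key = secrets.token_bytes(32)   # placeholder for the NewHope output
sk = combine_shared_keys(ntor_key, newhope_key)
sk_long = combine_shared_keys(ntor_key, newhope_key, out_len=64)
```

  Since SHAKE-256 is an XOF, a longer request extends the shorter output
  rather than replacing it.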


§3. Specification

§3.1. Notation

  Let `a || b` be the concatenation of a with b.

  Let `a^b` denote the exponentiation of a to the bth power.

  Let `a == b` denote the equality of a with b, and vice versa.

  Let `a := b` be the assignment of the value of b to the variable a.

  Let `H(x)` be 32 bytes of output of the SHAKE-256 XOF (as described in
  FIPS-PUB-202) applied to message x.

  Let X25519 refer to the curve25519-based key agreement protocol described
  in RFC7748 §6.1. [3]

  Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do
  the appropriate manipulations when generating the secret key (clearing the
  low bits, twiddling the high bits).  Additionally, EXP() MUST include the
  check for all-zero output due to the input point being of small
  order (cf. RFC7748 §6).

  Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.

  When representing an element of the Curve25519 subgroup as a byte string,
  use the standard (32-byte, little-endian, x-coordinate-only) representation
  for Curve25519 points.

  Let `ID` be a router's identity key taken from the router microdescriptor.
  In the case for relays possessing Ed25519 identity keys (cf. Tor proposal
  #220), this is a 32-byte string representing the public Ed25519 identity key.
  For backwards and forwards compatibility with routers which do not possess
  Ed25519 identity keys, this is a 32-byte string created via the output of
  H(ID).

  We refer to the router as the handshake "responder", and the client (which
  may be an OR or an OP) as the "initiator".


  ID_LENGTH      [32 bytes]
  H_LENGTH       [32 bytes]
  G_LENGTH       [32 bytes]

  PROTOID  :=    "pqtor-x25519-newhope-shake256-1"
  T_MAC    :=    PROTOID || ":mac"
  T_KEY    :=    PROTOID || ":key_extract"
  T_VERIFY :=    PROTOID || ":verify"

  (X25519_SK, X25519_PK) := X25519_KEYGEN()


§3.2. Protocol

 ========================================================================================
 |                                                                                      |
 | Fig. 1: The NewHope-X25519 Hybrid Handshake.                                         |
 |                                                                                      |
 | Before the handshake the Initiator is assumed to know Z, a public X25519 key for     |
 | the Responder, as well as the Responder's ID.                                        |
 ----------------------------------------------------------------------------------------
 |                                                                                      |
 | Initiator                             Responder                                      |
 |                                                                                      |
 | SEED         := H(randombytes(32))                                                   |
 | x, X         := X25519_KEYGEN()                                                      |
 | a, A         := NEWHOPE_KEYGEN(SEED)                                                 |
 | CLIENT_HDATA := ID || Z || X || A                                                    |
 |                                                                                      |
 |               --- CLIENT_HDATA --->                                                  |
 |                                                                                      |
 |                                       y, Y           := X25519_KEYGEN()              |
 |                                       NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) |
 |                                       M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A)           |
 |                                       SERVER_HDATA   := Y || AUTH || M               |
 |                                       sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY)       |
 |                                                                                      |
 |               <-- SERVER_HDATA ----                                                  |
 |                                                                                      |
 | NTOR_KEY    := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)                                    |
 | NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a)                                                 |
 | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY)                                             |
 |                                                                                      |
 ========================================================================================


§3.2.1. The NTor Handshake

§3.2.1.1. Prologue

  Take a router with identity ID. As setup, the router generates a secret key z,
  and a public onion key Z with:

    z, Z := X25519_KEYGEN()

  The router publishes Z in its server descriptor in the "ntor-onion-key" entry.
  Henceforward, we refer to this router as the "responder".


§3.2.1.2. Initiator

  To send a create cell, the initiator generates a keypair:

    x, X := X25519_KEYGEN()

  and creates the NTor portion of a CREATE2V cell's HDATA section:

    CLIENT_NTOR    := ID || Z || X                   [96 bytes]

  The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in
  the event the responder OR has recently rotated keys, the responder can
  determine which keypair to use.

  The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2),
  to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1)
  to the responder.

    CLIENT_NEWHOPE                                   [1824 bytes]  (see §3.2.2)
    CLIENT_HDATA   := CLIENT_NTOR || CLIENT_NEWHOPE  [1920 bytes]

  If the responder does not respond with a CREATED2V cell, the initiator SHOULD
  NOT attempt to extend the circuit through the responder by sending fragmented
  EXTEND2 cells, since the responder's lack of support for CREATE2V cells is
  assumed to imply the responder also lacks support for fragmented EXTEND2
  cells.  Alternatively, for initiators with a sufficiently late consensus
  method, the initiator MUST check that the "proto" line in the responder's
  descriptor (cf. Tor proposal #264) advertises support for the "Relay"
  subprotocol version 3 (see §5).


§3.2.1.3. Responder

  The responder generates a keypair of y, Y = X25519_KEYGEN(), and does
  NTOR_SHAREDB() as follows:

  (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
    secret_input := EXP(X, y) || EXP(X, z) || ID || Z || X || Y || PROTOID
    NTOR_KEY     := H(secret_input, T_KEY)
    verify       := H(secret_input, T_VERIFY)
    auth_input   := verify || ID || Z || Y || X || PROTOID || "Server"
    AUTH         := H(auth_input, T_MAC)

  The responder creates a CREATED2V cell containing:

    SERVER_NTOR    := Y || AUTH                      [64 bytes]
    SERVER_NEWHOPE                                   [2048 bytes]  (see §3.2.2)
    SERVER_HDATA   := SERVER_NTOR || SERVER_NEWHOPE  [2112 bytes]

  and sends this to the initiator.


§3.2.1.4. Finalisation

  The initiator then checks Y is in G^* [see NOTE below], and does
  NTOR_SHAREDA() as follows:

  (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
    secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
    NTOR_KEY     := H(secret_input, T_KEY)
    verify       := H(secret_input, T_VERIFY)
    auth_input   := verify || ID || Z || Y || X || PROTOID || "Server"
    if AUTH == H(auth_input, T_MAC)
       return NTOR_KEY

  Both parties now have a shared value for NTOR_KEY.  They expand this into
  the keys needed for the Tor relay protocol.

  [XXX We think we want to omit the final hashing in the production of NTOR_KEY
  here, and instead put all the inputs through SHAKE-256. --isis, peter]

  [XXX We probably want to remove ID and B from the input to the shared key
  material, since they serve for authentication but, as pre-established
  "prologue" material to the handshake, they should not be used in attempts to
  strengthen the cryptographic suitability of the shared key.  Also, their
  inclusion is implicit in the DH exponentiations.  I should probably ask Ian
  about the reasoning for the original design choice.  --isis]


§3.2.2. The NewHope Handshake

§3.2.2.1. Parameters & Mathematical Structures

  Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient
  ring ℤ/qℤ.  We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials
  modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials
  modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to
  a polynomial, we mean an element of Rq.

    n := 1024
    q := 12289

    SEED         [32 Bytes]
    NEWHOPE_POLY [1792 Bytes]
    NEWHOPE_REC  [256 Bytes]
    NEWHOPE_KEY  [32 Bytes]

    NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
    NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)
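
  The sizes above follow from n and q.  The short Python sketch below
  re-derives them, assuming (as described in §B) that each coefficient is
  packed into 14 bits, that each byte of reconciliation data covers four
  coefficients, and that each key bit is derived from four coefficients.
  The constant names simply mirror the ones used in this section.

```python
N = 1024
Q = 12289

COEFF_BITS = (Q - 1).bit_length()        # bits needed for a value in [0, q-1]
NEWHOPE_POLY_LEN = N * COEFF_BITS // 8   # packed polynomial, in bytes
NEWHOPE_REC_LEN = N // 4                 # one byte per four coefficients
NEWHOPE_KEY_LEN = (N // 4) // 8          # one key bit per four coefficients
SEED_LEN = 32

NEWHOPE_MSGA_LEN = NEWHOPE_POLY_LEN + SEED_LEN
NEWHOPE_MSGB_LEN = NEWHOPE_POLY_LEN + NEWHOPE_REC_LEN
```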


§3.2.2.2. High-level Description of NewHope API Functions

  For a description of internal functions, see §B.

    (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
        â    := gen_a(seed)
        s    := poly_getnoise()
        e    := poly_getnoise()
        ŝ    := poly_ntt(s)
        ê    := poly_ntt(e)
        b̂    := poly_pointwise(â, ŝ) + ê
        sp   := poly_tobytes(ŝ)
        bp   := poly_tobytes(b̂)
        return (sp, (bp || seed))

    (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
        s'   := poly_getnoise()
        e'   := poly_getnoise()
        e"   := poly_getnoise()
        b̂    := poly_frombytes(bp)
        â    := gen_a(seed)
        ŝ'   := poly_ntt(s')
        ê'   := poly_ntt(e')
        û    := poly_pointwise(â, ŝ') + ê'
        v    := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
        r    := helprec(v)
        up   := poly_tobytes(û)
        k    := rec(v, r)
        return ((up || r), k)

    NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
        û    := poly_frombytes(up)
        ŝ    := poly_frombytes(sp)
        v'   := poly_invntt(poly_pointwise(û, ŝ))
        k    := rec(v', r)
        return k

  When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use
  that SEED in any other CREATE2V or EXTEND2 cells.  See §4 for further
  discussion.


§3.3. Key Expansion

  The client and server derive a shared key, SHARED, by:

    HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
    SHARED := SHAKE_256(HKDFID || NTOR_KEY || NEWHOPE_KEY)
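
  A minimal Python sketch of this derivation, using the SHAKE-256
  implementation in the standard hashlib module.  The function name and the
  out_len parameter are illustrative; the 32-byte default corresponds to the
  256-bit output discussed in §3.3.1.

```python
import hashlib

HKDFID = b"THESE ARENT THE DROIDS YOURE LOOKING FOR"


def derive_shared(ntor_key, newhope_key, out_len=32):
    """Return out_len bytes of SHAKE-256 over HKDFID and both handshake keys."""
    if len(ntor_key) != 32 or len(newhope_key) != 32:
        raise ValueError("both input keys are expected to be 32 bytes")
    return hashlib.shake_256(HKDFID + ntor_key + newhope_key).digest(out_len)
```

  Because SHAKE-256 is an extendable-output function, asking for a longer
  output yields a string whose first 32 bytes equal the 32-byte output.  An
  implementation that later derives longer secrets this way would therefore
  need to treat the shorter output as a prefix of the longer one.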


§3.3.1. Note on the Design Choice

  The reader may wonder why one would use SHAKE-256 to produce a 256-bit
  output, since the security strength in bits for SHAKE-256 is min(d/2,256)
  for collision resistance and min(d,256) for first- and second-order
  preimages, where d is the output length.

  The reasoning is that we should be aiming for 256-bit security for all of
  our symmetric cryptography.  One could then argue that we should just use
  SHA3-256 for the KDF.  We choose SHAKE-256 instead in order to provide an
  easy way to derive longer shared secrets in the future without requiring a
  new handshake.  The construction is odd, but the future is bright.
  As we are already using SHAKE-256 for the 32-byte output hash, we are also
  using it for all other 32-byte hashes involved in the protocol. Note that
  the only difference between SHA3-256 and SHAKE-256 with 32-byte output is
  one domain-separation byte.

  [XXX why would you want 256-bit security for the symmetric side? Are you
  talking pre- or post-quantum security? --peter]


§4. Security & Anonymity Implications

  This handshake protocol is one-way authenticated.  That is, the server is
  authenticated, while the client remains anonymous.

  The client MUST NOT cache and reuse SEED.  Doing so gives non-trivial
  adversarial advantages w.r.t. all-for-the-price-of-one attacks during the
  caching period.  More importantly, if the SEED used to generate NEWHOPE_MSGA
  is reused for handshakes along the same circuit or multiple different
  circuits, an adversary conducting a sybil attack somewhere along the path(s)
  will be able to correlate the identity of the client across circuits or
  hops.


§5. Compatibility

  Because our proposal requires both the client and server to send more than
  the 505 bytes possible within a CREATE2 cell's HDATA section, it depends
  upon the implementation of a mechanism for allowing larger CREATE cells
  (cf. Tor proposal #249).

  We reserve the following handshake type for use in CREATE2V/CREATED2V and
  EXTEND2V/EXTENDED2V cells:

    0x0003            [NEWHOPE + X25519 HYBRID HANDSHAKE]

  We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264
  §5.3) to signify support for this handshake, and hence for the CREATE2V and
  fragmented EXTEND2 cells which it requires.

  There are no additional entries or changes required within either router
  descriptors or microdescriptors to support this handshake method, due to the
  NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519
  public keys already being included within the "ntor-onion-key" entry.

  Add a "UseNewHopeKEX" configuration option and a corresponding consensus
  parameter to control whether clients prefer using this NewHope hybrid
  handshake or some previous handshake protocol.  If the configuration option
  is "auto", clients SHOULD obey the consensus parameter.  The default
  configuration SHOULD be "auto" and the consensus value SHOULD initially be "0".
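
  The intended interaction between the option and the consensus parameter
  can be sketched as follows.  The function name is hypothetical, and the
  sketch assumes that the option takes the values "auto", "0" and "1" and
  that the consensus parameter is an integer where 0 means "do not prefer
  this handshake"; the proposal itself only fixes the "auto" behaviour and
  the initial consensus value.

```python
def client_prefers_newhope(config_value, consensus_value):
    """Decide whether a client should prefer the hybrid handshake.

    config_value:    the UseNewHopeKEX option, assumed to be "auto", "0" or "1".
    consensus_value: the consensus parameter, assumed to be an integer.
    """
    if config_value == "auto":
        # In "auto" mode the client obeys the consensus parameter.
        return consensus_value != 0
    if config_value in ("0", "1"):
        # An explicit local setting overrides the consensus.
        return config_value == "1"
    raise ValueError(f"unrecognised UseNewHopeKEX value: {config_value!r}")
```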


§6. Implementation

  The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software
  implementations of NewHope, one C reference implementation and an optimized
  implementation using AVX2 vector instructions. Those implementations are
  available at [1].

  Additionally, there are implementations in Go by Yawning Angel, available
  from [4] and in Rust by Isis Lovecruft, available from [5].

  The software used to generate the test vectors in §C is based on the C
  reference implementation and available from:

  https://code.ciph.re/isis/newhope-tor-testvectors
  https://github.com/isislovecruft/newhope-tor-testvectors


§7. Performance & Scalability

  The computationally expensive part in the current NTor handshake is the
  X25519 key-pair generation and the X25519 shared-key computation. The
  current implementation in Tor is a wrapper to support various highly optimized
  implementations on different architectures. On Intel Haswell processors, the
  fastest implementation of X25519, as reported by the eBACS benchmarking
  project [6], takes 169920 cycles for key-pair generation and 161648 cycles
  for shared-key computation; these add up to a total of 331568 cycles on each
  side (initiator and responder).

  The C reference implementation of NewHope, also benchmarked on Intel
  Haswell, takes 358234 cycles for the initiator and 402058 cycles for the
  responder. The core computation of the proposed combination of NewHope and
  X25519 will thus mean a slowdown of about a factor of 2.1 for the initiator
  and a slowdown by a factor of 2.2 for the responder compared to the current
  NTor handshake. These numbers assume a fully optimized implementation of the
  NTor handshake and a C reference implementation of NewHope. With optimized
  implementations of NewHope, such as the one for Intel Haswell described in
  [0], the computational slowdown will be considerably smaller than a factor
  of 2.
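
  The slowdown factors quoted above can be reproduced from the cycle counts
  with the following arithmetic, written as a Python sketch.  It assumes, as
  the text does, that the cost of the hybrid handshake is the sum of the
  NTor and NewHope costs on each side; the function names are illustrative.

```python
X25519_KEYGEN_CYCLES = 169920
X25519_SHARED_CYCLES = 161648
NEWHOPE_INITIATOR_CYCLES = 358234
NEWHOPE_RESPONDER_CYCLES = 402058


def ntor_cycles():
    """Per-side cost of the X25519 operations in the current NTor handshake."""
    return X25519_KEYGEN_CYCLES + X25519_SHARED_CYCLES


def slowdown(newhope_cycles):
    """Ratio of hybrid (NTor + NewHope) cost to NTor-only cost."""
    return (ntor_cycles() + newhope_cycles) / ntor_cycles()
```

  Rounded to one decimal place, slowdown(NEWHOPE_INITIATOR_CYCLES) gives 2.1
  and slowdown(NEWHOPE_RESPONDER_CYCLES) gives 2.2.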


§8. References

[0]: https://cryptojedi.org/papers/newhope-20160328.pdf
[1]: https://cryptojedi.org/crypto/#newhope
[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061
[3]: https://tools.ietf.org/html/rfc7748#section-6.1
[4]: https://github.com/Yawning/newhope
[5]: https://code.ciph.re/isis/newhopers
[6]: http://bench.cr.yp.to


§A. Cell Formats

§A.1. CREATE2V Cells

  The client portion of the handshake should send CLIENT_HDATA, formatted
  into a CREATE2V cell as follows:

    CREATE2V {                                              [2114 bytes]
      HTYPE   := 0x0003                                     [2 bytes]
      HLEN    := 0x0780                                     [2 bytes]
      HDATA   := CLIENT_HDATA                               [1920 bytes]
      IGNORED := 0x00                                       [194 bytes]
    }

  [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the
  same number of bytes as SERVER_HDATA? --isis]

§A.2. CREATED2V Cells

  The server responds to the client's CREATE2V cell with SERVER_HDATA,
  formatted into a CREATED2V cell as follows:

    CREATED2V {                                             [2114 bytes]
      HLEN    := 0x0840                                     [2 bytes]
      HDATA   := SERVER_HDATA                               [2112 bytes]
      IGNORED := 0x00                                       [0 bytes]
    }

§A.3. Fragmented EXTEND2 Cells

  When the client wishes to extend a circuit, the client should fragment
  CLIENT_HDATA across four EXTEND2 cells, followed by a fifth EXTEND2 cell
  that carries only padding:

    EXTEND2 {
      NSPEC := 0x02 {                                     [1 byte]
        LINK_ID_SERVER                                    [22 bytes] XXX
        LINK_ADDRESS_SERVER                               [8 bytes]  XXX
      }
      HTYPE := 0x0003                                     [2 bytes]
      HLEN  := 0x0780                                     [2 bytes]
      HDATA := CLIENT_HDATA[0,461]                        [462 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := CLIENT_HDATA[462,954]                      [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := CLIENT_HDATA[955,1447]                     [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := CLIENT_HDATA[1448,1919] || 0x00[20]        [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := 0x00[172]                                  [172 bytes]
    }

  The client sends this to the server to extend the circuit from, and that
  server should format the fragmented EXTEND2 cells into a CREATE2V cell, as
  described in §A.1.

§A.4. Fragmented EXTENDED2 Cells

    EXTENDED2 {
      NSPEC := 0x02 {                                     [1 byte]
        LINK_ID_SERVER                                    [22 bytes] XXX
        LINK_ADDRESS_SERVER                               [8 bytes]  XXX
      }
      HTYPE := 0x0003                                     [2 bytes]
      HLEN  := 0x0840                                     [2 bytes]
      HDATA := SERVER_HDATA[0,461]                        [462 bytes]
    }
    EXTENDED2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[462,954]                      [492 bytes]
    }
    EXTENDED2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[955,1447]                     [492 bytes]
    }
    EXTENDED2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[1448,1939]                    [492 bytes]
    }
    EXTENDED2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[1940,2111]                    [172 bytes]
    }


§B. NewHope Internal Functions

  gen_a(SEED):                  returns a uniformly random poly
  poly_getnoise():              returns a poly sampled from a centered binomial distribution
  poly_ntt(poly):               number-theoretic transform; returns a poly
  poly_invntt(poly):            inverse number-theoretic transform; returns a poly
  poly_pointwise(poly, poly):   pointwise multiplication; returns a poly
  poly_tobytes(poly):           packs a poly to a NEWHOPE_POLY byte array
  poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly

  helprec(poly):                returns a NEWHOPE_REC byte array
  rec(poly, NEWHOPE_REC):       returns a NEWHOPE_KEY


  --- Description of the NewHope internal functions ---

  gen_a(SEED seed) receives as input a 32-byte (public) seed.  It expands
  this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128
  is considered a sequence of 16-bit little-endian integers. This sequence is
  used to initialize the coefficients of the returned polynomial from the least
  significant (coefficient of X^0) to the most significant (coefficient of
  X^1023) coefficient. For each of the 16-bit integers first eliminate the
  highest two bits (to make it a 14-bit integer) and then use it as the next
  coefficient if it is smaller than q=12289.
  Note that the amount of output required from SHAKE to initialize all 1024
  coefficients of the polynomial varies depending on the input seed.
  Note further that this function does not process any secret data and thus does
  not need any timing-attack protection.
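
  The description of gen_a above can be sketched in Python as follows.  This
  is an illustration of the rejection-sampling step only and has not been
  checked against the test vectors in §C.  Since hashlib's SHAKE-128 takes
  the output length up front, the sketch asks for 4096 bytes and doubles the
  request if that turns out to be too little; this starting size is a choice
  made for the sketch, not part of the specification.

```python
import hashlib

N = 1024
Q = 12289


def gen_a(seed):
    """Expand a 32-byte public seed into N coefficients in [0, Q-1]."""
    if len(seed) != 32:
        raise ValueError("seed must be 32 bytes")
    out_len = 4096
    while True:
        # SHAKE output for a longer request starts with the shorter output,
        # so re-requesting with a larger length just extends the same stream.
        stream = hashlib.shake_128(seed).digest(out_len)
        coeffs = []
        for i in range(0, out_len, 2):
            # 16-bit little-endian integer with the top two bits cleared.
            val = int.from_bytes(stream[i:i + 2], "little") & 0x3FFF
            if val < Q:
                coeffs.append(val)
                if len(coeffs) == N:
                    return coeffs
        out_len *= 2
```

  About three quarters of the 14-bit candidates are accepted (12289 out of
  16384), so the amount of SHAKE output consumed varies with the seed, as
  noted above.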


  poly_getnoise() first generates 4096 bytes of uniformly random data. This can
  be done by reading these bytes from the system's RNG; efficient
  implementations will typically only read a 32-byte seed from the system's RNG
  and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR
  mode). The output of the PRG is considered an array of 2048 16-bit integers
  r[0],...,r[2047]. The coefficients of the output polynomial are computed as
  HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW
  stands for Hamming weight.
  Note that the choice of RNG is a local decision; different implementations are
  free to use different RNGs.
  Note further that the output of this function is secret; the PRG (and the
  computation of HW) need to be protected against timing attacks.
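
  A Python sketch of the sampling step described above.  It is split into a
  deterministic helper (a hypothetical name, noise_from_bytes) and a wrapper
  that draws the 4096 bytes from the operating system's RNG via the secrets
  module.  The sketch is not constant-time and therefore does not meet the
  timing-attack requirement; it only illustrates the arithmetic.  The byte
  order used to form the 16-bit integers is an arbitrary choice here, since
  it does not affect the Hamming weight.

```python
import secrets

N = 1024


def hamming_weight(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def noise_from_bytes(buf):
    """Map 4096 uniformly random bytes to N coefficients in [-16, 16]."""
    if len(buf) != 4096:
        raise ValueError("expected exactly 4096 bytes")
    r = [int.from_bytes(buf[2 * i:2 * i + 2], "little") for i in range(2 * N)]
    return [hamming_weight(r[2 * i]) - hamming_weight(r[2 * i + 1])
            for i in range(N)]


def poly_getnoise():
    """Sample a noise polynomial using the system RNG directly."""
    return noise_from_bytes(secrets.token_bytes(4096))
```

  The coefficients are left as signed integers here; reducing them modulo q
  is deferred to whichever function consumes them.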


  poly_ntt(poly f): For a mathematical description of poly_ntt see [0]; a
  pseudocode description of a very naive in-place transformation of an input
  polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
  following code (all arithmetic on coefficients performed modulo q):

    psi   = 7
    omega = 49

    for i in range(0,n):
      t[i] = f[i] * psi^i

    for i in range(0,n):
      f[i] = 0
      for j in range(0,n):
        f[i] += t[j] * omega^((i*j)%n)

  Note that this is not how poly_ntt should be implemented if performance is
  an issue; in particular, efficient algorithms for the number-theoretic
  transform take time O(n*log(n)) and not O(n^2).
  Note further that all arithmetic in poly_ntt has to be protected against
  timing attacks.


  poly_invntt(poly f): For a mathematical description of poly_invntt see
  [0]; a pseudocode description of a very naive in-place transformation of an
  input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
  following code (all arithmetic on coefficients performed modulo q):

    invpsi   = 8778
    invomega = 1254
    invn     = 12277

    for i in range(0,n):
      t[i] = f[i]

    for i in range(0,n):
      f[i] = 0
      for j in range(0,n):
        f[i] += t[j] * invomega^((i*j)%n)
      f[i] *= invpsi^i
      f[i] *= invn

  Note that this is not how poly_invntt should be implemented if performance
  is an issue; in particular, efficient algorithms for the inverse
  number-theoretic transform take time O(n*log(n)) and not O(n^2).
  Note further that all arithmetic in poly_invntt has to be protected against
  timing attacks.
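
  The two naive transforms above translate almost line for line into Python.
  The sketch below is for checking the constants and the round trip only: it
  is quadratic in n, not constant-time, and returns new lists instead of
  transforming in place.  The precomputed power tables (the _powers helper
  and the names starting with an underscore) are an implementation
  convenience introduced here; a pointwise multiplication is included so the
  transforms can be exercised together.

```python
N = 1024
Q = 12289
PSI, OMEGA = 7, 49
INV_PSI, INV_OMEGA, INV_N = 8778, 1254, 12277


def _powers(base, count):
    """Return [base^0, base^1, ..., base^(count-1)] modulo Q."""
    out = [1] * count
    for i in range(1, count):
        out[i] = out[i - 1] * base % Q
    return out


_PSI_POW = _powers(PSI, N)
_OMEGA_POW = _powers(OMEGA, N)
_INV_PSI_POW = _powers(INV_PSI, N)
_INV_OMEGA_POW = _powers(INV_OMEGA, N)


def poly_ntt(f):
    """Naive O(n^2) forward transform; returns a new coefficient list."""
    t = [f[i] * _PSI_POW[i] % Q for i in range(N)]
    return [sum(t[j] * _OMEGA_POW[(i * j) % N] for j in range(N)) % Q
            for i in range(N)]


def poly_invntt(f):
    """Naive O(n^2) inverse transform; returns a new coefficient list."""
    out = []
    for i in range(N):
        acc = sum(f[j] * _INV_OMEGA_POW[(i * j) % N] for j in range(N)) % Q
        out.append(acc * _INV_PSI_POW[i] * INV_N % Q)
    return out


def poly_pointwise(f, g):
    """Coefficient-wise product modulo Q."""
    return [a * b % Q for a, b in zip(f, g)]
```

  Two properties are worth checking with such a sketch: poly_invntt undoes
  poly_ntt, and multiplying X by X^1023 via the transforms yields the
  constant polynomial q-1, which is -1 modulo q, as expected from
  X^1024 = -1 in Rq.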


  poly_pointwise(poly f, poly g) performs pointwise multiplication of the two
  polynomials.  This means that for f = (f0 + f1*X + f2*X^2 + ... +
  f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes
  and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0,
  h1 = f1*g1,..., h1023 = f1023*g1023.


  poly_tobytes(poly f) first reduces all coefficients of f modulo q, i.e.,
  brings them to the interval [0,q-1]. Denote these reduced coefficients as
  f0,..., f1023; note that they all fit into 14 bits. The function then packs
  those coefficients into an array of 1792 bytes r[0],..., r[1791] in "packed
  little-endian representation", i.e.,
  r[0]     = f[0] & 0xff
  r[1]     = (f[0] >>  8) | ((f[1] & 0x03) << 6)
  r[2]     = (f[1] >>  2) & 0xff
  r[3]     = (f[1] >> 10) | ((f[2] & 0x0f) << 4)
  .
  .
  .
  r[1790]  = (f[1022] >> 12) | ((f[1023] & 0x3f) << 2)
  r[1791]  = f[1023] >> 6
  Note that this function needs to be protected against timing attacks. In
  particular, avoid non-constant-time conditional subtractions (or other
  non-constant-time expressions) in the reduction modulo q of the coefficients.


  poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives
  as input an array of 1792 bytes and converts it into the internal
  representation of a poly. Note that poly_frombytes does not need to check
  whether the coefficients are reduced modulo q or reduce coefficients modulo
  q. Note further that the function must not leak any information about its
  inputs through timing information, as it is also applied to the secret key
  of the initiator.
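
  The packing and unpacking can be sketched in Python as below.  Four 14-bit
  coefficients are packed into seven bytes; the rows for the fifth byte of
  each group, which the listing above elides, are inferred here from the
  pattern of the listed rows.  The modular reduction uses Python's %
  operator, so this sketch is not constant-time and is meant only to
  illustrate the bit layout.

```python
N = 1024
Q = 12289


def poly_tobytes(f):
    """Pack N coefficients into 1792 bytes, 14 bits per coefficient."""
    r = bytearray(7 * N // 4)
    for i in range(N // 4):
        f0, f1, f2, f3 = (c % Q for c in f[4 * i:4 * i + 4])
        r[7 * i + 0] = f0 & 0xFF
        r[7 * i + 1] = (f0 >> 8) | ((f1 & 0x03) << 6)
        r[7 * i + 2] = (f1 >> 2) & 0xFF
        r[7 * i + 3] = (f1 >> 10) | ((f2 & 0x0F) << 4)
        r[7 * i + 4] = (f2 >> 4) & 0xFF
        r[7 * i + 5] = (f2 >> 12) | ((f3 & 0x3F) << 2)
        r[7 * i + 6] = f3 >> 6
    return bytes(r)


def poly_frombytes(b):
    """Unpack 1792 bytes into N coefficients; no reduction is performed."""
    if len(b) != 7 * N // 4:
        raise ValueError("expected exactly 1792 bytes")
    f = [0] * N
    for i in range(N // 4):
        r = b[7 * i:7 * i + 7]
        f[4 * i + 0] = r[0] | ((r[1] & 0x3F) << 8)
        f[4 * i + 1] = (r[1] >> 6) | (r[2] << 2) | ((r[3] & 0x0F) << 10)
        f[4 * i + 2] = (r[3] >> 4) | (r[4] << 4) | ((r[5] & 0x03) << 12)
        f[4 * i + 3] = (r[5] >> 2) | (r[6] << 6)
    return f
```

  For any polynomial whose coefficients are already in [0,q-1],
  poly_frombytes(poly_tobytes(f)) returns f unchanged.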


  helprec(poly f) computes 256 bytes of reconciliation information from the
  input poly f. Internally, one byte of reconciliation information is computed
  from four coefficients of f by a function helprec4. Let the input polynomial f
  = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be
  r[0],...,r[255]. This output byte array is computed as
  r[0]   = helprec4(f0,f256,f512,f768)
  r[1]   = helprec4(f1,f257,f513,f769)
  r[2]   = helprec4(f2,f258,f514,f770)
  .
  .
  .
  r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following:

    helprec4(x0,x1,x2,x3):
      b = randombit()
      r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
      r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
      return r

  The function CVPD4 does the following:

    CVPD4(y0,y1,y2,y3):
      v00 = round(y0/2q)
      v01 = round(y1/2q)
      v02 = round(y2/2q)
      v03 = round(y3/2q)
      v10 = round((y0-1)/2q)
      v11 = round((y1-1)/2q)
      v12 = round((y2-1)/2q)
      v13 = round((y3-1)/2q)
      t   = abs(y0 - 2q*v00)
      t  += abs(y1 - 2q*v01)
      t  += abs(y2 - 2q*v02)
      t  += abs(y3 - 2q*v03)
      if(t < 2q):
        v0 = v00
        v1 = v01
        v2 = v02
        v3 = v03
        k  = 0
      else:
        v0 = v10
        v1 = v11
        v2 = v12
        v3 = v13
        k  = 1
      return (v0-v3,v1-v3,v2-v3,k+2*v3)

  In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to
  the largest integer that does not exceed x; abs() returns the absolute
  value.
  Note that all computations involved in helprec operate on secret data and must
  be protected against timing attacks.
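
  Since round(x) is only ever applied to quotients of integers, it can be
  evaluated without floating point.  The Python sketch below shows one way
  to do so, using the identity ⌊y/d + 0.5⌋ = ⌊(2y + d)/(2d)⌋ for d > 0, and
  includes a slow exact reference for comparison.  The function names are
  illustrative, and Python integer division is not constant-time, so this
  does not address the timing requirement stated above.

```python
import math
from fractions import Fraction

Q = 12289


def round_div(y, d):
    """Return floor(y/d + 0.5) for integer y and positive integer d."""
    if d <= 0:
        raise ValueError("divisor must be positive")
    return (2 * y + d) // (2 * d)


def round_div_reference(y, d):
    """Exact rational evaluation of floor(y/d + 0.5), for cross-checking."""
    return math.floor(Fraction(y, d) + Fraction(1, 2))
```

  For example, round(y/2q) as used in CVPD4 is round_div(y, 2 * Q), which
  returns 1 at y = q (an exact tie) and 0 at y = q - 1.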


  rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) NewHope key from
  f and r. Specifically, it computes one bit of key from 4 coefficients of f and
  one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r =
  r[0],r[1],...,r[255]. Let the bytes of the output be k[0],...,k[31] and let
  the bits of the output be k0,...,k255, where
  k0   = k[0] & 0x01
  k1   = (k[0] >> 1) & 0x01
  k2   = (k[0] >> 2) & 0x01
  .
  .
  .
  k8   = k[1] & 0x01
  k9   = (k[1] >> 1) & 0x01
  .
  .
  .
  k255 = (k[31] >> 7)
  The function rec computes k0,...,k255 as
  k0   = rec4(f0,f256,f512,f768,r[0])
  k1   = rec4(f1,f257,f513,f769,r[1])
  .
  .
  .
  k255 = rec4(f255,f511,f767,f1023,r[255])
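
  The mapping between the 256 key bits and the 32 output bytes can be
  sketched in Python as follows: bit number i is stored in byte i/8 (rounded
  down) at bit position i mod 8, least significant bit first.  The function
  names are illustrative; the bits themselves would come from rec4.

```python
def bits_to_key(bits):
    """Pack 256 bits (k0 first) into 32 bytes, least significant bit first."""
    if len(bits) != 256:
        raise ValueError("expected exactly 256 bits")
    k = bytearray(32)
    for i, bit in enumerate(bits):
        k[i // 8] |= (bit & 1) << (i % 8)
    return bytes(k)


def key_bit(k, i):
    """Return bit number i of a 32-byte key."""
    return (k[i // 8] >> (i % 8)) & 1
```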

  The function rec4 does the following:

    rec4(y0,y1,y2,y3,r):
      r0 = r & 0x03
      r1 = (r >> 2) & 0x03
      r2 = (r >> 4) & 0x03
      r3 = (r >> 6) & 0x03
      return Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)

  The function Decode does the following:

    Decode(v0,v1,v2,v3):
      t0 = round(v0/8q)
      t1 = round(v1/8q)
      t2 = round(v2/8q)
      t3 = round(v3/8q)
      t  = abs(v0 - 8q*t0)
      t += abs(v1 - 8q*t1)
      t += abs(v2 - 8q*t2)
      t += abs(v3 - 8q*t3)
      if(t > 1) return 1
      else return 0


§C. Test Vectors
Filename: 271-another-guard-selection.txt
Title: Another algorithm for guard selection
Author:  Isis Lovecruft, George Kadianakis, Ola Bini, Nick Mathewson
Created: 2016-07-11
Supersedes: 259, 268
Status: Closed
Implemented-In: 0.3.0.1-alpha

0.0. Preliminaries

   This proposal derives from proposals 259 and 268; it is meant to
   supersede both.  It is in part a restatement of them, in part a
   simplification, and in part a refactoring so that it does not
   have the serialization problems noted by George Kadianakis.  It
   also makes numerous other small changes.  Isis, George, and Ola
   should all get the credit for the well-considered ideas.

   Whenever I say "Y is a subset of X" you can think in terms of
   "Y-membership is a flag that can be set on members of X" or
   "Y-membership is a predicate that can be evaluated on members of
   X."

   "More work is needed."  There's a to-do at the end of the
   document.

0.1. Notation: identifiers

   We mention identifiers of these kinds:

   [SECTIONS]

   {INPUTS}, {PERSISTENT_DATA}, and {OPERATING_PARAMETERS}.

   {non_persistent_data}

   <states>.

   Each named identifier receives a type where it is defined, and
   is used by reference later on.

   I'm using this convention to make it easier to tell for certain
   whether every thingy we define is used, and vice versa.

1. Introduction and motivation

  Tor uses entry guards to prevent an attacker who controls some
  fraction of the network from observing a fraction of every user's
  traffic. If users chose their entries and exits uniformly at
  random from the list of servers every time they build a circuit,
  then an adversary who had (k/N) of the network would deanonymize
  F=(k/N)^2 of all circuits... and after a given user had built C
  circuits, the attacker would see them at least once with
  probability 1-(1-F)^C.  With large C, the attacker would get a
  sample of every user's traffic with probability 1.
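
  The two formulas above can be evaluated directly.  The Python sketch below
  does so for illustrative inputs; the function names and the example
  numbers used in the note after it are not from this proposal.

```python
def compromised_fraction(k, n):
    """F = (k/N)^2: chance that both entry and exit belong to the attacker."""
    return (k / n) ** 2


def prob_seen_at_least_once(k, n, circuits):
    """1 - (1 - F)^C: chance that at least one of C circuits is compromised."""
    f = compromised_fraction(k, n)
    return 1 - (1 - f) ** circuits
```

  For instance, with an adversary holding a tenth of the network, F is 0.01,
  and the probability of at least one compromised circuit approaches 1 as
  the number of circuits grows into the thousands.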

  To prevent this from happening, Tor clients choose a small number
  of guard nodes (currently 3).  These guard nodes are the only
  nodes that the client will connect to directly.  If they are not
  compromised, the user's paths are not compromised.

  But attacks remain.  Consider an attacker who can run a firewall
  between a target user and the Tor network, and make many of the
  guards they don't control appear to be unreachable.  Or consider
  an attacker who can identify a user's guards, and mount
  denial-of-service attacks on them until the user picks a guard
  that the attacker controls.

  In the presence of these attacks, we can't continue to connect to
  the Tor network unconditionally.  Doing so would eventually result
  in the user choosing a hostile node as their guard, and losing
  anonymity.

  This proposal outlines a new entry guard selection algorithm,
  which tries to meet the following goals:

    - Heuristics and algorithms for determining how and which guards
      are chosen should be kept as simple and easy to understand as
      possible.

    - Clients in censored regions or who are behind a fascist
      firewall who connect to the Tor network should not experience
      any significant disadvantage in terms of reachability or
      usability.

    - Tor should make a best attempt at discovering the most
      appropriate behaviour, with as little user input and
      configuration as possible.

    - Tor clients should discover usable guards without too much
      delay.

    - Tor clients should resist (to the extent possible) attacks
      that try to force them onto compromised guards.


2. State instances

   In the algorithm below, we describe a set of persistent and
   non-persistent state variables.  These variables should be
   treated as an object, of which multiple instances can exist.

   In particular, we specify the use of three particular instances:

     A. UseBridges

      If UseBridges is set, then we replace the {GUARDS} set in
      [Sec:GUARDS] below with the list of configured
      bridges.  We maintain a separate persistent instance of
      {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived
      values for the UseBridges case.

      In this case, we impose no upper limit on the sample size.

     B. EntryNodes / ExcludeNodes / Reachable*Addresses /
        FascistFirewall / ClientUseIPv4=0

      If one of the above options is set, and UseBridges is not,
      then we compare the fraction of usable guards in the consensus
      to the total number of guards in the consensus.

      If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC},
      we use a separate instance of the state.

      (While Tor is running, we do not change back and forth between
      the separate instance of the state and the default instance
      unless the fraction of usable guards is 5% higher than, or 5%
      lower than, {MEANINGFUL_RESTRICTION_FRAC}.  This prevents us
      from flapping back and forth between instances if we happen to
      hit {MEANINGFUL_RESTRICTION_FRAC} exactly.)

      If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a
      separate instance of the state, and warn the user.

      [TODO: should we have a different instance for each set of heavily
      restricted options?]

     C. Default

      If neither of the above variant-state instances is used,
      we use a default instance.
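
   The choice among the three instances, including the anti-flapping rule
   in case B, might be sketched in Python as follows.  This is one reading
   of the text, not a definitive implementation: it assumes the "5%" margin
   is relative to {MEANINGFUL_RESTRICTION_FRAC}, it takes that threshold as
   a parameter rather than assuming a value, and it leaves out the
   {EXTREME_RESTRICTION_FRAC} warning.  The function and argument names are
   hypothetical.

```python
def choose_instance(use_bridges, restricted_options_set, usable_frac,
                    threshold, currently_restricted=None, tolerance=0.05):
    """Pick which guard-state instance to use.

    usable_frac:          usable guards / total guards in the consensus.
    threshold:            the value of MEANINGFUL_RESTRICTION_FRAC.
    currently_restricted: None at startup, otherwise whether the restricted
                          instance is the one in use right now.
    """
    if use_bridges:
        return "bridges"
    if not restricted_options_set:
        return "default"
    if currently_restricted is None:
        # No running state yet: compare against the threshold directly.
        return "restricted" if usable_frac < threshold else "default"
    if currently_restricted:
        # Leave the restricted instance only once clearly above the threshold.
        if usable_frac > threshold * (1 + tolerance):
            return "default"
        return "restricted"
    # Enter the restricted instance only once clearly below the threshold.
    if usable_frac < threshold * (1 - tolerance):
        return "restricted"
    return "default"
```

   The effect of the margin is that a fraction sitting just above or just
   below the threshold leaves the current choice unchanged.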

3. Circuit Creation, Entry Guard Selection (1000 foot view)

   A circuit in Tor is a path through the network connecting a client to
   its destination. At a high level, a three-hop exit circuit will look
   like this:

   Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination

   Entry guards are the only nodes which a client will connect to
   directly. Exit relays are the nodes by which traffic exits the
   Tor network in order to connect to an external destination.

   3.1 Path selection

   For any circuit, at least one entry guard and middle node(s) are
   required. An exit node is required if traffic will exit the Tor
   network. Depending on its configuration, a relay listed in a
   consensus could be used for any of these roles. However, this
   proposal defines how entry guards specifically should be selected and
   managed, as opposed to middle or exit nodes.

   3.1.1 Entry guard selection

   At a high level, a relay listed in a consensus will move through the
   following states in the process from initial selection to eventual
   usage as an entry guard:

      relays listed in consensus
                 |
               sampled
               |     |
         confirmed   filtered
               |     |      |
               primary      usable_filtered

   Relays listed in the latest consensus can be sampled for guard usage
   if they have the "Guard" flag. Sampling is random but weighted by
   bandwidth.

[Paul Syverson in a conversation at the Wilmington Meeting 2017 says that
we should look into how we're doing this sampling.  Essentially, his
concern is that, since we are sampling by bandwidth at first (when we
choose the `sampled` set), then later there is another bias—when trying to
build circuits (and hence marking guards as confirmed) we select those
which completed a usable circuit first (and hence have the lowest
latency)—that this sort of "doubly skewed" selection may "snub" some
low-consensus-weight guards and leave them unused completely.  Thus the
issue is primarily that we're not allocating network resources
efficiently.  Mine and Nick's guard algorithm simulation code never
checked what percentage of possible guards the algorithm reasonably
allowed clients to use; this would be an interesting thing to check in
simulation at some point.  If it does turn out to be a problem, Paul's
intuition for a fix is to select uniformly at random to obtain the
`sampled` set, then weight by bandwidth when trying to build circuits and
marking guards as confirmed. —isis]

   Once a path is built and a circuit established using this guard, it
   is marked as confirmed. Until this point, guards are first sampled
   and then filtered based on information such as our current
   configuration (see SAMPLED and FILTERED sections) and later marked as
   usable_filtered if the guard is not primary but can be reached.

   It is always preferable to use a primary guard when building a new
   circuit in order to reduce guard churn; only on failure to connect to
   existing primary guards will new guards be used.

   3.1.2 Middle and exit node selection

   Middle nodes are selected at random from relays listed in the
   latest consensus, weighted by bandwidth. Exit nodes are chosen
   similarly but restricted to relays with an exit policy.

   3.2 Circuit Building

   Once a path is chosen, Tor will use this path to build a new circuit.

   If the circuit is built successfully, it either can be used
   immediately or wait for a better guard, depending on whether other
   circuits already exist with higher-priority guards.

   If at any point the circuit fails, the guard is marked as
   unreachable, the circuit is closed, and waiting circuits are updated.

4. The algorithm.

4.0.  The guards listed in the current consensus. [Section:GUARDS]

   By {set:GUARDS} we mean the set of all guards in the current
   consensus that are usable for all circuits and directory
   requests. (They must have the flags: Stable, Fast, V2Dir, Guard.)

      **Rationale**

   We require all guards to have the flags that we potentially need
   from any guard, so that all guards are usable for all circuits.

4.1.  The Sampled Guard Set. [Section:SAMPLED]

   We maintain a set, {set:SAMPLED_GUARDS}, that persists across
   invocations of Tor. It is an unordered subset of the nodes that
   we have seen listed as a guard in the consensus at some point.
   For each such guard, we record persistently:

      - {pvar:ADDED_ON_DATE}: The date on which it was added to
        sampled_guards.

        We base this value on RAND(now, {GUARD_LIFETIME}/10). See
        Appendix [RANDOM] below.

      - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to
        sampled_guards.

      - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in
        the _most recent_ consensus we have seen.

      - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date
        of the earliest consensus in which this guard was listed such that we
        have not seen it listed in any later consensus.  Otherwise "None."
        We randomize this, based on
          RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5)

   For each guard in {SAMPLED_GUARDS}, we also record this data,
   non-persistently:

      - {tvar:last_tried_connect}: A 'last tried to connect at'
        time.  Default 'never'.

      - {tvar:is_reachable}: an "is reachable" tristate, with
        possible values { <state:yes>, <state:no>, <state:maybe> }.
        Default '<maybe>.'

               [Note: "yes" is not strictly necessary, but I'm
                making it distinct from "maybe" anyway, to make our
                logic clearer.  A guard is "maybe" reachable if it's
                worth trying. A guard is "yes" reachable if we tried
                it and succeeded.]

      - {tvar:failing_since}: The first time when we failed to
        connect to this guard. Defaults to "never".  Reset to
        "never" when we successfully connect to this guard.

      - {tvar:is_pending}: A "pending" flag.  This indicates that we
        are trying to build an exploratory circuit through the
        guard, and we don't know whether it will succeed.

   We require that {SAMPLED_GUARDS} contain at least
   {MIN_FILTERED_SAMPLE} guards from the consensus (if possible),
   but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards
   in the consensus, and not more than {MAX_SAMPLE_SIZE} in total.
   (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we
   set the maximum at {MIN_FILTERED_SAMPLE}.)
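
   The size bound above can be sketched as a small calculation. This is
   only an illustration: the function name is hypothetical, the default
   values come from [PARAM_VALS], and rounding the percentage down is an
   assumption that the text does not specify.

```python
def max_sample_size(n_guards_in_consensus: int,
                    max_sample_threshold_percent: int = 20,
                    max_sample_size_total: int = 60,
                    min_filtered_sample: int = 20) -> int:
    """Upper bound on the size of SAMPLED_GUARDS for one consensus."""
    # At most MAX_SAMPLE_THRESHOLD of the guards in the consensus
    # (rounded down; the rounding rule is an assumption).
    by_fraction = n_guards_in_consensus * max_sample_threshold_percent // 100
    # And never more than MAX_SAMPLE_SIZE in total.
    limit = min(by_fraction, max_sample_size_total)
    # But the maximum is never allowed to fall below MIN_FILTERED_SAMPLE.
    return max(limit, min_filtered_sample)
```

   With these values, a consensus listing 200 guards would give a
   maximum of 40, and one listing only 50 guards would give 20.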

   To add a new guard to {SAMPLED_GUARDS}, pick an entry at random
   from ({GUARDS} - {SAMPLED_GUARDS}), weighted by bandwidth.

   We remove an entry from {SAMPLED_GUARDS} if:

      * We have a live consensus, and {IS_LISTED} is false, and
        {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER}
        days in the past.

     OR

      * We have a live consensus, and {ADDED_ON_DATE} is over
        {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either
        "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago.

   Note that {SAMPLED_GUARDS} does not depend on our configuration.
   It is possible that we can't actually connect to any of these
   guards.
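
   The two removal rules above can be read as a single predicate. The
   sketch below is illustrative only: the class and function names are
   hypothetical, "never" is modeled as None, and the parameter values
   are the suggested ones from [PARAM_VALS].

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

REMOVE_UNLISTED_GUARDS_AFTER = timedelta(days=20)
GUARD_LIFETIME = timedelta(days=120)
GUARD_CONFIRMED_MIN_LIFETIME = timedelta(days=60)


@dataclass
class SampledGuard:
    added_on_date: datetime
    is_listed: bool = True
    first_unlisted_at: Optional[datetime] = None
    confirmed_on_date: Optional[datetime] = None  # None means "never"


def should_remove(g: SampledGuard, now: datetime,
                  have_live_consensus: bool) -> bool:
    """Return True if g should be dropped from SAMPLED_GUARDS."""
    if not have_live_consensus:
        return False  # both rules require a live consensus
    # Rule 1: unlisted for longer than REMOVE_UNLISTED_GUARDS_AFTER.
    unlisted_too_long = (
        not g.is_listed
        and g.first_unlisted_at is not None
        and now - g.first_unlisted_at > REMOVE_UNLISTED_GUARDS_AFTER
    )
    # Rule 2: sampled long ago, and either never confirmed or
    # confirmed more than GUARD_CONFIRMED_MIN_LIFETIME ago.
    sampled_too_long = now - g.added_on_date > GUARD_LIFETIME
    confirmed_long_ago = (
        g.confirmed_on_date is None
        or now - g.confirmed_on_date > GUARD_CONFIRMED_MIN_LIFETIME
    )
    return unlisted_too_long or (sampled_too_long and confirmed_long_ago)
```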

     **Rationale**

   The {SAMPLED_GUARDS} set is meant to limit the total number of
   guards that a client will connect to in a given period.  The
   upper limit on its size prevents us from considering too many
   guards.

   The first expiration mechanism is there so that our
   {SAMPLED_GUARDS} list does not accumulate so many dead
   guards that we cannot add new ones.

   The second expiration mechanism makes us rotate our guards slowly
   over time.


4.2. The Usable Sample [Section:FILTERED]

   We maintain another set, {set:FILTERED_GUARDS}, that does not
   persist. It is derived from:
       - {SAMPLED_GUARDS}
       - our current configuration,
       - the path bias information.

   A guard is a member of {set:FILTERED_GUARDS} if and only if all
   of the following are true:

       - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to
         true.
       - It is not disabled because of path bias issues.
       - It is not disabled because of ReachableAddresses policy,
         the ClientUseIPv4 setting, the ClientUseIPv6 setting,
         the FascistFirewall setting, or some other
         option that prevents using some addresses.
       - It is not disabled because of ExcludeNodes.
       - It is a bridge if UseBridges is true; or it is not a
         bridge if UseBridges is false.
       - Is included in EntryNodes if EntryNodes is set and
         UseBridges is not. (But see 2.B above).

   We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which
   is defined to be the subset of {FILTERED_GUARDS} where
   {is_reachable} is <yes> or <maybe>.

   We try to maintain a requirement that {USABLE_FILTERED_GUARDS}
   contain at least {MIN_FILTERED_SAMPLE} elements:

     Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
     and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
     add new elements to {SAMPLED_GUARDS} until one of the following
     is true:

       * {USABLE_FILTERED_GUARDS} is large enough,
     OR
       * {SAMPLED_GUARDS} is at its maximum size.
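
     A minimal sketch of this top-up loop follows. The function and
     argument names are hypothetical; guards are represented by plain
     identifiers, the filter is passed in as a predicate, and all
     bandwidth weights are assumed to be positive. Stopping when no
     unsampled guards remain is an extra condition added here for
     safety.

```python
import random


def top_up_sample(sampled, guards_bw, is_usable_filtered,
                  min_filtered_sample, max_sample, rng):
    """Grow `sampled` (a list, mutated in place) as described above.

    guards_bw maps every guard id in {GUARDS} to its bandwidth weight.
    is_usable_filtered(guard_id) says whether a guard would be in
    {USABLE_FILTERED_GUARDS}.
    """
    def n_usable():
        return sum(1 for g in sampled if is_usable_filtered(g))

    while n_usable() < min_filtered_sample and len(sampled) < max_sample:
        # Candidates are ({GUARDS} - {SAMPLED_GUARDS}).
        candidates = [g for g in guards_bw if g not in sampled]
        if not candidates:
            break  # nothing left to add (assumption: stop quietly)
        weights = [guards_bw[g] for g in candidates]
        # Pick one new guard at random, weighted by bandwidth.
        sampled.append(rng.choices(candidates, weights=weights, k=1)[0])
    return sampled
```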


     ** Rationale **

  These filters are applied _after_ sampling: if we applied them
  before the sampling, then our sample would reflect the set of
  filtering restrictions that we had in the past.

4.3. The confirmed-guard list. [Section:CONFIRMED]

  [formerly USED_GUARDS]

  We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}.
  It contains guards that we have used before, in our preference
  order of using them.  It is a subset of {SAMPLED_GUARDS}.  For
  each guard in this list, we store persistently:

      - {pvar:IDENTITY}: Its fingerprint.

      - {pvar:CONFIRMED_ON_DATE} When we added this guard to
        {CONFIRMED_GUARDS}.

        Randomized as RAND(now, {GUARD_LIFETIME}/10).

  We add new members to {CONFIRMED_GUARDS} when we mark a circuit
  built through a guard as "for user traffic."  That is, a circuit is
  considered for use for client traffic when we have decided that we
  could attach a stream to it; at that point the guard for that
  circuit SHOULD be added to {CONFIRMED_GUARDS}.

  Whenever we remove a member from {SAMPLED_GUARDS}, we also remove
  it from {CONFIRMED_GUARDS}.

      [Note: You can also regard the {CONFIRMED_GUARDS} list as a
      total ordering defined over a subset of {SAMPLED_GUARDS}.]

  Definition: we call Guard A "higher priority" than another Guard B
  if, when A and B are both reachable, we would rather use A.  We
  define priority as follows:

     * Every guard in {CONFIRMED_GUARDS} has a higher priority
       than every guard not in {CONFIRMED_GUARDS}.

     * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier
       on the {CONFIRMED_GUARDS} list has a higher priority.

     * Among guards that do not appear in {CONFIRMED_GUARDS},
       {is_pending}==true guards have higher priority.

     * Among those, the guard with earlier {last_tried_connect} time
       has higher priority.

     * Finally, among guards that do not appear in
       {CONFIRMED_GUARDS} with {is_pending==false}, all have equal
       priority.
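
  One way to read the priority rules above is as a sort key, where a
  smaller key means a higher priority. This is a sketch under that
  reading, not a definitive implementation; the function name and the
  dictionary-based representation of {is_pending} and
  {last_tried_connect} are assumptions made for illustration.

```python
def priority_key(guard_id, confirmed, is_pending, last_tried_connect):
    """Sort key for a guard: smaller tuples mean higher priority.

    confirmed is the ordered CONFIRMED_GUARDS list; is_pending and
    last_tried_connect are dicts keyed by guard id.
    """
    if guard_id in confirmed:
        # Confirmed guards outrank all others, in list order.
        return (0, confirmed.index(guard_id), 0.0)
    if is_pending.get(guard_id, False):
        # Pending unconfirmed guards come next; an earlier
        # last_tried_connect time wins among them.
        return (1, 0, last_tried_connect.get(guard_id, 0.0))
    # All remaining guards share one priority level.
    return (2, 0, 0.0)
```

  For example, sorting a collection of guards with this key places the
  confirmed guards first, in their {CONFIRMED_GUARDS} order.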

   ** Rationale **

  We add elements to this ordering when we have actually used them
  for building a usable circuit.  We could mark them at some other
  time (such as when we attempt to connect to them, or when we
  actually connect to them), but this approach keeps us from
  committing to a guard before we actually use it for sensitive
  traffic.

4.4. The Primary guards [Section:PRIMARY]

  We keep a run-time non-persistent ordered list of
  {list:PRIMARY_GUARDS}.  It is a subset of {FILTERED_GUARDS}.  It
  contains {N_PRIMARY_GUARDS} elements.

  To compute primary guards, take the ordered intersection of
  {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
  {N_PRIMARY_GUARDS} elements.  If there are fewer than
  {N_PRIMARY_GUARDS} elements, add additional elements to
  PRIMARY_GUARDS chosen _uniformly_ at random from
  ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).

  Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
  until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed
  elements always precede unconfirmed ones in the {PRIMARY_GUARDS} list.

  Note that {PRIMARY_GUARDS} do not have to be in
  {USABLE_FILTERED_GUARDS}: they might be unreachable.
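
  The from-scratch computation of the primary list can be sketched as
  below. The function name is hypothetical, and this sketch
  deliberately ignores the retention rule (that an element already in
  {PRIMARY_GUARDS} stays until a confirmed guard replaces it); it only
  shows the ordered intersection and the uniform random fill.

```python
import random


def compute_primary_guards(confirmed, filtered, n_primary, rng):
    """Build a fresh PRIMARY_GUARDS list.

    confirmed is the ordered CONFIRMED_GUARDS list; filtered is the
    FILTERED_GUARDS set.
    """
    # Ordered intersection: keep CONFIRMED_GUARDS order.
    primary = [g for g in confirmed if g in filtered][:n_primary]
    if len(primary) < n_primary:
        # Fill the rest uniformly at random (not bandwidth-weighted)
        # from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
        pool = sorted(g for g in filtered if g not in confirmed)
        needed = min(n_primary - len(primary), len(pool))
        primary.extend(rng.sample(pool, needed))
    return primary
```

  Because the confirmed elements are placed first, they always come
  before the unconfirmed ones in the result.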

    ** Rationale **

  These guards are treated differently from other guards.  If one of
  them is usable, then we use it right away. For other guards in
  {FILTERED_GUARDS}, if one is usable, then before using it we might
  first double-check whether perhaps one of the primary guards is
  usable after all.

4.5. Retrying guards. [Section:RETRYING]

  (We run this process as frequently as needed. It can be done once
  a second, or just-in-time.)

  If a primary sampled guard's {is_reachable} status is <no>, then
  we decide whether to update its {is_reachable} status to <maybe>
  based on its {last_tried_connect} time, its {failing_since} time,
  and the {PRIMARY_GUARDS_RETRY_SCHED} schedule.

  If a non-primary sampled guard's {is_reachable} status is <no>, then
  we decide whether to update its {is_reachable} status to <maybe>
  based on its {last_tried_connect} time, its {failing_since} time,
  and the {GUARDS_RETRY_SCHED} schedule.
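
  The text does not spell out exactly how the three inputs combine.
  The sketch below assumes one plausible reading: the retry interval is
  chosen by how long the guard has been failing, and the guard becomes
  <maybe> again once that interval has passed since the last attempt.
  The schedules are the suggested ones from [PARAM_VALS]; the function
  names and the use of plain seconds for times are illustrative.

```python
HOUR = 3600
DAY = 24 * HOUR

# Each entry is (how long this stage lasts, retry interval); a
# duration of None means "thereafter".
PRIMARY_GUARDS_RETRY_SCHED = [
    (6 * HOUR, 30 * 60), (3.75 * DAY, 2 * HOUR),
    (3 * DAY, 4 * HOUR), (None, 9 * HOUR),
]
GUARDS_RETRY_SCHED = [
    (6 * HOUR, 1 * HOUR), (3.75 * DAY, 4 * HOUR),
    (3 * DAY, 18 * HOUR), (None, 36 * HOUR),
]


def retry_interval(failing_for, schedule):
    """Pick the retry interval for a guard failing for `failing_for` s."""
    elapsed = 0
    for duration, interval in schedule:
        if duration is None or failing_for < elapsed + duration:
            return interval
        elapsed += duration
    return schedule[-1][1]


def should_mark_maybe(now, last_tried_connect, failing_since, is_primary):
    """Decide whether a <no> guard should be reset to <maybe>."""
    schedule = (PRIMARY_GUARDS_RETRY_SCHED if is_primary
                else GUARDS_RETRY_SCHED)
    interval = retry_interval(now - failing_since, schedule)
    return now - last_tried_connect >= interval
```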

    ** Rationale **

  An observation that a guard has been 'unreachable' only lasts for
  a given amount of time, since we can't infer that it's unreachable
  now from the fact that it was unreachable a few minutes ago.

4.6. Selecting guards for circuits. [Section:SELECTING]

  Every origin circuit is now in one of these states:
     <state:usable_on_completion>,
     <state:usable_if_no_better_guard>,
     <state:waiting_for_better_guard>, or
     <state:complete>.

  You may only attach streams to <complete> circuits.
  (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO
  cells, and INTRODUCE cells on <complete> circuits.)

  The per-circuit state machine is:

      New circuits are <usable_on_completion> or
      <usable_if_no_better_guard>.

      A <usable_on_completion> circuit may become <complete>, or may
      fail.

      A <usable_if_no_better_guard> circuit may become
      <usable_on_completion>; may become <waiting_for_better_guard>; or may
      fail.

      A <waiting_for_better_guard> circuit will become <complete>, or will
      be closed, or will fail.

      A <complete> circuit remains <complete> until it fails or is
      closed.

      Each of these transitions is described below.
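
  The per-circuit state machine above can be written down as a
  transition table. This is only a sketch: "failed" and "closed" are
  treated here as terminal pseudo-states for illustration (the text
  lists only four states), and the names of the helpers are
  hypothetical.

```python
# States a new circuit may start in.
INITIAL_STATES = {"usable_on_completion", "usable_if_no_better_guard"}

# Allowed transitions, taken from the description above.
ALLOWED_TRANSITIONS = {
    "usable_on_completion": {"complete", "failed"},
    "usable_if_no_better_guard": {
        "usable_on_completion", "waiting_for_better_guard", "failed",
    },
    "waiting_for_better_guard": {"complete", "closed", "failed"},
    "complete": {"failed", "closed"},
}


def can_transition(old_state, new_state):
    """Return True if the state machine permits old_state -> new_state."""
    return new_state in ALLOWED_TRANSITIONS.get(old_state, set())


def may_attach_stream(state):
    """Streams may only be attached to <complete> circuits."""
    return state == "complete"
```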

  We keep, as global transient state:

    * {tvar:last_time_on_internet} -- the last time at which we
      successfully used a circuit or connected to a guard.  At
      startup we set this to "infinitely far in the past."

  When we want to build a circuit, and we need to pick a guard:

    * If any entry in PRIMARY_GUARDS has {is_reachable} status of
      <maybe> or <yes>, return the first such guard. The circuit is
      <usable_on_completion>.

      [Note: We do not use {is_pending} on primary guards, since we
      are willing to try to build multiple circuits through them
      before we know for sure whether they work, and since we will
      not use any non-primary guards until we are sure that the
      primary guards are all down.  (XX is this good?)]

    * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS}
      and {USABLE_FILTERED_GUARDS} is nonempty, return the first
      entry in that intersection that has {is_pending} set to
      false. Set its value of {is_pending} to true.  The circuit
      is now <usable_if_no_better_guard>.  (If all entries have
      {is_pending} true, pick the first one.)

    * Otherwise, if there is no such entry, select a member at
      random from {USABLE_FILTERED_GUARDS}. Set its {is_pending}
      field to true.  The circuit is <usable_if_no_better_guard>.

  We update the {last_tried_connect} time for the guard to 'now.'
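
  The three selection steps above, followed by the timestamp update,
  can be sketched as follows. The function name and the
  dictionary-based state are assumptions for illustration; returning
  (None, None) when no guard is available is also an assumption, since
  the text does not cover that case.

```python
import random


def select_guard(primary, confirmed, usable_filtered, is_reachable,
                 is_pending, last_tried_connect, now, rng):
    """Pick a guard for a new circuit; return (guard, circuit_state)."""
    for g in primary:
        # Step 1: the first primary guard that is <maybe> or <yes>.
        if is_reachable.get(g, "maybe") in ("maybe", "yes"):
            chosen, state = g, "usable_on_completion"
            break
    else:
        # Step 2: ordered intersection of CONFIRMED_GUARDS and
        # USABLE_FILTERED_GUARDS, preferring non-pending entries.
        candidates = [g for g in confirmed if g in usable_filtered]
        if candidates:
            idle = [g for g in candidates if not is_pending.get(g, False)]
            chosen = idle[0] if idle else candidates[0]
        elif usable_filtered:
            # Step 3: any member of USABLE_FILTERED_GUARDS, at random.
            chosen = rng.choice(sorted(usable_filtered))
        else:
            return None, None  # assumption: nothing to try
        is_pending[chosen] = True
        state = "usable_if_no_better_guard"
    last_tried_connect[chosen] = now
    return chosen, state
```

  Note that {is_pending} is only set for non-primary guards, matching
  the note above about primary guards.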

  In some cases (for example, when we need a certain directory feature,
  or when we need to avoid using a certain exit as a guard), we need to
  restrict the guards that we use for a single circuit. When this happens, we
  remember the restrictions that applied when choosing the guard for
  that circuit, since we will need them later (see [UPDATE_WAITING]).

    ** Rationale **

  We're getting to the core of the algorithm here.  Our main goals are to
  make sure that
    1. If it's possible to use a primary guard, we do.
    2. We probably use the first primary guard.

  So we only try non-primary guards if we're pretty sure that all
  the primary guards are down, and we only try a given primary guard
  if the earlier primary guards seem down.

  When we _do_ try non-primary guards, however, we only build one
  circuit through each, to give it a chance to succeed or fail.  If
  ever such a circuit succeeds, we don't use it until we're pretty
  sure that it's the best guard we're getting (see below).

         [XXX timeout.]

4.7. When a circuit fails. [Section:ON_FAIL]

   When a circuit fails in a way that makes us conclude that a guard
   is not reachable, we take the following steps:

      * We set the guard's {is_reachable} status to <no>.  If it had
        {is_pending} set to true, we make it non-pending.

      * We close the circuit, of course.  (This removes it from
        consideration by the algorithm in [UPDATE_WAITING].)

      * Update the list of waiting circuits.  (See [UPDATE_WAITING]
        below.)

   [Note: the existing Tor logic will cause us to create more
   circuits in response to some of these steps; and also see
   [ON_CONSENSUS].]

    ** Rationale **

   See [SELECTING] above for rationale.

4.8. When a circuit succeeds [Section:ON_SUCCESS]

   When a circuit succeeds in a way that makes us conclude that a
   guard _was_ reachable, we take these steps:

      * We set its {is_reachable} status to <yes>.
      * We set its {failing_since} to "never".
      * If the guard was {is_pending}, we clear the {is_pending} flag.
      * If the guard was not a member of {CONFIRMED_GUARDS}, we add
        it to the end of {CONFIRMED_GUARDS}.

      * If this circuit was <usable_on_completion>, this circuit is
        now <complete>. You may attach streams to this circuit,
        and use it for hidden services.

      * If this circuit was <usable_if_no_better_guard>, it is now
        <waiting_for_better_guard>.  You may not yet attach streams to it.
        Then check whether the {last_time_on_internet} is more than
        {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:

           * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
             reachable.

           * If it is not, update the list of waiting circuits. (See
             [UPDATE_WAITING] below)

   [Note: the existing Tor logic will cause us to create more
   circuits in response to some of these steps; and see
   [ON_CONSENSUS].]

    ** Rationale **

   See [SELECTING] above for rationale.

4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]

   We run this procedure whenever it's possible that a
   <waiting_for_better_guard> circuit might be ready to be called
   <complete>.

   * If any circuit C1 is <waiting_for_better_guard>, AND:
       * All primary guards have reachable status of <no>.
       * There is no circuit C2 that "blocks" C1.
     Then, upgrade C1 to <complete>.

   Definition: In the algorithm above, C2 "blocks" C1 if:
       * C2 obeys all the restrictions that C1 had to obey, AND
       * C2 has higher priority than C1, AND
       * Either C2 is <complete>, or C2 is <waiting_for_better_guard>,
         or C2 has been <usable_if_no_better_guard> for no more than
         {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds.
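
   The upgrade rule and the "blocks" definition can be sketched as
   below. This is a simplified model, not a definitive implementation:
   the Circuit class is hypothetical, guard priority is reduced to a
   numeric rank (lower is better), and per-circuit restrictions are
   modeled only as a set of guards that the circuit had to avoid.

```python
from dataclasses import dataclass

NONPRIMARY_GUARD_CONNECT_TIMEOUT = 15  # seconds, from [PARAM_VALS]


@dataclass
class Circuit:
    guard: str
    state: str
    priority_rank: int          # lower rank = higher-priority guard
    state_since: float = 0.0    # when the circuit entered `state`
    excluded_guards: frozenset = frozenset()


def blocks(c2, c1, now):
    """Return True if circuit c2 "blocks" circuit c1."""
    if c2 is c1:
        return False
    # C2 must obey the restrictions that C1 had to obey.
    if c2.guard in c1.excluded_guards:
        return False
    # C2 must have higher priority than C1.
    if not c2.priority_rank < c1.priority_rank:
        return False
    if c2.state in ("complete", "waiting_for_better_guard"):
        return True
    return (c2.state == "usable_if_no_better_guard"
            and now - c2.state_since <= NONPRIMARY_GUARD_CONNECT_TIMEOUT)


def update_waiting(circuits, all_primary_unreachable, now):
    """Upgrade unblocked waiting circuits; return those upgraded."""
    if not all_primary_unreachable:
        return []
    upgraded = []
    for c1 in circuits:
        if c1.state != "waiting_for_better_guard":
            continue
        if not any(blocks(c2, c1, now) for c2 in circuits):
            c1.state = "complete"
            upgraded.append(c1)
    return upgraded
```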

   We run this procedure periodically:

   * If any circuit stays in <waiting_for_better_guard>
     for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds,
     time it out.

      **Rationale**

   If we open a connection to a guard, we might want to use it
   immediately (if we're sure that it's the best we can do), or we
   might want to wait a little while to see if some other circuit
   which we like better will finish.


   When we mark a circuit <complete>, we don't close the
   lower-priority circuits immediately: we might decide to use
   them after all if the <complete> circuit goes down before
   {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.

4.10.  Whenever we get a new consensus. [Section:ON_CONSENSUS]

   We update {GUARDS}.

   For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
   {FIRST_UNLISTED_AT}.

   [**] We remove entries from {SAMPLED_GUARDS} if appropriate,
   according to the sampled-guards expiration rules.  If they were
   in {CONFIRMED_GUARDS}, we also remove them from
   {CONFIRMED_GUARDS}.

   We recompute {FILTERED_GUARDS}, and everything that derives from
   it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}.

   (Whenever one of the configuration options that affects the
   filter is updated, we repeat the process above, starting at the
   [**] line.)

4.11. Deciding whether to generate a new circuit.
  [Section:NEW_CIRCUIT_NEEDED]

   In current Tor, we generate a new circuit when we don't have
   enough circuits either built or in-progress to handle a given
   stream, or an expected stream.

   For the purpose of this rule, we say that <waiting_for_better_guard>
   circuits are neither built nor in-progress; that <complete>
   circuits are built; and that the other states are in-progress.

A. Appendices

A.1.  Parameters with suggested values. [Section:PARAM_VALS]

   (All suggested values chosen arbitrarily)

   {param:MAX_SAMPLE_THRESHOLD} -- 20%

   {param:MAX_SAMPLE_SIZE} -- 60

   {param:GUARD_LIFETIME} -- 120 days

   {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days
     [previously ENTRY_GUARD_REMOVE_AFTER]

   {param:MIN_FILTERED_SAMPLE} -- 20

   {param:N_PRIMARY_GUARDS} -- 3

   {param:PRIMARY_GUARDS_RETRY_SCHED}
      -- every 30 minutes for the first 6 hours.
      -- every 2 hours for the next 3.75 days.
      -- every 4 hours for the next 3 days.
      -- every 9 hours thereafter.

   {param:GUARDS_RETRY_SCHED} -- 1 hour
      -- every hour for the first 6 hours.
      -- every 4 hours for the next 3.75 days.
      -- every 18 hours for the next 3 days.
      -- every 36 hours thereafter.

   {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes

   {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds

   {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes

   {param:MEANINGFUL_RESTRICTION_FRAC} -- .2

   {param:EXTREME_RESTRICTION_FRAC} -- .01

   {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days

A.2. Random values [Section:RANDOM]

   Frequently, we want to randomize the expiration time of something
   so that it's not easy for an observer to match it to its start
   time. We do this by randomizing its start date a little, so that
   we only need to remember a fixed expiration interval.

   By RAND(now, INTERVAL) we mean a time between now and INTERVAL in
   the past, chosen uniformly at random.
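
   As a minimal sketch, RAND could be written as below. The function
   name is hypothetical. Python's random module is used only for
   illustration; whether a stronger random source is needed is not
   addressed here.

```python
import random
from datetime import datetime, timedelta


def rand_time(now: datetime, interval: timedelta, rng=random) -> datetime:
    """Return a time chosen uniformly between now - interval and now."""
    offset_seconds = rng.uniform(0, interval.total_seconds())
    return now - timedelta(seconds=offset_seconds)
```

   For example, with the suggested {GUARD_LIFETIME} of 120 days,
   {ADDED_ON_DATE} would be rand_time(now, timedelta(days=12)).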


A.3. Why not a sliding scale of primaryness? [Section:CVP]

   At one meeting, I floated the idea of having "primaryness" be a
   continuous variable rather than a boolean.

   I'm no longer sure this is a great idea, but I'll try to outline
   how it might work.

   To begin with: being "primary" gives a guard a few different traits:

      1) We retry primary guards more frequently. [Section:RETRYING]

      2) We don't even _try_ building circuits through
         lower-priority guards until we're pretty sure that the
         higher-priority primary guards are down. (With non-primary
         guards, on the other hand, we launch exploratory circuits
         which we plan not to use if higher-priority guards
         succeed.) [Section:SELECTING]

      3) We retry them all one more time if a circuit succeeds after
         the net has been down for a while. [Section:ON_SUCCESS]

   We could make each of the above traits continuous:

      1) We could make the interval at which a guard is retried
         depend continuously on its position in CONFIRMED_GUARDS.

      2) We could change the number of guards we test in parallel
         based on their position in CONFIRMED_GUARDS.

      3) We could change the rule for how long the higher-priority
         guards need to have been down before we call a
         <usable_if_no_better_guard> circuit <complete> based on a
         possible network-down condition.  For example, we could
         retry the first guard if we tried it more than 10 seconds
         ago, the second if we tried it more than 20 seconds ago,
         etc.

   I am pretty sure, however, that if these are worth doing, they
   need more analysis!  Here's why:

      * They all have the potential to leak more information about a
        guard's exact position on the list.  Is that safe? Is there
        any way to exploit that?  I don't think we know.

      * They all seem like changes which it would be relatively
        simple to make to the code after we implement the simpler
        version of the algorithm described above.

A.4. Controller changes

   We will add to control-spec.txt a new possible circuit state, GUARD_WAIT,
   that can be given as part of circuit events and GETINFO responses about
   circuits.  A circuit is in the GUARD_WAIT state when it is fully built,
   but we will not use it because a circuit with a better guard might
   become built too.

A.5. Persistent state format

   The persistent state format doesn't need to be part of this
   proposal, since different implementations can do it
   differently. Nonetheless, here's the one Tor uses:

   The "state" file contains one Guard entry for each sampled guard
   in each instance of the guard state (see section 2).  The value
   of this Guard entry is a set of space-separated K=V entries,
   where K contains any nonspace character except =, and V contains
   any nonspace characters.

   Implementations must retain any unrecognized K=V entries for a
   sampled guard when they regenerate the state file.

   The order of K=V entries is not allowed to matter.

   Recognized fields (values of K) are:

        "in" -- the name of the guard state instance that this
        sampled guard is in.  If a sampled guard is in two guard
        state instances, it appears twice, with a different "in"
        field each time. Required.

        "rsa_id" -- the RSA id digest for this guard, encoded in
        hex. Required.

        "bridge_addr" -- If the guard is a bridge, its configured
        address and OR port. Optional.

        "nickname" -- the guard's nickname, if any. Optional.

        "sampled_on" -- the date when the guard was sampled. Required.

        "sampled_by" -- the Tor version that sampled this guard.
        Optional.

        "unlisted_since" -- the date since which the guard has been
        unlisted. Optional.

        "listed" -- 0 if the guard is not listed ; 1 if it is. Required.

        "confirmed_on" -- date when the guard was
        confirmed. Optional.

        "confirmed_idx" -- position of the guard in the confirmed
        list. Optional.

        "pb_use_attempts", "pb_use_successes", "pb_circ_attempts",
        "pb_circ_successes", "pb_successful_circuits_closed",
        "pb_collapsed_circuits", "pb_unusable_circuits",
        "pb_timeouts" -- state for the circuit path bias algorithm,
        given in decimal fractions. Optional.

   All dates here are given as a (spaceless) ISO8601 combined date
   and time in UTC (e.g., 2016-11-29T19:39:31).
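
   A minimal sketch of reading and rewriting one Guard entry value
   under these rules follows. The function names are hypothetical, and
   the sample values used with it (such as an instance name of
   "default") are illustrative only. Unrecognized K=V entries are kept
   as-is so that they survive a rewrite.

```python
from datetime import datetime

REQUIRED_FIELDS = ("in", "rsa_id", "sampled_on", "listed")
DATE_FIELDS = ("sampled_on", "unlisted_since", "confirmed_on")
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_guard_entry(value):
    """Parse space-separated K=V items into a dict."""
    fields = {}
    for item in value.split():
        # K may not contain "=", so split at the first "=" only.
        key, sep, val = item.partition("=")
        if not sep or not key:
            raise ValueError("malformed K=V item: %r" % item)
        fields[key] = val
    missing = [k for k in REQUIRED_FIELDS if k not in fields]
    if missing:
        raise ValueError("missing required fields: %s" % missing)
    for k in DATE_FIELDS:
        if k in fields:
            fields[k] = datetime.strptime(fields[k], DATE_FORMAT)
    return fields


def format_guard_entry(fields):
    """Serialize a dict back to K=V form, keeping unknown keys."""
    parts = []
    for key, val in fields.items():
        if isinstance(val, datetime):
            val = val.strftime(DATE_FORMAT)
        parts.append("%s=%s" % (key, val))
    return " ".join(parts)
```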

   I do not plan to build a migration mechanism from the old format
   to the new.


TODO. Still non-addressed issues [Section:TODO]

   Simulate to answer:  Will this work in a dystopic world?

   Simulate actual behavior.

   For all lifetimes: instead of storing the "this began at" time,
   store the "remove this at" time, slightly randomized.

   Clarify that when you get a <complete> circuit, you might need to
   relaunch circuits through that same guard immediately, if they
   are circuits that have to be independent.


   Fix all items marked XX or TODO.

   "Directory guards" -- do they matter?

       Suggestion: require that all guards support downloads via BEGINDIR.
       We don't need to worry about directory guards for relays, since we
       aren't trying to prevent relay enumeration.

   IP version preferences via ClientPreferIPv6ORPort

       Suggestion: Treat it as a preference when adding to
       {CONFIRMED_GUARDS}, but not otherwise.

Filename: 272-valid-and-running-by-default.txt
Title: Listed routers should be Valid, Running, and treated as such
Created: 26 Aug 2016
Author: Nick Mathewson
Status: Closed
Implemented-In: 0.2.9.3-alpha, 0.2.9.4-alpha

1. Introduction and proposal.

   This proposal describes a change in how clients understand consensus
   flags, and how authorities vote on consensuses.

1.1. Authority-side changes

   Back in proposal 138, we made it so that non-Running routers were not
   included in the consensus documents. We should do the same with the
   Valid flag.  Specifically, after voting, if the authorities find that
   a router would not receive the Valid flag, they should not include it
   at all.

   This will require the allocation of a new consensus method, since it
   is a change in how consensuses are made from votes.

   In the most recent consensus, it will affect exactly 1 router.

1.2. Client-side changes

   I propose that clients should consider every listed router to be
   listed as Running and Valid if the new consensus method described
   above, or any higher one, is in use.

2. Benefits

   Removing the notion of listed but invalid routers will remove an
   opportunity for error, and let us remove some client side code.

   More interestingly, the above changes would allow us to eventually
   stop including the Running and Valid flags, thereby providing an
   authority-side way to feature-gate clients off of the Tor network
   without a fast-zombie problem. (See proposal 266 for discussion.)


A. An additional possible change

   Perhaps authorities might also treat BadExit like they treat the
   absence of Valid and Running: as sufficient reason to not include a
   router in the consensus.  Right now, there are only 4 listed BadExit
   routers in the consensus, amounting to a small fraction of total
   bandwidth.

   Making this change would allow us to remove the client-side badexit
   logic.


B. Does this solve the zombie problem?

   I tested it a little, and it does seem to be a way to make even the
   most ancient consensus-understanding Tors stop fetching descriptors
   and using the network. More testing needed though.

Filename: 273-exit-relay-pinning.txt
Title: Exit relay pinning for web services
Author: Philipp Winter, Tobias Pulls, Roya Ensafi, and Nick Feamster
Created: 2016-09-22
Status: Reserve
Target: n/a

0. Overview

   To mitigate the harm caused by malicious exit relays, this proposal
   presents a novel scheme -- exit relay pinning -- to allow web sites
   to express that Tor connections should preferably originate from a
   set of predefined exit relays.  This proposal is currently in draft
   state.  Any feedback is appreciated.

1. Motivation

   Malicious exit relays are increasingly becoming a problem.  We have
   been witnessing numerous opportunistic attacks, but also highly
   sophisticated, targeted attacks that are financially motivated.  So
   far, we have been looking for malicious exit relays using active
   probing and a number of heuristics, but since it is inexpensive to
   keep setting up new exit relays, we are facing an uphill battle.

   Similar to the now-obsolete concept of exit enclaves, this proposal
   enables web services to express that Tor clients should prefer a
   predefined set of exit relays when connecting to the service.  We
   encourage sensitive sites to set up their own exit relays and have
   Tor clients prefer these relays, thus greatly mitigating the risk of
   man-in-the-middle attacks.

2. Design

2.1 Overview

   A simple analogy helps in explaining the concept behind exit relay
   pinning: HTTP Public Key Pinning (HPKP) allows web servers to express
   that browsers should pin certificates for a given time interval.
   Similarly, exit relay pinning (ERP) allows web servers to express
   that Tor Browser should prefer a predefined set of exit relays.  This
   makes it harder for malicious exit relays to be selected as last hop
   for a given website.

   Web servers advertise support for ERP in a new HTTP header that
   points to an ERP policy.  This policy contains one or more exit
   relays, and is signed by the respective relay's master identity key.
   Once Tor Browser has obtained a website's ERP policy, it will try to
   select the site's preferred exit relays for subsequent connections.
   The following subsections discuss this mechanism in greater detail.

2.2 Exit relay pinning header

   Web servers support ERP by advertising it in the "Tor-Exit-Pins" HTTP
   header.  The header contains two directives, "url" and "max-age":

     Tor-Exit-Pins: url="https://example.com/pins.txt"; max-age=2678400

   The "url" directive points to the full policy, which MUST be HTTPS.
   Tor Browser MUST NOT fetch the policy if it is not reachable over
   HTTPS.  Also, Tor Browser MUST abort the ERP procedure if the HTTPS
   certificate is not signed by a trusted authority.  The "max-age"
   directive determines the time in seconds for how long Tor Browser
   SHOULD cache the ERP policy.

   After seeing a Tor-Exit-Pins header in an HTTP response, Tor Browser
   MUST fetch and interpret the policy unless it already has it cached
   and the cached policy has not yet expired.
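
   A minimal sketch of parsing the header value follows. The function
   name is hypothetical, and treating directive names as
   case-insensitive is an assumption. The sketch splits on ";" and so
   would mishandle a URL that itself contains a semicolon.

```python
from urllib.parse import urlparse


def parse_tor_exit_pins(header_value):
    """Return (url, max_age_seconds) from a Tor-Exit-Pins header value."""
    directives = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, val = part.partition("=")
        if not sep:
            raise ValueError("malformed directive: %r" % part)
        directives[name.strip().lower()] = val.strip().strip('"')
    if "url" not in directives or "max-age" not in directives:
        raise ValueError("both url and max-age are required")
    url = directives["url"]
    # The policy MUST be served over HTTPS.
    if urlparse(url).scheme != "https":
        raise ValueError("policy URL is not HTTPS: %r" % url)
    max_age = int(directives["max-age"])
    if max_age < 0:
        raise ValueError("max-age must not be negative")
    return url, max_age
```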

2.3 Exit relay pinning policy

   An exit relay pinning policy MUST be formatted in JSON.  The root
   element is called "erp-policy" and it points to a list of pinned exit
   relays.  Each list element MUST contain two elements, "fingerprint"
   and "signature".  The "fingerprint" element points to the
   hex-encoded, uppercase, 40-digit fingerprint of an exit relay, e.g.,
   9B94CD0B7B8057EAF21BA7F023B7A1C8CA9CE645.  The "signature" element
   points to an Ed25519 signature, uppercase and hex-encoded.  The
   following JSON shows a conceptual example:

   {
     "erp-policy": [
       "start-policy",
       {
         "fingerprint": Fpr1,
         "signature": Sig_K1("erp-signature" || "example.com" || Fpr1)
       },
       {
         "fingerprint": Fpr2,
         "signature": Sig_K2("erp-signature" || "example.com" || Fpr2)
       },
       ...
       {
         "fingerprint": Fprn,
         "signature": Sig_Kn("erp-signature" || "example.com" || Fprn)
       },
       "end-policy"
     ]
   }

   Fpr refers to a relay's fingerprint as discussed above.  In the
   signature, K refers to a relay's master private identity key.  The ||
   operator refers to string concatenation, i.e., "foo" || "bar" results
   in "foobar".  "erp-signature" is a constant and denotes the purpose
   of the signature.  "start-policy" and "end-policy" are both constants
   and meant to prevent an adversary from serving a client only a
   partial list of pins.

   The signatures over fingerprint and domain are necessary to prove
   that an exit relay agrees to being pinned.  The website's domain --
   in this case example.com -- is part of the signature, so third
   parties such as evil.com cannot coerce exit relays they don't own to
   serve as their pinned exit relays.

   After having fetched an ERP policy, Tor Browser MUST first verify
   that the two constants "start-policy" and "end-policy" are present,
   and then validate the signature over all list elements.  If any
   element does not validate, Tor Browser MUST abort the ERP procedure.
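
   The validation steps above could be sketched as follows. The
   function names are hypothetical. Ed25519 verification itself is not
   shown: it is delegated to a caller-supplied verify() callable, and
   the ASCII byte encoding of the signed string is an assumption that
   this proposal does not pin down.

```python
import json

HEX_UPPER = set("0123456789ABCDEF")


def signed_message(domain, fingerprint):
    """Bytes covered by a pin signature: constant || domain || Fpr."""
    return ("erp-signature" + domain + fingerprint).encode("ascii")


def validate_erp_policy(policy_json, domain, verify):
    """Return the pinned fingerprints, or raise ValueError.

    verify(fingerprint, message, signature_hex) must return True only
    if the signature is valid under that relay's identity key.
    """
    doc = json.loads(policy_json)
    entries = doc.get("erp-policy") if isinstance(doc, dict) else None
    if not isinstance(entries, list) or len(entries) < 3:
        raise ValueError("missing or empty erp-policy list")
    # Both constants must be present to detect a truncated list.
    if entries[0] != "start-policy" or entries[-1] != "end-policy":
        raise ValueError("start-policy/end-policy markers missing")
    fingerprints = []
    for pin in entries[1:-1]:
        if not isinstance(pin, dict):
            raise ValueError("pin is not an object")
        fpr, sig = pin.get("fingerprint"), pin.get("signature")
        if not (isinstance(fpr, str) and len(fpr) == 40
                and set(fpr) <= HEX_UPPER):
            raise ValueError("bad fingerprint")
        if not isinstance(sig, str) or not verify(
                fpr, signed_message(domain, fpr), sig):
            raise ValueError("signature does not validate")
        fingerprints.append(fpr)
    return fingerprints
```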

   If an ERP policy contains more than one exit relay, Tor Browser MUST
   select one at random, weighted by its bandwidth.  That way, we can
   balance load across all pinned exit relays.

   Tor Browser could enforce the mapping from domain to exit relay by
   adding the following directive to its configuration file:

     MapAddress example.com example.com.Fpr_n.exit

2.4 Defending against malicious websites

   The purpose of exit relay pinning is to protect a website's users
   from malicious exit relays.  We must further protect the same users
   from the website, however, because it could abuse ERP to reduce a
   user's anonymity set.  The website could group users into
   arbitrarily-sized buckets by serving them different ERP policies on
   their first visit.  For example, the first Tor user could be pinned
   to exit relay A, the second user could be pinned to exit relay B,
   etc.  This would allow the website to link together the sessions of
   anonymous users.

   We cannot prevent websites from serving client-specific policies, but
   we can detect it by having Tor Browser fetch a website's ERP policy
   over multiple independent exit relays.  If the policies are not
   identical, Tor Browser MUST ignore the ERP policies.

   If Tor Browser attempted to fetch the ERP policy over n circuits
   as quickly as possible, the website would receive n connections
   within a narrow time interval, suggesting that all these connections
   originated from the same client.  To impede such time-based
   correlation attacks, Tor Browser MUST wait for a randomly determined
   time span before fetching the ERP policy.  Tor Browser SHOULD
   randomly sample a delay from an exponential distribution.  The
   disadvantage of this defence is that it can take a while until Tor
   Browser knows that it can trust an ERP policy.
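
   The defence described in this section could be sketched as below.
   The mean delay is a hypothetical tuning parameter (the proposal does
   not give one), and policies are compared as raw bytes, which is an
   assumption about how "identical" would be tested.

```python
import random
from typing import List, Optional


def sample_fetch_delays(n_circuits: int, mean_delay_s: float,
                        rng: Optional[random.Random] = None) -> List[float]:
    """Sample one exponentially distributed delay per circuit."""
    rng = rng or random.Random()
    # expovariate() takes the rate, i.e. 1 / mean.
    return [rng.expovariate(1.0 / mean_delay_s) for _ in range(n_circuits)]


def policies_are_identical(policies: List[bytes]) -> bool:
    """True only if every independently fetched policy is the same."""
    return len(policies) > 0 and all(p == policies[0] for p in policies)
```

   If policies_are_identical() returns False, the client would ignore
   the ERP policies, as required above.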

2.5 Design trade-offs

   We now briefly discuss alternative design decisions, and why we
   defined ERP the way we did.

   Instead of having a web server *tell* Tor Browser about pinned exit
   relays, we could have Tor Browser *ask* the web server, e.g., by
   making it fetch a predefined URL, similar to robots.txt.  We believe
   that this would involve too much overhead because only a tiny
   fraction of sites that Tor users visit will have an ERP policy.

   ERP implies that adversaries get to learn all the exit relays that
   users of a pinned site come from.  These exit relays could
   then become a target for traffic analysis or compromise.  Therefore,
   websites that pin exit relays SHOULD have a proper HTTPS setup and
   host their exit relays topologically close to the content servers, to
   mitigate the threat of network-level adversaries.

   It's possible to work around the bootstrapping problem (i.e., the
   very first website visit cannot use pinned exits) by having an
   infrastructure that allows us to pin exits out-of-band, e.g., by
   hard-coding them in Tor Browser, or by providing a lookup service
   prior to connecting to a site, but the additional complexity does not
   seem to justify the added security or reduced overhead.

2.6 Open questions

   o How should we deal with selective DoS or otherwise unavailable exit
     relays?  That is, what if an adversary takes pinned exit relays
     offline?  Should Tor Browser give up, or fall back to non-pinned
     exit relays that are potentially malicious?  Should we give site
     operators an option to express a fallback if they care more about
     availability than security?

   o Are there any aspects that are unnecessarily tricky to implement in
     Tor Browser?  If so, let's figure out how to make it easier to
     build.

   o Is a domain-level pinning granularity sufficient?

   o Should we use the Ed25519 master or signing key?

   o Can cached ERP policies survive a Tor Browser restart?  After all,
     we are not supposed to write to disk, and ERP policies are
     basically like a browsing history.

   o Should we have some notion of "freshness" in an ERP policy?  The
     problem is that an adversary could save my ERP policy for
     example.com, and if I ever give up example.com, the adversary could
     register it, and use my relays for pinning.  This could easily be
     mitigated by rotating my relay identity keys, and might not be that
     big a problem.

   o Should we support non-HTTP services?  For example, do we want to
     support, say, SSH?  And if so, how would we go about it?

   o HPKP also defines a "report-uri" directive to which errors should
     be reported.  Do we want something similar, so site operators can
     detect issues such as attempted DoS attacks?

   o It is wasteful to send a 60-70 byte header to all browsers while
     only a tiny fraction of them will want it.  Web servers could send
     the header only to IP addresses that run an exit relay, but that
     adds quite a bit of extra complexity.

   o We currently defend against malicious websites by fetching the ERP
     policy over several exit relays, spread over time.  In doing so, we
     are making assumptions on the number of visits the website sees.
     Is there a better solution that isn't significantly more complex?

Filename: 274-rotate-onion-keys-less.txt
Title: Rotate onion keys less frequently
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Closed
Implemented-In: 0.3.1.1-alpha

1. Overview

   This document proposes that, in order to limit the bandwidth needed
   for microdescriptor listing and transmission, we reduce the onion key
   rotation rate from the current value (7 days) to something closer to
   28 days.

   Doing this will reduce the total microdescriptor download volume
   by approximately 70%.

2. Motivation

   Currently, clients must download a networkstatus consensus document
   once an hour, and must download every unfamiliar microdescriptor
   listed in that document.  Therefore, we can reduce client directory
   bandwidth if we can cause microdescriptors to change less often.

   Furthermore, we are planning (in proposal 140) to implement a
   diff-based mechanism for clients to download only the parts of each
   consensus that have changed.  If we do that, then by having the
   microdescriptor for each router change less often, we can make these
   consensus diffs smaller as well.

3. Analysis

   I analyzed microdescriptor changes over the month of January
   2017, and found that 94.5% of all microdescriptor transitions
   were changes in onion key alone.

   Therefore, we could reduce the number of changed "m" lines in
   consensus diffs by approximately 94.5% * (3/4) =~ 70%,
   if we were to rotate onion keys one-fourth as often.

   The number of microdescriptors to actually download should
   decrease by a similar number.

   This amounts to a significant reduction: currently, by
   back-of-the-envelope estimates, an always-on client that downloads
   all the directory info in a month downloads about 449MB of compressed
   consensuses and something around 97 MB of compressed
   microdescriptors.  This proposal would save that user about 12% of
   their total directory bandwidth.

   If we assume that consensus diffs are implemented (see proposal 140),
   then the user's compressed consensus downloads fall to something
   closer to 27 MB.  Under that analysis, the microdescriptors will
   dominate again at 97 MB -- so lowering the number of microdescriptors
   to fetch would save more like 55% of the remaining bandwidth.

   [Back-of-the-envelope technique: assume every consensus is
   downloaded, and every microdesc is downloaded, and microdescs are
   downloaded in groups of 61, which works out to a constant rate.]
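
   The arithmetic behind the percentages in this section can be
   reproduced with a few lines.  All inputs are the estimates quoted
   above; the variable names are only for illustration.

```python
# Monthly download estimates quoted in this section, in MB.
CONSENSUS_MB = 449.0             # full consensuses, no diffs
CONSENSUS_WITH_DIFFS_MB = 27.0   # if proposal 140 is implemented
MICRODESC_MB = 97.0

ONION_KEY_SHARE = 0.945   # transitions that change only the onion key
ROTATION_FACTOR = 4       # rotate one-fourth as often (7 -> 28 days)

# Fraction of microdescriptor changes avoided: 94.5% * (3/4).
md_reduction = ONION_KEY_SHARE * (1 - 1 / ROTATION_FACTOR)
saved_mb = MICRODESC_MB * md_reduction

saving_without_diffs = saved_mb / (CONSENSUS_MB + MICRODESC_MB)
saving_with_diffs = saved_mb / (CONSENSUS_WITH_DIFFS_MB + MICRODESC_MB)

print(f"microdescriptor reduction: {md_reduction:.1%}")
print(f"total saving without consensus diffs: {saving_without_diffs:.1%}")
print(f"total saving with consensus diffs: {saving_with_diffs:.1%}")
```

   The results come out at roughly 71%, 13%, and 55%, in line with the
   rounded figures given in the text.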

   We'll need to do more analysis to assess the impact on clients that
   connect to the network infrequently enough to miss microdescriptors:
   nonetheless, the 70% figure above ought to apply to clients that connect
   at least weekly.

   (XXXX Better results pending feedback from ahf's analysis.)

4. Security analysis

   The onion key is used to authenticate a relay to a client when the
   client is building a circuit through that relay.  The only reason to
   limit their lifetime is to limit the impact if an attacker steals an
   onion key without being detected.

   If an attacker steals an onion key and is detected, the relay can
   issue a new onion key ahead of schedule, with little disruption.

   But if the onion key theft is _not_ detected, then the attacker
   can use that onion key to impersonate the relay until clients
   start using the relay's next key.  In order to do so, the
   attacker must also impersonate the target relay at the link
   layer: either by stealing the relay's link keys, which rotate
   more frequently, or by compromising the previous relay in the
   circuit.

   Therefore, onion key rotation provides a small amount of
   protection only against an attacker who can compromise relay keys
   very intermittently, and who controls only a small portion of the
   network.  Against an attacker who can steal keys regularly it
   does little, and an attacker who controls a lot of the network
   can already mount other attacks.

5. Proposal

   I propose that we move the default onion key rotation interval
   from 7 days to 28 days, as follows.

   There should be a new consensus parameter, "onion-key-rotation-days",
   measuring the key lifetime in days.  Its minimum should be 1, its
   maximum should be 90, and its default should be 28.

   There should also be a new consensus parameter,
   "onion-key-grace-period-days", measuring the interval for which
   older onion keys should still be accepted.  Its minimum should be
   1, its maximum should be onion-key-rotation-days, and its default
   should be 7.

   Every relay should list each onion key it generates for
   onion-key-rotation-days days after generating it, and then
   replace it.  Relays should continue to accept their most recent
   previous onion key for an additional onion-key-grace-period-days
   days after it is replaced.
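
   As a rough sketch of the timeline this implies for a single key (the
   function and variable names are hypothetical; the bounds and defaults
   are the ones given above):

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def clamp(value: int, lo: int, hi: int) -> int:
    """Restrict a consensus parameter to its allowed range."""
    return max(lo, min(hi, value))


def onion_key_schedule(generated: datetime,
                       rotation_days: int = 28,
                       grace_days: int = 7) -> Tuple[datetime, datetime]:
    """Return (time the key is replaced, time it stops being accepted)."""
    rotation = clamp(rotation_days, 1, 90)
    # The grace period may not exceed the rotation interval.
    grace = clamp(grace_days, 1, rotation)
    replaced_at = generated + timedelta(days=rotation)
    accepted_until = replaced_at + timedelta(days=grace)
    return replaced_at, accepted_until


start = datetime(2017, 3, 1, tzinfo=timezone.utc)
replaced_at, accepted_until = onion_key_schedule(start)
print(replaced_at.date(), accepted_until.date())
```

   With the defaults, a key is listed for 28 days and remains usable for
   circuit extension for 35 days in total.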

Filename: 275-md-published-time-is-silly.txt
Title: Stop including meaningful "published" time in microdescriptor consensus
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Closed
Target: 0.3.1.x-alpha
Implemented-In: 0.4.8.1-alpha

0. Status:

   As of 0.2.9.11 / 0.3.0.7 / 0.3.1.1-alpha, Tor no longer takes any
   special action on "future" published times, as proposed in section 4.

   As of 0.4.0.1-alpha, we implemented a better mechanism for relays to know
   when to publish. (See proposal 293.)

1. Overview

   This document proposes that, in order to limit the bandwidth needed
   for networkstatus diffs, we remove the "published" part of the "r"
   lines in microdescriptor consensuses.

   The more extreme, compatibility-breaking version of this idea will
   reduce ed consensus diff download volume by approximately 55-75%.  A
   less-extreme interim version would still reduce volume by
   approximately 5-6%.

2. Motivation

   The current microdescriptor consensus "r" line format is:
     r Nickname Identity Published IP ORPort DirPort
   as in:
     r moria1 lpXfw1/+uGEym58asExGOXAgzjE 2017-01-10 07:59:25 \
        128.31.0.34 9101 9131

   As I'll show below, there's not much use for the "Published" part
   of these lines.  By omitting them or replacing them with
   something more compressible, we can save space.

   What's more, changes in the Published field are one of the most
   frequent changes between successive networkstatus consensus
   documents.  If we were to remove this field, then networkstatus diffs
   (see proposal 140) would be smaller.

3. Compatibility notes

   Above I've talked about "removing" the published field.  But of
   course, doing this would make all existing consensus consumers
   stop parsing the consensus successfully.

   Instead, let's look at how this field is used currently in Tor,
   and see if we can replace the value with something else.

      * Published is used in the voting process to decide which
        descriptor should be considered.  But that is taken from
        vote networkstatus documents, not consensuses.

      * Published is used in mark_my_descriptor_dirty_if_too_old()
        to decide whether to upload a new router descriptor.  If the
        published time in the consensus is more than 18 hours in the
        past, we upload a new descriptor.  (Relays are potentially
        looking at the microdesc consensus now, since #6769 was
        merged in 0.3.0.1-alpha.)  Relays have plenty of other ways
        to notice that they should upload new descriptors.

      * Published is used in client_would_use_router() to decide
        whether a routerstatus is one that we might possibly use.
        We say that a routerstatus is not usable if its published
        time is more than OLD_ROUTER_DESC_MAX_AGE (5 days) in the
        past, or if it is not at least
        TestingEstimatedDescriptorPropagationTime (10 minutes) in
        the future. [***] Note that this is the only case where anything
        is rejected because it comes from the future.

          * client_would_use_router() decides whether we should
            download a router descriptor (not a microdescriptor)
            in routerlist.c

          * client_would_use_router() is used from
            count_usable_descriptors() to decide which relays are
            potentially usable, thereby forming the denominator of
            our "have descriptors / usable relays" fraction.

   So we have fairly limited constraints on which Published values
   we can safely advertise with today's Tor implementations.  If we
   advertise anything more than 10 minutes in the future,
   client_would_use_router() will consider routerstatuses unusable.
   If we advertise anything more than 18 hours in the past, relays
   will upload their descriptors far too often.

4. Proposal

   Immediately, in 0.2.9.x-stable (our LTS release series), we
   should stop caring about published_on dates in the future.  This
   is a two-line change.

   As an interim solution: We should add a new consensus method number
   that changes the process by which Published fields in consensuses are
   generated.  It should set all Published fields in the consensus
   to be the same value.  These fields should be taken to rotate
   every 15 hours, by taking consensus valid-after time, and rounding
   down to the nearest multiple of 15 hours since the epoch.
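
   A minimal sketch of the interim rounding rule, assuming times are
   handled as integer seconds since the Unix epoch (the function name is
   hypothetical):

```python
ROTATION_PERIOD = 15 * 60 * 60  # 15 hours, in seconds


def interim_published_time(valid_after: int) -> int:
    """Round valid-after down to a multiple of 15 hours since the epoch."""
    return valid_after - (valid_after % ROTATION_PERIOD)


# Consensuses produced an hour apart usually share the same value, so
# the Published field changes only once every 15 hours.
print(interim_published_time(1484035200))
print(interim_published_time(1484035200 + 3600))
```

   Because every "r" line carries the same value, the field compresses
   well and changes in at most one consensus out of every fifteen hourly
   ones.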

   As a longer-term solution: Once all Tor versions earlier than 0.2.9.x
   are obsolete (in mid 2018), we can update with a new consensus
   method, and set the published_on date to some safe time in the
   future.

5. Analysis

   To consider the impact on consensus diffs: I analyzed consensus
   changes over the month of January 2017, using scripts at [1].

   With the interim solution in place, compressed diff sizes fell by
   2-7% at all measured intervals except 12 hours, where they increased
   by about 4%.  Savings of 5-6% were most typical.

   With the longer-term solution in place, and all published times held
   constant permanently, the compressed diff sizes were uniformly at
   least 56% smaller.

   With this in mind, I think we might want to only plan to support the
   longer-term solution.

    [1] https://github.com/nmathewson/consensus-diff-analysis



Filename: 276-lower-bw-granularity.txt
Title: Report bandwidth with lower granularity in consensus documents
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Dead
Target: 0.3.1.x-alpha

   [NOTE: We're calling this proposal dead for now: the benefits are small
   compared to the possible loss in routing correctness.  If/when proposal
   300 is built, it will have even less benefit. (2020 July 31)]


1. Overview

   This document proposes that, in order to limit the bandwidth needed for
   networkstatus diffs, we lower the granularity with which bandwidth is
   reported in consensus documents.

   Making this change will reduce the total compressed ed diff download
   volume by around 10%.

2. Motivation

   Consensus documents currently report bandwidth values as the median
   of the measured bandwidth values in the votes.  (Or as the median of
   all votes' values if there are not enough measurements.)  And when
   voting, in turn, authorities simply report whatever measured value
   they most recently encountered, clipped to 3 significant base-10
   figures.

   This means that, from one consensus to the next, these weights change
   very often and with little significance:  A large fraction of
   bandwidth transitions are under 2% in magnitude.

   As we begin to use consensus diffs, each change will take space to
   transmit.  So lowering the number of changes will lower client
   bandwidth requirements significantly.

3. Proposal

   I propose that we round the bandwidth values, as they are placed in
   votes, to no more than two significant digits.  In addition, for
   values beginning with decimal "2" through "4", we should round the
   first two digits to the nearest multiple of 2.  For values beginning
   with decimal "5" through "9", we should round to the nearest multiple
   of 5.
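
   One possible reading of this rule, as a sketch: the rounding step is
   1, 2, or 5 units of the second significant digit, depending on the
   leading digit.  Rounding halves upward and leaving single-digit
   values untouched are assumptions of this sketch, not part of the
   proposal.

```python
def round_bandwidth(value: int) -> int:
    """Round a non-negative bandwidth value as sketched above."""
    if value < 10:
        return value  # assumption: nothing to round below two digits
    text = str(value)
    scale = 10 ** (len(text) - 2)  # weight of the second significant digit
    leading = int(text[0])
    if leading >= 5:
        step = 5 * scale
    elif leading >= 2:
        step = 2 * scale
    else:
        step = scale
    # Integer round-half-up to the nearest multiple of step.
    return (value + step // 2) // step * step


for sample in (1234, 2345, 5678):
    print(sample, "->", round_bandwidth(sample))
```

   Under this reading, the rounding error stays within 5% of the
   original value for every leading digit.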

   The change will take effect progressively as authorities upgrade: since
   the median value is used, when one authority upgrades, 1/5 of the
   bandwidths will be rounded (on average).

   Once all authorities upgrade, all bandwidths will be rounded like this.

4. Analysis

   The rounding proposed above will not round any value by more than 5% more
   than current rounding, so the overall impact on bandwidth balancing should
   be small.

   In order to assess the bandwidth savings of this approach, I
   smoothed the January 2017 consensus documents' Bandwidth fields,
   using scripts from [1].  I found that if clients download
   consensus diffs once an hour, they can expect 11-13% mean savings
   after xz or gz compression.  For two-hour intervals, the savings
   is 8-10%; for three-hour or four-hour intervals, the savings is
   only 6-8%.  After that point, we start seeing diminishing returns,
   with only 1-2% savings on a 72-hour interval's diff.

    [1] https://github.com/nmathewson/consensus-diff-analysis

5. Open questions:

   Is there a greedier smoothing algorithm that would produce better
   results?

   Is there any reason to think this amount of smoothing would not
   be safe?

   Would a time-aware smoothing mechanism work better?

Filename: 277-detect-id-sharing.txt
Title: Detect multiple relay instances running with same ID
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Open
Target: 0.3.??

1. Overview

   This document proposes that we detect multiple relay instances running
   with the same ID, and block them all, or block all but one of each.

2. Motivation

   While analyzing microdescriptor and relay status transitions (see
   proposal XXXX), I found that something like 16/10631 router
   identities from January 2017 were apparently shared by two or
   more relays, based on their excessive number of onion key
   transitions.  This is probably accidental: and if intentional,
   it's probably not achieving whatever the relay operators
   intended.

   Sharing identities causes all the relays in question to "flip" back
   and forth onto the network, depending on which one uploaded its
   descriptor most recently.  One relay's address will be listed; and
   so will that relay's onion key.  Routers connected to one of the
   other relays will believe its identity, but be suspicious of its
   address.  Attempts to extend to the relay will fail because of the
   incorrect onion key.  No more than one of the relays' bandwidths will
   actually get significant use.

   So clearly, it would be best to prevent this.

3. Proposal 1: relay-side detection

   Relays should themselves try to detect whether another relay is using
   its identity.  If a relay, while running, finds that it is listed in
   a fresh consensus using an onion key other than its current or
   previous onion key, it should tell its operator about the problem.

   (This proposal borrows from Mike Perry's ideas related to key theft
   detection.)

4. Proposal 2: offline detection

   Any relay that has a large number of onion-key transitions over time,
   but only a small number of distinct onion keys, is probably two or
   more relays in conflict with one another.

   In this case, the operators can be contacted, or the relay
   blacklisted.
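
   The offline heuristic could be sketched as follows.  The input is
   assumed to be the sequence of onion keys listed for one identity in
   successive consensuses; both thresholds are hypothetical and would
   need tuning against real data.

```python
from typing import Sequence


def looks_like_shared_identity(onion_keys: Sequence[str],
                               min_transitions: int = 10,
                               max_distinct_ratio: float = 0.5) -> bool:
    """Flag an identity with many key transitions but few distinct keys."""
    transitions = sum(1 for prev, cur in zip(onion_keys, onion_keys[1:])
                      if prev != cur)
    distinct = len(set(onion_keys))
    # A single relay that rotates normally shows a new key at every
    # transition; relays in conflict keep flipping between the same keys.
    return (transitions >= min_transitions
            and distinct <= transitions * max_distinct_ratio)
```

   For example, a history that alternates between two keys is flagged,
   while a history in which every transition introduces a fresh key is
   not.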

   We could build support for blacklisting all but one of the addresses,
   but it's probably best to treat this as a misconfiguration serious
   enough that it needs to be resolved.




Filename: 278-directory-compression-scheme-negotiation.txt
Title: Directory Compression Scheme Negotiation
Author: Alexander Færøy
Created: 2017-03-06
Status: Closed
Implemented-In: 0.3.1.1-alpha

0. Overview

  This document describes a method to provide and use different
  compression schemes in Tor's directory specification[0] and leave it
  up to the client and server to negotiate a mutually supported scheme
  using the semantics of the HTTP protocol.

  Furthermore this proposal also extends Tor's directory protocol with
  support for the LZMA and Zstandard compression schemes.

1. Motivation

  Currently Tor serves each directory client with its different document
  flavours in either an uncompressed format or, if the client adds a
  ".z"-suffix to the URL file path, a zlib-compressed document.

  This has historically been unproblematic, but it prevents us from
  easily extending the set of supported compression schemes.

  Some of the problems this proposal is trying to address:

    - We currently only support zlib-based compression schemes and there
      is no way for directory servers or clients to announce which
      compression schemes they support. Zlib might not be the ideal
      compression scheme for all purposes.

    - It is not easily possible to add support for additional
      compression schemes without adding additional file extensions or
      flavours of the directory documents.

    - In low-bandwidth and/or low-memory client scenarios it is useful
      to be able to limit the number of supported compression schemes,
      so that a client supports only the most efficient compression
      scheme for the given use-case while the directory servers support
      the most commonly available compression schemes used throughout
      the network.

    - We add support for the LZMA compression scheme, which yields
      better compressed size and decompression time at the expense of
      higher compression time and higher memory usage.

    - We add support for the Zstandard compression scheme, which yields
      better compression ratio than GZip, but slightly worse than LZMA,
      but with a smaller CPU and memory footprint than LZMA.

2. Analysis

  We investigated the compression ratio, memory usage, memory allocation
  strategies, and execution time for compression and decompression of
  the GZip, BZip2, LZMA, and Zstandard compression schemes at
  compression levels 1 through 9.

  The data used in this analysis can be found in [1] and the `bench`
  tool for generating the data can be found in [2].

  During the preparation of this proposal, Nick analysed compressing
  consensus diffs using GZip, LZMA, and Zstandard. The result of
  Nick's analysis can be found in [3].

  We must continue to support "gzip", "deflate", and "identity",
  which are the currently available compression schemes in the Tor
  network.

  To further enhance the compression ratio, Nick has also worked on
  proposals #274 (Rotate onion keys less frequently), #275 (Stop
  including meaningful "published" time in microdescriptor consensus),
  #276 (Report bandwidth with lower granularity in consensus documents),
  and #277 (Detect multiple relay instances running with same ID), which
  all aid in making our consensus documents less dynamic.

3. Proposal

  We extend directory client requests to include the "Accept-Encoding"
  header. The "Accept-Encoding" header should contain a comma-separated
  list of the names of the compression schemes that the client
  supports.

  For example:

    GET / HTTP/1.0
    Accept-Encoding: x-zstd, x-tor-lzma, gzip, deflate

  When a directory server receives a request with the "Accept-Encoding"
  header included, for either the ".z" compressed or the uncompressed
  version of any given document, it must decide on a mutually supported
  compression scheme and add the "Content-Encoding" header to its
  response, thus notifying the client of its decision. The
  "Content-Encoding" header can contain at most one supported
  compression scheme. If no mutual compression scheme can be negotiated,
  the server must respond with an HTTP error status code of 406
  "Not Acceptable".

  For example:

    HTTP/1.0 200 OK
    Content-Length: 1337
    Connection: close
    Content-Encoding: x-zstd

  Currently supported compression scheme names include "identity",
  "gzip", and "deflate". This proposal adds two additional compression
  schemes named "x-tor-lzma" (LZMA) and "x-zstd" (Zstandard).

  All compression scheme names are case-insensitive.

  The "deflate", "gzip", and "identity" compression schemes must be
  supported by directory servers for backwards compatibility.

  We use the name "x-tor-lzma" instead of just "x-lzma" because we
  require a defined upper bound of memory usage that is available for
  decompression of LZMA compressed data. The upper bound for memory
  available for LZMA decompression is defined as 16 MB. This currently
  means that we will not use the LZMA compression scheme with a "preset"
  value higher than 6.
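
  To illustrate what a decompression memory bound looks like in
  practice, here is a sketch using the `memlimit` parameter of Python's
  standard `lzma` module.  It is not Tor's implementation, and the
  function name is hypothetical; only the 16 MB figure comes from this
  proposal.

```python
import lzma

LZMA_MEMLIMIT = 16 * 1024 * 1024  # 16 MB bound from this proposal


def decompress_bounded(blob: bytes) -> bytes:
    """Decompress LZMA/XZ data, refusing streams that need too much memory."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_AUTO,
                                         memlimit=LZMA_MEMLIMIT)
    # Raises lzma.LZMAError if the stream's dictionary size would push
    # the decoder's memory usage past the limit.
    return decompressor.decompress(blob)


payload = b"directory document " * 64
print(decompress_bounded(lzma.compress(payload, preset=6)) == payload)
```

  Data compressed with a higher preset declares a larger dictionary, and
  a decoder configured this way rejects it instead of allocating the
  memory.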

  Additionally, when a client that supports this proposal makes a
  request for a directory document with the ".z"-suffix, it must send an
  ordered set of supported compression schemes where the last elements
  in the set are compression schemes that are supported by all of
  the currently available Tor nodes ("gzip", "deflate", "identity"). In
  this way older relays will simply respond with the document compressed
  using zlib deflate without any prior knowledge of the newly added
  compression schemes.

  If a directory server receives a request for a document with the ".z"
  suffix, where the client does not include an "Accept-Encoding" header,
  the server should respond with the zlib compressed version of the
  document for backwards compatibility with clients that do not support
  this proposal.
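
  The server-side rules in this section could be sketched as below.
  Choosing the first mutually supported scheme in the client's order and
  ignoring HTTP q-values are assumptions of this sketch; the proposal
  only requires that a mutually supported scheme be chosen.

```python
from typing import Optional, Tuple

# Schemes this hypothetical server can produce.
SERVER_SCHEMES = ("x-zstd", "x-tor-lzma", "gzip", "deflate", "identity")


def negotiate(accept_encoding: Optional[str],
              wants_z: bool) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Content-Encoding) for a directory request."""
    if accept_encoding is None:
        # Legacy client: ".z" means zlib, otherwise uncompressed.
        return 200, ("deflate" if wants_z else "identity")
    # Scheme names are case-insensitive; drop any ";q=" parameters.
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    for name in offered:
        if name in SERVER_SCHEMES:
            return 200, name
    return 406, None  # "Not Acceptable"


print(negotiate("x-zstd, x-tor-lzma, gzip, deflate", wants_z=True))
print(negotiate(None, wants_z=True))
```

  An older server that ignores "Accept-Encoding" would behave like the
  legacy branch, which is why new clients list "gzip", "deflate", and
  "identity" last.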

  The "Content-Length" header contains the number of compressed bytes
  sent to the client.

  The new compression schemes will be available for directory clients
  over both clearnet and BEGIN_DIR-style connections.

4. Security Implications

4.1 Compression and Decompression Bombs

  We currently detect compression and decompression "bombs" and must
  continue to do so with any additional compression schemes that we add.

  The detection of compression and decompression bombs is handled in
  `is_compression_bomb()` in torgzip.c and the same functionality is
  used both for compression and decompression. This functionality must
  be extended to support LZMA and Zstandard.

4.2 Detection of Compression Algorithms

  To ensure that we do not pass compressed data through the incorrect
  decompression handler, when we have received data from another peer,
  Tor tries to detect the compression scheme in
  `detect_compression_method()` in torgzip.c. This function should be
  extended to also detect the LZMA and Zstandard formats. Possible
  references for implementing this detection are xz-tools, zstd's CLI,
  and the libmagic 'compress' module.
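
  A sketch of detection by leading "magic" bytes, for illustration.  It
  is not the code in torgzip.c; in particular, which LZMA container
  (".xz" or the older "alone" format) would be used on the wire is not
  stated here, so the sketch recognises both.

```python
import gzip
import lzma
import zlib


def detect_compression(data: bytes) -> str:
    """Guess the compression format from the first bytes of a stream."""
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    if data[:4] == b"\x28\xb5\x2f\xfd":
        return "zstd"  # Zstandard frame magic number
    if data[:6] == b"\xfd7zXZ\x00":
        return "xz"
    if data[:3] == b"\x5d\x00\x00":
        return "lzma"  # typical header of the "alone" format
    # zlib: compression method 8, and the 16-bit header divisible by 31.
    if (len(data) >= 2 and (data[0] & 0x0F) == 8
            and ((data[0] << 8) | data[1]) % 31 == 0):
        return "zlib"
    return "unknown"


sample = b"network-status-version 3 microdesc\n"
print(detect_compression(gzip.compress(sample)))
print(detect_compression(zlib.compress(sample)))
print(detect_compression(lzma.compress(sample)))
```

  Magic-byte checks are only a heuristic: uncompressed data can begin
  with the same bytes, so a caller still has to handle decompression
  errors.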

4.3 Fingerprinting

  All clients should aim to support the same set of compression
  schemes to avoid fingerprinting.

5. Compatibility

  This proposal does not break any backwards compatibility.

  Tor will continue to support serving uncompressed and zlib-compressed
  objects using the method defined in the directory specification[0],
  but will allow newer clients to negotiate a mutually supported
  compression scheme.

6. Performance and Scalability

  Each newly added compression scheme adds to the compression cache of a
  relay, which increases the memory requirements of a relay.

  The LZMA compression scheme yields better compression ratio at the
  expense of higher memory and CPU requirements for compression and
  slightly higher memory and CPU requirements for decompression.

  The Zstandard compression scheme yields better compression ratio than
  GZip does, but does not suffer from the same high CPU and memory
  requirements for compression as LZMA does.

  Because of the high requirements for CPU and memory usage for LZMA it
  is possible that we do not support this scheme for all available
  documents or that we only support it in situations where it is
  possible to pre-compute and cache the compressed document.

7. References

  [0]: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
  [1]: https://docs.google.com/spreadsheets/d/1devQlUOzMPStqUl9mPawFWP99xSsRM8xWv7DNcqjFdo
  [2]: https://gitlab.com/ahf/tor-sponsor4-compression
  [3]: https://github.com/nmathewson/consensus-diff-analysis

8. Acknowledgements

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

Filename: 279-naming-layer-api.txt
Title: A Name System API for Tor Onion Services
Author: George Kadianakis, Yawning Angel, David Goulet
Created: 04-Oct-2016
Status: Needs-Revision

  Table Of Contents:

			1. Introduction
				1.1. Motivation
				1.2. Design overview and rationale
			2. System Specification
				2.1. System overview [SYSTEMOVERVIEW]
				2.2. System Illustration
				2.3. System configuration [TORRC]
					2.3.1. Tor name resolution logic
				2.4. Name system initialization [INITPROTOCOL]
				2.5. Name resolution using NS API
					2.5.1. Message format
					2.5.2. RESOLVED status codes
					2.5.3. Further name resolution behavior
				2.6. Cancelling a name resolution request
				2.7. Launching name plugins [INITENVVARS]
				2.8. Name plugin workflow [NSBEHAVIOR]
					2.8.1. Name plugin shutdown [NSSHUTDOWN]
				2.9. Further details of stdin/stdout communication
					2.9.1. Message Format
			3. Discussion
				3.1. Using second-level domains instead of tld
				3.2. Name plugins handling all tlds '*'
				3.3. Deployment strategy
				3.4. Miscellaneous discussion topics
			4. Acknowledgements
			A.1: Example communication Tor <-> name plugin [PROTOEXAMPLE]
			A.2: Example plugins [PLUGINEXAMPLES]

1. Introduction

   This proposal specifies a modular API for integrating name systems with Tor.

1.1. Motivation

   Tor onion service addresses are decentralized and self-authenticated but
   they are not human-memorable (e.g. 3g2upl4pq6kufc4m.onion). This is a source
   of poor usability, since Internet users are familiar with the convenient
   naming of DNS and are not used to addresses being random text.

   In particular, onion addresses are currently composed of 16 random base32
   characters, and they look like this:

                      3g2upl4pq6kufc4m.onion
                      vwakviie2ienjx7t.onion
                      idnxcnkne4qt76tg.onion
                      vwakviie2ienjx6t.onion

   When Proposal 224 gets deployed, onion addresses will become even
   bigger: 53 base32 characters. That's:

        llamanymityx4fi3l6x2gyzmtmgxjyqyorj9qsb5r543izcwymle.onion
        lfels7g3rbceenuuqmpsz45z3lswakqf56n5i3bvqhc22d5rrsza.onion
        odmmeotgcfx65l5hn6ejkaruvai222vs7o7tmtllszqk5xbysola.onion
        qw3yvgovok3dsarhqephpu2pkiwzjdk2fgdfwwf3tb3vgzxr5kba.onion

   Over the years Tor users have come up with various ad-hoc ways of handling
   onion addresses. For example people memorize them, or use third-party
   centralized directories, or just google them every time.

   We believe that the UX problem of non-human-memorable addresses is not
   actually solved with the above ad-hoc solutions and remains a critical
   usability barrier that prevents onion services from being used by a wider
   audience.

1.2. Design overview and rationale

   During the past years there has been lots of research on secure naming and
   various such systems have been proposed (e.g. GNS, Namecoin, etc.).

   Turns out securely naming things is a very hard research problem, and hence
   none of the proposed systems is a clear winner: all of them carry various
   trade-offs. Furthermore, none of the proposed systems has seen widespread use
   so far, which makes it even harder to pick a single project.

   Given the plenitude of options, one approach to decide which system
   is best is to make various decent name systems available and let the
   Tor community and the sands of time pick the winner. Also, it might
   be that there is no single winner, and perhaps different specialized
   name systems should be used in different situations. We believe that
   by getting secure name systems actually utilized by real users,
   the whole field will mature and existing systems will get battle-hardened.

   Hence, the contribution of this proposal is a modular Name System API
   (NS API) that allows developers to integrate their own name systems in
   Tor. The interface design is name-system-agnostic, and it's heavily
   based on the pluggable transports API (proposal 180). It should be
   flexible enough to accommodate all sorts of name systems (see [PLUGINEXAMPLES]).

2. System Specification

   A developer that wants to integrate a name system with Tor needs to first
   write a wrapper that understands the Tor Name System API (NS API). Using the
   Name System API, Tor asks the name system to perform name queries, and
   receives the query results. The NS API works using environment variables and
   stdin/stdout communication. It aims to be portable and easy to implement.

2.1. System overview [SYSTEMOVERVIEW]

   Here is an overview of the Tor name system:

   Alice, a Tor user, can activate various name systems by editing her
   torrc file and specifying which tld each name system is responsible
   for. For this section, let's consider a simple fictional name system,
   unicorn, which magically maps domains with the .corn tld to the
   correct onion address. Here it is:

       OnionNamePlugin 0 .corn   /usr/local/bin/unicorn

   After Alice enables the unicorn plugin, she attempts connecting to
   elephantforum.corn. Tor will intercept the SOCKS request, and use the
   executable at /usr/local/bin/unicorn to query the unicorn name system
   for elephantforum.corn. Tor communicates with the unicorn plugin
   using the Tor NS API through which name queries and their results can
   be transported using stdin/stdout.

   If elephantforum.corn corresponds to an onion address in the unicorn
   name system, unicorn should return the onion address to Tor using the
   Tor NS API. Tor must then internally rewrite the elephantforum.corn
   address to the actual onion address, and initiate a connection to it.

2.2. System Illustration

   Here is a diagram illustrating how the Tor Name System API works. The name
   system used in this example is GNS, but there is nothing GNS-specific here
   and GNS could be swapped for any other name system (like hosts files, or
   Namecoin).

   The example below illustrates how a user who types debian.zkey in their Tor
   browser gets redirected to sejnfjrq6szgca7v.onion after Tor consults the GNS
   network.

   Please revisit this illustration after reading the rest of the proposal.

       |                                    $~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$
       |  1.                                $          4. GNS magic!!              $
       |  User: SOCKS CONNECT to            $ debian.zkey -> sejnfjrq6szgca7v.onion$
       |        http://debian.zkey/         $~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$~~~~~~~$
       |                                                                   $
 +-----|-----------------------------------------+                         $
 |+----v-----+     2.                 +---------+|       3.                $
 ||Tor       |     debian.zkey        |Tor      ||       debian.zkey     +-$-------+
 ||Networking------------------------->Naming   ------------------------->         |
 ||Submodule |                        |Submodule||  Tor Name System API  |  GNS    |
 ||          <-------------------------         <-------------------------  wrapper|
 ||          | 6.                     |         ||5.                     |         |
 |+----|-----+ sejnfjrq6szgca7v.onion +---------+|sejnfjrq6szgca7v.onion +---------+
 +-----|-----------------------------------------+
       |  7.
       |  Tor: Connect to
       |       http://sejnfjrq6szgca7v.onion/
       v


2.3. System configuration [TORRC]

   As demonstrated in [SYSTEMOVERVIEW], a Tor user who wants to use a name
   system has to edit their configuration file appropriately. Here is the torrc
   line format:

       OnionNamePlugin <priority> <tld> <path>

   where <priority> is a non-negative integer denoting the priority with
   which this name plugin should be consulted. <tld> is a string which
   restricts the scope of this plugin to a particular tld.  Finally, <path>
   is a filesystem path to an executable that speaks the Tor Name System API
   and can act as an intermediary between Tor and the name system.

   For example here is a snippet from a torrc file:
       OnionNamePlugin 0 .hosts      /usr/local/bin/local-hosts-file
       OnionNamePlugin 1 .zkey       /usr/local/bin/gns-tor-wrapper
       OnionNamePlugin 2 .bit        /usr/local/bin/namecoin-tor-wrapper
       OnionNamePlugin 3 .scallion   /usr/local/bin/community-hosts-file

2.3.1. Tor name resolution logic

   When Tor receives a SOCKS request to an address that has a name
   plugin assigned to it, it needs to perform a query for that address
   using that name plugin.

   If there are multiple name plugins that correspond to the requested
   address, Tor queries all relevant plugins sorted by their priority
   value, until one of them returns a successful result. If two plugins
   have the same priority value, Tor MUST abort.

   If all plugins fail to successfully perform the name resolution, Tor SHOULD
   default to using the exit node for name resolution.
   XXX or not?  because of leaks?
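
   The configuration format and the resolution order described above can be
   sketched in Python as follows. This is only an illustration: the names
   NamePlugin, parse_onion_name_plugins and plugins_for are hypothetical, and
   the sketch assumes that lower priority numbers are consulted first and that
   the "same priority" check applies to the plugins matching one address;
   the proposal does not settle either point.

```python
from typing import NamedTuple


class NamePlugin(NamedTuple):
    priority: int
    tld: str
    path: str


def parse_onion_name_plugins(torrc_text):
    """Collect 'OnionNamePlugin <priority> <tld> <path>' lines from a torrc."""
    plugins = []
    for line in torrc_text.splitlines():
        fields = line.split(None, 3)
        if len(fields) != 4 or fields[0] != "OnionNamePlugin":
            continue
        # Normalize the tld so that "corn" and ".corn" behave the same way.
        tld = "." + fields[2].lower().lstrip(".")
        plugins.append(NamePlugin(int(fields[1]), tld, fields[3].strip()))
    return plugins


def plugins_for(address, plugins):
    """Return the plugins responsible for `address`, in query order."""
    matching = [p for p in plugins if address.lower().endswith(p.tld)]
    priorities = [p.priority for p in matching]
    if len(priorities) != len(set(priorities)):
        # The proposal says Tor MUST abort when two plugins share a priority.
        raise ValueError("two matching plugins share a priority value")
    # Assumption: a lower number means the plugin is consulted earlier.
    return sorted(matching, key=lambda p: p.priority)
```

   A caller would try each returned plugin in turn and stop at the first
   successful result.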

2.4. Name system initialization [INITPROTOCOL]

   When Tor finds OnionNamePlugin lines in torrc, it launches and initializes
   their respective executables.

   When launching name plugins, Tor sets various environment variables to pass
   data to the name plugin (e.g. NS API version, state directory, etc.). More
   information on the environment variables at [INITENVVARS].

   After a name plugin initializes and parses all needed environment
   variables, it communicates with Tor using its stdin/stdout.

   The first line that a name plugin sends to stdout signifies that it's ready
   to receive name queries. This line looks like this:

      INIT <VERSION> <STATUS_CODE> [<STATUS_MSG>]

   where VERSION is the Tor NS API protocol version that the plugin supports,
   STATUS_CODE is an integer status code, and STATUS_MSG is an optional string
   error message. STATUS_CODE value 0 is reserved for "success", and all other
   integers are error codes.

   See [PROTOEXAMPLE] for an example of this protocol.
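
   As a rough illustration of the INIT line above, the Tor side could parse
   it along these lines. The function name parse_init is hypothetical, and
   the sketch assumes single spaces between fields.

```python
def parse_init(line):
    """Parse 'INIT <VERSION> <STATUS_CODE> [<STATUS_MSG>]'."""
    fields = line.rstrip("\n").split(" ", 3)
    if len(fields) < 3 or fields[0] != "INIT":
        raise ValueError("not an INIT line: %r" % line)
    version = fields[1]
    status_code = int(fields[2])  # 0 means success, anything else is an error
    status_msg = fields[3] if len(fields) == 4 else None
    return version, status_code, status_msg
```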

2.5. Name resolution using NS API

   Here is how actual name resolution requests are performed in NS API.

2.5.1. Message format

   When Tor receives a SOCKS request to an address with a tld that has a name
   plugin assigned to it, Tor performs an NS API name query for that address.

   Tor does this by printing lines to the name plugin's stdin as follows:

      RESOLVE <QUERY_ID> <NAME_STRING>

   where QUERY_ID is a unique integer corresponding to this query, and
   NAME_STRING is the name to be queried.

   When the name plugin completes the name resolution, it prints the following
   line in its stdout:

      RESOLVED <QUERY_ID> <STATUS_CODE> <RESULT>

   where QUERY_ID is the corresponding query ID and STATUS_CODE is an integer
   status code. RESULT is the resolution result (an onion address) or an error
   message if the resolution was not successful.

   See [PROTOEXAMPLE] for an example of this protocol.

   XXX Should <RESULT> be optional in the case of failure?

2.5.2. RESOLVED status codes

   Name plugins can deliver the following status codes:

   0 -- The name resolution was successful.

   1 -- Name resolution generic failure.

   2 -- Name tld not recognized.

   3 -- Name not registered.

   4 -- Name resolution timeout exceeded.

   XXX add more status codes here as needed
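
   The RESOLVE and RESOLVED messages and the status codes listed above could
   be handled roughly as in the Python sketch below. The names ResolveStatus,
   format_resolve and parse_resolved are hypothetical. The status code is
   returned as a plain integer so that codes added later do not break the
   parser, and a missing <RESULT> is tolerated, which is an assumption since
   the proposal leaves that question open.

```python
from enum import IntEnum


class ResolveStatus(IntEnum):
    SUCCESS = 0
    GENERIC_FAILURE = 1
    TLD_NOT_RECOGNIZED = 2
    NOT_REGISTERED = 3
    TIMEOUT = 4


def format_resolve(query_id, name):
    """Build the line Tor sends to the plugin for one query."""
    return "RESOLVE %d %s\n" % (query_id, name)


def parse_resolved(line):
    """Parse 'RESOLVED <QUERY_ID> <STATUS_CODE> <RESULT>'."""
    fields = line.rstrip("\n").split(" ", 3)
    if len(fields) < 3 or fields[0] != "RESOLVED":
        raise ValueError("not a RESOLVED line: %r" % line)
    query_id = int(fields[1])
    status = int(fields[2])
    result = fields[3] if len(fields) == 4 else ""
    return query_id, status, result
```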

2.5.3. Further name resolution behavior

   Tor and name plugins MAY cache name resolution results in memory as
   needed. Caching results on disk should be avoided.

   Tor SHOULD abort (or cancel) an ongoing name resolution request, if it takes
   more than NAME_RESOLUTION_TIMEOUT seconds.
   XXX NAME_RESOLUTION_TIMEOUT = ???

   Tor MUST validate that the resolution result is a valid .onion name.
   XXX should we also accept IPs and regular domain results???
   XXX perhaps we should make sure that results are not names that need
       additional name resolution to avoid endless loops. e.g. imagine
       some sort of loop like this:
        debian.zkey -> debian-bla.zkey -> debian.zkey -> etc.
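
   One way to apply the validation rule above is sketched here in Python.
   The pattern assumes the 16-character base32 format described in section
   1.1; the names ONION_RE and is_valid_onion are hypothetical, and the
   longer addresses of Proposal 224 would need a different length. Because
   only .onion results pass the check, a result that itself needs further
   name resolution is rejected rather than followed.

```python
import re

# Assumption: 16 lowercase base32 characters followed by ".onion".
ONION_RE = re.compile(r"[a-z2-7]{16}\.onion")


def is_valid_onion(result):
    """Return True if `result` looks like a 16-character onion address."""
    return ONION_RE.fullmatch(result) is not None
```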

2.6. Cancelling a name resolution request

   Tor might need to cancel an ongoing name resolution request
   (e.g. because a timeout passed, or the client is not interested in
   that address anymore). In this case, Tor sends the following line to
   the plugin's stdin:

     CANCEL <QUERY_ID>

   to which the name plugin, after performing the cancellation, SHOULD
   answer with:

     CANCELED <QUERY_ID>

2.7. Launching name plugins [INITENVVARS]

   As described in [INITPROTOCOL], when Tor launches a name plugin, it sets
   certain environment variables. At a minimum, it sets (in addition to the
   normal environment variables inherited from Tor):

    "TOR_NS_STATE_LOCATION" -- A filesystem directory path where the
       plugin should store state if it wants to.  This directory is not
       required to exist, but the plugin SHOULD be able to create it if
       it doesn't.  The plugin MUST NOT store state elsewhere.
      Example: TOR_NS_STATE_LOCATION=/var/lib/tor/ns_state/

    "TOR_NS_PROTO_VERSION" -- To tell the plugin which versions of this
       configuration protocol Tor supports. Future versions will give a
       comma-separated list.  Plugins MUST accept comma-separated lists
       containing any version that they recognize, and MUST work correctly even
       if some of the versions they don't recognize are non-numeric.  Valid
       version characters are non-space, non-comma printing ASCII characters.
      Example: TOR_NS_PROTO_VERSION=1,1a,2,4B

    "TOR_NS_PLUGIN_OPTIONS" -- Specifies configuration options for this
       name plugin as a semicolon-separated list of k=v strings with
       options that are to be passed to the plugin.

       Colons, semicolons, equal signs and backslashes MUST be escaped with a
       backslash.

       If there are no arguments that need to be passed to any of the
       plugins, "TOR_NS_PLUGIN_OPTIONS" MAY be omitted.

       For example consider the following options for the "banana" name plugin:

         TOR_NS_PLUGIN_OPTIONS=timeout=5;url=https://bananacake.com

         Will pass to banana the parameters 'timeout=5' and
         'url=https://bananacake.com'.

       XXX Do we like this option-passing interface? Do we have any lessons
           from our PT experiences?

   XXX Add ControlPort/SocksPort environment variables.

   See [PROTOEXAMPLE] for an example of this environment.
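
   A plugin could decode TOR_NS_PLUGIN_OPTIONS roughly as in the Python
   sketch below. The function name parse_plugin_options is hypothetical.
   The sketch treats a backslash as escaping whatever character follows it,
   and it is deliberately lenient about unescaped colons and about equal
   signs inside a value, so that the "banana" example above still parses;
   a stricter reading of the escaping rule would reject them.

```python
def parse_plugin_options(value):
    """Split a 'k=v;k=v' string into a dict, honouring backslash escapes."""
    options = {}
    key, buf = None, []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            # A backslash escapes the next character, whatever it is.
            buf.append(next(chars, ""))
        elif ch == "=" and key is None:
            key, buf = "".join(buf), []
        elif ch == ";":
            if key is None:
                raise ValueError("option without '=': %r" % "".join(buf))
            options[key] = "".join(buf)
            key, buf = None, []
        else:
            buf.append(ch)
    if key is not None:
        options[key] = "".join(buf)
    elif buf:
        raise ValueError("option without '=': %r" % "".join(buf))
    return options
```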

2.8. Name plugin workflow [NSBEHAVIOR]

   Name plugins use the following workflow:

     1) Tor sets the required environment values and launches the name plugin
        as a sub-process (fork()/exec()). See [INITENVVARS].

     2) The name plugin checks its environment, and determines the supported NS
        API versions using the env variable TOR_NS_PROTO_VERSION.

        2.1) If there are no compatible versions, the name plugin writes
             an INIT message with a failure status code as in
             [INITPROTOCOL], and then shuts down.

     3) The name plugin parses and handles the rest of the environment values.

        3.1) If the environment variables are malformed, or otherwise
             invalid, the name plugin writes an INIT message with a
             failure status code as in [INITPROTOCOL], and then shuts
             down.

     4) After the name plugin completely initializes, it sends a successful
        INIT message to stdout as in [INITPROTOCOL]. Then it continues
        monitoring its stdin for incoming RESOLVE messages.

     5) When the name plugin receives a RESOLVE message, it performs the name
        resolution and replies with the appropriate RESOLVED message.

     6) Upon being signaled to terminate by the parent process [NSSHUTDOWN], the
        name plugin gracefully shuts down.

2.8.1. Name plugin shutdown [NSSHUTDOWN]

   To ensure clean shutdown of all name plugins when Tor shuts down, the
   following rules apply for name plugins:

   Name plugins MUST handle OS specific mechanisms to gracefully terminate
   (e.g. SIGTERM).

   Name plugins SHOULD monitor their stdin and exit gracefully when it is
   closed.
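
   The workflow and shutdown rules above can be illustrated with a very
   small plugin skeleton in Python. It is a sketch under several
   assumptions: the names SUPPORTED_VERSIONS, choose_version, handle_line
   and main are hypothetical, names are looked up in an in-memory dict
   standing in for a real name system, and signal handling (e.g. SIGTERM)
   is left out. The read loop ends when Tor closes the plugin's stdin.

```python
import os
import sys

SUPPORTED_VERSIONS = {"1"}


def choose_version(env_value):
    """Pick a supported version from the comma-separated list Tor offers."""
    common = [v for v in env_value.split(",") if v in SUPPORTED_VERSIONS]
    return common[0] if common else None


def handle_line(line, names):
    """Turn one line from Tor into a reply line, or None to stay silent."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) == 3 and fields[0] == "RESOLVE":
        query_id, name = fields[1], fields[2]
        onion = names.get(name)
        if onion is None:
            return "RESOLVED %s 3 name not registered\n" % query_id
        return "RESOLVED %s 0 %s\n" % (query_id, onion)
    if len(fields) == 2 and fields[0] == "CANCEL":
        # Lookups here finish immediately, so there is nothing to abort.
        return "CANCELED %s\n" % fields[1]
    return None  # unrecognized or malformed line: ignore it


def main(names):
    """Run the plugin: negotiate a version, send INIT, then serve queries."""
    version = choose_version(os.environ.get("TOR_NS_PROTO_VERSION", ""))
    if version is None:
        sys.stdout.write("INIT 1 1 no compatible NS API version\n")
        sys.stdout.flush()
        return 1
    sys.stdout.write("INIT %s 0\n" % version)
    sys.stdout.flush()
    for line in sys.stdin:  # the loop ends when Tor closes our stdin
        reply = handle_line(line, names)
        if reply is not None:
            sys.stdout.write(reply)
            sys.stdout.flush()
    return 0
```

   A wrapper script would call, for example,
   main({"elephantforum.corn": "sejnfjrq6szgca7v.onion"}) and exit with the
   returned status.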

2.9. Further details of stdin/stdout communication

2.9.1. Message Format

   Tor and its name plugins communicate by writing NL-terminated lines to
   the plugin's stdin and stdout respectively.  The line metaformat is

      <Line> ::= <Keyword> <OptArgs> <NL>
      <Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar>
      <KeywordChar> ::= <any US-ASCII alphanumeric, dash, and underscore>
      <OptArgs> ::= <Args>*
      <Args> ::= <SP> <ArgChar> | <Args> <ArgChar>
      <ArgChar> ::= <any US-ASCII character but NUL or NL>
      <SP> ::= <US-ASCII whitespace symbol (32)>
      <NL> ::= <US-ASCII newline (line feed) character (10)>

   Tor MUST ignore lines with keywords that it doesn't recognize.
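
   The line metaformat above can be checked with a regular expression, as
   in this Python sketch. The names LINE_RE and split_line are hypothetical.
   Since <ArgChar> includes the space character, the grammar reduces to a
   keyword optionally followed by one space and a non-empty argument string,
   and that is what the pattern encodes.

```python
import re

# Keyword, then optionally SP plus one or more US-ASCII characters that are
# neither NUL nor NL, then the terminating NL.
LINE_RE = re.compile(r"([A-Za-z0-9_-]+)(?: ([\x01-\x09\x0b-\x7f]+))?\n")


def split_line(line):
    """Split one NL-terminated line into (keyword, argument string)."""
    match = LINE_RE.fullmatch(line)
    if match is None:
        raise ValueError("line does not follow the metaformat: %r" % line)
    return match.group(1), match.group(2) or ""
```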

3. Discussion

3.1. Using second-level domains instead of tld

   People have suggested that users should try to connect to reddit.zkey.onion
   instead of reddit.zkey. That is, we should always preserve .onion as the
   tld, and only utilize second-level domains for naming.

   The argument for this is that this way users cannot accidentally leak
   addresses to DNS, as the .onion domain is reserved by RFC 7686.

   The counter-argument here is that this might be confusing to users since
   they are not used to the second-level domain being special (e.g. co.uk).
   Also, what happens when someone registers a 16-character name, that matches
   the length of a vanilla onion address?

   We should consider the concerns here and take the right decision.

3.2. Name plugins handling all tlds '*'

   In [TORRC], we assigned a single tld to each name plugin.  Should we also
   accept catch-all tlds using '*'? I'm afraid that this way a name system
   could try to resolve even normal domains like reddit.com .

   Perhaps we trust the name plugin itself, but maybe the name system
   network could exploit this? Also, the catch-all tld will probably
   cause some engineering complications in this proposal (as it did for PTs).

3.3. Deployment strategy

   We need to devise a deployment strategy that will allow us to improve
   the UX of our users as soon as possible, but without taking hasty,
   sloppy or uneducated decisions.

   For starters, we should make it easy for developers to write wrappers around
   their secure name systems. We should develop libraries that speak the NS API
   protocol and can be used to quickly write wrappers. Similar libraries were quite
   successful during pluggable transport deployment; see pyptlib and goptlib.

   In the beginning, name plugins should be third-party applications that can
   be installed by interested users manually or through package managers. Users
   will also have to add the appropriate OnionNamePlugin line to their torrc.
   This will be a testing phase, and also a community-growing phase.

   After some time, and when we get a better idea of how name plugins
   work for the community, we can start considering how to make them
   more easily usable.  For example, we can start by including some name
   plugins into TBB in an optional opt-in fashion. We should be careful
   here, as people have real incentives for attacking name systems and
   we should not put our users unwillingly in danger.

3.4. Miscellaneous discussion topics

   1. The PT spec tries hard so that a single executable can expose multiple
      PTs. In this spec, it's clear that each executable is a single name
      plugin. Is this OK or a bad idea? Should we change interfaces so that
      each name plugin has an identifier, and then use that identifier for
      things?

   2. Should we make our initialization protocol _identical_ to the PT API
      initialization protocol? That is, use ENV-ERROR etc. instead of INIT?

   3. Does it make sense to support reverse queries, from .onion to names? So
      that people auto-learn the names of the onions they use?

4. Acknowledgements

   Proposal details discussed during Tor hackfest in Seattle between
   Yawning, David and me. Thanks to Lunar and indolering for more
   discussion and feedback.

   This research was supported in part by NSF grants CNS-1111539,
   CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

Appendix A.1: Example communication Tor <-> name plugin [PROTOEXAMPLE]

   Environment variables:

     TOR_NS_STATE_LOCATION=/var/lib/tor/ns_state
     TOR_NS_PROTO_VERSION=1
     TOR_NS_PLUGIN_OPTIONS=timeout=5;cache=/home/user/name_cache

   Messages between Tor and the banana name plugin:

     Name plugin (banana) -> Tor:
        INIT 1 0

     Tor -> Name plugin (banana):
        RESOLVE 1 daewonskate.banana

     Name plugin (banana) -> Tor:
        RESOLVED 1 0 jqkscnkne4qt91iq.onion

     Tor -> Name plugin (banana):
        RESOLVE 2 architecturedirect.zkey

     Name plugin (banana) -> Tor:
        RESOLVED 2 2 "zkey not recognized tld"

     Tor -> Name plugin (banana):
        RESOLVE 3 origamihub.banana

     Name plugin (banana) -> Tor:
        RESOLVED 3 0 wdxfpaxar4dg12vd.onion

Appendix A.2: Example plugins [PLUGINEXAMPLES]

   Here are a few examples of name plugins for brainstorming:

   a) Simplest plugin: A local hosts file. Basically a local petname system
      that maps names to onion addresses.

   b) A remote hosts file. A centralized community hosts file that people trust.

   c) Multiple remote hosts files. People can add their own favorite community
      hosts file.

   d) Multiple remote hosts files with notaries and reputation
      trust. Like moxie's convergence tool but for names.

   e) GNS

   f) OnioNS

   g) Namecoin/Blockstack

Filename: 280-privcount-in-tor.txt
Title: Privacy-Preserving Statistics with Privcount in Tor
Author: Nick Mathewson, Tim Wilson-Brown
Created: 02-Aug-2017
Status: Superseded
Superseded-By: 288

0. Acknowledgments

  Tariq Elahi, George Danezis, and Ian Goldberg designed and implemented
  the PrivEx blinding scheme. Rob Jansen and Aaron Johnson extended
  PrivEx's differential privacy guarantees to multiple counters in
  PrivCount:

  https://github.com/privcount/privcount/blob/master/README.markdown#research-background

  Rob Jansen and Tim Wilson-Brown wrote the majority of the experimental
  PrivCount code, based on the PrivEx secret-sharing variant. This
  implementation includes contributions from the PrivEx authors, and
  others:

  https://github.com/privcount/privcount/blob/master/CONTRIBUTORS.markdown

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

1. Introduction and scope

  PrivCount is a privacy-preserving way to collect aggregate statistics
  about the Tor network without exposing the statistics from any single
  Tor relay.

  This document describes the behavior of the in-Tor portion of the
  PrivCount system.  It DOES NOT describe the counter configurations,
  or any other parts of the system. (These will be covered in separate
  proposals.)

2. PrivCount overview

  Here follows an oversimplified summary of PrivCount, with enough
  information to explain the Tor side of things.  The actual operation
  of the non-Tor components is trickier than described below.

  All values in the scheme below are 64-bit unsigned integers; addition
  and subtraction are modulo 2^64.

  In PrivCount, a Data Collector (in this case a Tor relay) shares
  numeric data with N different Tally Reporters. (A Tally Reporter
  performs the summing and unblinding roles of the Tally Server and Share
  Keeper from experimental PrivCount.)

  All N Tally Reporters together can reconstruct the original data, but
  no (N-1)-sized subset of the Tally Reporters can learn anything about
  the data.

  (In reality, the Tally Reporters don't reconstruct the original data
  at all! Instead, they will reconstruct a _sum_ of the original data
  across all participating relays.)

  To share data, for each value X to be shared, the relay generates
  random values B_1 through B_n, and shares each B_i secretly with a
  single Tally Reporter.  The relay then publishes Y = X + SUM(B_i) + Z,
  where Z is a noise value taken at random from a Gaussian distribution.
  The Tally Reporters can reconstruct X+Z by securely computing SUM(B_i)
  across all contributing Data Collectors. (Tally Reporters MUST NOT
  share individual B_i values: that would expose the underlying relay
  totals.)
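
  The arithmetic above can be illustrated with a short Python sketch. The
  names MOD, blind and unblind_sum are hypothetical. The noise value Z is
  passed in as an integer, since sampling it from a Gaussian distribution
  is outside the scope of this illustration, and "securely computing
  SUM(B_i)" is reduced here to each reporter contributing one sum.

```python
import secrets

MOD = 2 ** 64  # all values are 64-bit unsigned; arithmetic is modulo 2^64


def blind(x, num_reporters, noise):
    """Return (Y, [B_1..B_N]) for one counter value X and noise Z."""
    blinding = [secrets.randbelow(MOD) for _ in range(num_reporters)]
    y = (x + sum(blinding) + noise) % MOD
    return y, blinding


def unblind_sum(published, blinding_sums):
    """Recover SUM(X + Z) over all relays.

    `published` holds each relay's Y; `blinding_sums` holds, for each
    Tally Reporter, the sum of the B_i values it received from all relays.
    """
    return (sum(published) - sum(blinding_sums)) % MOD
```

  With two relays and three Tally Reporters, each reporter adds up the two
  B_i values it holds and shares only that sum; subtracting the three sums
  from Y_1 + Y_2 leaves the noisy total, never a single relay's value.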

  In order to prevent bogus data from corrupting the tally, the Tor
  relays and the Tally Reporters perform multiple "instances" of this
  algorithm, randomly sampling the relays in each instance. Each relay
  sends multiple Y values for each measurement, built with different
  sets of B_i. These "instances" are numbered in order from 0 to R-1.

  So that the system will still produce results in the event of a single
  Tally Reporter failure, these instances are distributed across multiple
  subsets of Tally Reporters.

  Below we describe a data format for this.

3. The document format

  This document format builds on the line-based directory format used
  for other tor documents, described in Tor's dir-spec.txt.

  Using this format, we describe two kinds of documents here: a
  "counters" document that publishes all the Y values, and a "blinding"
  document that describes the B_i values.  But see "An optimized
  alternative" below.

  The "counters" document has these elements:

    "privctr-dump-format" SP VERSION SP SigningKey

       [At start, exactly once]

       Describes the version of the dump format, and provides an ed25519
       signing key to identify the relay.  The signing key is encoded in
       base64 with padding stripped. VERSION is "alpha" now, but should
       be "1" once this document is finalized.

       [[[TODO: Do we need a counter version as well?

          Noise is distributed across a particular set of counters,
          to provide differential privacy guarantees for those counters.
          Reducing noise requires a break in the collection.
          Adding counters is ok if the noise on each counter
          monotonically increases. (Removing counters always reduces
          noise.)

          We also need to work out how to handle instances with mixed
          Tor versions, where some Data Collectors report a different
          set of counters than other Data Collectors. (The blinding works
          if we substitute zeroes for missing counters on Tally Reporters.
          But we also need to add noise in this case.)

          -teor
        ]]]

    "starting-at" SP IsoTime

       [Exactly once]

       The start of the time period when the statistics here were
       collected.

    "ending-at" SP IsoTime

       [Exactly once]

       The end of the time period when the statistics here were
       collected.

    "num-instances" SP Number

       [Exactly once]

       The number of "instances" that the relay used (see above).

    "tally-reporter" SP Identifier SP Key SP InstanceNumbers

       [At least twice]

       The curve25519 public key of each Tally Reporter that the relay
       believes in.  (If the list does not match the list of
       participating tally reporters, they won't be able to find the
       relay's values correctly.)  The identifiers are non-space,
       non-nul character sequences.  The Key values are encoded in
       base64 with padding stripped; they must be unique within each
       counters document.  The InstanceNumbers are comma-separated lists
       of decimal integers from 0 to (num-instances - 1), in ascending
       order.

    Keyword ":" SP Int SP Int SP Int ...

       [Any number of times]

       The Y values for a single measurement.  There are num-instances
       such Y values for each measurement.  They are 64-bit unsigned
       integers, expressed in decimal.

       The "Keyword" denotes which measurement is being shared. Keyword
       MAY be any sequence of characters other than colon, nul, space,
       and newline, though implementers SHOULD avoid getting too
       creative here.  Keywords MUST be unique within a single document.
       Tally Reporters MUST handle unrecognized keywords.  Keywords MAY
       appear in any order.

       It is safe to send the blinded totals for each instance to every
       Tally Reporter. To unblind the totals, a Tally Reporter needs:
         * a blinding document from each relay in the instance, and
         * the per-counter blinding sums from the other Tally Reporters
           in their instance.

       [[[TODO: But is it safer to create a per-instance counters
          document? -- teor]]]

       The semantics of individual measurements are not specified here.

    "signature" SP Signature

       [At end, exactly once]

       The Ed25519 signature of all the fields in the document, from the
       first byte, up to but not including the "signature" keyword here.
       The signature is encoded in base64 with padding stripped.
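
  As an illustration of the measurement lines in the counters document, a
  Tally Reporter might parse one such line as in the Python sketch below.
  The function name parse_measurement and the keyword used in the test are
  hypothetical; signature checking and the other elements are not covered.

```python
def parse_measurement(line, num_instances):
    """Parse 'Keyword: Int Int ...' into (keyword, [Y values])."""
    keyword, sep, rest = line.rstrip("\n").partition(":")
    # Keywords may not contain colon, nul, space, or newline.
    if not sep or not keyword or any(c in keyword for c in " \0\n"):
        raise ValueError("malformed keyword in %r" % line)
    if not rest.startswith(" "):
        raise ValueError("missing space after colon in %r" % line)
    tokens = rest[1:].split(" ")
    if not all(t.isascii() and t.isdigit() for t in tokens):
        raise ValueError("values must be decimal integers: %r" % line)
    values = [int(t) for t in tokens]
    if len(values) != num_instances:
        raise ValueError("expected %d values, got %d" % (num_instances, len(values)))
    if any(v >= 2 ** 64 for v in values):
        raise ValueError("value does not fit in 64 bits")
    return keyword, values
```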


  The "blinding" document has these elements:

    "privctr-secret-offsets" SP VERSION SP SigningKey

       [At start, exactly once.]

       The VERSION and SigningKey parameters are the same as for
       "privctr-dump-format".

    "instances" SP Numbers

       [Exactly once]

       The instances that this Tally Reporter handles.
       They are given as comma-separated decimal integers, as in the
       "tally-reporter" entry in the counters document.  They MUST
       match the instances listed in the counters document.

       [[[TODO: this is redundant. Specify the constraint instead? --teor]]]

    "num-counters" SP Number

       [Exactly once]

       The number of counters that the relay used in its counters
       document. This MUST be equal to the number of keywords in the
       counters document.

       [[[TODO: this is redundant. Specify the constraint instead? --teor]]]

    "tally-reporter-pubkey" SP Key

       [Exactly once]

       The curve25519 public key of the tally reporter who is intended
       to receive and decrypt this document.  The key is base64-encoded
       with padding stripped.

    "count-document-digest" SP "sha3" Digest NL
    "-----BEGIN ENCRYPTED DATA-----" NL
    Data
    "-----END ENCRYPTED DATA-----" NL

       [Exactly once]

       The SHA3-256 digest of the count document corresponding to this
       blinding document.  The digest is base64-encoded with padding
       stripped.  The data encodes the blinding values (see "The
       Blinding Values" below), and is encrypted to the tally reporter's
       public key using the hybrid encryption algorithm described below.

    "signature" SP Signature

       [At end, exactly once]

       The Ed25519 signature of all the fields in the document, from the
       first byte, up to but not including the "signature" keyword here.
       The signature is encoded in base64 with padding stripped.


4. The Blinding Values

  The "Data" field of the blinding documents above, when decrypted,
  yields a sequence of 64-bit binary values, encoded in network
  (big-endian) order.  There are C * R such values, where C is the number
  of keywords in the count document, and R is the number of instances
  that the Tally Reporter participates in. The client generates all of
  these values uniformly at random.

  For each keyword in the count document, in the order specified by the
  count document, the decrypted data holds R*8 bytes: one 8-byte blinding
  value for each specified instance of that keyword's blinded counter.

  For example: if the count document lists the keywords "b", "x", "g",
  and "a" (in that order), and lists instances "0" and "2", then the
  decrypted data will hold the blinding values in this order:
      b, instance 0
      b, instance 2
      x, instance 0
      x, instance 2
      g, instance 0
      g, instance 2
      a, instance 0
      a, instance 2
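
  The layout above could be decoded as in this Python sketch, which
  unpacks big-endian 64-bit values and labels them by keyword and instance
  in the order described. The function name decode_blinding_values is
  hypothetical, and decryption is assumed to have happened already.

```python
import struct


def decode_blinding_values(data, keywords, instances):
    """Map (keyword, instance) to its blinding value.

    `keywords` is in count-document order; `instances` lists the instance
    numbers that this Tally Reporter handles, in ascending order.
    """
    expected = len(keywords) * len(instances) * 8
    if len(data) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(data)))
    values = iter(struct.unpack(">%dQ" % (len(data) // 8), data))
    table = {}
    for keyword in keywords:
        for instance in instances:
            table[(keyword, instance)] = next(values)
    return table
```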


4. Implementation Notes

  A relay should, when starting a new round, generate all the blinding
  values and noise values in advance.  The relay should then use these
  values to compute Y_0 = SUM(B_i) + Z for each instance of each
  counter.  Having done this, the relay MUST encrypt the blinding values
  to the public key of each tally reporter, and wipe them from memory.


5. The hybrid encryption algorithm

  We use a hybrid encryption scheme above, where items can be encrypted
  to a public key.  We instantiate it as follows, using curve25519
  public keys.

  To encrypt a plaintext M to a public key PK1:
     1. The sender generates a new ephemeral keypair sk2, PK2.
     2. The sender computes the shared Diffie-Hellman secret
        SEED = (sk2 * PK1).

     3. The sender derives 64 bytes of key material as
          SHAKE256(TEXT | SEED)[...64]
        where "TEXT" is "Expand curve25519 for privcount encryption".

        The first 32 bytes of this is an AES key K1;
        the second 32 bytes are a MAC key K2.

     4. The sender computes a ciphertext C as AES256_CTR(K1, M)

     5. The sender computes a MAC as
          SHA3_256([00 00 00 00  00 00 00 20] | K2 | C)

     6. The hybrid-encrypted text is PK2 | MAC | C.
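
  Steps 3, 5 and 6 above can be sketched with Python's hashlib alone. The
  curve25519 handshake (steps 1 and 2) and AES256_CTR (step 4) need a
  cryptography library and are left out, so SEED and the ciphertext C are
  taken as inputs here. The function names derive_keys, compute_mac and
  assemble are hypothetical.

```python
import hashlib

TEXT = b"Expand curve25519 for privcount encryption"


def derive_keys(seed):
    """Step 3: derive 64 bytes with SHAKE256(TEXT | SEED), split into K1, K2."""
    material = hashlib.shake_256(TEXT + seed).digest(64)
    return material[:32], material[32:]


def compute_mac(k2, ciphertext):
    """Step 5: SHA3_256([00 00 00 00 00 00 00 20] | K2 | C)."""
    prefix = bytes([0, 0, 0, 0, 0, 0, 0, 0x20])
    return hashlib.sha3_256(prefix + k2 + ciphertext).digest()


def assemble(pk2, mac, ciphertext):
    """Step 6: the hybrid-encrypted text is PK2 | MAC | C."""
    return pk2 + mac + ciphertext
```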


6. An optimized alternative

   As an alternative, the sequences of blinding values are NOT transmitted
   to the tally reporters.  Instead the client generates a single
   ephemeral keypair sk_c, PK_c, and places the public key in its counts
   document.  It does this each time a new round begins.

   For each tally reporter with public key PK_i, the client then does
   the handshake sk_c * PK_i to compute SEED_i.

   The client then generates the blinding values for that tally reporter
   as SHAKE256(SEED_i)[...R*C*8].

   After initializing the counters to Y_0, the client can discard the
   blinding values and sk_c.

   Later, the tally reporters can reconstruct the blinding values as
   SHAKE256(sk_i * PK_c)[...]

   This alternative allows the client to transmit only a single public
   key, when previously it would need to transmit a complete set of
   blinding factors for each tally reporter. Further, the alternative
   does away with the need for blinding documents altogether.  It is,
   however, more sensitive to any defects in SHAKE256 than the design
   above.  Like the rest of this design, it would need rethinking if we
   want to expand this scheme to work with anonymous data collectors,
   such as Tor clients.
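
   The derivation in this alternative could look like the Python sketch
   below. The function name derive_blinding_values is hypothetical, the
   handshake that produces SEED_i is not shown, and splitting the SHAKE256
   output into big-endian 64-bit values is an assumption carried over from
   the "Data" field encoding, since this section does not state a byte
   order.

```python
import hashlib
import struct


def derive_blinding_values(seed, num_instances, num_counters):
    """Expand SEED_i into R*C blinding values using SHAKE256."""
    count = num_instances * num_counters
    stream = hashlib.shake_256(seed).digest(count * 8)
    # Assumption: each value is 8 bytes in network (big-endian) order.
    return list(struct.unpack(">%dQ" % count, stream))
```

   Because both sides of the handshake arrive at the same SEED_i, the relay
   and the tally reporter derive identical values without ever sending them.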

Filename: 281-bulk-md-download.txt
Title: Downloading microdescriptors in bulk
Author: Nick Mathewson
Created: 11-Aug-2017
Status: Reserve

1. Introduction

  This proposal describes a way to download more microdescriptors
  at a time, using fewer bytes.

  Right now, to download N microdescriptors, the client must send
  about 44*N bytes in its HTTP request.  Because clients can request
  microdescriptors in any combination, the directory caches cannot
  pre-compress responses to these requests, and need to use less
  space-efficient on-the-fly compression algorithms.

  Under this proposal, clients simply say "Send me the
  microdescriptors I need, given what I know."

2. Combined microdescriptor downloads

2.1. By diff

  If a client has a consensus with base64 sha3-256 digest X, and it
  previously had a consensus with base64 sha3-256 digest Y, then
  it may request all the microdescriptors listed in X but not Y,
  by asking for the resource:
      /tor/micro/diff/X/Y

  Clients SHOULD only ask for this resource compressed.

  Caches MUST NOT answer this request unless they recognize the
  consensuses with digest X and digest Y.

  If answering, caches MUST reply with all of the
  microdescriptors that the cache holds that were listed by
  consensus X, and MUST omit all the microdescriptors that were
  also listed in consensus Y.  (For the purposes of this proposal,
  microdescriptors are "the same" if they are textually identical
  and have the same digest.)
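
  The cache-side selection can be sketched in Python as a set difference,
  following the "listed in X but not Y" rule stated at the start of this
  section. The function names and the data layout (a mapping from consensus
  digest to the set of microdescriptor digests it lists) are hypothetical.

```python
def micro_diff_url(digest_x: str, digest_y: str) -> str:
    """Resource a client asks for when it holds X and previously held Y."""
    return f"/tor/micro/diff/{digest_x}/{digest_y}"


def answer_micro_diff(consensuses, held, digest_x, digest_y):
    """Return the microdescriptors to send, or None if we must not answer.

    consensuses: dict mapping consensus digest -> set of md digests listed
    held:        dict mapping md digest -> microdescriptor text in the cache
    """
    # The cache must recognize both consensuses before answering.
    if digest_x not in consensuses or digest_y not in consensuses:
        return None
    # Everything listed in X, minus everything the client already saw in Y.
    wanted = consensuses[digest_x] - consensuses[digest_y]
    # Only microdescriptors the cache actually holds can be sent.
    return [held[d] for d in sorted(wanted) if d in held]
```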

2.2. By consensus

  If a client has fewer than NMNM% of the microdescriptors listed in a
  consensus X, it should fetch the resource
      /tor/micro/full/X

  Clients SHOULD only ask for this resource compressed.

  Caches MUST NOT answer this request unless they recognize the
  consensus with digest X. They should send all the microdescriptors
  they have that are listed in that consensus.

2.3. When to make these requests

  Clients should decide to use this format in preference to the old
  download-by-digest format if the consensus X lists their preferred
  directory cache as using a new DirCache subprotocol version. (See
  5 below.)

  When a client has some preferred directory caches that support
  this subprotocol and some that do not, it chooses one at random,
  and uses these requests if that one supports this subprotocol.

  (A client always has a consensus when it requests
  microdescriptors, so it will know whether any given cache supports
  these requests.)
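
  A sketch of that choice in Python might look as follows. The function name
  and the representation of caches are hypothetical; the only logic taken
  from the text is "pick one preferred cache at random, then use the new
  requests if that cache supports the subprotocol".

```python
import random


def choose_request_style(preferred_caches, supports_bulk, rng=random):
    """Pick a cache at random and decide which request format to use.

    preferred_caches: list of cache identifiers
    supports_bulk:    set of identifiers whose consensus entry advertises
                      the new DirCache subprotocol version
    """
    cache = rng.choice(preferred_caches)
    style = "bulk" if cache in supports_bulk else "by-digest"
    return cache, style
```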

3. Performance analysis

  This is a back-of-the-envelope analysis using a month's worth of
  consensus documents, and a randomly chosen sample of
  microdescriptors.


  On average, about 0.5% of the microdescriptors change between any
  two consensuses.  Call it 50.  That means 50*43 bytes == 2150
  bytes to request the microdescriptors.  It means ~24530 bytes of
  microdescriptors downloaded, compressed to ~13687 bytes by zstd.

  With this proposal, we're down to 86 bytes for the request, and we
  can precompute the compressed output, making it safe to use lzma2,
  getting a compressed result more like 13362.

  It appears that this change would save about 15% for incremental
  microdescriptor downloads, most of that coming from the reduction
  in request size.

  For complete downloads, a complete set of microdescriptors is about
  7700 microdescriptors long.  That makes the total number of bytes
  for the requests 7700*43 == 331100 bytes.  The response, if
  compressed with lzma instead of zstd, would fall from 1659682 to
  1587804 bytes, for a total savings of 20%.
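
  The percentages in this section can be reproduced from the figures quoted
  above. The sketch below is only arithmetic; it assumes that the request
  for a complete download carries a single 43-byte digest, which the text
  does not state.

```python
def savings(old_total: int, new_total: int) -> float:
    """Fraction of bytes saved when going from old_total to new_total."""
    return (old_total - new_total) / old_total


# Incremental download: about 50 changed microdescriptors.
old_incr = 50 * 43 + 13687   # per-digest request bytes + zstd response
new_incr = 86 + 13362        # one diff request + lzma2 response
incr = savings(old_incr, new_incr)

# Complete download: about 7700 microdescriptors.
old_full = 7700 * 43 + 1659682
new_full = 43 + 1587804      # assumption: one 43-byte digest in the request
full = savings(old_full, new_full)

print(f"incremental: {incr:.1%}, complete: {full:.1%}")
```

  This comes out at roughly 15% for incremental downloads and 20% for
  complete ones, matching the estimates above.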


5. Compatibility

   Caches supporting this download protocol need to advertise
   support of a new DirCache subprotocol version.
Filename: 282-remove-named-from-consensus.txt
Title: Remove "Named" and "Unnamed" handling from consensus voting
Author: Nick Mathewson
Created: 12-Sep-2017
Status: Accepted
Target: arti-dirauth

1. Summary

   Authorities no longer vote for the "Named" and "Unnamed" flags, and we
   have begun to remove the client code that supports them. (See proposal
   235). The next logical step is to remove the special handling of these
   flags from the consensus voting algorithm.  We specify this here.

2. Proposal

   We add a new consensus method, here represented as M, to be allocated
   when this proposal's implementation is merged.

   We specify that the Named and Unnamed flags are only handled
   specially when the negotiated consensus method is earlier than M.  If
   the negotiated method is M or later, then the Named and Unnamed
   flags are handled as if they were any other consensus flags.

Filename: 283-ipv6-in-micro-consensus.txt
Title: Move IPv6 ORPorts from microdescriptors to the microdesc consensus
Author: Tim Wilson-Brown (teor), Nick Mathewson
Created: 18-Oct-2017
Status: Closed
Target: 0.3.3.x
Implemented-In: 0.3.3.1-alpha
Ticket: #20916

1. Summary

   Moving IPv6 ORPorts from microdescs to the microdesc consensus will make
   it easier for IPv6 clients to bootstrap and select reachable guards.

   Tor clients on IPv6-only connections currently have to use IPv6 Fallback
   Directory Mirrors to fetch their microdescriptors. This does not scale
   well. After this change, they will be able to fetch microdescriptors from
   any IPv6-enabled directory mirror in the consensus.

   Tor clients on versions 0.2.8.x and 0.2.9.x are currently unable to
   bootstrap over IPv6-only connections when using microdescriptors. After
   this consensus change, they will be able to bootstrap without any client
   code changes.

   For clients that use microdescriptors (the default), IPv6 ORPorts are
   always placed in microdescriptors. So these clients can only tell if an
   IPv6 ORPort is unreachable when a majority of voting authorities mark the
   relay as not Running. After this proposal, clients will be able to discover
   unreachable ORPorts, even if a minority of voting authorities set
   AuthDirHasIPv6Connectivity 1.

2. Proposal

   We add two new consensus methods, here represented as M and N (M < N), to
   be allocated when this proposal's implementation is merged. These consensus
   methods move IPv6 ORPorts from microdescs to the microdesc consensus.

   We use two different methods because this allows us to modify client code
   based on each method. Also, if a bug is discovered in one of the methods,
   authorities can be patched to stop voting for it, and then we can implement
   a fix in a later method.

2.1. Add Reachable IPv6 ORPorts to the Microdesc Consensus

   We specify that microdescriptor consensuses created with methods M or later
   contain reachable IPv6 ORPorts.

2.2. Remove IPv6 ORPorts from Microdescriptors

   We specify that microdescriptors created with methods N or later start
   omitting IPv6 ORPorts.

3. Retaining Existing Behaviour

   The following existing behaviour will be retained:

3.1. Authority IPv6 Reachability

   Only authorities configured with AuthDirHasIPv6Connectivity 1 will test
   IPv6 ORPort reachability, and vote for IPv6 ORPorts.

   This means that:
   * if no voting authorities set AuthDirHasIPv6Connectivity 1, there will be
     no IPv6 ORPorts in the consensus,
   * if a minority of voting authorities set AuthDirHasIPv6Connectivity 1:
     unreachable IPv6 ORPort lines will be dropped from the consensus, but
     the relay will still be listed as Running, and
     reachable IPv6 ORPort lines will be included in the consensus.
   * if a majority of voting authorities set AuthDirHasIPv6Connectivity 1,
     relays with unreachable IPv6 ORPorts will not be listed as Running.
     Reachable IPv6 ORPort lines will be included in the consensus.
     (To ensure that any valid majority will vote relays with unreachable
     IPv6 ORPorts not Running, 75% of authorities must set
     AuthDirHasIPv6Connectivity 1.)
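
   One way to see where the 75% figure comes from is to brute-force the
   worst case. The sketch below is an illustration under assumptions that
   are not stated in this proposal: a consensus can be formed by any strict
   majority of authorities, a flag is assigned only if a strict majority of
   the voters vote for it, and authorities without IPv6 connectivity keep
   voting Running for a relay whose IPv6 ORPort is unreachable. The function
   name is hypothetical.

```python
def min_ipv6_authorities(n_auths: int) -> int:
    """Smallest number of IPv6-checking authorities such that a relay with
    an unreachable IPv6 ORPort can never get Running, whichever valid set
    of authorities ends up voting."""
    min_voters = n_auths // 2 + 1  # smallest set that can form a consensus
    for k in range(n_auths + 1):
        non_ipv6 = n_auths - k
        # Worst case: every non-IPv6 authority is among the voters and
        # votes Running.  Running must not reach a strict majority.
        if all(
            2 * min(non_ipv6, voters) <= voters
            for voters in range(min_voters, n_auths + 1)
        ):
            return k
    return n_auths
```

   Under these assumptions, 8 authorities need 6 of them checking IPv6
   (75%), and 9 authorities need 7 (about 78%).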

   We will document this behaviour in the tor manual page, see #23870.

3.2. NS Consensus IPv6 ORPorts

   The NS consensus will continue to contain reachable IPv6 ORPorts.

4. Impact and Related Changes

4.1. Directory Authority Configuration

   We will work to get a super-majority (75%) of authorities checking relay
   IPv6 reachability, to avoid Running-flag flapping. To do this, authorities
   need to get IPv6 connectivity, and set AuthDirHasIPv6Connectivity 1.

4.2. Relays and Bridges

   Tor relays and bridges do not currently use IPv6 ORPorts from the
   consensus.

   We expect that 2/3 of authorities will be voting for consensus method N
   before future Tor relay or bridge versions use IPv6 ORPorts from the
   consensus.

4.3. Clients

4.3.1. Legacy Clients

4.3.1.1. IPv6 ORPort Circuits

   Tor clients on versions 0.2.8.x to 0.3.2.x check directory documents for
   ORPorts in the following order:
     * descriptors (routerinfo, available if using bridges or full descriptors)
     * consensus (routerstatus)
     * microdescriptors (IPv6 ORPorts only)

   Their behaviour will be identical to the current behaviour for consensus
   methods M and earlier. When consensus method N is used, they will ignore
   unreachable IPv6 ORPorts without any code changes, as long as they are
   using microdescriptors.

4.3.1.2. IPv6 ORPort Bootstrap

   Tor clients on versions 0.2.8.x and 0.2.9.x are currently unable to
   bootstrap over IPv6-only connections when using microdescriptors. This
   happens because the microdesc consensus does not contain IPv6 ORPorts.
   (IPv6-only Tor clients on versions 0.3.0.2-alpha and later use fallback
   directory mirrors to fetch their microdescriptors.)

   When consensus method M is used, 0.2.8.x and 0.2.9.x clients will be able
   to bootstrap over IPv6-only connections using microdescriptors, without any
   code changes.

4.3.2. Future Clients

4.3.2.1. Ignoring IPv6 ORPorts in Microdescs

   Tor clients on versions 0.3.3.x and later will ignore unreachable IPv6
   ORPorts once consensus method M or later is in use. This requires some code
   changes, see #23827.

4.3.2.2. IPv6 ORPort Bootstrap

   If a bootstrapping IPv6-only client has a consensus made with method M or
   later, it should download microdescriptors from one of the IPv6 ORPorts in
   that consensus. This requires some code changes, see #23827.

   Previously, IPv6-only clients would use fallback directory
   mirrors to download microdescs, because there were no IPv6 ORPorts in the
   microdesc consensus.

4.3.2.3. Ignoring Addresses in Unused Directory Documents

   If a client doesn't use a particular directory document type for a node,
   it should ignore any addresses in that document type. This requires some
   code changes, see #23975.

5. Data Size

   This change removes 7-50 bytes from the microdescriptors of relays that
   have an IPv6 ORPort, and adds them to reachable IPv6 relays' microdesc
   consensus entries.

   As of October 2017, 600 relays (9%) have IPv6 ORPorts in the NS
   consensus. Their "a" lines take up 19 KB, or 33 bytes each on average.

   The gzip-compressed microdesc consensus is 564 KB, and adding the existing
   IPv6 addresses makes it 576 KB (a 2.1% increase). Adding IPv6 addresses to
   every relay makes it 644 KB (a 14% increase). zstd-compressed microdesc
   consensuses show smaller increases of 1.7% and 8.0%, respectively.
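
   The gzip percentages follow directly from the sizes quoted above, as this
   short Python check shows. The helper name is hypothetical, and only the
   gzip figures are recomputed because the zstd sizes are not given here.

```python
def pct_increase(before_kb: float, after_kb: float) -> float:
    """Percentage growth from before_kb to after_kb."""
    return 100.0 * (after_kb - before_kb) / before_kb


existing = pct_increase(564, 576)   # adding the existing IPv6 addresses
every = pct_increase(564, 644)      # adding IPv6 addresses to every relay
print(f"{existing:.1f}% and {every:.1f}%")
```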

   Most tor clients are already running 0.3.1.7, which implements consensus
   diffs and zstd compression. We expect that most directory mirrors will also
   implement consensus diffs and zstd compression by the time 2/3 of
   authorities are voting for consensus method M. Consensus diffs will reduce
   the worst-case impact of this change for clients and relays that have a
   recent consensus.

6. External Impacts

   We don't expect this change to impact Onionoo and similar projects, because
   they typically use the NS consensus.

7. Monitoring

   Onionoo has implemented an "unreachable IPv6 address" attribute:
   https://trac.torproject.org/projects/tor/ticket/21637

   Metrics is working on IPv6 relay graphs:
   https://trac.torproject.org/projects/tor/ticket/23761

   Consensus-health implements a ReachableIPv6 pseudo-flag for authorities
   and relays:
   https://consensus-health.torproject.org/
Filename: 284-hsv3-control-port.txt
Title: Hidden Service v3 Control Port
Author: David Goulet
Created: 02-November-2017
Status: Closed

1. Summary

   This document extends the hidden service control port events and commands
   to version 3 (rend-spec-v3.txt).

   No commands or events are added in this document; it only describes
   how the current commands and events are extended to support v3.

2. Format

   The formatting of this document follows section 2 of control-spec.txt. It
   is split in two sections, the Commands and the Events for hidden service
   version 3.

   We define the alphabet of a Base64 encoded value to be:

      Base64Character = "A"-"Z" / "a"-"z" / "0"-"9" / "+" / "/"

   For a command or event, if nothing is mentioned, the behavior doesn't
   change from the control port specification.

3. Specification

3.1. Commands

   As specified in the control specification, all commands are
   case-insensitive but the keywords are case-sensitive.

3.1.1. GETINFO

   Hidden service commands are:

     "hs/client/desc/id/<ADDR>"
       The <ADDR> can be a v3 address without the ".onion" part. The rest is
       as is.

     "hs/service/desc/id/<ADDR>"
       The <ADDR> can be a v3 address without the ".onion" part. The rest is
       as is.

     "onions/{current,detached}"
       No change. This command can support v3 hidden services without
       changes, returning v3 address(es).

3.1.2. HSFETCH

   The syntax of this command supports either an HSAddress or a versioned
   descriptor ID. However, version 3 doesn't have the same descriptor ID
   concept as v2, so for v3 the descriptor ID is the blinded key of a
   descriptor, which is used as an index to query the HSDir.

   The syntax becomes:
     "HSFETCH" SP (HSAddress / "v" Version "-" DescId)
               *[SP "SERVER=" Server] CRLF

     HSAddress = (16*Base32Character / 56*Base32Character)
     Version = "2" / "3"
     DescId = (32*Base32Character / 32*Base64Character)
     Server = LongName

   The "HSAddress" key is extended to accept 56 base32 characters which is the
   format of a version 3 onion address.
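
   As an illustration, the grammar above can be checked with a regular
   expression. This Python sketch assumes that "N*X" means exactly N
   characters, that Base32Character is the lowercase alphabet a-z and 2-7,
   and that server names contain no spaces. The function name parse_hsfetch
   is hypothetical. It takes the 32*Base64Character form literally as
   written, and accepts it only together with version 3.

```python
import re

BASE32 = "[a-z2-7]"        # assumption: lowercase base32 alphabet
BASE64 = "[A-Za-z0-9+/]"   # alphabet defined in section 2 of this proposal

HSFETCH_RE = re.compile(
    rf"HSFETCH (?:(?P<addr>{BASE32}{{56}}|{BASE32}{{16}})"
    rf"|v(?P<version>[23])-(?P<descid>{BASE32}{{32}}|{BASE64}{{32}}))"
    r"(?P<servers>(?: SERVER=\S+)*)\r\n"
)


def parse_hsfetch(line: str):
    """Return the parsed fields of an HSFETCH line, or None if invalid."""
    m = HSFETCH_RE.fullmatch(line)
    if m is None:
        return None
    desc_id = m.group("descid")
    # A base64 descriptor ID (the v3 blinded key) is only valid with v3.
    if m.group("version") == "2" and not re.fullmatch(
        f"{BASE32}{{32}}", desc_id
    ):
        return None
    return {
        "address": m.group("addr"),
        "version": m.group("version"),
        "desc_id": desc_id,
        "servers": re.findall(r" SERVER=(\S+)", m.group("servers")),
    }
```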

   The "DescId" of the form 32*Base64Character is the descriptor blinded key
   used as an index to query the directory. It can only be used with
   "Version=3".

3.1.5. HSPOST

   To support version 3, the command needs an extra parameter that is the
   onion address of the given descriptor. With v2, the address could have been
   deduced from the given descriptor but with v3, this is not possible.  In
   order to fire up the HS_DESC event correctly, we need the address so the
   request can be linked on the control port.

   Furthermore, the given descriptor will be validated with the given address
   and an error will be returned if they are not matching.

   The syntax becomes:

     "+HSPOST" *[SP "SERVER=" Server] [SP "HSADDRESS=" HSAddress]
               CRLF Descriptor CRLF "." CRLF

     HSAddress = 56*Base32Character

   The "HSAddress" key is optional and only applies for v3 descriptors. A 513
   error is returned if used with v2.

3.1.3. ADD_ONION

   For this command to support version 3, new values are added but the syntax
   is unchanged:

     "ADD_ONION" SP KeyType ":" KeyBlob
                 [SP "Flags=" Flag *("," Flag)]
                 1*(SP "Port=" VirtPort ["," Target])
                 *(SP "ClientAuth=" ClientName [":" ClientBlob]) CRLF

   A new "KeyType" value, "ED25519-V3", identifies the key as a v3 ed25519
   key.

   With the KeyType == "ED25519-V3", the "KeyBlob" should be a base64 encoded
   ed25519 private key.

   The "NEW:BEST" option will still return a version 2 address as long as the
   HiddenServiceVersion torrc option default is 2. To ask for a new v3 key,
   this should be used: "NEW:ED25519-V3".
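
   A controller could assemble the command as in the following Python
   sketch. The helper name add_onion_command is hypothetical, and ClientAuth
   is left out because this proposal rejects it for ED25519-V3 keys.

```python
def add_onion_command(key_type, key_blob, ports, flags=()):
    """Build an ADD_ONION command line.

    ports is a list of (virt_port, target) pairs; target may be None.
    """
    if not ports:
        raise ValueError("ADD_ONION needs at least one Port= argument")
    parts = [f"ADD_ONION {key_type}:{key_blob}"]
    if flags:
        parts.append("Flags=" + ",".join(flags))
    for virt_port, target in ports:
        port = f"Port={virt_port}"
        if target:
            port += f",{target}"
        parts.append(port)
    return " ".join(parts) + "\r\n"


# Ask Tor to generate a fresh v3 key and map port 80 to a local server.
command = add_onion_command("NEW", "ED25519-V3", [(80, "127.0.0.1:8080")])
```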

   Because client authentication is not yet implemented, the "ClientAuth"
   field is ignored as well as "Flags=BasicAuth". A 513 error is returned if
   "ClientAuth" is used with an ED25519-V3 key type.

3.1.4. DEL_ONION

   The syntax of this command is:

     "DEL_ONION" SP ServiceID CRLF

     ServiceID = The Onion Service address without the trailing ".onion"
                 suffix

   The "ServiceID" can simply be a v3 address. Nothing else changes.

3.2. Events

3.2.1. HS_DESC

   For this event to support version 3, one optional field and new
   values are added:

     "650" SP "HS_DESC" SP Action SP HSAddress SP AuthType SP HsDir
           [SP DescriptorID] [SP "REASON=" Reason] [SP "REPLICA=" Replica]
           [SP "HSDIR_INDEX=" HSDirIndex]

     Action =  "REQUESTED" / "UPLOAD" / "RECEIVED" / "UPLOADED" / "IGNORE" /
               "FAILED" / "CREATED"
     HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN"
     AuthType = "NO_AUTH" / "BASIC_AUTH" / "STEALTH_AUTH" / "UNKNOWN"
     HsDir = LongName / Fingerprint / "UNKNOWN"
     DescriptorID = 32*Base32Character / 43*Base64Character
     Reason = "BAD_DESC" / "QUERY_REJECTED" / "UPLOAD_REJECTED" / "NOT_FOUND" /
              "UNEXPECTED" / "QUERY_NO_HSDIR"
     Replica = 1*DIGIT
     HSDirIndex = 64*HEXDIG

   The "HSDIR_INDEX=" is an optional field that is only for version 3 which
   contains the computed index of the HsDir the descriptor was uploaded to or
   fetched from.

   The "HSAddress" key is extended to accept 56 base32 characters which is the
   format of a version 3 onion address.

   The "DescriptorID" key is extended to accept 43 base64 characters which is
   the descriptor blinded key used for the index value at the "HsDir".

   The "REPLICA=" field is not used for the "CREATED" event because v3 doesn't
   use the replica number in the descriptor ID computation.

   Because client authentication is not yet implemented, the "AuthType" field
   is always "NO_AUTH".
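
   A controller-side parser for this event could be sketched in Python as
   follows. The function name, the returned dictionary layout, and the
   "version" field (guessed from the address length) are hypothetical
   conveniences, not part of the control protocol.

```python
KEYWORDS = ("REASON", "REPLICA", "HSDIR_INDEX")


def parse_hs_desc(line: str) -> dict:
    """Split an HS_DESC event line into its positional and keyword fields."""
    fields = line.rstrip("\r\n").split(" ")
    if len(fields) < 6 or fields[:2] != ["650", "HS_DESC"]:
        raise ValueError("not an HS_DESC event")
    names = ("action", "address", "auth_type", "hsdir")
    event = dict(zip(names, fields[2:6]))
    event["descriptor_id"] = None

    extras = fields[6:]
    # The optional DescriptorID is positional and comes before any KEY=value.
    if extras and extras[0].partition("=")[0] not in KEYWORDS:
        event["descriptor_id"] = extras.pop(0)
    for extra in extras:
        key, sep, value = extra.partition("=")
        if sep and key in KEYWORDS:
            event[key.lower()] = value

    # Convenience only: 56 base32 characters is a v3 address, 16 is v2.
    event["version"] = {56: 3, 16: 2}.get(len(event["address"]))
    return event
```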

3.2.2. HS_DESC_CONTENT

   For this event to support version 3, new values are added but the syntax is
   unchanged:

     "650" "+" "HS_DESC_CONTENT" SP HSAddress SP DescId SP HsDir CRLF
                Descriptor CRLF "." CRLF "650" SP "OK" CRLF

     HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN"
     DescId = 32*Base32Character / 32*Base64Character
     HsDir = LongName / "UNKNOWN"
     Descriptor = The text of the descriptor formatted as specified in
                  rend-spec-v3.txt section 2.4 or empty string on failure.

   The "HSAddress" key is extended to accept 56 base32 characters which is the
   format of a version 3 onion address.

   The "DescId" key is extended to accept 32 base64 characters which is
   the descriptor blinded key used for the index value at the "HsDir".

3.2.3. CIRC and CIRC_MINOR

   These circuit events have an optional field named "REND_QUERY" which takes
   an "HSAddress". This field is extended to support v3 addresses:

      HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN"
Filename: 285-utf-8.txt
Title: Directory documents should be standardized as UTF-8
Author: Nick Mathewson
Created: 13 November 2017
Status: Accepted
Target: arti-dirauth
Ticket: https://gitlab.torproject.org/tpo/core/tor/-/issues/40131

1. Summary and motivation

   People frequently want to include non-ASCII text in their router
   descriptors.  The Contact line is a favorite place to do this, but in
   principle the platform line would also be pretty logical.

   Unfortunately, there's no specified way to encode non-ASCII in our
   directory documents.

   Fortunately, almost everybody who does it uses UTF-8 anyway.

   As we move towards Rust support in Tor, we gain another motivation
   for standardizing on UTF-8, since Rust's native strings strongly prefer
   UTF-8.

   So, in this proposal, we describe a migration path to having all
   directory documents be fully UTF-8.

   (See 2.3 below for a discussion of what exactly we mean by "non-UTF-8".)

2. Proposal

   First, we should have Tor relays reject ContactInfo lines (and any
   other lines copied directly into router descriptors) that are not
   UTF-8.

   At the same time, we should have authorities reject any router
   descriptors or extrainfo documents that are not valid UTF-8.
   Simultaneously, we can have all Tor instances reject all
   non-directory-descriptor directory documents that are not UTF-8,
   since none should exist today.

   Finally, once the authorities have updated, we should have all Tor
   instances reject all directory documents that are not UTF-8.  (We
   should not take this step until the authorities have upgraded, or
   else the behavior of updated and non-updated clients could be
   distinguished.)

2.1. Hidden service descriptors' encrypted bodies

   For the encrypted bodies of hidden service descriptors, we cannot
   reject them at the authority level, and so we need to take a slightly
   different approach to prevent client fingerprinting attacks.

   First, we should make Tor instances start warning about any hidden
   service descriptors whose bodies, post-decryption, contain non-utf-8
   plaintext.  At the same time, we add a consensus parameter to
   indicate that hidden service descriptors with non-utf-8 plaintexts
   should be rejected entirely: "reject-encrypted-non-utf-8".  If that
   parameter is set to 1, then hidden service clients will not only
   warn, but reject the descriptors.

   Once the vast majority of clients are running versions that support
   the "reject-encrypted-non-utf-8" parameter, that parameter can be set
   to 1.

2.2. Bridge descriptors

   Since clients download bridge descriptors directly from the bridges, they
   also need a two-phase plan as for hidden service descriptors above.  Here
   we take the same approach as in section 2.1 above, except using the
   parameter "reject-bridge-descriptor-non-utf-8".

2.3. Which UTF-8 exactly?

   We define the allowable set of UTF-8 as:
      * Zero or more Unicode scalar values (as defined by The Unicode
        Standard, Version 3.1 or later), that is:
         * Unicode code points U+00 through U+10FFFF,
         * but excluding the code points U+D800 through U+DFFF,
      * Excluding the scalar value U+00 (for compatibility with NUL-terminated
        C strings),
      * Serialized using the UTF-8 encoding scheme (as defined by The Unicode
        Standard, Version 3.1 or later), in particular:
         * each code point is encoded with the shortest possible encoding,
      * Without a Unicode byte order mark (BOM, U+FEFF) at the start of the
        descriptor. (BOMs are optional and not recommended in UTF-8. Allowing
        a BOM would break backwards compatibility with ASCII-only Tor
        implementations.) Byte-swapped BOMs (U+FFFE) must also be rejected.

   In order to remain compatible with future versions of The Unicode Standard,
   we allow all possible code points, including Reserved code points.

   For languages with a conforming UTF-8 implementation (as defined by The
   Unicode Standard, Version 3.1 or later), this is equivalent to well-formed
   UTF-8, with the following additional rules:
      * reject a BOM (U+FEFF) or byte-swapped BOM (U+FFFE) at the start of the
        descriptor,
      * reject U+00 at any point in the descriptor,
      * accept all code point types used in UTF-8, including Control,
        Private-Use, Noncharacter, and Reserved. (The Surrogate code point type
        is not used in UTF-8.)

   For languages without a conforming UTF-8 implementation, we recommend
   checking UTF-8 conformity based on the "Well-Formed UTF-8 Byte Sequences"
   table from The Unicode Standard, Version 11 (or later).

   Note that U+00 is serialized to 0x00, but U+FEFF is serialized to 0xEFBBBF,
   and U+FFFE is serialized to 0xEFBFBE.
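
   In Python, whose built-in strict UTF-8 decoder already rejects overlong
   encodings, encoded surrogates, and code points above U+10FFFF, the
   additional rules could be sketched as follows. The function name is
   hypothetical.

```python
def is_allowed_utf8(data: bytes) -> bool:
    """Check a document against the UTF-8 rules in this section."""
    try:
        # Strict decoding enforces well-formed, shortest-form UTF-8.
        text = data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    # Reject U+00 anywhere, for compatibility with NUL-terminated strings.
    if "\x00" in text:
        return False
    # Reject a BOM or a byte-swapped BOM at the start of the document.
    if text[:1] in ("\ufeff", "\ufffe"):
        return False
    return True
```

   Note that the plain "utf-8" codec is used rather than "utf-8-sig", since
   the latter would silently strip a leading BOM instead of exposing it.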

3. References

   The Unicode Standard, Version 11, Chapter 3.
   In particular:
      * Unicode scalar values: D76, page 120.
      * UTF-8 encoding form: D92, pages 125-127.
      * Well-Formed UTF-8 Byte Sequences: Table 3-7, page 126.
      * Byte order mark: C11, page 83; D94, page 130.
      * UTF-8 encoding scheme: D96, page 130.
Filename: 286-hibernation-api.txt
Title: Controller APIs for hibernation access on mobile
Author: Nick Mathewson
Created: 30-November-2017
Status: Rejected

Notes: This proposal was useful for our early thinking, but a simpler
 solution (DisableNetwork) proved much more useful.


1. Introduction

   On mobile platforms, battery life is achieved by reducing
   needless network access and CPU access.  Tor currently provides
   few ways for controllers and operating systems to tune its
   behavior.

   This proposal describes controller APIs for better management of
   Tor's hibernation mechanisms, and extensions to those mechanisms,
   for better power management in mobile environments.

1.1. Background: hibernation and idling in Tor today

   We have an existing "hibernation" mechanism that we use to
   implement "bandwidth accounting" and "slow shutdown" mechanisms:
   When a Tor instance is close to its bandwidth limit, it stops
   accepting new connections or circuits, and only processes those
   it has, until the bandwidth limit is reached.  Once the bandwidth
   limit is reached, Tor closes all connections and circuits, and
   all non-controller listeners, until a new accounting limit
   begins.

   Tor handles the INT signal on relays similarly: it stops
   accepting new connections or circuits, and gives the existing
   ones a short interval in which to shut down.  Then Tor closes all
   connections and exits the process entirely.

   Tor's "idle" mechanism is related to hibernation, though its
   implementation is separate.  When a Tor client has passed a
   certain amount of time without any user activity, it declares
   itself "idle" and stops performing certain background tasks, such
   as fetching directory information, or building circuits in
   anticipation of future needs.  (This is tied in the codebase to
   the "predicted ports" mechanism, but it doesn't have to be.)


1.2. Background: power-management signals on mobile platforms

   (I'm not a mobile developer, so I'm about to wildly oversimplify.
   Please let me know where I'm wrong.)

   Mobile platforms achieve long battery life by turning off the
   parts they don't need.  The most important parts to turn off are
   the antenna(s) and the screen; the CPU can be run in a slower
   mode.

   But it doesn't do much good turning things off when they're
   unused, if some background app is going to make sure that they're
   always in use!  So mobile platforms use signals of various kinds
   to tell applications "okay, shut up now".

   Some apps need to do online background activities periodically;
   to help this out, mobile platforms give them a signal "Hey, now
   is a good time if you want to do that" and "stop now!"


1.3. Mostly out-of-scope: limiting CPU wakeups when idle.

   The changes described here will be of limited use if we do not
   also alter Tor so that, when it's idle, the CPU is pretty quiet.
   That isn't the case right now: we have large numbers of callbacks
   that happen periodically (every second, every minute, etc)
   whether they need to or not.  We're hoping to limit those, but
   that's not what this proposal is about.


2. Improvements to the hibernation model

   To present a consistent interface that applications and
   controllers can use to manage power consumption, we make these
   enhancements to our hibernation model.

   First, we add four new hibernation states: "IDLE",
   "IDLE_UPDATING", "SLEEP", and "SLEEP_UPDATING".

   "IDLE" is like the current "idle" or "no predicted ports" state:
   Tor doesn't launch circuits or start any directory activity, but
   its listeners are still open.  Tor clients can enter the IDLE
   state on their own when they are LIVE, but haven't gotten any
   client activity for a while.  Existing connections and circuits
   are not closed. If the Tor instance receives any new connections,
   it becomes LIVE.

   "IDLE_UPDATING" is like IDLE, except that Tor should check for
   directory updates as appropriate.  If there are any, it should
   fetch directory information, and then become IDLE again.

   "SLEEP" is like the current "dormant" state we use for
   bandwidth exhaustion, but it is controller-initiated: it begins
   when Tor is told to enter it, and ends when Tor is told to leave
   it.  Existing connections and circuits are closed; listeners are
   closed too.

   "SLEEP_UPDATING" is like SLEEP, except that Tor should check for
   directory updates as appropriate.  If there are any, it should
   fetch directory information, and then SLEEP again.


2.1. Relay operation

   Relays and bridges should not automatically become IDLE on their
   own.


2.2. Onion service operation

   When a Tor instance that is running an onion service is IDLE, it
   does the minimum to try to remain responsive on the onion
   service: It keeps its introduction points open if it can. Once a
   day, it fetches new directory information and opens new
   introduction points.


3. Controller hibernation API

3.1. Examining the current hibernation state

   We define a new "GETINFO status/hibernation" to inspect the
   current hibernation state.  Possible values are:
     - "live"
     - "idle:control"
     - "idle:no-activity"
     - "sleep:control"
     - "sleep:accounting"
     - "idle-update:control"
     - "sleep-update:control"
     - "shutdown:exiting"
     - "shutdown:accounting"
     - "shutdown:control"

   The first part of each value indicates Tor's current state:
      "live" -- completely awake
      "idle" -- waiting to see if anything happens
      "idle-update" -- waiting to see if anything happens; probing
         for directory information
      "sleep" -- completely unresponsive
      "shutdown" -- unresponsive to new requests; still processing
         existing requests.

   The second part of each value indicates the reason that Tor
   entered this state:
      "control" -- a controller told us to do this.
      "no-activity" -- Tor became idle on its own due to not
         noticing any requests.
      "accounting" -- the bandwidth system told us to enter this
         state.
      "exiting" -- Tor is in this state because it's getting ready
         to exit.

   We add a STATUS_GENERAL hibernation event as follows:

      HIBERNATION
      "STATUS=" (one of the status pairs above.)

      Indicates that Tor's hibernation status has changed.

   Note: Controllers MUST accept status values here that they don't
   recognize.
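
   A controller could split these values as in the Python sketch below. The
   function name and the returned tuple are hypothetical. In keeping with
   the note above, unknown values are reported rather than rejected.

```python
KNOWN_STATES = {
    "live", "idle", "idle-update", "sleep", "sleep-update", "shutdown",
}


def parse_hibernation_status(value: str):
    """Return (state, reason, recognized) for a hibernation status value."""
    state, _, reason = value.partition(":")
    # "live" carries no reason; report it as None.
    return state, (reason or None), state in KNOWN_STATES
```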

   The "GETINFO accounting/hibernating" value and the "STATUS_SERVER
   HIBERNATION_STATUS" event keep their old meaning.

3.2. Changing the hibernation state

   We add the following new possible values to the SIGNAL controller
   command:
      "SLEEP" -- enter the sleep state, after an appropriate
         shutdown interval.

      "IDLE" -- enter the idle state

      "SLEEPWALK" -- If in sleep or idle, start probing for
         directory information in the sleep-update or idle-update
         state respectively.  Remain in that state until we've
         probed for directory information, or until we're told to
         IDLE or SLEEP again, or (if we're idle) until we get client
         activity. Has no effect if not in sleep or idle.

      "WAKEUP" -- If in sleep, sleep-update, idle, idle-update, or
         shutdown:sleep state, enter the live state.  Has no effect
         in any other state.
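
   The transitions described above can be summarized as a small state
   machine. This Python sketch is a simplification: it ignores the shutdown
   interval that precedes SLEEP and the automatic return from the updating
   states, and the function name is hypothetical.

```python
def apply_signal(state: str, signal: str) -> str:
    """Return the hibernation state that follows `signal` in `state`."""
    if signal == "SLEEP":
        return "sleep"  # simplification: the shutdown interval is skipped
    if signal == "IDLE":
        return "idle"
    if signal == "SLEEPWALK":
        # Only meaningful in sleep or idle; otherwise no effect.
        return {"sleep": "sleep-update", "idle": "idle-update"}.get(
            state, state
        )
    if signal == "WAKEUP":
        wakeable = {
            "sleep", "sleep-update", "idle", "idle-update", "shutdown:sleep",
        }
        return "live" if state in wakeable else state
    raise ValueError(f"unknown signal: {signal}")
```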

3.3. New configuration parameters

   StartIdle -- Boolean.  If set to 1, Tor begins in IDLE mode.

   
Filename: 287-reduce-lifetime.txt
Title: Reduce circuit lifetime without overloading the network
Author: Fernando Fernandez Mancera
Created: 30-Nov-2017
Status: Open

Motivation:

Currently Tor users are reusing a given circuit for ten minutes (by default)
after it's first used. This time is too long because a malicious Exit relay can
trace a user's pseudonymous profile, especially if connections from multiple
protocols are put on the same circuit.

This time is set by the MaxCircuitDirtiness parameter, whose default value is
ten minutes.

I have been thinking of a way to fix this. The first idea that came to my mind
was to use StreamIsolationByHost and StreamIsolationByPort on it, but I wasn't
able to sort it out.

One day, I thought "Why is time so important?" and later on I realized that
maybe focusing on the amount of bytes running through the circuit could end up
being a better approach to this problem.

Design:

I propose two options to reduce this problem, both based on taking into account
the amount of bytes running through a circuit.

MaxCircuitSizeDirtiness (temporary parameter name) will take an integer value
within a bounded interval, representing the maximum number of bytes that can be
written/read (we need to discuss whether to use one limit for both) by the
circuit. If the circuit exceeds that amount, new streams won't use this
circuit anymore.

MaxCircuitSizeDirtinessByPort (temporal parameter name) will take an array of
integers that are contained on an interval and represents the maximum amount of
bytes that can be written/read (we need to discuss about the use of one for
both) by the circuit per port (StreamIsolationByPort). This array is parallel
to the array of ports from StreamIsolationByPort. If the circuit exceeds that
amount, new streams won't use this circuit anymore.

Regarding default values, it would be useful to set MaxCircuitSizeDirtiness a
bit lower than the average number of bytes per circuit. For
MaxCircuitSizeDirtinessByPort, after discussion, we shouldn't set a default
value, because it could let someone identify the port used. As for
MaxCircuitDirtiness, if the others are set by default it could be bigger, like
thirty minutes, so that if the user doesn't send/receive a significant amount
of data the circuit will be changed anyway.

Security Implications:

It is believed that the proposed changes will improve anonymity for end
users. The end user won't reuse a given circuit if they have sent a
considerable amount of bytes, thus making it more difficult for malicious Exit
relays to trace a user's pseudonymous profile.

Obviously this is a matter of probability: it's possible that sensitive data
will leak in a small amount of data, but it's even more likely that sensitive
data will leak in a large amount.

Specification:

In order to implement this feature we will need to add some new
functionalities. We need to parse MaxCircuitSizeDirtiness and
MaxCircuitSizeDirtinessByPort from the torrc config file. We need to create a
function or improve one to check the amount of bytes that are running through
the circuit and if this amount is higher than the established value, consider
the circuit dirty.
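
A minimal sketch of the check described above. The names used here (Circuit,
is_dirty, and the two limit arguments) are hypothetical and are not part of
Tor's code base; the sketch also assumes that read and written bytes share a
single limit, which the design section leaves open.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    """Hypothetical stand-in for a client-side circuit record."""
    first_used: float        # time of first use, in seconds
    bytes_read: int = 0
    bytes_written: int = 0


def is_dirty(circ, now, max_dirtiness_secs, max_size_dirtiness_bytes):
    """Return True if new streams should no longer be attached to circ."""
    # Existing behaviour: the circuit has been in use for too long.
    too_old = (now - circ.first_used) >= max_dirtiness_secs
    # Proposed behaviour (assumption: one limit for read + written bytes).
    total = circ.bytes_read + circ.bytes_written
    too_big = total >= max_size_dirtiness_bytes
    return too_old or too_big
```

With this shape, raising MaxCircuitDirtiness to thirty minutes only changes
the max_dirtiness_secs argument; the byte limit still retires busy circuits
early.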

Compatibility:

The proposed changes should not create any compatibility issues. New Tor
clients will be able to take advantage of this without any modification to the
network.

Implementation:

It is proposed that MaxCircuitSizeDirtiness be enabled by default and that
MaxCircuitDirtiness be increased to thirty minutes.

It is proposed that MaxCircuitSizeDirtinessByPort won't be enabled by default
for ports 22, 53, and 80, as is the case for StreamIsolationByPort.

Tor Browser, or any other Tor application that manages circuits on its own
because the KeepAliveIsolateSOCKSAuth option is active by default, shouldn't
be affected by this new feature, in the same way that it currently ignores the
MaxCircuitDirtiness parameter.

Performance and scalability notes: 

The proposed changes will reduce stress on the Tor network, as users who do
not exceed the set amount will reduce circuit generation by a factor of three
(if the default MaxCircuitDirtiness value is thirty minutes).

I want to work on demonstrating that through research, but first it would be
nice to get the idea accepted.

References:

Tor project research ideas [https://research.torproject.org/ideas.html]

Enhancing Tor's Performance using Real-time Traffic Classification
[https://www.cypherpunks.ca/~iang/pubs/difftor-ccs.pdf] (It's not exactly about
that, but they talk about circuit lifetime and the ten minutes problem a few
times. Also it's an interesting paper.)
Filename: 288-privcount-with-shamir.txt
Title: Privacy-Preserving Statistics with Privcount in Tor (Shamir version)
Author: Nick Mathewson, Tim Wilson-Brown, Aaron Johnson
Created: 1-Dec-2017
Supercedes: 280
Status: Reserve

0. Acknowledgments

  Tariq Elahi, George Danezis, and Ian Goldberg designed and implemented
  the PrivEx blinding scheme. Rob Jansen and Aaron Johnson extended
  PrivEx's differential privacy guarantees to multiple counters in
  PrivCount:

  https://github.com/privcount/privcount/blob/master/README.markdown#research-background

  Rob Jansen and Tim Wilson-Brown wrote the majority of the experimental
  PrivCount code, based on the PrivEx secret-sharing variant. This
  implementation includes contributions from the PrivEx authors, and
  others:

  https://github.com/privcount/privcount/blob/master/CONTRIBUTORS.markdown

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

  The use of a Shamir secret-sharing-based approach is due to a
  suggestion by Aaron Johnson (iirc); Carolin Zöbelein did some helpful
  analysis here.

  Aaron Johnson and Tim Wilson-Brown made improvements to the draft proposal.

1. Introduction and scope

  PrivCount is a privacy-preserving way to collect aggregate statistics
  about the Tor network without exposing the statistics from any single
  Tor relay.

  This document describes the behavior of the in-Tor portion of the
  PrivCount system.  It DOES NOT describe the counter configurations,
  or any other parts of the system. (These will be covered in separate
  proposals.)

2. PrivCount overview

  Here follows an oversimplified summary of PrivCount, with enough
  information to explain the Tor side of things.  The actual operation
  of the non-Tor components is trickier than described below.

  In PrivCount, a Data Collector (DC, in this case a Tor relay) shares
  numeric data with N different Tally Reporters (TRs). (A Tally Reporter
  performs the summing and unblinding roles of the Tally Server and Share
  Keeper from experimental PrivCount.)

  All N Tally Reporters together can reconstruct the original data, but
  no (N-1)-sized subset of the Tally Reporters can learn anything about
  the data.

  (In reality, the Tally Reporters don't reconstruct the original data
  at all! Instead, they will reconstruct a _sum_ of the original data
  across all participating relays.)

  In brief, the system works as follows:

  To share data, for each counter value V to be shared, the Data Collector
  first adds Gaussian noise to V in order to produce V', uses (K,N) Shamir
  secret-sharing to generate N shares of V' (K<=N, K being the
  reconstruction threshold), encrypts each share to a different Tally
  Reporter, and sends each encrypted share to the Tally Reporter it
  is encrypted for.

  The Tally Reporters then agree on the set S of Data Collectors that sent
  data to all of them, and each Tally Reporter forms a share of the aggregate
  value by decrypting the shares it received from the Data Collectors in S
  and adding them together. The Tally Reporters then, collectively, perform
  secret reconstruction, thereby learning the sum of all the different
  values V'.

  The use of Shamir secret sharing lets us survive up to N-K crashing TRs.
  Waiting until the end to agree on a set S of surviving relays lets us
  survive an arbitrary number of crashing DCs. In order to prevent bogus
  data from corrupting the tally, the Tally Reporters can perform the
  aggregation step multiple times, each time proceeding with a different
  subset of S and taking the median of the resulting values.

  Relay subsets should be chosen at random to avoid relays manipulating their
  subset membership(s). If a shared random value is required, all relays must
  submit their results, and then the next revealed shared random value can
  be used to select relay subsets. (Tor's shared random value can be
  calculated as soon as all commits have been revealed. So all relay results
  must be received *before* any votes are cast in the reveal phase for that
  shared random value.)

  Below we describe the algorithm in more detail, and describe the data
  format to use.

3. The algorithm

  All values below are B-bit integers modulo some prime P; we suggest
  B=62 and P = 2**62 - 2**30 - 1 (hex 0x3fffffffbfffffff).  The size of
  this field is an upper limit on the largest sum we can calculate; it
  is not a security parameter.
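
  The suggested constants can be checked with a few lines of arithmetic.
  This only confirms the hexadecimal form and the bit length given above;
  it does not test primality.

```python
B = 62
P = 2**62 - 2**30 - 1

# The hexadecimal form quoted in the text.
assert P == 0x3fffffffbfffffff

# P fits in B bits, so any value reduced modulo P is a B-bit integer.
assert P.bit_length() == B
```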

  There are N Tally Reporters: every participating relay must agree on
  which N exist, and on their current public keys.  We suggest listing
  them in the consensus networkstatus document.  All parties must also
  agree on some ordering of the Tally Reporters.  Similarly, all parties
  must also agree on some value K<=N.

  There are a number of well-known "counters", identified by ASCII
  identifiers.  Each counter is a value that the participating relays
  will know how to count.  Let C be the number of counters.

3.1. Data Collector (DC) side

  At the start of each period, every Data Collector ("client" below)
  initializes their state as follows:

      1. For every Tally Reporter with index i, the client constructs a
         random 32-byte value SEED_i.  The client then generates
         a pseudorandom bitstream using the SHAKE-256
         XOF with SEED_i as its input, and divides this stream into
         C values, with the c'th value denoted by MASK(i, c).

         [To divide the stream into values, consider the stream 8 bytes at a
         time as unsigned integers in network (big-endian) order. For each
         such integer, clear the top (64-B) bits.  If the result is less than
         P, then include the integer as one of the MASK(i, .) values.
         Otherwise, discard this 8-byte segment and proceed to the next
         value.]

      2. The client encrypts SEED_i using the public key of Tally
         Reporter i, and remembers this encrypted value.  It discards
         SEED_i.

      3. For every counter c, the client generates a noise value Z_c
         from an appropriate Gaussian distribution. If the noise value is
         negative, the client adds P to bring Z_c into the range 0...(P-1).
         (The noise MUST be sampled using the procedure in Appendix C.)

         The client then uses Shamir secret sharing to generate
         N shares (x,y) of Z_c, 1 <= x <= N, with the x'th share to be used by
         the x'th Tally Reporter.  See Appendix A for more on Shamir secret
         sharing.  See Appendix B for another idea about X coordinates.

         The client picks a random value CTR_c and stores it in the counter,
         which serves to locally blind the counter.

         The client then subtracts (MASK(x, c)+CTR_c) from y, giving
         "encrypted shares" of (x, y0) where y0 = y-CTR_c-MASK(x, c).

         The client then discards all MASK values, all original shares
         (x,y), and the noise value Z_c. For each counter c, it remembers
         the counter (initialized to CTR_c) and the N encrypted shares of
         the form (x, y0).

  To increment a counter by some value "inc":

      1. The client adds "inc" to counter value, modulo P.

         (This step is chosen to be optimal, since it will happen more
         frequently than any other step in the computation.)

         Aggregate counter values that are close to P/2 MUST be scaled to
         avoid overflow. See Appendix D for more information. (We do not think
         that any counters on the current Tor network will require scaling.)

  To publish the counter values:

      1. The client publishes, in the format described below:

         The list of counters it knows about
         The list of TRs it knows about
         For each TR:
            For each counter c:
                A list of (i, y-CTR_c-MASK(x,c)), which corresponds
                to the share for the i'th TR of counter c.
            SEED_i as encrypted earlier to the i'th TR's public key.

3.2. Tally Reporter (TR) side

  This section is less completely specified than the Data Collector's
  behavior: I expect that the TRs will be easier to update as we proceed.

  (Each TR has a long-term identity key (ed25519).  It also has a
  sequence of short-term curve25519 keys, each associated with a single
  round of data collection.)

   1. When a group of TRs receives information from the Data Collectors,
      they collectively choose a set S of DCs and a set of counters such
      that every TR in the group has a valid entry for every counter,
      from every DC in the set.

      To be valid, an entry must not only be well-formed, but must also
      have the x coordinate in its shares corresponding to the
      TR's position in the list of TRs.

   2. For each Data Collector's report, the i'th TR decrypts its part of
      the client's report using its curve25519 key.  It uses SEED_i and
      SHAKE-256 to regenerate MASK(0) through MASK(C-1).  Then for each
      share (x, y-CTR_c-MASK(x,c)) (note that x=i), the TR reconstructs the
      true share of the value for that DC and counter c by adding
      V+MASK(x,c) to the y coordinate to yield the share (x, y_final).

   3. For every counter in the set, each TR computes the sum of the
      y_final values from all clients.

   4. For every counter in the set, each TR publishes its share of
      the sum as (x, SUM(y_final)).

   5. If at least K TRs publish correctly, then the sum can be
      reconstructed using Lagrange polynomial interpolation. (See
      Appendix A).

   6. If the reconstructed sum is greater than P/2, it is probably a negative
      value. The value can be obtained by subtracting P from the sum.
      (Negative values are generated when negative noise is added to small
      signals.)

   7. If scaling has been applied, the sum is scaled by the scaling factor.
      (See Appendix D.)
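
  Step 6 amounts to reading the reconstructed field element as a signed
  integer. A minimal sketch, assuming the suggested value of P (to_signed is
  a hypothetical helper name):

```python
P = 2**62 - 2**30 - 1


def to_signed(reconstructed):
    """Interpret a value in 0..P-1 as a signed sum (step 6)."""
    # P is odd, so "greater than P/2" is the same as "greater than P // 2".
    if reconstructed > P // 2:
        return reconstructed - P
    return reconstructed
```

  For example, a true sum of -3 is reconstructed as P-3, and to_signed maps
  it back to -3.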

4. The document format

4.1. The counters document.

  This document format builds on the line-based directory format used
  for other tor documents, described in Tor's dir-spec.txt.

  Using this format, we describe a "counters" document that publishes
  the shares collected by a given DC, for a single TR.

  The "counters" document has these elements:

    "privctr-dump-format" SP VERSION SP SigningKey

       [At start, exactly once]

       Describes the version of the dump format, and provides an ed25519
       signing key to identify the relay. The signing key is encoded in
       base64 with padding stripped. VERSION is "alpha" now, but should
       be "1" once this document is finalized.

    "starting-at" SP IsoTime

       [Exactly once]

       The start of the time period when the statistics here were
       collected.

    "ending-at" SP IsoTime

       [Exactly once]

       The end of the time period when the statistics here were
       collected.

    "share-parameters" SP Number SP Number

       [Exactly once]

       The number of shares needed to reconstruct the client's
       measurements (K), and the number of shares produced (N),
       respectively.

    "tally-reporter" SP Identifier SP Integer SP Key

       [At least twice]

       The curve25519 public key of each Tally Reporter that the relay
       believes in.  (If the list does not match the list of
       participating Tally Reporters, they won't be able to find the
       relay's values correctly.)  The identifiers are non-space,
       non-nul character sequences.  The Key values are encoded in
       base64 with padding stripped; they must be unique within each
       counters document.  The Integer values are the X coordinate of
       the shares associated with each Tally Reporter.

    "encrypted-to-key" SP Key

       [Exactly once]

       The curve25519 public key to which the report below is encrypted.
       Note that it must match one of the Tally Reporter options above.


    "report" NL
      "----- BEGIN ENCRYPTED MESSAGE-----" NL
      Base64Data
      "----- END ENCRYPTED MESSAGE-----" NL

      [Exactly once]

      An encrypted document, encoded in base64. The plaintext format is
      described in section 4.2. below. The encryption is as specified in
      section 5 below, with STRING_CONSTANT set to "privctr-shares-v1".

    "signature" SP Signature

       [At end, exactly once]

       The Ed25519 signature of all the fields in the document, from the
       first byte, up to but not including the "signature" keyword here.
       The signature is encoded in base64 with padding stripped.

4.2. The encrypted "shares" document.

  The shares document is sent, encrypted, in the "report" element above.
  Its plaintext contents include these fields:

   "encrypted-seed" NL
      "----- BEGIN ENCRYPTED MESSAGE-----" NL
      Base64Data
      "----- END ENCRYPTED MESSAGE-----" NL

      [At start, exactly once.]

      An encrypted document, encoded in base64. The plaintext value is
      the 32-byte value SEED_i for this TR. The encryption is as
      specified in section 5 below, with STRING_CONSTANT set to
      "privctr-seed-v1".

   "d" SP Keyword SP Integer

      [Any number of times]

      For each counter, the name of the counter, and the obfuscated Y
      coordinate of this TR's share for that counter.  (The Y coordinate
      is calculated as y-CTR_c-MASK(x,c) as in 3.1 above.)  The order of counters
      must correspond to the order used when generating the MASK() values;
      different clients do not need to choose the same order.

5. Hybrid encryption

   This scheme is taken from rend-spec-v3.txt, section 2.5.3, replacing
   "secret_input" and "STRING_CONSTANT".  It is a hybrid encryption
   method for encrypting a message to a curve25519 public key PK.

     We generate a new curve25519 keypair (sk,pk).

     We run the algorithm of rend-spec-v3.txt 2.5.3, replacing
     "secret_input" with Curve25519(sk,PK) | SigningKey, where
     SigningKey is the DC's signing key.  (Including the DC's SigningKey
     here prevents one DC from replaying another one's data.)

     We transmit the encrypted data as in rend-spec-v3.txt 2.5.3,
     prepending pk.


Appendix A. Shamir secret sharing for the impatient

   In Shamir secret sharing, you want to split a value in a finite
   field into N shares, such that any K of the N shares can
   reconstruct the original value, but K-1 shares give you no
   information at all.

   The key insight here is that you can reconstruct a K-degree
   polynomial given K+1 distinct points on its curve, but not given
   K points.

   So, to split a secret, we are going to generate a (K-1)-degree
   polynomial.  We'll make the Y intercept of the polynomial be our
   secret, and choose all the other coefficients at random from our
   field.

   Then we compute the (x,y) coordinates for x in [1, N].  Now we
   have N points, any K of which can be used to find the original
   polynomial.

   Moreover, we can do what PrivCount wants here, because adding the
   y coordinates of N shares gives us shares of the sum:  If P1 is
   the polynomial made to share secret A and P2 is the polynomial
   made to share secret B, and if (x,y1) is on P1 and (x,y2) is on
   P2, then (x,y1+y2) will be on P1+P2 ... and moreover, the y
   intercept of P1+P2 will be A+B.
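
   The splitting and summing described above can be sketched with plain
   integers modulo P. The function names (split, reconstruct, add_shares)
   are illustrative only, and pow(den, -1, P) needs Python 3.8 or later.

```python
import secrets

P = 2**62 - 2**30 - 1  # field modulus suggested in section 3


def split(secret, k, n):
    """Return n shares (x, y) of secret with reconstruction threshold k."""
    # Degree-(k-1) polynomial: the constant term is the secret and the
    # remaining coefficients are uniformly random field elements.
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):        # Horner evaluation modulo P
            y = (y * x + c) % P
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Lagrange-interpolate at x = 0 to recover the y intercept."""
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * xj % P
                den = den * (xj - xi) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def add_shares(a, b):
    """Add two share lists pointwise; the x coordinates must line up."""
    assert all(xa == xb for (xa, _), (xb, _) in zip(a, b))
    return [(xa, (ya + yb) % P) for (xa, ya), (_, yb) in zip(a, b)]
```

   Splitting 1000 and 234 with K=3, N=5, adding the shares pointwise, and
   reconstructing from any three of the summed shares yields 1234, without
   any party ever holding shares that reveal 1000 or 234 individually.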

   To reconstruct a secret from a set of shares, you have to either
   go learn about Lagrange polynomials, or just blindly copy a
   formula from your favorite source.

   Here is such a formula, as pseudocode^Wpython, assuming that
   each share is an object with a _x field and a _y field.

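     def interpolate(shares):
         # Lagrange interpolation at x = 0: recovers the y intercept,
         # which is the shared secret.  FE is a field element type
         # (arithmetic modulo P); each share has _x and _y fields.
         accumulator = FE(0)
         for sh in shares:
             product_num = FE(1)
             product_denom = FE(1)
             for sh2 in shares:
                 if sh2 is sh:
                     continue
                 product_num *= sh2._x
                 product_denom *= (sh2._x - sh._x)

             accumulator += (sh._y * product_num) / product_denom

         return accumulator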


Appendix B. An alternative way to pick X coordinates

   Above we describe a system where everybody knows the same TRs and
   puts them in the same order, and then does Shamir secret sharing
   using "x" as the x coordinate for the x'th TR.

   But what if we remove that requirement by having x be based on a hash
   of the public key of the TR?  Everything would still work, so long as
   all users chose the same K value.  It would also let us migrate TR
   sets a little more gracefully.
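
   One way this could look is sketched below. Everything specific here is an
   assumption made for illustration: the appendix does not name a hash
   function (SHA3-256 is used only as an example), x_from_public_key is a
   hypothetical name, and the handling of x = 0 is a guess.

```python
import hashlib

P = 2**62 - 2**30 - 1


def x_from_public_key(pubkey):
    """Map a TR public key (bytes) to a nonzero x coordinate modulo P."""
    digest = hashlib.sha3_256(pubkey).digest()
    x = int.from_bytes(digest, "big") % P
    # The share at x = 0 would be the secret itself, so it must never be
    # handed out.  Assumption: such a key is simply rejected.
    if x == 0:
        raise ValueError("public key hashes to x = 0")
    return x
```

   A real design would also need to say what happens if two Tally Reporters
   hash to the same x, since Shamir shares require distinct x coordinates.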


Appendix C. Sampling floating-point Gaussian noise for differential privacy

   Background:

   When we add noise to a counter value (signal), we want the added noise to
   protect all of the bits in the signal, to ensure differential privacy.

   But because noise values are generated from random double(s) using
   floating-point calculations, the resulting low bits are not distributed
   evenly enough to ensure differential privacy.

   As implemented in the C "double" type, IEEE 754 double-precision
   floating-point numbers contain 53 significant bits in their mantissa. This
   means that noise calculated using doubles can not ensure differential
   privacy for client activity larger than 2**53:
     * if the noise is scaled to the magnitude of the signal using
       multiplication, then the low bits are unprotected,
     * if the noise is not scaled, then the high bits are unprotected.

   But the operations in the noise transform also suffer from floating-point
   inaccuracy, further affecting the low bits in the mantissa. So we can only
   protect client activity up to 2**46 with Laplacian noise. (We assume that
   the limit for Gaussian noise is similar.)

   Our noise generation procedure further reduces this limit to 2**42. For
   byte counters, 2**42 is 4 Terabytes, or the observed bandwidth of a 1 Gbps
   relay running at full speed for 9 hours. It may be several years before we
   want to protect this much client activity. However, since the mitigation is
   relatively simple, we specify that it MUST be implemented.

   Procedure:

   Data collectors MUST sample noise as follows:
     1. Generate random double(s) in [0, 1] that are integer multiples of
        2**-53.
        TODO: the Gaussian transform in step 2 may require open intervals
     2. Generate a Gaussian floating-point noise value at random with sigma 1,
        using the random double(s) generated in step 1.
     3. Multiply the floating-point noise by the floating-point sigma value.
     4. Truncate the scaled noise to an integer to remove the fractional bits.
        (These bits can never correspond to signal bits, because PrivCount only
        collects integer counters.)
     5. If the floating-point sigma value from step 3 is large enough that any
        noise value could be greater than or equal to 2**46, we need to
        randomise the low bits of the integer scaled noise value. (This ensures
        that the low bits of the signal are always hidden by the noise.)

        If we use the sample_unit_gaussian() transform in nickm/privcount_nm:
        A. The maximum r value is sqrt(-2.0*ln(2**-53)) ~=  8.57, and the
           maximal sin(theta) values are +/- 1.0. Therefore, the generated
           noise values can be greater than or equal to 2**46 when the sigma
           value is greater than 2**42.
        B. Therefore, the number of low bits that need to be randomised is:
               N = floor(sigma / 2**42)
        C. We randomise the lowest N bits of the integer noise by replacing them
           with a uniformly distributed N-bit integer value in 0...(2**N)-1.
     6. Add the integer noise to the integer counter, before the counter is
        incremented in response to events. (This ensures that the signal value
        is always protected.)
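
   Steps 1 to 5 can be sketched as below. This is not the code from
   nickm/privcount_nm: unit_gaussian is a Box-Muller transform written only
   to match the r and sin(theta) description in step 5A, drawing the first
   uniform value from (0, 1] is an assumption that keeps the logarithm
   finite, and replacing the low bits of the magnitude while keeping the
   sign is likewise an assumption. Step 6 (adding the noise to the counter)
   is left out.

```python
import math
import secrets


def unit_gaussian():
    """Box-Muller sample built from doubles that are multiples of 2**-53."""
    u1 = (secrets.randbits(53) + 1) / 2**53   # in (0, 1], so log is finite
    u2 = secrets.randbits(53) / 2**53         # in [0, 1)
    r = math.sqrt(-2.0 * math.log(u1))        # at most about 8.57
    return r * math.sin(2.0 * math.pi * u2)


def sample_noise(sigma):
    """Return integer noise for a floating-point sigma (steps 1 to 5)."""
    scaled = unit_gaussian() * sigma          # steps 2 and 3
    noise = int(scaled)                       # step 4: truncate toward zero
    n_bits = math.floor(sigma / 2**42)        # step 5B, as written above
    if n_bits > 0:
        # Step 5C: replace the lowest n_bits bits with uniform random bits.
        sign = -1 if noise < 0 else 1
        magnitude = (abs(noise) >> n_bits) << n_bits
        noise = sign * (magnitude | secrets.randbits(n_bits))
    return noise
```

   The order of operations follows the list above (scale, truncate, then
   randomise), since the text warns that reordering them is unsafe.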

   This procedure is security-sensitive: changing the order of
   multiplications, truncations, or bit replacements can expose the low or
   high bits of the signal or noise.

   As long as the noise is sampled using this procedure, the low bits of the
   signal are protected. So we do not need to "bin" any signals.

   The impact of randomising more bits than necessary is minor, but if we fail
   to randomise an unevenly distributed bit, client activity can be exposed.
   Therefore, we choose to randomise all bits that could potentially be affected
   by floating-point inaccuracy.

   Justification:

   Although this analysis applies to Laplacian noise, we assume a similar
   analysis applies to Gaussian noise. (If we add Laplacian noise on DCs,
   the total ends up with a Gaussian distribution anyway.)

   TODO: check that the 2**46 limit applies to Gaussian noise.

   This procedure results in a Gaussian distribution for the higher ~42 bits
   of the noise. We can safely ignore the value of the lower bits of the noise,
   because they are insignificant for our reporting.

   This procedure is based on section 5.2 of:
   "On Significance of the Least Significant Bits For Differential Privacy"
   Ilya Mironov, ACM CCS 2012
   https://www.microsoft.com/en-us/research/wp-content/uploads/2012/10/lsbs.pdf

   We believe that this procedure is safe, because we neither round nor smooth
   the noise values. The truncation in step 4 has the same effect as Mironov's
   "safe snapping" procedure. Randomising the low bits removes the 2**46 limit
   on the sigma value, at the cost of departing slightly from the ideal
   infinite-precision Gaussian distribution. (But we already know that these
   bits are distributed poorly, due to floating-point inaccuracy.)

   Mironov's analysis assumes that a clamp() function is available to clamp
   large signal and noise values to an infinite floating-point value.
   Instead of clamping, PrivCount's arithmetic wraps modulo P. We believe that
   this is safe, because any reported values this large will be meaningless
   modulo P. And they will not expose any client activity, because "modulo P"
   is an arithmetic transform of the summed noised signal value.

   Alternatives:

   We could round the encrypted value to the nearest multiple of the
   unprotected bits. But this relies on the MASK() value being a uniformly
   distributed random value, and it is less generic.

   We could also simply fail when we reach the 2**42 limit on the sigma value,
   but we do not want to design a system with a limit that low.

   We could use a pure-integer transform to create Gaussian noise, and avoid
   floating-point issues entirely. But we have not been able to find an
   efficient pure-integer Gaussian or Laplacian noise transform. Nor do we
   know if such a transform can be used to ensure differential privacy.


Appendix D. Scaling large counters

   We do not believe that scaling will be necessary to collect PrivCount
   statistics in Tor. As of November 2017, the Tor network advertises a
   capacity of 200 Gbps, or 2**51 bytes per day. We can measure counters as
   large as ~2**61 before reaching the P/2 counter limit.
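
   The figures above can be reproduced with a short calculation, assuming
   the 200 Gbps capacity is sustained for a full day:

```python
import math

P = 2**62 - 2**30 - 1

capacity_bits_per_sec = 200e9
bytes_per_day = capacity_bits_per_sec / 8 * 86400

# Both results are expressed as powers of two for comparison.
log2_bytes_per_day = math.log2(bytes_per_day)   # just under 51
log2_counter_limit = math.log2(P // 2)          # just under 61
headroom_bits = log2_counter_limit - log2_bytes_per_day
```

   The roughly ten bits of headroom mean that a network-wide byte counter
   could cover on the order of a thousand days at this capacity before
   approaching P/2.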

   If scaling becomes necessary, we can scale event values (and noise sigmas)
   by a scaling factor before adding them to the counter. Scaling may introduce
   a bias in the final result, but this should be insignificant for reporting.


Appendix Z. Remaining client-side uncertainties

   [These are the uncertainties at the client side. I'm not considering
    TR-only operations here unless they affect clients.]

   Should we do a multi-level thing for the signing keys?  That is, have
   an identity key for each TR and each DC, and use those to sign
   short-term keys?

   How to tell the DCs the parameters of the system, including:
      - who the TRs are, and what their keys are?
      - what the counters are, and how much noise to add to each?
      - how do we impose a delay when the noise parameters change?
        (this delay ensures differential privacy even when the old and new
        counters are compared)
        - or should we try to monotonically increase counter noise?
      - when the collection intervals start and end?
      - what happens in networks where some relays report some counters, and
        other relays report other counters?
        - do we just pick the latest counter version, as long as enough relays
          support it?
          (it's not safe to report multiple copies of counters)

   How the TRs agree on which DCs' counters to collect?

   How data is uploaded to DCs?

   What to say about persistence on the DC side?
Filename: 289-authenticated-sendmes.txt
Title: Authenticating sendme cells to mitigate bandwidth attacks
Author: Rob Jansen, Roger Dingledine, David Goulet
Created: 2016-12-01
Status: Closed
Implemented-In: 0.4.1.1-alpha

1. Overview and Motivation

   In Rob's "Sniper attack", a malicious Tor client builds a circuit,
   fetches a large file from some website, and then refuses to read any
   of the cells from the entry guard, yet sends "sendme" (flow control
   acknowledgement) cells down the circuit to encourage the exit relay
   to keep sending more cells. Eventually enough cells queue at the
   entry guard that it runs out of memory and exits [0, 1].

   We resolved the "runs out of memory and exits" part of the attack with
   our Out-Of-Memory (OOM) manager introduced in Tor 0.2.4.18-rc. But
   the earlier part remains unresolved: a malicious client can launch
   an asymmetric bandwidth attack by creating circuits and streams and
   sending a small number of sendme cells on each to cause the target
   relay to receive a large number of data cells.

   This attack could be used for general mischief in the network (e.g.,
   consume Tor network bandwidth resources or prevent access to relays),
   and it could probably also be leveraged to harm anonymity a la the
   "congestion attack" designs [2, 3].

   This proposal describes a way to verify that the client has seen all
   of the cells that its sendme cell is acknowledging, based on the
   authenticated sendmes design from [1].

2. Sniper Attack Variations

   There are some variations on the attack involving the number and
   length of the circuits and the number of Tor clients used. We explain
   them here to help understand which of them this proposal attempts to
   defend against.

   We compare the efficiency of these attacks in terms of the number of
   cells transferred by the adversary and by the network, where receiving
   and sending a cell counts as two transfers of that cell.

2.1 Single Circuit, without Sendmes

   The simplest attack is where the adversary starts a single Tor client,
   creates one circuit and two streams to some website, and stops
   reading from the TCP connection to the entry guard. The adversary
   gets 1000 "attack" cells "for free" (until the stream and circuit
   windows close). The attack data cells are both received and sent by the
   exit and the middle, while being received and queued by the guard.

   Adversary:
   6 transfers to create the circuit
   2 to begin the two exit connections
   2 to send the two GET requests
   ---
   10 total

   Network:
   18 transfers to create the circuit
   22 to begin the two exit connections (assumes two for the exit TCP connect)
   12 to send the two GET requests to the website
   5000 for requested data (until the stream and circuit windows close)
   ---
   5052 total

2.2 Single Circuit, with Sendmes

   A slightly more complex version of the attack in 2.1 is where the
   adversary continues to send sendme cells to the guard (toward the exit),
   and then gets another 100 attack data cells sent across the network for every
   three additional exitward sendme cells that it sends (two stream-level
   sendmes and one circuit-level sendme). The adversary also gets another
   three clientward sendme cells sent by the exit for every 100 exitward
   sendme cells it sends.

   If the adversary sends N sendmes, then we have:

   Adversary:
   10 for circuit and stream setup
   N for circuit and stream sendmes
   ---
   10+N

   Network:
   5052 for circuit and stream setup and initial depletion of circuit windows
   N*100/3*5 for transferring additional data cells from the website
   N*3/100*4 for transferring sendmes from exit to client
   ---
   5052 + N*166.79
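
   The tallies above can be turned into a small calculator for comparing
   adversary and network cost. The function names are illustrative; the
   constants are the ones listed in sections 2.1 and 2.2.

```python
def adversary_transfers(n_sendmes):
    """Cell transfers performed by the adversary after N sendmes."""
    return 10 + n_sendmes


def network_transfers(n_sendmes):
    """Cell transfers performed by the network after N sendmes."""
    setup = 5052                              # setup plus initial windows
    data = n_sendmes * 100 / 3 * 5            # extra data cells
    sendmes = n_sendmes * 3 / 100 * 4         # exit-to-client sendmes
    return setup + data + sendmes


def amplification(n_sendmes):
    """Network transfers caused per adversary transfer."""
    return network_transfers(n_sendmes) / adversary_transfers(n_sendmes)
```

   As N grows, the ratio approaches the per-sendme factor of about 166.79,
   since the fixed setup costs on both sides become negligible.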

   It is important to note that once the adversary stops reading from the
   guard, it will no longer get feedback on the speed at which the data
   cells are able to be transferred through the circuit from the exit
   to the guard. It needs to approximate when it should send sendmes
   to the exit; if too many sendmes are sent such that the circuit
   window would open farther than 1000 cells (500 for streams), then the
   circuit may be closed by the exit. In practice, the adversary could
   take measurements during the circuit setup process and use them to
   estimate a conservative sendme sending rate.

2.3 Multiple Circuits

   The adversary could parallelize the above attacks using multiple
   circuits. Because the adversary needs to stop reading from the TCP
   connection to the guard, they would need to do a pre-attack setup
   phase during which they construct the attack circuits. Then, they
   would stop reading from the guard and send all of the GET requests
   across all of the circuits they created.

   The number of cells from 2.1 and 2.2 would then be multiplied by the
   number of circuits C that the adversary is able to build and sustain
   during the attack.
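
   The tallies in 2.1 through 2.3 can be sketched as a short calculation.
   This is only an illustration of the arithmetic above: the function
   name is hypothetical, and treating C as a plain multiplier on both the
   adversary and the network side is an assumption of the sketch.

```python
# Constants taken from the tallies in sections 2.1 and 2.2.
ADVERSARY_SETUP = 10   # transfers the adversary pays for circuit/stream setup
NETWORK_SETUP = 5052   # transfers the network pays until the windows close


def sendme_attack_cost(n_sendmes, circuits=1):
    """Return (adversary_transfers, network_transfers) for the 2.2 attack.

    `circuits` is the multiplier C from section 2.3 (assumed to scale
    both sides linearly).
    """
    adversary = ADVERSARY_SETUP + n_sendmes
    # 100 extra data cells per 3 exitward sendmes, 5 transfers per data cell.
    data = n_sendmes * 100 / 3 * 5
    # 3 clientward sendmes per 100 exitward sendmes, 4 transfers each.
    sendmes = n_sendmes * 3 / 100 * 4
    network = NETWORK_SETUP + data + sendmes
    return circuits * adversary, circuits * network


if __name__ == "__main__":
    adv, net = sendme_attack_cost(300, circuits=10)
    print(f"adversary={adv} network={net:.0f} ratio={net / adv:.1f}")
```

   The ratio of the two returned values gives a rough amplification
   factor for a chosen N and C.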

2.4 Multiple Guards

   The adversary could use the "UseEntryGuards 0" torrc option, or build
   custom circuits with stem to parallelize the attack across multiple
   guard nodes. This would slightly increase the bandwidth usage of the
   adversary, since it would be creating additional TCP connections to
   guard nodes.

2.5 Multiple Clients

   The adversary could run multiple attack clients, each of which would
   choose its own guard. This would slightly increase the bandwidth
   usage of the adversary, since it would be creating additional TCP
   connections to guard nodes and would also be downloading directory
   info, creating testing circuits, etc.

2.6 Short Two-hop Circuits

   If the adversary uses two-hop circuits, there is less overhead
   involved with the circuit setup process.

   Adversary:
   4 transfers to create the circuit
   2 to begin the two exit connections
   2 to send the two GET requests
   ---
   8

   Network:
   8 transfers to create the circuit
   14 to begin the two exit connections (assumes two for the exit TCP connect)
   8 to send the two GET requests to the website
   5000 for requested data (until the stream and circuit windows close)
   ---
   5030

2.7 Long >3-hop Circuits

   The adversary could use a circuit longer than three hops to cause more
   bandwidth usage across the network. Let's use an 8 hop circuit as an
   example.

   Adversary:
   16 transfers to create the circuit
   2 to begin the two exit connections
   2 to send the two GET requests
   ---
   20

   Network:
   128 transfers to create the circuit
   62 to begin the two exit connections (assumes two for the exit TCP connect)
   32 to send the two GET requests to the website
   15000 for requested data (until the stream and circuit windows close)
   ---
   15222

   The adversary could also target a specific relay, and use it multiple
   times as part of the long circuit, e.g., as hop 1, 4, and 7.

   Target:
   54 transfers to create the circuit
   22 to begin the two exit connections (assumes two for the exit TCP connect)
   12 to send the two GET requests to the website
   5000 for requested data (until the stream and circuit windows close)
   ---
   5088

3. Design

   This proposal aims to defend against the versions of the attack that
   utilize sendme cells without reading. It does not attempt to handle
   the case of multiple circuits per guard, or try to restrict the number
   of guards used by a client, or prevent a sybil attack across multiple
   client instances.

   The proposal involves three components: first, the client needs to add
   a token to the sendme payload, to prove that it knows the contents
   of the cells that it has received. Second, the exit relay needs to
   verify this token. Third, to resolve the case where the client already
   knows the contents of the file so it only pretends to read the cells,
   the exit relay needs to be able to add unexpected randomness to the
   circuit.

   (Note: this proposal talks about clients and exit relays, but since
   sendmes go in both directions, both sides of the circuit should do
   these changes.)

3.1. Changing the sendme payload to prove receipt of cells

   In short: clients put the latest received relay cell digest in the
   payload of their circuit-level sendme cells.

   Each relay cell header includes a 4-byte digest which represents
   the rolling hash of all bytes received on that circuit. So knowledge
   of that digest is an indication that you've seen the bytes that go
   into it.

   We pick circuit-level sendme cells, as opposed to stream-level sendme
   cells, because we think modifying just circuit-level sendmes is
   sufficient to accomplish the properties we need, and modifying just
   stream-level sendmes is not sufficient: a client could send a bunch
   of begin cells and fake their circuit-level sendmes, but never send
   any stream-level sendmes, attracting 500*n queued cells to the entry
   guard for the n streams that it opens.

   Which digest should the client put in the sendme payload? Right now
   circuit-level sendmes are sent whenever one window worth of relay cells
   (100) has arrived. So the client should use the digest from the cell
   that triggers the sendme.

   In order to achieve this, we need to version the SENDME cell so we can
   distinguish the original protocol from the new authenticated cell.
   Right now, the SENDME payload is empty, which translates to a version
   value of 0 under this proposed change. The version that provides the
   authenticated SENDMEs of this proposal is 1.

   The SENDME cell payload would contain the following:

      VERSION     [1 byte]
      DATA_LEN    [2 bytes]
      DATA        [DATA_LEN bytes]

   The VERSION tells us what is expected in the DATA section of length
   DATA_LEN. The recognized values are:

      0x00: The rest of the payload should be ignored.

      0x01: Authenticated SENDME. The DATA section should contain:

         DIGEST   [20 bytes]

         If the DATA_LEN value is less than 20 bytes, the cell should be
         dropped and the circuit closed. If the value is more than 20 bytes,
         then the first 20 bytes should be read to get the correct value.

         The DIGEST is the digest value from the cell that triggered this
         SENDME as mentioned above. This value is matched on the other side
         from the previous cell.

   If a VERSION is unrecognized, the SENDME cell should be treated as version
   0 meaning the payload is ignored.
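
   The payload layout and parsing rules above can be sketched as follows.
   This is a minimal illustration, not tor's implementation: the function
   and exception names are hypothetical, network byte order for DATA_LEN
   is an assumption, and a payload too short to hold the header is
   treated here as the legacy empty (version 0) payload.

```python
import struct

DIGEST_LEN = 20  # size of the DIGEST field for version 1


class CloseCircuit(Exception):
    """Raised when the payload requires tearing down the circuit."""


def encode_sendme_v1(digest):
    """Build VERSION | DATA_LEN | DATA for an authenticated SENDME."""
    if len(digest) != DIGEST_LEN:
        raise ValueError("digest must be 20 bytes")
    # Assumption: DATA_LEN is encoded in network byte order.
    return struct.pack("!BH", 1, DIGEST_LEN) + digest


def parse_sendme(payload):
    """Return (version, digest_or_None) following the rules in 3.1."""
    if len(payload) < 3:
        return 0, None  # legacy empty payload: version 0
    version, data_len = struct.unpack("!BH", payload[:3])
    if version != 1:
        return 0, None  # version 0 or unrecognized: payload is ignored
    data = payload[3:3 + data_len]
    if data_len < DIGEST_LEN or len(data) < DIGEST_LEN:
        raise CloseCircuit("v1 SENDME too short to hold a digest")
    return 1, data[:DIGEST_LEN]  # longer DATA: only the first 20 bytes count
```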

3.2. Verifying the sendme payload

   In the current Tor, the exit relay keeps no memory of the cells it
   has sent down the circuit, so it won't be in a position to verify
   the digest that it gets back.

   But fortunately, the exit relay can also count, so it knows which cell
   is going to trigger the sendme response. Each circuit can have at most
   10 sendmes worth of data outstanding. So the exit relay will keep
   a per-circuit fifo queue of the digests from the appropriate cells,
   and when a new sendme arrives, it pulls off the next digest in line,
   and verifies that it matches.
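
   A minimal sketch of this per-circuit bookkeeping is shown below. The
   class and method names are hypothetical, and the sketch assumes the
   caller supplies the digest of each cell as it is sent; it is not the
   data structure tor actually uses.

```python
from collections import deque
import hmac

CIRCWINDOW_INCREMENT = 100  # one circuit-level sendme per 100 cells


class SendmeValidator:
    """Remembers the digests that future circuit-level sendmes must echo."""

    def __init__(self):
        self._cells_sent = 0
        self._expected = deque()  # FIFO of digests awaiting a sendme

    def note_cell_sent(self, digest):
        """Call once per relay cell sent down the circuit."""
        self._cells_sent += 1
        # Only the cell that will trigger a sendme needs to be remembered.
        if self._cells_sent % CIRCWINDOW_INCREMENT == 0:
            self._expected.append(digest)

    def check_sendme(self, digest):
        """Return True if the sendme matches the next expected digest.

        A False result means the circuit should be torn down.
        """
        if not self._expected:
            return False  # a sendme arrived that we never asked for
        return hmac.compare_digest(self._expected.popleft(), digest)
```

   Because at most 10 sendmes worth of data can be outstanding, the queue
   in this sketch never needs to hold more than 10 digests per circuit.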

   If a sendme payload has a payload version of 1 yet its digest
   doesn't match the expected digest, or if the sendme payload has
   an unexpected payload version (see below about deployment phases),
   the exit relay must tear down the circuit. (If we later find that
   we need to introduce a newer payload version in an incompatible way,
   we would do that by bumping the circuit protocol version.)

3.3. Making sure there are enough unpredictable bytes in the circuit

   So far, the design as described fails against a very simple attacker:
   the client fetches a file whose contents it already knows, and it
   uses that knowledge to calculate the correct digests and fake its
   sendmes just like in the original attack.

   The fix is that the exit relay needs to be able to add some randomness
   into its cells. It can add this randomness, in a way that's completely
   orthogonal to the rest of this design, simply by choosing one relay
   cell every so often and not using the entire relay cell payload for
   actual data (i.e. using a Length field of less than 498), and putting
   some random bytes in the remainder of the payload.

   How many random bytes should the exit relay use, and how often should
   it use them? There is a tradeoff between security when under attack,
   and efficiency when not under attack. We think 1 byte of randomness
   every 1000 cells is a good starting plan, and we can always improve
   it later without needing to change any of the rest of this design.
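
   The following sketch shows one way an exit could shorten a relay data
   cell and fill the freed space with randomness. The helper name, the
   1-based cell counter, and the choice to place the random byte directly
   after the data are assumptions made for illustration.

```python
import os

RELAY_DATA_MAX = 498   # maximum Length field value for a relay cell
RANDOM_EVERY = 1000    # starting plan: one randomized cell per 1000 cells
RANDOM_BYTES = 1       # starting plan: 1 byte of randomness


def pack_relay_data(data, cell_index):
    """Return (length_field, body) for one relay data cell.

    `cell_index` counts cells on the circuit starting from 1. `body` is
    always RELAY_DATA_MAX bytes: the data chunk followed by padding.
    """
    want_random = cell_index % RANDOM_EVERY == 0
    limit = RELAY_DATA_MAX - RANDOM_BYTES if want_random else RELAY_DATA_MAX
    chunk = data[:limit]
    pad_len = RELAY_DATA_MAX - len(chunk)
    if want_random:
        # Unpredictable bytes first, then NUL padding for the rest.
        padding = os.urandom(RANDOM_BYTES) + bytes(pad_len - RANDOM_BYTES)
    else:
        padding = bytes(pad_len)
    return len(chunk), chunk + padding
```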

   (Note that the spec currently says "The remainder of the payload
   is padded with NUL bytes." We think "is" doesn't mean MUST, so we
   should just be sure to update that part of the spec to reflect our
   new plans here.)

4. Deployment Plan

   This section describes how we will be able to deploy this new mechanism on
   the network.

   Alas, this deployment plan leaves a pretty large window until relays are
   protected from attack. It's not all bad news though, since we could flip
   the switches earlier than intended if we encounter a network-wide attack.

   There are four phases to this plan, detailed in the following subsections.

4.1. Phase One - Remembering Digests

   Both sides begin remembering their expected digests, and they learn how to
   parse sendme version 1 payloads. When they receive a version 1 SENDME, they
   verify its digest and tear down the circuit if it's wrong. But they
   continue to send and accept payload version 0 sendmes.

4.2. Phase Two - Sending Version 1

   We flip a switch in the consensus, and everybody starts sending payload
   version 1 sendmes. Payload version 0 sendmes are still accepted. The newly
   proposed consensus parameter to achieve this is:

      "sendme_emit_min_version" - Minimum SENDME version that can be sent.

4.3. Phase Three - Protover

   In phase four (section 4.4), the new consensus parameter that tells us
   which minimum version to accept, once flipped to version 1, has the
   consequence of making every tor that does not support that version fail
   to operate on the network. Such a tor cannot even download a consensus.

   It is essentially a "false-kill" switch, because tor will still run but
   will simply not work: it will retry over and over to download a
   consensus. To help us transition before the network accepts only v1, a
   new protover value is proposed (see section 9 of tor-spec.txt for
   protover details).

   Tor clients and relays that don't support this protover version from the
   consensus "required-client-protocols" or "required-relay-protocols" lines
   will exit and thus not try to join the network. Here is the proposed value:

     "FlowCtrl"

     Describes the flow control protocol at the circuit and stream level. If
     there is no FlowCtrl protocol version, tor supports the unauthenticated
     flow control features from its supported Relay protocols.

       "1" -- supports authenticated circuit level SENDMEs as of proposal
              289 in Tor 0.4.1.1-alpha.

4.4. Phase Four - Accepting Version 1

   We flip a different switch in the consensus, and everybody starts refusing
   payload version 0 sendmes. The newly proposed consensus parameter to
   achieve this is:

      "sendme_accept_min_version" - Minimum SENDME version that is accepted.

   It has to be two separate switches, not one unified one, because otherwise
   we'd have a race where relays learn about the update before clients know to
   start the new behavior.
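
   The interaction of the two consensus parameters can be sketched as
   below. The helper names are hypothetical, and the parameter values
   shown for phase one (both 0) are an assumption about the defaults, not
   something this proposal states.

```python
MAX_SUPPORTED_VERSION = 1  # highest SENDME version this sketch implements


def version_to_emit(sendme_emit_min_version):
    """Pick the SENDME version to send.

    Assumption: never emit a version we do not implement.
    """
    return min(sendme_emit_min_version, MAX_SUPPORTED_VERSION)


def accepts(version, sendme_accept_min_version):
    """Decide whether a received SENDME version is acceptable."""
    # Unrecognized versions are treated as version 0 (section 3.1).
    effective = version if version <= MAX_SUPPORTED_VERSION else 0
    return effective >= sendme_accept_min_version


# (sendme_emit_min_version, sendme_accept_min_version) per phase.
PHASES = {
    "phase 1": (0, 0),  # assumed defaults: remember digests, still send v0
    "phase 2": (1, 0),  # everybody sends v1, v0 still accepted
    "phase 4": (1, 1),  # v0 refused
}

if __name__ == "__main__":
    for name, (emit_min, accept_min) in PHASES.items():
        print(name, "emit v%d" % version_to_emit(emit_min),
              "accept v0:", accepts(0, accept_min))
```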

4.5. Timeline

   The proposed timeline for the deployment phases:

      Phase 1:

         Once this proposal is merged into tor (expected: 0.4.1.1-alpha), v1
         SENDMEs can be accepted on a circuit.

      Phase 2:

         Once Tor Browser releases a stable version containing 0.4.1, we
         consider that we have a very large portion of clients supporting v1
         and thus limit the partition problem.

         We can safely emit v1 SENDMEs in the network because the payload is
         ignored for version 0; thus, sending a v1 right now will not affect
         older tors' behavior, and they will consider it a v0.

      Phase 3:

         This phase will effectively cause every tor that does not support
         "FlowCtrl=1" to exit(). The earliest date we can do that is when
         all versions not supporting v1 are EOL.

         According to our release schedule[4], this can happen when our latest
         LTS (0.3.5) goes EOL, which is on Feb 1st, 2022.

      Phase 4:

         We recommend waiting at least one release after Phase 3 so we can
         take the time to see the effect that it had on the network.
         Considering a 6-month release time frame, we expect to do this phase
         around July 2022.

5. Security Discussion

   Does our design enable any new adversarial capabilities?

   An adversarial middle relay could attempt to trick the exit into
   killing an otherwise valid circuit.

   An adversarial relay can already kill a circuit, but here it could make
   it appear that the circuit was killed for a legitimate reason (invalid
   or missing sendme), and make someone else (the exit) do the killing.

   There are two ways it might do this: by trying to make a valid sendme
   appear invalid; and by blocking the delivery of a valid sendme. Both of
   these depend on the ability for the adversary to guess which exitward
   cell is a sendme cell, which it could do by counting clientward cells.

   * Making a valid sendme appear invalid

   A malicious middle could stomp bits in the exitward sendme so
   that the exit sendme validation fails. However, bit stomping would
   be detected at the protocol layer orthogonal to this design, and
   unrecognized exitward cells would currently cause the circuit to be
   torn down. Therefore, this attack has the same end result as blocking
   the delivery of a valid sendme.

   (Note that, currently, clientward unrecognized cells are dropped but
   the circuit is not torn down.)

   * Blocking delivery of a valid sendme

   A malicious middle could simply drop an exitward sendme, so that
   the exit is unable to verify the digest in the sendme payload. The
   following exitward sendme cell would then be misaligned with the
   sendme that the exit is expecting to verify. The exit would kill the
   circuit because the client failed to prove it has read all of the
   clientward cells.

   The benefits of such an attack over just directly killing the circuit
   seem low, and we feel that the added benefits of the defense outweigh
   the risks.

6. Open problems

   With the proposed defenses in place, an adversary will be unable to
   successfully use the "continue sending sendmes" part of these attacks.

   But this proposal won't resolve the "build up many circuits over time,
   and then use them to attack all at once" issue, nor will it stop
   sybil attacks, such as an attacker making many parallel connections to
   a single target relay or reaching out to many guards in parallel.

   We spent a while trying to figure out if we can enforce some
   upper bound on how many circuits a given connection is allowed
   to have open at once, to limit every connection's potential for
   launching a bandwidth attack. But there are plausible situations
   where well-behaving clients accumulate many circuits over time:
   Ricochet clients with many friends, popular onion services, or even
   Tor Browser users with a bunch of tabs open.

   Even though a per-conn circuit limit would produce many false
   positives, it might still be useful to have it deployed and available
   as a consensus parameter, as another tool for combatting a wide-scale
   attack on the network: a parameter to limit the total number of
   open circuits per conn (viewing each open circuit as a threat) would
   complement the current work in #24902 to rate limit circuit creates
   per client address.

   But we think the threat of parallel attacks might be best handled by
   teaching relays to react to actual attacks, like we've done in #24902:
   we should teach Tor relays to recognize when somebody is *doing* this
   attack on them, and to squeeze down or outright block the client IP
   addresses that have tried it recently.

   An alternative direction would be to await research ideas on how guards
   might coordinate to defend against attacks while still preserving
   user privacy.

   In summary, we think authenticating the sendme cells is a useful
   building block for these future solutions, and it can be (and should
   be) done orthogonally to whatever sybil defenses we pick later.

7. References

   [0] https://blog.torproject.org/blog/new-tor-denial-service-attacks-and-defenses
   [1] https://www.freehaven.net/anonbib/#sniper14
   [2] https://www.freehaven.net/anonbib/#torta05
   [3] https://www.freehaven.net/anonbib/#congestion-longpaths
   [4] https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases

8. Acknowledgements

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.
Filename: 290-deprecate-consensus-methods.txt
Title: Continuously update consensus methods
Author: Nick Mathewson
Created: 2018-02-21
Status: Meta

1. Background

   Directory authorities use the "consensus method" mechanism to achieve
   forward compatibility during voting.  When each authority publishes
   its vote, it includes a list of numbered consensus methods that it
   supports.  Each authority chooses to calculate the consensus
   according to the highest consensus method it knows that is supported by
   more than 2/3 of the voting authorities.  So long as all the authorities
   have a method in common, they will all reach the same consensus.
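
   The selection rule can be sketched as follows. The function name is
   hypothetical, each vote is modeled simply as the set of methods it
   lists, and returning None when no method clears the threshold is an
   assumption of the sketch rather than the authorities' actual fallback.

```python
def choose_consensus_method(votes, known_methods):
    """Pick the highest known method listed by more than 2/3 of the votes.

    `votes` is a list of sets of method numbers, one set per authority.
    `known_methods` is the set of methods this authority implements.
    """
    n = len(votes)
    eligible = [
        m for m in known_methods
        # "more than 2/3" written with integers to avoid rounding issues
        if 3 * sum(m in vote for vote in votes) > 2 * n
    ]
    return max(eligible) if eligible else None
```

   For example, if six of nine authorities list method 28 but all nine
   list method 26, this sketch picks 26, because six is not more than two
   thirds of nine.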

   Consensus method 1 was first introduced in the Tor 0.2.0 series
   around 2008. But by 2012, we realized that we had a problem: we were
   stuck documenting and supporting old consensus methods indefinitely.

   With proposal 215, we deprecated and removed support for all
   consensus methods before method 13.  That was good as far as it went,
   but it didn't solve the problem going forward: the latest consensus
   method is now 28.

   This proposal describes a policy for removing older consensus methods
   going forward, so we won't have to keep supporting them forever.

2. Proposal

   I propose that from time to time, old consensus methods should be
   deprecated.

   Specifically, I propose that we deprecate all methods older than the
   highest method supported in the first stable release of the oldest LTS
   (long-term support) release series.

   For example, the current oldest LTS series is 0.2.5.x.  The first
   stable release in that series was 0.2.5.10.  The highest consensus
   method listed by 0.2.5.10 is 18.  Therefore, we should currently
   consider ourselves free to deprecate all methods before 18.

   Once 0.2.5.x is deprecated, 0.2.9.x will become the oldest LTS
   series.  The first stable release in that series was 0.2.9.8.  The
   highest consensus method listed by 0.2.9.8 is 25.  Therefore, once
   0.2.5.x is deprecated (in May 2018), we may deprecate all methods
   before 25.
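
   The rule and the two examples above can be sketched as a small
   calculation. The table layout and function names are hypothetical; the
   series, releases, and method numbers are the ones given in the text.

```python
# (series, first stable release, highest consensus method it lists)
LTS_SERIES = [
    ((0, 2, 5), "0.2.5.10", 18),
    ((0, 2, 9), "0.2.9.8", 25),
]


def deprecation_cutoff(lts_series):
    """Return the method number below which methods may be deprecated."""
    oldest = min(lts_series, key=lambda entry: entry[0])
    return oldest[2]


def deprecated_methods(lts_series, known_methods):
    """List the known methods that fall below the cutoff."""
    cutoff = deprecation_cutoff(lts_series)
    return [m for m in known_methods if m < cutoff]


if __name__ == "__main__":
    print(deprecation_cutoff(LTS_SERIES))       # while 0.2.5.x is supported
    print(deprecation_cutoff(LTS_SERIES[1:]))   # once 0.2.5.x is deprecated
```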

   When a consensus method is deprecated, it should no longer be listed
   or implemented by the latest Tor releases.  (It's okay for older
   authorities to keep advertising it.)

   Most consensus methods add a feature that is used in "method M or
   later". Deprecating method M-1 means that the feature is used in all
   supported consensus methods. Therefore, we can remove any code that
   makes the feature conditional on a consensus method, and any code for
   previous implementations of the feature.

   Some consensus methods remove a feature that was used up to method
   M. Deprecating method M means that the feature is no longer used by
   any supported consensus methods. Therefore, we can remove any code
   that implements the feature.

A. Acknowledgments

   Thanks to isis and teor for the discussion that led to this proposal.
   I believe that teor first suggested the policy described in section 2
   above.

B. Client and relay compatibility notes

   Dear reader: you may be worrying that this proposal will cause old
   clients or relays to stop working prematurely.  That is not the case.
   Consensus methods determine how the authorities behave, but they do
   not represent backward-incompatible changes in how they generate
   their consensuses.

Filename: 291-two-guard-nodes.txt
Title: The move to two guard nodes
Author: Mike Perry
Created: 2018-03-22
Supersedes: Proposal 236
Status: Finished

0. Background

  Back in 2014, Tor moved from three guard nodes to one guard node[1,2,3].

  We made this change primarily to limit points of observability of entry
  into the Tor network for clients and onion services, as well as to
  reduce the ability of an adversary to track clients as they move from
  one internet connection to another by their choice of guards.


1. Proposed changes

1.1. Switch to two guards per client

  When this proposal becomes effective, clients will switch to using
  two guard nodes. The guard node selection algorithms of Proposal 271
  will remain unchanged. Instead of having one primary guard "in use",
  Tor clients will always use two.

  This will be accomplished by setting the guard-n-primary-guards-to-use
  consensus parameter to 2, as well as guard-n-primary-guards to 2.
  (Section 3.1 covers the reason for both parameters). This is equivalent
  to using the torrc option NumEntryGuards=2, which can be used for
  testing behavior prior to the consensus update.

1.2. Enforce Tor's path restrictions across this guard layer

  In order to ensure that Tor can always build circuits using two guards
  without resorting to a third, they must be chosen such that Tor's path
  restrictions could still build a path with at least one of them,
  regardless of the other nodes in the path.

  In other words, we must ensure that both guards are not chosen from the
  same /16 or the same node family. In this way, Tor will always be able to
  build a path using these guards, preventing the use of a third guard.
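
  The constraint can be sketched as a compatibility check applied while
  walking an ordered guard list. This is an illustration only: the
  dictionary layout and function names are hypothetical, only IPv4 is
  handled, and family membership is simplified to a one-sided set lookup
  rather than tor's mutual-declaration rule.

```python
import ipaddress


def same_slash16(ip_a, ip_b):
    """True if two IPv4 addresses fall in the same /16."""
    net = ipaddress.ip_network(f"{ip_a}/16", strict=False)
    return ipaddress.ip_address(ip_b) in net


def conflict(a, b):
    """True if two guards could both be excluded by one other path node."""
    return (same_slash16(a["ip"], b["ip"])
            or a["id"] in b["family"]
            or b["id"] in a["family"])


def pick_two_guards(candidates):
    """Return the first compatible pair, preserving candidate order."""
    for i, first in enumerate(candidates):
        for second in candidates[i + 1:]:
            if not conflict(first, second):
                return first, second
    return None  # no compatible pair in this candidate list
```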


2. Discussion

2.1. Why two guards?

  The main argument for switching to two guards is that because of Tor's
  path restrictions, we're already using two guards, but we're using them
  in a suboptimal and potentially dangerous way.

  Tor's path restrictions enforce the condition that the same node cannot
  appear twice in the same circuit, nor can nodes from the same /16 subnet
  or node family be used in the same circuit.

  Tor's paths are also built such that the exit node is chosen first and
  held fixed during guard node choice, as are the IP, HSDIR, and RPs for
  onion services. This means that whenever one of these nodes happens to
  be the guard[4], or be in the same /16 or node family as the guard, Tor
  will build that circuit using a second "primary" guard, as per proposal
  271[7].

  Worse still, the choice of RP, IP, and exit can all be controlled by an
  adversary (to varying degrees), enabling them to force the use of a
  second guard at will.

  Because this happens somewhat infrequently in normal operation, a fresh
  TLS connection will typically be created to the second "primary" guard,
  and that TLS connection will be used only for the circuit for that
  particular request. This property makes all sorts of traffic analysis
  attacks easier, because this TLS connection will not benefit from any
  multiplexing.

  This is more serious than traffic injection via an already in-use
  guard because the lack of multiplexing means that the data retention
  level required to gain information from this activity is very low, and
  may exist for other reasons. To gain information from this behavior, an
  adversary needs only connection 5-tuples + timestamps, as opposed to
  detailed timeseries data that is polluted by other concurrent activity
  and padding.

  In the most severe form of this attack, the adversary can take a suspect
  list of Tor client IP addresses (or the list of all Guard node IP addresses)
  and observe when secondary Tor connections are made to them while the
  adversary cycles through all guards as RPs for connections to an onion
  service. This adversary does not require collusion on the part of observers
  beyond the ability to provide 5-tuple connection logs (which ISPs may retain
  for reasons such as netflow accounting, IDS, or DoS protection systems).

  A fully passive adversary can also make use of this behavior. Clients
  unlucky enough to pick guard nodes in heavily used /16s or in large node
  families will tend to make use of a second guard more frequently even
  without effort from the adversary. In these cases, the lack of
  multiplexing also means that observers along the path to this secondary
  guard gain more information per observation.

2.2. Why not MORE guards?

  We do not want to increase the number of observation points for client
  activity into the Tor network[1]. We merely want better multiplexing for
  the cases where this already happens.

2.3. Can you put some numbers on that?

  The Changing of the Guards[13] paper studies this from a few different
  angles, but one of the crucially missing graphs is how long a client
  can expect to run with N guards before it chooses a malicious guard.

  However, we do have tables in section 3.2.1 of proposal 247 that cover
  this[14]. There are three tables there: one for a 1% adversary, one for
  a 5% adversary, and one for a 10% adversary. You can see the probability
  of adversary success for one and two guards in terms of the number of
  rotations needed before the adversary's node is chosen. Not surprisingly,
  with two guards the adversary gets to compromise clients roughly twice as
  quickly, but the timescales are still rather large even for the 10%
  adversary: they have only a 50% chance of success after 4 rotations, which
  will take about 14 months with Tor's 3.5 month guard rotation.
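
  For intuition only, the effect of guard count on time-to-compromise can
  be approximated with a simple independent-draw model. This is an
  assumption of the sketch and is not the model behind the tables in
  proposal 247, so its figures will not match those tables exactly.

```python
def compromise_probability(adversary_fraction, guards, rotations):
    """Chance that at least one chosen guard belongs to the adversary.

    Assumes every guard choice is an independent draw in which the
    adversary is picked with probability `adversary_fraction`.
    """
    draws = guards * rotations
    return 1.0 - (1.0 - adversary_fraction) ** draws


if __name__ == "__main__":
    for guards in (1, 2):
        p = compromise_probability(0.10, guards, rotations=4)
        print(f"{guards} guard(s), 4 rotations: {p:.2f}")
```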

2.4. What about guard fingerprinting?

  More guards also means more fingerprinting[8]. However, even one guard
  may be enough to fingerprint a user who moves around in the same area,
  if that guard is low bandwidth or there are not many Tor users in that
  area.

  Furthermore, our use of separate directory guards (and three of them)
  means that we're not really changing the situation much with the
  addition of another regular guard. Right now, directory guard use alone
  is enough to track all Tor users across the entire world.

  While the directory guard problem could be fixed[12] (and should be
  fixed), it is still the case that another mechanism should be used for
  the general problem of guard-vs-location management[9].


3. Alternatives

  There are two other solutions that also avoid the use of a secondary
  guard in the path restriction case.

3.1. Eliminate path restrictions entirely

  If Tor decided to stop enforcing /16, node family, and also allowed the
  guard node to be chosen twice in the path, then under normal conditions,
  it should retain the use of its primary guard.

  This approach is not as extreme as it seems on its face. In fact, it is
  hard to come up with arguments against removing these restrictions. Tor's
  /16 restriction is of questionable utility against monitoring, and it can
  be argued that since only good actors use node family, it gives influence
  over path selection to bad actors in ways that are worse than the benefit
  it provides to paths through good actors[10,11].

  However, while removing path restrictions will solve the immediate
  problem, it will not address other instances where Tor temporarily opts
  to use a second guard due to congestion, OOM, or failure of its primary
  guard, and we're still running into bugs where this can be adversarially
  controlled or just happen randomly[5].

  While using two guards means twice the surface area for these types of
  bugs, it also means that instances where they happen simultaneously on
  both guards (thus forcing a third guard) are much less likely than with
  just one guard. (In the passive adversary model, suppose that the two
  guards fail at any given point with probabilities P1 and P2,
  respectively. If we assume that such passive failures are independent
  events, both guards would fail concurrently with probability P1*P2.
  Even if the events are correlated, the maximum chance of concurrent
  failure is still MIN(P1,P2).)

  Note that for this analysis to hold, we have to ensure that nodes that
  are at RESOURCELIMIT or otherwise temporarily unresponsive do not cause
  us to consider other primary guards beyond the two we have chosen.
  This is accomplished by setting guard-n-primary-guards to 2 (in addition
  to setting guard-n-primary-guards-to-use to 2). With this parameter
  set, the proposal 271 algorithm will avoid considering more than our two
  guards, unless *both* are down at once.

3.2. No Guard-flagged nodes as exit, RP, IP, or HSDIRs

  Similar to 3.1, we could instead forbid the use of Guard-flagged nodes
  for the exit, IP, RP, and HSDIR positions.

  This solution has two problems: First, like 3.1, it also does not handle
  the case where resource exhaustion could force the use of a second
  guard. Second, it requires clients to upgrade to the new behavior and
  stop using Guard-flagged nodes before it can be deployed.


4. The future is confluxed

  An additional benefit of using a second guard is that it enables us to
  eventually use conflux[6].

  Conflux works by giving circuits a 256-bit cookie that is sent to the
  exit/RP, and circuits that are then built to the same exit/RP with the
  same cookie can then be fused together. Throughput estimates are used to
  balance traffic between these circuits, depending on their performance.
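
  The cookie-based linking can be sketched as below. Since conflux would
  need its own proposal, everything here is hypothetical: the class,
  its method, and the idea of keying a plain in-memory table by cookie
  are illustrative assumptions, and traffic balancing is left out.

```python
import secrets
from collections import defaultdict

COOKIE_LEN = 32  # 256 bits


def new_cookie():
    """Client side: generate a fresh random cookie for a circuit set."""
    return secrets.token_bytes(COOKIE_LEN)


class ExitLinker:
    """Exit/RP side: group circuits that present the same cookie."""

    def __init__(self):
        self._sets = defaultdict(list)

    def link(self, cookie, circuit_id):
        """Add a circuit to the set for `cookie`; return the fused set."""
        if len(cookie) != COOKIE_LEN:
            raise ValueError("cookie must be 256 bits")
        self._sets[cookie].append(circuit_id)
        return list(self._sets[cookie])
```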

  We have unfortunately signaled to the research community that conflux is
  not worth pursuing, because of our insistence on a single guard. While
  not relevant to this proposal (indeed, conflux requires its own proposal
  and also concurrent research), it is worth noting that whichever way we
  go here, the door remains open to conflux because of its utility against
  similar issues.

  If our conflux implementation includes packet acking, then circuits can
  still survive the loss of one guard node due to DoS, OOM, or other
  failures because the second half of the path will remain open and
  usable (see the probability of concurrent failure arguments in Section
  3.1).

  If exits remember this cookie for a short period of time after the last
  circuit is closed, the technique can be used to protect against
  DoS/OOM/guard downtime conditions that take down both guard nodes or
  destroy many circuits to confirm both guard node choices. In these
  cases, circuits could be rebuilt along an alternate path and resumed
  without end-to-end circuit connectivity loss. This same technique will
  also make things like ephemeral bridges (i.e., Snowflake/Flashproxy) more
  usable, because bridge uptime will no longer be so crucial to usability.
  It will also improve mobile usability by allowing us to resume
  connections after mobile Tor apps are briefly suspended, or if the user
  switches between cell and wifi networks.

  Furthermore, it is likely that conflux will also be useful against traffic
  analysis and congestion attacks. Since the load balancing is dynamic and
  hard to predict by an external observer and also increases overall
  traffic multiplexing, traffic correlation and website traffic
  fingerprinting attacks will become harder, because the adversary can no
  longer be sure what percentage of the traffic they have seen (depending
  on their position and other potential concurrent activity).  Similarly,
  it should also help dampen congestion attacks, since traffic will
  automatically shift away from a congested guard.


5. Acknowledgements

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.


References:

1. https://blog.torproject.org/improving-tors-anonymity-changing-guard-parameters
2. https://trac.torproject.org/projects/tor/ticket/12206
3. https://gitweb.torproject.org/torspec.git/tree/proposals/236-single-guard-node.txt
4. https://trac.torproject.org/projects/tor/ticket/14917
5. https://trac.torproject.org/projects/tor/ticket/25347#comment:14
6. https://www.cypherpunks.ca/~iang/pubs/conflux-pets.pdf
7. https://gitweb.torproject.org/torspec.git/tree/proposals/271-another-guard-selection.txt
8. https://trac.torproject.org/projects/tor/ticket/9273#comment:3
9. https://tails.boum.org/blueprint/persistent_Tor_state/
10. https://trac.torproject.org/projects/tor/ticket/6676#comment:3
11. https://bugs.torproject.org/15060
12. https://trac.torproject.org/projects/tor/ticket/10969
13. https://www.freehaven.net/anonbib/cache/wpes12-cogs.pdf
14. https://gitweb.torproject.org/torspec.git/tree/proposals/247-hs-guard-discovery.txt
Filename: 292-mesh-vanguards.txt
Title: Mesh-based vanguards
Authors: George Kadianakis and Mike Perry
Created: 2018-05-08
Status: Closed
Supersedes: 247

0. Motivation

  A guard discovery attack allows attackers to determine the guard
  node of a Tor client. The hidden service rendezvous protocol
  provides an attack vector for a guard discovery attack since anyone
  can force an HS to construct a 3-hop circuit to a relay (#9001).

  Following the guard discovery attack with a compromise and/or
  coercion of the guard node can lead to the deanonymization of a
  hidden service.

1. Overview

  This document tries to make the above guard discovery + compromise
  attack harder to launch. It introduces a configuration
  option which makes the hidden service also pin the second and third
  hops of its circuits for a longer duration.

  With this new path selection, we force the adversary to perform a
  Sybil attack and two compromise attacks before succeeding. This is
  an improvement over the current state where the Sybil attack is
  trivial to pull off, and only a single compromise attack is required.

  With this new path selection, an attacker is forced to perform one or
  more node compromise attacks before learning the guard node of a hidden
  service. This increases the uncertainty of the attacker, since
  compromise attacks are costly and potentially detectable, so an
  attacker will have to think twice before beginning a chain of node
  compromise attacks that they might not be able to complete.

1.1. Tor integration

  The mechanisms introduced in this proposal are currently implemented
  partially in Tor and partially through an external Python script:
            https://github.com/mikeperry-tor/vanguards

  The Python script uses the new Tor configuration options HSLayer2Nodes and
  HSLayer3Nodes to be able to select nodes for the guard layers. The Python
  script is tasked with maintaining and rotating the guard nodes as needed
  based on the lifetimes described in this proposal.

  In the future, we are aiming to include the whole functionality in Tor,
  with no need for external scripts.
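
  As a rough illustration of how an external controller could hand its
  chosen nodes to Tor, the sketch below formats HSLayer2Nodes and
  HSLayer3Nodes lines from lists of relay fingerprints. The helper name and
  the placeholder fingerprints are hypothetical, and the script referenced
  above may well set these options differently (for example over the
  control port).

```python
import re

# A relay fingerprint is 40 hex characters, optionally prefixed with "$".
_FINGERPRINT = re.compile(r"^\$?[0-9A-Fa-f]{40}$")

def layer_config_lines(layer2, layer3):
    """Return torrc-style lines pinning the second and third hops."""
    for fp in list(layer2) + list(layer3):
        if not _FINGERPRINT.match(fp):
            raise ValueError("not a relay fingerprint: %r" % fp)
    return [
        "HSLayer2Nodes " + ",".join(layer2),
        "HSLayer3Nodes " + ",".join(layer3),
    ]

# Placeholder fingerprints, for illustration only.
example_layer2 = ["A" * 40, "B" * 40]
example_layer3 = ["C" * 40, "D" * 40, "E" * 40]
lines = layer_config_lines(example_layer2, example_layer3)
```

  Rotating a layer then amounts to regenerating these lines with the
  updated node lists.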

1.2. Visuals

  Here is what a hidden service rendezvous circuit currently looks like:

                     -> middle_1 -> middle_A
                     -> middle_2 -> middle_B
                     -> middle_3 -> middle_C
                     -> middle_4 -> middle_D
       HS -> guard   -> middle_5 -> middle_E
                     -> middle_6 -> middle_F
                     -> middle_7 -> middle_G
                     -> middle_8 -> middle_H
                     ->   ...    ->  ...
                     -> middle_n -> middle_n

  This proposal pins the two middle positions into much more
  restricted sets, as follows:

                       -> guard_2A
                                   -> guard_3A
          -> guard_1A  -> guard_2B -> guard_3B
       HS                          -> guard_3C
          -> guard_1B  -> guard_2C -> guard_3D
                                   -> guard_3E
                       -> guard_2D -> guard_3F

  Additionally, to avoid linkability, we insert an extra middle node
  after the third layer guard for client side intro and hsdir circuits,
  and service-side rendezvous circuits. This means that the set of
  paths for Client (C) and Service (S) side look like this:

     C - G - L2 - L3 - R
     S - G - L2 - L3 - HSDIR
     S - G - L2 - L3 - I
     C - G - L2 - L3 - M - I
     C - G - L2 - L3 - M - HSDIR
     S - G - L2 - L3 - M - R
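
  The six path shapes above follow one rule: add the extra middle node for
  client-side intro and hsdir circuits and for service-side rendezvous
  circuits. A minimal sketch of that rule follows; the function and table
  names are hypothetical and exist only for illustration.

```python
# Hop labels follow the list above: G = entry guard, L2/L3 = layer two and
# three guards, M = extra middle node, and the last hop is the destination.
EXTRA_MIDDLE = {("client", "intro"), ("client", "hsdir"), ("service", "rend")}
DEST_LABEL = {"rend": "R", "intro": "I", "hsdir": "HSDIR"}

def path_template(side, purpose):
    """Return the hop labels for one circuit type as a display string."""
    if side not in ("client", "service") or purpose not in DEST_LABEL:
        raise ValueError("unknown circuit type: %s/%s" % (side, purpose))
    hops = ["C" if side == "client" else "S", "G", "L2", "L3"]
    if (side, purpose) in EXTRA_MIDDLE:
        hops.append("M")
    hops.append(DEST_LABEL[purpose])
    return " - ".join(hops)

all_paths = [path_template(s, p)
             for s in ("client", "service")
             for p in ("rend", "intro", "hsdir")]
```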

1.3. Threat model, Assumptions, and Goals

  Consider an adversary with the following powers:

     - Can launch a Sybil guard discovery attack against any node of a
       rendezvous circuit. The slower the rotation period of the node,
       the longer the attack takes. Similarly, the higher the percentage
       of the network that is compromised, the faster the attack runs.

     - Can compromise any node on the network, but this compromise takes
       time and potentially even coercive action, and also carries risk
       of discovery.

  We also make the following assumptions about the types of attacks:

  1. A Sybil attack is observable both by people monitoring the network
     for large numbers of new nodes and by vigilant hidden service
     operators. It will require either large amounts of traffic sent
     towards the hidden service, multiple test circuits, or both.

  2. A Sybil attack against the second or first layer guards will be
     noisier than a Sybil attack against the third layer guard, since the
     second and first layer Sybil attack requires a timing side channel in
     order to determine success, whereas Sybil success is almost
     immediately obvious to the third layer guard, since the adversary will
     instruct it to connect to a cooperating malicious rend point.

  3. As soon as the adversary is confident they have won the Sybil attack,
     an even more aggressive circuit building attack will allow them to
     determine the next node very fast (an hour or less).

  4. The adversary is strongly disincentivized from compromising nodes that
     may prove useless, as node compromise is even more risky for the
     adversary than a Sybil attack in terms of being noticed.

  Given this threat model, our security parameters were selected so that
  the first two layers of guards should be hard to attack using a Sybil
  guard discovery attack and hence require a node compromise attack. Ideally,
  we want the node compromise attacks to carry a non-negligible probability of
  being useless to the adversary by the time they complete.

  On the other hand, the outermost layer of guards should rotate fast enough to
  _require_ a Sybil attack.

  See our vanguard simulator project for a simulation of the above adversary
  model and a motivation for the parameters selected within this proposal:
        https://github.com/asn-d6/vanguard_simulator
        https://github.com/asn-d6/vanguard_simulator/wiki/Optimizing-vanguard-topologies


2. Design

  When a hidden service picks its guard nodes, it also picks an
  additional NUM_LAYER2_GUARDS-sized set of middle nodes for its
  `second_guard_set`, as well as a NUM_LAYER3_GUARDS-sized set of
  middle nodes for its `third_guard_set`.

  When a hidden service needs to establish a circuit to an HSDir,
  introduction point or a rendezvous point, it uses nodes from
  `second_guard_set` as the second hop of the circuit and nodes from
  `third_guard_set` as third hop of the circuit.

  A hidden service rotates nodes from the `second_guard_set` at a random
  time between MIN_SECOND_GUARD_LIFETIME and MAX_SECOND_GUARD_LIFETIME.

  A hidden service rotates nodes from the `third_guard_set` at a random
  time between MIN_THIRD_GUARD_LIFETIME and MAX_THIRD_GUARD_LIFETIME.

  Each node's rotation time is tracked independently, to avoid disclosing
  the rotation times of the primary and second-level guards.
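
  The bookkeeping described above can be sketched as follows. This is not
  the reference implementation: the class, the relay names, and the
  continuous max(X,X) draw are assumptions made for illustration, and the
  set sizes and lifetimes are the Section 2.1 values expressed in hours.

```python
import random

NUM_LAYER2_GUARDS = 4
NUM_LAYER3_GUARDS = 6

def sample_lifetime(min_hours, max_hours, rng):
    """Draw max(X, X) with X uniform on [min_hours, max_hours]."""
    return max(rng.uniform(min_hours, max_hours),
               rng.uniform(min_hours, max_hours))

class GuardLayer:
    """A fixed-size guard set; every node carries its own expiry time."""

    def __init__(self, size, min_hours, max_hours, candidates, rng):
        self.size = size
        self.min_hours = min_hours
        self.max_hours = max_hours
        self.candidates = list(candidates)
        self.rng = rng
        self.expiry = {}  # node -> absolute expiry time, in hours
        self.refill(now=0.0)

    def refill(self, now):
        """Drop expired nodes, then top the set back up to `size`."""
        self.expiry = {n: t for n, t in self.expiry.items() if t > now}
        unused = [n for n in self.candidates if n not in self.expiry]
        while len(self.expiry) < self.size and unused:
            node = unused.pop(self.rng.randrange(len(unused)))
            self.expiry[node] = now + sample_lifetime(
                self.min_hours, self.max_hours, self.rng)

    def pick(self, now, exclude=()):
        """Pick one live node for a new circuit, skipping `exclude`."""
        self.refill(now)
        return self.rng.choice(
            [n for n in sorted(self.expiry) if n not in exclude])

# Hypothetical relay names; a real client would draw from the consensus.
relays = ["relay%02d" % i for i in range(40)]
rng = random.Random(292)
layer2 = GuardLayer(NUM_LAYER2_GUARDS, 24, 45 * 24, relays, rng)
layer3 = GuardLayer(NUM_LAYER3_GUARDS, 1, 48, relays, rng)
second_hop = layer2.pick(now=0.0)
third_hop = layer3.pick(now=0.0, exclude=(second_hop,))
```

  Because each node stores its own expiry, replacing one node reveals
  nothing about when the other nodes in the same layer will rotate.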

2.1. Security parameters

  We set NUM_LAYER2_GUARDS to 4 nodes and NUM_LAYER3_GUARDS to 6 nodes.

  We set MIN_SECOND_GUARD_LIFETIME to 1 day, and MAX_SECOND_GUARD_LIFETIME
  to 45 days inclusive, for an average rotation rate of 29.5 days, using
  the max(X,X) distribution specified in Section 3.3.

  We set MIN_THIRD_GUARD_LIFETIME to 1 hour, and MAX_THIRD_GUARD_LIFETIME
  to 48 hours inclusive, for an average rotation rate of 31.5 hours, using
  the max(X,X) distribution specified in Section 3.3.

  See Section 3 for more analysis on these constants.

2.2. Path restriction changes

  In order to avoid information leaks and ensure paths can be built, path
  restrictions must be loosened.

  In particular, we allow the following:
     1. Nodes from the same /16 and same family for any/all hops
     2. Guard nodes can be chosen for RP/IP/HSDIR
     3. Guard nodes can be chosen for hop before RP/IP/HSDIR.

  The first change prevents the situation where paths cannot be built if two
  layers all share the same subnet and/or node family. It also prevents the
  use of a different entry guard based on the family or subnet of the
  IP, HSDIR, or RP.

  The second change prevents an adversary from forcing the use of a different
  entry guard by enumerating all guard-flagged nodes as the RP.

  The third change prevents an adversary from learning the guard node by way
  of noticing which nodes were not chosen for the hop before it.


3. Rationale and Security Parameter Selection

3.1. Sybil rotation counts for a given number of Guards

  The probability of Sybil success for Guard discovery can be modeled as
  the probability of choosing 1 or more malicious middle nodes for a
  sensitive circuit over some period of time.

  P(At least 1 bad middle) = 1 - P(All Good Middles)
                           = 1 - P(One Good middle)^(num_middles)
                           = 1 - (1 - c/n)^(num_middles)

  c/n is the adversary compromise percentage

  In the case of Vanguards, num_middles is the number of Guards you rotate
  through in a given time period. This is a function of the number of vanguards
  in that position (v), as well as the number of rotations (r).

  P(At least one bad middle) = 1 - (1 - c/n)^(v*r)
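
  As a quick check of this formula, the sketch below evaluates it directly.
  The function name is ours, and c is passed as the fraction c/n rather
  than as a percentage.

```python
def sybil_success(c, v, r):
    """P(at least one bad middle) = 1 - (1 - c)^(v*r)."""
    return 1.0 - (1.0 - c) ** (v * r)

# A 1% adversary against six vanguards:
after_11_rotations = sybil_success(0.01, 6, 11)   # still below 50%
after_12_rotations = sybil_success(0.01, 6, 12)   # just above 50%
```

  The 50% threshold is crossed between the eleventh and twelfth rotation,
  which matches the "Six" column of the 1.0% table below.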

  Here are detailed tables of the number of rotations required for
  a given Sybil success rate with a given number of guards.

  1.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%            11     6     4     3     3     2     2     2     2     1       1
    15%            17     9     6     5     4     3     3     2     2     2       2
    25%            29    15    10     8     6     5     4     4     3     3       2
    50%            69    35    23    18    14    12     9     8     7     6       5
    60%            92    46    31    23    19    16    12    11    10     8       6
    75%           138    69    46    35    28    23    18    16    14    12       9
    85%           189    95    63    48    38    32    24    21    19    16      12
    90%           230   115    77    58    46    39    29    26    23    20      15
    95%           299   150   100    75    60    50    38    34    30    25      19
    99%           459   230   153   115    92    77    58    51    46    39      29

  5.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%             3     2     1     1     1     1     1     1     1     1       1
    15%             4     2     2     1     1     1     1     1     1     1       1
    25%             6     3     2     2     2     1     1     1     1     1       1
    50%            14     7     5     4     3     3     2     2     2     2       1
    60%            18     9     6     5     4     3     3     2     2     2       2
    75%            28    14    10     7     6     5     4     4     3     3       2
    85%            37    19    13    10     8     7     5     5     4     4       3
    90%            45    23    15    12     9     8     6     5     5     4       3
    95%            59    30    20    15    12    10     8     7     6     5       4
    99%            90    45    30    23    18    15    12    10     9     8       6

  10.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%             2     1     1     1     1     1     1     1     1     1       1
    15%             2     1     1     1     1     1     1     1     1     1       1
    25%             3     2     1     1     1     1     1     1     1     1       1
    50%             7     4     3     2     2     2     1     1     1     1       1
    60%             9     5     3     3     2     2     2     1     1     1       1
    75%            14     7     5     4     3     3     2     2     2     2       1
    85%            19    10     7     5     4     4     3     3     2     2       2
    90%            22    11     8     6     5     4     3     3     3     2       2
    95%            29    15    10     8     6     5     4     4     3     3       2
    99%            44    22    15    11     9     8     6     5     5     4       3

  The rotation counts in these tables were generated with:
     import math

     def num_rotations(c, v, success):
       r = 0
       while 1-math.pow((1-c), v*r) < success: r += 1
       return r

3.2. Rotation Period

  As specified in Section 1.3, the primary driving force for the third
  layer selection was to ensure that these nodes rotate fast enough that
  it is not worth trying to compromise them, because it is unlikely for
  compromise to succeed and yield useful information before the nodes stop
  being used.

  From the table in Section 3.1, with NUM_LAYER2_GUARDS=4 and
  NUM_LAYER3_GUARDS=6, it can be seen that this means that the Sybil attack
  on layer3 will complete with 50% chance in 12*31.5 hours (15.75 days)
  for the 1% adversary, ~4 days for the 5% adversary, and 2.62 days for the
  10% adversary.
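
  The arithmetic behind these figures can be reproduced with the short
  sketch below. It reuses num_rotations() from Section 3.1 together with
  the 31.5 hour average third-layer lifetime; the days_to_sybil() helper
  is a name introduced here for illustration.

```python
import math

AVG_LAYER3_LIFETIME_HOURS = 31.5   # average from Section 2.1
NUM_LAYER3_GUARDS = 6

def num_rotations(c, v, success):
    """Smallest rotation count whose Sybil success reaches `success`."""
    r = 0
    while 1 - math.pow(1 - c, v * r) < success:
        r += 1
    return r

def days_to_sybil(c, success=0.5):
    """Expected days until the layer-3 Sybil attack reaches `success`."""
    rotations = num_rotations(c, NUM_LAYER3_GUARDS, success)
    return rotations * AVG_LAYER3_LIFETIME_HOURS / 24.0

estimates = {c: days_to_sybil(c) for c in (0.01, 0.05, 0.10)}
# estimates == {0.01: 15.75, 0.05: 3.9375, 0.1: 2.625}
```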

  Since rotation of each node happens independently, the distribution of
  when the adversary expects to win this Sybil attack in order to discover
  the next node up is uniform. This means that on average, the adversary
  should expect that half of the rotation period of the next node is already
  over by the time that they win the Sybil.

  With this fact, we choose our range and distribution for the second
  layer rotation to be short enough to cause the adversary to risk
  compromising nodes that are useless, yet long enough to require a
  Sybil attack to be noticeable in terms of client activity. For this
  reason, we choose a minimum second-layer guard lifetime of 1 day,
  since this gives the adversary a minimum expected window of 12 hours
  during which they can compromise a guard before it might be rotated.
  If the total expected rotation rate is 29.5 days, then the adversary can
  expect overall to have 14.75 days remaining after completing their Sybil
  attack before a second-layer guard rotates away.

3.3. Rotation distributions

  In order to skew the distribution of the third layer guard towards
  higher values, we use max(X,X) for the distribution, where X is a
  random variable that takes on values from the uniform distribution.

  Here's a table of expectation (arithmetic means) for relevant
  ranges of X (sampled from 0..N-1). The table was generated with the
  following python functions:

  def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
  def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

  def ExpFn(N, ProbFunc):
    exp = 0.0
    for i in range(N): exp += i*ProbFunc(N, i)
    return exp

  The current choice for second-layer guards is noted with **, and
  the current choice for third-layer guards is noted with ***.

   Range  Min(X,X)   Max(X,X)
   40      12.84       26.16
   41      13.17       26.83
   42      13.50       27.50
   43      13.84       28.16
   44      14.17       28.83
   45      14.50       29.50**
   46      14.84       30.16
   47      15.17       30.83
   48      15.50       31.50***

  The Cumulative Distribution Function (CDF) tells us the probability that a
  guard will no longer be in use after a given number of time units have
  passed.

  Because the Sybil attack on the third node is expected to complete at any
  point in the second node's rotation period with uniform probability, if we
  want to know the probability that a second-level Guard node will still be in
  use after t days, we first need to compute the probability distribution of
  the rotation duration of the second-level guard at a uniformly random point
  in time. Let's call this P(R=r).

  For P(R=r), the probability of the rotation duration depends on the selection
  probability of a rotation duration, and the fraction of total time that
  rotation is likely to be in use. This can be written as:

  P(R=r) = ProbMaxXX(X=r)*r / \sum_{i=1}^N ProbMaxXX(X=i)*i

  or in Python:

  def ProbR(N, r, ProbFunc=ProbMaxXX):
     return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

  For the full CDF, we simply sum up the fractional probability density for
  all rotation durations. For rotation durations less than t days, we add the
  entire probability mass for that period to the density function. For
  durations d greater than t days, we take the fraction of that rotation
  period's selection probability and multiply it by t/d and add it to the
  density. In other words:

  def FullCDF(N, t, ProbFunc=ProbR):
    density = 0.0
    for d in range(N):
      if t >= d: density += ProbFunc(N, d)
      # The +1's below compensate for 0-indexed arrays:
      else: density += ProbFunc(N, d)*(float(t+1))/(d+1)
    return density

  Computing this yields the following distribution for our current parameters:

   t          P(SECOND_ROTATION <= t)
   1               0.03247
   2               0.06494
   3               0.09738
   4               0.12977
   5               0.16207
  10               0.32111
  15               0.47298
  20               0.61353
  25               0.73856
  30               0.84391
  35               0.92539
  40               0.97882
  45               1.00000

  This CDF tells us that for the second-level Guard rotation, the
  adversary can expect that 3.2% of the time, their third-level Sybil
  attack will provide them with a second-level guard node that has only
  1 day or less remaining before it rotates. 6.5% of the time, there will
  be only 2 days or less remaining, and 9.7% of the time, 3 days or less.

  Note that this distribution is still a day-resolution approximation.


4. Security concerns and mitigations

4.1. Mitigating fingerprinting of new HS circuits

  By pinning the middle nodes of rendezvous circuits, we make it
  easier for all hops of the circuit to detect that they are part of a
  special hidden service circuit with varying degrees of certainty.

  The Guard node is able to recognize a Vanguard client with a high
  degree of certainty because it will observe a client IP creating the
  overwhelming majority of its circuits to just a few middle nodes in
  any given 29.5 day time period.

  The middle nodes will be able to tell with a variable certainty that
  depends on both their traffic volume and the popularity of the
  service, because they will see a large number of circuits that tend to
  pick the same Guard and Exit.

  The final nodes will be able to tell with a similar level of certainty
  that depends on their capacity and the service popularity, because they
  will see a lot of handshakes that all tend to have the same second
  hops.

  The most serious of these is the Guard fingerprinting issue. When
  proposal 254-padding-negotiation is implemented, services that enable
  this feature should use those padding primitives to create fake circuits
  to random middle nodes that are not their guards, in an attempt to look
  more like a client.

  Additionally, if Tor Browser implements "virtual circuits" based on
  SOCKS username+password isolation in order to enforce the re-use of
  paths when SOCKS username+passwords are re-used, then the number of
  middle nodes in use during a typical user's browsing session will be
  proportional to the number of sites they are viewing at any one time.
  This is likely to be much lower than one new middle node every ten
  minutes, and for some users, may be close to the number of Vanguards
  we're considering.

  This same reasoning is also an argument for increasing the number of
  second-level guards beyond just two, as it will spread the hidden
  service's traffic over a wider set of middle nodes, making it both
  easier to cover, and behave closer to a client using SOCKS virtual
  circuit isolation.

5. Default vs optional behavior

  We suggest that this torrc option be optional because it changes path
  selection in a way that may seriously impact hidden service performance,
  especially for high traffic services that happen to pick slow guard
  nodes.

  However, by having this setting be disabled by default, we make hidden
  services that use it stand out a lot. For this reason, we should in fact
  enable this feature globally, but only after we verify its viability for
  high-traffic hidden services, and ensure that it is free of second-order
  load balancing effects.

  Even after that point, until Single Onion Services are implemented,
  there will likely still be classes of very high traffic hidden services
  for whom some degree of location anonymity is desired, but for which
  performance is much more important than the benefit of Vanguards, so there
  should always remain a way to turn this option off.

  In the meantime, a reference implementation is available at:
  https://github.com/mikeperry-tor/vanguards/blob/master/vanguards/vanguards.py


6. Acknowledgements

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.


Appendix A: Full Python program for generating tables in this proposal

#!/usr/bin/python3
import math

############ Section 3.1 #################
def num_rotations(c, v, success):
  i = 0
  while 1-math.pow((1-c), v*i) < success: i += 1
  return i

def rotation_line(c, pct):
  print("    %2d%%        %6d%6d%6d%6d%6d%6d%6d%6d%6d%6d%8d" %
     (pct, num_rotations(c, 1, pct/100.0), num_rotations(c, 2, pct/100.0),
      num_rotations(c, 3, pct/100.0), num_rotations(c, 4, pct/100.0),
      num_rotations(c, 5, pct/100.0), num_rotations(c, 6, pct/100.0),
      num_rotations(c, 8, pct/100.0), num_rotations(c, 9, pct/100.0),
      num_rotations(c, 10, pct/100.0), num_rotations(c, 12, pct/100.0),
      num_rotations(c, 16, pct/100.0)))

def rotation_table_31():
  for c in [1,5,10]:
    print("\n  %2.1f%% Network Compromise: " % c)
    print("   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen")
    for success in [10,15,25,50,60,75,85,90,95,99]:
      rotation_line(c/100.0, success)

############ Section 3.3 #################
def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

def ExpFn(N, ProbFunc):
  exp = 0.0
  for i in range(N): exp += i*ProbFunc(N, i)
  return exp

def ProbUniformX(N, i): return 1.0/N

def ProbR(N, r, ProbFunc=ProbMaxXX):
  return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

def FullCDF(N, t, ProbFunc=ProbR):
  density = 0.0
  for d in range(N):
    if t >= d: density += ProbFunc(N, d)
    # The +1's below compensate for 0-indexed arrays:
    else: density += ProbFunc(N, d)*float(t+1)/(d+1)
  return density

def expectation_table_33():
  print("\n   Range  Min(X,X)   Max(X,X)")
  for i in range(10,49):
    print("   %2d      %2.2f       %2.2f" % (i, ExpFn(i,ProbMinXX), ExpFn(i, ProbMaxXX)))

def CDF_table_33():
  print("\n   t          P(SECOND_ROTATION <= t)")
  for i in range(1,46):
    print("  %2d               %2.5f" % (i, FullCDF(45, i-1)))

########### Output ############

# Section 3.1
rotation_table_31()

# Section 3.3
expectation_table_33()
CDF_table_33()

Filename: 293-know-when-to-publish.txt
Title: Other ways for relays to know when to publish
Author: Nick Mathewson
Created: 30-May-2018
Status: Closed
Target: 0.3.5
Implemented-In: 0.4.0.1-alpha

  [IMPLEMENTATION NOTES: Mechanism one is implemented; mechanism two is
  rejected.]


1. Motivation

   In proposal 275, we give reasons for dropping the published-on
   field from consensus documents, to improve the performance of
   consensus diffs.  We've already changed Tor (as of 0.2.9.11) to
   allow us to set those fields far in the future -- but
   unfortunately, there is still one use case that requires them:
   relays use the published-on field to tell if they are about to fall
   out of the consensus and need to make new descriptors.

   Here we propose two alternative mechanisms for relays to know that
   they should publish descriptors, so we can enact proposal 275 and
   set the published-on field to some time in the distant future.


2. Mechanism One: The StaleDesc flag

   Authorities should begin voting on a new StaleDesc flag.

   When authorities vote, if the most recent published_on date for
   a descriptor is over DESC_IS_STALE_INTERVAL in the past, the
   authorities should vote to give the StaleDesc flag to that relay.

   If any relay sees that it has the StaleDesc flag, it should upload
   some time in the first half of the voting interval.  (Implementors
   should take care not to re-upload over and over, though: Relays won't
   lose the flag until the next voting interval is reached.)

   (Define DESC_IS_STALE_INTERVAL as equal to
   FORCE_REGENERATE_DESCRIPTOR_INTERVAL.)
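
   A minimal sketch of both sides of this mechanism is shown below. The
   18 hour value for FORCE_REGENERATE_DESCRIPTOR_INTERVAL, the class name,
   and the uniform choice of upload time within the first half of the
   voting interval are all assumptions made for illustration; the proposal
   itself only fixes the behaviour described above.

```python
import random

# Assumed value; the proposal only states that the two intervals are equal.
FORCE_REGENERATE_DESCRIPTOR_INTERVAL = 18 * 60 * 60
DESC_IS_STALE_INTERVAL = FORCE_REGENERATE_DESCRIPTOR_INTERVAL

def authority_votes_stale(published_on, now):
    """Authority side: vote StaleDesc if the descriptor is too old."""
    return now - published_on > DESC_IS_STALE_INTERVAL

class RelayUploadScheduler:
    """Relay side: upload at most once per voting interval while flagged."""

    def __init__(self, voting_interval, rng=random):
        self.voting_interval = voting_interval
        self.rng = rng
        self.last_scheduled_interval = None

    def next_upload_time(self, has_stale_flag, interval_start):
        """Return an upload time in seconds, or None if nothing is due."""
        if not has_stale_flag:
            return None
        if self.last_scheduled_interval == interval_start:
            # The flag cannot clear before the next vote, so do not
            # schedule a second upload within the same interval.
            return None
        self.last_scheduled_interval = interval_start
        delay = self.rng.uniform(0, self.voting_interval / 2.0)
        return interval_start + delay
```

   Remembering the interval in which an upload was already scheduled is
   one simple way to avoid the repeated re-uploads warned about above.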


3. Mechanism Two: Uploading more frequently when rejected.

   Tor relays should remember the last time at which they uploaded a
   descriptor that was accepted by a majority of dirauths.  If this
   time is more than FAST_RETRY_DESCRIPTOR_INTERVAL in the past, we
   mark our descriptor as dirty from
   mark_my_descriptor_dirty_if_too_old().


4. Implications for proposal 275

   Once most relays are running versions that support the features
   above, and once authorities are generating consensuses with the
   StaleDesc flag, there will no longer be a need to keep the
   published time in consensus documents accurate -- we can start
   setting it to some time in the distant future, per proposal 275.

Filename: 294-tls-1.3.txt
Title: TLS 1.3 Migration
Authors: Isis Lovecruft
Created: 11 December 2017
Updated: 23 January 2018
Status: Draft

This proposal is currently in draft state and should be periodically
revised as we research how much of our idiosyncratic older TLS uses
can be removed.

1. Motivation

   TLS 1.3 is a substantial redesign over previous versions of TLS, with several
   significant protocol changes which should likely provide Tor implementations
   with not only greater security but an improved ability to blend into "normal"
   TLS traffic on the internet, due to its improvements in encrypting more
   portions of the handshake.

   Tor implementations may utilise the new TLS 1.3 EncryptedExtensions feature
   to define arbitrary encrypted TLS extensions to encompass our less standard
   (ab)uses of TLS.  Additionally, several new Elliptic Curve (EC) based
   signature algorithms, including Ed25519 and Ed448, are included within the
   base specification including a single specification for EC point compression
   for each supported curve, further decreasing our reliance on
   Tor-protocol-specific uses and extensions (and implementation details).

   Other new features which Tor implementations might take advantage of include
   improved (server-side) stateless session resumption, which might be usable
   for OPs to resume sessions with their guards, for example after network
   disconnection or router IP address reassignment.

2. Summary

   Everything that's currently TLS 1.2: make it use TLS 1.3.  KABLAM.  DONE.

   For an excellent summary of differences between TLS 1.2 and TLS 1.3, see
   [TLS-1.3-DIFFERENCES].

3. Specification

3.1. Link Subprotocol 4

   (We call it "Link v4" here, but reserve whichever is the subsequently
   available subprotocol version at the time.)

3.2. TLS Session Resumption & Compression

   As before, implementations MUST NOT allow TLS session resumption.  In the
   event that it might be decided in the future that OR implementations would
   benefit from 0-RTT, we can re-evaluate this decision and its security
   considerations in a separate proposal.

   Compression has been removed from TLS in version 1.3, so we no longer need to
   make recommendations against its usage.

3.3. Handshake Protocol

3.3.1. Negotiation

   The initiator sends the following four sets of options, as defined in §4.1.1
   of [TLS-1.3-NEGOTIATION]:
   >
   >  - A list of cipher suites which indicates the AEAD algorithm/HKDF hash
   >    pairs which the client supports.
   >  - A “supported_groups” (Section 4.2.7) extension which indicates the
   >    (EC)DHE groups which the client supports and a “key_share” (Section 4.2.8)
   >    extension which contains (EC)DHE shares for some or all of these groups.
   >  - A “signature_algorithms” (Section 4.2.3) extension which indicates the
   >    signature algorithms which the client can accept.
   >  - A “pre_shared_key” (Section 4.2.11) extension which contains a list of
   >    symmetric key identities known to the client and a “psk_key_exchange_modes”
   >    (Section 4.2.9) extension which indicates the key exchange modes that may be
   >    used with PSKs.

   In our case, the initiator MUST leave the PSK section blank and MUST include
   the "key_share" extension, and the responder proceeds to select an ECDHE
   group, including its "key_share" in the response ServerHello.

3.3.2. ClientHello

   To initiate a v4 handshake, the client sends a TLS1.3 ClientHello with the
   following options:

     - The "legacy_version" field MUST be set to "TLS 1.2 (0x0303)".  TLS 1.3
       REQUIRES this.  (Actual version negotiation is done via the
       "supported_versions" extension.  See §5.1 of this proposal for details of
       the case where a TLS-1.3 capable initiator finds themself talking to a
       node which does not support TLS 1.3 and/or doesn't support v4.)

     - The "random" field MUST be filled with 32 bytes of securely generated
       randomness.

     - The "legacy_session_id" MUST be set to a new pseudorandom value each
       time, regardless of whether the initiator has previously opened either a
       TLS1.2 or TLS1.3 connection to the other side.

     - The "legacy_compression_methods" MUST be set to a single null byte,
       indicating no compression is supported.  (This is the only valid setting
       for this field in TLS1.3, since there is no longer any compression
       support.)

     - The "cipher_suites" should be set to 
       "TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-128-GCM-SHA256:TLS13-AES-256-GCM-SHA384:"
       This is the DEFAULT cipher suite list for OpenSSL 1.1.1.  While an
       argument could be made for customisation to remove the AES-128 option, we
       choose to attempt to blend in with the majority of other TLS-1.3
       clients, since this portion of the handshake is unencrypted.
       (If the initiator actually means to begin a v3 protocol connection, they
       send these ciphersuites anyway, cf. §5.2 of this proposal.)

     - The "supported_groups" MUST include "X25519" and SHOULD NOT include any
       of the NIST P-* groups.

     - The "signature_algorithms" MUST include "ed25519 (0x0807)".
       Implementations MAY advertise support for other signature schemes,
       including "ed448 (0x0808)", however they MUST NOT advertise support for
       ECDSA schemes due to the perils of secure implementation.

   The initiator MUST NOT send any "pre_shared_key" or "psk_key_exchange_modes"
   extensions.

   The details of the "signature_algorithms" choice depend upon the final
   standardisation of PKIX. [IETF-PKIX]
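
   The requirements above can be collected into a simple checker, sketched
   below. The dict layout and function name are hypothetical and do not
   correspond to any particular TLS library; the numeric codepoints are
   those assigned in the TLS 1.3 specification. Freshness of
   "legacy_session_id" cannot be checked from a single message, so it is
   left out.

```python
import os

# Codepoints as assigned in the TLS 1.3 specification.
TLS12_LEGACY_VERSION = 0x0303
GROUP_X25519 = 0x001D
NIST_GROUPS = {0x0017, 0x0018, 0x0019}            # secp256r1/384r1/521r1
SIG_ED25519 = 0x0807
ECDSA_SCHEMES = {0x0203, 0x0403, 0x0503, 0x0603}
FORBIDDEN_EXTENSIONS = {"pre_shared_key", "psk_key_exchange_modes"}

def check_client_hello(hello):
    """Return a list of rule violations for a dict-shaped ClientHello."""
    problems = []
    if hello["legacy_version"] != TLS12_LEGACY_VERSION:
        problems.append("legacy_version must be 0x0303")
    if len(hello["random"]) != 32:
        problems.append("random must be 32 bytes")
    if hello["legacy_compression_methods"] != b"\x00":
        problems.append("compression must be a single null byte")
    if GROUP_X25519 not in hello["supported_groups"]:
        problems.append("supported_groups must include X25519")
    if NIST_GROUPS & set(hello["supported_groups"]):
        problems.append("supported_groups should not include NIST groups")
    if SIG_ED25519 not in hello["signature_algorithms"]:
        problems.append("signature_algorithms must include ed25519")
    if ECDSA_SCHEMES & set(hello["signature_algorithms"]):
        problems.append("signature_algorithms must not include ECDSA")
    if FORBIDDEN_EXTENSIONS & set(hello["extensions"]):
        problems.append("PSK extensions must not be sent")
    return problems

example_hello = {
    "legacy_version": 0x0303,
    "random": os.urandom(32),
    "legacy_compression_methods": b"\x00",
    "supported_groups": [0x001D],
    "signature_algorithms": [0x0807, 0x0808],     # ed25519, ed448
    "extensions": ["supported_versions", "key_share"],
}
problems = check_client_hello(example_hello)      # expected to be empty
```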

3.3.2.1. ClientHello Extensions

   From [TLS-1.3_SIGNATURE_ALGOS]:
   >
   > The “signature_algorithms_cert” extension was added to allow implementations
   > which supported different sets of algorithms for certificates and in TLS itself
   > to clearly signal their capabilities. TLS 1.2 implementations SHOULD also
   > process this extension.

   In order to support cross-certification, the initiator's ClientHello MUST
   include the "signature_algorithms_cert" extension, in order to signal that
   some certificate chains (one in particular) will include a certificate signed
   using RSA-PKCSv1-SHA1:

     - The "signature_algorithms_cert" MUST include the legacy algorithm
       "rsa_pkcs1_sha1(0x0201)".

3.3.3. ServerHello

   To respond to a TLS 1.3 ClientHello which supports the v4 link handshake
   protocol, the responder sends a ServerHello with the following options:

     - The "legacy_version" field MUST be set to "TLS 1.2 (0x0303)".  TLS 1.3
       REQUIRES this.  (Actual version negotiation is done via the
       "supported_versions" extension.  See §5.1 of this proposal for details of
       the case where a TLS-1.3 capable initiator finds themself talking to a
       node which does not support TLS 1.3 and/or doesn't support v4.)

     - The "random" field MUST be filled with 32 bytes of securely generated
       randomness.

     - The "legacy_session_id_echo" field MUST be filled with the contents of
       the "legacy_session_id" from the initiator's ClientHello.

     - The "cipher_suite" field MUST be set to "TLS13-CHACHA20-POLY1305-SHA256".

     - The "legacy_compression_method" MUST be set to a single null byte,
       indicating no compression is supported.  (This is the only valid setting
       for this field in TLS1.3, since there is no longer any compression
       support.)

   XXX structure and "key_share" response (XXX can we pre-generate a cache of
   XXX key_shares?)

3.3.3.1. ServerHello Extensions

   XXX what extensions do we need?

4. Implementation Details

4.1. Certificate Chains and Cross-Certifications

   TLS 1.3 specifies that a certificate in a chain SHOULD be directly certified by
   the preceding certificate in the chain.  This seems to imply that OR
   implementations SHOULD NOT do the DAG-like construction normally implied by
   cross-certification between the master Ed25519 identity key and the master
   RSA-1024 identity key.

   Instead, since certificate chains are expected to be linear, we'll need three
   certificate chains included in the same handshake:

     1. EdMaster->EdSigning, EdSigning->Link
     2. EdMaster->RSALegacy
     3. RSALegacy->EdMaster

   where A->B denotes that the certificate containing B has been signed with key A.
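
   To make the linearity requirement concrete, the following sketch models a
   certificate as a (signing key, certified key) pair and checks that each
   certificate in a chain is signed by the key certified just before it.  The
   representation and the function name are hypothetical; no real certificate
   parsing or signature checking is performed.

```python
def is_linear_chain(chain):
    """True if every certificate is signed by the key certified before it."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(chain, chain[1:]))


# The three chains listed above, written as (A, B) for "A->B".
CHAINS = [
    [("EdMaster", "EdSigning"), ("EdSigning", "Link")],
    [("EdMaster", "RSALegacy")],
    [("RSALegacy", "EdMaster")],
]

# Putting the cross-certifications into one chain gives a DAG, not a line.
DAG_LIKE = [
    ("EdMaster", "EdSigning"),
    ("EdMaster", "RSALegacy"),
    ("RSALegacy", "EdMaster"),
]
```

   Each entry of CHAINS passes the check, while DAG_LIKE does not, which is
   why the cross-certifications are split into separate chains.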

4.2. Removal of AUTHENTICATE, CLIENT_AUTHENTICATE, and CERTS cells

   XXX see prop#224 and RFC5705 and compare

   XXX when can we remove our "renegotiation" handshake completely?

5. Compatibility

5.1. TLS 1.2 version negotiation

   From [TLS-1.3-DIFFERENCES]:
   >
   > The “supported_versions” ClientHello extension can be used to
   > negotiate the version of TLS to use, in preference to the
   > legacy_version field of the ClientHello.

   If an OR does not receive a ClientHello with a "supported_versions"
   extension, it MUST fall back to using the Tor Link subprotocol v3.  That is,
   the OR MUST immediately fall back to TLS 1.2 (or v3 with TLS 1.3, cf. the next
   section) and, following both Tor's "renegotiation" and "in-protocol" version
   negotiation mechanisms, immediately send a VERSIONS cell.

   Otherwise, upon seeing a "supported_versions" in the ClientHello set to
   0x0304, the OR should proceed with Tor's Link subprotocol 4.
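
   The decision above can be sketched as follows.  The dict of parsed
   extensions and the function name are hypothetical.  The text does not say
   what to do when "supported_versions" is present but does not offer 0x0304;
   this sketch assumes a fallback to v3 in that case.

```python
TLS_1_3 = 0x0304


def choose_link_protocol(extensions):
    """Pick the Tor link subprotocol from parsed ClientHello extensions."""
    versions = extensions.get("supported_versions")
    if versions is None:
        return 3          # no extension at all: MUST fall back to v3
    if TLS_1_3 in versions:
        return 4          # TLS 1.3 offered: proceed with v4
    return 3              # assumption: treat as a non-TLS-1.3 peer
```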

5.2. Preparing Tor's v3 Link Subprotocol for TLS 1.3

   Some changes to the current v3 Link protocol are required, and these MUST
   be backported, since implementations which are currently compiled against
   TLS1.3-supporting OpenSSLs fail to establish any connections due to:

     - failing to include any ciphersuite candidates which are TLS1.3 compatible

   This is likely to be accomplished by:

     1. Prefacing our v3 ciphersuite lists with
        TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-128-GCM-SHA256:TLS13-AES-256-GCM-SHA384:
        (We could also retroactively change our custom cipher suite list to be
        the HIGH cipher suites, since this includes all TLS 1.3 suites.)
     2. Calling SSL_CTX_set1_groups() to set the supported groups (should be set
        to "X25519:P-256"). [TLS-1.3_SET1_GROUPS]
     3. Taking care that older OpenSSLs, which instead have the concept of
        "curves" not groups, should have their equivalent TLS context settings
        in place.  [TLS-1.3_SET1_GROUPS] mentions that "The curve functions were
        first added to OpenSSL 1.0.2. The equivalent group functions were first
        added to OpenSSL 1.1.1".

   However more steps may need to be taken.

   [XXX are there any more steps necessary? —isis]
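
   Step 1 amounts to a string operation on the cipher list.  A minimal sketch,
   assuming the colon-separated list format and the pre-release suite names
   used in this proposal (released OpenSSL 1.1.1 names and configures TLS 1.3
   suites differently, so the names would need adapting); the helper name is
   hypothetical:

```python
TLS13_SUITES = [
    "TLS13-CHACHA20-POLY1305-SHA256",
    "TLS13-AES-128-GCM-SHA256",
    "TLS13-AES-256-GCM-SHA384",
]


def preface_cipher_list(v3_list):
    """Put the TLS 1.3 suites first, keeping the rest in their old order."""
    existing = [name for name in v3_list.split(":") if name]
    rest = [name for name in existing if name not in TLS13_SUITES]
    return ":".join(TLS13_SUITES + rest)
```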

6. Security Implications

   XXX evaluate the static RSA attack and its effects on TLS1.2/TLS1.3
   XXX dual-operable protocols and determine if they apply
   XXX
   XXX Jager, T., Schwenk, J. and J. Somorovsky, "On the Security
   XXX of TLS 1.3 and QUIC Against Weaknesses in PKCS#1 v1.5 Encryption",
   XXX Proceedings of ACM CCS 2015 , 2015.
   XXX https://www.nds.rub.de/media/nds/veroeffentlichungen/2015/08/21/Tls13QuicAttacks.pdf

7. Performance and Scalability

8. Availability and External Deployment

8.1. OpenSSL Availability and Interoperability

   Implementation should be delayed until the stable release of OpenSSL 1.1.1.

   OpenSSL 1.1.1 will be binary and API compatible with OpenSSL 1.1.0, so in
   preparation we might wish to update our current OpenSSL usage to target
   OpenSSL 1.1.0.

   From Matt Caswell in [OPENSSL-BLOG-TLS-1.3]:
   >
   > OpenSSL 1.1.1 will not be released until (at least) TLSv1.3 is
   > finalised. In the meantime the OpenSSL git master branch contains
   > our development TLSv1.3 code which can be used for testing purposes
   > (i.e. it is not for production use). You can check which draft
   > TLSv1.3 version is implemented in any particular OpenSSL checkout
   > by examining the value of the TLS1_3_VERSION_DRAFT_TXT macro in the
   > tls1.h header file. This macro will be removed when the final
   > version of the standard is released.
   >
   > In order to compile OpenSSL with TLSv1.3 support you must use the
   > “enable-tls1_3” option to “config” or “Configure”.
   >
   > Currently OpenSSL has implemented the “draft-20” version of
   > TLSv1.3. Many other libraries are still using older draft versions in
   > their implementations. Notably many popular browsers are using
   > “draft-18”. This is a common source of interoperability
   > problems. Interoperability of the draft-18 version has been tested
   > with BoringSSL, NSS and picotls.
   >
   > Within the OpenSSL git source code repository there are two branches:
   > “tls1.3-draft-18” and “tls1.3-draft-19”, which implement the older
   > TLSv1.3 draft versions. In order to test interoperability with other
   > TLSv1.3 implementations you may need to use one of those
   > branches. Note that those branches are considered temporary and are
   > likely to be removed in the future when they are no longer needed.

   At the time of its release, we may wish to test interoperability with other
   implementation(s).

9. Future Directions

   The implementation of this proposal would greatly ease the
   implementation difficulty and maintenance requirements for some
   other possible future beneficial areas of work.

9.1. TLS Handshake Composability

   Handshake composition (i.e. hybrid handshakes) in TLS 1.3 is incredibly
   straightforward.

   For example, provided we had a Supersingular Isogeny Diffie-Hellman (SIDH)
   based implementation with a sane API, composition of Elliptic Curve
   Diffie-Hellman (ECDH) and SIDH handshakes would be a trivial codebase
   addition (~10-20 lines of code, for others who have implemented this).

   Our current circuit-layer protocol safeguards the majority of our security
   and anonymity guarantees, while our TLS layer has historically been either a
   stop-gap and/or an attempted (albeit usually not-so-successful) obfuscation
   mechanism.  However, our TLS usage has, in many cases, successfully, through
   combination with the circuit layer cryptography, prevented more than a few
   otherwise horrendous bugs.  After our circuit-layer protocol is upgraded to a
   hybrid post-quantum secure protocol (prop#269 and prop#XXX), and in order to
   ensure that our TLS layer continues to act in this manner as a stop gap —
   including within threat models which include adversaries capable of recording
   traffic now and decrypting with a potential quantum computer in the future —
   our TLS layer should also provide safety against such a quantum-capable
   adversary.


A. References

[TLS-1.3-DIFFERENCES]:
   https://tlswg.github.io/tls13-spec/draft-ietf-tls-tls13.html#rfc.section.1.3
[OPENSSL-BLOG-TLS-1.3]:
   https://www.openssl.org/blog/blog/2017/05/04/tlsv1.3/
[TLS-1.3-NEGOTIATION]:
   https://tlswg.github.io/tls13-spec/draft-ietf-tls-tls13.html#rfc.section.4.1.1
[IETF-PKIX]:
   https://datatracker.ietf.org/doc/draft-ietf-curdle-pkix/
[TLS-1.3_SET1_GROUPS]:
   https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set1_groups.html
[TLS-1.3_SIGNATURE_ALGOS]:
   https://tlswg.github.io/tls13-spec/draft-ietf-tls-tls13.html#signature-algorithms
Filename: 295-relay-crypto-with-adl.txt
Title: Using ADL for relay cryptography (solving the crypto-tagging attack)
Author: Tomer Ashur, Orr Dunkelman, Atul Luykx
Created: 22 Feb 2018
Last-Modified: 13 Jan. 2020
Status: Open


0. Context

   Although Crypto Tagging Attacks were already identified in the
   original Tor design, it was not until the rise of the
   Procyonidae in 2012 that their severity was fully realized. In
   Proposal 202 (Two improved relay encryption protocols for Tor
   cells) Nick Mathewson discussed two approaches to stymie tagging
   attacks and generally improve Tor's cryptography. In Proposal 261
   (AEZ for relay cryptography) Mathewson puts forward a concrete
   approach which uses the tweakable wide-block cipher AEZ.

   This proposal suggests an alternative approach to Proposal 261
   using the notion of Release (of) Unverified Plaintext (RUP)
   security. It describes an improved algorithm for circuit
   encryption based on CTR-mode which is already used in Tor, and an
   additional component for hashing.

   Incidentally, and similarly to Proposal 261, this proposal employs
   the ENCODE-then-ENCIPHER approach; it thus improves Tor's E2E
   integrity by using (sufficient) redundancy.

   For more information about the scheme and a security proof for
   its RUP-security see

       Tomer Ashur, Orr Dunkelman, Atul Luykx: Boosting
       Authenticated Encryption Robustness with Minimal
       Modifications. CRYPTO (3) 2017: 3-33

   available online at https://eprint.iacr.org/2017/239 .

   For authentication between the OP and the edge node we use
   the PIV scheme: https://eprint.iacr.org/2013/835 .

   A recent paper presented a birthday bound distinguisher
   against the ADL scheme, thus showing that the RUP security
   proof is tight: https://eprint.iacr.org/2019/1359 .


2. Preliminaries

2.1 Motivation

   For motivation, see proposal 202.

2.2. Notation

   Symbol               Meaning
   ------               -------
   M                    Plaintext
   C_I                  Ciphertext
   CTR                  Counter Mode
   N_I                  A de/encryption nonce (to be used in CTR-mode)
   T_I                  A tweak (to be used to de/encrypt the nonce)
   Tf'_I                A running digest (forward direction)
   Tb'_I                A running digest (backward direction)
   ^                    XOR
   ||                   Concatenation
          (This is more readable than a single | but must be adapted
          before integrating the proposal into tor-spec.txt)

2.3. Security parameters

   HASH_LEN -- The length of the hash function's output, in bytes.

   PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509)

   DIG_KEY_LEN -- The key length used to digest messages (e.g.,
   using GHASH). Since GHASH is only defined for 128-bit keys, we
   recommend DIG_KEY_LEN = 128.

   ENC_KEY_LEN -- The key length used for encryption (e.g., AES). We
   recommend ENC_KEY_LEN = 256.

2.4. Key derivation (replaces Section 5.2.2 in tor-spec.txt)

   For newer KDF needs, Tor uses the key derivation function HKDF
   from RFC5869, instantiated with SHA256. The generated key
   material is:

                 K = K_1 | K_2 | K_3 | ...

   where, if H(x,t) denotes HMAC_SHA256 with value x and key t,
         and m_expand denotes an arbitrarily chosen value,
         and INT8(i) is an octet with the value "i", then
             K_1     = H(m_expand | INT8(1) , KEY_SEED )
         and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED ).
   In RFC5869's vocabulary, this is HKDF-SHA256 with info ==
   m_expand, salt == t_key, and IKM == secret_input.
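
   The expansion step can be sketched with the Python standard library.  The
   function names and the limit of 255 blocks (from RFC5869) are choices made
   for this illustration; the m_expand value is whatever the handshake
   specifies and is passed in by the caller.

```python
import hashlib
import hmac


def H(x, t):
    """HMAC_SHA256 with value x and key t."""
    return hmac.new(t, x, hashlib.sha256).digest()


def expand(key_seed, m_expand, n_bytes):
    """Return the first n_bytes of K = K_1 | K_2 | K_3 | ..."""
    out, k_i, i = b"", b"", 0
    while len(out) < n_bytes:
        i += 1
        if i > 255:
            raise ValueError("HKDF-SHA256 yields at most 255 blocks")
        # K_1 = H(m_expand | INT8(1), KEY_SEED); later blocks prepend K_i.
        k_i = H(k_i + m_expand + bytes([i]), key_seed)
        out += k_i
    return out[:n_bytes]
```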

   When used in the ntor handshake a string of key material is
   generated and is used in the following way:

   Length       Purpose                            Notation
   ------       -------                            --------
   HASH_LEN     forward authentication digest IV   AF
   HASH_LEN     forward digest IV                  DF
   HASH_LEN     backward digest IV                 DB
   ENC_KEY_LEN  encryption key                     Kf
   ENC_KEY_LEN  decryption key                     Kb
   DIG_KEY_LEN  forward digest key                 Khf
   DIG_KEY_LEN  backward digest key                Khb
   ENC_KEY_LEN  forward tweak key                  Ktf
   ENC_KEY_LEN  backward tweak key                 Ktb
   DIGEST_LEN   nonce to use in the
                  hidden service protocol(*)

      (*) I am not sure whether this is still needed.

   Excess bytes from K are discarded.
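
   Slicing K according to the table might look like the sketch below.  The
   byte lengths are assumptions: HASH_LEN = 16 (a GHASH output), ENC_KEY_LEN =
   32 and DIG_KEY_LEN = 16 (the recommended 256-bit and 128-bit keys), and
   DIGEST_LEN = 20, which this proposal does not define.  The name "hs_nonce"
   is also made up for the last row.

```python
HASH_LEN = 16      # assumption: one GHASH output block
ENC_KEY_LEN = 32   # 256-bit AES key, in bytes
DIG_KEY_LEN = 16   # 128-bit GHASH key, in bytes
DIGEST_LEN = 20    # assumption: not defined in this proposal

LAYOUT = [
    ("AF", HASH_LEN), ("DF", HASH_LEN), ("DB", HASH_LEN),
    ("Kf", ENC_KEY_LEN), ("Kb", ENC_KEY_LEN),
    ("Khf", DIG_KEY_LEN), ("Khb", DIG_KEY_LEN),
    ("Ktf", ENC_KEY_LEN), ("Ktb", ENC_KEY_LEN),
    ("hs_nonce", DIGEST_LEN),
]


def split_key_material(K):
    """Slice K into the named values, in table order; extra bytes are dropped."""
    needed = sum(length for _, length in LAYOUT)
    if len(K) < needed:
        raise ValueError("need at least %d bytes of key material" % needed)
    keys, offset = {}, 0
    for name, length in LAYOUT:
        keys[name] = K[offset:offset + length]
        offset += length
    return keys
```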

2.6. Ciphers

   For hashing(*) we use GHASH(**) with a DIG_KEY_LEN-bit key. We write
   this as Digest(K,M) where K is the key and M the message to be
   hashed.

   We use AES with an ENC_KEY_LEN-bit key. For AES encryption
   (resp., decryption) we write E(K,X) (resp., D(K,X)) where K is an
   ENC_KEY_LEN-bit key and X the block to be encrypted (resp.,
   decrypted).

   For a stream cipher, unless otherwise specified, we use
   ENC_KEY_LEN-bit AES in counter mode, with a nonce that is
   generated as explained below. We write this as Encrypt(K,N,X)
   (resp., Decrypt(K,N,X)) where K is the key, N the nonce, and X
   the message to be encrypted (resp., decrypted).

   (*) The terms hash and digest are used interchangeably.
   (**) Proposal 308 suggested that using POLYVAL [GLL18]
        would be more efficient here. This proposal will work just the
        same if POLYVAL is used instead of GHASH.
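
   For reference, Digest(K,M) with GHASH can be sketched in pure Python using
   the bitwise multiplication in GF(2^128) from the GCM specification.  This
   sketch assumes that M is zero-padded to a multiple of 16 bytes, which the
   proposal does not specify, and it is neither constant-time nor fast.

```python
R = 0xE1 << 120   # reduction constant for the GCM field polynomial


def gf128_mul(x, y):
    """Multiply two field elements, bits numbered as in the GCM spec."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def digest(key, message):
    """GHASH of message under a 16-byte key (assumed zero padding)."""
    if len(key) != 16:
        raise ValueError("GHASH is only defined for 128-bit keys")
    h = int.from_bytes(key, "big")
    message += b"\x00" * (-len(message) % 16)
    y = 0
    for offset in range(0, len(message), 16):
        block = int.from_bytes(message[offset:offset + 16], "big")
        y = gf128_mul(y ^ block, h)
    return y.to_bytes(16, "big")
```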

3. Routing relay cells

   Let n denote the integer representing the destination node. For
   I = 1...n, we set Tf'_{I} = DF_I, Tb'_{I} = DB_I, and
   Ta'_I = AF_I where DF_I, DB_I, and AF_I are generated
   according to Section 2.4.

3.1. Forward Direction

   The forward direction is the direction that CREATE/CREATE2 cells
   are sent.

3.1.1. Routing from the origin

   When an OP sends a relay cell, they prepare the
   cell as follows:

        The OP prepares the authentication part of the message:

                C_{n+1} = M
                Ta_I = Digest(Khf_n,Ta'_I||C_{n+1})
                N_{n+1} = Ta_I ^ E(Ktf_n,Ta_I ^ 0)
                Ta'_{I} = Ta_I

        Then, the OP prepares the multi-layered encryption:

                For I=n...1:
                        C_I = Encrypt(Kf_I,N_{I+1},C_{I+1})
                        T_I = Digest(Khf_I,Tf'_I||C_I)
                        N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
                        Tf'_I = T_I

          The OP sends C_1 and N_1 to node 1.

3.1.2. Relaying forward at onion routers

   When a forward relay cell is received by OR_I, it decrypts the
   payload with the stream cipher, as follows:

        'Forward' relay cell:

                T_I = Digest(Khf_I,Tf'_I||C_I)
                N_{I+1} = T_I ^ D(Ktf_I,T_I ^ N_I)
                C_{I+1} = Decrypt(Kf_I,N_{I+1},C_I)
                Tf'_I = T_I

   The OR then decides whether it recognizes the relay cell as
   described below. If the OR recognizes the cell, it processes the
   contents of the relay cell. Otherwise, it passes C_{I+1}||N_{I+1}
   along the circuit if the circuit continues.

   For more information, see section 4 below.
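
   The forward-direction steps of Sections 3.1.1 and 3.1.2, together with the
   end node's tag check from Section 4.1.1, can be exercised end to end with
   the sketch below.  The primitives are stand-ins chosen so that the example
   runs with the standard library only, and they are NOT secure: a toy Feistel
   permutation replaces AES for E and D, truncated HMAC-SHA256 replaces GHASH,
   and a hash-derived keystream replaces AES in counter mode.  The Hop class
   and all function names are hypothetical.

```python
import hashlib
import hmac

ZERO = bytes(16)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def digest(key, msg):                      # stand-in for GHASH
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]

def _f(key, i, half):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def E(key, block):                         # stand-in for AES encryption
    l, r = block[:8], block[8:]
    for i in range(4):
        l, r = r, xor(l, _f(key, i, r))
    return l + r

def D(key, block):                         # inverse of E
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):
        l, r = xor(r, _f(key, i, l)), l
    return l + r

def stream(key, nonce, data):              # stand-in for AES-CTR
    ks = b"".join(hashlib.sha256(key + nonce + bytes([c])).digest()
                  for c in range(len(data) // 32 + 1))
    return xor(data, ks)

class Hop:
    """Per-hop forward state; the OP and the OR each hold a copy."""
    def __init__(self, seed):
        k = lambda label: hashlib.sha256(seed + label).digest()
        self.Kf, self.Khf, self.Ktf = k(b"Kf"), k(b"Khf"), k(b"Ktf")
        self.Tf, self.Ta = k(b"DF")[:16], k(b"AF")[:16]   # running digests

def op_send(hops, M):                      # Section 3.1.1
    last, C = hops[-1], M
    Ta = digest(last.Khf, last.Ta + C)
    N = xor(Ta, E(last.Ktf, Ta))           # Ta ^ E(Ktf_n, Ta ^ 0)
    last.Ta = Ta
    for hop in reversed(hops):
        C = stream(hop.Kf, N, C)
        T = digest(hop.Khf, hop.Tf + C)
        N = xor(T, E(hop.Ktf, xor(T, N)))
        hop.Tf = T
    return C, N

def or_relay(hop, C, N):                   # Section 3.1.2
    T = digest(hop.Khf, hop.Tf + C)
    N_next = xor(T, D(hop.Ktf, xor(T, N)))
    C_next = stream(hop.Kf, N_next, C)
    hop.Tf = T
    return C_next, N_next

def end_node_authenticate(hop, C, N):      # Section 4.1.1
    Ta = digest(hop.Khf, hop.Ta + C)
    if xor(Ta, D(hop.Ktf, xor(Ta, N))) != ZERO:
        return False                       # Ta' remains unchanged
    hop.Ta = Ta
    return True

seeds = [b"hop1", b"hop2", b"hop3"]
op_view, relays = [Hop(s) for s in seeds], [Hop(s) for s in seeds]
M = b"hello".ljust(48, b"\x00")
C, N = op_send(op_view, M)
for hop in relays:
    C, N = or_relay(hop, C, N)
print(C == M, end_node_authenticate(relays[-1], C, N))
```

   Flipping a bit of C between two relays changes the digest computed by the
   next relay, so every later layer decrypts under a different nonce and the
   final tag check fails.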

3.2. Backward direction

   The backward direction is the opposite direction from
   CREATE/CREATE2 cells.

3.2.1. Relaying backward at onion routers

   When a backward relay cell is received by OR_I, it encrypts the
   payload with the stream cipher, as follows:

        'Backward' relay cell:

                T_I = Digest(Khb_I,Tb'_I||C_{I+1})
                N_I = T_I ^ E(Ktb_I,T_I ^ N_{I+1})
                C_I = Encrypt(Kb_I,N_I,C_{I+1})
                Tb'_I = T_I

   with C_{n+1} = M and N_{n+1}=0. Once encrypted, the node passes
   C_I and N_I along the circuit towards the OP.

3.2.2. Routing to the origin

   When a relay cell arrives at an OP, the OP decrypts the payload
   with the stream cipher as follows:

        OP receives relay cell from node 1:

                For I=1...n, where n is the end node on the circuit:
                        C_{I+1} = Decrypt(Kb_I,N_I,C_I)
                        T_I = Digest(Khb_I,Tb'_I||C_{I+1})
                        N_{I+1} = T_I ^ D(Ktb_I,T_I ^ N_I)
                        Tb'_I = T_I

                If the payload is recognized (see Section 4.1),
                then:

                       The sending node is I. Stop, process the
                       payload and authenticate.
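
   The backward direction can be sketched in the same way, including how the
   OP learns which node sent the cell.  As before, the primitives are insecure
   stand-ins (a toy Feistel permutation for AES, truncated HMAC-SHA256 for
   GHASH, a hash-derived keystream for AES-CTR), and the class and function
   names are hypothetical.

```python
import hashlib
import hmac

ZERO = bytes(16)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def digest(key, msg):                      # stand-in for GHASH
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]

def _f(key, i, half):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def E(key, block):                         # stand-in for AES encryption
    l, r = block[:8], block[8:]
    for i in range(4):
        l, r = r, xor(l, _f(key, i, r))
    return l + r

def D(key, block):                         # inverse of E
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):
        l, r = xor(r, _f(key, i, l)), l
    return l + r

def stream(key, nonce, data):              # stand-in for AES-CTR
    ks = b"".join(hashlib.sha256(key + nonce + bytes([c])).digest()
                  for c in range(len(data) // 32 + 1))
    return xor(data, ks)

class Hop:
    """Per-hop backward state; the OP and the OR each hold a copy."""
    def __init__(self, seed):
        k = lambda label: hashlib.sha256(seed + label).digest()
        self.Kb, self.Khb, self.Ktb = k(b"Kb"), k(b"Khb"), k(b"Ktb")
        self.Tb = k(b"DB")[:16]            # running digest Tb'_I

def or_backward(hop, C_in, N_in):          # Section 3.2.1
    T = digest(hop.Khb, hop.Tb + C_in)
    N = xor(T, E(hop.Ktb, xor(T, N_in)))
    C = stream(hop.Kb, N, C_in)
    hop.Tb = T
    return C, N

def op_receive(hops, C, N):                # Section 3.2.2
    for index, hop in enumerate(hops, start=1):
        C = stream(hop.Kb, N, C)
        T = digest(hop.Khb, hop.Tb + C)
        N = xor(T, D(hop.Ktb, xor(T, N)))
        hop.Tb = T
        if N == ZERO:                      # recognized: node `index` sent it
            return index, C
    return None, None

seeds = [b"hop1", b"hop2", b"hop3"]
op_view, relays = [Hop(s) for s in seeds], [Hop(s) for s in seeds]
M = b"reply".ljust(48, b"\x00")
C, N = M, ZERO                             # C_{n+1} = M and N_{n+1} = 0
for hop in reversed(relays):
    C, N = or_backward(hop, C, N)
print(op_receive(op_view, C, N) == (3, M))
```

   A cell originated by the middle node is recognized after two layers, and
   the state kept for the third hop is left untouched.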

4. Application connections and stream management

4.1. Relay cells

  Within a circuit, the OP and the end node use the contents of
  RELAY packets to tunnel end-to-end commands and TCP connections
  ("Streams") across circuits. End-to-end commands can be initiated
  by either edge; streams are initiated by the OP.

        The payload of each unencrypted RELAY cell consists of:

                Relay command           [1 byte]
                StreamID                [2 bytes]
                Length                  [2 bytes]
                Data                    [PAYLOAD_LEN-21 bytes]


   The old Digest field is removed since sufficient information for
   authentication is now included in the nonce part of the payload.

   The old 'Recognized' field is removed and the node always tries to
   authenticate the message as follows.

4.1.1 forward direction (executed by the end node):

                Ta_I = Digest(Khf_n,Ta'_I||C_{n+1})
                Tag = Ta_I ^ D(Ktf_n,Ta_I ^ N_{n+1})

                If Tag = 0:
                        Ta'_I = Ta_I
                        The message is authenticated.
                Otherwise:
                        Ta'_I remains unchanged.
                        The message is not authenticated.


4.1.2 backward direction (executed by the OP):

                The message is recognized and authenticated
                (i.e., C_{n+1} = M) if and only if N_{n+1} = 0.


   The 'Length' field of a relay cell contains the number of bytes
   in the relay payload which contain real payload data. The
   remainder of the payload is padding bytes.

4.2. Appending the encrypted nonce and dealing with version-homogenic
     and version-heterogenic circuits

   When a cell is prepared to be routed from the origin (see Section
   3.1.1 above) the encrypted nonce N is appended to the encrypted
   cell (occupying the last 16 bytes of the cell). If the cell is
   prepared to be sent to a node supporting the new protocol, N is
   used to generate the layer's nonce. Otherwise, if the node only
   supports the old protocol, N is still appended to the encrypted
   cell (so that following nodes can still recover their nonce),
   but a synchronized nonce (as per the old protocol) is used in
   CTR-mode.

   When a cell is sent along the circuit in the 'backward'
   direction, nodes supporting the new protocol always assume that
   the last 16 bytes of the input are the nonce used by the previous
   node, which they process as per Section 3.2.1. If the previous
   node also supports the new protocol, these cells are indeed the
   nonce. If the previous node only supports the old protocol, these
   bytes are either encrypted padding bytes or encrypted data.
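
   The resulting payload layout can be sketched with the struct module: a
   5-byte relay header, PAYLOAD_LEN-21 data bytes, and the 16-byte nonce in
   the last position.  Zero-valued padding and the function names are
   assumptions made for this illustration.

```python
import struct

PAYLOAD_LEN = 509
NONCE_LEN = 16
HEADER = struct.Struct("!BHH")        # relay command, StreamID, Length
DATA_LEN = PAYLOAD_LEN - 21


def pack_relay_body(command, stream_id, data):
    """Build the unencrypted relay body (header plus padded data)."""
    if len(data) > DATA_LEN:
        raise ValueError("data too long for one relay cell")
    # Assumption: padding bytes are zero.
    return HEADER.pack(command, stream_id, len(data)) + data.ljust(DATA_LEN, b"\x00")


def append_nonce(C, N):
    """Append the encrypted nonce N to the encrypted body C."""
    if len(C) != PAYLOAD_LEN - NONCE_LEN or len(N) != NONCE_LEN:
        raise ValueError("unexpected body or nonce length")
    return C + N


def split_payload(payload):
    """Return (C, N), taking N from the last 16 bytes of the payload."""
    return payload[:-NONCE_LEN], payload[-NONCE_LEN:]
```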

5. Security

5.1. Resistance to crypto-tagging attacks

   A crypto-tagging attack involves a circuit with two colluding
   nodes and at least one honest node between them. The attack works
   when one node makes a change to the cell (tagging) in a way that
   can be undone by the other colluding party. In between, the
   tagged cell is processed by honest nodes which do not detect the
   change. The attack is possible due to the malleability property
   of CTR-mode: a change to a ciphertext bit affects only the
   respective plaintext bit in a predictable way. This proposal
   frustrates the crypto-tagging attack by linking the nonce to the
   encrypted message such that any change to the ciphertext results
   in a random nonce and hence, random plaintext.
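
   The malleability that the attack relies on is easy to reproduce.  In this
   sketch a hash-derived keystream stands in for AES in counter mode; the key,
   nonce, and message are arbitrary example values.

```python
import hashlib


def ctr_xor(key, nonce, data):
    """Toy CTR-style stream cipher: XOR data with a hash-derived keystream."""
    ks = b"".join(hashlib.sha256(key + nonce + bytes([c])).digest()
                  for c in range(len(data) // 32 + 1))
    return bytes(d ^ k for d, k in zip(data, ks))


key, nonce = bytes(32), bytes(16)
plaintext = b"GET /index.html"
ciphertext = ctr_xor(key, nonce, plaintext)

# "Tagging": flip the lowest bit of the first ciphertext byte.
tagged = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
recovered = ctr_xor(key, nonce, tagged)
print(recovered)   # b'FET /index.html': only the flipped bit differs
```

   Because the keystream does not depend on the ciphertext, a colluding node
   can undo the flip just as easily, which is exactly what linking the nonce
   to the ciphertext is meant to prevent.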

   Let us consider the following 3-hop scenario: the entry and end
   nodes are malicious and colluding and the middle node is honest.

5.1.1. forward direction

   Suppose that node I tags the ciphertext part of the message
   (C'_{I+1} != C_{I+1}) then forwards it to the next node (I+1). As
   per Section 3.1.2, Node I+1 digests C'_{I+1} to generate T_{I+1}
   and N_{I+2}. Since C'_{I+1} is different from what it should be, so
   are the resulting T_{I+1} and N_{I+2}. Hence, decrypting C'_{I+1}
   using these values results in a random string for C_{I+2}. Since
   C_{I+2} is now just a random string, it is decrypted into a
   random string and cannot be authenticated. Furthermore, since
   C'_{I+1} is different from what it should be, Tf'_{I+1}
   (i.e., the running digest of the middle node) is now out of sync
   with that of the OP, which means that all future cells sent through
   this node will decrypt into garbage (random strings).

   Likewise, suppose that instead of tagging the ciphertext, Node I
   tags the encrypted nonce N'_{I+1} != N_{I+1}. Now, when Node
   I+1 digests the payload the tweak T_{I+1} is fine, but using it
   to decrypt N'_{I+1} again results in a random nonce for
   N_{I+2}. This random nonce is used to decrypt C_{I+1} into a
   random C'_{I+2} which cannot be authenticated by the end node. Since
   C_{I+2} is a random string, the running digest of the end node is
   now out of sync with that of OP, which prevents the end node from
   decrypting further cells.

5.1.2. Backward direction

   In the backward direction the tagging is done by Node I+2 and the
   untagging by Node I. Suppose first that Node I+2 tags the
   ciphertext C_{I+2} and sends it to Node I+1. As per Section
   3.2.1, Node I+1 first digests C_{I+2} and uses the resulting
   T_{I+1} to generate a nonce N_{I+1}. From this it is clear that
   any change introduced by Node I+2 influences the entire payload
   and cannot be removed by Node I.

   Unlike in Section 5.1.1., the cell is blindly delivered by Node I
   to the OP which decrypts it. However, since the payload leaving
   the end node was modified, the message cannot be authenticated by
   the OP which can be trusted to tear down the circuit.

   Suppose now that tagging is done by Node I+2 to the nonce part of
   the payload, i.e., N_{I+2}. Since this value is encrypted by Node
   I+1 to generate its own nonce N_{I+1}, again, a random nonce is
   used which affects the entire keystream of CTR-mode. The cell
   again cannot be authenticated by the OP and the circuit is torn
   down.

   We note that the end node can modify the plain message before
   ever encrypting it and this cannot be discovered by the Tor
   protocol. This vulnerability is outside the scope of this
   proposal and users should always use TLS to make sure that their
   application data is encrypted before it enters the Tor network.

5.2. End-to-end authentication

   Similar to the old protocol, this proposal only offers end-to-end
   authentication rather than per-hop authentication. However,
   unlike the old protocol, the ADL-construction is non-malleable
   and hence, once a non-authentic message was processed by an
   honest node supporting the new protocol, it is effectively
   destroyed for all nodes further down the circuit. This is because
   the nonce used to de/encrypt all messages is linked to (a digest
   of) the payload data.

   As a result, while honest nodes cannot detect non-authentic
   messages, such nodes still destroy the message thus invalidating
   its authentication tag when it is checked by edge nodes. As a
   result, security against crypto-tagging attacks is ensured as
   long as an honest node supporting the new protocol processes the
   message between two dishonest ones.

5.3. The running digest

   Unlike the old protocol, the running digest is now computed as
   the output of a GHASH call instead of a hash function call
   (SHA256). Since GHASH does not provide the same type of security
   guarantees as SHA256, it is worth discussing why security is not
   lost from computing the running digest differently.

   The running digest is used to ensure that if the same payload is
   encrypted twice, then the resulting ciphertext does not remain
   the same. Therefore, all that is needed is that the digest should
   repeat with low probability. GHASH is a universal hash function,
   hence it gives such a guarantee assuming its key is chosen
   uniformly at random.

6. Forward secrecy

   Inspired by the approach of Proposal 308, a small modification
   to this proposal makes it forward secure. The core idea is to
   replace the encryption key Kf_n after de/encrypting the cell.
   As an added benefit, this would allow the authentication
   layer to be kept stateless (i.e., without keeping a running
   digest for this layer).

   Below we present the required changes to the sections above.

6.1. Routing from the Origin (replacing 3.1.1 above)

   When an OP sends a relay cell, they prepare the
   cell as follows:

        The OP prepares the authentication part of the message:

                C_{n+1} = M
                T_{n+1} = Digest(Khf_n,C_{n+1})
                N_{n+1} = T_{n+1} ^ E(Ktf_n,T_{n+1} ^ 0)

        Then, the OP prepares the multi-layered encryption:

                For the final layer n:
                        (C_n,K'f_n) = Encrypt(Kf_n,N_{n+1},C_{n+1}||0||0) (*)
                        T_n = Digest(Khf_n,Tf'_n||C_n)
                        N_n = T_n ^ E(Ktf_n,T_n ^ N_{n+1})
                        Tf'_n = T_n
                        Kf_n = K'f_n

                (*) CTR mode is used to generate two additional blocks. This
                    256-bit value is denoted K'f_n and is used in subsequent
                    steps to replace the encryption key of this layer.
                    To achieve forward secrecy it is important that the
                    obsolete Kf_n is erased in a non-recoverable way.

                For layer I=(n-1)...1:
                        C_I = Encrypt(Kf_I,N_{I+1},C_{I+1})
                        T_I = Digest(Khf_I,Tf'_I||C_I)
                        N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
                        Tf'_I = T_I

        The OP sends C_1 and N_1 to node 1.

   Alternatively, if we want all nodes to use the same functionality,
   the OP prepares the cell as follows:

                For layer I=n...1:
                        (C_I,K'f_I) = Encrypt(Kf_I,N_{I+1},C_{I+1}||0||0) (*)
                        T_I = Digest(Khf_I,Tf'_I||C_I)
                        N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
                        Tf'_I = T_I
                        Kf_I = K'f_I

                (*) CTR mode is used to generate two additional blocks. This
                    256-bit value is denoted K'f_I and is used in subsequent
                    steps to replace the encryption key of this layer.
                    To achieve forward secrecy it is important that the
                    obsolete Kf_I is erased in a non-recoverable way.

   This scheme offers forward secrecy in all levels of the circuit.
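
   The key update marked (*) can be sketched as follows.  A hash-derived
   keystream stands in for AES in counter mode (an assumption made so that the
   example needs only the standard library), and the nonce is passed in
   explicitly rather than derived from a digest.  Appending 32 zero bytes
   yields two extra 16-byte keystream blocks, which become the next key.

```python
import hashlib

KEY_LEN = 32   # two 16-byte blocks, i.e. one 256-bit key


def stream(key, nonce, data):
    """Toy stand-in for AES-CTR: XOR data with a hash-derived keystream."""
    ks = b"".join(hashlib.sha256(key + nonce + bytes([c])).digest()
                  for c in range(len(data) // 32 + 1))
    return bytes(d ^ k for d, k in zip(data, ks))


def crypt_and_ratchet(key, nonce, data):
    """Return (output, next_key) for Encrypt/Decrypt(key, nonce, data||0||0)."""
    out = stream(key, nonce, data + bytes(KEY_LEN))
    return out[:-KEY_LEN], out[-KEY_LEN:]


sender_key = receiver_key = bytes(KEY_LEN)
for i in range(3):
    nonce = bytes([i]) * 16
    cell, sender_key = crypt_and_ratchet(sender_key, nonce, b"cell %d" % i)
    plain, receiver_key = crypt_and_ratchet(receiver_key, nonce, cell)
    print(plain, sender_key == receiver_key)
```

   Both sides derive the same next key because the extra blocks are pure
   keystream.  A real implementation would also have to erase the old key,
   which this sketch does not attempt.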

6.2. Relaying Forward at Onion Routers (replacing 3.1.2 above)

   When a forward relay cell is received by OR I, it decrypts the
   payload with the stream cipher, as follows:

        'Forward' relay cell:

                T_I = Digest(Khf_I,Tf'_I||C_I)
                N_{I+1} = T_I ^ D(Ktf_I,T_I ^ N_I)
                C_{I+1} = Decrypt(Kf_I,N_{I+1},C_I||0||0)
                Tf'_I = T_I

   The OR then decides whether it recognizes the relay cell as described below.
   Depending on the choice of scheme from 6.1 the OR uses the last two blocks
   of C_{I+1} to update the encryption key or discards them.

   If the cell is recognized the OR also processes the contents of the relay
   cell. Otherwise, it passes C_{I+1}||N_{I+1} along the circuit if the circuit
   continues.

   For more information about recognizing and authenticating relay cells,
   see Section 6.5 below.

6.3. Relaying Backward at Onion Routers (replacing 3.2.1 above)

   When an edge node receives a message M to be routed back to the
   origin, it encrypts it as follows:

                T_n = Digest(Khb_n,Tb'_n||M)
                N_n = T_n ^ E(Ktb_n,T_n ^ 0)
                (C_n,K'b_n) = Encrypt(Kb_n,N_n,M||0||0) (*)
                Tb'_n = T_n
                Kb_n = K'b_n

                (*) CTR mode is used to generate two additional blocks. This
                    256-bit value is denoted K'b_n and will be used in
                    subsequent steps to replace the encryption key of this
                    layer. To achieve forward secrecy it is important that
                    the obsolete Kb_n is erased in a non-recoverable way.

   Once encrypted, the edge node sends C_n and N_n along the circuit towards
   the OP. When a backward relay cell is received by OR_I (I<n), it encrypts
   the payload with the stream cipher, as follows:

        'Backward' relay cell:

                T_I = Digest(Khb_I,Tb'_I||C_{I+1})
                N_I = T_I ^ E(Ktb_I,T_I ^ N_{I+1})
                C_I = Encrypt(Kb_I,N_I,C_{I+1})
                Tb'_I = T_I

   Each node passes C_I and N_I along the circuit towards the OP.

   If forward security is desired for all layers in the circuit, all ORs
   encrypt as follows:

                T_I = Digest(Khb_I,Tb'_I||C_{I+1})
                N_I = T_I ^ E(Ktb_I,T_I ^ N_{I+1})
                (C_I,K'b_I) = Encrypt(Kb_I,N_I,C_{I+1}||0||0)
                Tb'_I = T_I
                Kb_I = K'b_I

   with C_{n+1} = M and N_{n+1} = 0.


6.4. Routing to the Origin (replacing 3.2.2 above)

   When a relay cell arrives at an OP, the OP decrypts the payload
   with the stream cipher as follows:

        OP receives relay cell from node 1:

                For I=1...n, where n is the end node on the circuit:
                        C_{I+1} = Decrypt(Kb_I,N_I,C_I)
                        T_I = Digest(Khb_I,Tb'_I||C_{I+1})
                        N_{I+1} = T_I ^ D(Ktb_I,T_I ^ N_I)
                        Tb'_I = T_I

                The OP then updates the encryption keys according to the
                strategy chosen for 6.3.

                If the payload is recognized (see Section 4.1),
                then:

                       The sending node is I. Process the payload!


6.5. Recognizing and authenticating a relay cell (replacing 4.1.1 above):

   Authentication in the forward direction is done as follows:

                T_{n+1} = Digest(Khf_n,C_{n+1})
                Tag = T_{n+1} ^ D(Ktf_n,T_{n+1} ^ N_{n+1})

   The message is recognized and authenticated
   (i.e., M = C_{n+1}) if and only if Tag = 0.

   No changes are required to the authentication process when the relay
   cell is sent backwards.
Filename: 296-expose-bandwidth-files.txt
Title: Have Directory Authorities expose raw bandwidth list files
Author: Tom Ritter
Created: 11-December-2017
Status: Closed
Ticket: https://trac.torproject.org/projects/tor/ticket/21377
Implemented-In: 0.4.0.1-alpha

1. Introduction

Bandwidth Authorities (bwauths) perform scanning of the Tor Network
and calculate observed bandwidths for each relay. They produce a bandwidth
list file that is given to a Directory Authority. The Directory
Authority uses the bw (bandwidth) value from this file in its vote file
denoting its view of the bandwidth of the relay.

After collecting all of the votes from other Authorities, a consensus
is calculated, and the consensus's view of a relay's speed is
determined by choosing the low-median value of all the authorities'
values for each relay.
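
As a small illustration of the low-median rule, here is a sketch in
Python. The bandwidth figures are made up, and `statistics.median_low`
from the standard library merely stands in for whatever the authorities'
own code does.

```python
from statistics import median_low


def consensus_bandwidth(votes):
    """Return the low-median of the per-authority values for one relay.

    With an even number of votes, the lower of the two middle values
    is chosen rather than their average.
    """
    return median_low(votes)


# Hypothetical bw values voted by four authorities for the same relay.
example_votes = [760, 1200, 980, 1500]
print(consensus_bandwidth(example_votes))
```

With four votes, the two middle values are 980 and 1200, and the
low-median picks 980.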

Only a single metric from the bandwidth list file is exposed by a
Directory Authority's vote; however, the original file contains
considerably more diagnostic information about how the bwauth arrived
at that measurement for that relay.

For more details, see the bandwidth list file specification in
bandwidth-file-spec.txt.

2. Motivation

The bandwidth list file contains more information than is exposed in the
overall vote file. This information is useful to debug:
  * anomalies in relays' utilization,
  * suspected bugs in the (decrepit) bwauth code, and
  * the transition to a replacement bwauth implementation.

Currently, all bwauths expose the bandwidth list file through various
(non-standard) means, and that file is downloaded (hourly) by a single
person (as long as his home internet connection and home server are
working) and archived (with a small amount of robustness).

It would be preferable to have this exposed in a standard manner.
Doing so would no longer require bwauths to run HTTP servers to expose
the file, no longer require them to take additional manual steps to
provide it, and would enable public consumption by any interested
parties.  We hope that Collector will begin archiving the files.

3. Specification

An authority SHOULD publish the bandwidth list file used to calculate its
next vote. It SHOULD make the bandwidth list file available whenever the
corresponding vote is available, at the corresponding URL. (See
dir-spec for the exact details.)

It SHOULD make the file available at
  http://<hostname>/tor/status-vote/next/bandwidth.z
  http://<hostname>/tor/status-vote/current/bandwidth.z

It MUST NOT attempt to send its bandwidth list file in an HTTP POST to
other authorities and it SHOULD NOT make bandwidth list files from other
authorities available.

Clients interested in consuming these documents should download them from
each authority's:
  * next URL when votes are created. (In the public Tor network, this is
    after HH:50 during normal operation, and after HH:20 during a
    consensus failure.)
  * current URL after the valid-after time in the consensus.
    (After HH:00, and HH:30 during consensus failure.)
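
A consumer could build the two URLs for each authority along the
following lines. This is only a sketch: the function name is made up for
illustration, and "authority.example:80" is a placeholder rather than a
real directory authority.

```python
def bandwidth_file_urls(hostname):
    """Build the 'next' and 'current' bandwidth list file URLs for one authority."""
    base = "http://{}/tor/status-vote".format(hostname)
    return {
        "next": base + "/next/bandwidth.z",
        "current": base + "/current/bandwidth.z",
    }


# Placeholder host; substitute each authority's directory address.
for which, url in bandwidth_file_urls("authority.example:80").items():
    print(which, url)
```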

4. Security Implications

The raw bandwidth list file does not [really: is not believed to] expose
any sensitive information.  All authorities currently make this
document public already; an example is at
  https://bwauth.ritter.vg/bwauth/bwscan.V3BandwidthsFile

5. Compatibility

Exposing the document presents no compatibility concerns.

Applications that parse the document should follow the bandwidth list file
specification in bandwidth-file-spec.txt.
If a new bandwidth list format version is added, the applications MAY need
to upgrade to that version.
Filename: 297-safer-protover-shutdowns.txt
Title: Relaxing the protover-based shutdown rules
Author: Nick Mathewson
Created: 19-Sep-2018
Status: Closed
Target: 0.3.5.x
Implemented-In: 0.4.0.x

IMPLEMENTATION NOTE:

   We went with the proposed change in section 2.  The "release date" is
   now updated by the "make update-versions" target whenever the version
   number is incremented.  Maintainers may also manually set the "release
   date" to the future.

1. Introduction

   In proposal 264 (now implemented) we introduced the subprotocol
   versioning mechanism to better handle forward-compatibility in
   the Tor network.  Included was a mechanism for safely disabling
   obsolete versions of Tor that no longer ran any supported
   protocols.  If a version of Tor receives a consensus that lists
   as "required" any protocol version that it cannot speak, Tor will
   not start--even if the consensus is in its cache.

   The intended use case for this is that once some protocol has
   been provided by all supported versions for a long time, the
   authorities can mark it as "required".  We had thought about the
   "adding a requirement" case mostly.

   This past weekend, though, we found an unwanted side-effect: it
   is hard to safely *un*-require a currently required protocol.

   Here's what happened:

      - Long ago, we created the LinkAuth=1 protocol, which required
        direct access to the ClientRandom and ServerRandom fields.
        (0.2.3.6-alpha)

      - Later, once we implemented Ed25519 identity keys, we added
        an improved LinkAuth=3 protocol, which uses the RFC5705 "key
        export" mechanism. (0.3.0.1-alpha)

      - When we added the subprotocols mechanism, we listed
        LinkAuth=1 as required. (backported to 0.2.9.x)

      - While porting Tor to NSS, we found that LinkAuth=1 couldn't
        be supported, because NSS wisely declines to expose the TLS
        fields it uses.  So we removed "LinkAuth=1" from the
        required list (backported to 0.3.2.x), and got a bunch of
        authorities to upgrade.

      - In 0.3.5.1-alpha, once enough authorities had upgraded, we
        removed "LinkAuth=1" from the supported subprotocols list
        when Tor is running with NSS. [*]

      - We found, however, that this change could cause a bug when
        Tor+NSS started with a cached consensus that was created before
        LinkAuth=1 was removed from the requirements.  Tor would
        decline to start, because the (old) consensus told it that
        LinkAuth=1 was required.

   This proposal discusses two alternatives for making it safe to
   remove required subprotocol versions in the future.


   [*] There was actually a bug here where OpenSSL removed LinkAuth=1
       too, but that's mostly beside the point for this timeline, other
       than the fact it would have made things waaay worse if people
       hadn't caught it.

2. Recommended change: consider the consensus date.

   I propose that when deciding whether to shut down because of
   subprotocol requirements, a Tor implementation should only shut
   down if the consensus is dated to some time after the
   implementation's release date.

   With this change, an old cached consensus cannot cause the
   implementation to shut down, but a newer one can.  This makes it
   safe to put out a release that does not support a formerly
   required protocol, so long as the authorities have upgraded to
   stop requiring that protocol.

   (It is safe to use the *scheduled* release date for the
   implementation, plus a few months -- just so long as we don't
   plan to start requiring a subprotocol that's not supported by the
   latest version of Tor.)
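
   The rule can be sketched as follows.  This is an illustration rather
   than the actual Tor code: the function and parameter names are
   hypothetical, subprotocols are modelled as (name, version) pairs, and
   I assume that "dated" refers to a single timestamp taken from the
   consensus.

```python
from datetime import datetime, timezone


def should_shut_down(consensus_time, release_date, required, supported):
    """Decide whether missing required subprotocols should stop us.

    consensus_time, release_date: timezone-aware datetimes.
    required, supported: sets of (protocol name, version) pairs.
    """
    missing = required - supported
    if not missing:
        return False
    # A consensus older than our release date (for example a cached one)
    # must not cause a shutdown; only a newer one may.
    return consensus_time > release_date


release = datetime(2018, 10, 1, tzinfo=timezone.utc)
required = {("LinkAuth", 1)}
supported = {("LinkAuth", 3)}

old_cached = datetime(2018, 8, 1, tzinfo=timezone.utc)
fresh = datetime(2018, 12, 1, tzinfo=timezone.utc)
print(should_shut_down(old_cached, release, required, supported))
print(should_shut_down(fresh, release, required, supported))
```

   In this example the stale consensus is ignored, while a consensus
   dated after the release date that still requires LinkAuth=1 does
   trigger the shutdown.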

3. Not-recommended change: ignore the cached consensus.

   Was it a mistake to have Tor consider a cached consensus when
   deciding whether to shut down?

   The rationale for considering the cached consensus was that when
   a Tor implementation is obsolete, we don't want it hammering on
   the network, probing for new consensuses, and possibly
   reconnecting aggressively as its handshakes fail.  That still
   seems compelling to me, though it's possible that if we find some
   problem with the methodology from section 2 above, we'll need to
   find some other way to achieve this goal.





Filename: 298-canonical-families.txt
Title: Putting family lines in canonical form
Author: Nick Mathewson
Created: 31-Oct-2018
Status: Closed
Target: 0.3.6.x
Implemented-In: 0.4.0.1-alpha

1. Introduction

   With ticket #27359, we begin encoding microdescriptor families in
   memory in a reference-counted form, so that if 10 relays all list the
   same family, their family only needs to be stored once.  For large
   families, this has the potential to save a lot of RAM -- but only if
   the families are the same across those relays.

   Right now, family lines are often encoded in different ways, and
   placed into consensuses and microdescriptor lines in whatever format
   the relay reported.

   This proposal describes an algorithm that authorities should use
   while voting to place families into a canonical format.

   This algorithm is forward-compatible, so that new family line formats
   can be supported in the future.

2. The canonicalizing algorithm

   To make the family listed in a router descriptor canonical:

      For all entries of the form $hexid=name or $hexid~name, remove
      the =name or ~name portion.

      Remove all entries of the form $hexid, where hexid is not 40
      hexadecimal characters long.

      If an entry is a valid nickname, put it into lower case.

      If an entry is a valid $hexid, put it into upper case.

      If there are any entries, add a single $hexid entry for the relay
      in question, so that it is a member of its own family.

      Sort all entries in lexical order.

      Remove duplicate entries.

   Note that if an entry is not of the form "nickname", "$hexid",
   "$hexid=nickname" or "$hexid~nickname", then it will be unchanged:
   this is what makes the algorithm forward-compatible.
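
   The steps above can be sketched in Python as follows.  Several details
   are assumptions of mine rather than part of this proposal: the regular
   expressions for nicknames (1 to 19 alphanumeric characters) and for
   "$hexid" entries, the choice to add the relay's own identity only if
   entries remain after the removal step, and the use of Python's default
   string ordering as "lexical order".

```python
import re

# Assumed syntax for the recognized entry forms.
NICKNAME = re.compile(r"[A-Za-z0-9]{1,19}")
HEXID = re.compile(r"\$[0-9A-Fa-f]*")
HEXID_NAMED = re.compile(r"(\$[0-9A-Fa-f]*)[=~][A-Za-z0-9]{1,19}")


def canonicalize_family(entries, own_hexid):
    """Return the canonical form of a family list for the relay own_hexid."""
    out = []
    for entry in entries:
        named = HEXID_NAMED.fullmatch(entry)
        if named:
            entry = named.group(1)      # drop the "=name" or "~name" part
        if HEXID.fullmatch(entry):
            if len(entry) != 41:        # "$" plus 40 hexadecimal characters
                continue                # malformed $hexid: remove it
            entry = entry.upper()
        elif NICKNAME.fullmatch(entry):
            entry = entry.lower()
        # Anything else is unrecognized and passes through unchanged.
        out.append(entry)
    if out:
        out.append("$" + own_hexid.upper())
    # Sorting a set both orders the entries and removes duplicates.
    return sorted(set(out))


family = ["$" + "ab" * 20 + "=Foo", "Bar", "$1234", "weird!entry"]
print(canonicalize_family(family, "cd" * 20))
```

   Here the "=Foo" suffix is stripped, "$1234" is dropped, "Bar" becomes
   "bar", and "weird!entry" is kept as-is because it matches none of the
   known forms.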

3. When to apply this algorithm

   We allocate a new consensus method number.  When building a consensus
   using this method or later, before encoding a family entry into a
   microdescriptor, the authorities should apply the algorithm above.

   Relays MAY apply this algorithm to their own families before
   publishing them.  Unlike authorities, relays SHOULD warn about
   unrecognized family items.



Filename: 299-ip-failure-count.txt
Title: Preferring IPv4 or IPv6 based on IP Version Failure Count
Author: Neel Chauhan
Created: 25-Jan-2019
Status: Superseded
Superseded-by: 306
Ticket: https://trac.torproject.org/projects/tor/ticket/27491

1. Introduction

   As IPv4 address space becomes scarce, ISPs and organizations will deploy
   IPv6 in their networks. Right now, Tor clients connect to guards using
   IPv4 connectivity by default.

   When networks first transition to IPv6, both IPv4 and IPv6 will be enabled
   on most networks in a so-called "dual-stack" configuration. This is to not
   break existing IPv4-only applications while enabling IPv6 connectivity.
   However, IPv6 connectivity may be unreliable and clients should be able
   to connect to the guard using the most reliable technology, whether IPv4
   or IPv6.

   In ticket #27490, we introduced the option ClientAutoIPv6ORPort which adds
   preliminary "happy eyeballs" support. If set, this lets a client randomly
   choose between IPv4 or IPv6. However, this random decision does not take
   into account unreliable connectivity or network failures of an IP family.
   A successful Tor implementation of the happy eyeballs algorithm requires
   that unreliable connectivity on IPv4 and IPv6 are taken into consideration.

   This proposal describes an algorithm to take into account network failures
   in the random decision used for choosing an IP family and the data fields
   used by the algorithm.

2. Options To Enable The Failure Counter

   To enable the failure counter, we will add flags to ClientAutoIPv6ORPort.
   The new format for ClientAutoIPv6ORPort is:

      ClientAutoIPv6ORPort 0|1 [flags]

   The first argument is to enable the automatic selection between IPv4 and
   IPv6 if it is 1. The second argument is a list of optional flags.

   The only flag so far is "TrackFailures", which enables the tracking of
   failures to make a better decision when selecting between IPv4 and IPv6.
   The tracking of failures will be described in the rest of this proposal.

   However, we should be open to more flags from future proposals as they
   are written and implemented.

3. Failure Counter Design

   I propose that the failure counter uses the following fields:

      * IPv4 failure points

      * IPv6 failure points

   These entries will exist as internal counters for the current session, and
   a calculated value from the previous session in the statefile. 

   These values will be stored as 32-bit unsigned integers for the current
   session and in the statefile.

   When a new session is loaded, we will load the failure count from the
   statefile, and when a session is closed, the failure counts from the current
   session will be stored in the statefile. 

4. Failure Probability Calculation

   The failure count of one IP version will increase the probability of the
   other IP version. For instance, a failure of IPv4 will increase the IPv6
   probability, and vice versa.

   When the IP version is being chosen, I propose that these values will be
   included in the guard selection code:

      * IPv4 failure points

      * IPv6 failure points

      * Total failure points

   These values will be stored as 32-bit unsigned integers.

   A generic failure of an IP version will add one point to the failure point
   count values of the particular IP version which failed.

   A failure of an IP version from a "no route" error which happens when
   connections automatically fail will be counted as two failure points
   for the automatically failed version.

   The failure points for each of IPv4 and IPv6 are the sum of the value in
   the state file plus the current session's failure value. The total failure
   points are the sum of the IPv4 and IPv6 failure points, and are updated
   whenever the failure point count of an IP version is updated.

   The probability of a particular IP version is the failure points of the
   other version divided by the total number of failure points, multiplied
   by 4 and stored as an integer. We will call this value the summarized
   failure point value (SFPV). The reason for this summarization is to
   emulate a probability in 1/4 intervals by the random number generator.

   In the random number generator, we will choose a random number between 0
   and 4. If the random number is less than the IPv6 SFPV, we will choose
   IPv4. If it is equal or greater, we will choose IPv6.

   If the probability is 0/4 with a SFPV value of 0, it will be rounded to
   1/4 with a SFPV of 1. Also, if the probability is 4/4 with a SFPV of 4,
   it will be rounded to 3/4 with a SFPV of 3. The reason for this is to
   accommodate mobile clients which could change networks at any time (e.g.
   WiFi to cellular) which may be more or less reliable in terms of a
   particular IP family when compared to the previous network of the client.
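
   As a sketch of the SFPV arithmetic only (not of the random selection
   step), the calculation might look like this in Python.  The function
   name is hypothetical, and I assume that "stored as an integer" means
   truncating integer division.

```python
def sfpv(other_version_fp, total_fp):
    """Summarized failure point value for one IP version, clamped to 1..3.

    other_version_fp: failure points of the *other* IP version.
    total_fp: IPv4 failure points plus IPv6 failure points.
    """
    if total_fp == 0:
        raise ValueError("no failure points yet; use the initial calculation")
    raw = (4 * other_version_fp) // total_fp   # probability in quarters
    # 0/4 is rounded up to 1/4 and 4/4 is rounded down to 3/4.
    return max(1, min(3, raw))


ipv4_fp, ipv6_fp = 6, 2
total = ipv4_fp + ipv6_fp
print("IPv4 SFPV:", sfpv(ipv6_fp, total))
print("IPv6 SFPV:", sfpv(ipv4_fp, total))
```

   With six IPv4 failure points and two IPv6 failure points, IPv4 gets
   an SFPV of 1 and IPv6 gets an SFPV of 3.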

5. Initial Failure Point Calculation

   When a client starts without failure points or if the FP value drops to 0,
   we need a SFPV value to start with. The Initial SFPV value will be counted
   based on whether the client is using a bridge or not and if the relays in
   the bridge configuration or consensus have IPv6.

   For clients connecting directly to Tor, we will:

      * During Bootstrap: use the number of IPv4 and IPv6 capable fallback
        directory mirrors during bootstrap.

      * After the initial consensus is received: use the number of IPv4 and
        IPv6 capable guards in the consensus.

   The reason why the consensus will be used to calculate the initial failure
   point value is because using the number of guards would bias the SFPV value
   with whatever's dominant on the network rather than what works on the
   client.

   For clients connecting through bridges, we will use the number of bridges
   configured and the IP versions supported.

   The initial value of the failure points in the scenarios described in this
   section would be:

      * IPv4 Failure Points: Count the number of IPv6-capable relays

      * IPv6 Failure Points: Count the number of IPv4-capable relays

   If the consensus or bridge configuration changes during a session, we should
   not update the failure point counters to generate a SFPV.

   If we are starting a new session, we should use the existing failure points
   to generate a SFPV unless the counts for IPv4 or IPv6 are zero.

6. Forgetting Old Sessions

   We should be able to forget old failures as clients could change networks.
   For instance, a mobile phone could switch between WiFi and cellular. Keeping
   an exact failure history would have privacy implications, so we should store
   an approximate history.

   One way we could forget old sessions is by halving all the failure point
   (FP) values before adding when:

      * One or more failure point values are a multiple of a random number
        between 1 and 5

      * One or more failure point values are greater than or equal to 100

   The reason for halving the values at regular intervals is to forget old
   sessions while keeping an approximate history. We halve all FP values so
   that one IP version doesn't dominate the failure count if the other
   is halved. This keeps an approximate scale of the failures on a client.

   The reason for halving at a multiple of a random number instead of a fixed
   interval is so we can halve regularly while not making it too predictable.
   This prevents a situation where we would be halving too often to keep an
   approximate failure history.

   If we halve, we first halve all FP values and then add the FP value for the
   failed IP version, to account for the new failure. If we do not halve, we
   just add the FP.
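
   A sketch of this update rule in Python follows.  The names are
   hypothetical.  I assume that the random number is drawn once per
   update, that both ends of "between 1 and 5" are included, and that a
   value of zero does not count as a multiple.

```python
import random


def add_failure(fp, failed, points=1, rng=random):
    """Return updated failure points after one failure of `failed`.

    fp: dict with the keys "ipv4" and "ipv6".
    failed: the key of the IP version that failed.
    points: 1 for a generic failure, 2 for a "no route" failure.
    """
    divisor = rng.randint(1, 5)
    values = fp.values()
    halve = any(v >= 100 for v in values) or any(
        v > 0 and v % divisor == 0 for v in values
    )
    # Halve every counter together so that neither version dominates.
    updated = {k: (v // 2 if halve else v) for k, v in fp.items()}
    updated[failed] += points       # the new failure is added after halving
    return updated


print(add_failure({"ipv4": 100, "ipv6": 3}, "ipv6", points=2))
```

   In the example call, the IPv4 counter has reached 100, so both
   counters are halved regardless of the random draw before the two
   "no route" points are added to IPv6.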

   If the FP value for one IP version goes down to zero, we will re-calculate
   the SFPV for that version using the methods described in Section 4.

7. Separate Concurrent Connection Limits

   Right now, there is a limit of three concurrent connections from a client
   at any given time. This limit includes both IPv4 and IPv6 connections.
   This is to prevent denial of service attacks. I propose that a separate
   connection limit is used for IPv4 and IPv6. This means we can have three
   concurrent IPv4 connections and three concurrent IPv6 connections at the
   same time.

   Having separate connection limits allows us to deal with networks dropping
   packets for a particular IP family while still preventing potential denial
   of service attacks.

8. Pathbias and Failure Probability

   If ClientAutoIPv6ORPort is in use, and pathbias is triggered, we should
   ignore "no route" warnings. The reason for this is because we would be
   adding two failure points for the failed version, as described in Section 4
   of this proposal. Adding two failure points would make us more likely to
   prefer the competing IP family over the failed one than adding a single
   failure point on a normal failure.

9. Counting Successful Connections

   If a connection to a particular IP version is successful, we should use
   it. This ensures that clients have a reliable connection to Tor. Accounting
   for successful connections can be done by adding one failure point to the
   competing IP version of the successful connection. For instance, if we have
   a successful IPv6 connection, we add one IPv4 failure point.

   Why use failure points for successful connections? This reduces the need for
   separate counters for successes and allows for code reuse. Why add to the
   competing version's failure point? Similar to how we should prefer IPv4 if
   IPv6 fails, we should also prefer IPv4 if it is successful. We should also
   prefer IPv6 if it is successful.

   Even on adding successes, we will still halve the failure counters as
   described in Section 6.

10. Acknowledgements

   Thank you teor for aiding me with the implementation of Happy Eyeballs in
   Tor. This would not have been possible if it weren't for you.
Filename: 300-walking-onions.txt
Title: Walking Onions: Scaling and Saving Bandwidth
Author: Nick Mathewson
Created: 5-Feb-2019
Status: Informational


0. Status

   This proposal describes a mechanism called "Walking Onions" for
   scaling the Tor network and reducing the amount of client bandwidth
   used to maintain a client's view of the Tor network.

   This is a draft proposal; there are problems left to be solved and
   questions left to be answered.  Proposal 323 tries to fill in all the
   gaps.

1. Introduction

   In the current Tor network design, we assume that every client has a
   complete view of all the relays in the network.  To achieve this,
   clients download consensus directories at regular intervals, and
   download descriptors for every relay listed in the directory.

   The substitution of microdescriptors for regular descriptors
   (proposal 158) and the use of consensus diffs (proposal 140) have
   lowered the bytes that clients must dedicate to directory operations.
   But we still face the problem that, if we force each client to know
   about every relay in the network, each client's directory traffic
   will grow linearly with the number of relays in the network.

   Another drawback in our current system is that client directory
   traffic is front-loaded: clients need to fetch an entire directory
   before they begin building circuits.  This places extra delays on
   clients, and extra load on the network.

   To anonymize the world, we will need to scale to a much larger number
   of relays and clients: requiring clients to know about every relay in
   the set simply won't scale, and requiring every new client to download
   a large document is also problematic.

   There are obvious responses here, and some other anonymity tools have
   taken them.  It's possible to have a client only use a fraction of
   the relays in a network--but doing so opens the client to _epistemic
   attacks_, in which the difference in clients' views of the
   network is used to partition their traffic.  It's also possible to
   move the problem of selecting relays from the client to the relays
   themselves, and let each relay select the next relay in turn--but
   this choice opens the client to _route capture attacks_, in which a
   malicious relay selects only other malicious relays.

   In this proposal, I'll describe a design for eliminating up-front
   client directory downloads.  Clients still choose relays at random,
   but without ever having to hold a list of all the relays. This design
   does not require clients to trust relays any more than they do today,
   or open clients to epistemic attacks.

   I hope to maintain feature parity with the current Tor design; I'll
   list the places in which I haven't figured out how to do so yet.

   I'm naming this design "walking onions".  The walking onion (Allium x
   proliferum) reproduces by growing tiny little bulbs at the
   end of a long stalk.  When the stalk gets too top-heavy, it flops
   over, and the little bulbs start growing somewhere new.

   The rest of this document will run as follows.  In section 2, I'll
   explain the ideas behind the "walking onions" design, and how they
   can eliminate the need for regular directory downloads.  In section 3, I'll
   answer a number of follow-up questions that arise, and explain how to
   keep various features in Tor working.  Section 4 (not yet written)
   will elaborate all the details needed to turn this proposal into a
   concrete set of specification changes.

2. Overview

2.1. Recapping proposal 141

   Back in Proposal 141 ("Download server descriptors on demand"), Peter
   Palfrader proposed an idea for eliminating ahead-of-time descriptor
   downloads.  Instead of fetching all the descriptors in advance, a
   client would fetch the descriptor for each relay in its path right
   before extending the circuit to that relay.  For example, if a client
   has a circuit from A->B and wants to extend the circuit to C, the
   client asks B for C's descriptor, and then extends the circuit to C.

   (Note that the client needs to fetch the descriptor every time it
   extends the circuit, so that an observer can't tell whether the
   client already had the descriptor or not.)

   There are a couple of limitations for this design:
      * It still requires clients to download a consensus.
      * It introduces an extra round-trip to each hop of the circuit
        extension process.

   I'll show how to solve these problems in the two sections below.

2.2. An observation about the ntor handshake.

   I'll start with an observation about our current circuit extension
   handshake, ntor: it should not actually be necessary to know a
   relay's onion key before extending to it.

   Right now, the client sends:
         NODEID     (The relay's identity)
         KEYID      (The relay's public onion key)
         CLIENT_PK  (a diffie-hellman public key)

   and the relay responds with:
         SERVER_PK  (a diffie-hellman public key)
         AUTH       (a function of the relay's private keys and
                     *all* of the public keys.)

   Both parties generate shared symmetric keys from the same inputs
   that are used to create the AUTH value.

   The important insight here is that we could easily change
   this handshake so that the client sends only CLIENT_PK, and receives
   NODEID and KEYID as part of the response.

   In other words, the client needs to know the relay's onion key to
   _complete_ the handshake, but doesn't actually need to know the
   relay's onion key in order to _initiate_ the handshake.

   This is the insight that will let us save a round trip:  When the
   client goes to extend a circuit from A->B to C, it can send B a
   request to extend to C and retrieve C's descriptor in a single step.
   Specifically, the client sends only CLIENT_PK, and relay B can include C's
   keys as part of the EXTENDED cell.

2.3. Extending by certified index

   Now I'll explain how the client can avoid having to download a
   list of relays entirely.

   First, let's look at how a client chooses a random relay today.
   First, the client puts all of the relays in a list, and computes a
   weighted bandwidth for each one. For example, suppose the relay
   identities are R1, R2, R3, R4, and R5, and their bandwidth weights
   are 50, 40, 30, 20, and 10.  The client makes a table like this:

      Relay   Weight     Range of index values
      R1      50         0..49
      R2      40         50..89
      R3      30         90..119
      R4      20         120..139
      R5      10         140..149

   To choose a random relay, the client picks a random index value
   between 0 and 149 inclusive, and looks up the corresponding relay in
   the table.  For example, if the client's random number is 77, it will
   choose R2.  If its random number is 137, it chooses R4.
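
   The table lookup can be sketched in a few lines of Python.  The relay
   names and weights come from the example above; the function name is
   only illustrative, and real clients of course apply more elaborate
   weighting rules.

```python
from bisect import bisect_right
from itertools import accumulate

relays = ["R1", "R2", "R3", "R4", "R5"]
weights = [50, 40, 30, 20, 10]

# Exclusive upper bound of each relay's index range: [50, 90, 120, 140, 150]
upper_bounds = list(accumulate(weights))


def relay_for_index(index):
    """Map an index in 0..149 to the relay whose range contains it."""
    if not 0 <= index < upper_bounds[-1]:
        raise ValueError("index out of range")
    return relays[bisect_right(upper_bounds, index)]


print(relay_for_index(77))    # R2
print(relay_for_index(137))   # R4
```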

   The key observation for the "walking onions" design is that the
   client doesn't actually need to construct this table itself.
   Instead, we will have this table be constructed by the authorities
   and distributed to all the relays.

   Here's how it works: let's have the authorities make a new kind of
   consensus-like thing.  We'll call it an Efficient Network Directory
   with Individually Verifiable Entries, or "ENDIVE" for short.  This
   will differ from the client's index table above in two ways.  First,
   every entry in the ENDIVE is normalized so that the maximum index across
   all of the bandwidth weights is 2^32-1:

       Relay      Normalized weight    Range of index values
       R1         0x55555546           0x00000000..0x55555545
       R2         0x44444438           0x55555546..0x9999997d
       R3         0x3333332a           0x9999997e..0xcccccca7
       R4         0x2222221c           0xcccccca8..0xeeeeeec3
       R5         0x1111113c           0xeeeeeec4..0xffffffff

   Second, every entry in the ENDIVE is timestamped and signed by the
   authorities independently, so that when a client sees a line from the
   table above, it can verify that it came from an authentic ENDIVE.
   When a client has chosen a random index, one of these entries will
   prove to the client that a given relay corresponds to that index.
   Because of this property, we'll be calling these entries "Separable
   Network Index Proofs", or "SNIP"s for short.
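
   This proposal does not say how the normalization should round.  As an
   illustration, here is one scheme, which is an assumption on my part:
   scale every weight by floor(2^32 / total weight) and give the rounding
   remainder to the last entry.  It happens to reproduce the normalized
   table above.

```python
def normalize(weights, bits=32):
    """Scale integer weights so the index ranges exactly cover 0..2**bits - 1.

    Returns (scaled weights, list of inclusive (low, high) index ranges).
    """
    space = 2 ** bits
    scale = space // sum(weights)
    scaled = [w * scale for w in weights]
    # The last entry absorbs whatever is left over after rounding down.
    scaled[-1] = space - sum(scaled[:-1])
    ranges, low = [], 0
    for s in scaled:
        ranges.append((low, low + s - 1))
        low += s
    return scaled, ranges


scaled, ranges = normalize([50, 40, 30, 20, 10])
for name, s, (lo, hi) in zip(["R1", "R2", "R3", "R4", "R5"], scaled, ranges):
    print("%s  0x%08x  0x%08x..0x%08x" % (name, s, lo, hi))
```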

   For example, a single SNIP from the table above might consist of:
     * A range of times during which this SNIP is valid
     * R1's identity
     * R1's ntor onion key
     * R1's address
     * The index range 0x00000000..0x55555545
     * A signature of all of the above, by a number of authorities

   Let's put it together. Suppose that the client has a circuit from
   A->B, and it wants to extend to a random relay, chosen randomly
   weighted by bandwidth.

   1. The client picks a random index value between 0 and 2**32 - 1.  It
      sends that index to relay B in its EXTEND cell, along with a
      g^x value for the ntor handshake.

      Note: the client doesn't send an address or identity for the next
      relay, since it doesn't know what relay it has chosen!  (The
      combination of an index and a g^x value is what I'm calling a
      "walking onion".)

   2. Now, relay B looks up the index in its most recent ENDIVE, to
      learn which relay the client selected.

      (For example, suppose that the client's random index value is
      0x50000001.  This index value falls between 0x00000000 and
      0x55555546 in the table above, so the relay B sees that the client
      has chosen R1 as its next hop.)

   3. Relay B sends a create cell to R1 as usual.  When it gets a
      CREATED reply, it includes the authority-signed SNIP for
      R1 as part of the EXTENDED cell.

   4. As part of verifying the handshake, the client verifies that the
      SNIP was signed by enough authorities, that its timestamp
      is recent enough, and that it actually corresponds to the
      random index that the client selected.
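
   The client-side checks in step 4 might be sketched as below.  The
   `Snip` structure, its field names, and the threshold parameter are
   all hypothetical, and actual signature verification is abstracted
   away: the sketch assumes that some earlier step has already worked
   out which authorities' signatures on the SNIP are valid.

```python
from dataclasses import dataclass


@dataclass
class Snip:
    valid_after: int            # start of validity, seconds since the epoch
    valid_until: int            # end of validity, seconds since the epoch
    index_low: int              # first index covered by this SNIP
    index_high: int             # last index covered by this SNIP (inclusive)
    valid_signers: frozenset    # authorities whose signatures verified


def snip_acceptable(snip, chosen_index, now, trusted, threshold):
    """Check signatures, freshness, and index match for a returned SNIP."""
    enough_sigs = len(snip.valid_signers & trusted) >= threshold
    fresh = snip.valid_after <= now <= snip.valid_until
    matches = snip.index_low <= chosen_index <= snip.index_high
    return enough_sigs and fresh and matches


r1 = Snip(1000, 2000, 0x00000000, 0x55555545, frozenset({"auth1", "auth2", "auth3"}))
trusted = frozenset({"auth1", "auth2", "auth3", "auth4", "auth5"})
print(snip_acceptable(r1, 0x50000001, now=1500, trusted=trusted, threshold=3))
print(snip_acceptable(r1, 0x60000000, now=1500, trusted=trusted, threshold=3))
```

   The second call fails because the index 0x60000000 lies outside the
   range that the SNIP certifies, which is what stops a hostile relay
   from substituting a relay of its own choosing.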

   Notice the properties we have with this design:

       - Clients can extend circuits without having a list of all the
         relays.

       - Because the client's random index needs to match a routing
         entry signed by the authorities, the client is still selecting
         a relay randomly by weight.  A hostile relay cannot choose
         which relay to send the client.


   On a failure to extend, a relay should still report the routing entry
   for the other relay that it couldn't connect to.  As before, a client
   will start a new circuit if an attempt to extend a partially constructed
   circuit fails.


   We could achieve a reliability/security tradeoff by letting clients
   offer the relay a choice of two or more indices to extend to.
   This would help reliability, but give the relay more influence over
   the path.  We'd need to analyze this impact.


   In the next section, I'll discuss a bunch of details that we need to
   straighten out in order to make this design work.


3. Sorting out the details.

3.1. Will these routing entries fit in EXTEND2 and EXTENDED2 cells?

   The EXTEND2 cell is probably big enough for this design.  The random
   index that the client sends can be a new "link specifier" type,
   replacing the IP and identity link specifiers.

   The EXTENDED2 cell is likely to need to grow here.  We'll need to
   implement proposal 249 ("Allow CREATE cells with >505 bytes of
   handshake data") so that EXTEND2 and EXTENDED2 cells can be larger.

3.2. How should SNIPs be signed?

   We have a few options, and I'd like to look into the possibilities
   here more closely.

   The simplest possibility is to use **multiple signatures** on each
   SNIP, the way we do today for consensuses.  These signatures should
   be made using medium-term Ed25519 keys from the authorities.  At a
   cost of 64 bytes per signature, at 9 authorities, we would need 576
   bytes for each SNIP.  These signatures could be batch-verified to
   save time at the client side.  Since generating a signature takes
   around 20 usec on my mediocre laptop, authorities should be able to
   generate this many signatures fairly easily.

   Another possibility is to use a **threshold signature** on each SNIP,
   so that the authorities collaboratively generate a short signature
   that the clients can verify.  There are multiple threshold signature
   schemes that we could consider here, though I haven't yet found one
   that looks perfect.

   Another possibility is to organize the SNIPs in a **merkle tree
   with a signed root**.  For this design, clients could download the
   signed root periodically, and receive the hash-path from the signed
   root to the SNIP.  This design might help with
   certificate-transparency-style designs, and it would be necessary if we
   ever want to move to a postquantum signature algorithm that requires
   large signatures.
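
   As a rough illustration of the hash-path idea, the sketch below builds
   a small merkle tree over some SNIPs and checks one SNIP against the
   root.  Everything here is an assumption made for the example: the use
   of SHA-256, the one-byte leaf/node prefixes, the rule of duplicating
   the last node on odd-sized levels, and the function names.  Checking
   the authorities' signature on the root is not shown.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf_hash(snip: bytes) -> bytes:
    # Distinct prefixes keep leaves and interior nodes from colliding.
    return _h(b"\x00" + snip)

def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)

def build_tree(snips):
    """Return the tree as a list of levels, leaves first, root last."""
    level = [leaf_hash(s) for s in snips]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # duplicate last node (assumption)
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def hash_path(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify_snip(snip: bytes, path, root: bytes) -> bool:
    """Recompute the root from one SNIP and its hash-path."""
    acc = leaf_hash(snip)
    for sibling, sibling_is_left in path:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root

SNIPS = [b"snip-%d" % i for i in range(5)]  # placeholder SNIP bodies
LEVELS = build_tree(SNIPS)
ROOT = LEVELS[-1][0]
```

   A client holding only the signed root would need about log2(n) hashes
   per SNIP, rather than one authority signature per authority.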

   Another possibility (due to a conversation among Chelsea Komlo, Sajin
   Sasy, and Ian Goldberg), is to *use SNARKs*.  (Why not?  All the cool
   kids are doing it!)  For this, we'd have the clients download a
   signed hash of the ENDIVE periodically, and have the authorities
   generate a SNARK for each SNIP, proving its presence in that
   document.

3.3. How can we detect authority misbehavior?

   We might want to take countermeasures against the possibility that a
   quorum of corrupt or compromised authorities give some relays a
   different set of SNIPs than they give other relays.

   If we incorporate a merkle tree or a hash chain in the design, we can
   use mechanisms similar to certificate transparency to ensure that the
   authorities have a consistent log of all the entries that they have
   ever handed out.

3.4. How many types of weighted node selection are there, and how do we
     handle them?

   Right now, there are multiple weights that we use in Tor:
      * Weight for exit
      * Weight for guard
      * Weight for middle node

   We also filter nodes for several properties, such as flags they have.

   To reproduce this behavior, we should enumerate the various weights
   and filters that we use, and (if there are not too many) create a
   separate index for each.  For example, the Guard index would weight
   every node for selection as guard, assigning 0 weight to non-Guard
   nodes.  The Exit index would weight every node for selection as an
   exit, assigning 0 weight to non-Exit nodes.

   When choosing a relay, the client would have to specify which index
   to use.  We could either have a separate (labeled) set of SNIP
   entries for each index, or we could have each SNIP have a separate
   (labeled) index range for each index.
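
   To make the per-index idea concrete, here is a minimal sketch that
   lays relays end to end in an integer index space, once per index.
   The relay names, the weights, and the helper names are made up for
   the example; the real index format is not specified in this
   proposal.

```python
from bisect import bisect_right

def build_index(weighted_relays):
    """Lay relays end to end in an integer index space.

    weighted_relays: iterable of (name, weight) with integer weights.
    Returns a list of (lo, hi, name) with hi exclusive.  A relay with
    weight 0 gets no range, so it can never be selected via this index.
    """
    ranges, pos = [], 0
    for name, weight in weighted_relays:
        if weight > 0:
            ranges.append((pos, pos + weight, name))
            pos += weight
    return ranges

def lookup(ranges, idx):
    """Return the relay whose range contains idx, or raise KeyError."""
    starts = [lo for lo, _, _ in ranges]
    i = bisect_right(starts, idx) - 1
    if i < 0 or not (ranges[i][0] <= idx < ranges[i][1]):
        raise KeyError(idx)
    return ranges[i][2]

# Hypothetical relays: (name, guard_weight, exit_weight).
RELAYS = [("relayA", 50, 0), ("relayB", 30, 20), ("relayC", 0, 80)]
GUARD_INDEX = build_index((name, g) for name, g, _ in RELAYS)
EXIT_INDEX = build_index((name, e) for name, _, e in RELAYS)
```

   A client wanting an exit would pick a uniform random value below the
   total exit weight and name the Exit index; the relay answering the
   extend request would perform the equivalent of lookup() on it.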

   REGRESSION: the client's choice of which index to use would leak the
   next router's position and purpose in the circuit.  This information
   is something that we believe relays can infer now, but it's not a
   desired feature that they can.

3.5. Does this design break onion service introduce handshakes?

   In rend-spec-v3.txt section 3.3.2, we specify a variant of ntor for
   use in INTRODUCE2 handshakes.  It allows the client to send encrypted
   data as part of its initial ntor handshake, but requires the client
   to know the onion service's onion key before it sends its initial
   handshake.

   That won't be a problem for us here, though: we still require clients
   to fetch onion service descriptors before contacting an onion
   service.

3.6. How does the onion service directory work here?

   The onion service directory is implemented as a hash ring, where
   each relay's position in the hash ring is decided by a hash of its
   identity, the current date, and a shared random value that the
   authorities compute each day.

   To implement this hash ring using walking onions, we would need to
   have an extra index based not on bandwidth, but on position in the
   hash ring.  Then onion services and clients could build a circuit,
   then extend it one more hop specifying their desired index in the
   hash ring.

   We could either have a command to retrieve a trio of hashring-based
   routing entries by index, or to retrieve (or connect to?) the n'th item
   after a given hashring entry.
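
   The following sketch shows one way a position-based index could be
   queried.  The hash function and the way the inputs are concatenated
   are placeholders chosen for the example, not the HSDir index
   computation from rend-spec-v3.txt, and the function names are
   hypothetical.

```python
import hashlib
from bisect import bisect_left

def ring_position(identity: bytes, period: bytes, shared_random: bytes) -> int:
    """Map a relay to a point on the ring (placeholder construction)."""
    digest = hashlib.sha3_256(identity + period + shared_random).digest()
    return int.from_bytes(digest, "big")

def build_ring(identities, period, shared_random):
    """Return (position, identity) pairs sorted by position."""
    return sorted((ring_position(i, period, shared_random), i)
                  for i in identities)

def successors(ring, target, n):
    """Return the n relays at or after target, wrapping past the end."""
    positions = [pos for pos, _ in ring]
    start = bisect_left(positions, target)
    count = min(n, len(ring))
    return [ring[(start + k) % len(ring)][1] for k in range(count)]

IDS = [b"relay-%d" % i for i in range(8)]  # placeholder identities
RING = build_ring(IDS, b"2019-05-16", b"example-srv")
```

   Retrieving "a trio of hashring-based routing entries" would then
   correspond to successors(RING, target, 3).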

3.7. How can clients choose guard nodes?

   We can reuse the fallback directories here.  A newly bootstrapping
   client would connect to a fallback directory, then build a three-hop
   circuit, and finally extend the three-hop circuit by indexing to a
   random guard node.  The random guard node's SNIP would
   contain the information that the client needs to build real circuits
   through that guard in the future.  Because the client would be
   building a three-hop circuit, the fallback directory would not learn
   the client's guards.

   (Note that even if the extend attempt fails, we should still pick the
   node as a possible guard based on its router entry, so that other
   nodes can't veto our choice of guards.)

3.8. Does the walking onions design preclude postquantum circuit handshakes?

   Not at all!  Both proposal 263 (ntru) and proposal 270 (newhope) work
   by having the client generate an ephemeral key as part of its initial
   handshake.  The client does not need to know the relay's onion key to
   do this, so we can still integrate those proposals with this one.

3.9. Does the walking onions design stop us from changing the network
     topology?

   For Tor to continue to scale, we will someday need to accept that not
   every relay can be simultaneously connected to every other relay.
   Therefore, we will need to move from our current clique topology
   assumption to some other topology.

   There are also proposals to change node selection rules to generate
   routes providing better performance, or improved resistance to local
   adversaries.

   We can, I think, implement this kind of proposal by changing the way
   that ENDIVEs are generated.  Instead of giving every relay the same
   ENDIVE, the authorities would generate different ENDIVEs for
   different relays, depending on the probability distribution of which
   relay should be chosen after which in the network topology.  In the
   extreme case, this would produce O(n) ENDIVEs and O(n^2) SNIPs.  In
   practice, I hope that we could do better by having the network
   topology be non-clique, and by having many relays share the same
   distribution of successors.


3.10. How can clients handle exit policies?

   This is an unsolved challenge.  If the client tells the middle relay
   its target port, it leaks information inappropriately.

   One possibility is to try to gather exit policies into common
   categories, such as "most ports supported" and "most common ports
   supported".

   Another (inefficient) possibility is for clients to keep trying exits
   until they find one that works.

   Another (inefficient) possibility is to require that clients who use
   unusual ports fall back to the old mechanism for route selection.


3.11. Can this approach support families?

   This is an unsolved challenge.

   One (inefficient) possibility is for clients to generate circuits and
   discard those that use multiple relays in the same family.

   One (not quite compatible) possibility is for the authorities to sort
   the ENDIVE so that relays in the same family are adjacent to
   one another.  The index-bounds part of each SNIP would also
   have to include the bounds of the family.  This approach is not quite
   compatible with the status quo, because it prevents relays from
   belonging to more than one family.

   One interesting possibility (due to Chelsea Komlo, Sajin Sasy, and
   Ian Goldberg) is for the middle node to take responsibility for
   family enforcement. In this design, the client might offer the middle
   node multiple options for the next relay's index, and the middle node
   would choose the first such relay that is neither in its family nor
   its predecessor's family.  We'd need to look for a way to make sure
   that the middle node wasn't biasing the path selection.
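
   A minimal sketch of the middle-node variant, assuming a hypothetical
   resolve() callback that maps an offered index to a relay record with
   a set of family identifiers.  It does nothing to address the bias
   concern raised above.

```python
def choose_next_hop(offered_indices, resolve, excluded_families):
    """Pick the first offered index whose relay shares no excluded family.

    offered_indices: indices in the order the client offered them.
    resolve: hypothetical callback, index -> relay dict or None.
    excluded_families: families of the middle node and its predecessor.
    Returns (index, relay), or None if no offer is acceptable.
    """
    for idx in offered_indices:
        relay = resolve(idx)
        if relay is not None and relay["families"].isdisjoint(excluded_families):
            return idx, relay
    return None

# Made-up table standing in for the middle node's view of the index.
TABLE = {
    10: {"name": "relayX", "families": {"fam1"}},
    20: {"name": "relayY", "families": {"fam2"}},
    30: {"name": "relayZ", "families": set()},
}
```

   Because the middle node sees every offered index, it also learns
   which candidates the client considered; that is part of the impact
   that would need analysis.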

   (TODO: come up with more ideas here.)

3.12. Can walking onions support IP-based and country-based restrictions?

   This is an unsolved challenge.

   If the user's restrictions do not exclude most paths, one
   (inefficient) possibility is for the user to generate paths until
   they generate one that they like.  This idea becomes inefficient
   if the user is excluding most paths.

   Another (inefficient and fingerprintable) possibility is to require
   that clients who use complex path restrictions fall back to the old
   mechanism for route selection.

   (TODO: come up with better ideas here.)

3.13. What scaling problems have we not solved with this design?

   The walking onions design doesn't solve (on its own) the problem that
   the authorities need to know about every relay, and arrange to have
   every relay tested.

   The walking onions design doesn't solve (on its own) the problem that
   relays need to have a list of all the relays.  (But see section 3.9
   above.)

3.14. Should we still have clients download a consensus when they're
      using walking onions?

   There are some fields in the current consensus directory documents
   that the clients will still need, like the list of supported
   protocols and network parameters.  A client that uses walking onions
   should download a new flavor of consensus document that contains only
   these fields, and does not list any relays.  In some signature
   schemes, this consensus would contain a digest of the ENDIVE -- see
   3.2 above.

   (Note that this document would be a "consensus document" but not a
   "consensus directory", since it doesn't list any relays.)


4. Putting it all together

   [This is the section where, in a later version of this proposal, I
   would specify the exact behavior and data formats to be used here.
   Right now, I'd say we're too early in the design phase.]


A.1. Acknowledgments

   Thanks to Peter Palfrader for his original design in proposal 141,
   and to the designers of PIR-Tor, both of which inspired aspects of
   this Walking Onions design.

   Thanks to Chelsea Komlo, Sajin Sasy, and Ian Goldberg for feedback on
   an earlier version of this design.

   Thanks to David Goulet, Teor, and George Kadianakis for commentary on
   earlier versions of this draft.

   This research was supported by NSF grants CNS-1526306 and CNS-1619454.

A.2. Additional ideas

   Teor notes that there are ways to try to get this idea to apply to
   one-pass circuit construction, something like the old onion design.
   We might be able to derive indices and keys from the same seeds,
   even.  I don't see a way to do this without losing forward secrecy,
   but it might be worth looking at harder.


Filename: 301-dont-vote-on-package-fingerprints.txt
Title: Don't include package fingerprints in consensus documents
Author: Iain R. Learmonth
Created: 2019-02-21
Status: Closed
Ticket: #28465

0. Abstract

   I propose modifying the Tor consensus document to remove
   digests of the latest versions of package files. These "package"
   lines were never used by any directory authority and so add
   additional complexity to the consensus voting mechanisms while
   adding no additional value.

1. Introduction

   In proposal 227 [1], to improve the integrity and security of
   updates, a way to authenticate the latest versions of core Tor
   software through the consensus was described. By listing a location
   with this information for each version of each package, we can
   augment the update process of Tor software to authenticate the
   packages it downloads through the Tor consensus. This was
   implemented in tor 0.2.6.3-alpha.

   When looking at modernising our network archive recently [2], I
   came across this line for votes and consensuses. If packages are
   referenced by the consensus then ideally we should archive those
   packages just as we archive referenced descriptors. However, this
   line was never present in any vote archived.

2. Proposal

   We deprecate the "package" line in the specification for votes.

   Directory authorities stop voting for "package" lines in their
   votes. Changes to votes do not require a new consensus method, so
   this part of the proposal can be implemented separately.

   We allocate a consensus method when this proposal is implemented.
   Let's call it consensus method N.

   Authorities will continue computing consensus package lines in the
   consensus if the consensus method is between 19 and (N-1).  If the
   consensus method is N or later, they omit these lines.
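
   The rule above amounts to a half-open range check.  In this sketch,
   removal_method stands for the not-yet-allocated method N, and the
   function name is made up for the example.

```python
FIRST_PACKAGE_METHOD = 19  # lower bound given in the text above

def include_package_lines(consensus_method: int, removal_method: int) -> bool:
    """True if "package" lines are still computed for this method.

    removal_method is the (still unallocated) consensus method N.
    """
    return FIRST_PACKAGE_METHOD <= consensus_method < removal_method
```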

3. Security Considerations

   This proposal removes a feature that could be used for improved
   security but currently isn't. As such it is extra code in the
   codebase that may have unknown bugs or lead to bugs in the future
   due to unexpected interactions. Overall this should be a good
   thing for security of Core Tor.

4. Compatibility Considerations

   A new consensus method is required for this proposal. The
   "package" line was always optional and so no client should be
   depending on it. There are no known consumers of the "package"
   lines (there are none to consume anyway).

A. References

   [1] Nick Mathewson, Mike Perry. "Include package fingerprints in
       consensus documents". Tor Proposal 227, February 2014.
   [2] Iain Learmonth, Karsten Loesing. "Towards modernising data
       collection and archive for the Tor network". Technical Report
       2018-12-001, December 2018.

B. Acknowledgements

   Thanks to teor and Nick Mathewson for their comments and
   suggestions on this proposal.
Filename: 302-padding-machines-for-onion-clients.txt
Title: Hiding onion service clients using padding
Author: George Kadianakis, Mike Perry
Created: Thursday 16 May 2019
Status: Closed
Implemented-In: 0.4.1.1-alpha

NOTE: Please look at section 3 of padding-spec.txt now, not this document.

0. Overview

   Tor clients use "circuits" to do anonymous communications. There are various
   types of circuits. Some of them are for navigating the normal Internet,
   others are for fetching Tor directory information, others are for connecting
   to onion services, while others are simply for measurements and testing.

   It's currently possible for MITM type of adversaries (like tor-network-level
   and local-area-network adversaries) to distinguish Tor circuit types from
   each other using a wide array of metadata and distinguishers.

   In this proposal, we study various techniques that can be used to
   distinguish client-side onion service circuits and provide WTF-PAD circuit
   padding machines (using prop#254) to hide them against certain adversaries.

1. Motivation

   We are writing this proposal for various reasons:

   1) We believe that in an ideal setting MITM adversaries should not be able
      to distinguish circuit types by inspecting traffic. Tor traffic should
      look amorphous to an outside observer to maximize uncertainty and
      anonymity properties.

      Client-side onion service circuits are an easy target for this proposal,
      because we believe we can improve their privacy with low bandwidth
      overhead.

   2) We want to start experimenting with the WTF-PAD subsystem of Tor, and
      this use-case provides us with a good testbed.

   3) We hope that by actually starting to use the WTF-PAD subsystem of Tor, we
      will encourage more researchers to start experimenting with it.

2. Scope of the proposal [SCOPE]

   Given the above, this proposal sets out to use the WTF-PAD system to hide
   client-side onion service circuits against the classifiers of the paper by
   Kwon et al. [0].

   By client-side onion service circuits we refer to these two types of circuits:
      - Client-side introduction circuits: Circuit from client to the introduction point
      - Client-side rendezvous circuits: Circuit from client to the rendezvous point

   Service-side onion service circuits are not in scope for this proposal, and
   this is because hiding those would require more bandwidth and also more
   advanced WTF-PAD features.

   Furthermore, this proposal only aims to cloak the naive distinguishing
   features mentioned in the [KNOWN_DISTINGUISHERS] section, and can by no
   means guarantee that client-side onion service circuits are totally
   indistinguishable by other means.

   The machines specified in this proposal are meant to be lightweight and
   created for a specific purpose. This means that they can be easily extended
   with additional states to do more advanced hiding.

3. Known distinguishers against onion service circuits [KNOWN_DISTINGUISHERS]

   Over the past years it's been assumed that motivated adversaries can
   distinguish onion-service traffic from normal Tor traffic given their
   special characteristics.

   As far as we know, there has been relatively little research-level work done
   in this direction. The main article published in this area is the USENIX
   paper "Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden
   Services" by Kwon et al. [0]

   The above paper deals with onion service circuits in sections 3.2 and 5.1.
   It uses the following three "naive" circuit features to distinguish circuits:
      1) Circuit construction sequence
      2) Number of incoming and outgoing cells
      3) Duration of Activity ("DoA")

   All onion service circuits have particularly loud signatures in the above
   characteristics, but WTF-PAD (prop#254) gives us tools to effectively
   silence those signatures to the point where the paper's classifiers won't
   work.

4. Hiding circuit features using WTF-PAD

   According to section [KNOWN_DISTINGUISHERS] there are three circuit features
   we are attempting to hide. Here is how we plan to do this using the WTF-PAD
   system:

   1) Circuit construction sequence

      The USENIX paper uses the directions of the first 10 cells sent in a
      circuit to fingerprint it. Client-side onion service circuits have
      unique circuit construction sequences and hence they can be fingerprinted
      using just the first 10 cells.

      We use WTF-PAD to destroy this feature of onion service circuits by
      carefully sending padding cells (relay DROP cells) during circuit
      construction and making them look exactly like most general tor circuits
      up till the end of the circuit construction sequence.

   2) Number of incoming and outgoing cells

      The USENIX paper uses the number of incoming and outgoing cells to
      distinguish circuit types. For example, client-side introduction circuits
      have the same number of incoming and outgoing cells, whereas client-side
      rendezvous circuits have more incoming than outgoing cells.

      We use WTF-PAD to destroy this feature by changing the number of cells
      sent in introduction circuits. We leave rendezvous circuits as is, since
      the actual rendezvous traffic flow usually closely resembles that of
      normal Tor circuits.

   3) Duration of Activity ("DoA")

      The USENIX paper uses the period of time during which circuits send and
      receive cells to distinguish circuit types. For example, client-side
      introduction circuits are really short lived, whereas service-side
      introduction circuits are very long lived. On the other hand, rendezvous
      circuits have the same median lifetime as general Tor circuits, which is
      10 minutes.

      We use WTF-PAD to destroy this feature of client-side introduction
      circuits by setting a special WTF-PAD option, which keeps the circuits
      open for 10 minutes, completely mimicking the DoA of general Tor circuits.

4.1. A dive into general circuit construction sequences [CIRCCONSTRUCTION]

   In this section we give an overview of what circuit construction looks like
   to a network or guard-level adversary. We use this knowledge to design
   padding machines that can make intro and rend circuits look like these
   general circuits.

   In particular, most general Tor circuits used to surf the web or download
   directory information start with the following 6-cell relay cell sequence
   (cells surrounded by [brackets] are outgoing, the others are incoming):

     [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED

   When this is done, the client has established a 3-hop circuit and also
   opened a stream to the other end. Usually after this comes a series of DATA
   cells that either fetch pages, establish an SSL connection or fetch
   directory information:

     [DATA] -> [DATA] -> DATA -> DATA

   The above stream of 10 relay cells defines the vast majority of general
   circuits that came out of Tor Browser during our testing, and it's what we
   are going to use to make introduction and rendezvous circuits blend in.

   Please note that in this section we only investigate relay cells and not
   connection-level cells like CREATE/CREATED or AUTHENTICATE/etc. that are
   used during the link-layer handshake. The rationale is that connection-level
   cells depend on the type of guard used and are not an effective fingerprint
   for a network/guard-level adversary.

5. WTF-PAD machines

   For the purposes of this proposal we will make use of four WTF-PAD machines
   as follows:

      - Client-side introduction circuit hiding machine (origin-side)
      - Client-side introduction circuit hiding machine (relay-side)

      - Client-side rendezvous circuit hiding machine (origin-side)
      - Client-side rendezvous circuit hiding machine (relay-side)

   In the following sections we will analyze these machines.

5.1. Client-side introduction circuit hiding machines [INTRO_CIRC_HIDING]

   These two machines are meant to hide client-side introduction circuits. The
   origin-side machine sits on the client and sends padding towards the
   introduction circuit, whereas the relay-side machine sits on the middle-hop
   (second hop of the circuit) and sends padding towards the client. The
   padding from the origin-side machine terminates at the middle-hop and does
   not get forwarded to the actual introduction point.

   Both of these machines only get activated for introduction circuits, and
   only after an INTRODUCE1 cell has been sent out.

   This means that before the machine gets activated our cell flow looks like this:

    [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [INTRODUCE1]

   Comparing the above with section [CIRCCONSTRUCTION], we see that the above
   cell sequence matches the one from general circuits up to the first 7 cells.

   However, in normal introduction circuits this is followed by an
   INTRODUCE_ACK and then the circuit gets torn down, which does not match
   the sequence from [CIRCCONSTRUCTION].

   Hence when our machine is used, after sending an [INTRODUCE1] cell, we also
   send a [PADDING_NEGOTIATE] cell, which gets answered by a PADDING_NEGOTIATED
   cell and an INTRODUCE_ACK cell. This makes us match the [CIRCCONSTRUCTION]
   sequence up to the first 10 cells.
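
   The "first 7 cells" and "first 10 cells" claims can be checked
   mechanically by reducing each sequence to cell directions, which is
   the feature the classifier looks at.  The sketch below reuses the
   bracket notation from [CIRCCONSTRUCTION]; the list and function names
   are made up for the example.

```python
GENERAL = ["[EXTEND2]", "EXTENDED2", "[EXTEND2]", "EXTENDED2",
           "[BEGIN]", "CONNECTED", "[DATA]", "[DATA]", "DATA", "DATA"]

_INTRO_SETUP = ["[EXTEND2]", "EXTENDED2", "[EXTEND2]", "EXTENDED2",
                "[EXTEND2]", "EXTENDED2", "[INTRODUCE1]"]

# Without padding: INTRODUCE_ACK arrives and the circuit is torn down.
INTRO_UNPADDED = _INTRO_SETUP + ["INTRODUCE_ACK"]

# With the machines active, in the order described in the text.
INTRO_PADDED = _INTRO_SETUP + ["[PADDING_NEGOTIATE]",
                               "PADDING_NEGOTIATED", "INTRODUCE_ACK"]

def directions(cells):
    """Bracketed cells are outgoing; all others are incoming."""
    return ["out" if c.startswith("[") else "in" for c in cells]

def matching_prefix(a, b):
    """Count how many leading cells travel in the same direction."""
    count = 0
    for x, y in zip(directions(a), directions(b)):
        if x != y:
            break
        count += 1
    return count
```

   Under this model the unpadded circuit diverges from the general
   pattern at the eighth cell, while the padded one matches all ten.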

   After that, we continue sending padding from the relay-side machine so as to
   fake a directory download, or an SSL connection setup. We also want to
   continue sending padding so that the connection stays up longer to destroy
   the "Duration of Activity" fingerprint.

   To calculate the padding overhead, we see that the origin-side machine just
   sends a single [PADDING_NEGOTIATE] cell, whereas the relay-side machine
   sends a PADDING_NEGOTIATED cell and between 7 and 10 DROP cells. This means
   that the average overhead of this machine is about 11 padding cells.

   In terms of WTF-PAD terminology, these machines have three states (START,
   OBF, END). They move from the START to OBF state when the first
   non-padding cell is received on the circuit, and they stay in the OBF
   state until all the padding gets depleted. The OBF state is controlled by
   a histogram which specifies the parameters described in the paragraphs
   above. After all the padding finishes, it moves to END state.
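
   The three-state behaviour can be pictured with a toy model.  This is
   not the circuit padding code in tor: the class and method names are
   invented, the histogram that schedules padding is reduced to a bare
   timer callback, and the budget of 7 to 10 DROP cells is taken from
   the overhead paragraph above.

```python
import random
from enum import Enum

class State(Enum):
    START = "START"
    OBF = "OBF"
    END = "END"

class ToyRelaySideMachine:
    """Toy model of the relay-side intro machine (illustrative only)."""

    def __init__(self, rng=None):
        rng = rng or random.Random()
        self.state = State.START
        self.remaining = rng.randint(7, 10)  # DROP cell budget
        self.sent = 0

    def on_nonpadding_cell_received(self):
        # START -> OBF on the first non-padding cell seen on the circuit.
        if self.state is State.START:
            self.state = State.OBF

    def on_padding_timer(self):
        """Send one DROP cell if in OBF; return True if a cell was sent."""
        if self.state is not State.OBF:
            return False
        self.remaining -= 1
        self.sent += 1
        if self.remaining == 0:
            self.state = State.END  # padding depleted
        return True
```

   Driving the timer until it returns False yields between 7 and 10
   padding cells, after which the machine stays in END.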

   We also set a special WTF-PAD flag which keeps the circuit open even after
   the introduction is performed. In particular, with this feature the circuit
   will stay alive for the same duration as normal web circuits before they
   expire (usually 10 minutes).

5.2. Client-side rendezvous circuit hiding machines

   The rendezvous circuit machines apply on client-side rendezvous circuits and
   only after the rendezvous point has been established (REND_ESTABLISHED has
   been received). Up to that point, the following cell sequence has been
   observed on the circuit:

    [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [ESTABLISH_REND] -> REND_ESTABLISHED

   which matches the general circuit construction sequence [CIRCCONSTRUCTION]
   up to the first 6 cells. However after that, normal rendezvous circuits
   receive a RENDEZVOUS2 cell followed by a [BEGIN] and a CONNECTED, which does
   not fit the circuit construction sequence we are trying to imitate.

   Hence our machine gets activated right after REND_ESTABLISHED is received,
   and continues by sending a [PADDING_NEGOTIATE] and a [DROP] cell, before
   receiving a PADDING_NEGOTIATED and a DROP cell, effectively blending into
   the general circuit construction sequence on the first 10 cells.

   After that our machine gets deactivated, and we let the actual rendezvous
   circuit shape the traffic flow. Since rendezvous circuits usually imitate
   general circuits (their purpose is to surf the web), we can expect that they
   will look alike.

   In terms of overhead, this machine is quite light. Both sides send 2 padding
   cells, for a total of 4 padding cells.

6. Overhead analysis

   Given the parameters above, intro circuit machines have an overhead of 11
   padding cells, and rendezvous circuit machines have an overhead of 4
   padding cells. This means that for every intro and rendezvous circuit
   there will be an overhead of 15 padding cells on average, which is about
   7.5kb.

   In the PrivCount paper [1] we learn that the Tor network sees about 12
   million successful descriptor fetches per day. We can use this figure to
   assume that the Tor network also sees about 12 million intro and rendezvous
   circuits per day. Given the 7.5kb overhead of each of these circuits, we get
   that our padding machines incur an additional 94GB overhead per day on the
   network, which is about 3.9GB per hour.

   XXX Isn't this kinda intense????? Using the graphs from metrics we see that
       the Tor network has total capacity of 300 Gbit/s which is about 135000GB per
       hour, so 3.9GB per hour is not that much, but still...
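
   The arithmetic in this section can be reproduced as follows.  The
   512-byte cell size is an assumption (it is what makes 15 cells come
   out at 7.5kb), and GB is taken to mean 10^9 bytes.  With these
   inputs the daily figure comes out near 92GB rather than exactly
   94GB, so the numbers above were presumably derived with slightly
   different rounding or inputs.

```python
CELL_BYTES = 512              # assumed cell size
INTRO_PADDING_CELLS = 11      # from section 5.1
REND_PADDING_CELLS = 4        # from section 5.2
CIRCUIT_PAIRS_PER_DAY = 12_000_000   # descriptor-fetch estimate from [1]
NETWORK_CAPACITY_BITS_PER_SEC = 300e9

total_cells = INTRO_PADDING_CELLS + REND_PADDING_CELLS
bytes_per_pair = total_cells * CELL_BYTES

# Padding overhead across the network, in decimal gigabytes.
daily_gb = CIRCUIT_PAIRS_PER_DAY * bytes_per_pair / 1e9
hourly_gb = daily_gb / 24

# Total network capacity per hour, for comparison.
capacity_gb_per_hour = NETWORK_CAPACITY_BITS_PER_SEC / 8 * 3600 / 1e9
fraction_of_capacity = hourly_gb / capacity_gb_per_hour

print(f"{bytes_per_pair} bytes per intro+rend pair")
print(f"{daily_gb:.1f} GB/day, {hourly_gb:.2f} GB/hour")
print(f"{fraction_of_capacity:.6%} of hourly capacity")
```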

7. Discussion

7.1. Alternative approaches

   These machines try to hide onion service client-side circuits by obfuscating
   their looks. This is a reasonable approach, but if the resulting circuits
   look unlike any other Tor circuits, they would still be fingerprintable just
   by that fact.

   Another approach we could take is to make normal client circuits look like
   onion service circuits, or just make normal clients establish fake onion
   service circuits periodically. The hope here is that the adversary won't be
   able to distinguish fake onion service circuits from real ones. This
   approach has not been taken yet, mainly because it requires additional
   WTF-PAD features and poses greater overhead risks.

7.2. Future work

   As discussed in [SCOPE], this proposal only aims to hide some very specific
   features of client-side onion service circuits. There is lots of work to be
   done here to see what other features can be used to distinguish such
   circuits, and also what other classifiers can be built using deep learning
   and whatnot.

A. Acknowledgements

   This research was supported by NSF grants CNS-1526306 and CNS-1619454.

---

   [0]: https://www.usenix.org/node/190967
        https://blog.torproject.org/technical-summary-usenix-fingerprinting-paper

   [1]: "Understanding Tor Usage with Privacy-Preserving Measurement"
        by Akshaya Mani, T Wilson-Brown, Rob Jansen, Aaron Johnson, and Micah Sherr
        In Proceedings of the Internet Measurement Conference 2018 (IMC 2018).
Filename: 303-protover-removal-policy.txt
Title: When and how to remove support for protocol versions
Author: Nick Mathewson
Created: 21 May 2019
Status: Open

1. Background

   With proposal 264, we added support for "subprotocol versions" -- a
   means to declare which features are required for participation in the
   Tor network.  We also created a mechanism (refined later in proposal
   297) for telling Tor clients and relays that they cannot participate
   effectively in the Tor network, and they need to shut down.

   In this document, we describe a policy according to which these
   decisions should be made in practice.

2. Recommending features (for clients and relays)

   A subprotocol version SHOULD become recommended soon after all
   release series that did not provide it become unsupported (within a
   month or so).

   For example, the current oldest LTS release series is 0.2.9; when it
   becomes unsupported in 2020, the oldest supported release series will
   be 0.3.5.  Suppose that 0.2.9 supports a subprotocol Cupcake=1, and
   that all stable 0.3.5.x versions support Cupcake=1-3.  Around one
   month after the end of 0.2.9 support, Cupcake=3 should become a
   _recommended_ protocol for clients and relays.

   Additionally, a feature can become _recommended_ because of security
   reasons.  If we believe that it is a terrible idea to run an old
   protocol, we can make it _recommended_ for relays or clients or both.
   We should not do this lightly, since it will be annoying.

3. Requiring features (for relays)

   We regularly update the directory authorities to require relays to
   run certain versions of Tor or later.  We generally do this after a
   short outreach campaign to get as many relays as possible to upgrade.

   We MAY make a feature required for relays one month after every
   version without it is obsolete and unsupported, though it is better
   to wait three months if possible.

   We SHOULD make a feature required for relays within 12 months after
   every version without it is obsolete and unsupported.

4. Requiring features (for clients)

   Clients take the longest time to update, and are often the least able
   to fetch upgrades. Because of this, we should be very careful about
   making subprotocol versions required on clients, and should only do
   so for fairly compelling reasons.

   We SHOULD NOT make a feature required for clients until it has been
   _recommended_ for clients for at least 9 months.

   We SHOULD make a feature required for clients if it has been
   _recommended_ for clients for at least 18 months.
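
   The timing rules in sections 3 and 4 can be summarised as two small
   lookup functions.  This is one reading of the policy, not part of
   it: the function names are invented, and the label for the client
   window between 9 and 18 months is an interpretation, since the text
   only says to require features there for fairly compelling reasons.

```python
def relay_requirement_status(months_since_unsupported):
    """Months since every version lacking the feature became unsupported."""
    if months_since_unsupported < 1:
        return "too early"
    if months_since_unsupported < 12:
        return "MAY require"      # waiting three months is preferred
    return "SHOULD require"

def client_requirement_status(months_recommended):
    """Months the feature has been _recommended_ for clients."""
    if months_recommended < 9:
        return "SHOULD NOT require"
    if months_recommended < 18:
        return "may require for compelling reasons"  # interpretation
    return "SHOULD require"
```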


Filename: 304-socks5-extending-hs-error-codes.txt
Title: Extending SOCKS5 Onion Service Error Codes
Author: David Goulet, George Kadianakis
Created: 22-May-2019
Status: Closed

Note: We are extending SOCKS5 here, but in time, once Tor Browser supports
      HTTPCONNECT, we should not do that anymore.

0. Abstract

   We propose extending the SOCKS5 protocol to allow returning more meaningful
   response failure onion service codes back to the client.

   This is inspired by proposal 229 [PROP229] minus the new authentication
   method.

1. Introduction

   The motivation behind this proposal is that we need a synchronous way to
   return the reason why a SOCKS5 connection failed.

   The alternative is to use a control port event but then the caller needs to
   match the SOCKS failure to the control event. And tor provides no guarantee
   that a control event will be emitted before the SOCKS failure or vice
   versa.

   With this proposal, the client can learn why the onion service
   connection failed from the returned SOCKS5 error code.

2. Proposal

2.1. New SocksPort Flag

   In order to have backward compatibility with third party applications that
   do not support the newly proposed SOCKS5 error codes, we propose a new
   SocksPort flag that needs to be set in the tor configuration file in order
   for those codes to be sent back.

   The new SocksPort flag is:

      "ExtendedErrors" -- Tor will report the new SOCKS5 error codes detailed
                          below in section 2.2 (once merged, they will end up
                          in socks-extension.txt).

   It is possible that more codes will be added in the future, so an
   application using this flag should be prepared for unknown codes to be
   returned.

2.2. Onion Service Extended SOCKS5 Error Code

   We introduce the following additional SOCKS5 reply codes to be sent in the
   REP field of a SOCKS5 message iff the "ExtendedErrors" on the SocksPort is
   set (see section 2.1 above).

   The SOCKS5 specification [RFC1928] defines a range of codes that are
   "unassigned", so we'll be using those on the far end of the range in order
   to inform the client of onion service failures:

   Where:

    * X'F0' Onion Service Descriptor Can Not be Found

      The requested onion service descriptor can't be found on the hashring
      and thus the service is not reachable by the client.

    * X'F1' Onion Service Descriptor Is Invalid

      The requested onion service descriptor can't be parsed or signature
      validation failed.

    * X'F2' Onion Service Introduction Failed

      Client failed to introduce to the service, meaning the descriptor was
      found but the service is no longer at the introduction points. The
      service has likely changed its descriptor or is not running.

    * X'F3' Onion Service Rendezvous Failed

      Client failed to rendezvous with the service which means that the client
      is unable to finalize the connection.

    * X'F4' Onion Service Missing Client Authorization

      Tor was able to download the requested onion service descriptor but is
      unable to decrypt its content because it is missing client authorization
      information for it.

    * X'F5' Onion Service Wrong Client Authorization

      Tor was able to download the requested onion service descriptor but is
      unable to decrypt its content using the client authorization information
      it has. This means the client's access was revoked.
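
   As an illustration of how an application might consume these codes, here
   is a minimal Python sketch. The function name describe_socks5_reply and
   the fallback message are hypothetical; only the REP values and their
   meanings come from the list above, and the position of the REP field
   comes from [RFC1928].

```python
# Extended SOCKS5 reply (REP) codes from section 2.2 of this proposal.
EXTENDED_ERRORS = {
    0xF0: "Onion Service Descriptor Can Not be Found",
    0xF1: "Onion Service Descriptor Is Invalid",
    0xF2: "Onion Service Introduction Failed",
    0xF3: "Onion Service Rendezvous Failed",
    0xF4: "Onion Service Missing Client Authorization",
    0xF5: "Onion Service Wrong Client Authorization",
}


def describe_socks5_reply(reply: bytes) -> str:
    """Return a human-readable reason for a SOCKS5 reply (hypothetical helper)."""
    if len(reply) < 2 or reply[0] != 0x05:
        raise ValueError("not a SOCKS5 reply")
    rep = reply[1]  # REP is the second byte of the reply, right after VER
    if rep == 0x00:
        return "succeeded"
    if rep in EXTENDED_ERRORS:
        return EXTENDED_ERRORS[rep]
    # Section 2.1: more codes may be added later, so tolerate unknown values.
    return f"unknown or non-extended failure (REP=0x{rep:02X})"
```

   An application that sets "ExtendedErrors" could call this on the reply
   to its CONNECT request and show the result to the user.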

3. Compatibility

   No new field or extension has been added. Only new code values from the
   unassigned range are being used. We expect these to not be a problem for
   backward compatibility.

   These codes are only sent back if the new proposed SocksPort flag,
   "ExtendedErrors", is set, which makes backward and forward compatibility
   easier.

References:

[PROP229] https://gitweb.torproject.org/torspec.git/tree/proposals/229-further-socks5-extensions.txt

[RFC1928] https://www.ietf.org/rfc/rfc1928.txt
Filename: 305-establish-intro-dos-defense-extention.txt
Title: ESTABLISH_INTRO Cell DoS Defense Extension
Author: David Goulet, George Kadianakis
Created: 06-June-2019
Status: Closed

0. Abstract

   We propose introducing a new cell extension to the onion service version 3
   ESTABLISH_INTRO cell in order for a service operator to send directives to
   the introduction point.

1. Introduction

   The idea behind this proposal is to provide a way for a service operator to
   give to the introduction points Denial of Service (DoS) defense parameters
   through the ESTABLISH_INTRO cell.

   We are currently developing onion service DoS defenses at the introduction
   point layer which for now has consensus parameter values for the defenses'
   knobs. This proposal would allow the service operator more flexibility for
   tuning these knobs and/or future parameters.

2. ESTABLISH_INTRO Cell DoS Extension

   We introduce a new extension to the ESTABLISH_INTRO cell. The EXTENSIONS
   field will be leveraged and a new protover will be introduced to reflect
   that change.

   As a reminder, this is the content of an ESTABLISH_INTRO cell (taken from
   rend-spec-v3.txt section 3.1.1):

     AUTH_KEY_TYPE         [1 byte]
     AUTH_KEY_LEN          [2 bytes]
     AUTH_KEY              [AUTH_KEY_LEN bytes]
     N_EXTENSIONS          [1 byte]
     N_EXTENSIONS times:
        EXT_FIELD_TYPE     [1 byte]
        EXT_FIELD_LEN      [1 byte]
        EXT_FIELD          [EXT_FIELD_LEN bytes]
     HANDSHAKE_AUTH        [MAC_LEN bytes]
     SIG_LEN               [2 bytes]
     SIG                   [SIG_LEN bytes]

   We propose a new EXT_FIELD_TYPE value:

      [01] -- DOS_PARAMETERS.

              If this flag is set, the extension should be used by the
              introduction point to learn what values the denial of service
              subsystem should be using.

   The EXT_FIELD content format is:

      N_PARAMS    [1 byte]
      N_PARAMS times:
         PARAM_TYPE  [1 byte]
         PARAM_VALUE [8 bytes]

   The PARAM_TYPE proposed values are:

      [01] -- DOS_INTRODUCE2_RATE_PER_SEC
              The rate per second of INTRODUCE2 cells relayed to the service.

      [02] -- DOS_INTRODUCE2_BURST_PER_SEC
              The burst per second of INTRODUCE2 cells relayed to the service.

   The PARAM_VALUE size is 8 bytes in order to accommodate 64-bit values
   (uint64_t). It MUST be within the specified limits for each PARAM_TYPE:

      [01] -- Min: 0, Max: 2147483647
      [02] -- Min: 0, Max: 2147483647
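
   To make the wire format concrete, here is a hedged Python sketch that
   encodes one DOS_PARAMETERS extension (EXT_FIELD_TYPE, EXT_FIELD_LEN and
   EXT_FIELD). The function and constant names are hypothetical, and the
   sketch assumes network (big-endian) byte order for PARAM_VALUE, which
   this section does not state explicitly.

```python
import struct

EXT_FIELD_TYPE_DOS_PARAMETERS = 0x01
DOS_INTRODUCE2_RATE_PER_SEC = 0x01
DOS_INTRODUCE2_BURST_PER_SEC = 0x02


def encode_dos_extension(rate_per_sec: int, burst_per_sec: int) -> bytes:
    """Encode a DOS_PARAMETERS extension (hypothetical helper)."""
    params = [
        (DOS_INTRODUCE2_RATE_PER_SEC, rate_per_sec),
        (DOS_INTRODUCE2_BURST_PER_SEC, burst_per_sec),
    ]
    # N_PARAMS [1 byte], then PARAM_TYPE [1 byte] + PARAM_VALUE [8 bytes] each.
    ext_field = struct.pack("!B", len(params))
    for param_type, value in params:
        # Assumption: big-endian encoding of the 64-bit value.
        ext_field += struct.pack("!BQ", param_type, value)
    # EXT_FIELD_TYPE [1 byte], EXT_FIELD_LEN [1 byte], then the field itself.
    header = struct.pack("!BB", EXT_FIELD_TYPE_DOS_PARAMETERS, len(ext_field))
    return header + ext_field
```

   With two parameters the result is 21 bytes long: 2 bytes of type and
   length followed by a 19-byte EXT_FIELD.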

   A value of 0 means the defense is disabled. If the rate per second is set
   to 0 (param 0x01) then the burst value should be ignored. And vice-versa,
   if the burst value is 0 (param 0x02), then the rate value should be
   ignored. In other words, setting one single parameter to 0 disables the
   INTRODUCE2 rate limiting defense.

   The burst can NOT be smaller than the rate. If so, the parameters should be
   ignored by the introduction point.

   The maximum is set to INT32_MAX (2^31 - 1). Our consensus parameters are
   capped at that limit, and these parameters are also consensus parameters,
   hence the common limit.

   Any valid value takes precedence over the network-wide consensus
   parameter.
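
   The rules above can be sketched as a small decision function on the
   introduction point side. This is only an illustration: the function name
   is hypothetical, and two behaviours are assumptions rather than part of
   this proposal, namely that out-of-range values are treated like other
   invalid parameters, and that "ignored" means falling back to the
   consensus values.

```python
from typing import Optional, Tuple

INT32_MAX = 2**31 - 1  # cap shared with the consensus parameters


def effective_dos_params(
    rate: int, burst: int, consensus: Optional[Tuple[int, int]]
) -> Optional[Tuple[int, int]]:
    """Return the (rate, burst) to enforce, or None if the defense is off.

    `consensus` is the network-wide (rate, burst) pair, or None when the
    consensus leaves the defense disabled.
    """
    if not (0 <= rate <= INT32_MAX and 0 <= burst <= INT32_MAX):
        # Assumption: out-of-range values make the extension invalid.
        return consensus
    if rate == 0 or burst == 0:
        # A single zero parameter disables INTRODUCE2 rate limiting.
        return None
    if burst < rate:
        # The burst can not be smaller than the rate: ignore the parameters.
        return consensus
    # Any valid value takes precedence over the consensus parameter.
    return (rate, burst)
```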

   This will increase the payload size by 21 bytes:

      The extension type and length fields add 2 bytes. The N_EXTENSIONS
      field is always present (currently set to 0), so it adds nothing.

      Then the EXT_FIELD is 19 bytes because one parameter is 9 bytes so for
      two parameters, it is 18 bytes plus 1 byte for the N_PARAMS for a total
      of 19.

   The ESTABLISH_INTRO v3 cell currently uses 134 bytes for its payload. With
   this increase, 343 bytes remain unused (498 maximum payload size minus 155
   bytes new payload).
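
   The arithmetic above can be checked with a few lines of Python. The
   constant names are illustrative only; the 134 and 498 byte figures are
   the ones quoted in this section.

```python
EXT_HEADER = 1 + 1          # EXT_FIELD_TYPE + EXT_FIELD_LEN
PARAM = 1 + 8               # PARAM_TYPE + PARAM_VALUE
EXT_FIELD = 1 + 2 * PARAM   # N_PARAMS + two parameters
INCREASE = EXT_HEADER + EXT_FIELD

CURRENT_PAYLOAD = 134       # current ESTABLISH_INTRO v3 payload, per the text
MAX_PAYLOAD = 498           # maximum payload size, per the text
NEW_PAYLOAD = CURRENT_PAYLOAD + INCREASE
REMAINING = MAX_PAYLOAD - NEW_PAYLOAD

print(INCREASE, NEW_PAYLOAD, REMAINING)  # prints: 21 155 343
```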

3. Protocol Version

   We introduce a new protocol version so that an onion service can
   specifically select introduction points supporting this new extension.
   The service should also use it to know whether to send this extension.

   The new version for the "HSIntro" protocol is:

      "5" -- support ESTABLISH_INTRO cell DoS parameters extension for onion
             service version 3 only.
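
   A service deciding whether to send the extension could check the
   introduction point's advertised protocol versions. The sketch below is
   hypothetical: it assumes the versions are available as a space-separated
   list of "Name=ranges" entries (for example "HSIntro=3-5"), and the
   function name is made up for illustration.

```python
def supports_protover(proto_line: str, name: str, version: int) -> bool:
    """Check whether a protocol list advertises `name` at `version`."""
    for entry in proto_line.split():
        key, _, ranges = entry.partition("=")
        if key != name:
            continue
        # Ranges are comma-separated, each either "N" or "LOW-HIGH".
        for part in ranges.split(","):
            low, _, high = part.partition("-")
            if not low:
                continue
            if int(low) <= version <= int(high or low):
                return True
    return False
```

   For example, supports_protover("HSIntro=3-5", "HSIntro", 5) is true,
   while a relay advertising only "HSIntro=3-4" would not be sent the
   extension.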

4. Configuration Options

   We also propose new torrc options in order for the operator to control
   those values passed through the ESTABLISH_INTRO cell.

      "HiddenServiceEnableIntroDoSDefense 0|1"

         If this option is set to 1, the onion service will always send the
         denial of service defense parameters to an introduction point that
         supports this extension (according to its protover), regardless of
         whether the consensus enables them or not. The values are taken from
         the HiddenServiceEnableIntroDoSRatePerSec and
         HiddenServiceEnableIntroDoSBurstPerSec torrc options.
         (Default: 0)

      "HiddenServiceEnableIntroDoSRatePerSec N sec"

         Controls the introduce rate per second the introduction point should
         impose on the introduction circuit. The default values are only used
         if the consensus param is not set.
         (Default: 25, Min: 0, Max: 4294967295)

      "HiddenServiceEnableIntroDoSBurstPerSec N sec"

         Controls the introduce burst per second the introduction point should
         impose on the introduction circuit. The default values are only used
         if the consensus param is not set.
         (Default: 200, Min: 0, Max: 4294967295)

   They respectively control the parameter type 0x01 and 0x02 in the
   ESTABLISH_INTRO cell detailed in section 2.

   The default values of the rate and burst are taken from ongoing anti-DoS
   implementation work [1][2]. They aren't meant to be defined with this
   proposal.

5. Security Considerations

   Using this new extension leaks to the introduction point the service's tor
   version. This could in theory help any kind of de-anonymization attack on a
   service since, at first, it places the service in a very small group of
   services running a tor version with this feature.

   Furthermore, when the first tor version supporting this extension is
   released, very few introduction points will be updated to that version.
   This means that we could end up in a situation where many services want to
   use this feature and thus select only a very small subset of relays
   supporting it, overloading them but also making it an easier vector for an
   attacker that wishes to be the service's introduction point.

   For the above reasons, we propose a new consensus parameter that will
   provide a "go ahead" for all services out there to start using this
   extension only if the introduction point supports it.

      "HiddenServiceEnableIntroDoSDefense"

         If set to 1, this makes tor start using this new proposed extension
         if supported by the introduction point (looking at the new protover).

   This parameter should be switched on when a majority of relays have
   upgraded to a tor version that supports this extension, which we believe
   will also give enough time for most services to move to this new stable
   version, making the anonymity set much bigger.

   We believe that there are services that do not care about anonymity on the
   service side and thus could benefit from this feature right away if they
   wish to use it.

6. Discussions

   One possible new avenue to explore is for the introduction point to send
   back a new type of cell which would tell the service that the DoS defenses
   have been triggered. It could include some statistics in the cell which can
   ultimately be reported back to the service operator to use those for better
   decisions for the parameters.

   It would also let the operator know that their service is under attack
   or very popular, which could mean it is time to increase or disable the
   denial of service defenses.

A. Acknowledgements

   This research was supported by NSF grants CNS-1526306 and CNS-1619454.

References:

[1] https://lists.torproject.org/pipermail/tor-dev/2019-May/013837.html
[2] https://trac.torproject.org/15516
Filename: 306-ipv6-happy-eyeballs.txt
Title: A Tor Implementation of IPv6 Happy Eyeballs
Author: Neel Chauhan
Created: 25-Jun-2019
Supercedes: 299
Status: Open
Ticket: https://trac.torproject.org/projects/tor/ticket/29801

1. Introduction

   As IPv4 address space becomes scarce, ISPs and organizations will deploy
   IPv6 in their networks. Right now, Tor clients connect to entry nodes using
   IPv4 connectivity by default.

   When networks first transition to IPv6, both IPv4 and IPv6 will be enabled
   on most networks in a so-called "dual-stack" configuration. This is to not
   break existing IPv4-only applications while enabling IPv6 connectivity.
   However, IPv6 connectivity may be unreliable and clients should be able
   to connect to the entry node using the most reliable technology, whether
   IPv4 or IPv6.

   In ticket #27490, we introduced the option ClientAutoIPv6ORPort which
   lets a client randomly choose between IPv4 or IPv6. However, this
   random decision does not take into account unreliable connectivity
   or falling back to the alternate IP version should one be unreliable
   or unavailable.

   One way to select between IPv4 and IPv6 on a dual-stack network is a
   so-called "Happy Eyeballs" algorithm as per RFC 8305. In such an
   algorithm, a client attempts the preferred IP family, whether IPv4 or
   IPv6. Should it work, the client sticks with the preferred IP family.
   Otherwise, the client attempts the alternate version. This means if a
   dual-stack client has both IPv4 and IPv6, and IPv6 is unreliable,
   preferred or not, the client uses IPv4, and vice versa. However, if IPv4
   and IPv6 are both equally reliable, and IPv6 is preferred, we use IPv6.

   In Proposal 299, we attempted an IP fallback mechanism using failure
   counters and preferring IPv4 or IPv6 based on the state of the counters.
   However, Prop299 was not standard Happy Eyeballs and an alternative,
   standards-compliant proposal was requested in [P299-TRAC] to avoid issues
   from complexity caused by randomness.

   This proposal describes a Tor implementation of Happy Eyeballs and is
   intended as a successor to Proposal 299.

2. Address/Relay Selection

   This section describes the necessary changes for address selection to
   implement Prop306.

2.1. Address Handling Changes

   To be able to handle Happy Eyeballs in Tor, we will need to modify the
   data structures used for connections to entry nodes, namely the extend info
   structure.

   Entry nodes are usually guards, but some clients don't use guards:

     * Bootstrapping clients can connect to fallback directory mirrors or
       authorities

     * v3 single onion services can use IPv4 or IPv6 addresses to connect
       to introduction and rendezvous points, and

     * Clients can be configured to disable entry guards

   Bridges are out of scope for this proposal, because Tor does not support
   multiple IP addresses in a single bridge line.

   The extend info structure should contain both an IPv4 and an IPv6 address.
   This will allow us to try both the IPv4 and the IPv6 addresses if both are
   available on a relay and the client is dual-stack.

   When processing:
     * relay descriptors,
     * hard-coded authority and fallback directory lists,
     * onion service descriptors, or
     * onion service introduce cells,
   and filling in the extend info data structure, we need to fill in both the
   IPv4 and IPv6 address if both are available. If only one family is
   available for a relay (IPv4 or IPv6), we should leave the other family null.
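
   A minimal Python sketch of such a structure follows. It is not tor's
   actual extend info layout: the class, field, and method names are
   hypothetical and only illustrate holding both families with a null
   (None) value for a missing one.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address
from typing import List, Optional, Tuple, Union

Address = Tuple[Union[IPv4Address, IPv6Address], int]  # (address, ORPort)


@dataclass
class ExtendInfo:
    """Hypothetical stand-in for tor's extend info structure."""

    ipv4: Optional[Address] = None  # None when the relay has no IPv4 ORPort
    ipv6: Optional[Address] = None  # None when the relay has no IPv6 ORPort

    def candidates(self, prefer_ipv6: bool = False) -> List[Address]:
        """Return the available addresses, preferred family first."""
        order = [self.ipv6, self.ipv4] if prefer_ipv6 else [self.ipv4, self.ipv6]
        return [addr for addr in order if addr is not None]
```

   A dual-stack relay yields two candidates, so a client can fall back to
   the alternate family; a single-family relay yields exactly one.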

2.2. Bootstrap Changes

   Tor's hard-coded authority and fallback directory mirror lists contain
   some entries with IPv6 ORPorts. As of January 2020, 56% of authorities and
   47% of fallback directories have IPv6.

   During bootstrapping, we should have an option for the maximum number of
   IPv4-only nodes, before the next node must have an IPv6 ORPort. The
   parameter is as follows:

     * MaxNumIPv4BootstrapAttempts NUM

   During bootstrap, the minimum fraction of nodes with IPv6 ORPorts will be
   1/(1 + MaxNumIPv4BootstrapAttempts). The average fraction will be larger
   than both the minimum fraction and the actual proportion of IPv6 ORPorts
   in the fallback directory list. (Clients mainly use fallback directories
   for bootstrapping.)

   Since this option is used during bootstrapping, it can not have a
   corresponding consensus parameter.

   The default value for MaxNumIPv4BootstrapAttempts should be 2. This
   means that every third bootstrap node must have an IPv6 ORPort. And on
   average, just over half of bootstrap nodes chosen by clients will have an
   IPv6 ORPort. This change won't have much impact on load-balancing, because
   almost half the fallback directory mirrors have IPv6 ORPorts.
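
   One possible reading of this rule is sketched below in Python. It is a
   simplification under stated assumptions: nodes are hypothetical dicts
   with a "has_ipv6" key, selection is uniform and with replacement
   (real selection is weighted), and the function name is made up.

```python
import random
from typing import List, Optional, Sequence


def pick_bootstrap_nodes(
    nodes: Sequence[dict],
    count: int,
    max_ipv4_attempts: int = 2,
    rng: Optional[random.Random] = None,
) -> List[dict]:
    """Pick `count` nodes; after `max_ipv4_attempts` IPv4-only picks in a
    row, the next pick must have an IPv6 ORPort (if any such node exists)."""
    if rng is None:
        rng = random.Random()
    all_nodes = list(nodes)
    ipv6_nodes = [n for n in all_nodes if n["has_ipv6"]]
    chosen: List[dict] = []
    ipv4_only_run = 0
    for _ in range(count):
        must_be_ipv6 = ipv4_only_run >= max_ipv4_attempts and bool(ipv6_nodes)
        node = rng.choice(ipv6_nodes if must_be_ipv6 else all_nodes)
        # Reset the run on an IPv6-capable pick, otherwise extend it.
        ipv4_only_run = 0 if node["has_ipv6"] else ipv4_only_run + 1
        chosen.append(node)
    return chosen
```

   With the default of 2, any three consecutive picks contain at least one
   node with an IPv6 ORPort, which gives the 1/(1 + 2) minimum fraction.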

   The minimum value of MaxNumIPv4BootstrapAttempts is 0. (Every bootstrap
   node must have an IPv6 ORPort. This setting is equivalent to
   ClientPreferIPv6ORPort 1.)

   The maximum value of MaxNumIPv4BootstrapAttempts should be 100. (Since
   most clients only make a few bootstrap connections, bootstrap nodes will
   be chosen at random, regardless of their IPv6 ORPorts.)

2.3. Guard Selection Changes

   When we select guard candidates, we should have an option for the number of
   primary IPv6 entry guards. The parameter is as follows:

     * NumIPv6Guards NUM

   If UseEntryGuards is set to 1, we will select exactly this number of IPv6
   relays for our primary guard list, which is the set of relays we strongly
   prefer when connecting to the Tor network. (This number should also apply
   to all of Tor's other guard lists, scaled up based on the relative size of
   the list.)

   If NUM is -1, we try to learn the number from the NumIPv6Guards
   consensus parameter. If the consensus parameter isn't set, we should
   default to 1.

   The default value for NumIPv6Guards should be -1. (Use the consensus
   parameter, or the underlying default value of 1.)

   As of September 2019, approximately 20% of Tor's guards supported IPv6,
   by consensus weight. (Excluding exits that are also guards, because
   clients avoid choosing exits in their guard lists.)

   If all Tor clients implement NumIPv6Guards, then these 20% of guards will
   handle approximately 33% of Tor's traffic. (Because the default value of
   NumPrimaryGuards is 3.) This may have a significant impact on Tor's
   load-balancing. Therefore, we should deploy this feature gradually, and try
   to increase the number of relays that support IPv6 to at least 33%.
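
   The load estimate above can be reproduced with a short calculation. This
   sketch assumes, as the paragraph does, that traffic is spread evenly
   across a client's primary guards; the variable names are illustrative.

```python
from fractions import Fraction

num_primary_guards = 3               # default NumPrimaryGuards
num_ipv6_guards = 1                  # proposed default NumIPv6Guards
ipv6_capacity = Fraction(20, 100)    # IPv6 guards' share of consensus weight

# Share of client traffic that lands on IPv6-capable guards.
ipv6_traffic = Fraction(num_ipv6_guards, num_primary_guards)

# Load relative to capacity: 1 means a guard carries its fair share.
ipv6_load = ipv6_traffic / ipv6_capacity
ipv4_only_load = (1 - ipv6_traffic) / (1 - ipv6_capacity)

print(ipv6_traffic, ipv6_load, ipv4_only_load)  # prints: 1/3 5/3 5/6
```

   Under these assumptions, IPv6-capable guards would carry about 1.67
   times their fair share, which is why gradual deployment is suggested.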

   To minimise the impact on load-balancing, IPv6 support should only be
   required for exactly NumIPv6Guards during guard list selection. All other
   guards should be IPv4-only guards. Once approximately 50% of guards support
   IPv6, NumIPv6Guards can become a minimum requirement, rather than an exact
   requirement.

   The minimum configurable value of NumIPv6Guards is -1. (Use the consensus
   parameter, or the underlying default.)

   The minimum resulting value of NumIPv6Guards is 0. (Guards will be chosen
   at random, regardless of their IPv6 ORPorts.)

   The maximum value of NumIPv6Guards should be the configured value of
   NumPrimaryGuards. (Every guard must have an IPv6 ORPort. This setting is
   equivalent to ClientPreferIPv6ORPort 1.)
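
   The resolution order for NumIPv6Guards can be sketched as follows. The
   function name is hypothetical, and clamping an over-large value to
   NumPrimaryGuards (rather than rejecting it) is an assumption of this
   sketch.

```python
from typing import Optional


def resolve_num_ipv6_guards(
    configured: int,
    consensus_value: Optional[int],
    num_primary_guards: int = 3,
) -> int:
    """Resolve the effective number of IPv6 primary guards."""
    if configured < -1:
        raise ValueError("NumIPv6Guards must be at least -1")
    if configured == -1:
        # Use the consensus parameter, or the underlying default of 1.
        value = consensus_value if consensus_value is not None else 1
    else:
        value = configured
    # Assumption: clamp to [0, NumPrimaryGuards] instead of rejecting.
    return max(0, min(value, num_primary_guards))
```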

3. Relay Connections

   If there is an existing authenticated connection, we must use it, just as
   we did before Prop306.

   If there is no existing authenticated connection for an entry node, tor
   currently attempts to connect using the first available, allowed, and
   preferred address. (Determined using the existing Client IPv4 and IPv6
   options.)

   We should also allow falling back to the alternate address. For this,
   a design will be given in Section 3.1.

3.1. TCP Connection to Preferred Address On First TCP Success

   In this design, we will connect via TCP to the first preferred address.
   On a failure or after a 250 msec delay, we attempt to connect via TCP to
   the alternate address. On a success, Tor attempts to authenticate and
   closes the other connection.

   This design is close to RFC 8305 and is similar to how Happy Eyeballs
   is implemented in a web browser.
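
   The following Python asyncio sketch illustrates this design. It is not
   tor code: the function name is hypothetical, the `connect` callable is
   an injected stand-in for a TCP connection attempt, and authentication is
   left to the caller.

```python
import asyncio
from typing import Awaitable, Callable, Optional

CONNECTION_ATTEMPT_DELAY = 0.250  # seconds, the delay proposed here


async def connect_first_success(
    preferred: str,
    alternate: Optional[str],
    connect: Callable[[str], Awaitable[object]],
    delay: float = CONNECTION_ATTEMPT_DELAY,
) -> object:
    """Try `preferred`; on failure or after `delay`, also try `alternate`.

    Returns the first connection that succeeds and cancels any attempt
    still in progress. Raises ConnectionError if every attempt fails.
    """
    pending = {asyncio.ensure_future(connect(preferred))}
    # Wait for the preferred attempt, but no longer than `delay`.
    done, pending = await asyncio.wait(pending, timeout=delay)
    winner = next((t for t in done if t.exception() is None), None)
    if winner is None and alternate is not None:
        pending.add(asyncio.ensure_future(connect(alternate)))
    while winner is None and pending:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        winner = next((t for t in done if t.exception() is None), None)
    for task in pending:
        task.cancel()  # stop the losing attempt that is still in progress
    if winner is None:
        raise ConnectionError("all connection attempts failed")
    return winner.result()
```

   Injecting `connect` keeps the sketch testable without a network; a real
   client would pass a function that opens a TCP connection to the relay's
   ORPort.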

3.2. Handling Connection Successes And Failures

   Should a connection to an entry node succeed and be authenticated via TLS,
   we can then use the connection. In this case, we should cancel all other
   connection timers and in-progress connections. Cancelling the timers is
   necessary so we don't attempt new unnecessary connections when our
   existing connection is successful, preventing denial-of-service risks.

   However, if we fail all available and allowed connections, we should tell
   the rest of Tor that the connection has failed. This is so we can attempt
   another entry node.

3.3. Connection Attempt Delays

   As mentioned in [TEOR-P306-REP], initially, clients should prefer IPv4
   by default. The Connection Attempt Delay, or delay between IPv4 and IPv6
   connections should be 250 msec. This is to avoid the overhead from tunneled
   IPv6 connections.

   The Connection Attempt Delay should not be dynamically adjusted, as it adds
   privacy risks. This value should be fixed, and could be manually adjusted
   using this torrc option or consensus parameter:

     * ConnectionAttemptDelay N [msec|second]

   The Minimum and Maximum Connection Attempt Delays should also not be
   dynamically adjusted for privacy reasons. The Minimum should be fixed at
   10 msec as per RFC 8305. But the maximum should be higher than the RFC 8305
   recommendation of 2 seconds. For Tor, we should make this timeout value 30
   seconds to match Tor's existing timeout.

   We need to make it possible for users to set the Maximum Connection Attempt
   Delay value higher for slower and higher-latency networks such as dial-up
   and satellite.
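
   As a sketch of how a configured delay might be bounded, consider the
   helper below. The names are hypothetical, and clamping out-of-range
   values (rather than rejecting the configuration) is an assumption; the
   10 msec, 250 msec, and 30 second figures come from this section.

```python
from typing import Optional

MIN_CONNECTION_ATTEMPT_DELAY_MS = 10          # fixed minimum, per RFC 8305
DEFAULT_CONNECTION_ATTEMPT_DELAY_MS = 250     # proposed default
DEFAULT_MAX_CONNECTION_ATTEMPT_DELAY_MS = 30_000  # matches tor's timeout


def effective_attempt_delay_ms(
    requested_ms: Optional[int] = None,
    max_ms: int = DEFAULT_MAX_CONNECTION_ATTEMPT_DELAY_MS,
) -> int:
    """Return the delay to use, bounded by the minimum and maximum.

    `max_ms` is a parameter because users on slow links may need to raise
    the maximum.
    """
    if requested_ms is None:
        return DEFAULT_CONNECTION_ATTEMPT_DELAY_MS
    return max(MIN_CONNECTION_ATTEMPT_DELAY_MS, min(requested_ms, max_ms))
```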

4. Option Changes

   As we enable IPv6-enabled clients to connect out of the box, we should
   adjust the default options to enable IPv6 while not breaking IPv4-only
   clients.

   The new default options should be:

    * ClientUseIPv4 1 (to enable IPv4)

    * ClientUseIPv6 1 (to enable IPv6)

    * ClientPreferIPv6ORPort 0 (for load-balancing reasons so we don't
      overload IPv6-only guards)

    * ConnectionAttemptDelay 250 msec (the recommended delay between IPv4
      and IPv6, as per RFC 8305)

   One thing to note is that clients should be able to connect with the above
   options on IPv4-only, dual-stack, and IPv6-only networks, and they should
   also work if ClientPreferIPv6ORPort is 1. But we shouldn't expect
   IPv4 or IPv6 to work if ClientUseIPv4 or ClientUseIPv6 is set to 0.

   When the majority of clients and relays are IPv6-capable, we could set the
   default value of ClientPreferIPv6ORPort to 1, in order to take advantage
   of IPv6. We could add a ClientPreferIPv6ORPort consensus parameter, so we
   can make this change network-wide.

5. Relay Statistics

   Entry nodes could measure the following statistics for both IPv4 and IPv6:

     * Number of successful connections

     * Number of extra Prop306 connections (unsuccessful or cancelled)
       * Client closes the connection before completing TLS
       * Client closes the connection before sending any circuit or data cells

     * Number of client and relay connections
       * We can distinguish between authenticated (relay, authority
         reachability) and unauthenticated (client, bridge) connections

   Should we implement Section 5:

     * We can send this information to the directory authorities using relay
       extra-info descriptors

     * We should consider the privacy implications of these statistics, and
       how much noise we need to add to them

     * We can include these statistics in the Heartbeat logs

6. Initial Feasibility Testing

   We should test this proposal with the following scenarios:

    * Different combinations of values for the options ClientUseIPv4,
      ClientUseIPv6, and ClientPreferIPv6ORPort on IPv4-only, IPv6-only,
      and dual-stack connections

    * Dual-stack connections of different technologies, including
      high-bandwidth and low-latency (e.g. FTTH), moderate-bandwidth and
      moderate-latency (e.g. DSL, LTE), and high-latency and low-bandwidth
      (e.g. satellite, dial-up) to see if Prop306 is reliable and feasible

7. Minimum Viable Prop306 Product

   The minimum viable product for Prop306 must include the following:

    * The address handling, bootstrap, and entry guard changes described in
      Section 2. (Single Onion Services are optional, Bridge Clients are out
      of scope. The consensus parameter and torrc options are optional.)

    * The alternative address retry algorithm in Section 3.1.

    * The Connection Success/Failure mechanism in Section 3.2.

    * The Connection Delay mechanism in Section 3.3. (The
      ConnectionAttemptDelay torrc option and consensus parameter are
      optional.)

    * A default setup capable of both IPv4 and IPv6 connections with the
      options described in Section 4. (The ClientPreferIPv6ORPort consensus
      parameter is optional.)

8. Optional Features

   Some features which are optional include:

    * Single Onion services: extend info address changes for onion service
      descriptors and introduce cells. (Section 2.1.)

    * Bridge clients are out of scope: they would require bridge line format
      changes, internal bridge data structure changes, and extend info address
      changes. (Section 2.1.)

    * MaxNumIPv4BootstrapAttempts torrc option. We may need this option if
      the proposed default doesn't work for some clients. (Section 2.2.)

    * NumIPv6Guards torrc option and consensus parameter. We may need this
      option if the proposed default doesn't work for some clients.
      (Section 2.3.)

    * ConnectionAttemptDelay torrc option and consensus parameter. We will need
      this option if the Connection Attempt Delay needs to be manually
      adjusted, for instance, if clients often fail IPv6 connections.
      (Section 3.3.)

    * ClientPreferIPv6ORPort consensus parameter. (Section 4.)

    * IPv4, IPv6, client, relay, and extra Prop306 connection statistics.
      While optional, these statistics may be useful for debugging and
      reliability testing, and metrics on IPv4 vs IPv6. (Section 5.)

9. Acknowledgments

   Thank you so much to teor for your discussion on this happy eyeballs
   proposal. I wouldn't have been able to do this had it not been for
   your help.

10. References

   [P299-TRAC]: https://trac.torproject.org/projects/tor/ticket/29801

   [TEOR-P306-REP]: https://lists.torproject.org/pipermail/tor-dev/2019-July/013919.html
Filename: 307-onionbalance-v3.txt
Title: Onion Balance Support for Onion Service v3
Author: Nick Mathewson
Created: 03-April-2019
Status: Reserve

   [This proposal is currently in reserve status because bug tor#29583 makes
   it unnecessary. (2020 July 31)]

0. Draft Notes

   2019-07-25:

      At this point in time, the cross-certification is not implemented
      correctly in >= tor-0.3.2.1-alpha. See https://trac.torproject.org/29583
      for more details.

      This proposal assumes that this bug is fixed.

1. Introduction

   The OnionBalance tool allows several independent Tor instances to host an
   onion service, while clients can access that onion service without having
   to take its distributed status into account. OnionBalance works by having
   each instance run a separate onion service. Then, a management server
   periodically downloads the descriptors from those onion services, and
   generates a new descriptor containing the introduction points from each
   instance's onion service.

   OnionBalance is used by several high-profile onion services, including
   Facebook and The Tor Project.

   Unfortunately, because of the cross-certification features in v3 onion
   services, OnionBalance no longer works for them. To a certain extent, this
   breakage is because of a security improvement: It's probably a good thing
   that random third parties can no longer grab an onion service's introduction
   points and claim that they are introduction points for a different service.
   But nonetheless, a lack of a working OnionBalance remains an obstacle for
   v3 onion service migration.

   This proposal describes extensions to v3 onion service design to
   accommodate OnionBalance.

2. Background and Solution

   If an OnionBalance management server wants to provide an aggregate
   descriptor for a v3 onion service, it faces several obstacles that it
   didn't have in v2.

   When the management server goes to construct an aggregated descriptor, it
   will have a mismatch on the "auth-key", "enc-key-cert", and
   "legacy-key-cert" fields: these fields are supposed to certify the onion
   service's current descriptor-signing key, but each of these keys will be
   generated independently by each instance. Because they won't match each
   other, there is no possible key that the aggregated descriptor could use
   for its descriptor signing key.

   In this design, we require that each instance should know in advance about
   a descriptor-signing public key that the aggregate descriptor will use for
   each time period. (I'll explain how they can do this later, in section 3
   below.) They don't have to know the corresponding private key.

   When generating their own onion service descriptors for a given time
   period, the instances generate these additional fields to be used for the
   aggregate descriptor:

       "meta-auth-key"
       "meta-enc-key-cert"
       "meta-legacy-key-cert"

   These fields correspond to "auth-key", "enc-key-cert", and
   "legacy-key-cert" respectively, but differ in one regard: the
   descriptor-signing public key that they certify is _not_ the instance's own
   descriptor-signing key, but rather the aggregate public key for the time
   period.

   Ordinary clients ignore these new fields.

   When the management server creates the aggregate descriptor, it checks that
   the signing key for each of these "meta" fields matches the signing key for
   its corresponding non-"meta" field, and that they certify the correct
   descriptor-signing key-- and then uses these fields in place of their
   corresponding non-"meta" variants.

2.1. A quick note on synchronization

   In the design above, and in the section below, I frequently refer to "the
   current time period". By this, I mean the time period for which the
   descriptor is encoded, not the time period in which it is generated.

   Instances and management servers should generate descriptors for the two
   closest time periods, as they do today: no additional synchronization
   should be needed here.

3. How to distribute descriptor-signing keys

   The design requires that every instance of the onion service knows about
   the public descriptor-signing key that will be used for the aggregate onion
   service. Here I'll discuss how this can be achieved.

3.1. If the instances are trusted.

   If the management server trusts each of the instances, it can distribute a
   shared secret to each one of them, and use this shared secret to derive
   each time period's private key.

   For example, if the shared secret is SK, then the private descriptor-
   signing key for each time period could be derived as:

        H("meta-descriptor-signing-key-deriv" |
           onion_service_identity |
           INT_8(period_num) |
           INT_8(period_length) |
           SK )

   (Remember that in the terminology of rend-spec-v3, INT_8() denotes a 64-bit
   integer, see section 0.2 in rend-spec-v3.txt.)
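
   A hedged Python sketch of this derivation follows. It assumes that H is
   SHA3-256 and that INT_8() is big-endian, following the conventions of
   rend-spec-v3; the function names are hypothetical, and turning the
   32-byte output into an actual signing key is out of scope here.

```python
import hashlib


def int_8(value: int) -> bytes:
    """INT_8(): a 64-bit integer encoded in 8 bytes (assumed big-endian)."""
    return value.to_bytes(8, "big")


def derive_period_key_material(
    onion_service_identity: bytes,
    period_num: int,
    period_length: int,
    shared_secret: bytes,
) -> bytes:
    """Derive per-period key material from the shared secret SK."""
    h = hashlib.sha3_256()  # assumption: H is SHA3-256
    h.update(b"meta-descriptor-signing-key-deriv")
    h.update(onion_service_identity)
    h.update(int_8(period_num))
    h.update(int_8(period_length))
    h.update(shared_secret)
    return h.digest()
```

   Every instance holding SK computes the same value for a given time
   period, and different periods yield unrelated values.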

   If the shared secret is ever compromised, then an attacker can impersonate
   the onion service until the shared secret is changed, and can correlate
   all past descriptors for the onion service.

3.2. If the instances are not trusted: Option One

   If the management server does not trust the instances with
   descriptor-signing public keys, another option for it is to simply
   distribute a load of public keys in advance, and use them according to a
   schedule.

   In this design, the management server would pre-generate the
   "descriptor-signing-key-cert" fields for a long time in advance, and
   distribute them to the instances offline. Each one would be
   associated with its corresponding time period.

   If these certificates were revealed to an attacker, the attacker
   could correlate descriptors for the onion service with one another,
   but could not impersonate the service.

3.3. If the instances are not trusted: Option Two

   Another option for the trust model of 3.2 above is to use the same
   key-blinding method as used for v3 onion services. The management server
   would hold a private descriptor-signing key, and use it to derive a
   different private descriptor-signing key for each time period. The instance
   servers would hold the corresponding public key, and use it to derive a
   different public descriptor-signing key for each time period.

   (For security, the key-blinding function in this case should use a
   different nonce than used in the)

   This design would allow the instances to only be configured once, which
   would be simpler than 3.2 above-- but at a cost. The management server's
   use of a long-term private descriptor-signing key would require it to keep
   that key online. (It could keep the derived private descriptor-signing keys
   online, but the parent key could be derived from them.)

   Here, if the instance's knowledge were revealed to an attacker, the attacker
   could correlate descriptors for the onion service with one another, but
   could not impersonate the service.

4. Some features of this proposal

   We retain the property that each instance service remains accessible as a
   working onion service. However, anyone who can access it can identify it as
   an instance of an OnionBalance service, and correlate its descriptor to the
   aggregate descriptor.

   Instances could use client authorization to ensure that only the management
   server can decrypt their introduction points. However, because of the
   key-blinding features of v3 onion services, nobody who doesn't know the
   onion addresses for the instances can access them anyway: It would be
   sufficient to keep these addresses secret.

   Although anybody who successfully accesses an instance can correlate its
   descriptor to the meta-descriptor, this only works for two descriptors
   within a single time period: You can't match an instance descriptor from
   one time period to a meta-descriptor from another.

A. Acknowledgments

   Thanks to the network team for helping me clarify my ideas here, explore
   options, and better understand some of the implementations and challenges
   in this problem space.

   This research was supported by NSF grants CNS-1526306 and CNS-1619454.
Filename: 308-counter-galois-onion.txt
Title: Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
Authors: Jean Paul Degabriele, Alessandro Melloni, Martijn Stam
Created: 13 Sep 2019
Last-Modified: 13 Sep 2019
Status: Superseded


    NOTE: This proposal is superseded by an improved version of the
    Counter Galois Onion design based on the authors' forthcoming
    paper, "Counter Galois Onion: Fast, Forward-Secure, and
    Non-Malleable Onion Encryption for Tor".  The improved proposal
    will be publicly available once the paper is closer to being
    ready for publication.
      -nickm


1. Background and Motivation

    In Proposal 202, Mathewson expressed the need to update Tor's Relay
    cryptography and protect against tagging attacks. Towards this goal he
    outlined two possible approaches for constructing an onion encryption
    scheme that should be able to withstand tagging attacks. Later, in
    Proposal 261, Mathewson proposed a concrete scheme based on the
    tweakable wide-block cipher AEZ. The security of Proposal 261 was
    analysed in [DS18]. An alternative scheme was suggested in Proposal 295
    which combines an instantiation of the PIV construction from [ST13] and
    a variant of the GCM-RUP construction from [ADL17]. In this document we
    propose yet another scheme, Counter Galois Onion (CGO)
    which improves over proposals 261 and 295 in a number of ways. CGO has
    a minimalistic design requiring only a block cipher in counter-mode and
    a universal hash function. To take advantage of Intel's AES-NI and
    PCLMULQDQ instructions we recommend using AES and POLYVAL [GLL18]. In
    terms of security, it protects against tagging attacks while
    simultaneously providing forward security with respect to end-to-end
    authenticity and confidentiality. Furthermore CGO performs better than
    proposal 295 in terms of efficiency and its support of "leaky pipes".


1.2 Design Overview

    CGO makes do with a universal hash function while simultaneously
    satisfying forward security. It employs two distinct types of
    encryption, a dynamic encryption scheme DEnc and a static encryption
    scheme SEnc. DEnc is used for end-to-end encryption (layer n) and SEnc
    is used for the intermediate layers (n-1 to 1). DEnc is a Forward-
    Secure Authenticated Encryption scheme for securing end-to-end
    communication and SEnc provides the non-malleability for protecting
    against tagging attacks. In order to provide forward security, the key
    material in DEnc is updated with every encryption whereas in SEnc the
    key material is static. To support leaky pipes, in the forward
    direction each OR first attempts a partial decryption using DEnc and
    if it fails it reverts to decrypting using SEnc. The rest of the
    document describes the scheme's operation in terms of the low-level
    primitives and we make no further mention of DEnc and SEnc. However,
    on an intuitive level it can be helpful to think of:

    a) the combinations of E(KSf_I, *) and PH(HSf_I, *) as well as
    E(KDf_I, *) and PH(HDf_I, *) as two instances of a tweakable block
    cipher,

    b) the operation E(Sf_I, <0>) | E(Sf_I, <1>) |  E(Sf_I, <2>) | ... as a
    PRG with seed Sf_I,

    c) and E(JSf_I, <IV>) | E(JSf_I, <IV+1>) | ... | E(JSf_I, <IV+31>) as
    counter-mode encryption with <IV> as the initial vector.


2. Preliminaries

2.1. Notation

   Symbol               Meaning
   ------               -------
   M                    Plaintext
   Sf_I                 PRG Seed, forward direction, layer I
   Sb_I                 PRG Seed, backward direction, layer I
   Cf_I                 Ciphertext, forward direction, layer I
   Cb_I                 Ciphertext, backward direction, layer I
   Tf_I                 Tag, forward direction, layer I
   LTf_I                Last Tag, forward direction, layer I
   Tb_I                 Tag, backward direction, layer I
   LTb_I                Last Tag, backward direction, layer I
   Nf_I                 Nonce, forward direction, layer I
   LNf_I                Last Nonce, forward direction, layer I
   Nb_I                 Nonce, backward direction, layer I
   LNb_I                Last Nonce, backward direction, layer I
   JSf_I                Static Block Cipher Key, forward direction, layer I
   JSb_I                Static Block Cipher Key, backward direction, layer I
   KSf_I                Static Block Cipher Key, forward direction, layer I
   KSb_I                Static Block Cipher Key, backward direction, layer I
   KDf_I                Dynamic Block Cipher Key, forward direction, layer I
   KDb_I                Dynamic Block Cipher Key, backward direction, layer I
   HSf_I                Static Poly-Hash Key, forward direction, layer I
   HSb_I                Static Poly-Hash Key, backward direction, layer I
   HDf_I                Dynamic Poly-Hash Key, forward direction, layer I
   HDb_I                Dynamic Poly-Hash Key, backward direction, layer I
   ^                    Bitwise XOR operator
   |                    Concatenation
   &&                   Logical AND operator
   Z[a, b]              For a string Z, the substring from byte a to byte b
                        (indexing starts at 1)
   INT(X)               Translate string X into an unsigned integer

2.2. Security parameters

   POLY_HASH_LEN -- The length of the polynomial hash function's output,
   in bytes. For POLYVAL, POLY_HASH_LEN = 16.

   PAYLOAD_LEN -- The longest allowable cell payload, in bytes (509).

   HASH_KEY_LEN -- The key length, in bytes, used to digest messages.
   For POLYVAL, HASH_KEY_LEN = 16.

   BC_KEY_LEN -- The key length, in bytes, of the block cipher used. For
   AES we recommend BC_KEY_LEN = 16.

   BC_BLOCK_LEN -- The block length, in bytes, of the block cipher used.
   For AES, BC_BLOCK_LEN = 16.

2.3. Primitives

   The polynomial hash function is POLYVAL with a HASH_KEY_LEN-byte key. We
   write this as PH(H, M) where H is the key and M the message to be hashed.

   We use AES with a BC_KEY_LEN-byte key. For AES encryption (resp.,
   decryption) we write E(K, X) (resp., D(K, X)) where K is a BC_KEY_LEN-byte
   key and X the block to be encrypted (resp., decrypted). For an integer
   j, we use <j> to denote the string of length BC_BLOCK_LEN representing
   that integer.

2.4 Key derivation and initialisation (replaces Section 5.2.2)

   For newer KDF needs, Tor uses the key derivation function HKDF from
   RFC5869, instantiated with SHA256.  (This is due to a construction
   from Krawczyk.)  The generated key material is:

     K = K_1 | K_2 | K_3 | ...

   Where H(x, t) is HMAC_SHA256 with value x and key t
   and K_1     = H(m_expand | INT8(1) , KEY_SEED )
   and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED )
   and m_expand is an arbitrarily chosen value,
   and INT8(i) is an octet with the value "i".

   In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand,
   salt == t_key, and IKM == secret_input.
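
   The expansion above can be sketched with the Python standard library
   as follows. The function name `kdf_expand` is hypothetical; the
   values passed as KEY_SEED and m_expand are whatever the handshake
   supplies and are not defined here.

```python
import hashlib
import hmac


def kdf_expand(key_seed: bytes, m_expand: bytes, length: int) -> bytes:
    """Return `length` bytes of K = K_1 | K_2 | ... as defined above."""
    if length > 255 * 32:
        raise ValueError("INT8 counter would overflow")
    out = b""
    k_i = b""   # empty for K_1, then the previous block for K_(i+1)
    i = 1
    while len(out) < length:
        # H(x, t) is HMAC-SHA256 with value x and key t, so KEY_SEED is the key.
        k_i = hmac.new(key_seed, k_i + m_expand + bytes([i]),
                       hashlib.sha256).digest()
        out += k_i
        i += 1
    return out[:length]
```

   Each K_i is 32 bytes long, so a request for L bytes takes
   ceil(L / 32) HMAC invocations.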

2.4.1. Key derivation using the KDF

   When used in the ntor handshake, for each layer I, the key material is
   split into the following sequence of contiguous values:

   Length             Purpose                    Notation
   ------             -------                    --------
   BC_KEY_LEN         forward Seed               Sf_I
   BC_KEY_LEN         backward Seed              Sb_I

   if (I < n) in addition derive the following static keys:

   BC_KEY_LEN         forward BC Key             KSf_I
   BC_KEY_LEN         backward BC Key            KSb_I
   BC_KEY_LEN         forward CTR Key            JSf_I
   BC_KEY_LEN         backward CTR Key           JSb_I
   HASH_KEY_LEN       forward poly hash key      HSf_I
   HASH_KEY_LEN       backward poly hash key     HSb_I

   Excess bytes from K are discarded.
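
   As an illustration, the split above might be coded as follows. The
   function name and the dictionary keys are hypothetical; the field
   order and lengths follow the table.

```python
BC_KEY_LEN = 16
HASH_KEY_LEN = 16


def split_key_material(k: bytes, layer: int, n: int) -> dict:
    """Split KDF output K into the named values for layer `layer` of `n`."""
    fields = [("Sf", BC_KEY_LEN), ("Sb", BC_KEY_LEN)]
    if layer < n:
        # Static keys exist only for the intermediate layers.
        fields += [("KSf", BC_KEY_LEN), ("KSb", BC_KEY_LEN),
                   ("JSf", BC_KEY_LEN), ("JSb", BC_KEY_LEN),
                   ("HSf", HASH_KEY_LEN), ("HSb", HASH_KEY_LEN)]
    needed = sum(size for _, size in fields)
    if len(k) < needed:
        raise ValueError("not enough key material")
    keys, offset = {}, 0
    for name, size in fields:
        keys[name] = k[offset:offset + size]
        offset += size
    return keys   # any excess bytes of k are simply ignored
```

   With the recommended lengths, layer n consumes 32 bytes of K and
   every other layer consumes 128 bytes.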

2.4.2. Initialisation from Seed

   For each layer I compute E(Sf_I, <0>) | E(Sf_I, <1>) |  E(Sf_I, <2>) | ...
   and parse the output as:

   Length             Purpose                    Notation
   ------             -------                    --------
   BC_BLOCK_LEN       forward Nonce              Nf_I
   BC_KEY_LEN         forward BC Key             KDf_I
   HASH_KEY_LEN       forward poly hash key      HDf_I
   BC_KEY_LEN         new forward Seed           Sf'_I

   Discard excess bytes, replace Sf_I with Sf'_I, and set LNf_I and LTf_I
   to the zero string.

   Similarly for the backward direction, compute E(Sb_I, <0>) | E(Sb_I, <1>)
   | E(Sb_I, <2>) | ... and parse the output as:

   Length             Purpose                    Notation
   ------             -------                    --------
   BC_BLOCK_LEN       backward Nonce             Nb_I
   BC_KEY_LEN         backward BC Key            KDb_I
   HASH_KEY_LEN       backward poly hash key     HDb_I
   BC_KEY_LEN         new backward Seed          Sb'_I

   Discard excess bytes, replace Sb_I with Sb'_I, and set LNb_I and LTb_I
   to the zero string.

   NOTE: For layers n-1 to 1 the values Nf_I, KDf_I, HDf_I, Sf_I and their
   backward counterparts are only required in order to support leaky
   pipes. If leaky pipes are not required, these values can be safely
   omitted.
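
   A minimal sketch of the initialisation from a seed, for one direction
   of one layer, is given below. Two things in it are assumptions and
   not part of this proposal: the Python standard library has no AES, so
   E() is replaced by a truncated HMAC-SHA256 stand-in, and <j> is
   encoded as a big-endian integer. The function and dictionary key
   names are hypothetical.

```python
import hashlib
import hmac

BC_BLOCK_LEN = BC_KEY_LEN = HASH_KEY_LEN = 16


def E(key: bytes, block: bytes) -> bytes:
    # STAND-IN for AES-128 (not in the Python stdlib): truncated HMAC-SHA256.
    # It behaves as a PRF here, not as a real block cipher.
    return hmac.new(key, block, hashlib.sha256).digest()[:BC_BLOCK_LEN]


def prg(seed: bytes, length: int) -> bytes:
    """E(seed, <0>) | E(seed, <1>) | ... truncated to `length` bytes."""
    out, j = b"", 0
    while len(out) < length:
        out += E(seed, j.to_bytes(BC_BLOCK_LEN, "big"))  # <j>, assumed big-endian
        j += 1
    return out[:length]


def init_from_seed(seed: bytes) -> dict:
    """Derive the initial dynamic state for one direction of one layer."""
    stream = prg(seed, BC_BLOCK_LEN + 2 * BC_KEY_LEN + HASH_KEY_LEN)
    return {
        "N": stream[0:16],          # nonce
        "KD": stream[16:32],        # dynamic block cipher key
        "HD": stream[32:48],        # dynamic poly-hash key
        "S": stream[48:64],         # new seed, replaces the old one
        "LN": bytes(BC_BLOCK_LEN),  # last nonce starts as the zero string
        "LT": bytes(BC_BLOCK_LEN),  # last tag starts as the zero string
    }
```

   Because the old seed is overwritten by the new one, a later state
   compromise does not by itself reveal the values derived earlier.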


3. Routing relay cells

   Let n denote the number of nodes in the circuit. Then encryption layer n
   corresponds to the encryption between the OP and the exit/destination
   node.


3.1. Forward Direction

   The forward direction is the direction that CREATE/CREATE2 cells
   are sent.


3.1.1. Routing From the Origin

   When an OP sends a relay cell, the cell is produced as follows:

   The OP computes E(Sf_n, <0>) | E(Sf_n, <1>) |  E(Sf_n, <2>) | ...
   and parses the output as

   Length             Purpose                    Notation
   ------             -------                    --------
   509                encryption pad             Z
   BC_BLOCK_LEN       forward Nonce              Nf'_n
   BC_KEY_LEN         forward BC Key             KDf'_n
   HASH_KEY_LEN       forward poly hash key      HDf'_n
   BC_KEY_LEN         new forward Seed           Sf'_n

   Excess bytes are discarded. It then computes the n'th layer ciphertext
   (Tf_n, Cf_n) as follows:

   Cf_n = M ^ Z
   X_n = PH(HDf_n, (LNf_n | Cf_n))
   Y_n = Nf_n ^ X_n
   Tf_n = E(KDf_n, Y_n) ^ X_n

   and updates its state by overwriting the old variables with the new
   ones.

   LNf_n = Nf_n
   Nf_n = Nf'_n
   KDf_n = KDf'_n
   HDf_n = HDf'_n
   Sf_n = Sf'_n

   It then applies the remaining n-1 layers of encryption to (Tf_n, Cf_n)
   as follows:

   For I = n-1 to 1:
     IV = INT(Tf_{I+1})
     Z  = E(JSf_I, <IV>) | E(JSf_I, <IV+1>) | ... | E(JSf_I, <IV+31>)
     % BC_BLOCK_LEN = 16
     Cf_I = Cf_{I+1} ^ Z[1, 509]
     X_I = PH(HSf_I, (LTf_{I+1} | Cf_I))
     Y_I = Tf_{I+1} ^ X_I
     Tf_I = E(KSf_I, Y_I) ^ X_I
     LTf_{I+1} = Tf_{I+1}

   Upon completion the OP sends (Tf_1, Cf_1) to node 1.
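
   The counter-mode pad used in the loop above needs 32 blocks because
   ceil(509 / 16) = 32, which yields 512 bytes of which the first 509
   are used. A sketch follows. It rests on assumptions that this
   proposal does not state: E() is a truncated HMAC-SHA256 stand-in for
   AES, INT() and <j> are big-endian, and the counter wraps modulo
   2^128. The names `ctr_pad` and `xor` are hypothetical.

```python
import hashlib
import hmac

BC_BLOCK_LEN = 16
PAYLOAD_LEN = 509


def E(key: bytes, block: bytes) -> bytes:
    # STAND-IN for AES-128 (not in the Python stdlib): truncated HMAC-SHA256.
    return hmac.new(key, block, hashlib.sha256).digest()[:BC_BLOCK_LEN]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_pad(js_key: bytes, tag: bytes) -> bytes:
    """Return Z[1, 509] for IV = INT(tag)."""
    iv = int.from_bytes(tag, "big")                 # INT(T), assumed big-endian
    blocks = -(-PAYLOAD_LEN // BC_BLOCK_LEN)        # ceil(509 / 16) = 32
    z = b"".join(
        # Counter assumed to wrap modulo 2^128 so that <IV+j> always fits.
        E(js_key, ((iv + j) % (1 << 128)).to_bytes(BC_BLOCK_LEN, "big"))
        for j in range(blocks)
    )
    return z[:PAYLOAD_LEN]
```

   Since the pad is applied with XOR, applying it a second time with the
   same key and tag removes it again.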


3.1.2. Relaying Forward at Onion Routers

   When a forward relay cell (Tf_I, Cf_I) is received by OR I, it
   performs the following steps:

   'Forward' relay cell:

    X_I = PH(HDf_I, (LNf_I | Cf_I))
    Y_I = Tf_I ^ X_I
    if (Nf_I == D(KDf_I, Y_I) ^ X_I)  % cell recognized and authenticated
      compute E(Sf_I, <0>) | E(Sf_I, <1>) |  E(Sf_I, <2>) | ... and parse the
      output as Z, Nf'_I, KDf'_I, HDf'_I, Sf'_I

      M = Cf_I ^ Z
      LNf_I = Nf_I
      Nf_I = Nf'_I
      KDf_I = KDf'_I
      HDf_I = HDf'_I
      Sf_I = Sf'_I

      return M

    else if (I == n)    % last node, decryption has failed
      send DESTROY cell to tear down the circuit

    else    % decrypt and forward cell
      X_I = PH(HSf_I, (LTf_{I+1} | Cf_I))
      Y_I = Tf_I ^ X_I
      Tf_{I+1} = D(KSf_I, Y_I) ^ X_I
      IV = INT(Tf_{I+1})
      Z  = E(JSf_I, <IV>) | E(JSf_I, <IV+1>) | ... | E(JSf_I, <IV+31>)
      % BC_BLOCK_LEN = 16
      Cf_{I+1} = Cf_I ^ Z[1, 509]

      forward (Tf_{I+1}, Cf_{I+1}) to OR I+1
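
   The recognition test relies on the tag construction being invertible:
   the sender computes T = E(KD, N ^ X) ^ X and the receiver checks that
   N == D(KD, T ^ X) ^ X, where X = PH(HD, LN | C). The sketch below
   demonstrates only this round trip. Both primitives in it are toy
   stand-ins chosen so that the example runs with the Python standard
   library: E/D is a 4-round Feistel permutation built from HMAC-SHA256
   in place of AES, and PH is a truncated HMAC-SHA256 in place of
   POLYVAL. The function names are hypothetical.

```python
import hashlib
import hmac


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Feistel round function built from HMAC-SHA256 (toy stand-in only).
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:8]


def E(key: bytes, block: bytes) -> bytes:
    # STAND-IN for AES encryption: a 4-round Feistel permutation on 16 bytes.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def D(key: bytes, block: bytes) -> bytes:
    # Inverse of the stand-in E above.
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def PH(key: bytes, msg: bytes) -> bytes:
    # STAND-IN for POLYVAL: truncated HMAC-SHA256 with a 16-byte output.
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def make_tag(kd, hd, nonce, last_nonce, c):
    """Sender side: T = E(KD, N ^ X) ^ X with X = PH(HD, LN | C)."""
    x = PH(hd, last_nonce + c)
    return xor(E(kd, xor(nonce, x)), x)


def recognized(kd, hd, nonce, last_nonce, tag, c):
    """Receiver side: accept iff N == D(KD, T ^ X) ^ X."""
    x = PH(hd, last_nonce + c)
    return hmac.compare_digest(xor(D(kd, xor(tag, x)), x), nonce)
```

   A change to either the tag or the ciphertext alters the value that
   the receiver compares against its expected nonce, so the cell is no
   longer recognized.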

3.2. Backward Direction

   The backward direction is the opposite direction from
   CREATE/CREATE2 cells.

3.2.1. Routing From the Exit Node

   At OR n encryption proceeds as follows:

   It computes E(Sb_n, <0>) | E(Sb_n, <1>) |  E(Sb_n, <2>) | ...
   and parses the output as

   Length             Purpose                    Notation
   ------             -------                    --------
   509                encryption pad             Z
   BC_BLOCK_LEN       backward Nonce             Nb'_n
   BC_KEY_LEN         backward BC Key            KDb'_n
   HASH_KEY_LEN       backward poly hash key     HDb'_n
   BC_KEY_LEN         new backward Seed          Sb'_n

   Excess bytes are discarded. It then computes the ciphertext
   (Tb_n, Cb_n) as follows:

   Cb_n = M ^ Z
   X_n = PH(HDb_n, (LNb_n | Cb_n))
   Y_n = Nb_n ^ X_n
   Tb_n = E(KDb_n, Y_n) ^ X_n

   and updates its state by overwriting the old variables with the new
   ones.

   LNb_n = Nb_n
   Nb_n = Nb'_n
   KDb_n = KDb'_n
   HDb_n = HDb'_n
   Sb_n = Sb'_n


3.2.2. Relaying Backward at the Onion Routers

   At OR I (for I < n) when a ciphertext (Tb_I, Cb_I) in the backward
   direction is received it is processed as follows:

   X_I = PH(HSb_I, (LTb_{I-1} | Cb_I))
   Y_I = Tb_I ^ X_I
   Tb_{I-1} = D(KSb_I, Y_I) ^ X_I
   IV = INT(Tb_{I-1})
   Z  = E(JSb_I, <IV>) | E(JSb_I, <IV+1>) | ... | E(JSb_I, <IV+31>)
   % BC_BLOCK_LEN = 16
   Cb_{I-1} = Cb_I ^ Z[1, 509]

   The ciphertext (Tb_{I-1}, Cb_{I-1}) is then passed along the circuit
   towards the OP.


3.2.3. Routing to the Origin

   When a ciphertext (Tb_1, Cb_1) arrives at an OP, the OP decrypts it in
   two stages. It first reverses the layers from 1 to n-1 as follows:

   For I = 1 to n-1:
     X_I = PH(HSb_I, (LTb_{I+1} | Cb_I))
     Y_I = Tb_I ^ X_I
     Tb_{I+1} = E(KSb_I, Y_I) ^ X_I
     IV = INT(Tb_{I+1})
     Z  = E(JSb_I, <IV>) | E(JSb_I, <IV+1>) | ... | E(JSb_I, <IV+31>)
     % BC_BLOCK_LEN = 16
     Cb_{I+1} = Cb_I ^ Z[1, 509]

   Upon completion the n'th layer of encryption is removed as follows:

   X_n = PH(HDb_n, (LNb_n | Cb_n))
   Y_n = Tb_n ^ X_n
   if (Nb_n == D(KDb_n, Y_n) ^ X_n)    % authentication is successful
     compute E(Sb_n, <0>) | E(Sb_n, <1>) |  E(Sb_n, <2>) | ... and parse the
     output as Z, Nb'_n, KDb'_n, HDb'_n, Sb'_n

     M = Cb_n ^ Z
     LNb_n = Nb_n
     Nb_n = Nb'_n
     KDb_n = KDb'_n
     HDb_n = HDb'_n
     Sb_n = Sb'_n

     return M

   else
     send DESTROY cell to tear down the circuit


4. Application connections and stream management

4.1. Amendments to the Relay Cell Format

   Within a circuit, the OP and the end node use the contents of
   RELAY packets to tunnel end-to-end commands and TCP connections
   ("Streams") across circuits. End-to-end commands can be initiated
   by either edge; streams are initiated by the OP.

   The payload of each unencrypted RELAY cell consists of:

       Relay command           [1 byte]
       StreamID                [2 bytes]
       Length                  [2 bytes]
       Data                    [PAYLOAD_LEN-21 bytes]

   The old Digest field is removed since sufficient information for
   authentication is now included in the nonce part of the payload.
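
   The layout above can be packed and parsed as sketched below. The
   header is 5 bytes, so PAYLOAD_LEN-21 leaves 16 further bytes; that
   equals the tag length, although the text does not say so explicitly.
   Big-endian field encoding and zero padding are assumptions made for
   this example, and the function names are hypothetical.

```python
import struct

PAYLOAD_LEN = 509
HEADER = struct.Struct(">BHH")    # Relay command, StreamID, Length (big-endian)
DATA_LEN = PAYLOAD_LEN - 21       # 488 bytes of Data


def pack_relay_payload(command: int, stream_id: int, data: bytes) -> bytes:
    """Build the unencrypted relay payload: header followed by padded Data."""
    if len(data) > DATA_LEN:
        raise ValueError("data too long for one cell")
    # Zero padding is an assumption; the text only calls the rest "padding".
    return HEADER.pack(command, stream_id, len(data)) + data.ljust(DATA_LEN, b"\x00")


def unpack_relay_payload(payload: bytes):
    """Return (command, stream_id, data) with the padding stripped."""
    command, stream_id, length = HEADER.unpack_from(payload)
    if length > DATA_LEN:
        raise ValueError("Length field exceeds the Data area")
    return command, stream_id, payload[HEADER.size:HEADER.size + length]
```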

   The old 'Recognized' field is removed. Instead a cell is recognized
   via a partial decryption using the node's dynamic keys - namely the
   following steps (already included in Section 3):

   Forward direction:

   X_I = PH(HDf_I, (LNf_I | Cf_I))
   Y_I = Tf_I ^ X_I
   if (Nf_I == D(KDf_I, Y_I) ^ X_I)  % cell is recognized and authenticated

   Backward direction (executed by the OP):

   If the OP is aware of the number of layers present in the cell there
   is no need to attempt to recognize the cell. Otherwise the OP can, for
   each layer, first attempt a partial decryption using the dynamic keys
   for that layer as follows:

   X_I = PH(HDb_I, (LNb_I | Cb_I))
   Y_I = Tb_I ^ X_I
   if (Nb_I == D(KDb_I, Y_I) ^ X_I)   % cell is recognized and authenticated

   The 'Length' field of a relay cell contains the number of bytes
   in the relay payload which contain real payload data. The
   remainder of the payload is padding bytes.

4.2. Appending the encrypted nonce and dealing with version-homogeneous
     and version-heterogeneous circuits

   When a cell is prepared to be routed from the origin (see Section
   3.1.1) the encrypted nonce N is appended to the encrypted cell
   (occupying the last 16 bytes of the cell). If the cell is prepared to
   be sent to a node supporting the new protocol, S is combined with other
   sources to generate the layer's nonce. Otherwise, if the node only
   supports the old protocol, N is still appended to the encrypted cell
   (so that following nodes can still recover their nonce), but a
   synchronized nonce (as per the old protocol) is used in CTR-mode.

   When a cell is sent along the circuit in the 'backward' direction,
   nodes supporting the new protocol always assume that the last 16 bytes
   of the input are the nonce used by the previous node, which they
   process as per Section 3.2.1. If the previous node also supports the
   new protocol, these bytes are indeed the nonce. If the previous node
   only supports the old protocol, these bytes are either encrypted
   padding bytes or encrypted data.

5. Security and Design Rationale

   We are currently working on a security proof to better substantiate our
   security claims. Below is a short informal summary on the security of
   CGO and its design rationale.

5.1. Resistance to crypto-tagging attacks

   Protection against crypto-tagging attacks is provided by layers n-1 to
   1. This part of the scheme is based on the paradigm from [ADL17] which
   has the property that if any single bit of the OR's input is changed
   then all of the OR's output will be randomised. Specifically, if
   (Tf_I, Cf_I) is travelling in the forward direction and is processed by
   an honest node I, a single bit flip to either Tf_I or Cf_I will result
   in both Tf_{I+1} and Cf_{I+1} being completely randomised. In addition,
   the processing of (Tf_I, Cf_I) includes LTf_{I+1} so that any
   modification to (Tf_I, Cf_I) at time j will in turn randomise the value
   (Tf_{I+1}, Cf_{I+1}) at any time >= j. Thus once a circuit is tampered
   with it is not possible to recover from it at a later stage. This helps
   to protect against the standard crypto-tagging attack and variations
   thereof (Section 5.2 in [DS18]). A similar argument holds in the
   backward direction.


5.2. End-to-end authenticated encryption

   Layer n provides end-to-end authenticated encryption. Similar to the
   old protocol, this proposal only offers end-to-end authentication
   rather than per-hop authentication. However, CGO provides 128-bit
   authentication as opposed to the 32-bit authentication provided by the
   old protocol. A main observation underpinning the design of CGO is
   that the n'th layer does not need to be secure against the release of
   unverified plaintext (RUP). RUP security is only needed to protect
   against tagging attacks and the n'th layer does not help in that regard
   (but the layers below do). Consequently we employ a different scheme at
   the n'th layer which is designed to provide forward-secure
   authenticated encryption.


5.3 Forward Security

   As mentioned in the previous section CGO provides end-to-end
   authenticated encryption that is also forward secure. Our notion of
   forward security follows the definitions of Bellare and Yee [BY03] for
   both confidentiality and authenticity. Forward-secure confidentiality
   says that upon corrupting either the sender (or the receiver), the
   secrecy of the messages that have already been sent (or received) is
   still guaranteed. As for forward-secure authentication, upon corrupting
   the sender the authenticity of previously authenticated messages is
   still guaranteed (even if they have not yet been received). In order to
   achieve forward-secure authenticated encryption, CGO updates the key
   material of the n'th layer encryption with every cell that is
   processed. In order to support leaky pipes the lower layers also need
   to maintain a set of dynamic keys that are used to recognize cells that
   are intended for them. This key material is only used for partial
   processing, i.e. recognizing the cell, and is only updated if
   verification is successful. If the cell is not recognized, the node
   reverts to processing the cell with the static key material. If support
   for leaky-pipes is not required this extra processing can be omitted.


6. Efficiency Considerations

   Although we have not carried out any experiments to verify this, we
   expect CGO to perform relatively well in terms of efficiency. Firstly,
   it manages to achieve forward security with just a universal hash as
   opposed to other proposals which suggested the use of SHA2 or SHA3. In
   this respect we recommend using POLYVAL [GLL18], a variant of GHASH
   that is more compatible with Intel's PCLMULQDQ instruction. Furthermore
   CGO admits a certain degree of parallelisability. Supporting leaky
   pipes requires an OR to first verify the cell using the dynamic key
   material and if the cell is unrecognised it goes on to process the cell
   with the static key material. The important thing to note (see for
   instance Section 3.1.2) is that the initial processing of the cell
   using the static key material is almost identical to the verification
   using the dynamic key material, and the two computations are
   independent of each other. As such, although in Section 3 these were
   described as being evaluated sequentially, they can in fact be computed
   in parallel. In particular the two polynomial hashes could be computed
   in parallel by using the new vectorised VPCLMULQDQ instruction.

   We are currently looking into further optimisations of the scheme as
   presented here. One such optimisation is the possibility of removing
   KDf_I and KDb_I while retaining forward security. This would further
   improve the efficiency of the scheme by reducing the amount of dynamic
   key material that needs to be updated with every cell that is processed.


References

[ADL17] Tomer Ashur, Orr Dunkelman, Atul Luykx, "Boosting Authenticated
Encryption Robustness with Minimal Modifications", CRYPTO 2017.

[BY03] Mihir Bellare, Bennett Yee, "Forward-Security in Private-Key
Cryptography", CT-RSA 2003.

[DS18] Jean Paul Degabriele, Martijn Stam, "Untagging Tor: A Formal
Treatment of Onion Encryption", EUROCRYPT 2018.

[GLL18] Shay Gueron, Adam Langley, Yehuda Lindell, "AES-GCM-SIV: Nonce
Misuse-Resistant Authenticated Encryption", RFC 8452, April 2019.

[ST13] Thomas Shrimpton, R. Seth Terashima, "A Modular Framework for
Building Variable-Input Length Tweakable Ciphers", ASIACRYPT 2013.
Filename: 309-optimistic-socks-in-tor.txt
Title: Optimistic SOCKS Data
Author: Tom Ritter
Created: 21-June-2019
Status: Open
Ticket: #5915

0. Abstract

   We propose that tor should have a SocksPort option that causes it to lie
   to the application that the SOCKS Handshake has succeeded immediately,
   allowing the application to begin sending data optimistically.

1. Introduction

   In the past, Tor Browser had a patch that allowed it to send data
   optimistically. This effectively eliminated a round trip through the
   entire circuit, reducing latency.

   This feature was buggy, and specifically caused problems with MOAT, as
   described in [0], and with Tor Messenger, as described in [1]. It is
   possible that the other issues observed with it were the same issue;
   it is also possible that they were different.

   Rather than trying to identify and fix the problem in Tor Browser, an
   alternate idea is to have tor lie to the application, causing it to send
   the data optimistically. This can benefit all users of tor. This
   proposal documents that idea.

   [0] https://trac.torproject.org/projects/tor/ticket/24432#comment:19
   [1] https://trac.torproject.org/projects/tor/ticket/19910#comment:3

2. Proposal

2.1. Behavior

   When the SocksPort flag defined below is present, Tor will immediately
   report a successful SOCKS handshake for non-onion connections.
   If, later, tor receives an end cell rather than a connected cell, it
   will hang up the SOCKS connection.

   The requirement to omit this for onion connections is because in
   #30382 we implemented a mechanism to return a special SOCKS error code
   if we are connecting to an onion site that requires authentication.
   Returning an early success would prevent this from working.

   Redesigning the mechanism to communicate auth-required onion sites to
   the browser, while also supporting optimistic data, is left to a future
   proposal.

2.2. New SocksPort Flag

   In order to have backward compatibility with third party applications that
   do not support or do not want to use optimistic data, we propose a new
   SocksPort flag that needs to be set in the tor configuration file in order
   for the optimistic behavior to occur.

   The new SocksPort flag is:

      "OptimisticData" -- Tor will immediately report a successful SOCKS
                          handshake for non-onion connections and
                          hang up if it gets an end cell rather than a
                          connected cell.
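
   The decision logic described in this section can be sketched as
   follows. This is an illustration of the intended behavior only, not
   tor's implementation: the function names, the string labels for
   cells and actions, and the ".onion" suffix test are all hypothetical.

```python
def report_success_early(optimistic_data_flag: bool, hostname: str) -> bool:
    """True if the SOCKS success reply should be sent before CONNECTED arrives."""
    # Onion connections are excluded so that a special SOCKS error code
    # can still be returned for services that require authentication.
    is_onion = hostname.lower().rstrip(".").endswith(".onion")
    return optimistic_data_flag and not is_onion


def on_first_cell(replied_early: bool, cell: str) -> str:
    """Action to take when the first stream-level cell comes back."""
    if cell == "CONNECTED":
        return "continue" if replied_early else "send_socks_success"
    if cell == "END":
        # After an early success the only remaining signal is to hang up.
        return "hang_up" if replied_early else "send_socks_error"
    raise ValueError("unexpected cell type: " + cell)
```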

3. Application Error Handling

   This behavior will cause the application talking to Tor to potentially
   behave abnormally as it will believe that it has completed a TCP
   connection. If no such connection can be made by tor, the program may
   behave in a way that does not accurately represent the behavior of the
   connection.

   Applications SHOULD test various connection failure modes and ensure their
   behavior is acceptable before using this feature.

References:

[RFC1928] https://www.ietf.org/rfc/rfc1928.txt
Filename: 310-bandaid-on-guard-selection.txt
Title: Towards load-balancing in Prop 271
Author: Florentin Rochet, Aaron Johnson et al.
Created: 2019-10-27
Supersedes: 271
Status: Closed

1. Motivation and Context

  Prop 271 causes guards to be selected with probabilities different than their
  weights due to the way it samples many guards and then chooses primary guards
  from that sample. We are suggesting a straightforward fix to the problem, which
  is, roughly speaking, to choose primary guards in the order in which they were
  sampled.

  In more detail, Prop 271 chooses guards via a multi-step process: 
    1. It chooses 20 distinct guards (and sometimes more) by sampling without
       replacement with probability proportional to consensus weight.
    2. It produces two subsets of the sample: (1) "filtered" guards, which are
       guards that satisfy various torrc constraints and path bias, and (2)
       "confirmed" guards, which are guards through which a circuit has been
       constructed. 
    3. The "primary" guards (i.e. the actual guards used for circuits) are
       chosen from the confirmed and/or filtered subsets.  I'm ignoring the
       additional "usable" subsets for clarity. This description is based on
       Section 4.6 of the specification
       (https://gitweb.torproject.org/torspec.git/tree/guard-spec.txt).


1.1 Picturing the problem when Tor starts the first time

  The primary guards are selected *uniformly at random* from the filtered guards
  when no confirmed guards exist. No confirmed guards appear to exist until some
  primary guards have been selected, and so when Tor is started the first time
  the primary guards always come only from the filtered set. The uniformly-random
  selection causes a bias in primary-guard selection away from consensus weights
  and towards a more uniform selection of guards. As just an example of the
  problem, if there were only 20 guards in the network, the sampled set would be
  all guards and primary guard selection would be entirely uniformly random,
  ignoring weights entirely. This bias is worse the larger the sampled set is
  relative to the entire set of guards, and it has a significant effect on Tor
  simulations in Shadow, which are typically on smaller networks.

2. Solution Design

  We propose a solution that fits well within the existing guard-selection
  algorithm. Our solution is to select primary guards in the order they were
  sampled. This ordering should be applied after the filtering and/or confirmed
  guard sets are constructed as normal. That is, primary guards should be
  selected from the filtered guards (if no guards are both confirmed and
  filtered) or from the set of confirmed and filtered guards (if such guards
  exist) in the order they were initially sampled. This solution guarantees that
  each primary guard is selected (without replacement) from all guards with a
  probability that is proportional to its consensus weight.
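
  The proposed selection rule can be sketched as follows. This is a
  simplified model and not tor's guard code: the function names are
  hypothetical, guards are plain strings, the default of three primary
  guards is an assumption, and the selection follows the text above
  literally (it does not top up from the filtered set when fewer than
  n_primary guards are both confirmed and filtered).

```python
import random


def sample_guards(weights: dict, k: int, rng: random.Random) -> list:
    """Weighted sampling without replacement; returns guards in sampled order."""
    remaining = dict(weights)
    sampled = []
    while remaining and len(sampled) < k:
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[g] for g in names])[0]
        sampled.append(pick)
        del remaining[pick]
    return sampled


def choose_primary(sampled: list, filtered: set, confirmed: set,
                   n_primary: int = 3) -> list:
    """Pick primary guards in the order in which they were sampled."""
    both = [g for g in sampled if g in filtered and g in confirmed]
    # Fall back to the filtered guards only when none are also confirmed.
    pool = both if both else [g for g in sampled if g in filtered]
    return pool[:n_primary]
```

  Because the sampled order already reflects consensus weights, taking
  the first eligible entries preserves those weights, whereas a uniform
  choice among the sampled guards would discard them.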

2.1 Performance implications

  This proposal is a straightforward fix to the unbalanced network that may arise
  from the uniform selection of sampled relays. It restores correct performance
  results in Shadow, where simulations cover only a short timeframe. However,
  it does not solve all the load-balancing problems of Proposal 271. Another
  load-balancing issue arises when we choose our guards on one date but then make
  decisions about them on a different date. Building a sampled list of relays at
  day 0, most of which we do not intend to use until much later, risks slowly
  unbalancing the network.

2.2 Security implications

  This proposal solves the following problems: Prop271 reduces Tor's security by
  increasing the number of clients that an adversary running small relays can
  observe. In addition, an adversary has to wait less time than it should after
  it starts a malicious guard to be chosen by a client. This weakness occurs
  because the malicious guard only needs to enter the sampled list to have a
  chance to be chosen as primary, rather than having to wait until all
  previously-sampled guards have already expired.

2.3 Implementation notes

  The code used for ordering the confirmed list by confirmed idx should be
  removed, and a sampled order should be applied throughout the various lists.
  The next sampled idx should be recalculated from the state file, and the
  sampled_idx values should be recalculated to be a dense array when we save the
  state.

3. Going Further -- Let's not choose our History (future work)

  A deeper refactoring of Prop 271 would try to solve the load balancing problem
  of choosing guards on a date but then making decisions about them on a
  different date. One suggestion is to remove the sampled list, which we can
  picture as a "forward history" and to have instead a real history of previously
  sampled guards. When moving to the next guard, we could consider *current*
  weights and make the decision. The history should resist attacks that try to
  force clients onto compromised guards, using relays that are part of the
  history if they're still available (in sampled order), and by tracking its
  size. This should maintain the initial goals of Prop 271.
Filename: 311-relay-ipv6-reachability.txt
Title: Tor Relay IPv6 Reachability
Author: teor, Nick Mathewson
Created: 22-January-2020
Status: Accepted
Ticket: #24404

0. Abstract

   We propose that Tor relays (and bridges) should check the reachability of
   their IPv6 ORPort, before deciding whether to publish their descriptor. To
   check IPv6 ORPort reachability, relays and bridges need to be able to
   extend circuits via other relays, and back to their own IPv6 ORPort.

1. Introduction

   Tor relays (and bridges) currently check the reachability of their IPv4
   ORPort and DirPort before publishing them in their descriptor. But relays
   and bridges do not test the reachability of their IPv6 ORPorts.

   However, directory authorities make direct connections to relay IPv4 and
   IPv6 ORPorts, to test each relay's reachability. Once a relay has been
   confirmed as reachable by a majority of authorities, it is included in the
   consensus. (Currently, 6 out of 9 directory authorities perform IPv4 and
   IPv6 reachability checks. The others just check IPv4.)

   The Bridge authority makes direct connections to bridge IPv4 ORPorts, to
   test each bridge's reachability. Depending on its configuration, it may also
   test IPv6 ORPorts. Once a bridge has been confirmed as reachable by the
   bridge authority, it is included in the bridge networkstatus used by
   BridgeDB.

   Many relay (and bridge) operators don't know when their relay's IPv6 ORPort
   is unreachable. They might not find out until they check [Relay Search], or
   their traffic may drop. For new operators, it might just look like Tor
   simply isn't working, or it isn't using much traffic. IPv6 ORPort issues
   are a significant source of relay operator support requests.

   Implementing IPv6 ORPort reachability checks will provide immediate, direct
   feedback to operators in the relay's logs. It also enables future work,
   such as automatically discovering relay and bridge addresses for IPv6
   ORPorts (see [Proposal 312: Relay Auto IPv6 Address]).

2. Scope

   This proposal modifies Tor's behaviour as follows:

   Relays (including directory authorities):
   * circuit extension,
   * OR connections for circuit extension,
   * reachability testing.

   Bridges:
   * reachability testing only.

   This proposal does not change client behaviour.

   Throughout this proposal, "relays" includes directory authorities, except
   where they are specifically excluded. "relays" does not include bridges,
   except where they are specifically included. (The first mention of "relays"
   in each section should specifically exclude or include these other roles.)

   When this proposal describes Tor's current behaviour, it covers all
   supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except
   where another version is specifically mentioned.

3. Allow Relay IPv6 Extends

   To check IPv6 ORPort reachability, relays and bridges need to be able to
   extend circuits via other relays, and back to their own IPv6 ORPort.

   We propose that relays start to extend some circuits over IPv6 connections.
   We do not propose any changes to bridge extend behaviour.

3.1. Current IPv6 ORPort Implementation

   Currently, all relays (and bridges) must have an IPv4 ORPort. IPv6 ORPorts
   are optional.

   Tor supports making direct IPv6 OR connections:
     * from directory authorities to relay ORPorts,
     * from the bridge authority to bridge ORPorts,
     * from clients to relay and bridge ORPorts.

   Tor relays and bridges accept IPv6 ORPort connections. But IPv6 ORPorts are
   not currently included in extend requests to other relays. And even if an
   extend cell contains an IPv6 ORPort, bridges and relays will not extend
   via an IPv6 connection to another relay.

   Instead, relays will extend circuits:
     * Using an existing authenticated connection to the requested relay
       (which is typically over IPv4), or
     * Over a new connection via the IPv4 ORPort in an extend cell.

   If a relay receives an extend cell that only contains an IPv6 ORPort, the
   extend typically fails.

3.2. Relays Extend to IPv6 ORPorts

   We propose that relays make some connections via the IPv6 ORPorts in
   extend cells.

   Relays will extend circuits:
     * using an existing authenticated connection to the requested relay
       (which may be over IPv4 or IPv6), or
     * over a new connection via the IPv4 or IPv6 ORPort in an extend cell.

   Since bridges try to imitate client behaviour, they will not adopt this new
   behaviour, until clients begin routinely connecting via IPv6. (See
   [Proposal 306: Client Auto IPv6 Connections].)

3.2.1. Making IPv6 ORPort Extend Connections

   Relays may make a new connection over IPv6 when:
     * they have an IPv6 ORPort,
     * there is no existing authenticated connection to the requested relay,
       and
     * the extend cell contains an IPv6 ORPort.

   If these conditions are satisfied, and the extend cell also contains an
   IPv4 ORPort, we propose that the relay choose between an IPv4 and an IPv6
   connection at random.

   If the extend cell does not contain an IPv4 ORPort, we propose that the
   relay connects over IPv6. (Relays should support IPv6-only extend cells,
   even though they are not used to test relay reachability in this proposal.)
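
   The conditions above can be sketched as a small decision function. This is
   an illustration only: the function name, its parameters, and the string
   return values are hypothetical. It deliberately uses only local
   information and the contents of the extend cell, not the consensus.

```python
import random
from typing import Optional


def choose_extend_family(has_local_ipv6_orport: bool,
                         has_existing_conn: bool,
                         cell_has_ipv4: bool,
                         cell_has_ipv6: bool,
                         rng: random.Random) -> Optional[str]:
    """Return "existing", "ipv4", "ipv6", or None if no extend is possible."""
    if has_existing_conn:
        # Reuse the authenticated connection, whatever its address family.
        return "existing"
    ipv6_ok = has_local_ipv6_orport and cell_has_ipv6
    if ipv6_ok and cell_has_ipv4:
        # Both families are possible: choose at random.
        return rng.choice(["ipv4", "ipv6"])
    if ipv6_ok:
        # IPv6-only extend cell.
        return "ipv6"
    if cell_has_ipv4:
        return "ipv4"
    return None
```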

   A successful IPv6 connection also requires that:
     * the requested relay has an IPv6 ORPort.
   But extending relays must not check the consensus for other relays' IPv6
   information. Consensuses may be out of date, particularly when relays are
   doing reachability checks for new IPv6 ORPorts.

   See section 3.4 for other situations where IPv6 information may be
   incorrect or unavailable.

3.2.2. No Tor Client Changes

   Tor clients currently include IPv4 ORPorts in their extend cells, but they
   do not include IPv6 ORPorts.

   We do not propose any client IPv6 extend cell changes at this time.

   The Tor network needs more IPv6 relays, before clients can safely use
   IPv6 extends. (Relays do not require anonymity, so they can safely use
   IPv6 extends to test their own reachability.)

   We also recommend prioritising client to relay IPv6 connections
   (see [Proposal 306: Client Auto IPv6 Connections]) over relay to relay IPv6
   connections, because client IPv6 connections have a direct impact on users.

3.3. Alternative Extend Designs

   We briefly mention some potential extend designs, and the reasons that
   they were not used in this proposal.

   (Some designs may be proposed for future Tor versions, but are not necessary
   at this time.)

3.3.1. Future Relay IPv6 Extend Behaviour

   Random selection of extend ORPorts is a simple design, which supports IPv6
   ORPort reachability checks.

   However, it is not the most efficient design when:
     * both relays meet the requirements for IPv4 and IPv6 extends,
     * a new connection is required,
     * the relays have either IPv4 or IPv6 connectivity, but not both.

   In this very specific case, this proposal results in an average of 1
   circuit extend failure per new connection. (Because relays do not try to
   connect to the other ORPort when the first one fails.)

   If relays try both the IPv4 and IPv6 ORPorts, then the circuit would
   succeed. For example, relays could try the alternative port after a 250ms
   delay, as in [Proposal 306: Client Auto IPv6 Connections]. That alternative
   design would result in an average circuit delay of up to 125ms (250ms / 2)
   per new connection, rather than failure.
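
   One way to read these two figures is as simple expectations. The sketch
   below assumes that exactly one of the two ORPorts works, that the
   extending relay picks between them with probability 1/2, and that each
   failed extend is followed by a fresh, independent attempt. The function
   names are hypothetical.

```python
def expected_failures_before_success(p_success: float) -> float:
    """Mean number of failures before the first success (geometric)."""
    return (1.0 - p_success) / p_success


def expected_fallback_delay_ms(p_first_fails: float,
                               fallback_delay_ms: float) -> float:
    """Mean added delay when the other port is tried after a fixed delay."""
    return p_first_fails * fallback_delay_ms


# Random selection without fallback: one failure per new connection on average.
assert expected_failures_before_success(0.5) == 1.0
# Fallback after 250ms: half of the extends wait 250ms, so 125ms on average.
assert expected_fallback_delay_ms(0.5, 250.0) == 125.0
```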

   However, partial relay connectivity should be uncommon. And relays keep
   connections open long-term, so new relay connections are a small proportion
   of extend requests.

   Therefore, we defer implementing any more complex designs. Since we propose
   to use IPv6 extends to test relay reachability, occasional circuit extend
   failures have a very minor impact.

3.3.2. Future Bridge IPv6 Extend Behaviour

   When clients automatically connect to relay IPv4 and IPv6 ORPorts by
   default, bridges should also adopt this behaviour. (For example,
   see [Proposal 306: Client Auto IPv6 Connections].)

3.3.3. Allowing Extends to Prefer IPv4 or IPv6

   Here is an alternate design, which allows extending clients (or relays doing
   reachability tests) to prefer either IPv4 or IPv6:

   Suppose that a relay's extend cell contains the IPv4 address and the
   IPv6 address in their _preferred order_.  So if the party generating
   the extend cell would prefer an IPv4 connection, it puts the IPv4
   address first; if it would prefer an IPv6 connection, it puts the IPv6
   address first.

   The relay that receives the extend cell could respond in several ways:
     * One possibility (similar to section 3.2.1) is to choose at random,
       with a higher probability given to the first option.
     * Another possibility (similar to section 3.3.1) is to try the first,
       and then try the second if the first one fails.

   This scheme has some advantage, in that it lets the self-testing relay say
   "please try IPv6 if you can" or "please try IPv4 if you can" in a reliable
   way, and lets us migrate from the current behavior to the 3.3.1 behavior
   down the road.

   However, it might not be necessary: clients should not care if their
   extends are over IPv4 or IPv6; they just want to get to an exit safely.
   (And clients should not depend on using IPv4 or IPv6, because relays may
   use an existing authenticated connection to extend.) The only use case
   where extends might want to prefer IPv4 or IPv6 is relay reachability
   tests. But we want our reachability test design to succeed, without
   depending on the specific extend implementation.

3.4. Rejected Extend Designs

   Some designs may never be suitable for the Tor network.

   We rejected designs where relays check the consensus to see if other
   relays support IPv6, because:
     * relays may have different consensuses,
     * the extend cell may have been created using a version of the
       [Onion Service Protocol] which supports IPv6, or
     * the extend cell may be from a relay that has just added IPv6, and is
       testing the reachability of its own ORPort (see Section 4).

   We avoided designs where relays try to learn if other relays support IPv6,
   because these designs:
     * are more complex than random selection,
     * potentially leak information between different client circuits,
     * may enable denial of service attacks, where a flood of incorrect extend
       cells causes a relay to believe that another relay is unreachable on an
       ORPort that actually works, and
     * require careful tuning to match the typical interval at which network
       connectivity is actually changing.

4. Check Relay and Bridge IPv6 ORPort Reachability

   We propose that relays (and bridges) check their own IPv6 ORPort
   reachability.

   To check IPv6 ORPort reachability, relays (and bridges) extend circuits via
   other relays (but not other bridges), and back to their own IPv6 ORPort.

   If IPv6 reachability checks fail, relays (and bridges) should refuse to
   publish their descriptors, if they believe IPv6 reachability checks are
   reliable, and their IPv6 address was explicitly configured. (See
   [Proposal 312: Relay Auto IPv6 Address] for the ways relays can guess their
   IPv6 addresses.)

   Directory authorities always publish their descriptors.

4.1. Current Reachability Implementation

   Relays and bridges check the reachability of their IPv4 ORPorts and
   DirPorts, and refuse to publish their descriptor if either reachability
   check fails. (Directory authorities test their own reachability, but they
   only warn, and publish their descriptor regardless of reachability.)

   IPv4 ORPort reachability checks succeed when any create cell is received on
   any inbound OR connection. The check succeeds, even if the cell is from an
   IPv6 ORPort, or a circuit built by a client.

   Directory authorities make direct connections to relay IPv4 and IPv6
   ORPorts, to test each relay's reachability. Relays that fail either
   reachability test, on enough directory authorities, are excluded from the
   consensus.

   The Bridge authority makes direct connections to bridge IPv4 ORPorts, to
   test each bridge's reachability. Depending on its configuration, it may also
   test IPv6 ORPorts. Bridges that fail either reachability test are excluded
   from BridgeDB.

4.2. Checking IPv6 ORPort Reachability

   We propose that testing relays (and bridges) select some IPv6 extend-capable
   relays for their reachability circuits, and include their own IPv4 and IPv6
   ORPorts in the final extend cells on those circuits.

   The final extending relay will extend to the testing relay:
     * using an existing authenticated connection to the testing relay
       (which may be over IPv4 or IPv6), or
     * over a new connection via the IPv4 or IPv6 ORPort in the extend cell.

   The testing relay will confirm that test circuits can extend to both its
   IPv4 and IPv6 ORPorts.

   Checking IPv6 ORPort reachability will create extra IPv6 connections on the
   tor network. (See [Proposal 313: Relay IPv6 Statistics].) It won't directly
   create much extra traffic, because reachability circuits don't send many
   cells. But some client circuits may use the IPv6 connections created by
   relay reachability self-tests.

4.2.1. Selecting the Final Extending Relay

   IPv6 ORPort reachability checks require an IPv6 extend-capable relay as
   the second-last hop of reachability circuits. (The testing relay is the
   last hop.)

   IPv6 extend-capable relays must have:
     * Relay subprotocol version 3 (or later), and
     * an IPv6 ORPort.
   (See section 5.1 for the definition of Relay subprotocol version 3.)

   The other relays in the path do not require any particular protocol
   versions.
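
   A minimal sketch of this filter, assuming a simplified relay record (the
   RelayInfo type and its fields are hypothetical; a real implementation
   would parse the protocol versions and addresses from the consensus or
   descriptors):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RelayInfo:
    nickname: str
    relay_protover: int          # highest supported Relay subprotocol version
    ipv6_orport: Optional[str]   # for example "[2001:db8::1]:9001", or None


def ipv6_extend_capable(relays: List[RelayInfo]) -> List[RelayInfo]:
    """Keep relays that can serve as the second-last hop of an IPv6 test."""
    return [r for r in relays
            if r.relay_protover >= 3 and r.ipv6_orport is not None]
```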

4.2.2. Extending from the Second-Last Hop

   IPv6 ORPort reachability circuits should put the IPv4 and IPv6 ORPorts in
   the extend cell for the final extend in reachability circuits.

   Supplying both ORPorts makes these extend cells indistinguishable from
   future client extend cells.

   If reachability succeeds, the testing relay (or bridge) will accept the
   final extend on one of its ORPorts, selected at random by the extending
   relay (see section 3.2.1).

4.2.3. Separate IPv4 and IPv6 Reachability Flags

   Testing relays (and bridges) will record reachability separately for IPv4
   and IPv6 ORPorts, based on the ORPort that the test circuit was received on.

   Here is a reliable way to do reachability self-tests for each ORPort:

   1. Check for create cells on inbound ORPort connections from other relays

   Check for a cell on any IPv4 and any IPv6 ORPort. (We can't know which
   listener(s) correspond to the advertised ORPorts, particularly when using
   port forwarding.) Make sure the cell was received on an inbound OR
   connection, and make sure the connection is authenticated to another relay.
   (Rather than to a client: clients don't authenticate.)

   2. Check for created cells from testing circuits on outbound OR connections

   Check for a returned created cell on our IPv4 and IPv6 self-test circuits.
   Make sure those circuits were on outbound OR connections.

   By combining these tests, we confirm that we can:
     * reach our own ORPorts with testing circuits,
     * send and receive cells via inbound OR connections to our own ORPorts
       from other relays, and
     * send and receive cells via outbound OR connections to other relays'
       ORPorts.

   Once we validate the created cell, we have confirmed that the final remote
   relay has our private keys. Therefore, this test reliably detects ORPort
   reachability, in most cases.
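
   The two checks can be combined into per-address-family state, as in the
   following sketch. The class and method names are hypothetical, and the
   caller is assumed to know the address family and direction of each
   connection.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FamilyState:
    inbound_create_seen: bool = False    # step 1
    outbound_created_seen: bool = False  # step 2

    def reachable(self) -> bool:
        # Reachability is confirmed only when both steps have succeeded.
        return self.inbound_create_seen and self.outbound_created_seen


def _new_families() -> Dict[str, FamilyState]:
    return {"ipv4": FamilyState(), "ipv6": FamilyState()}


@dataclass
class SelfTest:
    families: Dict[str, FamilyState] = field(default_factory=_new_families)

    def on_create_cell(self, family: str, inbound: bool,
                       peer_is_authenticated_relay: bool) -> None:
        # Step 1: only inbound connections authenticated to a relay count.
        if inbound and peer_is_authenticated_relay:
            self.families[family].inbound_create_seen = True

    def on_created_cell(self, family: str, on_self_test_circuit: bool,
                        outbound: bool) -> None:
        # Step 2: only created cells on our own outbound test circuits count.
        if on_self_test_circuit and outbound:
            self.families[family].outbound_created_seen = True
```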

   There are a few exceptions:

   A. Duplicate Relay Keys

   Duplicate keys are only possible if a relay's private keys have been copied
   to another relay. That's either a misconfiguration, or a security issue.
   Directory authorities ensure that only one relay with each key is included
   in the consensus.

   If a relay was set up using a copy of another relay's keys, then its
   reachability self-tests might connect to that other relay. (If the second
   hop in a testing circuit has an existing OR connection to the other relay.)

   Relays could test whether the inbound create cells they receive match the
   create cells that they have sent on self-test circuits.

   But this seems like unnecessary complexity, because duplicate keys are
   rare. At best, it would provide a warning for some operators who have
   accidentally duplicated their keys. (But it doesn't provide any extra
   security, because operators can disable self-tests using AssumeReachable.)

   B. Multiple ORPorts in an Address Family

   Some relays have multiple IPv4 ORPorts, or multiple IPv6 ORPorts. In some
   cases, only some ports are reachable. (This configuration is uncommon, but
   multiple ORPorts are supported.)

   Here is how these tests can pass, even if the advertised ORPort is
   unreachable:
     * the final extend cell contains the advertised IPv6 address of the
       self-testing relay,
     * if the extending relay already has a connection to a working NoAdvertise
       ORPort, it may use that connection instead.

4.2.4. No Changes to DirPort Reachability

   We do not propose any changes to relay IPv4 DirPort reachability checks at
   this time.

   The following configurations are currently not supported:
     * bridge DirPorts, and
     * relay IPv6 DirPorts.
   Therefore, they are also out of scope for this proposal.

4.3. Refusing to Publish Descriptor if IPv6 ORPort is Unreachable

   If an IPv6 ORPort reachability check fails, relays (and bridges) should log
   a warning.

   If IPv6 reachability checks fail, relays (and bridges) should refuse to
   publish their descriptors, if they believe IPv6 reachability checks are
   reliable, and their IPv6 address was explicitly configured. (See
   [Proposal 312: Relay Auto IPv6 Address] for the ways relays can guess their
   IPv6 addresses.)

   Directory authorities always publish their descriptors.

4.3.1. Refusing to Publish the Descriptor

   If IPv6 reachability checks fail, relays (and bridges) should refuse to
   publish their descriptors, if:
     * enough existing relays support IPv6 extends, and
     * the IPv6 address was explicitly configured by the operator
       (rather than guessed using [Proposal 312: Relay Auto IPv6 Address]).

   Directory authorities may perform reachability checks, and warn if those
   checks fail. But they always publish their descriptors.

   We set a threshold of consensus relays for reliable IPv6 ORPort checks:
     * at least 30 relays, and
     * at least 1% of the total consensus weight,
   must support IPv6 extends.

   We chose these parameters so that the number of relays is triple the
   number of directory authorities, and the consensus weight is high enough
   to support occasional reachability circuits.

   In small networks with:
     * fewer than 2000 relays, or
     * a total consensus weight of zero,
   the threshold should be the minimum tor network size to test reachability:
     * at least 2 relays, excluding this relay.
   (Note: we may increase this threshold to 3 or 4 relays if we discover a
   higher minimum during testing.)

   If the current consensus satisfies this threshold, testing relays (and
   bridges, but not directory authorities) that fail IPv6 ORPort reachability
   checks should refuse to publish their descriptors.

   To ensure an accurate threshold, testing relays should exclude:
     * the testing relay itself, and
     * relays that they will not use in testing circuits,
   from the:
     * relay count, and
     * the numerator of the threshold percentage.
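
   The threshold can be sketched as follows. This is an illustration under
   stated assumptions: the function name and argument layout are
   hypothetical, usable_relays is assumed to already exclude the testing
   relay and any relays it will not use in testing circuits, and the
   small-network minimum of 2 relays is assumed to count relays that support
   IPv6 extends.

```python
from typing import List, Tuple


def ipv6_checks_reliable(usable_relays: List[Tuple[int, bool]],
                         total_consensus_weight: int,
                         total_relay_count: int) -> bool:
    """Decide whether enough relays support IPv6 extends.

    usable_relays holds (consensus_weight, supports_ipv6_extends) pairs,
    after exclusions. The denominator is the full consensus weight.
    """
    capable = [w for (w, ok) in usable_relays if ok]
    if total_relay_count < 2000 or total_consensus_weight == 0:
        # Small network: the minimum network size that can test reachability.
        return len(capable) >= 2
    # Large network: at least 30 relays and at least 1% of consensus weight.
    return (len(capable) >= 30
            and sum(capable) * 100 >= total_consensus_weight)
```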

   Typically, relays will be excluded if they are in the testing relay's:
     * family,
     * IPv4 address /16 network,
     * IPv6 address /32 network (a requirement as of Tor 0.4.0.1-alpha),
   unless EnforceDistinctSubnets is 0.

   As a useful side-effect, these different thresholds for each relay family
   will reduce the likelihood of the network flapping around the threshold.

   If flapping has an impact on the network health, directory authorities
   should set the AssumeIPv6Reachable consensus parameter. (See the next
   section.)

4.3.2. Add AssumeIPv6Reachable Option

   We add an AssumeIPv6Reachable torrc option and consensus parameter.

   If IPv6 ORPort checks have bugs that impact the health of the network,
   they can be disabled by setting AssumeIPv6Reachable=1 in the consensus
   parameters.

   If IPv6 ORPort checks have bugs that impact a particular relay (or bridge),
   they can be disabled by setting "AssumeIPv6Reachable 1" in the relay's
   torrc.

   This option disables IPv6 ORPort reachability checks, so relays publish
   their descriptors if their IPv4 ORPort reachability checks succeed.
   (Unlike AssumeReachable, AssumeIPv6Reachable has no effect on the existing
   dirauth IPv6 reachability checks, which connect directly to relay ORPorts.)

   The default for the torrc option is "auto", which checks the consensus
   parameter. If the consensus parameter is not set, the default is "0".

   "AssumeReachable 1" overrides all values of "AssumeIPv6Reachable",
   disabling both IPv4 and IPv6 ORPort reachability checks. Tor should warn if
   AssumeReachable is 1, but AssumeIPv6Reachable is 0. (On directory
   authorities, "AssumeReachable 1" also disables dirauth IPv4 and IPv6
   reachability checks, which connect directly to relay ORPorts.
   AssumeIPv6Reachable does not disable directory authority to relay IPv6
   checks.)
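
   The precedence between these settings can be sketched as below. The
   function name and the encoding of the values are hypothetical: the torrc
   option is modelled as "0", "1", or "auto", and an unset consensus
   parameter as None.

```python
from typing import Optional


def ipv6_self_test_enabled(assume_reachable: bool,
                           assume_ipv6_reachable: str,
                           consensus_param: Optional[int]) -> bool:
    """Return True if the relay should run IPv6 ORPort self-tests."""
    if assume_reachable:
        # "AssumeReachable 1" overrides every AssumeIPv6Reachable value.
        return False
    if assume_ipv6_reachable == "auto":
        # Fall back to the consensus parameter, which defaults to 0.
        assumed = bool(consensus_param) if consensus_param is not None else False
    else:
        assumed = assume_ipv6_reachable == "1"
    return not assumed
```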

4.4. Optional Efficiency and Reliability Changes

   We propose some optional changes for efficiency and reliability, and
   describe their impact.

   Some of these changes may be more appropriate in future releases, or
   along with other proposed features.

4.4.1. Extend IPv6 From All Supported Second-Last Hops

   The testing relay (or bridge) puts both IPv4 and IPv6 ORPorts in its final
   extend cell, and the receiving ORPort is selected at random by the
   extending relay (see sections 3.2.1 and 4.2). Therefore, approximately half
   of IPv6 ORPort reachability circuits will actually end up confirming IPv4
   ORPort reachability.

   We propose this optional change, to improve the rate of IPv6 ORPort
   reachability checks:

   If the second-last hop of an IPv4 ORPort reachability circuit supports IPv6
   extends, testing relays may put the IPv4 and IPv6 ORPorts in the extend
   cell for the final extend.

   As the number of relays that support IPv6 extends increases, this change
   will increase the number of IPv6 reachability confirmations. In the ideal
   case, where the entire network supports IPv4 and IPv6 extends, IPv4 and IPv6
   ORPort reachability checks would require a similar number of circuits.

4.4.2. Close Existing Connections Before Testing Reachability

   When a busy relay is performing reachability checks, it may already have
   established inbound or outbound connections to the second-last hop in its
   reachability test circuits. The extending relay may use these connections
   for the extend, rather than opening a connection to the target ORPort
   (see sections 3.2 and 4.2.2).

   Bridges only establish outbound connections to other relays, and only over
   IPv4 (except for reachability test circuits). So they are still potentially
   affected by this issue.

   We propose these optional changes, to improve the efficiency of IPv4 and
   IPv6 ORPort reachability checks:

   Testing relays (and bridges):
     * close any outbound connections to the second-last hop of reachability
       circuits, and
     * close inbound connections to the second-last hop of reachability
       circuits, if those connections are not using the target ORPort.

   Even though it is unlikely that bridges will have inbound connections to
   a non-target ORPort, bridges should still do inbound connection checks, for
   consistency.

   These changes are particularly important if a relay is connected to all
   other relays in the network, but only over IPv4. (Or in the future, only
   over IPv6.)

   We expect that these changes will slightly increase the number of relay
   re-connections, but reduce the number of reachability test circuits
   required to confirm reachability.

4.4.3. Accurately Identifying Test Circuits

   The testing relay (or bridge) may confirm that the create cells it is
   receiving are from its own test circuits, and that test circuits are
   capable of returning created cells to the origin.

   Currently, relays confirm reachability if any create cell is received on
   any inbound connection (see section 4.1). Relays do not check that the
   circuit is a reachability test circuit, and they do not wait to receive the
   return created cell. This behaviour has resulted in difficult to diagnose
   bugs on some rare relay configurations.

   We propose these optional changes, to improve the efficiency of IPv4 and
   IPv6 ORPort reachability checks:

   Testing relays may:
     * check that the create cell is received from a test circuit
       (by comparing the received cell to the cells sent by test circuits),
     * check that the create cell is received on an inbound connection
       (this is existing behaviour),
     * if the create cell from a test circuit is received on an outbound
       connection, destroy the circuit (rather than returning a created cell),
       and
     * check that the created cell is returned to the relay on a test circuit
       (by comparing the remote address of the final hop on the circuit, to
       the local IPv4 and IPv6 ORPort addresses).

   Relays can efficiently match inbound create cells to test circuits by
   storing a set of the g^X values from their test circuits' extend cells,
   and then checking incoming create cells against that set.
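
   A minimal sketch of such a set, treating each g^X value as an opaque byte
   string (the class and method names are hypothetical):

```python
class SelfTestCellTracker:
    """Remember the g^X values sent on our own test circuits."""

    def __init__(self) -> None:
        self._pending = set()

    def record_sent(self, gx: bytes) -> None:
        # Called when a test circuit sends its final extend cell.
        self._pending.add(gx)

    def is_own_test_cell(self, gx: bytes) -> bool:
        # Called for each inbound create cell: average O(1) set lookup.
        return gx in self._pending

    def forget(self, gx: bytes) -> None:
        # Called when the test circuit is closed or has been confirmed.
        self._pending.discard(gx)
```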

   If we make these changes, relays should track whether they are
   "maybe reachable" (under the current definition of "reachable") or
   "definitely reachable" (based on the new definition). They should log
   different messages depending on whether they are "maybe reachable" but these
   new tests fail, or whether they are completely unreachable.

4.4.4. Allowing More Relay IPv6 Extends

   Currently, clients, relays, and bridges do not include IPv6 ORPorts in their
   extend cells.

   In this proposal, we only make relays (and bridges) extend over IPv6 on
   the final hop of test circuits. This limited use of IPv6 extends means that
   IPv6 connections will still be uncommon.

   We propose these optional changes, to increase the number of IPv6
   connections between relays:

   To increase the number of IPv6 connections, relays that support IPv6
   extends may want to use them for all hops of their own circuits. Relays
   make their own circuits for reachability tests, bandwidth tests, and
   ongoing preemptive circuits. (Bridges can not change their behaviour,
   because they try to imitate clients.)

   We propose a torrc option and consensus parameter RelaySendIPv6Extends,
   which is only supported on relays (and not bridges or clients). This option
   makes relays send IPv4 and IPv6 ORPorts in all their extend cells, when
   supported by the extending and receiving relay. (See section 3.2.1.)

   The default value for this option is "auto", which checks the consensus
   parameter. If the consensus parameter is not set, it defaults to "0" in
   the initial release.

   Once IPv6 extends have had enough testing, we may enable
   RelaySendIPv6Extends on the network. The consensus parameter will be set
   to 1. The default will be changed to "1" (if the consensus parameter is not
   set).

   We defer any client (and bridge) changes to a separate proposal, to be
   implemented when there are more IPv6 relays in the network. But we note
   that relay IPv6 extends will provide some cover traffic when clients
   eventually use IPv6 extends in their circuits.

   As a useful side effect, increasing the number of IPv6 connections in the
   network makes it more likely that an existing connection can be used for
   the final hop of a relay IPv6 ORPort reachability check.

4.4.5. Relay Bandwidth Self-Tests Over IPv4 and IPv6

   In this proposal, we only make relays (and bridges) use IPv6 for their
   reachability self-tests.

   We propose this optional change, to improve the accuracy of relay (and
   bridge) bandwidth self-tests:

   Relays (and bridges) perform bandwidth self-tests over IPv4 and IPv6.

   If we implement good abstractions for relay self-tests, then this change
   will not need much extra code.

   If we implement IPv6 extends for all relay circuits (see section 4.4.4),
   then this change will effectively be redundant.

   Doing relay bandwidth self-tests over IPv6 will create extra IPv6
   connections and IPv6 bandwidth on the tor network. (See
   [Proposal 313: Relay IPv6 Statistics].) In addition, some client circuits
   may use the IPv6 connections created by relay bandwidth self-tests.

4.5. Alternate Reachability Designs

   We briefly mention some potential reachability designs, and the reasons that
   they were not used in this proposal.

4.5.1. Removing IPv4 ORPorts from Extend Cells

   We avoid designs that only include IPv6 ORPorts in extend cells, and remove
   IPv4 ORPorts.

   Only including the IPv6 ORPort would provide slightly more specific
   reachability check circuits. However, we don't need IPv6-only designs,
   because relays continue trying different reachability circuits until they
   confirm reachability.

   IPv6-only designs also make it easy to distinguish relay reachability extend
   cells from other extend cells. This distinguisher will become more of an
   issue as IPv6 extends become more common in the network (see sections 4.2.2
   and 4.4.4).

   Removing the IPv4 ORPort also provides no fallback, if the IPv6 ORPort is
   actually unreachable. IPv6-only failures do not affect reachability checks,
   but they will become important in the future, as other circuit types start
   using IPv6 extends.

   IPv6-only reachability designs also increase the number of special cases in
   the implementation. (And the likelihood of subtle bugs.)

   These designs may be appropriate in future, when there are IPv6-only bridges
   or relays.

5. New Relay Subprotocol Version

   We reserve Tor subprotocol "Relay=3" for tor versions where:
     * relays may perform IPv6 extends, and
     * bridges might not perform IPv6 extends,
   as described in this proposal.

5.1. Tor Specification Changes

   We propose the following changes to the [Tor Specification], once this
   proposal is implemented.

   Adding a new Relay subprotocol version lets testing relays identify other
   relays that support IPv6 extends. It also allows us to eventually recommend
   or require support for IPv6 extends on all relays.

   Append to the Relay version 2 subprotocol specification:

          Relay=2 has limited IPv6 support:
            * Clients might not include IPv6 ORPorts in EXTEND2 cells.
            * Relays (and bridges) might not initiate IPv6 connections in
              response to EXTEND2 cells containing IPv6 ORPorts, even if they
              are configured with an IPv6 ORPort.
          However, relays accept inbound connections to their IPv6 ORPorts,
          and will extend circuits via those connections.

   "3" -- relays support extending over IPv6 connections in response to an
          EXTEND2 cell containing an IPv6 ORPort.

          Bridges might not extend over IPv6, because they try to imitate
          client behaviour.

          A successful IPv6 extend requires:
            * Relay subprotocol version 3 (or later) on the extending relay,
            * an IPv6 ORPort on the extending relay,
            * an IPv6 ORPort for the accepting relay in the EXTEND2 cell, and
            * an IPv6 ORPort on the accepting relay.
          (Because different tor instances can have different views of the
          network, these checks should be done when the path is selected.
          Extending relays should only check local IPv6 information, before
          attempting the extend.)

          When relays receive an EXTEND2 cell containing both an IPv4 and an
          IPv6 ORPort, and there is no existing authenticated connection with
          the target relay, the extending relay may choose between IPv4 and
          IPv6 at random. The extending relay might not try the other address,
          if the first connection fails.
          (TODO: check final behaviour after code is merged.)

          As is the case with other subprotocol versions, tor advertises,
          recommends, or requires support for this protocol version, regardless
          of its current configuration.

          In particular:
            * relays without an IPv6 ORPort, and
            * tor instances that are not relays,
          have the following behaviour, regardless of their configuration:
            * advertise support for "Relay=3" in their descriptor
              (if they are a relay, bridge, or directory authority), and
            * react to consensuses recommending or requiring support for
              "Relay=3".

          This subprotocol version is described in proposal 311, and
          implemented in Tor 0.4.4.1-alpha.
          (TODO: check version after code is merged).
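
   As an illustration of the "Relay=3" text above, the following Python
   sketch checks the four conditions for a successful IPv6 extend at path
   selection time, and picks between the IPv4 and IPv6 ORPorts in an EXTEND2
   cell at random. The RelayView type, the function names, and the ORPort
   string format are hypothetical, and are not part of tor's implementation.

```python
import random
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RelayView:
    """Hypothetical summary of one relay, as seen by the path selector."""
    relay_versions: Set[int] = field(default_factory=set)  # e.g. {1, 2, 3}
    ipv6_orport: Optional[str] = None  # e.g. "[2001:db8::1]:9001"


def path_allows_ipv6_extend(extending: RelayView, accepting: RelayView,
                            extend2_ipv6_orport: Optional[str]) -> bool:
    """Return True if all four listed conditions hold at path selection."""
    return (
        any(v >= 3 for v in extending.relay_versions)  # Relay=3 or later
        and extending.ipv6_orport is not None          # extending relay
        and extend2_ipv6_orport is not None            # in the EXTEND2 cell
        and accepting.ipv6_orport is not None          # accepting relay
    )


def choose_extend_orport(ipv4_orport: Optional[str],
                         ipv6_orport: Optional[str],
                         rng: random.Random) -> Optional[str]:
    """Pick one ORPort from an EXTEND2 cell; random when both are present."""
    candidates = [p for p in (ipv4_orport, ipv6_orport) if p is not None]
    return rng.choice(candidates) if candidates else None
```

   Only the extending relay needs Relay=3 in this sketch: the accepting relay
   just needs an IPv6 ORPort, because Relay=2 relays already accept inbound
   IPv6 connections.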

6. Test Plan

   We provide a quick summary of our testing plans.

6.1. Test IPv6 ORPort Reachability and Extends

   We propose to test these changes using chutney networks with AssumeReachable
   disabled. (Chutney currently enables AssumeReachable by default.)

   We also propose to test these changes on the public network with a small
   number of relays and bridges.

   Once these changes are merged, volunteer relay and bridge operators will be
   able to test them by:
     * compiling from source,
     * running nightly builds, or
     * running alpha releases.

6.2. Test Existing Features

   We will modify and test these existing features:
     * IPv4 ORPort reachability checks

   We do not plan on modifying these existing features:
     * relay reachability retries
       TODO: Do relays re-check their own reachability? How often?
     * relay canonical connections
     * "too many connections" warning logs
   But we will test that they continue to function correctly, and fix any bugs
   triggered by the modifications in this proposal.

6.3. Test Legacy Relay Compatibility

   We will also test IPv6 extends from newer relays (which implement this
   proposal) to older relays (which do not). Although this proposal does not
   create these kinds of circuits, we need to check for bugs and excessive
   logs in older tor versions.

7. Ongoing Monitoring

   To monitor the impact of these changes:
     * relays should collect basic IPv6 connection statistics, and
     * relays and bridges should collect basic IPv6 bandwidth statistics.
   (See [Proposal 313: Relay IPv6 Statistics]).

   Some of these statistics may be included in tor's heartbeat logs, making
   them accessible to relay operators.

   We do not propose to collect additional statistics on:
     * circuit counts, or
     * failure rates.
   Collecting statistics like these could impact user privacy.

   We also plan to write a script to calculate the number of IPv6 relays in
   the consensus. This script will help us monitor the network during the
   deployment of these new IPv6 features.
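
   A monitoring script of this kind could be as small as the following
   Python sketch. It assumes that each relay entry in the consensus starts
   with an "r" line, and that IPv6 ORPorts appear in "a" lines with a
   bracketed address (see the directory specification). The function name
   and the sample input are made up for illustration.

```python
from typing import Tuple


def count_ipv6_relays(consensus_text: str) -> Tuple[int, int]:
    """Return (total relays, relays with at least one IPv6 "a" line)."""
    total = 0
    with_ipv6 = 0
    seen_ipv6 = False  # has the current relay already been counted?
    for line in consensus_text.splitlines():
        if line.startswith("r "):
            # A new relay entry begins.
            total += 1
            seen_ipv6 = False
        elif line.startswith("a [") and total > 0 and not seen_ipv6:
            # Bracketed address: an IPv6 ORPort for the current relay.
            with_ipv6 += 1
            seen_ipv6 = True
    return total, with_ipv6


# Made-up sample entries, trimmed to the lines this sketch looks at.
SAMPLE = """\
r relayA AAAA BBBB 2020-01-28 00:00:00 192.0.2.1 9001 0
a [2001:db8::1]:9001
s Fast Running
r relayB CCCC DDDD 2020-01-28 00:00:00 192.0.2.2 9001 0
s Running
"""

total, with_ipv6 = count_ipv6_relays(SAMPLE)
print(f"{with_ipv6} of {total} relays have an IPv6 ORPort")
```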

8. Changes to Other Proposals

   [Proposal 306: Client Auto IPv6 Connections] needs to be modified to keep
   bridge IPv6 behaviour in sync with client IPv6 behaviour. (See section
   3.3.2.)

References:

[Onion Service Protocol]:
   In particular, Version 3 of the Onion Service Protocol supports IPv6:
   https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt

[Proposal 306: Client Auto IPv6 Connections]:
   One possible design for automatic client IPv4 and IPv6 connections is at:
   https://gitweb.torproject.org/torspec.git/tree/proposals/306-ipv6-happy-eyeballs.txt
   (TODO: modify to include bridge changes with client changes)

[Proposal 312: Relay Auto IPv6 Address]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/312-relay-auto-ipv6-addr.txt

[Proposal 313: Relay IPv6 Statistics]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/313-relay-ipv6-stats.txt

[Relay Search]:
   https://metrics.torproject.org/rs.html

[Tor Specification]:
   https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt

Filename: 312-relay-auto-ipv6-addr.txt
Title: Tor Relay Automatic IPv6 Address Discovery
Author: teor, Nick Mathewson, s7r
Created: 28-January-2020
Status: Accepted
Ticket: #33073

0. Abstract

   We propose that Tor relays (and bridges) should automatically find their
   IPv6 address.

   Like tor's existing IPv4 address auto-detection, the chosen IPv6 address
   will be published as an IPv6 ORPort in the relay's descriptor. Clients,
   relays, and authorities connect to relay descriptor IP addresses.
   Therefore, IP addresses in descriptors need to be publicly routable. (If
   the relay is running on the public tor network.)

   To discover their IPv6 address, some relays may fetch directory documents
   over IPv6. (For anonymity reasons, bridges are unable to fetch directory
   documents over IPv6, until clients start to do so.)

1. Introduction

   Tor relays (and bridges) currently find their IPv4 address, and use it as
   their ORPort and DirPort address when publishing their descriptor. But
   relays and bridges do not automatically find their IPv6 address.

   However, relay operators can manually configure an ORPort with an IPv6
   address, and that ORPort is published in their descriptor in an "or-address"
   line (see [Tor Directory Protocol]).

   Many relay operators don't know their relay's IPv4 or IPv6 addresses. So
   they rely on Tor's IPv4 auto-detection, and don't configure an IPv6
   address. When operators do configure an IPv6 address, it's easy for them to
   make mistakes. IPv6 ORPort issues are a significant source of relay
   operator support requests.

   Implementing IPv6 address auto-detection, and IPv6 ORPort reachability
   checks (see [Proposal 311: Relay IPv6 Reachability]) will increase the
   number of working IPv6-capable relays in the tor network.

2. Scope

   This proposal modifies Tor's behaviour as follows:

   Relays, bridges, and directory authorities:
     * automatically find their IPv6 address, and
     * for consistency between IPv4 and IPv6 detection:
       * start using IPv4 ORPort for IPv4 address detection, and
       * re-order IPv4 address detection methods.

   Relays (but not bridges, or directory authorities):
     * fetch some directory documents over IPv6.

   For anonymity reasons, bridges are unable to fetch directory documents over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   For security reasons, directory authorities must only use addresses that
   are explicitly configured in their torrc.

   This proposal makes a small, optional change to existing client behaviour:
     * clients also check IPv6 addresses when rotating TLS keys for new
       networks.
   This change is in addition to the changes to IPv4 address resolution, most
   of which won't affect clients. (Because clients do not set Address or
   ORPort.)

   Throughout this proposal, "relays" includes directory authorities, except
   where they are specifically excluded. "relays" does not include bridges,
   except where they are specifically included. (The first mention of "relays"
   in each section should specifically exclude or include these other roles.)

   When this proposal describes Tor's current behaviour, it covers all
   supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except
   where another version is specifically mentioned.

3. Finding Relay IPv6 Addresses

   We propose that Tor relays (and bridges) should automatically find their
   IPv6 address.

   Like tor's existing IPv4 address auto-detection, the chosen IPv6 address
   will be published as an IPv6 ORPort in the relay's descriptor. Clients,
   relays, and authorities connect to relay descriptor IP addresses.
   Therefore, IP addresses in descriptors need to be publicly routable. (If
   the relay is running on the public tor network.)

   Relays should ignore any addresses that are reserved for private networks,
   and check the reachability of addresses that appear to be public (see
   [Proposal 311: Relay IPv6 Reachability]). Relays should only publish IP
   addresses in their descriptor, if they are public and reachable. (If the
   relay is not running on the public tor network, it may use any IP address.)

   To discover their IPv6 address, some relays may fetch directory documents
   over IPv6. (For anonymity reasons, bridges are unable to fetch directory
   documents over IPv6, until clients start to do so. For security reasons,
   directory authorities only use addresses that are explicitly configured in
   their torrc.)

3.1. Current Relay IPv4 Address Discovery

   Currently, all relays (and bridges) must have an IPv4 address. IPv6
   addresses are optional for relays.

   Tor currently tries to find relay IPv4 addresses in this order:
     1. the Address torrc option
     2. the address of the hostname (resolved using DNS, if needed)
     3. a local interface address
        (by making an unused socket, if needed)
     4. an address reported by a directory server (using X-Your-Address-Is)

   When using the Address option, or the hostname, tor supports:
     * an IPv4 address literal, or
     * resolving an IPv4 address from a hostname.

   If tor is running on the public network, and an address isn't globally
   routable, tor ignores it. (If it was explicitly set in Address, tor logs an
   error.)

   If there are multiple valid addresses, tor chooses:
     * the first address returned by the resolver,
     * the first address returned by the local interface API, and
     * the latest address(es) returned by a directory server, DNS, or the
       local interface API.

3.1.1. Current Relay IPv4 and IPv6 Address State Management

   Currently, relays (and bridges) manage their IPv4 address discovery state,
   as described in the following table:

                       a b c d e f
   1. Address literal  . . . . . .
   1. Address hostname S N . . . T
   2. auto hostname    S N . . F T
   3. auto interface   ? ? . . F ?
   3. auto socket      ? ? . . F ?
   4. auto dir header  D N D D F A

   IPv6 address discovery only uses the first IPv6 ORPort address:

                       a b c d e f
   1. ORPort listener  . . C . F .
   1. ORPort literal   . . C C F .
   1. ORPort hostname  S N C C F T

   The tables are structured as follows:
     * rows are address resolution stage variants
       * each address resolution stage has a number, and a description
       * the description includes any variants
         (for example: IP address literal, or hostname)
     * columns describe each variant's state management.

   The state management key is:
    a. What kind of API is used to perform the address resolution?
      * . a CPU-bound API
      * S a synchronous query API
      * ? an API that is probably CPU-bound, but may be synchronous on some
          platforms
      * D tor automatically updates the stored directory address, whenever a
          directory document is received
    b. What does the API depend on?
      * . a CPU-bound API
      * N a network-bound API
      * ? an API that is probably CPU-bound, but may be network-bound on some
          platforms
    c. How are any discovered addresses stored?
      * . addresses are not stored
          (but they may be cached by some higher-level tor modules)
      * D addresses are stored in the directory address suggestion variable
      * C addresses are stored in the port config listener list
    d. What event makes the address resolution happen?
      * . when tor wants to know its own address
      * D when a directory document is received
      * C when tor parses its config at startup, and during reconfiguration
    e. What conditions make tor attempt this address resolution method?
      * . this method is always attempted
      * F this method is only attempted when all other higher-priority
          methods fail to return an address
    f. Can this method timeout?
      * . can't time out
      * T might time out
      * ? probably doesn't time out, but might time out on some platforms
      * A can't time out, because it is asynchronous. If a stored address
          is available, it is returned immediately.

3.2. Finding Relay IPv6 Addresses

   We propose that relays (and bridges) try to find their IPv6 address. For
   consistency, we also propose to change the address resolution order for
   IPv4 addresses.

   We use the following general principles to choose the order of IP address
   methods:
     * Explicit is better than Implicit,
     * Local Information is better than a Remote Dependency,
     * Trusted is better than Untrusted, and
     * Reliable is better than Unreliable.
   Within these constraints, we try to find the simplest working design.

   If a relay is given the wrong address by an attacker, the attacker can
   direct all inbound relay traffic to their own address. They can't decrypt
   the traffic without the relay's private keys, but they can monitor traffic
   patterns.

   Therefore, relays should only use untrusted address discovery methods, if
   every other method has failed. Any method that uses DNS is potentially
   untrusted, because DNS is often a remote, unauthenticated service. And
   addresses provided by other directory servers are also untrusted.

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc.

   Based on these principles, we propose that tor tries to find relay IPv4 and
   IPv6 addresses in this order:
     1. the Address torrc option
     2. the advertised ORPort address
     3. a local interface address
        (by making an unused socket, if needed)
     4. the address of the host's own hostname (resolved using DNS, if needed)
     5. an address reported by a directory server (using X-Your-Address-Is)

   Each of these address resolution steps is described in more detail, in its
   own subsection.
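
   The ordered fallback above can be sketched in Python as a list of methods
   that are tried in priority order, stopping at the first one that returns
   an address. This is a simplified illustration for a single address
   family. The function name and the placeholder methods are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Each method has a label, and returns an address string or None.
Method = Tuple[str, Callable[[], Optional[str]]]


def find_address(methods: List[Method]) -> Optional[Tuple[str, str]]:
    """Try each method in order; return (label, address) for the first hit."""
    for label, method in methods:
        address = method()
        if address is not None:
            return label, address
    return None  # every method failed


# Placeholder methods: only the interface lookup succeeds in this example.
methods: List[Method] = [
    ("Address option", lambda: None),
    ("advertised ORPort", lambda: None),
    ("local interface", lambda: "203.0.113.7"),
    ("own hostname", lambda: "203.0.113.8"),
    ("directory header", lambda: "203.0.113.9"),
]

print(find_address(methods))
```

   Because the list is ordered, the untrusted methods (hostname lookups and
   directory headers) are only consulted when every earlier method has
   failed.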

   For anonymity reasons, bridges are unable to fetch directory documents over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   We avoid using advertised DirPorts for address resolution, because:
     * they are not supported on bridges,
     * they are not supported on IPv6,
     * they may not be configured on a relay, and
     * it is unlikely that a relay operator would configure an ORPort without
       an IPv4 address, but configure a DirPort with an IPv4 address.

   While making these changes, we want to preserve tor's existing behaviour:
     * resolve Address using the local resolver, if needed,
     * ignore private addresses on public tor networks, and
     * when there are multiple valid addresses:
       * if a list of addresses is received, choose the first address, and
       * if different addresses are received over time, choose the most recent
         address.

3.2.1. Make the Address torrc Option Support IPv6

   First, we propose that relays (and bridges) use the Address torrc option
   to find their IPv4 and IPv6 addresses.

   There are two cases we need to cover:

     1. Explicit IP addresses:
        * allow the option to be specified up to two times,
        * use the IPv4 address for IPv4,
        * use the IPv6 address for IPv6.
        Configuring two addresses in the same address family is a config error.

     2. Hostnames / DNS names:
        * allow the option to be specified up to two times,
        * look up the configured name,
        * use the first IPv4 and IPv6 address returned by the resolver.
        Resolving multiple addresses in the same address family is not a
        runtime error, but only the first address from each family will be
        used.

   These lookups should ignore private addresses on public tor networks. If
   multiple IPv4 or IPv6 addresses are returned, the first public address from
   each family should be used.
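
   The "first public address from each family" rule can be sketched in
   Python as follows. The sketch uses the is_global property from Python's
   ipaddress module as a stand-in for tor's own private address checks, which
   may classify some ranges differently. The function name is hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, Optional


def first_public_per_family(addresses: Iterable[str]
                            ) -> Dict[int, Optional[str]]:
    """Map IP version (4 or 6) to the first public address seen, if any."""
    chosen: Dict[int, Optional[str]] = {4: None, 6: None}
    for text in addresses:
        addr = ipaddress.ip_address(text)
        # Skip private addresses, and later addresses in a family that
        # already has a public address.
        if chosen[addr.version] is None and addr.is_global:
            chosen[addr.version] = text
    return chosen


resolved = ["10.0.0.1", "8.8.8.8", "fd00::1",
            "2001:4860:4860::8888", "1.1.1.1"]
print(first_public_per_family(resolved))
```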

   We should support the following combinations of address literals and
   hostnames:

   Legacy configurations:
     A. No configured Address option
     B. Address IPv4 literal
     C. Address hostname (use IPv4 and IPv6 DNS addresses)

   New configurations:
     D. Address IPv6 literal
     E. Address IPv4 literal / Address IPv6 literal
     F. Address hostname / Address hostname (use IPv4 and IPv6 DNS addresses)
     G. Address IPv4 literal / Address hostname (only use IPv6 DNS addresses)
     H. Address hostname (only use IPv4 DNS addresses) / Address IPv6 literal
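
   The combinations above can be checked with a small classifier, sketched
   here in Python. It labels each configured Address value as an IPv4
   literal, an IPv6 literal, or a hostname, and rejects more than two values
   or two literals in the same family. The function name and error messages
   are hypothetical, and the sketch does not perform any DNS lookups.

```python
import ipaddress
from typing import List, Tuple


def classify_address_options(values: List[str]) -> List[Tuple[str, str]]:
    """Return (kind, value) pairs, or raise ValueError on a config error."""
    if len(values) > 2:
        raise ValueError("Address may be specified at most two times")
    kinds: List[Tuple[str, str]] = []
    for value in values:
        try:
            version = ipaddress.ip_address(value).version
        except ValueError:
            # Not an IP literal, so treat it as a hostname.
            kinds.append(("hostname", value))
        else:
            kinds.append((f"IPv{version} literal", value))
    literals = [kind for kind, _ in kinds if kind != "hostname"]
    if len(literals) != len(set(literals)):
        raise ValueError("two Address literals in the same address family")
    return kinds


# Combination G: an IPv4 literal, plus a hostname used for IPv6 only.
print(classify_address_options(["192.0.2.1", "relay.example.org"]))
```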

   If we can't find an IPv4 or IPv6 address using the configured Address
   options:
     No IPv4: guess IPv4, and its reachability must succeed.
     No IPv6: guess IPv6, publish if reachability succeeds.

   Combinations A and B are the most common legacy configurations. We want to
   support the following outcomes for all legacy configurations:
     * automatic upgrades to guessed and reachable IPv6 addresses,
     * continuing to operate on IPv4 when the IPv6 address can't be guessed,
       and
     * continuing to operate on IPv4 when the IPv6 address has been guessed,
       but it is unreachable.

   At this time, we do not propose guessing multiple IPv4 or IPv6 addresses
   and testing their reachability (see section 3.4.2).

   It is an error to configure an Address option with a private IPv4 or IPv6
   address. Tor should warn if a configured Address hostname does not resolve
   to any publicly routable IPv4 or IPv6 addresses. (In both these cases, if
   tor is configured with a custom set of directory authorities, private
   addresses should be allowed, with a notice-level log.)

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Therefore, we propose that directory
   authorities only accept IPv4 or IPv6 address literals in their Address
   option. They must not attempt to resolve their Address using DNS. It is a
   config error to provide a hostname as a directory authority's Address.

   If the Address option is not configured for IPv4 or IPv6, or the hostname
   lookups do not provide both IPv4 and IPv6 addresses, address resolution
   should go to the next step.

3.2.2. Use the Advertised ORPort IPv4 and IPv6 Addresses

   Next, we propose that relays (and bridges) use the first advertised ORPort
   IPv4 and IPv6 addresses, as configured in their torrc.

   The ORPort address may be a hostname. If it is, tor should try to use it to
   resolve an IPv4 and IPv6 address, and open ORPorts on the first available
   IPv4 and IPv6 address. Tor should respect the IPv4Only and IPv6Only port
   flags, if specified. (Tor currently resolves IPv4 and IPv6 addresses from
   hostnames in ORPort lines.)

   Relays (and bridges) currently use the first advertised ORPort IPv6 address
   as their IPv6 address. We propose to use the first advertised IPv4 ORPort
   address in a similar way, for consistency.

   Therefore, this change may affect existing relay IPv4 addresses. We expect
   that a small number of relays may change IPv4 address, from a guessed IPv4
   address, to their first advertised IPv4 ORPort address.

   In rare cases, relays may have been using non-advertised ORPorts for their
   addresses. This change may also change their addresses.

   Tor currently uses its listener port list to look up its IPv6 ORPort for
   its descriptor. We propose that tor's address discovery uses the listener
   port list for both IPv4 and IPv6. (And does not attempt to independently
   parse or resolve ORPort configs.)

   This design decouples ORPort option parsing, ORPort listener opening, and
   address discovery. It also implements a form of caching: IPv4 and IPv6
   addresses resolved from hostnames are stored in the listener port list,
   then used to open listeners. Therefore, tor should continue to use the same
   address, while the listener remains open. (See also sections 3.2.7 and
   3.2.8.)

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Therefore, we propose that directory
   authorities only accept IPv4 or IPv6 address literals in the address part
   of the ORPort and DirPort options. They must not attempt to resolve these
   addresses using DNS. It is a config error to provide a hostname as a
   directory authority's ORPort or DirPort.

   If directory authorities don't have an IPv4 address literal in their
   Address or ORPort, they should issue a configuration error, and refuse to
   launch. If directory authorities don't have an IPv6 address literal in their
   Address or ORPort, they should issue a notice-level log, and fall back to
   only using IPv4.

   For the purposes of address resolution, tor should ignore private
   configured ORPort addresses on public tor networks. (Binding to private
   ORPort addresses is supported, even on public tor networks, for relays that
   use NAT to reach the Internet.) If an ORPort address is private, address
   resolution should go to the next step.

3.2.3. Use Local Interface IPv6 Address

   Next, we propose that relays (and bridges) use publicly routable addresses
   from the OS interface addresses or routing table, as their IPv4 and IPv6
   addresses.

   Tor has local interface address resolution functions, which support most
   major OSes. Tor uses these functions to guess its IPv4 address. We propose
   using them to also guess tor's IPv6 address.

   We also propose modifying the address resolution order, so interface
   addresses are used before the local hostname. This decision is based
   on our principles: interface addresses are local, trusted, and reliable;
   hostname lookups may be remote, untrusted, and unreliable.

   Some developer documentation also recommends using interface addresses,
   rather than resolving the host's own hostname. For example, on recent
   versions of macOS, the man pages tell developers to use interface addresses
   (getifaddrs) rather than look up the host's own hostname (gethostname and
   getaddrinfo). Unfortunately, these man pages don't seem to be available
   online, except for short quotes (see [getaddrinfo man page] for the
   relevant quote).

   If the local interface addresses are unavailable, tor opens a UDP socket to
   a publicly routable address, but doesn't actually send any packets.
   Instead, it uses the socket APIs to discover the interface address for the
   socket. (UDP is used because it is stateless, so the OS will not send any
   packets to open a connection.)
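
   The unused socket technique can be sketched in Python using the standard
   socket module. Calling connect() on a UDP socket only selects a route and
   a local address; it does not send a packet. The target addresses and port
   below are arbitrary choices for illustration, and are not necessarily the
   ones tor uses. The function name is hypothetical.

```python
import socket
from typing import Optional


def guess_interface_address(family: int, target: str) -> Optional[str]:
    """Return the local address the OS would use to reach target, or None."""
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            # No packet is sent: UDP connect() just records the peer address
            # and makes the OS choose a source address for it.
            sock.connect((target, 9))
            return sock.getsockname()[0]
    except OSError:
        # No route, no support for this address family, or a local policy
        # that blocks the call.
        return None


print(guess_interface_address(socket.AF_INET, "1.1.1.1"))
print(guess_interface_address(socket.AF_INET6, "2001:4860:4860::8888"))
```

   The returned address may still be private (for example, behind NAT), so it
   needs the same private address filtering as any other interface address.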

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Since local interface addresses are
   implicit, and may depend on DHCP, directory authorities do not use this
   address resolution method (or any of the other, lower-priority address
   resolution methods).

   Relays that use NAT to reach the Internet may have no publicly routable
   local interface addresses, even on the public tor network. The NAT box has
   the publicly routable addresses, and it may be a separate machine.

   Relays may also be unable to detect any local interface addresses. The
   required APIs may be unavailable, due to:
     * missing OS or library features, or
     * local security policies.

   Tor already ignores private IPv4 interface addresses on public relays. We
   propose to also ignore private IPv6 interface addresses. If all IPv4 or
   IPv6 interface addresses are private, address resolution should go to the
   next step.

3.2.4. Use Own Hostname IPv6 Addresses

   Next, we propose that relays (and bridges) get their local hostname, look
   up its addresses, and use them as their IPv4 and IPv6 addresses.

   We propose to use the same underlying lookup functions to look up the IPv4
   and IPv6 addresses for:
     * the Address torrc option (see section 3.2.1), and
     * the local hostname.
   However, OS APIs typically only return a single hostname. (Rather than a
   separate hostname for IPv4 and IPv6.)

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Since hostname lookups may use DNS,
   directory authorities do not use this address resolution method.

   The hostname lookup should ignore private addresses on public relays. If
   multiple IPv4 or IPv6 addresses are returned, the first public address from
   each family should be used. If all IPv4 or IPv6 hostname addresses are
   private, address resolution should go to the next step.

3.2.5. Use Directory Header IPv6 Addresses

   Finally, we propose that relays get their IPv4 and IPv6 addresses from the
   X-Your-Address-Is HTTP header in tor directory documents. To support this
   change, we propose that relays start fetching directory documents over IPv4
   and IPv6.

   We propose that bridges continue to only fetch directory documents over
   IPv4, because they try to imitate clients. (Most clients only fetch
   directory documents over IPv4; a few clients are configured to only fetch
   over IPv6.) When client behaviour changes to use both IPv4 and IPv6 for
   directory fetches, bridge behaviour can also change to match. (See
   section 3.4.1 and [Proposal 306: Client Auto IPv6 Connections].)

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Since directory headers are provided
   by other directory servers, directory authorities do not use this address
   resolution method.

   We propose to use a simple load balancing scheme for IPv4 and IPv6
   directory requests:
     * choose between IPv4 and IPv6 directory requests at random.

   We do not expect this change to have any load-balancing impact on the public
   tor network, because the number of relays is much smaller than the number
   of clients. However, the 6 directory authorities with IPv6 enabled may see
   slightly more directory load, particularly over IPv6.

   To support this change, tor should also change how it handles IPv6
   directory failures on relays:
     * avoid recording IPv6 directory failures as remote relay failures,
       because they may actually be due to a lack of IPv6 connectivity on the
       local relay, and
     * issue IPv6 directory failure logs at notice level, and rate-limit them
       to one per hour.

   If:
     * a relay is explicitly configured with an IPv6 address, or
     * a publicly routable, reachable IPv6 address is discovered in an
       earlier step,
   tor should start issuing IPv6 directory failure logs at warning level. Tor
   may also record these directory failures as remote relay failures. (Rather
   than ignoring them, as described in the previous paragraph.)
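
   The load balancing and failure handling rules above can be sketched in
   Python as follows. The FailurePolicy type and the function names are
   hypothetical. The proposal does not state a rate limit for warning-level
   logs, so the sketch leaves it unset, and it treats the optional "may also
   record" behaviour as enabled.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailurePolicy:
    log_level: str                      # "notice" or "warn"
    rate_limit_seconds: Optional[int]   # None: no limit stated
    count_as_remote_failure: bool


def choose_dir_family(rng: random.Random) -> int:
    """Choose IPv4 or IPv6 for the next directory request, at random."""
    return rng.choice([4, 6])


def ipv6_dir_failure_policy(ipv6_configured: bool,
                            ipv6_found_reachable: bool) -> FailurePolicy:
    """Decide how a relay treats a failed IPv6 directory request."""
    if ipv6_configured or ipv6_found_reachable:
        # The relay should have working IPv6, so the failure is notable.
        return FailurePolicy("warn", None, True)
    # The failure may just mean the local relay has no IPv6 connectivity.
    return FailurePolicy("notice", 3600, False)
```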

   (Alternately, tor could stop doing IPv6 directory requests entirely. But we
   prefer designs where all relays behave in a similar way, regardless of their
   internal state.)

   For some more complex directory load-balancing schemes, see section 3.5.4.

   Tor already ignores private IPv4 addresses in directory headers. We propose
   to also ignore private IPv6 addresses in directory headers. If all IPv4 and
   IPv6 addresses in directory headers are private, address resolution should
   return a temporary error.
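
   Handling of the X-Your-Address-Is header can be sketched in Python as
   follows: private addresses are ignored, and the most recent public
   address in each family replaces any stored suggestion. The function name
   and the dictionary-based storage are hypothetical, and is_global from
   Python's ipaddress module stands in for tor's own private address checks.

```python
import ipaddress
from typing import Dict, Optional


def update_suggested_address(headers: Dict[str, str],
                             stored: Dict[int, str]) -> Optional[str]:
    """Store a public header address by IP version; return it, or None."""
    value = headers.get("X-Your-Address-Is")
    if value is None:
        return None
    try:
        addr = ipaddress.ip_address(value.strip())
    except ValueError:
        return None  # not an IP address literal
    if not addr.is_global:
        return None  # ignore private addresses
    # The most recent public address replaces any earlier suggestion.
    stored[addr.version] = str(addr)
    return str(addr)


suggestions: Dict[int, str] = {}
update_suggested_address({"X-Your-Address-Is": "192.168.1.5"}, suggestions)
update_suggested_address({"X-Your-Address-Is": "8.8.8.8"}, suggestions)
update_suggested_address({"X-Your-Address-Is": "1.1.1.1"}, suggestions)
print(suggestions)
```

   An empty result for both families corresponds to the temporary error
   described above.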

   Whenever address resolution fails, tor should warn the operator to set the
   Address torrc option for IPv4 and IPv6. (If IPv4 is available, and only
   IPv6 is missing, the log should be at notice level.) These logs may need to
   be rate-limited.

   The next time tor receives a directory header containing a public IPv4 or
   IPv6 address, tor should use that address for reachability checks. If the
   reachability checks succeed, tor should use that address in its descriptor.

   Doing relay directory fetches over IPv6 will create extra IPv6 connections
   and IPv6 bandwidth on the tor network. (See
   [Proposal 313: Relay IPv6 Statistics].) In addition, some client circuits
   may use the IPv6 connections created by relay directory fetches.

3.2.6. Disabling IPv6 Address Resolution

   Relays (and bridges) that have a reachable IPv6 address, but that address
   is unsuitable for the relay, need to be able to disable IPv6 address
   resolution.

   Based on [Proposal 311: Relay IPv6 Reachability], and this proposal, those
   relays would:
     * discover their IPv6 address,
     * open an IPv6 ORPort,
     * find it reachable,
     * publish a descriptor containing that IPv6 ORPort,
     * have the directory authorities find it reachable,
     * have it published in the consensus, and
     * have it used by clients,
   regardless of how the operator configures their tor instance.

   Currently, relays are required to have an IPv4 address. So if the guessed
   IPv4 address is unsuitable, operators can set the Address option to a
   suitable IPv4 address. But IPv6 addresses are optional, so relay operators
   may need to disable IPv6 entirely.

   We propose a new torrc-only option, AddressDisableIPv6. This option is set
   to 0 by default. If the option is set to 1, tor disables IPv6 address
   resolution, IPv6 ORPorts, IPv6 reachability checks, and publishing an IPv6
   ORPort in its descriptor.

3.2.6.1. Disabling IPv6 Address Resolution: Alternative Design

   As an alternative design, tor could change its interpretation of the
   IPv4Only flag, so that the following configuration lines disable IPv6
   (in the absence of any non-IPv4Only ORPort lines):
     * ORPort 9999 IPv4Only
     * ORPort 1.1.1.1:9999 IPv4Only

   However, we believe that this is a confusing design, because we want to
   enable IPv6 address resolution on this similar, very common configuration:
     * ORPort 1.1.1.1:9999

   Therefore, we avoid this design, because it changes the meaning of existing
   flags and options.

3.2.7. Automatically Enabling an IPv6 ORPort

   We propose that relays (and bridges) that discover their IPv6 address,
   should open an IPv6 ORPort, and test its reachability (see
   [Proposal 311: Relay IPv6 Reachability], particularly section 4.3.1).

   The ORPort should be opened on the port configured in the relay's ORPort
   torrc option. Relay operators can use the IPv4Only and IPv6Only flags
   to configure different ports for IPv4 and IPv6.

   If the ORPort is auto-detected, there will not be any specific bind
   address. (And the detected address may actually be on a NAT box, rather
   than the local machine.) Therefore, relays should attempt to bind to all
   IPv4 and IPv6 addresses (or all interfaces).

   Some operating systems expect applications to bind to IPv4 and IPv6
   addresses using separate API calls. Others don't support binding only to
   IPv4 or IPv6, and will bind to all addresses whenever there is no specified
   IP address (in a single API call). Tor should support both styles of
   networking API.

   In particular, if binding to all IPv6 addresses fails, relays should still
   try to discover their public IPv6 address, and check the reachability of
   that address. Some OSes may not support the IPV6_V6ONLY flag, but they may
   instead bind to all addresses at runtime. (The tor install may also have
   compile-time / runtime flag mismatches.)

   If both reachability checks succeed, relays should publish their IPv4 and
   IPv6 ORPorts in their descriptor.

   If only the IPv4 ORPort check succeeds, and the IPv6 address was guessed
   (rather than being explicitly configured), then relays should:
     * publish their IPv4 ORPort in their descriptor,
     * stop publishing their IPv6 ORPort in their descriptor, and
     * log a notice about the failed IPv6 ORPort reachability check.
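
   The publishing rules above can be sketched as follows. This is a minimal
   Python illustration, not tor's implementation: the function name, its
   arguments, and its return values are hypothetical. Combinations that this
   section does not describe return None.

```python
def orports_to_publish(ipv4_reachable, ipv6_reachable, ipv6_was_guessed):
    """Return (published_families, log_ipv6_notice), or None when this
    section does not specify the outcome. (Hypothetical helper.)"""
    if ipv4_reachable and ipv6_reachable:
        # Both reachability checks succeeded: publish both ORPorts.
        return (["IPv4", "IPv6"], False)
    if ipv4_reachable and ipv6_was_guessed:
        # Only the IPv4 check succeeded, and the IPv6 address was guessed:
        # publish IPv4 only, and log a notice about the IPv6 failure.
        return (["IPv4"], True)
    # Other combinations are outside the scope of this sketch.
    return None
```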

3.2.8. Proposed Relay IPv4 and IPv6 Address State Management

   We propose that relays (and bridges) manage their IPv4 and IPv6 address
   discovery state, as described in the following table:

                       a b c d e f
   1. Address literal  . . . . . .
   1. Address hostname S N . . . T
   2. ORPort listener  . . C . F .
   2. ORPort literal   . . C C F .
   2. ORPort hostname  S N C C F T
   3. auto interface   ? ? . . F ?
   3. auto socket      ? ? . . F ?
   4. auto hostname    S N . . F T
   5. auto dir header  D N D D F A

   See section 3.1.1 for a description and key for this table. See the rest of
   section 3.2 for a detailed description of each method and variant.

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Therefore, they stop after step 2.
   (And don't use the "hostname" variants in steps 1 and 2.)

   For anonymity reasons, bridges are unable to fetch directory documents over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

3.3. Consequential Tor Client Changes

   We do not propose any required client address resolution changes at this
   time.

   However, clients will use the updated address resolution functions to detect
   when they are on a new connection, and therefore need to rotate their TLS
   keys.

   This minor client change allows us to avoid keeping an outdated version of
   the address resolution functions, which is only for client use.

   Clients should skip address resolution steps that don't apply to them, such
   as:
     * the ORPort option, and
     * the Address option, if it becomes a relay module option.

3.4. Alternative Address Resolution Designs

   We briefly mention some potential address resolution designs, and the
   reasons that they were not used in this proposal.

   (Some designs may be proposed for future Tor versions, but are not necessary
   at this time.)

3.4.1. Future Bridge IPv6 Address Resolution Behaviour

   When clients automatically fetch directory documents via relay IPv4 and
   IPv6 ORPorts by default, bridges should also adopt this dual-stack
   behaviour. (For example, see [Proposal 306: Client Auto IPv6 Connections].)

   When bridges fetch directory documents via IPv6, they will be able to find
   their IPv6 address using directory headers (see 3.2.5).

3.4.2. Guessing Multiple IPv4 or IPv6 Addresses

   We avoid designs which guess (or configure) multiple IPv4 or IPv6
   addresses, test them all for reachability, and choose one that works.

   Using multiple addresses is rare, and the code to handle it is complex. It
   also requires careful design to avoid:
     * conflicts between multiple relays (or bridges) on the same address
       (tor allows up to 2 relays per IPv4 address),
     * relay flapping,
     * race conditions, and
     * relay address switching.

3.4.3. Rejected Address Resolution Designs

   We reject designs that try all the different address resolution methods,
   score addresses, and then choose the address with the highest score.

   These designs are a generalisation of designs that try different methods in
   a set order (like this proposal). They are more complex than required.
   Complex designs can confuse operators, particularly when they fail.

   Operators should not need complex address resolution in tor: most relay
   (and bridge) addresses are fixed, or change occasionally. And most relays
   can reliably discover their address using directory headers, if all other
   methods fail. (Bridges won't discover their IPv6 address from directory
   headers, see section 3.2.5.)

   If complex address resolution is required, it can be configured using a
   dynamic DNS name in the Address torrc option, or via the control port.

   We also avoid designs that use any addresses other than the first
   (or latest) valid IPv4 and IPv6 address. These designs are more complex, and
   they don't have clear benefits:
     * sort addresses numerically (avoid address flipping)
     * sort addresses by length, then numerically
       (also minimise consensus size)
     * store a list of previous addresses in the state file, and use the most
       recently used address that's currently available.

   Operators who want to avoid address flipping should set the Address option
   in the torrc. Operators who want to minimise the size of the consensus
   should use all-zero IPv6 host identifiers.

3.5. Optional Efficiency and Reliability Changes

   We propose some optional changes for efficiency and reliability, and
   describe their impact.

   Some of these changes may be more appropriate in future releases, or
   along with other proposed features.

   Some of these changes make tor ignore some potential IP addresses.

   Ignoring addresses risks relays having no available ORPort addresses, and
   refusing to publish their descriptor. So before we ignore any addresses, we
   should make sure that:
     * tor's other address detection methods are robust and reliable, and
     * we would prefer relays to shut down, rather than use the ignored
       address.

   As a less severe alternative, low-quality methods can be put last in the
   address resolution order. (See section 3.2.)

   If relays prefer addresses from particular sources (for example: ORPorts),
   they should try these sources regularly, so that their addresses do not
   become too old.

   If relays ignore addresses from some sources (for example: DirPorts), they
   must regularly try other sources (for example: ORPorts).

3.5.1. Using Authenticated IPv4 and IPv6 Addresses

   We propose this optional change, to improve relay (and bridge) address
   accuracy and reliability.

   Relays should try to use authenticated connections to discover their own
   IPv4 and IPv6 addresses.

   Tor supports two kinds of authenticated address information:
     * authenticated directory connections, and
     * authenticated NETINFO cells.
   See the following sections for more details.

   See also sections 3.5.2 to 3.5.4.

3.5.1.1. Authenticated Directory Connections

   We propose this optional change, to improve relay address accuracy and
   reliability. (Bridges are not affected, because they already use
   authenticated directory connections, just like clients.)

   Tor supports authenticated, encrypted directory fetches using BEGINDIR over
   ORPorts (see the [Tor Specification] for details).

   Relays currently fetch unencrypted directory documents over DirPorts. The
   directory document itself is signed, but the HTTP headers are not
   authenticated. (Clients and bridges only fetch directory documents using
   authenticated directory fetches.)

   Using authenticated directory headers for relay addresses:
     * provides authenticated address information,
     * reduces the number of attackers that can deliberately give a relay an
       incorrect IP address, and
     * avoids caches (or other machines) accidentally mangling, deleting, or
       repeating X-Your-Address-Is headers.

   To make this change, we need to modify tor's directory connection code:
     * when making directory requests, relays should fetch some directory
       documents using BEGINDIR over ORPorts.

   Once tor regularly gets authenticated X-Your-Address-Is headers, relays can
   change how they handle unauthenticated addresses. When they receive an
   unauthenticated address suggestion, relays can:
     * ignore the address, or
     * use the address as the lowest priority address method.
   See section 3.5 for some factors to consider when making this design
   decision.
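
   The two options for unauthenticated suggestions can be compared with a
   small sketch. This is an illustration only: the function name, the policy
   names "ignore" and "lowest", and the choice of the first authenticated
   suggestion are all assumptions, not part of tor.

```python
def pick_suggested_address(suggestions, unauthenticated_policy="lowest"):
    """suggestions: list of (address, is_authenticated), in arrival order.
    unauthenticated_policy: "ignore" or "lowest" (hypothetical names)."""
    if unauthenticated_policy not in ("ignore", "lowest"):
        raise ValueError("unknown policy")
    # Authenticated suggestions always win.
    for address, is_authenticated in suggestions:
        if is_authenticated:
            return address
    if unauthenticated_policy == "ignore":
        # Option 1: drop unauthenticated suggestions entirely.
        return None
    # Option 2: fall back to an unauthenticated suggestion as a last resort.
    for address, is_authenticated in suggestions:
        if not is_authenticated:
            return address
    return None
```

   With "ignore", a relay that only ever receives unauthenticated suggestions
   ends up with no address from this method, which is the risk described in
   section 3.5.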

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Since directory headers are provided
   by other directory servers, directory authorities do not use this address
   resolution method.

   For anonymity reasons, bridges are unable to fetch directory documents over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   Bridges currently use authenticated IPv4 connections for all their
   directory fetches, to imitate default client behaviour.

   We describe a related change, which is also optional:

   We can increase the number of ORPort directory fetches:
     * if tor has an existing ORPort connection to a relay that it has selected
       for a directory fetch, it should use an ORPort fetch, rather than
       opening an additional DirPort connection.

   Using an existing ORPort connection:
     * saves one DirPort connection and file descriptor,
     * but slightly increases the cryptographic processing done by the relay,
       and by the directory server it is connecting to.
   However, the most expensive cryptographic operations have already happened,
   when the ORPort connection was opened.

   This change does not increase the number of NETINFO cells, because it
   re-uses existing OR connections. See the next section for more details.

3.5.1.2. Authenticated NETINFO Cells

   We propose this optional change, to improve relay (and bridge) address
   accuracy and reliability. (Bridge IPv6 addresses are not affected, because
   bridges only make OR connections over IPv4, to imitate default client
   behaviour.)

   Tor supports authenticated IPv4 and IPv6 address information, using the
   NETINFO cells exchanged at the beginning of each ORPort connection (see the
   [Tor Specification] for details).

   Relays do not currently use any address information from NETINFO cells.

   Using authenticated NETINFO cells for relay addresses:
     * provides authenticated address information,
     * reduces the number of attackers that can deliberately give a relay an
       incorrect IP address, and
     * does not require a directory fetch (NETINFO cells are sent during
       connection setup).

   To make this change, we need to modify tor's cell processing:
     * when processing NETINFO cells, tor should store the OTHERADDR field,
       like it currently does for X-Your-Address-Is HTTP headers, and
     * IPv4 and IPv6 addresses should be stored separately.
   See the previous section, and section 3.2.5 for more details about the
   X-Your-Address-Is HTTP header.
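
   As a rough sketch of the storage change, the following Python keeps the
   most recent suggested address for each address family. The class and
   method names are hypothetical, and the sketch assumes that the OTHERADDR
   value has already been parsed into a textual address.

```python
import ipaddress


class SuggestedAddresses:
    """Latest address suggested by a peer, stored per address family."""

    def __init__(self):
        # IPv4 and IPv6 suggestions are stored separately.
        self.latest = {4: None, 6: None}

    def note_otheraddr(self, otheraddr):
        """Record a suggestion taken from a NETINFO OTHERADDR field."""
        addr = ipaddress.ip_address(otheraddr)
        self.latest[addr.version] = addr
```

   A new IPv4 suggestion replaces the stored IPv4 address, but leaves the
   stored IPv6 address untouched (and vice versa).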

   Once tor uses NETINFO cell addresses, relays can change how they handle
   unauthenticated X-Your-Address-Is headers. When they receive an
   unauthenticated address suggestion, relays can:
     * ignore the address, or
     * use the address as the lowest priority address method.
   See section 3.5 for some factors to consider when making this design
   decision.

   We propose that tor continues to use the X-Your-Address-Is header, and adds
   support for addresses in NETINFO cells. X-Your-Address-Is headers are sent
   once per directory document fetch, but NETINFO cells are only sent once per
   OR connection.

   If a relay:
     * only gets addresses from NETINFO cells from authorities, and
     * has an existing, long-term connection to every authority,
   then it may struggle to detect address changes.

   Once all supported tor versions use NETINFO cells for address detection, we
   should review this design decision. If we are confident that almost all
   relays will be forced to make new connections when their address changes,
   then tor may be able to stop using X-Your-Address-Is HTTP headers.

   For security reasons, directory authorities only use addresses that are
   explicitly configured in their torrc. Since NETINFO cells are provided
   by other directory servers, directory authorities do not use this address
   resolution method.

   Bridges only make OR connections, and those OR connections are only over
   IPv4, to imitate default client behaviour.

   For anonymity reasons, bridges are unable to make regular connections over
   IPv4 and IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   As an alternative design, if tor's addresses are stale, it could close some
   of its open directory authority connections. (Similar to section 4.4.2
   in [Proposal 311: Relay IPv6 Reachability], where relays close existing OR
   connections, before testing their own reachability.) However, this design is
   more complicated, because it involves tracking address age, as well as the
   address itself.

3.5.2. Preferring IPv4 and IPv6 Addresses from Directory Authorities

   We propose this optional change, to improve relay (but not bridge) address
   accuracy and reliability.

   Relays prefer IPv4 and IPv6 address suggestions received from Directory
   Authorities.

   Directory authorities do not use these address detection methods to
   discover their own addresses, for security reasons.

   When they receive an address suggestion from a directory mirror, relays can:
     * ignore the address, or
     * use the address as the lowest priority address method.
   See section 3.5 for some factors to consider when making this design
   decision.

   Bridges only make OR connections, and those OR connections are only over
   IPv4, to imitate default client behaviour.

   For anonymity reasons, bridges are unable to make regular connections over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   See also sections 3.5.1 to 3.5.4.

3.5.3. Ignoring Addresses on Inbound Connections

   We propose this optional change, to improve relay (and bridge) address
   accuracy and reliability.

   Relays ignore IPv4 and IPv6 address suggestions received on inbound
   connections.

   We make this change, because we want to detect the IP addresses of the
   relay's outbound routes, rather than the addresses that other relays
   believe they are connecting to for inbound connections.

   If we make this change, relays may need to close some inbound connections,
   before doing address detection. If we also make the changes in sections
   3.5.1 and 3.5.2, busy relays could have persistent, inbound OR connections
   from all directory authorities. (Currently, there are 9 directory
   authorities with IPv4 addresses, and 6 directory authorities with IPv6
   addresses.)

   Directory authorities do not use these address detection methods to
   discover their own addresses, for security reasons.

   See also sections 3.5.1 to 3.5.4.

3.5.4. Load Balancing

   We propose some optional changes to improve relay (and bridge)
   load-balancing across directory authorities.

   Directory authorities do not use these address detection methods to
   discover their own addresses, for security reasons.

   See also sections 3.5.1 to 3.5.3.

3.5.4.1. Directory Authority Load Balancing

   Relays may prefer:
     * authenticated connections (section 3.5.1).

   Relays and bridges may prefer:
     * connecting to Directory Authorities (section 3.5.2), or
     * ignoring addresses on inbound connections (section 3.5.3)
       (and therefore, they may close some inbound connections,
       leading to extra connection re-establishment load).

   All these changes are optional, so they might not be implemented.

   Directory authorities do not use these address detection methods to
   discover their own addresses, for security reasons.

   If the changes in sections 3.5.1 and 3.5.2 are both implemented, we would
   like all relays (and bridges) to do frequent directory fetches:
     * using BEGINDIR over ORPorts,
     * to directory authorities.
   However, this extra load from relays may be unsustainable during high
   network load (see
   [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s]).

   For anonymity reasons, bridges should avoid connecting to directory
   authorities too frequently, to imitate default client behaviour.

   Therefore, we propose a simple load-balancing scheme between address
   resolution and non-address resolution requests:
     * when relays first start up, they should make two directory authority
       ORPort fetch attempts, one on IPv4, and one on IPv6,
     * relays should also make occasional directory authority ORPort directory
       fetch attempts, on IPv4 and IPv6, to learn if their addresses have
       changed.

   We propose a new torrc option and consensus parameter:

    RelayMaxIntervalWithoutAddressDetectionRequest N seconds|minutes|hours

    Relays make most of their directory requests via directory mirror DirPorts,
    to reduce the load on directory authorities.

    When this amount of time has passed since a relay last connected to a
    directory authority ORPort, the relay makes its next directory request via
    a directory authority ORPort. (Default: 15 minutes)

   The final name and description for this option will depend on which optional
   changes are actually implemented in tor. In particular, this option should
   only consider requests that tor may use to discover its IP addresses.
   For example:
     * if tor uses NETINFO cells for addresses (section 3.5.1.2), then all
       OR connections to an authority should be considered,
     * if tor does not use NETINFO cells for addresses, and only uses
       X-Your-Address-Is headers, then only directory fetches from authorities
       should be considered.
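
   The interval check described above could look like the following sketch.
   The function and constant names are hypothetical. Treating "no previous
   contact" as "fetch from an authority now" is an assumption, based on the
   start-up behaviour proposed earlier in this section.

```python
DEFAULT_MAX_INTERVAL = 15 * 60  # seconds; the default proposed above


def use_authority_orport(now, last_authority_contact,
                         max_interval=DEFAULT_MAX_INTERVAL):
    """Return True if the next directory request should go to a directory
    authority ORPort. Times are in seconds. (Hypothetical helper.)"""
    if last_authority_contact is None:
        # Assumption: a relay that has just started has no usable contact.
        return True
    return now - last_authority_contact >= max_interval
```

   Which events update last_authority_contact depends on the optional changes
   that are implemented, as described in the list above.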

   We set the default value of this option to 15 minutes, because:
     * tor's reachability tests fail if the ORPort is unreachable after 20
       minutes. So we want to do at least two address detection requests in
       the first 20 minutes;
     * the minimum consensus period is 30 minutes, and we want to do at least
       one address detection per consensus period. (Consensuses are usually
       created every hour. But if there is no fresh consensus, directory
       authorities will try to create a consensus every 30 minutes); and
     * the default value for TestingAuthDirTimeToLearnReachability is 30
       minutes. So directory authorities will make reachability test OR
       connections to each relay, at least every 30 minutes. Therefore, relays
       will see NETINFO cells from directory authorities about this often.
       (Relays may use NETINFO cells for address detection, see section
       3.5.1.2.)

   See also section 3.5.4.3, for some general load balancing criteria, that
   may help when tuning the address detection interval.

   We propose a related change, which is also optional:

   If relays use address suggestions from directory mirrors, they may choose
   between ORPort and DirPort connections to directory mirrors at random.
   Directory mirrors typically have enough spare CPU and bandwidth to handle
   ORPort directory requests. (And the most expensive cryptography happens
   when the ORPort connection is opened.)

   See also sections 3.5.1 to 3.5.3.

3.5.4.2. Load Balancing Between IPv4 and IPv6 Directories

   We propose this optional change, to improve the load-balancing between IPv4
   and IPv6 directories, when used by relays to find their IPv4 and IPv6
   addresses (see section 3.2.5).

   For anonymity reasons, bridges are unable to make regular connections over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   Directory authorities do not use these address detection methods to
   discover their own addresses, for security reasons.

   This change may only be necessary if the following changes result in poor
   load-balancing, or other relay issues:
     * randomly selecting IPv4 or IPv6 directories (see section 3.2.5),
     * preferring addresses from directory authorities, via an authenticated
       connection (see sections 3.5.1 and 3.5.2), or
     * ignoring addresses on inbound connections, and therefore closing and
       re-opening some connections (see section 3.5.3).

   We propose that the RelayMaxIntervalWithoutAddressDetectionRequest option
   is counted separately for IPv4 and IPv6 (see the previous section for
   details).

   For example:
     * if 30 minutes has elapsed since the last IPv4 address detection request,
       then the next directory request should be an IPv4 address detection
       request, and
     * if 30 minutes has elapsed since the last IPv6 address detection request,
       then the next directory request should be an IPv6 address detection
       request.

   If both intervals have elapsed at the same time, the relay should choose
   between IPv4 and IPv6 at random.
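
   The per-family counting can be sketched as follows. This is an
   illustration under stated assumptions: the function name and arguments are
   hypothetical, times are in seconds, and both families are assumed to have
   a previous detection time.

```python
import random


def due_address_detection(now, last_ipv4, last_ipv6, max_interval,
                          rng=random):
    """Return "IPv4", "IPv6", or None if neither interval has elapsed.
    (Hypothetical helper.)"""
    due = [family
           for family, last in (("IPv4", last_ipv4), ("IPv6", last_ipv6))
           if now - last >= max_interval]
    if not due:
        return None
    # If both intervals have elapsed, choose between them at random.
    return rng.choice(due)
```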

   See also section 3.5.4.3, for some general load balancing criteria, that
   may help when tuning the address detection interval.

   Alternatively, we could wait until
   [Proposal 306: Client Auto IPv6 Connections] is implemented, and use the
   directory fetch design from that proposal.

   See also sections 3.5.1 to 3.5.3.

3.5.4.3. General Load Balancing Criteria

   We propose the following criteria for choosing load-balancing intervals:

   The selected interval should be chosen based on the following factors:
     * relays need to discover their IPv4 and IPv6 addresses to publish their
       descriptors,
     * it only takes one successful directory fetch from one authority for a
       relay to discover its IP address (see section 3.5.2),
     * whether relays fall back to addresses discovered from directory
       mirrors, when directory authorities are unavailable (see section
       3.5.2),
     * BEGINDIR over ORPort requires a TLS connection, and some additional
       tor cryptography, so it is more expensive for authorities than a
       DirPort fetch (and it cannot be cached by an HTTP cache)
       (see section 3.5.1),
     * closing and re-opening some OR connections (see section 3.5.3),
     * minimising wasted CPU (and bandwidth) for IPv6 connection attempts on
       IPv4-only relays, and
     * other potential changes to relay directory fetches (see
       [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s]).

   The selected interval should allow almost all relays to update both their
   IPv4 and IPv6 addresses:
     * at least twice when they bootstrap and test reachability (to allow for
       fetch failures),
     * at least once per consensus interval (that is, every 30 minutes), and
     * from a directory authority (if required).

   For anonymity reasons, bridges are unable to make regular connections over
   IPv6, until clients start to do so. (See
   [Proposal 306: Client Auto IPv6 Connections].)

   Directory authorities do not use these address detection methods to
   discover their own addresses, for security reasons.

   In this proposal, relays choose between IPv4 and IPv6 directory fetches
   at random (see section 3.2.5 for more detail). But if this change causes
   issues on IPv4-only relays, we may have to try IPv6 less often.

   See also sections 3.5.1 to 3.5.3.

3.5.5. Detailed Address Resolution Logs

   We propose this optional change, to help diagnose relay address resolution
   issues.

   Relays (and bridges) should log the address chosen using each address
   resolution method, when:
     * address resolution succeeds,
     * address resolution fails,
     * reachability checks fail, or
     * publishing the descriptor fails.
   These logs should be rate-limited separately for successes and failures.

   The logs should tell operators to set the Address torrc option for IPv4 and
   IPv6 (if available).

3.5.6. Add IPv6 Support to is_local_addr()

   We propose this optional change, to improve the accuracy of IPv6 address
   detection from directory documents.

   Directory servers use is_local_addr() to detect if the requesting tor
   instance is on the same local network. If it is, the directory server does
   not include the X-Your-Address-Is HTTP header in directory documents.

   Currently, is_local_addr() checks for:
     * an internal IPv4 or IPv6 address, or
     * the same IPv4 /24 as the directory server.

   We propose also checking for:
     * the same IPv6 /48 as the directory server.

   We choose /48 because it is typically the smallest network in the global
   IPv6 routing tables, and it was previously the recommended per-customer
   network block. (See [RFC 6177: IPv6 End Site Address Assignment].)

   Tor currently uses:
     * IPv4 /8 and IPv6 /16 for port summaries,
     * IPv4 /16 and IPv6 /32 for path selection (avoiding relays in the same
       network block).
   See also the next section, which uses IPv6 /64 for sybils.
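
   The proposed prefix comparison can be sketched in Python. This is not
   tor's is_local_addr(): the function name is hypothetical, and the sketch
   only covers the "same network block" checks, not the internal address
   check.

```python
import ipaddress


def same_local_block(client, server):
    """Return True if client is in the same IPv4 /24 or IPv6 /48 as server.
    (Hypothetical helper; internal addresses are not checked here.)"""
    a = ipaddress.ip_address(client)
    b = ipaddress.ip_address(server)
    if a.version != b.version:
        return False
    prefix_len = 24 if a.version == 4 else 48
    # strict=False masks off the host bits of the server's address.
    block = ipaddress.ip_network(f"{b}/{prefix_len}", strict=False)
    return a in block
```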

3.5.7. Add IPv6 Support to AuthDirMaxServersPerAddr

   We propose this optional change, to improve the health of the network, by
   rejecting relays when there are too many in the same IPv6 /64.

   Modify get_possible_sybil_list() so it takes an address family argument,
   and returns a list of IPv4 or IPv6 sybils.

   Use the modified get_possible_sybil_list() to exclude relays from the
   authority's vote, if there are more than:
     * AuthDirMaxServersPerAddr on the same IPv4 address, or
     * AuthDirMaxServersPerIPv6Site in the same IPv6 /64.

   We choose IPv6 /64 as the IPv6 site size, because:
     * provider site allocations range between /48 and /64
       (with a recommendation of /56),
     * /64 is the typical host allocation
       (see [RFC 6177: IPv6 End Site Address Assignment]),
     * we don't want to discourage IPv6 address adoption on the tor network.
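
   The IPv6 part of the exclusion rule can be sketched as follows. This is
   not tor's get_possible_sybil_list(): the function name and input format
   are hypothetical, and the sketch assumes that relays earlier in the input
   list are the ones to keep.

```python
import ipaddress
from collections import defaultdict


def possible_ipv6_sybils(relays, max_per_site, prefix_len=64):
    """relays: list of (nickname, ipv6_address), in priority order.
    Return the nicknames that exceed max_per_site in their IPv6 site."""
    seen = defaultdict(int)
    sybils = []
    for nickname, address in relays:
        # Map each address to its enclosing /64 (or other site size).
        site = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
        seen[site] += 1
        if seen[site] > max_per_site:
            sybils.append(nickname)
    return sybils
```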

   Tor currently uses:
     * IPv4 /8 and IPv6 /16 for port summaries,
     * IPv4 /16 and IPv6 /32 for path selection (avoiding relays in the same
       network block).
   See also the previous section, which uses IPv6 /48 for the local network.

   This change allows:
     * up to AuthDirMaxServersPerIPv6Site relays on the smallest IPv6 site
       (/64, which is also the typical IPv6 host), and
     * thousands of relays on the recommended IPv6 site size of /56.
   The number of relays in an IPv6 block was previously unlimited, and sybils
   were only limited by the scarcity of IPv4 addresses.

   We propose choosing a default value for AuthDirMaxServersPerIPv6Site by
   analysing the current IPv6 addresses on the tor network. Reasonable
   default values are likely in the range 4 to 50.
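
   As a worked example of these limits, the following sketch computes the
   maximum number of relays that fit in a larger IPv6 block, given a
   per-/64 limit. The function name is hypothetical.

```python
def max_relays_in_block(block_prefix_len, max_per_site, site_prefix_len=64):
    """Upper bound on relays in an IPv6 block, with a per-site limit."""
    sites = 2 ** (site_prefix_len - block_prefix_len)
    return sites * max_per_site
```

   A /56 contains 2^8 = 256 /64 sites. So with a limit between 4 and 50
   relays per /64, a /56 could hold between 1024 and 12800 relays.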

   If tor ever allows IPv6-only relays, we should review the default value
   of AuthDirMaxServersPerIPv6Site.

   Since these relay exclusions happen at voting time, they do not require a
   new consensus method.

3.5.8. Use a Local Interface Address on the Default Route

   We propose this optional change, to improve the accuracy of local interface
   IPv4 and IPv6 address detection (see section 3.2.3), on relays
   (and bridges).

   Directory authorities do not use this address detection method to
   discover their own addresses, for security reasons.

   Rewrite the get_interface_address*() functions to choose an interface
   address on the default route, or to sort default route addresses first in
   the list of addresses. (If the platform API allows us to find the default
   route.)

   For more information, see [Ticket 12377: Prefer default route when checking
   local interface addresses].

   This change might not be necessary, because the directory header IP address
   method will find the IP address of the default route, in most cases
   (see section 3.2.5).

3.5.9. Add IPv6 Support via Other DNS APIs

   We propose these optional changes, to add IPv6 support to hostname
   resolution on older OSes. These changes affect:
     * the Address torrc option, when it is a hostname (see section 3.2.1),
       and
     * automatic hostname resolution (see section 3.2.4),
   on relays and bridges.

   Directory authorities do not use this address detection method to
   discover their own addresses, for security reasons.

   Tor currently uses getaddrinfo() on most systems, which supports IPv6 DNS.
   But tor also supports the legacy gethostbyname() DNS API, which does not
   support IPv6.

   There are two alternative APIs we could use for IPv6 DNS, if getaddrinfo()
   is not available:
     * libevent DNS API, and
     * gethostbyname2().

   But this change may be unnecessary, because:
     * Linux has used getaddrinfo() by default since glibc 2.20 (2014)
     * macOS has recommended getaddrinfo() since before 2006
     * since macOS adopts BSD changes, most BSDs would have switched to
       getaddrinfo() in a similar timeframe
     * Windows has supported getaddrinfo() since Windows Vista; tor's minimum
       supported Windows version is Vista.
   See [Tor Supported Platforms] for more details.

   If a large number of systems do not support getaddrinfo(), we propose
   implementing one of these alternatives:

   The libevent DNS API supports IPv6 DNS, and tor already has a dependency on
   libevent. Therefore, we should prefer the libevent DNS API. (Unless we find
   it difficult to implement.)

   We could also use gethostbyname2() to add IPv6 support to hostname
   resolution on older OSes, which don't support getaddrinfo().

   Handling multiple addresses:

   When looking up hostnames using libevent, the DNS callbacks provide a list
   of all addresses received. Therefore, we should ignore any private
   addresses, and then choose the first address in the list.

   When looking up hostnames using gethostbyname() or gethostbyname2(), if the
   first address is a private address, we may want to look at the entire list
   of addresses. Some struct hostent versions (for example, current macOS)
   also have an h_addr_list rather than h_addr. (They define h_addr as
   h_addr_list[0], for backwards compatibility.)

   However, having private and public addresses resolving from the same
   hostname is a rare configuration, so we might not need to make this change.
   (On OSes that support getaddrinfo(), tor searches the list of addresses for
   a publicly routable address.)
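
   Choosing the first usable address from a resolver's list can be sketched
   as follows. The function name is hypothetical. The sketch uses Python's
   is_global property as a stand-in for tor's own internal address check, so
   the exact set of rejected addresses may differ.

```python
import ipaddress


def first_public_address(addresses):
    """Return the first globally routable address in the list, or None.
    (Hypothetical helper; is_global approximates tor's internal check.)"""
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr.is_global:
            return addr
    return None
```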

   Alternative change: remove gethostbyname():

   As an alternative, if we believe that all supported OSes have getaddrinfo(),
   we could simply remove the gethostbyname() code, rather than trying to
   modify it to work with IPv6.

   Most relays can reliably discover their address using directory headers,
   if all other methods fail. Or operators can set the Address torrc option to
   an IPv4 or IPv6 literal.

3.5.10. Change Relay OutboundBindAddress Defaults

   We propose this optional change, to improve the reliability of
   IP address-based filters in tor. These filters typically affect relays and
   directory authorities. But we propose that bridges and clients also make
   this change, for consistency.

   For example, the tor network treats relay IP addresses differently when:
     * resisting denial of service, and
     * selecting canonical, long-term connections.
   (See [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s] for the
   initial motivation for this change: resisting significant bandwidth load
   on directory authorities.)

   Now that tor knows its own addresses, we propose that relays (and bridges)
   set their IPv4 and IPv6 OutboundBindAddress to these discovered addresses,
   by default. If binding fails, tor should fall back to an unbound socket.

   Operators would still be able to set a custom IPv4 and IPv6
   OutboundBindAddress, if needed.

   Currently, tor doesn't bind to a specific address, unless
   OutboundBindAddress is configured. So on relays with multiple IP addresses,
   the outbound address comes from the chosen route for each TCP connection
   or UDP packet (usually the default route).

3.5.11. IPv6 Address Privacy Extensions

   We propose this optional change, to improve the reliability of relays (and
   bridges) that use IPv6 address privacy extensions (see section 3.5 of
   [RFC 4941: Privacy Extensions for IPv6]).

   Directory authorities:
     * should not use IPv6 address privacy extensions, because their addresses
       need to stay the same over time, and
     * do not use address detection methods that would automatically select
       an IPv6 address with privacy extensions, for security reasons.

   We propose that tor should avoid using IPv6 addresses generated using
   privacy extensions, unless no other publicly routable addresses are
   available.

   In practice, each operating system has a different way of detecting IPv6
   address privacy extensions. And some operating systems may not tell
   applications if a particular address is using privacy extensions. So
   implementing this change may be difficult.

   On operating systems that provide IPv6 address privacy extension state,
   IPv6 addresses may be:
     * "public" - these addresses do not change
     * "temporary" - these addresses change due to IPv6 privacy extensions.
   Therefore, tor should prefer "public" IPv6 addresses, when they are
   available.
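
   As a rough sketch of this preference order, assuming the operating system
   has already labelled each candidate address as "public" or "temporary"
   (the function name and the input format are hypothetical, and Python's
   ipaddress.is_global is used as a stand-in for tor's own "publicly
   routable" check):

```python
import ipaddress


def choose_ipv6_address(candidates):
    """Pick an IPv6 address from (address, kind) pairs.

    kind is "public" (stable) or "temporary" (privacy extensions).
    Returns the chosen address string, or None if nothing is routable.
    """
    routable = []
    for addr, kind in candidates:
        ip = ipaddress.ip_address(addr)
        if ip.version == 6 and ip.is_global:
            routable.append((addr, kind))
    # Prefer stable addresses; use temporary ones only as a last resort.
    for wanted in ("public", "temporary"):
        for addr, kind in routable:
            if kind == wanted:
                return addr
    return None
```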

   However, even if we do not make this change, tor should be compatible with
   the RFC 4941 defaults:
     * a new IPv6 address is generated each day
     * deprecated addresses are removed after one week
     * temporary addresses should be disabled, unless an application opts in
       to using them
   (See sections 3.5 and 3.6 of [RFC 4941: Privacy Extensions for IPv6].)

   In particular, it can take up to 4.5 hours for a client to receive a new
   address for a relay. Here are the maximum times:
     * 30 minutes for directory authorities to do reachability checks
       (see TestingAuthDirTimeToLearnReachability in the [Tor Manual Page]).
     * 1 hour for a reachable relay to be included in a vote
     * 10 minutes for votes to be turned into a consensus
     * 2 hours and 50 minutes for clients
       (See the [Tor Directory Protocol], sections 1.4 and 5.1, and the
       corresponding Directory Authority options in the [Tor Manual Page].)

   But 4.5 hours is much less than 1 week, and even significantly less than 1
   day. So clients and relays should be compatible with the IPv6 privacy
   extensions defaults, even if they are used for all applications.
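
   The arithmetic behind the 4.5 hour figure can be checked with a short
   calculation. The durations are the maximum times listed above; the
   dictionary keys are only descriptive labels.

```python
from datetime import timedelta

# Maximum delays before a client learns a relay's new address.
MAX_DELAYS = {
    "authority reachability checks": timedelta(minutes=30),
    "inclusion in a vote": timedelta(hours=1),
    "votes turned into a consensus": timedelta(minutes=10),
    "clients": timedelta(hours=2, minutes=50),
}

total_delay = sum(MAX_DELAYS.values(), timedelta())

# RFC 4941 defaults quoted above.
NEW_ADDRESS_INTERVAL = timedelta(days=1)
DEPRECATED_ADDRESS_LIFETIME = timedelta(weeks=1)

print(total_delay)
print(total_delay < NEW_ADDRESS_INTERVAL < DEPRECATED_ADDRESS_LIFETIME)
```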

   However, bandwidth authorities may reset a relay's bandwidth when its IPv6
   address changes. (The tor network currently uses torflow and sbws as
   bandwidth authorities. Neither implementation resets bandwidth when IPv6
   addresses change.) Since bandwidth authorities only scan the whole tor
   network about once a day, resetting a relay's bandwidth causes a huge
   penalty.

   Therefore, we propose that sbws should not reset relay bandwidths when
   IPv6 addresses change. (See
   [Ticket 28725: Reset relay bandwidths when their IPv6 address changes].)

3.5.12. Quick Extends After Relay Restarts

   We propose this optional change, to reduce client circuit failures, after a
   relay restarts.

   We propose that relays (and bridges) should open their ORPorts, and support
   client extends, as soon as possible after they start up. (Clients may
   already have the relay's addresses from a previous consensus.)

   Automatically enabling an IPv6 ORPort creates a race condition with IPv6
   extends (see section 3.2.7 of this proposal, and
   [Proposal 311: Relay IPv6 Reachability]).

   This race condition has the most impact when:
     1. a relay has outbound IPv6 connectivity,
     2. the relay detects a publicly routable IPv6 address,
     3. the relay opens an IPv6 ORPort,
     4. but the IPv6 ORPort is not reachable.

   Between steps 3 and 4, the relay could successfully extend over IPv6, even
   though its IPv6 ORPort is unreachable. However, we expect this case to be
   rare.

   A far more common case is that a working relay has just restarted, and
   clients still have its addresses, so they continue to try to extend
   through it. If the relay refused to extend, all these clients would have to
   retry their circuits.

   To support this case, tor relays should open IPv4 and IPv6 ORPorts, and
   perform extends, as soon as they can after startup. Relays can extend to
   other relays, as soon as they have validated the directory documents
   containing other relays' public keys.

   In particular, relays which automatically detect their IPv6 address, should
   support IPv6 extends as soon as they detect an IPv6 address. (Relays may
   also attempt to bind to all IPv6 addresses on all interfaces. If that bind
   is successful, they may choose to extend over IPv6, even before they know
   their own IPv6 address.)

   Relays should not wait for reachable IPv4 or IPv6 ORPorts before they start
   performing client extends.

   DirPort requests are less critical, because relays and clients will retry
   directory fetches using multiple mirrors. However, DirPorts may also open
   as early as possible, for consistency. (And for simpler code.)

   Tor's existing code handles this use case, so the code changes required to
   support IPv6 may be quite small. But we should still test this use case for
   clients connecting over IPv4 and IPv6, and extending over IPv4 and IPv6.

   Directory authorities do not rely on their own reachability checks, so they
   should be able to perform extends (and serve cached directory documents)
   shortly after startup.

3.5.13. Using Authority Addresses for Socket-Based Address Detection

   We propose this optional change, to avoid issues with firewalls during
   relay (and bridge) address detection. (And to reduce user confusion about
   firewall notifications which show a strange IP address, particularly on
   clients.)

   Directory authorities do not use a UDP socket to discover their own
   addresses, for security reasons. Therefore, we are free to use any
   directory address for this check, without the risk of a directory authority
   making a UDP socket to itself, and discovering its own private address.

   We propose that tor should use a directory authority IPv4 and IPv6 address,
   for any sockets that it opens to detect local interface addresses (see
   section 3.2.3). We propose that this change is applied regardless of the
   role of the current tor instance (relay, bridge, directory authority, or
   client).

   Tor currently uses the arbitrary IP addresses 18.0.0.1 and [2002::], which
   may be blocked by firewalls. These addresses may also cause user confusion,
   when they appear in logs or notifications.

   The relevant function is get_interface_address6_via_udp_socket_hack() in
   lib/net. The hard-coded addresses are in app/config. Directly using these
   addresses would break tor's module layering rules, so we propose:
     * copying one directory authority's hard-coded IPv4 and IPv6 addresses to
       an ADDRESS_PRIVATE macro or variable in lib/net/address.h
     * writing a unit test that makes sure that the address used by
       get_interface_address6_via_udp_socket_hack() is still in the list of
       hard-coded directory authority addresses.

   When we choose the directory authority, we should avoid using a directory
   authority that has different hard-coded and advertised IP addresses. (To
   avoid user confusion.)

4. Directory Protocol Specification Changes

   We propose explicitly supporting IPv6 X-Your-Address-Is HTTP headers in the
   tor directory protocol.

   We propose the following changes to the [Tor Directory Protocol]
   specification, in section 6.1:

  Servers MAY include an X-Your-Address-Is: header, whose value is the
  apparent IPv4 or IPv6 address of the client connecting to them. IPv6
  addresses SHOULD/MAY (TODO) be formatted enclosed in square brackets.

  TODO: require brackets? What does Tor currently do?

  For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD
  report the IP from which the circuit carrying the BEGIN_DIR stream reached
  them.

  Servers SHOULD disable caching of multiple network statuses or multiple
  server descriptors.  Servers MAY enable caching of single descriptors,
  single network statuses, the list of all server descriptors, a v1
  directory, or a v1 running routers document, with appropriate expiry times
  (around 30 minutes). Servers SHOULD disable caching of X-Your-Address-Is
  headers.
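
   To make the header format concrete, here is a small sketch of formatting
   and parsing X-Your-Address-Is values. It assumes that IPv6 addresses are
   sent in square brackets, which the TODO above leaves undecided, and it
   accepts unbracketed IPv6 addresses when parsing. The function names are
   hypothetical, and this is not tor's implementation.

```python
import ipaddress


def format_your_address_is(addr):
    """Build the header line, bracketing IPv6 addresses (an assumption)."""
    ip = ipaddress.ip_address(addr)
    value = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"X-Your-Address-Is: {value}"


def parse_your_address_is(header_line):
    """Return the address in the header, with or without brackets."""
    # Split on the first colon only: IPv6 values contain more colons.
    name, _, value = header_line.partition(":")
    if name.strip().lower() != "x-your-address-is":
        raise ValueError("not an X-Your-Address-Is header")
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]
    return ipaddress.ip_address(value)
```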

5. Test Plan

   We provide a quick summary of our testing plans.

5.1. Testing Relay IPv6 Addresses Discovery

   We propose to test these changes using chutney networks. However, chutney
   creates a limited number of configurations, so we also need to test these
   changes with relay operators on the public network.

   Therefore, we propose to test these changes on the public network with a
   small number of relays and bridges.

   Once these changes are merged, volunteer relay and bridge operators will be
   able to test them by:
     * compiling from source,
     * running nightly builds, or
     * running alpha releases.

5.2. Test Existing Features

   We will modify and test these existing features:
     * Find Relay IPv4 Addresses

   We do not plan on modifying these existing features:
     * relay address retries
     * existing warning logs
   But we will test that they continue to function correctly, and fix any bugs
   triggered by the modifications in this proposal.

6. Ongoing Monitoring

   To monitor the impact of these changes:
     * relays should collect basic IPv6 connection statistics, and
     * relays and bridges should collect basic IPv6 bandwidth statistics.
   (See [Proposal 313: Relay IPv6 Statistics]).

   Some of these statistics may be included in tor's heartbeat logs, making
   them accessible to relay operators.

   We do not propose to collect additional statistics on:
     * circuit counts, or
     * failure rates.
   Collecting statistics like these could impact user privacy.

   We also plan to write a script to calculate the number of IPv6 relays in
   the consensus. This script will help us monitor the network during the
   deployment of these new IPv6 features.

7. Changes to Other Proposals

   [Proposal 306: Client Auto IPv6 Connections] needs to be modified to keep
   bridge IPv6 behaviour in sync with client IPv6 behaviour. (See section
   3.2.5.)

References:

[getaddrinfo man page]: See the quoted section in:
   https://stackoverflow.com/a/42351676

[Proposal 306: Client Auto IPv6 Connections]: One possible design for
   automatic client IPv4 and IPv6 connections is at
   https://gitweb.torproject.org/torspec.git/tree/proposals/306-ipv6-happy-eyeballs.txt
   (TODO: modify to include bridge changes with client changes)

[Proposal 311: Relay IPv6 Reachability]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt

[Proposal 313: Relay IPv6 Statistics]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/313-relay-ipv6-stats.txt

[RFC 4941: Privacy Extensions for IPv6]:
   https://tools.ietf.org/html/rfc4941
   Or the older RFC 3041: https://tools.ietf.org/html/rfc3041

[RFC 6177: IPv6 End Site Address Assignment]:
   https://tools.ietf.org/html/rfc6177#page-7

[Ticket 12377: Prefer default route when checking local interface addresses]:
   https://trac.torproject.org/projects/tor/ticket/12377

[Ticket 28725: Reset relay bandwidths when their IPv6 address changes]:
   https://trac.torproject.org/projects/tor/ticket/29725#comment:3

[Ticket 33018: Dir auths using an unsustainable 400+ mbit/s]:
   https://trac.torproject.org/projects/tor/ticket/33018

[Tor Directory Protocol]:
   (version 3) https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt

[Tor Manual Page]:
   https://2019.www.torproject.org/docs/tor-manual.html.en

[Tor Specification]:
   https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt

[Tor Supported Platforms]:
   https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/SupportedPlatforms#OSSupportlevels
Filename: 313-relay-ipv6-stats.txt
Title: Tor Relay IPv6 Statistics
Author: teor, Karsten Loesing, Nick Mathewson
Created: 10-February-2020
Status: Accepted
Ticket: #33159

0. Abstract

   We propose that:
     * tor relays should collect statistics on IPv6 connections, and
     * tor relays and bridges should collect statistics on IPv6 consumed
       bandwidth.
   Like tor's existing connection and consumed bandwidth statistics, these new
   IPv6 statistics will be published in each relay's extra-info descriptor.

   We also plan to write a script that shows the number of relays in the
   consensus that support:
     * IPv6 extends, and
     * IPv6 client connections.
   This script will be used for medium-term monitoring, during the deployment
   of tor's IPv6 changes in 2020. (See [Proposal 311: Relay IPv6 Reachability]
   and [Proposal 312: Relay Auto IPv6 Address].)

1. Introduction

   Tor relays (and bridges) can accept IPv6 client connections via their
   ORPort. But current versions of tor need to have an explicitly configured
   IPv6 address (see [Proposal 312: Relay Auto IPv6 Address]), and they don't
   perform IPv6 reachability self-checks (see
   [Proposal 311: Relay IPv6 Reachability]).

   As we implement these new IPv6 features in tor, we want to monitor their
   impact on the IPv6 connections and bandwidth in the tor network.

   Tor developers also need to know how many relays support these new IPv6
   features, so they can test tor's IPv6 reachability checks. (In particular,
   see section 4.3.1 in [Proposal 311: Relay IPv6 Reachability]:  Refusing to
   Publish the Descriptor.)

2. Scope

   This proposal modifies Tor's behaviour as follows:

   Relays, bridges, and directory authorities collect statistics on:
     * IPv6 connections, and
     * IPv6 consumed bandwidth.
   The design of these statistics will be based on tor's existing connection
   and consumed bandwidth statistics.

   Tor's existing consumed bandwidth statistics truncate their totals to the
   nearest kilobyte. The existing connection statistics do not perform any
   binning.

   We do not propose to add any extra noise or binning to these statistics.
   Instead, we expect to leave these changes until we have a consistent
   privacy-preserving statistics framework for tor. As an example of this
   kind of framework, see
   [Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)].

   We avoid:
     * splitting connection statistics into clients and relays, and
     * collecting circuit statistics.
   These statistics are more sensitive, so we want to implement
   privacy-preserving statistics, before we consider adding them.

   Throughout this proposal, "relays" includes directory authorities, except
   where they are specifically excluded. "relays" does not include bridges,
   except where they are specifically included. (The first mention of "relays"
   in each section should specifically exclude or include these other roles.)

   Tor clients do not collect any statistics for public reporting. Therefore,
   clients are out of scope in this proposal.

   When this proposal describes Tor's current behaviour, it covers all
   supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except
   where another version is specifically mentioned.

   This proposal also includes a medium-term monitoring script, which
   calculates the number of relays in the consensus that support IPv6 extends,
   and IPv6 client connections.

3. Monitoring IPv6 Relays in the Consensus

   We propose writing a script that calculates:
     * the number of relays, and
     * the consensus weight fraction of relays,
   in the consensus that:
     * have an IPv6 ORPort,
     * support IPv6 reachability checks,
     * support IPv6 clients, and
     * support IPv6 reachability checks, and IPv6 clients.

   In order to provide easy access to these statistics, we propose
   that the script should:
     * download a consensus (or read an existing consensus), and
     * calculate and report these statistics.

   The following consensus weight fractions should divide by the total
   consensus weight:
     * have an IPv6 ORPort (all relays have an IPv4 ORPort), and
     * support IPv6 reachability checks (all relays support IPv4 reachability).

   The following consensus weight fractions should divide by the
   "usable Guard" consensus weight:
     * support IPv6 clients, and
     * support IPv6 reachability checks and IPv6 clients.

   "Usable Guards" have the Guard flag, but do not have the Exit flag. If the
   Guard also has the BadExit flag, the Exit flag should be ignored.

   Note that this definition of "Usable Guards" is only valid when the
   consensus contains many more guards than exits. That is, Wgd must be 0 in
   the consensus. (See the [Tor Directory Protocol] for more details.)

   Therefore, the script should check that Wgd is 0. If it is not, the script
   should log a warning about the accuracy of the "Usable Guard" statistics.
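
   The calculations in this section could be sketched as follows. This is an
   illustration only, not the proposed script: it assumes the consensus has
   already been parsed into simple records, the Relay class and its field
   names are hypothetical, and "supports IPv6 clients" is assumed here to
   mean "is a Usable Guard with an IPv6 ORPort".

```python
import logging
from dataclasses import dataclass


@dataclass
class Relay:
    weight: int                       # consensus weight
    flags: frozenset                  # e.g. frozenset({"Guard", "Exit"})
    has_ipv6_orport: bool
    supports_ipv6_reachability: bool  # assumed to be derived elsewhere


def is_usable_guard(relay):
    # The Exit flag is ignored when the relay also has BadExit.
    counts_as_exit = "Exit" in relay.flags and "BadExit" not in relay.flags
    return "Guard" in relay.flags and not counts_as_exit


def supports_ipv6_clients(relay):
    # Assumption: a Usable Guard that has an IPv6 ORPort.
    return is_usable_guard(relay) and relay.has_ipv6_orport


def ipv6_consensus_stats(relays, wgd):
    """Return {statistic: (relay count, consensus weight fraction)}."""
    if wgd != 0:
        logging.warning("Wgd is %d, not 0: 'Usable Guard' statistics "
                        "may be inaccurate", wgd)
    total = sum(r.weight for r in relays)
    guards = sum(r.weight for r in relays if is_usable_guard(r))
    categories = {
        "ipv6_orport": (lambda r: r.has_ipv6_orport, total),
        "ipv6_reachability": (lambda r: r.supports_ipv6_reachability, total),
        "ipv6_clients": (supports_ipv6_clients, guards),
        "ipv6_reachability_and_clients": (
            lambda r: r.supports_ipv6_reachability
            and supports_ipv6_clients(r), guards),
    }
    stats = {}
    for name, (predicate, denominator) in categories.items():
        matching = [r for r in relays if predicate(r)]
        weight = sum(r.weight for r in matching)
        fraction = weight / denominator if denominator else 0.0
        stats[name] = (len(matching), fraction)
    return stats
```

   Keeping each statistic's denominator next to its predicate makes it harder
   to divide a guard-only numerator by the total consensus weight by mistake.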

4. Collecting IPv6 Consumed Bandwidth Statistics

   We propose that relays (and bridges) collect IPv6 consumed bandwidth
   statistics.

   To minimise development and testing effort, we propose re-using the existing
   "bw_array" code in rephist.c.

   In particular, tor currently counts these bandwidth statistics:
     * read,
     * write,
     * dir_read, and
     * dir_write.

   We propose adding the following bandwidth statistics:
     * ipv6_read, and
     * ipv6_write.
   (The IPv4 statistics can be calculated by subtracting the IPv6 statistics
   from the existing total consumed bandwidth statistics.)

   We believe that collecting IPv6 consumed bandwidth statistics is about as
   safe as the existing IPv4+IPv6 total consumed bandwidth statistics.

   See also section 7.5, which adds a BandwidthStatistics torrc option and
   consensus parameter. BandwidthStatistics is an optional change.

5. Collecting IPv6 Connection Statistics

   We propose that relays (but not bridges) collect IPv6 connection statistics.

   Bridges refuse to collect the existing ConnDirectionStatistics, so we do not
   believe it is safe to collect the smaller IPv6 totals on bridges.

   To minimise development and testing effort, we propose re-using the existing
   "bidi" code in rephist.c. (This code may require some refactoring, because
   the "bidi" totals are globals, rather than a struct.)

   In particular, tor currently counts these connection statistics:
     * below threshold,
     * mostly read,
     * mostly written, and
     * both read and written.

   We propose adding IPv6 variants of all these statistics. (The IPv4
   statistics can be calculated by subtracting the IPv6 statistics from the
   existing total connection statistics.)

   See also section 7.6, which adds a ConnDirectionStatistics consensus
   parameter. This consensus parameter is an optional change.

6. Directory Protocol Specification Changes

   We propose adding IPv6 variants of the consumed bandwidth and connection
   direction statistics to the tor directory protocol.

   We propose the following additions to the [Tor Directory Protocol]
   specification, in section 2.1.2. Each addition should be inserted below the
   existing consumed bandwidth and connection direction specifications.

    "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]
    "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]

        Declare how much bandwidth the OR has used recently, on IPv6
        connections. See "read-history" and "write-history" for more details.
        (The migration notes do not apply to IPv6.)

    "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
        [At most once]

        Number of IPv6 connections, that are used uni-directionally or
        bi-directionally. See "conn-bi-direct" for more details.
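
   As an illustration of how a consumer of these lines might use them, the
   sketch below parses history lines in the format above and derives the IPv4
   values by subtraction, as described in sections 4 and 5. The function
   names are hypothetical, and the regular expression is a simplification
   written for this example, not the directory specification's grammar.

```python
import re
from datetime import datetime

# Matches: KEYWORD YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM...
HISTORY_RE = re.compile(
    r"^(?P<keyword>[a-z0-9-]+) "
    r"(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<nsec>\d+) s\) "
    r"(?P<values>\d+(?:,\d+)*)$"
)


def parse_history(line):
    """Parse one history or conn-bi-direct style line into a dict."""
    match = HISTORY_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised history line: %r" % line)
    return {
        "keyword": match["keyword"],
        "end": datetime.strptime(match["end"], "%Y-%m-%d %H:%M:%S"),
        "interval_seconds": int(match["nsec"]),
        "values": [int(v) for v in match["values"].split(",")],
    }


def ipv4_values(total, ipv6):
    """Subtract IPv6 values from the matching totals."""
    same_period = (total["end"] == ipv6["end"]
                   and total["interval_seconds"] == ipv6["interval_seconds"]
                   and len(total["values"]) == len(ipv6["values"]))
    if not same_period:
        raise ValueError("lines do not cover the same intervals")
    return [t - s for t, s in zip(total["values"], ipv6["values"])]
```

   The same parser accepts "ipv6-conn-bi-direct" lines, where the four values
   are BELOW, READ, WRITE and BOTH.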

   We also propose the following replacement, in the same section:

    "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]
    "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]

        Declare how much bandwidth the OR has spent on answering directory
        requests. See "read-history" and "write-history" for more details.
        (The migration notes do not apply to dirreq.)

   This replacement is optional, but it may avoid the 3 *read-history
   definitions getting out of sync.

7. Optional Changes

   We propose some optional changes to help relay operators, tor developers,
   and tor's network health. We also expect that these changes will drive IPv6
   relay adoption.

   Some of these changes may be more appropriate as future work, or along with
   other proposed features.

7.1. Log IPv6 Statistics in Tor's Heartbeat Logs

   We propose this optional change, so relay operators can see their own IPv6
   statistics:

   We propose that tor logs its IPv6 consumed bandwidth and connection
   statistics in its regular "heartbeat" logs.

   These heartbeat statistics should be collected over the lifetime of the tor
   process, rather than using the state file, like the statistics in sections
   4 and 5.

   Tor's existing heartbeat logs already show its consumed bandwidth and
   connections (in the link protocol counts).

   We may also want to show IPv6 consumed bandwidth and connections as a
   proportion of the total consumed bandwidth and connections.

   These statistics only show a relay's local bandwidth usage, so they can't
   be used for reporting.

7.2. Show IPv6 Relay Counts on Consensus Health

   The [Consensus Health] website displays a wide range of tor statistics,
   based on the most recent consensus.

   We propose this optional change, to:
     * help tor developers improve IPv6 support on the tor network,
     * help diagnose issues with IPv6 on the tor network, and
     * drive IPv6 adoption on tor relays.

   Consensus Health adds an IPv6 section, with relays in the consensus that:
     * have an IPv6 ORPort, and
     * support IPv6 reachability checks.

   The definitions of these statistics are in section 3.

   These changes can be tested using the script proposed in section 3.

7.3. Add an IPv6 Reachability Pseudo-Flag on Relay Search

   The [Relay Search] website displays tor relay information, based on the
   current consensus and relay descriptors.

   We propose this optional change, to:
     * help relay operators diagnose issues with IPv6 on their relays, and
     * drive IPv6 adoption on tor relays.

   Relay Search adds a pseudo-flag for relay IPv6 reachability support.

   This pseudo-flag should be given to relays that:
     * have a reachable IPv6 ORPort (in the consensus), and
     * support tor subprotocol version "Relay=3" (or later).
   See [Proposal 311: Relay IPv6 Reachability] for details.
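
   A minimal sketch of this check, assuming the relay's subprotocol versions
   are available as a space-separated list such as "Link=1-5 Relay=1-3", and
   reading "or later" as "any supported Relay version of 3 or higher". Both
   function names are hypothetical.

```python
def supported_versions(proto_line, name):
    """Return the set of versions listed for one subprotocol."""
    versions = set()
    for entry in proto_line.split():
        key, _, ranges = entry.partition("=")
        if key != name:
            continue
        # Each entry is a comma-separated list of versions or ranges.
        for part in ranges.split(","):
            if not part:
                continue
            low, _, high = part.partition("-")
            versions.update(range(int(low), int(high or low) + 1))
    return versions


def has_ipv6_reachability_pseudo_flag(ipv6_orport_in_consensus, proto_line):
    """True if the relay meets both conditions for the pseudo-flag."""
    relay_versions = supported_versions(proto_line, "Relay")
    return bool(ipv6_orport_in_consensus) and any(
        version >= 3 for version in relay_versions)
```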

   TODO: Is this a useful change?
         Are there better ways of driving IPv6 adoption?

7.4. Add IPv6 Connections and Consumed Bandwidth Graphs to Tor Metrics

   The [Tor Metrics: Traffic] website displays connection and bandwidth
   information for the tor network, based on relay extra-info descriptors.

   We propose these optional changes, to:
     * help tor developers improve IPv6 support on the tor network,
     * help diagnose issues with IPv6 on the tor network, and
     * drive IPv6 adoption on tor relays.

   Tor Metrics adds the following information to the graphs on the Traffic
   page:

   Consumed Bandwidth by IP version
     * added to the existing [Tor Metrics: Advertised bandwidth by IP version]
       page
     * as a stacked graph, like
       [Tor Metrics: Advertised and consumed bandwidth by relay flags]

   Fraction of connections used uni-/bidirectionally by IP version
     * added to the existing
       [Tor Metrics: Fraction of connections used uni-/bidirectionally] page
     * as a stacked graph, like
       [Tor Metrics: Advertised and consumed bandwidth by relay flags]

7.5. Add a BandwidthStatistics option

   We propose adding a new BandwidthStatistics torrc option and consensus
   parameter, which activates reporting of all these statistics. Currently,
   the existing statistics are controlled by ExtraInfoStatistics, but we
   propose using the new BandwidthStatistics option for them as well.

   The default value of this option should be "auto", which checks the
   consensus parameter. If there is no consensus parameter, the default should
   be 1. (The existing bandwidth statistics are reported by default.)

7.6. Add a ConnDirectionStatistics consensus parameter

   We propose using the existing ConnDirectionStatistics torrc option, and
   adding a consensus parameter with the same name. This option will control
   the new and existing connection statistics.

   The default value of this option should be "auto", which checks the
   consensus parameter. If there is no consensus parameter, the default should
   be 0.

   Bridges refuse to collect the existing ConnDirectionStatistics, so we do not
   believe it is safe to collect the smaller IPv6 totals on bridges. The new
   consensus parameter should also be ignored on bridges.

   If we implement the ConnDirectionStatistics consensus parameter, we can set
   the consensus parameter to 1 for a week or two, so we can collect these
   statistics.
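
   The "auto" behaviour described in sections 7.5 and 7.6 can be sketched as
   follows. This is an illustration under stated assumptions, not tor's
   configuration code: the function names are hypothetical, torrc values are
   modelled as "auto", 0, or 1, and consensus parameters as a dictionary.

```python
def resolve_option(torrc_value, consensus_params, name, default):
    """Resolve an auto/0/1 option against a consensus parameter."""
    if torrc_value != "auto":
        # An explicit torrc setting overrides the consensus.
        return bool(torrc_value)
    # "auto": use the consensus parameter, or the default if it is absent.
    return bool(consensus_params.get(name, default))


def bandwidth_statistics_enabled(torrc_value, consensus_params):
    # Section 7.5: the default is 1 when there is no consensus parameter.
    return resolve_option(torrc_value, consensus_params,
                          "BandwidthStatistics", 1)


def conn_direction_statistics_enabled(torrc_value, consensus_params,
                                      is_bridge):
    # Section 7.6: bridges do not collect these statistics, and ignore
    # the consensus parameter.
    if is_bridge:
        return False
    # The default is 0 when there is no consensus parameter.
    return resolve_option(torrc_value, consensus_params,
                          "ConnDirectionStatistics", 0)
```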

8. Test Plan

   We provide a quick summary of our testing plans.

8.1. Testing IPv6 Relay Consensus Calculations

   We propose to test the IPv6 Relay consensus script using chutney networks.
   However, chutney creates a limited number of relays, so we also need to
   test these changes on consensuses from the public tor network.

   Some of these calculations are similar to the calculations that tor will do,
   to find out if IPv6 reachability checks are reliable. So we may be able to
   check the script against tor's reachability logs. (See section 4.3.1 in
   [Proposal 311: Relay IPv6 Reachability]:  Refusing to Publish the
   Descriptor.)

   The Tor Metrics team may also independently check these calculations.

   Once the script is completed, its output will be monitored by tor
   developers, as more volunteer relay operators deploy the relevant tor
   versions. (And as the number of IPv6 relays in the consensus increases.)

8.2. Testing IPv6 Extra-Info Statistics

   We propose to test the connection and consumed bandwidth statistics using
   chutney networks. However, chutney runs for a short amount of time, and
   creates a limited amount of traffic, so we also need to test these changes
   on the public tor network.

   In particular, we have struggled to test statistics using chutney, because
   tor's hard-coded statistics period is 24 hours. (And most chutney networks
   run for under 1 minute.)

   Therefore, we propose to test these changes on the public network with a
   small number of relays and bridges.

   During 2020, the Tor Metrics team will analyse these statistics on the
   public tor network, and provide IPv6 progress reports. We expect that we may
   discover some bugs during the first analysis.

   Once these changes are merged, they will be monitored by tor developers, as
   more volunteer relay operators deploy the relevant tor versions. (And as the
   number of IPv6 relays in the consensus increases.)

References:

[Consensus Health]:
   https://consensus-health.torproject.org/

[Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/288-privcount-with-shamir.txt

[Proposal 311: Relay IPv6 Reachability]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt

[Proposal 312: Relay Auto IPv6 Address]:
   https://gitweb.torproject.org/torspec.git/tree/proposals/312-relay-auto-ipv6-addr.txt

[Relay Search]:
   https://metrics.torproject.org/rs.html

[Tor Directory Protocol]:
   (version 3) https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt

[Tor Manual Page]:
   https://2019.www.torproject.org/docs/tor-manual.html.en

[Tor Metrics: Advertised and consumed bandwidth by relay flags]:
   https://metrics.torproject.org/bandwidth-flags.html

[Tor Metrics: Advertised bandwidth by IP version]:
   https://metrics.torproject.org/advbw-ipv6.html

[Tor Metrics: Fraction of connections used uni-/bidirectionally]:
   https://metrics.torproject.org/connbidirect.html

[Tor Metrics: Traffic]:
   https://metrics.torproject.org/bandwidth-flags.html

[Tor Specification]:
   https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt
Filename: 314-allow-markdown-proposals.md
Title: Allow Markdown for proposal format.
Author: Nick Mathewson
Created: 23 April 2020
Status: Closed

Introduction

This document proposes a change in our proposal format: to allow Markdown.

Motivation

Many people, particularly researchers, have found it difficult to write text in the format that we prefer. Moreover, we have often wanted to add more formatting in proposals, and found it nontrivial to do so.

Markdown is an emerging "standard" (albeit not actually a standardized one), and we're using it in several other places. It seems like a natural fit for our purposes here.

Details

We should pick a particular Markdown dialect. "CommonMark" seems like a good choice, since it's the basis of what github and gitlab use.

We should also pick a particular tool to use for validating Markdown proposals.

We should continue to allow text proposals.

We should continue to require headers for our proposals, and do so using the format at the head of this document: wrapping the headers inside triple backticks.

Filename: 315-update-dir-required-fields.txt
Title: Updating the list of fields required in directory documents
Author: Nick Mathewson
Created: 23 April 2020
Status: Closed
Implemented-In: 0.4.5.1-alpha

Notes:

   The "hidden-service-dir" field was not made assumed-present; all other
   fields were updated.

1. Introduction

   When we add a new field to a directory document, we must at first
   describe it as "optional", since older Tor implementations will
   not generate it.  When those implementations are obsolete and
   unsupported, however, we can safely describe those fields as
   "required", since they are always included in practice.

   Making fields required is not just a matter of bookkeeping: it
   helps prevent bugs in two ways.  First, it simplifies our code.
   Second, it makes our code's requirements match our assumptions
   about the network.

   Here I'll describe a general policy for making fields required
   when LTS versions become unsupported, and include a list of
   fields that should become required today.

   This document does not require us to make all optional fields
   required -- only those which we intend that all Tor instances
   should always generate and expect.

   When we speak of making a field "required", we are talking about
   describing it as "required" in dir-spec.txt, so that any document
   missing that field is no longer considered well-formed.

2. When fields should become required

   We have four relevant kinds of directory documents: those
   generated by public relays, those generated by bridges, those
   generated by authorities, and those generated by onion services.

   Relays generate extrainfo documents and routerdesc documents.
   For these, we can safely make a field required when it is always
   generated by all relay versions that the authorities allow to
   join the network.  To avoid partitioning, authorities should
   start requiring the field before any relays or clients do.

   (If a relay field indicates the presence of a now-required
   feature, then instead of making the field mandatory, we may
   change the semantics so that the field is assumed to be
   present. Later we can remove the option.)

   Bridge relays have their descriptors processed by clients
   without necessarily passing through authorities.
   We can make fields mandatory in bridge descriptors once we
   can be confident that no bridge lacking them will actually
   connect to the network-- or that all such bridges are safe
   to stop using.

   For bridges, when a field becomes required, it will take some
   time before all clients require that field.  This would create a
   partitioning opportunity, but partitioning at the first-hop
   position is not so strong: the bridge already knows the client's
   IP, which is a much better identifier than the client's Tor
   version.

   Authorities generate authority certificates, votes, consensus
   documents, and microdescriptors.  For these, we can safely make a
   field required once all authorities are generating it, and we are
   confident that we do not plan to downgrade those authorities.

   Onion services generate service descriptors.  Because of the risk
   of partitioning attacks, we should not make features in service
   descriptors required without a phased process, described in the
   following section.

2.1. Phased addition of onion service descriptor changes

   Phase one: we add client and service support for the new field,
   but have this support disabled by default. By default, services
   should not generate the new field, and clients should not parse
   it when it is present.  This behavior is controlled by a pair of
   network parameters.  (If the feature is at all complex, the
   network parameters should describe a _minimum version_ that
   should enable the feature, so that we can later enable it only in
   the versions where the feature is not buggy.)

   During this phase, we can manually override the defaults on
   particular clients and services to test the new field.

   Phase two: authorities use the network parameters to enable the
   client support and the service support.  They should only do this
   once enough clients and services have upgraded to a version that
   supports the feature.

   Phase three: once all versions that do not support the feature are
   obsolete and unsupported, the feature may be marked as required
   in the specifications, and the network parameters ignored.

   Phase four: once all versions that used the network parameters
   are obsolete and unsupported, authorities may stop including
   those parameters in their votes.

3. Directory fields that should become required.

   These fields in router descriptors should become required:
      * identity-ed25519
      * master-key-ed25519
      * onion-key-crosscert
      * ntor-onion-key
      * ntor-onion-key-crosscert
      * router-sig-ed25519
      * proto

   These fields in router descriptors should become "assumed present":
      * hidden-service-dir

   These fields in extra-info documents should become required:
      * identity-ed25519
      * router-sig-ed25519

   The following fields in microdescriptors should become
   required:
      * ntor-onion-key

   The following fields in votes and consensus documents should
   become required:
      * pr
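
   As an illustration only, the lists above can be turned into a simple
   well-formedness check.  This is a minimal sketch, not the parser that
   Tor uses: the function name and the "first token on the line" keyword
   rule are assumptions, and the per-router "pr" line in votes and
   consensuses is left out because it would need a per-entry check.

```python
# Required keywords per document type, copied from the lists above.
REQUIRED_FIELDS = {
    "router-descriptor": {
        "identity-ed25519", "master-key-ed25519", "onion-key-crosscert",
        "ntor-onion-key", "ntor-onion-key-crosscert", "router-sig-ed25519",
        "proto",
    },
    "extra-info": {"identity-ed25519", "router-sig-ed25519"},
    "microdescriptor": {"ntor-onion-key"},
}


def missing_required_fields(doc_type, text):
    """Return the sorted required keywords that never appear in `text`.

    Assumption: a keyword is the first whitespace-delimited token on a
    line.  Object bodies and legacy "opt" prefixes get no special handling.
    """
    seen = set()
    for line in text.splitlines():
        tokens = line.split()
        if tokens:
            seen.add(tokens[0])
    return sorted(REQUIRED_FIELDS[doc_type] - seen)
```

   A document for which this returns a non-empty list would no longer be
   considered well-formed once the fields are marked required.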

Filename: 316-flashflow.md
Title: FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
Author: Matthew Traudt, Aaron Johnson, Rob Jansen, Mike Perry
Created: 23 April 2020
Status: Draft

1. Introduction

FlashFlow is a new distributed bandwidth measurement system for Tor that consists of a single authority node ("coordinator") instructing one or more measurement nodes ("measurers") when and how to measure Tor relays. A measurement consists of the following steps:

  1. The measurement nodes demonstrate to the target relay permission to perform measurements.
  2. The measurement nodes open many TCP connections to the target relay and create a one-hop circuit to the target relay on each one.
  3. For 30 seconds the measurement nodes send measurement cells to the target relay and verify that the cells echoed back match the ones sent. During this time the relay caps the amount of background traffic it transfers. Background and measurement traffic are handled separately at the relay. Measurement traffic counts towards all the standard existing relay statistics.
  4. For every second during the measurement, the measurement nodes report to the authority node how much traffic was echoed back. The target relay also reports the amount of per-second background (non-measurement) traffic.
  5. The authority node sums the per-second reported throughputs into 30 sums (one for each second) and calculates the median. This is the estimated capacity of the relay.

FlashFlow performs a measurement of every relay according to a schedule described later in this document. Periodically it produces relay capacity estimates in the form of a v3bw file, which is suitable for direct consumption by a Tor directory authority. Alternatively an existing load balancing system such as Simple Bandwidth Scanner could be modified to use FlashFlow's v3bw file as input.

It is envisioned that each directory authority that wants to use FlashFlow will run their own FlashFlow deployment consisting of a coordinator that they run and one or more measurers that they trust (e.g. because they run them themselves), similar to how each runs their own Torflow/sbws. Section 5 of this proposal describes long term plans involving multiple FlashFlow deployments. FlashFlow coordinators do not need to communicate with each other.

FlashFlow is more performant than Torflow: FlashFlow takes 5 hours to measure the entire existing Tor network from scratch (with 3 Gbit/s measurer capacity) while Torflow takes 2 days; FlashFlow measures relays it hasn't seen recently as soon as it learns about them (i.e. every new consensus) while Torflow can take a day or more; and FlashFlow accurately measures new high-capacity relays the first time and every time while Torflow takes days/weeks to assign them their full fair share of bandwidth (especially for non-exits). FlashFlow is more secure than Torflow: FlashFlow allows a relay to inflate its measured capacity by up to 1.33x (configured by a parameter) while Torflow allows weight inflation by a factor of 89x [0] or even 177x [1].

After an overview in section 2 of the planned deployment stages, sections 3, 4, and 5 discuss the short, medium, and long term deployment plans in more detail.

2. Deployment Stages

FlashFlow's deployment shall be broken up into three stages.

In the short term we will implement a working FlashFlow measurement system. This requires code changes in little-t tor and an external FlashFlow codebase. The majority of the implementation work will be done in the short term, and the product is a complete FlashFlow measurement system. Remaining pieces (e.g. better authentication) are added later for enhanced security and network performance.

In the medium term we will begin collecting data with a FlashFlow deployment. The intermediate results and v3bw files produced will be made available (semi?) publicly for study.

In the long term experiments will be performed to study ways of using FF v3bw files to improve load balancing. Two examples: (1) using FF v3bw files instead of sbws's (and eventually phasing out torflow/sbws), and (2) continuing to run sbws but use FF's results as a better estimate of relay capacity than observed bandwidth. Authentication and other FlashFlow features necessary to make it completely ready for full production deployment will be worked on during this long term phase.

3. FlashFlow measurement system: Short term

The core measurement mechanics will be implemented in little-t tor, but a separate codebase for the FlashFlow side of the measurement system will also be created. This section is divided into three parts: first a discussion of changes/additions that logically reside entirely within tor (essentially: relay-side modifications), second a discussion of the separate FlashFlow code that also requires some amount of tor changes (essentially: measurer-side and coordinator-side modifications), and third a security discussion.

3.1 Little-T Tor Components

The primary additions/changes that entirely reside within tor on the relay side:

  • New torrc options/consensus parameters.
  • New cell commands.
  • Pre-measurement handshaking (with a simplified authentication scheme).
  • Measurement mode, during which the relay will echo traffic with measurers, set a cap on the amount of background traffic it transfers, and report the amount of transferred background traffic.

3.1.1 Parameters

FlashFlow will require some consensus parameters/torrc options. Each has some default value if nothing is specified; the consensus parameter overrides this default value; the torrc option overrides both.

FFMeasurementsAllowed: A global toggle on whether or not to allow measurements. Even if all other settings would allow a measurement, if this is turned off, then no measurement is allowed. Possible values: 0, 1. Default: 0 (disallowed).

FFAllowedCoordinators: The list of coordinator TLS certificate fingerprints that are allowed to start measurements. Relays check their torrc when they receive a connection from a FlashFlow coordinator to see if it's on the list. If they have no list, they check the consensus parameter. If neither exists, then no FlashFlow deployment is allowed to measure this relay. Default: empty list.

FFMeasurementPeriod: A relay should expect, on average, to be measured by each FlashFlow deployment once each measurement period. A relay will not allow itself to be measured more than twice by a FlashFlow deployment in any time window of this length. Relays should not change this option unless they really know what they're doing. Changing it at the relay will not change how often FlashFlow will attempt to measure the relay. Possible values are in the range [1 hour, 1 month] inclusive. Default: 1 day.

FFBackgroundTrafficPercent: The maximum amount of regular non-measurement traffic a relay should handle while being measured, as a percent of total traffic (measurement + non-measurement). This parameter is a trade off between having to limit background traffic and limiting how much a relay can inflate its result by handling no background traffic but reporting that it has done so. Possible values are in the range [0, 99] inclusive. Default: 25 (a maximum inflation factor of 1.33).

FFMaxMeasurementDuration: The maximum amount of time, in seconds, that is allowed to pass between the moment the relay is notified that a measurement will begin soon and the end of the measurement. If this amount of time passes, the relay shall close all measurement connections and exit its measurement mode. Note that this duration includes handshake time, so it is necessarily larger than the expected actual measurement duration. Possible values are in the range [10, 120] inclusive. Default: 45.
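
The precedence rule at the start of this section (built-in default, overridden by the consensus parameter, overridden in turn by the torrc option) can be sketched as follows. This is only an illustration: the function name, the dictionary inputs, expressing FFMeasurementPeriod in seconds, treating "1 month" as 30 days, and rejecting out-of-range values with an exception are all assumptions rather than specified behavior.

```python
# (default, minimum, maximum) for the numeric parameters listed above.
# Assumptions: periods are expressed in seconds and "1 month" is 30 days.
FF_PARAMS = {
    "FFMeasurementsAllowed": (0, 0, 1),
    "FFMeasurementPeriod": (24 * 3600, 3600, 30 * 24 * 3600),
    "FFBackgroundTrafficPercent": (25, 0, 99),
    "FFMaxMeasurementDuration": (45, 10, 120),
}


def resolve_param(name, consensus=None, torrc=None):
    """Pick the effective value: torrc beats consensus beats default."""
    default, low, high = FF_PARAMS[name]
    consensus = consensus or {}
    torrc = torrc or {}
    if name in torrc:
        value = torrc[name]
    elif name in consensus:
        value = consensus[name]
    else:
        value = default
    # How out-of-range values are handled is not specified; raising here
    # is a choice made for this sketch.
    if not low <= value <= high:
        raise ValueError(f"{name}={value} outside [{low}, {high}]")
    return value
```

For example, `resolve_param("FFBackgroundTrafficPercent", consensus={"FFBackgroundTrafficPercent": 10})` yields 10, and adding a torrc value of 50 yields 50.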

3.1.2 New Cell Types

FlashFlow will introduce a new cell command MEASUREMENT.

The payload of each MEASUREMENT cell consists of:

Measure command [1 byte]
Data            [varied]

The measure commands are:

0 -- MEAS_PARAMS    [forward]
1 -- MEAS_PARAMS_OK [backward]
2 -- MEAS_BG        [backward]
3 -- MEAS_ERR       [forward and backward]

Forward cells are sent from the measurer/coordinator to the relay. Backward cells are sent from the relay to the measurer/coordinator.

MEAS_PARAMS and MEAS_PARAMS_OK are used during the pre-measurement stage to tell the target what to expect and for the relay to positively acknowledge the message. The target sends a MEAS_BG cell once per second to report the amount of background traffic it is handling. MEAS_ERR cells are used to signal to the other party that there has been some sort of problem and that the measurement should be aborted. These measure commands are described in more detail in the next section.

FlashFlow also introduces a new relay command, MEAS_ECHO. Relay cells with this relay command are the measurement traffic. The measurer generates and encrypts them and sends them to the target; the target decrypts them and sends them back. A variation where the measurer skips encryption of MEAS_ECHO cells in most cases is described in Appendix A, and was found to be necessary in paper prototypes to save CPU load at the measurer.

MEASUREMENT cells, on the other hand, are not encrypted (beyond the regular TLS on the connection).

3.1.3 Pre-Measurement Handshaking/Starting a Measurement

The coordinator establishes a one-hop circuit with the target relay and sends it a MEAS_PARAMS cell. If the target is unwilling to be measured at this time or if the coordinator didn't use a TLS certificate that the target trusts, it responds with an error cell and closes the connection. Otherwise it checks that the parameters of the measurement are acceptable (e.g. the version is acceptable, the duration isn't too long, etc.). If the target is happy, it sends a MEAS_PARAMS_OK, otherwise it sends a MEAS_ERR and closes the connection.

Upon learning the IP addresses of the measurers from the coordinator in the MEAS_PARAMS cell, the target whitelists their IPs in its DoS detection subsystem until the measurement ends (successfully or otherwise), at which point the whitelist is cleared.

Upon receiving a MEAS_PARAMS_OK from the target, the coordinator will instruct the measurers to open their circuits (one circuit per connection) with the target. If the coordinator receives a MEAS_ERR, or if any measurer receives one and reports it to the coordinator, the measurement is considered a failure. It is also a failure if any measurer is unable to open at least half of its circuits with the target.

The payload of MEAS_PARAMS cells [XXX more may need to be added]:

- meas_duration [2 bytes] [1, 600]
- num_measurers [1 byte] [1, 10]
- measurer_info [num_measurers times]

meas_duration is the duration, in seconds, that the actual measurement will last. num_measurers is how many link_specifier structs follow containing information on the measurers that the relay should expect. Future versions of FlashFlow and MEAS_PARAMS will use TLS certificates instead of IP addresses. [XXX probably need diff layout to allow upgrade to TLS certs instead of link_specifier structs. probably using ext-type-length-value like teor suggests] [XXX want to specify number of conns to expect from each measurer here?]

MEAS_PARAMS_OK has no payload: it's just padding bytes to make the cell PAYLOAD_LEN (509) bytes long.

The payload of MEAS_ECHO cells:

- arbitrary bytes [PAYLOAD_LEN bytes]

The payload of MEAS_BG cells [XXX more for extra info? like CPU usage]:

- second        [2 bytes] [1, 600]
- sent_bg_bytes [4 bytes] [0, 2^32-1]
- recv_bg_bytes [4 bytes] [0, 2^32-1]

second is the number of seconds since the measurement began. MEAS_BG cells are sent once per second from the relay to the FlashFlow coordinator. The first cell will have this set to 1, and each subsequent cell will increment it by one. sent_bg_bytes is the number of background traffic bytes sent in the last second (since the last MEAS_BG cell). recv_bg_bytes is the same but for received bytes.
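
To make the MEAS_BG layout concrete, here is a minimal encoding sketch. It assumes big-endian (network order) integers and zero padding up to PAYLOAD_LEN, neither of which is stated above for this cell; the function names are hypothetical, and the payload layout itself is still marked XXX and may change.

```python
import struct

PAYLOAD_LEN = 509
MEAS_BG = 2  # measure command value from the list in section 3.1.2

# Assumption: fields are big-endian (network order).
# B = measure command, H = second, I = sent_bg_bytes, I = recv_bg_bytes.
_MEAS_BG_FMT = ">BHII"


def encode_meas_bg(second, sent_bg_bytes, recv_bg_bytes):
    """Build a MEASUREMENT cell payload carrying a MEAS_BG report."""
    if not 1 <= second <= 600:
        raise ValueError("second must be in [1, 600]")
    body = struct.pack(_MEAS_BG_FMT, MEAS_BG, second,
                       sent_bg_bytes, recv_bg_bytes)
    # Assumption: the unused remainder of the payload is zero padding.
    return body.ljust(PAYLOAD_LEN, b"\x00")


def decode_meas_bg(payload):
    """Return (second, sent_bg_bytes, recv_bg_bytes) from a payload."""
    if len(payload) != PAYLOAD_LEN:
        raise ValueError("unexpected payload length")
    command, second, sent, recv = struct.unpack_from(_MEAS_BG_FMT, payload)
    if command != MEAS_BG:
        raise ValueError("not a MEAS_BG payload")
    return second, sent, recv
```

`struct.pack` rejects counters that do not fit in 4 bytes, which matches the stated [0, 2^32-1] range.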

The payload of MEAS_ERR cells [XXX need field for more info]:

- err_code [1 byte] [0, 255]

The error code is one of:

[... XXX TODO ...]
255 -- OTHER

3.1.4 Measurement Mode

The relay considers the measurement to have started the moment it receives the first MEAS_ECHO cell from any measurer. At this point, the relay

  • Starts a repeating 1s timer on which it will report the amount of background traffic to the coordinator over the coordinator's connection.
  • Enters "measurement mode" and limits the amount of background traffic it handles according to the torrc option/consensus parameter.

The relay decrypts and echoes back all MEAS_ECHO cells it receives on measurement connections until it has reported its amount of background traffic the same number of times as there are seconds in the measurement (e.g. 30 per-second reports for a 30 second measurement). After sending the last MEAS_BG cell, the relay drops all buffered MEAS_ECHO cells, closes all measurement connections, and exits measurement mode.

During the measurement the relay targets a ratio of background traffic to measurement traffic as specified by a consensus parameter/torrc option. For a given ratio r, if the relay has handled x cells of measurement traffic recently, Tor then limits itself to y = xr/(1-r) cells of non-measurement traffic this scheduling round. If x is very small, the relay will perform the calculation s.t. x is the number of cells required to produce 10 Mbit/s of measurement traffic, thus ensuring some minimum amount of background traffic is allowed.
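
A small sketch of this limit, using an integer percentage in the style of FFBackgroundTrafficPercent. The function and parameter names are hypothetical. How small x must be before the floor applies is not specified, so this sketch simply takes the larger of x and a caller-supplied floor; converting 10 Mbit/s into a cell count per scheduling round is left to the caller.

```python
def background_cell_limit(measurement_cells, bg_percent, min_measurement_cells):
    """Cells of background traffic allowed this round: y = x*r / (1 - r).

    measurement_cells      x, measurement cells handled recently
    bg_percent             r as an integer percentage in [0, 99]
    min_measurement_cells  floor for x (the cell count equivalent to
                           10 Mbit/s; computing it is up to the caller)
    """
    if not 0 <= bg_percent <= 99:
        raise ValueError("bg_percent must be in [0, 99]")
    x = max(measurement_cells, min_measurement_cells)
    # Integer arithmetic: x * r / (1 - r) with r = bg_percent / 100.
    return x * bg_percent // (100 - bg_percent)
```

With r = 25%, a relay that handled 300 measurement cells may handle 100 background cells, so background traffic is 100/400 = 25% of the total.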

[XXX teor suggests in [4] that the number 10 Mbit/s could be derived more intelligently. E.g. based on AuthDirFastGuarantee or AuthDirGuardBWGuarantee]

3.2 FlashFlow Components

The FF coordinator and measurer code will reside in a FlashFlow repository separate from little-t tor.

There are three notable parameters for which a FF deployment must choose values. They are:

  • The number of sockets, s, the measurers should open, in aggregate, with the target relay. We suggest s=160 based on the FF paper.
  • The bandwidth multiplier, m. Given an existing capacity estimate for a relay, z, the coordinator will instruct the measurers to, in aggregate, send m*z Mbit/s to the target relay. We recommend m=2.25.
  • The measurement duration, d. Based on the FF paper, we recommend d=30 seconds.

The rest of this section first discusses notable functions of the FlashFlow coordinator, then goes on to discuss FF measurer code that will require supporting tor code.

3.2.1 FlashFlow Coordinator

The coordinator is responsible for scheduling measurements, aggregating results, and producing v3bw files. It needs continuous access to new consensus files, which it can obtain by running an accompanying Tor process in client mode.

The coordinator has the following functions, which will be described in this section:

  • result aggregation.
  • measurement scheduling.
  • v3bw file generation.

3.2.1.1 Aggregating Results

Every second during a measurement, the measurers send the amount of verified measurement traffic they have received back from the relay. Additionally, the relay sends a MEAS_BG cell each second to the coordinator with the amount of non-measurement background traffic it is sending and receiving.

For each second's reports, the coordinator sums the measurers' reports. The coordinator takes the minimum of the relay's reported sent and received background traffic. If, when compared to the measurers' reports for this second, the relay's claimed background traffic is more than what's allowed by the background/measurement traffic ratio, then the coordinator further clamps the relay's report down. The coordinator adds this final adjusted amount of background traffic to the sum of the measurers' reports.

Once the coordinator has done the above for each second in the measurement (e.g. 30 times for a 30 second measurement), the coordinator takes the median of the 30 per-second throughputs and records it as the estimated capacity of the target relay.
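
The aggregation steps above can be sketched as follows. The function name and input shapes are assumptions made for illustration; the clamp uses the same y = xr/(1-r) relationship as the relay-side limit in section 3.1.4.

```python
from statistics import median


def estimate_capacity(measurer_reports, bg_reports, bg_percent=25):
    """Median of per-second (measurement + clamped background) totals.

    measurer_reports  one list per second of per-measurer verified bytes
    bg_reports        one (sent_bg_bytes, recv_bg_bytes) pair per second
    bg_percent        allowed background share of total traffic, in percent
    """
    if len(measurer_reports) != len(bg_reports):
        raise ValueError("need one background report per second")
    totals = []
    for per_measurer, (sent_bg, recv_bg) in zip(measurer_reports, bg_reports):
        measured = sum(per_measurer)
        # Trust only the smaller of the two directions the relay reports.
        claimed_bg = min(sent_bg, recv_bg)
        # Clamp the claim to what the configured ratio permits.
        allowed_bg = measured * bg_percent // (100 - bg_percent)
        totals.append(measured + min(claimed_bg, allowed_bg))
    return median(totals)
```

For instance, with per-second measurer sums of 100, 100, and 60 and background claims of 10, 100, and 0, the per-second totals are 110, 133, and 60, and the estimate is 110.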

3.2.1.2 Measurement Schedule

The short term implementation of measurement scheduling will be simpler than the long term one due to (1) there only being one FlashFlow deployment, and (2) there being very few relays that support being measured by FlashFlow. In fact the FF coordinator will maintain a list of the relays that have updated to support being measured and have opted in to being measured, and it will only measure them.

The coordinator divides time into a series of 24 hour periods, commonly referred to as days. Each period has measurement slots that are longer than a measurement lasts (30s), say 60s, to account for pre- and post-measurement work. Thus with 60s slots there are 1,440 slots in a day.

At the start of each day the coordinator considers the list of relays that have opted in to being measured. From this list of relays, it repeatedly takes the relay with the largest existing capacity estimate. It selects a random slot. If the slot has existing relays assigned to it, the coordinator makes sure there is enough additional measurer capacity to handle this relay. If so, it assigns this relay to this slot. If not, it keeps picking new random slots until one has sufficient additional measurer capacity.

Relays without existing capacity estimates are assumed to have the 75th percentile capacity of the current network.

If a relay is not online when it's scheduled to be measured, it doesn't get measured that day.
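
The slot assignment described above can be sketched like this. Names are hypothetical. Picking uniformly among the slots that fit is equivalent to re-drawing random slots until one fits. The text does not say what happens when no slot fits, so this sketch returns such relays separately. Relays without an estimate are expected to have been given the 75th percentile capacity by the caller.

```python
import random


def build_schedule(capacities, num_slots, measurer_capacity, multiplier,
                   rng=None):
    """Assign relays to slots, largest capacity estimate first.

    capacities         {relay_id: existing capacity estimate}
    measurer_capacity  total measurer capacity available in each slot
    multiplier         m; a relay with estimate z reserves m*z
    Returns (schedule, unscheduled), where schedule maps each slot index
    to the list of relay ids assigned to it.
    """
    rng = rng or random.Random()
    free = [measurer_capacity] * num_slots
    schedule = {slot: [] for slot in range(num_slots)}
    unscheduled = []
    by_size = sorted(capacities.items(), key=lambda item: item[1],
                     reverse=True)
    for relay, capacity in by_size:
        need = capacity * multiplier
        # An empty slot is always acceptable; an occupied slot is only
        # acceptable if enough measurer capacity remains.
        candidates = [s for s in range(num_slots)
                      if not schedule[s] or free[s] >= need]
        if not candidates:
            unscheduled.append(relay)  # not covered by the text above
            continue
        slot = rng.choice(candidates)
        schedule[slot].append(relay)
        free[slot] -= need
    return schedule, unscheduled
```

Passing a seeded `random.Random` makes the result reproducible, which is convenient for testing.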

3.2.1.2.1 Example

Assume the FF deployment has 1 Gbit/s of measurer capacity. Assume the chosen multiplier m=2. Assume there are only 5 slots in a measurement period.

Consider a set of relays with the following existing capacity estimates and that have opted in to being measured by FlashFlow.

  • 500 Mbit/s
  • 300 Mbit/s
  • 250 Mbit/s
  • 200 Mbit/s
  • 100 Mbit/s
  • 50 Mbit/s

The coordinator takes the largest relay, 500 Mbit/s, and picks a random slot for it. It picks slot 3. The coordinator takes the next largest, 300, and randomly picks slot 2. The slots are now:

   0   |   1   |   2   |   3   |   4
-------|-------|-------|-------|-------
       |       |  300  |  500  |
       |       |       |       |

The coordinator takes the next largest, 250, and randomly picks slot 2. Slot 2 already has 600 Mbit/s of measurer capacity reserved (300*m); given just 1000 Mbit/s of total measurer capacity, there is just 400 Mbit/s of spare capacity while this relay requires 500 Mbit/s. There is not enough room in slot 2 for this relay. The coordinator picks a new random slot, 0.

   0   |   1   |   2   |   3   |   4
-------|-------|-------|-------|-------
  250  |       |  300  |  500  |
       |       |       |       |

The next largest is 200 and the coordinator randomly picks slot 2 again (wow!). As there is just enough spare capacity, the coordinator assigns this relay to slot 2.

   0   |   1   |   2   |   3   |   4
-------|-------|-------|-------|-------
  250  |       |  300  |  500  |
       |       |  200  |       |

The coordinator randomly picks slot 4 for each of the two remaining relays (100 and 50 Mbit/s), in that order.

   0   |   1   |   2   |   3   |   4
-------|-------|-------|-------|-------
  250  |       |  300  |  500  |  100
       |       |  200  |       |   50

3.2.1.3 Generating V3BW files

Every hour the FF coordinator produces a v3bw file in which it stores the latest capacity estimate for every relay it has measured in the last week. The coordinator will create this file on the host's local file system. Previously-generated v3bw files will not be deleted by the coordinator. A symbolic link at a static path will always point to the latest v3bw file.

$ ls -l
v3bw -> v3bw.2020-03-01-05-00-00
v3bw.2020-03-01-00-00-00
v3bw.2020-03-01-01-00-00
v3bw.2020-03-01-02-00-00
v3bw.2020-03-01-03-00-00
v3bw.2020-03-01-04-00-00
v3bw.2020-03-01-05-00-00
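
One way to produce this layout is sketched below. It is an assumption-laden illustration, not the FlashFlow implementation: the function name, the temporary link name, and the use of UTC timestamps are all hypothetical, and the file contents are treated as an opaque string. It relies on POSIX rename semantics so that readers of the static path never observe a missing link.

```python
import os
from datetime import datetime, timezone


def publish_v3bw(directory, contents, now=None):
    """Write a timestamped v3bw file and repoint the `v3bw` symlink at it."""
    now = now or datetime.now(timezone.utc)  # assumption: UTC timestamps
    name = now.strftime("v3bw.%Y-%m-%d-%H-%M-%S")
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(contents)
    # Create the new link under a temporary name, then rename it over the
    # old one so the switch is atomic on POSIX file systems.
    tmp_link = os.path.join(directory, ".v3bw.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(name, tmp_link)  # relative target, as in the listing above
    os.replace(tmp_link, os.path.join(directory, "v3bw"))
    return path
```

Older files are left in place, matching the behavior described above.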

[XXX Either FF should auto-delete old ones, logrotate config should be provided, a script provided, or something to help bwauths not accidentally fill up their disk]

[XXX What's the approximate disk usage for, say, a few years of these?]

3.2.2 FlashFlow Measurer

The measurers take commands from the coordinator, connect to target relays with many sockets, send them traffic, and verify the received traffic is the same as what was sent.

Notable new things that internal tor code will need to do on the measurer (client) side:

  1. Open many TLS+TCP connections to the same relay on purpose.

3.2.2.1 Open many connections

FlashFlow prototypes needed to "hack in" a flag in the open-a-connection-with-this-relay function call chain that indicated whether or not we wanted to force a new connection to be created. Most of Tor doesn't care if it reuses an existing connection, but FF does want to create many different connections. The cleanest way to accomplish this will be investigated.

On the relay side, these measurer connections do not count towards DoS detection algorithms.

3.3 Security

In this section we discuss the security of various aspects of FlashFlow and the tor changes it requires.

3.3.1 Weight Inflation

Target relays are an active part of the measurement process; they know they are getting measured. While a relay cannot fake the measurement traffic, it can trivially stop transferring client background traffic for the duration of the measurement yet claim it carried some. More generally, there is no verification of the claimed amount of background traffic during the measurement. The relay can claim whatever it wants, but it will not be trusted above the ratio the FlashFlow deployment is configured to allow. This places an easy to understand, firm, and (if set as we suggest) low cap on how much a relay can inflate its measured capacity.

Consider a background/measurement ratio of 1/4, or 25%. Assume the relay in question has a hard limit on capacity (e.g. from its NIC) of 100 Mbit/s. The relay is supposed to use up to 25% of its capacity for background traffic and the remaining 75%+ capacity for measurement traffic. Instead the relay ceases carrying background traffic, uses all 100 Mbit/s of capacity to handle measurement traffic, and reports ~33 Mbit/s of background traffic (33/133 = ~25%). FlashFlow would trust this and consider the relay capable of 133 Mbit/s. (If the relay were to report more than ~33 Mbit/s, FlashFlow limits it to just ~33 Mbit/s.) With r=25%, FlashFlow only allows 1.33x weight inflation.

Prior work shows that Torflow allows weight inflation by a factor of 89x [0] or even 177x [1].

The ratio chosen is a trade-off between impact on background traffic and security: r=50% allows a relay to double its weight but won't impact client traffic for relays with steady state throughput below 50%, while r=10% allows a very low inflation factor but will cause throttling of client traffic at far more relays. We suggest r=25% (and thus 1/(1-0.25)=1.33x inflation) for a reasonable trade-off between performance and security.
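
The relationship between r and the worst-case inflation can be checked with a few lines. The function names are hypothetical; the formulas are the 1/(1-r) factor stated above and the matching cap on claimed background traffic.

```python
def max_inflation_factor(bg_percent):
    """Worst-case inflation 1 / (1 - r), with r given in percent."""
    if not 0 <= bg_percent < 100:
        raise ValueError("bg_percent must be in [0, 100)")
    return 100 / (100 - bg_percent)


def max_claimable_background(measurement_rate, bg_percent):
    """Largest background rate the coordinator will credit to the relay."""
    return measurement_rate * bg_percent / (100 - bg_percent)


if __name__ == "__main__":
    for r in (10, 25, 50):
        print(f"r={r}%: inflation up to {max_inflation_factor(r):.2f}x")
    # The 100 Mbit/s relay from the example above can claim ~33 Mbit/s.
    print(f"{max_claimable_background(100, 25):.1f} Mbit/s")
```

For r = 10%, 25%, and 50% the factors are roughly 1.11x, 1.33x, and 2x respectively.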

It may be possible to catch relays performing this attack, especially if they literally drop all background traffic during the measurement: have the measurer (or some party on its behalf) create a regular stream through the relay and measure the throughput on the stream before/during/after the measurement. This can be explored longer term.

3.3.2 Incomplete Authentication

The short term FlashFlow implementation has relay operators set two torrc options if they would like to allow their relay to be measured: a flag allowing measurement, and the list of coordinator TLS certificates that are allowed to start a measurement.

The relay drops MEAS_PARAMS cells from coordinators it does not trust, and immediately closes the connection after that. A FF coordinator cannot convince a relay to enter measurement mode unless the relay trusts its TLS certificate.

A trusted coordinator specifies in the MEAS_PARAMS cell the IP addresses of the measurers the relay shall expect to connect to it shortly. The target adds the measurer IP addresses to a whitelist in the DoS connection limit system, exempting them from any configured connection limit. If a measurer is behind a NAT, an adversary behind the same NAT can DoS the relay's available sockets until the end of the measurement. The adversary could also pretend to be the measurer. Such an adversary could induce measurement failures and inaccuracies. (Note: the whitelist is cleared after the measurement is over.)

4. FlashFlow measurement system: Medium term

The medium term deployment stage begins after FlashFlow has been implemented and relays are starting to update to a version of Tor that supports it.

New link- and relay-subprotocol versions will be used by the relay to indicate FF support. E.g. at the time of writing, the next relay subprotocol version is 4 [3].

We plan to host a FlashFlow deployment consisting of a FF coordinator and a single FF measurer on a single 1 Gbit/s machine. Data produced by this deployment will be made available (semi?) publicly, including both v3bw files and intermediate results.

Any development changes needed during this time would go through separate proposals.

5. FlashFlow measurement system: Long term

In the long term, finishing-touch development work will be done, including adding better authentication and measurement scheduling, and experiments will be run to determine the best way to integrate FlashFlow into the Tor ecosystem.

Any development changes needed during this time would go through separate proposals.

5.1 Authentication to Target Relay

Short term deployment already had FlashFlow coordinators using TLS certificates when connecting to relays, but in the long term, directory authorities will vote on the consensus parameter for which coordinators should be allowed to perform measurements. The voting is done in the same way they currently vote on recommended tor versions.

FlashFlow measurers will be updated to use TLS certificates when connecting to relays too. FlashFlow coordinators will update the contents of MEAS_PARAMS cells to contain measurer TLS certificates instead of IP addresses, and relays will update to expect this change.

5.2 Measurement Scheduling

Short term deployment only has one FF deployment running. Long term this may no longer be the case because, for example, more than one directory authority decides to adopt it and they each want to run their own deployment. FF deployments will need to coordinate between themselves to not measure the same relay at the same time, and to handle new relays as they join during the middle of a measurement period (during the day).

The measurement scheduling process shall be non-interactive. All the inputs (e.g. the shared random value, the identities of the coords, the relays currently in the network) are publicly known to (at least) the bwauths, thus each individual bwauth can calculate the same multi-coord measurement schedule.

The following is quoted from Section 4.3 of the FlashFlow paper.

To measure all relays in the network, the BWAuths periodically
determine the measurement schedule. The schedule determines when and
by whom a relay should be measured. We assume that the BWAuths have
sufficiently synchronized clocks to facilitate coordinating their
schedules. A measurement schedule is created for each measurement
period, the length p of which determines how often a relay is
measured. We use a measurement period of p = 24 hours.

To help avoid active denial-of-service attacks on targeted relays,
the measurement schedule is randomized and known only to the
BWAuths. Before the next measurement period starts, the BWAuths
collectively generate a random seed (e.g. using Tor’s
secure-randomness protocol). Each BWAuth can then locally determine
the shared schedule using pseudorandom bits extracted from that
seed. The algorithm to create the schedule considers each
measurement period to be divided into a sequence of t-second
measurement slots. For each old relay, slots for each BWAuth to
measure it are selected uniformly at random without replacement
from all slots in the period that have sufficient unallocated
measurement capacity to accommodate the measurement. When a new
relay appears, it is measured separately by each BWAuth in the first
slots with sufficient unallocated capacity. Note that this design
ensures that old relays will continue to be measured, with new
relays given secondary priority in the order they arrive.

[XXX Teor leaves good ideas in his tor-dev@ post [5], including a good plain language description of what the FF paper quote says, and a recommendation on which consensus to use when making a new schedule]

A problem arises when two relays are hosted on the same machine but measured at different times: they both will be measured to have the full capacity of their host. At the very least, the scheduling algo should schedule relays with the same IP to be measured at the same time. Perhaps better is measuring all relays in the same MyFamily, same ipv4/24, and/or same ipv6/48 at the same time. What specifically to do here is left for medium/long term work.

5.3 Experiments

[XXX todo]

5.4 Other Changes/Investigations/Ideas

  • How can FlashFlow data be used in a way that doesn't lead to poor load balancing given the following items that lead to non-uniform client behavior:
    • Guards that high-traffic HSs choose (for 3 months at a time)
    • Guard vs middle flag allocation issues
    • New Guard nodes (Guardfraction)
    • Exit policies other than default/all
    • Directory activity
    • Total onion service activity
    • Super long-lived circuits
  • Add a cell that the target relay sends to the coordinator indicating its CPU and memory usage, whether it has a shortage of sockets, how much bandwidth load it has been experiencing lately, etc. Use this information only to lower a relay's weight, never to increase it.
  • If FlashFlow and sbws work together (as opposed to FlashFlow replacing sbws), consider logic for how much sbws can increase/decrease FF results
  • Coordination of multiple FlashFlow deployments: scheduling of measurements, seeding schedule with shared random value.
  • Other background/measurement traffic ratios. Dynamic? (known slow relay => more allowed bg traffic?)
  • Catching relays inflating their measured capacity by dropping background traffic.
  • What to do about co-located relays. Can they be detected reliably? Should we just add a torrc option a la MyFamily for co-located relays?
  • What is the explanation for dennis.jackson's scary graphs in this [2] ticket? Was it because of the speed test? Why? Will FlashFlow produce the same behavior?

Appendix A: Save CPU at measurer by not encrypting all MEAS_ECHO cells

Verify echo cells

A parameter will exist to tell the measurers with what frequency they shall verify that cells echoed back to them match what was sent. This parameter does not need to exist outside of the FF deployment (e.g. it doesn't need to be a consensus parameter).

The parameter instructs the measurers to check 1 out of every N cells.

The measurer keeps a count of how many measurement cells it has sent. It also logically splits its output stream of cells into buckets of size N. At the start of each bucket (when num_sent % N == 0), the measurer chooses a random index in the bucket. Upon sending the cell at that index (num_sent % N == chosen_index), the measurer records the cell.

The measurer also counts cells that it receives. When it receives a cell at an index that was recorded, it verifies that the received cell matches the recorded sent cell. If they match, no special action is taken. If they don't match, the measurer indicates failure to the coordinator and target relay and closes all connections, ending the measurement.

Example

Suppose bucket_size is 1000. For the moment, ignore cell encryption.

We start at idx=0 and pick an idx in [0, 1000) to record, say 640. At idx=640 we record the cell. At idx=1000 we choose a new idx in [1000, 2000) to record, say 1236. At idx=1236 we record the cell. At idx=2000 we choose a new idx in [2000, 3000). Etc.

There are now 2000+ cells in flight, and the measurer has recorded two items:

- (640, contents_of_cellA)
- (1236, contents_of_cellB)

Consider the receive side now. It counts the cells it receives. At receive idx=640, it checks the received cell matches the saved cell from before. At receive idx=1236, it again checks the received cell matches. Etc.
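
The bookkeeping described above can be sketched as a small class. This is a minimal illustration, assuming cells are compared as opaque byte strings and ignoring cell encryption, as the example does. The class and method names are hypothetical.

```python
import random

class EchoVerifier:
    """Record one randomly chosen cell per bucket of N sent cells and
    check it when the cell at the same index is echoed back."""

    def __init__(self, bucket_size, rng=None):
        self.n = bucket_size
        # The chosen index must be unpredictable to the target relay, so
        # default to the OS random source.
        self.rng = rng if rng is not None else random.SystemRandom()
        self.num_sent = 0
        self.num_recv = 0
        self.chosen = None
        self.recorded = {}    # absolute send index -> cell contents

    def on_send(self, cell):
        if self.num_sent % self.n == 0:
            # Start of a new bucket: pick the offset to record.
            self.chosen = self.rng.randrange(self.n)
        if self.num_sent % self.n == self.chosen:
            self.recorded[self.num_sent] = cell
        self.num_sent += 1

    def on_receive(self, cell):
        """Return False if a recorded cell was echoed incorrectly."""
        expected = self.recorded.pop(self.num_recv, None)
        self.num_recv += 1
        return expected is None or expected == cell
```

A caller would treat a False return from `on_receive` as the signal to report failure and end the measurement.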

Motivation

A malicious relay may want to skip decryption of measurement cells to save CPU cycles and obtain a higher capacity estimate. More generally, it could generate fake measurement cells locally, ignore the measurement traffic it is receiving, and flood the measurer with more traffic than it (the measurer) is even sending.

The security of echo cell verification is discussed in section 3.3.1.

Security

A smaller bucket size means more cells are checked and FF is more likely to detect a malicious target. It also means more bookkeeping overhead (CPU/RAM).

An adversary that knows bucket_size and cheats on one item out of every bucket_size items will have a 1/bucket_size chance of getting caught in the first bucket. This is the worst case adversary. While cheating on just a single item per bucket yields very little advantage, cheating on more items per bucket increases the likelihood the adversary gets caught. Thus only the worst case is considered here.

In general, the odds the adversary can successfully cheat in a single bucket are

(bucket_size-1)/bucket_size

Thus the odds the adversary can cheat in X consecutive buckets are

[(bucket_size-1)/bucket_size]^X

In our case, X will be highly varied: Slow relays won't see very many buckets, but fast relays will. The damage to the network a very slow relay can do by faking being only slightly faster is limited. Nonetheless, for now we motivate the selection of bucket_size with a slow relay:

  • Assume a very slow relay of 1 Mbit/s capacity that will cheat 1 cell in each bucket. Assume a 30 second measurement.
  • The relay will handle 1*30 = 30 Mbit of traffic during the measurement, or 3.75 MB, or 3.75 million bytes.
  • Cells are 514 bytes. Approximately (e.g. ignoring TLS) 7300 cells will be sent/recv over the course of the measurement.
  • A bucket_size of 50 results in about 146 buckets over the course of the 30s measurement.
  • Therefore, the odds of the adversary cheating successfully are (49/50)^(146), or about 5.2%.

This sounds high, but a relay capable of double the bandwidth (2 Mbit/s) will have (49/50)^(2*146) or about 0.27% odds of success, which is quite low.

Wanting a <1% chance that a 10 Mbit/s relay can successfully cheat results in a bucket size of approximately 125:

  • 10*30 = 300 Mbit of traffic during 30s measurement. 37.5 million bytes.
  • 37,500,000 bytes / 514 bytes/cell = ~73,000 cells
  • bucket_size of 125 cells means 73,000 / 125 = 584 buckets
  • (124/125)^(584) = 0.918% chance of successfully cheating

Slower relays can cheat more easily but the amount of extra weight they can obtain is insignificant in absolute terms. Faster relays are essentially unable to cheat.
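
The arithmetic above can be reproduced with a few lines of code. This is a sketch using the same simplifications as the text (514-byte cells, no TLS overhead, one cheated cell per bucket). The function name is ours.

```python
CELL_LEN = 514  # bytes per cell, as assumed in the text

def cheat_success_odds(mbit_per_s, seconds, bucket_size):
    """Odds that a relay cheating on one cell per bucket is never caught."""
    total_bytes = mbit_per_s * 1_000_000 / 8 * seconds
    buckets = total_bytes / CELL_LEN / bucket_size
    return ((bucket_size - 1) / bucket_size) ** buckets

slow = cheat_success_odds(1, 30, 50)      # roughly 5.2%
faster = cheat_success_odds(10, 30, 125)  # just under 1%
```

The function does not round the cell and bucket counts, so its results differ slightly from the hand-rounded figures.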

Filename: 317-secure-dns-name-resolution.txt
Title: Improve security aspects of DNS name resolution
Author: Christian Hofer
Created: 21-Mar-2020
Status: Needs-Revision

Overview:

   This document proposes a solution for handling DNS name resolution within
   Tor in a secure manner. In order to achieve this the responsibility for
   name resolution is moved from the exit relays to the clients. Therefore a
   security aware DNS resolver is required that is able to operate using Tor.

   The advantages are:

   * Users no longer have to trust exit relays but can choose trusted
     nameservers.
   * DNS requests are kept confidential from exit relays in case the
     nameservers are running behind onion services.
   * The authenticity and integrity of DNS records is verified by means of
     DNSSEC.

Motivation:

   The way Tor resolves DNS names has always been a hot topic within
   the Tor community and it seems that the discussion is not over yet.

   One example is this recent blog posting that addresses the importance of
   avoiding public DNS resolvers in order to mitigate analysis attacks.

   https://blog.torproject.org/new-low-cost-traffic-analysis-attacks-mitigations

   Then there is the paper "The Effect of DNS on Tor’s Anonymity" that
   discusses how to use DNS traffic for correlation attacks and what
   countermeasures should be taken. Based on this, there is this interesting
   medium article evaluating the situation two years after it was published.

   https://medium.com/@nusenu/who-controls-tors-dns-traffic-a74a7632e8ca

   Furthermore, there was already a proposal to improve the way DNS
   resolution is done within Tor. Unfortunately, it seems that it has been
   abandoned, so this proposal picks up the presented ideas.

   https://gitweb.torproject.org/torspec.git/tree/proposals/219-expanded-dns.txt

Design:

   The key aspect is the introduction of a DNS resolver module on the client
   side. It has to comply with the well known DNS standards as described in a
   series of RFCs. Additional requirements are the ability to communicate
   through the Tor network for ensuring confidentiality and the implementation
   of DNS security extensions (DNSSEC) for verifying the authenticity and
   integrity. Furthermore it has to cover two distinct scenarios, which are
   described in subsequent sections.

   The resolution scenario, the most common scenario for a DNS resolver, is
   applicable for connections handled by the SocksPort. After successful socks
   handshakes the target address is resolved before attaching the connection.

   The proxy scenario is a more unusual use case, however it is required for
   connections handled by the DNSPort. In this case requests are forwarded as
   they are received without employing any resolution or verification means.

   In both scenarios the most noticeable change in terms of interactions
   between the resolver and the rest of Tor concerns the entry and exit points
   for passing connections forth and back. Additionally, the entry_connection
   needs to be extended so that it is capable of holding related state
   information.

Security implications:

   This improves the security aspects of DNS name resolution by reducing the
   significance of exit relays. In particular:

   * Operating nameservers behind onion services allows end-to-end encryption
     for DNS lookups.
   * Employing DNSSEC verification prevents tampering with DNS records.
   * Configuring trusted nameservers on the client side reduces the number of
     entities that must be trusted.

Specification:

   DNS resolver general implementation:

      The security aware DNS resolver module has to comply with existing DNS
      and DNSSEC specifications. A list of related RFCs:

      RFC883, RFC973, RFC1035, RFC1183, RFC1876, RFC1996, RFC2065, RFC2136,
      RFC2230, RFC2308, RFC2535, RFC2536, RFC2539, RFC2782, RFC2845, RFC2874,
      RFC2930, RFC3110, RFC3123, RFC3403, RFC3425, RFC3596, RFC3658, RFC3755,
      RFC3757, RFC3986, RFC4025, RFC4033, RFC4034, RFC4035, RFC4255, RFC4398,
      RFC4431, RFC4509, RFC4635, RFC4701, RFC5011, RFC5155, RFC5702, RFC5933,
      RFC6605, RFC6672, RFC6698, RFC6725, RFC6840, RFC6844, RFC6891, RFC7129,
      RFC7344, RFC7505, RFC7553, RFC7929, RFC8005, RFC8078, RFC8080, RFC8162.

   DNS resolver configuration settings:

      DNSResolver: If True use DNS resolver module for name resolution,
        otherwise Tor's behavior should be unchanged.

      DNSResolverIPv4: If True names should be resolved to IPv4 addresses.

      DNSResolverIPv6: If True names should be resolved to IPv6 addresses. In
        case IPv4 and IPv6 are enabled prefer IPv6 and use IPv4 as fallback.

      DNSResolverRandomizeCase: If True apply 0x20 hack to DNS names for
        outgoing requests.

      DNSResolverNameservers: A list of comma separated nameservers, can be an
        IPv4, an IPv6, or an onion address. Should allow means to configure the
        port and supported zones.

      DNSResolverHiddenServiceZones: A list of comma separated hidden service
        zones.

      DNSResolverDNSSECMode: Should support at least four modes.
        Off: No validation is done. The DO bit is not set in the header of
             outgoing requests.
        Trust: Trust validation of DNS recursor. The CD and DO bits are not set
               in the header of outgoing requests.
        Process: Employ DNSSEC validation but ignore the result.
        Validate: Employ DNSSEC validation and reject insecure data.

      DNSResolverTrustAnchors: A list of comma separated trust anchors in DS
        record format. https://www.iana.org/dnssec/files

      DNSResolverMaxCacheEntries: Specifies the maximum number of cache
        entries.

      DNSResolverMaxCacheTTL: Specifies the maximum age of cache entries in
        seconds.
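
      As an illustration of the DNSResolverRandomizeCase setting, the
      "0x20 hack" randomizes the case of each letter in an outgoing query
      name and later checks that the response echoes the name with exactly
      the same case. The sketch below is not taken from the implementation;
      the function names are hypothetical.

```python
import random

def randomize_case(qname, rng=None):
    """Return qname with the case of every ASCII letter chosen at random."""
    rng = rng if rng is not None else random.SystemRandom()
    return "".join(
        (c.upper() if rng.getrandbits(1) else c.lower())
        if c.isascii() and c.isalpha() else c
        for c in qname
    )

def response_matches(sent_qname, echoed_qname):
    """A response passes only if it echoes the name case-sensitively."""
    return sent_qname == echoed_qname
```

      An off-path attacker who forges a response must guess the case
      pattern in addition to the query ID, which is the point of the hack.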

   DNS resolver state (dns_lookup_st.h):

      action: Defines the active action. Available actions are: forward,
        resolve, validate.

      qname: Specifies the name that should be resolved or forwarded.

      qtype: Specifies the type that should be resolved or forwarded.

      start_time: Holds the initiation time.

      nameserver: Specifies the chosen nameserver.

      validation: Holds the DNSSEC validation state only applicable for the
        validate action.

      server_request: The original DNSPort request required for delivering
        responses in the proxy scenario.

      ap_conn: The original SocksPort entry_connection required for delivering
        responses in the resolution scenario.

   SocksPort related changes (resolution scenario):

      The entry point is directly after a successful socks handshake in
      connection_ap_handshake_process_socks (connection_edge.c). Based on the
      target address type the entry_connection is either passed to the DNS
      resolver (hostname) or handled as usual (IPv4, IPv6, onion).

      In the former case the DNS resolver creates a new DNS lookup connection
      and attaches it instead of the given entry_connection. This connection is
      responsible for resolving the hostname of the entry_connection and
      verifying the response.

      Once the result is verified and the hostname is resolved, the DNS
      resolver replaces the target address in the entry_connection with the
      resolved address and attaches it. From this point on the entry_connection
      is processed as usual.
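
      The dispatch on the target address type could look roughly like the
      following. This is a sketch of the decision only, not of the C code
      in connection_edge.c; the function name is hypothetical.

```python
import ipaddress

def needs_dns_resolver(target):
    """Return True if the SOCKS target is a hostname that must be resolved.

    IPv4 and IPv6 literals and onion addresses are handled as usual.
    """
    if target.lower().endswith(".onion"):
        return False
    try:
        # Accept bracketed IPv6 literals such as "[2001:db8::1]".
        ipaddress.ip_address(target.strip("[]"))
        return False
    except ValueError:
        return True
```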

   DNSPort related changes (proxy scenario):

      The entry point is in evdns_server_callback (dnsserv.c). Instead of
      creating a dummy connection the received server_request is passed to the
      DNS resolver. It creates a DNS lookup connection with the action type
      forward and applies the name and type from the server_request. When the
      DNS resolver receives the answer from the nameserver it resolves the
      server_request by adding all received resource records.

Compatibility:

   Compatibility issues are not expected since there are no changes to the Tor
   protocol. The significant part takes place on the client side before
   attaching connections.

Implementation:

   A complete implementation of this proposal can be found here:
    https://github.com/torproject/tor/pull/1869

   The following steps should suffice to test the implementation:

      * check out the branch
      * build Tor as usual
      * enable the DNS resolver module by adding `DNSResolver 1` to torrc

   Useful services for verifying DNSSEC validation:

   * http://www.dnssec-or-not.com/
   * https://enabled.dnssec.hkirc.hk/
   * https://www.cloudflare.com/ssl/encrypted-sni/

   Dig is useful for testing the DNSPort related changes:

      dig -p9053 torproject.org

Performance and scalability:

   Since there are no direct changes to the protocol and this is an alternative
   approach for an already existing requirement, there are no performance
   issues expected. Additionally, the encoding and decoding of DNS message
   handling as well as the verification takes place on the client side.

   In terms of scalability the availability of nameservers might be one of the
   key concerns. However, this is the same issue as for nameservers on the
   clearweb. If it turns out that it is not feasible to operate nameservers as
   onion services in a performant manner it is always possible to fall back to
   clearweb nameservers by changing a configuration setting.

Filename: 318-limit-protovers.md
Title: Limit protover values to 0-63.
Author: Nick Mathewson
Created: 11 May 2020
Status: Closed
Implemented-In: 0.4.5.1-alpha

Limit protover values to 0-63.

I propose that we no longer accept protover values higher than 63, so that they can all fit nicely into 64-bit fields.

(This proposal is part of the Walking Onions spec project.)

Motivation

Doing this will simplify our implementations and our protocols. Right now, an efficient protover implementation needs to use ranges to represent possible protocol versions, and needs workarounds to prevent an attacker from constructing a protover line that would consume too much memory. With Walking Onions, we need lists of protocol versions to be represented in an extremely compact format, which also would benefit from a limited set of possible versions.

I believe that we will lose nothing by making this change. Currently, after nearly two decades of Tor development and 3.5 years of experience with protovers in production, we have no protocol version higher than 5.

Even if we did someday need to implement higher protocol versions, we could simply add a new subprotocol name instead. For example, instead of "HSIntro=64", we could say "HSIntro2=1".
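
To illustrate why the 0-63 limit is convenient, here is a sketch of a version list stored as a single 64-bit mask. It is not the parser used by any Tor implementation, and it is more lenient about syntax than a real one should be. The function names are ours.

```python
def parse_versions(spec):
    """Parse a version list such as "1-3,5" into a 64-bit bitmask.

    Bit v is set iff version v is supported; versions above 63 are rejected.
    """
    mask = 0
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        lo = int(lo)
        hi = int(hi) if hi else lo
        if not 0 <= lo <= hi <= 63:
            raise ValueError(f"bad version range: {part!r}")
        for v in range(lo, hi + 1):
            mask |= 1 << v
    return mask

def supports(mask, version):
    """Check membership with a shift and a mask; no ranges needed."""
    return 0 <= version <= 63 and bool((mask >> version) & 1)
```

With this representation, memory use per subprotocol is fixed no matter what the protover line contains.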

Migration

Immediately, authorities should begin rejecting relays with protocol versions above 63. (There are no such relays in the consensus right now.)

Once this change is deployed to a majority of authorities, we can remove support in other Tor environments for protocol versions above 63.

Filename: 319-wide-everything.md
Title: RELAY_FRAGMENT cells
Author: Nick Mathewson
Created: 11 May 2020
Status: Obsolete

(Proposal superseded by proposal 340)

(This proposal is part of the Walking Onions spec project.)

Introduction

Proposal 249 described a system for CREATE cells to become wider, in order to accommodate hybrid crypto. And in order to send those cell bodies across circuits, it described a way to split CREATE cells into multiple EXTEND cells.

But there are other cell types that can need to be wider too. For example, INTRODUCE and RENDEZVOUS cells also contain key material used for a handshake: if handshakes need to grow larger, then so do these cells.

This proposal describes an encoding for arbitrary "wide" relay cells, that can be used to send a wide variant of anything.

To be clear, although this proposal describes a way that all relay cells can become "wide", I do not propose that wide cells should actually be allowed for all relay cell types.

Proposal

We add a new relay cell type: RELAY_FRAGMENT. This cell type contains part of another relay cell. A RELAY_FRAGMENT cell can either introduce a new fragmented cell, or can continue one that is already in progress.

The format of a RELAY_FRAGMENT body is one of the following:

// First body in a series
struct fragment_begin {
   // What relay_command is in use for the underlying cell?
   u8 relay_command;
   // What will the total length of the cell be once it is reassembled?
   u16 total_len;
   // Bytes for the cell body
   u8 body[];
}

// all other cells.
struct fragment_continued {
   // More bytes for the cell body.
   u8 body[];
}

To send a fragmented cell, first a party sends a RELAY_FRAGMENT cell containing a "fragment_begin" payload. This payload describes the total length of the cell, the relay command

Fragmented cells other than the last one in sequence MUST be sent full of as much data as possible. Parties SHOULD close a circuit if they receive a non-full fragmented cell that is not the last fragment in a sequence.

Fragmented cells MUST NOT be interleaved with other relay cells on a circuit, other than cells used for flow control. (Currently, this is only SENDME cells.) If any party receives any cell on a circuit, other than a flow control cell or a RELAY_FRAGMENT cell, before the fragmented cell is complete, then it SHOULD close the circuit.

Parties MUST NOT send extra data in fragmented cells beyond the amount given in the first 'total_len' field.

Not every relay command may be sent in a fragmented cell. In this proposal, we allow the following cell types to be fragmented: EXTEND2, EXTENDED2, INTRODUCE1, INTRODUCE2, RENDEZVOUS1, and RENDEZVOUS2. Any party receiving a command that they believe should not be fragmented should close the circuit.

Not all lengths up to 65535 are valid lengths for a fragmented cell. Any length under 499 bytes SHOULD cause the circuit to close, since that could fit into a non-fragmented RELAY cell. Parties SHOULD enforce maximum lengths for cell types that they understand.

All RELAY_FRAGMENT cells for the fragmented cell must have the same Stream ID. (For those cells allowed above, the Stream ID is always zero.) Implementations SHOULD close a circuit if they receive fragments with mismatched Stream ID.
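
The splitting and reassembly rules above can be sketched as follows. This is an illustration only: it assumes that a relay cell carries at most 498 body bytes (consistent with the 499-byte threshold above), that total_len counts the reassembled body, and that fields are in network byte order. The function names are hypothetical, and the checks on interleaving, Stream ID, and permitted commands are omitted.

```python
import struct

MAX_BODY = 498                      # assumed body bytes per relay cell
BEGIN_HDR = struct.Struct("!BH")    # relay_command (u8), total_len (u16)

def fragment(relay_command, payload):
    """Split payload into RELAY_FRAGMENT bodies, all full except the last."""
    if not MAX_BODY < len(payload) <= 0xFFFF:
        raise ValueError("payload does not need or cannot use fragmentation")
    first_room = MAX_BODY - BEGIN_HDR.size
    bodies = [BEGIN_HDR.pack(relay_command, len(payload)) + payload[:first_room]]
    for off in range(first_room, len(payload), MAX_BODY):
        bodies.append(payload[off:off + MAX_BODY])
    return bodies

def reassemble(bodies):
    """Rebuild (relay_command, payload); raise ValueError on a bad sequence."""
    relay_command, total_len = BEGIN_HDR.unpack_from(bodies[0])
    if total_len <= MAX_BODY:
        raise ValueError("length would fit in a non-fragmented cell")
    if any(len(b) != MAX_BODY for b in bodies[:-1]):
        raise ValueError("non-final fragment is not full")
    data = bodies[0][BEGIN_HDR.size:] + b"".join(bodies[1:])
    if len(data) != total_len:
        raise ValueError("data does not match total_len")
    return relay_command, data
```

Where this sketch raises ValueError, a real party would close the circuit.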

Onion service concerns.

We allocate a new extension for use in the ESTABLISH_INTRO by onion services, to indicate that they can receive a wide INTRODUCE2 cell. This extension contains:

    struct wide_intro2_ok {
      u16 max_len;
    }

We allocate a new extension for use in the ESTABLISH_RENDEZVOUS cell, to indicate acceptance of wide RENDEZVOUS2 cells. This extension contains:

    struct wide_rend2_ok {
      u16 max_len;
    }

(Note that ESTABLISH_RENDEZVOUS cells do not currently have an extension mechanism. They should be extended to use the same extension format as ESTABLISH_INTRO cells, with extensions placed after the rendezvous cookie.)

Handling RELAY_EARLY

The first fragment of each EXTEND cell should be tagged with RELAY_EARLY. The remaining fragments should not. Relays should accept EXTEND cells if and only if their first fragment is tagged with RELAY_EARLY.

Rationale: We could allow any fragment to be tagged, but that would give hostile guards an opportunity to move RELAY_EARLY tags around and build a covert channel. But if we later move to a relay encryption method that lets us authenticate RELAY_EARLY, we could then require only that any fragment has RELAY_EARLY set.

Compatibility

This proposal will require the allocation of a new 'Relay' protocol version, to indicate understanding of the RELAY_FRAGMENT command.

Filename: 320-tap-out-again.md
Title: Removing TAP usage from v2 onion services
Author: Nick Mathewson
Created: 11 May 2020
Status: Rejected

NOTE: we rejected this proposal in favor of simply deprecating v2 onion services entirely.

(This proposal is part of the Walking Onions spec project. It updates proposal 245.)

Removing TAP from v2 onion services

As we implement walking onions, we're faced with a problem: what to do with TAP keys? They are bulky and insecure, and not used for anything besides v2 onion services. Keeping them in SNIPs would consume bandwidth, and keeping them indirectly would consume complexity. It would be nicer to remove TAP keys entirely.

But although v2 onion services are obsolescent and their cryptographic parameters are disturbing, we do not want to drop support for them as part of the Walking Onions migration. If we did so, then we would force some users to choose between Walking Onions and v2 onion services, which we do not want to do.

Instead, we describe here a phased plan to replace TAP in v2 onion services with ntor. This change improves the forward secrecy of some of the session keys used with v2 onion services, but does not improve their authentication, which is strongly tied to truncated SHA1 hashes of RSA1024 keys.

Implementing this change is more complex than similar changes elsewhere in the Tor protocol, since we do not want clients or services to leak whether they have support for this proposal, until support is widespread enough that revealing it is no longer a privacy risk.

We define these entries that may appear in v2 onion service descriptors, once per introduction point.

"identity-ed25519"
"ntor-onion-key"

   [at most once each per intro point.]

   These values are in the same format as and follow the same
   rules as their equivalents in router descriptors.

"link-specifiers"

   [at most once per introduction point]

   This value is the same as the link specifiers in a v3 onion
   service descriptor, and follows the same rules.

Services should not include any of these fields unless a new network parameter, "hsv2-intro-updated" is set to 1. Clients should not parse these fields or use them unless "hsv2-use-intro-updated" is set to 1.

We define a new field that can be used for hsv2 descriptors with walking onions:

"snip"
    [at most once]

    This value is the same as the snip field introduced to a v3
    onion service descriptor by proposal (XXX) and follows the
    same rules.

Services should not include this field unless a new network parameter, "hsv2-intro-snip" is set to 1. Clients should not parse this field or use it unless the parameter "hsv2-use-intro-snip" is set to 1.

Additionally, relays SHOULD omit the following legacy intro point parameters when a new network parameter, "hsv2-intro-legacy" is set to 0: "ip-address", "onion-port", and "onion-key". Clients should treat them as optional when "hsv2-tolerate-no-legacy" is set to 1.

INTRODUCE cells, RENDEZVOUS cells, and ntor.

We allow clients to specify the rendezvous point's ntor key in the INTRODUCE2 cell instead of the TAP key. To do this, the client simply sets KLEN to 32, and includes the ntor key for the relay.

Clients should only use ntor keys in this way if the network parameter "hsv2-client-rend-ntor" is set to 1, and if the entry "allow-rend-ntor" is present in the onion service descriptor.

Services should only advertise "allow-rend-ntor" in this way if the network parameter "hsv2-service-rend-ntor" is set to 1.

Migration steps

First, we implement all of the above, but set it to be disabled by default. We use torrc fields to selectively enable them for testing purposes, to make sure they work.

Once all non-LTS versions of Tor without support for this proposal are obsolete, we can safely enable "hsv2-client-rend-ntor", "hsv2-service-rend-ntor", "hsv2-intro-updated", and "hsv2-use-intro-updated".

Once all non-LTS versions of Tor without support for walking onions are obsolete, we can safely enable "hsv2-intro-snip", "hsv2-use-intro-snip", and "hsv2-tolerate-no-legacy".

Once all non-LTS versions of Tor without support for both of the above implementations are finally obsolete, we can finally set "hsv2-intro-legacy" to 0.

Future work

There is a final TAP-like protocol used for v2 hidden services: the client uses RSA1024 and DH1024 to send information about the rendezvous point and to start negotiating the session key to be used for end-to-end encryption.

In theory we could get a benefit to forward secrecy by using ntor instead of TAP here, but we would get no corresponding benefit for authentication, since authentication is still ultimately tied to HSv2's scary RSA1024-plus-truncated-SHA1 combination.

Given that, it might be just as good to allow the client to insert a curve25519 key in place of their DH1024 key, and use that for the DH handshake instead. That would be a separate proposal, though: this proposal is enough to allow all relays to drop TAP support.

Filename: 321-happy-families.md
Title: Better performance and usability for the MyFamily option (v2)
Author: Nick Mathewson
Created: 27 May 2020
Status: Accepted

Problem statement.

The current family mechanism allows well-behaved relays to identify that they all belong to the same 'family', and should not be used in the same circuits.

Right now, families work by having every family member list every other family member in its server descriptor. This winds up using O(n^2) space in microdescriptors and server descriptors. (For RAM, we can de-duplicate families which sometimes helps.) Adding or removing a server from the family requires all the other servers to change their torrc settings.

This growth in size is not just a theoretical problem. Family declarations currently make up a little over 55% of the microdescriptors in the directory--around 24% after compression. The largest family has around 270 members. With Walking Onions, 270 members times a 160-bit hashed identifier leads to over 5 kilobytes per SNIP, which is much greater than we'd want to use.

This is an updated version of proposal 242. It differs by clarifying requirements and providing a more detailed migration plan.

Design overview.

In this design, every family has a master ed25519 "family key". A node is in the family iff its server descriptor includes a certificate of its ed25519 identity key with the family key. The certificate format is the one in the tor-certs.txt spec; we would allocate a new certificate type for this usage. These certificates would need to include the signing key in the appropriate extension.

Note that because server descriptors are signed with the node's ed25519 signing key, this creates a bidirectional relationship between the two keys, so that nodes can't be put in families without their consent.

Changes to router descriptors

We add a new entry to server descriptors:

"family-cert" NL
"-----BEGIN FAMILY CERT-----" NL
cert
"-----END FAMILY CERT-----".

This entry contains a base64-encoded certificate as described above. It may appear any number of times; authorities MAY reject descriptors that include it more than three times.

Changes to microdescriptors

We add a new entry to microdescriptors: family-keys.

This line contains one or more space-separated strings describing families to which the node belongs. These strings MUST be sorted in lexicographic order. These strings MAY be base64-formatted, unpadded ed25519 family keys, or may represent some future encoding.

Clients SHOULD accept unrecognized key formats.

Changes to voting algorithm

We allocate a new consensus method number for voting on these keys.

When generating microdescriptors using a suitable consensus method, the authorities include a "family-keys" line if the underlying server descriptor contains any valid family-cert lines. For each valid family-cert in the server descriptor, they add a base-64-encoded string of that family-cert's signing key.

See also "deriving family lines from family-keys?" below for an interesting but more difficult extension mechanism that I would not recommend.

Relay configuration

There are several ways that we could configure relays to let them include family certificates in their descriptors.

The easiest would be putting the private family key on each relay, so that the relays could generate their own certificates. This is easy to configure, but slightly risky: if the private key is compromised on any relay, anybody can claim membership in the family. That isn't so very bad, however -- all the relays would need to do in this event would be to move to a new private family key.

A more orthodox method would be to keep the private key somewhere offline, and use it to generate a certificate for each relay in the family as needed. These certificates should be made with long-enough lifetimes, and relays should warn when they are going to expire soon.

Changes to relay behavior

Each relay should track which other relays it has seen using the same family-key as itself. When generating a router descriptor, each relay should list all of these relays on the legacy 'family' line. This keeps the "family" lines up-to-date with "family-keys" lines for compliant relays.

Relays should continue listing relays in their family lines if they have seen a relay with that identity using the same family-key at any time in the last 7 days.

The presence of this line should be configured by a network parameter, derive-family-line.

Relays whose family lines do not stay at least mostly in sync with their family keys should be marked invalid by the authorities.

Client behavior

Clients should treat node A and node B as belonging to the same family if ANY of these is true:

  • The client has descriptors for A and B, and A's descriptor lists B in its family line, and B's descriptor lists A in its family line.

  • The client has descriptors for A and B, and they both contain the same entry in their family-keys or family-cert. (Note that a family-cert key may match a base64-encoded entry in the family-keys entry.)
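
The two conditions can be sketched as a single predicate. This is a minimal illustration, assuming the client has already parsed each descriptor into a set of legacy family members and a set of family keys (from family-keys or from family-cert signing keys). The type and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class NodeInfo:
    identity: str
    family: set = field(default_factory=set)       # legacy "family" line
    family_keys: set = field(default_factory=set)  # family-keys / family-cert

def same_family(a, b):
    """True if either of the rules above puts a and b in one family."""
    mutual_listing = b.identity in a.family and a.identity in b.family
    shared_key = bool(a.family_keys & b.family_keys)
    return mutual_listing or shared_key
```

Note that a one-sided legacy listing is not enough, while a single shared family key is.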

Migration

For some time, existing relays and clients will not support family certificates. Because of this, the rules above try to make sure that well-behaved relays list the same entries in both places.

Once enough clients have migrated to using family certificates, authorities SHOULD disable derive-family-line.

Security

Listing families remains as voluntary in this design as in today's Tor, though bad-relay hunters can continue to look for families that have not adopted a family key.

A hostile relay family could list a "family" line that did not match its "family-certs" values. However, the only reason to do so would be in order to launch a client partitioning attack, which is probably less valuable than the kinds of attacks that they could run by simply not listing families at all.

Appendix: deriving family lines from family-keys?

As an alternative, we might declare that authorities should keep family lines in sync with family-certs. Here is a design sketch of how we might do that, but I don't think it's actually a good idea, since it would require major changes to the data flow of the voting system.

In this design, authorities would include a "family-keys" line in each router section in their votes corresponding to a relay with any family-cert. When generating final microdescriptors using this method, the authorities would use these lines to add entries to the microdescriptors' family lines:

  1. For every relay appearing in a routerstatus's family-keys, the authorities calculate a consensus family-keys value by including all those keys that are listed by a majority of those voters listing the same router with the same descriptor. (This is the algorithm we use for voting on other values derived from the descriptor.)

  2. The authorities then compute a set of "expanded families": one for each family key. Each "expanded family" is a set containing every router in the consensus associated with that key in its consensus family-keys value.

  3. The authorities discard all "expanded families" of size 1 or smaller.

  4. Every router listed for the "expanded family" has every other router added to the "family" line in its microdescriptor. (The "family" line is then re-canonicalized according to the rules of proposal 298 to remove its )

  5. Note that the final microdescriptor consensus will include the digest of the derived microdescriptor in step 4, rather than the digest of the microdescriptor listed in the original votes. (This calculation is deterministic.)
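Steps 1 through 4 above might look roughly like this. The function names and data shapes are hypothetical, "majority" is taken to mean strictly more than half of the relevant voters, and the proposal 298 canonicalization and the digest computation of step 5 are left out.

```python
from collections import Counter, defaultdict


def consensus_family_keys(votes):
    """Step 1: keep the keys listed by a strict majority of `votes`.

    `votes` holds one set of keys per voter that listed this router
    with the same descriptor.
    """
    counts = Counter(key for vote in votes for key in set(vote))
    return {key for key, n in counts.items() if 2 * n > len(votes)}


def add_expanded_families(consensus_keys, family_lines):
    """Steps 2-4: return new family lines; the inputs are not modified.

    `consensus_keys` maps router -> set of consensus family keys.
    `family_lines` maps router -> set of routers already on its family line.
    """
    # Step 2: one "expanded family" per family key.
    expanded = defaultdict(set)
    for router, keys in consensus_keys.items():
        for key in keys:
            expanded[key].add(router)

    result = {router: set(line) for router, line in family_lines.items()}
    for members in expanded.values():
        # Step 3: discard expanded families of size 1 or smaller.
        if len(members) <= 1:
            continue
        # Step 4: every member lists every other member.
        for router in members:
            result.setdefault(router, set()).update(members - {router})
    return result
```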

The problem with this approach is that authorities would have to fetch microdescriptors they do not have in order to replace their family lines. Currently, voting never requires an authority to fetch a microdescriptor from another authority. If we implement vote compression and diffs as in the Walking Onions proposal, however, we might suppose that votes could include microdescriptors directly.

Still, this is likely more complexity than we want for a transition mechanism.

Appendix: Deriving family-keys from families??

We might also imagine that authorities could infer which families exist from the graph of family relationships, and then include synthetic "family-keys" entries for routers that belong to the same family.

This has two challenges: first, to compute these synthetic family keys, the authorities would need to have the same graph of family relationships to begin with, which once again would require them to include the complete list of families in their votes.

Secondly, finding all the families is equivalent to finding all maximal cliques in a graph. This problem is NP-hard in its general case. Although polynomial solutions exist for nice well-behaved graphs, we'd still need to worry about hostile relays including strange family relationships in order to drive the algorithm into its exponential cases.

Appendix: New assigned values

We need a new assigned value for the certificate type used for family signing keys.

We need a new consensus method for placing family-keys lines in microdescriptors.

Appendix: New network parameters

  • derive-family-line: If 1, relays should derive family lines from observed family-keys. If 0, they do not. Min: 0, Max: 1. Default: 1.
Filename: 322-dirport-linkspec.md
Title: Extending link specifiers to include the directory port
Author: Nick Mathewson
Created: 27 May 2020
Status: Open

Motivation

Directory ports remain the only way to contact a (non-bridge) Tor relay that isn't expressible as a Link Specifier. We haven't specified a link specifier of this kind so far, since it isn't a way to contact a relay to create a channel.

But authorities still expose directory ports, and encourage relays to use them preferentially for uploading and downloading. And with Walking Onions, it would be convenient to try to make every kind of "address" a link specifier -- we'd like authorities to be able to specify a list of link specifiers that can be used to contact them for uploads and downloads.

It is possible that after revision, Walking Onions won't need a way to specify this information. If so, this proposal should be moved to "Reserve" status as generally unuseful.

Proposal

We reserve a new link specifier type "dir-url", for use with the directory system. This is a variable-length link specifier, containing a URL prefix. The only currently supported URL scheme is "http://". Implementations SHOULD ignore unrecognized schemes. IPv4 and IPv6 addresses MAY be used directly; hostnames are also allowed. Implementations MAY ignore hostnames and only use raw addresses.

The URL prefix includes everything through the string "tor" in the directory hierarchy.

A dir-url link specifier SHOULD NOT appear in an EXTEND cell; implementations SHOULD reject them if they do appear.
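As a rough sketch of how an implementation might decide whether to use a dir-url prefix: the function name and the `allow_hostnames` flag are hypothetical, and the check that the path ends in "/tor" is one possible reading of "everything through the string tor", not something this proposal spells out.

```python
import ipaddress
from urllib.parse import urlsplit


def usable_dir_url(prefix, allow_hostnames=True):
    """Return True if this sketch would use `prefix` as a directory URL."""
    try:
        parts = urlsplit(prefix)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    # Only "http://" is supported; anything else is ignored.
    if parts.scheme != "http" or not host:
        return False
    # Assumption: the prefix runs through the "tor" path element.
    if not parts.path.endswith("/tor"):
        return False
    # Implementations MAY ignore hostnames and use only raw addresses.
    if not allow_hostnames:
        try:
            ipaddress.ip_address(host)
        except ValueError:
            return False
    return True
```

The addresses used when trying this out (for example 192.0.2.1 or 2001:db8::1) should come from documentation ranges; they are not real directory servers.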

Filename: 323-walking-onions-full.md
Title: Specification for Walking Onions
Author: Nick Mathewson
Created: 3 June 2020
Status: Open

Introduction: A Specification for Walking Onions

In Proposal 300, I introduced Walking Onions, a design for scaling Tor and simplifying clients, by removing the requirement that every client know about every relay on the network.

This proposal will elaborate on the original Walking Onions idea, and should provide enough detail to allow multiple compatible implementations. In this introduction, I'll start by summarizing the key ideas of Walking Onions, and then outline how the rest of this proposal will be structured.

Remind me about Walking Onions again?

With Tor's current design, every client downloads and refreshes a set of directory documents that describe the directory authorities' views about every single relay on the Tor network. This requirement makes directory bandwidth usage grow quadratically, since the directory size grows linearly with the number of relays, and it is downloaded a number of times that grows linearly with the number of clients. Additionally, low-bandwidth clients and bootstrapping clients spend a disproportionate amount of their bandwidth loading directory information.

With these drawbacks, why does Tor still require clients to download a directory? It does so in order to prevent attacks that would be possible if clients let somebody else choose their paths through the network, or if each client chose its paths from a different subset of relays.

Walking Onions is a design that resists these attacks without requiring clients ever to have a complete view of the network.

You can think of the Walking Onions design like this: Imagine that with the current Tor design, the client covers a wall with little pieces of paper, each representing a relay, and then throws a dart at the wall to pick a relay. Low-bandwidth relays get small pieces of paper; high-bandwidth relays get large pieces of paper. With the Walking Onions design, however, the client throws its dart at a blank wall, notes the position of the dart, and asks for the relay whose paper would be at that position on a "standard wall". These "standard walls" are mapped out by directory authorities in advance, and are authenticated in such a way that the client can receive a proof of a relay's position on the wall without actually having to know the whole wall.

Because the client itself picks the position on the wall, and because the authorities must vote together to build a set of "standard walls", nobody else controls the client's path through the network, and all clients can choose their paths in the same way. But since clients only probe one position on the wall at a time, they don't need to download a complete directory.

(Note that there has to be more than one wall at a time: the client throws darts at one wall to pick guards, another wall to pick middle relays, and so on.)
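To make the "standard wall" picture concrete, here is a toy sketch of one wall over a 32-bit position space, with each relay's span proportional to a weight. The function names and the rounding rule are assumptions for illustration only; how real routing indices are built is specified later in this proposal.

```python
import bisect

INDEX_SIZE = 2 ** 32  # assumed size of the position space for this sketch


def build_routing_index(bandwidths):
    """Turn [(relay, weight), ...] into sorted [(lo, hi, relay), ...].

    Ranges are inclusive on both ends and tile the whole space.
    """
    total = sum(weight for _, weight in bandwidths)
    if total <= 0:
        raise ValueError("need at least one relay with positive weight")
    index, running = [], 0
    for relay, weight in bandwidths:
        lo = running * INDEX_SIZE // total
        running += weight
        hi = running * INDEX_SIZE // total - 1
        if hi >= lo:  # relays whose span rounds to nothing are skipped
            index.append((lo, hi, relay))
    return index


def lookup(index, position):
    """Return the relay whose range contains `position` (the dart)."""
    if not 0 <= position < INDEX_SIZE:
        raise ValueError("position outside the index")
    starts = [lo for lo, _, _ in index]
    i = bisect.bisect_right(starts, position) - 1
    return index[i][2]
```

In this picture the client would choose `position` itself, for example with `secrets.randbelow(INDEX_SIZE)`, and the relay answering the query would return the matching entry along with its proof.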

In Walking Onions, we call a collection of standard walls an "ENDIVE" (Efficient Network Directory with Individually Verifiable Entries). We call each of the individual walls a "routing index", and we call each of the little pieces of paper describing a relay and its position within the routing index a "SNIP" (Separable Network Index Proof).

For more details about the key ideas behind Walking Onions, see proposal 300. For more detailed analysis and discussion, see "Walking Onions: Scaling Anonymity Networks while Protecting Users" by Komlo, Mathewson, and Goldberg.

The rest of this document

This proposal is unusually long, since Walking Onions touches on many aspects of Tor's functionality. It requires changes to voting, directory formats, directory operations, circuit building, path selection, client operations, and more. These changes are described in the sections listed below.

Here in section 1, we briefly reintroduce Walking Onions, and talk about the rest of this proposal.

Section 2 will describe the formats for ENDIVEs, SNIPs, and related documents.

Section 3 will describe new behavior for directory authorities as they vote on and produce ENDIVEs.

Section 4 describes how relays fetch and reconstruct ENDIVEs from the directory authorities.

Section 5 has the necessary changes to Tor's circuit extension protocol so that clients can extend to relays by index position.

Section 6 describes new behaviors for clients as they use Walking Onions, to retain existing Tor functionality for circuit construction.

Section 7 explains how to implement onion services using Walking Onions.

Section 8 describes small alterations in client and relay behavior to strengthen clients against some kinds of attacks based on relays picking among multiple ENDIVEs, while still making the voting system robust against transient authority failures.

Section 9 closes with a discussion of how to migrate from the existing Tor design to the new system proposed here.

Appendices

Additionally, this proposal has several appendices:

Appendix A defines commonly used terms.

Appendix B provides definitions for CDDL grammar productions that are used elsewhere in the documents.

Appendix C lists the new elements in the protocol that will require assigned values.

Appendix D lists new network parameters that authorities must vote on.

Appendix E gives a sorting algorithm for a subset of the CBOR object representation.

Appendix F gives an example set of possible "voting rules" that authorities could use to produce an ENDIVE.

Appendix G lists the different routing indices that will be required in a Walking Onions deployment.

Appendix H discusses partitioning TCP ports into a small number of subsets, so that relays' exit policies can be represented only as the group of ports that they support.

Appendix Z closes with acknowledgments.

The following proposals are not part of the Walking Onions proposal, but they were written at the same time, and are either helpful or necessary for its implementation.

318-limit-protovers.md restricts the allowed version numbers for each subprotocol to the range 0..63.

319-wide-everything.md gives a general mechanism for splitting relay commands across more than one cell.

320-tap-out-again.md attempts to remove the need for TAP keys in the HSv2 protocol.

321-happy-families.md lets families be represented with a single identifier, rather than a long list of keys.

322-dirport-linkspec.md allows a directory port to be represented with a link specifier.

Document Formats: ENDIVEs and SNIPs

Here we specify a pair of related document formats that we will use for specifying SNIPs and ENDIVEs.

Recall from proposal 300 that a SNIP is a set of information about a single relay, plus proof from the directory authorities that the given relay occupies a given range in a certain routing index. For example, we can imagine that a SNIP might say:

  • Relay X has the following IP, port, and onion key.
  • In the routing index Y, it occupies index positions 0x20002 through 0x23000.
  • This SNIP is valid on 2020-12-09 00:00:00, for one hour.
  • Here is a signature of all the above text, using a threshold signature algorithm.

You can think of a SNIP as a signed combination of a routerstatus and a microdescriptor... together with a little bit of the randomized routing table from Tor's current path selection code, all wrapped in a signature.

Every relay keeps a set of SNIPs, and serves them to clients when the client is extending by a routing index position.

An ENDIVE is a complete set of SNIPs. Relays download ENDIVEs, or diffs between ENDIVEs, once every voting period. We'll accept some complexity in order to make these diffs small, even though some of the information in them (particularly SNIP signatures and index ranges) will tend to change with every period.

Preliminaries and scope

Goals for our formats

We want SNIPs to be small, since they need to be sent on the wire one at a time, and won't get much benefit from compression. (To avoid a side-channel, we want CREATED cells to all be the same size, which means we need to pad up to the largest size possible for a SNIP.)

We want to place as few requirements on clients as possible, and we want to preserve forward compatibility.

We want ENDIVEs to be compressible, and small. We want successive ENDIVEs to be textually similar, so that we can use diffs to transmit only the parts that change.

We should preserve our policy of requiring only loose time synchronization between clients and relays, and allow even looser synchronization when possible. Where possible, we'll make the permitted skew explicit in the protocol: for example, rather than saying "you can accept a document 10 minutes before it is valid", we will just make the validity interval start 10 minutes earlier.

Notes on Metaformat

In the format descriptions below, we will describe a set of documents in the CBOR metaformat, as specified in RFC 7049. If you're not familiar with CBOR, you can think of it as a simple binary version of JSON, optimized first for simplicity of implementation and second for space.

I've chosen CBOR because it's schema-free (you can parse it without knowing what it is), terse, dumpable as text, extensible, standardized, and very easy to parse and encode.

We will choose to represent many size-critical types as maps whose keys are short integers: this is slightly shorter in its encoding than string-based dictionaries. In some cases, we make types even shorter by using arrays rather than maps, but only when we are confident we will not have to make changes to the number of elements in the future.
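The size difference can be seen with a tiny hand-rolled encoder. This sketch covers only the handful of CBOR cases it needs (small unsigned integers, short byte and text strings, small maps) and does not sort map keys canonically; the key name "identity" is just an example.

```python
def cbor_head(major, n):
    """Encode a CBOR initial byte, plus one extra byte when 24 <= n < 256."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 256:
        return bytes([(major << 5) | 24, n])
    raise ValueError("this sketch only handles values and lengths below 256")


def cbor_encode(obj):
    """Encode a small subset of CBOR: uint, bstr, tstr, and map."""
    if isinstance(obj, int) and obj >= 0:
        return cbor_head(0, obj)                      # major type 0: uint
    if isinstance(obj, bytes):
        return cbor_head(2, len(obj)) + obj           # major type 2: bstr
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return cbor_head(3, len(data)) + data         # major type 3: tstr
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v)
                        for k, v in obj.items())
        return cbor_head(5, len(obj)) + body          # major type 5: map
    raise TypeError("type not supported by this sketch")


key = bytes(32)  # stand-in for a 32-byte public key
with_int_key = cbor_encode({0: key})
with_str_key = cbor_encode({"identity": key})
```

Here the integer key costs a single byte, while the eight-character string key costs nine, and that overhead repeats for every field of every SNIP.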

We'll use CDDL (defined in RFC 8610) to describe the data in a way that can be validated -- and hopefully, in a way that will make it comprehensible. (The state of CDDL tooling is a bit lacking at the moment, so my CDDL validation will likely be imperfect.)

We make the following restrictions to CBOR documents that Tor implementations will generate:

  • No floating-point values are permitted.

  • No tags are allowed unless otherwise specified.

  • All items must follow the rules of RFC 7049 section 3.9 for canonical encoding, unless otherwise specified.

Implementations SHOULD accept and parse documents that are not generated according to these rules, for future extensibility. However, implementations SHOULD reject documents that are not "well-formed" and "valid" by the definitions of RFC 7049.

Design overview: signing documents

We try to use a single document-signing approach here, using a hash function parameterized to accommodate lifespan information and an optional nonce.

All the signed CBOR data used in this format is represented as a binary string, so that CBOR-processing tools are less likely to re-encode or transform it. We denote this below with the CDDL syntax bstr .cbor Object, which means "a binary string that must hold a valid encoding of a CBOR object whose type is Object".

Design overview: SNIP Authentication

I'm going to specify a flexible authentication format for SNIPs that can handle threshold signatures, multisignatures, and Merkle trees. This will give us flexibility in our choice of authentication mechanism over time.

  • If we use Merkle trees, we can make ENDIVE diffs much much smaller, and save a bunch of authority CPU -- at the expense of requiring slightly larger SNIPs.

  • If Merkle tree root signatures are in SNIPs, SNIPs get a bit larger, but they can be used by clients that do not have the latest signed Merkle tree root.

  • If we use threshold signatures, we need to depend on not-yet-quite-standardized algorithms. If we use multisignatures, then either SNIPs get bigger, or we need to put the signed Merkle tree roots into a consensus document.

Of course, flexibility in signature formats is risky, since the more code paths there are, the more opportunities there are for nasty bugs. With this in mind, I'm structuring our authentication so that there should (to the extent possible) be only a single validation path for different uses.

Accordingly, our format is structured so that "not using a Merkle tree" is considered, from the client's point of view, the same as "using a Merkle tree of depth 1".

The authentication on a single SNIP is structured, in the abstract, as:

  • ITEM: The item to be authenticated.
  • PATH: A string of N bits, representing a path through a Merkle tree from its root, where 0 indicates a left branch and 1 indicates a right branch. (Note that in a left-leaning tree, the 0th leaf will have path 000..0, the 1st leaf will have path 000..1, and so on.)
  • BRANCH: A list of N digests, representing the digests for the branches in the Merkle tree that we are not taking.
  • SIG: A generalized signature (either a threshold signature or a multisignature) of a top-level digest.
  • NONCE: an optional nonce for use with the hash functions.

Note that PATH here is a bitstring, not an integer! "0001" and "01" are different paths, and "" is a valid path, indicating the root of the tree.

We assume two hash functions here: H_leaf() to be used with leaf items, and H_node() to be used with intermediate nodes. These functions are parameterized with a path through the tree, with the lifespan of the object to be signed, and with a nonce.

To validate the authentication on a SNIP, the client proceeds as follows:

Algorithm: Validating SNIP authentication

Let N = the length of PATH, in bits.

Let H = H_leaf(PATH, LIFESPAN, NONCE, ITEM).

While N > 0:
   Remove the last bit of PATH; call it P.
   Remove the last digest of BRANCH; call it B.

   If P is zero:
       Let H = H_node(PATH, LIFESPAN, NONCE, H, B)
   else:
       Let H = H_node(PATH, LIFESPAN, NONCE, B, H)

   Let N = N - 1

Check whether SIG is a correct (multi)signature over H with the
correct key(s).

Parameterization on this structure is up to the authorities. If N is zero, then we are not using a Merkle tree. The generalized signature SIG can either be given as part of the SNIP, or as part of a consensus document. I expect that in practice, we will converge on a single set of parameters here quickly (I'm favoring BLS signatures and a Merkle tree), but using this format will give clients the flexibility to handle other variations in the future.

For our definition of H_leaf() and H_node(), see "Digests and parameters" below.
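The validation loop above can be sketched as follows. The `h_leaf` and `h_node` functions here are placeholders built on SHA-256 purely so that the example runs; they are NOT the definitions from "Digests and parameters". PATH is modeled as a string of "0" and "1" characters, LIFESPAN and NONCE as opaque bytes, and the final signature check is left to the caller.

```python
import hashlib


def _digest(tag, path, lifespan, nonce, *parts):
    """Placeholder hash: length-prefix every input, then SHA-256."""
    h = hashlib.sha256()
    for piece in (tag, path.encode("ascii"), lifespan, nonce, *parts):
        h.update(len(piece).to_bytes(4, "big") + piece)
    return h.digest()


def h_leaf(path, lifespan, nonce, item):
    return _digest(b"leaf", path, lifespan, nonce, item)


def h_node(path, lifespan, nonce, left, right):
    return _digest(b"node", path, lifespan, nonce, left, right)


def top_level_digest(item, path, branch, lifespan, nonce=b""):
    """Compute H as in the algorithm; SIG must then be checked over it."""
    branch = list(branch)
    if len(branch) != len(path):
        raise ValueError("BRANCH must hold one digest per bit of PATH")
    h = h_leaf(path, lifespan, nonce, item)
    while path:
        p, path = path[-1], path[:-1]   # remove the last bit of PATH
        b = branch.pop()                # remove the last digest of BRANCH
        if p == "0":
            h = h_node(path, lifespan, nonce, h, b)
        else:
            h = h_node(path, lifespan, nonce, b, h)
    return h
```

With an empty PATH the loop never runs and the result is simply the leaf digest, which is how "no Merkle tree" and "Merkle tree" end up sharing one code path.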

Design overview: timestamps and validity.

For future-proofing, SNIPs and ENDIVEs have separate time ranges indicating when they are valid. Unlike with current designs, these validity ranges should take clock skew into account, and should not require clients or relays to deliberately add extra tolerance to their processing. (For example, instead of saying that a document is "fresh" for three hours and then telling clients to accept documents for 24 hours before they are valid and 24 hours after they are expired, we will simply make the documents valid for 51 hours.)

We give each lifespan as a (PUBLISHED, PRE, POST) triple, such that objects are valid from (PUBLISHED - PRE) through (PUBLISHED + POST). (The "PUBLISHED" time is provided so that we can more reliably tell which of two objects is more recent.)

Later (see section 08), we'll explain measures to ensure that hostile relays do not take advantage of multiple overlapping SNIP lifetimes to attack clients.
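As a small worked example of the (PUBLISHED, PRE, POST) triple: the function names below are hypothetical, and times are assumed to be in seconds. The numbers reproduce the 51-hour example above, with 24 hours of tolerance before publication and 3 + 24 hours after it.

```python
HOUR = 60 * 60


def validity_window(published, pre_valid, post_valid):
    """Return the (first, last) seconds at which an object is accepted."""
    return published - pre_valid, published + post_valid


def is_valid_at(now, published, pre_valid, post_valid):
    """True if `now` falls inside the window, inclusive on both ends."""
    first, last = validity_window(published, pre_valid, post_valid)
    return first <= now <= last


# "Fresh" for 3 hours, with 24 hours of skew tolerance on each side.
published = 1_000_000
pre_valid = 24 * HOUR
post_valid = (3 + 24) * HOUR
first, last = validity_window(published, pre_valid, post_valid)
lifetime_hours = (last - first) // HOUR  # 51
```

No extra tolerance is added by the checker: all of the skew allowance lives in the PRE and POST values themselves.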

Design overview: how the formats work together

Authorities, as part of their current voting process, will produce an ENDIVE.

Relays will download this ENDIVE (either directly or as a diff), validate it, and extract SNIPs from it. Extracting these SNIPs may be trivial (if they are signed individually), or more complex (if they are signed via a Merkle tree, and the Merkle tree needs to be reconstructed). This complexity is acceptable only to the extent that it reduces compressed diff size.

Once the SNIPs are reconstructed, relays will hold them and serve them to clients.

What isn't in this section

This section doesn't tell you what the different routing indices are or mean. For now, we can imagine there being one routing index for guards, one for middles, one for exits, and one for each hidden service directory ring. (See section 06 for more on regular indices, and section 07 for more on onion services.)

This section doesn't give an algorithm for computing ENDIVEs from votes, and doesn't give an algorithm for extracting SNIPs from an ENDIVE. Those come later. (See sections 03 and 04 respectively.)

SNIPs

Each SNIP has three pieces: the part of the SNIP that describes the router, the part that describes the SNIP's place within an ENDIVE, and the part that authenticates the whole SNIP.

Why two separate authenticated pieces? Because one (the router description) is taken verbatim from the ENDIVE, and the other (the location within the ENDIVE) is computed from the ENDIVE by the relays. Separating them like this helps ensure that the part generated by the relay and the part generated by the authorities can't interfere with each other.

; A SNIP, as it is sent from the relay to the client.  Note that
; this is represented as a three-element array.
SNIP = [
    ; First comes the signature.  This is computed over
    ; the concatenation of the two bstr objects below.
    auth: SNIPSignature,

    ; Next comes the location of the SNIP within the ENDIVE.
    index: bstr .cbor SNIPLocation,

    ; Finally comes the information about the router.
    router: bstr .cbor SNIPRouterData,
]

(Computing the signature over a concatenation of objects is safe, since the objects' content is self-describing CBOR, and isn't vulnerable to framing issues.)

SNIPRouterData: information about a single router.

Here we talk about the type that tells a client about a single router. For cases where we are just storing information about a router (for example, when using it as a guard), we can remember this part, and discard the other pieces.

The only required parts here are those that identify the router and tell the client how to build a circuit through it. The others are all optional. In practice, I expect they will be encoded in most cases, but clients MUST behave properly if they are absent.

More than one SNIPRouterData may exist in the same ENDIVE for a single router. For example, there might be a longer version to represent a router to be used as a guard, and another to represent the same router when used as a hidden service directory. (This is not possible in the voting mechanism that I'm working on, but relays and clients MUST NOT treat this as an error.)

This representation is based on the routerstatus and microdescriptor entries of today, but tries to omit a number of obsolete fields, including RSA identity fingerprint, TAP key, published time, etc.

; A SNIPRouterData is a map from integer keys to values for
; those keys.
SNIPRouterData = {
    ; identity key.
    ? 0 => Ed25519PublicKey,

    ; ntor onion key.
    ? 1 => Curve25519PublicKey,

    ; list of link specifiers other than the identity key.
    ; If a client wants to extend to the same router later on,
    ; they SHOULD include all of these link specifiers verbatim,
    ; whether they recognize them or not.
    ? 2 => [ LinkSpecifier ],

    ; The software that this relay says it is running.
    ? 3 => SoftwareDescription,

    ; protovers.
    ? 4 => ProtoVersions,

    ; Family.  See below for notes on dual encoding.
    ? 5 => [ * FamilyId ],

    ; Country Code
    ? 6 => Country,

    ; Exit policies describing supported port _classes_.  Absent exit
    ; policies are treated as "deny all".
    ? 7 => ExitPolicy,

    ; NOTE: Properly speaking, there should be a CDDL 'cut'
    ; here, to indicate that the rules below should only match
    ; if one of the previous rules hasn't matched.
    ; Unfortunately, my CDDL tool doesn't seem to support cuts.

    ; For future tor extensions.
    * int => any,

    ; For unofficial and experimental extensions.
    * tstr => any,
}

; For future-proofing, we are allowing multiple ways to encode
; families.  One is as a list of other relays that are in your
; family.  One is as a list of authority-generated family
; identifiers. And one is as a master key for a family (as in
; Tor proposal 242).
;
; A client should consider two routers to be in the same
; family if they have at least one FamilyId in common.
; Authorities will canonicalize these lists.
FamilyId = bstr

; A country.  These should ordinarily be 2-character strings,
; but I don't want to enforce that.
Country = tstr;

; SoftwareDescription replaces our old "version".
SoftwareDescription = [
  software: tstr,
  version: tstr,
  extra: tstr
]

; Protocol versions: after a bit of experimentation, I think
; the most reasonable representation to use is a map from protocol
; ID to a bitmask of the supported versions.
ProtoVersions = { ProtoId => ProtoBitmask }

; Integer protocols are reserved for future versions of Tor. tstr ids
; are reserved for experimental and non-tor extensions.
ProtoId = ProtoIdEnum / int / tstr

ProtoIdEnum = &(
  Link     : 0,
  LinkAuth : 1,
  Relay    : 2,
  DirCache : 3,
  HSDir    : 4,
  HSIntro  : 5,
  HSRend   : 6,
  Desc     : 7,
  MicroDesc: 8,
  Cons     : 9,
  Padding  : 10,
  FlowCtrl : 11,
)
; This type is limited to 64 bits, and that's fine.  If we ever
; need a protocol version higher than 63, we should allocate a
; new protoid.
ProtoBitmask = uint

; An exit policy may exist in up to two variants.  When port classes
; have not changed in a while, only one policy is needed.  If port
; classes have changed recently, however, then SNIPs need to include
; each relay's position according to both the older and the newer policy
; until older network parameter documents become invalid.
ExitPolicy = SinglePolicy / [ SinglePolicy, SinglePolicy ]

; Each single exit policy is a tagged bit array, whose bits
; correspond to the members of the list of port classes in the
; network parameter document with a corresponding tag.
SinglePolicy = [
     ; Identifies which group of port classes we're talking about
     tag: unsigned,
     ; Bit-array of which port classes this relay supports.
     policy: bstr
]
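As an illustration of the ProtoBitmask type defined above, the following sketch converts between a list of supported versions and a bitmask. It assumes that bit i (counting from the least significant bit) stands for version i, which is the natural reading of "a bitmask of the supported versions" but is not stated explicitly here; the function names are hypothetical.

```python
MAX_VERSION = 63  # the mask is limited to 64 bits


def versions_to_bitmask(versions):
    """Build a ProtoBitmask-style integer from supported versions."""
    mask = 0
    for v in versions:
        if not 0 <= v <= MAX_VERSION:
            raise ValueError("version out of range 0..63")
        mask |= 1 << v  # assumption: bit i represents version i
    return mask


def bitmask_to_versions(mask):
    """List the versions whose bits are set in `mask`."""
    return [v for v in range(MAX_VERSION + 1) if mask & (1 << v)]


def supports(mask, version):
    """True if `version` is marked as supported in `mask`."""
    return 0 <= version <= MAX_VERSION and bool((mask >> version) & 1)
```

For example, a protocol supporting versions 1 through 5 would be encoded as the integer 62 under this assumption.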

SNIPLocation: Locating a SNIP within a routing index.

The SNIPLocation type can encode where a SNIP is located with respect to one or more routing indices. Note that a SNIPLocation does not need to be exhaustive: If a given IndexId is not listed for a given relay in one SNIP, it might exist in another SNIP. Clients should not infer that the absence of an IndexId in one SNIPLocation for a relay means that no SNIPLocation with that IndexId exists for the relay.

; SNIPLocation: we're using a map here because it's natural
; to look up indices in maps.
SNIPLocation = {
    ; The keys of this mapping represent the routing indices in
    ; which a SNIP appears.  The values represent the index ranges
    ; that it occupies in those indices.
    * IndexId => IndexRange / ExtensionIndex,
}

; We'll define the different index ranges as we go on with
; these specifications.
;
; IndexId values over 65535 are reserved for extensions and
; experimentation.
IndexId = uint32

; An index range extends from a minimum to a maximum value.
; These ranges are _inclusive_ on both sides.  If 'hi' is less
; than 'lo', then this index "wraps around" the end of the ring.
; A "nil" value indicates an empty range, which would not
; ordinarily be included.
IndexRange = [ lo: IndexPos,
               hi: IndexPos ] / nil

; An ExtensionIndex is reserved for future use; current clients
; will not understand it and current ENDIVEs will not contain it.
ExtensionIndex = any

; For most routing indices, the ranges are encoded as 4-byte integers.
; But for hsdir rings, they are binary strings.  (Clients and
; relays SHOULD NOT require this.)
IndexPos = uint / bstr

A bit more on IndexRanges: Every IndexRange actually describes a set of prefixes for possible index positions. For example, the IndexRange [ h'AB12', h'AB24' ] includes all the binary strings that start with (hex) AB12, AB13, and so on, up through all strings that start with AB24. Alternatively, you can think of a bstr-based IndexRange (lo, hi) as covering lo00000... through hiff....

IndexRanges based on the uint type work the same, except that they always specify the first 32 bits of a prefix.
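The prefix interpretation can be sketched by padding `lo` with 0x00 bytes and `hi` with 0xFF bytes before comparing. The function name is hypothetical; this sketch also assumes that uint positions are compared as 4-byte big-endian strings, and that a position shorter than the bounds is padded with zeros.

```python
def in_index_range(pos, lo, hi):
    """True if position `pos` falls in the inclusive IndexRange (lo, hi).

    All three arguments are bytes; a uint position or bound can be
    converted first with value.to_bytes(4, "big").
    """
    if lo is None or hi is None:
        return False  # a "nil" range is empty
    width = max(len(pos), len(lo), len(hi))
    lo_full = lo.ljust(width, b"\x00")    # lo covers lo00000...
    hi_full = hi.ljust(width, b"\xff")    # hi covers hiff....
    pos_full = pos.ljust(width, b"\x00")
    if lo_full <= hi_full:
        return lo_full <= pos_full <= hi_full
    # 'hi' is less than 'lo': the range wraps around the end of the ring.
    return pos_full >= lo_full or pos_full <= hi_full
```

With the example range [ h'AB12', h'AB24' ], a position beginning with AB13 is inside the range, and one beginning with AB25 is outside it.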

SNIPSignature: How to prove a SNIP is in the ENDIVE.

Here we describe the types for implementing SNIP signatures, to be validated as described in "Design overview: SNIP Authentication" above.

; Most elements in a SNIPSignature are positional and fixed
SNIPSignature = [
    ; The actual signature or signatures.  If this is a single signature,
    ; it's probably a threshold signature.  Otherwise, it's probably
    ; a list containing one signature from each directory authority.
    SingleSig / MultiSig,

    ; algorithm to use for the path through the merkle tree.
    d_alg: DigestAlgorithm,
    ; Path through merkle tree, possibly empty.
    merkle_path: MerklePath,

    ; Lifespan information.  This is included as part of the input
    ; to the hash algorithm for the signature.
    LifespanInfo,

    ; optional nonce for hash algorithm.
    ? nonce: bstr,

    ; extensions for later use. These are not signed.
    ? extensions: { * any => any },
]

; We use this group to indicate when an object originated, and when
; it should be accepted.
;
; When we are using it as an input to a hash algorithm for computing
; signatures, we encode it as an 8-byte number for "published",
; followed by two 4-byte numbers for pre-valid and post-valid.
LifespanInfo = (
    ; Official publication time in seconds since the epoch.  These
    ; MUST be monotonically increasing over time for a given set of
    ; authorities on all SNIPs or ENDIVEs that they generate: a
    ; document with a greater `published` time is always more recent
    ; than one with an earlier `published` time.
    ;
    ; Seeing a publication time "in the future" on a correctly
    ; authenticated document is a reliable sign that your
    ; clock is set too far in the past.
    published: uint,

    ; Value to subtract from "published" in order to find the first second
    ; at which this object should be accepted.
    pre-valid: uint32,

    ; Value to add to "published" in order to find the last
    ; second at which this object should be accepted.  The
    ; lifetime of an object is therefore equal to "(post-valid +
    ; pre-valid)".
    post-valid: uint32,
)

; A Lifespan is just the fields of LifespanInfo, encoded as a list.
Lifespan = [ LifespanInfo ]

; One signature on a SNIP or ENDIVE.  If the signature is a threshold
; signature, or a reference to a signature in another
; document, there will probably be just one of these per SNIP.  But if
; we're sticking a full multisignature in the document, this
; is just one of the signatures on it.
SingleSig = [
   s_alg: SigningAlgorithm,
   ; One of signature and sig_reference must be present.
   ?signature: bstr,
   ; sig_reference is an identifier for a signature that appears
   ; elsewhere, and can be fetched on request.  It should only be
   ; used with signature types too large to attach to SNIPs on their
   ; own.
   ?sig_reference: bstr,
   ; A prefix of the key or the key's digest, depending on the
   ; algorithm.
   ?keyid: bstr,
]

MultiSig = [ + SingleSig ]

; A Merkle path is represented as a sequence of bits to
; indicate whether we're going left or right, and a list of
; hashes for the parts of the tree that we aren't including.
;
; (It's safe to use a uint for the number of bits, since it will
; never overflow 64 bits -- that would mean a Merkle tree with
; too many leaves to actually calculate on.)
MerklePath = [ uint, *bstr ]
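
As a non-normative illustration of the LifespanInfo group above, here is a minimal Python sketch of how a party might compute the acceptance window and the 16-byte encoding that is fed to the hash. The function names are hypothetical, and the big-endian byte order is an assumption taken from the U64() definition in the digest section.

```python
import struct

def lifespan_window(published, pre_valid, post_valid):
    """Return the (first, last) second at which the object should be accepted."""
    return published - pre_valid, published + post_valid

def is_acceptable(now, published, pre_valid, post_valid):
    """True if `now` (seconds since the epoch) falls inside the lifespan."""
    first, last = lifespan_window(published, pre_valid, post_valid)
    return first <= now <= last

def encode_lifespan(published, pre_valid, post_valid):
    """8-byte `published`, then 4-byte pre-valid and 4-byte post-valid.

    Big-endian order is assumed here, not stated by the group definition.
    """
    return struct.pack(">QII", published, pre_valid, post_valid)
```

Note that the total lifetime of the window is pre_valid + post_valid seconds, as stated in the comments on the group.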

ENDIVEs: sending a bunch of SNIPs efficiently.

ENDIVEs are delivered by the authorities in a compressed format, optimized for diffs.

Note that if we are using Merkle trees for SNIP authentication, ENDIVEs do not include the trees at all, since those can be inferred from the leaves of the tree. Similarly, the ENDIVEs do not include raw routing indices, but instead include a set of bandwidths that can be combined into the routing indices -- these bandwidths change less frequently, and therefore are more diff-friendly.

Note also that this format has more "wasted bytes" than SNIPs do. Unlike SNIPs, ENDIVEs are large enough to benefit from compression with gzip, lzma2, and so on.

This section does not fully specify how to construct SNIPs from an ENDIVE; for the full algorithm, see section 04.

; ENDIVEs are also sent as CBOR.
ENDIVE = [
    ; Signature for the ENDIVE, using a simpler format than for 
    ; a SNIP.  Since ENDIVEs are more like a consensus, we don't need
    ; to use threshold signatures or Merkle paths here.
    sig: ENDIVESignature,

    ; Contents, as a binary string.
    body: encoded-cbor .cbor ENDIVEContent,
]

; The set of signatures across an ENDIVE.
;
; This type doubles as the "detached signature" document used when
; collecting signatures for a consensus.
ENDIVESignature = {
    ; The actual signatures on the endive. A multisignature is the
    ; likeliest format here.
    endive_sig: [ + SingleSig ],

    ; Lifespan information.  As with SNIPs, this is included as part
    ; of the input to the hash algorithm for the signature.
    ; Note that the lifespan of an ENDIVE is likely to be a subset
    ; of the lifespan of its SNIPs.
    endive_lifespan: Lifespan,

    ; Signatures across SNIPs, at some level of the Merkle tree.  Note
    ; that these signatures are not themselves signed -- having them
    ; signed would take another step in the voting algorithm.
    snip_sigs: DetachedSNIPSignatures,

    ; Signatures across the ParamDoc pieces.  Note that as with the
    ; DetachedSNIPSignatures, these signatures are not themselves signed.
    param_doc: ParamDocSignature,

    ; extensions for later use. These are not signed.
    * tstr => any,
}

; A list of single signatures or a list of multisignatures. This
; list must have 2^signature-depth elements.
DetachedSNIPSignatures =
      [ *SingleSig ] / [ *MultiSig ]

ENDIVEContent = {

    ; Describes how to interpret the signatures over the SNIPs in this
    ; ENDIVE. See section 04 for the full algorithm.
    sig_params: {
        ; When should we say that the signatures are valid?
        lifespan: Lifespan,
        ; Nonce to be used with the signing algorithm for the signatures.
        ? signature-nonce: bstr,

        ; At what depth of a Merkle tree do the signatures apply?
        ; (If this value is 0, then only the root of the tree is signed.
        ; If this value is >= ceil(log2(n_leaves)), then every leaf is
        ; signed.).
        signature-depth: uint,

        ; What digest algorithm is used for calculating the signatures?
        signature-digest-alg: DigestAlgorithm,

        ; reserved for future extensions.
        * tstr => any,
    },

    ; Documents for clients/relays to learn about current network
    ; parameters.
    client-param-doc: encoded-cbor .cbor ClientParamDoc,
    relay-param-doc: encoded-cbor .cbor RelayParamDoc,

    ; Definitions for index groups.  The indices in each "index group"
    ; are all applied to the same SNIPs.  (If there is one index group,
    ; then every relay is in at most one SNIP, and likely has several
    ; indices.  If there are multiple index groups, then relays
    ; can appear in more than one SNIP.)
    indexgroups: [ *IndexGroup ],

    ; Information on particular relays.
    ;
    ; (The total number of SNIPs identified by an ENDIVE is at most
    ; len(indexgroups) * len(relays).)
    relays: [ * ENDIVERouterData ],

    ; for future extensions
    * tstr => any,
}

; An "index group" lists a bunch of routing indices that apply to the same
; SNIPs.  There may be multiple index groups when a relay needs to appear
; in different SNIPs with routing indices for some reason.
IndexGroup = {
    ; A list of all the indices that are built for this index group.
    ; An IndexId may appear in at most one group per ENDIVE.
    indices: [ + IndexId ],
    ; A list of keys to delete from SNIPs to build this index group.
    omit_from_snips: [ *(int/tstr) ],
    ; A list of keys to forward from SNIPs to the next relay in an EXTEND
    ; cell.  This can help the next relay know which keys to use in its
    ; handshake.
    forward_with_extend: [ *(int/tstr) ],

    ; A number of "gaps" to place in the Merkle tree after the SNIPs
    ; in this group.  This can be used together with signature-depth
    ; to give different index-groups independent signatures.
    ? n_padding_entries: uint,

    ; A detailed description of how to build the index.
    + IndexId => IndexSpec,

    ; For experimental and extension use.
    * tstr => any,
}

; Enumeration to identify how to generate an index.
Indextype_Raw = 0
Indextype_Weighted = 1
Indextype_RSAId = 2
Indextype_Ed25519Id = 3
Indextype_RawNumeric = 4

; An indexspec may be given as a raw set of index ranges.  This is a
; fallback for cases where we simply can't construct an index any other
; way.
IndexSpec_Raw = {
    type: Indextype_Raw,
    ; This index is constructed by taking relays by their position in the
    ; list from the list of ENDIVERouterData, and placing them at a given
    ; location in the routing index.  Each index range extends up to
    ; right before the next index position.
    index_ranges: [ * [ uint, IndexPos ] ],
}

; An indexspec given as a list of numeric spans on the index.
IndexSpec_RawNumeric = {
    type: Indextype_RawNumeric,
    first_index_pos: uint,
    ; This index is constructed by taking relays by index from the list
    ; of ENDIVERouterData, and giving them a certain amount of "weight"
    ; in the index.
    index_ranges: [ * [ idx: uint, span: uint ] ],
}

; This index is computed from the weighted bandwidths of all the SNIPs.
;
; Note that when a single bandwidth changes, it can change _all_ of
; the indices in a bandwidth-weighted index, even if no other
; bandwidth changes.  That's why we only pack the bandwidths
; here, and scale them as part of the reconstruction algorithm.
IndexSpec_Weighted = {
    type: Indextype_Weighted,
    ; This index is constructed by assigning a weight to each relay,
    ; and then normalizing those weights. See algorithm below in section
    ; 04.
    ; Limiting bandwidth weights to uint32 makes reconstruction algorithms
    ; much easier.
    index_weights: [ * uint32 ],
}

; This index is computed from the RSA identity key digests of all of the
; SNIPs. It is used in the HSv2 directory ring.
IndexSpec_RSAId = {
    type: Indextype_RSAId,
    ; How many bytes of RSA identity data go into each indexpos entry?
    n_bytes: uint,
    ; Bitmap of which routers should be included.
    members: bstr,
}

; This index is computed from the Ed25519 identity keys of all of the
; SNIPs.  It is used in the HSv3 directory ring.
IndexSpec_Ed25519Id = {
    type: Indextype_Ed25519Id,
    ; How many bytes of digest go into each indexpos entry?
    n_bytes: uint,
    ; What digest do we use for building this ring?
    d_alg: DigestAlgorithm,
    ; What bytes do we give to the hash before the ed25519?
    prefix: bstr,
    ; What bytes do we give to the hash after the ed25519?
    suffix: bstr,
    ; Bitmap of which routers should be included.
    members: bstr,
}

IndexSpec = IndexSpec_Raw /
            IndexSpec_RawNumeric /
            IndexSpec_Weighted /
            IndexSpec_RSAId /
            IndexSpec_Ed25519Id

; Information about a single router in an ENDIVE.
ENDIVERouterData = {
    ; The authority-generated SNIPRouterData for this router.
    1 => encoded-cbor .cbor SNIPRouterData,
    ; The RSA identity, or a prefix of it, to use for HSv2 indices.
    ? 2 => RSAIdentityFingerprint,

    * int => any,
    * tstr => any,
}

; encoded-cbor is defined in the CDDL postlude as a bstr that is
; tagged as holding verbatim CBOR:
;
;    encoded-cbor = #6.24(bstr)
;
; Using a tag like this helps tools that validate the string as
; valid CBOR; using a bstr helps indicate that the signed data
; should not be interpreted until after the signature is checked.
; It also helps diff tools know that they should look inside these
; objects.

Network parameter documents

Network parameter documents ("ParamDocs" for short) take the place of the current consensus and certificates as a small document that clients and relays need to download periodically and keep up-to-date. They are generated as part of the voting process, and contain fields like network parameters, recommended versions, authority certificates, and so on.

; A "parameter document" is like a tiny consensus that relays and clients
; can use to get network parameters.
ParamDoc = [
   sig: ParamDocSignature,
   ; Client-relevant portion of the parameter document. Everybody fetches
   ; this.
   cbody: encoded-cbor .cbor ClientParamDoc,
   ; Relay-relevant portion of the parameter document. Only relays need to
   ; fetch this; the document can be validated without it.
   ? sbody: encoded-cbor .cbor RelayParamDoc,
]
ParamDocSignature = [
   ; Multisignature or threshold signature of the concatenation
   ; of the two digests below.
   SingleSig / MultiSig,

   ; Lifespan information.  As with SNIPs, this is included as part
   ; of the input to the hash algorithm for the signature.
   ; Note that the lifespan of a parameter document is likely to be
   ; very long.
   LifespanInfo,

   ; how are c_digest and s_digest computed?
   d_alg: DigestAlgorithm,
   ; Digest over the cbody field
   c_digest: bstr,
   ; Digest over the sbody field
   s_digest: bstr,
]

ClientParamDoc = {
   params: NetParams,
   ; List of certificates for all the voters.  These
   ; authenticate the keys used to sign SNIPs and ENDIVEs and votes,
   ; using the authorities' longest-term identity keys.
   voters: [ + bstr .cbor VoterCert ],

   ; A division of exit ports into "classes" of ports.
   port-classes: PortClasses,

   ; As in client-versions from dir-spec.txt
   ? recommend-versions: [ * tstr ],
   ; As in recommended-client-protocols in dir-spec.txt
   ? recommend-protos: ProtoVersions,
   ; As in required-client-protocols in dir-spec.txt
   ? require-protos: ProtoVersions,

   ; For future extensions.
   * tstr => any,
}

RelayParamDoc = {
   params: NetParams,

   ; As in server-versions from dir-spec.txt
   ? recommend-versions: [ * tstr ],
   ; As in recommended-relay-protocols in dir-spec.txt
   ? recommend-protos: ProtoVersions,
   ; As in required-relay-protocols in dir-spec.txt
   ? require-protos: ProtoVersions,

   * tstr => any,
}

; A NetParams encodes information about the Tor network that
; clients and relays need in order to participate in it.  The
; current list of parameters is described in the "params" field
; as specified in dir-spec.txt.
;
; Note that there are separate client and relay NetParams now.
; Relays are expected to first check for a definition in the
; RelayParamDoc, and then in the ClientParamDoc.
NetParams = { *tstr => int }

PortClasses = {
    ; identifies which port class grouping this is. Used to migrate
    ; from one group of port classes to another.
    tag: uint,
    ; list of the port classes.
    classes: { * IndexId => PortList },
}
PortList = [ *PortOrRange ]
 ; Either a single port or a low-high pair
PortOrRange = Port / [ Port, Port ]
Port = 1..65535
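
To illustrate how a client might consult a PortList once it has been decoded from CBOR, here is a small Python sketch. It assumes that a low-high pair is inclusive at both ends, and it uses plain integers as stand-ins for IndexId keys; both function names are hypothetical.

```python
def port_in_list(port, port_list):
    """Return True if `port` is covered by a decoded PortList.

    Each entry is either a single port (an int) or a [low, high] pair,
    which this sketch treats as inclusive at both ends.
    """
    for entry in port_list:
        if isinstance(entry, int):
            if port == entry:
                return True
        else:
            low, high = entry
            if low <= port <= high:
                return True
    return False

def classes_for_port(port, classes):
    """Return the sorted keys of every port class that covers `port`."""
    return sorted(key for key, plist in classes.items()
                  if port_in_list(port, plist))
```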

Certificates

Voting certificates are used to bind authorities' long-term identities to shorter-term signing keys. These have a similar purpose to the authority certs made for the existing voting algorithm, but support more key types.

; A 'voter certificate' is a statement by an authority binding keys to
; each other.
VoterCert = [

   ; One or more signatures over `content` using the provided lifetime.
   ; Each signature should be treated independently.
   [ + SingleSig ],
   ; A lifetime value, used (as usual) as an input to the
   ; signature algorithm.
   LifespanInfo,
   ; The keys and other data to be certified.
   content: encoded-cbor .cbor CertContent,
]

; The contents of the certificate that get signed.
CertContent = {
   ; What kind of a certificate is this?
   type: CertType,
   ; A list of keys that are being certified in this document
   keys: [ + CertifiedKey ],
   ; A list of other keys that you might need to know about, which
   ; are NOT certified in this document.
   ? extra: [ + CertifiedKey ],
   * tstr => any,
}

CertifiedKey = {
   ; What is the intended usage of this key?
   usage: KeyUsage,
   ; What cryptographic algorithm is this key used for?
   alg: PKAlgorithm,
   ; The actual key being certified.
   data: bstr,
   ; A human readable string.
   ? remarks: tstr,
   * tstr => any,
}

ENDIVE diffs

Here is a binary format to be used with ENDIVEs, ParamDocs, and any other similar binary formats. Authorities and directory caches need to be able to generate it; clients and non-cache relays only need to be able to parse and apply it.

; Binary diff specification.
BinaryDiff = {
    ; This is version 1.
    v: 1,
    ; Optionally, a diff can say what different digests
    ; of the document should be before and after it is applied.
    ; If there is more than one entry, parties MAY check one or
    ; all of them.
    ? digest: { * DigestAlgorithm =>
                     [ pre: Digest,
                       post: Digest ]},

    ; Optionally, a diff can give some information to identify
    ; which document it applies to, and what document you get
    ; from applying it.  These might be a tuple of a document type
    ; and a publication type.
    ? ident: [ pre: any, post: any ],

    ; list of commands to apply, in order, to the original document
    ; to get the transformed document
    cmds: [ *DiffCommand ],

    ; for future extension.
    * tstr => any,
}

; There are currently only two diff commands.
; One is to copy some bytes from the original.
CopyDiffCommand = [
    OrigBytesCmdId,
    ; Range of bytes to copy from the original document.
    ; Ranges include their starting byte.  The "offset" is relative to
    ; the end of the _last_ range that was copied.
    offset: int,
    length: uint,
]

; The other diff command is to insert some bytes from the diff.
InsertDiffCommand = [
    InsertBytesCmdId,
    data: bstr,
]

DiffCommand = CopyDiffCommand / InsertDiffCommand

OrigBytesCmdId = 0
InsertBytesCmdId = 1

Applying a binary diff is simple:

Algorithm: applying a binary diff.

(Given an input bytestring INP and a diff D, produces an output OUT.)

Initialize OUT to an empty bytestring.

Set OFFSET to 0.

For each command C in D.cmds, in order:

    If C begins with OrigBytesCmdId:
        Increase "OFFSET" by C.offset
        If OFFSET..OFFSET+C.length is not a valid range in
           INP, abort.
        Append INP[OFFSET .. OFFSET+C.length] to OUT.
        Increase "OFFSET" by C.length

    else: # C begins with InsertBytesCmdId:
        Append C.data to OUT.
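
The algorithm above can be sketched in Python as follows. This is only an illustration: it takes the already-decoded list of commands from the diff's `cmds` field, and it rejects unrecognized command IDs, which is a choice of this sketch rather than something the algorithm specifies. The names `apply_binary_diff` and `DiffError` are hypothetical.

```python
ORIG_BYTES_CMD_ID = 0
INSERT_BYTES_CMD_ID = 1

class DiffError(Exception):
    """Raised when a diff cannot be applied to the given input."""

def apply_binary_diff(inp, cmds):
    """Apply a decoded list of DiffCommands to the bytestring `inp`."""
    out = bytearray()
    offset = 0
    for cmd in cmds:
        if cmd[0] == ORIG_BYTES_CMD_ID:
            _, rel_offset, length = cmd
            # The offset is relative to the end of the last copied range.
            offset += rel_offset
            if offset < 0 or length < 0 or offset + length > len(inp):
                raise DiffError("copy range is not within the input")
            out += inp[offset:offset + length]
            offset += length
        elif cmd[0] == INSERT_BYTES_CMD_ID:
            out += cmd[1]
        else:
            # Assumption of this sketch: unknown commands abort the diff.
            raise DiffError("unrecognized diff command")
    return bytes(out)
```

Because copy offsets are signed, a diff can move backwards in the input: for example, copying the tail of a document first and then its head.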

Generating a binary diff can be trickier, and is not specified here. There are several generic algorithms out there for making binary diffs between arbitrary byte sequences. Since these are complex, I recommend a chunk-based CBOR-aware algorithm, using each CBOR item in a similar way to the way in which our current line-oriented code uses lines. When encountering a bstr tagged with "encoded-cbor", the diff algorithm should look inside it to find more cbor chunks. (See example-code/cbor_diff.py for an example of doing this with Python's difflib.)

The diff format above should work equally well no matter what diff algorithm is used, so we have room to move to other algorithms in the future if needed.

To indicate support for the above diff format in directory requests, implementations should use an X-Support-Diff-Formats header. The above format is designated "cbor-bindiff"; our existing format is called "ed".

Digests and parameters

Here we give definitions for H_leaf() and H_node(), based on an underlying digest function H() with a preferred input block size of B. (B should be chosen as the natural input size of the hash function, to aid in precomputation.)

We also define H_sign(), to be used outside of SNIP authentication where we aren't using a Merkle tree at all.

PATH must be no more than 64 bits long. NONCE must be no more than B-33 bytes long.

 H_sign(LIFESPAN, NONCE, ITEM) =
    H( PREFIX(OTHER_C, LIFESPAN, NONCE) || ITEM)

 H_leaf(PATH, LIFESPAN, NONCE, ITEM) =
    H( PREFIX(LEAF_C, LIFESPAN, NONCE) ||
       U64(PATH) ||
       U64(bits(PATH)) ||
       ITEM )

 H_node(PATH, LIFESPAN, NONCE, ITEM) =
    H( PREFIX(NODE_C, LIFESPAN, NONCE) ||
       U64(PATH) ||
       U64(bits(PATH)) ||
       ITEM )

 PREFIX(leafcode, lifespan, nonce) =
      U64(leafcode) ||
      U64(lifespan.published) ||
      U32(lifespan.pre-valid) ||
      U32(lifespan.post-valid) ||
      U8(len(nonce)) ||
      nonce ||
      Z(B - 33 - len(nonce))

 LEAF_C = 0x8BFF0F687F4DC6A1 ^ NETCONST
 NODE_C = 0xA6F7933D3E6B60DB ^ NETCONST
 OTHER_C = 0x7365706172617465 ^ NETCONST

 # For the live Tor network only.
 NETCONST = 0x0746f72202020202
 # For testing networks, by default.
 NETCONST = 0x74657374696e6720

 U64(n) -- N encoded as a big-endian 64-bit number.
 U32(n), U8(n) -- likewise, as a big-endian 32-bit or 8-bit number.
 Z(n) -- N bytes with value zero.
 len(b) -- the number of bytes in a byte-string b.
 bits(b) -- the number of bits in a bit-string b.
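
As a rough, non-normative sketch of the definitions above, here is how PREFIX(), H_sign(), and H_leaf() might look in Python. Several things here are assumptions made only for illustration: SHA-256 (with B = 64) stands in for H(), "^" is read as XOR, the testing-network NETCONST is used, U32 and U8 are taken to be big-endian like U64, and PATH is passed as an integer together with its bit count.

```python
import hashlib
import struct

B = 64                          # block size of SHA-256 (assumed choice of H)
NETCONST = 0x74657374696e6720   # testing-network value from the text
LEAF_C = 0x8BFF0F687F4DC6A1 ^ NETCONST
OTHER_C = 0x7365706172617465 ^ NETCONST

def prefix(code, published, pre_valid, post_valid, nonce=b""):
    """Build PREFIX(code, lifespan, nonce), padded with zero bytes."""
    if len(nonce) > B - 33:
        raise ValueError("NONCE must be no more than B-33 bytes long")
    head = struct.pack(">QQIIB", code, published, pre_valid, post_valid,
                       len(nonce))
    return head + nonce + bytes(B - 33 - len(nonce))

def h_sign(lifespan, nonce, item):
    """H_sign(): digest used where no Merkle tree is involved."""
    published, pre_valid, post_valid = lifespan
    data = prefix(OTHER_C, published, pre_valid, post_valid, nonce) + item
    return hashlib.sha256(data).digest()

def h_leaf(path, path_bits, lifespan, nonce, item):
    """H_leaf(): `path` as an integer is an assumption of this sketch."""
    published, pre_valid, post_valid = lifespan
    data = (prefix(LEAF_C, published, pre_valid, post_valid, nonce)
            + struct.pack(">QQ", path, path_bits) + item)
    return hashlib.sha256(data).digest()
```

With these definitions the prefix always occupies B - 8 bytes, regardless of the nonce length.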

Directory authority operations

For Walking Onions to work, authorities must begin to generate ENDIVEs as a new kind of "consensus document". Since this format is incompatible with the previous consensus document formats, and is CBOR-based, a text-based voting protocol is no longer appropriate for generating it.

We cannot immediately abandon the text-based consensus and microdescriptor formats, but instead will need to keep generating them for legacy relays and clients. Ideally, the process that produces the ENDIVE should also produce a legacy consensus, to limit the amount of divergence in their contents.

Further, it would be good for the purposes of this proposal if we can "inherit" as much as possible of our existing voting mechanism for legacy purposes.

This section of the proposal will try to meet these goals by defining a new binary-based voting format, a new set of voting rules for it, and a series of migration steps.

Overview

Except as described below, we retain from Tor's existing voting mechanism all notions of how votes are transferred and processed. Other changes are likely desirable, but they are out of scope for this proposal.

Notably, we are not changing how the voting schedule works. Nor are we changing the property that all authorities must agree on the list of authorities; the property that a consensus is computed as a deterministic function of a set of votes; or the property that if authorities believe in different sets of votes, they will not reach the same consensus.

The principal changes in the voting that are relevant for legacy consensus computation are:

  • The uploading process for votes now supports negotiation, so that the receiving authority can tell the uploading authority what kind of formats, diffs, and compression it supports.

  • We specify a CBOR-based binary format for votes, with a simple embedding method for the legacy text format. This embedding is meant for transitional use only; once all authorities support the binary format, the transitional format and its support structures can be abandoned.

  • To reduce complexity, the new vote format also includes verbatim microdescriptors, whereas previously microdescriptors would have been referenced by hash. (The use of diffs and compression should make the bandwidth impact of this addition negligible.)

For computing ENDIVEs, the principal changes in voting are:

  • The consensus outputs for most voteable objects are specified in a way that does not require the authorities to understand their semantics when computing a consensus. This should make it easier to change fields without requiring new consensus methods.

Negotiating vote uploads

Authorities supporting Walking Onions are required to support a new resource "/tor/auth-vote-opts". This resource is a text document containing a list of HTTP-style headers. Recognized headers are described below; unrecognized headers MUST be ignored.

The Accept-Encoding header follows the same format as the HTTP header of the same name; it indicates a list of Content-Encodings that the authority will accept for uploads. All authorities SHOULD support the gzip and identity encodings. The identity encoding is mandatory. (Default: "identity")

The Accept-Vote-Diffs-From header is a list of digests of previous votes held by this authority; any new uploaded votes that are given as diffs from one of these old votes SHOULD be accepted. The format is a space-separated list of "digestname:Hexdigest". (Default: "".)

The Accept-Vote-Formats header is a space-separated list of the vote formats that this authority accepts. The recognized vote formats are "legacy-3" (Tor's current vote format) and "endive-1" (the vote format described here). Unrecognized vote formats MUST be ignored. (Default: "legacy-3".)

If requesting "/tor/auth-vote-opts" gives an error, or if one or more headers are missing, the default values SHOULD be used. (These documents, or their absence, MAY be cached for up to 2 voting periods.)
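
To make the header handling above concrete, here is a minimal Python sketch of how an uploading authority might parse the "/tor/auth-vote-opts" document. The function name and the returned dictionary layout are hypothetical, and treating header names as case-insensitive is an assumption borrowed from HTTP. Passing `None` models a failed request.

```python
DEFAULTS = {
    "accept-encoding": "identity",
    "accept-vote-diffs-from": "",
    "accept-vote-formats": "legacy-3",
}
KNOWN_VOTE_FORMATS = ("legacy-3", "endive-1")

def parse_vote_opts(text):
    """Parse an auth-vote-opts document, falling back to the defaults."""
    raw = dict(DEFAULTS)
    for line in (text or "").splitlines():
        name, sep, value = line.partition(":")
        name = name.strip().lower()
        # Unrecognized headers are ignored.
        if sep and name in raw:
            raw[name] = value.strip()
    encodings = [e.split(";")[0].strip()
                 for e in raw["accept-encoding"].split(",") if e.strip()]
    diffs_from = []
    for item in raw["accept-vote-diffs-from"].split():
        digest_name, sep, hexdigest = item.partition(":")
        if sep:
            diffs_from.append((digest_name, hexdigest))
    # Unrecognized vote formats are ignored.
    formats = [f for f in raw["accept-vote-formats"].split()
               if f in KNOWN_VOTE_FORMATS]
    return {"encodings": encodings, "diffs_from": diffs_from,
            "formats": formats}
```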

Authorities supporting Walking Onions SHOULD also support the "Connection: keep-alive" and "Keep-Alive" HTTP headers, to avoid needless reconnections in response to these requests. Implementors SHOULD be aware of potential denial-of-service attacks based on open HTTP connections, and mitigate them as appropriate.

Note: I thought about using OPTIONS here, but OPTIONS isn't quite right for this, since Accept-Vote-Diffs-From does not fit with its semantics.

Note: It might be desirable to support this negotiation for legacy votes as well, even before walking onions is implemented. Doing so would allow us to reduce authority bandwidth a little, and possibly include microdescriptors in votes for more convenient processing.

A generalized algorithm for voting

Unlike with previous versions of our voting specification, here I'm going to try to describe pieces of the voting algorithm in terms of simpler voting operations. Each voting operation will be named and possibly parameterized, and data will frequently self-describe what voting operation is to be used on it.

Voting operations may operate over different CBOR types, and are themselves specified as CBOR objects.

A voting operation takes place over a given "voteable field". Each authority that specifies a value for a voteable field MUST specify which voting operation to use for that field. Specifying a voteable field without a voting operation MUST be taken as specifying the voting operation "None" -- that is, voting against a consensus.

On the other hand, an authority MAY specify a voting operation for a field without casting any vote for it. This means that the authority has an opinion on how to reach a consensus about the field, without having any preferred value for the field itself.

Constants used with voting operations

Many voting operations may be parameterized by an unsigned integer. In some cases the integers are constant, but in others, they depend on the number of authorities, the number of votes cast, or the number of votes cast for a particular field.

When we encode these values, we encode them as short strings rather than as integers.

I had thought of using negative integers here to encode these special constants, but that seems too error-prone.

The following constants are defined:

N_AUTH -- the total number of authorities, including those whose votes are absent.

N_PRESENT -- the total number of authorities whose votes are present for this vote.

N_FIELD -- the total number of authorities whose votes for a given field are present.

Necessarily, N_FIELD <= N_PRESENT <= N_AUTH -- you can't vote on a field unless you've cast a vote, and you can't cast a vote unless you're an authority.

In the definitions below, // denotes the truncating integer division operation, as implemented with / in C.

QUORUM_AUTH -- The lowest integer that is greater than half of N_AUTH. Equivalent to N_AUTH // 2 + 1.

QUORUM_PRESENT -- The lowest integer that is greater than half of N_PRESENT. Equivalent to N_PRESENT // 2 + 1.

QUORUM_FIELD -- The lowest integer that is greater than half of N_FIELD. Equivalent to N_FIELD // 2 + 1.

We define SUPERQUORUM_... variants of these fields as well, based on the lowest integer that is greater than a 2/3 majority of the underlying field. SUPERQUORUM_x is thus equivalent to (N_x * 2) // 3 + 1.

; We need to encode these arguments; we do so as short strings.
IntOpArgument = uint / "auth" / "present" / "field" /
     "qauth" / "qpresent" / "qfield" /
     "sqauth" / "sqpresent" / "sqfield"

No IntOpArgument may be greater than N_AUTH. If an IntOpArgument is given as an integer, and that integer is greater than N_AUTH, then it is treated as if it were N_AUTH.

This rule lets us say things like "at least 3 authorities must vote on x...if there are 3 authorities."
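
The constants and the clamping rule above can be sketched in Python like this. The function name is hypothetical; the three counts correspond to N_AUTH, N_PRESENT, and N_FIELD.

```python
def resolve_int_op_argument(arg, n_auth, n_present, n_field):
    """Turn an IntOpArgument into a concrete integer, clamped to N_AUTH."""
    counts = {"auth": n_auth, "present": n_present, "field": n_field}
    if isinstance(arg, int) and not isinstance(arg, bool):
        if arg < 0:
            raise ValueError("IntOpArgument integers are unsigned")
        value = arg
    elif isinstance(arg, str) and arg in counts:
        value = counts[arg]
    elif isinstance(arg, str) and arg[:2] == "sq" and arg[2:] in counts:
        # SUPERQUORUM_x = (N_x * 2) // 3 + 1
        value = (counts[arg[2:]] * 2) // 3 + 1
    elif isinstance(arg, str) and arg[:1] == "q" and arg[1:] in counts:
        # QUORUM_x = N_x // 2 + 1
        value = counts[arg[1:]] // 2 + 1
    else:
        raise ValueError("not an IntOpArgument: %r" % (arg,))
    return min(value, n_auth)
```

For non-negative integers, Python's `//` gives the same result as the truncating division described above.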

Producing consensus on a field

Each voting operation will either produce a CBOR output, or produce no consensus. Unless otherwise stated, all CBOR outputs are to be given in canonical form.

Below we specify a number of operations, and the parameters that they take. We begin with operations that apply to "simple" values (integers and binary strings), then show how to compose them to larger values.

All of the descriptions below show how to apply a single voting operation to a set of votes. We will later describe how to behave when the authorities do not agree on which voting operation to use, in our discussion of the StructJoinOp operation.

Note that while some voting operations take other operations as parameters, we are not supporting full recursion here: there is a strict hierarchy of operations, and more complex operations can only have simpler operations in their parameters.

All voting operations follow this metaformat:

; All a generic voting operation has to do is say what kind it is.
GenericVotingOp = {
    op: tstr,
    * tstr => any,
}

Note that some voting operations require a sort or comparison operation over CBOR values. This operation is defined later in appendix E; it works only on homogeneous inputs.

Generic voting operations

None

This voting operation takes no parameters, and always produces "no consensus". It is encoded as:

; "Don't produce a consensus".
NoneOp = { op: "None" }

When encountering an unrecognized or nonconforming voting operation, or one which is not recognized by the consensus-method in use, the authorities proceed as if the operation had been "None".

Voting operations for simple values

We define a "simple value" according to these CDDL rules:

; Simple values are primitive types, and tuples of primitive types.
SimpleVal = BasicVal / SimpleTupleVal
BasicVal = bool / int / bstr / tstr
SimpleTupleVal = [ *BasicVal ]

We also need the ability to encode the types for these values:

; Encoding a simple type.
SimpleType = BasicType / SimpleTupleType
BasicType = "bool" /  "uint" / "sint" / "bstr" / "tstr"
SimpleTupleType = [ "tuple", *BasicType ]

In other words, a SimpleVal is either a non-compound base value, or is a tuple of such values.

; We encode these operations as:
SimpleOp = MedianOp / ModeOp / ThresholdOp /
    BitThresholdOp / CborSimpleOp / NoneOp

We define each of these operations in the sections below.

Median

Parameters: MIN_VOTES (an integer), BREAK_EVEN_LOW (a boolean), TYPE (a SimpleType)

; Encoding:
MedianOp = { op: "Median",
             ? min_vote: IntOpArgument,  ; Default is 1.
             ? even_low: bool,           ; Default is true.
             type: SimpleType  }

Discard all votes that are not of the specified TYPE. If there are fewer than MIN_VOTES votes remaining, return "no consensus".

Put the votes in ascending sorted order. If the number of votes N is odd, take the center vote (the one at position (N+1)/2). If N is even, take the lower of the two center votes (the one at position N/2) if BREAK_EVEN_LOW is true. Otherwise, take the higher of the two center votes (the one at position N/2 + 1).

For example, the Median(…, even_low: True, type: "uint") of the votes ["String", 2, 111, 6] is 6. The Median(…, even_low: True, type: "uint") of the votes ["String", 77, 9, 22, "String", 3] is 9.
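
Here is a minimal Python sketch of the Median operation for basic types, reproducing the two examples above. It is illustrative only: `None` stands in for "no consensus", `min_vote` is assumed to be already resolved to an integer, Python's native ordering is used in place of the comparison from appendix E (so it should only be trusted for integers), and treating "sint" as "any integer" is an assumption. Tuple types are not handled.

```python
def matches_basic_type(value, type_name):
    """Check a decoded value against a BasicType name."""
    if isinstance(value, bool):
        return type_name == "bool"
    if type_name == "uint":
        return isinstance(value, int) and value >= 0
    python_types = {"sint": int, "bstr": bytes, "tstr": str}
    return (type_name in python_types
            and isinstance(value, python_types[type_name]))

def median_op(votes, type_name, min_vote=1, even_low=True):
    """Return the median of the votes of the given type, or None."""
    kept = sorted(v for v in votes if matches_basic_type(v, type_name))
    n = len(kept)
    if n == 0 or n < min_vote:
        return None  # no consensus
    if n % 2 == 1:
        return kept[n // 2]          # 1-based position (N+1)/2
    if even_low:
        return kept[n // 2 - 1]      # 1-based position N/2
    return kept[n // 2]              # 1-based position N/2 + 1
```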

Mode

Parameters: MIN_COUNT (an integer), BREAK_TIES_LOW (a boolean), TYPE (a SimpleType)

; Encoding:
ModeOp = { op: "Mode",
           ? min_count: IntOpArgument,   ; Default 1.
           ? tie_low: bool,              ; Default true.
           type: SimpleType
}

Discard all votes that are not of the specified TYPE. Of the remaining votes, look for the value that has received the most votes. If no value has received at least MIN_COUNT votes, then return "no consensus".

If there is a single value that has received the most votes, return it. Break ties in favor of lower values if BREAK_TIES_LOW is true, and in favor of higher values if BREAK_TIES_LOW is false. (Perform comparisons in canonical cbor order.)
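
As an illustration, the Mode operation might be sketched in Python as follows. The `accept` predicate is a hypothetical stand-in for the TYPE check (it defaults to "uint"), `None` stands in for "no consensus", and Python's native ordering is assumed in place of canonical CBOR order, which is only safe for integers.

```python
from collections import Counter

def is_uint(value):
    """Default type check for this sketch: a non-negative, non-bool int."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0

def mode_op(votes, min_count=1, tie_low=True, accept=is_uint):
    """Return the most common accepted vote, or None for no consensus."""
    counts = Counter(v for v in votes if accept(v))
    if not counts:
        return None
    best = max(counts.values())
    if best < min_count:
        return None
    # Every value that received the highest number of votes.
    tied = [value for value, count in counts.items() if count == best]
    return min(tied) if tie_low else max(tied)
```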

Threshold

Parameters: MIN_COUNT (an integer), BREAK_MULTI_LOW (a boolean), TYPE (a SimpleType)

; Encoding
ThresholdOp = { op: "Threshold",
                min_count: IntOpArgument,  ; No default.
                ? multi_low: bool,          ; Default true.
                type: SimpleType
}

Discard all votes that are not of the specified TYPE. Sort in canonical cbor order. If BREAK_MULTI_LOW is false, reverse the order of the list.

Return the first element that received at least MIN_COUNT votes. If no value has received at least MIN_COUNT votes, then return "no consensus".
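
A small Python sketch of the Threshold operation follows. As with the other sketches, it is not normative: `accept` is a hypothetical stand-in for the TYPE check, `None` stands in for "no consensus", and Python's native ordering is assumed in place of canonical CBOR order (safe for integers only).

```python
from collections import Counter

def is_uint(value):
    """Default type check for this sketch: a non-negative, non-bool int."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0

def threshold_op(votes, min_count, multi_low=True, accept=is_uint):
    """Return the first value, in sorted order, with at least min_count votes."""
    counts = Counter(v for v in votes if accept(v))
    # Ascending order if multi_low is true, descending otherwise.
    for value in sorted(counts, reverse=not multi_low):
        if counts[value] >= min_count:
            return value
    return None  # no consensus
```

Unlike Mode, Threshold can return a value that is not the most popular one, as long as it is the first in order to reach MIN_COUNT.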

BitThreshold

Parameters: MIN_COUNT (an integer >= 1)

; Encoding
BitThresholdOp = { op: "BitThreshold",
                   min_count: IntOpArgument, ; No default.
}

These are usually not needed, but are quite useful for building some ProtoVer operations.

Discard all votes that are not of type uint or bstr; construe bstr inputs as having type "biguint".

The output is a uint or biguint in which the b'th bit is set iff the b'th bit is set in at least MIN_COUNT of the votes.
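
For illustration, BitThreshold can be sketched in Python, where integers are already arbitrary-precision. This sketch reads bstr inputs as big-endian unsigned integers (as CBOR bignums are) and returns a Python int in every case; how the result is encoded as uint or biguint is left out. Returning 0 when no votes remain is a behavior of this sketch, not something the text specifies.

```python
def bit_threshold_op(votes, min_count):
    """Set each output bit that is set in at least min_count votes."""
    ints = []
    for vote in votes:
        if isinstance(vote, bool):
            continue  # bools are not uints for our purposes
        if isinstance(vote, int) and vote >= 0:
            ints.append(vote)
        elif isinstance(vote, bytes):
            ints.append(int.from_bytes(vote, "big"))
    width = max((v.bit_length() for v in ints), default=0)
    result = 0
    for b in range(width):
        if sum((v >> b) & 1 for v in ints) >= min_count:
            result |= 1 << b
    return result
```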

Voting operations for lists

These operations work on lists of SimpleVal:

; List type definitions
ListVal = [ * SimpleVal ]

ListType = [ "list",
             [ *SimpleType ] / nil ]

They are encoded as:

; Only one list operation exists right now.
ListOp = SetJoinOp

SetJoin

Parameters: MIN_COUNT (an integer >= 1). Optional parameters: TYPE (a SimpleType.)

; Encoding:
SetJoinOp = {
   op: "SetJoin",
   min_count: IntOpArgument,
   ? type: SimpleType
}

Discard all votes that are not lists. From each vote, discard all members that are not of type 'TYPE'.

For the consensus, construct a new list containing exactly those elements that appear in at least MIN_COUNT votes.

(Note that the input votes may contain duplicate elements. These must be treated as if there were no duplicates: the vote [1, 1, 1, 1] is the same as the vote [1]. Implementations may want to preprocess votes by discarding all but one instance of each member.)
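The SetJoin rule, including the duplicate handling described above, can be sketched as follows. The type_check callable is a hypothetical stand-in for the TYPE parameter, and sorting the output is an assumption made only so that the result is deterministic; list members are assumed to be hashable and mutually comparable.

```python
from collections import Counter


def set_join(votes, min_count, type_check=None):
    """Sketch of SetJoin: keep members that appear in >= min_count votes."""
    counts = Counter()
    for vote in votes:
        if not isinstance(vote, list):
            continue  # discard votes that are not lists
        # A set removes duplicates, so [1, 1, 1, 1] counts the same as [1].
        members = {m for m in vote if type_check is None or type_check(m)}
        counts.update(members)
    return sorted(m for m, n in counts.items() if n >= min_count)
```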

Voting operations for maps

Map voting operations work over maps from key types to other non-map types.

; Map type definitions.
MapVal = { * SimpleVal => ItemVal }
ItemVal = ListVal / SimpleVal

MapType = [ "map", [ *SimpleType ] / nil, [ *ItemType ] / nil ]
ItemType = ListType / SimpleType

They are encoded as:

; MapOp encodings
MapOp = MapJoinOp / StructJoinOp

MapJoin

The MapJoin operation combines homogeneous maps (that is, maps from a single key type to a single value type.)

Parameters: KEY_MIN_COUNT (an integer >= 1), KEY_TYPE (a SimpleType), ITEM_OP (a non-MapJoin voting operation)

Encoding:

; MapJoin operation encoding
MapJoinOp = {
   op: "MapJoin"
   ? key_min_count: IntOpArgument, ; Default 1.
   key_type: SimpleType,
   item_op: ListOp / SimpleOp
}

First, discard all votes that are not maps. Then consider the set of keys from each vote as if they were a list, and apply SetJoin[KEY_MIN_COUNT,KEY_TYPE] to those lists. The resulting list is a set of keys to consider including in the output map.

We have a separate key_min_count field, even if item_op has its own min_count field, because some min_count values (like qfield) depend on the overall number of votes for the field. Having key_min_count lets us specify rules like "the FOO of all votes on this field, if there are at least 2 such votes."

For each key in the output list, run the sub-voting operation ITEM_OP on the values it received in the votes. Discard all keys for which the outcome was "no consensus".

The final vote result is a map from the remaining keys to the values produced by the voting operation.
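As an illustration, the MapJoin steps might look like the sketch below. The item operation is passed in as a plain function; key_ok stands in for KEY_TYPE, and unanimous is a toy ITEM_OP invented for the example. None of these names come from the specification.

```python
from collections import Counter

NO_CONSENSUS = object()  # hypothetical sentinel for "no consensus"


def unanimous(values):
    """Toy ITEM_OP: a consensus exists only if every vote agrees."""
    return values[0] if len(set(values)) == 1 else NO_CONSENSUS


def map_join(votes, key_min_count, key_ok, item_op):
    """Sketch of MapJoin over a list of votes."""
    maps = [v for v in votes if isinstance(v, dict)]  # discard non-maps
    # SetJoin on the keys: count the votes in which each key appears.
    key_counts = Counter()
    for m in maps:
        key_counts.update(k for k in m if key_ok(k))
    result = {}
    for key in sorted(k for k, n in key_counts.items() if n >= key_min_count):
        outcome = item_op([m[key] for m in maps if key in m])
        if outcome is not NO_CONSENSUS:
            result[key] = outcome  # keys without consensus are dropped
    return result
```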

StructJoin

A StructJoinOp operation describes a way to vote on maps that encode a structure-like object.

Parameters: KEY_RULES (a map from int or string to StructItemOp), UNKNOWN_RULE (an operation to apply to unrecognized keys)

; Encoding
StructItemOp = ListOp / SimpleOp / MapJoinOp / DerivedItemOp /
    CborDerivedItemOp

VoteableStructKey = int / tstr

StructJoinOp = {
    op: "StructJoin",
    key_rules: {
        * VoteableStructKey => StructItemOp,
    }
    ? unknown_rule: StructItemOp
}

To apply a StructJoinOp to a set of votes, first discard every vote that is not a map. Then consider the set of keys from all the votes as a single list, with duplicates removed. Also remove all entries that are not integers or strings from the list of keys.

For each key, look for that key in the KEY_RULES map. If there is an entry, then apply the StructItemOp for that entry to the values for that key in every vote. Otherwise, if an UNKNOWN_RULE is present, apply the UNKNOWN_RULE operation to the values for that key in every vote. Otherwise, there is no consensus for the values of this key. If there is a consensus for the values, then the key should map to that consensus in the result.

This operation always reaches a consensus, even if it is an empty map.

CborData

A CborData operation wraps another operation, and tells the authorities that after the operation is completed, its result should be decoded as a CBOR bytestring and interpolated directly into the document.

Parameters: ITEM_OP (Any SingleOp that can take a bstr input.)

 ; Encoding
 CborSimpleOp = {
     op: "CborSimple",
     item-op: MedianOp / ModeOp / ThresholdOp / NoneOp
 }
 CborDerivedItemOp = {
     op: "CborDerived",
     item-op: DerivedItemOp,
 }

To apply either of these operations to a set of votes, first apply ITEM_OP to those votes. After that's done, check whether the consensus from that operation is a bstr that encodes a single item of "well-formed" "valid" cbor. If it is not, this operation gives no consensus. Otherwise, the consensus value for this operation is the decoding of that bstr value.

DerivedFromField

This operation can only occur within a StructJoinOp operation (or a semantically similar SectionRules). It indicates that one field should have been derived from another. It can be used, for example, to say that a relay's version is "derived from" a relay's descriptor digest.

Unlike other operations, this one depends on the entire consensus (as computed so far), and on the entirety of the set of votes.

This operation might be a mistake, but we need it to continue lots of our current behavior.

Parameters:

`FIELDS` (one or more other locations in the vote)
`RULE` (the rule used to combine values)

Encoding:

; This item is "derived from" some other field.
DerivedItemOp = {
    op: "DerivedFrom",
    fields: [ +SourceField ],
    rule: SimpleOp
}

; A field in the vote.
SourceField = [ FieldSource, VoteableStructKey ]

; A location in the vote.  Each location here can only
; be referenced from later locations, or from itself.
FieldSource = "M" ; Meta.
           / "CP" ; ClientParam.
           / "SP" ; ServerParam.
           / "RM" ; Relay-meta
           / "RS" ; Relay-SNIP
           / "RL" ; Relay-legacy

To compute a consensus with this operation, first locate each field described in the SourceField entry in each VoteDocument (if present), and in the consensus computed so far. If there is no such field in the consensus or if it has not been computed yet, then this operation produces "no consensus". Otherwise, discard the VoteDocuments that do not have the same value for the field as the consensus, and their corresponding votes for this field. Do this for every listed field.

At this point, we have a set of votes for this field's value that all come from VoteDocuments that describe the same value for the source field(s). Apply the RULE operation to those votes in order to give the result for this voting operation.

The DerivedFromField members in a SectionRules or a StructJoinOp should be computed after the other members, so that they can refer to those members themselves.
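The filtering step of DerivedFrom can be sketched as below. The sketch flattens each VoteDocument, and the consensus computed so far, into a dictionary keyed by (FieldSource, key) pairs; that flattening, the mode_rule helper, and the sentinel are all assumptions made for illustration only.

```python
from collections import Counter

NO_CONSENSUS = object()  # hypothetical sentinel for "no consensus"


def mode_rule(values):
    """Toy RULE: the most common value, if there are any votes at all."""
    if not values:
        return NO_CONSENSUS
    return Counter(values).most_common(1)[0][0]


def derived_from(vote_docs, consensus, fields, target, rule):
    """Vote on `target` using only documents that agree on every source field."""
    remaining = list(vote_docs)
    for field in fields:
        if field not in consensus:
            return NO_CONSENSUS  # source field missing or not yet computed
        remaining = [d for d in remaining
                     if field in d and d[field] == consensus[field]]
    return rule([d[target] for d in remaining if target in d])
```

In this model, a relay's version can be restricted to the votes that agree with the consensus descriptor digest, even when a different version has more votes overall.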

Voting on document sections

Voting on a section of the document is similar to the StructJoin operation, with some exceptions. When we vote on a section of the document, we do not apply a single voting rule immediately. Instead, we first "merge" a set of SectionRules together, and then apply the merged rule to the votes. This is the only place where we merge rules like this.

A SectionRules is not a voting operation, so its format is not tagged with an "op":

; Format for section rules.
SectionRules = {
  * VoteableStructKey => SectionItemOp,
  ? nil => SectionItemOp
}
SectionItemOp = StructJoinOp / StructItemOp

To merge a set of SectionRules together, proceed as follows. For each key, consider whether at least QUORUM_AUTH authorities have voted the same StructItemOp for that key. If so, that StructItemOp is the resulting operation for this key. Otherwise, there is no entry for this key.

Do the same for the "nil" StructItemOp; use the result as the UNKNOWN_RULE.

Note that this merging operation is not recursive.
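A rough sketch of the merge follows. It models each authority's SectionRules as a dictionary whose values are JSON-like operation descriptions, uses the Python key None for the "nil" entry, and compares operations by a sorted JSON encoding as a stand-in for comparing canonical cbor. It also assumes QUORUM_AUTH is more than half of the authorities, so that at most one operation per key can reach it.

```python
import json
from collections import Counter


def merge_section_rules(rule_sets, quorum_auth):
    """Keep, per key, the operation voted identically by >= quorum_auth."""
    counts = {}
    for rules in rule_sets:
        for key, op in rules.items():
            # Assumption: a stable encoding is enough to test equality.
            canon = json.dumps(op, sort_keys=True)
            counts.setdefault(key, Counter())[canon] += 1
    merged = {}
    for key, counter in counts.items():
        canon, n = counter.most_common(1)[0]
        if n >= quorum_auth:
            merged[key] = json.loads(canon)
    return merged
```

The entry stored under None in the result plays the role of the UNKNOWN_RULE.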

A CBOR-based metaformat for votes.

A vote is a signed document containing a number of sections; each section corresponds roughly to a section of another document, a description of how the vote is to be conducted, or both.

; VoteDocument is a top-level signed vote.
VoteDocument = [
    ; Each signature may be produced by a different key, if they
    ; are all held by the same authority.
    sig: [ + SingleSig ],
    lifetime: Lifespan,
    digest-alg: DigestAlgorithm,
    body: bstr .cbor VoteContent
]

VoteContent = {
    ; List of supported consensus methods.
    consensus-methods: [ + uint ],

    ; Text-based legacy vote to be used if the negotiated
    ; consensus method is too old.  It should itself be signed.
    ; It's encoded as a series of text chunks, to help with
    ; cbor-based binary diffs.
    ? legacy-vote: [ * tstr ],

    ; How should the votes within the individual sections be
    ; computed?
    voting-rules: VotingRules,

    ; Information that the authority wants to share about this
    ; vote, which is not itself voted upon.
    notes: NoteSection,

    ; Meta-information that the authorities vote on, which does
    ; not actually appear in the ENDIVE or consensus directory.
    meta: MetaSection .within VoteableSection,

    ; Fields that appear in the client network parameter document.
    client-params: ParamSection .within VoteableSection,
    ; Fields that appear in the server network parameter document.
    server-params: ParamSection .within VoteableSection,

    ; Information about each relay.
    relays: RelaySection,

    ; Information about indices.
    indices: IndexSection,

    * tstr => any
}

; Self-description of a voter.
VoterSection = {
    ; human-memorable name
    name: tstr,

    ; List of link specifiers to use when uploading to this
    ; authority. (See proposal for dirport link specifier)
    ? ul: [ *LinkSpecifier ],

    ; List of link specifiers to use when downloading from this authority.
    ? dl: [ *LinkSpecifier ],

    ; contact information for this authority.
    ? contact: tstr,

    ; legacy certificate in format given by dir-spec.txt.
    ? legacy-cert: tstr,

    ; for extensions
    * tstr => any,
}

; An indexsection says how we think routing indices should be built.
IndexSection = {
    * IndexId => bstr .cbor [ IndexGroupId, GenericIndexRule ],
}
IndexGroupId = uint
; A mechanism for building a single routing index.  Actual values need to
; be within RecognizedIndexRule or the authority can't complete the
; consensus.
GenericIndexRule = {
    type: tstr,
    * tstr => any
}
RecognizedIndexRule = EdIndex / RSAIndex / BWIndex / WeightedIndex
; The values in an EdIndex are derived from digests of Ed25519 keys.
EdIndex = {
    type: "ed-id",
    alg: DigestAlgorithm,
    prefix: bstr,
    suffix: bstr
}
; The values in an RSAIndex are derived from RSA keys.
RSAIndex = {
    type: "rsa-id"
}
; A BWIndex is built by taking some uint-valued field referred to by
; SourceField from all the relays that have all of require_flags set.
BWIndex = {
    type: "bw",
    bwfield: SourceField,
    require_flags: FlagSet,
}
; A flag can be prefixed with "!" to indicate negation.  A flag
; with a name like P@X indicates support for port class 'X' in its
; exit policy.
;
; FUTURE WORK: perhaps we should add more structure here and it
; should be a matching pattern?
FlagSet = [ *tstr ]
; A WeightedIndex applies a set of weights to a BWIndex based on which
; flags the various routers have.  Relays that match a set of flags have
; their weights multiplied by the corresponding WeightVal.
WeightedIndex = {
    type: "weighted",
    source: BWIndex,
    weight: { * FlagSet => WeightVal }
}
; A WeightVal is either an integer to multiply bandwidths by, or a
; string from the Wgg, Weg, Wbm, ... set as documented in dir-spec.txt,
; or a reference to an earlier field.
WeightVal = uint / tstr / SourceField
VoteableValue =  MapVal / ListVal / SimpleVal

; A "VoteableSection" is something that we apply part of the
; voting rules to.  When we apply voting rules to these sections,
; we do so without regards to their semantics.  When we are done,
; we use these consensus values to make the final consensus.
VoteableSection = {
   * VoteableStructKey => VoteableValue,
}

; A NoteSection is used to convey information about the voter and
; its vote that is not actually voted on.
NoteSection = {
   ; Information about the voter itself
   voter: VoterSection,
   ; Information that the voter used when assigning flags.
   ? flag-thresholds: { tstr => any },
   ; Headers from the bandwidth file to be reported as part of
   ; the vote.
   ? bw-file-headers: {tstr => any },
   ? shared-rand-commit: SRCommit,
   * VoteableStructKey => VoteableValue,
}

; Shared random commitment; fields are as for the current
; shared-random-commit fields.
SRCommit = {
   ver: uint,
   alg: DigestAlgorithm,
   ident: bstr,
   commit: bstr,
   ? reveal: bstr
}

; the meta-section is voted on, but does not appear in the ENDIVE.
MetaSection = {
   ; Seconds to allocate for voting and distributing signatures
   ; Analogous to the "voting-delay" field in the legacy algorithm.
   voting-delay: [ vote_seconds: uint, dist_seconds: uint ],
   ; Proposed time till next vote.
   voting-interval: uint,
   ; proposed lifetime for the SNIPs and ENDIVEs
   snip-lifetime: Lifespan,
   ; proposed lifetime for client params document
   c-param-lifetime: Lifespan,
   ; proposed lifetime for server params document
   s-param-lifetime: Lifespan,
   ; signature depth for ENDIVE
   signature-depth: uint,
   ; digest algorithm to use with ENDIVE.
   signature-digest-alg: DigestAlgorithm,
   ; Current and previous shared-random values
   ? cur-shared-rand: [ reveals: uint, rand: bstr ],
   ? prev-shared-rand: [ reveals: uint, rand: bstr ],
   ; extensions.
   * VoteableStructKey => VoteableValue,
}

; A ParamSection will be made into a ParamDoc after voting;
; the fields are analogous.
ParamSection = {
   ? certs: [ 1*2 bstr .cbor VoterCert ],
   ? recommend-versions: [ * tstr ],
   ? require-protos: ProtoVersions,
   ? recommend-protos: ProtoVersions,
   ? params: NetParams,
   * VoteableStructKey => VoteableValue,
}
RelaySection = {
   ; Mapping from relay identity key (or digest) to relay information.
   * bstr => RelayInfo,
}

; A RelayInfo is a vote about a single relay.
RelayInfo = {
   meta: RelayMetaInfo .within VoteableSection,
   snip: RelaySNIPInfo .within VoteableSection,
   legacy: RelayLegacyInfo .within VoteableSection,
}

; Information about a relay that doesn't go into a SNIP.
RelayMetaInfo = {
    ; Tuple of published-time and descriptor digest.
    ? desc: [ uint , bstr ],
    ; What flags are assigned to this relay?  We use a
    ; string->value encoding here so that only the authorities
    ; who have an opinion on the status of a flag for a relay need
    ; to vote yes or no on it.
    ? flags: { *tstr=>bool },
    ; The relay's self-declared bandwidth.
    ? bw: uint,
    ; The relay's measured bandwidth.
    ? mbw: uint,
    ; The fingerprint of the relay's RSA identity key.
    ? rsa-id: RSAIdentityFingerprint
}
; SNIP information can just be voted on directly; the formats
; are the same.
RelaySNIPInfo = SNIPRouterData

; Legacy information is used to build legacy consensuses, but
; not actually required by walking onions clients.
RelayLegacyInfo = {
   ; Mapping from consensus version to microdescriptor digests
   ; and microdescriptors.
   ? mds: [ *Microdesc ],
}

; Microdescriptor votes now include the digest AND the
; microdescriptor-- see note.
Microdesc = [
   low: uint,
   high: uint,
   digest: bstr .size 32,
   ; This is encoded in this way so that cbor-based diff tools
   ; can see inside it.  Because of compression and diffs,
   ; including microdesc text verbatim should be comparatively cheap.
   content: encoded-cbor .cbor [ *tstr ],
]

; ==========

; The VotingRules field explains how to vote on the members of
; each section
VotingRules = {
    meta: SectionRules,
    params: SectionRules,
    relay: RelayRules,
    indices: SectionRules,
}

; The RelayRules object explains the rules that apply to each
; part of a RelayInfo.  A key will appear in the consensus if it
; has been listed by at least key_min_count authorities.
RelayRules = {
    key_min_count: IntOpArgument,
    meta: SectionRules,
    snip: SectionRules,
    legacy: SectionRules,
}

Computing a consensus.

To compute a consensus, the authorities first verify that all the votes are timely and correctly signed by real authorities. This includes validating all invariants stated here, and all internal documents.

If they have two votes from an authority, authorities SHOULD issue a warning, and they should take the one that is published more recently.

TODO: Teor suggests that maybe we shouldn't warn about two votes from an authority for the same period, and we could instead have a more resilient process here, where authorities can update their votes at various times over the voting period, up to some point.

I'm not sure whether this helps reliability more or less than it risks it, but it is worth investigating.

Next, the authorities determine the consensus method as they do today, using the field "consensus-methods". This can also be expressed as the voting operation Threshold[SUPERQUORUM_PRESENT, false, uint].

If there is no consensus for the consensus-method, then voting stops without having produced a consensus.

Note that in contrast with its behavior in the current voting algorithm, the consensus method does not determine the way to vote on every individual field: that aspect of voting is controlled by the voting-rules. Instead, the consensus-method changes other aspects of this voting, such as:

* Adding, removing, or changing the semantics of voting
  operations.
* Changing the set of documents to which voting operations apply.
* Otherwise changing the rules that are set out in this
  document.

Once a consensus-method is decided, the next step is to compute the consensus for other sections in this order: meta, client-params, server-params, and indices. The consensus for each is calculated according to the operations given in the corresponding section of VotingRules.

Next the authorities compute a consensus on the relays section, which is done slightly differently, according to the rules of RelayRules element of VotingRules.

Finally, the authorities transform the resulting sections into an ENDIVE and a legacy consensus, as in "Computing an ENDIVE" and "Computing a legacy consensus" below.

To vote on a single VoteableSection, find the corresponding SectionRules object in the VotingRules of the votes, and apply it as described above in "Voting on document sections".

If an older consensus method is negotiated (Transitional)

The legacy-vote field in the vote document contains an older (v3, text-style) consensus vote, and is used when an older consensus method is negotiated. The legacy-vote is encoded by splitting it into pieces, to help with CBOR diff calculation. Authorities MAY split at line boundaries, space boundaries, or anywhere that will help with diffs. To reconstruct the legacy vote, concatenate the members of legacy-vote in order. The resulting string MUST validate according to the rules of the legacy voting algorithm.

If a legacy vote is present, then authorities SHOULD include the same view of the network in the legacy vote as they included in their real vote.

If a legacy vote is present, then authorities MUST give the same list of consensus-methods and the same voting schedule in both votes. Authorities MUST reject noncompliant votes.

Computing an ENDIVE.

If a consensus-method is negotiated that is high enough to support ENDIVEs, then the authorities proceed as follows to transform the consensus sections above into an ENDIVE.

The ParamSections from the consensus are used verbatim as the bodies of the client-params and relay-params fields.

The fields that appear in each RelaySNIPInfo determine what goes into the SNIPRouterData for each relay. To build the relay section, first decide which relays appear according to the key_min_count field in the RelayRules. Then collate relays across all the votes by their keys, and see which ones are listed. For each key that appears in at least key_min_count votes, apply the RelayRules to each section of the RelayInfos for that key.

The sig_params section is derived from fields in the meta section. Fields with identical names are simply copied; Lifespan values are copied to the corresponding documents (snip-lifetime as the lifespan for SNIPs and ENDIVEs, and c and s-param-lifetime as the lifespan for ParamDocs).

To compute the signature nonce, use the signature digest algorithm to compute the digest of each input vote body, sort those digests lexicographically, and concatenate and hash those digests again.
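The nonce computation can be illustrated with Python's hashlib. SHA3-256 is used here purely as a placeholder for whatever signature-digest-alg the authorities have agreed on, and the function name is hypothetical.

```python
import hashlib


def signature_nonce(vote_bodies, digest_name="sha3_256"):
    """Digest each vote body, sort the digests, then hash the concatenation."""
    digests = sorted(hashlib.new(digest_name, body).digest()
                     for body in vote_bodies)
    return hashlib.new(digest_name, b"".join(digests)).digest()
```

Because the digests are sorted before the final hash, every authority derives the same nonce regardless of the order in which it received the votes.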

Routing indices are built according to named IndexRules, and grouped according to fields in the meta section. See "Constructing Indices" below.

(At this point extra fields may be copied from the Meta section of each RelayInfo into the ENDIVERouterData depending on the meta document; we do not, however, currently specify any case where this is done.)

Constructing indices

After having built the list of relays, the authorities construct and encode the indices that appear in the ENDIVEs. The voted-upon GenericIndexRule values in the IndexSection of the consensus say how to build the indices in the ENDIVE, as follows.

An EdIndex is built using the IndexType_Ed25519Id value, with the provided prefix and suffix values. Authorities don't need to expand this index in the ENDIVE, since the relays can compute it deterministically.

An RSAIndex is built using the IndexType_RSAId type. Authorities don't need to expand this index in the ENDIVE, since the relays can compute it deterministically.

A BWIndex is built using the IndexType_Weighted type. Each relay has a weight equal to some specified bandwidth field in its consensus RelayInfo. If a relay is missing any of the require_flags in its meta section, or if it does not have the specified bandwidth field, that relay's weight becomes 0.

A WeightedIndex is built by computing a BWIndex, and then transforming each relay in the list according to the flags that it has set. Relays that match any set of flags in the WeightedIndex rule get their bandwidths multiplied by all WeightVals that apply. Some WeightVals are computed according to special rules, such as "Wgg", "Weg", and so on. These are taken from the current dir-spec.txt.

For both BWIndex and WeightedIndex values, authorities MUST scale the computed outputs so that no value is greater than UINT32_MAX; they MUST do so by shifting all values right by the lowest number of bits that achieves this.

We could specify a more precise algorithm, but this is simpler.
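The shift-based scaling rule is small enough to show directly. In this sketch the function name scale_weights is hypothetical; the shift is derived from the bit length of the largest value, which gives the lowest shift that brings every value to at most UINT32_MAX.

```python
UINT32_MAX = 2**32 - 1


def scale_weights(weights):
    """Shift all weights right by the smallest amount that fits 32 bits."""
    biggest = max(weights, default=0)
    # A value fits in 32 bits exactly when its bit length is <= 32.
    shift = max(0, biggest.bit_length() - 32)
    return [w >> shift for w in weights]
```

For example, the weights [2**33, 4, 1] need a shift of 2 bits and become [2**31, 1, 0].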

Indices with the same IndexGroupId are placed in the same index group; index groups are ordered numerically.

Computing a legacy consensus.

When using a consensus method that supports Walking Onions, the legacy consensus is computed from the same data as the ENDIVE. Because the legacy consensus format will be frozen once Walking Onions is finalized, we specify this transformation directly, rather than in a more extensible way.

The published time and descriptor digest are used directly. Microdescriptor negotiation proceeds as before. Bandwidths, measured bandwidths, descriptor digests, published times, flags, and rsa-id values are taken from the RelayMetaInfo section. Addresses, protovers, versions, and so on are taken from the RelaySNIPInfo. Header fields are all taken from the corresponding header fields in the MetaSection or the ClientParamsSection. All parameters are copied into the net-params field.

Managing indices over time.

The present voting mechanism does not do a great job of handling the authorities

The semantic meaning of most IndexId values, as understood by clients, should remain unchanged; if a client uses index 6 for middle nodes, 6 should always mean "middle nodes".

If an IndexId is going to change its meaning over time, it should not be hardcoded by clients; it should instead be listed in the NetParams document, as the exit indices are in the port-classes field. (See also section 6 and appendix AH.) If such a field needs to change, it also needs a migration method that allows clients with older and newer parameters documents to exist at the same time.

Relay operations: Receiving and expanding ENDIVEs

Previously, we introduced a format for ENDIVEs to be transmitted from authorities to relays. To save on bandwidth, the relays download diffs rather than entire ENDIVEs. The ENDIVE format makes several choices in order to make these diffs small: the Merkle tree is omitted, and routing indices are not included directly.

To address those issues, this document describes the steps that a relay needs to perform, upon receiving an ENDIVE document, to derive all the SNIPs for that ENDIVE.

Here are the steps to be followed. We'll describe them in order, though in practice they could be pipelined somewhat. We'll expand further on each step later on.

  1. Compute routing indices positions.

  2. Compute truncated SNIPRouterData variations.

  3. Build signed SNIP data.

  4. Compute Merkle tree.

  5. Build authenticated SNIPs.

Below we'll specify specific algorithms for these steps. Note that relays do not need to follow the steps of these algorithms exactly, but they MUST produce the same outputs as if they had followed them.

Computing index positions.

For every IndexId in every Index Group, the relay will compute the full routing index. Every routing index is a mapping from index position ranges (represented as 2-tuples) to relays, where the relays are represented as ENDIVERouterData members of the ENDIVE. The routing index must map every possible value of the index to exactly one relay.

An IndexSpec field describes how the index is to be constructed. There are five types of IndexSpec: Raw, RawNumeric, Weighted, RSAId, and Ed25519Id. We'll describe how to build the indices for each.

Every index may either have an integer key, or a binary-string key. We define the "successor" of an integer index as the succeeding integer. We define the "successor" of a binary string as the next binary string of the same length in lexicographical (memcmp) order. We define "predecessor" as the inverse of "successor". Both these operations "wrap around" the index.

The algorithms here describe a set of invariants that are "verified". Relays SHOULD check each of these invariants; authorities MUST NOT generate any ENDIVEs that violate them. If a relay encounters an ENDIVE that cannot be verified, then the ENDIVE cannot be expanded.

NOTE: Should there be some way to define an index as a subset of another index, with elements weighted in different ways? In other words, "Index a is index b, except multiply these relays by 0 and these relays by 1.2". We can keep this idea sitting around in case there turns out to be a use for it.

Raw indices

When the IndexType is IndexType_Raw, then its members are listed directly in the IndexSpec.

Algorithm: Expanding a "Raw" indexspec.

Let result_idx = {} (an empty mapping).

Let previous_pos = indexspec.first_index

For each element [i, pos2] of indexspec.index_ranges:

    Verify that i is a valid index into the list of ENDIVERouterData.

    Set pos1 = the successor of previous_pos.

    Verify that pos1 and pos2 have the same type.

    Append the mapping (pos1, pos2) => i to result_idx

    Set previous_pos to pos2.

Verify that previous_pos = the predecessor of indexspec.first_index.

Return result_idx.

Raw numeric indices

If the IndexType is IndexType_RawNumeric, it is described by a set of spans on a 32-bit index range.

Algorithm: Expanding a RawNumeric index.

Let result_idx = {} (an empty mapping).

Let prev_pos = 0

For each element [i, span] of indexspec.index_ranges:

    Verify that i is a valid index into the list of ENDIVERouterData.

    Verify that prev_pos <= UINT32_MAX - span.

    Let pos2 = prev_pos + span.

    Append the mapping (prev_pos, pos2) => i to result_idx.

    Let prev_pos = successor(pos2)

Verify that prev_pos = UINT32_MAX.

Return result_idx.

Weighted indices

If the IndexSpec type is IndexType_Weighted, then the index is described by assigning a probability weight to each of a number of relays. From these, we compute a series of 32-bit index positions.

This algorithm uses 64-bit math, and 64-by-32-bit integer division.

It requires that the sum of weights is no more than UINT32_MAX.

Algorithm: Expanding a "Weighted" indexspec.

Let total_weight = SUM(indexspec.index_weights)

Verify total_weight <= UINT32_MAX.

Let total_so_far = 0.

Let result_idx = {} (an empty mapping).

Define POS(b) = FLOOR( (b << 32) / total_weight).

For 0 <= i < LEN(indexspec.index_weights):

   Let w = indexspec.index_weights[i].

   Let lo = POS(total_so_far).

   Let total_so_far = total_so_far + w.

   Let hi = POS(total_so_far) - 1.

   Append (lo, hi) => i to result_idx.

Verify that total_so_far = total_weight.

Verify that the last value of "hi" was UINT32_MAX.

Return result_idx.

This algorithm is a bit finicky in its use of division, but it results in a mapping onto 32 bit integers that completely covers the space of available indices.
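The weighted expansion translates almost line for line into Python, whose integers are arbitrary precision and so cover the 64-bit intermediate values. The function name and the choice to raise ValueError on a failed verification (including a total weight of zero, which the algorithm text does not address) are assumptions of this sketch.

```python
UINT32_MAX = 2**32 - 1


def expand_weighted(index_weights):
    """Map each relay position i to a (lo, hi) range of 32-bit positions."""
    total_weight = sum(index_weights)
    if not 0 < total_weight <= UINT32_MAX:
        raise ValueError("total weight must be in 1..UINT32_MAX")

    def pos(b):
        # POS(b) = FLOOR((b << 32) / total_weight)
        return (b << 32) // total_weight

    result_idx = []
    total_so_far = 0
    hi = None
    for i, w in enumerate(index_weights):
        lo = pos(total_so_far)
        total_so_far += w
        hi = pos(total_so_far) - 1
        # A zero weight yields an empty range (hi < lo); it is kept here
        # because the algorithm text appends it unconditionally.
        result_idx.append(((lo, hi), i))
    if total_so_far != total_weight or hi != UINT32_MAX:
        raise ValueError("index does not cover the 32-bit space")
    return result_idx
```

Two relays of equal weight split the space in half: expand_weighted([1, 1]) gives the ranges (0, 2**31 - 1) and (2**31, 2**32 - 1).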

RSAId indices

If the IndexSpec type is IndexType_RSAId then the index is a set of binary strings describing the routers' legacy RSA identities, for use in the HSv2 hash ring.

These identities are truncated to a fixed length. Though the SNIP format allows variable-length binary prefixes, we do not use this feature.

Algorithm: Expanding an "RSAId" indexspec.

Let R = [ ] (an empty list).

Take the value n_bytes from the IndexSpec.

For 0 <= b_idx < MIN( LEN(indexspec.members) * 8,
                      LEN(list of ENDIVERouterData) ):

   Let b = the b_idx'th bit of indexspec.members.

   If b is 1:
       Let m = the b_idx'th member of the ENDIVERouterData list.

       Verify that m has its RSAIdentityFingerprint set.

       Let pos = m.RSAIdentityFingerprint, truncated to n_bytes.

       Add (pos, b_idx) to the list R.

Return INDEX_FROM_RING_KEYS(R).

Sub-Algorithm: INDEX_FROM_RING_KEYS(R)

First, sort R according to its 'pos' field.

Let result_idx = {} (an empty mapping).

For each member (pos, idx) of the list R:

    If this is the first member of the list R:
        Let key_low = pos for the last member of R.
    else:
        Let key_low = pos for the previous member of R.

    Let key_high = predecessor(pos)

    Add (key_low, key_high) => idx to result_idx.

Return result_idx.
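The sub-algorithm can be sketched as follows, following the text above literally. The predecessor helper implements the wrap-around rule for fixed-length binary strings; the result is returned as a list of ((key_low, key_high), idx) pairs rather than a mapping, and positions are assumed to be distinct. These representation choices are assumptions of the sketch.

```python
def predecessor(key: bytes) -> bytes:
    """Previous binary string of the same length, wrapping around."""
    n = int.from_bytes(key, "big")
    modulus = 1 << (8 * len(key))
    return ((n - 1) % modulus).to_bytes(len(key), "big")


def index_from_ring_keys(ring):
    """ring: list of (pos, idx) pairs, where pos is a bytes value."""
    ordered = sorted(ring)  # sorts by pos first
    result_idx = []
    for n, (pos, idx) in enumerate(ordered):
        # For the first member, ordered[-1] is the last member of the ring.
        key_low = ordered[n - 1][0]
        key_high = predecessor(pos)
        result_idx.append(((key_low, key_high), idx))
    return result_idx
```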

Ed25519 indices

If the IndexSpec type is IndexType_Ed25519Id, then the index is a set of binary strings describing the routers' positions in a hash ring, derived from their Ed25519 identity keys.

This algorithm is a generalization of the one used for HSv3 rings, to be used to compute the HSv3 ring and other possible future derivatives.

Algorithm: Expanding an "Ed25519Id" indexspec.

Let R = [ ] (an empty list).

Take the values prefix, suffix, and n_bytes from the IndexSpec.

Let H() be the digest algorithm specified by d_alg from the
IndexSpec.

For 0 <= b_idx < MIN( LEN(indexspec.members) * 8,
                      LEN(list of ENDIVERouterData) ):

   Let b = the b_idx'th bit of indexspec.members.

   If b is 1:
       Let m = the b_idx'th member of the ENDIVERouterData list.

       Let key = m's ed25519 identity key, as a 32-byte value.

       Compute pos = H(prefix || key || suffix)

       Truncate pos to n_bytes.

       Add (pos, b_idx) to the list R.

Return INDEX_FROM_RING_KEYS(R).
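The loop that builds the list R for an Ed25519Id index might look like the sketch below. Two details are assumptions, since the text above does not pin them down: bit 0 of the members bitfield is taken to be the most significant bit of the first byte, and SHA3-256 stands in for the digest named by d_alg. The function names are hypothetical.

```python
import hashlib


def member_bit(members: bytes, b_idx: int) -> int:
    # Assumption: bits are numbered from the most significant bit of byte 0.
    return (members[b_idx // 8] >> (7 - b_idx % 8)) & 1


def ed25519_ring_keys(members, ed_keys, prefix, suffix, n_bytes,
                      digest_name="sha3_256"):
    """Return the list R of (pos, b_idx) pairs for the selected relays."""
    ring = []
    for b_idx in range(min(len(members) * 8, len(ed_keys))):
        if member_bit(members, b_idx):
            key = ed_keys[b_idx]
            if len(key) != 32:
                raise ValueError("ed25519 identity keys are 32 bytes")
            digest = hashlib.new(digest_name, prefix + key + suffix).digest()
            ring.append((digest[:n_bytes], b_idx))  # truncate to n_bytes
    return ring
```

The resulting list is what would then be handed to INDEX_FROM_RING_KEYS.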

Building a SNIPLocation

After computing all the indices in an IndexGroup, relays combine them into a series of SNIPLocation objects. Each SNIPLocation MUST contain all the IndexId => IndexRange entries that point to a given ENDIVERouterData, for the IndexIds listed in an IndexGroup.

Algorithm: Build a list of SNIPLocation objects from a set of routing indices.

Initialize R as [ { } ] * LEN(relays)   (A list of empty maps)

For each IndexId "ID" in the IndexGroup:

   Let router_idx be the index map calculated for ID.
   (This is what we computed previously.)

   For each entry ( (LO, HI) => idx) in router_idx:

      Let R[idx][ID] = (LO, HI).

SNIPLocation objects are thus organized in the order in which they will appear in the Merkle tree: that is, sorted by the position of their corresponding ENDIVERouterData.

Because SNIPLocation objects are signed, they must be encoded as "canonical" cbor, according to section 3.9 of RFC 7049.

If R[idx] is {} (the empty map) for any given idx, then no SNIP will be generated for the SNIPRouterData at that routing index for this index group.
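A short sketch of this combination step follows. It assumes each routing index is available as a list of ((LO, HI), idx) pairs, keyed by IndexId in a dictionary; the function name is hypothetical.

```python
def build_snip_locations(n_relays, group_indices):
    """Return one IndexId => (LO, HI) map per relay position."""
    # One distinct empty map per relay. In Python, [{}] * n would repeat a
    # single shared dict, so a comprehension is used instead.
    locations = [{} for _ in range(n_relays)]
    for index_id, router_idx in group_indices.items():
        for (lo, hi), idx in router_idx:
            locations[idx][index_id] = (lo, hi)
    return locations
```

Relay positions whose map is still empty afterwards are the ones for which no SNIP is generated in this index group.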

Computing truncated SNIPRouterData.

An index group can include an omit_from_snips field to indicate that certain fields from a SNIPRouterData should not be included in the SNIPs for that index group.

Since a SNIPRouterData needs to be signed, this process has to be deterministic. Thus, the truncated SNIPRouterData should be computed by removing the keys and values for EXACTLY the keys listed and no more. The remaining keys MUST be left in the same order that they appeared in the original SNIPRouterData, and they MUST NOT be re-encoded.

(Two keys are "the same" if and only if they are integers encoding the same value, or text strings with the same UTF-8 content.)

There is no need to compute a SNIPRouterData when no SNIP is going to be generated for a given router.
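The following sketch shows one way the truncation could be done without re-encoding anything. It assumes (hypothetically) that the SNIPRouterData map has already been split into a list of (decoded key, raw encoded key/value bytes) entries; emitting the new map header is left out.

```python
def truncate_router_data(entries, omit_from_snips):
    """Drop exactly the listed keys; keep the rest in order, bytes untouched.

    entries          -- list of (key, raw_pair_bytes) in original order
    omit_from_snips  -- keys (ints or text strings) to remove
    """
    def same_key(a, b):
        # Integers match integers and strings match strings; the integer 1
        # and the string "1" are different keys.
        return type(a) is type(b) and a == b

    return [
        (key, raw)
        for key, raw in entries
        if not any(same_key(key, omitted) for omitted in omit_from_snips)
    ]
```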

Building the Merkle tree.

After computing a list of (SNIPLocation, SNIPRouterData) for every entry in an index group, the relay needs to expand a Merkle tree to authenticate every SNIP.

There are two steps here: First the relay generates the leaves, and then it generates the intermediate hashes.

To generate the list of leaves for an index group, the relay first removes all entries from the (SNIPLocation, SNIPRouterData) list that have an empty index map. The relay then puts n_padding_entries "nil" entries at the end of the list.

To generate the list of leaves for the whole Merkle tree, the relay concatenates these index group lists in the order in which they appear in the ENDIVE, and pads the resulting list with "nil" entries until the length of the list is a power of two: 2^tree-depth for some integer tree-depth. Let LEAF(IDX) denote the entry at position IDX in this list, where IDX is a bitstring of length tree-depth. LEAF(IDX) is either a byte string or nil.

The relay then recursively computes the hashes in the Merkle tree as follows. (Recall that H_node() and H_leaf() are hashes taking a bit-string PATH, a LIFESPAN and NONCE from the signature information, and a variable-length string ITEM.)

Recursive definition: HM(PATH)

Given PATH a bitstring of length no more than tree-depth.

Define S:
    S(nil) = an all-0 string of the same length as the hash output.
    S(x) = x, for all other x.

If LEN(PATH) = tree-depth:   (Leaf case.)
   If LEAF(PATH) = nil:
     HM(PATH) = nil.
   Else:
     HM(PATH) = H_leaf(PATH, LIFESPAN, NONCE, LEAF(PATH)).

Else:
   Let LEFT = HM(PATH || 0)
   Let RIGHT = HM(PATH || 1)
   If LEFT = nil and RIGHT = nil:
       HM(PATH) = nil
   else:
       HM(PATH) = H_node(PATH, LIFESPAN, NONCE, S(LEFT) || S(RIGHT))

Note that entries aren't computed for "nil" leaves, or any node all of whose children are "nil". The "nil" entries only exist to place all leaves at a constant depth, and to enable spacing out different sections of the tree.

If signature-depth for the ENDIVE is N, the relay does not need to compute any Merkle tree entries for PATHs of length shorter than N bits.
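A minimal Python sketch of this recursion follows. Two things here are assumptions: leaves are hashed with H_leaf and interior nodes with H_node, and the `h_leaf`/`h_node` functions below are placeholders built on SHA3-256, not the encodings defined elsewhere in this proposal. Paths are represented as strings of '0' and '1'.

```python
import hashlib


def _placeholder_hash(tag, path, lifespan, nonce, item):
    # PLACEHOLDER ONLY: not the real H_leaf/H_node input encoding.
    h = hashlib.sha3_256()
    for part in (tag, path.encode(), lifespan, nonce, item):
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.digest()


def h_leaf(path, lifespan, nonce, item):
    return _placeholder_hash(b"leaf", path, lifespan, nonce, item)


def h_node(path, lifespan, nonce, item):
    return _placeholder_hash(b"node", path, lifespan, nonce, item)


def build_merkle(leaves, lifespan, nonce, signature_depth=0):
    """Return {PATH: HM(PATH)} for every non-nil node at or below
    signature_depth.  leaves is a power-of-two-length list of bytes or None."""
    tree_depth = (len(leaves) - 1).bit_length()
    if len(leaves) != 1 << tree_depth:
        raise ValueError("leaf list length must be a power of two")
    zeros = bytes(hashlib.sha3_256().digest_size)   # S(nil)
    hashes = {}

    def hm(path):
        if len(path) == tree_depth:                 # Leaf case.
            leaf = leaves[int(path, 2) if path else 0]
            result = None if leaf is None else h_leaf(path, lifespan, nonce, leaf)
        else:
            left, right = hm(path + "0"), hm(path + "1")
            if left is None and right is None:
                result = None
            else:
                item = (left or zeros) + (right or zeros)
                result = h_node(path, lifespan, nonce, item)
        if result is not None:
            hashes[path] = result
        return result

    # Nodes above signature_depth are never needed, so start from each
    # subtree root at that depth.
    if signature_depth == 0:
        hm("")
    else:
        for i in range(1 << signature_depth):
            hm(format(i, "0{}b".format(signature_depth)))
    return hashes
```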

Assembling the SNIPs

Finally, the relay has computed a list of encoded (SNIPLocation, RouterData) values, and a Merkle tree to authenticate them. At this point, the relay builds them into SNIPs, using the sig_params and signatures from the ENDIVE.

Algorithm: Building a SNIPSignature for a SNIP.

Given a non-nil (SNIPLocation, RouterData) at leaf position PATH.

Let SIG_IDX = PATH, truncated to signature-depth bits.
Consider SIG_IDX as an integer.

Let Sig = signatures[SIG_IDX] -- either the SingleSig or the MultiSig
for this snip.

Let HashPath = []   (an empty list).
For bitlen = signature-depth+1 ... tree-depth-1:
    Let X = PATH, truncated to bitlen bits.
    Invert the final bit of X.
    Append HM(X) to HashPath.

The SNIPSignature's signature value is Sig, and its merkle_path is
HashPath.
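The loop above can be sketched as follows, assuming the intent is to collect, at each level, the hash of the sibling of PATH's prefix. The function name is hypothetical; `hashes` is assumed to map bitstring paths (strings of '0' and '1') to HM values, with nil nodes absent. The range of levels follows the text as written, and a nil sibling is reported as `None` because the text does not say how it should be represented.

```python
def build_hash_path(hashes, path, signature_depth, tree_depth):
    """Return (SIG_IDX, HashPath) for the leaf at bitstring `path`."""
    prefix = path[:signature_depth]
    sig_idx = int(prefix, 2) if prefix else 0

    hash_path = []
    # bitlen = signature-depth+1 ... tree-depth-1, as in the text.
    for bitlen in range(signature_depth + 1, tree_depth):
        x = path[:bitlen]
        # Invert the final bit to name the sibling node at this level.
        sibling = x[:-1] + ("1" if x[-1] == "0" else "0")
        hash_path.append(hashes.get(sibling))
    return sig_idx, hash_path
```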

Implementation considerations

A relay only needs to hold one set of SNIPs at a time: once one ENDIVE's SNIPs have been extracted, then the SNIPs from the previous ENDIVE can be discarded.

To save memory, a relay MAY store SNIPs to disk, and mmap them as needed.

Extending circuits with Walking Onions

When a client wants to extend a circuit, there are several possibilities. It might need to extend to an unknown relay with specific properties. It might need to extend to a particular relay from which it has received a SNIP before. In both cases, there are changes to be made in the circuit extension process.

Further, there are changes we need to make for the handshake between the extending relay and the target relay. The target relay is no longer told by the client which of its onion keys it should use, so the extending relay needs to tell the target relay which keys are in the SNIP that the client is using.

Modifying the EXTEND/CREATE handshake

First, we will require that proposal 249 (or some similar proposal for wide CREATE and EXTEND cells) is in place, so that we can have EXTEND cells larger than can fit in a single cell. (See 319-wide-everything.md for an example proposal to supersede 249.)

We add new fields to the CREATE2 cell so that relays can send each other more information without interfering with the client's part of the handshake.

The CREATE2, CREATED2, TRUNCATED, and EXTENDED2 cells change as follows:

  struct create2_body {
     // old fields
     u16 htype; // client handshake type
     u16 hlen; // client handshake length
     u8 hdata[hlen]; // client handshake data.

     // new fields
     u8 n_extensions;
     struct extension extension[n_extensions];
  }

  struct created2_body {
     // old fields
     u16 hlen;
     u8 hdata[hlen];

     // new fields
     u8 n_extensions;
     struct extension extension[n_extensions];
  }

  struct truncated_body {
     // old fields
     u8 errcode;

     // new fields
     u8 n_extensions;
     struct extension extension[n_extensions];
  }

  // EXTENDED2 cells can now use the same new fields as in the
  // created2 cell.

  struct extension {
     u16 type;
     u16 len;
     u8 body[len];
  }

These extensions are defined by this proposal:

  [01] -- `Partial_SNIPRouterData` -- Sent from an extending relay
          to a target relay. This extension holds one or more fields
          from the SNIPRouterData that the extending relay is using,
          so that the target relay knows (for example) what keys to
          use.  (These fields are determined by the
          "forward_with_extend" field in the ENDIVE.)

  [02] -- Full_SNIP -- an entire SNIP that was used in an attempt to
          extend the circuit.  This must match the client's provided
          index position.

  [03] -- Extra_SNIP -- an entire SNIP that was not used to extend
          the circuit, but which the client requested anyway.  This
          can be sent back from the extending relay when the client
          specifies multiple index positions, or uses a nonzero "nth" value
          in their `snip_index_pos` link specifier.

  [04] -- SNIP_Request -- a 32-bit index position, or a single zero
          byte, sent away from the client.  If the byte is 0, the
          originator does not want a SNIP.  Otherwise, the
          originator does want a SNIP containing the router and the
          specified index.  Other values are unspecified.
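For illustration, the `n_extensions` / `struct extension` encoding shared by these cell bodies could be written and read as below. The helper names are hypothetical, and the sketch assumes network (big-endian) byte order for the u16 fields, as elsewhere in Tor's cell formats.

```python
import struct


def encode_extensions(extensions):
    """Encode a list of (type, body) pairs as u8 count + type/len/body items."""
    if len(extensions) > 0xFF:
        raise ValueError("n_extensions must fit in a u8")
    out = bytearray([len(extensions)])
    for ext_type, body in extensions:
        if len(body) > 0xFFFF:
            raise ValueError("extension body must fit in a u16 length")
        out += struct.pack("!HH", ext_type, len(body))
        out += body
    return bytes(out)


def parse_extensions(data, offset=0):
    """Parse an extension list starting at offset; return (list, new_offset)."""
    if offset >= len(data):
        raise ValueError("truncated: missing n_extensions")
    n_extensions = data[offset]
    offset += 1
    extensions = []
    for _ in range(n_extensions):
        if offset + 4 > len(data):
            raise ValueError("truncated extension header")
        ext_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if offset + length > len(data):
            raise ValueError("truncated extension body")
        extensions.append((ext_type, data[offset:offset + length]))
        offset += length
    return extensions, offset
```

A receiver that does not recognize an extension type can still skip it, since each item carries its own length.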

By default, EXTENDED2 cells are sent with a SNIP iff the corresponding EXTEND2 cell used a snip_index_pos link specifier, and CREATED2 cells are not sent with a SNIP.

We add a new link specifier type for a router index, using the following coding for its contents:

/* Using trunnel syntax here. */
struct snip_index_pos {
    u32 index_id; // which index is it?
    u8 nth; // how many SNIPs should be skipped/included?
    u8 index_pos[]; // extends to the end of the link specifier.
}

The index_pos field can be longer or shorter than the actual width of the router index. If it is too long, it is truncated. If it is too short, it is extended with zero-valued bytes.
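A small sketch of the link specifier body and of the width adjustment just described. The function names are hypothetical. The sketch assumes that truncation drops trailing bytes and that zero-extension appends bytes at the end, which the text does not state explicitly.

```python
import struct


def encode_snip_index_pos(index_id, nth, index_pos):
    """Encode the body of a snip_index_pos link specifier (big-endian)."""
    return struct.pack("!IB", index_id, nth) + index_pos


def normalize_index_pos(index_pos, index_width):
    """Fit index_pos to the router index's width in bytes.

    ASSUMPTION: extra trailing bytes are dropped; missing bytes are
    supplied as zeros at the end.
    """
    return index_pos[:index_width].ljust(index_width, b"\x00")
```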

Any number of these link specifiers may appear in an EXTEND cell. If there is more than one, then they should appear in order of client preference; the extending relay may extend to any of the listed routers.

This link specifier SHOULD NOT be used along with IPv4, IPv6, RSA ID, or Ed25519 ID link specifiers. Relays receiving such a link specifier along with a snip_index_pos link specifier SHOULD reject the entire EXTEND request.

If nth is nonzero, then the link specifier means "the n'th SNIP after the one defined by the SNIP index position." A relay MAY reject this request if nth is greater than 4. If the relay does not reject this request, then it MUST include all SNIPs between index_pos and the one that was actually used in an Extra_SNIP extension. (Otherwise, the client would not be able to verify that it had gotten the correct SNIP.)

I've avoided use of CBOR for these types, under the assumption that we'd like to use CBOR for directory stuff, but no more. We already have trunnel-like objects for this purpose.

Modified ntor handshake

We adapt the ntor handshake from tor-spec.txt for this use, with the following main changes.

  • The NODEID and KEYID fields are omitted from the input. Instead, these fields may appear in a Partial_SNIPRouterData extension.

  • The NODEID and KEYID fields appear in the reply.

  • The NODEID field is extended to 32 bytes, and now holds the relay's ed25519 identity.

So the client's message is now:

CLIENT_PK [32 bytes]

And the relay's reply is now:

NODEID    [32 bytes]
KEYID     [32 bytes]
SERVER_PK [32 bytes]
AUTH      [32 bytes]

Otherwise, all fields are computed as described in tor-spec.
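As an illustration of the new reply layout, a client might split the server's message like this (the function name is hypothetical; field sizes are those listed above):

```python
def parse_modified_ntor_reply(reply):
    """Split the 128-byte relay reply into its four 32-byte fields."""
    if len(reply) != 128:
        raise ValueError("expected NODEID|KEYID|SERVER_PK|AUTH, 32 bytes each")
    return {
        "NODEID": reply[0:32],      # the relay's Ed25519 identity
        "KEYID": reply[32:64],
        "SERVER_PK": reply[64:96],
        "AUTH": reply[96:128],
    }
```

Since the client no longer sends NODEID and KEYID, it would presumably compare the returned values against the SNIP it used before trusting the handshake.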

When this handshake is in use, the hash function is SHA3-256 and keys are derived using SHAKE-256, as in rend-spec-v3.txt.

Future work: We may wish to update this choice of functions between now and the implementation date, since SHA3 is a bit pricey. Perhaps one of the BLAKEs would be a better choice. If so, we should use it more generally. On the other hand, the presence of public-key operations in the handshake probably outweighs the use of SHA3.

We will have to give this version of the handshake a new handshake type.

New relay behavior on EXTEND and CREATE failure.

If an EXTEND2 cell based on a routing index fails, the relay should not close the circuit, but should instead send back a TRUNCATED cell containing the SNIP in an extension.

If a CREATE2 cell fails and a SNIP was requested, then instead of sending a DESTROY cell, the relay SHOULD respond with a CREATED2 cell containing 0 bytes of handshake data, and the SNIP in an extension. Clients MAY re-extend or close the circuit, but should not leave it dangling.

NIL handshake type

We introduce a new handshake type, "NIL". The NIL handshake always fails. A client's part of the NIL handshake is an empty bytestring; there is no server response that indicates success.

The NIL handshake can be used by the client when it wants to fetch a SNIP without creating a circuit.

Upon receiving a request to extend with the NIL handshake type, a relay SHOULD NOT actually open any connection or send any data to the target relay. Instead, it should respond with a TRUNCATED cell with the SNIP(s) that the client requested in one or more Extra_SNIP extensions.

Padding handshake cells to a uniform size

To avoid leaking information, all CREATE/CREATED/EXTEND/EXTENDED cells SHOULD be padded to the same sizes. In all cases, the amount of padding is controlled by a set of network parameters: "create-pad-len", "created-pad-len", "extend-pad-len" and "extended-pad-len". These parameters determine the minimum length that the cell body or relay cell bodies should be.

If a cell would be sent whose body is less than the corresponding parameter value, then the sender SHOULD pad the body by adding zero-valued bytes to the cell body. As usual, receivers MUST ignore extra bytes at the end of cells.
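The padding rule amounts to the following, where `min_len` stands for whichever of the four network parameters applies to the cell being sent (the function name is hypothetical):

```python
def pad_cell_body(body, min_len):
    """Zero-pad body up to min_len bytes; longer bodies are left unchanged."""
    return body.ljust(min_len, b"\x00")
```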

ALTERNATIVE: We could specify a more complicated padding mechanism, e.g. 32 bytes of zeros then random bytes.

Client behavior with walking onions

Today's Tor clients have several behaviors that become somewhat more difficult to implement with Walking Onions. Some of these behaviors are essential and achievable. Others can be achieved with some effort, and still others appear to be incompatible with the Walking Onions design.

Bootstrapping and guard selection

When a client first starts running, it has no guards on the Tor network, and therefore can't start building circuits immediately. To produce a list of possible guards, the client begins connecting to one or more fallback directories on their ORPorts, and building circuits through them. These are 3-hop circuits. The first hop of each circuit is the fallback directory; the second and third hops are chosen from the Middle routing index. At the third hop, the client then sends an informational request for a guard's SNIP. This informational request is an EXTEND2 cell with handshake type NIL, using a random spot on the Guard routing index.

Each such request yields a single SNIP that the client will store. These SNIPs, in the order in which they were requested, will form the client's list of "Sampled" guards as described in guard-spec.txt.

Clients SHOULD ensure that their sampled guards are not linkable to one another. In particular, clients SHOULD NOT add more than one guard retrieved from the same third hop on the same circuit. (If they did, that third hop would realize that some client using guard A was also using guard B.)

Future work: Is this threat real? It seems to me that knowing one or two guards at a time in this way is not a big deal, though knowing the whole set would sure be bad. However, we shouldn't optimize this kind of defense away until we know that it's actually needless.

If a client's network connection or choice of entry nodes is heavily restricted, the client MAY request more than one guard at a time, but if it does so, it SHOULD discard all but one guard retrieved from each set.

After choosing guards, clients will continue to use them even after their SNIPs expire. On the first circuit through each guard after opening a channel, clients should ask that guard for a fresh SNIP for itself, to ensure that the guard is still listed in the consensus, and to keep the client's information up-to-date.

Using bridges

As now, clients are configured to use a bridge by using an address and a public key for the bridge. Bridges behave like guards, except that they are not listed in any directory or ENDIVE, and so cannot prove membership when the client connects to them.

On the first circuit through each channel to a bridge, the client asks that bridge for a SNIP listing itself in the Self routing index. The bridge responds with a self-created unsigned SNIP:

 ; This is only valid when received on an authenticated connection
 ; to a bridge.
 UnsignedSNIP = [
    ; There is no signature on this SNIP.
    auth : nil,

    ; Next comes the location of the SNIP within the ENDIVE.  This
    ; SNIPLocation will list only the Self index.
    index : bstr .cbor SNIPLocation,

    ; Finally comes the information about the router.
    router : bstr .cbor SNIPRouterData,
 ]

Security note: Clients MUST take care to keep UnsignedSNIPs separated from signed ones. These are not part of any ENDIVE, and so should not be used for any purpose other than connecting through the bridge that the client has received them from. They should be kept associated with that bridge, and not used for any other, even if they contain other link specifiers or keys. The client MAY use link specifiers from the UnsignedSNIP on future attempts to connect to the bridge.

Finding relays by exit policy

To find a relay by exit policy, clients might choose the exit routing index corresponding to the exit port they want to use. This has negative privacy implications, however, since the middle node discovers what kind of exit traffic the client wants to use. Instead, we support two other options.

First, clients may build anonymous three-hop circuits and then use those circuits to request the SNIPs that they will use for their exits. This may, however, be inefficient.

Second, clients may build anonymous three-hop circuits and then use a BEGIN cell to try to open the connection when they want. When they do so, they may include a new flag in the begin cell, "DVS" to enable Delegated Verifiable Selection. As described in the Walking Onions paper, DVS allows a relay that doesn't support the requested port to instead send the client the SNIP of a relay that does. (In the paper, the relay uses a digest of previous messages to decide which routing index to use. Instead, we have the client send an index field.)

This requires changes to the BEGIN and END cell formats. After the "flags" field in BEGIN cells, we add an extension mechanism:

struct begin_cell {
    nulterm addr_port;
    u32 flags;
    u8 n_extensions;
    struct extension exts[n_extensions];
}

We allow the snip_index_pos link specifier type to appear as a begin extension.

END cells will need to have a new format that supports including policy and SNIP information. This format is enabled whenever a new EXTENDED_END_CELL flag appears in the begin cell.

struct end_cell {
    u8 tag IN [ 0xff ]; // indicate that this isn't an old-style end cell.
    u8 reason;
    u8 n_extensions;
    struct extension exts[n_extensions];
}

We define three END cell extensions. Two are for addresses; they indicate which address was resolved and the associated TTL:

struct end_ext_ipv4 {
    u32 addr;
    u32 ttl;
}
struct end_ext_ipv6 {
    u8 addr[16];
    u32 ttl;
}

One new END cell extension is used for delegated verifiable selection:

struct end_ext_alt_snip {
    u16 index_id;
    u8 snip[..];
}

This design may require END cells to become wider; see 319-wide-everything.md for an example proposal to supersede proposal 249 and allow more wide cell types.

Universal path restrictions

There are some restrictions on Tor paths that all clients should obey, unless they are configured not to do so. Some of these restrictions (like "start paths with a Guard node" or "don't use an Exit as a middle when Exit bandwidth is scarce") are captured by the index system. Some other restrictions are not. Here we describe how to implement those.

The general approach taken here is "build and discard". Since most possible paths will not violate these universal restrictions, we accept that a fraction of the paths built will not be usable. Clients tear them down a short time after they are built.

Clients SHOULD discard a circuit if, after it has been built, they find that it contains the same relay twice, or it contains more than one relay from the same family or from the same subnet.
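A sketch of this post-build check is below. The hop representation (relay id, optional family label, IPv4 address string) is hypothetical, real family membership is more involved than a single label, and the /16 prefix used for "same subnet" is an assumption, since the text above does not give a prefix length.

```python
import ipaddress


def should_discard_circuit(hops, prefix_len=16):
    """Return True if the built path repeats a relay, family, or subnet.

    hops -- list of (relay_id, family_label_or_None, ipv4_address_string)
    """
    seen_ids, seen_families, seen_subnets = set(), set(), set()
    for relay_id, family, address in hops:
        # strict=False masks off the host bits, giving the containing subnet.
        subnet = ipaddress.ip_network("{}/{}".format(address, prefix_len),
                                      strict=False)
        if relay_id in seen_ids or subnet in seen_subnets:
            return True
        if family is not None and family in seen_families:
            return True
        seen_ids.add(relay_id)
        seen_subnets.add(subnet)
        if family is not None:
            seen_families.add(family)
    return False
```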

Clients MAY remember the SNIPs they have received, and use those SNIPs to avoid index ranges that they would automatically reject. Clients SHOULD NOT store any SNIP for longer than it is maximally recent.

NOTE: We should continue to monitor the fraction of paths that are rejected in this way. If it grows too high, we either need to amend the path selection rules, or change authorities to e.g. forbid more than a certain fraction of relay weight in the same family or subnet.

FUTURE WORK: It might be a good idea, if these restrictions truly are 'universal', for relays to have a way to say "You wouldn't want that SNIP; I am giving you the next one in sequence" and send back both SNIPs. This would need some signaling in the EXTEND/EXTENDED cells.

Client-configured path restrictions

Sometimes users configure their clients with path restrictions beyond those that are in ordinary use. For example, a user might want to enter only from US relays, but never exit from US. Or they might be configured with a short list of vanguards to use in their second position.

Handling "light" restrictions

If a restriction only excludes a small number of relays, then clients can continue to use the "build and discard" methodology described above.

Handling some "heavy" restrictions

Some restrictions can exclude most relays, and still be reasonably easy to implement if they only include a small fraction of relays. For example, if the user has an EntryNodes restriction that contains only a small group of relays by exact IP address, the client can connect or extend to one of those addresses specifically.

If we decide IP ranges are important, that IP addresses without ports are important, or that key specifications are important, we can add routing indices that list relays by IP, by RSAId, or by Ed25519 Id. Clients could then use those indices to remotely retrieve SNIPs, and then use those SNIPs to connect to their selected relays.

Future work: we need to decide how many of the above functions to actually support.

Recognizing too-heavy restrictions

The above approaches do not handle all possible sets of restrictions. In particular, they do a bad job for restrictions that ban a large fraction of paths in a way that is not encodable in the routing index system.

If there is substantial demand for such a path restriction, implementors and authority operators should figure out how to implement it in the index system if possible.

Implementations SHOULD track what fraction of otherwise valid circuits they are closing because of the user's configuration. If this fraction is above a certain threshold, they SHOULD issue a warning; if it is above some other threshold, they SHOULD refuse to build circuits entirely.
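One possible shape for this bookkeeping is sketched below. The class name, the `min_samples` guard, and the example threshold values are all hypothetical; as noted in this section, the real thresholds remain to be determined.

```python
class RestrictionTracker:
    """Track how many otherwise-valid circuits are closed due to user config."""

    def __init__(self, warn_fraction, refuse_fraction, min_samples=20):
        self.warn_fraction = warn_fraction
        self.refuse_fraction = refuse_fraction
        self.min_samples = min_samples   # avoid reacting to tiny samples
        self.total = 0
        self.closed = 0

    def record(self, closed_for_config):
        """Record one otherwise-valid circuit and whether config rejected it."""
        self.total += 1
        if closed_for_config:
            self.closed += 1

    def status(self):
        """Return 'ok', 'warn', or 'refuse' for the fraction seen so far."""
        if self.total < self.min_samples:
            return "ok"
        fraction = self.closed / self.total
        if fraction > self.refuse_fraction:
            return "refuse"
        if fraction > self.warn_fraction:
            return "warn"
        return "ok"
```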

Future work: determine which fraction appears in practice, and use that to set the appropriate thresholds above.

Using and providing onion services with Walking Onions

Both live versions of the onion service design rely on a ring of hidden service directories for use in uploading and downloading hidden service descriptors. With Walking Onions, we can use routing indices based on Ed25519 or RSA identity keys to retrieve this data.

(The RSA identity ring is unchanging, whereas the Ed25519 ring changes daily based on the shared random value: for this reason, we have to compute two simultaneous indices for Ed25519 rings: one for the earlier date that is potentially valid, and one for the later date that is potentially valid. We call these hsv3-early and hsv3-late.)

Beyond the use of these indices, however, there are other steps that clients and services need to take in order to maintain their privacy.

Finding HSDirs

When a client or service wants to contact an HSDir, it SHOULD do so anonymously, by building a three-hop anonymous circuit, and then extending it a further hop using the snip_index_pos link specifier to upload to any of the first 3 replicas on the ring. Clients SHOULD choose an 'nth' at random; services SHOULD upload to each replica.

Using a full 80-bit or 256-bit index position in the link specifier would leak the chosen service to somebody other than the directory. Instead, the client or service SHOULD truncate the identifier to a number of bytes equal to the network parameter hsv2-index-bytes or hsv3-index-bytes respectively. (See Appendix C.)

SNIPs for introduction points

When services select an introduction point, they should include the SNIP for the introduction point in their hidden service directory entry, along with the introduction-point fields. The format for this entry is:

"snip" NL snip NL
  [at most once per introduction point]

Clients SHOULD begin treating the link specifier and onion-key fields of each introduction point as optional when the "snip" field is present, and when the hsv3-tolerate-no-legacy network parameter is set to 1. If either of these fields is present, and the SNIP is too, then these fields MUST match those listed in the SNIPs. Clients SHOULD reject descriptors with mismatched fields, and alert the user that the service may be trying a partitioning attack. The "legacy-key" and "legacy-key-cert" fields, if present, should be checked similarly.

Using the SNIPs in these ways allows services to prove that their introduction points have actually been listed in the consensus recently. It also lets clients use introduction point features that the relay might not understand.

Services should include these fields based on a set of network parameters: hsv3-intro-snip and hsv3-intro-legacy-fields. (See appendix C.)

Clients should use these fields only when Walking Onions support is enabled; see section 09.

SNIPs for rendezvous points

When a client chooses a rendezvous point for a v3 onion service, it similarly has the opportunity to include the SNIP of its rendezvous point in the encrypted part of its INTRODUCE cell. (This may cause INTRODUCE cells to become fragmented; see proposal about fragmenting relay cells.)

Including the SNIP in this way allows clients to prove that their rendezvous points have actually been listed in the consensus recently. It also lets services use rendezvous point features that the relay might not understand.

To include the SNIP, the client places it in an extension in the INTRODUCE cell. The onion key can now be omitted[*], along with the link specifiers.

[*] Technically, we use a zero-length onion key, with a new type "implicit in SNIP".

To know whether the service can recognize this kind of cell, the client should look for the presence of a "snips-allowed 1" field in the encrypted part of the hidden service descriptor.

In order to prevent partitioning, services SHOULD NOT advertise "snips-allowed 1" unless the network parameter "hsv3-rend-service-snip" is set to 1. Clients SHOULD NOT use this field unless "hsv3-rend-client-snip" is set to 1.

TAP keys and where to find them

If v2 hidden services are still supported when Walking Onions arrives on the network, we have two choices: We could migrate them to use ntor keys instead of TAP, or we could provide a way for TAP keys to be advertised with Walking Onions.

The first option would appear to be far simpler. See proposal draft 320-tap-out-again.md.

The latter option would require us to put RSA-1024 keys in SNIPs, or put a digest of them in SNIPs and give some way to retrieve them independently.

(Of course, it's possible that we will have v2 onion services deprecated by the time Walking Onions is implemented. If so, that will simplify matters a great deal too.)

Tracking Relay honesty

Our design introduces an opportunity for dishonest relay behavior: since multiple ENDIVEs are valid at the same time, a malicious relay might choose any of several possible SNIPs in response to a client's routing index value.

Here we discuss several ways to mitigate this kind of attack.

Defense: index stability

First, the voting process should be designed such that relays do not needlessly move around the routing index. For example, it would not be appropriate to add an index type whose value is computed by first putting the relays into a pseudorandom order. Instead, index voting should be deterministic and tend to give similar outputs for similar inputs.

This proposal tries to achieve this property in its index voting algorithms. We should measure the degree to which we succeed over time, by looking at all of the ENDIVEs that are valid at any particular time, and sampling several points for each index to see how many distinct relays are listed at each point, across all valid ENDIVEs.

We do not need this stability property for routing indices whose purpose is nonrandomized relay selection, such as those indices used for onion service directories.

Defense: enforced monotonicity

Once an honest relay has received an ENDIVE, it has no reason to keep any previous ENDIVEs or serve SNIPs from them. Because of this, relay implementations SHOULD ensure that no data is served from a new ENDIVE until all the data from an old ENDIVE is thoroughly discarded.

Clients and relays can use this monotonicity property to keep relays honest: once a relay has served a SNIP with some timestamp T, that relay should never serve any other SNIP with a timestamp earlier than T. Clients SHOULD track the most recent SNIP timestamp that they have received from each of their guards, and MAY track the most recent SNIP timestamps that they have received from other relays as well.
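A client-side check for this property might look like the following sketch (the class and method names are hypothetical):

```python
class SnipTimestampTracker:
    """Remember the newest SNIP timestamp seen from each relay."""

    def __init__(self):
        self._latest = {}

    def accept(self, relay_id, snip_timestamp):
        """Return False if this relay has gone backwards in time."""
        latest = self._latest.get(relay_id)
        if latest is not None and snip_timestamp < latest:
            return False
        self._latest[relay_id] = snip_timestamp
        return True
```

A rejected SNIP leaves the stored timestamp unchanged, so a relay cannot lower the bar by serving older data.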

Defense: limiting ENDIVE variance within the network.

The primary motivation for allowing long (de facto) lifespans on today's consensus documents is to keep the network from grinding to a halt if the authorities fail to reach consensus for a few hours. But in practice, if there is a consensus, then relays should have it within an hour or two, so they should not be falling a full day out of date.

Therefore we can potentially add a client behavior that, within N minutes after the client has seen any SNIP with timestamp T, the client should not accept any SNIP with timestamp earlier than T - Delta.

Values for N and Delta are controlled by network parameters (enforce-endive-dl-delay-after and allow-endive-dl-delay respectively in appendix C). N should be about as long as we expect it to take for a single ENDIVE to propagate to all the relays on the network; Delta should be about as long as we would like relays to go between updating ENDIVEs under ideal circumstances.

Migrating to Walking Onions

This proposal is a major change in the Tor network that will eventually require the participation of all relays [*], and will make clients who support it distinguishable from clients that don't.

[*] Technically, the last relay in the path doesn't need support.

To keep the compatibility issues under control, here is the order in which it should be deployed on the network.

  1. First, authorities should add support for voting on ENDIVEs.

  2. Relays may immediately begin trying to download and reconstruct ENDIVEs. (Relay versions are public, so they leak nothing by doing this.)

  3. Once a sufficient number of authorities are voting on ENDIVEs and unlikely to downgrade, relays should begin serving parameter documents and responding to walking-onion EXTEND and CREATE cells. (Again, relay versions are public, so this doesn't leak.)

  4. In parallel with relay support, Tor should also add client support for Walking Onions. This should be disabled by default, however, since it will only be usable with the subset of relays that support Walking Onions, and since it would make clients distinguishable.

  5. Once enough of the relays (possibly, all) support Walking Onions, the client support can be turned on. Clients that enable it will not be able to use old relays that do not support Walking Onions.

  6. Eventually, relays that do not support Walking Onions should not be listed in the consensus.

Client support for Walking Onions should be enabled or disabled, at first, with a configuration option. Once it seems stable, the option should have an "auto" setting that looks at a network parameter. This parameter should NOT be a simple "on" or "off", however: it should be the minimum client version whose support for Walking Onions is believed to be correct.
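The intended decision could be sketched like this. The option values "0", "1", and "auto", the function name, and the representation of versions as integer tuples are all assumptions for illustration; this proposal does not name the option or the parameter.

```python
def walking_onions_enabled(option, client_version, min_version_param):
    """Decide whether a client should use Walking Onions.

    option             -- "0", "1", or "auto" (hypothetical option values)
    client_version     -- this client's version, e.g. (0, 4, 9, 1)
    min_version_param  -- minimum version believed correct, or None if the
                          network parameter is absent
    """
    if option == "1":
        return True
    if option == "0":
        return False
    if option != "auto":
        raise ValueError("unrecognized option value: {!r}".format(option))
    if min_version_param is None:
        return False
    return client_version >= min_version_param
```

Expressing the parameter as a minimum version lets the network leave known-buggy client releases disabled while enabling newer ones.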

Future work: migrating away from sedentary onions

Once all clients are using Walking Onions, we can take a pass through the Tor specifications and source code to remove no-longer-needed code.

Clients should be the first to lose support for old directories, since nobody but the clients depends on the clients having them. Only after obsolete clients represent a very small fraction of the network should relay or authority support be disabled.

Some fields in router descriptors become obsolete with Walking Onions, and possibly router descriptors themselves should be replaced with CBOR objects of some kind. This can only happen, however, after no descriptor users remain.

Appendices

Appendix A: Glossary

I'm going to put a glossary here so I can try to use these terms consistently.

SNIP -- A "Separable Network Index Proof". Each SNIP contains the information necessary to use a single Tor relay, and associates the relay with one or more index ranges. SNIPs are authenticated by the directory authorities.

ENDIVE -- An "Efficient Network Directory with Individually Verifiable Entries". An ENDIVE is a collection of SNIPs downloaded by relays, authenticated by the directory authorities.

Routing index -- A routing index is a map from binary strings to relays, with some given property. Each relay that is in the routing index is associated with a single index range.

Index range -- A range of positions within a routing index. Each range contains many positions.

Index position -- A single value within a routing index. Every position in a routing index corresponds to a single relay.

ParamDoc -- A network parameters document, describing settings for the whole network. Clients download this infrequently.

Index group -- A collection of routing indices that are encoded in the same SNIPs.
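
To make the relationship between routing indices, index ranges, and index positions concrete, here is a minimal sketch. It represents positions as small integers rather than binary strings, and the relay names and the `lookup` helper are illustrative assumptions, not part of the proposal.

```python
import bisect
from typing import List, Tuple

# Each entry is (lo, hi, relay): an inclusive index range owned by one relay.
IndexRange = Tuple[int, int, str]


def lookup(index: List[IndexRange], position: int) -> str:
    """Return the relay whose index range contains `position`.

    Assumes `index` is sorted by `lo` and its ranges do not overlap.
    """
    starts = [lo for lo, _hi, _relay in index]
    i = bisect.bisect_right(starts, position) - 1
    if i < 0:
        raise KeyError(position)
    lo, hi, relay = index[i]
    if not lo <= position <= hi:
        raise KeyError(position)
    return relay


# A toy routing index over positions 0..255: every position maps to
# exactly one relay, and each relay owns a single range.
example_index = [(0, 99, "relayA"), (100, 249, "relayB"), (250, 255, "relayC")]
```

In this toy index, relayB's wider range makes it more likely to be chosen by a uniformly random position, which is how a weighted index can encode selection probability.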

Appendix B: More cddl definitions

; These definitions are used throughout the rest of the
; proposal

; Ed25519 keys are 32 bytes, and that isn't changing.
Ed25519PublicKey = bstr .size 32

; Curve25519 keys are 32 bytes, and that isn't changing.
Curve25519PublicKey = bstr .size 32

; 20 bytes or fewer: legacy RSA SHA1 identity fingerprint.
RSAIdentityFingerprint = bstr

; A 4-byte integer -- or to be cddl-pedantic, one that is
; between 0 and UINT32_MAX.
uint32 = uint .size 4

; Enumeration to define integer equivalents for all the digest algorithms
; that Tor uses anywhere.  Note that some of these are not used in
; this spec, but are included so that we can use this production
; whenever we need to refer to a hash function.
DigestAlgorithm = &(
    NoDigest: 0,
    SHA1    : 1,     ; deprecated.
    SHA2-256: 2,
    SHA2-512: 3,
    SHA3-256: 4,
    SHA3-512: 5,
    Kangaroo12-256: 6,
    Kangaroo12-512: 7,
)

; A digest is represented as a binary blob.
Digest = bstr

; Enumeration for different signing algorithms.
SigningAlgorithm = &(
   RSA-OAEP-SHA1  : 1,     ; deprecated.
   RSA-OAEP-SHA256: 2,     ; deprecated.
   Ed25519        : 3,
   Ed448          : 4,
   BLS            : 5,     ; Not yet standardized.
)

PKAlgorithm = &(
   SigningAlgorithm,

   Curve25519: 100,
   Curve448  : 101
)

KeyUsage = &(
   ; A master unchangeable identity key for this authority.  May be
   ; any signing key type.  Distinct from the authority's identity as a
   ; relay.
   AuthorityIdentity: 0x10,
   ; A medium-term key used for signing SNIPs, votes, and ENDIVEs.
   SNIPSigning: 0x11,

   ; These are designed not to collide with the "list of certificate
   ; types" or "list of key types" in cert-spec.txt
)

CertType = &(
   VotingCert: 0x12,
   ; These are designed not to collide with the "list of certificate
   ; types" in cert-spec.txt.
)

LinkSpecifier = bstr

Appendix C: new numbers to assign.

Relay commands:

  • We need a new relay command for "FRAGMENT" per proposal 319.

CREATE handshake types:

  • We need a type for the NIL handshake.

  • We need a handshake type for the new ntor handshake variant.

Link specifiers:

  • We need a link specifier for extend-by-index.

  • We need a link specifier for dirport URL.

Certificate Types and Key Types:

  • We need to add the new entries from CertType and KeyUsage to cert-spec.txt, and possibly merge the two lists.

Begin cells:

  • We need a flag for Delegated Verifiable Selection.

  • We need an extension type for extra data, and a value for indices.

End cells:

  • We need an extension type for extra data, a value for indices, a value for IPv4 addresses, and a value for IPv6 addresses.

Extensions for decrypted INTRODUCE2 cells:

  • A SNIP for the rendezvous point.

Onion key types for decrypted INTRODUCE2 cells:

  • An "onion key" to indicate that the onion key for the rendezvous point is implicit in the SNIP.

New URLs:

  • A URL for fetching ENDIVEs.

  • A URL for fetching client / relay parameter documents.

  • A URL for fetching detached SNIP signatures.

Protocol versions:

(In theory we could omit many new protovers here, since being listed in an ENDIVE implies support for the new protocol variants. We're going to use new protovers anyway, however, since doing so keeps our numbering consistent.)

We need new versions for these subprotocols:

  • Relay to denote support for new handshake elements.

  • DirCache to denote support for ENDIVEs, paramdocs, binary diffs, etc.

  • Cons to denote support for ENDIVEs.

Appendix D: New network parameters.

We introduce these network parameters:

From section 5:

  • create-pad-len -- Clients SHOULD pad their CREATE cell bodies to this size.

  • created-pad-len -- Relays SHOULD pad their CREATED cell bodies to this size.

  • extend-pad-len -- Clients SHOULD pad their EXTEND cell bodies to this size.

  • extended-pad-len -- Relays SHOULD pad their EXTENDED cell bodies to this size.

From section 7:

  • hsv2-index-bytes -- how many bytes to use when sending an hsv2 index position to look up a hidden service directory. Min: 1, Max: 40. Default: 4.

  • hsv3-index-bytes -- how many bytes to use when sending an hsv3 index position to look up a hidden service directory. Min: 1, Max: 128. Default: 4.

  • hsv3-intro-legacy-fields -- include legacy fields in service descriptors. Min: 0. Max: 1. Default: 1.

  • hsv3-intro-snip -- include intro point SNIPs in service descriptors. Min: 0. Max: 1. Default: 0.

  • hsv3-rend-service-snip -- Should services advertise and accept rendezvous point SNIPs in INTRODUCE2 cells? Min: 0. Max: 1. Default: 0.

  • hsv3-rend-client-snip -- Should clients place rendezvous point SNIPs in INTRODUCE2 cells when the service supports it? Min: 0. Max: 1. Default: 0.

  • hsv3-tolerate-no-legacy -- Should clients tolerate v3 service descriptors that don't have legacy fields? Min: 0. Max: 1. Default: 0.

From section 8:

  • enforce-endive-dl-delay-after -- How many seconds after receiving a SNIP with some timestamp T does a client wait before rejecting older SNIPs? Equivalent to "N" in "limiting ENDIVE variance within the network." Min: 0. Max: INT32_MAX. Default: 3600 (1 hour).

  • allow-endive-dl-delay -- Once a client has received a SNIP with timestamp T, it will not accept any SNIP with timestamp earlier than "allow-endive-dl-delay" seconds before T. Equivalent to "Delta" in "limiting ENDIVE variance within the network." Min: 0. Max: 2592000 (30 days). Default: 10800 (3 hours).

Appendix E: Semantic sorting for CBOR values.

Some voting operations assume a partial ordering on CBOR values. We define such an ordering as follows:

  • bstr and tstr items are sorted lexicographically, as if they were compared with a version of strcmp() that accepts internal NULs.
  • uint and int items are sorted by integer values.
  • arrays are sorted lexicographically by elements.
  • Tagged items are sorted as if they were not tagged.
  • Maps do not have any sorting order.
  • False precedes true.
  • Otherwise, the ordering between two items is not defined.

More specifically:

 Algorithm: compare two cbor items A and B.

 Returns LT, EQ, GT, or NIL.

 While A is tagged, remove the tag from A.
 While B is tagged, remove the tag from B.

 If A is any integer type, and B is any integer type:
      return A cmp B

 If the type of A is not the same as the type of B:
      return NIL.

 If A and B are both booleans:
      return int(A) cmp int(B), where int(false)=0 and int(true)=1.

 If A and B are both tstr or both bstr:
      while len(A)>0 and len(B)>0:
         if A[0] != B[0]:
              return A[0] cmp B[0]
         Discard A[0] and B[0]
      If len(A) == len(B) == 0:
         return EQ.
      else if len(A) == 0:
         return LT.  (B is longer)
      else:
         return GT.  (A is longer)

 If A and B are both arrays:
      while len(A)>0 and len(B)>0:
         Run this algorithm recursively on A[0] and B[0].
         If the result is not EQ:
             Return that result.
         Discard A[0] and B[0]
      If len(A) == len(B) == 0:
         return EQ.
      else if len(A) == 0:
         return LT.  (B is longer)
      else:
         return GT.  (A is longer)

Otherwise, A and B are a type for which we do not define an ordering,
so return NIL.
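
The comparison algorithm above can be sketched in Python as follows. This is an illustration only: the `Tagged` class is a hypothetical stand-in for a CBOR tagged item, Python `int`, `bool`, `bytes`, `str`, and `list` stand in for the CBOR integer, boolean, bstr, tstr, and array types, and `None` stands in for NIL. Comparing tstr values by their UTF-8 bytes is an assumption based on the strcmp() analogy.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Tagged:
    """Hypothetical stand-in for a CBOR tagged item."""
    tag: int
    value: Any


def _cmp(a, b) -> str:
    return "LT" if a < b else ("GT" if a > b else "EQ")


def cbor_compare(a: Any, b: Any) -> Optional[str]:
    """Return "LT", "EQ", "GT", or None (NIL: no ordering defined)."""
    # Tagged items are compared as if they were not tagged.
    while isinstance(a, Tagged):
        a = a.value
    while isinstance(b, Tagged):
        b = b.value

    # In Python, bool is a subclass of int, so handle booleans before
    # the integer rule to keep the two CBOR types distinct.
    a_bool, b_bool = isinstance(a, bool), isinstance(b, bool)
    if a_bool or b_bool:
        return _cmp(int(a), int(b)) if (a_bool and b_bool) else None

    if isinstance(a, int) and isinstance(b, int):
        return _cmp(a, b)
    if type(a) is not type(b):
        return None
    if isinstance(a, bytes):
        # bytes compare lexicographically; a strict prefix sorts first.
        return _cmp(a, b)
    if isinstance(a, str):
        return _cmp(a.encode("utf-8"), b.encode("utf-8"))
    if isinstance(a, list):
        for x, y in zip(a, b):
            result = cbor_compare(x, y)
            if result != "EQ":
                return result  # may be None if the elements are unordered
        return _cmp(len(a), len(b))
    # Maps and every other type have no defined ordering.
    return None
```

Note that the ordering is partial: callers need to handle the `None` case explicitly rather than treating it as equality.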

Appendix F: Example voting rules

Here we give a set of voting rules for the fields described in our initial VoteDocuments.

{
  meta: {
     voting-delay: { op: "Mode", tie_low:false,
                       type:["tuple","uint","uint"] },
     voting-interval: { op: "Median", type:"uint" },
     snip-lifespan: {op: "Mode", type:["tuple","uint","uint","uint"] },
     c-param-lifetime: {op: "Mode", type:["tuple","uint","uint","uint"] },
     s-param-lifetime: {op: "Mode", type:["tuple","uint","uint","uint"] },
     cur-shared-rand: {op: "Mode", min_count: "qfield",
                         type:["tuple","uint","bstr"]},
     prev-shared-rand: {op: "Mode", min_count: "qfield",
                         type:["tuple","uint","bstr"]},
  },
  client-params: {
     recommend-versions: {op:"SetJoin", min_count:"qfield",type:"tstr"},
     require-protos: {op:"BitThreshold", min_count:"sqauth"},
     recommend-protos: {op:"BitThreshold", min_count:"qauth"},
     params: {op:"MapJoin",key_min_count:"qauth",
                 keytype:"tstr",
                 item_op:{op:"Median",min_vote:"qauth",type:"uint"},
                 },
     certs: {op:"SetJoin",min_count:1, type: 'bstr'},
  },
  ; Use same value for server-params.
  relay: {
     meta: {
        desc: {op:"Mode", min_count:"qauth",tie_low:false,
               type:["uint","bstr"] },
        flags: {op:"MapJoin", key_type:"tstr",
                item_op:{op:"Mode",type:"bool"}},
        bw: {op:"Median", type:"uint" },
        mbw :{op:"Median", type:"uint" },
        rsa-id: {op:"Mode", type:"bstr"},
    },
    snip: {
       ; ed25519 key is handled as any other value.
       0: { op:"DerivedFrom", fields:[["RM","desc"]],
             rule:{op:"Mode",type:"bstr"} },

       ; ntor onion key.
       1: { op:"DerivedFrom", fields:[["RM","desc"]],
             rule:{op:"Mode",type:"bstr"} },

       ; link specifiers.
       2: { op: "CborDerived",
             item-op: { op:"DerivedFrom", fields:[["RM","desc"]],
                        rule:{op:"Mode",type:"bstr" } } },

       ; software description.
       3: { op:"DerivedFrom", fields:[["RM","desc"]],
             rule:{op:"Mode",type:["tuple", "tstr", "tstr"] } },

       ; protovers.
       4: { op: "CborDerived",
             item-op: { op:"DerivedFrom", fields:[["RM","desc"]],
                      rule:{op:"Mode",type:"bstr" } } },

       ; families.
       5: { op:"SetJoin", min_count:"qfield", type:"bstr" },

       ; countrycode
       6: { op:"Mode", type:"tstr" } ,

       ; 7: exitpolicy.
       7: { op: "CborDerived",
             item-op: { op: "DerivedFrom", fields:[["RM","desc"],["CP","port-classes"]],
                      rule:{op:"Mode",type:"bstr" } } },
    },
    legacy: {
      "sha1-desc": { op:"DerivedFrom",
                      fields:[["RM","desc"]],
                      rule:{op:"Mode",type="bstr"} },
      "mds": { op:"DerivedFrom",
                fields:[["RM","desc"]],
                rule: { op:"ThresholdOp", min_count: "qauth",
                         multi_low:false,
                         type:["tuple", "uint", "uint",
                               "bstr", "bstr" ] }},
    }
  }
  indices: {
     ; See appendix G.
  }
}

Appendix G: A list of routing indices

Middle -- general purpose index for use when picking middle hops in circuits. Bandwidth-weighted for use as middle relays. May exclude guards and/or exits depending on overall balance of resources on the network.

Formula:

  type: 'weighted',
  source: {
      type:'bw', require_flags: ['Valid'], 'bwfield' : ["RM", "mbw"]
  },
  weight: {
      [ "!Exit", "!Guard" ] => "Wmm",
      [ "Exit", "Guard" ] => "Wbm",
      [ "Exit", "!Guard" ] => "Wem",
      [ "!Exit", "Guard" ] => "Wgm",
  }

Guard -- index for choosing guard relays. This index is not used directly when extending, but instead only for picking guard relays that the client will later connect to directly. Bandwidth-weighted for use as guard relays. May exclude guard+exit relays depending on resource balance.

Formula:

  type: 'weighted',
  source: {
       type:'bw',
       require_flags: ['Valid', "Guard"],
       bwfield : ["RM", "mbw"]
  },
  weight: {
      [ "Exit", ] => "Weg",
  }

HSDirV2 -- index for finding spots on the hsv2 directory ring.

Formula: type: 'rsa-id',

HSDirV3-early -- index for finding spots on the hsv3 directory ring for the earlier of the two "active" days. (The active days are today, and whichever other day is closest to the time at which the ENDIVE becomes active.)

Formula:

  type: 'ed-id',
  alg: SHA3-256,
  prefix: b"node-idx",
  suffix: (depends on shared-random and time period)

HSDirV3-late -- index for finding spots on the hsv3 directory ring for the later of the two "active" days.

Formula: as HSDirV3-early, but with a different suffix.

Self -- A virtual index that never appears in an ENDIVE. SNIPs with this index are unsigned, and occupy the entire index range. This index is used with bridges to represent each bridge's uniqueness.

Formula: none.

Exit0..ExitNNN -- Exits that can connect to all ports within a given PortClass 0 through NNN.

Formula:

  type: 'weighted',
  source: {
       type:'bw',
       ; The second flag here depends on which portclass this is.
       require_flags: [ 'Valid', "P@3" ],
       bwfield : ["RM", "mbw"]
   },
  weight: {
      [ "Guard", ] => "Wge",
  }

Appendix H: Choosing good clusters of exit policies

With Walking Onions, we cannot easily support all the port combinations [*] that we currently allow in the "policy summaries" that we support in microdescriptors.

[*] How many "short policy summaries" are there? The number would be 2^65535, except for the fact that today's Tor doesn't permit exit policies to get maximally long.

In the Walking Onions whitepaper (https://crysp.uwaterloo.ca/software/walkingonions/) we noted in section 6 that we can group exit policies by class, and get down to around 220 "classes" of port, such that each class was either completely supported or completely unsupported by every relay. But that number is still impractically large: if we need ~11 bytes to represent a SNIP index range, we would need an extra 2420 bytes (220 * 11) per SNIP, which seems like more overhead than we really want.

We can reduce the number of port classes further, at the cost of some fidelity. For example, suppose that the set {https,http} is supported by relays {A,B,C,D}, and that the set {ssh,irc} is supported by relays {B,C,D,E}. We could combine them into a new port class {https,http,ssh,irc}, supported by relays {B,C,D} -- at the expense of no longer being able to say that relay A supported {https,http}, or that relay E supported {ssh,irc}.
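
The merge in this example amounts to a union of the port sets and an intersection of the relay sets. A small sketch, where the `merge_classes` helper and the tuple layout are illustrative assumptions:

```python
from typing import FrozenSet, Tuple

# A port class here is (ports, relays that support every port in the class).
PortClass = Tuple[FrozenSet[str], FrozenSet[str]]


def merge_classes(a: PortClass, b: PortClass) -> PortClass:
    """Combine two port classes into one.

    The merged class covers all ports of both inputs, so only relays
    that supported both input classes can support it.
    """
    return a[0] | b[0], a[1] & b[1]


web = (frozenset({"https", "http"}), frozenset({"A", "B", "C", "D"}))
shell = (frozenset({"ssh", "irc"}), frozenset({"B", "C", "D", "E"}))

merged = merge_classes(web, shell)
# Relays that were usable for one of the inputs but not for the merged class.
lost = (web[1] | shell[1]) - merged[1]
```

Here `merged` is supported by relays B, C, and D, and `lost` contains A and E: the fidelity cost of the merge is exactly the set of relays that drop out.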

This loss would not necessarily be permanent: the operator of relay A might be willing to add support for {ssh,irc}, and the operator of relay E might be willing to add support for {https,http}, in order to become useful as an exit again.

(We might also choose to add a configuration option for relays to take their exit policies directly from the port classes in the consensus.)

How might we select our port classes? Three general categories of approach seem possible: top-down, bottom-up, and hybrid.

In a top-down approach, we would collaborate with authority and exit operators to identify a priori reasonable classes of ports, such as "Web", "Chat", "Miscellaneous internet", "SMTP", and "Everything else". Authorities would then base exit indices on these classes.

In a bottom-up approach, we would find an algorithm to run on the current exit policies in order to find the "best" set of port classes to capture the policies as they stand with minimal loss. (Quantifying this loss is nontrivial: do we weight by bandwidth? Do we weight every port equally, or do we call some more "important" than others?)

See exit-analysis for an example tool that runs a greedy algorithm to find a "good" partition using an unweighted, all-ports-are-equal cost function. See the files "greedy-set-cov-{4,8,16}" for examples of port classes produced by this algorithm.

In a hybrid approach, we'd use top-down and bottom-up techniques together. For example, we could start with an automated bottom-up approach, and then evaluate it based on feedback from operators. Or we could start with a handcrafted top-down approach, and then use bottom-up cost metrics to look for ways to split or combine those port classes in order to represent existing policies with better fidelity.

Appendix I: Non-clique topologies with Walking Onions

For future work, we can expand the Walking Onions design to accommodate network topologies where relays are divided into groups, and not every group connects to every other. To do so requires additional design work, but here I'll provide what I hope will be a workable sketch.

First, each SNIP needs to contain an ID saying which relay group it belongs to, and an ID saying which relay group(s) may serve it.

When downloading an ENDIVE, each relay should report its own identity, and receive an ENDIVE for that identity's group. It should contain both the identities of relays in the group, and the SNIPs that should be served for different indices by members of that group.

The easy part would be to add an optional group identity field to SNIPs, defaulting to 0, indicating that the relay belongs to that group, and an optional served-by field to each SNIP, indicating groups that may serve the SNIP. You'd only accept SNIPs if they were served by a relay in a group that was allowed to serve them.
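
A minimal sketch of that acceptance rule follows. The `Snip` class, its field names, and the default value for the served-by field are assumptions made for illustration; the text above only fixes the default group identity of 0.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Snip:
    relay_id: str
    # Optional group identity; defaults to 0 as described above.
    group: int = 0
    # Groups allowed to serve this SNIP. The default is an assumption.
    served_by: FrozenSet[int] = frozenset({0})


def accept_snip(snip: Snip, serving_relay_group: int) -> bool:
    """Accept a SNIP only if the relay that served it is in an allowed group."""
    return serving_relay_group in snip.served_by
```

For example, a SNIP for a relay in group 2 that may be served by groups 1 and 2 would be accepted from a group 1 relay but rejected from a group 3 relay.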

Would guards work? Sure: we'd need to have guard SNIPs served by middle relays.

For hsdirs, we'd need to have either multiple shards of the hsdir ring (which seems like a bad idea?) or have all middle nodes able to reach the hsdir ring.

Things would get tricky with making onion services work: if you need to use an introduction point or a rendezvous point in group X, then you need to get there from a relay that allows connections to group X. Does this imply indices meaning "Can reach group X" or "two-degrees of group X"?

The question becomes: "how much work on alternative topologies does it make sense to deploy in advance?" It seems like there are unknowns affecting both client and relay operations here, which suggests that advance deployment for either case is premature: we can't necessarily make either clients or relays "do the right thing" in advance given what we now know of the right thing.

Appendix Z: acknowledgments

Thanks to Peter Palfrader for his original design in proposal 141, and to the designers of PIR-Tor, both of which inspired aspects of this Walking Onions design.

Thanks to Chelsea Komlo, Sajin Sasy, and Ian Goldberg for feedback on an earlier version of this design.

Thanks to David Goulet, Teor, and George Kadianakis for commentary on earlier versions of proposal 300.

Thanks to Chelsea Komlo and Ian Goldberg for their help fleshing out so many ideas related to Walking Onions in their work on the design paper.

Thanks to Teor for improvements to diff format, ideas about grouping exit ports, and numerous ideas about getting topology and distribution right.

These specifications were supported by a grant from the Zcash Foundation.

Filename: 324-rtt-congestion-control.txt
Title: RTT-based Congestion Control for Tor
Author: Mike Perry
Created: 02 July 2020
Status: Finished


0. Motivation [MOTIVATION]

This proposal specifies how to incrementally deploy RTT-based congestion
control and improved queue management in Tor. It is written to allow us
to first deploy the system only at Exit relays, and then incrementally
improve the system by upgrading intermediate relays.

Lack of congestion control is the reason why Tor has an inherent speed
limit of about 500KB/sec for downloads and uploads via Exits, and even
slower for onion services. Because our stream SENDME windows are fixed
at 500 cells per stream, and only ~500 bytes can be sent in one cell,
the max speed of a single Tor stream is 500*500/circuit_latency. This
works out to about 500KB/sec max sustained throughput for a single
download, even if circuit latency is as low as 500ms.
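
The arithmetic behind this limit can be checked directly. The function below is only a worked example of the 500*500/circuit_latency bound from the paragraph above; its name and parameter names are illustrative.

```python
def max_stream_throughput(window_cells: int = 500,
                          bytes_per_cell: int = 500,
                          circuit_latency_s: float = 0.5) -> float:
    """Upper bound, in bytes per second, for one stream with a fixed window.

    At most one full window can be in flight per circuit round trip.
    """
    return window_cells * bytes_per_cell / circuit_latency_s


# 500 cells * 500 bytes = 250,000 bytes per round trip.
# At 500ms latency that is 500,000 bytes/sec, i.e. about 500KB/sec.
limit_fast_circuit = max_stream_throughput(circuit_latency_s=0.5)
# Doubling the latency halves the bound.
limit_slow_circuit = max_stream_throughput(circuit_latency_s=1.0)
```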

Because onion services paths are more than twice the length of Exit
paths (and thus more than twice the circuit latency), onion service
throughput will always have less than half the throughput of Exit
throughput, until we deploy proper congestion control with dynamic
windows.

Proper congestion control will remove this speed limit for both Exits
and onion services, as well as reduce memory requirements for fast Tor
relays, by reducing queue lengths.

The high-level plan is to use Round Trip Time (RTT) as a primary
congestion signal, and compare the performance of two different
congestion window update algorithms that both use RTT as a congestion
signal.

The combination of RTT-based congestion signaling, a congestion window
update algorithm, and Circuit-EWMA will get us most if not all of the
benefits we seek, and only requires clients and Exits to upgrade to use
it. Once this is deployed, circuit bandwidth will no longer be capped at
~500KB/sec by the fixed window sizes of SENDME; queue latency will fall
significantly; memory requirements at relays should plummet; and
transient bottlenecks in the network should dissipate.

Extended background information on the choices made in this proposal can
be found at:
  https://lists.torproject.org/pipermail/tor-dev/2020-June/014343.html
  https://lists.torproject.org/pipermail/tor-dev/2020-January/014140.html

An exhaustive list of citations for further reading is in Section
[CITATIONS].

A glossary of common congestion control acronyms and terminology is in
Section [GLOSSARY].


1. Overview [OVERVIEW]

This proposal has seven main sections, after this overview. These
sections are referenced [IN_ALL_CAPS] rather than by number, for easy
searching.

Section [CONGESTION_SIGNALS] specifies how to use Tor's SENDME flow
control cells to measure circuit RTT, for use as an implicit congestion
signal. It also mentions an explicit congestion signal, which can be
used as a future optimization once all relays upgrade.

Section [CONTROL_ALGORITHMS] specifies two candidate congestion window
update mechanisms, which will be compared for performance in simulation
in Shadow, as well as evaluated on the live network, and tuned via
consensus parameters listed in [CONSENSUS_PARAMETERS].

Section [FLOW_CONTROL] specifies how to handle back-pressure when one of
the endpoints stops reading data, but data is still arriving. In
particular, it specifies what to do with streams that are not being read
by an application, but still have data arriving on them.

Section [SYSTEM_INTERACTIONS] describes how congestion control will
interact with onion services, circuit padding, and conflux-style traffic
splitting.

Section [EVALUATION] describes how we will evaluate and tune our
options for control algorithms and their parameters.

Section [PROTOCOL_SPEC] describes the specific cell formats and
descriptor changes needed by this proposal.

Section [SECURITY_ANALYSIS] provides information about the DoS and
traffic analysis properties of congestion control.


2. Congestion Signals [CONGESTION_SIGNALS]

In order to detect congestion at relays on a circuit, Tor will use
circuit Round Trip Time (RTT) measurement. This signal will be used in
slightly different ways in our various [CONTROL_ALGORITHMS], which will
be compared against each other for optimum performance in Shadow and on
the live network.

To facilitate this, we will also change SENDME accounting logic
slightly. These changes only require clients, exits, and dirauths to
update.

As a future optimization, it is possible to send a direct ECN congestion
signal. This signal *will* require all relays on a circuit to upgrade to
support it, but it will reduce congestion by making the first congestion event
on a circuit much faster to detect.

To reduce confusion and complexity of this proposal, this signal has been
moved to the ideas repository, under xxx-backward-ecn.txt [BACKWARD_ECN].


2.1 RTT measurement

Recall that Tor clients, exits, and onion services send
RELAY_COMMAND_SENDME relay cells every CIRCWINDOW_INCREMENT (100) cells
of received RELAY_COMMAND_DATA.

This allows those endpoints to measure the current circuit RTT, by
measuring the amount of time between sending a RELAY_COMMAND_DATA cell
that would trigger a SENDME from the other endpoint, and the arrival of
that SENDME cell. This means that RTT is measured every 'cc_sendme_inc'
data cells.

Circuits will record the minimum and maximum RTT measurement, as well as
a smoothed value representing the current RTT. The smoothing for the
current RTT is performed as specified in [N_EWMA_SMOOTHING].

Algorithms that make use of this RTT measurement for congestion
window update are specified in [CONTROL_ALGORITHMS].
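
The measurement described in this section can be sketched as follows. This is not the C Tor implementation: the class and method names are hypothetical, timestamps are plain integers in microseconds, and smoothing is left out.

```python
from collections import deque
from typing import Deque, Optional


class RttEstimator:
    """Per-circuit RTT sampling driven by SENDME arrivals (sketch)."""

    def __init__(self, sendme_inc: int = 100) -> None:
        self.sendme_inc = sendme_inc
        self.cells_sent = 0
        # Send times of the data cells that will each trigger a SENDME.
        self.pending: Deque[int] = deque()
        self.min_rtt: Optional[int] = None
        self.max_rtt: Optional[int] = None

    def note_data_cell_sent(self, now_usec: int) -> None:
        self.cells_sent += 1
        # Every sendme_inc-th data cell causes the other endpoint to
        # send a SENDME, so remember when that cell left.
        if self.cells_sent % self.sendme_inc == 0:
            self.pending.append(now_usec)

    def note_sendme_received(self, now_usec: int) -> Optional[int]:
        """Return the RTT sample for this SENDME, or None if unexpected."""
        if not self.pending:
            return None
        rtt = now_usec - self.pending.popleft()
        self.min_rtt = rtt if self.min_rtt is None else min(self.min_rtt, rtt)
        self.max_rtt = rtt if self.max_rtt is None else max(self.max_rtt, rtt)
        return rtt
```

Because circuits deliver cells in order, a simple FIFO of send times is enough to pair each SENDME with the cell that triggered it.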

2.1.1. Clock Jump Heuristics [CLOCK_HEURISTICS]

The timestamps for RTT (and BDP) are measured using Tor's
monotime_absolute_usec() API. This API is designed to provide a monotonic
clock that only moves forward. However, depending on the underlying system
clock, this may result in the same timestamp value being returned for long
periods of time, which would result in RTT 0-values. Alternatively, the clock
may jump forward, resulting in abnormally large RTT values.

To guard against this, we perform a series of heuristic checks on the time delta
measured by the RTT estimator, and if these heuristics detect a stall or a jump,
we do not use that value to update RTT or BDP, nor do we update any congestion
control algorithm information that round.

If the time delta is 0, that is always treated as a clock stall, the RTT is
not used, congestion control is not updated, and this fact is cached globally.

If the circuit does not yet have an EWMA RTT or it is still in Slow Start, then
no further checks are performed, and the RTT is used.

If the circuit has stored an EWMA RTT and has exited Slow Start, then on
every SENDME ack, the new candidate RTT is compared to the stored EWMA RTT.
If the new RTT is more than 5000 times larger than the EWMA RTT, then the
circuit does not record that estimate, and does not update BDP or the
congestion control algorithms for that SENDME ack. If the new RTT is more
than 5000 times smaller than the EWMA RTT, then the circuit uses the
globally cached value from above (i.e., it assumes the clock is stalled
*only* if there was previously *also* a 0-delta RTT).

If both ratio checks pass, the globally cached clock stall state is set to
false (no stall), and the RTT value is used.
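
The checks in this section can be summarized in a short sketch. It follows the prose above rather than any particular implementation; the `ClockState` class and the `rtt_is_usable` function are hypothetical names, and all times are integers in microseconds.

```python
from dataclasses import dataclass
from typing import Optional

RATIO_MAX = 5000  # discrepancy ratio from the text above


@dataclass
class ClockState:
    """Globally cached clock stall state, shared by all circuits."""
    stalled: bool = False


def rtt_is_usable(delta_usec: int,
                  ewma_rtt_usec: Optional[int],
                  in_slow_start: bool,
                  clock: ClockState) -> bool:
    """Return True if this time delta should be used as an RTT sample."""
    if delta_usec == 0:
        # A zero delta is always a clock stall; remember it globally.
        clock.stalled = True
        return False
    if ewma_rtt_usec is None or in_slow_start:
        # No baseline to compare against: no further checks.
        return True
    if delta_usec > ewma_rtt_usec * RATIO_MAX:
        # Looks like a forward clock jump; discard the sample.
        return False
    if delta_usec * RATIO_MAX < ewma_rtt_usec:
        # Suspiciously small: treat as a stall only if one was seen before.
        return not clock.stalled
    # Both ratio checks passed, so clear the cached stall state.
    clock.stalled = False
    return True
```

When this returns False, the caller would skip the RTT, BDP, and congestion control updates for that SENDME ack.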

2.1.2. N_EWMA Smoothing [N_EWMA_SMOOTHING]

RTT estimation requires smoothing, to reduce the effects of packet jitter.

This smoothing is performed using N_EWMA[27], which is an Exponential
Moving Average with alpha = 2/(N+1):

  N_EWMA = RTT*2/(N+1) + N_EWMA_prev*(N-1)/(N+1)
         = (RTT*2 + N_EWMA_prev*(N-1))/(N+1).

Note that the second rearranged form MUST be used in order to ensure that
rounding errors are handled in the same manner as other implementations.

Flow control rate limiting uses this function.

During Slow Start, N is set to `cc_ewma_ss`, for RTT estimation.

After Slow Start, N is the number of SENDME acks between congestion window
updates, scaled by the percentage in consensus parameter 'cc_ewma_cwnd_pct',
and then capped at a max of 'cc_ewma_max', but always at least 2:

  N = MAX(MIN(CWND_UPDATE_RATE(cc)*cc_ewma_cwnd_pct/100, cc_ewma_max), 2);

CWND_UPDATE_RATE is normally just round(CWND/cc_sendme_inc), but after
slow start, it is round(CWND/(cc_cwnd_inc_rate*cc_sendme_inc)).
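
A sketch of these formulas in integer arithmetic is shown below. The function names are illustrative, truncating integer division is assumed (as in C), and round() is assumed to round halves up; the proposal text here does not pin down either detail.

```python
def n_ewma(rtt: int, prev: int, n: int) -> int:
    """N_EWMA in the rearranged form, with a single division."""
    return (2 * rtt + prev * (n - 1)) // (n + 1)


def cwnd_update_rate(cwnd: int, sendme_inc: int,
                     cwnd_inc_rate: int, in_slow_start: bool) -> int:
    """Number of SENDME acks between congestion window updates."""
    denom = sendme_inc if in_slow_start else cwnd_inc_rate * sendme_inc
    # Integer round-to-nearest (halves round up; an assumption).
    return (cwnd + denom // 2) // denom


def ewma_n(cwnd: int, sendme_inc: int, cwnd_inc_rate: int,
           ewma_cwnd_pct: int, ewma_max: int, ewma_ss: int,
           in_slow_start: bool) -> int:
    """Choose N for the RTT smoothing."""
    if in_slow_start:
        return ewma_ss
    rate = cwnd_update_rate(cwnd, sendme_inc, cwnd_inc_rate, in_slow_start)
    return max(min(rate * ewma_cwnd_pct // 100, ewma_max), 2)
```

With arbitrary example values cwnd=400, cc_sendme_inc=31, cc_cwnd_inc_rate=1, cc_ewma_cwnd_pct=50, and cc_ewma_max=10, the update rate is 13 acks, so N comes out as 6. These values are chosen only for illustration and are not the consensus defaults.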

2.2. SENDME behavior changes

We will make four major changes to SENDME behavior to aid in computing
and using RTT as a congestion signal.

First, we will need to establish a ProtoVer of "FlowCtrl=2" to signal
support by Exits for the new SENDME format and congestion control
algorithm mechanisms.  We will need a similar announcement in the onion
service descriptors of services that support congestion control.

Second, we will turn CIRCWINDOW_INCREMENT into a consensus parameter
cc_sendme_inc, instead of using a hardcoded value of 100 cells. It is
likely that more frequent SENDME cells will provide quicker reaction to
congestion, since the RTT will be measured more often. If
experimentation in Shadow shows that more frequent SENDMEs reduce
congestion and improve performance but add significant overhead, we can
reduce SENDME overhead by allowing SENDME cells to carry stream data, as
well, using Proposal 325. The method for negotiating a common value of
cc_sendme_inc on a circuit is covered in [ONION_NEGOTIATION] and
[EXIT_NEGOTIATION].

Third, authenticated SENDMEs can remain as-is in terms of protocol
behavior, but will require some implementation updates to account for
variable window sizes and variable SENDME pacing. In particular, the
sendme_last_digests list for auth sendmes needs updated checks for
larger windows and CIRCWINDOW_INCREMENT changes. Other functions to
examine include:
     - circuit_sendme_cell_is_next()
     - sendme_record_cell_digest_on_circ()
     - sendme_record_received_cell_digest()
     - sendme_record_sending_cell_digest()
     - send_randomness_after_n_cells

Fourth, stream level SENDMEs will be eliminated. Details on handling
streams and backpressure are covered in [FLOW_CONTROL].


3. Congestion Window Update Algorithms [CONTROL_ALGORITHMS]

In general, the goal of congestion control is to ensure full and fair
utilization of the capacity of a network path -- in the case of Tor the spare
capacity of the circuit. This is accomplished by setting the congestion window
to target the Bandwidth-Delay Product[28] (BDP) of the circuit in one way or
another, so that the total data outstanding is roughly equal to the actual
transit capacity of the circuit.

There are several ways to update a congestion window to target the BDP. Some
use direct BDP estimation, whereas others use backoff properties to achieve
this. We specify three BDP estimation algorithms in the [BDP_ESTIMATION]
sub-section, and three congestion window update algorithms in [TOR_WESTWOOD],
[TOR_VEGAS], and [TOR_NOLA].

Note that the congestion window update algorithms differ slightly from the
background tor-dev mails[1,2], due to corrections and improvements. Hence they
have been given different names than in those two mails. The third algorithm,
[TOR_NOLA], simply uses the latest BDP estimate directly as its congestion
window.

These algorithms were evaluated by running Shadow simulations, to help
determine parameter ranges, and with experimentation on the live network.
After this testing, we have converged on using [TOR_VEGAS], and RTT-based BDP
estimation using the congestion window. We leave the algorithms in place
for historical reference.

All of these algorithms have rules to update 'cwnd' - the current congestion
window, which starts out at a value controlled by consensus parameter
'cc_cwnd_init'. The algorithms also keep track of 'inflight', which is a count
of the number of cells currently not yet acked by a SENDME. The algorithm MUST
ensure that cells cease being sent if 'cwnd - inflight <= 0'. Note that this
value CAN become negative in the case where the cwnd is reduced while packets
are inflight.

While these algorithms are in use, updates and checks of the current
'package_window' field are disabled. Where a 'package_window' value is
still needed, for example by cell packaging schedulers, 'cwnd - inflight' is
used (with checks to return 0 in the event of negative values).
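
The two rules above (stop sending when 'cwnd - inflight <= 0', and report a non-negative window to schedulers) might be expressed like this. The function names are hypothetical and are not taken from C Tor.

```python
def can_send(cwnd: int, inflight: int) -> bool:
    """Cells may be sent only while the window has room."""
    return cwnd - inflight > 0


def effective_package_window(cwnd: int, inflight: int) -> int:
    """Stand-in for 'package_window' where a scheduler still needs one.

    cwnd - inflight can go negative if cwnd shrinks while cells are
    in flight, so clamp the result at 0.
    """
    return max(cwnd - inflight, 0)
```

For instance, if 100 cells are in flight and the window is cut from 124 to 62, `cwnd - inflight` is -38: sending stops, and the scheduler sees a window of 0 until enough SENDME acks arrive.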

The 'deliver_window' field is still used to decide when to send a SENDME. In C
tor, the deliver window is initially set at 1000, but it never gets below 900,
because authenticated sendmes (Proposal 289) require that we must send only
one SENDME at a time, and send it immediately after 100 cells are received.
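
The deliver window behavior described above can be illustrated with a small model. This is a sketch under the stated numbers (start at 1000, SENDME every 100 cells); the class name and the `lowest` bookkeeping field are assumptions added for the example.

```python
class DeliverWindow:
    """Toy model of the receive-side window that paces SENDMEs."""

    def __init__(self, start: int = 1000, increment: int = 100) -> None:
        self.start = start
        self.increment = increment
        self.window = start
        self.lowest = start  # lowest value the window has reached

    def note_cell_received(self) -> bool:
        """Account for one data cell; return True if a SENDME is due now."""
        self.window -= 1
        self.lowest = min(self.lowest, self.window)
        if self.window <= self.start - self.increment:
            # Send the SENDME immediately and reopen the window.
            self.window += self.increment
            return True
        return False
```

Receiving 250 cells in this model produces two SENDMEs, and the window never drops below 900.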

Implementing the different algorithms should be very simple: each
algorithm provides its own update function, and the one in use is selected
by consensus parameter 'cc_alg'.

For C Tor's current flow control, these functions are defined in sendme.c,
and are called by relay.c:
  - sendme_note_circuit_data_packaged()
  - sendme_circuit_data_received()
  - sendme_circuit_consider_sending()
  - sendme_process_circuit_level()

Despite the complexity of the following algorithms in their TCP
implementations, their Tor equivalents are extremely simple, each being
just a handful of lines of C. This simplicity is possible because Tor
does not have to deal with out-of-order delivery, packet drops,
duplicate packets, and other network issues at the circuit layer, due to
the fact that Tor circuits already have reliability and in-order
delivery at that layer.

We are also removing the aspects of TCP that cause the congestion
algorithm to reset into slow start after being idle for too long, or
after too many congestion signals. These are deliberate choices that
simplify the algorithms and also should provide better performance for
Tor workloads.

In all cases, variables in these sections are either consensus parameters
specified in [CONSENSUS_PARAMETERS], or scoped to the circuit. Consensus
parameters for congestion control are all prefixed by cc_. Everything else
is circuit-scoped.


3.1. Estimating Bandwidth-Delay Product [BDP_ESTIMATION]

At a high level, there are three main ways to estimate the Bandwidth-Delay
Product: by using the current congestion window and RTT, by using the inflight
cells and RTT, and by measuring SENDME arrival rate. After extensive Shadow
simulation and live testing, we have arrived at using the congestion-window,
RTT-based estimator, but we will describe all three for background.

All three estimators are updated on every SENDME ack arrival.

The SENDME arrival rate is the most direct way to estimate BDP, but it
requires averaging over multiple SENDME acks to do so. Unfortunately,
this approach suffers from what is called "ACK compression", where returning
SENDMEs build up in queues, causing over-estimation of the BDP.

The congestion window and inflight estimates rely on the congestion algorithm
more or less correctly tracking an approximation of the BDP, and then use
current and minimum RTT to compensate for overshoot. These estimators tend to
under-estimate BDP, especially when the congestion window is below the BDP.
This under-estimation is corrected for by the increase of the congestion
window in congestion control algorithm rules.

3.1.1. SENDME arrival BDP estimation

It is possible to directly measure BDP via the amount of time between SENDME
acks. In this period of time, we know that the endpoint successfully received
'cc_sendme_inc' cells.

This means that the bandwidth of the circuit is then calculated as:

   BWE = cc_sendme_inc/sendme_ack_timestamp_delta

The bandwidth delay product of the circuit is calculated by multiplying this
bandwidth estimate by the *minimum* RTT time of the circuit (to avoid counting
queue time):

   BDP = BWE * RTT_min

In order to minimize the effects of ack compression (aka SENDME responses
becoming close to one another due to queue delay on the return), we
maintain a history of a full congestion window's worth of previous SENDME
timestamps.

With this, the calculation becomes:

   BWE = (num_sendmes-1) * cc_sendme_inc / num_sendme_timestamp_delta
   BDP = BWE * RTT_min

Note that because we are counting the number of cells *between* the first
and last sendme of the congestion window, we must subtract 1 from the number
of sendmes actually received. Over the time period between the first and last
sendme of the congestion window, the other endpoint successfully read
(num_sendmes-1) * cc_sendme_inc cells.

Furthermore, because the timestamps are microseconds, to avoid integer
truncation, we compute the BDP using multiplication first:

   BDP = (num_sendmes-1) * cc_sendme_inc * RTT_min / num_sendme_timestamp_delta

After all of this, the BDP is smoothed using [N_EWMA_SMOOTHING].

This smoothing means that the SENDME BDP estimation will only work after two
(2) SENDME acks have been received. Additionally, it tends not to be stable
unless at least 'cc_bwe_min' SENDMEs are used, where 'cc_bwe_min' is a
consensus parameter. Finally, if [CLOCK_HEURISTICS] have detected
a clock jump or stall, this estimator is not updated.
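
The raw (unsmoothed) calculation can be sketched as follows. This is an
illustrative Python sketch, not the C Tor code: the function name and the list
of timestamps are assumptions, and the 'cc_bwe_min' check and the
[N_EWMA_SMOOTHING] step are deliberately left out.

```python
from typing import List, Optional


def sendme_bdp(timestamps_usec: List[int], cc_sendme_inc: int,
               rtt_min_usec: int) -> Optional[int]:
    """Raw BDP in cells from a history of SENDME ack arrival timestamps."""
    num_sendmes = len(timestamps_usec)
    if num_sendmes < 2:
        return None  # at least two acks are needed to form a time delta
    delta_usec = timestamps_usec[-1] - timestamps_usec[0]
    if delta_usec <= 0:
        return None  # no usable interval between first and last ack
    # Multiply first, divide last, so integer division truncates only once.
    return (num_sendmes - 1) * cc_sendme_inc * rtt_min_usec // delta_usec
```

For example, five acks spaced 50ms apart, with cc_sendme_inc=31 and a 100ms
RTT_min, give (5-1) * 31 * 100000 / 200000 = 62 cells.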

If all edge connections no longer have data available to send on a circuit
and all circuit queues have drained without blocking the local orconn, we stop
updating this BDP estimate and discard old timestamps. However, we retain the
actual estimator value.

Unfortunately, even after all of this, SENDME BDP estimation proved unreliable
in Shadow simulation, due to ack compression.

3.1.2. Congestion Window BDP Estimation

This is the BDP estimator we use.

Assuming that the current congestion window is at or above the current BDP,
the bandwidth estimate is the current congestion window size divided by the
RTT estimate:

   BWE = cwnd / RTT_current_ewma

The BDP estimate is computed by multiplying the Bandwidth estimate by
the *minimum* circuit latency:

   BDP = BWE * RTT_min

Simplifying:

   BDP = cwnd * RTT_min / RTT_current_ewma

The RTT_min for this calculation comes from the minimum RTT_current_ewma seen
in the lifetime of this circuit. If the congestion window falls to
`cc_cwnd_min` after slow start, implementations MAY choose to reset RTT_min
for use in this calculation to either the RTT_current_ewma, or a
percentile-weighted average between RTT_min and RTT_current_ewma, specified by
`cc_rtt_reset_pct`. This helps with escaping starvation conditions.

The net effect of this estimation is to correct for any overshoot of
the cwnd over the actual BDP. It will obviously underestimate BDP if cwnd
is below BDP.
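
As a small worked sketch of the simplified formula (Python, for illustration;
the function name is hypothetical and RTT values are assumed to be integer
microseconds):

```python
def cwnd_bdp(cwnd: int, rtt_min_usec: int, rtt_ewma_usec: int) -> int:
    """BDP = cwnd * RTT_min / RTT_current_ewma, in cells."""
    if rtt_ewma_usec <= 0:
        raise ValueError("RTT_current_ewma must be positive")
    # Multiply before dividing to avoid early integer truncation.
    return cwnd * rtt_min_usec // rtt_ewma_usec
```

With cwnd=200, RTT_min=100ms and RTT_current_ewma=125ms, the estimate is
200 * 100000 / 125000 = 160 cells: the 25% of RTT attributed to queueing is
removed from the window. When RTT_current_ewma equals RTT_min, the estimate is
simply cwnd.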

3.1.3. Inflight BDP Estimation

Similar to the congestion window based estimation, the inflight estimation
uses the current inflight packet count to derive BDP. It also subtracts local
circuit queue use from the inflight packet count. This means it will be strictly
less than or equal to the cwnd version:

   BDP = (inflight - circ.chan_cells.n) * RTT_min / RTT_current_ewma

If all edge connections no longer have data available to send on a circuit
and all circuit queues have drained without blocking the local orconn, we stop
updating this BDP estimate, because there are not sufficient inflight cells
to properly estimate BDP.

While the research literature for Vegas says that inflight estimators
performed better due to the ability to avoid overshoot, we had better
performance results using other methods to control overshoot. Hence, we do not
use the inflight BDP estimator.

3.1.4. Piecewise BDP estimation

A piecewise BDP estimation could be used to help respond more quickly in the
event the local OR connection is blocked, which indicates congestion somewhere
along the path from the client to the guard (or between Exit and Middle). In
this case, it takes the minimum of the inflight and SENDME estimators.

When the local OR connection is not blocked, this estimator uses the max of
the SENDME and cwnd estimator values.

When the SENDME estimator has not gathered enough data, or has cleared its
estimates based on lack of edge connection use, this estimator uses the
Congestion Window BDP estimator value.
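
The selection logic described above can be sketched as follows (illustrative
Python; the function name is hypothetical, and a SENDME estimate of None is
used here to stand for "not enough data or cleared"):

```python
from typing import Optional


def piecewise_bdp(cwnd_bdp: int, inflight_bdp: int,
                  sendme_bdp: Optional[int], orconn_blocked: bool) -> int:
    """Pick a BDP estimate from the three estimators."""
    if sendme_bdp is None:
        # SENDME estimator has no usable value: fall back to cwnd estimator.
        return cwnd_bdp
    if orconn_blocked:
        # Local congestion: be conservative.
        return min(inflight_bdp, sendme_bdp)
    # No local congestion: take the more optimistic estimate.
    return max(sendme_bdp, cwnd_bdp)
```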


3.2. Tor Westwood: TCP Westwood using RTT signaling [TOR_WESTWOOD]
   http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-westwood
   http://nrlweb.cs.ucla.edu/nrlweb/publication/download/99/2001-mobicom-0.pdf
   http://cpham.perso.univ-pau.fr/TCP/ccr_v31.pdf
   https://c3lab.poliba.it/images/d/d7/Westwood_linux.pdf

Recall that TCP Westwood is basically TCP Reno, but it uses BDP estimates
for "Fast recovery" after a congestion signal arrives.

We will also be using the RTT congestion signal as per BOOTLEG_RTT_TOR
here, from the Options mail[1] and Defenestrator paper[3].

This system must keep track of RTT measurements per circuit: RTT_min, RTT_max,
and RTT_current. These are measured using the time delta between every
'cc_sendme_inc' relay cells and the SENDME response. The first RTT_min can be
measured arbitrarily, so long as it is larger than what we would get from
SENDME.

RTT_current is N-EWMA smoothed over 'cc_ewma_cwnd_pct' percent of
congestion windows worth of SENDME acks, up to a max of 'cc_ewma_max' acks, as
described in [N_EWMA_SMOOTHING].

Recall that BOOTLEG_RTT_TOR emits a congestion signal when the current
RTT rises above a threshold that lies a fraction 'cc_westwood_rtt_thresh' of
the way between RTT_min and RTT_max. No congestion signal is emitted while
this check holds:
   RTT_current < (1-cc_westwood_rtt_thresh)*RTT_min
                  + cc_westwood_rtt_thresh*RTT_max
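
A sketch of this comparison in integer arithmetic, using the percentage form
that the pseudocode later in this section uses (Python for illustration; the
function name is hypothetical, and the threshold is assumed to be expressed as
a percentage from 0 to 100):

```python
def rtt_below_threshold(rtt_current_usec: int, rtt_min_usec: int,
                        rtt_max_usec: int, thresh_pct: int) -> bool:
    """True while RTT_current is under the RTT_min/RTT_max threshold."""
    threshold = ((100 - thresh_pct) * rtt_min_usec // 100
                 + thresh_pct * rtt_max_usec // 100)
    return rtt_current_usec < threshold
```

With RTT_min=100ms, RTT_max=200ms and a threshold of 33 percent, the cut-off
is 67000 + 66000 = 133000 usec, so an RTT of 120ms is below it and an RTT of
150ms is not.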

Additionally, if the local OR connection is blocked at the time of SENDME ack
arrival, this is treated as an immediate congestion signal.

(We can also optionally use the ECN signal described in
ideas/xxx-backward-ecn.txt, to exit Slow Start.)

Congestion signals from RTT, blocked OR connections, or ECN are processed only
once per congestion window. This is achieved through the next_cc_event
counter, which is initialized to a cwnd worth of SENDME acks, and is
decremented on each ack. Congestion signals are only evaluated when it
reaches 0.

Note that because the congestion signal threshold of TOR_WESTWOOD is a
function of RTT_max, and excessive queuing can cause an increase in RTT_max,
TOR_WESTWOOD may have runaway conditions. Additionally, if stream activity is
constant, but of a lower bandwidth than the circuit, this will not drive the
RTT upwards, and this can result in a congestion window that continues to
increase in the absence of any other concurrent activity.

Here is the complete congestion window algorithm for Tor Westwood. This will run
each time we get a SENDME (aka sendme_process_circuit_level()):

 # Update acked cells
 inflight -= cc_sendme_inc

 if next_cc_event:
   next_cc_event--

 # Do not update anything if we detected a clock stall or jump,
 # as per [CLOCK_HEURISTICS]
 if clock_stalled_or_jumped:
   return

 if next_cc_event == 0:
   # BOOTLEG_RTT_TOR threshold; can also be BACKWARD_ECN check.
   # A blocked orconn always counts as a congestion signal.
   if (RTT_current <
      (100-cc_westwood_rtt_thresh)*RTT_min/100 +
      cc_westwood_rtt_thresh*RTT_max/100) and not orconn_blocked:
     if in_slow_start:
       cwnd += cwnd * cc_cwnd_inc_pct_ss             # Exponential growth
     else:
       cwnd = cwnd + cc_cwnd_inc                     # Linear growth
   else:
     if cc_westwood_backoff_min:
       cwnd = min(cwnd * cc_westwood_cwnd_m, BDP)         # Window shrink
     else:
       cwnd = max(cwnd * cc_westwood_cwnd_m, BDP)         # Window shrink
     in_slow_start = 0

     # Back off RTT_max (in case of runaway RTT_max)
     RTT_max = RTT_min + cc_westwood_rtt_m * (RTT_max - RTT_min)

   cwnd = MAX(cwnd, cc_circwindow_min)
   next_cc_event = cwnd / (cc_cwnd_inc_rate * cc_sendme_inc)


3.3. Tor Vegas: TCP Vegas with Aggressive Slow Start [TOR_VEGAS]
   http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-vegas
   http://pages.cs.wisc.edu/~akella/CS740/F08/740-Papers/BOP94.pdf
   http://www.mathcs.richmond.edu/~lbarnett/cs332/assignments/brakmo_peterson_vegas.pdf
   ftp://ftp.cs.princeton.edu/techreports/2000/628.pdf

The TCP Vegas control algorithm estimates the queue lengths at relays by
subtracting the current BDP estimate from the current congestion window.

After extensive Shadow simulation and live testing, we have settled on this
congestion control algorithm for use in Tor.

Assuming the BDP estimate is accurate, any amount by which the congestion
window exceeds the BDP will cause data to queue.

Thus, Vegas estimates the queue use caused by congestion as:

   queue_use = cwnd - BDP

Original TCP Vegas used a cwnd BDP estimator only. We added the ability to
switch this BDP estimator in the implementation, and experimented with various
options. We also parameterized this queue_use calculation as a tunable
weighted average between the cwnd-based BDP estimate and the piecewise
estimate (consensus parameter 'cc_vegas_bdp_mix'). After much testing of
various ways to compute BDP, we were still unable to do much better than the
original cwnd estimator. So while this capability to change the BDP estimator
remains in the C implementation, we do not expect it to be used.

However, it was useful to use a local OR connection block at the time of
SENDME ack arrival, as an immediate congestion signal. Note that in C-Tor,
this orconn_block state is not derived from any socket info, but instead is a
heuristic that declares an orconn as blocked if any circuit cell queue
exceeds the 'cellq_high' consensus parameter.

(As an additional optimization, we could also use the ECN signal described in
ideas/xxx-backward-ecn.txt, but this is not implemented. It is likely only of
any benefit during Slow Start, and even that benefit is likely small.)

During Slow Start, we use RFC3742 Limited Slow Start[32], which checks the
congestion signals from RTT, blocked OR connections, or ECN every single
SENDME ack. It also provides a `cc_sscap_*` parameter for each path length,
which reduces the congestion window increment rate after it is crossed, as
per the rules in RFC3742:
  rfc3742_ss_inc(cwnd):
    if cwnd <= cc_ss_cap_pathtype:
      # Below the cap, we increment as per cc_cwnd_inc_pct_ss percent:
      return round(cc_cwnd_inc_pct_ss*cc_sendme_inc/100)
    else:
      # This returns an increment equivalent to RFC3742, rounded,
      # with a minimum of inc=1.
      # From RFC3742:
      #  K = int(cwnd/(0.5 max_ssthresh));
      #  inc = int(MSS/K);
      return MAX(round((cc_sendme_inc*cc_ss_cap_pathtype)/(2*cwnd)), 1);
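
The same increment rule as a runnable sketch (Python for illustration). Two
assumptions are made here that the pseudocode does not pin down: rounding is
done as integer round-half-up, and the per-path-type cap is passed in as a
plain argument. The parameter values in the usage note are illustrative, not
consensus defaults.

```python
def rfc3742_ss_inc(cwnd: int, ss_cap: int, cc_cwnd_inc_pct_ss: int,
                   cc_sendme_inc: int) -> int:
    """Slow start increment per SENDME ack, limited above ss_cap."""
    if cwnd <= ss_cap:
        # Below the cap: cc_cwnd_inc_pct_ss percent of one SENDME's cells.
        return (cc_cwnd_inc_pct_ss * cc_sendme_inc + 50) // 100
    # Above the cap: the increment shrinks as cwnd grows, but never below 1.
    numerator = cc_sendme_inc * ss_cap
    denominator = 2 * cwnd
    return max((numerator + denominator // 2) // denominator, 1)
```

With cc_sendme_inc=31, cc_cwnd_inc_pct_ss=100 and a cap of 200, the increment
is 31 at cwnd=124, drops to 8 at cwnd=400, and bottoms out at 1 for very large
windows.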

During both Slow Start, and Steady State, if the congestion window is not full,
we never increase the congestion window. We can still decrease it, or exit slow
start, in this case. This is done to avoid causing overshoot. The original TCP
Vegas addressed this problem by computing BDP and queue_use from inflight,
instead of cwnd, but we found that approach to have significantly worse
performance.

Because C-Tor is single-threaded, multiple SENDME acks may arrive during one
processing loop, before edge connections resume reading. For this reason,
we provide two heuristics to provide some slack in determining the full
condition. The first is to allow a gap between inflight and cwnd,
parameterized as 'cc_cwnd_full_gap' multiples of 'cc_sendme_inc':
   cwnd_is_full(cwnd, inflight):
     if inflight + 'cc_cwnd_full_gap'*'cc_sendme_inc' >= cwnd:
       return true
     else
       return false

The second heuristic immediately resets the full state if it falls below
'cc_cwnd_full_minpct' full:
   cwnd_is_nonfull(cwnd, inflight):
     if 100*inflight < 'cc_cwnd_full_minpct'*cwnd:
       return true
     else
       return false
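
Taken together, the two heuristics form a hysteresis: between the "full" and
"nonfull" thresholds, the previous state is kept. A sketch in Python, for
illustration only (update_cwnd_full is a hypothetical helper; the parameter
values in the usage note are not consensus defaults):

```python
def cwnd_is_full(cwnd: int, inflight: int, full_gap: int,
                 sendme_inc: int) -> bool:
    return inflight + full_gap * sendme_inc >= cwnd


def cwnd_is_nonfull(cwnd: int, inflight: int, full_minpct: int) -> bool:
    return 100 * inflight < full_minpct * cwnd


def update_cwnd_full(cwnd_full: bool, cwnd: int, inflight: int,
                     full_gap: int, sendme_inc: int,
                     full_minpct: int) -> bool:
    """Return the new cached full state, keeping the old one in between."""
    if cwnd_is_full(cwnd, inflight, full_gap, sendme_inc):
        return True
    if cwnd_is_nonfull(cwnd, inflight, full_minpct):
        return False
    return cwnd_full
```

For cwnd=400, a gap of 2, cc_sendme_inc=31 and a minimum of 25 percent:
inflight=350 marks the window full, inflight=50 marks it not full, and
inflight=200 leaves the cached state unchanged.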

This full status is cached once per cwnd if 'cc_cwnd_full_per_cwnd=1';
otherwise it is cached once per cwnd update. These two helper functions
determine the number of acks in each case:
   SENDME_PER_CWND(cwnd):
     return ((cwnd + 'cc_sendme_inc'/2)/'cc_sendme_inc')
   CWND_UPDATE_RATE(cwnd, in_slow_start):
     # In Slow Start, update every SENDME
     if in_slow_start:
       return 1
     else: # Otherwise, update as per the 'cc_cwnd_inc_rate' (31)
       return ((cwnd + 'cc_cwnd_inc_rate'*'cc_sendme_inc'/2)
           / ('cc_cwnd_inc_rate'*'cc_sendme_inc'));
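
A sketch of the two helpers with explicit integer division (Python for
illustration; the lower-case function names are hypothetical, and the sketch
assumes all divisions in the pseudocode are integer divisions):

```python
def sendme_per_cwnd(cwnd: int, sendme_inc: int) -> int:
    """Number of SENDME acks in one cwnd, rounded to nearest."""
    return (cwnd + sendme_inc // 2) // sendme_inc


def cwnd_update_rate(cwnd: int, in_slow_start: bool, inc_rate: int,
                     sendme_inc: int) -> int:
    """Number of SENDME acks between congestion window updates."""
    if in_slow_start:
        return 1  # update on every SENDME
    return (cwnd + inc_rate * sendme_inc // 2) // (inc_rate * sendme_inc)
```

With cwnd=400 and cc_sendme_inc=31, there are 13 acks per cwnd. Outside slow
start, an update rate of 1 also yields one update every 13 acks, while an
update rate of 2 yields one update every 6 acks.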

Shadow experimentation indicates that 'cc_cwnd_full_gap=2' and
'cc_cwnd_full_per_cwnd=0' minimize queue overshoot, whereas
'cc_cwnd_full_per_cwnd=1' and 'cc_cwnd_full_gap=1' are slightly better
for performance. Since there may be a difference between Shadow and live,
we leave this parameterization in place.

Here is the complete pseudocode for TOR_VEGAS with RFC3742, which is run every
time an endpoint receives a SENDME ack. All variables are scoped to the
circuit, unless prefixed by an underscore (local), or in single quotes
(consensus parameters):

  # Decrement counters that signal either an update or cwnd event
  if next_cc_event:
    next_cc_event--
  if next_cwnd_event:
    next_cwnd_event--

  # Do not update anything if we detected a clock stall or jump,
  # as per [CLOCK_HEURISTICS]
  if clock_stalled_or_jumped:
    inflight -= 'cc_sendme_inc'
    return

  if BDP > cwnd:
    _queue_use = 0
  else:
    _queue_use = cwnd - BDP

  if cwnd_is_full(cwnd, inflight):
    cwnd_full = 1
  else if cwnd_is_nonfull(cwnd, inflight):
    cwnd_full = 0

  if in_slow_start:
    if _queue_use < 'cc_vegas_gamma' and not orconn_blocked:
      # Only increase cwnd if the cwnd is full
      if cwnd_full:
        _inc = rfc3742_ss_inc(cwnd);
        cwnd += _inc

        # If the RFC3742 increment drops below steady-state increment
        # over a full cwnd worth of acks, exit slow start.
        if _inc*SENDME_PER_CWND(cwnd) <= 'cc_cwnd_inc'*'cc_cwnd_inc_rate':
          in_slow_start = 0
    else: # Limit hit. Exit Slow start (even if cwnd not full)
      in_slow_start = 0
      cwnd = BDP + 'cc_vegas_gamma'

    # Provide an emergency hard-max on slow start:
    if cwnd >= 'cc_ss_max':
      cwnd = 'cc_ss_max'
      in_slow_start = 0
  else if next_cc_event == 0:
    if _queue_use > 'cc_vegas_delta':
      cwnd = BDP + 'cc_vegas_delta' - 'cc_cwnd_inc'
    elif _queue_use > 'cc_vegas_beta' or orconn_blocked:
      cwnd -= 'cc_cwnd_inc'
    elif cwnd_full and _queue_use < 'cc_vegas_alpha':
      # Only increment if queue is low, *and* the cwnd is full
      cwnd += 'cc_cwnd_inc'

    cwnd = MAX(cwnd, 'cc_circwindow_min')

  # Specify next cwnd and cc update
  if next_cc_event == 0:
    next_cc_event = CWND_UPDATE_RATE(cwnd, in_slow_start)
  if next_cwnd_event == 0:
    next_cwnd_event = SENDME_PER_CWND(cwnd)

  # Determine if we need to reset the cwnd_full state
  # (Parameterized)
  if 'cc_cwnd_full_per_cwnd' == 1:
    if next_cwnd_event == SENDME_PER_CWND(cwnd):
      cwnd_full = 0
  else:
    if next_cc_event == CWND_UPDATE_RATE(cwnd, in_slow_start):
      cwnd_full = 0

  # Update acked cells
  inflight -= 'cc_sendme_inc'
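
To make the steady-state branch of the pseudocode above concrete, here is a
sketch of just that branch in Python. It is illustrative only: VegasParams and
vegas_steady_state_cwnd are hypothetical names, and the values in the usage
note are made up for the example rather than taken from the consensus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VegasParams:
    alpha: int     # 'cc_vegas_alpha'
    beta: int      # 'cc_vegas_beta'
    delta: int     # 'cc_vegas_delta'
    cwnd_inc: int  # 'cc_cwnd_inc'
    cwnd_min: int  # 'cc_circwindow_min'


def vegas_steady_state_cwnd(cwnd: int, bdp: int, cwnd_full: bool,
                            orconn_blocked: bool, p: VegasParams) -> int:
    """One steady-state cwnd update, run when next_cc_event reaches 0."""
    queue_use = max(cwnd - bdp, 0)
    if queue_use > p.delta:
        # Far too much queueing: snap back to just under BDP + delta.
        cwnd = bdp + p.delta - p.cwnd_inc
    elif queue_use > p.beta or orconn_blocked:
        cwnd -= p.cwnd_inc
    elif cwnd_full and queue_use < p.alpha:
        # Only grow if the queue is low *and* the cwnd is full.
        cwnd += p.cwnd_inc
    return max(cwnd, p.cwnd_min)
```

With alpha=93, beta=124, delta=155, an increment of 31 and a minimum of 124,
and BDP=200: a cwnd of 500 is cut to 324, a cwnd of 340 steps down to 309, and
a full cwnd of 250 grows to 281. The same cwnd of 250 stays put if it is not
full, and shrinks to 219 if the orconn is blocked.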


3.4. Tor NOLA: Direct BDP tracker [TOR_NOLA]

Based on the theory that congestion control should track the BDP,
the simplest possible congestion control algorithm could just set the
congestion window directly to its current BDP estimate, every SENDME ack.

Such an algorithm would need to overshoot the BDP slightly, especially in the
presence of competing algorithms. But other than that, it can be exceedingly
simple. Like Vegas, but without putting on airs. Just enough strung together.

After meditating on this for a while, it also occurred to me that no one has
named a congestion control algorithm after New Orleans. We have Reno, Vegas,
and scores of others. What's up with that?

Here's the pseudocode for TOR_NOLA that runs on every SENDME ack:

  # Do not update anything if we detected a clock stall or jump,
  # as per [CLOCK_HEURISTICS]
  if clock_stalled_or_jumped:
    return

  # If the orconn is blocked, do not overshoot BDP
  if orconn_blocked:
    cwnd = BDP
  else:
    cwnd = BDP + cc_nola_overshoot

  cwnd = MAX(cwnd, cc_circwindow_min)


4. Flow Control [FLOW_CONTROL]

Flow control provides what is known as "pushback" -- the property that
if one endpoint stops reading data, the other endpoint stops sending
data. This prevents data from accumulating at points in the network, if
it is not being read fast enough by an application.

Because Tor must multiplex many streams onto one circuit, and each
stream is mapped to another TCP socket, Tor's current pushback is rather
complicated and under-specified. In C Tor, it is implemented in the
following functions:
   - circuit_consider_stop_edge_reading()
   - connection_edge_package_raw_inbuf()
   - circuit_resume_edge_reading()

The decision on when a stream is blocked is performed in:
  - sendme_note_stream_data_packaged()
  - sendme_stream_data_received()
  - sendme_connection_edge_consider_sending()
  - sendme_process_stream_level()

Tor currently maintains separate windows for each stream on a circuit,
to provide individual stream flow control. Circuit windows are SENDME
acked as soon as a relay data cell is decrypted and recognized. Stream
windows are only SENDME acked if the data can be delivered to an active
edge connection. This allows the circuit to continue to operate if an
endpoint refuses to read data off of one of the streams on the circuit.

Because Tor streams can connect to many different applications and
endpoints per circuit, it is important to preserve the property that if
only one endpoint edge connection is inactive, it does not stall the
whole circuit, in case one of those endpoints is malfunctioning or
malicious.

However, window-based stream flow control also imposes a speed limit on
individual streams. If the stream window size is below the circuit
congestion window size, then it becomes the speed limit of a download,
as we saw in the [MOTIVATION] section of this proposal.

So for performance, it is optimal that each stream window is the same
size as the circuit's congestion window. However, large stream windows
are a vector for OOM attacks, because malicious clients can force Exits
to buffer a full stream window for each stream while connecting to a
malicious site and uploading data that the site does not read from its
socket. This attack is significantly easier to perform at the stream
level than on the circuit level, because of the multiplier effects of
only needing to establish a single fast circuit to perform the attack on
a very large number of streams.

This catch-22 means that if we use windows for stream flow control, we
either have to commit to allocating a full congestion window worth of
memory for each stream, or impose a speed limit on our streams.

Hence, we will discard stream windows entirely, and instead use a
simpler buffer-based design that uses XON/XOFF to signal when this
buffer is too large. Additionally, the XON cell will contain advisory
rate information based on the rate at which that edge connection can
write data while it has data to write. The other endpoint can rate limit
sending data for that stream to the rate advertised in the XON, to avoid
excessive XON/XOFF chatter and sub-optimal behavior.

This will allow us to make full use of the circuit congestion window for
every stream in combination, while still avoiding buffer buildup inside
the network.

4.1. Stream Flow Control Without Windows [WINDOWLESS_FLOW]

Each endpoint (client, Exit, or onion service) sends circuit-level
SENDME acks for all circuit cells as soon as they are decrypted and
recognized, but *before* delivery to their edge connections.

This means that if the edge connection is blocked because an
application's SOCKS connection or a destination site's TCP connection is
not reading, data will build up in a queue at that endpoint,
specifically in the edge connection's outbuf.

Consensus parameters will govern the length of this queue that
determines when XON and XOFF cells are sent, as well as when advisory
XON cells that contain rate information can be sent. These parameters
are separate for the queue lengths of exits, and of clients/services.

(Because clients and services will typically have localhost connections
for their edges, they will need similar buffering limits. Exits may have
different properties, since their edges will be remote.)

The trunnel relay cell payload definitions for XON and XOFF are:

struct xoff_cell {
  u8 version IN [0x00];
}

struct xon_cell {
  u8 version IN [0x00];

  u32 kbps_ewma;
}

Parties SHOULD treat XON or XOFF cells with unrecognized versions as a
protocol violation.

In `xon_cell`, a zero value for `kbps_ewma` means that the stream's rate is
unlimited.  Parties should therefore not send "0" to mean "do not send data".
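
For illustration, the two payloads can be encoded and parsed as below. This is
a Python sketch rather than trunnel-generated code; it assumes multi-byte
integers are in network (big-endian) byte order, and the function names are
hypothetical.

```python
import struct

XON_XOFF_VERSION = 0x00
_XON_FORMAT = ">BI"  # u8 version, u32 kbps_ewma, big-endian (assumed)


def encode_xoff() -> bytes:
    return bytes([XON_XOFF_VERSION])


def encode_xon(kbps_ewma: int) -> bytes:
    """kbps_ewma == 0 means 'unlimited', not 'stop sending'."""
    return struct.pack(_XON_FORMAT, XON_XOFF_VERSION, kbps_ewma)


def decode_xon(payload: bytes) -> int:
    """Return kbps_ewma; raise ValueError on an unrecognized version."""
    version, kbps_ewma = struct.unpack_from(_XON_FORMAT, payload)
    if version != XON_XOFF_VERSION:
        raise ValueError("unrecognized XON version")
    return kbps_ewma
```

A caller that gets ValueError here would treat the cell as a protocol
violation, as described above.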

4.1.1. XON/XOFF behavior

If the length of an edge outbuf queue exceeds the size provided in the
appropriate client or exit XOFF consensus parameter, a
RELAY_COMMAND_STREAM_XOFF will be sent, which instructs the other endpoint to
stop sending from that edge connection.

Once the queue is expected to empty, a RELAY_COMMAND_STREAM_XON will be sent,
which allows the other end to resume reading on that edge connection. This XON
also indicates the average rate of queue drain since the XOFF.

Advisory XON cells are also sent whenever the edge connection's drain
rate changes by more than 'cc_xon_change_pct' percent compared to
the previously sent XON cell's value.

4.1.2. Edge bandwidth rate advertisement [XON_ADVISORY]

As noted above, the XON cell provides a field to indicate the N_EWMA rate at
which edge connections drain their outgoing buffers.

To compute the drain rate, we maintain a timestamp and a byte count of how many
bytes were written onto the socket from the connection outbuf.

In order to measure the drain rate of a connection, we need to measure the time
it took between flushing N bytes on the socket and when the socket is available
for writing again. In other words, we are measuring the time it took for the
kernel to send N bytes between the first flush on the socket and the next
poll() write event.

For example, let's say we just wrote 100 bytes on the socket at time t = 0sec,
and at time t = 2sec the socket becomes writeable again. We then estimate that
the rate of the socket is 100 bytes / 2sec, or 50B/sec.

To make this measurement, we start the timer by recording a timestamp as soon
as data begins to accumulate in an edge connection's outbuf, which we
currently take to mean 16KB (32 cells) of queued data. We use this value for
now because Tor writes up to 32 cells at once to a connection outbuf, so we
use this burst of data as an indicator that bytes are starting to accumulate.

After 'cc_xon_rate' cells worth of stream data, we use N_EWMA to average this
rate into a running EWMA average, with N specified by consensus parameter
'cc_xon_ewma_cnt'. Every EWMA update, the byte count is set to 0 and a new
timestamp is recorded. In this way, the EWMA counter is averaging N counts of
'cc_xon_rate' cells worth of bytes each.

If the buffers are non-zero, and we have sent an XON before, and the N_EWMA
rate has changed more than 'cc_xon_change_pct' since the last XON, we send an
updated rate. Because the EWMA rate is only updated every 'cc_xon_rate' cells
worth of bytes, such advisory XON updates cannot be sent more frequently than
this, and should be sent much less often in practice.
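
The advisory XON condition can be sketched as follows (illustrative Python;
the function name is hypothetical, and the sketch assumes the change is
measured relative to the rate in the previously sent XON):

```python
from typing import Optional


def should_send_advisory_xon(outbuf_len: int, last_xon_rate: Optional[int],
                             ewma_rate: int, cc_xon_change_pct: int) -> bool:
    """Decide whether to send an updated rate in an advisory XON."""
    if outbuf_len == 0 or last_xon_rate is None:
        # Buffers are empty, or no XON has been sent yet on this stream.
        return False
    change = abs(ewma_rate - last_xon_rate)
    # Integer form of: change / last_xon_rate > cc_xon_change_pct / 100
    return 100 * change > cc_xon_change_pct * last_xon_rate
```

With a change threshold of 25 percent and a previously advertised rate of
1000, a new rate of 1300 triggers an advisory XON, while a new rate of 1200
does not.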

When the outbuf completely drains to 0, and has been 0 for 'cc_xon_rate' cells
worth of data, we double the EWMA rate. We continue to double it while the
outbuf is 0, every 'cc_xon_rate' cells. The measurement timestamp is also set
back to 0.

When an XOFF is sent, the EWMA rate is reset to 0, to allow fresh calculation
upon drain.

If a clock stall or jump is detected by [CLOCK_HEURISTICS], we also
clear the fields, but do not record them in ewma.

NOTE: Because our timestamps are microseconds, we chose to compute and
transmit both of these rates as 1000 byte/sec units, as this reduces the
number of multiplications and divisions and avoids precision loss.
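
As a sketch of the unit handling (Python for illustration; the function name
is hypothetical): with microsecond timestamps, bytes per microsecond
multiplied by 1000 gives a rate in 1000 byte/sec units, so a single
multiplication and a single division suffice.

```python
from typing import Optional


def drain_rate_kbytes_per_sec(bytes_written: int, start_usec: int,
                              now_usec: int) -> Optional[int]:
    """Drain rate in units of 1000 bytes/sec, or None if no time elapsed."""
    delta_usec = now_usec - start_usec
    if delta_usec <= 0:
        return None
    return 1000 * bytes_written // delta_usec
```

For example, 16384 bytes drained in 4000 usec is a rate of 4096 (about 4
MB/sec). Very slow rates truncate to 0 in these units: the 50B/sec example
above would yield 0.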

4.1.3. Oomkiller behavior

A malicious client can attempt to exhaust memory in an Exit's outbufs, by
ignoring XOFF and advisory XONs. Implementations MAY choose to close specific
streams with outbufs that grow too large, but since the exit does not know
with certainty the client's congestion window, it is non-trivial to determine
the exact upper limit a well-behaved client might send on a blocked stream.

Implementations MUST close the streams with the oldest chunks present in their
outbufs, while under global memory pressure, until memory pressure is
relieved.
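
One possible shape for this selection, as a sketch only (Python for
illustration; the tuple layout, the function name, and the idea of a target
number of bytes to free are all assumptions, not part of this proposal):

```python
from typing import List, Tuple

# (stream_id, timestamp of oldest outbuf chunk, outbuf length in bytes)
StreamInfo = Tuple[str, int, int]


def streams_to_close(streams: List[StreamInfo],
                     bytes_to_free: int) -> List[str]:
    """Pick streams with the oldest outbuf chunks until enough is freed."""
    victims: List[str] = []
    freed = 0
    for stream_id, _oldest_ts, outbuf_len in sorted(streams,
                                                    key=lambda s: s[1]):
        if freed >= bytes_to_free:
            break
        victims.append(stream_id)
        freed += outbuf_len
    return victims
```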

4.1.4. Sidechannel mitigation

In order to mitigate DropMark attacks[28], both XOFF and advisory XON
transmission must be restricted. Because DropMark attacks are most severe
before data is sent, clients MUST ensure that an XOFF does not arrive before
they have sent the appropriate XOFF limit of bytes on a stream ('cc_xoff_exit'
for exits, 'cc_xoff_client' for onions).

Clients also SHOULD ensure that advisory XONs do not arrive before the
minimum of the XOFF limit and 'cc_xon_rate' full cells worth of bytes have
been transmitted.

Clients SHOULD ensure that advisory XONs do not arrive more frequently than
every 'cc_xon_rate' cells worth of sent data. Clients also SHOULD ensure that
XOFFs do not arrive more frequently than every XOFF limit worth of sent data.

Implementations SHOULD close the circuit if these limits are violated on the
client-side, to detect and resist dropmark attacks[28].

Additionally, because edges no longer use stream SENDME windows, we alter the
half-closed connection handling to be time based instead of data quantity
based. Half-closed connections are allowed to receive data up to the larger
value of the congestion control max_rtt field or the circuit build timeout
(for onion service circuits, we use twice the circuit build timeout). Any data
or relay cells after this point are considered invalid data on the circuit.
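
The time limit for half-closed connections can be sketched as follows
(illustrative Python; the function name is hypothetical, and both inputs are
assumed to be expressed in the same time unit):

```python
def half_closed_grace_period(max_rtt: int, circuit_build_timeout: int,
                             is_onion_service_circuit: bool) -> int:
    """How long a half-closed connection may keep receiving data."""
    timeout = circuit_build_timeout
    if is_onion_service_circuit:
        timeout *= 2  # onion service circuits use twice the build timeout
    return max(max_rtt, timeout)
```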

Recall that all of the dropped cell enforcement in C-Tor is performed by
accounting data provided through the control port CIRC_BW fields, currently
enforced only by using the vanguards addon[29].

The C-Tor implementation exposes all of these properties to CIRC_BW for
vanguards to enforce, but does not enforce them itself. So violations of any
of these limits do not cause circuit closure unless that addon is used (as
with the rest of the dropped cell side channel handling in C-Tor).


5. System Interactions [SYSTEM_INTERACTIONS]

Tor's circuit-level SENDME system currently has special cases in the
following situations: Intropoints, HSDirs, onion services, and circuit
padding. Additionally, proper congestion control will allow us to very
easily implement conflux (circuit traffic splitting).

This section details those special cases and interactions of congestion
control with other components of Tor.

5.1. HSDirs

Because HSDirs use the tunneled dirconn mechanism and thus also use
RELAY_COMMAND_DATA, they are already subject to Tor's flow control.

We may want to set a custom initial circuit window for HSDir circuits,
so that a SENDME is not required to fetch long descriptors. This will
ensure HSDir descriptors can be fetched in one RTT.

5.2. Introduction Points

Introduction Points are not currently subject to any flow control.

Because Intropoints accept INTRODUCE1 cells from many client circuits
and then relay them down a single circuit to the service as INTRODUCE2
cells, we cannot provide end-to-end congestion control all the way from
client to service for these cells.

We can run congestion control from the service to the Intropoint, and probably
should, since this is already subject to congestion control.

As an optimization, if that congestion window reaches zero (because the
service is overwhelmed), then we start sending NACKS back to the clients (or
begin requiring proof-of-work), rather than just let clients wait for timeout.

5.3. Rendezvous Points

Rendezvous points are already subject to end-to-end SENDME control,
because all relay cells are sent end-to-end via the rendezvous circuit
splice in circuit_receive_relay_cell().

This means that rendezvous circuits will use end-to-end congestion
control, as soon as individual onion clients and onion services upgrade
to support it. There is no need for intermediate relays to upgrade at
all.

5.4. Circuit Padding

Recall that circuit padding is negotiated between a client and a middle
relay, with one or more state machines running on circuits at the middle
relay that decide when to add padding.
  https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md

This means that the middle relay can send padding traffic towards the
client that contributes to congestion, and the client may also send
padding towards the middle relay, which also creates congestion.

For low-traffic padding machines, such as the currently deployed circuit
setup obfuscation, this padding is inconsequential.

However, higher traffic circuit padding machines that are designed to
defend against website traffic fingerprinting will need additional care
to avoid inducing additional congestion, especially after the client or
the exit experiences a congestion signal.

The current overhead percentage rate limiting features of the circuit
padding system should handle this in some cases, but in other cases, an
XON/XOFF circuit padding flow control command may be required, so that
clients may signal to the machine that congestion is occurring.

5.5. Conflux

Conflux (aka multi-circuit traffic splitting) becomes significantly
easier to implement once we have congestion control. However, much like
congestion control, it will require experimentation to tune properly.

Recall that Conflux uses a 256-bit UUID to bind two circuits together at
the Exit or onion service. The original Conflux paper specified an
equation based on RTT to choose which circuit to send cells on.
  https://www.cypherpunks.ca/~iang/pubs/conflux-pets.pdf

However, with congestion control, we will already know which circuit has
the larger congestion window, and thus has the most available cells in
its current congestion window. This will also be the faster circuit.
Thus, the decision of which circuit to send a cell on only requires
comparing congestion windows (and choosing the circuit with more packets
remaining in its window).
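
As a minimal sketch of that comparison (not the deployed Conflux scheduler),
the following assumes each circuit tracks its congestion window and its
in-flight cell count. The Circuit class and the pick_circuit() helper are
hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    """Hypothetical per-circuit congestion state, counted in cells."""
    name: str
    cwnd: int      # current congestion window
    inflight: int  # cells sent but not yet acked by a SENDME

    def available(self) -> int:
        # Cells that may still be sent before the window is exhausted.
        return max(self.cwnd - self.inflight, 0)


def pick_circuit(a: Circuit, b: Circuit) -> Circuit:
    """Choose the circuit with more cells remaining in its window."""
    return a if a.available() >= b.available() else b


legs = (Circuit("leg_a", cwnd=400, inflight=390),
        Circuit("leg_b", cwnd=250, inflight=100))
chosen = pick_circuit(*legs)
print(chosen.name)
```

In this example leg_a has the larger window but only 10 cells of room left,
so the sketch picks leg_b, which still has 150 cells available.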

Conflux will require sequence numbers on data cells, to ensure that the
two circuits' data is properly re-assembled. The resulting out-of-order
buffer can potentially be as large as an entire congestion window, if
the circuits are very desynced (or one of them closes). It will be very
expensive for Exits to maintain this much memory, and exposes them to
OOM attacks.

This is not as much of a concern in the client download direction, since
clients will typically only have a small number of these out-of-order
buffers to keep around. But for the upload direction, Exits will need
to send some form of early XOFF on the faster circuit if this
out-of-order buffer begins to grow too large, since simply halting the
delivery of SENDMEs will still allow a full congestion window full of
data to arrive. This will also require tuning and experimentation, and
optimum results will vary between simulator and live network.
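
The reassembly problem can be sketched as follows. This is an illustration
under stated assumptions, not the Conflux wire format: the ReorderBuffer
class, its xoff_limit threshold, and the per-cell sequence numbers starting
at zero are all hypothetical.

```python
class ReorderBuffer:
    """Hypothetical out-of-order buffer for cells arriving on two circuits."""

    def __init__(self, xoff_limit: int):
        self.next_seq = 0             # next sequence number to deliver
        self.pending = {}             # seq -> cell payload held out of order
        self.xoff_limit = xoff_limit  # assumed threshold for an early XOFF

    def receive(self, seq: int, data: str):
        """Return (cells now deliverable in order, whether to send XOFF)."""
        if seq >= self.next_seq:      # ignore duplicates of delivered cells
            self.pending[seq] = data
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        # Ask the faster circuit to pause once too much data is being held.
        return delivered, len(self.pending) > self.xoff_limit


buf = ReorderBuffer(xoff_limit=2)
for seq, data in [(1, "b"), (2, "c"), (3, "d"), (0, "a")]:
    print(seq, buf.receive(seq, data))
```

The sketch signals XOFF as soon as the held-back cells exceed the limit,
instead of waiting for a full congestion window to arrive. The actual
threshold is one of the values that would need tuning.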


6. Performance Evaluation [EVALUATION]

Congestion control for Tor will be easy to implement, but difficult to
tune to ensure optimal behavior.

6.1. Congestion Signal Experiments

Our first experiments were client-side measurements to determine how
stable the RTT measurements of circuits are across the live Tor network,
and thus whether we need more frequent SENDMEs and/or any RTT smoothing
or averaging.

These experiments were performed using onion service clients and services on
the live Tor network. From these experiments, we tuned the RTT and BDP
estimators, and arrived at reasonable values for EWMA smoothing and the
minimum number of SENDME acks required to estimate BDP.

Additionally, we specified that the algorithms maintain previous congestion
window estimates in the event that a circuit goes idle, rather than revert to
slow start. We experimented with intermittent idle/active live onion clients
to make sure that this behavior is acceptable, and it appeared to be.

In Shadow experimentation, the primary thing to test will be if the OR conn on
Exit relays blocks too frequently when under load, thus causing excessive
congestion signals, and overuse of the Inflight BDP estimator as opposed
to SENDME or CWND BDP. It may also be the case that this behavior is optimal,
even if it does happen.

Finally, we should check small variations in the EWMA smoothing and minimum BDP ack
counts in Shadow experimentation, to check for high variability in these
estimates, and other surprises.

6.2. Congestion Algorithm Experiments

In order to evaluate performance of congestion control algorithms, we will
need to implement [TOR_WESTWOOD], [TOR_VEGAS], and [TOR_NOLA]. We will need to
simulate their use in the Shadow Tor network simulator.

Simulation runs will need to evaluate performance on networks that use
only one algorithm, as well as on networks that run a combination of
algorithms - particularly each type of congestion control in combination
with Tor's current flow control. Depending upon the number of current
flow control clients, more aggressive parameters of these algorithms may
need to be set, but this will result in additional queueing as well as
sub-optimal behavior once all clients upgrade.

In particular, during live onion service testing, we noticed that these
algorithms required particularly aggressive default values to compete against
a network full of current clients. As more clients upgrade, we may be able
to lower these defaults. We should get a good idea of what values we can
choose at what upgrade point, from mixed Shadow simulation.

If Tor's current flow control is so aggressive that it causes problems with
any amount of remaining old clients, we can experiment with kneecapping these
legacy flow control Tor clients by setting a low 'circwindow' consensus
parameter for them. This will allow us to set more reasonable parameter
values, without waiting for all clients to upgrade.

Because custom congestion control can be deployed by any Exit or onion
service that desires better service, we will need to be particularly careful
about how congestion control algorithms interact with rogue implementations
that more aggressively increase their window sizes.  During these
adversarial-style experiments, we must verify that cheaters do not get
better service, and that Tor's circuit OOM killer properly closes circuits
that seriously abuse the congestion control algorithm, as per
[SECURITY_ANALYSIS]. This may require tuning 'circ_max_cell_queue_size',
and 'CircuitPriorityHalflifeMsec'.

Additionally, we will need to experiment with reducing the cell queue limits
on OR conns before they are blocked (OR_CONN_HIGHWATER), and study the
interaction of that with treating the OR conn block as a congestion signal.

Finally, we will need to monitor our Shadow experiments for evidence of ack
compression, which can cause the BDP estimator to over-estimate the congestion
window. We will instrument our Shadow simulations to alert if they discover
excessive congestion window values, and tweak 'cc_bwe_min' and
'cc_sendme_inc' appropriately. We can set the 'cc_cwnd_max' parameter value
to low values (eg: ~2000 or so) to watch for evidence of this in Shadow, and
log. Similarly, we should watch for evidence that the 'cc_cwnd_min' parameter
value is rarely hit in Shadow, as this indicates that the cwnd may be too
small to measure BDP (for cwnd less than 'cc_sendme_inc'*'cc_bwe_min').

6.3. Flow Control Algorithm Experiments

Flow control only applies when the edges outside of Tor (SOCKS application,
onion service application, or TCP destination site) are *slower* than Tor's
congestion window. This typically means that the application is either
suspended or reading too slowly off its SOCKS connection, or the TCP destination
site itself is bandwidth throttled on its downstream.

To examine these properties, we will perform live onion service testing, where
curl is used to download a large file. We will first test with no rate limit,
and verify that XON/XOFF is never sent. We then suspend this download, verify
that an XOFF is sent, and transmission stops. Upon resuming this download, the
download rate should return to normal. We will also use curl's --limit-rate
option, to exercise that the flow control properly measures the drain rate and
limits the buffering in the outbuf, modulo kernel socket and localhost TCP
buffering.

However, flow control can also get triggered at Exits in a situation where
either TCP fairness issues or Tor's mainloop does not properly allocate
enough capacity to edge uploads, causing them to be rate limited below the
circuit's congestion window, even though the TCP destination actually has
sufficient downstream capacity.

Exits are also most vulnerable to the buffer bloat caused by such uploads,
since there may be many uploads active at once.

To study this, we will run shadow simulations. Because Shadow does *not*
rate limit its tgen TCP endpoints, and only rate limits the relays
themselves, if *any* XON/XOFF activity happens in Shadow *at all*, it is
evidence that such fairness issues can occur.

Just in case Shadow does not have sufficient edge activity to trigger such
emergent behavior, when congestion control is enabled on the live network, we
will also need to instrument a live exit, to verify that XON/XOFF is not
happening frequently on it. Relays may also report these statistics in their
extra-info descriptors, to help with monitoring the live network conditions,
but this might also require aggregation or minimization.

If excessive XOFF/XON activity happens at Exits, we will need to investigate
tuning the libevent mainloop to prioritize edge writes over orconn writes.
Additionally, we can lower 'cc_xoff_exit'. Linux Exits can also lower the
'net.ipv[46].tcp_wmem' sysctl value, to reduce the amount of kernel socket
buffering they do on such streams, which will improve XON/XOFF responsiveness
and reduce memory usage.

6.4. Performance Metrics [EVALUATION_METRICS]

The primary metrics that we will be using to measure the effectiveness
of congestion control in simulation are TTFB/RTT, throughput, and utilization.

We will calibrate the Shadow simulator so that it has similar CDFs for all of
these metrics as the live network, without using congestion control.

Then, we will want to inspect CDFs of these three metrics for various
congestion control algorithms and parameters.

The live network testing will also spot-check performance characteristics of
a couple of algorithm and parameter sets, to ensure we see results similar to
those from Shadow.

On the live network, because congestion control will affect so many aspects of
performance, from throughput to RTT, to load balancing, queue length,
overload, and other failure conditions, the full set of performance metrics
will be required, to check for any emergent behaviors:
  https://gitlab.torproject.org/legacy/trac/-/wikis/org/roadmaps/CoreTor/PerformanceMetrics

We will also need to monitor network health for relay queue lengths,
relay overload, and other signs of network stress (and particularly the
alleviation of network stress).

6.5. Consensus Parameter Tuning [CONSENSUS_PARAMETERS]

During Shadow simulation, we will determine reasonable default
parameters for our consensus parameters for each algorithm. We will then
re-run these tuning experiments on the live Tor network, as described
in:
  https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/Sponsor61/PerformanceExperiments

6.5.1. Parameters common to all algorithms

These are sorted in order of importance to tune, most important first.

  cc_alg:
    - Description:
          Specifies which congestion control algorithm clients should
          use, as an integer.
    - Range: 0 or 2  (0=fixed windows, 2=Vegas)
    - Default: 2
    - Tuning Values: [2,3]
    - Tuning Notes:
           These algorithms need to be tested against percentages of current
           fixed alg client competition, in Shadow. Their optimal parameter
           values, and even the optimal algorithm itself, will likely depend
           upon how much fixed sendme traffic is in competition. See the
           algorithm-specific parameters for additional tuning notes.
           As of Tor 0.4.8, Vegas is the default algorithm, and support
           for algorithms 1 (Westwood) and 3 (NOLA) has been removed.
    - Shadow Tuning Results:
           Westwood exhibited responsiveness problems, drift, and overshoot.
           NOLA exhibited ack compression resulting in over-estimating the
           BDP. Vegas, when tuned properly, kept queues low and throughput
           high, but even.

  cc_bwe_min:
    - Description:
          The minimum number of SENDME acks to average over in order to
          estimate bandwidth (and thus BDP).
    - Range: [2, 20]
    - Default: 5
    - Tuning Values: 4-10
    - Tuning Notes:
           The lower this value is, the sooner we can get an estimate of
           the true BDP of a circuit. Low values may lead to massive
           over-estimation, due to ack compression. However, if this
           value is above the number of acks that fit in cc_cwnd_init, then
           we won't get a BDP estimate within the first use of the circuit.
           Additionally, if this value is above the number of acks that
           fit in cc_cwnd_min, we may not be able to estimate BDP
           when the congestion window is small. If we need small congestion
           windows, we should also lower cc_sendme_inc, which will get us more
           frequent acks with less data.
    - Shadow Tuning Results:
           Regardless of how high this was set, there were still cases where
           queues built up, causing BDP over-estimation. As a result, we disable
           use of the BDP estimator, and only use the Vegas CWND estimator.

  cc_sendme_inc:
    - Description: Specifies how many cells a SENDME acks
    - Range: [1, 254]
    - Default: 31
    - Tuning Values: 25,33,50
    - Tuning Notes:
           Lowering this increases overhead, but also increases BDP
           estimation accuracy. Since we only use circuit-level sendmes,
           and the old code sent sendmes at both every 50 cells, and every
           100, we can set this as low as 33 to have the same amount of
           overhead.
    - Shadow Tuning Results:
           This was optimal at 31-32 cells, which is also the number of
           cells that fit in a TLS frame. Much of the rest of Tor has
           processing values at 32 cells, as well.
    - Consensus Update Notes:
           This value MUST only be changed by +/- 1, every 4 hours.
           If greater changes are needed, they MUST be spread out over
           multiple consensus updates.
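
The stepwise update rule above can be illustrated with a short sketch. It
assumes one step of +/- 1 per 4-hour interval; the sendme_inc_schedule()
helper is hypothetical and is not part of any Tor tooling.

```python
def sendme_inc_schedule(current: int, target: int, hours_per_step: int = 4):
    """List (hours from now, cc_sendme_inc value) pairs, one step of 1 each."""
    step = 1 if target > current else -1
    schedule = []
    hours = 0
    value = current
    while value != target:
        value += step            # never move by more than 1 per update
        hours += hours_per_step  # wait the assumed 4 hours between updates
        schedule.append((hours, value))
    return schedule


# Moving from 31 to 33 takes two separate consensus updates.
print(sendme_inc_schedule(31, 33))
```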

  cc_cwnd_init:
    - Description: Initial congestion window for new congestion
                   control Tor clients. This can be set much higher
                   than TCP, since actual TCP to the guard will prevent
                   buffer bloat issues at local routers.
    - Range: [31, 10000]
    - Default: 4*31
    - Tuning Values: 150,200,250,500
    - Tuning Notes:
           Higher initial congestion windows allow the algorithms to
           measure initial BDP more accurately, but will lead to queue bursts
           and latency.  Ultimately, the ICW should be set to approximately
           'cc_bwe_min'*'cc_sendme_inc', but the presence of competing
           fixed window clients may require higher values.
    - Shadow Tuning Results:
           Setting this too high caused excessive cell queues at relays.
           4*31 ended up being a sweet spot.
    - Consensus Update Notes:
           This value must never be set below cc_sendme_inc.

  cc_cwnd_min:
    - Description: The minimum allowed cwnd.
    - Range: [31, 1000]
    - Default: 31
    - Tuning Values: [100, 150, 200]
    - Tuning Notes:
           If the cwnd falls below cc_sendme_inc, connections can't send
           enough data to get any acks, and will stall. If it falls below
           cc_bwe_min*cc_sendme_inc, connections can't use SENDME BDP
           estimates. Likely we want to set this around
           cc_bwe_min*cc_sendme_inc, but no lower than cc_sendme_inc.
    - Shadow Tuning Results:
           We set this at 31 cells, the cc_sendme_inc.
    - Consensus Update Notes:
           This value must never be set below cc_sendme_inc.
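
The two thresholds in the tuning notes can be made concrete with a small
sketch. It uses the defaults listed in this section (cc_sendme_inc=31,
cc_bwe_min=5); the cwnd_regime() helper and its return labels are
hypothetical.

```python
def cwnd_regime(cwnd: int, cc_sendme_inc: int = 31, cc_bwe_min: int = 5) -> str:
    """Classify a congestion window against the two limits described above."""
    if cwnd < cc_sendme_inc:
        # Not enough cells in flight to ever trigger a SENDME ack.
        return "stall"
    if cwnd < cc_bwe_min * cc_sendme_inc:
        # Acks arrive, but too few per window for a SENDME BDP estimate.
        return "no-sendme-bdp"
    return "ok"


for cwnd in (30, 31, 154, 155):
    print(cwnd, cwnd_regime(cwnd))
```

With these defaults the second limit is 5*31 = 155 cells, so a cwnd at the
minimum of 31 can send data but cannot use the SENDME BDP estimator.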

  cc_cwnd_max:
    - Description: The maximum allowed cwnd.
    - Range: [500, INT32_MAX]
    - Default: INT32_MAX
    - Tuning Values: [5000, 10000, 20000]
    - Tuning Notes:
       If cc_bwe_min is set too low, the BDP estimator may over-estimate the
       congestion window in the presence of large queues, due to SENDME ack
       compression. Once all clients have upgraded to congestion control,
       queues large enough to cause ack compression should become rare. This
       parameter exists primarily to verify this in Shadow, but we preserve it
       as a consensus parameter for emergency use in the live network, as well.
    - Shadow Tuning Results:
       We kept this at INT32_MAX.

  circwindow:
    - Description: Initial congestion window for legacy Tor clients
    - Range: [100, 1000]
    - Default: 1000
    - Tuning Values: 100,200,500,1000
    - Tuning Notes:
           If the above congestion algorithms are not optimal until an
           unreasonably high percentage of clients upgrade, we can reduce
           the performance of ossified legacy clients by reducing their
           circuit windows. This will allow old clients to continue to
           operate without impacting optimal network behavior.

  cc_cwnd_inc_rate:
    - Description: How often we update our congestion window, per cwnd worth
                   of packets
    - Range: [1, 250]
    - Default: 1
    - Tuning Values: [1,2,5,10]
    - Tuning Notes:
           Congestion control theory says that the congestion window should
           only be updated once every cwnd worth of packets. We may find it
           better to update more frequently, but this is probably unlikely
           to help a great deal.
    - Shadow Tuning Results:
           Increasing this during slow start caused overshoot and excessive
           queues. Increasing this after slow start was suboptimal for
           performance. We keep this at 1.

  cc_ewma_cwnd_pct:
    - Description: This specifies the N in N-EWMA smoothing of RTT and BDP
                   estimation, as a percent of the number of SENDME acks
                   in a congestion window. It allows us to average these RTT
                   values over a percentage of the congestion window,
                   capped by 'cc_ewma_max' below, and specified in
                   [N_EWMA_SMOOTHING].
    - Range: [1, 255]
    - Default: 50,100
    - Tuning Values: [25,50,100]
    - Tuning Notes:
           Smoothing our estimates reduces the effects of ack compression and
           other ephemeral network hiccups; changing this much is unlikely
           to have a huge impact on performance.
    - Shadow Tuning Results:
           Setting this to 50 seemed to reduce cell queues, but this may also
           have impacted performance.

  cc_ewma_max:
    - Description: This specifies the max N in N_EWMA smoothing of RTT and BDP
                   estimation. It allows us to place a cap on the N of EWMA
                   smoothing, as specified in [N_EWMA_SMOOTHING].
    - Range: [2, INT32_MAX]
    - Default: 10
    - Tuning Values: [10,20]
    - Shadow Tuning Results:
           We ended up needing this to make Vegas more responsive to
           congestion, to avoid overloading slow relays. Values of 10 or 20
           were best.
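
The interaction of 'cc_ewma_cwnd_pct' and 'cc_ewma_max' can be sketched as
below. The [N_EWMA_SMOOTHING] section is not reproduced here, so this sketch
assumes the common N-EWMA form (2*value + (N-1)*prev)/(N+1), and assumes that
N is floored at 2, matching the lower bound of the 'cc_ewma_max' range. The
function names are hypothetical.

```python
def ewma_n(cwnd: int, cc_sendme_inc: int = 31,
           cc_ewma_cwnd_pct: int = 50, cc_ewma_max: int = 10) -> int:
    """Pick N as a percentage of the SENDME acks per cwnd, capped and floored."""
    acks_per_cwnd = cwnd // cc_sendme_inc
    n = acks_per_cwnd * cc_ewma_cwnd_pct // 100
    return max(min(n, cc_ewma_max), 2)  # floor of 2 is an assumption


def n_ewma(prev: float, value: float, n: int) -> float:
    """Assumed N-EWMA update: new samples get weight 2/(N+1)."""
    return (2 * value + (n - 1) * prev) / (n + 1)


print(ewma_n(310), ewma_n(3100))   # percentage applies, then the cap applies
print(n_ewma(prev=100, value=130, n=2))
```

A smaller N weights new samples more heavily, which is consistent with the
Shadow result above that a low cap made Vegas more responsive.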

  cc_ewma_ss:
    - Description: This specifies the N in N_EWMA smoothing of RTT during
                   Slow Start.
    - Range: [2, INT32_MAX]
    - Default: 2
    - Tuning Values: [2,4]
    - Shadow Tuning Results:
           Setting this to 2 helped reduce overshoot during Slow Start.

  cc_rtt_reset_pct:
    - Description: Describes a percentile average between RTT_min and
                   RTT_current_ewma, for use to reset RTT_min, when the
                   congestion window hits cwnd_min.
    - Range: [0, 100]
    - Default: 100
    - Shadow Tuning Results:
           cwnd_min is not hit in Shadow simulations, but it can be hit
           on the live network while under DoS conditions, and with cheaters.

  cc_cwnd_inc:
    - Description: How much to increment the congestion window by during
                   steady state, every cwnd.
    - Range: [1, 1000]
    - Default: 31
    - Tuning Values: 25,50,100
    - Tuning Notes:
           We are unlikely to need to tune this much, but it might be worth
           trying a couple values.
    - Shadow Tuning Results:
           Increasing this negatively impacted performance. Keeping it at
           cc_sendme_inc is best.

  cc_cwnd_inc_pct_ss:
    - Description: Percentage of the current congestion window to increment
                   by during slow start, every cwnd.
    - Range: [1, 500]
    - Default: 50
    - Tuning Values: 50,100,200
    - Tuning Notes:
           On the current live network, the algorithms tended to exit slow
           start early, so we did not exercise this much. This may not be the
           case in Shadow, or once clients upgrade to the new algorithms.
    - Shadow Tuning Results:
           Setting this above 50 caused excessive queues to build up in
           Shadow. This may have been due to imbalances in Shadow client
           allocation, though. Values of 50-100 will be explored after
           examining Shadow Guard Relay Utilization.


6.5.2. Westwood parameters

  Westwood has runaway conditions. Because the congestion signal threshold of
  TOR_WESTWOOD is a function of RTT_max, excessive queuing can cause an
  increase in RTT_max. Additionally, if stream activity is constant, but of
  a lower bandwidth than the circuit, this will not drive the RTT upwards,
  and this can result in a congestion window that continues to increase in the
  absence of any other concurrent activity.

  For these reasons, we are unlikely to spend much time deeply investigating
  Westwood in Shadow, beyond a simulation or two to check these behaviors.

  cc_westwood_rtt_thresh:
    - Description:
              Specifies the cutoff for BOOTLEG_RTT_TOR to deliver
              congestion signal, as fixed point representation
              divided by 1000.
    - Range: [1, 1000]
    - Default: 33
    - Tuning Values: [20, 33, 40, 50]
    - Tuning Notes:
          The Defenestrator paper set this at 23, but did not justify it. We
          may need to raise it to compete with current fixed window SENDME.

  cc_westwood_cwnd_m:
    - Description: Specifies how much to reduce the congestion
                   window after a congestion signal, as a fraction of
                   100.
    - Range: [0, 100]
    - Default: 75
    - Tuning Values: [50, 66, 75]
    - Tuning Notes:
           Congestion control theory started out using 50 here, and then
           decided 70-75 was better.

  cc_westwood_min_backoff:
    - Description: If 1, take the min of BDP estimate and westwood backoff.
                   If 0, take the max of BDP estimate and westwood backoff.
    - Range: [0, 1]
    - Default: 0
    - Tuning Notes:
           This parameter can make the westwood backoff less aggressive, if
           need be. We're unlikely to need it, though.

  cc_westwood_rtt_m:
    - Description: Specifies a backoff percent of RTT_max, upon receipt of
                   a congestion signal.
    - Range: [50, 100]
    - Default: 100
    - Tuning Notes:
           Westwood technically has a runaway condition where congestion can
           cause RTT_max to grow, which increases the congestion threshold.
           This has not yet been observed, but because it is possible, we
           include this parameter.

6.5.3. Vegas Parameters

  cc_vegas_alpha_{exit,onion,sbws}:
  cc_vegas_beta_{exit,onion,sbws}:
  cc_vegas_gamma_{exit,onion,sbws}:
  cc_vegas_delta_{exit,onion,sbws}:
    - Description: These parameters govern the number of cells
                   that [TOR_VEGAS] can detect in queue before reacting.
    - Range: [0, 1000] (except delta, which has max of INT32_MAX)
    - Defaults:
           # OUTBUF_CELLS=62
           cc_vegas_alpha_exit (3*OUTBUF_CELLS)
           cc_vegas_beta_exit (4*OUTBUF_CELLS)
           cc_vegas_gamma_exit (3*OUTBUF_CELLS)
           cc_vegas_delta_exit (5*OUTBUF_CELLS)
           cc_vegas_alpha_onion (3*OUTBUF_CELLS)
           cc_vegas_beta_onion (6*OUTBUF_CELLS)
           cc_vegas_gamma_onion (4*OUTBUF_CELLS)
           cc_vegas_delta_onion (7*OUTBUF_CELLS)
    - Tuning Notes:
           The amount of queued cells that Vegas should tolerate is heavily
           dependent upon competing congestion control algorithms. The specified
           defaults are necessary to compete against current fixed SENDME traffic,
           but are much larger than necessary otherwise. These values also
           need a large-ish range between alpha and beta, to allow some degree of
           variance in traffic, as per [33]. The tuning of these parameters
           happened in two tickets[34,35]. The onion service parameters were
           set on the basis that they should let the queue grow until it
           causes as much queue delay as on Exit circuits, but tolerate up to
           6 hops of outbuf delay.
           Lack of visibility into onion service congestion window on the live
           network prevented confirming this.
    - Shadow Tuning Results:
           We found that the best approach for 3-hop Exit circuits was to set
           alpha and gamma to the size of the outbufs times the number of
           hops. Beta is set to one TLS record/sendme_inc above this value.
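
For reference, the defaults above expand to the following cell counts. This
is only the arithmetic from the Defaults list with OUTBUF_CELLS=62; the
vegas_defaults() helper is hypothetical, and the sbws values are omitted
because they are not listed in this section.

```python
OUTBUF_CELLS = 62

# Multipliers copied from the Defaults list above (sbws is not listed there).
MULTIPLIERS = {
    "exit":  {"alpha": 3, "beta": 4, "gamma": 3, "delta": 5},
    "onion": {"alpha": 3, "beta": 6, "gamma": 4, "delta": 7},
}


def vegas_defaults(kind: str) -> dict:
    """Expand the per-kind multipliers into consensus parameter values."""
    return {f"cc_vegas_{name}_{kind}": mult * OUTBUF_CELLS
            for name, mult in MULTIPLIERS[kind].items()}


for kind in MULTIPLIERS:
    print(vegas_defaults(kind))
```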

  cc_sscap_{exit,onion,sbws}:
    - Description: These parameters describe the RFC3742 'cap', after which
         congestion window increments are reduced. INT32_MAX disables
         RFC3742.
    - Range: [100, INT32_MAX]
    - Defaults:
         sbws: 400
         exit: 600
         onion: 475
    - Shadow Tuning Results:
         We picked these defaults based on the average congestion window
         seen in Shadow sims for exits and onion service circuits.
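
The effect of the cap can be sketched using the limited slow start rule from
RFC3742, where K = int(cwnd/(0.5*max_ssthresh)) and the increment becomes
int(MSS/K). This sketch assumes that 'cc_sscap_*' plays the role of
max_ssthresh and that the base increment (here one 'cc_sendme_inc' of cells)
plays the role of MSS. The floor of 1 cell and the function name are also
assumptions.

```python
INT32_MAX = 2**31 - 1


def limited_slow_start_inc(cwnd: int, sscap: int, base_inc: int = 31) -> int:
    """Return the slow start increment, reduced once cwnd exceeds the cap."""
    if cwnd <= sscap:
        return base_inc           # below the cap: normal slow start growth
    k = (2 * cwnd) // sscap       # K = int(cwnd / (0.5 * max_ssthresh))
    return max(base_inc // k, 1)  # int(MSS / K), never below 1 (assumed)


for cwnd in (600, 601, 900, 1200):
    print(cwnd, limited_slow_start_inc(cwnd, sscap=600))
```

Setting the cap to INT32_MAX means the first branch is always taken, which
matches the note above that INT32_MAX disables RFC3742.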

   cc_ss_max:
      - Description: This parameter provides a hard-max on the congestion
        window in slow start.
      - Range: [500, INT32_MAX]
      - Default: 5000
      - Shadow Tuning Results:
         The largest congestion window seen in Shadow is ~3000, so this was
         set as a safety valve above that.

   cc_cwnd_full_gap:
      - Description: This parameter defines the integer number of
        'cc_sendme_inc' multiples of gap allowed between inflight and
        cwnd, to still declare the cwnd full.
      - Range: [0, INT16_MAX]
      - Default: 4
      - Shadow Tuning Results:
        Low values resulted in a slight loss of performance, and increased
        variance in throughput. Setting this at 4 seemed to achieve a good
        balance between throughput and queue overshoot.

   cc_cwnd_full_minpct:
      - Description: This parameter defines a low watermark in percent. If
        inflight falls below this percent of cwnd, the congestion window
        is immediately declared non-full.
      - Range: [0, 100]
      - Default: 25

   cc_cwnd_full_per_cwnd:
      - Description: This parameter governs how often a cwnd must be
        full, in order to allow congestion window increase. If it is 1,
        then the cwnd only needs to be full once per cwnd worth of acks.
        If it is 0, then it must be full once every cwnd update (ie:
        every SENDME).
      - Range: [0, 1]
      - Default: 1
      - Shadow Tuning Results:
        A value of 0 resulted in a slight loss of performance, and increased
        variance in throughput. The optimal number here likely depends on
        edgeconn inbuf size, edgeconn kernel buffer size, and eventloop
        behavior.
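
The 'cc_cwnd_full_gap' and 'cc_cwnd_full_minpct' checks can be sketched as
two comparisons. The exact comparison operators are an assumption based on
the descriptions above, and the function names are hypothetical.

```python
def cwnd_is_full(inflight: int, cwnd: int,
                 cc_sendme_inc: int = 31, cc_cwnd_full_gap: int = 4) -> bool:
    """Full if inflight is within gap*sendme_inc cells of the cwnd."""
    return inflight + cc_cwnd_full_gap * cc_sendme_inc >= cwnd


def cwnd_is_nonfull(inflight: int, cwnd: int,
                    cc_cwnd_full_minpct: int = 25) -> bool:
    """Non-full if inflight drops below the low watermark percentage."""
    return 100 * inflight < cc_cwnd_full_minpct * cwnd


# With cwnd=400 and the defaults, the gap is 4*31 = 124 cells.
print(cwnd_is_full(276, 400), cwnd_is_full(275, 400))
print(cwnd_is_nonfull(99, 400), cwnd_is_nonfull(100, 400))
```

Between the two thresholds neither function fires, so in this sketch the
previous full or non-full state would simply be retained.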

6.5.4. NOLA Parameters

  cc_nola_overshoot:
    - Description: The number of cells to add to the BDP estimator to obtain
                   the NOLA cwnd.
    - Range: [0, 1000]
    - Default: 100
    - Tuning Values: 0, 50, 100, 150, 200
    - Tuning Notes:
            In order to compete against current fixed sendme, and to ensure
            that the congestion window has an opportunity to grow, we must
            set the cwnd above the current BDP estimate. How much above will
            be a function of competing traffic. It may also turn out that
            absent any more aggressive competition, we do not need to overshoot
            the BDP estimate.

6.5.5. Flow Control Parameters

  As with previous sections, the parameters in this section are sorted with
  the parameters that are most important to tune, first.

  These parameters have been tuned using onion services. The defaults are
  believed to be good.

  cc_xoff_client
  cc_xoff_exit
    - Description: Specifies the outbuf length, in relay cell multiples,
                   before we send an XOFF.
    - Range: [1, 10000]
    - Default: 500
    - Tuning Values: [500, 1000]
    - Tuning Notes:
        This threshold plus the sender's cwnd must be greater than the
        cc_xon_rate value, or a rate cannot be computed. Unfortunately,
        unless it is sent, the receiver does not know the cwnd. Therefore,
        this value should always be higher than cc_xon_rate minus
        'cc_cwnd_min' (100) minus the xon threshold value (0).

  cc_xon_rate
    - Description: Specifies how many full packed cells of bytes must arrive
                   before we can compute a rate, as well as how often we can
                   send XONs.
    - Range: [1, 5000]
    - Default: 500
    - Tuning Values: [500, 1000]
    - Tuning Notes:
        Setting this high will prevent excessive XONs, as well as reduce
        side channel potential, but it will delay response to queuing,
        and will hinder our ability to detect rate changes. However, low
        values will also reduce our ability to accurately measure drain
        rate. This value should always be lower than 'cc_xoff_*' +
        'cc_cwnd_min', so that a rate can be computed solely from the outbuf
        plus inflight data.
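
The constraint between 'cc_xon_rate', 'cc_xoff_*', and 'cc_cwnd_min' can be
checked with a one-line comparison. The xon_rate_is_computable() helper is
hypothetical, and the values passed in are only examples; all three are
counted in cells.

```python
def xon_rate_is_computable(cc_xon_rate: int, cc_xoff: int,
                           cc_cwnd_min: int) -> bool:
    """True if a drain rate can be computed from outbuf plus inflight data."""
    return cc_xon_rate < cc_xoff + cc_cwnd_min


# Defaults from this section: cc_xon_rate=500, cc_xoff_*=500, cc_cwnd_min=31.
print(xon_rate_is_computable(500, 500, 31))
# Raising cc_xon_rate to 1000 without raising cc_xoff_* breaks the constraint.
print(xon_rate_is_computable(1000, 500, 31))
```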

  cc_xon_change_pct
    - Description: Specifies how much the edge drain rate can change before
                   we send another advisory cell.
    - Range: [1, 99]
    - Default: 25
    - Tuning Values: [25, 50, 75]
    - Tuning Notes:
        Sending advisory updates due to a rate change may help us avoid
        hitting the XOFF limit, but it may also not help much unless we
        are already above the advise limit.

  cc_xon_ewma_cnt
    - Description: Specifies the N in the N_EWMA of rates.
    - Range: [2, 100]
    - Default: 2
    - Tuning Values: [2, 3, 5]
    - Tuning Notes:
        Setting this higher will smooth over changes in the rate field,
        and thus avoid XONs, but will reduce our reactivity to rate changes.


6.5.6. External Performance Parameters to Tune

  The following parameters are from other areas of Tor, but tuning them
  will improve congestion control performance. They are again sorted
  by most important to tune, first.

  cbtquantile
    - Description: Specifies the percentage cutoff for the circuit build
                   timeout mechanism.
    - Range: [60, 80]
    - Default: 80
    - Tuning Values: [70, 75, 80]
    - Tuning Notes:
       The circuit build timeout code causes Tor to use only the fastest
       'cbtquantile' percentage of paths to build through the network.
       Lowering this value will help avoid congested relays, and improve
       latency.

  CircuitPriorityHalflifeMsec
    - Description: The CircEWMA half-life specifies the time period after
                   which the cell count on a circuit is halved. This allows
                   circuits to regain their priority if they stop being bursty.
    - Range: [1, INT32_MAX]
    - Default: 30000
    - Tuning Values: [5000, 15000, 30000, 60000]
    - Tuning Notes:
       When we last tuned this, it was before KIST[31], so previous values may
       have little relevance to today. According to the CircEWMA paper[30], values
       that are too small will fail to differentiate bulk circuits from interactive
       ones, and values that are too large will allow new bulk circuits to keep
       priority over interactive circuits for too long. The paper does say
       that the system was not overly sensitive to specific values, though.
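
The half-life behavior described above can be illustrated with a continuous
exponential decay. This is a sketch of the general idea only: the real
CircEWMA code may apply the decay in discrete ticks, and the
decayed_cell_count() helper is hypothetical.

```python
def decayed_cell_count(count: float, elapsed_msec: float,
                       halflife_msec: float = 30000) -> float:
    """Halve the effective cell count once per half-life of elapsed time."""
    return count * 0.5 ** (elapsed_msec / halflife_msec)


# A burst of 1000 cells fades from the circuit's priority score over time.
for elapsed in (0, 30000, 60000):
    print(elapsed, decayed_cell_count(1000, elapsed))
```

A shorter half-life lets a formerly bursty circuit regain priority sooner,
which is the tradeoff described in the tuning notes.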

  CircuitPriorityTickSecs
    - Description: This specifies how often in seconds we readjust circuit
                   priority based on their EWMA.
    - Range: [1, 600]
    - Default: 10
    - Tuning Values: [1, 5, 10]
    - Tuning Notes:
        Even less is known about the optimal value for this parameter. At a
        guess, it should be more often than the half-life. Changing it also
        influences the half-life decay, though, at least according to the
        CircEWMA paper[30].

  KISTSchedRunInterval
    - If 0, KIST is disabled. (We should also test KIST disabled)


6.5.7. External Memory Reduction Parameters to Tune

  The following parameters are from other areas of Tor, but tuning them
  will reduce memory utilization in relays. They are again sorted by most
  important to tune, first.

  circ_max_cell_queue_size
    - Description: Specifies the maximum number of cells that are allowed
                   to accumulate in a relay queue before the circuit is closed.
    - Range: [1000, INT32_MAX]
    - Default: 50000
    - Tuning Values: [1000, 2500, 5000]
    - Tuning Notes:
       Once all clients have upgraded to congestion control, relay circuit
       queues should be minimized. We should minimize this value, as any
       large amount of queueing likely indicates a violation of the algorithm.

  cellq_low
  cellq_high
    - Description: Specifies the number of cells that can build up in
                   a circuit's queue for delivery onto a channel (from edges)
                   before we either block or unblock reading from streams
                   attached to that circuit.
    - Range: [1, 1000]
    - Default: low=10, high=256
    - Tuning Values: low=[0, 2, 4, 8]; high=[16, 32, 64]
    - Tuning Notes:
        When data arrives from edges into Tor, it gets packaged up into cells
        and then delivered to the cell queue, and from there is dequeued and
        sent on a channel. If the channel has blocked (see below params), then
        this queue grows until the high watermark, at which point Tor stops
        reading on all edges associated with a circuit, and a congestion
        signal is delivered to that circuit. At 256 cells, this is ~130k of
        data for *every* circuit, which is far more than Tor can write in a
        channel outbuf. Lowering this will reduce latency, reduce memory
        usage, and improve responsiveness to congestion. However, if it is
        too low, we may incur additional mainloop invocations, which are
        expensive. We will need to trace or monitor epoll() invocations in
        Shadow or on a Tor exit to verify that low values do not lead to
        more mainloop invocations.
    - Shadow Tuning Results:
        After extensive tuning, it turned out that the defaults were optimal
        in terms of throughput.
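    The "~130k" figure in the tuning notes can be reproduced with simple
    arithmetic. The sketch below assumes 514-byte cells on the wire and
    reads the "32k" orconn high watermark as 32*1024 bytes; both are
    assumptions made for illustration.

```python
CELL_LEN = 514                   # assumed bytes per cell on the wire
ORCONN_HIGH_BYTES = 32 * 1024    # assumed meaning of the "32k" default


def queue_bytes(cells: int, cell_len: int = CELL_LEN) -> int:
    """Bytes held by a circuit queue that contains `cells` cells."""
    return cells * cell_len


def cells_fitting(buffer_bytes: int, cell_len: int = CELL_LEN) -> int:
    """Whole cells that fit into a buffer of `buffer_bytes` bytes."""
    return buffer_bytes // cell_len


if __name__ == "__main__":
    for high in (256, 64, 32, 16):
        print(f"cellq_high={high}: {queue_bytes(high)} bytes per circuit")
    print("cells per orconn outbuf:", cells_fitting(ORCONN_HIGH_BYTES))
```

    Under these assumptions a single circuit at the default high watermark
    can hold roughly four times what one orconn outbuf accepts.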

  orconn_high
  orconn_low
    - Description: Specifies the number of bytes that can be held in an
                   orconn's outbuf before we block or unblock the orconn.
    - Range: [509, INT32_MAX]
    - Default: low=16k, high=32k
    - Tuning Notes:
        When the orconn's outbuf is above the high watermark, cells begin
        to accumulate in the cell queue as opposed to being added to the
        outbuf. It may make sense to lower this to be more in-line with the
        cellq values above. Also note that the low watermark is only used by
        the vanilla scheduler, so tuning it may be relevant when we test with
        KIST disabled. Just like the cell queue, if this is set lower, congestion
        signals will arrive sooner to congestion control when orconns become
        blocked, and less memory will occupy queues. It will also reduce latency.
        Note that if this is too low, we may not fill TLS records, and we may
        incur excessive epoll()/mainloop invocations. Tuning this is likely
        less beneficial than tuning the above cell_queue, unless KIST is
        disabled.

  MaxMemInQueues
    - Should be possible to set much lower, similarly to help with
      OOM conditions due to protocol violation. Unfortunately, this
      is just a torrc option, and a bad one at that.


7. Protocol format specifications [PROTOCOL_SPEC]

   TODO: This section needs details once we close out other TODOs above.

7.1. Circuit window handshake format

   TODO: We need to specify a way to communicate the currently seen
         cc_sendme_inc consensus parameter to the other endpoint,
         due to consensus sync delay. Probably during the CREATE
         onionskin (and RELAY_COMMAND_EXTEND).
   TODO: We probably want stricter rules on the range of values
         for the per-circuit negotiation - something like
         it has to be between [cc_sendme_inc/2, 2*cc_sendme_inc].
         That way, we can limit weird per-circuit values, but still
         allow us to change the consensus value in increments.

7.2. XON/XOFF relay cell formats

   TODO: We need to specify XON/XOFF for flow control. This should be
         simple.
   TODO: We should also allow it to carry stream data, as in Prop 325.

7.3. Onion Service formats

   TODO: We need to specify how to signal support for congestion control
         in an onion service, to both the intropoint and to clients.

7.4. Protocol Version format

   TODO: We need to pick a protover to signal Exit and Intropoint
         congestion control support.

7.5. SENDME relay cell format

   TODO: We need to specify how to add stream data to a SENDME as an
         optimization.

7.6. Extrainfo descriptor formats

   TODO: We will want to gather information on circuitmux and other
         relay queues, as well as XON/XOFF rates, and edge connection
         queue lengths at exits.


8. Security Analysis [SECURITY_ANALYSIS]

The security risks of congestion control come in three forms: DoS
attacks, fairness abuse, and side channel risk.

8.1. DoS Attacks (aka Adversarial Buffer Bloat)

The most serious risk of eliminating our current window cap is that
endpoints can abuse this situation to create huge queues and thus DoS
Tor relays.

This form of attack was already studied against the Tor network in the
Sniper attack:
  https://www.freehaven.net/anonbib/cache/sniper14.pdf

We had two fixes for this. First, we implemented a circuit-level OOM
killer that closed circuits whose queues became too big, before the
relay OOMed and crashed.

Second, we implemented authenticated SENDMEs, so clients could not
artificially increase their window sizes with honest exits:
  https://gitweb.torproject.org/torspec.git/tree/proposals/289-authenticated-sendmes.txt

We can continue this kind of enforcement by having Exit relays ensure that
clients are not transmitting SENDMEs too often, and do not appear to be
inflating their send windows beyond what the Exit expects by calculating a
similar estimated receive window. Note that such an estimate may have error
and may become negative if the estimate is jittery.

Unfortunately, authenticated SENDMEs do *not* prevent the same attack
from being done by rogue exits, or rogue onion services. For that, we
rely solely on the circuit OOM killer. During our experimentation, we
must ensure that the circuit OOM killer works properly to close circuits
in these scenarios.

But in any case, it is important to note that we are not any worse off
with congestion control than we were before, with respect to these kinds
of DoS attacks. In fact, the deployment of congestion control by honest
clients should reduce queue use and overall memory use in relays,
allowing them to be more resilient to OOM attacks than before.

8.2. Congestion Control Fairness Abuse (aka Cheating)

On the Internet, significant research and engineering effort has been
devoted to ensuring that congestion control algorithms are "fair" in
that each connection receives equal throughput. This fairness is
provided both via the congestion control algorithm, as well as via queue
management algorithms at Internet routers.

One of the most unfortunate early results was that TCP Vegas, despite
being near-optimal at minimizing queue lengths at routers, was easily
out-performed by more aggressive algorithms that tolerated larger queue
delay (such as TCP Reno).

Note that because the most common direction of traffic for Tor is from
Exit to client, unless Exits are malicious, we do not need to worry
about rogue algorithms as much, but we should still examine them in our
experiments because of the possibility of malicious Exits, as well as
malicious onion services.

Queue management can help further mitigate this risk, too. When RTT is
used as a congestion signal, our current Circuit-EWMA queue management
algorithm is likely sufficient for this. Because Circuit-EWMA will add
additional delay to loud circuits, "cheaters" who use alternate
congestion control algorithms to inflate their congestion windows should
end up with more RTT congestion signals than those who do not, and the
Circuit-EWMA scheduler will also relay fewer of their cells per time
interval.

In this sense, we do not need to worry about fairness and cheating as a
security property, but a lack of fairness in the congestion control
algorithm *will* increase memory use in relays to queue these
unfair/loud circuits, perhaps enough to trigger the OOM killer. So we
should still be mindful of these properties in selecting our congestion
control algorithm, to minimize relay memory use, if nothing else.

These two properties (honest Exits and Circuit-EWMA) may even be enough
to make it possible to use [TOR_VEGAS] even in the presence of other
algorithms, which would be a huge win in terms of memory savings as well
as vastly reduced queue delay. We must verify this experimentally,
though.

8.3. Side Channel Risks

Vastly reduced queue delay and predictable amounts of congestion on the
Tor network may make certain forms of traffic analysis easier.
Additionally, the ability to measure RTT and have it be stable due to
minimal network congestion may make geographical inference attacks
easier:
  https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf
  https://www.robgjansen.com/publications/howlow-pets2013.pdf

It is an open question whether these risks are serious enough to
warrant eliminating the ability to measure RTT at the protocol level and
abandoning it as a congestion signal, in favor of other approaches
(which have their own side channel risks). It will be difficult to
comprehensively eliminate RTT measurements, too.

On the plus side, Conflux traffic splitting (which is made easy once
congestion control is implemented) does show promise as providing
defense against traffic analysis:
  https://www.comsys.rwth-aachen.de/fileadmin/papers/2019/2019-delacadena-splitting-defense.pdf

There is also literature on shaping circuit bandwidth to create a side
channel. This can be done regardless of the use of congestion control,
and is not an argument against using congestion control. In fact, the
Backlit defense may be an argument in favor of endpoints monitoring
circuit bandwidth and latency more closely, as a defense:
  https://www.freehaven.net/anonbib/cache/ndss09-rainbow.pdf
  https://www.freehaven.net/anonbib/cache/ndss11-swirl.pdf
  https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

Finally, recall that we are considering ideas/xxx-backward-ecn.txt
[BACKWARD_ECN] to use a circuit-level cell_t.command to signal
congestion.  This allows all relays in the path to signal congestion in
under RTT/2 in either direction, and it can be flipped on existing relay
cells already in transit, without introducing any overhead.  However,
because cell_t.command is visible and malleable to all relays, it can
also be used as a side channel. So we must limit its use to a couple of
cells per circuit, at most.
  https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack


9. Onion Service Negotiation [ONION_NEGOTIATION]

Onion services require us to advertise the protocol version and congestion
control parameters in a different way, since the endpoints do not know each
other in the way that a client knows all the relays and what they support.
Additionally, we cannot use ntorv3 for onion service negotiation, because it
is not supported at all rendezvous and introduction points.

To address this, negotiation is done in two parts. First, the service needs to
advertise to the world that it supports congestion control, and its view of
the current cc_sendme_inc consensus parameter. This is done through a new
line in the onion service descriptor, see section 9.1 below.

Second, the client needs to inform the service that it wants to use congestion
control on the rendezvous circuit. This is done through the INTRODUCE cell as
an extension, see section 9.2 below.

9.1. Onion Service Descriptor

We propose to add a new line to advertise the flow control protocol version,
in the encrypted section of the onion service descriptor:

  "flow-control" SP version-range SP sendme-inc NL

    The "version-range" value is the same as the "FlowCtrl" protocol version
    that relays advertise, which is defined earlier in this proposal. The
    current value is "1-2".

    The "sendme-inc" value comes from the service's current cc_sendme_inc
    consensus parameter.

Clients MUST ignore additional unknown versions in "version-range", and MUST
ignore any additional values on this line.

Clients SHOULD use the highest value in "version-range" to govern their
protocol choice for "FlowCtrl" and INTRODUCE cell format, as per Section 9.2
below.

If clients do not support any of the versions in "version-range", they SHOULD
reject the descriptor. (They MAY choose to ignore this line instead, but doing
so means using the old fixed-window SENDME flow control, which will likely be
bad for the network).

Clients that are able to parse this line and know the protocol version
MUST validate that the "sendme-inc" value is within a multiple of 2 of the
"cc_sendme_inc" in the consensus that they see. If "sendme-inc" is not within
range, they MUST reject the descriptor.

If their consensus also lists a non-zero "cc_alg", they MAY then send the
congestion control request extension field in the INTRODUCE1 cell, which is
detailed in the next section.

A service should only advertise its flow control version if congestion control is
enabled. It MUST remove this line if congestion control is disabled.

If the service observes a change in 'cc_sendme_inc' consensus parameter since
it last published its descriptor, it MUST immediately close its introduction
points, and publish a new descriptor with the new "sendme-inc" value. The
additional step of closing the introduction points ensures that no clients
arrive using a cached descriptor, with the old "sendme-inc" value.
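
The client-side handling of the "flow-control" line described in this section
might be sketched as follows. The function names are hypothetical. Two
readings are assumptions: that "version-range" uses protover-style syntax
(comma-separated entries, each a single version or a "low-high" range), and
that "within a multiple of 2" means the interval [cc_sendme_inc/2,
2*cc_sendme_inc].

```python
from typing import Set, Tuple


def parse_version_range(text: str) -> Set[int]:
    """Expand a range such as "1-2" or "1,3-4" into a set of versions."""
    versions: Set[int] = set()
    for part in text.split(","):
        low, sep, high = part.partition("-")
        start = int(low)
        end = int(high) if sep else start
        if end < start:
            raise ValueError("malformed version range")
        versions.update(range(start, end + 1))
    return versions


def parse_flow_control_line(line: str) -> Tuple[Set[int], int]:
    """Return (versions, sendme_inc) from a "flow-control" descriptor line."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "flow-control":
        raise ValueError("not a flow-control line")
    # Any additional values on the line are ignored, as the text requires.
    return parse_version_range(fields[1]), int(fields[2])


def sendme_inc_in_range(sendme_inc: int, consensus_inc: int) -> bool:
    """Assumed reading: consensus_inc/2 <= sendme_inc <= 2*consensus_inc."""
    return consensus_inc <= 2 * sendme_inc and sendme_inc <= 2 * consensus_inc


if __name__ == "__main__":
    versions, inc = parse_flow_control_line("flow-control 1-2 31")
    print(sorted(versions), inc, sendme_inc_in_range(inc, 31))
```

A client that gets False from the range check would reject the descriptor.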

9.2. INTRODUCE cell extension

We propose a new extension to the INTRODUCE cell which can be used to send
congestion control parameters down to the service. It is important to mention
that this extension is to be used in the encrypted section of the cell, and
not in the section readable by the introduction point.

If used, it needs to be encoded within the ENCRYPTED section of the INTRODUCE1
cell defined in rend-spec-v3.txt section 3.3. The content is defined as follows:

  EXT_FIELD_TYPE:

    [01] -- Congestion Control Request.

This field has zero payload length. Its presence signifies that the client
wants to use congestion control. The client MUST NOT set this field, or use
ntorv3, if the service did not list "2" in the "version-range" of the
"flow-control" line in the descriptor. The client SHOULD NOT provide this
field if the consensus parameter 'cc_alg' is 0.

The service MUST ignore any unknown fields.

9.3. Protocol Flow

First, the client reads the "flow-control" line in the descriptor and selects
the highest version that appears both in that line's "version-range" and among
the versions that the client itself supports. As an example, if the client
supports 2-3-4 and the service supports 2-3, then 3 is chosen.
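
The selection in the example amounts to taking the highest version common to
both sides. A minimal sketch, with a hypothetical function name:

```python
from typing import Iterable, Optional


def choose_version(client_versions: Iterable[int],
                   service_versions: Iterable[int]) -> Optional[int]:
    """Return the highest version supported by both sides, or None."""
    common = set(client_versions) & set(service_versions)
    return max(common) if common else None


if __name__ == "__main__":
    # The example from the text: client supports 2-3-4, service supports 2-3.
    print(choose_version({2, 3, 4}, {2, 3}))
```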

It then sends that value along with its desired cc_sendme_inc value in the
INTRODUCE1 cell in the extension.

The service will then validate that it does support version 3 and that the
parameter cc_sendme_inc is within range of the protocol. Congestion control is
then applied to the rendezvous circuit.

9.4. Circuit Behavior

If the extension is not found in the cell, the service MUST NOT use congestion
control on the rendezvous circuit.

Any invalid values received in the extension should result in closing the
introduction circuit and thus not continuing the rendezvous process. A value
is invalid if it is either unsupported or outside the defined range.

9.5. Security Considerations

Advertising a new line in a descriptor does leak that a service is running at
least a certain tor version. We believe that this is an acceptable risk in
order for services to be able to take advantage of congestion control. Once a
new tor stable is released, we hope that most services will upgrade, so that
everyone looks the same again.

The new extension is located in the encrypted part of the INTRODUCE1 cell and
thus the introduction point can't learn its content.


10. Exit negotiation [EXIT_NEGOTIATION]

Similar to onion services, clients and exits will need to negotiate the
decision to use congestion control, as well as a common value for
'cc_sendme_inc', for a given circuit.

10.1. When to negotiate

Clients decide to initiate a negotiation attempt for a circuit if the
consensus lists a non-zero 'cc_alg' parameter value, and the protover line
for their chosen exit includes a value of 2 in the "FlowCtrl" field.

If the FlowCtrl=2 subprotocol is absent, a client MUST NOT attempt negotiation.

If 'cc_alg' is absent or zero, a client SHOULD NOT attempt
negotiation, or use ntorv3.

If the protover and consensus conditions are met, clients SHOULD negotiate
with the Exit if the circuit is to be used for exit stream activity. Clients
SHOULD NOT negotiate congestion control for one-hop circuits, or internal
circuits.

10.2. What to negotiate

Clients and exits need not agree on a specific congestion control algorithm,
or any aspects of its behavior. Each endpoint's management of its congestion
window is independent. However, because the new algorithms no longer use
stream SENDMEs or fixed window sizes, they cannot be used with an endpoint
expecting the old behavior.

Additionally, both endpoints must agree on the SENDME increment rate, in
order to synchronize SENDME authentication and pacing.

For this reason, negotiation needs to establish a boolean: "use congestion
control", and an integer value for SENDME increment.

No other parameters need to be negotiated.

10.3. How to negotiate

Negotiation is performed by sending an ntorv3 onionskin, as specified in
Proposal 332, to the Exit node. The encrypted payload contents from the
clients are encoded as an extension field, as in the onion service INTRO1
cell:

  EXT_FIELD_TYPE:

    [01] -- Congestion Control Request.

As in the INTRO1 extension field, this field has zero payload length.

Its presence signifies that the client wants to use congestion control.
Again, the client MUST NOT set this field, or use ntorv3, if this exit did not
list "2" in the "FlowCtrl" version line. The client SHOULD NOT set this field
if the consensus parameter 'cc_alg' is 0.

The Exit MUST ignore any additional unknown extension fields.

The server's encrypted ntorv3 reply payload is encoded as:

   EXT_FIELD_TYPE:

    [02] -- Congestion Control Response.

    If this flag is set, the extension should be used by the service to learn
    which congestion control parameters to use on the rendezvous circuit.

  EXT_FIELD content payload is a single byte:

    sendme_inc        [1 byte]

The Exit MUST provide its current view of 'cc_sendme_inc' in this payload if it
observes a non-zero 'cc_alg' consensus parameter. Exits SHOULD only include
this field once.

The client MUST use the FIRST such field value, and ignore any duplicate field
specifiers. The client MUST ignore any unknown additional fields.

10.5. Client checks

The client MUST reject any ntorv3 replies for non-ntorv3 onionskins.

The client MUST reject an ntorv3 reply with field EXT_FIELD_TYPE=02, if the
client did not include EXT_FIELD_TYPE=01 in its handshake.

The client SHOULD reject a sendme_inc field value that differs from the
current 'cc_sendme_inc' consensus parameter by more than +/- 1, in
either direction.

If a client rejects a handshake, it MUST close the circuit.
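
The client checks above, together with the "use the FIRST such field" rule,
might be sketched as follows. The function name, the representation of
extensions as (type, payload) pairs, and the use of ValueError to mean "close
the circuit" are all assumptions made for illustration. The length check on
the payload is likewise an assumption based on the single-byte sendme_inc
field.

```python
from typing import Iterable, Optional, Sequence, Tuple

EXT_CC_REQUEST = 0x01    # Congestion Control Request
EXT_CC_RESPONSE = 0x02   # Congestion Control Response


def check_reply(sent_ext_types: Iterable[int],
                reply_exts: Sequence[Tuple[int, bytes]],
                consensus_sendme_inc: int) -> Optional[int]:
    """Return the negotiated sendme_inc, or None if the reply has no response.

    Raises ValueError when the handshake must be rejected, in which case
    the caller closes the circuit.
    """
    sent = set(sent_ext_types)
    for ext_type, payload in reply_exts:
        if ext_type != EXT_CC_RESPONSE:
            continue  # unknown additional fields are ignored
        if EXT_CC_REQUEST not in sent:
            raise ValueError("response received without a request")
        if len(payload) != 1:
            raise ValueError("sendme_inc payload must be a single byte")
        sendme_inc = payload[0]
        if abs(sendme_inc - consensus_sendme_inc) > 1:
            raise ValueError("sendme_inc too far from consensus value")
        return sendme_inc  # first such field wins; duplicates are ignored
    return None


if __name__ == "__main__":
    reply = [(EXT_CC_RESPONSE, bytes([31])), (EXT_CC_RESPONSE, bytes([64]))]
    print(check_reply({EXT_CC_REQUEST}, reply, 31))
```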

10.6. Managing consensus updates

The pedantic reader will note that a rogue consensus can cause all clients
to decide to close circuits by changing 'cc_sendme_inc' by a large margin.

As a matter of policy, the directory authorities MUST NOT change
'cc_sendme_inc' by more than +/- 1.

In Shadow simulation, we found the optimal 'cc_sendme_inc' value to be ~31
cells, or one (1) TLS record worth of cells. We do not expect to change this
value significantly.


11. Acknowledgements

Immense thanks to Toke Høiland-Jørgensen for considerable input into all
aspects of the TCP congestion control background material for this proposal,
as well as review of our versions of the algorithms.


12. Glossary [GLOSSARY]

ACK - Acknowledgment. In congestion control, this is a type of packet that
signals that the endpoint received a packet or packet set. In Tor, ACKs are
called SENDMEs.

BDP - Bandwidth Delay Product. This is the quantity of bytes that are actively
in transit on a path at any given time. Typically, this does not count packets
waiting in queues. It is essentially RTT*BWE - queue_delay.

BWE - BandWidth Estimate. This is the estimated throughput on a path.

CWND - Congestion WiNDow. This is the total number of packets that are allowed
to be "outstanding" (aka not ACKed) on a path at any given time. An ideal
congestion control algorithm sets CWND=BDP.

EWMA - Exponential Weighted Moving Average. This is a mechanism for smoothing
out high-frequency changes in a value, due to temporary effects.

ICW - Initial Congestion Window. This is the initial value of the congestion
window at the start of a connection.

RTT - Round Trip Time. This is the time it takes for one endpoint to send a
packet to the other endpoint, and get a response.

SS - Slow Start. This is the initial phase of most congestion control
algorithms. Despite the name, it is an exponential growth phase, to quickly
increase the congestion window from the ICW value up to the path BDP. After
Slow Start, changes to the congestion window are linear.

XOFF - Transmitter Off. In flow control, XOFF means that the receiver is
receiving data too fast and is beginning to queue. It is sent to tell the
sender to stop sending.

XON - Transmitter On. In flow control, XON means that the receiver is ready to
receive more data. It is sent to tell the sender to resume sending.


13. [CITATIONS]

1. Options for Congestion Control in Tor-Like Networks.
   https://lists.torproject.org/pipermail/tor-dev/2020-January/014140.html

2. Towards Congestion Control Deployment in Tor-like Networks.
   https://lists.torproject.org/pipermail/tor-dev/2020-June/014343.html

3. DefenestraTor: Throwing out Windows in Tor.
   https://www.cypherpunks.ca/~iang/pubs/defenestrator.pdf

4. TCP Westwood: Bandwidth Estimation for Enhanced Transport over Wireless Links
   http://nrlweb.cs.ucla.edu/nrlweb/publication/download/99/2001-mobicom-0.pdf

5. Performance Evaluation and Comparison of Westwood+, New Reno, and Vegas TCP Congestion Control
   http://cpham.perso.univ-pau.fr/TCP/ccr_v31.pdf

6. Linux 2.4 Implementation of Westwood+ TCP with rate-halving
   https://c3lab.poliba.it/images/d/d7/Westwood_linux.pdf

7. TCP Westwood
   http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-westwood

8. TCP Vegas: New Techniques for Congestion Detection and Avoidance
   http://pages.cs.wisc.edu/~akella/CS740/F08/740-Papers/BOP94.pdf

9. Understanding TCP Vegas: A Duality Model
   ftp://ftp.cs.princeton.edu/techreports/2000/628.pdf

10. TCP Vegas
    http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-vegas

11. Controlling Queue Delay
    https://queue.acm.org/detail.cfm?id=2209336

12. Controlled Delay Active Queue Management
    https://tools.ietf.org/html/rfc8289

13. How Much Anonymity does Network Latency Leak?
    https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf

14. How Low Can You Go: Balancing Performance with Anonymity in Tor
    https://www.robgjansen.com/publications/howlow-pets2013.pdf

15. POSTER: Traffic Splitting to Counter Website Fingerprinting
    https://www.comsys.rwth-aachen.de/fileadmin/papers/2019/2019-delacadena-splitting-defense.pdf

16. RAINBOW: A Robust And Invisible Non-Blind Watermark for Network Flows
    https://www.freehaven.net/anonbib/cache/ndss09-rainbow.pdf

17. SWIRL: A Scalable Watermark to Detect Correlated Network Flows
    https://www.freehaven.net/anonbib/cache/ndss11-swirl.pdf

18. Exposing Invisible Timing-based Traffic Watermarks with BACKLIT
    https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

19. The Sniper Attack: Anonymously Deanonymizing and Disabling the Tor Network
    https://www.freehaven.net/anonbib/cache/sniper14.pdf

20. Authenticating sendme cells to mitigate bandwidth attacks
    https://gitweb.torproject.org/torspec.git/tree/proposals/289-authenticated-sendmes.txt

21. Tor security advisory: "relay early" traffic confirmation attack
    https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack

22. The Path Less Travelled: Overcoming Tor’s Bottlenecks with Traffic Splitting
    https://www.cypherpunks.ca/~iang/pubs/conflux-pets.pdf

23. Circuit Padding Developer Documentation
    https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md

24. Plans for Tor Live Network Performance Experiments
    https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/Sponsor61/PerformanceExperiments

25. Tor Performance Metrics for Live Network Tuning
    https://gitlab.torproject.org/legacy/trac/-/wikis/org/roadmaps/CoreTor/PerformanceMetrics

26. Bandwidth-Delay Product
    https://en.wikipedia.org/wiki/Bandwidth-delay_product

27. Exponentially Weighted Moving Average
    https://corporatefinanceinstitute.com/resources/knowledge/trading-investing/exponentially-weighted-moving-average-ewma/

28. Dropping on the Edge
    https://www.petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf

29. https://github.com/mikeperry-tor/vanguards/blob/master/README_TECHNICAL.md#the-bandguards-subsystem

30. An Improved Algorithm for Tor Circuit Scheduling.
    https://www.cypherpunks.ca/~iang/pubs/ewma-ccs.pdf

31. KIST: Kernel-Informed Socket Transport for Tor
    https://matt.traudt.xyz/static/papers/kist-tops2018.pdf

32. RFC3742 Limited Slow Start
    https://datatracker.ietf.org/doc/html/rfc3742#section-2

33. https://people.csail.mit.edu/venkatar/cc-starvation.pdf

34. https://gitlab.torproject.org/tpo/core/tor/-/issues/40642

35. https://gitlab.torproject.org/tpo/network-health/analysis/-/issues/49
Filename: 325-packed-relay-cells.md
Title: Packed relay cells: saving space on small commands
Author: Nick Mathewson
Created: 10 July 2020
Status: Obsolete

(Proposal superseded by proposal 340)

Introduction

In proposal 319 I suggested a way to fragment long commands across multiple RELAY cells. In this proposal, I suggest a new format for RELAY cells that can be used to pack multiple relay commands into a single cell.

Why would we want to do this? As we move towards improved congestion-control and flow-control algorithms, we might not want to use an entire 498-byte relay payload just to send a one-byte flow-control message.

We already have some cases where we'd benefit from this feature. For example, when we send SENDME messages, END cells, or BEGIN_DIR cells, most of the cell body is wasted with padding.

As a side benefit, packing cells in this way may make the job of the traffic analyst a little more tricky, as cell contents become less predictable.

The basic design

Let's use the term "Relay Message" to mean the kind of thing that a relay cell used to hold. Thus, this proposal is about packing multiple "Relay Messages" into a cell.

I'll use "Packed relay cell" to mean a relay cell in this new format, that supports multiple messages.

I'll use "client" to mean the initiator of a circuit, and "relay" to refer to the parties through whom a circuit is created. Note that each "relay" (as used here) may be the "client" on circuits of its own.

When a relay supports relay message packing, it advertises the fact using a new Relay protocol version. Clients must opt in to using this protocol version (see "Negotiation and Migration" section below) before they can send any packed relay cells, and before the relay will send them any packed relay cells.

When packed cells are in use, multiple cell messages can be concatenated in a single relay cell.

Packed Cell Format

To carry multiple commands within a single relay cell, the commands are concatenated one after another in the format below. The first command uses the same header format as a normal relay cell, as detailed in section 6.1 of tor-spec.txt:

Relay Command   [1 byte]
'Recognized'    [2 bytes]
StreamID        [2 bytes]
Digest          [4 bytes]
Length          [2 bytes]
Data            [Length bytes]
RELAY_MESSAGE
Padding         [up to end of cell]

The RELAY_MESSAGE can be empty (that is, zero bytes, indicating that there are no further messages) or set to the following:

Relay Command   [1 byte]
StreamID        [2 bytes]
Length          [2 bytes]
Data            [Length bytes]
RELAY_MESSAGE

Note that the Recognized and Digest fields are not added to the second and later relay messages; they are used once for the whole relay cell. Thus, how we encrypt/decrypt and recognize a cell is not changed: only the payload changes, to contain multiple messages.

The "Relay Command" byte "0" is now used to explicitly indicate "end of commands". If the byte "0" appears after a RELAY_MESSAGE, the rest of the cell MUST be ignored. (Note that this "end of commands" indicator may be absent if there are no bytes remaining after the last message in the cell.)
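
As an illustration of the layout above, the following sketch walks a decrypted relay cell body and splits it into messages. Field sizes are taken from the format description; network byte order, the function name, and the error handling are assumptions. The sketch does not enforce any restriction on which commands may be packed together.

```python
import struct
from typing import List, Tuple

# First message: command, 'Recognized', StreamID, Digest, Length.
FIRST_HDR = struct.Struct("!BHH4sH")
# Later messages: command, StreamID, Length (no Recognized or Digest).
NEXT_HDR = struct.Struct("!BHH")


def parse_packed_payload(payload: bytes) -> List[Tuple[int, int, bytes]]:
    """Return a list of (relay_command, stream_id, data) tuples."""
    if len(payload) < FIRST_HDR.size:
        raise ValueError("truncated relay header")
    cmd, _recognized, stream_id, _digest, length = FIRST_HDR.unpack_from(payload, 0)
    pos = FIRST_HDR.size
    messages = []
    while True:
        if pos + length > len(payload):
            raise ValueError("message data overruns the cell")
        messages.append((cmd, stream_id, payload[pos:pos + length]))
        pos += length
        # Stop at the end of the cell, or at the "end of commands" byte 0.
        if pos == len(payload) or payload[pos] == 0:
            return messages
        if pos + NEXT_HDR.size > len(payload):
            raise ValueError("truncated message header")
        cmd, stream_id, length = NEXT_HDR.unpack_from(payload, pos)
        pos += NEXT_HDR.size


if __name__ == "__main__":
    # A DATA message (command 2) on stream 5, then a SENDME (command 5).
    body = FIRST_HDR.pack(2, 0, 5, b"\x00" * 4, 3) + b"abc"
    body += NEXT_HDR.pack(5, 0, 0)
    body += b"\x00" * (509 - len(body))  # zero padding to a 509-byte body
    print(parse_packed_payload(body))
```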

Only some "Relay Command" values are supported for relay cell packing:

  • BEGIN_DIR
  • BEGIN
  • CONNECTED
  • DATA
  • DROP
  • END
  • PADDING_NEGOTIATED
  • PADDING_NEGOTIATE
  • SENDME

If any relay message with a relay command not listed above appears in a packed relay cell with another relay message, then the receiving party MUST tear down the circuit.

(Note that relay cell fragments (proposal 319) are not supported for packing.)

When generating RELAY cells, implementations SHOULD (as they do today) fill in the Padding field with four 0-valued bytes, followed by a sequence of random bytes up to the end of the cell. If there are fewer than 4 unused bytes at the end of the cell, those unused bytes should all be filled with 0-valued bytes.

Negotiation and migration

After receiving a packed relay cell, the relay knows that the client supports this proposal: Relays SHOULD send packed relay cells on any circuit on which they have received a packed relay cell. Relays MUST NOT send packed relay cells otherwise.

Clients, in turn, MAY send packed relay cells to any relay whose "Relay" subprotocol version indicates that it supports this protocol. To avoid fingerprinting, this client behavior should be controlled with a tristate (1/0/auto) torrc configuration value, with the default set to use a consensus parameter.

The parameter is:

"relay-cell-packing"

Boolean: if 1, clients should send packed relay cells.
(Min: 0, Max 1, Default: 0)

To handle migration, first the parameter should be set to 0 and the configuration setting should be "auto". To test the feature, individual clients can set the tristate to "1".

Once enough clients have support for the parameter, the parameter can be set to 1.

A new relay message format

(This section is optional and should be considered separately; we may decide it is too complex.)

Currently, every relay message uses 5 bytes of header to hold a relay command, a length field, and a stream ID. This is wasteful: the stream ID is often redundant, and the top 7 bits of the length field are always zero.

I propose a new relay message format, described here (with ux denoting an x-bit bitfield). This format is 2 bytes or 4 bytes, depending on its first bit.

struct relay_header {
   u1 stream_id_included; // Is the stream_id included?
   u6 relay_command; // as before
   u9 relay_data_len; // as before
   u8 optional_stream_id[]; // 0 bytes or two bytes.
}

Alternatively, you can view the first three fields as a 16-bit value, computed as:

(stream_id_included<<15) | (relay_command << 9) | (relay_data_len).

If the optional_stream_id field is not present, then the default value for the stream_id is computed as follows. We use stream_id 0 for any command that doesn't take a stream ID. For commands that do take a stream_id, we use whichever nonzero stream_id appeared most recently in the same cell.
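
The header layout can be exercised with a short Python sketch. It assumes that the 16-bit value and the optional stream ID are both serialized in network (big-endian) byte order, which the text above does not state; the function names are hypothetical, and the default-stream_id rule is not modelled.

```python
import struct


def encode_header(relay_command: int, data_len: int, stream_id=None) -> bytes:
    """Encode the proposed 2-byte or 4-byte relay message header."""
    if not 0 <= relay_command < (1 << 6):
        raise ValueError("relay_command must fit in 6 bits")
    if not 0 <= data_len < (1 << 9):
        raise ValueError("relay_data_len must fit in 9 bits")
    included = stream_id is not None
    word = (int(included) << 15) | (relay_command << 9) | data_len
    out = struct.pack(">H", word)  # big-endian is an assumption
    if included:
        out += struct.pack(">H", stream_id)
    return out


def decode_header(buf: bytes):
    """Return (relay_command, data_len, stream_id or None, header_len)."""
    (word,) = struct.unpack_from(">H", buf, 0)
    included = bool(word >> 15)
    relay_command = (word >> 9) & 0x3F
    data_len = word & 0x1FF
    if included:
        (stream_id,) = struct.unpack_from(">H", buf, 2)
        return relay_command, data_len, stream_id, 4
    return relay_command, data_len, None, 2
```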

This format limits the space of possible relay commands. That's probably okay: after 20 years of Tor development, we have defined 25 relay command values. But in case 2^6==64 commands will not be enough, we reserve command values 48 through 63 for future formats that need more command bits.

Filename: 326-tor-relay-well-known-uri-rfc8615.md
Title: The "tor-relay" Well-Known Resource Identifier
Author: nusenu
Created: 14 August 2020
Status: Open

The "tor-relay" Well-Known Resource Identifier

This is a specification for a well-known registry entry according to RFC8615.

This resource identifier can be used for serving and finding proofs related to Tor relay and bridge contact information. It can also be used for autodiscovery of Tor relays run by a given entity, if the entity's domain is known. It solves the issue that Tor relay/bridge contact information is a unidirectional and unverified claim by nature. This well-known URI aims to allow the verification of the unidirectional claim. It aims to reduce the risk of impersonation attacks, where a Tor relay/bridge claims to be operated by a certain entity, but actually isn't. The automated verification will also support the visualization of relay/bridge groups.

  • Initially, the (unverified) contact information of a Tor relay or bridge might claim to be related to an organization by pointing to its website: Tor relay/bridge contact information field -> website

  • The "tor-relay" URI allows for the verification of that claim by fetching the files containing Tor relay ID(s) or hashed bridge fingerprints under the specified URI, because attackers cannot easily place these files at the given location.

  • By publishing Tor relay IDs or hashed bridge IDs under this URI the website operator claims to be the responsible entity for these Tor relays/bridges. The verification of listed Tor relay/bridge IDs only succeeds if the claim can be verified bidirectionally (website -> relay/bridge and relay/bridge -> website).

  • This URI is not related to Tor onion services.

  • The URL MUST be HTTPS and use a valid TLS certificate from a generally trusted root CA. Plain HTTP MUST NOT be used.

  • The URL MUST be accessible by robots (no CAPTCHAs).

/.well-known/tor-relay/rsa-fingerprint.txt

  • The file contains one or more Tor relay RSA SHA1 fingerprints operated by the entity in control of this website.
  • Each line contains one relay fingerprint.
  • The file MUST NOT contain fingerprints of Tor bridges (or hashes of bridge fingerprints). For bridges see the file hashed-bridge-rsa-fingerprint.txt.
  • The file may contain comments (starting with #).
  • Non-comment lines must be exactly 40 characters long and consist of the following characters [a-fA-F0-9].
  • Fingerprints are not case-sensitive.
  • Each fingerprint MUST appear at most once.
  • The file MUST NOT be larger than one MByte.
  • The content MUST be a media type of "text/plain".

Example file content:

# we operate these Tor relays
A234567890123456789012345678901234567ABC
B234567890123456789012345678901234567890

The RSA SHA1 relay fingerprint can be found in the file named "fingerprint" located in the Tor data directory on the relay.
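
A verifier might check a fetched rsa-fingerprint.txt body against the rules above roughly as follows. This is a sketch under stated assumptions: "one MByte" is read as 10^6 bytes, blank lines are rejected because they are non-comment lines of the wrong length, and the function name is hypothetical.

```python
import re

MAX_FILE_SIZE = 1_000_000  # "one MByte", read here as 10**6 bytes (assumption)
HEX40 = re.compile(r"[a-fA-F0-9]{40}")


def parse_fingerprint_file(raw: bytes) -> list:
    """Return the upper-cased fingerprints listed in a fetched file body."""
    if len(raw) > MAX_FILE_SIZE:
        raise ValueError("file is larger than one MByte")
    seen = set()
    fingerprints = []
    for lineno, line in enumerate(raw.decode("ascii").splitlines(), start=1):
        if line.startswith("#"):
            continue  # comment line
        if not HEX40.fullmatch(line):
            raise ValueError(f"line {lineno}: expected 40 hex characters")
        fp = line.upper()  # fingerprints are not case-sensitive
        if fp in seen:
            raise ValueError(f"line {lineno}: duplicate fingerprint")
        seen.add(fp)
        fingerprints.append(fp)
    return fingerprints
```

The Content-Type check ("text/plain") and the HTTPS requirements belong to the fetching step and are not shown here.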

/.well-known/tor-relay/ed25519-master-pubkey.txt

  • The file contains one or more ed25519 Tor relay public master keys of relays operated by the entity in control of this website.
  • This file is not relevant for bridges.
  • Each line contains one public ed25519 master key in its base64 encoded form.
  • The file may contain comments (starting with #).
  • Non-comment lines must be exactly 43 characters long and consist of the following characters [a-zA-Z0-9/+].
  • Each key MUST appear at most once.
  • The file MUST NOT be larger than one MByte.
  • The content MUST be a media type of "text/plain".

Example file content:

# we operate these Tor relays
yp0fwtp4aa/VMyZJGz8vN7Km3zYet1YBZwqZEk1CwHI
kXdA5dmIhXblAquMx0M0ApWJJ4JGQGLsjUSn86cbIaU
bHzOT41w56KHh+w6TYwUhN4KrGwPWQWJX04/+tw/+RU

The base64 encoded ed25519 public master key can be found in the file named "fingerprint-ed25519" located in the Tor data directory on the relay.
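
For illustration, a single non-comment line of this file could be validated as below. The sketch assumes that the 43 characters are standard base64 with the trailing "=" removed, so that they decode to a 32-byte ed25519 key; the function name is hypothetical.

```python
import base64
import re

B64_43 = re.compile(r"[a-zA-Z0-9/+]{43}")


def parse_ed25519_line(line: str) -> bytes:
    """Validate one key line and return the decoded 32-byte public key."""
    if not B64_43.fullmatch(line):
        raise ValueError("expected exactly 43 base64 characters")
    # Restore the padding character that the unpadded form omits.
    key = base64.b64decode(line + "=")
    if len(key) != 32:
        raise ValueError("decoded key is not 32 bytes long")
    return key
```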

/.well-known/tor-relay/hashed-bridge-rsa-fingerprint.txt

  • The file contains one or more SHA1 hashed Tor bridge SHA1 fingerprints operated by the entity in control of this website.
  • Each line contains one hashed bridge fingerprint.
  • The file may contain comments (starting with #).
  • Non-comment lines must be exactly 40 characters long and consist of the following characters [a-fA-F0-9].
  • Hashed fingerprints are not case-sensitive.
  • Each hashed fingerprint MUST appear at most once.
  • The file MUST NOT be larger than one MByte.
  • The file MUST NOT contain fingerprints of Tor relays.
  • The content MUST be a media type of "text/plain".

Example file content:

# we operate these Tor bridges
1234567890123456789012345678901234567ABC
4234567890123456789012345678901234567890

The hashed Tor bridge fingerprint can be found in the file named "hashed-fingerprint" located in the Tor data directory on the bridge.
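
If only the plain bridge fingerprint is at hand, the hashed form can be derived as sketched below. This assumes that the hash is SHA-1 computed over the 20 binary bytes of the fingerprint (not over its hexadecimal text); the function name is hypothetical.

```python
import hashlib


def hashed_bridge_fingerprint(fingerprint_hex: str) -> str:
    """Return the SHA-1 hash of a bridge fingerprint as 40 upper-case hex characters."""
    raw = bytes.fromhex(fingerprint_hex)
    if len(raw) != 20:
        raise ValueError("a bridge fingerprint is 20 bytes (40 hex characters)")
    # Assumption: the digest covers the binary fingerprint bytes.
    return hashlib.sha1(raw).hexdigest().upper()
```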

Change Controller

Tor Project Development Mailing List tor-dev@lists.torproject.org

Related Information

Filename: 327-pow-over-intro.txt
Title: A First Take at PoW Over Introduction Circuits
Author: George Kadianakis, Mike Perry, David Goulet, tevador
Created: 2 April 2020
Status: Closed

0. Abstract

  This proposal aims to thwart introduction flooding DoS attacks by introducing
  a dynamic Proof-Of-Work protocol that occurs over introduction circuits.

1. Motivation

  So far our attempts at limiting the impact of introduction flooding DoS
  attacks on onion services have been focused on horizontal scaling with
  Onionbalance, optimizing the CPU usage of Tor and applying rate limiting.
  While these measures move the goalpost forward, a core problem with onion
  service DoS is that building rendezvous circuits is a costly procedure both
  for the service and for the network. For more information on the limitations
  of rate-limiting when defending against DDoS, see [REF_TLS_1].

  If we ever hope to have truly reachable global onion services, we need to
  make it harder for attackers to overload the service with introduction
  requests. This proposal achieves this by allowing onion services to specify
  an optional dynamic proof-of-work scheme that its clients need to participate
  in if they want to get served.

  With the right parameters, this proof-of-work scheme acts as a gatekeeper to
  block amplification attacks by attackers while letting legitimate clients
  through.

1.1. Related work

  For a similar concept, see the three internet drafts that have been proposed
  for defending against TLS-based DDoS attacks using client puzzles [REF_TLS].

1.2. Threat model [THREAT_MODEL]

1.2.1. Attacker profiles [ATTACKER_MODEL]

  This proposal is written to thwart specific attackers. A simple PoW proposal
  cannot defend against all and every DoS attack on the Internet, but there are
  adversary models we can defend against.

  Let's start with some adversary profiles:

  "The script-kiddie"

    The script-kiddie has a single computer and pushes it to its
    limits. Perhaps it also has a VPS and a pwned server. We are talking about
    an attacker with total access to 10 GHz of CPU and 10 GB of RAM. We
    consider the total cost for this attacker to be zero $.

  "The small botnet"

    The small botnet is a bunch of computers lined up to do an introduction
    flooding attack. Assuming 500 medium-range computers, we are talking about
    an attacker with total access to 10 THz of CPU and 10 TB of RAM. We
    consider the upfront cost for this attacker to be about $400.

  "The large botnet"

    The large botnet is a serious operation with many thousands of computers
    organized to do this attack. Assuming 100k medium-range computers, we are
    talking about an attacker with total access to 200 THz of CPU and 200 TB of
    RAM. The upfront cost for this attacker is about $36k.

  We hope that this proposal can help us defend against the script-kiddie
  attacker and small botnets. To defend against a large botnet we would need
  more tools at our disposal (see [FUTURE_DESIGNS]).

1.2.2. User profiles [USER_MODEL]

  We have attackers and we have users. Here are a few user profiles:

  "The standard web user"

    This is a standard laptop/desktop user who is trying to browse the
    web. They don't know how these defences work and they don't care to
    configure or tweak them. If the site doesn't load, they are gonna close
    their browser and be sad at Tor. They run a 2GHz computer with 4GB of RAM.

  "The motivated user"

    This is a user that really wants to reach their destination. They don't
    care about the journey; they just want to get there. They know what's going
    on; they are willing to make their computer do expensive multi-minute PoW
    computations to get where they want to be.

  "The mobile user"

    This is a motivated user on a mobile phone. Even tho they want to read the
    news article, they don't have much leeway on stressing their machine to do
    more computation.

  We hope that this proposal will allow the motivated user to always connect
  where they want to connect to, and also give more chances to the other user
  groups to reach the destination.

1.2.3. The DoS Catch-22 [CATCH22]

  This proposal is not perfect and it does not cover all the use cases. Still,
  we think that by covering some use cases and giving reachability to the
  people who really need it, we will severely demotivate the attackers from
  continuing the DoS attacks and hence stop the DoS threat altogether.
  Furthermore, by increasing the cost to launch a DoS attack, a big
  class of DoS attackers will disappear from the map, since the expected ROI
  will decrease.

2. System Overview

2.1. Tor protocol overview

                                          +----------------------------------+
                                          |          Onion Service           |
   +-------+ INTRO1  +-----------+ INTRO2 +--------+                         |
   |Client |-------->|Intro Point|------->|  PoW   |-----------+             |
   +-------+         +-----------+        |Verifier|           |             |
                                          +--------+           |             |
                                          |                    |             |
                                          |                    |             |
                                          |         +----------v---------+   |
                                          |         |Intro Priority Queue|   |
                                          +---------+--------------------+---+
                                                           |  |  |
                                                Rendezvous |  |  |
                                                  circuits |  |  |
                                                           v  v  v



  The proof-of-work scheme specified in this proposal takes place during the
  introduction phase of the onion service protocol.

  The system described in this proposal is not meant to be on all the time, and
  it can be entirely disabled for services that do not experience DoS attacks.

  When the subsystem is enabled, suggested effort is continuously adjusted and
  the computational puzzle can be bypassed entirely when the effort reaches
  zero. In these cases, the proof-of-work subsystem can be dormant but still
  provide the necessary parameters for clients to voluntarily provide effort
  in order to get better placement in the priority queue.

  The protocol involves the following major steps:

  1) Service encodes PoW parameters in descriptor [DESC_POW]
  2) Client fetches descriptor and computes PoW [CLIENT_POW]
  3) Client completes PoW and sends results in INTRO1 cell [INTRO1_POW]
  4) Service verifies PoW and queues introduction based on PoW effort
     [SERVICE_VERIFY]
  5) Requests are continuously drained from the queue, highest effort first,
     subject to multiple constraints on speed [HANDLE_QUEUE]

2.2. Proof-of-work overview

2.2.1. Algorithm overview

  For our proof-of-work function we will use the Equi-X scheme by tevador
  [REF_EQUIX].  Equi-X is an asymmetric PoW function based on Equihash<60,3>,
  using HashX as the underlying layer. It features lightning fast verification
  speed, and also aims to minimize the asymmetry between CPU and GPU.
  Furthermore, it's designed for this particular use-case and hence
  cryptocurrency miners are not incentivized to make optimized ASICs for it.

  The overall scheme consists of several layers that provide different pieces
  of this functionality:

  1) At the lowest layers, blake2b and siphash are used as hashing and PRNG
     algorithms that are well suited to common 64-bit CPUs.
  2) A custom hash function family, HashX, randomizes its implementation for
     each new seed value. These functions are tuned to utilize the pipelined
     integer performance on a modern 64-bit CPU. This layer provides the
     strongest ASIC resistance, since a hardware reimplementation would need
     to include a CPU-like pipelined execution unit to keep up.
  3) The Equi-X layer itself builds on HashX and adds an algorithmic puzzle
     that's designed to be strongly asymmetric and to require RAM to solve
     efficiently.
  4) The PoW protocol itself builds on this Equi-X function with a particular
     construction of the challenge input and particular constraints on the
     allowed blake2b hash of the solution. This layer provides a linearly
     adjustable effort that we can verify.
  5) Above the level of individual PoW handshakes, the client and service
     form a closed-loop system that adjusts the effort of future handshakes.

  The Equi-X scheme provides two functions that will be used in this proposal:
      - equix_solve(challenge) which solves a puzzle instance, returning
        a variable number of solutions per invocation depending on the specific
        challenge value.
      - equix_verify(challenge, solution) which verifies a puzzle solution
        quickly. Verification still depends on executing the HashX function,
        but far fewer times than when searching for a solution.

  For the purposes of this proposal, all cryptographic algorithms are assumed
  to produce and consume byte strings, even if internally they operate on
  some other data type like 64-bit words. This is conventionally little endian
  order for blake2b, which contrasts with Tor's typical use of big endian.
  HashX itself is configured with an 8-byte output but its input is a single
  64-bit word of undefined byte order, of which only the low 16 bits are used
  by Equi-X in its solution output. We treat Equi-X solution arrays as byte
  arrays using their packed little endian 16-bit representation.

  We tune Equi-X in section [EQUIX_TUNING].

2.2.2. Dynamic PoW

  DoS is a dynamic problem where the attacker's capabilities constantly change,
  and hence we want our proof-of-work system to be dynamic and not stuck with a
  static difficulty setting. Hence, instead of forcing clients to go below a
  static target like in Bitcoin to be successful, we ask clients to "bid" using
  their PoW effort. Effectively, a client gets higher priority the higher
  effort they put into their proof-of-work. This is similar to how
  proof-of-stake works but instead of staking coins, you stake work.

  The benefit here is that legitimate clients who really care about getting
  access can spend a big amount of effort into their PoW computation, which
  should guarantee access to the service given reasonable adversary models. See
  [PARAM_TUNING] for more details about these guarantees and tradeoffs.

  As a way to improve reachability and UX, the service tries to estimate the
  effort needed for clients to get access at any given time and places it in
  the descriptor. See [EFFORT_ESTIMATION] for more details.

2.2.3. PoW effort

  It's common for proof-of-work systems to define an exponential effort
  function based on a particular number of leading zero bits or equivalent.
  For the benefit of our effort estimation system, it's quite useful if we
  instead have a linear scale. We use the first 32 bits of a hashed version
  of the Equi-X solution as compared to the full 32-bit range.

  Conceptually we could define a function:
         unsigned effort(uint8_t *token)
  which takes as its argument a hashed solution, interprets it as a
  bitstring, and returns the quotient of dividing a bitstring of 1s by it.

  So for example:
         effort(00000001100010101101) = 11111111111111111111
                                          / 00000001100010101101
  or the same in decimal:
         effort(6317) = 1048575 / 6317 = 165.

  In practice we can avoid even having to perform this division, performing
  just one multiply instead to see if a request's claimed effort is supported
  by the smallness of the resulting 32-bit hash prefix. This assumes we send
  the desired effort explicitly as part of each PoW solution. We do want to
  force clients to pick a specific effort before looking for a solution,
  otherwise a client could opportunistically claim a very large effort any
  time a lucky hash prefix comes up. Thus the effort is communicated explicitly
  in our protocol, and it forms part of the concatenated Equi-X challenge.
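
  The conceptual effort function and the division-free check can be written
  out in a few lines of Python. The function names are illustrative only; the
  `bits` argument exists so that the 20-bit example above can be reproduced.

```python
UINT32_MAX = 0xFFFFFFFF


def effort(token: int, bits: int = 32) -> int:
    """Conceptual effort: an all-ones bitstring divided by the hashed solution."""
    if token <= 0:
        raise ValueError("the division form needs a positive token")
    return ((1 << bits) - 1) // token


def effort_is_supported(r: int, claimed_effort: int) -> bool:
    """Division-free check: does the 32-bit hash prefix r support the claim?"""
    return r * claimed_effort <= UINT32_MAX
```

  For any positive 32-bit prefix r, effort_is_supported(r, E) holds exactly
  when E <= effort(r), so a single multiplication replaces the division.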

3. Protocol specification

3.1. Service encodes PoW parameters in descriptor [DESC_POW]

  This whole protocol starts with the service encoding the PoW parameters in
  the 'encrypted' (inner) part of the v3 descriptor. As follows:

       "pow-params" SP type SP seed-b64 SP suggested-effort
                    SP expiration-time NL

        [At most once]

        type: The type of PoW system used. We call the one specified here "v1"

        seed-b64: A random seed that should be used as the input to the PoW
                  hash function. Should be 32 random bytes encoded in base64
                  without trailing padding.

        suggested-effort: An unsigned integer specifying an effort value that
                  clients should aim for when contacting the service. Can be
                  zero to mean that PoW is available but not currently
                  suggested for a first connection attempt. See
                  [EFFORT_ESTIMATION] for more details here.

        expiration-time: A timestamp in "YYYY-MM-DDTHH:MM:SS" format (iso time
                         with no space) after which the above seed expires and
                         is no longer valid as the input for PoW. It's needed
                         so that our replay cache does not grow infinitely. It
                         should be set to RAND_TIME(now+7200, 900) seconds.

   The service should refresh its seed when expiration-time passes. The service
   SHOULD keep its previous seed in memory and accept PoWs using it to avoid
   race-conditions with clients that have an old seed. The service SHOULD avoid
   generating two consecutive seeds that have a common 4-byte prefix. See
   [INTRO1_POW] for more info.

   By RAND_TIME(ts, interval) we mean a time between ts-interval and ts, chosen
   uniformly at random.
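
   A rough Python sketch of producing and parsing the "pow-params" line
   follows. The helper names are hypothetical, and RAND_TIME is modelled
   with whole-second granularity, which is an assumption.

```python
import base64
import random
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def rand_time(ts: datetime, interval: int) -> datetime:
    """RAND_TIME(ts, interval): uniform between ts-interval and ts (seconds)."""
    return ts - timedelta(seconds=random.randint(0, interval))


def format_pow_params(seed: bytes, suggested_effort: int, now: datetime) -> str:
    """Build a v1 pow-params line with an unpadded base64 seed."""
    if len(seed) != 32:
        raise ValueError("seed should be 32 random bytes")
    seed_b64 = base64.b64encode(seed).decode("ascii").rstrip("=")
    expiration = rand_time(now + timedelta(seconds=7200), 900)
    return "pow-params v1 {} {} {}".format(
        seed_b64, suggested_effort, expiration.strftime(TIME_FORMAT))


def parse_pow_params(line: str):
    """Return (type, seed, suggested_effort, expiration_time)."""
    keyword, pow_type, seed_b64, effort, expiration = line.split(" ")
    if keyword != "pow-params":
        raise ValueError("not a pow-params line")
    # Re-add the base64 padding that the descriptor omits.
    seed = base64.b64decode(seed_b64 + "=" * (-len(seed_b64) % 4))
    return pow_type, seed, int(effort), datetime.strptime(expiration, TIME_FORMAT)
```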

3.2. Client fetches descriptor and computes PoW [CLIENT_POW]

  If a client receives a descriptor with "pow-params", it should assume that
  the service is prepared to receive PoW solutions as part of the introduction
  protocol.

  The client parses the descriptor and extracts the PoW parameters. It makes
  sure that the <expiration-time> has not expired and if it has, it needs to
  fetch a new descriptor.

  The client should then extract the <suggested-effort> field to configure its
  PoW 'target' (see [REF_TARGET]). The client SHOULD NOT accept 'target' values
  that will cause unacceptably long PoW computation.

  The client uses a "personalization string" P equal to the following
  nul-terminated ASCII string: "Tor hs intro v1\0".

  The client looks up `ID`, the current 32-byte blinded public ID
  (KP_hs_blind_id) for the onion service.

  To complete the PoW the client follows the following logic:

      a) Client selects a target effort E, based on <suggested-effort> and past
         connection attempt history.
      b) Client generates a secure random 16-byte nonce N, as the starting
         point for the solution search.
      c) Client derives seed C by decoding 'seed-b64'.
      d) Client calculates S = equix_solve(P || ID || C || N || E)
      e) Client calculates R = ntohl(blake2b_32(P || ID || C || N || E || S))
      f) Client checks if R * E <= UINT32_MAX.
        f1) If yes, success! The client can submit N, E, the first 4 bytes of
        C, and S.
        f2) If no, fail! The client interprets N as a 16-byte little-endian
        integer, increments it by 1 and goes back to step d).

  Note that the blake2b hash includes the output length parameter in its
  initial state vector, so a blake2b_32 is not equivalent to the prefix of a
  blake2b_512. We calculate the 32-bit blake2b specifically, and interpret it
  in network byte order as an unsigned integer.
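
  The search loop in steps a) through f) can be sketched in Python as shown
  below. Python has no Equi-X binding that we can rely on here, so the solver
  is passed in as a parameter and the default `placeholder_equix_solve` is
  NOT Equi-X; it only makes the loop runnable. Encoding E as a 4-byte
  big-endian integer inside the challenge and trying every solution returned
  by one solver call are likewise assumptions of this sketch.

```python
import hashlib
import os
import struct

UINT32_MAX = 0xFFFFFFFF
P = b"Tor hs intro v1\0"  # personalization string


def blake2b_32(data: bytes) -> int:
    """4-byte blake2b digest, interpreted in network byte order."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=4).digest(), "big")


def placeholder_equix_solve(challenge: bytes) -> list:
    """Stand-in for equix_solve(); this is NOT the real Equi-X function."""
    return [hashlib.sha256(challenge).digest()[:16]]


def solve_pow(blinded_id: bytes, seed: bytes, effort: int, nonce=None,
              equix_solve=placeholder_equix_solve):
    """Return (N, E, first 4 bytes of C, S) satisfying R * E <= UINT32_MAX."""
    if nonce is None:
        nonce = os.urandom(16)  # step b): random starting point
    n = int.from_bytes(nonce, "little")
    e = struct.pack(">I", effort)  # assumed encoding of E
    while True:
        nonce_bytes = n.to_bytes(16, "little")
        challenge = P + blinded_id + seed + nonce_bytes + e
        for s in equix_solve(challenge):            # step d)
            r = blake2b_32(challenge + s)           # step e)
            if r * effort <= UINT32_MAX:            # step f)
                return nonce_bytes, effort, seed[:4], s
        # step f2): treat N as a little-endian integer and increment it
        n = (n + 1) % (1 << 128)
```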

  At the end of the above procedure, the client should have S as the solution
  of the Equi-X puzzle with N as the nonce, C as the seed. How quickly this
  happens depends solely on the target effort E parameter.

  The algorithm as described is suitable for single-threaded computation.
  Optionally, a client may choose multiple nonces and attempt several solutions
  in parallel on separate CPU cores. The specific choice of nonce is entirely
  up to the client, so parallelization choices like this do not impact the
  network protocol's interoperability at all.

3.3. Client sends PoW in INTRO1 cell [INTRO1_POW]

  Now that the client has an answer to the puzzle it's time to encode it into
  an INTRODUCE1 cell. To do so the client adds an extension to the encrypted
  portion of the INTRODUCE1 cell by using the EXTENSIONS field (see
  [PROCESS_INTRO2] section in rend-spec-v3.txt). The encrypted portion of the
  INTRODUCE1 cell only gets read by the onion service and is ignored by the
  introduction point.

  We propose a new EXT_FIELD_TYPE value:

     [02] -- PROOF_OF_WORK

   The EXT_FIELD content format is:

        POW_VERSION    [1 byte]
        POW_NONCE      [16 bytes]
        POW_EFFORT     [4 bytes]
        POW_SEED       [4 bytes]
        POW_SOLUTION   [16 bytes]

   where:

    POW_VERSION is 1 for the protocol specified in this proposal
    POW_NONCE is the nonce 'N' from the section above
    POW_EFFORT is the 32-bit integer effort value, in network byte order
    POW_SEED is the first 4 bytes of the seed used
    POW_SOLUTION is the 16-byte Equi-X solution 'S' from the section above

   This will increase the INTRODUCE1 payload size by 43 bytes since the
   extension type and length is 2 extra bytes, the N_EXTENSIONS field is always
   present and currently set to 0 and the EXT_FIELD is 41 bytes. According to
   ticket #33650, INTRODUCE1 cells currently have more than 200 bytes
   available.
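
   The field layout and the size arithmetic can be checked with a short
   Python sketch. The function names are hypothetical; the one-byte type and
   one-byte length prefix follow the "2 extra bytes" mentioned above.

```python
import struct

EXT_FIELD_TYPE_PROOF_OF_WORK = 0x02
# POW_VERSION, POW_NONCE, POW_EFFORT (network order), POW_SEED, POW_SOLUTION
POW_FORMAT = ">B16sI4s16s"


def encode_pow_extension(nonce: bytes, effort: int, seed: bytes,
                         solution: bytes) -> bytes:
    """Encode extension type, length and the 41-byte EXT_FIELD content."""
    if len(nonce) != 16 or len(solution) != 16 or len(seed) < 4:
        raise ValueError("unexpected field length")
    body = struct.pack(POW_FORMAT, 1, nonce, effort, seed[:4], solution)
    return bytes([EXT_FIELD_TYPE_PROOF_OF_WORK, len(body)]) + body


def decode_pow_extension(ext: bytes):
    """Return (version, nonce, effort, seed_prefix, solution)."""
    ext_type, ext_len = ext[0], ext[1]
    if ext_type != EXT_FIELD_TYPE_PROOF_OF_WORK:
        raise ValueError("not a PROOF_OF_WORK extension")
    if ext_len != struct.calcsize(POW_FORMAT) or len(ext) != 2 + ext_len:
        raise ValueError("unexpected extension length")
    return struct.unpack(POW_FORMAT, ext[2:])
```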

3.4. Service verifies PoW and handles the introduction  [SERVICE_VERIFY]

   When a service receives an INTRODUCE1 with the PROOF_OF_WORK extension, it
   should check its configuration on whether proof-of-work is enabled on the
   service. If it's not enabled, the extension SHOULD be ignored. If enabled,
   even if the suggested effort is currently zero, the service follows the
   procedure detailed in this section.

   If the service requires the PROOF_OF_WORK extension but received an
   INTRODUCE1 cell without any embedded proof-of-work, the service SHOULD
   consider this cell as a zero-effort introduction for the purposes of the
   priority queue (see section [INTRO_QUEUE]).

3.4.1. PoW verification [POW_VERIFY]

   To verify the client's proof-of-work the service MUST do the following steps:

      a) Find a valid seed C that starts with POW_SEED. Fail if no such seed
         exists.
      b) Fail if N = POW_NONCE is present in the replay cache
              (see [REPLAY_PROTECTION])
      c) Calculate R = ntohl(blake2b_32(P || ID || C || N || E || S))
      d) Fail if R * E > UINT32_MAX
      e) Fail if equix_verify(P || ID || C || N || E, S) != EQUIX_OK
      f) Put the request in the queue with a priority of E

   If any of these steps fail the service MUST ignore this introduction request
   and abort the protocol.

   In this proposal we call the above steps the "top half" of introduction
   handling. If all the steps of the "top half" have passed, then the circuit
   is added to the introduction queue as detailed in section [INTRO_QUEUE].
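
   A sketch of this "top half" in Python is given below. The Equi-X verifier
   is supplied by the caller because no real binding is assumed here; the
   replay cache is modelled as a plain set of (seed, nonce) tuples that is
   updated only after a successful verification, and E is assumed to be
   encoded as a 4-byte big-endian integer inside the challenge. All names are
   illustrative.

```python
import hashlib
import struct

UINT32_MAX = 0xFFFFFFFF
P = b"Tor hs intro v1\0"


def verify_pow(blinded_id: bytes, active_seeds, replay_cache: set,
               pow_nonce: bytes, pow_effort: int, pow_seed: bytes,
               solution: bytes, equix_verify) -> bool:
    """Return True if the request may be queued with priority pow_effort."""
    # a) Find a valid seed C that starts with POW_SEED.
    seed = next((c for c in active_seeds if c[:4] == pow_seed), None)
    if seed is None:
        return False
    # b) Reject replays of the same (seed, nonce) tuple.
    if (seed, pow_nonce) in replay_cache:
        return False
    challenge = P + blinded_id + seed + pow_nonce + struct.pack(">I", pow_effort)
    # c) R = ntohl(blake2b_32(P || ID || C || N || E || S))
    digest = hashlib.blake2b(challenge + solution, digest_size=4).digest()
    r = int.from_bytes(digest, "big")
    # d) The claimed effort must be supported by the hash prefix.
    if r * pow_effort > UINT32_MAX:
        return False
    # e) The comparatively expensive Equi-X check runs last.
    if not equix_verify(challenge, solution):
        return False
    replay_cache.add((seed, pow_nonce))
    return True  # f) the caller queues the request with priority pow_effort
```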

3.4.1.1. Replay protection [REPLAY_PROTECTION]

  The service MUST NOT accept introduction requests with the same (seed, nonce)
  tuple. For this reason a replay protection mechanism must be employed.

  The simplest way is to use a simple hash table to check whether a (seed,
  nonce) tuple has been used before for the active duration of a
  seed. Depending on how long a seed stays active this might be a viable
  solution with reasonable memory/time overhead.

  If there is a worry that we might get too many introductions during the
  lifetime of a seed, we can use a Bloom filter as our replay cache
  mechanism. The probabilistic nature of Bloom filters means that sometimes we
  will flag some connections as replays even if they are not; with this false
  positive probability increasing as the number of entries increases. However,
  with the right parameter tuning this probability should be negligible and
  well handled by clients.

  {TODO: Design and specify a suitable bloom filter for this purpose.}

3.4.2. The Introduction Queue  [INTRO_QUEUE]

3.4.2.1. Adding introductions to the introduction queue [ADD_QUEUE]

  When PoW is enabled and a verified introduction comes through, the service
  instead of jumping straight into rendezvous, queues it and prioritizes it
  based on how much effort was devoted by the client to PoW. This means that
  introduction requests with high effort should be prioritized over those with
  low effort.

  To do so, the service maintains an "introduction priority queue" data
  structure. Each element in that priority queue is an introduction request,
  and its priority is the effort put into its PoW:

  When a verified introduction comes through, the service uses its included
  effort commitment value to place each request into the right position of the
  priority_queue: The bigger the effort, the more priority it gets in the
  queue. If two elements have the same effort, the older one has priority over
  the newer one.
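
  One way to obtain this ordering is a binary heap keyed on (negated effort,
  arrival counter), sketched below in Python. The class name and the use of
  `heapq` are illustrative choices, not part of the proposal.

```python
import heapq
from itertools import count


class IntroPriorityQueue:
    """Highest effort first; among equal efforts, the oldest request first."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # monotonically increasing arrival order

    def push(self, effort: int, request) -> None:
        # heapq is a min-heap, so the effort is negated to pop the largest
        # first; the arrival counter breaks ties in favour of older requests.
        heapq.heappush(self._heap, (-effort, next(self._arrival), request))

    def pop(self):
        neg_effort, _, request = heapq.heappop(self._heap)
        return -neg_effort, request

    def __len__(self) -> int:
        return len(self._heap)
```

  Because the arrival counter is unique, two entries never compare equal on
  the first two tuple fields, so the request objects themselves are never
  compared.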

3.4.2.2. Handling introductions from the introduction queue [HANDLE_QUEUE]

  The service should handle introductions by pulling from the introduction
  queue. We call this part of introduction handling the "bottom half" because
  most of the computation happens in this stage. For a description of how we
  expect such a system to work in Tor, see [TOR_SCHEDULER] section.

3.4.3. PoW effort estimation [EFFORT_ESTIMATION]

3.4.3.1. High-level description of the effort estimation process

  The service starts with a default suggested-effort value of 0, which keeps
  the PoW defenses dormant until we notice signs of overload.

  The overall process of determining effort can be thought of as a set of
  multiple coupled feedback loops. Clients perform their own effort
  adjustments via [CLIENT_TIMEOUT] atop a base effort suggested by the service.
  That suggestion incorporates the service's control adjustments atop a base
  effort calculated using a sum of currently-queued client effort.

  Each feedback loop has an opportunity to cover different time scales. Clients
  can make adjustments at every single circuit creation request, whereas
  services are limited by the extra load that frequent updates would place on
  HSDir nodes.

  In the combined client/service system these client-side increases are
  expected to provide the most effective quick response to an emerging DoS
  attack. After early clients increase the effort using [CLIENT_TIMEOUT],
  later clients will benefit from the service detecting this increased queued
  effort and offering a larger suggested_effort.

  Effort increases and decreases both have an intrinsic cost. Increasing effort
  will make the service more expensive to contact, and decreasing effort makes
  new requests likely to become backlogged behind older requests. The steady
  state condition is preferable to either of these side-effects, but ultimately
  it's expected that the control loop always oscillates to some degree.

3.4.3.2. Service-side effort estimation

  Services keep an internal effort estimation which updates on a regular
  periodic timer in response to measurements made on the queueing behavior
  in the previous period. These internal effort changes can optionally trigger
  client-visible suggested_effort changes when the difference is great enough
  to warrant republishing to the HSDir.

  This evaluation and update period is referred to as HS_UPDATE_PERIOD.
  The service side effort estimation takes inspiration from TCP congestion
  control's additive increase / multiplicative decrease approach, but unlike
  a typical AIMD this algorithm is fixed-rate and doesn't update immediately
  in response to events.

  {TODO: HS_UPDATE_PERIOD is hardcoded to 300 (5 minutes) currently, but it
   should be configurable in some way. Is it more appropriate to use the
   service's torrc here or a consensus parameter?}

3.4.3.3. Per-period service state

  During each update period, the service maintains some state:

    1. TOTAL_EFFORT, a sum of all effort values for rendezvous requests that
       were successfully validated and enqueued.

    2. REND_HANDLED, a count of rendezvous requests that were actually
       launched. Requests that made it to dequeueing but were too old to launch
       by then are not included.

    3. HAD_QUEUE, a flag which is set if at any time in the update period we
       saw the priority queue filled with more than a minimum amount of work,
       greater than we would expect to process in approximately 1/4 second
       using the configured dequeue rate.

    4. MAX_TRIMMED_EFFORT, the largest observed single request effort that we
       discarded during the period. Requests are discarded either due to age
       (timeout) or during culling events that discard the bottom half of the
       entire queue when it's too full.

3.4.3.4. Service AIMD conditions

  At the end of each period, the service may decide to increase effort,
  decrease effort, or make no changes, based on these accumulated state values:

    1. If MAX_TRIMMED_EFFORT > our previous internal suggested_effort,
       always INCREASE. Requests that follow our latest advice are being
       dropped.

    2. If the HAD_QUEUE flag was set and the queue still contains at least
       one item with effort >= our previous internal suggested_effort,
       INCREASE. Even if we haven't yet reached the point of dropping requests,
       this signal indicates that our latest suggestion isn't high enough
       and requests will build up in the queue.

    3. If neither condition (1) nor (2) is taking place and the queue is below
       a level we would expect to process in approximately 1/4 second, choose
       to DECREASE.

    4. If none of these conditions match, the suggested effort is unchanged.

  When we INCREASE, the internal suggested_effort is increased to either its
  previous value + 1, or (TOTAL_EFFORT / REND_HANDLED), whichever is larger.

  When we DECREASE, the internal suggested_effort is scaled by 2/3rds.

  Over time, this will continue to decrease our effort suggestion any time the
  service is fully processing its request queue. If the queue stays empty, the
  effort suggestion decreases to zero and clients should no longer submit a
  proof-of-work solution with their first connection attempt.

  It's worth noting that the suggested-effort is not a hard limit to the
  efforts that are accepted by the service, and it's only meant to serve as a
  guideline for clients to reduce the number of unsuccessful requests that get
  to the service. The service still adds requests with lower effort than
  suggested-effort to the priority queue in [ADD_QUEUE].
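
  As a rough illustration, the end-of-period decision above can be sketched
  in Python. This is not tor's implementation: the names PeriodState and
  update_suggested_effort, the two boolean queue inputs, the integer rounding,
  and the handling of REND_HANDLED = 0 are all assumptions made for this
  example.

```python
from dataclasses import dataclass


@dataclass
class PeriodState:
    """Per-period counters from the list in 3.4.3.3."""
    total_effort: int = 0        # TOTAL_EFFORT
    rend_handled: int = 0        # REND_HANDLED
    had_queue: bool = False      # HAD_QUEUE
    max_trimmed_effort: int = 0  # MAX_TRIMMED_EFFORT


def update_suggested_effort(state, suggested_effort,
                            queue_has_item_at_or_above_suggestion,
                            queue_below_quarter_second):
    """Return the new suggested effort at the end of an update period."""
    increase = (
        state.max_trimmed_effort > suggested_effort      # condition 1
        or (state.had_queue
            and queue_has_item_at_or_above_suggestion)   # condition 2
    )
    if increase:
        # Assumption: integer division, and an average of 0 when no
        # rendezvous was handled (the text does not cover that case).
        average = (state.total_effort // state.rend_handled
                   if state.rend_handled else 0)
        return max(suggested_effort + 1, average)
    if queue_below_quarter_second:                       # condition 3
        # Assumption: integer arithmetic, rounding down.
        return suggested_effort * 2 // 3
    return suggested_effort                              # condition 4
```

  For example, with a suggestion of 30 and an idle queue, one period of
  DECREASE yields 20; repeated idle periods drive the value down to zero.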

3.4.3.5. Updating descriptor with new suggested effort

  The service descriptors may be updated for multiple reasons including
  introduction point rotation common to all v3 onion services, the scheduled
  seed rotations described in [DESC_POW], and updates to the effort suggestion.
  Even though the internal effort estimate updates on a regular timer, we avoid
  propagating those changes into the descriptor and the HSDir hosts unless
  there is a significant change.

  If the PoW params otherwise match but the suggested effort has changed by
  less than 15 percent, services SHOULD NOT upload a new descriptor.

4. Client behavior [CLIENT_BEHAVIOR]

  This proposal introduces a number of new ways in which a legitimate client
  can fail to reach the onion service.

  Furthermore, there is currently no end-to-end way for the onion service to
  inform the client that the introduction failed. The INTRO_ACK cell is not
  end-to-end (it's from the introduction point to the client) and hence it does
  not allow the service to inform the client that the rendezvous is never gonna
  occur.

  From the client's perspective there's no way to attribute this failure to
  the service itself rather than the introduction point, so error accounting
  is performed separately for each introduction-point. Existing mechanisms
  will discard an introduction point that has required too many retries.

4.1. Clients handling timeouts [CLIENT_TIMEOUT]

  Alice can fail to reach the onion service if her introduction request gets
  trimmed off the priority queue in [HANDLE_QUEUE], or if the service does not
  get through its priority queue in time and the connection times out.

  This section presents a heuristic method for the client getting service even
  in such scenarios.

  If the rendezvous request times out, the client SHOULD fetch a new descriptor
  for the service to make sure that it's using the right suggested-effort for
  the PoW and the right PoW seed. If the fetched descriptor includes a new
  suggested effort or seed, it should first retry the request with these
  parameters.

  {TODO: This is not actually implemented yet, but we should do it. How often
     should clients at most try to fetch new descriptors? Determined by a
     consensus parameter? This change will also allow clients to retry
     effectively in cases where the service has just been reconfigured to
     enable PoW defenses.}

  Every time the client retries the connection, it will count these failures
  per-introduction-point. These counts of previous retries are combined with
  the service's suggested_effort when calculating the actual effort to spend
  on any individual request to a service that advertises PoW support, even
  when the currently advertised suggested_effort is zero.

  On each retry, the client modifies its solver effort:

    1. If the effort is below (CLIENT_POW_EFFORT_DOUBLE_UNTIL = 1000)
       it will be doubled.

    2. Otherwise, multiply the effort by (CLIENT_POW_RETRY_MULTIPLIER = 1.5).

    3. Constrain the new effort to be at least
       (CLIENT_MIN_RETRY_POW_EFFORT = 8) and no greater than
       (CLIENT_MAX_POW_EFFORT = 10000)
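
  The three steps above can be sketched as a single function. The constant
  names and values come from the list; the function name next_retry_effort
  and the truncation of the 1.5x result to an integer are assumptions made
  for this example.

```python
CLIENT_POW_EFFORT_DOUBLE_UNTIL = 1000
CLIENT_POW_RETRY_MULTIPLIER = 1.5
CLIENT_MIN_RETRY_POW_EFFORT = 8
CLIENT_MAX_POW_EFFORT = 10000


def next_retry_effort(effort: int) -> int:
    """Return the solver effort to use for the next retry."""
    if effort < CLIENT_POW_EFFORT_DOUBLE_UNTIL:
        effort *= 2                                      # step 1
    else:
        # Assumption: fractional results are truncated.
        effort = int(effort * CLIENT_POW_RETRY_MULTIPLIER)  # step 2
    # Step 3: clamp into [CLIENT_MIN_RETRY_POW_EFFORT, CLIENT_MAX_POW_EFFORT].
    return max(CLIENT_MIN_RETRY_POW_EFFORT,
               min(effort, CLIENT_MAX_POW_EFFORT))
```

  Starting from an effort of zero, successive retries under this sketch use
  8, 16, 32, and so on, until the effort reaches the 10000 cap.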

  {TODO: These hardcoded limits should be replaced by timed limits and/or
      an unlimited solver with robust cancellation. This is issue tor#40787}

5. Attacker strategies [ATTACK_META]

  Now that we defined our protocol we need to start tweaking the various
  knobs. But before we can do that, we first need to understand a few
  high-level attacker strategies to see what we are fighting against.

5.1.1. Overwhelm PoW verification (aka "Overwhelm top half") [ATTACK_TOP_HALF]

  A basic attack here is the adversary spamming with bogus INTRO cells so that
  the service does not have computing capacity to even verify the
  proof-of-work. This adversary tries to overwhelm the procedure in the
  [POW_VERIFY] section.

  That's why we need the PoW algorithm to have a cheap verification time so
  that this attack is not possible: we tune this PoW parameter in section
  [POW_TUNING_VERIFICATION].

5.1.2. Overwhelm rendezvous capacity (aka "Overwhelm bottom half")
       [ATTACK_BOTTOM_HALF]

  Given the way the introduction queue works (see [HANDLE_QUEUE]), a very
  effective strategy for the attacker is to totally overwhelm the queue
  processing by sending more high-effort introductions than the onion service
  can handle at any given tick. This adversary tries to overwhelm the procedure
  in the [HANDLE_QUEUE] section.

  To do so, the attacker would have to send at least 20 high-effort
  introduction cells every 100ms, where high-effort is a PoW which is above the
  estimated level of "the motivated user" (see [USER_MODEL]).

  An easier attack for the adversary is the same strategy but with
  introduction cells that are all above the comfortable level of "the standard
  user" (see [USER_MODEL]). This would block out all standard users and only
  allow motivated users to pass.

5.1.3. Hybrid overwhelm strategy [ATTACK_HYBRID]

  If both the top- and bottom- halves are processed by the same thread, this
  opens up the possibility for a "hybrid" attack. Given the performance figures
  for the top half (0.31 ms/req.) and the bottom half (5.5 ms/req.), the
  attacker can optimally deny service by submitting 91 high-effort requests and
  1520 invalid requests per second. This will completely saturate the main loop
  because:

  0.31*(1520+91) ~ 0.5 sec.
  5.5*91         ~ 0.5 sec.

  This attack only has half the bandwidth requirement of [ATTACK_TOP_HALF] and
  half the compute requirement of [ATTACK_BOTTOM_HALF].

  Alternatively, the attacker can adjust the ratio between invalid and
  high-effort requests depending on their bandwidth and compute capabilities.
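
  The arithmetic behind this attack can be checked with a short sketch. It
  treats 0.31 ms as the cost paid by every request and 5.5 ms as the extra
  cost paid only by high-effort requests, matching the two products above.
  The function names and the 1000 ms per-second budget are assumptions made
  for this example.

```python
TOP_HALF_MS = 0.31     # paid by every request, valid or not
BOTTOM_HALF_MS = 5.5   # paid only by requests that reach the rendezvous step


def mainloop_load_ms(invalid_per_sec, high_effort_per_sec):
    """Milliseconds of main-loop work generated per second of traffic."""
    top = TOP_HALF_MS * (invalid_per_sec + high_effort_per_sec)
    bottom = BOTTOM_HALF_MS * high_effort_per_sec
    return top + bottom


def invalid_rate_to_saturate(high_effort_per_sec, budget_ms=1000.0):
    """Invalid requests per second needed to fill the rest of the budget."""
    used = (TOP_HALF_MS + BOTTOM_HALF_MS) * high_effort_per_sec
    return max(0.0, (budget_ms - used) / TOP_HALF_MS)
```

  With 91 high-effort requests per second, invalid_rate_to_saturate gives
  roughly 1520 invalid requests per second, and other ratios can be explored
  by changing the first argument.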

5.1.4. Gaming the effort estimation logic [ATTACK_EFFORT]

  Another way to beat this system is for the attacker to game the effort
  estimation logic (see [EFFORT_ESTIMATION]). Essentially, there are two attacks
  that we are trying to avoid:

  - Attacker sets descriptor suggested-effort to a very high value effectively
    making it impossible for most clients to produce a PoW token in a
    reasonable timeframe.
  - Attacker sets descriptor suggested-effort to a very small value so that
    most clients aim for a small value while the attacker comfortably launches
    an [ATTACK_BOTTOM_HALF] using medium effort PoW (see [REF_TEVADOR_1])

5.1.5. Precomputed PoW attack

  The attacker may precompute many valid PoW nonces and submit them all at once
  before the current seed expires, overwhelming the service temporarily even
  using a single computer. The current scheme gives the attackers 4 hours to
  launch this attack since each seed lasts 2 hours and the service caches two
  seeds.

  An attacker with this attack might be aiming to DoS the service for a limited
  amount of time, or to cause an [ATTACK_EFFORT] attack.

6. Parameter tuning [POW_TUNING]

  There are various parameters in this PoW system that need to be tuned:

  We first start by tuning the time it takes to verify a PoW token. We do this
  first because it's fundamental to the performance of onion services and can
  turn into a DoS vector of its own. We will do this tuning in a way that's
  agnostic to the chosen PoW function.

  We will then move towards analyzing the client starting difficulty setting
  for our PoW system. That defines the expected time for clients to succeed in
  our system, and the expected time for attackers to overwhelm our system. Same
  as above we will do this in a way that's agnostic to the chosen PoW function.

  Currently, we have hardcoded the initial client starting difficulty at 8,
  but this may be too low to ramp up quickly to various on and off attack
  patterns. A higher initial difficulty may be needed for these, depending on
  their severity. This section gives us an idea of how large such attacks can
  be.

  Finally, using those two pieces we will tune our PoW function and pick the
  right client starting difficulty setting. At the end of this section we will
  know the resources that an attacker needs to overwhelm the onion service, the
  resources that the service needs to verify introduction requests, and the
  resources that legitimate clients need to get to the onion service.

6.1. PoW verification [POW_TUNING_VERIFICATION]

  Verifying a PoW token is the first thing that a service does when it receives
  an INTRODUCE2 cell and it's detailed in section [POW_VERIFY]. This
  verification happens during the "top half" part of the process. Every
  millisecond spent verifying PoW adds overhead to the already existing "top
  half" part of handling an introduction cell. Hence we should be careful to
  add minimal overhead here so that we don't enable attacks like [ATTACK_TOP_HALF].

  During our performance measurements in [TOR_MEASUREMENTS] we learned that the
  "top half" takes about 0.26 msecs on average, without doing any sort of PoW
  verification. Using that value we compute the following table, that describes
  the number of cells we can queue per second (aka times we can perform the
  "top half" process) for different values of PoW verification time:

      +---------------------+-----------------------+--------------+
      |PoW Verification Time| Total "top half" time | Cells Queued |
      |                     |                       |  per second  |
      |---------------------|-----------------------|--------------|
      |    0     msec       |    0.26      msec     |    3846      |
      |    1     msec       |    1.26      msec     |    793       |
      |    2     msec       |    2.26      msec     |    442       |
      |    3     msec       |    3.26      msec     |    306       |
      |    4     msec       |    4.26      msec     |    234       |
      |    5     msec       |    5.26      msec     |    190       |
      |    6     msec       |    6.26      msec     |    159       |
      |    7     msec       |    7.26      msec     |    137       |
      |    8     msec       |    8.26      msec     |    121       |
      |    9     msec       |    9.26      msec     |    107       |
      |    10    msec       |    10.26     msec     |    97        |
      +---------------------+-----------------------+--------------+

  Here is how you can read the table above:

  - For a PoW function with a 1ms verification time, an attacker needs to send
    793 dummy introduction cells per second to succeed in an [ATTACK_TOP_HALF] attack.

  - For a PoW function with a 2ms verification time, an attacker needs to send
    442 dummy introduction cells per second to succeed in an [ATTACK_TOP_HALF] attack.

  - For a PoW function with a 10ms verification time, an attacker needs to send
    97 dummy introduction cells per second to succeed in an [ATTACK_TOP_HALF] attack.

  Whether an attacker can succeed at that depends on the attacker's resources,
  but also on the network's capacity.

  Our purpose here is to have the smallest PoW verification overhead possible
  that also allows us to achieve all our other goals.

  [Note that the table above is simply the result of a naive multiplication and
  does not take into account all the auxiliary overheads that happen every
  second like the time to invoke the mainloop, the bottom-half processes, or
  pretty much anything other than the "top-half" processing.

  During our measurements the time to handle INTRODUCE2 cells dominates any
  other action time: There might be events that require a long processing time,
  but these are pretty infrequent (like uploading a new HS descriptor) and
  hence over a long time they smooth out. Hence extrapolating the total cells
  queued per second based on a single "top half" time seems good enough to
  get some initial intuition. That said, the values of "Cells queued per
  second" from the table above, are likely much smaller than displayed above
  because of all the auxiliary overheads.]
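
  The naive extrapolation behind the table in this section can be reproduced
  with a few lines of Python. The 0.26 msec base time is the measured value
  quoted above; the function name and the choice to round down to whole cells
  are assumptions inferred from the table's values.

```python
import math

TOP_HALF_MS = 0.26  # measured "top half" time without PoW verification


def cells_queued_per_second(verify_ms, base_ms=TOP_HALF_MS):
    """Whole cells that fit into one second of pure top-half processing."""
    return math.floor(1000 / (base_ms + verify_ms))


# One entry per table row, for verification times of 0 to 10 msec.
table = [cells_queued_per_second(v) for v in range(11)]
```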

6.2. PoW difficulty analysis [POW_DIFFICULTY_ANALYSIS]

  The difficulty setting of our PoW basically dictates how difficult it should
  be to get a success in our PoW system. An attacker who can get many successes
  per second can pull a successful [ATTACK_BOTTOM_HALF] attack against our
  system.

  In classic PoW systems, "success" is defined as getting a hash output below
  the "target". However, since our system is dynamic, we define "success" as an
  abstract high-effort computation.

  Our system is dynamic but we still need a starting difficulty setting that
  will be used for bootstrapping the system. The client and attacker can still
  aim higher or lower but for UX purposes and for analysis purposes we do need
  to define a starting difficulty, to minimize retries by clients.

6.2.1. Analysis based on adversary power

  In this section we will try to do an analysis of PoW difficulty without using
  any sort of Tor-related or PoW-related benchmark numbers.

  We created the table (see [REF_TABLE]) below which shows how much time a
  legitimate client with a single machine should expect to burn before they get
  a single success. The x-axis is how many successes we want the attacker to be
  able to do per second: the more successes we allow the adversary, the more
  they can overwhelm our introduction queue. The y-axis is how many machines
  the adversary has at her disposal, ranging from just 5 to 1000.

       ===============================================================
       |    Expected Time (in seconds) Per Success For One Machine   |
 ===========================================================================
 |                                                                          |
 |   Attacker Successes       1       5       10      20      30      50    |
 |       per second                                                         |
 |                                                                          |
 |            5               5       1       0       0       0       0     |
 |            50              50      10      5       2       1       1     |
 |            100             100     20      10      5       3       2     |
 | Attacker   200             200     40      20      10      6       4     |
 |  Boxes     300             300     60      30      15      10      6     |
 |            400             400     80      40      20      13      8     |
 |            500             500     100     50      25      16      10    |
 |            1000            1000    200     100     50      33      20    |
 |                                                                          |
 ============================================================================

  Here is how you can read the table above:

  - If an adversary has a botnet with 1000 boxes, and we want to limit her to 1
    success per second, then a legitimate client with a single box should be
    expected to spend 1000 seconds getting a single success.

  - If an adversary has a botnet with 1000 boxes, and we want to limit her to 5
    successes per second, then a legitimate client with a single box should be
    expected to spend 200 seconds getting a single success.

  - If an adversary has a botnet with 500 boxes, and we want to limit her to 5
    successes per second, then a legitimate client with a single box should be
    expected to spend 100 seconds getting a single success.

  - If an adversary has access to 50 boxes, and we want to limit her to 5
    successes per second, then a legitimate client with a single box should be
    expected to spend 10 seconds getting a single success.

  - If an adversary has access to 5 boxes, and we want to limit her to 5
    successes per second, then a legitimate client with a single box should be
    expected to spend 1 second getting a single success.

  With the above table we can create some profiles for starting values of our
  PoW difficulty.
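
  The table entries are consistent with a simple model: if an attacker with B
  equally fast boxes is to be held to S successes per second, each box may
  produce only S/B successes per second, so a single box needs B/S seconds per
  success. The sketch below reproduces the table under that model. It is not
  the script referenced in [REF_TABLE]; the function name, the equal-speed
  assumption, and the rounding down to whole seconds are assumptions inferred
  from the values shown.

```python
def expected_seconds_per_success(attacker_boxes, attacker_successes_per_sec):
    """Seconds one machine needs, on average, for a single success."""
    # Assumption: all boxes run at the same speed; results are floored.
    return attacker_boxes // attacker_successes_per_sec


boxes = [5, 50, 100, 200, 300, 400, 500, 1000]
successes = [1, 5, 10, 20, 30, 50]
rows = {b: [expected_seconds_per_success(b, s) for s in successes]
        for b in boxes}
```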

6.2.2. Analysis based on Tor's performance [POW_DIFFICULTY_TOR]

  To go deeper here, we can use the performance measurements from
  [TOR_MEASUREMENTS] to get a more specific intuition on the starting
  difficulty. In particular, we learned that completely handling an
  introduction cell takes 5.55 msecs on average. Using that value, we can
  compute the following table, that describes the number of introduction cells
  we can handle per second for different values of PoW verification:

      +---------------------+-----------------------+--------------+
      |PoW Verification Time| Total time to handle  | Cells handled|
      |                     |   introduction cell   |  per second  |
      |---------------------|-----------------------|--------------|
      |    0      msec      |    5.55        msec   |    180.18    |
      |    1      msec      |    6.55        msec   |    152.67    |
      |    2      msec      |    7.55        msec   |    132.45    |
      |    3      msec      |    8.55        msec   |    116.96    |
      |    4      msec      |    9.55        msec   |    104.71    |
      |    5      msec      |    10.55       msec   |    94.79     |
      |    6      msec      |    11.55       msec   |    86.58     |
      |    7      msec      |    12.55       msec   |    79.68     |
      |    8      msec      |    13.55       msec   |    73.80     |
      |    9      msec      |    14.55       msec   |    68.73     |
      |    10     msec      |    15.55       msec   |    64.31     |
      +---------------------+-----------------------+--------------+

  Here is how you can read the table above:

  - For a PoW function with a 1ms verification time, an attacker needs to send
    152 high-effort introduction cells per second to succeed in an
    [ATTACK_BOTTOM_HALF] attack.

  - For a PoW function with a 10ms verification time, an attacker needs to send
    64 high-effort introduction cells per second to succeed in an
    [ATTACK_BOTTOM_HALF] attack.

  We can use this table to specify a starting difficulty that won't allow our
  target adversary to succeed in an [ATTACK_BOTTOM_HALF] attack.

  Of course, when it comes to this table, the same disclaimer as in section
  [POW_TUNING_VERIFICATION] is valid. That is, the above table is just a
  theoretical extrapolation and we expect the real values to be much lower
  since they depend on auxiliary processing overheads, and on the network's
  capacity.
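
  One way to connect this table with the adversary model of the previous
  section is sketched below: given the handling rate, it computes the
  per-machine solve time at which an attacker with a given number of boxes can
  only just match the service's capacity. The function names and the
  assumption that attacker and client machines are equally fast are
  illustrative, and the same caveats about overheads apply.

```python
FULL_HANDLING_MS = 5.55  # mean time to fully handle one introduction cell


def cells_handled_per_second(verify_ms, base_ms=FULL_HANDLING_MS):
    """Theoretical handling rate for a given PoW verification time."""
    return 1000 / (base_ms + verify_ms)


def min_seconds_per_success(attacker_boxes, verify_ms):
    """Smallest per-machine solve time that keeps the attacker's combined
    success rate at or below the service's handling rate."""
    return attacker_boxes / cells_handled_per_second(verify_ms)
```

  Under these assumptions, a 1000-box attacker facing a 1 msec verification
  time is held below capacity once a success costs each box about 6.55
  seconds.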


7. Discussion

7.1. UX

  This proposal has user facing UX consequences.

  When the client first attempts a PoW, it can note how long iterations of the
  hash function take, and then use this to determine an estimation of the
  duration of the PoW. This estimation could be communicated via the control
  port or other mechanism, such that the browser could display how long the
  PoW is expected to take on their device. If the device is a mobile platform,
  and this time estimation is large, it could recommend that the user try from
  a desktop machine.
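
  A minimal sketch of such an estimate follows. It assumes that the expected
  number of hash attempts is equal to the effort value, which this section
  does not state, and both function names are hypothetical. The attempt_fn
  argument stands in for one iteration of the real hash function.

```python
import time


def seconds_per_attempt(attempt_fn, samples=100):
    """Time a few iterations of the hash function on this device."""
    start = time.perf_counter()
    for _ in range(samples):
        attempt_fn()
    return (time.perf_counter() - start) / samples


def estimated_pow_seconds(effort, per_attempt_seconds):
    """Expected solve time, assuming 'effort' attempts are needed on average."""
    return effort * per_attempt_seconds
```

  The result is only an expectation: the actual solve time of a
  proof-of-work search varies from run to run.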

7.2. Future work [FUTURE_WORK]

7.2.1. Incremental improvements to this proposal

  There are various improvements that can be done in this proposal, and while
  we are trying to keep this v1 version simple, we need to keep the design
  extensible so that we build more features into it. In particular:

  - End-to-end introduction ACKs

    This proposal suffers from various UX issues because there is no end-to-end
    mechanism for an onion service to inform the client about its introduction
    request. If we had end-to-end introduction ACKs many of the problems from
    [CLIENT_BEHAVIOR] would be alleviated. The problem here is that end-to-end
    ACKs require modifications on the introduction point code and a network
    update which is a lengthy process.

  - Multithreading scheduler

    Our scheduler is pretty limited by the fact that Tor has a single-threaded
    design. If we improve our multithreading support we could handle a much
    greater amount of introduction requests per second.

7.2.2. Future designs [FUTURE_DESIGNS]

  This is just the beginning in DoS defences for Tor and there are various
  future designs and schemes that we can investigate. Here is a brief summary
  of these:

  "More advanced PoW schemes" -- We could use more advanced memory-hard PoW
         schemes like MTP-argon2 or Itsuku to make it even harder for
         adversaries to create successful PoWs. Unfortunately these schemes
         have much bigger proof sizes, and they won't fit in INTRODUCE1 cells.
         See #31223 for more details.

  "Third-party anonymous credentials" -- We can use anonymous credentials and a
         third-party token issuance server on the clearnet to issue tokens
         based on PoW or CAPTCHA and then use those tokens to get access to the
         service. See [REF_CREDS] for more details.

  "PoW + Anonymous Credentials" -- We can make a hybrid of the above ideas
         where we present a hard puzzle to the user when connecting to the
         onion service, and if they solve it we then give the user a bunch of
         anonymous tokens that can be used in the future. This can all happen
         between the client and the service without a need for a third party.

  All of the above approaches are much more complicated than this proposal, and
  hence we want to start easy before we get into more serious projects.

7.3. Environment

  We love the environment! We are concerned about how PoW schemes can waste
  energy by doing useless hash iterations. Here are a few reasons we still
  decided to pursue a PoW approach here:

  "We are not making things worse" -- DoS attacks are already happening and
      attackers are already burning energy to carry them out both on the
      attacker side, on the service side and on the network side. We think that
      asking legitimate clients to carry out PoW computations is not gonna
      affect the equation too much, since an attacker right now can very
      quickly cause the same damage that hundreds of legitimate clients do in a
      whole day.

  "We hope to make things better" -- The hope is that proposals like this will
      make the DoS actors go away and hence the PoW system will not be used. As
      long as DoS is happening there will be a waste of energy, but if we
      manage to demotivate them with technical means, the network as a whole
      will be less wasteful. Also see [CATCH22] for a similar argument.

8. Acknowledgements

  Thanks a lot to tevador for the various improvements to the proposal and for
  helping us understand and tweak the RandomX scheme.

  Thanks to Solar Designer for the help in understanding the current PoW
  landscape, the various approaches we could take, and teaching us a few neat
  tricks.

Appendix A.  Little-t tor introduction scheduler

  This section describes how we will implement this proposal in the "tor"
  software (little-t tor).

  The following should be read as if tor is an onion service and thus the end
  point of all inbound data.

A.1. The Main Loop [MAIN_LOOP]

  Tor uses libevent for its mainloop. For network I/O operations, a mainloop
  event is used to inform tor if it can read on a certain socket, or a
  connection object in tor.

  From there, this event will empty the connection input buffer (inbuf) by
  extracting and processing a cell at a time. The mainloop is single threaded
  and thus each cell is handled sequentially.

  Processing an INTRODUCE2 cell at the onion service means a series of
  operations (in order):

    1) Unpack cell from inbuf to local buffer.

    2) Decrypt cell (AES operations).

    3) Parse cell header and process it depending on its RELAY_COMMAND.

    4) INTRODUCE2 cell handling which means building a rendezvous circuit:
        i)  Path selection
        ii) Launch circuit to first hop.

    5) Return to mainloop event which essentially means back to step (1).

  Tor will read at most 32 cells out of the inbuf per mainloop round.

A.2. Requirements for PoW

  With this proposal, in order to prioritize cells by the amount of PoW work
  they have done, cells can _not_ be processed sequentially as described above.

  Thus, we need a way to queue a certain number of cells, prioritize them and
  then process some cell(s) from the top of the queue (that is, the cells that
  have done the most PoW effort).

  We thus require a new cell processing flow that is _not_ compatible with
  current tor design. The elements are:

    - Validate PoW and place cells in a priority queue of INTRODUCE2 cells (as
      described in section [INTRO_QUEUE]).

    - Defer "bottom half" INTRO2 cell processing for after cells have been
      queued into the priority queue.

A.3. Proposed scheduler [TOR_SCHEDULER]

  The intuitive way to address the A.2 requirements would be to do this
  simple and naive approach:

    1) Mainloop: Empty inbuf INTRODUCE2 cells into priority queue

    2) Process all cells in pqueue

    3) Goto (1)

  However, we are worried that handling all those cells before returning to the
  mainloop opens possibilities of attack by an adversary since the priority
  queue is not gonna be kept up to date while we process all those cells. This
  means that we might spend lots of time dealing with introductions that don't
  deserve it. See [BOTTOM_HALF_SCHEDULER] for more details.

  We thus propose to split the INTRODUCE2 handling into two different steps:
  "top half" and "bottom half" process, as also mentioned in [POW_VERIFY]
  section above.

A.3.1. Top half and bottom half scheduler

  The top half process is responsible for queuing introductions into the
  priority queue as follows:

    a) Unpack cell from inbuf to local buffer.

    b) Decrypt cell (AES operations).

    c) Parse INTRODUCE2 cell header and validate PoW.

    d) Return to mainloop event which essentially means back to step (a).

  The top-half basically does all operations of section [MAIN_LOOP] except for step (4).

  And then, the bottom-half process is responsible for handling introductions
  and doing rendezvous. To achieve this we introduce a new mainloop event to
  process the priority queue _after_ the top-half event has completed. This new
  event would do these operations sequentially:

    a) Pop INTRODUCE2 cell from priority queue.

    b) Parse and process INTRODUCE2 cell.

    c) End event and yield back to mainloop.

A.3.2. Scheduling the bottom half process [BOTTOM_HALF_SCHEDULER]

  The question now becomes: when should the "bottom half" event get triggered
  from the mainloop?

  We propose that this event is scheduled when the network I/O event
  queues at least 1 cell into the priority queue. Then, as long as it has a
  cell in the queue, it would re-schedule itself for immediate execution
  meaning at the next mainloop round, it would execute again.

  The idea is to try to empty the queue as fast as it can in order to provide a
  fast response time to an introduction request but always leave a chance for
  more cells to appear between cell processing by yielding back to the
  mainloop. With this we are aiming to always have the most up-to-date version
  of the priority queue when we are completing introductions: this way we are
  prioritizing clients that spent a lot of time and effort completing their PoW.

  If the size of the queue drops to 0, it stops scheduling itself in order to
  not create a busy loop. The network I/O event will re-schedule it in time.

  Notice that the proposed solution will make the service handle 1 single
  introduction request at every main loop event. However, when we do
  performance measurements we might learn that it's preferable to bump the
  number of cells in the future from 1 to N where N <= 32.
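
  The split described in this section can be modelled with a small toy
  scheduler. This is only a sketch, not tor's code: the class IntroScheduler
  and its method names are hypothetical, the priority queue is Python's
  standard heapq module, and serving equal-effort cells in arrival order is an
  assumption.

```python
import heapq
import itertools


class IntroScheduler:
    """Toy model of the top-half / bottom-half split."""

    def __init__(self):
        self._queue = []               # min-heap of (-effort, seq, cell)
        self._seq = itertools.count()  # tie-breaker: earlier cells first
        self.bottom_half_scheduled = False

    def top_half(self, cell, effort, pow_is_valid=True):
        """Validate PoW and enqueue; schedule the bottom half if needed."""
        if not pow_is_valid:
            return
        heapq.heappush(self._queue, (-effort, next(self._seq), cell))
        self.bottom_half_scheduled = True

    def bottom_half(self):
        """Handle one introduction, then yield back to the main loop."""
        if not self._queue:
            self.bottom_half_scheduled = False
            return None
        _, _, cell = heapq.heappop(self._queue)
        # Re-schedule only while work remains, to avoid a busy loop.
        self.bottom_half_scheduled = bool(self._queue)
        return cell
```

  Because bottom_half handles one cell per call, a high-effort cell that
  arrives between two calls is served before older, lower-effort cells that
  are still waiting.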

A.4. Performance measurements

  This section will detail the performance measurements we've done on tor.git
  for handling an INTRODUCE2 cell and then a discussion on how much more CPU
  time we can add (for PoW validation) before it badly degrades our
  performance.

A.4.1. Tor measurements [TOR_MEASUREMENTS]

  In this section we will derive measurement numbers for the "top half" and
  "bottom half" parts of handling an introduction cell.

  These measurements have been done on tor.git at commit
  80031db32abebaf4d0a91c01db258fcdbd54a471.

  We've measured several sets of actions of the INTRODUCE2 cell handling process
  on Intel(R) Xeon(R) CPU E5-2650 v4. Our service was accessed by an array of
  clients that sent introduction requests for a period of 60 seconds.

  1. Full Mainloop Event

     We start by measuring the full time it takes for a mainloop event to
     process an inbuf containing INTRODUCE2 cells. The mainloop event processed
     2.42 cells per invocation on average during our measurements.

     Total measurements: 3279

       Min: 0.30 msec - 1st Q.: 5.47 msec - Median: 5.91 msec
       Mean: 13.43 msec - 3rd Q.: 16.20 msec - Max: 257.95 msec

  2. INTRODUCE2 cell processing (bottom-half)

     We also measured how much time the "bottom half" part of the process
     takes. That's the heavy part of processing an introduction request as seen
     in step (4) of the [MAIN_LOOP] section:

     Total measurements: 7931

       Min: 0.28 msec - 1st Q.: 5.06 msec - Median: 5.33 msec
       Mean: 5.29 msec - 3rd Q.: 5.57 msec - Max: 14.64 msec

  3. Connection data read (top half)

     Now that we have the above pieces, we can use them to measure just the
     "top half" part of the procedure. That's when bytes are taken from the
     connection inbound buffer and parsed into an INTRODUCE2 cell where basic
     validation is done.

     There is an average of 2.42 INTRODUCE2 cells per mainloop event and so we
     divide the full mainloop event mean time by that to get the time for one
     cell. From that we subtract the "bottom half" mean time to get how much
     the "top half" takes:

        => 13.43 / (7931 / 3279) = 5.55
        => 5.55 - 5.29 = 0.26

        Mean: 0.26 msec
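
     The derivation above can be replayed directly from the reported numbers.
     The variable names are chosen for this example; the inputs are the
     measurement counts and means quoted in this section.

```python
MAINLOOP_EVENTS = 3279       # total measurements of the full mainloop event
CELLS_HANDLED = 7931         # total measurements of the bottom half
MAINLOOP_MEAN_MS = 13.43     # mean time of a full mainloop event
BOTTOM_HALF_MEAN_MS = 5.29   # mean time of the bottom half

cells_per_event = CELLS_HANDLED / MAINLOOP_EVENTS     # about 2.42
per_cell_ms = MAINLOOP_MEAN_MS / cells_per_event      # about 5.55
top_half_ms = per_cell_ms - BOTTOM_HALF_MEAN_MS       # about 0.26
bottom_share = BOTTOM_HALF_MEAN_MS / per_cell_ms      # about 0.95
```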

  To summarize, during our measurements the average number of INTRODUCE2 cells
  a mainloop event processed is ~2.42 cells (7931 cells for 3279 mainloop
  invocations).

  This means that, taking the mean of mainloop event times, it takes ~5.55msec
  (13.43/2.42) to completely process an INTRODUCE2 cell. Then if we look deeper
  we see that the "top half" of INTRODUCE2 cell processing takes 0.26 msec on
  average, whereas the "bottom half" takes around 5.29 msec.

  The heaviness of the "bottom half" is to be expected since that's where 95%
  of the total work takes place: in particular the rendezvous path selection
  and circuit launch.

A.5. References

    [REF_EQUIX]: https://github.com/tevador/equix
                 https://github.com/tevador/equix/blob/master/devlog.md
    [REF_TABLE]: The table is based on the script below plus some manual editing for readability:
                 https://gist.github.com/asn-d6/99a936b0467b0cef88a677baaf0bbd04
    [REF_BOTNET]: https://media.kasperskycontenthub.com/wp-content/uploads/sites/43/2009/07/01121538/ynam_botnets_0907_en.pdf
    [REF_CREDS]: https://lists.torproject.org/pipermail/tor-dev/2020-March/014198.html
    [REF_TARGET]: https://en.bitcoin.it/wiki/Target
    [REF_TLS]: https://www.ietf.org/archive/id/draft-nygren-tls-client-puzzles-02.txt
               https://datatracker.ietf.org/doc/html/draft-nir-tls-puzzles-00.html
               https://tools.ietf.org/html/draft-ietf-ipsecme-ddos-protection-10
    [REF_TLS_1]: https://www.ietf.org/archive/id/draft-nygren-tls-client-puzzles-02.txt
    [REF_TEVADOR_1]: https://lists.torproject.org/pipermail/tor-dev/2020-May/014268.html
    [REF_TEVADOR_2]: https://lists.torproject.org/pipermail/tor-dev/2020-June/014358.html
    [REF_TEVADOR_SIM]: https://github.com/mikeperry-tor/scratchpad/blob/master/tor-pow/effort_sim.py#L57
Filename: 328-relay-overload-report.md
Title: Make Relays Report When They Are Overloaded
Author: David Goulet, Mike Perry
Created: November 3rd 2020
Status: Closed

0. Introduction

Many relays are likely sometimes under heavy load in terms of memory, CPU or network resources, which in turn diminishes their ability to efficiently relay data through the network.

Having the capability of learning if a relay is overloaded would allow us to make better informed load balancing decisions. For instance, we can make our bandwidth scanners more intelligent on how they allocate bandwidth based on such metrics from relays.

We could furthermore improve our network health monitoring and pinpoint relays possibly misbehaving or under DDoS attack.

1. Metrics to Report

We propose that relays start collecting several metrics (see section 2) reflecting their loads from different components of tor.

Then, we propose that 1 new line be added to the server descriptor document (see dir-spec.txt, section 2.1.1) for the general overload case.

And 2 new lines to the extra-info document (see dir-spec.txt, section 2.1.2) for more specific overload cases.

The following describes a series of metrics to collect but more might come in the future and thus this is not an exhaustive list.

1.1. General Overload

The general overload line indicates that a relay has reached an "overloaded state" which can be one or many of the following load metrics:

  • Any OOM invocation due to memory pressure
  • Any ntor onionskins are dropped [Removed in tor-0.4.6.11 and 0.4.7.5-alpha]
  • A certain ratio of ntor onionskins dropped. [Added in tor-0.4.6.11 and 0.4.7.5-alpha]
  • TCP port exhaustion
  • DNS timeout reached (X% of timeouts over Y seconds). [Removed in tor-0.4.7.3-alpha]
  • CPU utilization of Tor's mainloop CPU core above 90% for 60 sec [Never implemented]
  • Control port overload (too many messages queued) [Never implemented]

For DNS timeouts, the X and Y are consensus parameters (overload_dns_timeout_scale_percent and overload_dns_timeout_period_secs) defined in param-spec.txt.

The format of the overloaded line added in the server descriptor document is as follows:

"overload-general" SP version SP YYYY-MM-DD HH:MM:SS NL
   [At most once.]

The timestamp is when at least one metric was detected. It should always be at the hour and thus, as an example, "2020-01-10 13:00:00" is an expected timestamp. Because this is a binary state, if the line is present, we consider that it was hit at the very least once somewhere between the provided timestamp and the "published" timestamp of the document which is when the document was generated.

The overload field should remain in place for 72 hours since last triggered. If the limits are reached again in this period, the timestamp is updated, and this 72 hour period restarts.

The 'version' field is set to '1' for the initial implementation of this proposal which includes all the above overload metrics except for the CPU and control port overload.
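
As a rough illustration of the line format and the 72 hour rule, the following Python sketch builds an "overload-general" line and decides whether it should still be published. The helper names are hypothetical. Rounding the timestamp down to the hour and measuring the 72 hours from the moment the state was last triggered are assumptions; the text above only requires a timestamp "at the hour".

```python
from datetime import datetime, timedelta

# The line stays in the descriptor for 72 hours after it was last triggered.
OVERLOAD_LINE_LIFETIME = timedelta(hours=72)


def round_down_to_hour(when: datetime) -> datetime:
    """Drop minutes, seconds and microseconds (assumed rounding rule)."""
    return when.replace(minute=0, second=0, microsecond=0)


def format_overload_general(last_triggered: datetime, version: int = 1) -> str:
    """Build the "overload-general" server descriptor line."""
    stamp = round_down_to_hour(last_triggered).strftime("%Y-%m-%d %H:%M:%S")
    return f"overload-general {version} {stamp}\n"


def should_publish(last_triggered: datetime, now: datetime) -> bool:
    """True while the 72 hour period since the last trigger has not elapsed."""
    return now - last_triggered < OVERLOAD_LINE_LIFETIME


triggered = datetime(2020, 1, 10, 13, 42, 7)
print(format_overload_general(triggered), end="")
```

If an overload metric fires again, the caller would simply replace `last_triggered` with the new time, which restarts the 72 hour period.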

1.2. Token bucket size

Relays should report the 'BandwidthBurst' and 'BandwidthRate' limits in their descriptor, as well as the number of times these limits were reached, for read and write, in the past 24 hours starting at the provided timestamp rounded down to the hour.

The format of this overload line added in the extra-info document is as follows:

"overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS
                      SP rate-limit SP burst-limit
                      SP read-overload-count SP write-overload-count NL
  [At most once.]

The "rate-limit" and "burst-limit" are the raw values from the BandwidthRate and BandwidthBurst found in the torrc configuration file.

The "{read|write}-overload-count" are the counts of how many times the reported limits of burst/rate were exhausted and thus the maximum between the read and write count occurrences. To make the counter more meaningful and to avoid multiple connections saturating the counter when a relay is overloaded, we only increment it once a minute.

The 'version' field is set to '1' for the initial implementation of this proposal.
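
The once-a-minute increment can be sketched as follows in Python. The class and method names are hypothetical, and the sketch assumes that "once a minute" means at least 60 seconds must pass between two increments; a fixed per-minute window would be another valid reading.

```python
from typing import Optional


class OverloadCounter:
    """Counts rate/burst limit exhaustion, at most once per minute."""

    def __init__(self, min_interval_secs: float = 60.0) -> None:
        self.count = 0
        self._min_interval = min_interval_secs
        self._last_increment: Optional[float] = None

    def note_exhausted(self, now: float) -> None:
        """Record that the limit was hit at time `now` (in seconds)."""
        if (self._last_increment is None
                or now - self._last_increment >= self._min_interval):
            self.count += 1
            self._last_increment = now


read_counter = OverloadCounter()
for t in (0.0, 10.0, 59.9, 60.0):
    read_counter.note_exhausted(t)
print(read_counter.count)
```

A relay would keep one such counter for the read side and one for the write side.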

1.3. File Descriptor Exhaustion

Not having enough file descriptors in this day and age is really a misconfiguration or a sign of an operating system that is too old. By reporting it, we can very quickly notice which relays have a limit that is too small and notify their operators.

The format of this overload line added in the extra-info document is as follows:

"overload-fd-exhausted" SP version SP YYYY-MM-DD HH:MM:SS NL
  [At most once.]

As with the overload-general line, the timestamp indicates that the maximum was reached between this timestamp and the "published" timestamp of the document.

This overload field should remain in place for 72 hours since last triggered. If the limits are reached again in this period, the timestamp is updated, and this 72 hour period restarts.

The 'version' field is set to '1' for the initial implementation of this proposal which detects fd exhaustion only when a socket open fails.

2. Load Metrics

This section proposes a series of metrics that should be collected and reported to the MetricsPort. The Prometheus format (the only one supported for now) is described for each metric.

2.1 Out-Of-Memory (OOM) Invocation

Tor's OOM manages caches and queues of all sorts. Relays have many of them and so any invocation of the OOM should be reported.

# HELP Total number of bytes the OOM has cleaned up
# TYPE counter
tor_relay_load_oom_bytes_total{<LABEL>} <VALUE>

Running counter of how many bytes were cleaned up by the OOM for a tor component identified by a label (see list below). To make sense, this should be visualized with the rate() function.

Possible LABELs for which the OOM was triggered:

  • subsys=cell: Circuit cell queue
  • subsys=dns: DNS resolution cache
  • subsys=geoip: GeoIP cache
  • subsys=hsdir: Onion service descriptors

2.2 Onionskin Queues

Onionskin handling is one of the few items that tor processes in parallel, but onionskins can be dropped for various reasons when under load. For this metric to make sense, we also need to gather how many onionskins we are processing, so that one can compute a total processed versus dropped ratio:

# HELP Total number of onionskins
# TYPE counter
tor_relay_load_onionskins_total{<LABEL>} <NUM>

Possible LABELs are:

  • type=<handshake_type>: Type of handshake of that onionskin.
    • Possible values: ntor, tap, fast
  • action=processed: Indicating how many were processed.
  • action=dropped: Indicating how many were dropped due to load.
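
As an illustration of how a consumer might derive the ratio, the Python sketch below reads samples of this metric and computes a dropped ratio per handshake type. The function names are hypothetical. It assumes that label values may be quoted as in standard Prometheus output, and that "processed" and "dropped" are disjoint counts, so the ratio is dropped / (processed + dropped).

```python
import re
from collections import defaultdict

SAMPLE_RE = re.compile(
    r"^tor_relay_load_onionskins_total\{(?P<labels>[^}]*)\}\s+(?P<value>\d+)$"
)
ACTIONS = ("processed", "dropped")


def parse_labels(raw: str) -> dict:
    """Turn 'type="ntor",action="dropped"' into a dict."""
    labels = {}
    for pair in raw.split(","):
        key, _, value = pair.partition("=")
        labels[key.strip()] = value.strip().strip('"')
    return labels


def dropped_ratios(exposition: str) -> dict:
    """Return {handshake_type: dropped / (processed + dropped)}."""
    counts = defaultdict(lambda: dict.fromkeys(ACTIONS, 0))
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match is None:
            continue  # comment lines, blank lines, other metrics
        labels = parse_labels(match.group("labels"))
        hs_type, action = labels.get("type"), labels.get("action")
        if hs_type is None or action not in ACTIONS:
            continue
        counts[hs_type][action] += int(match.group("value"))
    ratios = {}
    for hs_type, c in counts.items():
        total = c["processed"] + c["dropped"]
        ratios[hs_type] = c["dropped"] / total if total else 0.0
    return ratios


# Made-up sample values, for illustration only.
sample = """
tor_relay_load_onionskins_total{type="ntor",action="processed"} 980
tor_relay_load_onionskins_total{type="ntor",action="dropped"} 20
"""
print(dropped_ratios(sample))
```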

2.3 File Descriptor Exhaustion

Relays can reach a "ulimit" (on Linux) cap that is the number of allowed opened file descriptors. In Tor's use case, this is mostly sockets. File descriptors should be reported as follows:

# HELP Total number of sockets
# TYPE gauge
tor_relay_load_socket_total{<LABEL>} <NUM>

Possible LABELs are:

  • (no label): How many available sockets.
  • state=opened: How many sockets are opened.

Note: since tor does track that value in order to reserve a block for critical ports such as the Control Port, that value can easily be exported.

2.4 TCP Port Exhaustion

The TCP protocol is capped at 65535 ports and thus if the relay is ever unable to open more outbound sockets, that is an overloaded state. It should be reported:

# HELP Total number of times we ran out of TCP ports
# TYPE counter
tor_relay_load_tcp_exhaustion_total <NUM>

2.5 Connection Bucket Limit

Rate limited connections track bandwidth using a bucket system. Once the bucket is empty and tor wants to send more, it pauses until it is refilled a second later. Each time that happens, it should be reported:

# HELP Total number of global connection bucket limit reached
# TYPE counter
tor_relay_load_global_rate_limit_reached_total{<LABEL>} <NUM>

Possible LABELs are:

  • side=read: Read side of the global rate limit bucket.
  • side=write: Write side of the global rate limit bucket.
Filename: 329-traffic-splitting.txt
Title: Overcoming Tor's Bottlenecks with Traffic Splitting
Author: David Goulet, Mike Perry
Created: 2020-11-25
Status: Finished

0. Status

  This proposal describes the Conflux [CONFLUX] system developed by
  Mashael AlSabah, Kevin Bauer, Tariq Elahi, and Ian Goldberg. It aims at
  improving Tor client network performance by dynamically splitting
  traffic between two circuits. We have made several additional improvements
  to the original Conflux design, by making use of congestion control
  information, as well as updates from Multipath TCP literature.


1. Overview

1.1. Multipath TCP Design Space

  In order to understand our improvements to Conflux, it is important to
  properly conceptualize what is involved in the design of multipath
  algorithms in general.

  The design space is broken into two orthogonal parts: congestion control
  algorithms that apply to each path, and traffic scheduling algorithms
  that decide which packets to send on each path.

  MPTCP specifies 'coupled' congestion control (see [COUPLED]). Coupled
  congestion control updates single-path congestion control algorithms to
  account for shared bottlenecks between the paths, so that the combined
  congestion control algorithms do not overwhelm any bottlenecks that
  happen to be shared between the multiple paths. Various ways of
  accomplishing this have been proposed and implemented in the Linux
  kernel.

  Because Tor's congestion control only concerns itself with bottlenecks in
  Tor relay queues, and not with any other bottlenecks (such as
  intermediate Internet routers), we can avoid this complexity merely by
  specifying that any paths that are constructed SHOULD NOT share any
  relays (except for the exit). This assumption is valid, because non-relay
  bottlenecks are managed by TCP of client-to-relay and relay-to-relay OR
  connections, and not Tor's circuit-level congestion control. In this way,
  we can proceed to use the exact same congestion control as specified in
  [PROP324], for each path.

  For this reason, this proposal will focus on protocol specification, and
  the traffic scheduling algorithms, rather than coupling. Note that the
  scheduling algorithms are currently in flux, and will be subject to
  change as we tune them in Shadow, on the live network, and for future
  UDP implementation (see [PROP339]). This proposal will be kept up to
  date with the current implementation.

1.2. Divergence from the initial Conflux design

  The initial [CONFLUX] paper doesn't provide any indications on how to
  handle the size of the out-of-order cell queue, which we consider a
  potentially dangerous memory DoS vector (see [MEMORY_DOS]). It also used
  RTT as the sole heuristic for selecting which circuit to send on (which
  may vary depending on the geographical locations of the participant
  relays), without considering their actual available circuit capacity
  (which will be available to us via Proposal 324). Additionally, since
  the publication of [CONFLUX], more modern packet scheduling algorithms
  have been developed, which aim to reduce out-of-order queue size.

  We propose mitigations for these issues using modern scheduling
  algorithms, as well as implementation options for avoiding the
  out-of-order queue at Exit relays. Additionally, we consider resumption,
  side channel, and traffic analysis risks and benefits in [RESUMPTION],
  [SIDE_CHANNELS] and [TRAFFIC_ANALYSIS].

1.3. Design Overview

  The following section describes the Conflux design.

  The circuit construction is as follows:

         Primary Circuit (lower RTT)
            +-------+      +--------+
            |Guard 1|----->|Middle 1|----------+
            +---^---+      +--------+          |
   +-----+      |                           +--v---+
   | OP  +------+                           | Exit |--> ...
   +-----+      |                           +--^---+
            +---v---+      +--------+          |
            |Guard 2|----->|Middle 2|----------+
            +-------+      +--------+
         Secondary Circuit (higher RTT)

  Both circuits are built using current Tor path selection, however they
  SHOULD NOT share the same Guard relay, or middle relay. By avoiding
  using the same relays in these positions in the path, we ensure
  additional path capacity, and eliminate the need to use more complicated
  'coupled' congestion control algorithms from the MPTCP
  literature[COUPLED].  This both simplifies design, and improves
  performance.

  Then, the OP needs to link the two circuits together, as described in
  [CONFLUX_HANDSHAKE].

  For ease of explanation, the primary circuit is the circuit that is
  more desirable to use, as per the scheduling algorithm, and the secondary
  circuit is used after the primary is blocked by congestion control. Note
  that for some algorithms, this selection becomes fuzzy, but all of them
  favor the circuit with lower RTT, at the beginning of transmission.

  Note also that this notion of primary vs secondary is a local property
  of the current sender: each endpoint may have different notions of
  primary, secondary, and current sending circuit. They also may use
  different scheduling algorithms to determine this.

  Initial RTT is measured during circuit linking, as described in
  [CONFLUX_HANDSHAKE]. After the initial link, RTT is continually measured
  using SENDME timing, as in Proposal 324. This means that during use,
  the primary circuit and secondary circuit may switch roles, depending on
  unrelated network congestion caused by other Tor clients.

  We also support linking onion service circuits together. In this case,
  only two rendezvous circuits are linked. Each of these RP circuits will
  be constructed separately, and then linked. However, the same path
  constraints apply to each half of the circuits (no shared relays between
  the legs). If, by chance, the service and the client sides end up
  sharing some relays, this is not catastrophic. Multipath TCP researchers
  we have consulted (see [ACKNOWLEDGMENTS]), believe Tor's congestion
  control from Proposal 324 to be sufficient in this rare case.

  In the algorithms we recommend here, only two circuits will be linked
  together at a time.  However, implementations SHOULD support more than
  two paths, as this has been shown to assist in traffic analysis
  resistance[WTF_SPLIT], and will also be useful for maintaining a desired
  target RTT, for UDP VoIP applications.

  If the number of circuits exceeds the current number of guard relays,
  guard relays MAY be re-used, but implementations SHOULD use the same
  number of Guards as paths.

  Linked circuits MUST NOT be extended further once linked (ie:
  'cannibalization' is not supported).


2. Protocol Mechanics

2.1. Advertising support for conflux

2.1.1 Relay

  We propose a new protocol version in order to advertise support for
  circuit linking on the relay side:

     "Conflux=1" -- Relay supports Conflux as in linking circuits together using
                    the new LINK, LINKED and SWITCH relay commands.

2.1.2 Onion Service

  We propose to add a new line in order to advertise conflux support in the
  encrypted section of the onion service descriptor:

    "conflux" SP max-num-circ SP desired-ux NL

      The "max-num-circ" value indicates the maximum number of rendezvous
      circuits that are allowed to be linked together.

  We let the service specify the conflux algorithm to use, when sending data
  to the service. Some services may prefer latency, whereas some may prefer
  throughput. However, clients also have the ability to request their own UX
  for data that the service sends, in the LINK handshake below, in part
  because the high-throughput algorithms will require more out-of-order queue
  memory, which may be infeasible on mobile.

  The next section describes how the circuits are linked together.

2.2. Conflux Handshake [CONFLUX_HANDSHAKE]

  To link circuits, we propose new relay commands that are sent on both
  circuits, as well as a response to confirm the join, and an ack of this
  response. These commands create a 3-way handshake, which allows each
  endpoint to measure the initial RTT of each leg upon link, without
  needing to wait for any data.

  All three stages of this handshake are sent on *each* circuit leg to be
  linked.

  When packed cells are a reality (proposal 340), these cells SHOULD be
  combined with the initial RELAY_BEGIN cell on the faster circuit leg.
  This combination also allows better enforcement against side channels.
  (See [SIDE_CHANNELS]).

  There are other ways to do this linking that we have considered, but they
  seem not to be significantly better than this method, especially since we can
  use Proposal 340 to eliminate the RTT cost of this setup before sending data.
  For those other ideas, see [ALTERNATIVE_LINKING] and [ALTERNATIVE_RTT], in
  the appendix.

  The first two parts of the handshake establish the link, and enable
  resumption:

    19 -- RELAY_CONFLUX_LINK

          Sent from the OP to the exit/service in order to link circuits
          together at the end point.

    20 -- RELAY_CONFLUX_LINKED

          Sent from the exit/service to the OP, to confirm the circuits were
          linked.

  The contents of these two cells are exactly the same. They have the
  following contents:

    VERSION   [1 byte]
    PAYLOAD   [variable, up to end of relay payload]

  The VERSION tells us which circuit linking mechanism to use. At this
  point in time, only 0x01 is recognized and is the one described by the
  Conflux design.

  For version 0x01, the PAYLOAD contains:

     NONCE              [32 bytes]
     LAST_SEQNO_SENT    [8 bytes]
     LAST_SEQNO_RECV    [8 bytes]
     DESIRED_UX         [1 byte]

  The NONCE contains a random 256-bit secret, used to associate the two
  circuits together. The nonce MUST NOT be shared outside of the circuit
  transmission, or data may be injected into TCP streams. This means it
  MUST NOT be logged to disk.

  The two sequence number fields are 0 upon initial link, but non-zero in
  the case of a reattach or resumption attempt (See [CONFLUX_SET_MANAGEMENT]
  and [RESUMPTION]).

  The DESIRED_UX field allows the endpoint to request the UX properties
  it wants. The other endpoint SHOULD select the best known scheduling
  algorithm, for these properties. The endpoints do not need to agree
  on which UX style they prefer.

  The UX properties are:

    0 - NO_OPINION
    1 - MIN_LATENCY
    2 - LOW_MEM_LATENCY
    3 - HIGH_THROUGHPUT
    4 - LOW_MEM_THROUGHPUT

  The algorithm choice is performed by the *sender* of data (ie: the
  receiver of the PAYLOAD). The receiver of data (sender of the PAYLOAD)
  does not need to be aware of the exact algorithm in use, but MAY enforce
  expected properties (particularly low queue usage, in the case of requesting
  either LOW_MEM_LATENCY or LOW_MEM_THROUGHPUT). The receiver MAY close the
  entire conflux set if these properties are violated.
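
  The version 0x01 body of a LINK or LINKED cell can be sketched in Python as
  follows. The function and constant names are hypothetical, and the use of
  network byte order for the integer fields is an assumption of this sketch.

```python
import secrets
import struct

CONFLUX_LINK_VERSION_1 = 0x01

# DESIRED_UX values listed above.
UX_NO_OPINION = 0
UX_MIN_LATENCY = 1
UX_LOW_MEM_LATENCY = 2
UX_HIGH_THROUGHPUT = 3
UX_LOW_MEM_THROUGHPUT = 4

# VERSION (1) | NONCE (32) | LAST_SEQNO_SENT (8) | LAST_SEQNO_RECV (8)
# | DESIRED_UX (1). "!" selects network byte order (assumed) with no padding.
_LINK_V1 = struct.Struct("!B32sQQB")


def pack_link(nonce: bytes, last_seqno_sent: int = 0,
              last_seqno_recv: int = 0,
              desired_ux: int = UX_NO_OPINION) -> bytes:
    """Encode a version 0x01 LINK/LINKED body."""
    if len(nonce) != 32:
        raise ValueError("NONCE must be exactly 32 bytes")
    return _LINK_V1.pack(CONFLUX_LINK_VERSION_1, nonce,
                         last_seqno_sent, last_seqno_recv, desired_ux)


def unpack_link(body: bytes) -> dict:
    """Decode a version 0x01 LINK/LINKED body; trailing bytes are ignored."""
    version, nonce, sent, recv, ux = _LINK_V1.unpack(body[:_LINK_V1.size])
    if version != CONFLUX_LINK_VERSION_1:
        raise ValueError("unrecognized conflux link version")
    return {"nonce": nonce, "last_seqno_sent": sent,
            "last_seqno_recv": recv, "desired_ux": ux}


# Initial link: fresh random nonce, both sequence numbers are 0.
nonce = secrets.token_bytes(32)  # secret; never write this to a log
body = pack_link(nonce, desired_ux=UX_MIN_LATENCY)
print(len(body))
```

  The same nonce would be sent on every leg of the set, since it is what
  lets the other endpoint associate the circuits with one another.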

  If either circuit does not receive a RELAY_CONFLUX_LINKED response, both
  circuits MUST be closed.

  The third stage of the handshake exists to help the exit/service measure
  initial RTT, for use in [SCHEDULING]:

    21 -- RELAY_CONFLUX_LINKED_ACK

          Sent from the OP to the exit/service, to provide initial RTT
          measurement for the exit/service.

  These three relay commands are sent on *each* leg, to allow each endpoint to
  measure the initial RTT of each leg.

  The client SHOULD abandon and close the circuit if the LINKED message takes too
  long to arrive. This timeout MUST be no larger than the normal SOCKS/stream
  timeout in use for RELAY_BEGIN, but MAY be the Circuit Build Timeout value,
  instead. (The C-Tor implementation currently uses Circuit Build Timeout).

  See [SIDE_CHANNELS] for rules for when to reject unexpected handshake cells.

2.2. Linking Circuits from OP to Exit [LINKING_EXIT]

  To link exit circuits, two circuits to the same exit are built, with
  additional restrictions such that they do not share Guard or Middle
  relays. This restriction applies via specific relay identity keys,
  and not IP addresses, families, or networks. (This is because the purpose
  of it is to avoid sharing a bottleneck *inside* relay circuit queues;
  bottlenecks caused by a shared network are handled by TCP's congestion
  control on the OR conns).

  Bridges also are subject to the same constraint as Guard relays;
  the C-Tor codebase emits a warning if only one bridge is configured, unless
  that bridge has transport "snowflake". Snowflake is exempt from this
  Guard restriction because it is actually backed by many bridges. In the
  bridge case, we only warn, and do not refuse to build conflux circuits,
  because it is not catastrophic that Bridges are shared, it is just
  sub-optimal for performance and congestion.

  When each circuit is opened, we ensure that congestion control
  has been negotiated. If congestion control negotiation has failed, the
  circuit MUST be closed. After this, the linking handshake begins.

  The RTT times between RELAY_CONFLUX_LINK and RELAY_CONFLUX_LINKED are
  measured by the client, to determine primary vs secondary circuit use,
  and for packet scheduling. Similarly, the exit measures the RTT times
  between RELAY_CONFLUX_LINKED and RELAY_CONFLUX_LINKED_ACK, for the same
  purpose.

  Because of the race between initial data and the RELAY_CONFLUX_LINKED_ACK
  cell, conditions can arise where an Exit needs to send data before the
  slowest circuit delivers this ACK. In these cases, it should prefer sending
  data on the circuit that has delivered the ACK (which will arrive immediately
  prior to any data from the client). This circuit will be the lower RTT
  circuit anyway, but the code needs to handle the fact that in this case,
  there won't yet be an RTT for the second circuit.

2.3. Linking circuits to an onion service [LINKING_SERVICE]

  For onion services, we will only concern ourselves with linking
  rendezvous circuits.

  To join rendezvous circuits, clients make two introduce requests to a
  service's intropoint, causing it to create two rendezvous circuits, to
  meet the client at two separate rendezvous points. These introduce
  requests MUST be sent to the same intropoint (due to potential use of
  onionbalance), and SHOULD be sent back-to-back on the same intro
  circuit. They MAY be combined with Proposal 340. (Note that if we do not
  use Prop340, we will have to raise the limit on number of intros per
  client circuit to 2, here, at intropoints).

  When rendezvous circuits are built, they should use the same Guard,
  Bridge, and Middle restrictions as specified in [LINKING_EXIT], for Exits. These
  restrictions SHOULD also take into consideration all Middles in the path,
  including the rendezvous point. All relay identities should be unique
  (again, except for when the Snowflake transport is in use). The one
  special case here is if the chosen rendezvous points by a client
  are the same as the service's guards. In this case, the service SHOULD
  NOT use different guards, but instead stick with the guards it has.
  The reason for this is that we do not want to create the ability
  for a client to force a service to use different guards.

  The first rendezvous circuit to get joined SHOULD use Proposal 340 to
  append the RELAY_BEGIN command, and the service MUST answer on this
  circuit, until RTT can be measured.

  Once both circuits are linked and RTT is measured, packet scheduling
  MUST be used, as per [SCHEDULING].

2.4. Conflux Set Management [CONFLUX_SET_MANAGEMENT]

  When managing legs, it is useful to separate sets that have completed the
  link handshake from legs that are still performing the handshake. Linked
  sets MAY have additional unlinked legs on the way, but these should not
  be used for sending data until the handshake is complete.

  It is also useful to enforce various additional conditions on the handshake,
  depending on if [RESUMPTION] is supported, and if a leg has been launched
  because of an early failure, or due to a desire for replacement.

2.4.1. Pre-Building Sets

  In C-Tor, conflux is only used via circuit prebuilding. Pre-built conflux
  sets are preferred over other pre-built circuits, but if the pre-built pool
  ends up empty, normal pre-built circuits are used. If those run out, regular
  non-conflux circuits are built. In other words, in C-Tor, conflux sets are
  never built on-demand, but this is strictly an implementation decision, to
  simplify dealing with the C-Tor codebase.

  The consensus parameter 'cfx_max_prebuilt_set' specifies the number of
  sets to pre-build.

  During upgrade, the consensus parameter 'cfx_low_exit_threshold' will be
  used, so that if there is a low number of conflux-supporting exits, only
  one conflux set will be built.

2.4.2. Set construction

  When a set is launched, legs begin the handshake in the unlinked state.
  As handshakes complete, finalization is attempted, to create a linked set.
  On the client, this finalization happens upon receipt of the LINKED cell.
  On the exit/service, this finalization happens upon *sending* the LINKED
  cell.

  The initiator of this handshake considers the set fully linked once the
  RELAY_CONFLUX_LINKED_ACK is sent (roughly upon receipt of the LINKED cell).
  Because of the potential race between LINKED_ACK, and initial data sent by
  the client, the receiver of the handshake must consider a leg linked at
  the time of *sending* a LINKED cell, without waiting for the LINKED_ACK.

  This means that exit legs may not have an RTT measurement, if data on the
  faster leg beats the LINKED_ACK on the slower leg. The implementation MUST
  account for this, by treating unmeasured legs as having infinite RTT.
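
  A minimal Python sketch of this rule, with hypothetical type and function
  names: a leg without a measurement is compared as if its RTT were
  infinite, so any measured leg is preferred over it.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leg:
    name: str
    # None until the handshake has produced an RTT measurement.
    rtt_msec: Optional[float] = None


def effective_rtt(leg: Leg) -> float:
    """Treat an unmeasured leg as having infinite RTT."""
    return math.inf if leg.rtt_msec is None else leg.rtt_msec


def pick_lowest_rtt_leg(legs: List[Leg]) -> Leg:
    """Return the leg with the lowest effective RTT."""
    return min(legs, key=effective_rtt)


# The slower leg's LINKED_ACK has not arrived yet, so it has no RTT.
legs = [Leg("slow"), Leg("fast", rtt_msec=120.0)]
print(pick_lowest_rtt_leg(legs).name)
```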

  When attempting to finalize a set, this finalization should not complete
  if any unlinked legs are still pending.

2.4.3. Closing circuits

  When an unlinked circuit is closed, the origin SHOULD immediately launch a
  replacement leg, subject to the limits in [SIDE_CHANNELS].

  In C-Tor, we do not support arbitrary resumption. Therefore, we perform
  some additional checks upon closing circuits, to decide if we should
  immediately tear down the entire set:
     - If the closed leg was the current sending leg, close the set
     - If the closed leg had the highest non-zero last_seq_recv/sent, close the set
     - If data was in progress on a closed leg (inflight > cc_sendme_inc), then
       all legs must be closed
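
  The checks above might be sketched in Python as follows. This is one
  reading of the list, not the C-Tor code: the names are hypothetical, and
  "highest non-zero" is interpreted here as strictly greater than the
  maximum of the remaining legs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LegState:
    last_seq_sent: int
    last_seq_recv: int
    inflight: int
    is_current_sending_leg: bool = False


def should_close_set(closed: LegState, remaining: List[LegState],
                     cc_sendme_inc: int) -> bool:
    """Decide whether closing `closed` must tear down the whole set."""
    # The leg we were sending on is gone.
    if closed.is_current_sending_leg:
        return True
    # The closed leg was ahead of every remaining leg (assumed reading).
    max_sent = max((leg.last_seq_sent for leg in remaining), default=0)
    max_recv = max((leg.last_seq_recv for leg in remaining), default=0)
    if closed.last_seq_sent > max_sent or closed.last_seq_recv > max_recv:
        return True
    # Data was still in flight on the closed leg.
    if closed.inflight > cc_sendme_inc:
        return True
    return False


active = LegState(last_seq_sent=100, last_seq_recv=50, inflight=0,
                  is_current_sending_leg=True)
idle = LegState(last_seq_sent=0, last_seq_recv=0, inflight=0)
# The value 31 for cc_sendme_inc is only an example input.
print(should_close_set(idle, [active], cc_sendme_inc=31))
```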

2.4.4. Reattaching Legs

  While C-Tor does not support arbitrary resumption, new legs *can* be
  attached, so long as there is no risk of data loss from a closed leg.
  This enables latency probing, which will be important for UDP VoIP.

  Currently, the C-Tor codebase checks for data loss by verifying that
  the LINK/LINKED cell has a lower last_seq_sent than all current
  legs' maximum last_seq_recv, and a lower last_seq_recv than all
  current legs' last_seq_sent.

  This check is performed on finalization, not the receipt of first
  handshake cell. This gives the data additional time to arrive.

2.5. Congestion Control Application [CONGESTION_CONTROL]

  The SENDMEs for congestion control are performed per-leg. As soon as
  data arrives, regardless of its ordering, it is counted towards SENDME
  delivery. In this way, 'cwnd - inflight' of each leg always reflects
  the available data to send on each leg. This is important for
  [SCHEDULING].

  The Congestion control Stream XON/XOFF can be sent on either leg, and
  applies to the stream's transmission on both legs.

  In C-Tor, streams used to become blocked as soon as the OR conn
  of their circuit was blocked. Because conflux can send on the other
  circuit, which uses a different OR conn, this form of stream blocking
  has been decoupled from the OR conn status, and only happens when
  congestion control has decided that all circuits are blocked (congestion
  control becomes blocked when either 'cwnd - inflight <= 0', *or* when
  the local OR conn is blocked, so if all local OR conns of a set are
  blocked, then the stream will block that way).
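
  The blocking rule described above can be sketched in Python as follows
  (hypothetical names): a leg is blocked when it has no congestion window
  left or its local OR conn is blocked, and a stream only blocks when every
  leg of the set is blocked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Leg:
    cwnd: int
    inflight: int
    orconn_blocked: bool = False


def leg_is_blocked(leg: Leg) -> bool:
    """Blocked by congestion control or by the local OR conn."""
    return leg.cwnd - leg.inflight <= 0 or leg.orconn_blocked


def stream_is_blocked(legs: List[Leg]) -> bool:
    """A stream blocks only when all legs of the set are blocked."""
    return all(leg_is_blocked(leg) for leg in legs)


legs = [Leg(cwnd=100, inflight=100), Leg(cwnd=100, inflight=40)]
print(stream_is_blocked(legs))
```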

  Note also that because congestion control only covers RELAY_COMMAND_DATA
  cells, for all algorithms, a special case must be made such that if no
  circuit is available to send on due to congestion control blocking,
  commands other than RELAY_COMMAND_DATA MUST be sent on the current
  circuit, even if the cell scheduler believes that no circuit is available.
  Depending on the code structure of Arti, this special case may or may
  not be necessary. It arises in C-Tor because nothing can block the
  sending of arbitrary non-DATA relay command cells.

2.6. Sequencing [SEQUENCING]

  With multiple paths for data, the problem of data re-ordering appears.
  In other words, cells can arrive out of order from the two circuits
  where cell N + 1 arrives before the cell N.

  Handling this reordering operates after congestion control for each
  circuit leg, but before relay cell command processing or stream data
  delivery.

  For the receiver to be able to reorder the receiving cells, a sequencing
  scheme needs to be implemented. However, because Tor does not drop or
  reorder packets inside of a circuit, this sequence number can be very
  small. It only has to signal that a cell comes after those arriving on
  another circuit.

  To achieve this, we propose a new relay command used to indicate a switch to
  another leg:

    22 -- RELAY_CONFLUX_SWITCH

          Sent from a sending endpoint when switching legs in an
          already linked circuit construction. This message is sent on the leg
          that will be used for new traffic, and tells the receiver the size of
          the gap since the last data (if any) sent on that leg.

  The cell payload format is:

    SeqNum  [4 bytes]

  The "SeqNum" value is a relative sequence number, which is the difference
  between the last absolute sequence number sent on the new leg and the last
  absolute sequence number sent on all other legs prior to the switch. In this
  way, the endpoint knows what to increment its local absolute sequence number
  by, before cells start to arrive.

  To achieve this, the sender must maintain the last absolute sequence sent for
  each leg, and the receiver must maintain the last absolute sequence number
  received for each leg.

  As an example, let's say we send 10 cells on the first leg, so our absolute
  sequence number is 10. If we then switch to the second leg, it is trivial to
  see that we should send a SWITCH with 10 as the relative sequence number, to
  indicate that regardless of the order in which the first cells are received,
  subsequent cells on the second leg should start counting at 10.

  However, if we then send 21 cells on this leg, our local absolute sequence
  number as the sender is 31. So when we switch back to the first leg, where
  the last absolute sequence sent was 10, we must send a SWITCH cell with 21,
  so that when the first leg receives subsequent cells, it assigns those cells
  an absolute sequence number starting at 31.
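
  The bookkeeping in this example can be sketched in Python as follows. The
  class and method names are hypothetical, and the sketch assumes that only
  data cells advance the absolute sequence number (the SWITCH cell itself
  does not).

```python
class ConfluxSender:
    """Tracks the absolute sequence number and the last one sent per leg."""

    def __init__(self, leg_names, initial_leg):
        self.abs_seq = 0
        self.last_seq_sent = {name: 0 for name in leg_names}
        self.current = initial_leg

    def send_data(self, n=1):
        """Send n data cells on the current leg."""
        self.abs_seq += n
        self.last_seq_sent[self.current] = self.abs_seq

    def switch_to(self, leg):
        """Switch legs; return the SeqNum to put in the SWITCH cell."""
        relative = self.abs_seq - self.last_seq_sent[leg]
        self.last_seq_sent[leg] = self.abs_seq
        self.current = leg
        return relative


class ConfluxReceiver:
    """Rebuilds absolute sequence numbers from per-leg counters."""

    def __init__(self, leg_names):
        self.last_seq_recv = {name: 0 for name in leg_names}

    def on_switch(self, leg, relative):
        """Advance the leg's counter by the gap announced in SWITCH."""
        self.last_seq_recv[leg] += relative

    def on_data(self, leg):
        """Return the absolute sequence number of an arriving data cell."""
        self.last_seq_recv[leg] += 1
        return self.last_seq_recv[leg]


sender = ConfluxSender(["leg1", "leg2"], initial_leg="leg1")
sender.send_data(10)
print(sender.switch_to("leg2"))  # SeqNum for the first SWITCH
sender.send_data(21)
print(sender.switch_to("leg1"))  # SeqNum for the second SWITCH
```

  On the receiving side, applying the second SWITCH to the first leg moves
  that leg's counter from 10 to 31, so data arriving on it afterwards is
  counted on from there, whatever order the two legs deliver in.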

  In the rare event that we send more than 2^31 cells (~1TB) on a single leg,
  the leg should be switched in order to reset that relative sequence number to
  fit within 4 bytes.

  For a discussion of rules to rate limit the usage of SWITCH as a side
  channel, see [SIDE_CHANNELS].
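
  The worked example above can be reproduced with a small sender-side
  sketch. The class and helper names are hypothetical, and the payload
  encoding (4-byte unsigned, network byte order) is an assumption made
  for illustration, not something this section specifies.

```python
import struct


class SwitchSender:
    """Tracks the last absolute sequence number sent on each leg."""

    def __init__(self, legs):
        self.last_seq_sent = {leg: 0 for leg in legs}
        self.current = legs[0]

    def send_cells(self, count):
        # Multiplexed cells advance the absolute sequence number of the
        # leg that is currently in use.
        self.last_seq_sent[self.current] += count

    def switch_to(self, new_leg):
        # Relative SeqNum: the gap between the highest absolute sequence
        # number sent on any leg and the last one sent on the new leg.
        highest = max(self.last_seq_sent.values())
        relative = highest - self.last_seq_sent[new_leg]
        self.last_seq_sent[new_leg] = highest
        self.current = new_leg
        return relative


def encode_switch_payload(relative_seq):
    # Assumption: SeqNum is a 4-byte unsigned integer in network order.
    return struct.pack("!I", relative_seq)


sender = SwitchSender(["leg1", "leg2"])
sender.send_cells(10)               # 10 cells on the first leg
first = sender.switch_to("leg2")    # SWITCH carries 10
sender.send_cells(21)               # 21 cells on the second leg
second = sender.switch_to("leg1")   # SWITCH carries 21
```

  The receiver would apply the same relative value to its own
  per-leg counter before counting newly arriving cells.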

2.7. Resumption [RESUMPTION]

  In the event that a circuit leg is destroyed, it MAY be resumed.
  Full resumption is not supported in C-Tor, but is possible to implement,
  at the expense of always storing roughly a congestion window of
  already-transmitted data on each endpoint, in the worst case. Simpler
  forms of resumption, where there is no data loss, are supported. This
  is important to support latency probing, for ensuring UDP VoIP minimum
  RTT requirements are met (roughly 300-500ms, depending on VoIP
  implementation).

  Resumption is achieved by re-using the NONCE to the same endpoint
  (either [LINKING_EXIT] or [LINKING_SERVICE]). The resumed path need
  not use the same middle and guard relays as the destroyed leg(s), but
  SHOULD NOT share any relays with any existing leg(s).

  If data loss has been detected upon a link handshake, resumption can be
  achieved by sending a switch cell, which is immediately followed by the
  missing data. Roughly, each endpoint must check:
    - if cell.last_seq_recv <
         min(max(legs.last_seq_sent),max(closed_legs.last_seq_sent)):
      - send a switch cell immediately with missing data:
        (last_seq_sent - cell.last_seq_recv)

  If an endpoint does not have this missing data due to memory pressure,
  that endpoint MUST destroy *both* legs, as this represents unrecoverable
  data loss.

  Re-transmitters MUST NOT re-increment their absolute sent fields
  while re-transmitting.
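
  A rough sketch of the check above follows. The function name, the
  dictionary used as a retransmit buffer, and the reading of
  "last_seq_sent" as the bound computed by the min/max expression are
  all assumptions; the text above only describes the check roughly.

```python
def plan_resumption(peer_last_seq_recv, live_last_sent, closed_last_sent,
                    retransmit_buffer):
    """Return the cells to resend after a SWITCH, [] if nothing is
    missing, or None if the loss is unrecoverable.

    live_last_sent / closed_last_sent: non-empty lists of last_seq_sent
    values for live and closed legs (hypothetical inputs).
    retransmit_buffer: dict mapping absolute sequence number -> cell.
    """
    bound = min(max(live_last_sent), max(closed_last_sent))
    if peer_last_seq_recv >= bound:
        return []  # the peer has everything up to the bound

    missing = range(peer_last_seq_recv + 1, bound + 1)
    if any(seq not in retransmit_buffer for seq in missing):
        # Data was freed (e.g. under memory pressure): the caller MUST
        # destroy the legs, as this is unrecoverable data loss.
        return None

    # Resending does not touch any last_seq_sent counter, matching the
    # rule that re-transmitters MUST NOT re-increment their sent fields.
    return [retransmit_buffer[seq] for seq in missing]


buffered = {8: b"cell8", 9: b"cell9", 10: b"cell10"}
resend = plan_resumption(7, [10], [10], buffered)
```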

  It is even possible to resume conflux circuits where both legs have been
  collapsed using this scheme, if endpoints continue to buffer their
  unacked package_window data for some time after this close. However, see
  [TRAFFIC_ANALYSIS] for more details on the full scope of this issue.

  If endpoints are buffering package_window data, such data should be
  given priority to be freed in any oomkiller invocation. See [MEMORY_DOS]
  for more oomkiller information.

2.8. Data transmission

  Most cells in Tor are circuit-specific, and should only be sent on a
  circuit, even if that circuit is part of a conflux set. Cells that
  are not multiplexed do not count towards the conflux sequence numbers.

  However, in addition to the obvious RELAY_COMMAND_DATA, a subset of cells
  MUST ALSO be multiplexed, so that their ordering is preserved when they
  arrive at the other end. These cells do count towards conflux sequence
  numbers, and are handled in the out-of-order queue, to preserve ordered
  delivery:
    RELAY_COMMAND_BEGIN
    RELAY_COMMAND_DATA
    RELAY_COMMAND_END
    RELAY_COMMAND_CONNECTED
    RELAY_COMMAND_RESOLVE
    RELAY_COMMAND_RESOLVED
    RELAY_COMMAND_XOFF
    RELAY_COMMAND_XON

  Currently, this set is the same as the set of cells that have stream ID,
  but the property that leads to this requirement is not stream usage by
  itself, it is that these cells must be ordered with respect to all data
  on the circuit. It is not impossible that future relay commands could be
  invented that don't have stream IDs, but yet must still arrive in order
  with respect to circuit data cells. Prop#253 is one possible example of
  such a thing (though we won't be implementing that proposal).
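
  As a minimal illustration, the rule above amounts to a membership
  test on the relay command. The helper name is hypothetical, and the
  commands are represented as strings purely for readability.

```python
# Relay commands that MUST be multiplexed across legs and therefore
# count towards conflux sequence numbers (taken from the list above).
MULTIPLEXED_COMMANDS = frozenset({
    "RELAY_COMMAND_BEGIN",
    "RELAY_COMMAND_DATA",
    "RELAY_COMMAND_END",
    "RELAY_COMMAND_CONNECTED",
    "RELAY_COMMAND_RESOLVE",
    "RELAY_COMMAND_RESOLVED",
    "RELAY_COMMAND_XOFF",
    "RELAY_COMMAND_XON",
})


def counts_toward_conflux_seq(command):
    """True if a cell with this command advances the sequence number."""
    return command in MULTIPLEXED_COMMANDS
```

  Keeping the set explicit, rather than testing for a non-zero stream
  ID, reflects the point that ordering, not stream usage, is the
  defining property.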


3. Traffic Scheduling [SCHEDULING]

  In order to load balance the traffic between the two circuits, the
  original conflux paper used only RTT. However, with Proposal 324, we
  will have accurate information on the instantaneous available bandwidth
  of each circuit leg, as 'cwnd - inflight' (see Section 3 of
  Proposal 324). We also have the TCP block state of the local OR
  connection.

  We specify two traffic schedulers from the multipath literature and
  adapt them to Tor: [MINRTT_TOR], and [LOWRTT_TOR]. Additionally,
  we create low-memory variants of these that aim to minimize the
  out-of-order queue size at the receiving endpoint.

  Additionally, see the [TRAFFIC_ANALYSIS] sections of this proposal for
  important details on how this selection can be changed, to reduce
  website traffic fingerprinting.

3.1. MinRTT scheduling [MINRTT_TOR]

  This scheduling algorithm is used for the MIN_LATENCY user experience.

  It works by always and only sending on the circuit with the current minimum
  RTT. With this algorithm, conflux should effectively stay on the circuit with
  the lowest initial RTT, unless that circuit's RTT rises above the RTT of the
  other circuit (due to relay load or congestion). When the circuit's congestion
  window is full (ie: cwnd - inflight <= 0), or if the local OR conn blocks,
  the conflux set stops transmitting and stops reading on edge connections,
  rather than switch.

  This should result in low out-of-order queues in most situations, unless
  the initial RTTs of the two circuits are very close (basically within the
  Vegas RTT bounds of queue variance, 'alpha' and 'beta').
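
  The selection rule can be sketched as follows. The Leg fields and
  function name are illustrative assumptions, not C-Tor identifiers.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    name: str
    rtt_ms: float
    cwnd: int
    inflight: int
    orconn_blocked: bool = False

    def can_send(self):
        # Room in the congestion window and an unblocked local OR conn.
        return self.cwnd - self.inflight > 0 and not self.orconn_blocked


def minrtt_pick(legs):
    """Return the minimum-RTT leg, or None if it cannot send.

    None means: stop transmitting and stop reading on edge
    connections, rather than switching to a slower leg.
    """
    best = min(legs, key=lambda leg: leg.rtt_ms)
    return best if best.can_send() else None


legs = [Leg("a", 80.0, cwnd=10, inflight=10), Leg("b", 120.0, cwnd=10, inflight=0)]
blocked_choice = minrtt_pick(legs)   # leg "a" is full, so nothing is sent
legs[0].inflight = 3
open_choice = minrtt_pick(legs)      # leg "a" has room again
```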

3.2. LowRTT Scheduling [LOWRTT_TOR]

  This scheduling algorithm is based on [MPTCP]'s LowRTT scheduler. This
  algorithm is used for the UX choice of HIGH_THROUGHPUT.

  In this algorithm, endpoints send cells on the circuit with lowest RTT that
  has an unblocked local OR connection, and room in its congestion window (ie:
  cwnd - inflight > 0). We stop reading on edge connections only when both
  congestion windows become full, or when both local OR connections are blocked.

  In this way, unlike original conflux, we switch to the secondary circuit
  without causing congestion either locally, or on either circuit. This
  improves both load times, and overall throughput. Given a large enough
  transmission, both circuits are used to their full capacity,
  simultaneously.
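
  For comparison with MinRTT, a sketch of the LowRTT choice is given
  below. As before, the Leg fields and function name are assumptions
  made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    name: str
    rtt_ms: float
    cwnd: int
    inflight: int
    orconn_blocked: bool = False

    def can_send(self):
        # Room in the congestion window and an unblocked local OR conn.
        return self.cwnd - self.inflight > 0 and not self.orconn_blocked


def lowrtt_pick(legs):
    """Return the lowest-RTT leg that can currently send, or None.

    None means no leg is usable, so reading on edge connections stops.
    """
    usable = [leg for leg in legs if leg.can_send()]
    if not usable:
        return None
    return min(usable, key=lambda leg: leg.rtt_ms)


legs = [Leg("a", 80.0, cwnd=10, inflight=10), Leg("b", 120.0, cwnd=10, inflight=0)]
spill_choice = lowrtt_pick(legs)     # "a" is full, so traffic spills to "b"
legs[1].inflight = 10
stalled_choice = lowrtt_pick(legs)   # both windows are full
```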

3.3. MinRTT Low-Memory Scheduling [MINRTT_LOWMEM_TOR]

  The low memory version of the MinRTT scheduler ensures that we do not
  perform a switch more often than once per congestion window worth of data.

  XXX: Other rate limiting, such as not switching unless the RTT changes by
  more than X%, may be useful here.
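
  One possible way to enforce the once-per-window limit is a simple
  cell counter, sketched below. Which congestion window to compare
  against is not stated above; this sketch assumes the caller passes
  the cwnd of the leg currently in use, and all names are hypothetical.

```python
class SwitchRateLimiter:
    """Allows a switch only after a congestion window worth of cells."""

    def __init__(self):
        self.cells_since_switch = 0

    def on_cell_sent(self):
        self.cells_since_switch += 1

    def may_switch(self, cwnd):
        # Assumption: cwnd is the current leg's congestion window.
        return self.cells_since_switch >= cwnd

    def on_switch(self):
        self.cells_since_switch = 0


limiter = SwitchRateLimiter()
for _ in range(30):
    limiter.on_cell_sent()
too_early = limiter.may_switch(cwnd=31)
limiter.on_cell_sent()
allowed = limiter.may_switch(cwnd=31)
```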

3.4. BLEST Scheduling [BLEST_TOR]

  XXX: Something like this might be useful for minimizing OOQ for the UX
  choice of LOW_MEM_THROUGHPUT, but we might just be able to reduce switching
  frequency instead.

  XXX: We want an algorithm that only uses cwnd instead. This algorithm
  has issues if the primary cwnd grows while the secondary does not.
  Expect this section to change.

  [BLEST] attempts to predict the availability of the primary circuit, and
  use this information to reorder transmitted data, to minimize
  head-of-line blocking in the recipient (and thus minimize out-of-order
  queues there).

  BLEST_TOR uses the primary circuit until the congestion window is full.
  Then, it uses the relative RTT times of the two circuits to calculate
  how much data can be sent on the secondary circuit faster than if we
  just waited for the primary circuit to become available.

  This is achieved by computing two variables at the sender:

    rtts = secondary.currRTT / primary.currRTT
    primary_limit = (primary.cwnd + (rtts-1)/2)*rtts

  Note: This (rtts-1)/2 factor represents anticipated congestion window
  growth over this period. It may be different for Tor, depending on the
  CC algorithm.

  If primary_limit < secondary.cwnd - (secondary.package_window + 1), then
  there is enough space on the secondary circuit to send data faster than
  we could by waiting for the primary circuit.

  XXX: Note that BLEST uses total_send_window where we use secondary.cwnd
  in this check. total_send_window is min(recv_win, CWND). But since Tor
  does not use receive windows and instead uses stream XON/XOFF, we only
  use CWND. There is some concern this may alter BLEST's buffer
  minimization properties, but since receive windows only matter if
  the application is slower than Tor, and XON/XOFF will cover that case,
  hopefully this is fine. If we need to, we could turn [REORDER_SIGNALING]
  into a receive window indication of some kind, to indicate remaining
  buffer size.

  Otherwise, if the primary_limit condition is not hit, cease reading on
  source edge connections until SENDME acks come back.

  Here is the pseudocode for this:

    while source.has_data_to_send():
      if primary.cwnd > primary.package_window:
        primary.send(source.get_packet())
        continue

      rtts = secondary.currRTT / primary.currRTT
      primary_limit = (primary.cwnd + (rtts-1)/2)*rtts

      if primary_limit < secondary.cwnd - (secondary.package_window+1):
        secondary.send(source.get_packet())
      else:
        break # done for now, wait for SENDME to free up CWND and restart

  Note that BLEST also has a parameter lambda that is updated whenever HoL
  blocking occurs. Because it is expensive and takes significant time to
  signal this over Tor, we omit this.
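
  To make the BLEST_TOR condition concrete, the check can be evaluated
  for sample values. The function name and the numbers are illustrative
  assumptions; only the two formulas come from this section.

```python
def blest_secondary_ok(primary_cwnd, primary_rtt,
                       secondary_cwnd, secondary_package_window,
                       secondary_rtt):
    """Return True if sending on the secondary leg is expected to be
    faster than waiting for the primary leg to free up."""
    rtts = secondary_rtt / primary_rtt
    primary_limit = (primary_cwnd + (rtts - 1) / 2) * rtts
    return primary_limit < secondary_cwnd - (secondary_package_window + 1)


# Primary: cwnd 100, RTT 100ms. Secondary RTT 200ms, so rtts = 2 and
# primary_limit = (100 + 0.5) * 2 = 201.
roomy = blest_secondary_ok(100, 100.0, 300, 50, 200.0)   # 201 < 249
tight = blest_secondary_ok(100, 100.0, 200, 50, 200.0)   # 201 < 149 fails
```

  In the second case the sender would stop reading on source edge
  connections until SENDME acks arrive.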


4. Security Considerations

4.1. Memory Denial of Service [MEMORY_DOS]

  Both reorder queues and retransmit buffers inherently represent a memory
  denial of service condition.

  For [RESUMPTION] retransmit buffers, endpoints that support this feature
  SHOULD free retransmit information as soon as they get close to memory
  pressure. This prevents resumption while data is in flight, but will not
  otherwise harm operation.

  In terms of adversarial issues, clients can lie about sequence numbers,
  sending cells with sequence numbers such that the next expected sequence
  number is never sent.  They can do this repeatedly on many circuits, to
  exhaust memory at exits.  Intermediate relays may also block a leg, allowing
  cells to traverse only one leg, thus still accumulating at the reorder queue.

  In C-Tor we will mitigate this in three ways: via the OOM killer, by the
  ability for exits to request that clients use the LOW_MEM_LATENCY UX
  behavior, and by rate limiting the frequency of switching under the
  LOW_MEM_LATENCY UX style.

  When a relay is under memory pressure, the circuit OOM killer SHOULD free
  and close circuits with the oldest reorder queue data, first. This heuristic
  was shown to be best during the [SNIPER] attack OOM killer iteration cycle.
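
  The ordering heuristic can be sketched as a sort on the age of the
  oldest queued cell. The record type and field names are hypothetical;
  they are not taken from the C-Tor OOM killer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedCircuit:
    circ_id: int
    # Arrival time of the oldest cell in the reorder queue, or None if
    # the queue is empty.
    oldest_queued_at: Optional[float]


def oom_kill_order(circuits):
    """Circuits with reorder queue data, oldest data first."""
    with_data = [c for c in circuits if c.oldest_queued_at is not None]
    return sorted(with_data, key=lambda c: c.oldest_queued_at)


circuits = [QueuedCircuit(1, 50.0), QueuedCircuit(2, None), QueuedCircuit(3, 10.0)]
victims = [c.circ_id for c in oom_kill_order(circuits)]
```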

  The rate limiting under LOW_MEM_LATENCY will be heuristic driven, based
  on data from Shadow simulations, and live network testing. It is possible that
  other algorithms may be able to be similarly rate limited.

4.2. Protocol Side Channels [SIDE_CHANNELS]

  To understand the decisions we make below with respect to handling
  potential side channels, it is important to understand a bit of the history
  of the Tor threat model.

  Tor's original threat model completely disregarded all traffic analysis,
  including protocol side channels, assuming that they were all equally
  effective, and that diversity of relays was what provided protection.
  Numerous attack papers have proven this to be an over-generalization.

  Protocol side channels are most severe when a circuit is known to be silent,
  because stateful protocol behavior prevents other normal cells from ever being
  sent. In these cases, it is trivial to inject a packet count pattern that has
  zero false positives. These kinds of side channels are made use of in the
  Guard discovery literature, such as [ONION_FOUND], and [DROPMARK]. It is even
  more trivial to manipulate the AES-CTR cipherstream, as per [RACCOON23], until
  we implement [PROP308].

  However, because we do not want to make this problem worse, it is extremely
  important to be mindful of ways that an adversary can inject new cell
  commands, as well as ways that the adversary can spawn new circuits
  arbitrarily.

  It is also important, though slightly less so, to be mindful of the uniqueness
  of new handshakes, as handshakes can be used to classify usage (such as via
  Onion Service Circuit Fingerprinting). Handshake side channels are only
  weakly defended, via padding machines for onion services. These padding
  machines will need to be improved, and this is also scheduled for arti.

  Finally, usage-based traffic analysis needs to be considered. This includes
  things like website traffic fingerprinting, and is covered in
  [TRAFFIC_ANALYSIS].

4.2.1. Cell Injection Side Channel Mitigations

  To avoid [DROPMARK] attacks, several checks must be performed, depending
  on the cell type. The circuit MUST be closed if any of these checks fail.

  RELAY_CONFLUX_LINK:
    - Ensure conflux is enabled
    - Ensure the circuit is an Exit (or Service Rend) circuit
    - Ensure that no previous LINK cell has arrived on this circuit

  RELAY_CONFLUX_LINKED:
    - Ensure conflux is enabled
    - Ensure the circuit is client-side
    - Ensure this is an unlinked circuit that sent a LINK command
    - Ensure that the nonce matches the nonce used in the LINK command
    - Ensure that the cell came from the expected hop

  RELAY_CONFLUX_LINKED_ACK:
    - Ensure conflux is enabled
    - Ensure that this circuit is not client-side
    - Ensure that the circuit has successfully received its LINK cell
    - Ensure that this circuit has not received a LINKED_ACK yet

  RELAY_CONFLUX_SWITCH
    - If Prop#340 is in use, this cell MUST be packed with a valid
      multiplexed RELAY_COMMAND cell.
    - XXX: Additional rate limiting per algorithm, after tuning.
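
  The LINK, LINKED, and LINKED_ACK checks above can be expressed as
  predicates over per-circuit state, as in the sketch below. The state
  fields and function names are assumptions for illustration; a failed
  check means the circuit MUST be closed.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass
class CircState:
    conflux_enabled: bool
    is_client_side: bool
    is_exit_or_rend: bool
    link_received: bool = False
    sent_link_nonce: Optional[bytes] = None  # set once LINK was sent
    linked: bool = False
    linked_ack_received: bool = False


def link_ok(c):
    return c.conflux_enabled and c.is_exit_or_rend and not c.link_received


def linked_ok(c, nonce, from_expected_hop):
    return (c.conflux_enabled
            and c.is_client_side
            and c.sent_link_nonce is not None   # a LINK was sent
            and not c.linked                    # still unlinked
            and hmac.compare_digest(nonce, c.sent_link_nonce)
            and from_expected_hop)


def linked_ack_ok(c):
    return (c.conflux_enabled
            and not c.is_client_side
            and c.link_received
            and not c.linked_ack_received)


client = CircState(True, True, False, sent_link_nonce=b"\x01" * 32)
exit_side = CircState(True, False, True)
```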

4.2.2. Guard Discovery Side Channel Mitigations

  In order to mitigate potential guard discovery by malicious exits,
  clients MUST NOT retry failed unlinked circuit legs for a set more than
  'cfx_max_unlinked_leg_retry' times.

4.2.3. Usage-Based Side Channel Discussion

  After we have solved all of the zero false positive protocol side
  channels in Tor, our attention can turn to more subtle, usage-based
  side channels.

  Two potential usage side channels may be introduced by the use of Conflux:
     1. Delay-based side channels, by manipulating switching
     2. Location info leaks through the use of both legs' latencies

  To perform delay-based side channels, Exits can simply disregard the RTT
  or cwnd when deciding to switch legs, thus introducing a pattern of gaps that
  the Guard node can detect. Guard relays can also delay legs to introduce a
  pattern into the delivery of cells at the exit relay, by varying the latency
  of SENDME cells (every 31st cell) to change the distribution of traffic to
  send information. This attack could be performed in either direction of
  traffic, to bias traffic load off of a particular Guard. If an adversary
  controls both Guards, it could in theory send a binary signal, by
  alternating delays on each.

  However, Tor currently provides no defenses against already existing
  single-circuit delay-based (or stop-and-start) side channels. It is already
  the case that on a single circuit, either the Guard or the Exit can simply
  withhold sending traffic, as per a recognizable pattern. This class of
  attacks, and a possible defense for them, is discussed in [BACKLIT].

  However, circuit padding can also help to obscure these side channels,
  even if tuned for website fingerprinting. See [TRAFFIC_ANALYSIS] for more
  details there.

  The second class of side channel is where the Exit relay may be able to
  use the two legs to further infer more information about client
  location. See [LATENCY_LEAK] for more details. It is unclear at this
  time how much more severe this is for two paths than just one.

  We preserve the ability to disable conflux to and from Exit relays
  using consensus parameters, if these side channels prove more severe,
  or if it proves possible to mitigate single-circuit side
  channels, but not conflux side channels.

4.3. Traffic analysis [TRAFFIC_ANALYSIS]

  Even though conflux shows benefits against traffic analysis in
  [WTF_SPLIT], these gains may be moot if the adversary is able to perform
  packet counting and timing analysis at guards to guess which specific
  circuits are linked. In particular, the 3 way handshake in
  [LINKING_CIRCUITS] may be quite noticeable.

  Additionally, the conflux handshake may make onion services stand out
  more, regardless of the number of stages in the handshake. For this
  reason, it may be wise to simply address these issues with circuit
  padding machines during circuit setup (see padding-spec.txt).

  Additional traffic analysis considerations arise when combining conflux
  with padding, for purposes of mitigating traffic fingerprinting. For
  this, it seems wise to treat the packet schedulers as another piece of a
  combined optimization problem in tandem with optimizing padding
  machines, perhaps introducing randomness or fudge factors into their
  scheduling, as a parameterized distribution. For details, see
  https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md

  Finally, conflux may exacerbate forms of confirmation-based traffic
  analysis that close circuits to determine concretely if they were in
  use, since closing either leg might cause resumption to fail. TCP RST
  injection can perform this attack on the side, without surveillance
  capability. [RESUMPTION] with buffering of the inflight unacked
  package_window data, for retransmit, is a partial mitigation, if
  endpoints buffer this data for retransmission for a brief time even if
  both legs close. This buffering seems more feasible for onion services,
  which are more vulnerable to this attack. However, if the adversary
  controls the client and is attacking the service in this way, they
  will notice the resumption re-link at their client, and still obtain
  confirmation that way.

  It seems the only way to fully mitigate these kinds of attacks is with
  the Snowflake pluggable transport, which provides its own resumption and
  retransmit behavior. Additionally, Snowflake's use of UDP DTLS also
  protects against TCP RST injection, which we suspect to be the main
  vector for such attacks.

  In the future, a DTLS or QUIC transport for Tor such as masque could
  provide similar RST injection resistance, and resumption at Guard/Bridge
  nodes, as well.

5. Consensus Parameters [CONSENSUS]

  - cfx_enabled
    - Values: 0=off, 1=on
    - Description: Emergency off switch, in case major issues are discovered.

  - cfx_low_exit_threshold
    - Range: 0-10000
    - Description: Fraction out of 10000 that represents the fractional rate of
      exits that must support protover 5. If the fraction is below this
      amount, the number of pre-built sets is restricted to 1.

  - cfx_max_linked_set
    - Range: 0-255
    - Description: The total number of linked sets that can be created. 255
      means "unlimited".

  - cfx_max_prebuilt_set
    - Range: 0-255
    - Description: The maximum number of pre-built conflux sets to make.
      This value is overridden by the 'cfx_low_exit_threshold' criteria.

  - cfx_max_unlinked_leg_retry
    - Range: 0-255
    - Description: The maximum number of times to retry an unlinked leg that
      fails during build or link, to mitigate guard discovery attacks.

  - cfx_num_legs_set
    - Range: 0-255
    - Description: The number of legs to link in a set.

  - cfx_send_pct
    - XXX: Experimental tuning parameter. Subject to change/removal.

  - cfx_drain_pct
    - XXX: Experimental tuning parameter. Subject to change/removal.
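
  The interaction between 'cfx_max_prebuilt_set', 'cfx_low_exit_threshold',
  and the "unlimited" value of 'cfx_max_linked_set' can be sketched as
  below. The function names are hypothetical, and treating an empty exit
  list as "below threshold" is an assumption not stated above.

```python
def prebuilt_set_limit(cfx_max_prebuilt_set, cfx_low_exit_threshold,
                       exits_supporting, total_exits):
    """Number of pre-built conflux sets a client may keep."""
    # Assumption: with no known exits, behave as if below the threshold.
    below = (total_exits == 0
             or exits_supporting * 10000 < cfx_low_exit_threshold * total_exits)
    if below:
        return min(cfx_max_prebuilt_set, 1)
    return cfx_max_prebuilt_set


def may_create_linked_set(current_linked_sets, cfx_max_linked_set):
    """True if another linked set may be created."""
    if cfx_max_linked_set == 255:  # 255 means "unlimited"
        return True
    return current_linked_sets < cfx_max_linked_set


low_support = prebuilt_set_limit(3, 5000, exits_supporting=400, total_exits=1000)
high_support = prebuilt_set_limit(3, 5000, exits_supporting=600, total_exits=1000)
```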


7. Tuning Experiments [EXPERIMENTS]

  - conflux_sched & conflux_exits
    - Exit reorder queue size
    - Responsiveness vs throughput tradeoff?
  - Congestion control
  - EWMA and KIST
  - num guards & conflux_circs


Appendix A [ALTERNATIVES]

A.1. Alternative Link Handshake [ALTERNATIVE_LINKING]

  The circuit linking in [LINKING_CIRCUITS] could be done as encrypted
  ntor onionskin extension fields, similar to those used by v3 onions.

  This approach has at least four problems:
    i). For onion services, since onionskins traverse the intro circuit
        and return on the rend circuit, this handshake cannot measure
        RTT there.
   ii). Since these onionskins are larger, and have no PFS, an adversary
        at the middle relay knows that the onionskin is for linking, and
        can potentially try to obtain the onionskin key for attacks on
        the link.
  iii). It makes linking circuits more fragile, since they could time out
        due to CBT, or other issues during construction.
   iv). The overhead in processing this onionskin in onionskin queues
        adds additional time for linking, even in the Exit case, making
        that RTT potentially noisy.

  Additionally, it is not clear that this approach actually saves us
  anything in terms of setup time, because we can optimize away the
  linking phase using Proposal 340, to combine initial RELAY_BEGIN cells
  with RELAY_CONFLUX_LINK.

A.2. Alternative RTT measurement [ALTERNATIVE_RTT]

  Instead of measuring RTTs during [LINKING_CIRCUITS], we could create
  PING/PONG cells, whose sole purpose is to allow endpoints to measure
  RTT.

  This was rejected for several reasons. First, during circuit use, we
  already have SENDMEs to measure RTT. Every 100 cells (or
  'circwindow_inc' from Proposal 324), we are able to re-measure RTT based
  on the time between that Nth cell and the SENDME ack. So we only need
  PING/PONG to measure initial circuit RTT.

  If we were able to use onionskins, as per [ALTERNATIVE_LINKING] above,
  we might be able to specify a PING/PONG/PING handshake solely for
  measuring initial RTT, especially for onion service circuits.

  The reason for not making a dedicated PING/PONG for this purpose is that
  it is context-free. Even if we were able to use onionskins for linking
  and resumption, to avoid additional data in handshake that just measures
  RTT, we would have to enforce that this PING/PONG/PING only follows the
  exact form needed by this proposal, at the expected time, and at no
  other points.

  If we do not enforce this specific use of PING/PONG/PING, it becomes
  another potential side channel, for use in attacks such as [DROPMARK].

  In general, Tor is planning to remove current forms of context-free and
  semantic-free cells from its protocol:
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/39

  We should not add more.


Appendix B: Acknowledgments [ACKNOWLEDGMENTS]

  Thanks to Per Hurtig for helping us with the framing of the MPTCP
  problem space.

  Thanks to Simone Ferlin for clarifications on the [BLEST] paper, and for
  pointing us at the Linux kernel implementation.

  Extreme thanks goes again to Toke Høiland-Jørgensen, who helped
  immensely towards our understanding of how the BLEST condition relates
  to edge connection pushback, and for clearing up many other
  misconceptions we had.

  Finally, thanks to Mashael AlSabah, Kevin Bauer, Tariq Elahi, and Ian
  Goldberg, for the original [CONFLUX] paper!


References:

[CONFLUX]
   https://freehaven.net/anonbib/papers/pets2013/paper_65.pdf

[BLEST]
  https://olivier.mehani.name/publications/2016ferlin_blest_blocking_estimation_mptcp_scheduler.pdf
  https://opus.lib.uts.edu.au/bitstream/10453/140571/2/08636963.pdf
  https://github.com/multipath-tcp/mptcp/blob/mptcp_v0.95/net/mptcp/mptcp_blest.c

[WTF_SPLIT]
   https://www.comsys.rwth-aachen.de/fileadmin/papers/2020/2020-delacadena-trafficsliver.pdf

[COUPLED]
   https://datatracker.ietf.org/doc/html/rfc6356
   https://www.researchgate.net/profile/Xiaoming_Fu2/publication/230888515_Delay-based_Congestion_Control_for_Multipath_TCP/links/54abb13f0cf2ce2df668ee4e.pdf?disableCoverPage=true
   http://staff.ustc.edu.cn/~kpxue/paper/ToN-wwj-2020.04.pdf
   https://www.thinkmind.org/articles/icn_2019_2_10_30024.pdf
   https://arxiv.org/pdf/1308.3119.pdf

[BACKLIT]
   https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

[LATENCY_LEAK]
   https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf
   https://www.robgjansen.com/publications/howlow-pets2013.pdf

[SNIPER]
   https://www.freehaven.net/anonbib/cache/sniper14.pdf

[DROPMARK]
   https://www.petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf

[RACCOON23]
   https://archives.seul.org/or/dev/Mar-2012/msg00019.html

[ONION_FOUND]
   https://www.researchgate.net/publication/356421302_From_Onion_Not_Found_to_Guard_Discovery/fulltext/619be24907be5f31b7ac194a/From-Onion-Not-Found-to-Guard-Discovery.pdf

[VANGUARDS_ADDON]
  https://github.com/mikeperry-tor/vanguards/blob/master/README_TECHNICAL.md

[PROP324]
  https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/324-rtt-congestion-control.txt

[PROP339]
  https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/339-udp-over-tor.md

[PROP308]
  https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/308-counter-galois-onion.txt

Filename: 330-authority-contact.md
Title: Modernizing authority contact entries
Author: Nick Mathewson
Created: 10 Feb 2021
Status: Open

This proposal suggests changes to interfaces used to describe a directory authority, to better support load balancing and denial-of-service resistance.

(In an appendix, it also suggests an improvement to the description of authority identity keys, to avoid a deprecated cryptographic algorithm.)

Background

There are, broadly, three good reasons to make a directory request to a Tor directory authority:

  • As a relay, to publish a new descriptor.
  • As another authority, to perform part of the voting and consensus protocol.
  • As a relay, to fetch a consensus or a set of (micro)descriptors.

There are some more reasons that are OK-ish:

  • As a bandwidth authority or similar related tool running under the auspices of an authority.
  • As a metrics tool, to fetch directory information.
  • As a liveness checking tool, to make sure the authorities are running.

There are also a number of bad reasons to make a directory request to a Tor directory authority.

  • As a client, to download directory information. (Clients should instead use a directory guard, or a fallback directory if they don't know any directory information at all.)
  • As a tor-related application, to download directory information. (Such applications should instead run a tor client, which can maintain an up-to-date directory much more efficiently.)

Currently, Tor provides two mechanisms for downloading and uploading directory information: the DirPort, and the BeginDir command. A DirPort is an HTTP port on which directory information is served. The BeginDir command is a relay command that is used to send an HTTP stream directly over a Tor circuit.

Historically, we used DirPort for all directory requests. Later, when we needed encrypted or anonymous directory requests, we moved to a "Begin-over-tor" approach, and then to BeginDir. We still use the DirPort directly, however, when relays are connecting to authorities to publish descriptors or download fresh directories. We also use it for voting.

This proposal suggests that instead of having only a single DirPort, authorities should be able to expose a separate contact point for each supported interaction above. By separating these contact points, we can impose separate access controls and rate limits on each, to improve the robustness of the consensus voting process.

Eventually, separate contact points will allow us to do even more: we'll be able to have separate implementations for the upload and download components of the authorities, and keep the voting component mostly offline.

Adding contact points to authorities

Currently, for each directory authority, we ship an authority entry. For example, the entry describing tor26 is:

"tor26 orport=443 "
  "v3ident=14C131DFC5C6F93646BE72FA1401C02A8DF2E8B4 "
  "ipv6=[2001:858:2:2:aabb:0:563b:1526]:443 "
  "86.59.21.38:80 847B 1F85 0344 D787 6491 A548 92F9 0493 4E4E B85D",

We extend these lines with optional contact point elements as follows:

  • upload=http://IP:port/ A location to publish router descriptors.
  • download=http://IP:port/ A location to use for caches when fetching router descriptors.
  • vote=http://IP:port/ A location to use for authorities when voting.

Each of these contact point elements can appear more than once. If it does, then it describes multiple valid contact points for a given purpose; implementations MAY use any of the contact point elements that they recognize for a given authority.

Implementations SHOULD ignore URL schemes that they do not recognize, and SHOULD ignore hostnames that appear in the place of the IP elements above. (This will make it easier for us to extend these lists in the future.)

If there is no contact point element for a given type, then implementations should fall back to using the main IPv4 addr:port, and/or the IPv6 addr:port if available.
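
A minimal sketch of how an implementation might read these elements follows. It assumes the entry has already been split into whitespace-separated elements and that the fallback address is known; the function name and return shape are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit

KNOWN_PURPOSES = ("upload", "download", "vote")


def parse_contact_points(elements, fallback):
    """Map each purpose to a list of (ip, port) contact points.

    Elements with unrecognized keys, unrecognized URL schemes, or
    hostnames in place of an IP address are ignored. A purpose with no
    usable element falls back to the authority's main addr:port.
    """
    points = {purpose: [] for purpose in KNOWN_PURPOSES}
    for element in elements:
        key, sep, url = element.partition("=")
        if not sep or key not in points:
            continue
        try:
            parts = urlsplit(url)
            host, port = parts.hostname, parts.port
            if parts.scheme != "http" or host is None or port is None:
                continue
            ipaddress.ip_address(host)  # raises ValueError for hostnames
        except ValueError:
            continue
        points[key].append((host, port))
    return {purpose: (found or [fallback]) for purpose, found in points.items()}


contacts = parse_contact_points(
    ["orport=443",
     "upload=http://192.0.2.1:9030/",
     "vote=https://192.0.2.1:9031/",       # unrecognized scheme: ignored
     "download=http://example.com:80/"],   # hostname: ignored
    fallback=("192.0.2.9", 80),
)
```

The addresses used here are documentation-range placeholders, not real authority addresses.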

As an extra rule: If more than one authority lists the same upload point, then uploading a descriptor to that upload point counts as having uploaded it to all of those authorities. (This rule will allow multiple authorities to share an upload point in the future, if they decide to do so. We do not need corresponding rules for voting or downloading, since every authority participates in voting directly, and since there is no notion of "downloading from each authority.")

Authority-side configuration

We add a few flags to DirPort configuration, indicating what kind of requests are acceptable.

  • no-voting
  • no-download
  • no-upload

These flags remove a given set of possible operations from a given DirPort. So for example, an authority might say:

 DirPort 9030 no-download no-upload
 DirPort 9040 no-voting no-upload
 DirPort 9050 no-voting no-download

We can also allow "upload-only" as an alias for "no-voting no-download", and so on.
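
As a rough illustration, the flags could be interpreted as removing operations from a full set. This is a sketch only: the parser name is hypothetical, and the "voting-only" and "download-only" spellings are assumed by analogy with "upload-only".

```python
ALL_OPERATIONS = frozenset({"voting", "download", "upload"})


def parse_dirport(line):
    """Return (port, allowed_operations) for a DirPort config line."""
    keyword, port, *flags = line.split()
    if keyword != "DirPort":
        raise ValueError("not a DirPort line: %r" % line)
    allowed = set(ALL_OPERATIONS)
    for flag in flags:
        if flag.startswith("no-") and flag[3:] in ALL_OPERATIONS:
            allowed.discard(flag[3:])
        elif flag.endswith("-only") and flag[:-5] in ALL_OPERATIONS:
            # "X-only" is an alias for removing every other operation.
            allowed &= {flag[:-5]}
        else:
            raise ValueError("unknown DirPort flag: %r" % flag)
    return int(port), allowed


voting_port = parse_dirport("DirPort 9030 no-download no-upload")
upload_port = parse_dirport("DirPort 9050 upload-only")
```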

Note that authorities would need to keep a legacy dirport around until all relays have upgraded.

Bridge authorities

This proposal does not yet apply to bridge authorities, since neither clients nor bridges connect to bridge authorities over HTTP. A later proposal may add a scheme that can be used to describe contacting a bridge authority via BEGINDIR.

Example uses

Example setup: Simple access control and balancing.

Right now the essential functionality of authorities is sometimes blocked by getting too much load from directory downloads by non-relays. To address this we can proceed as follows. We can have each relay authority open four separate dirports: One for publishing, one for voting, one for downloading, and one legacy port. These can be rate-limited separately, and requests sent to the wrong port can be rejected. We could additionally prioritize voting, then uploads, then downloads. This could be done either within Tor, or with other IP shaping tools.

Example setup: Full authority refactoring

In the future, this system lets us get fancier with our authorities and how they are factored. For example, as in proposal 257, an authority could run upload services, voting, and download services all at separate locations.

The authorities themselves would be the only ones that needed to use their voting protocol. The upload services (run on the behalf of authorities or groups of authorities) could receive descriptors and do initial testing on them before passing them on to the authorities. The authorities could then vote with one another, and push the resulting consensus and descriptors to the download services. This would make the download services primarily responsible for serving directory information, and have them take all the load.

Appendix: Cryptographic extensions to authority configuration

The 'v3ident' element, and the relay identity fingerprint in authority configuration, are currently both given as SHA1 digests of RSA keys. SHA1 is currently deprecated: even though we're only relying on second-preimage resistance, we should migrate away.

With that in mind, we're adding two more fields to the authority entries:

  • ed25519-id=BASE64 The ed25519 identity of the authority when it acts as a relay.
  • v3ident-sha3-256=HEX The SHA3-256 digest of the authority's v3 signing key.

(We use base64 here for the ed25519 key since that's what we use elsewhere.)

Filename: 331-res-tokens-for-anti-dos.md
Title: Res tokens: Anonymous Credentials for Onion Service DoS Resilience
Author: George Kadianakis, Mike Perry
Created: 11-02-2021
Status: Draft
              +--------------+           +------------------+
              | Token Issuer |           | Onion Service    |
              +--------------+           +------------------+
                     ^                            ^
                     |        +----------+        |
            Issuance |  1.    |          |   2.   | Redemption
                     +------->|  Alice   |<-------+
                              |          |
                              +----------+

0. Introduction

This proposal specifies a simple anonymous credential scheme based on Blind RSA signatures designed to fight DoS abuse against onion services. We call the scheme "Res tokens".

Res tokens are issued by third-party issuance services, and are verified by onion services during the introduction protocol (through the INTRODUCE1 cell).

While Res tokens are used for denial of service protection in this proposal, we demonstrate how they can have application in other Tor areas as well, like improving the IP reputation of Tor exit nodes.

1. Motivation

Denial of service attacks against onion services have been explored in the past and various defenses have been proposed:

  • Tor proposal #305 specifies network-level rate-limiting mechanisms.
  • Onionbalance allows operators to scale their onions horizontally.
  • Tor proposal #327 increases the attacker's computational requirements (not implemented yet).

While the above proposals in tandem should provide reasonable protection against many DoS attackers, they fundamentally work by reducing the asymmetry between the onion service and the attacker. This won't work if the attacker is extremely powerful because the asymmetry is already huge and cutting it down does not help.

We believe that a proposal based on cryptographic guarantees -- like Res tokens -- can offer protection against even extremely strong attackers.

2. Overview

In this proposal we introduce an anonymous credential scheme -- Res tokens -- that is well fitted for protecting onion services against DoS attacks. We also introduce a system where clients can acquire such anonymous credentials from various types of Token Issuers and then redeem them at the onion service to gain access even when under DoS conditions.

In section TOKEN_DESIGN, we list our requirements from an anonymous credential scheme and provide a high-level overview of how the Res token scheme works.

In section PROTOCOL_SPEC, we specify the token issuance and redemption protocols, as well as the mathematical operations that need to be conducted for these to work.

In section TOKEN_ISSUERS, we provide a few examples and guidelines for various token issuer services that could exist.

In section DISCUSSION, we provide more use cases for Res tokens as well as future improvements we can conduct to the scheme.

3. Design [TOKEN_DESIGN]

In this section we will go over the high-level design of the system, and in the next section we will delve into the lower-level details of the protocol.

3.1. Anonymous credentials

Anonymous credentials or tokens are cryptographic identifiers that allow their bearer to maintain an identity while also preserving anonymity.

Clients can acquire a token in a variety of ways (e.g. registering on a third-party service, solving a CAPTCHA, completing a PoW puzzle) and then redeem it at the onion service proving this way that work was done, but without linking the act of token acquisition with the act of token redemption.

3.2. Anonymous credential properties

The anonymous credential literature is vast and there are dozens of credential schemes with different properties REF_TOKEN_ZOO. In this section we detail the properties we care about for this use case:

  • Public Verifiability: Because of the distributed trust properties of the Tor network, we need anonymous credentials that can be issued by one party (the token issuer) and verified by a different party (in this case the onion service).

  • Perfect unlinkability: Unlinkability between token issuance and token redemption is vital in private settings like Tor. For this reason we want our scheme to preserve its unlinkability even if its fundamental security assumption is broken. We want unlinkability to be protected by information theoretic security or random oracle, and not just computational security.

  • Small token size: The tokens will be transferred to the service through the INTRODUCE1 cell which is not flexible and has only a limited amount of space (about 200 bytes) REF_INTRO_SPACE. We need tokens to be small.

  • Quick Verification: Onions are already experiencing resource starvation because of the DoS attacks so it's important that the process of verifying a token should be as quick as possible. In section TOKEN_PERF we will go deeper into this requirement.

After careful consideration of the above requirements, we have leaned towards using Blind RSA as the primitive for our tokens, since it's the fastest scheme by far that also allows public verifiability. See also Appendix A for a security proof sketch of Blind RSA perfect unlinkability.

3.3. Other security considerations

Apart from the above properties we also want:

  • Double spending protection: We don't want Mallory to be able to double spend her tokens at various onion services, thereby amplifying her attack. For this reason our tokens are not global, and can only be redeemed at a specific destination onion service.

  • Metadata: We want to encode metadata/attributes in the tokens. In particular, we want to encode the destination onion service and an expiration date. For more information see section DEST_DIGEST. For blind RSA tokens this is usually done using "partially blind signatures" but to keep it simple we instead encode the destination directly in the message to be blind-signed and the expiration date using a set of rotating signing keys.

  • One-show: There are anonymous credential schemes with multi-show support where one token can be used multiple times in an unlinkable fashion. However, that might allow an adversary to use a single token to launch a DoS attack, since revocation solutions are complex and inefficient in anonymous credentials. For this reason, in this work we use one-show tokens that can only be redeemed once. That takes care of the revocation problem but it means that a client will have to get more tokens periodically.

3.4. Res tokens overview

Throughout this proposal we will be using our own token scheme, named "Res", which is based on blind RSA signatures. In this modern cryptographic world, not only do we have the audacity to use Chaum's oldest blind signature scheme of all time, but we are also using RSA with a modulus of 1024 bits...

The reason that Res uses only 1024-bit RSA is that we care more about small token size and quick verification than about the unforgeability of the token. This means that if the attacker breaks the issuer's RSA signing key and issues tokens for herself, she will be able to launch DoS attacks against onion services, but she won't be able to link users (because of the "perfect unlinkability" property).

Furthermore, Res tokens get a short implicit expiration date by having the issuer rapidly rotate issuance keys every few hours. This means that even if an adversary breaks an issuance key, she will be able to forge tokens for just a few hours before that key expires.

For more ideas on future schemes and improvements see section FUTURE_RES.

3.5. Token performance requirements [TOKEN_PERF]

As discussed above, verification performance is extremely important in the anti-DoS use case. In this section we provide some concrete numbers on what we are looking for.

In proposal #327 REF_POW_PERF we measured that the total time spent by the onion service on processing a single INTRODUCE2 cell ranges from 5 msec to 15 msec with a mean time around 5.29 msec. This time also includes the launch of a rendezvous circuit, but does not include the additional blocking and time it takes to process future cells from the rendezvous point.

We also measured that the parsing and validation of an INTRODUCE2 cell ("top half") takes around 0.26 msec; that's the lightweight part before the onion service decides to open a rendezvous circuit and do all the path selection and networking.

This means that any defenses introduced by this proposal should add minimal overhead to the above "top half" procedure, so as to apply access control in the lightest way possible.

For this reason we implemented a basic version of the Res token scheme in Rust and benchmarked the verification and issuance procedure REF_RES_BENCH.

We measured that the verification procedure from section RES_VERIFY takes about 0.104 ms, which we believe is a reasonable verification overhead for the purposes of this proposal.

We also measured that the issuance procedure from RES_ISSUANCE takes about 0.614 ms.
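
To put these figures in perspective, the arithmetic below derives the relative overhead and a rough throughput ceiling from the numbers quoted in this section. It assumes a single core doing nothing but verification (or issuance), which is an idealization and not something the benchmarks claim.

```python
# Figures quoted in this section, in milliseconds.
TOP_HALF_MS = 0.26    # parsing and validation of an INTRODUCE2 cell
VERIFY_MS = 0.104     # Res token verification
ISSUE_MS = 0.614      # Res token issuance

def per_second(cost_ms: float) -> float:
    """Operations per second if each one takes cost_ms on a single core."""
    return 1000.0 / cost_ms

# Verification cost relative to the existing "top half" work.
overhead = VERIFY_MS / TOP_HALF_MS

# Idealized single-core ceilings (assumption: no other work on the core).
verifications_per_sec = per_second(VERIFY_MS)
issuances_per_sec = per_second(ISSUE_MS)

print(f"overhead vs top half: {overhead:.0%}")
print(f"verifications/sec:    {verifications_per_sec:.0f}")
print(f"issuances/sec:        {issuances_per_sec:.0f}")
```

In other words, verification adds roughly 40% to the "top half" cost, and a single core could in principle check on the order of nine to ten thousand tokens per second.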

4. Specification [PROTOCOL_SPEC]

              +--------------+           +------------------+
              | Token Issuer |           | Onion Service    |
              +--------------+           +------------------+
                     ^                            ^
                     |        +----------+        |
            Issuance |  1.    |          |   2.   | Redemption
                     +------->|  Alice   |<-------+
                              |          |
                              +----------+

4.0. Notation

Let a || b be the concatenation of a with b.

Let a^b denote the exponentiation of a to the bth power.

Let a == b denote a check for equality between a and b.

Let FDH_N(msg) be a Full Domain Hash (FDH) of 'msg' using SHA256 and stretching the digest to be equal to the size of an RSA modulus N.
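
The proposal does not pin down how the SHA256 digest is stretched. The following is a minimal sketch, assuming a counter-based expansion (hash `msg || counter`, concatenate the outputs, truncate to the modulus length, then reduce mod N). The function name `fdh`, the 4-byte counter, and the final reduction are illustrative assumptions, not part of the proposal.

```python
import hashlib

def fdh(msg: bytes, n: int) -> int:
    """Stretch SHA256(msg) to the byte length of n and reduce into [0, n).

    Assumption: counter-mode expansion; the proposal leaves this open.
    """
    n_len = (n.bit_length() + 7) // 8          # modulus size in bytes
    stream = b""
    counter = 0
    while len(stream) < n_len:
        # Each block hashes the message together with a 4-byte counter.
        stream += hashlib.sha256(msg + counter.to_bytes(4, "big")).digest()
        counter += 1
    # Reducing mod n introduces a small bias; acceptable for a sketch.
    return int.from_bytes(stream[:n_len], "big") % n

# Toy modulus built from two known Mersenne primes (NOT a secure key size).
TOY_N = (2**127 - 1) * (2**89 - 1)
example = fdh(b"destination" + b"salt", TOY_N)
```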

4.1. Token issuer setup

The Issuer creates a set of ephemeral RSA-1024 "issuance keys" that will be used during the issuance protocol. Issuers will be rotating these ephemeral keys every 6 hours.

The Issuer exposes the set of active issuance public keys through a REST HTTP API that can be accessed by visiting /issuers.keys.

Tor directory authorities periodically fetch the issuer's public keys and vote for those keys in the consensus so that they are readily available by clients. The keys in the current consensus are considered active, whereas the ones that have fallen off have expired.

XXX how many issuance public keys are active each time? how does overlapping keys work? clients and onions need to know precise expiration date for each key. this needs to be specified and tested for robustness.

XXX every how often does the fetch work? how does the voting work? which issuers are considered official? specify consensus method.

XXX An alternative approach: Issuer has a long-term ed25519 certification key that creates expiring certificates for the ephemeral issuance keys. Alice shows the certificate to the service to prove that the token comes from an issuer. The consensus includes the long-term certification key of the issuers to establish ground truth. This way we avoid the synchronization between dirauths and issuers, and the multiple overlapping active issuance keys. However, certificates might not fit in the INTRODUCE1 cell (prop220 certs take 104 bytes on their own). Also certificate metadata might create a vector for linkability attacks between the issuer and the verifier.

4.2. Onion service signals ongoing DoS attack

When an onion service is under DoS attack it adds the following line in the "encrypted" (inner) part of the v3 descriptor as a way to signal to its clients that tokens are required for gaining access:

"token-required" SP token-type SP issuer-list NL

[At most once]

token-type: Is the type of token supported ("res" for this proposal)
issuer-list: A comma separated list of issuers which are supported by this onion service
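
A client-side parser for this line could look roughly like the sketch below. The format of individual issuer names is not specified in this proposal, so the sketch only requires them to be non-empty; the function name and the example issuer names are hypothetical.

```python
from typing import List, Tuple

def parse_token_required(line: str) -> Tuple[str, List[str]]:
    """Parse: "token-required" SP token-type SP issuer-list NL."""
    if not line.endswith("\n"):
        raise ValueError("missing NL terminator")
    parts = line[:-1].split(" ")
    if len(parts) != 3 or parts[0] != "token-required":
        raise ValueError("malformed token-required line")
    token_type, issuer_list = parts[1], parts[2]
    issuers = issuer_list.split(",")
    # Assumption: token type and issuer names must be non-empty strings.
    if not token_type or any(not name for name in issuers):
        raise ValueError("empty token type or issuer name")
    return token_type, issuers

# Hypothetical issuer names, for illustration only.
parsed = parse_token_required("token-required res issuerA,issuerB\n")
```

A real implementation would also enforce the "[At most once]" rule while walking the descriptor, and ignore token types it does not understand.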

4.3. Token issuance

When Alice visits an onion service with an active "token-required" line in its descriptor, she checks whether there are any tokens available for this onion service in her token store. If not, she needs to acquire some, and the token issuance protocol commences.

4.3.1. Client preparation [DEST_DIGEST]

Alice first chooses an issuer supported by the onion service depending on her preferences by looking at the consensus and her Tor configuration file for the current list of active issuers.

After picking a supported issuer, she performs the following preparation before contacting the issuer:

  1. Alice extracts the issuer's public key (N,e) from the consensus

  2. Alice computes a destination digest as follows:

      dest_digest = FDH_N(destination || salt)
    
         where:
         - 'destination' is the 32-byte ed25519 public identity key of the destination onion
         - 'salt' is a random 32-byte value,
    
  3. Alice samples a blinding factor 'r' uniformly at random from [1, N)

  4. Alice computes: blinded_message = dest_digest * r^e (mod N)

After this phase is completed, Alice has a blinded message that is tailored specifically for the destination onion service. Alice will send the blinded message to the Token Issuer, but because of the blinding the Issuer does not get to learn the dest_digest value.

XXX Is the salt needed? Reevaluate.

4.3.3. Token Issuance [RES_ISSUANCE]

Alice now initiates contact with the Token Issuer and spends the resources required to get issued a token (e.g. solve a CAPTCHA or a PoW, create an account, etc.). After that step is complete, Alice sends the blinded_message to the issuer through a JSON-RPC API.

After the Issuer receives the blinded_message it signs it as follows:

    blinded_signature = blinded_message ^ d (mod N)

      where:
      - 'd' is the private RSA exponent.

and returns the blinded_signature to Alice.

XXX specify API (JSON-RPC? Needs SSL + pubkey pinning.)

4.3.4. Unblinding step

Alice verifies the received blinded signature, and unblinds it to get the final token as follows:

    token = blinded_signature * r^{-1} (mod N)
          = blinded_message ^ d * r^{-1} (mod N)
          = (dest_digest * r^e) ^d * r^{-1} (mod N)
          = dest_digest ^ d * r * r^{-1} (mod N)
          = dest_digest ^ d (mod N)

      where:
      - r^{-1} is the multiplicative inverse of the blinding factor 'r'

Alice will now use the 'token' to get access to the onion service.

By verifying the received signature using the issuer keys in the consensus, Alice ensures that a legitimate token was received and that it has not expired (since the issuer keys are still in the consensus).
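
The blinding, signing, and unblinding steps of section 4.3 can be exercised end to end with a toy key. This is a sketch under stated assumptions: the modulus is built from two small known Mersenne primes (not RSA-1024 and not secure), the `fdh` helper uses a counter-based expansion that the proposal does not specify, and all function names are illustrative.

```python
import hashlib
import math
import secrets

# Toy issuer key (assumption: illustration only, far too small to be secure).
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # issuer's private exponent 'd'

def fdh(msg: bytes, n: int) -> int:
    """Counter-mode SHA256 expansion reduced mod n (expansion is an assumption)."""
    n_len = (n.bit_length() + 7) // 8
    stream, counter = b"", 0
    while len(stream) < n_len:
        stream += hashlib.sha256(msg + counter.to_bytes(4, "big")).digest()
        counter += 1
    return int.from_bytes(stream[:n_len], "big") % n

def blind(dest_digest: int, n: int, e: int):
    """Client: pick r in [1, n), invertible mod n, and blind the digest."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return dest_digest * pow(r, e, n) % n, r

def sign_blinded(blinded_message: int, n: int, d: int) -> int:
    """Issuer: blinded_signature = blinded_message ^ d (mod N)."""
    return pow(blinded_message, d, n)

def unblind(blinded_signature: int, r: int, n: int) -> int:
    """Client: token = blinded_signature * r^{-1} (mod N)."""
    return blinded_signature * pow(r, -1, n) % n

destination = bytes(32)                 # placeholder for the onion's ed25519 key
salt = secrets.token_bytes(32)
dest_digest = fdh(destination + salt, N)

blinded_message, r = blind(dest_digest, N, E)
blinded_signature = sign_blinded(blinded_message, N, D)
token = unblind(blinded_signature, r, N)

# Alice's check that the issuer returned a valid signature.
token_is_valid = pow(token, E, N) == dest_digest
```

The issuer only ever sees `blinded_message`, while the value that later reaches the onion service is `token`, which equals `dest_digest ^ d (mod N)` as in the derivation above.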

4.4. Token redemption

4.4.1. Alice sends token to onion service

Now that Alice has a valid 'token', she can request access to the onion service. She does so by embedding the token into the INTRODUCE1 cell sent to the onion service.

To do so, Alice adds an extension to the encrypted portion of the INTRODUCE1 cell by using the EXTENSIONS field (see PROCESS_INTRO2 section in rend-spec-v3.txt). The encrypted portion of the INTRODUCE1 cell only gets read by the onion service and is ignored by the introduction point.

We propose a new EXT_FIELD_TYPE value:

[02] -- ANON_TOKEN

The EXT_FIELD content format is:

   TOKEN_VERSION    [1 byte]
   ISSUER_KEY       [4 bytes]
   DEST_DIGEST      [32 bytes]
   TOKEN            [128 bytes]
   SALT             [32 bytes]

where:

  • TOKEN_VERSION is the version of the token ([0x01] for Res tokens)
  • ISSUER_KEY is the public key of the chosen issuer (truncated to 4 bytes)
  • DEST_DIGEST is the 'dest_digest' from above
  • TOKEN is the 'token' from above
  • SALT is the 32-byte 'salt' added during blinding

This will increase the INTRODUCE1 payload size by 199 bytes since the data above is 197 bytes, the extension type and length is 2 extra bytes, and the N_EXTENSIONS field is always present. According to ticket #33650, INTRODUCE1 cells currently have more than 200 bytes available so we should be able to fit the above fields in the cell.
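
The size accounting above can be checked by packing the fields in order. This is a sketch only: the constant and function names are illustrative, and it assumes the usual one-byte type and one-byte length prefix for an extension field.

```python
import struct

EXT_FIELD_TYPE_ANON_TOKEN = 0x02
TOKEN_VERSION_RES = 0x01
# TOKEN_VERSION, ISSUER_KEY, DEST_DIGEST, TOKEN, SALT
FIELD_FORMAT = ">B4s32s128s32s"

def pack_anon_token(issuer_key: bytes, dest_digest: bytes,
                    token: bytes, salt: bytes) -> bytes:
    """Serialize the ANON_TOKEN extension: type, length, then the field body."""
    fields = (("ISSUER_KEY", issuer_key, 4), ("DEST_DIGEST", dest_digest, 32),
              ("TOKEN", token, 128), ("SALT", salt, 32))
    # struct's 's' format silently pads or truncates, so check sizes explicitly.
    for name, value, size in fields:
        if len(value) != size:
            raise ValueError(f"{name} must be {size} bytes, got {len(value)}")
    body = struct.pack(FIELD_FORMAT, TOKEN_VERSION_RES,
                       issuer_key, dest_digest, token, salt)
    return bytes([EXT_FIELD_TYPE_ANON_TOKEN, len(body)]) + body

# Dummy contents, for size checking only.
extension = pack_anon_token(b"\x00" * 4, b"\x01" * 32, b"\x02" * 128, b"\x03" * 32)
body_size = struct.calcsize(FIELD_FORMAT)     # 1 + 4 + 32 + 128 + 32
total_size = len(extension)                   # body plus type and length bytes
```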

XXX maybe we don't need to pass DEST_DIGEST and we can just derive it

XXX maybe with a bit of tweaking we can even use a 1536-bit RSA signature here...

4.4.2. Onion service verifies token [RES_VERIFY]

Upon receiving an INTRODUCE1 cell with the above extension the service verifies the token. It does so as follows:

  1. The service checks its double spend protection cache for an element that matches DEST_DIGEST. If one is found, verification fails.
  2. The service checks: DEST_DIGEST == FDH_N(service_pubkey || SALT), where 'service_pubkey' is its own long-term public identity key.
  3. The service finds the corresponding issuer public key (N,e) based on ISSUER_KEY from the consensus or its configuration file.
  4. The service checks: TOKEN ^ e (mod N) == DEST_DIGEST

Finally the onion service adds the DEST_DIGEST to its double spend protection cache to avoid the same token getting redeemed twice. Onion services keep a double spend protection cache by maintaining a sorted array of truncated DEST_DIGEST elements.

If any of the above steps fails, the verification process aborts and the introduction request gets discarded.

If all the above verification steps have been completed successfully, the service knows that this is a valid token issued by the token issuer, and that the token has been created for this onion service specifically. The service considers the token valid and the rest of the onion service protocol proceeds as normal.
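
The verification steps and the double spend cache could be sketched as follows. Several things here are assumptions made for illustration: the toy key, the counter-based `fdh`, the 8-byte truncation length, the big-endian modulus-length encoding of DEST_DIGEST and TOKEN, and the class and method names. The issuer lookup is performed before the digest check because FDH_N needs the issuer's modulus.

```python
import bisect
import hashlib

P, Q = 2**127 - 1, 2**89 - 1            # toy primes, NOT secure
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))       # used below only to mint a test token

def fdh(msg: bytes, n: int) -> int:
    """Counter-mode SHA256 expansion reduced mod n (expansion is an assumption)."""
    n_len = (n.bit_length() + 7) // 8
    stream, counter = b"", 0
    while len(stream) < n_len:
        stream += hashlib.sha256(msg + counter.to_bytes(4, "big")).digest()
        counter += 1
    return int.from_bytes(stream[:n_len], "big") % n

class TokenVerifier:
    def __init__(self, service_pubkey: bytes, issuers: dict, trunc_len: int = 8):
        self.service_pubkey = service_pubkey
        self.issuers = issuers           # maps ISSUER_KEY -> (N, e)
        self.trunc_len = trunc_len       # truncation length is an assumption
        self.seen = []                   # sorted array of truncated DEST_DIGESTs

    def verify(self, issuer_key: bytes, dest_digest: bytes,
               token: bytes, salt: bytes) -> bool:
        tag = dest_digest[: self.trunc_len]
        pos = bisect.bisect_left(self.seen, tag)
        if pos < len(self.seen) and self.seen[pos] == tag:
            return False                 # step 1: token already redeemed
        issuer = self.issuers.get(issuer_key)
        if issuer is None:
            return False                 # step 3: unknown issuer
        n, e = issuer
        digest_int = int.from_bytes(dest_digest, "big")
        if digest_int != fdh(self.service_pubkey + salt, n):
            return False                 # step 2: token is for another service
        if pow(int.from_bytes(token, "big"), e, n) != digest_int:
            return False                 # step 4: signature does not verify
        self.seen.insert(pos, tag)       # remember the token as spent
        return True

service_pubkey = bytes(32)               # placeholder identity key
salt = b"\x07" * 32
size = (N.bit_length() + 7) // 8
digest = fdh(service_pubkey + salt, N)
issuer_key = b"\x00\x00\x00\x01"         # hypothetical truncated issuer key
args = (issuer_key, digest.to_bytes(size, "big"),
        pow(digest, D, N).to_bytes(size, "big"), salt)

verifier = TokenVerifier(service_pubkey, {issuer_key: (N, E)})
first = verifier.verify(*args)           # fresh token
second = verifier.verify(*args)          # same token presented again
```

Keeping the cache sorted makes both the lookup and the insertion position a single binary search; truncating the entries trades a negligible false-reject probability for memory.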

5. Token issuers [TOKEN_ISSUERS]

In this section we go over some example token issuers. While we can have official token issuers that are supported by the Tor directory authorities, it is also possible to have unofficial token issuers between communities that can be embedded directly into the configuration file of the onion service and the client.

In general, we consider the design of token issuers to be independent from this proposal so we will touch the topic but not go too deep into it.

5.1. CAPTCHA token issuer

A use case resembling the setup of Cloudflare's PrivacyPass would be to have a CAPTCHA service that issues tokens after a successful CAPTCHA solution.

Tor Project, Inc runs https://ctokens.torproject.org which serves hCaptcha CAPTCHAs. When the user solves a CAPTCHA the server gives back a list of tokens. The amount of tokens rewarded for each solution can be tuned based on abuse level.

Clients reach this service via a regular Tor Exit connection, possibly via a dedicated exit enclave-like relay that can only connect to https://ctokens.torproject.org.

Upon receiving tokens, Tor Browser delivers them to the Tor client via the control port, which then stores the tokens into a token cache to be used when connecting to onion services.

In terms of UX, most of the above procedure can be hidden from the user by having Tor Browser do most of the work behind the scenes and only present the CAPTCHA to the user if/when needed (if the user doesn't have tokens available for that destination).

XXX specify control port API between browser and tor

5.2. PoW token issuer

An idea that mixes the CAPTCHA issuer with proposal#327, would be to have a token issuer that accepts PoW solutions and provides tokens as a reward.

This solution tends to be less optimal than applying proposal#327 directly because it doesn't allow us to fine-tune the PoW difficulty based on the attack severity; which is something we are able to do with proposal#327.

However, we can use the fact that token issuance happens over HTTP to introduce more advanced PoW-based concepts. For example, we can design token issuers that accept blockchain shares in exchange for tokens: a system like Monero's Primo could be used to provide DoS protection and also incentivize the token issuer, which could use those shares for pool mining REF_PRIMO.

5.3. Onion service self-issuing

The onion service itself can also issue tokens to its users and then use itself as an issuer for verification. This way it can reward trusted users by giving them tokens for the future. The tokens can be rewarded from within the website of the onion service and passed to the Tor client through the control port, or they can be provided in an out-of-band way for future use (e.g. from a journalist to a future source using a QR code).

Unfortunately, the anonymous credential scheme specified in this proposal is one-show, so the onion service cannot provide a single token that will work for multiple "logins". In the future we can design multi-show credential systems that also have revocation to further facilitate this use case (see FUTURE_RES for more info).

6. User Experience

This proposal has user facing UX consequences.

Ideally we want this process to be invisible to the user and things to "just work". This can be achieved with token issuers that don't require manual work by the user (e.g. the PoW issuer, or the onion service itself), since both the token issuance and the token redemption protocols don't require any manual work.

In the cases where manual work is needed by the user (e.g. solving a CAPTCHA) it's ideal if the work is presented to the user right before visiting the destination and only if it's absolutely required. An explanation about the service being under attack should be given to the user when the CAPTCHA is provided.

7. Security

In this section we analyze potential security threats of the above system:

  • An evil client can hoard tokens for hours and unleash them all at once to cause a denial of service attack. We might want to make the key rotation even more frequent if we think that's a possible threat.

  • A trusted token issuer can always DoS an onion service by forging tokens.

  • Overwhelming attacks like "top half attacks" and "hybrid attacks" from proposal#327 are valid for this proposal as well.

  • A bad RNG can completely wreck the unlinkability properties of this proposal.

XXX Actually analyze the above if we think there is merit to listing them

8. Discussion [DISCUSSION]

8.1. Using Res tokens on Exit relays

There are more scenarios within Tor that could benefit from Res tokens; however, we didn't expand on those use cases, to keep the proposal short. In the future, we might want to split this document into two proposals: one proposal that specifies the token scheme, and another that specifies how to use it in the context of onion services, so that we can then write more proposals that use the token scheme as a primitive.

An extremely relevant use case would be to use Res tokens as a way to protect and improve the IP reputation of Exit relays. We can introduce an exit pool that requires tokens in exchange for circuit streams. The idea is that exits that require tokens will see less abuse, and will not have low scores in the various IP address reputation systems that now govern who gets access to websites and web services on the public Internet. We hope that this way we will see fewer websites blocking Tor.

8.2. Future improvements to this proposal [FUTURE_RES]

The Res token scheme is a pragmatic scheme that works for the space/time constraints of this use case but it's far from ideal for the greater future (RSA? RSA-1024?).

After Tor proposal#319 gets implemented we will be able to pack more data in RELAY cells and that opens the door to token schemes with bigger token sizes. For example, we could design schemes based on BBS+ that can provide more advanced features like multi-show and complex attributes but currently have bigger token sizes (300+ bytes). That would greatly improve UX since the client won't have to solve multiple CAPTCHAs to gain access. Unfortunately, another problem here is that right now pairing-based schemes have significantly worse verification performance than RSA (e.g. in the order of 4-5 ms compared to <0.5 ms). We expect pairing-based cryptography performance to only improve in the future and we are looking forward to these advances.

When we switch to a multi-show scheme, we will also need revocation support otherwise a single client can abuse the service with a single multi-show token. To achieve this we would need to use blacklisting schemes based on accumulators (or other primitives) that can provide more flexible revocation and blacklisting; however these come at the cost of additional verification time which is not something we can spare at this time. We warmly welcome research on revocation schemes that are lightweight on the verification side but can be heavy on the proving side.

8.3. Other uses for tokens in Tor

There are more use cases for tokens in Tor, but we think that other token schemes with different properties would be better suited for those.

In particular we could use tokens as authentication mechanisms for logging into services (e.g. acquiring bridges, or logging into Wikipedia). However for those use cases we would ideally need multi-show tokens with revocation support. We can also introduce token schemes that help us build a secure name system for onion services.

We hope that more research will be done on how to combine various token schemes together, and how we can maintain agility while using schemes with different primitives and properties.

9. Acknowledgements

Thanks to Jeff Burdges for all the information about Blind RSA and anonymous credentials.

Thanks to Michele Orrù for the help with the unlinkability proof and for the discussions about anonymous credentials.

Thanks to Chelsea Komlo for pointing towards anonymous credentials in the context of DoS defenses for onion services.


Appendix A: RSA Blinding Security Proof [BLIND_RSA_PROOF]

This proof sketch was provided by Michele Orrù:

RSA Blind Sigs: https://en.wikipedia.org/wiki/Blind_signature#Blind_RSA_signatures

As you say, blind RSA should be perfectly blind.

I tried to look at Boneh-Shoup, Katz-Lindell, and Bellare-Goldwasser for a proof, but didn't find any :(

The basic idea is proving that:
for any  message "m0" that is blinded with "r0^e" to obtain "b" (that is sent to the server), it is possible to freely choose another message "m1" that blinded with another opening "r1^e" to obtain the same "b".

As long as r1, r0 are chosen uniformly at random, you have no way of telling which message was picked and therefore it is *perfectly* blind.

To do so:
Assume the messages ("m0" and "m1") are invertible mod N=pq (this happens at most with overwhelming probability phi(N)/N if m is uniformly distributed as a result of a hash, or you can enforce it at signing time).

Blinding happens by computing:
   b = m0 * (r0^e).

However, I can also write:
   b = m0 * r0^e = (m1/m1) * m0 * r0^e = m1 * (m0/m1*r0^e).

This means that r1 = (m0/m1)^d * r0 is another valid blinding factor for b, and it's distributed exactly as r0 in the group of invertibles (it's unif at random, because r0 is so).
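
The identity at the heart of this sketch can be checked numerically. The snippet below uses a toy modulus built from two known Mersenne primes (an assumption for illustration, not a realistic key) and confirms that r1 = (m0/m1)^d * r0 opens the same blinded value b to a different message m1.

```python
import math
import secrets

# Toy RSA key (NOT secure); chosen only so the example runs instantly.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))

def random_unit(n: int) -> int:
    """Sample a uniformly random element that is invertible mod n."""
    while True:
        x = secrets.randbelow(n - 1) + 1
        if math.gcd(x, n) == 1:
            return x

m0, m1, r0 = random_unit(N), random_unit(N), random_unit(N)

# Blinding of m0 with r0, as sent to the issuer.
b = m0 * pow(r0, E, N) % N

# The alternative blinding factor from the proof: r1 = (m0/m1)^d * r0.
ratio = m0 * pow(m1, -1, N) % N
r1 = pow(ratio, D, N) * r0 % N

# The same b is also a valid blinding of m1 under r1.
same_blinding = (m1 * pow(r1, E, N) % N) == b
```

Since every message m1 has such an r1, and r1 is uniform whenever r0 is, the blinded value b carries no information about which message was chosen.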

[REF_PRIMO]: https://www.monerooutreach.org/stories/RPC-Pay.html

Filename: 332-ntor-v3-with-extra-data.md
Title: Ntor protocol with extra data, version 3.
Author: Nick Mathewson
Created: 12 July 2021
Status: Closed

Overview

The ntor handshake is our current protocol for circuit establishment.

So far we have two variants of the ntor handshake in use: the "ntor v1" that we use for everyday circuit extension (see tor-spec.txt) and the "hs-ntor" that we use for v3 onion service handshake (see rend-spec-v3.txt). This document defines a third version of ntor, adapting the improvements from hs-ntor for use in regular circuit establishment.

These improvements include:

  • Support for sending additional encrypted and authenticated protocol-setup handshake data as part of the ntor handshake. (The information sent from the client to the relay does not receive forward secrecy.)

  • Support for using an external shared secret that both parties must know in order to complete the handshake. (In the HS handshake, this is the subcredential. We don't use it for circuit extension, but in theory we could.)

  • Providing a single specification that can, in the future, be used both for circuit extension and HS introduction.

The improved protocol: an abstract view

Given a client "C" that wants to construct a circuit to a relay "S":

The client knows:

  • B: a public "onion key" for S
  • ID: an identity for S, represented as a fixed-length byte string.
  • CM: a message that it wants to send to S as part of the handshake.
  • An optional "verification" string.

The relay knows:

  • A set of [(b,B)...] "onion key" keypairs. One of them is "current", the others are outdated, but still valid.
  • ID: Its own identity.
  • A function for computing a server message SM, based on a given client message.
  • An optional "verification" string. This must match the "verification" string from the client.

Both parties have a strong source of randomness.

Given this information, the client computes a "client handshake" and sends it to the relay.

The relay then uses its information plus the client handshake to see if the incoming message is valid; if it is, then it computes a "server handshake" to send in reply.

The client processes the server handshake, and either succeeds or fails.

At this point, the client and the relay both have access to:

  • CM (the message the client sent)
  • SM (the message the relay sent)
  • KS (a shared byte stream of arbitrary length, used to compute keys to be used elsewhere in the protocol).

Additionally, the client knows that CM was sent only to the relay whose public onion key is B, and that KS is shared only with that relay.

The relay does not know which client participated in the handshake, but it does know that CM came from the same client that generated the key X, and that SM and KS were shared only with that client.

Both parties know that CM, SM, and KS were shared correctly, or not at all.

Both parties know that they used the same verification string; if they did not, they do not learn what the verification string was. (This feature is required for HS handshakes.)

The handshake in detail

Notation

We use the following notation:

  • | -- concatenation
  • "..." -- a byte string, with no terminating NUL.
  • ENCAP(s) -- an encapsulation function. We define this as htonll(len(s)) | s. (Note that len(ENCAP(s)) = len(s) + 8).
  • PARTITION(s, n1, n2, n3, ...) -- a function that partitions a bytestring s into chunks of length n1, n2, n3, and so on. Extra data is put into a final chunk. If s is not long enough, the function fails.
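
The two helper functions can be written down directly. This is a minimal sketch: the names `encap` and `partition` mirror the notation, and returning the trailing chunk even when it is empty is an assumption about how "extra data" is handled.

```python
import struct
from typing import List

def encap(s: bytes) -> bytes:
    """ENCAP(s) = htonll(len(s)) | s, i.e. a 64-bit big-endian length prefix."""
    return struct.pack(">Q", len(s)) + s

def partition(s: bytes, *lengths: int) -> List[bytes]:
    """Split s into chunks of the given lengths plus one final chunk of extras.

    Fails if s is too short. Assumption: the final chunk is always returned,
    even when it is empty.
    """
    if len(s) < sum(lengths):
        raise ValueError("input too short to partition")
    chunks, pos = [], 0
    for n in lengths:
        chunks.append(s[pos:pos + n])
        pos += n
    chunks.append(s[pos:])
    return chunks

wrapped = encap(b"abc")
pieces = partition(b"abcdef", 2, 3)
```

The length prefix is what makes concatenations such as `... | PROTOID | ENCAP(VER)` unambiguous when VER has variable length.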

We require the following crypto operations:

  • KDF(s,t) -- a tweakable key derivation function, returning a keystream of arbitrary length.
  • H(s,t) -- a tweakable hash function of output length DIGEST_LEN.
  • MAC(k, msg, t) -- a tweakable message-authentication-code function, with key length MAC_KEY_LEN and output length MAC_LEN.
  • EXP(pk,sk) -- our Diffie Hellman group operation, taking a public key of length PUB_KEY_LEN.
  • KEYGEN() -- our Diffie-Hellman keypair generation algorithm, returning a (secret-key,public-key) pair.
  • ENC(k, m) -- a stream cipher with key of length ENC_KEY_LEN. DEC(k, m) is its inverse.

Parameters:

  • PROTOID -- a short protocol identifier
  • t_* -- a set of "tweak" strings, used to derive distinct hashes from a single hash function.
  • ID_LEN -- the length of an identity key that uniquely identifies a relay.

Given our cryptographic operations and a set of tweak strings, we define:

H_foo(s) = H(s, t_foo)
MAC_foo(k, msg) = MAC(k, msg, t_foo)
KDF_foo(s) = KDF(s, t_foo)

See Appendix A.1 below for a set of instantiations for these operations and constants.

Client operation, phase 1

The client knows: B, ID -- the onion key and ID of the relay it wants to use. CM -- the message that it wants to send as part of its handshake. VER -- a verification string.

First, the client generates a single-use keypair:

x,X = KEYGEN()

and computes:

Bx = EXP(B,x)
secret_input_phase1 = Bx | ID | X | B | PROTOID | ENCAP(VER)
phase1_keys = KDF_msgkdf(secret_input_phase1)
(ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN)

encrypted_msg = ENC(ENC_K1, CM)
msg_mac = MAC_msgmac(MAC_K1, ID | B | X | encrypted_msg)

and sends:

NODEID      ID               [ID_LEN bytes]
KEYID       B                [PUB_KEY_LEN bytes]
CLIENT_PK   X                [PUB_KEY_LEN bytes]
MSG         encrypted_msg    [len(CM) bytes]
MAC         msg_mac          [last MAC_LEN bytes of message]

The client remembers x, X, B, ID, Bx, and msg_mac.
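The wire layout of the client's message can be sketched as follows. This is an illustration only: the function names are hypothetical, and the 32-byte lengths are taken from the instantiation in Appendix A.1. Because MSG has no length field, the parser recovers it as whatever lies between CLIENT_PK and the trailing MAC.

```python
ID_LEN = 32        # from Appendix A.1
PUB_KEY_LEN = 32   # from Appendix A.1
MAC_LEN = 32       # from Appendix A.1


def encode_client_handshake(node_id, key_id, client_pk, encrypted_msg, msg_mac):
    """Concatenate NODEID | KEYID | CLIENT_PK | MSG | MAC."""
    if (len(node_id), len(key_id), len(client_pk), len(msg_mac)) != (
            ID_LEN, PUB_KEY_LEN, PUB_KEY_LEN, MAC_LEN):
        raise ValueError("field has the wrong length")
    return node_id + key_id + client_pk + encrypted_msg + msg_mac


def parse_client_handshake(data: bytes):
    """Split a client message into (NODEID, KEYID, CLIENT_PK, MSG, MAC)."""
    fixed = ID_LEN + 2 * PUB_KEY_LEN
    if len(data) < fixed + MAC_LEN:
        raise ValueError("client handshake too short")
    node_id = data[:ID_LEN]
    key_id = data[ID_LEN:ID_LEN + PUB_KEY_LEN]
    client_pk = data[ID_LEN + PUB_KEY_LEN:fixed]
    msg = data[fixed:len(data) - MAC_LEN]   # everything before the MAC
    mac = data[len(data) - MAC_LEN:]        # last MAC_LEN bytes
    return node_id, key_id, client_pk, msg, mac
```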

Server operation

The relay checks whether NODEID is as expected, and looks up the (b,B) keypair corresponding to KEYID. If the keypair is missing or the NODEID is wrong, the handshake fails.

Now the relay uses X=CLIENT_PK to compute:

Xb = EXP(X,b)
secret_input_phase1 = Xb | ID | X | B | PROTOID | ENCAP(VER)
phase1_keys = KDF_msgkdf(secret_input_phase1)
(ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN)

expected_mac = MAC_msgmac(MAC_K1, ID | B | X | MSG)

If expected_mac is not MAC, the handshake fails. Otherwise the relay computes CM as:

CM = DEC(ENC_K1, MSG)

The relay then checks whether CM is well-formed, and in response composes SM, the reply that it wants to send as part of the handshake. It then generates a new ephemeral keypair:

y,Y = KEYGEN()

and computes the rest of the handshake:

Xy = EXP(X,y)
secret_input = Xy | Xb | ID | B | X | Y | PROTOID | ENCAP(VER)
ntor_key_seed = H_key_seed(secret_input)
verify = H_verify(secret_input)

RAW_KEYSTREAM = KDF_final(ntor_key_seed)
(ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...)

encrypted_msg = ENC(ENC_KEY, SM)

auth_input = verify | ID | B | Y | X | MAC | ENCAP(encrypted_msg) |
    PROTOID | "Server"
AUTH = H_auth(auth_input)

The relay then sends:

Y          Y              [PUB_KEY_LEN bytes]
AUTH       AUTH           [DIGEST_LEN bytes]
MSG        encrypted_msg  [len(SM) bytes, up to end of the message]

The relay uses KEYSTREAM to generate the shared secrets for the newly created circuit.

Client operation, phase 2

The client computes:

Yx = EXP(Y, x)
secret_input = Yx | Bx | ID | B | X | Y | PROTOID | ENCAP(VER)
ntor_key_seed = H_key_seed(secret_input)
verify = H_verify(secret_input)

auth_input = verify | ID | B | Y | X | MAC | ENCAP(MSG) |
    PROTOID | "Server"
AUTH_expected = H_auth(auth_input)

If AUTH_expected is equal to AUTH, then the handshake has succeeded. The client can then calculate:

RAW_KEYSTREAM = KDF_final(ntor_key_seed)
(ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...)

SM = DEC(ENC_KEY, MSG)

SM is the message from the relay, and the client uses KEYSTREAM to generate the shared secrets for the newly created circuit.

Security notes

Whenever comparing bytestrings, implementations SHOULD use a constant-time comparison function to avoid side-channel attacks.

To avoid small-subgroup attacks against the Diffie-Hellman function, implementations SHOULD either:

  • Make sure that all incoming group members are in fact in the DH group.
  • Validate all outputs from the EXP function to make sure that they are not degenerate.
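The two recommendations in this section might look like this in Python. This is a sketch under stated assumptions: the helper names are hypothetical, and the degeneracy test assumes the Curve25519 instantiation from Appendix A.1, where a low-order input point produces an all-zero EXP output.

```python
import hmac

PUB_KEY_LEN = 32  # from Appendix A.1


def bytes_equal(a: bytes, b: bytes) -> bool:
    """Compare two bytestrings (for example MAC or AUTH values) without
    leaking the position of the first differing byte."""
    return hmac.compare_digest(a, b)


def exp_output_is_degenerate(shared: bytes) -> bool:
    """Return True if an EXP output is all zeros (assumes Curve25519)."""
    return hmac.compare_digest(shared, bytes(PUB_KEY_LEN))
```

A party following the second option in the list above would abort the handshake whenever `exp_output_is_degenerate` returns True for any of its EXP results.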

Notes on usage

We don't specify what should actually be done with the resulting keystreams; that depends on the usage for which this handshake is employed. Typically, they'll be divided up into a series of tags and symmetric keys.

The keystreams generated here are (conceptually) unlimited. In practice, the usage will determine the amount of key material actually needed: that's the amount that clients and relays will actually generate.

The PROTOID parameter should be changed not only if the cryptographic operations change here, but also if the usage changes at all, or if the meaning of any parameters changes. (For example, if the encoding of CM and SM changed, or if ID were a different length or represented a different type of key, then we should start using a new PROTOID.)

A.1 Instantiation

Here is a set of functions based on SHA3, SHAKE-256, Curve25519, and AES256:

H(s, t) = SHA3_256(ENCAP(t) | s)
MAC(k, msg, t) = SHA3_256(ENCAP(t) | ENCAP(k) | msg)
KDF(s, t) = SHAKE_256(ENCAP(t) | s)
ENC(k, m) = AES_256_CTR(k, m)

EXP(pk,sk), KEYGEN: defined as in curve25519

DIGEST_LEN = MAC_LEN = MAC_KEY_LEN = ENC_KEY_LEN = PUB_KEY_LEN = 32

ID_LEN = 32  (representing an ed25519 identity key)

Notes on selected operations: SHA3 can be pretty slow, and AES256 is likely overkill. I'm choosing them anyway because they are what we use in hs-ntor, and in my preliminary experiments they don't account for even 1% of the time spent on this handshake.

t_msgkdf = PROTOID | ":kdf_phase1"
t_msgmac = PROTOID | ":msg_mac"
t_key_seed = PROTOID | ":key_seed"
t_verify = PROTOID | ":verify"
t_final = PROTOID | ":kdf_final"
t_auth = PROTOID | ":auth_final"
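The hash-based parts of this instantiation can be sketched with Python's standard `hashlib`. This is not the reference implementation: the function names are ours, the KDF takes an explicit output length because `shake_256` needs one, the PROTOID value is the one given in A.2, and ENC and EXP are omitted since they need libraries outside the standard library.

```python
import hashlib
import struct

PROTOID = b"ntor3-curve25519-sha3_256-1"  # value given in A.2
T_MSGKDF = PROTOID + b":kdf_phase1"
T_MSGMAC = PROTOID + b":msg_mac"
T_KEY_SEED = PROTOID + b":key_seed"
T_VERIFY = PROTOID + b":verify"


def encap(s: bytes) -> bytes:
    """ENCAP(s) = htonll(len(s)) | s."""
    return struct.pack(">Q", len(s)) + s


def h(s: bytes, t: bytes) -> bytes:
    """H(s, t) = SHA3_256(ENCAP(t) | s)."""
    return hashlib.sha3_256(encap(t) + s).digest()


def mac(k: bytes, msg: bytes, t: bytes) -> bytes:
    """MAC(k, msg, t) = SHA3_256(ENCAP(t) | ENCAP(k) | msg)."""
    return hashlib.sha3_256(encap(t) + encap(k) + msg).digest()


def kdf(s: bytes, t: bytes, n: int) -> bytes:
    """First n bytes of KDF(s, t) = SHAKE_256(ENCAP(t) | s)."""
    return hashlib.shake_256(encap(t) + s).digest(n)
```

Because SHAKE-256 is an extendable-output function, asking `kdf` for more bytes extends the same keystream instead of producing a different one.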

A.2 Encoding for use with Tor circuit extension

Here we give a concrete instantiation of ntor-v3 for use with circuit extension in Tor, and the parameters in A.1 above.

If in use, this is a new CREATE2 type. Clients should not use it unless the relay advertises support by including an appropriate version of the Relay=X subprotocol in its protocols list.

When the encoding and methods of this section, along with the instantiations from the previous section, are in use, we specify:

PROTOID = "ntor3-curve25519-sha3_256-1"

The key material is extracted as follows, unless modified by the handshake (see below). See tor-spec.txt for more info on the specific values:

Df    Digest authentication, forwards  [20 bytes]
Db    Digest authentication, backwards [20 bytes]
Kf    Encryption key, forwards         [16 bytes]
Kb    Encryption key, backwards        [16 bytes]
KH    Onion service nonce              [20 bytes]
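As a sketch of the default extraction above, the following Python splits the first 92 bytes of KEYSTREAM into the five fields in the order listed. The function name is hypothetical, and the sketch ignores any modification a handshake extension might make.

```python
# (name, length in bytes), in the order given in the table above.
KEY_LAYOUT = [("Df", 20), ("Db", 20), ("Kf", 16), ("Kb", 16), ("KH", 20)]


def split_circuit_keys(keystream: bytes) -> dict:
    """Cut the start of KEYSTREAM into Df, Db, Kf, Kb, and KH."""
    needed = sum(n for _, n in KEY_LAYOUT)  # 92 bytes in total
    if len(keystream) < needed:
        raise ValueError("not enough key material")
    keys = {}
    pos = 0
    for name, n in KEY_LAYOUT:
        keys[name] = keystream[pos:pos + n]
        pos += n
    return keys
```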

We use the following meta-encoding for the contents of client and server messages.

[Any number of times]:
EXTENSION
   EXT_FIELD_TYPE     [one byte]
   EXT_FIELD_LEN      [one byte]
   EXT_FIELD          [EXT_FIELD_LEN bytes]

(EXT_FIELD_LEN may be zero, in which case EXT_FIELD is absent.)

All parties MUST reject messages that are not well-formed per the rules above.

We do not specify specific TYPE semantics here; we leave those for other proposals and specifications.

Parties MUST ignore extensions whose EXT_FIELD_TYPE values they do not recognize.

Unless otherwise specified in the documentation for an extension type:

  • Each extension type SHOULD be sent only once in a message.
  • Parties MUST ignore all occurrences of an extension with a given type after the first such occurrence.
  • Extensions SHOULD be sent in numerically ascending order by type.

(The above extension sorting and multiplicity rules are only defaults; they may be overridden in the description of individual extensions.)
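The meta-encoding and its default rules can be illustrated with a small encoder and parser. This is a sketch, not a normative implementation: the function names are ours, and it applies only the defaults (ascending order when sending, first occurrence wins when receiving). Deciding which types are recognized is left to the caller.

```python
def encode_extensions(exts) -> bytes:
    """Encode (type, body) pairs, sorted in ascending order by type."""
    out = bytearray()
    for ext_type, body in sorted(exts, key=lambda e: e[0]):
        if not 0 <= ext_type <= 255 or len(body) > 255:
            raise ValueError("type and length must each fit in one byte")
        out += bytes([ext_type, len(body)]) + body
    return bytes(out)


def parse_extensions(data: bytes) -> dict:
    """Parse a message into {type: body}, rejecting malformed input and
    keeping only the first occurrence of each type."""
    result = {}
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated extension header")
        ext_type, ext_len = data[pos], data[pos + 1]
        pos += 2
        if pos + ext_len > len(data):
            raise ValueError("truncated extension body")
        body = data[pos:pos + ext_len]
        pos += ext_len
        result.setdefault(ext_type, body)  # later duplicates are ignored
    return result
```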

A.3 How much space is available?

We start with a 498-byte payload in each relay cell.

The header of the EXTEND2 cell, including link specifiers and other headers, comes to 89 bytes.

The client handshake requires 128 bytes (excluding CM).

That leaves 281 bytes, "which should be plenty".
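The arithmetic behind these figures can be checked directly. In this sketch the constant names are ours, and the breakdown of the 128-byte overhead assumes the A.1 lengths (NODEID, KEYID, CLIENT_PK, and MAC at 32 bytes each). Since ENC is a stream cipher, MSG is exactly as long as CM and adds no overhead.

```python
RELAY_PAYLOAD_LEN = 498    # payload bytes in each relay cell
EXTEND2_HEADER_LEN = 89    # link specifiers and other EXTEND2 headers
ID_LEN = PUB_KEY_LEN = MAC_LEN = 32

# NODEID + KEYID + CLIENT_PK + MAC, excluding the encrypted CM.
client_overhead = ID_LEN + 2 * PUB_KEY_LEN + MAC_LEN

# Bytes left over for the client message CM.
space_for_cm = RELAY_PAYLOAD_LEN - EXTEND2_HEADER_LEN - client_overhead
```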

X.1 Negotiating proposal-324 circuit windows

(We should move this section into prop324 when this proposal is finished.)

We define a type value, CIRCWINDOW_INC.

We define a triplet of consensus parameters: circwindow_inc_min, circwindow_inc_max, and circwindow_inc_dflt. These all have range (1,65535).

When the authority operators want to experiment with different values for circwindow_inc_dflt, they set circwindow_inc_min and circwindow_inc_max to the range in which they want to experiment, making sure that the existing circwindow_inc_dflt is within that range.

When a client sees that a relay supports the ntor3 handshake type (subprotocol Relay=X), and also supports the flow control algorithms of proposal 324 (subprotocol FlowCtrl=X), then the client sends a message, with type CIRCWINDOW_INC, containing a two-byte integer equal to circwindow_inc_dflt.

The relay rejects the message if the value given is outside of the [circwindow_inc_min, circwindow_inc_max] range. Otherwise, it accepts it, and replies with the same message that the client sent.
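A sketch of this negotiation follows. It covers only the body of the CIRCWINDOW_INC message, since the type value is not assigned here. The function names are hypothetical, and big-endian encoding of the two-byte integer is an assumption based on Tor's usual network byte order.

```python
import struct


def encode_circwindow_inc(value: int) -> bytes:
    """Client side: encode circwindow_inc_dflt as a two-byte integer."""
    return struct.pack(">H", value)  # assumption: network byte order


def relay_check_circwindow_inc(body: bytes, inc_min: int, inc_max: int) -> bytes:
    """Relay side: reject out-of-range values, otherwise echo the body back."""
    if len(body) != 2:
        raise ValueError("CIRCWINDOW_INC body must be two bytes")
    (value,) = struct.unpack(">H", body)
    if not inc_min <= value <= inc_max:
        raise ValueError("value outside [circwindow_inc_min, circwindow_inc_max]")
    return body
```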

X.2: Test vectors

The following test values, in hex, were generated by a Python reference implementation.

Inputs:

b = "4051daa5921cfa2a1c27b08451324919538e79e788a81b38cbed097a5dff454a"
B = "f8307a2bc1870b00b828bb74dbb8fd88e632a6375ab3bcd1ae706aaa8b6cdd1d"
ID = "9fad2af287ef942632833d21f946c6260c33fae6172b60006e86e4a6911753a2"
x = "b825a3719147bcbe5fb1d0b0fcb9c09e51948048e2e3283d2ab7b45b5ef38b49"
X = "252fe9ae91264c91d4ecb8501f79d0387e34ad8ca0f7c995184f7d11d5da4f46"
CM = "68656c6c6f20776f726c64"
VER = "78797a7a79"
y = "4865a5b7689dafd978f529291c7171bc159be076b92186405d13220b80e2a053"
Y = "4bf4814326fdab45ad5184f5518bd7fae25dc59374062698201a50a22954246d"
SM = "486f6c61204d756e646f"

Intermediate values:

ENC_K1 = "4cd166e93f1c60a29f8fb9ec40ea0fc878930c27800594593e1c4d0f3b5fbd02"
MAC_K1 = "f5b69e85fdd26e1b0bdbbc8128e32d8123040255f11f744af3cc98fc13613cda"
msg_mac = "9e044d53565f04d82bbb3bebed3d06cea65db8be9c72b68cd461942088502f67"
key_seed = "b9a092741098e1f5b8ab37ce74399dd57522c974d7ae4626283a1077b9273255"
verify = "1dc09fb249738a79f1bc3a545eee8c415f27213894a760bb4df58862e414799a"
ENC_KEY (server) = "cab8a93eef62246a83536c4384f331ec26061b66098c61421b6cae81f4f57c56"
AUTH = "2fc5f8773ca824542bc6cf6f57c7c29bbf4e5476461ab130c5b18ab0a9127665"

Messages:

client_handshake = "9fad2af287ef942632833d21f946c6260c33fae6172b60006e86e4a6911753a2f8307a2bc1870b00b828bb74dbb8fd88e632a6375ab3bcd1ae706aaa8b6cdd1d252fe9ae91264c91d4ecb8501f79d0387e34ad8ca0f7c995184f7d11d5da4f463bebd9151fd3b47c180abc9e044d53565f04d82bbb3bebed3d06cea65db8be9c72b68cd461942088502f67"

server_handshake = "4bf4814326fdab45ad5184f5518bd7fae25dc59374062698201a50a22954246d2fc5f8773ca824542bc6cf6f57c7c29bbf4e5476461ab130c5b18ab0a91276651202c3e1e87c0d32054c"

First 256 bytes of keystream:

KEYSTREAM = "9c19b631fd94ed86a817e01f6c80b0743a43f5faebd39cfaa8b00fa8bcc65c3bfeaa403d91acbd68a821bf6ee8504602b094a254392a07737d5662768c7a9fb1b2814bb34780eaee6e867c773e28c212ead563e98a1cd5d5b4576f5ee61c59bde025ff2851bb19b721421694f263818e3531e43a9e4e3e2c661e2ad547d8984caa28ebecd3e4525452299be26b9185a20a90ce1eac20a91f2832d731b54502b09749b5a2a2949292f8cfcbeffb790c7790ed935a9d251e7e336148ea83b063a5618fcff674a44581585fd22077ca0e52c59a24347a38d1a1ceebddbf238541f226b8f88d0fb9c07a1bcd2ea764bbbb5dacdaf5312a14c0b9e4f06309b0333b4a"

Filename: 333-vanguards-lite.md
Title: Vanguards lite
Author: George Kadianakis, Mike Perry
Created: 2021-05-20
Status: Closed
Implemented-In: 0.4.7.1-alpha

0. Introduction & Motivation

This proposal specifies a simplified version of Proposal 292 "Mesh-based vanguards" for the purposes of implementing it directly into the C Tor codebase.

For more details on guard discovery attacks and how vanguards defend against them, we refer to Proposal 292 PROP292_REF.

1. Overview

We propose an identical system to the Mesh-based Vanguards from proposal 292, but with the following differences:

  • No third layer of guards is used.
  • The Layer2 lifetime uses the max(x,x) distribution with a minimum of one day and maximum of 12 days. This makes the average lifetime approximately a week.
  • We let NUM_LAYER2_GUARDS=4. We also introduce a consensus parameter guard-hs-l2-number that controls the number of layer2 guards (with a maximum of 19 layer2 guards).
  • We don't write guards on disk. This means that the guard topology resets when tor restarts.
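To make the Layer2 lifetime rule concrete, here is a sketch that assumes max(x,x) means the larger of two independent uniform draws between the minimum and the maximum. That reading is our assumption about Proposal 292's distribution, and the names are hypothetical. Under this assumption the mean works out to 1 + (2/3) * 11, or about 8.3 days, which is in line with "approximately a week".

```python
import random

MIN_LIFETIME_DAYS = 1.0
MAX_LIFETIME_DAYS = 12.0


def sample_layer2_lifetime_days(rng: random.Random) -> float:
    """Draw a Layer2 guard lifetime as the max of two uniform samples
    (assumed reading of the max(x,x) distribution)."""
    a = rng.uniform(MIN_LIFETIME_DAYS, MAX_LIFETIME_DAYS)
    b = rng.uniform(MIN_LIFETIME_DAYS, MAX_LIFETIME_DAYS)
    return max(a, b)
```

Taking the maximum of two draws skews lifetimes toward the upper end of the range, so short-lived guards are less common than with a single uniform draw.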

By avoiding a third-layer of guards we avoid most of the linkability issues of Proposal 292. This means that we don't add an extra hop on top of most of our onion service paths, which increases performance. However, we do add an extra middle hop at the end of service-side introduction circuits to avoid linkability of L2s by the intro points.

This is what onion service circuits look like with this proposal:

 Client rend:   C -> G -> L2 -> Rend
 Client intro:  C -> G -> L2 -> M -> Intro
 Client hsdir:  C -> G -> L2 -> M -> HSDir
 Service rend:  C -> G -> L2 -> M -> Rend
 Service intro: C -> G -> L2 -> M -> Intro
 Service hsdir: C -> G -> L2 -> M -> HSDir

2. Rotation Period Analysis

From the table in Section 3.1 of Proposal 292, with NUM_LAYER2_GUARDS=4 the Sybil attack on Layer2 will complete with 50% chance in 18*7 days (126 days) for the 1% adversary, 4*7 days (one month) for the 5% adversary, and 2*7 days (two weeks) for the 10% adversary.

3. Tradeoffs from Proposal 292

This proposal has several advantages over Proposal 292:

By avoiding a third-layer of guards we reduce the linkability issues of Proposal 292, which means that we don't have to add an extra hop on top of our paths. This simplifies engineering and makes paths shorter by default: this means less latency and quicker page load times.

This proposal also comes with disadvantages:

The lack of third-layer guards makes it easier to launch guard discovery attacks against clients and onion services. Long-lived services are not well protected, and this proposal might provide those services with a false sense of security. Such services should still use the vanguards addon VANGUARDS_REF.

4. Implementation nuances

Tor replaces an L2 vanguard whenever it is no longer listed in the most recent consensus, with the goal that we will always have the right number of vanguards ready to be used.

For implementation reasons, we also replace a vanguard if it loses the Fast or Stable flag, because the path selection logic wants middle nodes to have those flags when it's building preemptive vanguard-using circuits.

The design doesn't have to be this way: we might instead have chosen to keep vanguards in our list as long as possible, and continue to use them even if they have lost some flags. This tradeoff is similar to the one in https://bugs.torproject.org/17773 about whether to continue using Entry Guards if they lose the Guard flag -- and Tor's current choice is "no, rotate" for that case too.

5. References

Filename: 334-middle-only-flag.txt
Title: A Directory Authority Flag To Mark Relays As Middle-only
Author: Neel Chauhan
Created: 2021-09-07
Status: Superseded
Superseded-by: 335-middle-only-redux.md

1. Introduction

  The Health Team often deals with a large number of relays with an incorrect
  configuration (e.g. not all relays in MyFamily), or needs validation that
  requires contacting the relay operator. It is desirable to put the said
  relays in a less powerful position, such as a middle only flag that prevents
  a relay from being used in more powerful positions like an entry guard or an
  exit relay. [1]

1.1. Motivation

  The proposed middle-only flag is needed by the Health Team to prevent
  misconfigured relays from being used in positions capable of deanonymizing
  users while the team evaluates the relay's risk to the network. An example
  of this scenario is when a guard and exit relay run by the same operator
  has an incomplete MyFamily, and the same operator's guard and exit are used
  in a circuit.

  The reason why we won't play with the Guard and Exit flags or weights to
  achieve the same goal is because even if we were to reduce the guard and
  exit weights of a misconfigured relay, it could keep some users at risk of
  deanonymization. Even a small fraction of users at risk of deanonymization
  isn't something we should aim for.

  One case we could look out for is if all relays are exit relays (unlikely),
  or if walking onions are working on the current Tor network. This proposal
  should not affect those scenarios, but we should watch out for these cases.

2. The MiddleOnly Flag

  We propose a consensus flag MiddleOnly. As mentioned earlier, relays will be
  assigned this flag from the directory authorities.

  This flag means that a relay must not be used as an entry guard or
  exit relay. This is to prevent issues with a misconfigured relay as described
  in Section 1 (Introduction) while the Health Team assesses the risk with the
  relay.

3. Implementation details

  The MiddleOnly flag can be assigned to relays whose IP addresses and/or
  fingerprints are configured at the directory authority level, similar to
  how the BadExit flag currently works. In short, if a relay's IP is
  designated as middle-only, the authority must assign the MiddleOnly flag;
  otherwise it must not assign it.

  Relays which haven't gotten the Guard or Exit flags yet but have IP addresses
  that aren't designated as middle-only in the dirauths must not get the
  MiddleOnly flag. This is to allow new entry guards and exit relays to enter
  the Tor network, while giving relay administrators flexibility to increase
  and reduce bandwidth, or change their exit policy.

3.1. Client Implementation

  Clients should interpret the MiddleOnly flag while parsing relay descriptors
  to determine whether a relay is to be avoided for non-middle purposes. If
  a client parses the MiddleOnly flag, it must not use MiddleOnly-designated
  relays as entry guards or exit relays.

3.2. MiddleOnly Relay Purposes

  If a relay has the MiddleOnly flag, we do not allow it to be used for the
  following purposes:

   * Entry Guard

   * Directory Guard

   * Exit Relay

  The reason for this is to prevent a misconfigured relay from being used
  in places where they may know about clients or destination traffic. This
  is in case certain misconfigured relays are used to deanonymize clients.

  We could also bar a MiddleOnly relay from other purposes such as rendezvous
  and fallback directory purposes. However, while more secure in theory, this
  adds unnecessary complexity to the Tor design and has the possibility of
  breaking clients that aren't MiddleOnly-aware [2].

4. Consensus Considerations

4.1. Consensus Methods

  We propose a new consensus method 32, which is to only use this flag if and
  when all authorities understand the flag and agree on it. This is because the
  MiddleOnly flag impacts path selection for clients.

4.2. Consensus Requirements

  The MiddleOnly flag would work like most other consensus flags where a
  majority of dirauths have to assign a relay the flag in order for a relay
  to have the MiddleOnly flag.

  Another approach is to make it that only one dirauth is needed to give
  relays this flag, however it would put too much power in the hands of a
  single directory authority server [3].

5. Acknowledgements

  Thank you so much to nusenu, s7r, David Goulet, and Roger Dingledine for your
  suggestions to Prop334. My proposal wouldn't be what it is without you.

6. Citations

  [1] - https://gitlab.torproject.org/tpo/core/tor/-/issues/40448

  [2] - https://lists.torproject.org/pipermail/tor-dev/2021-September/014627.html

  [3] - https://lists.torproject.org/pipermail/tor-dev/2021-September/014630.html
Filename: 335-middle-only-redux.md
Title: An authority-only design for MiddleOnly
Author: Nick Mathewson
Created: 2021-10-08
Status: Closed
Implemented-In: 0.4.7.2-alpha

Introduction

This proposal describes an alternative design for a MiddleOnly flag. Instead of making changes at the client level, it adds a little complexity to the directory authorities' voting process. In return for that complexity, this design will work without additional changes required from Tor clients.

For additional motivation and discussion see proposal 334 by Neel Chauhan, and the related discussions on tor-dev.

Protocol changes

Generating votes

When voting for a relay with the MiddleOnly flag, an authority should vote for all flags indicating that a relay is unusable for a particular purpose, and against all flags indicating that the relay is usable for a particular position.

Specifically, these flags SHOULD be set in a vote whenever MiddleOnly is present, and only when the authority is configured to vote on the BadExit flag.

  • BadExit

And these flags SHOULD be cleared in a vote whenever MiddleOnly is present.

  • Exit
  • Guard
  • HSDir
  • V2Dir

Computing a consensus

This proposal will introduce a new consensus method (probably 32). Whenever computing a consensus using that consensus method or later, authorities post-process the set of flags that appear in the consensus after flag voting takes place, by applying the same rule as above.

That is, with this consensus method, the authorities first compute the presence or absence of each flag on each relay as usual. Then, if the MiddleOnly flag is present, the authorities set BadExit, and clear Exit, Guard, HSDir, and V2Dir.
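The post-processing step can be sketched as a pure function over a relay's flag set. The function and constant names are ours. The sketch models only the consensus step described above, not the vote-generation rule that depends on whether an authority votes on BadExit.

```python
FLAGS_SET_BY_MIDDLE_ONLY = {"BadExit"}
FLAGS_CLEARED_BY_MIDDLE_ONLY = {"Exit", "Guard", "HSDir", "V2Dir"}


def apply_middle_only(flags: set) -> set:
    """Return the relay's consensus flags after MiddleOnly post-processing."""
    if "MiddleOnly" not in flags:
        return set(flags)
    return (set(flags) | FLAGS_SET_BY_MIDDLE_ONLY) - FLAGS_CLEARED_BY_MIDDLE_ONLY
```

For example, a relay that ends flag voting with MiddleOnly, Exit, Guard, and Fast leaves this step with MiddleOnly, BadExit, and Fast.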

Configuring authorities

We'll need a means for configuring which relays will receive this flag. For now, we'll just reuse the same mechanism as AuthDirReject and AuthDirBadExit: a set of torrc configuration lines listing relays by address. We'll call this AuthDirMiddleOnly.

We'll also add an AuthDirListsMiddleOnly option to turn on or off voting on this option at all.

Notes on safety and migration

Under this design, the MiddleOnly option becomes useful immediately, since authorities that use it will stop voting for certain additional options for MiddleOnly relays without waiting for the other authorities.

We don't need to worry about a single authority setting MiddleOnly unilaterally for all relays, since the MiddleOnly flag will have no special effect until most authorities have upgraded to the new consensus method.

Filename: 336-randomize-guard-retries.md
Title: Randomized schedule for guard retries
Author: Nick Mathewson
Created: 2021-10-22
Status: Closed

Implementation Status

This proposal is implemented in Arti, and recommended for future guard implementations. We have no current plans to implement it in C Tor.

Introduction

When we notice that a guard isn't working, we don't mark it as retriable until a certain interval has passed. Currently, these intervals are fixed, as described in the documentation for GUARDS_RETRY_SCHED in guard-spec appendix A.1. Here we propose using a randomized retry interval instead, based on the same decorrelated-jitter algorithm we use for directory retries.

The upside of this approach is that it makes our behavior in the presence of an unreliable network a bit harder for an attacker to predict. It also means that if a guard goes down for a while, its clients will notice that it is up at staggered times, rather than probing it in lock-step.

The downside of this approach is that we can, if we get unlucky enough, completely fail to notice that a preferred guard is online when we would otherwise have noticed sooner.

Note that when a guard is marked retriable, it isn't necessarily retried immediately. Instead, its status is changed from "Unreachable" to "Unknown", which will cause it to get retried.

For reference, our previous schedule was:

   {param:PRIMARY_GUARDS_RETRY_SCHED}
      -- every 10 minutes for the first six hours,
      -- every 90 minutes for the next 90 hours,
      -- every 4 hours for the next 3 days,
      -- every 9 hours thereafter.

   {param:GUARDS_RETRY_SCHED}
      -- every hour for the first six hours,
      -- every 4 hours for the next 90 hours,
      -- every 18 hours for the next 3 days,
      -- every 36 hours thereafter.

The new algorithm

We re-use the decorrelated-jitter algorithm from dir-spec section 5.5. The specific formula used to compute the 'i+1'th delay is:

Delay_{i+1} = MIN(cap, random_between(lower_bound, upper_bound))
where upper_bound = MAX(lower_bound+1, Delay_i * 3)
      lower_bound = MAX(1, base_delay).

For primary guards, we set base_delay to 30 seconds and cap to 6 hours.

For non-primary guards, we set base_delay to 10 minutes and cap to 36 hours.

(These parameters were selected by simulating the results of using them until they looked "a bit more aggressive" than the current algorithm, but not too much.)
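The formula above can be sketched as follows, with all delays in seconds. The function name is ours, and drawing `random_between` as a continuous uniform sample is an assumption: an implementation might instead draw whole seconds or treat the bounds differently.

```python
import random

PRIMARY_BASE_DELAY = 30            # 30 seconds
PRIMARY_CAP = 6 * 3600             # 6 hours
NONPRIMARY_BASE_DELAY = 10 * 60    # 10 minutes
NONPRIMARY_CAP = 36 * 3600         # 36 hours


def next_retry_delay(prev_delay: float, base_delay: float, cap: float,
                     rng: random.Random) -> float:
    """Compute Delay_{i+1} from Delay_i using decorrelated jitter."""
    lower_bound = max(1, base_delay)
    upper_bound = max(lower_bound + 1, prev_delay * 3)
    return min(cap, rng.uniform(lower_bound, upper_bound))
```

One plausible way to use it is to start with `prev_delay` equal to `base_delay` and feed each result back in as the next `prev_delay`. That starting value is also an assumption.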

The average behavior for the new primary schedule is:

First 1.0 hours: 10.14283 attempts. (Avg delay 4m 47.41s)
First 6.0 hours: 19.02377 attempts. (Avg delay 15m 36.95s)
First 96.0 hours: 56.11173 attempts. (Avg delay 1h 40m 3.13s)
First 168.0 hours: 83.67091 attempts. (Avg delay 1h 58m 43.16s)
Steady state: 2h 36m 44.63s between attempts.

The average behavior for the new non-primary schedule is:

First 1.0 hours: 3.08069 attempts. (Avg delay 14m 26.08s)
First 6.0 hours: 8.1473 attempts. (Avg delay 35m 25.27s)
First 96.0 hours: 22.57442 attempts. (Avg delay 3h 49m 32.16s)
First 168.0 hours: 29.02873 attempts. (Avg delay 5h 27m 2.36s)
Steady state: 11h 15m 28.47s between attempts.
Filename: 337-simpler-guard-usability.md
Title: A simpler way to decide, "Is this guard usable?"
Author: Nick Mathewson
Created: 2021-10-22
Status: Closed

Introduction

The current guard-spec describes a mechanism for how to behave when our primary guards are unreachable, and we don't know which other guards are reachable. This proposal describes a simpler method, currently implemented in Arti.

(Note that this method might not actually give different results: its only advantage is that it is much simpler to implement.)

The task at hand

For illustration, we'll assume that our primary guards are P1, P2, and P3, and our subsequent guards (in preference order) are G1, G2, G3, and so on. The status of each guard is Reachable (we think we can connect to it), Unreachable (we think it's down), or Unknown (we haven't tried it recently).

The question becomes, "What should we do when P1, P2, and P3 are Unreachable, and G1, G2, ... are all Unknown"?

In this circumstance, we could say that we only build circuits to G1, wait for them to succeed or fail, and only try G2 if we see that the circuits to G1 have failed completely. But that causes delays when G1 is down.

Instead, the first time we get a circuit request, we try to build one circuit to G1. On the next circuit request, if the circuit to G1 isn't done yet, we launch a circuit to G2 instead. The next request (if the G1 and G2 circuits are still pending) goes to G3, and so on. But (here's the critical part!) we don't actually use the circuit to G2 unless the circuit to G1 fails, and we don't actually use the circuit to G3 unless the circuits to G1 and G2 both fail.

This approach causes Tor clients to check the status of multiple possible guards in parallel, while not actually using any guard until we're sure that all the guards we'd rather use are down.

The current algorithm and its drawbacks

For the current algorithm, see guard-spec section 4.9: circuits are exploratory if they are not using a primary guard. If such an exploratory circuit is waiting_for_better_guard, then we advance it (or not) depending on the status of all other circuits using guards that we'd rather be using.

In other words, the current algorithm is described in terms of actions to take with given circuits.

For Arti (and for other modular Tor implementations), however, this algorithm is a bit of a pain: it introduces dependencies between the guard code and the circuit handling code, requiring each one to mess with the other.

Proposal

I suggest that we describe an alternative algorithm for handling circuits to non-primary guards, to be used in preference to the current algorithm. Unlike the existing approach, it isolates the guard logic a bit better from the circuit logic.

Handling exploratory circuits

When all primary guards are Unreachable, we need to try non-primary guards. We select the first such guard (in preference order) that is neither Unreachable nor Pending. Whenever we give out such a guard, if the guard's status is Unknown, then we call that guard "Pending" until the attempt to use it succeeds or fails. We remember when the guard became Pending.

Aside: None of the above is a change from our existing specification.

After completing a circuit, the implementation must check whether its guard is usable. A guard is usable according to these rules:

Primary guards are always usable.

Non-primary guards are usable for a given circuit if every guard earlier in the preference list is either unsuitable for that circuit (e.g. because of family restrictions), or marked as Unreachable, or has been pending for at least {NONPRIMARY_GUARD_CONNECT_TIMEOUT}.

Non-primary guards are unusable for a given circuit if some guard earlier in the preference list is suitable for the circuit and Reachable.

Non-primary guards are unusable if they have not become usable after {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.

If a circuit's guard is neither usable nor unusable immediately, the circuit is not discarded; instead, it is kept (but not used) until it becomes usable or unusable.
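The usability rules above can be sketched as a three-valued check: True for usable, False for unusable, and None for "not decided yet". All names here are hypothetical, the timeouts are passed in as parameters because their values are not given in this section, and `waiting_since` (the time the circuit started waiting for a verdict) is our own bookkeeping assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    REACHABLE = 1
    UNREACHABLE = 2
    UNKNOWN = 3
    PENDING = 4


@dataclass
class Guard:
    name: str
    status: Status
    primary: bool = False
    suitable: bool = True            # e.g. False under family restrictions
    pending_since: Optional[float] = None


def guard_usability(guards: List[Guard], index: int, now: float,
                    waiting_since: float, connect_timeout: float,
                    idle_timeout: float) -> Optional[bool]:
    """Decide whether guards[index] may be used; guards is in preference order."""
    guard = guards[index]
    if guard.primary:
        return True
    earlier = guards[:index]
    # Unusable if some earlier guard is suitable and Reachable.
    if any(g.suitable and g.status is Status.REACHABLE for g in earlier):
        return False

    def out_of_the_way(g: Guard) -> bool:
        if not g.suitable or g.status is Status.UNREACHABLE:
            return True
        return (g.status is Status.PENDING and g.pending_since is not None
                and now - g.pending_since >= connect_timeout)

    if all(out_of_the_way(g) for g in earlier):
        return True
    # Still undecided: give up once the idle timeout has elapsed.
    if now - waiting_since >= idle_timeout:
        return False
    return None
```

A circuit whose guard gets None is kept but not used, and the check is repeated when a guard's status changes or a timeout expires.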

I am not 100% sure whether this description produces the same behavior as the current guard-spec, but it is simpler to describe, and has proven to be simpler to implement.

Implications for program design.

(This entire section is implementation detail to explain why this is a simplification from the previous algorithm. It is for explanatory purposes only and is not part of the spec.)

With this algorithm, we cut down the interaction between the guard code and the circuit code considerably, but we do not remove it entirely. Instead, there remains (in Arti terms) a pair of communication channels between the circuit manager and the guard manager:

  • Whenever a guard is given to the circuit manager, the circuit manager receives the write end of a single-use channel to report whether the guard has succeeded or failed.

  • Whenever a non-primary guard is given to the circuit manager, the circuit manager receives the read end of a single-use channel that will tell it whether the guard is usable or unusable. This channel doesn't report anything until the guard has one status or the other.

With this design, the circuit manager never needs to look at the list of guards, and the guard manager never needs to look at the list of circuits.

Subtleties concerning "guard success"

Note that the above definitions of a Reachable guard depend on reporting when the guard is successful or failed. This is not necessarily the same as reporting whether the circuit is successful or failed. For example, a circuit that fails after the first hop does not necessarily indicate that there's anything wrong with the guard. Similarly, we can reasonably conclude that the guard is working (at least somewhat) as long as we have an open channel to it.

Filename: 338-netinfo-y2038.md
Title: Use an 8-byte timestamp in NETINFO cells
Author: Nick Mathewson
Created: 2022-03-14
Status: Accepted

Introduction

Currently Tor relays use a 4-byte timestamp (in seconds since the Unix epoch) in their NETINFO cells. Notoriously, such a timestamp will overflow on 19 January 2038.

Let's get ahead of the problem and squash this issue now, by expanding the timestamp to 8 bytes. (8 bytes worth of seconds will be long enough to outlast the Earth's sun.)

Proposed change

I propose adding a new link protocol version. (The next one in sequence, as of this writing, is version 6.)

I propose that we change the text of tor-spec section 4.5 from:

      TIME       (Timestamp)                     [4 bytes]

to

     TIME       (Timestamp)                     [4 or 8 bytes *]

and specify that this field is 4 bytes wide on link protocols 1-5, but 8 bytes wide on link protocols 6 and beyond.
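A sketch of how an implementation might encode and decode the TIME field under this change follows. The function names are ours, treating the field as an unsigned big-endian integer is an assumption, and the version threshold of 6 is the "next in sequence" number mentioned above, not a final assignment.

```python
import struct

FIRST_LINK_PROTOCOL_WITH_8_BYTE_TIME = 6  # assumed, per the text above


def encode_netinfo_time(timestamp: int, link_protocol: int) -> bytes:
    """Encode TIME as 4 bytes on link protocols 1-5 and 8 bytes afterwards."""
    if link_protocol >= FIRST_LINK_PROTOCOL_WITH_8_BYTE_TIME:
        return struct.pack(">Q", timestamp)
    # Raises struct.error if the timestamp no longer fits in 4 bytes.
    return struct.pack(">I", timestamp)


def decode_netinfo_time(field: bytes, link_protocol: int) -> int:
    """Decode a TIME field whose width depends on the link protocol."""
    if link_protocol >= FIRST_LINK_PROTOCOL_WITH_8_BYTE_TIME:
        (value,) = struct.unpack(">Q", field)
    else:
        (value,) = struct.unpack(">I", field)
    return value
```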

Rejected alternatives

Our protocol specifies that parties MUST ignore extra data at the end of cells. Therefore we could add additional data at the end of the NETINFO cell, and use that to store the high 4 bytes of the timestamp without having to increase the link protocol version number. I propose that we don't do that: it's ugly.

As another alternative, we could declare that parties must interpret the timestamp such that its high 4 bytes place it as close as possible to their current time. I'm rejecting this kludge because it would give confusing results in the too-common case where clients have their clocks mis-set to Jan 1, 1970.

Impacts on our implementations

Arti won't be able to implement this change until it supports connection padding (as required by link protocol 5), which is currently planned for the next Arti milestone (1.0.0, scheduled for this fall).

If we think that that's a problem, or if we want to have support for implementations without connection padding in the future, we should reconsider this plan so that connection padding support is independent from 8-byte timestamps.

Other timestamps in Tor

I've done a cursory search of our protocols to see if we have any other instances of the Y2038 problem.

There is a 4-byte timestamp in cert-spec, but that one is an unsigned integer counting hours since the Unix epoch, which will keep it from wrapping around until roughly 491,900 C.E. (2^32 hours is about 490,000 years). (The rollover date of "10136 CE" reported in cert-spec is wrong, and seems to be based on the misapprehension that the counter is in minutes.)

The v2 onion service protocol has 4-byte timestamps, but it is thoroughly deprecated and unsupported.

I couldn't find any other 4-byte timestamps, but that is no guarantee: others should look for them too.

Filename: 339-udp-over-tor.md
Title: UDP traffic over Tor
Author: Nick Mathewson
Created: 11 May 2020
Status: Accepted

Introduction

Tor currently only supports delivering two kinds of traffic to the internet: TCP data streams, and a certain limited subset of DNS requests. This proposal describes a plan to extend the Tor protocol so that exit relays can also relay UDP traffic to the network.

Why would we want to do this? There are important protocols that use UDP, and in order to support users that rely on these protocols, we'll need to support them over Tor.

This proposal is a minimal version of UDP-over-Tor. Notably, it does not add an unreliable out-of-order transport to Tor's semantics. Instead, UDP messages are just tunneled over Tor's existing reliable in-order circuits. (Adding a datagram transport to Tor is attractive for some reasons, but it presents a number of problems; see this whitepaper for more information.)

In some parts of this proposal I'll assume that we have accepted and implemented some version of proposal 319 (relay fragment cells) so that we can transmit relay messages larger than 498 bytes.

Overview

UDP is a datagram protocol; it allows messages of up to 65535 bytes (header included), though in practice most protocols will use smaller messages in order to avoid having to deal with fragmentation.

UDP messages can be dropped or re-ordered. There is no authentication or encryption baked into UDP, though it can be added by higher-level protocols like DTLS or QUIC.

When an application opens a UDP socket, the OS assigns it a 16-bit port on some IP address of a local interface. The application may send datagrams from that address:port combination, and will receive datagrams sent to that address:port.

With most (all?) IP stacks, a UDP socket can either be connected to a remote address:port (in which case all messages will be sent to that address:port, and only messages from that address:port will be passed to the application), or unconnected (in which case outgoing messages can be sent to any address:port, and incoming messages from any address:port will be accepted).

In this version of the protocol, we support only connected UDP sockets, though we provide extension points for someday adding unconnected socket support.

Tor protocol specification

Overview

We reserve three new relay commands: CONNECT_UDP, CONNECTED_UDP and DATAGRAM.

The CONNECT_UDP command is sent by a client to an exit relay to tell it to open a new UDP stream "connected" to a targeted address and UDP port. The same restrictions apply as for BEGIN cells: the target must be permitted by the relay's exit policy, the target must not be private, localhost, or ANY, the circuit must appear to be multi-hop, there must not be a stream with the same ID on the same circuit, and so on.

On success, the relay replies with a CONNECTED_UDP cell telling the client the IP address it is connected to, and which IP address and port (on the relay) it has bound to. On failure, the relay replies immediately with an END cell.

(Note that we do not allow the client to choose an arbitrary port to bind to. It doesn't work when two clients want the same port, and makes it too easy to probe which ports are in use.)

When the UDP stream is open, the client can send DATAGRAM messages to the exit relay and receive them from it. Each such message corresponds to a single UDP datagram. If a datagram is larger than 498 bytes, it is transmitted as a fragmented message.

When a client no longer wishes to use a UDP stream, but it wants to keep the circuit open, it sends an END cell over the circuit. Upon receiving this message, the exit closes the stream, and stops sending any more cells on it.

Exits MAY send an END cell on a UDP stream; when a client receives it, it must treat the UDP stream as closed. Exits MAY send END cells in response to resource exhaustion, time-out signals, or (TODO what else?).

(TODO: Should there be an END ACK? We've wanted one in DATA streams for a while, to know when we can treat a stream as definitively gone-away.)

Optimistic traffic is permitted as with TCP streams: a client MAY send DATAGRAM messages immediately after its CONNECT_UDP message, without waiting for a CONNECTED_UDP. These are dropped if the CONNECT_UDP fails.

Clients and exits MAY drop incoming datagrams if their stream or circuit buffers are too full. (Once a DATAGRAM message has been sent on a circuit, however, it cannot be dropped until it reaches its intended recipient.)

Circuits carrying UDP traffic obey the same SENDME congestion control protocol as other circuits. Rather than using XON/XOFF to control transmission, excess packets may simply be dropped. UDP and TCP traffic can be mixed on the same circuit, but not on the same stream.

Discussion on "too full"

(To be determined! We need an algorithm here before we implement, though our choice of algorithm doesn't need to be the same on all exits or for all clients, IIUC.)

Discussion from the pad:

  - "Too full" should be a pair of watermark consensus parameter in
     implementation, imo. At the low watermark, random early dropping
     MAY be performed, a-la RED, etc. At the high watermark, all packets
     SHOULD be dropped. - mike
  - +1. I left "too full" as deliberately underspecified here, since I figured
    you would have a better idea than me about what it should really be.
    Maybe we should say "for one suggested algorithm, see section X below" and
    describe the algorithm you propose above in a bit more detail? -nickm
    - I have not dug deeply into drop strategies, but I believe that BLUE
      is what is in use now: https://en.wikipedia.org/wiki/Blue_(queue_management_algorithm)
    - Additionally, an important implementation detail is that it is likely
      best to actually continue to read even if our buffer is full, so we can
      perform the drop ourselves and ensure the kernel/socket buffers don't
      also bloat on us. Though this may have tradeoffs with the eventloop
      bottleneck on C-Tor. Because of that bottleneck, it might be best to
      stop reading. arti likely will have different optimal properties here. -mike
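For illustration only, here is one way the two-watermark idea from the pad could look. This is a sketch of a RED-style rule, not a decided algorithm: the function name, the linear ramp between the watermarks, and the parameter names are all my assumptions.

```python
import random

def should_drop(queue_len: int, low: int, high: int, rng=random.random) -> bool:
    """Decide whether to drop an incoming datagram.

    Below the low watermark we never drop; at or above the high watermark
    we always drop; in between we drop with a probability that rises
    linearly (an assumed RED-like ramp).
    """
    if not 0 <= low < high:
        raise ValueError("need 0 <= low < high")
    if queue_len < low:
        return False
    if queue_len >= high:
        return True
    drop_probability = (queue_len - low) / (high - low)
    return rng() < drop_probability
```

The `rng` argument exists only so that the decision can be made deterministic in tests; in the pad's suggestion, `low` and `high` would come from consensus parameters.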

Message formats

Here we describe the format for the bodies of the new relay messages, along with extensions to some older relay message types. We note in passing how we could extend these messages to support unconnected UDP sockets in the future.

Common Format

We define here a common format for an "address" that is used in both CONNECT_UDP and CONNECTED_UDP cells.

Address

Defines an IP or Hostname address along with its port. This can be seen as the ADDRPORT of a BEGIN cell defined in tor-spec.txt but with a different format.

/* Address types.

  Note that these are the same as in RESOLVED cells.
*/
const T_HOSTNAME = 0x00;
const T_IPV4     = 0x04;
const T_IPV6     = 0x06;

struct address {
   u8 type IN [T_IPV4, T_IPV6, T_HOSTNAME];
   u8 len;
   union addr[type] with length len {
      T_IPV4: u32 ipv4;
      T_IPV6: u8 ipv6[16];
      T_HOSTNAME: u8 hostname[];
   };
   u16 port;
}

The hostname follows RFC 1035 for its accepted length: each label is 63 octets or less, and the whole name must fit in a len between 0 and 255 (bytes). It should contain only nonzero octets, since any NUL byte results in a malformed cell.
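A minimal sketch of encoding and decoding the address structure above might look like this. The function names are hypothetical, and I'm assuming network (big-endian) byte order for the port, as elsewhere in Tor.

```python
import ipaddress
import struct

T_HOSTNAME, T_IPV4, T_IPV6 = 0x00, 0x04, 0x06

def encode_address(host: str, port: int) -> bytes:
    """Encode type, len, addr, port. IP literals become T_IPV4/T_IPV6."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None
    if ip is not None:
        addr_type = T_IPV4 if ip.version == 4 else T_IPV6
        raw = ip.packed
    else:
        addr_type = T_HOSTNAME
        raw = host.encode("ascii")
        if b"\0" in raw or len(raw) > 255:
            raise ValueError("malformed hostname")
    return struct.pack(">BB", addr_type, len(raw)) + raw + struct.pack(">H", port)

def decode_address(buf: bytes) -> tuple[int, bytes, int, bytes]:
    """Return (type, raw address bytes, port, unparsed remainder)."""
    if len(buf) < 2:
        raise ValueError("truncated address")
    addr_type, length = buf[0], buf[1]
    if addr_type not in (T_HOSTNAME, T_IPV4, T_IPV6):
        raise ValueError("unknown address type")
    if (addr_type == T_IPV4 and length != 4) or (addr_type == T_IPV6 and length != 16):
        raise ValueError("bad length for address type")
    if len(buf) < 2 + length + 2:
        raise ValueError("truncated address")
    raw = buf[2:2 + length]
    if addr_type == T_HOSTNAME and b"\0" in raw:
        raise ValueError("NUL byte in hostname")
    (port,) = struct.unpack_from(">H", buf, 2 + length)
    return addr_type, raw, port, buf[2 + length + 2:]
```

Returning the unparsed remainder lets a caller decode two addresses back to back, as a CONNECTED_UDP body would need.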

CONNECT_UDP

/* Tells an exit to open a UDP stream connected to a new target
   address.  The stream ID is chosen by the client, and is part of
   the relay header.
*/

struct connect_udp_body {
   /* As in BEGIN cells. */
   u32 flags;
   /* Address to connect to. */
   struct address addr;
   // The rest is ignored.

   // TODO: Is "the rest is ignored" still a good idea? Look at Rochet's
   // research.
}

/* As in BEGIN cells: these control how hostnames are interpreted.
   Clients MUST NOT send unrecognized flags; relays MUST ignore them.
   See tor-spec for semantics.
 */
const FLAG_IPV6_OKAY      = 0x01;
const FLAG_IPV4_NOT_OKAY  = 0x02;
const FLAG_IPV6_PREFERRED = 0x04;

A "hostname" is a DNS hostname that can only contain ASCII characters. It does NOT follow the large and broad DNS syntax. Hostnames here behave exactly as they do in BEGIN cells.

CONNECTED_UDP

A CONNECTED_UDP cell sent in response to a CONNECT_UDP cell has the following format.

struct udp_connected_body {
   /* The address that the relay has bound locally.  This might not
    * be an address that is advertised in the relay's descriptor. */
   struct address our_address;
   /* The address that the stream is connected to. */
   struct address their_address;
   // The rest is ignored.  There is no resolved-address TTL.

   // TODO: Is "the rest is ignored" still a good idea? Look at Rochet's
   // research.
}

Both our_address and their_address MUST NOT be of type T_HOSTNAME; otherwise, the cell is considered malformed.

DATAGRAM

struct datagram_body {
   /* The datagram body is the entire body of the message.
    * This length is in the relay message header. */
   u8 body[..];
}

END

We explicitly allow all END reasons from the existing Tor protocol.

We may wish to add more as we gain experience with this protocol.

Extensions for unconnected sockets

Because of security concerns I don't suggest that we support unconnected sockets in the first version of this protocol. But if we did, here's how I'd suggest we do it.

  1. We would add a new "FLAG_UNCONNECTED" flag for CONNECT_UDP messages.

  2. We would designate the ANY addresses 0.0.0.0:0 and [::]:0 as permitted in CONNECT_UDP messages, and as indicating unconnected sockets. These would be only permitted along with the FLAG_UNCONNECTED flag, and not permitted otherwise.

  3. We would designate the ANY addresses above as permitted for the their_address field in the CONNECTED_UDP message, in the case when FLAG_UNCONNECTED was used.

  4. We would define a new DATAGRAM message format for unconnected streams, where the first 6 or 18 bytes were reserved for an IPv4 or IPv6 address:port respectively.

Specifying exit policies and compatibility

We add the following fields to relay descriptors and microdescriptors:

// In relay descriptors
ipv4-udp-policy accept PortList
ipv6-udp-policy accept PortList

// In microdescriptors
p4u accept PortList
p6u accept PortList

(We need to include the policies in relay descriptors so that the authorities can include them in the microdescriptors when voting.)

As in the p and p6 fields, the PortList fields are comma-separated lists of port ranges. Only "accept" policies are parsed or generated in this case; the alternative is not appreciably shorter. When no policy is listed, the default is "reject 1-65535".
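As a sketch of how a client might evaluate one of these lines, assuming the same PortList syntax as the existing p and p6 fields (the function names here are hypothetical):

```python
def parse_port_list(port_list: str) -> list[tuple[int, int]]:
    """Parse a comma-separated list of ports and port ranges."""
    ranges = []
    for item in port_list.split(","):
        lo, sep, hi = item.strip().partition("-")
        first = int(lo)
        last = int(hi) if sep else first
        if not 1 <= first <= last <= 65535:
            raise ValueError("bad port range: " + item)
        ranges.append((first, last))
    return ranges

def udp_port_allowed(policy_line, port: int) -> bool:
    """policy_line is e.g. 'p4u accept 53,443,1024-65535', or None if absent."""
    if policy_line is None:
        return False  # no policy listed: the default is "reject 1-65535"
    keyword, action, port_list = policy_line.split()
    if action != "accept":
        raise ValueError("only 'accept' policies are parsed here")
    return any(lo <= port <= hi for lo, hi in parse_port_list(port_list))
```

A relay with no p4u line is treated as rejecting all UDP ports, which matches the default stated above.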

This proposal would also add a new subprotocol, "Datagram". Only relays that implement this proposal would advertise "Datagram=1". Doing so would not necessarily mean that they permitted datagram streams, if their exit policies did not say so.

MTU notes and issues

Internet time. I might have this wrong.

The "maximum safe IPv4 UDP payload" is "well known" to be only 508 bytes long: that's defined by the 576-byte minimum-maximum IP datagram size in RFC 791 p.12, minus 60 bytes for a very big IPv4 header, minus 8 bytes for the UDP header.

Unfortunately, our RELAY body size is only 498 bytes. It would be lovely if we could easily move to larger relay cells, or tell applications not to send datagrams whose bodies are larger than 498 bytes, but there is probably a pretty large body of tools out there that assume that they will never have to restrict their datagram size to fit into a transport this small.

(That means that if we implement this proposal without fragmentation, we'll probably be breaking a bunch of stuff, and creating a great deal of overhead.)

Integration issues

I do not know how applications should tell Tor that they want to use this feature. Any ideas? We should probably integrate with their MTU discovery systems too if we can. (TODO: write about some alternatives)

Resource management issues

TODO: Talk about sharing server-side relay sockets, and whether it's safe to do so, and how to avoid information leakage when doing so.

TODO: Talk about limiting UDP sockets per circuit, and whether that's a good idea?

Security issues

  • Are there any major DoS or amplification attack vectors that this enables? I think no, because we don't allow spoofing the IP header. But maybe some wacky protocol out there lets you specify a reply address in the payload even if the source IP is different. -mike

  • Are there port-reuse issues with source port on exits, such that destinations could become confused over the start and end of a UDP stream, if a source port is reused "too fast"? This also likely varies by protocol. We should parameterize time-before-reuse on source port, in case we notice issues with some broken/braindead UDP protocol later. -mike

Future work

Extend this for onion services, possibly based on Matt's prototypes.

Filename: 340-packed-and-fragmented.md
Title: Packed and fragmented relay messages
Author: Nick Mathewson
Created: 31 May 2022
Status: Open

Introduction

Tor sends long-distance messages on circuits via relay cells. The current relay cell format allows one relay message (e.g., "BEGIN" or "DATA" or "END") per relay cell. We want to relax this 1:1 requirement, between messages and cells, for two reasons:

  • To support relay messages that are longer than the current 498-byte limit. Applications would include wider handshake messages for postquantum crypto, UDP messages, and SNIP transfer in walking onions.

  • To transmit small messages more efficiently. Several message types (notably SENDME, XON, XOFF, and several types from proposal 329) are much smaller than the relay cell size, and could be sent comparatively often. We also want to be able to hide the transmission of small control messages by packing them into what would have been the padding of other messages, making them effectively invisible to a network traffic observer.

In this proposal, we describe a way to decouple relay cells from relay messages. Several relay messages can now be packed into a single cell, and a single message can be split across multiple cells.

This proposal combines ideas from proposal 319 (fragmentation) and proposal 325 (packed cells). It requires ntor v3 and prepares for next-generation relay cryptography.

Additionally, this proposal has been revised to incorporate another protocol change, and move StreamId from the relay cell header into a new, optional header.

A preliminary change: Relay encryption, version 1.5

We are fairly sure that, whatever we do for our next batch of relay cryptography, we will want to increase the size of the data used to authenticate relay cells to 128 bits. (Currently it uses a 4-byte tag plus 2 bytes of zeros.)

To avoid proliferating formats, I'm going to suggest that we make the other changes in this proposal concurrently with a change in our relay cryptography, so that we do not have too many incompatible cell formats going on at the same time.

The new format for a decrypted relay cell will be:

recognized [2 bytes]
digest     [14 bytes]
body       [509 - 16 = 493 bytes]

The recognized and digest fields are computed as before; the only difference is that they occur before the rest of the cell, and that digest is truncated to 14 bytes instead of 4.

If we are lucky, we won't have to build this encryption at all, and we can just move to some version of GCM-UIV or other RPRP that reserves 16 bytes for an authentication tag or similar cryptographic object.

The body MUST contain exactly 493 bytes as relay cells have a fixed size.

New relay message packing

We define this new format for a relay message. We require that both header parts fit in a single RELAY cell. However, the body can be split across many relay cells:

  Message Header
    command         u8
    length          u16
  Message Routing Header (optional)
    stream_id       u16
  Message Body
    data            u8[length]

One big change from the current Tor protocol is that stream_id has become optional: we have moved it into a separate inner header, named the Message Routing Header, that only appears for some commands. The command value tells us whether to expect this header.

The following message types take required stream IDs: BEGIN, DATA, END, CONNECTED, RESOLVE, RESOLVED, BEGIN_DIR, XON, and XOFF.

The following message types from proposal 339 (UDP) take required stream IDs: CONNECT_UDP, CONNECTED_UDP and DATAGRAM.

No other current message types take stream IDs. The stream_id field, when present, MUST NOT be zero.

Messages can be split across relay cells; multiple messages can occur in a single relay cell. We enforce the following rules:

  • Headers may not be split across cells.
  • If a 0 byte follows a message body, there are no more messages.
  • A message body is permitted to end at exactly the end of a relay cell, without a 0 byte afterwards.
  • A relay cell may not be "empty": it must have at least some part of some message.

Unless specified elsewhere, all message types may be packed, and all message types may be fragmented.

Every command has an associated maximum length for its messages. If not specified elsewhere, the maximum length for every message is 498 bytes (for legacy reasons).

Receivers MUST validate that the cell header and the message header are well-formed and have valid lengths while handling the cell in which the header is encoded. If any of them is invalid, the circuit MUST be destroyed.

A message header with an unrecognized command is considered invalid and thus MUST result in the circuit being immediately destroyed (without waiting for the rest of the relay message to arrive, in the case of a fragmented message).
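To show how the packing and fragmentation rules fit together on the receiving side, here is a rough sketch of a per-circuit reassembler that consumes 493-byte relay cell bodies. The class name is hypothetical, the command numbers are the ones I believe tor-spec assigns, and big-endian integers are assumed. It deliberately omits the unrecognized-command and per-command maximum-length checks that a real implementation MUST perform.

```python
import struct

BODY_LEN = 493
# Commands that carry a stream_id (assumed numbers: BEGIN, DATA, END,
# CONNECTED, RESOLVE, RESOLVED, BEGIN_DIR, XOFF, XON).
STREAM_CMDS = {1, 2, 3, 4, 11, 12, 13, 43, 44}

class Reassembler:
    def __init__(self):
        self.partial = None  # (command, stream_id, bytearray, total_length)

    def feed(self, body: bytes):
        """Consume one relay cell body; return the messages completed by it."""
        if len(body) != BODY_LEN:
            raise ValueError("relay cell body must be exactly 493 bytes")
        if self.partial is None and body[0] == 0:
            raise ValueError("empty relay cell")
        done, pos = [], 0
        if self.partial is not None:
            # Continue a message whose body started in an earlier cell.
            cmd, sid, buf, total = self.partial
            take = min(total - len(buf), BODY_LEN)
            buf += body[:take]
            pos = take
            if len(buf) < total:
                return done
            done.append((cmd, sid, bytes(buf)))
            self.partial = None
        # A 0 byte after a message body means there are no more messages.
        while pos < BODY_LEN and body[pos] != 0:
            cmd = body[pos]
            has_sid = cmd in STREAM_CMDS
            header_len = 5 if has_sid else 3
            if pos + header_len > BODY_LEN:
                raise ValueError("header split across cells")
            (length,) = struct.unpack_from(">H", body, pos + 1)
            sid = None
            if has_sid:
                (sid,) = struct.unpack_from(">H", body, pos + 3)
                if sid == 0:
                    raise ValueError("stream_id must not be zero")
            pos += header_len
            data = body[pos:pos + length]
            pos += len(data)
            if len(data) < length:
                # Body continues in the next cell.
                self.partial = (cmd, sid, bytearray(data), length)
                break
            done.append((cmd, sid, data))
        return done
```

Any `ValueError` here corresponds to a condition under which the circuit would be destroyed.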

New subprotocol RelayCell

We introduce a new subprotocol RelayCell to specify the relay cell ABI. The new format specified in this proposal, supporting packing and fragmentation, corresponds to RelayCell version 1. The ABI prior to this proposal is RelayCell version 0.

All clients and relays implicitly support RelayCell version 0.

XXX: Do we want to consider some migration path for eventually removing support for RelayCell version 0? e.g. maybe this should be something like "Support for any of Relay versions 1-5 implies support for RelayCell version 0"?

We reserve the protocol ID 13 for binary encoding of this subprotocol with respect to proposal 346 and proposal 323.

To use RelayCell version 1 or greater with a given relay on a given circuit, the client negotiates it using an ntor_v3 extension, as per proposal 346. This implies that the relay must advertise support for Relay version 5 (ntor_v3 circuit extensions) as well as the target RelayCell version (1 for the format introduced in this proposal).

Circuits using mixed RelayCell versions are permitted. e.g. we anticipate some of the use-cases for packing and fragmentation to only need the exit-relay to support it. Not requiring RelayCell=1 for other relays in the circuit provides a larger pool of candidate relays. While an intermediate relay using a different RelayCell version than the destination relay of a given relay cell will look at the wrong bytes for the recognized and digest fields, they will reach the correct conclusion that the cell is not intended for them and pass it to the next hop in the circuit.

Migration

Note: This differs from what we decided was our new best-practices. Should we make this disableable at all?

We add a consensus parameter, "streamed-relay-messages", with default value 0, minimum value 0, and maximum value 1.

If this value is 0, then clients will not (by default) negotiate this relay protocol. If it is 1, then clients will negotiate it when relays support it.

For testing, clients can override this setting. Once enough relays support this proposal, we'll change the consensus parameter to 1. Later, we'll change the default to 1 as well.

Packing decisions

We specify the following greedy algorithm for making decisions about fragmentation and packing. Other algorithms are possible, but this one is fairly simple, and using it will help avoid distinguishability issues:

Whenever a client or relay is about to send a cell that would leave at least 32 bytes unused in a relay cell, it checks to see whether there is any pending data to be sent in the same circuit (in a data cell). If there is, then it adds a DATA message to the end of the current cell, with as much data as possible. Otherwise, the client sends the cell with no packed data.
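Taken literally, the greedy rule above can be sketched as follows. The function and constant names are hypothetical, and I'm assuming that a packed DATA message needs its 3-byte message header plus a 2-byte stream_id, and that no end-of-messages marker is needed when the DATA body runs to the end of the cell.

```python
CELL_BODY_LEN = 493      # bytes available for messages in a relay cell
DATA_HEADER_LEN = 3 + 2  # message header plus message routing header
MIN_FREE_TO_PACK = 32    # threshold from the rule above

def data_bytes_to_pack(used: int, pending: int) -> int:
    """How many pending DATA bytes to append to a cell that already has
    `used` bytes of messages in it (0 means: send the cell as it is)."""
    free = CELL_BODY_LEN - used
    if free < MIN_FREE_TO_PACK or pending == 0:
        return 0
    return min(pending, free - DATA_HEADER_LEN)
```

For example, a cell holding only a 26-byte SENDME message has 467 bytes free, so up to 462 bytes of pending stream data could be packed after it.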

XXX: This isn't quite right. What was actually implemented in tor, and what we want in arti, is to defer sending some "control" messages like confluence switch and (non-first) xon, until they can be invisibly packed into a cell for a DATA message.

dgoulet: Could you update this section with the concrete details, and exactly what property we're trying to achieve? e.g.:

If we have data to send, but the corresponding DATA messages don't leave enough room to pack in the deferred control message(s), what do we do? If we continue deferring could we end up deferring forever if the application always writes in chunks that happen to align this way?

Since cells containing any part of a DATA message are subject to congestion windows, does that mean if our congestion window is empty we can't send these control messages either (until the window becomes non-empty)?

Onion services

Negotiating this for onion services will happen in a separate proposal; it is not a current priority, since there is nothing sent over rendezvous circuits that we currently need to fragment or pack.

Miscellany

Handling RELAY_EARLY

The RELAY_EARLY status for a command is determined based on the relay cell in which the command's header appeared.

Thus, a relay MUST close a circuit if the cell containing the first fragment of an EXTEND message is not RELAY_EARLY, and MUST allow but NOT require RELAY_EARLY to be set on other cells.

This implies that a client only needs to set RELAY_EARLY on the cell containing the first fragment of an EXTEND message, but that it MAY set RELAY_EARLY on other cells, in order to prevent traffic fingerprinting.

(Note: As now, relays and clients MUST destroy any circuit upon seeing a RELAY_EARLY message in the inbound direction.)

In our implementation, clients will continue to set RELAY_EARLY on the first N cells of each circuit, as we do today.

Note that this description allows us to take two approaches when we eventually do support fragmented EXTEND messages. We can either set the RELAY_EARLY flag on the cell containing the first fragment only, or we can continue to set it on the first N cells sent on each circuit. Both will work fine, assuming that the limit of RELAY_EARLY cells is adjustable. This brings us to:

Making the RELAY_EARLY limit adjustable

We add the following parameter, to support an eventual migration to longer extend cells, in case we decide to take the second approach in our note above.

"max_early_per_circ" -- Relays MUST destroy any circuit on
which they see more than this number of RELAY_EARLY cells.
Min: 5. Max: 65535. Default: 8.
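A relay-side check for this parameter could be as small as the sketch below. The class name is hypothetical; the bounds and default are the ones given above.

```python
class EarlyCellCounter:
    """Counts RELAY_EARLY cells seen on one circuit."""

    def __init__(self, max_early_per_circ: int = 8):
        if not 5 <= max_early_per_circ <= 65535:
            raise ValueError("max_early_per_circ out of range")
        self.limit = max_early_per_circ
        self.seen = 0

    def on_relay_early(self) -> bool:
        """Record one RELAY_EARLY cell; return True if the circuit
        must now be destroyed."""
        self.seen += 1
        return self.seen > self.limit
```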

Handling SENDMEs

SENDME messages may not be fragmented; the body and the command must appear in the same cell. (This is necessary so authenticated sendmes can have a reasonable implementation.)

Interaction with Conflux

Fragmented messages may be used together with Conflux, but we do not allow fragments from a single message to be sent on separate legs of a single circuit bundle.

That is to say, it is an error to send a CONFLUX_SWITCH message if the SeqNum would leave any other circuit with an incomplete message where not all fragments have arrived. Upon receiving such an erroneous message, parties SHOULD destroy all circuits in the conflux bundle.

An exception for DATA

Data messages may not be fragmented. When packing data into a cell containing other messages is desired, the application can instead construct a DATA message of an appropriate size to fit into the remaining space.

While relaxing this could simplify the implementation of opportunistic packing somewhat (by allowing code that constructs DATA messages not to have to know about packing or fragmentation), doing so would have several downsides.

First, on the receiver side a naive implementation that receives the first cell of a fragmented DATA message would not be able to pass the data in that fragment on to the application until the remaining cells of that message are received. An optimized implementation might choose to do so, but that complexity seems worse than the complexity we'd be avoiding by allowing DATA fragmentation in the first place.

Second, as with any sort of flexibility permitted to implementations, allowing flexibility here adds opportunities for fingerprinting and covert channels.

Extending message-length maxima

For now, the maximum length for every message body is 493 bytes, except as follows:

  • DATAGRAM messages (see proposal 339) have a maximum body length of 1967 bytes. (This works out to four relay cells, and accommodates most reasonable MTU choices)

Any increase in maximum length for any other message type requires a new RelayCell subprotocol version. (For example, if we later want to allow EXTEND2 messages to be 2000 bytes long, we need to add a new proposal saying so, and reserving a new subprotocol version.)
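The 1967-byte figure follows from the header sizes: four 493-byte cell bodies hold 1972 bytes, and the message header (3 bytes) plus stream_id (2 bytes) take 5 of them. A small sketch of that arithmetic, assuming the message starts at the beginning of a cell (the function name is hypothetical):

```python
CELL_BODY_LEN = 493

def cells_needed(body_len: int, has_stream_id: bool) -> int:
    """Relay cells needed for one message that starts a fresh cell."""
    total = 3 + (2 if has_stream_id else 0) + body_len
    return -(-total // CELL_BODY_LEN)  # ceiling division
```

So `cells_needed(1967, True)` is 4, while one more byte of body would spill into a fifth cell.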

SENDME window accounting

SENDME windows count relay cells rather than relay messages.

A cell counts towards the circuit's SENDME window if it contains any part of any message that would normally count towards SENDME windows (currently only DATA).

A cell counts towards the SENDME window of every stream for which it contains part of a message whose command counts towards SENDME windows.

Examples:

  • A cell containing a SENDME message and a RESOLVE message currently wouldn't count towards any windows, since neither of those commands currently counts towards windows.
  • A cell containing a SENDME message and a DATA message would count towards the circuit window and the DATA message's stream's window.
  • A cell containing two DATA messages, for different streams, would count towards the circuit-level window and both stream-level windows.
  • A cell containing two DATA messages for the same stream counts once towards the circuit-level and stream-level windows.
  • If DATAGRAM messages (proposal 339) are implemented, and count towards windows, then every cell containing a fragment of a DATAGRAM message counts towards windows.
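The accounting rules and examples above can be summarized in a short sketch. The function name and the representation of a cell as a list of (command, stream_id) pairs are assumptions made for illustration.

```python
# Commands that currently count towards SENDME windows.
COUNTS_TOWARDS_WINDOWS = {"DATA"}

def windows_charged(cell_messages):
    """cell_messages lists (command, stream_id) for every message that has
    any part in this cell.

    Returns (circuit_window_charged, set_of_stream_ids_charged). Each
    window is charged at most once per cell.
    """
    streams = [sid for cmd, sid in cell_messages if cmd in COUNTS_TOWARDS_WINDOWS]
    return bool(streams), set(streams)
```

If DATAGRAM were later made to count, adding it to the set would charge every cell carrying one of its fragments, as in the last example.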

Appendix: Example cells

Here is an example of the simplest case: one message, sent in one relay cell:

  Cell 1:
    header:
       circid         ..                [4 bytes]
       command        RELAY             [1 byte]
    relay cell header:
       recognized     0                 [2 bytes]
       digest         (...)             [14 bytes]
    message header:
       command        BEGIN             [1 byte]
       length         23                [2 bytes]
    message routing header:
       stream_id      42                [2 bytes]
    message body:
      "www.torproject.org:443\0"        [23 bytes]
    end-of-messages marker:
      0                                 [1 byte]
    padding up to end of cell:
      random                            [464 bytes]

The total is 514 bytes, which is the absolute maximum relay cell size.

A message whose body ends at exactly the end of a relay cell has no corresponding end-of-messages marker.

  Cell 1:
    header:
       circid         ..                [4 bytes]
       command        RELAY             [1 byte]
    relay cell header:
       recognized     0                 [2 bytes]
       digest         (...)             [14 bytes]
    message header:
       command        DATA              [1 byte]
       length         488               [2 bytes]
    message routing header:
       stream_id      42                [2 bytes]
    message body:
       (data)                           [488 bytes]

Here's an example with fragmentation only: a large EXTEND2 message split across two relay cells.

  Cell 1:
    header:
       circid         ..               [4 bytes]
       command        RELAY_EARLY      [1 byte]
    relay cell header:
       recognized     0                [2 bytes]
       digest         (...)            [14 bytes]
    message header:
       command        EXTEND2          [1 byte]
       length         800              [2 bytes]
    message body:
       (extend body, part 1)           [490 bytes]

  Cell 2:
    header:
       circid         ..               [4 bytes]
       command        RELAY            [1 byte]
    relay cell header:
      recognized     0                 [2 bytes]
      digest         (...)             [14 bytes]
    message body, continued:
      (extend body, part 2)            [310 bytes] (310+490=800)
    end-of-messages marker:
      0                                [1 byte]
    padding up to end of cell:
      random                           [182 bytes]

Each cell is 514 bytes, for a message body totalling 800 bytes.

Here is an example with packing only: a BEGIN_DIR message and a DATA message in the same cell.

  Cell 1:
    header:
       circid         ..                [4 bytes]
       command        RELAY             [1 byte]
    relay cell header:
       recognized     0                 [2 bytes]
       digest         (...)             [14 bytes]

    # First relay message
    message header:
       command        BEGIN_DIR         [1 byte]
       length         0                 [2 bytes]
    message routing header:
       stream_id      32                [2 bytes]

    # Second relay message
    message header:
       command        DATA              [1 byte]
       length         25                [2 bytes]
    message routing header:
       stream_id      32                [2 bytes]
    message body:
       "HTTP/1.0 GET /tor/foo\r\n\r\n"  [25 bytes]

    end-of-messages marker:
      0                                 [1 byte]
    padding up to end of cell:
      random                            [457 bytes]

Here is an example with packing and fragmentation: a large DATAGRAM message, a SENDME message, and an XON message.

(Note that this sequence of cells would not actually be generated by the algorithm described in "Packing decisions" above; this is only an example of what parties need to accept.)

  Cell 1:
    header:
       circid         ..               [4 bytes]
       command        RELAY            [1 byte]
    relay cell header:
       recognized     0                [2 bytes]
       digest         (...)            [14 bytes]

    # First message
    message header:
       command        DATAGRAM         [1 byte]
       length         1200             [2 bytes]
    message routing header:
       stream_id      99               [2 bytes]
    message body:
       (datagram body, part 1)         [488 bytes]

  Cell 2:
    header:
       circid         ..               [4 bytes]
       command        RELAY            [1 byte]
    relay cell header:
      recognized     0                 [2 bytes]
      digest         (...)             [14 bytes]
    message body, continued:
      (datagram body, part 2)          [493 bytes]

  Cell 3:
    header:
       circid         ..               [4 bytes]
       command        RELAY            [1 byte]
    relay cell header:
      recognized     0                 [2 bytes]
      digest         (...)             [14 bytes]
    message body, continued:
      (datagram body, part 3)          [219 bytes] (488+493+219=1200)

    # Second message
    message header:
       command        SENDME           [1 byte]
       length         23               [2 bytes]
    message body:
       version        1                [1 byte]
       datalen        20               [2 bytes]
       data           (digest to ack)  [20 bytes]

    # Third message
    message header:
       command        XON              [1 byte]
       length         1                [2 bytes]
    message routing header:
       stream_id      50               [2 bytes]
    message body:
       version        1                [1 byte]

    end-of-messages marker:
      0                                [1 byte]
    padding up to end of cell:
      random                           [241 bytes]
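
As a sanity check on the byte counts in the example above, the following Python sketch sums the field sizes of each cell. It assumes a 514-byte cell (4-byte circid, 1-byte command, 509-byte body); the field lists are transcribed from the example, and the names are illustrative only.

```python
CELL_LEN = 514  # assumed: 4-byte circid + 1-byte command + 509-byte body

# Field sizes in bytes, transcribed from the three cells in the example.
CELLS = {
    "Cell 1": [4, 1, 2, 14,        # cell header, relay cell header
               1, 2, 2, 488],      # DATAGRAM header, stream_id, body part 1
    "Cell 2": [4, 1, 2, 14, 493],  # headers, body part 2
    "Cell 3": [4, 1, 2, 14, 219,   # headers, body part 3
               1, 2, 1, 2, 20,     # SENDME: header, version, datalen, data
               1, 2, 2, 1,         # XON: header, stream_id, version
               1, 241],            # end-of-messages marker, padding
}


def cell_totals(cells):
    """Return the total number of bytes accounted for in each cell."""
    return {name: sum(fields) for name, fields in cells.items()}


totals = cell_totals(CELLS)
datagram_len = 488 + 493 + 219  # the three fragments of the DATAGRAM body

for name, total in totals.items():
    print(name, total, "ok" if total == CELL_LEN else "MISMATCH")
```

Every cell should account for exactly CELL_LEN bytes, and the three fragments should add up to the DATAGRAM message's declared length.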
Filename: 341-better-oos.md
Title: A better algorithm for out-of-sockets eviction
Author: Nick Mathewson
Created: 25 July 2022
Status: Open

Introduction

Our existing algorithm for handling an out-of-sockets condition needs improvement. It only handles sockets used for OR connections, and prioritizes those with more circuits. Because of these weaknesses, the algorithm is trivial to circumvent, and it's disabled by default with DisableOOSCheck.

Here we propose a new algorithm for choosing which connections to close when we're out of sockets. In summary, the new algorithm works by deciding which kinds of connections we have "too many" of, and then by closing excess connections of each kind. The algorithm for selecting connections of each kind is different.

Intuitions behind the algorithm below

We want to keep a healthy mix of connections running; favoring one kind of connection over another gives the attacker a fine way to starve the disfavored connections by making a bunch of the favored kind.

The correct mix of connections depends on the type of service we are providing. Everywhere except authorities, for example, inbound directory connections are perfectly fine to close, since nothing in our protocol actually generates them.

In general, we would prefer to close DirPort connections, then Exit connections, then OR connections.

The priority with which to close connections is different depending on the connection type. "Age of connection" or "number of circuits" may be a fine metric for how truly used an OR connection is, but for a DirPort connection, high age is suspicious.

The algorithm

Define a "candidate" connection as one that has a socket, and is either an exit stream, an inbound directory stream, or an OR connection.

(Note that OR connections can be from clients, relays, or bridges. Note that ordinary relays should not get directory streams that use sockets, since clients always use BEGIN_DIR to create tunneled directory streams.)

In all of the following, treat subtraction as saturating at zero. In other words, when you see "A - B" below, read it as "MAX(A-B, 0)".

Phase 1: Deciding how many connections to close

When we find that we are low on sockets, we pick a number of sockets that we want to close according to our existing algorithm. (That is, we try to close 1/4 of our maximum sockets if we have reached our upper limit, or 1/10 of our maximum sockets if we have encountered a failure from socket(2).) Call this N_CLOSE.

Then we decide which sockets to target based on this algorithm.

  1. Consider the total number of sockets used for exit streams (N_EXIT), the total number used for inbound directory streams (N_DIR), and the total number used for OR connections (N_OR). (In these calculations, we exclude connections that are already marked to be closed.) Call the total N_CONN = N_DIR + N_OR + N_EXIT. Define N_RETAIN = N_CONN - N_CLOSE.

  2. Compute how many connections of each type are "in excess". First, calculate our target proportions:

    • If we are an authority, let T_DIR = 1. Otherwise set T_DIR = 0.1.
    • If we are an exit or we are running an onion service, let T_EXIT = 2. Otherwise let T_EXIT = 0.1.
    • Let T_OR = 1.

    TODO: Should those numbers be consensus parameters?

    These numbers define the relative proportions of connections that we would be willing to retain in our final mix. Compute a number of excess connections of each type by calculating:

    T_TOTAL = T_OR + T_DIR + T_EXIT.
    EXCESS_DIR   = N_DIR  - N_RETAIN * (T_DIR  / T_TOTAL)
    EXCESS_EXIT  = N_EXIT - N_RETAIN * (T_EXIT / T_TOTAL)
    EXCESS_OR    = N_OR   - N_RETAIN * (T_OR   / T_TOTAL)
    
  3. Finally, divide N_CLOSE among the different types of excess connections, assigning first to excess directory connections, then excess exit connections, and finally to excess OR connections.

    CLOSE_DIR = MIN(EXCESS_DIR, N_CLOSE)
    N_CLOSE := N_CLOSE - CLOSE_DIR
    CLOSE_EXIT = MIN(EXCESS_EXIT, N_CLOSE)
    N_CLOSE := N_CLOSE - CLOSE_EXIT
    CLOSE_OR = MIN(EXCESS_OR, N_CLOSE)
    

We will try to close CLOSE_DIR directory connections, CLOSE_EXIT exit connections, and CLOSE_OR OR connections.
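
The three steps above can be sketched in Python as follows. This is a minimal illustration, not an implementation from Tor: the function and parameter names are hypothetical, exact fractions are used to avoid floating-point noise, and rounding fractional excess down is an assumption, since the text above does not say how to round.

```python
import math
from fractions import Fraction


def sat_sub(a, b):
    """Subtraction that saturates at zero: MAX(a - b, 0)."""
    return max(a - b, 0)


def plan_closures(n_dir, n_exit, n_or, n_close, *,
                  is_authority=False, exit_or_onion_service=False):
    """Return (CLOSE_DIR, CLOSE_EXIT, CLOSE_OR) for one out-of-sockets event."""
    # Step 2: target proportions.
    t_dir = Fraction(1) if is_authority else Fraction(1, 10)
    t_exit = Fraction(2) if exit_or_onion_service else Fraction(1, 10)
    t_or = Fraction(1)
    t_total = t_or + t_dir + t_exit

    # Step 1: totals and the number of connections we keep.
    n_conn = n_dir + n_or + n_exit
    n_retain = sat_sub(n_conn, n_close)

    # Assumption: fractional excess is rounded down.
    excess_dir = math.floor(sat_sub(n_dir, n_retain * t_dir / t_total))
    excess_exit = math.floor(sat_sub(n_exit, n_retain * t_exit / t_total))
    excess_or = math.floor(sat_sub(n_or, n_retain * t_or / t_total))

    # Step 3: spend N_CLOSE on directory, then exit, then OR connections.
    close_dir = min(excess_dir, n_close)
    n_close = sat_sub(n_close, close_dir)
    close_exit = min(excess_exit, n_close)
    n_close = sat_sub(n_close, close_exit)
    close_or = min(excess_or, n_close)
    return close_dir, close_exit, close_or


# A non-exit, non-authority relay with 1000 sockets that wants to free 250.
print(plan_closures(n_dir=100, n_exit=0, n_or=900, n_close=250))
```

In that example N_RETAIN is 750, the directory share of it is 62.5, so 37 directory connections are in excess; the remaining 213 closures fall on OR connections.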

Phase 2: Closing directory connections

We want to close a certain number of directory connections. To select our targets, we sort first by the number of directory connections from a similar address (see "similar address" below) and then by their age, preferring to close the oldest ones first.

This approach defeats "many requests from the same address" and "open a connection and hold it open, and do so from many addresses". It doesn't do such a great job of defeating "open and close frequently, and do so from many addresses".

Note that fallback directories do not typically use sockets for handling directory connections: theirs are usually created with BEGIN_DIR.
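
A minimal Python sketch of this ordering, under stated assumptions: the `DirConn` record and its fields are hypothetical, and `addr_group` is assumed to be precomputed so that two connections share a value exactly when they have a "similar address".

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DirConn:
    conn_id: int
    addr_group: str   # hypothetical precomputed "similar address" key
    opened_at: float  # seconds since some epoch; smaller means older


def pick_dir_conns_to_close(conns, n_close):
    """Return up to n_close connections: most crowded groups first, oldest first."""
    per_group = Counter(c.addr_group for c in conns)
    ranked = sorted(conns,
                    key=lambda c: (-per_group[c.addr_group], c.opened_at))
    return ranked[:n_close]


conns = [
    DirConn(1, "g1", 30.0),
    DirConn(2, "g1", 10.0),
    DirConn(3, "g1", 20.0),
    DirConn(4, "g2", 5.0),
]
victims = pick_dir_conns_to_close(conns, 2)
print([c.conn_id for c in victims])
```

Here the lone connection in group "g2" is the oldest overall, but the two oldest connections in the crowded group "g1" are selected first.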

Phase 3: Closing exit connections.

We want to close a certain number of exit connections. To do this, we pick an exit connection at random, then close its circuit along with all the other exit connections on the same circuit. Then we repeat until we have closed at least our target number of exit connections.

This approach probabilistically favors closing circuits with a large number of sockets open, regardless of how long those sockets have been open. This defeats the easiest way of opening a large number of exit streams ("open them all on one circuit") without making the counter-approach ("open each exit stream on its own circuit") much more attractive.
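
The selection loop can be sketched as below. The mapping from connection ids to circuit ids is a hypothetical stand-in for whatever bookkeeping an implementation keeps; the sketch only decides which circuits to close.

```python
import random


def pick_exit_circuits_to_close(conn_to_circ, target, rng=None):
    """Return (circuits_to_close, n_exit_conns_closed).

    conn_to_circ maps a hypothetical exit-connection id to its circuit id.
    """
    rng = rng or random.Random()
    remaining = dict(conn_to_circ)
    closed_circs, closed_conns = [], 0
    while remaining and closed_conns < target:
        # A uniformly random connection lands on a given circuit with
        # probability proportional to the number of exit connections on it.
        victim = rng.choice(sorted(remaining))
        circ = remaining[victim]
        doomed = [c for c, owner in remaining.items() if owner == circ]
        for c in doomed:
            del remaining[c]
        closed_circs.append(circ)
        closed_conns += len(doomed)
    return closed_circs, closed_conns


# Eight streams share one circuit; two more have a circuit each.
conn_to_circ = {i: "big" for i in range(8)}
conn_to_circ.update({8: "a", 9: "b"})
circs, n_closed = pick_exit_circuits_to_close(conn_to_circ, 3, random.Random(0))
print(circs, n_closed)
```

Because whole circuits are closed, the loop may overshoot the target; it never closes fewer than the target unless it runs out of exit connections.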

Phase 4: Closing OR connections.

We want to close a certain number of OR connections, to clients, bridges, or relays.

To do this, we first close OR connections with zero circuits. Then we close all OR connections but the most recent 2 from each "similar address". Then we close OR connections at random from among those not to a recognized relay in the latest directory. Finally, we close OR connections at random.

We used to unconditionally prefer to close connections with fewer circuits. That's trivial for an adversary to circumvent, though: they can just open a bunch of circuits on their bogus OR connections, and force us to preferentially close connections from real clients, bridges, and relays.

Note that some connections that seem like client connections ("not from relays in the latest directory") are actually those created by bridges.
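
One way to read the four stages is sketched below. The `OrConn` record is hypothetical, `addr_group` is assumed to be a precomputed "similar address" key, and stopping as soon as the target is reached (even in the middle of a stage) is an assumption that the text above does not settle.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrConn:
    conn_id: int
    addr_group: str       # hypothetical "similar address" key
    opened_at: float      # larger means more recent
    n_circuits: int
    is_known_relay: bool  # listed in the latest directory


def pick_or_conns_to_close(conns, target, rng=None):
    """Return up to `target` OR connections, chosen stage by stage."""
    rng = rng or random.Random()
    chosen, chosen_ids = [], set()

    def take(candidates):
        for c in candidates:
            if len(chosen) >= target:
                return
            if c.conn_id not in chosen_ids:
                chosen.append(c)
                chosen_ids.add(c.conn_id)

    # Stage 1: connections with zero circuits.
    take([c for c in conns if c.n_circuits == 0])

    # Stage 2: all but the 2 most recent connections per similar address.
    groups = defaultdict(list)
    for c in conns:
        groups[c.addr_group].append(c)
    for members in groups.values():
        members.sort(key=lambda c: c.opened_at, reverse=True)
        take(members[2:])

    # Stage 3: random connections not to a recognized relay.
    # Stage 4: random connections of any kind.
    for pool in ([c for c in conns if not c.is_known_relay], list(conns)):
        rng.shuffle(pool)
        take(pool)
    return chosen


conns = [
    OrConn(1, "g1", 1.0, 0, False),
    OrConn(2, "g2", 1.0, 5, False),
    OrConn(3, "g2", 2.0, 5, False),
    OrConn(4, "g2", 3.0, 5, False),
    OrConn(5, "g3", 1.0, 3, True),
    OrConn(6, "g4", 1.0, 2, False),
]
print([c.conn_id for c in pick_or_conns_to_close(conns, 2)])
```

With a target of 2, the idle connection goes first, followed by the oldest of the three connections that share group "g2".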

What is "A similar address"?

We define two connections as having a similar address if they are in the same IPv4 /30, or if they are in the same IPv6 /90.
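
This definition maps directly onto Python's standard `ipaddress` module. The helper names below are illustrative; the sketch does not treat IPv4-mapped IPv6 addresses specially, which is a simplification.

```python
import ipaddress


def similar_address_key(addr):
    """Return the IPv4 /30 or IPv6 /90 network that contains addr."""
    ip = ipaddress.ip_address(addr)
    prefix = 30 if ip.version == 4 else 90
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def is_similar(a, b):
    """True if both addresses fall in the same IPv4 /30 or IPv6 /90."""
    return similar_address_key(a) == similar_address_key(b)


print(similar_address_key("192.0.2.1"))
print(is_similar("192.0.2.1", "192.0.2.2"))
print(is_similar("192.0.2.3", "192.0.2.4"))
```

An IPv4 address and an IPv6 address never compare as similar under this sketch, because their keys are networks of different address families.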

Acknowledgments

This proposal was inspired by a set of OOS improvements from starlight.

Filename: 342-decouple-hs-interval.md
Title: Decoupling hs_interval and SRV lifetime
Author: Nick Mathewson
Created: 9 January 2023
Status: Draft

Motivation and introduction

Tor uses shared random values (SRVs) in the consensus to determine positions of relays within a hash ring. Which shared random value is to be used for a given time period depends upon the time at which that shared random value became valid.

But right now, the consensus voting period is closely tied to the shared random value voting cycle, and clients need to understand both of these in order to determine when a shared random value became current.

This creates tight coupling between:

  • The voting schedule
  • The SRV liveness schedule
  • The hsdir_interval parameter that determines the length of an HSDIR index

To decouple these values, this proposal describes a forward compatible change to how Tor reports SRVs in consensuses, and how Tor decides which hash ring to use when.

Reporting SRV timestamps

In consensus documents, parties should begin to accept shared-rand-*-value lines with an additional argument, in the format of an IsoTimeNospace timestamp (like "1985-10-26T00:00:00"). When present, this timestamp indicates the time at which the given shared random value first became the "current" SRV.

Additionally, we define a new consensus method that adds these timestamps to the consensus.

We specify that, in the absence of such a timestamp, parties are to assume that the shared-rand-current-value SRV became "current" at 00:00 UTC on the UTC day of the consensus's valid-after timestamp, and that the shared-rand-previous-value SRV became "current" at 00:00 UTC on the previous UTC day.
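
The fallback rule can be sketched in Python as follows. The function names are illustrative, and the sketch assumes the valid-after time has already been parsed into a timezone-aware UTC `datetime`.

```python
from datetime import datetime, timedelta, timezone

ISO_TIME_NOSPACE = "%Y-%m-%dT%H:%M:%S"


def parse_iso_time_nospace(text):
    """Parse a timestamp such as "1985-10-26T00:00:00" as UTC."""
    return datetime.strptime(text, ISO_TIME_NOSPACE).replace(tzinfo=timezone.utc)


def default_srv_start_times(valid_after):
    """Return (current_start, previous_start) when no timestamp is given."""
    midnight = valid_after.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight, midnight - timedelta(days=1)


valid_after = parse_iso_time_nospace("2023-01-09T13:00:00")
current_start, previous_start = default_srv_start_times(valid_after)
print(current_start.isoformat(), previous_start.isoformat())
```

When a shared-rand-*-value line does carry a timestamp, an implementation would use the parsed value instead of the corresponding default.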

Generalizing HSDir index scheduling.

Under the current HSDir design, there is one SRV for each time period, and one time period for which each SRV is in use. Decoupling hsdir_interval from 24 hours will require that we change this notion slightly.

We therefore propose this set of generalized directory behavior rules, which should be equivalent to the current rules under current parameters.

The calculation of time periods remains the same (see rend-spec-v3.txt section [TIME PERIODS]).

A single SRV is associated with each time period: specifically, the SRV that was "current" at the start of the time period.

There is a separate hash ring associated with each time period and its SRV.

Whenever fetching an onion service descriptor, the client uses the hash ring for the time period that contains the start of the liveness interval of the current consensus. Call this the "consensus" time period.

Whenever uploading an onion service descriptor, the service uses two or three hash rings:

  • The "consensus" time period (see above).
  • The immediately preceding time period, if the SRV to calculate that hash ring is available in the consensus.
  • The immediately following time period, if the SRV to calculate that hash ring is available in the consensus.

(Under the current parameters, where hsdir_interval = SRV_interval, there will never be more than two possible time periods for which the service can qualify.)
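
The rules above can be sketched in Python under several loudly stated assumptions. Time periods are computed with a 12-hour rotation offset, as we understand rend-spec-v3 to specify; the start of the consensus liveness interval is taken to be valid-after; and an SRV is treated as "current" for a hypothetical `srv_lifetime` after the time it became current. None of the names below come from an existing implementation.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ROTATION_OFFSET = timedelta(hours=12)  # assumed, per rend-spec-v3 [TIME PERIODS]
DAY = timedelta(days=1)


def period_num(t, hsdir_interval=DAY):
    """Number of the time period that contains time t."""
    return (t - EPOCH - ROTATION_OFFSET) // hsdir_interval


def period_start(n, hsdir_interval=DAY):
    """Start time of time period number n."""
    return EPOCH + ROTATION_OFFSET + n * hsdir_interval


def srv_for_period(n, srvs, hsdir_interval=DAY, srv_lifetime=DAY):
    """Return the SRV that was "current" when period n began, or None.

    srvs is a list of (value, became_current) pairs from one consensus.
    Assumption: an SRV stays current for srv_lifetime after became_current.
    """
    start = period_start(n, hsdir_interval)
    for value, became_current in srvs:
        if became_current <= start < became_current + srv_lifetime:
            return value
    return None


def upload_periods(valid_after, srvs, hsdir_interval=DAY, srv_lifetime=DAY):
    """Time periods whose hash rings a service uploads to."""
    consensus = period_num(valid_after, hsdir_interval)
    periods = [consensus]
    for neighbour in (consensus - 1, consensus + 1):
        if srv_for_period(neighbour, srvs, hsdir_interval, srv_lifetime) is not None:
            periods.append(neighbour)
    return sorted(periods)


def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)


srvs = [("cur", utc(2023, 1, 9)), ("prev", utc(2023, 1, 8))]
print(upload_periods(utc(2023, 1, 9, 13), srvs))
print(upload_periods(utc(2023, 1, 9, 1), srvs))
```

With both intervals set to 24 hours, each call yields exactly two time periods: the consensus period plus either its predecessor or its successor, depending on the time of day.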

Migration

We declare that, for at least the lifetime of the C tor client, we will not make any changes to the voting interval, the SRV interval, or the hsdir_interval. As such, we do not need to prioritize implementing these changes in the C client: we can make them in Arti only.

Issues left unsolved

There are likely other lingering issues that would come up if we try to change the voting interval. This proposal does not attempt to solve them.

This proposal does not attempt to add flexibility to the SRV voting algorithm itself.

Changing hsdir_interval would create a flag day where everybody using old and new values of hsdir_interval would get different hash rings. We do not try to solve that here.

Acknowledgments

Thanks to David Goulet for explaining all of this stuff to me!

Filename: 343-rend-caa.txt
Title: CAA Extensions for the Tor Rendezvous Specification
Author: Q Misell <q@as207960.net>
Created: 2023-04-25
Status: Open
Ticket: https://gitlab.torproject.org/tpo/core/tor/-/merge_requests/716

Overview:
  This document defines extensions to the Tor Rendezvous Specification Hidden
  Service descriptor format to allow the attachment of DNS style CAA records to
  Tor hidden services to allow the same security benefits as CAA provides in the
  DNS.

Motivation:
  As part of the work on draft-misell-acme-onion [I-D.misell-acme-onion] at the
  IETF it was felt necessary to define a method to incorporate CAA records
  [RFC8659] into Tor hidden services.

  CAA records in the DNS provide a mechanism to indicate which Certificate
  Authorities are permitted to issue certificates for a given domain name, and
  restrict which validation methods are permitted for certificate validation.

  As Tor hidden service domains are not in the DNS another way to provide the
  same security benefits as CAA does in the DNS needed to be devised.

  It is important to note that a hidden service is not required to publish a CAA
  record to obtain a certificate, as is the case in the DNS.

  More information about this project in general can be found at
  https://acmeforonions.org.

Specification:
  To enable maximal code re-use in CA codebases the same CAA record format is
  used in Tor hidden services as in the DNS. To this end a new field is added to
  the second layer hidden service descriptor [tor-rend-spec-v3] § 2.5.2.2.
  with the following format:

    "caa" SP flags SP tag SP value NL
    [Any number of times]

  The contents of "flags", "tag", and "value" are as per [RFC8659] § 4.1.1.
  Multiple CAA records may be present, as is the case in the DNS.

  A hidden service's second layer descriptor using CAA may look
  something like the following:

    create2-formats 2
    single-onion-service
    caa 0 issue "example.com"
    caa 0 iodef "mailto:security@example.com"
    caa 128 validationmethods "onion-csr-01"
    introduction-point AwAGsAk5nSMpAhRqhMHbTFCTSlfhP8f5PqUhe6DatgMgk7kSL3KHCZ...
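
  A CA-side parser for these lines might look like the following Python
  sketch. It is illustrative only: the names are hypothetical, the value is
  kept verbatim (quotes included) for a real CAA library to interpret, and
  the introduction-point line is left elided as in the example above.

```python
from typing import NamedTuple

ISSUER_CRITICAL = 0x80  # RFC 8659 "Issuer Critical" flag bit


class CaaRecord(NamedTuple):
    flags: int
    tag: str
    value: str  # kept verbatim, including any quotes


def parse_caa_lines(descriptor_text):
    """Collect every "caa" line from a decrypted second layer descriptor."""
    records = []
    for line in descriptor_text.splitlines():
        # "caa" SP flags SP tag SP value; the value may itself contain spaces.
        parts = line.split(" ", 3)
        if len(parts) == 4 and parts[0] == "caa":
            records.append(CaaRecord(int(parts[1]), parts[2], parts[3]))
    return records


EXAMPLE = "\n".join([
    "create2-formats 2",
    "single-onion-service",
    'caa 0 issue "example.com"',
    'caa 0 iodef "mailto:security@example.com"',
    'caa 128 validationmethods "onion-csr-01"',
    "introduction-point ...",
])

records = parse_caa_lines(EXAMPLE)
for record in records:
    critical = bool(record.flags & ISSUER_CRITICAL)
    print(record.tag, record.value, "critical" if critical else "")
```

  Lines that do not begin with "caa" are skipped, in keeping with the rule
  that unknown descriptor lines are ignored.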

  The CAA records are in the second layer descriptor. If the hidden service
  requires client authentication, a CA cannot read them unless the hidden
  service trusts that CA's public key. A method is therefore required to signal
  that CAA records are present without revealing their contents, which may
  disclose unwanted information about the hidden service operator to third
  parties. This allows a CA to know that it must attempt to check CAA records
  before issuance, and fail if it is unable to do so.

  To this end a new field is added to the first layer hidden service descriptor
  [tor-rend-spec-v3] § 2.5.1.2. with the following format:

    "caa-critical" NL
    [At most once]

Security Considerations:
  The second layer descriptor is signed, encrypted and MACed in a way that only
  a party with access to the secret key of the hidden service could manipulate
  what is published there. Therefore, Tor CAA records have at least the same
  security as those in the DNS secured by DNSSEC.

  The "caa-critical" flag is visible to anyone with knowledge of the hidden
  service's public key, however it reveals no information that could be used to
  de-anonymize the hidden service operator.

  The CAA records in the second layer descriptor may reveal information about
  the hidden service operator if they choose to publish an "iodef",
  "contactemail", or "contactphone" tag. These, however, are not required for
  the primary goal of CAA, which is to restrict which CAs may issue
  certificates for a given domain name.

  No more information is revealed by the "issue" nor "issuewild" tags than would
  be revealed by someone making a connection to the hidden service and noting
  which certificate is presented.

Compatibility:
  The hidden service spec [tor-rend-spec-v3] already requires that clients
  ignore unknown lines when decoding hidden service descriptors, so this change
  should not cause any compatibility issues. Additionally, in testing no
  compatibility issues were found with existing Tor implementations.

  A hidden service with CAA records published in its descriptor is available at
  znkiu4wogurrktkqqid2efdg4nvztm7d2jydqenrzeclfgv3byevnbid.onion, to allow
  further compatibility testing.

References:
  [I-D.misell-acme-onion]
             Misell, Q., "Automated Certificate Management Environment (ACME)
             Extensions for ".onion" Domain Names", Internet-Draft
             draft-misell-acme-onion-02, April 2023,
             <https://datatracker.ietf.org/doc/html/draft-misell-acme-onion-02>.

  [RFC8659]  Hallam-Baker, P., Stradling, R., and J. Hoffman-Andrews,
             "DNS Certification Authority Authorization (CAA) Resource
             Record", RFC 8659, DOI 10.17487/RFC8659, November 2019,
             <https://www.rfc-editor.org/info/rfc8659>.

  [tor-rend-spec-v3]
             The Tor Project, "Tor Rendezvous Specification - Version 3",
             <https://spec.torproject.org/rend-spec-v3>.

Filename: 344-protocol-info-leaks.txt
Title: Prioritizing Protocol Information Leaks in Tor
Author: Mike Perry
Created: 2023-07-17
Purpose: Normative
Status: Open


0. Introduction

Tor's protocol has numerous forms of information leaks, ranging from highly
severe covert channels, to behavioral issues that have been useful
in performing other attacks, to traffic analysis concerns.

Historically, we have had difficulty determining the severity of these
information leaks when they are considered in isolation. At a high level, many
information leaks look similar, and all seem to be forms of traffic analysis,
which is regarded as a difficult attack to perform due to Tor's distributed
trust properties.

However, some information leaks are indeed more severe than others: some can
be used to remove Tor's distributed trust properties by providing a covert
channel and using it to ensure that only colluding and communicating relays
are present in a path, thus deanonymizing users. Some do not provide this
capability, but can be combined with other info leak vectors to quickly yield
Guard Discovery, and some only become dangerous once Guard Discovery or other
anonymity set reduction is already achieved.

By prioritizing information leak vectors by their co-factors, impact, and
resulting consequences, we can see that these attack vectors are not all
equivalent. Each vector of information leak also has a common solution, and
some categories even share the same solution as other categories.

This framework is essential for understanding the context in which we will be
addressing information leaks, so that decisions and fixes can be understood
properly. This framework is also essential for recognizing when new protocol
changes might introduce information leaks or not, for gauging the severity of
such information leaks, and for knowing what to do about them.

Hence, we are including it in tor-spec, as a living, normative document to be
updated with experience, and as external research progresses.

It is essential reading material for any developers working on new Tor
implementations, be they Arti, Arti-relay, or a third party implementation.

This document is likely also useful to developers of Tor-like anonymity
systems, of which there are now several, such as I2P, MASQUE, and Oxen. They
definitely share at least some, and possibly even many of these issues.

Readers who are relatively new to anonymity literature may wish to first
consult the Glossary in Section 3, especially if terms such as Covert Channel,
Path Bias, Guard Discovery, and False Positive/False Negative are unfamiliar
or hazy. There is also a catalog of historical real-world attacks that are
known to have been performed against Tor in Section 2, to help illustrate how
information leaks have been used adversarially, in practice.

We are interested in hearing from journalists and legal organizations who
learn about court proceedings involving Tor. We became aware of three
instances of real-world attacks covered in Section 2 in this way. Parallel
construction (hiding the true source of evidence by inventing an alternate
story for the court -- also known as lying) is a possibility in the US and
elsewhere, but (so far) we are not aware of any direct evidence of this
occurring with respect to Tor cases. Still, keep your eyes peeled...


0.1. Table of Contents

  1. Info Leak Vectors
     1.1. Highly Severe Covert Channel Vectors
          1.1.1. Cryptographic Tagging
          1.1.2. End-to-end cell header manipulation
          1.1.3. Dropped cells
     1.2. Info Leaks that enable other attacks
          1.2.1. Handshakes with unique traffic patterns
          1.2.2. Adversary-Induced Circuit Creation
          1.2.3. Relay Bandwidth Lying
          1.2.4. Metrics Leakage
          1.2.5. Protocol Oracles
     1.3. Info Leaks of Research Concern
          1.3.1. Netflow Activity
          1.3.2. Active Traffic Manipulation Covert Channels
          1.3.3. Passive Application-Layer Traffic Patterns
          1.3.4. Protocol or Application Linkability
          1.3.5. Latency Measurement
  2. Attack Examples
     2.1. CMU Tagging Attack
     2.2. Guard Discovery Attacks with Netflow Deanonymization
     2.3. Netflow Anonymity Set Reduction
     2.4. Application Layer Confirmation
  3. Glossary


1. Info Leak Vectors

In this section, we enumerate the vectors of protocol-based information leak
in Tor, in order of highest priority first. We separate these vectors into
three categories: "Highly Severe Covert Channel Vectors", "Info Leaks that
enable other attacks", and "Info Leaks of Research Concern". The first category
yields deanonymization attacks on its own. The second category enables other
attacks that can lead to deanonymization. The final category can be aided by
the earlier vectors to become more severe, but overall severity is a
combination of many factors, and requires further research to illuminate all
of these factors.

For each vector, we provide a brief "at-a-glance" summary, which includes a
ballpark estimate of Accuracy in terms of False Positives (FP) and False
Negatives (FN), as 0, near-zero, low, medium, or high. We then list what is
required to make use of the info leak, the impact, the reason for the
prioritization, and some details on where the signal is injected and observed.


1.1. Highly Severe Covert Channel Vectors

This category of info leak consists entirely of covert channel vectors that
have zero or near-zero false positive and false negative rates, because they
can inject a covert channel in places where similar activity would not happen,
and they are end-to-end.

They also either provide or enable path bias attacks that can capture
the route clients use, to ensure that only malicious exits are used, leading
to full deanonymization when the requirements are met.

If the adversary has censorship capability, and can ensure that users only
connect to compromised Guards (or Bridges), they can fully deanonymize all
users with these covert channels.


1.1.1. Cryptographic Tagging

At a glance:
  Accuracy: FP=0, FN=0
  Requires: Malicious or compromised Guard, at least one exit
  Impact: Full deanonymization (path bias, identifier transmission)
  Path Bias: Automatic route capture (all non-deanonymized circuits fail)
  Reason for prioritization: Severity of Impact; similar attacks used in wild
  Signal is: Modified cell contents
  Signal is injected: by guard
  Signal is observed: by exit

First reported at Black Hat in 2009 (see [ONECELL]), and elaborated further
with the path bias amplification attack in 2012 by some Raccoons (see
[RACCOON23]), this is the most severe vector of covert channel attack in Tor.

Cryptographic tagging is where an adversary who controls a Guard (or Bridge)
XORs an identifier, such as an IP address, directly into the circuit's
cipher-stream, in an area of known-plaintext. This tag can be exactly
recovered by a colluding exit relay, ensuring zero false positives and zero
false negatives for this built-in identifier transmission, along with their
collusion signal.
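
The underlying property is ordinary stream-cipher malleability, which the
following toy Python sketch illustrates. It is not Tor's relay encryption:
the keystream function, key, and region size are all made up for the
example, and only a single encryption layer is modelled. It shows only that
an XOR applied to ciphertext survives decryption unchanged.

```python
import hashlib


def keystream(key, n):
    """Toy keystream generator (illustrative only, not Tor's cipher)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


REGION = 16
key = b"toy circuit key"            # unknown to the party that tags
known_plaintext = bytes(REGION)     # a region whose plaintext is predictable
ciphertext = xor(known_plaintext, keystream(key, REGION))

# The tag is XORed into the ciphertext without any knowledge of the key.
tag = bytes([192, 0, 2, 7]) + bytes(REGION - 4)
tagged = xor(ciphertext, tag)

# Whoever decrypts sees known_plaintext XOR tag, and so recovers the tag.
decrypted = xor(tagged, keystream(key, REGION))
recovered = xor(decrypted, known_plaintext)
print(recovered == tag)
```

The same decrypted bytes no longer match what the sender intended, which is
why an integrity check over them fails at any endpoint that is not
expecting the tag.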

Additionally, because every circuit that does not have a colluding relay will
automatically fail because of the failed digest validation, the adversary gets
a free path bias amplification attack, such that their relay only actually
carries traffic that they know they have successfully deanonymized. Because
clients will continually attempt to re-build such circuits through the guard
until they hit a compromised exit and succeed, this violates Tor's distributed
trust assumption, reducing it to the same security level as a one-hop proxy
(ie: the security of fully trusting the Guard relay). Worse still, when the
adversary has full censorship control over all connections into the Tor
network, Tor provides zero anonymity or privacy against them, when they also
use this vector.

Because the Exit is able to close *all* circuits that are not deanonymized,
for maximal efficiency, the adversary's Guard capacity should exactly match
their Exit capacity. To make up for the loss of traffic caused by closing many
circuits, relays can lie about their bandwidth (see Section 1.2.3).

Large amounts of circuit failure (that might be evidence of such an attack)
are tracked and reported by C-Tor in the logs, by the path bias detector, but
when the Guard is under DDoS, or even heavy load, this can yield false alarms.
These false alarms happened frequently during the network-wide DDoS of
2022-2023. They can also be induced at arbitrary Guards via DoS, to make users
suspicious of their Guards for no reason.

The path bias detector could have a second layer in Arti, that checks to see
if any specific Exits are overused when the circuit failure rate is high. This
would be more indicative of an attack, but could still go off if the user is
actually trying to use rare exits (ie: country selection, bittorrent).
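
A minimal Python sketch of such a second-layer check is given below. It is
only an illustration of the idea: the function name, the input format, and
all three thresholds are hypothetical, and no such detector is specified
here.

```python
from collections import Counter


def exit_overuse_alarm(circuits, *, max_failure_rate=0.3,
                       max_exit_share=0.5, min_successes=10):
    """Return True if failures are high AND one exit dominates the successes.

    circuits is an iterable of (exit_id, succeeded) pairs for one Guard.
    All thresholds are made-up illustrative values.
    """
    circuits = list(circuits)
    if not circuits:
        return False
    successes = [exit_id for exit_id, ok in circuits if ok]
    failure_rate = 1 - len(successes) / len(circuits)
    if failure_rate <= max_failure_rate or len(successes) < min_successes:
        return False
    _, top_count = Counter(successes).most_common(1)[0]
    return top_count / len(successes) > max_exit_share


failed = [("exit-%d" % i, False) for i in range(90)]
captured = failed + [("exit-X", True)] * 10
overloaded = failed + [("exit-%d" % i, True) for i in range(10)]
print(exit_overuse_alarm(captured), exit_overuse_alarm(overloaded))
```

High failure rates alone (as under DDoS) do not trigger the alarm in this
sketch; the surviving circuits must also concentrate on a small set of
Exits. A user who deliberately pins rare exits would still trip it.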

This attack, and path bias attacks that are used in the next two sections, do
have some minor engineering barriers when being performed against both onion
and exit traffic, because the onion service traffic is restricted to
particular hops in the case of HSDIR and intro point circuits. However,
because pre-built circuits are used to access HSDIR and intro points, the
adversary can use their covert channel such that only exits and pre-built
onion service circuits are allowed to proceed. Onion services are harder to
deanonymize in this way, because the HSDIR choice itself can't be controlled
by them, but they can still be connected to using pre-built circuits until the
adversary also ends up in the HSDIR position, for deanonymization.

Solution: Path Bias Exit Usage Counter;
          Counter Galois Onion (CGO) (Forthcoming update to Prop#308).
Status: Unfixed (Current PathBias detector is error-prone under DDoS)
Funding: CGO explicitly funded via Sponsor 112


1.1.2. End-to-end cell header manipulation

At a glance:
  Accuracy: FP=0, FN=0
  Requires: Malicious or compromised Guard, at least one exit
  Impact: Full deanonymization (path bias, identifier transmission)
  Path Bias: Full route capture is trivial
  Reason for prioritization: Severity of Impact; used in the wild
  Signal is: Modified cell commands.
  Signal is injected: By either guard or exit/HSDIR
  Signal is observed: By either guard or exit/HSDIR

The Tor protocol consists of both cell header commands, and relay header
commands. Cell commands are not encrypted by circuit-level encryption, so they
are visible and modifiable by every relay in the path. Relay header commands
are encrypted, and not visible to every hop in the path.

Not all cell commands are forwarded end-to-end. Currently, these are limited
to RELAY, RELAY_EARLY, and DESTROY. Because of the attack described here,
great care must be taken when adding new end-to-end cell commands, even if
they are protected by a MAC.

Previously, a group of researchers at CMU used this property to modify the
cell command header of cells on circuits, to switch between RELAY_EARLY and
RELAY at exits and HSDIRs (see [RELAY_EARLY]). This creates a visible bit in
each cell, that can signal collusion, or with enough cells, can encode an
identifier such as an IP address. They assisted the FBI in using this attack
in the wild to deanonymize clients.

We addressed the CMU attack by closing the circuit upon receiving an "inbound"
(towards the client) RELAY_EARLY command cell, and by limiting the number of
"outbound" (towards the exit) RELAY_EARLY command cells at relays, and by
requiring the use of RELAY_EARLY for EXTEND (onionskin) relay commands. This
defense is not generalized, though. Guards may still use this specific covert
channel to send around 3-5 bits of information after the extend handshake,
without killing the circuit. It is possible to use the remaining outbound
vector to assist in path bias attacks for dropped cells, as a collusion signal
to reduce the amount of non-compromised traffic that malicious exits must
carry (see the following Section 1.1.3).
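
The three relay-side rules can be sketched in Python as below. This is an
illustration rather than the C-Tor logic: the class and method names are
hypothetical, and the limit of 8 outbound RELAY_EARLY cells is an
assumption here (tor-spec holds the normative value).

```python
MAX_OUTBOUND_RELAY_EARLY = 8  # assumed limit; see tor-spec for the real value


class CircuitClosed(Exception):
    """Raised when a rule violation requires closing the circuit."""


class RelayEarlyPolicy:
    """Per-circuit RELAY_EARLY bookkeeping at one relay."""

    def __init__(self, limit=MAX_OUTBOUND_RELAY_EARLY):
        self.remaining = limit

    def on_cell(self, command, direction):
        """Check a forwarded cell; direction is "inbound" or "outbound"."""
        if command != "RELAY_EARLY":
            return
        if direction == "inbound":
            raise CircuitClosed("inbound RELAY_EARLY")
        if self.remaining == 0:
            raise CircuitClosed("too many outbound RELAY_EARLY cells")
        self.remaining -= 1

    def on_extend(self, command):
        """Check the cell command that carried an EXTEND relay command."""
        if command != "RELAY_EARLY":
            raise CircuitClosed("EXTEND not sent in a RELAY_EARLY cell")


policy = RelayEarlyPolicy()
for _ in range(MAX_OUTBOUND_RELAY_EARLY):
    policy.on_cell("RELAY_EARLY", "outbound")
try:
    policy.on_cell("RELAY_EARLY", "outbound")
    closed = False
except CircuitClosed:
    closed = True
print(closed)
```

Note what the sketch does not prevent: within the outbound budget, a Guard
can still choose how many RELAY_EARLY cells to send, which is the residual
channel described above.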

If this covert channel is not addressed, it is trivial for Guard and Exit
relays to close every circuit that does not display this covert channel,
providing a path bias amplification attack and distributed trust reduction,
similar to cryptographic tagging attacks. Because the inbound direction *is*
addressed, we believe this kind of path bias is currently not possible with
this vector by itself (thus also requiring the vector from Section 1.1.3), but
it could easily become possible if this defense is forgotten, or if a new
end-to-end cell type is introduced.

While more cumbersome than cryptographic tagging attacks, in practice this
attack is just as successful, if these cell command types are not restricted
and limited. It is somewhat surprising that the FBI used this attack before
cryptographic tagging, but perhaps that was just a lucky coincidence of
opportunity.

Solution: CGO (Updated version of Prop#308) covers cell commands in the MAC;
          Any future end-to-end cell commands must still limit usage
Status: Fix specific to CMU attack; Outbound direction is unfixed
Funding: Arti and relay-side fixes are explicitly funded via Sponsor 112


1.1.3. Dropped cells

At a glance:
  Accuracy: FP=0, FN=0
  Requires: Malicious Guard or Netflow data (if high volume), one exit
  Impact: Full deanonymization (path bias amplification, collusion signal)
  Path Bias: Full route capture is trivial
  Reason for prioritization: Severity of Impact; similar attacks used in wild
  Signal is: Unusual patterns in number of cells received
  Signal is injected: By exit or HSDIR
  Signal is observed: at guard or client<->guard connection.
	
Dropped cells are cells that a relay can inject that end up ignored and
discarded by a Tor client. These include:
  - Unparsable cells
  - Unrecognized cells (ie: wrong source hop, or decrypt failures)
  - invalid relay commands
  - unsupported (or consensus-disabled) relay commands or extensions
  - out-of-context relay commands
  - duplicate relay commands
  - relay commands that hit any error codepaths
  - relay commands for an invalid or already-closed stream ID
  - semantically void relay cells (incl relay data len == 0, or PING)
  - onion descriptor-appended junk

This attack works by injecting inbound RELAY cells at the exit or at a middle
relay, and then observing anomalous traffic patterns at the guard or at the
client->guard connection.

The severity of this covert channel is extreme (zero false positives; zero
false negatives) when dropped cells are injected in cases where the circuit is
otherwise known to be silent, because of the protocol state machine. These
cases include:
  - Immediately following an onionskin response
  - During other protocol handshakes (onion services, conflux)
  - Following relay CONNECTED or RESOLVED (not as severe - no path bias)

Because of the stateful and deterministic nature of the Tor protocol,
especially handshakes, it is easy to accurately recognize these specific cases
even when observing only encrypted circuit traffic at the Guard relay (see
[DROPMARK]).

Because this covert channel is most accurate before actual circuit use, when
the circuit is expected to be otherwise silent, it is trivial for a Guard
relay to close every circuit that does not display this covert channel,
providing a path bias amplification attack and distributed trust reduction,
similar to cryptographic tagging attacks and end-to-end cell header
manipulation. This ability to use the collusion signal to perform path bias
before circuit use differentiates dropped cells within the Tor Protocol from
deadweight traffic during application usage (such as javascript requests for
404 URLs, covered in Section 1.3.2).

This category is not quite as severe as the previous two categories by
itself, for two main reasons. However, due to other factors, these reasons
may not matter in practice.

First, the Exit can't use this covert channel to close circuits that are not
deanonymized by a colluding Guard, since there is no covert channel from the
Guard to the Exit with this vector alone. Thus, unlike cryptographic tagging,
the adversary's Exits will still carry non-deanonymized traffic from
non-adversary Guards, and thus the adversary needs more Exit capacity than
Guard capacity. These kinds of more subtle trade-offs with respect to path
bias are covered in [DOSSECURITY]. However, note that this issue can be fixed
by using the previous RELAY_EARLY covert channel from the Guard to the Exit
(since this direction is unfixed). This allows the adversary to confirm
receipt of the dropped cell covert channel, allowing both the Guard and the
Exit to close all non-confirmed circuits, and thus ensure that they only need
to allocate equal amounts of compromised Guard and Exit traffic, to monitor
all Tor traffic.

Second, encoding a full unique identifier in this covert channel is
non-trivial. A significant amount of injected traffic must be sent to exchange
more than a simple collusion signal, to link circuits when attacking a large
number of users. In practice, this likely means some amount of correlation,
and a resulting (but very small) statistical error.

Obviously, the actual practical consequences of these two limitations are
questionable, so this covert channel is still regarded as "Highly Severe". It
can still result in full deanonymization of all Tor traffic by an adversary
with censorship capability, with very little error.

Solution: Forthcoming dropped-cell proposal
Status: Fixed with vanguards addon; Unfixed otherwise
Funding: Arti and relay-side fixes are explicitly funded via Sponsor 112


1.2. Info Leaks that enable other attacks

These info leaks are less severe than the first group, as they do not yield
full covert channels, but they do enable other attacks, including guard
discovery and eventual netflow deanonymization, and website traffic
fingerprinting.


1.2.1. Handshakes with unique traffic patterns

At a glance:
  Accuracy: FP=near-zero, FN=near-zero
  Requires: Compromised Guard
  Impact: Anonymity Set Reduction and Oracle; assists in Guard Discovery
  Path Bias: Full route capture is difficult (high failure rate)
  Reason for Prioritization: Increases severity of vectors 1.2.2 and 1.3.3
  Signal is: Caused by client's behavior.
  Signal is observed: At guard
  Signal is: Unique cell patterns

Certain aspects of Tor's handshakes are unique and easy to fingerprint,
based only on observed traffic timing and volume patterns. In particular, the
onion client and onion service handshake activity is fingerprintable with
near-zero false negatives and near-zero false positive rates, as per
[ONIONPRINT]. The conflux link handshake is also unique (and thus accurately
recognizable), because it is our only 3-way handshake.

This info leak is very accurate. However, the impact is much lower than that
of covert channels, because by itself, it can only tell if a particular Tor
protocol, behavior, or feature is in use.

Additionally, Tor's distributed trust properties remain intact, because there
is no collusion signal built into this info leak. When a path bias attack
is mounted to close circuits during circuit handshake construction without a
collusion signal to the Exit, it must proceed hop-by-hop. Guards must close
circuits that do not extend to colluding middles, and those colluding middles
must close circuits that don't extend to colluding exits. This means that the
adversary must control some relays in each position, and has a substantially
higher circuit failure rate while directing circuits to each of these relays
in a path.

To put this into perspective, an adversary using a collusion signal with 10%
of Exits expects to fail 9 circuits before detecting their signal at a
colluding exit and allowing a circuit to succeed. However, an adversary
without a collusion signal and 10% of all relays expects to fail 9 circuits
before getting a circuit to their middle, and then expects 9 of *those*
circuits to fail before one reaches their Exit. Since each circuit that
reached the middle cost about 10 attempts, this is roughly 100 attempts (99
circuit failures) for every successful circuit.
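
The comparison above can be sketched with a simple model. This is a rough
illustration, assuming each position independently lands on a colluding relay
with probability p (real path selection is bandwidth-weighted and
position-dependent), and that the adversary closes every circuit that misses.
The function name `expected_failures` is ours. Under this model, the expected
number of failed circuits per success is 1/p^k - 1 when k positions must all
be captured; how intermediate failures are counted affects the exact figure.

```python
from fractions import Fraction


def expected_failures(p: Fraction, stages: int) -> Fraction:
    """Expected failed circuits per success when `stages` independent
    positions must each land on a colluding relay with probability p."""
    success = p ** stages
    return 1 / success - 1


p = Fraction(1, 10)  # adversary's share at each position (from the text)
print(expected_failures(p, 1))  # with a collusion signal: 9
print(expected_failures(p, 2))  # without one (middle, then exit): 99
```

Exact fractions are used instead of floats so that the results are not
blurred by rounding.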

Published attacks have built upon this info leak, though.

In particular, certain error conditions, such as returning a single
"404"-containing relay cell for an unknown onion service descriptor, are
uniquely recognizable. This fingerprint was used in the [ONIONFOUND] guard
discovery attack, whose authors provide a measurement of its uniqueness.

Additionally, onion client fingerprintability can be used to vastly reduce the
set of website traffic traces that need to be considered for website traffic
fingerprinting (see Section 1.3.3), making that attack realistic and
practical. Effectively, it functions as a kind of oracle in this case (see
Glossary, and [ORACLES]).

Solution: Padding machines at middles for protocol handshakes (as per [PCP]);
          Pathbias-lite.
Status: Padding machines deployed for onion clients, but have weaknesses
        against DF and stateful cross-circuit fingerprints
Funding: Not explicitly funded


1.2.2. Adversary-Induced Circuit Creation

At a glance:
  Accuracy: FP=high, FN=high
  Requires: Onion service activity, or malicious exit
  Impact: Guard discovery
  Path Bias: Repeated circuits eventually provide the desired path
  Reason for Prioritization: Enables Guard Discovery
  Signal is: Inducing a client to make a new Tor circuit
  Signal is injected: by application layer, client, or malicious relay
  Signal is observed: At middle

By itself, the ability for an adversary to cause a client to create circuits
is not a covert channel or arguably even an info leak. Circuit creation, even
bursts of frequent circuit creation, is commonplace on the Tor network.

However, when this activity is combined with a covert channel from Section
1.1, with a unique handshake from Section 1.2.1, or with active traffic
manipulation (Section 1.3.2), then it leads to Guard Discovery, by allowing
the adversary to recognize when they are chosen for the Middle position, and
thus learn the Guard. Once Guard Discovery is achieved, netflow analysis of
the Guard's connections can be used to perform intersection attacks and
eventually determine the client IP address (see Section 1.3.1).

Large quantities of circuit creation can be induced by:
  - Many connections to an Onion Service
  - Causing a client to make connections to many onion service addresses
  - Application connection to ports in rare exit policies, followed by circuit
    close at Exit
  - Repeated Conflux leg failures

In Tor 0.4.7 and later, onion services are protected from this activity via
Vanguards-Lite (Proposal #333). This system adds a second layer of vanguards
to onion service circuits, with rotation times set such that it is sufficient
to protect a user for use cases on the order of weeks, assuming the adversary
does not get lucky and land in a set. Non-Onion service activity, such as
Conflux leg failures, is protected by feature-specific rate limits.

Longer-lived onion services should use the Vanguards Addon, which implements
Mesh Vanguards (Prop#292). It uses two layers of vanguards, and is intended
for use cases on the order of months.

These attack times are probabilistic expectations, and are rough estimates.
See the proposals for details. To derive these numbers, the proposals assume a
100% accurate covert channel for detecting that the middle is in the desired
circuit. If we address the low-hanging fruit for such covert channels above,
these numbers change, and such attacks also become much more easily
detectable, as they will rely on application layer covert channels (see
Section 1.3.2), which will resemble an application layer DoS or flood.

Solution: Mesh-vanguards (Prop#292); Vanguards-lite (Prop#333); rate limiting
          circuit creation attempts; rate limiting the total number of distinct
          paths used by circuits
Status: Vanguards-lite deployed in Tor 0.4.7; Mesh-vanguards is vanguards addon;
        Conflux leg failures are limited per-exit; Exitpolicy scanner exists
Funding: Not explicitly funded


1.2.3. Relay Bandwidth Lying

At a glance:
  Accuracy: FP=high, FN=high
  Requires: Running relays in the network
  Impact: Additional traffic towards malicious relays
  Path Bias: Bandwidth lying can make up for circuit rejection
  Reason for prioritization: Assists Covert Channel Path Bias attacks
  Signal is injected: by manipulating reported descriptor bandwidths
  Signal is observed: by clients choosing lying relays more often
  Signal is: the effect of using lying relays more often

Tor clients select relays for circuits in proportion to their fraction of
consensus "bandwidth" weight. This consensus weight is calculated by
multiplying the relay's self-reported "observed" descriptor bandwidth value by
a ratio that is measured by the Tor load balancing system (formerly TorFlow;
now sbws -- see [SBWS] for an overview).

The load balancing system uses two-hop paths to measure the stream bandwidth
through all relays on the network. The ratio is computed by determining a
network-wide average stream bandwidth, 'avg_sbw', and a per-relay average
stream bandwidth, 'relay_sbw'. Each relay's ratio value is 'relay_sbw/avg_sbw'.
(There are also additional filtering steps to remove slow outlier streams).

Because the consensus weights for relays are derived by multiplying
self-reported descriptor values by this ratio, relays can still influence their
weight by egregiously lying in their descriptor value, thus attracting more
client usage. They can also attempt to fingerprint load balancer activity and
selectively give it better service, though this is more complicated than
simply patching Tor to lie.
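
The weight calculation described above can be sketched as follows. This is a
simplified model, not sbws code: `consensus_weights` is a hypothetical helper,
the network-wide average is taken as a plain mean with no outlier filtering,
and the relay names and numbers are made up. It also assumes the measured
ratio stays the same when the relay lies, which is the case the text is
concerned with.

```python
from statistics import mean


def consensus_weights(observed_bw: dict, relay_sbw: dict) -> dict:
    """weight = self-reported observed bandwidth * (relay_sbw / avg_sbw)."""
    avg_sbw = mean(relay_sbw.values())  # assumed: plain mean, no filtering
    return {
        relay: observed_bw[relay] * (relay_sbw[relay] / avg_sbw)
        for relay in observed_bw
    }


# Hypothetical relays: descriptor bandwidth and measured stream bandwidth.
observed = {"honest": 100, "liar": 100}
measured = {"honest": 10, "liar": 10}
honest_run = consensus_weights(observed, measured)

# The liar reports 5x its real bandwidth; the measured ratio is unchanged.
observed_lying = {"honest": 100, "liar": 500}
lying_run = consensus_weights(observed_lying, measured)

print(honest_run)
print(lying_run)
```

Because the descriptor value enters the product directly, any inflation that
does not also lower the measured ratio carries straight through to the weight.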

This attack vector is especially useful when combined with a path bias attack
from Section 1.1: if an adversary is using one of those covert channels to
close a large portion of their circuits, they can make up for this loss of
usage by inflating their corresponding bandwidth value by an equivalent
amount, thus causing the load balancer to still measure a reasonable ratio for
them, and thus still provide fast service for the fully deanonymized circuits
that they do carry.

There are many research papers written on alternate approaches to the
measurement problem. These have not been deployed for three reasons:
  1. The unwieldy complexity and fragility of the C-Tor codebase
  2. The conflation of measurement with load balancing (we need both)
  3. Difficulty performing measurement of the fastest relays with
     non-detectable/distributed mechanisms

In the medium term, we will work on detecting bandwidth lying and manipulation
via scanners. In the long term, Arti-relay will allow the implementation of
distributed and/or dedicated measurement components, such as [FLASHFLOW].
(Note that FlashFlow still needs [SBWS] or another mechanism to handle load
balancing, though, since FlashFlow only provides measurement).

Solutions: Scan for lying relays; implement research measurement solutions
Status: A sketch of the lying relay scanner design is in [LYING_SCANNER]
Funding: Scanning for lying relays is funded via Sponsor 112


1.2.4. Metrics Leakage

At a glance:
  Accuracy: FP=low, FN=high
  Requires: Some mechanism to bias or inflate reported relay metrics
  Impact: Guard discovery
  Path Bias: Potentially relevant, depending on type of leak
  Reason for prioritization: Historically severe issues
  Signal is injected: by interacting with onion service
  Signal is observed: by reading router descriptors
  Signal is: information about volume of traffic and number of IP addresses

In the past, we have had issues with info leaks in our metrics reporting (see
[METRICSLEAK]). We addressed them by lowering the resolution of read/write
history, and ensuring certain error conditions could not willfully introduce
noticeable asymmetries. However, certain characteristics, like reporting local
onion or SOCKS activity in relay bandwidth counts, still remain.

Additionally, during extremely large flooding or DDoS attempts, it may still
be possible to see the corresponding increases in reported metrics for the
Guards in use by an onion service, and thus discover its Guards.

Solutions: Fix client traffic reporting; remove injectable asymmetries;
           reduce metrics resolution; add noise
Status: Metrics resolution reduced to 24hr; known asymmetries fixed
Funding: Not funded


1.2.5. Protocol Oracles

At a glance:
  Accuracy: FP=medium, FN=0 (for unpopular sites: FP=0, FN=0)
  Requires: Probing relay DNS cache
  Impact: Assists Website Traffic Fingerprinting; Domain Usage Analytics
  Path Bias: Not Possible
  Reason for prioritization: Historically accurate oracles
  Signal is injected: by client causing DNS caching at exit
  Signal is observed: by probing DNS response with respect to cell ordering
                      via all exits
  Signal is: If cached, response is immediate; otherwise other cells come first

Protocol oracles, such as exit DNS cache timing to determine if a domain has
been recently visited, increase the severity of Website Traffic Fingerprinting
in Section 1.3.3, by reducing false positives, especially for unpopular
websites.

There are additional forms of oracles for Website Traffic Fingerprinting, but
the remainder are not protocol oracles in Tor. See [ORACLES] in the
references.

Tor deployed a defense for this oracle in the [DNSORACLE] tickets, to
randomize expiry time. This helps reduce the precision of this oracle for
popular and moderately popular domains/websites in the network, but does not
fully eliminate it for unpopular domains/websites.

The paper in [DNSORACLE] specifies a further defense in Section 6.2: a
pre-load of popular names and per-circuit cache isolation, combined with third
party resolvers. The purpose of the pre-load list is to preserve the cache
hits for shared domains across circuits (~11-17% of cache hits, according to
the paper). The purpose of circuit isolation is to avoid Tor cache hits for
unpopular domains across circuits. The purpose of third party resolvers is to
ensure that the local resolver's cache does not become measurable, when
isolating non-preloaded domains to be per-circuit.
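
The pre-load and isolation pieces of that defense can be sketched as a toy
cache. This is only an illustration under our own assumptions: the
`IsolatedDnsCache` class and its methods are hypothetical, expiry is not
modeled, and the resolver behind the cache (the part that third party
resolvers are meant to address) is left out entirely.

```python
class IsolatedDnsCache:
    """Toy exit-side cache: shared pre-loaded names, per-circuit otherwise."""

    def __init__(self, preload: dict):
        self.shared = dict(preload)  # popular names, shared by all circuits
        self.per_circuit = {}        # circuit_id -> {name: address}

    def lookup(self, circuit_id: int, name: str):
        """Return a cached address, or None on a cache miss."""
        if name in self.shared:
            return self.shared[name]
        return self.per_circuit.get(circuit_id, {}).get(name)

    def store(self, circuit_id: int, name: str, address: str) -> None:
        """Cache a resolved name for this circuit only."""
        self.per_circuit.setdefault(circuit_id, {})[name] = address

    def close_circuit(self, circuit_id: int) -> None:
        """Forget everything a circuit resolved once it is torn down."""
        self.per_circuit.pop(circuit_id, None)


cache = IsolatedDnsCache({"popular.example": "192.0.2.1"})
cache.store(1, "rare.example", "192.0.2.2")
print(cache.lookup(1, "rare.example"))     # hit: same circuit
print(cache.lookup(2, "rare.example"))     # miss: other circuits see nothing
print(cache.lookup(2, "popular.example"))  # hit: pre-loaded name is shared
```

A probing circuit gets the same answer for an unpopular name whether or not
another circuit resolved it, which is the property the isolation is after.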

Unfortunately, third party resolvers are unlikely to be recommended for use by
Tor, since cache misses of unpopular domains would hit them, and be subject to
sale in DNS analytics data at high resolution (see [NETFLOW_TICKET]).

Also note that the cache probe attack can only be used by one adversary at a
time (or they begin to generate false positives for each other by actually
*causing* caching, or need to monitor for each other to avoid each other).
This is in stark contrast to third party resolvers, where this information is
sold and available to multiple adversaries concurrently, for all uncached
domains, with high resolution timing, without the need for careful
coordination by adversaries.

However, note that an arti-relay implementation would no longer be single
threaded, and would be able to reprioritize asynchronous cache activity
arbitrarily, especially for sensitive uncached activity to a local resolver.
This might be useful for reducing the accuracy of the side channel, in this
case.

Unfortunately, we lack sufficient clarity to determine if it is meaningful to
implement any further defense that does not involve third party resolvers
under either current C-Tor, or future arti-relay circumstances.

Solutions: Isolate cache per circuit; provide a shared pre-warmed cache of
           popular domains; smarter cache handling mechanisms?
Status: Randomized expiry only - not fully eliminated
Funding: Any further fixes are covered by Sponsor 112


1.3. Info Leaks of Research Concern

In this section, we list info leaks that either need further research, or are
undergoing active research.

Some of these are still severe, but typically less so than the already covered
ones, unless they are part of a combined attack, such as with an Oracle,
or with Guard Discovery.

Some of these may be more or less severe than currently suspected: If we
knew for certain, they wouldn't need research.


1.3.1. Netflow Activity

At a glance:
  Accuracy: FP=high; FN=0 (FN=medium with incomplete vantage point set)
  Requires: Access to netflow data market, or ISP coercion
  Impact: Anonymity Set Reduction; Deanonymization with Guard Discovery/Oracle
  Path Bias: Not possible
  Reason for Prioritization: Low impact without Guard Discovery/Oracle
  Signal is: created by using the network
  Signal is observed: at ISP of everything that is using the network.
  Signal is: Connection tuple times and byte counts

Netflow is a feature of internet routers that records connection tuples, as
well as time stamps and byte counts, for analysis.

This data is bought and sold, by both governments and threat intelligence
companies, as documented in [NETFLOW_TICKET].

Tor has a padding mechanism to reduce the resolution of this data (see Section
2 of [PADDING_SPEC]), but this hinges on clients' ability to keep connections
open and padded for 45-60 minutes, even when idle. This padding reduces the
resolution of intersection attacks, making them operate on 30 minute time
windows, rather than 15 second time windows. This increases the false positive
rate, and thus increases the duration of such intersection attacks.

Large scale Netflow data can also be used to track Tor users as they migrate
from location to location, without necessarily deanonymizing them. Because Tor
uses three directory guards, and has ~4000 Guard relays, there are
Choose(4000,3), or ~10 billion, possible sets of directory Guards, though
probability weighting of Guard selection does reduce this considerably in
practice. Lowering the total number of Guard relays (via arti-relay and using
only the fastest Guards), and using just two directory guards as opposed to
three, can reduce this number such that false positives become more common. More
thorough solutions are discussed in [GUARDSETS].
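
The combination counts can be checked directly. This sketch assumes uniform
Guard selection, so it gives an upper bound; as noted above, bandwidth
weighting makes the effective number considerably smaller. The constant name
is ours, and 4000 is the approximate figure from the text.

```python
import math

GUARD_RELAYS = 4000  # approximate figure from the text

three_guards = math.comb(GUARD_RELAYS, 3)
two_guards = math.comb(GUARD_RELAYS, 2)

print(three_guards)  # 10658668000
print(two_guards)    # 7998000
```

Moving from three directory guards to two shrinks the space of possible sets
by roughly three orders of magnitude, before any reduction in the number of
Guard relays is taken into account.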

Location tracking aside, by itself, this data (especially when padded) is not
a threat to client anonymity. However, this data can also be used in
combination with a number of Oracles or confirmation vectors, such as:
  - Guard Discovery
  - Flooding an onion service with huge amounts of traffic in a pattern
  - Advertising analytics or account activity log purchase
  - TCP RST injection
  - TLS conn rotation

These oracles can be used to either confirm the connection of an onion
service, or to deanonymize it after Guard Discovery.

In the case of clients, the use of Oracle data can enable intersection attacks
to deanonymize them. The oracle data necessary for client intersection attack
is also being bought and sold, as documented in [NETFLOW_TICKET]. It is
unknown how long such attacks take, but it is a function of the number of
users under consideration, and their connection durations.
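
A toy version of such an intersection attack is sketched below, to show why
coarser time windows slow it down. The function name and the sample data are
made up for illustration: each set holds the users whose Guard connection was
active in the window around one event tied to the target.

```python
def intersect_candidates(observations: list) -> set:
    """Intersect the sets of users seen online at each observed event."""
    if not observations:
        return set()
    candidates = set(observations[0])
    for online in observations[1:]:
        candidates &= set(online)
    return candidates


# Hypothetical data: fine-grained windows exclude more users per event than
# coarse (padded) windows, which catch more unrelated connections.
fine_windows = [{"a", "b", "c"}, {"a", "c"}, {"a"}]
coarse_windows = [{"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b"}]

print(intersect_candidates(fine_windows))
print(intersect_candidates(coarse_windows))
```

With the coarse windows, the same three events leave more than one candidate,
so the adversary needs further observations before the set narrows to one.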

The research interest here is in determining what can be done to increase the
amount of time these attacks take, in terms of increasing connection duration,
increasing the number of users, reducing the total number of Guard relays,
using a UDP transport, or changing user behavior.

Solutions: Netflow padding; connection duration increase; QUIC transport;
           using bridges; decreasing total number of guards; using only two
           directory guards; guardsets; limiting sensitive account usage
Status: Netflow padding deployed in C-Tor and arti
Funding: Not explicitly funded


1.3.2. Active Traffic Manipulation Covert Channels

At a Glance:
  Accuracy: FP=medium, FN=low
  Requires: Netflow data, or compromised/monitored Guard
  Impact: Anonymity Set Reduction; Netflow-assisted deanonymization
  Path Bias: Possible via exit policy or onion service reconnection
  Reason for Prioritization: Can assist other attacks; lower severity otherwise
  Signal is injected: by the target application, manipulated by the other end.
  Signal is observed: at Guard, or target->guard connection.
  Signal is: Unusual traffic volume or timing.

This category of covert channel occurs after a client has begun using a
circuit, by manipulating application data traffic. This manipulation can
occur either at the application layer, or at the Tor protocol layer.

Because it occurs after the circuit is in use, it does not permit the use of
path bias or trust reduction properties by itself (unless combined with one of
the above info leak attack vectors -- most often Adversary-Induced Circuit
Creation).

These covert channels also have a significantly higher false positive rate
than those before circuit use, since application traffic is ad-hoc and
arbitrary, and ordinary application traffic is mixed in with the traffic that
the adversary attempts to manipulate.

For onion services, this covert channel is much more severe: Onion services may
be flooded with application data in large-volume patterns over long periods of
time, which can be seen in netflow logs.

For clients, this covert channel typically is only effective after the
adversary suspects an individual, for confirmation of their suspicion, or
after Guard Discovery.

Examples of this class of covert channel include:
  - Application-layer manipulation (AJAX)
  - Traffic delays (rainbow, swirl - see [BACKLIT])
  - Onion Service flooding via HTTP POST
  - Flooding Tor relays to notice traffic changes in onion service throughput
  - Conflux leg switching patterns
  - Traffic inflation (1 byte data cells)

Solution: Protocol checks; Padding machines at middles for specific kinds of
          traffic; limits on inbound onion service traffic; Backlit
Status: Protocol checks performed for conflux; vanguards addon closes
        high-volume circuits
Funding: Not explicitly funded


1.3.3. Passive Application-Layer Traffic Patterns

At a Glance:
  Accuracy: FP=medium, FN=medium
  Requires: Compromised Guard (external monitoring increases FP+FN rate)
  Impact: Links client and destination activity (ie: deanonymization with logs)
  Path Bias: Not Possible
  Reason for prioritization: Large FP rate without oracle, debated practicality
  Signal is: not injected; passively extracted
  Signal is observed: at Guard, or entire network
  Signal is: timing and volume patterns of traffic.

This category of information leak occurs after a client has begun using a
circuit, by analyzing application data traffic.

Examples of this class of information leak include:
  - Website traffic fingerprinting
  - End-to-end correlation

The canonical application of this information leak is in end-to-end
correlation, where application traffic entering the Tor network is correlated
to traffic exiting the Tor network (see [DEEPCOFFEA]). This attack vector
requires a global view of all Tor traffic, or false negatives skyrocket.
However, this information leak is also possible to exploit at a single
observation point, using machine learning classifiers (see [ROBFINGERPRINT]),
typically either the Guard or bridge relay, or the path between the
Guard/bridge and the client.

In both cases, this information leak has a significant false positive rate,
since application traffic is ad-hoc, arbitrary, and self-similar. Because
multiple circuits are multiplexed on one TLS connection, the false positive
and false negative rates are higher still at this observation location, as
opposed to on a specific circuit.

In both cases, the majority of the information gained by classifiers is in the
beginning of the trace (see [FRONT] and [DEEPCOFFEA]).

This information leak gets more severe when it is combined with another oracle
(as per [ORACLES]) that can confirm the statistically derived activity, or
narrow the scope of material to analyze. Example oracles include:
  - DNS cache timing
  - Onion service handshake fingerprinting
  - Restricting targeting to either specific users, or specific websites
  - Advertising analytics or account activity log purchase (see
    [NETFLOW_TICKET])

Website traffic fingerprinting literature is divided into two classes of
attack study: Open World and Closed World. Closed World is when the adversary
uses an Oracle to restrict the set of possible websites to classify traffic
against. Open World is when the adversary attempts to recognize a specific
website or set of websites out of all possible other traffic.

The nature of the protocol usage by the application can make this attack
easier or harder, which has resulted in application layer defenses, such as
[ALPACA]. Additionally, the original Google QUIC was easier to fingerprint
than HTTP (see [QUICPRINT1]), but IETF HTTP3 reversed this (see [QUICPRINT2]).
Javascript usage makes these attacks easier (see [INTERSPACE], Table 3),
whereas concurrent activity (in the case of TLS observation) makes them harder.
Web3 protocols that exchange blocks of data instead of performing AJAX
requests are likely to be much harder to fingerprint, so long as the web3
application is accessed via its native protocol, and not via a website
front-end.

The entire research literature for this vector is fraught with analysis
problems, unfortunately. Because smaller web crawl sizes make the attacks more
effective, and because attack papers are easier to produce than defenses
generally, dismal results are commonplace. [WFNETSIM] and [WFLIVE] examine
some of these effects. It is common for additional hidden gifts to adversaries
to creep in, leading to contradictory results, even in otherwise comprehensive
papers at top-tier venues. The entire vein of literature must be read with a
skeptical eye, a fine-tooth comb, and a large dumpster nearby.

As one recent example, in an otherwise comprehensive evaluation of modern
defenses, [DEFCRITIC] found a contrary result with respect to the Javascript
finding in the [INTERSPACE] paper, by training and testing their classifiers
with knowledge of the Javascript state of the browser (thus giving them a free
oracle). In truth, neither [DEFCRITIC] nor [INTERSPACE] properly examined the
effects of Javascript -- a rigorous test would train and test on a mix of
Javascript and non-Javascript traffic, and then compare the classification
accuracy of each set separately, after joint classification. Instead,
[DEFCRITIC] just reported that disabling Javascript (via the security level of
Tor Browser) has "no beneficial effect", which they showed by actually letting
the adversary know which traces had Javascript disabled.
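
The scoring step of the more rigorous test described above can be sketched as
follows. This is only an outline of the evaluation, under our own
assumptions: the classifier itself is out of scope, `accuracy_by_group` is a
hypothetical helper, and the labels and predictions are invented placeholders
that say nothing about real results. The important property is that the
Javascript tag is used only for scoring, never as classifier input.

```python
from collections import defaultdict


def accuracy_by_group(labels, predictions, groups):
    """Per-group accuracy for a classifier trained and tested on mixed data.

    `groups` tags each test trace (e.g. "js" / "nojs"); the tag is used only
    for scoring, never as a classifier input.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, pred, group in zip(labels, predictions, groups):
        total[group] += 1
        correct[group] += int(label == pred)
    return {group: correct[group] / total[group] for group in total}


# Placeholder test-set output from one jointly trained classifier.
labels      = ["siteA", "siteB", "siteA", "siteB", "siteA", "siteB"]
predictions = ["siteA", "siteB", "siteA", "siteA", "siteB", "siteB"]
groups      = ["js",    "js",    "js",    "nojs",  "nojs",  "nojs"]

result = accuracy_by_group(labels, predictions, groups)
print(result)
```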

Such hidden gifts to adversaries are commonplace, especially in attack papers.
While it may be useful to do this while comparing defenses against each other,
when these assumptions are hidden, and when defenses are not re-tunable for
more realistic conditions, this leads to focus on burdensome defenses with
large amounts of delay or huge amounts of overhead, at the expense of ignoring
lighter approaches that actually improve the situation in practice.

This of course means that nothing gets done at all, because Tor is neither
going to add arbitrary cell delay at relays (because of queue memory required
for this and the impacts on congestion control), nor add 400% overhead to
both directions of traffic.

In terms of defense deployment, it makes the most sense to place these padding
machines at the Guards to start, for many reasons. This is in contrast to
other lighter padding machines for earlier vectors, where it makes more sense
to place them at the middle relay. In this case, the heavier padding machines
necessary for this vector can take advantage of higher multiplexing, which
means less overhead. They can also use the congestion signal at the TLS
connection, to more easily avoid unnecessary padding when the TLS connection
is blocked, thus only using "slack" Guard capacity. Conflux also can be tuned
to provide at least some benefit here: even if in lab conditions it provides
low benefit, in the scenarios studied by [WFNETSIM] and [WFLIVE], this may
actually be considerable, unless the adversary has both guards, which is more
difficult for an internal adversary. Additionally, the distinction between
external and internal adversaries is rarely, if ever, evaluated in the
literature anyway, so there is little guidance on this distinction as a whole,
right now.


Solution: Application layer solutions ([ALPACA], disabling Javascript, web3 apps);
          Padding machines at guards for application traffic; conflux tuning
Status: Unfixed
Funding: Padding machine and simulator port to arti are funded via Sponsor 112


1.3.4. Protocol or Application Linkability

At a Glance:
  Accuracy: FP=0, FN=0
  Requires: Compromised Exit; Traffic Observation; Hostile Website
  Impact: Anonymity Set Reduction
  Path Bias: Not Possible
  Reason for prioritization: Low impact with faster releases
  Signal is: not injected; passively extracted
  Signal is observed: at Exit, or at application destination
  Signal is: Rare protocol usage or behavior

Historically, due to Tor's slow upgrade cycles, we have had concerns about
deploying new features that may fragment the anonymity set of early adopters.

Since we have moved to a more rapid release cycle for both clients and relays
by abandoning the Tor LTS series, these concerns are much less severe.
However, they can still present concerns during the upgrade cycle. For
Conflux, for example, during the alpha series, the fact that few exits
supported conflux caused us to limit the number of pre-built conflux sets to
just one, to avoid concentrating alpha users at just a few exits. It is not
clear that this was actually a serious anonymity concern, but it was certainly
a concern with respect to concentrating the full activity of all these users
at just a few locations, for load balancing reasons alone.

Similar concerns exist for users of alternate implementations, both of Tor,
and of applications like the browser. We regard this as a potential research
concern, but it is likely not a severe one. For example, assuming Tor Browser
and Brave both address browser fingerprinting, how bad is it for anonymity
that they address it differently? Even if they ensure that all their users
have the same or similar browser fingerprints, it will still be possible for
websites, analytics datasets, and possibly even Exit relays or Exit-side
network observers, to differentiate the use of one browser versus the other.
Does this actually harm their anonymity in a real way, or must other oracles
be involved? Are these oracles easy to obtain?

Similarly, letting users choose their exit country is in this category. In
some circumstances, this choice has serious anonymity implications: if the
choice is a permanent, global one, and the user chooses an unpopular country
with few exits, all of their activity will be much more linkable. However, if
the country is popular, and/or if the choice is isolated per-tab or per-app,
is this still significant such that it actually enables any real attacks? It
seems like not so much.

Solutions: Faster upgrade cycle; Avoiding concentrated use of new features
Status: Tor LTS series is no longer supported
Funding: Not explicitly funded


1.3.5. Latency Measurement

At a glance:
  Accuracy: FP=high, FN=high
  Requires: Onion service, or malicious Exit
  Impact: Anonymity Set Reduction/Rough geolocation of services
  Path Bias: Possible exacerbating factor
  Reason for Prioritization: Low impact; multiple observations required
  Signal is created naturally by anything that has a "reply" mechanic
  Signal is observed at either end.
  Signal is: delays between a message sent and a message received in reply.

The effects of latency on the anonymity set have been studied in the
[LATENCY_LEAK] papers.

It may be possible to get a rough idea of the geolocation of an onion service
by measuring the latency over many different circuits. This seems more
realistic if the Guard or Guards are known, so that their contribution to
latency statistics can be factored in, over many many connections to an onion
service. For normal client activity, route selection and the fact that the
Exit does not know specific accounts or cookies in use likely provides enough
protection.

If this turns out to be severe, it seems the best option is to add a delay on
the client side to attempt to mask the overall latency. This kind of approach
is only likely to make sense for onion services. Other path selection
alterations may help, though.

Solutions: Guards, vanguards, alternative path selection, client-side delay
Status: Guards and vanguards-lite have been used in Tor since 0.4.7
Funding: Not explicitly funded


2. Attack Examples

To demonstrate how info leaks combine, here we provide some historical
real-world attacks that have used these info leaks to deanonymize Tor
users.


2.1. CMU Tagging Attack

Perhaps the most famous historical attack was when a group at CMU assisted the
FBI in performing dragnet deanonymization of Tor users, through their
[RELAY_EARLY] attack on the live network. This attack could only work on users
who happened to use their Guards, but those users could be fully deanonymized.

The attack itself operated on connections to monitored HSDIRs: it encoded the
address of the onion service in the cell command header, via the RELAY_EARLY
bitflipping technique from Section 1.1.2. Their Guards then recorded this
address, along with the IP address of the user, providing a log of onion
services that each IP address visited.

It is not clear if the CMU group even properly utilized the full path bias
attack power here to deanonymize as many Tor users as possible, or if their
logs were simply of interest to the FBI because of what they happened to
capture. It seems like the latter is the case.

A similar, motivated adversary could use any of the covert channels in Section
1.1, in combination with Path Bias to close non-deanonymized circuits, to
fully deanonymize all exit traffic carried by their Guard relays. There are
path bias detectors in Tor to detect large amounts of circuit failure, but
when the network (or the Guard) is also under heavy circuit load, they can
become unreliable, and have their own false positives.

While this attack vector requires the Guard relay, it is of interest to any
adversary that would like to perform dragnet deanonymization of a wide range
of Tor users, or to compel a Guard to deanonymize certain Tor users. It is
also of interest to adversaries with censorship capability, who would like
to monitor all Tor usage of users, rather than block them. Such an adversary
would use their censorship capability to direct Tor users to only their own
malicious Guards or Bridges.


2.2. Guard Discovery Attacks with Netflow Deanonymization

Prior to the introduction of Vanguards-lite in Tor 0.4.7, it was possible to
combine "1.2.2. Adversary-Induced Circuit Creation", with a circuit-based
covert channel (1.1.3, 1.2.1, or 1.3.2), to obtain a middle relay confirmed
to be next to the user's Guard.

Once the Guard is obtained, netflow connection times can be used to find the
user of interest.

There was at least one instance of this being used against a user of Ricochet,
who was fully deanonymized. The user was using neither vanguards-lite, nor the
vanguards addon, so this attack was trivial. It is unclear which covert
channel type was used for Guard Discovery. The netflow attack proceeded
quickly, because the attacker was able to determine when the user was on and
offline via their onion service descriptor being available, and the number
of users at the discovered Guard was relatively small.


2.3. Netflow Anonymity Set Reduction

Netflow records have been used, to varying degrees of success, to attempt to
identify users who have posted violent threats in an area.

In most cases, this has simply ended up hassling unrelated Tor users, without
finding the posting user. However, in at least one case, the user was found.

Netflow records were also reportedly used to build suspicion of a datacenter
in Germany which was emitting large amounts of Tor traffic, to eventually
identify it as a Tor hosting service providing service to drug markets, after
further investigation. It is not clear if a flooding attack was also used in
this case.


2.4. Application Layer Confirmation

The first (and only) known case of fine-grained traffic analysis of Tor
involved an application layer confirmation attack, using the vector from
1.3.2.

In this case, a particular person was suspected as being involved in a group
under investigation, due to the presence of an informant in that group. The
FBI then monitored the suspect's WiFi, and sent a series of XMPP ping messages
to the account in question. Despite the use of Tor, enough pings were sent
such that the timings on the monitored WiFi showed overlap with the XMPP
timings of sent pings and responses. This was prior to Tor's introduction of
netflow padding (which generates similar back-and-forth traffic every 4-9
seconds between the client and the Guard).

It should be noted that such attacks are still prone to error, especially for
heavy Tor users whose other traffic would always cause such overlap, as
opposed to those who use Tor for only one purpose, and very lightly or
infrequently.


3. Glossary

Covert Channel:
  A kind of information leak that allows an adversary to send information
  to another point in the network.

Collusion Signal:
  A Covert Channel that only reliably conveys 1 bit: if an adversary is
  present. Such covert channels are weaker than those that enable full
  identifier transmission, and also typically require correlation.

Confirmation Signal:
  Similar to a collusion signal, a confirmation signal is sent over a
  weak or noisy channel, and can only confirm that an already suspected
  entity is the target of the signal.

False Negative:
  A false negative is when the adversary fails to spot the presence of
  an info leak vector, in instances where it is actually present.

False Positive:
  A false positive is when the adversary attempts to use an info leak vector,
  but some similar traffic pattern or behavior elsewhere matches the traffic
  pattern of their info leak vector.

Guard Discovery:
  The ability of an adversary to determine the Guard in use by a service or
  client.

Identifier Transmission:
  The ability of a covert channel to reliably encode a unique identifier,
  such as an IP address, without error.

Oracle:
  An additional mechanism used to confirm an observed info leak vector
  that has a high rate of False Positives. Can take the form of DNS
  cache, server logs, analytics data, and other factors. (See [ORACLES]).

Path Bias (aka Route Manipulation, or Route Capture):
  The ability of an adversary to direct circuits towards their other
  compromised relays, by destroying circuits and/or TLS connections
  whose paths are not sufficiently compromised.


Acknowledgments:

This document has benefited from review and suggestions by David Goulet, Nick
Hopper, Rob Jansen, Nick Mathewson, Tobias Pulls, and Florentin Rochet.


References:

[ALPACA]
   https://petsymposium.org/2017/papers/issue2/paper54-2017-2-source.pdf

[BACKLIT]
   https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

[DEEPCOFFEA]
   https://www-users.cse.umn.edu/~hoppernj/deepcoffea.pdf

[DEFCRITIC]
   https://www-users.cse.umn.edu/~hoppernj/sok_wf_def_sp23.pdf

[DNSORACLE]
   https://www.usenix.org/system/files/usenixsecurity23-dahlberg.pdf
   https://gitlab.torproject.org/rgdd/ttapd/-/tree/main/artifact/safety-board
   https://gitlab.torproject.org/tpo/core/tor/-/issues/40674
   https://gitlab.torproject.org/tpo/core/tor/-/issues/40539
   https://gitlab.torproject.org/tpo/core/tor/-/issues/32678

[DOSSECURITY]
   https://www.princeton.edu/~pmittal/publications/dos-ccs07.pdf

[DROPMARK]
   https://petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf

[FLASHFLOW]
   https://gitweb.torproject.org/torspec.git/tree/proposals/316-flashflow.md

[FRONT]
   https://www.usenix.org/system/files/sec20summer_gong_prepub.pdf

[GUARDSETS]
   https://www.freehaven.net/anonbib/cache/guardsets-pets2015.pdf
   https://www.freehaven.net/anonbib/cache/guardsets-pets2018.pdf

[INTERSPACE]
   https://arxiv.org/pdf/2011.13471.pdf (Table 3)

[LATENCY_LEAK]
   https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf
   https://www.robgjansen.com/publications/howlow-pets2013.pdf

[LYING_SCANNER]
   https://gitlab.torproject.org/tpo/network-health/team/-/issues/313

[METRICSLEAK]
   https://gitlab.torproject.org/tpo/core/tor/-/issues/23512

[NETFLOW_TICKET]
   https://gitlab.torproject.org/tpo/network-health/team/-/issues/42

[ONECELL]
   https://www.blackhat.com/presentations/bh-dc-09/Fu/BlackHat-DC-09-Fu-Break-Tors-Anonymity.pdf

[ONIONPRINT]
   https://www.freehaven.net/anonbib/cache/circuit-fingerprinting2015.pdf

[ONIONFOUND]
   https://www.researchgate.net/publication/356421302_From_Onion_Not_Found_to_Guard_Discovery/fulltext/619be24907be5f31b7ac194a/From-Onion-Not-Found-to-Guard-Discovery.pdf?origin=publication_detail

[ORACLES]
   https://petsymposium.org/popets/2020/popets-2020-0013.pdf

[PADDING_SPEC]
   https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/padding-spec.txt#L68

[PCP]
   https://arxiv.org/abs/2103.03831

[QUICPRINT1]
   https://arxiv.org/abs/2101.11871 (see also: https://news.ycombinator.com/item?id=25969886)

[QUICPRINT2]
   https://netsec.ethz.ch/publications/papers/smith2021website.pdf

[RACCOON23]
   https://archives.seul.org/or/dev/Mar-2012/msg00019.html

[RELAY_EARLY]
   https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack/

[ROBFINGERPRINT]
   https://www.usenix.org/conference/usenixsecurity23/presentation/shen-meng

[SBWS]
   https://tpo.pages.torproject.net/network-health/sbws/how_works.html

[WFLIVE]
   https://www.usenix.org/system/files/sec22-cherubin.pdf

[WFNETSIM]
   https://petsymposium.org/2023/files/papers/issue4/popets-2023-0125.pdf

Filename: 345-specs-in-mdbook.md
Title: Migrating the tor specifications to mdbook
Author: Nick Mathewson
Created: 2023-10-03
Status: Closed

Introduction

I'm going to propose that we migrate our specifications to a set of markdown files, specifically using the mdbook tool.

This proposal does not propose a bulk rewrite of our specs; it is meant to be a low-cost step forward that will produce better output, and make it easier to continue working on our specs going forward.

That said, I think that this change will enable rewrites in the future. I'll explain more below.

What is mdbook?

Mdbook is a tool developed by members of the Rust community to create books with Markdown. Each chapter is a single markdown file; the files are organized into a book using a SUMMARY.md file.
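
To make the SUMMARY.md idea concrete, here is a minimal sketch that renders an outline from a list of chapters. The chapter titles and file names are made up for illustration; only the nested list of links follows the SUMMARY.md layout that mdbook documents.

```python
def render_summary(chapters):
    """Render an mdbook SUMMARY.md from (depth, title, path) tuples."""
    lines = ["# Summary", ""]
    for depth, title, path in chapters:
        # Nested chapters are expressed as indented markdown list items.
        indent = "    " * depth
        lines.append(f"{indent}- [{title}]({path})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical chapter layout, not the real torspec tree.
    print(render_summary([
        (0, "Introduction", "intro.md"),
        (1, "Notation", "intro/notation.md"),
    ]))
```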

Have a look at the mdbook documentation; this is what the output looks like.

Have a look at this source tree: that's the input that produces the output above.

Mdbook is extensible: it can use numerous plugins to enhance the semantics of the markdown input, add diagrams, output in more formats, and so on.

What would using mdbook get us immediately?

There are a bunch of changes that we could get immediately via even the simplest migration to mdbook. These immediate benefits aren't colossal, but they are things we've wanted for quite a while.

  • We'll have a document that's easier to navigate (via the sidebars).

  • We'll finally have good HTML output.

  • We'll have all our specifications organized into a single "document", able to link to one another and cross reference one another.

  • We'll have perma-links to sections.

  • We'll have a built-in text search function. (Go to the mdbook documentation and hit "s" to try it out.)

How will mdbook help us later on as we reorganize?

Many of the benefits of mdbook will come later down the line as we improve our documentation.

  • Reorganizing will become much easier.

    • Our links will no longer be based on section number, so we won't have to worry about renumbering when we add new sections.
    • We'll be able to create redirects from old section filenames to new ones if we need to rename a file completely.
    • It will be far easier to break up our files into smaller files when we find that we need to reorganize material.
  • We will be able to make our documents even easier to navigate.

    • As we improve our documentation, we'll be able to use links to cross-reference our sections.
  • We'll be able to include real diagrams and tables.

  • We'll be able to integrate proposals more easily.

    • New proposals can become new chapters in our specification simply by copying them into a new 'md' file or files; we won't have to decide between integrating them into existing files or creating a new spec.

    • Implemented but unmerged proposals can become additional chapters in an appendix to the spec. We can refer to them with permalinks that will still work when they move to another place in the specs.

How should we do this?

Strategy

My priorities here are:

  • no loss of information,
  • decent-looking output,
  • a quick automated conversion process that won't cost us a bunch of time,
  • a process that we can run experimentally until we are satisfied with the results.

With that in mind, I'm writing a simple set of torspec-converter scripts to convert our old torspec.git repository into its new format. We can tweak the scripts until we like the output that they produce.

After running a recent torspec-converter on a fairly recent torspec.git, here is how the branch looks:

https://gitlab.torproject.org/nickm/torspec/-/tree/spec_conversion?ref_type=heads

And here's the example output when running mdbook on that branch:

https://people.torproject.org/~nickm/volatile/mdbook-specs/index.html

Note: these are not permanent URLs; we won't keep the example output forever. When we actually merge the changes, they will move into whatever final location we provide.

The conversion script isn't perfect. It only recognizes three kinds of content: headings, text, and "other". Content marked "other" is marked with ``` to render it verbatim.
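
As a rough illustration of that three-way split, a converter along these lines might look like the sketch below. This is not the actual conversion script: the heading pattern and the "indented means verbatim" rule are assumptions chosen only to show the idea.

```python
import re

# Hypothetical heading pattern for the old text specs: "1.3.5. Latency Measurement".
HEADING_RE = re.compile(r"^(\d+(?:\.\d+)*)\.\s+(\S.*)$")
FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fence


def classify(block):
    """Label one blank-line-separated block as 'heading', 'text', or 'other'."""
    lines = block.splitlines()
    if len(lines) == 1 and HEADING_RE.match(lines[0]):
        return "heading"
    # Assumed heuristic: prose is flush left; indented material is verbatim.
    if all(not line.startswith("  ") for line in lines):
        return "text"
    return "other"


def convert(source):
    """Convert old-style spec text to markdown, fencing anything unrecognized."""
    out = []
    for block in re.split(r"\n\s*\n", source.strip("\n")):
        kind = classify(block)
        if kind == "heading":
            number, title = HEADING_RE.match(block).groups()
            # "1.3.5" has two dots, so it becomes a level-3 heading.
            out.append("#" * (number.count(".") + 1) + " " + title)
        elif kind == "text":
            out.append(block)
        else:
            out.append(FENCE + "\n" + block + "\n" + FENCE)
    return "\n\n".join(out) + "\n"
```

Fencing everything that is not clearly a heading or prose is what keeps the "no loss of information" priority: the worst case is content that renders verbatim rather than content that is dropped or mangled.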

The choice of which sections to split up and which to keep as a single page is up to us; I made some initial decisions in the file above, but we can change it around as we please. See the configuration section at the end of the grinder.py script for details on how it's set up.

Additional work that will be needed

Assuming that we make this change, we'll want to build an automated CI process to build it as a website, and update the website whenever there is a commit to the specifications.

(This automated CI process might be as simple as git clone && mdbook build && rsync -avz book/ $TARGET.)

We'll want to go through our other documentation and update links, especially the permalinks in spec.torproject.org.

It might be a good idea to use spec.torproject.org as the new location of this book, assuming weasel (who maintains spec.tpo) also thinks it's reasonable. If we do that, we need to decide on what we want the landing page to look like, and we need very much to get our permalink story correct. Right now I'm generating a .htaccess file as part of the conversion.
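
For illustration only, generating such a file could be as small as the sketch below. The paths are placeholders rather than real spec.torproject.org permalinks, and the real conversion script may work differently; the output uses the `Redirect` directive from Apache's mod_alias.

```python
def htaccess_redirects(moved, status=301):
    """Render Apache mod_alias 'Redirect' lines for a mapping of old to new URL paths."""
    lines = []
    for old, new in sorted(moved.items()):
        if not (old.startswith("/") and new.startswith("/")):
            raise ValueError("expected absolute URL paths")
        lines.append(f"Redirect {status} {old} {new}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical permalink names and targets.
    print(htaccess_redirects({"/example-spec": "/example-spec/index.html"}))
```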

Stuff we shouldn't do.

I think we should continue to use the existing torspec.git repository for the new material, and just move the old text specs into a new archival location in torspec. (We could make a new repository entirely, but I don't think that's the best idea. In either case, we shouldn't change the text specifications after the initial conversion.)

We'll want to figure out our practices for keeping links working as we reorganize these documents. Mdbook has decent redirect support, but it's up to us to actually create the redirects as necessary.

The transition, in detail

  • Before the transition:

    • Work on the script until it produces output we like.
    • Finalize this proposal and determine where we are hosting everything.
    • Develop the CI process as needed to keep the site up to date.
    • Get approval and comment from necessary stakeholders.
    • Write documentation as needed to support the new way of doing things.
    • Decide on the new layout we want for torspec.git.
  • Staging the transition:

    • Make a branch to try out the transition; explicitly allow force-pushing that branch. (Possibly nickm/torspec.git in a branch called mdbook-demo, or torspec.git in a branch called mdbook-demo assuming it is not protected.)
    • Make a temporary URL to target with the transition (possibly spec-demo.tpo)
    • Once we want to do the transition, shift the scripts to tpo/torspec.git:main and spec.tpo, possibly?
  • The transition:

    • Move existing specs to a new subdirectory in torspec.git.
    • Run the script to produce an mdbook instance in torspec.git with the right layout.
    • Install the CI process to keep the site up to date.
  • Post-transition

    • Update links elsewhere.
    • Continue to improve the specs.

Integrating proposals

We could make all of our proposals into a separate book, like Rust does at https://rust-lang.github.io/rfcs/ . We could also leave them as they are for now.

(I don't currently think we should make all proposals part of the spec automatically.)

Timing

I think the right time to do this, if we decide to move ahead, is before November. That way we have this issue as something people can work on during the docs hackathon.

Alternatives

I've tried experimenting with Docusaurus here, which is even more full-featured and generates pretty react sites like this. (We're likely to use it for managing the Arti documentation and website.)

For the purposes we have here, it seems slightly overkill, but I do think a migration is feasible down the road if we decide we do want to move to Docusaurus. The important thing is the ability to keep our URLs working, and I'm confident we could do that.

The main differences for our purposes here seem to be:

  • The markdown implementation in Docusaurus is extremely picky about stuff that looks like HTML but isn't; it rejects it, rather than passing it on as text. Thus, using it would require a more painstaking conversion process before we could include text like "<state:on>" or "A <-> B" as our specs do in a few places.

  • Instead of organizing our documents in a SUMMARY.md with an MD outline format, we'd have to organize them in a sidebars.js with a JavaScript syntax.

  • Docusaurus seems to be far more flexible and have a lot more features, but also seems trickier to configure.

Filename: 346-protovers-again.md
Title: Clarifying and extending the use of protocol versioning
Author: Nick Mathewson
Created: 19 Oct 2023
Status: Open

Introduction

In proposal 264, we introduced "subprotocol versions" as a way to independently version different pieces of the Tor protocols, and communicate which parts of the Tor protocols are supported, recommended, and required.

Here we clarify the semantics of individual subprotocol versions, and describe more ways to use and negotiate them.

Semantics: Protocol versions are feature flags

One issue we left unclarified previously is the relationship between two different versions of the same subprotocol. That is, if we know the semantics of (say) Handshake=7, can we infer anything about a relay that supports Handshake=8? In particular, can we infer that it supports all of the same features implied by Handshake=7? If we want to know "does this relay support some feature supported by Handshake=7", must we check whether it supports Handshake=7, or should we check Handshake=x for any x≥7?

In this proposal, we settle the question as follows: subprotocol versions are flags. They do not have any necessary semantic relationship between them.

We reject the interpretation in which a higher version implies support for the features of every lower version, for several reasons:

  • It's tricky to implement.
  • It prevents us from ever removing a feature.
  • It requires us to implement features in the same order across all Tor versions.
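
The practical consequence is that a feature check is an exact membership test, never a comparison. A minimal sketch, assuming the usual space-separated `Name=ranges` protocol-line syntax (the helper names are made up, and "Handshake" is the same hypothetical subprotocol used above):

```python
def parse_protocols(line):
    """Parse 'Link=1-5 Relay=1,3-4' into {'Link': {1, 2, 3, 4, 5}, 'Relay': {1, 3, 4}}."""
    supported = {}
    for entry in line.split():
        name, _, ranges = entry.partition("=")
        versions = set()
        for part in ranges.split(","):
            if not part:
                continue
            low, _, high = part.partition("-")
            versions.update(range(int(low), int(high or low) + 1))
        supported[name] = versions
    return supported


def supports(supported, name, version):
    """Flag semantics: exact membership only, never 'any version >= N'."""
    return version in supported.get(name, set())


if __name__ == "__main__":
    relay = parse_protocols("Handshake=8 Link=1-5")
    # Supporting Handshake=8 says nothing about Handshake=7.
    print(supports(relay, "Handshake", 8), supports(relay, "Handshake", 7))
```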

...but sometimes a flag is a version!

There are some places in our protocol (notably: directory authority consensus methods, and channel protocol versions) where there is a semantic relationship between version numbers. Specifically: "higher numbers are always better". When parties need to pick one of these versions, they always pick the highest version number supported by enough of them.

When this kind of real version intersects with the "subprotocol versions" system, we use the same numbers:

  • Link subprotocols correspond one-to-one with the version numbers sent in a VERSIONS cell.
  • Microdesc and Cons subprotocols correspond to a subset of the version numbers of consensus methods.
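
By contrast, picking one of these "real" versions looks like the sketch below. What counts as "enough" parties depends on the context, so it is passed in as a parameter here; the function name and the example numbers are illustrative assumptions.

```python
from collections import Counter


def pick_version(supported_sets, needed):
    """Return the highest version supported by at least `needed` parties, or None."""
    counts = Counter(v for versions in supported_sets for v in set(versions))
    eligible = [v for v, n in counts.items() if n >= needed]
    return max(eligible) if eligible else None


if __name__ == "__main__":
    # Two channel endpoints: both must support the chosen link version.
    print(pick_version([{3, 4, 5}, {4, 5}], needed=2))
    # Three voters with a hypothetical threshold of two.
    print(pick_version([{28, 29, 30}, {28, 29}, {28}], needed=2))
```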

How to document subprotocol versions

When describing a subprotocol, we should be clear what relationship, if any, exists between its versions and any versions negotiated elsewhere in the specifications.

Unless otherwise documented, all versions can be in use at the same time: if only one can exist at once (on a single circuit, a single document, etc), this must be documented.

Implication: This means that we must say (for example) that you can't use Link=4 and Link=5 on the same channel.

Negotiating protocol versions in circuit handshakes.

Here we describe a way for a client to opt into features as part of its circuit handshake, in order to avoid proliferating negotiation extensions.

Binary-encoding protocol versions.

We assign a one-byte encoding for each subprotocol, ordered in the same way as in tor-spec.

Protocol     Id
Link         0
LinkAuth     1
Relay        2
DirCache     3
HSDir        4
HSIntro      5
HSRend       6
Desc         7
MicroDesc    8
Cons         9
Padding      10
FlowCtrl     11
Conflux      12
RelayCell    13
Datagram     TBD

Note: This is the same encoding used in the walking onions proposal. It takes its order from the ordering of protocol versions in tor-spec and matches up with the values defined for protocol_type_t in C tor's protover.h.

Requesting an opt-in circuit feature

When a client wants to request a given set of features, it sends an ntor_v3 extension containing:

struct subproto_request {
  struct req[..]; // up to end of extension
}

struct req {
  u8 protocol_id;
  u8 protocol_version;
}

Note 1: The above format does not include any parameters for each req. Thus, if we're negotiating an extension that requires a client-supplied parameter, it may not be appropriate to use this request format.

Note 2: This proposal does not include any relay extension acknowledging support. In the case of individual subprotocols, we could later say "If this subprotocol is in use, the relay MUST also send extension foo".

Note 3: The existence of this extension does not preclude the later addition of other extensions to negotiate features differently, or to do anything else.
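
A minimal sketch of encoding and decoding the extension body, using the one-byte protocol IDs listed above. It covers only the sequence of req pairs; the surrounding ntor_v3 extension framing is left out because the extension ID is not yet assigned. The function names are made up for illustration.

```python
import struct

# Protocol IDs in the order given in this proposal (Datagram is still TBD).
PROTOCOL_NAMES = [
    "Link", "LinkAuth", "Relay", "DirCache", "HSDir", "HSIntro", "HSRend",
    "Desc", "MicroDesc", "Cons", "Padding", "FlowCtrl", "Conflux", "RelayCell",
]
PROTOCOL_ID = {name: number for number, name in enumerate(PROTOCOL_NAMES)}


def encode_subproto_request(reqs):
    """Encode (protocol_name, version) pairs as consecutive (u8, u8) req entries."""
    out = bytearray()
    for name, version in reqs:
        if not 0 <= version <= 255:
            raise ValueError("version does not fit in one byte")
        out += struct.pack("BB", PROTOCOL_ID[name], version)
    return bytes(out)


def decode_subproto_request(body):
    """Decode an extension body into (protocol_id, version) pairs."""
    if len(body) % 2:
        raise ValueError("truncated req entry")
    return [(body[i], body[i + 1]) for i in range(0, len(body), 2)]


if __name__ == "__main__":
    body = encode_subproto_request([("FlowCtrl", 2), ("RelayCell", 1)])
    print(body.hex(), decode_subproto_request(body))
```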

Each req entry corresponds to a single subprotocol version. A client MUST NOT send any req entry unless:

  • That subprotocol version is advertised by the relay,
  • OR that subprotocol version is listed as required for relays in the current consensus, using required-relay-protocols.

Note 1: We say above that a client may request a required subprotocol even if the relay does not advertise it. This is what allows clients to send a req extension to introduction points and rendezvous points, even when we do not recognize the relay from the consensus.

Note 2: If the intro/rend point does not support a required protocol, it should not be on the network, and the client/service should not have selected it.

If a relay receives a subproto_request extension for any subprotocol version that it does not support, it MUST reject the circuit with a DESTROY cell.

Alternatives: we could give the relay the option to decline to support an extension, and we could require the relay to acknowledge which extensions it is providing. We aren't doing that, in the name of simplicity.

Only certain subprotocol versions need to be negotiated in this way; they will be explicitly listed as such in our specifications, with language like "This extension is negotiated as part of the circuit extension handshake". Other subprotocol versions MUST NOT be listed in this extension; if they are, the relay SHOULD reject the circuit.
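
The client-side and relay-side rules above can be sketched as two small predicates. The function names and the set-of-pairs representation are assumptions made for illustration; for brevity, the sketch treats a non-negotiable entry the same as an unsupported one, although the text uses SHOULD for the former and MUST for the latter.

```python
def client_may_request(req, advertised, required_relay_protocols):
    """A client may send a req only if the relay advertises it or the consensus requires it."""
    return req in advertised or req in required_relay_protocols


def relay_accepts(reqs, supported, negotiable):
    """Return False (meaning: reject the circuit with DESTROY) on any bad req entry."""
    return all(req in supported and req in negotiable for req in reqs)


if __name__ == "__main__":
    # Each req is a (subprotocol name, version) pair.
    negotiable = {("FlowCtrl", 2), ("RelayCell", 1)}
    supported = {("FlowCtrl", 2), ("Link", 5)}
    print(relay_accepts([("FlowCtrl", 2)], supported, negotiable))   # acceptable
    print(relay_accepts([("RelayCell", 1)], supported, negotiable))  # unsupported
    print(relay_accepts([("Link", 5)], supported, negotiable))       # not negotiable
```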

Alternative: We could allow the client to list other subprotocols that the relay supports which are nonetheless irrelevant to the circuit protocol, like Microdesc, or ones that don't currently need to be negotiated, like HsRend.

This is not something we plan to do.

Currently specified subprotocol versions which can be negotiated using this extension are:

  • FlowCtrl=2 (congestion control)
  • RelayCell=1 (proposal 340)

The availability of the subproto_request extension itself will be indicated by a new Relay=X flag. When used, it will supplant several older ntor_v3 extensions, including:

  • (TODO: list these here, if we find any. I think FlowCtrl has an extension?)

That is, if using subproto_request, there is no need to send the (TODO) extensions.

Making features that can be disabled.

Sometimes, we will want the ability to make features that can be enabled or disabled from the consensus. But if we were to make a single flag that can turn the feature on and off, we'd run into trouble: after the feature was turned off, every relay would stop providing it right away, but there would be a delay before clients realized that the relays had stopped advertising the feature. During this interval, clients would try to enable the feature, and the relays would reject their circuits.

To solve this problem, we need to make features like these controlled by a pair of consensus parameters: one to disable advertising the feature, and one to disable the feature itself. To disable a feature, first the authorities would tell relays to stop advertising it, and only later tell the relays to stop supporting it. (If we want to enable a previously disabled feature, we can turn on advertisement and support at the same time.)

These parameters would be specified something like this (for a hypothetical Relay=33 feature):

  • support-relay-33: if set to 1, relays that can provide Relay=33 should do so.
  • advertise-relay-33: if set to 1, relays that are providing Relay=33 should include it in their advertised protocol versions.

Note: as a safety measure, relays MUST NOT advertise any feature that they do not support. This is reflected in the descriptions of the parameters above.

When we add a new feature of this kind, we should have the advertise-* flag parameter be 1 by default, and probably we should have support-* be 1 by default too.
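
A sketch of how a relay might combine the two parameters, using the hypothetical Relay=33 names from above. The function name and the dictionary representation of consensus parameters are assumptions; the point is that advertising is always gated on support.

```python
def relay_feature_state(params, implemented, feature="relay-33", default=1):
    """Return (provides, advertises) for a feature, given consensus parameters.

    `params` maps consensus parameter names to integers; parameters that are
    absent take `default`, matching the suggestion that both default to 1.
    """
    provides = implemented and params.get(f"support-{feature}", default) == 1
    # Safety rule: never advertise something we are not providing.
    advertises = provides and params.get(f"advertise-{feature}", default) == 1
    return provides, advertises


if __name__ == "__main__":
    # Staged disable: first stop advertising, and only later stop supporting.
    print(relay_feature_state({}, implemented=True))
    print(relay_feature_state({"advertise-relay-33": 0}, implemented=True))
    print(relay_feature_state({"advertise-relay-33": 0, "support-relay-33": 0}, implemented=True))
```

During the middle stage, clients holding an older consensus can still negotiate the feature successfully, which is the window the two-parameter design is meant to cover.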

Subprotocol versions in onion services

Here we describe how to expand the onion service protocols in order to better accommodate subprotocol versions.

Advertising an onion service's subprotocols

In its encrypted descriptor (the innermost layer), the onion service adds a new entry:

  • "protocols" - A list of supported subprotocol versions, in the same format as those listed in a microdescriptor or descriptor.

Note that this is NOT a complete list of all the subprotocol versions actually supported by the onion service. Instead, onion services only advertise a subprotocol version if they support it, and it is documented in the specs as being supported by onion services.

Alternative: I had considered having a mask that would be put in the consensus document, telling the onion services which subprotocols to advertise. I don't think that's a great idea, however.

Right now, the following protocols should be advertised:

  • FlowCtrl
  • Conflux (?? Doesn't this take parameters? TODO)
  • Pow (??? Doesn't this take parameters? If we do it, we need to allocate a subprotocol for it. TODO)

Negotiating subprotocols with an onion service.

In the hs_ntor handshake sent by the client, we add an encrypted subproto_request extension of the same format, with the same semantics, as used in the ntor_v3 handshake.

This supplants the following:

  • (Congestion Control; Pow? TODO)

Advertising other relays' subprotocols?

Alternative: I had previously considered a design where the introduction points in the onion service descriptor would be listed along with their subprotocols, and the hs_ntor handshake would contain the subprotocols of the rendezvous point.

I'm rejecting this design for now because it gives the onion service and the client too much freedom to lie about relays. In the future, the walking onions design would solve this, since the contact information for intro and rend points would be authenticated.

Appendix

New numbers to reserve:

  • An extension ID for the ntor_v3 handshake subproto_request extension.
  • An extension ID for the hs_ntor handshake subproto_request extension.
  • A Relay= subprotocol indicating support for the ntor_v3 and hs_ntor extensions.
  • The numeric encoding of each existing subprotocol, in the table above.

Acknowledgments

Thanks to David Goulet and Mike Perry for their feedback on earlier versions of this proposal!

Filename: 347-domain-separation.md
Title: Domain separation for certificate signing keys
Author: Nick Mathewson
Created: 19 Oct 2023
Status: Open

Our goal

We'd like to be able to use the "family key" from proposal 321 as a general purpose signing key, to authenticate things other than the membership of a family. For example, we might want to have a challenge/response mechanism where the challenger says, "If you want to log in as the owner of the account corresponding to this family, sign the following challenge with your key." Or we might want to have a message authentication scheme where an operator can sign a message in a way that proves key ownership.

We might also like to use relay identity keys or onion service identity keys for the same purpose.

The problem

When we're using a signing key for two purposes, it's important to perform some kind of domain separation so that documents signed for one purpose can't be mistaken for documents signed for the other.

For example, in the challenge/response example, it would be bad if the challenger could provide a challenge string that would cause the signer to inadvertently authenticate an incorrect family.

These keys are currently used in some places with no personalization. Their signature format is as described in cert-spec.txt, which says:

The signature is created by signing all the fields in the certificate up until "SIGNATURE" (that is, signing sizeof(ed25519_cert) - 64 bytes).

One solution

This one is pretty easy: we would extend cert-spec as follows.

Using signing keys for other purposes.

Other protocols may introduce uses for the signing keys in these certificates beyond those specified here. When they do, they MUST ensure that the documents being signed cannot be confused with the certificate bodies of this document.

In some existing cases in the Tor protocols, we achieve this by specifying an ASCII prefix string that must be prepended to the other protocol's signed object before it is signed.

For future protocols, we recommend that this be done by specifying that the signing key is to be used to sign a cSHAKE digest (or other secure customizable digest) of the other protocol's signed object, using a customization string unique to the other protocol.

We would also make this amendment:

Future versions of this specification

In order to maintain the domain separation that currently exists between the signatures on these certificates and other documents signed with the same keys, it suffices (for now!) that these certificates begin with the version byte [01], whereas the other documents are in printable ASCII, which never includes [01].

Future versions of this specification SHOULD move to using an ed25519-prehash construction, using a customizable hash with built-in domain separation.
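
To illustrate the two separation techniques discussed above, here is a minimal sketch using only the Python standard library. Two caveats: the standard library has no cSHAKE, so SHAKE256 over a length-prefixed customization string is used purely as a stand-in and is not cSHAKE; and the prefix and customization strings shown are invented for the example. Signing itself is omitted.

```python
import hashlib


def separated_digest(customization: bytes, message: bytes, outlen: int = 64) -> bytes:
    """Stand-in for a customizable digest (NOT cSHAKE).

    The customization string is length-prefixed so that bytes cannot be
    shifted between it and the message to produce the same hash input.
    """
    h = hashlib.shake_256()
    h.update(len(customization).to_bytes(8, "big"))
    h.update(customization)
    h.update(message)
    return h.digest(outlen)


def prefixed_object(prefix: bytes, body: bytes) -> bytes:
    """Existing style: prepend a printable-ASCII prefix before signing."""
    if not prefix or not all(0x20 <= b <= 0x7E for b in prefix):
        raise ValueError("prefix must be non-empty printable ASCII")
    return prefix + body


def could_be_cert_body(signed: bytes) -> bool:
    """Certificate bodies begin with the version byte [01]."""
    return signed[:1] == b"\x01"


if __name__ == "__main__":
    # Hypothetical prefix; even a body starting with 0x01 is no longer cert-like.
    obj = prefixed_object(b"Tor example challenge v1", b"\x01\x02\x03")
    print(could_be_cert_body(obj))
```

With a prefix of printable ASCII, the signed object can never begin with [01], which is exactly the property the amendment relies on.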

Filename: 348-udp-app-support.md
Title: UDP Application Support in Tor
Author: Micah Elizabeth Scott
Created: December 2023
Status: Open

UDP Application Support in Tor

Introduction

This proposal takes a fresh look at the problem of implementing support in Tor for applications which require UDP/IP communication.

This work is being done with the sponsorship and goals of the Tor VPN Client for Android project.

The proposal begins with a summary of previous work and the specific problem space being addressed. This leads into an analysis of possible solutions, and finally some possible conclusions about the available development opportunities.

History

There have already been multiple attempts over Tor's history to define some type of UDP extension.

2006

Proposal 100 by Marc Liberatore in 2006 suggested a way to "add support for tunneling unreliable datagrams through tor with as few modifications to the protocol as possible." This proposal suggested extending the existing TLS+TCP protocol with a new DTLS+UDP link mode. The focus of this work was on a potential way to support unreliable traffic, not necessarily on UDP itself or on UDP applications.

In proposal 100, a Tor stream is used for one pairing of local and remote address and port, copying the technique used by Tor for TCP. This works for some types of UDP applications, but it's broken by common behaviors like ICE connectivity checks, NAT traversal attempts, or using multiple servers via the same socket.

No additional large-message fragmentation protocol is defined, so the MTU in proposal 100 is limited to what fits in a single Tor cell. This value is much too small for most applications.

It's possible these UDP protocol details would have been elaborated during design, but the proposal hit a snag elsewhere: there was no agreement on a way to avoid facilitating new attacks against anonymity.

2014

In a thread on the tor-talk mailing list, Nathan Freitas suggested UDP tunneling over Tor using the BadVPN project's udpgw protocol.

This protocol was never formally documented and is no longer actively maintained, but it was broadly similar in scope to an RFC8656 TURN relay operating over a TCP transport.

2018

In 2018, Nick Mathewson and Mike Perry wrote a summary of the side-channel issues with unreliable transports for Tor.

The focus of this document is on the communication between Tor relays, but there is considerable overlap between the attack space explored here and the potential risks of any application-level UDP support. Attacks that are described here, such as drops and injections, may be applied by malicious exits or some types of third parties even in an implementation using only present-day reliable Tor transports.

2020

Proposal 339 by Nick Mathewson in 2020 introduced a simpler UDP encapsulation design with stream mapping properties similar to those in proposal 100, but with the unreliable transport omitted. Datagrams are tunneled over a new type of Tor stream using a new type of Tor message. As a prerequisite, it depends on proposal 319 to support messages that may be larger than a cell, extending the MTU to support arbitrarily large UDP datagrams.

In proposal 339 the property of binding a stream both to a local port and to a remote peer is described in UNIX-style terminology as a connected socket. This idea is explored below using alternate terminology from RFC4787, NAT behavior requirements for UDP. The single-peer connected socket behavior would be referred to as an endpoint-dependent mapping in RFC4787. This type works fine for client/server apps but precludes the use of NAT traversal for peer-to-peer transfer.

Scope

This proposal aims to allow Tor applications and Tor-based VPNs to provide compatibility with applications that require UDP/IP communications.

We don't have a specific list of applications that must be supported, but we are currently aiming for broad support of popular applications while still respecting and referencing all applicable Internet standards documents.

Changes to the structure of the Tor network are out of scope, as are most performance optimizations. We expect to rely on common optimizations to the performance of Tor circuits, rather than looking to make specific changes that optimize for unreliable datagram transmission.

This document will briefly discuss UDP for onion services below. It's worth planning for this as a way to evaluate the future design space, but in practice we are not aiming for UDP onion services yet. This will require changes to most applications that want to use it, as it implies that any media negotiations will need to understand onion addressing in addition to IPv4 and IPv6.

The allowed subset of UDP traffic is not subject to a single rigid definition. There are several options discussed below using the RFC4787 framework.

We require support for DNS clients. Tor currently only supports a limited subset of DNS queries, and it's desirable to support more. This will be analyzed in detail as an application below. DNS is one of very few applications that still rely on fragmented UDP datagrams, though this may not be relevant for us since only servers typically need to control the production of fragments.

We require support for voice/video telecommunications apps. Even without an underlying transport that supports unreliable datagrams, we expect a tunnel to provide a usable level of compatibility. This design space is very similar to the TURN (RFC8656) specification, already used very widely for compatibility with networks that filter UDP. See the analysis of specific applications below.

We require support for peer-to-peer UDP transfer without additional relaying, in apps that use ICE (RFC8445) or similar connection establishment techniques. Video calls between two Tor users should transit directly between two exit nodes. This requires that allocated UDP ports can each communicate with multiple peers: endpoint-independent mapping as described by RFC4787.

We do not plan to support applications which accept incoming datagrams from previously-unknown peers, for example a DNS server hosted via Tor. RFC4787 calls this endpoint-independent filtering. It's unnecessary for running peer-to-peer apps, and it facilitates an extremely easy traffic injection attack.

UDP Traffic Models

To better specify the role of a UDP extension for Tor, this section explores a few frameworks for describing noteworthy subsets of UDP traffic.

User Datagram Protocol (RFC768)

The "User Interface" suggested by RFC768 for the protocol is a rough sketch, suggesting that applications have some way to allocate a local port for receiving datagrams and to transmit datagrams with arbitrary headers.

Despite UDP's simplicity as an application of IP, we do need to be aware of IP features that are typically hidden by TCP's abstraction.

UDP applications typically try to obtain an awareness of the path MTU, using some type of path MTU discovery (PMTUD) algorithm. On IPv4, this requires sending packets with the "Don't Fragment" flag set, and measuring when those packets are lost or when ICMP "Fragmentation Needed" replies are seen.

Note that many applications have their own requirements for path MTU. For example, QUIC and common implementations of WebRTC require an MTU no smaller than 1200 bytes, but they can discover larger MTUs when available.
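
For a rough sense of scale, the arithmetic below counts how many relay cells a 1200-byte datagram would occupy. The 498-byte figure is an assumption taken from the classic relay cell layout in tor-spec (a 509-byte cell body minus an 11-byte relay header); the calculation ignores any fragmentation framing overhead, and the function name is ours.

```python
import math

# Assumption: a classic relay cell carries at most 498 bytes of stream data
# (509-byte cell body minus an 11-byte relay header).
RELAY_DATA_MAX = 498


def cells_needed(datagram_len: int, per_cell: int = RELAY_DATA_MAX) -> int:
    """Cells required to carry one datagram, ignoring fragmentation overhead."""
    return max(1, math.ceil(datagram_len / per_cell))
```

Under these assumptions `cells_needed(1200)` is 3, so a minimum-size QUIC or WebRTC path MTU cannot fit in a single cell.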

Socket Layer

In practice the straightforward "User Interface" from RFC768, capable of using an arbitrary local address, is only available to privileged users.

BSD-style sockets support UDP via SOCK_DGRAM. UDP is a stateless protocol, but sockets do have state. Each socket is bound, either explicitly with bind() or automatically, to a source IP and port.

At the API level, a socket is said to be connected to a remote (address, port) if that address is the default destination. A connected socket will also filter out incoming packets with source addresses different from this default destination. A socket is considered unconnected if connect() has not been called. These sockets have no default destination, and they accept datagrams from any source.

There does not need to be any particular mapping between the lifetime of these application sockets and any higher-level "connection" the application establishes. It's better to think of one socket as one allocated local port. A typical application may allocate only a single port (one socket) for talking to many peers. Every datagram sent or received on the socket may have a different peer address.
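
The difference between unconnected and connected sockets can be seen in a short loopback experiment. This is a minimal sketch using Python's standard `socket` module; the helper names are ours, and the half-second timeout is an arbitrary choice.

```python
import socket


def bound_udp_socket() -> socket.socket:
    """Allocate one local UDP port on loopback, chosen by the OS."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(0.5)
    return s


def demo() -> dict:
    app, peer_a, peer_b = (bound_udp_socket() for _ in range(3))
    try:
        # Unconnected: one local port, a different peer for each datagram.
        app.sendto(b"hello a", peer_a.getsockname())
        app.sendto(b"hello b", peer_b.getsockname())
        _, seen_by_a = peer_a.recvfrom(2048)
        _, seen_by_b = peer_b.recvfrom(2048)

        # Connected: peer_a becomes the default destination and the only
        # source whose datagrams are delivered to the application.
        app.connect(peer_a.getsockname())
        peer_b.sendto(b"from b", app.getsockname())
        peer_a.sendto(b"from a", app.getsockname())
        delivered = app.recv(2048)
        return {
            "same_source_port": seen_by_a == seen_by_b == app.getsockname(),
            "delivered": delivered,
        }
    finally:
        for s in (app, peer_a, peer_b):
            s.close()
```

Both peers see the same source address and port, which is the "one socket, one local port" view described above. Once `connect()` has been called, the datagram from `peer_b` is filtered out and only the one from `peer_a` is delivered.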

Network Address Translation (NAT)

Much of the real-world complexity in applying UDP comes from defining strategies to detect and overcome the effects of NAT. As a result, an intimidating quantity of IETF documentation has been written on NAT behavior and on strategies for NAT traversal.

RFC4787 and later RFC7857 offer best practices for implementing NAT. These are sometimes referred to as the BEHAVE-WG recommendations, based on the "Behavior Engineering for Hindrance Avoidance" working group behind them.

RFC6888 makes additional recommendations for "carrier grade" NAT systems, where small pools of IP addresses are shared among a much larger number of subscribers.

RFC8445 describes the Interactive Connectivity Establishment (ICE) protocol, which has become a common and recommended application-level technique for building peer-to-peer applications that work through NAT.

There are multiple fundamental technical issues that NAT presents:

  1. NAT must be stateful in order to route replies back to the correct source.

    This directly conflicts with the stateless nature of UDP itself. The NAT's mapping lifetime, determined by a timer, will not necessarily match the lifetime of the application-level connection. This necessitates keep-alive packets in some protocols. Protocols that allow their binding to expire may be open to a NAT rebinding attack, when a different party acquires access to the NAT's port allocation.

  2. Applications no longer know an address they can be reached at without outside help.

    Chosen port numbers may or may not be used by the NAT. The effective IP address and port are not knowable without observing from an outside peer.

  3. Filtering and mapping approaches both vary, and it's not generally possible to establish a connection without interactive probing.

    This is the reason ICE exists, but it's also a possible anonymity hazard. This risk is explored a bit further below in the context of interaction with other networks.
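
The first two issues can be made concrete with a toy mapping table. This is only a sketch under stated assumptions: the class and method names are hypothetical, the starting port is arbitrary, and the 120-second lifetime mirrors the two-minute floor in RFC4787 REQ-5 rather than any Tor behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (address, port)


@dataclass
class NatTable:
    """Toy NAT state: one external port per internal endpoint."""
    external_ip: str
    lifetime: float = 120.0   # assumed idle timeout, in seconds
    next_port: int = 40000    # arbitrary starting port for the sketch
    mappings: Dict[Endpoint, Tuple[int, float]] = field(default_factory=dict)

    def outbound(self, internal: Endpoint, now: float) -> Endpoint:
        """Create or refresh a mapping; return the externally visible endpoint."""
        entry = self.mappings.get(internal)
        if entry is None or entry[1] <= now:
            # No live mapping: the application gets a port it did not choose.
            port, self.next_port = self.next_port, self.next_port + 1
        else:
            port = entry[0]
        self.mappings[internal] = (port, now + self.lifetime)
        return (self.external_ip, port)

    def inbound(self, external_port: int, now: float) -> Optional[Endpoint]:
        """Route a reply to its internal endpoint if the mapping is still alive."""
        for internal, (port, expires) in self.mappings.items():
            if port == external_port and expires > now:
                return internal
        return None
```

Each call to `outbound()` acts as a keep-alive. If the application stays silent for longer than `lifetime`, replies to the old port are dropped and the next outbound datagram appears from a different external port.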

We can use the constraints of NAT both to understand application behavior and as an opportunity to model Tor's behavior as a type of NAT. In fact, Tor's many exit nodes already share similarity with some types of carrier-grade NAT. Applications will need to assume very little about the IP address their outbound UDP originates on, and we can use that to our advantage in implementing UDP for Tor.

This body of work is invaluable for understanding the scope of the problem and for defining common terminology. Let's take inspiration from these documents while also keeping in mind that the analogy between Tor and a NAT is imperfect. For example, in analyzing Tor as a type of carrier-grade NAT, we may consider the "pooling behavior" defined in RFC4787: the choice of which external addresses map to an internal address. Tor by necessity must carefully limit how predictable these mappings can ever be, to preserve its anonymity properties. A literal application of RFC6888 would find trouble in REQ-2 and REQ-9, as well as the various per-subscriber limiting requirements.

Mapping and Filtering Behaviors

RFC4787 defines a framework for understanding the behavior of NAT by analyzing both its "mapping" and "filtering" behavior separately. Mappings are the NAT's unit of state tracking. Filters are layered on top of mappings, potentially rejecting incoming datagrams that don't match an already-expected address. Both RFC4787 and the demands of peer-to-peer applications make a good case for always using an Endpoint-Independent Mapping.

Choice of filtering strategy is left open by the BEHAVE-WG recommendations. RFC4787 does not make one single recommendation for all circumstances; instead, it defines three behavior options with different properties:

  • Endpoint-Independent Filtering allows incoming datagrams from any peer once a mapping has been established.

    RFC4787 recommends this approach, with the concession that it may not be ideal for all security requirements.

    This technique cannot be safely applied in the context of Tor. It makes traffic injection attacks possible from any source address, provided you can guess the UDP port number used at an exit. It also makes possible clear-net hosting of UDP servers using an exit node's IP, which may have undesirable abuse properties.

    This permissive filter is also incompatible with our proposed mitigation to local port exhaustion on exit relays. Even with per-circuit rate limiting, an attacker could trivially overwhelm the local port capacity of all combined UDP-capable Tor exits.

    It is still common for present-day applications to prefer endpoint-independent filtering, as it allows incoming connections from peers which cannot use STUN or a similar address fixing protocol. Choosing endpoint-independent filtering would have some compatibility benefit, but among modern protocols which use ICE and STUN there would be no improvement. The cost, on the other hand, would be an uncomfortably low-cost traffic injection attack and additional risks toward exit nodes.

  • Address-Dependent Filtering

    This is a permitted alternative according to RFC4787, in which incoming datagrams are allowed from only IP addresses we have previously sent to, but any port on that IP may be the sender.

    The intended benefits of this approach versus the port-dependent filtering below are unclear, and may no longer be relevant. In theory they would be:

    • To support a class of applications that rely on, for a single local port, multiple remote ports achieving filter acceptance status when only one of those ports has been sent a datagram. We are currently lacking examples of applications in this category. Any application using ICE should be outside this category, as each port would have its own connectivity check datagrams exchanged in each direction.

    • REQ-8 in RFC4787 claims the existence of a scenario in which this approach facilitates ICE connections with a remote peer that disregards REQ-1 (the peer does not use Endpoint-Independent Mapping). It is not clear that this claim is still relevant.

    One security hazard of address-dependent and non-port-dependent filtering, identified in RFC4787, is that a peer on a NAT effectively negates the security benefits of this host filtering. In fact, this should raise additional red flags as applied to either Tor or carrier-grade NAT. If supporting peer-to-peer applications, it should be commonplace to establish UDP flows between two Tor exit nodes. When this takes place, non-port-dependent filtering would then allow anyone on Tor to connect via those same nodes and perform traffic injection. The resulting security properties become uncomfortably similar to endpoint-independent filtering.

  • Address- and Port-Dependent Filtering

    This is the strictest variety of filtering, and it is an allowed alternative under RFC4787. It provides opportunities for increased security and opportunities for reduced compatibility, both of which in practice may depend on other factors.

    For every application we've analyzed so far, port-dependent filtering is not a problem. Usage of ICE will open all required filters during the connectivity check phase.

    This is the only type of filtering that provides any barrier at all against cross-circuit traffic injection when the communicating parties are known.

RFC4787 recommends that filtering style be configurable. We would like to implement that advice, but only to the extent it can be done safely and meaningfully in the context of an anonymity system. When possible, it would provide additional compatibility at no mandatory cost to allow applications to optionally request Address-Dependent Filtering. Otherwise, Address- and Port-Dependent Filtering is the most appropriate default setting.
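
The three filtering behaviors reduce to a small predicate over the set of remote endpoints a mapping has already sent to. The following is a minimal sketch; the enum and function names are hypothetical and are not taken from RFC4787 or from any Tor implementation.

```python
from enum import Enum
from typing import Set, Tuple

Endpoint = Tuple[str, int]  # (address, port)


class Filtering(Enum):
    ENDPOINT_INDEPENDENT = 1
    ADDRESS_DEPENDENT = 2
    ADDRESS_AND_PORT_DEPENDENT = 3


def accepts(mode: Filtering, sent_to: Set[Endpoint], source: Endpoint) -> bool:
    """Decide whether an incoming datagram passes the filter on one mapping.

    sent_to holds every remote endpoint this mapping has already sent to.
    """
    if mode is Filtering.ENDPOINT_INDEPENDENT:
        return True  # any peer may reach an established mapping
    if mode is Filtering.ADDRESS_DEPENDENT:
        return any(addr == source[0] for addr, _ in sent_to)
    return source in sent_to  # both address and port must match
```

For a mapping that has only sent to `("198.51.100.7", 3478)`, a datagram from port 9999 on the same address passes the address-dependent filter but not the address- and port-dependent one. That gap is the injection surface discussed above.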

Common Protocols

Applications that want to use UDP are increasingly making use of higher-level protocols to avoid creating bespoke solutions for problems like NAT traversal, connection establishment, and reliable delivery.

This section looks at how these protocols affect Tor's UDP traffic requirements.

QUIC

RFC9000 defines QUIC, a multiplexed secure point-to-point protocol which supports reliable and unreliable delivery. The most common use is as an optional HTTP replacement, especially among Google services.

QUIC does not normally try to traverse NAT; as an HTTP replacement, the server is expected to have an address reachable without any prior connection setup.

QUIC provides its own flexible connection lifetimes which may outlive individual network links or NAT mappings. The intention is to provide transparent roaming as mobile users change networks. This automated path discovery opens additional opportunities for malicious traffic, for which the RFC also offers mitigations. See path validation in section 8.2, and the additional mitigations from section 9.3.

When QUIC is used as an optional upgrade path, we must compare any proposed UDP support against the baseline of a non-upgraded original connection. In these cases the goal is not a compatibility enhancement but an avoidance of regression.

In cases where QUIC is used as a primary protocol without TCP fallback, UDP compatibility will be vital. These applications are currently niche but we expect they may rise in popularity.

WebRTC

WebRTC is a large collection of protocols tuned to work together for media transport and NAT traversal. It is increasingly common, both for browser-based telephony and for peer-to-peer data transfer. Non-browser apps often implement WebRTC as well, for example using libwebrtc. Even non-WebRTC apps sometimes have significant overlaps in their technology stacks, due to the independent history of ICE, RTP, and SDP adoption.

Of particular importance to us, WebRTC uses the Interactive Connectivity Establishment (ICE) protocol to establish a bidirectional channel between endpoints that may or may not be behind a NAT with unknown configuration.

Any generalized solution to connection establishment, like ICE, will require sending connectivity test probes. These have an inherent hazard to anonymity: assuming no delays are inserted intentionally, the result is a broadcast of similar traffic across all available network interfaces. This could form a convenient correlation beacon for an attacker attempting to de-anonymize users who use WebRTC over a Tor VPN. This is the risk enumerated below as interaction with other networks.

See RFC8825 Overview: Real-Time Protocols for Browser-Based Applications, RFC8445 Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal, RFC8838 Trickle ICE: Incremental Provisioning of Candidates for the Interactive Connectivity Establishment (ICE) Protocol, RFC5389 Session Traversal Utilities for NAT (STUN), and others.

Common Applications

With applications exhibiting such a wide variety of behaviors, how do we know what to expect from a good implementation? How do we know which compatibility decisions will be most important to users? For this it's helpful to look at specific application behaviors. This is a best-effort analysis conducted at a point in time. It's not meant to be a definitive reference; think of it as a site survey taken before planning a building.

In alphabetical order:

| Application | Type | Protocol features | Current behavior | Expected outcome |
| --- | --- | --- | --- | --- |
| BitTorrent | File sharing | Many peers per local port | Fails without UDP | Works, new source of nuisance traffic |
| BigBlueButton | Telecom | WebRTC, TURN, TURN-over-TLS | Works | Slight latency improvement |
| Discord | Telecom | Proprietary, client/server | Fails without UDP | Starts working |
| DNS | Infrastructure | Might want IP fragmentation | Limited | Full DNS support, for better and worse |
| FaceTime | Telecom | WebRTC, TURN, TURN-over-TCP | Works | Slight latency improvement |
| Google Meet | Telecom | STUN/TURN, TURN-over-TCP | Works | Slight latency improvement |
| Jitsi (meet.jit.si) | Telecom | WebRTC, TURN-over-TLS, Cloudflare | Fails on Tor | No change, problem does not appear UDP-related |
| Jitsi (docker-compose) | Telecom | WebRTC, centralized STUN only | Fails without UDP | Starts working |
| Linphone | Telecom (SIP) | SIP-over-TLS, STUN, TURN | Fails without UDP | Starts working |
| Signal | Telecom | WebRTC, TURN, TURN-over-TCP | Works | Slight latency improvement |
| Skype | Telecom | P2P, STUN, TURN-over-TLS | Works | Slight latency improvement |
| WhatsApp | Telecom | STUN, TURN-over-TCP. Multi server | Works | Slight latency improvement |
| WiFi Calling | Telecom | IPsec tunnel | Out of scope | Still out of scope |
| Zoom | Telecom | client/server or P2P, UDP/TCP | Works | Slight latency improvement |

Overview of Possible Solutions

This section starts to examine different high-level implementation techniques we could adopt. Broadly they can be split into datagram routing and tunneling.

Datagram Routing

These approaches seek to use a network that can directly route datagrams from place to place. These approaches are the most obviously suitable for implementing UDP, but they also form the widest departure from classic Tor.

Intentional UDP Leak

The simplest approach would be to allow UDP traffic to bypass the anonymity layer. This is an unacceptable loss of anonymity in many cases, given that the client's real IP address is made visible to web application providers.

In other cases, this is an acceptable or even preferable approach. For example, VPN users may be more concerned with achieving censorship-resistant network connectivity than hiding personal identifiers from application vendors.

In threat models where application vendors are more trustworthy than the least trustworthy Tor exits, it may be more appropriate to allow direct peer-to-peer connections than to trust Tor exits with unencrypted connection establishment traffic.

3rd Party Implementations

Another option would be to use an unrelated anonymizer system for datagram traffic. It's not clear that a suitable system already exists. I2P provides a technical solution for routing anonymized datagrams, but not a Tor-style infrastructure of exit node operators.

This points to the key weakness of relying on a separate network for UDP: Tor has an especially well-developed community of volunteers running relays. Any UDP solution that is inconvenient for relay operators has little chance of adoption.

Future Work on Tor

There may be room for future changes to Tor which allow it to somehow transfer and route datagrams directly, without a separate process of establishing circuits and tunnels. If this is practical it may prove to be the simplest and highest performance route to achieving high quality UDP support in the long term. A specific design is out of the scope of this document.

It is worth thinking early about how we can facilitate combinations of approaches. Even without bringing any new network configurations to Tor, achieving interoperable support for both exit nodes and onion services in a Tor UDP implementation requires some attention to how multiple UDP providers can share protocol responsibilities. This may warrant the introduction of some additional routing layer.

Tunneling

The approaches in this section add a new construct which does not exist in UDP itself: a point-to-point tunnel between clients and some other location at which they establish the capability to send and receive UDP datagrams.

Any tunneling approach requires some way to discover tunnel endpoints. For the best usability and adoption this should come as an extension of Tor's existing process for distributing consensus and representing exit policy.

In practice, exit policies for UDP will have limited diversity. VPN implementations will need to know ahead of time which tunnel circuits to build, or they will suffer a significant spike in latency for the first outgoing datagram to a new peer. Additionally, it's common for UDP port numbers to be randomly assigned. This would make highly specific Tor exit policies even less useful and even higher overhead than they are with TCP.

TURN Encapsulated in Tor Streams

The scope of this tunnel is quite similar to the existing TURN relays, used commonly by WebRTC applications to implement fallbacks for clients who cannot find a more direct connection path.

TURN is defined by RFC8656 as a set of extensions built on the framework from STUN in RFC8489. The capabilities are a good match for our needs, offering clients the ability to encapsulate UDP datagrams within a TCP stream, and to allocate local port mappings on the server.

TURN was designed to be a set of modular and extensible pieces, which might be too distant from Tor's design philosophy of providing single canonical representations. Any adoption of TURN will need to consider the potential for malicious implementations to mark traffic, facilitating de-anonymization attacks.

TURN has a popular embeddable C-language implementation, coturn, which may be suitable for including alongside or inside C tor.

Tor Stream Tunnel to an Exit

Most of the discussion on UDP implementation in Tor so far has assumed this approach. Essentially it's the same strategy as TCP exits, but for UDP. When the OP initializes support for UDP, it pre-builds circuits to exits that support required UDP exit policies. These pre-built circuits can then be used as tunnels for UDP datagrams.

Within this overall approach, there are various ways to assign Tor streams for the UDP traffic. This will be considered below.

Tor Stream Tunnel to a Rendezvous Point

To implement onion services which advertise UDP ports, we can use additional tunnels. A new type of tunnel could end at a rendezvous point rather than an exit node. Clients could establish the ability to allocate a temporary virtual datagram mailbox at these rendezvous nodes.

This leaves more open questions about how outgoing traffic is routed, and which addressing format would be used for the datagram mailbox. The most immediate challenge in UDP rendezvous would then become application support. Protocols like STUN and ICE deal directly with IPv4 and IPv6 formats in order to advertise a reachable address to their peer. Supporting onion services in WebRTC would require protocol extensions and software modifications for STUN, TURN, ICE, and SDP at minimum.

UDP-like rendezvous extensions would have limited meaning unless they form part of a long-term strategy to forward datagrams in some new way for enhanced performance or compatibility. Otherwise, application authors might as well stick with Tor's existing reliable circuit rendezvous functionality.

Specific Designs Using Tor Streams

Let's look more closely at Tor streams, the multiplexing layer right below circuits.

Streams have an opaque 16-bit identifier, allocated from the onion proxy (OP) endpoint. Stream lifetimes are subject to some slight ambiguity still in the Tor spec. They are always allocated from the OP end but may be destroyed asynchronously by either circuit endpoint.

We have an opportunity to use this additional existing multiplexing layer to serve a useful function in the new protocol, or we can opt to interact with streams as little as possible in order to keep the protocol features more orthogonal.

One Stream per Tunnel (VPN)

It's possible to transport arbitrary UDP traffic using only a single Tor stream ID within the circuit. In this case, both the source and destination address would need to be represented somehow within an added per-datagram header.

The most common examples of this approach might be TUN/TAP forwarding over SSH, or any VPN that uses a TCP transport.

This design might be preferable if 16-bit stream IDs are found to be in short supply, but otherwise we would expect some benefit from using multiple stream IDs and therefore allowing a shorter per-datagram header.

One Stream per Local port (TURN)

With the standard TURN relay protocol, implemented over a TCP transport, the "allocation" of a relayed port can be performed once per tuple of (protocol, source IP, source port, destination IP, destination port). Put another way, TURN does not amplify the number of local ports available. An application that wishes to use two relayed ports will need two separate TCP connections, or in this context two separate Tor stream IDs.

As with the SSH example above, TURN could be implemented without any new Tor message types. TURN adds a 4-byte "channel data" header to normal datagrams, to facilitate message framing and peer identification.
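
To make the framing cost concrete, here is a minimal sketch of the ChannelData layout as we understand it from RFC8656: a 16-bit channel number, a 16-bit length, then the datagram, padded to a four-byte boundary on stream transports. The function names are ours, and the accepted channel range is our reading of the RFC.

```python
import struct


def encode_channel_data(channel: int, payload: bytes) -> bytes:
    """Frame one datagram as a ChannelData message for a stream transport."""
    if not 0x4000 <= channel <= 0x4FFF:
        raise ValueError("channel number outside the RFC8656 range")
    header = struct.pack("!HH", channel, len(payload))
    padding = b"\x00" * (-len(payload) % 4)  # pad to a 4-byte boundary
    return header + payload + padding


def decode_channel_data(buf: bytes):
    """Return (channel, payload, remaining bytes) from the front of buf."""
    channel, length = struct.unpack_from("!HH", buf)
    padded = length + (-length % 4)
    return channel, buf[4:4 + length], buf[4 + padded:]
```

The channel number stands in for a peer address that was bound earlier in the session, which is how TURN keeps the per-datagram overhead down to four bytes plus padding.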

We could choose to implement protocol support for bundling TURN relay functionality with a normal Tor exit node. In that case, we would want a single new Tor message type:

  • CONNECT_TURN

    • Establish a stream as a connection to the exit relay's built-in (or configured) TURN server.

      This would logically be a TURN-over-TCP connection, though it does not need to correspond to any real TCP socket if the TURN server is implemented in-process with tor.

Note that RFC8656 requires authentication before data can be relayed. This is perhaps a good default practice for the internet, but it is the opposite of what Tor is trying to do. We would either deviate from the specification to relax this auth requirement, or provide a way for clients to discover credentials: perhaps by fixing them ahead of time or by including them in the relay descriptor.

If authentication is used, we must consider the protocol's privacy limitations. STUN/TURN usernames are conveyed in plaintext unless an additional TLS layer is also in use. For anonymity, use a randomly generated username. The draft-uberti-behave-turn-rest-00 document describes a method for generating time-limited credentials, as implemented through --use-auth-secret in coturn. Username timestamps should be randomized, to avoid exposing the precise local system clock.
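
A sketch of how such credentials might be generated follows. It assumes the `timestamp:userid` username form and a base64-encoded HMAC-SHA1 password, which is our understanding of the draft and of coturn's behavior; the function name, the TTL, and the jitter window are all hypothetical.

```python
import base64
import hashlib
import hmac
import secrets
import time
from typing import Optional, Tuple


def ephemeral_turn_credentials(shared_secret: bytes, ttl: int = 3600,
                               jitter: int = 600,
                               now: Optional[float] = None) -> Tuple[str, str]:
    """Return (username, password) in the timestamp:userid style.

    The expiry timestamp gets random jitter and the user id is random, so the
    plaintext username reveals neither an identity nor the exact local clock.
    """
    now = time.time() if now is None else now
    expiry = int(now) + ttl + secrets.randbelow(jitter + 1)
    username = f"{expiry}:{secrets.token_hex(8)}"
    digest = hmac.new(shared_secret, username.encode(), hashlib.sha1).digest()
    return username, base64.b64encode(digest).decode()
```

The server only needs the shared secret to verify the password, so no per-user state is stored and a fresh random username can be used for every allocation.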

One Stream per Local Port (Proposal 339)

One stream per socket was the approach suggested in Proposal 339 by Nick Mathewson in 2020, defining a per-local-port approach using slightly different terminology than TURN.

Proposal 339 chooses a different UDP subset to support, but otherwise the main technical distinction vs. TURN's per-local-port stream comes down to the framing layer. TURN defines an additional header to be used within a TCP-like stream, whereas proposal 339 avoids the two redundant length bytes by defining new Tor message types: CONNECT_UDP, CONNECTED_UDP, and DATAGRAM.

Similar to TURN, each stream's lifetime would match the lifetime of a local port allocation. Unlike TURN, there would be a single peer (remote address, remote port) allowed per local port. This matches the usage of BSD-style sockets on which connect() has completed, but it's incompatible with many of the applications analyzed. Multiple peers are typically needed for a variety of reasons, like connectivity checks or multi-region servers.

This approach would be simplest to implement and specify, especially in the existing C tor implementation. It also unfortunately has very limited compatibility, and no clear path toward incremental upgrades if we wish to improve compatibility later.

A simple one-to-one mapping between streams and sockets would preclude the optimizations necessary to address local port exhaustion risks below. Solutions under this design are possible, but only by decoupling logical protocol-level sockets from the ultimate implementation-level sockets and reintroducing much of the complexity that we attempted to avoid by choosing this design.

One Stream per Local Port (NAT Mapping)

We could improve the compatibility of Proposal 339 while retaining its very low per-datagram overhead by updating it to align with terminology and requirements from RFC4787 where practical.

This approach would use each stream to represent one endpoint-independent mapping, for use as a local UDP port in communication with multiple peers.

A mapping would always be allocated from the OP (client) side. It could explicitly specify a filtering style, if we wish to allow applications to request non-port-dependent filtering for compatibility. Each datagram within the stream would still need to be tagged with a peer address/port in some way.

This approach would involve a single new type of stream, and two new messages that pertain to these mapping streams:

  • NEW_UDP_MAPPING

    • Always client-to-exit.
    • Creates a new mapping, with a specified stream ID.
    • Succeeds instantly; no reply is expected, early data is ok.
    • Externally-visible local port number is arbitrary, and must be determined through interaction with other endpoints.
    • Might contain an IP "don't fragment" flag.
    • Might contain a requested filtering mode.
    • Lifetime is until circuit teardown or END message.
  • UDP_MAPPING_DATAGRAM

    • Conveys one datagram on a stream previously defined by NEW_UDP_MAPPING.
    • Includes peer address (IPv4/IPv6) as well as datagram content.
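
As a rough illustration of the two messages above, the following sketch shows the bookkeeping an exit might keep per circuit. The class, its method names, and the dictionary layout are hypothetical; this document does not define a wire format or an implementation.

```python
class UdpMappings:
    """Per-circuit table of mapping streams (illustrative only)."""

    def __init__(self):
        # stream ID -> settings and traffic for that mapping
        self.mappings = {}

    def new_udp_mapping(self, stream_id, dont_fragment=False, filtering=None):
        # NEW_UDP_MAPPING: client-chosen stream ID, no reply expected.
        if stream_id == 0 or stream_id in self.mappings:
            raise ValueError("stream ID unavailable")
        self.mappings[stream_id] = {
            "dont_fragment": dont_fragment,
            "filtering": filtering,
            "sent": [],
        }

    def udp_mapping_datagram(self, stream_id, peer, payload):
        # UDP_MAPPING_DATAGRAM: only valid on a previously created mapping,
        # and always tagged with the peer (address, port).
        mapping = self.mappings.get(stream_id)
        if mapping is None:
            raise KeyError("no such mapping stream")
        mapping["sent"].append((peer, payload))

    def end(self, stream_id):
        # END (or circuit teardown) ends the mapping's lifetime.
        self.mappings.pop(stream_id, None)
```

Because the mapping succeeds instantly, a client in this sketch could call `udp_mapping_datagram` immediately after `new_udp_mapping` without waiting for any reply.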

This updated approach is more compatible than Proposal 339 but it's not yet as efficient as TURN. Peers are almost always repeated over the course of a tunnel's lifetime, so the header space needed for addressing could be compressed by keeping some persistent information about peers as well as mappings.

One Stream per Flow

One stream per flow has also been suggested. Specifically, Mike Perry brought this up during our conversations about UDP recently and we spent some time analyzing it from an RFC4787 perspective. This approach has some interesting properties but also hidden complexity that may ultimately make other options more easily applicable.

This would assign a stream ID to the tuple consisting of at least (local port, remote address, remote port). Additional flags may be included for features like transmit and receive filtering, IPv4/v6 choice, and IP Don't Fragment.

This has advantages in keeping the datagram cells simple, with no additional IDs beyond the existing circuit ID. It may also have advantages in DoS-prevention and in privacy analysis.

Stream lifetimes, in this case, would not have any specific meaning other than the lifetime of the ID itself. The bundle of flows associated with one local port would still all be limited to the lifetime of a Tor circuit, by scoping the local port identifier to be contained within the lifetime of its circuit.

It would be necessary to allocate a new stream ID any time a new (local port, remote address, remote port) tuple is seen. This would most commonly happen as a result of a first datagram sent to a new peer, coinciding with the establishment of a NAT-style mapping and the possible allocation of a socket on the exit.

A less common case needs to be considered too: what if the parameter tuple first occurs on the exit side? We don't yet have a way to allocate stream IDs from either end of a circuit. This would need to be considered. One simple solution would be to statically partition the stream ID space into a portion that can be independently allocated by each side.
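
A minimal sketch of the static partitioning idea follows. The parity rule (client takes odd IDs, exit takes even IDs) is an assumption chosen for illustration, as are the class and method names; the text above only suggests that some static partition could work. Stream IDs are treated as 16-bit values with zero reserved.

```python
STREAM_ID_MAX = 0xFFFF  # 16-bit stream IDs; 0 is not usable for a stream


class StreamIdAllocator:
    """Allocates stream IDs from one half of a statically split ID space.

    Hypothetical rule: the client uses odd IDs and the exit uses even IDs,
    so the two sides can allocate independently without colliding.
    """

    def __init__(self, is_client):
        self.parity = 1 if is_client else 0
        self.in_use = set()
        self._next = 1 if is_client else 2

    def allocate(self):
        candidate = self._next
        # At most one full pass over this side's half of the space.
        for _ in range(STREAM_ID_MAX // 2 + 1):
            if candidate > STREAM_ID_MAX:
                candidate = 1 if self.parity else 2  # wrap around
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                self._next = candidate + 2
                return candidate
            candidate += 2
        raise RuntimeError("no free stream IDs in this partition")

    def release(self, stream_id):
        self.in_use.discard(stream_id)

    def owns(self, stream_id):
        # True if this side is the one allowed to allocate stream_id.
        return 0 < stream_id <= STREAM_ID_MAX and stream_id % 2 == self.parity
```

With a check like `owns()`, each side could also reject an ID that the other side was not entitled to allocate.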

When is this exit-originated stream ID allocation potentially needed? It is clearly needed when using address-dependent filtering. An incoming datagram from a previously-unseen peer port is expected to be deliverable, and the exit would need to allocate an ID for it.

Even with the stricter address and port-dependent filtering, clients may still be exposed to exit-originated stream IDs if there are mismatches in the lifetime of the filter and the stream.

This approach thus requires some attention to either correctly allocating stream IDs on both sides of the circuit, or choosing a filtering strategy and filter/mapping lifetime that does not ever leave stream IDs undefined when expecting incoming datagrams.

Hybrid Mapping and Flow Approach

We can extend the approach above with an optimization that addresses the undesirable space overhead from redundant address headers. This uses two new types of stream, in order to have streams per mapping and per flow at the same time.

The per-mapping stream remains the sole interface for managing the lifetime of a mapped UDP port. Mappings are created explicitly by the client. As an optimization, within the lifetime of a mapping there may exist some number of flows, each assigned their own ID.

This tries to combine the strengths of both approaches, using the lifetime of one stream to define a mapping and to carry otherwise-unbundled traffic while also allowing additional streams to bundle datagrams that would otherwise have repetitive headers. It avoids the space overhead of a purely per-mapping approach and avoids the ID allocation and lifetime complexity introduced by the per-flow approach.

This approach takes some inspiration from TURN, where commonly used peers will be defined as a "channel" with an especially short header. Incoming datagrams with no channel can always be represented in the long form, so TURN never has to allocate channels unexpectedly.

The implementation here could be a strict superset of the per mapping implementation, adding new commands for flows while retaining existing behavior for mappings. There would be a total of four new message types:

  • NEW_UDP_MAPPING

    • Same as above.
  • UDP_MAPPING_DATAGRAM

    • Same as above.
  • NEW_UDP_FLOW

    • Allocates a stream ID as a flow, given the ID to be allocated and the ID of its parent mapping stream.
    • Includes a peer address (IPv4/IPv6).
    • The flow has a lifetime strictly bounded by the outer mapping. It is deleted by an explicit END or when the mapping is de-allocated for any reason.
  • UDP_FLOW_DATAGRAM

    • Datagram contents only, without address.
    • Only appears on flow streams.

We must consider the traffic marking opportunities provided when allowing an exit to represent one incoming datagram as either a flow or mapping datagram.

It's possible this traffic injection potential is not worse than the baseline amount of injection potential that every UDP protocol presents. See more on risks below. For this hybrid stream approach specifically, there's a limited mitigation available which allows exits only a bounded amount of leaked information per UDP peer:

Ideally exits may not choose to send a UDP_MAPPING_DATAGRAM when they could have sent a UDP_FLOW_DATAGRAM. Sometimes it is genuinely unclear though: an exit may have received this datagram in-between processing NEW_UDP_MAPPING and NEW_UDP_FLOW. A partial mitigation would terminate circuits which send a UDP_MAPPING_DATAGRAM for a peer that has already been referenced in a UDP_FLOW_DATAGRAM. The exit is thereby given a one-way gate allowing it to switch from using mapping datagrams to using flow datagrams at some point, but not to switch back and forth repeatedly.
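
The one-way gate described above could be checked on the client with a small amount of per-mapping state. This is a sketch under assumptions: the class and exception names are hypothetical, and the peer of a flow datagram is assumed to be known from the flow's NEW_UDP_FLOW.

```python
class MarkingViolation(Exception):
    """Signals that the circuit should be terminated."""


class FlowGate:
    """Client-side one-way gate for a single UDP mapping (illustrative)."""

    def __init__(self):
        # Peers that the exit has already addressed via UDP_FLOW_DATAGRAM.
        self.peers_seen_in_flow = set()

    def on_flow_datagram(self, peer):
        # Once a peer is seen on a flow, the gate closes for that peer.
        self.peers_seen_in_flow.add(peer)

    def on_mapping_datagram(self, peer):
        # Mapping datagrams stay acceptable until the exit has switched
        # this peer to flow datagrams; switching back is a violation.
        if peer in self.peers_seen_in_flow:
            raise MarkingViolation(
                "UDP_MAPPING_DATAGRAM for a peer already seen on a flow")
```

A datagram that raced between NEW_UDP_MAPPING and NEW_UDP_FLOW is still accepted here, since no flow datagram for that peer has been seen yet.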

Mappings that do not request port-specific filtering may always get unexpected UDP_MAPPING_DATAGRAMs. Mappings that do use port-specific filtering could make a flow for their only expected peers, then expect to never see UDP_MAPPING_DATAGRAM.

NEW_UDP_MAPPING could have an option requiring that only UDP_FLOW_DATAGRAM is to be used, never UDP_MAPPING_DATAGRAM. This would remove the potential for ambiguity, but at a cost in compatibility, since it would no longer be possible to implement non-port-specific filtering.

Risks

Any proposed UDP support involves significant risks to user privacy and software maintainability. This section elaborates some of these risks, so they can be compared against expected benefits.

Behavior Regressions

In some applications it is possible that Tor's implementation of a UDP compatibility layer will cause a regression in the ultimate level of performance or security.

Performance regressions can occur accidentally due to bugs or compatibility glitches. They may also occur for more fundamental reasons of protocol layering. For example, tunneling QUIC over TCP introduces redundant error correction layers. These performance degradations are expected to be minor, but there's some unavoidable risk.

The risk of severe performance or compatibility regressions may be mitigated by giving users a way to toggle UDP support per-application.

Privacy and security regressions have more severe consequences and they can be much harder to detect. There are straightforward downgrades, like WebRTC apps that give up TURN-over-TLS for plaintext TURN-over-UDP. More subtly, the act of centralizing connection establishment traffic in Tor exit nodes can make users an easier target for other attacks.

Bandwidth Usage

We should expect an increase in overall exit bandwidth requirements due to peer-to-peer file sharing applications.

Current users attempting to use BitTorrent over Tor are hampered by the lack of UDP compatibility. Interoperability with common file-sharing peers would make Tor more appealing to users with a large and sustained appetite for anonymized bandwidth.

Local Port Exhaustion

Exit routers will have a limited number of local UDP ports. In the most constrained scenario, an exit may have a single IP with 16384 or fewer ephemeral ports available. These ports could each be allocated by one client for an unbounded amount of exclusive use.

In order to enforce high levels of isolation between different subsequent users of the same local UDP port, we may wish to enforce an extended delay between allocations during which nobody may own the port. Effective isolation requires this timer duration to be greater than any timer encountered on a peer or a NAT. In RFC4787's recommendations a NAT's mapping timer must be longer than 2 minutes. Our timer should ideally be much longer than 2 minutes.

An attacker who allocates ports for only this minimum duration of 2 minutes would need to send 136.5 requests per second to achieve sustained use of all available ports. With multiple simultaneous clients this could easily be done while bypassing per-circuit rate limiting.
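
The request rate quoted above follows directly from dividing the port count by the hold time. The short calculation below reproduces it; the function name is only for illustration.

```python
def requests_per_second(ports, hold_seconds):
    """Allocation rate needed to keep every port continuously occupied."""
    return ports / hold_seconds


# 16384 ephemeral ports, each held for the 2-minute minimum.
print(round(requests_per_second(16384, 2 * 60), 1))  # 136.5
```

The same function shows how a longer hold time lowers the rate an attacker needs, since each allocation then occupies a port for longer.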

The expanded definition of "Port overlapping" from RFC7857 section 3 may form at least a partial mitigation:

This document clarifies that this port overlapping behavior may be extended to connections originating from different internal source IP addresses and ports as long as their destinations are different.

This gives us an opportunity for a vast reduction in the number of required ports and file descriptors. Exit routers can automatically allocate local ports for use with a specific peer when that peer is first added to the client's filter.

Due to the general requirements of NAT traversal, UDP applications with any NAT support will always need to communicate with a relatively well known server prior to any attempts at peer-to-peer communication. This early peer could be an entire application server, or it could be a STUN endpoint. In any case, the identity of this first peer gives us a hint about the set of all potential peers.

Within the exit router, each local port will track a separate mapping owner for each peer. When processing that first outgoing datagram, the exit may choose any local port where the specific peer is not taken. Subsequent outgoing datagrams on the same port may communicate with a different peer, and there's no guarantee all these future peers will be claimed successfully.
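
The per-peer ownership tracking described above might look like the following sketch. The class, the method names, and the first-fit port choice are assumptions made for illustration; the text only requires that the exit pick some local port where the first peer is not already taken.

```python
class OverlappingPortPool:
    """Local UDP ports shared between mappings with distinct peers."""

    def __init__(self, ports):
        # For each local port: peer (ip, port) -> ID of the owning mapping.
        self.owners = {port: {} for port in ports}

    def allocate(self, mapping_id, first_peer):
        """Pick any local port on which first_peer is not yet claimed."""
        for port, claimed in self.owners.items():
            if first_peer not in claimed:
                claimed[first_peer] = mapping_id
                return port
        return None  # every port already talks to this peer

    def claim_peer(self, port, mapping_id, peer):
        """Try to claim a later peer; False means it is un-claimable."""
        owner = self.owners[port].setdefault(peer, mapping_id)
        return owner == mapping_id
```

Two mappings that start with different STUN servers would share a port in this sketch, and the second one to contact a common callee would see `claim_peer` return False, which is the un-claimable case discussed next.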

When is this a problem? An un-claimable peer represents a case where the exact (local ip, local port, remote ip, remote port) tuple is in use already for a different mapping in some other Tor stream. For example, imagine two clients are running different types of telecom apps which are nevertheless inter-compatible and capable of both calling the same peers. Alternatively, consider the same app but with servers in several regions. The two apps will begin by communicating with different sets of peers, due to different application servers and different bundled STUN servers. This is our hint that it's likely appropriate to overlap their local port allocations. At this point, both of these applications may be successfully sharing a (local ip, local port) tuple on the exit. As soon as one of these apps calls a peer with some (remote ip, remote port), the other app will be unable to contact that specific peer.

The lack of connectivity may seem like a niche inconvenience, and perhaps that is the extent of the issue. It seems likely this heuristic could result in a problematic information disclosure under some circumstances, and it deserves closer study.

Application Fingerprinting

UDP applications present an increased surface of plaintext data that may be available for user fingerprinting by malicious exits.

Exposed values can include short-lived identifiers like STUN usernames. Typically it will also be possible to determine what type of software is in use, and maybe what version of that software.

Short-lived identifiers are still quite valuable to attackers, because they may reliably track application sessions across changes to the Tor exit. If longer-lived identifiers exist for any reason, that of course provides a powerful tool for call metadata gathering.

Peer-to-Peer Metadata Collection

One of our goals was to achieve the compatibility and perhaps performance benefits of allowing "peer-to-peer" (in our case really exit-to-exit) UDP connections. We expect this to enable the subset of applications that lack a fallback path which loops traffic through an app-provided server.

This goal may be at odds with our privacy requirements. At minimum, a pool of malicious exit nodes could passively collect metadata about these connections as a noisy proxy for call metadata.

Any additional signals like application fingerprints or injected markers may be used to enrich this metadata graph, possibly tracking users across sessions or across changes to the tunnel endpoint.

Interaction with Other Networks

Any application using an ICE-like interactive connection establishment scheme will easily leak information across network boundaries if it ever has access to multiple networks at once.

In applications that are not privacy conscious this is often desired behavior. For example, a video call to someone in your household may typically transit directly over Wifi, decreasing service costs and improving latency. This implies that some type of local identifier accompanies the call signaling info, allowing the devices to find each other's LAN address.

Privacy-preserving solutions for this use case are still an active area of standardization effort. The IETF draft on mDNS ICE candidates proposes one way to accomplish this by generating short-lived unique IDs which are only useful to peers with physical access to the same mDNS services.

Without special attention to privacy, the typical implementation is to share all available IP addresses and to initiate simultaneous connectivity tests using any IP pairs which cannot be trivially discarded. This applies to ICE as specified, but also to any proprietary protocol which operates in the same design space as ICE. This has multiple issues in a privacy-conscious environment: The IP address disclosures alone can be fatal to anonymity under common threat models. Even if meaningful IP addresses are not disclosed, the timing correlation from connectivity checks can provide confirmation beacons that alert an attacker to some connection between a LAN user and a Tor exit.

Traffic Injection

Some forms of UDP support would have obvious and severe traffic injection vulnerabilities. For example, the very permissive endpoint-independent filtering strategy would allow any host on the internet to send datagrams in bulk to all available local ports on a Tor exit in order to map that traffic's effect on any guards they control.

Any vehicle for malicious parties to mark traffic can be abused to de-anonymize users. Even if there is a more restrictive filtering policy, UDP's lack of sequence numbers makes header spoofing attacks considerably easier than in TCP. Third parties capable of routing datagrams to an exit with a spoofed source address could bypass filtering when the communicating parties are known or can be guessed. For example, a malicious superuser at an ISP without egress filtering could send packets with the source IPs set to various common DNS servers and application STUN/TURN servers. If a specific application is being targeted, only the exit node and local port numbers need to be guessed by brute force.

In case of malicious exit relays, whole datagrams can be inserted and dropped, and datagrams may be padded with additional data, both without any specific knowledge of the application protocol. With specific protocol insights, a malicious relay may make arbitrary edits to plaintext data.

Of particular interest is the plaintext STUN, TURN, and ICE traffic used by most WebRTC apps. These applications rely on higher-level protocols (SRTP, DTLS) to provide end-to-end encryption and authentication. A compromise at the connection establishment layer would not violate application-level end-to-end security requirements, making it outside the threat model of WebRTC but very much still a concern for Tor.

These attacks are not fully unique to the proposed UDP support, but UDP may increase exposure. In cases where the application already has a fallback using TURN-over-TLS, the proposal is a clear regression over previous behaviors. Even when comparing plaintext to plaintext, there may be a serious downside to centralizing all connection establishment traffic through a small number of exit IPs. Depending on your threat model, it could very well be more private to allow the UDP traffic to bypass Tor entirely.

Malicious Outgoing Traffic

We can expect UDP compatibility in Tor will give malicious actors additional opportunities to transmit unwanted traffic.

In general, exit abuse will need to be filtered administratively somehow. This is not unique to UDP support, and exit relay administration typically involves some type of filtering response tooling that falls outside the scope of Tor itself.

Exit administrators may choose to modify their exit policy, or to silently drop problematic traffic. Silent dropping is discouraged in most cases, as Tor prioritizes the accuracy of an exit's advertised policy. Detailed exit policies have a significant space overhead in the overall Tor consensus document, but it's still seen as a valuable resource for clients during circuit establishment.

Exit policy filtering may be less useful in UDP than with TCP due to the inconvenient latency spike when establishing a new tunnel. Applications that are sensitive to RTT measurements made during connection establishment may fail entirely when the tunnel cannot be pre-built.

This section lists a few potential hazards, but the real-world impact may be hard to predict owing to a diversity of custom UDP protocols implemented across the internet.

  • Amplification attacks against arbitrary targets

    These are possible only in limited circumstances where the protocol allows an arbitrary reply address, like SIP. The peer is often at fault for having an overly permissive configuration. Nevertheless, any of these easy amplification targets can be exploited from Tor with little consequence, creating a nuisance for the ultimate target and for exit operators.

  • Amplification attacks against an exit relay

    An amplification peer which doesn't allow arbitrary destinations can still be used to attack the exit relay itself or other users of that relay. This is essentially the same attack that is possible against any NAT the attacker is behind.

  • Malicious fragmented traffic

    If we allow sending large UDP datagrams over IPv4 without the Don't Fragment flag set, we allow attackers to generate fragmented IP datagrams. This is not itself a problem, but it has historically been a common source of inconsistencies in firewall behavior.

  • Excessive sends to an uninterested peer

    Whereas TCP mandates a successful handshake, UDP will happily send unlimited amounts of traffic to a peer that has never responded. To prevent denial of service attacks we have an opportunity and perhaps a responsibility to define our supported subset of UDP to include true bidirectional traffic but exclude continued sends to peers who do not respond.

    See also RFC7675 and STUN's concept of "Send consent".

  • Excessive number of peers

    We may want to place conservative limits on the maximum number of peers per mapping or per circuit, in order to make bulk scanning of UDP port space less convenient.

    The limit would need to be on peers, not stream IDs as we presently do for TCP. In this proposal stream IDs are not necessarily meaningful except as a representational choice made by clients.
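
The "excessive sends to an uninterested peer" item above could be approached with a per-peer budget of unanswered datagrams. The sketch below is a simplified count-based illustration, not the mechanism defined by RFC7675; the class name and the default threshold are arbitrary assumptions.

```python
class SendConsent:
    """Stops sending to peers that never respond (illustrative only)."""

    def __init__(self, max_unanswered=8):
        self.max_unanswered = max_unanswered  # arbitrary example budget
        self.unanswered = {}  # peer -> datagrams sent since last reply

    def may_send(self, peer):
        return self.unanswered.get(peer, 0) < self.max_unanswered

    def on_send(self, peer):
        self.unanswered[peer] = self.unanswered.get(peer, 0) + 1

    def on_receive(self, peer):
        # Any datagram from the peer renews the budget.
        self.unanswered[peer] = 0
```

Keying the budget on peers rather than stream IDs matches the point above that stream IDs are only a representational choice made by clients.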

Next Steps

At this point we have perhaps too many possibilities for how to proceed. We could integrate UDP quite closely with Tor itself or not at all.

Choosing a route forward is both a risk/benefit tradeoff and a guess about the future of Tor. If we expect that some future version of Tor will provide its own datagram transport, we should plot a course aiming in that direction. If not, we might be better served by keeping compatibility features confined to separate protocol layers when that's practical.

Requiring a Long-Term Datagram Plan

If we choose to add new long-term maintenance burdens to our protocol stack, we should ensure they serve our long-term goals for UDP adoption as well as these shorter-term application compatibility goals.

This work so far has been done with the assumption that end-to-end datagram support is out of scope. If we intend to proceed down any path which encodes a datagram-specific protocol into Tor proper, we should prioritize additional protocol research and standardization work.

Alternatively, Modular Application-Level Support

Without a clear way to implement fully generic UDP support, we're left with application-level goals. Different applications may have contrasting needs, and we can only achieve high levels of both compatibility and privacy by delegating some choices to app authors or per-app VPN settings.

This points to an alternative in which UDP support is excluded from Tor as much as possible while still supporting application requirements. For example, this could motivate the TURN Encapsulated in Tor Streams design, or even simpler designs where TURN servers are maintained independently from the Tor exit infrastructure.

The next prototyping step we would need at this stage is a version of onionmasq extended to support NAT-style UDP mappings using TURN allocations provided through TURN-over-TCP or TURN-over-TLS. This can be a platform for further application compatibility experiments. It could potentially become a low-cost minimal implementation of UDP application compatibility, serving to assess which remaining user needs are still unmet.

Filename: 349-command-state-validation.md
Title: Client-Side Command Acceptance Validation
Author: Mike Perry
Created: 2023-08-17
Status: Draft

Introduction

The ability of relays to inject end-to-end relay cells that are ignored by clients allows malicious relays to create a covert channel to verify that they are present in multiple positions of a path. This covert channel allows a Guard to deanonymize 100% of its traffic, or just all the traffic of a particular client IP address.

This attack was first documented in DROPMARK. Proposal 344 describes the severity of this attack, and how this kind of end-to-end covert channel leads to full deanonymization, in a reliable way, in practice. (Recall that dropped cell attacks are most severe when an adversary can inject arbitrary end-to-end data patterns at times when the circuit is known to be idle, before it is used for traffic; injection at this point enables path bias attacks which can ensure that only malicious Guard+Exit relays are present in all circuits used by a particular target client IP address. For further details, see Proposal 344.)

This proposal is targeting arti-client, not C-Tor. This proposal is specific to client-side checks of relay cells and relay messages. Its primary change to behavior is the definition of state machines that enforce what relay message commands are acceptable on a given circuit, and when.

By applying and enforcing these state machine rules, we prevent the end-to-end transmission of arbitrary amounts of data, and ensure that predictable periods of the protocol are happening as expected, and not filled with side channel packet patterns.

Overview of dropped cell types

Dropped cells are cells that a relay can inject that end up ignored and discarded by a Tor client.

These include:

  1. Unparsable cells
  2. Invalid relay commands
  3. Unrecognized cells (i.e., wrong source hop, or decrypt failures)
  4. Unsupported (or consensus-disabled) relay commands or extensions
  5. Out-of-context relay commands
  6. Duplicate relay commands
  7. Relay commands that hit any error codepaths
  8. Relay commands for an invalid or already-closed stream ID
  9. Semantically void relay cells (incl. relay data len == 0, or PING)
  10. Onion descriptor-appended junk

Items 1-4 and 8 are handled by the existing relay command parsers in arti. In these cases, arti closes the circuit already.

XXX: Arti's relay parser is lazy; see https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/1978 Does this mean that individual components need to properly propagate error information in order for circuits to get closed, when a command does not parse?

The state machines of this proposal handle 5-7 in a rigorous way. (In many cases of out-of-context relay cells, arti already closes the circuit; our goal here is to centralize this validation so that we can ensure that it is not possible for any relay commands to omit checks or allow unbounded activity.)

XXX: Does arti allow extra onion-descriptor junk to be appended after the descriptor signature? C-Tor does...

Architectural Patterns and Behavior

Ideally, the handling of invalid protocol behavior should be centralized, so that validation can happen in one easy-to-audit place, rather than spread across the codebase (as it currently is with C-Tor).

For some narrow cases of invalid protocol activity, this is trivial. The relay command acceptance is centralized in arti, which allows arti to immediately reject unknown or disabled relay commands. This kind of validation is necessary, but not sufficient, in order to prevent dropped cell vectors.

Things quickly get complicated when handling parsable relay cells sent at an inappropriate time, or other activity such as duplicate relay commands, semantically void cells, or commands that would hit an error condition or lazy parsing failure deep in the code and be silently accepted without closing the circuit.

To handle such cases, we propose adding a relay command message state machine pattern. Each relay protocol, when it becomes active on a circuit, must register a state machine that handles validating its messages.

Because multiple relay protocols can be active at a time, multiple validation state machines can be attached to a circuit. This also allows protocols to create their own validation without needing to modify the entire validation process. Relay messages that are not accepted by any active protocol validation handler MUST result in circuit close.

Architectural Patterns

In order to handle these cases, we rely on some architectural patterns:

  1. No relay message command may be sent to the client unless it is explicitly allowed by the specification, advertised as supported, and negotiated on a particular channel or circuit. (Prop#346)
  2. Any relay commands or extension fields not successfully negotiated on a circuit are invalid. This includes cells from intermediate hops, which must also negotiate their use (example: padding machine negotiation to middles).
  3. By following the above principles, state machines can be developed that govern when a relay command is acceptable. This covers the majority of protocol activity. See Section 3.
  4. For some commands, additional checks must be performed by using context of the protocol itself.

The following relay commands require additional module state to enforce limitations, beyond what is known by a state machine, for #4:

  • RELAY_COMMAND_SENDME
    • Requires checking that the auth digest hash is accurate
  • RELAY_COMMAND_XOFF and RELAY_COMMAND_XON
    • Context and rate limiting is stream-dependent
    • Packing enforcement via prop#340 is context-dependent
  • RELAY_COMMAND_CONFLUX_SWITCH
    • Packing enforcement via prop#340 is context-dependent
  • RELAY_COMMAND_DROP:
    • This can only be accepted from a hop if there is a padding machine at that hop.
  • RELAY_COMMAND_INTRODUCE2
    • Requires inspecting replay cache (however, circuits should not get closed because replays can come from the client)

Behavior

When an invalid relay cell or relay message is encountered, the corresponding circuit should be immediately closed.

Initially, this can be accomplished by sending a DESTROY cell to the Guard relay.

Additionally, when closing circuits in this way, clients must take care not to allow cases of adversarially-induced infinite circuit creation in non-onion service protocols that are not protected by Vanguards/Vanguards-lite, by limiting the number of retries they perform. (One such example of this is a malicious conflux exit that repeatedly kills only one leg by injecting dropped cells to close the circuit.)

While we also specify some cases where the channel to the Guard should be closed, this is not necessary in the general case.

XXX: I can't think of any issues severe enough to actually warrant the following, but Florentin pointed it out as a possibility: A malicious Guard may withhold the DESTROY, and still allow full identifier transmission before the circuit is closed. While this does not directly allow full deanonymization because the client won't actually use the circuit, it may still be enough to make the vector useful for other attacks. For completeness against this vector, we may want to consider sending a new RELAY_DESTROY command to the middle node, such that it has responsibility for tearing down a circuit by sending its own DESTROYS in both directions, and then have the client send its own DESTROY if the client does not get a DESTROY from the Guard. >>> See torspec#220: https://gitlab.torproject.org/tpo/core/torspec/-/issues/220

State machine descriptions

These state machines apply only at the client. (There is no information leak from extra cells in the protocol on the relay side, so we will not be specifying relay-side enforcement, or implementing it for C-Tor.)

There are multiple state machines, describing particular circuit purposes and/or components of the Tor relay protocol.

Each state machine has a "Trigger", and a "Message Scope". The "Trigger" is the condition, relay command, or action that causes the state machine to get added to a circuit's command state validator set. The "Message Scope" is where the state machine applies: to a specific hop number, stream ID, or both.

A circuit can have multiple state machines attached at one time.

  • If no state machine accepts a relay command, then the circuit MUST be closed.
  • When we say "receive X" we mean "receive a valid cell of type X". If the cell is invalid, we MUST kill the circuit.

Relay message handlers

The state machines process enveloped relay message commands. That is, with respect to prop#340, they operate on the message bodies, with associated stream ID.

With respect to Proposal #340, the calls to state machine validation would go after converting cells to messages, but before parsing the message body itself, to still minimize exposure of the parser attack surfaces.

XXX: Again, some validation will require early parsing, not lazy parsing

There are multiple relay message handlers that can be registered with each circuit ID, for a specific hop on that circuit ID, depending on the protocols that are in use on that circuit with that hop, as well as the streams to that hop.

Each handler has a Message Scope that acts as a filter, such that only relay command messages from this scope are processed by that handler.

If a message is not accepted by any active handler, the circuit MUST be closed.
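The dispatch rule above can be sketched roughly as follows. This is only an illustration, not the C-Tor or Arti implementation: `RelayMessage`, `Handler`, `dispatch`, and `CircuitClosed` are hypothetical names, and each handler's state machine is reduced to a single `accepts` callback.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class CircuitClosed(Exception):
    """Raised when the circuit MUST be closed."""


@dataclass(frozen=True)
class RelayMessage:
    hop: int          # hop the message was decrypted from
    stream_id: int    # 0 for circuit-level messages
    command: str      # e.g. "SENDME", "DATA"


@dataclass
class Handler:
    name: str
    hop: int
    stream_id: Optional[int]                 # None: scope is the hop only
    accepts: Callable[[RelayMessage], bool]  # stand-in for the state machine

    def in_scope(self, msg: RelayMessage) -> bool:
        # Message Scope filter: hop must match, and stream ID if one is set.
        if msg.hop != self.hop:
            return False
        return self.stream_id is None or self.stream_id == msg.stream_id


def dispatch(handlers: list, msg: RelayMessage) -> str:
    """Return the name of the first in-scope handler that accepts msg."""
    for handler in handlers:
        if handler.in_scope(msg) and handler.accepts(msg):
            return handler.name
    # No active handler accepted the message: close the circuit.
    raise CircuitClosed(f"no handler accepted {msg.command}")
```

Keeping the scope check separate from the acceptance check means a handler never sees messages from another hop or stream, which matches the filter role that the Message Scope plays here.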

Base Handler

Purpose: This handler validates commands for circuit construction and circuit-level SENDME activity.

Trigger: Creation of a circuit; ntor handshake success for a hop

Message Scope: The circuit ID and hop number must match for this handler to apply. (Because of leaky pipes, each hop of the circuit has a base handler added when that hop completes an ntor handshake and is added to the circuit.)

START:
  Upon sending EXTEND:
     Enter EXTEND_SENT.

  Receive SENDME:
     Ensure expected auth digest matches; close circuit otherwise
     No transition.

EXTEND_SENT:
  Receiving EXTENDED:
     Enter START.

  Receive SENDME:
     Ensure expected auth digest matches; close circuit otherwise
     No transition.
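As a rough sketch, the two states above could be written like this. The class and method names are hypothetical, and the FIFO of expected digests is an assumption about how "expected auth digest" is tracked; the real bookkeeping lives in the SENDME logic.

```python
import hmac


class CircuitClosed(Exception):
    """Raised when the circuit MUST be closed."""


class BaseHandler:
    def __init__(self):
        self.state = "START"
        # Assumption: digests we expect authenticated SENDMEs to carry, in order.
        self.expected_digests = []

    def on_send_extend(self):
        # Assumption: we never send a second EXTEND while one is outstanding.
        assert self.state == "START"
        self.state = "EXTEND_SENT"

    def on_receive(self, command: str, digest: bytes = b""):
        if command == "SENDME":
            # Accepted in both states; no transition.
            if not self.expected_digests:
                raise CircuitClosed("unexpected SENDME")
            expected = self.expected_digests.pop(0)
            if not hmac.compare_digest(expected, digest):
                raise CircuitClosed("SENDME auth digest mismatch")
        elif command == "EXTENDED" and self.state == "EXTEND_SENT":
            self.state = "START"
        else:
            raise CircuitClosed(f"{command} not valid in {self.state}")
```

Note that an EXTENDED arriving in START closes the circuit, since no transition in that state accepts it.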

Client Introducing Handler

Purpose: Circuits used by clients to connect to a service introduction point have this handler attached.

Trigger: Usage of a circuit for client introduction

Message Scope: Circuit ID and hop number must match

CLIENT_INTRO_START:
  Upon sending INTRODUCE1:
    Enter CLIENT_INTRO_WAIT

CLIENT_INTRO_WAIT:
  Receive INTRODUCE_ACK:
    Accept
    Transition to CLIENT_INTRO_END

CLIENT_INTRO_END:
  No transitions possible
  - XXX: Enforce that no new handlers can be added? We may still have padding
    handlers though.

Service Introduce Handler

Purpose: Service-side onion service introduction circuits have this handler attached.

Trigger: Onion service establishing an introduction point circuit

Message Scope: Circuit ID and hop number must match

SERVICE_INTRO_START:
  Upon sending ESTABLISH_INTRO:
    Enter SERVICE_INTRO_ESTABLISH

SERVICE_INTRO_ESTABLISH:
  Receiving INTRO_ESTABLISHED:
    Enter SERVICE_INTRO_ESTABLISHED

SERVICE_INTRO_ESTABLISHED:
  Receiving INTRODUCE2:
    Accept

Client Rendezvous Handler

Purpose: Circuits used by clients to build a rendezvous point have this handler attached.

Trigger: Client rendezvous initiation

Message Scope: Circuit ID and hop number must match

CLIENT_REND_START:
  Upon sending RENDEZVOUS1:
    Enter CLIENT_REND_WAIT

CLIENT_REND_WAIT:
  Receive RENDEZVOUS2:
    Enter CLIENT_REND_ESTABLISHED

CLIENT_REND_ESTABLISHED:
  Remain in this state; launch TCP, UDP, or Conflux handlers for streams

Service Rendezvous Handler

Purpose: Circuits used by services to connect to a rendezvous point have this handler attached.

Trigger: Incoming introduce cell/service rend initiation

Message Scope: Circuit ID and hop number must match

SERVICE_REND_START:
  Upon sending ESTABLISH_RENDEZVOUS:
    Enter SERVICE_REND_WAIT

SERVICE_REND_WAIT:
  Receive RENDEZVOUS_ESTABLISHED:
    Enter SERVICE_REND_ESTABLISHED

SERVICE_REND_ESTABLISHED:
  Remain in this state; launch TCP, UDP, or Conflux handlers for streams

CircPad Handler

Purpose: Circuit-level padding is negotiated with a particular hop in the circuit; when it is negotiated, we need to allow padding cells from that hop.

Trigger: Negotiation of a circuit padding machine

Message Scope: Circuit ID and hop must match; padding machine must be active

PADDING_START:
  Upon sending PADDING_NEGOTIATE:
    Enter PADDING_NEGOTIATING

PADDING_NEGOTIATING:
  Receiving PADDING_NEGOTIATED:
    Enter PADDING_ACTIVE

PADDING_ACTIVE:
  Receiving DROP:
    Accept (if from correct hop)
    - XXX: We could perform more sophisticated rate limiting accounting here
      too?

Resolve Stream Handler

Purpose: This handler is created on circuits when a resolve happens.

Trigger: RESOLVE message

Message Scope: Circuit ID, stream ID, and hop number must all match

RESOLVE_START:
  Send a RESOLVE message:
    Enter RESOLVE_SENT

RESOLVE_SENT:
  Receive a RESOLVED or an END:
    Enter RESOLVE_START.

TCP Stream handler

Purpose: This handler is created when the client creates a new stream ID, using either BEGIN or BEGIN_DIR.

Trigger: New AP or DirConn stream

Message Scope: Circuit ID, stream ID, and hop number must all match; stream ID must be open or half-open (half-open is END_SENT).

TCP_STREAM_START:
  Send a BEGIN or BEGIN_DIR message:
    Enter BEGIN_SENT.

BEGIN_SENT:
  Receive an END:
    Enter TCP_STREAM_START.
  Receive a CONNECTED:
    Enter STREAM_OPEN.

STREAM_OPEN:
  Receive DATA:
    Verify length is > 0
    XXX: Handle [HSDIRINFLATION] here?
    Process.

  Receive XOFF:
    Enter STREAM_XOFF

  Send END:
    Enter END_SENT.

  Receive END:
    Enter TCP_STREAM_START

STREAM_XOFF:
  Receive DATA:
    Verify length is > 0
    XXX: Handle [HSDIRINFLATION] here?
    Process.
 
  Send END:
    Enter END_SENT.

  Receive XON:
    Enter STREAM_XON

  Receive END:
    Enter TCP_STREAM_START

STREAM_XON:
  Receive DATA:
    Verify length is > 0
    XXX: Handle [HSDIRINFLATION] here?
    Process.

  Receive XOFF:
    If prop#340 is enabled, verify packed with SENDME
    Enter STREAM_XOFF

  Receive XON:
    If prop#340 is enabled, verify packed with SENDME
    Verify rate has changed

  Send END:
    Enter END_SENT.

  Receive END:
    Enter TCP_STREAM_START

END_SENT:
  Same as STREAM_OPEN, except do not actually deliver data.
  Only remain in this state for one RTT_max, or until END_ACK.
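One way to cross-check the transitions above is to write them as a table. The sketch below is illustrative only: `step`, `TRANSITIONS`, and `CircuitClosed` are hypothetical names, the prop#340 packing checks, the XON rate check, and the RTT_max timer are omitted, and the END_SENT rows are an assumption, since the text does not say which state follows END_SENT.

```python
RECV, SEND = "recv", "send"

# (state, direction, command) -> next state
TRANSITIONS = {
    ("TCP_STREAM_START", SEND, "BEGIN"): "BEGIN_SENT",
    ("TCP_STREAM_START", SEND, "BEGIN_DIR"): "BEGIN_SENT",
    ("BEGIN_SENT", RECV, "END"): "TCP_STREAM_START",
    ("BEGIN_SENT", RECV, "CONNECTED"): "STREAM_OPEN",
    ("STREAM_OPEN", RECV, "DATA"): "STREAM_OPEN",
    ("STREAM_OPEN", RECV, "XOFF"): "STREAM_XOFF",
    ("STREAM_OPEN", SEND, "END"): "END_SENT",
    ("STREAM_OPEN", RECV, "END"): "TCP_STREAM_START",
    ("STREAM_XOFF", RECV, "DATA"): "STREAM_XOFF",
    ("STREAM_XOFF", SEND, "END"): "END_SENT",
    ("STREAM_XOFF", RECV, "XON"): "STREAM_XON",
    ("STREAM_XOFF", RECV, "END"): "TCP_STREAM_START",
    ("STREAM_XON", RECV, "DATA"): "STREAM_XON",
    ("STREAM_XON", RECV, "XOFF"): "STREAM_XOFF",
    ("STREAM_XON", RECV, "XON"): "STREAM_XON",
    ("STREAM_XON", SEND, "END"): "END_SENT",
    ("STREAM_XON", RECV, "END"): "TCP_STREAM_START",
    # Assumption: in END_SENT we keep accepting DATA (without delivering it)
    # and return to the start state once the peer's END arrives.
    ("END_SENT", RECV, "DATA"): "END_SENT",
    ("END_SENT", RECV, "END"): "TCP_STREAM_START",
}


class CircuitClosed(Exception):
    """Raised when the circuit MUST be closed."""


def step(state: str, direction: str, command: str, length: int = 0) -> str:
    """Return the next state, or raise CircuitClosed if nothing accepts."""
    nxt = TRANSITIONS.get((state, direction, command))
    if nxt is None:
        raise CircuitClosed(f"{direction} {command} not valid in {state}")
    if command == "DATA" and length <= 0:
        raise CircuitClosed("DATA with empty body")
    return nxt
```

For simplicity, an unlisted send is treated the same as an unlisted receive here; a real client would more likely treat it as a local bug than as a protocol violation.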

Conflux Handler

Purpose: Circuits that are a part of a conflux set have a conflux handler, associated with the last hop.

Trigger: Creation of a conflux set

Message Scope: Circuit ID and hop number must match

  • XXX: Linked circuits must accept stream ids from either circuit for other handlers :/

CONFLUX_START: (all conflux leg circuits start here)
  Upon sending CONFLUX_LINK:
     Enter CONFLUX_LINKING

CONFLUX_LINKING:
  Receiving CONFLUX_LINKED:
     Send CONFLUX_LINKED_ACK
     Enter CONFLUX_LINKED

CONFLUX_LINKED:
  Receiving CONFLUX_SWITCH:
     If prop#340 is negotiated, ensure packed with a DATA cell

UDP Stream Handler

Purpose: Circuits that are using prop#339

Trigger: UDP stream creation

Message Scope: Circuit ID, hop number, and stream-id must match

UDP_STREAM_START:
  If no other udp streams used on circuit:
    Send CONNECT_UDP for any stream, enter UDP_CONNECTING
  else:
    Immediately enter UDP_CONNECTING
    (CONNECTED_UDP MAY arrive without a CONNECT_UDP, after the first UDP
     stream on a circuit is established)

UDP_CONNECTING:
  Upon receipt of CONNECTED_UDP, enter UDP_CONNECTED

UDP_CONNECTED:
  Receive DATAGRAM:
    Verify length > 0
    Verify Prop#344 NAT rules are obeyed, including srcport and stream limits
    Process.

  Send END:
    Enter UDP_END_SENT

UDP_END_SENT:
  Same as UDP_CONNECTED, except do not actually deliver data.
  Only remain in this state for one RTT_max, or until END_ACK,
  then transition to UDP_STREAM_START.

HSDIR Inflation

XXX: This can be folded into the state machines and/or rend-spec. The state machines should actually be able to handle this, once they are ready for it.

One of the most common questions about dropped cells is "what about data cells with a 1 byte payload?". As Prop#344 makes clear, this is not a dropped cell attack, but is instead an instance of an Active Traffic Manipulation Covert Channel, described in Section 1.3.2. The lower severity of active traffic manipulation is due to the fact that it cannot be used to deanonymize 100% of a target client's circuits, whereas the combination of path bias and pre-usage dropped cells can.

However, there is one case where one can construct a potent attack from this Active Traffic Manipulation: by making use of onion service circuits being built on demand by an application. Further, because the onion service handshake is uniquely fingerprintable (see Section 1.2.1 of Prop#344), it is possible to use this vector in this specific case to encode an identifier in the timing and traffic patterns of the onion service descriptor download, similar to how the CMU attack operated, and use both the onion service fingerprint and descriptor traffic pattern to transmit the fact that a particular onion service was visited, to the Guard or possibly even a local network observer.

A normal hidden service descriptor occupies only ~10 cells (with a hard max of 30KB, or ~60 cells). This is not enough to reliably encode the full address of the onion service in a timing-based covert channel.

However, there are two ways to cause this descriptor download to transmit enough data to encode such a covert channel, and replicate the CMU attack using timing information of this data.

First, the actual descriptor payload can be spread across many DATA cells that are filled only partially with data (which does not happen if the HSDIR is honest and well-behaved, because it always has the full descriptor on hand).

Second, in C-Tor, additional junk can be appended at the end of an onion service descriptor document that does not count against the 30KB maximum, which the client will happily download and then ignore.

Neither of these things is necessary to preserve, and neither can happen in normal operation. They can either be addressed directly by checks on HSDIR-based RELAY_COMMAND_DATA lengths and descriptor parsing, or by simply enforcing that circuits used to fetch service descriptors can only receive as many bytes as the maximum descriptor size, before being closed.
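Both mitigations can be combined in a small per-circuit budget, sketched below. This is a simplification under stated assumptions: `DescFetchBudget` and `CircuitClosed` are hypothetical names, "30KB" is read as 30 * 1024 bytes, 498 bytes is taken as the largest DATA body in a legacy (non-prop#340) relay cell, and the HTTP response headers that accompany a descriptor are not given any extra allowance.

```python
MAX_DESC_BYTES = 30 * 1024  # assumption: "30KB" read as 30 KiB
FULL_DATA_LEN = 498         # largest DATA body in a legacy relay cell


class CircuitClosed(Exception):
    """Raised when the circuit MUST be closed."""


class DescFetchBudget:
    """Tracks DATA bodies received on a descriptor-fetch circuit."""

    def __init__(self, max_bytes: int = MAX_DESC_BYTES):
        self.remaining = max_bytes
        self.saw_partial = False

    def on_data(self, length: int) -> None:
        if not 0 < length <= FULL_DATA_LEN:
            raise CircuitClosed("invalid DATA length")
        if self.saw_partial:
            # Only the last DATA message may be partially filled.
            raise CircuitClosed("DATA after a partially filled DATA")
        if length > self.remaining:
            raise CircuitClosed("descriptor size limit exceeded")
        self.remaining -= length
        if length < FULL_DATA_LEN:
            self.saw_partial = True
```

With these numbers, a maximal descriptor fits in 62 DATA messages (61 full ones plus a remainder), which is consistent with the "~60 cells" figure above.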

XXX: Consider RELAY_COMMAND_END_ACK also..

  • https://gitlab.torproject.org/tpo/core/torspec/-/issues/196

XXX: Tickets to grovel through for other stuff: https://gitlab.torproject.org/tpo/core/torspec/-/issues/38 https://gitlab.torproject.org/tpo/core/torspec/-/issues/39 https://gitlab.torproject.org/tpo/core/arti/-/issues/525

Command Allowlist enumeration

XXX: We are planning to remove this section after we finish the state machines; keeping it for reference until then for cross-checking.

Formerly, in C-Tor, we took the approach of performing a series of ad-hoc checks for each message command. Here are those rules, for spot-checking that the above state machines cover them.

All relay commands are rejected by clients and services unless a rule says they are OK.

Here's a list of those rules, by relay command:

  • RELAY_COMMAND_DATA 2

    • This command MUST only arrive for valid open or half-open stream ID
    • This command MUST have data length > 0
    • On HSDIR circuits, ONLY ONE command is allowed to have a non-full payload (the last command). See Section 4.

  • RELAY_COMMAND_END 3

    • This command MUST only arrive ONCE for each valid open or half-open stream ID
  • RELAY_COMMAND_CONNECTED 4

    • This command MUST ONLY be accepted ONCE by clients if they sent a BEGIN or BEGIN_DIR
    • The stream ID MUST match the stream ID from BEGIN (or BEGIN_DIR)
  • RELAY_COMMAND_DROP 10

    • This command is accepted by clients from any hop that they have negotiated an active circuit padding machine with
  • RELAY_COMMAND_CONFLUX_LINKED 20

    • Ensure that a LINK cell was sent to the hop that sent this
    • Ensure that no previous LINKED cell has arrived on this circuit
  • RELAY_COMMAND_CONFLUX_SWITCH 22

    • Ensure that conflux is enabled and linked
    • If Prop#340 is in use, this cell MUST be packed with a valid multiplexed RELAY_COMMAND_DATA cell.
  • RELAY_COMMAND_INTRODUCE2 35

    • Services MUST check:
      • The intro is for a valid service identity and auth
      • The command has a valid sub-credential
      • The command is not a replay (possibly not close circuit?)
  • RELAY_COMMAND_RENDEZVOUS2 37

    • This command MUST ONLY arrive ONCE in response to a sent REND1 cell, on the appropriate circuit
    • The ntor handshake must succeed with MAC validation
  • RELAY_COMMAND_INTRO_ESTABLISHED 38

    • Services MUST check:
      • This cell MUST ONLY come ONCE in response to RELAY_COMMAND_ESTABLISH_INTRO, for the appropriate service identity
  • RELAY_COMMAND_RENDEZVOUS_ESTABLISHED 39

    • This command MUST ONLY be accepted ONCE in response to RELAY_COMMAND_ESTABLISH_RENDEZVOUS
  • RELAY_COMMAND_INTRODUCE_ACK 40

    • This command MUST ONLY be accepted ONCE by clients, in response to RELAY_COMMAND_INTRODUCE1
  • RELAY_COMMAND_PADDING_NEGOTIATED 42

    • This command MUST ONLY be accepted by clients in response to PADDING_NEGOTIATE
  • RELAY_COMMAND_XOFF 43

    • Ensure that congestion control is enabled and negotiated
    • Ensure that the stream id is either opened or half-open
    • Ensure that the stream id is in "XON" state
  • RELAY_COMMAND_XON 44

    • Ensure that congestion control is enabled and negotiated
    • Ensure that the stream id is either opened or half-open
    • Enforce always packing this to a SENDME with Prop#340?
  • RELAY_COMMAND_CONNECTED_UDP

    • The stream id in this command MUST match that from RELAY_COMMAND_CONNECT_UDP
    • This command is only accepted once per UDP stream id
  • RELAY_COMMAND_DATAGRAM

    • This command MUST only arrive for valid open or half-open stream ID
    • This command MUST have data length > 0
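For spot-checking, the default-deny shape of this list can be sketched as a table keyed by command number. Only a few of the rules above are shown, and everything apart from the command numbers is hypothetical: the context keys (such as `begin_sent`) merely stand in for whatever per-circuit and per-stream state an implementation keeps.

```python
# Command number -> predicate over a per-message context dict.
# Any command without an entry is rejected.
RULES = {
    # RELAY_COMMAND_DATA
    2: lambda c: c.get("stream_open_or_half_open", False) and c.get("length", 0) > 0,
    # RELAY_COMMAND_END: once per open or half-open stream
    3: lambda c: c.get("stream_open_or_half_open", False) and not c.get("end_seen", False),
    # RELAY_COMMAND_CONNECTED: once, after BEGIN or BEGIN_DIR
    4: lambda c: c.get("begin_sent", False) and not c.get("connected_seen", False),
    # RELAY_COMMAND_DROP: only from a hop with an active padding machine
    10: lambda c: c.get("padding_active_with_hop", False),
    # RELAY_COMMAND_INTRODUCE_ACK: once, after INTRODUCE1
    40: lambda c: c.get("introduce1_sent", False) and not c.get("intro_ack_seen", False),
}


def is_allowed(command: int, ctx: dict) -> bool:
    """Default deny: a command passes only if a rule exists and holds."""
    rule = RULES.get(command)
    return rule is not None and bool(rule(ctx))
```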

References:

Filename: 350-remove-tap.md
Title: A phased plan to remove TAP onion keys
Author: Nick Mathewson
Created: 31 May 2024
Status: Accepted

Introduction

Back in proposal 216, we introduced the ntor circuit extension handshake. It replaced the older TAP handshake, which was badly designed, and dependent on insecure key lengths (RSA1024, DH1024).

With the final shutdown of v2 onion services, there are no longer any supported users of TAP anywhere in the Tor protocols. Anecdotally, a relay operator reports that fewer than one handshake in 300,000 is currently TAP. (Such handshakes are presumably coming from long-obsolete clients.)

Nonetheless, we continue to bear burdens from TAP support. For example:

  • TAP keys compose a significant (but useless!) portion of directory traffic.
  • The TAP handshake requires cryptographic primitives used nowhere else in the Tor protocols.

Now that we are implementing relays in Arti, the time is ripe to remove TAP. (The only alternative is to add a useless TAP implementation in Arti, which would be a waste of time.)

This document outlines a plan to completely remove the TAP handshake, and its associated keys, from the Tor protocols.

This is, in some respects, a modernized version of proposal 245.

Migration plan

Here is the plan in summary; we'll discuss each phase below.

  • Phase 1: Remove TAP dependencies
    • Item 1: Stop accepting TAP circuit requests.
    • Item 2: Make TAP keys optional in directory documents.
    • Item 3: Publish dummy TAP keys.
  • Phase 2: After everybody has updated
    • Item 1: Allow TAP-free routerdescs at the authorities
    • Item 2: Generate TAP-free microdescriptors
  • Phase 3: Stop publishing dummy TAP keys.

Phase 1 can begin immediately.

Phase 2 can begin once all supported clients and relays have upgraded to run versions with the changes made in Phase 1.

Phase 3 can begin once all authorities have made the changes described in phase 2.

Phase 1, Item 1: Stop accepting TAP circuit requests.

(All items in phase 1 can happen in parallel.)

Immediately, Tor relays should stop accepting TAP requests. This includes all CREATE cells (not CREATE2), and any CREATE2 cell whose type is TAP (0x0000).

When receiving such a request, relays should respond with DESTROY. Relays MAY just drop the request entirely, however, if they find that they are getting too many requests.
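A minimal sketch of this check follows. The function names and the `overloaded` flag are hypothetical; the cell command numbers (CREATE = 1, CREATE2 = 10) and the CREATE2 layout (a 2-byte big-endian handshake type followed by a 2-byte length) are taken from tor-spec.

```python
import struct

CMD_CREATE = 1
CMD_CREATE2 = 10
HTYPE_TAP = 0x0000


def is_tap_request(command: int, body: bytes) -> bool:
    """True for any CREATE cell, and for CREATE2 cells whose HTYPE is TAP."""
    if command == CMD_CREATE:
        return True
    if command == CMD_CREATE2 and len(body) >= 2:
        (htype,) = struct.unpack_from("!H", body, 0)
        return htype == HTYPE_TAP
    return False


def respond(command: int, body: bytes, overloaded: bool = False) -> str:
    """Decide what a relay does with a circuit-creation cell."""
    if not is_tap_request(command, body):
        return "PROCESS"   # handled by the normal (e.g. ntor) path
    # Relays should answer with DESTROY, but MAY drop when flooded.
    return "DROP" if overloaded else "DESTROY"
```

Truncated CREATE2 bodies are deliberately not classified as TAP here; they would be rejected as malformed by the normal parsing path.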

Such relays must stop reporting Relay=1 among their supported protocol versions. (This protocol version is not currently required or recommended.)

If this proposal is accepted, we should clarify the protocol version specification to say that Relay=1 specifically denotes TAP.

Phase 1, Item 2: Make TAP keys optional in directory documents.

(All items in phase 1 can happen in parallel.)

In C tor and Arti, we should make the onion-key entry and the onion-key-crosscert entry optional. (If either is present, the other still must be present.)

When we do this, we should also modify the authority code to reject any descriptors that do not have these fields. (This will be needed so that existing Tor instances do not break.)
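The two rules above (both-or-neither for parsers, still mandatory at the authorities during this phase) could be expressed along these lines. This is only a sketch: `tap_fields_ok` and its `require_tap` flag are hypothetical, and a real parser works on parsed items rather than a bare set of keywords.

```python
def tap_fields_ok(keywords: set, require_tap: bool) -> bool:
    """Check the TAP-related items of a router descriptor.

    keywords: the item keywords present in the descriptor.
    require_tap: True at authorities until Phase 2, Item 1.
    """
    has_key = "onion-key" in keywords
    has_crosscert = "onion-key-crosscert" in keywords
    if has_key != has_crosscert:
        # If either is present, the other must be present too.
        return False
    if require_tap and not has_key:
        # Authorities keep rejecting TAP-free descriptors for now.
        return False
    return True
```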

In the microdescriptor documents format, we should make the object of the onion-key element optional. (We do not want to make the element itself optional, since it is used to delimit microdescriptors.)

We use new protocol version flags (Desc=3, Microdesc=3) to note the ability to parse these documents.

Phase 1, Item 3: Publish dummy TAP keys

(All items in phase 1 can happen in parallel.)

Even after item 2 is done, many clients and relays on the network will still require TAP keys to be present in directory documents. Therefore, we can't remove those keys right away.

Relays, therefore, must put some kind of RSA key into their onion-key field.

I'll present three designs on what relays should do. We should pick one.

Option 1 (conservative)

We could say that a relay should generate a TAP key, generate an onion-key-crosscert, and then discard the private component of the key.

Option 2 (low-effort)

In C tor, we can have relays simply proceed as they do now, maintaining TAP private keys and generating crosscerts.

This has little real risk beyond what is described in Option 1.

Option 3 (nutty)

We could generate a global, shared RSA1024 private key, to be used only for generating onion-key-crosscerts and placing into the onion-key field of a descriptor.

We would say that relays publishing this key MUST NOT actually handle any TAP requests.

The advantage of this approach over Option 1 would be that we'd see gains in our directory traffic immediately, since all identical onion keys would be highly compressible.

The downside here is that any client TAP requests sent to such a relay would be decryptable by anybody, which would expose long-obsolete clients to MITM attacks by hostile guards.

We would control the presence of these dummy TAP keys using a consensus parameter:

publish-dummy-tap-key: If set to 1, relays should include a dummy TAP key in their routerdescs. If set to 0, relays should omit the TAP key and corresponding crosscert. (Min: 0, Max: 1, Default: 0.)

We would want to ensure that all authorities voted for this parameter as "1" before enabling support for it at the relay level.
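On the relay side, reading this parameter might look roughly like the sketch below. The helper names are hypothetical; the `params` line format (space-separated `key=value` pairs with integer values) follows dir-spec, and clamping out-of-range values to the stated bounds is an assumption about how the relay would treat them.

```python
def consensus_param(params_line: str, name: str, default: int, lo: int, hi: int) -> int:
    """Look up an integer parameter on a consensus "params" line."""
    parts = params_line.split()
    if not parts or parts[0] != "params":
        return default
    for item in parts[1:]:
        key, sep, value = item.partition("=")
        if sep and key == name:
            try:
                parsed = int(value)
            except ValueError:
                return default
            # Assumption: clamp out-of-range values into [lo, hi].
            return max(lo, min(hi, parsed))
    return default


def should_publish_dummy_tap_key(params_line: str) -> bool:
    # Min: 0, Max: 1, Default: 0, as given above.
    return consensus_param(params_line, "publish-dummy-tap-key", 0, 0, 1) == 1
```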

Phase 2, Item 1: Allow TAP-free routerdescs at the authorities

Once all clients and relays have updated to a version where the onion-key router descriptor element is optional (see phase 1, item 2), we can remove the authority code that requires all descriptors to have TAP keys.

Phase 2, Item 2: Generate TAP-free microdescriptors

Once all clients and relays have updated to a version where the onion-key body is optional in microdescriptors (see phase 1, item 2), we can add a new consensus method in which authorities omit the body when generating microdescriptors.

Phase 3: Stop publishing dummy TAP keys.

Once all authorities have stopped requiring the onion-key element in router descriptors (see phase 2, item 1), we can disable the publish-dummy-tap-key consensus parameter, so that relays will no longer include TAP keys in their router descriptors.