Proposals for changes in the Tor protocols
This "book" is a list of proposals that people have made over the years (dating back to 2007) for protocol changes in Tor. Some of these proposals are already implemented or rejected; others are under active discussion.
If you're looking for a specific proposal, you can find it by filename in the summary bar on the left, or in this index. You can also see a list of Tor proposals by their status in BY_STATUS.md.
For information on creating a new proposal, see 001-process.txt. That file is a bit out of date, though, and you should probably just contact the developers.
Tor proposals by number
Here we have a set of proposals for changes to the Tor protocol. Some of these proposals are implemented; some are works in progress; and some will never be implemented.
Below is a list of proposals sorted by proposal number. See BY_STATUS.md for a list of proposals sorted by status.
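Because every entry in these indexes follows the same "filename: Title [STATUS]" shape, the lists are easy to filter with standard tools. A minimal sketch, using a throwaway sample file (the path is made up for illustration; the entries are copied from this index):

```shell
# Build a small sample in the index's "filename: Title [STATUS]" format.
# Entries below are taken from this index; the sample path is hypothetical.
cat > /tmp/proposal-index-sample.txt <<'EOF'
285-utf-8.txt: Directory documents should be standardized as UTF-8 [ACCEPTED]
294-tls-1.3.txt: TLS 1.3 Migration [DRAFT]
339-udp-over-tor.md: UDP traffic over Tor [ACCEPTED]
EOF

# Print only the entries whose status tag is ACCEPTED.
grep -E '\[ACCEPTED\]$' /tmp/proposal-index-sample.txt
```

The same pattern works for any of the status tags used here (OPEN, CLOSED, DRAFT, and so on), since the tag always appears in square brackets at the end of the line.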
000-index.txt: Index of Tor Proposals [META]
001-process.txt: The Tor Proposal Process [META]
098-todo.txt: Proposals that should be written [OBSOLETE]
099-misc.txt: Miscellaneous proposals [OBSOLETE]
100-tor-spec-udp.txt: Tor Unreliable Datagram Extension Proposal [DEAD]
101-dir-voting.txt: Voting on the Tor Directory System [CLOSED]
102-drop-opt.txt: Dropping "opt" from the directory format [CLOSED]
103-multilevel-keys.txt: Splitting identity key from regularly used signing key [CLOSED]
104-short-descriptors.txt: Long and Short Router Descriptors [CLOSED]
105-handshake-revision.txt: Version negotiation for the Tor protocol [CLOSED]
106-less-tls-constraint.txt: Checking fewer things during TLS handshakes [CLOSED]
107-uptime-sanity-checking.txt: Uptime Sanity Checking [CLOSED]
108-mtbf-based-stability.txt: Base "Stable" Flag on Mean Time Between Failures [CLOSED]
109-no-sharing-ips.txt: No more than one server per IP address [CLOSED]
110-avoid-infinite-circuits.txt: Avoiding infinite length circuits [CLOSED]
111-local-traffic-priority.txt: Prioritizing local traffic over relayed traffic [CLOSED]
112-bring-back-pathlencoinweight.txt: Bring Back Pathlen Coin Weight [SUPERSEDED]
113-fast-authority-interface.txt: Simplifying directory authority administration [SUPERSEDED]
114-distributed-storage.txt: Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
115-two-hop-paths.txt: Two Hop Paths [DEAD]
116-two-hop-paths-from-guard.txt: Two hop paths from entry guards [DEAD]
117-ipv6-exits.txt: IPv6 exits [CLOSED]
118-multiple-orports.txt: Advertising multiple ORPorts at once [SUPERSEDED]
119-controlport-auth.txt: New PROTOCOLINFO command for controllers [CLOSED]
120-shutdown-descriptors.txt: Shutdown descriptors when Tor servers stop [DEAD]
121-hidden-service-authentication.txt: Hidden Service Authentication [CLOSED]
122-unnamed-flag.txt: Network status entries need a new Unnamed flag [CLOSED]
123-autonaming.txt: Naming authorities automatically create bindings [CLOSED]
124-tls-certificates.txt: Blocking resistant TLS certificate usage [SUPERSEDED]
125-bridges.txt: Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
126-geoip-reporting.txt: Getting GeoIP data and publishing usage summaries [CLOSED]
127-dirport-mirrors-downloads.txt: Relaying dirport requests to Tor download site / website [OBSOLETE]
128-bridge-families.txt: Families of private bridges [DEAD]
129-reject-plaintext-ports.txt: Block Insecure Protocols by Default [CLOSED]
130-v2-conn-protocol.txt: Version 2 Tor connection protocol [CLOSED]
131-verify-tor-usage.txt: Help users to verify they are using Tor [OBSOLETE]
132-browser-check-tor-service.txt: A Tor Web Service For Verifying Correct Browser Configuration [OBSOLETE]
133-unreachable-ors.txt: Incorporate Unreachable ORs into the Tor Network [RESERVE]
134-robust-voting.txt: More robust consensus voting with diverse authority sets [REJECTED]
135-private-tor-networks.txt: Simplify Configuration of Private Tor Networks [CLOSED]
136-legacy-keys.txt: Mass authority migration with legacy keys [CLOSED]
137-bootstrap-phases.txt: Keep controllers informed as Tor bootstraps [CLOSED]
138-remove-down-routers-from-consensus.txt: Remove routers that are not Running from consensus documents [CLOSED]
139-conditional-consensus-download.txt: Download consensus documents only when it will be trusted [CLOSED]
140-consensus-diffs.txt: Provide diffs between consensuses [CLOSED]
141-jit-sd-downloads.txt: Download server descriptors on demand [OBSOLETE]
142-combine-intro-and-rend-points.txt: Combine Introduction and Rendezvous Points [DEAD]
143-distributed-storage-improvements.txt: Improvements of Distributed Storage for Tor Hidden Service Descriptors [SUPERSEDED]
144-enforce-distinct-providers.txt: Increase the diversity of circuits by detecting nodes belonging the same provider [OBSOLETE]
145-newguard-flag.txt: Separate "suitable as a guard" from "suitable as a new guard" [SUPERSEDED]
146-long-term-stability.txt: Add new flag to reflect long-term stability [SUPERSEDED]
147-prevoting-opinions.txt: Eliminate the need for v2 directories in generating v3 directories [REJECTED]
148-uniform-client-end-reason.txt: Stream end reasons from the client side should be uniform [CLOSED]
149-using-netinfo-data.txt: Using data from NETINFO cells [SUPERSEDED]
150-exclude-exit-nodes.txt: Exclude Exit Nodes from a circuit [CLOSED]
151-path-selection-improvements.txt: Improving Tor Path Selection [CLOSED]
152-single-hop-circuits.txt: Optionally allow exit from single-hop circuits [CLOSED]
153-automatic-software-update-protocol.txt: Automatic software update protocol [SUPERSEDED]
154-automatic-updates.txt: Automatic Software Update Protocol [SUPERSEDED]
155-four-hidden-service-improvements.txt: Four Improvements of Hidden Service Performance [CLOSED]
156-tracking-blocked-ports.txt: Tracking blocked ports on the client side [SUPERSEDED]
157-specific-cert-download.txt: Make certificate downloads specific [CLOSED]
158-microdescriptors.txt: Clients download consensus + microdescriptors [CLOSED]
159-exit-scanning.txt: Exit Scanning [INFORMATIONAL]
160-bandwidth-offset.txt: Authorities vote for bandwidth offsets in consensus [CLOSED]
161-computing-bandwidth-adjustments.txt: Computing Bandwidth Adjustments [CLOSED]
162-consensus-flavors.txt: Publish the consensus in multiple flavors [CLOSED]
163-detecting-clients.txt: Detecting whether a connection comes from a client [SUPERSEDED]
164-reporting-server-status.txt: Reporting the status of server votes [OBSOLETE]
165-simple-robust-voting.txt: Easy migration for voting authority sets [REJECTED]
166-statistics-extra-info-docs.txt: Including Network Statistics in Extra-Info Documents [CLOSED]
167-params-in-consensus.txt: Vote on network parameters in consensus [CLOSED]
168-reduce-circwindow.txt: Reduce default circuit window [REJECTED]
169-eliminating-renegotiation.txt: Eliminate TLS renegotiation for the Tor connection handshake [SUPERSEDED]
170-user-path-config.txt: Configuration options regarding circuit building [SUPERSEDED]
171-separate-streams.txt: Separate streams across circuits by connection metadata [CLOSED]
172-circ-getinfo-option.txt: GETINFO controller option for circuit information [RESERVE]
173-getinfo-option-expansion.txt: GETINFO Option Expansion [OBSOLETE]
174-optimistic-data-server.txt: Optimistic Data for Tor: Server Side [CLOSED]
175-automatic-node-promotion.txt: Automatically promoting Tor clients to nodes [REJECTED]
176-revising-handshake.txt: Proposed version-3 link handshake for Tor [CLOSED]
177-flag-abstention.txt: Abstaining from votes on individual flags [RESERVE]
178-param-voting.txt: Require majority of authorities to vote for consensus parameters [CLOSED]
179-TLS-cert-and-parameter-normalization.txt: TLS certificate and parameter normalization [CLOSED]
180-pluggable-transport.txt: Pluggable transports for circumvention [CLOSED]
181-optimistic-data-client.txt: Optimistic Data for Tor: Client Side [CLOSED]
182-creditbucket.txt: Credit Bucket [OBSOLETE]
183-refillintervals.txt: Refill Intervals [CLOSED]
184-v3-link-protocol.txt: Miscellaneous changes for a v3 Tor link protocol [CLOSED]
185-dir-without-dirport.txt: Directory caches without DirPort [SUPERSEDED]
186-multiple-orports.txt: Multiple addresses for one OR or bridge [CLOSED]
187-allow-client-auth.txt: Reserve a cell type to allow client authorization [CLOSED]
188-bridge-guards.txt: Bridge Guards and other anti-enumeration defenses [RESERVE]
189-authorize-cell.txt: AUTHORIZE and AUTHORIZED cells [OBSOLETE]
190-shared-secret-bridge-authorization.txt: Bridge Client Authorization Based on a Shared Secret [OBSOLETE]
191-mitm-bridge-detection-resistance.txt: Bridge Detection Resistance against MITM-capable Adversaries [OBSOLETE]
192-store-bridge-information.txt: Automatically retrieve and store information about bridges [OBSOLETE]
193-safe-cookie-authentication.txt: Safe cookie authentication for Tor controllers [CLOSED]
194-mnemonic-urls.txt: Mnemonic .onion URLs [SUPERSEDED]
195-TLS-normalization-for-024.txt: TLS certificate normalization for Tor 0.2.4.x [DEAD]
196-transport-control-ports.txt: Extended ORPort and TransportControlPort [CLOSED]
197-postmessage-ipc.txt: Message-based Inter-Controller IPC Channel [REJECTED]
198-restore-clienthello-semantics.txt: Restore semantics of TLS ClientHello [CLOSED]
199-bridgefinder-integration.txt: Integration of BridgeFinder and BridgeFinderHelper [OBSOLETE]
200-new-create-and-extend-cells.txt: Adding new, extensible CREATE, EXTEND, and related cells [CLOSED]
201-bridge-v3-reqs-stats.txt: Make bridges report statistics on daily v3 network status requests [RESERVE]
202-improved-relay-crypto.txt: Two improved relay encryption protocols for Tor cells [META]
203-https-frontend.txt: Avoiding censorship by impersonating an HTTPS server [OBSOLETE]
204-hidserv-subdomains.txt: Subdomain support for Hidden Service addresses [CLOSED]
205-local-dnscache.txt: Remove global client-side DNS caching [CLOSED]
206-directory-sources.txt: Preconfigured directory sources for bootstrapping [CLOSED]
207-directory-guards.txt: Directory guards [CLOSED]
208-ipv6-exits-redux.txt: IPv6 Exits Redux [CLOSED]
209-path-bias-tuning.txt: Tuning the Parameters for the Path Bias Defense [OBSOLETE]
210-faster-headless-consensus-bootstrap.txt: Faster Headless Consensus Bootstrapping [SUPERSEDED]
211-mapaddress-tor-status.txt: Internal Mapaddress for Tor Configuration Testing [RESERVE]
212-using-old-consensus.txt: Increase Acceptable Consensus Age [NEEDS-REVISION]
213-remove-stream-sendmes.txt: Remove stream-level sendmes from the design [DEAD]
214-longer-circids.txt: Allow 4-byte circuit IDs in a new link protocol [CLOSED]
215-update-min-consensus-ver.txt: Let the minimum consensus method change with time [CLOSED]
216-ntor-handshake.txt: Improved circuit-creation key exchange [CLOSED]
217-ext-orport-auth.txt: Tor Extended ORPort Authentication [CLOSED]
218-usage-controller-events.txt: Controller events to better understand connection/circuit usage [CLOSED]
219-expanded-dns.txt: Support for full DNS and DNSSEC resolution in Tor [NEEDS-REVISION]
220-ecc-id-keys.txt: Migrate server identity keys to Ed25519 [CLOSED]
221-stop-using-create-fast.txt: Stop using CREATE_FAST [CLOSED]
222-remove-client-timestamps.txt: Stop sending client timestamps [CLOSED]
223-ace-handshake.txt: Ace: Improved circuit-creation key exchange [RESERVE]
224-rend-spec-ng.txt: Next-Generation Hidden Services in Tor [CLOSED]
225-strawman-shared-rand.txt: Strawman proposal: commit-and-reveal shared rng [SUPERSEDED]
226-bridgedb-database-improvements.txt: "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS" [RESERVE]
227-vote-on-package-fingerprints.txt: Include package fingerprints in consensus documents [CLOSED]
228-cross-certification-onionkeys.txt: Cross-certifying identity keys with onion keys [CLOSED]
229-further-socks5-extensions.txt: Further SOCKS5 extensions [REJECTED]
230-rsa1024-relay-id-migration.txt: How to change RSA1024 relay identity keys [OBSOLETE]
231-migrate-authority-rsa1024-ids.txt: Migrating authority RSA1024 identity keys [OBSOLETE]
232-pluggable-transports-through-proxy.txt: Pluggable Transport through SOCKS proxy [CLOSED]
233-quicken-tor2web-mode.txt: Making Tor2Web mode faster [REJECTED]
234-remittance-addresses.txt: Adding remittance field to directory specification [REJECTED]
235-kill-named-flag.txt: Stop assigning (and eventually supporting) the Named flag [CLOSED]
236-single-guard-node.txt: The move to a single guard node [CLOSED]
237-directory-servers-for-all.txt: All relays are directory servers [CLOSED]
238-hs-relay-stats.txt: Better hidden service stats from Tor relays [CLOSED]
239-consensus-hash-chaining.txt: Consensus Hash Chaining [OPEN]
240-auth-cert-revocation.txt: Early signing key revocation for directory authorities [OPEN]
241-suspicious-guard-turnover.txt: Resisting guard-turnover attacks [REJECTED]
242-better-families.txt: Better performance and usability for the MyFamily option [SUPERSEDED]
243-hsdir-flag-need-stable.txt: Give out HSDir flag only to relays with Stable flag [CLOSED]
244-use-rfc5705-for-tls-binding.txt: Use RFC5705 Key Exporting in our AUTHENTICATE calls [CLOSED]
245-tap-out.txt: Deprecating and removing the TAP circuit extension protocol [SUPERSEDED]
246-merge-hsdir-and-intro.txt: Merging Hidden Service Directories and Introduction Points [REJECTED]
247-hs-guard-discovery.txt: Defending Against Guard Discovery Attacks using Vanguards [SUPERSEDED]
248-removing-rsa-identities.txt: Remove all RSA identity keys [NEEDS-REVISION]
249-large-create-cells.txt: Allow CREATE cells with >505 bytes of handshake data [SUPERSEDED]
250-commit-reveal-consensus.txt: Random Number Generation During Tor Voting [CLOSED]
251-netflow-padding.txt: Padding for netflow record resolution reduction [CLOSED]
252-single-onion.txt: Single Onion Services [SUPERSEDED]
253-oob-hmac.txt: Out of Band Circuit HMACs [DEAD]
254-padding-negotiation.txt: Padding Negotiation [CLOSED]
255-hs-load-balancing.txt: Controller features to allow for load-balancing hidden services [RESERVE]
256-key-revocation.txt: Key revocation for relays and authorities [RESERVE]
257-hiding-authorities.txt: Refactoring authorities and making them more isolated from the net [META]
258-dirauth-dos.txt: Denial-of-service resistance for directory authorities [DEAD]
259-guard-selection.txt: New Guard Selection Behaviour [OBSOLETE]
260-rend-single-onion.txt: Rendezvous Single Onion Services [FINISHED]
261-aez-crypto.txt: AEZ for relay cryptography [OBSOLETE]
262-rekey-circuits.txt: Re-keying live circuits with new cryptographic material [RESERVE]
263-ntru-for-pq-handshake.txt: Request to change key exchange protocol for handshake v1.2 [OBSOLETE]
264-subprotocol-versions.txt: Putting version numbers on the Tor subprotocols [CLOSED]
265-load-balancing-with-overhead.txt: Load Balancing with Overhead Parameters [OPEN]
266-removing-current-obsolete-clients.txt: Removing current obsolete clients from the Tor network [SUPERSEDED]
267-tor-consensus-transparency.txt: Tor Consensus Transparency [OPEN]
268-guard-selection.txt: New Guard Selection Behaviour [OBSOLETE]
269-hybrid-handshake.txt: Transitionally secure hybrid handshakes [NEEDS-REVISION]
270-newhope-hybrid-handshake.txt: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [OBSOLETE]
271-another-guard-selection.txt: Another algorithm for guard selection [CLOSED]
272-valid-and-running-by-default.txt: Listed routers should be Valid, Running, and treated as such [CLOSED]
273-exit-relay-pinning.txt: Exit relay pinning for web services [RESERVE]
274-rotate-onion-keys-less.txt: Rotate onion keys less frequently [CLOSED]
275-md-published-time-is-silly.txt: Stop including meaningful "published" time in microdescriptor consensus [CLOSED]
276-lower-bw-granularity.txt: Report bandwidth with lower granularity in consensus documents [DEAD]
277-detect-id-sharing.txt: Detect multiple relay instances running with same ID [OPEN]
278-directory-compression-scheme-negotiation.txt: Directory Compression Scheme Negotiation [CLOSED]
279-naming-layer-api.txt: A Name System API for Tor Onion Services [NEEDS-REVISION]
280-privcount-in-tor.txt: Privacy-Preserving Statistics with Privcount in Tor [SUPERSEDED]
281-bulk-md-download.txt: Downloading microdescriptors in bulk [RESERVE]
282-remove-named-from-consensus.txt: Remove "Named" and "Unnamed" handling from consensus voting [ACCEPTED]
283-ipv6-in-micro-consensus.txt: Move IPv6 ORPorts from microdescriptors to the microdesc consensus [CLOSED]
284-hsv3-control-port.txt: Hidden Service v3 Control Port [CLOSED]
285-utf-8.txt: Directory documents should be standardized as UTF-8 [ACCEPTED]
286-hibernation-api.txt: Controller APIs for hibernation access on mobile [REJECTED]
287-reduce-lifetime.txt: Reduce circuit lifetime without overloading the network [OPEN]
288-privcount-with-shamir.txt: Privacy-Preserving Statistics with Privcount in Tor (Shamir version) [RESERVE]
289-authenticated-sendmes.txt: Authenticating sendme cells to mitigate bandwidth attacks [CLOSED]
290-deprecate-consensus-methods.txt: Continuously update consensus methods [META]
291-two-guard-nodes.txt: The move to two guard nodes [FINISHED]
292-mesh-vanguards.txt: Mesh-based vanguards [CLOSED]
293-know-when-to-publish.txt: Other ways for relays to know when to publish [CLOSED]
294-tls-1.3.txt: TLS 1.3 Migration [DRAFT]
295-relay-crypto-with-adl.txt: Using ADL for relay cryptography (solving the crypto-tagging attack) [OPEN]
296-expose-bandwidth-files.txt: Have Directory Authorities expose raw bandwidth list files [CLOSED]
297-safer-protover-shutdowns.txt: Relaxing the protover-based shutdown rules [CLOSED]
298-canonical-families.txt: Putting family lines in canonical form [CLOSED]
299-ip-failure-count.txt: Preferring IPv4 or IPv6 based on IP Version Failure Count [SUPERSEDED]
300-walking-onions.txt: Walking Onions: Scaling and Saving Bandwidth [INFORMATIONAL]
301-dont-vote-on-package-fingerprints.txt: Don't include package fingerprints in consensus documents [CLOSED]
302-padding-machines-for-onion-clients.txt: Hiding onion service clients using padding [CLOSED]
303-protover-removal-policy.txt: When and how to remove support for protocol versions [OPEN]
304-socks5-extending-hs-error-codes.txt: Extending SOCKS5 Onion Service Error Codes [CLOSED]
305-establish-intro-dos-defense-extention.txt: ESTABLISH_INTRO Cell DoS Defense Extension [CLOSED]
306-ipv6-happy-eyeballs.txt: A Tor Implementation of IPv6 Happy Eyeballs [OPEN]
307-onionbalance-v3.txt: Onion Balance Support for Onion Service v3 [RESERVE]
308-counter-galois-onion.txt: Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography [SUPERSEDED]
309-optimistic-socks-in-tor.txt: Optimistic SOCKS Data [OPEN]
310-bandaid-on-guard-selection.txt: Towards load-balancing in Prop 271 [CLOSED]
311-relay-ipv6-reachability.txt: Tor Relay IPv6 Reachability [ACCEPTED]
312-relay-auto-ipv6-addr.txt: Tor Relay Automatic IPv6 Address Discovery [ACCEPTED]
313-relay-ipv6-stats.txt: Tor Relay IPv6 Statistics [ACCEPTED]
314-allow-markdown-proposals.md: Allow Markdown for proposal format [CLOSED]
315-update-dir-required-fields.txt: Updating the list of fields required in directory documents [CLOSED]
316-flashflow.md: FlashFlow: A Secure Speed Test for Tor (Parent Proposal) [DRAFT]
317-secure-dns-name-resolution.txt: Improve security aspects of DNS name resolution [NEEDS-REVISION]
318-limit-protovers.md: Limit protover values to 0-63 [CLOSED]
319-wide-everything.md: RELAY_FRAGMENT cells [OBSOLETE]
320-tap-out-again.md: Removing TAP usage from v2 onion services [REJECTED]
321-happy-families.md: Better performance and usability for the MyFamily option (v2) [ACCEPTED]
322-dirport-linkspec.md: Extending link specifiers to include the directory port [OPEN]
323-walking-onions-full.md: Specification for Walking Onions [OPEN]
324-rtt-congestion-control.txt: RTT-based Congestion Control for Tor [FINISHED]
325-packed-relay-cells.md: Packed relay cells: saving space on small commands [OBSOLETE]
326-tor-relay-well-known-uri-rfc8615.md: The "tor-relay" Well-Known Resource Identifier [OPEN]
327-pow-over-intro.txt: A First Take at PoW Over Introduction Circuits [CLOSED]
328-relay-overload-report.md: Make Relays Report When They Are Overloaded [CLOSED]
329-traffic-splitting.txt: Overcoming Tor's Bottlenecks with Traffic Splitting [FINISHED]
330-authority-contact.md: Modernizing authority contact entries [OPEN]
331-res-tokens-for-anti-dos.md: Res tokens: Anonymous Credentials for Onion Service DoS Resilience [DRAFT]
332-ntor-v3-with-extra-data.md: Ntor protocol with extra data, version 3 [CLOSED]
333-vanguards-lite.md: Vanguards lite [CLOSED]
334-middle-only-flag.txt: A Directory Authority Flag To Mark Relays As Middle-only [SUPERSEDED]
335-middle-only-redux.md: An authority-only design for MiddleOnly [CLOSED]
336-randomize-guard-retries.md: Randomized schedule for guard retries [CLOSED]
337-simpler-guard-usability.md: A simpler way to decide, "Is this guard usable?" [CLOSED]
338-netinfo-y2038.md: Use an 8-byte timestamp in NETINFO cells [ACCEPTED]
339-udp-over-tor.md: UDP traffic over Tor [ACCEPTED]
340-packed-and-fragmented.md: Packed and fragmented relay messages [OPEN]
341-better-oos.md: A better algorithm for out-of-sockets eviction [OPEN]
342-decouple-hs-interval.md: Decoupling hs_interval and SRV lifetime [DRAFT]
343-rend-caa.txt: CAA Extensions for the Tor Rendezvous Specification [OPEN]
344-protocol-info-leaks.txt: Prioritizing Protocol Information Leaks in Tor [OPEN]
345-specs-in-mdbook.md: Migrating the tor specifications to mdbook [CLOSED]
346-protovers-again.md: Clarifying and extending the use of protocol versioning [OPEN]
347-domain-separation.md: Domain separation for certificate signing keys [OPEN]
348-udp-app-support.md: UDP Application Support in Tor [OPEN]
349-command-state-validation.md: Client-Side Command Acceptance Validation [DRAFT]
350-remove-tap.md: A phased plan to remove TAP onion keys [ACCEPTED]
351-socks-auth-extensions.md: Making SOCKS5 authentication extensions extensible [CLOSED]
352-complex-dns-for-vpn.md: Handling Complex DNS Traffic for VPN usage in Tor [DRAFT]
353-secure-relay-identity.md: Requiring secure relay identities in EXTEND2 [DRAFT]
Tor proposals by status
Here we have a set of proposals for changes to the Tor protocol. Some of these proposals are implemented; some are works in progress; and some will never be implemented.
Below is a list of proposals sorted by status. See BY_INDEX.md for a list of proposals sorted by number.
Active proposals by status
OPEN proposals: under discussion
These are proposals that we think are likely to be complete and ripe for discussion.
239-consensus-hash-chaining.txt: Consensus Hash Chaining
240-auth-cert-revocation.txt: Early signing key revocation for directory authorities
265-load-balancing-with-overhead.txt: Load Balancing with Overhead Parameters
267-tor-consensus-transparency.txt: Tor Consensus Transparency
277-detect-id-sharing.txt: Detect multiple relay instances running with same ID
287-reduce-lifetime.txt: Reduce circuit lifetime without overloading the network
295-relay-crypto-with-adl.txt: Using ADL for relay cryptography (solving the crypto-tagging attack)
303-protover-removal-policy.txt: When and how to remove support for protocol versions
306-ipv6-happy-eyeballs.txt: A Tor Implementation of IPv6 Happy Eyeballs
309-optimistic-socks-in-tor.txt: Optimistic SOCKS Data
322-dirport-linkspec.md: Extending link specifiers to include the directory port
323-walking-onions-full.md: Specification for Walking Onions
326-tor-relay-well-known-uri-rfc8615.md: The "tor-relay" Well-Known Resource Identifier
330-authority-contact.md: Modernizing authority contact entries
340-packed-and-fragmented.md: Packed and fragmented relay messages
341-better-oos.md: A better algorithm for out-of-sockets eviction
343-rend-caa.txt: CAA Extensions for the Tor Rendezvous Specification
344-protocol-info-leaks.txt: Prioritizing Protocol Information Leaks in Tor
346-protovers-again.md: Clarifying and extending the use of protocol versioning
347-domain-separation.md: Domain separation for certificate signing keys
348-udp-app-support.md: UDP Application Support in Tor
ACCEPTED proposals: slated for implementation
These are the proposals that we agree we'd like to implement. They might or might not have a specific timeframe planned for their implementation.
282-remove-named-from-consensus.txt: Remove "Named" and "Unnamed" handling from consensus voting
285-utf-8.txt: Directory documents should be standardized as UTF-8
311-relay-ipv6-reachability.txt: Tor Relay IPv6 Reachability
312-relay-auto-ipv6-addr.txt: Tor Relay Automatic IPv6 Address Discovery
313-relay-ipv6-stats.txt: Tor Relay IPv6 Statistics
321-happy-families.md: Better performance and usability for the MyFamily option (v2)
338-netinfo-y2038.md: Use an 8-byte timestamp in NETINFO cells
339-udp-over-tor.md: UDP traffic over Tor
350-remove-tap.md: A phased plan to remove TAP onion keys
FINISHED proposals: implemented, specs not merged
These proposals are implemented in some version of Tor; the proposals themselves still need to be merged into the specifications proper.
260-rend-single-onion.txt: Rendezvous Single Onion Services
291-two-guard-nodes.txt: The move to two guard nodes
324-rtt-congestion-control.txt: RTT-based Congestion Control for Tor
329-traffic-splitting.txt: Overcoming Tor's Bottlenecks with Traffic Splitting
META proposals: about the proposal process
These proposals describe ongoing policies and changes to the proposals process.
000-index.txt: Index of Tor Proposals
001-process.txt: The Tor Proposal Process
202-improved-relay-crypto.txt: Two improved relay encryption protocols for Tor cells
257-hiding-authorities.txt: Refactoring authorities and making them more isolated from the net
290-deprecate-consensus-methods.txt: Continuously update consensus methods
INFORMATIONAL proposals: not actually specifications
These proposals describe a process or project, but aren't actually proposed changes in the Tor specifications.
159-exit-scanning.txt: Exit Scanning
300-walking-onions.txt: Walking Onions: Scaling and Saving Bandwidth
Preliminary proposals
DRAFT proposals: incomplete works
These proposals have been marked as a draft by their author or the editors, indicating that they aren't yet in a complete form. They're still open for discussion.
294-tls-1.3.txt: TLS 1.3 Migration
316-flashflow.md: FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
331-res-tokens-for-anti-dos.md: Res tokens: Anonymous Credentials for Onion Service DoS Resilience
342-decouple-hs-interval.md: Decoupling hs_interval and SRV lifetime
349-command-state-validation.md: Client-Side Command Acceptance Validation
352-complex-dns-for-vpn.md: Handling Complex DNS Traffic for VPN usage in Tor
353-secure-relay-identity.md: Requiring secure relay identities in EXTEND2
NEEDS-REVISION proposals: ideas that we can't implement as-is
These proposals have some promise, but we can't implement them without certain changes.
212-using-old-consensus.txt: Increase Acceptable Consensus Age
219-expanded-dns.txt: Support for full DNS and DNSSEC resolution in Tor
248-removing-rsa-identities.txt: Remove all RSA identity keys
269-hybrid-handshake.txt: Transitionally secure hybrid handshakes
279-naming-layer-api.txt: A Name System API for Tor Onion Services
317-secure-dns-name-resolution.txt: Improve security aspects of DNS name resolution
NEEDS-RESEARCH proposals: blocking on research
These proposals are interesting ideas, but there's more research that would need to happen before we can know whether to implement them or not, or to fill in certain details.
(There are no proposals in this category)
Inactive proposals by status
CLOSED proposals: implemented and specified
These proposals have been implemented in some version of Tor, and the changes from the proposals have been merged into the specifications as necessary.
101-dir-voting.txt: Voting on the Tor Directory System
102-drop-opt.txt: Dropping "opt" from the directory format
103-multilevel-keys.txt: Splitting identity key from regularly used signing key
104-short-descriptors.txt: Long and Short Router Descriptors
105-handshake-revision.txt: Version negotiation for the Tor protocol
106-less-tls-constraint.txt: Checking fewer things during TLS handshakes
107-uptime-sanity-checking.txt: Uptime Sanity Checking
108-mtbf-based-stability.txt: Base "Stable" Flag on Mean Time Between Failures
109-no-sharing-ips.txt: No more than one server per IP address
110-avoid-infinite-circuits.txt: Avoiding infinite length circuits
111-local-traffic-priority.txt: Prioritizing local traffic over relayed traffic
114-distributed-storage.txt: Distributed Storage for Tor Hidden Service Descriptors
117-ipv6-exits.txt: IPv6 exits
119-controlport-auth.txt: New PROTOCOLINFO command for controllers
121-hidden-service-authentication.txt: Hidden Service Authentication
122-unnamed-flag.txt: Network status entries need a new Unnamed flag
123-autonaming.txt: Naming authorities automatically create bindings
125-bridges.txt: Behavior for bridge users, bridge relays, and bridge authorities
126-geoip-reporting.txt: Getting GeoIP data and publishing usage summaries
129-reject-plaintext-ports.txt: Block Insecure Protocols by Default
130-v2-conn-protocol.txt: Version 2 Tor connection protocol
135-private-tor-networks.txt: Simplify Configuration of Private Tor Networks
136-legacy-keys.txt: Mass authority migration with legacy keys
137-bootstrap-phases.txt: Keep controllers informed as Tor bootstraps
138-remove-down-routers-from-consensus.txt: Remove routers that are not Running from consensus documents
139-conditional-consensus-download.txt: Download consensus documents only when it will be trusted
140-consensus-diffs.txt: Provide diffs between consensuses
148-uniform-client-end-reason.txt: Stream end reasons from the client side should be uniform
150-exclude-exit-nodes.txt: Exclude Exit Nodes from a circuit
151-path-selection-improvements.txt: Improving Tor Path Selection
152-single-hop-circuits.txt: Optionally allow exit from single-hop circuits
155-four-hidden-service-improvements.txt: Four Improvements of Hidden Service Performance
157-specific-cert-download.txt: Make certificate downloads specific
158-microdescriptors.txt: Clients download consensus + microdescriptors
160-bandwidth-offset.txt: Authorities vote for bandwidth offsets in consensus
161-computing-bandwidth-adjustments.txt: Computing Bandwidth Adjustments
162-consensus-flavors.txt: Publish the consensus in multiple flavors
166-statistics-extra-info-docs.txt: Including Network Statistics in Extra-Info Documents
167-params-in-consensus.txt: Vote on network parameters in consensus
171-separate-streams.txt: Separate streams across circuits by connection metadata
174-optimistic-data-server.txt: Optimistic Data for Tor: Server Side
176-revising-handshake.txt: Proposed version-3 link handshake for Tor
178-param-voting.txt: Require majority of authorities to vote for consensus parameters
179-TLS-cert-and-parameter-normalization.txt: TLS certificate and parameter normalization
180-pluggable-transport.txt: Pluggable transports for circumvention
181-optimistic-data-client.txt: Optimistic Data for Tor: Client Side
183-refillintervals.txt: Refill Intervals
184-v3-link-protocol.txt: Miscellaneous changes for a v3 Tor link protocol
186-multiple-orports.txt: Multiple addresses for one OR or bridge
187-allow-client-auth.txt: Reserve a cell type to allow client authorization
193-safe-cookie-authentication.txt: Safe cookie authentication for Tor controllers
196-transport-control-ports.txt: Extended ORPort and TransportControlPort
198-restore-clienthello-semantics.txt: Restore semantics of TLS ClientHello
200-new-create-and-extend-cells.txt: Adding new, extensible CREATE, EXTEND, and related cells
204-hidserv-subdomains.txt: Subdomain support for Hidden Service addresses
205-local-dnscache.txt: Remove global client-side DNS caching
206-directory-sources.txt: Preconfigured directory sources for bootstrapping
207-directory-guards.txt: Directory guards
208-ipv6-exits-redux.txt: IPv6 Exits Redux
214-longer-circids.txt: Allow 4-byte circuit IDs in a new link protocol
215-update-min-consensus-ver.txt: Let the minimum consensus method change with time
216-ntor-handshake.txt: Improved circuit-creation key exchange
217-ext-orport-auth.txt: Tor Extended ORPort Authentication
218-usage-controller-events.txt: Controller events to better understand connection/circuit usage
220-ecc-id-keys.txt: Migrate server identity keys to Ed25519
221-stop-using-create-fast.txt: Stop using CREATE_FAST
222-remove-client-timestamps.txt: Stop sending client timestamps
224-rend-spec-ng.txt: Next-Generation Hidden Services in Tor
227-vote-on-package-fingerprints.txt: Include package fingerprints in consensus documents
228-cross-certification-onionkeys.txt: Cross-certifying identity keys with onion keys
232-pluggable-transports-through-proxy.txt: Pluggable Transport through SOCKS proxy
235-kill-named-flag.txt: Stop assigning (and eventually supporting) the Named flag
236-single-guard-node.txt: The move to a single guard node
237-directory-servers-for-all.txt: All relays are directory servers
238-hs-relay-stats.txt: Better hidden service stats from Tor relays
243-hsdir-flag-need-stable.txt: Give out HSDir flag only to relays with Stable flag
244-use-rfc5705-for-tls-binding.txt: Use RFC5705 Key Exporting in our AUTHENTICATE calls
250-commit-reveal-consensus.txt: Random Number Generation During Tor Voting
251-netflow-padding.txt: Padding for netflow record resolution reduction
254-padding-negotiation.txt: Padding Negotiation
264-subprotocol-versions.txt: Putting version numbers on the Tor subprotocols
271-another-guard-selection.txt: Another algorithm for guard selection
272-valid-and-running-by-default.txt: Listed routers should be Valid, Running, and treated as such
274-rotate-onion-keys-less.txt: Rotate onion keys less frequently
: Rotate onion keys less frequently275-md-published-time-is-silly.txt
: Stop including meaningful "published" time in microdescriptor consensus278-directory-compression-scheme-negotiation.txt
: Directory Compression Scheme Negotiation283-ipv6-in-micro-consensus.txt
: Move IPv6 ORPorts from microdescriptors to the microdesc consensus284-hsv3-control-port.txt
: Hidden Service v3 Control Port289-authenticated-sendmes.txt
: Authenticating sendme cells to mitigate bandwidth attacks292-mesh-vanguards.txt
: Mesh-based vanguards293-know-when-to-publish.txt
: Other ways for relays to know when to publish296-expose-bandwidth-files.txt
: Have Directory Authorities expose raw bandwidth list files297-safer-protover-shutdowns.txt
: Relaxing the protover-based shutdown rules298-canonical-families.txt
: Putting family lines in canonical form301-dont-vote-on-package-fingerprints.txt
: Don't include package fingerprints in consensus documents302-padding-machines-for-onion-clients.txt
: Hiding onion service clients using padding304-socks5-extending-hs-error-codes.txt
: Extending SOCKS5 Onion Service Error Codes305-establish-intro-dos-defense-extention.txt
: ESTABLISH_INTRO Cell DoS Defense Extension310-bandaid-on-guard-selection.txt
: Towards load-balancing in Prop 271314-allow-markdown-proposals.md
: Allow Markdown for proposal format315-update-dir-required-fields.txt
: Updating the list of fields required in directory documents318-limit-protovers.md
: Limit protover values to 0-63327-pow-over-intro.txt
: A First Take at PoW Over Introduction Circuits328-relay-overload-report.md
: Make Relays Report When They Are Overloaded332-ntor-v3-with-extra-data.md
: Ntor protocol with extra data, version 3333-vanguards-lite.md
: Vanguards lite335-middle-only-redux.md
: An authority-only design for MiddleOnly336-randomize-guard-retries.md
: Randomized schedule for guard retries337-simpler-guard-usability.md
: A simpler way to decide, "Is this guard usable?"345-specs-in-mdbook.md
: Migrating the tor specifications to mdbook351-socks-auth-extensions.md
: Making SOCKS5 authentication extensions extensible
RESERVE proposals: saving for later
These proposals aren't anything we plan to implement soon, but for one reason or another we think they might be a good idea in the future. We're keeping them around as a reference in case we someday confront the problems that they try to solve.
133-unreachable-ors.txt: Incorporate Unreachable ORs into the Tor Network
172-circ-getinfo-option.txt: GETINFO controller option for circuit information
177-flag-abstention.txt: Abstaining from votes on individual flags
188-bridge-guards.txt: Bridge Guards and other anti-enumeration defenses
201-bridge-v3-reqs-stats.txt: Make bridges report statistics on daily v3 network status requests
211-mapaddress-tor-status.txt: Internal Mapaddress for Tor Configuration Testing
223-ace-handshake.txt: Ace: Improved circuit-creation key exchange
226-bridgedb-database-improvements.txt: "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS"
255-hs-load-balancing.txt: Controller features to allow for load-balancing hidden services
256-key-revocation.txt: Key revocation for relays and authorities
262-rekey-circuits.txt: Re-keying live circuits with new cryptographic material
273-exit-relay-pinning.txt: Exit relay pinning for web services
281-bulk-md-download.txt: Downloading microdescriptors in bulk
288-privcount-with-shamir.txt: Privacy-Preserving Statistics with Privcount in Tor (Shamir version)
307-onionbalance-v3.txt: Onion Balance Support for Onion Service v3
SUPERSEDED proposals: replaced by something else
These proposals were obsoleted by a later proposal before they were implemented.
112-bring-back-pathlencoinweight.txt: Bring Back Pathlen Coin Weight
113-fast-authority-interface.txt: Simplifying directory authority administration
118-multiple-orports.txt: Advertising multiple ORPorts at once
124-tls-certificates.txt: Blocking resistant TLS certificate usage
143-distributed-storage-improvements.txt: Improvements of Distributed Storage for Tor Hidden Service Descriptors
145-newguard-flag.txt: Separate "suitable as a guard" from "suitable as a new guard"
146-long-term-stability.txt: Add new flag to reflect long-term stability
149-using-netinfo-data.txt: Using data from NETINFO cells
153-automatic-software-update-protocol.txt: Automatic software update protocol
154-automatic-updates.txt: Automatic Software Update Protocol
156-tracking-blocked-ports.txt: Tracking blocked ports on the client side
163-detecting-clients.txt: Detecting whether a connection comes from a client
169-eliminating-renegotiation.txt: Eliminate TLS renegotiation for the Tor connection handshake
170-user-path-config.txt: Configuration options regarding circuit building
185-dir-without-dirport.txt: Directory caches without DirPort
194-mnemonic-urls.txt: Mnemonic .onion URLs
210-faster-headless-consensus-bootstrap.txt: Faster Headless Consensus Bootstrapping
225-strawman-shared-rand.txt: Strawman proposal: commit-and-reveal shared rng
242-better-families.txt: Better performance and usability for the MyFamily option
245-tap-out.txt: Deprecating and removing the TAP circuit extension protocol
247-hs-guard-discovery.txt: Defending Against Guard Discovery Attacks using Vanguards
249-large-create-cells.txt: Allow CREATE cells with >505 bytes of handshake data
252-single-onion.txt: Single Onion Services
266-removing-current-obsolete-clients.txt: Removing current obsolete clients from the Tor network
280-privcount-in-tor.txt: Privacy-Preserving Statistics with Privcount in Tor
299-ip-failure-count.txt: Preferring IPv4 or IPv6 based on IP Version Failure Count
308-counter-galois-onion.txt: Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
334-middle-only-flag.txt: A Directory Authority Flag To Mark Relays As Middle-only
DEAD, REJECTED, OBSOLETE proposals: not in our plans
These proposals are not on-track for discussion or implementation. Either discussion has stalled out (the proposal is DEAD), the proposal has been considered and not adopted (the proposal is REJECTED), or the proposal addresses an issue or a solution that is no longer relevant (the proposal is OBSOLETE).
098-todo.txt: Proposals that should be written [OBSOLETE]
099-misc.txt: Miscellaneous proposals [OBSOLETE]
100-tor-spec-udp.txt: Tor Unreliable Datagram Extension Proposal [DEAD]
115-two-hop-paths.txt: Two Hop Paths [DEAD]
116-two-hop-paths-from-guard.txt: Two hop paths from entry guards [DEAD]
120-shutdown-descriptors.txt: Shutdown descriptors when Tor servers stop [DEAD]
127-dirport-mirrors-downloads.txt: Relaying dirport requests to Tor download site / website [OBSOLETE]
128-bridge-families.txt: Families of private bridges [DEAD]
131-verify-tor-usage.txt: Help users to verify they are using Tor [OBSOLETE]
132-browser-check-tor-service.txt: A Tor Web Service For Verifying Correct Browser Configuration [OBSOLETE]
134-robust-voting.txt: More robust consensus voting with diverse authority sets [REJECTED]
141-jit-sd-downloads.txt: Download server descriptors on demand [OBSOLETE]
142-combine-intro-and-rend-points.txt: Combine Introduction and Rendezvous Points [DEAD]
144-enforce-distinct-providers.txt: Increase the diversity of circuits by detecting nodes belonging the same provider [OBSOLETE]
147-prevoting-opinions.txt: Eliminate the need for v2 directories in generating v3 directories [REJECTED]
164-reporting-server-status.txt: Reporting the status of server votes [OBSOLETE]
165-simple-robust-voting.txt: Easy migration for voting authority sets [REJECTED]
168-reduce-circwindow.txt: Reduce default circuit window [REJECTED]
173-getinfo-option-expansion.txt: GETINFO Option Expansion [OBSOLETE]
175-automatic-node-promotion.txt: Automatically promoting Tor clients to nodes [REJECTED]
182-creditbucket.txt: Credit Bucket [OBSOLETE]
189-authorize-cell.txt: AUTHORIZE and AUTHORIZED cells [OBSOLETE]
190-shared-secret-bridge-authorization.txt: Bridge Client Authorization Based on a Shared Secret [OBSOLETE]
191-mitm-bridge-detection-resistance.txt: Bridge Detection Resistance against MITM-capable Adversaries [OBSOLETE]
192-store-bridge-information.txt: Automatically retrieve and store information about bridges [OBSOLETE]
195-TLS-normalization-for-024.txt: TLS certificate normalization for Tor 0.2.4.x [DEAD]
197-postmessage-ipc.txt: Message-based Inter-Controller IPC Channel [REJECTED]
199-bridgefinder-integration.txt: Integration of BridgeFinder and BridgeFinderHelper [OBSOLETE]
203-https-frontend.txt: Avoiding censorship by impersonating an HTTPS server [OBSOLETE]
209-path-bias-tuning.txt: Tuning the Parameters for the Path Bias Defense [OBSOLETE]
213-remove-stream-sendmes.txt: Remove stream-level sendmes from the design [DEAD]
229-further-socks5-extensions.txt: Further SOCKS5 extensions [REJECTED]
230-rsa1024-relay-id-migration.txt: How to change RSA1024 relay identity keys [OBSOLETE]
231-migrate-authority-rsa1024-ids.txt: Migrating authority RSA1024 identity keys [OBSOLETE]
233-quicken-tor2web-mode.txt: Making Tor2Web mode faster [REJECTED]
234-remittance-addresses.txt: Adding remittance field to directory specification [REJECTED]
241-suspicious-guard-turnover.txt: Resisting guard-turnover attacks [REJECTED]
246-merge-hsdir-and-intro.txt: Merging Hidden Service Directories and Introduction Points [REJECTED]
253-oob-hmac.txt: Out of Band Circuit HMACs [DEAD]
258-dirauth-dos.txt: Denial-of-service resistance for directory authorities [DEAD]
259-guard-selection.txt: New Guard Selection Behaviour [OBSOLETE]
261-aez-crypto.txt: AEZ for relay cryptography [OBSOLETE]
263-ntru-for-pq-handshake.txt: Request to change key exchange protocol for handshake v1.2 [OBSOLETE]
268-guard-selection.txt: New Guard Selection Behaviour [OBSOLETE]
270-newhope-hybrid-handshake.txt: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [OBSOLETE]
276-lower-bw-granularity.txt: Report bandwidth with lower granularity in consensus documents [DEAD]
286-hibernation-api.txt: Controller APIs for hibernation access on mobile [REJECTED]
319-wide-everything.md: RELAY_FRAGMENT cells [OBSOLETE]
320-tap-out-again.md: Removing TAP usage from v2 onion services [REJECTED]
325-packed-relay-cells.md: Packed relay cells: saving space on small commands [OBSOLETE]
Filename: 000-index.txt
Title: Index of Tor Proposals
Author: Nick Mathewson
Created: 26-Jan-2007
Status: Meta
Overview:
This document provides an index to Tor proposals.
This is an informational document.
Everything in this document below the line of '=' signs is automatically
generated by reindex.py; do not edit by hand.
============================================================
Proposals by number:
000 Index of Tor Proposals [META]
001 The Tor Proposal Process [META]
098 Proposals that should be written [OBSOLETE]
099 Miscellaneous proposals [OBSOLETE]
100 Tor Unreliable Datagram Extension Proposal [DEAD]
101 Voting on the Tor Directory System [CLOSED]
102 Dropping "opt" from the directory format [CLOSED]
103 Splitting identity key from regularly used signing key [CLOSED]
104 Long and Short Router Descriptors [CLOSED]
105 Version negotiation for the Tor protocol [CLOSED]
106 Checking fewer things during TLS handshakes [CLOSED]
107 Uptime Sanity Checking [CLOSED]
108 Base "Stable" Flag on Mean Time Between Failures [CLOSED]
109 No more than one server per IP address [CLOSED]
110 Avoiding infinite length circuits [CLOSED]
111 Prioritizing local traffic over relayed traffic [CLOSED]
112 Bring Back Pathlen Coin Weight [SUPERSEDED]
113 Simplifying directory authority administration [SUPERSEDED]
114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
115 Two Hop Paths [DEAD]
116 Two hop paths from entry guards [DEAD]
117 IPv6 exits [CLOSED]
118 Advertising multiple ORPorts at once [SUPERSEDED]
119 New PROTOCOLINFO command for controllers [CLOSED]
120 Shutdown descriptors when Tor servers stop [DEAD]
121 Hidden Service Authentication [CLOSED]
122 Network status entries need a new Unnamed flag [CLOSED]
123 Naming authorities automatically create bindings [CLOSED]
124 Blocking resistant TLS certificate usage [SUPERSEDED]
125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
126 Getting GeoIP data and publishing usage summaries [CLOSED]
127 Relaying dirport requests to Tor download site / website [OBSOLETE]
128 Families of private bridges [DEAD]
129 Block Insecure Protocols by Default [CLOSED]
130 Version 2 Tor connection protocol [CLOSED]
131 Help users to verify they are using Tor [OBSOLETE]
132 A Tor Web Service For Verifying Correct Browser Configuration [OBSOLETE]
133 Incorporate Unreachable ORs into the Tor Network [RESERVE]
134 More robust consensus voting with diverse authority sets [REJECTED]
135 Simplify Configuration of Private Tor Networks [CLOSED]
136 Mass authority migration with legacy keys [CLOSED]
137 Keep controllers informed as Tor bootstraps [CLOSED]
138 Remove routers that are not Running from consensus documents [CLOSED]
139 Download consensus documents only when it will be trusted [CLOSED]
140 Provide diffs between consensuses [CLOSED]
141 Download server descriptors on demand [OBSOLETE]
142 Combine Introduction and Rendezvous Points [DEAD]
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [SUPERSEDED]
144 Increase the diversity of circuits by detecting nodes belonging the same provider [OBSOLETE]
145 Separate "suitable as a guard" from "suitable as a new guard" [SUPERSEDED]
146 Add new flag to reflect long-term stability [SUPERSEDED]
147 Eliminate the need for v2 directories in generating v3 directories [REJECTED]
148 Stream end reasons from the client side should be uniform [CLOSED]
149 Using data from NETINFO cells [SUPERSEDED]
150 Exclude Exit Nodes from a circuit [CLOSED]
151 Improving Tor Path Selection [CLOSED]
152 Optionally allow exit from single-hop circuits [CLOSED]
153 Automatic software update protocol [SUPERSEDED]
154 Automatic Software Update Protocol [SUPERSEDED]
155 Four Improvements of Hidden Service Performance [CLOSED]
156 Tracking blocked ports on the client side [SUPERSEDED]
157 Make certificate downloads specific [CLOSED]
158 Clients download consensus + microdescriptors [CLOSED]
159 Exit Scanning [INFORMATIONAL]
160 Authorities vote for bandwidth offsets in consensus [CLOSED]
161 Computing Bandwidth Adjustments [CLOSED]
162 Publish the consensus in multiple flavors [CLOSED]
163 Detecting whether a connection comes from a client [SUPERSEDED]
164 Reporting the status of server votes [OBSOLETE]
165 Easy migration for voting authority sets [REJECTED]
166 Including Network Statistics in Extra-Info Documents [CLOSED]
167 Vote on network parameters in consensus [CLOSED]
168 Reduce default circuit window [REJECTED]
169 Eliminate TLS renegotiation for the Tor connection handshake [SUPERSEDED]
170 Configuration options regarding circuit building [SUPERSEDED]
171 Separate streams across circuits by connection metadata [CLOSED]
172 GETINFO controller option for circuit information [RESERVE]
173 GETINFO Option Expansion [OBSOLETE]
174 Optimistic Data for Tor: Server Side [CLOSED]
175 Automatically promoting Tor clients to nodes [REJECTED]
176 Proposed version-3 link handshake for Tor [CLOSED]
177 Abstaining from votes on individual flags [RESERVE]
178 Require majority of authorities to vote for consensus parameters [CLOSED]
179 TLS certificate and parameter normalization [CLOSED]
180 Pluggable transports for circumvention [CLOSED]
181 Optimistic Data for Tor: Client Side [CLOSED]
182 Credit Bucket [OBSOLETE]
183 Refill Intervals [CLOSED]
184 Miscellaneous changes for a v3 Tor link protocol [CLOSED]
185 Directory caches without DirPort [SUPERSEDED]
186 Multiple addresses for one OR or bridge [CLOSED]
187 Reserve a cell type to allow client authorization [CLOSED]
188 Bridge Guards and other anti-enumeration defenses [RESERVE]
189 AUTHORIZE and AUTHORIZED cells [OBSOLETE]
190 Bridge Client Authorization Based on a Shared Secret [OBSOLETE]
191 Bridge Detection Resistance against MITM-capable Adversaries [OBSOLETE]
192 Automatically retrieve and store information about bridges [OBSOLETE]
193 Safe cookie authentication for Tor controllers [CLOSED]
194 Mnemonic .onion URLs [SUPERSEDED]
195 TLS certificate normalization for Tor 0.2.4.x [DEAD]
196 Extended ORPort and TransportControlPort [CLOSED]
197 Message-based Inter-Controller IPC Channel [REJECTED]
198 Restore semantics of TLS ClientHello [CLOSED]
199 Integration of BridgeFinder and BridgeFinderHelper [OBSOLETE]
200 Adding new, extensible CREATE, EXTEND, and related cells [CLOSED]
201 Make bridges report statistics on daily v3 network status requests [RESERVE]
202 Two improved relay encryption protocols for Tor cells [META]
203 Avoiding censorship by impersonating an HTTPS server [OBSOLETE]
204 Subdomain support for Hidden Service addresses [CLOSED]
205 Remove global client-side DNS caching [CLOSED]
206 Preconfigured directory sources for bootstrapping [CLOSED]
207 Directory guards [CLOSED]
208 IPv6 Exits Redux [CLOSED]
209 Tuning the Parameters for the Path Bias Defense [OBSOLETE]
210 Faster Headless Consensus Bootstrapping [SUPERSEDED]
211 Internal Mapaddress for Tor Configuration Testing [RESERVE]
212 Increase Acceptable Consensus Age [NEEDS-REVISION]
213 Remove stream-level sendmes from the design [DEAD]
214 Allow 4-byte circuit IDs in a new link protocol [CLOSED]
215 Let the minimum consensus method change with time [CLOSED]
216 Improved circuit-creation key exchange [CLOSED]
217 Tor Extended ORPort Authentication [CLOSED]
218 Controller events to better understand connection/circuit usage [CLOSED]
219 Support for full DNS and DNSSEC resolution in Tor [NEEDS-REVISION]
220 Migrate server identity keys to Ed25519 [CLOSED]
221 Stop using CREATE_FAST [CLOSED]
222 Stop sending client timestamps [CLOSED]
223 Ace: Improved circuit-creation key exchange [RESERVE]
224 Next-Generation Hidden Services in Tor [CLOSED]
225 Strawman proposal: commit-and-reveal shared rng [SUPERSEDED]
226 "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS" [RESERVE]
227 Include package fingerprints in consensus documents [CLOSED]
228 Cross-certifying identity keys with onion keys [CLOSED]
229 Further SOCKS5 extensions [REJECTED]
230 How to change RSA1024 relay identity keys [OBSOLETE]
231 Migrating authority RSA1024 identity keys [OBSOLETE]
232 Pluggable Transport through SOCKS proxy [CLOSED]
233 Making Tor2Web mode faster [REJECTED]
234 Adding remittance field to directory specification [REJECTED]
235 Stop assigning (and eventually supporting) the Named flag [CLOSED]
236 The move to a single guard node [CLOSED]
237 All relays are directory servers [CLOSED]
238 Better hidden service stats from Tor relays [CLOSED]
239 Consensus Hash Chaining [OPEN]
240 Early signing key revocation for directory authorities [OPEN]
241 Resisting guard-turnover attacks [REJECTED]
242 Better performance and usability for the MyFamily option [SUPERSEDED]
243 Give out HSDir flag only to relays with Stable flag [CLOSED]
244 Use RFC5705 Key Exporting in our AUTHENTICATE calls [CLOSED]
245 Deprecating and removing the TAP circuit extension protocol [SUPERSEDED]
246 Merging Hidden Service Directories and Introduction Points [REJECTED]
247 Defending Against Guard Discovery Attacks using Vanguards [SUPERSEDED]
248 Remove all RSA identity keys [NEEDS-REVISION]
249 Allow CREATE cells with >505 bytes of handshake data [SUPERSEDED]
250 Random Number Generation During Tor Voting [CLOSED]
251 Padding for netflow record resolution reduction [CLOSED]
252 Single Onion Services [SUPERSEDED]
253 Out of Band Circuit HMACs [DEAD]
254 Padding Negotiation [CLOSED]
255 Controller features to allow for load-balancing hidden services [RESERVE]
256 Key revocation for relays and authorities [RESERVE]
257 Refactoring authorities and making them more isolated from the net [META]
258 Denial-of-service resistance for directory authorities [DEAD]
259 New Guard Selection Behaviour [OBSOLETE]
260 Rendezvous Single Onion Services [FINISHED]
261 AEZ for relay cryptography [OBSOLETE]
262 Re-keying live circuits with new cryptographic material [RESERVE]
263 Request to change key exchange protocol for handshake v1.2 [OBSOLETE]
264 Putting version numbers on the Tor subprotocols [CLOSED]
265 Load Balancing with Overhead Parameters [OPEN]
266 Removing current obsolete clients from the Tor network [SUPERSEDED]
267 Tor Consensus Transparency [OPEN]
268 New Guard Selection Behaviour [OBSOLETE]
269 Transitionally secure hybrid handshakes [NEEDS-REVISION]
270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [OBSOLETE]
271 Another algorithm for guard selection [CLOSED]
272 Listed routers should be Valid, Running, and treated as such [CLOSED]
273 Exit relay pinning for web services [RESERVE]
274 Rotate onion keys less frequently [CLOSED]
275 Stop including meaningful "published" time in microdescriptor consensus [CLOSED]
276 Report bandwidth with lower granularity in consensus documents [DEAD]
277 Detect multiple relay instances running with same ID [OPEN]
278 Directory Compression Scheme Negotiation [CLOSED]
279 A Name System API for Tor Onion Services [NEEDS-REVISION]
280 Privacy-Preserving Statistics with Privcount in Tor [SUPERSEDED]
281 Downloading microdescriptors in bulk [RESERVE]
282 Remove "Named" and "Unnamed" handling from consensus voting [ACCEPTED]
283 Move IPv6 ORPorts from microdescriptors to the microdesc consensus [CLOSED]
284 Hidden Service v3 Control Port [CLOSED]
285 Directory documents should be standardized as UTF-8 [ACCEPTED]
286 Controller APIs for hibernation access on mobile [REJECTED]
287 Reduce circuit lifetime without overloading the network [OPEN]
288 Privacy-Preserving Statistics with Privcount in Tor (Shamir version) [RESERVE]
289 Authenticating sendme cells to mitigate bandwidth attacks [CLOSED]
290 Continuously update consensus methods [META]
291 The move to two guard nodes [FINISHED]
292 Mesh-based vanguards [CLOSED]
293 Other ways for relays to know when to publish [CLOSED]
294 TLS 1.3 Migration [DRAFT]
295 Using ADL for relay cryptography (solving the crypto-tagging attack) [OPEN]
296 Have Directory Authorities expose raw bandwidth list files [CLOSED]
297 Relaxing the protover-based shutdown rules [CLOSED]
298 Putting family lines in canonical form [CLOSED]
299 Preferring IPv4 or IPv6 based on IP Version Failure Count [SUPERSEDED]
300 Walking Onions: Scaling and Saving Bandwidth [INFORMATIONAL]
301 Don't include package fingerprints in consensus documents [CLOSED]
302 Hiding onion service clients using padding [CLOSED]
303 When and how to remove support for protocol versions [OPEN]
304 Extending SOCKS5 Onion Service Error Codes [CLOSED]
305 ESTABLISH_INTRO Cell DoS Defense Extension [CLOSED]
306 A Tor Implementation of IPv6 Happy Eyeballs [OPEN]
307 Onion Balance Support for Onion Service v3 [RESERVE]
308 Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography [SUPERSEDED]
309 Optimistic SOCKS Data [OPEN]
310 Towards load-balancing in Prop 271 [CLOSED]
311 Tor Relay IPv6 Reachability [ACCEPTED]
312 Tor Relay Automatic IPv6 Address Discovery [ACCEPTED]
313 Tor Relay IPv6 Statistics [ACCEPTED]
314 Allow Markdown for proposal format [CLOSED]
315 Updating the list of fields required in directory documents [CLOSED]
316 FlashFlow: A Secure Speed Test for Tor (Parent Proposal) [DRAFT]
317 Improve security aspects of DNS name resolution [NEEDS-REVISION]
318 Limit protover values to 0-63 [CLOSED]
319 RELAY_FRAGMENT cells [OBSOLETE]
320 Removing TAP usage from v2 onion services [REJECTED]
321 Better performance and usability for the MyFamily option (v2) [ACCEPTED]
322 Extending link specifiers to include the directory port [OPEN]
323 Specification for Walking Onions [OPEN]
324 RTT-based Congestion Control for Tor [FINISHED]
325 Packed relay cells: saving space on small commands [OBSOLETE]
326 The "tor-relay" Well-Known Resource Identifier [OPEN]
327 A First Take at PoW Over Introduction Circuits [CLOSED]
328 Make Relays Report When They Are Overloaded [CLOSED]
329 Overcoming Tor's Bottlenecks with Traffic Splitting [FINISHED]
330 Modernizing authority contact entries [OPEN]
331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience [DRAFT]
332 Ntor protocol with extra data, version 3 [CLOSED]
333 Vanguards lite [CLOSED]
334 A Directory Authority Flag To Mark Relays As Middle-only [SUPERSEDED]
335 An authority-only design for MiddleOnly [CLOSED]
336 Randomized schedule for guard retries [CLOSED]
337 A simpler way to decide, "Is this guard usable?" [CLOSED]
338 Use an 8-byte timestamp in NETINFO cells [ACCEPTED]
339 UDP traffic over Tor [ACCEPTED]
340 Packed and fragmented relay messages [OPEN]
341 A better algorithm for out-of-sockets eviction [OPEN]
342 Decoupling hs_interval and SRV lifetime [DRAFT]
343 CAA Extensions for the Tor Rendezvous Specification [OPEN]
344 Prioritizing Protocol Information Leaks in Tor [OPEN]
345 Migrating the tor specifications to mdbook [CLOSED]
346 Clarifying and extending the use of protocol versioning [OPEN]
347 Domain separation for certificate signing keys [OPEN]
348 UDP Application Support in Tor [OPEN]
349 Client-Side Command Acceptance Validation [DRAFT]
350 A phased plan to remove TAP onion keys [ACCEPTED]
351 Making SOCKS5 authentication extensions extensible [CLOSED]
352 Handling Complex DNS Traffic for VPN usage in Tor [DRAFT]
353 Requiring secure relay identities in EXTEND2 [DRAFT]
Proposals by status:
DRAFT:
294 TLS 1.3 Migration
316 FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience
342 Decoupling hs_interval and SRV lifetime
349 Client-Side Command Acceptance Validation
352 Handling Complex DNS Traffic for VPN usage in Tor
353 Requiring secure relay identities in EXTEND2
NEEDS-REVISION:
212 Increase Acceptable Consensus Age [for 0.2.4.x+]
219 Support for full DNS and DNSSEC resolution in Tor [for 0.2.5.x]
248 Remove all RSA identity keys
269 Transitionally secure hybrid handshakes
279 A Name System API for Tor Onion Services
317 Improve security aspects of DNS name resolution
OPEN:
239 Consensus Hash Chaining
240 Early signing key revocation for directory authorities
265 Load Balancing with Overhead Parameters [for arti-dirauth]
267 Tor Consensus Transparency
277 Detect multiple relay instances running with same ID [for 0.3.??]
287 Reduce circuit lifetime without overloading the network
295 Using ADL for relay cryptography (solving the crypto-tagging attack)
303 When and how to remove support for protocol versions
306 A Tor Implementation of IPv6 Happy Eyeballs
309 Optimistic SOCKS Data
322 Extending link specifiers to include the directory port
323 Specification for Walking Onions
326 The "tor-relay" Well-Known Resource Identifier
330 Modernizing authority contact entries
340 Packed and fragmented relay messages
341 A better algorithm for out-of-sockets eviction
343 CAA Extensions for the Tor Rendezvous Specification
344 Prioritizing Protocol Information Leaks in Tor
346 Clarifying and extending the use of protocol versioning
347 Domain separation for certificate signing keys
348 UDP Application Support in Tor
ACCEPTED:
282 Remove "Named" and "Unnamed" handling from consensus voting [for arti-dirauth]
285 Directory documents should be standardized as UTF-8 [for arti-dirauth]
311 Tor Relay IPv6 Reachability
312 Tor Relay Automatic IPv6 Address Discovery
313 Tor Relay IPv6 Statistics
321 Better performance and usability for the MyFamily option (v2)
338 Use an 8-byte timestamp in NETINFO cells
339 UDP traffic over Tor
350 A phased plan to remove TAP onion keys
META:
000 Index of Tor Proposals
001 The Tor Proposal Process
202 Two improved relay encryption protocols for Tor cells
257 Refactoring authorities and making them more isolated from the net
290 Continuously update consensus methods
FINISHED:
260 Rendezvous Single Onion Services [in 0.2.9.3-alpha]
291 The move to two guard nodes
324 RTT-based Congestion Control for Tor
329 Overcoming Tor's Bottlenecks with Traffic Splitting
CLOSED:
101 Voting on the Tor Directory System [in 0.2.0.x]
102 Dropping "opt" from the directory format [in 0.2.0.x]
103 Splitting identity key from regularly used signing key [in 0.2.0.x]
104 Long and Short Router Descriptors [in 0.2.0.x]
105 Version negotiation for the Tor protocol [in 0.2.0.x]
106 Checking fewer things during TLS handshakes [in 0.2.0.x]
107 Uptime Sanity Checking [in 0.2.0.x]
108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x]
109 No more than one server per IP address [in 0.2.0.x]
110 Avoiding infinite length circuits [for 0.2.3.x] [in 0.2.1.3-alpha, 0.2.3.11-alpha]
111 Prioritizing local traffic over relayed traffic [in 0.2.0.x]
114 Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x]
117 IPv6 exits [for 0.2.4.x] [in 0.2.4.7-alpha]
119 New PROTOCOLINFO command for controllers [in 0.2.0.x]
121 Hidden Service Authentication [in 0.2.1.x]
122 Network status entries need a new Unnamed flag [in 0.2.0.x]
123 Naming authorities automatically create bindings [in 0.2.0.x]
125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x]
126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x]
129 Block Insecure Protocols by Default [in 0.2.0.x]
130 Version 2 Tor connection protocol [in 0.2.0.x]
135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha]
136 Mass authority migration with legacy keys [in 0.2.0.x]
137 Keep controllers informed as Tor bootstraps [in 0.2.1.x]
138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha]
139 Download consensus documents only when it will be trusted [in 0.2.1.x]
140 Provide diffs between consensuses [in 0.3.1.1-alpha]
148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
151 Improving Tor Path Selection [in 0.2.2.2-alpha]
152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
157 Make certificate downloads specific [for 0.2.4.x]
158 Clients download consensus + microdescriptors [in 0.2.3.1-alpha]
160 Authorities vote for bandwidth offsets in consensus [for 0.2.1.x]
161 Computing Bandwidth Adjustments [for 0.2.1.x]
162 Publish the consensus in multiple flavors [in 0.2.3.1-alpha]
166 Including Network Statistics in Extra-Info Documents [for 0.2.2]
167 Vote on network parameters in consensus [in 0.2.2]
171 Separate streams across circuits by connection metadata [in 0.2.3.3-alpha]
174 Optimistic Data for Tor: Server Side [in 0.2.3.1-alpha]
176 Proposed version-3 link handshake for Tor [for 0.2.3]
178 Require majority of authorities to vote for consensus parameters [in 0.2.3.9-alpha]
179 TLS certificate and parameter normalization [for 0.2.3.x]
180 Pluggable transports for circumvention [in 0.2.3.x]
181 Optimistic Data for Tor: Client Side [in 0.2.3.3-alpha]
183 Refill Intervals [in 0.2.3.5-alpha]
184 Miscellaneous changes for a v3 Tor link protocol [for 0.2.3.x]
186 Multiple addresses for one OR or bridge [for 0.2.4.x+]
187 Reserve a cell type to allow client authorization [for 0.2.3.x]
193 Safe cookie authentication for Tor controllers
196 Extended ORPort and TransportControlPort [in 0.2.5.2-alpha]
198 Restore semantics of TLS ClientHello [for 0.2.4.x]
200 Adding new, extensible CREATE, EXTEND, and related cells [in 0.2.4.8-alpha]
204 Subdomain support for Hidden Service addresses
205 Remove global client-side DNS caching [in 0.2.4.7-alpha]
206 Preconfigured directory sources for bootstrapping [in 0.2.4.7-alpha]
207 Directory guards [for 0.2.4.x]
208 IPv6 Exits Redux [for 0.2.4.x] [in 0.2.4.7-alpha]
214 Allow 4-byte circuit IDs in a new link protocol [in 0.2.4.11-alpha]
215 Let the minimum consensus method change with time [in 0.2.6.1-alpha]
216 Improved circuit-creation key exchange [in 0.2.4.8-alpha]
217 Tor Extended ORPort Authentication [for 0.2.5.x]
218 Controller events to better understand connection/circuit usage [in 0.2.5.2-alpha]
220 Migrate server identity keys to Ed25519 [in 0.3.0.1-alpha]
221 Stop using CREATE_FAST [for 0.2.5.x]
222 Stop sending client timestamps [in 0.2.4.18]
224 Next-Generation Hidden Services in Tor [in 0.3.2.1-alpha]
227 Include package fingerprints in consensus documents [in 0.2.6.3-alpha]
228 Cross-certifying identity keys with onion keys
232 Pluggable Transport through SOCKS proxy [in 0.2.6]
235 Stop assigning (and eventually supporting) the Named flag [in 0.2.6, 0.2.7]
236 The move to a single guard node
237 All relays are directory servers [for 0.2.7.x]
238 Better hidden service stats from Tor relays
243 Give out HSDir flag only to relays with Stable flag
244 Use RFC5705 Key Exporting in our AUTHENTICATE calls [in 0.3.0.1-alpha]
250 Random Number Generation During Tor Voting
251 Padding for netflow record resolution reduction [in 0.3.1.1-alpha]
254 Padding Negotiation
264 Putting version numbers on the Tor subprotocols [in 0.2.9.4-alpha]
271 Another algorithm for guard selection [in 0.3.0.1-alpha]
272 Listed routers should be Valid, Running, and treated as such [in 0.2.9.3-alpha, 0.2.9.4-alpha]
274 Rotate onion keys less frequently [in 0.3.1.1-alpha]
275 Stop including meaningful "published" time in microdescriptor consensus [for 0.3.1.x-alpha] [in 0.4.8.1-alpha]
278 Directory Compression Scheme Negotiation [in 0.3.1.1-alpha]
283 Move IPv6 ORPorts from microdescriptors to the microdesc consensus [for 0.3.3.x] [in 0.3.3.1-alpha]
284 Hidden Service v3 Control Port
289 Authenticating sendme cells to mitigate bandwidth attacks [in 0.4.1.1-alpha]
292 Mesh-based vanguards
293 Other ways for relays to know when to publish [for 0.3.5] [in 0.4.0.1-alpha]
296 Have Directory Authorities expose raw bandwidth list files [in 0.4.0.1-alpha]
297 Relaxing the protover-based shutdown rules [for 0.3.5.x] [in 0.4.0.x]
298 Putting family lines in canonical form [for 0.3.6.x] [in 0.4.0.1-alpha]
301 Don't include package fingerprints in consensus documents
302 Hiding onion service clients using padding [in 0.4.1.1-alpha]
304 Extending SOCKS5 Onion Service Error Codes
305 ESTABLISH_INTRO Cell DoS Defense Extension
310 Towards load-balancing in Prop 271
314 Allow Markdown for proposal format
315 Updating the list of fields required in directory documents [in 0.4.5.1-alpha]
318 Limit protover values to 0-63 [in 0.4.5.1-alpha]
327 A First Take at PoW Over Introduction Circuits
328 Make Relays Report When They Are Overloaded
332 Ntor protocol with extra data, version 3
333 Vanguards lite [in 0.4.7.1-alpha]
335 An authority-only design for MiddleOnly [in 0.4.7.2-alpha]
336 Randomized schedule for guard retries
337 A simpler way to decide, "Is this guard usable?"
345 Migrating the tor specifications to mdbook
351 Making SOCKS5 authentication extensions extensible [in Arti 1.2.8, Tor 0.4.9.1-alpha]
SUPERSEDED:
112 Bring Back Pathlen Coin Weight
113 Simplifying directory authority administration
118 Advertising multiple ORPorts at once
124 Blocking resistant TLS certificate usage
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors
145 Separate "suitable as a guard" from "suitable as a new guard"
146 Add new flag to reflect long-term stability
149 Using data from NETINFO cells
153 Automatic software update protocol
154 Automatic Software Update Protocol
156 Tracking blocked ports on the client side
163 Detecting whether a connection comes from a client
169 Eliminate TLS renegotiation for the Tor connection handshake
170 Configuration options regarding circuit building
185 Directory caches without DirPort
194 Mnemonic .onion URLs
210 Faster Headless Consensus Bootstrapping
225 Strawman proposal: commit-and-reveal shared rng
242 Better performance and usability for the MyFamily option
245 Deprecating and removing the TAP circuit extension protocol
247 Defending Against Guard Discovery Attacks using Vanguards
249 Allow CREATE cells with >505 bytes of handshake data
252 Single Onion Services
266 Removing current obsolete clients from the Tor network
280 Privacy-Preserving Statistics with Privcount in Tor
299 Preferring IPv4 or IPv6 based on IP Version Failure Count
308 Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
334 A Directory Authority Flag To Mark Relays As Middle-only
DEAD:
100 Tor Unreliable Datagram Extension Proposal
115 Two Hop Paths
116 Two hop paths from entry guards
120 Shutdown descriptors when Tor servers stop
128 Families of private bridges
142 Combine Introduction and Rendezvous Points
195 TLS certificate normalization for Tor 0.2.4.x
213 Remove stream-level sendmes from the design
253 Out of Band Circuit HMACs
258 Denial-of-service resistance for directory authorities
276 Report bandwidth with lower granularity in consensus documents
REJECTED:
134 More robust consensus voting with diverse authority sets
147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.4.x]
165 Easy migration for voting authority sets
168 Reduce default circuit window
175 Automatically promoting Tor clients to nodes
197 Message-based Inter-Controller IPC Channel
229 Further SOCKS5 extensions
233 Making Tor2Web mode faster
234 Adding remittance field to directory specification
241 Resisting guard-turnover attacks
246 Merging Hidden Service Directories and Introduction Points
286 Controller APIs for hibernation access on mobile
320 Removing TAP usage from v2 onion services
OBSOLETE:
098 Proposals that should be written
099 Miscellaneous proposals
127 Relaying dirport requests to Tor download site / website
131 Help users to verify they are using Tor
132 A Tor Web Service For Verifying Correct Browser Configuration
141 Download server descriptors on demand
144 Increase the diversity of circuits by detecting nodes belonging the same provider
164 Reporting the status of server votes
173 GETINFO Option Expansion
182 Credit Bucket
189 AUTHORIZE and AUTHORIZED cells
190 Bridge Client Authorization Based on a Shared Secret
191 Bridge Detection Resistance against MITM-capable Adversaries
192 Automatically retrieve and store information about bridges [for 0.2.[45].x]
199 Integration of BridgeFinder and BridgeFinderHelper
203 Avoiding censorship by impersonating an HTTPS server
209 Tuning the Parameters for the Path Bias Defense [for 0.2.4.x+]
230 How to change RSA1024 relay identity keys [for 0.2.?]
231 Migrating authority RSA1024 identity keys [for 0.2.?]
259 New Guard Selection Behaviour
261 AEZ for relay cryptography
263 Request to change key exchange protocol for handshake v1.2
268 New Guard Selection Behaviour
270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
319 RELAY_FRAGMENT cells
325 Packed relay cells: saving space on small commands
RESERVE:
133 Incorporate Unreachable ORs into the Tor Network
172 GETINFO controller option for circuit information
177 Abstaining from votes on individual flags [for 0.2.4.x]
188 Bridge Guards and other anti-enumeration defenses
201 Make bridges report statistics on daily v3 network status requests [for 0.2.4.x]
211 Internal Mapaddress for Tor Configuration Testing [for 0.2.4.x+]
223 Ace: Improved circuit-creation key exchange
226 "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS"
255 Controller features to allow for load-balancing hidden services
256 Key revocation for relays and authorities
262 Re-keying live circuits with new cryptographic material
273 Exit relay pinning for web services [for n/a]
281 Downloading microdescriptors in bulk
288 Privacy-Preserving Statistics with Privcount in Tor (Shamir version)
307 Onion Balance Support for Onion Service v3
INFORMATIONAL:
159 Exit Scanning
300 Walking Onions: Scaling and Saving Bandwidth
Filename: 001-process.txt
Title: The Tor Proposal Process
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta
Overview:
This document describes how to change the Tor specifications, how Tor
proposals work, and the relationship between Tor proposals and the
specifications.
This is an informational document.
Motivation:
Previously, our process for updating the Tor specifications was maximally
informal: we'd patch the specification (sometimes forking first, and
sometimes not), then discuss the patches, reach consensus, and implement
the changes.
This had a few problems.
First, even at its most efficient, the old process would often have the
spec out of sync with the code. The worst cases were those where
implementation was deferred: the spec and code could stay out of sync for
versions at a time.
Second, it was hard to participate in discussion, since you had to know
which portions of the spec were a proposal, and which were already
implemented.
Third, it littered the specifications with too many inline comments.
[This was a real problem -NM]
[Especially when it went to multiple levels! -NM]
[XXXX especially when they weren't signed and talked about that
thing that you can't remember after a year]
How to change the specs now:
First, somebody writes a proposal document. It should describe the change
that should be made in detail, and give some idea of how to implement it.
Once it's fleshed out enough, it becomes a proposal.
Like an RFC, every proposal gets a number. Unlike RFCs, proposals can
change over time and keep the same number, until they are finally
accepted or rejected. The history for each proposal
will be stored in the Tor repository.
Once a proposal is in the repository, we should discuss and improve it
until we've reached consensus that it's a good idea, and that it's
detailed enough to implement. When this happens, we implement the
proposal and incorporate it into the specifications. Thus, the specs
remain the canonical documentation for the Tor protocol: no proposal is
ever the canonical documentation for an implemented feature.
(This process is pretty similar to the Python Enhancement Process, with
the major exception that Tor proposals get re-integrated into the specs
after implementation, whereas PEPs _become_ the new spec.)
{It's still okay to make small changes directly to the spec if the code
can be
written more or less immediately, or cosmetic changes if no code change is
required. This document reflects the current developers' _intent_, not
a permanent promise to always use this process in the future: we reserve
the right to get really excited and run off and implement something in a
caffeine-or-m&m-fueled all-night hacking session.}
How new proposals get added:
Once an idea has been proposed on the development list, a properly formatted
(see below) draft exists, and rough consensus within the active development
community exists that this idea warrants consideration, the proposal editors
will officially add the proposal.
To get your proposal in, send it to the tor-dev@lists.torproject.org mailing
list.
What should go in a proposal:
Every proposal should have a header containing these fields:
Filename, Title, Author, Created, Status.
These fields are optional but recommended:
Target, Implemented-In, Ticket**.
The Target field should describe which version the proposal is hoped to be
implemented in (if it's Open or Accepted). The Implemented-In field
should describe which version the proposal was implemented in (if it's
Finished or Closed). The Ticket field should be a ticket number referring
to Tor's canonical bug tracker (e.g. "#7144" refers to
https://bugs.torproject.org/7144) or to a publicly accessible URI where one
may subscribe to updates and/or retrieve information on implementation
status.
** Proposals with assigned numbers of prop#283 and higher are REQUIRED to
have a Ticket field if the Status is OPEN, ACCEPTED, CLOSED, or FINISHED.
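The header format above lends itself to mechanical checking. Below is a
minimal sketch of such a check; the `parse_header` helper and the sample
header are invented for illustration, not part of any Tor tooling.

```python
# Sketch of a checker for the proposal header format described above.
# Field names come from this document; the sample header is invented.
REQUIRED = ("Filename", "Title", "Author", "Created", "Status")
OPTIONAL = ("Target", "Implemented-In", "Ticket")

def parse_header(text):
    """Parse 'Key: Value' lines until the first blank line."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break
        key, _, value = line.partition(":")
        if key.strip() in REQUIRED + OPTIONAL:
            fields[key.strip()] = value.strip()
    missing = [k for k in REQUIRED if k not in fields]
    return fields, missing

sample = """Filename: 999-example.txt
Title: An example proposal
Author: A. Person
Created: 01-Jan-2020
Status: Draft
"""
fields, missing = parse_header(sample)
```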
The body of the proposal should start with an Overview section explaining
what the proposal's about, what it does, and about what state it's in.
After the Overview, the proposal becomes more free-form. Depending on its
length and complexity, the proposal can break into sections as
appropriate, or follow a short discursive format. Every proposal should
contain at least the following information before it is "ACCEPTED",
though the information does not need to be in sections with these names.
Motivation: What problem is the proposal trying to solve? Why does
this problem matter? If several approaches are possible, why take this
one?
Design: A high-level view of what the new or modified features are, how
the new or modified features work, how they interoperate with each
other, and how they interact with the rest of Tor. This is the main
body of the proposal. Some proposals will start out with only a
Motivation and a Design, and wait for a specification until the
Design seems approximately right.
Security implications: What effects the proposed changes might have on
anonymity, how well understood these effects are, and so on.
Specification: A detailed description of what needs to be added to the
Tor specifications in order to implement the proposal. This should
be in about as much detail as the specifications will eventually
contain: it should be possible for independent programmers to write
mutually compatible implementations of the proposal based on its
specifications.
Compatibility: Will versions of Tor that follow the proposal be
compatible with versions that do not? If so, how will compatibility
be achieved? Generally, we try to not drop compatibility if at
all possible; we haven't made a "flag day" change since May 2004,
and we don't want to do another one.
Implementation: If the proposal will be tricky to implement in Tor's
current architecture, the document can contain some discussion of how
to go about making it work. Actual patches should go on public git
branches, or be uploaded to trac.
Performance and scalability notes: If the feature will have an effect
on performance (in RAM, CPU, bandwidth) or scalability, there should
be some analysis on how significant this effect will be, so that we
can avoid really expensive performance regressions, and so we can
avoid wasting time on insignificant gains.
How to format proposals:
Proposals may be written in plain text (like this one), or in Markdown.
If using Markdown, the header must be wrapped in triple-backtick ("```")
lines. Whenever possible, we prefer the Commonmark dialect of Markdown.
Proposal status:
Open: A proposal under discussion.
Accepted: The proposal is complete, and we intend to implement it.
After this point, substantive changes to the proposal should be
avoided, and regarded as a sign of the process having failed
somewhere.
Finished: The proposal has been accepted and implemented. After this
point, the proposal should not be changed.
Closed: The proposal has been accepted, implemented, and merged into the
main specification documents. The proposal should not be changed after
this point.
Rejected: We're not going to implement the feature as described here,
though we might do some other version. See comments in the document
for details. The proposal should not be changed after this point;
to bring up some other version of the idea, write a new proposal.
Draft: This isn't a complete proposal yet; there are definite missing
pieces. (Despite the existence of this status, the proposal editors
may decline to accept incomplete proposals: please consider asking for
help if you aren't sure how to solve an open issue.) Proposals that
remain in the Draft status for too long are likely to be marked as Dead
or Obsolete.
Needs-Revision: The idea for the proposal is a good one, but the proposal
as it stands has serious problems that keep it from being accepted.
See comments in the document for details.
Dead: The proposal hasn't been touched in a long time, and it doesn't look
like anybody is going to complete it soon. It can become "Open" again
if it gets a new proponent.
Needs-Research: There are research problems that need to be solved before
it's clear whether the proposal is a good idea.
Meta: This is not a proposal, but a document about proposals.
Reserve: This proposal is not something we're currently planning to
implement, but we might want to resurrect it some day if we decide to
do something like what it proposes.
Informational: This proposal is the last word on what it's doing.
It isn't going to turn into a spec unless somebody copy-and-pastes
it into a new spec for a new subsystem.
Obsolete: This proposal was flawed and has been superseded by another
proposal. See comments in the document for details.
The editors maintain the correct status of proposals, based on rough
consensus and their own discretion.
Proposal numbering:
Numbers 000-099 are reserved for special and meta-proposals. 100 and up
are used for actual proposals. Numbers aren't recycled.
Filename: 098-todo.txt
Title: Proposals that should be written
Author: Nick Mathewson, Roger Dingledine
Created: 26-Jan-2007
Status: Obsolete
{Obsolete: This document has been replaced by the tor-spec issue tracker.}
Overview:
This document lists ideas that various people have had for improving the
Tor protocol. These should be implemented and specified if they're
trivial, or written up as proposals if they're not.
This is an active document, to be edited as proposals are written and as
we come up with new ideas for proposals. We should take stuff out as it
seems irrelevant.
For some later protocol version:
- It would be great to get smarter about identity and linkability.
It's not crazy to say, "Never use the same circuit for my SSH
connections and my web browsing." How far can/should we take this?
See ideas/xxx-separate-streams-by-port.txt for a start.
- Fix onionskin handshake scheme to be more mainstream, less nutty.
Can we just do
E(HMAC(g^x), g^x) rather than just E(g^x) ?
No, that has the same flaws as before. We should send
E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
Better ask Ian; probably Stephen too.
- Length on CREATE and friends
- Versioning on circuits and create cells, so we have a clear path
to improve the circuit protocol.
- SHA1 is showing its age. We should get a design for upgrading our
hash once the AHS competition is done, or even sooner.
- Not being able to upgrade ciphersuites or increase key lengths is
lame.
- Paul has some ideas about circuit creation; read his PET paper once it's
out.
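The onionskin-handshake item above can be illustrated with a toy sketch:
the client sends a random confirmation key C alongside g^x (both would be
encrypted to the server's onion key in practice), and the server proves
knowledge of g^xy by returning HMAC_C(K=g^xy). The small Mersenne prime
and 16-byte C below are stand-in parameters for illustration only, not a
secure instantiation.

```python
# Toy sketch of the "E(g^x, C), expect g^y and HMAC_C(K=g^xy)" idea.
# P and G are toy parameters; real deployments use vetted groups.
import hashlib
import hmac
import secrets

P = 2**127 - 1  # small Mersenne prime, illustration only
G = 3

def keypair():
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)

x, gx = keypair()            # client's ephemeral key
C = secrets.token_bytes(16)  # random confirmation key, sent with g^x

y, gy = keypair()            # server's ephemeral key
shared_server = pow(gx, y, P)
tag = hmac.new(C, shared_server.to_bytes(16, "big"), hashlib.sha256).digest()

# Client verifies the server's MAC over the shared secret:
shared_client = pow(gy, x, P)
expected = hmac.new(C, shared_client.to_bytes(16, "big"), hashlib.sha256).digest()
assert hmac.compare_digest(tag, expected)
```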
Any time:
- Some ideas for revising the directory protocol:
- Extend the "r" line in network-status to give a set of buckets (say,
comma-separated) for that router.
- Buckets are deterministic based on IP address.
- Then clients can choose a bucket (or set of buckets) to
download and use.
- We need a way for the authorities to declare that nodes are in a
family. Also, it kinda sucks that family declarations use O(N^2) space
in the descriptors.
- REASON_CONNECTFAILED should include an IP.
- Spec should incorporate some prose from tor-design to be more readable.
- Spec when we should rotate which keys
- Spec how to publish descriptors less often
- Describe pros and cons of non-deterministic path lengths
- We should use a variable-length path length by default -- 3 +/- some
distribution. Need to think harder about allowing values less than 3,
and there's a tradeoff between having a wide variance and performance.
- Clients currently use certs during TLS. Is this wise? It does make it
easier for servers to tell which NATted client is which. We could use a
separate set of certs for each guard, I suppose, but generating so many
certs could get expensive. Omitting them entirely would make OP->OR
easier to tell from OR->OR.
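The deterministic-bucket idea in the directory-protocol item above can be
sketched as follows; the bucket count and the choice of hash here are
arbitrary stand-ins, since the idea was never specified further.

```python
# Sketch: derive a bucket for each router from a hash of its IP
# address, so every client computes the same partition without
# coordination. Bucket count and hash are arbitrary choices.
import hashlib

def bucket_for_ip(ip: str, n_buckets: int = 16) -> int:
    digest = hashlib.sha1(ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % n_buckets

# Every client maps the same address to the same bucket:
assert bucket_for_ip("192.0.2.1") == bucket_for_ip("192.0.2.1")
```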
Things that should change...
B.1. ... but which will require backward-incompatible change
- Circuit IDs should be longer.
. IPv6 everywhere.
- Maybe, keys should be longer.
- Maybe, key-length should be adjustable. How to do this without
making anonymity suck?
- Drop backward compatibility.
- We should use a 128-bit subgroup of our DH prime.
- Handshake should use HMAC.
- Multiple cell lengths.
- Ability to split circuits across paths (If this is useful.)
- SENDME windows should be dynamic.
- Directory
- Stop ever mentioning socks ports
B.2. ... and that will require no changes
- Advertised outbound IP?
- Migrate streams across circuits.
- Fix bug 469 by limiting the number of simultaneous connections per IP.
B.3. ... and that we have no idea how to do.
- UDP (as transport)
- UDP (as content)
- Use a better AES mode that has built-in integrity checking,
doesn't grow with the number of hops, is not patented, and
is implemented and maintained by smart people.
Let onion keys be not just RSA but maybe DH too, for Paul's reply onion
design.
Filename: 099-misc.txt
Title: Miscellaneous proposals
Author: Various
Created: 26-Jan-2007
Status: Obsolete
{This document is obsolete; we only used it once, and we have implemented
its only idea.}
Overview:
This document is for small proposal ideas that are about one paragraph in
length. From here, ideas can be rejected outright, expanded into full
proposals, or specified and implemented as-is.
Proposals
1. Directory compression.
Gzip would be easier to work with than zlib; bzip2 would result in smaller
data lengths. [Concretely, we're looking at about 10-15% space savings at
the expense of 3-5x longer compression time for using bzip2.] Doing
on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
Pre-compressing status documents in multiple formats would force us to use
more memory to hold them.
Status: Open
-- Nick Mathewson
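The trade-off described above is easy to reproduce with the two stdlib
compressors. A rough sketch follows; the sample data is an invented
stand-in for a real status document, so actual size ratios will differ
from the 10-15% figure quoted above.

```python
# Compare zlib (deflate) and bzip2 output sizes on a blob of
# repetitive text standing in for a directory document.
import bz2
import zlib

data = b"router example 192.0.2.1 9001 0 9030\n" * 2000
z = zlib.compress(data, 9)
b = bz2.compress(data, 9)
print(len(data), len(z), len(b))
```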
Filename: 100-tor-spec-udp.txt
Title: Tor Unreliable Datagram Extension Proposal
Author: Marc Liberatore
Created: 23 Feb 2006
Status: Dead
Overview:
This is a modified version of the Tor specification written by Marc
Liberatore to add UDP support to Tor. For each TLS link, it adds a
corresponding DTLS link: control messages and TCP data flow over TLS, and
UDP data flows over DTLS.
This proposal is not likely to be accepted as-is; see comments at the end
of the document.
Contents
0. Introduction
Tor is a distributed overlay network designed to anonymize low-latency
TCP-based applications. The current tor specification supports only
TCP-based traffic. This limitation prevents the use of tor to anonymize
other important applications, notably voice over IP software. This document
is a proposal to extend the tor specification to support UDP traffic.
The basic design philosophy of this extension is to add support for
tunneling unreliable datagrams through tor with as few modifications to the
protocol as possible. As currently specified, tor cannot directly support
such tunneling, as connections between nodes are built using transport layer
security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
to the operation of most UDP-based application level protocols.
Thus, we propose the addition of links between nodes using datagram
transport layer security (DTLS). These links allow packets to traverse a
route through tor quickly, but their unreliable nature requires minor
changes to the tor protocol. This proposal outlines the necessary
additions and changes to the tor specification to support UDP traffic.
We note that a separate set of DTLS links between nodes creates a second
overlay, distinct from the one composed of TLS links. This separation and
resulting decrease in each anonymity set's size will make certain attacks
easier. However, it is our belief that VoIP support in tor will
dramatically increase its appeal, and correspondingly, the size of its user
base, number of deployed nodes, and total traffic relayed. These increases
should help offset the loss of anonymity that two distinct networks imply.
1. Overview of Tor-UDP and its complications
As described above, this proposal extends the Tor specification to support
UDP with as few changes as possible. Tor's overlay network is managed
through TLS-based connections; we will re-use this control plane to set up
and tear down circuits that relay UDP traffic. These circuits will be built atop
DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
TLS.
The unreliability of DTLS circuits creates problems for Tor at two levels:
1. Tor's encryption of the relay layer does not allow independent
decryption of individual records. If record N is not received, then
record N+1 will not decrypt correctly, as the counter for AES/CTR is
maintained implicitly.
2. Tor's end-to-end integrity checking works under the assumption that
all RELAY cells are delivered. This assumption is invalid when cells
are sent over DTLS.
The fix for the first problem is straightforward: add an explicit sequence
number to each cell. To fix the second problem, we introduce a
system of nonces and hashes to RELAY packets.
In the following sections, we mirror the layout of the Tor Protocol
Specification, presenting the necessary modifications to the Tor protocol as
a series of deltas.
2. Connections
Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
corresponding TLS links, as all control messages are sent over TLS. All
implementations MUST support the DTLS ciphersuite "[TODO]".
DTLS connections are formed using the same protocol as TLS connections.
This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
as detailed in section 4.6.
Once a paired TLS/DTLS connection is established, the two sides send cells
to one another. All but two types of cells are sent over TLS links. RELAY
cells containing the commands RELAY_UDP_DATA and RELAY_UDP_DROP, specified
below, are sent over DTLS links. [Should all cells still be 512 bytes long?
Perhaps upon completion of a preliminary implementation, we should do a
performance evaluation for some class of UDP traffic, such as VoIP. - ML]
Cells may be sent embedded in TLS or DTLS records of any size or divided
across such records. The framing of these records MUST NOT leak any more
information than the above differentiation on the basis of cell type. [I am
uncomfortable with this leakage, but don't see any simple, elegant way
around it. -ML]
As with TLS connections, DTLS connections are not permanent.
3. Cell format
Each cell contains the following fields:
CircID [2 bytes]
Command [1 byte]
Sequence Number [2 bytes]
Payload (padded with 0 bytes) [507 bytes]
[Total size: 512 bytes]
The 'Command' field holds one of the following values:
0 -- PADDING (Padding) (See Sec 6.2)
1 -- CREATE (Create a circuit) (See Sec 4)
2 -- CREATED (Acknowledge create) (See Sec 4)
3 -- RELAY (End-to-end data) (See Sec 5)
4 -- DESTROY (Stop using a circuit) (See Sec 4)
5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
10 -- CREATED_FAST_UDP(UDP circuit created, no PK) (See Sec 4)
The sequence number allows for AES/CTR decryption of RELAY cells
independently of one another; this functionality is required to support
cells sent over DTLS. The sequence number is described in more detail in
section 4.5.
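The cell layout above can be sketched as a pack/unpack pair. This is a
non-normative illustration; the field widths come from the list in this
section, using network byte order.

```python
# Sketch of the 512-byte cell: CircID (2), Command (1),
# Sequence Number (2), zero-padded payload (507).
import struct

CELL_LEN, PAYLOAD_LEN = 512, 507
RELAY = 3  # command value from the table above

def pack_cell(circ_id: int, command: int, seq: int, payload: bytes) -> bytes:
    assert len(payload) <= PAYLOAD_LEN
    header = struct.pack("!HBH", circ_id, command, seq)
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")

def unpack_cell(cell: bytes):
    circ_id, command, seq = struct.unpack("!HBH", cell[:5])
    return circ_id, command, seq, cell[5:]

cell = pack_cell(0x0102, RELAY, 7, b"hello")
assert len(cell) == CELL_LEN
```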
[Should the sequence number only appear in RELAY packets? The overhead is
small, and I'm hesitant to force more code paths on the implementor. -ML]
[There's already a separate relay header that has other material in it,
so it wouldn't be the end of the world to move it there if it's
appropriate. -RD]
[Having separate commands for UDP circuits seems necessary, unless we can
assume a flag day event for a large number of tor nodes. -ML]
4. Circuit management
4.2. Setting circuit keys
Keys are set up for UDP circuits in the same fashion as for TCP circuits.
Each UDP circuit shares keys with its corresponding TCP circuit.
[If the keys are used for both TCP and UDP connections, how does it
work to mix sequence-number-less cells with sequenced-numbered cells --
how do you know you have the encryption order right? -RD]
4.3. Creating circuits
UDP circuits are created as TCP circuits, using the *_UDP cells as
appropriate.
4.4. Tearing down circuits
UDP circuits are torn down as TCP circuits, using the *_UDP cells as
appropriate.
4.5. Routing relay cells
When an OR receives a RELAY cell, it checks the cell's circID and
determines whether it has a corresponding circuit along that
connection. If not, the OR drops the RELAY cell.
Otherwise, if the OR is not at the OP edge of the circuit (that is,
either an 'exit node' or a non-edge node), it de/encrypts the payload
with AES/CTR, as follows:
'Forward' relay cell (same direction as CREATE):
Use Kf as key; decrypt, using sequence number to synchronize
ciphertext and keystream.
'Back' relay cell (opposite direction from CREATE):
Use Kb as key; encrypt, using sequence number to synchronize
ciphertext and keystream.
Note that in counter mode, decrypt and encrypt are the same operation.
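The seek-by-sequence-number property can be sketched as follows. A
hash-derived keystream stands in for AES/CTR here, so this illustrates
only the synchronization idea, not the actual cipher: any cell can be
processed independently given its sequence number, and encrypt and
decrypt are the same XOR operation.

```python
# Toy keystream cipher standing in for AES/CTR: the keystream for a
# cell depends only on (key, sequence number), so a lost cell does
# not desynchronize later ones.
import hashlib

def keystream(key: bytes, seq: int, n: int) -> bytes:
    assert n <= 32  # toy limit: one SHA-256 block of keystream
    return hashlib.sha256(key + seq.to_bytes(2, "big")).digest()[:n]

def crypt(key: bytes, seq: int, data: bytes) -> bytes:
    ks = keystream(key, seq, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

Kf = b"forward-key-16by"
c1 = crypt(Kf, 41, b"cell forty-one")
c2 = crypt(Kf, 42, b"cell forty-two")
# Even if cell 41 is lost in transit, cell 42 still decrypts:
assert crypt(Kf, 42, c2) == b"cell forty-two"
```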
[Since the sequence number is only 2 bytes, what do you do when it
rolls over? -RD]
Each stream encrypted by a Kf or Kb has a corresponding unique state,
captured by a sequence number; the originator of each such stream chooses
the initial sequence number randomly, and increments it only with RELAY
cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
there's no need for counting bytes directly. Right? - ML]
[I believe this is true. You'll find out for sure when you try to
build it. ;) -RD]
The OR then decides whether it recognizes the relay cell, by
inspecting the payload as described in section 5.1 below. If the OR
recognizes the cell, it processes the contents of the relay cell.
Otherwise, it passes the decrypted relay cell along the circuit if
the circuit continues. If the OR at the end of the circuit
encounters an unrecognized relay cell, an error has occurred: the OR
sends a DESTROY cell to tear down the circuit.
When a relay cell arrives at an OP, the OP decrypts the payload
with AES/CTR as follows:
OP receives data cell:
For I=1...N,
Decrypt with Kb_I, using the sequence number as above. If the
payload is recognized (see section 5.1), then stop and process
the payload.
For more information, see section 5 below.
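A toy sketch of this peeling loop (single-byte XOR keystreams stand in for AES/CTR, and a b"REC" marker stands in for the section 5.1 recognition check; on the backward path the first hop encrypts last, so its layer Kb_1 is outermost and is removed first):

```python
def strip_layers(kb_keys, payload):
    # kb_keys[0] is Kb_1 (first hop), kb_keys[-1] is Kb_N (exit).
    # Peel one layer per hop, outermost first, until recognized.
    for hop, key in enumerate(kb_keys, start=1):
        payload = bytes(b ^ key for b in payload)
        if payload.startswith(b"REC"):  # toy 'recognized' check
            return hop, payload
    raise ValueError("unrecognized relay cell at the OP")
```

For a cell originated by the exit of a three-hop circuit, each hop applies its own layer on the way back, and the OP peels all three.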
4.6. CREATE_UDP and CREATED_UDP cells
Users set up UDP circuits incrementally. The procedure is similar to that
for TCP circuits, as described in section 4.1. In addition to the TLS
connection to the first node, the OP also attempts to open a DTLS
connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
payload in the same format as a CREATE cell. To extend a UDP circuit past
the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
instructs the last node in the circuit to send a CREATE_UDP cell to extend
the circuit.
The relay payload for an EXTEND_UDP relay cell consists of:
Address [4 bytes]
TCP port [2 bytes]
UDP port [2 bytes]
Onion skin [186 bytes]
Identity fingerprint [20 bytes]
The address field and ports denote the IPv4 address and ports of the next OR
in the circuit.
The payload for a CREATED_UDP cell or the relay payload for a
RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
RELAY_EXTENDED cell. Both circuits are established using the same key.
Note that the existence of a UDP circuit implies the
existence of a corresponding TCP circuit, sharing keys, sequence numbers,
and any other relevant state.
4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
As above, the OP must successfully connect using DTLS before attempting to
send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
section 4.1.1.
5. Application connections and stream management
5.1. Relay cells
Within a circuit, the OP and the exit node use the contents of RELAY cells
to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
across circuits. End-to-end commands and UDP packets can be initiated by
either edge; streams are initiated by the OP.
The payload of each unencrypted RELAY cell consists of:
Relay command [1 byte]
'Recognized' [2 bytes]
StreamID [2 bytes]
Digest [4 bytes]
Length [2 bytes]
Data [498 bytes]
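The layout above (1 + 2 + 2 + 4 + 2 + 498 = 509 bytes) can be packed and parsed with a sketch like this (helper names are illustrative, not from the Tor source):

```python
import struct

# command, recognized, streamID, digest, length -- 11 header bytes
RELAY_HEADER = struct.Struct(">BHHIH")
PAYLOAD_LEN = 498


def pack_relay_payload(command, stream_id, digest, data):
    assert len(data) <= PAYLOAD_LEN
    # 'recognized' is always zero in the unencrypted payload;
    # unused data bytes are padded with NULs.
    return (RELAY_HEADER.pack(command, 0, stream_id, digest, len(data))
            + data.ljust(PAYLOAD_LEN, b"\x00"))


def unpack_relay_payload(cell):
    command, recognized, stream_id, digest, length = \
        RELAY_HEADER.unpack(cell[:RELAY_HEADER.size])
    return command, recognized, stream_id, digest, \
        cell[RELAY_HEADER.size:RELAY_HEADER.size + length]
```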
The relay commands are:
1 -- RELAY_BEGIN [forward]
2 -- RELAY_DATA [forward or backward]
3 -- RELAY_END [forward or backward]
4 -- RELAY_CONNECTED [backward]
5 -- RELAY_SENDME [forward or backward]
6 -- RELAY_EXTEND [forward]
7 -- RELAY_EXTENDED [backward]
8 -- RELAY_TRUNCATE [forward]
9 -- RELAY_TRUNCATED [backward]
10 -- RELAY_DROP [forward or backward]
11 -- RELAY_RESOLVE [forward]
12 -- RELAY_RESOLVED [backward]
13 -- RELAY_BEGIN_UDP [forward]
14 -- RELAY_DATA_UDP [forward or backward]
15 -- RELAY_EXTEND_UDP [forward]
16 -- RELAY_EXTENDED_UDP [backward]
17 -- RELAY_DROP_UDP [forward or backward]
Commands labelled as "forward" must only be sent by the originator
of the circuit. Commands labelled as "backward" must only be sent by
other nodes in the circuit back to the originator. Commands marked
as either can be sent either by the originator or other nodes.
The 'recognized' field in any unencrypted relay payload is always set to
zero.
The 'digest' field can have two meanings. For all cells sent over TLS
connections (that is, all commands and all non-UDP RELAY data), it is
computed as the first four bytes of the running SHA-1 digest of all the
bytes that have been sent reliably and have been destined for this hop of
the circuit or originated from this hop of the circuit, seeded from Df or Db
respectively (obtained in section 4.2 above), and including this RELAY
cell's entire payload (taken with the digest field set to zero). Cells sent
over DTLS connections do not affect this running digest. Each cell sent
over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
set to the SHA-1 digest of the current RELAY cell's entire payload, with the
digest field set to zero. Coupled with a randomly-chosen streamID, this
provides per-cell integrity checking on UDP cells.
[If you drop malformed UDP relay cells but don't close the circuit,
then this 8 bytes of digest is not as strong as what we get in the
TCP-circuit side. Is this a problem? -RD]
When the 'recognized' field of a RELAY cell is zero, and the digest
is correct, the cell is considered "recognized" for the purposes of
decryption (see section 4.5 above).
(The digest does not include any bytes from relay cells that do
not start or end at this hop of the circuit. That is, it does not
include forwarded data. Therefore if 'recognized' is zero but the
digest does not match, the running digest at that node should
not be updated, and the cell should be forwarded on.)
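A sketch of the per-cell check for DTLS cells. One assumed detail: the proposal does not say how the 20-byte SHA-1 output fits the 4-byte digest field, so this sketch truncates to the first four bytes, matching the TLS-side convention:

```python
import hashlib

DIGEST_OFF, DIGEST_LEN = 5, 4  # digest field: bytes 5..8 of the relay payload


def udp_cell_digest(payload: bytes) -> bytes:
    # SHA-1 over the entire relay payload with the digest field zeroed,
    # truncated to the field width (assumption: first four bytes).
    zeroed = (payload[:DIGEST_OFF] + b"\x00" * DIGEST_LEN
              + payload[DIGEST_OFF + DIGEST_LEN:])
    return hashlib.sha1(zeroed).digest()[:DIGEST_LEN]


def seal(payload: bytes) -> bytes:
    # Fill in the digest field before sending over DTLS.
    d = udp_cell_digest(payload)
    return payload[:DIGEST_OFF] + d + payload[DIGEST_OFF + DIGEST_LEN:]


def verify(payload: bytes) -> bool:
    return payload[DIGEST_OFF:DIGEST_OFF + DIGEST_LEN] == udp_cell_digest(payload)
```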
All RELAY cells pertaining to the same tunneled TCP stream have the
same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
cells that affect the entire circuit rather than a particular
stream use a StreamID of zero.
All RELAY cells pertaining to the same UDP tunnel have the same streamID.
This streamID is chosen randomly by the OP, but cannot be zero.
The 'Length' field of a relay cell contains the number of bytes in
the relay payload which contain real payload data. The remainder of
the payload is padded with NUL bytes.
If the RELAY cell is recognized but the relay command is not
understood, the cell must be dropped and ignored. Its contents
still count with respect to the digests, though. [Before
0.1.1.10, Tor closed circuits when it received an unknown relay
command. Perhaps this will be more forward-compatible. -RD]
5.2.1. Opening UDP tunnels and transferring data
To open a new anonymized UDP connection, the OP chooses an open
circuit to an exit that may be able to connect to the destination
address, selects a random streamID not yet used on that circuit,
and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
and port of the destination host. The payload format is:
ADDRESS | ':' | PORT | [00]
where ADDRESS can be a DNS hostname, or an IPv4 address in
dotted-quad format, or an IPv6 address surrounded by square brackets;
and where PORT is encoded in decimal.
[What is the [00] for? -NM]
[It's so the payload is easy to parse out with string funcs -RD]
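Since a bracketed IPv6 ADDRESS itself contains colons, a parser must split on the last colon. A sketch (hypothetical helper names):

```python
def pack_begin_udp(address: str, port: int) -> bytes:
    # IPv6 addresses are expected to arrive already in square brackets.
    return f"{address}:{port}".encode("ascii") + b"\x00"


def parse_begin_udp(payload: bytes):
    text = payload.rstrip(b"\x00").decode("ascii")
    address, _, port = text.rpartition(":")  # split on the LAST colon
    return address, int(port)
```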
Upon receiving this cell, the exit node resolves the address as necessary.
If the address cannot be resolved, the exit node replies with a RELAY_END
cell. (See 5.4 below.) Otherwise, the exit node replies with a
RELAY_CONNECTED cell, whose payload is in one of the following formats:
The IPv4 address to which the connection was made [4 octets]
A number of seconds (TTL) for which the address may be cached [4 octets]
or
Four zero-valued octets [4 octets]
An address type (6) [1 octet]
The IPv6 address to which the connection was made [16 octets]
A number of seconds (TTL) for which the address may be cached [4 octets]
[XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
field. No version of Tor currently generates the IPv6 format.]
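The two forms can be distinguished by the leading four octets; a parsing sketch using Python's standard ipaddress module:

```python
import ipaddress
import struct


def parse_connected(payload: bytes):
    if payload[:4] != b"\x00\x00\x00\x00":
        # IPv4 form: address, then TTL.
        addr = ipaddress.IPv4Address(payload[:4])
        (ttl,) = struct.unpack(">I", payload[4:8])
    else:
        # IPv6 form: four zero octets, address type 6, address, TTL.
        assert payload[4] == 6
        addr = ipaddress.IPv6Address(payload[5:21])
        (ttl,) = struct.unpack(">I", payload[21:25])
    return addr, ttl
```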
The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
package UDP data in RELAY_DATA_UDP cells, and upon receiving such
cells, echo their contents to the corresponding socket.
RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
RELAY_DROP_UDP cells are long-range dummies; upon receiving such
a cell, the OR or OP must drop it.
5.3. Closing streams
UDP tunnels are closed in the same fashion as TCP connections.
6. Flow Control
UDP streams are not subject to flow control.
7.2. Router descriptor format.
The items' formats are as follows:
"router" nickname address ORPort SocksPort DirPort UDPPort
Indicates the beginning of a router descriptor. "address" must be
an IPv4 address in dotted-quad format. The last three numbers
indicate the TCP ports at which this OR exposes
functionality. ORPort is a port at which this OR accepts TLS
connections for the main OR protocol; SocksPort is deprecated and
should always be 0; DirPort is the port at which this OR accepts
directory-related HTTP connections; and UDPPort is a port at which
this OR accepts DTLS connections for UDP data. If any port is not
supported, the value 0 is given instead of a port number.
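A sketch of parsing the extended "router" line (illustrative helper, not Tor code):

```python
def parse_router_line(line: str):
    keyword, nickname, address, *ports = line.split()
    assert keyword == "router" and len(ports) == 4
    or_port, socks_port, dir_port, udp_port = (int(p) for p in ports)
    # A value of 0 means the corresponding port is not supported.
    return {"nickname": nickname, "address": address,
            "ORPort": or_port, "SocksPort": socks_port,
            "DirPort": dir_port, "UDPPort": udp_port}
```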
Other sections:
What changes need to happen to each node's exit policy to support this? -RD
Switching to UDP means managing the queues of incoming packets better,
so we don't miss packets. How does this interact with doing large public
key operations (handshakes) in the same thread? -RD
========================================================================
COMMENTS
========================================================================
[16 May 2006]
I don't favor this approach; it makes packet traffic partitioned from
stream traffic end-to-end. The architecture I'd like to see is:
A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
TCP/TLS for firewall penetration or something. (This also gives us an
upgrade path for routing through legacy servers.)
B Stream traffic is handled with end-to-end per-stream acks/naks and
retries. On failure, the data is retransmitted in a new RELAY_DATA cell;
a cell isn't retransmitted.
We'll need to do A anyway, to fix our behavior on packet-loss. Once we've
done so, B is more or less inevitable, and we can support end-to-end UDP
traffic "for free".
(Also, there are some details that this draft spec doesn't address. For
example, what happens when a UDP packet doesn't fit in a single cell?)
-NM
Filename: 101-dir-voting.txt
Title: Voting on the Tor Directory System
Author: Nick Mathewson
Created: Nov 2006
Status: Closed
Implemented-In: 0.2.0.x
Overview
This document describes a consensus voting scheme for Tor directories;
instead of publishing different network statuses, directories would vote on
and publish a single "consensus" network status document.
Proposal:
0. Scope and preliminaries
This document describes a consensus voting scheme for Tor directories.
Once it's accepted, it should be merged with dir-spec.txt. Some
preliminaries for authority and caching support should be done during
the 0.1.2.x series; the main deployment should come during the 0.2.0.x
series.
0.1. Goals and motivation: voting.
The current directory system relies on clients downloading, from the
caches, a separate network status statement signed by each directory.
Clients download a new statement every 30 minutes or so, choosing to
replace the oldest statement they currently have.
This creates a partitioning problem: different clients have different
"most recent" networkstatus sources, and different versions of each
(since authorities change their statements often).
It also creates a scaling problem: most of the downloaded networkstatuses
are probably quite similar, and the redundancy grows as we add more
authorities.
So if we have clients only download a single multiply signed consensus
network status statement, we can:
- Save bandwidth.
- Reduce client partitioning
- Reduce client-side and cache-side storage
- Simplify client-side voting code (by moving voting away from the
client)
We should try to do this without:
- Assuming that client-side or cache-side clocks are more correct
than we assume now.
- Assuming that authority clocks are perfectly correct.
- Degrading badly if a few authorities die or are offline for a bit.
We do not have to perform well if:
- No clique of more than half the authorities can agree about who
the authorities are.
1. The idea.
Instead of publishing a network status whenever something changes,
each authority instead publishes a fresh network status only once per
"period" (say, 60 minutes). Authorities either upload this network
status (or "vote") to every other authority, or download every other
authority's "vote" (see 3.1 below for discussion on push vs pull).
After an authority has (or has become convinced that it won't be able to
get) every other authority's vote, it deterministically computes a
consensus networkstatus, and signs it. Authorities download (or are
uploaded; see 3.1) one another's signatures, and form a multiply signed
consensus. This multiply-signed consensus is what caches cache and what
clients download.
If an authority is down, authorities vote based on what they *can*
download/get uploaded.
If an authority is "a little" down and only some authorities can reach
it, authorities try to get its info from other authorities.
If an authority computes the vote wrong, its signature isn't included on
the consensus.
Clients use a consensus if it is "trusted": signed by more than half the
authorities they recognize. If clients can't find any such consensus,
they use the most recent trusted consensus they have. If they don't
have any trusted consensus, they warn the user and refuse to operate
(and if DirServers is not the default, beg the user to adapt the list
of authorities).
2. Details.
2.0. Versioning
All documents generated here have version "3" given in their
network-status-version entries.
2.1. Vote specifications
Votes in v3 are similar to v2 network status documents. We add these
fields to the preamble:
"vote-status" -- the word "vote".
"valid-until" -- the time when this authority expects to publish its
next vote.
"known-flags" -- a space-separated list of flags that will sometimes
be included on "s" lines later in the vote.
"dir-source" -- as before, except the "hostname" part MUST be the
authority's nickname, which MUST be unique among authorities, and
MUST match the nickname in the "directory-signature" entry.
Authorities SHOULD cache their most recently generated votes so they
can persist them across restarts. Authorities SHOULD NOT generate
another document until valid-until has passed.
Router entries in the vote MUST be sorted in ascending order by router
identity digest. The flags in "s" lines MUST appear in alphabetical
order.
Votes SHOULD be synchronized to half-hour publication intervals (one
hour? XXX say more; be more precise.)
XXXX some way to request older networkstatus docs?
2.2. Consensus directory specifications
Consensuses are like v3 votes, except for the following fields:
"vote-status" -- the word "consensus".
"published" is the latest of all the published times on the votes.
"valid-until" is the earliest of all the valid-until times on the
votes.
"dir-source" and "fingerprint" and "dir-signing-key" and "contact"
are included for each authority that contributed to the vote.
"vote-digest" is included for each authority that contributed to the
vote, calculated in the same way as the digest in the signature on
that authority's vote.
"client-versions" and "server-versions" are sorted in ascending
order based on version-spec.txt.
"dir-options" and "known-flags" are not included.
[XXX really? why not list the ones that are used in the consensus?
For example, right now BadExit is in use, but no servers would be
labelled BadExit, and it's still worth knowing that it was considered
by the authorities. -RD]
The fields MUST occur in the following order:
"network-status-version"
"vote-status"
"published"
"valid-until"
For each authority, sorted in ascending order of nickname, case-
insensitively:
"dir-source", "fingerprint", "contact", "dir-signing-key",
"vote-digest".
"client-versions"
"server-versions"
The signatures at the end of the document appear as multiple instances
of directory-signature, sorted in ascending order by nickname,
case-insensitively.
A router entry should be included in the result if it is included by more
than half of the authorities (total authorities, not just those whose votes
we have). A router entry has a flag set if it is included by more than
half of the authorities who care about that flag. [XXXX this creates an
incentive for attackers to DOS authorities whose votes they don't like.
Can we remember what flags people set the last time we saw them? -NM]
[Which 'we' are we talking about here? The end-users never learn which
authority sets which flags. So you're thinking the authorities
should record the last vote they saw from each authority and if it's
within a week or so, count all the flags that it advertised as 'no'
votes? Plausible. -RD]
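The inclusion rules can be sketched as follows, in a deliberately simplified model (a real consensus computation covers far more fields):

```python
from collections import Counter


def compute_consensus(votes, n_authorities, flag_voters):
    """votes: one {router_digest: set_of_flags} dict per received vote.
    n_authorities: TOTAL number of authorities, not just votes received.
    flag_voters: {flag: number of authorities that care about that flag}."""
    router_counts = Counter()
    flag_counts = {}
    for vote in votes:
        for digest, flags in vote.items():
            router_counts[digest] += 1
            for f in flags:
                flag_counts.setdefault(digest, Counter())[f] += 1
    consensus = {}
    for digest, n in router_counts.items():
        if n > n_authorities / 2:  # more than half of ALL authorities
            consensus[digest] = {
                f for f, c in flag_counts.get(digest, Counter()).items()
                # flag set if more than half of those who care voted for it;
                # if no count is known, assume all authorities care
                if c > flag_voters.get(f, n_authorities) / 2
            }
    return consensus
```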
The signature hash covers from the "network-status-version" line through
the characters "directory-signature" in the first "directory-signature"
line.
Consensus directories SHOULD be rejected if they are not signed by more
than half of the known authorities.
2.2.1. Detached signatures
Assuming full connectivity, every authority should compute and sign the
same consensus directory in each period. Therefore, it isn't necessary to
download the consensus computed by each authority; instead, the authorities
only push/fetch each others' signatures. A "detached signature" document
contains a single "consensus-digest" entry and one or more
directory-signature entries. [XXXX specify more.]
2.3. URLs and timelines
2.3.1. URLs and timeline used for agreement
An authority SHOULD publish its vote immediately at the start of each voting
period. It does this by making it available at
http://<hostname>/tor/status-vote/current/authority.z
and sending it in an HTTP POST request to each other authority at the URL
http://<hostname>/tor/post/vote
If, N minutes after the voting period has begun, an authority does not have
a current statement from another authority, the first authority retrieves
the other's statement.
Once an authority has a vote from another authority, it makes it available
at
http://<hostname>/tor/status-vote/current/<fp>.z
where <fp> is the fingerprint of the other authority's identity key.
The consensus network status, along with as many signatures as the server
currently knows, should be available at
http://<hostname>/tor/status-vote/current/consensus.z
All of the detached signatures it knows for consensus status should be
available at:
http://<hostname>/tor/status-vote/current/consensus-signatures.z
Once an authority has computed and signed a consensus network status, it
should send its detached signature to each other authority in an HTTP POST
request to the URL:
http://<hostname>/tor/post/consensus-signature
[XXXX Store votes to disk.]
2.3.2. Serving a consensus directory
Once the authority is done getting signatures on the consensus directory,
it should serve it from:
http://<hostname>/tor/status/consensus.z
Caches SHOULD download consensus directories from an authority and serve
them from the same URL.
2.3.3. Timeline and synchronization
[XXXX]
2.4. Distributing routerdescs between authorities
Consensus will be more meaningful if authorities take steps to make sure
that they all have the same set of descriptors _before_ the voting
starts. This is safe, since all descriptors are self-certified and
timestamped: it's always okay to replace a signed descriptor with a more
recent one signed by the same identity.
In the long run, we might want some kind of sophisticated process here.
For now, since authorities already download one another's networkstatus
documents and use them to determine what descriptors to download from one
another, we can rely on this existing mechanism to keep authorities up to
date.
[We should do a thorough read-through of dir-spec again to make sure
that the authorities converge on which descriptor to "prefer" for
each router. Right now the decision happens at the client, which is
no longer the right place for it. -RD]
3. Questions and concerns
3.1. Push or pull?
The URLs above define a push mechanism for publishing votes and consensus
signatures via HTTP POST requests, and a pull mechanism for downloading
these documents via HTTP GET requests. As specified, every authority will
post to every other. The "download if no copy has been received" mechanism
exists only as a fallback.
4. Migration
* It would be cool if caches could get ready to download consensus
status docs, verify enough signatures, and serve them now. That way
once stuff works all we need to do is upgrade the authorities. Caches
don't need to verify the correctness of the format so long as it's
signed (or maybe multisigned?). We need to make sure that caches back
off very quickly from downloading consensus docs until they're
actually implemented.
Filename: 102-drop-opt.txt
Title: Dropping "opt" from the directory format
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes a change in the format used to transmit router and
directory information.
This proposal has been accepted, implemented, and merged into dir-spec.txt.
Proposal:
The "opt" keyword in Tor's directory formats was originally intended to
mean, "it is okay to ignore this entry if you don't understand it"; the
default behavior has been "discard a routerdesc if it contains entries you
don't recognize."
But so far, every new flag we have added has been marked 'opt'. It would
probably make sense to change the default behavior to "ignore unrecognized
fields", and add the statement that clients SHOULD ignore fields they don't
recognize. As a meta-principle, we should say that clients and servers
MUST NOT have to understand new fields in order to use directory documents
correctly.
Of course, this will make it impossible to say, "The format has changed a
lot; discard this quietly if you don't understand it." We could do that by
adding a version field.
Status:
* We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it
once earlier formats are obsolete.
Filename: 103-multilevel-keys.txt
Title: Splitting identity key from regularly used signing key
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes a change in the way identity keys are used, so that
highly sensitive keys can be password-protected and seldom loaded into RAM.
It presents options; it is not yet a complete proposal.
Proposal:
Replacing a directory authority's identity key in the event of a compromise
would be tremendously annoying. We'd need to tell every client to switch
their configuration, or update to a new version with an updated list. So
long as some weren't upgraded, they'd be at risk from whoever had
compromised the key.
With this in mind, it's a shame that our current protocol forces us to
store identity keys unencrypted in RAM. We need some kind of signing key
stored unencrypted, since we need to generate new descriptors/directories
and rotate link and onion keys regularly. (And since, of course, we can't
ask server operators to be on-hand to enter a passphrase every time we
want to rotate keys or sign a descriptor.)
The obvious solution seems to be to have a signing-only key that lives
indefinitely (months or longer) and signs descriptors and link keys, and a
separate identity key that's used to sign the signing key. Tor servers
could run in one of several modes:
1. Identity key stored encrypted. You need to pick a passphrase when
you enable this mode, and re-enter this passphrase every time you
rotate the signing key.
1'. Identity key stored separate. You save your identity key to a
floppy, and use the floppy when you need to rotate the signing key.
2. All keys stored unencrypted. In this case, we might not want to even
*have* a separate signing key. (We'll need to support no-separate-
signing-key mode anyway to keep old servers working.)
3. All keys stored encrypted. You need to enter a passphrase to start
Tor.
(Of course, we might not want to implement all of these.)
Case 1 is probably most usable and secure, if we assume that people don't
forget their passphrases or lose their floppies. We could mitigate this a
bit by encouraging people to PGP-encrypt their passphrases to themselves,
or keep a cleartext copy of their secret key secret-split into a few
pieces, or something like that.
Migration presents another difficulty, especially with the authorities. If
we use the current set of identity keys as the new identity keys, we're in
the position of having sensitive keys that have been stored on
media-of-dubious-encryption up to now. Also, we need to keep old clients
(who will expect descriptors to be signed by the identity keys they know
and love, and who will not understand signing keys) happy.
A possible solution:
One thing to consider is that router identity keys are not very sensitive:
if an OR disappears and reappears with a new key, the network treats it as
though an old router had disappeared and a new one had joined the network.
The Tor network continues unharmed; this isn't a disaster.
Thus, the ideas above are mostly relevant for authorities.
The most straightforward solution for the authorities is probably to take
advantage of the protocol transition that will come with proposal 101, and
introduce a new set of signing _and_ identity keys used only to sign votes
and consensus network-status documents. Signing and identity keys could be
delivered to users in a separate, rarely changing "keys" document, so that
the consensus network-status documents wouldn't need to include N signing
keys, N identity keys, and N certifications.
Note also that there is no reason that the identity/signing keys used by
directory authorities would necessarily have to be the same as the identity
keys those authorities use in their capacity as routers. Decoupling these
keys would give directory authorities the following set of keys:
Directory authority identity:
Highly confidential; stored encrypted and/or offline. Used to
identify directory authorities. Shipped with clients. Used to
sign Directory authority signing keys.
Directory authority signing key:
Stored online, accessible to regular Tor process. Used to sign
votes and consensus directories. Downloaded as part of a "keys"
document.
[Administrators SHOULD rotate their signing keys every month or
two, just to keep in practice and keep from forgetting the
password to the authority identity.]
V1-V2 directory authority identity:
Stored online, never changed. Used to sign legacy network-status
and directory documents.
Router identity:
Stored online, seldom changed. Used to sign server descriptors
for this authority in its role as a router. Implicitly certified
by being listed in network-status documents.
Onion key, link key:
As in tor-spec.txt
Extensions to Proposal 101.
Define a new document type, "Key certificate". It contains the
following fields, in order:
"dir-key-certificate-version": As network-status-version. Must be
"3".
"fingerprint": Hex fingerprint, with spaces, based on the directory
authority's identity key.
"dir-identity-key": The long-term identity key for this authority.
"dir-key-published": The time when this directory's signing key was
last changed.
"dir-key-expires": A time after which this key is no longer valid.
"dir-signing-key": As in proposal 101.
"dir-key-certification": A signature of the above fields, in order.
The signed material extends from the beginning of
"dir-key-certificate-version" through the newline after
"dir-key-certification". The identity key is used to generate
this signature.
These elements together constitute a "key certificate". These are
generated offline when starting a v3 authority. Private identity
keys SHOULD be stored offline, encrypted, or both. A running
authority only needs access to the signing key.
Unlike other keys currently used by Tor, the authority identity
keys and directory signing keys MAY be longer than 1024 bits.
(They SHOULD be 2048 bits or longer; they MUST NOT be shorter than
1024.)
Vote documents change as follows:
A key certificate MUST be included in-line in every vote document. With
the exception of "fingerprint", its elements MUST NOT appear in consensus
documents.
Consensus network statuses change as follows:
Remove dir-signing-key.
Change "directory-signature" to take a fingerprint of the authority's
identity key and a fingerprint of the authority's current signing key
rather than the authority's nickname.
Change "dir-source" to take a fingerprint of the authority's
identity key rather than the authority's nickname or hostname.
Add a new document type:
A "keys" document contains all currently known key certificates.
All authorities serve it at
http://<hostname>/tor/status/keys.z
Caches and clients download the keys document whenever they receive a
consensus vote that uses a key they do not recognize. Caches download
from authorities; clients download from caches.
Processing votes:
When receiving a vote, authorities check to see if the key
certificate for the voter is different from the one they have. If
the key certificate _is_ different, and its dir-key-published is
more recent than the most recently known one, and it is
well-formed and correctly signed with the correct identity key,
then authorities remember it as the new canonical key certificate
for that voter.
A key certificate is invalid if any of the following hold:
* The version is unrecognized.
* The fingerprint does not match the identity key.
* The identity key or the signing key is ill-formed.
* The published date is very far in the past or future.
* The signature is not a valid signature of the key certificate
generated with the identity key.
When processing the signatures on consensus, clients and caches act as
follows:
1. Only consider the directory-signature entries whose identity
key hashes match trusted authorities.
2. If any such entries have signing key hashes that match unknown
signing keys, download a new keys document.
3. For every entry with a known (identity key,signing key) pair,
check the signature on the document.
4. If the document has been signed by more than half of the
authorities the client recognizes, treat the consensus as
correctly signed.
If not, but the number of entries with known identity keys but
unknown signing keys might be enough to make the consensus
correctly signed, do not use the consensus, but do not discard
it until we have a new keys document.
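The four steps can be sketched as follows (hypothetical data structures; check_sig stands in for RSA signature verification):

```python
def evaluate_consensus(signatures, trusted, known_signing, check_sig):
    """signatures: list of (identity_hash, signing_key_hash, sig) entries.
    trusted: set of identity-key hashes of recognized authorities.
    known_signing: {signing_key_hash: key} of signing keys we already hold.
    Returns 'accept', 'reject', or 'fetch-keys'."""
    # Step 1: only entries whose identity key matches a trusted authority.
    relevant = [s for s in signatures if s[0] in trusted]
    # Step 2: entries whose signing key we do not yet have.
    unknown = [s for s in relevant if s[1] not in known_signing]
    # Step 3: verify every signature made with a known key pair.
    valid = sum(1 for ident, skh, sig in relevant
                if skh in known_signing and check_sig(known_signing[skh], sig))
    threshold = len(trusted) / 2
    # Step 4: accept if signed by more than half the recognized authorities.
    if valid > threshold:
        return "accept"
    if valid + len(unknown) > threshold:
        return "fetch-keys"  # might verify once the keys document arrives
    return "reject"
```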
Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes moving unused-by-clients information from regular
router descriptors into a new "extra info" router descriptor.
Proposal:
Some of the costliest fields in the current directory protocol are ones
that no client actually uses. In particular, the "read-history" and
"write-history" fields are used only by the authorities for monitoring the
status of the network. If we took them out, the size of a compressed list
of all the routers would fall by about 60%. (No other disposable field
would save much more than 2%.)
We propose to remove these fields from descriptors, and have them
uploaded as a part of a separate signed "extra info" to the authorities.
This document will be signed. A hash of this document will be included in
the regular descriptors.
(We considered another design, where routers would generate and upload a
short-form and a long-form descriptor. Only the short-form descriptor would
ever be used by anybody for routing. The long-form descriptor would be
used only for analytics and other tools. We decided against this because
well-behaved tools would need to download short-form descriptors too (as
these would be the only ones indexed), and hence get redundant info. Badly
behaved tools would download only long-form descriptors, and expose
themselves to partitioning attacks.)
Other disposable fields:
Clients don't need these fields, but removing them doesn't help bandwidth
enough to be worthwhile.
contact (save about 1%)
fingerprint (save about 3%)
We could represent these fields more succinctly, but removing them would
only save 1%. (!)
reject
accept
(Apparently, exit policies are highly compressible.)
[Does size-on-disk matter to anybody? Some clients and servers don't
have much disk, or have really slow disk (e.g. USB). And we don't
store caches compressed right now. -RD]
Specification:
1. Extra Info Format.
An "extra info" descriptor contains the following fields:
"extra-info" Nickname Fingerprint
Identifies what router this is an extra info descriptor for.
Fingerprint is encoded in hex (using upper-case letters), with
no spaces.
"published" As currently documented in dir-spec.txt. It MUST match the
"published" field of the descriptor published at the same time.
"read-history"
"write-history"
As currently documented in dir-spec.txt. Optional.
"router-signature" NL Signature NL
A signature of the PKCS1-padded hash of the entire extra info
document, taken from the beginning of the "extra-info" line, through
the newline after the "router-signature" line. An extra info
document is not valid unless the signature is performed with the
identity key whose digest matches FINGERPRINT.
The "extra-info" field is required and MUST appear first. The
router-signature field is required and MUST appear last. All others are
optional. As for other documents, unrecognized fields must be ignored.
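The ordering rules above can be sketched as a small check. This is a hedged illustration, not real parsing code: it assumes the document has already been split into a list of item keywords, and it ignores multi-line values and the signature block entirely.

```python
# Minimal sketch of the extra-info field-ordering rules: "extra-info"
# is required and MUST be first; "router-signature" is required and
# MUST be last; everything else is optional and unrecognized fields
# are ignored.

def check_extra_info_order(keywords):
    """keywords: list of item keywords in document order."""
    if not keywords:
        return False
    if keywords[0] != "extra-info":          # required, MUST appear first
        return False
    if keywords[-1] != "router-signature":   # required, MUST appear last
        return False
    return True

print(check_extra_info_order(
    ["extra-info", "published", "read-history", "router-signature"]))  # True
print(check_extra_info_order(
    ["published", "extra-info", "router-signature"]))                  # False
```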
2. Existing formats
Implementations that use "read-history" and "write-history" SHOULD
continue accepting router descriptors that contain them. (Prior to
0.2.0.x, this information was encoded in ordinary router descriptors;
in any case they have always been listed as opt, so they should be
accepted anyway.)
Add these fields to router descriptors:
"extra-info-digest" Digest
"Digest" is a hex-encoded digest (using upper-case characters)
of the router's extra-info document, as signed in the router's
extra-info. (If this field is absent, no extra-info-digest
exists.)
"caches-extra-info"
Present if this router is a directory cache that provides
extra-info documents, or an authority that handles extra-info
documents.
(Since implementations before 0.1.2.5-alpha required that the "opt"
keyword precede any unrecognized entry, these keys MUST be preceded
with "opt" until 0.1.2.5-alpha is obsolete.)
3. New communications rules
Servers SHOULD generate and upload one extra-info document after each
descriptor they generate and upload; no more, no less. Servers MUST
upload the new descriptor before they upload the new extra-info.
Authorities receiving an extra-info document SHOULD verify all of the
following:
* They have a router descriptor for some server with a matching
nickname and identity fingerprint.
* That server's identity key has been used to sign the extra-info
document.
* The extra-info-digest field in the router descriptor matches
the digest of the extra-info document.
* The published fields in the two documents match.
Authorities SHOULD drop extra-info documents that do not meet these
criteria.
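The four authority-side checks above can be sketched as follows. This is a hedged outline only: the dictionary shapes are hypothetical, and a real authority would verify the RSA signature and compute the document digest itself rather than take them as inputs.

```python
# Sketch of the acceptance rules for an uploaded extra-info document.

def accept_extra_info(extra, descriptors, signature_ok, digest):
    """extra: dict with 'nickname', 'fingerprint', 'published'.
    descriptors: {fingerprint: descriptor dict} known to the authority.
    signature_ok: result of verifying the document with the identity key.
    digest: digest of the extra-info document as received."""
    desc = descriptors.get(extra["fingerprint"])
    if desc is None or desc["nickname"] != extra["nickname"]:
        return False                  # no matching router descriptor
    if not signature_ok:
        return False                  # not signed with the identity key
    if desc.get("extra-info-digest") != digest:
        return False                  # extra-info-digest mismatch
    if desc["published"] != extra["published"]:
        return False                  # published fields must match
    return True
```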
Extra-info documents MAY be uploaded as part of the same HTTP post as
the router descriptor, or separately. Authorities MUST accept both
methods.
Authorities SHOULD try to fetch extra-info documents from one another if
they do not have one matching the digest declared in a router
descriptor.
Caches running alongside a tool that needs to use extra-info
documents MAY download and store extra-info documents. They should do
so when they notice that the recommended descriptor has an
extra-info-digest not matching any extra-info document they currently
have. (Caches not running on a host that needs to use extra-info
documents SHOULD NOT download or cache them.)
4. New URLs
http://<hostname>/tor/extra/d/...
http://<hostname>/tor/extra/fp/...
http://<hostname>/tor/extra/all[.z]
(As for /tor/server/ URLs: supports fetching extra-info documents
by their digest, by the fingerprint of their servers, or all
at once. When serving by fingerprint, we serve the extra-info
that corresponds to the descriptor we would serve by that
fingerprint. Only directory authorities are guaranteed to support
these URLs.)
http://<hostname>/tor/extra/authority[.z]
(The extra-info document for this router.)
Extra-info documents are uploaded to the same URLs as regular
router descriptors.
Migration:
For extra info approach:
* First:
* Authorities should accept extra info, and support serving it.
* Routers should upload extra info once authorities accept it.
* Caches should support an option to download and cache it, once
authorities serve it.
* Tools should be updated to use locally cached information.
These tools include:
lefkada's exit.py script.
tor26's noreply script and general directory cache.
https://nighteffect.us/tns/ for its graphs
and check with or-talk for the rest, once it's time.
* Set a cutoff time for including bandwidth in router descriptors, so
that tools that use bandwidth info know that they will need to fetch
extra info documents.
* Once tools that want bandwidth info support fetching extra info:
* Have routers stop including bandwidth info in their router
descriptors.
Filename: 105-handshake-revision.txt
Title: Version negotiation for the Tor protocol.
Author: Nick Mathewson, Roger Dingledine
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document was extracted from a modified version of tor-spec.txt that we
had written before the proposal system went into place. It adds two new
cells types to the Tor link connection setup handshake: one used for
version negotiation, and another to prevent MITM attacks.
This proposal is partially implemented, and partially superseded by
proposal 130.
Motivation: Tor versions
Our *current* approach to versioning the Tor protocol(s) has been as
follows:
- All changes must be backward compatible.
- It's okay to add new cell types, if they would be ignored by previous
versions of Tor.
- It's okay to add new data elements to cells, if they would be
ignored by previous versions of Tor.
- For forward compatibility, Tor must ignore cell types it doesn't
recognize, and ignore data in those cells it doesn't expect.
- Clients can inspect the version of Tor declared in the platform line
of a router's descriptor, and use that to learn whether a server
supports a given feature. Servers, however, aren't assumed to all
know about each other, and so don't know the version of who they're
talking to.
This system has these problems:
- It's very hard to change fundamental aspects of the protocol, like the
cell format, the link protocol, any of the various encryption schemes,
and so on.
- The router-to-router link protocol has remained more-or-less frozen
for a long time, since we can't easily have an OR use new features
unless it knows the other OR will understand them.
We need to resolve these problems because:
- Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will
not seem like the best idea for all time.
- There are many ideas circulating for multiple cell sizes; while it's
not obvious whether these are safe, we can't do them at all without a
mechanism to permit them.
- There are many ideas circulating for alternative circuit building and
cell relay rules: they don't work unless they can coexist in the
current network.
- If our protocol changes a lot, it's hard to describe any coherent
version of it: we need to say "the version that Tor versions W through
X use when talking to versions Y through Z". This makes analysis
harder.
Motivation: Preventing MITM attacks
TLS prevents a man-in-the-middle attacker from reading or changing the
contents of a communication. It does not, however, prevent such an
attacker from observing timing information. Since timing attacks are some
of the most effective against low-latency anonymity nets like Tor, we
should take more care to make sure that we're not only talking to who
we think we're talking to, but that we're using the network path we
believe we're using.
Motivation: Signed clock information
It's very useful for Tor instances to know how skewed they are relative
to one another. The only way to find out currently has been to download
directory information, and check the Date header--but this is not
authenticated, and hence subject to modification on the wire. Using
BEGIN_DIR to create an authenticated directory stream through an existing
circuit is better, but that's an extra step and it might be nicer to
learn the information in the course of the regular protocol.
Proposal:
1.0. Version numbers
The node-to-node TLS-based "OR connection" protocol and the multi-hop
"circuit" protocol are versioned quasi-independently.
Of course, some dependencies will continue to exist: Certain versions
of the circuit protocol may require a minimum version of the connection
protocol to be used. The connection protocol affects:
- Initial connection setup, link encryption, transport guarantees,
etc.
- The allowable set of cell commands
- Allowable formats for cells.
The circuit protocol determines:
- How circuits are established and maintained
- How cells are decrypted and relayed
- How streams are established and maintained.
Version numbers are incremented for backward-incompatible protocol changes
only. Backward-compatible changes are generally implemented by adding
additional fields to existing structures; implementations MUST ignore
fields they do not expect. Unused portions of cells MUST be set to zero.
Though versioning the protocol will make it easier to maintain backward
compatibility with older versions of Tor, we will nevertheless continue to
periodically drop support for older protocols,
- to keep the implementation from growing without bound,
- to limit the maintenance burden of patching bugs in obsolete Tors,
- to limit the testing burden of verifying that many old protocol
versions continue to be implemented properly, and
- to limit the exposure of the network to protocol versions that are
expensive to support.
The Tor protocol as implemented through the 0.1.2.x Tor series will be
called "version 1" in its link protocol and "version 1" in its relay
protocol. Versions of the Tor protocol so old as to be incompatible with
Tor 0.1.2.x can be considered to be version 0 of each, and are not
supported.
2.1. VERSIONS cells
When a Tor connection is established, both parties normally send a
VERSIONS cell before sending any other cells. (But see below.)
VersionsLen [2 byte]
Versions [VersionsLen bytes]
"Versions" is a sequence of VersionsLen bytes. Each value between 1 and
127 inclusive represents a single version; current implementations MUST
ignore other bytes. Parties should list all of the versions which they
are able and willing to support. Parties can only communicate if they
have some connection protocol version in common.
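The payload format and the "highest common version" rule can be sketched as below. This is an illustrative encoding of the VERSIONS body only (not a full cell), assuming a big-endian two-byte VersionsLen as specified above.

```python
import struct

# Sketch of the VERSIONS payload: a 2-byte big-endian length followed
# by one byte per version. On parse, values outside 1..127 are
# ignored, as the spec requires.

def encode_versions(versions):
    body = bytes(versions)
    return struct.pack("!H", len(body)) + body

def parse_versions(payload):
    (vlen,) = struct.unpack("!H", payload[:2])
    return {v for v in payload[2:2 + vlen] if 1 <= v <= 127}

def negotiate(mine, their_payload):
    common = set(mine) & parse_versions(their_payload)
    return max(common) if common else None  # highest shared version wins

print(negotiate([1, 2, 3], encode_versions([2, 3, 4])))  # 3
```

If the two sides share no version, negotiation fails and the connection cannot proceed.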
Tor versions before 0.2.0.x-alpha don't understand VERSIONS cells,
and therefore don't support version negotiation. Thus, waiting until
the other side has sent a VERSIONS cell won't work for these servers:
if the other side sends no cells back, it is impossible to tell
whether they have sent a VERSIONS cell that has been stalled, or
whether they have dropped our own VERSIONS cell as unrecognized.
Therefore, we'll
change the TLS negotiation parameters so that old parties can still
negotiate, but new parties can recognize each other. Immediately
after a TLS connection has been established, the parties check
whether the other side negotiated the connection in an "old" way or a
"new" way. If either party negotiated in the "old" way, we assume a
v1 connection. Otherwise, both parties send VERSIONS cells listing
all their supported versions. Upon receiving the other party's
VERSIONS cell, the implementation begins using the highest-valued
version common to both cells. If the first cell from the other party
has a recognized command, and is _not_ a VERSIONS cell, we assume a
v1 protocol.
(For more detail on the TLS protocol change, see forthcoming draft
proposals from Steven Murdoch.)
Implementations MUST discard VERSIONS cells that are not the first
recognized cells sent on a connection.
The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1
byte of command, 509 bytes of payload).
[NOTE: The VERSIONS cell is assigned the command number 7.]
2.2. MITM-prevention and time checking
If we negotiate a v2 connection or higher, the second cell we send SHOULD
be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other
times.
A NETINFO cell contains:
Timestamp [4 bytes]
Other OR's address [variable]
Number of addresses [1 byte]
This OR's addresses [variable]
Timestamp is the OR's current Unix time, in seconds since the epoch. If
an implementation receives time values from many ORs that
indicate that its clock is skewed, it SHOULD try to warn the
administrator. (We leave the definition of 'many' intentionally vague
for now.)
Before believing the timestamp in a NETINFO cell, implementations
SHOULD compare the time at which they received the cell to the time
when they sent their VERSIONS cell. If the difference is very large,
it is likely that the cell was delayed long enough that its
contents are out of date.
Each address contains Type/Length/Value as used in Section 6.4 of
tor-spec.txt. The first address is the one that the party sending
the NETINFO cell believes the other has -- it can be used to learn
what your IP address is if you have no other hints.
The rest of the addresses are the advertised addresses of the party
sending the NETINFO cell -- we include them
to block a man-in-the-middle attack on TLS that lets an attacker bounce
traffic through his own computers to enable timing and packet-counting
attacks.
A Tor instance should use the other Tor's reported address
information as part of logic to decide whether to treat a given
connection as suitable for extending circuits to a given address/ID
combination. When we get an extend request, we use an
existing OR connection if the ID matches, and ANY of the following
conditions hold:
- The IP matches the requested IP.
- We know that the IP we're using is canonical because it was
listed in the NETINFO cell.
- We know that the IP we're using is canonical because it was
listed in the server descriptor.
[NOTE: The NETINFO cell is assigned the command number 8.]
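The NETINFO body layout above can be sketched as follows, for IPv4 addresses only. The TLV address encoding used here (type byte, length byte, value) follows Section 6.4 of tor-spec.txt; the specific type value 0x04 for IPv4 should be treated as an assumption of this sketch.

```python
import socket
import struct
import time

# Sketch of a NETINFO cell body: 4-byte timestamp, the other OR's
# address as a TLV, a 1-byte address count, then this OR's addresses.

def tlv_ipv4(addr):
    return bytes([0x04, 4]) + socket.inet_aton(addr)

def build_netinfo(other_addr, my_addrs, now=None):
    ts = int(time.time() if now is None else now)
    body = struct.pack("!I", ts)           # Timestamp [4 bytes]
    body += tlv_ipv4(other_addr)           # Other OR's address
    body += bytes([len(my_addrs)])         # Number of addresses
    for a in my_addrs:                     # This OR's addresses
        body += tlv_ipv4(a)
    return body

cell = build_netinfo("198.51.100.7", ["203.0.113.1"], now=0)
assert len(cell) == 4 + 6 + 1 + 6          # timestamp + TLV + count + TLV
```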
Discussion: Versions versus feature lists
Many protocols negotiate lists of available features instead of (or in
addition to) protocol versions. While it's possible that some amount of
feature negotiation could be supported in a later Tor, we should prefer to
use protocol versions whenever possible, for reasons discussed in
the "Anonymity Loves Company" paper.
Discussion: Bytes per version, versions per cell
This document provides for a one-byte count of how many versions a Tor
supports, and allows one byte per version. Thus, it can support only
254 more versions of the protocol beyond the unallocated v0 and the
current v1. If we ever need to split the protocol into 255 incompatible
versions, we've probably screwed up badly somewhere.
Nevertheless, here are two ways we could support more versions:
- Change the version count to a two-byte field that counts the number of
_bytes_ used, and use a UTF8-style encoding: versions 0 through 127
take one byte to encode, versions 128 through 2047 take two bytes to
encode, and so on. We wouldn't need to parse any version higher than
127 right now, since all bytes used to encode higher versions would
have their high bit set.
We'd still have a limit of 380 simultaneous versions that could be
declared in any VERSIONS cell. This is probably okay.
- Decide that if we need to support more versions, we can add a
MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec
above requires Tors to ignore unrecognized cell types that they get
before the first VERSIONS cell, and still allows version negotiation
to succeed.
[Resolution: Reserve the high bit and the v0 value for later use. If
we ever have more live versions than we can fit in a cell, we've made a
bad design decision somewhere along the line.]
Discussion: Reducing round-trips
It might be appealing to see if we can cram more information in the
initial VERSIONS cell. For example, the contents of NETINFO will pretty
soon be sent by everybody before any more information is exchanged, but
decoupling them from the version exchange increases round-trips.
Instead, we could speculatively include handshaking information at
the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind
up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore
this." This could be extended to opportunistically reduce round trips
for future versions when we guess the versions right.
Of course, we'd need to be careful about using a feature like this:
- We don't want to include things that are expensive to compute,
like PK signatures or proof-of-work.
- We don't want to speculate as a mobile client: it may leak our
experience with the server in question.
Discussion: Advertising versions in routerdescs and networkstatuses.
In network-statuses:
The networkstatus "v" line now has the format:
"v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST
"Circuit" CIRCUIT-VERSION-LIST NL
LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of
supported version numbers. IMPLEMENTATION is the name of the
implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the
version of the implementation.
Examples:
v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5
v OtherOR 2000+ Link 3 Circuit 5
Implementations that release independently of the Tor codebase SHOULD NOT
use "Tor" as the value of their IMPLEMENTATION.
Additional fields on the "v" line MUST be ignored.
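A parser for the "v" line format above might look like the following hedged sketch. It takes keyword/value pairs after the implementation version, so unknown additional fields are skipped, as the spec requires.

```python
# Sketch of parsing the networkstatus "v" line:
#   "v" IMPLEMENTATION IMPL-VERSION "Link" LIST "Circuit" LIST

def parse_v_line(line):
    parts = line.split()
    assert parts[0] == "v"
    info = {"implementation": parts[1], "impl_version": parts[2]}
    i = 3
    while i + 1 < len(parts):
        if parts[i] in ("Link", "Circuit"):
            info[parts[i].lower()] = [int(x) for x in parts[i + 1].split(",")]
        i += 2                      # unknown fields are ignored
    return info

print(parse_v_line("v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5"))
```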
In router descriptors:
The router descriptor should contain a line of the form,
"protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT_VERSION_LIST
Additional fields on the "protocols" line MUST be ignored.
[Versions of Tor before 0.1.2.5-alpha rejected router descriptors with
unrecognized items; the protocols line should be preceded with an "opt"
until these Tors are obsolete.]
Security issues:
Client partitioning is the big danger when we introduce new versions; if a
client supports some very unusual set of protocol versions, it will stand
out from others no matter where it goes. If a server supports an unusual
version, it will get a disproportionate amount of traffic from clients who
prefer that version. We can mitigate this somewhat as follows:
- Do not have clients prefer any protocol version by default until that
version is widespread. (First introduce the new version to servers,
and have clients admit to using it only when configured to do so for
testing. Then, once many servers are running the new protocol
version, enable its use by default.)
- Do not multiply protocol versions needlessly.
- Encourage protocol implementors to implement the same protocol version
sets as some popular version of Tor.
- Disrecommend very old/unpopular versions of Tor via the directory
authorities' RecommendedVersions mechanism, even if it is still
technically possible to use them.
Filename: 106-less-tls-constraint.txt
Title: Checking fewer things during TLS handshakes
Author: Nick Mathewson
Created: 9-Feb-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes that we relax our requirements on the context of
X.509 certificates during initial TLS handshakes.
Motivation:
Later, we want to try harder to avoid protocol fingerprinting attacks.
This means that we'll need to make our connection handshake look closer
to a regular HTTPS connection: one certificate on the server side and
zero certificates on the client side. For now, about the best we
can do is to stop requiring things during handshake that we don't
actually use.
What we check now, and where we check it:
tor_tls_check_lifetime:
peer has certificate
notBefore <= now <= notAfter
tor_tls_verify:
peer has at least one certificate
There is at least one certificate in the chain
At least one of the certificates in the chain is not the one used to
negotiate the connection. (The "identity cert".)
The certificate _not_ used to negotiate the connection has signed the
link cert
tor_tls_get_peer_cert_nickname:
peer has a certificate.
certificate has a subjectName.
subjectName has a commonName.
commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2]
tor_tls_peer_has_cert:
peer has a certificate.
connection_or_check_valid_handshake:
tor_tls_peer_has_cert [1]
tor_tls_get_peer_cert_nickname [1]
tor_tls_verify [1]
If nickname in cert is a known, named router, then its identity digest
must be as expected.
If we initiated the connection, then we got the identity digest we
expected.
USEFUL THINGS WE COULD DO:
[1] We could just not force clients to have any certificate at all, let alone
an identity certificate. Internally to the code, we could assign the
identity_digest field of these or_connections to a random number, or even
not add them to the identity_digest->or_conn map.
[so if somebody connects with no certs, we let them. and mark them as
a client and don't treat them as a server. great. -rd]
[2] Instead of using a restricted nickname character set that makes our
commonName structure look unlike typical SSL certificates, we could treat
the nickname as extending from the start of the commonName up to but not
including the first non-nickname character.
Alternatively, we could stop checking commonNames entirely. We don't
actually _do_ anything based on the nickname in the certificate, so
there's really no harm in letting every router have any commonName it
wants.
[this is the better choice -rd]
[agreed. -nm]
REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS:
Assuming that we removed the above requirements, we could then (in a later
release) have clients not send certificates and start making our DNs a
little less formulaic. Client->server OR connections would still be
recognizable by:
having a two-certificate chain sent by the server
using a particular set of ciphersuites
traffic patterns
probing the server later
OTHER IMPLICATIONS:
If we stop verifying the above requirements:
It will be slightly (but only slightly) more common to connect to a non-Tor
server running TLS, and believe that you're talking to a Tor server (until
you send the first cell).
It will be far easier for non-Tor SSL clients to accidentally connect to
Tor servers and speak HTTPS or whatever to them.
If, in a later release, we have clients not send certificates, and we make
DNs less recognizable:
If clients don't send certs, servers don't need to verify them: win!
If we remove these restrictions, it will be easier for people to write
clients to fuzz our protocol: sorta win!
If clients don't send certs, they look slightly less like servers.
OTHER SPEC CHANGES:
When a client doesn't give us an identity, we should never extend any
circuits to it (duh), and we should allow it to set circuit ID however it
wants.
Filename: 107-uptime-sanity-checking.txt
Title: Uptime Sanity Checking
Author: Kevin Bauer & Damon McCoy
Created: 8-March-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document describes how to cap the uptime that is used when computing
which routers are marked as stable such that highly stable routers cannot
be displaced by malicious routers that report extremely high uptime
values.
This is similar to how bandwidth is capped at 1.5MB/s.
Motivation:
It has been pointed out that an attacker can displace all stable nodes and
entry guard nodes by reporting high uptimes. This is an easy fix that will
prevent highly stable nodes from being displaced.
Security implications:
It should decrease the effectiveness of routing attacks that report high
uptimes while not impacting the normal routing algorithms.
Specification:
So we could patch Section 3.1 of dir-spec.txt to say:
"Stable" -- A router is 'Stable' if it is running, valid, not
hibernating, and either its uptime is at least the median uptime for
known running, valid, non-hibernating routers, or its uptime is at
least 30 days. Routers are never called stable if they are running
a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha
through 0.1.1.16-rc are stupid this way.)
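The capped-uptime rule above can be sketched as follows. This is a hedged illustration: uptimes are clamped to 30 days before the median comparison, so an attacker reporting an enormous uptime gains nothing past the cap.

```python
import statistics

CAP = 30 * 24 * 3600  # 30 days, in seconds

def stable_routers(uptimes):
    """uptimes: {router_name: claimed_uptime_seconds}."""
    capped = {r: min(u, CAP) for r, u in uptimes.items()}
    median = statistics.median(capped.values())
    # The "or uptime is at least 30 days" clause is subsumed by the
    # cap: if u >= CAP then min(u, CAP) == CAP >= median always.
    return {r for r, u in capped.items() if u >= median}
```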
Compatibility:
There should be no compatibility issues due to uptime capping.
Implementation:
Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788).
Discussion:
Initially, this proposal set the maximum at 60 days, not 30; the 30 day
limit and spec wording was suggested by Roger in an or-dev post on 9 March
2007.
This proposal also led to 108-mtbf-based-stability.txt
Filename: 108-mtbf-based-stability.txt
Title: Base "Stable" Flag on Mean Time Between Failures
Author: Nick Mathewson
Created: 10-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes that we change how directory authorities set the
stability flag from inspection of a router's declared Uptime to the
authorities' perceived mean time between failure for the router.
Motivation:
Clients prefer nodes that the authorities call Stable. This flag is (as
of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for
uptime. This creates an opportunity for malicious nodes to declare
falsely high uptimes in order to get more traffic.
Spec changes:
Replace the current rule for setting the Stable flag with:
"Stable" -- A router is 'Stable' if it is active and its observed Stability
for the past month is at or above the median Stability for active routers.
Routers are never called stable if they are running a version of Tor
known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc
are stupid this way.)
Stability shall be defined as the weighted mean length of the runs
observed by a given directory authority. A run begins when an authority
decides that the server is Running, and ends when the authority decides
that the server is not Running. In-progress runs are counted when
measuring Stability. When calculating the mean, runs are weighted by
$\alpha ^ t$, where $t$ is time elapsed since the end of the run, and
$0 < \alpha < 1$. Time when an authority is down does not count toward
the length of the run.
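The Stability metric defined above can be sketched as a weighted mean of run lengths. This is illustrative only; time units (days here) and the value of alpha are assumptions of the sketch.

```python
# Sketch of Stability: the weighted mean length of observed runs,
# each run weighted by alpha**t, where t is the time since the run
# ended. An in-progress run has t = 0 and counts up to "now".

def stability(runs, now, alpha=0.9):
    """runs: list of (start, end) pairs; end=None means still running.
    Times are in days for this sketch."""
    num = den = 0.0
    for start, end in runs:
        length = (now if end is None else end) - start
        t = 0.0 if end is None else now - end
        w = alpha ** t
        num += w * length
        den += w
    return num / den if den else 0.0

# An old short run is discounted; the current run dominates.
print(stability([(0, 5), (20, None)], now=30))
```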
Rejected Alternative:
"A router's Stability shall be defined as the sum of $\alpha ^ d$ for every
$d$ such that the router was considered reachable for the entire day
$d$ days ago.
This allows a simpler implementation: every day, we multiply
yesterday's Stability by alpha, and if the router was observed to be
available every time we looked today, we add 1.
Instead of "day", we could pick an arbitrary time unit. We should
pick alpha to be high enough that long-term stability counts, but low
enough that the distant past is eventually forgotten. Something
between .8 and .95 seems right.
(By requiring that routers be up for an entire day to get their
stability increased, instead of counting fractions of a day, we
capture the notion that stability is more like "probability of
staying up for the next hour" than it is like "probability of being
up at some randomly chosen time over the next hour." The former
notion of stability is far more relevant for long-lived circuits.)
Limitations:
Authorities can have false positives and false negatives when trying to
tell whether a router is up or down. So long as these aren't terribly
wrong, and so long as they aren't significantly biased, we should be able
to use them to estimate stability pretty well.
Probing approaches like the above could miss short incidents of
downtime. If we use the router's declared uptime, we could detect
these: but doing so would penalize routers who reported their uptime
accurately.
Implementation:
For now, the easiest way to store this information at authorities
would probably be in some kind of periodically flushed flat file.
Later, we could move to Berkeley db or something if we really had to.
For each router, an authority will need to store:
The router ID.
Whether the router is up.
The time when the current run started, if the router is up.
The weighted sum length of all previous runs.
The time at which the weighted sum length was last weighted down.
Authorities should probe at random intervals to test whether servers
are running.
Filename: 109-no-sharing-ips.txt
Title: No more than one server per IP address
Author: Kevin Bauer & Damon McCoy
Created: 9-March-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document describes a solution to a Sybil attack vulnerability in the
directory servers. Currently, it is possible for a single IP address to
host an arbitrarily high number of Tor routers. We propose that the
directory servers limit the number of Tor routers that may be registered at
a particular IP address to some small (fixed) number, perhaps just one Tor
router per IP address.
While Tor never uses more than one server from a given /16 in the same
circuit, an attacker with multiple servers in the same place is still
dangerous because he can get around the per-server bandwidth cap that is
designed to prevent a single server from attracting too much of the overall
traffic.
Motivation:
Since it is possible for an attacker to register an arbitrarily large
number of Tor routers, it is possible for malicious parties to do this
as part of a traffic analysis attack.
Security implications:
This countermeasure will increase the number of IP addresses that an
attacker must control in order to carry out traffic analysis.
Specification:
For each IP address, each directory authority tracks the number of routers
using that IP address, along with their total observed bandwidth. If there
are more than MAX_SERVERS_PER_IP servers at some IP, the authority should
"disable" all but MAX_SERVERS_PER_IP servers. When choosing which servers
to disable, the authority should first disable non-Running servers in
increasing order of observed bandwidth, and then should disable Running
servers in increasing order of bandwidth.
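The selection rule above can be sketched as a sort. This is a hedged illustration of the ordering only: non-Running servers are disabled first, and within each group, lower observed bandwidth goes first.

```python
# Sketch of choosing which servers at one IP address to disable.

MAX_SERVERS_PER_IP = 3

def choose_disabled(servers):
    """servers: list of (name, running, observed_bw) at a single IP."""
    if len(servers) <= MAX_SERVERS_PER_IP:
        return []
    # Non-Running (False) sorts before Running (True); within each
    # group, lower observed bandwidth sorts first, so the front of
    # the list is the first to be disabled.
    order = sorted(servers, key=lambda s: (s[1], s[2]))
    n_disable = len(servers) - MAX_SERVERS_PER_IP
    return [name for name, _, _ in order[:n_disable]]
```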
[[ We don't actually do this part here. -NM
If the total observed
bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP,
the authority should "disable" some of the remaining servers until only one
server remains, or until the remaining observed bandwidth of non-"disabled"
servers is under MAX_BW_PER_IP.
]]
Servers that are "disabled" MUST be marked as non-Valid and non-Running.
MAX_SERVERS_PER_IP is 3.
MAX_BW_PER_IP is 8 MB per s.
Compatibility:
Upon inspection of a directory server, we found that the following IP
addresses have more than one Tor router:
Scruples 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 443
WiseUp 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 9001
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
aurel 85.180.62.138 e180062138.adsl.alicedsl.de 9001
sokrates 85.180.62.138 e180062138.adsl.alicedsl.de 9001
moria1 18.244.0.188 moria.mit.edu 9001
peacetime 18.244.0.188 moria.mit.edu 9100
There may exist compatibility issues with this proposed fix. Reasons why
more than one server would share an IP address include:
* Testing. moria1, moria2, peacetime, and other morias all run on one
computer at MIT, because that way we get testing. Moria1 and moria2 are
run by Roger, and peacetime is run by Nick.
* NAT. If there are several servers but they port-forward through the same
IP address, ... we can hope that the operators coordinate with each
other. Also, we should recognize that while they help the network in
terms of increased capacity, they don't help as much as they could in
terms of location diversity. But our approach so far has been to take
what we can get.
* People who have more than 1.5MB/s and want to help out more. For
example, for a while Tonga was offering 10MB/s and its Tor server
would only make use of a bit of it. So Roger suggested that he run
two Tor servers, to use more.
[Note Roger's tweak to this behavior, in
http://archives.seul.org/or/cvs/Oct-2007/msg00118.html]
Filename: 110-avoid-infinite-circuits.txt
Title: Avoiding infinite length circuits
Author: Roger Dingledine
Created: 13-Mar-2007
Status: Closed
Target: 0.2.3.x
Implemented-In: 0.2.1.3-alpha, 0.2.3.11-alpha
History:
Revised 28 July 2008 by nickm: set K.
Revised 3 July 2008 by nickm: rename from relay_extend to
relay_early. Revise to current migration plan. Allow K cells
over circuit lifetime, not just at start.
Overview:
Right now, an attacker can add load to the Tor network by extending a
circuit an arbitrary number of times. Every cell that goes down the
circuit then adds N times that amount of load in overall bandwidth
use. This vulnerability arises because servers don't know their position
on the path, so they can't tell how many nodes precede them.
We propose a new set of relay cells that are distinguishable by
intermediate hops as permitting extend cells. This approach will allow
us to put an upper bound on circuit length relative to the number of
colluding adversary nodes; but there are some downsides too.
Motivation:
The above attack can be used to generally increase load all across the
network, or it can be used to target specific servers: by building a
circuit back and forth between two victim servers, even a low-bandwidth
attacker can soak up all the bandwidth offered by the fastest Tor
servers.
The general attacks could be used as a demonstration that Tor isn't
perfect (leading to yet more media articles about "breaking" Tor), and
the targeted attacks will come into play once we have a reputation
system -- it will be trivial to DoS a server so it can't pass its
reputation checks, in turn impacting security.
Design:
We should split RELAY cells into two types: RELAY and RELAY_EARLY.
Only K (say, 10) relay_early cells can be sent across a circuit, and
only relay_early cells are allowed to contain extend requests. We
still support obscuring the length of the circuit (if more research
shows us what to do), because Alice can choose how many of the K to
mark as relay_early. Note that relay_early cells *can* contain any
sort of data cell; so in effect it's actually the relay type cells
that are restricted. By default, she would just send the first K
data cells over the stream as relay_early cells, regardless of their
actual type.
(Note that a circuit that is out of relay_early cells MUST NOT be
cannibalized later, since it can't extend. Note also that it's always okay
to use regular RELAY cells when sending non-EXTEND commands targeted at
the first hop of a circuit, since there is no intermediate hop to try to
learn the relay command type.)
Each intermediate server would pass on the same type of cell that it
received (either relay or relay_early), and the cell's destination
will be able to learn whether it's allowed to contain an Extend request.
If an intermediate server receives more than K relay_early cells, or
if it sees a relay cell that contains an extend request, then it
tears down the circuit (protocol violation).
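The enforcement rule at an intermediate hop can be sketched as follows. This is an illustrative Python model, not Tor's actual C implementation; the cell-type names, the Circuit class, and the return values are ours:

```python
K = 8  # maximum relay_early cells per circuit (see Parameters below)

RELAY, RELAY_EARLY = "relay", "relay_early"

class Circuit:
    """Per-circuit state an intermediate relay might keep."""
    def __init__(self):
        self.relay_early_seen = 0
        self.open = True

    def handle_cell(self, cell_type, contains_extend):
        """Apply the RELAY_EARLY rules; return what to do with the cell."""
        if cell_type == RELAY_EARLY:
            self.relay_early_seen += 1
            if self.relay_early_seen > K:
                self.open = False      # protocol violation: tear down
                return "destroy"
        elif contains_extend:
            self.open = False          # EXTEND outside RELAY_EARLY
            return "destroy"
        # Pass on the same cell type we received.
        return "forward"
```

Note that a relay_early cell carrying an extend request is the permitted case; only extends in plain relay cells, or more than K relay_early cells, violate the protocol.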
Security implications:
The upside is that this limits the bandwidth amplification factor to
K: for an individual circuit to become arbitrary-length, the attacker
would need an adversary-controlled node every K hops, and at that
point the attack is no worse than if the attacker creates N/K separate
K-hop circuits.
On the other hand, we want to pick a large enough value of K that we
don't mind the cap.
If we ever want to take steps to hide the number of hops in the circuit
or a node's position in the circuit, this design probably makes that
more complex.
Migration:
In 0.2.0, servers speaking v2 or later of the link protocol accept
RELAY_EARLY cells, and pass them on. If the next OR in the circuit
is not speaking the v2 link protocol, the server relays the cell as
a RELAY cell.
In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2
connections. This functionality can be safely backported to
0.2.0.x. Clients should pick a random number between (say) K-2 and K
to send.
In 0.2.1.3-alpha, servers close any circuit in which more than K
relay_early cells are sent.
Once all versions that do not send RELAY_EARLY cells are obsolete,
servers can begin to reject any EXTEND requests not sent in a
RELAY_EARLY cell.
Parameters:
Let K = 8, for no terribly good reason.
Spec:
[We can formalize this part once we think the design is a good one.]
Acknowledgements:
This design has been kicking around since Christian Grothoff and I came
up with it at PET 2004. (Nathan Evans, Christian Grothoff's student,
is working on implementing a fix based on this design in the summer
2007 timeframe.)
Filename: 111-local-traffic-priority.txt
Title: Prioritizing local traffic over relayed traffic
Author: Roger Dingledine
Created: 14-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
We describe some ways to let Tor users operate as a relay and enforce
rate limiting for relayed traffic without impacting their locally
initiated traffic.
Motivation:
Right now we encourage people who use Tor as a client to configure it
as a relay too ("just click the button in Vidalia"). Most of these users
are on asymmetric links, meaning they have a lot more download capacity
than upload capacity. But if they enable rate limiting too, suddenly
they're limited to the same download capacity as upload capacity. And
they have to enable rate limiting, or their upstream pipe gets filled
up, starts dropping packets, and now their net connection doesn't work
even for non-Tor stuff. So they end up turning off the relaying part
so they can use Tor (and other applications) again.
So far this hasn't mattered that much: most of our fast relays are
being operated only in relay mode, so the rate limiting makes sense
for them. But if we want to be able to attract many more relays in
the future, we need to let ordinary users act as relays too.
Further, as we begin to deploy the blocking-resistance design and we
rely on ordinary users to click the "Tor for Freedom" button, this
limitation will become a serious stumbling block to getting volunteers
to act as bridges.
The problem:
Tor implements its rate limiting on the 'read' side by only reading
a certain number of bytes from the network in each second. If it has
emptied its token bucket, it doesn't read any more from the network;
eventually TCP notices and stalls until we resume reading. But if we
want to have two classes of service, we can't know what class a given
incoming cell will be until we look at it, at which point we've already
read it.
Some options:
Option 1: read when our token bucket is full enough, and if it turns
out that what we read was local traffic, then add the tokens back into
the token bucket. This will work when local traffic load alternates
with relayed traffic load; but it's a poor option in general, because
when we're receiving both local and relayed traffic, there are plenty
of cases where we'll end up with an empty token bucket, and then we're
back where we were before.
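The read-side bucket and Option 1's token refund can be sketched like this. It is a simplification with illustrative names; the real logic lives in Tor's connection layer:

```python
class TokenBucket:
    """Read-side rate limiter in the style the problem statement describes."""
    def __init__(self, rate, burst):
        self.rate = rate        # bytes of read allowance added per second
        self.capacity = burst   # maximum bucket size
        self.tokens = burst

    def refill(self, elapsed):
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed)

    def try_read(self, nbytes):
        # Read only while tokens remain; once empty, stop reading and
        # let TCP stall the sender until the next refill.
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False

    def refund(self, nbytes):
        # Option 1: give tokens back when what we read turned out to be
        # local traffic. This fails to help when mixed local and relayed
        # traffic empties the bucket at the same time.
        self.tokens = min(self.capacity, self.tokens + nbytes)
```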
More generally, notice that our problem is easy when a given TCP
connection either has entirely local circuits or entirely relayed
circuits. In fact, even if they are both present, if one class is
entirely idle (none of its circuits have sent or received in the past
N seconds), we can ignore that class until it wakes up again. So it
only gets complex when a single connection contains active circuits
of both classes.
Next, notice that local traffic uses only the entry guards, whereas
relayed traffic likely doesn't. So if we're a bridge handling just
a few users, the expected number of overlapping connections would be
almost zero, and even if we're a full relay the number of overlapping
connections will be quite small.
Option 2: build separate TCP connections for local traffic and for
relayed traffic. In practice this will actually only require a few
extra TCP connections: we would only need redundant TCP connections
to at most the number of entry guards in use.
However, this approach has some drawbacks. First, if the remote side
wants to extend a circuit to you, how does it know which TCP connection
to send it on? We would need some extra scheme to label some connections
"client-only" during construction. Perhaps we could do this by seeing
whether any circuit was made via CREATE_FAST; but this still opens
up a race condition where the other side sends a create request
immediately. The only ways I can imagine to avoid the race entirely
are to specify our preference in the VERSIONS cell, or to add some
sort of "nope, not this connection, why don't you try another rather
than failing" response to create cells, or to forbid create cells on
connections that you didn't initiate and on which you haven't seen
any circuit creation requests yet -- this last one would lead to a bit
more connection bloat but doesn't seem so bad. And we already accept
this race for the case where directory authorities establish new TCP
connections periodically to check reachability, and then hope to hang
up on them soon after. (In any case this issue is moot for bridges,
since each destination will be one-way with respect to extend requests:
either receiving extend requests from bridge users or sending extend
requests to the Tor server, never both.)
The second problem with option 2 is that using two TCP connections
reveals that there are two classes of traffic (and probably quickly
reveals which is which, based on throughput). Now, it's unclear whether
this information is already available to the other relay -- he would
easily be able to tell that some circuits are fast and some are rate
limited, after all -- but it would be nice to not add even more ways to
leak that information. Also, it's less clear that an external observer
already has this information if the circuits are all bundled together,
and for this case it's worth trying to protect it.
Option 3: tell the other side about our rate limiting rules. When we
establish the TCP connection, specify the different policy classes we
have configured. Each time we extend a circuit, specify which policy
class that circuit should be part of. Then hope the other side obeys
our wishes. (If he doesn't, hang up on him.) Besides the design and
coordination hassles involved in this approach, there's a big problem:
our rate limiting classes apply to all our connections, not just
pairwise connections. How does one server we're connected to know how
much of our bucket has already been spent by another? I could imagine
a complex and inefficient "ok, now you can send me those two more cells
that you've got queued" protocol. I'm not sure how else we could do it.
(Gosh. How could UDP designs possibly be compatible with rate limiting
with multiple bucket sizes?)
Option 4: put both classes of circuits over a single connection, and
keep track of the last time we read or wrote a high-priority cell. If
it's been less than N seconds, give the whole connection high priority,
else give the whole connection low priority.
Option 5: put both classes of circuits over a single connection, and
play a complex juggling game by periodically telling the remote side
what rate limits to set for that connection, so you end up giving
priority to the right connections but still stick to roughly your
intended BandwidthRate and RelayBandwidthRate.
Option 6: ?
Prognosis:
Nick really didn't like option 2 because of the partitioning questions.
I've put option 4 into place as of Tor 0.2.0.3-alpha.
In terms of implementation, it will be easy: just add a time_t to
or_connection_t that specifies client_used (used by the initiator
of the connection to rate limit it differently depending on how
recently the time_t was reset). We currently update client_used
in three places:
- command_process_relay_cell() when we receive a relay cell for
an origin circuit.
- relay_send_command_from_edge() when we send a relay cell for
an origin circuit.
- circuit_deliver_create_cell() when we send a create cell.
We could probably remove the third case and it would still work,
but hey.
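The client_used mechanism above can be sketched as follows. The class and method names approximate the description of Tor's or_connection_t field, and the 30-second window for N is an illustrative value, not one the proposal fixes:

```python
import time

HIGH_PRIORITY_WINDOW = 30  # N seconds; illustrative value

class ORConnection:
    """Track when this connection last carried locally originated traffic."""
    def __init__(self):
        self.client_used = None   # last local (origin-circuit) activity

    def note_client_activity(self, now=None):
        # Called when we send or receive a relay cell for an origin
        # circuit, or when we send a create cell.
        self.client_used = now if now is not None else time.time()

    def is_high_priority(self, now=None):
        # Option 4: the whole connection is high priority while the
        # client_used timestamp is fresh, low priority otherwise.
        if self.client_used is None:
            return False
        now = now if now is not None else time.time()
        return (now - self.client_used) < HIGH_PRIORITY_WINDOW
```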
Filename: 112-bring-back-pathlencoinweight.txt
Title: Bring Back Pathlen Coin Weight
Author: Mike Perry
Created:
Status: Superseded
Superseded-By: 115
Overview:
The idea is that users should be able to choose a weight which
probabilistically chooses their path lengths to be 2 or 3 hops. This
weight will essentially be a biased coin that indicates an
additional hop (beyond 2) with probability P. The user should be
allowed to choose 0 for this weight to always get 2 hops and 1 to
always get 3.
This value should be modifiable from the controller, and should be
available from Vidalia.
Motivation:
The Tor network is slow and overloaded. Increasingly often I hear
stories about friends and friends of friends who are behind firewalls,
annoying censorware, or under surveillance that interferes with their
productivity and Internet usage, or chills their speech. These people
know about Tor, but they choose to put up with the censorship because
Tor is too slow to be usable for them. In fact, to download a fresh,
complete copy of levine-timing.pdf for the Anonymity Implications
section of this proposal over Tor took me 3 tries.
There are many ways to improve the speed problem, and of course we
should and will implement as many as we can. Johannes's GSoC project
and my reputation system are longer term, higher-effort things that
will still provide benefit independent of this proposal.
However, reducing the path length to 2 for those who do not need the
(questionable) extra anonymity 3 hops provide not only improves
their Tor experience but also reduces their load on the Tor network by
33%, and can be done in less than 10 lines of code. That's not just
Win-Win, it's Win-Win-Win.
Furthermore, when blocking resistance measures insert an extra relay
hop into the equation, 4 hops will certainly be completely unusable
for these users, especially since it will be considerably more
difficult to balance the load across a dark relay net than balancing
the load on Tor itself (which today is still not without its flaws).
Anonymity Implications:
It has long been established that timing attacks against mixed
networks are extremely effective, and that regardless of path
length, if the adversary has compromised your first and last
hop of your path, you can assume they have compromised your
identity for that connection.
In [1], it is demonstrated that for all but the slowest, lossiest
networks, error rates for false positives and false negatives were
very near zero. Only for constant streams of traffic over slow and
(more importantly) extremely lossy network links did the error rate
hit 20%. For loss rates typical to the Internet, even the error rate
for slow nodes with constant traffic streams was 13%.
When you take into account that most Tor streams are not constant,
but probably much more like their "HomeIP" dataset, which consists
mostly of web traffic that exists over finite intervals at specific
times, error rates drop to fractions of 1%, even for the "worst"
network nodes.
Therefore, the user has little benefit from the extra hop, assuming
the adversary does timing correlation on their nodes. The real
protection is the probability of getting both the first and last hop,
and this is constant whether the client chooses 2 hops, 3 hops, or 42.
Partitioning attacks form another concern. Since Tor uses telescoping
to build circuits, it is possible to tell a user is constructing only
two hop paths at the entry node. It is questionable if this data is
actually worth anything though, especially if the majority of users
have easy access to this option, and do actually choose their path
lengths semi-randomly.
Nick has postulated that exits may also be able to tell that you are
using only 2 hops by the amount of time between sending their
RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
see from the OP. I doubt that they will be able to make much use
of this timing pattern, since it will likely vary widely depending
upon the type of node selected for that first hop, and the user's
connection rate to that first hop. It is also questionable if this
data is worth anything, especially if many users are using this
option (and I imagine many will).
Perhaps most seriously, two hop paths do allow malicious guards
to easily fail circuits if they do not extend to their colluding peers
for the exit hop. Since guards can detect the number of hops in a
path, they could always fail the 3 hop circuits and focus on
selectively failing the two hop ones until a peer was chosen.
I believe currently guards are rotated if circuits fail, which does
provide some protection, but this could be changed so that an entry
guard is completely abandoned after a certain ratio of extend or
general circuit failures with respect to non-failed circuits. This
could possibly be gamed to increase guard turnover, but such a game
would be much more noticeable than an individual guard failing circuits,
since it would affect all clients, not just those who chose
a particular guard.
Why not fix Pathlen=2?:
The main reason I am not advocating that we always use 2 hops is that
in some situations, timing correlation evidence by itself may not be
considered as solid and convincing as an actual, uninterrupted, fully
traced path. Are these timing attacks as effective on a real network
as they are in simulation? Would an extralegal adversary or authoritarian
government even care? In the face of these situation-dependent unknowns,
it should be up to the user to decide if this is a concern for them or not.
It should probably also be noted that even a false positive
rate of 1% for a 200k concurrent-user network could mean that for a
given node, a given stream could be confused with something like 10
users, assuming ~200 nodes carry most of the traffic (i.e., 1000 users
each). Though of course to really know for sure, someone needs to do
an attack on a real network, unfortunately.
Implementation:
new_route_len() can be modified directly with a check of the
PathlenCoinWeight option (converted to percent) and a call to
crypto_rand_int(0,100) for the weighted coin.
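A minimal sketch of the weighted coin, using Python's random in place of crypto_rand_int (the function name mirrors the one above; the probability is expressed as a fraction rather than a percent):

```python
import random

def new_route_len(coin_weight):
    """Return a path length of 2 or 3 hops.

    coin_weight is P, the probability of adding a third hop:
    0 always yields 2 hops, 1 always yields 3.
    """
    return 3 if random.random() < coin_weight else 2
```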
The entry_guard_t structure could have num_circ_failed and
num_circ_succeeded members such that if it exceeds N% circuit
extend failure rate to a second hop, it is removed from the entry list.
N should be sufficiently high to avoid churn from normal Tor circuit
failure as determined by TorFlow scans.
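The entry_guard_t bookkeeping might look like the following. The failure-rate threshold and minimum sample size here are placeholders to be tuned by TorFlow scans, not numbers from this proposal:

```python
class EntryGuard:
    """Track circuit-extend outcomes and decide when to drop a guard."""
    def __init__(self, max_failure_rate=0.30):
        self.num_circ_failed = 0
        self.num_circ_succeeded = 0
        self.max_failure_rate = max_failure_rate   # N%, placeholder

    def record(self, success):
        if success:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def should_remove(self, min_attempts=20):
        """Remove the guard once its extend-failure rate exceeds N%."""
        total = self.num_circ_failed + self.num_circ_succeeded
        if total < min_attempts:
            return False   # too little data to judge
        return self.num_circ_failed / total > self.max_failure_rate
```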
The Vidalia option should be presented as a boolean, to minimize confusion
for the user. Something like a radiobutton with:
* "I use Tor for Censorship Resistance, not Anonymity. Speed is more
important to me than Anonymity."
* "I use Tor for Anonymity. I need extra protection at the cost of speed."
and then some explanation in the help for exactly what this means, and
the risks involved with eliminating the adversary's need for timing attacks
with respect to false positives, etc.
Migration:
Phase one: Experiment with the proper ratio of circuit failures
used to expire garbage or malicious guards via TorFlow.
Phase two: Re-enable config and modify new_route_len() to add an
extra hop if coin comes up "heads".
Phase three: Make radiobutton in Vidalia, along with help entry
that explains in layman's terms the risks involved.
[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
Filename: 113-fast-authority-interface.txt
Title: Simplifying directory authority administration
Author: Nick Mathewson
Created:
Status: Superseded
Overview
The problem:
Administering a directory authority is a pain: you need to go through
emails and manually add new nodes as "named". When bad things come up,
you need to mark nodes (or whole regions) as invalid, badexit, etc.
This means that mostly, authority admins don't: only 2/4 current authority
admins actually bind names or list bad exits, and those two have often
complained about how annoying it is to do so.
Worse, name binding is a common path, but it's a pain in the neck: nobody
has done it for a couple of months.
Digression: who knows what?
It's trivial for Tor to automatically keep track of all of the
following information about a server:
name, fingerprint, IP, last-seen time, first-seen time, declared
contact.
All we need to have the administrator set is:
- Is this name/fingerprint pair bound?
- Is this fingerprint/IP a bad exit?
- Is this fingerprint/IP an invalid node?
- Is this fingerprint/IP to be rejected?
The workflow for authority admins has two parts:
- Periodically, go through tor-ops and add new names. This doesn't
need to be done urgently.
- Less often, mark badly behaved servers as such. This is more
urgent.
Possible solution #1: Web-interface for name binding.
Deprecate use of the tor-ops mailing list; instead, have operators go to a
webform and enter their server info. This would put the information in a
standardized format, thus allowing quick, nearly-automated approval and
reply.
Possible solution #2: Self-binding names.
Peter Palfrader has proposed that names be assigned automatically to nodes
that have been up and running and valid for a while.
Possible solution #3: Self-maintaining approved-routers file
Mixminion alpha has a neat feature where whenever a new server is seen,
a stub line gets added to a configuration file. For Tor, it could look
something like this:
## First seen with this key on 2007-04-21 13:13:14
## Stayed up for at least 12 hours on IP 192.168.10.10
#RouterName AAAABBBBCCCCDDDDEFEF
(Note that the implementation needs to parse commented lines to make sure
that it doesn't add duplicates, but that's not so hard.)
To add a router as named, administrators would only need to uncomment the
entry. This automatically maintained file could be kept separately from a
manually maintained one.
This could be combined with solution #2, such that Tor would do the hard
work of uncommenting entries for routers that should get Named, but
operators could override its decisions.
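The stub-maintenance logic can be sketched as below, parsing commented lines to avoid duplicates as noted above. The function names are ours and the file is modeled as a list of lines:

```python
def known_fingerprints(lines):
    """Collect fingerprints from both active and commented-out
    (#RouterName FINGERPRINT) entries, skipping ## remark lines."""
    fps = set()
    for line in lines:
        entry = line.lstrip("# ").strip()
        parts = entry.split()
        if len(parts) == 2:      # "Name Fingerprint" entries only
            fps.add(parts[1])
    return fps

def add_stub(lines, name, fingerprint, first_seen, ip):
    """Append a commented stub for a newly seen router, unless its
    fingerprint already appears in the file."""
    if fingerprint in known_fingerprints(lines):
        return lines
    return lines + [
        "## First seen with this key on %s" % first_seen,
        "## Stayed up for at least 12 hours on IP %s" % ip,
        "#%s %s" % (name, fingerprint),
    ]
```

To add a router as named, an administrator would then simply delete the leading "#" from the stub line.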
Possible solution #4: A separate mailing list for authority operators.
Right now, the tor-ops list is very high volume. There should be another
list that's only for dealing with problems that need prompt action, like
marking a router as !badexit.
Resolution:
Solution #2 is described in "Proposal 123: Naming authorities
automatically create bindings", and that approach is implemented.
There are remaining issues in the problem statement above that need
their own solutions.
Filename: 114-distributed-storage.txt
Title: Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 13-May-2007
Status: Closed
Implemented-In: 0.2.0.x
Change history:
13-May-2007 Initial proposal
14-May-2007 Added changes suggested by Lasse Øverlier
30-May-2007 Changed descriptor format, key length discussion, typos
09-Jul-2007 Incorporated suggestions by Roger, added status of specification
and implementation for upcoming GSoC mid-term evaluation
11-Aug-2007 Updated implementation statuses, included non-consecutive
replication to descriptor format
20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2
02-Dec-2007 Closed proposal
Overview:
The basic idea of this proposal is to distribute the tasks of storing and
serving hidden service descriptors from currently three authoritative
directory nodes among a large subset of all onion routers. The three
reasons to do this are better robustness (availability), better
scalability, and improved security properties. Further,
this proposal suggests changes to the hidden service descriptor format to
prevent new security threats coming from decentralization and to gain even
better security properties.
Status:
As of December 2007, the new hidden service descriptor format is implemented
and usable. However, servers and clients do not yet make use of descriptor
cookies, because there are open usability issues of this feature that might
be resolved in proposal 121. Further, hidden service directories do not
perform replication by themselves, because (unauthorized) replica fetch
requests would allow any attacker to fetch all hidden service descriptors in
the system. As neither issue is critical to the functioning of v2
descriptors and their distribution, this proposal is considered as Closed.
Motivation:
The current design of hidden services exhibits the following performance and
security problems:
First, the three hidden service authoritative directories constitute a
performance bottleneck in the system. The directory nodes are responsible for
storing and serving all hidden service descriptors. As of May 2007 there are
about 1000 descriptors at a time, but this number is assumed to increase in
the future. Further, there is no replication protocol for descriptors between
the three directory nodes, so that hidden services must ensure the
availability of their descriptors by manually publishing them on all
directory nodes. Whenever a fourth or fifth hidden service authoritative
directory is added, hidden services will need to maintain an equally
increasing number of replicas. These scalability issues have an impact on the
current usage of hidden services and put an even higher burden on the
development of new kinds of applications for hidden services that might
require storing even more descriptors.
Second, besides posing a limitation to scalability, storing all hidden
service descriptors on three directory nodes also constitutes a security
risk. The directory node operators could easily analyze the publish and fetch
requests to derive information on service activity and usage and read the
descriptor contents to determine which onion routers work as introduction
points for a given hidden service and need to be attacked or threatened to
shut it down. Furthermore, the contents of a hidden service descriptor offer
only minimal security properties to the hidden service. Whoever gets aware of
the service ID can easily find out whether the service is active at the
moment and which introduction points it has. This applies to (former)
clients, (former) introduction points, and of course to the directory nodes.
It requires only to request the descriptor for the given service ID, which
can be performed by anyone anonymously.
This proposal suggests two major changes to approach the described
performance and security problems:
The first change affects the storage location for hidden service descriptors.
Descriptors are distributed among a large subset of all onion routers instead
of three fixed directory nodes. Each storing node is responsible for a subset
of descriptors for a limited time only. It is not able to choose which
descriptors it stores at a certain time, because this is determined by its
onion ID which is hard to change frequently and in time (only routers which
are stable for a given time are accepted as storing nodes). In order to
resist single node failures and untrustworthy nodes, descriptors are
replicated among a certain number of storing nodes. A first replication
protocol makes sure that descriptors don't get lost when the node population
changes; therefore, a storing node periodically requests the descriptors from
its siblings. A second replication protocol distributes descriptors among
non-consecutive nodes of the ID ring to prevent a group of adversaries from
generating new onion keys until they have consecutive IDs to create a 'black
hole' in the ring and make random services unavailable. Connections to
storing nodes are established by extending existing circuits by one hop to
the storing node. This also ensures that contents are encrypted. The effect
of this first change is that the probability that a single node operator
learns about a certain hidden service is very small and that it is very hard
to track a service over time, even when it collaborates with other node
operators.
The second change concerns the content of hidden service descriptors.
Obviously, security problems cannot be solved only by decentralizing storage;
in fact, they could also get worse if done without caution. At first, a
descriptor ID needs to change periodically in order to be stored on changing
nodes over time. Next, the descriptor ID needs to be computable only for the
service's clients, but should be unpredictable for all other nodes. Further,
the storing node needs to be able to verify that the hidden service is the
true originator of the descriptor with the given ID even though it is not a
client. Finally, a storing node should learn as little information as
necessary by storing a descriptor, because it might not be as trustworthy as
a directory node; for example it does not need to know the list of
introduction points. Therefore, a second key is applied that is only known to
the hidden service provider and its clients and that is not included in the
descriptor. It is used to calculate descriptor IDs and to encrypt the
introduction points. This second key can either be given to all clients
together with the hidden service ID, or to a group or a single client as
an authentication token. In the future this second key could be the result of
some key agreement protocol between the hidden service and one or more
clients. A new text-based format is proposed for descriptors instead of an
extension of the existing binary format for reasons of future extensibility.
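For illustration, a changing, client-computable descriptor ID could be derived as below. The field packing here is our own simplification, not the layout the v2 rend-spec eventually fixed; SHA-1 matches the era:

```python
import hashlib
import struct

def descriptor_id(service_id, time_period, secret_cookie, replica):
    """Derive a descriptor ID that changes per time period and replica.

    Only parties holding secret_cookie (the second key shared between
    the hidden service and its clients) can compute the secret part,
    so the ID is unpredictable to everyone else.
    """
    secret_part = hashlib.sha1(
        struct.pack(">I", time_period) +
        secret_cookie +
        struct.pack(">B", replica)).digest()
    return hashlib.sha1(service_id + secret_part).hexdigest()
```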
Design:
The proposed design is described by the required changes to the current
design. These requirements are grouped by content, rather than by affected
specification documents or code files, and numbered for reference below.
Hidden service clients, servers, and directories:
/1/ Create routing list
All participants can filter the consensus status document received from the
directory authorities to one routing list containing only those servers
that store and serve hidden service descriptors and which are running for
at least 24 hours. A participant only trusts its own routing list and never
learns about routing information from other parties.
/2/ Determine responsible hidden service directory
All participants can determine the hidden service directory that is
responsible for storing and serving a given ID, as well as the hidden
service directories that replicate its content. Every hidden service
directory is responsible for the descriptor IDs in the interval from
its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
service directory holds replicas for its n predecessors, where n denotes
the number of consecutive replicas. (requires /1/)
[/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
requests which have not been fulfilled in the course of the implementation
of this proposal, but elsewhere.]
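Requirement /2/ amounts to a lookup on a sorted ID ring, which can be sketched as follows. IDs are treated as sortable strings, and the replica scheme shown is the consecutive one (the non-consecutive second replication protocol is omitted):

```python
import bisect

def responsible_directories(descriptor_id, directory_ids, n_replicas=2):
    """Return the responsible directory plus its n_replicas successors.

    Each directory is responsible for the interval from its predecessor
    (exclusive) up to its own ID (inclusive), so the responsible node is
    the first one on the ring whose ID is >= descriptor_id, wrapping
    around at the top of the ring.
    """
    ring = sorted(directory_ids)
    i = bisect.bisect_left(ring, descriptor_id)
    return [ring[(i + k) % len(ring)] for k in range(n_replicas + 1)]
```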
Hidden service directory nodes:
/5/ Advertise hidden service directory functionality
Every onion router that has its directory port open can decide whether it
wants to store and serve hidden service descriptors by setting a new config
option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
option being set includes the flag "hidden-service-dir" in its router
descriptors that it sends to directory authorities.
/6/ Accept v2 publish requests, parse and store v2 descriptors
Hidden service directory nodes accept publish requests for hidden service
descriptors and store them to their local memory. (It is not necessary to
make descriptors persistent, because after disconnecting, the onion router
would not be accepted as storing node anyway, because it has not been
running for at least 24 hours.) All requests and replies are formatted as
HTTP messages. Requests are directed to the router's directory port and are
contained within BEGIN_DIR cells. A hidden service directory node stores a
descriptor only when it thinks that it is responsible for storing that
descriptor based on its own routing table. Every hidden service directory
node is responsible for the descriptor IDs in the interval of its n-th
predecessor in the ID circle up to its own ID (n denotes the number of
consecutive replicas). (requires /1/)
/7/ Accept v2 fetch requests
Same as /6/, but with fetch requests for hidden service descriptors.
(requires /2/)
/8/ Replicate descriptors with neighbors
A hidden service directory node replicates descriptors from its two
predecessors by downloading them once an hour. Further, it checks its
routing table periodically for changes. Whenever it realizes that a
predecessor has left the network, it establishes a connection to the new
n-th predecessor and requests its stored descriptors in the interval
between its (n+1)-th predecessor and the new n-th predecessor. Whenever it
realizes that a new onion router has joined with an ID higher than its
former n-th predecessor, it adds it to its predecessors and discards all
descriptors in the interval between its (n+1)-th and its n-th predecessor.
(requires /1/)
[Dec 02: This function has not been implemented, because arbitrary nodes
would have been able to download the entire set of v2 descriptors. An
authorized replication request would be necessary. For the moment, the
system runs without any directory-side replication. -KL]
Authoritative directory nodes:
/9/ Confirm a router's hidden service directory functionality
Directory nodes include a new flag "HSDir" for routers that have decided
to provide storage for hidden service descriptors and that have been
running for at least 24 hours. The last requirement prevents a node from
frequently
changing its onion key to become responsible for an identifier it wants to
target.
Hidden service provider:
/10/ Configure v2 hidden service
Each hidden service provider that has set the config option
"PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
descriptors and conform to the v2 connection establishment protocol. When
configuring a hidden service, a hidden service provider checks if it has
already created a random secret_cookie and a hostname2 file; if not, it
creates both of them. (requires /2/)
/11/ Establish introduction points with fresh key
If configured to publish only v2 descriptors and no v0/v1 descriptors any
more, a hidden service provider that is setting up the hidden service at
introduction points does not pass its own public key, but the public key
of a freshly generated key pair. It also includes these fresh public keys
in the hidden service descriptor together with the other introduction point
information. The reason is that the introduction point does not need to and
therefore should not know for which hidden service it works, so as to
prevent it from tracking the hidden service's activity. (If a hidden
service provider supports both v0/v1 and v2 descriptors, v0/v1 clients
rely on the fact that all introduction points accept the same public key,
so this new feature cannot be used.)
/12/ Encode v2 descriptors and send v2 publish requests
If configured to publish v2 descriptors, a hidden service provider
publishes a new descriptor whenever its content changes or a new
publication period starts for this descriptor. If the current publication
period would only last for less than 60 minutes (= 2 x 30 minutes to allow
the server to be 30 minutes behind and the client 30 minutes ahead), the
hidden service provider publishes both a current descriptor and one for
the next period. Publication is performed by sending the descriptor to all
hidden service directories that are responsible for keeping replicas for
the descriptor ID. This includes two non-consecutive replicas that are
stored at 3 consecutive nodes each. (requires /1/ and /2/)
Hidden service client:
/13/ Send v2 fetch requests
A hidden service client that has set the config option
"FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
addresses by requesting a v2 descriptor from a randomly chosen hidden
service directory that is responsible for keeping a replica for the
descriptor ID. In total there are six replicas of which the first and the
last three are stored on consecutive nodes. The probability of picking one
of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
fact that the availability will be the highest on the node with next higher
ID. A hidden service client relies on the hidden service provider to store
two sets of descriptors to compensate clock skew between service and
client. (requires /1/ and /2/)
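The weighted choice among the three consecutive nodes of one replica set might look like this (an illustrative sketch; the data structures and node ordering are hypothetical):

```python
import random

# Sketch of the weighted replica choice described above (the data
# structures are hypothetical). The three consecutive nodes holding one
# replica set are picked with probabilities 1/6, 2/6, and 3/6, highest
# for the node expected to be most available.
def pick_directory(replica_nodes, rng=random):
    """replica_nodes: the 3 consecutive HSDir nodes responsible for one
    replica, ordered so that the last entry is the most-preferred node."""
    return rng.choices(replica_nodes, weights=[1, 2, 3], k=1)[0]
```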
/14/ Process v2 fetch reply and parse v2 descriptors
A hidden service client that has sent a request for a v2 descriptor can
parse the reply and store the descriptor in its local cache of rendezvous
service descriptors.
/15/ Establish connection to v2 hidden service
A hidden service client can establish a connection to a hidden service
using a v2 descriptor. This includes using the secret cookie for decrypting
the introduction points contained in the descriptor. When contacting an
introduction point, the client does not use the public key of the hidden
service provider, but the freshly-generated public key that is included in
the hidden service descriptor. Whether or not a fresh key is used instead
of the key of the hidden service depends on the available protocol versions
that are included in the descriptor; by this, connection establishment is
to a certain extent decoupled from fetching the descriptor.
Hidden service descriptor:
(Requirements concerning the descriptor format are contained in /6/ and /7/.)
The new v2 hidden service descriptor format looks like this:
onion-address = h(public-key) + cookie
descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
descriptor-content = {
descriptor-id,
version,
public-key,
h(time-period + cookie + replica),
timestamp,
protocol-versions,
{ introduction-points } encrypted with cookie
} signed with private-key
The "descriptor-id" needs to change periodically in order for the
descriptor to be stored on changing nodes over time. It may only be
computable by a hidden service provider and all of its clients to prevent
unauthorized nodes from tracking the service activity by periodically
checking whether there is a descriptor for this service. Finally, the
hidden service directory needs to be able to verify that the hidden service
provider is the true originator of the descriptor with the given ID.
Therefore, "descriptor-id" is derived from the "public-key" of the hidden
service provider, the current "time-period" which changes every 24 hours,
a secret "cookie" shared between hidden service provider and clients, and
a "replica" denoting the number of this non-consecutive replica. (The
"time-period" is constructed in a way that time periods do not change at
the same moment for all descriptors by deriving a value between 0:00 and
23:59 hours from h(public-key) and making the descriptors of this hidden
service provider expire at that time of the day.) The "descriptor-id" is
defined to be 160 bits long. [extending the "descriptor-id" length
suggested by LØ]
Only the hidden service provider and its clients are able to generate
future "descriptor-id"s. Hence, the "onion-address" is extended from the
hash value of "public-key" alone to also include the secret "cookie". The
hash of the "public-key" is shortened to 80 bits, whereas the "cookie" is
120 bits long. This makes a total of 200 bits, or 40 base32 characters,
which is quite a lot for a human to handle, but necessary to provide
sufficient protection against an adversary generating a key pair with the
same "public-key" hash or guessing the "cookie".
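To make the derivations concrete, here is a sketch using SHA-1 as h(), which yields the required 160-bit "descriptor-id"; the byte encodings of time-period and replica are assumptions for illustration, not tor's actual wire format:

```python
import hashlib

# Illustrative sketch of the identifier derivations above, using SHA-1
# as h() so that "descriptor-id" comes out 160 bits long. The byte
# encodings of time-period and replica are assumptions, not tor's
# actual wire format.
def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def descriptor_id(public_key: bytes, time_period: int,
                  cookie: bytes, replica: int) -> bytes:
    secret_id_part = h(time_period.to_bytes(4, "big") + cookie
                       + replica.to_bytes(1, "big"))
    return h(h(public_key) + secret_id_part)      # 160 bits

def onion_address(public_key: bytes, cookie: bytes) -> bytes:
    # 80-bit truncated hash of the public key + 120-bit cookie = 200 bits
    return h(public_key)[:10] + cookie[:15]
```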
A hidden service directory can verify that a descriptor was created by the
hidden service provider by checking if the "descriptor-id" corresponds to
the "public-key" and if the signature can be verified with the
"public-key".
The "introduction-points" that are included in the descriptor are encrypted
using the same "cookie" that is shared between hidden service provider and
clients. [correction to use another key than h(time-period + cookie) as
encryption key for introduction points made by LØ]
A new text-based format is proposed for descriptors instead of an extension
of the existing binary format for reasons of future extensibility.
Security implications:
The security implications of the proposed changes are grouped by the roles of
nodes that could perform attacks or on which attacks could be performed.
Attacks by authoritative directory nodes
Authoritative directory nodes are no longer the only places in the
network that know about a hidden service's activity and introduction
points. Thus, they cannot perform attacks using this information, e.g.
track a hidden service's activity or usage pattern or attack its
introduction points. Formerly, it would only require a single corrupted
authoritative directory operator to perform such an attack.
Attacks by hidden service directory nodes
A hidden service directory node could misuse a stored descriptor to track a
hidden service's activity and usage pattern by clients. Though there is no
countermeasure against this kind of attack, it is very expensive to track a
certain hidden service over time. An attacker would need to run a large
number of stable onion routers that work as hidden service directory nodes
to have a good probability of becoming responsible for its changing
descriptor IDs. For each period, the probability is:

  1 - (N-c choose r) / (N choose r)   for N-c >= r, and 1 otherwise,

with N as the total number of hidden service directories, c as the number
of compromised nodes, and r as the number of replicas.
The hidden service directory nodes could try to make a certain hidden
service unavailable to its clients. To do so, they could discard all
stored descriptors for that hidden service and reply to clients that there
is no descriptor for the given ID, or return an old or false descriptor
content. The client would detect a false descriptor, because it would not
contain a correct signature. But an old content or an empty reply could
confuse the client. Therefore, the countermeasure is to replicate
descriptors among a small number of hidden service directories, e.g. 5.
The probability that a group of collaborating nodes makes a hidden service
completely unavailable in a given period is:

  (c choose r) / (N choose r)   for c >= r and N >= r, and 0 otherwise,

with N as the total number of hidden service directories, c as the number
of compromised nodes, and r as the number of replicas.
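Both probabilities are straightforward to evaluate numerically; a minimal sketch (parameter names follow the formulas in this section):

```python
from math import comb

# Numeric sketch of the two formulas in this section: N hidden service
# directories, c of them compromised, r replicas of a descriptor.
def p_track(N, c, r):
    """Probability that at least one replica lands on a compromised node
    (the tracking attack)."""
    if N - c < r:
        return 1.0
    return 1 - comb(N - c, r) / comb(N, r)

def p_block(N, c, r):
    """Probability that every replica lands on a compromised node
    (the availability attack)."""
    if c < r or N < r:
        return 0.0
    return comb(c, r) / comb(N, r)
```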
A hidden service directory could try to find out which introduction points
are working on behalf of a hidden service. In contrast to the previous
design, this is not possible anymore, because this information is encrypted
to the clients of a hidden service.
Attacks on hidden service directory nodes
An anonymous attacker could try to swamp a hidden service directory with
false descriptors for a given descriptor ID. This is prevented by requiring
that descriptors are signed.
Anonymous attackers could swamp a hidden service directory with correct
descriptors for non-existing hidden services. There is no countermeasure
against this attack. However, the creation of valid descriptors is more
expensive than verification and storage in local memory. This should make
this kind of attack unattractive.
Attacks by introduction points
Current or former introduction points could try to gain information on the
hidden service they serve. But due to the fresh key pair that is used by
the hidden service, this attack is not possible anymore.
Attacks by clients
Current or former clients could track a hidden service's activity, attack
its introduction points, or determine the responsible hidden service
directory nodes and attack them. There is nothing that could prevent them
from doing so, because honest clients need the full descriptor content to
establish a connection to the hidden service. At the moment, the only
countermeasure against dishonest clients is to change the secret cookie and
pass it only to the honest clients.
Compatibility:
The proposed design is meant to replace the current design for hidden service
descriptors and their storage in the long run.
There should be a first transition phase in which both the current design
and the proposed design are served in parallel. Onion routers should start
serving as hidden service directories, and hidden service providers and
clients should make use of the new design if both sides support it. Hidden
service providers should be allowed to publish descriptors of the current
format in parallel, and authoritative directories should continue storing and
serving these descriptors.
After the first transition phase, hidden service providers should stop
publishing descriptors on authoritative directories, and hidden service
clients should not try to fetch descriptors from the authoritative
directories. However, the authoritative directories should continue serving
hidden service descriptors for a second transition phase. From this point
on, all v2 config options should be set to a default value of 1.
After the second transition phase, the authoritative directories should stop
serving hidden service descriptors.
Filename: 115-two-hop-paths.txt
Title: Two Hop Paths
Author: Mike Perry
Created:
Status: Dead
Supersedes: 112
Overview:
The idea is that users should be able to choose if they would like
to have either two or three hop paths through the tor network.
Let us be clear: the users who would choose this option should be
those that are concerned with IP obfuscation only: ie they would not be
targets of a resource-intensive multi-node attack. It is sometimes said
that these users should find some other network to use other than Tor.
This is a foolish suggestion: more users improves security of everyone,
and the current small userbase size is a critical hindrance to
anonymity, as is discussed below and in [1].
This value should be modifiable from the controller, and should be
available from Vidalia.
Motivation:
The Tor network is slow and overloaded. Increasingly often I hear
stories about friends and friends of friends who are behind firewalls,
annoying censorware, or under surveillance that interferes with their
productivity and Internet usage, or chills their speech. These people
know about Tor, but they choose to put up with the censorship because
Tor is too slow to be usable for them. In fact, to download a fresh,
complete copy of levine-timing.pdf for the Theoretical Argument
section of this proposal over Tor took me 3 tries.
Furthermore, the biggest current problem with Tor's anonymity for
those who really need it is not someone attacking the network to
discover who they are. It's instead the extreme danger that so few
people use Tor because it's so slow, that those who do use it have
essentially no confusion set.
The recent case where the professor and the rogue Tor user were the
only Tor users on campus, and thus suspected in an incident involving
Tor and that University underscores this point: "That was why the police
had come to see me. They told me that only two people on our campus were
using Tor: me and someone they suspected of engaging in an online scam.
The detectives wanted to know whether the other user was a former
student of mine, and why I was using Tor"[1].
Not only does Tor provide no anonymity if you use it to be anonymous
but are obviously from a certain institution, location or circumstance,
it is also dangerous to use Tor for risk of being accused of having
something significant enough to hide to be willing to put up with
the horrible performance as opposed to using some weaker alternative.
There are many ways to improve the speed problem, and of course we
should and will implement as many as we can. Johannes's GSoC project
and my reputation system are longer term, higher-effort things that
will still provide benefit independent of this proposal.
However, reducing the path length to 2 for those who do not need the
extra anonymity 3 hops provide not only improves their Tor experience
but also reduces their load on the Tor network by 33%, and should
increase adoption of Tor by a good deal. That's not just Win-Win, it's
Win-Win-Win.
Who will enable this option?
This is the crux of the proposal. Admittedly, there is some anonymity
loss and some degree of decreased investment required on the part of
the adversary to attack 2 hop users versus 3 hop users, even if it is
minimal and limited mostly to up-front costs and false positives.
The key questions are:
1. Are these users in a class such that their risk is significantly
less than the amount of this anonymity loss?
2. Are these users able to identify themselves?
Many many users of Tor are not at risk for an adversary capturing c/n
nodes of the network just to see what they do. These users use Tor to
circumvent aggressive content filters, or simply to keep their IP out of
marketing and search engine databases. Most content filters have no
interest in running Tor nodes to catch violators, and marketers
certainly would never consider such a thing, both on a cost basis and a
legal one.
In a sense, this represents an alternate threat model against these
users who are not at risk for Tor's normal threat model.
It should be evident to these users that they fall into this class. All
that should be needed is a radio button
* "I use Tor for local content filter circumvention and/or IP obfuscation,
not anonymity. Speed is more important to me than high anonymity.
No one will make considerable efforts to determine my real IP."
* "I use Tor for anonymity and/or national-level, legally enforced
censorship. It is possible effort will be taken to identify
me, including but not limited to network surveillance. I need more
protection."
and then some explanation in the help for exactly what this means, and
the risks involved with eliminating the adversary's need for timing
attacks with respect to false positives. Ultimately, the decision is a
simple one that can be made without this information, however. The user
does not need Paul Syverson to instruct them on the deep magic of Onion
Routing to make this decision. They just need to know why they use Tor.
If they use it just to stay out of marketing databases and/or bypass a
local content filter, two hops is plenty. This is likely the vast
majority of Tor users, and many non-users we would like to bring on
board.
So, having established this class of users, let us now go on to
examine theoretical and practical risks we place them at, and determine
if these risks violate the users' needs, or introduce additional risk
to node operators who may be subject to requests from law enforcement
to track users who need 3 hops, but use 2 because they enjoy the
thrill of Russian roulette.
Theoretical Argument:
It has long been established that timing attacks against mixed
and onion networks are extremely effective, and that regardless
of path length, if the adversary has compromised your first and
last hop of your path, you can assume they have compromised your
identity for that connection.
In fact, it was demonstrated that for all but the slowest, lossiest
networks, error rates for false positives and false negatives were
very near zero[2]. Only for constant streams of traffic over slow and
(more importantly) extremely lossy network links did the error rate
hit 20%. For loss rates typical to the Internet, even the error rate
for slow nodes with constant traffic streams was 13%.
When you take into account that most Tor streams are not constant,
but probably much more like their "HomeIP" dataset, which consists
mostly of web traffic that exists over finite intervals at specific
times, error rates drop to fractions of 1%, even for the "worst"
network nodes.
Therefore, the user has little benefit from the extra hop, assuming
the adversary does timing correlation on their nodes. Since timing
correlation is simply an implementation issue and is most likely
a single up-front cost (and one that is likely quite a bit cheaper
than the cost of the machines purchased to host the nodes to mount
an attack), the real protection is the low probability of getting
both the first and last hop of a client's stream.
Practical Issues:
Theoretical issues aside, there are several practical issues with the
implementation of Tor that need to be addressed to ensure that
identity information is not leaked by the implementation.
Exit policy issues:
If a client chooses an exit with a very restrictive exit policy
(such as an IP or IP range), the first hop then knows a good deal
about the destination. For this reason, clients should not select
exits that match their destination IP with anything other than "*".
Partitioning:
Partitioning attacks form another concern. Since Tor uses telescoping
to build circuits, it is possible to tell a user is constructing only
two hop paths at the entry node and on the local network. An external
adversary can potentially differentiate 2 and 3 hop users, and decide
that all IP addresses connecting to Tor and using 3 hops have something
to hide, and should be scrutinized more closely or outright apprehended.
One solution to this is to use the "leaky-circuit" method of attaching
streams: The user always creates 3-hop circuits, but if the option
is enabled, they always exit from their 2nd hop. The ideal solution
would be to create a RELAY_SHISHKABOB cell which contains onion
skins for every host along the path, but this requires protocol
changes at the nodes to support.
Guard nodes:
Since guard nodes can rotate due to client relocation, network
failure, node upgrades and other issues, if you amortize the risk a
mobile, dialup, or otherwise intermittently connected user is exposed to
over any reasonable duration of Tor usage (on the order of a year), it
is the same with or without guard nodes. Assuming an adversary has
c%/n% of network bandwidth, and guards rotate on average with period R,
statistically speaking, it's merely a question of if the user wishes
their risk to be concentrated with probability c/n over an expected
period of R*c, and probability 0 over an expected period of R*(n-c),
versus a continuous risk of (c/n)^2. So statistically speaking, guards
only create a time-tradeoff of risk over the long run for normal Tor
usage. Rotating guards do not reduce risk for normal client usage long
term.[3]
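The time-tradeoff claim above can be illustrated with a small Monte Carlo sketch (an illustration of the argument, not a proof; all parameters are arbitrary): over the long run, the rate at which both ends of a circuit are malicious converges to roughly (c/n)^2 whether guards rotate slowly or on every circuit.

```python
import random

# Monte Carlo sketch of the time-tradeoff argument above. With a
# fraction f of malicious bandwidth, a client is fully compromised on a
# circuit when both its guard and its exit are malicious; the long-run
# rate approaches f^2 regardless of the guard rotation period.
def long_run_compromise_rate(f, circuits, circuits_per_guard, rng):
    compromised = 0
    guard_bad = False
    for i in range(circuits):
        if i % circuits_per_guard == 0:      # time to rotate the guard
            guard_bad = rng.random() < f
        exit_bad = rng.random() < f
        if guard_bad and exit_bad:
            compromised += 1
    return compromised / circuits
```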
On the other hand, assuming a more stable method of guard selection
and preservation is devised, or a more stable client side network than
my own is typical (which rotates guards frequently due to network issues
and moving about), guard nodes provide a tradeoff in the form of c/n% of
the users being "sacrificial users" who are exposed to high risk O(c/n)
of identification, while the rest of the network is exposed to zero
risk.
The nature of Tor makes it likely an adversary will take a "shock and
awe" approach to suppressing Tor by rounding up a few users whose
browsing activity has been observed to be made into examples, in an
attempt to prove that Tor is not perfect.
Since this "shock and awe" attack can be applied with or without guard
nodes, stable guard nodes do offer a measure of accountability of sorts.
If a user was using a small set of guard nodes and knows them well, and
then is suddenly apprehended as a result of Tor usage, having a fixed
set of entry points to suspect is a lot better than suspecting the whole
network. Conversely, it can also give non-apprehended users comfort
that they are likely to remain safe indefinitely with their set of (now
presumably trusted) guards. This is probably the most beneficial
property of reliable guards: they deter the adversary from mounting
"shock and awe" attacks because the surviving users will not be
intimidated, but instead made more confident. Of course, guards need to
be made much more stable and users need to be encouraged to know their
guards for this property to really take effect.
This beneficial property of client vigilance also carries over to an
active adversary, except in this case instead of relying on the user
to remember their guard nodes and somehow communicate them after
apprehension, the code can alert them to the presence of an active
adversary before they are apprehended. But only if they use guard nodes.
So let's consider the active adversary: Two hop paths allow malicious
guards to get considerably more benefit from failing circuits if they do
not extend to their colluding peers for the exit hop. Since guards can
detect the number of hops in a path via either timing or by statistical
analysis of the exit policy of the 2nd hop, they can perform this attack
predominantly against 2 hop users.
This can be addressed by completely abandoning an entry guard after a
certain ratio of extend or general circuit failures with respect to
non-failed circuits. The proper value for this ratio can be determined
experimentally with TorFlow. There is the possibility that the local
network can abuse this feature to cause certain guards to be dropped,
but they can do that anyways with the current Tor by just making guards
they don't like unreachable. With this mechanism, Tor will complain
loudly if any guard's failure rate exceeds the expected rate in any
failure case, local or remote.
Eliminating guards entirely would actually not address this issue due
to the time-tradeoff nature of risk. In fact, it would just make it
worse. Without guard nodes, it becomes much more difficult for clients
to become alerted to Tor entry points that are failing circuits to make
sure that they only devote bandwidth to carrying traffic for streams for
which they observe both ends. Yet the rogue entry points are still able to
significantly increase their success rates by failing circuits.
For this reason, guard nodes should remain enabled for 2 hop users,
at least until an IP-independent, undetectable guard scanner can
be created. TorFlow can scan for failing guards, but after a while,
its unique behavior gives away the fact that its IP is a scanner and
it can be given selective service.
Consideration of risks for node operators:
There is a serious risk for two hop users in the form of guard
profiling. If an adversary running an exit node notices that a
particular site is always visited from a fixed previous hop, it is
likely that this is a two hop user using a certain guard, which could be
monitored to determine their identity. Thus, for the protection of both
2 hop users and node operators, 2 hop users should limit their guard
duration to a sufficient number of days to verify reliability of a node,
but not much more. This duration can be determined experimentally by
TorFlow.
Considering a Tor client builds on average 144 circuits/day (10
minutes per circuit), if the adversary owns c/n% of exits on the
network, they can expect to see 144*c/n circuits from this user, or
about 14 minutes of usage per day per percentage of network penetration.
Since it will take several occurrences of user-linkable exit content
from the same predecessor hop for the adversary to have any confidence
this is a 2 hop user, it is very unlikely that any sort of demands made
upon the predecessor node would be guaranteed to be effective (ie it
actually was a guard), let alone be executed in time to apprehend the
user before they rotated guards.
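The arithmetic above in code form (the 10-minute circuit lifetime and the resulting 144 circuits/day figure come from this proposal):

```python
# Back-of-the-envelope version of the paragraph above. An adversary
# holding a fraction `exit_fraction` of exit bandwidth expects to see
# 144 * exit_fraction circuits per day, i.e. about 14.4 minutes of
# usage per day per percent of network penetration.
CIRCUIT_LIFETIME_MIN = 10
CIRCUITS_PER_DAY = 24 * 60 // CIRCUIT_LIFETIME_MIN   # 144

def observed_minutes_per_day(exit_fraction):
    """Minutes per day of a client's usage seen by an adversary
    controlling `exit_fraction` of exit bandwidth."""
    return CIRCUITS_PER_DAY * exit_fraction * CIRCUIT_LIFETIME_MIN
```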
The reverse risk also warrants consideration. If a malicious guard has
orders to surveil Mike Perry, it can determine Mike Perry is using two
hops by observing his tendency to choose a 2nd hop with a viable exit
policy. This can be done relatively quickly, unfortunately, and
indicates Mike Perry should spend some of his time building real 3 hop
circuits through the same guards, to require them to at least wait for
him to actually use Tor to determine his style of operation, rather than
collect this information from his passive building patterns.
However, to actively determine where Mike Perry is going, the guard
will need to require logging ahead of time at multiple exit nodes that
he may use over the course of the few days while he is at that guard,
and correlate the usage times of the exit node with Mike Perry's
activity at that guard for the few days he uses it. At this point, the
adversary is mounting a scale and method of attack (widespread logging,
timing attacks) that works pretty much just as effectively against 3
hops, so exit node operators are exposed to no additional danger than
they otherwise normally are.
Why not fix Pathlen=2?:
The main reason I am not advocating that we always use 2 hops is that
in some situations, timing correlation evidence by itself may not be
considered as solid and convincing as an actual, uninterrupted, fully
traced path. Are these timing attacks as effective on a real network as
they are in simulation? Maybe the circuit multiplexing of Tor can serve
to frustrate them to a degree? Would an extralegal adversary or
authoritarian government even care? In the face of these
situation-dependent unknowns, it should be up to the user to decide if
this is a concern for them or not.
It should probably also be noted that even a false positive
rate of 1% for a 200k concurrent-user network could mean that for a
given node, a given stream could be confused with something like 10
users, assuming ~200 nodes carry most of the traffic (ie 1000 users
each). Though of course to really know for sure, someone needs to do
an attack on a real network, unfortunately.
Additionally, at some point cover traffic schemes may be implemented to
frustrate timing attacks on the first hop. It is possible some expert
users may do this ad-hoc already, and may wish to continue using 3 hops
for this reason.
Implementation:
new_route_len() can be modified directly with a check of the
Pathlen option. However, circuit construction logic should be
altered so that both 2 hop and 3 hop users build the same types of
circuits, and the option should ultimately govern circuit selection,
not construction. This improves coverage against guard nodes being
able to passively profile users who aren't even using Tor.
PathlenCoinWeight, anyone? :)
The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
circuit_is_acceptable.
The leaky exit is trickier still. handle_control_attachstream
does allow paths to exit at a given hop. Presumably something similar
can be done in connection_ap_handshake_process_socks, and elsewhere?
Circuit construction would also have to be performed such that the
2nd hop's exit policy is what is considered, not the 3rd's.
The entry_guard_t structure could have num_circ_failed and
num_circ_succeeded members such that if it exceeds F% circuit
extend failure rate to a second hop, it is removed from the entry list.
F should be sufficiently high to avoid churn from normal Tor circuit
failure as determined by TorFlow scans.
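A sketch of that expiry rule (the failure-rate threshold F and the minimum sample size below are placeholders; the proposal leaves the proper value of F to TorFlow experimentation):

```python
# Sketch of the guard-expiry rule described above. The threshold and
# minimum-attempt values are placeholders, not values from the proposal.
class EntryGuard:
    def __init__(self, name):
        self.name = name
        self.num_circ_failed = 0
        self.num_circ_succeeded = 0

def should_drop_guard(guard, max_failure_rate=0.3, min_attempts=20):
    """Return True if the guard's circuit-extend failure rate exceeds
    the threshold, once enough attempts have been seen to avoid churn
    from ordinary circuit failure."""
    attempts = guard.num_circ_failed + guard.num_circ_succeeded
    if attempts < min_attempts:
        return False
    return guard.num_circ_failed / attempts > max_failure_rate
```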
The Vidalia option should be presented as a radio button.
Migration:
Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky
circuit ability, and 2-3 hop circuit selection logic governed by
Pathlen.
Phase 2: Experiment to determine the proper ratio of circuit
failures used to expire garbage or malicious guards via TorFlow
(pending Bug #440 backport+adoption).
Phase 3: Implement guard expiration code to kick off failure-prone
guards and warn the user. Cap 2 hop guard duration to a proper number
of days determined sufficient to establish guard reliability (to be
determined by TorFlow).
Phase 4: Make radiobutton in Vidalia, along with help entry
that explains in layman's terms the risks involved.
Phase 5: Allow user to specify path length by HTTP URL suffix.
[1] http://p2pnet.net/story/11279
[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
[3] Proof available upon request ;)
Filename: 116-two-hop-paths-from-guard.txt
Title: Two hop paths from entry guards
Author: Michael Lieberman
Created: 26-Jun-2007
Status: Dead
This proposal is related to (but different from) Mike Perry's proposal 115
"Two Hop Paths."
Overview:
Volunteers who run entry guards should have the option of using only 2
additional tor nodes when constructing their own tor circuits.
While the option of two hop paths should perhaps be extended to every client
(as discussed in Mike Perry's thread), I believe the anonymity properties of
two hop paths are particularly well-suited to client computers that are also
serving as entry guards.
First I will describe the details of the strategy, as well as possible
avenues of attack. Then I will list advantages and disadvantages. Then, I
will discuss some possibly safer variations of the strategy, and finally
some implementation issues.
Details:
Suppose Alice is an entry guard, and wants to construct a two hop circuit.
Alice chooses a middle node at random (not using the entry guard strategy),
and gains anonymity by having her traffic look just like traffic from
someone else using her as an entry guard.
Can Alice's middle node figure out that she is the initiator of the
traffic? I can think of four possible approaches for distinguishing
traffic from Alice from traffic through Alice:
1) Notice that communication from Alice comes too fast: Experimentation is
needed to determine if traffic from Alice can be distinguished from traffic
from a computer with a decent link to Alice.
2) Monitor Alice's network traffic to discover the lack of incoming packets
at the appropriate times. If an adversary has this ability, then Alice
already has problems in the current system, because the adversary can run a
standard timing attack on Alice's traffic.
3) Notice that traffic from Alice is unique in some way such that if Alice
was just one of 3 entry guards for this traffic, then the traffic should be
coming from two other entry guards as well. An example of "unique traffic"
could be always sending 117 packets every 3 minutes to an exit node that
exits to port 4661. However, if such patterns existed with sufficient
precision, then it seems to me that Tor already has a problem. (This "unique
traffic" may not be a problem if clients often end up choosing a single
entry guard because their other two are down. Does anyone know if this is
the case?)
4) First, control the middle node *and* some other part of the traffic,
using standard attacks on a two hop circuit without entry nodes (my recent
paper on Browser-Based Attacks would work well for this
http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With
control of the circuit, we can now cause "unique traffic" as in 3).
Alternatively, if we know something about Alice independently, and we can
see what websites are being visited, we might be able to guess that she is
the kind of person that would visit those websites.
Anonymity Advantages:
-Alice never has the problem of choosing a malicious entry guard. In some
sense, Alice acts as her own entry guard.
Anonymity Disadvantages:
-If Alice's traffic is identified as originating from herself (see above for
how hard that might be), then she has the anonymity of a 2 hop circuit
without entry guards.
Additional advantages:
-A discussion of the latency advantages of two hop circuits is going on in
Mike Perry's thread already.
-Also, we can advertise this change as "Run an entry guard and decrease your
own Tor latency." This incentive has the potential to add nodes to the
network, improving the network as a whole.
Safer variations:
To solve the "unique traffic" problem, Alice could use two hop paths only
1/3 of the time, and choose 2 other entry guards for the other 2/3 of the
time. All the advantages are now 1/3 as useful (possibly more, if the other
2 entry guards are not always up).
To solve the problem that Alice's responses are too fast, Alice could delay
her responses (ideally based on some real data of response time when Alice
is used as an entry guard). This loses most of the speed advantages of the two
hop path, but if Alice is a fast entry guard, it doesn't lose everything. It
also still has the (arguable) anonymity advantage that Alice doesn't have to
worry about having a malicious entry guard.
Implementation details:
For Alice to remain anonymous using this strategy, she has to actually be
acting as an entry guard for other nodes. This means the two hop option can
only be made available to nodes that meet whatever high-performance threshold
is currently set for entry guards. Alice may need to somehow check her own
current status as an
entry guard before choosing this two hop strategy.
Another thing to consider: suppose Alice is also an exit node. If the
fraction of exit nodes in existence is too small, she may rarely or never be
chosen as an entry guard. It would be sad if we offered an incentive to run
an entry guard that didn't extend to exit nodes. I suppose clients of Exit
nodes could pull the same trick, and bypass using Tor altogether (zero hop
paths), though that has additional issues.*
Mike Lieberman
MIT
*Why we shouldn't recommend Exit nodes pull the same trick:
1) Exit nodes would suffer heavily from the problem of "unique traffic"
mentioned above.
2) It would give governments an incentive to confiscate exit nodes to see if
they are pulling this trick.
Filename: 117-ipv6-exits.txt
Title: IPv6 exits
Author: coderman
Created: 10-Jul-2007
Status: Closed
Target: 0.2.4.x
Implemented-In: 0.2.4.7-alpha
Overview
Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6
addresses. This proposal does not imply any IPv6 support for OR
traffic, only exit and name resolution.
Contents
0. Motivation
As the IPv4 address space becomes more scarce there is increasing
effort to provide Internet services via the IPv6 protocol. Many
hosts are available at IPv6 endpoints which are currently
inaccessible for Tor users.
Extending Tor to support IPv6 exit streams and IPv6 DNS name
resolution will allow users of the Tor network to access these hosts.
This capability would be present for those who do not currently have
IPv6 access, thus increasing the utility of Tor and furthering
adoption of IPv6.
1. Design
1.1. General design overview
There are three main components to this proposal. The first is a
method for routers to advertise their ability to exit IPv6 traffic.
The second is the manner in which routers resolve names to IPv6
addresses. Last but not least is the method in which clients
communicate with Tor to resolve and connect to IPv6 endpoints
anonymously.
1.2. Router IPv6 exit support
In order to specify exit policies and IPv6 capability new directives
in the Tor configuration will be needed. If a router advertises IPv6
exit policies in its descriptor this will signal the ability to
provide IPv6 exit. There are a number of additional default deny
rules associated with this new address space which are detailed in
the addendum.
When Tor is started on a host it should check for the presence of a
global unicast IPv6 address and if present include the default IPv6
exit policies and any user specified IPv6 exit policies.
If a user provides IPv6 exit policies but no global unicast IPv6
address is available Tor should generate a warning and not publish the
IPv6 policies in the router descriptor.
It should be noted that IPv4 mapped IPv6 addresses are not valid exit
destinations. This mechanism is mainly used to interoperate with
both IPv4 and IPv6 clients on the same socket. Any attempts to use
an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for
IPv4, must be refused.
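As an illustration (not part of the proposal), the refusal of IPv4-mapped IPv6 exit destinations might be sketched in Python; the function name is ours, and a real implementation would sit inside the exit-policy check:

```python
import ipaddress

def is_allowed_exit_destination(addr_str):
    """Refuse IPv4-mapped IPv6 addresses (e.g. ::ffff:1.2.3.4), which could
    otherwise be used to circumvent the IPv4 exit policy."""
    addr = ipaddress.ip_address(addr_str)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return False
    return True
```

A plain IPv4 or IPv6 destination passes; only the mapped form is rejected here, with normal exit-policy filtering still applied afterward.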
1.3. DNS name resolution of IPv6 addresses (AAAA records)
In addition to exit support for IPv6 TCP connections, a method to
resolve domain names to their respective IPv6 addresses is also
needed. This is accomplished in the existing DNS system via AAAA
records. Routers will perform both A and AAAA requests when
resolving a name so that the client can utilize an IPv6 endpoint when
available or preferred.
To avoid potential problems with caching DNS servers that behave
poorly all NXDOMAIN responses to AAAA requests should be ignored if a
successful response is received for an A request. This implies that
both AAAA and A requests will always be performed for each name
resolution.
For reverse lookups on IPv6 addresses, like that used for
RESOLVE_PTR, Tor will perform the necessary PTR requests via
IP6.ARPA.
All routers which perform DNS resolution on behalf of clients
(RELAY_RESOLVE) should perform and respond with both A and AAAA
resources.
[NOTE: In a future version, when we extend the behavior of RESOLVE to
encapsulate more of real DNS, it will make sense to allow more
flexibility here. -nickm]
1.4. Client interaction with IPv6 exit capability
1.4.1. Usability goals
There are a number of behaviors which Tor can provide when
interacting with clients that will improve the usability of IPv6 exit
capability. These behaviors are designed to make it simple for
clients to express a preference for IPv6 transport and utilize IPv6
host services.
1.4.2. SOCKSv5 IPv6 client behavior
The SOCKS version 5 protocol supports IPv6 connections. When using
SOCKSv5 with hostnames it is difficult to determine if a client
wishes to use an IPv4 or IPv6 address to connect to the desired host
if it resolves to both address types.
In order to make this more intuitive the SOCKSv5 protocol can be
supported on a local IPv6 endpoint, [::1] port 9050 for example.
When a client requests a connection to the desired host via an IPv6
SOCKS connection Tor will prefer IPv6 addresses when resolving the
host name and connecting to the host.
Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS
connection will return IPv6 addresses when available, and fall back
to IPv4 addresses if not.
[NOTE: This means that SocksListenAddress and DNSListenAddress should
support IPv6 addresses. Perhaps there should also be a general option
to have listeners that default to 127.0.0.1 and 0.0.0.0 listen
additionally or instead on ::1 and :: -nickm]
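The family preference inferred from the SOCKS listener could be sketched as follows (illustrative only; a real client might randomize within the preferred family rather than take the first entry):

```python
import ipaddress

def pick_address(resolved, prefer_ipv6):
    """Pick a connect address from a mixed A/AAAA result set, preferring the
    family implied by the listener the SOCKS client arrived on."""
    v6 = [a for a in resolved if ipaddress.ip_address(a).version == 6]
    v4 = [a for a in resolved if ipaddress.ip_address(a).version == 4]
    ordered = (v6 + v4) if prefer_ipv6 else (v4 + v6)
    return ordered[0] if ordered else None
```

A client connecting via [::1]:9050 would get the IPv6 address when one exists, falling back to IPv4 otherwise, matching the RESOLVE behavior described above.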
1.4.3. MAPADDRESS behavior
The MAPADDRESS capability supports clients that may not be able to
use SOCKSv4a or SOCKSv5 hostname resolution to resolve names via
Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as
well.
well.
When a client requests an address mapping from the wildcard IPv6
address, [::0], the server will respond with a unique local IPv6
address on success. It is important to note that there may be two
mappings for the same name if both an IPv4 and IPv6 address are
associated with the host. In this case a CONNECT to a mapped IPv6
address should prefer IPv6 for the connection to the host, if
available, while CONNECT to a mapped IPv4 address will prefer IPv4.
It should be noted that IPv6 does not provide the concept of a host
local subnet, like 127.0.0.0/8 in IPv4. For this reason integration
of Tor with IPv6 clients should consider a firewall or filter rule to
drop unique local addresses to or from the network when possible.
These packets should not be routed; even so, keeping them off the
subnet entirely is worthwhile.
1.4.3.1. Generating unique local IPv6 addresses
The usual manner of generating a unique local IPv6 address is to
select a Global ID part randomly, along with a Subnet ID, and sharing
this prefix among the communicating parties who each have their own
distinct Interface ID. In this style a given Tor instance might
select a random Global and Subnet ID and provide MAPADDRESS
assignments with a random Interface ID as needed. This has the
potential to associate unique Global/Subnet identifiers with a given
Tor instance and may expose attacks against the anonymity of Tor
users.
To avoid this potential problem entirely MAPADDRESS must always
generate the Global, Subnet, and Interface IDs randomly for each
request. It is also highly suggested that explicitly specifying an
IPv6 source address instead of the wildcard address not be supported
to ensure that a good random address is used.
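A minimal sketch of the fully-random generation rule above, assuming RFC 4193 style addresses with the L bit set (fd00::/8); the function name is ours:

```python
import os
import ipaddress

def random_unique_local_address():
    """Generate a fresh fd00::/8 unique local address with the Global ID,
    Subnet ID, and Interface ID all drawn at random, so no stable prefix
    links successive MAPADDRESS mappings back to one Tor instance."""
    suffix = int.from_bytes(os.urandom(15), "big")     # 120 random bits
    return str(ipaddress.IPv6Address((0xfd << 120) | suffix))
```

Because nothing below the fd00::/8 prefix is reused between requests, two mappings from the same instance share no identifying bits.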
1.4.4. DNSProxy IPv6 client behavior
A new capability in recent Tor versions is the transparent DNS proxy.
This feature will need to return both A and AAAA resource records
when responding to client name resolution requests.
The transparent DNS proxy should also support reverse lookups for
IPv6 addresses. It is suggested that any such requests to the
deprecated IP6.INT domain should be translated to IP6.ARPA instead.
This translation is not likely to be used and is of low priority.
It would be nice to support DNS over IPv6 transport as well, however,
this is not likely to be used and is of low priority.
1.4.5. TransPort IPv6 client behavior
Tor also provides transparent TCP proxy support via the Trans*
directives in the configuration. The TransListenAddress directive
should accept an IPv6 address in addition to IPv4 so that IPv6 TCP
connections can be transparently proxied.
1.5. Additional changes
The RedirectExit option should be deprecated rather than extending
this feature to IPv6.
2. Spec changes
2.1. Tor specification
In '6.2. Opening streams and transferring data' the following should
be changed to indicate IPv6 exit capability:
"No version of Tor currently generates the IPv6 format."
In '6.4. Remote hostname lookup' the following should be updated to
reflect use of ip6.arpa in addition to in-addr.arpa.
"For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an
in-addr.arpa address."
In 'A.1. Differences between spec and implementation' the following
should be updated to indicate IPv6 exit capability:
"The current codebase has no IPv6 support at all."
[NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an
ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2
type that can hold an ipv6 address, since the way we encode ipv6
addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6")
is a bit dumb. -nickm]
[Actually, the length field lets us distinguish EXITPOLICY. -nickm]
2.2. Directory specification
In '2.1. Router descriptor format' a new set of directives is needed
for IPv6 exit policy. The existing accept/reject directives should
be clarified to indicate IPv4 or wildcard address relevance. The new
IPv6 directives will be in the form of:
"accept6" exitpattern NL
"reject6" exitpattern NL
The section describing accept6/reject6 should explain that the
presence of accept6 or reject6 exit policies in a router descriptor
signals the ability of that router to exit IPv6 traffic (according to
IPv6 exit policies).
The "[::]/0" notation is used to represent "all IPv6 addresses".
"[::0]/0" may also be used for this representation.
If a user specifies a 'reject6 [::]/0:*' policy in the Tor
configuration this will be interpreted as forcing no IPv6 exit
support and no accept6/reject6 policies will be included in the
published descriptor. This will prevent IPv6 exit if the router host
has a global unicast IPv6 address present.
It is important to note that a wildcard address in an accept or
reject policy applies to both IPv4 and IPv6 addresses.
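As an illustrative sketch (not the Tor implementation), the distinction between the cross-family wildcard and family-specific patterns could look like this:

```python
import ipaddress

def policy_matches(pattern, addr):
    """Return True if an exit-policy address pattern covers addr.
    A bare "*" matches BOTH IPv4 and IPv6, per the note above; a CIDR
    pattern (including bracketed IPv6 like "[::]/0") matches only
    addresses of its own family."""
    if pattern == "*":
        return True
    net = ipaddress.ip_network(pattern.replace("[", "").replace("]", ""),
                               strict=False)
    a = ipaddress.ip_address(addr)
    return a.version == net.version and a in net
```

So "reject *:25" blocks port 25 for both families, while "reject6 [::]/0:*" affects only IPv6 destinations.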
2.3. Control specification
In '3.8. MAPADDRESS' the potential to have two addresses for a given
name should be explained. The method for generating unique local
addresses for IPv6 mappings needs explanation as described above.
When IPv6 addresses are used in this document they should include the
brackets for consistency. For example, the null IPv6 address should
be written as "[::0]" and not "::0". The control commands will
expect the same syntax as well.
In '3.9. GETINFO' the "address" command should return both public
IPv4 and IPv6 addresses if present. These addresses should be
separated via \r\n.
2.4. Tor SOCKS extensions
In '2. Name lookup' a description of IPv6 address resolution is
needed for SOCKSv5 as described above. IPv6 addresses should be
supported in both the RESOLVE and RESOLVE_PTR extensions.
A new section describing the ability to accept SOCKSv5 clients on a
local IPv6 address to indicate a preference for IPv6 transport as
described above is also needed. The behavior of Tor SOCKSv5 proxy
with an IPv6 preference should be explained, for example, preferring
IPv6 transport to a named host with both IPv4 and IPv6 addresses
available (A and AAAA records).
3. Questions and concerns
3.1. DNS A6 records
A6 is explicitly avoided in this document. There are potential
reasons for implementing this, however, the inherent complexity of
the protocol and resolvers make this unappealing. Is there a
compelling reason to consider A6 as part of IPv6 exit support?
[IMO not till anybody needs it. -nickm]
3.2. IPv4 and IPv6 preference
The design above tries to infer a preference for IPv4 or IPv6
transport based on client interactions with Tor. It might be useful
to provide more explicit control over this preference. For example,
an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts
in CONNECT requests while the current implementation would assume an
IPv4 preference. Should more explicit control be available, through
either configuration directives or control commands?
Many applications support an inet6-only or prefer-family type option
that gives the user manual control over address preference. This
could be provided as a Tor configuration option.
An explicit preference is still possible by resolving names and then
CONNECTing to an IPv4 or IPv6 address as desired, however, not all
client applications may have this option available.
3.3. Support for IPv6 only transparent proxy clients
It may be useful to support IPv6 only transparent proxy clients using
IPv4 mapped IPv6 like addresses. This would require transparent DNS
proxy using IPv6 transport and the ability to map A record responses
into IPv4 mapped IPv6 like addresses in the manner described in the
"NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The
transparent TCP proxy would thus need to detect these mapped addresses
and connect to the desired IPv4 host.
The IPv6 prefix used for this purpose must not be the actual IPv4
mapped IPv6 address prefix, though the manner in which IPv4 addresses
are embedded in IPv6 addresses would be the same.
The lack of any IPv6 only hosts which would use this transparent proxy
method makes this a lot of work for very little gain. Is there a
compelling reason to support this NAT-PT like capability?
3.4. IPv6 DNS and older Tor routers
It is expected that many routers will continue to run with older
versions of Tor when the IPv6 exit capability is released. Clients
who wish to use IPv6 will need to route RELAY_RESOLVE requests to the
newer routers which will respond with both A and AAAA resource
records when possible.
One way to do this is to route RELAY_RESOLVE requests to routers with
IPv6 exit policies published, however, this would not utilize current
routers that can resolve IPv6 addresses even if they can't exit such
traffic.
There was also concern expressed about the ability of existing clients
to cope with new RELAY_RESOLVE responses that contain IPv6 addresses.
If this breaks backward compatibility, a new request type may be
necessary, like RELAY_RESOLVE6, or some other mechanism of indicating
the ability to parse IPv6 responses when making the request.
3.5. IPv4 and IPv6 bindings in MAPADDRESS
It may be troublesome to try and support two distinct address mappings
for the same name in the existing MAPADDRESS implementation. If this
cannot be accommodated then the behavior should replace existing
mappings with the new address regardless of family. A warning when
this occurs would be useful to assist clients who encounter problems
when both an IPv4 and IPv6 application are using MAPADDRESS for the
same names concurrently, causing lost connections for one of them.
4. Addendum
4.1. Sample IPv6 default exit policy
reject 0.0.0.0/8
reject 169.254.0.0/16
reject 127.0.0.0/8
reject 192.168.0.0/16
reject 10.0.0.0/8
reject 172.16.0.0/12
reject6 [0000::]/8
reject6 [0100::]/8
reject6 [0200::]/7
reject6 [0400::]/6
reject6 [0800::]/5
reject6 [1000::]/4
reject6 [4000::]/3
reject6 [6000::]/3
reject6 [8000::]/3
reject6 [A000::]/3
reject6 [C000::]/3
reject6 [E000::]/4
reject6 [F000::]/5
reject6 [F800::]/6
reject6 [FC00::]/7
reject6 [FE00::]/9
reject6 [FE80::]/10
reject6 [FEC0::]/10
reject6 [FF00::]/8
reject *:25
reject *:119
reject *:135-139
reject *:445
reject *:1214
reject *:4661-4666
reject *:6346-6429
reject *:6699
reject *:6881-6999
accept *:*
# accept6 [2000::]/3:* is implied
4.2. Additional resources
'DNS Extensions to Support IP Version 6'
http://www.ietf.org/rfc/rfc3596.txt
'DNS Extensions to Support IPv6 Address Aggregation and Renumbering'
http://www.ietf.org/rfc/rfc2874.txt
'SOCKS Protocol Version 5'
http://www.ietf.org/rfc/rfc1928.txt
'Unique Local IPv6 Unicast Addresses'
http://www.ietf.org/rfc/rfc4193.txt
'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE'
http://www.iana.org/assignments/ipv6-address-space
'Network Address Translation - Protocol Translation (NAT-PT)'
http://www.ietf.org/rfc/rfc2766.txt
Filename: 118-multiple-orports.txt
Title: Advertising multiple ORPorts at once
Author: Nick Mathewson
Created: 09-Jul-2007
Status: Superseded
Superseded-By: 186-multiple-orports.txt
[Needs Revision: This proposal needs revision to come up to 2011 standards
and take microdescriptors into account.]
Overview:
This document is a proposal for servers to advertise multiple
address/port combinations for their ORPort.
Motivation:
Sometimes servers want to support multiple ports for incoming
connections, either in order to support multiple address families, to
better use multiple interfaces, or to support a variety of
FascistFirewallPorts settings. This is easy to set up now, but
there's no way to advertise it to clients.
New descriptor syntax:
We add a new line in the router descriptor, "or-address". This line
can occur zero, one, or multiple times. Its format is:
or-address SP ADDRESS ":" PORTLIST NL
ADDRESS = IPV6ADDR / IPV4ADDR
IPV6ADDR = an IPv6 address, surrounded by square brackets.
IPV4ADDR = an IPv4 address, represented as a dotted quad.
PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
PORTSPEC = PORT | PORT "-" PORT
[This is the regular format for specifying sets of addresses and
ports in Tor.]
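A hypothetical parser for this line format, sketched in Python to make the grammar concrete (the function name is ours; port ranges are expanded inclusively):

```python
def parse_or_address(line):
    """Parse one "or-address" descriptor line into (address, [ports]),
    handling bracketed IPv6 addresses and the PORTSPEC "," / "-" syntax."""
    assert line.startswith("or-address ")
    spec = line.split(None, 1)[1]
    addr, _, portlist = spec.rpartition(":")   # last ":" separates ports
    addr = addr.strip("[]")                    # brackets only wrap IPv6
    ports = []
    for part in portlist.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            ports.extend(range(int(lo), int(hi) + 1))
        else:
            ports.append(int(part))
    return addr, ports
```

Splitting on the last ":" is what makes the bracketed-IPv6 form unambiguous, since the address itself contains colons.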
New OR behavior:
We add two more options to supplement ORListenAddress:
ORPublishedListenAddress, and ORPublishAddressSet. The former
listens on an address-port combination and publishes it in addition
to the regular address. The latter advertises a set of address-port
combinations, but does not listen on them. [To use this option, the
server operator should set up port forwarding to the regular ORPort,
as for example with firewall rules.]
Servers should extend their testing to include advertised addresses
and ports. No address or port should be advertised until it's been
tested. [This might get expensive in practice.]
New authority behavior:
Authorities should spot-test descriptors, and reject any where a
substantial part of the addresses can't be reached.
New client behavior:
When connecting to another server, clients SHOULD pick an
address-port combination at random, as permitted by their
ReachableAddresses configuration. If a client has a connection to a server at one
address, it SHOULD use that address for any simultaneous connections
to that server. Clients SHOULD use the canonical address for any
server when generating extend cells.
Not addressed here:
* There's no reason to listen on multiple dirports; current Tors
mostly don't connect directly to the dirport anyway.
* It could be advantageous to list something about extra addresses in
the network-status document. This would, however, eat space there.
More analysis is needed, particularly in light of proposal 141
("Download server descriptors on demand")
Dependencies:
Testing for canonical connections needs to be implemented before it's
safe to use this proposal.
Notes 3 July:
- Write up the simple version of this. No ranges needed yet. No
networkstatus changes yet.
Filename: 119-controlport-auth.txt
Title: New PROTOCOLINFO command for controllers
Author: Roger Dingledine
Created: 14-Aug-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
Here we describe how to help controllers locate the cookie
authentication file when authenticating to Tor, so we can a) require
authentication by default for Tor controllers and b) still keep
things usable. Also, we propose an extensible, general-purpose mechanism
for controllers to learn about a Tor instance's protocol and
authentication requirements before authenticating.
The Problem:
When we first added the controller protocol, we wanted to make it
easy for people to play with it, so by default we didn't require any
authentication from controller programs. We allowed requests only from
localhost as a stopgap measure for security.
Due to an increasing number of vulnerabilities based on this approach,
it's time to add authentication in default configurations.
We have a number of goals:
- We want the default Vidalia bundles to transparently work. That
means we don't want the users to have to type in or know a password.
- We want to allow multiple controller applications to connect to the
control port. So if Vidalia is launching Tor, it can't just keep the
secrets to itself.
Right now there are three authentication approaches supported
by the control protocol: NULL, CookieAuthentication, and
HashedControlPassword. See Sec 5.1 in control-spec.txt for details.
There are a couple of challenges here. The first is: if the controller
launches Tor, how should we teach Tor what authentication approach
it should require, and the secret that goes along with it? Next is:
how should this work when the controller attaches to an existing Tor,
rather than launching Tor itself?
Cookie authentication seems most amenable to letting multiple controller
applications interact with Tor. But that brings in yet another question:
how does the controller guess where to look for the cookie file,
without first knowing what DataDirectory Tor is using?
Design:
We should add a new controller command PROTOCOLINFO that can be sent
as a valid first command (the others being AUTHENTICATE and QUIT). If
PROTOCOLINFO is sent as the first command, the second command must be
either a successful AUTHENTICATE or a QUIT.
If the initial command sequence is not valid, Tor closes the connection.
Spec:
C: "PROTOCOLINFO" *(SP PIVERSION) CRLF
S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF
InfoLine = AuthLine / VersionLine / OtherLine
AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
*(SP "COOKIEFILE=" AuthCookieFile) CRLF
VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF
AuthMethod =
"NULL" / ; No authentication is required
"HASHEDPASSWORD" / ; A controller must supply the original password
"COOKIE" ; A controller must supply the contents of a cookie
AuthCookieFile = QuotedString
TorVersion = QuotedString
OtherLine = "250-" Keyword [SP Arguments] CRLF
For example:
C: PROTOCOLINFO CRLF
S: "250+PROTOCOLINFO 1" CRLF
S: "250-AUTH METHODS=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF
S: "250-VERSION Tor=0.2.0.5-alpha" CRLF
S: "250 OK" CRLF
Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines
with keywords it does not recognize. Controllers MUST ignore extraneous
data on any InfoLine.
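A controller-side sketch of parsing the AUTH InfoLine under these rules (illustrative; a production controller must handle the full QuotedString escaping rules, which shlex only approximates):

```python
import shlex

def parse_auth_line(line):
    """Extract the method list and optional cookie file from a 250-AUTH
    InfoLine, ignoring any key=value pairs it does not recognize."""
    assert line.startswith("250-AUTH ")
    methods, cookiefile = [], None
    for field in shlex.split(line[len("250-AUTH "):]):
        key, _, value = field.partition("=")
        if key == "METHODS":
            methods = value.split(",")
        elif key == "COOKIEFILE":
            cookiefile = value
        # unknown keys are skipped, per the MUST-ignore rule above
    return methods, cookiefile
```

Unknown keywords fall through silently, which is what keeps old controllers working when new fields are added.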
PIVERSION is there in case we drastically change the syntax one day. For
now it should always be "1", for the controller protocol. Controllers MAY
provide a list of the protocol versions they support; Tor MAY select a
version that the controller does not support.
Right now only two "topics" (AUTH and VERSION) are included, but more
may be included in the future. Controllers must accept lines with
unexpected topics.
AuthMethod is used to specify one or more control authentication
methods that Tor currently accepts.
AuthCookieFile specifies the absolute path and filename of the
authentication cookie that Tor is expecting and is provided iff
the METHODS field contains the method "COOKIE". Controllers MUST handle
escape sequences inside this string.
The VERSION line contains the Tor version.
[What else might we want to include that could be useful? -RD]
Compatibility:
Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed
command. Earlier Tors don't know about this command but don't hang
up. That means controllers will need a mechanism for distinguishing
whether they're talking to a Tor that speaks PROTOCOLINFO or not.
I suggest that the controllers attempt a PROTOCOLINFO. Then:
- If it works, great. Authenticate as required.
- If they get hung up on, reconnect and do a NULL AUTHENTICATE.
- If it's unrecognized but they're not hung up on, do a NULL
AUTHENTICATE.
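The suggested fallback could be reduced to a small decision function (a sketch; the outcome names are ours, not from the spec):

```python
def next_step(protocolinfo_outcome):
    """Map the three PROTOCOLINFO outcomes above to the controller's next
    action: "ok" means the command was understood, "hangup" means Tor
    closed the connection, "unrecognized" means an error reply arrived."""
    if protocolinfo_outcome == "ok":
        return "authenticate-as-advertised"
    if protocolinfo_outcome in ("hangup", "unrecognized"):
        # Older Tor: reconnect if hung up, then try a NULL AUTHENTICATE.
        return "null-authenticate"
    raise ValueError("unknown outcome: %r" % protocolinfo_outcome)
```

This keeps the compatibility logic in one place, separate from the socket handling.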
Unsolved problems:
If Torbutton wants to be a Tor controller one day... talking TCP is
bad enough, but reading from the filesystem is even harder. Is there
a way to let simple programs work with the controller port without
needing all the auth infrastructure?
Once we put this approach in place, the next vulnerability we see will
involve an attacker somehow getting read access to the victim's files
--- and then we're back where we started. This means we still need
to think about how to demand password-based authentication without
bothering the user about it.
Filename: 120-shutdown-descriptors.txt
Title: Shutdown descriptors when Tor servers stop
Author: Roger Dingledine
Created: 15-Aug-2007
Status: Dead
[Proposal dead as of 11 Jul 2008. The point of this proposal was to give
routers a good way to get out of the networkstatus early, but proposal
138 (already implemented) has achieved this.]
Overview:
Tor servers should publish a last descriptor whenever they shut down,
to let others know that they are no longer offering service.
The Problem:
The main reason for this is in reaction to Internet services that want
to treat connections from the Tor network differently. Right now,
if a user experiments with turning on the "relay" functionality, he
is punished by being locked out of some websites, some IRC networks,
etc --- and this lockout persists for several days even after he turns
the server off.
Design:
During the "slow shutdown" period if exiting, or shortly after the
user sets his ORPort back to 0 if not exiting, Tor should publish a
final descriptor with the following characteristics:
1) Exit policy is listed as "reject *:*"
2) It includes a new entry called "opt shutdown 1"
The first step is so current blacklists will no longer list this node
as exiting to whatever the service is.
The second step is so directory authorities can avoid wasting time
doing reachability testing. Authorities should automatically not list
as Running any router whose latest descriptor says it shut down.
[I originally had in mind a third step --- Advertised bandwidth capacity
is listed as "0" --- so current Tor clients will skip over this node
when building most circuits. But since clients won't fetch descriptors
from nodes not listed as Running, this step seems pointless. -RD]
Spec:
TBD but should be pretty straightforward.
Security issues:
Now external people can learn exactly when a node stopped offering
relay service. How bad is this? I can see a few minor attacks based
on this knowledge, but on the other hand as it is we don't really take
any steps to keep this information secret.
Overhead issues:
We are creating more descriptors that want to be remembered. However,
since the router won't be marked as Running, ordinary clients won't
fetch the shutdown descriptors. Caches will, though. I hope this is ok.
Implementation:
To make things easy, we should publish the shutdown descriptor only
on controlled shutdown (SIGINT as opposed to SIGTERM). That would
leave enough time for publishing that we probably wouldn't need any
extra synchronization code.
If that turns out to be too unintuitive for users, I could imagine doing
it on SIGTERMs too, and just delaying exit until we had successfully
published to at least one authority, at which point we'd hope that it
propagated from there.
Acknowledgements:
tup suggested this idea.
Comments:
2) Maybe add a rule "Don't do this for hibernation if we expect to wake
up before the next consensus is published"?
- NM 9 Oct 2007
Filename: 121-hidden-service-authentication.txt
Title: Hidden Service Authentication
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
Christoph Weingarten
Created: 10-Sep-2007
Status: Closed
Implemented-In: 0.2.1.x
Change history:
26-Sep-2007 Initial proposal for or-dev
08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007
15-Dec-2007 Rewrote complete proposal for better readability, modified
authentication protocol, merged in personal notes
24-Dec-2007 Replaced misleading term "authentication" by "authorization"
and added some clarifications (comments by Sven Kaffille)
28-Apr-2008 Updated most parts of the concrete authorization protocol
04-Jul-2008 Add a simple algorithm to delay descriptor publication for
different clients of a hidden service
19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay
protection for INTRODUCE2 cells (1.3), described limitations
for auth protocols (1.6), improved hidden service protocol
without client authorization (2.1), added second, more
scalable authorization protocol (2.2), rewrote existing
authorization protocol (2.3); changes based on discussion
with Nick
31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent
abuse.
01-Aug-2008 Use first part of Diffie-Hellman handshake for replay
protection instead of rendezvous cookie.
01-Aug-2008 Remove improved hidden service protocol without client
authorization (2.1). It might get implemented in proposal
142.
Overview:
This proposal deals with a general infrastructure for performing
authorization (not necessarily implying authentication) of requests to
hidden services at three points: (1) when downloading and decrypting
parts of the hidden service descriptor, (2) at the introduction point,
and (3) at Bob's Tor client before contacting the rendezvous point. A
service provider will be able to restrict access to his service at these
three points to authorized clients only. Further, the proposal contains
specific authorization protocols as instances that implement the
presented authorization infrastructure.
This proposal is based on v2 hidden service descriptors as described in
proposal 114 and introduced in version 0.2.0.10-alpha.
The proposal is structured as follows: The next section motivates the
integration of authorization mechanisms in the hidden service protocol.
Then we describe a general infrastructure for authorization in hidden
services, followed by specific authorization protocols for this
infrastructure. At the end we discuss a number of attacks and non-attacks
as well as compatibility issues.
Motivation:
Most hidden services do not require client authorization now and won't
do so in the future. On the contrary, many clients would not want to be
(pseudonymously) identifiable by the service (though this is unavoidable
to some extent), but would rather use the service
anonymously. These services are not addressed by this proposal.
However, there may be certain services which are intended to be accessed
by a limited set of clients only. A possible application might be a
wiki or forum that should only be accessible for a closed user group.
Another, less intuitive example might be a real-time communication
service, where someone provides a presence and messaging service only to
his buddies. Finally, a possible application would be a personal home
server that should be remotely accessed by its owner.
Performing authorization for a hidden service within the Tor network, as
proposed here, offers a range of advantages compared to allowing all
client connections in the first instance and deferring authorization to
the transported protocol:
(1) Reduced traffic: Unauthorized requests would be rejected as early as
possible, thereby reducing the overall traffic in the network generated
by establishing circuits and sending cells.
(2) Better protection of service location: Unauthorized clients could not
force Bob to create circuits to their rendezvous points, thus preventing
the attack described by Øverlier and Syverson in their paper "Locating
Hidden Servers" even without the need for guards.
(3) Hiding activity: Apart from performing the actual authorization, a
service provider could also hide the mere presence of his service from
unauthorized clients when not providing hidden service descriptors to
them, rejecting unauthorized requests already at the introduction
point (ideally without leaking presence information at any of these
points), or not answering unauthorized introduction requests.
(4) Better protection of introduction points: When providing hidden
service descriptors to authorized clients only and encrypting the
introduction points as described in proposal 114, the introduction points
would be unknown to unauthorized clients and thereby protected from DoS
attacks.
(5) Protocol independence: Authorization could be performed for all
transported protocols, regardless of their own capabilities to do so.
(6) Ease of administration: A service provider running multiple hidden
services would be able to configure access at a single place uniformly
instead of doing so for all services separately.
(7) Optional QoS support: Bob could adapt his node selection algorithm
for building the circuit to Alice's rendezvous point depending on a
previously guaranteed QoS level, thus providing better latency or
bandwidth for selected clients.
A disadvantage of performing authorization within the Tor network is
that a hidden service cannot make use of authorization data in
the transported protocol. Tor hidden services were designed to be
independent of the transported protocol. Therefore it's only possible to
either grant or deny access to the whole service, but not to specific
resources of the service.
Authorization often implies authentication, i.e. proving one's identity.
However, when performing authorization within the Tor network, untrusted
points should not gain any useful information about the identities of
communicating parties, neither server nor client. A crucial challenge is
to remain anonymous towards directory servers and introduction points.
However, trying to hide identity from the hidden service is a futile
task, because a client would never know if he is the only authorized
client and therefore perfectly identifiable. Therefore, hiding client
identity from the hidden service is not an aim of this proposal.
The current implementation of hidden services does not provide any kind
of authorization. The hidden service descriptor version 2, introduced by
proposal 114, was designed to use a descriptor cookie for downloading and
decrypting parts of the descriptor content, but this feature is not yet
in use. Further, most relevant cell formats specified in rend-spec
contain fields for authorization data, but those fields are neither
implemented nor do they suffice entirely.
Details:
1. General infrastructure for authorization to hidden services
We spotted three possible authorization points in the hidden service
protocol:
(1) when downloading and decrypting parts of the hidden service
descriptor,
(2) at the introduction point, and
(3) at Bob's Tor client before contacting the rendezvous point.
The general idea of this proposal is to allow service providers to
restrict access to some or all of these points to authorized clients
only.
1.1. Client authorization at directory
Since the implementation of proposal 114 it is possible to combine a
hidden service descriptor with a so-called descriptor cookie. If done so,
the descriptor cookie becomes part of the descriptor ID, thus having an
effect on the storage location of the descriptor. Someone who has learned
about a service, but is not aware of the descriptor cookie, won't be able
to determine the descriptor ID and download the current hidden service
descriptor; he won't even know whether the service has uploaded a
descriptor recently. Descriptor IDs are calculated as follows (see
section 1.2 of rend-spec for the complete specification of v2 hidden
service descriptors):
descriptor-id =
H(service-id | H(time-period | descriptor-cookie | replica))
Currently, service-id is equivalent to permanent-id which is calculated
as in the following formula. But in principle it could be any public
key.
permanent-id = H(permanent-key)[:10]
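The two formulas above can be sketched in Python as a non-normative
illustration. H() is SHA-1 for v2 hidden service descriptors; the byte
encodings chosen here for time-period and replica are simplifying
assumptions, not the exact rend-spec encodings:

```python
# Sketch of the v2 descriptor-ID computation described above.
# Field encodings are illustrative assumptions; see rend-spec 1.2
# for the normative definition.
import hashlib

def h(data: bytes) -> bytes:
    """H() is SHA-1 for v2 hidden service descriptors."""
    return hashlib.sha1(data).digest()

def permanent_id(permanent_key: bytes) -> bytes:
    # permanent-id = H(permanent-key)[:10]
    return h(permanent_key)[:10]

def descriptor_id(service_id: bytes, time_period: int,
                  descriptor_cookie: bytes, replica: int) -> bytes:
    # inner hash: H(time-period | descriptor-cookie | replica)
    secret_id_part = h(time_period.to_bytes(4, "big")
                       + descriptor_cookie
                       + replica.to_bytes(1, "big"))
    # descriptor-id = H(service-id | secret-id-part)
    return h(service_id + secret_id_part)
```

Note that without knowledge of the descriptor cookie, an observer who
knows only the service ID cannot reproduce the inner hash and therefore
cannot predict where the descriptor is stored.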
The second purpose of the descriptor cookie is to encrypt the list of
introduction points, including optional authorization data. Hence, the
hidden service directories won't learn any introduction information from
storing a hidden service descriptor. This feature is implemented but
unused at the moment. So this proposal will harness the advantages
of proposal 114.
The descriptor cookie can be used for authorization by keeping it secret
from everyone but authorized clients. A service could then decide whether
to publish hidden service descriptors using that descriptor cookie later
on. An authorized client being aware of the descriptor cookie would be
able to download and decrypt the hidden service descriptor.
The number of concurrently used descriptor cookies for one hidden service
is not restricted. A service could use a single descriptor cookie for all
users, a distinct cookie per user, or something in between, like one
cookie per group of users. It is up to the specific protocol and how it
is applied by a service provider.
Two or more hidden service descriptors for different groups or users
should not be uploaded at the same time. A directory node could conclude
easily that the descriptors were issued by the same hidden service, thus
being able to link the two groups or users. Therefore, descriptors for
different users or clients that ought to be stored on the same directory
are delayed, so that only one descriptor is uploaded to a directory at a
time. The remaining descriptors are uploaded with a delay of up to
30 seconds.
Further, descriptors for different groups or users that are to be stored
on different directories are delayed for a random time of up to 30
seconds to hide relations from colluding directories. Certainly, this
does not prevent linking entirely, but it makes it somewhat harder.
There is a conflict between hiding links between clients and making a
service available in a timely manner.
Although this part of the proposal is meant to describe a general
infrastructure for authorization, changing the way of using the
descriptor cookie to look up hidden service descriptors, e.g. applying
some sort of asymmetric crypto system, would require in-depth changes
that would be incompatible to v2 hidden service descriptors. On the
contrary, using another key for en-/decrypting the introduction point
part of a hidden service descriptor, e.g. a different symmetric key or
asymmetric encryption, would be easy to implement and compatible to v2
hidden service descriptors as understood by hidden service directories
(clients and services would have to be upgraded anyway for using the new
features).
An adversary could try to abuse the fact that introduction points can be
encrypted by storing arbitrary, unrelated data in the hidden service
directory. This abuse can be limited by setting a hard descriptor size
limit, forcing the adversary to split data into multiple chunks. There
are some limitations that make splitting data across multiple descriptors
unattractive: 1) The adversary would not be able to choose descriptor IDs
freely and would therefore have to implement his own indexing
structure. 2) Validity of descriptors is limited to at most 24 hours
after which descriptors need to be republished.
The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data.
A large descriptor with 7 introduction points and 5 kilobytes of
authorization data would be 11724 bytes in size. The upper size limit of
descriptors should be set to 20 kilobytes, which limits the effect of
abuse while retaining enough flexibility in designing authorization
protocols.
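The size formula above can be verified directly (taking "kilobyte" as
1024 octets, which matches the 11724 figure given in the text):

```python
# Size formula from the text: 745 octets of fixed overhead, plus
# 837 octets per introduction point, plus authorization data.
def descriptor_size(num_ipos: int, auth_data_len: int) -> int:
    return 745 + num_ipos * 837 + auth_data_len

# The worked example above: 7 intro points, 5 kilobytes of auth data.
assert descriptor_size(7, 5 * 1024) == 11724
```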
1.2. Client authorization at introduction point
The next possible authorization point after downloading and decrypting
a hidden service descriptor is the introduction point. It may be important
for authorization, because it bears the last chance of hiding presence
of a hidden service from unauthorized clients. Further, performing
authorization at the introduction point might reduce traffic in the
network, because unauthorized requests would not be passed to the
hidden service. This applies to those clients who are aware of a
descriptor cookie and thereby of the hidden service descriptor, but do
not have authorization data to pass the introduction point or access the
service (such a situation might occur when authorization data for
authorization at the directory is not issued on a per-user basis, but
authorization data for authorization at the introduction point is).
It is important to note that the introduction point must be considered
untrustworthy, and therefore cannot replace authorization at the hidden
service itself. Nor should the introduction point learn any sensitive
identifiable information from either the service or the client.
In order to perform authorization at the introduction point, three
message formats need to be modified: (1) v2 hidden service descriptors,
(2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells.
A v2 hidden service descriptor needs to contain authorization data that
is introduction-point-specific and sometimes also authorization data
that is introduction-point-independent. Therefore, v2 hidden service
descriptors as specified in section 1.2 of rend-spec already contain two
reserved fields "intro-authorization" and "service-authorization"
(originally, the names of these fields were "...-authentication")
containing an authorization type number and arbitrary authorization
data. We propose that authorization data consists of base64 encoded
objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and
"-----END MESSAGE-----". This will increase the size of hidden service
descriptors, but this is allowed since there is no strict upper limit.
The current ESTABLISH_INTRO cells as described in section 1.3 of
rend-spec do not contain either authorization data or version
information. Therefore, we propose a new version 1 of the ESTABLISH_INTRO
cells adding these two issues as follows:
V Format byte: set to 255 [1 octet]
V Version byte: set to 1 [1 octet]
KL Key length [2 octets]
PK Bob's public key [KL octets]
HS Hash of session info [20 octets]
AUTHT The auth type that is supported [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
SIG Signature of above information [variable]
From the format it is possible to determine the maximum allowed size for
authorization data: given the fact that cells are 512 octets long, of
which 498 octets are usable (see section 6.1 of tor-spec), and assuming
1024 bit = 128 octet long keys, there are 215 octets left for
authorization data. Hence, authorization protocols are bound to use no
more than these 215 octets, regardless of the number of clients that
shall be authenticated at the introduction point. Otherwise, one would
need to send multiple ESTABLISH_INTRO cells or split them up, which we do
not specify here.
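As an illustration of the layout and the 215-octet budget, here is a
hedged sketch of packing a v1 ESTABLISH_INTRO body. Signature
generation is out of scope and is passed in pre-computed; this is not
the implemented wire code:

```python
# Sketch of packing the v1 ESTABLISH_INTRO body laid out above.
import struct

RELAY_PAYLOAD_USABLE = 498  # usable octets per relay cell (tor-spec 6.1)

def pack_establish_intro_v1(pk: bytes, session_hash: bytes,
                            auth_type: int, auth_data: bytes,
                            sig: bytes) -> bytes:
    assert len(session_hash) == 20
    body = struct.pack("!BBH", 255, 1, len(pk)) + pk       # V, version, KL, PK
    body += session_hash                                   # HS
    body += struct.pack("!BH", auth_type, len(auth_data))  # AUTHT, AUTHL
    body += auth_data + sig                                # AUTHD, SIG
    assert len(body) <= RELAY_PAYLOAD_USABLE, "cell overflow"
    return body

# Budget check: with a 128-octet key and 128-octet signature, the fixed
# fields consume 1+1+2+128+20+1+2+128 = 283 octets, leaving 215 for AUTHD.
assert RELAY_PAYLOAD_USABLE - (1 + 1 + 2 + 128 + 20 + 1 + 2 + 128) == 215
```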
In order to understand a v1 ESTABLISH_INTRO cell, the implementation of
a relay must have a certain Tor version. Hidden services need to be able
to distinguish relays being capable of understanding the new v1 cell
formats and perform authorization. We propose to use the version number
that is contained in networkstatus documents to find capable
introduction points.
The current INTRODUCE1 cell as described in section 1.8 of rend-spec is
not designed to carry authorization data and has no version number either.
Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size,
seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This
makes it impossible to distinguish unversioned INTRODUCE1 cells from any
later format. In particular, it is not possible to introduce some kind of
format and version byte for newer versions of this cell. That's probably
where the comment "[XXX011 want to put intro-level auth info here, but no
version. crap. -RD]" that was part of rend-spec some time ago comes from.
We propose that new versioned INTRODUCE1 cells use the new cell type 41
RELAY_INTRODUCE1V (where V stands for versioned):
Cleartext
V Version byte: set to 1 [1 octet]
PK_ID Identifier for Bob's PK [20 octets]
AUTHT The auth type that is included [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
Encrypted to Bob's PK:
(RELAY_INTRODUCE2 cell)
The maximum length of contained authorization data depends on the length
of the contained INTRODUCE2 cell. A calculation follows below when
describing the INTRODUCE2 cell format we propose to use.
1.3. Client authorization at hidden service
The time when a hidden service receives an INTRODUCE2 cell constitutes
the last possible authorization point during the hidden service
protocol. Performing authorization here is easier than at the other two
authorization points, because there are no possibly untrusted entities
involved.
In general, a client that is successfully authorized at the introduction
point should be granted access at the hidden service, too. Otherwise, the
client would receive a positive INTRODUCE_ACK cell from the introduction
point and conclude that it may connect to the service, but the request
will be dropped without notice. This would appear as a failure to
clients. Therefore, the number of cases in which a client successfully
passes the introduction point but fails at the hidden service should be
zero. However, this does not lead to the conclusion that the
authorization data used at the introduction point and the hidden service
must be the same, but only that both authorization data should lead to
the same authorization result.
Authorization data is transmitted from client to server via an
INTRODUCE2 cell that is forwarded by the introduction point. There are
versions 0 to 2 specified in section 1.8 of rend-spec, but none of these
contain fields for carrying authorization data. We propose a slightly
modified version of v3 INTRODUCE2 cells that is specified in section
1.8.1 and which is not implemented as of December 2007. In contrast to
the specified v3 we avoid specifying (and implementing) IPv6 capabilities,
because Tor relays will be required to support IPv4 addresses for a long
time in the future, so that this seems unnecessary at the moment. The
proposed format of v3 INTRODUCE2 cells is as follows:
VER Version byte: set to 3. [1 octet]
AUTHT The auth type that is used [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
TS Timestamp (seconds since 1-1-1970) [4 octets]
IP Rendezvous point's address [4 octets]
PORT Rendezvous point's OR port [2 octets]
ID Rendezvous point identity ID [20 octets]
KLEN Length of onion key [2 octets]
KEY Rendezvous point onion key [KLEN octets]
RC Rendezvous cookie [20 octets]
g^x Diffie-Hellman data, part 1 [128 octets]
The maximum possible length of authorization data is related to the
enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with
1024 bit = 128 octets long public key without any authorization data
occupies 306 octets (AUTHL is only used when AUTHT has a value != 0),
plus 58 octets for hybrid public key encryption (see
section 5.1 of tor-spec on hybrid encryption of CREATE cells). The
surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110
of the 498 available octets free, which must be shared between
authorization data to the introduction point _and_ to the hidden
service.
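The budget arithmetic above can be restated as a small consistency
check (all constants come directly from the text; this is not
normative):

```python
# Octet budget for authorization data in an INTRODUCE1V/INTRODUCE2 pair.
USABLE = 498            # usable relay cell octets (tor-spec 6.1)
INTRODUCE2_FIXED = 306  # v3 INTRODUCE2 with 128-octet key, no auth data
HYBRID_OVERHEAD = 58    # hybrid public-key encryption (tor-spec 5.1)
INTRODUCE1V_HDR = 24    # V + PK_ID + AUTHT + AUTHL = 1 + 20 + 1 + 2

auth_budget = USABLE - (INTRODUCE2_FIXED + HYBRID_OVERHEAD + INTRODUCE1V_HDR)
# 110 octets remain, shared between intro-point and service auth data.
assert auth_budget == 110
```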
When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has
provided valid authorization data to him. He also requires that the
timestamp is no more than 30 minutes in the past or future and that the
first part of the Diffie-Hellman handshake has not been used in the past
60 minutes to prevent replay attacks by rogue introduction points. (The
reason for not using the rendezvous cookie to detect replays---even
though it is only sent once in the current design---is that it might be
desirable to re-use rendezvous cookies for multiple introduction requests
in the future.) If all checks pass, Bob builds a circuit to the provided
rendezvous point. Otherwise he drops the cell.
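Bob's acceptance checks might be sketched as follows. This is a
simplified illustration: the in-memory replay store and the auth_ok
flag are assumptions standing in for the real authorization check and
for expiry of old entries:

```python
# Sketch of the INTRODUCE2 acceptance checks described above.
import time

ALLOWED_SKEW = 30 * 60   # timestamp window: +/- 30 minutes
REPLAY_WINDOW = 60 * 60  # remember g^x values for 60 minutes

seen_gx = {}  # g^x bytes -> time first seen (unexpired toy store)

def accept_introduce2(timestamp, g_x, auth_ok, now=None):
    now = time.time() if now is None else now
    if not auth_ok:                       # authorization data invalid
        return False
    if abs(now - timestamp) > ALLOWED_SKEW:
        return False                      # timestamp too old or too new
    # Drop replays: same DH first part seen within the last 60 minutes.
    first_seen = seen_gx.get(g_x)
    if first_seen is not None and now - first_seen < REPLAY_WINDOW:
        return False
    seen_gx[g_x] = now
    return True                           # build circuit to rendezvous point
```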
1.4. Summary of authorization data fields
In summary, the proposed descriptor format and cell formats provide the
following fields for carrying authorization data:
(1) The v2 hidden service descriptor contains:
- a descriptor cookie that is used for the lookup process, and
- an arbitrary encryption schema to ensure authorization to access
introduction information (currently symmetric encryption with the
descriptor cookie).
(2) For performing authorization at the introduction point we can use:
- the fields intro-authorization and service-authorization in
hidden service descriptors,
- a maximum of 215 octets in the ESTABLISH_INTRO cell, and
- one part of 110 octets in the INTRODUCE1V cell.
(3) For performing authorization at the hidden service we can use:
- the fields intro-authorization and service-authorization in
hidden service descriptors,
- the other part of 110 octets in the INTRODUCE2 cell.
It will also still be possible to access a hidden service without any
authorization, or to use only part of the authorization infrastructure.
However, this requires considering all parts of the infrastructure. For
example, authorization at the introduction point relying on confidential
intro-authorization data transported in the hidden service descriptor
cannot be performed without using an encryption schema for introduction
information.
1.5. Managing authorization data at servers and clients
In order to provide authorization data to the hidden service and the
authenticated clients, we propose to use files---either the Tor
configuration file or separate files. The exact format of these special
files depends on the authorization protocol used.
Currently, rend-spec proposes encoding client-side authorization data
in the URL, as in x.y.z.onion. This was never used and is also a bad
idea, because in the case of HTTP the requested URL may be contained in
the Host and Referer fields.
1.6. Limitations for authorization protocols
There are two limitations of the current hidden service protocol for
authorization protocols that shall be identified here.
1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2
restrict the amount of data that can be used for authorization.
This forces authorization protocols that require per-user
authorization data at the introduction point to restrict the number
of authorized clients artificially. A possible solution could be to
split contents among multiple cells and reassemble them at the
introduction points.
2. The current hidden service protocol does not specify cell types to
perform interactive authorization between client and introduction
point or hidden service. If there should be an authorization
protocol that requires interaction, new cell types would have to be
defined and integrated into the hidden service protocol.
2. Specific authorization protocol instances
In the following we present two specific authorization protocols that
make use of (parts of) the new authorization infrastructure:
1. The first protocol allows a service provider to restrict access
to clients with a previously received secret key only, but does not
attempt to hide service activity from others.
2. The second protocol, albeit feasible only for a limited set of about
16 clients, performs client authorization and hides service activity
from everyone but the authorized clients.
These two protocol instances extend the existing hidden service protocol
version 2. Hidden services that perform client authorization may run in
parallel to other services running versions 0, 2, or both.
2.1. Service with large-scale client authorization
The first client authorization protocol aims at performing access control
while consuming as few additional resources as possible. A service
provider should be able to permit access to a large number of clients
while denying access for everyone else. However, the price for
scalability is that the service won't be able to hide its activity from
unauthorized or formerly authorized clients.
The main idea of this protocol is to encrypt the introduction-point part
in hidden service descriptors to authorized clients using symmetric keys.
This ensures that nobody else but authorized clients can learn which
introduction points a service currently uses, nor can someone send a
valid INTRODUCE1 message without knowing the introduction key. Therefore,
a subsequent authorization at the introduction point is not required.
A service provider generates symmetric "descriptor cookies" for his
clients and distributes them outside of Tor. The suggested key size is
128 bits, so that descriptor cookies can be encoded in 22 base64 chars
(which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
authorization type (here: "0") and allow a client to distinguish this
authorization protocol from others like the one proposed below).
Typically, the contact information for a hidden service using this
authorization protocol looks like this:
v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz
When generating a hidden service descriptor, the service encrypts the
introduction-point part with a single randomly generated symmetric
128-bit session key using AES-CTR as described for v2 hidden service
descriptors in rend-spec. Afterwards, the service encrypts the session
key to all descriptor cookies using AES. An authorized client should be
able to efficiently find the session key that is encrypted for him or
her; therefore, a 4-octet client ID is generated from the descriptor
cookie and the initialization vector. Descriptors always contain a
number of encrypted session keys that is a multiple of 16, padded with
fake entries where necessary.
Encrypted session keys are ordered by client IDs in order to conceal
addition or removal of authorized clients by the service provider.
ATYPE Authorization type: set to 1. [1 octet]
ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet]
for each symmetric descriptor cookie:
ID Client ID: H(descriptor cookie | IV)[:4] [4 octets]
SKEY Session key encrypted with descriptor cookie [16 octets]
(end of client-specific part)
RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets]
IV AES initialization vector [16 octets]
IPOS Intro points, encrypted with session key [remaining octets]
An authorized client needs to configure Tor to use the descriptor cookie
when accessing the hidden service. To do so, a user adds the contact
information that she received from the service provider to her torrc
file. Upon downloading a hidden service descriptor, Tor finds the
encrypted introduction-point part and attempts to decrypt it using the
configured descriptor cookie. (In the rare event of two or more client
IDs being equal a client tries to decrypt all of them.)
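The lookup and decryption steps above can be sketched as follows. The
decrypt callback stands in for the AES operation, whose exact mode is
left to rend-spec; this is an illustrative assumption, not the
implemented code:

```python
# Sketch of the client-side session-key lookup for auth type 1.
import hashlib

def find_session_key(entries, descriptor_cookie, iv, decrypt):
    """entries: list of (client_id, encrypted_session_key) pairs from
    the descriptor; decrypt(key, data): AES callback (assumption)."""
    # Client ID: H(descriptor cookie | IV)[:4], as in the format above.
    my_id = hashlib.sha1(descriptor_cookie + iv).digest()[:4]
    candidates = [enc for cid, enc in entries if cid == my_id]
    # In the rare event of colliding client IDs, try every candidate.
    return [decrypt(descriptor_cookie, enc) for enc in candidates]
```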
Upon sending the introduction, the client includes her descriptor cookie
as auth type "1" in the INTRODUCE2 cell that she sends to the service.
The hidden service checks whether the included descriptor cookie is
authorized to access the service and either responds to the introduction
request, or not.
2.2. Authorization for limited number of clients
A second, more sophisticated client authorization protocol goes the extra
mile of hiding service activity from unauthorized clients. With all else
being equal to the preceding authorization protocol, the second protocol
publishes hidden service descriptors for each user separately and makes
do with encrypting the introduction-point part of descriptors to a
single client. This allows the service to stop publishing descriptors for
removed clients. As long as a removed client cannot link descriptors
issued for other clients to the service, it cannot derive service
activity any more. The downside of this approach is limited scalability.
Even though the distributed storage of descriptors (cf. proposal 114)
tackles the problem of limited scalability to a certain extent, this
protocol should not be used for services with more than 16 clients. (In
fact, Tor should refuse to advertise services for more than this number
of clients.)
A hidden service generates an asymmetric "client key" and a symmetric
"descriptor cookie" for each client. The client key is used as
replacement for the service's permanent key, so that the service uses a
different identity for each of his clients. The descriptor cookie is used
to store descriptors at changing directory nodes that are unpredictable
for anyone but service and client, to encrypt the introduction-point
part, and to be included in INTRODUCE2 cells. Once the service has
created client key and descriptor cookie, he tells them to the client
outside of Tor. The contact information string looks similar to the one
used by the preceding authorization protocol (with the only difference
that it has "1" encoded as auth-type in the remaining 4 of 132 bits
instead of "0" as before).
When creating a hidden service descriptor for an authorized client, the
hidden service uses the client key and descriptor cookie to compute
secret ID part and descriptor ID:
secret-id-part = H(time-period | descriptor-cookie | replica)
descriptor-id = H(client-key[:10] | secret-id-part)
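The per-client computation above can be sketched in Python, taking
client-key[:10] literally from the formula; as before, H() is SHA-1 and
the field encodings are simplifying assumptions:

```python
# Sketch of the per-client descriptor ID: substituting the client key
# for the permanent key gives each authorized client a distinct,
# unlinkable descriptor location.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def client_descriptor_id(client_key: bytes, time_period: int,
                         descriptor_cookie: bytes, replica: int) -> bytes:
    # secret-id-part = H(time-period | descriptor-cookie | replica)
    secret_id_part = h(time_period.to_bytes(4, "big")
                       + descriptor_cookie
                       + replica.to_bytes(1, "big"))
    # descriptor-id = H(client-key[:10] | secret-id-part)
    return h(client_key[:10] + secret_id_part)
```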
The hidden service also replaces permanent-key in the descriptor with
client-key and encrypts introduction-points with the descriptor cookie.
ATYPE Authorization type: set to 2. [1 octet]
IV AES initialization vector [16 octets]
IPOS Intro points, encr. with descriptor cookie [remaining octets]
When uploading descriptors, the hidden service needs to make sure that
descriptors for different clients are not uploaded at the same time (cf.
Section 1.1) which is also a limiting factor for the number of clients.
When a client is requested to establish a connection to a hidden service,
it looks up whether it has any authorization data configured for that
service. If the user has configured authorization data for authorization
protocol "2", the descriptor ID is determined as described in the last
paragraph. Upon receiving a descriptor, the client decrypts the
introduction-point part using its descriptor cookie. Further, the client
includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
it sends to the service.
2.3. Hidden service configuration
A hidden service that is meant to perform client authorization adds a
new option HiddenServiceAuthorizeClient to its hidden service
configuration. This option contains the authorization type which is
either "1" for the protocol described in 2.1 or "2" for the protocol in
2.2 and a comma-separated list of human-readable client names, so that
Tor can create authorization data for these clients:
HiddenServiceAuthorizeClient auth-type client-name,client-name,...
If this option is configured, HiddenServiceVersion is automatically
reconfigured to contain only version numbers of 2 or higher.
Tor stores all generated authorization data for the authorization
protocols described in Sections 2.1 and 2.2 in a new file using the
following file format:
"client-name" human-readable client identifier NL
"descriptor-cookie" 128-bit key ^= 22 base64 chars NL
If the authorization protocol of Section 2.2 is used, Tor also generates
and stores the following data:
"client-key" NL a public key in PEM format
2.4. Client configuration
Clients need to make their authorization data known to Tor using another
configuration option that contains a service name (mainly for the sake of
convenience), the service address, and the descriptor cookie that is
required to access a hidden service (the authorization protocol number is
encoded in the descriptor cookie):
HidServAuth service-name service-address descriptor-cookie
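As a hedged example of how the two options above might look in
practice, using the contact-information string from Section 2.1 (the
directory path, port mapping, client names, and service name are
illustrative, not prescribed by this proposal):

```
# Server-side torrc (Bob): authorize two named clients, protocol "1"
HiddenServiceDir /var/lib/tor/hidden_service/
HiddenServicePort 80 127.0.0.1:8080
HiddenServiceAuthorizeClient 1 alice,bob

# Client-side torrc (Alice): register the cookie received out of band
HidServAuth example-service v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz
```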
Security implications:
In the following we want to discuss possible attacks by dishonest
entities in the presented infrastructure and specific protocol. These
security implications would have to be verified once more when adding
another protocol. The dishonest entities (theoretically) include the
hidden service itself, the authenticated clients, hidden service directory
nodes, introduction points, and rendezvous points. The relays that are
part of circuits used during protocol execution, but never learn about
the exchanged descriptors or cells by design, are not considered.
Obviously, this list makes no claim to be complete. The discussed attacks
are sorted by the difficulty to perform them, in ascending order,
starting with roles that everyone could attempt to take and ending with
partially trusted entities abusing the trust put in them.
(1) A hidden service directory could attempt to conclude presence of a
service from the existence of a locally stored hidden service descriptor:
This passive attack is possible only for a single client-service
relation, because descriptors need to contain a publicly visible
signature of the service using the client key.
A possible protection would be to increase the number of hidden service
directories in the network.
(2) A hidden service directory could try to break the descriptor cookies
of locally stored descriptors: This attack can be performed offline. The
only useful countermeasure against it might be using safe passwords that
are generated by Tor.
[passwords? where did those come in? -RD]
(3) An introduction point could try to identify the pseudonym of the
hidden service on behalf of which it operates: This is impossible by
design, because the service uses a fresh public key for every
establishment of an introduction point (see proposal 114) and the
introduction point receives a fresh introduction cookie, so that there is
no identifiable information about the service that the introduction point
could learn. The introduction point cannot even tell if client accesses
belong to the same client or not, nor can it know the total number of
authorized clients. The only information might be the pattern of
anonymous client accesses, but that is hardly enough to reliably identify
a specific service.
(4) An introduction point could want to learn the identities of accessing
clients: This is also impossible by design, because all clients use the
same introduction cookie for authorization at the introduction point.
(5) An introduction point could try to replay a correct INTRODUCE1 cell
to other introduction points of the same service, e.g. in order to force
the service to create a huge number of useless circuits: This attack is
not possible by design, because INTRODUCE1 cells are encrypted using a
freshly created introduction key that is only known to authorized
clients.
(6) An introduction point could attempt to replay a correct INTRODUCE2
cell to the hidden service, e.g. for the same reason as in the last
attack: This attack is stopped by the fact that a service will drop
INTRODUCE2 cells containing a DH handshake they have seen recently.
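The "drop recently seen handshakes" countermeasure amounts to a bounded replay cache keyed on the DH handshake data. A minimal sketch (class and method names are hypothetical, not Tor's actual implementation):

```python
import time

class ReplayCache:
    """Drop INTRODUCE2 cells whose DH handshake was seen recently."""

    def __init__(self, horizon=3600):
        self.horizon = horizon   # seconds a handshake counts as "recent"
        self.seen = {}           # handshake bytes -> time first seen

    def seen_recently(self, dh_handshake: bytes, now=None) -> bool:
        now = time.time() if now is None else now
        # Expire old entries so the cache stays bounded.
        self.seen = {h: t for h, t in self.seen.items()
                     if now - t < self.horizon}
        if dh_handshake in self.seen:
            return True          # replay: the service drops the cell
        self.seen[dh_handshake] = now
        return False             # first sighting: process normally

cache = ReplayCache()
cache.seen_recently(b"dh-public-value", now=0.0)    # first time: accepted
cache.seen_recently(b"dh-public-value", now=10.0)   # replay: dropped
```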
(7) An introduction point could block client requests by sending either
positive or negative INTRODUCE_ACK cells back to the client, but without
forwarding INTRODUCE2 cells to the server: This attack is an annoyance
for clients, because they might wait for a timeout to elapse until trying
another introduction point. However, this attack is not introduced by
performing authorization and it cannot be targeted towards a specific
client. A countermeasure might be for the server to periodically perform
introduction requests to his own service to see if introduction points
are working correctly.
(8) The rendezvous point could attempt to identify either server or
client: This remains impossible as it was before, because the
rendezvous cookie does not contain any identifiable information.
(9) An authenticated client could swamp the server with valid INTRODUCE1
and INTRODUCE2 cells, e.g. in order to force the service to create
useless circuits to rendezvous points; as opposed to an introduction
point replaying the same INTRODUCE2 cell, a client could include a new
rendezvous cookie for every request: The countermeasure for this attack
is the restriction to 10 connection establishments per client per hour.
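The 10-per-client-per-hour restriction is a sliding-window rate limit. A sketch of that countermeasure, under the assumption that the service can attribute each request to a descriptor cookie (names are illustrative):

```python
from collections import deque

class IntroductionLimiter:
    """Cap introduction requests at 10 per authorized client per hour."""

    MAX_PER_HOUR = 10
    WINDOW = 3600  # seconds

    def __init__(self):
        self.history = {}   # client id -> deque of accepted timestamps

    def allow(self, client_id: str, now: float) -> bool:
        window = self.history.setdefault(client_id, deque())
        # Forget requests that fell out of the one-hour window.
        while window and now - window[0] >= self.WINDOW:
            window.popleft()
        if len(window) >= self.MAX_PER_HOUR:
            return False    # 11th request within the hour: refuse
        window.append(now)
        return True

limiter = IntroductionLimiter()
results = [limiter.allow("client-a", float(t)) for t in range(11)]
# first 10 allowed, 11th refused
```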
Compatibility:
An implementation of this proposal would require changes to hidden
services and clients to process authorization data and encode and
understand the new formats. However, both services and clients would
remain compatible with regular hidden services without authorization.
Implementation:
The implementation of this proposal can be divided into a number of
changes to hidden service and client side. There are no
changes necessary on directory, introduction, or rendezvous nodes. All
changes are marked with either [service] or [client] to denote on which
side they need to be made.
/1/ Configure client authorization [service]
- Parse configuration option HiddenServiceAuthorizeClient containing
authorized client names.
- Load previously created client keys and descriptor cookies.
- Generate missing client keys and descriptor cookies, add them to
client_keys file.
- Rewrite the hostname file.
- Keep client keys and descriptor cookies of authorized clients in
memory.
[- In case of reconfiguration, mark which client authorizations were
added and whether any were removed. This can be used later when
deciding whether to rebuild introduction points and publish new
hidden service descriptors. Not implemented yet.]
/2/ Publish hidden service descriptors [service]
- Create and upload hidden service descriptors for all authorized
clients.
[- See /1/ for the case of reconfiguration.]
/3/ Configure permission for hidden services [client]
- Parse configuration option HidServAuth containing service
authorization, store authorization data in memory.
/5/ Fetch hidden service descriptors [client]
- Look up client authorization upon receiving a hidden service request.
- Request hidden service descriptor ID including client key and
descriptor cookie. Only request v2 descriptors, no v0.
/6/ Process hidden service descriptor [client]
- Decrypt introduction points with descriptor cookie.
/7/ Create introduction request [client]
- Include descriptor cookie in INTRODUCE2 cell to introduction point.
- Pass descriptor cookie around between involved connections and
circuits.
/8/ Process introduction request [service]
- Read descriptor cookie from INTRODUCE2 cell.
- Check whether descriptor cookie is authorized for access, including
checking access counters.
- Log access for accountability.
Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
Implemented-In: 0.2.0.x
1. Overview:
Tor's directory authorities can give certain servers a "Named" flag
in the network-status entry, when they want to bind that nickname to
that identity key. This allows clients to specify a nickname rather
than an identity fingerprint and still be certain they're getting the
"right" server. As dir-spec.txt describes it,
Name X is bound to identity Y if at least one binding directory lists
it, and no directory binds X to some other Y'.
In practice, clients can refer to servers by nickname whether they are
Named or not; if they refer to nicknames that aren't Named, a complaint
shows up in the log asking them to use the identity key in the future
--- but it still works.
The problem? Imagine a Tor server with nickname Bob. Bob and his
identity fingerprint are registered in tor26's approved-routers
file, but none of the other authorities registered him. Imagine
there are several other unregistered servers also with nickname Bob
("the imposters").
While Bob is online, all is well: a) tor26 gives a Named flag to
the real one, and refuses to list the other ones; and b) the other
authorities list the imposters but don't give them a Named flag. Clients
who have all the network-statuses can compute which one is the real Bob.
But what happens when the real Bob disappears and his descriptor expires? tor26
continues to refuse to list any of the imposters, and the other
authorities continue to list the imposters. Clients don't have any
idea that there exists a Named Bob, so they can ask for server Bob and
get one of the imposters. (A warning will also appear in their log,
but so what.)
2. The stopgap solution:
tor26 should start accepting and listing the imposters, but it should
assign them a new flag: "Unnamed".
This would produce three cases in terms of assigning flags in the consensus
networkstatus:
i) a router gets the Named flag in the v3 networkstatus if
a) it's the only router with that nickname that has the Named flag
out of all the votes, and
b) no vote lists it as Unnamed
else,
ii) a router gets the Unnamed flag if
a) some vote lists a different router with that nickname as Named, or
b) at least one vote lists it as Unnamed, or
c) there are other routers with the same nickname that are Unnamed
else,
iii) the router gets neither a Named nor an Unnamed flag.
(This whole proposal is meant only for v3 dir flags; we shouldn't try
to backport it to the v2 dir world.)
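As a sketch, the three-case rule above can be expressed as follows. The vote data structure is hypothetical, and rule ii(c) about same-nickname Unnamed routers is omitted for brevity:

```python
def consensus_name_flags(votes, nickname, router_id):
    """Return "Named", "Unnamed", or None for one router.

    Each vote is modeled as: router_id -> (nickname, set-of-flags).
    """
    # All routers that some vote lists as Named under this nickname.
    named_ids = {rid
                 for vote in votes
                 for rid, (nick, flags) in vote.items()
                 if nick == nickname and "Named" in flags}
    # Whether any vote lists *this* router as Unnamed.
    unnamed_here = any("Unnamed" in vote.get(router_id, ("", set()))[1]
                       for vote in votes)
    # Case i: unique holder of the Named flag, never voted Unnamed.
    if named_ids == {router_id} and not unnamed_here:
        return "Named"
    # Case ii: someone else is Named under this nickname, or a vote says Unnamed.
    if (named_ids - {router_id}) or unnamed_here:
        return "Unnamed"
    # Case iii: neither flag.
    return None

votes = [{1: ("Bob", {"Named"})}, {2: ("Bob", set())}]
consensus_name_flags(votes, "Bob", 1)   # the registered Bob: "Named"
consensus_name_flags(votes, "Bob", 2)   # an imposter: "Unnamed"
```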
Then client behavior is:
a) If there's a Bob with a Named flag, pick that one.
else b) If the Bobs don't have the Unnamed flag (notice that they should
either all have it, or none), pick one of them and warn.
else c) They all have the Unnamed flag -- no router found.
3. Problems not solved by this stopgap:
3.1. Naming authorities can go offline.
If tor26 is the only authority that provides a binding for Bob, when
tor26 goes offline we're back in our previous situation -- the imposters
can be referenced with a mere ignorable warning in the client's log.
If some other authority Names a different Bob, and tor26 goes offline,
then that other Bob becomes the unique Named Bob.
So be it. We should try to solve these one day, but there's no clear way
to do it that doesn't destroy usability in other ways, and if we want
to get the Unnamed flag into v3 network statuses we should add it soon.
3.2. V3 dir spec magnifies brief discrepancies.
Another point to notice is if tor26 names Bob(1), doesn't know about
Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag
even if it should (and Bob(1) is not around).
Right now, in v2 dirs, the case where an authority doesn't know about
a server but the other authorities do know is rare. That's because
authorities periodically ask for other networkstatuses and then fetch
descriptors that are missing.
With v3, if that window occurs at the wrong time, it is extended for the
entire period. We could solve this by making the voting more complex,
but that doesn't seem worth it.
[3.3. Tor26 is only one tor26.
We need more naming authorities, possibly with some kind of auto-naming
feature. This is out-of-scope for this proposal -NM]
4. Changes to the v2 directory
Previously, v2 authorities that had a binding for a server named Bob did
not list any other server named Bob. This will change too:
Version 2 authorities will start listing all routers they know about,
whether they conflict with a name-binding or not: Servers for which
this authority has a binding will continue to be marked Named,
additionally all other servers of that nickname will be listed without the
Named flag (i.e. there will be no Unnamed flag in v2 status documents).
Clients already should handle having a named Bob alongside unnamed
Bobs correctly, and having the unnamed Bobs in the status file even
without the named server is no worse than the current status quo where
clients learn about those servers from other authorities.
The benefit of this is that an authority's opinion on a server like
Guard, Stable, Fast etc. can now be learned by clients even if that
specific authority has reserved that server's name for somebody else.
5. Other benefits:
This new flag will allow people to operate servers that happen to have
the same nickname as somebody who registered their server two years ago
and left soon after. Right now there are dozens of nicknames that are
registered on all three binding directory authorities, yet haven't been
running for years. While it's bad that these nicknames are effectively
blacklisted from the network, the really bad part is that this logic
is really unintuitive to prospective new server operators.
Filename: 123-autonaming.txt
Title: Naming authorities automatically create bindings
Author: Peter Palfrader
Created: 2007-10-11
Status: Closed
Implemented-In: 0.2.0.x
Overview:
Tor's directory authorities can give certain servers a "Named" flag
in the network-status entry, when they want to bind that nickname to
that identity key. This allows clients to specify a nickname rather
than an identity fingerprint and still be certain they're getting the
"right" server.
Authority operators name a server by adding their nickname and
identity fingerprint to the 'approved-routers' file. Historically
being listed in the file was required for a router, at first for being
listed in the directory at all, and later in order to be used by
clients as a first or last hop of a circuit.
Adding identities to the list of named routers so far has been a
manual, time consuming, and boring job. Given that and the fact that
the Tor network works just fine without named routers the last
authority to keep a current binding list stopped updating it well over
half a year ago.
Naming, if it were done, would serve a useful purpose however in that
users can have a reasonable expectation that the exit server Bob they
are using in their http://www.google.com.bob.exit/ URL is the same
Bob every time.
Proposal:
I propose that identity<->name binding be completely automated:
New bindings should be added after the router has been around for a
bit and their name has not been used by other routers; similarly, names
that have not appeared on the network for a long time should be freed
in case a new router wants to use them.
The following rules are suggested:
i) If a named router has not been online for half a year, the
identity<->name binding for that name is removed. The nickname
is free to be taken by other routers now.
ii) If a router claims a certain nickname and
a) has been on the network for at least two weeks, and
b) that nickname is not yet linked to a different router, and
c) no other router has wanted that nickname in the last month,
a new binding should be created for this router and its desired
nickname.
This automaton does not necessarily need to live in the Tor code, it
can do its job just as well when it's an external tool.
Filename: 124-tls-certificates.txt
Title: Blocking resistant TLS certificate usage
Author: Steven J. Murdoch
Created: 2007-10-25
Status: Superseded
Overview:
To be less distinguishable from HTTPS web browsing, only Tor servers should
present TLS certificates. This should be done whilst maintaining backwards
compatibility with Tor nodes which present and expect client certificates, and
while preserving existing security properties. This specification describes
the negotiation protocol, what certificates should be presented during the TLS
negotiation, and how to move the client authentication within the encrypted
tunnel.
Motivation:
In Tor's current TLS [1] handshake, both client and server present a
two-certificate chain. Since TLS performs authentication prior to establishing
the encrypted tunnel, the contents of these certificates are visible to an
eavesdropper. In contrast, during normal HTTPS web browsing, the server
presents a single certificate, signed by a root CA and the client presents no
certificate. Hence it is possible to distinguish Tor from HTTPS by identifying
this pattern.
To resist blocking based on traffic identification, Tor should behave as close
to HTTPS as possible, i.e. servers should offer a single certificate and not
request a client certificate; clients should present no certificate. This
presents two difficulties: clients are no longer authenticated and servers are
authenticated by the connection key, rather than identity key. The link
protocol must thus be modified to preserve the old security semantics.
Finally, in order to maintain backwards compatibility, servers must correctly
identify whether the client supports the modified certificate handling. This
is achieved by modifying the cipher suites that clients advertise support
for. These cipher suites are selected to be similar to those chosen by web
browsers, in order to resist blocking based on client hello.
Terminology:
Initiator: OP or OR which initiates a TLS connection ("client" in TLS
terminology)
Responder: OR which receives an incoming TLS connection ("server" in TLS
terminology)
Version negotiation and cipher suite selection:
In the modified TLS handshake, the responder does not request a certificate
from the initiator. This request would normally occur immediately after the
responder receives the client hello (the first message in a TLS handshake) and
so the responder must decide whether to request a certificate based only on
the information in the client hello. This is achieved by examining the cipher
suites in the client hello.
List 1: cipher suite lists offered by version 0/1 Tor
From src/common/tortls.c, revision 12086:
TLS1_TXT_DHE_RSA_WITH_AES_128_SHA
TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
Client hello sent by initiator:
Initiators supporting version 2 of the Tor connection protocol MUST
offer a different cipher suite list from those sent by pre-version 2
Tors, contained in List 1. To maintain compatibility with older Tor
versions and common browsers, the cipher suite list MUST include
support for:
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
Client hello received by responder/server hello sent by responder:
Responders supporting version 2 of the Tor connection protocol should compare
the cipher suite list in the client hello with those in List 1. If it matches
any in the list then the responder should assume that the initiator supports
version 1, and thus should maintain the version 1 behavior, i.e. send a
two-certificate chain, request a client certificate and do not send or expect
a VERSIONS cell [2].
Otherwise, the responder should assume version 2 behavior and select a cipher
suite following TLS [1] behavior, i.e. select the first entry from the client
hello cipher list which is acceptable. Responders MUST NOT select any suite
that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits,
or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT
allow other SSLv3 ciphersuites.
Should no mutually acceptable cipher suite be found, the connection MUST be
closed.
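The responder's decision, as described above, hinges on an exact match of the client hello's cipher suite list against List 1. A sketch using suite names rather than the raw suite IDs real code would compare:

```python
# The three legacy lists from List 1, as ordered tuples.
V1_CIPHER_LISTS = {
    ("TLS1_TXT_DHE_RSA_WITH_AES_128_SHA",),
    ("TLS1_TXT_DHE_RSA_WITH_AES_128_SHA",
     "SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA"),
    ("SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA",),
}

def responder_version(client_hello_ciphers):
    """Return 1 if the client hello matches a pre-version-2 Tor list, else 2."""
    if tuple(client_hello_ciphers) in V1_CIPHER_LISTS:
        return 1   # v1: two-cert chain, request client cert, no VERSIONS cell
    return 2       # v2: single random cert, no client certificate request

responder_version(["TLS1_TXT_DHE_RSA_WITH_AES_128_SHA"])   # v1 behavior
```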
If the responder is implementing version 2 of the connection protocol it
SHOULD send a server certificate with random contents. The organizationName
field MUST NOT be "Tor", "TOR" or "t o r".
Server certificate received by initiator:
If the server certificate has an organizationName of "Tor", "TOR" or "t o r",
the initiator should assume that the responder does not support version 2 of
the connection protocol. In which case the initiator should respond following
version 1, i.e. send a two-certificate client chain and do not send or expect
a VERSIONS cell.
[SJM: We could also use the fact that a client certificate request was sent]
If the server hello contains a ciphersuite which does not comply with the key
length requirements above, even if it was one offered in the client hello, the
connection MUST be closed. This will only occur if the responder is not a Tor
server.
Backward compatibility:
v1 Initiator, v1 Responder: No change
v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello
v2 Initiator, v1 Responder: Responder accepts v2 client hello. Initiator
detects v1 server certificate and continues with v1 protocol
v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator
detects v2 server certificate and continues with v2 protocol.
Additional link authentication process:
Following VERSION and NETINFO negotiation, both responder and
initiator MUST send a certification chain in a CERT cell. If one
party does not have a certificate, the CERT cell MUST still be sent,
but with a length of zero.
A CERT cell is a variable length cell, of the format
CircID [2 bytes]
Command [1 byte]
Length [2 bytes]
Payload [<length> bytes]
CircID MUST be set to 0x0000
Command is [SJM: TODO]
Length is the length of the payload
Payload contains 0 or more certificates, each is of the format:
Cert_Length [2 bytes]
Certificate [<cert_length> bytes]
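The certificate list in the payload is a simple sequence of 2-byte-length-prefixed blobs, so splitting it apart might look like this (a sketch; it assumes a well-formed payload):

```python
import struct

def parse_cert_payload(payload: bytes):
    """Split a CERT cell payload into its length-prefixed certificates."""
    certs, offset = [], 0
    while offset < len(payload):
        # Cert_Length is a 2-byte big-endian integer.
        (cert_len,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        certs.append(payload[offset:offset + cert_len])
        offset += cert_len
    return certs

payload = struct.pack(">H", 3) + b"abc" + struct.pack(">H", 2) + b"de"
parse_cert_payload(payload)   # two certificates: b"abc" and b"de"
```

A zero-length payload yields an empty list, matching the "CERT cell with a length of zero" case for parties with no certificate.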
Each certificate MUST sign the one preceding it. The initiator MUST
place its connection certificate first; the responder, having
already sent its connection certificate as part of the TLS handshake
MUST place its identity certificate first.
Initiators who send a CERT cell MUST follow it with a LINK_AUTH
cell to prove that they possess the corresponding private key.
A LINK_AUTH cell is fixed-length, of the format:
CircID [2 bytes]
Command [1 byte]
Length [2 bytes]
Payload (padded with 0 bytes) [PAYLOAD_LEN - 2 bytes]
CircID MUST be set to 0x0000
Command is [SJM: TODO]
Length is the valid portion of the payload
Payload is of the format:
Signature version [1 byte]
Signature [<length> - 1 bytes]
Padding [PAYLOAD_LEN - <length> - 2 bytes]
Signature version: Identifies the type of signature, currently 0x00
Signature: Digital signature under the initiator's connection key of the
following item, in PKCS #1 block type 1 [3] format:
HMAC-SHA1, using the TLS master secret as key, of the
following elements concatenated:
- The signature version (0x00)
- The NUL terminated ASCII string: "Tor initiator certificate verification"
- client_random, as sent in the Client Hello
- server_random, as sent in the Server Hello
- SHA-1 hash of the initiator connection certificate
- SHA-1 hash of the responder connection certificate
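The MAC that gets signed can be assembled directly from the element list above. A sketch of just the HMAC step (the PKCS #1 RSA signing of the result is omitted, and the argument names are illustrative):

```python
import hashlib
import hmac

def link_auth_mac(master_secret, client_random, server_random,
                  initiator_cert_der, responder_cert_der):
    """HMAC-SHA1, keyed with the TLS master secret, over the
    concatenated elements listed in the LINK_AUTH description."""
    msg = (b"\x00"                                          # signature version
           + b"Tor initiator certificate verification\x00"  # NUL-terminated string
           + client_random
           + server_random
           + hashlib.sha1(initiator_cert_der).digest()
           + hashlib.sha1(responder_cert_der).digest())
    return hmac.new(master_secret, msg, hashlib.sha1).digest()
```

The initiator would then sign this 20-byte digest with its connection key; the responder recomputes the same MAC from its own view of the handshake and verifies the signature against it.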
Security checks:
- Before sending a LINK_AUTH cell, a node MUST ensure that the TLS
connection is authenticated by the responder key.
- For the handshake to have succeeded, the initiator MUST confirm:
- That the TLS handshake was authenticated by the
responder connection key
- That the responder connection key was signed by the first
certificate in the CERT cell
- That each certificate in the CERT cell was signed by the
following certificate, with the exception of the last
- That the last certificate in the CERT cell is the expected
identity certificate for the node being connected to
- For the handshake to have succeeded, the responder MUST confirm
either:
A) - A zero length CERT cell was sent and no LINK_AUTH cell was
sent
In which case the responder shall treat the identity of the
initiator as unknown
or
B) - That the LINK_AUTH MAC contains a signature by the first
certificate in the CERT cell
- That the MAC signed matches the expected value
- That each certificate in the CERT cell was signed by the
following certificate, with the exception of the last
In which case the responder shall treat the identity of the
initiator as that of the last certificate in the CERT cell
Protocol summary:
1. I(nitiator) <-> R(esponder): TLS handshake, including responder
authentication under connection certificate R_c
2. I <->: VERSION and NETINFO negotiation
3. R -> I: CERT (Responder identity certificate R_i (which signs R_c))
4. I -> R: CERT (Initiator connection certificate I_c,
Initiator identity certificate I_i (which signs I_c))
5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret,
"Tor initiator certificate verification" ||
client_random || server_random ||
I_c hash || R_c hash))
Notes: I -> R doesn't need to wait for R_i before sending its own
messages (reduces round-trips).
Certificate hash is calculated like identity hash in CREATE cells.
Initiator signature is calculated in a similar way to Certificate
Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7).
If I is an OP, a zero length certificate chain may be sent in step 4;
In which case, step 5 is not performed
Rationale:
- Version and netinfo negotiation before authentication: The version cell needs
to come before the rest of the protocol, since we may choose to alter
the rest at some later point, e.g. switch to a different MAC/signature scheme.
It is useful to keep the NETINFO and VERSION cells close to each other, since
the time between them is used to check if there is a delay-attack. Still, a
server might want to not act on NETINFO data from an initiator until the
authentication is complete.
Appendix A: Cipher suite choices
This specification intentionally does not put any constraints on the
TLS ciphersuite lists presented by clients, other than a minimum
required for compatibility. However, to maximize blocking
resistance, ciphersuite lists should be carefully selected.
Recommended client ciphersuite list
Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h
0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA
0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA
0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA
0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA
0x0035: TLS_RSA_WITH_AES_256_CBC_SHA
0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA
0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA
0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA
0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA
0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA
0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
0x0004: SSL_RSA_WITH_RC4_128_MD5
0x0005: SSL_RSA_WITH_RC4_128_SHA
0x002f: TLS_RSA_WITH_AES_128_CBC_SHA
0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA
0xc012: TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA
0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC)
0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA
Order specified in:
http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47
Recommended options:
0x0000: Server Name Indication [4]
0x000a: Supported Elliptic Curves [5]
0x000b: Supported Point Formats [5]
Recommended compression:
0x00
Recommended server ciphersuite selection:
The responder should select the first entry in this list which is
listed in the client hello:
0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA [ Common Firefox choice ]
0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA [ Tor v1 default ]
0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ]
0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ]
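The server-side selection rule above is simply "first server-preferred suite that the client offered". A sketch (with the suite IDs from the list above):

```python
SERVER_PREFERENCE = [
    0x0039,  # TLS_DHE_RSA_WITH_AES_256_CBC_SHA   (common Firefox choice)
    0x0033,  # TLS_DHE_RSA_WITH_AES_128_CBC_SHA   (Tor v1 default)
    0x0016,  # SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA  (Tor v1 fallback)
    0x0013,  # SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA  (valid IE option)
]

def select_suite(client_hello_suites):
    """Pick the first server-preferred suite present in the client hello,
    or None (meaning: close the connection) if there is no overlap."""
    offered = set(client_hello_suites)
    return next((s for s in SERVER_PREFERENCE if s in offered), None)

select_suite([0x0013, 0x0033])   # server preference order wins: 0x0033
```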
References:
[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF
[2] Version negotiation for the Tor protocol, Tor proposal 105
[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1:
RSA Cryptography Specifications Version 1.5", RFC 2313,
March 1998.
[4] TLS Extensions, RFC 3546
[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS)
Filename: 125-bridges.txt
Title: Behavior for bridge users, bridge relays, and bridge authorities
Author: Roger Dingledine
Created: 11-Nov-2007
Status: Closed
Implemented-In: 0.2.0.x
0. Preface
This document describes the design decisions around support for bridge
users, bridge relays, and bridge authorities. It acts as an overview
of the bridge design and deployment for developers, and it also tries
to point out limitations in the current design and implementation.
For more details on what all of these mean, look at blocking.tex in
/doc/design-paper/
1. Bridge relays
Bridge relays are just like normal Tor relays except they don't publish
their server descriptors to the main directory authorities.
1.1. PublishServerDescriptor
To configure your relay to be a bridge relay, just add
BridgeRelay 1
PublishServerDescriptor bridge
to your torrc. This will cause your relay to publish its descriptor
to the bridge authorities rather than to the default authorities.
Alternatively, you can say
BridgeRelay 1
PublishServerDescriptor 0
which will cause your relay to not publish anywhere. This could be
useful for private bridges.
1.2. Exit policy
Bridge relays should use an exit policy of "reject *:*". This is
because they only need to relay traffic between the bridge users
and the rest of the Tor network, so there's no need to let people
exit directly from them.
1.3. RelayBandwidthRate / RelayBandwidthBurst
We invented the RelayBandwidth* options for this situation: Tor clients
who want to allow relaying too. See proposal 111 for details. Relay
operators should feel free to rate-limit their relayed traffic.
1.4. Helping the user with port forwarding, NAT, etc.
Just as for operating normal relays, our documentation and hints for
how to make your ORPort reachable are inadequate for normal users.
We need to work harder on this step, perhaps in 0.2.2.x.
1.5. Vidalia integration
Vidalia has turned its "Relay" settings page into a tri-state
"Don't relay" / "Relay for the Tor network" / "Help censored users".
If you click the third choice, it forces your exit policy to reject *:*.
If all the bridges end up on port 9001, that's not so good. On the
other hand, putting the bridges on a low-numbered port in the Unix
world requires jumping through extra hoops. The current compromise is
that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
other platforms.
At the bottom of the relay config settings window, Vidalia displays
the bridge identifier to the operator (see Section 3.1) so he can pass
it on to bridge users.
1.6. What if the default ORPort is already used?
If the user already has a webserver or some other application
bound to port 443, then Tor will fail to bind it and complain to the
user, probably in a cryptic way. Rather than just working on a better
error message (though we should do this), we should consider an
"ORPort auto" option that tells Tor to try to find something that's
bindable and reachable. This would also help us tolerate ISPs that
filter incoming connections on port 80 and port 443. But this should
be a different proposal, and can wait until 0.2.2.x.
2. Bridge authorities.
Bridge authorities are like normal directory authorities, except they
don't create their own network-status documents or votes. So if you
ask an authority for a network-status document or consensus, they
behave like a directory mirror: they give you one from one of the main
authorities. But if you ask the bridge authority for the descriptor
corresponding to a particular identity fingerprint, it will happily
give you the latest descriptor for that fingerprint.
To become a bridge authority, add these lines to your torrc:
AuthoritativeDirectory 1
BridgeAuthoritativeDir 1
Right now there's one bridge authority, running on the Tonga relay.
2.1. Exporting bridge-purpose descriptors
We've added a new purpose for server descriptors: the "bridge"
purpose. With the new router-descriptors file format that includes
annotations, it's easy to look through it and find the bridge-purpose
descriptors.
Currently we export the bridge descriptors from Tonga to the
BridgeDB server, so it can give them out according to the policies
in blocking.pdf.
2.2. Reachability/uptime testing
Right now the bridge authorities do active reachability testing of
bridges, so we know which ones to recommend for users.
But in the design document, we suggested that bridges should publish
anonymously (i.e. via Tor) to the bridge authority, so somebody watching
the bridge authority can't just enumerate all the bridges. But if we're
doing active measurement, the game is up. Perhaps we should back off on
this goal, or perhaps we should do our active measurement anonymously?
Answering this issue is scheduled for 0.2.1.x.
2.3. Migrating to multiple bridge authorities
Having only one bridge authority is both a trust bottleneck (if you
break into one place you learn about every single bridge we've got)
and a robustness bottleneck (when it's down, bridge users become sad).
Right now if we put up a second bridge authority, all the bridges would
publish to it, and (assuming the code works) bridge users would query
a random bridge authority. This resolves the robustness bottleneck,
but makes the trust bottleneck even worse.
In 0.2.2.x and later we should think about better ways to have multiple
bridge authorities.
3. Bridge users.
Bridge users are like ordinary Tor users except they use encrypted
directory connections by default, and they use bridge relays as both
entry guards (their first hop) and directory guards (the source of
all their directory information).
To become a bridge user, add the following line to your torrc:
UseBridges 1
and then add at least one "Bridge" line to your torrc based on the
format below.
3.1. Format of the bridge identifier.
The canonical format for a bridge identifier contains an IP address,
an ORPort, and an identity fingerprint:
bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
However, the identity fingerprint can be left out, in which case the
bridge user will connect to that relay and use it as a bridge regardless
of what identity key it presents:
bridge 128.31.0.34:9009
This might be useful for cases where only short bridge identifiers
can be communicated to bridge users.
In a future version we may also support bridge identifiers that are
only a key fingerprint:
bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
and the bridge user can fetch the latest descriptor from the bridge
authority (see Section 3.4).
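The three bridge-line forms above can be sketched with a small parser. This is a hypothetical helper for illustration, not Tor's actual torrc parsing code; the function name and return shape are assumptions.

```python
import re

def parse_bridge_line(line):
    """Parse a torrc "bridge" line into (address, port, fingerprint).

    Handles the three forms described above: address:port plus a
    40-hex-digit identity fingerprint, address:port alone, and the
    (future) fingerprint-only form.  Illustrative sketch only.
    """
    parts = line.split()
    if parts and parts[0].lower() == "bridge":
        parts = parts[1:]
    addr = port = None
    fp_parts = parts
    if parts and ":" in parts[0]:
        addr, port_s = parts[0].rsplit(":", 1)
        port = int(port_s)
        fp_parts = parts[1:]
    # The fingerprint may be written as spaced groups of hex digits.
    fp = "".join(fp_parts).upper() or None
    if fp is not None and not re.fullmatch(r"[0-9A-F]{40}", fp):
        raise ValueError("malformed fingerprint: %r" % fp)
    return addr, port, fp
```

A line with only an address and ORPort yields a None fingerprint, matching the "connect regardless of identity key" behavior described above.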
3.2. Bridges as entry guards
For now, bridge users add their bridge relays to their list of "entry
guards" (see path-spec.txt for background on entry guards). They are
managed by the entry guard algorithms exactly as if they were a normal
entry guard -- their keys and timing get cached in the "state" file,
etc. This means that when the Tor user starts up with "UseBridges"
disabled, he will skip past the bridge entries since they won't be
listed as up and usable in his networkstatus consensus. But to be clear,
the "entry_guards" list doesn't currently distinguish guards by purpose.
Internally, each bridge user keeps a smartlist of "bridge_info_t"
that reflects the "bridge" lines from his torrc along with a download
schedule (see Section 3.5 below). When he starts Tor, he attempts
to fetch a descriptor for each configured bridge (see Section 3.4
below). When he succeeds at getting a descriptor for one of the bridges
in his list, he adds it directly to the entry guard list using the
normal add_an_entry_guard() interface. Once a bridge descriptor has
been added, should_delay_dir_fetches() will stop delaying further
directory fetches, and the user begins to bootstrap his directory
information from that bridge (see Section 3.3).
Currently bridge users cache their bridge descriptors to the
"cached-descriptors" file (annotated with purpose "bridge"), but
they don't make any attempt to reuse descriptors they find in this
file. The theory is that either the bridge is available now, in which
case you can get a fresh descriptor, or it's not, in which case an
old descriptor won't do you much good.
We could disable writing out the bridge lines to the state file, if
we think this is a problem.
As an exception, if we get an application request when we have one
or more bridge descriptors but we believe none of them are running,
we mark them all as running again. This is similar to the exception
already in place to help long-idle Tor clients realize they should
fetch fresh directory information rather than just refuse requests.
3.3. Bridges as directory guards
In addition to using bridges as the first hop in their circuits, bridge
users also use them to fetch directory updates. Other than initial
bootstrapping to find a working bridge descriptor (see Section 3.4
below), all further non-anonymized directory fetches will be redirected
to the bridge.
This means that bridge relays need to have cached answers for all
questions the bridge user might ask. This makes the upgrade path
tricky -- for example, if we migrate to a v4 directory design, the
bridge user would need to keep using v3 so long as his bridge relays
only knew how to answer v3 queries.
In a future design, for cases where the user has enough information
to build circuits yet the chosen bridge doesn't know how to answer a
given query, we might teach bridge users to make an anonymized request
to a more suitable directory server.
3.4. How bridge users get their bridge descriptor
Bridge users can fetch bridge descriptors in two ways: by going directly
to the bridge and asking for "/tor/server/authority", or by going to
the bridge authority and asking for "/tor/server/fp/ID". By default,
they will only try the direct queries. If the user sets
UpdateBridgesFromAuthority 1
in his config file, then he will try querying the bridge authority
first for bridges where he knows a digest (if he only knows an IP
address and ORPort, then his only option is a direct query).
If the user has at least one working bridge, then he will do further
queries to the bridge authority through a full three-hop Tor circuit.
But when bootstrapping, he will make a direct begin_dir-style connection
to the bridge authority.
As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
from the bridge authority and it returns a 404 not found, the user
will automatically fall back to trying a direct query. Therefore it is
recommended that bridge users always set UpdateBridgesFromAuthority,
since at worst it will delay their fetches a little bit and notify
the bridge authority of the identity fingerprint (but not location)
of their intended bridges.
3.5. Bridge descriptor retry schedule
Bridge users try to fetch a descriptor for each bridge (using the
steps in Section 3.4 above) on startup. Whenever they receive a
bridge descriptor, they reschedule a new descriptor download for 1
hour from then.
If on the other hand it fails, they try again after 15 minutes for the
first attempt, after 15 minutes for the second attempt, and after 60
minutes for subsequent attempts.
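The schedule above can be sketched as follows. The function and constant names are hypothetical; the intervals (1 hour after success; 15, 15, then 60 minutes after failures) are taken directly from the text.

```python
# Retry intervals, in seconds, after successive failed fetches.
FAILURE_SCHEDULE = [15 * 60, 15 * 60, 60 * 60]
SUCCESS_INTERVAL = 60 * 60

def next_fetch_delay(n_failures):
    """Seconds until the next bridge-descriptor fetch attempt.

    n_failures == 0 means the last fetch succeeded, so we refresh
    in an hour; otherwise we walk the failure schedule, sticking
    at the last entry for all subsequent attempts.
    """
    if n_failures == 0:
        return SUCCESS_INTERVAL
    i = min(n_failures, len(FAILURE_SCHEDULE)) - 1
    return FAILURE_SCHEDULE[i]
```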
In 0.2.2.x we should come up with some smarter retry schedules.
3.6. Vidalia integration
Vidalia 0.0.16 has a checkbox in its Network config window called
"My ISP blocks connections to the Tor network." Users who click that
box change their configuration to:
UseBridges 1
UpdateBridgesFromAuthority 1
and should specify at least one Bridge identifier.
3.7. Do we need a second layer of entry guards?
If the bridge user uses the bridge as its entry guard, then the
triangulation attacks from Lasse and Paul's Oakland paper work to
locate the user's bridge(s).
Worse, this is another way to enumerate bridges: if the bridge users
keep rotating through second hops, then if you run a few fast servers
(and avoid getting considered an Exit or a Guard) you'll quickly get
a list of the bridges in active use.
That's probably the strongest reason why bridge users will need to
pick second-layer guards. Would this mean bridge users should switch
to four-hop circuits?
We should figure this out in the 0.2.1.x timeframe.
Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
Implemented-In: 0.2.0.x
0. Status
In 0.2.0.x, this proposal is implemented to the extent needed to
address its motivations. See the notes below marked "RESOLUTION"
for details.
1. Background and motivation
Right now we can keep a rough count of Tor users, both total and by
country, by watching connections to a single directory mirror. Being
able to get usage estimates is useful both for our funders (to
demonstrate progress) and for our own development (so we know how
quickly we're scaling and can design accordingly, and so we know which
countries and communities to focus on more). This need for information
is the only reason we haven't deployed "directory guards" (think of
them like entry guards but for directory information; in practice,
it would seem that Tor clients should simply use their entry guards
as their directory guards; see also proposal 125).
With the move toward bridges, we will no longer be able to track Tor
clients that use bridges, since they use their bridges as directory
guards. Further, we need to be able to learn which bridges stop seeing
use from certain countries (and are thus likely blocked), so we can
avoid giving them out to other users in those countries.
Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays
and circuits on its 'network map', and it performs anonymized GeoIP
lookups to its central servers to know where to put the dots. Vidalia
caches answers it gets -- to reduce delay, to reduce overhead on
the network, and to reduce anonymity issues where users reveal their
knowledge about the network through which IP addresses they ask about.
But with the advent of bridges, Tor clients are asking about IP
addresses that aren't in the main directory. In particular, bridge
users inform the central Vidalia servers about each bridge as they
discover it, since their Vidalia tries to map it.
Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's
own IP address, so it can provide a more useful map.
Finally, Vidalia's central servers leave users open to partitioning
attacks, even if they can't target specific users. Further, as we
start using GeoIP results for more operational or security-relevant
goals, such as avoiding or including particular countries in circuits,
it becomes more important that users can't be singled out in terms of
their IP-to-country mapping beliefs.
2. The available GeoIP databases
There are at least two classes of GeoIP database out there: "IP to
country", which tells us the country code for the IP address but
no more details, and "IP to city", which tells us the country code,
the name of the city, and some basic latitude/longitude guesses.
A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252
bytes. A typical line is:
"205500992","208605279","US","USA","UNITED STATES"
http://ip-to-country.webhosting.info/node/view/5
Similarly, the maxmind GeoLite Country database is also about 500KB
compressed.
http://www.maxmind.com/app/geolitecountry
The maxmind GeoLite City database gives more fine-grained detail like
geo coordinates and city name. Vidalia currently makes use of this
information. On the other hand it's 16MB compressed. A typical line is:
206.124.149.146,Bellevue,WA,US,47.6051,-122.1134
http://www.maxmind.com/app/geolitecity
There are other databases out there, like
http://www.hostip.info/faq.html
http://www.webconfs.com/ip-to-city.php
that deserve more attention, but for now let's assume that all the db's
are around this size.
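The csv line above stores each range as a pair of 32-bit integers. A local lookup then reduces to a binary search over the sorted ranges; here is a minimal sketch with a toy two-row database (the helper names and the "AU" row are invented for illustration).

```python
import bisect

def ip_to_int(addr):
    """Convert a dotted-quad IPv4 address to the 32-bit integer
    form used in ip-to-country.csv."""
    a, b, c, d = (int(x) for x in addr.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

# Toy database: (range_start, range_end, country_code) rows as they
# would be parsed from the csv, sorted by range_start.
RANGES = [
    (ip_to_int("1.0.0.0"), ip_to_int("1.255.255.255"), "AU"),
    (205500992, 208605279, "US"),   # the example row from the text
]

def lookup_country(addr, ranges=RANGES):
    """Binary-search the sorted range list for addr's country code."""
    n = ip_to_int(addr)
    i = bisect.bisect_right([r[0] for r in ranges], n) - 1
    if i >= 0 and ranges[i][0] <= n <= ranges[i][1]:
        return ranges[i][2]
    return None     # address not in the db; see Section 7.6
```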
3. What we'd like to solve
Goal #1a: Tor relays collect IP-to-country user stats and publish
sanitized versions.
Goal #1b: Tor bridges collect IP-to-country user stats and publish
sanitized versions.
Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better
mapping.
Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user
can pick countries for her paths.
Goal #3: Vidalia doesn't do external lookups on bridge relay addresses.
Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city
for better mapping.
Goal #5: Reduce partitioning opportunities where Vidalia central
servers can give different (distinguishing) responses.
4. Solution overview
Our goal is to allow Tor relays, bridges, and clients to learn enough
GeoIP information so they can do local private queries.
4.1. The IP-to-country db
Directory authorities should publish a "geoip" file that contains
IP-to-country mappings. Directory caches will mirror it, and Tor clients
and relays (including bridge relays) will fetch it. Thus we can solve
goals 1a and 1b (publish sanitized usage info). Controllers could also
use this to solve goal 2b (choosing path by country attributes). It
also solves goal 4 (learning the Tor client's country), though for
huge countries like the US we'd still need to decide where the "middle"
should be when we're mapping that address.
The IP-to-country details are described further in Sections 5 and
6 below.
[RESOLUTION: The geoip file in 0.2.0.x is not distributed through
Tor. Instead, it is shipped with the bundle.]
4.2. The IP-to-city db
In an ideal world, the IP-to-city db would be small enough that we
could distribute it in the above manner too. But for now, it is too
large. Here's where the design choice forks.
Option A: Vidalia should continue doing its anonymized IP-to-city
queries. Thus we can achieve goals 2a and 2b. We would solve goal
3 by only doing lookups on descriptors that are purpose "general"
(see Section 4.2.1 for how). We would leave goal 5 unsolved.
Option B: Each directory authority should keep an IP-to-city db,
lookup the value for each router it lists, and include that line in
the router's network-status entry. The network-status consensus would
then use the line that appears in the majority of votes. This approach
also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups
at all now), and goal 5 (reduced partitioning risks).
Option B has the advantage that Vidalia can simplify its operation,
and the advantage that this consensus IP-to-city data is available to
other controllers besides just Vidalia. But it has the disadvantage
that the networkstatus consensus becomes larger, even though most of
the GeoIP information won't change from one consensus to the next. Is
there another reasonable location for it that can provide similar
consensus security properties?
[RESOLUTION: IP-to-city is not supported.]
4.2.1. Controllers can query for router annotations
Vidalia needs to stop doing queries on bridge relay IP addresses.
It could do that by only doing lookups on descriptors that are in
the networkstatus consensus, but that precludes designs like Blossom
that might want to map its relay locations. The best answer is that it
should learn the router annotations, with a new controller 'getinfo'
command:
"GETINFO desc-annotations/id/<OR identity>"
which would respond with something like
@downloaded-at 2007-11-29 08:06:38
@source "128.31.0.34"
@purpose bridge
[We could also make the answer include the digest for the router in
question, which would enable us to ask GETINFO router-annotations/all.
Is this worth it? -RD]
Then Vidalia can avoid doing lookups on descriptors with purpose
"bridge". Even better would be to add a new annotation "@private true"
so Vidalia can know how to handle new purposes that we haven't created
yet. Vidalia could special-case "bridge" for now, for compatibility
with the current 0.2.0.x-alphas.
4.3. Recommendation
My overall recommendation is that we should implement 4.1 soon
(e.g. early in 0.2.1.x), and we can go with 4.2 option A for now,
with the hope that later we discover a better way to distribute the
IP-to-city info and can switch to 4.2 option B.
Below we discuss more how to go about achieving 4.1.
5. Publishing and caching the GeoIP (IP-to-country) database
Each v3 directory authority should put a copy of the "geoip" file in
its datadirectory. Then its network-status votes should include a hash
of this file (Recommended-geoip-hash: %s), and the resulting consensus
directory should specify the consensus hash.
There should be a new URL for fetching this geoip db (by "current.z"
for testing purposes, and by hash.z for typical downloads). Authorities
should fetch and serve the one listed in the consensus, even when they
vote for their own. This would argue for storing the cached version
in a better filename than "geoip".
Directory mirrors should keep a copy of this file available via the
same URLs.
We assume that the file would change at most a few times a month. Should
Tor ship with a bootstrap geoip file? An out-of-date geoip file may
open you up to partitioning attacks, but for the most part it won't
be that different.
There should be a config option to disable updating the geoip file,
in case users want to use their own file (e.g. they have a proprietary
GeoIP file they prefer to use). In that case we leave it up to the
user to update his geoip file out-of-band.
[XXX Should consider forward/backward compatibility, e.g. if we want
to move to a new geoip file format. -RD]
[RESOLUTION: Not done over Tor.]
6. Controllers use the IP-to-country db for mapping and for path building
Down the road, Vidalia could use the IP-to-country mappings for placing
on its map:
- The location of the client
- The location of the bridges, or other relays not in the
networkstatus, on the map.
- Any relays that it doesn't yet have an IP-to-city answer for.
Other controllers can also use it to set EntryNodes, ExitNodes, etc
in a per-country way.
To support these features, we need to export the IP-to-country data
via the Tor controller protocol.
Is it sufficient just to add a new GETINFO command?
GETINFO ip-to-country/128.31.0.34
250+ip-to-country/128.31.0.34="US","USA","UNITED STATES"
[RESOLUTION: Not done now, except for the getinfo command.]
6.1. Other interfaces
Robert Hogan has also suggested a
GETINFO relays-by-country/cn
as well as torrc options for ExitCountryCodes, EntryCountryCodes,
ExcludeCountryCodes, etc.
[RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.]
7. Relays and bridges use the IP-to-country db for usage summaries
Once bridges have a GeoIP database locally, they can start to publish
sanitized summaries of client usage -- how many users they see and from
what countries. This might also be a more useful way for ordinary Tor
relays to convey the level of usage they see, which would allow us to
switch to using directory guards for all users by default.
But how to safely summarize this information without opening too many
anonymity leaks?
7.1 Attacks to think about
First, note that we need to have a large enough time window that we're
not aiding correlation attacks much. I hope 24 hours is enough. So
that means no publishing stats until you've been up at least 24 hours.
And you can't publish follow-up stats more often than every 24 hours,
or people could look at the differential.
Second, note that we need to be sufficiently vague about the IP
addresses we're reporting. We are hoping that just specifying the
country will be vague enough. But a) what about active attacks where
we convince a bridge to use a GeoIP db that labels each suspect IP
address as a unique country? We have to assume that the consensus GeoIP
db won't be malicious in this way. And b) could such singling-out
attacks occur naturally, for example because of countries that have
a very small IP space? We should investigate that.
7.2. Granularity of users
Do we only want to report countries that have a sufficient anonymity set
(that is, number of users) for the day? For example, we might avoid
listing any countries that have seen less than five addresses over
the 24 hour period. This approach would be helpful in reducing the
singling-out opportunities -- in the extreme case, we could imagine a
situation where one blogger from the Sudan used Tor on a given day, and
we can discover which entry guard she used.
But I fear that especially for bridges, seeing only one hit from a
given country in a given day may be quite common.
As a compromise, we should start out with an "Other" category in
the reported stats, which is the sum of unlisted countries; if that
category is consistently interesting, we can think harder about how
to get the right data from it safely.
But note that bridge summaries will not be made public individually,
since doing so would help people enumerate bridges. Whereas summaries
from normal relays will be public. So perhaps that means we can afford
to be more specific in bridge summaries? In particular, I'm thinking the
"other" category should be used by public relays but not for bridges
(or if it is, used with a lower threshold).
Even for countries that have many Tor users, we might not want to be
too specific about how many users we've seen. For example, we might
round down the number of users we report to the nearest multiple of 5.
My instinct for now is that this won't be that useful.
7.3 Other issues
Another note: we'll likely be overreporting in the case of users with
dynamic IP addresses: if they rotate to a new address over the course
of the day, we'll count them twice. So be it.
7.4. Where to publish the summaries?
We designed extrainfo documents for information like this. So they
should just be more entries in the extrainfo doc.
But if we want to publish summaries every 24 hours (no more often,
no less often), aren't we tied to the router descriptor publishing
schedule? That is, if we publish a new router descriptor at the 18
hour mark, and nothing much has changed at the 24 hour mark, won't
the new descriptor get dropped as being "cosmetically similar", and
then nobody will know to ask about the new extrainfo document?
One solution would be to make and remember the 24 hour summary at the
24 hour mark, but not actually publish it anywhere until we happen to
publish a new descriptor for other reasons. If we happen to go down
before publishing a new descriptor, then so be it, at least we tried.
7.5. What if the relay is unreachable or goes to sleep?
Even if you've been up for 24 hours, if you were hibernating for 18
of them, then we're not getting as much fuzziness as we'd like. So
I guess that means that we need a 24-hour period of being "awake"
before we're willing to publish a summary. A similar attack works if
you've been awake but unreachable for the first 18 of the 24 hours. As
another example, a bridge that's on a laptop might be suspended for
some of each day.
This implies that some relays and bridges will never publish summary
stats, because they're not ever reliably working for 24 hours in
a row. If a significant percentage of our reporters end up being in
this boat, we should investigate whether we can accumulate 24 hours of
"usefulness", even if there are holes in the middle, and publish based
on that.
What other issues are like this? It seems that just moving to a new
IP address shouldn't be a reason to cancel stats publishing, assuming
we were usable at each address.
7.6. IP addresses that aren't in the geoip db
Some IP addresses aren't in the public geoip databases. In particular,
I've found that a lot of African countries are missing, but there
are also some common ones in the US that are missing, like parts of
Comcast. We could just lump unknown IP addresses into the "other"
category, but it might be useful to gather a general sense of how many
lookups are failing entirely, by adding a separate "Unknown" category.
We could also contribute back to the geoip db, by letting bridges set
a config option to report the actual IP addresses that failed their
lookup. Then the bridge authority operators can manually make sure
the correct answer will be in later geoip files. This config option
should be disabled by default.
7.7 Bringing it all together
So here's the plan:
24 hours after starting up (modulo Section 7.5 above), bridges and
relays should construct a daily summary of client countries they've
seen, including the above "Unknown" category (Section 7.6) as well.
Non-bridge relays lump all countries with less than K (e.g. K=5) users
into the "Other" category (see Sec 7.2 above), whereas bridge relays are
willing to list a country even when it has only one user for the day.
Whenever we have a daily summary on record, we include it in our
extrainfo document whenever we publish one. The daily summary we
remember locally gets replaced with a newer one when another 24
hours pass.
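The plan above can be sketched as a small summarizing function. This is an illustrative sketch, not the implemented code; the function name and output shape are assumptions, and K defaults to the example value of 5 from Section 7.2.

```python
from collections import Counter

def summarize(countries, is_bridge, k=5):
    """Build the daily per-country usage summary described above.

    `countries` holds one country code (or None for failed geoip
    lookups) per distinct client address seen in the 24-hour window.
    Non-bridge relays lump countries with fewer than k users into
    "Other"; bridges report every country, even singletons.
    """
    counts = Counter(c if c is not None else "Unknown" for c in countries)
    if is_bridge:
        return dict(counts)
    summary, other = {}, 0
    for cc, n in counts.items():
        if cc != "Unknown" and n < k:
            other += n          # below threshold: fold into "Other"
        else:
            summary[cc] = n
    if other:
        summary["Other"] = other
    return summary
```

Note that the "Unknown" category (failed lookups, Section 7.6) is reported separately rather than folded into "Other", so we can gauge how often lookups fail entirely.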
7.8. Some forward secrecy
How should we remember addresses locally? If we convert them into
country-codes immediately, we will count them again if we see them
again. On the other hand, we don't really want to keep a list hanging
around of all IP addresses we've seen in the past 24 hours.
Step one is that we should never write this stuff to disk. Keeping it
only in ram will make things somewhat better. Step two is to avoid
keeping any timestamps associated with it: rather than a rolling
24-hour window, which would require us to remember the various times
we've seen that address, we can instead just throw out the whole list
every 24 hours and start over.
We could hash the addresses, and then compare hashes when deciding if
we've seen a given address before. We could even do keyed hashes. Or
Bloom filters. But if our goal is to defend against an adversary
who steals a copy of our ram while we're running and then does
guess-and-check on whatever blob we're keeping, we're in bad shape.
We could drop the last octet of the IP address as soon as we see
it. That would cause us to undercount some users from cablemodem and
DSL networks that have a high density of Tor users. And it wouldn't
really help that much -- indeed, the extent to which it does help is
exactly the extent to which it makes our stats less useful.
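The keyed-hash variant with a whole-list reset every 24 hours might look like the following sketch. The class name and interface are invented; as the text notes, this does not defend against an adversary who captures both the key and the digests from RAM and can guess-and-check.

```python
import hashlib
import hmac
import os
import time

class SeenAddresses:
    """In-RAM, keyed-hash record of client addresses, discarded
    wholesale every 24 hours -- a sketch of the scheme above.

    Addresses are kept only as truncated HMACs under a random
    per-period key, so no raw address list or per-address timestamp
    ever exists.
    """
    PERIOD = 24 * 60 * 60

    def __init__(self, now=None):
        self._reset(time.time() if now is None else now)

    def _reset(self, now):
        self.key = os.urandom(16)      # fresh key each period
        self.digests = set()
        self.started = now

    def saw(self, addr, now=None):
        """Record addr; return True if it is new this period."""
        now = time.time() if now is None else now
        if now - self.started >= self.PERIOD:
            self._reset(now)           # throw out the whole list
        d = hmac.new(self.key, addr.encode(), hashlib.sha256).digest()[:8]
        if d in self.digests:
            return False
        self.digests.add(d)
        return True
```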
Other ideas?
Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
Author: Roger Dingledine
Created: 2007-12-02
Status: Obsolete
1. Overview
Some countries and networks block connections to the Tor website. As
time goes by, this will remain a problem and it may even become worse.
We have a big pile of mirrors (google for "Tor mirrors"), but few of
our users think to try a search like that. Also, many of these mirrors
might be automatically blocked since their pages contain words that
might cause them to get banned. And lastly, we can imagine a future
where the blockers are aware of the mirror list too.
Here we describe a new set of URLs for Tor's DirPort that will relay
connections from users to the official Tor download site. Rather than
trying to cache a bunch of new Tor packages (which is a hassle in terms
of keeping them up to date, and a hassle in terms of drive space used),
we instead just proxy the requests directly to Tor's /dist page.
Specifically, we should support
GET /tor/dist/$1
and
GET /tor/website/$1
2. Direct connections, one-hop circuits, or three-hop circuits?
We could relay the connections directly to the download site -- but
this produces recognizable outgoing traffic on the bridge or cache's
network, which will probably surprise our nice volunteers. (Is this
a good enough reason to discard the direct connection idea?)
Even if we don't do direct connections, should we do a one-hop
begindir-style connection to the mirror site (make a one-hop circuit
to it, then send a 'begindir' cell down the circuit), or should we do
a normal three-hop anonymized connection?
If these mirrors are mainly bridges, doing either a direct or a one-hop
connection creates another way to enumerate bridges. That would argue
for three-hop. On the other hand, downloading a 10+ megabyte installer
through a normal Tor circuit can't be fun. But if you're already getting
throttled a lot because you're in the "relayed traffic" bucket, you're
going to have to accept a slow transfer anyway. So three-hop it is.
Speaking of which, we would want to label this connection
as "relay" traffic for the purposes of rate limiting; see
connection_counts_as_relayed_traffic() and or_conn->client_used. This
will be a bit tricky though, because these connections will use the
bridge's guards.
3. Scanning resistance
One other goal we'd like to achieve, or at least not hinder, is making
it hard to scan large swaths of the Internet to look for responses
that indicate a bridge.
In general this is a really hard problem, so we shouldn't demand to
solve it here. But we can note that some bridges should open their
DirPort (and offer this functionality), and others shouldn't. Then
some bridges provide a download mirror while others can remain
scanning-resistant.
4. Integrity checking
If we serve this stuff in plaintext from the bridge, anybody in between
the user and the bridge can intercept and modify it. The bridge can too.
If we do an anonymized three-hop connection, the exit node can also
intercept and modify the exe it sends back.
Are we setting ourselves up for rogue exit relays, or rogue bridges,
that trojan our users?
Answer #1: Users need to do pgp signature checking. Not a very good
answer, a) because it's complex, and b) because they don't know the
right signing keys in the first place.
Answer #2: The mirrors could exit from a specific Tor relay, using the
'.exit' notation. This would make connections a bit more brittle, but
would resolve the rogue exit relay issue. We could even round-robin
among several, and the list could be dynamic -- for example, all the
relays with an Authority flag that allow exits to the Tor website.
Answer #3: The mirrors should connect to the main distribution site
via SSL. That way the exit relay can't influence anything.
Answer #4: We could suggest that users only use trusted bridges for
fetching a copy of Tor. Hopefully they heard about the bridge from a
trusted source rather than from the adversary.
Answer #5: What if the adversary is trawling for Tor downloads by
network signature -- either by looking for known bytes in the binary,
or by looking for "GET /tor/dist/"? It would be nice to encrypt the
connection from the bridge user to the bridge. And we can! The bridge
already supports TLS. Rather than initiating a TLS renegotiation after
connecting to the ORPort, the user should actually request a URL. Then
the ORPort can either pass the connection off as a linked conn to the
dirport, or renegotiate and become a Tor connection, depending on how
the client behaves.
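The dispatch in Answer #5 amounts to peeking at the first bytes the client sends and branching on them: a TLS handshake record begins with content type 0x16, while a plaintext mirror request begins with an HTTP verb. Here is a minimal sketch of that classification; it is an assumption about how the dispatch could work, not Tor's actual implementation.

```python
def classify_orport_client(first_bytes):
    """Guess what a newly connected ORPort client wants from the
    first few bytes it sends -- hypothetical dispatch sketch.
    """
    if first_bytes[:1] == b"\x16":
        return "tls"         # handshake record: treat as a Tor connection
    if first_bytes[:4] in (b"GET ", b"POST"):
        return "dirport"     # HTTP request: hand off as a linked conn
    return "unknown"
```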
5. Linked connections: at what level should we proxy?
Check out the connection_ap_make_link() function, as called from
directory.c. Tor clients use this to create a "fake" socks connection
back to themselves, and then they attach a directory request to it,
so they can launch directory fetches via Tor. We can piggyback on
this feature.
We need to decide if we're going to be passing the bytes back and
forth between the web browser and the main distribution site, or if
we're going to be actually acting like a proxy (parsing out the file
they want, fetching that file, and serving it back).
Advantages of proxying without looking inside:
- We don't need to build any sort of http support (including
continues, partial fetches, etc etc).
Disadvantages:
- If the browser thinks it's speaking http, are there easy ways
to pass the bytes to an https server and have everything work
correctly? At the least, it would seem that the browser would
complain about the cert. More generally, ssl wants to be negotiated
before the URL and headers are sent, yet we need to read the URL
and headers to know that this is a mirror request; so we have an
ordering problem here.
- Makes it harder to do caching later on, if we don't look at what
we're relaying. (It might be useful down the road to cache the
answers to popular requests, so we don't have to keep getting
them again.)
6. Outstanding problems
1) HTTP proxies already exist. Why waste our time cloning one
badly? When we clone existing stuff, we usually regret it.
2) It's overbroad. We only seem to need a secure get-a-tor feature,
and instead we're contemplating building a locked-down HTTP proxy.
3) It's going to add a fair bit of complexity to our code. We do
not currently implement HTTPS. We'd need to refactor lots of the
low-level connection stuff so that "SSL" and "Cell-based" were no
longer synonymous.
4) It's still unclear how effective this proposal would be in
practice. You need to know that this feature exists, which means
somebody needs to tell you about a bridge (mirror) address and tell
you how to use it. And if they're doing that, they could (e.g.) tell
you about a gmail autoresponder address just as easily, and then you'd
get better authentication of the Tor program to boot.
Filename: 128-bridge-families.txt
Title: Families of private bridges
Author: Roger Dingledine
Created: 2007-12-xx
Status: Dead
1. Overview
Proposal 125 introduced the basic notion of how bridge authorities,
bridge relays, and bridge users should behave. But it doesn't get into
the various mechanisms of how to distribute bridge relay addresses to
bridge users.
One of the mechanisms we have in mind is called 'families of bridges'.
If a bridge user knows about only one private bridge, and that bridge
shuts off for the night or gets a new dynamic IP address, the bridge
user is out of luck and needs to re-bootstrap manually or wait and
hope it comes back. On the other hand, if the bridge user knows about
a family of bridges, then as long as one of those bridges is still
reachable his Tor client can automatically learn about where the
other bridges have gone.
So in this design, a single volunteer could run multiple coordinated
bridges, or a group of volunteers could each run a bridge. We abstract
out the details of how these volunteers find each other and decide to
set up a family.
2. Other notes.
Somebody needs to run a bridge authority. It needs to have a torrc
option to publish networkstatuses of its bridges, and it should also do
reachability testing just of those bridges. People ask for the bridge
networkstatus by requesting a URL that contains a password (it's safe to
do this because of begin_dir). So the bridge users need to know a) the
password, and b) a bridge authority line; the bridge authority needs to
know the password.
3. Current state
I implemented a BridgePassword config option. Bridge authorities
should set it, and users who want to use those bridge authorities
should set it.
Now there is a new directory URL "/tor/networkstatus-bridges" that
directory mirrors serve if BridgeAuthoritativeDir is set and it's a
begin_dir connection. It looks for the header
Authorization: Basic %s
where %s is the base-64 bridge password.
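A client supporting this would need to encode its configured BridgePassword
and attach it to the begin_dir request. A minimal sketch of that step (the
helper name is hypothetical, not part of Tor's code):

```python
import base64

def bridge_auth_header(bridge_password):
    """Build the Authorization header described above (hypothetical helper).

    The bridge password is base64-encoded and sent as a Basic credential
    on the begin_dir request for /tor/networkstatus-bridges.
    """
    encoded = base64.b64encode(bridge_password.encode("ascii")).decode("ascii")
    return "Authorization: Basic %s" % encoded

# A client configured with the password "opensesame" would send:
#   Authorization: Basic b3BlbnNlc2FtZQ==
print(bridge_auth_header("opensesame"))
```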
I never got around to teaching clients how to set the header, though,
so it may or may not work, and may or may not do what we ultimately want.
I've marked this proposal dead; it really never should have left the
ideas/ directory. Somebody should pick it up sometime and finish the
design and implementation.
Filename: 129-reject-plaintext-ports.txt
Title: Block Insecure Protocols by Default
Author: Kevin Bauer & Damon McCoy
Created: 2008-01-15
Status: Closed
Implemented-In: 0.2.0.x
Overview:
Below is a proposal to mitigate insecure protocol use over Tor.
This document 1) demonstrates the extent to which insecure protocols are
currently used within the Tor network, and 2) proposes a simple solution
to prevent users from unknowingly using these insecure protocols. By
insecure, we mean protocols that explicitly leak sensitive user names
and/or passwords, such as POP, IMAP, Telnet, and FTP.
Motivation:
As part of a general study of Tor use in 2006/2007 [1], we attempted to
understand what types of protocols are used over Tor. While we observed an
enormous volume of Web and Peer-to-peer traffic, we were surprised by the
number of insecure protocols that were used over Tor. For example, over an
8 day observation period, we observed the following number of connections
over insecure protocols:
POP and IMAP: 10,326 connections
Telnet: 8,401 connections
FTP: 3,788 connections
Each of the above listed protocols exchange user name and password
information in plain-text. As an upper bound, we could have observed
22,515 user names and passwords. This observation echoes the reports of
a Tor router logging and posting e-mail passwords in August 2007 [2]. The
response from the Tor community has been to further educate users
about the dangers of using insecure protocols over Tor. However, we
recently repeated our Tor usage study from last year and noticed that the
trend in insecure protocol use has not declined. Therefore, we propose that
additional steps be taken to protect naive Tor users from inadvertently
exposing their identities (and even passwords) over Tor.
Security Implications:
This proposal is intended to improve Tor's security by limiting the
use of insecure protocols.
Roger added: By adding these warnings for only some of the risky
behavior, users may do other risky behavior, not get a warning, and
believe that it is therefore safe. But overall, I think it's better
to warn for some of it than to warn for none of it.
Specification:
As an initial step towards mitigating the use of the above-mentioned
insecure protocols, we propose that the default ports for each respective
insecure service be blocked at the Tor client's socks proxy. These default
ports include:
23 - Telnet
109 - POP2
110 - POP3
143 - IMAP
Notice that FTP is not included in the proposed list of ports to block. This
is because FTP is often used anonymously, i.e., without any identifying
user name or password.
This blocking scheme can be implemented as a set of flags in the client's
torrc configuration file:
BlockInsecureProtocols 0|1
WarnInsecureProtocols 0|1
When the warning flag is activated, a message should be displayed to
the user similar to the message given when Tor's socks proxy is given an IP
address rather than resolving a host name.
We recommend that the default torrc configuration file block insecure
protocols and provide a warning to the user to explain the behavior.
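The proposed check at the client's SOCKS proxy could look roughly like this
sketch (the function and parameter names are illustrative, not Tor's actual
internals; they mirror the BlockInsecureProtocols and WarnInsecureProtocols
flags above):

```python
# Default plaintext ports the proposal blocks and/or warns on.
INSECURE_PORTS = {23: "Telnet", 109: "POP2", 110: "POP3", 143: "IMAP"}

def check_socks_request(port, block_insecure=True, warn_insecure=True):
    """Return (allow, warning) for a SOCKS connect request to `port`.

    Sketch of the proposed behavior: warn and/or refuse connections to
    the default ports of protocols that carry credentials in plain text.
    """
    proto = INSECURE_PORTS.get(port)
    if proto is None:
        return True, None  # not a known-insecure port; let it through
    warning = None
    if warn_insecure:
        warning = ("Port %d (%s) carries user names and passwords in "
                   "plain text; an exit node could read them." % (port, proto))
    return (not block_insecure), warning

# With the recommended defaults, a POP3 connection is refused with a warning.
allow, warning = check_socks_request(110)
```

Note that FTP's port 21 is deliberately absent from the table, matching the
proposal's reasoning about anonymous FTP.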
Finally, there are many popular web pages that do not offer secure
login features, such as MySpace, and it would be prudent to provide
additional rules to Privoxy to attempt to protect users from unknowingly
submitting their login credentials in plain-text.
Compatibility:
None, as the proposed changes are to be implemented in the client.
References:
[1] Shining Light in Dark Places: A Study of Anonymous Network Usage.
University of Colorado Technical Report CU-CS-1032-07. August 2007.
[2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise.
http://www.wired.com/politics/security/news/2007/09/embassy_hacks.
Wired. September 10, 2007.
Implementation:
Roger added this feature in
http://archives.seul.org/or/cvs/Jan-2008/msg00182.html
He also added a status event for Vidalia to recognize attempts to use
vulnerable-plaintext ports, so it can help the user understand what's
going on and how to fix it.
Next steps:
a) Vidalia should learn to recognize this controller status event,
so we don't leave users out in the cold when we enable this feature.
b) We should decide which ports to reject by default. The current
consensus is 23,109,110,143 -- the same set that we warn for now.
Filename: 130-v2-conn-protocol.txt
Title: Version 2 Tor connection protocol
Author: Nick Mathewson
Created: 2007-10-25
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This proposal describes the significant changes to be made in the v2
Tor connection protocol.
This proposal relates to other proposals as follows:
It refers to and supersedes:
Proposal 124: Blocking resistant TLS certificate usage
It refers to aspects of:
Proposal 105: Version negotiation for the Tor protocol
In summary, the Tor connection protocol has been in need of a redesign
for a while. This proposal describes how we can add to the Tor
protocol:
- A new TLS handshake (to achieve blocking resistance without
breaking backward compatibility)
- Version negotiation (so that future connection protocol changes
can happen without breaking compatibility)
- The actual changes in the v2 Tor connection protocol.
Motivation:
For motivation, see proposal 124.
Proposal:
0. Terminology
The version of the Tor connection protocol implemented up to now is
"version 1". This proposal describes "version 2".
"Old" or "Older" versions of Tor are ones not aware that version 2
of this protocol exists;
"New" or "Newer" versions are ones that are.
The connection initiator is referred to below as the Client; the
connection responder is referred to below as the Server.
1. The revised TLS handshake.
For motivation, see proposal 124. This is a simplified version of the
handshake that uses TLS's renegotiation capability in order to avoid
some of the extraneous steps in proposal 124.
The Client connects to the Server and, as in ordinary TLS, sends a
list of ciphers. Older versions of Tor will send only ciphers from
the list:
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
Clients that support the revised handshake will send the recommended
list of ciphers from proposal 124, in order to emulate the behavior of
a web browser.
If the server notices that the list of ciphers contains only ciphers
from this list, it proceeds with Tor's version 1 TLS handshake as
documented in tor-spec.txt.
(The server may also notice cipher lists used by other implementations
of the Tor protocol (in particular, the BouncyCastle default cipher
list as used by some Java-based implementations), and whitelist them.)
On the other hand, if the server sees a list of ciphers that could not
have been sent from an older implementation (because it includes other
ciphers, and does not match any known-old list), the server sends a
reply containing a single connection certificate, constructed as for
the link certificate in the v1 Tor protocol. The subject names in
this certificate SHOULD NOT have any strings to identify them as
coming from a Tor server. The server does not ask the client for
certificates.
Old Servers will (mostly) ignore the cipher list and respond as in the v1
protocol, sending back a two-certificate chain.
After the Client gets a response from the server, it checks for the
number of certificates it received. If there are two certificates,
the client assumes a V1 connection and proceeds as in tor-spec.txt.
But if there is only one certificate, the client assumes a V2 or later
protocol and continues.
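The client's decision rule above is simple enough to state directly; this is
a sketch of the rule only, not Tor's implementation:

```python
def infer_link_protocol(cert_chain_len):
    """Infer the link protocol version from the server's certificate count.

    Per the handshake above: a two-certificate chain means the v1
    protocol; a single certificate means v2 or later (the exact version
    is settled later by VERSIONS cells after renegotiation).
    """
    if cert_chain_len == 2:
        return 1
    if cert_chain_len == 1:
        return 2  # v2 or later
    raise ValueError("unexpected certificate chain length: %d" % cert_chain_len)
```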
At this point, the client has established a TLS connection with the
server, but the parties have not been authenticated: the server hasn't
sent its identity certificate, and the client hasn't sent any
certificates at all. To fix this, the client begins a TLS session
renegotiation. This time, the server continues with two certificates
as usual, and asks for certificates so that the client will send
certificates of its own. Because the TLS connection has been
established, all of this is encrypted. (The certificate sent by the
server in the renegotiated connection need not be the same as the one
sent in the original connection.)
The server MUST NOT write any data until the client has renegotiated.
Once the renegotiation is finished, the server and client check one
another's certificates as in V1. Now they are mutually authenticated.
1.1. Revised TLS handshake: implementation notes.
It isn't so easy to adjust server behavior based on the client's
ciphersuite list. Here's how we can do it using OpenSSL. This is a
bit of an abuse of the OpenSSL APIs, but it's the best we can do, and
we won't have to do it forever.
We can use OpenSSL's SSL_set_info_callback() to register a function to
be called when the state changes. The type/state tuple of
SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A
happens when we have completely parsed the client hello, and are about
to send a response. From this callback, we can check the cipherlist
and act accordingly:
* If the ciphersuite list indicates a v1 protocol, we set the
verify mode to SSL_VERIFY_NONE with a callback (so we get
certificates).
* If the ciphersuite list indicates a v2 protocol, we set the
verify mode to SSL_VERIFY_NONE with no callback (so we get
no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that
we send only 1 certificate in the response).
Once the handshake is done, the server clears the
SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1
protocol. It then starts reading.
The other problem to take care of is missing ciphers and OpenSSL's
cipher sorting algorithms. The two main issues are a) OpenSSL doesn't
support some of the default ciphers that Firefox advertises, and b)
OpenSSL sorts the list of ciphers it offers in a different way than
Firefox sorts them, so unless we fix that Tor will still look different
than Firefox.
[XXXX more on this.]
1.2. Compatibility for clients using libraries less hackable than OpenSSL.
As discussed in proposal 105, servers advertise which protocol
versions they support in their router descriptors. Clients can simply
behave as v1 clients when connecting to servers that do not support
link version 2 or higher, and as v2 clients when connecting to servers
that do support link version 2 or higher.
(Servers can't use this strategy because we do not assume that servers
know one another's capabilities when connecting.)
2. Version negotiation.
Version negotiation proceeds as described in proposal 105, except as
follows:
* Version negotiation only happens if the TLS handshake as described
above completes.
* The TLS renegotiation must be finished before the client sends a
VERSIONS cell; the server sends its VERSIONS cell in response.
* The VERSIONS cell uses the following variable-width format:
Circuit [2 octets; set to 0]
Command [1 octet; set to 7 for VERSIONS]
Length [2 octets; big-endian]
Data [Length bytes]
The Data in the cell is a series of big-endian two-byte integers.
* It is not allowed to negotiate V1 connections once the v2 protocol
has been used. If this happens, Tor instances should close the
connection.
3. The rest of the "v2" protocol
Once a v2 protocol has been negotiated, NETINFO cells are exchanged
as in proposal 105, and communications begin as per tor-spec.txt.
Until NETINFO cells have been exchanged, the connection is not open.
Filename: 131-verify-tor-usage.txt
Title: Help users to verify they are using Tor
Author: Steven J. Murdoch
Created: 2008-01-25
Status: Obsolete
Overview:
Websites for checking whether a user is accessing them via Tor are a
very helpful aid to configuring web browsers correctly. Existing
solutions have both false positives and false negatives when
checking if Tor is being used. This proposal will discuss how to
modify Tor so as to make testing more reliable.
Motivation:
Currently deployed websites for detecting Tor use work by comparing
the client IP address for a request with a list of known Tor nodes.
This approach is generally effective, but suffers from both false
positives and false negatives.
If a user has a Tor exit node installed, or just happens to have
been allocated an IP address previously used by a Tor exit node, any
web requests will be incorrectly flagged as coming from Tor. If any
customer of an ISP which implements a transparent proxy runs an exit
node, all other users of the ISP will be flagged as Tor users.
Conversely, if the exit node chosen by a Tor user has not yet been
recorded by the Tor checking website, requests will be incorrectly
flagged as not coming via Tor.
The only reliable way to tell whether Tor is being used or not is for
the Tor client to flag this to the browser.
Proposal:
A DNS name should be registered and point to an IP address
controlled by the Tor project and likely to remain so for the
useful lifetime of a Tor client. A web server should be placed
at this IP address.
Tor should be modified to treat requests to port 80, at the
specified DNS name or IP address specially. Instead of opening a
circuit, it should respond to a HTTP request with a helpful web
page:
- If the request to open a connection was to the domain name, the web
page should state that Tor is working properly.
- If the request was to the IP address, the web page should state
that there is a DNS-leakage vulnerability.
If the request goes through to the real web server, the page
should state that Tor has not been set up properly.
Extensions:
Identifying proxy server:
If needed, other applications between the web browser and Tor (e.g.
Polipo and Privoxy) could piggyback on the same mechanism to flag
whether they are in use. All three possible web pages should include
a machine-readable placeholder, into which another program could
insert their own message.
For example, the webpage returned by Tor to indicate a successful
configuration could include the following HTML:
<h2>Connection chain</h2>
<ul>
<li>Tor 0.1.2.14-alpha</li>
<!-- Tor Connectivity Check: success -->
</ul>
When the proxy server observes this string, in response to a request
for the Tor connectivity check web page, it would prepend its own
message, resulting in the following being returned to the web
browser:
<h2>Connection chain</h2>
<ul>
<li>Tor 0.1.2.14-alpha</li>
<li>Polipo version 1.0.4</li>
<!-- Tor Connectivity Check: success -->
</ul>
Checking external connectivity:
If Tor intercepts a request, and returns a response itself, the user
will not actually confirm whether Tor is able to build a successful
circuit. It may then be advantageous to include an image in the web
page which is loaded from a different domain. If this is able to be
loaded then the user will know that external connectivity through
Tor works.
Automatic Firefox Notification:
All forms of the website should return valid XHTML and have a
hidden link with an id attribute "TorCheckResult" and a target
property that can be queried to determine the result. For example,
a hidden link would convey success like this:
<a id="TorCheckResult" target="success" href="/"></a>
failure like this:
<a id="TorCheckResult" target="failure" href="/"></a>
and DNS leaks like this:
<a id="TorCheckResult" target="dnsleak" href="/"></a>
Firefox extensions such as Torbutton would then be able to
issue an XMLHttpRequest for the page and query the result
with resultXML.getElementById("TorCheckResult").target
to automatically report the Tor status to the user when
they first attempt to enable Tor activity, or whenever
they request a check from the extension preferences window.
If the check website is to be themed with heavy graphics and/or
extensive documentation, the check result itself should be
contained in a separate lightweight iframe that extensions can
request via an alternate url.
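A non-browser consumer could apply the same rule to the fetched page. This
sketch uses a regular expression for brevity; as the proposal notes, a real
consumer of the valid-XHTML page would use a proper XML parser:

```python
import re

def parse_tor_check(html):
    """Extract the check result from the hidden TorCheckResult link.

    Returns "success", "failure", or "dnsleak", or None if the marker is
    absent (e.g. an old Tor without this feature -- see 'Open issues').
    """
    m = re.search(r'<a id="TorCheckResult" target="(\w+)"', html)
    return m.group(1) if m else None

page = '<a id="TorCheckResult" target="success" href="/"></a>'
result = parse_tor_check(page)  # "success"
```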
Security and resiliency implications:
What attacks are possible?
If the IP address used for this feature moves there will be two
consequences:
- A new website at this IP address will remain inaccessible over
Tor
- Tor users who are leaking DNS will be informed that Tor is not
working, rather than that it is active but leaking DNS
We should thus attempt to find an IP address which we reasonably
believe can remain static.
Open issues:
If a Tor version which does not support this extra feature is used,
the webpage returned will indicate that Tor is not being used. Can
this be safely fixed?
Related work:
The proposed mechanism is very similar to config.privoxy.org. The
most significant difference is that if the web browser is
misconfigured, Tor will only get an IP address. Even in this case,
Tor should be able to respond with a webpage to notify the user of how
to fix the problem. This also implies that Tor must be told of the
special IP address, which must therefore be effectively permanent.
Filename: 132-browser-check-tor-service.txt
Title: A Tor Web Service For Verifying Correct Browser Configuration
Author: Robert Hogan
Created: 2008-03-08
Status: Obsolete
Overview:
Tor should operate a primitive web service on the loopback network device
that tests the operation of user's browser, privacy proxy and Tor client.
The tests are performed by serving unique, randomly generated elements in
image URLs embedded in static HTML. The images are only displayed if the DNS
and HTTP requests for them are routed through Tor, otherwise the 'alt' text
may be displayed. The proposal assumes that 'alt' text is not displayed in
all browsers, and so suggests that text and links should accompany each image,
advising the user on next steps in case the test fails.
The service is primarily for the use of controllers, since presumably users
aren't going to want to edit text files and then type something exotic like
127.0.0.1:9999 into their address bar. In the main use case the controller
will have configured the actual port for the webservice so will know where
to direct the request. It would also be the responsibility of the controller
to ensure the webservice is available, and tor is running, before allowing
the user to access the page through their browser.
Motivation:
This is a complementary approach to proposal 131. It overcomes some of the
limitations of the approach described in proposal 131: reliance
on a permanent, real IP address and compatibility with older versions of
Tor. Unlike 131, it is not as useful to Tor users who are not running a
controller.
Objective:
Provide a reliable means of helping users to determine if their Tor
installation, privacy proxy and browser are properly configured for
anonymous browsing.
Proposal:
When configured to do so, Tor should run a basic web service available
on a configured port on 127.0.0.1. The purpose of this web service is to
serve a number of basic test images that will allow the user to determine
if their browser is properly configured and that Tor is working normally.
The service can consist of a single web page with two columns. The left
column contains images, the right column contains advice on what the
display or non-display of each image means.
The rest of this proposal assumes that the service is running on port
9999. The port should be configurable, and configuring the port enables the
service. The service must run on 127.0.0.1.
In all the examples below [uniquesessionid] refers to a random, base64
encoded string that is unique to the URL it is contained in. Tor only ever
stores the most recently generated [uniquesessionid] for each URL, storing 3
in total. Tor should generate a [uniquesessionid] for each of the test URLs
below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm.
The most suitable image for each test case is an implementation decision.
Tor will need to store and serve images for the first and second test
images, and possibly the third (see 'Open Issues').
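Generating a [uniquesessionid] could be as simple as the sketch below. The
proposal only requires that the value be random, base64-encoded, and unique
per URL; the 16-byte length and the use of the URL-safe base64 alphabet are
assumptions made here for illustration:

```python
import base64
import os

def new_unique_session_id(nbytes=16):
    """Return a random, base64-encoded token for use in one test URL.

    Uses the URL-safe alphabet (and strips padding) so the token can
    appear in a URL; length is an assumed parameter.
    """
    return base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii").rstrip("=")

# Tor regenerates all three tokens on every GET of index.htm, keeping
# only the most recently generated value for each test URL.
session_ids = {url: new_unique_session_id()
               for url in ("dns-test", "proxy-test", "connectivity-test")}
```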
1. DNS Request Test Image
This is a HTML element embedded in the page served by Tor at
http://127.0.0.1:9999:
<IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see
this text, your browser's DNS requests are not being routed through Tor."
width="200" height="200" align="middle" border="2">
If the browser's DNS request for [uniquesessionid] is routed through Tor,
Tor will intercept the request and return 127.0.0.1 as the resolved IP
address. This will shortly be followed by a HTTP request from the browser
for http://127.0.0.1:9999/torlogo.jpg. This request should be served with
the appropriate image.
If the browser's DNS request for [uniquesessionid] is not routed through Tor
the browser may display the 'alt' text specified in the html element. The
HTML served by Tor should also contain text accompanying the image to advise
users what it means if they do not see an image. It should also provide a
link to click that provides information on how to remedy the problem. This
behaviour also applies to the images described in 2. and 3. below, so should
be assumed there as well.
2. Proxy Configuration Test Image
This is a HTML element embedded in the page served by Tor at
http://127.0.0.1:9999:
<IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see
this text, your browser is not configured to work with Tor." width="200"
height="200" align="middle" border="2">
If the HTTP request for the resource [uniquesessionid].jpg is received by
Tor it will serve the appropriate image in response. It should serve this
image itself, without attempting to retrieve anything from the Internet.
If Tor can identify the name of the proxy application requesting the
resource then it could store and serve an image identifying the proxy to the
user.
3. Tor Connectivity Test Image
This is a HTML element embedded in the page served by Tor at
http://127.0.0.1:9999:
<IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you
can see this text, your Tor installation cannot connect to the Internet."
width="200" height="200" align="middle" border="2">
The referenced image should actually exist on the Tor project website. If
Tor receives the request for the above resource it should remove the random
base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt
to retrieve the real image.
Even on a fully operational Tor client this test may not always succeed. The
user should be advised that one or more attempts to retrieve this image may
be necessary to confirm a genuine problem.
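The prefix-stripping step for the connectivity test can be sketched directly
(an illustrative helper, not Tor's code):

```python
def real_resource(path, session_id):
    """Map a connectivity-test request path back to the real image path
    by stripping the '[uniquesessionid]-' marker described above.

    Returns None if the path does not carry the expected marker.
    """
    marker = session_id + "-"
    prefix, sep, name = path.rpartition("/")
    if not name.startswith(marker):
        return None
    return prefix + sep + name[len(marker):]

# Tor would then fetch the returned path from the real web server.
stripped = real_resource("/abc123-torlogo.jpg", "abc123")  # "/torlogo.jpg"
```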
Open Issues:
The final connectivity test relies on an externally maintained resource, if
this resource becomes unavailable the connectivity test will always fail.
Either the text accompanying the test should advise of this possibility or
Tor clients should be advised of the location of the test resource in the
main network directory listings.
Any number of misconfigurations may make the web service unreachable; it is
the responsibility of the user's controller to recognize these and assist
the user in eliminating them. Tor can mitigate the specific
misconfiguration of routing HTTP traffic for 127.0.0.1 to Tor itself by
serving such requests through the SOCKS port as well as the configured web
service port.
Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping'
them. It already inspects for raw IP addresses (to warn of DNS leaks) but
maybe the behaviour proposed here is qualitatively different. Maybe this is
an unwelcome precedent that can be used to beat the project over the head in
future. Or maybe it's not such a bad thing: Tor is merely attempting to make
normally invalid resource requests valid for a given purpose.
Filename: 133-unreachable-ors.txt
Title: Incorporate Unreachable ORs into the Tor Network
Author: Robert Hogan
Created: 2008-03-08
Status: Reserve
Overview:
Propose a scheme for harnessing the bandwidth of ORs who cannot currently
participate in the Tor network because they can only make outbound
TCP connections.
Motivation:
Restrictive local and remote firewalls are preventing many willing
candidates from becoming ORs on the Tor network. These
ORs have a casual interest in joining the network but their operator is not
sufficiently motivated or adept to complete the necessary router or firewall
configuration. The Tor network is losing out on their bandwidth. At the
moment we don't even know how many such 'candidate' ORs there are.
Objective:
1. Establish how many ORs are unable to qualify for publication because
they cannot establish that their ORPort is reachable.
2. Devise a method for making such ORs available to clients for circuit
building without prejudicing their anonymity.
Proposal:
ORs whose ORPort reachability testing fails a specified number of
consecutive times should:
1. Enlist themselves with the authorities setting a 'Fallback' flag. This
flag indicates that the OR is up and running but cannot connect to
itself.
2. Open an orconn with all ORs whose fingerprint begins with the same
byte as their own. The management of this orconn will be transferred
entirely to the OR at the other end.
3. The fallback OR should update its router status to contain the
'Running' flag if it has managed to open an orconn with 3/4 of the ORs
with an FP beginning with the same byte as its own.
Tor ORs who are contacted by fallback ORs requesting an orconn should:
1. Accept the orconn until they have reached a defined limit of orconn
connections with fallback ORs.
2. Should only accept such orconn requests from listed fallback ORs who
have an FP beginning with the same byte as its own.
Tor clients can include fallback ORs in the network by doing the
following:
1. When building a circuit, observe the fingerprint of each node they
wish to connect to.
2. When randomly selecting a node from the set of all eligible nodes,
add all published, running fallback nodes to the set where the first
byte of the fingerprint matches the previous node in the circuit.
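The client-side selection rule above can be sketched as follows (node
representation and names are illustrative):

```python
def eligible_nodes(published_nodes, fallback_nodes, previous_node_fp):
    """Extend the candidate set with published, running fallback ORs
    whose fingerprint shares its first byte with the previous node in
    the circuit. Fingerprints are hex strings, so the first byte is the
    first two hex digits. Sketch of the rule only.
    """
    prefix = previous_node_fp[:2]
    matching = [n for n in fallback_nodes
                if n["running"] and n["fp"][:2] == prefix]
    return list(published_nodes) + matching
```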
Anonymity Implications:
At least some, and possibly all, nodes on the network will have a set
of nodes that only they and a few others can build circuits on.
1. This means that fallback ORs might be unsuitable for use as middlemen
nodes, because if the exit node is the attacker it knows that the
number of nodes that could be the entry guard in the circuit is
reduced to roughly 1/256th of the network, or worse 1/256th of all
nodes listed as Guards. For the same reason, fallback nodes would
appear to be unsuitable for two-hop circuits.
2. This is not a problem if fallback ORs are always exit nodes. If
the fallback OR is an attacker it will not be able to reduce the
set of possible nodes for the entry guard any further than a normal,
published OR.
Possible Attacks/Open Issues:
1. Gaming Node Selection
Does running a fallback OR customized for a specific set of published ORs
improve an attacker's chances of seeing traffic from that set of published
ORs? Would such a strategy be any more effective than running published
ORs with other 'attractive' properties?
2. DOS Attack
An attacker could prevent all other legitimate fallback ORs with a
given byte-1 in their FP from functioning by running 20 or 30 fallback ORs
and monopolizing all available fallback slots on the published ORs.
This same attacker would then be in a position to monopolize all the
traffic of the fallback ORs on that byte-1 network segment. I'm not sure
what this would allow such an attacker to do.
3. Circuit-Sniffing
An observer watching exit traffic from a fallback server will know that the
previous node in the circuit is one of a very small, identifiable
subset of the total ORs in the network. To establish the full path of the
circuit they would only have to watch the exit traffic from the fallback
OR and all the traffic from the 20 or 30 ORs it is likely to be connected
to. This means it is substantially easier to establish all members of a
circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e.
1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560
or so ORs on the network). The same mechanism that allows the client to
expect a specific fallback OR to be available from a specific published OR
allows an attacker to prepare his ground.
Mitigant:
In terms of the resources and access required to monitor 2000 to 3000
nodes, the effort of the adversary is not significantly diminished when he
is only interested in 20 or 30. It is hard to see how an adversary who can
obtain access to a randomly selected portion of the Tor network would face
any new or qualitatively different obstacles in attempting to access much
of the rest of it.
Implementation Issues:
The number of ORs this proposal would add to the Tor network is not known.
This is because there is no mechanism at present for recording unsuccessful
attempts to become an OR. If the proposal is considered promising it may be
worthwhile to issue an alpha series release where candidate ORs post a
primitive fallback descriptor to the authority directories. This fallback
descriptor would not contain any other flag that would make it eligible for
selection by clients. It would act solely as a means of sizing the number of
Tor instances that try and fail to become ORs.
The upper limit on the number of orconns from fallback ORs a normal,
published OR should be willing to accept is an open question. Would one
hundred mostly idle orconns per OR be too onerous?
Filename: 134-robust-voting.txt
Title: More robust consensus voting with diverse authority sets
Author: Peter Palfrader
Created: 2008-04-01
Status: Rejected
History:
2009 May 27: Added note on rejecting this proposal -- Nick
Overview:
A means to arrive at a valid directory consensus even when voters
disagree on who is an authority.
Motivation:
Right now there are about five authoritative directory servers in the
Tor network, though this number is expected to rise to about 15 eventually.
Adding a new authority requires synchronized action from all operators of
directory authorities so that at any time during the update at least half of
all authorities are running and agree on who is an authority. The latter
requirement is there so that the authorities can arrive at a common
consensus: Each authority builds the consensus based on the votes from
all authorities it recognizes, and so a different set of recognized
authorities will lead to a different consensus document.
Objective:
The modified voting procedure outlined in this proposal obsoletes the
requirement for most authorities to exactly agree on the list of
authorities.
Proposal:
The vote document each authority generates contains a list of
authorities recognized by the generating authority. This will be
a list of authority identity fingerprints.
Authorities will accept votes from and serve/mirror votes also for
authorities they do not recognize. (Votes contain the signing key, the
authority identity key, and the certificate linking them, so they can be
verified even without knowing the authority beforehand.)
Before building the consensus we will check which votes to use for
building:
1) We build a directed graph of which authority/vote recognizes
whom.
2) (Parts of the graph that aren't reachable, directly or
indirectly, from any authorities we recognize can be discarded
immediately.)
3) We find the largest fully connected subgraph.
(Should there be more than one subgraph of the same size, there
needs to be some arbitrary ordering so we always pick the same one,
e.g. pick the one with the smaller XOR of all votes' digests,
or something similar.)
4) If we are part of that subgraph, great. This is the list of
votes we build our consensus with.
5) If we are not part of that subgraph, remove all the nodes that
are part of it and go to 3.
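The selection procedure above can be sketched as an exhaustive clique
search, workable only at authority-set scale. All names here are
illustrative, not from any Tor implementation: `recognizes` maps each
authority to the set of authorities its vote lists, and the
XOR-of-digests tie-break from step 3 is replaced by a simple
lexicographic one.

```python
from itertools import combinations

def largest_clique(nodes, recognizes):
    # Exhaustive search: fine for the ~15 authorities envisioned here,
    # far too slow for large graphs.
    for n in range(len(nodes), 0, -1):
        for group in combinations(sorted(nodes), n):
            if all(b in recognizes.get(a, ()) for a in group
                   for b in group if a != b):
                # Ties are broken lexicographically here; the proposal
                # suggests an XOR-of-digests rule instead.
                return set(group)
    return set()

def pick_vote_set(recognizes, us):
    # Step 2 (discarding parts unreachable from authorities we
    # recognize) is assumed to have happened already.
    nodes = set(recognizes)
    while nodes:
        best = largest_clique(nodes, recognizes)
        if not best:
            return None
        if us in best:
            return best     # step 4: build the consensus with these votes
        nodes -= best       # step 5: drop that subgraph and repeat
    return None
```

For example, if an old clique A, B, C mutually recognizes one another,
and only C and a new authority D recognize D, then A still votes with
{A, B, C} while D is left voting alone.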
Using this procedure authorities that are updated to recognize a
new authority will continue voting with the old group until a
sufficient number has been updated to arrive at a consensus with
the recently added authority.
In fact, the old set of authorities will probably be voting among
themselves until all but one has been updated to recognize the
new authority. Then which set of votes is used for consensus
building depends on which of the two equally large sets gets
ordered before the other in step (3) above.
It is necessary to continue with the process in (5) even if we
are not in the largest subgraph. Otherwise one rogue authority
could create a number of extra votes (by new authorities) so that
everybody stops at 5 and no consensus is built, even though it would
be trusted by all clients.
Anonymity Implications:
The author does not believe this proposal to have anonymity
implications.
Possible Attacks/Open Issues/Some thinking required:
Q: Can a number (less or exactly half) of the authorities cause an honest
authority to vote for "their" consensus rather than the one that would
result were all authorities taken into account?
Q: Can a set of votes from external authorities, i.e. of whom we trust
either none or at least not all, cause us to change the set of consensus
makers we pick?
A: Yes; if other authorities decide they would rather build a consensus
with them, then they'll be thrown out in step 3. But that's OK, since
those other authorities will never vote with us anyway.
If we trust none of them then we throw them out even sooner, so no harm done.
Q: Can this ever force us to build a consensus with authorities we do not
recognize?
A: No, we can never build a fully connected set with them in step 3.
------------------------------
I'm rejecting this proposal as insecure.
Suppose that we have a clique of size N, and M hostile members in the
clique. If these hostile members stop declaring trust for up to M-1
good members of the clique, the clique with the hostile members in it
will be larger than the one without them.
The M hostile members will constitute a majority of this new clique
when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our
requirement that an adversary must compromise a majority of authorities
in order to control the consensus.
-- Nick
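The threshold in the rejection note is easy to check numerically. The
sketch below (illustrative only, not from any Tor code) finds the
smallest hostile group M that ends up with a strict majority of the
attacked clique:

```python
def hostile_majority_threshold(n):
    """Smallest M that can control the consensus in a clique of n
    authorities under this scheme: by dropping trust in M-1 honest
    members, the hostile members land in a clique of size n-(M-1),
    of which they are a majority once M > (n-(M-1))/2, i.e.
    M > (n+1)/3."""
    for m in range(1, n + 1):
        attacked_clique = n - (m - 1)   # hostile + surviving honest
        if 2 * m > attacked_clique:     # strict majority of that clique
            return m
    return None
```

With nine authorities the attack needs only four hostile members,
rather than the five a plain majority requirement would demand.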
Filename: 135-private-tor-networks.txt
Title: Simplify Configuration of Private Tor Networks
Author: Karsten Loesing
Created: 29-Apr-2008
Status: Closed
Target: 0.2.1.x
Implemented-In: 0.2.1.2-alpha
Change history:
29-Apr-2008 Initial proposal for or-dev
19-May-2008 Included changes based on comments by Nick to or-dev and
added a section for test cases.
18-Jun-2008 Changed testing-network-only configuration option names.
Overview:
Configuring a private Tor network has become a time-consuming and
error-prone task with the introduction of the v3 directory protocol. In
addition to that, operators of private Tor networks need to set an
increasing number of non-trivial configuration options, and it is hard
to keep FAQ entries describing this task up-to-date. In this proposal we
suggest (1) optionally accelerating the timing of the v3 directory
voting process and (2) introducing an umbrella config option
specifically aimed at creating private Tor networks.
Design:
1. Accelerate Timing of v3 Directory Voting Process
Tor has reasonable defaults for setting up a large, Internet-scale
network with comparably high latencies and possibly wrong server clocks.
However, those defaults are bad when it comes to quickly setting up a
private Tor network for testing, either on a single node or LAN (things
might be different when creating a test network on PlanetLab or
something). Some time constraints should be made configurable for private
networks. The general idea is to accelerate everything that has to do
with propagation of directory information, but nothing else, so that a
private network is available as soon as possible. (As a possible
safeguard, changing these configuration values could be made dependent on
the umbrella configuration option introduced in 2.)
1.1. Initial Voting Schedule
When a v3 directory does not know any consensus, it assumes an initial,
hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and
DistDelay of 5 minutes. This is important for multiple, simultaneously
restarted directory authorities to meet at a common time and create an
initial consensus. Unfortunately, this means that it may take up to half
an hour (or even more) for a private Tor network to bootstrap.
We propose to make these three time constants configurable (note that
V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an
effect on the _initial_ voting schedule, but only on the schedule that a
directory authority votes for). This can be achieved by introducing three
new configuration options: TestingV3AuthInitialVotingInterval,
TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay.
As a first safeguard, Tor should only accept configuration values for
TestingV3AuthInitialVotingInterval that divide evenly into the default
value of 30 minutes. The effect is that even if people misconfigured
their directory authorities, they would meet at the default values at the
latest. The second safeguard is to allow configuration only when the
umbrella configuration option TestingTorNetwork is set.
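A minimal sketch of those two safeguards, assuming a hypothetical
validation helper (Tor's actual option-checking code is organized
differently):

```python
DEFAULT_INITIAL_VOTING_INTERVAL = 30 * 60  # 30 minutes, in seconds

def validate_initial_voting_interval(seconds, testing_tor_network):
    """Hypothetical helper sketching the safeguards from section 1.1."""
    # Safeguard 2: only configurable under the umbrella option.
    if not testing_tor_network:
        raise ValueError("TestingV3AuthInitialVotingInterval may only "
                         "be changed in testing Tor networks")
    # Safeguard 1: must divide evenly into the 30-minute default, so
    # misconfigured authorities still meet at the defaults.
    if seconds <= 0 or DEFAULT_INITIAL_VOTING_INTERVAL % seconds != 0:
        raise ValueError("interval must divide evenly into 30 minutes")
    return seconds
```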
1.2. Immediately Provide Reachability Information (Running flag)
The default behavior of a directory authority is to provide the Running
flag only after the authority is available for at least 30 minutes. The
rationale is that before that time, an authority simply cannot deliver
useful information about other running nodes. But for private Tor
networks this may be different. This is currently implemented in the code
as:
/** If we've been around for less than this amount of time, our
* reachability information is not accurate. */
#define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60)
There should be another configuration option
TestingAuthDirTimeToLearnReachability with a default value of 30 minutes
that can be changed when running testing Tor networks, e.g. to 0 minutes.
The configuration value would simply replace the quoted constant. Again,
changing this option could be safeguarded by requiring the umbrella
configuration option TestingTorNetwork to be set.
1.3. Reduce Estimated Descriptor Propagation Time
Tor currently assumes that it takes up to 10 minutes until router
descriptors are propagated from the authorities to directory caches.
This is not very useful for private Tor networks, and we want to be able
to reduce this time, so that clients can download router descriptors in a
timely manner.
/** Clients don't download any descriptor this recent, since it will
* probably not have propagated to enough caches. */
#define ESTIMATED_PROPAGATION_TIME (10*60)
We suggest introducing a new config option,
TestingEstimatedDescriptorPropagationTime, which defaults to 10 minutes
but can be set to any lower non-negative value, e.g. 0 minutes. The
same safeguards as in 1.2 could be used here, too.
2. Umbrella Option for Setting Up Private Tor Networks
Setting up a private Tor network requires a number of specific settings
that are not required or useful when running Tor in the public Tor
network. Instead of writing down these options in a FAQ entry, there
should be a single configuration option, e.g. TestingTorNetwork, that
changes all required settings at once. Newer Tor versions would keep the
set of configuration options up-to-date. It should still remain possible
to manually overwrite the settings that the umbrella configuration option
affects.
The following configuration options are set by TestingTorNetwork:
- ServerDNSAllowBrokenResolvConf 1
Ignore the situation that private relays are not aware of any name
servers.
- DirAllowPrivateAddresses 1
Allow router descriptors containing private IP addresses.
- EnforceDistinctSubnets 0
Permit building circuits with relays in the same subnet.
- AssumeReachable 1
Omit self-testing for reachability.
- AuthDirMaxServersPerAddr 0
- AuthDirMaxServersPerAuthAddr 0
Permit an unlimited number of nodes on the same IP address.
- ClientDNSRejectInternalAddresses 0
Accept DNS responses that resolve to private IP addresses.
- ExitPolicyRejectPrivate 0
Allow exiting to private IP addresses. (This one is a matter of
taste---it might be dangerous to make this a default in a private
network, although people setting up private Tor networks should know
what they are doing.)
- V3AuthVotingInterval 5 minutes
- V3AuthVoteDelay 20 seconds
- V3AuthDistDelay 20 seconds
Accelerate voting schedule after first consensus has been reached.
- TestingV3AuthInitialVotingInterval 5 minutes
- TestingV3AuthInitialVoteDelay 20 seconds
- TestingV3AuthInitialDistDelay 20 seconds
Accelerate initial voting schedule until first consensus is reached.
- TestingAuthDirTimeToLearnReachability 0 minutes
Consider routers as Running from the start of running an authority.
- TestingEstimatedDescriptorPropagationTime 0 minutes
Clients try downloading router descriptors from directory caches,
even when they are not 10 minutes old.
In addition to changing the defaults for these configuration options,
TestingTorNetwork can only be set when a user has manually configured
DirServer lines.
Test:
The implementation of this proposal must pass the following tests:
1. Set TestingTorNetwork and see if dependent configuration options are
correctly changed.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
250-TestingTorNetwork=1
250 TestingAuthDirTimeToLearnReachability=0
QUIT
2. Set TestingTorNetwork and a dependent configuration value to see if
the provided value is used for the dependent option.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
TestingAuthDirTimeToLearnReachability 5
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
250-TestingTorNetwork=1
250 TestingAuthDirTimeToLearnReachability=5
QUIT
3. Start with TestingTorNetwork set and change a dependent configuration
option later on.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
SETCONF TestingAuthDirTimeToLearnReachability=5
GETCONF TestingAuthDirTimeToLearnReachability
250 TestingAuthDirTimeToLearnReachability=5
QUIT
4. Start with TestingTorNetwork set and a dependent configuration value,
and reset that dependent configuration value. The result should be
the testing-network specific default value.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
TestingAuthDirTimeToLearnReachability 5
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingAuthDirTimeToLearnReachability
250 TestingAuthDirTimeToLearnReachability=5
RESETCONF TestingAuthDirTimeToLearnReachability
GETCONF TestingAuthDirTimeToLearnReachability
250 TestingAuthDirTimeToLearnReachability=0
QUIT
5. Leave TestingTorNetwork unset and check if dependent configuration
options are left unchanged.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
250-TestingTorNetwork=0
250 TestingAuthDirTimeToLearnReachability=1800
QUIT
6. Leave TestingTorNetwork unset, but set dependent configuration option
which should fail.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
TestingAuthDirTimeToLearnReachability 0
[warn] Failed to parse/validate config:
TestingAuthDirTimeToLearnReachability may only be changed in testing
Tor networks!
7. Start with TestingTorNetwork unset and change dependent configuration
option later on which should fail.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
SETCONF TestingAuthDirTimeToLearnReachability=0
513 Unacceptable option value: TestingAuthDirTimeToLearnReachability
may only be changed in testing Tor networks!
8. Start with TestingTorNetwork unset and set it later on which should
fail.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
SETCONF TestingTorNetwork=1
553 Transition not allowed: While Tor is running, changing
TestingTorNetwork is not allowed.
9. Start with TestingTorNetwork set and unset it later on which should
fail.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
RESETCONF TestingTorNetwork
513 Unacceptable option value: TestingV3AuthInitialVotingInterval may
only be changed in testing Tor networks!
10. Set TestingTorNetwork, but do not provide an alternate DirServer
which should fail.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1
[warn] Failed to parse/validate config: TestingTorNetwork may only be
configured in combination with a non-default set of DirServers.
Filename: 136-legacy-keys.txt
Title: Mass authority migration with legacy keys
Author: Nick Mathewson
Created: 13-May-2008
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document describes a mechanism to change the keys of more than
half of the directory servers at once without breaking old clients
and caches immediately.
Motivation:
If a single authority's identity key is believed to be compromised,
the solution is obvious: remove that authority from the list,
generate a new certificate, and treat the new cert as belonging to a
new authority. This approach works fine so long as less than 1/2 of
the authority identity keys are bad.
Unfortunately, the mass-compromise case is possible if there is a
sufficiently bad bug in Tor or in any OS used by a majority of v3
authorities. Let's be prepared for it!
We could simply stop using the old keys and start using new ones,
and tell all clients running insecure versions to upgrade.
Unfortunately, this breaks our caching system pretty badly, since
caches won't cache a consensus that they don't believe in. It would
be nice to have everybody become secure the moment they upgrade to a
version listing the new authority keys, _without_ breaking upgraded
clients until the caches upgrade.
So, let's come up with a way to provide a time window where the
consensuses are signed with the new keys and with the old.
Design:
We allow directory authorities to list a single "legacy key"
fingerprint in their votes. Each authority may add a single legacy
key. The format for this line is:
legacy-dir-key FINGERPRINT
We describe a new consensus method for generating directory
consensuses. This method is consensus method "3".
When the authorities decide to use method "3" (as described in 3.4.1
of dir-spec.txt), for every included vote with a legacy-dir-key line,
the consensus includes an extra dir-source line. The fingerprint in
this extra line is the one from the legacy-dir-key line. The ports and
addresses are the same as in the regular dir-source line. The nickname
is as in the dir-source line, with the string "-legacy" appended.
[We need to include this new dir-source line because the code
won't accept or preserve signatures from authorities not listed
as contributing to the consensus.]
Authorities using legacy dir keys include two signatures on their
consensuses: one generated with a signing key signed with their real
signing key, and another generated with a signing key signed with
another signing key attested to by their identity key. These
signing keys MUST be different. Authorities MUST serve both
certificates if asked.
Process:
In the event of a mass key failure, we'll follow the following
(ugly) procedure:
- All affected authorities generate new certificates and identity
keys, and circulate their new dirserver lines. They copy their old
certificates and old broken keys, but put them in new "legacy
key files".
- At the earliest time that can be arranged, the authorities
replace their signing keys, identity keys, and certificates
with the new uncompromised versions, and update to the new list
of dirserver lines.
- They add a "V3DirAdvertiseLegacyKey 1" option to their torrc.
- Now, new consensuses will be generated using the new keys, but
the results will also be signed with the old keys.
- Clients and caches are told they need to upgrade, and given a
time window to do so.
- At the end of the time window, authorities remove the
V3DirAdvertiseLegacyKey option.
Notes:
It might be good to get caches to cache consensuses that they do not
believe in. I'm not sure of the best way to do this.
It's a superficially neat idea to have new signing keys and have
them signed by the new and by the old authority identity keys. This
breaks some code, though, and doesn't actually gain us anything,
since we'd still need to include each signature twice.
It's also a superficially neat idea, if identity keys and signing
keys are compromised, to at least replace all the signing keys.
I don't think this gains us anything either, though.
Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
Implemented-In: 0.2.1.x
1. Overview.
Tor has many steps to bootstrapping directory information and
initial circuits, but from the controller's perspective we just have
a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
slow connections or with connectivity problems can wait a long time
staring at the yellow onion, wondering if it will ever change color.
This proposal describes a new client status event so Tor can give
more details to the controller. Section 2 describes the changes to the
controller protocol; Section 3 describes Tor's internal bootstrapping
phases when everything is going correctly; Section 4 describes when
Tor detects a problem and issues a bootstrap warning; Section 5 covers
suggestions for how controllers should display the results.
2. Controller event syntax.
The generic status event is:
"650" SP StatusType SP StatusSeverity SP StatusAction
[SP StatusArguments] CRLF
So in this case we send
650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
PROGRESS=num TAG=Keyword SUMMARY=String \
[WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]
The arguments MAY appear in any order. Controllers MUST accept unrecognized
arguments.
"Progress" gives a number between 0 and 100 for how far through
the bootstrapping process we are. "Summary" is a string that can be
displayed to the user to describe the *next* task that Tor will tackle,
i.e., the task it is working on after sending the status event. "Tag"
is an optional string that controllers can use to recognize bootstrap
phases from Section 3, if they want to do something smarter than just
blindly displaying the summary string.
The severity describes whether this is a normal bootstrap phase
(severity notice) or an indication of a bootstrapping problem
(severity warn). If severity warn, it should also include a "warning"
argument string with any hints Tor has to offer about why it's having
troubles bootstrapping, a "reason" string that lists one of the reasons
allowed in the ORConn event, a "count" number that tells how many
bootstrap problems there have been so far at this phase, and a
"recommendation" keyword to indicate how the controller ought to react.
3. The bootstrap phases.
This section describes the various phases currently reported by
Tor. Controllers should not assume that the percentages and tags listed
here will continue to match up, or even that the tags will stay in
the same order. Some phases might also be skipped (not reported) if the
associated bootstrap step is already complete, or if the phase no longer
is necessary. Only "starting" and "done" are guaranteed to exist in all
future versions.
Current Tor versions enter these phases in order, monotonically;
future Tors MAY revisit earlier stages.
Phase 0:
tag=starting summary="starting"
Tor starts out in this phase.
Phase 5:
tag=conn_dir summary="Connecting to directory mirror"
Tor sends this event as soon as Tor has chosen a directory mirror ---
one of the authorities if bootstrapping for the first time or after
a long downtime, or one of the relays listed in its cached directory
information otherwise.
Tor will stay at this phase until it has successfully established
a TCP connection with some directory mirror. Problems in this phase
generally happen because Tor doesn't have a network connection, or
because the local firewall is dropping SYN packets.
Phase 10
tag=handshake_dir summary="Finishing handshake with directory mirror"
This event occurs when Tor establishes a TCP connection with a relay used
as a directory mirror (or its https proxy if it's using one). Tor remains
in this phase until the TLS handshake with the relay is finished.
Problems in this phase generally happen because Tor's firewall is
doing more sophisticated MITM attacks on it, or doing packet-level
keyword recognition of Tor's handshake.
Phase 15:
tag=onehop_create summary="Establishing one-hop circuit for dir info"
Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
to establish a one-hop circuit for retrieving directory information.
It will remain in this phase until it receives the CREATED_FAST cell
back, indicating that the circuit is ready.
Phase 20:
tag=requesting_status summary="Asking for networkstatus consensus"
Once we've finished our one-hop circuit, we will start a new stream
for fetching the networkstatus consensus. We'll stay in this phase
until we get the 'connected' relay cell back, indicating that we've
established a directory connection.
Phase 25:
tag=loading_status summary="Loading networkstatus consensus"
Once we've established a directory connection, we will start fetching
the networkstatus consensus document. This could take a while; this
phase is a good opportunity for using the "progress" keyword to indicate
partial progress.
This phase could stall if the directory mirror we picked doesn't
have a copy of the networkstatus consensus so we have to ask another,
or it does give us a copy but we don't find it valid.
Phase 40:
tag=loading_keys summary="Loading authority key certs"
Sometimes when we've finished loading the networkstatus consensus,
we find that we don't have all the authority key certificates for the
keys that signed the consensus. At that point we put the consensus we
fetched on hold and fetch the keys so we can verify the signatures.
Phase 45
tag=requesting_descriptors summary="Asking for relay descriptors"
Once we have a valid networkstatus consensus and we've checked all
its signatures, we start asking for relay descriptors. We stay in this
phase until we have received a 'connected' relay cell in response to
a request for descriptors.
Phase 50:
tag=loading_descriptors summary="Loading relay descriptors"
We will ask for relay descriptors from several different locations,
so this step will probably make up the bulk of the bootstrapping,
especially for users with slow connections. We stay in this phase until
we have descriptors for at least 1/4 of the usable relays listed in
the networkstatus consensus. This phase is also a good opportunity to
use the "progress" keyword to indicate partial steps.
Phase 80:
tag=conn_or summary="Connecting to entry guard"
Once we have a valid consensus and enough relay descriptors, we choose
some entry guards and start trying to build some circuits. This step
is similar to the "conn_dir" phase above; the only difference is
the context.
If a Tor starts with enough recent cached directory information,
its first bootstrap status event will be for the conn_or phase.
Phase 85:
tag=handshake_or summary="Finishing handshake with entry guard"
This phase is similar to the "handshake_dir" phase, but it gets reached
if we finish a TCP connection to a Tor relay and we have already reached
the "conn_or" phase. We'll stay in this phase until we complete a TLS
handshake with a Tor relay.
Phase 90:
tag=circuit_create summary="Establishing circuits"
Once we've finished our TLS handshake with an entry guard, we will
set about trying to make some 3-hop circuits in case we need them soon.
Phase 100:
tag=done summary="Done"
A full 3-hop circuit has been established. Tor is ready to handle
application connections now.
4. Bootstrap problem events.
When an OR Conn fails, we send a "bootstrap problem" status event, which
is like the standard bootstrap status event except with severity warn.
We include the same progress, tag, and summary values as we would for
a normal bootstrap event, but we also include "warning", "reason",
"count", and "recommendation" key/value combos.
The "reason" values are long-term-stable controller-facing tags to
identify particular issues in a bootstrapping step. The warning
strings, on the other hand, are human-readable. Controllers SHOULD
NOT rely on the format of any warning string. Currently the possible
values for "recommendation" are either "ignore" or "warn" -- if ignore,
the controller can accumulate the string in a pile of problems to show
the user if the user asks; if warn, the controller should alert the
user that Tor is pretty sure there's a bootstrapping problem.
Currently Tor uses recommendation=ignore for the first nine bootstrap
problem reports for a given phase, and then uses recommendation=warn
for subsequent problems at that phase. Hopefully this is a good
balance between tolerating occasional errors and reporting serious
problems quickly.
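The ignore-then-warn policy can be mirrored with a small per-phase
counter. This is an illustrative sketch of the described behavior, not
code from Tor or any controller:

```python
from collections import Counter

class BootstrapProblemTracker:
    """Recommendation=ignore for the first nine problems at a given
    phase, recommendation=warn for subsequent problems there."""
    IGNORE_LIMIT = 9

    def __init__(self):
        self.counts = Counter()  # problems seen so far, keyed by phase tag

    def recommendation(self, tag):
        self.counts[tag] += 1
        return "ignore" if self.counts[tag] <= self.IGNORE_LIMIT else "warn"
```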
5. Suggested controller behavior.
Controllers should start out with a yellow onion or the equivalent
("starting"), and then watch for either a bootstrap status event
(meaning the Tor they're using is sufficiently new to produce them,
and they should load up the progress bar or whatever they plan to use
to indicate progress) or a circuit_established status event (meaning
bootstrapping is finished).
In addition to a progress bar in the display, controllers should also
have some way to indicate progress even when no controller window is
open. For example, folks using Tor Browser Bundle in hostile Internet
cafes don't want a big splashy screen up. One way to let the user keep
informed of progress in a more subtle way is to change the task tray
icon and/or tooltip string as more bootstrap events come in.
Controllers should also have some mechanism to alert their user when
bootstrapping problems are reported. Perhaps we should gather a set of
help texts and the controller can send the user to the right anchor in a
"bootstrapping problems" page in the controller's help subsystem?
6. Getting up to speed when the controller connects.
There's a new "GETINFO /status/bootstrap-phase" option, which returns
the most recent bootstrap phase status event sent. Specifically,
it returns a string starting with either "NOTICE BOOTSTRAP ..." or
"WARN BOOTSTRAP ...".
Controllers should use this getinfo when they connect or attach to
Tor to learn its current state.
Filename: 138-remove-down-routers-from-consensus.txt
Title: Remove routers that are not Running from consensus documents
Author: Peter Palfrader
Created: 11-Jun-2008
Status: Closed
Implemented-In: 0.2.1.2-alpha
1. Overview.
Tor directory authorities hourly vote and agree on a consensus document
which lists all the routers on the network together with some of their
basic properties, like if a router is an exit node, whether it is
stable or whether it is a version 2 directory mirror.
One of the properties given with each router is the 'Running' flag.
Clients do not use routers that are not listed as running.
This proposal suggests that routers without the Running flag are not
listed at all.
2. Current status
At a typical bootstrap a client downloads a 140KB consensus, about
10KB of certificates to verify that consensus, and about 1.6MB of
server descriptors, about 1/4 of which it requires before it will
start building circuits.
Another proposal deals with how to get that huge 1.6MB fraction to
effectively zero (by downloading only individual descriptors, on
demand). Should that get successfully implemented that will leave the
140KB compressed consensus as a large fraction of what a client needs
to get in order to work.
About one third of the routers listed in a consensus are not running
and will therefore never be used by clients who use this consensus.
Not listing those routers will save about 30% to 40% in size.
3. Proposed change
Authority directory servers produce vote documents that include all
the servers they know about, running or not, like they currently
do. In addition, these vote documents state that the authority
supports a new consensus-forming method (method number 4).
If more than two thirds of votes that an authority has received claim
they support method 4 then this new method will be used: The
consensus document is formed like before but a new last step removes
all routers from the listing that are not marked as Running.
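That decision rule can be sketched as follows, under assumed data
shapes (each vote carries the set of consensus methods it supports,
each router entry carries its flags); none of these names come from
the Tor source:

```python
def form_consensus(votes, routers):
    # Count votes that claim support for consensus method 4.
    supporting = sum(1 for v in votes if 4 in v["methods"])
    # "More than two thirds" is a strict inequality.
    if 3 * supporting > 2 * len(votes):
        # New final step of method 4: drop routers not marked Running.
        routers = [r for r in routers if "Running" in r["flags"]]
    return routers
```

Note that exactly two thirds is not enough; the old method is used in
that case.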
Filename: 139-conditional-consensus-download.txt
Title: Download consensus documents only when it will be trusted
Author: Peter Palfrader
Created: 2008-04-13
Status: Closed
Implemented-In: 0.2.1.x
Overview:
Servers only provide consensus documents to clients when it is known that
the client will trust it.
Motivation:
When clients[1] want a new network status consensus they request it
from a Tor server using the URL path /tor/status-vote/current/consensus.
Then after downloading the client checks if this consensus can be
trusted. Whether the client trusts the consensus depends on the
authorities that the client trusts and how many of those
authorities signed the consensus document.
If the client cannot trust the consensus document, it is disregarded
and a new download is tried at a later time. Several hundred kilobytes
of server bandwidth are thus wasted on a single client's repeated
requests.
With hundreds of thousands of clients this will have undesirable
consequences when the list of authorities has changed so much that a
large number of established clients can no longer trust any consensus
document formed.
Objective:
The objective of this proposal is to make clients not download
consensuses they will not trust.
Proposal:
The list of authorities that are trusted by a client are encoded in
the URL they send to the directory server when requesting a consensus
document.
The directory server then only sends back the consensus when more than
half of the authorities listed in the request have signed the
consensus. If it is known that the consensus will not be trusted
a 404 error code is sent back to the client.
This proposal does not require directory caches to keep more than one
consensus document. This proposal also does not require authorities
to verify the signature on the consensus document of authorities they
do not recognize.
The new URL scheme to download a consensus is
/tor/status-vote/current/consensus/<F> where F is a list of
fingerprints, sorted in ascending order, and concatenated using a +
sign.
Fingerprints are uppercase hexadecimal encodings of the authority
identity key's digest. Servers should also accept requests that
use lower case or mixed case hexadecimal encodings.
A .z URL for compressed versions of the consensus will be provided
similarly to existing resources and is the URL that usually should
be used by clients.
Migration:
The old location of the consensus should continue to work
indefinitely. Not only is it used by old clients, but it is a useful
resource for automated tools that do not particularly care which
authorities have signed the consensus.
Authorities that are known to the client a priori by being shipped
with the Tor code are assumed to handle this format.
Clients downloading a consensus document from caches that do not support
this new format fall back to the old download location.
Caches support the new format starting with Tor version 0.2.1.1-alpha.
Anonymity Implications:
By supplying the list of authorities a client trusts to the directory
server we leak information (like likely version of Tor client) to the
directory server. In the current system we also leak that we are
very old - by re-downloading the consensus over and over again, but
only when we are so old that we no longer can trust the consensus.
Footnotes:
1. For the purpose of this proposal a client can be any Tor instance
that downloads a consensus document. This includes relays and
directory caches as well as end users.
Filename: 140-consensus-diffs.txt
Title: Provide diffs between consensuses
Author: Peter Palfrader
Created: 13-Jun-2008
Implemented-In: 0.3.1.1-alpha
Status: Closed
Ticket: https://bugs.torproject.org/13339
0. History
22-May-2009: Restricted the ed format even more strictly for ease of
implementation. -nickm
25-May-2014: Adapted to the new dir-spec version 3 and made the diff urls
backwards-compatible. -mvdan
1-Mar-2017: Update to new stats, note newer proposals, note flavors,
diffs, add parameters, restore diff-only URLs, say what "Digest"
means. -nickm
3-May-2017: Add a notion of "digest-as-signed" vs "full digest", since
otherwise the fact that there are multiple encodings of the same valid
consensus signatures would make clients identify which encodings they
had been given as they asked for diffs.
4-May-2017: Remove support for truncated digest prefixes.
1. Overview.
Tor clients and servers need a list of which relays are on the
network. This list, the consensus, is created by authorities
hourly and clients fetch a copy of it, with some delay, hourly.
This proposal suggests that clients download diffs of consensuses
once they have a consensus instead of hourly downloading a full
consensus.
This applies not only to ordinary directory consensuses, but also to
the newer microdescriptor consensuses added in the third version of the
directory specification.
2. Numbers
After implementing proposal 138, which removed nodes that are not
running from the list, a consensus document was about 92 kilobytes
in size after compression... back in 2008 when this proposal was first
written.
But now in March 2017, that figure is more like 625 kilobytes.
The diff between two consecutive consensuses, in ed format, is on
average 37 kilobytes compressed. So by making this change, we could
save something like 94% of our consensus download bandwidth.
3. Proposal
3.0. Preliminaries.
Unless otherwise specified, all digests in this document are SHA3-256
digests, encoded in base64. This document also uses "hash" as
synonymous with "digest".
A "full digest" of a consensus document covers the entire document,
from the "network-status-version" through the newline after the final
"-----END SIGNATURE-----".
A "digest as signed" of a consensus document covers the same part that
the signatures cover: the "network-status-version" through the space
immediately after the "directory-signature" keyword on the first
"directory-signature" line.
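The two digest variants can be sketched as below, using SHA3-256 and
base64 as specified above. The sample document in the test is a toy
stand-in, not a real consensus.

```python
# Minimal sketch of "full digest" vs "digest as signed"; not Tor's
# implementation.
import base64
import hashlib

SIG_KEYWORD = b"\ndirectory-signature "

def full_digest(doc: bytes) -> str:
    # Covers the whole document, through the newline after the final
    # "-----END SIGNATURE-----".
    return base64.b64encode(hashlib.sha3_256(doc).digest()).decode()

def digest_as_signed(doc: bytes) -> str:
    # Covers the same span the signatures do: through the space right
    # after the first "directory-signature" keyword.
    end = doc.index(SIG_KEYWORD) + len(SIG_KEYWORD)
    return base64.b64encode(hashlib.sha3_256(doc[:end]).digest()).decode()
```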
3.1 Clients
If a client has a consensus that is recent enough it SHOULD
try to download a diff to get the latest consensus rather than
fetching a full one.
[XXX: what is recent enough?
time delta in hours / size of compressed diff
1: 38177
2: 66955
3: 93502
4: 118959
5: 143450
6: 167136
12: 291354
18: 404008
24: 416663
30: 431240
36: 443858
42: 454849
48: 464677
54: 476716
60: 487755
66: 497502
72: 506421
Data suggests that for the first few hours, diffs are very useful,
saving at least 50% for the first 12 hours. After that, returns seem to
be more marginal. But note the savings from proposals like 274-276, which
make diffs smaller over a much longer timeframe. ]
3.2 Servers
Directory authorities and servers need to keep a number of old consensus
documents so they can build diffs (see section 5 below). They should
offer a diff to the most recent consensus at the following request:
HTTP/1.0 GET /tor/status-vote/current/consensus{-Flavor}/<FPRLIST>.z
X-Or-Diff-From-Consensus: HASH1 HASH2...
where the hashes are the digests-as-signed of the consensuses the client
currently has, and FPRLIST is a list of (abbreviated) fingerprints of
authorities the client trusts.
Servers will only return a consensus if more than half of the requested
authorities have signed the document. Otherwise, a 404 error will be sent
back.
The advantage of using the same URL that is currently used for
consensuses is that the client doesn't need to know whether a server
supports consensus diffs. If it doesn't, it will simply ignore the
extra header and return the full consensus.
If a server cannot offer a diff from one of the consensuses identified
by one of the hashes but has a current consensus it MUST return the
full consensus.
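A diff-capable client's request could be assembled as in the sketch
below; the digests and fingerprints are placeholders.

```python
# Sketch of building the HTTP/1.0 request with the optional
# X-Or-Diff-From-Consensus header; not Tor's implementation.

def build_diff_request(flavor, trusted_fprs, have_digests):
    name = "consensus" + ("-" + flavor if flavor else "")
    path = "/tor/status-vote/current/%s/%s.z" % (
        name, "+".join(sorted(f.upper() for f in trusted_fprs)))
    return "\r\n".join([
        "GET %s HTTP/1.0" % path,
        # Digests-as-signed of consensuses the client already holds.
        "X-Or-Diff-From-Consensus: " + " ".join(have_digests),
        "", ""])
```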
[XXX: what should we do when the client already has the latest
consensus? I can think of the following options:
- send back 3xx not modified
- send back 200 ok and an empty diff
- send back 404 nothing newer here.
I currently lean towards the empty diff.]
Additionally, a specific diff for a given consensus digest-as-signed
should be available at a URL of the form:
/tor/status-vote/current/consensus{-Flavor}/diff/<HASH>/<FPRLIST>.z
This differs from the previous request type in that it should never
return a whole consensus: if a diff is not available, it should return
404.
4. Diff Format
Diffs start with the token "network-status-diff-version" followed by a
space and the version number, currently "1".
If a document does not start with network-status-diff it is assumed
to be a full consensus download and would therefore currently start
with "network-status-version 3".
Following the network-status-diff line is another header line,
starting with the token "hash" followed by the digest-as-signed of the
consensus that this diff applies to, and the full digest that the
resulting consensus should have.
Following the network-status-diff header lines is a diff, or patch, in
limited ed format. We choose this format because it is easy to create
and process with standard tools (patch, diff -e, ed). This will help
us in developing and testing this proposal and it should make future
debugging easier.
[ If at one point in the future we decide that the space benefits from
a custom diff format outweighs these benefits we can always
introduce a new diff format and offer it at for instance
../diff2/... ]
We support the following ed commands, each on a line by itself:
- "<n1>d" Delete line n1
- "<n1>,<n2>d" Delete lines n1 through n2, inclusive
- "<n1>,$d" Delete line n1 through the end of the file, inclusive.
- "<n1>c" Replace line n1 with the following block
- "<n1>,<n2>c" Replace lines n1 through n2, inclusive, with the
following block.
- "<n1>a" Append the following block after line n1.
- "a" Append the following block after the current line.
Note that line numbers always apply to the file after all previous
commands have already been applied. Note also that line numbers
are 1-indexed.
The commands MUST apply to the file from back to front, such that
lines are only ever referred to by their position in the original
file.
If there are any directory signatures on the original document, the
first command MUST be a "<n1>,$d" form to remove all of the directory
signatures. Using this format ensures that the client will
successfully apply the diff even if they have an unusual encoding for
the signatures.
The "current line" is either the first line of the file, if this is
the first command, the last line of a block we added in an append or
change command, or the line immediately following a set of lines we just
deleted (or the last line of the file if there are no lines after
that).
The replace and append commands take blocks. These blocks are simply
appended to the diff after the line with the command. A line with
just a period (".") ends the block (and is not part of the lines
to add). Note that it is impossible to insert a line with just
a single dot.
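The limited ed format above can be applied with a short patcher like
the following sketch (not Tor's implementation; error handling is
minimal). Files are modeled as lists of lines without trailing
newlines.

```python
# Illustrative patcher for the limited ed format: supports the
# d/c/a command forms listed above, applied in the given order
# (which the spec requires to run back to front through the file).
import re

CMD = re.compile(r"^(?:(\d+)(?:,(\d+|\$))?)?([acd])$")

def apply_ed_diff(orig, diff):
    out = list(orig)
    cur = 1          # the "current line", 1-indexed
    i = 0
    while i < len(diff):
        m = CMD.match(diff[i])
        if not m:
            raise ValueError("bad diff command: %r" % diff[i])
        i += 1
        n1 = int(m.group(1)) if m.group(1) else None
        n2 = len(out) if m.group(2) == "$" else (
            int(m.group(2)) if m.group(2) else n1)
        op = m.group(3)
        block = []
        if op in "ac":                 # read the block up to the lone "."
            while diff[i] != ".":
                block.append(diff[i])
                i += 1
            i += 1
        if op == "d":
            del out[n1 - 1:n2]
            cur = min(n1, len(out)) if out else 1
        elif op == "c":
            out[n1 - 1:n2] = block
            cur = n1 - 1 + len(block)
        else:                          # "a": bare form uses the current line
            at = n1 if n1 is not None else cur
            out[at:at] = block
            cur = at + len(block)
    return out
```

For the concatenated diffs of section 4.1, the same function would
simply be applied once per diff, first to last.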
4.1. Concatenating multiple diffs
Directory caches may, at their discretion, return the concatenation of
multiple diffs using the format above. Such diffs are to be applied from
first to last. This allows the caches to cache a smaller number of
compressed diffs, at the expense of some loss in bandwidth efficiency.
5. Networkstatus parameters
The following parameters govern how relays and clients use this protocol.
min-consensuses-age-to-cache-for-diff
(min 0, max 744, default 6)
max-consensuses-age-to-cache-for-diff
(min 0, max 8192, default 72)
These two parameters determine how much consensus history (in
hours) relays should try to cache in order to serve diffs.
try-diff-for-consensus-newer-than
(min 0, max 8192, default 72)
This parameter determines how old a consensus can be (in hours)
before a client should no longer try to find a diff for it.
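A client's use of the last parameter can be sketched as follows; the
clamp mirrors the min/max bounds listed above, and the parameter name
is taken verbatim from this section.

```python
# Sketch of the client-side decision: try a diff only if the held
# consensus is newer than the (clamped) parameter value, in hours.

def should_try_diff(consensus_age_hours, params):
    limit = params.get("try-diff-for-consensus-newer-than", 72)
    limit = max(0, min(limit, 8192))   # clamp to the allowed range
    return consensus_age_hours < limit
```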
Filename: 141-jit-sd-downloads.txt
Title: Download server descriptors on demand
Author: Peter Palfrader
Created: 15-Jun-2008
Status: Obsolete
1. Overview
Downloading all server descriptors is the most expensive part
of bootstrapping a Tor client. These server descriptors currently
amount to about 1.5 Megabytes of data, and this size will grow
linearly with network size.
Fetching all these server descriptors takes a long while for people
behind slow network connections. It is also a considerable load on
our network of directory mirrors.
This document describes proposed changes to the Tor network and
directory protocol so that clients will no longer need to download
all server descriptors.
These changes consist of moving load balancing information into
network status documents, implementing a means to download server
descriptors on demand in an anonymity-preserving way, and dealing
with exit node selection.
2. What is in a server descriptor
When a Tor client starts the first thing it will try to get is a
current network status document: a consensus signed by a majority
of directory authorities. This document is currently about 100
Kilobytes in size, though it will grow linearly with network size.
This document lists all servers currently running on the network.
The Tor client will then try to get a server descriptor for each
of the running servers. All server descriptors currently amount
to about 1.5 Megabytes of downloads.
A Tor client learns several things about a server from its descriptor.
Some of these it already learned from the network status document
published by the authorities, but the server descriptor contains it
again in a single statement signed by the server itself, not just by
the directory authorities.
Tor clients use the information from server descriptors for
different purposes, which are considered in the following sections.
2.1 Load balancing
The Tor load balancing mechanism is quite complex in its details, but
it has a simple goal: The more traffic a server can handle the more
traffic it should get. That means the more traffic a server can
handle the more likely a client will use it.
For this purpose each server descriptor has bandwidth information
which tries to convey a server's capacity to clients.
Currently we weigh servers differently for different purposes. There
is a weight for when we use a server as a guard node (our entry to the
Tor network), there is one weight we assign servers for exit duties,
and a third for when we need intermediate (middle) nodes.
2.2 Exit information
When a Tor client wants to exit to some resource on the internet it will
build a circuit to an exit node that allows access to that resource's
IP address and TCP Port.
When building that circuit the client can make sure that the circuit
ends at a server that will be able to fulfill the request because the
client already learned of all the servers' exit policies from their
descriptors.
2.3 Capability information
Server descriptors contain information about the specific version of
the Tor protocol they understand [proposal 105].
Furthermore the server descriptor also contains the exact version of
the Tor software that the server is running and some decisions are
made based on the server version number (for instance a Tor client
will only make conditional consensus requests [proposal 139] when
talking to Tor servers version 0.2.1.1-alpha or later).
2.4 Contact/key information
A server descriptor lists a server's IP address and TCP ports on which
it accepts onion and directory connections. Furthermore it contains
the onion key (a short lived RSA key to which clients encrypt CREATE
cells).
2.5 Identity information
A Tor client learns the digest of a server's key from the network
status document. Once it has a server descriptor this descriptor
contains the full RSA identity key of the server. Clients verify
that 1) the digest of the identity key matches the expected digest
it got from the consensus, and 2) that the signature on the descriptor
from that key is valid.
3. No longer require clients to have copies of all SDs
3.1 Load balancing info in consensus documents
One of the reasons why clients download all server descriptors is for
doing proper load balancing as described in 2.1. In order for
clients to not require all server descriptors this information will
have to move into the network status document.
Consensus documents will have a new line per router similar
to the "r", "s", and "v" lines that already exist. This line
will convey weight information to clients.
"w Bandwidth=193"
The bandwidth number is the lesser of observed bandwidth and bandwidth
rate limit from the server descriptor that the "r" line referenced by
digest (1st and 3rd field of the bandwidth line in the descriptor).
It is given in kilobytes per second so the byte value in the
descriptor has to be divided by 1024 (and is then truncated, i.e.
rounded down).
Authorities will cap the bandwidth number at some arbitrary value,
currently 10MB/sec. If a router claims a larger bandwidth an
authority's vote will still only show Bandwidth=10240.
The consensus value for bandwidth is the median of all bandwidth
numbers given in votes. In case of an even number of votes we use
the lower median. (Using this procedure allows us to change the
cap value more easily.)
Clients should believe the bandwidth as presented in the consensus,
not capping it again.
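The computation above can be sketched as follows: the per-vote value
derived from a descriptor's bandwidth line, then the (lower) median
across the votes.

```python
# Sketch of the "w Bandwidth=" value; not Tor's implementation.

def vote_bandwidth(observed_bytes, rate_limit_bytes, cap_kb=10240):
    kb = min(observed_bytes, rate_limit_bytes) // 1024  # truncate
    return min(kb, cap_kb)                              # authority cap

def consensus_bandwidth(vote_values):
    vals = sorted(vote_values)
    return vals[(len(vals) - 1) // 2]  # lower median for an even count
```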
3.2 Fetching descriptors on demand
As described in 2.4 a descriptor lists IP address, OR- and Dir-Port,
and the onion key for a server.
A client already knows the IP address and the ports from the consensus
documents, but without the onion key it will not be able to send
CREATE/EXTEND cells for that server. Since the client needs the onion
key it needs the descriptor.
If a client only downloaded a few descriptors in an observable manner
then that would leak which nodes it was going to use.
This proposal suggests the following:
1) when connecting to a guard node for which the client does not
yet have a cached descriptor it requests the descriptor it
expects by hash. (The consensus document that the client holds
has a hash for the descriptor of this server. We want exactly
that descriptor, not a different one.)
It does that by sending a RELAY_REQUEST_SD cell.
A client MAY cache the descriptor of the guard node so that it does
not need to request it every single time it contacts the guard.
2) when a client wants to extend a circuit that currently ends in
server B to a new next server C, the client will send a
RELAY_REQUEST_SD cell to server B. This cell contains in its
payload the hash of a server descriptor the client would like
to obtain (C's server descriptor). The server sends back the
descriptor and the client can now form a valid EXTEND/CREATE cell
encrypted to C's onion key.
Clients MUST NOT cache such descriptors. If they did they might
leak that they already extended to that server at least once
before.
Replies to RELAY_REQUEST_SD requests need to be padded to some
constant upper limit in order to conceal a client's destination
from anybody who might be counting cells/bytes.
RELAY_REQUEST_SD cells contain the following information:
- hash of the server descriptor requested
- hash of the identity digest of the server for which we want the SD
- IP address and OR-port of the server for which we want the SD
- padding factor - the number of cells we want the answer
padded to.
[XXX this just occurred to me and it might be smart. or it might
be stupid. clients would learn the padding factor they want
to use from the consensus document. This allows us to grow
the replies later on should SDs become larger.]
[XXX: figure out a decent padding size]
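One hypothetical encoding of that payload is sketched below. The
proposal lists the fields but not their widths, so the sizes used here
(20-byte digests, an IPv4 address, a 2-byte port, a 1-byte padding
factor) are assumptions for illustration only.

```python
# Hypothetical RELAY_REQUEST_SD payload packing; field widths are
# assumptions, not part of the proposal.
import socket
import struct

def pack_relay_request_sd(sd_digest, identity_digest, ip, or_port,
                          padding_factor):
    assert len(sd_digest) == 20 and len(identity_digest) == 20
    return (sd_digest                     # hash of the requested SD
            + identity_digest            # identity digest of the server
            + socket.inet_aton(ip)       # IPv4 address
            + struct.pack("!HB", or_port, padding_factor))
```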
3.3 Protocol versions
Server descriptors contain optional information of supported
link-level and circuit-level protocols in the form of
"opt protocols Link 1 2 Circuit 1". These are not currently needed
and will probably eventually move into the "v" (version) line in
the consensus. This proposal does not deal with them.
Similarly a server descriptor contains the version number of
a Tor node. This information is already present in the consensus
and is thus available to all clients immediately.
3.4 Exit selection
Currently finding an appropriate exit node for a user's request is
easy for a client because it has complete knowledge of all the exit
policies of all servers on the network.
The consensus document will once again be extended to contain the
information required by clients. This information will be a summary
of each node's exit policy. The exit policy summary will only contain
the list of ports to which a node exits to most destination IP
addresses.
A summary should claim a router exits to a specific TCP port if,
ignoring private IP addresses, the exit policy indicates that the
router would exit to this port for most IP addresses, i.e. all but at
most 2^25 addresses (either two /8 netblocks, or one /8 and a couple
of /12s, or any other combination).
The exact algorithm used is this: going through all exit policy items,
- ignore any accept that is not for all IP addresses ("*"),
- ignore rejects for these netblocks (exactly, no subnetting):
0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8,
and 172.16.0.0/12,
- for each reject count the number of IP addresses rejected against
the affected ports,
- once we hit an accept for all IP addresses ("*") add the ports in
that policy item to the list of accepted ports, if they don't have
more than 2^25 IP addresses (that's two /8 networks) counted
against them (i.e. if the router exits to a port to everywhere but
at most two /8 networks).
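The algorithm can be modeled as in the sketch below. Each policy item
is represented as (action, ip_spec, n_addresses, ports); this
simplified representation is an assumption for the sketch, not Tor's
internal one.

```python
# Illustrative model of the port-summary algorithm above.
PRIVATE_NETS = {"0.0.0.0/8", "169.254.0.0/16", "127.0.0.0/8",
                "192.168.0.0/16", "10.0.0.0/8", "172.16.0.0/12"}

def summarize_exit_ports(policy):
    rejected = {}        # port -> number of addresses counted against it
    accepted = set()
    for action, ip_spec, n_addrs, ports in policy:
        if action == "accept":
            if ip_spec != "*":
                continue                   # ignore non-wildcard accepts
            for port in ports:
                if rejected.get(port, 0) <= 2 ** 25:
                    accepted.add(port)     # exits to "most" addresses
        else:
            if ip_spec in PRIVATE_NETS:
                continue                   # ignore private-net rejects
            for port in ports:
                rejected[port] = rejected.get(port, 0) + n_addrs
    return sorted(accepted)
```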
An exit policy summary will be included in votes and consensus as a
new line attached to each exit node. The line will have the format
"p" <space> "accept"|"reject" <portlist>
where portlist is a comma-separated list of single port numbers or
portranges (e.g. "22,80-88,1024-6000,6667").
Whether the summary shows the list of accepted ports or the list of
rejected ports depends on which list is shorter (has a shorter string
representation). In case of ties we choose the list of accepted
ports. As an exception to this rule an allow-all policy is
represented as "accept 1-65535" instead of "reject " and a reject-all
policy is similarly given as "reject 1-65535".
Summary items are compressed, that is instead of "80-88,89-100" there
only is a single item of "80-100", similarly instead of "20,21" a
summary will say "20-21".
Port lists are sorted in ascending order.
The maximum allowed length of a policy summary (including the "accept "
or "reject ") is 1000 characters. If a summary exceeds that length we
use an accept-style summary and list as much of the port list as is
possible within these 1000 bytes.
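Producing the summary string can be sketched as below: merge adjacent
ports into ranges, then emit whichever of the accept/reject forms is
shorter, with ties and the all/none special cases handled as described
above. The 1000-character truncation step is omitted for brevity.

```python
# Sketch of "p" line formatting; not Tor's implementation.

def compress_ports(ports):
    ranges = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1][1] + 1:
            ranges[-1][1] = port           # extend the current range
        else:
            ranges.append([port, port])
    return ",".join(str(lo) if lo == hi else "%d-%d" % (lo, hi)
                    for lo, hi in ranges)

def policy_summary(accepted_ports):
    accepted = set(accepted_ports)
    rejected = set(range(1, 65536)) - accepted
    if not rejected:
        return "accept 1-65535"
    if not accepted:
        return "reject 1-65535"
    acc = "accept " + compress_ports(accepted)
    rej = "reject " + compress_ports(rejected)
    return acc if len(acc) <= len(rej) else rej   # ties go to accept
```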
3.4.1 Consensus selection
When building a consensus, authorities have to agree on a digest of
the server descriptor to list in the router line for each router.
This is documented in dir-spec section 3.4.
All authorities that listed the agreed-upon descriptor digest in
their vote should also list the same exit policy summary - or list
none at all if the authority has not been upgraded to list that
information in its vote.
If we have votes with a matching server descriptor digest of which at
least one has an exit policy summary then we distinguish two cases:
a) all authorities agree (or abstained) on the policy summary, and we
use the exit policy summary that they all listed in their vote,
b) something went wrong (or some authority is playing foul) and we
have different policy summaries. In that case we pick the one
that is most commonly listed in votes with the matching
descriptor. We break ties in favour of the lexicographically larger
vote.
If none of the votes with a matching server descriptor digest has
an exit policy summary we use the most commonly listed one in all
votes, breaking ties like in case b above.
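The tie-break rule can be sketched as follows: take the most commonly
listed summary, preferring the lexicographically larger one on ties.

```python
# Sketch of the case (b) tie-break; not Tor's implementation.
from collections import Counter

def pick_summary(listed_summaries):
    counts = Counter(s for s in listed_summaries if s is not None)
    return max(counts, key=lambda s: (counts[s], s))
```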
3.4.2 Client behaviour
When choosing an exit node for a specific request a Tor client will
choose from the list of nodes that exit to the requested port as given
by the consensus document. If a client has additional knowledge (like
cached full descriptors) that indicates the chosen exit node will
reject the request then it MAY use that knowledge (or not include such
nodes in the selection to begin with). However, clients MUST NOT use
nodes that do not list the port as accepted in the summary (but for
which they know that the node would exit to that address from other
sources, like a cached descriptor).
An exception to this is exit enclave behaviour: A client MAY use the
node at a specific IP address to exit to any port on the same address
even if that node is not listed as exiting to the port in the summary.
4. Migration
4.1 Consensus document changes.
The consensus will need to include
- bandwidth information (see 3.1)
- exit policy summaries (3.4)
A new consensus method (number TBD) will be chosen for this.
5. Future possibilities
This proposal still requires that all servers have the descriptors of
every other node in the network in order to answer RELAY_REQUEST_SD
cells. These cells are sent when a circuit is extended from ending at
node B to a new node C. In that case B would have to answer a
RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).
In order to answer that request B obviously needs a copy of C's server
descriptor. The RELAY_REQUEST_SD cell already has all the info that
B needs to contact C so it can ask about the descriptor before passing it
back to the client.
Filename: 142-combine-intro-and-rend-points.txt
Title: Combine Introduction and Rendezvous Points
Author: Karsten Loesing, Christian Wilms
Created: 27-Jun-2008
Status: Dead
Change history:
27-Jun-2008 Initial proposal for or-dev
04-Jul-2008 Give first security property the new name "Responsibility"
and change new cell formats according to rendezvous protocol
version 3 draft.
19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of
circuits between multiple clients is not supported by Tor.
Overview:
Establishing a connection to a hidden service currently involves two Tor
relays, introduction and rendezvous point, and 10 more relays distributed
over four circuits to connect to them. The introduction point is
established in the mid-term by a hidden service to transfer introduction
requests from client to the hidden service. The rendezvous point is set
up by the client for a single hidden service request and actually
transfers end-to-end encrypted application data between client and hidden
service.
There are some reasons for separating the two roles of introduction and
rendezvous point: (1) Responsibility: A relay shall not be made
responsible that it relays data for a certain hidden service; in the
original design as described in [1] an introduction point relays no
application data, and a rendezvous point neither knows the hidden
service nor can it decrypt the data. (2) Scalability: The hidden service
shall not have to maintain a number of open circuits proportional to the
expected number of client requests. (3) Attack resistance: The effect of
an attack on the only visible parts of a hidden service, its introduction
points, shall be as small as possible.
However, elimination of a separate rendezvous connection as proposed by
Øverlier and Syverson [2] is the most promising approach to improve the
delay in connection establishment. Of all the substeps of connection
establishment, extending a circuit by only a single hop is responsible for
a major part of the delay. Reducing on-demand circuit extensions from two to
one results in a decrease of mean connection establishment times from 39
to 29 seconds [3]. Particularly, eliminating the delay on hidden-service
side allows the client to better observe progress of connection
establishment, thus allowing it to use smaller timeouts. Proposal 114
introduced new introduction keys for introduction points and provides for
user authorization data in hidden service descriptors; it will be shown
in this proposal that introduction keys in combination with new
introduction cookies provide for the first security property
responsibility. Further, eliminating the need for a separate introduction
connection benefits the overall network load by decreasing the number of
circuit extensions. After all, having only one connection between client
and hidden service reduces the overall protocol complexity.
Design:
1. Hidden Service Configuration
Hidden services should be able to choose whether they would like to use
this protocol. This might be opt-in for 0.2.1.x and opt-out for later
major releases.
2. Contact Point Establishment
When preparing a hidden service, a Tor client selects a set of relays to
act as contact points instead of introduction points. The contact point
combines both roles of introduction and rendezvous point as proposed in
[2]. The only requirement for a relay to be picked as contact point is
its capability of performing this role. This can be determined from the
Tor version number, which needs to be equal to or higher than the first
version that implements this proposal.
The easiest way to implement establishment of contact points is to
introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes
version 2 ESTABLISH_INTRO cells as requests to establish a contact point
rather than an introduction point.
V Format byte: set to 255 [1 octet]
V Version byte: set to 2 [1 octet]
KLEN Key length [2 octets]
PK Public introduction key [KLEN octets]
HS Hash of session info [20 octets]
SIG Signature of above information [variable]
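The cell body laid out above could be packed as in the following
hypothetical sketch; the key and signature bytes used in the test are
placeholders for a real encoded introduction key and a matching
signature.

```python
# Hypothetical packing of the v2 ESTABLISH_INTRO body; not Tor's
# implementation.
import struct

def pack_establish_intro_v2(intro_key, session_hash, signature):
    assert len(session_hash) == 20
    return (bytes([255, 2])                      # format and version bytes
            + struct.pack("!H", len(intro_key))  # KLEN, 2 octets
            + intro_key + session_hash + signature)
```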
The hidden service does not create a fixed number of contact points, like
3 in the current protocol. It uses a minimum of 3 contact points, but
increases this number depending on the history of client requests within
the last hour. The hidden service also increases this number depending on
the frequency of failing contact points in order to defend against
attacks on its contact points. When client authorization as described in
proposal 121 is used, a hidden service can also use the number of
authorized clients as first estimate for the required number of contact
points.
3. Hidden Service Descriptor Creation
A hidden service needs to issue a fresh introduction cookie for each
established introduction point. By requiring clients to use this cookie
in a later connection establishment, an introduction point cannot access
the hidden service that it works for. Together with the fresh
introduction key that was introduced in proposal 114, this reduces
responsibility of a contact point for a specific hidden service.
The v2 hidden service descriptor format contains an
"intro-authentication" field that may contain introduction-point specific
keys. The hidden service creates a random string, comparable to the
rendezvous cookie, and includes it in the descriptor as introduction
cookie for auth-type "1". By convention, clients recognize the existence
of auth-type 1 as the possibility to connect to a hidden service via a
contact point rather than an introduction point. Older clients that do not
understand this new protocol simply ignore that cookie.
4. Connection Establishment
When establishing a connection to a hidden service a client learns about
the capability of using the new protocol from the hidden service
descriptor. It may choose whether to use this new protocol or not,
whereas older clients cannot understand the new capability and can only
use the current protocol. Clients using version 0.2.1.x should be able to
opt-in for using the new protocol, which should change to opt-out for
later major releases.
When using the new capability the client creates a v2 INTRODUCE1 cell
that extends an unversioned INTRODUCE1 cell by adding the content of an
ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the
new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point,
because unversioned and versioned INTRODUCE1 cells are indistinguishable:
Cleartext
V Version byte: set to 2 [1 octet]
PK_ID Identifier for Bob's PK [20 octets]
RC Rendezvous cookie [20 octets]
Encrypted to introduction key:
VER Version byte: set to 3. [1 octet]
AUTHT The auth type that is supported [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
RC Rendezvous cookie [20 octets]
g^x Diffie-Hellman data, part 1 [128 octets]
The cleartext part contains the rendezvous cookie that the contact point
remembers just as a rendezvous point would do.
The encrypted part contains the introduction cookie as auth data for the
auth type 1. The rendezvous cookie is contained as before, but there is
no further rendezvous point information, as there is no separate
rendezvous point.
5. Rendezvous Establishment
The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a
request to be used in the new protocol. It remembers the contained
rendezvous cookie, replies to the client with an INTRODUCE_ACK cell
(omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted
part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service.
6. Introduction at Hidden Service
The hidden service recognizes an INTRODUCE2 cell containing an
introduction cookie as authorization data. In this case, it does not
extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell
directly back to its contact point as usual.
7. Rendezvous at Contact Point
The contact point processes a RENDEZVOUS1 cell just as a rendezvous point
does. The only difference is that the hidden-service-side circuit is not
exclusive for the client connection, but shared among multiple client
connections.
[Tor does not allow sharing of a single circuit among multiple client
connections easily. We need to think about a smart and efficient way to
implement this. Comment by Nick. -KL]
Security Implications:
(1) Responsibility
One of the original reasons for the separation of introduction and
rendezvous points is that a relay shall not be made responsible that it
relays data for a certain hidden service. In the current design an
introduction point relays no application data and a rendezvous point
neither knows the hidden service nor can it decrypt the data.
This property is also fulfilled in this new design. A contact point only
learns a fresh introduction key instead of the hidden service key, so
that it cannot recognize a hidden service. Further, the introduction
cookie, which is unknown to the contact point, prevents it from accessing
the hidden service itself. The only way for a contact point to access a
hidden service is to look up whether it is contained in the descriptors
of known hidden services. A contact point cannot directly be held
responsible for the hidden service on whose behalf it is working. In
addition, it cannot learn the data that it transfers, because all
communication between client and hidden service is end-to-end encrypted.
(2) Scalability
Another goal of the existing hidden service protocol is that a hidden
service does not have to maintain a number of open circuits proportional
to the expected number of client requests. The rationale behind this is
better scalability.
The new protocol eliminates the need for a hidden service to extend
circuits on demand, which has a positive effect on circuit establishment
times and overall network load. Establishing a number of contact points
proportional to the history of connection requests, as presented here,
reduces the number of circuits to the minimum that fits the hidden
service's needs.
(3) Attack resistance
The third goal of separating introduction and rendezvous points is to
limit the effect of an attack on the only visible parts of a hidden
service which are the contact points in this protocol.
In theory, the new protocol is more vulnerable to this attack. An
attacker who can take down a contact point does not only eliminate an
access point to the hidden service, but also breaks current client
connections to the hidden service using that contact point.
Øverlier and Syverson proposed the concept of valet nodes as additional
safeguard for introduction/contact points [4]. Unfortunately, this
increases hidden service protocol complexity conceptually and from an
implementation point of view. Therefore, it is not included in this
proposal.
However, in practice attacking a contact point (or introduction point) is
not as rewarding as it might appear. The cost for a hidden service to set
up a new contact point and publish a new hidden service descriptor is
minimal compared to the efforts necessary for an attacker to take a Tor
relay down. As a countermeasure to further frustrate this attack, the
hidden service raises the number of contact points as a function of
previous contact point failures.
Further, the probability of breaking client connections due to attacking
a contact point is minimal. It can be assumed that the probability of one
of the other five involved relays in a hidden service connection failing
or being shut down is higher than that of a successful attack on a
contact point.
(4) Resistance against Locating Attacks
Clients are no longer able to force a hidden service to create or extend
circuits. This further reduces an attacker's capabilities of locating a
hidden server as described by Øverlier and Syverson [5].
Compatibility:
The presented protocol does not raise compatibility issues with current
Tor versions. New relay versions support both the existing and the
proposed protocols as introduction/rendezvous/contact points; a contact
point acts as an introduction point simultaneously. Hidden services and
clients can opt in to using the new protocol, which might change to
opt-out some time in the future.
References:
[1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The
Second-Generation Onion Router. In the Proceedings of the 13th USENIX
Security Symposium, August 2004.
[2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity
of Tor Circuit Establishment and Hidden Services. In the Proceedings of
the Seventh Workshop on Privacy Enhancing Technologies (PET 2007),
Ottawa, Canada, June 2007.
[3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at
Better Performance, diploma thesis, June 2008, University of Bamberg.
[4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden
Servers with a Personal Touch. In the Proceedings of the Sixth Workshop
on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006.
[5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the
Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.
Filename: 143-distributed-storage-improvements.txt
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 28-Jun-2008
Status: Superseded
Change history:
28-Jun-2008 Initial proposal for or-dev
Overview:
An evaluation of the distributed storage for Tor hidden service
descriptors and subsequent discussions have brought up a few improvements
to proposal 114. All improvements are backwards compatible with the
implementation of proposal 114.
Design:
1. Report Bad Directory Nodes
Bad hidden service directory nodes could deny the existence of previously
stored descriptors. A bad directory node that does this with all stored
descriptors causes harm to the distributed storage in general, but
replication will cope with this problem in most cases. However, an
adversary that attempts to make a specific hidden service unavailable by
running relays that become responsible for all of a service's
descriptors poses a more serious threat. The distributed storage needs to
defend against this attack by detecting and removing bad directory nodes.
As a countermeasure, hidden services try to download their descriptors
every hour, at random times, from the hidden service directories that are
responsible for storing them. If a directory node replies with 404 (Not
found), the hidden service reports the supposedly bad directory node to
a random selection of half of the directory authorities (with version
numbers equal to or higher than the first version that implements this
proposal). The hidden service posts a complaint message using HTTP 'POST'
to a URL "/tor/rendezvous/complain" with the following message format:
"hidden-service-directory-complaint" identifier NL
[At start, exactly once]
The identifier of the hidden service directory node to be
investigated.
"rendezvous-service-descriptor" descriptor NL
[At end, exactly once]
The hidden service descriptor that the supposedly bad directory node
does not serve.
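For illustration, the complaint body above might be assembled like this.
This is a hedged sketch: the function name and placeholder values are
assumptions, not part of the proposal.

```python
def build_complaint(directory_id, descriptor):
    """Assemble a complaint body in the format specified above.

    `directory_id` and `descriptor` are placeholders for the directory
    node's identifier and the full descriptor text.
    """
    return (
        "hidden-service-directory-complaint %s\n" % directory_id
        + "rendezvous-service-descriptor %s\n" % descriptor
    )

# The hidden service would POST this body to
# http://<authority>/tor/rendezvous/complain
body = build_complaint("IDENTIFIER", "DESCRIPTOR-TEXT")
```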
The directory authority checks that the descriptor is valid and that the
hidden service directory is responsible for storing it. It waits for a random time
of up to 30 minutes before posting the descriptor to the hidden service
directory. If the publication is acknowledged, the directory authority
waits another random time of up to 30 minutes before attempting to
request the descriptor that it has posted. If the directory node replies
with 404 (Not found), it will be blacklisted for being a hidden service
directory node for the next 48 hours.
A blacklisted hidden service directory is assigned the new flag BadHSDir
instead of the HSDir flag in the vote that a directory authority creates.
In a consensus, a relay is only assigned an HSDir flag if the majority of
votes contains an HSDir flag and no more than one third of votes contains
a BadHSDir flag. As a result, clients do not have to learn about the
BadHSDir flag. A blacklisted directory node will simply not be assigned
the HSDir flag in the consensus.
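The flag-assignment rule above can be sketched as follows. The
representation of votes as per-authority flag sets is hypothetical; real
authorities operate on full vote documents.

```python
def assign_hsdir_flag(votes):
    """Apply the consensus rule described above: a relay gets the HSDir
    flag only if a majority of votes list HSDir and no more than one
    third of votes list BadHSDir."""
    n = len(votes)
    hsdir = sum("HSDir" in v for v in votes)
    bad = sum("BadHSDir" in v for v in votes)
    return hsdir * 2 > n and bad * 3 <= n

# Example with 9 authorities: 6 vote HSDir, 2 vote BadHSDir, 1 neither.
votes = [{"HSDir"}] * 6 + [{"BadHSDir"}] * 2 + [set()]
assign_hsdir_flag(votes)  # True: 6/9 is a majority, 2/9 <= 1/3
```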
In order to prevent an attacker from setting up new nodes as replacement
for blacklisted directory nodes, all directory nodes in the same /24
subnet are blacklisted, too. Furthermore, if two or more directory nodes
are blacklisted in the same /16 subnet concurrently, all other directory
nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at
most 48 hours.
2. Publish Fewer Replicas
The evaluation has shown that the probability that a directory node
serves a previously stored descriptor is 85.7% (more precisely, this is
the 0.001-quantile of the empirical distribution, with the rationale that
it holds for 99.9% of all empirical cases). If descriptors are replicated
to x directory nodes, the probability that at least one of the replicas
is available to clients is 1 - (1 - 85.7%) ^ x. In order to achieve an
overall availability of 99.9%, x = 3.55 replicas need to be stored. From
this it follows that 4 replicas are sufficient, rather than the currently
stored 6 replicas.
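The replica arithmetic above can be checked directly:

```python
import math

# Each directory node serves a stored descriptor with probability
# p = 0.857; with x independent replicas the overall availability is
# 1 - (1 - p)^x.  Solve 1 - (1 - p)^x >= 0.999 for x.
p = 0.857
target = 0.999
x = math.log(1 - target) / math.log(1 - p)
print(round(x, 2))   # ~3.55 replicas needed
print(math.ceil(x))  # 4 replicas are sufficient
```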
Further, the current design stores 2 sets of descriptors on 3 directory
nodes with consecutive identities. Originally, this was meant to
facilitate replication between directory nodes, which has not been and
will not be implemented (the selection criterion of 24 hours uptime does
not make it necessary). As a result, storing descriptors on directory
nodes with consecutive identities is not required. In fact, it should be
avoided, because it makes it easier for an attacker to create "black
holes" in the identifier ring.
Hidden services should store their descriptors on 4 non-consecutive
directory nodes, and clients should request descriptors from these
directory nodes only. For compatibility reasons, hidden services also
store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x
clients will be able to retrieve 4 out of 6 descriptors, but will fail
for the remaining 2 descriptors, which is sufficient for reliability. As
soon as 0.2.0.x is deprecated, hidden services can stop publishing the
additional 2 replicas.
3. Change Default Value of Being Hidden Service Directory
The requirements for becoming a hidden service directory node are an open
directory port and an uptime of at least 24 hours. The evaluation has
shown that there are 300 hidden service directory candidates on average,
but only 6 of them are configured to act as hidden service directories.
This is bad, because those 6 nodes need to serve a large share of all
hidden service descriptors. Optimally, there should be hundreds of hidden
service directories. Having a large number of 0.2.1.x directory nodes
also has a positive effect on 0.2.0.x hidden services and clients.
Therefore, the new default of HidServDirectoryV2 should be 1, so that a
Tor relay that has an open directory port automatically accepts and
serves v2 hidden service descriptors. A relay operator can still opt out
of running a hidden service directory by changing HidServDirectoryV2 to 0.
The additional bandwidth requirements for running a hidden service
directory node in addition to being a directory cache are negligible.
4. Make Descriptors Persistent on Directory Nodes
Hidden service directories that are restarted by their operators or after
a failure will not be selected as hidden service directories within the
next 24 hours. However, some clients might still think that these nodes
are responsible for certain descriptors, because they work on the basis
of network consensuses that are up to three hours old. The directory
nodes should be able to serve the previously received descriptors to
these clients. Therefore, directory nodes make all received descriptors
persistent and load previously received descriptors on startup.
5. Store and Serve Descriptors Regardless of Responsibility
Currently, directory nodes only accept descriptors for which they think
they are responsible. This may lead to problems when a directory node
uses an older or newer network consensus than hidden service or client
or when a directory node has been restarted recently. In fact, there are
no security issues in storing or serving descriptors for which a
directory node thinks it is not responsible. To the contrary, doing so
may improve reliability in border cases. As a result, a directory node
does not pay attention to responsibility when receiving a publication or
fetch request, but stores or serves the requested descriptor. Likewise,
the directory node does not remove descriptors when it thinks it is not
responsible for them any more.
6. Avoid Periodic Descriptor Re-Publication
In the current implementation a hidden service re-publishes its
descriptor either when its content changes or an hour elapses. However,
the evaluation has shown that failures of hidden service directory nodes,
i.e. of nodes that have not failed within the last 24 hours, are very
rare. Together with making descriptors persistent on directory nodes,
there is no necessity to re-publish descriptors hourly.
The only two events leading to descriptor re-publication should be a
change of the descriptor content and a new directory node becoming
responsible for the descriptor. Hidden services should therefore consider
re-publication every time they learn about a new network consensus
instead of hourly.
7. Discard Expired Descriptors
The current implementation lets directory nodes keep a descriptor for two
days before discarding it. However, with the v2 design, descriptors are
only valid for at most one day. Directory nodes should determine the
validity of stored descriptors and discard them one hour after they have
expired (to compensate for wrong clocks on clients).
8. Shorten Client-Side Descriptor Fetch History
When clients try to download a hidden service descriptor, they memorize
fetch requests to directory nodes for up to 15 minutes. This allows them
to request all replicas of a descriptor to avoid bad or failing directory
nodes, but without querying the same directory node twice.
The downside is that a client that has requested a descriptor without
success will not be able to find a hidden service that is started within
the 15 minutes following the client's last request.
This can be improved by shortening the fetch history to only 5 minutes.
This time should be sufficient to complete requests for all replicas of a
descriptor, but without ending in an infinite request loop.
Compatibility:
All proposed improvements are compatible with the currently implemented
design as described in proposal 114.
Filename: 144-enforce-distinct-providers.txt
Title: Increase the diversity of circuits by detecting nodes belonging to
the same provider
Author: Mfr
Created: 2008-06-15
Status: Obsolete
Overview:
Increase network security by reducing the ability of relays or ISPs,
whether monitoring voluntarily or under requisition, to observe a large
part of Tor traffic and thereby break circuit privacy. A way to increase
the diversity of circuits without killing network performance.
Motivation:
Since Roger and Nick's 2004 publication about diversity [1], the very
fast Tor relays have been concentrated among half a dozen providers,
which control the traffic of some dozens of routers [2].
Likewise, the spread of clonable VMs paid by the hour allows starting,
in a few minutes and at small cost, a set of very high-speed relays
that within a few hours can attract enough traffic to be analyzed,
increasing the vulnerability of the network.
Whether ISPs or domU providers, these usually hold several class B IP
blocks. Hence the EnforceDistinctSubnets restriction, which automatically
excludes nodes in the same class B subnet, is only partially effective.
By contrast, a restriction at the class A level would be too restrictive.
Therefore it seems necessary to consider another approach.
Proposal:
Add a provider check based on an AS number that the router adds to its
descriptor, verified by the directory authorities, and used like the
declarative family field when creating circuits.
Design:
Step 1 :
Add to the router descriptor a provider information line, obtained via a
request [4] made by the router itself.
"provider" name NL
'name' is the AS number of the router, formatted as 'ASxxxxxx',
where 'AS' is fixed and xxxxxx is the AS number, left-aligned
(e.g. AS98304, AS4096, AS1). If the AS number is missing, the
class A network number is used instead, formatted as 'ANxxx',
where 'AN' is fixed and xxx is the first octet of the IP (e.g.
for the IP 1.1.1.2: AN1), or the value 'L' is set if it is a
local-network IP.
If two ORs list one another in their "provider" entries,
then OPs should treat them as a single OR for the purpose
of path selection.
For example, if node A's descriptor contains "provider B",
and node B's descriptor contains "provider A", then node A
and node B should never be used on the same circuit.
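A minimal sketch of deriving the 'provider' value from the rules above.
The helper name is hypothetical; real code would use Tor's own
address-handling routines rather than the standard library.

```python
import ipaddress

def provider_value(ip, asn=None):
    """Derive the 'provider' value: 'AS<number>' when the AS number is
    known, 'AN<first octet>' when it is missing, or 'L' for a
    local-network address."""
    if ipaddress.ip_address(ip).is_private:
        return "L"
    if asn is not None:
        return "AS%d" % asn
    return "AN" + ip.split(".")[0]

provider_value("1.1.1.2", None)    # 'AN1', as in the example above
provider_value("8.8.8.8", 15169)   # 'AS15169'
provider_value("10.0.0.1", None)   # 'L'
```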
Add the corresponding config option to torrc:
EnforceDistinctProviders, set to 1 by default. Building
circuits with relays from the same provider is permitted
if it is set to 0.
Per proposal 135, if TestingTorNetwork is set,
EnforceDistinctProviders needs to be unset.
Control of AS numbers by the directory authorities:
The directory authorities check the AS numbers of newly
uploaded node descriptors.
If the node runs an old version, this test is bypassed.
If the AS number obtained by request differs from the one in
the descriptor, the router is flagged as non-Valid by the
testing authority for the voting process.
Step 2: When a 'significant number' of valid routers are generating
descriptors with provider information.
Add functionality for the client to obtain missing provider
information by DNS request:
During circuit building, the OP first applies the family
check and the EnforceDistinctSubnets directives for
performance; then, if provider info is needed but missing
from the router descriptor, it tries to get the AS provider
info by DNS request [4]. This information could be
DNS-cached. An AN (class A number) value is never generated
during this process, to prevent DNS-blocking problems. If
the DNS request fails, it is ignored and circuit building
continues.
Step 3: When the 'whole majority' of valid Tor clients are performing
the DNS request.
Older versions are deprecated and marked as non-Valid.
EnforceDistinctProviders replaces the EnforceDistinctSubnets
functionality; EnforceDistinctSubnets is removed.
The functionality deployed in step 2 is removed.
Security implications:
This measure will increase the number of provider addresses
that an attacker must use in order to carry out traffic
analysis.
Compatibility:
The presented protocol does not raise compatibility issues
with current Tor versions. Compatibility is preserved by
implementing this functionality in 3 steps, giving network
users time to upgrade clients and routers.
Performance and scalability notes:
Requiring distinct providers for all routers could slightly
reduce performance if the circuit is too long.
During step 2, obtaining missing provider information could
increase path-building time and should have a timeout.
Possible Attacks/Open Issues/Some thinking required:
This proposal seems to be compatible with proposal 135, Simplify
Configuration of Private Tor Networks.
This proposal does not address attacks by owners of multiple ASes
or by top providers monitoring traffic [5].
Unresolved AS numbers are treated as a class A network. Perhaps
they should be marked as invalid, but there were only five such
items at the last check; see [2].
We still need to define what a 'significant number of nodes' and
a 'whole majority' are ;-)
References:
[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger
Dingledine.
In the Proceedings of the Workshop on Privacy in the Electronic Society
(WPES 2004), Washington, DC, USA, October 2004
http://freehaven.net/anonbib/#feamster:wpes2004
[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt
[3] see Goodell Tor Exit Page
http://cassandra.eecs.harvard.edu/cgi-bin/exit.py
[4] see the great IP to ASN DNS Tool
http://www.team-cymru.org/Services/ip-to-asn.html
[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by
Steven J. Murdoch and Piotr Zielinski.
In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies
(PET 2007), Ottawa, Canada, June 2007.
http://freehaven.net/anonbib/#murdoch-pet2007
[6] http://bugs.noreply.org/flyspray/index.php?do=details&id=690
Filename: 145-newguard-flag.txt
Title: Separate "suitable as a guard" from "suitable as a new guard"
Author: Nick Mathewson
Created: 1-Jul-2008
Status: Superseded
[This could be obsoleted by proposal 141, which could replace NewGuard
with a Guard weight.]
[This _is_ superseded by 236, which adds guard weights for real.]
Overview
Right now, Tor has one flag that clients use both to tell which
nodes should be kept as guards, and which nodes should be picked
when choosing new guards. This proposal separates this flag into
two.
Motivation
Balancing clients among guards is not done well by our current
algorithm. When a new guard appears, it is chosen by clients
looking for a new guard with the same probability as all existing
guards... but new guards are likelier to be under capacity, whereas
old guards are likelier to be under heavier use.
Implementation
We add a new flag, NewGuard. Clients will change so that when they
are choosing new guards, they only consider nodes with the NewGuard
flag set.
For now, authorities will always set NewGuard if they are setting
the Guard flag. Later, it will be easy to migrate authorities to
set NewGuard for underused guards.
Alternatives
We might instead have authorities list weights with which nodes
should be picked as guards.
Filename: 146-long-term-stability.txt
Title: Add new flag to reflect long-term stability
Author: Nick Mathewson
Created: 19-Jun-2008
Status: Superseded
Superseded-by: 206
Status:
The applications of this design are achieved by proposal 206 instead.
Instead of having the authorities track long-term stability for nodes
that might be useful as directories in a fallback consensus, we
eliminated the idea of a fallback consensus, and just have a DirSource
configuration option. (Nov 2013)
Overview
This document proposes a new flag to indicate that a router has
existed at the same address for a long time, describes how to
implement it, and explains what it's good for.
Motivation
Tor has had three notions of "stability" for servers. Older
directory protocols based a server's stability on its
(self-reported) uptime: a server that had been running for a day was
more stable than a server that had been running for five minutes,
regardless of their past history. Current directory protocols track
weighted mean time between failure (WMTBF) and weighted fractional
uptime (WFU). WFU is computed as the fraction of time for which the
server is running, with measurements weighted to exponentially
decay such that old days count less. WMTBF is computed as the
average length of intervals for which the server runs between
downtime, with old intervals weighted to count less.
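As a toy illustration of the exponential weighting described above, here
is a sketch of WFU over daily buckets. The decay constant and the
day-granularity are assumptions for illustration; Tor's actual parameters
and accounting differ.

```python
def wfu(daily_uptime_fractions, decay=0.95):
    """Weighted fractional uptime: each day's uptime fraction is
    weighted so that older days count exponentially less.
    daily_uptime_fractions[0] is the most recent day."""
    weights = [decay ** age for age in range(len(daily_uptime_fractions))]
    weighted = sum(w * u for w, u in zip(weights, daily_uptime_fractions))
    return weighted / sum(weights)

# A relay that was up for the last three days but down for half of an
# older day scores above the unweighted mean (0.875), because the old
# downtime is discounted.
wfu([1.0, 1.0, 1.0, 0.5])
```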
WMTBF is useful in answering the question: "If a server is running
now, how long is it likely to stay running?" This makes it a good
choice for picking servers for streams that need to be long-lived.
WFU is useful in answering the question: "If I try connecting to
this server at an arbitrary time, is it likely to be running?" This
makes it an important factor for picking guard nodes, since we want
guard nodes to be usually-up.
There are other questions that clients want to answer, however, for
which the current flags aren't very useful. The one that this
proposal addresses is,
"If I found this server in an old consensus, is it likely to
still be running at the same address?"
This one is useful when we're trying to find directory mirrors in a
fallback-consensus file. This property is equivalent to,
"If I find this server in a current consensus, how long is it
likely to exist on the network?"
This one is useful if we're trying to pick introduction points or
something and care more about churn rate than about whether every IP
will be up all the time.
Implementation:
I propose we add a new flag, called "Longterm." Authorities should
set this flag for routers if their Longevity is in the upper
quartile of all routers. A router's Longevity is computed as the
total number of days in the last year or so[*] for which the router has
been Running at least once at its current IP:orport pair.
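A sketch of the upper-quartile assignment described above. The data
structures and helper name are hypothetical.

```python
def longterm_routers(longevity_by_router):
    """Return the set of routers whose Longevity (days Running at the
    same IP:orport over the past year) falls in the upper quartile."""
    values = sorted(longevity_by_router.values())
    threshold = values[(3 * len(values)) // 4]  # 75th-percentile cutoff
    return {r for r, days in longevity_by_router.items()
            if days >= threshold}

longterm_routers({"A": 300, "B": 20, "C": 150, "D": 350})  # {'D'}
```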
Clients should use directory servers from a fallback-consensus only
if they have the Longterm flag set.
Authority ops should be able to mark particular routers as not
Longterm, regardless of history. (For instance, it makes sense to
remove the Longterm flag from a router whose op says that it will
need to shutdown in a month.)
[*] This is deliberately vague, to permit efficient implementations.
Compatibility and migration issues:
The voting protocol already acts gracefully when new flags are
added, so no change to the voting protocol is needed.
Tor won't have collected this data, however. It might be desirable
to bootstrap it from historical consensuses. Alternatively, we can
just let the algorithm run for a month or two.
Issues and future possibilities:
Longterm is a really awkward name.
Filename: 147-prevoting-opinions.txt
Title: Eliminate the need for v2 directories in generating v3 directories
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Rejected
Target: 0.2.4.x
Overview
We propose a new v3 vote document type to replace the role of v2
networkstatus information in generating v3 consensuses.
Motivation
When authorities vote on which descriptors are to be listed in the
next consensus, it helps if they all know about the same descriptors
as one another. But a hostile, confused, or out-of-date server may
upload a descriptor to only some authorities. In the current v3
directory design, the authorities don't have a good way to tell one
another about the new descriptor until they exchange votes... but by
the time this happens, they are already committed to their votes,
and they can't add anybody they learn about from other authorities
until the next voting cycle. That's no good!
The current Tor implementation avoids this problem by having
authorities also look at v2 networkstatus documents, but we'd like
in the long term to eliminate these, once 0.1.2.x is obsolete.
Design:
We add a new value for vote-status in v3 consensus documents in
addition to "consensus" and "vote": "opinion". Authorities generate
and sign an opinion document as if they were generating a vote,
except that they generate opinions earlier than they generate votes.
[This proposal doesn't say what lines must be contained in opinion
documents. It seems that an authority that parses an opinion
document is only interested in a) relay fingerprint, b) descriptor
publication time, and c) descriptor digest; unless there's more
information that helps authorities decide whether "they might
accept" a descriptor. If not, opinion documents only need to
contain a small subset of headers and all the "r" lines that would
be contained in a later vote. -KL]
[This seems okay. It would however mean that we can't use the same
parsing logic as we use for regular votes. -NM]
[Authorities should use the same "valid-after", "fresh-until",
and "valid-until" lines in opinion documents as they are going to
use in their next vote. -KL]
[Maybe these lines should just get ignored on opinions. Or
omitted. -NM]
Authorities don't need to generate more than one opinion document
per voting interval, but may. They should send it to the other
authorities they know about, at
http://<hostname>/tor/post/opinion ,
before the authorities begin voting, so that enough time remains for
the authorities to fetch new descriptors.
Additionally, authorities make their opinions available at
http://<hostname>/tor/status-vote/next/opinion.z
and download opinions from authorities they haven't heard from in a
while.
Authorities SHOULD send their opinion document to all other
authorities OpinionSeconds seconds before voting and request
missing opinion documents OpinionSeconds/2 seconds before voting.
OpinionSeconds SHOULD be defined as part of "voting-delay" lines
and otherwise default to the same number of seconds as VoteSeconds.
Authorities MAY generate opinions on demand.
Upon receiving an opinion document, authorities scan it for any
descriptors that:
- They might accept.
- Are for routers they don't know about, or are published more
recently than any descriptor they have for that router.
Authorities then begin downloading such descriptors from authorities
that claim to have them.
Authorities also download corresponding extra-info descriptors for
any router descriptor they learned from parsing an opinion document.
Authorities MAY cache opinion documents, but don't need to.
Reasons for rejection:
1. Authorities learn about new relays from each other's vote documents.
See git commits 2e692bd8 and eaf5487d, which went into 0.2.2.12-alpha:
o Major bugfixes:
- Many relays have been falling out of the consensus lately because
not enough authorities know about their descriptor for them to get
a majority of votes. When we deprecated the v2 directory protocol,
we got rid of the only way that v3 authorities can hear from each
other about other descriptors. Now authorities examine every v3
vote for new descriptors, and fetch them from that authority. Bugfix
on 0.2.1.23.
2. Authorities don't serve version 2 statuses anymore.
Since January 2013, there was only a single version 3 directory
authority left that served version 2 statuses: dizum. moria1 and tor26
have been rejecting version 2 requests for a long time, and it was
mostly an oversight that dizum still served them. As of January 2014,
dizum does not serve version 2 statuses anymore. The other six
authorities have never generated version 2 statuses for others to be
used as pre-voting opinions.
3. Vote documents indicate that pre-voting opinions wouldn't help much.
From January 1 to 7, 2014, only 0.4 relays on average were not included
in a consensus because they were listed in less than 5 votes. These 0.4
relays could probably have been included with pre-voting opinions.
(Here's how to find out: extract the votes-2014-01.tar.bz2 tarball, run
`grep -R "^r " 0[1-7] | cut -c 4-22,112- | cut -d" " -f1,3 | sort | uniq
-c | sort | grep " [1-4] " | wc -l`, result is 63, divide by 7*24
published consensuses, obtain 0.375 as end result.)
Filename: 148-uniform-client-end-reason.txt
Title: Stream end reasons from the client side should be uniform
Author: Roger Dingledine
Created: 2-Jul-2008
Status: Closed
Implemented-In: 0.2.1.9-alpha
Overview
When a stream closes before it's finished, the end relay cell that's
sent includes an "end stream reason" to tell the other end why it
closed. It's useful for the exit relay to send a reason to the client,
so the client can choose a different circuit, inform the user, etc. But
there's no reason to include it from the client to the exit relay,
and in some cases it can even harm anonymity.
We should pick a single reason for the client-to-exit-relay direction
and always just send that.
Motivation
Back when I first deployed the Tor network, it was useful to have
the Tor relays learn why a stream closed, so I could debug both ends
of the stream at once. Now that streams have worked for many years,
there's no need to continue telling the exit relay whether the client
gave up on a stream because of "timeout" or "misc" or what.
Then in Tor 0.2.0.28-rc, I fixed this bug:
- Fix a bug where, when we were choosing the 'end stream reason' to
put in our relay end cell that we send to the exit relay, Tor
clients on Windows were sometimes sending the wrong 'reason'. The
anonymity problem is that exit relays may be able to guess whether
the client is running Windows, thus helping partition the anonymity
set. Down the road we should stop sending reasons to exit relays,
or otherwise prevent future versions of this bug.
It turned out that non-Windows clients were choosing their reason
correctly, whereas Windows clients were potentially looking at errno
wrong and so always choosing 'misc'.
I fixed that particular bug, but I think we should prevent future
versions of the bug too.
(We already fixed it so *circuit* end reasons don't get sent from
the client to the exit relay. But we appear to have skipped over
stream end reasons thus far.)
Design:
One option would be to no longer include any 'reason' field in end
relay cells. But that would introduce a partitioning attack ("users
running the old version" vs "users running the new version").
Instead I suggest that clients all switch to sending the "misc" reason,
like most of the Windows clients currently do and like the non-Windows
clients already do sometimes.
Filename: 149-using-netinfo-data.txt
Title: Using data from NETINFO cells
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Superseded
Target: 0.2.1.x
[Partially done: we do the anti-MITM part. Not entirely done: we don't do
the time part.]
Overview
Current Tor versions send signed IP and timestamp information in
NETINFO cells, but don't use them to their fullest. This proposal
describes how they should start using this info in 0.2.1.x.
Motivation
Our directory system relies on clients and routers having
reasonably accurate clocks to detect replayed directory info, and
to set accurate timestamps on directory info they publish
themselves. NETINFO cells contain timestamps.
Also, the directory system relies on routers having a reasonable
idea of their own IP addresses, so they can publish correct
descriptors. This is also in NETINFO cells.
Learning the time and IP address
We need to think about attackers here. Just because a router tells
us that we have a given IP or a given clock skew doesn't mean that
it's true. We believe this information only if we've heard it from
a majority of the routers we've connected to recently, including at
least 3 routers. Routers only believe this information if the
majority includes at least one authority.
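The acceptance rule above might be sketched as follows (illustrative only; `reports` and `is_authority` are hypothetical stand-ins, not Tor's data structures):

```python
from collections import Counter

def believed_value(reports, is_authority, min_reports=3,
                   require_authority=False):
    """Accept a reported value (e.g. our own IP address or clock skew)
    only if a strict majority of recently contacted routers agree, with
    at least `min_reports` reports in total. Routers additionally require
    that the majority include at least one authority. `reports` is a list
    of (router_id, value) pairs; `is_authority` maps router_id -> bool.
    Sketch only, not Tor's code."""
    if len(reports) < min_reports:
        return None
    counts = Counter(value for _, value in reports)
    value, n = counts.most_common(1)[0]
    if n * 2 <= len(reports):  # not a strict majority
        return None
    if require_authority and not any(
            is_authority(rid) for rid, v in reports if v == value):
        return None
    return value
```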
Avoiding MITM attacks
Current Tors use the IP addresses published in the other router's
NETINFO cells to see whether the connection is "canonical". Right
now, we prefer to extend circuits over "canonical" connections. In
0.2.1.x, we should refuse to extend circuits over non-canonical
connections without first trying to build a canonical one.
Filename: 150-exclude-exit-nodes.txt
Title: Exclude Exit Nodes from a circuit
Author: Mfr
Created: 2008-06-15
Status: Closed
Implemented-In: 0.2.1.3-alpha
Overview
Right now, Tor users can manually exclude a node from all positions
in their circuits created using the directive ExcludeNodes.
This proposal makes this exclusion less restrictive, allowing users to
exclude a node only from the exit part of a circuit.
Motivation
This feature would help the integration into Vidalia (tor exit
branch) or other tools of features to exclude a country from the exit
position, without reducing the set of possible circuits or harming
privacy. It could help people from a country where many sites are
blocked exclude that country for browsing, giving them more stable
navigation. It could also add the possibility for the user to
exclude a currently used exit node.
Implementation
ExcludeExitNodes is similar to ExcludeNodes, except that only the
exit position of a circuit is affected when building circuits.
Tor doesn't warn if a node in this list is not an exit node.
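In a torrc file the option might be used like this (the fingerprint is a placeholder; country-code entries such as `{us}` assume GeoIP support):

```
# Keep these out of the exit position only; they may still be chosen
# as guard or middle hops.
ExcludeExitNodes $ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCD, {us}
```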
Security implications:
This also opens possibilities for future user reporting of bad exits.
Risks:
Use of this option can make users partitionable under certain attack
assumptions. However, ExitNodes already creates this possibility,
so there isn't much increased risk in ExcludeExitNodes.
We should still encourage people who exclude an exit node because
of bad behavior to report it instead of just adding it to their
ExcludeExit list. It would be unfortunate if we didn't find out
about broken exits because of this option. This issue can probably
be addressed sufficiently with documentation.
Filename: 151-path-selection-improvements.txt
Title: Improving Tor Path Selection
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
Status: Closed
In-Spec: path-spec.txt
Implemented-In: 0.2.2.2-alpha
Overview
The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
describes a method of tracking buildtime statistics at the client, and
using those statistics to adjust the CircuitBuildTimeout.
Motivation
Tor's performance can be improved by excluding those circuits that
have long buildtimes (and by extension, high latency). For those Tor
users who require better performance and have lower requirements for
anonymity, this would be a very useful option to have.
Implementation
Gathering Build Times
Circuit build times are stored in the circular array
'circuit_build_times' consisting of uint32_t elements as milliseconds.
The total size of this array is based on the number of circuits
it takes to converge on a good fit of the long term distribution of
the circuit builds for a fixed link. We do not want this value to be
too large, because it will make it difficult for clients to adapt to
moving between different links.
From our observations, the minimum value for a reasonable fit appears
to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
a good fit over the long term, we store 5000 most recent circuits in
the array (NCIRCUITS_TO_OBSERVE).
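A minimal sketch of such a circular store (the constant mirrors the proposal's NCIRCUITS_TO_OBSERVE; this is not Tor's actual code):

```python
class BuildTimeHistory:
    """Fixed-size circular record of circuit build times in milliseconds.
    Once full, the oldest observation is overwritten, so the array always
    holds the most recent NCIRCUITS_TO_OBSERVE samples."""

    def __init__(self, capacity=5000):  # NCIRCUITS_TO_OBSERVE
        self.times = [0] * capacity     # uint32 milliseconds
        self.index = 0                  # next slot to overwrite
        self.count = 0                  # samples stored so far

    def add(self, ms: int) -> None:
        self.times[self.index] = ms
        self.index = (self.index + 1) % len(self.times)
        self.count = min(self.count + 1, len(self.times))
```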
The Tor client will build test circuits at a rate of one per
minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
a CircuitBuildTimeout estimated within 8 hours after install,
upgrade, or network change (see below).
Long Term Storage
The long-term storage representation is implemented by storing a
histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
writing out the statistics to disk. The format this takes in the
state file is 'CircuitBuildTimeBin <bin-ms> <count>', with the total
specified as 'TotalBuildTimes <total>'.
Example:
TotalBuildTimes 100
CircuitBuildTimeBin 25 50
CircuitBuildTimeBin 75 25
CircuitBuildTimeBin 125 13
...
Reading the histogram in will entail inserting <count> values
into the circuit_build_times array each with the value of
<bin-ms> milliseconds. In order to evenly distribute the values
in the circular array, the Fisher-Yates shuffle will be performed
after reading values from the bins.
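Loading could be sketched like this (a hypothetical helper, not Tor's implementation):

```python
import random

def load_build_times(bins):
    """Expand 'CircuitBuildTimeBin <bin-ms> <count>' entries into
    individual samples, then Fisher-Yates shuffle them so the circular
    array is not ordered by bucket. `bins` is a list of
    (bin_ms, count) pairs."""
    samples = []
    for bin_ms, count in bins:
        samples.extend([bin_ms] * count)
    # Fisher-Yates shuffle: swap each element with a random earlier one.
    for i in range(len(samples) - 1, 0, -1):
        j = random.randrange(i + 1)
        samples[i], samples[j] = samples[j], samples[i]
    return samples
```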
Learning the CircuitBuildTimeout
Based on studies of build times, we found that the distribution of
circuit buildtimes appears to be a Frechet distribution. However,
estimators and quantile functions of the Frechet distribution are
difficult to work with and slow to converge. So instead, since we
are only interested in the accuracy of the tail, we approximate
the tail of the distribution with a Pareto curve starting at
the mode of the circuit build time sample set.
We will calculate the parameters for a Pareto distribution
fitting the data using the estimators at
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
The timeout itself is calculated by using the quantile function (the
inverted CDF) to give us the value on the CDF such that
BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
below the timeout value.
Thus, we expect that the Tor client will accept the fastest 80% of
the total number of paths on the network.
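Under those assumptions, the fit and timeout computation could be sketched as follows. The function name is hypothetical, and as a simplification it anchors the tail at the sample minimum rather than the mode the proposal uses; the estimators are the standard Pareto MLE formulas the proposal links to:

```python
import math

def pareto_timeout(build_times_ms, cutoff=0.80):
    """Fit a Pareto tail to observed build times and return the timeout
    at `cutoff` of the CDF. MLE estimators:
        x_m   = smallest sample (simplification; proposal uses the mode),
        alpha = n / sum(ln(x_i / x_m)).
    Inverting F(x) = 1 - (x_m / x)^alpha at F = cutoff gives
        timeout = x_m * (1 - cutoff)^(-1 / alpha)."""
    x_m = min(build_times_ms)
    n = len(build_times_ms)
    alpha = n / sum(math.log(x / x_m) for x in build_times_ms)
    return x_m * (1.0 - cutoff) ** (-1.0 / alpha)
```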
Detecting Changing Network Conditions
We attempt to detect both network connectivity loss and drastic
changes in the timeout characteristics.
We assume that we've had network connectivity loss if 3 circuits
timeout and we've received no cells or TLS handshakes since those
circuits began. We then set the timeout to 60 seconds and stop
counting timeouts.
If 3 more circuits timeout and the network still has not been
live within this new 60 second timeout window, we then discard
the previous timeouts during this period from our history.
To detect changing network conditions, we keep a history of
the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
that successfully completed at least one hop. If more than 75%
of these circuits timeout, we discard all buildtimes history,
reset the timeout to 60, and then begin recomputing the timeout.
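The 75%-of-recent-circuits rule could be sketched as (illustrative class, not Tor's code):

```python
from collections import deque

RECENT_CIRCUITS = 20

class TimeoutMonitor:
    """Track timeout/success status of the most recent circuits that
    completed at least one hop, and decide when to discard buildtime
    history and reset the timeout to 60 seconds."""

    def __init__(self):
        self.recent = deque(maxlen=RECENT_CIRCUITS)

    def record(self, timed_out: bool) -> bool:
        """Record one circuit. Returns True when more than 75% of the
        recent window timed out, meaning history should be discarded."""
        self.recent.append(timed_out)
        if (len(self.recent) == RECENT_CIRCUITS
                and sum(self.recent) > 0.75 * RECENT_CIRCUITS):
            self.recent.clear()
            return True
        return False
```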
Testing
After circuit build times, storage, and learning are implemented,
the resulting histogram should be checked for consistency by
verifying it persists across successive Tor invocations where
no circuits are built. In addition, we can also use the existing
buildtime scripts to record build times, and verify that the histogram
the Python scripts produce matches the one Tor writes to the state file,
and that the Pareto parameters and cutoff points also match.
We will also verify that there are no unexpected large deviations from
node selection, such as nodes from distant geographical locations being
completely excluded.
Dealing with Timeouts
Timeouts should be counted as the expectation of the region of
the Pareto distribution beyond the cutoff. This is done by
generating a random sample for each timeout at points on the
curve beyond the current timeout cutoff.
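One way to draw such a sample is inverse-CDF sampling restricted to the tail beyond the cutoff (a sketch; `x_m` and `alpha` are the fitted Pareto parameters):

```python
import random

def sample_timeout_value(x_m, alpha, cutoff_ms):
    """Draw a synthetic build time from the Pareto tail beyond the
    current timeout cutoff, standing in for a circuit whose true build
    time was never observed because it was destroyed at the cutoff."""
    u_cutoff = 1.0 - (x_m / cutoff_ms) ** alpha      # F(cutoff_ms)
    # Uniform draw restricted to [F(cutoff), 1), then invert the CDF.
    u = u_cutoff + (1.0 - u_cutoff) * random.random()
    return x_m * (1.0 - u) ** (-1.0 / alpha)
```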
Future Work
At some point, it may be desirable to change the cutoff from a
single hard cutoff that destroys the circuit to a soft cutoff and
a hard cutoff, where the soft cutoff merely triggers the building
of a new circuit, and the hard cutoff triggers destruction of the
circuit.
It may also be beneficial to learn separate timeouts for each
guard node, as they will have slightly different distributions.
This will take longer to generate initial values though.
Issues
Impact on anonymity
Since this follows a Pareto distribution, large reductions on the
timeout can be achieved without cutting off a great number of the
total paths. This will eliminate a great deal of the performance
variation of Tor usage.
Filename: 152-single-hop-circuits.txt
Title: Optionally allow exit from single-hop circuits
Author: Geoff Goodell
Created: 13-Jul-2008
Status: Closed
Implemented-In: 0.2.1.6-alpha
Overview
Provide a special configuration option that adds a line to descriptors
indicating that a router can be used as an exit for one-hop circuits,
and allow clients to attach streams to one-hop circuits provided
that the descriptor for the router in the circuit includes this
configuration option.
Motivation
At some point, code was added to restrict the attachment of streams
to one-hop circuits.
The idea seems to be that we can use the cost of forking and
maintaining a patch as a lever to prevent people from writing
controllers that jeopardize the operational security of routers
and the anonymity properties of the Tor network by creating and
using one-hop circuits rather than the standard three-hop circuits.
It may be, for example, that some users do not actually seek true
anonymity but simply reachability through network perspectives
afforded by the Tor network, and since anonymity is stronger in
numbers, forcing users to contribute to anonymity and decrease the
risk to server operators by using full-length paths may be reasonable.
As presently implemented, the sweeping restriction of one-hop circuits
for all routers limits the usefulness of Tor as a general-purpose
technology for building circuits. In particular, we should allow
for controllers, such as Blossom, that create and use single-hop
circuits involving routers that are not part of the Tor network.
Design
Introduce a configuration option for Tor servers that, when set,
indicates that a router is willing to provide exit from one-hop
circuits. Routers with this policy will not require that a circuit
has at least two hops when it is used as an exit.
In addition, routers for which this configuration option
has been set will have a line in their descriptors, "opt
exit-from-single-hop-circuits". Clients will keep track of which
routers have this option and allow streams to be attached to
single-hop circuits that include such routers.
Security Considerations
This approach seems to eliminate the worry about operational router
security, since server operators will not set the configuration
option unless they are willing to take on such risk.
To reduce the impact on anonymity of the network resulting
from including such "risky" routers in regular Tor path
selection, clients may systematically exclude routers with "opt
exit-from-single-hop-circuits" when choosing random paths through
the Tor network.
Filename: 153-automatic-software-update-protocol.txt
Title: Automatic software update protocol
Author: Jacob Appelbaum
Created: 14-July-2008
Status: Superseded
[Superseded by thandy-spec.txt]
Automatic Software Update Protocol Proposal
0.0 Introduction
The Tor project and its users require a robust method to update shipped
software bundles. The software bundles often include Vidalia, Privoxy, Polipo,
Torbutton and of course Tor itself. It is not inconceivable that an update
could include all of the Tor Browser Bundle. It seems reasonable to make this
a standalone program that can be called in shell scripts, cronjobs or by
various Tor controllers.
0.1 Minimal Tasks To Implement Automatic Updating
At the most minimal, an update must be able to do the following:
0 - Detect the current Tor version, note the working status of Tor.
1 - Detect the latest Tor version.
2 - Fetch the latest version in the form of a platform specific package(s).
3 - Verify the integrity of the downloaded package(s).
4 - Install the verified package(s).
5 - Test that the new package(s) works properly.
0.2 Specific Enumeration Of Minimal Tasks
To implement requirement 0, we need to detect the current Tor version of both
the updater and the current running Tor. The update program itself should be
versioned internally. This requirement should also test connecting through Tor
itself and note if such connections are possible.
To implement requirement 1, we need to learn the consensus from the directory
authorities, or fall back to a known good URL with cryptographically signed
content.
To implement requirement 2, we need to download Tor - hopefully over Tor.
To implement requirement 3, we need to verify the package signature.
To implement requirement 4, we need to use a platform specific method of
installation. The Tor controller performing the update performs these
platform specific methods.
To implement requirement 5, we need to be able to extend circuits and reach
the internet through Tor.
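The five requirements could be strung together roughly like this (every callable here is a hypothetical stand-in, not a real Tor API):

```python
def run_update(current_version, fetch_consensus, download, verify,
               install, self_test):
    """Sketch of the minimal update loop from section 0.1. All callables
    are hypothetical stand-ins supplied by the caller."""
    latest = fetch_consensus()            # task 1: learn latest version
    if latest == current_version:
        return "up-to-date"               # task 0 already satisfied
    package = download(latest)            # task 2: fetch the package
    if not verify(package):               # task 3: check the signature
        return "verification-failed"
    install(package)                      # task 4: platform-specific install
    # task 5: confirm the new Tor still works, else fall back
    return "ok" if self_test() else "rollback"
```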
0.x Implementation Goals
The update system will be cross platform and rely on as little external code
as possible. If the update system uses it, it must be updated by the update
system itself. It will consist only of free software and will not rely on any
non-free components until the actual installation phase. If a package manager
is in use, it will be platform specific and thus only invoked by the update
system implementing the update protocol.
The update system itself will attempt to perform update related network
activity over Tor. Possibly it will attempt to use a hidden service first.
It will attempt to use novel and not-so-novel caching when possible,
and it will always verify cryptographic signatures before any
remotely fetched code is executed. In the event of an unusable Tor
system, it will be able to attempt to fetch updates without Tor. This
should be user configurable: some users will be unwilling to update
without the protection of using Tor; others will simply be unable to,
because of blocking of the main Tor website.
The update system will track current version numbers of Tor and supporting
software. The update system will also track known working versions to assist
with automatic recovery. The update system itself will be a standalone
library. It will be strongly versioned internally to match the Tor bundle it
was shipped with. The update system will keep track of the given platform,
CPU architecture, lsb_release, package management functionality and any
other platform specific metadata.
We have referenced two popular automatic update systems. Though neither
fits our needs, both are useful as examples of what others are doing in
the same area.
The first is sparkle[0] but it is sadly only available for Cocoa
environments and is written in Objective C. This doesn't meet our requirements
because it is directly tied into the private Apple framework.
The second is the Mozilla Automatic Update System[1]. It is possibly useful
as an idea of how other free software projects automatically update. It is
however not useful in its currently documented form.
[0] http://sparkle.andymatuschak.org/documentation/
[1] http://wiki.mozilla.org/AUS:Manual
0.x Previous methods of Tor and related software update
Previously, Tor users updated their Tor related software by hand. There has
been no fully automatic method for any user to update. In addition, there
hasn't been any specific way to find out the most current stable version of
Tor or related software as voted on by the directory authority consensus.
0.x Changes to the directory specification
We will want to supplement client-versions and server-versions in the
consensus voting with another version identifier known as
'auto-update-versions'. This will keep track of the current consensus of
specific versions that are best per platform and per architecture. It should
be noted that while the Mac OS X universal binary may be the best for x86
processors with Tiger, it may not be the best for PPC users on Panther. This
goes for all of the package updates. We want to prevent updates that cause Tor
to break even if the updating program can recover gracefully.
x.x Assumptions About Operating System Package Management
It is assumed that users will use their package manager unless they are on
Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows
users will have integration with the normal "add/remove program" functionality
that said users would expect.
x.x Package Update System Failure Modes
The package update will try to ensure that a user always has a working Tor at
the very least. It will keep state to remember versions of Tor that were able
to bootstrap properly and reach the rest of the Tor network. It will also keep
note of which versions broke. It will select the best Tor that works for the
user. It will also allow for anonymized bug reporting on the packages
available and tested by the auto-update system.
x.x Package Signature Verification
The update system will be aware of replay attacks against the update signature
system itself. It will not allow package update signatures that are radically
out of date. It will be a multi-key system to prevent any single party from
forging an update. The key will be updated regularly. This is like authority
key (see proposal 103) usage.
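A sketch of such a freshness-plus-multi-key check (the 60-day and two-key thresholds are illustrative assumptions; the proposal fixes neither):

```python
import time

def signature_acceptable(sig_timestamps, now=None,
                         max_age_days=60, min_keys=2):
    """Reject update signatures that are 'radically out of date'
    (replay protection) or made by too few distinct keys (the
    multi-key requirement). `sig_timestamps` maps key-id to the
    signature's timestamp in seconds since the epoch. Thresholds are
    illustrative assumptions."""
    now = time.time() if now is None else now
    fresh = [k for k, ts in sig_timestamps.items()
             if now - ts <= max_age_days * 86400]
    return len(fresh) >= min_keys
```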
x.x Package Caching
The update system will iterate over different update methods. Whichever method
is picked will have caching functionality. Each Tor server itself should be
able to serve cached update files. This will be an option that friendly server
administrators can turn on should they wish to support caching. In addition,
it is possible to cache the full contents of a package in an
authoritative DNS zone. Users can then query the DNS zone for their package.
If we wish to further distribute the update load, we can also offer packages
with encrypted bittorrent. Clients who wish to share the updates but do not
wish to be a server can help distribute Tor updates. This can be tied together
with the DNS caching[2][3] if needed.
[2] http://www.netrogenic.com/dnstorrent/
[3] http://www.doxpara.com/ozymandns_src_0.1.tgz
x.x Helping Our Users Spread Tor
There should be a way for a user to participate in the packaging caching as
described in section x.x. This option should be presented by the Tor
controller.
x.x Simple HTTP Proxy To The Tor Project Website
It has been suggested that we should provide a simple proxy that allows a user
to visit the main Tor website to download packages. This was part of a
previous proposal and has not been closely examined.
x.x Package Installation
Platform specific methods for proper package installation will be left to the
controller that is calling for an update. Each platform is different, the
installation options and user interface will be specific to the controller in
question.
x.x Other Things
Other things should be added to this proposal. What are they?
Filename: 154-automatic-updates.txt
Title: Automatic Software Update Protocol
Author: Matt Edman
Created: 30-July-2008
Status: Superseded
Target: 0.2.1.x
Superseded by thandy-spec.txt
Scope
This proposal specifies the method by which an automatic update client can
determine the most recent recommended Tor installation package for the
user's platform, download the package, and then verify that the package was
downloaded successfully. While this proposal focuses on only the Tor
software, the protocol defined is sufficiently extensible such that other
components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be
managed and updated by the automatic update client as well.
The initial target platform for the automatic update framework is Windows,
given that's the platform used by a majority of our users and that it lacks
a sane package management system that many Linux distributions already have.
Our second target platform will be Mac OS X, and so the protocol will be
designed with this near-future direction in mind.
Other client-side aspects of the automatic update process, such as user
interaction, the interface presented, and actual package installation
procedure, are outside the scope of this proposal.
Motivation
Tor releases new versions frequently, often with important security,
anonymity, and stability fixes. Thus, it is important for users to be able
to promptly recognize when new versions are available and to easily
download, authenticate, and install updated Tor and Tor-related software
packages.
Tor's control protocol [2] provides a method by which controllers can
identify when the user's Tor software is obsolete or otherwise no longer
recommended. Currently, however, no mechanism exists for clients to
automatically download and install updated Tor and Tor-related software for
the user.
Design Overview
The core of the automatic update framework is a well-defined file called a
"recommended-packages" file. The recommended-packages file is accessible via
HTTP[S] at one or more well-defined URLs. An example recommended-packages
URL may be:
https://updates.torproject.org/recommended-packages
The recommended-packages document is formatted according to Section 1.2
below and specifies the most recent recommended installation package
versions for Tor or Tor-related software, as well as URLs at which the
packages and their signatures can be downloaded.
An automatic update client process runs on the Tor user's computer and
periodically retrieves the recommended-packages file according to the method
described in Section 2.0. As described further in Section 1.2, the
recommended-packages file is signed and can be verified by the automatic
update client with one or more public keys included in the client software.
Since it is signed, the recommended-packages file can be mirrored by
multiple hosts (e.g., Tor directory authorities), whose URLs are included in
the automatic update client's configuration.
After retrieving and verifying the recommended-packages file, the automatic
update client compares the versions of the recommended software packages
listed in the file with those currently installed on the end-user's
computer. If one or more of the installed packages is determined to be out
of date, an updated package and its signature will be downloaded from one of
the package URLs listed in the recommended-packages file as described in
Section 2.2.
The automatic update system uses a multilevel signing key scheme for package
signatures. There are a small number of entities we call "packaging
authorities" that each have their own signing key. A packaging authority is
responsible for signing and publishing the recommended-packages file.
Additionally, each individual packager responsible for producing an
installation package for one or more platforms has their own signing key.
Every packager's signing key must be signed by at least one of the packaging
authority keys.
Specification
1. recommended-packages Specification
In this section we formally specify the format of the published
recommended-packages file.
1.1. Document Meta-format
The recommended-packages document follows the lightweight extensible
information format defined in Tor's directory protocol specification [1]. In
the interest of self-containment, we have reproduced the relevant portions
of that format's specification in this Section. (Credits to Nick Mathewson
for much of the original format definition language.)
The highest level object is a Document, which consists of one or more
Items. Every Item begins with a KeywordLine, followed by zero or more
Objects. A KeywordLine begins with a Keyword, optionally followed by
whitespace and more non-newline characters, and ends with a newline. A
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
An Object is a block of encoded data in pseudo-Open-PGP-style
armor. (cf. RFC 2440)
More formally:
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
The BeginLine and EndLine of an Object must use the same keyword.
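A rough parser for this meta-format might look like the following (a sketch that skips some validation, e.g. matching BEGIN/END keywords):

```python
import re

KEYWORD_LINE = re.compile(r'^([A-Za-z0-9-]+)(?:[ \t]+(.*))?$')

def parse_document(text):
    """Parse the meta-format into (keyword, argument, objects) items.
    Objects are the armored blocks between BEGIN/END lines, attached to
    the most recent item. A real parser must also verify that BeginLine
    and EndLine use the same keyword."""
    items = []
    lines = iter(text.splitlines())
    for line in lines:
        if not line.strip():
            continue
        if line.startswith("-----BEGIN "):
            body = []
            for obj_line in lines:          # consume until the EndLine
                if obj_line.startswith("-----END "):
                    break
                body.append(obj_line)
            items[-1][2].append("\n".join(body))
            continue
        m = KEYWORD_LINE.match(line)
        if m:
            items.append((m.group(1), m.group(2) or "", []))
    return items
```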
In our Document description below, we also tag Items with a multiplicity in
brackets. Possible tags are:
"At start, exactly once": These items MUST occur in every instance of the
document type, and MUST appear exactly once, and MUST be the first item in
their documents.
"Exactly once": These items MUST occur exactly one time in every
instance of the document type.
"Once or more": These items MUST occur at least once in any instance
of the document type, and MAY occur more than once.
"At end, exactly once": These items MUST occur in every instance of
the document type, and MUST appear exactly once, and MUST be the
last item in their documents.
1.2. recommended-packages Document Format
When interpreting a recommended-packages Document, software MUST ignore
any KeywordLine that starts with a keyword it doesn't recognize; future
implementations MUST NOT require current automatic update clients to
understand any KeywordLine not currently described.
In lines that take multiple arguments, extra arguments SHOULD be
accepted and ignored.
The currently defined Items contained in a recommended-packages document
are:
"recommended-packages-format" SP number NL
[Exactly once]
This Item specifies the version of the recommended-packages format that
is contained in the subsequent document. The version defined in this
proposal is version "1". Subsequent iterations of this protocol MUST
increment this value if they introduce incompatible changes to the
document format and MAY increment this value if they only introduce
additional Keywords.
"published" SP YYYY-MM-DD SP HH:MM:SS NL
[Exactly once]
The time, in GMT, when this recommended-packages document was generated.
Automatic update clients SHOULD ignore Documents over 60 days old.
"tor-stable-win32-version" SP TorVersion NL
[Exactly once]
This keyword specifies the latest recommended release of Tor's "stable"
branch for the Windows platform that has an installation package
available. Note that this version does not necessarily correspond to the
most recently tagged stable Tor version, since that version may not yet
have an installer package available, or may have known issues on
Windows.
The TorVersion field is formatted according to Section 2 of Tor's
version specification [3].
"tor-stable-win32-package" SP Url NL
[Once or more]
This Item specifies the location from which the most recent
recommended Windows installation package for Tor's stable branch can be
downloaded.
When this Item appears multiple times within the Document, automatic
update clients SHOULD select randomly from the available package
mirrors.
"tor-dev-win32-version" SP TorVersion NL
[Exactly once]
This Item specifies the latest recommended release of Tor's
"development" branch for the Windows platform that has an installation
package available. The same caveats from the description of
"tor-stable-win32-version" also apply to this keyword.
The TorVersion field is formatted according to Section 2 of Tor's
version specification [3].
"tor-dev-win32-package" SP Url NL
[Once or more]
This Item specifies the location from which the most recent recommended
Windows installation package and its signature for Tor's development
branch can be downloaded.
When this Keyword appears multiple times within the Document, automatic
update clients SHOULD select randomly from the available package
mirrors.
"signature" NL SIGNATURE NL
[At end, exactly once]
The "SIGNATURE" Object contains a PGP signature (using a packaging
authority signing key) of the entire document, taken from the beginning
of the "recommended-packages-format" keyword, through the newline after
the "signature" Keyword.
2. Automatic Update Client Behavior
The client-side component of the automatic update framework is an
application that runs on the end-user's machine. It is responsible for
fetching and verifying a recommended-packages document, as well as
downloading, verifying, and subsequently installing any necessary updated
software packages.
2.1. Download and verify a recommended-packages document
The first step in the automatic update process is for the client to download
a copy of the recommended-packages file. The automatic update client
contains a (hardcoded and/or user-configurable) list of URLs from which it
will attempt to retrieve a recommended-packages file.
Connections to each of the recommended-packages URLs SHOULD be attempted in
the following order:
1) HTTPS over Tor
2) HTTP over Tor
3) Direct HTTPS
4) Direct HTTP
If the client fails to retrieve a recommended-packages document via any of
the above connection methods from any of the configured URLs, the client
SHOULD retry its download attempts following an exponential back-off
algorithm. After the first failed attempt, the client SHOULD delay one hour
before attempting again, up to a maximum of 24 hours delay between retry
attempts.
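The back-off schedule could be generated like this (the doubling factor is an assumption; the text fixes only the one-hour start and the 24-hour cap):

```python
def retry_delays_hours(max_delay=24):
    """Yield the exponential back-off schedule described above: one hour
    after the first failure, doubling thereafter (assumed factor), capped
    at 24 hours between retry attempts."""
    delay = 1
    while True:
        yield delay
        delay = min(delay * 2, max_delay)
```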
After successfully downloading a recommended-packages file, the automatic
update client will verify the signature using one of the public keys
distributed with the client software. If more than one recommended-packages
file is downloaded and verified, the file with the most recent "published"
date that is verified will be retained and the rest discarded.
2.2. Download and verify the updated packages
The automatic update client next compares the latest recommended package
version from the recommended-packages document with the currently installed
Tor version. If the user currently has installed a Tor version from Tor's
"development" branch, then the version specified in "tor-dev-*-version" Item
is used for comparison. Similarly, if the user currently has installed a Tor
version from Tor's "stable" branch, then the version specified in the
"tor-stable-*version" Item is used for comparison. Version comparisons are
done according to Tor's version specification [3].
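A deliberately simplified comparison sketch (Tor's real ordering rules in [3] are richer, covering status tags and patch levels; this only handles the common cases):

```python
def version_key(v):
    """Rough comparison key for Tor-style versions like '0.2.1.6-alpha'.
    Compares the dotted numeric part and treats a tagged release
    ('-alpha', '-rc', ...) as earlier than the corresponding untagged
    release. Not a full implementation of Tor's version spec."""
    main, _, tag = v.partition("-")
    nums = tuple(int(x) for x in main.split("."))
    return nums + ((0,) if tag else (1,))

def is_newer(candidate, installed):
    return version_key(candidate) > version_key(installed)
```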
If the automatic update client determines an installation package newer than
the user's currently installed version is available, it will attempt to
download a package appropriate for the user's platform and Tor branch from a
URL specified by a "tor-[branch]-[platform]-package" Item. If more than one
mirror for the selected package is available, a mirror will be chosen at
random from all those available.
The automatic update client must also download a ".asc" signature file for
the retrieved package. The URL for the package signature is the same as that
for the package itself, except with the extension ".asc" appended to the
package URL.
Connections to download the updated package and its signature SHOULD be
attempted in the same order described in Section 2.1.
After completing the steps described in Sections 2.1 and 2.2, the automatic
update client will have downloaded and verified a copy of the latest Tor
installation package. It can then take whatever subsequent platform-specific
steps are necessary to install the downloaded software updates.
2.3. Periodic checking for updates
The automatic update client SHOULD maintain a local state file in which it
records (at a minimum) the timestamp at which it last retrieved a
recommended-packages file and the timestamp at which the client last
successfully downloaded and installed a software update.
Automatic update clients SHOULD check for an updated recommended-packages
document at most once per day but at least once every 30 days.
3. Future Extensions
There are several possible areas for future extensions of this framework.
The extensions below are merely suggestions and should be the subject of
their own proposal before being implemented.
3.1. Additional Software Updates
There are several software packages often included in Tor bundles besides
Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and
download locations of updated installation packages for these bundle
components can be easily added to the recommended-packages document
specification above.
3.2. Including ChangeLog Information
It may be useful for automatic update clients to be able to display for
users a summary of the changes made in the latest Tor or Tor-related
software release, before the user chooses to install the update. In the
future, we can add keywords to the specification in Section 1.2 that specify
the location of a ChangeLog file for the latest recommended package
versions. It may also be desirable to allow localized ChangeLog information,
so that the automatic update client can fetch release notes in the
end-user's preferred language.
3.3. Weighted Package Mirror Selection
We defined in Section 1.2 a method by which automatic update clients can
select from multiple available package mirrors. We may want to add a Weight
argument to the "*-package" Items that allows the recommended-packages file
to suggest to clients the probability with which a package mirror should be
chosen. This will allow clients to more appropriately distribute package
downloads across available mirrors proportional to their approximate
bandwidth.
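Assuming such a Weight argument were added to the "*-package" Items, the weighted selection could be sketched as:

```python
import random

def choose_mirror(weighted_mirrors):
    """weighted_mirrors: list of (url, weight) pairs, where weight is the
    suggested (hypothetical) Weight argument. Picks a mirror with
    probability proportional to its weight."""
    urls = [u for u, _ in weighted_mirrors]
    weights = [w for _, w in weighted_mirrors]
    return random.choices(urls, weights=weights, k=1)[0]
```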
Implementation
Implementation of this proposal will consist of two separate components.
The first component is a small "au-publish" tool that takes as input a
configuration file specifying the information described in Section 1.2 and a
private key. The tool is run by a "packaging authority" (someone responsible
for publishing updated installation packages), who will be prompted to enter
the passphrase for the private key used to sign the recommended-packages
document. The output of the tool is a document formatted according to
Section 1.2, with a signature appended at the end. The resulting document
can then be published to any of the update mirrors.
The second component is an "au-client" tool that is run on the end-user's
machine. It periodically checks for updated installation packages according
to Section 2 and fetches the packages if necessary. The public keys used
to sign the recommended-packages file and any of the published packages are
included in the "au-client" tool.
References
[1] Tor directory protocol (version 3),
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt
[2] Tor control protocol (version 2),
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt
[3] Tor version specification,
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt
Filename: 155-four-hidden-service-improvements.txt
Title: Four Improvements of Hidden Service Performance
Author: Karsten Loesing, Christian Wilms
Created: 25-Sep-2008
Status: Closed
Implemented-In: 0.2.1.x
Change history:
25-Sep-2008 Initial proposal for or-dev
Overview:
A performance analysis of hidden services [1] has brought up a few
possible design changes to reduce advertisement time of a hidden service
in the network as well as connection establishment time. Some of these
design changes have side-effects on anonymity or overall network load
which had to be weighed up against individual performance gains. A
discussion of seven possible design changes [2] has led to a selection
of four changes [3] that are proposed to be implemented here.
Design:
1. Shorter Circuit Extension Timeout
When establishing a connection to a hidden service a client cannibalizes
an existing circuit and extends it by one hop to one of the service's
introduction points. In most cases this can be accomplished within a few
seconds. Therefore, the current timeout of 60 seconds for extending a
circuit is far too high.
Assuming that the timeout would be reduced to a lower value, for example
30 seconds, a second (or third) attempt to cannibalize and extend would
be started earlier. With the current timeout of 60 seconds, 93.42% of all
circuits can be established, whereas this fraction would have been only
0.87% smaller at 92.55% with a timeout of 30 seconds.
For a timeout of 30 seconds the performance gain would be approximately 2
seconds in the mean as opposed to the current timeout of 60 seconds. At
the same time a smaller timeout leads to discarding an increasing number
of circuits that might have been completed within the current timeout of
60 seconds.
Measurements with simulated low-bandwidth connectivity have shown that
there is no significant effect of client connectivity on circuit
extension times. The reason for this might be that extension messages are
small and thereby independent of the client bandwidth. Further, the
connection between client and entry node only constitutes a single hop of
a circuit, so that its influence on the whole circuit is limited.
The exact value of the new timeout does not necessarily have to be 30
seconds, but might also depend on the results of circuit build timeout
measurements as described in proposal 151.
2. Parallel Connections to Introduction Points
An additional approach to accelerate extension of introduction circuits
is to extend a second circuit in parallel to a different introduction
point. Such parallel extension attempts should be started after a short
delay of, e.g., 15 seconds in order to prevent unnecessary circuit
extensions and thereby save network resources. Whichever circuit
extension succeeds first is used for introduction, while the other
attempt is aborted.
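A toy model of the delayed-parallelism scheme: the second extension attempt starts after the delay, and whichever attempt finishes first wins. The per-attempt durations are illustrative, not measured values:

```python
def introduction_time(first_attempt, second_attempt, delay=15.0):
    """Time (seconds) until an introduction circuit is ready, given the
    durations of two extension attempts and the delay before the second
    attempt is launched."""
    return min(first_attempt, delay + second_attempt)
```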
An evaluation has been performed for the more resource-intensive approach
of starting two parallel circuits immediately instead of waiting for a
short delay. The result was a reduction of connection establishment times
from 27.4 seconds in the original protocol to 22.5 seconds.
While the effect of the proposed approach of delayed parallelization on
mean connection establishment times is expected to be smaller,
variability of connection attempt times can be reduced significantly.
3. Increase Count of Internal Circuits
Hidden services need to create or cannibalize and extend a circuit to a
rendezvous point for every client request. Really popular hidden services
require more than two internal circuits in the pool to answer multiple
client requests at the same time. This scenario has not yet been analyzed,
but will probably exhibit worse performance than measured in the previous
analysis. The number of preemptively built internal circuits should be a
function of connection requests in the past to adapt to changing needs.
Furthermore, an increased number of internal circuits on client side
would allow clients to establish connections to more than one hidden
service at a time.
Under the assumption that a popular hidden service cannot make use of
cannibalization for connecting to rendezvous points, the circuit creation
time needs to be added to the current results. In the mean, the
connection establishment time to a popular hidden service would increase
by 4.7 seconds.
4. Build More Introduction Circuits
When establishing introduction points, a hidden service should launch 5
instead of 3 introduction circuits at the same time and use only the
first 3 that could be established. The remaining two circuits could still
be used for other purposes afterwards.
The effect has been simulated using previously measured data, too. To this
end, circuit establishment times were derived from log files and
written to an array. Afterwards, a simulation with 10,000 runs was
performed picking 5 (4, 6) random values and using the 3 lowest values in
contrast to picking only 3 values at random. The result is that the mean
time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of
the 3-out-of-5 approach is 4.4 seconds.
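The 3-out-of-N experiment can be re-run as a sketch like the following; the sample values in the test are illustrative, not the measured data set from the log files:

```python
import random

def mean_intro_setup_time(samples, launch, keep=3, runs=10000, seed=1):
    """Monte-Carlo re-run of the experiment described above: draw `launch`
    circuit build times from `samples` and wait only for the `keep`
    fastest circuits to complete."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        picks = sorted(rng.choice(samples) for _ in range(launch))
        total += picks[keep - 1]   # time until the keep-th circuit is up
    return total / runs
```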
The effect on network load is minimal, because the hidden service can
reuse the slower internal circuits for other purposes, e.g., rendezvous
circuits. The only change is that a hidden service starts establishing
more circuits at once instead of subsequently doing so.
References:
[1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf
[2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf
[3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf
Filename: 156-tracking-blocked-ports.txt
Title: Tracking blocked ports on the client side
Author: Robert Hogan
Created: 14-Oct-2008
Status: Superseded
[Superseded by a later proposal, which recognizes the security issues here.]
Motivation:
Tor clients that are behind extremely restrictive firewalls can end up
waiting a while for their first successful OR connection to a node on the
network. Worse, the more restrictive their firewall the more susceptible
they are to an attacker guessing their entry nodes. Tor routers that
are behind extremely restrictive firewalls can only offer a limited,
'partitioned' service to other routers and clients on the network. Exit
nodes behind extremely restrictive firewalls may advertise ports that they
are actually not able to connect to, wasting network resources in circuit
constructions that are doomed to fail at the last hop on first use.
Proposal:
When a client attempts to connect to an entry guard it should avoid
further attempts on ports that fail once until it has connected to at
least one entry guard successfully. (Maybe it should wait for more than
one failure to reduce the skew on the first node selection.) Thereafter
it should select entry guards regardless of port, and warn the user
whenever the number of connection failures on a given port reaches a
multiple of 5 with no success, or since the last success.
Tor should warn the operators of exit, middleman and entry nodes if it
observes that connections to a given port have failed a multiple of 5
times without success or since the last success. If attempts on a port
fail 20 or more times with no success, or since the last success, Tor
should add the port
to a 'blocked-ports' entry in its descriptor's extra-info. Some thought
needs to be given to what the authorities might do with this information.
Related TODO item:
"- Automatically determine what ports are reachable and start using
those, if circuits aren't working and it's a pattern we
recognize ("port 443 worked once and port 9001 keeps not
working")."
I've had a go at implementing all of this in the attached.
Addendum:
Just a note on the patch, storing the digest of each router that uses the port
is a bit of a memory hog, and its only real purpose is to provide a count of
routers using that port when warning the user. That could be achieved when
warning the user by iterating through the routerlist instead.
Index: src/or/connection_or.c
===================================================================
--- src/or/connection_or.c (revision 17104)
+++ src/or/connection_or.c (working copy)
@@ -502,6 +502,9 @@
connection_or_connect_failed(or_connection_t *conn,
int reason, const char *msg)
{
+ if ((reason == END_OR_CONN_REASON_NO_ROUTE) ||
+ (reason == END_OR_CONN_REASON_REFUSED))
+ or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port);
control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason);
if (!authdir_mode_tests_reachability(get_options()))
control_event_bootstrap_problem(msg, reason);
@@ -580,6 +583,7 @@
/* already marked for close */
return NULL;
}
+
return conn;
}
@@ -909,6 +913,7 @@
control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0);
if (started_here) {
+ or_port_hist_success(TO_CONN(conn)->port);
rep_hist_note_connect_succeeded(conn->identity_digest, now);
if (entry_guard_register_connect_status(conn->identity_digest,
1, now) < 0) {
Index: src/or/rephist.c
===================================================================
--- src/or/rephist.c (revision 17104)
+++ src/or/rephist.c (working copy)
@@ -18,6 +18,7 @@
static void bw_arrays_init(void);
static void predicted_ports_init(void);
static void hs_usage_init(void);
+static void or_port_hist_init(void);
/** Total number of bytes currently allocated in fields used by rephist.c. */
uint64_t rephist_total_alloc=0;
@@ -89,6 +90,25 @@
digestmap_t *link_history_map;
} or_history_t;
+/** or_port_hist_t contains our router/client's knowledge of
+ all OR ports offered on the network, and how many servers with each port we
+ have succeeded or failed to connect to. */
+typedef struct {
+ /** The port this entry is tracking. */
+ uint16_t or_port;
+ /** Have we ever connected to this port on another OR? */
+ unsigned int success:1;
+ /** The ORs using this port. */
+ digestmap_t *ids;
+ /** The ORs using this port we have failed to connect to. */
+ digestmap_t *failure_ids;
+ /** Are we excluding ORs with this port during entry selection?*/
+ unsigned int excluded;
+} or_port_hist_t;
+
+static unsigned int still_searching = 0;
+static smartlist_t *or_port_hists;
+
/** When did we last multiply all routers' weighted_run_length and
* total_run_weights by STABILITY_ALPHA? */
static time_t stability_last_downrated = 0;
@@ -164,6 +184,16 @@
tor_free(hist);
}
+/** Helper: free storage held by a single OR port history entry. */
+static void
+or_port_hist_free(or_port_hist_t *p)
+{
+ tor_assert(p);
+ digestmap_free(p->ids,NULL);
+ digestmap_free(p->failure_ids,NULL);
+ tor_free(p);
+}
+
/** Update an or_history_t object <b>hist</b> so that its uptime/downtime
* count is up-to-date as of <b>when</b>.
*/
@@ -1639,7 +1669,7 @@
tmp_time = smartlist_get(predicted_ports_times, i);
if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) {
tmp_port = smartlist_get(predicted_ports_list, i);
- log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port);
+ log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port);
smartlist_del(predicted_ports_list, i);
smartlist_del(predicted_ports_times, i);
rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t);
@@ -1821,6 +1851,12 @@
tor_free(last_stability_doc);
built_last_stability_doc_at = 0;
predicted_ports_free();
+ if (or_port_hists) {
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p,
+ or_port_hist_free(p));
+ smartlist_free(or_port_hists);
+ or_port_hists = NULL;
+ }
}
/****************** hidden service usage statistics ******************/
@@ -2356,3 +2392,225 @@
tor_free(fname);
}
+/** Create a new entry in the port tracking cache for the or_port in
+ * <b>ri</b>. */
+void
+or_port_hist_new(const routerinfo_t *ri)
+{
+ or_port_hist_t *result;
+ const char *id=ri->cache_info.identity_digest;
+
+ if (!or_port_hists)
+ or_port_hist_init();
+
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ /* Cope with routers that change their advertised OR port or are
+ dropped from the networkstatus. We don't discard the failures of
+ dropped routers because they are still valid when counting
+ consecutive failures on a port.*/
+ if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) {
+ digestmap_remove(tp->ids, id);
+ }
+ if (tp->or_port == ri->or_port) {
+ if (!(digestmap_get(tp->ids, id)))
+ digestmap_set(tp->ids, id, (void*)1);
+ return;
+ }
+ });
+
+ result = tor_malloc_zero(sizeof(or_port_hist_t));
+ result->or_port=ri->or_port;
+ result->success=0;
+ result->ids=digestmap_new();
+ digestmap_set(result->ids, id, (void*)1);
+ result->failure_ids=digestmap_new();
+ result->excluded=0;
+ smartlist_add(or_port_hists, result);
+}
+
+/** Create the port tracking cache. */
+/*XXX: need to call this when we rebuild/update our network status */
+static void
+or_port_hist_init(void)
+{
+ routerlist_t *rl = router_get_routerlist();
+
+ if (!or_port_hists)
+ or_port_hists=smartlist_create();
+
+ if (rl && rl->routers) {
+ SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri,
+ {
+ or_port_hist_new(ri);
+ });
+ }
+}
+
+#define NOT_BLOCKED 0
+#define FAILURES_OBSERVED 1
+#define POSSIBLY_BLOCKED 5
+#define PROBABLY_BLOCKED 10
+/** Return the list of blocked ports for our router's extra-info.*/
+char *
+or_port_hist_get_blocked_ports(void)
+{
+ char blocked_ports[2048];
+ char *bp;
+
+ tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports");
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED)
+ tor_snprintf(blocked_ports+strlen(blocked_ports),
+ sizeof(blocked_ports)," %u,",tp->or_port);
+ });
+ if (strlen(blocked_ports) == 13)
+ return NULL;
+ bp=tor_strdup(blocked_ports);
+ bp[strlen(bp)-1]='\n';
+ bp[strlen(bp)]='\0';
+ return bp;
+}
+
+/** Warn the user if we have seen too many failures on a port or
+ * range of ports.*/
+static void
+or_port_hist_report_block(unsigned int min_severity)
+{
+ or_options_t *options=get_options();
+ char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048];
+ char port[1024];
+
+ memset(failures_observed,0,sizeof(failures_observed));
+ memset(possibly_blocked,0,sizeof(possibly_blocked));
+ memset(probably_blocked,0,sizeof(probably_blocked));
+
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ unsigned int failures = digestmap_size(tp->failure_ids);
+ if (failures >= min_severity) {
+ tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the"
+ " network)",tp->or_port,failures,
+ (!tp->success)?"and no successes": "since last success",
+ digestmap_size(tp->ids));
+ if (failures >= PROBABLY_BLOCKED) {
+ strlcat(probably_blocked, port, sizeof(probably_blocked));
+ } else if (failures >= POSSIBLY_BLOCKED)
+ strlcat(possibly_blocked, port, sizeof(possibly_blocked));
+ else if (failures >= FAILURES_OBSERVED)
+ strlcat(failures_observed, port, sizeof(failures_observed));
+ }
+ });
+
+ log_warn(LD_HIST,"%s%s%s%s%s%s%s%s",
+ server_mode(options) &&
+ ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))?
+ "You should consider disabling your Tor server.":"",
+ (min_severity==FAILURES_OBSERVED)?
+ "Tor appears to be blocked from connecting to a range of ports "
+ "with the result that it cannot connect to one tenth of the Tor "
+ "network. ":"",
+ strlen(failures_observed)?
+ "Tor has observed failures on the following ports: ":"",
+ failures_observed,
+ strlen(possibly_blocked)?
+ "Tor is possibly blocked on the following ports: ":"",
+ possibly_blocked,
+ strlen(probably_blocked)?
+ "Tor is almost certainly blocked on the following ports: ":"",
+ probably_blocked);
+
+}
+
+/** Record a successful connection to OR port <b>or_port</b>. */
+void
+or_port_hist_success(uint16_t or_port)
+{
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ if (tp->or_port != or_port)
+ continue;
+ /*Reset our failure stats so we can notice if this port ever gets
+ blocked again.*/
+ tp->success=1;
+ if (digestmap_size(tp->failure_ids)) {
+ digestmap_free(tp->failure_ids,NULL);
+ tp->failure_ids=digestmap_new();
+ }
+ if (still_searching) {
+ still_searching=0;
+ SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;);
+ }
+ return;
+ });
+}
+/** Record the failure of our connection to <b>digest</b>'s
+ * OR port. Warn, exclude the port from future entry guard selection, or
+ * add port to blocked-ports in our server's extra-info as appropriate. */
+void
+or_port_hist_failure(const char *digest, uint16_t or_port)
+{
+ int total_failures=0, ports_excluded=0, report_block=0;
+ int total_routers=smartlist_len(router_get_routerlist()->routers);
+
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ ports_excluded += tp->excluded;
+ total_failures+=digestmap_size(tp->failure_ids);
+ if (tp->or_port != or_port)
+ continue;
+ /* We're only interested in unique failures */
+ if (digestmap_get(tp->failure_ids, digest))
+ return;
+
+ total_failures++;
+ digestmap_set(tp->failure_ids, digest, (void*)1);
+ if (still_searching && !tp->success) {
+ tp->excluded=1;
+ ports_excluded++;
+ }
+ if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) &&
+ !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED))
+ report_block=POSSIBLY_BLOCKED;
+ });
+
+ if (total_failures >= (int)(total_routers/10))
+ or_port_hist_report_block(FAILURES_OBSERVED);
+ else if (report_block)
+ or_port_hist_report_block(report_block);
+
+ if (ports_excluded >= smartlist_len(or_port_hists)) {
+ log_warn(LD_HIST,"During entry node selection Tor tried every port "
+ "offered on the network on at least one server "
+ "and didn't manage a single "
+ "successful connection. This suggests you are behind an "
+ "extremely restrictive firewall. Tor will keep trying to find "
+ "a reachable entry node.");
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;);
+ }
+}
+
+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */
+void
+or_port_hist_exclude(routerset_t *rt)
+{
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ char portpolicy[9];
+ if (tp->excluded) {
+ tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port);
+ log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily "
+ "from entry guard selection.", tp->or_port);
+ routerset_parse(rt, portpolicy, "Ports");
+ }
+ });
+}
+
+/** Allow the exclusion of ports during our search for an entry node. */
+void
+or_port_hist_search_again(void)
+{
+ still_searching=1;
+}
Index: src/or/or.h
===================================================================
--- src/or/or.h (revision 17104)
+++ src/or/or.h (working copy)
@@ -3864,6 +3864,13 @@
int any_predicted_circuits(time_t now);
int rep_hist_circbuilding_dormant(time_t now);
+void or_port_hist_failure(const char *digest, uint16_t or_port);
+void or_port_hist_success(uint16_t or_port);
+void or_port_hist_new(const routerinfo_t *ri);
+void or_port_hist_exclude(routerset_t *rt);
+void or_port_hist_search_again(void);
+char *or_port_hist_get_blocked_ports(void);
+
/** Possible public/private key operations in Tor: used to keep track of where
* we're spending our time. */
typedef enum {
Index: src/or/routerparse.c
===================================================================
--- src/or/routerparse.c (revision 17104)
+++ src/or/routerparse.c (working copy)
@@ -1401,6 +1401,8 @@
goto err;
}
+ or_port_hist_new(router);
+
if (!router->platform) {
router->platform = tor_strdup("<unknown>");
}
Index: src/or/router.c
===================================================================
--- src/or/router.c (revision 17104)
+++ src/or/router.c (working copy)
@@ -1818,6 +1818,7 @@
char published[ISO_TIME_LEN+1];
char digest[DIGEST_LEN];
char *bandwidth_usage;
+ char *blocked_ports;
int result;
size_t len;
@@ -1825,7 +1826,6 @@
extrainfo->cache_info.identity_digest, DIGEST_LEN);
format_iso_time(published, extrainfo->cache_info.published_on);
bandwidth_usage = rep_hist_get_bandwidth_lines(1);
-
result = tor_snprintf(s, maxlen,
"extra-info %s %s\n"
"published %s\n%s",
@@ -1835,6 +1835,16 @@
if (result<0)
return -1;
+ blocked_ports = or_port_hist_get_blocked_ports();
+ if (blocked_ports) {
+ result = tor_snprintf(s+strlen(s), maxlen-strlen(s),
+ "%s",
+ blocked_ports);
+ tor_free(blocked_ports);
+ if (result<0)
+ return -1;
+ }
+
if (should_record_bridge_info(options)) {
static time_t last_purged_at = 0;
char *geoip_summary;
Index: src/or/circuitbuild.c
===================================================================
--- src/or/circuitbuild.c (revision 17104)
+++ src/or/circuitbuild.c (working copy)
@@ -62,6 +62,7 @@
static void entry_guards_changed(void);
static time_t start_of_month(time_t when);
+static int num_live_entry_guards(void);
/** Iterate over values of circ_id, starting from conn-\>next_circ_id,
* and with the high bit specified by conn-\>circ_id_type, until we get
@@ -1627,12 +1628,14 @@
smartlist_t *excluded;
or_options_t *options = get_options();
router_crn_flags_t flags = 0;
+ routerset_t *_ExcludeNodes;
if (state && options->UseEntryGuards &&
(purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) {
return choose_random_entry(state);
}
+ _ExcludeNodes = routerset_new();
excluded = smartlist_create();
if (state && (r = build_state_get_exit_router(state))) {
@@ -1670,12 +1673,18 @@
if (options->_AllowInvalid & ALLOW_INVALID_ENTRY)
flags |= CRN_ALLOW_INVALID;
+ if (options->ExcludeNodes)
+ routerset_union(_ExcludeNodes,options->ExcludeNodes);
+
+ or_port_hist_exclude(_ExcludeNodes);
+
choice = router_choose_random_node(
NULL,
excluded,
- options->ExcludeNodes,
+ _ExcludeNodes,
flags);
smartlist_free(excluded);
+ routerset_free(_ExcludeNodes);
return choice;
}
@@ -2727,6 +2736,7 @@
entry_guards_update_state(or_state_t *state)
{
config_line_t **next, *line;
+ unsigned int have_reachable_entry=0;
if (! entry_guards_dirty)
return;
@@ -2740,6 +2750,7 @@
char dbuf[HEX_DIGEST_LEN+1];
if (!e->made_contact)
continue; /* don't write this one to disk */
+ have_reachable_entry=1;
*next = line = tor_malloc_zero(sizeof(config_line_t));
line->key = tor_strdup("EntryGuard");
line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2);
@@ -2785,6 +2796,11 @@
if (!get_options()->AvoidDiskWrites)
or_state_mark_dirty(get_or_state(), 0);
entry_guards_dirty = 0;
+
+ /* XXX: Is this the place to decide that we no longer have any reachable
+ guards? */
+ if (!have_reachable_entry)
+ or_port_hist_search_again();
}
/** If <b>question</b> is the string "entry-guards", then dump
Filename: 157-specific-cert-download.txt
Title: Make certificate downloads specific
Author: Nick Mathewson
Created: 2-Dec-2008
Status: Closed
Target: 0.2.4.x
History:
2008 Dec 2, 22:34
Changed name of cross certification field to match the other authority
certificate fields.
Status:
As of 0.2.1.9-alpha:
Cross-certification is implemented for new certificates, but not yet
required. Directories support the tor/keys/fp-sk urls.
Overview:
Tor's directory specification gives two ways to download a certificate:
by its identity fingerprint, or by the digest of its signing key. Both
are error-prone. We propose a new download mechanism to make sure that
clients get the certificates they want.
Motivation:
When a client wants a certificate to verify a consensus, it has two choices
currently:
- Download by identity key fingerprint. In this case, the client risks
getting a certificate for the same authority, but with a different
signing key than the one used to sign the consensus.
- Download by signing key fingerprint. In this case, the client risks
getting a forged certificate that contains the right signing key
signed with the wrong identity key. (Since caches are willing to
cache certs from authorities they do not themselves recognize, the
attacker wouldn't need to compromise an authority's key to do this.)
Current solution:
Clients fetch by identity keys, and re-fetch with backoff if they don't get
certs with the signing key they want.
Proposed solution:
Phase 1: Add a URL type for clients to download certs by identity _and_
signing key fingerprint. Unless both fields match, the client doesn't
accept the certificate(s). Clients begin using this method when their
randomly chosen directory cache supports it.
Phase 1A: Simultaneously, add a cross-certification element to
certificates.
Phase 2: Once many directory caches support phase 1, clients should prefer
to fetch certificates using that protocol when available.
Phase 2A: Once all authorities are generating cross-certified certificates
as in phase 1A, require cross-certification.
Specification additions:
The key certificate whose identity key fingerprint is <F> and whose signing
key fingerprint is <S> should be available at:
http://<hostname>/tor/keys/fp-sk/<F>-<S>.z
As usual, clients may request multiple certificates using:
http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z
Clients SHOULD use this format whenever they know both key fingerprints for
a desired certificate.
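Building such a fetch URL can be sketched as follows (the hostname and fingerprints are placeholders):

```python
def fp_sk_url(hostname, pairs, compress=True):
    """Build the tor/keys/fp-sk URL for a list of
    (identity fingerprint, signing key fingerprint) pairs, with pairs
    joined by '+' and the two fingerprints joined by '-'."""
    path = "+".join("%s-%s" % (f, s) for f, s in pairs)
    return "http://%s/tor/keys/fp-sk/%s%s" % (
        hostname, path, ".z" if compress else "")
```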
Certificates SHOULD contain the following field (at most once):
"dir-key-crosscert" NL CrossSignature NL
where CrossSignature is a signature, made using the certificate's signing
key, of the digest of the PKCS1-padded hash of the certificate's identity
key. For backward compatibility with broken versions of the parser, we
wrap the base64-encoded signature in -----BEGIN ID SIGNATURE----- and
-----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow
the "ID " portion to be omitted, however.
When encountering a certificate with a dir-key-crosscert entry,
implementations MUST verify that the signature is a correct signature of
the hash of the identity key using the signing key.
(In a future version of this specification, dir-key-crosscert entries will
be required.)
Why cross-certify too?
Cross-certification protects clients who haven't updated yet, by reducing
the number of caches that are willing to hold and serve bogus certificates.
References:
This is related to part 2 of bug 854.
Filename: 158-microdescriptors.txt
Title: Clients download consensus + microdescriptors
Author: Roger Dingledine
Created: 17-Jan-2009
Status: Closed
Implemented-In: 0.2.3.1-alpha
0. History
15 May 2009: Substantially revised based on discussions on or-dev
from late January. Removed the notion of voting on how to choose
microdescriptors; made it just a function of the consensus method.
(This lets us avoid the possibility of "desynchronization.")
Added suggestion to use a new consensus flavor. Specified use of
SHA256 for new hashes. -nickm
15 June 2009: Cleaned up based on comments from Roger. -nickm
1. Overview
This proposal replaces section 3.2 of proposal 141, which was
called "Fetching descriptors on demand". Rather than modifying the
circuit-building protocol to fetch a server descriptor inline at each
circuit extend, we instead put all of the information that clients need
either into the consensus itself, or into a new set of data about each
relay called a microdescriptor.
Descriptor elements that are small and frequently changing should go
in the consensus itself, and descriptor elements that are small and
relatively static should go in the microdescriptor. If we ever end up
with descriptor elements that aren't small yet clients need to know
them, we'll need to resume considering some design like the one in
proposal 141.
Note also that any descriptor element which clients need to use to
decide which servers to fetch info about, or which servers to fetch
info from, needs to stay in the consensus.
2. Motivation
See
http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
http://archives.seul.org/or/dev/Nov-2008/msg00007.html
for a discussion of the options and why this is currently the best
approach.
3. Design
There are three pieces to the proposal. First, authorities will list in
their votes (and thus in the consensus) the expected hash of the
microdescriptor for each relay. Second, authorities will serve
microdescriptors, and directory mirrors will cache and serve them.
Third, clients will ask for them and cache them.
3.1. Consensus changes
If the authorities choose a consensus method of a given version or
later, a microdescriptor format is implicit in that version.
A microdescriptor should in every case be a pure function of the
router descriptor and the consensus method.
In votes, we need to include the hash of each expected microdescriptor
in the routerstatus section. I suggest a new "m" line for each stanza,
with the base64 of the SHA256 hash of the router's microdescriptor.
For every consensus method that an authority supports, it includes a
separate "m" line in each router section of its vote, containing:
"m" SP methods 1*(SP AlgorithmName "=" digest) NL
where methods is a comma-separated list of the consensus methods
that the authority believes will produce "digest".
(As with base64 encoding of SHA1 hashes in consensuses, let's
omit the trailing =s)
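A sketch of formatting such an "m" line, with the trailing '=' padding stripped as described (the microdescriptor bytes below are placeholders):

```python
import base64
import hashlib

def m_line(methods, microdescriptor):
    """Format a vote's "m" line: comma-separated consensus methods,
    then sha256=<base64 digest of the microdescriptor, without
    trailing '=' padding>."""
    digest = base64.b64encode(
        hashlib.sha256(microdescriptor).digest()).decode().rstrip("=")
    return "m %s sha256=%s" % (",".join(str(m) for m in methods), digest)
```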
The consensus microdescriptor-elements and "m" lines are then computed
as described in Section 3.1.2 below.
(This means we need a new consensus-method that knows
how to compute the microdescriptor-elements and add "m" lines.)
The microdescriptor consensus uses the directory-signature format from
proposal 162, with the "sha256" algorithm.
3.1.1. Descriptor elements to include for now
In the first version, the microdescriptor should contain the
onion-key element, and the family element from the router descriptor,
and the exit policy summary as currently specified in dir-spec.txt.
3.1.2. Computing consensus for microdescriptor-elements and "m" lines
When we are generating a consensus, we use whichever m line
unambiguously corresponds to the descriptor digest that will be
included in the consensus.
(If different votes have different microdescriptor digests for a
single <descriptor-digest, consensus-method> pair, then at least one
of the authorities is broken. If this happens, the consensus should
contain whichever microdescriptor digest is most common. If there is
no winner, we break ties in the favor of the lexically earliest.
Either way, we should log a warning: there is definitely a bug.)
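The tie-breaking rule above can be sketched as follows (a toy illustration, not Tor's implementation):

```python
from collections import Counter

def pick_md_digest(voted_digests):
    # voted_digests: the digest each authority voted for one
    # <descriptor-digest, consensus-method> pair
    counts = Counter(voted_digests)
    top = max(counts.values())
    # the most common digest wins; ties go to the lexically earliest
    return min(d for d, c in counts.items() if c == top)
```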
The "m" lines in a consensus contain only the digest, not a list of
consensus methods.
3.1.3. A new flavor of consensus
Rather than inserting "m" lines in the current consensus format,
they should be included in a new consensus flavor (see proposal
162).
This flavor can safely omit descriptor digests.
When we implement this voting method, we can remove the exit policy
summary from the current "ns" flavor of consensus, since no current
clients use them, and they take up about 5% of the compressed
consensus.
This new consensus flavor should be signed with the sha256 signature
format as documented in proposal 162.
3.2. Directory mirrors fetch, cache, and serve microdescriptors
Directory mirrors should fetch, cache, and serve each microdescriptor
from the authorities. (They need to continue to serve normal relay
descriptors too, to handle old clients.)
The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
available at:
http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
(We use base64 for size and for consistency with the consensus
format. We use -s instead of +s to separate these items, since
the + character is used in base64 encoding.)
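A sketch of building that fetch URL (function name hypothetical):

```python
def microdesc_fetch_url(hostname, digests):
    # digests: base64 SHA256 digests with trailing '=' omitted;
    # joined with '-' since '+' already occurs in base64
    return "http://%s/tor/micro/d/%s.z" % (hostname, "-".join(digests))
```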
All the microdescriptors from the current consensus should also be
available at:
http://<hostname>/tor/micro/all.z
so a client that's bootstrapping doesn't need to send a 70KB URL just
to name every microdescriptor it's looking for.
Microdescriptors have no header or footer.
The hash of the microdescriptor is simply the hash of the concatenated
elements.
Directory mirrors should check to make sure that the microdescriptors
they're about to serve match the right hashes (either the hashes from
the fetch URL or the hashes from the consensus, respectively).
We will probably want to consider some sort of smart data structure to
be able to quickly convert microdescriptor hashes into the appropriate
microdescriptor. Clients will want this anyway when they load their
microdescriptor cache and want to match it up with the consensus to
see what's missing.
3.3. Clients fetch them and cache them
When a client gets a new consensus, it looks to see if there are any
microdescriptors it needs to learn. If it needs to learn more than
some threshold of the microdescriptors (half?), it requests 'all',
else it requests only the missing ones. Clients MAY try to
determine whether the upload bandwidth for listing the
microdescriptors they want is more or less than the download
bandwidth for the microdescriptors they do not want.
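The fetch decision above might be sketched like this, with the threshold left tunable since the proposal only suggests "half?":

```python
def plan_md_fetch(missing, total, threshold=0.5):
    # missing: microdescriptors the client lacks; total: number in consensus
    if total and missing / total > threshold:
        return "all"      # fetch /tor/micro/all.z
    return "missing"      # fetch only the named digests
```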
Clients maintain a cache of microdescriptors along with metadata like
when it was last referenced by a consensus, and which identity key
it corresponds to. They keep a microdescriptor
until it hasn't been mentioned in any consensus for a week. Future
clients might cache them for longer or shorter times.
3.3.1. Information leaks from clients
If a client asks you for a set of microdescs, then you know she didn't
have them cached before. How much does that leak? What about when
we're all using our entry guards as directory guards, and we've seen
that user make a bunch of circuits already?
Fetching "all" when you need at least half is a good first order fix,
but might not be all there is to it.
Another future option would be to fetch some of the microdescriptors
anonymously (via a Tor circuit).
Another crazy option (Roger's phrasing) is to do decoy fetches as
well.
4. Transition and deployment
Phase one, the directory authorities should start voting on
microdescriptors, and putting them in the consensus.
Phase two, directory mirrors should learn how to serve them, and learn
how to read the consensus to find out what they should be serving.
Phase three, clients should start fetching and caching them instead
of normal descriptors.
Filename: 159-exit-scanning.txt
Title: Exit Scanning
Author: Mike Perry
Created: 13-Feb-2009
Status: Informational
Overview:
This proposal describes the implementation and integration of an
automated exit node scanner for scanning the Tor network for malicious,
misconfigured, firewalled or filtered nodes.
Motivation:
Tor exit nodes can be run by anyone with an Internet connection. Often,
these users aren't fully aware of limitations of their networking
setup. Content filters, antivirus software, advertisements injected by
their service providers, malicious upstream providers, and the resource
limitations of their computer or networking equipment have all been
observed on the current Tor network.
It is also possible that some nodes exist purely for malicious
purposes. In the past, there have been intermittent instances of
nodes spoofing SSH keys, as well as nodes being used for purposes of
plaintext surveillance.
While it is not realistic to expect to catch extremely targeted or
completely passive malicious adversaries, the goal is to prevent
malicious adversaries from deploying dragnet attacks against large
segments of the Tor userbase.
Scanning methodology:
The first scans to be implemented are HTTP, HTML, Javascript, and
SSL scans.
The HTTP scan scrapes Google for common filetype URLs such as exe, msi,
doc, dmg, etc. It then fetches these URLs through both non-Tor and Tor,
and compares the SHA1 hashes of the resulting content.
The SSL scan downloads certificates for all IPs a domain will locally
resolve to and compares these certificates to those seen over Tor. The
scanner notes if a domain had rotated certificates locally in the
results for each scan.
The HTML scan checks HTML, Javascript, and plugin content for
modifications. Because of the dynamic nature of most of the web, the
scanner has a number of mechanisms built in to filter out false
positives that are used when a change is noticed between Tor and
Non-Tor.
All tests also share a URL-based false positive filter that
automatically removes results retroactively if the number of failures
exceeds a certain percentage of nodes tested with the URL.
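A minimal sketch of that URL-based filter (the data shape and the cutoff fraction are assumptions; the proposal does not fix a percentage):

```python
def drop_bad_urls(results, max_fail_fraction):
    # results: {url: {node_id: test_passed}}
    kept = {}
    for url, by_node in results.items():
        failures = sum(1 for passed in by_node.values() if not passed)
        # discard a URL retroactively if too many nodes "fail" on it
        if failures <= max_fail_fraction * len(by_node):
            kept[url] = by_node
    return kept
```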
Deployment Stages:
To avoid instances where bugs cause us to mark exit nodes as BadExit
improperly, it is proposed that we begin use of the scanner in stages.
1. Manual Review:
In the first stage, basic scans will be run by a small number of
people while we stabilize the scanner. The scanner has the ability
to resume crashed scans, and to rescan nodes that fail various
tests.
2. Human Review:
In the second stage, results will be automatically mailed to
an email list of interested parties for review. We will also begin
classifying failure types into three to four different severity
levels, based on both the reliability of the test and the nature of
the failure.
3. Automatic BadExit Marking:
In the final stage, the scanner will begin marking exits depending
on the failure severity level in one of three different ways: by
node idhex, by node IP, or by node IP mask. A potential fourth, less
severe category of results may still be delivered via email only for
review.
BadExit markings will be delivered in batches upon completion
of whole-network scans, so that the final false positive
filter has an opportunity to filter out URLs that exhibit
dynamic content beyond what we can filter.
Specification of Exit Marking:
Technically, BadExit could be marked via SETCONF AuthDirBadExit over
the control port, but this would allow full access to the directory
authority configuration and operation.
The approved-routers file could also be used, but currently it only
supports fingerprints, and it also contains other data unrelated to
exit scanning that would be difficult to coordinate.
Instead, we propose a new badexit-routers file with two
keywords:
BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]
BadExitNet lines would follow the codepaths used by AuthDirBadExit to
set authdir_badexit_policy, and BadExitFP would follow the codepaths
from approved-router's !badexit lines.
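A sketch of parsing such a file (a toy reader, not the authority's code):

```python
def parse_badexit_file(text):
    """Return (BadExitNet patterns, BadExitFP fingerprints)."""
    nets, fps = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "BadExitNet":
            nets.extend(parts[1:])
        elif parts[0] == "BadExitFP":
            fps.extend(parts[1:])
    return nets, fps
```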
The scanner would have exclusive ability to write, append, rewrite,
and modify this file. Prior to building a new consensus vote, a
participating Tor authority would read in a fresh copy.
Security Implications:
Aside from evading the scanner's detection, there are two additional
high-level security considerations:
1. Ensure nodes cannot be marked BadExit by an adversary at will
It is possible individual website owners will be able to target certain
Tor nodes, but once they begin to attempt to fail more than the URL
filter percentage of the exits, their sites will be automatically
discarded.
Failing specific nodes is possible, but scanned results are fully
reproducible, and BadExits should be rare enough that humans are never
fully removed from the loop.
State (cookies, cache, etc) does not otherwise persist in the scanner
between exit nodes to enable one exit node to bias the results of a
later one.
2. Ensure that scanner compromise does not yield authority compromise
Having a separate file that is under the exclusive control of the
scanner allows us to heavily isolate the scanner from the Tor
authority, potentially even running them on separate machines.
Filename: 160-bandwidth-offset.txt
Title: Authorities vote for bandwidth offsets in consensus
Author: Roger Dingledine
Created: 4-May-2009
Status: Closed
Target: 0.2.1.x
1. Motivation
As part of proposal 141, we moved the bandwidth value for each relay
into the consensus. Now clients can know how they should load balance
even before they've fetched the corresponding relay descriptors.
Putting the bandwidth in the consensus also lets the directory
authorities choose more accurate numbers to advertise, if we come up
with a better algorithm for deciding weightings.
Our original plan was to teach directory authorities how to measure
bandwidth themselves; then every authority would vote for the bandwidth
it prefers, and we'd take the median of votes as usual.
The problem comes when we have 7 authorities, and only a few of them
have smarter bandwidth allocation algorithms. So long as the majority
of them are voting for the number in the relay descriptor, the minority
that have better numbers will be ignored.
2. Options
One fix would be to demand that every authority also run the
new bandwidth measurement algorithms: in that case, part of the
responsibility of being an authority operator is that you need to run
this code too. But in practice we can't really require all current
authority operators to do that; and if we want to expand the set of
authority operators even further, it will become even more impractical.
Also, bandwidth testing adds load to the network, so we don't really
want to require that the number of concurrent bandwidth tests match
the number of authorities we have.
The better fix is to allow certain authorities to specify that they are
voting on bandwidth measurements: more accurate bandwidth values that
have actually been evaluated. In this way, authorities can vote on
the median measured value if sufficient measured votes exist for a router,
and otherwise fall back to the median value taken from the published router
descriptors.
3. Security implications
If only some authorities choose to vote on an offset, then a majority of
those voting authorities can arbitrarily change the bandwidth weighting
for the relay. At the extreme, if there's only one offset-voting
authority, then that authority can dictate which relays clients will
find attractive.
This problem isn't entirely new: we already have the same worry with
respect to the subset of authorities that vote for BadExit.
To make it not so bad, we should deploy at least three offset-voting
authorities.
Also, authorities that know how to vote for offsets should vote for
an offset of zero for new nodes, rather than choosing not to vote on
any offset in those cases.
4. Design
First, we need a new consensus method to support this new calculation.
Now v3 votes can have an additional value on the "w" line:
"w Bandwidth=X Measured=" INT.
Once we're using the new consensus method, the new way to compute the
Bandwidth weight is by checking if there are at least 3 "Measured"
votes. If so, the median of these is taken. Otherwise, the median
of the "Bandwidth=" values are taken, as described in Proposal 141.
Then the actual consensus looks just the same as it did before,
so clients never have to know that this additional calculation is
happening.
5. Implementation
The Measured values will be read from a file provided by the scanners
described in proposal 161. Files with a timestamp older than 3 days
will be ignored.
The file will be read in from dirserv_generate_networkstatus_vote_obj()
in a location specified by a new config option "V3MeasuredBandwidths".
A helper function will be called to populate new 'measured' and
'has_measured' fields of the routerstatus_t 'routerstatuses' list with
values read from this file.
An additional for_vote flag will be passed to
routerstatus_format_entry() from format_networkstatus_vote(), which will
indicate that the "Measured=" string should be appended to the "w Bandwidth="
line with the measured value in the struct.
routerstatus_parse_entry_from_string() will be modified to parse the
"Measured=" lines into routerstatus_t struct fields.
Finally, networkstatus_compute_consensus() will set rs_out.bandwidth
to the median of the measured values if there are at least 3, otherwise
it will use the median of the bandwidth values as normal.
Title: Computing Bandwidth Adjustments
Filename: 161-computing-bandwidth-adjustments.txt
Author: Mike Perry
Created: 12-May-2009
Target: 0.2.1.x
Status: Closed
1. Motivation
There is high variance in the performance of the Tor network. Despite
our efforts to balance load evenly across the Tor nodes, some nodes are
significantly slower and more overloaded than others.
Proposal 160 describes how we can augment the directory authorities to
vote on measured bandwidths for routers. This proposal describes what
goes into the measuring process.
2. Measurement Selection
The general idea is to determine a load factor representing the ratio
of the capacity of measured nodes to the rest of the network. This load
factor could be computed from three potentially relevant statistics:
circuit failure rates, circuit extend times, or stream capacity.
Circuit failure rates and circuit extend times appear to be
non-linearly proportional to node load. We've observed that the same
nodes when scanned at US nighttime hours (when load is presumably
lower) exhibit almost no circuit failure, and significantly faster
extend times than when scanned during the day.
Stream capacity, however, is much more uniform, even during US
nighttime hours. Moreover, it is a more intuitive representation of
node capacity, and also less dependent upon distance and latency
if amortized over large stream fetches.
3. Average Stream Bandwidth Calculation
The average stream bandwidths are obtained by dividing the network into
slices of 50 nodes each, grouped according to advertised node bandwidth.
Two hop circuits are built using nodes from the same slice, and a large
file is downloaded via these circuits. The file sizes are set based
on node percentile rank as follows:
0-10: 2M
10-20: 1M
20-30: 512k
30-50: 256k
50-100: 128k
These sizes are based on measurements performed during test scans.
This process is repeated until each node has been chosen to participate
in at least 5 circuits.
4. Ratio Calculation
The ratios are calculated by dividing each measured value by the
network-wide average.
5. Ratio Filtering
After the base ratios are calculated, a second pass is performed
to remove any streams with nodes of ratios less than X=0.5 from
the results of other nodes. In addition, all outlying streams
with capacity of one standard deviation below a node's average
are also removed.
The final ratio result will be the greater of the unfiltered ratio
and the filtered ratio.
6. Pseudocode for Ratio Calculation Algorithm
Here is the complete pseudocode for the ratio algorithm:
  Slices = {S | S is 50 nodes of similar consensus capacity}

  for S in Slices:
    while exists node N in S with circ_chosen(N) < 7:
      fetch_slice_file(build_2hop_circuit(N, (exit in S)))
    for N in S:
      BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
      Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
    Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
    for N in S:
      Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
      BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))

  Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
  Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)

  for N in all Slices:
    Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices)
    Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)
    ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
7. Security implications
The ratio filtering will deal with cases of sabotage by dropping
both very slow outliers in stream average calculations, as well
as dropping streams that used very slow nodes from the calculation
of other nodes.
This scheme will not address nodes that try to game the system by
providing better service to scanners. The scanners can be detected
at the entry by IP address, and at the exit by the destination fetch
IP.
Measures can be taken to obfuscate and separate the scanners' source
IP address from the directory authority IP address. For instance,
scans can happen offsite and the results can be rsynced into the
authorities. The destination server IP can also change.
Neither of these methods is foolproof, but such nodes can already
lie about their bandwidth to attract more traffic, so this solution
does not set us back any in that regard.
8. Parallelization
Because each slice takes as long as 6 hours to complete, we will want
to parallelize as much as possible. This will be done by concurrently
running multiple scanners from each authority to deal with different
segments of the network. Each scanner piece will continually loop
over a portion of the network, outputting files of the form:
node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL
The most recent file from each scanner will be periodically gathered
by another script that uses them to produce network-wide averages
and calculate ratios as per the algorithm in section 6. Because nodes
may shift in capacity, they may appear in more than one slice and/or
appear more than once in the file set. The most recently measured
line will be chosen in this case.
9. Integration with Proposal 160
The final results will be produced for the voting mechanism
described in Proposal 160 by multiplying the derived ratio by
the average published consensus bandwidth during the course of the
scan, and taking the weighted average with the previous consensus
bandwidth:
Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))
The Alpha parameter is a smoothing parameter intended to prevent
rapid oscillation between loaded and unloaded conditions. It is
currently fixed at 0.333.
The Round() step consists of rounding to the 3 most significant figures
in base10, and then rounding that result to the nearest 1000, with
a minimum value of 1000.
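The smoothing and rounding steps can be sketched as follows (a reading of the formula above, not the scanner's actual code):

```python
import math

def round_bw(x):
    # Round to 3 significant figures in base 10, then to the nearest
    # 1000, with a minimum value of 1000.
    if x < 1000:
        return 1000
    digits = int(math.floor(math.log10(x))) + 1
    sig = round(x, 3 - digits)
    return max(1000, int(round(sig / 1000.0)) * 1000)

ALPHA = 0.333  # smoothing parameter, fixed by the proposal

def new_bw(bw_current, bw_scan_avg, bw_ratio):
    return round_bw((bw_current * ALPHA + bw_scan_avg * bw_ratio)
                    / (ALPHA + 1))
```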
This will produce a new bandwidth value that will be output into a
file consisting of lines of the form:
node_id=<idhex> SP bw=<Bw_new> NL
The first line of the file will contain a timestamp in UNIX time()
seconds. This will be used by the authority to decide if the
measured values are too old to use.
This file can be either copied or rsynced into a directory readable
by the directory authority.
Filename: 162-consensus-flavors.txt
Title: Publish the consensus in multiple flavors
Author: Nick Mathewson
Created: 14-May-2009
Implemented-In: 0.2.3.1-alpha
Status: Closed
[Implementation notes: the 'consensus index' feature never got implemented.]
Overview:
This proposal describes a way to publish each consensus in
multiple simultaneous formats, or "flavors". This will reduce the
amount of time needed to deploy new consensus-like documents, and
reduce the size of consensus documents in the long term.
Motivation:
In the future, we will almost surely want different fields and
data in the network-status document. Examples include:
- Publishing hashes of microdescriptors instead of hashes of
full descriptors (Proposal 158).
- Including different digests of descriptors, instead of the
perhaps-soon-to-be-totally-broken SHA1.
Note that in both cases, from the client's point of view, this
information _replaces_ older information. If we're using a
SHA256 hash, we don't need to see the SHA1. If clients only want
microdescriptors, they don't (necessarily) need to see hashes of
other things.
Our past approach to cases like this has been to shovel all of
the data into the consensus document. But this is rather poor
for bandwidth. Adding a single SHA256 hash to a consensus for
each router increases the compressed consensus size by 47%. In
comparison, replacing a single SHA1 hash with a SHA256 hash for
each listed router increases the consensus size by only 18%.
Design in brief:
Let the voting process remain as it is, until a consensus is
generated. With future versions of the voting algorithm, instead
of just a single consensus being generated, multiple consensus
"flavors" are produced.
Consensuses (all of them) include a list of which flavors are
being generated. Caches fetch all flavors of consensus
that are listed, regardless of whether they can parse or validate
them, and serve them to clients. Thus, once this design is in
place, we won't need to deploy more cache changes in order to get
new flavors of consensus cached.
Clients download only the consensus flavor they want.
A note on hashes:
Everything in this document is specified to use SHA256, and to be
upgradeable to use better hashes in the future.
Spec modifications:
1. URLs and changes to the current consensus format.
Every consensus flavor has a name consisting of a sequence of one
or more alphanumeric characters and dashes. For compatibility, the
current descriptor flavor is called "ns".
The supported consensus flavors are defined as part of the
authorities' consensus method.
For each supported flavor, every authority calculates another
consensus document of as-yet-unspecified format, and exchanges
detached signatures for these documents as in the current consensus
design.
In addition to the consensus currently served at
/tor/status-vote/(current|next)/consensus.z and
/tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z ,
authorities serve another consensus of each flavor "F" from the
locations /tor/status-vote/(current|next)/consensus-F.z. and
/tor/status-vote/(current|next)/consensus-F/<FP1>+....z.
When caches serve these documents, they do so from the same
locations.
2. Document format: generic consensus.
The format of a flavored consensus is as-yet-unspecified, except
that the first line is:
"network-status-version" SP version SP flavor NL
where version is 3 or higher, and the flavor is a string
consisting of alphanumeric characters and dashes, matching the
corresponding flavor listed in the unflavored consensus.
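A sketch of validating that first line (the "microdesc" flavor name in the test is purely illustrative):

```python
import re

_FLAVOR_LINE = re.compile(r"network-status-version (\d+) ([A-Za-z0-9-]+)$")

def parse_flavored_header(line):
    """Return (version, flavor) for a flavored consensus, or None."""
    m = _FLAVOR_LINE.match(line)
    if m is None or int(m.group(1)) < 3:
        return None
    return int(m.group(1)), m.group(2)
```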
3. Document format: detached signatures.
We amend the detached signature format to include more than one
consensus-digest line, and more than one set of signatures.
After the consensus-digest line, we allow more lines of the form:
"additional-digest" SP flavor SP algname SP digest NL
Before the directory-signature lines, we allow more entries of the form:
"additional-signature" SP flavor SP algname SP identity SP
signing-key-digest NL signature.
[We do not use "consensus-digest" or "directory-signature" for flavored
consensuses, since this could confuse older Tors.]
The consensus-signatures URL should contain the signatures
for _all_ flavors of consensus.
4. The consensus index:
Authorities additionally generate and serve a consensus-index
document. Its format is:
Header ValidAfter ValidUntil Documents Signatures
Header = "consensus-index" SP version NL
ValidAfter = as in a consensus
ValidUntil = as in a consensus
Documents = Document*
Document = "document" SP flavor SP SignedLength
1*(SP AlgorithmName "=" Digest) NL
Signatures = Signature*
Signature = "directory-signature" SP algname SP identity
SP signing-key-digest NL signature
There must be one Document line for each generated consensus flavor.
Each Document line describes the length of the signed portion of
a consensus (the signatures themselves are not included), along
with one or more digests of that signed portion. Digests are
given in hex. The algorithm "sha256" MUST be included; others
are allowed.
The algname part of a signature describes what algorithm was
used to hash the identity and signing keys, and to compute the
signature. The algorithm "sha256" MUST be recognized;
signatures with unrecognized algorithms MUST be ignored.
(See below).
The consensus index is made available at
/tor/status-vote/(current|next)/consensus-index.z.
Caches should fetch this document so they can check the
correctness of the different consensus documents they fetch.
They do not need to check anything about an unrecognized
consensus document beyond its digest and length.
4.1. The "sha256" signature format.
The 'SHA256' signature format for directory objects is defined as
the RSA signature of the OAEP+-padded SHA256 digest of the item to
be signed. When checking signatures, the signature MUST be treated
as valid if the signature material begins with SHA256(document);
this allows us to add other data later.
Considerations:
- We should not create a new flavor of consensus when adding a
field instead wouldn't be too onerous.
- We should not proliferate flavors lightly: clients will be
distinguishable based on which flavor they download.
Migration:
- Stage one: authorities begin generating and serving
consensus-index files.
- Stage two: Caches begin downloading consensus-index files,
validating them, and using them to decide what flavors of
consensus documents to cache. They download all listed
documents, and compare them to the digests given in the
consensus.
- Stage three: Once we want to make a significant change to the
consensus format, we deploy another flavor of consensus at the
authorities. This will immediately start getting cached by the
caches, and clients can start fetching the new flavor without
waiting a version or two for enough caches to begin supporting
it.
Acknowledgements:
Aspects of this design and its applications to hash migration were
heavily influenced by IRC conversations with Marian.
Filename: 163-detecting-clients.txt
Title: Detecting whether a connection comes from a client
Author: Nick Mathewson
Created: 22-May-2009
Target: 0.2.2
Status: Superseded
[Note: Actually, this is partially done, partially superseded
-nickm, 9 May 2011]
Overview:
Some aspects of Tor's design require relays to distinguish
connections from clients from connections that come from relays.
The existing means for doing this is easy to spoof. We propose
a better approach.
Motivation:
There are at least two reasons for which Tor servers want to tell
which connections come from clients and which come from other
servers:
1) Some exits, proposal 152 notwithstanding, want to disallow
their use as single-hop proxies.
2) Some performance-related proposals involve prioritizing
traffic from relays, or limiting traffic per client (but not
per relay).
Right now, we detect client vs server status based on how the
client opens circuits. (Check out the code that implements the
AllowSingleHopExits option if you want all the details.) This
method is depressingly easy to fake, though. This document
proposes better means.
Goals:
To make grabbing relay privileges at least as difficult as just
running a relay.
In the analysis below, "using server privileges" means taking any
action that only servers are supposed to do, like delivering a
BEGIN cell to an exit node that doesn't allow single hop exits,
or claiming server-like amounts of bandwidth.
Passive detection:
A connection is definitely a client connection if it takes one of
the TLS methods during setup that does not establish an identity
key.
A circuit is definitely a client circuit if it is initiated with
a CREATE_FAST cell, though the node could be a client or a server.
A node that's listed in a recent consensus is probably a server.
A node to which we have successfully extended circuits from
multiple origins is probably a server.
Active detection:
If a node doesn't try to use server privileges at all, we never
need to care whether it's a server.
When a node or circuit tries to use server privileges, if it is
"definitely a client" as per above, we can refuse it immediately.
If it's "probably a server" as per above, we can accept it.
Otherwise, we have either a client, or a server that is neither
listed in any consensus or used by any other clients -- in other
words, a new or private server.
For these servers, we should attempt to build one or more test
circuits through them. If enough of the circuits succeed, the
node is a real relay. If not, it is probably a client.
While we are waiting for the test circuits to succeed, we should
allow a short grace period in which server privileges are
permitted. When a test is done, we should remember its outcome
for a while, so we don't need to do it again.
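The active-detection decision can be sketched as below; the node fields, the callback, and the constants (taken from the suggested answers in "Open questions") are all illustrative:

```python
GRACE_PERIOD = 10 * 60      # suggested answer: 10 minute grace period
TEST_MEMORY = 48 * 60 * 60  # suggested answer: remember outcomes 48 hours

def may_use_server_privileges(node, now, launch_tests):
    # node: a dict of hypothetical per-node state
    if node.get("definitely_client"):
        return False                      # refuse immediately
    if node.get("probably_server"):
        return True                       # listed / multi-origin extends
    tested_at = node.get("tested_at")
    if tested_at is not None and now - tested_at < TEST_MEMORY:
        return node["test_passed"]        # reuse a remembered outcome
    first_seen = node.setdefault("first_seen", now)
    if now - first_seen < GRACE_PERIOD:
        launch_tests(node)                # build test circuits through it
        return True                       # grace period while tests run
    return False
```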
Why it's hard to do good testing:
Doing a test circuit starting with an unlisted router requires
only that we have an open connection for it. Doing a test
circuit starting elsewhere _through_ an unlisted router--though
more reliable-- would require that we have a known address, port,
identity key, and onion key for the router. Only the address and
identity key are easily available via the current Tor protocol in
all cases.
We could fix this part by requiring that all servers support
BEGIN_DIR and support downloading at least a current descriptor
for themselves.
Open questions:
What are the thresholds for the needed numbers of circuits
for us to decide that a node is a relay?
[Suggested answer: two circuits from two distinct hosts.]
How do we pick grace periods? How long do we remember the
outcome of a test?
[Suggested answer: 10 minute grace period; 48 hour memory of
test outcomes.]
If we can build circuits starting at a suspect node, but we don't
have enough information to try extending circuits elsewhere
through the node, should we conclude that the node is
"server-like" or not?
[Suggested answer: for now, just try making circuits through
the node. Extend this to extending circuits as needed.]
Filename: 164-reporting-server-status.txt
Title: Reporting the status of server votes
Author: Nick Mathewson
Created: 22-May-2009
Status: Obsolete
Notes: This doesn't work with the current things authorities do,
though we could revise it to work if we ever want to do this.
Overview:
When a given node isn't listed in the directory, it isn't always easy
to tell why. This proposal suggest a quick-and-dirty way for
authorities to export not only how they voted, but why, and a way to
collate the information.
Motivation:
Right now, if you want to know the reason why your server was listed
a certain way in the Tor directory, the following steps are
recommended:
- Look through your log for reports of what the authority said
when you tried to upload.
- Look at the consensus; see if you're listed.
- Wait a while, see if things get better.
- Download the votes from all the authorities, and see how they
voted. Try to figure out why.
- If you think they'll listen to you, ask some authority
operators to look you up in their mtbf files and logs to see
why they voted as they did.
This is far too hard.
Solution:
We should add a new vote-like information-only document that
authorities serve on request. Call it a "vote info". It is
generated at the same time as a vote, but used only for
determining why the authority voted on each server as it did. It is served from
/tor/status-vote-info/current/authority[.z]
It differs from a vote in that:
* Its vote-status field is 'vote-info'.
* It includes routers that the authority would not include
in its vote.
For these, it includes an "omitted" line with an English
message explaining why they were omitted.
* For each router, it includes a line describing its WFU and
MTBF. The format is:
"stability <mtbf> up-since='date'"
"uptime <wfu> down-since='date'"
* It describes in its header the WFU and MTBF thresholds it
requires to vote for a given router in various roles.
The format is:
"flag-requirement <flag-name> <field> <op> <value>"
e.g.
"flag-requirement Guard uptime > 80"
* It includes info on routers all of whose descriptors
uploaded over the past few hours were rejected. The
"r" lines for these are the same as for regular routers.
The other lines are omitted for these routers, and are
replaced with a single "rejected" line, explaining (in
English) why the router was rejected.
A status site (like Torweather or Torstatus or another
tool) can poll these files when they are generated, collate
the data, and make it available to server operators.
Risks:
This document makes no provisions for caching these "vote
info" documents. If many people wind up fetching them
aggressively from the authorities, that would be bad.
Filename: 165-simple-robust-voting.txt
Title: Easy migration for voting authority sets
Author: Nick Mathewson
Created: 2009-05-28
Status: Rejected
Notes: Rejected as too complex.
Overview:
This proposal describes an easy-to-implement, easy-to-verify way to
change the set of authorities without creating a "flag day" situation.
Motivation:
From proposal 134 ("More robust consensus voting with diverse
authority sets") by Peter Palfrader:
Right now there are about five authoritative directory servers
in the Tor network, though this number is expected to rise to about
15 eventually.
Adding a new authority requires synchronized action from all
operators of directory authorities so that at any time during the
update at least half of all authorities are running and agree on
who is an authority. The latter requirement is there so that the
authorities can arrive at a common consensus: Each authority
builds the consensus based on the votes from all authorities it
recognizes, and so a different set of recognized authorities will
lead to a different consensus document.
In response to this problem, proposal 134 suggested that every
candidate authority list in its vote whom it believes to be an
authority. These A-says-B-is-an-authority relationships form a
directed graph. Each authority then iteratively finds the largest
clique in the graph and removes it, until it finds one containing
itself. It then votes with that clique.
Proposal 134 had some problems:
- It had a security problem in that M hostile authorities in a
clique could effectively kick out M-1 honest authorities. This
could enable a minority of the original authorities to take over.
- It was too complex in its implications to analyze well: it took us
over a year to realize that it was insecure.
- It tried to solve a bigger problem: general fragmentation of
authority trust. Really, all we wanted to have was the ability to
add and remove authorities without forcing a flag day.
Proposed protocol design:
A "Voting Set" is a set of authorities. Each authority has a list of
the voting sets it considers acceptable. These sets are chosen
manually by the authority operators. They must always contain the
authority itself. Each authority lists all of these voting sets in
its votes.
Authorities exchange votes with every other authority in any of their
voting sets.
When it is time to calculate a consensus, an authority picks votes from
whichever voting set it lists that is listed by the most members of
that set. In other words, given two sets S1 and S2 that an authority
lists, that authority will prefer to vote with S1 over S2 whenever
the number of other authorities in S1 that themselves list S1 is
higher than the number of other authorities in S2 that themselves
list S2.
For example, suppose authority A recognizes two sets, "A B C D" and
"A E F G H". Suppose that the first set is recognized by all of A,
B, C, and D, whereas the second set is recognized only by A, E, and
F. Because the first set is recognized by more of the authorities in
it than the other one, A will vote with the first set.
Ties are broken in favor of some arbitrary function of the identity
keys of the authorities in the set.
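The selection rule above can be sketched in Python. This is an illustration, not Tor source code; the function and variable names (`pick_voting_set`, `listed_by`) are hypothetical:

```python
# Sketch of the voting-set selection rule: among the sets an authority
# lists, prefer the one endorsed by the most of its own members; break
# ties deterministically on the members' identity keys.

def pick_voting_set(my_sets, listed_by):
    """my_sets: list of frozensets of authority IDs this authority lists.
    listed_by: dict mapping authority ID -> set of frozensets it lists."""
    def score(s):
        # Number of members of s that themselves list s.
        endorsements = sum(1 for member in s if s in listed_by.get(member, ()))
        # Tie-break on an arbitrary but fixed function of the identity keys.
        return (endorsements, tuple(sorted(s)))
    return max(my_sets, key=score)

# The example from the text: "A B C D" is listed by all of A, B, C, D;
# "A E F G H" only by A, E, and F -- so A votes with the first set.
s1 = frozenset("ABCD")
s2 = frozenset("AEFGH")
listed = {x: {s1} for x in "BCD"}
listed.update({x: {s2} for x in "EF"})
listed["A"] = {s1, s2}
assert pick_voting_set([s1, s2], listed) == s1
```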
How to migrate authority sets:
In steady state, each authority operator should list only the current
actual voting set as accepted.
When we want to add an authority, each authority operator configures
his or her server to list two voting sets: one containing all the old
authorities, and one containing the old authorities and the new
authority too. Once all authorities are listing the new set of
authorities, they will start voting with that set because of its
size.
What if one or two authority operators are slow to list the new set?
Then the other operators can stop listing the old set once there are
enough authorities listing the new set to make its voting successful.
(Note that these authorities not listing the new set will still have
their votes counted, since they themselves will be members of the new
set. They will only fail to sign the consensus generated by the
other authorities who are using the new set.)
When we want to remove an authority, the operators list two voting
sets: one containing all the authorities, and one omitting the
authority we want to remove. Once enough authorities list the new
set as acceptable, we start having authority operators stop listing
the old set. Once there are more listing the new set than the old
set, the new set will win.
Data format changes:
Add a new 'voting-set' line to the vote document format. Allow it to
occur any number of times. Its format is:
voting-set SP 'fingerprint' SP 'fingerprint' ... NL
where each fingerprint is the hex fingerprint of an identity key of
an authority. Sort fingerprints in ascending order.
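Emitting such a line might look like the following sketch; the helper name is hypothetical and the fingerprints are shortened stand-ins, not real identity keys:

```python
# Sketch of formatting the proposed 'voting-set' line: hex fingerprints
# sorted ascending, space-separated, newline-terminated.

def voting_set_line(fingerprints):
    """fingerprints: iterable of hex identity-key fingerprint strings."""
    return "voting-set " + " ".join(sorted(fingerprints)) + "\n"

line = voting_set_line(["FFB1", "000A", "9C2D"])
# line == "voting-set 000A 9C2D FFB1\n"
```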
When the consensus method is at least 'X' (decide this when we
implement the proposal), add this line to the consensus format as
well, before the first dir-source line. [This information is not
redundant with the dir-source sections in the consensus: If an
authority is recognized but didn't vote, that authority will appear in
the voting-set line but not in the dir-source sections.]
We don't need to list other information about authorities in our
vote.
Migration issues:
We should keep track somewhere which Tor client versions
recognized which authorities.
Acknowledgments:
The design came out of an IRC conversation with Peter Palfrader. He
had the basic idea first.
Filename: 166-statistics-extra-info-docs.txt
Title: Including Network Statistics in Extra-Info Documents
Author: Karsten Loesing
Created: 21-Jul-2009
Target: 0.2.2
Status: Closed
Change history:
21-Jul-2009 Initial proposal for or-dev
Overview:
The Tor network has grown to almost two thousand relays and millions
of casual users over the past few years. With growth has come
increasing performance problems and attempts by some countries to
block access to the Tor network. In order to address these problems,
we need to learn more about the Tor network. This proposal suggests
measuring additional statistics and including them in extra-info documents
to help us understand the Tor network better.
Introduction:
As of May 2009, relays, bridges, and directories gather the following
data for statistical purposes:
- Relays and bridges count the number of bytes that they have pushed
in 15-minute intervals over the past 24 hours. Relays and bridges
include these data in extra-info documents that they send to the
directory authorities whenever they publish their server descriptor.
- Bridges further include a rough number of clients per country that
they have seen in the past 48 hours in their extra-info documents.
- Directories can be configured to count the number of clients they
see per country in the past 24 hours and to write them to a local
file.
Since then, we have extended the network statistics in Tor. These statistics
include:
- Directories now gather more precise statistics about connecting
clients. Fixes include measuring in intervals of exactly 24 hours,
counting unsuccessful requests, measuring download times, etc. The
directories append their statistics to a local file every 24 hours.
- Entry guards count the number of clients per country per day like
bridges do and write them to a local file every 24 hours.
- Relays measure statistics of the number of cells in their circuit
queues and how much time these cells spend waiting there. Relays
write these statistics to a local file every 24 hours.
- Exit nodes count the number of read and written bytes on exit
connections per port as well as the number of opened exit streams
per port in 24-hour intervals. Exit nodes write their statistics to
a local file.
The following four sections contain descriptions for adding these
statistics to the relays' extra-info documents.
Directory request statistics:
The first type of statistics aims at measuring directory requests sent
by clients to a directory mirror or directory authority. More
precisely, these statistics aim at requests for v2 and v3 network
statuses only. These directory requests are sent non-anonymously,
either via HTTP-like requests to a directory's Dir port or tunneled
over a 1-hop circuit.
Measuring directory request statistics is useful for several reasons:
First, the number of locally seen directory requests can be used to
estimate the total number of clients in the Tor network. Second, the
country-wise classification of requests using a GeoIP database can
help count the relative and absolute number of users per country.
Third, the download times can give hints on the available bandwidth
capacity at clients.
Directory requests do not give any hints on the contents that clients
send or receive over the Tor network. Every client requests network
statuses from the directories, so there are no anonymity-related
concerns in gathering these statistics. It might be, though, that clients
wish to hide the fact that they are connecting to the Tor network.
Therefore, IP addresses are resolved to country codes in memory,
events are accumulated over 24 hours, and numbers are rounded up to
multiples of 4 or 8.
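The aggregation steps above might look like the following sketch (the helper names are hypothetical, not Tor's actual code):

```python
# Sketch of the privacy-preserving aggregation: per-country counts are
# rounded up to a multiple of 4 or 8 before being reported, so exact
# client numbers never leave the relay.

def round_up(n, multiple):
    """Round n up to the next multiple (e.g. 1 -> 8 with multiple=8)."""
    return ((n + multiple - 1) // multiple) * multiple if n > 0 else 0

def format_cc_line(keyword, counts, multiple=8):
    """counts: dict mapping two-letter country code -> raw count."""
    body = ",".join(f"{cc}={round_up(n, multiple)}"
                    for cc, n in sorted(counts.items()))
    return f"{keyword} {body}\n"

assert round_up(1, 8) == 8
assert format_cc_line("dirreq-v3-ips", {"us": 11, "de": 3}) == \
    "dirreq-v3-ips de=8,us=16\n"
```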
"dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
is only added when the relay has opened its Dir port and after 24
hours of measuring directory requests.
"dirreq-v2-ips" CC=N,CC=N,... NL
[At most once.]
"dirreq-v3-ips" CC=N,CC=N,... NL
[At most once.]
List of mappings from two-letter country codes to the number of
unique IP addresses that have connected from that country to
request a v2/v3 network status, rounded up to the nearest multiple
of 8. Only those IP addresses are counted that the directory can
answer with a 200 OK status code.
"dirreq-v2-reqs" CC=N,CC=N,... NL
[At most once.]
"dirreq-v3-reqs" CC=N,CC=N,... NL
[At most once.]
List of mappings from two-letter country codes to the number of
requests for v2/v3 network statuses from that country, rounded up
to the nearest multiple of 8. Only those requests are counted that
the directory can answer with a 200 OK status code.
"dirreq-v2-share" num% NL
[At most once.]
"dirreq-v3-share" num% NL
[At most once.]
The share of v2/v3 network status requests that the directory
expects to receive from clients based on its advertised bandwidth
compared to the overall network bandwidth capacity. Shares are
formatted in percent with two decimal places. Shares are
calculated as means over the whole 24-hour interval.
"dirreq-v2-resp" status=num,... NL
[At most once.]
"dirreq-v3-resp" status=num,... NL
[At most once.]
List of mappings from response statuses to the number of requests
for v2/v3 network statuses that were answered with that response
status, rounded up to the nearest multiple of 4. Only response
statuses with at least 1 response are reported. New response
statuses can be added at any time. The current list of response
statuses is as follows:
"ok": a network status request is answered; this number
corresponds to the sum of all requests as reported in
"dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
rounding up.
"not-enough-sigs": a version 3 network status is not signed by a
sufficient number of requested authorities.
"unavailable": a requested network status object is unavailable.
"not-found": a requested network status is not found.
"not-modified": a network status has not been modified since the
If-Modified-Since time that is included in the request.
"busy": the directory is busy.
"dirreq-v2-direct-dl" key=val,... NL
[At most once.]
"dirreq-v3-direct-dl" key=val,... NL
[At most once.]
"dirreq-v2-tunneled-dl" key=val,... NL
[At most once.]
"dirreq-v3-tunneled-dl" key=val,... NL
[At most once.]
List of statistics about possible failures in the download process
of v2/v3 network statuses. Requests are either "direct"
HTTP-encoded requests over the relay's directory port, or
"tunneled" requests using a BEGIN_DIR cell over the relay's OR
port. The list of possible statistics can change, and statistics
can be left out from reporting. The current list of statistics is
as follows:
Successful downloads and failures:
"complete": a client has finished the download successfully.
"timeout": a download did not finish within 10 minutes after
starting to send the response.
"running": a download is still running at the end of the
measurement period for less than 10 minutes after starting to
send the response.
Download times:
"min", "max": smallest and largest measured bandwidth in B/s.
"d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
bandwidth in B/s. For a given decile i, i/10 of all downloads
had a smaller bandwidth than di, and (10-i)/10 of all downloads
had a larger bandwidth than di.
"q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
fourth of all downloads had a smaller bandwidth than q1, one
fourth of all downloads had a larger bandwidth than q3, and the
remaining half of all downloads had a bandwidth between q1 and
q3.
"md": median of measured bandwidth in B/s. Half of the downloads
had a smaller bandwidth than md, the other half had a larger
bandwidth than md.
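One way to compute the decile, quartile, and median keys described above from a list of measured bandwidths is sketched below; this is an illustration of the definitions, not the relay's actual code:

```python
# Sketch: derive min/max, deciles d1-d4 and d6-d9, quartiles q1/q3,
# and the median md from a list of measured bandwidths in B/s.

def download_time_stats(bandwidths):
    xs = sorted(bandwidths)
    n = len(xs)
    def pct(p):
        # Value below which roughly fraction p of the samples fall.
        return xs[min(n - 1, int(p * n))]
    stats = {"min": xs[0], "max": xs[-1], "md": pct(0.5),
             "q1": pct(0.25), "q3": pct(0.75)}
    for i in list(range(1, 5)) + list(range(6, 10)):
        stats[f"d{i}"] = pct(i / 10)
    return stats
```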
Entry guard statistics:
Entry guard statistics include the number of clients per country and
per day that are connecting directly to an entry guard.
Entry guard statistics are important to learn more about the
distribution of clients to countries. In the future, this knowledge
can be useful to detect if there are or start to be any restrictions
for clients connecting from specific countries.
The information about which clients connect to a given entry guard is
very sensitive. It must not be possible to combine it with information
about what content leaves the network at the exit nodes. Therefore,
entry guard statistics need to be aggregated to prevent them from
becoming useful for de-anonymization. Aggregation includes resolving
IP addresses to country codes, counting events over 24-hour intervals,
and rounding up numbers to the next multiple of 8.
"entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
An "entry-stats-end" line, as well as any other "entry-*"
line, is first added after the relay has been running for at least
24 hours.
"entry-ips" CC=N,CC=N,... NL
[At most once.]
List of mappings from two-letter country codes to the number of
unique IP addresses that have connected from that country to the
relay and which are not themselves known relays, rounded up to the
nearest multiple of 8.
Cell statistics:
The third type of statistics has to do with the time that cells spend
in circuit queues. In order to gather these statistics, the relay
memorizes when it puts a given cell in a circuit queue and when this
cell is flushed. The relay further notes the life time of the circuit.
These data are sufficient to determine the mean number of cells in a
queue over time and the mean time that cells spend in a queue.
Cell statistics are necessary to learn more about possible reasons for
the poor network performance of the Tor network, especially high
latencies. The same statistics are also useful to determine the
effects of design changes by comparing today's data with future data.
There are basically no privacy concerns from measuring cell
statistics, regardless of a node being an entry, middle, or exit node.
"cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
A "cell-stats-end" line, as well as any other "cell-*" line,
is first added after the relay has been running for at least 24
hours.
"cell-processed-cells" num,...,num NL
[At most once.]
Mean number of processed cells per circuit, subdivided into
deciles of circuits by the number of cells they have processed in
descending order from loudest to quietest circuits.
"cell-queued-cells" num,...,num NL
[At most once.]
Mean number of cells contained in queues by circuit decile. These
means are calculated by 1) determining the mean number of cells in
a single circuit between its creation and its termination and 2)
calculating the mean for all circuits in a given decile as
determined in "cell-processed-cells". Numbers have a precision of
two decimal places.
"cell-time-in-queue" num,...,num NL
[At most once.]
Mean time cells spend in circuit queues in milliseconds. Times are
calculated by 1) determining the mean time cells spend in the
queue of a single circuit and 2) calculating the mean for all
circuits in a given decile as determined in
"cell-processed-cells".
"cell-circuits-per-decile" num NL
[At most once.]
Mean number of circuits that are included in any of the deciles,
rounded up to the next integer.
Exit statistics:
The last type of statistics concerns exit nodes, which count the
number of bytes written and read and the number of streams opened per
port per 24 hours. Exit port statistics can be measured by looking at
headers of BEGIN and DATA cells. A BEGIN cell contains the exit port
that is required for the exit node to open a new exit stream.
Subsequent DATA cells coming from the client or being sent back to the
client contain a length field stating how many bytes of application
data are contained in the cell.
Exit port statistics are important to measure in order to identify
possible load-balancing problems with respect to exit policies. Exit
nodes that permit more ports than others are very likely overloaded
with traffic for those ports plus traffic for other ports. Improving
load balancing in the Tor network improves the overall utilization of
bandwidth capacity.
Exit traffic is one of the most sensitive parts of network data in the
Tor network. Even though these statistics do not require looking at
traffic contents, statistics are aggregated so that they are not
useful for de-anonymizing users. Only those ports are reported that
have seen at least 0.1% of exiting or incoming bytes, numbers of bytes
are rounded up to full kibibytes (KiB), and stream numbers are rounded
up to the next multiple of 4.
"exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
An "exit-stats-end" line, as well as any other "exit-*" line, is
first added after the relay has been running for at least 24 hours
and only if the relay permits exiting (where exiting to a single
port and IP address is sufficient).
"exit-kibibytes-written" port=N,port=N,... NL
[At most once.]
"exit-kibibytes-read" port=N,port=N,... NL
[At most once.]
List of mappings from ports to the number of kibibytes that the
relay has written to or read from exit connections to that port,
rounded up to the next full kibibyte.
"exit-streams-opened" port=N,port=N,... NL
[At most once.]
List of mappings from ports to the number of opened exit streams
to that port, rounded up to the nearest multiple of 4.
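The exit-statistics aggregation described above can be sketched as follows (hypothetical helper names, shown only to make the rounding rules concrete):

```python
# Sketch: byte counts are rounded up to full kibibytes, and stream
# counts up to the next multiple of 4, before being reported per port.

def exit_kib(port_bytes):
    """port_bytes: dict port -> raw byte count; returns port -> KiB."""
    return {p: (b + 1023) // 1024 for p, b in port_bytes.items() if b > 0}

def exit_streams(port_streams):
    """port_streams: dict port -> raw stream count, rounded up to 4s."""
    return {p: ((s + 3) // 4) * 4 for p, s in port_streams.items() if s > 0}

assert exit_kib({443: 1500}) == {443: 2}
assert exit_streams({80: 5}) == {80: 8}
```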
Implementation notes:
Right now, relays that are configured accordingly write similar
statistics to those described in this proposal to disk every 24 hours.
With this proposal being implemented, relays include the contents of
these files in extra-info documents.
The following steps are necessary to implement this proposal:
1. The current format of [dirreq|entry|buffer|exit]-stats files needs
to be adapted to the description in this proposal. This step
basically means renaming keywords.
2. The timing of writing the four *-stats files should be unified, so
that they are written exactly 24 hours after starting the
relay. Right now, the measurement intervals for dirreq, entry, and
exit stats starts with the first observed request, and files are
written when observing the first request that occurs more than 24
hours after the beginning of the measurement interval. With this
proposal, the measurement intervals should all start at the same
time, and files should be written exactly 24 hours later.
3. It is advantageous to cache statistics in local files in the data
directory until they are included in extra-info documents. The
reason is that the 24-hour measurement interval can be very
different from the 18-hour publication interval of extra-info
documents. When a relay crashes after finishing a measurement
interval, but before publishing the next extra-info document,
statistics would get lost. Therefore, statistics are written to
disk when finishing a measurement interval and read from disk when
generating an extra-info document. Only the statistics that were
appended to the *-stats files within the past 24 hours are included
in extra-info documents. Further, the contents of the *-stats files
need to be checked in the process of generating extra-info documents.
4. With the statistics patches being tested, the ./configure options
should be removed and the statistics code be compiled by default.
It is still required for relay operators to add configuration
options (DirReqStatistics, ExitPortStatistics, etc.) to enable
gathering statistics. However, in the near future, statistics shall
be gathered by all relays by default, and requiring a
./configure option would be a barrier for many relay operators.
Filename: 167-params-in-consensus.txt
Title: Vote on network parameters in consensus
Author: Roger Dingledine
Created: 18-Aug-2009
Status: Closed
Implemented-In: 0.2.2
0. History
1. Overview
Several of our new performance plans involve guessing how to tune
clients and relays, yet we won't be able to learn whether we guessed
the right tuning parameters until many people have upgraded. Instead,
we should have directory authorities vote on the parameters, and teach
Tors to read the currently recommended values out of the consensus.
2. Design
V3 votes should include a new "params" line after the known-flags
line. It contains key=value pairs, where value is an integer.
Consensus documents that are generated with a sufficiently new consensus
method (7?) then include a params line that includes every key listed
in any vote, and the median value for that key (in case of ties,
we use the median closer to zero).
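The median rule above can be sketched in Python (an illustration under the proposal's description, not the authorities' actual code):

```python
# Sketch: every key listed in any vote appears in the consensus with
# the median of the voted values; for an even number of votes, the
# middle value closer to zero wins.

def consensus_params(votes):
    """votes: list of dicts, each mapping key -> integer value."""
    out = {}
    keys = set().union(*(v.keys() for v in votes))
    for k in sorted(keys):
        vals = sorted(v[k] for v in votes if k in v)
        n = len(vals)
        if n % 2 == 1:
            out[k] = vals[n // 2]
        else:
            a, b = vals[n // 2 - 1], vals[n // 2]
            out[k] = a if abs(a) <= abs(b) else b
    return out

assert consensus_params([{"circwindow": 101}, {"circwindow": 1000}]) \
    == {"circwindow": 101}
```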
2.1. Planned keys.
The first planned parameter is "circwindow=101", which is the initial
circuit packaging window that clients and relays should use. Putting
it in the consensus will let us perform experiments with different
values once enough Tors have upgraded -- see proposal 168.
Later parameters might include a weighting for how much to favor quiet
circuits over loud circuits in our round-robin algorithm; a weighting
for how much to prioritize relays over clients if we use an incentive
scheme like the gold-star design; and what fraction of circuits we
should throw out from proposal 151.
2.2. What about non-integers?
I'm not sure how we would do median on non-integer values. Further,
I don't have any non-integer values in mind yet. So I say we cross
that bridge when we get to it.
Filename: 168-reduce-circwindow.txt
Title: Reduce default circuit window
Author: Roger Dingledine
Created: 12-Aug-2009
Status: Rejected
0. History
1. Overview
We should reduce the starting circuit "package window" from 1000 to
101. The lower package window will mean that clients will only be able
to receive 101 cells (~50KB) on a circuit before they need to send a
'sendme' acknowledgement cell to request 100 more.
Starting with a lower package window on exit relays should save on
buffer sizes (and thus memory requirements for the exit relay), and
should save on queue sizes (and thus latency for users).
Lowering the package window will induce an extra round-trip for every
additional 50298 bytes of the circuit. This extra step is clearly a
slow-down for large streams, but ultimately we hope that a) clients
fetching smaller streams will see better response, and b) slowing
down the large streams in this way will produce lower e2e latencies,
so the round-trips won't be so bad.
2. Motivation
Karsten's torperf graphs show that the median download time for a 50KB
file over Tor in mid 2009 is 7.7 seconds, whereas the median download
time for 1MB and 5MB are around 50s and 150s respectively. The 7.7
second figure is way too high, whereas the 50s and 150s figures are
surprisingly low.
The median round-trip latency appears to be around 2s, with 25% of
the data points taking more than 5s. That's a lot of variance.
We designed Tor originally with the goal of maximizing
throughput. We figured that would also optimize other network properties
like round-trip latency. Looks like we were wrong.
3. Design
Wherever we initialize the circuit package window, initialize it to
101 rather than 1000. Reducing it should be safe even when interacting
with old Tors: the old Tors will receive the 101 cells and send back
a sendme ack cell. They'll still have much higher deliver windows,
but the rest of their deliver window will go unused.
You can find the patch at arma/circwindow. It seems to work.
3.1. Why not 100?
Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
ack cell after 101 cells rather than the intended 100 cells.
Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
hopefully we'll have moved to some datagram protocol long before
0.2.1.19 becomes obsolete.
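The window arithmetic discussed above can be sketched as follows (illustrative class and constant names, not Tor's data structures): with a sendme acknowledging every 100 cells, a starting window of 101 lets the sender keep one cell in flight while waiting for the ack.

```python
CIRCWINDOW_START = 101   # proposed initial circuit package window
SENDME_INCREMENT = 100   # cells acknowledged per sendme cell

class PackageWindow:
    def __init__(self):
        self.window = CIRCWINDOW_START

    def can_package(self):
        # A cell may be packaged only while the window is positive.
        return self.window > 0

    def cell_packaged(self):
        self.window -= 1

    def sendme_received(self):
        self.window += SENDME_INCREMENT
```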
3.2. What about stream packaging windows?
Right now the stream packaging windows start at 500. The goal was to
set the stream window to half the circuit window, to provide a crude
load balancing between streams on the same circuit. Once we lower
the circuit packaging window, the stream packaging window basically
becomes redundant.
We could leave it in -- it isn't hurting much in either case. Or we
could take it out -- people building other Tor clients would thank us
for that step. Alas, people building other Tor clients are going to
have to be compatible with current Tor clients, so in practice there's
no point taking out the stream packaging windows.
3.3. What about variable circuit windows?
Once upon a time we imagined adapting the circuit package window to
the network conditions. That is, we would start the window small,
and raise it based on the latency and throughput we see.
In theory that crude imitation of TCP's windowing system would allow
us to adapt to fill the network better. In practice, I think we want
to stick with the small window and never raise it. The low cap reduces
the total throughput you can get from Tor for a given circuit. But
that's a feature, not a bug.
4. Evaluation
How do we know this change is actually smart? It seems intuitive that
it's helpful, and some smart systems people have agreed that it's
a good idea (or said another way, they were shocked at how big the
default package window was before).
To get a more concrete sense of the benefit, though, Karsten has been
running torperf side-by-side on exit relays with the old package window
vs the new one. The results are mixed currently -- it is slightly faster
for fetching 40KB files, and slightly slower for fetching 50KB files.
I think it's going to be tough to get a clear conclusion that this is
a good design just by comparing one exit relay running the patch. The
trouble is that the other hops in the circuits are still getting bogged
down by other clients introducing too much traffic into the network.
Ultimately, we'll want to put the circwindow parameter into the
consensus so we can test a broader range of values once enough relays
have upgraded.
5. Transition and deployment
We should put the circwindow in the consensus (see proposal 167),
with an initial value of 101. Then as more exit relays upgrade,
clients should seamlessly get the better behavior.
Note that upgrading the exit relay will only affect the "download"
package window. An old client that's uploading lots of bytes will
continue to use the old package window at the client side, and we
can't throttle that window at the exit side without breaking protocol.
The real question then is what we should backport to 0.2.1. Assuming
this could be a big performance win, we can't afford to wait until
0.2.2.x comes out before starting to see the changes here. So we have
two options as I see them:
a) once clients in 0.2.2.x know how to read the value out of the
consensus, and it's been tested for a bit, backport that part to
0.2.1.x.
b) if it's too complex to backport, just pick a number, like 101, and
backport that number.
Clearly choice (a) is the better one if the consensus parsing part
isn't very complex. Let's shoot for that, and fall back to (b) if the
patch turns out to be so big that we reconsider.
Filename: 169-eliminating-renegotiation.txt
Title: Eliminate TLS renegotiation for the Tor connection handshake
Author: Nick Mathewson
Created: 27-Jan-2010
Status: Superseded
Target: 0.2.2
Superseded-By: 176
1. Overview
I propose a backward-compatible change to the Tor connection
establishment protocol to avoid the use of TLS renegotiation.
Rather than doing a TLS renegotiation to exchange certificates
and authenticate the original handshake, this proposal takes an
approach similar to Steven Murdoch's proposal 124, and uses Tor
cells to finish authenticating the parties' identities once the
initial TLS handshake is finished.
Terminological note: I use "client" below to mean the Tor
instance (a client or a relay) that initiates a TLS connection,
and "server" to mean the Tor instance (a relay) that accepts it.
2. Motivation and history
In the original Tor TLS connection handshake protocol ("V1", or
"two-cert"), parties that wanted to authenticate provided a
two-cert chain of X.509 certificates during the handshake setup
phase. Every party that wanted to authenticate sent these
certificates.
In the current Tor TLS connection handshake protocol ("V2", or
"renegotiating"), the parties begin with a single certificate
sent from the server (responder) to the client (initiator), and
then renegotiate to a two-certs-from-each-authenticating-party.
We made this change to make Tor's handshake look like a browser
speaking SSL to a webserver. (See proposal 130, and
tor-spec.txt.) To tell whether to use the V1 or V2 handshake,
servers look at the list of ciphers sent by the client. (This is
ugly, but there's not much else in the ClientHello that they can
look at.) If the list contains any cipher not used by the V1
protocol, the server sends back a single cert and expects a
renegotiation. If the client gets back a single cert, then it
withholds its own certificates until the TLS renegotiation phase.
In other words, initiator behavior now looks like this:
- Begin TLS negotiation with V2 cipher list; wait for
certificate(s).
- If we get a certificate chain:
- Then we are using the V1 handshake. Send our own
certificate chain as part of this initial TLS handshake
if we want to authenticate; otherwise, send no
certificates. When the handshake completes, check
certificates. We are now mutually authenticated.
Otherwise, if we get just a single certificate:
- Then we are using the V2 handshake. Do not send any
certificates during this handshake.
- When the handshake is done, immediately start a TLS
renegotiation. During the renegotiation, expect
a certificate chain from the server; send a certificate
chain of our own if we want to authenticate ourselves.
- After the renegotiation, check the certificates. Then
send (and expect) a VERSIONS cell from the other side to
establish the link protocol version.
And V2 responder behavior now looks like this:
- When we get a TLS ClientHello request, look at the cipher
list.
- If the cipher list contains only the V1 ciphersuites:
- Then we're doing a V1 handshake. Send a certificate
chain. Expect a possible client certificate chain in
response.
Otherwise, if we get other ciphersuites:
- We're using the V2 handshake. Send back a single
certificate and let the handshake complete.
- Do not accept any data until the client has renegotiated.
- When the client is renegotiating, send a certificate
chain, and expect (possibly multiple) certificates in
reply.
- Check the certificates when the renegotiation is done.
Then exchange VERSIONS cells.
Late in 2009, researchers found a flaw in most applications' use
of TLS renegotiation: Although TLS renegotiation does not
reauthenticate any information exchanged before the renegotiation
takes place, many applications were treating it as though it did,
and assuming that data sent _before_ the renegotiation was
authenticated with the credentials negotiated _during_ the
renegotiation. This problem was exacerbated by the fact that
most TLS libraries don't actually give you an obvious good way to
tell where the renegotiation occurred relative to the datastream.
Tor wasn't directly affected by this vulnerability, but its
aftermath hurts us in a few ways:
1) OpenSSL has disabled renegotiation by default, and created
a "yes we know what we're doing" option we need to set to
turn it back on. (Two options, actually: one for openssl
0.9.8l and one for 0.9.8m and later.)
2) Some vendors have removed all renegotiation support from
their versions of OpenSSL entirely, forcing us to tell
users to either replace their versions of OpenSSL or to
link Tor against a hand-built one.
3) Because of 1 and 2, I'd expect TLS renegotiation to become
rarer and rarer in the wild, making our own use stand out
more.
3. Design
3.1. The view in the large
Taking a cue from Steven Murdoch's proposal 124, I propose that
we move the work currently done by the TLS renegotiation step
(that is, authenticating the parties to one another) and do it
with Tor cells instead of with TLS.
Using _yet another_ variant response from the responder (server),
we allow the client to learn that it doesn't need to rehandshake
and can instead use a cell-based authentication system. Once the
TLS handshake is done, the client and server exchange VERSIONS
cells to determine link protocol version (including
handshake version). If they're using the handshake version
specified here, the client and server arrive at link protocol
version 3 (or higher), and use cells to exchange further
authentication information.
3.2. New TLS handshake variant
We already used the list of ciphers from the ClientHello to
indicate whether the client can speak the V2 ("renegotiating")
handshake or later, so we can't encode more information there.
We can, however, change the DN in the certificate passed by the
server back to the client. Currently, all V2 certificates are
generated with CN values ending with ".net". I propose that we
have the ".net" commonName ending reserved to indicate the V2
protocol, and use commonName values ending with ".com" to
indicate the V3 ("minimal") handshake described herein.
Now, once the initial TLS handshake is done, the client can look
at the server's certificate(s). If there is a certificate chain,
the handshake is V1. If there is a single certificate whose
subject commonName ends in ".net", the handshake is V2 and the
client should try to renegotiate as it would currently.
Otherwise, the client should assume that the handshake is V3+.
[Servers should _only_ send ".com" addresses, to allow room for
more signaling in the future.]
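The client-side decision rule above can be sketched as follows. This is a non-normative illustration; `subject_common_name` is a stand-in for however a TLS library exposes the certificate's subject CN, not an actual Tor API:

```python
# Hypothetical sketch of the client's handshake-version classification.
# cert_chain is the list of certificates the server presented during the
# initial TLS handshake.

def classify_handshake(cert_chain):
    """Return 1, 2, or 3 for the V1, V2, or V3+ handshake."""
    if len(cert_chain) > 1:
        return 1                 # a full chain means the old V1 handshake
    cn = cert_chain[0].subject_common_name
    if cn.endswith(".net"):
        return 2                 # single ".net" cert: renegotiate (V2)
    return 3                     # anything else: assume V3 or later
```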
3.3. Authenticating inside Tor
Once the TLS handshake is finished, if the client renegotiates,
then the server should go on as it does currently.
If the client implements this proposal, however, and the server
has shown it can understand the V3+ handshake protocol, the
client immediately sends a VERSIONS cell to the server
and waits to receive a VERSIONS cell in return. We negotiate
the Tor link protocol version _before_ we proceed with the
negotiation, in case we need to change the authentication
protocol in the future.
Once either party has seen the VERSIONS cell from the other, it
knows which version they will pick (that is, the highest version
shared by both parties' VERSIONS cells). All Tor instances using
the handshake protocol described in 3.2 MUST support at least
link protocol version 3 as described here.
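The version-selection rule can be sketched in a few lines (an illustration of the rule above, not Tor's implementation):

```python
# Each side advertises the versions it supports in its VERSIONS cell;
# both then pick the highest version present in both lists.

def negotiate_link_version(ours, theirs):
    shared = set(ours) & set(theirs)
    if not shared:
        # No version in common: the connection cannot proceed.
        raise ValueError("no common link protocol version")
    return max(shared)
```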
On learning the link protocol, the server then sends the client a
CERT cell and a NETINFO cell. If the client wants to
authenticate to the server, it sends a CERT cell, an AUTHENTICATE
cell, and a NETINFO cell; or it may simply send a NETINFO cell if
it does not want to authenticate.
The CERT cell describes the keys that a Tor instance is claiming
to have. It is a variable-length cell. Its payload format is:
N: Number of certs in cell [1 octet]
N times:
CLEN [2 octets]
Certificate [CLEN octets]
Any extra octets at the end of a CERT cell MUST be ignored.
Each certificate has the form:
CertType [1 octet]
CertPurpose [1 octet]
PublicKeyLen [2 octets]
PublicKey [PublicKeyLen octets]
NotBefore [4 octets]
NotAfter [4 octets]
SignerID [HASH256_LEN octets]
SignatureLen [2 octets]
Signature [SignatureLen octets]
where CertType is 1 (meaning "RSA/SHA256")
CertPurpose is 1 (meaning "link certificate")
PublicKey is the DER encoding of the ASN.1 representation
of the RSA key of the subject of this certificate
NotBefore is a time in HOURS since January 1, 1970, 00:00
UTC before which this certificate should not be
considered valid.
NotAfter is a time in HOURS since January 1, 1970, 00:00
UTC after which this certificate should not be
considered valid.
SignerID is the SHA-256 digest of the public key signing
this certificate
and Signature is the signature of all the other fields in
this certificate, using SHA256 as described in proposal
158.
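A minimal parser for the certificate body above might look like this. It is a sketch, not the actual Tor implementation; it assumes multi-octet fields are in network (big-endian) byte order, as elsewhere in Tor, and that HASH256_LEN is 32 (SHA-256):

```python
import struct

HASH256_LEN = 32  # SHA-256 digest length

def parse_cert(data):
    """Parse one certificate; return (fields, bytes_consumed)."""
    cert_type, purpose, keylen = struct.unpack_from(">BBH", data, 0)
    off = 4
    pubkey = data[off:off + keylen]; off += keylen
    not_before, not_after = struct.unpack_from(">II", data, off); off += 8
    signer_id = data[off:off + HASH256_LEN]; off += HASH256_LEN
    (siglen,) = struct.unpack_from(">H", data, off); off += 2
    signature = data[off:off + siglen]; off += siglen
    return {
        "type": cert_type, "purpose": purpose, "pubkey": pubkey,
        "not_before": not_before,   # hours since the Unix epoch
        "not_after": not_after,
        "signer_id": signer_id, "signature": signature,
    }, off   # extra bytes past `off` MUST be ignored, per the spec
```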
While authenticating, a server need send only a self-signed
certificate for its identity key. (Its TLS certificate already
contains its link key signed by its identity key.) A client that
wants to authenticate MUST send two certificates: one containing
a public link key signed by its identity key, and one self-signed
cert for its identity.
Tor instances MUST ignore any certificate with an unrecognized
CertType or CertPurpose, and MUST ignore extra bytes in the cert.
The AUTHENTICATE cell proves to the server that the client with
whom it completed the initial TLS handshake is the one possessing
the link public key in its certificate. It is a variable-length
cell. Its contents are:
SignatureType [2 octets]
SignatureLen [2 octets]
Signature [SignatureLen octets]
where SignatureType is 1 (meaning "RSA-SHA256") and Signature is
an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master
secret key as its key, of the following elements:
- The SignatureType field (0x00 0x01)
- The NUL terminated ASCII string: "Tor certificate verification"
- client_random, as sent in the Client Hello
- server_random, as sent in the Server Hello
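The digest that gets RSA-signed can be sketched as below. The master secret and the two random values come from the TLS layer; producing the final RSA-SHA256 signature over this digest is left to the crypto library, so this only shows the HMAC step described above:

```python
import hmac, hashlib

def authenticate_digest(master_secret, client_random, server_random):
    """HMAC-SHA256, keyed with the TLS master secret, over the listed fields."""
    msg = (b"\x00\x01"                               # SignatureType field
           + b"Tor certificate verification\x00"    # NUL-terminated string
           + client_random                           # from the Client Hello
           + server_random)                          # from the Server Hello
    return hmac.new(master_secret, msg, hashlib.sha256).digest()
```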
Once the above handshake is complete, the client knows (from the
initial TLS handshake) that it has a secure connection to an
entity that controls a given link public key, and knows (from the
CERT cell) that the link public key is a valid public key for a
given Tor identity.
If the client authenticates, the server learns from the CERT cell
that a given Tor identity has a given current public link key.
From the AUTHENTICATE cell, it knows that an entity with that
link key knows the master secret for the TLS connection, and
hence must be the party with whom it's talking, if TLS works.
3.4. Security checks
If the TLS handshake indicates a V2 or V3+ connection, the server
MUST reject any connection from the client that does not begin
with either a renegotiation attempt or a VERSIONS cell containing
at least link protocol version "3". If the TLS handshake
indicates a V3+ connection, the client MUST reject any connection
where the server sends anything before the client has sent a
VERSIONS cell, and any connection where the VERSIONS cell does
not contain at least link protocol version "3".
If link protocol version 3 is chosen:
Clients and servers MUST check that all digests and signatures
on the certificates in CERT cells they are given are as
described above.
After the VERSIONS cell, clients and servers MUST close the
connection if anything besides a CERT or AUTHENTICATE cell is
sent before the NETINFO cell. CERT or AUTHENTICATE cells
anywhere after the first NETINFO cell must be rejected.
... [write more here. What else?] ...
3.5. Summary
We now revisit the protocol outlines from section 2 to incorporate
our changes. New or modified steps are marked with a *.
The new initiator behavior now looks like this:
- Begin TLS negotiation with V2 cipher list; wait for
certificate(s).
- If we get a certificate chain:
- Then we are using the V1 handshake. Send our own
certificate chain as part of this initial TLS handshake
if we want to authenticate; otherwise, send no
certificates. When the handshake completes, check
certificates. We are now mutually authenticated.
Otherwise, if we get just a single certificate:
- Then we are using the V2 or the V3+ handshake. Do not
send any certificates during this handshake.
* When the handshake is done, look at the server's
certificate's subject commonName.
* If it ends with ".net", we're doing a V2 handshake:
- Immediately start a TLS renegotiation. During the
renegotiation, expect a certificate chain from the
server; send a certificate chain of our own if we
want to authenticate ourselves.
- After the renegotiation, check the certificates. Then
send (and expect) a VERSIONS cell from the other side
to establish the link protocol version.
* If it ends with anything else, assume a V3 or later
handshake:
* Send a VERSIONS cell, and wait for a VERSIONS cell
from the server.
* If we are authenticating, send CERT and AUTHENTICATE
cells.
* Send a NETINFO cell. Wait for a CERT and a NETINFO
cell from the server.
* If the CERT cell contains a valid self-identity cert,
and the identity key in the cert can be used to check
the signature on the x.509 certificate we got during
the TLS handshake, then we know we connected to the
server with that identity. If any of these checks
fail, or the identity key was not what we expected,
then we close the connection.
* Once the NETINFO cell arrives, continue as before.
And V3+ responder behavior now looks like this:
- When we get a TLS ClientHello request, look at the cipher
list.
- If the cipher list contains only the V1 ciphersuites:
- Then we're doing a V1 handshake. Send a certificate
chain. Expect a possible client certificate chain in
response.
Otherwise, if we get other ciphersuites:
- We're using the V2 handshake. Send back a single
certificate whose subject commonName ends with ".com",
and let the handshake complete.
* If the client does anything besides renegotiate or send a
VERSIONS cell, drop the connection.
- If the client renegotiates immediately, it's a V2
connection:
- When the client is renegotiating, send a certificate
chain, and expect (possibly multiple) certificates in
reply.
- Check the certificates when the renegotiation is done.
Then exchange VERSIONS cells.
* Otherwise we got a VERSIONS cell and it's a V3 handshake.
* Send a VERSIONS cell, a CERT cell, an AUTHENTICATE
cell, and a NETINFO cell.
* Wait for the client to send cells in reply. If the
client sends a CERT and an AUTHENTICATE and a NETINFO,
use them to authenticate the client. If the client
sends a NETINFO, it is unauthenticated. If it sends
anything else before its NETINFO, it's rejected.
4. Numbers to assign
We need a version number for this link protocol. I've been
calling it "3".
We need to reserve command numbers for CERT and AUTH cells. I
suggest that in link protocol 3 and higher, we reserve command
numbers 128..240 for variable-length cells. (241-255 we can hold
for future extensions.)
5. Efficiency
This protocol adds a round-trip step when the client sends a
VERSIONS cell to the server and waits for the {VERSIONS, CERT,
NETINFO} response in turn. (The server then waits for the
client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
but it would have already been waiting for the client's NETINFO,
so that's not an additional wait.)
This is actually fewer round-trip steps than required before for
TLS renegotiation, so that's a win.
6. Open questions:
- Should we use X.509 certificates instead of the certificate-ish
things we describe here? They are more standard, but more ugly.
- May we cache which certificates we've already verified? It
might leak in timing whether we've connected with a given server
before, and how recently.
- Is there a better secret than the master secret to use in the
AUTHENTICATE cell? Say, a portable one? Can we get at it for
other libraries besides OpenSSL?
- Does using the client_random and server_random data in the
AUTHENTICATE message actually help us? How hard is it to pull
them out of the OpenSSL data structure?
- Can we give some way for clients to signal "I want to use the
V3 protocol if possible, but I can't renegotiate, so don't give
me the V2"? Clients currently have a fair idea of server
versions, so they could potentially do the V3+ handshake with
servers that support it, and fall back to V1 otherwise.
- What should servers that don't have TLS renegotiation do? For
now, I think they should just get it. Eventually we can
deprecate the V2 handshake as we did with the V1 handshake.
Title: Configuration options regarding circuit building
Filename: 170-user-path-config.txt
Author: Sebastian Hahn
Created: 01-March-2010
Status: Superseded
Overview:
This document outlines how Tor handles the user configuration
options to influence the circuit building process.
Motivation:
Tor's treatment of the configuration *Nodes options was surprising
to many users, and quite a few conspiracy theories have crept up. We
should update our specification and code to better describe and
communicate what is going during circuit building, and how we're
honoring configuration. So far, we've been tracking a bugreport
about this behaviour (
https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 )
and Nick replied in a thread on or-talk (
http://archives.seul.org/or/talk/Feb-2010/msg00117.html ).
This proposal tries to document our intention for those configuration
options.
Design:
Five configuration options are available to users to influence Tor's
circuit building. EntryNodes and ExitNodes define a list of nodes
that are for the Entry/Exit position in all circuits. ExcludeNodes
is a list of nodes that are used for no circuit, and
ExcludeExitNodes is a list of nodes that aren't used as the last
hop. StrictNodes defines Tor's behaviour in case of a conflict, for
example when a node that is excluded is the only available
introduction point. Setting StrictNodes to 1 breaks Tor's
functionality in that case, and it will refuse to build such a
circuit.
Neither Nick's email nor bug 1090 has clear suggestions for how
we should behave in each case, so I tried to come up with
something that made sense to me.
Security implications:
Deviating from normal circuit building can break one's anonymity, so
the documentation of the above option should contain a warning to
make users aware of the pitfalls.
Specification:
It is proposed that the "User configuration" part of path-spec
(section 2.2.2) be replaced with this:
Users can alter the default behavior for path selection with
configuration options. In case of conflicts (excluding and requiring
the same node) the "StrictNodes" option is used to determine
behaviour. If a node is both excluded and required via a
configuration option, the exclusion takes preference.
- If "ExitNodes" is provided, then every request requires an exit
node on the ExitNodes list. If a request is supported by no nodes
on that list, and "StrictNodes" is false, then Tor treats that
request as if ExitNodes were not provided.
- "EntryNodes" behaves analogously.
- If "ExcludeNodes" is provided, then no circuit uses any of the
nodes listed. If a circuit requires an excluded node to be used,
and "StrictNodes" is false, then Tor uses the node in that
position while not using any other of the excluded nodes.
- If "ExcludeExitNodes" is provided, then Tor will not use the nodes
listed for the exit position in a circuit. If a circuit requires
an excluded node to be used in the exit position and "StrictNodes"
is false, then Tor builds that circuit as if ExcludeExitNodes were
not provided.
- If a user tries to connect to or resolve a hostname of the form
<target>.<servername>.exit and the "AllowDotExit" configuration
option is set to 1, the request is rewritten to a request for
<target>, and the request is only supported by the exit whose
nickname or fingerprint is <servername>. If "AllowDotExit" is set
to 0 (default), any request for <anything>.exit is denied.
- When any of the *Nodes settings are changed, all circuits are
expired immediately, to prevent a situation where a previously
built circuit is used even though some of its nodes are now
excluded.
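The exit-selection rules above can be sketched as a filter over candidate exits. This is a simplified, hypothetical helper for the common case, not Tor's implementation; `candidates` is assumed to be the set of exits that support the request:

```python
def pick_exit_pool(candidates, exit_nodes, exclude, exclude_exit, strict):
    """Return the exits Tor may use, per the rules above."""
    # Exclusion takes preference over requirement.
    allowed = [n for n in candidates
               if n not in exclude and n not in exclude_exit]
    if exit_nodes:
        preferred = [n for n in allowed if n in exit_nodes]
        if preferred:
            return preferred
        if strict:
            return []        # StrictNodes: refuse to build the circuit
    return allowed           # fall back as if ExitNodes were not set
```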
Compatibility:
The old Strict*Nodes options are deprecated, and the StrictNodes
option is new. Tor users may need to update their configuration file.
Filename: 171-separate-streams.txt
Title: Separate streams across circuits by connection metadata
Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson
Created: 21-Oct-2008
Modified: 7-Dec-2010
Status: Closed
Implemented-In: 0.2.3.3-alpha
Summary:
We propose a new set of options to isolate unrelated streams from one
another, putting them on separate circuits so that semantically
unrelated traffic is not inadvertently made linkable.
Motivation:
Currently, Tor attaches regular streams (that is, ones not carrying
rendezvous or directory traffic) to circuits based only on whether the
circuit's current exit node supports the destination, and whether the
circuit has been dirty (that is, in use) for too long.
This means that traffic that would otherwise be unrelated sometimes
gets sent over the same circuit, allowing the exit node to link such
streams with certainty, and allowing other parties to link such
streams probabilistically.
Older versions of onion routing tried to address this problem by
sending every stream over a separate circuit; performance issues made
this unfeasible. Moreover, in the presence of a localized adversary,
separating streams by circuits increases the odds that, for any given
linked set of streams, at least one will go over a compromised
circuit.
Therefore we ought to look for ways to allow streams that ought to be
linked to travel over a single circuit, while keeping streams that
ought not be linked isolated to separate circuits.
Discussion:
Let's call a series of inherently-linked streams (like a set of
streams downloading objects from the same webpage, or a browsing
session where the user requests several related webpages) a "Session".
"Sessions" are a necessarily a fuzzy concept. While users typically
consider some activities as wholly unrelated to each other ("My IM
session has nothing to do with my web browsing!"), the boundaries
between activities are sometimes hard to determine. If I'm reading
lolcats in one browser tab and reading about treatments for an
embarrassing disease in another, those are probably separate sessions.
If I search for a forum, log in, read it for a while, and post a few
messages on unrelated topics, that's probably all the same session.
So with the proviso that no automated process can identify sessions
100% accurately, let's see which options we have available.
Generally, all the streams on a session come from a single
application. Unfortunately, isolating streams by application
automatically isn't feasible, given the lack of any nice
cross-platform way to tell which local process originated a given
connection. (Yes, lsof works. But a quick review of the lsof code
should be sufficient to scare you away from thinking there is a
portable option, much less a portable O(1) option.) So instead, we'll
have to use some other aspect of a Tor request as a proxy for the
application.
Generally, traffic from separate applications is not in the same
session.
With some applications (IRC, for example), each stream is a session.
Some applications (most notably web browsing) can't be meaningfully
split into sessions without inspecting the traffic itself and
maintaining a lot of state.
How well do ports correspond to sessions? Early versions of this
proposal focused on using destination ports as a proxy for
application, since a connection to port 22 for SSH is probably not in
the same session as one to port 80. This only works with some
applications better than others, though: while SSH users typically
know when they're on port 22 and when they aren't, a web browser can
be coaxed (through img urls or any number of related tricks) into
connecting to any port at all. Moreover, when Tor gets a DNS lookup
request, it doesn't know in advance which port the resulting address
will be used to connect to.
So in summary, each kind of traffic wants to follow different rules,
and assuming the existence of a web browser and a hostile web page or
exit node, we can't tell one kind of traffic from another by simply
looking at the destination:port of the traffic.
Fortunately, we're not doomed.
Design:
When a stream arrives at Tor, we have the following data to examine:
1) The destination address
2) The destination port (unless this is a DNS lookup)
3) The protocol used by the application to send the stream to Tor:
SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy"
mechanism the kernel gives us.
4) The port used by the application to send the stream to Tor --
that is, the SOCKSListenAddress or TransListenAddress that the
application used, if we have more than one.
5) The SOCKS username and password, if any.
6) The source address and port for the application.
We propose to use 3, 4, and 5 as a backchannel for applications to
tell Tor about different sessions. Rather than running only one
SOCKSPort, a Tor user who would prefer better session isolation should
run multiple SOCKSPorts/TransPorts, and configure different
applications to use separate ports. Applications that support SOCKS
authentication can further be separated on a single port by their
choice of username/password. Streams sent to separate ports or using
different authentication information should never be sent over the
same circuit. We allow each port to have its own settings for
isolation based on destination port, destination address, or both.
Handling DNS can be a challenge. We can get hostnames by one of three
means:
A) A SOCKS4a request, or a SOCKS5 request with a hostname. This
case is handled trivially using the rules above.
B) A RESOLVE request on a SOCKSPort. This case is handled using the
rules above, except that port isolation can't work to isolate
RESOLVE requests into a proper session, since we don't know which
port will eventually be used when we connect to the returned
address.
C) A request on a DNSPort. We have no way of knowing which
address/port will be used to connect to the requested address.
When B or C is required but problematic, we could favor the use of
AutomapHostsOnResolve.
Interface:
We propose that {SOCKS,Natd,Trans,DNS}ListenAddr be deprecated in
favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax:
ClientPortLine = OptionName SP (Addr ":")? Port (SP Options)?
OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort"
Addr = An IPv4 address / an IPv6 address surrounded by brackets.
If optional, we default to 127.0.0.1
Port = An integer from 1 through 65535 inclusive
Options = Option
Options = Options SP Option
Option = IsolateOption / GroupOption
GroupOption = "SessionGroup=" UINT
IsolateOption = OptNo ("IsolateDestPort" / "IsolateDestAddr" /
"IsolateSOCKSUser"/ "IsolateClientProtocol" /
"IsolateClientAddr") OptPlural
OptNo = "No" ?
OptPlural = "s" ?
SP = " "
UINT = An unsigned integer
All options are case-insensitive.
The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by
default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively
turn them off. The IsolateDestPort and IsolateDestAddr and
IsolateClientProtocol options are off by default. NoIsolateDestPort and
NoIsolateDestAddr and NoIsolateClientProtocol have no effect.
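A small parser for the option part of a ClientPortLine, following the grammar and defaults above, might look like this (an illustrative sketch, not Tor's configuration parser):

```python
import re

ISOLATION_FLAGS = ("IsolateDestPort", "IsolateDestAddr", "IsolateSOCKSUser",
                   "IsolateClientProtocol", "IsolateClientAddr")

def parse_port_options(tokens):
    """Parse option tokens; return (isolation flags, session group or None)."""
    # IsolateSOCKSUser and IsolateClientAddr are on by default.
    flags = {"IsolateSOCKSUser": True, "IsolateClientAddr": True,
             "IsolateDestPort": False, "IsolateDestAddr": False,
             "IsolateClientProtocol": False}
    session_group = None
    for tok in tokens:
        m = re.fullmatch(r"SessionGroup=(\d+)", tok, re.IGNORECASE)
        if m:
            session_group = int(m.group(1))
            continue
        # Optional "No" prefix, optional plural "s"; case-insensitive.
        m = re.fullmatch("(no)?(%s)s?" % "|".join(ISOLATION_FLAGS),
                         tok, re.IGNORECASE)
        if not m:
            raise ValueError("unknown option: " + tok)
        name = next(f for f in ISOLATION_FLAGS
                    if f.lower() == m.group(2).lower())
        flags[name] = m.group(1) is None    # "No" prefix turns the flag off
    return flags, session_group
```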
Given a set of ClientPortLines, streams must NOT be placed on the same
circuit if ANY of the following hold:
* They were sent to two different client ports, unless the two
client ports both specify a "SessionGroup" option with the same
integer value.
* At least one was sent to a client port with the IsolateDestPort
active, and they have different destination ports.
* At least one was sent to a client port with IsolateDestAddr
active, and they have different destination addresses.
* At least one was sent to a client port with IsolateClientProtocol
active, and they use different protocols (where SOCKS4, SOCKS4a,
SOCKS5, TransPort, NatdPort, and DNS are the protocols in question)
* At least one was sent to a client port with IsolateSOCKSUser
active, and they have different SOCKS username/password
configurations. (For the purposes of this option, the
username/password pair of ""/"" is distinct from SOCKS without
authentication, and both are distinct from any non-SOCKS client's
non-authentication.)
* At least one was sent to a client port with IsolateClientAddr
active, and they came from different client addresses. (For the
purpose of this option, any local interface counts as the same
address. So if the host is configured with addresses 10.0.0.1,
192.0.32.10, and 127.0.0.1, then traffic from those addresses can
leave on the same circuit, but traffic from 10.0.0.2 (for
example) could not share a circuit with any of them.)
These rules apply regardless of whether the streams are active at the
same time. In other words, if the rules say that streams A and B must
not be on the same circuit, and stream A is attached to circuit X,
then stream B must never be attached to circuit X, even if stream A is
closed first.
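One way to implement the rules above is to derive a key per stream and allow two streams on the same circuit only when their keys are equal. This sketch assumes both streams arrived on ports with identical isolation settings; the field names on `port_cfg` and `stream` are hypothetical:

```python
def isolation_key(port_cfg, stream):
    """Build a tuple; streams may share a circuit only if tuples match."""
    # Ports in the same SessionGroup count as one client port.
    key = [port_cfg.session_group if port_cfg.session_group is not None
           else ("port", port_cfg.listen_port)]
    if port_cfg.isolate_dest_port:
        key.append(stream.dest_port)
    if port_cfg.isolate_dest_addr:
        key.append(stream.dest_addr)
    if port_cfg.isolate_client_protocol:
        key.append(stream.client_protocol)   # SOCKS4, SOCKS5, Trans, DNS, ...
    if port_cfg.isolate_socks_user:
        key.append(stream.socks_auth)        # None vs ("", "") are distinct
    if port_cfg.isolate_client_addr:
        key.append(stream.client_addr)
    return tuple(key)
```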
Alternative Interface:
We're cramming a lot onto one line in the design above. Perhaps
instead it would be a better idea to have grouped lines of the form:
StreamGroup 1
SOCKSPort 9050
TransPort 9051
IsolateDestPort 1
IsolateClientProtocol 0
EndStreamGroup
StreamGroup 2
SOCKSPort 9052
DNSPort 9053
IsolateDestAddr 1
EndStreamGroup
This would be equivalent to:
SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
SOCKSPort 9052 SessionGroup=2 IsolateDestAddr
DNSPort 9053 SessionGroup=2 IsolateDestAddr
But it would let us extend the range of allowed options later without
having client port lines grow without bound. For example, we might
give different circuit building parameters to different session
groups.
Example of use:
Suppose that we want to use a web browser, an IRC client, and a SSH
client all at the same time. Let's assume that we want web traffic to
be isolated from all other traffic, even if the browser makes
connections to ports usually used for IRC or SSH. Let's also assume
that IRC and SSH are both used for relatively long-lived connections,
and we want to keep all IRC/SSH sessions separate from one another.
In this case, we could say:
SOCKSPort 9050
SOCKSPort 9051 IsolateDestAddr IsolateDestPort
We would then configure our browser to use 9050 and our IRC/SSH
clients to use 9051.
Advanced example of use:
Suppose that we have a bunch of applications, and we launch them all
using torsocks, and we want to keep each application isolated from
one another. We just create a shell script, "torlaunch":
#!/bin/bash
export TORSOCKS_USERNAME="$1"
exec torsocks "$@"
And we configure our SOCKSPort with IsolateSOCKSUser.
Or if we're on Linux and we want to isolate by application invocation,
we would change the TORSOCKS_USERNAME line to:
export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`"
Advanced example of use, #2:
Now suppose that we want to achieve the benefits of the first example
of use, but we are stuck using transparent proxies. Let's suppose
this is Linux.
TransPort 9090
TransPort 9091 IsolateDestAddr IsolateDestPort
DNSPort 5353
AutomapHostsOnResolve 1
Here we use the iptables --cmd-owner filter to distinguish which
command is originating the packets, directing traffic from our irc
client and our SSH client to port 9091, and directing other traffic to
9090. Using AutomapHostsOnResolve will confuse ssh in its default
configuration; we'll need to find a way around that.
Security Risks:
Disabling IsolateClientAddr is a pretty bad idea.
Setting up a set of applications to use this system effectively is a
big problem. It's likely that lots of people who try to do this will
mess it up. We should try to see which setups are sensible, and see
if we can provide good feedback to explain which streams are isolated
how.
Performance Risks:
This proposal will result in clients building many more circuits than
they do today. To avoid accidentally hammering the network, we should
have in-process limits on the maximum circuit creation rate and the
total maximum client circuits.
Specification:
The Tor client circuit selection process is not entirely specified.
Any client circuit specification must take these changes into account.
Implementation notes:
The more obvious ways to implement the "find a good circuit to attach
to" part of this proposal involve doing an O(n_circuits) operation
every time we have a stream to attach. We already do such an
operation, so it's not as if we need to hunt for fancy ways to make it
O(1). What will be harder is implementing the "launch circuits as
needed" part of the proposal. Still, it should come down to "a simple
matter of programming."
The SOCKS4 spec has the client provide authentication info when it
connects; accepting such info is no problem. But the SOCKS5 spec has
the client send a list of known auth methods, then has the server send
back the authentication method it chooses. We'll need to update the
SOCKS5 implementation so it can accept user/password authentication if
it's offered.
If we use the second syntax for describing these options, we'll want
to add a new "section-based" entry type for the configuration parser.
Not a huge deal; we already have kludged up something similar for
hidden service configurations.
Opening circuits for predicted ports has the potential to get a little
more complicated; we can probably start with the existing algorithm,
though, see where its weak points are, and look for better ones.
Perhaps we can get our next-gen HTTP proxy to communicate browser tab
or session info to tor via authentication, or have torbutton do it
directly. More design is needed here, though.
Alternative designs:
The implementation of this option may want to consider cases where the
same exit node is shared by two or more circuits and
IsolateStreamsByPort is in force. Since one possible use of the option
is to reduce the opportunity of Exit Nodes to attack traffic from the
same source on multiple ports, the implementation may need to ensure
that circuits reserved for the exclusive use of given ports do not
share the same exit node. On the other hand, if our goal is only that
streams should be unlinkable, deliberately shunting them to different
exit nodes is unnecessary and slightly counterproductive.
Earlier versions of this design included a mechanism to isolate
_particular_ destination ports and addresses, so that traffic sent to,
say, port 22 would never share a port with any traffic *not* sent to
port 22. You can achieve this here by having all applications that
send traffic to one of these ports use a separate SOCKSPort, and
then setting IsolateDestPorts on that SOCKSPort.
Future work:
Nikita Borisov suggests that different session profiles -- so long as
there aren't too many of them -- could well get different guard node
allocations in order to prevent guard profiling. This can be done
orthogonally to the rest of this proposal.
Lingering questions:
I suspect there are issues remaining with DNS and TransPort users, and
that my "just use AutomapHostsOnResolve" suggestion may be
insufficient.
Filename: 172-circ-getinfo-option.txt
Title: GETINFO controller option for circuit information
Author: Damian Johnson
Created: 03-June-2010
Status: Reserve
Overview:
This details an additional GETINFO option that would provide information
concerning a relay's current circuits.
Motivation:
The original proposal was for connection-related information, but Jake
made the excellent point that any connection information retrieved from
the control port...
1. is completely ineffectual for auditing purposes, since either (a) the
results can already be fetched via netstat or (b) the information is
only provided by tor and can't be independently validated.
2. isn't needed for the more useful use cases, which can be served with
much less (and safer) information.
Hence the proposal is now for circuit based rather than connection based
information. This would strip the most controversial and sensitive data
entirely (ip addresses, ports, and connection based bandwidth breakdowns)
while still being useful for the following purposes:
- Basic Relay Usage Questions
How is the bandwidth I'm contributing broken down? Is it being evenly
distributed or is someone hogging most of it? Do these circuits belong to
the hidden service I'm running or something else? Now that I'm using exit
policy X am I desirable as an exit, or are most people just using me as a
relay?
- Debugging
Say a relay has a restrictive firewall policy for outbound connections,
with the ORPort whitelisted but doesn't realize that tor needs random high
ports. Tor would report success ("your orport is reachable - excellent")
yet the relay would be nonfunctional. This proposed information would
reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good
indicator of what's wrong.
- Visualization
A nice benefit of visualizing tor's behavior is that it becomes a helpful
tool in puzzling out how tor works. For instance, tor spawns numerous
client connections at startup (even if unused as a client). As a newcomer
to tor, these asymmetric (outbound-only) connections mystified me for
quite a while until Roger explained their use to me. The proposed
TYPE_FLAGS would let controllers clearly label them as being client
related, making their purpose a bit clearer.
At the moment connection data can only be retrieved via commands like
netstat, ss, and lsof. However, offering an alternative via the control
port has several advantages:
- scrubbing for private data
Raw connection data has no notion of what's sensitive and what is
not. The relay's flags and cached consensus can be used to make
educated guesses about which connections could belong to client or
exit traffic, but this is both difficult and inaccurate.
Anything provided via the control port can be scrubbed to make sure we
aren't providing anything we think relay operators should not see.
- additional information
All connection querying commands strictly provide the ip address and
port of connections, and nothing else. However, for the uses listed
above the far more interesting attributes are the circuit's type,
bandwidth usage and uptime.
- improved performance
Querying connection data is an expensive activity, especially for
busy relays or low end processors (such as mobile devices). Tor
already internally knows its circuits, allowing for vastly quicker
lookups.
- cross platform capability
The connection querying utilities mentioned above not only aren't
available under Windows, but also differ widely among *nix
platforms. FreeBSD in particular takes an unusual approach,
dropping important options from netstat and assigning ss to a
spreadsheet application instead. A controller interface, however,
would provide a uniform means of retrieving this information.
Security Implications:
This is an open question. This proposal lacks the most controversial pieces
of information (ip addresses and ports), and insight into any potential
threats it would still pose would be very welcome!
Specification:
The following addition would be made to the control-spec's GETINFO section:
"rcirc/id/<Circuit identity>" -- Provides entry for the associated relay
circuit, formatted as:
CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag>
READ=<bytes> WRITE=<bytes>
None of the parameters contain whitespace, and additional results must be
ignored to allow for future expansion. Parameters are defined as follows:
CIRC_ID - Unique numeric identifier for the circuit this belongs to.
CREATED - Unix timestamp (as seconds since the Epoch) for when the
circuit was created.
UPDATED - Unix timestamp for when this information was last updated.
TYPE - Single character flags indicating attributes in the circuit:
(E)ntry : has a connection that doesn't belong to a known Tor server,
indicating that this is either the first hop or bridged
E(X)it : has been used for at least one exit stream
(R)elay : has been extended
Rende(Z)vous : is being used for a rendezvous point
(I)ntroduction : is being used for a hidden service introduction
(N)one of the above: none of the above have happened yet.
READ - Total bytes transmitted toward the exit over the circuit.
WRITE - Total bytes transmitted toward the client over the circuit.
"rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by
newlines.
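A controller consuming these replies would split each entry on whitespace and tolerate unknown keys, per the expansion rule above. A minimal parsing sketch in Python (the helper name and the sample values are ours, not from the spec):

```python
def parse_rcirc_entry(line):
    """Parse one 'rcirc/id/<circ>' reply line of KEY=VALUE pairs.

    Unknown keys are retained but otherwise ignored, matching the
    rule that additional results must be ignored for future expansion.
    """
    entry = {}
    for field in line.split():
        key, _, value = field.partition("=")
        entry[key] = value
    # Convert the numeric fields defined by the proposal.
    for key in ("CIRC_ID", "CREATED", "UPDATED", "READ", "WRITE"):
        if key in entry:
            entry[key] = int(entry[key])
    return entry

# A hypothetical reply for a rendezvous + introduction circuit:
sample = ("CIRC_ID=14 CREATED=1275523200 UPDATED=1275523260 "
          "TYPE=ZI READ=2048 WRITE=1024")
parsed = parse_rcirc_entry(sample)
```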
The following would be included for circ info update events.
4.1.X. Relay circuit status changed
The syntax is:
"650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP
Read SP Write] CRLF
Notice =
"NEW" / ; first information being provided for this circuit
"UPDATE" / ; update for a previously reported circuit
"CLOSED" ; notice that the circuit no longer exists
Notice indicating that queryable information on a relay-related circuit has
changed. If the Notice parameter is either "NEW" or "UPDATE", then this
provides the same fields that would be given by calling "GETINFO rcirc/id/"
with the CircID.
Filename: 173-getinfo-option-expansion.txt
Title: GETINFO Option Expansion
Author: Damian Johnson
Created: 02-June-2010
Status: Obsolete
Overview:
Over the course of developing arm there have been numerous hacks and
workarounds to glean pieces of basic, desirable information about the tor
process. As per Roger's request, I've compiled a list of these pain points
to try and improve the control protocol interface.
Motivation:
The purpose of this proposal is to expose additional process and relay
related information that is currently unavailable in a convenient,
dependable, and/or platform independent way. Examples are:
- The relay's total contributed bandwidth. This is a highly requested
piece of information and, based on the following patch from pipe, looks
trivial to include.
http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html
- The process ID of the tor process. There is a high degree of guess work
in obtaining this. Arm for instance uses pidof, netstat, and ps yet
still fails on some platforms, and Orbot recently got a ticket about
its own attempt to fetch it with ps:
https://trac.torproject.org/projects/tor/ticket/1388
This just includes the pieces of missing information I've noticed
(suggestions or questions of their usefulness are welcome!).
Security Implications:
None that I'm aware of. From a security standpoint this seems decently
innocuous.
Specification:
The following addition would be made to the control-spec's GETINFO section:
"relay/bw-limit" -- Effective relayed bandwidth limit.
"relay/burst-limit" -- Effective relayed burst limit.
"relay/read-total" -- Total bytes relayed (download).
"relay/write-total" -- Total bytes relayed (upload).
"relay/flags" -- Space separated listing of flags currently held by the
relay as reported by the currently cached consensus.
"process/user" -- Username under which the tor process is running,
or an empty string if none exists.
[what do we mean 'if none exists'?]
[Implemented in 0.2.3.1-alpha.]
"process/pid" -- Process id belonging to the main tor process, -1 if none
exists for the platform.
[Implemented in 0.2.3.1-alpha.]
"process/uptime" -- Total uptime of the tor process (in seconds).
"process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD
signal, in seconds). [should clarify exactly which events cause an
uptime reset]
"process/descriptors-used" -- Count of file descriptors used.
"process/descriptor-limit" -- File descriptor limit (getrlimit results).
"ns/authority" -- Router status info (v2 directory style) for all
recognized directory authorities, joined by newlines.
"state/names" -- A space-separated list of all the keys supported by this
version of Tor's state.
"state/val/<key>" -- Provides the current state value belonging to the
given key. If undefined, this provides the key's default value.
"status/ports-seen" -- A summary of which ports we've seen connections'
circuits connect to recently, formatted the same as the EXITS_SEEN status
event described in Section 4.1.XX. This GETINFO option is currently
available only for exit relays.
4.1.XX. Per-port exit stats
The syntax is:
"650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF
We just generated a new summary of which ports we've seen exiting circuits
connecting to recently. The controller could display this for the user, e.g.
in their "relay" configuration window, to give them a sense of how they're
being used (popularity of the various ports they exit to). Currently only
exit relays will receive this event.
TimeStarted is a quoted string indicating when the reported summary
counts from (in GMT).
The PortSummary keyword has as its argument a comma-separated, possibly
empty set of "port=count" pairs. For example (without linebreak),
650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43"
PortSummary=80=16,443=8
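The PortSummary argument is easy to turn back into a port-to-count mapping; a small sketch (the function name is ours):

```python
def parse_port_summary(summary):
    """Parse a PortSummary argument such as '80=16,443=8' into
    {port: count}. An empty summary yields an empty dict."""
    if not summary:
        return {}
    return {int(port): int(count)
            for port, count in (pair.split("=")
                                for pair in summary.split(","))}

counts = parse_port_summary("80=16,443=8")
```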
Filename: 174-optimistic-data-server.txt
Title: Optimistic Data for Tor: Server Side
Author: Ian Goldberg
Created: 2-Aug-2010
Status: Closed
Implemented-In: 0.2.3.1-alpha
Overview:
When a SOCKS client opens a TCP connection through Tor (for an HTTP
request, for example), the query latency is about 1.5x higher than it
needs to be. Simply put, the problem is that the sequence of data flows
is this:
1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command
3. The OP sends a BEGIN cell to the Exit
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
7. The SOCKS client sends some data (the GET request, for example)
8. The OP sends a DATA cell to the Exit
9. The Exit sends the GET to the server
10. The Server returns the HTTP result to the Exit
11. The Exit sends the DATA cells to the OP
12. The OP returns the HTTP result to the SOCKS client
Note that the Exit node knows that the connection to the Server was
successful at the end of step 4, but is unable to send the HTTP query to
the server until step 9.
This proposal (as well as its upcoming sibling concerning the client
side) aims to reduce the latency by allowing:
1. SOCKS clients to optimistically send data before they are notified
that the SOCKS connection has completed successfully
2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
state
3. Exit nodes to accept and queue DATA cells while in the
EXIT_CONN_STATE_CONNECTING state
This particular proposal deals with #3.
In this way, the flow would be as follows:
1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
by data (such as the GET request)
3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
cells
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
request to the Server
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
and the Server returns the HTTP result to the Exit
7. The Exit sends the DATA cells to the OP
8. The OP returns the HTTP result to the SOCKS client
Motivation:
This change will save one OP<->Exit round trip (down to one from two).
There are still two SOCKS Client<->OP round trips (negligible time) and
two Exit<->Server round trips. Depending on the ratio of the
Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
decrease the latency by 25 to 50 percent. Experiments validate these
predictions. [Goldberg, PETS 2010 rump session; see
https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
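The 25-to-50-percent range can be checked with a little arithmetic. Treating the SOCKS client<->OP trips as negligible, as the proposal does, total latency is two Exit<->Server RTTs plus either two (old) or one (new) OP<->Exit RTTs. A quick sanity check of the endpoints (the RTT values here are arbitrary units, not measurements):

```python
def latency_saving(rtt_tor, rtt_server):
    """Fraction of total latency saved by dropping one OP<->Exit
    round trip (two -> one), with two Exit<->Server trips fixed."""
    old = 2 * rtt_server + 2 * rtt_tor
    new = 2 * rtt_server + 1 * rtt_tor
    return (old - new) / old

# When the Tor RTT dominates, we approach the 50% bound;
# when the two RTTs are equal, we get the 25% bound.
upper = latency_saving(rtt_tor=1.0, rtt_server=0.0)
lower = latency_saving(rtt_tor=1.0, rtt_server=1.0)
```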
Design:
The current code actually correctly handles queued data at the Exit; if
there is queued data in a EXIT_CONN_STATE_CONNECTING stream, that data
will be immediately sent when the connection succeeds. If the
connection fails, the data will be correctly ignored and freed. The
problem with the current server code is that the server currently
drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.
Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state,
bad things happen because streams in that state don't yet have
conn->write_event set, and so some existing sanity checks (any stream
with queued data is at least potentially writable) are no longer sound.
The solution is to simply not drop received DATA cells while in the
EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this
state, so that the OP cannot send more than one window's worth of data
to be queued at the Exit. Finally, patch the sanity checks so that
streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
can pass.
If no clients ever send such optimistic data, the new code will never be
executed, and the behaviour of Tor will not change. When clients begin
to send optimistic data, the performance of those clients' streams will
improve.
After discussion with nickm, it seems best to just have the server
version number be the indicator of whether a particular Exit supports
optimistic data. (If a client sends optimistic data to an Exit which
does not support it, the data will be dropped, and the client's request
will fail to complete.) What do version numbers for hypothetical future
protocol-compatible implementations look like, though?
Security implications:
Servers (for sure the Exit, and possibly others, by watching the
pattern of packets) will be able to tell that a particular client
is using optimistic data. This will be discussed more in the sibling
proposal.
On the Exit side, servers will be queueing a little bit extra data, but
no more than one window. Clients today can cause Exits to queue that
much data anyway, simply by establishing a Tor connection to a slow
machine, and sending one window of data.
Specification:
tor-spec section 6.2 currently says:
The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
package stream data in RELAY_DATA cells, and upon receiving such
cells, echo their contents to the corresponding TCP stream.
RELAY_DATA cells sent to unrecognized streams are dropped.
It is not clear exactly what an "unrecognized" stream is, but this last
sentence would be changed to say that RELAY_DATA cells received on a
stream that has processed a RELAY_BEGIN cell and has not yet issued a
RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
immediately after a RELAY_CONNECTED cell is issued for the stream, or
freed after a RELAY_END cell is issued for the stream.
The earlier part of this section will be addressed in the sibling
proposal.
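The revised rule amounts to a small per-stream state machine: queue RELAY_DATA while connecting, flush the queue on RELAY_CONNECTED, free it on RELAY_END. A toy sketch under those assumptions (the class and state names are ours, not Tor's):

```python
class ExitStreamSketch:
    """Toy model of the proposed RELAY_DATA handling at an exit."""
    def __init__(self):
        self.state = "CONNECTING"  # RELAY_BEGIN processed, not yet CONNECTED
        self.queue = []            # optimistic data held back
        self.delivered = []        # stand-in for the TCP stream

    def on_data(self, payload):
        if self.state == "CONNECTING":
            self.queue.append(payload)      # queue instead of dropping
        elif self.state == "OPEN":
            self.delivered.append(payload)

    def on_connected(self):
        self.state = "OPEN"
        self.delivered.extend(self.queue)   # process queue immediately
        self.queue.clear()

    def on_end(self):
        self.state = "CLOSED"
        self.queue.clear()                  # free any queued data

s = ExitStreamSketch()
s.on_data(b"GET / HTTP/1.0\r\n\r\n")        # arrives before CONNECTED
s.on_connected()
```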
Compatibility:
There are compatibility issues, as mentioned above. OPs MUST NOT send
optimistic data to Exit nodes whose version numbers predate (something).
OPs MAY send optimistic data to Exit nodes whose version numbers match
or follow that value. (But see the question about independent server
reimplementations, above.)
Implementation:
Here is a simple patch. It seems to work with both regular streams and
hidden services, but there may be other corner cases I'm not aware of.
(Do streams used for directory fetches, hidden services, etc. take a
different code path?)
diff --git a/src/or/connection.c b/src/or/connection.c
index 7b1493b..f80cd6e 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
return;
}
- connection_start_writing(conn);
+ /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
+ * state, we don't want to try to write it right away, since
+ * conn->write_event won't be set yet. Otherwise, write data from
+ * this conn as the socket is available. */
+ if (conn->state != EXIT_CONN_STATE_RESOLVING) {
+ connection_start_writing(conn);
+ }
if (zlib) {
conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
} else {
@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
tor_assert(conn->s < 0);
if (conn->outbuf_flushlen > 0) {
- tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
+ /* With optimistic data, we may have queued data in
+ * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
+ * */
+ tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
+ connection_is_writing(conn) || conn->write_blocked_on_bw ||
(CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
}
diff --git a/src/or/relay.c b/src/or/relay.c
index fab2d88..e45ff70 100644
--- a/src/or/relay.c
+++ b/src/or/relay.c
@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
relay_header_t rh;
unsigned domain = layer_hint?LD_APP:LD_EXIT;
int reason;
+ int optimistic_data = 0; /* Set to 1 if we receive data on a stream
+ that's in the EXIT_CONN_STATE_RESOLVING
+ or EXIT_CONN_STATE_CONNECTING states.*/
tor_assert(cell);
tor_assert(circ);
@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
/* either conn is NULL, in which case we've got a control cell, or else
* conn points to the recognized stream. */
- if (conn && !connection_state_is_open(TO_CONN(conn)))
- return connection_edge_process_relay_cell_not_open(
- &rh, cell, circ, conn, layer_hint);
+ if (conn && !connection_state_is_open(TO_CONN(conn))) {
+ if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
+ conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
+ rh.command == RELAY_COMMAND_DATA) {
+ /* We're going to allow DATA cells to be delivered to an exit
+ * node in state EXIT_CONN_STATE_CONNECTING or
+ * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. */
+ log_warn(domain, "Optimistic data received.");
+ optimistic_data = 1;
+ } else {
+ return connection_edge_process_relay_cell_not_open(
+ &rh, cell, circ, conn, layer_hint);
+ }
+ }
switch (rh.command) {
case RELAY_COMMAND_DROP:
@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
log_debug(domain,"circ deliver_window now %d.", layer_hint ?
layer_hint->deliver_window : circ->deliver_window);
- circuit_consider_sending_sendme(circ, layer_hint);
+ if (!optimistic_data) {
+ circuit_consider_sending_sendme(circ, layer_hint);
+ }
if (!conn) {
log_info(domain,"data cell dropped, unknown stream (streamid %d).",
@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
stats_n_data_bytes_received += rh.length;
connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
rh.length, TO_CONN(conn));
- connection_edge_consider_sending_sendme(conn);
+ if (!optimistic_data) {
+ connection_edge_consider_sending_sendme(conn);
+ }
return 0;
case RELAY_COMMAND_END:
reason = rh.length > 0 ?
Performance and scalability notes:
There may be more RAM used at Exit nodes, as mentioned above, but it is
transient.
Filename: 175-automatic-node-promotion.txt
Title: Automatically promoting Tor clients to nodes
Author: Steven Murdoch
Created: 12-Mar-2010
Status: Rejected
1. Overview
This proposal describes how Tor clients could determine when they
have sufficient bandwidth capacity and are sufficiently reliable to
become either bridges or Tor relays. When they meet these
criteria, they will automatically promote themselves, based on user
preferences. The proposal also defines the new controller messages
and options which will control this process.
Note that for the moment, only transitions between client and
bridge are being considered. Transitions to public relay will
be considered at a future date, but will use the same
infrastructure for measuring capacity and reliability.
2. Motivation and history
Tor has a growing user-base and one of the major impediments to the
quality of service offered is the lack of network capacity. This is
particularly the case for bridges, because these are gradually
being blocked, and thus no longer of use to people within some
countries. By automatically promoting Tor clients to bridges, and
perhaps also to full public relays, this proposal aims to solve
these problems.
Only Tor clients which are sufficiently useful should be promoted,
and the process of determining usefulness should be performed
without reporting the existence of the client to the central
authorities. The criteria used for determining usefulness will be
in terms of bandwidth capacity and uptime, but parameters should be
specified in the directory consensus. State stored at the client
should be in no more detail than necessary, to prevent sensitive
information being recorded.
3. Design
3.x Opt-in state model
Tor can be in one of five node-promotion states:
- off (O): Currently a client, and will stay as such
- auto (A): Currently a client, but will consider promotion
- bridge (B): Currently a bridge, and will stay as such
- auto-bridge (AB): Currently a bridge, but will consider promotion
- relay (R): Currently a public relay, and will stay as such
The state can be fully controlled from the configuration file or
controller, but the normal state transitions are as follows:
Any state -> off: User has opted out of node promotion
Off -> any state: Only permitted with user consent
Auto -> auto-bridge: Tor has detected that it is sufficiently
reliable to be a *bridge*
Auto -> bridge: Tor has detected that it is sufficiently reliable
to be a *relay*, but the user has chosen to remain a *bridge*
Auto -> relay: Tor has detected that it is sufficiently reliable
to be *relay*, and will skip being a *bridge*
Auto-bridge -> relay: Tor has detected that it is sufficiently
reliable to be a *relay*
Note that this model does not support automatic demotion. If this
is desirable, there should be some memory as to whether the
previous state was relay, bridge, or auto-bridge. Otherwise the
user may be prompted to become a relay, although he has opted to
only be a bridge.
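The transitions above fit in a small table; a sketch in Python (the state names follow the list above, the function is ours):

```python
# Automatic promotions Tor may perform on its own; transitions out of
# 'off' require explicit user consent, and demotion is unsupported.
PROMOTIONS = {
    "auto":        {"auto-bridge", "bridge", "relay"},
    "auto-bridge": {"relay"},
    "bridge":      set(),
    "relay":       set(),
    "off":         set(),
}

def may_transition(current, target):
    """True if moving current -> target is a permitted normal
    transition; any state may always return to 'off' (opt-out)."""
    if target == "off":
        return True
    return target in PROMOTIONS.get(current, set())
```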
3.x User interaction policy
There are a variety of options for how to involve the user in the
decision as to whether and when to perform node promotion. The best
choice may also differ between when Tor is running under Vidalia (and
thus can readily prompt the user for information) and when it is running
standalone (where Tor can only log messages, which may or may not be read).
The option requiring minimal user interaction is to automatically
promote nodes according to reliability, and allow the user to opt
out, by changing settings in the configuration file or Vidalia user
interface.
Alternatively, if a user interface is available, Tor could prompt
the user when it detects that a transition is available, and allow
the user to choose which of the available options to select. If
Vidalia is not available, it still may be possible to solicit an
email address on install, and contact the operator to ask whether
a transition to bridge or relay is permitted.
Finally, Tor could by default not make any transition, and the user
would need to opt in by stating the maximum level (bridge or
relay) to which the node may automatically promote itself.
3.x Performance monitoring model
To prevent a large number of clients activating as relays, but
being too unreliable to be useful, clients should measure their
performance. If this performance meets parameterized acceptance
criteria, a client should consider promotion. To measure
reliability, this proposal adopts a simple user model:
- A user decides to use Tor at times which follow a Poisson
distribution
- At each time, the user will be happy if the bridge chosen has
adequate bandwidth and is reachable
- If the chosen bridge is down or slow too many times, the user
will consider Tor to be bad
If we additionally assume that the recent history of relay
performance matches the current performance, we can measure
reliability by simulating this simple user.
The following parameters are distributed to clients in the
directory consensus:
- min_bandwidth: Minimum self-measured bandwidth for a node to be
considered useful, in bytes per second
- check_period: How long, in seconds, to wait between checking
reachability and bandwidth (on average)
- num_samples: Number of recent samples to keep
- num_useful: Minimum number of recent samples where the node was
reachable and had at least min_bandwidth capacity, for a client
to consider promoting to a bridge
A different set of parameters may be used for considering when to
promote a bridge to a full relay, but this will be the subject of a
future revision of the proposal.
3.x Performance monitoring algorithm
The simulation described above can be implemented as follows:
Every 60 seconds:
1. Tor generates a random floating point number x in
the interval [0, 1).
2. If x > (1 / (check_period / 60)) GOTO end; otherwise:
3. Tor sets the value last_check to the current_time (in seconds)
4. Tor measures reachability
5. If the client is reachable, Tor measures its bandwidth
6. If the client is reachable and the bandwidth is >=
min_bandwidth, the test has succeeded, otherwise it has failed.
7. Tor adds the test result to the end of a ring-buffer containing
the last num_samples results: measurement_results
8. Tor saves last_check and measurement_results to disk
9. If the length of measurement_results == num_samples and
the number of successes >= num_useful, Tor should consider
promotion to a bridge
end.
When Tor starts, it must fill in the samples for which it was not
running. This can only happen once the consensus has been downloaded,
because the value of check_period is needed.
1. Tor generates a random number y from the Poisson distribution [1]
with lambda = (current_time - last_check) * (1 / check_period)
2. Tor sets the value last_check to the current_time (in seconds)
3. Add y test failures to the ring buffer measurement_results
4. Tor saves last_check and measurement_results to disk
In this way, a Tor client will measure its bandwidth and
reachability every check_period seconds, on average. Provided
check_period is sufficiently greater than a minute (say, at least an
hour), the check times will follow a Poisson distribution. [2]
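The per-minute loop and the startup fill-in can be sketched together. Python's standard library has no Poisson sampler, so this sketch uses the standard multiplication method from the reference in [1]; the class and method names are ours, not from the proposal:

```python
import math
import random
from collections import deque

def poisson(lam):
    """Sample from Poisson(lam) by the multiplication method [1]."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= random.random()
        if p <= limit:
            return k - 1

class PromotionMonitor:
    def __init__(self, check_period, num_samples, num_useful):
        self.check_period = check_period
        self.num_samples = num_samples
        self.num_useful = num_useful
        # ring buffer of the last num_samples results
        self.measurement_results = deque(maxlen=num_samples)

    def minute_tick(self, reachable, bandwidth, min_bandwidth):
        """Run once a minute; test with probability 60/check_period."""
        if random.random() > 60.0 / self.check_period:
            return
        self.measurement_results.append(
            reachable and bandwidth >= min_bandwidth)

    def fill_downtime(self, seconds_down):
        """On startup, add Poisson-many failures for missed checks."""
        for _ in range(poisson(seconds_down / self.check_period)):
            self.measurement_results.append(False)

    def should_consider_promotion(self):
        return (len(self.measurement_results) == self.num_samples and
                sum(self.measurement_results) >= self.num_useful)
```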
While this does require that Tor record the state of a client
over time, it does not leak much information. Only a binary
reachable/non-reachable result is stored, and the timing of samples
becomes increasingly fuzzy as the data becomes less recent.
On IP address changes, Tor should clear the ring-buffer, because
from the perspective of users with the old IP address, this node
might as well be a new one with no history. This policy may change
once we start allowing the bridge authority to hand out new IP
addresses given the fingerprint.
[Perhaps another consensus param? Also, this means we save previous
IP address in our state file, yes? -RD]
3.x Bandwidth measurement
Tor needs to measure its bandwidth to test the usefulness as a
bridge. A non-intrusive way to do this would be to passively measure
the peak data transfer rate since the last reachability test. Once
this exceeds min_bandwidth, Tor can set a flag that this node
currently has sufficient bandwidth to pass the bandwidth component
of the upcoming performance measurement.
For the first version we may simply skip the bandwidth test,
because the existing reachability test sends 500 kB over several
circuits, and checks whether the node can transfer at least 50
kB/s. This is probably good enough for a bridge, so this test
might be sufficient to record a success in the ring buffer.
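Passive peak-rate tracking might look like the following sketch (the class, names, and reset policy are our assumptions, not part of the proposal):

```python
class PeakRateTracker:
    """Track the peak observed transfer rate since the last reset,
    fed from the bandwidth accounting Tor already performs; no
    probe traffic is generated."""
    def __init__(self, min_bandwidth):
        self.min_bandwidth = min_bandwidth  # bytes per second
        self.peak = 0

    def note_sample(self, bytes_per_second):
        self.peak = max(self.peak, bytes_per_second)

    def fast_enough(self):
        """Would we pass the bandwidth half of the next check?"""
        return self.peak >= self.min_bandwidth

    def reset(self):
        """Called after each reachability test."""
        self.peak = 0
```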
3.x New options
3.x New controller message
4. Migration plan
We should start by setting a high bandwidth and uptime requirement
in the consensus, so as to avoid overloading the bridge authority
with too many bridges. Once we are confident our systems can scale,
the criteria can be gradually shifted down to gain more bridges.
5. Related proposals
6. Open questions:
- What user interaction policy should we take?
- When (if ever) should we turn a relay into an exit relay?
- What should the rate limits be for auto-promoted bridges/relays?
Should we prompt the user for this?
- Perhaps the bridge authority should tell potential bridges
whether to enable themselves, by taking into account whether
their IP address is blocked
- How do we explain the possible risks of running a bridge/relay
* Use of bandwidth/congestion
* Publication of IP address
* Blocking from IRC (even for non-exit relays)
- What feedback should we give to bridge relays, to encourage them
e.g. number of recent users (what about reserve bridges)?
- Can clients back-off from doing these tests (yes, we should do
this)
[1] For algorithms to generate random numbers from the Poisson
distribution, see: http://en.wikipedia.org/wiki/Poisson_distribution#Generating_Poisson-distributed_random_variables
[2] "The sample size n should be equal to or larger than 20 and the
probability of a single success, p, should be smaller than or equal to
.05. If n >= 100, the approximation is excellent if np is also <= 10."
http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm (e-Handbook of Statistical Methods)
Filename: 176-revising-handshake.txt
Title: Proposed version-3 link handshake for Tor
Author: Nick Mathewson
Created: 31-Jan-2011
Status: Closed
Target: 0.2.3
Supersedes: 169
1. Overview
I propose a (mostly) backward-compatible change to the Tor
connection establishment protocol to avoid the use of TLS
renegotiation, to avoid certain protocol fingerprinting attacks,
and to make it easier to write Tor clients and servers.
Rather than doing a TLS renegotiation to exchange certificates
and authenticate the original handshake, this proposal takes an
approach similar to Steven Murdoch's proposal 124 and my old
proposal 169, and uses Tor cells to finish authenticating the
parties' identities once the initial TLS handshake is finished.
I discuss some alternative design choices and why I didn't make
them in section 7; please have a quick look there before
telling me that something is pointless or makes no sense.
Terminological note: I use "client" or "initiator" below to mean
the Tor instance (a client or a bridge or a relay) that initiates a
TLS connection, and "server" or "responder" to mean the Tor
instance (a bridge or a relay) that accepts it.
2. History and Motivation
The _goals_ of the Tor link handshake have remained basically uniform
since our earliest versions. They are:
* Provide data confidentiality, data integrity
* Provide forward secrecy
* Allow responder authentication or bidirectional authentication.
* Try to look like some popular too-important-to-block-at-whim
encryption protocol, to avoid fingerprinting and censorship.
* Try to be implementable -- on the client side at least! --
by as many TLS implementations as possible.
When we added the v2 handshake, we added another goal:
* Remain compatible with older versions of the handshake
protocol.
In the original Tor TLS connection handshake protocol ("V1", or
"two-cert"), parties that wanted to authenticate provided a
two-cert chain of X.509 certificates during the handshake setup
phase. Every party that wanted to authenticate sent these
certificates. The security properties of this protocol are just
fine; the problem was that our behavior of sending
two-certificate chains made Tor easy to identify.
In the current Tor TLS connection handshake protocol ("V2", or
"renegotiating"), the parties begin with a single certificate
sent from the server (responder) to the client (initiator), and
then renegotiate to a two-certs-from-each-authenticating party.
We made this change to make Tor's handshake look like a browser
speaking SSL to a webserver. (See proposal 130, and
tor-spec.txt.) So from an observer's point of view, two parties
performing the V2 handshake begin by making a regular TLS
handshake with a single certificate, then renegotiate
immediately.
To tell whether to use the V1 or V2 handshake, the servers look
at the list of ciphers sent by the client. (This is ugly, but
there's not much else in the ClientHello that they can look at.)
If the list contains any cipher not used by the V1 protocol, the
server sends back a single cert and expects a renegotiation. If
the client gets back a single cert, then it withholds its own
certificates until the TLS renegotiation phase.
In other words, V2-supporting initiator behavior currently looks
like this:
- Begin TLS negotiation with V2 cipher list; wait for
certificate(s).
- If we get a certificate chain:
- Then we are using the V1 handshake. Send our own
certificate chain as part of this initial TLS handshake
if we want to authenticate; otherwise, send no
certificates. When the handshake completes, check
certificates. We are now mutually authenticated.
Otherwise, if we get just a single certificate:
- Then we are using the V2 handshake. Do not send any
certificates during this handshake.
- When the handshake is done, immediately start a TLS
renegotiation. During the renegotiation, expect
a certificate chain from the server; send a certificate
chain of our own if we want to authenticate ourselves.
- After the renegotiation, check the certificates. Then
send (and expect) a VERSIONS cell from the other side to
establish the link protocol version.
And V2-supporting responder behavior now looks like this:
- When we get a TLS ClientHello request, look at the cipher
list.
- If the cipher list contains only the V1 ciphersuites:
- Then we're doing a V1 handshake. Send a certificate
chain. Expect a possible client certificate chain in
response.
Otherwise, if we get other ciphersuites:
- We're using the V2 handshake. Send back a single
certificate and let the handshake complete.
- Do not accept any data until the client has renegotiated.
- When the client is renegotiating, send a certificate
chain, and expect (possibly multiple) certificates in
reply.
- Check the certificates when the renegotiation is done.
Then exchange VERSIONS cells.
Late in 2009, researchers found a flaw in most applications' use
of TLS renegotiation: Although TLS renegotiation does not
reauthenticate any information exchanged before the renegotiation
takes place, many applications were treating it as though it did,
and assuming that data sent _before_ the renegotiation was
authenticated with the credentials negotiated _during_ the
renegotiation. This problem was exacerbated by the fact that
most TLS libraries don't actually give you an obvious good way to
tell where the renegotiation occurred relative to the datastream.
Tor wasn't directly affected by this vulnerability, but the
aftermath hurts us in a few ways:
1) OpenSSL has disabled renegotiation by default, and created
a "yes we know what we're doing" option we need to set to
turn it back on. (Two options, actually: one for openssl
0.9.8l and one for 0.9.8m and later.)
2) Some vendors have removed all renegotiation support from
their versions of OpenSSL entirely, forcing us to tell
users to either replace their versions of OpenSSL or to
link Tor against a hand-built one.
3) Because of 1 and 2, I'd expect TLS renegotiation to become
rarer and rarer in the wild, making our own use stand out
more.
Furthermore, there are other issues related to TLS and
fingerprinting that we want to fix in any revised handshake:
1) We should make it easier to use self-signed certs, or maybe
even existing HTTPS certificates, for the server side
handshake, since most non-Tor SSL handshakes use either
self-signed certificates or CA-signed certificates.
2) We should allow other changes in our use of TLS and in our
certificates so as to resist fingerprinting based on how
our certificates look. (See proposal 179.)
3. Design
3.1. The view in the large
Taking a cue from Steven Murdoch's proposal 124 and my old
proposal 169, I propose that we move the work currently done by
the TLS renegotiation step (that is, authenticating the parties
to one another) and do it with Tor cells instead of with TLS
alone.
This section outlines the protocol; we go into more detail below.
To tell the client that it can use the new cell-based
authentication system, the server sends a "V3 certificate" during
the initial TLS handshake. (More on what makes a certificate
"v3" below.) If the client recognizes the format of the
certificate and decides to pursue the V3 handshake, then instead
of renegotiating immediately on completion of the initial TLS
handshake, the client instead sends a VERSIONS cell (and the
negotiation begins).
So the flowchart on the server side is:
Wait for a ClientHello.
If the client sends a ClientHello that indicates V1:
- Send a certificate chain.
- When the TLS handshake is done, if the client sent us a
certificate chain, then check it.
If the client sends a ClientHello that indicates V2 or V3:
- Send a self-signed certificate or a CA-signed certificate
- When the TLS handshake is done, wait for renegotiation or data.
- If renegotiation occurs, the client is V2: send a
certificate chain and maybe receive one. Check the
certificate chain as in V1.
- If the client sends data without renegotiating, it is
starting the V3 handshake. Proceed with the V3
handshake as below.
And the client-side flowchart is:
- Send a ClientHello with a set of ciphers that indicates V2/V3.
- After the handshake is done:
- If the server sent us a certificate chain, check it: we
are using the V1 handshake.
- If the server sent us a single "V2 certificate", we are
using the v2 handshake: the client begins to renegotiate
and proceeds as before.
- Finally, if the server sent us a "v3 certificate", we are
doing the V3 handshake below.
And the cell-based part of the V3 handshake, in summary, is:
C<->S: TLS handshake where S sends a "v3 certificate"
In TLS:
C->S: VERSIONS cell
S->C: VERSIONS cell, CERT cell, AUTH_CHALLENGE cell, NETINFO cell
C->S: Optionally: CERT cell, AUTHENTICATE cell
C->S: NETINFO cell
A "CERT" cell contains a set of certificates; an "AUTHENTICATE"
cell authenticates the client to the server. More on these
later.
3.2. Distinguishing V2 and V3 certificates
In the protocol outline above, we require that the client can
distinguish between v2 certificates (that is, those sent by
current servers) and v3 certificates. We further require that
existing clients will accept v3 certificates as they currently
accept v2 certificates.
Fortunately, current certificates have a few characteristics that
make them fairly well-mannered as it is. We say that a certificate
indicates a V2-only server if ALL of the following hold:
* The certificate is not self-signed.
* There is no DN field set in the certificate's issuer or
subject other than "commonName".
* The commonNames of the issuer and subject both end with
".net"
* The public modulus is at most 1024 bits long.
Otherwise, the client should assume that the server supports the
V3 handshake.
To the best of my knowledge, current clients will behave properly
on receiving non-v2 certs during the initial TLS handshake so
long as they eventually get the correct V2 cert chain during the
renegotiation.
The v3 requirements are easy to meet: any certificate designed to
resist fingerprinting will likely be self-signed, or if it's
signed by a CA, then the issuer will surely have more DN fields
set. Certificates that aren't trying to resist fingerprinting
can trivially become v3 by using a CN that doesn't end with .net,
or using a key longer than 1024 bits.
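The classification rules in this section can be summarized as a simple predicate. This is an illustrative sketch only, not Tor's implementation; the certificate is modeled as a plain dictionary with hypothetical field names (self_signed, issuer_dn, subject_dn, modulus_bits).

```python
def indicates_v2_only(cert):
    """Return True iff a server certificate signals a V2-only server.

    `cert` is a hypothetical dict with keys:
      self_signed  - bool
      issuer_dn    - dict of DN fields, e.g. {"commonName": "www.x.net"}
      subject_dn   - dict of DN fields
      modulus_bits - public modulus length in bits
    All four conditions must hold for the cert to indicate V2-only;
    otherwise the client assumes the server supports the V3 handshake.
    """
    if cert["self_signed"]:
        return False
    for dn in (cert["issuer_dn"], cert["subject_dn"]):
        # No DN field other than commonName may be set.
        if set(dn) != {"commonName"}:
            return False
        # Both commonNames must end with ".net".
        if not dn["commonName"].endswith(".net"):
            return False
    # The public modulus must be at most 1024 bits long.
    if cert["modulus_bits"] > 1024:
        return False
    return True
```

A certificate failing any one of the conditions (for example, a 2048-bit key) is treated as V3-capable.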
3.3. Authenticating via Tor cells: server authentication
Once the TLS handshake is finished, if the client renegotiates,
then the server should go on as it does currently.
If the client implements this proposal, however, and the server
has shown it can understand the V3+ handshake protocol, the
client immediately sends a VERSIONS cell to the server
and waits to receive a VERSIONS cell in return. We negotiate
the Tor link protocol version _before_ we proceed with the
negotiation, in case we need to change the authentication
protocol in the future.
Once either party has seen the VERSIONS cell from the other, it
knows which version they will pick (that is, the highest version
shared by both parties' VERSIONS cells). All Tor instances using
the handshake protocol described in 3.2 MUST support at least
link protocol version 3 as described here. If a version lower
than 3 is negotiated with the V3 handshake in place, a Tor
instance MUST close the connection.
On learning the link protocol, the server then sends the client a
CERT cell, an AUTH_CHALLENGE cell, and a NETINFO cell. If the
client wants to authenticate to the server, it sends a CERT cell,
an AUTHENTICATE cell, and a NETINFO cell; or it may simply send a
NETINFO cell if it does not want to authenticate.
The CERT cell describes the keys that a Tor instance is claiming
to have. It is a variable-length cell. Its payload format is:
N: Number of certs in cell [1 octet]
N times:
CertType [1 octet]
CLEN [2 octets]
Certificate [CLEN octets]
Any extra octets at the end of a CERT cell MUST be ignored.
CertType values are:
1: Link key certificate from RSA1024 identity
2: RSA1024 Identity certificate
3: RSA1024 AUTHENTICATE cell link certificate
The certificate format is X509.
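As a sketch of the wire format above, the following parses a CERT cell payload into (CertType, certificate) pairs. This is illustrative only, not Tor's implementation; it assumes, as elsewhere in the Tor protocol, that multi-octet integers are in network (big-endian) byte order.

```python
import struct

def parse_cert_cell(payload: bytes):
    """Parse the payload of a CERT cell.

    Format: N [1 octet], then N entries of
      CertType [1 octet], CLEN [2 octets], Certificate [CLEN octets].
    Extra octets after the last certificate are ignored, per the spec.
    Returns a list of (cert_type, der_bytes) pairs.
    """
    n = payload[0]
    pos = 1
    certs = []
    for _ in range(n):
        cert_type = payload[pos]
        (clen,) = struct.unpack_from(">H", payload, pos + 1)
        start = pos + 3
        certs.append((cert_type, payload[start:start + clen]))
        pos = start + clen
    return certs
```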
To authenticate the server, the client MUST check the following:
* The CERT cell contains exactly one CertType 1 "Link" certificate.
* The CERT cell contains exactly one CertType 2 "ID" certificate.
* Both certificates have validAfter and validUntil dates that
are not expired.
* The certified key in the Link certificate matches the
link key that was used to negotiate the TLS connection.
* The certified key in the ID certificate is a 1024-bit RSA key.
* The certified key in the ID certificate was used to sign both
certificates.
* The link certificate is correctly signed with the key in the
ID certificate
* The ID certificate is correctly self-signed.
If all of these conditions hold, then the client knows that it is
connected to the server whose identity key is certified in the ID
certificate. If any condition does not hold, the client closes
the connection. If the client wanted to connect to a server with
a different identity key, the client closes the connection.
An AUTH_CHALLENGE cell is a variable-length cell with the following
fields:
Challenge [32 octets]
N_Methods [2 octets]
Methods [2 * N_Methods octets]
It is sent from the server to the client. Clients MUST ignore
unexpected bytes at the end of the cell. Servers MUST generate
every challenge using a strong RNG or PRNG.
The Challenge field is a randomly generated string that the
client must sign (a hash of) as part of authenticating. The
methods are the authentication methods that the server will
accept. Only one authentication method is defined right now; see
3.4 below.
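A sketch of parsing the AUTH_CHALLENGE payload, again assuming network (big-endian) byte order for the two-octet fields; illustrative only:

```python
import struct

def parse_auth_challenge(payload: bytes):
    """Parse an AUTH_CHALLENGE cell payload.

    Format: Challenge [32 octets], N_Methods [2 octets],
    Methods [2 * N_Methods octets]. Unexpected trailing bytes are
    ignored, as the spec requires of clients.
    """
    challenge = payload[:32]
    (n_methods,) = struct.unpack_from(">H", payload, 32)
    methods = list(struct.unpack_from(">%dH" % n_methods, payload, 34))
    return challenge, methods
```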
3.4. Authenticating via Tor cells: Client authentication
A client does not need to authenticate to the server. If it
does not wish to, it responds to the server's valid CERT cell by
sending a NETINFO cell: once it has gotten a valid NETINFO cell,
the client should consider the connection open, and the
server should consider the connection as opened by an
unauthenticated client.
If a client wants to authenticate, it responds to the
AUTH_CHALLENGE cell with a CERT cell and an AUTHENTICATE cell.
The CERT cell is as a server would send, except that instead of
sending a CertType 1 cert for an arbitrary link certificate, the
client sends a CertType 3 cert for an RSA AUTHENTICATE key.
(This difference is because we allow any link key type on a TLS
link, but the protocol described here will only work for 1024-bit
RSA keys. A later protocol version should extend the protocol
here to work with non-1024-bit, non-RSA keys.)
An AUTHENTICATE cell is a variable-length cell with the following
payload:
AuthType [2 octets]
AuthLen [2 octets]
Authentication [AuthLen octets]
Servers MUST ignore extra bytes at the end of an AUTHENTICATE
cell. If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the
Authentication contains the following:
TYPE: The characters "AUTH0001" [8 octets]
CID: A SHA256 hash of the client's RSA1024 identity key [32 octets]
SID: A SHA256 hash of the server's RSA1024 identity key [32 octets]
SLOG: A SHA256 hash of all bytes sent from the server to the client
as part of the negotiation up to and including the
AUTH_CHALLENGE cell; that is, the VERSIONS cell,
the CERT cell, the AUTH_CHALLENGE cell, and any padding cells.
[32 octets]
CLOG: A SHA256 hash of all bytes sent from the client to the
server as part of the negotiation so far; that is, the
VERSIONS cell and the CERT cell and any padding cells. [32 octets]
SCERT: A SHA256 hash of the server's TLS link
certificate. [32 octets]
TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the
secret key, of the following:
- client_random, as sent in the TLS Client Hello
- server_random, as sent in the TLS Server Hello
- the NUL terminated ASCII string:
"Tor V3 handshake TLS cross-certification"
[32 octets]
TIME: The time of day in seconds since the POSIX epoch. [8 octets]
RAND: A 16 byte value, randomly chosen by the client [16 octets]
SIG: A signature of a SHA256 hash of all the previous fields
using the client's "Authenticate" key as presented. (As
always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt
section 0.3.)
[variable length]
To check the AUTHENTICATE cell, a server checks that all fields
from TYPE through TLSSECRETS contain their correct values as
described above, and then verifies the signature. The server MUST
ignore any extra bytes in the signed data after the SHA256 hash.
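As an illustration of the field layout above, the following sketch assembles the Authentication field for AuthType 1. The key, transcript, and certificate arguments are placeholder byte strings, and the RSA signing step is passed in as a stub callable, since it depends on key material not modeled here.

```python
import hashlib
import hmac
import os
import struct
import time

def build_auth_body(client_id_key, server_id_key, server_log, client_log,
                    server_tls_cert, tls_master_secret,
                    client_random, server_random, sign):
    """Assemble the fields signed in an AUTHENTICATE cell (AuthType 1).

    All key/log/cert arguments are raw byte strings; `sign` is a
    callable that RSA-signs a SHA256 digest with the client's
    "Authenticate" key (stubbed here).
    """
    # TLSSECRETS: HMAC-SHA256 keyed with the TLS master secret, over
    # client_random, server_random, and the NUL-terminated label.
    tls_secrets = hmac.new(
        tls_master_secret,
        client_random + server_random +
        b"Tor V3 handshake TLS cross-certification\x00",
        hashlib.sha256).digest()
    body = (b"AUTH0001" +                                 # TYPE
            hashlib.sha256(client_id_key).digest() +      # CID
            hashlib.sha256(server_id_key).digest() +      # SID
            hashlib.sha256(server_log).digest() +         # SLOG
            hashlib.sha256(client_log).digest() +         # CLOG
            hashlib.sha256(server_tls_cert).digest() +    # SCERT
            tls_secrets +                                 # TLSSECRETS
            struct.pack(">Q", int(time.time())) +         # TIME
            os.urandom(16))                               # RAND
    # SIG: signature of a SHA256 hash of all the previous fields.
    return body + sign(hashlib.sha256(body).digest())
```

The server-side check recomputes the same hashes from its own view of the handshake and compares field by field before verifying SIG.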
3.5. Responding to extra cells, and other security checks.
If the handshake is a V3 TLS handshake, both parties MUST reject
any negotiated link version less than 3. Both parties MUST check
this and close the connection if it is violated.
If the handshake is not a V3 TLS handshake, both parties MUST
still advertise all link protocols they support in their versions
cell. Both parties MUST close the link if it turns out they both
would have supported version 3 or higher, but they somehow wound
up using a v2 or v1 handshake. (More on this in section 6.4.)
Either party may send a VPADDING cell at any time during the
handshake, except as the first cell. (See proposal 184.)
A server SHOULD NOT send any sequence of cells when starting a v3
negotiation other than "VERSIONS, CERT, AUTH_CHALLENGE,
NETINFO". A client SHOULD drop a CERT, AUTH_CHALLENGE, or
NETINFO cell that appears at any other time or out of sequence.
A client SHOULD NOT begin a v3 negotiation with any sequence
other than "VERSIONS, NETINFO" or "VERSIONS, CERT, AUTHENTICATE,
NETINFO". A server SHOULD drop a CERT, AUTHENTICATE, or
NETINFO cell that appears at any other time or out of sequence.
4. Numbers to assign
We need a version number for this link protocol. I've been
calling it "3".
We need to reserve command numbers for CERT, AUTH_CHALLENGE, and
AUTHENTICATE. I suggest that in link protocol 3 and higher, we
reserve a separate range of commands for variable-length cells.
See proposal 184 for more there.
5. Efficiency
This protocol adds a round-trip step when the client sends a
VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
AUTH_CHALLENGE, NETINFO} response in turn. (The server then waits
for the client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
but it would have already been waiting for the client's NETINFO,
so that's not an additional wait.)
This is actually fewer round-trip steps than TLS renegotiation
required, so that's a win over v2.
6. Security argument
These aren't crypto proofs, since I don't write those. They are
meant to be reasonably convincing.
6.1. The server is authenticated
TLS guarantees that if the TLS handshake completes successfully,
the client knows that it is speaking to somebody who knows the
private key corresponding to the public link key that was used in
the TLS handshake.
Because this public link key is signed by the server's identity
key in the CERT cell, the client knows that somebody who holds
the server's private identity key says that the server's public
link key corresponds to the server's public identity key.
Therefore, if the crypto works, and if TLS works, and if the keys
aren't compromised, then the client is talking to somebody who
holds the server's private identity key.
6.2. The client is authenticated
Once the server has checked the client's certificates, the server
knows that somebody who knows the client's private identity key
says that he is the one holding the private key corresponding to
the client's presented link-authentication public key.
Once the server has checked the signature in the AUTHENTICATE
cell, the server knows that somebody holding the client's
link-authentication private key signed the data in question. By
the standard certification argument above, the server knows that
somebody holding the client's private identity key signed the
data in question.
So the server's remaining question is: am I really talking to
somebody holding the client's identity key, or am I getting a
replayed or MITM'd AUTHENTICATE cell that was previously sent by
the client?
Because the client includes a TLSSECRET component, and the
server is able to verify it, then the answer is easy: the server
knows for certain that it is talking to the party with whom it
did the TLS handshake, since if somebody else generated a correct
TLSSECRET, they would have to know the master secret of the TLS
connection, which would require them to have broken TLS.
Even if the protocol didn't contain the TLSSECRET component,
the server could still verify the client's authentication, but it's
a little trickier. The server knows that it is not getting a replayed
AUTHENTICATE cell, since the cell authenticates (among other
stuff) the server's AUTH_CHALLENGE cell, which it has never used
before. The server knows that it is not getting a MITM'd
AUTHENTICATE cell, since the cell includes a hash of the server's
link certificate, which nobody else should have been able to use
in a successful TLS negotiation.
6.3. MITM attacks won't work any better than they do against TLS
TLS guarantees that a man-in-the-middle attacker can't read the
content of a successfully negotiated encrypted connection, nor
alter the content in any way other than truncating it, unless he
compromises the session keys or one of the key-exchange secret
keys used to establish that connection. Let's make sure we do at
least that well.
Suppose that a client Alice connects to an MITM attacker Mallory,
thinking that she is connecting to some server Bob. Let's assume
that the TLS handshake between Alice and Mallory finishes
successfully and the v3 protocol is chosen. [If the v1 or v2
protocol is chosen, those already resist MITM. If the TLS
handshake doesn't complete, then Alice isn't connected to anybody.]
During the v3 handshake, Mallory can't convince Alice that she is
talking to Bob, since she should not be able to produce a CERT
cell containing a certificate chain signed by Bob's identity key
and used to authenticate the link key that Mallory used during
TLS. (If Mallory used her own link key for the TLS handshake, it
won't match anything Bob signed unless Bob is compromised.
Mallory can't use any key that Bob _did_ produce a certificate
for, since she doesn't know the private key.)
Even if Alice fails to check the certificates from Bob, Mallory
still can't convince Bob that she is really Alice. Assuming that
Alice's keys aren't compromised, Mallory can't send a CERT cell
with a cert chain from Alice's identity key to a key that Mallory
controls, so if Mallory wants to impersonate Alice's identity
key, she can only do so by sending an AUTHENTICATE cell really
generated by Alice. Because Bob will check that the random bytes
in the AUTH_CHALLENGE cell influence the SLOG hash, Mallory
needs to send Bob's challenge to Alice, and can't use any other
AUTHENTICATE cell that Alice generated before. But because the
AUTHENTICATE cell Alice will generate will include in the SCERT
field a hash of the link certificate used by Mallory, Bob will
reject it as not being valid to connect to him.
6.4. Protocol downgrade attacks won't work.
Assuming that Alice checks the certificates from Bob, she knows
that Bob really sent her the VERSIONS cell that she received.
Because the AUTHENTICATE cell from Alice includes signed hashes
of the VERSIONS cells from Alice and Bob, Bob knows that Alice
got the VERSIONS cell he sent and sent the VERSIONS cell that he
received.
But what about attempts to downgrade the protocol earlier in the
handshake? Here TLS comes to the rescue: because the TLS
Finished handshake message includes an authenticated digest of
everything previously said during the handshake, an attacker
can't replace the client's ciphersuite list (to trigger a
downgrade to the v1 protocol) or the server's certificate [chain]
(to trigger a downgrade to the v1 or v2 protocol).
7. Design considerations
I previously considered adding our own certificate format in
order to avoid the pain associated with X509, but decided instead
to simply use X509 since a correct Tor implementation will
already need to have X509 code to handle the other handshake
versions and to use TLS.
The trickiest part of the design here is deciding what to stick
in the AUTHENTICATE cell. Some of it is strictly necessary, and
some of it is left there for security margin in case my other
security arguments fail. Because of the CID and SID elements
you can't use an AUTHENTICATE cell for anything other than
authenticating a client ID to a server with an appropriate
server ID. The SLOG and CLOG elements are there mostly to
authenticate the VERSIONS cells and resist downgrade attacks
once there are two versions of this. The presence of the
AUTH_CHALLENGE field in the stuff authenticated in SLOG
prevents replays and ensures that the AUTHENTICATE cell was
really generated by somebody who is reading what the server is
sending over the TLS connection. The SCERT element is meant to
prevent MITM attacks. When the TLSSECRET field is
used, it should prevent the use of the AUTHENTICATE cell for
anything other than the TLS connection the client had in mind.
A signature of the TLSSECRET element on its own should also be
sufficient to prevent the attacks we care about. The redundancy
here should come in handy if I've made a mistake somewhere else in
my analysis.
If the client checks the server's certificates and matches them
to the TLS connection link key before proceeding with the
handshake, then signing the contents of the AUTH_CHALLENGE cell
would be sufficient to authenticate the client. But implementers
of allegedly compatible Tor clients have in the past skipped
certificate verification steps, and I didn't want a client's
failure to verify certificates to mean that a server couldn't
trust that he was really talking to the client. To prevent this,
I added the TLS link certificate to the authenticated data: even
if the Tor client code doesn't check any certificates, the TLS
library code will still check that the certificate used in the
handshake contains a link key that matches the one used in the
handshake.
8. Open questions:
- May we cache which certificates we've already verified? It
might leak in timing whether we've connected with a given server
before, and how recently.
- With which TLS libraries is it feasible to yoink client_random,
server_random, and the master secret? If the answer is "All
free C TLS libraries", great. If the answer is "OpenSSL only",
not so great.
- Should we do anything to check the timestamp in the AUTHENTICATE
cell?
- Can we give some way for clients to signal "I want to use the
V3 protocol if possible, but I can't renegotiate, so don't give
me the V2"? Clients currently have a fair idea of server
versions, so they could potentially do the V3 handshake with
servers that support it, and fall back to V1 otherwise.
- What should servers that don't have TLS renegotiation do? For
now, I think they should just stick with V1. Eventually we can
deprecate the V2 handshake as we did with the V1 handshake.
When that happens, servers can be V3-only.
Filename: 177-flag-abstention.txt
Title: Abstaining from votes on individual flags
Author: Nick Mathewson
Created: 14 Feb 2011
Status: Reserve
Target: 0.2.4.x
Overview:
We should have a way for authorities to vote on flags in
particular instances, without having to vote on that flag for all
servers.
Motivation:
Suppose that the status of some router becomes controversial, and
an authority wants to vote for or against the BadExit status of
that router. Suppose also that the authority is not currently
voting on the BadExit flag. If the authority wants to say that
the router is or is not "BadExit", it cannot currently do so
without voting yea or nay on the BadExit status of all other
routers.
Suppose that an authority wants to vote "Valid" or "Invalid" on a
large number of routers, but does not have an opinion on some of
them. Currently, it cannot do so: if it votes for the Valid flag
anywhere, it votes for it everywhere.
Design:
We add a new line "extra-flags" in directory votes, to appear
after "known-flags". It lists zero or more flags that an
authority has occasional opinions on, but for which the authority
will usually abstain. No flag may appear in both extra-flags and
known-flags.
In the router-status section for each directory vote, we allow an
optional "s2" line to appear after the "s" line. It contains
zero or more flag votes. A flag vote is of the form of one of
"+", "-", or "/" followed by the name of a flag. "+" denotes a
yea vote, "-" denotes a nay vote, and "/" denotes an
abstention. Authorities may omit most abstentions, except as
noted below. No flag may appear in an s2 line unless it appears
in the known-flags or extra-flags line. We retain the rule that no
flag may appear in an s line unless it appears in the known-flags
line.
When using an appropriate consensus method to vote, we use these
new rules to determine flags:
A flag is listed in the consensus if it is in the known-flags
section of at least one voter, and in the known-flags or
extra-flags section of at least three voters (or half the
authorities, whichever set is smaller).
A single authority's vote for a given flag on a given router is
interpreted as follows:
- If the authority votes +Flag or -Flag or /Flag in the s2 line for
that router, the vote is "yea" or "nay" or "abstain" respectively.
- Otherwise, if the flag is listed on the "s" line for the
router, then the vote is "yea".
- Otherwise, if the flag is listed in the known-flags line,
then the vote is "nay".
- Otherwise, the vote is "abstain".
A router is assigned a flag in the consensus iff the total "yeas"
outnumber the total "nays".
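The interpretation rules above can be sketched as follows. The vote representation (dictionaries and sets) is hypothetical, chosen only to show the precedence of the s2 line over the s line over known-flags.

```python
def flag_vote(auth_vote, router, flag):
    """Interpret one authority's vote on `flag` for `router`.

    `auth_vote` is a hypothetical dict:
      known_flags - set of flags the authority always votes on
      s           - per-router set of flags on the "s" line
      s2          - per-router dict mapping flag -> "+", "-", or "/"
    Returns "yea", "nay", or "abstain" per the rules above.
    """
    s2 = auth_vote["s2"].get(router, {})
    if flag in s2:                               # explicit s2 vote wins
        return {"+": "yea", "-": "nay", "/": "abstain"}[s2[flag]]
    if flag in auth_vote["s"].get(router, set()):
        return "yea"                             # listed on the s line
    if flag in auth_vote["known_flags"]:
        return "nay"                             # known flag, not listed
    return "abstain"                             # flag unknown to voter

def assign_flag(votes, router, flag):
    """True iff the total yeas outnumber the nays (no-votes win ties)."""
    tally = [flag_vote(v, router, flag) for v in votes]
    return tally.count("yea") > tally.count("nay")
```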
As an exception, this proposal does not affect the behavior of
the "Named" and "Unnamed" flags; these are still treated as
before. (An authority can already abstain from a single naming
decision by not voting Named on any router with a given name.)
Examples:
Suppose that it becomes important to know which Tor servers are
operated by burrowing marsupials. Some authority operators
diligently research this question; others want to vote about
individual routers on an ad hoc basis when they learn about a
particular router's being e.g. located underground in New South
Wales.
If an authority usually has no opinions on the RunByWombats flag,
it should list it in the "extra-flags" of its votes. If it
occasionally wants to vote that a router is (or is not) run by
wombats, it should list "s2 +RunByWombats" or "s2 -RunByWombats"
for the routers in question. Otherwise it can omit the flag from
its s and s2 lines entirely.
If an authority usually has an opinion on the RunByWombats flag,
but wants to abstain in some cases, it should list "RunByWombats"
in the "known-flags" part of its votes, and include
"RunByWombats" in the s line for every router that it believes is
run by wombats. When it wants to vote that a router is not run
by wombats, it should list the RunByWombats flag in neither the s
nor the s2 line. When it wants to abstain, it should list "s2
/RunByWombats".
In both cases, when the new consensus method is used, a router
will get listed as "RunByWombats" if there are more authorities
that say it is run by wombats than there are authorities saying
it is not run by wombats. (As now, "no" votes win ties.)
Filename: 178-param-voting.txt
Title: Require majority of authorities to vote for consensus parameters
Author: Sebastian Hahn
Created: 16-Feb-2011
Status: Closed
Implemented-In: 0.2.3.9-alpha
Overview:
The consensus that the directory authorities create may contain one or
more parameters (32-bit signed integers) that influence the behavior
of Tor nodes (see proposal 167, "Vote on network parameters in
consensus" for more details).
Currently (as of consensus method 11), a consensus will end up
containing a parameter if at least one directory authority votes for
that parameter. The value of the parameter will be the low-median of
all the votes for this parameter.
This proposal aims at changing this voting process to be more secure
against tampering by a small fraction of directory authorities.
Motivation:
To prevent a small fraction of the directory authorities from
influencing the value of a parameter unduly, a big enough fraction
of all directory authorities has to vote for that
parameter. This is not currently happening, and it is in fact not
uncommon for a single authority to govern the value of a consensus
parameter.
Design:
When the consensus is generated, the directory authorities ensure that
a param is only included in the list of params if at least three of the
authorities (or a simple majority, whichever is the smaller number)
votes for that param. The value chosen is the low-median of all the
votes. We don't mandate that the authorities have to vote on exactly
the same value for it to be included because some consensus parameters
could be the result of active measurements that individual authorities
make.
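The inclusion threshold and low-median computation described above can be sketched as follows (Python; the function name is illustrative, and the convention that the low-median of an even-length list is the lower of the two middle values is an assumption consistent with "low-median" as used elsewhere in this document, not code taken from Tor):

```python
def consensus_params(votes_by_param, n_authorities):
    """Compute consensus params under the proposed rule.

    votes_by_param maps a parameter name to the list of values voted
    for it. A param is included only if at least min(3, simple
    majority) authorities voted on it; its value is the low-median
    of all votes for it.
    """
    threshold = min(3, n_authorities // 2 + 1)
    result = {}
    for name, values in votes_by_param.items():
        if len(values) >= threshold:
            ordered = sorted(values)
            # low-median: for an even count, take the lower middle value
            result[name] = ordered[(len(ordered) - 1) // 2]
    return result
```

With nine authorities, a parameter voted on by only one authority is dropped, while one with four votes is kept at the low-median of those votes.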
Security implications:
This change is aimed at improving the security of Tor nodes against
attacks carried out by a small fraction of directory authorities. It
is possible that a consensus parameter that would be helpful to the
network is not included because not enough directory authorities
voted for it, but since clients are required to have sane defaults
in case the parameter is absent this does not carry a security risk.
This proposal makes a security vs coordination effort tradeoff. When
considering only the security of the design, it would be better to
require a simple majority of directory authorities to agree on
voting on a parameter, but it would involve requiring more
directory authority operators to coordinate their actions to set the
parameter successfully.
Specification:
dir-spec section 3.4 currently says:
Entries are given on the "params" line for every keyword on which any
authority voted. The values given are the low-median of all votes on
that keyword.
It is proposed that the above is changed to:
Entries are given on the "params" line for every keyword on which a
majority of authorities (out of the total number of authorities, not
just those participating in this vote) voted, or on which at least
three authorities voted. The values given are the low-median of all
votes on that keyword.
In consensus methods 11 and earlier, entries are given on the "params"
line for every keyword on which any authority voted, the value given
being the low-median of all votes on that keyword.
The following should be added to the bottom of section 3.4.:
* If consensus method 12 or later is used, only consensus
parameters that more than half of the total number of
authorities voted for are included in the consensus.
The following line should be added to the bottom of section 3.4.1.:
"12" -- Params are only included if enough auths voted for them
Compatibility:
A sufficient number of directory authorities must upgrade to the new
consensus method used to calculate the params in the way this proposal
calls for, otherwise the old mechanism is used. Nodes that do not act
as directory authorities do not need to be upgraded and should
experience no change in behaviour.
Implementation:
An example implementation of this feature can be found in
https://gitweb.torproject.org/sebastian/tor.git, branch safer_params.
Filename: 179-TLS-cert-and-parameter-normalization.txt
Title: TLS certificate and parameter normalization
Author: Jacob Appelbaum, Gladys Shufflebottom
Created: 16-Feb-2011
Status: Closed
Target: 0.2.3.x
Draft spec for TLS certificate and handshake normalization
Overview
STATUS NOTE:
This document is implemented in part in 0.2.3.x, deferred in part, and
rejected in part. See indented bracketed comments in individual
sections below for more information. -NM
Scope
This is a document that proposes improvements to problems with Tor's
current TLS (Transport Layer Security) certificates and handshake that will
reduce the distinguishability of Tor traffic from other encrypted traffic that
uses TLS. It also addresses some of the possible fingerprinting attacks
possible against the current Tor TLS protocol setup process.
Motivation and history
Censorship is an arms race and this is a step forward in the defense
of Tor. This proposal outlines ideas to make it more difficult to
fingerprint and block Tor traffic.
Goals
This proposal intends to normalize or remove easy-to-predict or static
values in the Tor TLS certificates and with the Tor TLS setup process.
These values can be used as criteria for the automated classification of
encrypted traffic as Tor traffic. Network observers should not be able
to trivially detect Tor merely by receiving or observing the certificate
used or advertised by a Tor relay. I also propose the creation of
a hard-to-detect covert channel through which a server can signal that it
supports the third version ("V3") of the Tor handshake protocol.
Non-Goals
This document is not intended to solve all of the possible active or passive
Tor fingerprinting problems. This document focuses on removing distinctive
and predictable features of TLS protocol negotiation; we do not attempt to
make guarantees about resisting other kinds of fingerprinting of Tor
traffic, such as fingerprinting techniques related to timing or volume of
transmitted data.
Implementation details
Certificate Issues
The CN or commonName ASN1 field
Tor generates certificates with a predictable commonName field; the
field is within a given range of values that is specific to Tor.
Additionally, the generated host names have other undesirable properties.
The host names typically do not resolve in the DNS because the domain
names referred to are generated at random. Although they are syntactically
valid, they usually refer to domains that have never been registered by
any domain name registrar.
An example of the current commonName field: CN=www.s4ku5skci.net
An example of OpenSSL’s asn1parse over a typical Tor certificate:
0:d=0 hl=4 l= 438 cons: SEQUENCE
4:d=1 hl=4 l= 287 cons: SEQUENCE
8:d=2 hl=2 l= 3 cons: cont [ 0 ]
10:d=3 hl=2 l= 1 prim: INTEGER :02
13:d=2 hl=2 l= 4 prim: INTEGER :4D3C763A
19:d=2 hl=2 l= 13 cons: SEQUENCE
21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
32:d=3 hl=2 l= 0 prim: NULL
34:d=2 hl=2 l= 35 cons: SEQUENCE
36:d=3 hl=2 l= 33 cons: SET
38:d=4 hl=2 l= 31 cons: SEQUENCE
40:d=5 hl=2 l= 3 prim: OBJECT :commonName
45:d=5 hl=2 l= 24 prim: PRINTABLESTRING :www.vsbsvwu5b4soh4wg.net
71:d=2 hl=2 l= 30 cons: SEQUENCE
73:d=3 hl=2 l= 13 prim: UTCTIME :110123184058Z
88:d=3 hl=2 l= 13 prim: UTCTIME :110123204058Z
103:d=2 hl=2 l= 28 cons: SEQUENCE
105:d=3 hl=2 l= 26 cons: SET
107:d=4 hl=2 l= 24 cons: SEQUENCE
109:d=5 hl=2 l= 3 prim: OBJECT :commonName
114:d=5 hl=2 l= 17 prim: PRINTABLESTRING :www.s4ku5skci.net
133:d=2 hl=3 l= 159 cons: SEQUENCE
136:d=3 hl=2 l= 13 cons: SEQUENCE
138:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
149:d=4 hl=2 l= 0 prim: NULL
151:d=3 hl=3 l= 141 prim: BIT STRING
295:d=1 hl=2 l= 13 cons: SEQUENCE
297:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
308:d=2 hl=2 l= 0 prim: NULL
310:d=1 hl=3 l= 129 prim: BIT STRING
I propose that we match OpenSSL's default self-signed certificates. I hypothesise
that they are the most common self-signed certificates. If this turns out not
to be the case, then we should use whatever the most common turns out to be.
Certificate serial numbers
Currently our generated certificate serial number is set to the number of
seconds since the epoch at the time of the certificate's creation. I propose
that we should ensure that our serial numbers are unrelated to the epoch,
since the generation methods are potentially recognizable as Tor-related.
Instead, I propose that we use a randomly generated number that is
subsequently hashed with SHA-512 and truncated to eight bytes[1].
Random sixteen-byte values appear to be the high bound for serial
numbers as issued by Verisign and DigiCert. RapidSSL serial numbers
appear to be three bytes in length, and other common lengths appear to
be between one and four bytes. The default
OpenSSL certificates are eight bytes and we should use this length with our
self-signed certificates.
This randomly generated serial number field may now serve as a covert channel
that signals to the client that the OR will not support TLS renegotiation; this
means that the client can expect to perform a V3 TLS handshake setup.
Otherwise, if the serial number is a reasonable time since the epoch, we should
assume the OR is using an earlier protocol version and hence that it expects
renegotiation.
We also have a need to signal properties with our certificates for a possible
v3 handshake in the future. Therefore I propose that we match OpenSSL default
self-signed certificates (a 64-bit random number), but reserve the two least-
significant bits for signaling. For the moment, these two bits will be zero.
This means that an attacker may be able to identify Tor certificates from default
OpenSSL certificates with a 75% probability.
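A minimal sketch of the proposed serial-number generation (Python; the function name is hypothetical, and the construction follows the description above - fresh randomness hashed with SHA-512, truncated to eight bytes, with the two least-significant bits reserved - rather than any shipped Tor code):

```python
import hashlib
import os

def make_serial():
    """Generate a 64-bit certificate serial as proposed: hash fresh
    randomness with SHA-512, truncate to eight bytes, and clear the
    two least-significant bits (reserved for signaling, currently
    zero)."""
    digest = hashlib.sha512(os.urandom(32)).digest()
    serial = int.from_bytes(digest[:8], "big")
    return serial & ~0b11  # reserve the two low bits
```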
As a security note, care must be taken to ensure that supporting this
covert channel will not lead to an attacker having a method to downgrade client
behavior. This shouldn't be a risk because the TLS Finished message hashes over
all the bytes of the handshake, including the certificates.
[Randomized serial numbers are implemented in 0.2.3.9-alpha. We probably
shouldn't do certificate tagging by a covert channel in serial numbers,
since doing so would mean we could never have an externally signed
cert. -NM]
Certificate fingerprinting issues expressed as base64 encoding
It appears that all deployed Tor certificates have the following strings in
common:
MIIB
CCA
gAwIBAgIETU
ANBgkqhkiG9w0BAQUFADA
YDVQQDEx
3d3cu
As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID)
properties (sha1WithRSAEncryption, commonName, etc) of how we generate our
certificates.
As an illustrated example of the common bytes of all certificates used within
the Tor network within a single one hour window, I have replaced the actual
value with a wild card ('.') character here:
-----BEGIN CERTIFICATE-----
MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3
d3cu............................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
........................... <--- Variable length and padding
-----END CERTIFICATE-----
This fine ascii art only illustrates the bytes that absolutely match in all
cases. In many cases, it's likely that there is a high probability for a given
byte to be only a small subset of choices.
Using the above strings, the EFF's certificate observatory may trivially
discover all known relays, known bridges and unknown bridges in a single SQL
query. I propose that we test our certificates to ensure that they do
not share these kinds of statistical similarities unless those
similarities also overlap with a very large cross-section of the
internet's certificates.
Certificate dating and validity issues
TLS certificates found in the wild are generally found to be long-lived;
they are frequently old and often even expired. The current Tor certificate
validity time is a very small time window starting at generation time and
ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME
(2*60*60).
I propose that the certificate validity time length is extended to a period of
twelve Earth months, possibly with a small random skew to be determined by the
implementer. Tor should randomly set the start date in the past, within
some currently unspecified window of time before the current date. This would
more closely track the typical distribution of non-Tor TLS certificate
expiration times.
The certificate values, such as expiration, should not be used for anything
relating to security; for example, if the OR presents an expired TLS
certificate, this does not imply that the client should terminate the
connection (as would be appropriate for an ordinary TLS implementation).
Rather, I propose we use a TOFU style expiration policy - the certificate
should never be trusted for more than a two hour window from first sighting.
This policy should have two major impacts. The first is that an adversary will
have to perform a differential analysis of all certificates for a given IP
address rather than a single check. The second is that the server expiration
time is enforced by the client and confirmed by keys rotating in the consensus.
The expiration time should not be a fixed time that is simple to calculate by
any Deep Packet Inspection device or it will become a new Tor TLS setup
fingerprint.
[Deferred and needs revision; see proposal XXX. -NM]
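The client-side TOFU policy described above might look like this as a sketch (Python; the table keyed by address and certificate digest, the constant, and the function name are all illustrative assumptions):

```python
import time

TOFU_WINDOW = 2 * 60 * 60  # two hours, mirroring MAX_SSL_KEY_LIFETIME

first_seen = {}  # (address, cert_digest) -> first-sighting timestamp

def cert_still_trusted(addr, digest, now=None):
    """TOFU-style check: a certificate is trusted for at most two
    hours after the client first sees it, regardless of the notAfter
    date the server put in the certificate itself."""
    now = time.time() if now is None else now
    key = (addr, digest)
    if key not in first_seen:
        first_seen[key] = now
    return now - first_seen[key] <= TOFU_WINDOW
```

Because the window is anchored to the client's first sighting rather than to any field in the certificate, a DPI device cannot compute it from the handshake alone.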
Proposed certificate form
The following output from openssl asn1parse results from the proposed
certificate generation algorithm. It matches the results of generating a
default self-signed certificate:
0:d=0 hl=4 l= 513 cons: SEQUENCE
4:d=1 hl=4 l= 362 cons: SEQUENCE
8:d=2 hl=2 l= 9 prim: INTEGER :DBF6B3B864FF7478
19:d=2 hl=2 l= 13 cons: SEQUENCE
21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
32:d=3 hl=2 l= 0 prim: NULL
34:d=2 hl=2 l= 69 cons: SEQUENCE
36:d=3 hl=2 l= 11 cons: SET
38:d=4 hl=2 l= 9 cons: SEQUENCE
40:d=5 hl=2 l= 3 prim: OBJECT :countryName
45:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU
49:d=3 hl=2 l= 19 cons: SET
51:d=4 hl=2 l= 17 cons: SEQUENCE
53:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName
58:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State
70:d=3 hl=2 l= 33 cons: SET
72:d=4 hl=2 l= 31 cons: SEQUENCE
74:d=5 hl=2 l= 3 prim: OBJECT :organizationName
79:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd
105:d=2 hl=2 l= 30 cons: SEQUENCE
107:d=3 hl=2 l= 13 prim: UTCTIME :110217011237Z
122:d=3 hl=2 l= 13 prim: UTCTIME :120217011237Z
137:d=2 hl=2 l= 69 cons: SEQUENCE
139:d=3 hl=2 l= 11 cons: SET
141:d=4 hl=2 l= 9 cons: SEQUENCE
143:d=5 hl=2 l= 3 prim: OBJECT :countryName
148:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU
152:d=3 hl=2 l= 19 cons: SET
154:d=4 hl=2 l= 17 cons: SEQUENCE
156:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName
161:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State
173:d=3 hl=2 l= 33 cons: SET
175:d=4 hl=2 l= 31 cons: SEQUENCE
177:d=5 hl=2 l= 3 prim: OBJECT :organizationName
182:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd
208:d=2 hl=3 l= 159 cons: SEQUENCE
211:d=3 hl=2 l= 13 cons: SEQUENCE
213:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
224:d=4 hl=2 l= 0 prim: NULL
226:d=3 hl=3 l= 141 prim: BIT STRING
370:d=1 hl=2 l= 13 cons: SEQUENCE
372:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
383:d=2 hl=2 l= 0 prim: NULL
385:d=1 hl=3 l= 129 prim: BIT STRING
[Rejected pending more evidence; this pattern is trivially detectable,
and there is just not enough reason at the moment to think that this
particular certificate pattern is common enough for sites that matter
that the censors wouldn't be willing to block it. -NM]
Custom Certificates
It should be possible for a Tor relay operator to use a specifically supplied
certificate and secret key. This will allow a relay or bridge operator to use a
certificate signed by any member of any geographically relevant certificate
authority racket; it will also allow for any other user-supplied certificate.
This may be desirable in some kinds of filtered networks or when attempting to
avoid attracting suspicion by blending in with the TLS web server certificate
crowd.
[Deferred; see proposal XXX]
Problematic Diffie–Hellman parameters
We currently send a static Diffie–Hellman parameter, prime p (or “prime p
outlaw”) as specified in RFC2409 as part of the TLS Server Hello response.
The use of this prime in TLS negotiations may, as a result, be filtered and
effectively banned by certain networks. We do not have to use this particular
prime in all cases.
While amusing to have the power to make specific prime numbers into a new class
of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p
outlaw is not required.
I propose that the function to initialize and generate DH parameters be
split into two functions.
First, init_dh_param() should be used only for OR-to-OR DH setup and
communication. Second, it is proposed that we create a new function
init_tls_dh_param() that will have a two-stage development process.
The first stage init_tls_dh_param() will use the same prime that
Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be
made immediately. This is a known good and safe prime number (p-1 / 2
is also prime) that is currently not known to be blocked.
The second stage init_tls_dh_param() should randomly generate a new prime on a
regular basis; this is designed to make the prime difficult to outlaw or
filter. Call this a shape-shifting or "Rakshasa" prime. This should be added
to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution
time and probably does not need to be stored on disk. Rakshasa primes only
need to be generated by Tor relays as Tor clients will never send them. Such
a prime should absolutely not be shared between different Tor relays nor
should it ever be static after the 0.2.3.x release.
As a security precaution, care must be taken to ensure that we do not generate
weak primes or known filtered primes. Both weak and filtered primes will
undermine the TLS connection security properties. OpenSSH solves this issue
dynamically in RFC 4419 [5] and may provide a solution that works reasonably
well for Tor. More research in this area including the applicability of
Miller-Rabin or AKS primality tests[6] will need to be analyzed and probably
added to Tor.
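As an illustration of the "Rakshasa" idea, here is a sketch of safe-prime generation with a Miller-Rabin primality check (Python, practical only for small bit sizes; a real relay would use OpenSSL's DH parameter generation, and the function names here are illustrative):

```python
import random

def is_probable_prime(n, rounds=40):
    """Probabilistic Miller-Rabin primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def gen_safe_prime(bits):
    """Generate a safe prime p, i.e. one where (p-1)/2 is also prime,
    as the proposal requires for TLS DH parameters."""
    while True:
        q = random.getrandbits(bits - 1) | (1 << (bits - 2)) | 1
        p = 2 * q + 1
        if is_probable_prime(q) and is_probable_prime(p):
            return p
```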
[Randomized DH groups are implemented in 0.2.3.9-alpha. -NM]
Practical key size
Currently we use a 1024 bit long RSA modulus. I propose that we increase the
RSA key size to 2048 as an additional channel to signal support for the V3
handshake setup. 2048 appears to be the most common key size[0] above 1024.
Additionally, the increase in modulus size provides a reasonable security boost
with regard to key security properties.
The implementer should increase the 1024 bit RSA modulus to 2048 bits.
[Deferred and needs performance analysis. See proposal
XXX. Additionally, DH group strength seems far more crucial. Still, this
is out-of-scope for a "normalization" question. -NM]
Possible future filtering nightmares
At some point it may be cost-effective or politically feasible for a network
filter to simply block all signed or self-signed certificates without a known
valid CA trust chain. This will break many applications on the internet and
hopefully, our option for custom certificates will ensure that this step is
simply avoided by the censors.
The Rakshasa prime approach may cause censors to specifically allow only
certain known and accepted DH parameters.
Appendix: Other issues
What other obvious TLS certificate issues exist? What other static values are
present in the Tor TLS setup process?
[0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html
[1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html
[2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html
[3] To be fair this is hardly a new class of numbers. History is rife with
similar examples of inane authoritarian attempts at mathematical secrecy.
Probably the most dramatic example is the story of Hippasus of
Metapontum, a pupil of the famous Pythagoras, who, legend goes, proved
that the square root of two cannot be expressed as a fraction of whole
numbers (now called an irrational number) and was assassinated for
revealing this secret. Further reading on the subject may be found on
Wikipedia: http://en.wikipedia.org/wiki/Hippasus
[4] httpd-2.2.17/modules/ssl/ssl_engine_dh.c
[5] http://tools.ietf.org/html/rfc4419
[6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html
Filename: 180-pluggable-transport.txt
Title: Pluggable transports for circumvention
Author: Jacob Appelbaum, Nick Mathewson
Created: 15-Oct-2010
Status: Closed
Implemented-In: 0.2.3.x
Overview
This proposal describes a way to decouple protocol-level obfuscation
from the core Tor protocol in order to better resist client-bridge
censorship. Our approach is to specify a means to add pluggable
transport implementations to Tor clients and bridges so that they can
negotiate a superencipherment for the Tor protocol.
Scope
This is a document about transport plugins; it does not cover
discovery improvements, or bridgedb improvements. While these
requirements might be solved by a program that also functions as a
transport plugin, this proposal only covers the requirements and
operation of transport plugins.
Motivation
Frequently, people want to try a novel circumvention method to help
users connect to Tor bridges. Some of these methods are already
pretty easy to deploy: if the user knows an unblocked VPN or open
SOCKS proxy, they can just use that with the Tor client today.
Less easy to deploy are methods that require participation by both the
client and the bridge. In order of increasing sophistication, we
might want to support:
1. A protocol obfuscation tool that transforms the output of a TLS
connection into something that looks like HTTP as it leaves the
client, and back to TLS as it arrives at the bridge.
2. An additional authentication step that a client would need to
perform for a given bridge before being allowed to connect.
3. An information passing system that uses a side-channel in some
existing protocol to convey traffic between a client and a bridge
without the two of them ever communicating directly.
4. A set of clients to tunnel client->bridge traffic over an existing
large p2p network, such that the bridge is known by an identifier
in that network rather than by an IP address.
We could in theory support these almost fine with Tor as it stands
today: every Tor client can take a SOCKS proxy to use for its outgoing
traffic, so a suitable client proxy could handle the client's traffic
and connections on its behalf, while a corresponding program on the
bridge side could handle the bridge's side of the protocol
transformation. Nevertheless, there are some reasons to add support
for transportation plugins to Tor itself:
1. It would be good for bridges to have a standard way to advertise
which transports they support, so that clients can have multiple
local transport proxies, and automatically use the right one for
the right bridge.
2. There are some changes to our architecture that we'll need for a
system like this to work. For testing purposes, if a bridge blocks
off its regular ORPort and instead has an obfuscated ORPort, the
bridge authority has no way to test it. Also, unless the bridge
has some way to tell that the bridge-side proxy at 127.0.0.1 is not
the origin of all the connections it is relaying, it might decide
that there are too many connections from 127.0.0.1, and start
paring them down to avoid a DoS.
3. Censorship and anticensorship techniques often evolve faster than
the typical Tor release cycle. As such, it's a good idea to
provide ways to test out new anticensorship mechanisms on a more
rapid basis.
4. Transport obfuscation is a relatively distinct problem
from the other privacy problems that Tor tries to solve, and it
requires a fairly distinct skill-set from hacking the rest of Tor.
By decoupling transport obfuscation from the Tor core, we hope to
encourage people working on transport obfuscation who would
otherwise not be interested in hacking Tor.
5. Finally, we hope that defining a generic transport obfuscation plugin
mechanism will be useful to other anticensorship projects.
Non-Goals
We're not going to talk about automatic verification of plugin
correctness and safety via sandboxing, proof-carrying code, or
whatever.
We need to do more with discovery and distribution, but that's not
what this proposal is about. We're pretty convinced that the problems
are sufficiently orthogonal that we should be fine so long as we don't
preclude a single program from implementing both transport and
discovery extensions.
This proposal is not about what transport plugins are the best ones
for people to write. We do, however, make some general
recommendations for plugin authors in an appendix.
We've considered issues involved with completely replacing Tor's TLS
with another encryption layer, rather than layering it inside the
obfuscation layer. We describe how to do this in an appendix to the
current proposal, though we are not currently sure whether it's a good
idea to implement.
We deliberately reject any design that would involve linking the
transport plugins into Tor's process space.
Design overview
To write a new transport protocol, an implementer must provide two
pieces: a "Client Proxy" to run at the initiator side, and a "Server
Proxy" to run at the server side. These two pieces may or may not be
implemented by the same program.
Each client may run any number of Client Proxies. Each one acts like
a SOCKS proxy that accepts connections on localhost. Each one
runs on a different port, and implements one or more transport
methods. If the protocol has any parameters, they are passed from Tor
inside the regular username/password parts of the SOCKS protocol.
Bridges (and maybe relays) may run any number of Server Proxies: these
programs provide an interface like stunnel: they get connections from the
network (typically by listening for connections on the network) and relay
them to the Bridge's real ORPort.
To configure one of these programs, it should be sufficient simply to
list it in your torrc. The program tells Tor which transports it
provides. The Tor consensus should carry a new approved version number
specific to pluggable transports; this will allow Tor to know when a
particular transport is known to be unsafe, safe, or non-functional.
Bridges (and maybe relays) report in their descriptors which transport
protocols they support. This information can be copied into bridge
lines. Bridges using a transport protocol may have multiple bridge
lines.
Any methods that are wildly successful, we can bake into Tor.
Specifications: Client behavior
We extend the bridge line format to allow you to say which method
to use to connect to a bridge.
The new format is:
Bridge method address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]
To connect to such a bridge, the Tor program needs to know which
SOCKS proxy will support the transport called "method". It
then connects to this proxy, and asks it to connect to
address:port. If [id-fingerprint] is provided, Tor should expect
the public identity key on the TLS connection to match the digest
provided in [id-fingerprint]. If any [k=v] items are provided,
they are configuration parameters for the proxy: Tor should
separate them with semicolons and put them in the user and
password fields of the request, splitting them across the fields
as necessary. If a key or value must contain a semicolon or
a backslash, it is escaped with a backslash.
Method names must be C identifiers.
For reference, the old bridge format was
Bridge address[:port] [id-fingerprint]
where port defaults to 443 and the id-fingerprint is optional. The
new format can be distinguished from the old one by checking if the
first argument has any non-C-identifier characters. (Looking for a
period should be a simple way.) Also, while the id-fingerprint could
optionally include whitespace in the old format, whitespace in the
id-fingerprint is not permitted in the new format.
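The distinguishing rule can be sketched as follows (Python; `bridge_line_is_new_format` is a hypothetical helper operating on the arguments after the "Bridge" keyword):

```python
import re

# Method names must be C identifiers; addresses always contain a
# period or colon, which are not C-identifier characters.
C_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def bridge_line_is_new_format(args):
    """Return True if a Bridge line (arguments only, "Bridge" keyword
    stripped) uses the new method-prefixed format described above."""
    first = args.split()[0]
    return C_IDENTIFIER.match(first) is not None
```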
Example: if the bridge line is "bridge trebuchet www.example.com:3333
keyid=09F911029D74E35BD84156C5635688C009F909F9 rocks=20 height=5.6m"
AND if the Tor client knows that the 'trebuchet' method is supported,
the client should connect to the proxy that provides the 'trebuchet'
method, ask it to connect to www.example.com, and provide the string
"rocks=20;height=5.6m" as the username, the password, or split
across the username and password.
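A sketch of the escaping-and-splitting step (Python; the helper name is illustrative, and the 255-byte field limit is an assumption taken from the one-byte length fields of the SOCKS5 username/password subnegotiation, not from this proposal):

```python
def encode_socks_args(params):
    """Join bridge [k=v] parameters for the SOCKS username/password
    fields: items separated by semicolons, with backslash-escaping of
    ';' and '\\' in keys and values, split across the two fields
    when the result exceeds one field's capacity."""
    def esc(s):
        return s.replace("\\", "\\\\").replace(";", "\\;")
    joined = ";".join("%s=%s" % (esc(k), esc(v)) for k, v in params)
    return joined[:255], joined[255:510]
```

For the trebuchet example above, the username would be "rocks=20;height=5.6m" with an empty password.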
There are two ways to tell Tor clients about protocol proxies:
external proxies and managed proxies. An external proxy is configured
with
ClientTransportPlugin <method> socks4 <address:port> [auth=X]
or
ClientTransportPlugin <method> socks5 <address:port> [username=X] [password=Y]
as in
"ClientTransportPlugin trebuchet socks5 127.0.0.1:9999".
This example tells Tor that another program is already running to handle
'trebuchet' connections, and Tor doesn't need to worry about it.
A managed proxy is configured with
ClientTransportPlugin <methods> exec <path> [options]
as in
"ClientTransportPlugin trebuchet exec /usr/libexec/trebuchet --managed".
This example tells Tor to launch an external program to provide a
socks proxy for 'trebuchet' connections. The Tor client only
launches one instance of each external program with a given set of
options, even if the same executable and options are listed for
more than one method.
In managed proxies, <methods> can be a comma-separated list of
pluggable transport method names, as in:
"ClientTransportPlugin pawn,bishop,rook exec /bin/ptproxy --managed".
If instead of a transport method, the torrc lists "*" for a managed
proxy, Tor uses that proxy for all transport methods that the plugin
supports. So "ClientTransportPlugin * exec /usr/libexec/tor/foobar"
tells Tor that Tor should use the foobar plugin for every method that
the proxy supports. See the "Managed proxy interface" section below
for details on how Tor learns which methods a plugin supports.
If two plugins support the same method, Tor should use whichever
one is listed first.
The same program can implement a managed or an external proxy: it just
needs to take an argument saying which one to be.
Server behavior
Server proxies are configured similarly to client proxies. When
launching a proxy, the server must tell it what ORPort it has
configured, and what address (if any) it can listen on. The
server must tell the proxy which (if any) methods it should
provide if it can; the proxy needs to tell the server which
methods it is actually providing, and on what ports.
When a client connects to the proxy, the proxy may need a way to
tell the server some identifier for the client address. It does
this in-band.
As before, the server lists proxies in its torrc. These can be
external proxies that run on their own, or managed proxies that Tor
launches.
An external server proxy is configured as
ServerTransportPlugin <method> proxy <address:port> <param=val> ...
as in
"ServerTransportPlugin trebuchet proxy 127.0.0.1:999 rocks=heavy".
The param=val pairs and the address are used to make the bridge
configuration information that we'll tell users.
A managed proxy is configured as
ServerTransportPlugin <methods> exec </path/to/binary> [options]
or
ServerTransportPlugin * exec </path/to/binary> [options]
When possible, Tor should launch only one binary of each binary/option
pair configured. So if the torrc contains
ClientTransportPlugin foo exec /usr/bin/megaproxy --foo
ClientTransportPlugin bar exec /usr/bin/megaproxy --bar
ServerTransportPlugin * exec /usr/bin/megaproxy --foo
then Tor will launch the megaproxy binary twice: once with the option
--foo and once with the option --bar.
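The deduplication rule in this example can be sketched as follows (Python; the torrc parsing here is deliberately naive and the function name is illustrative):

```python
def proxies_to_launch(torrc_lines):
    """Collect the unique (binary, options...) tuples from managed
    *TransportPlugin lines; Tor launches one process per unique
    tuple, even if several torrc lines share it."""
    seen, launches = set(), []
    for line in torrc_lines:
        parts = line.split()
        if len(parts) >= 4 and parts[2] == "exec":
            key = tuple(parts[3:])  # binary path plus its options
            if key not in seen:
                seen.add(key)
                launches.append(key)
    return launches
```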
Managed proxy interface
When the Tor client or relay launches a managed proxy, it communicates
via environment variables. At a minimum, it sets (in addition to the
normal environment variables inherited from Tor):
{Client and server}
"TOR_PT_STATE_LOCATION" -- A filesystem directory path where the
proxy should store state if it wants to. This directory is not
required to exist, but the proxy SHOULD be able to create it if
it doesn't. The proxy MUST NOT store state elsewhere.
Example: TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/
"TOR_PT_MANAGED_TRANSPORT_VER" -- To tell the proxy which
versions of this configuration protocol Tor supports. Future
versions will give a comma-separated list. Clients MUST accept
comma-separated lists containing any version that they
recognize, and MUST work correctly even if some of the versions
they don't recognize are non-numeric. Valid version characters
are non-space, non-comma printing ASCII characters.
Example: TOR_PT_MANAGED_TRANSPORT_VER=1,1a,2,4B
{Client only}
"TOR_PT_CLIENT_TRANSPORTS" -- A comma-separated list of which
methods this client should enable, or * if all methods should
be enabled. The proxy SHOULD ignore methods that it doesn't
recognize.
Example: TOR_PT_CLIENT_TRANSPORTS=trebuchet,battering_ram,ballista
{Server only}
"TOR_PT_EXTENDED_SERVER_PORT" -- An <address>:<port> where tor
should be listening for connections speaking the extended
ORPort protocol (See the "The extended ORPort protocol" section
below). If tor does not support the extended ORPort protocol,
it MUST use the empty string as the value of this environment
variable.
Example: TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:4200
"TOR_PT_ORPORT" -- Our regular ORPort in a form suitable
for local connections, i.e. connections from the proxy to
the ORPort.
Example: TOR_PT_ORPORT=127.0.0.1:9001
"TOR_PT_SERVER_BINDADDR" -- A comma seperated list of
<key>-<value> pairs, where <key> is a transport name and
<value> is the adress:port on which it should listen for client
proxy connections.
The keys holding transport names must appear on the same order
as they appear on TOR_PT_SERVER_TRANSPORTS.
This might be the advertised address, or might be a local
address that Tor will forward ports to. It MUST be an address
that will work with bind().
Example:
TOR_PT_SERVER_BINDADDR=trebuchet-127.0.0.1:1984,ballista-127.0.0.1:4891
"TOR_PT_SERVER_TRANSPORTS" -- A comma-separated list of server
methods that the proxy should support, or * if all methods
should be enabled. The proxy SHOULD ignore methods that it
doesn't recognize.
Example: TOR_PT_SERVER_TRANSPORTS=trebuchet,ballista
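To illustrate the server-side environment described above, here is a
minimal sketch in Python of how a managed proxy might read these
variables. The variable names follow this proposal; the parsing helper
itself, its error message, and the simplification of splitting
TOR_PT_SERVER_BINDADDR entries at the first dash are illustrative
assumptions, not part of the protocol.

```python
import os
import sys

def read_pt_server_env():
    """Read the server-side managed-proxy environment (a sketch)."""
    env = {}
    try:
        env["state_dir"] = os.environ["TOR_PT_STATE_LOCATION"]
        env["versions"] = os.environ["TOR_PT_MANAGED_TRANSPORT_VER"].split(",")
        env["orport"] = os.environ["TOR_PT_ORPORT"]
        # An empty string means tor does not speak the extended ORPort protocol.
        env["ext_orport"] = os.environ["TOR_PT_EXTENDED_SERVER_PORT"] or None
        env["transports"] = os.environ["TOR_PT_SERVER_TRANSPORTS"].split(",")
        # TOR_PT_SERVER_BINDADDR: comma-separated <transport>-<address:port>
        # pairs; splitting at the first dash is a simplification here.
        env["bindaddrs"] = dict(
            pair.split("-", 1)
            for pair in os.environ["TOR_PT_SERVER_BINDADDR"].split(","))
    except KeyError as e:
        # Per the reply protocol below, report a malformed environment and exit.
        sys.stdout.write("ENV-ERROR missing variable %s\n" % e)
        sys.exit(1)
    return env
```

A real proxy would go on to validate each value (e.g. that every
bind address parses and that the transport names match), which this
sketch omits.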
The transport proxy replies by writing NL-terminated lines to
stdout. The line metaformat is
<Line> ::= <Keyword> <OptArgs> <NL>
<Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar>
<KeyWordChar> ::= <any US-ASCII alphanumeric, dash, and underscore>
<OptArgs> ::= <Args>*
<Args> ::= <SP> <ArgChar> | <Args> <ArgChar>
<ArgChar> ::= <any US-ASCII character but NUL or NL>
<SP> ::= <US-ASCII whitespace symbol (32)>
<NL> ::= <US-ASCII newline (line feed) character (10)>
Tor MUST ignore lines with keywords that it doesn't recognize.
First, if there's an error parsing the environment variables, the
proxy should write:
ENV-ERROR <errormessage>
and exit.
If the environment variables were correctly formatted, the proxy
should write:
VERSION <configuration protocol version>
to say that it supports this configuration protocol version (example
"VERSION 1"). It must either pick a version that Tor told it about
in TOR_PT_MANAGED_TRANSPORT_VER, or pick no version at all, say:
VERSION-ERROR no-version
and exit.
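The version negotiation above can be sketched as follows; the function
name and the SUPPORTED list are illustrative, while the output lines
are the ones this section defines.

```python
import os
import sys

SUPPORTED = ["1"]  # versions of this configuration protocol we implement

def negotiate_version():
    """Pick a mutually supported configuration protocol version (sketch)."""
    raw = os.environ.get("TOR_PT_MANAGED_TRANSPORT_VER")
    if raw is None:
        sys.stdout.write("ENV-ERROR TOR_PT_MANAGED_TRANSPORT_VER missing\n")
        sys.exit(1)
    offered = raw.split(",")
    for version in SUPPORTED:
        if version in offered:
            sys.stdout.write("VERSION %s\n" % version)
            return version
    # None of the offered versions is one we recognize.
    sys.stdout.write("VERSION-ERROR no-version\n")
    sys.exit(1)
```

Note that unrecognized entries in the offered list (such as "1a" or
"4B") are simply skipped, as the environment-variable description
requires.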
The proxy should then open its ports. If running as a client
proxy, it should not use fixed ports; instead it should autoselect
ports to avoid conflicts. A client proxy should by default only
listen on localhost for connections.
A server proxy SHOULD try to listen at a consistent port, though it
SHOULD pick a different one if the port it last used is now allocated.
A client or server proxy should then tell Tor which methods it has
made available and how. It does this by printing zero or more
CMETHOD and SMETHOD lines to its stdout. These lines look like:
CMETHOD <methodname> socks4/socks5 <address:port> [ARGS=arglist] \
[OPT-ARGS=arglist]
as in
CMETHOD trebuchet socks5 127.0.0.1:19999 ARGS=rocks,height \
OPT-ARGS=tensile-strength
The ARGS field lists mandatory parameters that must appear in
every bridge line for this method. The OPT-ARGS field lists
optional parameters. If no ARGS or OPT-ARGS field is provided,
Tor should not check the parameters in bridge lines for this
method.
The proxy should print a single "CMETHODS DONE" line after it is
finished telling Tor about the client methods it provides. If it
tries to supply a client method but can't for some reason, it
should say:
CMETHOD-ERROR <methodname> <errormessage>
A proxy should also tell Tor about the server methods it is providing
by printing zero or more SMETHOD lines. These lines look like:
SMETHOD <methodname> <address:port> [options]
If there's an error setting up a configured server method, the
proxy should say:
SMETHOD-ERROR <methodname> <errormessage>
as in
SMETHOD-ERROR trebuchet could not set up 'trebuchet' method
The 'address:port' part of an SMETHOD line is the address to put
in the bridge line. The Options part is a list of space-separated
K:V flags that Tor should know about. Recognized options are:
- FORWARD:1
If this option is set (for example, because address:port is not
a publicly accessible address), then Tor needs to forward some
other address:port to address:port via upnp-helper. Tor would
then advertise that other address:port in the bridge line instead.
- ARGS:K=V,K=V,K=V
If this option is set, the K=V arguments are added to Tor's
extrainfo document.
- DECLARE:K=V,...
If this option is set, the K=V options should be added as
extension entries to the router descriptor, so clients and other
relays can make use of it. See ideas/xxx-triangleboy-transport.txt
for an example situation where the plugin would want to declare
parameters to other Tors.
- USE-EXTENDED-PORT:1
If this option is set, the server plugin is planning to connect
to Tor's extended server port.
SMETHOD and CMETHOD lines may be interspersed, to allow the proxies to
report methods as they become available, even when some methods may
require probing the network, connecting to peers, etc., before they
are set up. After the final SMETHOD line, the proxy says
"SMETHODS DONE".
The proxy SHOULD NOT tell Tor about a server or client method
unless it is actually open and ready to use.
Tor clients SHOULD NOT use any method from a client proxy or
advertise any method from a server proxy UNLESS it is listed as a
possible method for that proxy in torrc, and it is listed by the
proxy as a method it supports.
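As a rough illustration of the client-side reporting described above,
a proxy might emit its CMETHOD lines like this. The function, its
input shape, and the error text are hypothetical; only the line
formats come from this proposal.

```python
def report_client_methods(methods):
    """Emit CMETHOD lines for each opened client method, then the
    terminating "CMETHODS DONE" line (a sketch).

    `methods` maps a method name to (socks_version, "addr:port") for
    listeners that are open, or to None if setup failed."""
    lines = []
    for name, listener in methods.items():
        if listener is None:
            lines.append("CMETHOD-ERROR %s setup failed" % name)
        else:
            socks_version, addr = listener
            lines.append("CMETHOD %s %s %s" % (name, socks_version, addr))
    lines.append("CMETHODS DONE")
    return "\n".join(lines) + "\n"
```

Server-side SMETHOD reporting would follow the same pattern, ending
with "SMETHODS DONE".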
Proxies should respond to a single INT signal by closing their
listener ports and accepting no new connections, while keeping
existing connections open, and terminating once all of those
connections have closed. Proxies should respond to a second INT
signal by shutting down cleanly.
The managed proxy configuration protocol version defined in this
section is "1".
So, for example, if tor supports this configuration protocol it
should set the environment variable:
TOR_PT_MANAGED_TRANSPORT_VER=1
The Extended ORPort protocol
The Extended ORPort protocol is described in proposal 196.
Advertising bridge methods
Bridges put 'transport' lines in their extra-info documents:
transport SP <transportname> SP <address:port> [SP arglist] NL
The address:port are as returned from an SMETHOD line (unless they are
replaced by the FORWARD: directive). The arglist is a K=V,... list as
returned in the ARGS: part of the SMETHOD line's Options component.
If the SMETHOD line includes a DECLARE: part, the router descriptor gets
a new line:
transport-info SP <transportname> [SP arglist] NL
Bridge authority behavior
We need to specify a way to test different transport methods that
bridges claim to support. We should test as many as possible. We
should NOT require that we have a way to test every possible
transport method before we allow its use: the point of this design
is to remove bottlenecks in transport deployment.
Bridgedb behavior
Bridgedb can, given a set of router descriptors and their
corresponding extrainfo documents, generate a set of bridge lines
for each bridge. Bridgedb may want to avoid handing out
methods that seem to get bridges blocked quickly.
Implementation plan
First, we should implement per-bridge proxies via the "external
proxy" method described in "Specifications: Client behavior". Also,
we'll want to build the
extended-server-port mechanism. This will let bridges run
transport proxies such that they can generate bridge lines to
give to clients for testing, so long as the user configures and
launches their proxies on their own.
Once that's done, we can see if we need any managed proxies, or if
the whole idea there is silly.
If we do, the next most important part seems to be getting
the client-side automation part written. And once that's done, we
can evaluate how much of the server side is easy for people to do
and how much is hard.
The "obfsproxy" obfuscating proxy is a likely candidate for an
initial transport (trac entry #2760), as is Steven Murdoch's http
thing (trac entry #2759) or something similar.
Notes on plugins to write
We should ship a couple of null plugin implementations in one or two
popular, portable languages so that people get an idea of how to
write the stuff.
1. We should have one that's just a proof of concept that does
nothing but transfer bytes back and forth.
2. We should implement DNS or HTTP using other software (as Geoff Goodell
did years ago with DNS) as an example of wrapping existing code into
our plugin model.
3. The obfuscated-ssh superencipherment is pretty trivial and pretty
useful. It makes the protocol stringwise unfingerprintable.
4. If we do a raw-traffic proxy, openssh tunnels would be the logical
choice.
Appendix: recommendations for transports
Be free/open-source software. Also, if you think your code might
someday do so well at circumvention that it should be implemented
inside Tor, it should use the same license as Tor.
Tor already uses OpenSSL, Libevent, and zlib. Before you go and decide
to use crypto++ in your transport plugin, ask yourself whether OpenSSL
wouldn't be a nicer choice.
Be portable: most Tor users are on Windows, and most Tor developers
are not, so designing your code for only one of these platforms will
leave it with either a small userbase or poor auditing.
Think secure: if your code is in a C-like language, and it's hard to
read it and become convinced it's safe, then it's probably not safe.
Think small: we want to minimize the bytes that a Windows user needs
to download for a transport client.
Avoid security-through-obscurity if possible. Specify.
Resist trivial fingerprinting: There should be no good string or regex
to search for to distinguish your protocol from protocols permitted by
censors.
Imitate a real profile: There are many ways to implement most
protocols -- and in many cases, most possible variants of a given
protocol won't actually exist in the wild.
Filename: 181-optimistic-data-client.txt
Title: Optimistic Data for Tor: Client Side
Author: Ian Goldberg
Created: 2-Jun-2011
Status: Closed
Implemented-In: 0.2.3.3-alpha
Overview:
This proposal (as well as its already-implemented sibling concerning the
server side) aims to reduce the latency of HTTP requests in particular
by allowing:
1. SOCKS clients to optimistically send data before they are notified
that the SOCKS connection has completed successfully
2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
state
3. Exit nodes to accept and queue DATA cells while in the
EXIT_CONN_STATE_CONNECTING state
This particular proposal deals with #1 and #2.
For more details (in general and for #3), see the sibling proposal 174
(Optimistic Data for Tor: Server Side), which has been implemented in
0.2.3.1-alpha.
Motivation:
This change will save one OP<->Exit round trip (down to one from two).
There are still two SOCKS Client<->OP round trips (negligible time) and
two Exit<->Server round trips. Depending on the ratio of the
Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
decrease the latency by 25 to 50 percent. Experiments validate these
predictions. [Goldberg, PETS 2010 rump session; see
https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
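The 25-to-50-percent figure follows from removing one OP<->Exit round
trip: with Tor-network RTT t and Exit<->Server RTT s, time to first
byte drops from roughly 2t + 2s to t + 2s (treating the local SOCKS
round trips as negligible). A quick sanity check of that arithmetic:

```python
def latency_reduction(t_tor, t_net):
    """Fractional reduction in time-to-first-byte when one OP<->Exit
    round trip (t_tor) is saved; t_net is the Exit<->Server RTT."""
    before = 2 * t_tor + 2 * t_net
    after = 1 * t_tor + 2 * t_net
    return (before - after) / before

# When the Internet RTT is negligible relative to the Tor RTT, the
# saving approaches 50%; when the two RTTs are equal, it is 25%.
```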
Design:
Currently, data arriving on the SOCKS connection to the OP on a stream
in AP_CONN_STATE_CONNECT_WAIT is queued, and transmitted when the state
transitions to AP_CONN_STATE_OPEN. Instead, when data arrives on the
SOCKS connection to the OP on a stream in AP_CONN_STATE_CONNECT_WAIT
(connection_edge_process_inbuf):
- Check to see whether optimistic data is allowed at all (see below).
- Check to see whether the exit node for this stream supports optimistic
data (according to tor-spec.txt section 6.2, this means that the
exit node's version number is at least 0.2.3.1-alpha). If you don't
know the exit node's version number (because it's not in your
hashtable of fingerprints, for example), assume it does *not* support
optimistic data.
- If both are true, transmit the data on the stream.
Also, when a stream transitions *to* AP_CONN_STATE_CONNECT_WAIT
(connection_ap_handshake_send_begin), do the above checks, and
immediately send any already-queued data if they pass.
SOCKS clients (e.g. polipo) will also need to be patched to take
advantage of optimistic data. The simplest solution would seem to be to
just start sending data immediately after sending the SOCKS CONNECT
command, without waiting for the SOCKS server reply. When the SOCKS
client starts reading data back from the SOCKS server, it will first
receive the SOCKS server reply, which may indicate success or failure.
If success, it just continues reading the stream as normal. If failure,
it does whatever it used to do when a SOCKS connection failed.
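To make the patched-client behavior concrete, a sketch of the single
write an optimistic SOCKS5 client would issue: the CONNECT request
immediately followed by the first application bytes, with no wait for
the SOCKS reply in between. The helper name is hypothetical and only
domain-name addressing (per RFC 1928) is shown; the client still reads
the SOCKS reply afterward, exactly as described above.

```python
import struct

def socks5_optimistic_payload(dest_host, dest_port, first_bytes):
    """Build the combined write of an optimistic SOCKS5 client (sketch):
    CONNECT request plus application data, sent before the reply."""
    request = (b"\x05\x01\x00\x03"               # VER=5 CMD=CONNECT RSV ATYP=domain
               + bytes([len(dest_host)]) + dest_host.encode("ascii")
               + struct.pack(">H", dest_port))   # destination port, big-endian
    # The optimistic step: application data rides along with the request.
    return request + first_bytes
```

On reading back, the client first receives the SOCKS reply; on success
it continues with the stream, on failure it discards the connection as
it would today.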
Security implications:
ORs (for sure the Exit, and possibly others, by watching the
pattern of packets), as well as possibly end servers, will be able to
tell that a particular client is using optimistic data. This of course
has the potential to fingerprint clients, dividing the anonymity set.
The usual kind of solution is suggested:
- There is a boolean consensus parameter UseOptimisticData.
- There is a 3-state (-1, 0, 1) configuration parameter
UseOptimisticData (or give it a distinct name if you like)
defaulting to -1.
- If the configuration parameter is -1, the OP obeys the consensus
value; otherwise, it obeys the configuration parameter.
It may be wise to set the consensus parameter to 1 at the same time as
similar other client protocol changes are made (for example, a new
circuit construction protocol) in order to not further subdivide the
anonymity set.
Specification:
The current tor-spec has already been updated by proposal 174 to handle
optimistic data. It says, in part:
If the exit node does not support optimistic data (i.e. its version
number is before 0.2.3.1-alpha), then the OP MUST wait for a
RELAY_CONNECTED cell before sending any data. If the exit node
supports optimistic data (i.e. its version number is 0.2.3.1-alpha
or later), then the OP MAY send RELAY_DATA cells immediately after
sending the RELAY_BEGIN cell (and before receiving either a
RELAY_CONNECTED or RELAY_END cell).
Should the "MAY" be more specific, referring to the consensus
parameters? Or does the existence of the configuration parameter
override mean it's really "MAY", regardless?
Compatibility:
There are compatibility issues, as mentioned above. OPs MUST NOT send
optimistic data to Exit nodes whose version numbers predate
0.2.3.1-alpha. OPs MAY send optimistic data to Exit nodes whose version
numbers match or follow that value.
Implementation:
My git diff is 42 lines long (+17 lines, -1 line), changing only the two
functions mentioned above (connection_edge_process_inbuf and
connection_ap_handshake_send_begin). This diff does not, however,
handle the configuration options, or check the version number of the
exit node.
I have patched a command-line SOCKS client (webfetch) to use optimistic
data. I have not attempted to patch polipo, but I have looked at it a
bit, and it seems pretty straightforward. (Of course, if and when
polipo is deprecated, whatever else speaks SOCKS to the OP should take
advantage of optimistic data.)
Performance and scalability notes:
OPs may queue a little more data, if the SOCKS client pushes it faster
than the OP can write it out. But that's also true today after the
SOCKS CONNECT returns success, right?
Filename: 182-creditbucket.txt
Title: Credit Bucket
Author: Florian Tschorsch and Björn Scheuermann
Created: 22 Jun 2011
Status: Obsolete
Note: Obsolete because we no longer have a once-per-second bucket refill.
Overview:
The following proposal targets the reduction of queuing times in onion
routers. In particular, we focus on the token bucket algorithm in Tor and
point out that current usage unnecessarily locks cells for long time spans.
We propose a non-intrusive change in Tor's design which overcomes the
deficiencies.
Motivation and Background:
Cell statistics from the Tor network [1] reveal that cells reside in
individual onion routers' cell queues for up to several seconds. These
queuing times increase the end-to-end delay very significantly and are
apparently the largest contributor to overall cell latency in Tor.
In Tor there exist multiple token buckets on different logical levels. They
all work independently. They are used to limit the up- and downstream of an
onion router. All token buckets are refilled every second with a constant
amount of tokens that depends on the configured bandwidth limits. For
example, the so-called RelayedTokenBucket limits relay traffic only. All
read data of incoming connections are bound to a dedicated read token
bucket. An analogous mechanism exists for written data leaving the onion
router. We were able to identify the specific usage and implementation of
the token bucket algorithm as one cause for very high (and unnecessary)
queuing times in an onion router.
We observe that the token buckets in Tor are (surprisingly at a first
glance) allowed to take on negative fill levels. This is justified by the
TLS connections between onion routers where whole TLS records need to be
processed. The token bucket on the incoming side (i.e., the one which
determines at which rate it is allowed to read from incoming TCP
connections) in particular often runs into non-negligible negative fill
levels. As a consequence of this behavior, sometimes slightly more data is
read than it would be admissible upon strict interpretation of the token
bucket concept.
However, the token bucket for limiting the outgoing rate does not take on
negative fill levels equally often. Consequently, it regularly happens
that somewhat more data are read on the incoming side than the outgoing
token bucket allows to be written during the same cycle, even if their
configured data rates are the same. The respective cells will thus not
be allowed to leave the onion router immediately; they will necessarily
be queued for at least as long as it takes until the token bucket on the
outgoing side is refilled again. The refill interval currently is, as
mentioned before, one second -- so, these cells are delayed for a very
substantial time. In summary, one could say that the two buckets, on the
incoming and outgoing side, work like a double door system and frequently
lock cells for a full token bucket refill interval length.
General Design:
In order to overcome the described problem, we propose the following
changes related to the token bucket algorithm.
We observe that the token bucket on the outgoing side, in its current
design, is counterproductive with respect to queuing times. We
therefore propose modifications to the token bucket algorithm that will
eliminate the "double door effect" discussed above.
Let us start from Tor's current approach: Thus, we have a regular token
bucket on the reading side with a certain rate and a certain burst size.
Let x denote the current amount of tokens in the bucket. On the outgoing
side we need something appropriate that monitors and constrains the
outgoing rate, but at the same time avoids holding back cells (cf. double
door effects) whenever possible.
Here we propose something that adopts the role of a token bucket, but
realizes this functionality in a slightly different way. We call it a
"credit bucket". Like a token bucket, the credit bucket also has a current
fill level, denoted by y. However, the credit bucket is refilled in a
different way.
To understand how it works, let us look at the possible operations:
As said, x is the fill level of a regular token bucket on the incoming
side and thus gets incremented periodically according to the configured
rate. No changes here.
If x<=0, we are obviously not allowed to read. If x>0, we are allowed to
read up to x bytes of incoming data. If k bytes are read (k<=x), then we
update x and y as follows:
x = x - k (1)
y = y + k (2)
(1) is the standard token bucket operation on the incoming side. Whenever
data is admitted in, though, an additional operation is performed: (2)
allocates the same number of bytes on the outgoing side, which will later
on allow the same number of bytes to leave the onion router without any
delays.
If y + x > -M, we are allowed to write up to y + x + M bytes on the
outgoing side, where M is a positive constant. M specifies a burst size for
the outgoing side. M should be higher than the number of tokens that get
refilled during a refill interval, we would suggest to have M in the order
of a few seconds "worth" of data. Now if k bytes are written on the
outgoing side, we proceed as follows:
If k <= y then y = y - k
In this case we use "saved" credits, previously allocated on the incoming
side when incoming data has been processed.
If k > y then x = x - (k-y) and then y = 0
We generated additional traffic in the onion router, so that more data is
to be sent than has been read (the credit is not sufficient). We therefore
"steal" tokens from the token buffer on the incoming side to compensate for
the additionally generated data. This will result in correspondingly less
data being read on the incoming side subsequently. As a result of such an
operation, the token bucket fill level x on the incoming side may become
negative (but it can never fall below -M).
If y + x <= -M then outgoing data will be held back. This may lead to
double-door effects, but only in extreme cases where the outgoing traffic
largely exceeds the incoming traffic, so that the outgoing bursts size M is
exceeded.
Aside from short-term bursts of configurable size (as with every token
bucket), this procedure guarantees that the configured rate may never be
exceeded (on the application layer, that is; as with the current
implementation, an attacker may easily cause the onion router to
arbitrarily exceed the limits on the lower layers). Over time, we never
send more data than the configured rate: every sent byte needs a
corresponding token on the incoming side; this token must either have been
consumed by an incoming byte before (it then became a "credit"), or it is
"stolen" from the incoming bucket to compensate for data generated within
the onion router.
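The accounting rules above can be collected into a small sketch; the
class and method names are ours, but x, y, M, and the update rules
follow the text directly.

```python
class CreditBucket:
    """Sketch of the proposed accounting: x is the incoming token-bucket
    level, y the outgoing credit, M the outgoing burst allowance (all in
    bytes)."""

    def __init__(self, rate, burst, m):
        self.rate = rate      # tokens added to x per refill interval
        self.burst = burst    # cap on x
        self.m = m            # outgoing burst size M
        self.x = burst
        self.y = 0

    def refill(self):
        self.x = min(self.x + self.rate, self.burst)

    def can_read(self):
        return max(self.x, 0)              # may read up to x bytes if x > 0

    def on_read(self, k):
        self.x -= k                        # (1) standard token-bucket decrement
        self.y += k                        # (2) allocate matching outgoing credit

    def can_write(self):
        return max(self.y + self.x + self.m, 0)

    def on_write(self, k):
        if k <= self.y:
            self.y -= k                    # spend saved credit
        else:
            self.x -= (k - self.y)         # "steal" tokens for locally generated
            self.y = 0                     # data; x may go negative (>= -M)
```

Because writes are capped at y + x + M, the stolen amount k - y never
exceeds x + M, which is why x can never fall below -M.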
Specific Design Changes:
In the following we briefly point out the specific changes that need to be
made in Tor's source code. This also shows how non-intrusive our
modifications are.
First we need to address the bucket increment and decrement operations.
According to the described logic above, this should be done in the methods
connection_bucket_refill and connection_buckets_decrement respectively. In
particular allocating, saving and "stealing" of tokens need to be
considered here.
Second, the rate limiting, i.e. the amount we are allowed to write
(connection_bucket_write_limit), needs to be adapted to the credit
bucket logic: in order to avoid the unnecessary queuing of cells
identified above, we need to consider the new burst parameter M. Here we
also need to take non-rate-limited connections, such as those from the
localhost, into account. The rate limiting on the reading side remains
unchanged.
Finally, we need to find good values/ratios for the parameter M such that
the trade-off between avoiding "double door effects" and maintaining
strict rate limits works as expected. As future work, and after gaining
insights into the performance benefit of this proposal, we need to find a
way to implement it both with the bufferevent rate limiting in libevent
2.3.x and with Tor's own rate limiting code.
Conclusion:
This proposal can be implemented with moderate effort and requires changes
only at the points where currently the token bucket operations are
performed.
We feel that this is not the be-all and end-all solution, because it again
introduces a feedback loop between the incoming and the outgoing side. We
therefore still hope that we will be able to come to a both simpler and
more effective design in the future. However, we believe that what we
proposed here is a good compromise between avoiding double-door effects to
the furthest possible extent, strictly enforcing an application-layer data
rate, and keeping the extent of changes to the code small.
Feedback is highly appreciated.
References:
[1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009.
[2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011
Filename: 183-refillintervals.txt
Title: Refill Intervals
Author: Florian Tschorsch and Björn Scheuermann
Created: 03-Dec-2010
Status: Closed
Implemented-In: 0.2.3.5-alpha
Overview:
In order to avoid additional queuing and bursty traffic, the refill
interval of the token bucket algorithm should be shortened. Thus we
propose a configurable parameter that sets the refill interval
accordingly.
Motivation and Background:
In Tor there exist multiple token buckets on different logical levels. They
all work independently. They are used to limit the up- and downstream of an
onion router. All token buckets are refilled every second with a constant
amount of tokens that depends on the configured bandwidth limits. The very
coarse-grained refill interval of one second has detrimental effects.
First, consider an onion router with multiple TLS connections over which
cells arrive. If there is high activity (i.e., many incoming cells in
total), then the coarse refill interval will cause unfairness. Consider
some circuit C, and assume (just for simplicity) that C doesn't share its
TLS connection with any other circuit. Moreover, assume that C hasn't
transmitted any data for some time (e.g., due to a typical bursty HTTP
traffic pattern). Consequently, there are no cells from this circuit in
the incoming socket buffers. When the buckets
are refilled, the incoming token bucket will immediately spend all its
tokens on other incoming connections. Now assume that cells from C arrive
soon after. For fairness' sake, these cells should be serviced timely --
circuit C hasn't received any bandwidth for a significant time before.
However, it will take a very long time (one refill interval) before the
current implementation will fetch these cells from the incoming TLS
connection, because the token bucket will remain empty for a long time. Just
because the cells happened to arrive at the "wrong" point in time, they must
wait. Such situations may occur even though the configured admissible
incoming data rate is not exceeded by incoming cells: the long refill
intervals often lead to an operational state where all the cells that were
admissible during a given one-second period are queued until the end of this
second, before the onion router even just starts processing them. This
results in unnecessary, long queuing delays in the incoming socket buffers.
These delays are not visible in the Tor circuit queue delay statistics [1].
Finally, the coarse-grained refill intervals result in a very bursty outgoing
traffic pattern at the onion routers (one large chunk of data once per
second, instead of smooth transmission progress). This is undesirable, since
such a traffic pattern can interfere with TCP's control mechanisms and can
be the source of suboptimal TCP performance on the TLS links between onion
routers.
Specific Changes:
The token buckets should be refilled more often, with a correspondingly
smaller amount of tokens. For instance, the buckets might be refilled every
10 milliseconds with one-hundredth of the amount of data admissible per
second. This will help to overcome the problem of unfairness when reading
from the incoming socket buffers. At the same time it smoothes the traffic
leaving the onion routers. We are aware that this latter change has
apparently been discussed before [2]; we are not sure why this change has
not been implemented yet.
In particular we need to change the current implementation in Tor which
triggers refilling always after exactly one second. Instead the refill event
should fire more frequently. The smaller time intervals between each refill
action need to be taken into account for the number of tokens that are added
to the bucket.
With libevent 2.x and bufferevents enabled, smaller refill intervals are
already used, but they are hard-coded. This should be changed to a
configurable parameter, too.
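The per-event token amount is simple arithmetic: with a 10 ms interval,
each refill adds one-hundredth of the per-second rate. A sketch, with
the function name and the late-timer handling as our own illustrative
additions (Tor's real implementation also deals with clock jumps and
rounding):

```python
def refill_amount(rate_per_second, interval_msec, elapsed_msec):
    """Tokens (bytes) to add at one refill event when the bucket is
    topped up every interval_msec rather than once per second;
    elapsed_msec covers the case where the timer fired late."""
    intervals = elapsed_msec // interval_msec
    return rate_per_second * interval_msec * intervals // 1000
```

For example, at 100000 bytes/s and a 10 ms interval, each on-time
refill adds 1000 bytes; a timer that fires 30 ms late adds three
intervals' worth at once.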
Conclusion:
This proposal can be implemented with moderate effort and requires changes
only at the points where the token bucket operations are currently
performed.
This change will also be a good starting point for further enhancements
to improve queuing times in Tor. I.e. it will pave the ground for other means
that tackle this problem.
Feedback is highly appreciated.
References:
[1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009.
[2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011
Filename: 184-v3-link-protocol.txt
Title: Miscellaneous changes for a v3 Tor link protocol
Author: Nick Mathewson
Created: 19-Sep-2011
Status: Closed
Target: 0.2.3.x
Overview:
When proposals 176 and 179 are implemented, Tor will have a new
link protocol. I propose two simple improvements for the v3 link
protocol: a more partitioned set of which types indicate
variable-length cells, and a better way to handle link padding if
and when we come up with a decent scheme for it.
Motivation:
We're getting a new link protocol in 0.2.3.x, thanks (again) to
TLS fingerprinting concerns. When we do, it'd be nice to take
care of some small issues that require a link protocol version
increment.
First, our system for introducing new variable-length cell types
has required a protocol increment for each one. Unlike
fixed-length (512 byte) cells, we can't add new variable-length
cells in the existing link protocols and just let older clients
ignore them, because unless the recipient knows which cells are
variable-length, it will treat them as 512-byte cells and discard
too much of the stream or too little. In the past, it's been
useful to be able to introduce new cell types without having to
increment the link protocol version.
Second, once we have our new TLS handshake in place, we will want
a good way to address the remaining fingerprinting opportunities.
Some of those will likely involve traffic volume. We can't fix
that easily with our existing PADDING cell type, since PADDING
cells are fixed-length, and wouldn't be so easy to use to break up
our TLS record sizes.
Design: Indicating variable-length cells.
Beginning with the v3 link protocol, we specify that all cell
types in the range 128..255 indicate variable-length cells.
Cell types in the range 0..127 are still used for 512-byte
cells, except that the VERSIONS cell type (7) also indicates a
variable-length cell (for backward compatibility).
As before, all Tor instances must ignore cells with types that
they don't recognize.
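The resulting type check is a one-liner; this sketch restates the v3
rule above (the helper name is ours):

```python
VERSIONS = 7  # variable-length despite being below 128, for compatibility

def v3_is_variable_length(cell_type):
    """Per the v3 rule: types 128..255 are variable-length, 0..127 are
    512-byte cells, with VERSIONS (7) as the one exception."""
    return cell_type == VERSIONS or 128 <= cell_type <= 255
```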
Design: Variable-length padding.
We add a new variable-length cell type, "VPADDING", to be used for
padding. All Tor instances may send a VPADDING cell at any point that
a VERSIONS cell is not required; a VPADDING cell's body may be any
length; the body of a VPADDING cell MAY have any content. Upon
receiving a VPADDING cell, the recipient should drop it, as with a
PADDING cell.
(This does not give a way to send fewer than 5 bytes of padding.
We could add this in the future, in a new link protocol.)
Implementations SHOULD fill the content of all padding cells
randomly.
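A sketch of building such a cell for the v3 link protocol: a 2-byte
circuit ID, a 1-byte type, a 2-byte body length, then a randomly
filled body. That 5-byte header is why no padding shorter than 5
bytes can be sent. The concrete type value 128 used here is an
assumption (the first type in the new variable-length range), not
fixed by this proposal.

```python
import os
import struct

VPADDING = 128  # assumed type number for illustration only

def build_vpadding(circ_id, body_len):
    """Build a v3-link-protocol VPADDING cell (sketch): 5-byte header
    plus a randomly filled variable-length body."""
    header = struct.pack(">HBH", circ_id, VPADDING, body_len)
    return header + os.urandom(body_len)
```

Link padding would presumably be sent with circuit ID 0, like PADDING
cells.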
A note on padding:
We do not specify any situation in which a node ought to generate
a VPADDING cell; that's left for future work. Implementors should
be aware that many schemes have been proposed for link padding
that do not in fact work as well as one would expect. We
recommend that no mainstream implementation should produce padding
in an attempt to resist traffic analysis, without real research
showing that it helps.
Interaction with proposal 176:
Proposal 176 says that during the v3 handshake, no cells other
than VERSIONS, AUTHENTICATE, AUTH_CHALLENGE, CERT, and NETINFO are
allowed, and those are only allowed in their standard order. If
this proposal is accepted, then VPADDING cells should also be
allowed in the handshake at any point after the VERSIONS cell.
They should be included when computing the "SLOG" and "CLOG"
handshake-digest fields of the AUTHENTICATE cell.
Notes on future-proofing:
It may be in the future we need a new cell format that is neither the
original 512-byte format nor the variable-length format. If we
do, we can just increment the link protocol version number again.
Right now we have 10 cell types; with this proposal and proposal
176, we will have 14. It's unlikely that we'll run out any time
soon, but if we start to approach the number 64 with fixed-length
cell types or 196 with var-length cell types, we should consider
tweaking the link protocol to have a variable-length cell type
encoding.
Filename: 185-dir-without-dirport.txt
Title: Directory caches without DirPort
Author: Nick Mathewson
Created: 20-Sep-2011
Status: Superseded
Superseded-by: 237
Overview:
Exposing a directory port is no longer necessary for running as a
directory cache. This proposal suggests that we eliminate that
requirement, and describes how.
Motivation:
Now that we tunnel directory connections by default, it is no
longer necessary to have a DirPort to be a directory cache. In
fact, bridges act as directory caches but do not actually have a
DirPort exposed. It would be nice and tidy to expand that
property to the rest of the network.
Configuration:
Add a new torrc option, "DirCache". Its values can be "0", "1",
and "auto". If it is 0, we never act as a directory cache, even
if DirPort is set. If it is 1, then we act as a directory cache
according to same rules as those used for nodes that set a
DirPort. If it is "auto", then Tor decides whether to act as a
directory cache based on some future intelligent algorithm. "Auto"
should be the new default.
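The decision table above can be sketched as follows; `auto_decision` is a hypothetical stand-in for the unspecified "future intelligent algorithm":

```python
def should_act_as_dir_cache(dir_cache: str, dir_port_set: bool,
                            auto_decision: bool) -> bool:
    """Sketch of the proposed DirCache option's semantics."""
    if dir_cache == "0":
        return False       # never a cache, even if DirPort is set
    if dir_cache == "1":
        return True        # approximation: same rules as DirPort-setting nodes
    return auto_decision   # "auto" (the proposed default) defers to Tor
```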
Advertising cache status:
Nodes that are running as a directory cache should set the entry
"dir-cache 1" in their router descriptors. If they do not have a
DirPort set, or do not have a working DirPort, they should give
their directory port as 0 in their router lines. (Nodes that have
a working directory port advertise it as usual, and also include a
"dir-cache" line. Nodes that do not serve directory information
should set their directory port to 0, and not include any
dir-cache line. Implementations should accept and ignore
dir-cache lines with values other than "dir-cache 1".)
Consensus:
Authorities should assign a "DirCache" flag to all nodes running
as a directory cache.
This does not require a new version of the consensus algorithm.
Filename: 186-multiple-orports.txt
Title: Multiple addresses for one OR or bridge
Author: Nick Mathewson
Created: 19-Sep-2011
Supersedes: 118
Status: Closed
Target: 0.2.4.x+
Status:
This proposal is partially implemented to the extent needed to allow nodes
to have one IPv4 and one IPv6 address.
Overview:
This document is a proposal for servers to advertise multiple
address/port combinations for their ORPort.
It supersedes proposal 118.
Motivation:
Sometimes servers want to support multiple ports for incoming
connections, either in order to support multiple address families
(ie, to add IPv6 support), to better use multiple interfaces, or
to support a variety of FascistFirewallPorts settings. This is
easy to set up now, but there's no way to advertise it to clients.
Configuring additional addresses and ports:
In consonance with our changes to the (Socks|Trans|NATD|DNS)Port
options made in 0.2.3.x for proposal 171, I make a corresponding
change to allow multiple ORPort options and deprecate
ORListenAddress.
The new syntax will be:
"ORPort" PortDescription Option*
Option = "NoAdvertise" | "NoListen" | "AllAddrs" | "IPv4Only"
| "IPv6Only"
PortDescription = PORTLIST |
ADDRESS ":" PORTLIST |
Hostname ":" PORTLIST
(PORTLIST and ADDRESS are defined below.)
The 'NoAdvertise' option performs the function of the old
ORListenAddress option. If it is set, we bind a port, but
don't put it in our descriptor.
The 'NoListen' option tells Tor to advertise an address, but not
bind to it. The operator needs to use some other mechanism to
ensure that ports are redirected to ports that _are_ listened on.
The 'AllAddrs' option tells Tor that if no address is given in the
PortDescription part, we should bind/advertise every one of our
publicly visible unicast addresses; and that if a hostname address
is given in the PortDescription, we should bind/advertise every
publicly visible unicast address that the hostname resolves to.
(Q: Should this be on by default?) The 'IPv4Only' and 'IPv6Only'
options tell Tor to interpret such situations as applying only to
IPv4 addresses or to IPv6 addresses.
As with the client *Port options, only the old format or the new
format are allowed: either a single numeric ORPort and zero or
more ORListenAddress options, or a set of one or more
ORPorts in the new extended format.
In current operating systems (unless we get into crazy nonportable
tricks) we need to use one socket for every address:port that Tor
binds on. As a sanity check, we can limit the number of such sockets
we use to, say, something between 8 and 64. If you want to bind lots
of address:port combinations, you'll want to do it at the
firewall/routing level.
Example: We want to bind on 0.0.0.0:9001
ORPort 9001
Example: Our firewall is redirecting ports 80, 443, and 7000
on all hosts in 18.244.2.0 onto our port 2929.
ORPort 2929 noadvertise
ORPort 18.244.2.0:80,443,7000 nolisten
Example: We have a dynamic DNS provider that maps
tornode.example.com to our current external IPv4 and IPv6
addresses. Our firewall forwards port 443 on those addresses to our
port 1337.
ORPort 1337 noadvertise alladdrs
ORPort tornode.example.com:443 nolisten alladdrs
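A rough illustration of splitting an ORPort value into its PortDescription and options (a simplified sketch, not the actual torrc parser; bracketed IPv6 addresses and full option validation are glossed over):

```python
def parse_orport(line: str):
    """Split an ORPort value into (address-or-None, ports, options)."""
    tokens = line.split()
    desc, options = tokens[0], [t.lower() for t in tokens[1:]]
    if ":" in desc:
        # ADDRESS ":" PORTLIST or Hostname ":" PORTLIST
        address, portlist = desc.rsplit(":", 1)
    else:
        # bare PORTLIST, e.g. "ORPort 9001"
        address, portlist = None, desc
    ports = [int(p) for p in portlist.split(",")]
    return address, ports, options
```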
Self-testing:
Right now, Tor nodes need to check every port that they advertise
before they declare themselves reachable. If a Tor node has
a lot of advertised ports, that could be prohibitive.
Instead, it should try a sample of ports for each address. It should
not advertise any given ORPort line until it has tried
extending to or connecting to a sample of the address/port
combinations.
It will now be possible for a Tor node to find that some addresses
work and others do not. In this case, the node should only advertise
ORPort lines that have been checked. (As a consequence, the node
should not advertise any address unless at least one ORPort without
nolisten has been specified.)
{Until support is added for extend cells to IPv6 addresses, it
will only be possible to test IPv6 addresses by connecting
directly. We might want to just skip self-testing those until we
have IPv6 extend support.}
New descriptor syntax:
We add a new line in the router descriptor, "or-address". This line
can occur zero, one, or multiple times. Its format is:
or-address SP ADDRESS ":" PORTLIST NL
ADDRESS = IPV6ADDR | IPV4ADDR
IPV6ADDR = an ipv6 address, surrounded by square brackets.
IPV4ADDR = an ipv4 address, represented as a dotted quad.
PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
PORTSPEC = PORT
PORT = a number between 1 and 65535 inclusive.
[This is the regular format for specifying sets of addresses and
ports in Tor.]
A descriptor should not include an or-address line that does
nothing but duplicate the address:port pair from its "router"
line.
A node must not list more than 8 or-address lines.
A PORTLIST must have no more than 16 PORTSPEC entries, and its entries must
be disjoint.
(Q: Any reason to allow more than 2? Multiple interfaces, I guess.)
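A sketch of parsing and validating one or-address value under the limits above (illustrative only):

```python
def parse_or_address(item: str):
    """Parse the ADDRESS ":" PORTLIST part of an or-address line.

    IPv6 addresses arrive in square brackets, so splitting on the
    last colon separates the address from the port list.  Enforces
    the limits above: 1..16 ports, all distinct, each in 1..65535.
    """
    addr, _, portlist = item.rpartition(":")
    ports = [int(p) for p in portlist.split(",")]
    if not 1 <= len(ports) <= 16:
        raise ValueError("PORTLIST must have 1..16 entries")
    if len(set(ports)) != len(ports):
        raise ValueError("PORTSPEC entries must be disjoint")
    if not all(1 <= p <= 65535 for p in ports):
        raise ValueError("ports must be in 1..65535")
    return addr, ports
```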
New authority behavior:
The same rationale applies as for self-testing. An authority
needs to test the main address:port from the router line, and
every or-address line. For or-address lines that contain
multiple ports, it needs to test all of them if they are few, or a
sample if they are not.
An authority shouldn't list a node as Running unless every
or-address line it advertises looks like it will work.
Consensus directories and microdescriptors:
We introduce a new line type for microdescriptors and consensuses,
"a". Each "a" line has the same format as an or-address line.
The "a" lines (if any) appear immediately after the "r" line for a
router in the consensus, and immediately after the "onion-key"
entry in a microdescriptor.
Clients that use microdescriptors should consider a node's
addresses to be the address:port listed in the "r" line of a
consensus, plus all "a" lines for that node in the consensus, plus
all "a" lines for that node in its microdescriptor. Clients
that use full descriptors should consider a node's addresses to be
everything listed in its descriptor.
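The microdescriptor-client rule above amounts to a simple union of address sources; a sketch:

```python
def microdesc_client_addresses(r_line_addr, consensus_a_lines,
                               microdesc_a_lines):
    """A node's address set for a microdescriptor-using client:
    the r-line address:port, plus every "a" line from the consensus,
    plus every "a" line from the microdescriptor (duplicates collapse)."""
    addrs = {r_line_addr}
    addrs.update(consensus_a_lines)
    addrs.update(microdesc_a_lines)
    return addrs
```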
We will have to define a new voting algorithm version; when using
this version or later, votes should include a single "a" line for
every relay that has an IPv6 address, containing the first IPv6
or-address line from its descriptor. (If a relay has no IPv6
or-address lines, votes should not include any "a" lines for it.)
The remaining or-address lines will turn into "a" lines in the
microdescriptor.
As with other data in the vote derived from the descriptor, the
consensus will include whichever set of "a" lines are given by the
most authorities who voted for the descriptor digest that will be
used for the router.
Directory authorities with more addresses:
We need a way for a client to configure a TrustedDirServer as
having multiple OR addresses, specifically so that we can give at
least one default authority an IPv6 address for bootstrapping
purposes.
(Q: Do any of the current authorities have stable IPv6 addresses?)
We will want to allow the address in a "dir-source" line in a vote
to contain an IPv6 address, and/or allow voters to list themselves
with more addresses in votes/consensuses. But right now, nothing
actually uses the addresses listed for voters in dir-source lines
for anything besides log messages.
Client behavior:
I propose that initially we shouldn't change client behavior too
much here.
(Q: Is there any advantage to having a client choose a random
address? If so we can do it later. If not, why list any more
than one IPv4 and one IPv6 address?)
Tor clients not running with bridges, and running with IPv4
support, should still use the address and ORPort as advertised in
the "router" or "r" line of the appropriate directory object.
Tor clients not running with bridges, and running without IPv4
support, should use the first listed IPv6 address for a node,
using the lowest-numbered listed port for that address. They
should only connect to nodes with an IPv6 address.
Clients should accept Bridge lines with IPv6 addresses, and
address:port sets, in addition to the lines they currently accept.
Clients, for now, should only use the address:port from the router
line when making EXTEND cells; see below.
Nodes without IPv4 addresses:
Currently Tor requires every node or bridge to have an IPv4
address. We will want to maintain this property for the
foreseeable future, but we should define how a node without an IPv4
address would advertise itself.
Right now, there's no way to do that: if anything but an IPv4
address appears in a router line of a routerdesc, or the "r" line of
a consensus, then it won't parse. If something that looks like an
IPv4 address appears there, clients will (I believe) try to
connect to it.
We can make this work, though: let's allow nodes to list themselves
with a magic IPv4 address (say, 127.1.1.1) if they have
or-address entries containing only IPv6 addresses. We could give
these nodes a new flag other than Running to indicate that they're
up, and not give them the Running flag. That way, old clients
would never try to use them, but new clients could know to treat
the new flag as indicating that the node is running, and know not
to connect to a node listed with address 127.1.1.1.
Interaction with EXTEND and NETINFO:
Currently, EXTEND cells only support IPv4 addresses, so we should
use only those. There is a proposal draft to support more address
types.
A server's NETINFO cells must list all configured addresses for a
server.
Why not extend DirPort this way too?
Because clients are all using BEGINDIR these days.
That is, clients tunnel their directory requests inside OR
connections, and don't generally connect to DirPorts at all.
Why not have address and port ranges?
Earlier drafts of this proposal suggested that servers should provide
ranges of addresses, specified with bitmasks. That's a neat idea for
circumvention, but if we did that, you wouldn't want to advertise
publicly that you have an entire address range.
Port ranges are out because I don't think they would actually get used
much, and they add a fair bit of complexity.
Coding impact:
In addition to the obvious changes, we need to audit everything
that looks up or compares OR connections and nodes by address:port
under the assumptions that each node has only a single address or
ORPort.
TODO:
* Make it so that authorities can vote on which addresses are working
somehow.
* Specify some way to say "I only want to connect to v4/v6 addresses".
* Come up with a better alternative to running6 for the longterm?
Filename: 187-allow-client-auth.txt
Title: Reserve a cell type to allow client authorization
Author: Nick Mathewson
Created: 16-Oct-2011
Status: Closed
Target: 0.2.3.x
Overview:
Proposals 176 and 184 introduce a new "v3" handshake, coupled with
a new version 3 link protocol. This is a good time to introduce
other stuff we might need.
One thing we might want is a scanning resistance feature for
bridges. This proposal suggests a change we should make right
away to enable us to deploy such a feature in future versions of
Tor.
Motivation:
If an adversary has a suspected bridge address/port combination,
the easiest way for them to confirm or disconfirm their suspicion
is to connect to the address and see whether they can do a Tor
handshake. The easiest way to fix this problem seems to be to
give out bridge addresses along with some secret that clients
should know, but which an adversary shouldn't be able to learn
easily. The client should prove to the bridge that it's
authorized to know about the bridge, before the bridge acts like a
bridge. If the client doesn't show knowledge of the proper
secret, the bridge should act like an HTTPS server or a bittorrent
tracker or something.
This proposal *does not* specify a way for clients to authorize
themselves at bridges; rather, it specifies changes that we should
make now in order to allow this kind of authorization in the
future.
Design:
Currently, now that proposal 176 is implemented, if a server
provides a certificate that indicates a v3 handshake, and the
client understands how to do a V3 handshake, we specify that the
client's first cell must be a VERSIONS cell.
Instead, we make the following specification changes:
We reserve a new variable-length cell type, "AUTHORIZE".
We specify that any number of PADDING or VPADDING or AUTHORIZE
cells may be sent by the client before it sends a VERSIONS cell.
Servers that do not require client authorization MUST ignore such
cells, except to include them when calculating the HMAC that will
appear in the CLOG part of a client's AUTHENTICATE cell.
We still specify that clients SHOULD send VERSIONS as their first
cell; only in some future version of Tor will an AUTHORIZE cell be sent
first.
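A server-side sketch of the rule above. The numeric values for VPADDING and AUTHORIZE are assumptions for illustration (this proposal only reserves the AUTHORIZE type), and `read_cell`/`clog_digest` are hypothetical stand-ins:

```python
import collections

Cell = collections.namedtuple("Cell", "command raw_bytes")

class ProtocolError(Exception):
    pass

PADDING, VERSIONS = 0, 7
VPADDING, AUTHORIZE = 128, 132   # assumed command values for this sketch

def await_versions(read_cell, clog_digest):
    """Tolerate padding/authorization cells before VERSIONS, but mix
    every cell into the digest that feeds the CLOG field of the
    client's AUTHENTICATE cell."""
    while True:
        cell = read_cell()
        clog_digest.update(cell.raw_bytes)     # counted in the CLOG HMAC
        if cell.command in (PADDING, VPADDING, AUTHORIZE):
            continue                           # ignored: we require no auth
        if cell.command == VERSIONS:
            return cell                        # handshake proceeds normally
        raise ProtocolError("unexpected cell before VERSIONS")
```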
Discussion:
This change allows future versions of the Tor client to know that
some bridges need authorization, and to send them authentication
before sending them anything recognizably Tor-like.
The authorization cell needs to be received before the server can
send any Tor cells, so we can't just patch it in after the
VERSIONS cell exchange: the server's VERSIONS cell is unsendable
until after the AUTHORIZE has been accepted.
Note that to avoid scanning attacks, it's not sufficient to wait
for a single cell, and then either handle it as authorization or
reject the connection. Instead, we need to decide what kind of
server we're impersonating, and respond once the client has
provided *either* an authorization cell, *or* a recognizably valid
or invalid command in the impersonated protocol.
Alternative design: Just use pluggable transports
Pluggable transports can do this too, but in general, we want to
avoid designing the Tor protocol so that any particular desirable
feature can only be done with a pluggable transport. That is, any
feature that *every* bridge should want, should be doable in Tor
proper.
Also, as of 16 Oct 2011, pluggable transports aren't in general
use. Past experience IMO suggests that we shouldn't offload
architectural responsibilities to our chickens until they've
hatched.
Alternative design: Out-of-TLS authorization
There are features (like port-knocking) designed to allow a client
to show that it's authorized to use a bridge before the TLS
handshake even happens. These are appropriate for bunches of
applications, but they're trickier with an adversary who is
MITMing the client.
Alternative design: Just use padding.
Arguably, we could only add the "VPADDING" cell type to the list
of those allowed before VERSIONS cells, and say that any client
authorization we specify later on will be sent as a VPADDING
cell. But that design is kludgy: padding should be padding, not
semantically significant. Besides, cell types are still fairly
plentiful.
Counterargument: specify it later
We could, later on, say that if a client learns that a bridge
needs authorization, it should send an AUTHORIZE cell. So long as
a client never sends an AUTHORIZE to anything other than a bridge that
needs authorization, it'll never violate the spec.
But all things considered, it seems easier (just a few lines of
spec and code) to let bridges eat unexpected authorization now
than it does to have stuff fail later when clients think that a
bridge needs authorization but it doesn't.
Counterargument: it's too late!
We've already got the prop176 branch merged and running on a few
servers. But as of this writing, it isn't in any Tor version.
Even if it *is* out in an alpha before we can get this proposal
accepted and implemented, that's not a big disaster. In the worst
case, where future clients don't know whom to send authorization
to so they need to send it to _all_ v3 servers, they will at worst
break their connections only to a couple of alpha versions which
one hopes by then will be long-deprecated already.
Filename: 188-bridge-guards.txt
Title: Bridge Guards and other anti-enumeration defenses
Author: Nick Mathewson, Isis Lovecruft
Created: 14 Oct 2011
Modified: 10 Sep 2015
Status: Reserve
[NOTE: This proposal is marked as "reserve" because the enumeration
technique it addresses does not currently seem to be in use. See
ticket tor#7144 for more information. (2020 July 31)]
1. Overview
Bridges are useful against censors only so long as the adversary
cannot easily enumerate their addresses. I propose a design to make
it harder for an adversary who controls or observes only a few
nodes to enumerate a large number of bridges.
Briefly: bridges should choose guard nodes, and use the Tor
protocol's "Loose source routing" feature to re-route all extend
requests from clients through an additional layer of guard nodes
chosen by the bridge. This way, only a bridge's guard nodes can
tell that it is a bridge, and the attacker needs to run many more
nodes in order to enumerate a large number of bridges.
I also discuss other ways to avoid enumeration, recommending some.
These ideas are due to a discussion at the 2011 Tor Developers'
Meeting in Waterloo, Ontario. Practically none of the ideas here
are mine; I'm just writing up what I remember.
2. History and Motivation
Under the current bridge design, an attacker who runs a node can
identify bridges by seeing which "clients" make a large number of
connections to it, or which "clients" make connections to it in the
same way clients do. This has been a known attack since early
versions {XXXX check} of the design document; let's try to fix it.
2.1. Related idea: Guard nodes
The idea of guard nodes isn't new: since 0.1.1, Tor has used guard
nodes (first designed as "Helper" nodes by Wright et al in {XXXX})
to make it harder for an adversary who controls a smaller number of
nodes to eavesdrop on clients. The rationale was: an adversary who
controls or observes only one entry and one exit will have a low
probability of correlating any single circuit, but over time, if
clients choose a random entry and exit for each circuit, such an
adversary will eventually see some circuits from each client with a
probability of 1, thereby building a statistical profile of the
client's activities. Therefore, let each client choose its entry
node only from among a small number of client-selected "guard"
nodes: the client is still correlated with the same probability as
before, but now the client has a nonzero chance of remaining
unprofiled.
2.2. Related idea: Loose source routing
Since the earliest versions of Onion Routing, the protocol has
provided "loose source routing". In strict source routing, the
source of a message chooses every hop on the message's path. But
in loose source routing, the message traverses the selected nodes,
but may also traverse other nodes as well. In other words, the
client selects nodes N_a, N_b, and N_c, but the message may in fact
traverse any sequence of nodes N_1...N_j, so long as N_1=N_a,
N_x=N_b, and N_y=N_c, for 1 < x < y.
Tor has retained this feature, but has not yet made use of it.
3. Design
Every bridge currently chooses a set of guard nodes for its
circuits. Bridges should also re-route client circuits through
these guard nodes.
Specifically, when a bridge receives a request from a client to
extend a circuit, it should first create a circuit to its guard,
and then relay that extend cell through the guard. The bridge
should add an additional layer of encryption to outgoing cells on
that circuit corresponding to the encryption that the guard will
remove, and remove a layer of encryption on incoming cells on that
circuit corresponding to the encryption that the guard will add.
3.1. Loose-Source Routed Circuit Construction
Alice, an OP, is using a bridge, Bob, and she has chosen the
following path through the network:
Alice -> Bob -> Charlie -> Deidra
However, Bob has decided to take advantage of the loose-source
routing circuit characteristic (for example, in order to use a bridge
guard), and Bob has chosen N additional loose-source routed hop(s),
through which he will transparently relay cells.
NOTE: For the purposes of bridge guards, N is always 1. However, for
completeness' sake, the following details of the circuit construction
are generalized to include N > 1. Additionally, the following steps
should hold for a hop at any position in Alice's circuit that has
decided to take advantage of the loose-source routing feature, not
only for bridge ORs.
From Alice's perspective, her circuit path matches the one diagrammed
above. However, the overall path of the circuit is:
Alice -> Bob -> Guillaume -> Charlie -> Deidra
From Bob's perspective, the circuit's path is:
Alice -> Bob -> Guillaume -> Charlie -> UNKNOWN
Interestingly, because Bob's behaviour towards Guillaume and his choices
of cell types are those of a normal OP, Guillaume's perspective of the
circuit's path is:
Bob -> Guillaume -> Charlie -> UNKNOWN
That is, to Guillaume, Bob appears (for the most part) to be a
normally connecting client. (See §4.1 for more detailed analysis.)
3.1.1. Detailed Steps of Loose-Source Routed Circuit Construction
1. Connection from OP
Alice has connected to Bob, and she has sent to Bob either a
CREATE/CREATE_FAST or CREATE2 cell.
2. Loose-Source Path Selection
In anticipation of Alice's first RELAY_EARLY cell (which will
contain an EXTEND cell to Alice's next hop), Bob begins
constructing a loose-source routed circuit. To do so, Bob chooses
N additional hop(s):
2.a. For the first additional hop, H_1, Bob chooses a suitable
entry guard node, Guillaume, using the same algorithm as OPs.
See "§5 Guard nodes" of path-spec.txt for additional
information on the selection algorithm.
2.b. Each additional hop, [H_2, ..., H_N], is chosen at random
from a list of suitable, non-excluded ORs.
3. Loose-Source Routed Circuit Extension and Cell Types
Bob now follows the same procedure as OPs use to complete the key
exchanges with his chosen additional hop(s).
While undergoing these following substeps, Bob SHOULD continue to
proceed with Step 4, below, in parallel, as an optimization for
speeding up circuit construction.
3.a. Create Cells
Bob sends the appropriate type of create cell to Guillaume.
For ORs new enough to support the NTor handshake (nearly all
of them at this point), Bob sends a CREATE2 cell. Otherwise,
for ORs which only support the older TAP handshake, Bob sends
either a CREATE or CREATE_FAST cell, using the same
decision-making logic as OPs.
See §4.1 for more information on the distinguishability of
bridges based upon whether they use CREATE versus
CREATE_FAST. Also note that the CREATE2 cell has since
become ubiquitous after this proposal was originally drafted.
Thus, because we prefer ORs which use loose-source routing to
behave (as much as possible) like OPs, we now prefer to use
CREATE2.
3.b. Created Cells
Later, when Bob receives a corresponding CREATED/CREATED_FAST
or CREATED2 cell from Guillaume, Bob extracts key material
for the shared forward and reverse keys, KG_f and KG_b,
respectively.
3.c. Extend Cells
When N > 1, for each additional hop, H_i, in [H_2, ..., H_N],
Bob chooses the appropriate type of extend cell for H_i, and
sends this extend cell to H_i-1, who transforms it into a
create cell in order to perform the extension. To choose
which type of extend cell to send, Bob uses the same
algorithm as an OP to determine whether to use EXTEND or
EXTEND2. Similar to the CREATE* cells above, for most modern
ORs, this will very likely mean an EXTEND2 cell.
3.d. Extended Cells
When a corresponding EXTENDED/EXTENDED2 cell is received for
an additional hop, H_i, Bob extracts the shared forward and
reverse keys, Ki_f and Ki_b, respectively.
4. Responding to the OP
Now that the additional hops in Bob's loose-source routed circuit
are chosen, and construction of the loose-source routed circuit
has begun, Bob answers Alice's original CREATE/CREATE_FAST or
CREATE2 cell (from Step 1) by sending the corresponding created
cell type.
Alice has now built a circuit through Bob, and the two share the
negotiated forward and reverse keys, KB_f and KB_b, respectively.
Note that Bob SHOULD do this step in tandem with the loose-source
routed circuit construction procedure outlined in Step 3, above.
5. OP Circuit Extension
Alice then wants to extend the circuit to node Charlie. She makes
a hybrid-encrypted onionskin, encrypted to Charlie's public key,
containing her chosen g^x value. She puts this in an extend cell:
"Extend (Charlie's address) (Charlie's OR Port) (Onionskin)
(Charlie's ID)". She encrypts this with KB_f and sends it as a
RELAY_EARLY cell to Bob.
Bob's behaviour is now dependent on whether the loose-source
routed circuit construction steps (as outlined in Step 3, above)
have already completed.
5.a. The Loose-Source Routed Circuit Construction is Incomplete
If Bob has not yet finished the loose-source routed circuit
construction, then Bob MUST store the first outgoing
(i.e. exitward) RELAY_EARLY cell received from Alice until
the loose-source routed circuit construction has been
completed.
If any incoming (i.e. toward the OP) RELAY* cell is received
while the loose-source routed circuit is not fully
constructed, Bob MUST drop the cell.
If Bob has already stored Alice's first RELAY_EARLY cell, and
Alice sends any additional RELAY* cell, then Bob SHOULD mark
the entire circuit for close with END_CIRC_REASON_TORPROTOCOL.
5.b. The Loose-Source Routed Circuit Construction is Completed
Later, when the loose-source routed circuit is fully
constructed, Bob MUST send any stored cells from Alice
outward by following the procedure described in Step 6.a.
6. Relay Cells
When receiving a RELAY* cell in either direction, Bob MAY keep
statistics on the number of relay cells encountered, as well as
the number of relay cells relayed.
6.a. Outgoing Relay Cells
Bob decrypts the RELAY* cell with KB_f. If the cell becomes
recognized, Bob should now follow the relay command checks
described in Step 6.c.
Bob MUST encrypt the relay cell's underlying payload to each
additional hop in the loose-source routed circuit, in
reverse: for each additional hop, H_i, in [H_N, ..., H_1],
Bob encrypts the relay cell payload to Ki_f, the shared
forward key for the hop H_i.
Bob MUST update the forward digest, DG_f, of the relay cell,
regardless of whether or not the cell is recognized. See
6.c. for additional information on recognized cells.
Bob now sends the cell outwards through the additional hops.
At each hop, H_i, the hop removes a layer of the onionskin by
decrypting the cell with Ki_f, and then hop H_i forwards the
cell to the next additional hop, H_i+1. When the
final additional hop, H_N, receives the cell, the OP's cell
command and payload should be processed by H_N in the normal
manner for an OR.
6.b. Incoming Relay Cells
Bob MUST decrypt the relay cell's underlying payload from
each additional hop in the loose-source routed circuit (in
forward order, this time): For each additional hop, H_i, in
[H_1, ..., H_N], Bob decrypts the relay cell payload with
Ki_b, the shared backward key for the hop H_i.
If the cell becomes recognized after all decryptions, Bob
should now follow the relay command checks described in Step
6.c.
Bob MUST update the backward digest, DG_b, of the relay cell,
regardless of whether or not the cell is recognized. See
6.c. for additional information on recognized cells.
Bob encrypts the cell towards the OP with KB_b, and sends the
cell inwards.
6.c. Recognized Cells
If a relay cell, either incoming or outgoing, becomes
recognized (i.e. Bob sees that the cell was intended for him)
after decryption, and there is no stream attached to the
circuit, then Bob SHOULD mark the circuit for close if the
relay command contained within the cell is any of the
following types:
- RELAY_BEGIN
- RELAY_CONNECTED
- RELAY_END
- RELAY_RESOLVE
- RELAY_RESOLVED
- RELAY_BEGIN_DIR
Apart from the above checks, Bob SHOULD essentially treat
every cell as "unrecognized" by following the en-/de-cryption
procedures in Steps 6.a. and 6.b. regardless of whether the
cell is actually recognized or not. That is, since this is a
loose-source routed circuit, Bob SHOULD relay cells not
intended for him *and* cells intended for him through the
leaky pipe, no matter what the cell's underlying payload and
command are.
3.1.2. Example Loose-Source Circuit Construction
For example, given the following circuit path chosen by Alice:
Alice -> Bob -> Charlie -> Deidra
when Alice wishes to extend to node Charlie, and Bob the bridge is
using only one additional loose-source routed hop, Guillaume, as his
bridge guard, the following steps are taken:
- Alice packages the extend into a RELAY_EARLY cell and encrypts
the RELAY_EARLY cell with KB_f to Bob.
- Bob receives the RELAY_EARLY cell from Alice, and he follows
the procedure (outlined in §3.1.1. Step 6.a.) by:
* Decrypting the cell with KB_f,
* Encrypting the cell to the forward key, KG_f, which Bob
shares with his guard node, Guillaume,
* Updating the cell forward digest, DG_f, and
* Sending the cell as a RELAY_EARLY cell to Guillaume.
- When Guillaume receives the cell from Bob, he processes it by:
* Decrypting the cell with KG_f. Guillaume now sees that it
is a RELAY_EARLY cell containing an extend cell "intended"
for him, containing: "Extend (Charlie's address) (Charlie's
OR Port) (Onionskin) (Charlie's ID)".
* Performing the circuit extension to the specified node,
Charlie, by acting accordingly: creating a connection to
Charlie if he doesn't have one, ensuring that the ID is as
expected, and then sending the onionskin in a create cell
on that connection. Note that Guillaume is behaving
exactly as a regular node would upon receiving an Extend
cell.
* Waiting for the handshake to finish: Charlie receives the
onionskin and sends Guillaume "CREATED g^y,KH".
* Making an extended cell for Bob which contains
"E(KG_b, EXTENDED g^y KH)", and
* Sending the extended cell to Bob. Note that Charlie and
Guillaume are both still behaving in a manner identical to
regular ORs.
- Bob receives the extended cell from Guillaume, and he follows
the procedure (outlined in §3.1.1. Step 6.b.) by:
* Decrypting the cell with KG_b,
* Encrypting the cell to Alice with KB_b,
* Updating the cell backward digest, DB_b, and
* Sending the cell to Alice.
- Alice receives the cell, and she decrypts it with KB_b, just
as she would have if Bob had extended to Charlie directly.
She then processes the extended cell contained within to
extract shared keys with Charlie. Note that Alice's behaviour
is identical to regular OPs.
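The onion-crypto bookkeeping Bob performs in this example can be sketched as follows. This is a toy model: a SHA-256-based XOR keystream stands in for Tor's actual AES-CTR relay crypto, the key names mirror those in the example, and digest updates are omitted.

```python
import hashlib
from itertools import count

def keystream(key: bytes, n: int) -> bytes:
    """Toy CTR-style keystream from SHA-256 (stand-in for Tor's AES-CTR)."""
    out = b""
    for ctr in count():
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        if len(out) >= n:
            return out[:n]

def crypt(key: bytes, data: bytes) -> bytes:
    """XOR stream cipher: encrypting and decrypting are the same operation."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Hypothetical keys, named after the example: KB_f is shared by Alice
# and Bob; KG_f is shared by Bob and his guard, Guillaume.
KB_f = b"alice-bob-forward-key"
KG_f = b"bob-guillaume-forward-key"

payload = b"EXTEND (Charlie's address) ..."
cell_from_alice = crypt(KB_f, payload)                      # Alice encrypts to Bob
cell_to_guard = crypt(KG_f, crypt(KB_f, cell_from_alice))   # Bob: strip KB_f, add KG_f
recovered = crypt(KG_f, cell_to_guard)                      # Guillaume decrypts
```

Guillaume recovers exactly the extend payload that Alice built, which is why he and Charlie can behave identically to ordinary ORs.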
3.2. Additional Notes on the Construction
Note that this design does not require that our stream cipher
operations be commutative, even though they are.
Note also that this design requires no change in behavior from any
node other than Bob, and as we can see in the above example in §3.1.2
for Alice's circuit extension, Alice, Guillaume, and Charlie behave
identically to a normal OP and normal ORs.
Finally, observe that even though the circuit is N hops longer than it
would be otherwise, no relay's count of permissible RELAY_EARLY cells
falls lower than it otherwise would. This is because the extra hop
that Bob adds is done with RELAY_EARLY cells, then he continues to
relay Alice's cells as RELAY_EARLY, until the appropriate maximum
number of RELAY_EARLY cells is reached. Afterwards, further
RELAY_EARLY cells from Alice are repackaged by Bob as normal RELAY
cells.
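The RELAY_EARLY accounting described above can be illustrated with a small sketch. The cap of 8 cells and the cost of one RELAY_EARLY cell for Bob's extra hop are illustrative assumptions, not values taken from this proposal.

```python
MAX_RELAY_EARLY = 8   # hypothetical cap; the real value is a protocol constant

def repackage(alice_commands, sent_already=1):
    # Bob already spent `sent_already` RELAY_EARLY cells extending to his
    # own guard; he forwards Alice's RELAY_EARLY cells unchanged until the
    # cap is reached, then downgrades the rest to ordinary RELAY cells.
    out = []
    used = sent_already
    for cmd in alice_commands:
        if cmd == "RELAY_EARLY" and used < MAX_RELAY_EARLY:
            out.append("RELAY_EARLY")
            used += 1
        else:
            out.append("RELAY")
    return out
```

The nodes after Bob therefore never see more RELAY_EARLY cells on the circuit than they would on an ordinary one.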
4. Alternative designs
4.1. Client-enforced bridge guards
What if Tor didn't have loose source routing? We could have
bridges tell clients what guards to use by advertising those guards
in their descriptors, and then refusing to extend circuits to any
other nodes. This change would require all clients to upgrade in
order to be able to use the newer bridges, and would quite possibly
cause a fair amount of pain along the way.
Fortunately, we don't need to go down this path. So let's not!
4.2. Separate bridge-guards and client-guards
In the design above, I specify that bridges should use the same
guard nodes for extending client circuits as they use for their own
circuits. It's not immediately clear whether this is a good idea
or not. Having separate sets would seem to make the two kinds of
circuits more easily distinguishable (even though we already assume
they are distinguishable). Having different sets of guards would
also seem like a way to keep the nodes who guard our own traffic
from learning that we're a bridge... but another set of nodes will
learn that anyway, so it's not clear what we'd gain.
One good reason to keep separate guard lists is to prevent the
*client* of the bridge from being able to enumerate the guards that
the bridge uses to protect its own traffic (by extending a circuit
through the bridge to a node it controls, and finding out where the
extend request arrives from).
5. Additional bridge enumeration methods and protections
In addition to the design above, there are more ways to try to
prevent enumeration.
Right now, there are multiple ways for the node after a bridge to
distinguish a circuit extended through the bridge from one
originating at the bridge. (This lets the node after the bridge
tell that a bridge is talking to it.)
5.1. Make it harder to tell clients from bridges
When using the older TAP circuit handshake protocol, one of the
giveaways is that the first hop in a circuit is created with
CREATE_FAST cells, but all subsequent hops are created with CREATE
cells.
However, because nearly everything in the network now uses the newer
NTor circuit handshake protocol, clients send CREATE2 cells to all
hops, regardless of position. Therefore, in the above design, it's
no longer quite so simple to distinguish an OP connecting through a
bridge from an actual OP, since all of the circuits that extend
through a bridge now reach its guards through CREATE2 cells (whether
the bridge originated them or not), and only as a fallback (e.g. if
an additional node in the loose-source routed path does not support
NTor) will the bridge ever use CREATE/CREATE_FAST. (Additionally,
when using the fallback method, the behaviour for choosing either
CREATE or CREATE_FAST is identical to normal OP behaviour.)
The CREATE/CREATE_FAST distinction is not the only way for a
bridge's guard to tell bridges from ordinary clients, however.
Most importantly, a busy bridge will open far more circuits than a
client would. More subtly, the latency of responses from the
client will be higher and more variable than it would be with an
ordinary client. I don't think we can make bridges behave wholly
indistinguishably from clients: that's why we should go with guard
nodes for bridges.
[XXX For further research: we should study the methods by which a
bridge guard can determine that they are acting as a guard for a
bridge, rather than for a normal OP, and which methods are likely to
be more accurate or efficient than others. -IL]
5.2. Bridge Reachability Testing
Currently, a bridge's reachability is tested both by the bridge
itself (called "self-testing") and by the BridgeAuthority.
5.2.1. Bridge Reachability Self-Testing
Before a bridge uploads its descriptors to the BridgeAuthority, it
creates a special type of testing circuit which ends at itself:
Bob -> Guillaume -> Charlie -> Bob
Thus, the benefit of going to all this trouble to use loose-source
routing to relay Alice's traffic through Guillaume (rather than
connecting directly to Charlie, as Alice intended) is diminished by
the fact that Charlie can still passively enumerate bridges by
waiting to be asked to connect to a node which is not contained
within the consensus.
We could get around this problem by disabling self-testing for bridges
entirely, by automatically setting "AssumeReachable 1" for all bridge
relays… although I am not sure if this is wise.
Our best idea thus far, for bridge reachability self-testing, is to create
a circuit like so:
Bridge → Guard → Middle → OtherMiddle → Guard → Bridge
While, clearly, that circuit is just a little bit insane, it must be that
way because we cannot simply do:
Bridge → Guard → Middle → Guard → Bridge
because the Middle would refuse to extend back to the previous node
(all ORs follow this rule). Similarly, it would be inane to do:
Bridge → Guard → Middle → OtherMiddle → Bridge
because, obviously, that merely shifts the problem to OtherMiddle and
accomplishes nothing. [XXX Is there something smarter we could do? —IL]
5.2.2. Bridge Reachability Testing by the BridgeAuthority
After receiving Bob's descriptors, the BridgeAuthority attempts to
connect to Bob's ORPort by making a direct TLS connection to the
bridge's advertised ORPort.
Should we change this behaviour? On the one hand, at least this
does not enable any random OR in the entire network to enumerate
bridges. On the other hand, any adversary who can observe packets
from the BridgeAuthority is capable of enumeration.
6. Other considerations
What fraction of our traffic is bridge traffic? Will this alter
our circuit selection weights?
Filename: 189-authorize-cell.txt
Title: AUTHORIZE and AUTHORIZED cells
Author: George Kadianakis
Created: 04 Nov 2011
Status: Obsolete
1. Overview
Proposal 187 introduced the concept of the AUTHORIZE cell, a cell
whose purpose is to make Tor bridges resistant to scanning attacks.
This is achieved by having the bridge and the client share a secret
out-of-band and then use AUTHORIZE cells to validate that the
client indeed knows that secret before proceeding with the Tor
protocol.
This proposal specifies the format of the AUTHORIZE cell and also
introduces the AUTHORIZED cell, a way for bridges to announce to
clients that the authorization process is complete and successful.
2. Motivation
AUTHORIZE cells should be able to perform a variety of
authorization protocols based on a variety of shared secrets. This
forces the AUTHORIZE cell to have a dynamic format based on the
authorization method used.
AUTHORIZED cells are used by bridges to signal the end of a
successful bridge client authorization and the beginning of the
actual link handshake. AUTHORIZED cells have no other use and for
this reason their format is very simple.
Both AUTHORIZE and AUTHORIZED cells are to be used under censorship
conditions and they should look innocuous to any adversary capable
of monitoring network traffic.
As an attack example, an adversary could passively monitor the
traffic of a bridge host, looking at the packets directly after the
TLS handshake and trying to deduce from their packet size if they
are AUTHORIZE and AUTHORIZED cells. For this reason, AUTHORIZE and
AUTHORIZED cells are padded with a random amount of padding before
sending.
3. Design
3.1. AUTHORIZE cell
The AUTHORIZE cell is a variable-sized cell.
The generic AUTHORIZE cell format is:
AuthMethod [1 octet]
MethodFields [...]
PadLen [2 octets]
Padding ['PadLen' octets]
where:
'AuthMethod', is the authorization method to be used.
'MethodFields', is dependent on the authorization Method used. It's
a meta-field hosting an arbitrary amount of fields.
'PadLen', specifies the amount of padding in octets.
Implementations SHOULD pick 'PadLen' to be a random integer from 1
to 3141 inclusive.
'Padding', is 'PadLen' octets of random content.
3.2. AUTHORIZED cell format
The AUTHORIZED cell is a variable-sized cell.
The AUTHORIZED cell format is:
'AuthMethod' [1 octet]
'PadLen' [2 octets]
'Padding' ['PadLen' octets]
where all fields have the same meaning as in section 3.1.
3.3. Cell parsing
Implementations MUST ignore the contents of 'Padding'.
Implementations MUST reject an AUTHORIZE or AUTHORIZED cell where
the 'Padding' field is not 'PadLen' octets long.
Implementations MUST reject an AUTHORIZE cell with an 'AuthMethod'
they don't recognize.
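As a concrete sketch, the AUTHORIZED cell (the simpler of the two formats) could be built and parsed as follows. The method number 1 is a hypothetical value, since this proposal assigns none.

```python
import secrets
import struct

KNOWN_METHODS = {1}  # hypothetical AuthMethod number, for illustration only

def build_authorized(auth_method: int) -> bytes:
    # PadLen is a random integer from 1 to 3141 inclusive, per section 3.1.
    pad_len = secrets.randbelow(3141) + 1
    return (bytes([auth_method])
            + struct.pack("!H", pad_len)        # PadLen, 2 octets
            + secrets.token_bytes(pad_len))     # Padding, random content

def parse_authorized(cell: bytes) -> int:
    if len(cell) < 3:
        raise ValueError("truncated cell")
    auth_method = cell[0]
    (pad_len,) = struct.unpack("!H", cell[1:3])
    if len(cell) - 3 != pad_len:          # Padding MUST be exactly PadLen octets
        raise ValueError("bad padding length")
    if auth_method not in KNOWN_METHODS:  # unrecognized AuthMethod MUST be rejected
        raise ValueError("unknown AuthMethod")
    return auth_method                    # Padding contents are ignored
```

An AUTHORIZE cell would add the variable-length MethodFields between AuthMethod and PadLen, which a parser can only interpret once it knows the method.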
4. Discussion
4.1. What's up with the [1,3141] padding bytes range?
The upper limit is larger than the Ethernet MTU so that AUTHORIZE
and AUTHORIZED cells are not always transmitted in a single
packet. Other than that, it's indeed pretty much arbitrary.
4.2. Why not let the pluggable transports do the padding, like they
are supposed to do for the rest of the Tor protocol?
The arguments of section "Alternative design: Just use pluggable
transports" of proposal 187, apply here as well:
All bridges who use client authorization will also need padded
AUTHORIZE and AUTHORIZED cells.
4.3. How should multiple round-trip authorization protocols be handled?
Protocols that require multiple round trips between the client and
the bridge should use AUTHORIZE cells for communication.
The format of the AUTHORIZE cell is flexible enough to support
messages from the client to the bridge and the reverse.
At the end of a successful multiple-round-trip protocol, an
AUTHORIZED cell must be issued from the bridge to the client.
4.4. AUTHORIZED seems useless. Why not use VPADDING instead?
As noted in proposal 187, the Tor protocol uses VPADDING cells for
padding; any other use of VPADDING makes the Tor protocol kludgy.
In the future, and in the example case of a v3 handshake, a client
can optimistically send a VERSIONS cell along with the final
AUTHORIZE cell of an authorization protocol. That allows the
bridge, in the case of successful authorization, to also process
the VERSIONS cell and begin the v3 handshake promptly.
4.5. What should actually happen when a bridge rejects an AUTHORIZE
cell?
When a bridge detects a badly formed or malicious AUTHORIZE cell,
it should assume that the other side is an adversary scanning for
bridges. The bridge should then act accordingly to avoid detection.
This proposal does not try to specify how a bridge can avoid
detection by an adversary.
Filename: 190-shared-secret-bridge-authorization.txt
Title: Bridge Client Authorization Based on a Shared Secret
Author: George Kadianakis
Created: 04 Nov 2011
Status: Obsolete
Notes: This is obsoleted by pluggable transports.
1. Overview
Proposals 187 and 189 introduced AUTHORIZE and AUTHORIZED cells.
Their purpose is to make bridge relays scanning-resistant against
censoring adversaries capable of probing hosts to observe whether
they speak the Tor protocol.
This proposal specifies a bridge client authorization scheme based
on a shared secret between the bridge user and bridge operator.
2. Motivation
A bridge client authorization scheme should only allow clients who
show knowledge of a shared secret to talk Tor to the bridge.
3. Shared-secret-based authorization
3.1. Where do shared secrets come from?
A shared secret is a piece of data known only to the bridge
operator and the bridge client.
It's meant to be automatically generated by the bridge
implementation to avoid issues with insecure and weak passwords.
Bridge implementations SHOULD create shared secrets by generating
random data using a strong RNG or PRNG.
3.2. AUTHORIZE cell format
In shared-secret-based authorization, the MethodFields field of the
AUTHORIZE cell becomes:
'shared_secret' [10 octets]
where:
'shared_secret', is the shared secret between the bridge operator
and the bridge client.
3.3. Cell parsing
Bridge implementations MUST reject any AUTHORIZE cells whose
'shared_secret' field does not match the shared secret negotiated
between the bridge operator and authorized bridge clients.
4. Tor implementation
4.1. Bridge side
Tor bridge implementations MUST create the bridge shared secret by
generating 10 octets of random data using a strong RNG or PRNG.
Tor bridge implementations MUST store the shared secret in
'DataDirectory/keys/bridge_auth_ss_key' in hexadecimal encoding.
Tor bridge implementations MUST support the boolean
'BridgeRequireClientSharedSecretAuthorization' configuration file
option which enables bridge client authorization based on a shared
secret.
If 'BridgeRequireClientSharedSecretAuthorization' is set, bridge
implementations MUST generate a new shared secret, if
'DataDirectory/keys/bridge_auth_ss_key' does not already exist.
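A minimal sketch of the bridge-side secret handling, assuming Python's secrets module as the strong RNG. The constant-time comparison is an implementation detail this proposal does not mandate, but a natural precaution.

```python
import hmac
import secrets

def new_bridge_secret() -> bytes:
    """Generate the 10-octet shared secret from a strong RNG."""
    return secrets.token_bytes(10)

def store_format(secret: bytes) -> str:
    """Hexadecimal encoding, as stored in bridge_auth_ss_key."""
    return secret.hex()

def secret_matches(received: bytes, stored: bytes) -> bool:
    # Compare in constant time to avoid a timing side channel.
    return hmac.compare_digest(received, stored)
```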
4.2. Client side
Tor client implementations must extend their Bridge line format to
support bridge shared secrets. The new format is:
Bridge [<method>] <address[:port]> [["keyid="]<id-fingerprint>] ["shared_secret="<shared_secret>]
where <shared_secret> is the bridge shared secret in hexadecimal
encoding.
Tor clients who use bridges with shared-secret-based client
authorization must specify the bridge's shared secret as in:
Bridge 12.34.56.78 shared_secret=934caff420aa7852b855
5. Discussion
5.1. What should actually happen when a bridge rejects an AUTHORIZE
cell?
When a bridge detects a badly formed or malicious AUTHORIZE cell,
it should assume that the other side is an adversary scanning for
bridges. The bridge should then act accordingly to avoid detection.
This proposal does not try to specify how a bridge can avoid
detection by an adversary.
6. Acknowledgements
Thanks to Nick Mathewson and Robert Ransom for the help and
suggestions while writing this proposal.
Filename: 191-mitm-bridge-detection-resistance.txt
Title: Bridge Detection Resistance against MITM-capable Adversaries
Author: George Kadianakis
Created: 07 Nov 2011
Status: Obsolete
1. Overview
Proposals 187, 189 and 190 make the first steps toward scanning
resistant bridges. They attempt to block attacks from censoring
adversaries who provoke bridges into speaking the Tor protocol.
An attack vector that hasn't been explored in those previous
proposals is that of an adversary capable of performing Man In The
Middle attacks to Tor clients. At the moment, Tor clients using the
v3 link protocol have no way to detect such an MITM attack, and
will gladly send a VERSIONS or AUTHORIZE cell to the MITMed
connection, thereby revealing the Tor protocol and thus the bridge.
This proposal introduces a way for clients to detect an MITMed SSL
connection, allowing them to protect against the above attack.
2. Motivation
When the v3 link handshake protocol is performed, Tor's SSL
handshake is performed with the server sending a self-signed
certificate and the client blindly accepting it. This allows the
adversary to perform an MITM attack.
A Tor client must detect the MITM attack before he initiates the
Tor protocol by sending a VERSIONS or AUTHORIZE cell. A good
moment to detect such an MITM attack is during the SSL handshake.
To achieve that, bridge operators provide their bridge users with a
hash digest of the public-key certificate their bridge is using for
SSL. Bridge clients store that hash digest locally and associate it
with that specific bridge. Bridge clients who have "pinned" a
bridge to a certificate "fingerprint" can thereafter validate that
their SSL connection peer is the intended bridge.
Of course, the hash digest must be provided to users out-of-band
and before the actual SSL handshake. Usually, the bridge operator
gives the hash digest to her bridge users along with the rest of
the bridge credentials, like the bridge's address and port.
3. Security implications
Bridge clients who have pinned a bridge to a certificate
fingerprint will be able to detect an MITMing adversary in time.
If after detection they act as an innocuous Internet
client, they can successfully remove suspicion from the SSL
connection and subvert bridge detection.
Pinning a certificate fingerprint and detecting an MITMing attacker
does not automatically alleviate suspicions from the bridge or the
client. Clients must have a behavior to follow after detecting the
MITM attack so that they look like innocent Netizens. This proposal
does not try to specify such a behavior.
Implementation and use of this scheme does not render bridges and
clients immune to scanning or DPI attacks. This scheme should be
used along with bridge client authorization schemes like the ones
detailed in proposal 190.
4. Tor Implementation
4.1. Certificate fingerprint creation
The certificate fingerprints used in this scheme MUST be computed
by applying the SHA256 cryptographic hash function upon the ASN.1
DER encoding of a public-key certificate, then truncating the hash
output to 12 bytes, encoding it to RFC4648 Base32 and omitting any
trailing padding '='.
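A sketch of the fingerprint computation. One assumption worth flagging: applying Base32 directly to the 12 truncated bytes yields 20 characters, whereas the 39-character example fingerprints in sections 4.3 and 6.1 correspond to Base32 applied to the 24-character hex encoding of those bytes, which is what this sketch does.

```python
import base64
import hashlib

def cert_fingerprint(der_cert: bytes) -> str:
    # SHA256 over the ASN.1 DER encoding, truncated to 12 bytes.
    truncated = hashlib.sha256(der_cert).digest()[:12]
    # Base32 over the hex string gives the 39-character form seen in the
    # examples; trailing '=' padding is omitted.
    return base64.b32encode(truncated.hex().encode()).decode().rstrip("=")
```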
4.2. Bridge side implementation
Tor bridge implementations SHOULD provide a command line option
that exports a fully equipped Bridge line containing the bridge
address and port, the link certificate fingerprint, and any other
enabled Bridge options, so that bridge operators can easily send it
to their users.
In the case of expiring SSL certificates, Tor bridge
implementations SHOULD warn the bridge operator a sensible amount
of time before the expiration, so that she can warn her clients and
potentially rotate the certificate herself.
4.3. Client side implementation
Tor client implementations MUST extend their Bridge line format to
support bridge SSL certificate fingerprints. The new format is:
Bridge <method> <address:port> [["keyid="]<id-fingerprint>] \
["shared_secret="<shared_secret>] ["link_cert_fpr="<fingerprint>]
where <fingerprint> is the bridge's SSL certificate fingerprint.
Tor clients who use bridges and want to pin their SSL certificates
must specify the bridge's SSL certificate fingerprint as in:
Bridge 12.34.56.78 shared_secret=934caff420aa7852b855 \
link_cert_fpr=GM4GEMBXGEZGKOJQMJSWINZSHFSGMOBRMYZGCMQ
4.4. Implementation prerequisites
Tor bridges currently rotate their SSL certificates every 2
hours. This not only acts as a fingerprint for the bridges, but it
also acts as a blocker for this proposal.
Tor trac ticket #4390 and proposal YYY were created to resolve this
issue.
5. Other ideas
5.1. Certificate tagging using a shared secret
Another idea worth considering is having the bridge use the shared
secret from proposal 190 to embed a "secret message" on her
certificate, which could only be understood by a client who knows
that shared secret, essentially authenticating the bridge.
Specifically, the bridge would "tag" the Serial Number (or any
other covert field) of her certificate with the (potentially
truncated) HMAC of her link public key, using the shared secret of
proposal 190 as the key: HMAC(shared_secret, link_public_key).
A client knowing the shared secret would be able to verify the
'link_public_key' and authenticate the bridge, and since the Serial
Number field is usually composed of random bytes, a probing attacker
would not notice the "tagging" of the certificate.
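The tagging computation might look like the following sketch. The proposal leaves the underlying hash unspecified, so SHA-256 and the 8-byte truncation (matching OpenSSL's random serial length) are assumptions here.

```python
import hashlib
import hmac

def serial_tag(shared_secret: bytes, link_public_key: bytes, n: int = 8) -> bytes:
    """Truncated HMAC(shared_secret, link_public_key), used as the serial."""
    return hmac.new(shared_secret, link_public_key, hashlib.sha256).digest()[:n]

def client_verifies(shared_secret: bytes, link_public_key: bytes,
                    observed_serial: bytes) -> bool:
    # A client knowing the shared secret recomputes the tag over the link
    # public key it observed in the handshake and compares in constant time.
    expected = serial_tag(shared_secret, link_public_key, len(observed_serial))
    return hmac.compare_digest(expected, observed_serial)
```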
Arguments for this scheme are that it:
a) doesn't need extra bridge credentials apart from the shared secret
of prop190.
b) doesn't need any maintenance in case of certificate expiration.
Arguments against this scheme are:
a) In the case of self-signed certificates, OpenSSL creates an
8-bytes random Serial number, and we would probably need
something more than 8-bytes to tag. There are not many other
covert fields in SSL certificates mutable by vanilla OpenSSL.
b) It complicates the scheme, and if not implemented and researched
wisely it might also make it fingerprintable.
c) We most probably won't be able to tag CA-signed certificates.
6. Discussion
6.1. In section 4.1, why do you truncate the SHA256 output to 12 bytes?!
Bridge credentials are frequently propagated by word of mouth or
are physically written down, which renders a cryptic Base64
encoding unsatisfactory. The 104-character Base32 encoding or the
64-character hex representation of the SHA256 output would also be
too much bloat.
By truncating the SHA256 output to 12 bytes and encoding it with
Base32, we get 39 characters of readable, easy-to-transcribe
output, and sufficient security. Finally, dividing '39' by the
golden ratio gives us about 24.10!
7. Acknowledgements
Thanks to Robert Ransom for his great help and suggestions on
devising this scheme and writing this proposal!
Filename: 192-store-bridge-information.txt
Title: Automatically retrieve and store information about bridges
Author: Sebastian Hahn
Created: 16-Nov-2011
Status: Obsolete
Target: 0.2.[45].x
Overview:
Currently, tor already stores some information about the bridges it is
configured to use locally, but doesn't make great use of the stored
data. This data is the Tor configuration information about the bridge
(IP address, port, and optionally fingerprint) and the bridge descriptor
which gets stored along with the other descriptors a Tor client fetches,
as well as an "EntryGuard" line in the state file. That line includes
the Tor version we used to add the bridge, and a slightly randomized
timestamp (up to a month in the past of the real date). The descriptor
data also includes some more accurate timestamps about when the
descriptor was fetched.
The information we give out about bridges via bridgedb currently only
includes the IP address and port, because giving out the fingerprint as
well might mean that Tor clients make direct connections to the bridge
authority, since we didn't design Tor's UpdateBridgesFromAuthority
behaviour correctly.
Motivation:
The only way to let Tor know about a change affecting the bridge (IP
address or port change) is to either ask the bridge authority directly,
or reconfigure Tor. The former requires making a non-anonymized direct
connection to the bridge authority Tonga and asking it for the current
descriptor of the bridge with a given fingerprint - this is unsafe and
also requires prior knowledge of the fingerprint. The latter requires
user intervention, first to learn that there was an update and second to
actually teach Tor about the change.
This is way too complicated for most users, and should be unnecessary
while the user has at least one bridge that remains working: Tonga can
give out bridge descriptors when asked for the descriptor for a certain
fingerprint, and Tor clients learn the fingerprint either from their
torrc file or from the first connection they make to a bridge.
For some users, however, this option is not what they want: They might
use private bridges or have special security concerns, which would make
them want to connect to the IP addresses specified in their
configuration only, and not tell Tonga about the set of bridges they
know about, even through a Tor circuit. Also see
https://blog.torproject.org/blog/different-ways-use-bridge for more
information about the different types of bridge users.
Design:
Tor should provide a new configuration option that allows bridge users
to indicate that they wish to contact Tonga anonymously and learn about
updates for the bridges that they know about, but can't currently reach.
Once those updates have been received, the clients would then hold on to
the new information in their state file, and use it across restarts for
connection attempts.
The option UpdateBridgesFromAuthority should be removed or recycled for
this purpose, as it is currently dangerous to set (it makes direct
connections to the bridge authority, thus leaking that a user is about
to use bridges). Recycling the option is probably the better choice,
because current users of the option get a surprising and never useful
behaviour. On the other hand, users who downgrade their Tors might get
the old behaviour by accident.
If configured with this option, tor would make an anonymized connection
to Tonga to ask for the descriptors of bridges that it cannot currently
connect to, once every few hours. Making more frequent requests would
likely not help, as bridge information doesn't typically change that
frequently, and may overload Tonga.
This information needs to be stored in the state file:
- An exact copy of the Bridge stanza in the torrc file, so that tor can
detect when the bridge is unconfigured/the configuration is changed
- The IP address, port, and fingerprint we last used when making a
successful connection to the bridge, if this differs from/supplements
the configured data.
- The IP address, port, and fingerprint we learned from the bridge
authority, if this differs from both the configured data and the data
we used for the last successful connection.
We don't store more data in the state file to avoid leaking too much if
the state file falls into the hands of an adversary.
Security implications:
Storing sensitive data on disk is risky when the computer one uses gets
into the wrong hands, and state file entries can be used to identify
times the user was online. This is already a problem for the Bridge
lines in a user's configuration file, but by storing more information
about bridges some timings can be deduced.
Another risk is that this allows long-term tracking of users when the
set of bridges a user knows about is known to the attacker, and the set
is unique. This is not very hard to achieve for bridgedb, as users
typically make requests to it non-anonymized and bridgedb can
selectively pick bridges to report. By combining the data about
descriptor fetches on Tonga and this fingerprint, a usage pattern can be
established. Also, bridgedb could give out a made-up fingerprint to a
user that requested bridges, thus easily creating a unique set.
Users of private bridges should not set this option, as it will leak the
fingerprints of their bridges to Tonga. This is not a huge concern, as
Tonga doesn't know about those descriptors, but private bridge users
will likely want to avoid leaking the existence of their bridge. We
might want to figure out a way to indicate that a bridge is private on
the Bridge line in the configuration, so fetching the descriptor from
Tonga is disabled for those automatically. This warrants more discussion
to find a solution that doesn't require bridge users to understand the
trade-offs of setting a configuration option.
One idea is to indicate that a bridge is private by a special flag in
its bridge descriptor, so clients can avoid leaking those to the bridge
authority automatically. Also, Bridge lines for private bridges
shouldn't include the fingerprint so that users don't accidentally leak
the fingerprint to the bridge authority before they have talked to the
bridge.
Specification:
No change/addition to the current specification is necessary, as the
data that gets stored at clients is not covered by the specification.
This document is supposed to serve as a basis for discussion and to
provide hints for implementors.
Compatibility:
Tonga is already set up to send out descriptors requested by clients, so
the bridge authority side doesn't need any changes. The new
configuration options governing the behaviour of Tor would be
incompatible with previous versions, so the torrc needs to be adapted.
The state file changes should not affect older versions.
Filename: 193-safe-cookie-authentication.txt
Title: Safe cookie authentication for Tor controllers
Author: Robert Ransom
Created: 2012-02-04
Status: Closed
Overview:
Not long ago, all Tor controllers which automatically attempted
'cookie authentication' were vulnerable to an information-disclosure
attack. (See https://bugs.torproject.org/4303 for slightly more
information.)
Now, some Tor controllers which automatically attempt cookie
authentication are only vulnerable to an information-disclosure
attack on any 32-byte files they can read. But the Ed25519
signature scheme (among other cryptosystems) has 32-byte secret
keys, and we would like to not worry about Tor controllers leaking
our secret keys to whatever can listen on what the controller thinks
is Tor's control port.
Additionally, we would like to not have to remodel Tor's innards and
rewrite all of our Tor controllers to use TLS on Tor's control port
this week (or deal with the many design issues which that would
raise).
Design:
From af6bf472d59162428a1d7f1d77e6e77bda827414 Mon Sep 17 00:00:00 2001
From: Robert Ransom <rransom.8774@gmail.com>
Date: Sun, 5 Feb 2012 04:02:23 -0800
Subject: [PATCH] Add SAFECOOKIE control-port authentication method
---
control-spec.txt | 59 ++++++++++++++++++++++++++++++++++++++++++++++-------
1 files changed, 51 insertions(+), 8 deletions(-)
diff --git a/control-spec.txt b/control-spec.txt
index 66088f7..3651c86 100644
--- a/control-spec.txt
+++ b/control-spec.txt
@@ -323,11 +323,12 @@
For information on how the implementation securely stores authentication
information on disk, see section 5.1.
- Before the client has authenticated, no command other than PROTOCOLINFO,
- AUTHENTICATE, or QUIT is valid. If the controller sends any other command,
- or sends a malformed command, or sends an unsuccessful AUTHENTICATE
- command, or sends PROTOCOLINFO more than once, Tor sends an error reply and
- closes the connection.
+ Before the client has authenticated, no command other than
+ PROTOCOLINFO, AUTHCHALLENGE, AUTHENTICATE, or QUIT is valid. If the
+ controller sends any other command, or sends a malformed command, or
+ sends an unsuccessful AUTHENTICATE command, or sends PROTOCOLINFO or
+ AUTHCHALLENGE more than once, Tor sends an error reply and closes
+ the connection.
To prevent some cross-protocol attacks, the AUTHENTICATE command is still
required even if all authentication methods in Tor are disabled. In this
@@ -949,6 +950,7 @@
"NULL" / ; No authentication is required
"HASHEDPASSWORD" / ; A controller must supply the original password
"COOKIE" / ; A controller must supply the contents of a cookie
+ "SAFECOOKIE" ; A controller must prove knowledge of a cookie
AuthCookieFile = QuotedString
TorVersion = QuotedString
@@ -970,9 +972,9 @@
methods that Tor currently accepts.
AuthCookieFile specifies the absolute path and filename of the
- authentication cookie that Tor is expecting and is provided iff
- the METHODS field contains the method "COOKIE". Controllers MUST handle
- escape sequences inside this string.
+ authentication cookie that Tor is expecting and is provided iff the
+ METHODS field contains the method "COOKIE" and/or "SAFECOOKIE".
+ Controllers MUST handle escape sequences inside this string.
The VERSION line contains the Tor version.
@@ -1033,6 +1035,47 @@
[TAKEOWNERSHIP was added in Tor 0.2.2.28-beta.]
+3.24. AUTHCHALLENGE
+
+ The syntax is:
+ "AUTHCHALLENGE" SP "AUTHMETHOD=SAFECOOKIE"
+ SP "COOKIEFILE=" AuthCookieFile
+ SP "CLIENTCHALLENGE=" 2*HEXDIG / QuotedString
+ CRLF
+
+ The server will reject this command with error code 512, then close
+ the connection, if Tor is not using the file specified in the
+ AuthCookieFile argument as a controller authentication cookie file.
+
+ If the server accepts the command, the server reply format is:
+ "250-AUTHCHALLENGE"
+ SP "CLIENTRESPONSE=" 64*64HEXDIG
+ SP "SERVERCHALLENGE=" 2*HEXDIG
+ CRLF
+
+ The CLIENTCHALLENGE, CLIENTRESPONSE, and SERVERCHALLENGE values are
+ encoded/decoded in the same way as the argument passed to the
+ AUTHENTICATE command.
+
+ The CLIENTRESPONSE value is computed as:
+ HMAC-SHA256(HMAC-SHA256("Tor server-to-controller cookie authenticator",
+ CookieString)
+ ClientChallengeString)
+ (with the HMAC key as its first argument)
+
+ After a controller sends a successful AUTHCHALLENGE command, the
+ next command sent on the connection must be an AUTHENTICATE command,
+ and the only authentication string which that AUTHENTICATE command
+ will accept is:
+ HMAC-SHA256(HMAC-SHA256("Tor controller-to-server cookie authenticator",
+ CookieString)
+ ServerChallengeString)
+
+ [Unlike other commands besides AUTHENTICATE, AUTHCHALLENGE may be
+ used (but only once!) before AUTHENTICATE.]
+
+ [AUTHCHALLENGE was added in Tor FIXME.]
+
4. Replies
Reply codes follow the same 3-character format as used by SMTP, with the
--
1.7.8.3
Rationale:
The weird inner HMAC was meant to ensure that whatever impersonates
Tor's control port cannot even abuse a secret key meant to be used
with HMAC-SHA256.
Then I added the server-to-controller challenge-response
authentication step, to ensure that the server can only use a
controller as an HMAC oracle if it already knows the contents of the
cookie file. Now, the inner HMAC is just a not-very-efficient way
to keep controllers from using the server as an oracle for its own
challenges (it could be replaced with a hash function).
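The nested HMAC computations described in the patch above can be sketched in Python. This is an illustration of this draft's scheme (key strings taken from the proposal text), not of Tor's actual implementation; `cookie` is the contents of the cookie file and the challenge arguments are the decoded CLIENTCHALLENGE/SERVERCHALLENGE values.

```python
import hmac
import hashlib

# Fixed key strings from the proposal text.
SERVER_TO_CONTROLLER = b"Tor server-to-controller cookie authenticator"
CONTROLLER_TO_SERVER = b"Tor controller-to-server cookie authenticator"

def client_response(cookie: bytes, client_challenge: bytes) -> str:
    # Inner HMAC: key is the fixed string, message is the cookie.
    inner = hmac.new(SERVER_TO_CONTROLLER, cookie, hashlib.sha256).digest()
    # Outer HMAC: key is the inner digest, message is the challenge.
    return hmac.new(inner, client_challenge, hashlib.sha256).hexdigest()

def authenticate_string(cookie: bytes, server_challenge: bytes) -> str:
    inner = hmac.new(CONTROLLER_TO_SERVER, cookie, hashlib.sha256).digest()
    return hmac.new(inner, server_challenge, hashlib.sha256).hexdigest()
```

Note how the inner HMAC only derives a cookie-bound key; an attacker who can pose as the control port but does not know the cookie cannot compute either value.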
Filename: 194-mnemonic-urls.txt
Title: Mnemonic .onion URLs
Author: Sai, Alex Fink
Created: 29-Feb-2012
Status: Superseded
1. Overview
Currently, canonical Tor .onion URLs consist of a naked 80-bit hash[1]. This
is not something that users can even recognize for validity, let alone produce
directly. It is vulnerable to partial-match fuzzing attacks[2], where a
would-be MITM attacker generates a very similar hash and uses various social
engineering, wiki poisoning, or other methods to trick the user into visiting
the spoof site.
This proposal gives an alternative method for displaying and entering .onion
and other URLs, such that they will be easily remembered and generated by end
users, and easily published by hidden service websites, without any dependency
on a full domain name type system like e.g. namecoin[3]. This makes it easier
to implement (requiring only a change in the proxy).
This proposal could equally be used for IPv4, IPv6, etc, if normal DNS is for
some reason untrusted.
This is not a petname system[4], in that it does not allow service providers
or users[5] to associate a name of their choosing to an address[6]. Rather, it
is a mnemonic system that encodes the 80 bit .onion address into a
meaningful[7] and memorable sentence. A full petname system (based on
registration of some kind, and allowing for shorter, service-chosen URLs) can
be implemented in parallel[8].
This system has the three properties of being secure, distributed, and
human-meaningful — it just doesn't also have choice of name (except of course
by brute force creation of multiple keys to see if one has an encoding the
operator likes).
This is inspired by Jonathan Ackerman's "Four Little Words" proposal[9] for
doing the same thing with IPv4 addresses. We just need to handle 80+ bits, not
just 32 bits.
It is similar to Markus Jakobsson & Ruj Akavipat's FastWord system[10], except
that it does not permit user choice of passphrase, does not know what URL a
user will enter (vs verifying against a single stored password), and again has
to encode significantly more data.
This is also similar to RFC1751[11], RFC2289[12], and multiple other
fingerprint encoding systems[13] (e.g. PGPfone[14] using the PGP
wordlist[15], and Arturo Filatsò's OnionURL[16]), but we aim to make something
that's as easy as possible for users to remember — and significantly easier
than just a list of words or pseudowords, which we consider only useful as an
active confirmation tool, not as something that can be fully memorized and
recalled, like a normal domain name.
2. Requirements
2.1. encodes at least 80 bits of random data (preferably more, eg for a
checksum)
2.2. valid, visualizable English sentence — not just a series of words[17]
2.3. words are common enough that non-native speakers and bad spellers will have
minimum difficulty remembering and producing (perhaps with some spellcheck help)
2.4. not syntactically confusable (e.g. order should not matter)
2.5. short enough to be easily memorized and fully recalled at will, not just
recognized
2.6. no dependency on an external service
2.7. dictionary size small enough to be reasonable for end users to download as
part of the onion package
2.8. consistent across users (so that websites can e.g. reinforce their random
hash's phrase with a clever drawing)
2.9. not create offensive sentences that service providers will reject
2.10. resistant against semantic fuzzing (e.g. by having uniqueness against
WordNet synsets[18])
3. Possible implementations
This section is intentionally left unfinished; full listing of template
sentences and the details of their parser and generating implementation is
co-dependent on the creation of word class dictionaries fulfilling the above
criteria. Since that's fairly labor-intensive, we're pausing at this stage for
input first, to avoid wasting work.
3.1. Have a fixed number of template sentences, such as:
1. Adj subj adv vtrans adj obj
2. Subj and subj vtrans adj obj
3. … etc
For a 6 word sentence, with 8 (3b) templates, we need ~12b (4k word)
dictionaries for each word category.
If multiple words of the same category are used, they must either play
different grammatical roles (eg subj vs obj, or adj on a different item), be
chosen from different dictionaries, or there needs to be an order-agnostic way
to join them at the bit level. Preferably this should be avoided, just to
prevent users forgetting the order.
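The template-plus-dictionary encoding of 3.1 can be sketched as mixed-radix arithmetic on the hash. The dictionaries here are hypothetical stand-ins for the word-class lists the proposal leaves unspecified; the point is only that the mapping is deterministic and reversible.

```python
def encode(n: int, dicts, n_templates: int = 8):
    # Peel digits off the hash value: first a template index, then one
    # word per sentence slot from that slot's (hypothetical) dictionary.
    template = n % n_templates
    n //= n_templates
    words = []
    for d in dicts:
        words.append(d[n % len(d)])
        n //= len(d)
    return template, words

def decode(template: int, words, dicts, n_templates: int = 8) -> int:
    # Reverse the mixed-radix expansion to recover the original value.
    n = 0
    for d, w in reversed(list(zip(dicts, words))):
        n = n * len(d) + d.index(w)
    return n * n_templates + template
```

With 8 templates and six 4096-word dictionaries this carries 3 + 6*12 = 75 bits per sentence, so real dictionaries would need to be slightly larger (or the sentence slightly longer) to cover the full 80 bits plus any checksum.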
3.2. As 3.1, but treat sentence generation as decoding a prefix code, and have
a Huffman code for each word class.
We suppose it’s okay if the generated sentence has a few more words than it
might, as long as they’re common lean words. E.g., for adjectives, "good"
might cost only six bits while "unfortunate" costs twelve.
Choice between different sentence syntaxes could be worked into the prefix
code as well, and potentially done separately for each syntactic constituent.
4. Usage
To form a mnemonic .onion URL, just join the words with dashes or underscores,
stripping minimal words like 'a', 'the', 'and' etc., and append '.onion'. This
can be readily distinguished from standard hash-style .onion URLs by form.
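A minimal sketch of this URL-forming step, assuming a hypothetical stopword list (the proposal only says 'a', 'the', 'and' etc.):

```python
# Illustrative stopword set; the real list would be fixed by the spec.
STOPWORDS = {"a", "the", "and", "of"}

def mnemonic_to_onion(words):
    # Drop stopwords, lowercase, join with dashes, append ".onion".
    kept = [w.lower() for w in words if w.lower() not in STOPWORDS]
    return "-".join(kept) + ".onion"
```

The result is trivially distinguishable from a 16-character base32 hash label, so a client can dispatch on form alone.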
Translation should take place at the client — though hidden service servers
should also be able to output the mnemonic form of hashes too, to assist
website operators in publishing them (e.g. by posting an amusing drawing of
the described situation on their website to reinforce the mnemonic).
After the translation stage of name resolution, everything proceeds as normal
for an 80-bit hash onion URL.
The user should be notified of the mnemonic form of hash URL in some way, and
have an easy way in the client UI to translate mnemonics to hashes and vice
versa. For the purposes of browser URLs and the like though, the mnemonic
should be treated on par with the hash; if the user enters a mnemonic URL they
should not be redirected to the hash version. (If anything, the opposite
may be true, so that users become used to seeing and verifying the mnemonic
version of hash URLs, and gain the security benefits against partial-match
fuzzing.)
Ideally, inputs that don't validly resolve should have a response page served
by the proxy that uses a simple spell-check system to suggest alternate domain
names that are valid hash encodings. This could hypothetically be done inline
in URL input, but would require changes in the browser (normally domain names
aren't subject to spellcheck), and this avoids that implementation problem.
5. International support
It is not possible for this scheme to support non-English languages without
a) (usually) Unicode in domains (which is not yet well supported by browsers),
and
b) fully customized dictionaries and phrase patterns per language
The scheme must not be used in an attempted 'translation' by simply replacing
English words with glosses in the target language. Several of the necessary
features would be completely mangled by this (e.g. other languages have
different synonym, homonym, etc groupings, not to mention completely different
grammar).
It is unlikely a priori that URLs constructed using a non-English
dictionary/pattern setup would in any sense 'translate' semantically to
English; more likely is that each language would have completely unrelated
encodings for a given hash.
We intend to only make an English version at first, to avoid these issues
during testing.
________________
[1] https://trac.torproject.org/projects/tor/wiki/doc/HiddenServiceNames
https://gitweb.torproject.org/torspec.git/blob/HEAD:/address-spec.txt
[2] http://www.thc.org/papers/ffp.html
[3] http://dot-bit.org/Namecoin
[4] https://en.wikipedia.org/wiki/Zooko's_triangle
[5] https://addons.mozilla.org/en-US/firefox/addon/petname-tool/
[6] However, service operators can generate a large number of hidden service
descriptors and check whether their hashes result in a desirable phrasal
encoding (much like certain hidden services currently use brute force generated
hashes to ensure their name is the prefix of their raw hash). This won't get you
whatever phrase you want, but will at least improve the likelihood that it's
something amusing and acceptable.
[7] "Meaningful" here inasmuch as e.g. "Barnaby thoughtfully mangles simplistic
yellow camels" is an absurdist but meaningful sentence. Absurdness is a feature,
not a bug; it decreases the probability of mistakes if the scenario described is
not one that the user would try to fit into a template of things they have
previously encountered IRL. See research into linguistic schema for further
details.
[8] https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt
[9] http://blog.rabidgremlin.com/2010/11/28/4-little-words/
[10] http://fastword.me/
[11] https://tools.ietf.org/html/rfc1751
[12] http://tools.ietf.org/html/rfc2289
[13] https://github.com/singpolyma/mnemonicode
http://mysteryrobot.com
https://github.com/zacharyvoase/humanhash
[14] http://www.mathcs.duq.edu/~juola/papers.d/icslp96.pdf
[15] http://en.wikipedia.org/wiki/PGP_word_list
[16] https://github.com/hellais/Onion-url
https://github.com/hellais/Onion-url/blob/master/dev/mnemonic.py
[17] http://www.reddit.com/r/technology/comments/ecllk
[18] http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html
Filename: 195-TLS-normalization-for-024.txt
Title: TLS certificate normalization for Tor 0.2.4.x
Author: Jacob Appelbaum, Gladys Shufflebottom, Nick Mathewson, Tim Wilde
Created: 6-Mar-2012
Status: Dead
Target: 0.2.4.x
0. Introduction
The TLS (Transport Layer Security) protocol was designed for security
and extensibility, not for uniformity. Because of this, it's not
hard for an attacker to tell one application's use of TLS from
another's.
We propose improvements to Tor's current TLS certificates to
reduce the distinguishability of Tor traffic.
0.1. History
This draft is based on parts of Proposal 179, by Jacob Appelbaum
and Gladys Shufflebottom, but removes some already implemented parts
and replaces others.
0.2. Non-Goals
We do not address making TLS harder to distinguish after the
handshake is done. We also do not discuss TLS improvements not
related to distinguishability (such as increased key size, algorithm
choice, and so on).
1. Certificate Issues
Currently, Tor generates certificates according to a fixed pattern,
where lifetime is fairly small, the certificate Subject DN is a
single randomly generated CN, and the certificate Issuer DN is a
different single randomly generated CN.
We propose several ways to improve this below.
1.1. Separate initial certificate from link certificate
When Tor is using the v2 or v3 link handshake (see tor-spec.txt), it
currently presents an initial handshake authenticating the link key
with the identity key.
We propose instead that Tor should be able to present an arbitrary
initial certificate (so long as its key matches the link key used in
the actual TLS handshake), and then present the real certificate
authenticating the link key during the Tor handshake. (That is,
during the v2 handshake's renegotiation step, or in the v3
handshake's CERTS cell.)
The TLS protocol and the Tor handshake protocol both allow this, and
doing so will give us more freedom for the alternative certificate
presentation ideas below.
1.2. Allow externally generated certificates
It should be possible for a Tor relay operator to generate and
provide their own certificate and secret key. This will allow a relay or
bridge operator to use a certificate signed by any member of the "SSL
mafia,"[*] to generate their own self-signed certificate, and so on.
For compatibility, we need to require that the key be an RSA secret
key, of at least 1024 bits, generated with e=65537.
As a proposed interface, let's require that the certificate be stored
in ${DataDir}/tls_cert/tls_certificate.crt , that the secret key be
stored in ${DataDir}/tls_cert/private_tls_key.key , and that they be
used instead of generating our own certificate whenever the new
boolean option "ProvidedTLSCert" is set to true.
(Alternative interface: Allow the cert and key to be stored
wherever, and have the user provide their respective locations with
TLSCertificateFile and TLSCertificateKeyFile options.)
1.3. Longer certificate lifetimes
Tor's current certificates aren't long-lived, which makes them
different from most other certificates in the wild.
Typically, certificates are valid for a year, so let's use that as
our default lifetime. [TODO: investigate whether "a year" for most
CAs and self-signed certs have their validity dates running for a
calendar year ending at the second of issue, one calendar year
ending at midnight, or 86400*(365.5 +/- .5) seconds, or what.]
There are two ways to approach this. We could continue our current
certificate management approach where we frequently generate new
certificates (albeit with longer lifetimes), or we could make a cert,
store it to disk, and use it for all or most of its declared
lifetime.
If we continue to use fairly short lifetimes for the _true_ link
certificates (the ones presented during the Tor handshake), then
presenting long-lived certificates doesn't hurt us much: in the event
of a link-key-only compromise, the adversary still couldn't actually
impersonate a server for long.[**]
Using shorter-lived certificates with long nominal lifetimes doesn't
seem to buy us much. It would let us rotate link keys more
frequently, but we're already getting forward secrecy from our use of
diffie-hellman key agreement. Further, it would make our behavior
look less like regular TLS behavior, where certificates are typically
used for most of their nominal lifetime. Therefore, let's store and
use certs and link keys for the full year.
1.4. Self-signed certificates with better DNs
When we generate our own certificates, we currently set no DN fields
other than the commonName. This behavior isn't terribly common:
users of self-signed certs usually/often set other fields too.
[TODO: find out frequency.]
Unfortunately, it appears that no particular other set of fields or
way of filling them out _is_ universal for self-signed certificates,
or even particularly common. The most common schema seem to be for
things most censors wouldn't mind blocking, like embedded devices.
Even the default openssl schema, though common, doesn't appear to
represent a terribly large fraction of self-signed websites. [TODO:
get numbers here.]
So the best we can do here is probably to reproduce the process that
results in self-signed certificates originally: let the bridge and relay
operators pick the DN fields themselves. This is an annoying
interface issue, and wants a better solution.
1.5. Better commonName values
Our current certificates set the commonName to a randomly generated
field like www.rmf4h4h.net. This is also a weird behavior: nearly
all TLS certs used for web purposes will have a hostname that
resolves to their IP.
The simplest way to get a plausible commonName here would be to do a
reverse lookup on our IP and try to find a good hostname. It's not
clear whether this would actually work out in practice, or whether
we'd just get dynamic-IP-pool hostnames everywhere blocked when they
appear in certificates.
Alternatively, if we are told a hostname in our Torrc (possibly in
the Address field), we could try to use that.
2. TLS handshake issues
2.1. Session ID.
Currently we do not send an SSL session ID, as we do not support session
resumption. However, Apache (and likely other major SSL servers) do have
this support, and do send a 32 byte SSLv3/TLSv1 session ID in their Server
Hello cleartext. We should do the same to avoid an easy fingerprinting
opportunity. It may be necessary to lie to OpenSSL to claim that we are
tracking session IDs to cause it to generate them for us.
(We should not actually support session resumption.)
[*] "Hey buddy, it's a nice website you've got there. Sure would be a
shame if somebody started poppin' up warnings on all your users'
browsers, tellin' everbody that you're _insecure_..."
[**] Furthermore, a link-key-only compromise isn't very realistic atm;
nearly any attack that would let an adversary learn a link key would
probably let the adversary learn the identity key too. The most
plausible way would probably be an implementation bug in OpenSSL or
something.
Filename: 196-transport-control-ports.txt
Title: Extended ORPort and TransportControlPort
Author: George Kadianakis, Nick Mathewson
Created: 14 Mar 2012
Status: Closed
Implemented-In: 0.2.5.2-alpha
1. Overview
Proposal 180 defined Tor pluggable transports, a way to decouple
protocol-level obfuscation from the core Tor protocol in order to
better resist client-bridge censorship. This is achieved by
introducing pluggable transport proxies, programs that obfuscate Tor
traffic to resist DPI detection.
Proposal 180 defined a way for pluggable transport proxies to
communicate with local Tor clients and bridges, so as to exchange
traffic. This document extends this communication protocol, so that
pluggable transport proxies can exchange arbitrary operational
information and metadata with Tor clients and bridges.
2. Motivation
The communication protocol specified in Proposal 180 gives a way
for transport proxies to announce the IP address of their clients
to tor. Still, modern pluggable transports might have more (?)
needs than this. For example:
1. Tor might want to inform pluggable transport proxies on how to
rate-limit incoming or outgoing connections.
2. Server pluggable transport proxies might want to pass client
information to an anti-active-probing system controlled by tor.
3. Tor might want to temporarily stop a transport proxy from
obfuscating traffic.
To satisfy the above use cases, there must be real-time
communication between the tor process and the pluggable transport
proxy. To achieve this, this proposal refactors the Extended ORPort
protocol specified in Proposal 180, and introduces a new port,
TransportControlPort, whose sole role is the exchange of control
information between transport proxies and tor.
Specifically, transport proxies deliver each connection to the
"Extended ORPort", where they provide metadata and agree on an
identifier for each tunneled connection. Once this handshake
occurs, the OR protocol proceeds unchanged.
Additionally, each transport maintains a single connection to Tor's
"TransportControlPort", where it receives instructions from Tor
about rate-limiting on individual connections.
3. The new port protocols
3.1. The new extended ORPort protocol
3.1.1. Protocol
The extended server port protocol is as follows:
COMMAND [2 bytes, big-endian]
BODYLEN [2 bytes, big-endian]
BODY [BODYLEN bytes]
Commands sent from the transport proxy to the bridge are:
[0x0000] DONE: There is no more information to give. The next
bytes sent by the transport will be those tunneled over it.
(body ignored)
[0x0001] USERADDR: an address:port string that represents the
client's address.
[0x0002] TRANSPORT: a string of the name of the pluggable
transport currently in effect on the connection.
Replies sent from tor to the proxy are:
[0x1000] OKAY: Send the user's traffic. (body ignored)
[0x1001] DENY: Tor would prefer not to get more traffic from
this address for a while. (body ignored)
[0x1002] CONTROL: a NUL-terminated "identifier" string. The
pluggable transport proxy must use the "identifier" to access
the TransportControlPort. See the 'Association and identifier
creation' section below.
Parties MUST ignore command codes that they do not understand.
If the server receives a recognized command that does not parse, it
MUST close the connection to the client.
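The framing above is simple length-prefixed TLV. A sketch of packing and unpacking it, using hypothetical helper names:

```python
import struct

def pack_cmd(command: int, body: bytes = b"") -> bytes:
    # COMMAND and BODYLEN are 2-byte big-endian fields, followed by BODY.
    return struct.pack(">HH", command, len(body)) + body

def unpack_cmd(data: bytes):
    # Returns (command, body); trailing bytes beyond BODYLEN are ignored.
    command, bodylen = struct.unpack(">HH", data[:4])
    return command, data[4:4 + bodylen]
```

For example, a USERADDR command is `pack_cmd(0x0001, b"1.2.3.4:5678")`, and a DONE command is `pack_cmd(0x0000)` with an empty body.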
3.1.2. Command descriptions
3.1.2.1. USERADDR
An ASCII string holding the TCP/IP address of the client of the
pluggable transport proxy. A Tor bridge SHOULD use that address to
collect statistics about its clients. Recognized formats are:
1.2.3.4:5678
[1:2::3:4]:5678
(Current Tor versions may accept other formats, but this is a bug:
transports MUST NOT send them.)
The string MUST NOT be NUL-terminated.
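A strict parser for the two recognized USERADDR formats might look like this (a sketch; the regex and helper name are not from the spec):

```python
import re

# Accepts exactly the two formats the spec recognizes:
# "1.2.3.4:5678" and "[1:2::3:4]:5678".
USERADDR_RE = re.compile(
    r"^(?:(\d{1,3}(?:\.\d{1,3}){3})|\[([0-9A-Fa-f:]+)\]):(\d{1,5})$")

def parse_useraddr(s: str):
    m = USERADDR_RE.match(s)
    if m is None:
        raise ValueError("unrecognized USERADDR format")
    host = m.group(1) or m.group(2)
    return host, int(m.group(3))
```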
3.1.2.2. TRANSPORT
An ASCII string holding the name of the pluggable transport used by
the client of the pluggable transport proxy. A Tor bridge that
supports multiple transports SHOULD use that information to collect
statistics about the popularity of individual pluggable transports.
The string MUST not be NUL-terminated.
Pluggable transport names are C-identifiers and Tor MUST check them
for correctness.
3.2. The new TransportControlPort protocol
The TransportControlPort protocol is as follows:
CONNECTIONID[16 bytes, big-endian]
COMMAND [2 bytes, big-endian]
BODYLEN [2 bytes, big-endian]
BODY [BODYLEN bytes]
Commands sent from the transport proxy to the bridge:
[0x0001] RATE_LIMITED: Message confirming that the rate limiting
request of the bridge was carried out successfully (body
ignored). See the 'Rate Limiting' section below.
[0x0002] NOT_RATE_LIMITED: Message notifying that the transport
proxy failed to carry out the rate limiting request of the
bridge (body ignored). See the 'Rate Limiting' section below.
Configuration commands sent from the bridge to the transport
proxy are:
[0x1001] NOT_ALLOWED: Message notifying that the CONNECTIONID
could not be matched with an authorized connection ID. The
bridge SHOULD shutdown the connection.
[0x1002] RATE_LIMIT: Carries information on how the pluggable
transport proxy should rate-limit its traffic. See the 'Rate
Limiting' section below.
CONNECTIONID should carry the connection identifier described in the
'Association and identifier creation' section.
Parties should ignore command codes that they do not understand.
3.3. Association and identifier creation
For Tor and a transport proxy to communicate using the
TransportControlPort, an identifier must have already been negotiated
using the 'CONTROL' command of Extended ORPort.
The TransportControlPort identifier should not be predictable by a
user who hasn't received a 'CONTROL' command from the Extended
ORPort. For this reason, the TransportControlPort identifier should
not be cryptographically-weak or deterministically created.
Tor MUST create its identifiers by generating 16 bytes of random
data.
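Identifier creation and the TransportControlPort framing from 3.2 can be sketched together; `os.urandom` satisfies the requirement that identifiers be unpredictable, and the helper names are hypothetical:

```python
import os
import struct

def new_connection_id() -> bytes:
    # 16 bytes of cryptographically strong randomness, per section 3.3.
    return os.urandom(16)

def pack_control_cmd(conn_id: bytes, command: int, body: bytes = b"") -> bytes:
    # CONNECTIONID (16 bytes) then COMMAND and BODYLEN as 2-byte
    # big-endian fields, then BODY.
    assert len(conn_id) == 16
    return conn_id + struct.pack(">HH", command, len(body)) + body
```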
4. Configuration commands
4.1. Rate Limiting
A Tor relay should be able to inform a transport proxy in real-time
about its rate-limiting needs.
This can be achieved by using the TransportControlPort and sending a
'RATE_LIMIT' command to the transport proxy.
The body of the 'RATE_LIMIT' command should contain two integers,
4 bytes each, in big-endian format. The two numbers should represent
the bandwidth rate and bandwidth burst respectively in 'bytes per
second' which the transport proxy must set as its overall
rate-limiting setting.
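As a sketch, the RATE_LIMIT body is just two packed 32-bit values (helper names are illustrative):

```python
import struct

def rate_limit_body(rate: int, burst: int) -> bytes:
    # Bandwidth rate and burst, each a 4-byte big-endian integer,
    # both in bytes per second.
    return struct.pack(">II", rate, burst)

def parse_rate_limit_body(body: bytes):
    return struct.unpack(">II", body)
```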
When the transport proxy sets the appropriate rate limiting, it
should send back a 'RATE_LIMITED' command. If it fails while setting
up rate limiting, it should send back a 'NOT_RATE_LIMITED' command.
After sending a 'RATE_LIMIT' command, the tor bridge MAY want to
stop pushing data to the transport proxy until it receives a
'RATE_LIMITED' command. If, instead, it receives a 'NOT_RATE_LIMITED'
command it MAY want to shutdown its connections to the transport
proxy.
5. Authentication
To defend against cross-protocol attacks on the Extended ORPort,
proposal 213 defines an authentication scheme that should be used to
protect it.
If the Extended ORPort is enabled, Tor should regenerate the cookie
file of proposal 213 on startup and store it in
$DataDirectory/extended_orport_auth_cookie.
The location of the cookie can be overridden by using the
configuration file parameter ExtORPortCookieAuthFile, which is
defined as:
ExtORPortCookieAuthFile <path>
where <path> is a filesystem path.
XXX should we also add an ExtORPortCookieFileGroupReadable torrc option?
6. Security Considerations
Extended ORPort or TransportControlPort do _not_ provide link
confidentiality, authentication or integrity. Sensitive data, like
cryptographic material, should not be transferred through them.
An attacker with superuser access is able to sniff network traffic,
and capture TransportControlPort identifiers and any data passed
through those ports.
Tor SHOULD issue a warning if the bridge operator tries to bind
Extended ORPort or TransportControlPort to a non-localhost address.
Pluggable transport proxies SHOULD issue a warning if they are
instructed to connect to a non-localhost Extended ORPort or
TransportControlPort.
7. Future
In the future, we might have pluggable transports which require the
_client_ transport proxy to use the TransportControlPort and exchange
control information with the Tor client. The current proposal doesn't
yet support this, but we should not add functionality that will
prevent it from being possible.
Filename: 197-postmessage-ipc.txt
Title: Message-based Inter-Controller IPC Channel
Author: Mike Perry
Created: 16-03-2012
Status: REJECTED
Overview
This proposal seeks to create a means for inter-controller
communication using the Tor Control Port.
Motivation
With the advent of pluggable transports, bridge discovery mechanisms,
and tighter browser-Vidalia integration, we're going to have an
increasing number of collaborating Tor controller programs
communicating with each other. Rather than define new pairwise IPC
mechanisms for each case, we will instead create a generalized
message-passing mechanism through the Tor Control Port.
Control Protocol Specification Changes
CONTROLLERNAME command
Sent from the client to the server. The syntax is:
"CONTROLLERNAME" SP ControllerID
ControllerID = 1*(ALNUM / "_")
Server returns "250 OK" and records the ControllerID to use for
this control port connection for messaging information if successful,
or "553 Controller name already in use" if the name is in use by
another controller, or if an attempt is made to register the special
names "all" or "unset".
[CONTROLLERNAME need not be issued to send POSTMESSAGE commands,
and CONTROLLERNAME may be unsupported by initial POSTMESSAGE
implementations in Tor.]
POSTMESSAGE command
Sent from the client to the server. The syntax is:
"POSTMESSAGE" SP "@" DestControllerID SP LineItem CRLF
DestControllerID = "all" / 1*(ALNUM / "_")
If DestControllerID is "all", the message will be posted to all
controllers that have "SETEVENTS POSTMESSAGE" set. Otherwise, the
message should be posted to the controller with the appropriate
ControllerID.
Server returns "250 OK" if successful, or "552 Invalid destination
controller name" if the name is not registered.
[Initial implementations may require DestControllerID always be
"all"]
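A sketch of building the POSTMESSAGE command line from the grammar above (the helper name is hypothetical; validation follows the `DestControllerID = "all" / 1*(ALNUM / "_")` rule):

```python
import re

def postmessage_line(dest: str, message: str) -> str:
    # DestControllerID is "all" or one or more alphanumerics/underscores.
    if not re.fullmatch(r"[A-Za-z0-9_]+", dest):
        raise ValueError("invalid DestControllerID")
    return f"POSTMESSAGE @{dest} {message}\r\n"
```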
POSTMESSAGE event
"650" SP "POSTMESSAGE" SP MessageID SP SourceControllerID SP
"@" DestControllerID SP LineItem CRLF
MessageID = 1*DIGIT
SourceControllerID = "unset" / 1*(ALNUM / "_")
DestControllerID = "all" / 1*(ALNUM / "_")
MessageID is an incrementing integer identifier that uniquely
identifies this message to all controllers.
The SourceControllerID is the value from the sending
controller's CONTROLLERNAME command, or "unset" if the
CONTROLLERNAME command was not used or unimplemented.
GETINFO commands
"recent-messages" -- Retrieves messages
sent to ControllerIDs that match the current controller
in POSTMESSAGE event format. This list should be generated
on the fly, to handle disconnecting controllers.
"new-messages" -- Retrieves the last 10 "unread" messages
sent to this controller, in POSTMESSAGE event format. If
SETEVENTS POSTMESSAGE was set, this command should always return
nothing.
"list-controllers" -- Retrieves a list of all connected controllers
with either their registered ControllerID or "unset".
Implementation plan
The POSTMESSAGE protocol is designed to be incrementally deployable.
Initial implementations are only expected to implement broadcast
capabilities and SETEVENTS based delivery. CONTROLLERNAME need not be
supported, nor do non-"@all" POSTMESSAGE destinations.
To support command-based controllers (which do not use SETEVENTS) such
as Torbutton, at minimum the "GETINFO recent-messages" command is
needed. However, Torbutton does not have immediate need for this
protocol.
Filename: 198-restore-clienthello-semantics.txt
Title: Restore semantics of TLS ClientHello
Author: Nick Mathewson
Created: 19-Mar-2012
Status: Closed
Target: 0.2.4.x
Status:
Tor 0.2.3.17-beta implements the client-side changes, and no longer
advertises openssl-supported TLS ciphersuites we don't have.
Overview:
Currently, all supported Tor versions try to imitate an older version
of Firefox when advertising ciphers in their TLS ClientHello. This
feature is intended to make it harder for a censor to distinguish a
Tor client from other TLS traffic. Unfortunately, it makes the
contents of the ClientHello unreliable: a server cannot conclude that
a cipher is really supported by a Tor client simply because it is
advertised in the ClientHello.
This proposal suggests an approach for restoring sanity to our use of
ClientHello, so that we still avoid ciphersuite-based fingerprinting,
but allow nodes to negotiate better ciphersuites than they are
allowed to negotiate today.
Background reading:
Section 2 of tor-spec.txt describes our current baroque link
negotiation scheme. Proposals 176 and 184 describe more information
about how it got that way.
Bug 4744 is a big part of the motivation for this proposal: we want
to allow Tors to advertise even more ciphers, some of which we would
actually prefer to the ones we are using now.
What you need to know about the TLS handshake is that the client
sends a list of all the ciphersuites that it supports in its
ClientHello message, and then the server chooses one and tells the
client which one it picked.
Motivation and constraints:
We'd like to use some of the ECDHE TLS ciphersuites, since they allow
us to get better forward-secrecy at lower cost than our current
DH-1024 usage. But right now, we can't ever use them, since Tor will
advertise them whether or not it has a version of OpenSSL that
supports them.
(OpenSSL before 1.0.0 did not support ECDHE ciphersuites; OpenSSL
before 1.0.0e or so had some security issues with them.)
We cannot have the rule be "Tors must only advertise ciphersuites
that they can use", since current Tors will advertise such
ciphersuites anyway.
We cannot have the rule be "Tors must support every ECDHE ciphersuite
on the following list", since current Tors don't do all that, and
since one prominent Linux distribution builds OpenSSL without ECC
support because of patent/freedom fears.
Fortunately, nearly every ciphersuite that we would like to advertise
to imitate FF8 (see bug 4744) is currently supported by OpenSSL 1.0.0
and later. This enables the following proposal to work:
Proposed spec changes:
I propose that the rules for handling ciphersuites at the server side
become the following:
If the ciphersuite list in the ClientHello contains no ciphers other than
the following[*], it indicates that the Tor v1 link protocol is in use.
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
If the advertised ciphersuites in the ClientHello are _exactly_[*]
the following, they indicate that the Tor v2+ link protocol is in
use, AND that the ClientHello may have unsupported ciphers. In this
case, the server may choose DHE_RSA_WITH_AES_128_CBC_SHA or
DHE_RSA_WITH_AES_256_CBC_SHA, but may not choose any other cipher.
TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA
TLS1_DHE_RSA_WITH_AES_256_SHA
TLS1_DHE_DSS_WITH_AES_256_SHA
TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA
TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA
TLS1_RSA_WITH_AES_256_SHA
TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA
TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
TLS1_ECDHE_RSA_WITH_RC4_128_SHA
TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA
TLS1_DHE_RSA_WITH_AES_128_SHA
TLS1_DHE_DSS_WITH_AES_128_SHA
TLS1_ECDH_RSA_WITH_RC4_128_SHA
TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA
TLS1_ECDH_ECDSA_WITH_RC4_128_SHA
TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA
SSL3_RSA_RC4_128_MD5
SSL3_RSA_RC4_128_SHA
TLS1_RSA_WITH_AES_128_SHA
TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA
TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA
SSL3_EDH_RSA_DES_192_CBC3_SHA
SSL3_EDH_DSS_DES_192_CBC3_SHA
TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA
TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA
SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA
SSL3_RSA_DES_192_CBC3_SHA
[*] The "extended renegotiation is supported" ciphersuite, 0x00ff, is
not counted when checking the list of ciphersuites.
Otherwise, the ClientHello has these semantics: The inclusion of any
cipher supported by OpenSSL 1.0.0 means that the client supports it,
with the exception of
SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA
which is never supported. Clients MUST advertise support for at least one of
TLS_DHE_RSA_WITH_AES_256_CBC_SHA or TLS_DHE_RSA_WITH_AES_128_CBC_SHA.
The server MUST choose a ciphersuite with ephemeral keys for forward
secrecy; MUST NOT choose a weak or null ciphersuite; and SHOULD NOT
choose any cipher other than AES or 3DES.
Discussion and consequences:
Currently, OpenSSL 1.0.0 (in its default configuration) supports every
cipher that we would need in order to give the same list as Firefox
versions 8 through 11 give in their default configuration, with the
exception of the FIPS ciphersuite above. Therefore, we will be able
to fake the new ciphersuite list correctly in all of our bundles that
include OpenSSL, and on every version of Unix that keeps up-to-date.
However, versions of Tor compiled to use older versions of OpenSSL, or
versions of OpenSSL with some ciphersuites disabled, will no
longer give the same ciphersuite lists as other versions of Tor. On
these platforms, Tor clients will no longer impersonate Firefox.
Users who need to do so will have to download one of our bundles, or
use a non-system OpenSSL.
The proposed spec change above tries to future-proof ourselves by not
declaring that we support every declared cipher, in case we someday
need to handle a new Firefox version. If a new Firefox version
comes out that uses ciphers not supported by OpenSSL 1.0.0, we will
need to define whether clients may advertise its ciphers without
supporting them; but existing servers will continue working whether
we decide yes or no.
The restriction to "servers SHOULD only pick AES or 3DES" is meant to
reflect our current behavior, not to represent a permanent refusal to
support other ciphers. We can revisit it later as appropriate, if for
some bizarre reason Camellia or Seed or Aria becomes a better bet than
AES.
Open questions:
Should the client drop connections if the server chooses a bad
cipher, or a suite without forward secrecy?
Can we get OpenSSL to support the dubious FIPS suite excluded above,
in order to remove a distinguishing opportunity? It is not so simple
as just editing the SSL_CIPHER list in s3_lib.c, since the nonstandard
SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA cipher is (IIUC) defined to use the
TLS1 KDF, while declaring itself to be an SSL cipher (!).
Can we do anything to eventually allow the IE7+[**] cipher list as
well? IE does not support TLS_DHE_RSA_WITH_AES_{256,128}_SHA or
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, and so wouldn't work with current
Tor servers, which _only_ support those. It looks like the only
forward-secure ciphersuites that IE7+ *does* support are ECDHE ones,
and DHE+DSS ones. So if we want this flexibility, we could mandate
server-side ECDHE, or somehow get DHE+DSS support (which would play
havoc with our current certificate generation code IIUC), or say that
it is sometimes acceptable to have a non-forward-secure link
protocol[***]. None of these answers seems like a great one. Is one
best? Are there other options?
[**] Actually, I think it's the Windows SChannel cipher list we
should be looking at here.
[***] If we did _that_, we'd want to specify that CREATE_FAST could
never be used on a non-forward-secure link. Even so, I don't like the
implications of leaking cell types and circuit IDs to a future
compromise.
Filename: 199-bridgefinder-integration.txt
Title: Integration of BridgeFinder and BridgeFinderHelper
Author: Mike Perry
Created: 18-03-2012
Status: OBSOLETE
Overview
This proposal describes how the Tor client software can interact with
an external program that performs bridge discovery based on user input
or information extracted from a web page, QR Code, online game, or
other transmission medium.
Scope and Audience
This document describes how all of the components involved in bridge
discovery communicate this information to the rest of the Tor
software. The mechanisms of bridge discovery are not discussed, though
the design aims to be generalized enough to allow arbitrary new
discovery mechanisms to be added at any time.
This document is also written with the hope that those who wish to
implement BridgeFinder components and BridgeFinderHelpers can get
started immediately after a read of this proposal, so that development
of bridge discovery mechanisms can proceed in parallel to supporting
functionality improvements in the Tor client software.
Components and Responsibilities
0. Tor Client
The Tor Client is the piece of software that connects to the Tor
network (optionally using bridges) and provides a SOCKS proxy for
use by the user.
In initial implementations, the Tor Client will support only
standard bridges. In later implementations, it is expected to
support pluggable transports as defined by Proposal 180.
1. Tor Control Port
The Tor Control Port provides commands to perform operations,
configuration, and to obtain status information. It also optionally
provides event driven status updates.
In initial implementations, it will be used directly by BridgeFinder
to configure bridge information via GETINFO and SETCONF. It is covered
by control-spec.txt in the tor-specs git repository.
In later implementations, it will support the inter-controller
POSTMESSAGE IPC protocol as defined by Proposal 197 for use
in conveying bridge information to the Primary Controller.
2. Primary Controller
The Primary Controller is the program that launches and configures the
Tor client, and monitors its status.
On desktop platforms, this program is Vidalia, and it also launches
the Tor Browser. On Android, this program is Orbot. Orbot does not
launch a browser.
On all platforms, this proposal requires that the Primary Controller
will launch one or more BridgeFinder child processes and provide
them with authentication information through the environment variables
TOR_CONTROL_PORT and TOR_CONTROL_PASSWD.
In later implementations, the Primary Controller will be expected
to receive Bridge configuration information via the free-form
POSTMESSAGE protocol from Proposal 197, validate that information,
and hold that information for user approval.
3. BridgeFinder
A BridgeFinder is a program that discovers bridges and configures
Tor to use them.
In initial implementations, it is likely to be very dumb, and its main
purpose will be to serve as a layer of abstraction that should free
the Primary Controller from having to directly implement numerous ways
of retrieving bridges for various pluggable transports.
In later implementations, it may perform arbitrary network operations
to discover, authenticate to, and/or verify bridges, possibly using
informational hints provided by one or more external
BridgeFinderHelpers (see next component). It could even go so far as
to download new pluggable transport plugins and/or transform
definition files from arbitrary urls.
It will be launched by the Primary Controller and given access to the
Tor Control Port via the environment variables TOR_CONTROL_PORT and
TOR_CONTROL_PASSWD.
Initial control port interactions can be command driven via GETINFO
and SETCONF, and do not need to subscribe to or process control port
events. Later implementations will use POSTMESSAGE as defined in
Proposal 197 to pass command requests to Vidalia, which will parse
them and ask for user confirmation before deploying them. Use of
POSTMESSAGE may or may not require event driven operation, depending
on POSTMESSAGE implementation status (POSTMESSAGE is designed to
support both command and event driven operation, but it is possible
event driven operation will happen first).
4. BridgeFinderHelper
Each BridgeFinder implementation can optionally communicate with one
or more BridgeFinderHelpers. BridgeFinderHelpers are plugins to
external 3rd party applications that can inspect traffic, handle mime
types, or implement protocol handlers for accepting bridge discovery
information to pass to BridgeFinder. Example 3rd party applications
include Chrome, World of Warcraft, QR Code readers, or simple cut
and paste.
Due to the arbitrary nature of sandboxing that may be present in
various BridgeFinderHelper host applications, we do not mandate the
exact nature of the IPC between BridgeFinder instances and external
BridgeFinderHelper addons. However, please see the "Security Concerns"
section for common pitfalls to avoid.
5. Tor Browser
This is the browser the user uses with Tor. It is not useful until Tor
is properly configured to use bridges. It fails closed.
It is not expected to run BridgeFinderHelper plugin instances, unless
those plugin instances exist to ensure the user always has a pool of
working bridges available after successfully configuring an
initial bridge. Once all bridges fail, the Tor Browser is useless.
6. Non-Tor Browser (aka BridgeFinderHelper host)
This is the program the user uses for normal Internet activity, and
through which a BridgeFinderHelper plugin obtains bridges. It does not
have to be a browser: in advanced scenarios it may be a program such
as World of Warcraft instead.
Incremental Deployability
The system is designed to be incrementally deployable: Simple designs
should be possible to develop and test immediately. The design is
flexible enough to be easily upgraded as more advanced features become
available from both Tor and new pluggable transports.
Initial Implementation
In the simplest possible initial implementation, BridgeFinder will
only discover Tor Bridges as they are deployed today. It will use the
Tor Control Port to configure these bridges directly via the SETCONF
command. It may or may not receive bridge information from a
BridgeFinderHelper. In an even more degenerate case,
BridgeFinderHelper may even be Vidalia or Orbot itself, acting upon
user input from cut and paste.
Initial Implementation: BridgeFinder Launch
In the initial implementation, the Primary Controller will launch one
or more BridgeFinders, providing control port authentication
information to them through the environment variables TOR_CONTROL_PORT
and TOR_CONTROL_PASSWD.
BridgeFinder will then directly connect to the control port and
authenticate. Initial implementations should be able to function
without using SETEVENTS, and instead only using command-based
status inquiries and configuration (GETINFO and SETCONF).
Initial Implementation: Obtaining Bridge Hint Information
In the initial implementation, to test functionality,
BridgeFinderHelper can simply scrape bridges directly from
https://bridges.torproject.org.
In slightly more advanced implementations, a BridgeFinderHelper
instance may be written for use in the user's Non-Tor Browser. This
plugin could extract bridges from images, html comments, and other
material present in ad banners and slack space on unrelated pages.
BridgeFinderHelper would then communicate with the appropriate
BridgeFinder instance over an acceptable IPC mechanism. This proposal
does not seek to specify the nature of that IPC channel (because
BridgeFinderHelper may be arbitrarily constrained due to host
application sandboxing), but we do make several security
recommendations under the section "Security Concerns: BridgeFinder and
BridgeFinderHelper".
Initial Implementation: Configuring New Bridges
In the initial implementation, Bridge configuration will be done
directly though the control port using the SETCONF command.
Initial implementations will support only retrieval and configuration
of standard Tor Bridges. These are configured using SETCONF on the Tor
Control Port as follows:
SETCONF Bridge="IP:ORPort [fingerprint]"
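The initial flow described above (launch, read the environment variables, AUTHENTICATE, SETCONF) might look like the following minimal sketch. Password authentication is assumed, and reply handling is simplified; a real client must parse multi-line 250 replies as defined in control-spec.txt.

```python
import os
import socket

def setconf_bridge_cmd(bridge_line):
    # Build the SETCONF command for one standard bridge.  Enabling
    # UseBridges alongside it is this sketch's choice, not mandated above.
    return 'SETCONF Bridge="%s" UseBridges=1' % bridge_line

def configure_bridge(bridge_line):
    """Connect to the control port, authenticate, and configure one
    bridge.  Error handling is omitted for brevity."""
    port = int(os.environ["TOR_CONTROL_PORT"])
    password = os.environ["TOR_CONTROL_PASSWD"]
    sock = socket.create_connection(("127.0.0.1", port))
    f = sock.makefile("rw", newline="")  # control port uses CRLF framing

    def send(line):
        f.write(line + "\r\n")
        f.flush()
        reply = f.readline()
        if not reply.startswith("250"):
            raise RuntimeError("control port refused: " + reply)

    send('AUTHENTICATE "%s"' % password)
    send(setconf_bridge_cmd(bridge_line))
    send("QUIT")
    sock.close()
```

A BridgeFinder would call, e.g., configure_bridge("3.2.4.1:8080 09F911029D74E35BD84156C5635688C009F909F9") after validating the line.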
Future Implementations
In future implementations, the system can incrementally evolve in a
few different directions. As new pluggable transports are created, it
is conceivable that BridgeFinder may want to download new plugin
binaries (and/or new transport transform definition files) and
provide them to Tor.
Furthermore, it may prove simpler to deploy multiple concurrent
BridgeFinder+BridgeFinderHelper pairs as opposed to adding new
functionality to existing prototypes.
Finally, it is desirable for BridgeFinder to obtain approval
from the user before updating bridge configuration, especially for
cases where BridgeFinderHelper is automatically discovering bridges
in-band during Non-Tor activity.
The exact mechanisms for accomplishing these improvements are
described in the following subsections.
Future Implementations: BridgeFinder Launch and POSTMESSAGE handshake
The nature of the BridgeFinder launch and the environment variables
provided is not expected to change. However, future Primary Controller
implementations may decide to launch more than one BridgeFinder
instance side by side.
Additionally, to negotiate the IPC channel created by Proposal 197
for purposes of providing user confirmation, it is recommended that
BridgeFinder and the Primary Controller perform a handshake using
POSTMESSAGE upon launch, to establish that all parties properly
support the feature:
Primary Controller: "POSTMESSAGE @all Controller wants POSTMESSAGE v1.1"
BridgeFinder: "POSTMESSAGE @all BridgeFinder has POSTMESSAGE v1.0"
Primary Controller: "POSTMESSAGE @all Controller expects POSTMESSAGE v1.0"
BridgeFinder: "POSTMESSAGE @all BridgeFinder will POSTMESSAGE v1.0"
If this four-step handshake proceeds with an acceptable version,
BridgeFinder must use POSTMESSAGE to transmit SETCONF Bridge lines
(see "Future Implementations: Configuring New Bridges" below). If
POSTMESSAGE support is expected, but the handshake does not complete
for any reason, BridgeFinder should either exit or go dormant.
The exact nature of the version negotiation and exactly how much
backwards compatibility must be tolerated is unspecified.
"All-or-nothing" is a safe assumption to get started.
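Since Proposal 197 was never finalized, the message framing below is an assumption based only on the four example handshake lines above; this sketch just shows the "all-or-nothing" version check.

```python
import re

# Hypothetical parser for the POSTMESSAGE handshake lines shown above.
# The grammar is inferred from the four examples, not from a spec.
HANDSHAKE_RE = re.compile(
    r'^POSTMESSAGE @all (\w+) (?:wants|has|expects|will) POSTMESSAGE v(\d+\.\d+)$')

def parse_handshake(line):
    """Return (party, version) for a handshake line, or None."""
    m = HANDSHAKE_RE.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2)

def versions_agree(offered, accepted):
    # "All-or-nothing": proceed only on an exact version match;
    # otherwise BridgeFinder should exit or go dormant.
    return offered == accepted
```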
Future Implementations: Obtaining Bridge Hint Information
Future BridgeFinder implementations may download additional
information based on what is provided by BridgeFinderHelper. They
may fetch pluggable transport plugins, transformation parameters,
and other material.
Future Implementations: Configuring New Bridges
Future implementations will be concerned with providing two new pieces
of functionality with respect to configuring bridges: configuring
pluggable transports, and properly prompting the user before altering
Tor configuration.
There are two ways to tell Tor clients about pluggable transports
(as defined in Proposal 180).
On the control port, an external Proposal 180 transport will be
configured with
SETCONF ClientTransportPlugin=<method> socks5 <addr:port> [auth=X]
as in
SETCONF ClientTransportPlugin="trebuchet socks5 127.0.0.1:9999".
A managed proxy is configured with
SETCONF ClientTransportPlugin=<methods> exec <path> [options]
as in
SETCONF ClientTransportPlugin="trebuchet exec /usr/libexec/trebuchet --managed".
This example tells Tor to launch an external program to provide a
socks proxy for 'trebuchet' connections. The Tor client only
launches one instance of each external program with a given set of
options, even if the same executable and options are listed for
more than one method.
Pluggable transport bridges discovered for this transport by
BridgeFinder would then be set with:
SETCONF Bridge="trebuchet 3.2.4.1:8080 keyid=09F911029D74E35BD84156C5635688C009F909F9 rocks=20 height=5.6m".
For more information on pluggable transports and supporting Tor
configuration commands, see Proposal 180.
Future Implementations: POSTMESSAGE and User Confirmation
Because configuring even normal bridges alone can expose the user to
attacks, it is strongly desired to provide some mechanism to allow
the user to approve new bridges prior to their use, especially for
situations where BridgeFinderHelper is extracting them transparently
while the user performs unrelated activity.
If BridgeFinderHelper grows to the point where it is downloading new
transform definitions or plugins, user confirmation becomes
absolutely required.
To achieve user confirmation, we depend upon the POSTMESSAGE command
defined in Proposal 197.
If the POSTMESSAGE handshake succeeds, instead of sending SETCONF
commands directly to the control port, the commands will be wrapped
inside a POSTMESSAGE:
POSTMESSAGE @all SETCONF Bridge="www.example.com:8284"
Upon receiving this POSTMESSAGE, the Primary Controller will
validate it, evaluate it, store it to be later enabled by the
user, and alert the user that new bridges are available for
approval. It is only after the user has approved the new bridges
that the Primary Controller should then re-issue the SETCONF commands
to configure and deploy them in the tor client.
Additionally, see "Security Concerns: Primary Controller" for more
discussion on potential pitfalls with POSTMESSAGE.
Security Concerns
While automatic bridge discovery and configuration is quite compelling
and powerful, there are several serious security concerns that warrant
extreme care. We've broken them down by component.
Security Concerns: Primary Controller
In the initial implementation, Orbot and Vidalia must take care to
transmit the Tor Control password to BridgeFinder in such a way that
it does not end up in system logs, process list, or viewable by other
system users. The best known strategy for doing this is by passing the
information through exported environment variables.
Additionally, in future implementations, Orbot and Vidalia will need
to validate Proposal 197 POSTMESSAGE input before prompting the user.
POSTMESSAGE is a free-form message-passing mechanism. All sorts of
unexpected input may be passed through it by any other authenticated
Tor Controllers for their own unrelated communication purposes.
Minimal validation includes verifying that the POSTMESSAGE data is a
valid Bridge or ClientTransportPlugin line and is acceptable input for
SETCONF. All unexpected characters should be removed through using a
whitelist, and format and structure should be checked against a
regular expression. Additionally, the POSTMESSAGE string should not be
passed through any string processing engines that automatically decode
character escape encodings, to avoid arbitrary control port execution.
At the same time, POSTMESSAGE validation should be light. While fully
untrusted input is not expected due to the need for control port
authentication and BridgeFinder sanitation, complicated manual string
parsing techniques during validation should be avoided. Perform simple
easy-to-verify whitelist-based checks, and ignore unrecognized input.
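A whitelist check of the kind described above might look like this sketch. The pattern covers only plain "IP:ORPort [fingerprint]" bridge lines and is an illustration, not a complete grammar for SETCONF input.

```python
import re

# Character whitelist applied before any parsing: quote, backslash,
# and escape characters never reach the control port.
ALLOWED_CHARS = re.compile(r'^[A-Za-z0-9.: ]+$')

# Structural check: dotted-quad IP, port, optional 40-hex-digit fingerprint.
BRIDGE_RE = re.compile(
    r'^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})(?: ([A-Fa-f0-9]{40}))?$')

def validate_bridge_line(line):
    """Accept only simple, well-formed Bridge lines; reject everything else."""
    if not ALLOWED_CHARS.match(line):
        return False
    m = BRIDGE_RE.match(line)
    if m is None:
        return False
    return 1 <= int(m.group(2)) <= 65535  # sanity-check the ORPort
```

Per the discussion above, the validated string should then be passed on verbatim, without any further escape decoding.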
Beyond POSTMESSAGE validation, the manner in which the Primary
Controller achieves consent from the user is absolutely crucial to
security under this scheme. A simple "OK/Cancel" dialog is
insufficient to protect the user from the dangers of switching
bridges and running new plugins automatically.
Newly discovered bridge lines from POSTMESSAGE should be added to a
disabled set that the user must navigate to as an independent window
apart from any confirmation dialog. The user must then explicitly
enable recently added plugins by checking them off individually. We
need the user's brain to be fully engaged and aware that it is
interacting with Tor during this step. If they get an "OK/Cancel"
popup that interrupts their online game play, they will almost
certainly simply click "OK" just to get back to the game quickly.
The Primary Controller should transmit the POSTMESSAGE content to the
control port only after obtaining this out-of-band approval.
Security Concerns: BridgeFinder and BridgeFinderHelper
The unspecified nature of the IPC channel between BridgeFinder and
BridgeFinderHelper makes it difficult to make concrete security
suggestions. However, from past experience, the following best
practices must be employed to avoid security vulnerabilities:
1. Define a non-webby handshake and/or perform authentication
The biggest risk is that unexpected applications will be manipulated
into posting malformed data to the BridgeFinder's IPC channel as if it
were from BridgeFinderHelper. The best way to defend against this is
to require a handshake to properly complete before accepting input. If
the handshake fails at any point, the IPC channel must be abandoned
and closed. Do not continue scanning for good input after any bad
input has been encountered.
Additionally, if possible, it is wise to establish a shared secret
between BridgeFinder and BridgeFinderHelper through the filesystem or
any other means available for use in authentication. For a good
example of how to use such a shared secret properly for
authentication, see Trac Ticket #5185 and/or the SafeCookie Tor
Control Port authentication mechanism.
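Such a shared secret lets each side prove knowledge of it without ever sending it over the IPC channel, via an HMAC challenge-response in the spirit of SafeCookie. The key and nonce sizes here are this sketch's assumptions, not values from the SafeCookie specification.

```python
import hashlib
import hmac
import os

def make_challenge():
    # Fresh random nonce per connection, so proofs cannot be replayed.
    return os.urandom(32)

def prove_knowledge(secret, challenge):
    # BridgeFinderHelper answers the challenge using the filesystem
    # shared secret; the secret itself never crosses the IPC channel.
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def verify_proof(secret, challenge, proof):
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, proof)  # constant-time compare
```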
2. Perform validation before parsing
Care must be taken before converting BridgeFinderHelper data into
Bridge lines, especially for cases where the BridgeFinderHelper data
is fed directly to the control port after passing through
BridgeFinder.
The input should be subjected to a character whitelist and possibly
also validated against a regular expression to verify format, and if
any unexpected or poorly-formed data is encountered, the IPC channel
must be closed.
3. Fail closed on unexpected input
If the handshake fails, or if any other part of the BridgeFinderHelper
input is invalid, the IPC channel must be abandoned and closed. Do
*not* continue scanning for good input after any bad input has been
encountered.
Filename: 200-new-create-and-extend-cells.txt
Title: Adding new, extensible CREATE, EXTEND, and related cells
Author: Robert Ransom
Created: 2012-03-22
Status: Closed
Implemented-In: 0.2.4.8-alpha
History
The original draft of this proposal was from 2010-12-27; nickm revised
it slightly on 2012-03-22 and added it as proposal 200.
Overview and Motivation:
In Tor's current circuit protocol, every field, including the 'onion
skin', in the EXTEND relay cell has a fixed meaning and length.
This prevents us from extending the current EXTEND cell to support
IPv6 relays, efficient UDP-based link protocols, larger 'onion
keys', new circuit-extension handshake protocols, or larger
identity-key fingerprints. We will need to support all of these
extensions in the near future. This proposal specifies a
replacement EXTEND2 cell and related cells that provide more room
for future extension.
Design:
FIXME - allocate command ID numbers (non-RELAY commands for CREATE2 and
CREATED2; RELAY commands for EXTEND2 and EXTENDED2)
The CREATE2 cell contains the following payload:
Handshake type [2 bytes]
Handshake data length [2 bytes]
Handshake data [variable]
The relay payload for an EXTEND2 relay cell contains the following
payload:
Number of link specifiers [1 byte]
N times:
Link specifier type [1 byte]
Link specifier length [1 byte]
Link specifier [variable]
Handshake type [2 bytes]
Handshake data length [2 bytes]
Handshake data [variable]
The CREATED2 cell and EXTENDED2 relay cell both contain the following
payload:
Handshake data length [2 bytes]
Handshake data [variable]
All four cell types are padded to 512-byte cells.
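The fixed-size layout above can be illustrated by packing and unpacking a CREATE2 payload. This assumes the 509-byte cell payload of tor-spec.txt (512 bytes minus the circuit-ID and command header) and is a sketch, not Tor's cell-packing code.

```python
import struct

CELL_PAYLOAD_LEN = 509   # 512-byte cell minus circid + command header
HTYPE_LEGACY = 0x0000    # the "legacy" handshake type defined below

def pack_create2(htype, hdata):
    """Pack handshake type, length, and data, padded to a full cell."""
    body = struct.pack(">HH", htype, len(hdata)) + hdata  # big-endian fields
    assert len(body) <= CELL_PAYLOAD_LEN
    return body + b"\x00" * (CELL_PAYLOAD_LEN - len(body))

def unpack_create2(payload):
    """Return (handshake type, handshake data) from a CREATE2 payload."""
    htype, hlen = struct.unpack_from(">HH", payload, 0)
    return htype, payload[4:4 + hlen]
```

The CREATED2/EXTENDED2 payloads differ only in omitting the 2-byte handshake type.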
When a relay X receives an EXTEND2 relay cell:
* X finds or opens a link to the relay Y using the link target
specifiers in the EXTEND2 relay cell; if X fails to open a link, it
replies with a TRUNCATED relay cell. (FIXME: what do we do now?)
* X copies the handshake type and data into a CREATE2 cell and sends
it along the link to Y.
* If the handshake data is valid, Y replies by sending a CREATED2
cell along the link to X; otherwise, Y replies with a TRUNCATED
relay cell. (XXX: we currently use a DESTROY cell?)
* X copies the contents of the CREATED2 cell into an EXTENDED2 relay
cell and sends it along the circuit to the OP.
Link target specifiers:
The list of link target specifiers must include at least one address and
at least one identity fingerprint, in a format that the extending node is
known to recognize.
The extending node MUST NOT accept the connection unless at least one
identity matches, and should follow the current rules for making sure that
addresses match.
[00] TLS-over-TCP, IPv4 address
A four-byte IPv4 address plus two-byte ORPort
[01] TLS-over-TCP, IPv6 address
A sixteen-byte IPv6 address plus two-byte ORPort
[02] Legacy identity
A 20-byte SHA1 identity fingerprint. At most one may be listed.
As always, values are sent in network (big-endian) order.
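Building on the specifier types just listed, the EXTEND2 link specifier list might be packed as follows; this is an illustrative sketch covering only types [00] (IPv4) and [02] (legacy identity).

```python
import socket
import struct

def pack_link_specifiers(ipv4, orport, fingerprint_hex):
    """Pack an IPv4+ORPort specifier and a legacy identity specifier,
    in network (big-endian) byte order."""
    ls_ipv4 = socket.inet_aton(ipv4) + struct.pack(">H", orport)  # 6 bytes
    ls_id = bytes.fromhex(fingerprint_hex)                        # 20 bytes
    assert len(ls_id) == 20
    out = bytes([2])  # number of link specifiers
    for lstype, body in ((0x00, ls_ipv4), (0x02, ls_id)):
        out += bytes([lstype, len(body)]) + body  # type, length, value
    return out
```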
Legacy handshake type:
The current "onionskin" handshake type is defined to be handshake type
[00 00], or "legacy".
The first (client->relay) message in a handshake of type “legacy”
contains the following data:
‘Onion skin’ (as in CREATE cell) [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
This value is generated and processed as sections 5.1 and 5.2 of
tor-spec.txt specify for the current CREATE cell.
The second (relay->client) message in a handshake of type “legacy”
contains the following data:
Relay DH public key [DH_LEN bytes]
KH (see section 5.2 of tor-spec.txt) [HASH_LEN bytes]
These values are generated and processed as sections 5.1 and 5.2 of
tor-spec.txt specify for the current CREATED cell.
After successfully completing a handshake of type “legacy”, the
client and relay use the current relay cryptography protocol.
Bugs:
This specification does not accommodate:
* circuit-extension handshakes requiring more than one round
No circuit-extension handshake should ever require more than one
round (i.e. more than one message from the client and one reply
from the relay). We can easily extend the protocol to handle
this, but we will never need to.
* circuit-extension handshakes in which either message cannot fit in
a single 512-byte cell along with the other required fields
This can be handled by specifying a dummy handshake type whose
data (sent from the client) consists of another handshake type and
the beginning of the data required by that handshake type, and
then u