Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document proposes moving information that clients do not use out of
  regular router descriptors and into a new "extra info" router descriptor.

Proposal:

  Some of the costliest fields in the current directory protocol are ones
  that no client actually uses.  In particular, the "read-history" and
  "write-history" fields are used only by the authorities for monitoring the
  status of the network.  If we took them out, the size of a compressed list
  of all the routers would fall by about 60%.  (No other disposable field
  would save much more than 2%.)

  We propose to remove these fields from descriptors, and have routers
  upload them to the authorities as part of a separate "extra info"
  document.  This document will be signed.  A hash of this document will be
  included in the regular descriptors.

  (We considered another design, where routers would generate and upload a
  short-form and a long-form descriptor.  Only the short-form descriptor would
  ever be used by anybody for routing.  The long-form descriptor would be
  used only for analytics and other tools.  We decided against this because
  well-behaved tools would need to download short-form descriptors too (as
  these would be the only ones indexed), and hence get redundant info. Badly
  behaved tools would download only long-form descriptors, and expose
  themselves to partitioning attacks.)

Other disposable fields:

  Clients don't need these fields, but removing them doesn't help bandwidth
  enough to be worthwhile.
    contact (save about 1%)
    fingerprint (save about 3%)

  We could represent these fields more succinctly, but removing them would
  only save 1%.  (!)
    reject
    accept
  (Apparently, exit policies are highly compressible.)

  [Does size-on-disk matter to anybody? Some clients and servers don't
   have much disk, or have really slow disk (e.g. USB). And we don't
   store caches compressed right now. -RD]

Specification:

  1. Extra Info Format.

    An "extra info" descriptor contains the following fields:

    "extra-info" Nickname Fingerprint
        Identifies what router this is an extra info descriptor for.
        Fingerprint is encoded in hex (using upper-case letters), with
        no spaces.

    "published" As currently documented in dir-spec.txt.  It MUST match the
        "published" field of the descriptor published at the same time.

    "read-history"
    "write-history"
        As currently documented in dir-spec.txt.  Optional.

    "router-signature" NL Signature NL

        A signature of the PKCS1-padded hash of the entire extra info
        document, taken from the beginning of the "extra-info" line, through
        the newline after the "router-signature" line.  An extra info
        document is not valid unless the signature is made with the
        identity key whose digest matches Fingerprint.

    The "extra-info" field is required and MUST appear first.  The
    router-signature field is required and MUST appear last.  All others are
    optional.  As for other documents, unrecognized fields must be ignored.
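
    The ordering rules above can be sketched as a small check.  This is
    only an illustration, not the Tor implementation: the function name is
    hypothetical, and the "-----BEGIN"/"-----END" armor around the
    signature body is an assumption about how Signature is encoded, which
    this document does not spell out.  It does not verify the signature.

```python
import re

UPPER_HEX = re.compile(r"[0-9A-F]+")


def extra_info_layout_problems(document):
    """Return a list of layout problems; an empty list means none were found."""
    keyword_lines = []
    in_armor = False
    for line in document.splitlines():
        # Assumption: the signature body sits between BEGIN/END armor lines.
        if line.startswith("-----BEGIN "):
            in_armor = True
        elif line.startswith("-----END "):
            in_armor = False
        elif not in_armor and line.strip():
            keyword_lines.append(line.split())

    problems = []
    if not keyword_lines or keyword_lines[0][0] != "extra-info":
        problems.append('"extra-info" must appear first')
    elif len(keyword_lines[0]) != 3:
        problems.append('"extra-info" takes exactly Nickname and Fingerprint')
    elif not UPPER_HEX.fullmatch(keyword_lines[0][2]):
        problems.append("Fingerprint must be upper-case hex with no spaces")
    if not keyword_lines or keyword_lines[-1][0] != "router-signature":
        problems.append('"router-signature" must appear last')
    # Every other keyword is optional, and unrecognized ones are ignored,
    # so nothing between the first and last lines is checked here.
    return problems


SAMPLE = (
    "extra-info ExampleRouter 0123456789ABCDEF0123456789ABCDEF01234567\n"
    "published 2007-01-15 12:00:00\n"
    "some-future-field ignored\n"
    "router-signature\n"
    "-----BEGIN SIGNATURE-----\n"
    "AAAA\n"
    "-----END SIGNATURE-----\n"
)
```

    The sample document is made up for illustration; it passes the check
    even though it carries a field this document does not define.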

  2. Existing formats

     Implementations that use "read-history" and "write-history" SHOULD
     continue accepting router descriptors that contain them.  (Prior to
     0.2.0.x, this information was encoded in ordinary router descriptors;
     in any case they have always been listed as opt, so they should be
     accepted anyway.)

     Add these fields to router descriptors:

       "extra-info-digest" Digest
          "Digest" is a hex-encoded digest (using upper-case characters)
          of the router's extra-info document, computed over the portion
          of that document covered by its signature.  (If this field is
          absent, no extra-info-digest exists.)

       "caches-extra-info"
          Present if this router is a directory cache that provides
          extra-info documents, or an authority that handles extra-info
          documents.

     (Since implementations before 0.1.2.5-alpha required that the "opt"
     keyword precede any unrecognized entry, these keys MUST be preceded
     with "opt" until 0.1.2.5-alpha is obsolete.)
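
     As a rough sketch of how a router might produce the
     "extra-info-digest" line: the code below hashes the signed portion of
     an extra-info document and formats the result with the "opt" prefix.
     The function names are hypothetical, and the use of SHA-1 is an
     assumption, since this document does not name the hash function.

```python
import hashlib

SIGNATURE_KEYWORD_LINE = "\nrouter-signature\n"


def signed_portion(document):
    """Text from the "extra-info" line through the newline after "router-signature"."""
    if not document.startswith("extra-info "):
        raise ValueError('document must begin with an "extra-info" line')
    end = document.find(SIGNATURE_KEYWORD_LINE)
    if end < 0:
        raise ValueError('missing "router-signature" line')
    return document[: end + len(SIGNATURE_KEYWORD_LINE)]


def extra_info_digest(document):
    """Upper-case hex digest of the signed portion of an extra-info document."""
    # Assumption: SHA-1.  Substitute whichever hash dir-spec.txt requires.
    digest = hashlib.sha1(signed_portion(document).encode("ascii"))
    return digest.hexdigest().upper()


def extra_info_digest_line(document):
    """Router descriptor line announcing the digest, with the "opt" prefix."""
    return "opt extra-info-digest " + extra_info_digest(document)
```

     Because the signature body lies outside the hashed span, the digest
     depends only on the content that the signature covers.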

  3. New communications rules

     Servers SHOULD generate and upload one extra-info document after each
     descriptor they generate and upload; no more, no less.  Servers MUST
     upload the new descriptor before they upload the new extra-info.

     Authorities receiving an extra-info document SHOULD verify all of the
     following:
       * They have a router descriptor for some server with a matching
         nickname and identity fingerprint.
       * That server's identity key has been used to sign the extra-info
         document.
       * The extra-info-digest field in the router descriptor matches
         the digest of the extra-info document.
       * The published fields in the two documents match.

     Authorities SHOULD drop extra-info documents that do not meet these
     criteria.
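
     The four checks above might be combined as follows.  This is a
     minimal sketch under stated assumptions: the record types and the
     function name are hypothetical, documents are assumed to be parsed
     already, signature verification is assumed to have happened elsewhere
     (only its result, the fingerprint of the verifying key, appears
     here), and the authority is assumed to hold one descriptor per
     fingerprint.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RouterDescriptor:
    nickname: str
    fingerprint: str
    published: str
    extra_info_digest: Optional[str]  # None if the field is absent


@dataclass
class ExtraInfo:
    nickname: str
    fingerprint: str
    published: str
    digest: str
    # Fingerprint of the identity key that verified the signature,
    # as reported by a separate (not shown) verification step.
    signed_by_fingerprint: str


def should_keep_extra_info(extra: ExtraInfo,
                           descriptors: Dict[str, RouterDescriptor]) -> bool:
    """Return True only if all four checks pass; otherwise drop the document."""
    desc = descriptors.get(extra.fingerprint)
    # 1. A descriptor exists with a matching nickname and fingerprint.
    if desc is None or desc.nickname != extra.nickname:
        return False
    # 2. That server's identity key signed the extra-info document.
    if extra.signed_by_fingerprint != desc.fingerprint:
        return False
    # 3. The descriptor's extra-info-digest matches the document's digest.
    if desc.extra_info_digest != extra.digest:
        return False
    # 4. The published fields match.
    return desc.published == extra.published
```

     A descriptor with no extra-info-digest field never matches, so an
     extra-info document uploaded for it would be dropped.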

     Extra-info documents MAY be uploaded as part of the same HTTP post as
     the router descriptor, or separately.  Authorities MUST accept both
     methods.

     Authorities SHOULD try to fetch extra-info documents from one another if
     they do not have one matching the digest declared in a router
     descriptor.

     Caches that are running locally with a tool that needs to use extra-info
     documents MAY download and store extra-info documents.  They should do
     so when they notice that the recommended descriptor has an
     extra-info-digest not matching any extra-info document they currently
     have.  (Caches not running on a host that needs to use extra-info
     documents SHOULD NOT download or cache them.)
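
     The cache rule above amounts to a set difference.  The sketch below
     is illustrative only; the function and parameter names are
     hypothetical, and how a cache learns that a local tool needs
     extra-info documents is left open here.

```python
def extra_info_digests_to_fetch(recommended_digests, cached_digests,
                                tool_needs_extra_info):
    """Return the set of extra-info digests a cache should try to download.

    recommended_digests: extra-info-digest values from the recommended
        descriptors, with None where a descriptor has no such field.
    cached_digests: digests of the extra-info documents already stored.
    tool_needs_extra_info: True if a local tool uses extra-info documents.
    """
    if not tool_needs_extra_info:
        # Caches with no local consumer SHOULD NOT download or cache them.
        return set()
    wanted = {d for d in recommended_digests if d is not None}
    return wanted - set(cached_digests)
```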

  4. New URLs

     http://<hostname>/tor/extra/d/...
     http://<hostname>/tor/extra/fp/...
     http://<hostname>/tor/extra/all[.z]
        (As for /tor/server/ URLs: supports fetching extra-info documents
        by their digest, by the fingerprint of their servers, or all
        at once. When serving by fingerprint, we serve the extra-info
        that corresponds to the descriptor we would serve by that
        fingerprint. Only directory authorities are guaranteed to support
        these URLs.)

     http://<hostname>/tor/extra/authority[.z]
        (The extra-info document for this router.)

     Extra-info documents are uploaded to the same URLs as regular
     router descriptors.

Migration:

  For extra info approach:
     * First:
       * Authorities should accept extra info, and support serving it.
       * Routers should upload extra info once authorities accept it.
       * Caches should support an option to download and cache it, once
         authorities serve it.
       * Tools should be updated to use locally cached information.
         These tools include:
           lefkada's exit.py script.
           tor26's noreply script and general directory cache.
           https://nighteffect.us/tns/ for its graphs
           and check with or-talk for the rest, once it's time.

     * Set a cutoff time for including bandwidth in router descriptors, so
       that tools that use bandwidth info know that they will need to fetch
       extra info documents.

     * Once tools that want bandwidth info support fetching extra info:
       * Have routers stop including bandwidth info in their router
         descriptors.