Extra-info document format

Extra-info documents consist of the following items:

    "extra-info" Nickname Fingerprint NL
        [At start, exactly once.]

Identifies what router this is an extra-info descriptor for. Fingerprint is encoded in hex (using upper-case letters), with no spaces.

    "identity-ed25519"
        [As in router descriptors]

    "published" YYYY-MM-DD HH:MM:SS NL

       [Exactly once.]

The time, in UTC, when this document (and its corresponding router descriptor if any) was generated. It MUST match the published time in the corresponding server descriptor.

    "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
        [At most once.]
    "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
        [At most once.]

Declare how much bandwidth the OR has used recently. Usage is divided into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines the end of the most recent interval. The numbers are the number of bytes used in the most recent intervals, ordered from oldest to newest.

These fields include both IPv4 and IPv6 traffic.

    "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]
    "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]

Declare how much bandwidth the OR has used recently, on IPv6 connections. See "read-history" and "write-history" for full details.

    "geoip-db-digest" Digest NL
        [At most once.]

SHA1 digest of the IPv4 GeoIP database file that is used to resolve IPv4 addresses to country codes.

    "geoip6-db-digest" Digest NL
        [At most once.]

SHA1 digest of the IPv6 GeoIP database file that is used to resolve IPv6 addresses to country codes.

("geoip-start-time" YYYY-MM-DD HH:MM:SS NL) ("geoip-client-origins" CC=NUM,CC=NUM,... NL)

Only generated by bridge routers (see blocking.pdf), and only when they have been configured with a geoip database. Non-bridges SHOULD NOT generate these fields. Contains a list of mappings from two-letter country codes (CC) to the number of clients that have connected to that bridge from that country (approximate, and rounded up to the nearest multiple of 8 in order to hamper traffic analysis). A country is included only if it has at least one address. The time in "geoip-start-time" is the time at which we began collecting geoip statistics.

"geoip-start-time" and "geoip-client-origins" have been replaced by "bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The reason is that the measurement interval with "geoip-stats" as determined by subtracting "geoip-start-time" from "published" could have had a variable length, whereas the measurement interval in 0.2.2.4-alpha and later is set to be exactly 24 hours long. In order to clearly distinguish the new measurement intervals from the old ones, the new keywords have been introduced.

    "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

A "bridge-stats-end" line, as well as any other "bridge-*" line, is only added when the relay has been running as a bridge for at least 24 hours.

    "bridge-ips" CC=NUM,CC=NUM,... NL
        [At most once.]

List of mappings from two-letter country codes to the number of unique IP addresses that have connected from that country to the bridge and which are no known relays, rounded up to the nearest multiple of 8.

    "bridge-ip-versions" FAM=NUM,FAM=NUM,... NL
        [At most once.]

List of unique IP addresses that have connected to the bridge per protocol family.

    "bridge-ip-transports" PT=NUM,PT=NUM,... NL
        [At most once.]

List of mappings from pluggable transport names to the number of unique IP addresses that have connected using that pluggable transport. Unobfuscated connections are counted using the reserved pluggable transport name "<OR>" (without quotes). If we received a connection from a transport proxy but we couldn't figure out the name of the pluggable transport, we use the reserved pluggable transport name "<??>".

("<OR>" and "<??>" are reserved because normal pluggable transport names MUST match the following regular expression: "[a-zA-Z_][a-zA-Z0-9_]*" )

The pluggable transport name list is sorted into lexically ascending order.

If no clients have connected to the bridge yet, we only write "bridge-ip-transports" to the stats file.

    "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

A "dirreq-stats-end" line, as well as any other "dirreq-*" line, is only added when the relay has opened its Dir port and after 24 hours of measuring directory requests.

    "dirreq-v2-ips" CC=NUM,CC=NUM,... NL
        [At most once.]
    "dirreq-v3-ips" CC=NUM,CC=NUM,... NL
        [At most once.]

List of mappings from two-letter country codes to the number of unique IP addresses that have connected from that country to request a v2/v3 network status, rounded up to the nearest multiple of 8. Only those IP addresses are counted that the directory can answer with a 200 OK status code. (Note here and below: current Tor versions, as of 0.2.5.2-alpha, no longer cache or serve v2 networkstatus documents.)

    "dirreq-v2-reqs" CC=NUM,CC=NUM,... NL
        [At most once.]
    "dirreq-v3-reqs" CC=NUM,CC=NUM,... NL
        [At most once.]

List of mappings from two-letter country codes to the number of requests for v2/v3 network statuses from that country, rounded up to the nearest multiple of 8. Only those requests are counted that the directory can answer with a 200 OK status code.

    "dirreq-v2-share" NUM% NL
        [At most once.]
    "dirreq-v3-share" NUM% NL
        [At most once.]

The share of v2/v3 network status requests that the directory expects to receive from clients based on its advertised bandwidth compared to the overall network bandwidth capacity. Shares are formatted in percent with two decimal places. Shares are calculated as means over the whole 24-hour interval.

    "dirreq-v2-resp" status=NUM,... NL
        [At most once.]
    "dirreq-v3-resp" status=NUM,... NL
        [At most once.]

List of mappings from response statuses to the number of requests for v2/v3 network statuses that were answered with that response status, rounded up to the nearest multiple of 4. Only response statuses with at least 1 response are reported. New response statuses can be added at any time. The current list of response statuses is as follows:

        "ok": a network status request is answered; this number
           corresponds to the sum of all requests as reported in
           "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
           rounding up.
        "not-enough-sigs: a version 3 network status is not signed by a
           sufficient number of requested authorities.
        "unavailable": a requested network status object is unavailable.
        "not-found": a requested network status is not found.
        "not-modified": a network status has not been modified since the
           If-Modified-Since time that is included in the request.
        "busy": the directory is busy.

    "dirreq-v2-direct-dl" key=NUM,... NL
        [At most once.]
    "dirreq-v3-direct-dl" key=NUM,... NL
        [At most once.]
    "dirreq-v2-tunneled-dl" key=NUM,... NL
        [At most once.]
    "dirreq-v3-tunneled-dl" key=NUM,... NL
        [At most once.]

List of statistics about possible failures in the download process of v2/v3 network statuses. Requests are either "direct" HTTP-encoded requests over the relay's directory port, or "tunneled" requests using a BEGIN_DIR relay message over the relay's OR port. The list of possible statistics can change, and statistics can be left out from reporting. The current list of statistics is as follows:

Successful downloads and failures:

        "complete": a client has finished the download successfully.
        "timeout": a download did not finish within 10 minutes after
           starting to send the response.
        "running": a download is still running at the end of the
           measurement period for less than 10 minutes after starting to
           send the response.

        Download times:

        "min", "max": smallest and largest measured bandwidth in B/s.
        "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
           bandwidth in B/s. For a given decile i, i/10 of all downloads
           had a smaller bandwidth than di, and (10-i)/10 of all downloads
           had a larger bandwidth than di.
        "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
           fourth of all downloads had a smaller bandwidth than q1, one
           fourth of all downloads had a larger bandwidth than q3, and the
           remaining half of all downloads had a bandwidth between q1 and
           q3.
        "md": median of measured bandwidth in B/s. Half of the downloads
           had a smaller bandwidth than md, the other half had a larger
           bandwidth than md.

    "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]
    "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
        [At most once]

Declare how much bandwidth the OR has spent on answering directory requests. Usage is divided into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines the end of the most recent interval. The numbers are the number of bytes used in the most recent intervals, ordered from oldest to newest.

    "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

An "entry-stats-end" line, as well as any other "entry-*" line, is first added after the relay has been running for at least 24 hours.

    "entry-ips" CC=NUM,CC=NUM,... NL
        [At most once.]

List of mappings from two-letter country codes to the number of unique IP addresses that have connected from that country to the relay and which are no known other relays, rounded up to the nearest multiple of 8.

    "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

A "cell-stats-end" line, as well as any other "cell-*" line, is first added after the relay has been running for at least 24 hours.

    "cell-processed-cells" NUM,...,NUM NL
        [At most once.]

Mean number of processed cells per circuit, subdivided into deciles of circuits by the number of cells they have processed in descending order from loudest to quietest circuits.

    "cell-queued-cells" NUM,...,NUM NL
        [At most once.]

Mean number of cells contained in queues by circuit decile. These means are calculated by 1) determining the mean number of cells in a single circuit between its creation and its termination and 2) calculating the mean for all circuits in a given decile as determined in "cell-processed-cells". Numbers have a precision of two decimal places.

Note that this statistic can be inaccurate for circuits that had queued cells at the start or end of the measurement interval.

    "cell-time-in-queue" NUM,...,NUM NL
        [At most once.]

Mean time cells spend in circuit queues in milliseconds. Times are calculated by 1) determining the mean time cells spend in the queue of a single circuit and 2) calculating the mean for all circuits in a given decile as determined in "cell-processed-cells".

Note that this statistic can be inaccurate for circuits that had queued cells at the start or end of the measurement interval.

    "cell-circuits-per-decile" NUM NL
        [At most once.]

Mean number of circuits that are included in any of the deciles, rounded up to the next integer.

    "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
        [At most once]

Number of connections, split into 10-second intervals, that are used uni-directionally or bi-directionally as observed in the NSEC seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every 10 seconds, we determine for every connection whether we read and wrote less than a threshold of 20 KiB (BELOW), read at least 10 times more than we wrote (READ), wrote at least 10 times more than we read (WRITE), or read and wrote more than the threshold, but not 10 times more in either direction (BOTH). After classifying a connection, read and write counters are reset for the next 10-second interval.

This measurement includes both IPv4 and IPv6 connections.

    "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
        [At most once]

Number of IPv6 connections that are used uni-directionally or bi-directionally. See "conn-bi-direct" for more details.

    "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

An "exit-stats-end" line, as well as any other "exit-*" line, is first added after the relay has been running for at least 24 hours and only if the relay permits exiting (where exiting to a single port and IP address is sufficient).

    "exit-kibibytes-written" port=N,port=N,... NL
        [At most once.]
    "exit-kibibytes-read" port=N,port=N,... NL
        [At most once.]

List of mappings from ports to the number of kibibytes that the relay has written to or read from exit connections to that port, rounded up to the next full kibibyte. Relays may limit the number of listed ports and subsume any remaining kibibytes under port "other".

    "exit-streams-opened" port=N,port=N,... NL
        [At most once.]

List of mappings from ports to the number of opened exit streams to that port, rounded up to the nearest multiple of 4. Relays may limit the number of listed ports and subsume any remaining opened streams under port "other".

    "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]
    "hidserv-v3-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

A "hidserv-stats-end" line, as well as any other "hidserv-*" line, is first added after the relay has been running for at least 24 hours.

(Introduced in tor-0.4.6.1-alpha)

    "hidserv-rend-relayed-cells" SP NUM SP key=val SP key=val ... NL
        [At most once.]
    "hidserv-rend-v3-relayed-cells" SP NUM SP key=val SP key=val ... NL
        [At most once.]

Approximate number of relay cells seen in either direction on a circuit after receiving and successfully processing a RENDEZVOUS1 cell.

The original measurement value is obfuscated in several steps: first, it is rounded up to the nearest multiple of 'bin_size' which is reported in the key=val part of this line; second, a (possibly negative) noise value is added to the result of the first step by randomly sampling from a Laplace distribution with mu = 0 and b = (delta_f / epsilon) with 'delta_f' and 'epsilon' being reported in the key=val part, too; third, the result of the previous obfuscation steps is truncated to the next smaller integer and included as 'NUM'. Note that the overall reported value can be negative.

(Introduced in tor-0.4.6.1-alpha)

    "hidserv-dir-onions-seen" SP NUM SP key=val SP key=val ... NL
        [At most once.]
    "hidserv-dir-v3-onions-seen" SP NUM SP key=val SP key=val ... NL
        [At most once.]

Approximate number of unique hidden-service identities seen in descriptors published to and accepted by this hidden-service directory.

The original measurement value is obfuscated in the same way as the 'NUM' value reported in "hidserv-rend-relayed-cells", but possibly with different parameters as reported in the key=val part of this line. Note that the overall reported value can be negative.

(Introduced in tor-0.4.6.1-alpha)

    "transport" transportname address:port [arglist] NL
        [Any number.]

Signals that the router supports the 'transportname' pluggable transport in IP address 'address' and TCP port 'port'. A single descriptor MUST not have more than one transport line with the same 'transportname'.

Pluggable transports are only relevant to bridges, but these entries can appear in non-bridge relays as well.

    "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=NUM key=NUM ... NL
        [At most once.]

YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default). Counts are reset to 0 at the end of this interval.

The keyword list is currently as follows:

         bin-size
           - The current rounding value for cell count fields (10000 by
             default)
         write-drop
           - The number of RELAY_DROP messages this relay sent
         write-pad
           - The number of PADDING cells this relay sent
         write-total
           - The total number of cells this relay cent
         read-drop
           - The number of RELAY_DROP messages this relay received
         read-pad
           - The number of PADDING cells this relay received
         read-total
           - The total number of cells this relay received
         enabled-read-pad
           - The number of PADDING cells this relay received on
             connections that support padding
         enabled-read-total
           - The total number of cells this relay received on connections
             that support padding
         enabled-write-pad
           - The total number of cells this relay received on connections
             that support padding
         enabled-write-total
           - The total number of cells sent by this relay on connections
             that support padding
         max-chanpad-timers
           - The maximum number of timers that this relay scheduled for
             padding in the previous NSEC interval

    "overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS
                      SP rate-limit SP burst-limit
                      SP read-overload-count SP write-overload-count NL
        [At most once.]

        Indicates that a bandwidth limit was exhausted for this relay.

The "rate-limit" and "burst-limit" are the raw values from the BandwidthRate and BandwidthBurst found in the torrc configuration file.

The "{read|write}-overload-count" are the counts of how many times the reported limits of burst/rate were exhausted and thus the maximum between the read and write count occurrences. To make the counter more meaningful and to avoid multiple connections saturating the counter when a relay is overloaded, we only increment it once a minute.

The 'version' field is set to '1' for now.

(Introduced in tor-0.4.6.1-alpha)

    "overload-fd-exhausted" SP version YYYY-MM-DD HH:MM:SS NL
        [At most once.]

Indicates that a file descriptor exhaustion was experienced by this relay.

The timestamp indicates that the maximum was reached between the timestamp and the "published" timestamp of the document.

This overload field should remain in place for 72 hours since last triggered. If the limits are reached again in this period, the timestamp is updated, and this 72 hour period restarts.

The 'version' field is set to '1' for the initial implementation which detects fd exhaustion only when a socket open fails.

(Introduced in tor-0.4.6.1-alpha)

    "router-sig-ed25519"
        [As in router descriptors]

    "router-signature" NL Signature NL
        [At end, exactly once.]
        [No extra arguments]

A document signature as documented in section 1.3, using the initial item "extra-info" and the final item "router-signature", signed with the router's identity key.