netdoc document meta-format

Server descriptors, directories, and running-routers documents all obey the following lightweight extensible information format, known as netdoc format.

netdoc syntax

A netdoc is a text file, with Unix line endings.

The highest level object is a Document, which consists of one or more Items. Every Item begins with a KeywordLine, followed by zero or more Objects. A KeywordLine begins with a Keyword, optionally followed by whitespace and more non-newline characters, and ends with a newline. A Keyword is a sequence of one or more characters in the set [A-Za-z0-9-], but may not start with -. An Object is a block of encoded data in pseudo-Privacy-Enhanced-Mail (PEM) style format: that is, lines of encoded data MAY be wrapped by inserting an ASCII linefeed ("LF", also called newline, or "NL" here) character (cf. RFC 4648 §3.1). When line wrapping, implementations MUST wrap lines at 64 characters. Upon decoding, implementations MUST ignore and discard all linefeed characters.

More formally:

NL = The ascii LF character (hex value 0x0a).
Document ::= (Item | NL)+
Item ::= KeywordLine Object?
KeywordLine ::= ItemKeyword (WS Arguments)? NL
ItemKeyword = Keyword
Arguments ::= any printing ASCII character.
WS = (SP | TAB)+
Object ::= BeginLine Base64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword (" " Keyword)* "-----" NL
EndLine ::= "-----END " Keyword (" " Keyword)* "-----" NL
Keyword = KeywordStart KeywordChar*
KeywordStart ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9'
KeywordChar ::= KeywordStart | '-'

The documentation for each ItemKeyword must specify its expected Arguments and Objects. Unless otherwise stated, a KeywordLine contains a sequence of space/tab-separated arguments:

Arguments ::= Argument (WS Arguments)?
Argument ::= ArgumentChar+
ArgumentChar ::= any graphical printing ASCII character.

A Keyword may not be -----BEGIN.

The BeginLine and EndLine of an Object must use the same keyword.
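
The grammar above can be illustrated with a small parser. The following Python code is a minimal sketch, not Tor's parser: the names `Item` and `parse_netdoc` are hypothetical, it assumes the default space/tab-separated argument syntax, and it assumes the Object body is standard padded base64. Because a Keyword cannot start with `-`, a stray `-----BEGIN` line can never be taken for a KeywordLine.

```python
import base64
import re
from dataclasses import dataclass
from typing import List, Optional

KW = r"[A-Za-z0-9][A-Za-z0-9-]*"
KEYWORD_RE = re.compile(KW)
BEGIN_RE = re.compile(rf"-----BEGIN ({KW}(?: {KW})*)-----")


@dataclass
class Item:
    keyword: str
    arguments: List[str]
    object_label: Optional[str] = None
    object_data: Optional[bytes] = None


def parse_netdoc(text: str) -> List[Item]:
    """Split a netdoc into Items (hypothetical helper for illustration)."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty string after the final NL
    items: List[Item] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # Document ::= (Item | NL)+ permits blank lines
        m = KEYWORD_RE.match(line)
        if m is None:
            raise ValueError(f"line does not start with a Keyword: {line!r}")
        rest = line[m.end():]
        if rest and rest[0] not in " \t":
            raise ValueError(f"invalid character after Keyword: {line!r}")
        # Default argument syntax: space/tab-separated arguments.
        args = [a for a in re.split(r"[ \t]+", rest) if a]
        item = Item(m.group(0), args)
        # An Item may be followed by an Object.
        begin = BEGIN_RE.fullmatch(lines[i]) if i < len(lines) else None
        if begin:
            label = begin.group(1)
            end_line = f"-----END {label}-----"  # must match the BeginLine
            i += 1
            body: List[str] = []
            while i < len(lines) and lines[i] != end_line:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError(f"missing EndLine for {label!r}")
            i += 1
            item.object_label = label
            # Joining the lines discards the linefeeds used for wrapping.
            item.object_data = base64.b64decode("".join(body), validate=True)
        items.append(item)
    return items
```

For example, parsing a document whose second item carries a `SIGNATURE` Object yields an `Item` whose `object_data` holds the decoded bytes, while the first item has only a keyword and an argument list.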

Compatibility and extensibility

When interpreting a Document, software MUST ignore any KeywordLine that starts with a keyword it doesn't recognize; future implementations MUST NOT require current clients to understand any KeywordLine not currently described.

Other implementations that want to extend Tor's directory format MAY introduce their own items. The keywords for extension items SHOULD start with the characters "x-" or "X-", to guarantee that they will not conflict with keywords used by future versions of Tor.
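
One way to satisfy the "ignore unrecognized keywords" rule is to dispatch through a table of known handlers and skip everything else. The Python sketch below assumes items have already been split into (keyword, arguments) pairs; `interpret`, `is_extension_keyword`, and the `bandwidth` handler in the usage note are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[[List[str]], Any]


def is_extension_keyword(keyword: str) -> bool:
    """True for keywords in the namespace suggested for extension items."""
    return keyword.startswith(("x-", "X-"))


def interpret(
    items: List[Tuple[str, List[str]]],
    handlers: Dict[str, Handler],
) -> Dict[str, List[Any]]:
    """Apply known handlers; silently skip unrecognized keywords."""
    recognized: Dict[str, List[Any]] = {}
    for keyword, args in items:
        handler = handlers.get(keyword)
        if handler is None:
            continue  # unknown keyword: ignore the whole KeywordLine
        recognized.setdefault(keyword, []).append(handler(args))
    return recognized
```

With a single handler registered for `bandwidth`, a document that also contains `x-experimental` and `future-thing` items is interpreted without error, and only the `bandwidth` item contributes to the result.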

Permit additional arguments

For forward compatibility, each item MUST allow extra arguments at the end of the line unless otherwise noted. So, for example, if an item's description is given as:

  • thing int int int ..

then implementations SHOULD accept this string as well:

thing 5 9 11 13 16 12 NL

but not this string:

thing 5 NL

Typically the text would state that the int arguments are integers, so the implementation should also reject this string:

thing 5 10 thing NL

Whenever an item DOES NOT allow extra arguments, we will tag it with "No extra arguments" in the syntax bullet points. (If the .. has been omitted, but there is no "no extra arguments" statement, the omission of the .. is a spec mistake and extra arguments are allowed.)
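
The `thing` example above can be sketched in Python as follows. This is an illustration under assumptions: `parse_thing` is a hypothetical function, the `allow_extra` flag stands in for a "No extra arguments" tag, and the integers are assumed to be non-negative decimal.

```python
import re
from typing import List

INT_RE = re.compile(r"[0-9]+")  # assumption: non-negative decimal integers


def parse_thing(args: List[str], allow_extra: bool = True) -> List[int]:
    """Parse the three required int arguments of the example 'thing' item."""
    required = 3
    if len(args) < required:
        raise ValueError("too few arguments")
    if len(args) > required and not allow_extra:
        raise ValueError("extra arguments are not permitted for this item")
    values = []
    for arg in args[:required]:
        if INT_RE.fullmatch(arg) is None:
            raise ValueError(f"not an integer: {arg!r}")
        values.append(int(arg))
    # Any arguments beyond the required ones are accepted and ignored.
    return values
```

Called with the arguments of `thing 5 9 11 13 16 12`, this returns the first three integers and ignores the rest; the arguments of `thing 5` and `thing 5 10 thing` both raise `ValueError`.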

netdoc structure

Each type of netdoc requires, and permits, certain ItemKeywords, with certain restrictions on their order. In some cases ItemKeywords can introduce sections, providing structure to the document; this will be stated in the description for that ItemKeyword in that type of document.

netdoc format description conventions

NB these conventions are not yet followed everywhere in the Tor Specifications.

When presenting a specific document format, the Items forming the document are shown one per subsection.

The syntax of each item is defined in detail with a bulleted list at the start of the section.

The first bullet point shows the syntax of the line introducing the item. Literal parts (including the Item Keyword) are shown in fixed width. Arguments are shown with italic emphasis. The spaces between arguments, and the final newline, are not depicted. If (as is usual) extra arguments are to be tolerated (for future expansion), a short ellipsis .. is shown as a reminder. Optional arguments are shown in [ ].

When an Item has (or may have) an Object, that is shown as the 2nd line in the bullet list, in the form:

  • something, Object, OBJECT KEYWORD

where something will be used to refer to the Object in the text, and OBJECT KEYWORD is the Object's Keyword in the base64 delimiters. (The -----BEGIN etc. are not depicted.)

Further bullet points give further information about the syntax - often, in terms defined more fully here.

The type (therefore, format) of arguments, and permissible values, are stated in the text. The argument is named in bold-italic in its principal description.

Position and multiplicity

The syntax bullet points for an Item state its permissible multiplicity and position, within each Document of its particular document type, in the following terms:

  • "At start, exactly once" --- MUST occur exactly once, and MUST be the first item.

  • "Exactly once" --- MUST occur exactly once.

  • "At end, exactly once" --- MUST occur exactly once, and MUST be the last item.

  • "At most once" --- MAY occur zero or one times but MUST NOT occur more than once.

  • "Any number" --- MAY occur zero, one, or more times.

  • "Once or more" --- MUST occur at least once and MAY occur more than once.
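
These terms lend themselves to a mechanical check. The Python sketch below is one possible way to validate a document's item keywords against such rules; the function name `check_multiplicity` and the rule table passed to it are assumptions for illustration, and keywords without a rule are left alone so that unrecognized items remain ignorable.

```python
from collections import Counter
from typing import Dict, List

EXACTLY_ONCE = {"At start, exactly once", "Exactly once", "At end, exactly once"}
KNOWN_RULES = EXACTLY_ONCE | {"At most once", "Any number", "Once or more"}


def check_multiplicity(keywords: List[str], rules: Dict[str, str]) -> List[str]:
    """Return a list of violations; an empty list means the rules hold."""
    counts = Counter(keywords)
    errors: List[str] = []
    for keyword, rule in rules.items():
        if rule not in KNOWN_RULES:
            raise ValueError(f"unknown rule: {rule!r}")
        n = counts[keyword]
        if rule in EXACTLY_ONCE and n != 1:
            errors.append(f"{keyword}: expected exactly once, found {n}")
        elif rule == "At most once" and n > 1:
            errors.append(f"{keyword}: expected at most once, found {n}")
        elif rule == "Once or more" and n < 1:
            errors.append(f"{keyword}: expected at least once, found 0")
        # Position constraints are checked in addition to the count.
        if rule == "At start, exactly once" and keywords[:1] != [keyword]:
            errors.append(f"{keyword}: must be the first item")
        if rule == "At end, exactly once" and keywords[-1:] != [keyword]:
            errors.append(f"{keyword}: must be the last item")
    return errors
```

The `keywords` argument is the sequence of ItemKeywords in document order, so position checks only need to look at its first and last elements.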

Rest-of-line arguments

Exceptionally, for some items there is a "rest of line" argument. This is denoted by writing ARGUMENT.... in the syntax summary, in the first bullet point, and stating

  • ARGUMENT is the whole rest of the line,

in the syntax description.

In this case, the value of the argument is all the characters after the SP following the keyword or previous argument.
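
As an illustration, a rest-of-line argument can be extracted by splitting only on the first few spaces, so that whitespace inside the value is preserved. In this Python sketch, `split_rest_of_line` and the `note` keyword are hypothetical, the line is passed without its trailing NL, and any leading arguments are assumed to be separated by single SP characters.

```python
from typing import List, Tuple


def split_rest_of_line(line: str, leading_args: int = 0) -> Tuple[str, List[str], str]:
    """Split 'keyword arg1 .. argN REST', where REST runs to the end of the line."""
    # maxsplit keeps everything after the final separating SP intact.
    parts = line.split(" ", leading_args + 1)
    if len(parts) < leading_args + 2:
        raise ValueError("missing rest-of-line argument")
    return parts[0], parts[1:1 + leading_args], parts[-1]
```

For the line `note Alice  <alice@example.org>` with no leading arguments, the value is `Alice  <alice@example.org>`, including both spaces after `Alice`.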

Signing documents

Every signable document below is signed in a similar manner, using a given "Initial Item", a final "Signature Item", a digest algorithm, and a signing key.

The Initial Item must be the first item in the document.

The Signature Item has the following format:

<signature item keyword> [arguments] NL SIGNATURE NL

The "SIGNATURE" Object contains a signature (using the signing key) of the PKCS#1 1.5 padded digest of the entire document, taken from the beginning of the Initial item, through the newline after the Signature Item's keyword and its arguments.

The signature does not include the algorithmIdentifier specified in PKCS #1.

Unless specified otherwise, the digest algorithm is SHA-1.

All documents are invalid unless signed with the correct signing key.

The "Digest" of a document, unless stated otherwise, is its digest as signed by this signature scheme.
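
The extent of the signed text can be sketched in Python as follows. This covers only the digest computation with the default SHA-1; the PKCS#1 padding and the signing operation itself are out of scope. The names `signed_portion` and `document_digest` and the `doc-start`/`doc-signature` keywords are hypothetical, and the scan is a simplification that looks at every line's first word without tracking Objects.

```python
import hashlib
import re


def signed_portion(document: str, initial_keyword: str, signature_keyword: str) -> str:
    """Return the text from the Initial Item through the NL that ends the
    Signature Item's KeywordLine (the Object that follows is excluded)."""
    start = None
    pos = 0
    while pos < len(document):
        nl = document.find("\n", pos)
        if nl == -1:
            raise ValueError("document does not end with NL")
        keyword = re.split(r"[ \t]", document[pos:nl], maxsplit=1)[0]
        if start is None and keyword == initial_keyword:
            start = pos
        elif start is not None and keyword == signature_keyword:
            return document[start:nl + 1]
        pos = nl + 1
    raise ValueError("Initial Item or Signature Item not found")


def document_digest(document: str, initial_keyword: str, signature_keyword: str) -> bytes:
    """SHA-1 digest of the signed portion (SHA-1 is the stated default)."""
    portion = signed_portion(document, initial_keyword, signature_keyword)
    return hashlib.sha1(portion.encode("ascii")).digest()
```

Note that the returned portion ends immediately after the Signature Item's KeywordLine, so the `-----BEGIN SIGNATURE-----` line and the signature bytes themselves never contribute to the digest.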