Molerat Protocol Specification

1 molerat v0.1.0-alpha

Molerat is yet another protocol in the vein of Gemini or HTTP/HTML/CSS. It offers extended functionality over Gemini while keeping a low barrier to entry.

1.0.1 The Name

In nature, naked mole-rats are unusually resistant to cancer. This reflects the philosophy of Molerat: to provide a usable platform that is resistant to feature creep.

Not only are naked mole-rats highly resistant to cancer, but they also have unusually long lifespans for rodents. This is taken as a good omen for the Molerat project: may it live a long and cancer-free life.

1.0.2 Inspiration

Molerat is heavily inspired by HTTP and Gemini, and adopts many principles from both. A core philosophy for Molerat is: don’t do it differently just because someone else is already doing it. This is also known as “steal good features”.

1.0.3 Mascot

Molerat is represented by Potat the naked mole-rat.

Potat, the naked mole-rat.

1.1 Overview

Molerat adopts the TOFU TLS standard from Gemini. TOFU TLS stands for trust on first use transport layer security. It is a system for guarding against MITM (man-in-the-middle) attacks, where a malicious actor impersonates the server in order to serve altered content.

TOFU works by storing an ID or key for a specific server. On future connections, the client checks that the key still matches the original. If it does not, the client produces an error and alerts the user, as it may be connected to an impersonator instead.
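The TOFU check described above can be sketched in a few lines. This is a minimal illustration, not a mandated design: the function names, the in-memory dictionary used as the store, and the choice of a SHA-256 fingerprint over the DER-encoded certificate are all assumptions made for the example.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Return a SHA-256 hex fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def tofu_check(store: dict, host: str, cert_der: bytes) -> str:
    """Compare a server certificate against the fingerprint stored for host.

    Returns "first-use" when the host is new (and stores the fingerprint),
    "match" when the fingerprint is unchanged, and "mismatch" otherwise.
    """
    seen = fingerprint(cert_der)
    known = store.get(host)
    if known is None:
        store[host] = seen  # trust on first use
        return "first-use"
    return "match" if known == seen else "mismatch"
```

A client would treat "mismatch" as the error case and alert the user instead of continuing silently.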

For data transfer, Molerat uses request and response headers to define different types of operations, such as getting or putting data.

For markup, Molerat uses a markdown-like syntax to define page content like links, text formatting, and user interaction.

1.2 Molerat URI Scheme

Molerat mostly conforms to the base URI syntax defined in RFC 3986. Similar to Gemini, Molerat does not use the userinfo component of URIs.

Molerat URIs use the scheme “molerat” and a default port of 2693.
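As a rough sketch of how a client might take a Molerat URI apart, the following uses Python's standard urllib.parse. The function name is hypothetical, and rejecting URIs that carry a userinfo component is an assumption; the specification only says that Molerat does not use it.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 2693  # default Molerat port from the specification


def parse_molerat_uri(uri: str) -> tuple:
    """Split a molerat:// URI into (host, port, path, query, fragment)."""
    parts = urlsplit(uri)
    if parts.scheme != "molerat":
        raise ValueError("not a molerat URI")
    # Assumption: a URI with userinfo is rejected rather than silently stripped.
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo is not used in molerat URIs")
    port = parts.port if parts.port is not None else DEFAULT_PORT
    return parts.hostname, port, parts.path or "/", parts.query, parts.fragment
```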

1.3 Requests

Basic Molerat requests are built using the format

<kind> <url>
<crlf>
<crlf>

Where

  • kind corresponds to the type of request being made.
  • url corresponds to the URL of the content being requested.
  • crlf corresponds to a carriage-return line-feed, or \r\n.

Molerat supports three different request types:

  • get
  • put
  • del
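The basic request format can be assembled as follows. This is only a sketch: the function name is made up for illustration, and encoding the request as UTF-8 is an assumption that the specification does not state for requests.

```python
CRLF = "\r\n"
KINDS = ("get", "put", "del")  # the three request types listed above


def build_request(kind: str, url: str) -> bytes:
    """Build a basic request: '<kind> <url>' followed by a double crlf."""
    if kind not in KINDS:
        raise ValueError(f"unknown request kind: {kind!r}")
    # Assumption: requests are sent as UTF-8 bytes.
    return f"{kind} {url}{CRLF}{CRLF}".encode("utf-8")
```

For example, build_request("get", "example.com/resource") yields the bytes of a get request terminated by the double crlf.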

1.3.1 Get

In Molerat, get is used to request content from a server. The server need only respond with the content; no other action is necessary.

Here is an example get request for the content at molerat://example.com/:

get example.com/

To get the content at molerat://example.com/resource

get example.com/resource

To get the content at molerat://example.com/path/to/resource?with=query#and-fragment

get example.com/path/to/resource?with=query#and-fragment

Note that every request must end with a double crlf. They are not visible when rendered on this page; however, to produce a standard Molerat request in bash, one would have to include them like this:

printf 'get example.com/resource\r\n\r\n'

1.3.2 Put

put is used to send content to a server. What is done with that content is up to the server, but it is generally expected to update a record, or create a new one with the data provided. The server may then respond with a new page for the client to display.

put also includes the data being sent as a list of records, each formatted as <key>:<value>. Records are delimited by a <tab><crlf>. A <tab><crlf> following the final record is optional.

Here is an example of a put request to submit a username/password for a login page:

put example.com/login
length:32
hash:02f53083ac99d85db16d2226b370c8137e6d2f4f8a5a52dad84d12d6a9f6f471

username:potat
password:molerat

As you can see, there are two separate sections to a put request: the header keys, which give the server information about the form being submitted, and the form itself. These sections are delimited with a \r\n\r\n. For more information on these header keys, see Section 1.4.
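One way to assemble a put request from a set of form records is sketched below. The function name is hypothetical. The sketch assumes that length is the UTF-8 byte length of the form section and that hash is the SHA-256 hex digest of those same bytes; the specification does not spell out how these values are computed for requests, so treat both as guesses.

```python
import hashlib

RECORD_SEP = "\t\r\n"  # <tab><crlf> delimiter between records


def build_put(url: str, form: dict) -> bytes:
    """Build a put request with length/hash header keys and a form section."""
    body = RECORD_SEP.join(f"{key}:{value}" for key, value in form.items()).encode("utf-8")
    # Assumption: both header values are derived from the encoded form section.
    headers = RECORD_SEP.join([
        f"length:{len(body)}",
        f"hash:{hashlib.sha256(body).hexdigest()}",
    ])
    head = f"put {url}\r\n{headers}\r\n\r\n".encode("utf-8")
    return head + body
```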

Multi-line form values are also allowed as long as they do not contain a tab directly followed by a crlf:

put example.com/submit/story
length:64
hash:336ef7e928bfa7de38f84917a976728daca6e67160e6aac7b9fa6ec16784151d

username:potat
story:Once upon a time...
There was a beautiful naked mole-rat named Potat. Potat loved things such as marshmallows and stargazing, he found every part of his life to be enjoyable.
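Because records are only split on a tab followed by a crlf, a plain crlf inside a value survives parsing. A minimal sketch of a form-section parser, with a hypothetical function name and an assumed policy of raising on records that lack a colon:

```python
def parse_records(body: str) -> dict:
    """Split a form section into key/value records on <tab><crlf>."""
    records = {}
    for record in body.split("\t\r\n"):
        if not record:
            continue  # tolerate the optional trailing delimiter
        key, sep, value = record.partition(":")  # split on the first colon only
        if not sep:
            raise ValueError(f"record without a colon: {record!r}")
        records[key] = value
    return records
```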

Keys must not be empty. For example, this would be an invalid put request:

put example.com/submit/story
length:22
hash:3a9d888ce6913876fdc2258fac9f31aacaa222fa4258f394a22a3f58d4586375

username:potat
story:

Servers implementing the Molerat protocol should reject put requests with empty keys.

Note that keys can, however, be solely whitespace. This counts as not being an empty key.

Here is a valid put request printed in bash:

printf 'put example.com/login\r\nlength:32\t\r\nhash:02f53083ac99d85db16d2226b370c8137e6d2f4f8a5a52dad84d12d6a9f6f471\r\n\r\nusername:potat\t\r\npassword:molerat'

1.3.3 Del

del is used to delete a resource. It is syntactically the same as a get request. Having a separate del request keeps the purpose of put clear, since otherwise put would have to be used to request deletion of a resource on a server.

Here is an example of a del request:

del example.com/item/potat

del can use the same form headers and content as put, so it is possible to send authentication via login/password with a del request.

1.3.4 All three

Here is an example of a typical user session on a theoretical social media site using Molerat requests:

  1. get other user’s posts:
get example.com/
  2. put a post:
put example.com/posts
username:potat
content:Hi! I'm the new mascot for the Molerat protocol!
  3. del an older post:
del example.com/posts/1
  4. View other posts:
get example.com/posts

1.4 Responses

Molerat formats responses as

status<crlf>
message:<value><tab>
<crlf>
type:<value><tab>
<crlf>
length:<value><tab>
<crlf>
hash:<value>
<crlf>
<crlf>
<content>

Where

  • status corresponds to the status code of the request. See below for more information on available status codes.
  • message is an optional message from the server to the client. It can be used to send error messages or context-specific messages.
  • type corresponds to the MIME Type of the data being sent. The recognized list of MIME types is available at www.iana.org/assignments/media-types/media-types.xhtml. A valid MIME definition is defined in RFC 2046.

  • length is the decimal byte-length of the content being sent.
  • hash is a unique identifier for that page. A SHA-256 hash is suggested, but the only requirement is that it be 1024 or fewer bytes.
  • content is the raw data for the client to display.

In a response containing no content, length and hash should not be included. Additionally, the type component should not be included.

Molerat responses are really just key-values with an optional content section at the end. The exception to this is the status field: status is just the first line of the response, indicating the server and response status. Here is another way to write the template above:

status
message:
type:
length:
hash:

<content>

Here is an example valid call and response for a get request:

Request:

get example.com/

Response:

10
message:Success
type:text/plain
length:18
hash:c28de73edf48cbc93eafb11f7266fc2268b9c7c77f9df02e74e0561e6cea7595

Hello, from Potat.

Or, using escape codes:

# Request
printf 'get example.com/\r\n\r\n'

# Response
printf '10\r\nmessage:Success\t\r\ntype:text/plain\t\r\nlength:18\t\r\nhash:c28de73edf48cbc93eafb11f7266fc2268b9c7c77f9df02e74e0561e6cea7595\r\n\r\nHello, from Potat.'

Note that as with put requests, the final form delimiter <tab><crlf> is not necessary in a response.
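A client-side parser for this response layout might look like the sketch below. The function name is hypothetical, and the sketch assumes the final record is sent without a trailing <tab><crlf>; a response that includes the optional final delimiter would need extra handling.

```python
def parse_response(raw: bytes) -> tuple:
    """Split a raw response into (status, fields, content)."""
    # The first double crlf separates the status/fields from the content.
    head, _, content = raw.partition(b"\r\n\r\n")
    # The status is the first line; the remaining text holds the key-values.
    status_line, _, field_block = head.partition(b"\r\n")
    fields = {}
    for record in field_block.split(b"\t\r\n"):
        if record:
            key, _, value = record.partition(b":")  # first colon only
            fields[key.decode("utf-8")] = value.decode("utf-8")
    return int(status_line), fields, content
```

Splitting each record on the first colon only means a redirect URL in the message component is kept intact.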

Molerat mandates a hash key to allow clients to optionally cache pages. It should not be relied on to validate them, however. Instead, a client may read the hash key before allocating memory to load the content, and if it finds an identical one in its store, it may display that page instead. This can ease network strain for long documents, as Molerat does not support compression during transport.

Clients are not required to keep a hash cache, but if they do, user options to control the cache should be available.
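The caching behaviour can be illustrated with a small sketch. Everything here is an assumption made for the example: the function name, the plain dictionary used as the cache, and the read_content callback that stands in for reading the content section off the connection.

```python
def load_page(fields: dict, read_content, cache: dict) -> bytes:
    """Return cached content when the hash is known, else read and cache it."""
    page_hash = fields.get("hash")
    if page_hash is not None and page_hash in cache:
        return cache[page_hash]  # skip reading the content entirely
    content = read_content()
    if page_hash is not None:
        cache[page_hash] = content
    return content
```

Emptying the dictionary is the simplest form of the user control mentioned above.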

1.5 Status Codes

Molerat specifies several types of response status codes:

  • 1x codes are used for success messages.
  • 2x codes indicate redirection.
  • 3x codes indicate client errors.
  • 4x codes indicate server errors.
  • 5x codes are used for TLS client certificates.
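A client can dispatch on the first digit of the status code. The sketch below uses made-up class labels and assumes all codes are two digits, as in the codes defined in this section.

```python
STATUS_CLASSES = {
    1: "success",
    2: "redirect",
    3: "client error",
    4: "server error",
    5: "client certificate",
}


def status_class(code: int) -> str:
    """Map a two-digit status code to its class by its first digit."""
    if not 10 <= code <= 59:
        raise ValueError(f"status code out of range: {code}")
    return STATUS_CLASSES[code // 10]
```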

1.5.1 1x Codes

1x codes are success messages. They indicate that the client request has succeeded.

1.5.1.1 10

Success. Everything went OK and the new page is attached.

1.5.1.2 11

Content unchanged. The request was a success, but the client should not update the page.

1.5.2 2x Codes

2x codes indicate redirection. The server may tell the client to see another resource. The redirect location should be an encoded URL in the message component of the response. The server may optionally include content to be displayed while the redirection is taking place.

1.5.2.1 20

Permanent redirect. The client should not make further requests to this resource.

1.5.2.2 21

Temporary redirect. The client should temporarily see another source, but not cache the redirect.

1.5.3 3x Codes

3x codes are for client errors. These should be sent as a response using the message component to explain the error. Servers may optionally include content to display a richer error to the user. Otherwise, the client should display a generated error page.

1.5.3.1 30

Malformed request. Due to an error with the client, the server was unable to parse the request being sent.

1.5.3.2 31

Invalid request. The server was able to parse the request, but the data included is not valid in this context. This code should be used for authentication failures and put requests containing data that the server cannot accept.

1.5.3.3 32

Not available. The requested resource does not exist or is not available to the client. This code can be used for both nonexistent resources, and for cases where the client does not have access to the resource.

1.5.4 4x Codes

4x codes indicate an unrecoverable failure on the server. If possible, the server should catch internal errors and respond with a 4x code.

1.5.4.1 40

Internal error. The server cannot proceed as it has encountered an error it does not know how to handle.

1.5.4.2 41

Not supported. The server does not support or allow access to this resource by the client.

1.5.4.3 42

Slow down. The client is being rate limited. The message component should be a number corresponding to the seconds a client must wait before making another request to the server.

1.5.5 5x Codes

5x codes are used when the server requires a client key to access a resource.

1.5.5.1 50

Client certificate required. The server requires that the client provides a certificate to proceed. The client should retry the request using a certificate.

1.6 MIME Types

In responses, the server may specify any valid MIME type. It is up to the client to determine how best to display or open the file. Molerat also supports parameters for MIME types like charset. As with Gemini, the default charset is UTF-8 for text content.

1.7 TLS

Molerat chooses to adopt many standards from Gemini for TLS. As with Gemini, use of TLS for Molerat requests and responses is required. Additionally, as with Gemini, use of the Server Name Indication extension to TLS is also required. This allows multiple domains to be hosted on a single server while remaining secure.

Molerat expects that clients and servers use TLS version 1.2 or higher.

1.7.1 Client

Clients may use a certificate to identify themselves to a server, but a Molerat client should not use a certificate by default.

If a resource requires a certificate (via the 50 status code) then the client should allow the user to specify whether to use a short-lived certificate, or a long-lived one.

1.7.2 Validation

Molerat recommends the same method of server certificate validation as Gemini (TOFU). Clients may validate server certificates however they want, including not at all. No matter how a client validates a server certificate, it should always strive to be transparent with the user about how it does so.

Molerat is a simple protocol, so requiring CA certificates would be overly complex. Of course, a client and server may implement it, but clients should accept self-signed keys by default.
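The TLS requirements above (TLS 1.2 or higher, Server Name Indication, self-signed certificates accepted by default) could be configured with Python's standard ssl module roughly as follows. The function names are hypothetical, and disabling CA verification is a deliberate choice for this sketch on the assumption that the client performs its own TOFU check afterwards.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a TLS client context that accepts self-signed certificates."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # TLS 1.2 or higher
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # no CA validation; TOFU is done separately
    return ctx


def open_connection(host: str, port: int = 2693) -> ssl.SSLSocket:
    """Connect to a server, sending the host name via SNI."""
    ctx = make_client_context()
    raw = socket.create_connection((host, port))
    # Passing server_hostname makes the client send the SNI extension.
    return ctx.wrap_socket(raw, server_hostname=host)
```

After connecting, getpeercert(binary_form=True) on the returned socket gives the DER-encoded server certificate, which a client could feed into its own TOFU store.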

1.8 Molerat Text (mtxt)

Molerat text, or mtxt, is the format that Molerat uses to present formatted text.

mtxt supports several features, but not all of them must be implemented by every client. A server serving mtxt should use a MIME type of text/molerat. Like Gemini, Molerat supports an optional lang parameter in addition to charset. Clients may use this for any purpose, such as displaying text in the correct direction or auto-translating to a different language. As with Gemini, the lang parameter accepts comma-separated language tags from the IANA Language Registry (www.iana.org/assignments/language-subtag-registry/language-subtag-registry).

...
type:text/molerat; lang=en
...

Specifies a mtxt document that is in English.
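Reading the type field, including the charset and lang parameters, might be sketched like this. The function name is hypothetical, and the sketch does not handle quoted parameter values.

```python
def parse_type(value: str) -> tuple:
    """Parse a type field into (media_type, params, language_tags)."""
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    media_type = media_type.lower()
    params = {}
    for item in raw_params:
        name, sep, val = item.partition("=")
        if sep:
            params[name.strip().lower()] = val.strip()
    if media_type.startswith("text/"):
        params.setdefault("charset", "utf-8")  # default charset for text content
    langs = [tag.strip() for tag in params.get("lang", "").split(",") if tag.strip()]
    return media_type, params, langs
```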

1.8.1 Headings

Molerat supports three levels of heading, each denoted by one or more hash characters (#) followed by a space. The number of #s denotes the level of the heading.

# Title Heading

## Subtitle Heading

### Sub-Subtitle Heading

Rendered in HTML, it would look like this:


Title Heading

Subtitle Heading

Sub-Subtitle Heading


1.8.2 Text

Regular text is simply written on a standalone line. Line breaks are ignored unless there are two in a row. mtxt should not be manually line-wrapped when written; clients should instead choose how and where to wrap text.

This is a regular paragraph. The client will choose when to break the line, to better accommodate the user and their device.


This line has a break,
but it will not show up when rendered since it is only one break. 
People can use a single line break to make it easier to write
mtxt without changing the end-result.


This line has a manual break,

This line will show up on a new line after the previous.

People can use this to separate paragraphs, and other distinguished content.

This is how the previous text should look:


This is a regular paragraph. The client will choose when to break the line, to better accommodate the user and their device.

This line has a break, but it will not show up when rendered since it is only one break. People can use a single line break to make it easier to write mtxt without changing the end-result.

This line has a manual break,

This line will show up on a new line after the previous.

People can use this to separate paragraphs, and other distinguished content.
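The line-break rule can be sketched as a function that splits on blank lines and joins single breaks with a space. The function name is hypothetical, and the sketch deals only with plain paragraphs, not headings, lists, or code blocks.

```python
import re


def paragraphs(text: str) -> list:
    """Split text into paragraphs, folding single line breaks into spaces."""
    blocks = re.split(r"\n\s*\n", text.strip())  # two breaks in a row end a paragraph
    return [" ".join(line.strip() for line in block.splitlines()) for block in blocks]
```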


1.8.3 Subtext formatting

Like Markdown, asterisks (*) can be used to format text inside a paragraph.

*This text is italic*

**This text is bold**

***This text is italic and bold***

This text is italic

This text is bold

This text is italic and bold


Text surrounded by single asterisks will be italic, double asterisks will be bold, and triple asterisks will be italic and bold. Asterisks should not style text if whitespace separates them from the text they enclose.

This word is italic: *Potat*

This word is NOT italic: *Potat *

Neither is this one: * Potat*

And DEFINITELY not this one: * Potat *

This word is italic: Potat

This word is NOT italic: *Potat *

Neither is this one: * Potat*

And DEFINITELY not this one: * Potat *


Asterisks can be escaped with a backslash (\) character.

*I should be italic, but I'm not\*

\*I have fallen to the same predicament*

\*I have too...\*

*I should be italic, but I’m not*

*I have fallen to the same predicament*

*I have too…*


1.8.4 Links

Inline links can be specified via the [[ and ]] characters. Links can point to any type of content, such as mtxt, images, audio, videos, and even pages on other protocols.

The following is a link to another Molerat page:

[[This links to example.com!]molerat://example.com]

It would be displayed like this:

This links to example.com!

A link in mtxt consists of three parts:

  • The display text.
  • The URL.
  • The media type.

1.8.4.1 Display Text

In a link, the text at the very center of the double braces ([[ and ]]) is the text that the client should display to the user. If no text is specified, then the client should display the URL.

1.8.4.2 The URL

The URL is a link to the resource that the client should fetch. It is located on the right-hand side of the double braces. If there is no protocol (or scheme) specified (like http://, gemini://, or molerat://), then the client should treat the URL as a relative link on the same website. For example, if a URL is /about, and the user is on molerat://example.com/login, then the client should interpret the URL as molerat://example.com/about.
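Relative link resolution can be sketched with urllib.parse. Python's urljoin does not resolve relative references for schemes it does not know, so this sketch temporarily borrows the http scheme for the join and then restores the original one. The function name is hypothetical and the workaround is an implementation detail of this example, not part of Molerat.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve(base: str, ref: str) -> str:
    """Resolve a link URL against the URL of the current page."""
    if urlsplit(ref).scheme:
        return ref  # already absolute (http://, gemini://, molerat://, ...)
    b = urlsplit(base)
    # Join under a scheme urljoin understands, then restore the base scheme.
    joined = urljoin(urlunsplit(("http", b.netloc, b.path, b.query, "")), ref)
    j = urlsplit(joined)
    return urlunsplit((b.scheme, j.netloc, j.path, j.query, j.fragment))
```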

1.8.4.3 The Media Type

The media type goes on the left-hand side of the double braces and specifies the type of data that is being linked to. By default, it is mtxt, so it can be left blank for regular links to other webpages.

If the link is to a media-type that is uncommon, or widely unsupported by the client, then the media type may be a MIME type.

Molerat also supports several shorthand media types.

  • ! - Denotes an image of types image/png or image/jpeg.
  • . - Denotes a video of type video/mp4.
  • " - Denotes an audio clip of types audio/mp4, audio/mp3, or audio/wav.
  • # - Is for auto. The client should guess how to display the content to the best of its abilities. Clients are not required to do anything for auto media types, but they should offer the ability to open or download the linked media.
  • - - Can be used to force the client to treat the link as a text link.

Here are some examples:

[[see here for more info]molerat://example.com/info]

Links to molerat://example.com/info and has the text “see here for more info”. It looks like this:

see here for more info


[[]molerat://example.com]

Links to molerat://example.com without any text. It looks like this:

molerat://example.com


[![A naked mole-rat.]https://upload.wikimedia.org/wikipedia/commons/0/02/Nacktmull.jpg]

If the client supports it, this will embed an image into the text content. Otherwise it will appear as a link. Clients without image support should provide some visual way to distinguish regular links and images. Images with display text should use it as alt text for screen readers.

It should look like this:

A naked mole-rat.

[#[Mystery media.]molerat://example.com/media]

For links with the # media type, the client should use the type parameter from the server if the link uses the molerat:// scheme. Otherwise, the client may do anything (or nothing at all) to discern the type of media.


[-[An image of a naked mole-rat.]https://upload.wikimedia.org/wikipedia/commons/0/02/Nacktmull.jpg]

This link uses the - media type which means that the client should not attempt to display it as an inline image. Instead, it should look like this:

An image of a naked mole-rat.
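The three parts of a link can be pulled out with a regular expression. This is a sketch under several assumptions: the function name and the labels in the shorthand table are invented for the example, escaped brackets are not handled, and the URL is assumed to contain no whitespace or closing bracket.

```python
import re

# [<media type>[<display text>]<url>]
LINK = re.compile(r"\[([^\[\]]*)\[([^\]]*)\]([^\]\s]*)\]")

# Labels for the shorthand media types; other values pass through as MIME types.
SHORTHAND = {"": "mtxt", "!": "image", ".": "video", '"': "audio", "#": "auto", "-": "text link"}


def parse_link(source: str):
    """Return the media type, display text, and URL of a link, or None."""
    match = LINK.fullmatch(source)
    if match is None:
        return None
    media, text, url = match.groups()
    # With no display text, the URL itself is shown.
    return {"media": SHORTHAND.get(media, media), "text": text or url, "url": url}
```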

1.8.5 Lists

Lists in mtxt can be ordered or unordered.

1.8.5.1 Unordered Lists

Unordered lists are any sequence of characters preceded by a - (dash) and a space.

- item one
- item two
- item three

Would look like


  • item one
  • item two
  • item three

In mtxt, lists do not nest.

1.8.5.2 Ordered Lists

Ordered lists are any sequence of lines beginning with a sequential number followed by a period and a space.

1. item one
2. item two
3. item three

Would look like


  1. item one
  2. item two
  3. item three

mtxt lists can start at any number.

5. item one
6. item two
7. item three

Would look like


  1. item one
  2. item two
  3. item three

The only restriction is that the numbers must be sequential going up. This would be an invalid list in mtxt:

1. item one
5. item two.
10. item three.

That means that numbers going down would also be invalid; here is an example of that:

3. item one
2. item two
1. item three
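The ordering rule can be checked while parsing. This sketch reads "sequential" as meaning each number is exactly one greater than the previous one, which is an interpretation of the examples above rather than something stated outright. The function name is hypothetical.

```python
import re

ITEM = re.compile(r"^(\d+)\. (.*)$")


def parse_ordered_list(lines: list):
    """Return the item texts of a valid ordered list, or None if invalid."""
    items = []
    expected = None
    for line in lines:
        match = ITEM.match(line)
        if match is None:
            return None
        number = int(match.group(1))
        if expected is not None and number != expected:
            return None  # numbers must go up one at a time
        expected = number + 1
        items.append(match.group(2))
    return items
```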

1.8.6 Code Blocks

mtxt supports fenced code blocks. Fenced code blocks inherit the same format as Markdown.

```
#include <stdio.h>

int main(void) {
    printf("hello, world!\n");
}
```

Fenced code blocks can be any text enclosed in triple backticks (`). Code blocks will have their whitespace preserved. They should also be displayed in a way that distinguishes them from the rest of the document flow such as a mono font, or with a different background color. Exact implementation is up to clients.

For example, the above example could be displayed like this:

#include <stdio.h>

int main(void) {
    printf("hello, world!\n");
}

mtxt also supports language specifiers for code blocks. After the opening triple backticks, a language can be specified which the client may use for language highlighting or any other purpose.

For example, a code block specifying that its contents are in the C language would look like this:

```c
#include <stdio.h>

int main(void) {
    printf("hello, world!\n");
}
```

There is no formal list of supported languages for syntax highlighting, but as a general rule, using the file’s extension name should work. C code should use c, C++ code should use cpp, and JavaScript code should use js.

The above code may be syntax highlighted by the client to look something like this:

#include <stdio.h>

int main(void) {
    printf("hello, world!\n");
}

Clients may force-wrap code blocks if necessary, but should avoid doing so to preserve the original formatting.

1.8.6.1 Inline Code

In addition to code blocks, mtxt also supports inline code snippets. Inline code works the same way as subtext formatting (see Section 1.8.3), but uses backticks (`) instead of asterisks.

This is an example of inline code:

I am an example paragraph. I have inline `code` which can be inside the regular text.

1.8.7 Horizontal Breaks

Three dashes (-) alone on a line will become a horizontal break.

Closing paragraph.

---

New paragraph.

This would be rendered like this:

Closing paragraph.


New paragraph.

A client may display a horizontal break however it likes, but it should be clear that it is a separator between content.

1.8.8 Input

mtxt allows for text input. An input field looks like this:

|Placeholder text..|[id text]

That would look like this:

Input fields are made up of three parts:

  • Placeholder Text - The optional placeholder text to be displayed as a prompt for the input field.
  • ID - The id of the input that will be used when the form is submitted.
  • Type - The optional type of the input field.

1.8.8.1 Placeholder text

Placeholder text is optional. If specified, it will be shown as a prompt for the input. Placeholder text does not support subtext formatting.

Here is an example of an input with placeholder text:

|Name|[id text]

It would look like this:


Without a placeholder text:

||[id text]

1.8.8.2 ID

The id is the only mandatory part of an input field. It is what the client will use to identify this input field in the submit request. If two inputs have the same ID on the same page, the client should use the most recent unsubmitted one.

1.8.8.3 Type

The type is a hint to the client as to what data this form wishes to accept. The client may choose to disregard this, so servers should not expect input to be validated already.

These are the types that clients should support:

  • text - The default option. Any text data.
  • number - Any numeric digits.
  • time - Any time in the hours:minutes [AM|PM] format. Clients should convert times into 24-hour time before submitting a form, although this is optional.
  • date - Any date in the year-month-day format.
  • private - Any data that should be hidden while typed, such as a password.
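An input field line can be parsed into its three parts as sketched below. The function name is hypothetical; the sketch assumes the ID and type contain no whitespace and falls back to text when no type is given, since text is described as the default option.

```python
import re

# |<placeholder>|[<id> <type>]  (placeholder and type are optional)
INPUT = re.compile(r"^\|([^|]*)\|\[([^\s\]]+)(?: ([^\s\]]+))?\]$")


def parse_input(line: str):
    """Return the placeholder, id, and type of an input field, or None."""
    match = INPUT.match(line)
    if match is None:
        return None
    placeholder, field_id, field_type = match.groups()
    return {"placeholder": placeholder, "id": field_id, "type": field_type or "text"}
```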

1.8.9 Submitting Input

To submit input forms on a page, there is a special element:

|Submit|(/login get)

Depending on the client, this can be rendered as a link or a button:

Submit

Submit elements use parentheses (( and )) instead of braces ([ and ]) to specify that they do not accept input; they merely submit it.

Submit elements submit every input that comes before them on a page, but ignore inputs that come after them.

A submit element is made up of three parts:

  • The Name - The display text to show.
  • The URL - The URL to send the submit request to.
  • The Method - The Molerat method to use for the request like get, put, or del.
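A sketch of how a client might work out which inputs a submit element covers, given the lines of a page. The function name is hypothetical, and the sketch assumes the method is always present in the submit element.

```python
import re

# |<name>|(<url> <method>)
SUBMIT = re.compile(r"^\|([^|]*)\|\((\S+) (get|put|del)\)$")
INPUT_ID = re.compile(r"^\|[^|]*\|\[([^\s\]]+)")


def inputs_for_submit(lines: list, submit_index: int) -> dict:
    """Parse the submit element at submit_index and list the inputs before it."""
    match = SUBMIT.match(lines[submit_index])
    if match is None:
        raise ValueError("not a submit element")
    name, url, method = match.groups()
    # Only inputs that come before the submit element are included.
    ids = [m.group(1) for line in lines[:submit_index] if (m := INPUT_ID.match(line))]
    return {"name": name, "url": url, "method": method, "inputs": ids}
```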

1.8.10 Input Form Flow

Here is an example of a login page in mtxt:

# Login

Please login to the site below:

|username|[username text]
|password|[password private]

|Login|(/login get)

This would look like this:


Login

Please login to the site below: