#
Protocol
Note: This protocol documentation is preliminary.
#
Basics
The Polycentric protocol is based on vector clocks, asymmetric cryptography, and conflict-free replicated data types (CRDTs). An understanding of these concepts is required. The core is as follows: A system is identified by a public key and usually represents a user identity. A process is identified by a random string and usually represents a user device. Each event in a system is signed with the private key corresponding to the system public key, and is published by a process. Messages are communicated via a set reconciliation protocol, and state is constructed by a consumer of system events under an eventual consistency model.
There are two components: clients and servers. A client chooses to publish its systems on multiple servers. A system includes the routing information necessary to find it within a network. When a process is consumed, a client connects to the servers that the process is available on and starts synchronizing. The set of servers a user chooses to store their own processes on can be totally disjoint from those of the people they are following. The basic system synchronization places very limited trust in servers compared to most models. Should server operators provide unreliable service, or choose to deny service to users, the client automatically fetches systems from other sources.
Many features are very difficult to provide via a trustless methodology, or purely within a client. Examples of these features include recommendation engines and search. Each Polycentric server provides search and recommendation, but has control over what data it chooses to present. A server could choose to return some results and not others. To combine the best of both worlds, clients use multiple servers of their choosing to provide search and recommendations. Results are deduplicated and attributed to the server that provided them. This allows the use of high-performance, state-of-the-art solutions to these difficult problems, while limiting the manipulation possible by a single actor.
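The deduplication and attribution step can be sketched as follows. This is only an illustration: the string result keys, the server addresses, and the `merge_results` helper are hypothetical and are not defined by the protocol.

```python
from collections import OrderedDict


def merge_results(results_by_server):
    """Deduplicate results while recording which servers returned each one.

    results_by_server maps a server address to the list of result keys it
    returned. The key format and this helper are hypothetical.
    """
    merged = OrderedDict()
    for server, results in results_by_server.items():
        for key in results:
            servers = merged.setdefault(key, [])
            # Attribute the result to each server once, in first-seen order.
            if server not in servers:
                servers.append(server)
    return merged


merged = merge_results({
    "https://a.example": ["post-1", "post-2"],
    "https://b.example": ["post-2", "post-3"],
})
```

A result returned by only one server is still shown, but the attribution lets a client notice when a server omits results that its peers return.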
#
Core Message Format
Polycentric is a binary protocol based on Protocol Buffers Version 3.
#
Event Message
message Event {
PublicKey system = 1;
Process process = 2;
uint64 logical_clock = 3;
uint64 content_type = 4;
bytes content = 5;
VectorClock vector_clock = 6;
Indices indices = 7;
}
#
Signed Event Message
A SignedEvent is an Event with a signature, and is the main message type sent over a network between devices. Signatures and digests are computed over the raw bytes of the event field. The event field must be stored as is by clients to remedy the lack of canonicalization among libraries. This also ensures fields may be added to Event in a non-breaking way.
message SignedEvent {
bytes signature = 1;
bytes event = 2;
}
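To illustrate why the raw bytes matter, here is a minimal sketch that hand-encodes a SignedEvent using the standard Protocol Buffers wire format and digests the stored event bytes without re-serializing them. The helper names are hypothetical, the 64-byte zero signature is a placeholder, and a real client would normally use generated protobuf code and an ed25519 library instead.

```python
import hashlib


def encode_varint(value):
    """Encode a non-negative integer as a Protocol Buffers varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_bytes_field(field_number, payload):
    """Encode a length-delimited (wire type 2) field."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_signed_event(signature, raw_event):
    # Field numbers follow the SignedEvent definition: signature = 1, event = 2.
    return encode_bytes_field(1, signature) + encode_bytes_field(2, raw_event)


def event_digest(raw_event):
    # Digest the stored bytes directly; never re-serialize a parsed Event first.
    return hashlib.sha256(raw_event).digest()


raw_event = b"\x18\x05"   # an Event carrying only logical_clock = 5
signature = bytes(64)     # placeholder, not a valid signature
encoded = encode_signed_event(signature, raw_event)
```

Because the signature and digest cover `raw_event` exactly as received, two libraries that serialize the same Event differently still agree on its identity.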
#
Public Key Message
The only supported key_type is 1, representing ed25519.
message PublicKey {
uint64 key_type = 1;
bytes key = 2;
}
#
Digest Message
The only supported digest_type is 1, representing SHA256.
message Digest {
uint64 digest_type = 1;
bytes digest = 2;
}
#
Vector Clock Message
A component of Event. This contains the state of the logical clocks of each other process that a process is aware of, in the order of the last SystemProcesses message. Should a process not be aware of other processes, the VectorClock will be empty.
message VectorClock {
repeated uint64 logical_clocks = 1;
}
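A minimal sketch of how a process might assemble its VectorClock follows. The `build_vector_clock` helper and its arguments are hypothetical, and recording 0 for a listed process with no observed events is an assumption that this document does not specify.

```python
def build_vector_clock(system_processes, known_clocks):
    """Return logical clocks ordered like the last SystemProcesses message.

    system_processes: list of process identifiers, in published order.
    known_clocks: dict mapping process identifier -> latest logical clock seen.
    """
    # Assumption: a listed process with no observed events is recorded as 0.
    return [known_clocks.get(process, 0) for process in system_processes]


process_a = bytes([0xAA]) * 16
process_b = bytes([0xBB]) * 16
clock = build_vector_clock([process_b, process_a], {process_a: 7, process_b: 3})
```

The positions in the result only make sense alongside the SystemProcesses message that fixed the ordering, so a consumer needs both to interpret the clock.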
#
Index Message
A component of Indices.
message Index {
uint64 index_type = 1;
uint64 logical_clock = 2;
}
#
Indices Message
A component of Event. Indices is a map of back pointers, by index type, to previous events. This may be used to point to the location of a more complex index type or, in the simple case, to establish a chain of particular values for safer partial set reconciliation.
message Indices {
repeated Index indices = 1;
}
#
Process Message
Process is a per-process random 16-byte identifier.
message Process {
bytes process = 1;
}
#
Pointer Message
Used for addressing an Event. The event_digest is included so that the subject of the pointer cannot be maliciously mutated. An example usage is referencing a post in a reply.
message Pointer {
PublicKey system = 1;
Process process = 2;
uint64 logical_clock = 3;
Digest event_digest = 4;
}
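The check that event_digest enables can be sketched as below. It relies on the statement that digests are computed over the raw bytes of the event field. The `Digest` dataclass and the `pointer_matches_event` helper are hypothetical stand-ins for generated protobuf types.

```python
import hashlib
from dataclasses import dataclass

SHA256_DIGEST_TYPE = 1  # the only digest_type listed in this document


@dataclass(frozen=True)
class Digest:
    digest_type: int
    digest: bytes


def pointer_matches_event(event_digest, raw_event):
    """Check that raw_event is the event a Pointer was created for."""
    if event_digest.digest_type != SHA256_DIGEST_TYPE:
        return False  # unknown digest types cannot be verified here
    return hashlib.sha256(raw_event).digest() == event_digest.digest


original = b"\x18\x05"
pointer_digest = Digest(SHA256_DIGEST_TYPE, hashlib.sha256(original).digest())
```

A reply that stores `pointer_digest` can detect a substituted event even when the system, process, and logical clock all match.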
#
Last Writer Wins Element Set Message
See the Conflict-free replicated data type Wikipedia page for more information.
LWWElementSet is ADD biased, using unix_milliseconds as the conflict resolution timestamp.
message LWWElementSet {
enum Operation {
ADD = 1;
REMOVE = 2;
}
Operation operation = 1;
bytes value = 2;
uint64 unix_milliseconds = 4;
}
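As a rough illustration of ADD-biased resolution, the sketch below folds a list of operations into the resulting set. The tuple layout and the `lww_set_members` helper are hypothetical; only the ADD and REMOVE values and the timestamp rule come from the definition above.

```python
ADD, REMOVE = 1, 2  # Operation values from the LWWElementSet definition


def lww_set_members(operations):
    """Resolve (operation, value, unix_milliseconds) tuples into a set.

    ADD biased: when an ADD and a REMOVE of the same value share the newest
    timestamp, the value is kept.
    """
    latest = {}  # value -> (unix_milliseconds, operation)
    for operation, value, timestamp in operations:
        current = latest.get(value)
        if current is None or timestamp > current[0]:
            latest[value] = (timestamp, operation)
        elif timestamp == current[0] and operation == ADD:
            latest[value] = (timestamp, ADD)  # a tie goes to ADD
    return {value for value, (_, op) in latest.items() if op == ADD}


operations = [
    (ADD, b"a", 10), (REMOVE, b"a", 10),  # tie: kept
    (ADD, b"b", 5), (REMOVE, b"b", 6),    # later REMOVE wins
    (REMOVE, b"c", 1), (ADD, b"c", 2),    # later ADD wins
]
members = lww_set_members(operations)
```

The result does not depend on the order in which operations arrive, which is what lets replicas converge under eventual consistency.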
#
Last Writer Wins Element Message
A CRDT representing a single value. The unix_milliseconds value is used for conflict resolution; if unix_milliseconds values conflict, the tie is broken by which Process identifier is larger.
message LWWElement {
bytes value = 1;
uint64 unix_milliseconds = 2;
}
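The resolution rule can be sketched in a few lines. The tuple layout and the `lww_element_merge` helper are hypothetical, and comparing process identifiers as raw byte strings is an assumption, since the document does not say how "larger" is defined.

```python
def lww_element_merge(a, b):
    """Pick the winner of two (unix_milliseconds, process, value) writes.

    Later timestamps win; on equal timestamps the larger process identifier wins.
    """
    # Assumption: process identifiers compare as raw byte strings.
    return max(a, b, key=lambda write: (write[0], write[1]))


older = (1000, b"\x02" * 16, b"alice")
newer = (2000, b"\x01" * 16, b"alice2")
tied = (2000, b"\x03" * 16, b"alice3")
```

Here `lww_element_merge(older, newer)` selects `newer` on timestamp alone, while `lww_element_merge(newer, tied)` falls back to the process identifier.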
#
Reference Message
message Reference {
uint64 reference_type = 1;
bytes reference = 2;
}
The two reference types are:
1: Pointer
2: System
#
Event Content Types
Corresponding to content_type:
1: Delete
2: SystemProcesses
3: Post
4: Follow
5: Username
6: Description
7: BlobMeta
8: BlobSection
9: Avatar
10: Server
11: Vouch
12: Claim
13: Banner
#
Delete Message
A Delete message instructs implementations to stop storing a message. The Delete message is then returned when requested as proof that the mutation was not malicious. A Delete message may not be the subject of another Delete. The field indices mirrors that of the subject message.
message Delete {
Process process = 1;
uint64 logical_clock = 2;
Indices indices = 3;
}
#
System Processes Message
This message represents the other processes of a system known by a given process. A process should not include itself in a SystemProcesses message. When a process becomes aware of another process, it should publish a new SystemProcesses message with the new process included.
message SystemProcesses {
repeated Process processes = 1;
}
#
Server Message
Message type Server uses an empty content field with a server address set for lww_element_set. This message is used to advertise servers that are storing events for the system.
#
Username Message
Message type Username uses an empty content field with the value of lww_element set to a username.
#
Description Message
Message type Description uses an empty content field with the value of lww_element set to a description.
#
Follow Message
Message type Follow uses an empty content field with a single reference value pointing to a System to be followed, and with the same value also used in lww_element_set.
#
Avatar
Message type Avatar uses an empty content field with the value of lww_element set to an image Pointer.
#
Post Message
A freestanding message.
message Post {
string content = 1;
optional Pointer image = 2;
optional Pointer boost = 3;
}
#
BlobMeta and BlobSection
Blobs are split into sections to ensure that events are not larger than one megabyte. A BlobMeta event is used to describe a blob. The meta_pointer field of BlobSection references the index of the BlobMeta that the BlobSection corresponds to.
message BlobMeta {
uint64 section_count = 1;
string mime = 2;
}
message BlobSection {
uint64 meta_pointer = 1;
bytes content = 2;
}
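A minimal sketch of splitting a blob follows. The 512 KiB payload size is a hypothetical choice that simply stays under the one megabyte event limit, the dicts stand in for generated protobuf messages, and treating meta_pointer as the logical clock of the BlobMeta event is an assumption.

```python
MAX_SECTION_BYTES = 512 * 1024  # hypothetical payload size, below the event limit


def split_blob(blob, mime, meta_pointer):
    """Return a BlobMeta-like dict and a list of BlobSection-like dicts."""
    chunks = [blob[i:i + MAX_SECTION_BYTES]
              for i in range(0, len(blob), MAX_SECTION_BYTES)]
    meta = {"section_count": len(chunks), "mime": mime}
    # Assumption: meta_pointer is the logical clock of the BlobMeta event.
    sections = [{"meta_pointer": meta_pointer, "content": chunk}
                for chunk in chunks]
    return meta, sections


blob = bytes(1_200_000)
meta, sections = split_blob(blob, "image/png", meta_pointer=41)
```

A consumer can use section_count to tell when every section has arrived before concatenating the content fields in order.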
#
Vouch
Message type Vouch uses an empty content field with a single reference value pointing to a Claim.
#
Claim
message Claim {
string claim_type = 1;
bytes claim = 2;
}
#
Claim Types
HackerNews (ClaimIdentifier)
YouTube (ClaimIdentifier)
Odysee (ClaimIdentifier)
Rumble (ClaimIdentifier)
Twitter (ClaimIdentifier)
Bitcoin (ClaimIdentifier)
Generic (ClaimIdentifier)
URL (ClaimIdentifier)
Patreon (ClaimIdentifier)
Discord (ClaimIdentifier)
Instagram (ClaimIdentifier)
Twitch (ClaimIdentifier)
#
Claim Identifier
message ClaimIdentifier {
string identifier = 1;
}
#
Network
Binary query parameters are Base64-URL encoded following RFC 4648.
There are a few types of queries: querying an index chain, querying specific events, and querying references. Only querying specific events is currently documented.
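Encoding a binary query parameter might look like the sketch below, which uses the URL-safe alphabet from RFC 4648. The helper names are hypothetical, and whether servers expect the trailing `=` padding to be kept or stripped is not stated here, so the decoder accepts both.

```python
import base64


def encode_query_value(raw):
    """Base64-URL encode binary data for use as a query parameter value."""
    # urlsafe_b64encode uses the '-' and '_' alphabet from RFC 4648 section 5.
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_query_value(text):
    # Re-add any stripped '=' padding so padded and unpadded input both decode.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


encoded_value = encode_query_value(b"\xfb\xff")
```

The bytes `0xfb 0xff` are a useful test case because the standard alphabet would produce `+` and `/`, both of which are unsafe in a query string.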
#
Message Type Events
A simple list of events used in various contexts.
message Events {
repeated Event events = 1;
}
#
GET /head?system=...
The head endpoint returns the set of messages required to capture the entire known state of a system. If a single process has an accurate SystemProcesses state, only the latest message from that process is returned. If the server has a more complete view of a system than any given process, then the latest message from multiple processes may be returned.
This endpoint is intended to be used to spot check servers cheaply. If a given server is being used for synchronization, a client may check that messages are not being hidden by asking other servers for the head.
This endpoint returns Events.
#
POST /events
The POST events endpoint is used to submit events to a server. This endpoint accepts Events.
#
GET /ranges?system=...
The ranges endpoint is used to determine which messages a server currently has. The result type is RangesForSystem. A Range is inclusive.
message Range {
uint64 low = 1;
uint64 high = 2;
}
message RangesForProcess {
Process process = 1;
repeated Range ranges = 2;
}
message RangesForSystem {
repeated RangesForProcess ranges_for_processes = 1;
}
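Since a Range is inclusive, a client can collapse the logical clocks it holds into ranges and subtract them from what a server reports. The sketch below shows one way to do that for a single process; the `(low, high)` tuples and both helpers are hypothetical, and the set-based subtraction is written for clarity rather than for very large ranges.

```python
def to_ranges(logical_clocks):
    """Collapse logical clocks into sorted inclusive (low, high) ranges."""
    ranges = []
    for clock in sorted(set(logical_clocks)):
        if ranges and clock == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], clock)  # extend the current run
        else:
            ranges.append((clock, clock))
    return ranges


def missing_ranges(remote, local):
    """Return the parts of the remote ranges not covered by local ranges."""
    have = {c for low, high in local for c in range(low, high + 1)}
    want = [c for low, high in remote
            for c in range(low, high + 1) if c not in have]
    return to_ranges(want)


local_ranges = to_ranges([1, 2, 3, 7, 8])
to_fetch = missing_ranges([(1, 10)], local_ranges)
```

The resulting ranges are what a client would then request for that process when fetching events.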
#
GET /events?system=...&ranges_for_system=...
The GET events endpoint is used to request events in a range, returning an Events message.
#
GET /resolve_claim?trust_root=...&claim=...
The GET resolve_claim endpoint is used to find a feed for an arbitrary claim, returning an Events message. Claim validity is a social rather than a technical construct; as such, a node (a system in a trust graph) must be provided via trust_root to use as a basis for claim resolution. The claim query parameter is a base64 encoded claim.