How it Works
Radicle Link is the peer-to-peer gossip protocol that powers the Radicle network. This documentation provides an overview of how the protocol works. More information, including specifications, security considerations, and implementation notes, can be found in the corresponding repository.
Overview
Radicle Link is a peer-to-peer gossip protocol with a generic distributed version control backend. It aims to be general enough to be used on top of systems such as pijul or mercurial, though its initial implementation is focused on supporting Git.
The protocol disseminates Git repositories via gossip-based replication, enabling the hosting and sharing of repositories without reliance on central servers. Repositories on the Radicle network are called 'projects', which are gossiped by 'peers'.
In Radicle:
- Peers track other peers
- Peers track projects they are interested in
- Peers gossip about projects. This means replicating updates from the peers they track and the projects they are interested in
These interactions create a "trusted" social graph of peers and projects that become the foundation for collaboration within Radicle.
Radicle Link supports a bazaar-style collaboration model in which there is no single canonical 'master' branch that contributors merge into, but a multitude of upstreams exchanging patches via remotes.
DISCLAIMER 🌱
While this documentation reflects the specification of the Radicle Link protocol, not all information is integrated into the Beta release.
Identities
Overview
Radicle Link distinguishes two types of identities: personal and project. The
first describes an actor (a peer) in the system, while the second describes a
software project (repository) on which one or more actors collaborate.
The notion of "identity" in Radicle Link simply means the presence of an
identity document at a conventional location within a Git repository, where the
document is subject to certain verification rules. The hash of the initial
document is considered its stable identifier and encoded as a uniform resource
name (URN) of the form rad:git:$HASH (for Git repositories). The URN is
supposed to be resolvable on the network into a top-level Git repository of the
same name ($HASH.git), which is valid if it contains said identity document, and
the document passes the verification rules.
Data Model
Our model for maintaining consistency of repository data is based on The Update
Framework (TUF), which was conceived as a means of securely distributing
software packages. Our approach is to establish an ownership proof, tied to the
network identity of a peer (PeerId; see Key Architecture), or a set of
peers, such that the views of a project can be replicated according to the trust
relationships between peers ("tracking").
The identity of a project (repository) or peer is established using a document of the form:
struct Doc<T, D> {
replaces: Option<Revision>,
payload: T,
delegations: D,
}
where:
- `Revision` is a cryptographic hash of a document's contents, such that this document is content-addressable by this hash within the storage system.
- `replaces` refers to the previous revision of the document, or none if it is the first revision.
- `payload` is an extensible, forwards- and backwards-compatible datatype containing application-defined metadata about the repository. The protocol interprets some of the properties, as described in Doc Payload.
- `delegations` contains the public keys of key owners who are authorised to issue and approve new revisions of the document. The delegation format depends on the type of identity being established, as detailed below.
The Doc must be serialised in canonical form. See more in Serialisation.
The authenticity of the Doc is captured by the following type:
struct Identity<T, D> {
id: ContentId,
root: Revision,
revision: Revision,
doc: Doc<T, D>,
signatures: HashMap<PublicKey, Signature>,
}
where:
- `id` is the content-addressable hash of the Identity itself
- `root` is the initial revision of the identity Doc
- `revision` is the current revision of the identity Doc
- `doc` is the Doc described by revision
- `signatures` contains signatures over the document history, indexed by the
  public keys used. A signature is made over the cryptographic hash of the
  concatenation of the Revisions chain, from most recent to the root.
An Identity describes the attestation of a Doc's validity, and the root of a
verified Identity is the stable identifier of the repository.
Read more in the specification
Doc Payload
The doc payload determines the type of identity: personal or project. The payload must include one of the following structures (but not both) for interpretation by the protocol:
struct User {
/// A short name (nickname, handle), without any prefix such as the `@`
/// character
name: String,
}
struct Project {
/// A short name
name: String,
/// A slightly longer description (should fit in a headline)
description: Option<String>,
/// The default branch. "master" is assumed for git repositories if
/// unspecified.
default_branch: Option<String>,
}
There are currently no restrictions on the length (in bytes) of the fields. Applications may add additional payload data, but must do so in a way which unambiguously preserves the shape of the above definitions. See Serialisation.
Radicle URNs
Identities are addressable within the Radicle Network by their stable identifier, encoded as a URN. Radicle URNs are syntactically and functionally equivalent to URNs as per RFC8141.
The syntax of a Radicle URN is defined as follows:
"rad" ":" protocol ":" root [ "/" path ]
where:
protocol = "git"
root = MULTIBASE(MULTIHASH(id))
path = pct-encoded
id = BYTES
The id is the root field of a verified Identity, as specified previously.
The MULTIBASE and MULTIHASH encodings are specified in multibase and
multihash, respectively. The preferred alphabet for the multibase encoding
is z-base32. pct-encoded is defined in RFC3986, and the equivalence
rules as per RFC8141 apply.
An example of a Radicle URN:
rad:git:hnrkmzko1nps1pjogxadcmqipfxpeqn6xbeto
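The URN syntax above can be split into its components with a few lines of string handling. The following is a minimal sketch, not the parser used by any Radicle implementation: the names `RadUrn`, `UrnError` and `parse_urn` are hypothetical, the root is kept as an opaque string (no multibase or multihash decoding), and the path is left percent-encoded.

```rust
/// A parsed Radicle URN of the form `rad:git:<root>[/<path>]`.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub struct RadUrn {
    /// MULTIBASE(MULTIHASH(id)), kept as an opaque string in this sketch.
    pub root: String,
    /// Optional path component, still percent-encoded.
    pub path: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum UrnError {
    /// The input does not have the `rad:<protocol>:<root>` shape.
    Malformed,
    /// The protocol component is something other than `git`.
    UnsupportedProtocol(String),
    /// The root component is empty.
    EmptyRoot,
}

/// Split a URN string into root and optional path.
pub fn parse_urn(input: &str) -> Result<RadUrn, UrnError> {
    // At most three pieces: "rad", the protocol, and everything after it.
    let mut parts = input.splitn(3, ':');
    if parts.next() != Some("rad") {
        return Err(UrnError::Malformed);
    }
    let protocol = parts.next().ok_or(UrnError::Malformed)?;
    if protocol != "git" {
        return Err(UrnError::UnsupportedProtocol(protocol.to_string()));
    }
    let rest = parts.next().ok_or(UrnError::Malformed)?;
    // The first '/' separates the root from the optional path.
    let (root, path) = match rest.split_once('/') {
        Some((root, path)) => (root, Some(path.to_string())),
        None => (rest, None),
    };
    if root.is_empty() {
        return Err(UrnError::EmptyRoot);
    }
    Ok(RadUrn { root: root.to_string(), path })
}
```

Calling `parse_urn` on the example URN yields a `RadUrn` whose `root` is `hnrkmzko1nps1pjogxadcmqipfxpeqn6xbeto` and whose `path` is `None`.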
Delegations
As described in Data Model, Radicle Link distinguishes two types of
identities: personal and project. The first describes a person (usually denoted
as a peer) in the system, while the second describes a software project
(repository) on which one or more persons collaborate. Apart from their
payload types T, they differ in their delegations type D:
Personal identities can only delegate to anonymous keys, while project identities may attach a personal identity to a key delegation. More formally:
type User<T> = Identity<T, HashSet<PublicKey>>;
enum ProjectDelegation<U> {
Key(PublicKey),
User(User<U>),
}
type Project<T, U> = Identity<T, ProjectDelegation<U>>;
The delegations of a Project are also referred to as the project's
maintainers.
These delegations are used to verify the authenticity of data across the network. See more notes on Verification here.
Verification
Document revisions, as well as authenticity attestations, form a hash-linked
chain leading to the initial revision of the document. In order to verify the
authenticity of a given identity attestation (Identity), the chain of updates
must be contiguous and subject to the following verification rules.
There are four levels of validity:
1. Untrusted: The identity document is well-formed, and points to a root object which is retrievable from local storage.
2. Signed: The identity carrier passes 1., and is signed by at least one key specified in the delegations of the document.
3. Quorum: The identity carrier passes 2., and is signed by a quorum of the keys specified in the delegations of the document (Q > D/2).
4. Verified: The identity carrier passes 3., and:
   - The document does not refer to a previous revision, and no previous revision is known
   - Or, the set of signatures forms a quorum of the delegations of the previous revision.
It is an error if:
- No previous revision is given, but a parent in the identity attestation chain is found
- A previous revision is given, but the identity attestation chain does not yield a parent
- A previous revision is given, but it is not the same the parent attestation refers to
- The current and parent attestations refer to different roots
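The four validity levels can be read as a cascade of set checks. The sketch below is illustrative only and makes several assumptions: keys are represented by any hashable type `K`, the document is already known to be well-formed with a retrievable root, and every entry in `signers` has already passed cryptographic signature verification. The names `Validity`, `is_quorum` and `classify` are hypothetical, and the error conditions on the attestation chain are not modelled.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// The four validity levels, ordered from weakest to strongest.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Validity {
    Untrusted,
    Signed,
    Quorum,
    Verified,
}

/// True if the signers that are also delegates form a strict majority
/// of the delegations, i.e. Q > D/2.
pub fn is_quorum<K: Eq + Hash>(delegations: &HashSet<K>, signers: &HashSet<K>) -> bool {
    let q = delegations.intersection(signers).count();
    // Integer form of Q > D/2 that avoids rounding.
    q * 2 > delegations.len()
}

/// Classify one revision. `previous_delegations` is `None` for the
/// initial revision of the document.
pub fn classify<K: Eq + Hash>(
    delegations: &HashSet<K>,
    signers: &HashSet<K>,
    previous_delegations: Option<&HashSet<K>>,
) -> Validity {
    // Level 2 requires at least one signature by a delegate.
    if delegations.intersection(signers).count() == 0 {
        return Validity::Untrusted;
    }
    // Level 3 requires a quorum of the current delegations.
    if !is_quorum(delegations, signers) {
        return Validity::Signed;
    }
    // Level 4 additionally requires a quorum of the previous delegations,
    // unless there is no previous revision.
    match previous_delegations {
        None => Validity::Verified,
        Some(prev) if is_quorum(prev, signers) => Validity::Verified,
        Some(_) => Validity::Quorum,
    }
}
```

With three delegates, one valid signature only reaches `Signed`, while two reach `Quorum`, or `Verified` if they also form a majority of the previous revision's delegations.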
Git Implementation
Overview
Radicle uses Git as a database. This means everything is stored in a single Git monorepo that is read from and written to via a Radicle client, such as the CLI. Our Git implementation was devised to create an incentive for the seeder to provide all data necessary to resolve and verify a repository, while reducing latency by eliminating gossip queries and git fetches as much as possible.
Read more about our approach here
Namespacing
Radicle uses namespaces to store a peer's Radicle data. By using
namespacing, Radicle Link partitions a single git repository (or monorepo)
into logical, smaller repos that can be checked out individually. This monorepo
includes a peer's identity (personal & project) data, as well as all tracked
Git repositories.
The namespacing scheme has the following format:
# Owner of the monorepo
let PEER_ID;
# Peer tracked by $PEER_ID, either directly or transitively
let TRACKED_PEER_ID;
# Identity hash of the project or peer
let IDENTITY;
# Identity hashes of certifiers of $IDENTITY
let CERTIFIER[1..];
$PEER_ID/refs/
`-- namespaces
    `-- $IDENTITY
        `-- refs
            |-- heads # <-- code branches owned by $PEER_ID go here
            |-- rad
            |   |-- id # <-- points to the identity document history
            |   |-- signed_refs # <-- signed refs of the peer
            |   |-- self # <-- points to the identity of $PEER_ID
            |   `-- ids
            |       |-- $CERTIFIER[1]
            |       `-- $CERTIFIER[2]
            `-- remotes
                `-- $TRACKED_PEER_ID
                    |-- heads
                    `-- rad
                        |-- id
                        |-- signed_refs
                        |-- self
                        `-- ids
                            |-- $CERTIFIER[1]
                            `-- $CERTIFIER[2]
Note that the owned $CERTIFIER[n] refs (i.e. not those of remotes) are
symbolic refs, pointing to the rad/id branch of the respective
namespace. For example, if identity A is certified by identity B,
refs/namespaces/A/refs/rad/ids/B would contain:
ref: refs/namespaces/B/refs/rad/id
The certifier can only certify if the certifying identity is present locally and is logically valid for the certifier to use for certifying. The symref ensures that the certifying identity can be updated in one place, and stays up-to-date at all use sites without maintenance.
The rad/self branch identifies $PEER_ID, i.e. the rad/id branch of the
corresponding identity namespace. For example, if the identity of $PEER_ID is
C, rad/self within the context of $IDENTITY would be a symref:
ref: refs/namespaces/C/refs/rad/id
Any certifiers of the self identity must be included under rad/ids/. The
rad/self branch contains the equivalent to the contributor file in the
radicle-link spec, rev1-draft, which is required iff the refs/heads/ hierarchy
of $PEER_ID is non-empty (i.e. it is permissible to omit it if $PEER_ID does not
publish any branches of their own).
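To make the namespacing scheme concrete, the ref names it describes can be assembled mechanically from an identity hash, a certifier hash and a tracked peer ID. The helper names below (`rad_id_ref`, `certifier_ref`, `symref_content`, `remote_ref`) are hypothetical and exist only for this sketch; they simply format strings according to the layout shown above and do not touch a Git repository.

```rust
/// Ref holding the identity document history of `identity`.
pub fn rad_id_ref(identity: &str) -> String {
    format!("refs/namespaces/{}/refs/rad/id", identity)
}

/// Ref under which `certifier` is recorded as a certifier of `identity`.
pub fn certifier_ref(identity: &str, certifier: &str) -> String {
    format!("refs/namespaces/{}/refs/rad/ids/{}", identity, certifier)
}

/// Content of a symbolic ref that points at the `rad/id` branch of
/// `target_identity`, in the textual form Git uses for symrefs.
pub fn symref_content(target_identity: &str) -> String {
    format!("ref: {}", rad_id_ref(target_identity))
}

/// Ref of a tracked peer inside the namespace of `identity`.
/// `suffix` is the part below the peer, e.g. "heads/main" or "rad/id".
pub fn remote_ref(identity: &str, tracked_peer: &str, suffix: &str) -> String {
    format!(
        "refs/namespaces/{}/refs/remotes/{}/{}",
        identity, tracked_peer, suffix
    )
}
```

For the certification example above, `certifier_ref("A", "B")` gives `refs/namespaces/A/refs/rad/ids/B`, and `symref_content("B")` gives the line that symref would contain.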
Git Encoding
In the git implementation, a Doc corresponds to a blob object, stored as
the single entry of a tree object, such that its name (according to the tree) is
equal to the blob hash of the initial version of the Doc, serialised in
canonical form. That is:
let name = git hash-object -t blob doc.canonical_form()
An Identity corresponds to a commit object.
We map the fields as follows:
/* Simplified git object model */
struct Commit {
id: Oid,
tree: Tree,
message: String,
}
struct Tree {
id: Oid,
entries: Vec<TreeEntry>,
}
struct TreeEntry {
id: Oid,
name: String,
object: BlobOrTree,
}
struct Blob {
id: Oid,
content: Vec<u8>,
}
/* Mapping (trivial type conversions elided) */
let commit = /* .. */;
let identity = Identity {
id: commit.id,
root: commit.tree.entries[0].name,
revision: commit.tree.id,
doc: deserialize(first_blob(commit.tree).content),
signatures: from_trailers(commit.message),
};
Where:
- `first_blob` finds the first TreeEntry which is of type blob.
- `deserialize` is implemented by a standard JSON parser. User delegations from a Project are specified in the Project's Doc as URNs, which are resolved by parsing a blob object of the same name as the URN's id field below the tree entry of type directory named delegations.
- `from_trailers` interprets the commit message as per git-interpret-trailers, and extracts the signatures from trailers with the token x-rad-signature.
The commit chain is stored in a branch at refs/rad/id in the peer's monorepo.
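As an illustration of the `from_trailers` step, the sketch below extracts signatures from the last paragraph of a commit message. It is a simplification rather than a full implementation of the git-interpret-trailers rules (for example, it ignores continuation lines). The trailer value layout `<public key> <signature>` and the case-insensitive token match are assumptions made for this example, not something stated in the text above, and keys and signatures are kept as plain strings.

```rust
use std::collections::HashMap;

/// Trailer token named in the text above.
const TOKEN: &str = "x-rad-signature";

/// Collect `public key -> signature` pairs from the trailers of a commit
/// message. Assumes each trailer value is "<public key> <signature>".
pub fn from_trailers(message: &str) -> HashMap<String, String> {
    // Trailers live in the last paragraph of the message.
    let last_paragraph = message.trim_end().rsplit("\n\n").next().unwrap_or("");
    let mut signatures = HashMap::new();
    for line in last_paragraph.lines() {
        if let Some((token, value)) = line.split_once(':') {
            if !token.trim().eq_ignore_ascii_case(TOKEN) {
                continue; // some other trailer, e.g. Signed-off-by
            }
            let mut fields = value.split_whitespace();
            // Accept only values with exactly two fields.
            if let (Some(key), Some(sig), None) =
                (fields.next(), fields.next(), fields.next())
            {
                signatures.insert(key.to_string(), sig.to_string());
            }
        }
    }
    signatures
}
```

The resulting map has the same shape as the `signatures` field of `Identity`, with strings standing in for `PublicKey` and `Signature`.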
Fetching
Fetching (or cloning) happens on a per-$IDENTITY basis, meaning when a peer
fetches a project, they only have to fetch information from the peers they are
interested in. Due to the certifier symrefs, the peer being fetched from
is able to advertise a "proof" (or perhaps better: "promise") to be able to
include all relevant data (the rad/id branches) in the packfile. This enables,
in a single packfile, a "mirror" of the logical remote repository requested, as
well as all of the top-level logical repositories of all certifiers required to
verify the identity document(s).
See more on Fetching
Working Copies
Within the rad refs category, peers can obtain a working copy of a specific
repository that can hide "special" branches due to our namespacing scheme. We
call this checking out.
[remote "rad"]
url = file://path/to/monorepo.git
fetch = +refs/namespaces/$IDENTITY/refs/heads/*:refs/remotes/rad/refs/heads/*
In addition to this, we can see the branches of tracked peers by running git
branch. To provide a human-readable view of a project's remotes, when fetching
we inspect the rad/self identity metadata in order to find nicknames. This is
managed entirely by librad, which reduces to the following in the working
copy's config:
[include]
path = /path/to/managed.inc
Peer Discovery & Replication
Overview
Radicle Link extends Git with peer-to-peer network discovery via a process called gossip. This means that peers in the network share and spread data they are "interested" in by keeping (replicating) redundant copies locally and sharing deltas with peers. With Radicle, we replicate data across connected repositories according to a "social graph" of peers and projects, enabling source code and changesets to be disseminated according to use and value: the more peers who are interested in a certain project, the more available this project is made to the network.
Replication Model
Repositories are the base unit of replication in Radicle. To publish a
repository to the network, it must first be initialized as a project. Projects
combine source code, issues and proposed changes under a single umbrella, and
carry a unique, shareable peer-to-peer identifier. The entirety of the project
data and metadata, including social artefacts such as comments, are stored
within the repository. To create a project, the owner of a repository defines a
project identity. In the background, a project identity document is created in a
predetermined disjoint branch of the repository, by convention rad/id. This
file contains important metadata such as the project name, list of maintainers,
as well as any related links.
The unit of replication is a repository, identified by a PeerID in the context
of a project document (See Data Model). The holder of the corresponding
DeviceKey is referred to as the maintainer of the repository. Repositories
belonging to the same project are represented locally as a single repository,
identified by a Radicle URN. In the
context of a project, the maintainer of a repository may choose to track the
repositories of other peers (this is called a remote in git terminology: a named
reference to a remote repository). If the remote repository is found to track
other remotes, the tracking repository will also transitively track those, up to
n hops out.
Therefore, a project on Radicle preserves the transitivity information of its remotes (i.e. via which tracked PeerID another PeerID is tracked).
Tracking
Tracking is the backbone of collaboration as it drives the exchange of projects and their artifacts. In Radicle, peers track other peers and projects that they are interested in. This happens when a peer clones another peer's project.
Since peers represent separate devices in the network, they each have their own view of the network. Each peer tracks this view of projects, identities, and data from connected peers in its own monorepo (See Git Implementation).
When a peer tracks another peer in the context of a project (say, by cloning another peer's project), it sets the intention to fetch and gossip the other peer's view of that project. This means the project metadata, all working branches and commits, and changesets will be replicated and stored in the tracking peer's monorepo, so that they can be fetched and collaborated on.
Specifically, this means the tracked peer will be added as a remote within the
tracking peer's monorepo as a $TRACKED_PEER_ID, as seen in the following
example:
# Owner of the monorepo
let PEER_ID;
# Peer tracked by $PEER_ID, either directly or transitively
let TRACKED_PEER_ID;
# Identity hash of the project or user
let IDENTITY;
# Identity hashes of certifiers of $IDENTITY
let CERTIFIER[1..];
$PEER_ID/refs/
`-- namespaces
    `-- $IDENTITY
        `-- refs
            |-- heads # <-- code branches owned by $PEER_ID go here
            |-- rad
            |   |-- id # <-- points to the identity document history
            |   |-- signed_refs # <-- signed refs of the peer
            |   |-- self # <-- points to the identity of $PEER_ID
            |   `-- ids
            |       |-- $CERTIFIER[1]
            |       `-- $CERTIFIER[2]
            `-- remotes
                `-- $TRACKED_PEER_ID
                    |-- heads # <-- code branches owned by $TRACKED_PEER_ID go here
                    `-- rad
                        |-- id
                        |-- signed_refs
                        |-- self # <-- points to the identity of $TRACKED_PEER_ID
                        `-- ids
                            |-- $CERTIFIER[1]
                            `-- $CERTIFIER[2]
See Namespacing for more on monorepo structure.
As well as the tracked peer being added under remotes/$TRACKED_PEER_ID, the
tracking peer's monorepo will also gain an entry in its config file (See
Fetching). This means that if the tracked peer announces changes and they are
gossiped to the tracking peer, whether through their device or another peer's,
they can be fetched and applied to the tracking peer's monorepo.
Direct Tracking
The other way a peer can track another peer is by explicitly telling their
monorepo to track a specific PEER_ID. Using the track function with the
PEER_ID of interest, the monorepo creates a new entry in the git config. Any
updates from the tracked peer can be similarly fetched and applied to the
tracking peer's monorepo.
The Social Graph
In the case of multiple peer replications, any peer that tracks a project will implicitly track its maintainers as well. This means that when any peer on the network clones a project, all of said project's maintainers will end up in that peer's remote list. Since maintainers of the project work on the canonical view of the project, this automatic tracking ensures the health and consistency of a project as it's gossiped across the network.
This also means that for a single PEER_ID, we have a sub-graph that consists
of more PEER_IDs, whether they be the maintainers of the project or other
tracked peers. Any time a peer is replicated, a portion of their sub-graph is
replicated as well, up to 2 hops out.
This means that every time you track a peer, you are not only adding them as a remote, but also their remotes, and the remotes of their remotes. This ensures that a project is consistently available across the network without a total reliance on the maintainers of the project or the original tracked peer.
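The "up to N hops out" rule amounts to a bounded breadth-first traversal of the tracking graph. The following sketch assumes the graph is available as an in-memory adjacency map from a peer to the peers it tracks. The function name `tracked_within` and the string peer IDs are hypothetical, and how an actual implementation discovers remotes is out of scope here.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Return every peer reachable from `start` by following at most
/// `max_hops` tracking edges. `start` itself is not included.
pub fn tracked_within<'a>(
    graph: &HashMap<&'a str, Vec<&'a str>>,
    start: &'a str,
    max_hops: usize,
) -> HashSet<&'a str> {
    let mut seen: HashSet<&'a str> = HashSet::new();
    seen.insert(start);
    let mut queue: VecDeque<(&'a str, usize)> = VecDeque::new();
    queue.push_back((start, 0));

    while let Some((peer, hops)) = queue.pop_front() {
        if hops == max_hops {
            continue; // do not expand beyond the hop limit
        }
        if let Some(remotes) = graph.get(peer) {
            for &remote in remotes {
                // `insert` returns false for peers we have already visited,
                // which also protects against cycles in the graph.
                if seen.insert(remote) {
                    queue.push_back((remote, hops + 1));
                }
            }
        }
    }
    seen.remove(start);
    seen
}
```

For a chain where `a` tracks `b`, `b` tracks `c`, and `c` tracks `d`, a limit of 2 hops from `a` yields `b` and `c` but not `d`.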
Validation
To ensure data integrity and authenticity, when creating a working copy of a project, the attestation history according to the remote peer is fetched before all other repository contents, and the verification procedure (See Verification) is run on it. If this does not yield a verified status, the clone is aborted. The resulting repository state must include the attestation histories of at least a quorum of the delegates (See Delegation) as per the remote peer's view of the identity document. In Git, the claim that this will be the case can be determined before fetching the repository contents by examining the advertised remote refs (See Fetching). If these preconditions are not met, the clone is aborted, and already fetched data is pruned.
Seeding
To improve data availability, participants in the network can choose to act as seeds. This is similar in concept to a pub in Secure Scuttlebutt. Seed nodes are "always-on" nodes running on public IP addresses that serve data to any connected peers. When you join a seed node, it automatically tracks you and shares your data across its network of other connected users. This increases the availability of your data throughout the network, while making it easier to find others' data as well.
A seed may track a large number of repositories for a given project, so cloning from a seed will greatly increase the connectedness of a tracking graph. Also note that, by tracking a seed, upstream maintainers can increase the number of paths leading back to them, such that contributions can flow back up even if they come from participants not within the set of tracked repositories of a maintainer.
Collaboration Model
Our construction of the Identity from a git commit allows for multiple ids
to describe the same revision of the document (and thus be equally valid). This
means that the respective delegates' histories may diverge in their commit
histories, but still converge to an agreement on the validity of the attested
document revision.
This means that there isn't a single canonical branch (or master),
as peers are all maintaining their own upstreams of the same project. However,
due to the data model of Radicle identities, there will always be a 'canonical'
view of a project associated with its maintainers. Maintainers can follow a
leader-based workflow in which they are converging histories of contributing
peers into their main branch. Since their view is verifiable and implicitly
tracked whenever a peer follows a project, peers can ensure they are replicating
a canonical and updated view of a project.
In addition to this, the way Radicle Link works introduces certain implications for end-user collaboration experience:
- Your social graph determines what type of content you see, interact with and replicate.
Assuming you have discovered a project of interest within the Radicle network (more on discoverability later), then the first thing you have to do in order to interact with it is to track it. Tracking a project signals interest, and by design implies tracking the project's maintainers, therefore replicating the data within their social graphs.
In the context of a project, maintainers of a repository may choose to track the views of other owners (this is called a remote in Git terminology: a named reference to a remote repository). If the remote repository is found to track other remotes, the tracking repository shall also transitively track those, up to a configurable N hops out (currently in the works).
- Spam and content moderation is naturally handled by the peer's social graph
While this might appear confusing at first, it is in fact far more natural (it mimics real-life communication) and by design addresses issues like spam and content moderation, which are naturally handled by the peer's social graph.
A spammer's patches or issues will never be tracked by the actual maintainers, and as a result they won't be seen by the rest of the network (unless explicitly tracked). Similarly, if you are not interested in a peer's views or contributions to a project, you can simply un-follow them and stop replicating, viewing and interacting with their data.
- Within the same project, two peers might have diverging views.
The above design also means that even within the same project, peers have subjective (and often diverging) views.
At minimum, your view of a project becomes the sum of the views of the people you follow, plus the views of the maintainers of the project. In addition, you can expand your perspective by configuring your replication settings to also transitively track other remotes N hops out from the peers you follow (i.e. peers of your peers / remotes of your remotes).
This design also addresses a significant problem with decentralized systems relying exclusively on distributed ledger technology: the problem of "blockchain poisoning". This is when someone deliberately adds illegal content to an append-only source in the hope of making the sole act of replicating the project legally problematic, as correctly pointed out by Konstantin Ryabitsev of the Linux Foundation with regard to a previous version of Radicle that relied on IPFS.
Read more about the Radicle collaboration model.