Understanding Decentralized Social Networks

25 Sept 2024

Authors:

(1) Martin Kleppmann, University of Cambridge, Cambridge, UK (martin.kleppmann@cst.cam.ac.uk);

(2) Paul Frazee, Bluesky Social PBC, United States;

(3) Jake Gold, Bluesky Social PBC, United States;

(4) Jay Graber, Bluesky Social PBC, United States;

(5) Daniel Holmgren, Bluesky Social PBC, United States;

(6) Devin Ivy, Bluesky Social PBC, United States;

(7) Jeromy Johnson, Bluesky Social PBC, United States;

(8) Bryan Newbold, Bluesky Social PBC, United States;

(9) Jaz Volpert, Bluesky Social PBC, United States.

Abstract and 1 Introduction

2 The Bluesky Social App

2.1 Moderation Features

2.2 User Handles

2.3 Custom Feeds and Algorithmic Choice

3 The AT Protocol Architecture

3.1 User Data Repositories

3.2 Personal Data Servers (PDS)

3.3 Indexing Infrastructure

3.4 Labelers and Feed Generators

3.5 User Identity

4 Related Work

5 Conclusions, Acknowledgments, and References

Several other decentralized social networks are in development. We believe that there is no single optimal design: different systems make different trade-offs, and are therefore suitable for different purposes. Bluesky and the AT Protocol aim to provide a good user experience by making moderation a first-class concern, by having clients that are lightweight and fast, and by providing a global view over the whole network. For example, conversation threads include all replies (unless removed by moderation), regardless of the server on which they were posted. To achieve this goal we rely on an indexing infrastructure that is more centralized than some other designs. However, we emphasize that there can be multiple competing indexers, and third-party client apps are free to show data from whichever indexers they wish.

In 2021 some of our team published a review of the decentralized social ecosystem [25]. In this section we summarize developments since then, and we refer to the review for a more comprehensive comparison of protocols and systems.

Many decentralized social networking projects have ideas in common. For example, the idea of using DNS domain names as usernames also appears in Nostr [23]. An atproto PDS has similarities to Git repository hosting (e.g. GitHub/GitLab) or a Solid pod [49]. There are also federated chat systems such as Matrix [40], IRC [43], and XMPP [47], but we focus on systems that provide a Twitter-like model where users follow each other.

4.1 Scuttlebutt

Secure Scuttlebutt (SSB) is a peer-to-peer social networking protocol [1]; Manyverse [57] and Planetary [59] are social applications built upon the SSB protocol. SSB optionally uses relay servers called pubs to store messages from peers that are offline, and to enable user discovery. The client software downloads the feeds from accounts that the user is explicitly following, and from accounts followed by followed accounts (up to three hops by default). This can require significant amounts of storage and bandwidth on the client.
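As a rough illustration, the hop-limited replication described above can be modeled as a breadth-first traversal of the follow graph. This is a minimal sketch and not SSB's actual implementation: the function name `feeds_to_replicate`, the dictionary representation of the follow graph, and the sample accounts are all hypothetical.

```python
from collections import deque

def feeds_to_replicate(follows, me, max_hops=3):
    """Return {account: hop distance} for accounts within max_hops of `me`.

    `follows` maps an account to the accounts it follows (hypothetical model).
    """
    distance = {me: 0}
    queue = deque([me])
    while queue:
        account = queue.popleft()
        if distance[account] == max_hops:
            continue  # do not expand beyond the hop limit
        for followed in follows.get(account, ()):
            if followed not in distance:
                distance[followed] = distance[account] + 1
                queue.append(followed)
    del distance[me]  # the user's own feed is not a remote feed
    return distance

# Hypothetical follow graph: alice -> bob -> carol -> dave -> erin
follows = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["dave"],
    "dave": ["erin"],
}
reachable = feeds_to_replicate(follows, "alice")
print(reachable)  # {'bob': 1, 'carol': 2, 'dave': 3}
```

In this model erin is four hops away from alice, so her feed is never downloaded. Because the number of reachable accounts can grow quickly with each additional hop, the sketch also hints at why storage and bandwidth on the client can become significant.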

Any messages from users outside of the third-degree network are not shown, which effectively limits the set of people who can mention or reply to a user to the third-degree network. This deliberate design decision is intended to reduce moderation problems by prioritizing conversation between people who already know each other [53]. In contrast, Bluesky/atproto are designed to allow anybody to talk to anybody else. This requires more explicit moderation to manage unwanted content, but we believe it also enables serendipity and is a prerequisite for any “digital town square”.

Since SSB is built upon append-only logs and gossip replication, it is not possible to delete content once it has been posted [56]. User identity is tied to a cryptographic key on the user’s device, requiring manual key management for moving to another device. Posting from multiple devices is not possible, as sharing the same key between devices can make an account unrecoverable [55]. A work-in-progress successor protocol to SSB, called PPPPP, is designed to address these issues [52].
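One way to see why deletion is awkward in an append-only log is to model the log as a hash chain, in which every entry commits to its predecessor. The sketch below is a simplified assumption rather than SSB's real message format: it uses unsigned JSON entries and SHA-256, and the field names and the `append` and `verify` helpers are hypothetical.

```python
import hashlib
import json

FIELDS = ("sequence", "previous", "content")

def entry_hash(body):
    # Hash a canonical JSON encoding of the entry body.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append(log, content):
    """Append an entry that links to the hash of the previous entry."""
    previous = log[-1]["id"] if log else None
    body = {"sequence": len(log) + 1, "previous": previous, "content": content}
    log.append({**body, "id": entry_hash(body)})

def verify(log):
    """Check that sequence numbers, back-links, and hashes are all consistent."""
    previous = None
    for index, entry in enumerate(log, start=1):
        body = {key: entry[key] for key in FIELDS}
        if (entry["sequence"] != index
                or entry["previous"] != previous
                or entry["id"] != entry_hash(body)):
            return False
        previous = entry["id"]
    return True

log = []
for text in ["hello", "a post I regret", "goodbye"]:
    append(log, text)

intact = verify(log)                       # the full log verifies
without_second = [log[0], log[2]]          # try to drop the middle entry
still_valid = verify(without_second)       # the chain no longer verifies
```

In this model, removing the middle entry leaves the third entry pointing at a hash that is no longer present, so a peer that verifies the chain rejects the shortened log. Peers that have already replicated the original entries through gossip would also still hold them.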

4.2 Nostr

Nostr also began as a revision of SSB, replacing the append-only logs with individual signed messages [30]. It leans more heavily on relay servers instead of peer-to-peer communication: clients publish and fetch messages on relays of their choice, and there is no federation among relays [21]. The protocol is deliberately simple, and it prioritizes censorship resistance over moderation: relays can block users, but users can always move to a new relay, and use multiple relays at the same time. Communication (e.g. reply threads) is only possible between users who have at least one relay in common. Although some services index the whole Nostr network, these indexes are not used for user-to-user interaction. As a result, it is unpredictable who will see which message. The creator of Nostr writes: “there are no guarantees of anything [...] to use Nostr one must embrace the inherent chaos” [22]. Key management is manual in Nostr, and facilities for key rotation are still under discussion [4].
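The consequence of having no federation among relays can be illustrated with a toy model in which each relay stores only the messages published to it. This is a sketch under stated assumptions, not the Nostr wire protocol: the `RelayNetwork` class, the plain-string messages, and the relay URLs are hypothetical.

```python
from collections import defaultdict

class RelayNetwork:
    """Toy model: independent relays that never exchange messages."""

    def __init__(self):
        self.stored = defaultdict(list)  # relay URL -> messages held there

    def publish(self, message, relays):
        # A client writes the message to each relay it has chosen.
        for relay in relays:
            self.stored[relay].append(message)

    def fetch(self, relays):
        # A client reads from its own relays and de-duplicates the results.
        seen = []
        for relay in relays:
            for message in self.stored.get(relay, []):
                if message not in seen:
                    seen.append(message)
        return seen

network = RelayNetwork()
alice_relays = ["wss://relay-1.example", "wss://relay-2.example"]
bob_relays = ["wss://relay-2.example", "wss://relay-3.example"]
carol_relays = ["wss://relay-4.example"]

network.publish("bob: reply to alice", bob_relays)
network.publish("carol: reply to alice", carol_relays)

alice_view = network.fetch(alice_relays)
print(alice_view)  # ['bob: reply to alice']
```

Alice and Bob share one relay, so Bob's reply reaches her. Carol's reply does not, because her only relay is one that Alice never queries.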

4.3 Farcaster and blockchain-based systems

Farcaster [60] has some architectural similarities to Bluesky/atproto, although it was developed independently. It has storage servers called hubs, which store the state of the entire network similarly to an atproto Relay, and it has a concept of hosted app servers that are similar to our App View [51]. Farcaster user IDs are similar to our DIDs, and they are mapped to public keys using a smart contract on the Ethereum Optimism blockchain that is functionally similar to our PLC directory. Usernames can be either ENS names [19], or names within an off-chain namespace managed centrally by Farcaster, similarly to .bsky.social subdomains in Bluesky [20].

A difference is that Farcaster has no equivalent to atproto’s PDS; instead, client apps publish signed messages directly to a hub, and hubs synchronize messages using a convergent gossip protocol. Users must pay in cryptocurrency to register their public key, and for hub data storage (at the time of writing, Ethereum equivalent to $5 USD/year); when a user exceeds their storage allowance, old messages are deleted. Fees are currently collected centrally by the Farcaster team [29]. In contrast, the AT Protocol does not specify storage limitations, but leaves it to providers of PDS and indexing services to define their own business model and abuse-prevention policies. We also prefer to avoid a dependency on a cryptocurrency.
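The effect of a per-user storage allowance can be sketched with a bounded queue that discards the oldest message when a new one arrives at capacity. This is an illustrative assumption only: Farcaster's actual storage accounting is not described here, and the `UserStore` class and the allowance of three messages are hypothetical.

```python
from collections import deque

class UserStore:
    """Toy per-user message store with a fixed allowance."""

    def __init__(self, allowance):
        # A deque with maxlen drops the oldest item once it is full.
        self.messages = deque(maxlen=allowance)

    def add(self, message):
        self.messages.append(message)

store = UserStore(allowance=3)
for number in range(1, 6):
    store.add(f"message {number}")

remaining = list(store.messages)
print(remaining)  # ['message 3', 'message 4', 'message 5']
```

Under this policy the user never has to delete anything manually, but the earliest messages disappear without any explicit action once the allowance is exceeded.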

The Lens protocol [35] is more strongly blockchain-based than Farcaster: it even stores high-volume user actions such as posts and follows on Polygon, a proof-of-stake blockchain. DSNP takes a similar approach [44]. Placing high-volume events directly on a blockchain incurs orders of magnitude higher per-user costs than atproto, and is likely to run into scalability limits as the number of users grows. Lens is adopting a layer-3 blockchain that provides better scalability and lower cost [36], but weaker security properties. Linking social accounts to cryptocurrency wallets and NFTs enables users to monetize their content, but this is not a goal of atproto.

4.4 ActivityPub and Mastodon

ActivityPub [34] is a W3C standard for social networking, and Mastodon [39] is its most popular implementation. We have highlighted aspects of their design in Sections 1, 2.3, and 3.2. Mastodon gives a lot of power to server administrators: for example, a server admin can choose to block another server, preventing all communication between users on those servers. There is a degree of lock-in to a server because moving to another server is intrusive: the username changes, moving posts to the new server currently requires an experimental command-line tool [48, 58], and other users’ replies to those posts are lost. These risks can be mitigated by self-hosting; managed providers exist [28], but they still require some expertise and cost money. The AT Protocol separates the roles of moderation and hosting, and aims to make it easier to change providers.

When user 𝐴 follows user 𝐵, 𝐴’s server asks 𝐵’s server to send it notifications of 𝐵’s future posts via ActivityPub. This architecture has the advantage of not requiring a whole-network index. However, replies to a post notify the server of the original poster, but not necessarily every server that has a copy of the original post, leading to inconsistent reply threads on different servers. Notifications can be forwarded, but in the limit this leads to each server having a copy of the whole network, which would make it expensive to run a server. Viral posts can generate a lot of inbound requests to a server from people liking, replying, and boosting (reposting). The Bluesky indexing infrastructure is also fairly expensive, but a PDS is cheap to run. Since users can choose their moderation preferences independently from their indexing provider (App View), we believe that the ecosystem can be healthy with a small number of indexing providers.
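The inconsistency in reply threads can be reproduced in a small simulation of follower-based delivery. The model below is a deliberate simplification and not the ActivityPub message format: the `Fediverse` class, its methods, and the server names are hypothetical, and it assumes that a reply is delivered only to the replier's own server, the servers of the replier's followers, and the server of the original poster.

```python
from collections import defaultdict

class Fediverse:
    """Toy model of push-based delivery between servers."""

    def __init__(self):
        self.home = {}                     # user -> home server
        self.followers = defaultdict(set)  # user -> users following them
        self.copies = defaultdict(set)     # server -> post IDs stored there
        self.author = {}                   # post ID -> author
        self.parent = {}                   # post ID -> post it replies to

    def add_user(self, user, server):
        self.home[user] = server

    def follow(self, follower, followed):
        self.followers[followed].add(follower)

    def post(self, author, post_id, reply_to=None):
        self.author[post_id] = author
        self.parent[post_id] = reply_to
        # Deliver to the author's own server and the followers' servers.
        targets = {self.home[author]}
        targets |= {self.home[user] for user in self.followers[author]}
        if reply_to is not None:
            # A reply also notifies the original poster's server.
            targets.add(self.home[self.author[reply_to]])
        for server in targets:
            self.copies[server].add(post_id)

    def thread(self, server, root):
        # The thread as seen from one server: the root plus direct replies.
        return sorted(post for post in self.copies[server]
                      if post == root or self.parent.get(post) == root)

net = Fediverse()
net.add_user("alice", "a.example")
net.add_user("bob", "b.example")
net.add_user("carol", "c.example")
net.follow("bob", "alice")
net.follow("carol", "alice")

net.post("alice", "p1")                  # reaches a.example, b.example, c.example
net.post("carol", "r1", reply_to="p1")   # reaches c.example and a.example only

view_a = net.thread("a.example", "p1")
view_b = net.thread("b.example", "p1")
```

Here `view_a` contains both the post and the reply, while `view_b` contains only the post: Bob's server holds a copy of the original post but is never told about Carol's reply.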

This paper is available on arXiv under the CC BY 4.0 license.