Authors:
(1) Martin Kleppmann, University of Cambridge, Cambridge, UK (martin.kleppmann@cst.cam.ac.uk);
(2) Paul Frazee, Bluesky Social PBC, United States;
(3) Jake Gold, Bluesky Social PBC, United States;
(4) Jay Graber, Bluesky Social PBC, United States;
(5) Daniel Holmgren, Bluesky Social PBC, United States;
(6) Devin Ivy, Bluesky Social PBC, United States;
(7) Jeromy Johnson, Bluesky Social PBC, United States;
(8) Bryan Newbold, Bluesky Social PBC, United States;
(9) Jaz Volpert, Bluesky Social PBC, United States.
Table of Links
2.3 Custom Feeds and Algorithmic Choice
3 The AT Protocol Architecture
3.2 Personal Data Servers (PDS)
3.4 Labelers and Feed Generators
5 Conclusions, Acknowledgments, and References
ABSTRACT
Bluesky is a new social network built upon the AT Protocol, a decentralized foundation for public social media. It was launched in private beta in February 2023, and has grown to over 3 million registered users in the following year. In this paper we introduce the architecture of Bluesky and the AT Protocol, which is inspired by the web itself, but modernized to include streams of real-time updates and cryptographic authentication. We explain how the technical design of Bluesky is informed by our goals: to enable decentralization by having multiple interoperable providers for every part of the system; to make it easy for users to switch providers; to give users agency over the content they see; and to provide a simple user experience that does not burden users with complexity arising from the system’s decentralized nature. The system’s openness allows anybody to contribute to content moderation and community management, and we invite the research community to use Bluesky as a dataset and testing ground for new approaches in social media moderation.
1 INTRODUCTION
Over the last two decades, social media services have evolved from a fun curiosity into a cornerstone of civic life [5]. This development has been accompanied by increasing unease that mainstream “digital town squares”, such as Twitter/X or Facebook, are under the control of a single corporation, and may change their policies on the whim of their leaders [62]. Their operations are opaque (e.g. regarding which content is recommended to users), and their users lack agency over their user experience. As a result, there has been increasing interest in decentralized social networks, of which the fediverse around the ActivityPub protocol [34] and the Mastodon software [39] is perhaps the best known (we review a selection of decentralized social networks in Section 4).
However, decentralization also introduces new challenges. For example, in the case of Mastodon, a user needs to choose a server when creating an account. This choice is significant because the server name becomes part of the username; migrating to another server implies changing username, and preserving one’s followers during such a migration requires the cooperation of the old server. If a server is shut down without warning, accounts on that server cannot be recovered – a particular risk with volunteer-run servers. In principle, a user can host their own server, but only a small fraction of social media users have both the technical skills and the inclination to do so.
The distinction between servers in Mastodon introduces complexity for users that does not exist in centralized services. For example, a user viewing a thread of replies in the web interface of one server may see a different set of replies compared to viewing the same thread on another server, because a server only shows those replies that it knows about [2]. As another example, when viewing the web profile of an account on another server, clicking the “follow” button does not simply follow that account; instead, the user needs to enter the hostname of their own server and be redirected to a URL on their home server before they can follow the account. In our opinion, it is undesirable to burden users with such complexity arising from the federated architecture.
In this paper we introduce the AT Protocol (atproto), a decentralized foundation for social networking, and Bluesky, a Twitter-style social app built upon it. A core design goal of atproto and Bluesky is to enable a user experience of the same or better quality as centralized services, while being open and decentralized on a technical level. We introduce the user-facing features of Bluesky in Section 2, and in Section 3 we explain the underlying systems architecture. The AT Protocol is designed such that for every part of the system there are multiple competing operators providing interoperable services, making it easy to switch from one provider to another.
Decentralization alone is not able to solve some of the thorniest problems of social media, such as misinformation, harassment, and hate speech [46]. However, by opening up the internals of a service to contributors who are not employees of a particular company, decentralization can enable a marketplace of approaches to these problems [38]. For example, Bluesky allows anybody to run moderation services that make subjective decisions of selecting desirable content or flagging undesirable content, and users can choose which moderation services they want to subscribe to. Moderation services are decoupled from hosting providers, making it easy for users to switch moderation services until they find ones that match their preferences. Our hope is that this architectural openness enables communities to develop their own approaches to managing problematic content, independently of what any particular service operator implements [38].
For example, researchers wanting to identify disinformation campaigns can easily get access to all content being posted, the social graph, and user profiles on Bluesky. If they are able to construct an algorithm to label suspected disinformation, they can publish their labels in real time, and users who wish to see those labels can enable them in their client software. One goal of this paper is to bring Bluesky and the AT Protocol to the attention of researchers working on such algorithms, and to invite them to use the rapidly growing dataset of Bluesky content as a basis for their work.
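The opt-in relationship between published labels and the users who enable them can be illustrated with a minimal Python sketch. It is a toy in-memory model, not the AT Protocol's actual label format or API: the `Label` class, the `visible_labels` function, and all identifiers such as `research-lab` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A hypothetical label record published by a moderation service."""
    labeler: str  # assumed identifier of the service that issued the label
    post_id: str  # assumed identifier of the labelled post
    value: str    # e.g. "suspected-disinformation"


def visible_labels(labels, subscriptions):
    """Group label values by post, keeping only labels whose labeler
    the user has chosen to subscribe to."""
    by_post = {}
    for label in labels:
        if label.labeler in subscriptions:
            by_post.setdefault(label.post_id, []).append(label.value)
    return by_post


# Labels published by two independent (hypothetical) services.
published = [
    Label("research-lab", "post-1", "suspected-disinformation"),
    Label("other-labeler", "post-1", "spam"),
    Label("research-lab", "post-2", "suspected-disinformation"),
]

# Alice enables the researchers' labels; Bob enables none.
alice_view = visible_labels(published, {"research-lab"})
bob_view = visible_labels(published, set())
```

In this sketch the same published labels yield different views for different users, which mirrors the idea that labels are applied according to each user's own choice of moderation services rather than by a hosting provider.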
This paper is available on arXiv under the CC BY 4.0 DEED license.