The Role of Indexers and the Firehose in AT Protocol’s Decentralized Web

25 Sept 2024

Authors:

(1) Martin Kleppmann, University of Cambridge, Cambridge, UK (martin.kleppmann@cst.cam.ac.uk);

(2) Paul Frazee, Bluesky Social PBC, United States;

(3) Jake Gold, Bluesky Social PBC, United States;

(4) Jay Graber, Bluesky Social PBC, United States;

(5) Daniel Holmgren, Bluesky Social PBC, United States;

(6) Devin Ivy, Bluesky Social PBC, United States;

(7) Jeromy Johnson, Bluesky Social PBC, United States;

(8) Bryan Newbold, Bluesky Social PBC, United States;

(9) Jaz Volpert, Bluesky Social PBC, United States.

Abstract and 1 Introduction

2 The Bluesky Social App

2.1 Moderation Features

2.2 User Handles

2.3 Custom Feeds and Algorithmic Choice

3 The AT Protocol Architecture

3.1 User Data Repositories

3.2 Personal Data Servers (PDS)

3.3 Indexing Infrastructure

3.4 Labelers and Feed Generators

3.5 User Identity

4 Related Work

5 Conclusions, Acknowledgments, and References

3.4 Labelers and Feed Generators

Relay and App View aim to provide a mostly “unopinionated” service: they compute indexes over repositories in a neutral way, without attempting to rank or classify content. However, a good user experience also requires “opinionated” judgements for the purposes of content filtering (e.g. detecting sexually explicit images or spam) and curation (e.g. selecting posts on a particular topic).

The AT Protocol factors the “opinionated” aspects of the system out into separate services: labelers and feed generators. These services typically take the firehose as their input. Labelers produce a stream of judgements about content (e.g. “this post is spam”), whereas feed generators return a list of post IDs they have selected for inclusion in a custom feed, as described in Section 2.3. Users can choose in their client app which feeds and which labelers they want to use. The output of labelers is consumed by App Views or PDSes in order to apply content filtering [12]. For a feed generator, an App View expands the post IDs into full posts before sending them to the client app of users who have subscribed to that feed.
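The division of labour described above can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the Bluesky implementation: the firehose is modelled as an in-memory list, and the names Post, Label, spam_labeler, topic_feed, and app_view_render, as well as the post IDs and the keyword heuristics, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Post:
    uri: str   # hypothetical post ID
    text: str


@dataclass(frozen=True)
class Label:
    src: str   # the labeler that issued the judgement
    uri: str   # the post the judgement is about
    val: str   # the judgement itself, e.g. "spam"


def spam_labeler(firehose: Iterable[Post]) -> Iterator[Label]:
    """Labeler: turns the firehose into a stream of judgements."""
    for post in firehose:
        if "buy now" in post.text.lower():  # toy heuristic, illustration only
            yield Label(src="labeler.example", uri=post.uri, val="spam")


def topic_feed(firehose: Iterable[Post], keyword: str) -> list:
    """Feed generator: returns only the IDs of the posts it selects."""
    return [p.uri for p in firehose if keyword in p.text.lower()]


def app_view_render(post_ids, index, labels, hidden_vals):
    """App View: expands post IDs into full posts and applies label filtering."""
    flagged = {label.uri for label in labels if label.val in hidden_vals}
    return [index[uri] for uri in post_ids if uri in index and uri not in flagged]


# A three-event stand-in for the firehose.
firehose = [
    Post("at://alice/post/1", "New paper on CRDTs"),
    Post("at://bob/post/1", "BUY NOW: cheap CRDTs"),
    Post("at://carol/post/1", "Lunch was great"),
]
index = {p.uri: p for p in firehose}          # the App View's index of posts

labels = list(spam_labeler(firehose))          # judgements, kept separate from posts
feed_ids = topic_feed(firehose, "crdt")        # post IDs only, no post content
visible = app_view_render(feed_ids, index, labels, hidden_vals={"spam"})
```

In this sketch the feed generator selects two posts, the labeler flags one of them, and the App View is the only component that joins the two outputs: it resolves the IDs to full posts and drops the flagged one. Neither opinionated service needs to know about the other.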

Having labeler and feed generator services that are separate from App Views has several advantages:

• Anyone can run such services, which enables a pluralistic ecosystem in which different parties may make different judgements about the same piece of content. Users, as well as the operators of App Views and PDSes, can decide whose judgements they want to trust, and it is easy for them to switch to alternative labeling and feed generation services if their current providers fail to meet their expectations.

• It becomes easier to set up alternative App View providers: since any App View can consume the publicly available output from labelers and feed generators, there is less pressure for each App View to develop its own content filtering infrastructure. Having alternative App Views is important for a healthy, decentralized marketplace.
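The first advantage, that different parties may judge the same content differently and users decide whom to trust, can be sketched in a few lines. This is a hedged illustration only: the labeler names, post IDs, and the hidden_posts helper are hypothetical, and real label formats and user preferences are richer than the tuples used here.

```python
# Publicly available labels as (labeler, post ID, judgement) tuples.
# Two hypothetical labelers disagree about one of the posts.
published_labels = [
    ("strict-labeler.example", "at://bob/post/1", "spam"),
    ("strict-labeler.example", "at://dana/post/7", "spam"),
    ("lenient-labeler.example", "at://dana/post/7", "spam"),
]


def hidden_posts(trusted_labelers, labels, hidden_vals=frozenset({"spam"})):
    """Return the post IDs to hide, honouring only the trusted labelers."""
    hidden = set()
    for src, uri, val in labels:
        if src in trusted_labelers and val in hidden_vals:
            hidden.add(uri)
    return hidden


# The same public label stream yields different results per user choice.
strict_view = hidden_posts({"strict-labeler.example"}, published_labels)
lenient_view = hidden_posts({"lenient-labeler.example"}, published_labels)
unfiltered_view = hidden_posts(set(), published_labels)
```

Switching providers amounts to changing the set of trusted labelers; the label stream itself is unchanged, and any App View could apply the same selection logic to it.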

Feed generators can be implemented in code using our starter kit [10], or created with a third-party service such as Skyfeed [13].

This paper is available on arXiv under the CC BY 4.0 DEED license.