Migrating User Handles and PDS on Bluesky with AT Protocol’s Decentralized IDs

25 Sept 2024

Authors:

(1) Martin Kleppmann, University of Cambridge, Cambridge, UK (martin.kleppmann@cst.cam.ac.uk);

(2) Paul Frazee, Bluesky Social PBC, United States;

(3) Jake Gold, Bluesky Social PBC, United States;

(4) Jay Graber, Bluesky Social PBC, United States;

(5) Daniel Holmgren, Bluesky Social PBC, United States;

(6) Devin Ivy, Bluesky Social PBC, United States;

(7) Jeromy Johnson, Bluesky Social PBC, United States;

(8) Bryan Newbold, Bluesky Social PBC, United States;

(9) Jaz Volpert, Bluesky Social PBC, United States.

Abstract and 1 Introduction

2 The Bluesky Social App

2.1 Moderation Features

2.2 User Handles

2.3 Custom Feeds and Algorithmic Choice

3 The AT Protocol Architecture

3.1 User Data Repositories

3.2 Personal Data Servers (PDS)

3.3 Indexing Infrastructure

3.4 Labelers and Feed Generators

3.5 User Identity

4 Related Work

5 Conclusions, Acknowledgments, and References

3.5 User Identity

As explained in Section 2.2, user handles in Bluesky and atproto are DNS domain names. Any number of identity providers can coexist in the system: Bluesky Social PBC allows users to register subdomains of .bsky.social, but the indexing infrastructure does not treat users differently based on their handle.

We want a user to be able to change their handle without affecting their social graph. Therefore, when a record in user 𝐴’s repository indicates that 𝐴 is following 𝐵, that record must identify 𝐵 in a way that is more long-lived than specifying 𝐵’s handle. For this reason, every Bluesky/atproto account has an immutable, unique identifier: a decentralized ID or DID, which is a URI starting with the prefix did:. The record that 𝐴 follows 𝐵 then contains 𝐵’s DID. DIDs are a recent W3C standard [50].

Moreover, we want a user to be able to migrate to a different PDS without changing either their DID or their handle. The DID specification provides a mechanism for resolving a DID into a DID document, a JSON document containing information about the user identified by that DID, as illustrated in Figure 4. In atproto, a DID document specifies (among other things) the handle of the user, the URL of their PDS, and the public key that is used to sign the Merkle tree root of their repository every time they add or delete a record. To change their handle or their PDS, the user needs to update their DID document to the new value.
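As a rough illustration of what a resolver reads out of a DID document, the following Python sketch extracts the handle, the PDS URL, and the signing key. The top-level field names (alsoKnownAs, service, verificationMethod) come from the W3C DID data model; the atproto-specific values (the at:// prefix, the #atproto_pds and #atproto identifiers, the service type) are assumptions based on the public atproto documentation rather than on the text above, and the example domain names and key value are placeholders.

```python
from typing import Any, Callable, Iterable, Optional

# Hypothetical DID document; domain names and key material are placeholders.
EXAMPLE_DOC = {
    "id": "did:plc:eclio37ymobqex2ncko63h4r",
    "alsoKnownAs": ["at://alice.example.com"],
    "verificationMethod": [
        {
            "id": "did:plc:eclio37ymobqex2ncko63h4r#atproto",
            "type": "Multikey",
            "controller": "did:plc:eclio37ymobqex2ncko63h4r",
            "publicKeyMultibase": "zPLACEHOLDER",
        }
    ],
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://pds.example.com",
        }
    ],
}


def first(items: Iterable[Any], predicate: Callable[[Any], bool]) -> Optional[Any]:
    """Return the first item satisfying the predicate, or None."""
    return next((item for item in items if predicate(item)), None)


def summarize_did_document(doc: dict) -> dict:
    """Pull out the three pieces of information the text describes."""
    handle_uri = first(doc.get("alsoKnownAs", []), lambda u: u.startswith("at://"))
    pds = first(doc.get("service", []), lambda s: s.get("id", "").endswith("#atproto_pds"))
    key = first(doc.get("verificationMethod", []), lambda m: m.get("id", "").endswith("#atproto"))
    return {
        "did": doc.get("id"),
        "handle": handle_uri[len("at://"):] if handle_uri else None,
        "pds_url": pds.get("serviceEndpoint") if pds else None,
        "signing_key": key.get("publicKeyMultibase") if key else None,
    }
```

Under these assumptions, changing a handle or migrating to another PDS amounts to publishing a new document in which only the alsoKnownAs or serviceEndpoint value differs, while the id stays fixed.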

For a user to successfully claim a particular handle, they must have a bidirectional link between their DID and their domain name handle, as shown in Figure 4:

• A link from the handle to the DID is established either by storing the DID in a DNS TXT record on that domain name, or by returning the DID in response to an HTTPS request to a /.well-known/ URL on that domain name [37].

• A link from the DID to the handle is established by including the handle in the DID document that is returned when the DID is resolved.
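The two-way check described by these bullets can be sketched in Python as follows. This is a minimal illustration that performs no network access: the caller is assumed to have already fetched the TXT record values, the body of the /.well-known/ response, and the alsoKnownAs list from the DID document. The "did=" TXT value format and the "at://" prefix are assumptions taken from the public atproto documentation, and the function names are hypothetical.

```python
from typing import Iterable, Optional


def did_from_txt(values: Iterable[str]) -> Optional[str]:
    """Return the DID from TXT values of the assumed form 'did=<DID>'."""
    for value in values:
        if value.startswith("did="):
            return value[len("did="):]
    return None


def handle_is_verified(
    handle: str,
    claimed_did: str,
    also_known_as: Iterable[str],
    txt_values: Iterable[str] = (),
    well_known_body: Optional[str] = None,
) -> bool:
    """Check both directions of the handle <-> DID link."""
    # Handle -> DID: prefer the DNS TXT record, fall back to the HTTPS response.
    forward = did_from_txt(txt_values)
    if forward is None and well_known_body is not None:
        forward = well_known_body.strip()

    # DID -> handle: the DID document must list the handle (case-insensitive).
    wanted = f"at://{handle.lower()}"
    backward = wanted in (entry.lower() for entry in also_known_as)

    return forward == claimed_did and backward
```

Requiring both directions means that neither the domain owner alone nor the DID holder alone can assert the association.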

3.5.1 Resolving DID documents. The W3C DID specification [50] does not directly specify the mechanism for resolving a DID into a DID document. Rather, the first substring after did: in a DID indicates the DID method, and the specification of the DID method defines the protocol for obtaining the DID document. Hundreds of DID methods have been defined [54], many of which are dependent on specific blockchains or other external systems. To avoid atproto implementations having to support so many resolution methods, our services currently only accept DIDs based on either did:web (defined by the W3C Credentials Community Group [27]) or did:plc (defined by ourselves for atproto [31]). Support for more DID methods might be added in the future.

The did:web method is very simple: the part of the DID after did:web: is a domain name, and the DID document is resolved by making an HTTPS request to a /.well-known/ URL on that domain name (a path can optionally be included). The security of a did:web identity therefore assumes that the web hosting provider for that domain is trusted, and also relies on trusting the TLS certificate authorities that may authenticate the HTTPS request.
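The mapping from a did:web identifier to the URL of its DID document can be sketched in Python as below. The did.json file name, the use of colons as path separators, and the percent-encoded port are taken from the did:web method specification rather than from the text above, and the function name is hypothetical.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")

    parts = did[len(prefix):].split(":")
    host = unquote(parts[0])  # a port, if any, is percent-encoded as %3A
    path = parts[1:]          # optional path segments, separated by colons
    if not host or any(not segment for segment in path):
        raise ValueError("malformed did:web identifier")

    if path:
        return f"https://{host}/" + "/".join(path) + "/did.json"
    return f"https://{host}/.well-known/did.json"
```

For example, did:web:example.com maps to https://example.com/.well-known/did.json, so whoever controls that URL controls the identity.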

did:web identities are therefore similar to domain name handles, with the difference that the name cannot be changed, since a DID is an immutable identifier. This makes did:web appropriate for the identity of organizations that are already strongly linked to a particular domain name. For most users, did:plc is more appropriate, since it uses a domain name only as a handle that can be changed.

3.5.2 The did:plc DID method. When a user creates an account on the Bluesky social app, they are by default assigned a DID of the form did:plc:eclio37ymobqex2ncko63h4r, where the string after the prefix did:plc: is the SHA256 hash of the initial DID document, truncated to 120 bits and encoded using base32 [31]. A DID of this form can be resolved to the corresponding DID document by querying a server at https://plc.directory/, which is currently operated by Bluesky Social PBC; in the future we plan to establish a consortium of independent operators that collectively provide the PLC directory service.
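The derivation of the identifier can be sketched in a few lines of Python. The hashing, truncation, and base32 steps follow the description above; how the initial DID document is serialized into bytes is not specified here, so the input is treated as opaque bytes, and the lowercase output is an assumption based on the example DID.

```python
import base64
import hashlib


def plc_did_from_bytes(initial_document: bytes) -> str:
    """Derive a did:plc identifier from an (assumed) serialized initial document."""
    digest = hashlib.sha256(initial_document).digest()
    truncated = digest[:15]  # 15 bytes = 120 bits
    # 120 bits encode to exactly 24 base32 characters, so no padding appears.
    suffix = base64.b32encode(truncated).decode("ascii").lower()
    return "did:plc:" + suffix
```

Because 120 bits divide evenly into 5-bit base32 symbols, the suffix is always 24 characters long, matching the length of the example identifier above.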

The PLC directory server plays an authoritative role similar to the DNS root servers, but it is mostly untrusted because PLC DID documents are self-certifying. If the DID document has not changed since its initial creation, it is easy to verify that a DID has been correctly resolved to a DID document by recomputing its hash. To support changes to the DID document, the initial version of a user’s DID document contains a public key that is authorized to sign a new version of the DID document. Any new version of the DID document is only valid if it has been signed by the key in the previous version. The directory server returns all DID document versions for a given DID, allowing anybody to check the chain of signatures.
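The verification a client could perform on the directory's response can be sketched in Python as follows. This is only an outline of the chain walk under simplifying assumptions: each version is modeled as a hypothetical record holding its serialized body, the key authorized to sign its successor, and a signature, and the signature check is injected as a callable because the signature scheme is not described here. The toy_sign and toy_verify helpers are stand-ins for demonstration and are not real public-key cryptography.

```python
import hashlib
from typing import Callable, Sequence, TypedDict


class Version(TypedDict):
    body: bytes       # serialized DID document version (encoding is assumed)
    next_key: str     # public key authorized to sign the following version
    signature: bytes  # made with the previous version's next_key (empty for the first)


Verify = Callable[[str, bytes, bytes], bool]


def chain_is_valid(
    versions: Sequence[Version], did_hash_prefix: bytes, verify: Verify
) -> bool:
    """Check the self-certifying first version, then each signature in turn."""
    if not versions:
        return False
    # The first version must hash to the (truncated) value embedded in the DID.
    first_digest = hashlib.sha256(versions[0]["body"]).digest()
    if not first_digest.startswith(did_hash_prefix):
        return False
    # Every later version must be signed by the key named in its predecessor.
    for previous, current in zip(versions, versions[1:]):
        if not verify(previous["next_key"], current["body"], current["signature"]):
            return False
    return True


# Toy stand-in for a signature scheme, for demonstration only.
def toy_sign(key: str, body: bytes) -> bytes:
    return hashlib.sha256(key.encode() + body).digest()


def toy_verify(key: str, body: bytes, signature: bytes) -> bool:
    return toy_sign(key, body) == signature
```

A check of this kind detects a modified or wrongly signed version, but it cannot by itself detect a version that the directory has withheld.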

If the directory server were to be malicious, it would not be able to modify any DID documents – it could only omit valid DID document versions from its responses, or fail to respond at all. Moreover, if there were to be a fork in DID document history such that two correctly signed successor versions for some DID document exist, the directory server could choose which one of these forks to serve. To mitigate the risk of such attacks, we anticipate that a future version of the PLC directory will use techniques from certificate transparency [33] to ensure that DID document updates form an append-only log.

3.5.3 Authentication. In principle, the cryptographic keys for signing repository updates and DID document updates can be held directly on the user’s devices, e.g. using a cryptocurrency wallet, in order to minimize trust in servers. However, we believe that such manual key management is not appropriate for most users, since there is a significant risk of the keys being compromised or lost.

The Bluesky PDSes therefore hold these signing keys custodially on behalf of users, and users authenticate themselves to their home PDS via username and password. This gives users a familiar experience and enables standard features such as password reset by email. The AT Protocol does not make any assumptions about how PDSes authenticate their users, and other PDS operators are free to use different authentication methods.

This paper is available on arXiv under the CC BY 4.0 DEED license.