Nick Scialli home page

Govsky

Recently, I have been a lot more active on Bluesky. I'm really enjoying it for a few reasons:

Due to the openness of the Bluesky API, I began thinking about what projects I could hack on in my spare time. I quickly realized that, due to the domain-based handle verification system, it would be possible to catalog official government presence on Bluesky: anyone with a handle ending in .gov is a US government account. .europa.eu? European Union! .gov.je? You'd better believe that's an official Island of Jersey account.

Thus, Govsky was born!

What Govsky includes #

There's quite a bit involved in Govsky! First and foremost, the whole Govsky project is open source. The outputs of the Govsky project are:

How it works #

I'll go into some technical details about how Govsky works here.

Getting handles #

When tackling this project, my first thought was that I should be able to easily query the Bluesky API for all handles ending in .gov. Incorrect! While Bluesky probably caches user handles somewhere, it doesn't expose that information via its API. Instead, you have to go to the PLC Directory, which provides an export API endpoint that can be used to backfill updates to Decentralized Identifier (DID) records.

Once I figured out how to consume this endpoint, I wrote a script that backfills handles since the beginning of Bluesky. The initial backfill took several hours, and now a scheduled job catches up every 10 minutes or so.
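To make the backfill idea a bit more concrete, here is a minimal sketch of a paging loop. It is not the actual Govsky script. It assumes the PLC Directory export returns newline-delimited JSON where each entry has a `did`, a `createdAt` timestamp, and an `operation` whose `alsoKnownAs` array holds `at://` handle URIs. The `fetchPage` and `save` callbacks are hypothetical: `fetchPage` would wrap a GET to the export endpoint (assumed here to take `after` and `count` query parameters), and `save` would write to the database.

```typescript
// One line of the PLC export, reduced to the fields this sketch uses (assumed shape).
interface PlcExportEntry {
  did: string;
  createdAt: string;
  operation: { alsoKnownAs?: string[] } | null;
}

interface HandleUpdate {
  did: string;
  handle: string;
  createdAt: string;
}

// Parse a newline-delimited JSON page, skipping blank lines.
function parseExportPage(body: string): PlcExportEntry[] {
  return body
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as PlcExportEntry);
}

// Keep only entries that claim a handle, stripping the "at://" prefix.
function toHandleUpdates(entries: PlcExportEntry[]): HandleUpdate[] {
  const updates: HandleUpdate[] = [];
  for (const entry of entries) {
    const aliases = entry.operation?.alsoKnownAs ?? [];
    const uri = aliases.find((alias) => alias.startsWith("at://"));
    if (uri === undefined) continue;
    updates.push({
      did: entry.did,
      handle: uri.slice("at://".length).toLowerCase(),
      createdAt: entry.createdAt,
    });
  }
  return updates;
}

// The next cursor is the timestamp of the last entry on the page.
function nextCursor(entries: PlcExportEntry[], previous?: string): string | undefined {
  return entries.length > 0 ? entries[entries.length - 1].createdAt : previous;
}

// Page through the export until it is exhausted; returns the final cursor so
// a scheduled job can resume from it on the next run.
async function backfill(
  fetchPage: (after?: string) => Promise<string>, // hypothetical HTTP wrapper
  save: (updates: HandleUpdate[]) => Promise<void>, // hypothetical DB writer
  startAfter?: string,
): Promise<string | undefined> {
  let cursor = startAfter;
  while (true) {
    const entries = parseExportPage(await fetchPage(cursor));
    if (entries.length === 0) break;
    await save(toHandleUpdates(entries));
    const advanced = nextCursor(entries, cursor);
    if (advanced === cursor) break; // guard against a cursor that never moves
    cursor = advanced;
  }
  return cursor;
}
```

Returning the cursor is what would let a 10-minute catch-up job pick up where the previous run stopped instead of starting over.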

I store every custom handle (i.e., not the default bsky.social handle) in a postgres database. I made the decision to store all of these handles because I didn't (and still don't) know the full list of handles I want to track. At the time of writing this (Jan 29, 2025), there are about 625,000 handles in the database.

Over-collecting handles was definitely a good idea. However, I probably will dial this over-collecting back a bit: no official government entities will have a .lol domain name (even if it feels like some should).
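A minimal sketch of that storage filter might look like the following. The function name and the `excludedSuffixes` parameter are hypothetical; the exclusion list stands in for the "dial it back" idea and is empty by default, which matches the current store-everything behavior.

```typescript
// Handles on the default Bluesky domain are not custom handles.
const DEFAULT_HANDLE_SUFFIX = ".bsky.social";

// Decide whether a handle is worth storing. `excludedSuffixes` is a
// hypothetical list of suffixes that will never be government-affiliated.
function shouldStoreHandle(handle: string, excludedSuffixes: string[] = []): boolean {
  const normalized = handle.toLowerCase();
  if (normalized.endsWith(DEFAULT_HANDLE_SUFFIX)) return false;
  return !excludedSuffixes.some((suffix) => normalized.endsWith(suffix));
}
```

For example, `shouldStoreHandle("someone.lol", [".lol"])` would skip the handle, while calling it without an exclusion list would keep it.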

Validating handles #

Handles in DID records need to be verified bi-directionally: the DID record claims a handle, and the handle's domain has to point back to that DID. That means I couldn't assume all records from the PLC backfill were correct. In fact, I have seen quite a few false records, for example fakeaccount.534.gov.

There are a couple ways to validate handles:

  1. Check for a DNS TXT record or .well-known file. This is the mechanism Bluesky uses when people use a custom domain as a handle. If the user's DID is in the _atproto.{handle} TXT record or is returned from a request to https://{handle}/.well-known/atproto-did, the handle can be considered verified.
  2. Use the Bluesky API. Requesting the profile information for the user's DID will return the Bluesky-verified handle associated with the user.

I opted for the second option as I was already familiar with the Bluesky API. The handle validation code can be viewed here.
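As a rough illustration of the second option (not the linked Govsky code), the sketch below compares the handle claimed in the PLC data with the handle the Bluesky API reports for the same DID. It assumes the public `app.bsky.actor.getProfile` endpoint on `public.api.bsky.app` and assumes that Bluesky reports `handle.invalid` when a handle fails verification. The function names are hypothetical.

```typescript
// Subset of the getProfile response used here.
interface ProfileView {
  did: string;
  handle: string;
}

// Assumed sentinel that Bluesky returns when a handle does not verify.
const INVALID_HANDLE = "handle.invalid";

// A claimed handle counts as verified only if the API reports the same handle.
function isHandleVerified(claimedHandle: string, profile: ProfileView): boolean {
  const reported = profile.handle.toLowerCase();
  return reported !== INVALID_HANDLE && reported === claimedHandle.toLowerCase();
}

// Look up a profile by DID via the public, unauthenticated API host (assumed).
async function fetchProfile(did: string): Promise<ProfileView> {
  const url =
    "https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile?actor=" +
    encodeURIComponent(did);
  const res = await fetch(url);
  if (!res.ok) throw new Error(`getProfile failed for ${did}: ${res.status}`);
  return (await res.json()) as ProfileView;
}
```

A record like fakeaccount.534.gov would fail this check because the handle reported by the API would not match the claimed one.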

Setting up the API #

In the spirit of Bluesky and the AT Protocol, I knew I wanted to make Govsky open source. I especially wanted others to have access to verified government handles without having to do the whole PLC backfill thing and maintain their own databases. Therefore, I opted to create an API that returns verified government accounts for a specific domain. The API itself is a small Fastify/TypeScript/Node.js application and its code can be viewed here.

What I like about the API approach is that now anyone can call https://api.govsky.org/api/.gov (go ahead, click on it) and get all verified .gov handles on Bluesky. This will hopefully enable all sorts of interesting projects folks might want to undertake with this information!
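Calling the API from code could look something like the sketch below. The URL pattern comes from the example above; the helper names are hypothetical, and because the response shape isn't described here, the result is left as `unknown` rather than guessed at.

```typescript
const GOVSKY_API_BASE = "https://api.govsky.org/api";

// Build the endpoint URL for a domain extension such as ".gov".
function govskyUrl(extension: string): string {
  return `${GOVSKY_API_BASE}/${extension}`;
}

// Fetch the verified handles for one extension. The payload is returned
// untyped; inspect it before relying on specific fields.
async function fetchVerifiedHandles(extension: string): Promise<unknown> {
  const res = await fetch(govskyUrl(extension));
  if (!res.ok) {
    throw new Error(`Govsky API returned ${res.status} for ${extension}`);
  }
  return res.json();
}
```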

The Govsky project is designed to largely be config-driven, so as new countries or domains are discovered, only config changes are needed to start surfacing them via the API. The config project in the codebase contains all of these configs. Here is a small example of what the config looks like for a few countries:

const config = {
	au: {
		name: "Australia",
		domains: [".gov.au"],
	},
	br: {
		name: "Brazil",
		domains: [".gov.br"],
	},
	ca: {
		name: "Canada",
		domains: [".gc.ca", ".canada.ca"],
	},
};

The API is set up to only return handles for domains specified in this config file. I'm not interested in the API being used for non-government handles. Perhaps that could be another project! If you try to look up a non-allowed domain, you'll get an error message that looks like this:

Extension must be one of: .gov.au, .gov.br, .gc.ca, .canada.ca, .gov.co, .bund.de, .bundesregierung.de, .bundestag.de, .europa.eu, .gouv.fr, .senat.fr, .service-public.fr, .gov.je, .gov.uk, .parliament.uk, .gov, .mil
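One way such an allow-list check could be derived from the config is sketched below. This is an illustration under assumptions, not the API's actual handler: the function names are hypothetical and the config is the three-country excerpt from above.

```typescript
interface CountryConfig {
  name: string;
  domains: string[];
}

// The three-country excerpt shown earlier.
const exampleConfig: Record<string, CountryConfig> = {
  au: { name: "Australia", domains: [".gov.au"] },
  br: { name: "Brazil", domains: [".gov.br"] },
  ca: { name: "Canada", domains: [".gc.ca", ".canada.ca"] },
};

// Flatten every country's domains into a single allow-list.
function allowedExtensions(cfg: Record<string, CountryConfig>): string[] {
  const extensions: string[] = [];
  for (const key of Object.keys(cfg)) {
    extensions.push(...cfg[key].domains);
  }
  return extensions;
}

// Return null when the extension is allowed, otherwise an error message.
function validateExtension(
  extension: string,
  cfg: Record<string, CountryConfig>,
): string | null {
  const allowed = allowedExtensions(cfg);
  if (allowed.indexOf(extension) !== -1) return null;
  return `Extension must be one of: ${allowed.join(", ")}`;
}
```

Because the allow-list is computed from the config, adding a country to the config would extend both the accepted extensions and the error message without further code changes.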

Running Bluesky bot accounts #

The Bluesky bot accounts (e.g., Govsky US) are scheduled Node.js jobs. They do the following:

The bot code can be found here. The bot code is purely config driven, so new bots can be added by defining a new config like this:

const govskyUsBot: BotConfig = {
	name: "Govsky US",
	handle: process.env.GOVSKY_US_HANDLE || "",
	password: process.env.GOVSKY_US_PW || "",
	domains: config.us.domains,
	welcomeMessage: (user: ApiUser) => {
		const name = user.displayName
			? `${user.displayName} (@${user.handle})`
			: user.handle;
		return `${name} has joined Bluesky! #govsky`;
	},
	lists: [
		{
			description: "All gov",
			uri: "at://did:plc:pe365hgnkisv4rhrcow7m5ue/app.bsky.graph.list/3lf3xwfybxl2j",
			addHandleToListTest: () => true,
		},
		{
			description: "No congress",
			uri: "at://did:plc:pe365hgnkisv4rhrcow7m5ue/app.bsky.graph.list/3lf6am7kaxb2n",
			addHandleToListTest: (handle) =>
				!handle.endsWith(".house.gov") && !handle.endsWith(".senate.gov"),
		},
	],
};

This small-ish object specifies which domains to look for, the welcome message to be posted per account, and any lists that account is maintaining.
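To show how the per-list `addHandleToListTest` predicate might be consumed, here is a minimal sketch. The `ListConfig` shape mirrors the config above, but the `listsForHandle` helper and the placeholder list URIs are hypothetical, not taken from the bot code.

```typescript
interface ListConfig {
  description: string;
  uri: string;
  addHandleToListTest: (handle: string) => boolean;
}

// Placeholder URIs for illustration only.
const exampleLists: ListConfig[] = [
  {
    description: "All gov",
    uri: "at://did:example:bot/app.bsky.graph.list/all",
    addHandleToListTest: () => true,
  },
  {
    description: "No congress",
    uri: "at://did:example:bot/app.bsky.graph.list/no-congress",
    addHandleToListTest: (handle) =>
      !handle.endsWith(".house.gov") && !handle.endsWith(".senate.gov"),
  },
];

// Pick the lists a newly verified handle should be added to.
function listsForHandle(handle: string, lists: ListConfig[]): ListConfig[] {
  return lists.filter((list) => list.addHandleToListTest(handle));
}
```

With these lists, a handle ending in .house.gov would land only on "All gov", while any other .gov handle would land on both.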

The web app #

The https://govsky.org/ website is a React app that uses the Govsky API and renders a tree view of accounts on a country-by-country basis. It's also searchable as countries like the US have hundreds of accounts on Bluesky already.

screenshot of the Govsky website

I really like the tree view for this application as government entities can be quite hierarchical. With the tree view, you can find all 20+ accounts belonging to various City of Boston entities neatly tucked under the boston.gov node of the tree!
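The hierarchy falls out of the handles themselves: reading a handle's labels right to left walks from the top-level domain down to the specific entity. Here is a minimal sketch of building such a tree. The `TreeNode` shape and `buildHandleTree` function are hypothetical, and the real web app may group nodes differently.

```typescript
interface TreeNode {
  label: string;
  handle?: string; // set when an account exists at exactly this node
  children: Map<string, TreeNode>;
}

// Insert each handle by walking its labels from right to left,
// e.g. "police.boston.gov" becomes gov -> boston -> police.
function buildHandleTree(handles: string[]): TreeNode {
  const root: TreeNode = { label: "", children: new Map() };
  for (const handle of handles) {
    const labels = handle.toLowerCase().split(".").reverse();
    let node = root;
    for (const label of labels) {
      let child = node.children.get(label);
      if (child === undefined) {
        child = { label, children: new Map() };
        node.children.set(label, child);
      }
      node = child;
    }
    node.handle = handle;
  }
  return root;
}
```

Every account under a city or agency ends up as a descendant of that entity's node, which is what makes a collapsible tree a natural fit.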

Project structure and infrastructure #

While it's not important to the final output, I figure it's worth going into some information about project structure and infrastructure decisions I made.

Rush.js monorepo #

The Govsky codebase is a rush.js monorepo consisting of the following projects:

I wasn't sure from the start whether a monorepo would be a good idea or if it would just add overhead. I decided to use one since there would be quite a bit of code-sharing between the projects. Additionally, I wanted to make each project separate enough that folks could run the individual projects on their own. I chose Rush specifically because I use it at work and am already decently familiar with it.

In hindsight, I'm extremely happy with how the monorepo worked out. The code feels DRY and nicely structured. It pretty much just works except for a few hiccups along the way. An example issue I hit is that the Prisma ORM generates its client/types inside the node_modules directory. This doesn't play nicely with Rush because project node_modules directories are actually just symlinked to a common node_modules directory. Fortunately, I was able to solve that issue after a bit of research by specifying an alternative directory in which the Prisma client could be generated.

Supabase #

I use Supabase for a managed postgres database service. It's really nice so far! I'm a good bit below the threshold for egress and storage size for getting billed. If I do end up getting close to those thresholds, I have some ideas to reduce size (eliminate domains that will never be government-affiliated) and reduce egress (more aggressive caching).

Fly.io #

Fly.io is where I run everything except for the web app. Lazily, and frugally, I am running the API, bots, and backfill/verification process on a single 256 MB Fly.io machine and so far... it works great! Fly.io doesn't charge you for monthly bills under $5 and it looks like I'll be able to stay under this threshold. Pretty nifty!

Cloudflare #

The web app is just some static assets and is deployed to Cloudflare. Its global CDN is excellent and reliable. As far as I can tell, there will be no cost to hosting the app there.

Reflections #

This has been (and is continuing to be) quite a fun project! It combined my interests in civic technology and Bluesky/AT proto into something valuable for the Bluesky ecosystem. I feel like I made some good design and infrastructure choices, which is always nice. Also, I'm really proud to have all the code open source.

Hopefully this article is informative to anyone else looking to tinker with AT Protocol or the Bluesky API, or at least provides an interesting read!
