Nix maintainer demographics
October 2, 2024
I personally know fewer than ten people who actually use Nix, but I’ve noticed that most (if not all) of them are more privacy-conscious and pedantic than most. They tend to use end-to-end encrypted email providers, DNS resolvers, and VPNs. They all use modal text editors and sign their git commits.
I had a suspicion that because of Nix’s idealistic nature, it might attract more of these privacy-advocating, open-source-contributing tech purists. I then realized that the nixpkgs repository keeps a list of maintainers with their emails. That gave me an idea.
You can get the maintainers list from the nixpkgs repository. It’s written in the Nix language, so I converted it to JSON for processing.
curl https://raw.githubusercontent.com/NixOS/nixpkgs/refs/heads/master/maintainers/maintainer-list.nix \
| nix eval --json -f - > maintainer-list.json
A quick scan reveals that not all maintainers have their emails listed, but all except three do have their GitHub usernames. To fill in the gaps, I used GitHub’s public API, api.github.com/users/${username}, to fetch the publicly listed email for each user and combined the responses.
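The lookup described above could be sketched roughly as follows. This is a minimal illustration, not the script actually used: the function names and the `github_email` key are assumptions, and it relies on the `email` field of the GitHub users endpoint, which is null when a user has not made an address public. Unauthenticated requests are rate-limited, so a real run over several thousand usernames would likely need a token and some throttling.

```python
import json
import urllib.request
from typing import Callable, Optional


def fetch_github_email(username: str) -> Optional[str]:
    """Return the public email on a GitHub profile, or None if it is not listed."""
    url = f"https://api.github.com/users/{username}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("email")


def fill_github_emails(
    maintainers: dict,
    fetch: Callable[[str], Optional[str]] = fetch_github_email,
) -> dict:
    """Copy each maintainer entry and add a (hypothetical) 'github_email' key."""
    merged = {}
    for handle, info in maintainers.items():
        entry = dict(info)
        username = entry.get("github")
        # Entries without a GitHub username cannot be looked up.
        entry["github_email"] = fetch(username) if username else None
        merged[handle] = entry
    return merged
```

Passing the fetcher as a parameter keeps the merge logic testable without touching the network.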
I now have a list of Nix package maintainers with most¹ of the emails populated. I then funneled the data into a SQLite database so that I could easily query and aggregate it.
The aggregations span multiple stages, so we’ll use Common Table Expressions (CTEs) for simplicity. The first stage retrieves the email; if it’s NULL, we’ll use the one from the GitHub profile instead.
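Loading the JSON into SQLite might look something like the sketch below. The table name `maintainers` and the `email` and `github_email` columns match the queries in this post; the `handle`, `name`, and `github` columns and the function name are assumptions added for illustration, as is the idea that each entry already carries a `github_email` value taken from the GitHub profile.

```python
import sqlite3


def load_maintainers(conn: sqlite3.Connection, maintainers: dict) -> int:
    """Insert one row per maintainer and return the number of rows written."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS maintainers (
            handle       TEXT PRIMARY KEY,  -- attribute name in maintainer-list.nix
            name         TEXT,
            github       TEXT,
            email        TEXT,              -- email listed in nixpkgs, may be NULL
            github_email TEXT               -- email from the GitHub profile, may be NULL
        )
        """
    )
    rows = [
        (
            handle,
            info.get("name"),
            info.get("github"),
            info.get("email"),
            info.get("github_email"),
        )
        for handle, info in maintainers.items()
    ]
    conn.executemany("INSERT OR REPLACE INTO maintainers VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

In practice the dictionary would come from something like `json.load(open("maintainer-list.json"))` and the connection from `sqlite3.connect("maintainers.db")`; the database filename here is made up.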
WITH emails AS (
    SELECT
        COALESCE(email, github_email) AS email
    FROM
        maintainers
    WHERE
        email IS NOT NULL OR github_email IS NOT NULL
),
Next, extract the domain name by taking the substring after the @ sign.
domains AS (
    SELECT
        LOWER(SUBSTR(email, INSTR(email, '@') + 1)) AS name
    FROM
        emails
),
Some providers use multiple domains. For example, Proton Mail has protonmail.com, proton.me, and pm.me, so we’ll need to group these together. We’ll start by creating a table of mappings.
companies AS (
    SELECT 'Google' AS company, 'gmail.com' AS domain
    UNION ALL
    SELECT 'Google', 'googlemail.com'
    UNION ALL
    SELECT 'Proton Mail', 'protonmail.com'
    UNION ALL
    SELECT 'Proton Mail', 'pm.me'
    UNION ALL
    SELECT 'Proton Mail', 'proton.me'
    UNION ALL
    SELECT 'Apple', 'icloud.com'
    UNION ALL
    SELECT 'Apple', 'me.com'
    UNION ALL
    SELECT 'Apple', 'mac.com'
    -- Many more repetitive lines but you get the idea.
)
Finally, in the main SELECT query, we’ll LEFT JOIN the domains against the companies mapping. If a domain isn’t recognized, we’ll group it under the Others category.
SELECT
    COALESCE(c.company, 'Others') AS company,
    COALESCE(c.domain, '-') AS domain,
    COUNT(*) AS maintainers
FROM
    domains d
LEFT JOIN
    companies c ON d.name = c.domain
GROUP BY
    c.company,
    c.domain
ORDER BY
    maintainers DESC;
This generates a table with company, domain, and maintainers (count) fields. Remove c.domain from both the SELECT list and the GROUP BY clause to hide the domains if you aren’t interested in them. You can also add a percentage field.
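One way to add the percentage field is to divide each group’s count by the total number of rows in the domains stage. The sketch below runs such a query through Python’s sqlite3 module; the `provider_shares` name, the abbreviated two-row mapping, and the extra tie-breaking sort on company are illustrative assumptions, not the exact query behind the table in this post.

```python
import sqlite3

QUERY = """
WITH emails AS (
    SELECT COALESCE(email, github_email) AS email
    FROM maintainers
    WHERE email IS NOT NULL OR github_email IS NOT NULL
),
domains AS (
    SELECT LOWER(SUBSTR(email, INSTR(email, '@') + 1)) AS name
    FROM emails
),
companies AS (
    -- Abbreviated mapping, for illustration only.
    SELECT 'Google' AS company, 'gmail.com' AS domain
    UNION ALL
    SELECT 'Proton Mail', 'proton.me'
)
SELECT
    COALESCE(c.company, 'Others') AS company,
    COUNT(*) AS maintainers,
    -- Multiply by 100.0 first so the division is done in floating point.
    ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM domains), 1) AS percentage
FROM domains d
LEFT JOIN companies c ON d.name = c.domain
GROUP BY c.company
ORDER BY maintainers DESC, company
"""


def provider_shares(conn: sqlite3.Connection) -> list:
    """Return (company, maintainers, percentage) rows, largest group first."""
    return conn.execute(QUERY).fetchall()
```

A window function such as `SUM(COUNT(*)) OVER ()` would also work on recent SQLite versions; the scalar subquery is used here only because it runs on older ones too.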
If you use the default sqlite3 client, you can set .mode markdown to get Markdown tables from queries.
| email provider | count | percentage |
|---|---|---|
| Google | 1245 | 34.4% |
| Proton Mail | 213 | 5.9% |
| Microsoft | 67 | 1.8% |
| Posteo | 41 | 1.1% |
| Tuta | 21 | 0.6% |
| Apple | 21 | 0.6% |
| Mailbox.org | 16 | 0.4% |
| Yahoo | 11 | 0.3% |
| HEY | 7 | 0.2% |
| Others | 1980 | 54.7% |
The results were disappointing. I was expecting to be proven right or wrong, but it turns out more than 50% of maintainers use custom domains: some organizational (like mit.edu or jpl.nasa.gov), others personal. They might be using Google Workspace, paying for Proton Mail, or self-hosting their own mail servers. So the results are inconclusive.
I could run dig $DOMAIN mx on each domain to identify their mail server providers, but I’m not that bored. I suppose the takeaway from this analysis is that developers prefer custom domain emails. For those not using custom domains, most still rely on Google. Additionally, about 10% of the emails are dedicated aliases just for nixpkgs².
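For anyone who is that bored, the MX idea could be sketched along these lines. Everything here is an assumption for illustration: the function names, the use of `dig +short` via subprocess, and especially the small suffix table, which covers only a few well-known MX hostnames and would need to be extended and verified before trusting any numbers.

```python
import subprocess

# Hypothetical, incomplete mapping from MX hostname suffix to provider.
MX_SUFFIXES = {
    "google.com": "Google",
    "googlemail.com": "Google",
    "protonmail.ch": "Proton Mail",
    "outlook.com": "Microsoft",
}


def parse_dig_mx(output: str) -> list:
    """Parse `dig +short` MX output into hostnames ordered by preference."""
    hosts = []
    for line in output.splitlines():
        parts = line.split()
        # Each answer line looks like "10 aspmx.l.google.com."
        if len(parts) == 2 and parts[0].isdigit():
            hosts.append((int(parts[0]), parts[1].rstrip(".").lower()))
    return [host for _, host in sorted(hosts)]


def classify_mx(hosts: list) -> str:
    """Return the first provider whose suffix matches an MX host, else 'Others'."""
    for host in hosts:
        for suffix, provider in MX_SUFFIXES.items():
            if host == suffix or host.endswith("." + suffix):
                return provider
    return "Others"


def lookup_provider(domain: str) -> str:
    """Run dig for one domain and classify the result (requires dig on PATH)."""
    result = subprocess.run(
        ["dig", "+short", domain, "MX"],
        capture_output=True,
        text=True,
        check=False,
    )
    return classify_mx(parse_dig_mx(result.stdout))
```

Domains that self-host or use a provider missing from the table would still land in Others, so this narrows the unknown bucket rather than eliminating it.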
The public GitHub profiles also include locations provided by users. I normalized these locations using geograpy3 to obtain standardized country names. A simple GROUP BY query shows that out of 2,068 maintainers who have listed their locations properly, most are from the US, followed closely by Germany, France, and Canada³.
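The country tally could be as simple as the following sketch. It assumes the normalized country name has already been stored in a hypothetical `country` column on the maintainers table, with NULL or an empty string where the location could not be resolved; the normalization step itself is not shown.

```python
import sqlite3

COUNTRY_QUERY = """
SELECT
    country,
    COUNT(*) AS maintainers
FROM maintainers
WHERE country IS NOT NULL AND country <> ''
GROUP BY country
ORDER BY maintainers DESC, country
"""


def country_counts(conn: sqlite3.Connection) -> list:
    """Return (country, maintainers) rows, most common country first."""
    return conn.execute(COUNTRY_QUERY).fetchall()
```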
These are likely not very accurate. People can list whatever they want in their profiles anyway. It’d be great if we could have a proper annual developer survey like State of JavaScript for Nix.