Using Val Town to chart dependency bloat
Devstats is a Val that you can use to send point-in-time statistics from your GitHub Actions (or other CI) runs and turn them into pretty charts. Here’s ours!
So far, we’re using it to track dependency bloat. Much has been written about dependency bloat. A lot of web projects end up with a multi-gigabyte node_modules directory filled with agglomerated complexity.
Making a pretty chart of our node_modules size over time doesn’t force us to cut weight, but it does help motivate such efforts: who doesn’t like to see a big drop in a chart after making some strategic cut?
There are many other analytics stacks that we could’ve used for this, but I wanted to make something dead-simple:
- schemaless, so you can send any statistic without having to declare it beforehand
- using a barebones HTTP request to send statistics, so all you need is cURL
So, devstats was born! I see it as a very basic riff on the legacy of StatsD and RRDtool but over HTTP and with a less fancy wire protocol.
Here’s what our GitHub Actions configuration has at the very end to send statistics to devstats:
Not bad! curl is already available, so this is all you need to start tracking your node_modules size, package.json lines, and package-lock.json lines over time. This is all using Linux utilities like du, awk, and wc, but you could do the same with JavaScript or any other language to calculate more complex metrics.
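As a rough sketch of what such a step's script could look like, the following computes the three statistics with du, awk, and wc and posts each one with curl. The environment variable names DEVSTATS_URL and DEVSTATS_TOKEN, the bearer-token header, and the {"name", "value"} body are assumptions made for illustration, not the val's confirmed interface. When DEVSTATS_URL is unset, the script only prints what it would send.

```shell
#!/bin/sh
# Sketch of a CI script for reporting stats. DEVSTATS_URL, DEVSTATS_TOKEN,
# the Authorization header, and the {"name", "value"} body are assumptions.
set -eu

# Size of a directory in kilobytes.
dir_size_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Number of lines in a file.
line_count() {
  wc -l < "$1" | awk '{print $1}'
}

# JSON body for one statistic (the name must not contain quotes).
stat_payload() {
  printf '{"name":"%s","value":%s}' "$1" "$2"
}

# POST one statistic, or print it when DEVSTATS_URL is not set.
send_stat() {
  body=$(stat_payload "$1" "$2")
  if [ -z "${DEVSTATS_URL:-}" ]; then
    echo "dry run: $body"
    return 0
  fi
  curl --fail --silent --show-error -X POST "$DEVSTATS_URL" \
    -H "Authorization: Bearer ${DEVSTATS_TOKEN:-}" \
    -H "Content-Type: application/json" \
    -d "$body"
}

# Only report on files that exist in the current checkout.
if [ -d node_modules ]; then
  send_stat node_modules_kb "$(dir_size_kb node_modules)"
fi
if [ -f package.json ]; then
  send_stat package_json_lines "$(line_count package.json)"
fi
if [ -f package-lock.json ]; then
  send_stat package_lock_lines "$(line_count package-lock.json)"
fi
```

In GitHub Actions, a script like this could be the body of a final run step, with the token passed in from repository secrets as an environment variable.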
Using it
To use devstats yourself, just:
- Fork the val. You can rename it, and so on!
- Come up with a random authentication token, and add it to both your repository secrets and your Val Town environment variables.
- Add a step like the one described above to your GitHub Actions configuration.
Not sure how to generate a random authentication token? One way is to run OpenSSL:
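A minimal sketch, assuming the openssl command-line tool is installed: openssl rand -hex 32 prints 32 random bytes as 64 hexadecimal characters. The function name generate_token and the fallback to /dev/urandom are illustrative additions, not part of devstats.

```shell
# Print a random 64-character hex token (32 random bytes).
generate_token() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is unavailable: read 32 bytes from /dev/urandom.
    od -An -N32 -tx1 /dev/urandom | tr -d ' \n'
    echo
  fi
}

token=$(generate_token)
echo "$token"
```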
That’s it! The ‘wire format’ for devstats is simply sending an object like this:
You can send any name you want and it’ll be added to the dashboard.
How it works
This val uses our SQLite support as its database, Observable Plot for charts, and Hono to do some routing and rendering.
Notes
You might be wondering: Val Town’s node_modules directory is around 900 megabytes, so what’s in there? Here’s a peek at our top 10 modules (or namespaces):
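To produce a ranking like this for your own project, one option is to measure every top-level entry in node_modules with du and sort by size. This is a sketch under the assumption that a per-directory total is what you want; the function name top_dirs is hypothetical. Scoped namespaces such as @sentry count as a single entry.

```shell
# List the N largest immediate subdirectories of a directory, sizes in KB.
top_dirs() {
  dir=$1
  n=${2:-10}
  # The trailing slash restricts the glob to directories; hidden entries
  # such as .bin are skipped by the shell glob.
  du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$n"
}

if [ -d node_modules ]; then
  top_dirs node_modules 10
fi
```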
We spend a lot of megabytes on our monitoring stack, using OpenTelemetry and Sentry, and another chunk on the icon sets Lucide and Heroicons. We use tiktoken to truncate text before sending it off to be encoded for semantic search.
One issue: triple distributions
JavaScript has been in a painful transition phase between module systems, moving from CommonJS, which was Node.js’s module system for many years, to ESM, which is standardized and designed to also work in browsers. This is one of the culprits behind our node_modules size.
OpenTelemetry is a pretty heavyweight dependency for us. One issue is that they ship each line of code three times: not just dual modules between CommonJS & ESM, but also an esnext build for bleeding-edge JavaScript features, and they include source maps for all three versions.
Lucide does the same thing with a triple build, but for ESM, CommonJS, and UMD module types. Heroicons does a double build, for CommonJS and ESM.
Another: big binaries
Another big issue we have – that probably many other projects have as well – is the big binaries embedded in some NPM modules.
Sentry includes a binary distribution of their CLI, a native Rust tool, which weighs 16MB. Biome, which we use to lint and format our code, is another 23MB.
Lots of copies of esbuild
The biggest challenge that isn’t on this list is esbuild, which is used internally by lots of modules that we depend on.
That’s GrandPerspective showing all of the files in node_modules as a treemap, with the seven copies of esbuild glowing.
All in all, we spend about 94MB of space storing 7 versions of esbuild and another 3 versions in sub-packages, a fact that I figured out using this epic (for me) chain of Linux commands:
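A sketch of one way to gather that kind of information, not the original command chain: find every directory named esbuild that sits directly inside some node_modules directory, then report each copy's version and size. The function names find_copies and copy_report are hypothetical, and the version is read with a simple sed pattern that assumes a conventionally formatted package.json.

```shell
# Print the path of every installed copy of a package under a node_modules tree.
# -prune stops find from descending into a copy once it has matched.
find_copies() {
  find "$1" -type d -path "*node_modules/$2" -prune -print
}

# Print "version<TAB>size_kb<TAB>path" for each copy.
copy_report() {
  find_copies "$1" "$2" | while IFS= read -r dir; do
    version=$(sed -n 's/.*"version": *"\([^"]*\)".*/\1/p' "$dir/package.json" | head -n 1)
    size=$(du -sk "$dir" | awk '{print $1}')
    printf '%s\t%s\t%s\n' "$version" "$size" "$dir"
  done
}

if [ -d node_modules ]; then
  copy_report node_modules esbuild | sort
  # Sum the size column to get the total space used by all copies.
  copy_report node_modules esbuild |
    awk -F'\t' '{total += $2} END {print "total KB:", total + 0}'
fi
```

This only matches directories named exactly esbuild, so platform-specific sub-packages published under a separate namespace would need their own search.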
The problem with esbuild for us is that some transitive dependency will depend on an old or specific version of esbuild, and NPM will dutifully download and store multiple copies of esbuild to satisfy all consumers. Absent some coordination across the community, we might have to use an override to pin esbuild to a specific version, and hope that our dependencies don’t really need such a specific version range.
Endnotes
Like most decisions, there were good reasons behind all of these. Hybrid CommonJS + ESM modules make sense for a still-divided world of people using NPM modules. Using a conservative version range for esbuild makes sense to limit the chances that your module will unexpectedly break on a new esbuild release. NPM’s ability to download multiple versions of a transitive dependency is actually great and solves a problem inherent in other systems like pip and Cabal.
So, there aren’t many opportunities to point fingers. So far I haven’t found any NPM modules that were bloated because they accidentally included a 40MB MP4 video of Buffy The Vampire Slayer. We’ve used knip to identify a handful of dependencies that we forgot to remove once we stopped using them. Most of the big stuff in node_modules is there for a reason.
So, it’s definitely a goal to cut big dependencies that we aren’t using and slim down the ones that we need to use. Devstats is helping with that, and we’re working on more developer tooling for building Val Town with Val Town!