A Build System for Yarn

It's like Bazel, Buck and Pants but for Yarn.

Owen Kelly
Sunday, November 22, 2020

Tooling in the JavaScript (and TypeScript) ecosystem is generally pretty good (no, really). But for the longest time, a piece of the puzzle has been missing.

Most of what I build ends up being applications with more than one deployable artifact. Sometimes they're just a front-end client and a simple server. Other times it is that plus a GraphQL schema, multiple Lambdas, and so on.

Since Lerna appeared, both the idea of and the tooling for JavaScript monorepos have taken off. For me, it wasn't until Yarn that linking between local packages became a thing, so that, say, your front-end client and server packages could depend on your GraphQL schema package. Yarn v2 took this to another level and added a degree of stability and correctness that makes this even more enticing.

But among all of that, building the packages was still a problem. Namely, if a local package depends on another local package being built, you need to orchestrate that somehow. And as much as I tried, it always ended up feeling less than ideal and certainly not easily repeatable.

In my dabbling in other languages and tooling, I tried using Bazel. In some ways it's great. I used it to great success with a Golang monorepo with multiple build and testing artifacts.

But for JavaScript, well, it's not pretty. JavaScript's package ecosystem is reasonably mature at this point. Sure, it still has its flaws, though work is continually being done to address them. (Yarn v2 vendors your node_modules as zip files, for example.)

JavaScript and Bazel mix reasonably well. But Bazel and npm or Yarn don't. Having two systems both trying to own dependency management is just painful.

Yarn v2

At the start of 2020 I started playing around with Yarn v2. Plug'n'Play and the zipfs approach to vendoring dependencies had me intrigued immediately. Both address areas where I've found our tooling lacking.

In practice, at the start of 2020 support was growing but still limited. Enough things worked to convince me this is a workable approach though.

And then I discovered that Yarn v2 was far more hackable than v1. Not only that, but with the new focus on correctness and reproducibility, the only thing missing to make a Bazel for JavaScript was the build tool itself.

How it works

At a high level the plugin is pretty straightforward. Yarn has already built the dependency graph. We just need to know where to start from on that graph. That's also relatively easy. If you're in the directory of a package, we can work out which package it is. If not, we can build everything.

Once we know what we need to build, we look at everything it depends on, then at everything those packages depend on, and so on. Once we know that, we can put together a plan to build it all, with as much parallelisation as you have threads.
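To make that a little more concrete, here is a rough sketch of the planning step in TypeScript. It is not the plugin's actual code. The DependencyGraph shape and the collectDependencies and planBuild names are made up for illustration, and it assumes the graph has already been reduced to local package names.

```typescript
// Hypothetical graph shape: package name -> local packages it depends on.
type DependencyGraph = Record<string, string[]>;

// Collect `start` plus everything it depends on, directly or transitively.
function collectDependencies(graph: DependencyGraph, start: string): Set<string> {
  const seen = new Set<string>();
  const stack = [start];
  while (stack.length > 0) {
    const name = stack.pop() as string;
    if (seen.has(name)) continue;
    seen.add(name);
    (graph[name] ?? []).forEach((dep) => stack.push(dep));
  }
  return seen;
}

// Group the packages into layers. Every package in a layer depends only on
// packages from earlier layers, so each layer can be built in parallel.
function planBuild(graph: DependencyGraph, start: string): string[][] {
  const remaining = collectDependencies(graph, start);
  const built = new Set<string>();
  const layers: string[][] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining)
      .filter((name) => (graph[name] ?? []).every((dep) => built.has(dep)))
      .sort();
    if (ready.length === 0) {
      throw new Error("Dependency cycle detected");
    }
    ready.forEach((name) => {
      remaining.delete(name);
      built.add(name);
    });
    layers.push(ready);
  }
  return layers;
}

// Made-up example workspace.
const exampleGraph: DependencyGraph = {
  client: ["schema", "ui"],
  server: ["schema"],
  schema: [],
  ui: [],
  docs: [],
};

const plan = planBuild(exampleGraph, "client");
console.log(JSON.stringify(plan)); // [["schema","ui"],["client"]]
```

Building client here means schema and ui can be built side by side first, then client. The server and docs packages are never touched because client doesn't depend on them.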

Bonus Feature

Having worked all of that out, there was one last feature I really wanted to include. And to be honest, it's the main thing I've wanted from the start.

I wanted a command that will create a zip file ready for AWS Lambda, Kubernetes or Docker.

Now I hear what you're saying: what about the Serverless framework? While I know it's a valuable tool, and plenty of us use it with much success, it's never fit my requirements. Any abstraction over CloudFormation that obscures the actual CloudFormation templates has always ended up getting in my way.

Yarn PnP makes this a bit hard, and locally linked packages make it really hard. Vendoring node_modules makes it nearly impossible, especially in a monorepo where your dependencies are shared and hoisted up, meaning you can't just copy the adjacent node_modules folder. We need something smarter. Much smarter.

Once again though, we have access to the dependency graph we've already defined for Yarn. Combining this with the zipfs tooling in Yarn v2, it was not too much extra work to get this going.

Now, running yarn bundle in a package copies the whole workspace (so likely your repository) into a temporary folder. Then, using Yarn's dependency graph, we chuck out everything we don't need: the local packages that aren't used, and the vendored packages that aren't used.
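The pruning decision can be sketched roughly like this. Again, it's an illustration and not the plugin's real implementation. The Workspace shape and the planPrune function are hypothetical, and the sketch assumes each package's vendoredDependencies list is already fully resolved (transitive third-party dependencies included).

```typescript
// Hypothetical description of one package in the workspace.
interface WorkspacePackage {
  localDependencies: string[]; // other packages in the same repository
  vendoredDependencies: string[]; // third-party packages, assumed fully resolved
}

type Workspace = Record<string, WorkspacePackage>;

interface PrunePlan {
  keepLocal: string[];
  keepVendored: string[];
  removeLocal: string[];
}

// Walk the local dependency graph from `target` and work out what to keep.
function planPrune(workspace: Workspace, target: string): PrunePlan {
  const keepLocal = new Set<string>();
  const keepVendored = new Set<string>();
  const queue = [target];
  while (queue.length > 0) {
    const name = queue.shift() as string;
    if (keepLocal.has(name)) continue;
    const pkg = workspace[name];
    if (pkg === undefined) throw new Error(`Unknown package: ${name}`);
    keepLocal.add(name);
    pkg.vendoredDependencies.forEach((dep) => keepVendored.add(dep));
    queue.push(...pkg.localDependencies);
  }
  // Any local package we never reached is not needed in the bundle.
  const removeLocal = Object.keys(workspace).filter((name) => !keepLocal.has(name));
  return {
    keepLocal: Array.from(keepLocal).sort(),
    keepVendored: Array.from(keepVendored).sort(),
    removeLocal: removeLocal.sort(),
  };
}

// Made-up example workspace.
const exampleWorkspace: Workspace = {
  server: { localDependencies: ["schema"], vendoredDependencies: ["express", "graphql"] },
  schema: { localDependencies: [], vendoredDependencies: ["graphql"] },
  client: { localDependencies: ["schema"], vendoredDependencies: ["react"] },
};

const prunePlan = planPrune(exampleWorkspace, "server");
console.log(JSON.stringify(prunePlan));
// {"keepLocal":["schema","server"],"keepVendored":["express","graphql"],"removeLocal":["client"]}
```

Bundling server keeps server and schema along with their vendored packages, and drops client and react entirely.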

At this point we have a zip file that looks like your repo, with a bunch of stuff chucked out. Which is great, but there are two remaining issues to tackle.

The first is Yarn PnP. It's great, and it means our zip file is faster to work with and smaller than a node_modules directory. But we need to run everything via the pnp.js file.

The second is that, as we're recreating the whole workspace in the zip file and not just your package, you need to know exactly where your package is to specify your entrypoint or index file.

The solution was pretty simple. Drop a file called entrypoint.js at the root of the zip file. Have it load pnp.js first, then load your file, referenced in main in your package.json.
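As a sketch of what generating that file might look like, here's a small TypeScript function that renders the source of an entrypoint.js. The renderEntrypoint and joinPosix names are hypothetical, as is the exact content of the generated file. It assumes the PnP file sits at the root of the zip as .pnp.js and is activated by calling the setup() function it exports, so check the file your Yarn version actually generates.

```typescript
// Join path pieces with forward slashes and drop empty or "." segments.
function joinPosix(...parts: string[]): string {
  const segments = parts
    .join("/")
    .split("/")
    .filter((segment) => segment !== "" && segment !== ".");
  return "./" + segments.join("/");
}

// Render the source of a hypothetical entrypoint.js for the bundle root.
// packageDir: where the package lives inside the workspace.
// main: the "main" field from that package's package.json.
// pnpFile: assumed name of the PnP runtime file at the bundle root.
function renderEntrypoint(packageDir: string, main: string, pnpFile: string = ".pnp.js"): string {
  const pnpPath = joinPosix(pnpFile);
  const mainPath = joinPosix(packageDir, main);
  return [
    "// Generated file: set up Plug'n'Play, then hand over to the package.",
    `require(${JSON.stringify(pnpPath)}).setup();`,
    `module.exports = require(${JSON.stringify(mainPath)});`,
    "",
  ].join("\n");
}

// Made-up package location and main field.
const entrypointSource = renderEntrypoint("packages/server", "./dist/index.js");
console.log(entrypointSource);
```

Re-exporting the package's module from entrypoint.js means a Lambda handler setting can point at the root file without knowing where the package sits inside the zip.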

And just like that, yarn bundle can create a zip file ready to run in Lambda et al.

How to get started

This all sounds great, but how do you actually use it?

First, you have to be using Yarn v2 or v3. If you're not already, here's a great getting started guide.

Next, install the plugin by running the following command in your Yarn workspace. For Yarn v2:

yarn plugin import https://yarn.build/v2

or for Yarn v3

yarn plugin import https://yarn.build/latest

This command downloads and installs (or updates) the yarn.build plugin to the latest version. The plugin is downloaded and vendored in your repository. It's not redownloaded on every build.

Currently there are two commands you can run.

yarn build which will run the build script defined in package.json.

And yarn bundle which will create the zip file described above, ready for Lambda et al.

There's still plenty of work to be done on this plugin, but in its current state it's ready to start being used.

You can find the source at github.com/ojkelly/yarn.build.

— OK