Setup monorepo rather than multi repo


Discussion around this topic started about 5-6 years ago. Tech giants were already using this pattern and, after seeing its success in their own codebases, were promoting it. So it is natural that it gained momentum in the developer community.

In this post, we will evaluate the pros and cons of a monorepo and, at the end, set up a simple one.

I have worked with both patterns - monorepo and multi repo. Have a look at my GitHub and you can easily find separate repositories for the backend and frontend of the same project, such as Imagine Backend and Imagine Frontend. Such projects are very difficult to manage: you have to keep track of two repositories, or more as the project grows.

Before starting, I want to show the problem that a monorepo helps to solve at scale. Take a look at the repos of Gatsby, Babel, etc. You will find a /packages directory where all the official Gatsby plugins are listed. Imagine the number of repositories, and the problems the team might have faced, if they had set up one repo per plugin. Here is the video that shows how Google stores billions of files in a monorepo, what problems it solved for them, and at what cost.

Monorepo

This is a very simple pattern. The easiest way to explain it is that you have only one repository to store your whole project. Consider the example of my Imagine project: I would create only one repo named Imagine, in which I would store the two packages web and server 😉. As your project grows and you decide to have your own component library, just create one more package and you are ready to rock.

Pros

  1. Brings great simplicity to your codebase: You are not managing multiple repositories for a single project.
  2. You can fix bugs easily: To understand this point, consider a simple example. You have a separate repo for your component library and you find a small bug in one of the components. Now you have to set up that repo on your system, fix the bug, and wait for the fix to get merged. Only after these steps can you move forward. In a monorepo, you can fix the bug in a single PR (of course with separate commits for ease of review).
  3. Reusing code becomes much easier with packages: I always like to reuse validation code. Suppose you are building a simple registration form with validation. You need the same validation on the client side too, so that you do not waste a request just to validate the input and can give the user instant feedback that something is wrong. If you manage the validation criteria separately on the server and the client, and more criteria arise in the future, you have to make sure they are updated on both sides. In a monorepo, you can put the validation in its own package and reuse it from both, while still keeping the packages isolated from each other 😌
  4. Ease of setting up tooling and managing dependencies: You set up must-have tools like a linter and Prettier once, and they can be used in every package. This does not mean you cannot override them; the setup is very flexible and can be overridden at the package level. We will look at how to set up these tools and Jest with a monorepo in future posts, and I will link them here.
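To make the validation point from the list above concrete, here is a minimal sketch of what a shared validation module could look like. The file location (packages/shared/validation.js), the function names, and the rules themselves are assumptions made for illustration; they are not taken from the Imagine project.

```javascript
// Hypothetical contents of packages/shared/validation.js.
// The rules below are illustrative assumptions, not real project rules.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const MIN_PASSWORD_LENGTH = 8;

// Returns a list of human-readable problems; an empty list means valid input.
function validateRegistration({ email, password } = {}) {
  const errors = [];
  if (typeof email !== "string" || !EMAIL_PATTERN.test(email)) {
    errors.push("email is not valid");
  }
  if (typeof password !== "string" || password.length < MIN_PASSWORD_LENGTH) {
    errors.push(`password must be at least ${MIN_PASSWORD_LENGTH} characters`);
  }
  return errors;
}

// Client side: check before sending the request, for instant feedback.
function canSubmit(formValues) {
  return validateRegistration(formValues).length === 0;
}

// Server side: run the same check again, because client input cannot be trusted.
function handleRegister(body) {
  const errors = validateRegistration(body);
  return errors.length > 0 ? { status: 400, errors } : { status: 201, errors: [] };
}

console.log(canSubmit({ email: "a@example.com", password: "long-enough" }));
console.log(handleRegister({ email: "nope", password: "123" }));
```

When a new rule is needed, it is added once in validateRegistration, and both the client and the server pick it up.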

With all these advantages, a monorepo has some disadvantages too. They come from the very core of the pattern - you have only one repo to store your entire project. So, let's discuss its cons too.

Cons

  1. Restricting access: You are giving everyone access to your single codebase. You cannot restrict some people to only the frontend or only the backend part of your code.

  2. Performance issues: If you watched the video above, you have a good idea of how a monorepo can make you struggle once you have billions of files. The tooling that runs the test pipeline and deployments needs to be very well optimized. Google invested heavily to scale the size of its repository and to provide the resources for computationally intensive tools. But as the pattern's popularity grows day by day, companies are investing in making it efficient. Facebook chose Mercurial, a version control system like Git that is designed to handle projects of any size. There are lots of articles on the internet about how Google and Facebook scale a monolithic repository to store billions of files. Here is one more video, which shows how Mercurial scales for the Facebook monorepo.

  3. Build and clone time: You join a company that has a big monorepo, and you know that you are going to work only on a specific part of the codebase. But still, you have to clone the entire repo and store it on your system 😢.

Every tool comes with its pros and cons, but from my experience of running personal projects with both patterns, monorepos are a great starting point for smaller teams (because smaller teams generally work on the entire codebase and need a quick way to switch between its parts). From there, the pattern scales up to the size of Google, Facebook and other giant codebases. There are also great tools like Lerna, which will make your journey with a monorepo a happy one.

So, I think we have done enough talking. Let's set up a small monorepo ourselves with Yarn Workspaces.

Setup Monorepo

  1. Create a directory for your project and create a package.json file in it.

  2. package.json will have the following content -

{
  "private": true,
  "workspaces": ["packages/server", "packages/shared"]
}

Before moving forward, let's understand what each of these fields means.

"private": true is added because the root of a Yarn workspace is not meant to be published to the npm registry, so this is a simple safety measure for your workspace. workspaces contains the array of packages in your workspace. Note that you can also write this field as "workspaces": ["server", "shared"], but it is a common convention to put all your packages in a common directory named packages. This also lets you specify a glob, like "workspaces": ["packages/*"], instead of adding an individual entry each time 😇. You can find similar usage in both of the monorepos linked above.

  3. Now create the packages directory, and inside it two more directories named server and shared.
  4. Create a package.json file in each package with at least the following minimal content. You can of course run the yarn init command to generate this file; just ensure that it has a proper name and version.

{
  "name": "server",
  "version": "1.0.0"
}

Let's discuss one more convention that people follow and that you can find in the Babel repo (you are not obliged to follow it). The name of each package should look like @<project_name>/<package_name>. For our example, we can use the names @monorepo/server and @monorepo/shared in the corresponding package.json files. This is just for clarity: when you use a package, the name shows that it comes from your own packages rather than from npm.
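If you prefer to script the steps so far instead of creating the files by hand, the following is a rough Node.js sketch. It assumes the packages/* glob form of the workspaces field and the @monorepo scope as a placeholder project name; the function name scaffoldMonorepo is made up for this post, and the demo writes into a temporary directory rather than into a real project.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Writes the workspace layout described above into `root`.
// The "@monorepo" scope and the package names are placeholders.
function scaffoldMonorepo(root, packageNames) {
  const writeJson = (file, data) =>
    fs.writeFileSync(file, JSON.stringify(data, null, 2) + "\n");

  fs.mkdirSync(root, { recursive: true });
  // Root manifest: private, with a glob covering every directory in packages/.
  writeJson(path.join(root, "package.json"), {
    private: true,
    workspaces: ["packages/*"],
  });

  for (const name of packageNames) {
    const dir = path.join(root, "packages", name);
    fs.mkdirSync(dir, { recursive: true });
    // Minimal per-package manifest: only a name and a version.
    writeJson(path.join(dir, "package.json"), {
      name: `@monorepo/${name}`,
      version: "1.0.0",
    });
  }
}

// Demo in a throwaway temp directory so no existing project is touched.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "monorepo-"));
scaffoldMonorepo(demoRoot, ["server", "shared"]);
console.log(fs.readdirSync(path.join(demoRoot, "packages")).sort());
```

Running yarn inside the generated directory should then treat server and shared as workspace packages.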

  5. Our monorepo setup is complete with these few steps. Now, to use any function from another package, list that package in the package.json of the package that needs it, as shown below

"dependencies": {
  "@monorepo/shared": "1.0.0"
}

Run the yarn command, and it will install the shared dependency from your project rather than from npm. If you inspect the directory structure in your IDE, it will look like this

Monorepo Directory Structure

The tiny arrow symbol in the image above indicates a symlink, and that is the trick that allows you to import these packages as if you were importing any other package from npm.

That's it 🎉. Define any function in one package, install that package as a normal dependency wherever you want to use it, and the function will be available to you.
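To see why the symlink is enough, here is a small self-contained sketch that imitates by hand what yarn does for a node_modules-based install. It builds a throwaway workspace in a temp directory, links node_modules/@monorepo/shared to packages/shared, and then resolves the package the way code inside packages/server would. The greet helper and the @monorepo scope are made up for this illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { createRequire } = require("module");

// Build a throwaway workspace: packages/shared plus an empty packages/server.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "monorepo-link-"));
const sharedDir = path.join(root, "packages", "shared");
const serverDir = path.join(root, "packages", "server");
fs.mkdirSync(sharedDir, { recursive: true });
fs.mkdirSync(serverDir, { recursive: true });

fs.writeFileSync(
  path.join(sharedDir, "package.json"),
  JSON.stringify({ name: "@monorepo/shared", version: "1.0.0", main: "index.js" })
);
// `greet` is a hypothetical helper used only for this illustration.
fs.writeFileSync(
  path.join(sharedDir, "index.js"),
  "exports.greet = (name) => `Hello, ${name}!`;\n"
);

// Imitate the link by hand: node_modules/@monorepo/shared -> packages/shared.
const scopeDir = path.join(root, "node_modules", "@monorepo");
fs.mkdirSync(scopeDir, { recursive: true });
fs.symlinkSync(sharedDir, path.join(scopeDir, "shared"), "junction");

// Resolve the package the way a file inside packages/server would.
const requireFromServer = createRequire(path.join(serverDir, "index.js"));
const { greet } = requireFromServer("@monorepo/shared");
console.log(greet("monorepo")); // prints "Hello, monorepo!"
```

Node walks up from packages/server looking for a node_modules directory, finds the link at the root, and follows it to the real package source, so edits in packages/shared are visible to server without reinstalling.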

Conclusion

There are many tools available today to help you maintain a monorepo properly.

  1. Lerna: a tool that optimizes the workflow around managing multi-package repositories with git and npm.

We have discussed Yarn Workspaces here; you may like to dive further into Lerna. That leaves the final question: should you opt for multi repo or monorepo? As with almost any tech question, the answer is that it depends on what you want to work with and how you like to organize your project.
