Overview

Welcome! This is the documentation for Git Repo Manager (GRM for short), a tool that helps you manage git repositories in a declarative way.

Configure your repositories in a TOML or YAML file, and GRM does the rest. Take a look at the example configuration to get a feel for the way you configure your repositories. See the repository tree chapter for details.

GRM also provides some tooling to work with single git repositories using git-worktree. See the worktree chapter for more details.

Why use GRM?

If you're working with a lot of git repositories, GRM can help you manage them easily:

  • You want to easily clone many repositories to a new machine.
  • You want to change remotes for multiple repositories (e.g. because your GitLab domain changed).
  • You want to get an overview of all the repositories you have, and check whether you forgot to commit or push something.

If you want to work with git worktrees in a streamlined, easy way, GRM provides you with an opinionated workflow. It's especially helpful when the following describes you:

  • You're juggling a lot of git branches, switching between them a lot.
  • When switching branches, you'd like to just leave your work as-is, without using the stash or temporary commits.

Installation

Building GRM requires the Rust toolchain to be installed. The easiest way is using rustup. Make sure that rustup is properly installed.

Make sure that the stable toolchain is installed:

$ rustup toolchain install stable

Then, install the build dependencies:

Distribution    Command
Arch Linux      pacman -S --needed gcc openssl pkg-config
Ubuntu/Debian   apt-get install --no-install-recommends pkg-config gcc libssl-dev zlib1g-dev

Then, it's a simple command to install the latest stable version:

$ cargo install git-repo-manager

If you're brave, you can also run the development build:

$ cargo install --git https://github.com/hakoerber/git-repo-manager.git --branch develop

Static build

Note that by default, you will get a dynamically linked executable. Alternatively, you can build a statically linked binary. For this, you will need musl and a few other build dependencies installed:

Distribution    Command
Arch Linux      pacman -S --needed gcc musl perl make
Ubuntu/Debian   apt-get install --no-install-recommends gcc musl-tools libc-dev perl make

(perl and make are required for the OpenSSL build script)

Then, add the musl target via rustup:

$ rustup target add x86_64-unknown-linux-musl

Then, use a modified build command to get a statically linked binary:

$ cargo install git-repo-manager --target x86_64-unknown-linux-musl --features=static-build

Nix

Run from github without downloading:

$ nix run github:hakoerber/git-repo-manager/develop -- --version
git-repo-manager 0.7.15

Run from local source directory:

$ nix run . -- --version
git-repo-manager 0.7.15

Integrate into a Nix Flake:

{
  inputs = {
    ...
    git-repo-manager = {
      url = "github:hakoerber/git-repo-manager";
      inputs.nixpkgs.follows = "nixpkgs";
      inputs.flake-utils.follows = "flake-utils";
    };
  };

  outputs = {
    ...
    pkgs = import inputs.nixpkgs {
        ...
        overlays = [ inputs.git-repo-manager.overlays.git-repo-manager ];
    };
  };
}

Tutorial

Here, you'll find a quick overview of the most common functionality of GRM.

Managing existing repositories

Let's say you have your git repositories at ~/code. To start managing them via GRM, first create a configuration:

grm repos find local ~/code --format yaml > ~/code/config.yml

The result may look something like this:

---
trees:
  - root: ~/code
    repos:
      - name: git-repo-manager
        worktree_setup: true
        remotes:
          - name: origin
            url: "https://github.com/hakoerber/git-repo-manager.git"
            type: https

To apply the configuration and check whether all repositories are in sync, run the following:

$ grm repos sync config --config ~/code/config.yml
[✔] git-repo-manager: OK

Well, obviously there are no changes. To check how changes would be applied, let's change the name of the remote (currently origin):

$ sed -i 's/name: origin/name: github/' ~/code/config.yml
$ grm repos sync config --config ~/code/config.yml
[⚙] git-repo-manager: Setting up new remote "github" to "https://github.com/hakoerber/git-repo-manager.git"
[⚙] git-repo-manager: Deleting remote "origin"
[✔] git-repo-manager: OK

GRM replaced the origin remote with github.

The configuration (~/code/config.yml in this example) would usually be something you'd track in git or synchronize between machines via some other means. Then, on every machine, all your repositories are a single grm repos sync away!

Getting repositories from a forge

Let's say you have a bunch of repositories on GitHub and you'd like to clone them all to your local machine.

To authenticate, you'll need to get a personal access token, as described in the forge documentation. Let's assume you put your token into ~/.github_token (please don't do this if you're working "for real"!).

Let's first see what kind of repos we can find:

$ grm repos find remote --provider github --token-command "cat ~/.github_token" --root ~/code/github.com/ --format yaml
---
trees: []
$

Ummm, ok? No repos? This is because you have to tell GRM what to look for (if you don't, GRM will just relax, as it's lazy).

There are different filters (see the forge documentation for more info). In our case, we'll just use the --owner filter to get all repos that belong to us:

$ grm repos find remote --provider github --token-command "cat ~/.github_token" --owner --root ~/code/github.com/ --format yaml
---
trees:
  - root: ~/code/github.com
    repos:
      - name: git-repo-manager
        worktree_setup: false
        remotes:
          - name: origin
            url: "https://github.com/hakoerber/git-repo-manager.git"
            type: https

Nice! The format is the same as we got from grm repos find local above. So if we wanted, we could save this file and use it with grm repos sync config as above. But there is an even easier way: We can directly clone the repositories!

$ grm repos sync remote --provider github --token-command "cat ~/.github_token" --owner --root ~/code/github.com/
[⚙] Cloning into "~/code/github.com/git-repo-manager" from "https://github.com/hakoerber/git-repo-manager.git"
[✔] git-repo-manager: Repository successfully cloned
[✔] git-repo-manager: OK

Nice! Just to make sure, let's run the same command again:

$ grm repos sync remote --provider github --token-command "cat ~/.github_token" --owner --root ~/code/github.com/
[✔] git-repo-manager: OK

GRM saw that the repository is already there and did nothing (remember, it's lazy).

Using worktrees

Worktrees make it easier to work with multiple branches at the same time in a repository. Let's say we wanted to hack on the codebase of GRM:

$ cd ~/code/github.com/git-repo-manager
$ ls
.gitignore
Cargo.toml
...

Well, this is just a normal git repository. But let's try worktrees! First, we have to convert the existing repository to use the special worktree setup. For all worktree operations, we will use grm worktree (or grm wt for short):

$ grm wt convert
[✔] Conversion done
$ ls
$

So, the code is gone? Not really, there is just no active worktree right now. So let's add one for master:

$ grm wt add master --track origin/master
[✔] Worktree master created
$ ls
master
$ (cd ./master && git status)
On branch master
nothing to commit, working tree clean

Now, a single worktree is kind of pointless (if we only have one, we could also just use the normal setup, without worktrees). So let's add another one for develop:

$ grm wt add develop --track origin/develop
[✔] Worktree develop created
$ ls
develop
master
$ (cd ./develop && git status)
On branch develop
nothing to commit, working tree clean

What's the point? The cool thing is that we can now start working in the develop worktree, without affecting the master worktree at all. If you're working on develop and want to quickly see what a certain file looks like in master, just look inside ./master, it's all there!

This becomes especially interesting when you have many feature branches and are working on multiple features at the same time.

There are a lot of options that influence how worktrees are handled. Maybe you want to automatically track origin/master when you add a worktree called master? Maybe you want your feature branches to have a prefix, so when you're working on the feature1 worktree, the remote branch will be origin/awesomefeatures/feature1? Check out the chapter on worktrees for all the things that are possible.

Managing Repositories

GRM helps you manage a bunch of git repositories easily. There are generally two ways to go about that:

You can either manage a list of repositories in a TOML or YAML file, and use GRM to sync the configuration with the state of the repository.

Or, you can pull repository information from a forge (e.g. GitHub, GitLab) and clone the repositories.

There are also hybrid modes where you pull information from a forge and create a configuration file that you can use later.

Local Configuration

When managing multiple git repositories with GRM, you'll generally have a configuration file containing information about all the repos you have. GRM then makes sure that your repositories match that configuration. If they don't exist yet, it will clone them. It will also make sure that all remotes are configured properly.

Let's try it out:

Get the example configuration

curl --proto '=https' --tlsv1.2 -sSfO https://raw.githubusercontent.com/hakoerber/git-repo-manager/master/example.config.toml

Then, you're ready to run the first sync. This will clone all configured repositories and set up the remotes.

$ grm repos sync config --config example.config.toml
[⚙] Cloning into "/home/me/projects/git-repo-manager" from "https://code.hkoerber.de/hannes/git-repo-manager.git"
[✔] git-repo-manager: Repository successfully cloned
[⚙] git-repo-manager: Setting up new remote "github" to "https://github.com/hakoerber/git-repo-manager.git"
[✔] git-repo-manager: OK
[⚙] Cloning into "/home/me/projects/dotfiles" from "https://github.com/hakoerber/dotfiles.git"
[✔] dotfiles: Repository successfully cloned
[✔] dotfiles: OK

If you run it again, it will report no changes:

$ grm repos sync config -c example.config.toml
[✔] git-repo-manager: OK
[✔] dotfiles: OK

Generate your own configuration

Now, if you already have a few repositories, it would be quite laborious to write a configuration from scratch. Luckily, GRM has a way to generate a configuration from an existing file tree:

grm repos find local ~/your/project/root > config.toml

This will detect all repositories and remotes and write them to config.toml.

You can exclude repositories from the generated configuration by providing a regex that will be tested against the path of each discovered repository:

grm repos find local ~/your/project/root --exclude "^.*/subdir/match-(foo|bar)/.*$" > config.toml
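To sanity-check a pattern before running grm, you can approximate the matching with grep -E (a sketch; the paths are made up, and grm's exact regex dialect may differ slightly from POSIX ERE):

```shell
# paths matching the pattern would be excluded from the generated config
pattern='^.*/subdir/match-(foo|bar)/.*$'
printf '%s\n' \
    /home/me/code/subdir/match-foo/repo-a \
    /home/me/code/other/repo-b \
    | grep -Ev "$pattern"    # prints only /home/me/code/other/repo-b
```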

Show the state of your projects

$ grm repos status --config example.config.toml
╭──────────────────┬──────────┬────────┬───────────────────┬────────┬─────────╮
│ Repo             ┆ Worktree ┆ Status ┆ Branches          ┆ HEAD   ┆ Remotes │
╞══════════════════╪══════════╪════════╪═══════════════════╪════════╪═════════╡
│ git-repo-manager ┆          ┆ ✔      ┆ branch: master    ┆ master ┆ github  │
│                  ┆          ┆        ┆ <origin/master> ✔ ┆        ┆ origin  │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ dotfiles         ┆          ┆ ✔      ┆                   ┆ Empty  ┆ origin  │
╰──────────────────┴──────────┴────────┴───────────────────┴────────┴─────────╯

You can also use status without --config to check the repository you're currently in:

$ cd ~/example-projects/dotfiles
$ grm repos status
╭──────────┬──────────┬────────┬──────────┬───────┬─────────╮
│ Repo     ┆ Worktree ┆ Status ┆ Branches ┆ HEAD  ┆ Remotes │
╞══════════╪══════════╪════════╪══════════╪═══════╪═════════╡
│ dotfiles ┆          ┆ ✔      ┆          ┆ Empty ┆ origin  │
╰──────────┴──────────┴────────┴──────────┴───────┴─────────╯

YAML

By default, the repo configuration uses TOML. If you prefer YAML, just give it a YAML file instead (the file extension does not matter; grm will figure out the format). To generate a YAML configuration instead of a TOML one, pass --format yaml to grm repos find.
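For reference, the YAML tree from the tutorial would look roughly like this in TOML (a sketch; see the shipped example.config.toml for the authoritative format):

```toml
[[trees]]
root = "~/code"

[[trees.repos]]
name = "git-repo-manager"
worktree_setup = true

[[trees.repos.remotes]]
name = "origin"
url = "https://github.com/hakoerber/git-repo-manager.git"
type = "https"
```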

Forge Integrations

In addition to managing repositories locally, grm also integrates with source code hosting platforms. Right now, the following platforms are supported:

  • GitHub
  • GitLab

Imagine you are just starting out with grm and want to clone all your repositories from GitHub. This is as simple as:

$ grm repos sync remote --provider github --owner --token-command "pass show github_grm_access_token" --root ~/projects

You will end up with your projects cloned into ~/projects/{your_github_username}/.

Authentication

The only currently supported authentication option is using a personal access token.

GitHub

See the GitHub documentation on personal access tokens.

The only required permission is the "repo" scope.

GitLab

See the GitLab documentation on personal access tokens.

The required scopes are a bit odd. In theory, the following should suffice:

  • read_user, to get user information (required to determine the name of the currently authenticated user for the --owner filter).
  • A scope that allows listing private repositories (read_repository only covers cloning private repos). Unfortunately, no such scope exists.

So currently, you'll need to select the read_api scope.

Filters

By default, grm will sync nothing. This is quite boring, so you have to tell the command which repositories to include. Filters are all inclusive (i.e. they act as a logical OR), so you can easily chain many of them to clone a whole bunch of repositories. It's quite simple:

  • --user <USER> syncs all repositories of that remote user
  • --group <GROUP> syncs all repositories of that remote group/organization
  • --owner syncs all repositories of the user that is used for authentication. This is effectively a shortcut for --user $YOUR_USER
  • --access syncs all repositories that the current user has access to

Easiest to see in an example:

$ grm repos sync remote --provider github --user torvalds --owner --group zalando [...]

This would sync all of Torvalds' repositories, all of your own repositories and all (public) repositories in the "zalando" group.

Strategies

There are generally three ways how you can use grm with forges:

Ad-hoc cloning

This is the easiest, there are no local files involved. You just run the command, grm clones the repos, that's it. If you run the command again, grm will figure out the differences between local and remote repositories and resolve them locally.

Create a file

This is effectively grm repos find local, but using the forge instead of the local file system. You will end up with a normal repository file that you can commit to git. To update the list of repositories, just run the command again and commit the new file.

Define options in a file

This is a hybrid approach: You define filtering options in a file that you can commit to source control. Effectively, you are persisting the options you gave to grm on the command line with the ad-hoc approach. Similarly, grm will figure out differences between local and remote and resolve them.

A file would look like this:

provider = "github"
token_command = "cat ~/.github_token"
root = "~/projects"

[filters]
owner = true
groups = [
  "zalando"
]

The options in the file map to the command line options of the grm repos sync remote command.

You'd then run the grm repos sync command the same way as with a list of repositories in a configuration:

$ grm repos sync --config example.config.toml

You can even use that file to generate a repository list that you can feed into grm repos sync:

$ grm repos find config --config example.config.toml > repos.toml
$ grm repos sync config --config repos.toml

Using with self-hosted GitLab

By default, grm uses the default GitLab API endpoint (https://gitlab.com). You can override the endpoint by specifying the --api-url parameter. Like this:

$ grm repos sync remote --provider gitlab --api-url https://gitlab.example.com [...]

The cloning protocol

By default, grm will use HTTPS for public repositories and SSH otherwise. This can be overridden with the --force-ssh switch.

About the token command

To ensure maximum flexibility, grm has a single way to get the token it uses to authenticate: Specify a command that returns the token via stdout. This easily integrates with password managers like pass.

Of course, you are also free to specify something like echo mytoken as the command, as long as you are OK with the security implications (like having the token in clear text in your shell history). It may be better to have the token in a file instead and read it: cat ~/.gitlab_token.

Generally, use whatever you want. The command just has to return successfully and return the token as the first line of stdout.
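For example, a file-based setup could look like this (the token value and file name are purely illustrative; what matters is that the command prints the token on the first line of stdout):

```shell
# store a fake token in a tightly-permissioned file (illustrative value!)
token_file=$(mktemp)
printf 'ghp_exampletoken123\n' > "$token_file"
chmod 600 "$token_file"

# grm would then be invoked with: --token-command "head -n 1 <that file>"
head -n 1 "$token_file"    # prints: ghp_exampletoken123
```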

Examples

Maybe you just want to locally clone all repos from your GitHub user?

$ grm repos sync remote --provider github --owner --root ~/github_projects --token-command "pass show github_grm_access_token"

This will clone all repositories into ~/github_projects/{your_github_username}.

If instead you want to clone all repositories you have access to (e.g. via organizations or other users' private repos you have access to), just change the filter a little bit:

$ grm repos sync remote --provider github --access --root ~/github_projects --token-command "pass show github_grm_access_token"

Limitations

GitHub

Unfortunately, GitHub does not have a nice API endpoint to get private repositories for a certain user (/users/{user}/repos/ only returns public repositories).

Therefore, using --user {user} will only show public repositories for GitHub. Note that this does not apply to --access: If you have access to another user's private repository, it will be listed.

Adding integrations

Adding a new integration involves writing some Rust code. Most of the logic is generic, so you will not have to reinvent the wheel. Generally, you will need to gather the following information:

  • A list of repositories for a single user
  • A list of repositories for a group (or any similar concept if applicable)
  • A list of repositories for the user that the API token belongs to
  • The username of the currently authenticated user

Authentication currently only works via a bearer token passed via the Authorization HTTP header.

Each repo has to have the following properties:

  • A name (which also acts as the identifier for diff between local and remote repositories)
  • An SSH URL to push to
  • An HTTPS URL to clone and fetch from
  • A flag that marks the repository as private

If you plan to implement another forge, please first open an issue so we can go through the required setup. I'm happy to help!

Git Worktrees

Why?

The default workflow when using git is having your repository in a single directory. Then, you can check out a certain reference (usually a branch), which will update the files in the directory to match the state of that reference. Most of the time, this is exactly what you need and works perfectly. But especially when you're working with branches a lot, you may notice that there is a lot of work required to make everything run smoothly.

Maybe you have experienced the following: You're working on a feature branch. Then, for some reason, you have to change branches (maybe to investigate some issue). But you get the following:

error: Your local changes to the following files would be overwritten by checkout

Now you can create a temporary commit or stash your changes. In any case, you have some mental overhead before you can work on something else. Especially with stashes, you'll have to remember to do a git stash pop before resuming your work (I cannot count the number of times where I "rediscovered" some code hidden in some old stash I forgot about). Also, conflicts on a git stash pop are just horrible.

And even worse: If you're currently in the process of resolving merge conflicts or an interactive rebase, there is just no way to "pause" this work to check out a different branch.

Sometimes, it's crucial to have an unchanging state of your repository until some long-running process finishes. I'm thinking of Ansible and Terraform runs. I'd rather not change to a different branch while Ansible or Terraform is running, as I have no idea how those tools would behave (and I'm not too eager to find out).

In any case, git worktrees are here to the rescue:

What are git worktrees?

Git worktrees allow you to have multiple independent checkouts of your repository in different directories. Each directory corresponds to a different reference of your repository. Each worktree has its own working tree (duh) and index, so there is no way to run into conflicts. Changing to a different branch is just a cd away (if the worktree is already set up).
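A minimal plain-git sketch of the mechanism, using a throwaway repository (all paths and names are made up):

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# check out a second, fully independent working tree on a new branch
git -C "$repo" worktree add -q -b feature "$repo-feature"

git -C "$repo-feature" branch --show-current    # prints: feature
```

Both directories now exist side by side; changes in one working tree never touch the other.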

Worktrees in GRM

GRM exposes an opinionated way to use worktrees in your repositories. Opinionated, because there is a single invariant that makes reasoning about your worktree setup quite easy:

The branch inside the worktree is always the same as the directory name of the worktree.

In other words: If you're checking out branch mybranch into a new worktree, the worktree directory will be named mybranch.

GRM can be used with both "normal" and worktree-enabled repositories. But note that a single repository can be either the former or the latter. You'll have to decide during the initial setup which way you want to go for that repository.

If you want to clone your repository in a worktree-enabled way, specify worktree_setup = true for the repository in your config.toml:

[[trees.repos]]
name = "git-repo-manager"
worktree_setup = true

Now, when you run grm repos sync, you'll notice that the directory of the repository is empty! Well, not quite: there is a hidden directory called .git-main-working-tree. This is where the repository actually "lives" (it's a bare checkout).

Note that there are a few specific things you can configure for a certain worktree setup. This is all done in an optional grm.toml file right in the root of the worktree setup. More on that later.

Manual access

GRM isn't doing any magic, it's just git under the hood. If you need to have access to the underlying git repository, you can always do this:

$ git --git-dir ./.git-main-working-tree [...]

This should never be required (whenever you have to do this, you can consider it a bug in GRM and open an issue), but it may help in a pinch.
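To demystify the layout, here is a plain-git sketch that builds the same structure by hand (throwaway repositories and illustrative paths; GRM's actual implementation may differ in details):

```shell
set -e
src=$(mktemp -d)    # a throwaway "upstream" repository
git init -q "$src"
git -C "$src" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

work=$(mktemp -d)   # what a worktree-enabled checkout looks like
git clone -q --bare "$src" "$work/.git-main-working-tree"

# per the invariant, each worktree directory is named after its branch
branch=$(git --git-dir "$work/.git-main-working-tree" symbolic-ref --short HEAD)
git --git-dir "$work/.git-main-working-tree" worktree add "$work/$branch" "$branch"
```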

Working with Worktrees

Creating a new worktree

To actually work, you'll first have to create a new worktree checkout. All worktree-related commands are available as subcommands of grm worktree (or grm wt for short):

$ grm wt add mybranch
[✔] Worktree mybranch created

You'll see that there is now a directory called mybranch that contains a checkout of your repository, using the branch mybranch:

$ cd ./mybranch && git status
On branch mybranch
nothing to commit, working tree clean

You can work in this repository as usual. Make changes, commit them, revert them, whatever you're up to :)

Just note that you should not change the branch inside the worktree directory. There is nothing preventing you from doing so, but you will run into problems when trying to remove the worktree (more on that later). It may also lead to confusing behavior, as no two worktrees can have the same branch checked out. So if you decide to use the worktree setup, go all in, let grm manage your branches, and bury git branch (and git checkout -b).
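The underlying constraint comes from git itself: a branch can only be checked out in one worktree at a time. A quick demonstration with a throwaway repository (names are illustrative):

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"
git -C "$repo" worktree add -q -b other "$repo-other"

# checking "other" out here fails: it is already active in the second worktree
git -C "$repo" checkout other 2>&1 || echo "refused, as expected"
```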

You will notice that there is no tracking branch set up for the new branch. You can of course set one up manually after creating the worktree, but there is an easier way: the --track flag during creation. Let's create another worktree. Go back to the root of the repository and run:

$ grm wt add mybranch2 --track origin/mybranch2
[✔] Worktree mybranch2 created

You'll see that this branch is now tracking mybranch2 on the origin remote:

$ cd ./mybranch2 && git status
On branch mybranch2

Your branch is up to date with 'origin/mybranch2'.
nothing to commit, working tree clean

The behavior of --track differs depending on the existence of the remote branch:

  • If the remote branch already exists, grm uses it as the base of the new local branch.
  • If the remote branch does not exist (as in our example), grm will create a new remote tracking branch, using the default branch (either main or master) as the base

Often, you'll have a workflow that uses tracking branches by default. It would be quite tedious to add --track every single time. Luckily, the grm.toml file supports defaults for the tracking behavior. See this for an example:

[track]
default = true
default_remote = "origin"

This will set up a tracking branch on origin that has the same name as the local branch.

Sometimes, you might want to have a certain prefix for all your tracking branches. Maybe to prevent collisions with other contributors. You can simply set default_remote_prefix in grm.toml:

[track]
default = true
default_remote = "origin"
default_remote_prefix = "myname"

When using branch my-feature-branch, the remote tracking branch would be origin/myname/my-feature-branch in this case.

Note that --track overrides any configuration in grm.toml. If you want to disable tracking, use --no-track.

Showing the status of your worktrees

There is a handy little command that will show you an overview of all worktrees in a repository, including their status (i.e. changed files). Just run the following in the root of your repository:

$ grm wt status
╭───────────┬────────┬───────────┬──────────────────╮
│ Worktree  ┆ Status ┆ Branch    ┆ Remote branch    │
╞═══════════╪════════╪═══════════╪══════════════════╡
│ mybranch  ┆ ✔      ┆ mybranch  ┆                  │
│ mybranch2 ┆ ✔      ┆ mybranch2 ┆ origin/mybranch2 │
╰───────────┴────────┴───────────┴──────────────────╯

The "Status" column shows any uncommitted changes (new / modified / deleted files), and the "Remote branch" column shows differences from the remote branch (e.g. new pushes to the remote branch that are not yet incorporated into your local branch).

Deleting worktrees

If you're done with your worktrees, use grm wt delete to delete them. Let's start with mybranch2:

$ grm wt delete mybranch2
[✔] Worktree mybranch2 deleted

Easy. On to mybranch:

$ grm wt delete mybranch
[!] Changes in worktree: No remote tracking branch for branch mybranch found. Refusing to delete

Hmmm. grm tells you:

"Hey, there is no remote branch that you could have pushed your changes to. I'd rather not delete work that you cannot recover."

Note that grm is very cautious here. As your repository will not be deleted, you could still recover the commits via git-reflog. But better safe than sorry! Note that you'd get a similar error message if your worktree had any uncommitted files, for the same reason. Now you can either commit & push your changes, or you tell grm that you know what you're doing:

$ grm wt delete mybranch --force
[✔] Worktree mybranch deleted

If you just want to delete all worktrees that do not contain any changes, you can also use the following:

$ grm wt clean

Note that this will not delete the default branch of the repository. It can of course still be deleted with grm wt delete if necessary.

Converting an existing repository

It is possible to convert an existing directory to a worktree setup, using grm wt convert. This command has to be run in the root of the repository you want to convert:

$ grm wt convert
[✔] Conversion successful

This command will refuse to run if you have any changes in your repository. Commit them and try again!

Afterwards, the directory is empty, as there are no worktrees checked out yet. Now you can use the usual commands to set up worktrees.
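Conceptually, the conversion moves the git directory into the hidden .git-main-working-tree location and marks the repository as bare, something like this (a rough sketch of the idea with a throwaway repository, not GRM's literal implementation):

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"
cd "$repo"

mv .git .git-main-working-tree
git --git-dir .git-main-working-tree config core.bare true

ls -A    # only .git-main-working-tree remains (the tree was clean and empty)
```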

Worktrees and Remotes

To fetch all remote references from all remotes in a worktree setup, you can use the following command:

$ grm wt fetch
[✔] Fetched from all remotes

This is equivalent to running git fetch --all in any of the worktrees.

Often, you may want to pull all remote changes into your worktrees. For this, use the git pull equivalent:

$ grm wt pull
[✔] master: Done
[✔] my-cool-branch: Done

This will refuse when there are local changes, or if the branch cannot be fast forwarded. If you want to rebase your local branches, use the --rebase switch:

$ grm wt pull --rebase
[✔] master: Done
[✔] my-cool-branch: Done

This rebases your changes onto the upstream branch, which is mainly helpful for persistent branches that change on the remote side.

As noted, this will fail if there are any local changes in your worktree. If you want to stash these changes automatically before the pull (and unstash them afterwards), use the --stash option.

There is a similar rebase feature that rebases onto the default branch instead:

$ grm wt rebase
[✔] master: Done
[✔] my-cool-branch: Done

This is super helpful for feature branches. If you want to incorporate changes made to the default branch, use grm wt rebase and all your branches will be up to date. If you also want to update to the remote tracking branches in one go, use the --pull flag, and add --rebase if you want to rebase instead of aborting on non-fast-forwards:

$ grm wt rebase --pull --rebase
[✔] master: Done
[✔] my-cool-branch: Done

"So, what's the difference between pull --rebase and rebase --pull? Why the hell is there a --rebase flag in the rebase command?"

Yes, it's kind of weird. Remember that pull only ever updates each worktree to its remote branch, if possible. rebase instead rebases onto the default branch. The switches to rebase are just convenience, so you do not have to run two commands.

  • rebase --pull is the same as pull && rebase
  • rebase --pull --rebase is the same as pull --rebase && rebase

I understand that the UX is not the most intuitive. If you can think of an improvement, please let me know (e.g. via a GitHub issue)!

As with pull, rebase will also refuse to run when there are changes in your worktree. And you can also use the --stash option to stash/unstash changes automatically.

Behavior Details

When working with worktrees and GRM, there is a lot going on under the hood. Each time you create a new worktree, GRM has to figure out what commit to set your new branch to and how to configure any potential remote branches.

To state again, the most important guideline is the following:

The branch inside the worktree is always the same as the directory name of the worktree.

The second set of guidelines relates to the commit to check out, and the remote branches to use:

  • When a branch already exists, you will get a worktree for that branch
  • Existing local branches are never changed
  • Only do remote operations if specifically requested (via configuration file or command line parameters)
  • When you specify --track, you will get that exact branch as the tracking branch
  • When you specify --no-track, you will get no tracking branch

Apart from that, GRM tries to do The Right Thing™. It should be as unsurprising as possible.

In 99% of the cases, you will not have to care about the details, as the normal workflows are covered by the rules above. In case you want to know the exact behavior "specification", take a look at the module documentation for grm::worktree.

If you think existing behavior is super-duper confusing and you have a better idea, do not hesitate to open a GitHub issue to discuss this!

FAQ

Currently empty, as there are no questions that are asked frequently :D

Overview

GRM is still in very early development. I started GRM mainly to scratch my own itches (and am heavily dogfooding it). If you have a new use case for GRM, go for it!

Contributing

To contribute, just fork the repo and create a pull request against develop. If you plan bigger changes, please consider opening an issue first, so we can discuss it.

If you want, add yourself to the CONTRIBUTORS file in your pull request.

Branching strategy

The branching strategy is a simplified git-flow.

  • master is the "production" branch. Each commit is a new release.
  • develop is the branch where new stuff is coming in.
  • feature branches branch off of develop and merge back into it.

Feature branches are not required, there are also changes happening directly on develop.

Required tooling

You will need the following tools:

  • Rust (obviously) (easiest via rustup)
  • Python3
  • just, a command runner like make. See here for installation instructions (it's most likely just a simple cargo install just).
  • Docker & docker-compose for the e2e tests
  • isort, black and shfmt for formatting.
  • ruff and shellcheck for linting.
  • python-tomlkit for the dependency update script.
  • mdbook for the documentation

Here is how to install them:

| Distribution | Command |
| --- | --- |
| Arch Linux | `pacman -S --needed python3 rustup just docker docker-compose python-black shfmt shellcheck mdbook python-tomlkit` |
| Ubuntu/Debian | `apt-get install --no-install-recommends python3 docker.io docker-compose black shellcheck python3-tomlkit` |

Note that you will have to install just and mdbook manually on Ubuntu (e.g. via cargo install just mdbook if your rust build environment is set up correctly). Same for shfmt, which may just be a go install mvdan.cc/sh/v3/cmd/shfmt@latest, depending on your go build environment.

For details about rustup and the toolchains, see the installation section.

Development Environment with Nix

Enter a development shell with all tools and dependencies:

$ nix develop

From within the nix shell:

$ just [TARGET]

or

$ cargo build

Update toolchain and dependencies:

$ nix flake update

Build:

$ nix build

Run:

$ nix run . -- [ARGUMENTS]

Find more documentation about Nix Flakes here: https://nixos.wiki/wiki/Flakes

Caveats

The current Nix environment does not include:

  • aarch64-unknown-linux-musl
  • x86_64-unknown-linux-musl
  • docker and related tools

If there is interest, this can be added.

Developing Nix

The crate is built using Crane.

Format code with alejandra.

Testing

There are two distinct test suites: the unit tests (just test-unit) and integration tests (just test-integration) that are part of the Rust crate, and a separate e2e test suite written in Python (just test-e2e).

To run all tests, run just test.

When contributing, consider whether it makes sense to add tests which could prevent regressions in the future. When fixing bugs, it makes sense to add tests that expose the wrong behavior beforehand.

The unit and integration tests are very small and only test a few self-contained functions (like validation of certain input).

E2E tests

The main focus of the testing setup lies on the e2e tests. Each user-facing behavior should have a corresponding e2e test. These are the most important tests, as they test functionality the user will use in the end.

The test suite is written in python and uses pytest. There are helper functions that set up temporary git repositories and remotes in a tmpfs.

Effectively, each test works like this:

  • Set up some prerequisites (e.g. different git repositories or configuration files)
  • Run grm
  • Check that everything is according to expected behavior (e.g. that grm had certain output and exit code, that the target repositories have certain branches, heads and remotes, ...)

As there are many different scenarios, the tests make heavy use of the @pytest.mark.parametrize decorator to get all permutations of input parameters (e.g. whether a configuration exists, what a config value is set to, what the repository looks like, ...).

Whenever you write a new test, think about the different circumstances that can happen. What are the failure modes? What affects the behavior? Parametrize each of these behaviors.
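To illustrate this (with hypothetical parameter names; the real suite additionally sets up repositories and invokes grm), a parametrized test might look like this:

```python
import pytest

# Hypothetical sketch: three parametrized dimensions yield 2 * 2 * 2 = 8
# test instances. The real e2e tests additionally set up a temporary
# repository, run grm, and assert on output, exit code and repository state.
@pytest.mark.parametrize("use_config", [True, False])
@pytest.mark.parametrize("config_value", ["0", "1"])
@pytest.mark.parametrize("repo_has_remote", [True, False])
def test_example(use_config, config_value, repo_has_remote):
    ...  # set up prerequisites, run grm, check expectations
```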

Optimization

Note: You will most likely not need to read this.

Each test parameter will exponentially increase the number of tests that will be run. As a general rule, comprehensiveness is more important than test suite runtime (so if in doubt, better to add another parameter to catch every edge case). But try to keep the total runtime sane. Currently, the whole just test-e2e target runs ~8,000 tests and takes around 5 minutes on my machine, excluding binary and docker build time. I'd say that keeping it under 10 minutes is a good idea.

To optimize tests, look out for two patterns: Dependency and Orthogonality

Dependency

If a parameter depends on another one, it makes little sense to handle them independently. Example: You have a parameter that specifies whether a configuration is used, and another parameter that sets a certain value in that configuration file. It might look something like this:

@pytest.mark.parametrize("use_config", [True, False])
@pytest.mark.parametrize("use_value", ["0", "1"])
def test(use_config, use_value): ...

This leads to 4 tests being instantiated. But there is little point in setting a configuration value when no config is used, so the combinations (False, "0") and (False, "1") are redundant. To remedy this, spell out the optimized permutation manually:

@pytest.mark.parametrize("config", ((True, "0"), (True, "1"), (False, None)))
def test(config):
    (use_config, use_value) = config

This cuts down the number of tests by 25%. If you have more dependent parameters (e.g. additional configuration values), this gets even better. Generally, this will cut down the number of tests to

\[ \frac{1}{o \cdot c} + \frac{1}{(o \cdot c) ^ {(n + 1)}} \]

with \( o \) being the number of parent parameters a parameter depends on, \( c \) being the cardinality of the test input (so you can assume \( o = 1 \) and \( c = 2 \) for boolean parameters), and \( n \) being the number of parameters that are optimized, i.e. folded into their parent parameter.

As an example: Folding down two boolean parameters into one dependent parent boolean parameter will cut down the number of tests to 62.5%!
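As a sanity check, the formula can be evaluated for both examples above with a few lines of Python:

```python
def remaining_fraction(o: int, c: int, n: int) -> float:
    # Fraction of tests that remain after folding n dependent parameters
    # into their parent parameter, per the formula above.
    return 1 / (o * c) + 1 / (o * c) ** (n + 1)

# One folded boolean parameter: 75% of the tests remain (the 25% cut above).
assert remaining_fraction(o=1, c=2, n=1) == 0.75

# Two folded boolean parameters: 62.5% of the tests remain.
assert remaining_fraction(o=1, c=2, n=2) == 0.625
```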

Orthogonality

If different test parameters are independent of each other, there is little point in testing their combinations. Instead, split them up into different test functions. Combining \( n \) independent boolean parameters into a single test produces \( 2^n \) test instances, while splitting them produces only \( 2n \).

So instead of this:

@pytest.mark.parametrize("param1", [True, False])
@pytest.mark.parametrize("param2", [True, False])
def test(param1, param2): ...

Rather do this:

@pytest.mark.parametrize("param1", [True, False])
def test_param1(param1): ...

@pytest.mark.parametrize("param2", [True, False])
def test_param2(param2): ...

The tests run in Docker via docker-compose. This is mainly needed to test networking functionality like the GitLab integration, with the GitLab API being mocked by a simple Flask container.

Dependency updates

Rust has the same problem as the node ecosystem, just a few orders of magnitude smaller: Dependency sprawl. GRM has a dozen direct dependencies, but over 150 transitive ones.

To keep them up to date, there is a script: update-cargo-dependencies.py. It updates direct dependencies to the latest stable version and updates transitive dependencies where possible. To run it, use just update-dependencies, which will create commits for each update.

Releases

To make a release, make sure you are on a clean develop branch, sync your remotes and then run ./release (major|minor|patch). It will handle a git-flow-y release, meaning that it will perform a merge from develop to master, create a git tag, sync all remotes and run cargo publish.

Make sure to run just check before releasing to make sure that nothing is broken.

As GRM is still v0.x, there is not much consideration for backwards compatibility. Generally, update the patch version for small stuff and the minor version for bigger / backwards incompatible changes.

Generally, it's good to regularly release a new patch release with updated dependencies. As ./release.sh patch is exposed as a Justfile target (release-patch), it's possible to do both in one step:

$ just update-dependencies check release-patch

Release notes

There are currently no release notes. Things are changing quite quickly and there is simply no need for a record of changes (except the git history of course).

Formatting & Style

Code formatting

I'm allergic to discussions about formatting. I'd rather make the computer do it for me.

For Rust, just use cargo fmt. For Python, use black. I'd rather not spend any effort in configuring the formatters (not possible for black anyway). For shell scripts, use shfmt.

To autoformat all code, use just fmt.

Style

Honestly, no idea about style. I'm still learning Rust, so I'm trying to find a good style. Just try to keep it consistent when you add code.

Linting

You can use just lint to run all lints.

Rust

Clippy is the guard that prevents shitty code from getting into the code base. When running just check, any clippy suggestions will make the command fail. So make clippy happy! The easiest way:

  • Commit your changes (so clippy can make changes safely).
  • Run cargo clippy --fix to do the easy changes automatically.
  • Run cargo clippy and take a look at the messages.

So far, I have had no need to override or silence any clippy suggestions.

Shell

shellcheck lints all shell scripts. As they change very rarely, this is not too important.

Unsafe code

Unsafe code is globally forbidden for now via #![forbid(unsafe_code)]. I cannot think of any reason GRM may need unsafe. If it comes up, it needs to be discussed.

Documentation

The documentation lives in the docs folder and uses mdBook. Please document new user-facing features here!

Using GitHub actions, the documentation on master is automatically published to the project homepage via GitHub pages. See .github/workflows/gh-pages.yml for the configuration of GitHub Actions.