Git - Remove unreachable objects using git prune

This article covers how to use the command 'git prune'.

After using git for a while, you may find that your repository contains a number of git objects (commits, trees, and blobs) that are no longer reachable from any branch, tag, or other reference.

Let's look at an example of how this could happen, in a new repository:

mkdir using-git-prune
cd using-git-prune

git init .

This creates a new directory called "using-git-prune", makes it the current directory, and initialises a repository in it:

Initialized empty Git repository in D:/Code/using-git-prune/.git/

Now let's create our first file, and commit it:

echo "hello world!" > helloworld.txt
git add helloworld.txt
git commit -am "added helloworld.txt"

We now have a file called "helloworld.txt" containing the line "hello world!", and we have made our first commit:

[master (root-commit) 7cc707c] added helloworld.txt
 1 file changed, 1 insertion(+)
 create mode 100644 helloworld.txt

To show git prune in action, we'll need a second commit.

Let's add a line to the existing file:

echo "goodbye world!" >> helloworld.txt
git commit -am "added a second line to helloworld.txt"
[master 5c6b407] added a second line to helloworld.txt
 1 file changed, 1 insertion(+)

We can now see the two commits using git log:

git log
commit 5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa (HEAD -> master)
Author: Sean Lloyd
Date:   Sun Apr 05 09:43:41 2021 +0100

        added a second line to helloworld.txt

commit 7cc707c7c9f2d88512d7788e76391ee2d048e426
Author: Sean Lloyd
Date:   Sun Apr 05 10:01:20 2021 +0100

        added helloworld.txt

Let's say we've decided that helloworld.txt shouldn't contain the line "goodbye world!".

We can do a hard reset to the first commit, which moves HEAD (and the master branch) back to that commit and discards the change in the working tree:

git reset --hard 7cc707c7c9f2d88512d7788e76391ee2d048e426
HEAD is now at 7cc707c added helloworld.txt

And if we use git log now, we can see that only a single commit remains:

git log
commit 7cc707c7c9f2d88512d7788e76391ee2d048e426
Author: Sean Lloyd
Date:   Sun Apr 05 10:01:20 2021 +0100

        added helloworld.txt

Great! But what happens if we try checking out the commit we just reset away from? It should be gone, right?

Let's give it a go:

git checkout 5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa

Despite resetting to the previous commit, git still remembers the second commit, and the checkout succeeds.

Git does let us know that we are now in a detached HEAD state:

Note: switching to '5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 5c6b407 added a second line to helloworld.txt
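
The same point can be checked without moving HEAD at all. The following is a minimal sketch, assuming a POSIX shell with git and mktemp available. It rebuilds the example in a throwaway directory (the user name and e-mail are placeholders, not part of the original setup) and asks git cat-file for the type of the commit that was reset away:

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so nothing real is touched.
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "hello world!" > helloworld.txt
git add helloworld.txt
git commit -q -m "added helloworld.txt"
first=$(git rev-parse HEAD)

echo "goodbye world!" >> helloworld.txt
git commit -q -am "added a second line to helloworld.txt"
second=$(git rev-parse HEAD)

git reset -q --hard "$first"

# The branch history no longer includes the second commit...
commits_on_branch=$(git rev-list --count HEAD)
# ...but its object is still stored in the object database.
second_type=$(git cat-file -t "$second")

echo "commits on branch: $commits_on_branch"
echo "object $second has type: $second_type"
```

If the object had really been deleted by the reset, git cat-file would fail here instead of reporting a type.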

First attempt with git prune

Using git prune, we will remove the unwanted commit from the repository's object database.

To do so, let's first check out the master (main) branch:

git checkout master

Git gives us a useful warning that we are leaving behind a commit that is not connected to any branch (a consequence of the earlier git reset).

Fortunately, this is what we want to do:

Warning: you are leaving 1 commit behind, not connected to
any of your branches:

  5c6b407 added a second line to helloworld.txt

If you want to keep it by creating a new branch, this may be a good time
to do so with:

 git branch <new-branch-name> 5c6b407

Switched to branch 'master'

Now let's use git prune, but with the --dry-run parameter so we can see what would happen, without actually changing anything:

git prune --dry-run

And the result?

Nothing!

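Even though the commit is no longer connected to any branch, Git is still holding on to references to it: the reflog entries recorded by the commit, reset, and checkout operations above. git prune treats reflog entries as references, so the commit still counts as reachable.

To make the commit prunable, we need to remove those entries from the reference log (reflog).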
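
One way to confirm that the reflog is what keeps the commit alive is git fsck, which can be told to ignore reflog entries with --no-reflogs. The sketch below assumes a POSIX shell with git and mktemp, and recreates the example in a temporary directory with a placeholder identity. It counts the objects reported as unreachable with and without the reflog taken into account:

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "hello world!" > helloworld.txt
git add helloworld.txt
git commit -q -m "added helloworld.txt"
first=$(git rev-parse HEAD)

echo "goodbye world!" >> helloworld.txt
git commit -q -am "added a second line to helloworld.txt"
git reset -q --hard "$first"

# By default reflog entries count as references.
# "|| true" keeps the script going when grep finds no matches.
with_reflog=$(git fsck --unreachable 2>/dev/null | grep -c '^unreachable' || true)

# --no-reflogs makes fsck ignore objects referenced only by the reflog.
without_reflog=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c '^unreachable' || true)

echo "unreachable (reflog considered): $with_reflog"
echo "unreachable (reflog ignored):    $without_reflog"
```

In this scenario the second count should cover the discarded commit together with its tree and blob, which are the objects git prune can only remove once the reflog entries are gone.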

A brief look at git reflog

Let's first have a look at what is inside the reflog, using git reflog:

git reflog
7cc707c (HEAD -> master) HEAD@{0}: checkout: moving from 5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa to master
5c6b407 HEAD@{1}: checkout: moving from master to 5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa
7cc707c (HEAD -> master) HEAD@{2}: reset: moving to 7cc707c7c9f2d88512d7788e76391ee2d048e426
5c6b407 HEAD@{3}: commit: added a second line to helloworld.txt
7cc707c (HEAD -> master) HEAD@{4}: commit (initial): added helloworld.txt

To remove these references, we can use the expire subcommand:

git reflog expire --expire=now --expire-unreachable=now --all

There are a few extra parameters added here:

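  • --expire=now expires every reflog entry older than now (that is, all of them), rather than only entries older than the default of 90 days
  • --expire-unreachable=now does the same for entries that are not reachable from the current tip of the branch, rather than only those older than the default of 30 days
  • --all processes the reflogs of all references, not just a single one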
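
Because expiring the reflog cannot be undone, it can be worth previewing the command first. git reflog expire accepts --dry-run for this. The sketch below, which assumes a POSIX shell with git and mktemp and uses a placeholder identity and a small helper function of our own (count_entries), counts the HEAD reflog entries before the dry run, after the dry run, and after the real expiry:

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "hello world!" > helloworld.txt
git add helloworld.txt
git commit -q -m "added helloworld.txt"
first=$(git rev-parse HEAD)

echo "goodbye world!" >> helloworld.txt
git commit -q -am "added a second line to helloworld.txt"
git reset -q --hard "$first"

# Helper (our own, not a git command): number of entries in the HEAD reflog.
count_entries() { git reflog | wc -l | tr -d ' '; }

before=$(count_entries)

# A dry run leaves the reflog untouched.
git reflog expire --dry-run --expire=now --expire-unreachable=now --all
after_dry_run=$(count_entries)

# The real run removes the entries.
git reflog expire --expire=now --expire-unreachable=now --all
after_expire=$(count_entries)

echo "entries: before=$before dry-run=$after_dry_run expired=$after_expire"
```

Adding --verbose to the dry run should also list the entries that would be removed.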

Once we've executed the above, we can look at the log again:

git reflog

And nothing comes back! Exactly what we want this time.

Second attempt with git prune

Now that we've updated the reference log, let's try a dry run of git prune again:

git prune --dry-run

This time, we can see the objects that will be pruned:

05a4d8c8dd9c718a5700916733591b4cbb32f26a blob
384696f5d61e411c50e9cc2eb950163508e04c72 tree
5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa commit

And so we can be assured that running git prune by itself will work this time:

git prune

To check, let's try to check out the second commit:

git checkout 5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa

This time, git cannot find the commit! It has been pruned successfully:

fatal: reference is not a tree: 5c6b4076ee5b66a2dc6db0cfa67a9c9fea519aaa
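
The whole procedure can be condensed into one script. This is a minimal sketch rather than a recommended workflow: it assumes a POSIX shell with git and mktemp, runs in a temporary directory with a placeholder identity, and uses git cat-file -e (which only tests whether an object exists) in place of a checkout to confirm the result:

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "hello world!" > helloworld.txt
git add helloworld.txt
git commit -q -m "added helloworld.txt"
first=$(git rev-parse HEAD)

echo "goodbye world!" >> helloworld.txt
git commit -q -am "added a second line to helloworld.txt"
second=$(git rev-parse HEAD)

# Drop the second commit from the branch, then from the reflog.
git reset -q --hard "$first"
git reflog expire --expire=now --expire-unreachable=now --all

# Count what prune would remove, then remove it.
would_prune=$(git prune --dry-run | wc -l | tr -d ' ')
git prune

# cat-file -e exits non-zero when the object does not exist.
if git cat-file -e "$second" 2>/dev/null; then
    still_present=yes
else
    still_present=no
fi

echo "objects pruned: $would_prune, second commit still present: $still_present"
```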