I thought I would blog about this, because I’ve tried and failed in the past, and it took a lot of tinkering to figure out. The problem is this: I have a git repository with a game in it, and a long time ago I thought it would be a good idea to add the game’s music to the repository. That turned out to be a bad decision, because it made the repository too large to easily transfer around, host on free sites, and so on. So, how do I get those pesky music files out of there, and once I do, how do I convince the repository to actually shrink? (The latter was the trickiest part.)
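So, first, I go into the git directory and prune out all the music. The filter has to be a --tree-filter, because it deletes files from a checked-out tree; an --index-filter never checks the tree out, so find would have nothing to match:
% cd mygame
% git filter-branch --tree-filter \
    'find . \( -name "*.mp3" -o -name "*.ogg" \) -print0 | xargs -0 rm -f'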
And wait a while as it does its thing. Now we would like to run gc and get rid of all the old objects. Unfortunately, filter-branch keeps a backup of the original refs under “refs/original/”, so the old objects are still reachable; gc will mark them and not clean them up, and I don’t know how to delete them.
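To see what is still holding on to the old objects, something like the sketch below may help. The two helper names (list_backup_refs, leftover_music) are made up for illustration; they only wrap git for-each-ref and git log, and take the repository path as their argument.

```shell
# Hypothetical helper: list the backup refs that filter-branch left behind.
list_backup_refs() {
    git -C "$1" for-each-ref --format='%(refname)' refs/original/
}

# Hypothetical helper: print every .mp3/.ogg path that some commit
# reachable from HEAD still touches. No output means the current
# branch's history no longer mentions any music files.
leftover_music() {
    git -C "$1" log --pretty=format: --name-only HEAD |
        grep -E '\.(mp3|ogg)$' |
        sort -u
}
```

Running list_backup_refs mygame right after the filter should print at least one ref under refs/original/, and leftover_music mygame should print nothing.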
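So here’s the hack. Clone the repository, which will not clone the original refs, and then gc that one. The --no-local flag matters: cloning from a plain local path just hardlinks the whole object store, old objects included, whereas --no-local transfers only what the cloned refs can reach.
% cd ..
% git clone --no-local mygame mygame2
% cd mygame2
% git gc --aggressive --prune=now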
And at this point, the mygame2 repository will have forgotten all about those music files, and is much smaller.
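A quick way to confirm the shrinkage is to compare pack sizes before and after. This is only a sketch: pack_size_kib is a hypothetical helper name, and it reads the size-pack field (in KiB) that git count-objects -v reports.

```shell
# Hypothetical helper: print the packed size of a repository in KiB,
# as reported by the "size-pack" line of `git count-objects -v`.
pack_size_kib() {
    git -C "$1" count-objects -v | awk '/^size-pack:/ { print $2 }'
}
```

For example, pack_size_kib mygame followed by pack_size_kib mygame2 should show the clone coming out well below the original, since the original still carries the music via its backup refs.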
WARNING: This rewrites history, so every rewritten commit gets a new hash, and pushing to or pulling from repositories that haven’t been filtered won’t work. After this process is done, everyone has to clone anew.
A less hacky way to do this would be appreciated, if anyone knows it.
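One candidate, going by the shrinking checklist in the git filter-branch documentation, is to do the cleanup in place: delete the backup refs, expire the reflogs, and then gc. The sketch below wraps those three steps in a function; the name shrink_in_place is made up for illustration, and this is an untested-on-my-game outline rather than a recommendation. It throws away the only backup of the old history, so try it on a copy first.

```shell
# Hypothetical helper: shrink a filtered repository without cloning it.
# Usage: shrink_in_place path/to/repo
shrink_in_place() {
    repo=$1

    # 1. Delete the backup refs that filter-branch stored under refs/original/.
    git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git -C "$repo" update-ref -d "$ref"
        done

    # 2. Expire every reflog entry, so the old commits are no longer
    #    reachable from the reflogs either.
    git -C "$repo" reflog expire --expire=now --all

    # 3. Repack and drop all unreachable objects immediately, instead of
    #    waiting for the default grace period.
    git -C "$repo" gc --quiet --prune=now
}
```

With the names used above, that would be shrink_in_place mygame, run after the filter-branch step. The warning about rewritten hashes applies just the same.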
UPDATE: Check out this article mentioned by @spuder in the comments for a better way.