I use git-svn to interoperate with our Subversion repository at Sourceforge. I did the original checkout via HTTP and kept working that way until Sourceforge's recent site-wide SVN upgrade (from version 1.3 to 1.5). After the upgrade, I could no longer commit to Subversion (that is, git svn dcommit failed).
I had two options; I could:
- reclone the SVN repository via HTTPS, or
- hack the git-svn metadata so that git-svn would access the SVN repo via HTTPS.
I was loath to take the first route given how long it takes to clone the entire history of an SVN repository, so I opted for the second choice.
The hack
To hack the git-svn metadata, we need to know that git-svn keeps its metadata in three locations:
- the directory .git/svn (in your git repository) which contains internal git-svn data; if you delete this directory, git-svn will automatically rebuild its contents,
- various refs such as refs/remotes/trunk and refs/remotes/trunk@1234,
- per-commit SVN identification strings, such as git-svn-id: http://translate.svn.sourceforge.net/svnroot/translate/src/trunk/Pootle@... 54714841-351b-0410-a198-e36a94b762f5.
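To make the shape of the per-commit identification string concrete, here is a minimal sketch that splits a git-svn-id line into its URL, revision, and repository UUID. The helper name parse_svn_id and the sample URL, revision, and UUID are made up for illustration; only the "git-svn-id: URL@REV UUID" layout comes from git-svn.

```shell
# Hypothetical helper: split a git-svn-id line into URL, revision, and UUID.
# Reads a commit message on stdin and prints one line per git-svn-id found.
parse_svn_id() {
  sed -n 's/^[[:space:]]*git-svn-id: \(.*\)@\([0-9][0-9]*\) \([0-9a-f-]*\)$/url=\1 rev=\2 uuid=\3/p'
}

# Sample commit message; the URL, revision, and UUID are invented.
printf '%s\n' \
  'Some change' \
  '' \
  'git-svn-id: http://example.invalid/svnroot/project/trunk@1234 00000000-0000-0000-0000-000000000000' \
  | parse_svn_id
```

In a real repository, something like git log -1 --format=%B | parse_svn_id should show the same three fields for the latest commit. Only the URL part is touched by the hack described here.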
To perform the hack, the following things must happen:
- the commit messages must all be rewritten so that URLs in the git-svn-id: strings that start with http are replaced with URLs starting with https,
- a changed commit message will lead to a changed SHA1 value for the commit, which means that all refs (such as refs/remotes/trunk) which pointed to the old commits have to be updated to point to the new commits,
- the git-svn data under .git/svn must be updated,
- the git-svn entries in .git/config must be updated.
Steps 1 and 2 are performed by the magical tool git-filter-branch, step 3 is performed by a simple rm -rf .git/svn, and step 4 is a manual edit of .git/config. With git-filter-branch you can rewrite a git repository in interesting ways, including:
- removing all commits made by a particular author,
- removing certain files from the repository,
- modifying commit messages.
Of interest is the last of these points. The git-filter-branch command line for modifying commit messages has the following form:
git filter-branch --msg-filter <text filter command> <refs to rewrite>
The flag --msg-filter tells git-filter-branch to enumerate commit messages; it passes every commit message to <text filter command> via stdin and takes the stdout of <text filter command> to be the new commit message. <refs to rewrite> is a list of git refs whose histories are to be modified.
To replace "http" with "https" in every string resembling "git-svn-id: http://translate.svn.sourceforge.net/svnroot/translate/src/trunk/Pootle@... 54714841-351b-0410-a198-e36a94b762f5" a sed command suffices:
sed "s/git-svn-id: http/git-svn-id: https/g"
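The filter is easy to try by hand before letting it loose on a whole history. The sketch below wraps it in a function (the name rewrite_svn_id is hypothetical) and feeds it a sample commit message. One deliberate difference from the command above: the pattern here also matches the colon after http, an addition that keeps a second run from turning https into httpss.

```shell
# Hypothetical helper: the message filter wrapped in a function for testing.
rewrite_svn_id() {
  # Matching "http:" (with the colon) leaves lines that already say https alone.
  sed 's/git-svn-id: http:/git-svn-id: https:/'
}

# Sample commit message. The subject line is invented; the revision number
# is elided here exactly as it is in the text.
printf '%s\n' \
  'Fix typo in README' \
  '' \
  'git-svn-id: http://translate.svn.sourceforge.net/svnroot/translate/src/trunk/Pootle@... 54714841-351b-0410-a198-e36a94b762f5' \
  | rewrite_svn_id
```

Only the git-svn-id line changes; the subject line and the blank line pass through untouched, which is what --msg-filter needs.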
I wanted to rewrite the histories of every reference in my git repository, so <refs to rewrite> had to contain all of the refs in my repository. Traditionally, git maintains ref files under .git/refs; each file is 41 bytes long (a SHA1 value followed by a newline). If you run git-gc, it creates a file called .git/packed-refs containing all of your current references and removes the files under .git/refs. Thus, a quick way to get hold of all your refs is to run git-gc and then extract the refs from .git/packed-refs. I used a combination of awk and grep to do the extraction:
cat .git/packed-refs | awk '// {print $2}' | grep -v 'pack-refs'
cat outputs the contents of .git/packed-refs to stdout; awk prints the second column of each line; grep removes the first line.
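For anyone who wants to see what the extraction yields without touching a real repository, here is a small sketch that runs a single-awk variant against a sample packed-refs file. The helper name list_packed_refs, the SHA1 values, and the ref names are all made up. This variant skips the header and the "^" lines that record peeled tags, instead of relying on grep.

```shell
# Hypothetical helper: print the ref names found in a packed-refs file.
list_packed_refs() {
  # Skip the "# pack-refs with: ..." header and "^<sha>" peeled-tag lines,
  # then print the second column, which is the ref name.
  awk '!/^[#^]/ { print $2 }' "$1"
}

# Build a small sample file; the SHA1 values and ref names are invented.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# pack-refs with: peeled
1111111111111111111111111111111111111111 refs/heads/master
2222222222222222222222222222222222222222 refs/remotes/trunk
3333333333333333333333333333333333333333 refs/tags/v1.0
^4444444444444444444444444444444444444444
EOF

list_packed_refs "$sample"
rm -f "$sample"
```

An alternative that may be simpler is git for-each-ref --format='%(refname)', which lists loose and packed refs alike and so does not depend on having run git-gc first.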
Putting this all together gives:
git filter-branch --msg-filter 'sed "s/git-svn-id: http/git-svn-id: https/g"' \
$(cat .git/packed-refs | awk '// {print $2}' | grep -v 'pack-refs')
Finally, to get git-svn to recreate its internal data, I simply executed rm -rf .git/svn. The next time git svn rebase is executed, git-svn will rebuild its internal data.
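The list of required changes also includes the git-svn entries in .git/config. As a rough sketch of that edit, the function below rewrites url lines in a config file from http to https and prints the result, leaving the file itself alone. The helper name https_svn_urls and the sample config layout are assumptions; your own svn-remote section may split the URL and fetch path differently.

```shell
# Hypothetical helper: print a config file with "url = http://..." lines
# switched to https. Note that it rewrites every matching url line, not only
# those under an [svn-remote] section, so check the output before using it.
https_svn_urls() {
  sed 's|^\([[:space:]]*url = \)http://|\1https://|' "$1"
}

# Sample config; the section layout is assumed for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[core]
    repositoryformatversion = 0
[svn-remote "svn"]
    url = http://translate.svn.sourceforge.net/svnroot/translate/src/trunk/Pootle
    fetch = :refs/remotes/trunk
EOF

https_svn_urls "$sample"
rm -f "$sample"
```

In the repository itself, git config svn-remote.svn.url followed by the new URL should achieve the same thing, assuming the remote carries git-svn's default name "svn".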
The summary
- Back up your git repo. If you make a mistake with repository rewriting you'll be in for a lot of fun.
- git gc
- git filter-branch --msg-filter 'sed "s/git-svn-id: http/git-svn-id: https/g"' $(cat .git/packed-refs | awk '// {print $2}' | grep -v 'pack-refs')
- rm -rf .git/svn
- edit .git/config and change "http" in all the git-svn URLs to "https"
- git svn rebase (to update your repo and to let the git-svn data be rebuilt)
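Once the steps are done, it may be worth checking that no commit message still carries a plain-http identification string. The sketch below counts such lines in whatever log text it is given; the helper name count_http_ids and the sample lines are made up.

```shell
# Hypothetical helper: count git-svn-id lines that still use plain http.
# Reads `git log` style output on stdin and prints the count.
count_http_ids() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c '^ *git-svn-id: http:' || true
}

# Usage inside the rewritten repository (not run here):
#   git log --all | count_http_ids

# Demonstration on two invented log lines, one rewritten and one stale.
printf '%s\n' \
  '    git-svn-id: https://example.invalid/repo@1 00000000-0000-0000-0000-000000000000' \
  '    git-svn-id: http://example.invalid/repo@2 00000000-0000-0000-0000-000000000000' \
  | count_http_ids
```

A count of 0 on the real log suggests every reachable commit was rewritten; anything higher points at a ref that was left out of the filter-branch run.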
