Tuesday, June 02, 2009

Preparing a 1.5 subversion repo to handle 1.6 clients

When subversion 1.6 was released, one of the changes included was the a set of changes related to enhancing the functionality of the 'svn:externals' property. As part of these changes the back-slash character became an 'escape character'.

Unfortunately if you have been using subversion with windows clients/users it is quite likely some of your developers will have used the 'backslash' to provide the 'local folder' part of the external definition (e.g you have a line such as: 'foo\bar http://my_svn/folder/trunk'.) Sadly subversion 1.6 clients no longer interpret those paths in the traditional way (it has always been recommended not to use the slash in this way, but it did previously work.) This combined with the fact that if you use TortoiseSVN or similar it is likely to have forced an automatic update recently which has somewhat forced our hand with moving to 1.6.

One could change all of the svn:externals properties on all folders at the HEAD of all their branches, and this would help with all *new* revisions, but if you have a requirement to be able to reliably produce retrospective builds (as in any good change-management environment one should be able to) then this will not do (if you were to checkout a revision of the repository prior to your fixes (say an old branch of your product you've shipped to the customer) then those externals would not work as your local svn client would be much newer than when you first built the code :(

To work around this I've created the following snippet of perl that uses the rather handy SVN::DumpFilter to search through a subversion dump file and replace any '\' characters found in svn:externals declarations with a '/' character.


use SVN::Dumpfilter;

$|=1;

sub my_filter (\%;$);
my $dumpfile = shift @ARGV; # filename or '-' for STDIN
my $outfile = shift @ARGV; # filename or '-' for STDOUT

Dumpfilter($dumpfile, $outfile, \&my_filter);
sub my_filter (\%;$)
{
my $href = shift;
my $recalc = shift || 0;
my $header = $href->{'header'};
my $prop = $href->{'properties'};
# Do something (modify, add, delete) with the current node given by the
# hash ref $href
# e.g.:
if( exists $header->{'Revision-number'}) {
print "Processing revision ".$header->{'Revision-number'}."\n";
}
if( exists $prop->{'svn:externals'} && $prop->{'svn:externals'} =~ m/\\/)
{
print "found external : ". $prop->{'svn:externals'}."\n";
$prop->{'svn:externals'} =~ s/\\/\//g;
print " -> converted external : ". $prop->{'svn:externals'}."\n";
$recalc= 1;
}

# The node content is accessible as scalar with ${$href->{content}}
# Can be in every possible text or binary format.

if ($recalc)
{
svn_recalc_prop_header(%$href); # call if you changed properties
}
}


On an 11Gb 20,000 revision dump file this can take up to 10 minutes to run to completion.

To use, either pipe a subversion dump file to it and pipe the output to a new file, or pass the filename of the dumpfile you wish to filter and the filename of the resulting file (or a crazy mix of both).

Please be careful, this script worked for me and I offer no guarantees and certainly take no responsibility if it causes damage to your repository, but it got me out of a sticky situation.

It should also be noted that as we're not adding any 'new' revisions to the repository, any existing working copies may be a little bit out of 'sync' with the real repository, so should ideally be re-fetched from clean. In our case we could delete the few folders that were affected and re-update so it wasn't too much of a big issue