Friday, February 10, 2012

How to Optimize SVN Mirror

My previous blog post explains how to set up an svn mirror using svnsync. Lately I realized how hard it is to keep the replication system running. Perhaps people have moved to proprietary replication systems like WANdisco because native Subversion lacks replication capability. It is a really challenging task to keep the svn mirrors up and running 24x7 with no complaints from the committers. Below are some of the problems I found with svnsync [1] due to its centralized architecture.


Master/slave model means a single point of failure - If the master is down no-one can write to the repository.
No performance improvement on writes (these are simply proxied to the master server)
No guarantee of transactional integrity
No topology intelligence
Can’t build against slave/mirror if process requires master
Manual intervention is normally required in the event of network outage and latency
No optimization of SVN traffic
Recovery time and recovery point objectives are greater than zero
Requires additional solution or approach for DR, business continuity, fault tolerance


One of the most common problems is a commit being cut short with a "The specified baseline is not the latest baseline, so it may not be" error, caused by a slow internet connection between the master and the slave.

Sometimes svnsync dies while its lock is still present on the slave, so the following commits return an error saying "Failed to get lock on destination repos, currently held by" etc.
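svnsync keeps its lock in the svn:sync-lock property on revision 0 of the destination repository, so a stale lock can be inspected and removed with plain svn commands. A minimal sketch, assuming the mirror URL from the sync script in this post; the function names are illustrative, not part of Subversion:

```shell
#!/bin/sh
# Hypothetical helpers for dealing with a stale svnsync lock.

# Print the current lock holder (empty output means no lock is set).
sync_lock_holder() {
    mirror_url=$1
    svn propget svn:sync-lock --revprop -r 0 "$mirror_url" 2>/dev/null
}

# Remove the lock. Only call this after confirming that no svnsync
# process is still running against the mirror.
clear_sync_lock() {
    mirror_url=$1
    svn propdel svn:sync-lock --revprop -r 0 "$mirror_url"
}

# Example usage (not executed here):
# holder=$(sync_lock_holder https://svnmirror.example.com/sync/proxy)
# [ -n "$holder" ] && echo "lock held by: $holder"
```

Because this changes a revision property, the slave's pre-revprop-change hook has to allow the user who runs it, so in this setup it would likely need to run as the svnsync user.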

If a commit ever gets inserted directly into the slave, you have to rebuild the repo.

The Apache Software Foundation itself has experienced issues with svn replication using "svnsync"; they are listed at [2].



Subversion 1.7 brings some improvements: in-memory caching and data compression (see [3]). You may need to build Subversion 1.7 manually.

How to enable the in-memory cache




# Enable a 1 GB Subversion data cache (size is given in kB) for both fulltexts and deltas.
SVNInMemoryCacheSize 1048576
SVNCacheTextDeltas On
SVNCacheFullTexts On



How to enable data compression. Level 9 denotes maximum compression; the default is 5.



SVNCompressionLevel 9



Enabling KeepAlive allows multiple requests to be sent over the same TCP connection. Keep the MaxKeepAliveRequests directive as high as possible.


KeepAlive On
MaxKeepAliveRequests 1000
KeepAliveTimeout 15


We found that most of the issues were caused by on-the-fly svn replication. When a user commits a file, it has to update the master and then run the replication with the slave to complete the commit. In most cases the commit fails due to replication issues.

Therefore it is good to remove svnsync from the "post-commit" hook and run it from a separate cron job at a regular interval.
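When svnsync runs from cron, a slow sync can still be running when the next interval fires. A minimal sketch of a guard against overlapping runs, assuming a hypothetical script path, lock directory and 5-minute interval; `run_exclusive` is an illustrative name:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command only if no other instance holds the lock.
# mkdir either creates the directory or fails, so it works as a simple mutex.
run_exclusive() {
    lock_dir=$1
    shift
    if mkdir "$lock_dir" 2>/dev/null; then
        status=0
        "$@" || status=$?
        rmdir "$lock_dir"
        return "$status"
    fi
    echo "previous run still active, skipping" >&2
    return 0
}

# Example crontab entry (interval and path are assumptions):
# */5 * * * * /usr/local/bin/svn-mirror-sync-wrapper.sh
# where the wrapper script ends with:
# run_exclusive /tmp/svn-mirror-sync.lock /usr/local/bin/svn-mirror-sync.sh
```

If the sync process is killed hard, the lock directory stays behind and has to be removed by hand, which is the price of keeping the guard this simple.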

The following shell script checks the master and slave revision numbers; if the slave is out of sync, it runs svnsync. If svnsync fails, it sends a mail. Instead of sending mail, you can use this to raise an alert in the Nagios monitoring system.



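#!/bin/bash

# Read the latest revision number from the master and from the mirror.
MAINREPO=$(svn info https://svn.example.com/repo/main | awk '/^Revision:/ {print $2}')
MIRRORREPO=$(svn info https://svnmirror.example.com/repo/main | awk '/^Revision:/ {print $2}')

# Run svnsync only when the mirror is out of sync with the master.
if [ "$MAINREPO" != "$MIRRORREPO" ]; then
    RESULT=$(/usr/bin/svnsync --sync-username svnsync --sync-password "password" sync https://svnmirror.example.com/sync/proxy)

    # svnsync reports "Committed revision N." for each revision it copies;
    # if nothing was committed, treat the sync as failed and alert the admin.
    if ! echo "$RESULT" | grep -q 'Committed revision'; then
        mail -s "SVN Mirror is not Syncing please check" admin@example.com < /dev/null
    fi
fi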
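
For the Nagios alternative, the same revision comparison can be turned into a plugin-style check. A minimal sketch, assuming the two revision numbers have already been read as in the script above; the function name and the default thresholds are illustrative assumptions, while the exit codes follow the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN):

```shell
#!/bin/sh
# Hypothetical Nagios-style check comparing master and mirror revisions.
check_mirror_lag() {
    main_rev=$1
    mirror_rev=$2
    warn=${3:-1}     # assumed default: WARNING when at least 1 revision behind
    crit=${4:-10}    # assumed default: CRITICAL when at least 10 revisions behind

    # Both revisions must be plain non-negative integers.
    for rev in "$main_rev" "$mirror_rev"; do
        case "$rev" in
            ''|*[!0-9]*) echo "UNKNOWN - could not read revision numbers"; return 3 ;;
        esac
    done

    lag=$((main_rev - mirror_rev))
    if [ "$lag" -lt 0 ]; then
        # The mirror has revisions the master lacks, e.g. an accidental commit.
        echo "CRITICAL - mirror is ahead of master"; return 2
    elif [ "$lag" -ge "$crit" ]; then
        echo "CRITICAL - mirror is $lag revisions behind"; return 2
    elif [ "$lag" -ge "$warn" ]; then
        echo "WARNING - mirror is $lag revisions behind"; return 1
    fi
    echo "OK - mirror is $lag revisions behind"
    return 0
}
```

A Nagios command would call it with the two revision numbers, for example `check_mirror_lag "$MAINREPO" "$MIRRORREPO"`, and exit with its return code.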




Make sure no accidental commits get inserted into the slave. The following "pre-revprop-change" hook script on the slave makes sure only the svnsync user does the sync (svnsync needs to change revision properties).



#!/bin/sh
# Subversion passes: REPOS ($1), REV ($2), USER ($3), PROPNAME ($4), ACTION ($5).
USER=$3

if [ "$USER" != "svnsync" ]; then
    echo >&2 "Only the svnsync user is allowed to change revprops"
    exit 1
fi

exit 0
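
The pre-revprop-change hook only guards revision property changes. To also reject ordinary commits from other users, a start-commit hook on the slave can apply the same check. A minimal sketch, assuming the sync user is named svnsync as above; Subversion passes the repository path as $1 and the committing user as $2, and `check_commit_user` is an illustrative name:

```shell
#!/bin/sh
# Sketch of the logic for a start-commit hook on the slave.
check_commit_user() {
    user=$1
    if [ "$user" = "svnsync" ]; then
        return 0
    fi
    echo "Only the svnsync user may commit to this mirror" >&2
    return 1
}

# In the real hooks/start-commit file, end with:
# check_commit_user "$2" || exit 1
```

With this in place, a commit sent to the slave by any other user is rejected before a transaction is created.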




The following Apache configuration makes sure the sync comes only from the master and only the svnsync user can commit to the slave.



DAV svn
SVNPath /path/to/svn/mirror
Order deny,allow
Deny from all
#IP address of the master
Allow from X.X.X.X
AuthType Basic
AuthName "SVN mirror Login"
AuthUserFile /path/to/svn/password/file
SSLRequireSSL
Require user svnsync




Further, if you want to do traffic load balancing based on location, you can try a GeoDNS service as described in my previous blog post.

I have committed all configurations to GitHub. The syntax highlighter does not show the configuration properly, so you can find it here: http://bit.ly/yL4jUh



[1] http://www.svnforum.org/entries/8-Is-svnsync-good-enough
[2] http://www.apache.org/dev/version-control.html#svnproblems
[3] http://subversion.apache.org/docs/release-notes/1.7.html
