Friday, February 10, 2012

How to Optimize SVN Mirror

My previous blog post explains how to set up an SVN mirror using svnsync. Lately I have realized how hard it is to keep the replication system running. Perhaps people have moved to proprietary replication systems like WANdisco because native Subversion lacks replication capability. It is a really challenging task to keep the SVN mirrors up and running 24x7 with no complaints from the committers. Below are some of the problems I found with svnsync [1] that stem from its centralized architecture.

Master/slave model means a single point of failure - If the master is down no-one can write to the repository.
No performance improvement on writes (these are simply proxied to the master server)
No guarantee of transactional integrity
No topology intelligence
Can’t build against slave/mirror if process requires master
Manual intervention is normally required in the event of network outage and latency
No optimization of SVN traffic
Recovery time and recovery point objectives are greater than zero
Requires additional solution or approach for DR, business continuity, fault tolerance

One of the most common problems is a commit being cut short with the error "The specified baseline is not the latest baseline, so it may not be", which happens when the internet connection between master and slave is slow.

Sometimes svnsync dies while its lock is still present on the slave, so the following commits return an error saying "Failed to get lock on destination repos, currently held by" etc.
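svnsync keeps its lock in the svn:sync-lock revision property on revision 0 of the mirror, so a stale lock can be inspected and removed by hand. The following is a minimal sketch, not part of the original setup: the function names and the SVN override variable are my own, and the mirror URL is whatever you pass in. Remove the lock only when you are sure no svnsync process is still running, and run it as the user the mirror's hooks allow to change revprops.

```shell
#!/bin/sh
# SVN can be overridden for testing; by default the real svn client is used.
SVN="${SVN:-svn}"

# show_sync_lock MIRROR_URL
# Prints the holder of the svnsync lock, if any.
show_sync_lock() {
    "$SVN" propget svn:sync-lock --revprop -r 0 "$1"
}

# clear_sync_lock MIRROR_URL
# Deletes a stale svnsync lock from revision 0 of the mirror.
clear_sync_lock() {
    "$SVN" propdel svn:sync-lock --revprop -r 0 "$1"
}
```

Call show_sync_lock with the mirror URL first; if it names a host and process that no longer exist, clear_sync_lock with the same URL releases the mirror.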

If a commit is ever made directly to the slave, you have to rebuild the mirror repository.

The Apache Software Foundation itself has experienced issues with SVN replication using "svnsync"; they are listed at [2].

Subversion 1.7 brings some improvements: in-memory caching and data compression. You may need to build Subversion 1.7 manually.

How to enable the in-memory cache

# Enable a 1 GB Subversion data cache for both fulltext and deltas (the value is in kB).
SVNInMemoryCacheSize 1048576
SVNCacheTextDeltas On
SVNCacheFullTexts On

How to enable data compression. Level 9 denotes maximum compression; the default is 5.

SVNCompressionLevel 9

Enabling KeepAlive allows multiple requests to be sent over the same TCP connection. Keep the MaxKeepAliveRequests directive as high as possible.

KeepAlive On
MaxKeepAliveRequests 1000
KeepAliveTimeout 15

We found that most of the issues were caused by on-the-fly SVN replication. When a user commits a file, the commit has to update the master and then run the replication to the slave before it completes. In most cases commits fail because of replication issues.

Therefore it is better to remove svnsync from the "post-commit" hook and run it from a separate cron job at a regular interval.

The following shell script checks the master and slave revision numbers and runs svnsync if the slave is out of sync. If svnsync fails, it sends a mail. Instead of sending mail, you can use this to raise an alert in a Nagios monitoring system.


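#!/bin/sh
# MASTER_URL, MIRROR_URL and ALERT_EMAIL are placeholders; set them for your setup.
MASTER_URL="URL-of-master-repository"
MIRROR_URL="URL-of-mirror-repository"
ALERT_EMAIL="address-to-alert"

MAINREPO=`svn info "$MASTER_URL" | grep '^Revision:' | awk '{print $2 }'`
MIRRORREPO=`svn info "$MIRROR_URL" | grep '^Revision:' | awk '{print $2 }'`

if [ "$MAINREPO" != "$MIRRORREPO" ]; then
    RESULT=`/usr/bin/svnsync --sync-username svnsync --sync-password "password" sync "$MIRROR_URL"`

    CHECK=`echo "$RESULT" | grep 'Committed revision'`
    if [ "$CHECK" = "" ]; then
        mail -s "SVN Mirror is not Syncing please check" "$ALERT_EMAIL" < /dev/null
    fi
fi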
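When the sync runs from cron, a slow run can still be in progress when the next one starts, and two svnsync processes would then compete for the mirror's lock. A small guard around the sync script avoids this. The following is only a sketch under assumptions: the lock directory path and the function name are hypothetical, and the command you wrap is your own sync script.

```shell
#!/bin/sh
# LOCKDIR is an assumed path; any directory name that cron can create will do.
LOCKDIR="${LOCKDIR:-/tmp/svn-mirror-sync.lock}"

# run_exclusive COMMAND [ARGS...]
# Runs COMMAND only if no earlier run still holds the lock directory.
# mkdir is atomic, so only one caller can create the directory.
run_exclusive() {
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "previous sync still running, skipping this run" >&2
        return 0
    fi
    "$@"
    status=$?
    rmdir "$LOCKDIR"
    return $status
}

# Example use at the end of a wrapper script called from cron:
#   run_exclusive /path/to/your/sync-script.sh
```

If a run is killed hard, the lock directory stays behind and has to be removed by hand, so it is worth alerting on repeated "skipping" messages.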


Make sure no accidental commits are made to the slave. The following "pre-revprop-change" hook script on the slave makes sure that only the svnsync user can do the sync.


#!/bin/sh
# Subversion passes the user name to the hook as the third argument.
USER="$3"

if [ "$USER" != "svnsync" ]; then
    echo >&2 "Only the svnsync user is allowed to change revprops"
    exit 1
fi

exit 0
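Subversion calls pre-revprop-change with five arguments (repository path, revision, user, property name, action), so the hook can be exercised by hand before relying on it. The helper below is a sketch of my own: the HOOK path and the try_user name are assumptions, and the repository path, revision and property passed to the hook are dummy values.

```shell
#!/bin/sh
# HOOK is an assumed location; point it at the hook in your mirror repository.
HOOK="${HOOK:-/path/to/svn/mirror/hooks/pre-revprop-change}"

# try_user USERNAME
# Calls the hook the way Subversion does (REPOS REV USER PROPNAME ACTION)
# and reports whether that user would be allowed to change revprops.
try_user() {
    if "$HOOK" /path/to/svn/mirror 0 "$1" svn:log M 2>/dev/null; then
        echo "$1: allowed"
    else
        echo "$1: denied"
    fi
}
```

Running try_user for svnsync and for some other account should report the first as allowed and the second as denied. The hook file must also be executable, otherwise Subversion cannot run it.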

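The following Apache configuration makes sure that sync requests come only from the master and that only the svnsync user can commit to the slave. The directives belong inside the <Location> block that serves the mirror.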

DAV svn
SVNPath /path/to/svn/mirror
Order deny,allow
Deny from all
#IP address of the master
Allow from X.X.X.X
AuthType Basic
AuthName "SVN mirror Login"
AuthUserFile /path/to/svn/password/file
Require user svnsync

Further, if you want to balance traffic based on location, you can try a GeoDNS service as described in my previous blog post.

I have committed all the configurations to GitHub. The syntax highlighter does not display the configuration properly, so you can find it here.


Before Configuring Mailman

DEFAULT_URL_HOST must first be set properly to your server host name before you create any mailing list. Otherwise welcome mails and mail footers will not have proper links to the mailing list interface. Changing the variable after creating the list has no effect, because the message templates do not change accordingly.

If you have changed the host name, the best approach is to back up the user list, delete the mailing list (without its archives), recreate the mailing list and re-import the user list, so that the new host name takes effect in the welcome messages.
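The backup, delete, recreate and re-import steps can be scripted with the Mailman 2 command-line tools (list_members, rmlist, newlist, add_members). This is a rough sketch rather than a tested procedure: MM_BIN is an assumed install location that differs between distributions, and recreate_list is a name I made up.

```shell
#!/bin/sh
# MM_BIN is an assumed location of the Mailman 2 tools; adjust it for your system.
MM_BIN="${MM_BIN:-/usr/lib/mailman/bin}"

# recreate_list LISTNAME BACKUP_FILE
recreate_list() {
    list=$1
    backup=$2
    # 1. Save the current subscriber addresses, one per line.
    "$MM_BIN/list_members" -o "$backup" "$list" || return 1
    # 2. Remove the list; without -a the archives are left in place.
    "$MM_BIN/rmlist" "$list" || return 1
    # 3. Create the list again (newlist prompts for owner address and password).
    "$MM_BIN/newlist" "$list" || return 1
    # 4. Re-subscribe the saved addresses as regular members.
    "$MM_BIN/add_members" -r "$backup" "$list"
}
```

Note that this re-adds everyone as a regular member, so digest subscriptions, per-user options and the list's own settings are not carried over and need to be checked afterwards.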

You need to create the MAILMAN_SITE_LIST mailing list before starting Mailman, and make sure to add the sysadmins' mailing list to it.

You need to create an Apache vhost to host the archives and images.

ScriptAlias /mailman/ /usr/lib/cgi-bin/mailman/
ScriptAlias /cgi-bin/mailman/ /usr/lib/cgi-bin/mailman/

Alias /images/ /usr/share/images/

   AllowOverride None
   Options ExecCGI
   Order allow,deny
   Allow from all

   AllowOverride None
   Options ExecCGI
   Order allow,deny
   Allow from all

Alias /pipermail/ /var/lib/mailman/archives/public/

   Options Indexes MultiViews FollowSymLinks
   AllowOverride None
   Order allow,deny
   Allow from all

Make sure to increase the attachment size limit to at least 5M.
Set “Prefix for subject line of list postings”.
Enable moderator approval for the MAILMAN_SITE_LIST list and set the steps required for subscription to “Require approval”.
Disable archiving for MAILMAN_SITE_LIST.
If you are delivering through Postfix, the “postalias” command will help generate the hashed alias database for the mailing lists.

Tuesday, February 7, 2012

How to install PgeoDNS [ GEODNS ]

Since the early days of the internet there has been the problem of how to reach the closest server from any area. One could point to "anycast" as a solution. In most IPv4 implementations, anycast is done by advertising the same BGP prefixes from different locations around the world, which are stored in the global BGP table with different metrics. When a packet is sent to such a destination, it is routed to the closest location according to the BGP metric. But this method is not easy to implement because you need to have your own public IP range.

Another solution is DNS load balancing, where you configure two "A" records for the same domain name, but this does not guarantee that a client will always get the closest server IP from DNS. To overcome these issues, geographically aware DNS servers have been introduced. These DNS servers resolve your domain and give the user the IP address of the closest server. Basically, such a server looks at the source IP address of the DNS query and answers by matching it against its internal database. MaxMind [1] is one of these country IP database providers. PGeoDNS uses the pgeoIP Perl module for this purpose.


The following is how to install PGeoDNS.

You have to download the following Perl libraries from CPAN. Note: the following versions worked for me.


and also PgeoDNS


Now each Perl module, including PgeoDNS, needs to be installed one by one as follows.

perl Makefile.PL # will warn if any dependencies are missing
make test # optional
make install

You have to add a new user to run PgeoDNS. Add the new user as follows.

adduser pgeodns

The zone configuration needs to be written in JSON notation. Sample config files can be downloaded from the Apache infra site and will give you an idea of how the configuration should look.

You can start the service with the following command as root.

pgeodns --config=pgeodns.conf --interface= --user=pgeodns  --verbose

Check the DNS queries as follows.

dig a @