Mar 3

Solr 1.4 Upgrade P2 – Out with Rsync CollectionDistribution in with JavaReplication

Category: Development, Linux

Solr 1.4 made replication from master to slave servers a whole lot easier. Before Solr 1.4 we were using rsync via the snapshooter and snappuller scripts (the CollectionDistribution approach). This method worked OK, but intermittently we would see the snapshooter or snappuller fail for various Java reasons (usually memory).

Please see the Solr Java-Based Replication documentation for a setup overview. I will cover the specific modifications I had to make compared to what is in that documentation. In solrconfig.xml on my master server, I have the following:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
      <lst name="master">
          <!--Replicate on 'startup' and 'commit'. 'optimize' is also a valid value for replicateAfter. -->
          <str name="replicateAfter">startup</str>
          <str name="replicateAfter">commit</str>
      </lst>
</requestHandler>

Then on the slave servers, I have:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
      <lst name="slave">
          <!--Fully qualified URL for the replication handler of the master. It is also possible to pass this as a request parameter for the fetchindex command. -->
          <str name="masterUrl">http://{solr_host}:{solr_port}/solr/${}/replication</str>
          <!--Interval at which the slave should poll the master. Format is HH:mm:ss. If this is absent, the slave does not poll automatically,
           but a fetchindex can still be triggered from the admin page or the HTTP API. -->
          <str name="pollInterval">00:00:30</str>
          <!--Read timeout in milliseconds used when the slave downloads index files from the master. -->
          <str name="httpReadTimeout">10000</str>
      </lst>
</requestHandler>

Replace {solr_host} and {solr_port} with your own settings. IMPORTANT: note the ${} variable. It makes the slaves poll the correct MultiCore path on the master server.
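To make the MultiCore point concrete, here is a minimal sketch of how the pieces might line up. It assumes a hypothetical setup with two cores named core0 and core1, and it assumes the variable in masterUrl is Solr's implicit solr.core.name property; the variable name in the snippet above did not survive intact, so treat that as a guess, not a confirmed setting. The host master.example.com and port 8983 are placeholders too.

```xml
<!-- solr.xml on the master and on each slave (hypothetical core names) -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>

<!-- solrconfig.xml used by every slave core -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Assumption: solr.core.name expands to the name of the core loading this file -->
    <str name="masterUrl">http://master.example.com:8983/solr/${solr.core.name}/replication</str>
    <str name="pollInterval">00:00:30</str>
  </lst>
</requestHandler>
```

Under those assumptions, core0 on a slave would poll http://master.example.com:8983/solr/core0/replication and core1 would poll the core1 path, so one solrconfig.xml can be shared across cores without hard-coding a core name.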


3 Comments so far

  1. Hoss March 7th, 2010 11:46 pm

    using “{slor_home}” for that variable name (besides being a typo for “{solr-home}”) is extremely misleading … “Solr Home” is a very specific concept in Solr, relating to where on local disk the config files (and by default, the data directory) can be found.

    What’s needed in the URL is just the base path of the Solr application on the server. In most cases this should just be “solr” (assuming people use “solr.war” and standard webapp path mapping conventions).

  2. Shep March 23rd, 2010 3:04 pm

    Thanks for the clarification Hoss. I was not using solr_home as a variable, rather just a placeholder. The only variable is ${}. Maybe webapp_path is a more appropriate name.

  3. Shep March 23rd, 2010 3:13 pm

    OK, updated to just use solr in the path name. Might as well make a similar assumption as the solr documentation does and reduce confusion.
