Solr 3.x DataImportHandler and symlinking config files

Ever since I started using Solr (I think it was Solr 1.3), it has been possible to build indexes using the DataImportHandler. In my case I often connect Solr to MySQL to fetch the data that needs to be indexed.

Since the 3.x release I’ve seen the number of questions about this feature explode on sites like Stack Overflow. At first I didn’t know why, but since I started migrating some older Solr installations, I do! Configuring the DataImportHandler can be a hassle if you don’t know exactly what you are doing…

The 3.x release

Since the 3.x release of Solr, some features that were part of the core in the 1.3 and 1.4 releases have been moved out of the core into plugins. The DataImportHandler is one of them. As such, its jars can be found in the dist folder that ships with the 3.x distributions.

When you run the standard examples that come with the Solr distribution, the DataImportHandler isn’t enabled. Plugins are loaded through the “lib” directive in the solrconfig.xml of each index. The documentation in that file makes the following important point about the attributes of the lib directive:

All directories and paths are resolved relative to the instanceDir.

Okay! I need to keep in mind how I reference the dist folder using ‘../’ path declarations. Because I’m using a multicore installation, the relative path in solrconfig.xml needs one extra folder level.
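
To make that concrete, here is a rough sketch of what such a configuration could look like. The folder layout is an assumption on my part: a core named core0 living in example/multicore/core0 of the unpacked distribution, so the dist folder sits three levels up from the instanceDir. The jar name pattern and the data-config.xml file name are also just the ones I would expect, so check them against your own dist folder and conf directory.

```xml
<!-- solrconfig.xml of a single core (sketch, other settings omitted).
     Assumed layout: the instanceDir is example/multicore/core0,
     so dist is three levels up. Adjust the number of ../ steps to your setup. -->
<config>
  <!-- Load the DataImportHandler jars from the dist folder of the distribution -->
  <lib dir="../../../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />

  <!-- Register the handler; data-config.xml is an assumed file name in the core's conf folder -->
  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>
</config>
```

If the number of ‘../’ steps is off by one, the lib directive won’t pick up any jars, and you only find out when the handler class fails to load.
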
So I updated my config files and fired up Solr using the java -jar start.jar command. Unfortunately, I got the following error again:

Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'

Why???

It took me a while, but I found that my way of symlinking files was the issue. For ease of configuration I symlink my own Solr core configuration into place, after backing up the core configuration that comes with the distribution. That way I can keep my Solr config files in a version control system.
Unfortunately, when Solr starts it follows the symlink, which results in a completely different relative path. The lib directives in solrconfig.xml are rendered useless.

Solution

My solution for the problem is to use the sharedLib attribute in the solr.xml configuration. This attribute lets you define a folder whose jars are loaded and shared by all cores. If you place or link the jars you want to load in this folder, you are good to go!
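
As a minimal sketch, a solr.xml for a multicore setup could look like the one below. The folder name lib and the core names core0 and core1 are assumptions for illustration; as far as I know, a relative sharedLib path is resolved against the Solr home directory (the folder that contains solr.xml), not against the instanceDir of a core.

```xml
<!-- solr.xml in the Solr home directory (sketch).
     sharedLib="lib" is an assumed folder name; put or symlink the
     DataImportHandler jars from dist into this folder. -->
<solr persistent="false" sharedLib="lib">
  <cores adminPath="/admin/cores">
    <!-- Hypothetical core names and instanceDirs -->
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```

Because the jars are now found through solr.xml, it no longer matters where the symlinked solrconfig.xml of each core really lives.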
