Thursday, August 20, 2009

Searching for synonyms

Solr supports searching by synonyms. This means that if the word "computer" is in your index, you can also find it by searching for its synonyms, such as CPU or PC. To achieve this, do the following.
(This example is for the Solr that is distributed with ColdFusion.)

Under your collection's directory (let's say C:\ColdFusion\Collections\solrcollection) there is a conf directory, which contains a synonyms.txt file. Shut down Solr and open the file.

Add your list of synonyms to it, one comma-separated group per line.
Ex:
computer,PC,CPU,calculator
Save the file, exit, and restart the Solr service or process.
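
To make the idea concrete, here is a rough Python sketch of what a comma-separated line in synonyms.txt means: every term on the line is treated as equivalent to the others. This only illustrates the concept and is not how Solr implements it. The function names parse_synonyms and expand are made up for this example, and lines that use the explicit "=>" mapping syntax are simply skipped.

```python
def parse_synonyms(text):
    """Map each lowercased term to the full set of terms on its line."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '#' comments, and explicit 'a => b' mappings,
        # which this sketch does not try to model.
        if not line or line.startswith("#") or "=>" in line:
            continue
        terms = {t.strip().lower() for t in line.split(",") if t.strip()}
        for term in terms:
            groups.setdefault(term, set()).update(terms)
    return groups


def expand(term, groups):
    """Return the term together with its synonyms, sorted for stable output."""
    key = term.lower()
    return sorted(groups.get(key, {key}))


SAMPLE = "# demo list\ncomputer,PC,CPU,calculator\n"
groups = parse_synonyms(SAMPLE)
print(expand("PC", groups))  # ['calculator', 'computer', 'cpu', 'pc']
```

A term that is not in the list just expands to itself, so expand("book", groups) gives ['book'].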

There are many synonym lists available online. One of them can be found here:

http://www.gutenberg.org/etext/3202

Loading the complete list is not a good idea because you are likely to run out of memory. Add only what is relevant and keep the list small. Convert entries to lowercase wherever applicable and remove spaces.
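
The trimming advice above could be scripted along these lines. This is only a sketch: trim_synonyms and the relevant_terms argument are names invented for this example, and the input is assumed to already be in the comma-separated format that synonyms.txt uses (a downloaded thesaurus will probably need converting first).

```python
def trim_synonyms(lines, relevant_terms):
    """Keep only groups that mention a relevant term, lowercased and space-free."""
    relevant = {t.lower() for t in relevant_terms}
    kept = []
    for line in lines:
        terms = []
        for raw in line.split(","):
            # Lowercase each entry and remove spaces, as suggested above.
            term = raw.strip().lower().replace(" ", "")
            if term and term not in terms:
                terms.append(term)
        # A group of one is useless; a group with no relevant term is noise.
        if len(terms) > 1 and relevant.intersection(terms):
            kept.append(",".join(terms))
    return kept


source = ["Computer, PC , CPU", "happy,glad,joyful", "Lap Top,notebook"]
trimmed = trim_synonyms(source, ["computer", "laptop"])
print(trimmed)  # ['computer,pc,cpu', 'laptop,notebook']
```

The resulting lines can then be written to synonyms.txt, one group per line.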

To make the synonyms apply to all collections, do the following:
Go to the Solr installation directory and browse to the multicore\template\conf directory.
Ex: C:\ColdFusion9\solr\multicore\template\conf

Update the synonyms.txt there. This must be done before you create a collection.
Synonyms are mostly specific to your application. If you are indexing books, say, you might want synonyms like rowling,potter,hogwarts (something similar to labels).
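
If you maintain your synonym list in a separate file, updating the template can be scripted. The sketch below is a suggestion rather than anything ColdFusion provides: install_synonyms is a hypothetical helper, and the demo runs in a temporary directory that stands in for the template conf folder, so it touches nothing in a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def install_synonyms(source_file, conf_dir):
    """Copy source_file over conf_dir/synonyms.txt, keeping a .bak of the old file."""
    target = Path(conf_dir) / "synonyms.txt"
    if target.exists():
        shutil.copyfile(target, target.with_name("synonyms.txt.bak"))
    shutil.copyfile(source_file, target)
    return target


# Demo in a throwaway directory standing in for the template conf folder.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "conf"
    conf.mkdir()
    (conf / "synonyms.txt").write_text("old,list\n")
    mine = Path(tmp) / "my_synonyms.txt"
    mine.write_text("rowling,potter,hogwarts\n")

    target = install_synonyms(mine, conf)
    installed_text = target.read_text()
    backup_text = (conf / "synonyms.txt.bak").read_text()

print(installed_text.strip())
```

On a real machine you would pass the template path from above (for example C:\ColdFusion9\solr\multicore\template\conf) as conf_dir, with Solr shut down first.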
