Spell Check Configuration in Solr

Spell checking is an essential feature for any application that needs to correct misspelt search terms.
In Solr this is done by first declaring a spell check component in the solrconfig.xml file.

Below is the configuration of the spell check component:

<searchComponent name="keyspellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">textSpell</str>

  <!-- Multiple "Spell Checkers" can be declared and used by this
       component
  -->

  <!-- a spellchecker built from a field of the main index, and
       written to disk
  -->
  <!--   <lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">keyword</str>
  <str name="spellcheckIndexDir">spellchecker</str> -->
  <!-- uncomment this to require terms to occur in 1% of the documents in order to be included in the dictionary
  <float name="thresholdTokenFrequency">.01</float>
  -->
  <!-- </lst> -->
  <lst name="spellchecker">
    <!--
      Optional, it is required when more than one spellchecker is configured.
      Select non-default name with spellcheck.dictionary in request handler.
    -->
    <str name="name">default</str>
    <!-- The classname is optional, defaults to IndexBasedSpellChecker -->
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <!--
      Load tokens from the following field for spell checking,
      the analyzer for the field's type as defined in schema.xml is used
    -->
    <str name="field">keyword</str>
    <!-- Optional, by default use in-memory index (RAMDirectory) -->
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- Set the accuracy (float) to be used for the suggestions. Default is 0.5 -->
    <str name="accuracy">0.4</str>
    <!-- Require terms to occur in 1/100th of 1% of documents in order to be included in the dictionary -->
    <!--<float name="thresholdTokenFrequency">.0001</float> -->
  </lst>
  <!-- Example of using a different distance measure -->
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <!-- Use a different Distance Measure -->
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <!-- Each spellchecker needs its own index directory -->
    <str name="spellcheckIndexDir">./spellchecker2</str>
  </lst>
</searchComponent>

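Here each field named in a spellchecker (keyword and lowerfilt above), as well as the textSpell field type named in queryAnalyzerFieldType, must be defined in the schema file with the necessary analyzers and tokenizers.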
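
As an illustration of the schema side, here is a minimal sketch of what schema.xml could contain. The analyzer chain (standard tokenizer, lowercasing, duplicate removal) is an assumption modelled on the usual Solr example schema, not something taken from this setup, so adjust it to your data. Only the names textSpell and keyword come from the configuration above.

```xml
<!-- Goes inside <types> in schema.xml: field type referenced by queryAnalyzerFieldType.
     The analyzer chain below is an assumed example, not a required one. -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Goes inside <fields> in schema.xml: the field the "default" spellchecker reads from.
     It has to be indexed, because the dictionary is built from indexed terms. -->
<field name="keyword" type="textSpell" indexed="true" stored="true"/>
```

Avoid stemming filters in this field type, otherwise the suggestions returned would be stems rather than real words.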

Then the spell check component has to be attached to the request handler "search" so that its suggestions appear in the Solr response:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters can be specified, these
       will be overridden by parameters in the request
  -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="spellcheck">true</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">3</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <!-- run the spell check component declared above after the standard components -->
  <arr name="last-components">
    <str>keyspellcheck</str>
  </arr>
</requestHandler>
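
The comments in the component note that a non-default spellchecker is selected with spellcheck.dictionary. A minimal sketch of how that might look, assuming a second handler with the hypothetical name "/spell-jw" (the handler name is made up for illustration; jarowinkler and keyspellcheck are the names declared above):

```xml
<!-- Hypothetical extra handler that uses the "jarowinkler" spellchecker -->
<requestHandler name="/spell-jw" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- pick the spellchecker by its <str name="name"> value -->
    <str name="spellcheck.dictionary">jarowinkler</str>
    <str name="spellcheck.count">3</str>
  </lst>
  <arr name="last-components">
    <str>keyspellcheck</str>
  </arr>
</requestHandler>
```

The same parameter can also be passed on an individual request instead of being fixed in the handler defaults.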

 

Then the data must be reindexed, after which suggestions for a misspelt word are returned in the spellcheck section of the search response.
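
One point worth checking if no suggestions come back after reindexing: an index-based spellchecker keeps its own dictionary index, which has to be built. It can be built by sending spellcheck.build=true once on a search request, or, as sketched below, by letting Solr rebuild it on every commit. The buildOnCommit setting is an optional addition to the configuration above, shown here as a suggestion rather than as part of the original setup.

```xml
<lst name="spellchecker">
  <str name="name">default</str>
  <str name="classname">solr.IndexBasedSpellChecker</str>
  <str name="field">keyword</str>
  <str name="spellcheckIndexDir">./spellchecker</str>
  <str name="accuracy">0.4</str>
  <!-- optional: rebuild the spell check dictionary after each commit -->
  <str name="buildOnCommit">true</str>
</lst>
```

Rebuilding on every commit adds work to each commit, so on a large or frequently updated index an explicit spellcheck.build=true request after a bulk reindex may be the better choice.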

 

JKB

