Textractor API textractor-720 (20091120123250)

textractor.didyoumean
Class DidYouMean

java.lang.Object
  extended by textractor.didyoumean.DidYouMean
All Implemented Interfaces:
DidYouMeanI

public class DidYouMean
extends Object
implements DidYouMeanI

Suggests alternative search terms, in the style of the 'Did You Mean' feature on Google.com.


Constructor Summary
DidYouMean(DocumentIndexManager documentManager)
          Initialize the DidYouMean engine.
 
Method Summary
 String getDidYouMeanBasename()
           
 void setDidYouMeanBasename(String basename)
           
 List<ScoredResult> suggest(String term, boolean orderWithVignaScore, float cutoff)
          Returns "Did you mean" suggestions based on a search term.
 List<ScoredResult> suggest(String term, float cutoff)
          Returns "Did you mean" suggestions based on a search term.
 List<ScoredResult> suggestPaiceHusk(String term, float cutoff)
          Suggest terms that belong to the same stemmed class as the query term.
 List<ScoredResult> suggestRelated(String term, float cutoff)
          Suggests terms related to this term.
 List<ScoredResult> suggestRelated2(String term, float cutoff)
          Deprecated. This method will be removed in the near future. Use suggestRelated for biomedical corpora.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DidYouMean

public DidYouMean(DocumentIndexManager documentManager)
           throws IOException,
                  NoSuchMethodException,
                  IllegalAccessException,
                  ConfigurationException,
                  InvocationTargetException,
                  InstantiationException,
                  ClassNotFoundException,
                  URISyntaxException
Initialize the DidYouMean engine.

Parameters:
documentManager - DocumentIndexManager for the main index (the engine will find the associated DidYouMean index)
Throws:
IOException - If an error occurred while reading the DidYouMean index data.
NoSuchMethodException
IllegalAccessException
ConfigurationException
InvocationTargetException
InstantiationException
ClassNotFoundException
URISyntaxException
Method Detail

suggest

public List<ScoredResult> suggest(String term,
                                  float cutoff)
                           throws IOException,
                                  ConfigurationException,
                                  ParseException,
                                  ClassNotFoundException,
                                  QueryParserException,
                                  QueryBuilderVisitorException
Returns "Did you mean" suggestions based on a search term.

Specified by:
suggest in interface DidYouMeanI
Parameters:
term - the search term
Returns:
an ArrayList of suggestions, or null if no suggestions are found
Throws:
IOException
ConfigurationException
ParseException
ClassNotFoundException
QueryParserException
QueryBuilderVisitorException
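
Because suggest is documented to return null rather than an empty list when nothing is found, callers may want to normalize the result before iterating. The following is a minimal sketch of that idea, not part of the Textractor API: the class SuggestionLists and the helper orEmpty are hypothetical, and plain strings stand in for ScoredResult.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; it is not part of the Textractor API.
class SuggestionLists {
    // Maps the documented "null means no suggestions" convention to an empty list.
    static <T> List<T> orEmpty(List<T> suggestions) {
        return suggestions == null ? Collections.<T>emptyList() : suggestions;
    }

    public static void main(String[] args) {
        // Stand-ins for values that suggest(term, cutoff) might return.
        List<String> none = null;
        List<String> some = Arrays.asList("protein", "proteins");

        // Both loops are safe; the first simply runs zero times.
        for (String s : orEmpty(none)) {
            System.out.println(s);
        }
        for (String s : orEmpty(some)) {
            System.out.println(s);
        }
    }
}
```

With a wrapper of this kind, calling code can use an ordinary for-each loop without a separate null check.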

suggest

public List<ScoredResult> suggest(String term,
                                  boolean orderWithVignaScore,
                                  float cutoff)
                           throws IOException,
                                  ParseException,
                                  QueryParserException,
                                  QueryBuilderVisitorException
Returns "Did you mean" suggestions based on a search term.

Specified by:
suggest in interface DidYouMeanI
Parameters:
term - the search term
orderWithVignaScore - If true, suggestions are ordered first by Vigna score and then by similarity to the search term
Returns:
a List of suggestions, or null if no suggestions are found
Throws:
IOException
ParseException
QueryParserException
QueryBuilderVisitorException
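
The two-key ordering described for orderWithVignaScore can be illustrated with a standard comparator chain. This is only a sketch under stated assumptions: the Candidate class and its vignaScore and similarity fields are hypothetical stand-ins (the actual fields of ScoredResult are not documented here), and it assumes that higher values rank first for both keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a scored suggestion; not Textractor's ScoredResult.
class Candidate {
    final String term;
    final double vignaScore;
    final double similarity;

    Candidate(String term, double vignaScore, double similarity) {
        this.term = term;
        this.vignaScore = vignaScore;
        this.similarity = similarity;
    }
}

class TwoKeyOrdering {
    // Assumption: higher scores rank first for both keys.
    static List<String> order(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<Candidate>(candidates);
        sorted.sort(Comparator
                .comparingDouble((Candidate c) -> c.vignaScore).reversed()
                .thenComparing(Comparator
                        .comparingDouble((Candidate c) -> c.similarity).reversed()));
        List<String> terms = new ArrayList<String>();
        for (Candidate c : sorted) {
            terms.add(c.term);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("protean", 1.0, 0.99),
                new Candidate("protein", 2.0, 0.90),
                new Candidate("proteins", 2.0, 0.95));
        // prints [proteins, protein, protean]
        System.out.println(order(candidates));
    }
}
```

Here "protean" has the highest similarity but the lowest primary score, so it still sorts last; similarity only breaks the tie between the two candidates that share a primary score.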

suggestRelated2

@Deprecated
public List<ScoredResult> suggestRelated2(String term,
                                          float cutoff)
                                   throws ConfigurationException,
                                          IOException,
                                          ParseException,
                                          ClassNotFoundException,
                                          QueryParserException,
                                          QueryBuilderVisitorException
Deprecated. This method will be removed in the near future. Use suggestRelated for biomedical corpora.

Suggests terms related to this term. Terms suggested by this method have the same stem as the input term. The stemming model is tuned for Medline. Terms whose prefix/suffix probability under the model falls below the cutoff are not returned. Try cutoff values between 1E-4 and 1E-3.

Parameters:
term - Term suggestions are sought for.
cutoff - Probability cutoff. Try 1E-3 or 1E-4.
Returns:
Terms syntactically similar to the query term that share its stem.
Throws:
ConfigurationException
IOException
ParseException
ClassNotFoundException
QueryParserException
QueryBuilderVisitorException

suggestRelated

public List<ScoredResult> suggestRelated(String term,
                                         float cutoff)
                                  throws ConfigurationException,
                                         IOException,
                                         ParseException,
                                         ClassNotFoundException,
                                         QueryParserException,
                                         QueryBuilderVisitorException
Suggests terms related to this term. Terms suggested by this method are morphologically related to the input term, and each suggestion carries a score indicating how similar it is to the input term. The stemming method is tuned for Medline. Terms whose prefix/suffix probability falls below the cutoff are not returned. On the Medline corpus, cutoff values around 1E-7 seem to work reasonably well; use a higher cutoff if more stringency is required.

Specified by:
suggestRelated in interface DidYouMeanI
Parameters:
term - Term suggestions are sought for.
cutoff - Probability cutoff. Try 1E-7 for the Medline corpus.
Returns:
Terms syntactically similar to the query term that share its stem.
Throws:
ConfigurationException
IOException
ParseException
ClassNotFoundException
QueryParserException
QueryBuilderVisitorException
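
The effect of the cutoff parameter can be pictured as a simple probability threshold. The sketch below is illustrative only: the CutoffFilter class is hypothetical, the probabilities are made-up values rather than output of the Medline-tuned model, and it assumes that a term is kept when its probability is at least the cutoff.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a probability cutoff; not Textractor code.
class CutoffFilter {
    // Keeps terms whose probability is at least the cutoff, in map iteration order.
    static List<String> keepAtOrAbove(Map<String, Float> probabilities, float cutoff) {
        List<String> kept = new ArrayList<String>();
        for (Map.Entry<String, Float> entry : probabilities.entrySet()) {
            if (entry.getValue() >= cutoff) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up probabilities for illustration only.
        Map<String, Float> probabilities = new LinkedHashMap<String, Float>();
        probabilities.put("phosphorylated", 1e-3f);
        probabilities.put("phosphorylation", 1e-6f);
        probabilities.put("phosphate", 1e-9f);

        // A permissive cutoff keeps more candidates than a stringent one.
        System.out.println(keepAtOrAbove(probabilities, 1e-7f));
        System.out.println(keepAtOrAbove(probabilities, 1e-4f));
    }
}
```

Raising the cutoff from 1E-7 to 1E-4 in this toy data drops the middle candidate, which mirrors the advice to use a higher cutoff when more stringency is required.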

suggestPaiceHusk

public List<ScoredResult> suggestPaiceHusk(String term,
                                           float cutoff)
                                    throws ConfigurationException,
                                           IOException,
                                           ParseException,
                                           ClassNotFoundException,
                                           QueryParserException,
                                           QueryBuilderVisitorException
Suggests terms that belong to the same stem class as the query term. The Paice/Husk stemmer is used to define the stem class.

Specified by:
suggestPaiceHusk in interface DidYouMeanI
Parameters:
term -
cutoff -
Returns:
Throws:
ConfigurationException
IOException
ParseException
ClassNotFoundException
QueryParserException
QueryBuilderVisitorException
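
The notion of a stem class can be sketched as follows: two terms belong to the same class when a stemmer maps them to the same stem. The example below deliberately uses a toy suffix-stripping function, not the Paice/Husk algorithm, and the StemClasses class, toyStem, and sameStemClass are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of stem classes; not Textractor code.
class StemClasses {
    // Toy stemmer: strips one common English suffix. This is NOT Paice/Husk.
    static String toyStem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        String[] suffixes = {"ing", "ed", "es", "s"};
        for (String suffix : suffixes) {
            // Keep at least three characters of stem after stripping.
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Returns vocabulary terms, other than the query itself, that share the query's stem.
    static List<String> sameStemClass(String query, List<String> vocabulary) {
        String stem = toyStem(query);
        List<String> related = new ArrayList<String>();
        for (String candidate : vocabulary) {
            if (!candidate.equalsIgnoreCase(query) && toyStem(candidate).equals(stem)) {
                related.add(candidate);
            }
        }
        return related;
    }

    public static void main(String[] args) {
        List<String> vocabulary = Arrays.asList("bind", "binds", "binding", "binder");
        // prints [bind, binds]
        System.out.println(sameStemClass("binding", vocabulary));
    }
}
```

A real stemmer such as Paice/Husk applies a much larger rule table, so the classes it produces will differ from those of this toy function.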

getDidYouMeanBasename

public String getDidYouMeanBasename()

setDidYouMeanBasename

public void setDidYouMeanBasename(String basename)

Copyright © 2003-2008 Institute for Computational Biomedicine, All Rights Reserved.