Twease

edu.cornell.med.icb.synonyms.util
Class SynonymClImportTool

java.lang.Object
  extended by edu.cornell.med.icb.synonyms.util.SynonymClImportTool

public final class SynonymClImportTool
extends Object

Import abbreviations and DYMs from twease query logs using the command line.

Author:
Kevin Dorff

Field Summary
static int HUGO_ALIASES
          Hugo_Aliases are stored in this position of the array.
static int HUGO_MIN_SYMBOL_LENGTH
          The minimum length acceptable for Hugo symbols.
static int HUGO_NAME
          Hugo_Name are stored in this position of the array.
static int HUGO_PREVSYMBOLS
          Hugo_PrevSymbols are stored in this position of the array.
static int HUGO_SYMBOL
          Hugo_Symbol are stored in this position of the array.
 
Constructor Summary
SynonymClImportTool()
          Constructor.
SynonymClImportTool(String persistenceUnit)
          Constructor.
 
Method Summary
 int getHugoLineCount()
          Import value: number of hugo lines processed.
 int getMeshBaseTermsCount()
          Import value: number of mesh base terms processed.
 SynonymImportTool getSynonymImportTool()
           
 int getTweaseImportBadParseCount()
          Import value: number of twease queries that couldn't be parsed.
 int getTweaseImportQueryCount()
          Import value: number of twease queries imported.
 int getTweaseImportTermCount()
          Import value: number of twease terms found.
 int getTweaseImportTermRepeatCount()
          Import value: number of twease repeat terms found.
 void hugoImport(String filename)
          Import a Hugo TSV file.
 void initializeQueryExecutor()
          Initialize the QueryExecutor.
static void main(String[] args)
          Command line interface to SynonymImportTool.
 void meshImport(String filename)
          Import the MeSH data.
protected  List<Synonym> processHugoDataLine(String line, int[] header, int minSymbolLen)
          Process a Hugo data line for synonyms.
 void tweaseQueryLogImport(String filename, boolean importRelated, boolean importDym, boolean importAbbr)
          Import the abbreviations and DYMs for all the phrases and terms in the specified twease query logs.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

HUGO_SYMBOL

public static final int HUGO_SYMBOL
Hugo_Symbol are stored in this position of the array.

See Also:
Constant Field Values

HUGO_NAME

public static final int HUGO_NAME
Hugo_Name are stored in this position of the array.

See Also:
Constant Field Values

HUGO_PREVSYMBOLS

public static final int HUGO_PREVSYMBOLS
Hugo_PrevSymbols are stored in this position of the array.

See Also:
Constant Field Values

HUGO_ALIASES

public static final int HUGO_ALIASES
Hugo_Aliases are stored in this position of the array.

See Also:
Constant Field Values

HUGO_MIN_SYMBOL_LENGTH

public static final int HUGO_MIN_SYMBOL_LENGTH
The minimum length acceptable for Hugo symbols.

See Also:
Constant Field Values

Constructor Detail

SynonymClImportTool

public SynonymClImportTool(String persistenceUnit)
Constructor.

Parameters:
persistenceUnit - the jpa persistence-unit name to use.

SynonymClImportTool

public SynonymClImportTool()
Constructor.

Method Detail

getSynonymImportTool

public SynonymImportTool getSynonymImportTool()

hugoImport

public void hugoImport(String filename)
                throws IOException
Import a Hugo TSV file.

Parameters:
filename - the Hugo TSV file to import
Throws:
IOException - problem reading the file

processHugoDataLine

protected List<Synonym> processHugoDataLine(String line,
                                            int[] header,
                                            int minSymbolLen)
Process a Hugo data line for synonyms.

Parameters:
line - the line of data to use
header - the positions of the columns of interest
minSymbolLen - the minimum length of a symbol to accept
Returns:
the list of synonyms found for this line
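
As an illustration of how a column-position array and a minimum symbol length might drive the per-line extraction, here is a minimal, self-contained sketch. It is not the Twease implementation: the class HugoLineSketch, the slot order 0 to 3, the use of -1 for a missing column, the comma-separated cell format, and returning plain strings instead of Synonym objects are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of Twease.
final class HugoLineSketch {
    // Assumed slot order in the header array. The real constant values
    // are not shown on this page.
    static final int SYMBOL = 0;
    static final int NAME = 1;
    static final int PREV_SYMBOLS = 2;
    static final int ALIASES = 3;

    /**
     * Collects the terms found on one tab-separated line.
     * header[slot] is the column index for that slot, or -1 if absent.
     */
    static List<String> termsForLine(String line, int[] header, int minSymbolLen) {
        String[] cols = line.split("\t", -1); // -1 keeps trailing empty columns
        List<String> terms = new ArrayList<>();
        addIfLongEnough(terms, column(cols, header[SYMBOL]), minSymbolLen);
        // Names are kept whenever they are non-empty (an assumption).
        addIfLongEnough(terms, column(cols, header[NAME]), 1);
        // Assumed: previous symbols and aliases are comma-separated in one cell.
        for (int slot : new int[] {PREV_SYMBOLS, ALIASES}) {
            for (String part : column(cols, header[slot]).split(",")) {
                addIfLongEnough(terms, part, minSymbolLen);
            }
        }
        return terms;
    }

    // Returns the cell at idx, or "" when the column is missing.
    private static String column(String[] cols, int idx) {
        return (idx < 0 || idx >= cols.length) ? "" : cols[idx];
    }

    // Adds the trimmed term if it is non-empty, long enough, and not a duplicate.
    private static void addIfLongEnough(List<String> terms, String raw, int minLen) {
        String t = raw.trim();
        if (!t.isEmpty() && t.length() >= minLen && !terms.contains(t)) {
            terms.add(t);
        }
    }

    public static void main(String[] args) {
        String line = "A1BG\talpha-1-B glycoprotein\t\tA1B, ABG, GAB";
        System.out.println(termsForLine(line, new int[] {0, 1, 2, 3}, 3));
    }
}
```

Raising minSymbolLen in this sketch drops short symbols and aliases while leaving the full name in place, which is one plausible way to keep very short, ambiguous symbols out of a synonym table.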

meshImport

public void meshImport(String filename)
                throws FileNotFoundException,
                       javax.xml.stream.XMLStreamException
Import the MeSH data.

Parameters:
filename - MeSH XML file of data
Throws:
FileNotFoundException - specified MeSH XML file does not exist
javax.xml.stream.XMLStreamException - problem processing the xml file
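
meshImport declares XMLStreamException, which suggests a StAX-style streaming read. The sketch below shows that reading pattern on an in-memory document. The class MeshStaxSketch and the choice to count DescriptorRecord elements are assumptions; this page does not say which elements the tool treats as base terms.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical stand-in for illustration only; not part of Twease.
final class MeshStaxSketch {
    /** Counts start tags whose local name equals elementName. */
    static int countElements(String xml, String elementName) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                // next() advances the cursor and returns the event type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<DescriptorRecordSet>"
                + "<DescriptorRecord><DescriptorName><String>Calcimycin</String></DescriptorName></DescriptorRecord>"
                + "<DescriptorRecord><DescriptorName><String>Temefos</String></DescriptorName></DescriptorRecord>"
                + "</DescriptorRecordSet>";
        System.out.println(countElements(xml, "DescriptorRecord"));
    }
}
```

A streaming reader keeps memory use flat regardless of file size, which matters for large XML files; a malformed document surfaces as an XMLStreamException from next().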

tweaseQueryLogImport

public void tweaseQueryLogImport(String filename,
                                 boolean importRelated,
                                 boolean importDym,
                                 boolean importAbbr)
                          throws FileNotFoundException,
                                 javax.xml.stream.XMLStreamException
Import the abbreviations and DYMs for all the phrases and terms in the specified twease query logs.

Parameters:
filename - twease query log
importRelated - set to true to import "Related" synonyms
importDym - set to true to import "DYM" synonyms
importAbbr - set to true to import "Abbr" synonyms
Throws:
FileNotFoundException - specified twease query log does not exist
javax.xml.stream.XMLStreamException - problem processing the xml file

initializeQueryExecutor

public void initializeQueryExecutor()
                             throws twease.query.TweaseException,
                                    org.apache.commons.configuration.ConfigurationException,
                                    IllegalAccessException,
                                    textractor.database.TextractorDatabaseException,
                                    IOException,
                                    InstantiationException,
                                    InvocationTargetException,
                                    NoSuchMethodException,
                                    URISyntaxException,
                                    ClassNotFoundException
Initialize the QueryExecutor.

Throws:
twease.query.TweaseException - exception
org.apache.commons.configuration.ConfigurationException - exception
IllegalAccessException - exception
textractor.database.TextractorDatabaseException - exception
IOException - exception
InstantiationException - exception
InvocationTargetException - exception
NoSuchMethodException - exception
ClassNotFoundException
URISyntaxException

getTweaseImportQueryCount

public int getTweaseImportQueryCount()
Import value: number of twease queries imported.

Returns:
the number of twease queries imported

getTweaseImportBadParseCount

public int getTweaseImportBadParseCount()
Import value: number of twease queries that couldn't be parsed.

Returns:
the number of twease queries that couldn't be parsed

getTweaseImportTermCount

public int getTweaseImportTermCount()
Import value: number of twease terms found.

Returns:
the number of twease terms found

getTweaseImportTermRepeatCount

public int getTweaseImportTermRepeatCount()
Import value: number of twease repeat terms found.

Returns:
the number of twease repeat terms found

getHugoLineCount

public int getHugoLineCount()
Import value: number of hugo lines processed.

Returns:
the number of hugo lines processed

getMeshBaseTermsCount

public int getMeshBaseTermsCount()
Import value: number of mesh base terms processed.

Returns:
the number of mesh base terms processed
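
The six counters above can be gathered into a one-line summary after an import run. The sketch below uses a hypothetical ImportCounts interface that only mirrors the documented getter names, so it compiles without the Twease jar; the report format is likewise an assumption.

```java
// Hypothetical stand-in for illustration only; not part of Twease.
final class ImportReportSketch {
    /** Mirrors the getter names documented on this page. */
    interface ImportCounts {
        int getTweaseImportQueryCount();
        int getTweaseImportBadParseCount();
        int getTweaseImportTermCount();
        int getTweaseImportTermRepeatCount();
        int getHugoLineCount();
        int getMeshBaseTermsCount();
    }

    /** Formats all six counters into a single summary line. */
    static String report(ImportCounts c) {
        return String.format(
                "twease: queries=%d badParse=%d terms=%d repeats=%d; hugo: lines=%d; mesh: baseTerms=%d",
                c.getTweaseImportQueryCount(),
                c.getTweaseImportBadParseCount(),
                c.getTweaseImportTermCount(),
                c.getTweaseImportTermRepeatCount(),
                c.getHugoLineCount(),
                c.getMeshBaseTermsCount());
    }
}
```

In real use, a thin adapter that delegates each method to a SynonymClImportTool instance would stand in for ImportCounts.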

main

public static void main(String[] args)
Command line interface to SynonymImportTool.

Parameters:
args - command line arguments

Copyright © 2006-2007 Institute for Computational Biomedicine, All Rights Reserved.