Textractor API textractor-720 (20091120123250)
java.lang.Object
  it.unimi.dsi.mg4j.util.parser.callback.DefaultCallback
    textractor.parsers.PubmedExtractor
public abstract class PubmedExtractor
PubMed / MEDLINE XML parser.
| Field Summary | | |
|---|---|---|
| static MutableString | EMPTY_MUTABLE_STRING | An empty mutable string. |

| Fields inherited from interface it.unimi.dsi.mg4j.util.parser.callback.Callback |
|---|
| EMPTY_CALLBACK_ARRAY |
| Constructor Summary | | |
|---|---|---|
| protected | PubmedExtractor() | Create the parser. |
| Method Summary | | |
|---|---|---|
| boolean | characters(char[] characters, int offset, int length, boolean flowBroken) | Received XML characters. |
| void | configure(it.unimi.dsi.mg4j.util.parser.BulletParser parserVal) | Configure the parser to parse text. |
| void | endDocument() | End of document. |
| boolean | endElement(it.unimi.dsi.mg4j.util.parser.Element elementOrig) | We have found an XML end-element tag. |
| abstract boolean | processAbstractText(MutableString pmidVal, MutableString titleVal, MutableString textVal, Map<String,Object> additionalFieldsMap) | Process the text of this document. |
| abstract void | processNoticeOfRetraction(MutableString pmidVal, List<String> retractedPmidsVal, boolean createArticleVal) | Process retraction notices. |
| void | setArticleElementName(String dummy) | Deprecated. |
| void | startDocument() | Start of document. |
| boolean | startElement(it.unimi.dsi.mg4j.util.parser.Element elementOrig, Map attrMap) | We have found an XML start-element tag. |
| it.unimi.dsi.mg4j.util.parser.Element | translateElement(it.unimi.dsi.mg4j.util.parser.Element elementOrig) | If the element is in elementIgnoreSet this will return null. |
| Methods inherited from class it.unimi.dsi.mg4j.util.parser.callback.DefaultCallback |
|---|
| cdata, getInstance |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final MutableString EMPTY_MUTABLE_STRING
| Constructor Detail |
|---|
protected PubmedExtractor()
| Method Detail |
|---|
@Deprecated public void setArticleElementName(String dummy)
Parameters:
dummy - ignored value
public final void configure(it.unimi.dsi.mg4j.util.parser.BulletParser parserVal)
Specified by: configure in interface Callback
Overrides: configure in class DefaultCallback
Parameters:
parserVal - parser to use
public final void startDocument()
Specified by: startDocument in interface Callback
Overrides: startDocument in class DefaultCallback
public void endDocument()
Specified by: endDocument in interface Callback
Overrides: endDocument in class DefaultCallback
public final boolean characters(char[] characters,
int offset,
int length,
boolean flowBroken)
Specified by: characters in interface Callback
Overrides: characters in class DefaultCallback
Parameters:
characters - the characters
offset - the offset of the characters we are interested in
length - the length of the characters we are interested in
flowBroken - true if flow is broken
public final boolean startElement(it.unimi.dsi.mg4j.util.parser.Element elementOrig,
Map attrMap)
Specified by: startElement in interface Callback
Overrides: startElement in class DefaultCallback
Parameters:
elementOrig - the current element
attrMap - attributes map for this element
public final boolean endElement(it.unimi.dsi.mg4j.util.parser.Element elementOrig)
Specified by: endElement in interface Callback
Overrides: endElement in class DefaultCallback
Parameters:
elementOrig - the current element
public it.unimi.dsi.mg4j.util.parser.Element translateElement(it.unimi.dsi.mg4j.util.parser.Element elementOrig)
Parameters:
elementOrig - the element to translate
Returns:
null if the element is in elementIgnoreSet
public abstract boolean processAbstractText(MutableString pmidVal,
MutableString titleVal,
MutableString textVal,
Map<String,Object> additionalFieldsMap)
throws IOException,
SentenceProcessingException
Parameters:
pmidVal - the pmid of the document
titleVal - the title of the document
textVal - the text of the document
additionalFieldsMap - additional fields to index
Throws:
IOException - error processing sentence
SentenceProcessingException - error processing sentence
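The sketch below illustrates how a concrete subclass might implement processAbstractText. It is self-contained and does not use the Textractor or MG4J jars: AbstractTextHandlerSketch is a hypothetical stand-in that mirrors only the shape of the abstract method, with CharSequence in place of MutableString and without SentenceProcessingException. CollectingHandlerSketch and its summaries field are likewise illustrative names. The meaning of the boolean return value is not documented on this page; returning true is an assumption.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PubmedExtractor (NOT the real class): it mirrors
// only the signature shape of processAbstractText, using CharSequence where
// the real method takes MutableString.
abstract class AbstractTextHandlerSketch {
    abstract boolean processAbstractText(CharSequence pmidVal,
                                         CharSequence titleVal,
                                         CharSequence textVal,
                                         Map<String, Object> additionalFieldsMap)
            throws IOException;
}

// Illustrative subclass: records one summary line per article it receives.
class CollectingHandlerSketch extends AbstractTextHandlerSketch {
    final List<String> summaries = new ArrayList<>();

    @Override
    boolean processAbstractText(CharSequence pmidVal,
                                CharSequence titleVal,
                                CharSequence textVal,
                                Map<String, Object> additionalFieldsMap) {
        // String concatenation copies the character data, so the summary stays
        // valid even if the caller later reuses its buffers.
        summaries.add(pmidVal + ": " + titleVal
                + " (" + textVal.length() + " chars, "
                + additionalFieldsMap.size() + " extra fields)");
        // Assumption: the return value's meaning is undocumented here.
        return true;
    }
}
```

With the real class, the same pattern would extend PubmedExtractor and also implement processNoticeOfRetraction, since both methods are abstract.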
public abstract void processNoticeOfRetraction(MutableString pmidVal,
List<String> retractedPmidsVal,
boolean createArticleVal)
Parameters:
pmidVal - the pmid of the retraction
retractedPmidsVal - the retracted pmids
createArticleVal - true if this method may create an article to represent the retraction notice
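As an illustration of how the three parameters might be used, the following sketch keeps a map from each retracted PMID to the notice that retracts it, and records the notice itself only when createArticleVal allows it. RetractionHandlerSketch is a hypothetical stand-in for PubmedExtractor that mirrors only this method's shape (with CharSequence in place of MutableString); RetractionIndexSketch, retractedBy, and noticeArticles are illustrative names, not part of the Textractor API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PubmedExtractor (NOT the real class): it mirrors
// only the signature shape of processNoticeOfRetraction.
abstract class RetractionHandlerSketch {
    abstract void processNoticeOfRetraction(CharSequence pmidVal,
                                            List<String> retractedPmidsVal,
                                            boolean createArticleVal);
}

// Illustrative subclass: indexes which notice retracts which article.
class RetractionIndexSketch extends RetractionHandlerSketch {
    // Retracted PMID -> PMID of the notice that retracts it.
    final Map<String, String> retractedBy = new HashMap<>();
    // PMIDs of notices for which an article entry was created.
    final List<String> noticeArticles = new ArrayList<>();

    @Override
    void processNoticeOfRetraction(CharSequence pmidVal,
                                   List<String> retractedPmidsVal,
                                   boolean createArticleVal) {
        String noticePmid = pmidVal.toString();
        for (String retractedPmid : retractedPmidsVal) {
            retractedBy.put(retractedPmid, noticePmid);
        }
        // Only create an entry for the notice itself when the caller permits it.
        if (createArticleVal) {
            noticeArticles.add(noticePmid);
        }
    }
}
```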