Textractor API textractor-720 (20091120123250)
java.lang.Object
  textractor.mg4j.docstore.DocumentStoreReader
public final class DocumentStoreReader
Access a DocumentStore on disk.
User: Fabien Campagne. Date: Oct 29, 2005. Time: 5:17:17 PM.
| Field Summary | |
|---|---|
| static int | DOCUMENT_NOT_FOUND |
| Constructor Summary | |
|---|---|
| DocumentStoreReader(DocumentIndexManager docmanager) | Initialize a document store reader for the "text" index. |
| DocumentStoreReader(IndexDetails indexDetailsVal, boolean readPmidsVal) | Initialize a document store reader for the index specified by indexDetailsVal. |
| DocumentStoreReader(IndexDetails indexDetailsVal, int maxReads, boolean readPmidsVal) | Initialize a document store reader for the index specified by indexDetailsVal. |
| Method Summary | | |
|---|---|---|
| void | close() | Closes this reader and releases any system resources associated with it. |
| MutableString | document(int documentIndex) | Retrieves a document from this document store. |
| int | document(int documentIndex, List<Integer> result) | Retrieves a document from this document store. |
| void | document(int documentIndex, MutableString result) | Retrieves a document from this document store. |
| int | frequencies(int documentIndex, int[] termFrequencies, int[] numDocForTerm) | Get frequencies. |
| int | getDocumentNumber(long pmid) | Retrieve the document number that corresponds to a given PMID. |
| int | getNumberOfDocuments() | Get the total number of documents in the index associated with this document store. |
| int | getNumberOfTerms() | Get the total number of terms in the index associated with this document store. |
| long | getPMID(int documentNumber) | Retrieve the PMID that corresponds to a given document number. |
| int[] | getTermPermutation() | Get term permutation. |
| boolean | isPositionsAvailable() | Does this document store contain position information? |
| List<IntRange> | positions(int documentIndex) | Get the range of positions that this term occupied in the original document source. |
| void | readPMIDs() | Read PMID information. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final int DOCUMENT_NOT_FOUND
| Constructor Detail |
|---|
public DocumentStoreReader(DocumentIndexManager docmanager)
throws IOException
docmanager - the document index manager
IOException - When docstore files cannot be read with
the given basename.
public DocumentStoreReader(IndexDetails indexDetailsVal,
boolean readPmidsVal)
throws IOException
indexDetailsVal - the document index to read the documents for
readPmidsVal - if true, the pmids file will be read
IOException - When docstore files cannot be read with
the given basename.
public DocumentStoreReader(IndexDetails indexDetailsVal,
int maxReads,
boolean readPmidsVal)
throws IOException
indexDetailsVal - the document index to read the documents for
maxReads - xx
readPmidsVal - if true, the pmids file will be read
IOException - When docstore files cannot be read with
the given basename.
| Method Detail |
|---|
public void readPMIDs()
throws IOException
IOException - if the pmid map file cannot be read
See also: getPMID(int)
public long getPMID(int documentNumber)
documentNumber - Document for which the PMID is sought
See also: readPMIDs()
public int getDocumentNumber(long pmid)
pmid - PMID for which the document number is sought
Returns the document number, or DOCUMENT_NOT_FOUND (-1) if not found.
See also: readPMIDs()
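The PMID lookup described for getDocumentNumber(long) and getPMID(int) might be used roughly as in the sketch below. It does not use the Textractor classes: PmidLookup and InMemoryLookup are hypothetical stand-ins that only mirror the two documented signatures, and the value -1 for DOCUMENT_NOT_FOUND is an assumption taken from the "-1 if not found" note above. The point is simply to check the sentinel before treating the result as a document number.

```java
import java.util.HashMap;
import java.util.Map;

public class PmidLookupSketch {
    /** Hypothetical stand-in mirroring the documented lookup signatures; not the real class. */
    interface PmidLookup {
        int DOCUMENT_NOT_FOUND = -1; // assumed value, based on the "-1 if not found" note

        int getDocumentNumber(long pmid);

        long getPMID(int documentNumber);
    }

    /** In-memory implementation that exists only to make the sketch runnable. */
    static final class InMemoryLookup implements PmidLookup {
        private final long[] pmids;
        private final Map<Long, Integer> byPmid = new HashMap<>();

        InMemoryLookup(long... pmids) {
            this.pmids = pmids.clone();
            for (int i = 0; i < pmids.length; i++) {
                byPmid.put(pmids[i], i); // document number is the array position
            }
        }

        @Override
        public int getDocumentNumber(long pmid) {
            Integer n = byPmid.get(pmid);
            return n == null ? DOCUMENT_NOT_FOUND : n;
        }

        @Override
        public long getPMID(int documentNumber) {
            return pmids[documentNumber];
        }
    }

    /** Describes a PMID lookup, checking the sentinel before using the result. */
    static String describe(PmidLookup lookup, long pmid) {
        int doc = lookup.getDocumentNumber(pmid);
        if (doc == PmidLookup.DOCUMENT_NOT_FOUND) {
            return "PMID " + pmid + " is not in the store";
        }
        return "PMID " + pmid + " is document " + doc;
    }

    public static void main(String[] args) {
        PmidLookup lookup = new InMemoryLookup(11111L, 22222L, 33333L);
        System.out.println(describe(lookup, 22222L));
        System.out.println(describe(lookup, 99999L));
    }
}
```

With the real reader, the PMID map would presumably need to be loaded first, either through readPMIDs() or by passing readPmidsVal as true to a constructor.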
public MutableString document(int documentIndex)
throws IOException
documentIndex - the document index number to read
IOException - error reading the data
public void document(int documentIndex,
MutableString result)
throws IOException
documentIndex - the document index number to read
result - Text of the document will be appended to this MutableString.
IOException - error reading the data
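Because document(int, MutableString) appends to the supplied buffer rather than replacing its contents, a caller that reuses one buffer across documents needs to clear it between reads. The sketch below illustrates that pattern under stated assumptions: TextStore is a hypothetical stand-in for the reader, StringBuilder takes the place of MutableString, and the checked IOException of the real method is left out to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

public class AppendingReadSketch {
    /** Hypothetical stand-in for document(int, MutableString); StringBuilder replaces MutableString. */
    interface TextStore {
        void document(int documentIndex, StringBuilder result);
    }

    /** Reads every document into one reused buffer and records each text length. */
    static int[] documentLengths(TextStore store, int numberOfDocuments) {
        int[] lengths = new int[numberOfDocuments];
        StringBuilder buffer = new StringBuilder();
        for (int i = 0; i < numberOfDocuments; i++) {
            buffer.setLength(0);        // text is appended, so reset the buffer first
            store.document(i, buffer);
            lengths[i] = buffer.length();
        }
        return lengths;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList("alpha beta", "gamma");
        // The lambda plays the role of the store: it appends the requested text.
        TextStore store = (index, result) -> result.append(texts.get(index));
        System.out.println(Arrays.toString(documentLengths(store, texts.size())));
    }
}
```

Without the reset, the second length would include the text of the first document, since each call only appends.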
public int document(int documentIndex,
List<Integer> result)
throws IOException
documentIndex - Index of the document to retrieve.
result - Words of the document will be appended to this IntList.
IOException - error reading the data
public int[] getTermPermutation()
public int frequencies(int documentIndex,
int[] termFrequencies,
int[] numDocForTerm)
throws IOException
IOException
public List<IntRange> positions(int documentIndex)
throws IOException
documentIndex - index of the document to get the positions for
IOException - if there is a problem reading the positions
See also: isPositionsAvailable()
public boolean isPositionsAvailable()
public int getNumberOfTerms()
public int getNumberOfDocuments()
public void close()
throws IOException
close in interface Closeable
IOException - if an I/O error occurs
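Since close() is specified by java.io.Closeable, a reader of this kind can be managed with a try-with-resources statement (Java 7 and later). The sketch below shows the pattern with a hypothetical FakeReader stand-in rather than the real class; its fixed document count is invented for illustration, and its close() omits the IOException that the real method declares.

```java
import java.io.Closeable;

public class CloseSketch {
    /** Hypothetical stand-in for a reader that implements Closeable. */
    static final class FakeReader implements Closeable {
        boolean closed;

        int getNumberOfDocuments() {
            return 3; // invented value for the sketch
        }

        @Override
        public void close() {
            closed = true; // the real close() may also throw IOException
        }
    }

    /** Uses the reader and closes it on exit, even if the body throws. */
    static int countDocuments(FakeReader reader) {
        try (FakeReader r = reader) {
            return r.getNumberOfDocuments();
        }
    }

    public static void main(String[] args) {
        FakeReader reader = new FakeReader();
        System.out.println(countDocuments(reader));
        System.out.println(reader.closed);
    }
}
```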