Squil API squil-98 (20130530172026)

edu.cornell.med.icb.parsers
Class FastaParser

java.lang.Object
  extended by edu.cornell.med.icb.parsers.FastaParser

public final class FastaParser
extends Object

Parses a FASTA file. In contrast to the crover FastaParser, this class uses MutableString for efficiency and loads sequences lazily, so clients can start processing sequences before the file has been read completely. The parser can therefore process very large files without consuming more memory than is needed to process the largest sequence in the file.

Author:
Fabien Campagne Date: Oct 25, 2006 Time: 6:09:57 PM

Constructor Summary
FastaParser()
          Create a parser to read sequences.
FastaParser(Reader fastaFileSource)
          Create a parser to read sequences.
 
Method Summary
static void filterProteinResidues(CharSequence rawResidues, MutableString filteredResidues)
          Filter a string to keep only protein residues.
static void guessAccessionCode(CharSequence descriptionLine, MutableString accessionCode)
          Try to extract an accession code from a FASTA description line.
 boolean hasNext()
          Returns true if the reader has at least one more sequence.
 boolean next(MutableString descriptionLine, MutableString residues)
          Obtain the next sequence from the reader over the FASTA formatted content.
 void setReader(Reader reader)
          Repositions this parser on a different file or data source.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FastaParser

public FastaParser(Reader fastaFileSource)
            throws IOException
Create a parser to read sequences.

Parameters:
fastaFileSource - The reader over the FASTA formatted data.
Throws:
IOException - if the sequence cannot be read using the reader

FastaParser

public FastaParser()
Create a parser to read sequences.

Method Detail

setReader

public void setReader(Reader reader)
               throws IOException
Repositions this parser on a different file or data source.

Parameters:
reader - the new reader to use to parse the file
Throws:
IOException - if the sequence cannot be read using the reader

hasNext

public boolean hasNext()
Returns true if the reader has at least one more sequence.

Returns:
True if a call to next will return another sequence.

next

public boolean next(MutableString descriptionLine,
                    MutableString residues)
             throws IOException
Obtain the next sequence from the reader over the FASTA formatted content. This method returns true until there are no more sequences to parse in the input. When the method returns false, the contents of the parameters descriptionLine and residues are unspecified.

Parameters:
descriptionLine - Where the raw description line will be written.
residues - Where the raw residue lines will be written.
Returns:
True if hasNext() was true before the call, that is, if a sequence was read into the parameters; false otherwise.
Throws:
IOException - if there is a problem reading from the input
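
The hasNext/next contract described above can be illustrated with a small stand-in. The sketch below is not the Squil implementation: LazyFastaSketch is a hypothetical class, StringBuilder takes the place of MutableString so that only the JDK is needed, and stripping the leading '>' from the description line is an assumption. It reads one record per call, so only the current sequence is held in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical stand-in that mimics the documented contract; NOT the Squil FastaParser. */
public final class LazyFastaSketch {
    private BufferedReader reader;
    /** Description of the record not yet returned, or null at end of input. */
    private String pendingDescription;

    public LazyFastaSketch(Reader source) throws IOException {
        setReader(source);
    }

    /** Attach a new source and advance to its first '>' line. */
    public void setReader(Reader source) throws IOException {
        reader = new BufferedReader(source);
        pendingDescription = null;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                // Assumption: the '>' marker is dropped from the description.
                pendingDescription = line.substring(1);
                break;
            }
        }
    }

    public boolean hasNext() {
        return pendingDescription != null;
    }

    /** Fill both buffers with the next record; reads only up to the next '>' line. */
    public boolean next(StringBuilder descriptionLine, StringBuilder residues) throws IOException {
        if (!hasNext()) {
            return false;
        }
        descriptionLine.setLength(0);
        residues.setLength(0);
        descriptionLine.append(pendingDescription);
        pendingDescription = null;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                pendingDescription = line.substring(1);
                break;
            }
            residues.append(line.trim());
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">seq1 first\nACDE\nFGH\n>seq2 second\nKLMN\n";
        LazyFastaSketch parser = new LazyFastaSketch(new StringReader(fasta));
        // The two buffers are reused across records, as with MutableString.
        StringBuilder description = new StringBuilder();
        StringBuilder residues = new StringBuilder();
        while (parser.next(description, residues)) {
            System.out.println(description + " -> " + residues);
        }
    }
}
```

Reusing the same two buffers for every record is what keeps allocation low in this style of API; callers that need to keep a record must copy it before the next call.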

guessAccessionCode

public static void guessAccessionCode(CharSequence descriptionLine,
                                      MutableString accessionCode)
Try to extract an accession code from a FASTA description line.

Parameters:
descriptionLine - The line of text to parse for the accession code
accessionCode - The location to place the resulting accession code
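
The heuristics this method applies are not documented here. As an illustration of the output-parameter style only, the sketch below uses assumed rules: take the first whitespace-delimited token of the description line, and if that token is pipe-delimited (for example db|ACCESSION|NAME), take its second field. AccessionGuessSketch is a hypothetical class, and StringBuilder stands in for MutableString.

```java
/** Hypothetical illustration; the rules below are assumptions, not Squil's actual heuristics. */
public final class AccessionGuessSketch {
    private AccessionGuessSketch() {
    }

    static void guessAccessionCode(CharSequence descriptionLine, StringBuilder accessionCode) {
        accessionCode.setLength(0);
        String line = descriptionLine.toString().trim();
        if (line.startsWith(">")) {
            line = line.substring(1).trim();
        }
        if (line.isEmpty()) {
            return; // nothing to guess from; the output stays empty
        }
        // First whitespace-delimited token of the description.
        String token = line.split("\\s+", 2)[0];
        // Assumed convention: in db|ACCESSION|NAME the accession is the second field.
        String[] fields = token.split("\\|");
        if (fields.length >= 2 && !fields[1].isEmpty()) {
            accessionCode.append(fields[1]);
        } else {
            accessionCode.append(token);
        }
    }

    public static void main(String[] args) {
        StringBuilder accession = new StringBuilder();
        guessAccessionCode("sp|P12345|EXAMPLE_HUMAN example protein", accession);
        System.out.println(accession); // prints P12345
        guessAccessionCode("NM_000546.6 example transcript", accession);
        System.out.println(accession); // prints NM_000546.6
    }
}
```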

filterProteinResidues

public static void filterProteinResidues(CharSequence rawResidues,
                                         MutableString filteredResidues)
Filter a string to keep only protein residues.

Parameters:
rawResidues - A string that may contain any character.
filteredResidues - The subset of characters that represent valid protein residue codes, in the order in which they occur in the rawResidues string.
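
Which characters count as valid residue codes is not stated on this page. The sketch below assumes the 20 standard one-letter amino acid codes, matched case-insensitively; ProteinFilterSketch is a hypothetical class and StringBuilder stands in for MutableString. It shows the order-preserving filter into a caller-supplied buffer.

```java
/** Hypothetical illustration; the accepted alphabet is an assumption. */
public final class ProteinFilterSketch {
    /** Assumed alphabet: the 20 standard one-letter amino acid codes. */
    private static final String RESIDUE_CODES = "ACDEFGHIKLMNPQRSTVWY";

    private ProteinFilterSketch() {
    }

    static void filterProteinResidues(CharSequence rawResidues, StringBuilder filteredResidues) {
        filteredResidues.setLength(0);
        for (int i = 0; i < rawResidues.length(); i++) {
            char c = rawResidues.charAt(i);
            // Keep the character as written if its upper-case form is a residue code.
            if (RESIDUE_CODES.indexOf(Character.toUpperCase(c)) >= 0) {
                filteredResidues.append(c);
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder filtered = new StringBuilder();
        filterProteinResidues("MK-T*1 aX\n", filtered);
        System.out.println(filtered); // prints MKTa
    }
}
```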

Copyright © 2007-2013 Institute for Computational Biomedicine, All Rights Reserved.