de.jstacs.io
Class SymbolExtractor

java.lang.Object
  extended by de.jstacs.io.SymbolExtractor
All Implemented Interfaces:
Enumeration<String>

public class SymbolExtractor
extends Object
implements Enumeration<String>

This class enables you to extract elements from a given string, similar to a StringTokenizer.

The class offers some functionality that a StringTokenizer does not provide:

  1. It enables you to reuse an object for several strings.
  2. It handles the delimiters "" and null. (In these cases the individual characters of the string are returned.)
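
The two points above can be illustrated with a small stand-in class. This is a minimal sketch, not the Jstacs implementation: the class name ReusableExtractorSketch and the helper drain are hypothetical, and delegating non-empty delimiters to java.util.StringTokenizer (which treats the delimiter string as a set of characters) is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

// Hypothetical stand-in, NOT the Jstacs class: it only mimics the two
// documented behaviors (object reuse and the ""/null delimiter).
class ReusableExtractorSketch implements Enumeration<String> {
    private final String delim;
    private String string = "";
    private int pos = 0;
    private StringTokenizer tokenizer = null;

    ReusableExtractorSketch(String delim) {
        this.delim = delim;
    }

    // Resets the internal state so the same object can parse a new string.
    void setStringToBeParsed(String s) {
        string = s;
        pos = 0;
        boolean perCharacter = delim == null || delim.isEmpty();
        // Assumption: non-empty delimiters behave like StringTokenizer delimiters.
        tokenizer = perCharacter ? null : new StringTokenizer(s, delim);
    }

    @Override
    public boolean hasMoreElements() {
        return tokenizer != null ? tokenizer.hasMoreTokens() : pos < string.length();
    }

    @Override
    public String nextElement() {
        if (tokenizer != null) {
            return tokenizer.nextToken();
        }
        if (pos >= string.length()) {
            throw new NoSuchElementException();
        }
        // With "" or null as delimiter, each character is its own element.
        return String.valueOf(string.charAt(pos++));
    }

    // Hypothetical helper: parses s with e and collects all elements.
    static List<String> drain(String s, ReusableExtractorSketch e) {
        e.setStringToBeParsed(s);
        List<String> out = new ArrayList<>();
        while (e.hasMoreElements()) {
            out.add(e.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        ReusableExtractorSketch tab = new ReusableExtractorSketch("\t");
        System.out.println(drain("A\tC\tG", tab)); // prints [A, C, G]
        System.out.println(drain("T\tT", tab));    // same object reused, prints [T, T]
        ReusableExtractorSketch chars = new ReusableExtractorSketch(null);
        System.out.println(drain("ACGT", chars));  // prints [A, C, G, T]
    }
}
```

A StringTokenizer, by contrast, is bound to one string at construction time, so parsing many lines requires one tokenizer object per line.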

      Author:
      Jens Keilwagen

      Constructor Summary
      SymbolExtractor(String delim)
                Creates a new instance using delim as delimiter.
      SymbolExtractor(String string, String delim)
                Creates a new instance using delim as delimiter and string as string to be parsed.
       
      Method Summary
       int countElements()
                Counts the number of elements that can initially be received.
      static int filter(String inFile, char ignore, AlphabetContainer con, int minLength, String outFile)
                This method allows the user to filter a file using a given alphabet container and a minimal sequence length.
       boolean hasMoreElements()
                 
       String nextElement()
                 
       void setStringToBeParsed(String string)
                Sets a new string to be parsed.
       
      Methods inherited from class java.lang.Object
      clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
       

      Constructor Detail

      SymbolExtractor

      public SymbolExtractor(String delim)
      Creates a new instance using delim as delimiter. Before invoking any other method, setStringToBeParsed(String) has to be called.

      Parameters:
      delim - the delimiter
      See Also:
      setStringToBeParsed(String)

      SymbolExtractor

      public SymbolExtractor(String string,
                             String delim)
      Creates a new instance using delim as delimiter and string as string to be parsed.

      Parameters:
      string - the string to be parsed
      delim - the delimiter
      Method Detail

      setStringToBeParsed

      public void setStringToBeParsed(String string)
      Sets a new string to be parsed.

      Parameters:
      string - the string to be parsed

      countElements

      public int countElements()
      Counts the number of elements that can initially be received.

      Returns:
      the number of elements that can initially be received
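
For contrast, the JDK method StringTokenizer.countTokens() reports only the tokens that remain, so its value shrinks as tokens are consumed. The sketch below demonstrates that JDK behavior; the class name CountTokensContrast is hypothetical. That countElements() stays fixed at the initial count is a reading of the word "initially" above, not something verified here.

```java
import java.util.StringTokenizer;

// Demonstrates plain JDK behavior only; no Jstacs class is used here.
class CountTokensContrast {

    // Returns the token count before and after consuming one token.
    static int[] counts(String s, String delim) {
        StringTokenizer t = new StringTokenizer(s, delim);
        int before = t.countTokens(); // tokens remaining at the start
        t.nextToken();                // consume one token
        int after = t.countTokens();  // tokens remaining afterwards
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] c = counts("A C G T", " ");
        System.out.println(c[0] + " then " + c[1]); // prints "4 then 3"
    }
}
```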

      hasMoreElements

      public boolean hasMoreElements()
      Specified by:
      hasMoreElements in interface Enumeration<String>

      nextElement

      public String nextElement()
      Specified by:
      nextElement in interface Enumeration<String>

      filter

      public static int filter(String inFile,
                               char ignore,
                               AlphabetContainer con,
                               int minLength,
                               String outFile)
                        throws IOException
      This method allows the user to filter a file using a given alphabet container and a minimal sequence length. It is useful, for instance, for filtering data sets that contain ambiguous nucleotides and/or short sequences. Lines that cannot be parsed with respect to the alphabet container will be masked by the ignore char.

      Parameters:
      inFile - the input file
      ignore - the char for comment lines
      con - the alphabet container
      minLength - the minimal length of a sequence in the output file
      outFile - the output file
      Returns:
      the number of discarded lines
      Throws:
      IOException - if something went wrong with the file handling
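
The filtering idea can be sketched in plain Java. This is one possible reading of the description, not the Jstacs code: the class FilterSketch and the helper parsable are hypothetical, the alphabet container is replaced by a plain string of allowed symbols, and "masked" is assumed to mean that a rejected line is written to the output prefixed with the ignore char. Treating too-short lines the same way is a further assumption.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch; it does not use AlphabetContainer or any Jstacs class.
class FilterSketch {

    // Stand-in for the alphabet check: every symbol must occur in 'allowed'.
    static boolean parsable(String line, String allowed) {
        for (int i = 0; i < line.length(); i++) {
            if (allowed.indexOf(line.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    // Copies inFile to outFile and masks rejected lines with the ignore char.
    // Returns the number of masked (discarded) lines.
    static int filter(String inFile, char ignore, String allowed,
                      int minLength, String outFile) throws IOException {
        int discarded = 0;
        try (BufferedReader in = Files.newBufferedReader(Paths.get(inFile), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(outFile), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Lines that already start with the ignore char are comments.
                boolean comment = !line.isEmpty() && line.charAt(0) == ignore;
                if (!comment && (line.length() < minLength || !parsable(line, allowed))) {
                    out.write(ignore); // assumption: masking = commenting the line out
                    discarded++;
                }
                out.write(line);
                out.newLine();
            }
        }
        return discarded;
    }
}
```

Under these assumptions, filtering the lines `#header`, `ACGTACGT`, `ACGNNACG` and `ACG` with ignore char `#`, allowed symbols `ACGT` and minimal length 5 masks the last two lines and returns 2.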