de.jstacs.io
Class SymbolExtractor

java.lang.Object
  extended by de.jstacs.io.SymbolExtractor
All Implemented Interfaces:
Enumeration<String>

public class SymbolExtractor
extends Object
implements Enumeration<String>

This class extracts elements (symbols) from a given String, similar to a StringTokenizer.

The class offers two features that a StringTokenizer does not provide:

  1. Objects can be recycled: an existing instance can be given a new String to parse.
  2. It treats the delimiter "" like null, that is, as no delimiter. (In this case the individual characters of the String are returned.)
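
The two points above can be illustrated with a minimal, self-contained sketch. The class ReusableExtractorSketch is a hypothetical stand-in written only with java.util; it is not the Jstacs implementation. It assumes that a non-empty delimiter is treated as a set of delimiter characters (as StringTokenizer does), which this page does not state.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for illustration only, NOT de.jstacs.io.SymbolExtractor.
class ReusableExtractorSketch implements Enumeration<String> {
    private final String delim; // "" (or null) means: return single characters
    private String string = "";
    private int pos = 0;

    ReusableExtractorSketch(String delim) {
        this.delim = (delim == null) ? "" : delim;
    }

    // Recycles this object: resets the position and parses a new String.
    void setStringToBeParsed(String string) {
        this.string = string;
        this.pos = 0;
    }

    // Assumption: every character of delim counts as a delimiter.
    private boolean isDelim(char c) {
        return delim.indexOf(c) >= 0;
    }

    @Override
    public boolean hasMoreElements() {
        // Skip leading delimiters; with an empty delim nothing is skipped.
        while (pos < string.length() && isDelim(string.charAt(pos))) {
            pos++;
        }
        return pos < string.length();
    }

    @Override
    public String nextElement() {
        if (!hasMoreElements()) {
            throw new NoSuchElementException();
        }
        if (delim.isEmpty()) {
            return String.valueOf(string.charAt(pos++)); // one character per element
        }
        int start = pos;
        while (pos < string.length() && !isDelim(string.charAt(pos))) {
            pos++;
        }
        return string.substring(start, pos);
    }

    // Collects all remaining elements into a list.
    static List<String> drain(Enumeration<String> e) {
        List<String> out = new ArrayList<>();
        while (e.hasMoreElements()) {
            out.add(e.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        ReusableExtractorSketch byChar = new ReusableExtractorSketch("");
        byChar.setStringToBeParsed("ACGT");
        System.out.println(drain(byChar)); // prints [A, C, G, T]

        ReusableExtractorSketch byTab = new ReusableExtractorSketch("\t");
        byTab.setStringToBeParsed("a\tb\tc");
        System.out.println(drain(byTab)); // prints [a, b, c]
        byTab.setStringToBeParsed("x\ty"); // same object, new input
        System.out.println(drain(byTab)); // prints [x, y]
    }
}
```

Keeping the delimiter fixed per object and swapping only the input String is what makes reuse cheap when many lines share one format.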

Author:
Jens Keilwagen

Constructor Summary
SymbolExtractor(String delim)
          Creates a new SymbolExtractor using delim as delimiter.
SymbolExtractor(String string, String delim)
          Creates a new SymbolExtractor using delim as delimiter and string as the String to be parsed.
 
Method Summary
 int countElements()
          Counts the number of elements (symbols) that can be received initially.
static int filter(String inFile, char ignore, AlphabetContainer con, int minLength, String outFile)
          This method allows the user to filter the content of a File using a given AlphabetContainer and a minimal sequence length.
 boolean hasMoreElements()
           
 String nextElement()
           
 void setStringToBeParsed(String string)
          Sets a new String to be parsed.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SymbolExtractor

public SymbolExtractor(String delim)
Creates a new SymbolExtractor using delim as delimiter. setStringToBeParsed(String) must be called before any other method is invoked.

Parameters:
delim - the delimiter
See Also:
setStringToBeParsed(String), SymbolExtractor(String, String)

SymbolExtractor

public SymbolExtractor(String string,
                       String delim)
Creates a new SymbolExtractor using delim as delimiter and string as the String to be parsed.

Parameters:
string - the String to be parsed
delim - the delimiter

Method Detail

setStringToBeParsed

public void setStringToBeParsed(String string)
Sets a new String to be parsed.

Parameters:
string - the String to be parsed

countElements

public int countElements()
Counts the number of elements (symbols) that can be received initially.

Returns:
the number of elements (symbols) that can be received initially

hasMoreElements

public boolean hasMoreElements()
Specified by:
hasMoreElements in interface Enumeration<String>

nextElement

public String nextElement()
Specified by:
nextElement in interface Enumeration<String>
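
Because hasMoreElements() and nextElement() follow the java.util.Enumeration contract, any code written against Enumeration<String> can consume the extracted symbols. The sketch below shows the usual loop; the class name EnumerationDrainSketch and the helper toList are hypothetical, and Collections.enumeration is used as a stand-in source so that the example runs without Jstacs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper class for illustration; not part of Jstacs.
class EnumerationDrainSketch {

    // Standard Enumeration loop: test with hasMoreElements(), then fetch with nextElement().
    static List<String> toList(Enumeration<String> symbols) {
        List<String> out = new ArrayList<>();
        while (symbols.hasMoreElements()) {
            out.add(symbols.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for an extractor; any Enumeration<String> works here.
        Enumeration<String> standIn =
                Collections.enumeration(Arrays.asList("A", "C", "G", "T"));
        System.out.println(toList(standIn)); // prints [A, C, G, T]
    }
}
```

The JDK method Collections.list(Enumeration) performs the same conversion in a single call.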

filter

public static int filter(String inFile,
                         char ignore,
                         AlphabetContainer con,
                         int minLength,
                         String outFile)
                  throws IOException
This method allows the user to filter the content of a File using a given AlphabetContainer and a minimal sequence length. This method is useful to filter, for instance, data sets with ambiguous nucleotides and/or short sequences. Lines that cannot be parsed with respect to the AlphabetContainer are masked with the character ignore.

Parameters:
inFile - the path/name of the input File
ignore - the char for comment lines
con - the AlphabetContainer
minLength - the minimal length of a sequence in the output File
outFile - the path/name of the output File
Returns:
the number of discarded lines of the input File
Throws:
IOException - if something goes wrong while handling the Files
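
The filtering idea described for this method can be sketched on in-memory lines. This is a hypothetical illustration, not the Jstacs code: LineFilterSketch, its filter signature, and the use of a plain String of allowed characters in place of an AlphabetContainer are all assumptions. It further assumes that lines already starting with ignore are copied unchanged, and that both unparsable and too-short lines are masked by prefixing ignore and counted as discarded; the description above does not spell out these details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch for illustration only; not the Jstacs implementation.
class LineFilterSketch {

    // Copies lines from in to out, masking invalid lines with the ignore character.
    // Returns the number of masked (discarded) lines.
    static int filter(List<String> in, char ignore, String alphabet,
                      int minLength, List<String> out) {
        int discarded = 0;
        for (String line : in) {
            boolean isComment = !line.isEmpty() && line.charAt(0) == ignore;
            if (isComment || isValid(line, alphabet, minLength)) {
                out.add(line);            // keep comments and valid sequences as they are
            } else {
                out.add(ignore + line);   // assumption: masking means prefixing with ignore
                discarded++;
            }
        }
        return discarded;
    }

    // A line is valid if it is long enough and uses only characters of the alphabet.
    static boolean isValid(String line, String alphabet, int minLength) {
        if (line.length() < minLength) {
            return false;
        }
        for (int i = 0; i < line.length(); i++) {
            if (alphabet.indexOf(line.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> in = Arrays.asList("#header", "ACGTACGT", "ACGNACGT", "ACG");
        List<String> out = new ArrayList<>();
        int discarded = filter(in, '#', "ACGT", 5, out);
        System.out.println(discarded); // prints 2
        System.out.println(out);       // prints [#header, ACGTACGT, #ACGNACGT, #ACG]
    }
}
```

In this sketch the ambiguous nucleotide N and the three-symbol line are both masked, which mirrors the two use cases named in the description (ambiguous nucleotides and short sequences).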