de.jstacs.data
Class Sequence

java.lang.Object
  extended by de.jstacs.data.Sequence
All Implemented Interfaces:
Comparable<Sequence>
Direct Known Subclasses:
ArbitrarySequence, DiscreteSequence, RecursiveSequence

public abstract class Sequence
extends Object
implements Comparable<Sequence>

This is the main class for all sequences. All sequences are immutable.

Author:
Jens Keilwagen

Nested Class Summary
protected static class Sequence.CompositeSequence
          The class handles composite Sequences.
protected static class Sequence.SubSequence
          This class handles subsequences.
 
Field Summary
protected  AlphabetContainer alphabetCon
          The underlying alphabets.
protected  SequenceAnnotation[] annotation
          The annotation of the Sequence.
protected  Sequence rc
          The pointer to the reverse complement of the Sequence.
 
Constructor Summary
protected Sequence(AlphabetContainer container, SequenceAnnotation[] annotation)
          Creates a new Sequence with the given AlphabetContainer and the given annotation, but without the content.
 
Method Summary
 Sequence annotate(boolean add, SequenceAnnotation... annotation)
          This method allows appending annotation to a Sequence.
 int compareTo(Sequence s)
           
 Sequence complement()
          This method returns a new instance of Sequence containing the complement of the current Sequence.
 Sequence complement(int start, int end)
          This method returns a new instance of Sequence containing the complement of a part of the current Sequence.
abstract  double continuousVal(int pos)
          Returns the continuous value at position pos of the Sequence.
static Sequence create(AlphabetContainer con, SequenceAnnotation[] annotation, String sequence, String delim)
          Creates a Sequence from a String based on the given AlphabetContainer using the given delimiter delim and some annotation for the Sequence.
static Sequence create(AlphabetContainer con, String sequence)
          Creates a Sequence from a String based on the given AlphabetContainer using the standard delimiter for this AlphabetContainer.
static Sequence create(AlphabetContainer con, String sequence, String delim)
          Creates a Sequence from a String based on the given AlphabetContainer using the given delimiter delim.
abstract  int discreteVal(int pos)
          Returns the discrete value at position pos of the Sequence.
 boolean equals(Object o)
           
protected abstract  Sequence flatCloneWithoutAnnotation()
          Works in analogy to Object.clone(), but does not clone the annotation.
 AlphabetContainer getAlphabetContainer()
          Returns the alphabets, i.e. the AlphabetContainer, used in this Sequence.
 SequenceAnnotation[] getAnnotation()
          Returns the annotation of the Sequence.
 Sequence getCompositeSequence(AlphabetContainer abc, int[] starts, int[] lengths)
          This method should be used if one wants to create a Sample of Sequence.CompositeSequences.
 Sequence getCompositeSequence(int[] starts, int[] lengths)
          This is a very efficient way to create a Sequence.CompositeSequence for sequences with a simple AlphabetContainer.
abstract  int getLength()
          Returns the length of the Sequence.
 Sequence getSubSequence(AlphabetContainer abc, int start)
          This method should be used if one wants to create a Sample of subsequences of defined length.
 Sequence getSubSequence(AlphabetContainer abc, int start, int length)
          This method should be used if one wants to create a Sample of subsequences of defined length.
 Sequence getSubSequence(int start)
          This is a very efficient way to create a subsequence/suffix for Sequences with a simple AlphabetContainer.
 Sequence getSubSequence(int start, int length)
          This is a very efficient way to create a subsequence of defined length for Sequences with a simple AlphabetContainer.
 int hashCode()
           
 Sequence reverse()
          This method returns a new instance of Sequence containing the reverse of the current Sequence.
 Sequence reverse(int start, int end)
          This method returns a new instance of Sequence containing the reverse of a part of the current Sequence.
 Sequence reverseComplement()
          This method returns a new instance of Sequence containing the reverse complement of the current Sequence.
 Sequence reverseComplement(int start, int end)
          This method returns a new instance of Sequence containing the reverse complement of a part of the current Sequence.
protected  int toDiscrete(int pos, double content)
          This method converts a continuous value at position pos of the Sequence into a discrete one.
 String toString()
          Returns a String representation of the Sequence (normally the Sequence in its original Alphabet).
 String toString(int start)
          Returns a String representation of the Sequence (normally the Sequence in its original Alphabet) beginning at position start with a default delimiter as separator.
 String toString(int start, int end)
          Returns a String representation of the Sequence (normally the Sequence in its original Alphabet) between start and end with a default delimiter as separator.
 String toString(String delim, int start, int end)
          Returns a String representation of the Sequence (normally the Sequence in its original Alphabet) between start and end with delim as separator.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

alphabetCon

protected AlphabetContainer alphabetCon
The underlying alphabets.


rc

protected Sequence rc
The pointer to the reverse complement of the Sequence.


annotation

protected SequenceAnnotation[] annotation
The annotation of the Sequence.

Constructor Detail

Sequence

protected Sequence(AlphabetContainer container,
                   SequenceAnnotation[] annotation)
Creates a new Sequence with the given AlphabetContainer and the given annotation, but without the content. The content has to be set by the constructor of the extending class.

Parameters:
container - the AlphabetContainer of the Sequence
annotation - the annotation of the Sequence

Method Detail

continuousVal

public abstract double continuousVal(int pos)
Returns the continuous value at position pos of the Sequence.

Parameters:
pos - the position in the Sequence
Returns:
the continuous value at position pos of the Sequence

discreteVal

public abstract int discreteVal(int pos)
Returns the discrete value at position pos of the Sequence.

Parameters:
pos - the position in the Sequence
Returns:
the discrete value at position pos of the Sequence

equals

public boolean equals(Object o)
Overrides:
equals in class Object

getAlphabetContainer

public final AlphabetContainer getAlphabetContainer()
Returns the alphabets, i.e. the AlphabetContainer, used in this Sequence.

Returns:
the alphabets, i.e. the AlphabetContainer, used in this Sequence

getAnnotation

public final SequenceAnnotation[] getAnnotation()
Returns the annotation of the Sequence.

Returns:
the annotation of the Sequence (can be null)

getCompositeSequence

public Sequence getCompositeSequence(AlphabetContainer abc,
                                     int[] starts,
                                     int[] lengths)
This method should be used if one wants to create a Sample of Sequence.CompositeSequences. It lets you create a Sample in which every Sequence.CompositeSequence shares the same AlphabetContainer instance.

Internally, it is checked that the AlphabetContainer matches the one of the Sequence.

Parameters:
abc - the new AlphabetContainer
starts - the start positions of the chunks
lengths - the length of each chunk
Returns:
the Sequence.CompositeSequence
See Also:
CompositeSequence#CompositeSequence(de.jstacs.data.AlphabetContainer, de.jstacs.data.Sequence, int[], int[])

getCompositeSequence

public Sequence getCompositeSequence(int[] starts,
                                     int[] lengths)
This is a very efficient way to create a Sequence.CompositeSequence for sequences with a simple AlphabetContainer.

Parameters:
starts - the start positions of the chunks
lengths - the length of each chunk
Returns:
the Sequence.CompositeSequence
See Also:
CompositeSequence#CompositeSequence(de.jstacs.data.Sequence, int[], int[])
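
The role of starts and lengths can be pictured with a small stand-in on plain Strings. This is a minimal sketch, not Jstacs code: the class CompositeSketch and its method composite are hypothetical, and it assumes 0-based start positions and that the chunks are joined in the order given.

```java
public class CompositeSketch {

    // Hypothetical stand-in: joins the chunks [starts[i], starts[i] + lengths[i])
    // of a plain String in the given order. Assumes 0-based positions.
    static String composite(String symbols, int[] starts, int[] lengths) {
        if (starts.length != lengths.length) {
            throw new IllegalArgumentException("starts and lengths must have the same size");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < starts.length; i++) {
            sb.append(symbols, starts[i], starts[i] + lengths[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // chunk 1: positions 0..1 ("TA"), chunk 2: positions 5..7 ("CGG")
        System.out.println(composite("TAATACGG", new int[]{0, 5}, new int[]{2, 3})); // prints TACGG
    }
}
```

In this sketch each chunk is described by one entry of starts and the matching entry of lengths, so both arrays must have the same size.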

getSubSequence

public final Sequence getSubSequence(AlphabetContainer abc,
                                     int start)
This method should be used if one wants to create a Sample of subsequences of defined length. It lets you create a Sample in which every Sequence shares the same AlphabetContainer instance.

Internally, it is checked that the AlphabetContainer matches the one of the subsequence.

Parameters:
abc - the new AlphabetContainer
start - the index of the start position
Returns:
the subsequence
See Also:
getSubSequence(de.jstacs.data.AlphabetContainer, int, int)

getSubSequence

public Sequence getSubSequence(AlphabetContainer abc,
                               int start,
                               int length)
This method should be used if one wants to create a Sample of subsequences of defined length. It lets you create a Sample in which every Sequence shares the same AlphabetContainer instance.

Internally, it is checked that the AlphabetContainer matches the one of the subsequence.

Parameters:
abc - the new AlphabetContainer
start - the index of the start position
length - the length of the new Sequence
Returns:
the subsequence
See Also:
SubSequence#SubSequence(de.jstacs.data.AlphabetContainer, de.jstacs.data.Sequence, int, int)

getSubSequence

public final Sequence getSubSequence(int start)
This is a very efficient way to create a subsequence/suffix for Sequences with a simple AlphabetContainer.

Parameters:
start - the index of the start position
Returns:
the subsequence
See Also:
getSubSequence(int, int)

getSubSequence

public Sequence getSubSequence(int start,
                               int length)
This is a very efficient way to create a subsequence of defined length for Sequences with a simple AlphabetContainer.

Parameters:
start - the index of the start position
length - the length of the new Sequence
Returns:
the subsequence
See Also:
SubSequence#SubSequence(Sequence, int, int)
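
The relation between the start/length variant and the suffix variant can be sketched on plain Strings. This is only an illustration under stated assumptions: the class SubSequenceSketch and its methods are hypothetical stand-ins rather than Jstacs code, positions are assumed to be 0-based, and the suffix variant is assumed to cover everything from start to the end of the sequence.

```java
public class SubSequenceSketch {

    // Hypothetical stand-in for getSubSequence(int start, int length):
    // returns length symbols beginning at start (0-based, assumed).
    static String subSequence(String symbols, int start, int length) {
        return symbols.substring(start, start + length);
    }

    // Hypothetical stand-in for getSubSequence(int start):
    // the suffix is the subsequence from start up to the end.
    static String subSequence(String symbols, int start) {
        return subSequence(symbols, start, symbols.length() - start);
    }

    public static void main(String[] args) {
        System.out.println(subSequence("TAATA", 1, 3)); // prints AAT
        System.out.println(subSequence("TAATA", 2));    // prints ATA
    }
}
```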

annotate

public Sequence annotate(boolean add,
                         SequenceAnnotation... annotation)
This method allows appending annotation to a Sequence.

Parameters:
add - indicates whether the new annotation is added to the existing annotation or not
annotation - the new annotation
Returns:
the new annotated Sequence
See Also:
flatCloneWithoutAnnotation()

flatCloneWithoutAnnotation

protected abstract Sequence flatCloneWithoutAnnotation()
Works in analogy to Object.clone(), but does not clone the annotation. This method is used in annotate(boolean, SequenceAnnotation...).

Returns:
the cloned Sequence without annotation

getLength

public abstract int getLength()
Returns the length of the Sequence.

Returns:
the length of the Sequence

toString

public String toString()
Returns a String representation of the Sequence (normally the Sequence in its original Alphabet).

Overrides:
toString in class Object
Returns:
the Sequence as String
See Also:
toString(String, int, int)

toString

public String toString(int start)
Returns a String representation of the Sequence (normally the Sequence in its original Alphabet) beginning at position start with a default delimiter as separator.

Parameters:
start - the start index (inclusive)
Returns:
the Sequence as String
See Also:
toString(String, int, int)

toString

public String toString(int start,
                       int end)
Returns a String representation of the Sequence (normally the Sequence in its original Alphabet) between start and end with a default delimiter as separator.

Parameters:
start - the start index (inclusive)
end - the end index (exclusive)
Returns:
the Sequence as String
See Also:
toString(String, int, int)

compareTo

public int compareTo(Sequence s)
Specified by:
compareTo in interface Comparable<Sequence>

toDiscrete

protected int toDiscrete(int pos,
                         double content)
This method converts a continuous value at position pos of the Sequence into a discrete one.

Parameters:
pos - the position in the Sequence
content - the value at this position
Returns:
the discrete value for this position
See Also:
AlphabetContainer.toDiscrete(int, double)

toString

public String toString(String delim,
                       int start,
                       int end)
Returns a String representation of the Sequence (normally the Sequence in its original Alphabet) between start and end with delim as separator.

Parameters:
delim - the delimiter/separator
start - the start index (inclusive)
end - the end index (exclusive)
Returns:
the Sequence as String
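
The inclusive start, exclusive end and the delimiter can be illustrated with a small stand-in. This is a hedged sketch, not the Jstacs implementation: the class ToStringSketch and its method join are hypothetical, the symbols are held in a plain String array, and 0-based positions are assumed.

```java
public class ToStringSketch {

    // Hypothetical stand-in: joins the symbols at positions start (inclusive)
    // to end (exclusive) with delim between neighbouring symbols.
    static String join(String[] symbols, String delim, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end; i++) {
            if (i > start) {
                sb.append(delim); // no delimiter before the first symbol
            }
            sb.append(symbols[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] symbols = {"T", "A", "A", "T", "A"};
        System.out.println(join(symbols, "-", 1, 4)); // prints A-A-T
        System.out.println(join(symbols, "", 0, 5));  // prints TAATA
    }
}
```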

create

public static Sequence create(AlphabetContainer con,
                              String sequence)
                       throws WrongAlphabetException,
                              IllegalArgumentException
Creates a Sequence from a String based on the given AlphabetContainer using the standard delimiter for this AlphabetContainer.

Parameters:
con - the AlphabetContainer
sequence - the String containing the Sequence
Returns:
a new Sequence instance
Throws:
WrongAlphabetException - if sequence is not defined over con
IllegalArgumentException - if the delimiter is empty and the AlphabetContainer is not discrete
See Also:
create(AlphabetContainer, String, String)

create

public static Sequence create(AlphabetContainer con,
                              String sequence,
                              String delim)
                       throws WrongAlphabetException,
                              IllegalArgumentException
Creates a Sequence from a String based on the given AlphabetContainer using the given delimiter delim.

Parameters:
con - the AlphabetContainer
sequence - the String containing the Sequence
delim - the given delimiter
Returns:
a new Sequence instance
Throws:
WrongAlphabetException - if sequence is not defined over con
IllegalArgumentException - if the delimiter is empty and the AlphabetContainer is not discrete
See Also:
create(AlphabetContainer, SequenceAnnotation[], String, String)

create

public static Sequence create(AlphabetContainer con,
                              SequenceAnnotation[] annotation,
                              String sequence,
                              String delim)
                       throws WrongAlphabetException,
                              IllegalArgumentException
Creates a Sequence from a String based on the given AlphabetContainer using the given delimiter delim and some annotation for the Sequence.

Parameters:
con - the AlphabetContainer
annotation - the annotation for the Sequence
sequence - the String containing the Sequence
delim - the given delimiter
Returns:
a new Sequence instance
Throws:
WrongAlphabetException - if sequence is not defined over con
IllegalArgumentException - if the delimiter is empty and the AlphabetContainer is not discrete

reverse

public final Sequence reverse()
                       throws OperationNotSupportedException
This method returns a new instance of Sequence containing the reverse of the current Sequence.
For instance, invoking this method on the sequence "TAATA" returns "ATAAT".

Returns:
the reverse Sequence
Throws:
OperationNotSupportedException - if the current Sequence is based on an AlphabetContainer that is not simple
See Also:
reverse(int, int)
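
The "TAATA" example can be reproduced with a plain-String stand-in. This is a minimal sketch, not Jstacs code: the class ReverseSketch and its method reverse are hypothetical, and the sketch does not model the AlphabetContainer check that leads to an OperationNotSupportedException.

```java
public class ReverseSketch {

    // Hypothetical stand-in: returns the symbols of a plain String in reverse order.
    // The input String is left untouched, mirroring the immutability of sequences.
    static String reverse(String symbols) {
        return new StringBuilder(symbols).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("TAATA")); // prints ATAAT
    }
}
```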

reverse

public Sequence reverse(int start,
                        int end)
                 throws OperationNotSupportedException
This method returns a new instance of Sequence containing the reverse of a part of the current Sequence.

Parameters:
start - the start position (inclusive) in the original Sequence
end - the end position (exclusive) in the original Sequence
Returns:
the reverse Sequence of the part
Throws:
OperationNotSupportedException - if the current Sequence is based on an AlphabetContainer that is not simple

complement

public Sequence complement()
                    throws OperationNotSupportedException
This method returns a new instance of Sequence containing the complement of the current Sequence.
For instance, invoking this method on the sequence "TAATA" with an AlphabetContainer on DNAAlphabet returns "ATTAT".

Returns:
the complementary Sequence
Throws:
OperationNotSupportedException - if the current Sequence is not based on a ComplementableDiscreteAlphabet
See Also:
ComplementableDiscreteAlphabet, complement(int, int)
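
The DNA example can be checked with a small stand-in that hard-codes the Watson-Crick pairs A/T and C/G. This is only a sketch under that assumption: the class ComplementSketch and its methods are hypothetical and do not use ComplementableDiscreteAlphabet or any other Jstacs class.

```java
public class ComplementSketch {

    // Hypothetical stand-in: complement of a single DNA symbol (A<->T, C<->G).
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            default:
                throw new IllegalArgumentException("not a DNA symbol: " + base);
        }
    }

    // Complements every symbol and keeps the original order.
    static String complement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = 0; i < dna.length(); i++) {
            sb.append(complement(dna.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(complement("TAATA")); // prints ATTAT
    }
}
```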

complement

public Sequence complement(int start,
                           int end)
                    throws OperationNotSupportedException
This method returns a new instance of Sequence containing the complement of a part of the current Sequence.
For instance, for the sequence "TAATA" with an AlphabetContainer on DNAAlphabet, a range covering the whole sequence yields "ATTAT".

Parameters:
start - the start position (inclusive) in the original Sequence
end - the end position (exclusive) in the original Sequence
Returns:
the complementary Sequence of the part
Throws:
OperationNotSupportedException - if the current Sequence is not based on a ComplementableDiscreteAlphabet
See Also:
ComplementableDiscreteAlphabet

reverseComplement

public Sequence reverseComplement()
                           throws OperationNotSupportedException
This method returns a new instance of Sequence containing the reverse complement of the current Sequence. For more details see the methods reverse() and complement().

Returns:
the reverse complementary Sequence
Throws:
OperationNotSupportedException - if the current Sequence is not discrete and simple (not based on a ComplementableDiscreteAlphabet)
See Also:
reverse(), complement(), reverseComplement(int, int), ComplementableDiscreteAlphabet
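
How the two operations combine can be sketched on plain Strings. This is a hedged illustration, not the Jstacs implementation: the class ReverseComplementSketch and its method are hypothetical, and the DNA pairs A/T and C/G are hard-coded as an assumption.

```java
public class ReverseComplementSketch {

    // Hypothetical stand-in: walks the String from its last symbol to its first
    // and appends the complement of each symbol (A<->T, C<->G).
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            char base = dna.charAt(i);
            switch (base) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:
                    throw new IllegalArgumentException("not a DNA symbol: " + base);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseComplement("TAATA")); // prints TATTA
        // Applying the operation twice gives back the original symbols.
        System.out.println(reverseComplement(reverseComplement("TAATA"))); // prints TAATA
    }
}
```

Because the operation undoes itself in this sketch, a result computed once could be stored and reused, which is one plausible reading of the rc field described above.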

reverseComplement

public Sequence reverseComplement(int start,
                                  int end)
                           throws OperationNotSupportedException
This method returns a new instance of Sequence containing the reverse complement of a part of the current Sequence. For more details see the methods reverse() and complement().

Parameters:
start - the start position (inclusive) in the original Sequence
end - the end position (exclusive) in the original Sequence
Returns:
the reverse complementary Sequence of the part
Throws:
OperationNotSupportedException - if the current Sequence is not discrete and simple (not based on a ComplementableDiscreteAlphabet)
See Also:
reverse(), complement(), ComplementableDiscreteAlphabet

hashCode

public int hashCode()
Overrides:
hashCode in class Object