de.jstacs.data
Class Sequence

java.lang.Object
  extended by de.jstacs.data.Sequence
All Implemented Interfaces:
Comparable<Sequence>
Direct Known Subclasses:
ArbitrarySequence, DiscreteSequence, RecursiveSequence

public abstract class Sequence
extends Object
implements Comparable<Sequence>

This is the main class for all sequences. All sequences are immutable.

Author:
Jens Keilwagen

Nested Class Summary
protected static class Sequence.CompositeSequence
          This class handles composite sequences.
protected static class Sequence.SubSequence
          This class handles subsequences.
 
Field Summary
protected  AlphabetContainer alphabetCon
          the underlying alphabets
protected  SequenceAnnotation[] annotation
          The annotation of the sequence.
protected  Sequence rc
          The pointer to the reverse complement
 
Constructor Summary
protected Sequence(AlphabetContainer container, SequenceAnnotation[] annotation)
          This constructor creates an instance with the AlphabetContainer and the annotation, but without the content.
 
Method Summary
 Sequence annotate(boolean add, SequenceAnnotation... annotation)
          This method allows annotation to be appended to a sequence.
 int compareTo(Sequence s)
           
 Sequence complement()
          This method returns a new sequence instance containing the complement of the current sequence.
 Sequence complement(int start, int end)
          This method returns a new sequence instance containing the complement of a part of the current sequence.
abstract  double continuousVal(int pos)
          Returns the continuous value of position pos.
static Sequence create(AlphabetContainer con, SequenceAnnotation[] annotation, String sequence, String delim)
          Creates a sequence from a string based on the given AlphabetContainer using the given delimiter.
static Sequence create(AlphabetContainer con, String sequence)
          Creates a sequence from a string based on the given AlphabetContainer using the standard delimiter for this AlphabetContainer.
static Sequence create(AlphabetContainer con, String sequence, String delim)
          Creates a sequence from a string based on the given AlphabetContainer using the given delimiter.
abstract  int discreteVal(int pos)
          Returns the discrete value of position pos.
 boolean equals(Object o)
           
protected abstract  Sequence flatCloneWithoutAnnotation()
          Works in analogy to Object.clone(), but does not clone the annotation.
 AlphabetContainer getAlphabetContainer()
          Returns the alphabets used in this sequence.
 SequenceAnnotation[] getAnnotation()
          Returns the annotation of the sequence.
 Sequence getCompositeSequence(AlphabetContainer abc, int[] starts, int[] lengths)
          This method should be used if one wants to create a sample of composite sequences.
 Sequence getCompositeSequence(int[] starts, int[] lengths)
          This is a very efficient way to create a composite sequence for sequences with a simple AlphabetContainer.
abstract  int getLength()
          Returns the length of the sequence.
 Sequence getSubSequence(AlphabetContainer abc, int start)
          This method should be used if one wants to create a sample of subsequences of defined length.
 Sequence getSubSequence(AlphabetContainer abc, int start, int length)
          This method should be used if one wants to create a sample of subsequences of defined length.
 Sequence getSubSequence(int start)
          This is a very efficient way to create a subsequence/suffix for sequences with a simple AlphabetContainer.
 Sequence getSubSequence(int start, int length)
          This is a very efficient way to create a subsequence of defined length for sequences with a simple AlphabetContainer.
 int hashCode()
           
 Sequence reverse()
          This method returns a new sequence instance containing the reverse of the current sequence.
 Sequence reverse(int start, int end)
          This method returns a new sequence instance containing the reverse of a part of the current sequence.
 Sequence reverseComplement()
          This method returns a new sequence instance containing the reverse complement of the current sequence.
 Sequence reverseComplement(int start, int end)
          This method returns a new sequence instance containing the reverse complement of a part of the current sequence.
protected  int toDiscrete(int pos, double content)
          This method converts a continuous value into a discrete one.
 String toString()
          Returns a String representation of the sequence (normally the sequence in its original alphabet).
 String toString(int start)
          Returns a string representation of the sequence (normally the sequence in its original alphabet) with default delimiter as separator.
 String toString(int start, int end)
          Returns a string representation of the sequence (normally the sequence in its original alphabet) with default delimiter as separator.
 String toString(String delim, int start, int end)
          Returns a string representation of the sequence (normally the sequence in its original alphabet) with delim as separator.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

alphabetCon

protected AlphabetContainer alphabetCon
the underlying alphabets


rc

protected Sequence rc
The pointer to the reverse complement


annotation

protected SequenceAnnotation[] annotation
The annotation of the sequence.

Constructor Detail

Sequence

protected Sequence(AlphabetContainer container,
                   SequenceAnnotation[] annotation)
This constructor creates an instance with the AlphabetContainer and the annotation, but without the content. The content has to be set by the constructor of the extending class.

Parameters:
container - the AlphabetContainer of the sequence
annotation - the annotation of the sequence
Method Detail

continuousVal

public abstract double continuousVal(int pos)
Returns the continuous value of position pos.

Parameters:
pos - the position
Returns:
the continuous value of position pos

discreteVal

public abstract int discreteVal(int pos)
Returns the discrete value of position pos.

Parameters:
pos - the position
Returns:
the discrete value of position pos

equals

public boolean equals(Object o)
Overrides:
equals in class Object

getAlphabetContainer

public final AlphabetContainer getAlphabetContainer()
Returns the alphabets used in this sequence.

Returns:
the alphabets used in this sequence

getAnnotation

public final SequenceAnnotation[] getAnnotation()
Returns the annotation of the sequence.

Returns:
the annotation of the sequence (can be null).

getCompositeSequence

public Sequence getCompositeSequence(AlphabetContainer abc,
                                     int[] starts,
                                     int[] lengths)
This method should be used if one wants to create a sample of composite sequences. It makes it possible to create a Sample in which every sequence has the same AlphabetContainer instance.

Internally, the method checks that the AlphabetContainer matches the subsequence.

Parameters:
abc - the new AlphabetContainer
starts - the start positions of the chunks
lengths - the length of each chunk
Returns:
the composite sequence
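
As a rough illustration of how starts and lengths describe the chunks, the following stand-alone sketch concatenates the selected chunks of a plain String. It does not use any Jstacs classes: the names CompositeSketch and composite are hypothetical, and the real composite sequence may well be a view on the original content rather than a copy.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class CompositeSketch {

    // Concatenates the chunks [starts[i], starts[i] + lengths[i]) of source.
    static String composite(String source, int[] starts, int[] lengths) {
        if (starts.length != lengths.length) {
            throw new IllegalArgumentException("starts and lengths must have the same size");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < starts.length; i++) {
            sb.append(source, starts[i], starts[i] + lengths[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Chunks "AC" (start 0, length 2) and "TT" (start 4, length 2).
        System.out.println(composite("ACGGTT", new int[] {0, 4}, new int[] {2, 2})); // prints "ACTT"
    }
}
```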

getCompositeSequence

public Sequence getCompositeSequence(int[] starts,
                                     int[] lengths)
This is a very efficient way to create a composite sequence for sequences with a simple AlphabetContainer.

Parameters:
starts - the start positions of the chunks
lengths - the length of each chunk
Returns:
the composite sequence

getSubSequence

public final Sequence getSubSequence(AlphabetContainer abc,
                                     int start)
This method should be used if one wants to create a sample of subsequences of defined length. It makes it possible to create a Sample in which every sequence has the same AlphabetContainer instance.

Internally, the method checks that the AlphabetContainer matches the subsequence.

Parameters:
abc - the new AlphabetContainer
start - the index of the start position
Returns:
the subsequence

getSubSequence

public Sequence getSubSequence(AlphabetContainer abc,
                               int start,
                               int length)
This method should be used if one wants to create a sample of subsequences of defined length. It makes it possible to create a Sample in which every sequence has the same AlphabetContainer instance.

Internally, the method checks that the AlphabetContainer matches the subsequence.

Parameters:
abc - the new AlphabetContainer
start - the index of the start position
length - the length of the new sequence
Returns:
the subsequence

getSubSequence

public final Sequence getSubSequence(int start)
This is a very efficient way to create a subsequence/suffix for sequences with a simple AlphabetContainer.

Parameters:
start - the index of the start position
Returns:
the subsequence

getSubSequence

public Sequence getSubSequence(int start,
                               int length)
This is a very efficient way to create a subsequence of defined length for sequences with a simple AlphabetContainer.

Parameters:
start - the index of the start position
length - the length of the new sequence
Returns:
the subsequence
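
One plausible reason such a subsequence can be cheap to create is that an immutable sequence can hand out a view (offset plus length) on shared content instead of copying it. The sketch below illustrates that idea under this assumption only. SubSequenceSketch and its members are hypothetical names, and nothing here is taken from the Jstacs implementation.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class SubSequenceSketch {
    private final int[] content; // shared between views, never modified
    private final int offset;    // first position of this view in content
    private final int length;    // number of positions in this view

    SubSequenceSketch(int[] content) {
        // Copy once so that later changes to the caller's array cannot leak in.
        this(content.clone(), 0, content.length);
    }

    private SubSequenceSketch(int[] content, int offset, int length) {
        this.content = content;
        this.offset = offset;
        this.length = length;
    }

    int getLength() {
        return length;
    }

    int discreteVal(int pos) {
        if (pos < 0 || pos >= length) {
            throw new IndexOutOfBoundsException("pos: " + pos);
        }
        return content[offset + pos];
    }

    // Creates a view of the given length; no content is copied.
    SubSequenceSketch getSubSequence(int start, int length) {
        if (start < 0 || length < 0 || start + length > this.length) {
            throw new IllegalArgumentException("invalid start or length");
        }
        return new SubSequenceSketch(content, offset + start, length);
    }

    // Creates a suffix view starting at start.
    SubSequenceSketch getSubSequence(int start) {
        return getSubSequence(start, this.length - start);
    }
}
```

Because the content is never modified, a view stays valid for as long as it is referenced, which is what makes sharing safe.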

annotate

public Sequence annotate(boolean add,
                         SequenceAnnotation... annotation)
This method allows annotation to be appended to a sequence.

Parameters:
add - whether to add the new annotation to the existing or not
annotation - the new annotation
Returns:
the new annotated sequence
See Also:
Sequence.CompositeSequence.flatCloneWithoutAnnotation()
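
Since sequences are immutable, annotating one yields a new instance. The following sketch shows how the add flag could be interpreted: true appends to the existing annotation, false discards it in favour of the new one. This reading of the flag is an assumption. The sketch uses plain strings in place of SequenceAnnotation, and AnnotateSketch is a hypothetical name.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class AnnotateSketch {
    private final String content;
    private final String[] annotation; // stand-in for SequenceAnnotation[], may be null

    AnnotateSketch(String content, String[] annotation) {
        this.content = content;
        this.annotation = annotation == null ? null : annotation.clone();
    }

    String[] getAnnotation() {
        return annotation == null ? null : annotation.clone();
    }

    // Returns a new instance; the current instance is left unchanged.
    AnnotateSketch annotate(boolean add, String... newAnnotation) {
        String[] merged;
        if (add && annotation != null) {
            // Keep the existing entries and append the new ones.
            merged = java.util.Arrays.copyOf(annotation, annotation.length + newAnnotation.length);
            System.arraycopy(newAnnotation, 0, merged, annotation.length, newAnnotation.length);
        } else {
            // Assumed behaviour: use only the new annotation.
            merged = newAnnotation;
        }
        return new AnnotateSketch(content, merged);
    }
}
```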

flatCloneWithoutAnnotation

protected abstract Sequence flatCloneWithoutAnnotation()
Works in analogy to Object.clone(), but does not clone the annotation. This method is used in annotate(boolean, SequenceAnnotation...).

Returns:
the cloned sequence without annotation

getLength

public abstract int getLength()
Returns the length of the sequence.

Returns:
the length

toString

public String toString()
Returns a String representation of the sequence (normally the sequence in its original alphabet).

Overrides:
toString in class Object
Returns:
the sequence as String

toString

public String toString(int start)
Returns a string representation of the sequence (normally the sequence in its original alphabet) with default delimiter as separator.

Parameters:
start - the start index (inclusive)
Returns:
the sequence as String
See Also:
toString(String, int, int)

toString

public String toString(int start,
                       int end)
Returns a string representation of the sequence (normally the sequence in its original alphabet) with default delimiter as separator.

Parameters:
start - the start index (inclusive)
end - the end index (exclusive)
Returns:
the sequence as String
See Also:
toString(String, int, int)

compareTo

public int compareTo(Sequence s)
Specified by:
compareTo in interface Comparable<Sequence>

toDiscrete

protected int toDiscrete(int pos,
                         double content)
This method converts a continuous value into a discrete one.

Parameters:
pos - the position
content - the value at this position
Returns:
the discrete value for this position

toString

public String toString(String delim,
                       int start,
                       int end)
Returns a string representation of the sequence (normally the sequence in its original alphabet) with delim as separator.

Parameters:
delim - the delimiter/separator
start - the start index (inclusive)
end - the end index (exclusive)
Returns:
the sequence as String
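
To make the inclusive start, exclusive end and delimiter semantics concrete, here is a small stand-alone sketch that joins already decoded symbols. DelimiterSketch and join are hypothetical names, and the sketch does not reflect how Jstacs decodes symbols from its alphabets.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class DelimiterSketch {

    // Joins symbols[start] .. symbols[end - 1] with delim between neighbours.
    static String join(String[] symbols, String delim, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end; i++) {
            if (i > start) {
                sb.append(delim); // delimiter only between symbols, not at the ends
            }
            sb.append(symbols[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] symbols = {"A", "C", "G", "T"};
        System.out.println(join(symbols, "-", 1, 3)); // prints "C-G"
        System.out.println(join(symbols, "", 0, 4));  // prints "ACGT"
    }
}
```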

create

public static Sequence create(AlphabetContainer con,
                              String sequence)
                       throws WrongAlphabetException,
                              IllegalArgumentException
Creates a sequence from a string based on the given AlphabetContainer using the standard delimiter for this AlphabetContainer.

Parameters:
con - the AlphabetContainer
sequence - the string containing the sequence
Returns:
a new sequence instance
Throws:
WrongAlphabetException - if the sequence is not defined over the given AlphabetContainer
IllegalArgumentException - if the delimiter is empty and the AlphabetContainer is not discrete

create

public static Sequence create(AlphabetContainer con,
                              String sequence,
                              String delim)
                       throws WrongAlphabetException,
                              IllegalArgumentException
Creates a sequence from a string based on the given AlphabetContainer using the given delimiter.

Parameters:
con - the AlphabetContainer
sequence - the string containing the sequence
delim - the delimiter
Returns:
a new sequence instance
Throws:
WrongAlphabetException - if the sequence is not defined over the given AlphabetContainer
IllegalArgumentException - if the delimiter is empty and the AlphabetContainer is not discrete

create

public static Sequence create(AlphabetContainer con,
                              SequenceAnnotation[] annotation,
                              String sequence,
                              String delim)
                       throws WrongAlphabetException,
                              IllegalArgumentException
Creates a sequence from a string based on the given AlphabetContainer using the given delimiter.

Parameters:
con - the AlphabetContainer
annotation - the annotation for the sequence
sequence - the string containing the sequence
delim - the delimiter
Returns:
a new sequence instance
Throws:
WrongAlphabetException - if the sequence is not defined over the given AlphabetContainer
IllegalArgumentException - if the delimiter is empty and the AlphabetContainer is not discrete
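
The role of the delimiter when a string is turned into a sequence can be sketched as follows. The example assumes a fixed DNA-like alphabet "ACGT" with single-character symbols, treats an empty delimiter as "every character is one symbol", and throws IllegalArgumentException where the real method would report a WrongAlphabetException. CreateSketch and parse are hypothetical names. None of this is taken from the Jstacs source.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class CreateSketch {
    private static final String ALPHABET = "ACGT"; // assumed discrete alphabet

    // Converts the string into discrete codes (index of each symbol in ALPHABET).
    static int[] parse(String sequence, String delim) {
        String[] tokens;
        if (delim.isEmpty()) {
            // No delimiter: each character is taken as one symbol.
            tokens = new String[sequence.length()];
            for (int i = 0; i < tokens.length; i++) {
                tokens[i] = String.valueOf(sequence.charAt(i));
            }
        } else {
            // Quote the delimiter so that it is not interpreted as a regular expression.
            tokens = sequence.split(java.util.regex.Pattern.quote(delim));
        }
        int[] codes = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            int code = tokens[i].length() == 1 ? ALPHABET.indexOf(tokens[i].charAt(0)) : -1;
            if (code < 0) {
                throw new IllegalArgumentException("symbol not in alphabet: " + tokens[i]);
            }
            codes[i] = code;
        }
        return codes;
    }
}
```

With this sketch, parse("ACGT", "") and parse("A,C,G,T", ",") both yield the codes 0, 1, 2, 3.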

reverse

public final Sequence reverse()
                       throws OperationNotSupportedException
This method returns a new sequence instance containing the reverse of the current sequence.
For instance, invoking this method on the sequence "TAATA" returns "ATAAT".

Returns:
the reverse sequence
Throws:
OperationNotSupportedException - if the current sequence is based on an AlphabetContainer that is not simple.

reverse

public Sequence reverse(int start,
                        int end)
                 throws OperationNotSupportedException
This method returns a new sequence instance containing the reverse of a part of the current sequence.

Parameters:
start - the start position (inclusive) in the original sequence
end - the end position (exclusive) in the original sequence
Returns:
the reverse sequence
Throws:
OperationNotSupportedException - if the current sequence is based on an AlphabetContainer that is not simple.
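
The reversal, including the variant restricted to a part of the sequence, can be illustrated on plain strings. The sketch assumes that the part is selected first, using start (inclusive) and end (exclusive) in the original sequence, and is then reversed. ReverseSketch is a hypothetical name and no Jstacs classes are involved.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class ReverseSketch {

    // Reverses the part [start, end) of s.
    static String reverse(String s, int start, int end) {
        return new StringBuilder(s.substring(start, end)).reverse().toString();
    }

    // Reverses the whole string.
    static String reverse(String s) {
        return reverse(s, 0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(reverse("TAATA"));       // prints "ATAAT"
        System.out.println(reverse("TAATA", 0, 3)); // prints "AAT"
    }
}
```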

complement

public Sequence complement()
                    throws OperationNotSupportedException
This method returns a new sequence instance containing the complement of the current sequence.
For instance, invoking this method on the sequence "TAATA" with an AlphabetContainer on DNAAlphabet returns "ATTAT".

Returns:
the complementary sequence
Throws:
OperationNotSupportedException - if the current sequence is not based on a ComplementableDiscreteAlphabet
See Also:
ComplementableDiscreteAlphabet

complement

public Sequence complement(int start,
                           int end)
                    throws OperationNotSupportedException
This method returns a new sequence instance containing the complement of a part of the current sequence.
For instance, invoking this method with start 0 and end 5 on the sequence "TAATA" with an AlphabetContainer on DNAAlphabet returns "ATTAT".

Parameters:
start - the start position (inclusive) in the original sequence
end - the end position (exclusive) in the original sequence
Returns:
the complementary sequence
Throws:
OperationNotSupportedException - if the current sequence is not based on a ComplementableDiscreteAlphabet
See Also:
ComplementableDiscreteAlphabet
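
For DNA, the complement maps A to T, C to G and vice versa. The following stand-alone sketch applies that mapping to plain strings. ComplementSketch is a hypothetical name, and an IllegalArgumentException stands in for the case in which the real method would throw an OperationNotSupportedException.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class ComplementSketch {

    // Complement of a single DNA symbol.
    static char complement(char c) {
        switch (c) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default: throw new IllegalArgumentException("not a DNA symbol: " + c);
        }
    }

    // Complement of the part [start, end) of s; the order of the symbols is kept.
    static String complement(String s, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end; i++) {
            sb.append(complement(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(complement("TAATA", 0, 5)); // prints "ATTAT"
        System.out.println(complement("TAATA", 1, 4)); // prints "TTA"
    }
}
```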

reverseComplement

public Sequence reverseComplement()
                           throws OperationNotSupportedException
This method returns a new sequence instance containing the reverse complement of the current sequence. For more details see the methods reverse() and complement().

Returns:
the reverse complementary sequence
Throws:
OperationNotSupportedException - if the current sequence is not discrete and simple
See Also:
reverse(), complement(), ComplementableDiscreteAlphabet

reverseComplement

public Sequence reverseComplement(int start,
                                  int end)
                           throws OperationNotSupportedException
This method returns a new sequence instance containing the reverse complement of a part of the current sequence. For more details see the methods reverse() and complement().

Parameters:
start - the start position (inclusive) in the original sequence
end - the end position (exclusive) in the original sequence
Returns:
the reverse complementary sequence
Throws:
OperationNotSupportedException - if the current sequence is not discrete and simple
See Also:
reverse(), complement(), ComplementableDiscreteAlphabet
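
The reverse complement combines both operations: the selected part is read from its end to its start while each symbol is complemented. The sketch below shows this for DNA on plain strings. ReverseComplementSketch is a hypothetical name, and the code is an illustration only, not the Jstacs implementation.

```java
// Hypothetical stand-alone sketch; not part of Jstacs.
final class ReverseComplementSketch {

    // Complement of a single DNA symbol.
    static char complement(char c) {
        switch (c) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default: throw new IllegalArgumentException("not a DNA symbol: " + c);
        }
    }

    // Reverse complement of the part [start, end) of s.
    static String reverseComplement(String s, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = end - 1; i >= start; i--) {
            sb.append(complement(s.charAt(i)));
        }
        return sb.toString();
    }

    // Reverse complement of the whole string.
    static String reverseComplement(String s) {
        return reverseComplement(s, 0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(reverseComplement("TAATA"));       // prints "TATTA"
        System.out.println(reverseComplement("TAATA", 1, 4)); // prints "ATT"
    }
}
```

Applying the operation twice to a whole sequence gives back the original sequence.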

hashCode

public int hashCode()
Overrides:
hashCode in class Object