de.jstacs.sequenceScores.statisticalModels.differentiable
Class AbstractDifferentiableStatisticalModel

java.lang.Object
  extended by de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
      extended by de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel
All Implemented Interfaces:
DifferentiableSequenceScore, SequenceScore, DifferentiableStatisticalModel, StatisticalModel, Storable, Cloneable
Direct Known Subclasses:
AbstractMixtureDiffSM, AbstractVariableLengthDiffSM, BayesianNetworkDiffSM, MappingDiffSM, MarkovRandomFieldDiffSM, NormalizedDiffSM, PositionDiffSM

public abstract class AbstractDifferentiableStatisticalModel
extends AbstractDifferentiableSequenceScore
implements DifferentiableStatisticalModel

This class is the main part of any ScoreClassifier. It implements many methods of the interface DifferentiableStatisticalModel.

Author:
Jens Keilwagen, Jan Grau

Field Summary
 
Fields inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
alphabets, length, r
 
Fields inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore
UNKNOWN
 
Constructor Summary
AbstractDifferentiableStatisticalModel(AlphabetContainer alphabets, int length)
          The main constructor.
AbstractDifferentiableStatisticalModel(StringBuffer xml)
          This is the constructor for Storable.
 
Method Summary
 AbstractDifferentiableStatisticalModel clone()
          Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.
 DataSet emitDataSet(int numberOfSequences, int... seqLength)
          This method returns a DataSet object containing artificial sequence(s).
 double getInitialClassParam(double classProb)
          Returns the initial class parameter for the class this DifferentiableSequenceScore is responsible for, based on the class probability classProb.
 double getLogProbFor(Sequence sequence)
          Returns the logarithm of the probability of the given sequence given the model.
 double getLogProbFor(Sequence sequence, int startpos)
          Returns the logarithm of the probability of (a part of) the given sequence given the model.
 double getLogProbFor(Sequence sequence, int startpos, int endpos)
          Returns the logarithm of the probability of (a part of) the given sequence given the model.
 double[] getLogScoreFor(DataSet data)
          This method computes the logarithm of the scores of all sequences in the given sample.
 void getLogScoreFor(DataSet data, double[] res)
          This method computes and stores the logarithm of the scores for any sequence in the sample in the given double-array.
 byte getMaximalMarkovOrder()
          This method returns the maximal used Markov order, if possible.
 boolean isNormalized()
          This method indicates whether the implemented score is already normalized to 1 or not.
static boolean isNormalized(DifferentiableSequenceScore... function)
          This method checks whether all given DifferentiableStatisticalModels are normalized.
 
Methods inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
fromXML, getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfRecommendedStarts, getNumberOfStarts, getNumericalCharacteristics
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.differentiable.DifferentiableStatisticalModel
addGradientOfLogPriorTerm, getESS, getLogNormalizationConstant, getLogPartialNormalizationConstant, getLogPriorTerm, getSizeOfEventSpaceForRandomVariablesOfParameter
 
Methods inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore
getCurrentParameterValues, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getNumberOfParameters, getNumberOfRecommendedStarts, initializeFunction, initializeFunctionRandomly, setParameters
 
Methods inherited from interface de.jstacs.sequenceScores.SequenceScore
getAlphabetContainer, getCharacteristics, getInstanceName, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics, isInitialized
 
Methods inherited from interface de.jstacs.Storable
toXML
 

Constructor Detail

AbstractDifferentiableStatisticalModel

public AbstractDifferentiableStatisticalModel(AlphabetContainer alphabets,
                                              int length)
                                       throws IllegalArgumentException
The main constructor.

Parameters:
alphabets - the AlphabetContainer of this DifferentiableSequenceScore
length - the length of this DifferentiableSequenceScore, i.e. the length of the modeled sequences
Throws:
IllegalArgumentException - if the length is negative or does not match with AlphabetContainer.getPossibleLength()

AbstractDifferentiableStatisticalModel

public AbstractDifferentiableStatisticalModel(StringBuffer xml)
                                       throws NonParsableException
This is the constructor for Storable. Creates a new AbstractDifferentiableStatisticalModel out of a StringBuffer.

Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the XML representation could not be parsed
Method Detail

clone

public AbstractDifferentiableStatisticalModel clone()
                                             throws CloneNotSupportedException
Description copied from interface: DifferentiableSequenceScore
Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.

Specified by:
clone in interface DifferentiableSequenceScore
Specified by:
clone in interface SequenceScore
Overrides:
clone in class AbstractDifferentiableSequenceScore
Returns:
the cloned instance of the current DifferentiableSequenceScore
Throws:
CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore

isNormalized

public boolean isNormalized()
Description copied from interface: DifferentiableStatisticalModel
This method indicates whether the implemented score is already normalized to 1 or not. The standard implementation returns false.

Specified by:
isNormalized in interface DifferentiableStatisticalModel
Returns:
true if the implemented score is already normalized to 1, false otherwise

isNormalized

public static boolean isNormalized(DifferentiableSequenceScore... function)
This method checks whether all given DifferentiableStatisticalModels are normalized.
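
As a rough illustration of this check, the following sketch uses toy stand-in types rather than the Jstacs interfaces. The names Score, Model and allNormalized are hypothetical, and the rule that a score which is not a statistical model counts as not normalized is an assumption of the sketch, not documented behaviour.

```java
// Toy types standing in for the library interfaces (hypothetical, not Jstacs code).
final class NormalizationCheckSketch {

    /** Stand-in for a generic differentiable score. */
    interface Score {
    }

    /** Stand-in for a statistical model that can report whether it is normalized. */
    interface Model extends Score {
        boolean isNormalized();
    }

    /**
     * Returns true only if every given score is a Model that reports itself as normalized.
     * Assumption: a score that is not a Model counts as not normalized.
     */
    static boolean allNormalized(Score... scores) {
        for (Score score : scores) {
            if (!(score instanceof Model) || !((Model) score).isNormalized()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Model normalized = () -> true;
        Model unnormalized = () -> false;
        System.out.println(allNormalized(normalized, normalized));
        System.out.println(allNormalized(normalized, unnormalized));
    }
}
```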

Parameters:
function - the DifferentiableStatisticalModels to be checked
Returns:
true if all DifferentiableStatisticalModels are already normalized, otherwise false
See Also:
DifferentiableStatisticalModel.isNormalized()

getInitialClassParam

public double getInitialClassParam(double classProb)
Description copied from interface: DifferentiableSequenceScore
Returns the initial class parameter for the class this DifferentiableSequenceScore is responsible for, based on the class probability classProb.

Specified by:
getInitialClassParam in interface DifferentiableSequenceScore
Overrides:
getInitialClassParam in class AbstractDifferentiableSequenceScore
Parameters:
classProb - the class probability
Returns:
the initial class parameter

getLogProbFor

public double getLogProbFor(Sequence sequence)
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.
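
For a score that is not normalized, a log probability can be obtained by subtracting the logarithm of the normalization constant from the log score. The sketch below shows this for a toy model with independent positions and unnormalized symbol weights. The class, its weights and its method names are hypothetical; it is not the Jstacs implementation.

```java
// Toy model (hypothetical): independent positions with unnormalized symbol weights.
final class NormalizedLogProbSketch {

    static final double[] WEIGHTS = {1.0, 2.0, 3.0, 4.0};

    /** Unnormalized log score: sum of the log weights of the symbols. */
    static double logScoreFor(int[] sequence) {
        double logScore = 0.0;
        for (int symbol : sequence) {
            logScore += Math.log(WEIGHTS[symbol]);
        }
        return logScore;
    }

    /** Log of the sum of scores over all sequences of the given length. */
    static double logNormalizationConstant(int length) {
        double sum = 0.0;
        for (double weight : WEIGHTS) {
            sum += weight;
        }
        return length * Math.log(sum);
    }

    /** Log probability = log score minus log normalization constant. */
    static double logProbFor(int[] sequence) {
        return logScoreFor(sequence) - logNormalizationConstant(sequence.length);
    }

    public static void main(String[] args) {
        // Summing the probabilities of all sequences of length 2 should give about 1.
        double total = 0.0;
        for (int a = 0; a < WEIGHTS.length; a++) {
            for (int b = 0; b < WEIGHTS.length; b++) {
                total += Math.exp(logProbFor(new int[] {a, b}));
            }
        }
        System.out.println(total);
    }
}
```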

Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence for which the logarithm of the probability/the value of the density function should be returned
Returns:
the logarithm of the probability or the value of the density function of the part of the given sequence given the model
See Also:
StatisticalModel.getLogProbFor(Sequence, int, int)

getLogProbFor

public double getLogProbFor(Sequence sequence,
                            int startpos)
                     throws Exception
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

If the length of the sequences whose probability should be returned is fixed (e.g. in an inhomogeneous model) and the given sequence is longer than this fixed length, startpos gives the start position within the given sequence. For example, if the fixed length is 12, the given sequence has length 30, and startpos=15, the logarithm of the probability of the part from position 15 to 26 (inclusive) given the model should be returned.
The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.
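
The index arithmetic of this example can be sketched as follows. The helper lastPosition is hypothetical and only illustrates how the inclusive end position follows from startpos and the fixed model length. Rejecting a window that runs past the end of the sequence is an assumption of the sketch.

```java
// Hypothetical helper, not part of Jstacs: illustrates the index arithmetic only.
final class FixedLengthWindowSketch {

    /** Returns the last (inclusive) position scored by a model of fixed length modelLength. */
    static int lastPosition(int startpos, int modelLength, int sequenceLength) {
        if (startpos < 0 || modelLength <= 0) {
            throw new IllegalArgumentException("invalid start position or model length");
        }
        int endpos = startpos + modelLength - 1; // inclusive end of the window
        if (endpos >= sequenceLength) {
            throw new IllegalArgumentException("window exceeds the sequence");
        }
        return endpos;
    }

    public static void main(String[] args) {
        // Values from the description: model length 12, sequence length 30, startpos 15.
        System.out.println(lastPosition(15, 12, 30)); // prints 26
    }
}
```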

Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
Returns:
the logarithm of the probability or the value of the density function of (the part of) the given sequence given the model
Throws:
Exception - if the sequence could not be handled by the model
NotTrainedException - if the model is not trained yet
See Also:
StatisticalModel.getLogProbFor(Sequence, int, int)

getLogProbFor

public double getLogProbFor(Sequence sequence,
                            int startpos,
                            int endpos)
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

This method extends StatisticalModel.getLogProbFor(Sequence, int) to models, e.g. homogeneous ones, for which the length of the sequences whose probability should be returned is not fixed. The end position of the part of the given sequence is therefore given as well, and the logarithm of the probability of the part from position startpos to endpos (inclusive) should be returned.
The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.
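
A minimal sketch of the inclusive range semantics, assuming a toy homogeneous model in which every position is drawn independently from the same symbol distribution. The class, the probabilities and the range check are hypothetical and are not taken from Jstacs.

```java
// Toy homogeneous model (hypothetical): identical, independent symbol probabilities
// at every position, so a subsequence of any length can be scored.
final class HomogeneousLogProbSketch {

    static final double[] PROBS = {0.1, 0.2, 0.3, 0.4};

    /** Log probability of the symbols from startpos to endpos, both inclusive. */
    static double logProbFor(int[] sequence, int startpos, int endpos) {
        if (startpos < 0 || endpos >= sequence.length || startpos > endpos) {
            throw new IllegalArgumentException("invalid range");
        }
        double logProb = 0.0;
        for (int i = startpos; i <= endpos; i++) {
            logProb += Math.log(PROBS[sequence[i]]);
        }
        return logProb;
    }

    public static void main(String[] args) {
        int[] sequence = {0, 1, 2, 3};
        // Scores positions 1 and 2 only, i.e. log(0.2) + log(0.3).
        System.out.println(logProbFor(sequence, 1, 2));
    }
}
```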

Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Returns:
the logarithm of the probability or the value of the density function of (the part of) the given sequence given the model

getLogScoreFor

public double[] getLogScoreFor(DataSet data)
                        throws Exception
Description copied from interface: SequenceScore
This method computes the logarithm of the scores of all sequences in the given sample. The values are stored in an array according to the index of the respective sequence in the sample.

The score for each sequence shall be computed independently of all other sequences in the sample, so the result should be exactly the same as for the method SequenceScore.getLogScoreFor(Sequence).

Specified by:
getLogScoreFor in interface SequenceScore
Overrides:
getLogScoreFor in class AbstractDifferentiableSequenceScore
Parameters:
data - the sample of sequences
Returns:
an array containing the logarithm of the score of all sequences of the sample
Throws:
Exception - if something went wrong
See Also:
SequenceScore.getLogScoreFor(Sequence)

getLogScoreFor

public void getLogScoreFor(DataSet data,
                           double[] res)
                    throws Exception
Description copied from interface: SequenceScore
This method computes and stores the logarithm of the scores for any sequence in the sample in the given double-array.

The score for each sequence shall be computed independently of all other sequences in the sample, so the result should be exactly the same as for the method SequenceScore.getLogScoreFor(Sequence).
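
The contract described here can be sketched with a toy score: the batch variants simply delegate to the single-sequence method, so each entry equals the per-sequence result. The class, its weights and the use of a List of int arrays in place of a DataSet are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Toy score (hypothetical): a list of int arrays stands in for a DataSet.
final class BatchScoreSketch {

    static final double[] WEIGHTS = {1.0, 2.0, 3.0, 4.0};

    /** Log score of a single sequence. */
    static double logScoreFor(int[] sequence) {
        double logScore = 0.0;
        for (int symbol : sequence) {
            logScore += Math.log(WEIGHTS[symbol]);
        }
        return logScore;
    }

    /** Allocates the result array and fills it via the in-place variant. */
    static double[] logScoreFor(List<int[]> data) {
        double[] res = new double[data.size()];
        logScoreFor(data, res);
        return res;
    }

    /** Fills res so that res[i] is the score of the i-th sequence, independent of the others. */
    static void logScoreFor(List<int[]> data, double[] res) {
        if (res.length != data.size()) {
            throw new IllegalArgumentException("res must have one entry per sequence");
        }
        for (int i = 0; i < res.length; i++) {
            res[i] = logScoreFor(data.get(i));
        }
    }

    public static void main(String[] args) {
        List<int[]> data = Arrays.asList(new int[] {0, 1}, new int[] {3, 3, 2});
        System.out.println(Arrays.toString(logScoreFor(data)));
    }
}
```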

Specified by:
getLogScoreFor in interface SequenceScore
Overrides:
getLogScoreFor in class AbstractDifferentiableSequenceScore
Parameters:
data - the sample of sequences
res - the array for the results, has to have length data.getNumberOfElements() (which returns the number of sequences in the sample)
Throws:
Exception - if something went wrong
See Also:
SequenceScore.getLogScoreFor(Sequence), SequenceScore.getLogScoreFor(DataSet)

emitDataSet

public DataSet emitDataSet(int numberOfSequences,
                           int... seqLength)
                    throws NotTrainedException,
                           Exception
Description copied from interface: StatisticalModel
This method returns a DataSet object containing artificial sequence(s).

There are two different possibilities to create a sample for a model with length 0 (homogeneous models).
  1. emitDataSet( int n, int l ) should return a data set with n sequences of length l.
  2. emitDataSet( int n, int[] l ) should return a data set with n sequences whose lengths correspond to the entries of the given array l.

There are two different possibilities to create a sample for a model with length greater than 0 (inhomogeneous models).
  1. emitDataSet( int n ) should return a sample with n sequences of the length of the model (SequenceScore.getLength()).
  2. emitDataSet( int n, null ) should return the same sample as emitDataSet( int n ).

The standard implementation throws an Exception.
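
A subclass that overrides this method has to interpret the varargs parameter according to the calling conventions above. The following sketch shows one way to resolve the per-sequence lengths. The helper resolveLengths is hypothetical, and rejecting every other argument combination with an IllegalArgumentException is an assumption of the sketch.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Jstacs: resolves the length of each emitted sequence.
final class EmitLengthsSketch {

    /**
     * modelLength == 0 denotes a homogeneous model, modelLength > 0 an inhomogeneous one.
     * Returns an array with one length per sequence to be emitted.
     */
    static int[] resolveLengths(int modelLength, int numberOfSequences, int... seqLength) {
        if (numberOfSequences < 0) {
            throw new IllegalArgumentException("negative number of sequences");
        }
        int[] lengths = new int[numberOfSequences];
        if (modelLength > 0) {
            // Inhomogeneous model: seqLength must be null or empty.
            if (seqLength != null && seqLength.length > 0) {
                throw new IllegalArgumentException("fixed-length model takes no sequence lengths");
            }
            Arrays.fill(lengths, modelLength);
        } else if (seqLength != null && seqLength.length == 1) {
            // Homogeneous model, one common length for all sequences.
            Arrays.fill(lengths, seqLength[0]);
        } else if (seqLength != null && seqLength.length == numberOfSequences) {
            // Homogeneous model, one length per sequence.
            System.arraycopy(seqLength, 0, lengths, 0, numberOfSequences);
        } else {
            throw new IllegalArgumentException("expected one length or one length per sequence");
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(resolveLengths(0, 3, 7)));
        System.out.println(Arrays.toString(resolveLengths(0, 2, 5, 9)));
        System.out.println(Arrays.toString(resolveLengths(12, 2)));
        System.out.println(Arrays.toString(resolveLengths(12, 2, (int[]) null)));
    }
}
```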

Specified by:
emitDataSet in interface StatisticalModel
Parameters:
numberOfSequences - the number of sequences that should be contained in the returned sample
seqLength - the length of the sequences for a homogeneous model; for an inhomogeneous model this parameter should be null or an array of size 0.
Returns:
a DataSet containing the artificial sequence(s)
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the emission did not succeed
See Also:
DataSet

getMaximalMarkovOrder

public byte getMaximalMarkovOrder()
                           throws UnsupportedOperationException
Description copied from interface: StatisticalModel
This method returns the maximal used Markov order, if possible.

Specified by:
getMaximalMarkovOrder in interface StatisticalModel
Returns:
maximal used Markov order
Throws:
UnsupportedOperationException - if the model can't give a proper answer