de.jstacs.scoringFunctions.homogeneous
Class HMM0ScoringFunction

java.lang.Object
  extended by de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
      extended by de.jstacs.scoringFunctions.VariableLengthScoringFunction
          extended by de.jstacs.scoringFunctions.homogeneous.HomogeneousScoringFunction
              extended by de.jstacs.scoringFunctions.homogeneous.HMM0ScoringFunction
All Implemented Interfaces:
NormalizableScoringFunction, ScoringFunction, Storable, Cloneable

public class HMM0ScoringFunction
extends HomogeneousScoringFunction

This scoring function implements a homogeneous Markov model of order zero (hMM(0)) for a fixed sequence length.

Author:
Jens Keilwagen

Field Summary
 
Fields inherited from class de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
alphabets, length, r
 
Fields inherited from interface de.jstacs.scoringFunctions.ScoringFunction
UNKNOWN
 
Constructor Summary
HMM0ScoringFunction(AlphabetContainer alphabets, int length, double ess, boolean plugIn, boolean optimize)
          The main constructor that creates an instance of a homogeneous Markov model of order 0.
HMM0ScoringFunction(StringBuffer xml)
          This is the constructor for Storable.
 
Method Summary
 void addGradientOfLogPriorTerm(double[] grad, int start)
          This method computes the gradient of getLogPriorTerm() for each parameter of this model.
 HMM0ScoringFunction clone()
          Creates a clone (deep copy) of the current ScoringFunction instance.
protected  void fromXML(StringBuffer xml)
          This method is called in the constructor to create a scoring function from a StringBuffer.
 double[] getCurrentParameterValues()
          Returns a double array of dimension getNumberOfParameters() containing the current parameter values.
 double getEss()
          Returns the equivalent sample size of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
 String getInstanceName()
          Returns a short instance name.
 double getLogPriorTerm()
          This method computes a value that is proportional to getEss()*Math.log( getNormalizationConstant() ) + Math.log( prior ).
 double getLogScore(Sequence seq, int start, int length)
          This method computes the logarithm of the score for a given subsequence.
 double getLogScoreAndPartialDerivation(Sequence seq, int start, int length, IntList indices, DoubleList dList)
          This method computes the logarithm of the score and the partial derivatives for a given subsequence.
 int getMaximalMarkovOrder()
          Returns the maximal Markov order used.
 double getNormalizationConstant(int length)
          This method returns the normalization constant for a given sequence length.
 int getNumberOfParameters()
          The number of parameters in this scoring function.
 double getPartialNormalizationConstant(int parameterIndex, int length)
          This method returns the partial normalization constant for a given parameter index and sequence length.
 int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
          Returns the size of the event space of the random variables that are affected by the parameter with the given index.
 double[] getStationarySymbolDistribution()
          This method returns the stationary symbol distribution.
 void initializeFunction(int index, boolean freeParams, Sample[] data, double[][] weights)
          This method creates the underlying structure of the scoring function.
 void initializeFunctionRandomly(boolean freeParams)
          This method initializes the scoring function randomly.
 boolean isInitialized()
          This method can be used to determine whether the model is initialized.
 void setParameters(double[] params, int start)
          This method sets the internal parameters to the values of params between start and start + this.getNumberOfParameters() - 1.
 void setStatisticForHyperparameters(int[] length, double[] weight)
          This method sets the hyperparameters for the model parameters by evaluating the given statistic.
 String toString()
           
 StringBuffer toXML()
          This method returns an XML-representation of an instance of the implementing class.
 
Methods inherited from class de.jstacs.scoringFunctions.VariableLengthScoringFunction
getLogScore, getLogScoreAndPartialDerivation, getNormalizationConstant, getPartialNormalizationConstant
 
Methods inherited from class de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
getAlphabetContainer, getInitialClassParam, getLength, getLogScore, getLogScoreAndPartialDerivation, getNumberOfRecommendedStarts, isNormalized, isNormalized
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

HMM0ScoringFunction

public HMM0ScoringFunction(AlphabetContainer alphabets,
                           int length,
                           double ess,
                           boolean plugIn,
                           boolean optimize)
The main constructor that creates an instance of a homogeneous Markov model of order 0.

Parameters:
alphabets - the AlphabetContainer of the model
length - the length of the sequences, and thus of the model
ess - the equivalent sample size (ess)
plugIn - whether to use a plug-in strategy to initialize the parameters
optimize - whether to optimize the parameters or not after they have been initialized

HMM0ScoringFunction

public HMM0ScoringFunction(StringBuffer xml)
                    throws NonParsableException
This is the constructor for Storable.

Parameters:
xml - the XML representation
Throws:
NonParsableException - if the representation could not be parsed.
Method Detail

clone

public HMM0ScoringFunction clone()
                          throws CloneNotSupportedException
Description copied from interface: ScoringFunction
Creates a clone (deep copy) of the current ScoringFunction instance.

Specified by:
clone in interface ScoringFunction
Overrides:
clone in class AbstractNormalizableScoringFunction
Returns:
the cloned instance
Throws:
CloneNotSupportedException

getInstanceName

public String getInstanceName()
Description copied from interface: ScoringFunction
Returns a short instance name.

Returns:
a short instance name

getLogScore

public double getLogScore(Sequence seq,
                          int start,
                          int length)
Description copied from class: VariableLengthScoringFunction
This method computes the logarithm of the score for a given subsequence.

Specified by:
getLogScore in class VariableLengthScoringFunction
Parameters:
seq - the sequence
start - the start index
length - the length of the subsequence
Returns:
the logarithm of the score
See Also:
ScoringFunction.getLogScore(Sequence, int)
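
For an order-zero homogeneous model, the log score of a subsequence is simply the sum of one per-symbol term over the scored positions. The following is a minimal sketch of that idea and not the Jstacs implementation: the class Hmm0LogScoreSketch, the int[] encoding of the sequence, and the logParams array (one log-weight per symbol) are assumptions made for illustration.

```java
public class Hmm0LogScoreSketch {

    /**
     * Sums the per-symbol log-weights over seq[start .. start + length - 1].
     * Hypothetical stand-in: seq holds symbol codes (e.g. A=0, C=1, G=2, T=3),
     * logParams[a] is the log-weight assumed for symbol a.
     */
    public static double logScore(int[] seq, int start, int length, double[] logParams) {
        double sum = 0.0;
        for (int i = start; i < start + length; i++) {
            // Order zero: the term depends only on the symbol, not on its context or position.
            sum += logParams[seq[i]];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] seq = {0, 1, 2, 3, 0, 0};
        double[] logParams = {Math.log(0.4), Math.log(0.1), Math.log(0.2), Math.log(0.3)};
        // Scores the three symbols C, G, T starting at index 1.
        System.out.println(logScore(seq, 1, 3, logParams));
    }
}
```

In this sketch the third argument is interpreted as a length, so the last scored position is start + length - 1.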

getLogScoreAndPartialDerivation

public double getLogScoreAndPartialDerivation(Sequence seq,
                                              int start,
                                              int length,
                                              IntList indices,
                                              DoubleList dList)
Description copied from class: VariableLengthScoringFunction
This method computes the logarithm of the score and the partial derivatives for a given subsequence.

Specified by:
getLogScoreAndPartialDerivation in class VariableLengthScoringFunction
Parameters:
seq - the sequence
start - the start index
length - the length of the subsequence
indices - the list for the indices of the parameters
dList - the list for the partial derivatives
Returns:
the logarithm of the score
See Also:
ScoringFunction.getLogScoreAndPartialDerivation(Sequence, int, IntList, DoubleList)
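
To illustrate how the two lists are filled, here is a hedged sketch that assumes a log-linear parameterization with one parameter per symbol. Under that assumption the partial derivative of the log score with respect to the parameter of symbol a is the number of occurrences of a in the subsequence. The class name and the use of java.util.List in place of IntList and DoubleList are hypothetical; the actual parameterization in Jstacs may differ (for example when free parameters are used).

```java
import java.util.ArrayList;
import java.util.List;

public class Hmm0GradientSketch {

    /**
     * Returns the log score of seq[start .. start + length - 1] and appends, for every
     * symbol that occurs, its parameter index to indices and its count to dList.
     * Assumption: logScore = sum of logParams[symbol], so d(logScore)/d(logParams[a]) = count(a).
     */
    public static double logScoreAndPartialDerivation(int[] seq, int start, int length,
            double[] logParams, List<Integer> indices, List<Double> dList) {
        double[] counts = new double[logParams.length];
        double logScore = 0.0;
        for (int i = start; i < start + length; i++) {
            logScore += logParams[seq[i]];
            counts[seq[i]]++;
        }
        for (int a = 0; a < counts.length; a++) {
            if (counts[a] > 0) {
                // Only parameters with a non-zero derivative are reported.
                indices.add(a);
                dList.add(counts[a]);
            }
        }
        return logScore;
    }

    public static void main(String[] args) {
        List<Integer> indices = new ArrayList<>();
        List<Double> dList = new ArrayList<>();
        double score = logScoreAndPartialDerivation(
                new int[] {0, 1, 0, 3}, 0, 4, new double[] {0.5, -1.0, 2.0, 0.25}, indices, dList);
        System.out.println(score + " " + indices + " " + dList);
    }
}
```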

getNumberOfParameters

public int getNumberOfParameters()
Description copied from interface: ScoringFunction
The number of parameters in this scoring function. If the number of parameters is not known yet, the method returns UNKNOWN.

Returns:
the number of parameters in this scoring function
See Also:
ScoringFunction.UNKNOWN

setParameters

public void setParameters(double[] params,
                          int start)
Description copied from interface: ScoringFunction
This method sets the internal parameters to the values of params between start and start + this.getNumberOfParameters() - 1.

Parameters:
params - the parameters
start - the start index
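
The offset-based contract can be sketched as follows. This is an illustration only: ParameterSliceSketch is a hypothetical class that mimics the described behavior, where a function reads its own slice out of a larger, shared parameter vector.

```java
public class ParameterSliceSketch {

    private final double[] params;

    public ParameterSliceSketch(int numberOfParameters) {
        this.params = new double[numberOfParameters];
    }

    public int getNumberOfParameters() {
        return params.length;
    }

    /** Copies all[start .. start + getNumberOfParameters() - 1] into the internal parameters. */
    public void setParameters(double[] all, int start) {
        System.arraycopy(all, start, params, 0, params.length);
    }

    /** Returns a copy so that callers cannot modify the internal state. */
    public double[] getCurrentParameterValues() {
        return params.clone();
    }

    public static void main(String[] args) {
        ParameterSliceSketch f = new ParameterSliceSketch(3);
        // The shared vector may hold parameters of other functions before and after this slice.
        f.setParameters(new double[] {9, 9, 1, 2, 3, 9}, 2);
        System.out.println(java.util.Arrays.toString(f.getCurrentParameterValues()));
    }
}
```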

toXML

public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML-representation of an instance of the implementing class.

Returns:
the XML-representation

getCurrentParameterValues

public double[] getCurrentParameterValues()
Description copied from interface: ScoringFunction
Returns a double array of dimension getNumberOfParameters() containing the current parameter values. If these parameters are to be used as the starting point of an optimization, it is highly recommended to invoke ScoringFunction.initializeFunction(int, boolean, Sample[], double[][]) beforehand. After an optimization, this method can be used to get the current parameter values.

Returns:
the current parameter values

getStationarySymbolDistribution

public double[] getStationarySymbolDistribution()
Description copied from class: VariableLengthScoringFunction
This method returns the stationary symbol distribution. For DNA this is the stationary mononucleotide distribution. This method is used to determine the contrast for de.jstacs.motifDiscovery.Mutable#expand(int, double[], double[], double), de.jstacs.motifDiscovery.Mutable#shift(double[], double[], double) and de.jstacs.motifDiscovery.Mutable#shrink(double[], double[], double).

Specified by:
getStationarySymbolDistribution in class VariableLengthScoringFunction
Returns:
the stationary symbol distribution
See Also:
Mutable
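
For an order-zero model the stationary distribution coincides with the per-symbol distribution itself. Assuming the parameters are stored as (possibly unnormalized) log-weights, one way to obtain it is to exponentiate and normalize them. The class and method names below are hypothetical, and the sketch does not claim to reproduce the internal representation used by Jstacs.

```java
public class StationaryDistributionSketch {

    /**
     * Turns log-weights into a probability distribution (a softmax).
     * Subtracting the maximum first keeps Math.exp from overflowing.
     */
    public static double[] symbolDistribution(double[] logParams) {
        double max = Double.NEGATIVE_INFINITY;
        for (double p : logParams) {
            max = Math.max(max, p);
        }
        double[] dist = new double[logParams.length];
        double sum = 0.0;
        for (int a = 0; a < dist.length; a++) {
            dist[a] = Math.exp(logParams[a] - max);
            sum += dist[a];
        }
        for (int a = 0; a < dist.length; a++) {
            dist[a] /= sum;
        }
        return dist;
    }

    public static void main(String[] args) {
        // Equal log-weights over four symbols give the uniform distribution.
        System.out.println(java.util.Arrays.toString(symbolDistribution(new double[] {0, 0, 0, 0})));
    }
}
```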

initializeFunction

public void initializeFunction(int index,
                               boolean freeParams,
                               Sample[] data,
                               double[][] weights)
Description copied from interface: ScoringFunction
This method creates the underlying structure of the scoring function.

Parameters:
index - the index of the class the scoring function models
freeParams - if true, the (reduced) parameterization is used
data - the samples
weights - the weights of the sequences in the samples

initializeFunctionRandomly

public void initializeFunctionRandomly(boolean freeParams)
Description copied from interface: ScoringFunction
This method initializes the scoring function randomly. It has to create the underlying structure of the scoring function.

Parameters:
freeParams - if true, the (reduced) parameterization is used

fromXML

protected void fromXML(StringBuffer xml)
                throws NonParsableException
Description copied from class: AbstractNormalizableScoringFunction
This method is called in the constructor to create a scoring function from a StringBuffer.

Specified by:
fromXML in class AbstractNormalizableScoringFunction
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the StringBuffer could not be parsed.

getSizeOfEventSpaceForRandomVariablesOfParameter

public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Description copied from interface: NormalizableScoringFunction
Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM (except at position 0), and so on.

Parameters:
index - the index of the parameter
Returns:
the size of the event space
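
The product described above can be sketched in a few lines. The helper below is hypothetical and takes the alphabet sizes of the affected random variables directly, rather than looking them up from an AlphabetContainer.

```java
public class EventSpaceSketch {

    /** Multiplies the alphabet sizes of all random variables affected by one parameter. */
    public static int sizeOfEventSpace(int[] alphabetSizes) {
        int size = 1;
        for (int s : alphabetSizes) {
            size *= s;
        }
        return size;
    }

    public static void main(String[] args) {
        // PWM over DNA: one affected position with 4 symbols.
        System.out.println(sizeOfEventSpace(new int[] {4}));
        // WAM over DNA (not at position 0): current and preceding position, 4 * 4 symbols.
        System.out.println(sizeOfEventSpace(new int[] {4, 4}));
    }
}
```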

getNormalizationConstant

public double getNormalizationConstant(int length)
Description copied from class: VariableLengthScoringFunction
This method returns the normalization constant for a given sequence length.

Specified by:
getNormalizationConstant in class VariableLengthScoringFunction
Parameters:
length - the sequence length
Returns:
the normalization constant
See Also:
NormalizableScoringFunction.getNormalizationConstant()
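
As a hedged illustration of why the constant depends on the length: if each position contributes an unnormalized weight exp(logParams[a]) for symbol a, summing the score over all sequences of a given length factorizes into the per-position sum raised to the power of the length. The class below is a hypothetical sketch under that assumption, not the Jstacs code.

```java
public class NormalizationConstantSketch {

    /**
     * Z(length) = (sum over symbols a of exp(logParams[a]))^length,
     * assuming independent positions with identical per-symbol weights.
     */
    public static double normalizationConstant(double[] logParams, int length) {
        double perPosition = 0.0;
        for (double p : logParams) {
            perPosition += Math.exp(p);
        }
        return Math.pow(perPosition, length);
    }

    public static void main(String[] args) {
        // Four symbols with weight 1 each and length 3: 4^3 sequences, each with score 1.
        System.out.println(normalizationConstant(new double[] {0, 0, 0, 0}, 3));
    }
}
```

If the per-symbol weights already sum to one, the constant is 1 for every length.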

getPartialNormalizationConstant

public double getPartialNormalizationConstant(int parameterIndex,
                                              int length)
                                       throws Exception
Description copied from class: VariableLengthScoringFunction
This method returns the partial normalization constant for a given parameter index and sequence length.

Specified by:
getPartialNormalizationConstant in class VariableLengthScoringFunction
Parameters:
parameterIndex - the index of the parameter
length - the sequence length
Returns:
the partial normalization constant
Throws:
Exception - if something went wrong
See Also:
NormalizableScoringFunction.getPartialNormalizationConstant(int)
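
The sketch below assumes that the partial normalization constant is the partial derivative of the normalization constant with respect to one parameter, and that the model uses one log-weight per symbol. Both are assumptions made for illustration. Under them, differentiating Z(length) = S^length with S = sum of exp(logParams[a]) gives length * exp(logParams[k]) * S^(length - 1).

```java
public class PartialNormalizationSketch {

    /** Derivative of (sum_a exp(logParams[a]))^length with respect to logParams[parameterIndex]. */
    public static double partialNormalizationConstant(double[] logParams, int parameterIndex, int length) {
        if (length == 0) {
            // Z(0) = 1 is constant, so its derivative is zero.
            return 0.0;
        }
        double perPosition = 0.0;
        for (double p : logParams) {
            perPosition += Math.exp(p);
        }
        return length * Math.exp(logParams[parameterIndex]) * Math.pow(perPosition, length - 1);
    }

    public static void main(String[] args) {
        System.out.println(partialNormalizationConstant(new double[] {0, 0, 0, 0}, 1, 3));
    }
}
```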

getEss

public double getEss()
Description copied from interface: NormalizableScoringFunction
Returns the equivalent sample size of this model, i.e. the equivalent sample size for the class or component that is represented by this model.

Returns:
the equivalent sample size.

toString

public String toString()
Overrides:
toString in class Object

getLogPriorTerm

public double getLogPriorTerm()
Description copied from interface: NormalizableScoringFunction
This method computes a value that is proportional to

getEss()*Math.log( getNormalizationConstant() ) + Math.log( prior ),

where prior is the prior for the parameters of this model.

Returns:
getEss()*Math.log( getNormalizationConstant() ) + Math.log( prior )
See Also:
NormalizableScoringFunction.getEss(), NormalizableScoringFunction.getNormalizationConstant()

addGradientOfLogPriorTerm

public void addGradientOfLogPriorTerm(double[] grad,
                                      int start)
Description copied from interface: NormalizableScoringFunction
This method computes the gradient of getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.

Parameters:
grad - the gradient
start - the start index in the grad array at which the partial derivatives for the parameters of this model are to be entered
See Also:
NormalizableScoringFunction.getLogPriorTerm()
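
The important detail of this contract is that values are added to grad rather than written over it, so that several functions can contribute to one shared gradient vector. A minimal sketch of that accumulation pattern, with a hypothetical helper and a made-up local gradient:

```java
public class GradientAccumulationSketch {

    /** Adds localGradient[i] to grad[start + i]; entries outside that range are left untouched. */
    public static void addGradient(double[] grad, int start, double[] localGradient) {
        for (int i = 0; i < localGradient.length; i++) {
            grad[start + i] += localGradient[i];
        }
    }

    public static void main(String[] args) {
        double[] grad = {1, 1, 1, 1, 1};
        // Illustrative values only; a real model would compute these from its prior.
        addGradient(grad, 2, new double[] {0.5, -2.0});
        System.out.println(java.util.Arrays.toString(grad));
    }
}
```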

isInitialized

public boolean isInitialized()
Description copied from interface: ScoringFunction
This method can be used to determine whether the model is initialized. If the model is not initialized, you should invoke the method ScoringFunction.initializeFunction(int, boolean, Sample[], double[][]).

Returns:
true if the model is initialized

getMaximalMarkovOrder

public int getMaximalMarkovOrder()
Description copied from class: HomogeneousScoringFunction
Returns the maximal Markov order used.

Specified by:
getMaximalMarkovOrder in class HomogeneousScoringFunction
Returns:
the maximal Markov order used

setStatisticForHyperparameters

public void setStatisticForHyperparameters(int[] length,
                                           double[] weight)
                                    throws Exception
Description copied from class: VariableLengthScoringFunction
This method sets the hyperparameters for the model parameters by evaluating the given statistic. The statistic can be interpreted as follows: The model has seen a number of sequences. From these sequences it is only known how long (length) and how often (weight) they have been seen.

Specified by:
setStatisticForHyperparameters in class VariableLengthScoringFunction
Parameters:
length - the non-negative lengths of the sequences
weight - the non-negative weight for the corresponding sequence
Throws:
Exception - if something went wrong
See Also:
Mutable
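
One possible reading of such a statistic, shown here purely as an illustration: the weighted lengths add up to a total number of observed symbols, which is then spread over the alphabet as pseudo-counts. The uniform split, the class name, and the alphabetSize argument are assumptions of this sketch; the documentation above does not state how Jstacs derives the hyperparameters.

```java
import java.util.Arrays;

public class HyperparameterStatisticSketch {

    /**
     * Computes sum_i length[i] * weight[i] and divides it evenly among the symbols.
     * Assumption: no symbol is preferred a priori, so every symbol gets the same pseudo-count.
     */
    public static double[] uniformHyperparameters(int[] length, double[] weight, int alphabetSize) {
        if (length.length != weight.length) {
            throw new IllegalArgumentException("length and weight must have the same size");
        }
        double total = 0.0;
        for (int i = 0; i < length.length; i++) {
            if (length[i] < 0 || weight[i] < 0) {
                throw new IllegalArgumentException("lengths and weights must be non-negative");
            }
            total += length[i] * weight[i];
        }
        double[] hyper = new double[alphabetSize];
        Arrays.fill(hyper, total / alphabetSize);
        return hyper;
    }

    public static void main(String[] args) {
        // One sequence of length 10 seen once, one of length 20 seen with weight 0.5.
        System.out.println(Arrays.toString(
                uniformHyperparameters(new int[] {10, 20}, new double[] {1.0, 0.5}, 4)));
    }
}
```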