de.jstacs.sequenceScores.statisticalModels.differentiable.mixture
Class StrandDiffSM

java.lang.Object
  extended by de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
      extended by de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel
          extended by de.jstacs.sequenceScores.statisticalModels.differentiable.mixture.AbstractMixtureDiffSM
              extended by de.jstacs.sequenceScores.statisticalModels.differentiable.mixture.StrandDiffSM
All Implemented Interfaces:
Mutable, DifferentiableSequenceScore, SequenceScore, DifferentiableStatisticalModel, SamplingDifferentiableStatisticalModel, StatisticalModel, Storable, Cloneable

public class StrandDiffSM
extends AbstractMixtureDiffSM
implements Mutable

This class enables the user to search on both strands, so the motif can be found on the forward or on the reverse complementary strand.
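
The idea of a two-strand mixture can be illustrated without Jstacs. The following is a minimal sketch, assuming a simple position weight matrix as the underlying motif model and a fixed forward strand probability; the class StrandMixtureSketch and all of its methods are hypothetical and not part of the library.

```java
/** Hypothetical stand-alone sketch of a strand mixture; not part of Jstacs. */
final class StrandMixtureSketch {

    /** Returns the reverse complement of a DNA string over A, C, G, T. */
    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:
                    throw new IllegalArgumentException("not a DNA symbol: " + seq.charAt(i));
            }
        }
        return sb.toString();
    }

    /** Log-score of seq under a toy motif model, logPwm[position][symbol index in "ACGT"]. */
    static double logScore(double[][] logPwm, String seq) {
        double s = 0;
        for (int l = 0; l < logPwm.length; l++) {
            s += logPwm[l]["ACGT".indexOf(seq.charAt(l))];
        }
        return s;
    }

    /**
     * Computes log( p * score(seq) + (1 - p) * score(rc(seq)) ) for a forward
     * probability p in (0,1), using the log-sum-exp trick for stability.
     */
    static double logStrandMixture(double[][] logPwm, String seq, double forwardProb) {
        double fwd = Math.log(forwardProb) + logScore(logPwm, seq);
        double rev = Math.log(1 - forwardProb) + logScore(logPwm, reverseComplement(seq));
        double max = Math.max(fwd, rev);
        return max + Math.log(Math.exp(fwd - max) + Math.exp(rev - max));
    }
}
```

The sketch assumes that seq contains only the symbols A, C, G and T and is at least as long as the motif.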

Author:
Jens Keilwagen
See Also:
ComplementableDiscreteAlphabet, AlphabetContainer.isReverseComplementable()

Nested Class Summary
static class StrandDiffSM.InitMethod
          This enum defines the different types of plug-in initialization of a StrandDiffSM.
 
Field Summary
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.differentiable.mixture.AbstractMixtureDiffSM
componentScore, dList, freeParams, function, hiddenParameter, hiddenPotential, iList, logGammaSum, logHiddenNorm, logHiddenPotential, norm, optimizeHidden, paramRef, partNorm
 
Fields inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
alphabets, length, r
 
Fields inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore
UNKNOWN
 
Constructor Summary
StrandDiffSM(DifferentiableStatisticalModel function, double forwardPartOfESS, int starts, boolean plugIn, StrandDiffSM.InitMethod initMethod)
          This constructor creates a StrandDiffSM that optimizes the usage of each strand.
StrandDiffSM(DifferentiableStatisticalModel function, int starts, boolean plugIn, StrandDiffSM.InitMethod initMethod, double forward)
          This constructor creates a StrandDiffSM that has a fixed frequency for the strand usage.
StrandDiffSM(StringBuffer xml)
          This is the constructor for Storable.
 
Method Summary
protected  void extractFurtherInformation(StringBuffer xml)
          This method is the opposite of AbstractMixtureDiffSM.getFurtherInformation().
protected  void fillComponentScores(Sequence seq, int start)
          Fills the internal array AbstractMixtureDiffSM.componentScore with the logarithmic scores of the components given a Sequence.
 double getESS()
          Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
 double getForwardProbability()
          This method returns the a-priori probability for the forward strand.
protected  StringBuffer getFurtherInformation()
          This method is used to append further information of the instance to the XML representation.
 double getHyperparameterForHiddenParameter(int index)
          This method returns the hyperparameter for the hidden parameter with index index.
 String getInstanceName()
          Should return a short instance name such as iMM(0), BN(2), ...
protected  double getLogNormalizationConstantForComponent(int i)
          Computes the logarithm of the normalization constant for the component i.
 double getLogPartialNormalizationConstant(int parameterIndex)
          Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex.
 double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
          Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivations.
static double[][][] getReverseComplementDistributions(ComplementableDiscreteAlphabet abc, double[][][] condDistr)
          This method computes the reverse complement distributions for given conditional distributions.
 StrandedLocatedSequenceAnnotationWithLength.Strand getStrand(Sequence seq, int startPos)
          This method returns the preferred StrandedLocatedSequenceAnnotationWithLength.Strand for a given subsequence.
protected  void init(boolean freeParams)
          This method creates the underlying structure for the parameters.
protected  void initializeUsingPlugIn(int index, boolean freeParams, DataSet[] data, double[][] weights)
          This method initializes the functions using the data in some way.
static boolean isStrandModel(DifferentiableStatisticalModel nsf)
          Check whether a DifferentiableStatisticalModel is a StrandDiffSM.
 boolean modify(int offsetLeft, int offsetRight)
          Manually modifies the model.
protected  void setForwardProb(double forward)
          This method can be used to set the forward strand probability.
 String toString()
           
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.differentiable.mixture.AbstractMixtureDiffSM
addGradientOfLogPriorTerm, clone, cloneFunctions, computeHiddenParameter, computeLogGammaSum, determineIsNormalized, fromXML, getCurrentParameterValues, getDifferentiableStatisticalModels, getFunction, getFunctions, getIndexOfMaximalComponentFor, getIndices, getLogNormalizationConstant, getLogPriorTerm, getLogScoreFor, getNumberOfComponents, getNumberOfParameters, getNumberOfRecommendedStarts, getProbsForComponent, getSamplingGroups, getSizeOfEventSpaceForRandomVariablesOfParameter, getXMLTag, initializeFunction, initializeFunctionRandomly, initializeHiddenPotentialRandomly, initializeHiddenUniformly, initWithLength, isInitialized, isNormalized, precomputeNorm, setHiddenParameters, setParameters, setParametersForFunction, toXML
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel
emitDataSet, getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, isNormalized
 
Methods inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfStarts, getNumericalCharacteristics
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore
getInitialClassParam, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation
 
Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.StatisticalModel
emitDataSet, getLogProbFor, getLogProbFor, getLogProbFor, getMaximalMarkovOrder
 
Methods inherited from interface de.jstacs.sequenceScores.SequenceScore
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics
 

Constructor Detail

StrandDiffSM

public StrandDiffSM(DifferentiableStatisticalModel function,
                    double forwardPartOfESS,
                    int starts,
                    boolean plugIn,
                    StrandDiffSM.InitMethod initMethod)
             throws CloneNotSupportedException,
                    WrongAlphabetException
This constructor creates a StrandDiffSM that optimizes the usage of each strand.

Parameters:
function - the DifferentiableStatisticalModel
forwardPartOfESS - the part of the full ESS that should be used as hyperparameter for the forward strand
starts - the number of starts that should be done in an optimization
plugIn - whether the initial parameters for an optimization should be related to the data or randomly drawn
initMethod - only used if plugIn==true; determines whether the initial parameters for an optimization should be related to the data of the forward strand, the backward strand or both strands
Throws:
CloneNotSupportedException - if function could not be cloned
WrongAlphabetException - if the AlphabetContainer of function is not reverse complementable (see AlphabetContainer.isReverseComplementable()) and, hence, cannot be used for a strand mixture
See Also:
StrandDiffSM.InitMethod

StrandDiffSM

public StrandDiffSM(DifferentiableStatisticalModel function,
                    int starts,
                    boolean plugIn,
                    StrandDiffSM.InitMethod initMethod,
                    double forward)
             throws CloneNotSupportedException,
                    WrongAlphabetException
This constructor creates a StrandDiffSM that has a fixed frequency for the strand usage.

Parameters:
function - the DifferentiableStatisticalModel
starts - the number of starts that should be done in an optimization
plugIn - whether the initial parameters for an optimization should be related to the data or randomly drawn
initMethod - only used if plugIn==true; determines whether the initial parameters for an optimization should be related to the data of the forward strand, the backward strand or both strands
forward - the probability of a motif to be on the forward strand
Throws:
CloneNotSupportedException - if function could not be cloned
WrongAlphabetException - if the AlphabetContainer of function is not reverse complementable (see AlphabetContainer.isReverseComplementable()) and, hence, cannot be used for a strand mixture
See Also:
StrandDiffSM.InitMethod

StrandDiffSM

public StrandDiffSM(StringBuffer xml)
             throws NonParsableException
This is the constructor for Storable.

Parameters:
xml - the XML representation
Throws:
NonParsableException - if the representation could not be parsed.
Method Detail

setForwardProb

protected void setForwardProb(double forward)
This method can be used to set the forward strand probability.

Parameters:
forward - the forward strand probability in (0,1)

getLogNormalizationConstantForComponent

protected double getLogNormalizationConstantForComponent(int i)
Description copied from class: AbstractMixtureDiffSM
Computes the logarithm of the normalization constant for the component i.

Specified by:
getLogNormalizationConstantForComponent in class AbstractMixtureDiffSM
Parameters:
i - the index of the component
Returns:
the logarithm of the normalization constant of the component

getLogPartialNormalizationConstant

public double getLogPartialNormalizationConstant(int parameterIndex)
                                          throws Exception
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivation of the normalization constant for the parameter with index parameterIndex,
\[\log \frac{\partial Z(\underline{\lambda})}{\partial \lambda_{parameterIndex}}.\]

Specified by:
getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel
Parameters:
parameterIndex - the index of the parameter
Returns:
the logarithm of the partial normalization constant
Throws:
Exception - if something went wrong with the normalization
See Also:
DifferentiableStatisticalModel.getLogNormalizationConstant()
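
To make the quantity concrete, the following is a minimal sketch for a toy log-linear model with independent positions, where the score of a sequence is the exponential of the sum of lambda[position][symbol]. For this toy model the normalization constant is a product of per-position sums, so the partial derivative has a closed form. The class PartialNormSketch is hypothetical and only illustrates the definition; it does not reproduce how StrandDiffSM computes the value.

```java
/** Hypothetical toy model illustrating log dZ/dlambda; not part of Jstacs. */
final class PartialNormSketch {

    /** Numerically stable log( sum_k exp(v[k]) ). */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        double sum = 0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    /** log Z, where Z is the product over positions of sum_a exp(lambda[l][a]). */
    static double logNorm(double[][] lambda) {
        double logZ = 0;
        for (double[] row : lambda) {
            logZ += logSumExp(row);
        }
        return logZ;
    }

    /**
     * log of dZ/dlambda[pos][sym]. Only the factor of position pos depends on
     * the parameter, and its derivative is exp(lambda[pos][sym]); all other
     * factors are carried along unchanged.
     */
    static double logPartialNorm(double[][] lambda, int pos, int sym) {
        double res = lambda[pos][sym];
        for (int l = 0; l < lambda.length; l++) {
            if (l != pos) {
                res += logSumExp(lambda[l]);
            }
        }
        return res;
    }
}
```

A finite-difference quotient of exp(logNorm(lambda)) with respect to one parameter can be used to check the closed form numerically.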

getHyperparameterForHiddenParameter

public double getHyperparameterForHiddenParameter(int index)
Description copied from class: AbstractMixtureDiffSM
This method returns the hyperparameter for the hidden parameter with index index.

Specified by:
getHyperparameterForHiddenParameter in class AbstractMixtureDiffSM
Parameters:
index - the index of the hidden parameter
Returns:
the hyperparameter for the hidden parameter
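
One plausible reading of forwardPartOfESS is a fraction of the full equivalent sample size. The following is a minimal sketch under that assumption, with index 0 for the forward strand and index 1 for the reverse complementary strand; both the ordering and the class StrandHyperparameterSketch are assumptions for illustration and are not taken from the library.

```java
/** Hypothetical sketch of splitting an ESS between two strands; not part of Jstacs. */
final class StrandHyperparameterSketch {

    /**
     * Returns the hyperparameter of the hidden parameter with the given index.
     * Assumption: forwardPartOfESS is a fraction in [0,1] of the full ESS,
     * index 0 is the forward strand and index 1 the reverse complement.
     */
    static double hyperparameter(double ess, double forwardPartOfESS, int index) {
        if (forwardPartOfESS < 0 || forwardPartOfESS > 1) {
            throw new IllegalArgumentException("forwardPartOfESS must be in [0,1]");
        }
        switch (index) {
            case 0: return forwardPartOfESS * ess;
            case 1: return (1 - forwardPartOfESS) * ess;
            default: throw new IndexOutOfBoundsException("index must be 0 or 1");
        }
    }
}
```

Under this reading, the two hyperparameters always sum to the full ESS.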

getForwardProbability

public double getForwardProbability()
This method returns the a-priori probability for the forward strand.

Returns:
the a-priori probability for the forward strand

getESS

public double getESS()
Description copied from interface: DifferentiableStatisticalModel
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.

Specified by:
getESS in interface DifferentiableStatisticalModel
Returns:
the equivalent sample size.

initializeUsingPlugIn

protected void initializeUsingPlugIn(int index,
                                     boolean freeParams,
                                     DataSet[] data,
                                     double[][] weights)
                              throws Exception
Description copied from class: AbstractMixtureDiffSM
This method initializes the functions using the data in some way.

Specified by:
initializeUsingPlugIn in class AbstractMixtureDiffSM
Parameters:
index - the class index
freeParams - if true, the (reduced) parameterization is used
data - the data
weights - the weights for the data
Throws:
Exception - if the initialization could not be done
See Also:
DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][])

getInstanceName

public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...

Specified by:
getInstanceName in interface SequenceScore
Returns:
a short instance name

fillComponentScores

protected void fillComponentScores(Sequence seq,
                                   int start)
Description copied from class: AbstractMixtureDiffSM
Fills the internal array AbstractMixtureDiffSM.componentScore with the logarithmic scores of the components given a Sequence.

Specified by:
fillComponentScores in class AbstractMixtureDiffSM
Parameters:
seq - the sequence
start - the start position in seq

getLogScoreAndPartialDerivation

public double getLogScoreAndPartialDerivation(Sequence seq,
                                              int start,
                                              IntList indices,
                                              DoubleList partialDer)
Description copied from interface: DifferentiableSequenceScore
Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivations.

Specified by:
getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices, after method invocation the list should contain the indices i where $\frac{\partial \log score(seq)}{\partial \lambda_i}$ is not zero
partialDer - a DoubleList of partial derivations, after method invocation the list should contain the corresponding $\frac{\partial \log score(seq)}{\partial \lambda_i}$ that are not zero
Returns:
the logarithmic score for the Sequence
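
The contract of filling two parallel lists with only the non-zero partial derivations can be sketched with a toy model. The example below assumes a log-score that is the sum of lambda[position][symbol] over the motif positions, so the derivation is 1 for each parameter that is used and 0 otherwise. It uses java.util lists as stand-ins for IntList and DoubleList, and the class SparseGradientSketch and its flat parameter indexing are hypothetical.

```java
import java.util.List;

/** Hypothetical sketch of a sparse gradient for a toy log-linear score; not part of Jstacs. */
final class SparseGradientSketch {

    /**
     * Returns the log-score of the subsequence of seq beginning at start and
     * appends, for every parameter with a non-zero partial derivation, its
     * flat index (position * alphabet size + symbol) and the derivation value.
     */
    static double logScoreAndPartialDerivation(double[][] lambda, int[] seq, int start,
            List<Integer> indices, List<Double> partialDer) {
        double logScore = 0;
        for (int l = 0; l < lambda.length; l++) {
            int sym = seq[start + l];
            logScore += lambda[l][sym];
            // d logScore / d lambda[l][sym] = 1; all other parameters of
            // position l have derivation 0 and are therefore not listed
            indices.add(l * lambda[l].length + sym);
            partialDer.add(1.0);
        }
        return logScore;
    }
}
```

Listing only the non-zero entries keeps the gradient computation proportional to the motif length rather than to the total number of parameters.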

getFurtherInformation

protected StringBuffer getFurtherInformation()
Description copied from class: AbstractMixtureDiffSM
This method is used to append further information of the instance to the XML representation. This method is designed to allow subclasses to add information to the XML representation.

Overrides:
getFurtherInformation in class AbstractMixtureDiffSM
Returns:
the further information as XML code in a StringBuffer
See Also:
AbstractMixtureDiffSM.extractFurtherInformation(StringBuffer)

extractFurtherInformation

protected void extractFurtherInformation(StringBuffer xml)
                                  throws NonParsableException
Description copied from class: AbstractMixtureDiffSM
This method is the opposite of AbstractMixtureDiffSM.getFurtherInformation(). It extracts further information of the instance from an XML representation.

Overrides:
extractFurtherInformation in class AbstractMixtureDiffSM
Parameters:
xml - the StringBuffer containing the information to be extracted as XML code
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also:
AbstractMixtureDiffSM.getFurtherInformation()

init

protected void init(boolean freeParams)
Description copied from class: AbstractMixtureDiffSM
This method creates the underlying structure for the parameters.

Overrides:
init in class AbstractMixtureDiffSM
Parameters:
freeParams - indicates whether to use only free parameters or all parameters

toString

public String toString()
Overrides:
toString in class Object

modify

public boolean modify(int offsetLeft,
                      int offsetRight)
Description copied from interface: Mutable
Manually modifies the model. The two offsets offsetLeft and offsetRight define how many positions the left or right border positions shall be moved. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right.

Specified by:
modify in interface Mutable
Parameters:
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns:
true if the motif model was modified, otherwise false
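
The effect of the two offsets on the motif borders can be sketched on a plain matrix with one row per motif position. This is a minimal sketch, assuming that newly added positions are initialized uniformly and that a modification resulting in fewer than one position is rejected; unlike Mutable.modify, the hypothetical method below returns a new matrix (or null) instead of changing a model in place.

```java
import java.util.Arrays;

/** Hypothetical sketch of moving motif borders; not part of Jstacs. */
final class ModifySketch {

    /**
     * Moves the left border by offsetLeft and the right border by offsetRight
     * positions (negative = to the left, positive = to the right), so the new
     * length is pwm.length - offsetLeft + offsetRight. Returns null if the
     * result would be empty. The input pwm must have at least one row.
     */
    static double[][] modify(double[][] pwm, int offsetLeft, int offsetRight) {
        int newLength = pwm.length - offsetLeft + offsetRight;
        if (newLength < 1) {
            return null;
        }
        int alphabetSize = pwm[0].length;
        double[][] res = new double[newLength][];
        for (int l = 0; l < newLength; l++) {
            int old = l + offsetLeft; // corresponding position in the old motif
            if (old >= 0 && old < pwm.length) {
                res[l] = pwm[old].clone();
            } else {
                // position outside the old motif: start from a uniform distribution
                res[l] = new double[alphabetSize];
                Arrays.fill(res[l], 1.0 / alphabetSize);
            }
        }
        return res;
    }
}
```

For example, modify(pwm, -1, 0) extends the motif by one position on the left, while modify(pwm, 1, -1) trims one position from each side.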

getReverseComplementDistributions

public static double[][][] getReverseComplementDistributions(ComplementableDiscreteAlphabet abc,
                                                             double[][][] condDistr)
This method computes the reverse complement distributions for given conditional distributions. This method is used to determine the context of a motif.

Parameters:
abc - the alphabet
condDistr - the conditional distribution
Returns:
the reverse complement of the conditional distributions, which can be used for computing a combined conditional distribution
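
For the simplest case of position-wise distributions without context, the reverse complement amounts to reversing the order of the positions and swapping each symbol with its complement. The following is a minimal sketch of that special case only, assuming the symbol order A, C, G, T so that the complement of symbol index a is 3 - a; the class ReverseComplementPwmSketch is hypothetical and does not handle the three-dimensional conditional distributions the method above accepts.

```java
/** Hypothetical order-0 sketch of reverse complement distributions; not part of Jstacs. */
final class ReverseComplementPwmSketch {

    /**
     * Returns rc with rc[l][a] = pwm[L-1-l][complement(a)], where L is the
     * number of positions. Assumes an alphabet ordered so that the complement
     * of symbol index a is (alphabet size - 1 - a), as for A, C, G, T.
     */
    static double[][] reverseComplement(double[][] pwm) {
        int len = pwm.length;
        double[][] rc = new double[len][];
        for (int l = 0; l < len; l++) {
            double[] src = pwm[len - 1 - l];
            rc[l] = new double[src.length];
            for (int a = 0; a < src.length; a++) {
                rc[l][a] = src[src.length - 1 - a];
            }
        }
        return rc;
    }
}
```

Applying the method twice returns a matrix equal to the original, which is a quick sanity check for any implementation of this operation.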

getStrand

public StrandedLocatedSequenceAnnotationWithLength.Strand getStrand(Sequence seq,
                                                                    int startPos)
This method returns the preferred StrandedLocatedSequenceAnnotationWithLength.Strand for a given subsequence.

Parameters:
seq - the sequence
startPos - the start position
Returns:
the StrandedLocatedSequenceAnnotationWithLength.Strand of this subsequence
See Also:
AbstractMixtureDiffSM.getIndexOfMaximalComponentFor(Sequence, int)
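
Choosing the preferred strand can be read as picking the mixture component with the larger weighted score. The following is a minimal sketch, assuming that each component score is the log strand probability plus the log-score of the motif model on that strand, and that ties are resolved in favor of the forward strand; the class StrandChoiceSketch and its enum are hypothetical stand-ins for the library types.

```java
/** Hypothetical sketch of choosing the preferred strand; not part of Jstacs. */
final class StrandChoiceSketch {

    /** Stand-in for StrandedLocatedSequenceAnnotationWithLength.Strand. */
    enum Strand { FORWARD, REVERSE }

    /**
     * Returns the strand whose weighted log-score is maximal, for a forward
     * strand probability in (0,1). Ties go to the forward strand, which is an
     * arbitrary choice of this sketch.
     */
    static Strand preferredStrand(double logScoreForward, double logScoreReverse,
            double forwardProb) {
        double fwd = Math.log(forwardProb) + logScoreForward;
        double rev = Math.log(1 - forwardProb) + logScoreReverse;
        return fwd >= rev ? Strand.FORWARD : Strand.REVERSE;
    }
}
```

Note that a strongly skewed forward probability can outweigh a better motif score on the other strand.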

isStrandModel

public static boolean isStrandModel(DifferentiableStatisticalModel nsf)
Check whether a DifferentiableStatisticalModel is a StrandDiffSM.

Parameters:
nsf - the original DifferentiableStatisticalModel
Returns:
true if the DifferentiableStatisticalModel is a StrandDiffSM