de.jstacs.scoringFunctions.mix
Class StrandScoringFunction

java.lang.Object
  extended by de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
      extended by de.jstacs.scoringFunctions.mix.AbstractMixtureScoringFunction
          extended by de.jstacs.scoringFunctions.mix.StrandScoringFunction
All Implemented Interfaces:
Mutable, NormalizableScoringFunction, ScoringFunction, Storable, Cloneable

public class StrandScoringFunction
extends AbstractMixtureScoringFunction
implements Mutable

This class enables the user to search on both strands, so that a motif can be found on the forward strand or on the reverse complementary strand.
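
The idea of scoring both strands can be pictured with a small, self-contained sketch. It does not use the Jstacs API: the class StrandMixtureSketch and its methods are hypothetical, and it assumes that the overall score is a two-component mixture of a forward-strand score and a reverse-complement score, combined in log space with a forward strand probability.

```java
// Hypothetical sketch, not Jstacs code: combines two per-strand log scores
// into one mixture log score.
final class StrandMixtureSketch {

    // Numerically stable log(exp(a) + exp(b)).
    static double logSumExp(double a, double b) {
        double max = Math.max(a, b);
        if (max == Double.NEGATIVE_INFINITY) {
            return Double.NEGATIVE_INFINITY;
        }
        return max + Math.log(Math.exp(a - max) + Math.exp(b - max));
    }

    // forwardProb is the assumed probability of the forward strand, in (0,1).
    static double mixtureLogScore(double logScoreForward, double logScoreReverse, double forwardProb) {
        return logSumExp(Math.log(forwardProb) + logScoreForward,
                         Math.log(1.0 - forwardProb) + logScoreReverse);
    }

    public static void main(String[] args) {
        // Made-up component log scores for one subsequence.
        System.out.println(mixtureLogScore(-4.2, -9.7, 0.5));
    }
}
```

Working in log space avoids underflow when the per-strand scores are products of many small probabilities.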

Author:
Jens Keilwagen
See Also:
ComplementableDiscreteAlphabet, AlphabetContainer.isReverseComplementable()

Nested Class Summary
static class StrandScoringFunction.InitMethod
          This enum defines the different types of plug-in initialization of a StrandScoringFunction.
 
Field Summary
 
Fields inherited from class de.jstacs.scoringFunctions.mix.AbstractMixtureScoringFunction
componentScore, dList, freeParams, function, hiddenParameter, hiddenPotential, iList, isNormalized, logGammaSum, logHiddenNorm, logHiddenPotential, norm, optimizeHidden, paramRef, partNorm
 
Fields inherited from class de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
alphabets, length, r
 
Fields inherited from interface de.jstacs.scoringFunctions.ScoringFunction
UNKNOWN
 
Constructor Summary
StrandScoringFunction(NormalizableScoringFunction function, double forwardPartOfESS, int starts, boolean plugIn, StrandScoringFunction.InitMethod initMethod)
          This constructor creates a StrandScoringFunction that optimizes the usage of each strand.
StrandScoringFunction(NormalizableScoringFunction function, int starts, boolean plugIn, StrandScoringFunction.InitMethod initMethod, double forward)
          This constructor creates a StrandScoringFunction that has a fixed frequency for the strand usage.
StrandScoringFunction(StringBuffer xml)
          This is the constructor for Storable.
 
Method Summary
 int[] determineNotSignificantPositions(double samples, double[] weightsLeft, double[] weightsRight, double[][][][] contrastLeft, double[][][][] contrastRight, double sign)
          This method determines the number of not significant positions from each side of the motif using the significance level sign and the contrast distributions of the left or right side, contrastLeft and contrastRight, respectively.
protected  void extractFurtherInformation(StringBuffer xml)
          This method is the opposite of AbstractMixtureScoringFunction.getFurtherInformation().
protected  void fillComponentScores(Sequence seq, int start)
          Fills the internal array AbstractMixtureScoringFunction.componentScore with the logarithmic scores of the components given a Sequence.
 double getEss()
          Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
protected  StringBuffer getFurtherInformation()
          This method is used to append further information of the instance to the XML representation.
 double getHyperparameterForHiddenParameter(int index)
          This method returns the hyperparameter for the hidden parameter with index index.
 String getInstanceName()
          Returns a short instance name.
 double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
          Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives.
protected  double getNormalizationConstantForComponent(int i)
          Computes the normalization constant for the component i.
 double getPartialNormalizationConstant(int parameterIndex)
          Returns the partial normalization constant for the parameter with index parameterIndex.
static double[][][] getReverseComplementDistributions(ComplementableDiscreteAlphabet abc, double[][][] condDistr)
          This method computes the reverse complement distributions for given conditional distributions.
 StrandedLocatedSequenceAnnotationWithLength.Strand getStrand(Sequence seq, int startPos)
          This method returns the preferred StrandedLocatedSequenceAnnotationWithLength.Strand for a given subsequence.
protected  void init(boolean freeParams)
          This method creates the underlying structure for the parameters.
protected  void initializeUsingPlugIn(int index, boolean freeParams, Sample[] data, double[][] weights)
          This method initializes the functions using the data in some way.
 boolean modify(double[] weightsLeft, double[] weightsRight, double[][][][] replacementLeft, double[][][][] replacementRight, int offsetLeft, int offsetRight)
          Manually modifies the model.
protected  void setForwardProb(double forward)
          This method can be used to set the forward strand probability.
 String toString()
           
 
Methods inherited from class de.jstacs.scoringFunctions.mix.AbstractMixtureScoringFunction
addGradientOfLogPriorTerm, clone, cloneFunctions, computeHiddenParameter, computeLogGammaSum, fromXML, getCurrentParameterValues, getFunction, getFunctions, getIndexOfMaximalComponentFor, getIndices, getLogPriorTerm, getLogScore, getMaxIndex, getNormalizationConstant, getNumberOfComponents, getNumberOfParameters, getNumberOfRecommendedStarts, getProbsForComponent, getScoringFunctions, getSizeOfEventSpaceForRandomVariablesOfParameter, getXMLTag, initializeFunction, initializeFunctionRandomly, initializeHiddenPotentialRandomly, initializeHiddenUniformly, initWithLength, isInitialized, isNormalized, precomputeNorm, setHiddenParameters, setParameters, setParametersForFunction, toXML
 
Methods inherited from class de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
getAlphabetContainer, getInitialClassParam, getLength, getLogScore, getLogScoreAndPartialDerivation, isNormalized
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

StrandScoringFunction

public StrandScoringFunction(NormalizableScoringFunction function,
                             double forwardPartOfESS,
                             int starts,
                             boolean plugIn,
                             StrandScoringFunction.InitMethod initMethod)
                      throws CloneNotSupportedException,
                             WrongAlphabetException
This constructor creates a StrandScoringFunction that optimizes the usage of each strand.

Parameters:
function - the ScoringFunction
forwardPartOfESS - the part of the full ESS that should be used as hyperparameter for the forward strand
starts - the number of starts that should be done in an optimization
plugIn - whether the initial parameters for an optimization should be related to the data or randomly drawn
initMethod - only used if plugIn==true
whether the initial parameters for an optimization should be related to the data of the forward strand, the backward strand or both strands
Throws:
CloneNotSupportedException
WrongAlphabetException
See Also:
StrandScoringFunction.InitMethod
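
One way to read forwardPartOfESS is sketched below. This is an assumption, not a statement about the Jstacs implementation: the sketch treats forwardPartOfESS as a fraction in [0,1] of the full equivalent sample size, so that the hyperparameters of the two strands sum to the full ESS. The class EssSplitSketch and the method strandHyperparameters are hypothetical names.

```java
// Hypothetical sketch, not Jstacs code: splits an equivalent sample size (ESS)
// into a forward-strand and a backward-strand hyperparameter.
final class EssSplitSketch {

    // Returns {forward hyperparameter, backward hyperparameter}.
    // Assumes forwardPartOfESS is a fraction of the full ESS.
    static double[] strandHyperparameters(double ess, double forwardPartOfESS) {
        if (ess < 0.0 || forwardPartOfESS < 0.0 || forwardPartOfESS > 1.0) {
            throw new IllegalArgumentException("ess must be >= 0 and forwardPartOfESS in [0,1]");
        }
        return new double[] { ess * forwardPartOfESS, ess * (1.0 - forwardPartOfESS) };
    }

    public static void main(String[] args) {
        double[] hyper = strandHyperparameters(4.0, 0.5);
        System.out.println(hyper[0] + " " + hyper[1]);
    }
}
```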

StrandScoringFunction

public StrandScoringFunction(NormalizableScoringFunction function,
                             int starts,
                             boolean plugIn,
                             StrandScoringFunction.InitMethod initMethod,
                             double forward)
                      throws CloneNotSupportedException,
                             WrongAlphabetException
This constructor creates a StrandScoringFunction that has a fixed frequency for the strand usage.

Parameters:
function - the ScoringFunction
starts - the number of starts that should be done in an optimization
plugIn - whether the initial parameters for an optimization should be related to the data or randomly drawn
initMethod - only used if plugIn==true
whether the initial parameters for an optimization should be related to the data of the forward strand, the backward strand or both strands
forward - the probability of a motif to be on the forward strand
Throws:
CloneNotSupportedException
WrongAlphabetException
See Also:
StrandScoringFunction.InitMethod

StrandScoringFunction

public StrandScoringFunction(StringBuffer xml)
                      throws NonParsableException
This is the constructor for Storable.

Parameters:
xml - the XML representation
Throws:
NonParsableException - if the representation could not be parsed.
Method Detail

setForwardProb

protected void setForwardProb(double forward)
This method can be used to set the forward strand probability.

Parameters:
forward - the forward strand probability in (0,1)
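
The open interval (0,1) matters because a probability of exactly 0 or 1 would give one strand a log weight of negative infinity. The following sketch illustrates such a range check; it is not the Jstacs implementation, and the class ForwardProbSketch and the method logStrandWeights are hypothetical.

```java
// Hypothetical sketch, not Jstacs code: validates a forward strand probability
// and converts it to log weights for the two strands.
final class ForwardProbSketch {

    // Returns {log(forward), log(1 - forward)}.
    static double[] logStrandWeights(double forward) {
        // The negated comparison also rejects NaN.
        if (!(forward > 0.0 && forward < 1.0)) {
            throw new IllegalArgumentException("forward must lie in (0,1), got " + forward);
        }
        return new double[] { Math.log(forward), Math.log1p(-forward) };
    }

    public static void main(String[] args) {
        double[] w = logStrandWeights(0.8);
        System.out.println(w[0] + " " + w[1]);
    }
}
```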

getNormalizationConstantForComponent

protected double getNormalizationConstantForComponent(int i)
Description copied from class: AbstractMixtureScoringFunction
Computes the normalization constant for the component i.

Specified by:
getNormalizationConstantForComponent in class AbstractMixtureScoringFunction
Parameters:
i - the index of the component
Returns:
the normalization constant of the component

getPartialNormalizationConstant

public double getPartialNormalizationConstant(int parameterIndex)
                                       throws Exception
Description copied from interface: NormalizableScoringFunction
Returns the partial normalization constant for the parameter with index parameterIndex. This is the partial derivative of the normalization constant with respect to the parameter with index parameterIndex, in LaTeX notation: \frac{\partial Z(\lambda)}{\partial \lambda_{parameterIndex}}.

Specified by:
getPartialNormalizationConstant in interface NormalizableScoringFunction
Parameters:
parameterIndex - the index of the parameter
Returns:
the partial normalization constant
Throws:
Exception - if something went wrong with the normalization
See Also:
NormalizableScoringFunction.getNormalizationConstant()

getHyperparameterForHiddenParameter

public double getHyperparameterForHiddenParameter(int index)
Description copied from class: AbstractMixtureScoringFunction
This method returns the hyperparameter for the hidden parameter with index index.

Specified by:
getHyperparameterForHiddenParameter in class AbstractMixtureScoringFunction
Parameters:
index - the index of the hidden parameter
Returns:
the hyperparameter for the hidden parameter

getEss

public double getEss()
Description copied from interface: NormalizableScoringFunction
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.

Specified by:
getEss in interface NormalizableScoringFunction
Returns:
the equivalent sample size.

initializeUsingPlugIn

protected void initializeUsingPlugIn(int index,
                                     boolean freeParams,
                                     Sample[] data,
                                     double[][] weights)
                              throws Exception
Description copied from class: AbstractMixtureScoringFunction
This method initializes the functions using the data in some way.

Specified by:
initializeUsingPlugIn in class AbstractMixtureScoringFunction
Parameters:
index - the class index
freeParams - if true, the (reduced) parameterization is used
data - the data
weights - the weights for the data
Throws:
Exception - if the initialization could not be done
See Also:
ScoringFunction.initializeFunction(int, boolean, Sample[], double[][])

getInstanceName

public String getInstanceName()
Description copied from interface: ScoringFunction
Returns a short instance name.

Specified by:
getInstanceName in interface ScoringFunction
Returns:
a short instance name

fillComponentScores

protected void fillComponentScores(Sequence seq,
                                   int start)
Description copied from class: AbstractMixtureScoringFunction
Fills the internal array AbstractMixtureScoringFunction.componentScore with the logarithmic scores of the components given a Sequence.

Specified by:
fillComponentScores in class AbstractMixtureScoringFunction
Parameters:
seq - the sequence
start - the start position in seq

getLogScoreAndPartialDerivation

public double getLogScoreAndPartialDerivation(Sequence seq,
                                              int start,
                                              IntList indices,
                                              DoubleList partialDer)
Description copied from interface: ScoringFunction
Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives.

Specified by:
getLogScoreAndPartialDerivation in interface ScoringFunction
Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i where \frac{\partial \log score(seq)}{\partial \lambda_i} is not zero
partialDer - a DoubleList of partial derivatives; after method invocation the list should contain the corresponding \frac{\partial \log score(seq)}{\partial \lambda_i}
Returns:
the logarithmic score for the Sequence
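
The contract of the two lists can be illustrated with a toy model. The sketch below is not Jstacs code: it uses java.util.List in place of IntList and DoubleList, and it assumes a simple log-linear score with one parameter per position and symbol, so that the partial derivative is 1 for every parameter that is used and 0 for all others. The class SparseGradientSketch and its parameter layout are hypothetical.

```java
// Hypothetical sketch, not Jstacs code: returns a log score and reports only
// the parameters with a non-zero partial derivative.
final class SparseGradientSketch {

    static final int ALPHABET_SIZE = 4; // A, C, G, T encoded as 0..3

    // lambda[pos * ALPHABET_SIZE + symbol] is the parameter for a symbol at a position.
    static double logScoreAndPartialDer(int[] seq, int start, double[] lambda,
            java.util.List<Integer> indices, java.util.List<Double> partialDer) {
        int length = lambda.length / ALPHABET_SIZE;
        double logScore = 0.0;
        for (int pos = 0; pos < length; pos++) {
            int paramIndex = pos * ALPHABET_SIZE + seq[start + pos];
            logScore += lambda[paramIndex];
            indices.add(paramIndex); // index i of a parameter with non-zero derivative
            partialDer.add(1.0);     // d logScore / d lambda_i for that parameter
        }
        return logScore;
    }

    public static void main(String[] args) {
        double[] lambda = { 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 }; // two positions
        int[] seq = { 3, 1, 2 };
        java.util.List<Integer> indices = new java.util.ArrayList<>();
        java.util.List<Double> partialDer = new java.util.ArrayList<>();
        double logScore = logScoreAndPartialDer(seq, 1, lambda, indices, partialDer);
        System.out.println(logScore + " " + indices + " " + partialDer);
    }
}
```

Reporting only the non-zero entries keeps the gradient sparse, which is the point of passing two parallel lists rather than one dense array.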

getFurtherInformation

protected StringBuffer getFurtherInformation()
Description copied from class: AbstractMixtureScoringFunction
This method is used to append further information of the instance to the XML representation. This method is designed to allow subclasses to add information to the XML representation.

Overrides:
getFurtherInformation in class AbstractMixtureScoringFunction
Returns:
the further information as XML code in a StringBuffer
See Also:
AbstractMixtureScoringFunction.extractFurtherInformation(StringBuffer)

extractFurtherInformation

protected void extractFurtherInformation(StringBuffer xml)
                                  throws NonParsableException
Description copied from class: AbstractMixtureScoringFunction
This method is the opposite of AbstractMixtureScoringFunction.getFurtherInformation(). It extracts further information of the instance from an XML representation.

Overrides:
extractFurtherInformation in class AbstractMixtureScoringFunction
Parameters:
xml - the StringBuffer containing the information to be extracted as XML code
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also:
AbstractMixtureScoringFunction.getFurtherInformation()

init

protected void init(boolean freeParams)
Description copied from class: AbstractMixtureScoringFunction
This method creates the underlying structure for the parameters.

Overrides:
init in class AbstractMixtureScoringFunction
Parameters:
freeParams - indicates whether to use only free parameters or all parameters

toString

public String toString()
Overrides:
toString in class Object

modify

public boolean modify(double[] weightsLeft,
                      double[] weightsRight,
                      double[][][][] replacementLeft,
                      double[][][][] replacementRight,
                      int offsetLeft,
                      int offsetRight)
Description copied from interface: Mutable
Manually modifies the model. The two offsets offsetLeft and offsetRight define how many positions the left or right border positions shall be moved. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right.

Specified by:
modify in interface Mutable
Parameters:
weightsLeft - the weights for the left replacement distributions
weightsRight - the weights for the right replacement distributions
replacementLeft - the replacement distribution for the left side
replacementRight - the replacement distribution for the right side
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns:
true if the motif model was modified otherwise false
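
The sign convention of the two offsets can be made concrete with a small sketch. It is not Jstacs code: the class BorderShiftSketch and the method shiftBorders are hypothetical, the motif is represented only by a start and an exclusive end position, and throwing an exception for an empty motif is an assumption of the sketch.

```java
// Hypothetical sketch, not Jstacs code: applies offsetLeft and offsetRight
// to the borders of a motif given as [start, endExclusive).
final class BorderShiftSketch {

    // Returns {newStart, newEndExclusive}. Negative offsets move a border
    // to the left, positive offsets move it to the right.
    static int[] shiftBorders(int start, int endExclusive, int offsetLeft, int offsetRight) {
        int newStart = start + offsetLeft;
        int newEnd = endExclusive + offsetRight;
        if (newEnd <= newStart) {
            throw new IllegalArgumentException("motif would become empty");
        }
        return new int[] { newStart, newEnd };
    }

    public static void main(String[] args) {
        // Extend a motif of length 8 by two positions on the left and one on the right.
        int[] borders = shiftBorders(10, 18, -2, 1);
        System.out.println(borders[0] + " " + borders[1] + " length " + (borders[1] - borders[0]));
    }
}
```

Under this reading the new length equals the old length minus offsetLeft plus offsetRight.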

determineNotSignificantPositions

public int[] determineNotSignificantPositions(double samples,
                                              double[] weightsLeft,
                                              double[] weightsRight,
                                              double[][][][] contrastLeft,
                                              double[][][][] contrastRight,
                                              double sign)
Description copied from interface: Mutable
This method determines the number of not significant positions from each side of the motif using the significance level sign and the contrast distributions of the left or right side, contrastLeft and contrastRight, respectively. The contrast arrays have four dimensions.
  1. The first is used for the possibility of having different contrasts (caused by different flanking models).
  2. The second is used for the order of the contrast.
  3. The third is used for the (encoded) context.
  4. The fourth is used for the realization of the random variable.
For example, if we have only one flanking model which is a homogeneous Markov model of order 0 for a DNAAlphabet, the contrast array has the dimension new double[1][1][1][4]. For the same example but with order 1, the contrast array has the dimension new double[1][1][4][4]. Left and right contrast can have different dimensions.
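
The two example shapes can be built as follows. The sketch is not Jstacs code: the class ContrastArraySketch and the method uniformContrast are hypothetical, and filling the array with uniform distributions is only a placeholder for real contrast distributions.

```java
// Hypothetical sketch, not Jstacs code: allocates a contrast array for a single
// homogeneous flanking model, with the shapes given in the example above.
final class ContrastArraySketch {

    // order 0 -> [1][1][1][alphabetSize], order 1 -> [1][1][alphabetSize][alphabetSize]
    static double[][][][] uniformContrast(int order, int alphabetSize) {
        int contexts = 1;
        for (int i = 0; i < order; i++) {
            contexts *= alphabetSize; // number of encoded contexts
        }
        double[][][][] contrast = new double[1][1][contexts][alphabetSize];
        for (double[] distribution : contrast[0][0]) {
            java.util.Arrays.fill(distribution, 1.0 / alphabetSize);
        }
        return contrast;
    }

    public static void main(String[] args) {
        double[][][][] order0 = uniformContrast(0, 4);
        double[][][][] order1 = uniformContrast(1, 4);
        System.out.println(order0[0][0].length + " x " + order0[0][0][0].length);
        System.out.println(order1[0][0].length + " x " + order1[0][0][0].length);
    }
}
```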

Specified by:
determineNotSignificantPositions in interface Mutable
Parameters:
samples - the summed weights of the Sequences containing this motif
weightsLeft - the weights for the left contrast distributions
weightsRight - the weights for the right contrast distributions
contrastLeft - the left contrast distributions
contrastRight - the right contrast distributions
sign - the significance level
Returns:
an array of length two containing at position 0 the number of not significant positions from the left side using contrastLeft and at position 1 the number of not significant positions from the right side using contrastRight
See Also:
Mutable.modify(double[], double[], double[][][][], double[][][][], int, int)

getReverseComplementDistributions

public static double[][][] getReverseComplementDistributions(ComplementableDiscreteAlphabet abc,
                                                             double[][][] condDistr)
This method computes the reverse complement distributions for given conditional distributions. This method is used to determine the context of a motif.

Parameters:
abc - the alphabet
condDistr - the conditional distribution
Returns:
the reverse complement of the conditional distribution that can be used for computing a combined conditional distribution
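
The principle of a reverse complement distribution can be shown for the simplest case. The sketch below is not the Jstacs method: it handles only position-specific distributions without context (a double[][] instead of the double[][][] conditional distributions used here), and it assumes the encoding A=0, C=1, G=2, T=3, so that the complement of symbol s is 3 - s. The class ReverseComplementSketch is hypothetical.

```java
// Hypothetical sketch, not Jstacs code: reverse complement of position-specific
// distributions pwm[position][symbol] with A=0, C=1, G=2, T=3.
final class ReverseComplementSketch {

    static double[][] reverseComplement(double[][] pwm) {
        int length = pwm.length;
        double[][] rc = new double[length][];
        for (int pos = 0; pos < length; pos++) {
            double[] source = pwm[length - 1 - pos]; // reverse the positions
            int size = source.length;
            rc[pos] = new double[size];
            for (int sym = 0; sym < size; sym++) {
                rc[pos][sym] = source[size - 1 - sym]; // complement the symbols
            }
        }
        return rc;
    }

    public static void main(String[] args) {
        double[][] pwm = { { 0.7, 0.1, 0.1, 0.1 }, { 0.1, 0.2, 0.3, 0.4 } };
        System.out.println(java.util.Arrays.deepToString(reverseComplement(pwm)));
    }
}
```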

getStrand

public StrandedLocatedSequenceAnnotationWithLength.Strand getStrand(Sequence seq,
                                                                    int startPos)
This method returns the preferred StrandedLocatedSequenceAnnotationWithLength.Strand for a given subsequence.

Parameters:
seq - the sequence
startPos - the start position
Returns:
the StrandedLocatedSequenceAnnotationWithLength.Strand of this subsequence
See Also:
AbstractMixtureScoringFunction.getIndexOfMaximalComponentFor(Sequence, int)
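
Given the reference to getIndexOfMaximalComponentFor, the preferred strand can be thought of as the component with the larger score. The sketch below illustrates that reading only: the enum SketchStrand stands in for StrandedLocatedSequenceAnnotationWithLength.Strand, the class PreferredStrandSketch is hypothetical, and resolving ties in favor of the forward strand is an assumption.

```java
// Hypothetical sketch, not Jstacs code: picks the strand whose component
// log score is larger.
final class PreferredStrandSketch {

    enum SketchStrand { FORWARD, REVERSE }

    // Ties are resolved in favor of the forward strand (an assumption of this sketch).
    static SketchStrand preferredStrand(double logScoreForward, double logScoreReverse) {
        return logScoreForward >= logScoreReverse ? SketchStrand.FORWARD : SketchStrand.REVERSE;
    }

    public static void main(String[] args) {
        // Made-up component log scores for one subsequence.
        System.out.println(preferredStrand(-3.0, -5.0));
    }
}
```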