de.jstacs.scoringFunctions
Class IndependentProductScoringFunction

java.lang.Object
  extended by de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
      extended by de.jstacs.scoringFunctions.IndependentProductScoringFunction
All Implemented Interfaces:
MotifDiscoverer, MutableMotifDiscoverer, NormalizableScoringFunction, ScoringFunction, Storable, Cloneable

public class IndependentProductScoringFunction
extends AbstractNormalizableScoringFunction
implements MutableMotifDiscoverer

This class enables the user to model parts of a sequence independently of each other. The first part of the sequence is modeled by the first NormalizableScoringFunction and has the length of that function; the second part starts directly after the first and is modeled by the second NormalizableScoringFunction, and so on.
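
The segment-wise scheme described above can be illustrated with a small, self-contained sketch. It assumes that the overall score is the product of the part scores, so that the log scores add up. The Part interface, the counting helper, and the use of String as a sequence are hypothetical stand-ins and are not part of the Jstacs API.

```java
/** Toy illustration only; none of these types belong to Jstacs. */
final class IndependentProductSketch {

    /** Hypothetical stand-in for a component: a fixed length plus a log score of one segment. */
    interface Part {
        int length();
        double logScore(String seq, int start);
    }

    /** Sums the log scores of consecutive, non-overlapping segments. */
    static double logScore(Part[] parts, String seq, int start) {
        double sum = 0.0;
        int offset = start;
        for (Part p : parts) {
            sum += p.logScore(seq, offset); // each part scores only its own segment
            offset += p.length();           // the next part starts directly after it
        }
        return sum;
    }

    /** Hypothetical part whose log score is the number of occurrences of c in its segment. */
    static Part counting(final char c, final int length) {
        return new Part() {
            public int length() {
                return length;
            }

            public double logScore(String seq, int start) {
                double s = 0.0;
                for (int i = start; i < start + length; i++) {
                    if (seq.charAt(i) == c) {
                        s += 1.0;
                    }
                }
                return s;
            }
        };
    }

    public static void main(String[] args) {
        // First part covers positions 0..2, second part covers positions 3..4.
        Part[] parts = { counting('A', 3), counting('C', 2) };
        System.out.println(logScore(parts, "AACCC", 0));
    }
}
```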

Author:
Jens Keilwagen

Nested Class Summary
 
Nested classes/interfaces inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer
MotifDiscoverer.KindOfProfile
 
Field Summary
 
Fields inherited from class de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
alphabets, length, r
 
Fields inherited from interface de.jstacs.scoringFunctions.ScoringFunction
UNKNOWN
 
Constructor Summary
IndependentProductScoringFunction(NormalizableScoringFunction... functions)
          This constructor creates an instance of an IndependentProductScoringFunction from a given series of independent NormalizableScoringFunctions.
IndependentProductScoringFunction(NormalizableScoringFunction[] functions, int[] length)
          This constructor creates an instance of an IndependentProductScoringFunction from a given series of independent NormalizableScoringFunctions and lengths.
IndependentProductScoringFunction(StringBuffer source)
          This is the constructor for the interface Storable.
 
Method Summary
 void addGradientOfLogPriorTerm(double[] grad, int start)
          This method computes the gradient of NormalizableScoringFunction.getLogPriorTerm() for each parameter of this model.
 IndependentProductScoringFunction clone()
          Creates a clone (deep copy) of the current ScoringFunction instance.
 int[] determineNotSignificantPositionsFor(int motif, Sample[] data, double[][] weights, int classIdx)
          This method determines the number of non-significant positions from each side of the motif with index motif.
protected  void fromXML(StringBuffer rep)
          This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer.
 double[] getCurrentParameterValues()
          Returns a double array of dimension ScoringFunction.getNumberOfParameters() containing the current parameter values.
 double getEss()
          Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
 int getGlobalIndexOfMotifInComponent(int component, int motif)
          Returns the global index of the motif used in component.
 int getIndexOfMaximalComponentFor(Sequence sequence)
          Returns the index of the component with the maximal score for the sequence sequence.
 String getInstanceName()
          Returns a short instance name.
 double getLogPriorTerm()
          This method computes a value that is proportional to NormalizableScoringFunction.getEss() * Math.log( NormalizableScoringFunction.getNormalizationConstant() ) + Math.log( prior ).
 double getLogScore(Sequence seq, int start)
          Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence.
 double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
          Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives.
 int getMotifLength(int motif)
          This method returns the length of the motif with index motif.
 double getNormalizationConstant()
          Returns the sum of the scores over all sequences of the event space.
 int getNumberOfComponents()
          Returns the number of components in this MotifDiscoverer.
 int getNumberOfMotifs()
          Returns the number of motifs for this MotifDiscoverer.
 int getNumberOfMotifsInComponent(int component)
          Returns the number of motifs that are used in the component component of this MotifDiscoverer.
 int getNumberOfParameters()
          Returns the number of parameters in this ScoringFunction.
 int getNumberOfRecommendedStarts()
          This method returns the number of recommended optimization starts.
 double getPartialNormalizationConstant(int parameterIndex)
          Returns the partial normalization constant for the parameter with index parameterIndex.
 double[] getProfileOfScoresFor(int component, int motif, Sequence sequence, int startpos, MotifDiscoverer.KindOfProfile dist)
          Returns the profile of the scores for component component and motif motif at all possible start positions of the motif in the sequence sequence beginning at startpos.
 int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
          Returns the size of the event space of the random variables that are affected by parameter no. index.
 StrandedLocatedSequenceAnnotationWithLength.Strand getStrandFor(int component, int motif, Sequence sequence, int startpos)
          This method returns the strand for a given subsequence if it is considered as site of the motif model in a specific component.
 void initializeFunction(int index, boolean freeParams, Sample[] data, double[][] weights)
          This method creates the underlying structure of the ScoringFunction.
 void initializeFunctionRandomly(boolean freeParams)
          This method initializes the ScoringFunction randomly.
 void initializeMotif(int motifIndex, Sample data, double[] weights)
          This method allows the model of a motif to be initialized manually using a weighted sample.
 boolean isInitialized()
          This method can be used to determine whether the model is initialized.
 boolean modifyMotif(int motifIndex, double[] weightsLeft, double[] weightsRight, double[][][][] replacementLeft, double[][][][] replacementRight, int offsetLeft, int offsetRight)
          Manually modifies the motif model with index motifIndex.
 boolean modifyMotif(int motifIndex, int offsetLeft, int offsetRight)
          Manually modifies the motif model with index motifIndex.
 void setParameters(double[] params, int start)
          This method sets the internal parameters to the values of params between start and start + ScoringFunction.getNumberOfParameters() - 1.
 String toString()
           
 StringBuffer toXML()
          This method returns an XML representation as StringBuffer of an instance of the implementing class.
 
Methods inherited from class de.jstacs.scoringFunctions.AbstractNormalizableScoringFunction
getAlphabetContainer, getInitialClassParam, getLength, getLogScore, getLogScoreAndPartialDerivation, isNormalized, isNormalized
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

IndependentProductScoringFunction

public IndependentProductScoringFunction(NormalizableScoringFunction... functions)
                                  throws CloneNotSupportedException,
                                         IllegalArgumentException
This constructor creates an instance of an IndependentProductScoringFunction from a given series of independent NormalizableScoringFunctions. The length that is modeled by each component is determined by ScoringFunction.getLength(), so no component may have length 0.

Parameters:
functions - the components, i.e. the given series of independent NormalizableScoringFunctions
Throws:
CloneNotSupportedException - if at least one element of functions could not be cloned
IllegalArgumentException - if at least one component has length 0 or the components do not have the same equivalent sample size (ess)
See Also:
IndependentProductScoringFunction(NormalizableScoringFunction[], int[])
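
The two conditions listed under Throws can be pictured with the following sketch. It is only an illustration of the documented contract, not the constructor's actual code: the class name, the plain int[] and double[] inputs, and the exact comparison of ess values are assumptions made here.

```java
/** Toy illustration of the documented argument checks; not Jstacs code. */
final class ComponentCheckSketch {

    /**
     * Validates hypothetical per-component lengths and ess values and
     * returns the total length modeled by all components together.
     */
    static int totalLength(int[] lengths, double[] ess) {
        if (lengths.length != ess.length) {
            throw new IllegalArgumentException("one length and one ess per component expected");
        }
        int total = 0;
        for (int i = 0; i < lengths.length; i++) {
            if (lengths[i] == 0) {
                throw new IllegalArgumentException("component " + i + " has length 0");
            }
            // Assumption: "same ess" is checked by exact comparison.
            if (ess[i] != ess[0]) {
                throw new IllegalArgumentException("component " + i + " has a different ess");
            }
            total += lengths[i];
        }
        return total;
    }
}
```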

IndependentProductScoringFunction

public IndependentProductScoringFunction(NormalizableScoringFunction[] functions,
                                         int[] length)
                                  throws CloneNotSupportedException,
                                         IllegalArgumentException
This constructor creates an instance of an IndependentProductScoringFunction from a given series of independent NormalizableScoringFunctions and lengths.

Parameters:
functions - the components, i.e. the given series of independent NormalizableScoringFunctions
length - the lengths, one for each component
Throws:
CloneNotSupportedException - if at least one component could not be cloned
IllegalArgumentException - if the lengths and the components are not matching or the components do not have the same equivalent sample size (ess)
See Also:
IndependentProductScoringFunction(NormalizableScoringFunction[], int[])

IndependentProductScoringFunction

public IndependentProductScoringFunction(StringBuffer source)
                                  throws NonParsableException
This is the constructor for the interface Storable. Creates a new IndependentProductScoringFunction out of a StringBuffer as returned by toXML().

Parameters:
source - the XML representation as StringBuffer
Throws:
NonParsableException - if the XML representation could not be parsed
Method Detail

clone

public IndependentProductScoringFunction clone()
                                        throws CloneNotSupportedException
Description copied from interface: ScoringFunction
Creates a clone (deep copy) of the current ScoringFunction instance.

Specified by:
clone in interface ScoringFunction
Overrides:
clone in class AbstractNormalizableScoringFunction
Returns:
the cloned instance of the current ScoringFunction
Throws:
CloneNotSupportedException - if something went wrong while cloning the ScoringFunction

getSizeOfEventSpaceForRandomVariablesOfParameter

public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Description copied from interface: NormalizableScoringFunction
Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...

Specified by:
getSizeOfEventSpaceForRandomVariablesOfParameter in interface NormalizableScoringFunction
Parameters:
index - the index of the parameter
Returns:
the size of the event space

getNormalizationConstant

public double getNormalizationConstant()
Description copied from interface: NormalizableScoringFunction
Returns the sum of the scores over all sequences of the event space.

Specified by:
getNormalizationConstant in interface NormalizableScoringFunction
Returns:
the normalization constant Z
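
For a model whose score factorizes over disjoint parts of the sequence, the sum over the whole event space can be expected to factorize as well. The derivation below is a sketch under that assumption; the symbols s_k, Z_k, and x^{(k)} (score, normalization constant, and sequence segment of the k-th of K parts) are introduced here for illustration and do not appear in the Jstacs API.

```latex
Z \;=\; \sum_{x} \prod_{k=1}^{K} s_k\bigl(x^{(k)}\bigr)
  \;=\; \prod_{k=1}^{K} \sum_{x^{(k)}} s_k\bigl(x^{(k)}\bigr)
  \;=\; \prod_{k=1}^{K} Z_k
```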

getPartialNormalizationConstant

public double getPartialNormalizationConstant(int parameterIndex)
                                       throws Exception
Description copied from interface: NormalizableScoringFunction
Returns the partial normalization constant for the parameter with index parameterIndex. This is the partial derivative of the normalization constant with respect to the parameter with index parameterIndex, in LaTeX notation: \frac{\partial Z(\lambda)}{\partial \lambda_{index}}.

Specified by:
getPartialNormalizationConstant in interface NormalizableScoringFunction
Parameters:
parameterIndex - the index of the parameter
Returns:
the partial normalization constant
Throws:
Exception - if something went wrong with the normalization
See Also:
NormalizableScoringFunction.getNormalizationConstant()
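
If the normalization constant is assumed to be a product of per-part constants Z_j and the parameter with index parameterIndex belongs to exactly one part k, the product rule gives the expression below. Both assumptions and the symbols Z_j and k are introduced here for illustration; the source does not state how this class computes the value.

```latex
\frac{\partial Z(\lambda)}{\partial \lambda_{index}}
  \;=\; \frac{\partial Z_k(\lambda)}{\partial \lambda_{index}} \prod_{j \neq k} Z_j(\lambda)
```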

getEss

public double getEss()
Description copied from interface: NormalizableScoringFunction
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.

Specified by:
getEss in interface NormalizableScoringFunction
Returns:
the equivalent sample size.

initializeFunction

public void initializeFunction(int index,
                               boolean freeParams,
                               Sample[] data,
                               double[][] weights)
                        throws Exception
Description copied from interface: ScoringFunction
This method creates the underlying structure of the ScoringFunction.

Specified by:
initializeFunction in interface ScoringFunction
Parameters:
index - the index of the class the ScoringFunction models
freeParams - indicates whether the (reduced) parameterization is used
data - the samples
weights - the weights of the sequences in the samples
Throws:
Exception - if something went wrong

fromXML

protected void fromXML(StringBuffer rep)
                throws NonParsableException
Description copied from class: AbstractNormalizableScoringFunction
This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer.

Specified by:
fromXML in class AbstractNormalizableScoringFunction
Parameters:
rep - the XML representation as StringBuffer
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also:
AbstractNormalizableScoringFunction.AbstractNormalizableScoringFunction(StringBuffer)

getInstanceName

public String getInstanceName()
Description copied from interface: ScoringFunction
Returns a short instance name.

Specified by:
getInstanceName in interface ScoringFunction
Returns:
a short instance name

getCurrentParameterValues

public double[] getCurrentParameterValues()
                                   throws Exception
Description copied from interface: ScoringFunction
Returns a double array of dimension ScoringFunction.getNumberOfParameters() containing the current parameter values. To use these parameters as the starting point of an optimization, it is highly recommended to invoke ScoringFunction.initializeFunction(int, boolean, Sample[], double[][]) first. After an optimization, this method can be used to get the current parameter values.

Specified by:
getCurrentParameterValues in interface ScoringFunction
Returns:
the current parameter values
Throws:
Exception - if no parameters exist (yet)

getLogScore

public double getLogScore(Sequence seq,
                          int start)
Description copied from interface: ScoringFunction
Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence.

Specified by:
getLogScore in interface ScoringFunction
Parameters:
seq - the Sequence
start - the start position in the Sequence
Returns:
the logarithmic score for the Sequence

getLogScoreAndPartialDerivation

public double getLogScoreAndPartialDerivation(Sequence seq,
                                              int start,
                                              IntList indices,
                                              DoubleList partialDer)
Description copied from interface: ScoringFunction
Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives.

Specified by:
getLogScoreAndPartialDerivation in interface ScoringFunction
Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i where \frac{\partial \log score(seq)}{\partial \lambda_i} is not zero
partialDer - a DoubleList of partial derivatives; after method invocation the list should contain the corresponding \frac{\partial \log score(seq)}{\partial \lambda_i}
Returns:
the logarithmic score for the Sequence
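
The two lists form a sparse representation of the gradient: only non-zero partial derivatives are reported, each paired with its parameter index. The sketch below illustrates the idea with java.util lists standing in for IntList and DoubleList; the toy log score (a sum of the parameters selected by the observed symbols) is an assumption chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of an index/value gradient representation; not Jstacs code. */
final class SparseGradientSketch {

    /**
     * Toy log score: the sum of params[i] over all observed indices i.
     * Each observation contributes a partial derivative of 1 for its index.
     */
    static double logScoreAndPartialDer(double[] params, int[] observed,
                                        List<Integer> indices, List<Double> partialDer) {
        double logScore = 0.0;
        for (int i : observed) {
            logScore += params[i];
            indices.add(i);      // index of a parameter with a non-zero derivative
            partialDer.add(1.0); // the corresponding derivative value
        }
        return logScore;
    }

    /** Scatters the sparse entries into a dense gradient of length n. */
    static double[] toDense(int n, List<Integer> indices, List<Double> partialDer) {
        double[] grad = new double[n];
        for (int k = 0; k < indices.size(); k++) {
            grad[indices.get(k)] += partialDer.get(k);
        }
        return grad;
    }

    public static void main(String[] args) {
        List<Integer> indices = new ArrayList<Integer>();
        List<Double> partialDer = new ArrayList<Double>();
        double score = logScoreAndPartialDer(new double[] {0.5, -1.0, 2.0, 0.25},
                new int[] {0, 2}, indices, partialDer);
        System.out.println(score + " " + indices + " " + partialDer);
    }
}
```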

getNumberOfParameters

public int getNumberOfParameters()
Description copied from interface: ScoringFunction
Returns the number of parameters in this ScoringFunction. If the number of parameters is not known yet, the method returns ScoringFunction.UNKNOWN.

Specified by:
getNumberOfParameters in interface ScoringFunction
Returns:
the number of parameters in this ScoringFunction
See Also:
ScoringFunction.UNKNOWN

getNumberOfRecommendedStarts

public int getNumberOfRecommendedStarts()
Description copied from interface: ScoringFunction
This method returns the number of recommended optimization starts. The standard implementation returns 1.

Specified by:
getNumberOfRecommendedStarts in interface ScoringFunction
Overrides:
getNumberOfRecommendedStarts in class AbstractNormalizableScoringFunction
Returns:
the number of recommended optimization starts

setParameters

public void setParameters(double[] params,
                          int start)
Description copied from interface: ScoringFunction
This method sets the internal parameters to the values of params between start and start + ScoringFunction.getNumberOfParameters() - 1.

Specified by:
setParameters in interface ScoringFunction
Parameters:
params - the new parameters
start - the start index in params
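
The start offset lets several models read their parameters from one shared array. The following sketch shows that slicing convention with a hypothetical holder class; only the method names mirror the interface, and the body is an assumption rather than the implementation of this class.

```java
/** Toy illustration of reading a parameter slice from a shared array; not Jstacs code. */
final class ParameterSliceSketch {

    private final double[] internal;

    ParameterSliceSketch(int numberOfParameters) {
        internal = new double[numberOfParameters];
    }

    int getNumberOfParameters() {
        return internal.length;
    }

    /** Copies params[start] .. params[start + getNumberOfParameters() - 1]. */
    void setParameters(double[] params, int start) {
        System.arraycopy(params, start, internal, 0, internal.length);
    }

    double[] getCurrentParameterValues() {
        return internal.clone();
    }

    public static void main(String[] args) {
        double[] shared = {0.1, 0.2, 0.3, 0.4, 0.5};
        ParameterSliceSketch first = new ParameterSliceSketch(2);
        ParameterSliceSketch second = new ParameterSliceSketch(3);
        first.setParameters(shared, 0);
        // The second model starts where the first one ended.
        second.setParameters(shared, first.getNumberOfParameters());
        System.out.println(java.util.Arrays.toString(second.getCurrentParameterValues()));
    }
}
```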

toXML

public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.

Specified by:
toXML in interface Storable
Returns:
the XML representation

toString

public String toString()
Overrides:
toString in class Object

getLogPriorTerm

public double getLogPriorTerm()
Description copied from interface: NormalizableScoringFunction
This method computes a value that is proportional to

NormalizableScoringFunction.getEss() * Math.log( NormalizableScoringFunction.getNormalizationConstant() ) + Math.log( prior ),

where prior is the prior for the parameters of this model.

Specified by:
getLogPriorTerm in interface NormalizableScoringFunction
Returns:
a value that is proportional to NormalizableScoringFunction.getEss() * Math.log( NormalizableScoringFunction.getNormalizationConstant() ) + Math.log( prior ).
See Also:
NormalizableScoringFunction.getEss(), NormalizableScoringFunction.getNormalizationConstant()

addGradientOfLogPriorTerm

public void addGradientOfLogPriorTerm(double[] grad,
                                      int start)
                               throws Exception
Description copied from interface: NormalizableScoringFunction
This method computes the gradient of NormalizableScoringFunction.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.

Specified by:
addGradientOfLogPriorTerm in interface NormalizableScoringFunction
Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivatives for the parameters of this model shall be entered
Throws:
Exception - if something went wrong with the computing of the gradients
See Also:
NormalizableScoringFunction.getLogPriorTerm()
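
Note that the results are added to grad, not written over it, so contributions that are already stored in the array are preserved. The sketch below shows this accumulation for a hypothetical log prior term of -0.5 times the sum of squared parameters, whose partial derivative with respect to each parameter is the negated parameter value. That prior is an assumption for illustration and not the prior used by Jstacs.

```java
/** Toy illustration of accumulating a prior gradient into a shared array; not Jstacs code. */
final class PriorGradientSketch {

    private final double[] lambda;

    PriorGradientSketch(double... lambda) {
        this.lambda = lambda.clone();
    }

    /** Hypothetical log prior term: -0.5 * sum of lambda_i^2. */
    double getLogPriorTerm() {
        double sum = 0.0;
        for (double l : lambda) {
            sum += l * l;
        }
        return -0.5 * sum;
    }

    /** Adds d(logPriorTerm)/d(lambda_i) = -lambda_i to grad[start + i]. */
    void addGradientOfLogPriorTerm(double[] grad, int start) {
        for (int i = 0; i < lambda.length; i++) {
            grad[start + i] += -lambda[i]; // accumulate, do not overwrite
        }
    }
}
```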

isInitialized

public boolean isInitialized()
Description copied from interface: ScoringFunction
This method can be used to determine whether the model is initialized. If the model is not initialized you should invoke the method ScoringFunction.initializeFunction(int, boolean, Sample[], double[][]).

Specified by:
isInitialized in interface ScoringFunction
Returns:
true if the model is initialized, false otherwise

initializeFunctionRandomly

public void initializeFunctionRandomly(boolean freeParams)
                                throws Exception
Description copied from interface: ScoringFunction
This method initializes the ScoringFunction randomly. It has to create the underlying structure of the ScoringFunction.

Specified by:
initializeFunctionRandomly in interface ScoringFunction
Parameters:
freeParams - indicates whether the (reduced) parameterization is used
Throws:
Exception - if something went wrong

initializeMotif

public void initializeMotif(int motifIndex,
                            Sample data,
                            double[] weights)
                     throws Exception
Description copied from interface: MutableMotifDiscoverer
This method allows the model of a motif to be initialized manually using a weighted sample.

Specified by:
initializeMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
data - the sample of sequences
weights - either null or an array of length data.getNumberOfElements() with non-negative weights
Throws:
Exception - if initialization was not possible

modifyMotif

public boolean modifyMotif(int motifIndex,
                           double[] weightsLeft,
                           double[] weightsRight,
                           double[][][][] replacementLeft,
                           double[][][][] replacementRight,
                           int offsetLeft,
                           int offsetRight)
                    throws Exception
Description copied from interface: MutableMotifDiscoverer
Manually modifies the motif model with index motifIndex. The two offsets offsetLeft and offsetRight define by how many positions the left and right borders shall be moved, respectively. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right.

Specified by:
modifyMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
weightsLeft - the weights for the left contrast distributions
weightsRight - the weights for the right contrast distributions
replacementLeft - the replacement distribution for the left side
replacementRight - the replacement distribution for the right side
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns:
true if the motif model was modified, otherwise false
Throws:
Exception

modifyMotif

public boolean modifyMotif(int motifIndex,
                           int offsetLeft,
                           int offsetRight)
                    throws Exception
Description copied from interface: MutableMotifDiscoverer
Manually modifies the motif model with index motifIndex. The two offsets offsetLeft and offsetRight define by how many positions the left and right borders shall be moved, respectively. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right. The distribution for sequences to the left and right side of the motif shall be computed internally.

Specified by:
modifyMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns:
true if the motif model was modified, otherwise false
Throws:
Exception
See Also:
MutableMotifDiscoverer.modifyMotif(int, double[], double[], double[][][][], double[][][][], int, int)
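
The sign convention of the two offsets can be made concrete with a little border arithmetic. The sketch below assumes a motif that occupies the half-open interval from left (inclusive) to right (exclusive); this interval representation and the class name are hypothetical and only serve to show how the offsets change the motif length.

```java
/** Toy illustration of the offset sign convention; not Jstacs code. */
final class MotifBorderSketch {

    /**
     * Moves both borders and returns {newLeft, newRight, newLength}.
     * Negative offsets move a border to the left, positive ones to the right.
     */
    static int[] shift(int left, int right, int offsetLeft, int offsetRight) {
        int newLeft = left + offsetLeft;
        int newRight = right + offsetRight;
        if (newRight <= newLeft) {
            throw new IllegalArgumentException("the motif would become empty");
        }
        return new int[] { newLeft, newRight, newRight - newLeft };
    }

    public static void main(String[] args) {
        // A motif of length 8 grows by 2 positions on the left and 1 on the right.
        int[] r = shift(10, 18, -2, 1);
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

In this representation the new length equals the old length minus offsetLeft plus offsetRight, so a negative offsetLeft or a positive offsetRight extends the motif, and the opposite signs shorten it.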

determineNotSignificantPositionsFor

public int[] determineNotSignificantPositionsFor(int motif,
                                                 Sample[] data,
                                                 double[][] weights,
                                                 int classIdx)
Description copied from interface: MutableMotifDiscoverer
This method determines the number of non-significant positions from each side of the motif with index motif.

Specified by:
determineNotSignificantPositionsFor in interface MutableMotifDiscoverer
Parameters:
motif - the index of the motif in the motif discoverer
data - an array of Samples; each array entry represents one class
weights - the weights of each Sequence for each class
classIdx - the index of the current class in classCounts
Returns:
an array of length two containing at position 0 the number of non-significant positions from the left and at position 1 the number of non-significant positions from the right side
See Also:
Mutable.determineNotSignificantPositions(double, double[], double[], double[][][][], double[][][][], double), MutableMotifDiscoverer.modifyMotif(int, int, int), MutableMotifDiscoverer.modifyMotif(int, double[], double[], double[][][][], double[][][][], int, int)

getGlobalIndexOfMotifInComponent

public int getGlobalIndexOfMotifInComponent(int component,
                                            int motif)
Description copied from interface: MotifDiscoverer
Returns the global index of the motif used in component. The index returned must be at least 0 and less than MotifDiscoverer.getNumberOfMotifs().

Specified by:
getGlobalIndexOfMotifInComponent in interface MotifDiscoverer
Parameters:
component - the component index
motif - the motif index in the component
Returns:
the global index of the motif in component

getIndexOfMaximalComponentFor

public int getIndexOfMaximalComponentFor(Sequence sequence)
                                  throws Exception
Description copied from interface: MotifDiscoverer
Returns the index of the component with the maximal score for the sequence sequence.

Specified by:
getIndexOfMaximalComponentFor in interface MotifDiscoverer
Parameters:
sequence - the given sequence
Returns:
the index of the component with the maximal score for the given sequence
Throws:
Exception - if the index could not be computed for any reason

getMotifLength

public int getMotifLength(int motif)
Description copied from interface: MotifDiscoverer
This method returns the length of the motif with index motif.

Specified by:
getMotifLength in interface MotifDiscoverer
Parameters:
motif - the index of the motif
Returns:
the length of the motif with index motif

getNumberOfComponents

public int getNumberOfComponents()
Description copied from interface: MotifDiscoverer
Returns the number of components in this MotifDiscoverer.

Specified by:
getNumberOfComponents in interface MotifDiscoverer
Returns:
the number of components

getNumberOfMotifs

public int getNumberOfMotifs()
Description copied from interface: MotifDiscoverer
Returns the number of motifs for this MotifDiscoverer.

Specified by:
getNumberOfMotifs in interface MotifDiscoverer
Returns:
the number of motifs

getNumberOfMotifsInComponent

public int getNumberOfMotifsInComponent(int component)
Description copied from interface: MotifDiscoverer
Returns the number of motifs that are used in the component component of this MotifDiscoverer.

Specified by:
getNumberOfMotifsInComponent in interface MotifDiscoverer
Parameters:
component - the component of the MotifDiscoverer
Returns:
the number of motifs

getProfileOfScoresFor

public double[] getProfileOfScoresFor(int component,
                                      int motif,
                                      Sequence sequence,
                                      int startpos,
                                      MotifDiscoverer.KindOfProfile dist)
                               throws Exception
Description copied from interface: MotifDiscoverer
Returns the profile of the scores for component component and motif motif at all possible start positions of the motif in the sequence sequence beginning at startpos. This array should be of length
sequence.length() - startpos - motifs[motif].length() + 1.
A high score should indicate a probable start position.

Specified by:
getProfileOfScoresFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
dist - indicates the kind of profile
Returns:
the profile of scores
Throws:
Exception - if the score could not be computed for any reason
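
The documented profile length follows from counting the start positions at which the motif still fits into the sequence. The sketch below reproduces that count with a sliding window over a String; the match-counting score is a hypothetical placeholder for the model-based scores and is not what Jstacs computes.

```java
/** Toy illustration of a score profile over all start positions; not Jstacs code. */
final class ProfileSketch {

    /**
     * Returns one score per possible start position at or after startpos.
     * The score used here is simply the number of matching characters.
     */
    static double[] profile(String sequence, int startpos, String motif) {
        int n = sequence.length() - startpos - motif.length() + 1;
        double[] scores = new double[Math.max(n, 0)];
        for (int i = 0; i < scores.length; i++) {
            int matches = 0;
            for (int j = 0; j < motif.length(); j++) {
                if (sequence.charAt(startpos + i + j) == motif.charAt(j)) {
                    matches++;
                }
            }
            scores[i] = matches; // a high score marks a probable start position
        }
        return scores;
    }

    public static void main(String[] args) {
        // 8 - 2 - 3 + 1 = 4 possible start positions.
        System.out.println(java.util.Arrays.toString(profile("ACGTACGT", 2, "TAC")));
    }
}
```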

getStrandFor

public StrandedLocatedSequenceAnnotationWithLength.Strand getStrandFor(int component,
                                                                       int motif,
                                                                       Sequence sequence,
                                                                       int startpos)
                                                                throws Exception
Description copied from interface: MotifDiscoverer
This method returns the strand for a given subsequence if it is considered as site of the motif model in a specific component.

Specified by:
getStrandFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
Returns:
the predicted strand annotation
Throws:
Exception - if the strand could not be computed for any reason