public class IndependentProductDiffSM extends IndependentProductDiffSS implements DifferentiableStatisticalModel, MutableMotifDiscoverer
The first part of the sequence is modeled by the first
DifferentiableStatisticalModel and has the length of the first
DifferentiableStatisticalModel; the second part starts directly after
the first part and is modeled by the second DifferentiableStatisticalModel,
and so on. It is also possible to use a DifferentiableStatisticalModel for
more than one sequence part and in both orientations (if possible).
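The partitioning described above can be illustrated without the Jstacs classes. The following is only a minimal sketch: `PartScore`, `IndependentProductSketch` and `uniform` are hypothetical names introduced for this example and stand in for DifferentiableStatisticalModel and its composition; they are not part of the library. The sketch assumes that the log score of the whole sequence is the sum of the log scores of consecutive, independent parts.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a model that scores one fixed-length part of a sequence. */
interface PartScore {
    int length();
    double logScore(int[] seq, int start);
}

final class IndependentProductSketch {
    /** Sums the log scores of consecutive parts; each part starts directly after the previous one. */
    static double logScore(List<PartScore> parts, int[] seq, int start) {
        double sum = 0.0;
        int pos = start;
        for (PartScore part : parts) {
            sum += part.logScore(seq, pos);
            pos += part.length();
        }
        return sum;
    }

    /** Toy part (assumption for this sketch): every position is uniform over the alphabet. */
    static PartScore uniform(final int len, final int alphabetSize) {
        return new PartScore() {
            public int length() { return len; }
            public double logScore(int[] seq, int start) { return -len * Math.log(alphabetSize); }
        };
    }

    public static void main(String[] args) {
        // Two parts of length 3 and 2 over a 4-letter alphabet cover a sequence of length 5.
        List<PartScore> parts = Arrays.asList(uniform(3, 4), uniform(2, 4));
        int[] seq = {0, 1, 2, 3, 0};
        System.out.println(logScore(parts, seq, 0));
    }
}
```

Reusing the same `PartScore` object at two places in the list corresponds to using one model for more than one sequence part.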
Inherited nested classes: MotifDiscoverer.KindOfProfile
Inherited fields: index, isVariable, partialLength, reverse, score, start, startIndexOfParams, alphabets, length, r, UNKNOWN

| Constructor and Description |
|---|
IndependentProductDiffSM(double ess,
boolean plugIn,
DifferentiableStatisticalModel... functions)
This constructor creates an instance of an
IndependentProductDiffSM from a given series of
independent DifferentiableStatisticalModels. |
IndependentProductDiffSM(double ess,
boolean plugIn,
DifferentiableStatisticalModel[] functions,
int[] length)
This constructor creates an instance of an
IndependentProductDiffSM from a given series of
independent DifferentiableStatisticalModels and lengths. |
IndependentProductDiffSM(double ess,
boolean plugIn,
DifferentiableStatisticalModel[] functions,
int[] index,
int[] length,
boolean[] reverse)
This is the main constructor.
|
IndependentProductDiffSM(StringBuffer source)
This is the constructor for the interface
Storable. |
| Modifier and Type | Method and Description |
|---|---|
void |
addGradientOfLogPriorTerm(double[] grad,
int start)
This method computes the gradient of
DifferentiableStatisticalModel.getLogPriorTerm() for each
parameter of this model. |
void |
adjustHiddenParameters(int index,
DataSet[] data,
double[][] weights)
Adjusts all hidden parameters including duration and mixture parameters according to the current values of the remaining parameters.
|
IndependentProductDiffSM |
clone()
Creates a clone (deep copy) of the current
DifferentiableSequenceScore
instance. |
DataSet |
emitDataSet(int numberOfSequences,
int... seqLength)
This method returns a
DataSet object containing artificial
sequence(s). |
protected void |
extractFurtherInformation(StringBuffer rep)
This method is the opposite of
IndependentProductDiffSS.getFurtherInformation(). |
double |
getESS()
Returns the equivalent sample size (ess) of this model.
|
protected StringBuffer |
getFurtherInformation()
This method is used to append further information of the instance to the
XML representation.
|
int |
getGlobalIndexOfMotifInComponent(int component,
int motif)
Returns the global index of the
motif used in
component. |
int |
getIndexOfMaximalComponentFor(Sequence sequence)
Returns the index of the component with the maximal score for the given sequence. |
String |
getInstanceName()
Should return a short instance name such as iMM(0), BN(2), ...
|
double |
getLogNormalizationConstant()
Returns the logarithm of the sum of the scores over all sequences of the event space.
|
double |
getLogPartialNormalizationConstant(int parameterIndex)
Returns the logarithm of the partial normalization constant for the parameter with index
parameterIndex. |
double |
getLogPriorTerm()
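This method computes a value that is proportional to getESS() * getLogNormalizationConstant() + Math.log( prior ), where prior is the prior for the parameters of this model.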
|
double |
getLogProbFor(Sequence sequence)
Returns the logarithm of the probability of the given sequence given the
model.
|
double |
getLogProbFor(Sequence sequence,
int startpos)
Returns the logarithm of the probability of (a part of) the given
sequence given the model.
|
double |
getLogProbFor(Sequence sequence,
int startpos,
int endpos)
Returns the logarithm of the probability of (a part of) the given
sequence given the model.
|
byte |
getMaximalMarkovOrder()
This method returns the maximal used Markov order, if possible.
|
int |
getMotifLength(int motif)
This method returns the length of the motif with index motif. |
int |
getNumberOfComponents()
Returns the number of components in this
MotifDiscoverer. |
int |
getNumberOfMotifs()
Returns the number of motifs for this
MotifDiscoverer. |
int |
getNumberOfMotifsInComponent(int component)
Returns the number of motifs that are used in the component
component of this MotifDiscoverer. |
int |
getNumberOfParameters()
Returns the number of parameters in this
DifferentiableSequenceScore. |
int |
getNumberOfRecommendedStarts()
This method returns the number of recommended optimization starts.
|
double[] |
getProfileOfScoresFor(int component,
int motif,
Sequence sequence,
int startpos,
MotifDiscoverer.KindOfProfile dist)
Returns the profile of the scores for component
component
and motif motif at all possible start positions of the motif
in the sequence sequence beginning at startpos. |
int |
getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Returns the size of the event space of the random variables that are
affected by parameter no. index.
|
double[] |
getStrandProbabilitiesFor(int component,
int motif,
Sequence sequence,
int startpos)
This method returns the probabilities of the strand orientations for a given subsequence if it is
considered as site of the motif model in a specific component.
|
void |
initializeMotif(int motifIndex,
DataSet data,
double[] weights)
This method allows initializing the model of a motif manually using a weighted data set.
|
void |
initializeMotifRandomly(int motif)
This method initializes the motif with index
motif randomly using for instance DifferentiableSequenceScore.initializeFunctionRandomly(boolean). |
boolean |
isNormalized()
This method indicates whether the implemented score is already normalized
to 1 or not.
|
boolean |
modifyMotif(int motifIndex,
int offsetLeft,
int offsetRight)
Manually modifies the motif model with index
motifIndex. |
void |
setParameters(double[] params,
int start)
This method sets the internal parameters to the values of
params between start and
start + getNumberOfParameters() - 1. |
Inherited methods (one group per declaring type; repeated names are overloads):
extractSequenceParts, extractWeights, fromXML, getCurrentParameterValues, getFunctions, getIndices, getLengthArray, getLogScoreAndPartialDerivation, getLogScoreFor, getPartialLengths, getReverseSwitches, initializeFunction, initializeFunctionRandomly, isInitialized, setParamsStarts, toString, toXML
getAlphabetContainer, getCharacteristics, getInitialClassParam, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumberOfStarts, getNumericalCharacteristics, toString
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getCurrentParameterValues, getInitialClassParam, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, initializeFunction, initializeFunctionRandomly
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics, isInitialized, toString

public IndependentProductDiffSM(double ess,
boolean plugIn,
DifferentiableStatisticalModel... functions)
throws CloneNotSupportedException,
IllegalArgumentException,
WrongAlphabetException
This constructor creates an instance of an IndependentProductDiffSM from a given series of
independent DifferentiableStatisticalModels. The length that is
modeled by each component is determined by
SequenceScore.getLength(), so the length of a component must not be 0.
Parameters:
ess - the equivalent sample size
plugIn - whether to use plug-in parameters for the parts; otherwise the last parameters are used for parts that are instances of HomogeneousDiffSM
functions - the components, i.e. the given series of independent DifferentiableStatisticalModels
Throws:
CloneNotSupportedException - if at least one element of functions could not be cloned
IllegalArgumentException - if at least one component has length 0 or if the equivalent sample size (ess) is smaller than zero (0)
WrongAlphabetException - if the user tries to use an alphabet for a reverse complement that cannot be used for a reverse complement
See Also:
IndependentProductDiffSM(double, boolean, DifferentiableStatisticalModel[], int[])

public IndependentProductDiffSM(double ess,
boolean plugIn,
DifferentiableStatisticalModel[] functions,
int[] length)
throws CloneNotSupportedException,
IllegalArgumentException,
WrongAlphabetException
This constructor creates an instance of an IndependentProductDiffSM from a given series of
independent DifferentiableStatisticalModels and lengths.
Parameters:
ess - the equivalent sample size
plugIn - whether to use plug-in parameters for the parts; otherwise the last parameters are used for parts that are instances of HomogeneousDiffSM
functions - the components, i.e. the given series of independent DifferentiableStatisticalModels
length - the lengths, one for each component
Throws:
CloneNotSupportedException - if at least one component could not be cloned
IllegalArgumentException - if the lengths and the components do not match or if the equivalent sample size (ess) is smaller than zero (0)
WrongAlphabetException - if the user tries to use an alphabet for a reverse complement that cannot be used for a reverse complement
See Also:
IndependentProductDiffSM(double, boolean, DifferentiableStatisticalModel[], int[], int[], boolean[])

public IndependentProductDiffSM(double ess,
boolean plugIn,
DifferentiableStatisticalModel[] functions,
int[] index,
int[] length,
boolean[] reverse)
throws CloneNotSupportedException,
IllegalArgumentException,
WrongAlphabetException
This is the main constructor.
Parameters:
ess - the equivalent sample size
plugIn - whether to use plug-in parameters for the parts; otherwise the last parameters are used for parts that are instances of HomogeneousDiffSM
functions - the DifferentiableStatisticalModels
index - the index of the DifferentiableStatisticalModel at each part
length - the length of each part
reverse - a switch for each part that determines whether to use the part directly or its reverse complementary strand
Throws:
CloneNotSupportedException - if at least one component could not be cloned
IllegalArgumentException - if the lengths and the components do not match or if the equivalent sample size (ess) is smaller than zero (0)
WrongAlphabetException - if the user tries to use an alphabet for a reverse complement that cannot be used for a reverse complement

public IndependentProductDiffSM(StringBuffer source) throws NonParsableException
This is the constructor for the interface Storable.
Creates a new IndependentProductDiffSM out of a
StringBuffer as returned by IndependentProductDiffSS.toXML().
Parameters:
source - the XML representation as StringBuffer
Throws:
NonParsableException - if the XML representation could not be parsed

public IndependentProductDiffSM clone() throws CloneNotSupportedException
Description copied from interface: DifferentiableSequenceScore
Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.
Specified by:
clone in interface MotifDiscoverer
clone in interface DifferentiableSequenceScore
clone in interface SequenceScore
Overrides:
clone in class IndependentProductDiffSS
Returns:
a clone of the current DifferentiableSequenceScore
Throws:
CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore
See Also:
Cloneable

public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Description copied from interface: DifferentiableStatisticalModel
Returns the size of the event space of the random variables that are
affected by parameter no. index, i.e. the product of the
sizes of the alphabets at the position of each random variable affected
by parameter index. For DNA alphabets this corresponds to 4
for a PWM, 16 for a WAM except position 0, ...
Specified by:
getSizeOfEventSpaceForRandomVariablesOfParameter in interface DifferentiableStatisticalModel
Parameters:
index - the index of the parameter

public double getLogNormalizationConstant()
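Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the sum of the scores over all sequences of the event space.
Specified by:
getLogNormalizationConstant in interface DifferentiableStatisticalModel

public double getLogPartialNormalizationConstant(int parameterIndex)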
throws Exception
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the partial normalization constant for the parameter with index
parameterIndex. This is the logarithm of the partial derivative of the
normalization constant with respect to the parameter with index
parameterIndex,
\[\log \frac{\partial Z(\underline{\lambda})}{\partial \lambda_{\mathrm{parameterIndex}}}\]
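To make the two normalization quantities concrete, consider a toy score over a single position in which every symbol a has its own parameter λ_a and the score is exp(λ_a). Then Z(λ) = Σ_a exp(λ_a) and ∂Z/∂λ_a = exp(λ_a), so the log partial normalization constant is simply λ_a. The sketch below assumes exactly this toy model; `ToyNormalization` and its methods are hypothetical names for illustration and do not show how Jstacs computes these values.

```java
final class ToyNormalization {
    /** log Z = log sum_a exp(lambda[a]), computed with the log-sum-exp trick for stability. */
    static double logNormalizationConstant(double[] lambda) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : lambda) {
            max = Math.max(max, l);
        }
        double sum = 0.0;
        for (double l : lambda) {
            sum += Math.exp(l - max);
        }
        return max + Math.log(sum);
    }

    /** log dZ/dlambda[i]; in this toy model dZ/dlambda[i] = exp(lambda[i]). */
    static double logPartialNormalizationConstant(double[] lambda, int parameterIndex) {
        return lambda[parameterIndex];
    }

    public static void main(String[] args) {
        double[] lambda = {0.0, 0.0, 0.0, 0.0};
        System.out.println(logNormalizationConstant(lambda));
        System.out.println(logPartialNormalizationConstant(lambda, 2));
    }
}
```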
Specified by:
getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel
Parameters:
parameterIndex - the index of the parameter
Throws:
Exception - if something went wrong with the normalization
See Also:
DifferentiableStatisticalModel.getLogNormalizationConstant()

public double getESS()
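Description copied from interface: DifferentiableStatisticalModel
Returns the equivalent sample size (ess) of this model.
Specified by:
getESS in interface DifferentiableStatisticalModel

protected void extractFurtherInformation(StringBuffer rep) throws NonParsableException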
Description copied from class: IndependentProductDiffSS
This method is the opposite of IndependentProductDiffSS.getFurtherInformation(). It
extracts further information of the instance from an XML representation.
Overrides:
extractFurtherInformation in class IndependentProductDiffSS
Parameters:
rep - the StringBuffer containing the information to be extracted as XML code
Throws:
NonParsableException - if the StringBuffer could not be parsed

public String getInstanceName()
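Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
Specified by:
getInstanceName in interface SequenceScore
Overrides:
getInstanceName in class IndependentProductDiffSS

public int getNumberOfParameters()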
Description copied from interface: DifferentiableSequenceScore
Returns the number of parameters in this DifferentiableSequenceScore. If the
number of parameters is not known yet, the method returns
DifferentiableSequenceScore.UNKNOWN.
Specified by:
getNumberOfParameters in interface DifferentiableSequenceScore
Overrides:
getNumberOfParameters in class IndependentProductDiffSS
Returns:
the number of parameters in this DifferentiableSequenceScore
See Also:
DifferentiableSequenceScore.UNKNOWN

public int getNumberOfRecommendedStarts()
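Description copied from interface: DifferentiableSequenceScore
This method returns the number of recommended optimization starts.
Specified by:
getNumberOfRecommendedStarts in interface DifferentiableSequenceScore
Overrides:
getNumberOfRecommendedStarts in class IndependentProductDiffSS

public void setParameters(double[] params,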
int start)
Description copied from interface: DifferentiableSequenceScore
This method sets the internal parameters to the values of params between start and
start + DifferentiableSequenceScore.getNumberOfParameters() - 1.
Specified by:
setParameters in interface DifferentiableSequenceScore
Overrides:
setParameters in class IndependentProductDiffSS
Parameters:
params - the new parameters
start - the start index in params

protected StringBuffer getFurtherInformation()
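Description copied from class: IndependentProductDiffSS
This method is used to append further information of the instance to the XML representation.
Overrides:
getFurtherInformation in class IndependentProductDiffSS
Returns:
a StringBuffer containing the further information

public double getLogPriorTerm()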
Description copied from interface: DifferentiableStatisticalModel
This method computes a value that is proportional to
DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
where prior is the prior for the parameters of this model.
Specified by:
getLogPriorTerm in interface DifferentiableStatisticalModel
getLogPriorTerm in interface StatisticalModel
Returns:
a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
See Also:
DifferentiableStatisticalModel.getESS(), DifferentiableStatisticalModel.getLogNormalizationConstant()

public void addGradientOfLogPriorTerm(double[] grad,
int start)
throws Exception
Description copied from interface: DifferentiableStatisticalModel
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each
parameter of this model. The results are added to the array
grad beginning at index start.
Specified by:
addGradientOfLogPriorTerm in interface DifferentiableStatisticalModel
Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivatives for the parameters of this model shall be entered
Throws:
Exception - if something went wrong while computing the gradients
See Also:
DifferentiableStatisticalModel.getLogPriorTerm()

public double getLogProbFor(Sequence sequence, int startpos) throws Exception
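Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given
sequence given the model. The part considered starts at position
startpos. E.g. if the fixed length is 12, the length
of the given sequence is 30, and startpos=15, the logarithm
of the probability of the part from position 15 to 26 (inclusive) given
the model should be returned.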
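The index arithmetic of this example can be checked with a small sketch. `FixedLengthWindow` and `lastPosition` are hypothetical names used only for illustration; the bounds check is an assumption about what a careful implementation might verify, not a statement about the library's behavior.

```java
final class FixedLengthWindow {
    /**
     * Returns the last position (inclusive) of the part that a model of the given
     * fixed length covers when it starts at startpos; throws if the part does not fit.
     */
    static int lastPosition(int startpos, int modelLength, int sequenceLength) {
        int end = startpos + modelLength - 1;
        if (startpos < 0 || modelLength <= 0 || end >= sequenceLength) {
            throw new IllegalArgumentException(
                "part [" + startpos + ", " + end + "] does not fit into a sequence of length " + sequenceLength);
        }
        return end;
    }

    public static void main(String[] args) {
        // Fixed length 12, sequence length 30, startpos 15: positions 15 to 26 are scored.
        System.out.println(lastPosition(15, 12, 30));
    }
}
```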
The length and the alphabets define the type of
data that can be modeled, and therefore both have to be checked.
Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
Throws:
Exception - if the sequence could not be handled by the model
NotTrainedException - if the model is not trained yet
See Also:
StatisticalModel.getLogProbFor(Sequence, int, int)

public double getLogProbFor(Sequence sequence) throws Exception
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of the given sequence given the model.
The length and the alphabets define the type of
data that can be modeled, and therefore both have to be checked.
Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence for which the logarithm of the probability/the value of the density function should be returned
Throws:
Exception - if the sequence could not be handled by the model
NotTrainedException - if the model is not trained yet
See Also:
StatisticalModel.getLogProbFor(Sequence, int, int)

public double getLogProbFor(Sequence sequence, int startpos, int endpos) throws Exception
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given
sequence given the model. This method differs from
StatisticalModel.getLogProbFor(Sequence, int) in that the model could be
e.g. homogeneous, so the length of the sequences whose
probability should be returned is not fixed. Additionally, the end
position of the part of the given sequence is given, and the logarithm of the probability
of the part from position startpos to endpos
(inclusive) should be returned.
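For a homogeneous model the part from startpos to endpos can have any length. The following sketch assumes a toy homogeneous model of order 0 (each symbol is drawn independently from one fixed distribution); `HomogeneousToyModel` is a hypothetical class for illustration, and sequences are plain int arrays rather than Jstacs Sequence objects.

```java
final class HomogeneousToyModel {
    private final double[] logProb; // log probability of each symbol

    HomogeneousToyModel(double[] prob) {
        logProb = new double[prob.length];
        for (int i = 0; i < prob.length; i++) {
            logProb[i] = Math.log(prob[i]);
        }
    }

    /** Log probability of the part from startpos to endpos, both inclusive. */
    double getLogProbFor(int[] seq, int startpos, int endpos) {
        if (startpos < 0 || startpos > endpos || endpos >= seq.length) {
            throw new IllegalArgumentException("invalid part [" + startpos + ", " + endpos + "]");
        }
        double sum = 0.0;
        for (int i = startpos; i <= endpos; i++) {
            sum += logProb[seq[i]];
        }
        return sum;
    }

    public static void main(String[] args) {
        HomogeneousToyModel model = new HomogeneousToyModel(new double[] {0.25, 0.25, 0.25, 0.25});
        int[] seq = {0, 1, 2, 3, 0, 1, 2, 3, 0, 1};
        // Positions 2..5 (inclusive) are four symbols.
        System.out.println(model.getLogProbFor(seq, 2, 5));
    }
}
```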
The length and the alphabets define the type of
data that can be modeled, and therefore both have to be checked.
Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Throws:
Exception - if the sequence could not be handled (e.g. startpos > , endpos > sequence.length, ...) by the model
NotTrainedException - if the model is not trained yet

public void initializeMotif(int motifIndex,
DataSet data,
double[] weights)
throws Exception
Description copied from interface: MutableMotifDiscoverer
This method allows initializing the model of a motif manually using a weighted data set.
Specified by:
initializeMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
data - the data set of sequences
weights - either null or an array of length data.getNumberofElements() with non-negative weights
Throws:
Exception - if the initialization was not possible

public void initializeMotifRandomly(int motif)
throws Exception
Description copied from interface: MutableMotifDiscoverer
This method initializes the motif with index motif randomly using for instance DifferentiableSequenceScore.initializeFunctionRandomly(boolean).
Furthermore, if available, it also initializes the positional distribution.
Specified by:
initializeMotifRandomly in interface MutableMotifDiscoverer
Parameters:
motif - the index of the motif
Throws:
Exception - either if the index is wrong or if it is thrown by the method DifferentiableSequenceScore.initializeFunctionRandomly(boolean)

public boolean modifyMotif(int motifIndex,
int offsetLeft,
int offsetRight)
throws Exception
Description copied from interface: MutableMotifDiscoverer
Manually modifies the motif model with index motifIndex. The two offsets offsetLeft and offsetRight
define how many positions the left or right border shall be moved. Negative numbers indicate moves to the left, while positive
numbers correspond to moves to the right. The distribution for sequences to the left and right side of the motif shall be computed internally.
Specified by:
modifyMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns:
true if the motif model was modified, otherwise false
Throws:
Exception - if some unexpected error occurred during the modification
See Also:
MutableMotifDiscoverer.modifyMotif(int, int, int), Mutable.modify(int, int)

public int getGlobalIndexOfMotifInComponent(int component,
int motif)
Description copied from interface: MotifDiscoverer
Returns the global index of the motif used in
component. The index returned must be at least 0 and less
than MotifDiscoverer.getNumberOfMotifs().
Specified by:
getGlobalIndexOfMotifInComponent in interface MotifDiscoverer
Parameters:
component - the component index
motif - the motif index in the component
Returns:
the global index of the motif in component

public int getIndexOfMaximalComponentFor(Sequence sequence) throws Exception
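Description copied from interface: MotifDiscoverer
Returns the index of the component with the maximal score for the sequence sequence.
Specified by:
getIndexOfMaximalComponentFor in interface MotifDiscoverer
Parameters:
sequence - the given sequence
Throws:
Exception - if the index could not be computed for any reason

public int getMotifLength(int motif)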
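Description copied from interface: MotifDiscoverer
This method returns the length of the motif with index motif.
Specified by:
getMotifLength in interface MotifDiscoverer
Parameters:
motif - the index of the motif
Returns:
the length of the motif with index motif

public int getNumberOfComponents()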
Description copied from interface: MotifDiscoverer
Returns the number of components in this MotifDiscoverer.
Specified by:
getNumberOfComponents in interface MotifDiscoverer

public int getNumberOfMotifs()
Description copied from interface: MotifDiscoverer
Returns the number of motifs for this MotifDiscoverer.
Specified by:
getNumberOfMotifs in interface MotifDiscoverer

public int getNumberOfMotifsInComponent(int component)
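Description copied from interface: MotifDiscoverer
Returns the number of motifs that are used in the component component of this MotifDiscoverer.
Specified by:
getNumberOfMotifsInComponent in interface MotifDiscoverer
Parameters:
component - the component of the MotifDiscoverer

public double[] getProfileOfScoresFor(int component,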
int motif,
Sequence sequence,
int startpos,
MotifDiscoverer.KindOfProfile dist)
throws Exception
Description copied from interface: MotifDiscoverer
Returns the profile of the scores for component component
and motif motif at all possible start positions of the motif
in the sequence sequence beginning at startpos.
This array should be of length sequence.length() - startpos - motifs[motif].getLength() + 1.
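The length formula above follows from sliding the motif over every start position at which it still fits completely into the sequence. The sketch below illustrates this with plain arrays; `ScoreProfile` and `MotifScore` are hypothetical names, and returning an empty array when the motif does not fit is an assumption of this example only.

```java
final class ScoreProfile {
    /** Hypothetical stand-in for a motif model with a fixed length. */
    interface MotifScore {
        int length();
        double logScore(int[] seq, int start);
    }

    /** Scores the motif at every start position from startpos on where it still fits. */
    static double[] profile(MotifScore motif, int[] seq, int startpos) {
        int n = seq.length - startpos - motif.length() + 1;
        if (n <= 0) {
            return new double[0]; // assumption: no valid start position yields an empty profile
        }
        double[] res = new double[n];
        for (int i = 0; i < n; i++) {
            res[i] = motif.logScore(seq, startpos + i);
        }
        return res;
    }
}
```

With a sequence of length 30, startpos 15 and a motif of length 12, the profile has 30 - 15 - 12 + 1 = 4 entries, for the start positions 15, 16, 17 and 18.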
Specified by:
getProfileOfScoresFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
dist - indicates the kind of profile
Throws:
Exception - if the score could not be computed for any reason

public double[] getStrandProbabilitiesFor(int component,
int motif,
Sequence sequence,
int startpos)
throws Exception
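Description copied from interface: MotifDiscoverer
This method returns the probabilities of the strand orientations for a given subsequence if it is
considered as a site of the motif model in a specific component.
Specified by:
getStrandProbabilitiesFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
Throws:
Exception - if the strand could not be computed for any reason

public boolean isNormalized()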
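Description copied from interface: DifferentiableStatisticalModel
This method indicates whether the implemented score is already normalized to 1 or not.
Specified by:
isNormalized in interface DifferentiableStatisticalModel
Returns:
true if the implemented score is already normalized to 1, false otherwise

public void adjustHiddenParameters(int index,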
DataSet[] data,
double[][] weights)
throws Exception
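Description copied from interface: MutableMotifDiscoverer
Adjusts all hidden parameters including duration and mixture parameters according to the current values of the remaining parameters.
Specified by:
adjustHiddenParameters in interface MutableMotifDiscoverer
Parameters:
index - the index of the class of this MutableMotifDiscoverer
data - the array of data for all classes
weights - the weights for all sequences in data
Throws:
Exception - thrown if the hidden parameters could not be adjusted

public DataSet emitDataSet(int numberOfSequences, int... seqLength) throws NotTrainedException, Exception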
Description copied from interface: StatisticalModel
This method returns a DataSet object containing artificial
sequence(s).
For a homogeneous model:
emitDataSet( int n, int l ) should return a data set with
n sequences of length l.
emitDataSet( int n, int[] l ) should return a data set with
n sequences which have a sequence length corresponding to
the entry in the given array l.
For an inhomogeneous model:
emitDataSet( int n ) and
emitDataSet( int n, null ) should return a data set with
n sequences of the length of the model (
SequenceScore.getLength()).
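The calling conventions listed above amount to resolving one length per emitted sequence from the varargs argument. The following sketch shows one way to do that; `EmitLengths.resolve` is a hypothetical helper, not a Jstacs method, and the error handling for mismatching arguments is an assumption of this example.

```java
import java.util.Arrays;

final class EmitLengths {
    /**
     * Resolves the length of each emitted sequence.
     * modelLength is the fixed length of an inhomogeneous model, or 0 for a homogeneous one.
     */
    static int[] resolve(int numberOfSequences, int modelLength, int... seqLength) {
        int[] res = new int[numberOfSequences];
        if (seqLength == null || seqLength.length == 0) {
            // emitDataSet( n ) or emitDataSet( n, null ): use the length of the model
            if (modelLength <= 0) {
                throw new IllegalArgumentException("a homogeneous model needs explicit lengths");
            }
            Arrays.fill(res, modelLength);
        } else if (seqLength.length == 1) {
            // emitDataSet( n, l ): all sequences share the length l
            Arrays.fill(res, seqLength[0]);
        } else if (seqLength.length == numberOfSequences) {
            // emitDataSet( n, int[] l ): one length per sequence
            System.arraycopy(seqLength, 0, res, 0, numberOfSequences);
        } else {
            throw new IllegalArgumentException("expected 1 or " + numberOfSequences + " lengths");
        }
        return res;
    }
}
```

For example, `EmitLengths.resolve(3, 12)` yields three sequences of the model length 12, while `EmitLengths.resolve(2, 0, 4, 7)` yields the lengths 4 and 7.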
Specified by:
emitDataSet in interface StatisticalModel
Parameters:
numberOfSequences - the number of sequences that should be contained in the returned data set
seqLength - the length of the sequences for a homogeneous model; for an inhomogeneous model this parameter should be null or an array of size 0
Returns:
a DataSet containing the artificial sequence(s)
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the emission did not succeed
See Also:
DataSet

public byte getMaximalMarkovOrder()
throws UnsupportedOperationException
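Description copied from interface: StatisticalModel
This method returns the maximal used Markov order, if possible.
Specified by:
getMaximalMarkovOrder in interface StatisticalModel
Throws:
UnsupportedOperationException - if the model can't give a proper answer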