public final class MappingDiffSM extends AbstractDifferentiableStatisticalModel implements MutableMotifDiscoverer, Mutable
A DifferentiableStatisticalModel that works on mapped Sequences. This can be useful for protein sequences, for example, to reduce the alphabet of size 20 to a smaller alphabet based on some chemical properties of the amino acids.

Be careful with references to Sequences in the internal DifferentiableStatisticalModel, since the Sequences might be unexpectedly mutable.

See Also: MappedDiscreteSequence, DiscreteAlphabetMapping
Nested classes: MotifDiscoverer.KindOfProfile
Inherited fields: alphabets, length, r, UNKNOWN

| Constructor and Description |
|---|
| MappingDiffSM(AlphabetContainer originalAlphabetContainer, DifferentiableStatisticalModel nsf, DiscreteAlphabetMapping... mapping) The main constructor creating a MappingDiffSM. |
| MappingDiffSM(StringBuffer xml) This is the constructor for Storable. |
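
The mapping idea behind the main constructor can be illustrated without Jstacs. The following minimal sketch reduces the 20-letter amino acid alphabet to four classes. The class ReducedAlphabetSketch, its map method, and the particular grouping by side-chain chemistry are assumptions made for this example; they are not part of the Jstacs API, where a DiscreteAlphabetMapping would play this role.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an alphabet mapping; not a Jstacs class.
final class ReducedAlphabetSketch {

    // Illustrative grouping of the 20 amino acids (an assumption of this sketch).
    private static final String[] GROUPS = {
        "AVLIMFWPG", // class 0: nonpolar
        "STCYNQ",    // class 1: polar, uncharged
        "DE",        // class 2: acidic
        "KRH"        // class 3: basic
    };

    private static final Map<Character, Integer> SYMBOL_TO_CLASS = new HashMap<>();

    static {
        // Build the lookup table: original symbol -> index in the reduced alphabet.
        for (int c = 0; c < GROUPS.length; c++) {
            for (char symbol : GROUPS[c].toCharArray()) {
                SYMBOL_TO_CLASS.put(symbol, c);
            }
        }
    }

    // Maps a protein sequence to the reduced alphabet, one class index per residue.
    static int[] map(String protein) {
        int[] mapped = new int[protein.length()];
        for (int i = 0; i < protein.length(); i++) {
            Integer cls = SYMBOL_TO_CLASS.get(protein.charAt(i));
            if (cls == null) {
                throw new IllegalArgumentException("Unknown amino acid: " + protein.charAt(i));
            }
            mapped[i] = cls;
        }
        return mapped;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(map("MKDS"))); // prints [0, 3, 2, 1]
    }
}
```

A model trained on the mapped sequence then only needs parameters for four symbols instead of twenty, which is the motivation given in the class description.
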
| Modifier and Type | Method and Description |
|---|---|
| void | addGradientOfLogPriorTerm(double[] grad, int start) This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. |
| void | adjustHiddenParameters(int index, DataSet[] data, double[][] weights) Adjusts all hidden parameters including duration and mixture parameters according to the current values of the remaining parameters. |
| MappingDiffSM | clone() Creates a clone (deep copy) of the current DifferentiableSequenceScore instance. |
| protected void | fromXML(StringBuffer xml) This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer. |
| double[] | getCurrentParameterValues() Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. |
| double | getESS() Returns the equivalent sample size (ess) of this model. |
| DifferentiableStatisticalModel | getFunction() This method returns the internal function. |
| int | getGlobalIndexOfMotifInComponent(int component, int motif) Returns the global index of the motif used in component. |
| int | getIndexOfMaximalComponentFor(Sequence sequence) Returns the index of the component with the maximal score for the sequence sequence. |
| String | getInstanceName() Should return a short instance name such as iMM(0), BN(2), ... |
| double | getLogNormalizationConstant() Returns the logarithm of the sum of the scores over all sequences of the event space. |
| double | getLogPartialNormalizationConstant(int parameterIndex) Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. |
| double | getLogPriorTerm() This method computes a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ), where prior is the prior for the parameters of this model. |
| double | getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer) |
| double | getLogScoreFor(Sequence seq, int start) |
| int | getMotifLength(int motif) This method returns the length of the motif with index motif. |
| int | getNumberOfComponents() Returns the number of components in this MotifDiscoverer. |
| int | getNumberOfMotifs() Returns the number of motifs for this MotifDiscoverer. |
| int | getNumberOfMotifsInComponent(int component) Returns the number of motifs that are used in the component component of this MotifDiscoverer. |
| int | getNumberOfParameters() Returns the number of parameters in this DifferentiableSequenceScore. |
| double[] | getProfileOfScoresFor(int component, int motif, Sequence sequence, int startpos, MotifDiscoverer.KindOfProfile kind) Returns the profile of the scores for component component and motif motif at all possible start positions of the motif in the sequence sequence beginning at startpos. |
| int | getSizeOfEventSpaceForRandomVariablesOfParameter(int index) Returns the size of the event space of the random variables that are affected by parameter no. index. |
| double[] | getStrandProbabilitiesFor(int component, int motif, Sequence sequence, int startpos) This method returns the probabilities of the strand orientations for a given subsequence if it is considered as site of the motif model in a specific component. |
| void | initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) This method creates the underlying structure of the DifferentiableSequenceScore. |
| void | initializeFunctionRandomly(boolean freeParams) This method initializes the DifferentiableSequenceScore randomly. |
| void | initializeMotif(int motifIndex, DataSet data, double[] weights) This method allows the model of a motif to be initialized manually using a weighted data set. |
| void | initializeMotifRandomly(int motif) This method initializes the motif with index motif randomly using for instance DifferentiableSequenceScore.initializeFunctionRandomly(boolean). |
| boolean | isInitialized() This method can be used to determine whether the instance is initialized. |
| boolean | modify(int offsetLeft, int offsetRight) Manually modifies the model. |
| boolean | modifyMotif(int motifIndex, int offsetLeft, int offsetRight) Manually modifies the motif model with index motifIndex. |
| void | setParameters(double[] params, int start) This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1. |
| String | toString(NumberFormat nf) This method returns a String representation of the instance. |
| StringBuffer | toXML() This method returns an XML representation as StringBuffer of an instance of the implementing class. |
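
Two methods in the table above, setParameters and addGradientOfLogPriorTerm, take an array together with a start index. One plausible use of this convention is that several models read from and write to disjoint slices of one shared array. The sketch below illustrates this with a toy class; OffsetConventionSketch is hypothetical, and the standard normal prior on the parameters (so that the gradient of the log prior is -lambda_i) is an assumption chosen only to have something to add to the gradient.

```java
// Hypothetical toy model; it only mimics the array/start-index convention.
final class OffsetConventionSketch {

    private final double[] lambda;

    OffsetConventionSketch(int numberOfParameters) {
        lambda = new double[numberOfParameters];
    }

    int getNumberOfParameters() {
        return lambda.length;
    }

    // Copies params[start] .. params[start + getNumberOfParameters() - 1] into this model.
    void setParameters(double[] params, int start) {
        System.arraycopy(params, start, lambda, 0, lambda.length);
    }

    double[] getCurrentParameterValues() {
        return lambda.clone();
    }

    // Adds (does not overwrite) the gradient of the log prior, beginning at index start.
    // Assumed prior: independent standard normal, so d/d lambda_i log prior = -lambda_i.
    void addGradientOfLogPriorTerm(double[] grad, int start) {
        for (int i = 0; i < lambda.length; i++) {
            grad[start + i] += -lambda[i];
        }
    }

    public static void main(String[] args) {
        OffsetConventionSketch a = new OffsetConventionSketch(2);
        OffsetConventionSketch b = new OffsetConventionSketch(3);
        double[] global = {1, 2, 3, 4, 5};
        a.setParameters(global, 0);
        b.setParameters(global, a.getNumberOfParameters());
        double[] grad = new double[global.length];
        a.addGradientOfLogPriorTerm(grad, 0);
        b.addGradientOfLogPriorTerm(grad, a.getNumberOfParameters());
        System.out.println(java.util.Arrays.toString(grad));
    }
}
```

Because the gradient is added rather than assigned, any values already present in grad are preserved and only incremented.
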
Inherited methods:
emitDataSet, getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, isNormalized, isNormalized
getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfRecommendedStarts, getNumberOfStarts, getNumericalCharacteristics, toString
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getNumberOfRecommendedStarts
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics

public MappingDiffSM(AlphabetContainer originalAlphabetContainer, DifferentiableStatisticalModel nsf, DiscreteAlphabetMapping... mapping) throws WrongAlphabetException, CloneNotSupportedException
The main constructor creating a MappingDiffSM.
Parameters:
originalAlphabetContainer - the original AlphabetContainer
nsf - the internally used DifferentiableStatisticalModel
mapping - the DiscreteAlphabetMappings defining the transformation from the original AlphabetContainer to the AlphabetContainer of the DifferentiableStatisticalModel nsf
Throws:
WrongAlphabetException - if there is a problem with the mapping of the Alphabets
CloneNotSupportedException - if the DifferentiableStatisticalModel could not be cloned

public MappingDiffSM(StringBuffer xml) throws NonParsableException
This is the constructor for Storable.
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the XML representation could not be parsed

public MappingDiffSM clone() throws CloneNotSupportedException
Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.
Specified by: clone in interface MotifDiscoverer, clone in interface DifferentiableSequenceScore, clone in interface SequenceScore
Overrides: clone in class AbstractDifferentiableStatisticalModel
Throws:
CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore
See Also: Cloneable

public StringBuffer toXML()
This method returns an XML representation as StringBuffer of an instance of the implementing class.

protected void fromXML(StringBuffer xml) throws NonParsableException
This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer.
Specified by: fromXML in class AbstractDifferentiableSequenceScore
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also: AbstractDifferentiableSequenceScore.AbstractDifferentiableSequenceScore(StringBuffer)

public void addGradientOfLogPriorTerm(double[] grad, int start) throws Exception
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.
Specified by: addGradientOfLogPriorTerm in interface DifferentiableStatisticalModel
Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivations for the parameters of this model shall be entered
Throws:
Exception - if something went wrong with the computing of the gradients
See Also: DifferentiableStatisticalModel.getLogPriorTerm()

public double getESS()
Returns the equivalent sample size (ess) of this model.
Specified by: getESS in interface DifferentiableStatisticalModel

public double getLogNormalizationConstant()
Returns the logarithm of the sum of the scores over all sequences of the event space.
Specified by: getLogNormalizationConstant in interface DifferentiableStatisticalModel

public double getLogPartialNormalizationConstant(int parameterIndex) throws Exception
Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivation of the normalization constant for the parameter with index parameterIndex,
![\[\log \frac{\partial Z(\underline{\lambda})}{\partial \lambda_{parameterindex}}\]](images/DifferentiableStatisticalModel_LaTeXilb9_1.png)
Specified by: getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel
Parameters:
parameterIndex - the index of the parameter
Throws:
Exception - if something went wrong with the normalization
See Also: DifferentiableStatisticalModel.getLogNormalizationConstant()

public double getLogPriorTerm()
This method computes a value that is proportional to
DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
where prior is the prior for the parameters of this model.
Specified by: getLogPriorTerm in interface DifferentiableStatisticalModel, getLogPriorTerm in interface StatisticalModel
Returns: a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ).
See Also: DifferentiableStatisticalModel.getESS(), DifferentiableStatisticalModel.getLogNormalizationConstant()

public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...
Specified by: getSizeOfEventSpaceForRandomVariablesOfParameter in interface DifferentiableStatisticalModel
Parameters:
index - the index of the parameter

public double[] getCurrentParameterValues() throws Exception
Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. To use these parameters as the starting point of an optimization, it is highly recommended to invoke DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][]) first. After an optimization this method can be used to get the current parameter values.
Specified by: getCurrentParameterValues in interface DifferentiableSequenceScore
Throws:
Exception - if no parameters exist (yet)

public String getInstanceName()
Should return a short instance name such as iMM(0), BN(2), ...
Specified by: getInstanceName in interface SequenceScore

public double getLogScoreFor(Sequence seq, int start)
Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence.
Specified by: getLogScoreFor in interface SequenceScore
Parameters:
seq - the Sequence
start - the start position in the Sequence
See Also: Sequence

public double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
Returns the logarithmic score for the Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivations.
Specified by: getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i where the partial derivation of the log score with respect to parameter i is not zero
partialDer - a DoubleList of partial derivations; after method invocation the list should contain the corresponding partial derivations that are not zero
See Also: Sequence

public int getNumberOfParameters()
Returns the number of parameters in this DifferentiableSequenceScore. If the number of parameters is not known yet, the method returns DifferentiableSequenceScore.UNKNOWN.
Specified by: getNumberOfParameters in interface DifferentiableSequenceScore
Returns: the number of parameters in this DifferentiableSequenceScore
See Also: DifferentiableSequenceScore.UNKNOWN

public void initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception
This method creates the underlying structure of the DifferentiableSequenceScore.
Specified by: initializeFunction in interface DifferentiableSequenceScore
Parameters:
index - the index of the class the DifferentiableSequenceScore models
freeParams - indicates whether the (reduced) parameterization is used
data - the data sets
weights - the weights of the sequences in the data sets
Throws:
Exception - if something went wrong

public void initializeFunctionRandomly(boolean freeParams) throws Exception
This method initializes the DifferentiableSequenceScore randomly. It has to create the underlying structure of the DifferentiableSequenceScore.
Specified by: initializeFunctionRandomly in interface DifferentiableSequenceScore
Parameters:
freeParams - indicates whether the (reduced) parameterization is used
Throws:
Exception - if something went wrong

public boolean isInitialized()
This method can be used to determine whether the instance is initialized. An initialized instance can be used with SequenceScore.getLogScoreFor(Sequence).
Specified by: isInitialized in interface SequenceScore
Returns: true if the instance is initialized, false otherwise

public void setParameters(double[] params, int start)
This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1.
Specified by: setParameters in interface DifferentiableSequenceScore
Parameters:
params - the new parameters
start - the start index in params

public String toString(NumberFormat nf)
This method returns a String representation of the instance.
Specified by: toString in interface SequenceScore
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns: a String representation of the instance

public DifferentiableStatisticalModel getFunction() throws CloneNotSupportedException
This method returns the internal function.
Returns: the DifferentiableStatisticalModel that is internally used
Throws:
CloneNotSupportedException - if the DifferentiableStatisticalModel could not be cloned

public int getNumberOfMotifs()
Returns the number of motifs for this MotifDiscoverer.
Specified by: getNumberOfMotifs in interface MotifDiscoverer

public void adjustHiddenParameters(int index, DataSet[] data, double[][] weights) throws Exception
Adjusts all hidden parameters including duration and mixture parameters according to the current values of the remaining parameters.
Specified by: adjustHiddenParameters in interface MutableMotifDiscoverer
Parameters:
index - the index of the class of this MutableMotifDiscoverer
data - the array of data for all classes
weights - the weights for all sequences in data
Throws:
Exception - thrown if the hidden parameters could not be adjusted

public void initializeMotif(int motifIndex, DataSet data, double[] weights) throws Exception
This method allows the model of a motif to be initialized manually using a weighted data set.
Specified by: initializeMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
data - the data set of sequences
weights - either null or an array of length data.getNumberofElements() with non-negative weights
Throws:
Exception - if the initialization was not possible

public void initializeMotifRandomly(int motif) throws Exception
This method initializes the motif with index motif randomly using for instance DifferentiableSequenceScore.initializeFunctionRandomly(boolean). Furthermore, if available, it also initializes the positional distribution.
Specified by: initializeMotifRandomly in interface MutableMotifDiscoverer
Parameters:
motif - the index of the motif
Throws:
Exception - either if the index is wrong or if it is thrown by the method DifferentiableSequenceScore.initializeFunctionRandomly(boolean)

public boolean modifyMotif(int motifIndex, int offsetLeft, int offsetRight) throws Exception
Manually modifies the motif model with index motifIndex. The two offsets offsetLeft and offsetRight define how many positions the left or right border positions shall be moved. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right. The distribution for sequences to the left and right side of the motif shall be computed internally.
Specified by: modifyMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns: true if the motif model was modified, otherwise false
Throws:
Exception - if some unexpected error occurred during the modification
See Also: MutableMotifDiscoverer.modifyMotif(int, int, int), Mutable.modify(int, int)

public int getGlobalIndexOfMotifInComponent(int component, int motif)
Returns the global index of the motif used in component. The index returned must be at least 0 and less than MotifDiscoverer.getNumberOfMotifs().
Specified by: getGlobalIndexOfMotifInComponent in interface MotifDiscoverer
Parameters:
component - the component index
motif - the motif index in the component
Returns: the global index of the motif in component

public int getIndexOfMaximalComponentFor(Sequence sequence) throws Exception
Returns the index of the component with the maximal score for the sequence sequence.
Specified by: getIndexOfMaximalComponentFor in interface MotifDiscoverer
Parameters:
sequence - the given sequence
Throws:
Exception - if the index could not be computed for any reason

public int getMotifLength(int motif)
This method returns the length of the motif with index motif.
Specified by: getMotifLength in interface MotifDiscoverer
Parameters:
motif - the index of the motif
Returns: the length of the motif with index motif

public int getNumberOfComponents()
Returns the number of components in this MotifDiscoverer.
Specified by: getNumberOfComponents in interface MotifDiscoverer

public int getNumberOfMotifsInComponent(int component)
Returns the number of motifs that are used in the component component of this MotifDiscoverer.
Specified by: getNumberOfMotifsInComponent in interface MotifDiscoverer
Parameters:
component - the component of the MotifDiscoverer

public double[] getProfileOfScoresFor(int component, int motif, Sequence sequence, int startpos, MotifDiscoverer.KindOfProfile kind) throws Exception
Returns the profile of the scores for component component and motif motif at all possible start positions of the motif in the sequence sequence beginning at startpos. This array should be of length sequence.length() - startpos - motifs[motif].getLength() + 1.
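
The length formula can be checked with a small stand-alone sketch. ProfileLengthSketch is hypothetical and not part of Jstacs; the per-window score used here (the number of 'A' symbols in the window) and the clamping of negative lengths to 0 are assumptions made only so that the example produces concrete numbers.

```java
import java.util.Arrays;

// Hypothetical illustration of the profile length formula; not a Jstacs class.
final class ProfileLengthSketch {

    // Number of possible motif start positions from startpos onwards:
    // sequenceLength - startpos - motifLength + 1 (clamped at 0 as an assumption).
    static int profileLength(int sequenceLength, int startpos, int motifLength) {
        return Math.max(0, sequenceLength - startpos - motifLength + 1);
    }

    // Toy profile: entry i is the count of 'A' in the window starting at startpos + i.
    static double[] profile(String sequence, int startpos, int motifLength) {
        double[] scores = new double[profileLength(sequence.length(), startpos, motifLength)];
        for (int i = 0; i < scores.length; i++) {
            int from = startpos + i;
            for (int j = from; j < from + motifLength; j++) {
                if (sequence.charAt(j) == 'A') {
                    scores[i]++;
                }
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Sequence of length 6, startpos 1, motif length 3: 6 - 1 - 3 + 1 = 3 entries.
        System.out.println(Arrays.toString(profile("ACGTAA", 1, 3))); // prints [0.0, 1.0, 2.0]
    }
}
```
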
Specified by: getProfileOfScoresFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
kind - indicates the kind of profile
Throws:
Exception - if the score could not be computed for any reason

public double[] getStrandProbabilitiesFor(int component, int motif, Sequence sequence, int startpos) throws Exception
This method returns the probabilities of the strand orientations for a given subsequence if it is considered as site of the motif model in a specific component.
Specified by: getStrandProbabilitiesFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
Throws:
Exception - if the strand could not be computed for any reason

public boolean modify(int offsetLeft, int offsetRight)
Manually modifies the model. The two offsets offsetLeft and offsetRight define how many positions the left or right border positions shall be moved. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right.
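
The sign convention for the two offsets can be made concrete with a toy window over sequence positions. MotifWindowSketch is a hypothetical class, not part of Jstacs, and the rule for rejecting a modification (a negative start or a window shorter than one position) is an assumption of this sketch.

```java
// Hypothetical window [start, end) illustrating the offset convention of modify.
final class MotifWindowSketch {

    private int start; // left border, inclusive
    private int end;   // right border, exclusive

    MotifWindowSketch(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Moves the left border by offsetLeft and the right border by offsetRight.
    // Negative offsets move a border to the left, positive offsets to the right.
    // Returns true if the window was modified, false if the request was rejected.
    boolean modify(int offsetLeft, int offsetRight) {
        int newStart = start + offsetLeft;
        int newEnd = end + offsetRight;
        if (newStart < 0 || newEnd - newStart < 1) {
            return false; // rejection rule is an assumption of this sketch
        }
        start = newStart;
        end = newEnd;
        return true;
    }

    int getStart() {
        return start;
    }

    int getLength() {
        return end - start;
    }

    public static void main(String[] args) {
        MotifWindowSketch window = new MotifWindowSketch(5, 10);
        // Left border one position to the left, right border two to the right.
        System.out.println(window.modify(-1, 2) + " " + window.getStart() + " " + window.getLength());
        // prints "true 4 8"
    }
}
```

Under this reading, the new length equals the old length minus offsetLeft plus offsetRight, so modify(-1, 2) extends a motif of length 5 to length 8.
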