public class UniformDiffSM extends UniformDiffSS implements SamplingDifferentiableStatisticalModel
This DifferentiableStatisticalModel does nothing, so it is possible to save parameters in an optimization.

Inherited fields: alphabets, length, r, UNKNOWN

| Constructor and Description |
|---|
| UniformDiffSM(AlphabetContainer alphabets, int length, double ess)<br>This is the main constructor that creates an instance of a UniformDiffSM that models each sequence uniformly. |
| UniformDiffSM(StringBuffer xml)<br>This is the constructor for the interface Storable. |

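To make "models each sequence uniformly" concrete, the following sketch computes the log probability that a uniform distribution assigns to any single sequence of a fixed length. It is plain Java and does not use the library: the class `UniformLogProb` and its method are hypothetical names, and whether UniformDiffSM computes its score in exactly this way is an assumption.

```java
/** Illustrative only: uniform log probability over fixed-length sequences. */
class UniformLogProb {

    /**
     * Log probability of any one sequence when every sequence is equally likely.
     * alphabetSizes[i] is the number of symbols allowed at position i.
     */
    static double logProb(int[] alphabetSizes) {
        double logNumberOfSequences = 0.0;
        for (int size : alphabetSizes) {
            if (size <= 0) {
                throw new IllegalArgumentException("alphabet size must be positive");
            }
            logNumberOfSequences += Math.log(size);
        }
        // P(sequence) = 1 / (number of sequences), so log P = -log(number of sequences).
        return -logNumberOfSequences;
    }

    public static void main(String[] args) {
        // Assumed example: a DNA alphabet (4 symbols) at each of 12 positions.
        int[] dna12 = new int[12];
        java.util.Arrays.fill(dna12, 4);
        System.out.println(logProb(dna12)); // equals -12 * ln(4)
    }
}
```

Because the value depends only on the alphabet sizes and the length, such a model needs no free parameters, which fits the remark that it allows saving parameters in an optimization.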
| Modifier and Type | Method and Description |
|---|---|
| void | addGradientOfLogPriorTerm(double[] grad, int start)<br>This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. |
| DataSet | emitDataSet(int numberOfSequences, int... seqLength)<br>This method returns a DataSet object containing artificial sequence(s). |
| protected void | extractFurtherInformation(StringBuffer xml)<br>This method is the opposite of UniformDiffSS.getFurtherInformation(). |
| double | getESS()<br>Returns the equivalent sample size (ess) of this model. |
| protected StringBuffer | getFurtherInformation()<br>This method is used to append further information of the instance to the XML representation. |
| double | getLogNormalizationConstant()<br>Returns the logarithm of the sum of the scores over all sequences of the event space. |
| double | getLogPartialNormalizationConstant(int parameterIndex)<br>Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. |
| double | getLogPriorTerm()<br>This method computes a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ). |
| double | getLogProbFor(Sequence sequence)<br>Returns the logarithm of the probability of the given sequence given the model. |
| double | getLogProbFor(Sequence sequence, int startpos)<br>Returns the logarithm of the probability of (a part of) the given sequence given the model. |
| double | getLogProbFor(Sequence sequence, int startpos, int endpos)<br>Returns the logarithm of the probability of (a part of) the given sequence given the model. |
| double | getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList dList) |
| double | getLogScoreFor(Sequence seq, int start) |
| byte | getMaximalMarkovOrder()<br>This method returns the maximal used Markov order, if possible. |
| int[][] | getSamplingGroups(int parameterOffset)<br>Returns groups of indexes of parameters that shall be drawn together in a sampling procedure. |
| int | getSizeOfEventSpaceForRandomVariablesOfParameter(int index)<br>Returns the size of the event space of the random variables that are affected by parameter no. index. |
| void | initializeFunction(int index, boolean meila, DataSet[] data, double[][] weights)<br>This method creates the underlying structure of the DifferentiableSequenceScore. |
| boolean | isNormalized()<br>This method indicates whether the implemented score is already normalized to 1 or not. |
| String | toString(NumberFormat nf)<br>This method returns a String representation of the instance. |

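The documentation of getSizeOfEventSpaceForRandomVariablesOfParameter describes its result as the product of the alphabet sizes at the positions of the affected random variables, with 4 for a PWM and 16 for a WAM on DNA. The sketch below reproduces that arithmetic in plain Java. `EventSpaceSize` and its arguments are hypothetical and not part of the library, and reading a WAM parameter as one that also depends on the preceding position is an assumption.

```java
/** Illustrative only: size of the event space of the random variables a parameter affects. */
class EventSpaceSize {

    /**
     * alphabetSizes[p] is the alphabet size at sequence position p;
     * affectedPositions lists the positions whose random variables the parameter touches.
     */
    static int sizeOfEventSpace(int[] alphabetSizes, int... affectedPositions) {
        int size = 1;
        for (int position : affectedPositions) {
            // multiplyExact throws instead of silently overflowing.
            size = Math.multiplyExact(size, alphabetSizes[position]);
        }
        return size;
    }

    public static void main(String[] args) {
        int[] dna = {4, 4, 4, 4};
        // A PWM parameter depends on a single position.
        System.out.println(sizeOfEventSpace(dna, 2));    // 4
        // Assumed WAM-like parameter: depends on a position and its predecessor.
        System.out.println(sizeOfEventSpace(dna, 1, 2)); // 16
    }
}
```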
Inherited methods, one group per declaring class or interface:

fromXML, getCurrentParameterValues, getInstanceName, getNumberOfParameters, initializeFunctionRandomly, isInitialized, setParameters, toXML

clone, getAlphabetContainer, getCharacteristics, getInitialClassParam, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumberOfRecommendedStarts, getNumberOfStarts, getNumericalCharacteristics, toString

equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

clone, getCurrentParameterValues, getInitialClassParam, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getNumberOfParameters, getNumberOfRecommendedStarts, initializeFunctionRandomly, setParameters

getAlphabetContainer, getCharacteristics, getInstanceName, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics, isInitialized

public UniformDiffSM(AlphabetContainer alphabets, int length, double ess)
This is the main constructor that creates an instance of a UniformDiffSM that models each sequence uniformly.

Parameters:
alphabets - the AlphabetContainer
length - the length of the modeled sequences
ess - the equivalent sample size (ess) of the class

public UniformDiffSM(StringBuffer xml) throws NonParsableException

This is the constructor for the interface Storable. Creates a new UniformDiffSM out of its XML representation as returned by UniformDiffSS.toXML().

Parameters:
xml - the XML representation as StringBuffer

Throws:
NonParsableException - if the XML representation could not be parsed

public double getLogScoreFor(Sequence seq, int start)
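Description copied from interface: SequenceScore

Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence.

Specified by: getLogScoreFor in interface SequenceScore
Overrides: getLogScoreFor in class UniformDiffSS

Parameters:
seq - the Sequence
start - the start position in the Sequence

See Also: Sequence

public double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList dList)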
Description copied from interface: DifferentiableSequenceScore

Returns the logarithmic score for the Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivations.

Specified by: getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Overrides: getLogScoreAndPartialDerivation in class UniformDiffSS

Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i for which the partial derivation with respect to parameter i is not zero
dList - a DoubleList of partial derivations; after method invocation the list should contain the corresponding partial derivations that are not zero

See Also: Sequence

protected StringBuffer getFurtherInformation()
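Description copied from class: UniformDiffSS

This method is used to append further information of the instance to the XML representation.

Overrides: getFurtherInformation in class UniformDiffSS

Returns: the further information as XML code in a StringBuffer

See Also: UniformDiffSS.extractFurtherInformation(StringBuffer)

protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException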
Description copied from class: UniformDiffSS

This method is the opposite of UniformDiffSS.getFurtherInformation(). It extracts further information of the instance from an XML representation.

Overrides: extractFurtherInformation in class UniformDiffSS

Parameters:
xml - the StringBuffer containing the information to be extracted as XML code

Throws:
NonParsableException - if the StringBuffer could not be parsed

See Also: UniformDiffSS.getFurtherInformation()

public double getLogNormalizationConstant()
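Description copied from interface: DifferentiableStatisticalModel

Returns the logarithm of the sum of the scores over all sequences of the event space.

Specified by: getLogNormalizationConstant in interface DifferentiableStatisticalModel

public void initializeFunction(int index, boolean meila, DataSet[] data, double[][] weights)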
Description copied from interface: DifferentiableSequenceScore

This method creates the underlying structure of the DifferentiableSequenceScore.

Specified by: initializeFunction in interface DifferentiableSequenceScore
Overrides: initializeFunction in class UniformDiffSS

Parameters:
index - the index of the class the DifferentiableSequenceScore models
meila - indicates whether the (reduced) parameterization is used
data - the data sets
weights - the weights of the sequences in the data sets

public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Description copied from interface: DifferentiableStatisticalModel

Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...

Specified by: getSizeOfEventSpaceForRandomVariablesOfParameter in interface DifferentiableStatisticalModel

Parameters:
index - the index of the parameter

public double getLogPartialNormalizationConstant(int parameterIndex)
throws Exception

Description copied from interface: DifferentiableStatisticalModel

Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivation of the normalization constant for the parameter with index parameterIndex,

\[\log \frac{\partial Z(\underline{\lambda})}{\partial \lambda_{parameterIndex}}\]
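As a hedged illustration of the quantity above, consider a toy model that is not taken from the library: a single position with one parameter per symbol and unnormalized score exp(lambda[a]), so that Z is the sum of exp(lambda[a]) over all symbols. Under that assumption the partial derivative of Z with respect to lambda[i] is exp(lambda[i]), and its logarithm is simply lambda[i]. The class `ToyNormalization` and all of its methods are hypothetical; the finite difference is only a numerical cross-check.

```java
/** Illustrative only: log Z and log dZ/dlambda_i for a toy one-position model. */
class ToyNormalization {

    /** log Z = log sum_a exp(lambda[a]), computed stably via log-sum-exp. */
    static double logNormalizationConstant(double[] lambda) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : lambda) {
            max = Math.max(max, l);
        }
        double sum = 0.0;
        for (double l : lambda) {
            sum += Math.exp(l - max);
        }
        return max + Math.log(sum);
    }

    /** Only the term exp(lambda[i]) depends on lambda[i], so log dZ/dlambda_i = lambda[i]. */
    static double logPartialNormalizationConstant(double[] lambda, int i) {
        return lambda[i];
    }

    /** Logarithm of a central finite-difference estimate of dZ/dlambda_i. */
    static double finiteDifferenceLogPartial(double[] lambda, int i, double h) {
        double[] up = lambda.clone();
        double[] down = lambda.clone();
        up[i] += h;
        down[i] -= h;
        double dZ = (Math.exp(logNormalizationConstant(up))
                - Math.exp(logNormalizationConstant(down))) / (2 * h);
        return Math.log(dZ);
    }

    public static void main(String[] args) {
        double[] lambda = {0.3, -1.2, 0.0, 2.0};
        System.out.println(logPartialNormalizationConstant(lambda, 1));
        System.out.println(finiteDifferenceLogPartial(lambda, 1, 1e-6));
    }
}
```

The two printed values should agree up to the error of the finite difference.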
Specified by: getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel

Parameters:
parameterIndex - the index of the parameter

Throws:
Exception - if something went wrong with the normalization

See Also: DifferentiableStatisticalModel.getLogNormalizationConstant()

public double getESS()
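Description copied from interface: DifferentiableStatisticalModel

Returns the equivalent sample size (ess) of this model.

Specified by: getESS in interface DifferentiableStatisticalModel

public String toString(NumberFormat nf)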
Description copied from interface: SequenceScore

This method returns a String representation of the instance.

Specified by: toString in interface SequenceScore
Overrides: toString in class UniformDiffSS

Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities

Returns: a String representation of the instance

public double getLogPriorTerm()
Description copied from interface: DifferentiableStatisticalModel

This method computes a value that is proportional to

DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )

where prior is the prior for the parameters of this model.

Specified by: getLogPriorTerm in interface DifferentiableStatisticalModel
Specified by: getLogPriorTerm in interface StatisticalModel

Returns: a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )

See Also: DifferentiableStatisticalModel.getESS(), DifferentiableStatisticalModel.getLogNormalizationConstant()

public void addGradientOfLogPriorTerm(double[] grad, int start)
Description copied from interface: DifferentiableStatisticalModel

This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.

Specified by: addGradientOfLogPriorTerm in interface DifferentiableStatisticalModel

Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivations for the parameters of this model shall be entered

See Also: DifferentiableStatisticalModel.getLogPriorTerm()

public boolean isNormalized()
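Description copied from interface: DifferentiableStatisticalModel

This method indicates whether the implemented score is already normalized to 1 or not. The standard implementation returns false.

Specified by: isNormalized in interface DifferentiableStatisticalModel

Returns: true if the implemented score is already normalized to 1, false otherwise

public int[][] getSamplingGroups(int parameterOffset)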
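Description copied from interface: SamplingDifferentiableStatisticalModel

Returns groups of indexes of parameters that shall be drawn together in a sampling procedure.

Specified by: getSamplingGroups in interface SamplingDifferentiableStatisticalModel

Parameters:
parameterOffset - a global offset on the parameter indexes

Returns: the groups of parameter indexes, each shifted by parameterOffset

public double getLogProbFor(Sequence sequence, int startpos) throws Exception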
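Description copied from interface: StatisticalModel

Returns the logarithm of the probability of (a part of) the given sequence given the model. If the model has a fixed length and the given sequence is longer, the part to be used starts at startpos. E.g., if the fixed length is 12, the length of the given sequence is 30 and startpos=15, the logarithm of the probability of the part from position 15 to 26 (inclusive) given the model should be returned.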
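The position arithmetic in the example above can be sketched as follows. This is plain Java with hypothetical names (`ScoredWindow`, `window`), not library code, and it assumes zero-based positions for the bounds check.

```java
/** Illustrative only: which positions a fixed-length model reads from a longer sequence. */
class ScoredWindow {

    /** Returns {first, last} (both inclusive) of the part that starts at startpos. */
    static int[] window(int sequenceLength, int modelLength, int startpos) {
        int last = startpos + modelLength - 1;
        // Assumes zero-based positions, so the last valid position is sequenceLength - 1.
        if (startpos < 0 || last >= sequenceLength) {
            throw new IllegalArgumentException("window does not fit into the sequence");
        }
        return new int[] {startpos, last};
    }

    public static void main(String[] args) {
        int[] w = window(30, 12, 15);
        System.out.println(w[0] + " to " + w[1]); // 15 to 26
    }
}
```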

The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.

Specified by: getLogProbFor in interface StatisticalModel

Parameters:
sequence - the given sequence
startpos - the start position within the given sequence

Throws:
Exception - if the sequence could not be handled by the model
NotTrainedException - if the model is not trained yet

See Also: StatisticalModel.getLogProbFor(Sequence, int, int)

public double getLogProbFor(Sequence sequence) throws Exception
Description copied from interface: StatisticalModel

Returns the logarithm of the probability of the given sequence given the model. The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.

Specified by: getLogProbFor in interface StatisticalModel

Parameters:
sequence - the given sequence for which the logarithm of the probability/the value of the density function should be returned

Throws:
Exception - if the sequence could not be handled by the model
NotTrainedException - if the model is not trained yet

See Also: StatisticalModel.getLogProbFor(Sequence, int, int)

public double getLogProbFor(Sequence sequence, int startpos, int endpos) throws Exception
Description copied from interface: StatisticalModel

Returns the logarithm of the probability of (a part of) the given sequence given the model. This method extends StatisticalModel.getLogProbFor(Sequence, int) by the fact that the model could be, e.g., homogeneous, and therefore the length of the sequences whose probability should be returned is not fixed. Additionally, the end position of the part of the given sequence is given, and the probability of the part from position startpos to endpos (inclusive) should be returned.

The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.

Specified by: getLogProbFor in interface StatisticalModel

Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account

Throws:
Exception - if the sequence could not be handled (e.g. startpos > , endpos > sequence.length, ...) by the model
NotTrainedException - if the model is not trained yet

public DataSet emitDataSet(int numberOfSequences, int... seqLength) throws NotTrainedException, Exception
Description copied from interface: StatisticalModel

This method returns a DataSet object containing artificial sequence(s).

For a homogeneous model:
emitDataSet( int n, int l ) should return a data set with n sequences of length l.
emitDataSet( int n, int[] l ) should return a data set with n sequences which have a sequence length corresponding to the entry in the given array l.

For an inhomogeneous model:
emitDataSet( int n ) and emitDataSet( int n, null ) should return a data set with n sequences that have the length of the model ( SequenceScore.getLength() ).
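The call forms listed above all pass through one varargs parameter, int... seqLength. The sketch below shows one way such an argument could be turned into a length per sequence. It is a hypothetical helper (`EmitLengths.resolveLengths`) in plain Java, not the library's implementation; treating a single value as the length of every sequence and rejecting any other array size are assumptions made for the illustration.

```java
import java.util.Arrays;

/** Illustrative only: resolving the varargs seqLength argument into one length per sequence. */
class EmitLengths {

    static int[] resolveLengths(int modelLength, int numberOfSequences, int... seqLength) {
        int[] lengths = new int[numberOfSequences];
        if (seqLength == null || seqLength.length == 0) {
            // emitDataSet( n ) or emitDataSet( n, null ): use the length of the model.
            Arrays.fill(lengths, modelLength);
        } else if (seqLength.length == 1) {
            // emitDataSet( n, l ): every sequence has length l.
            Arrays.fill(lengths, seqLength[0]);
        } else if (seqLength.length == numberOfSequences) {
            // emitDataSet( n, int[] l ): one entry per sequence.
            System.arraycopy(seqLength, 0, lengths, 0, numberOfSequences);
        } else {
            throw new IllegalArgumentException("expected 0, 1 or n sequence lengths");
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(resolveLengths(12, 3)));               // [12, 12, 12]
        System.out.println(Arrays.toString(resolveLengths(0, 3, 7)));             // [7, 7, 7]
        System.out.println(Arrays.toString(resolveLengths(0, 3, 5, 6, 7)));       // [5, 6, 7]
        System.out.println(Arrays.toString(resolveLengths(12, 3, (int[]) null))); // [12, 12, 12]
    }
}
```

Note that in Java a call without any length yields an empty array, while an explicit null has to be cast to int[] to reach the varargs parameter as null; the sketch treats both cases alike.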

The standard implementation throws an Exception.

Specified by: emitDataSet in interface StatisticalModel

Parameters:
numberOfSequences - the number of sequences that should be contained in the returned data set
seqLength - the length of the sequences for a homogeneous model; for an inhomogeneous model this parameter should be null or an array of size 0.

Returns: a DataSet containing the artificial sequence(s)

Throws:
NotTrainedException - if the model is not trained yet
Exception - if the emission did not succeed

See Also: DataSet

public byte getMaximalMarkovOrder()
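throws UnsupportedOperationException

Description copied from interface: StatisticalModel

This method returns the maximal used Markov order, if possible.

Specified by: getMaximalMarkovOrder in interface StatisticalModel

Throws:
UnsupportedOperationException - if the model can't give a proper answer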
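
Returning to getLogPriorTerm and addGradientOfLogPriorTerm: the formula getESS() * getLogNormalizationConstant() + Math.log( prior ) and the rule that gradient entries are added to grad beginning at index start can be sketched on a toy model. Everything here is an assumption made for illustration and not library code: the class `ToyLogPrior`, a single position with Z equal to the sum of exp(lambda[a]), and a flat prior whose logarithm is taken as 0.

```java
/** Illustrative only: a log prior term and its gradient for a toy model with a flat prior. */
class ToyLogPrior {

    /** log Z = log sum_a exp(lambda[a]), computed stably via log-sum-exp. */
    static double logNormalizationConstant(double[] lambda) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : lambda) {
            max = Math.max(max, l);
        }
        double sum = 0.0;
        for (double l : lambda) {
            sum += Math.exp(l - max);
        }
        return max + Math.log(sum);
    }

    /** ess * logNormalizationConstant + log(prior), with log(prior) = 0 (assumed flat prior). */
    static double logPriorTerm(double ess, double[] lambda) {
        double logPrior = 0.0;
        return ess * logNormalizationConstant(lambda) + logPrior;
    }

    /** Adds the partial derivatives to grad[start], grad[start + 1], ...; other entries are untouched. */
    static void addGradientOfLogPriorTerm(double ess, double[] lambda, double[] grad, int start) {
        double logZ = logNormalizationConstant(lambda);
        for (int i = 0; i < lambda.length; i++) {
            // d/dlambda_i [ess * log Z] = ess * exp(lambda_i) / Z
            grad[start + i] += ess * Math.exp(lambda[i] - logZ);
        }
    }

    public static void main(String[] args) {
        double[] lambda = {0.0, 0.0, 0.0};
        double[] grad = {1.0, 1.0, 1.0, 1.0, 1.0};
        addGradientOfLogPriorTerm(3.0, lambda, grad, 2);
        System.out.println(logPriorTerm(3.0, lambda));
        System.out.println(java.util.Arrays.toString(grad));
    }
}
```

The accumulation with += matters because several models typically share one gradient array, each writing into its own slice that begins at its start index.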