java.lang.Object
de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel
de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractVariableLengthDiffSM
de.jstacs.sequenceScores.statisticalModels.differentiable.homogeneous.HomogeneousDiffSM
de.jstacs.sequenceScores.statisticalModels.differentiable.homogeneous.HomogeneousMMDiffSM
public class HomogeneousMMDiffSM
This scoring function implements a homogeneous Markov model of arbitrary order for sequences of any length. If the free parameters are used, the scoring function uses the parameterization of Meila, which yields a non-concave log conditional likelihood.
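To make the idea of a free (reduced) parameterization concrete, here is a minimal sketch. It assumes a softmax-style mapping from unconstrained parameters to one conditional distribution, where the free variant fixes the parameter of the last symbol at 0. The class `SoftmaxParamSketch` and its method are hypothetical and not part of Jstacs, and the exact mapping used by HomogeneousMMDiffSM may differ.

```java
/** Hypothetical illustration, not part of Jstacs. */
final class SoftmaxParamSketch {

    /**
     * Maps unconstrained parameters to a probability vector.
     * Assumption: with freeParams, lambda holds one value fewer than there are
     * symbols, and the parameter of the last symbol is fixed at 0.
     */
    static double[] toProbabilities(double[] lambda, boolean freeParams) {
        int n = freeParams ? lambda.length + 1 : lambda.length;
        double[] full = new double[n];
        // When freeParams is true, the last entry keeps its default value 0.
        System.arraycopy(lambda, 0, full, 0, lambda.length);

        // Subtract the maximum before exponentiating for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : full) {
            max = Math.max(max, v);
        }
        double sum = 0.0;
        double[] p = new double[n];
        for (int i = 0; i < n; i++) {
            p[i] = Math.exp(full[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < n; i++) {
            p[i] /= sum;
        }
        return p;
    }
}
```

With three free parameters set to 0, the sketch yields a uniform distribution over four symbols, which is the kind of starting point a uniform initialization would produce under this assumption.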
| Field Summary |
|---|
| Fields inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore |
|---|
alphabets, length, r |
| Fields inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore |
|---|
UNKNOWN |
| Constructor Summary | |
|---|---|
HomogeneousMMDiffSM(AlphabetContainer alphabets,
int order,
double classEss,
double[] sumOfHyperParams,
boolean plugIn,
boolean optimize,
int starts)
This is the main constructor that creates an instance of a homogeneous Markov model of arbitrary order. |
|
HomogeneousMMDiffSM(AlphabetContainer alphabets,
int order,
double classEss,
int length)
This is a convenience constructor for creating an instance of a homogeneous Markov model of arbitrary order. |
|
HomogeneousMMDiffSM(StringBuffer xml)
This is the constructor for Storable. |
|
| Method Summary | |
|---|---|
void |
addGradientOfLogPriorTerm(double[] grad,
int start)
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each
parameter of this model. |
HomogeneousMMDiffSM |
clone()
Creates a clone (deep copy) of the current DifferentiableSequenceScore
instance. |
DataSet |
emit(int numberOfSequences,
int... seqLength)
This method returns a DataSet object containing artificial
sequence(s). |
protected void |
fromXML(StringBuffer xml)
This method is called in the constructor for the Storable
interface to create a scoring function from a StringBuffer. |
double[][][] |
getAllConditionalStationaryDistributions()
This method returns the stationary conditional distributions. |
double[] |
getCurrentParameterValues()
Returns a double array of dimension
DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. |
double |
getESS()
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model. |
String |
getInstanceName()
Should return a short instance name such as iMM(0), BN(2), ... |
double |
getLogNormalizationConstant(int length)
This method returns the logarithm of the normalization constant for a given sequence length. |
double |
getLogPartialNormalizationConstant(int parameterIndex,
int length)
This method returns the logarithm of the partial normalization constant for a given parameter index and a sequence length. |
double |
getLogPriorTerm()
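This method computes a value that is proportional to
DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ),
where prior is the prior for the parameters of this model. |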
double |
getLogScoreAndPartialDerivation(Sequence seq,
int start,
int end,
IntList indices,
DoubleList dList)
Returns the logarithmic score for a Sequence beginning at
position start in the Sequence and fills lists with
the indices and the partial derivations. |
double |
getLogScoreFor(Sequence seq,
int start,
int end)
Returns the logarithmic score for the Sequence seq
beginning at position start in the Sequence. |
byte |
getMaximalMarkovOrder()
Returns the maximal Markov order used. |
int |
getNumberOfParameters()
Returns the number of parameters in this DifferentiableSequenceScore. |
int |
getNumberOfRecommendedStarts()
This method returns the number of recommended optimization starts. |
int[][] |
getSamplingGroups(int parameterOffset)
Returns groups of indexes of parameters that shall be drawn together in a sampling procedure |
int |
getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Returns the size of the event space of the random variables that are affected by parameter no. index. |
static double[] |
getSumOfHyperParameters(int order,
int length,
double ess)
This method returns an array that can be used in the constructor HomogeneousMMDiffSM(AlphabetContainer, int, double, double[], boolean, boolean, int)
containing the sums of the specific hyperparameters. |
void |
initializeFunction(int index,
boolean freeParams,
DataSet[] data,
double[][] weights)
This method creates the underlying structure of the DifferentiableSequenceScore. |
void |
initializeFunctionRandomly(boolean freeParams)
This method initializes the DifferentiableSequenceScore randomly. |
void |
initializeUniformly(boolean freeParams)
This method initializes the instance with a uniform distribution. |
boolean |
isInitialized()
This method can be used to determine whether the instance is initialized. |
boolean |
isNormalized()
This method indicates whether the implemented score is already normalized to 1 or not. |
void |
setParameterOptimization(boolean optimize)
This method allows the user to specify whether the parameters should be optimized or not. |
void |
setParameters(double[] params,
int start)
This method sets the internal parameters to the values of params between start and
start + DifferentiableSequenceScore.getNumberOfParameters() - 1. |
void |
setStatisticForHyperparameters(int[] length,
double[] weight)
This method sets the hyperparameters for the model parameters by evaluating the given statistic. |
String |
toString()
|
StringBuffer |
toXML()
This method returns an XML representation as StringBuffer of an
instance of the implementing class. |
| Methods inherited from class de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractVariableLengthDiffSM |
|---|
getLogNormalizationConstant, getLogPartialNormalizationConstant, getLogScoreAndPartialDerivation, getLogScoreFor |
| Methods inherited from class de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel |
|---|
emitDataSet, getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, isNormalized |
| Methods inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore |
|---|
getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreFor, getNumberOfStarts, getNumericalCharacteristics |
| Methods inherited from class java.lang.Object |
|---|
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.differentiable.DifferentiableStatisticalModel |
|---|
getLogNormalizationConstant, getLogPartialNormalizationConstant |
| Methods inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore |
|---|
getInitialClassParam, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation |
| Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.StatisticalModel |
|---|
emitDataSet, getLogProbFor, getLogProbFor, getLogProbFor |
| Methods inherited from interface de.jstacs.sequenceScores.SequenceScore |
|---|
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics |
| Constructor Detail |
|---|
public HomogeneousMMDiffSM(AlphabetContainer alphabets,
int order,
double classEss,
int length)
Parameters:
alphabets - the AlphabetContainer
order - the order of the model (has to be non-negative)
classEss - the equivalent sample size (ess) of the class
length - the sequence length (only used for computing the hyperparameters)
See Also:
getSumOfHyperParameters(int, int, double),
HomogeneousMMDiffSM(AlphabetContainer, int, double, double[], boolean, boolean, int)
public HomogeneousMMDiffSM(AlphabetContainer alphabets,
int order,
double classEss,
double[] sumOfHyperParams,
boolean plugIn,
boolean optimize,
int starts)
Parameters:
alphabets - the AlphabetContainer
order - the order of the model (has to be non-negative)
classEss - the equivalent sample size (ess) of the class
sumOfHyperParams - the sum of the hyperparameters for each order (length has to be order, each entry has to be non-negative)
plugIn - a switch that enables using the MAP parameters as plug-in parameters
optimize - a switch that determines whether the parameters are optimized or fixed
starts - the number of recommended starts
public HomogeneousMMDiffSM(StringBuffer xml)
throws NonParsableException
This is the constructor for Storable. Creates a new
HomogeneousMMDiffSM out of its XML representation as returned by
toXML().
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the StringBuffer representation could not be parsed
| Method Detail |
|---|
public static double[] getSumOfHyperParameters(int order,
int length,
double ess)
This method returns an array that can be used in the constructor
HomogeneousMMDiffSM(AlphabetContainer, int, double, double[], boolean, boolean, int)
containing the sums of the specific hyperparameters.
Parameters:
order - the order of the model
length - the sequence length
ess - the class ESS
See Also:
HomogeneousMMDiffSM(AlphabetContainer, int, double, double[], boolean, boolean, int)
public HomogeneousMMDiffSM clone()
throws CloneNotSupportedException
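Description copied from interface: DifferentiableSequenceScore
Creates a clone (deep copy) of the current DifferentiableSequenceScore
instance.
clone in interface DifferentiableSequenceScore
clone in interface SequenceScore
clone in class AbstractDifferentiableStatisticalModel
Returns:
the cloned instance of the current DifferentiableSequenceScore
Throws:
CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore
public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...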
public double getLogScoreFor(Sequence seq,
int start,
int end)
Description copied from interface: SequenceScore
Returns the logarithmic score for the Sequence seq
beginning at position start in the Sequence.
getLogScoreFor in interface SequenceScore
getLogScoreFor in interface VariableLengthDiffSM
getLogScoreFor in class AbstractVariableLengthDiffSM
Parameters:
seq - the Sequence
start - the start position in the Sequence
end - the end position (inclusive) in the Sequence
Returns:
the logarithmic score for the Sequence
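As an illustration of how a homogeneous Markov model can score a sub-sequence of any length, the following minimal sketch computes the log score of an integer-encoded sequence between start and end (inclusive) for a first-order model. The class `HomogeneousMMScoreSketch`, the `initial` and `transition` arrays, and the restriction to order 1 are assumptions made for this example; this is not the Jstacs implementation.

```java
/** Hypothetical illustration, not part of Jstacs. */
final class HomogeneousMMScoreSketch {

    /**
     * Log score of seq[start..end] (end inclusive) under a first-order
     * homogeneous Markov model given as probabilities.
     *
     * @param initial    initial[a] is the probability of symbol a at the first scored position
     * @param transition transition[a][b] is the probability of symbol b following symbol a
     */
    static double logScore(int[] seq, int start, int end,
                           double[] initial, double[][] transition) {
        // The first scored position has no context, so use the order-0 distribution.
        double logScore = Math.log(initial[seq[start]]);
        // The same transition matrix is reused at every later position (homogeneity).
        for (int i = start + 1; i <= end; i++) {
            logScore += Math.log(transition[seq[i - 1]][seq[i]]);
        }
        return logScore;
    }
}
```

Because the same conditional distribution is applied at every position, the method works for any start and end within the sequence, which is what allows a single model to handle variable sequence lengths.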
public double getLogScoreAndPartialDerivation(Sequence seq,
int start,
int end,
IntList indices,
DoubleList dList)
Description copied from interface: DifferentiableSequenceScore
Returns the logarithmic score for a Sequence beginning at
position start in the Sequence and fills lists with
the indices and the partial derivations.
getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
getLogScoreAndPartialDerivation in interface VariableLengthDiffSM
getLogScoreAndPartialDerivation in class AbstractVariableLengthDiffSM
Parameters:
seq - the Sequence
start - the start position in the Sequence
end - the end position (inclusive) in the Sequence
indices - an IntList of indices, after method invocation the list should contain the indices i where the partial derivation of the logarithmic score with respect to parameter i is not zero
dList - a DoubleList of partial derivations, after method invocation the list should contain the corresponding partial derivations that are not zero
Returns:
the logarithmic score for the Sequence
public int getNumberOfParameters()
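Description copied from interface: DifferentiableSequenceScore
Returns the number of parameters in this DifferentiableSequenceScore. If the
number of parameters is not known yet, the method returns
DifferentiableSequenceScore.UNKNOWN.
Returns:
the number of parameters in this DifferentiableSequenceScore
See Also:
DifferentiableSequenceScore.UNKNOWN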
public void setParameters(double[] params,
int start)
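Description copied from interface: DifferentiableSequenceScore
This method sets the internal parameters to the values of params between start and
start + DifferentiableSequenceScore.getNumberOfParameters() - 1
Parameters:
params - the new parameters
start - the start index in params
public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an
instance of the implementing class.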
public double[] getCurrentParameterValues()
Description copied from interface: DifferentiableSequenceScore
Returns a double array of dimension
DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values.
Before using these parameters to start an optimization, it is
highly recommended to invoke
DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][]).
After an optimization, this method can be used to get the current
parameter values.
public void initializeFunction(int index,
boolean freeParams,
DataSet[] data,
double[][] weights)
Description copied from interface: DifferentiableSequenceScore
This method creates the underlying structure of the DifferentiableSequenceScore.
Parameters:
index - the index of the class the DifferentiableSequenceScore models
freeParams - indicates whether the (reduced) parameterization is used
data - the samples
weights - the weights of the sequences in the samples
public void initializeFunctionRandomly(boolean freeParams)
Description copied from interface: DifferentiableSequenceScore
This method initializes the DifferentiableSequenceScore randomly. It has to
create the underlying structure of the DifferentiableSequenceScore.
Parameters:
freeParams - indicates whether the (reduced) parameterization is used
protected void fromXML(StringBuffer xml)
throws NonParsableException
Description copied from class: AbstractDifferentiableSequenceScore
This method is called in the constructor for the Storable
interface to create a scoring function from a StringBuffer.
fromXML in class AbstractDifferentiableSequenceScore
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also:
AbstractDifferentiableSequenceScore.AbstractDifferentiableSequenceScore(StringBuffer)
public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
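Description copied from interface: DifferentiableStatisticalModel
Returns the size of the event space of the random variables that are
affected by parameter no. index, i.e. the product of the
sizes of the alphabets at the position of each random variable affected
by parameter index. For DNA alphabets this corresponds to 4
for a PWM, 16 for a WAM except position 0, ...
Parameters:
index - the index of the parameter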
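The DNA figures above (4 for a PWM, 16 for a WAM) follow from multiplying alphabet sizes over the affected random variables. A minimal sketch, assuming every position shares one alphabet and a parameter affects the current symbol plus `contextLength` preceding symbols; the class `EventSpaceSketch` and its method are hypothetical and not part of Jstacs:

```java
/** Hypothetical illustration, not part of Jstacs. */
final class EventSpaceSketch {

    /**
     * Product of the alphabet sizes of all random variables affected by a
     * parameter: the current symbol and contextLength context symbols.
     * Assumption: all positions use the same alphabet.
     */
    static int sizeOfEventSpace(int alphabetSize, int contextLength) {
        int size = 1;
        for (int i = 0; i <= contextLength; i++) {
            size *= alphabetSize;
        }
        return size;
    }
}
```

For DNA, `sizeOfEventSpace(4, 0)` gives 4 (no context, as in a PWM) and `sizeOfEventSpace(4, 1)` gives 16 (one context symbol, as in a WAM).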
public double getLogNormalizationConstant(int length)
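Description copied from interface: VariableLengthDiffSM
This method returns the logarithm of the normalization constant for a given sequence length.
Parameters:
length - the sequence length
See Also:
DifferentiableStatisticalModel.getLogNormalizationConstant()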
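One way to see why the normalization constant depends on the sequence length is to compute it for an unnormalized first-order model by dynamic programming. The sketch below sums the exponentiated scores of all sequences of the given length in log space. The class `LogNormConstantSketch`, the arrays `lambda0` and `lambda1`, and the restriction to order 1 are assumptions for this example, not the Jstacs implementation.

```java
/** Hypothetical illustration, not part of Jstacs. */
final class LogNormConstantSketch {

    /**
     * Log of the sum over all sequences x of the given length of
     * exp(lambda0[x_1] + sum_i lambda1[x_{i-1}][x_i]).
     */
    static double logNormalizationConstant(double[] lambda0, double[][] lambda1, int length) {
        if (length == 0) {
            return 0.0; // only the empty sequence, with score exp(0) = 1
        }
        int n = lambda0.length;
        // logAlpha[b]: log of the summed weights of all prefixes ending in symbol b.
        double[] logAlpha = lambda0.clone();
        for (int pos = 1; pos < length; pos++) {
            double[] next = new double[n];
            for (int b = 0; b < n; b++) {
                double[] terms = new double[n];
                for (int a = 0; a < n; a++) {
                    terms[a] = logAlpha[a] + lambda1[a][b];
                }
                next[b] = logSumExp(terms);
            }
            logAlpha = next;
        }
        return logSumExp(logAlpha);
    }

    /** Numerically stable log(sum(exp(v))). */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max;
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }
}
```

If all parameters are 0 over a binary alphabet, every sequence has weight 1, so the constant for length 3 is log 8. If the parameters are logarithms of proper probabilities, the constant is 0 for every length.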
public double getLogPartialNormalizationConstant(int parameterIndex,
int length)
throws Exception
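Description copied from interface: VariableLengthDiffSM
This method returns the logarithm of the partial normalization constant for a given parameter index and a sequence length.
Parameters:
parameterIndex - the index of the parameter
length - the sequence length
Throws:
Exception - if something went wrong
See Also:
DifferentiableStatisticalModel.getLogPartialNormalizationConstant(int)
public double getESS()
Description copied from interface: DifferentiableStatisticalModel
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
public String toString()
toString in class Object
public double getLogPriorTerm()
Description copied from interface: DifferentiableStatisticalModel
This method computes a value that is proportional to
DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
where prior is the prior for the parameters of this model.
Returns:
a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ).
See Also:
DifferentiableStatisticalModel.getESS(),
DifferentiableStatisticalModel.getLogNormalizationConstant()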
public void addGradientOfLogPriorTerm(double[] grad,
int start)
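Description copied from interface: DifferentiableStatisticalModel
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each
parameter of this model. The results are added to the array
grad beginning at index start.
Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivations for the parameters of this model shall be entered
See Also:
DifferentiableStatisticalModel.getLogPriorTerm()
public boolean isNormalized()
Description copied from interface: DifferentiableStatisticalModel
This method indicates whether the implemented score is already normalized to 1 or not. The standard implementation returns false.
isNormalized in interface DifferentiableStatisticalModel
isNormalized in class AbstractDifferentiableStatisticalModel
Returns:
true if the implemented score is already normalized to 1, false otherwise
public boolean isInitialized()
Description copied from interface: SequenceScore
This method can be used to determine whether the instance is initialized. If the instance is initialized, it should be possible to invoke SequenceScore.getLogScoreFor(Sequence).
Returns:
true if the instance is initialized, false otherwise
public byte getMaximalMarkovOrder()
Description copied from class: HomogeneousDiffSM
Returns the maximal Markov order used.
getMaximalMarkovOrder in interface StatisticalModel
getMaximalMarkovOrder in class HomogeneousDiffSM
public int getNumberOfRecommendedStarts()
Description copied from interface: DifferentiableSequenceScore
This method returns the number of recommended optimization starts.
getNumberOfRecommendedStarts in interface DifferentiableSequenceScore
getNumberOfRecommendedStarts in class AbstractDifferentiableSequenceScore
public void setParameterOptimization(boolean optimize)
This method allows the user to specify whether the parameters should be optimized or not.
Parameters:
optimize - indicates if the parameters should be optimized or not
public double[][][] getAllConditionalStationaryDistributions()
This method returns the stationary conditional distributions.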
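For readers unfamiliar with stationary distributions, the following minimal sketch approximates the stationary distribution of a single first-order transition matrix by power iteration. It covers only this simple case; the layout of the three-dimensional array returned by getAllConditionalStationaryDistributions() is not described here, and the class `StationarySketch` with its method is hypothetical and not part of Jstacs.

```java
import java.util.Arrays;

/** Hypothetical illustration, not part of Jstacs. */
final class StationarySketch {

    /**
     * Approximates the distribution pi with pi = pi * transition by repeatedly
     * applying the transition matrix to a uniform start vector.
     *
     * @param transition transition[a][b] is the probability of symbol b following symbol a
     */
    static double[] stationaryDistribution(double[][] transition, int iterations) {
        int n = transition.length;
        double[] pi = new double[n];
        Arrays.fill(pi, 1.0 / n);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            for (int a = 0; a < n; a++) {
                for (int b = 0; b < n; b++) {
                    next[b] += pi[a] * transition[a][b];
                }
            }
            pi = next;
        }
        return pi;
    }
}
```

For the transition matrix {{0.9, 0.1}, {0.2, 0.8}}, the iteration converges to approximately (2/3, 1/3), since that vector satisfies the balance equation 0.1 * pi[0] = 0.2 * pi[1].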
public void setStatisticForHyperparameters(int[] length,
double[] weight)
throws Exception
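Description copied from interface: VariableLengthDiffSM
This method sets the hyperparameters for the model parameters by evaluating the given statistic.
The statistic describes how long the observed sequences are (length) and how often (
weight) they have been seen.
Parameters:
length - the non-negative lengths of the sequences
weight - the non-negative weight for the corresponding sequence
Throws:
Exception - if something went wrong
See Also:
Mutable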
public DataSet emit(int numberOfSequences,
int... seqLength)
throws Exception
This method returns a DataSet object containing artificial
sequence(s).
There are two ways to create a DataSet:
emitDataSet( int n, int l ) returns a DataSet with
n sequences of length l.
emitDataSet( int n, int[] l ) should return a
DataSet with n sequences which have a sequence length
corresponding to the entry in the array.
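The two calling conventions above (one shared length, or one length per sequence) can be sketched for a first-order model as follows. This is a hedged illustration only: the class `EmitSketch`, the integer encoding of symbols, the `initial` and `transition` arrays, and the explicit `Random` argument are assumptions for this example and do not reflect the Jstacs DataSet API.

```java
import java.util.Random;

/** Hypothetical illustration, not part of Jstacs. */
final class EmitSketch {

    /**
     * Draws numberOfSequences integer-encoded sequences from a first-order
     * homogeneous Markov model. If seqLength has one entry, all sequences share
     * that length; otherwise entry s gives the length of sequence s.
     */
    static int[][] emit(int numberOfSequences, double[] initial, double[][] transition,
                        Random rng, int... seqLength) {
        int[][] data = new int[numberOfSequences][];
        for (int s = 0; s < numberOfSequences; s++) {
            int len = seqLength.length == 1 ? seqLength[0] : seqLength[s];
            int[] seq = new int[len];
            for (int i = 0; i < len; i++) {
                // The first symbol has no context; later symbols depend on their predecessor.
                double[] dist = (i == 0) ? initial : transition[seq[i - 1]];
                seq[i] = draw(dist, rng);
            }
            data[s] = seq;
        }
        return data;
    }

    /** Draws one symbol index from a discrete distribution by inverting the cumulative sum. */
    static int draw(double[] dist, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int k = 0; k < dist.length - 1; k++) {
            cumulative += dist[k];
            if (u < cumulative) {
                return k;
            }
        }
        return dist.length - 1;
    }
}
```

Passing a seeded `Random` makes the emitted sequences reproducible, which is convenient when artificial data is used in tests.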
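Parameters:
numberOfSequences - the number of sequences that should be contained in the returned DataSet
seqLength - the length of the sequences
Returns:
a DataSet containing numberOfSequences artificial sequence(s)
Throws:
Exception - if the emission of the artificial DataSet did not succeed
See Also:
DataSet
public void initializeUniformly(boolean freeParams)
Description copied from class: HomogeneousDiffSM
This method initializes the instance with a uniform distribution.
initializeUniformly in class HomogeneousDiffSM
Parameters:
freeParams - a switch whether to take only free parameters or to take all
public int[][] getSamplingGroups(int parameterOffset)
Description copied from interface: SamplingDifferentiableStatisticalModel
Returns groups of indexes of parameters that shall be drawn together in a sampling procedure
Parameters:
parameterOffset - a global offset on the parameter indexes
Returns:
the groups of indexes, shifted by parameterOffset.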