public abstract class AbstractMixtureDiffSM extends AbstractDifferentiableStatisticalModel implements SamplingDifferentiableStatisticalModel

This abstract class is a mixture of DifferentiableStatisticalModels. If these are already normalized (see DifferentiableStatisticalModel.isNormalized()), the potential is parameterized using the Meila parameterization; otherwise it is parameterized using the unnormalized MRF (Markov random field) parameterization.

| Modifier and Type | Field and Description |
|---|---|
| protected double[] | componentScore: This array is used while computing the score. |
| protected DoubleList[] | dList: This array contains some DoubleLists that are used while computing the partial derivatives. |
| protected boolean | freeParams: This boolean indicates whether free parameterization or all parameters are used. |
| protected DifferentiableStatisticalModel[] | function: This array contains the internal DifferentiableStatisticalModels that are used to determine the score. |
| protected double[] | hiddenParameter: This array contains the hidden parameters of the instance. |
| protected double[] | hiddenPotential: This array contains the hidden potentials of the instance. |
| protected IntList[] | iList: This array contains some IntLists that are used while computing the partial derivatives. |
| protected double | logGammaSum: This double contains the sum of the logarithm of the gamma functions used in the prior. |
| protected double | logHiddenNorm: This double contains the logarithm of the normalization constant of hidden parameters of the instance. |
| protected double[] | logHiddenPotential: This array contains the logarithm of the hidden potentials of the instance. |
| protected double | norm: This double contains the normalization constant of the instance. |
| protected boolean | optimizeHidden: This boolean indicates whether to optimize the hidden variables of this instance. |
| protected int[] | paramRef: This array contains the references/indices for the parameters. |
| protected double[] | partNorm: This array contains the partial normalization constants. |

Inherited fields: alphabets, length, r, UNKNOWN

| Modifier | Constructor and Description |
|---|---|
| protected | AbstractMixtureDiffSM(int length, int starts, int dimension, boolean optimizeHidden, boolean plugIn, DifferentiableStatisticalModel... function): This constructor creates a new AbstractMixtureDiffSM. |
| protected | AbstractMixtureDiffSM(StringBuffer xml): This is the constructor for the interface Storable. |

| Modifier and Type | Method and Description |
|---|---|
| void | addGradientOfLogPriorTerm(double[] grad, int start): This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. |
| AbstractMixtureDiffSM | clone(): Creates a clone (deep copy) of the current DifferentiableSequenceScore instance. |
| protected void | cloneFunctions(DifferentiableStatisticalModel[] originalFunctions): This method clones the given array of functions and enables the user to do some post-processing. |
| protected void | computeHiddenParameter(double[] statistic, boolean add): This method has to be invoked during an initialization. |
| protected void | computeLogGammaSum(): This method is used to pre-compute the sum of the logarithm of the gamma functions that is used in the prior. |
| protected boolean | determineIsNormalized(): This method is used to determine the value that is returned by the method isNormalized(). |
| protected void | extractFurtherInformation(StringBuffer xml): This method is the opposite of getFurtherInformation(). |
| protected abstract void | fillComponentScores(Sequence seq, int start): Fills the internal array componentScore with the logarithmic scores of the components given a Sequence. |
| protected void | fromXML(StringBuffer b): This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer. |
| double[] | getAPrioriMixtureProbabilities(): Returns the mixture probabilities (i.e., the a-priori probabilities of the different components). |
| double[] | getCurrentParameterValues(): Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. |
| DifferentiableStatisticalModel[] | getDifferentiableStatisticalModels(): Returns a deep copy of all internally used DifferentiableStatisticalModels. |
| DifferentiableStatisticalModel | getFunction(int index): This method returns a specific internal function. |
| DifferentiableStatisticalModel[] | getFunctions(): This method returns an array of clones of the internally used functions. |
| protected StringBuffer | getFurtherInformation(): This method is used to append further information of the instance to the XML representation. |
| abstract double | getHyperparameterForHiddenParameter(int index): This method returns the hyperparameter for the hidden parameter with index index. |
| int | getIndexOfMaximalComponentFor(Sequence seq, int start): Returns the index of the component that has the greatest impact on the complete score for a Sequence. |
| protected int[] | getIndices(int index): This method computes the relative indices of a parameter index. |
| double | getLogNormalizationConstant(): Returns the logarithm of the sum of the scores over all sequences of the event space. |
| protected abstract double | getLogNormalizationConstantForComponent(int i): Computes the logarithm of the normalization constant for the component i. |
| double | getLogPriorTerm(): This method computes a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ), where prior is the prior for the parameters of this model. |
| double | getLogScoreFor(Sequence seq, int start) |
| int | getNumberOfComponents(): Returns the number of different components of this AbstractMixtureDiffSM. |
| int | getNumberOfParameters(): Returns the number of parameters in this DifferentiableSequenceScore. |
| int | getNumberOfRecommendedStarts(): This method returns the number of recommended optimization starts. |
| double[] | getProbsForComponent(Sequence seq): Returns the probabilities for each component given a Sequence. |
| int[][] | getSamplingGroups(int parameterOffset): Returns groups of indexes of parameters that shall be drawn together in a sampling procedure. |
| int | getSizeOfEventSpaceForRandomVariablesOfParameter(int index): Returns the size of the event space of the random variables that are affected by parameter no. index. |
| protected String | getXMLTag(): This method returns the XML tag of the instance that is used to build an XML representation. |
| protected void | init(boolean freeParams): This method creates the underlying structure for the parameters. |
| void | initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights): This method creates the underlying structure of the DifferentiableSequenceScore. |
| void | initializeFunctionRandomly(boolean freeParams): This method initializes the DifferentiableSequenceScore randomly. |
| protected void | initializeHiddenPotentialRandomly(): This method initializes the hidden potential (and the corresponding parameters) randomly. |
| void | initializeHiddenUniformly(): This method initializes the hidden parameters of the instance uniformly. |
| protected abstract void | initializeUsingPlugIn(int index, boolean freeParams, DataSet[] data, double[][] weights): This method initializes the functions using the data in some way. |
| protected void | initWithLength(boolean freeParams, int len): This method is used to create the underlying structure, e.g. paramRef. |
| boolean | isInitialized(): This method can be used to determine whether the instance is initialized. |
| boolean | isNormalized(): This method indicates whether the implemented score is already normalized to 1 or not. |
| protected void | precomputeNorm(): Pre-computes the normalization constant. |
| protected void | setHiddenParameters(double[] params, int start): This method sets the hidden parameters of the model. |
| void | setParameters(double[] params, int start): This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1. |
| void | setParametersForFunction(int index, double[] params, int start): This method sets the parameters for a specific function. |
| StringBuffer | toXML(): This method returns an XML representation as StringBuffer of an instance of the implementing class. |
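
Several of the methods listed above work from the same array: fillComponentScores writes one logarithmic score per component into componentScore, and getProbsForComponent and getIndexOfMaximalComponentFor are described in terms of those components. The following is a minimal sketch of that relationship, not the Jstacs implementation. It assumes that each entry of componentScore already includes the weight of its component and that the total score is the sum over components. The class ComponentScoreSketch and its method names are hypothetical.

```java
// Hypothetical helper for illustration only; not part of Jstacs.
final class ComponentScoreSketch {

    // Computes log(sum_i exp(x[i])) without overflow by factoring out the maximum.
    static double logSumExp(double[] x) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : x) {
            max = Math.max(max, v);
        }
        if (Double.isInfinite(max)) {
            return max; // no finite entry to factor out
        }
        double sum = 0.0;
        for (double v : x) {
            sum += Math.exp(v - max);
        }
        return max + Math.log(sum);
    }

    // Entry i is the share of component i in the total score, i.e. p(i | seq)
    // under the assumptions stated above. Expects at least one finite score.
    static double[] posteriors(double[] componentScore) {
        double logTotal = logSumExp(componentScore);
        double[] p = new double[componentScore.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = Math.exp(componentScore[i] - logTotal);
        }
        return p;
    }

    // Index of the component with the largest logarithmic score.
    static int indexOfMaximalComponent(double[] componentScore) {
        int best = 0;
        for (int i = 1; i < componentScore.length; i++) {
            if (componentScore[i] > componentScore[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

Working in log space keeps the computation stable when the component scores are strongly negative, which is typical for long sequences.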
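
Inherited methods:

- from class AbstractDifferentiableStatisticalModel: emitDataSet, getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, isNormalized
- from class AbstractDifferentiableSequenceScore: getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfStarts, getNumericalCharacteristics, toString
- from class Object: equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
- from interface DifferentiableStatisticalModel: getESS, getLogPartialNormalizationConstant
- from interface DifferentiableSequenceScore: getInitialClassParam, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation
- from interface StatisticalModel: emitDataSet, getLogProbFor, getLogProbFor, getLogProbFor, getMaximalMarkovOrder
- from interface SequenceScore: getAlphabetContainer, getCharacteristics, getInstanceName, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics, toString

protected int[] paramRef

This array contains the references/indices for the parameters.

protected boolean optimizeHidden

This boolean indicates whether to optimize the hidden variables of this instance. (It is not used recursively.)

protected boolean freeParams

This boolean indicates whether free parameterization or all parameters are used.

protected DifferentiableStatisticalModel[] function

This array contains the internal DifferentiableStatisticalModels that are used to determine the score.

protected double[] hiddenParameter

This array contains the hidden parameters of the instance.

protected double[] logHiddenPotential

This array contains the logarithm of the hidden potentials of the instance.

protected double[] hiddenPotential

This array contains the hidden potentials of the instance.

protected double[] componentScore

This array is used while computing the score.

protected double[] partNorm

This array contains the partial normalization constants.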

protected double norm

This double contains the normalization constant of the instance.

protected double logHiddenNorm

This double contains the logarithm of the normalization constant of hidden parameters of the instance.

protected double logGammaSum

This double contains the sum of the logarithm of the gamma functions used in the prior.

- See also: computeLogGammaSum()

protected DoubleList[] dList

This array contains some DoubleLists that are used while computing the partial derivatives.

protected AbstractMixtureDiffSM(int length, int starts, int dimension, boolean optimizeHidden, boolean plugIn, DifferentiableStatisticalModel... function) throws CloneNotSupportedException

This constructor creates a new AbstractMixtureDiffSM.

- Parameters: length - the sequence length that should be modeled; starts - the number of starts that should be done in an optimization; dimension - the number of different mixture components; optimizeHidden - indicates whether the parameters for the hidden variables should be optimized or not; plugIn - indicates whether the initial parameters for an optimization should be related to the data or randomly drawn; function - the DifferentiableStatisticalModels
- Throws: CloneNotSupportedException - if an element of function could not be cloned

protected AbstractMixtureDiffSM(StringBuffer xml) throws NonParsableException

This is the constructor for the interface Storable. Creates a new AbstractMixtureDiffSM out of a StringBuffer as returned by toXML().

- Parameters: xml - the XML representation as StringBuffer
- Throws: NonParsableException - if the representation could not be parsed

protected void computeLogGammaSum()

This method is used to pre-compute the sum of the logarithm of the gamma functions that is used in the prior.
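
The text does not say which gamma functions enter logGammaSum. As an illustration only, the sketch below assumes a Dirichlet-type prior on the hidden parameters whose hyperparameters are the values returned by getHyperparameterForHiddenParameter, so that the constant part of the log prior is log Γ(Σ α_i) − Σ log Γ(α_i). The class LogGammaSumSketch is hypothetical, and logGamma is a standard Lanczos approximation written out here because the Java standard library has no log-gamma function.

```java
// Hypothetical sketch; the actual prior used by the library may differ.
final class LogGammaSumSketch {

    // Lanczos coefficients (g = 5, six terms).
    private static final double[] COF = {
        76.18009172947146, -86.50532032941677, 24.01409824083091,
        -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5
    };

    // Natural logarithm of the gamma function for x > 0.
    static double logGamma(double x) {
        double y = x;
        double tmp = x + 5.5;
        tmp -= (x + 0.5) * Math.log(tmp);
        double ser = 1.000000000190015;
        for (double c : COF) {
            y += 1.0;
            ser += c / y;
        }
        return -tmp + Math.log(2.5066282746310005 * ser / x);
    }

    // log Gamma(sum of alpha) minus the sum of log Gamma(alpha[i]):
    // the log normalization constant of a Dirichlet density.
    static double logGammaSum(double[] alpha) {
        double total = 0.0;
        double result = 0.0;
        for (double a : alpha) {
            total += a;
            result -= logGamma(a);
        }
        return result + logGamma(total);
    }
}
```

Because this term depends only on the hyperparameters and not on the model parameters, it makes sense to compute it once and cache it, which matches the "pre-compute" wording above.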

public AbstractMixtureDiffSM clone() throws CloneNotSupportedException

Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.

- Specified by: clone in interface DifferentiableSequenceScore; clone in interface SequenceScore
- Overrides: clone in class AbstractDifferentiableStatisticalModel
- Returns: the clone of the current DifferentiableSequenceScore
- Throws: CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore

protected void cloneFunctions(DifferentiableStatisticalModel[] originalFunctions) throws CloneNotSupportedException

This method clones the given array of functions and enables the user to do some post-processing. It is used by clone().

- Parameters: originalFunctions - the array of functions to be cloned
- Throws: CloneNotSupportedException - if an element of originalFunctions could not be cloned

public abstract double getHyperparameterForHiddenParameter(int index)

This method returns the hyperparameter for the hidden parameter with index index.

- Parameters: index - the index of the hidden parameter

public double getLogPriorTerm()

This method computes a value that is proportional to

DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )

where prior is the prior for the parameters of this model.

- Specified by: getLogPriorTerm in interface DifferentiableStatisticalModel; getLogPriorTerm in interface StatisticalModel
- Returns: a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
- See also: DifferentiableStatisticalModel.getESS(), DifferentiableStatisticalModel.getLogNormalizationConstant()

public void addGradientOfLogPriorTerm(double[] grad, int start) throws Exception

This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.

- Specified by: addGradientOfLogPriorTerm in interface DifferentiableStatisticalModel
- Parameters: grad - the array of gradients; start - the start index in the grad array, where the partial derivatives for the parameters of this model shall be entered
- Throws: Exception - if something went wrong while computing the gradients
- See also: DifferentiableStatisticalModel.getLogPriorTerm()

public int getIndexOfMaximalComponentFor(Sequence seq, int start)

Returns the index of the component that has the greatest impact on the complete score for a Sequence.

- Parameters: seq - the sequence; start - the start position

public double[] getCurrentParameterValues() throws Exception

Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. If these parameters are to be used to start an optimization, it is highly recommended to invoke DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][]) beforehand. After an optimization, this method can be used to get the current parameter values.

- Specified by: getCurrentParameterValues in interface DifferentiableSequenceScore
- Throws: Exception - if no parameters exist (yet)

public double getLogScoreFor(Sequence seq, int start)

- Specified by: getLogScoreFor in interface SequenceScore
- Parameters: seq - the Sequence; start - the start position in the Sequence

public final double getLogNormalizationConstant()

Returns the logarithm of the sum of the scores over all sequences of the event space.

- Specified by: getLogNormalizationConstant in interface DifferentiableStatisticalModel

public final int getNumberOfComponents()

Returns the number of different components of this AbstractMixtureDiffSM.

public final int getNumberOfParameters()

Returns the number of parameters in this DifferentiableSequenceScore. If the number of parameters is not known yet, the method returns DifferentiableSequenceScore.UNKNOWN.

- Specified by: getNumberOfParameters in interface DifferentiableSequenceScore
- Returns: the number of parameters in this DifferentiableSequenceScore
- See also: DifferentiableSequenceScore.UNKNOWN

public final int getNumberOfRecommendedStarts()

This method returns the number of recommended optimization starts.

- Specified by: getNumberOfRecommendedStarts in interface DifferentiableSequenceScore
- Overrides: getNumberOfRecommendedStarts in class AbstractDifferentiableSequenceScore

public double[] getProbsForComponent(Sequence seq)

Returns the probabilities for each component given a Sequence.

- Parameters: seq - the Sequence
- Returns: an array containing the probability of component i (= p(i|class,seq)) in entry i

public DifferentiableStatisticalModel[] getDifferentiableStatisticalModels() throws CloneNotSupportedException

Returns a deep copy of all internally used DifferentiableStatisticalModels.

- Returns: a deep copy of all internally used DifferentiableStatisticalModels
- Throws: CloneNotSupportedException - if one element of the internally used DifferentiableStatisticalModels could not be cloned

public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)

Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...

- Specified by: getSizeOfEventSpaceForRandomVariablesOfParameter in interface DifferentiableStatisticalModel
- Parameters: index - the index of the parameter

public void initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception

This method creates the underlying structure of the DifferentiableSequenceScore.

- Specified by: initializeFunction in interface DifferentiableSequenceScore
- Parameters: index - the index of the class the DifferentiableSequenceScore models; freeParams - indicates whether the (reduced) parameterization is used; data - the data sets; weights - the weights of the sequences in the data sets
- Throws: Exception - if something went wrong

protected abstract void initializeUsingPlugIn(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception

This method initializes the functions using the data in some way.

- Parameters: index - the class index; freeParams - if true, the (reduced) parameterization is used; data - the data; weights - the weights for the data
- Throws: Exception - if the initialization could not be done
- See also: DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][])

public void initializeFunctionRandomly(boolean freeParams) throws Exception

This method initializes the DifferentiableSequenceScore randomly. It has to create the underlying structure of the DifferentiableSequenceScore.

- Specified by: initializeFunctionRandomly in interface DifferentiableSequenceScore
- Parameters: freeParams - indicates whether the (reduced) parameterization is used
- Throws: Exception - if something went wrong

protected void initializeHiddenPotentialRandomly()

This method initializes the hidden potential (and the corresponding parameters) randomly.
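
The description of getSizeOfEventSpaceForRandomVariablesOfParameter above defines the result as a product of alphabet sizes. A minimal sketch of that arithmetic follows. The class EventSpaceSketch, its method, and the representation of a parameter by the list of positions it affects are assumptions made for the example and are not part of the documented API.

```java
// Hypothetical helper for illustration only.
final class EventSpaceSketch {

    // Multiplies the alphabet sizes at every position affected by one parameter.
    // alphabetSizes[p] is the alphabet size at sequence position p.
    static int sizeOfEventSpace(int[] alphabetSizes, int[] affectedPositions) {
        int size = 1;
        for (int pos : affectedPositions) {
            size *= alphabetSizes[pos];
        }
        return size;
    }
}
```

For a DNA sequence of length 3, a PWM parameter that affects only position 1 gives 4, and a WAM parameter that affects positions 0 and 1 gives 4 * 4 = 16, matching the figures quoted above.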
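
public boolean isInitialized()

This method can be used to determine whether the instance is initialized. If the instance is initialized, SequenceScore.getLogScoreFor(Sequence) can be invoked.

- Specified by: isInitialized in interface SequenceScore
- Returns: true if the instance is initialized, false otherwise

public void setParameters(double[] params, int start)

This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1.

- Specified by: setParameters in interface DifferentiableSequenceScore
- Parameters: params - the new parameters; start - the start index in params

protected boolean determineIsNormalized()

This method is used to determine the value that is returned by the method isNormalized().

- Returns: the value that is returned by the method isNormalized()
- See also: isNormalized()

public void initializeHiddenUniformly()

This method initializes the hidden parameters of the instance uniformly.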
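
protected void setHiddenParameters(double[] params, int start)

This method sets the hidden parameters of the model.

- Parameters: params - the parameter vector; start - the start index in params

public void setParametersForFunction(int index, double[] params, int start)

This method sets the parameters for a specific function.

- Parameters: index - the function index; params - the parameter vector; start - the start index in params

public final StringBuffer toXML()

This method returns an XML representation as StringBuffer of an instance of the implementing class.

- Specified by: toXML in interface Storable

protected final void fromXML(StringBuffer b) throws NonParsableException

This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer.

- Specified by: fromXML in class AbstractDifferentiableSequenceScore
- Parameters: b - the XML representation as StringBuffer
- Throws: NonParsableException - if the StringBuffer could not be parsed
- See also: AbstractDifferentiableSequenceScore.AbstractDifferentiableSequenceScore(StringBuffer)

protected StringBuffer getFurtherInformation()

This method is used to append further information of the instance to the XML representation.

- Returns: the further information as a StringBuffer
- See also: extractFurtherInformation(StringBuffer)

protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException

This method is the opposite of getFurtherInformation(). It extracts further information of the instance from an XML representation.

- Parameters: xml - the StringBuffer containing the information to be extracted as XML code
- Throws: NonParsableException - if the StringBuffer could not be parsed
- See also: getFurtherInformation()

protected int[] getIndices(int index)

This method computes the relative indices of a parameter index.

- Parameters: index - the parameter index
- See also: paramRef

protected String getXMLTag()

This method returns the XML tag of the instance that is used to build an XML representation.

- See also: Class.getSimpleName()

protected void init(boolean freeParams)

This method creates the underlying structure for the parameters.

- Parameters: freeParams - indicates whether to use only free parameters or all parameters

protected final void initWithLength(boolean freeParams, int len)

This method is used to create the underlying structure, e.g. paramRef.

- Parameters: freeParams - indicates whether to use free parameters or all; len - the length of the array paramRef

protected void computeHiddenParameter(double[] statistic, boolean add)

This method has to be invoked during an initialization.

- Parameters: statistic - a statistic for the initialization of the hidden parameters; add - a switch for adding hyperparameters to the statistic
- See also: DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][])

protected void precomputeNorm()

Pre-computes the normalization constant.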
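
The text describes paramRef only as "references/indices for the parameters" and getIndices as computing "relative indices" of a parameter index; the exact layout is not given. One plausible reading, sketched below purely as an assumption, is that paramRef stores cumulative offsets of the parameter blocks (one block per internal function, possibly followed by the hidden parameters), and that a global parameter index is resolved to a pair of block index and index within the block. ParamRefSketch and both of its methods are hypothetical.

```java
// Hypothetical sketch of an offset table; the library's real layout may differ.
final class ParamRefSketch {

    // Builds cumulative offsets: ref[i] is the first global index of block i,
    // and ref[blockSizes.length] is the total number of parameters.
    static int[] buildParamRef(int[] blockSizes) {
        int[] ref = new int[blockSizes.length + 1];
        for (int i = 0; i < blockSizes.length; i++) {
            ref[i + 1] = ref[i] + blockSizes[i];
        }
        return ref;
    }

    // Resolves a global parameter index to { block index, index inside that block }.
    static int[] getIndices(int[] paramRef, int index) {
        int total = paramRef[paramRef.length - 1];
        if (index < 0 || index >= total) {
            throw new IndexOutOfBoundsException("parameter index " + index);
        }
        int block = 0;
        while (index >= paramRef[block + 1]) {
            block++; // also skips empty blocks
        }
        return new int[] { block, index - paramRef[block] };
    }
}
```

With block sizes 3, 2 and 2, the offsets are 0, 3, 5 and 7, so global index 4 resolves to block 1, local index 1. A layout like this would also explain why setParametersForFunction and setHiddenParameters take a start index into a shared parameter vector.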
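
protected abstract double getLogNormalizationConstantForComponent(int i)

Computes the logarithm of the normalization constant for the component i.

- Parameters: i - the index of the component

public double[] getAPrioriMixtureProbabilities()

Returns the mixture probabilities (i.e., the a-priori probabilities of the different components).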
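
The fields logHiddenPotential, partNorm and norm, together with getLogNormalizationConstantForComponent, suggest how the overall normalization constant and the a-priori mixture probabilities could be related. The sketch below is one such reading and not the library's code. It assumes that the weight of component i is its hidden potential times the normalization constant of that component, and that the a-priori probability is this weight divided by the sum of all weights. MixtureNormSketch and its methods are hypothetical names.

```java
// Hypothetical sketch under the assumptions stated in the lead-in.
final class MixtureNormSketch {

    // log of sum_i exp(logHiddenPotential[i] + logPartNorm[i]), computed stably.
    // Expects at least one finite summand.
    static double logNorm(double[] logHiddenPotential, double[] logPartNorm) {
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < logHiddenPotential.length; i++) {
            max = Math.max(max, logHiddenPotential[i] + logPartNorm[i]);
        }
        double sum = 0.0;
        for (int i = 0; i < logHiddenPotential.length; i++) {
            sum += Math.exp(logHiddenPotential[i] + logPartNorm[i] - max);
        }
        return max + Math.log(sum);
    }

    // A-priori probability of component i: its weight divided by the total weight.
    static double[] aPrioriProbabilities(double[] logHiddenPotential, double[] logPartNorm) {
        double logZ = logNorm(logHiddenPotential, logPartNorm);
        double[] p = new double[logHiddenPotential.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = Math.exp(logHiddenPotential[i] + logPartNorm[i] - logZ);
        }
        return p;
    }
}
```

When all components are already normalized, every entry of logPartNorm is 0 and the probabilities reduce to the normalized hidden potentials.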

protected abstract void fillComponentScores(Sequence seq, int start)

Fills the internal array componentScore with the logarithmic scores of the components given a Sequence.

- Parameters: seq - the sequence; start - the start position in seq

public final boolean isNormalized()

This method indicates whether the implemented score is already normalized to 1 or not.

- Specified by: isNormalized in interface DifferentiableStatisticalModel
- Overrides: isNormalized in class AbstractDifferentiableStatisticalModel
- Returns: true if the implemented score is already normalized to 1, false otherwise

public DifferentiableStatisticalModel getFunction(int index) throws CloneNotSupportedException

This method returns a specific internal function.

- Parameters: index - the index of the specific function
- Throws: CloneNotSupportedException - if the function could not be cloned

public DifferentiableStatisticalModel[] getFunctions() throws CloneNotSupportedException

This method returns an array of clones of the internally used functions.

- Throws: CloneNotSupportedException - if at least one function could not be cloned

public int[][] getSamplingGroups(int parameterOffset)

Returns groups of indexes of parameters that shall be drawn together in a sampling procedure.

- Specified by: getSamplingGroups in interface SamplingDifferentiableStatisticalModel
- Parameters: parameterOffset - a global offset on the parameter indexes
- Returns: the groups of parameter indexes, shifted by parameterOffset