public abstract class AbstractConditionalDiscreteEmission extends Object implements SamplingEmission, DifferentiableEmission
| Modifier and Type | Field and Description |
|---|---|
| protected AlphabetContainer | con: The alphabet of the emissions. |
| protected int[] | counter: The counter for the sampling steps of each sampling. |
| protected double[] | ess: The equivalent sample sizes for each condition. |
| protected double[][] | grad: The array for storing the gradients for each parameter. |
| protected double[][] | hyperParams: The hyper-parameters for the prior on the parameters. |
| protected double[] | logNorm: The log-normalization constants for each condition. |
| protected int | offset: The offset of the parameter indexes. |
| protected double[][] | params: The parameters of the emission. |
| protected File[] | paramsFile: The files for saving the parameters during the sampling. |
| protected double[][] | probs: The parameters transformed to probabilities. |
| protected BufferedReader | reader: The reader for the paramsFile after a sampling. |
| protected int | samplingIndex: The index of the current sampling. |
| protected double[][] | statistic: The array for storing the statistics for each parameter. |
| protected BufferedWriter | writer: The writer for the paramsFile in a sampling. |

| Modifier | Constructor and Description |
|---|---|
| protected | AbstractConditionalDiscreteEmission(AlphabetContainer con, double[][] hyperParams): This is a simple constructor for an AbstractConditionalDiscreteEmission defining the individual hyper-parameters. |
| protected | AbstractConditionalDiscreteEmission(AlphabetContainer con, double[][] hyperParams, double[][] initHyperParams): This constructor creates an AbstractConditionalDiscreteEmission defining the individual hyper-parameters for the prior used during training and initialization. |
| protected | AbstractConditionalDiscreteEmission(AlphabetContainer con, int numberOfConditions, double ess): This is a simple constructor for an AbstractConditionalDiscreteEmission based on the equivalent sample size. |
| protected | AbstractConditionalDiscreteEmission(StringBuffer xml): Creates an AbstractConditionalDiscreteEmission from its XML representation. |

| Modifier and Type | Method and Description |
|---|---|
| void | acceptParameters(): This method accepts the drawn parameters. |
| void | addGradientOfLogPriorTerm(double[] gradient, int offset): This method computes the gradient of Emission.getLogPriorTerm() for each parameter of this model. |
| void | addToStatistic(boolean forward, int startPos, int endPos, double weight, Sequence seq): This method adds the weight to the internal sufficient statistic. |
| protected void | appendFurtherInformation(StringBuffer xml): This method appends further information to the XML representation. |
| AbstractConditionalDiscreteEmission | clone() |
| protected void | drawParameters(double[][] hyper, boolean uniformBackup): Draws the parameters of this AbstractConditionalDiscreteEmission from a Dirichlet distribution with given hyper-parameters. |
| void | drawParametersFromStatistic(): This method draws the parameters using a sufficient statistic representing an a-posteriori density. |
| void | estimateFromStatistic(): This method estimates the parameters from the internal sufficient statistic. |
| void | extendSampling(int start, boolean append): This method allows the user to extend a sampling. |
| protected void | extractFurtherInformation(StringBuffer xml): This method extracts further information from the XML representation. |
| void | fillCurrentParameter(double[] params): Fills the current parameters in the global params array using the internal offset. |
| void | fillSamplingGroups(int parameterOffset, LinkedList<int[]> list): Adds the groups of indexes of those parameters of this emission that should be sampled together in one step of a grouped sampling procedure, each as an int[], into list. |
| protected void | finalize() |
| protected void | fromXML(StringBuffer xml): This method is internally used by the constructor AbstractConditionalDiscreteEmission(StringBuffer). |
| AlphabetContainer | getAlphabetContainer(): This method returns the AlphabetContainer of this emission. |
| protected abstract int | getConditionIndex(boolean forward, int seqPos, Sequence seq): This method returns an index encoding the condition. |
| protected static double[][] | getHyperParams(double ess, int numConditions, int numEmissions): Returns the hyper-parameters for all parameters and a given ess. |
| protected int | getIndex(int seqPos, Sequence seq): Returns the index for position seqPos in sequence seq. |
| double | getLogGammaScoreFromStatistic(): This method calculates a score for the current statistics that is independent of the current parameters. In general, the gamma-score is a product of gamma functions parameterized with the current statistics. |
| double | getLogPosteriorFromStatistic(): This method calculates the a-posteriori probability for the current statistics. |
| double | getLogPriorTerm(): Returns a value that is proportional to the log of the prior. |
| double | getLogProbAndPartialDerivationFor(boolean forward, int startPos, int endPos, IntList indices, DoubleList partDer, Sequence seq) |
| double | getLogProbFor(boolean forward, int startPos, int endPos, Sequence seq): This method computes the logarithm of the likelihood. |
| String | getNodeLabel(double weight, String name, NumberFormat nf): Returns the graphviz label of the node containing this emission. |
| String | getNodeShape(boolean forward): Returns the graphviz string for the shape of the node. |
| int | getNumberOfParameters(): Returns the number of parameters of this emission. |
| int | getSizeOfEventSpace(): Returns the size of the event space, i.e., the number of possible outcomes, for the random variables of this emission. |
| void | initForSampling(int starts): This method initializes the instance for the sampling. |
| void | initializeFunctionRandomly(): This method initializes the emission randomly. |
| boolean | isInSamplingMode(): This method returns true if the object is currently used in a sampling, otherwise false. |
| void | joinStatistics(Emission... emissions): This method joins the statistics of different instances and sets this joined statistic as the statistic of each instance. |
| boolean | parseNextParameterSet(): This method allows the user to parse the next set of parameters (from a file). |
| boolean | parseParameterSet(int start, int n): This method allows the user to parse the set of parameters with index n of a certain sampling (from a file). |
| protected void | precompute(): This method precomputes some normalization constants and probabilities. |
| void | resetStatistic(): This method resets the internal sufficient statistic. |
| void | samplingStopped(): This method is the opposite of the method SamplingComponent.extendSampling(int, boolean). |
| void | setLinear(boolean linear): If set to true, the probabilities are mapped to colors directly; otherwise a logistic mapping is used to emphasize deviations from the uniform distribution. |
| void | setParameter(double[] params, int offset): This method sets the internal parameters using the given global parameter array, the global offset of the HMM and the internal offset. |
| int | setParameterOffset(int offset): This method sets the internal parameter offset and returns the new parameter offset for further use. |
| void | setParameters(Emission t): Sets the values of the parameters of this instance to the values of the parameters of the given instance. |
| void | setShape(String shape): Sets the graphviz shape of the node that uses this emission to some non-standard value (standard is "house"). |
| StringBuffer | toXML(): This method returns an XML representation as StringBuffer of an instance of the implementing class. |

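
The statistic-related methods listed above (resetStatistic, addToStatistic, estimateFromStatistic) follow an accumulate-then-normalize pattern. The following is a minimal sketch of that pattern, not the library code. The class StatisticSketch, its simplified signatures (a plain int[] of symbol indexes instead of a Sequence, an explicit condition index), the inclusive end position, and the choice to estimate by normalizing statistic plus hyper-parameters are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for the statistic handling; not the library code. */
final class StatisticSketch {
    private final double[][] hyperParams; // pseudo-counts of the prior, one row per condition
    private final double[][] statistic;   // weighted counts, same shape as hyperParams
    final double[][] probs;               // estimated emission probabilities

    StatisticSketch(double[][] hyperParams) {
        this.hyperParams = hyperParams;
        statistic = new double[hyperParams.length][];
        probs = new double[hyperParams.length][];
        for (int i = 0; i < hyperParams.length; i++) {
            statistic[i] = new double[hyperParams[i].length];
            probs[i] = new double[hyperParams[i].length];
        }
    }

    /** Clears all weighted counts. */
    void resetStatistic() {
        for (double[] row : statistic) {
            Arrays.fill(row, 0.0);
        }
    }

    /** Adds weight for every symbol in symbols[startPos..endPos]; the inclusive end is an assumption. */
    void addToStatistic(int condition, int[] symbols, int startPos, int endPos, double weight) {
        for (int pos = startPos; pos <= endPos; pos++) {
            statistic[condition][symbols[pos]] += weight;
        }
    }

    /** Normalizes (statistic + hyperParams) per condition; uses a uniform row if the sum is zero. */
    void estimateFromStatistic() {
        for (int i = 0; i < probs.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < probs[i].length; j++) {
                sum += statistic[i][j] + hyperParams[i][j];
            }
            for (int j = 0; j < probs[i].length; j++) {
                probs[i][j] = sum > 0.0
                        ? (statistic[i][j] + hyperParams[i][j]) / sum
                        : 1.0 / probs[i].length;
            }
        }
    }

    public static void main(String[] args) {
        StatisticSketch e = new StatisticSketch(new double[][] {{1.0, 1.0}});
        e.resetStatistic();
        e.addToStatistic(0, new int[] {0, 0, 1}, 0, 2, 2.0);
        e.estimateFromStatistic();
        System.out.println(Arrays.toString(e.probs[0])); // prints [0.625, 0.375]
    }
}
```

In the documented class the condition is not passed in but derived per position by the abstract method getConditionIndex(boolean, int, Sequence).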
protected File[] paramsFile
protected int[] counter
protected int samplingIndex
protected BufferedWriter writer
The writer for the paramsFile in a sampling.
protected BufferedReader reader
The reader for the paramsFile after a sampling.
protected int offset
protected AlphabetContainer con
protected double[][] params
protected double[][] probs
protected double[][] hyperParams
protected double[][] statistic
protected double[][] grad
protected double[] logNorm
protected double[] ess
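
The fields hyperParams and ess are related through the equivalent sample size. The sketch below shows one plausible reading of "equally distributed over all parameters": every hyper-parameter receives ess / (numConditions * numEmissions), and the per-condition ess is the row sum. The class HyperParamSketch and both method names are hypothetical; the actual getHyperParams implementation is not shown in this page and may differ.

```java
import java.util.Arrays;

/** Hypothetical helper illustrating an even split of the ess; not the library code. */
final class HyperParamSketch {

    /** Spreads ess evenly over numConditions * numEmissions hyper-parameters (assumed behavior). */
    static double[][] uniformHyperParams(double ess, int numConditions, int numEmissions) {
        if (ess < 0 || numConditions <= 0 || numEmissions <= 0) {
            throw new IllegalArgumentException("ess must be >= 0 and both sizes must be > 0");
        }
        double share = ess / ((double) numConditions * numEmissions);
        double[][] hyper = new double[numConditions][numEmissions];
        for (double[] row : hyper) {
            Arrays.fill(row, share);
        }
        return hyper;
    }

    /** Sums each row of hyper-parameters to obtain one ess value per condition. */
    static double[] essPerCondition(double[][] hyper) {
        double[] ess = new double[hyper.length];
        for (int i = 0; i < hyper.length; i++) {
            for (double h : hyper[i]) {
                ess[i] += h;
            }
        }
        return ess;
    }

    public static void main(String[] args) {
        double[][] hyper = uniformHyperParams(8.0, 2, 4);
        System.out.println(Arrays.deepToString(hyper));
        System.out.println(Arrays.toString(essPerCondition(hyper)));
    }
}
```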
protected AbstractConditionalDiscreteEmission(AlphabetContainer con, int numberOfConditions, double ess)
This is a simple constructor for an AbstractConditionalDiscreteEmission based on the equivalent sample size.
Parameters:
con - the AlphabetContainer of this emission
numberOfConditions - the number of conditions
ess - the equivalent sample size (ess) of this emission that is equally distributed over all parameters
See Also:
AbstractConditionalDiscreteEmission(AlphabetContainer, double[][])

protected AbstractConditionalDiscreteEmission(AlphabetContainer con, double[][] hyperParams)
This is a simple constructor for an AbstractConditionalDiscreteEmission defining the individual hyper-parameters.
Parameters:
con - the AlphabetContainer of this emission
hyperParams - the individual hyper-parameters for each parameter
See Also:
AbstractConditionalDiscreteEmission(AlphabetContainer, double[][])

protected AbstractConditionalDiscreteEmission(AlphabetContainer con, double[][] hyperParams, double[][] initHyperParams)
This constructor creates an AbstractConditionalDiscreteEmission defining the individual hyper-parameters for the prior used during training and initialization.
Parameters:
con - the AlphabetContainer of this emission
hyperParams - the individual hyper-parameters for each parameter (used during training)
initHyperParams - the individual hyper-parameters for each parameter used in initializeFunctionRandomly()

protected AbstractConditionalDiscreteEmission(StringBuffer xml) throws NonParsableException
Creates an AbstractConditionalDiscreteEmission from its XML representation.
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the XML representation could not be parsed

protected static double[][] getHyperParams(double ess, int numConditions, int numEmissions)
Returns the hyper-parameters for all parameters and a given ess.
Parameters:
ess - the equivalent sample size
numConditions - the number of conditions
numEmissions - the number of emissions, assumed to be equal for all conditions

public AbstractConditionalDiscreteEmission clone() throws CloneNotSupportedException
Overrides:
clone in class Object
Throws:
CloneNotSupportedException

public void setShape(String shape)
Sets the graphviz shape of the node that uses this emission to some non-standard value (standard is "house").
Parameters:
shape - the shape of the node

public void addGradientOfLogPriorTerm(double[] gradient, int offset)
Description copied from interface: DifferentiableEmission
This method computes the gradient of Emission.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index (offset + internal offset).
Specified by:
addGradientOfLogPriorTerm in interface DifferentiableEmission
Parameters:
gradient - the array of gradients
offset - the start index of the HMM in the grad array, where the partial derivations for the parameters of the HMM shall be entered
See Also:
Emission.getLogPriorTerm(), DifferentiableEmission.setParameterOffset(int)

public double getLogPriorTerm()
Description copied from interface: Emission
Returns a value that is proportional to the log of the prior.
Specified by:
getLogPriorTerm in interface Emission
See Also:
StatisticalModel.getLogPriorTerm()

public double getLogProbAndPartialDerivationFor(boolean forward, int startPos, int endPos, IntList indices, DoubleList partDer, Sequence seq) throws OperationNotSupportedException
Description copied from interface: DifferentiableEmission
Returns the logarithmic score for the Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivations.
Specified by:
getLogProbAndPartialDerivationFor in interface DifferentiableEmission
Parameters:
forward - a switch whether to use the forward or the reverse complementary strand of the sequence
startPos - the start position in the Sequence
endPos - the end position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i where the partial derivation with respect to parameter i is not zero
partDer - a DoubleList of partial derivations; after method invocation the list should contain the corresponding partial derivations that are not zero
seq - the Sequence
Returns:
the logarithmic score for the Sequence
Throws:
OperationNotSupportedException - if forward==false and the reverse complement of the sequence can not be computed

public double getLogProbFor(boolean forward, int startPos, int endPos, Sequence seq) throws OperationNotSupportedException
Description copied from interface: Emission
This method computes the logarithm of the likelihood.
Specified by:
getLogProbFor in interface Emission
Parameters:
forward - whether to use the forward or the reverse strand
startPos - the start position
endPos - the end position
seq - the sequence
Throws:
OperationNotSupportedException - if forward=false and the reverse complement of the sequence seq is not defined

public void initializeFunctionRandomly()
Description copied from interface: Emission
This method initializes the emission randomly.
Specified by:
initializeFunctionRandomly in interface Emission

protected void precompute()
This method precomputes some normalization constants and probabilities.
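
The field descriptions (params, probs as "the parameters transformed to probabilities", logNorm as "log-normalization constants for each condition") suggest, but do not state, a softmax-style parameterization. Under that assumption, the precomputation could be sketched as follows. The class PrecomputeSketch and its method names are hypothetical and only illustrate the idea.

```java
import java.util.Arrays;

/** Hypothetical sketch of a softmax-style precomputation; the real precompute() may differ. */
final class PrecomputeSketch {

    /** Computes log(sum_j exp(v[j])) in a numerically stable way. */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    /** One log-normalization constant per condition (row). */
    static double[] logNorm(double[][] params) {
        double[] logNorm = new double[params.length];
        for (int i = 0; i < params.length; i++) {
            logNorm[i] = logSumExp(params[i]);
        }
        return logNorm;
    }

    /** probs[i][j] = exp(params[i][j] - logNorm[i]), so every row sums to one. */
    static double[][] probs(double[][] params, double[] logNorm) {
        double[][] probs = new double[params.length][];
        for (int i = 0; i < params.length; i++) {
            probs[i] = new double[params[i].length];
            for (int j = 0; j < params[i].length; j++) {
                probs[i][j] = Math.exp(params[i][j] - logNorm[i]);
            }
        }
        return probs;
    }

    public static void main(String[] args) {
        double[][] params = {{0.0, 0.0}, {0.0, Math.log(3.0)}};
        double[] norm = logNorm(params);
        System.out.println(Arrays.deepToString(probs(params, norm)));
    }
}
```

Caching logNorm once per parameter change means that a log probability can later be obtained as params[i][j] - logNorm[i] without recomputing the sum.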
public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.

protected void appendFurtherInformation(StringBuffer xml)
This method appends further information to the XML representation.
Parameters:
xml - the XML representation

protected void fromXML(StringBuffer xml) throws NonParsableException
This method is internally used by the constructor AbstractConditionalDiscreteEmission(StringBuffer).
Parameters:
xml - the StringBuffer containing the XML representation of an instance
Throws:
NonParsableException - if the StringBuffer is not parsable
See Also:
AbstractConditionalDiscreteEmission(StringBuffer)

protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException
This method extracts further information from the XML representation.
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the information could not be reconstructed out of the StringBuffer xml

public void joinStatistics(Emission... emissions)
Description copied from interface: Emission
This method joins the statistics of different instances and sets this joined statistic as the statistic of each instance.
Specified by:
joinStatistics in interface Emission
Parameters:
emissions - the emissions to be joined

public void addToStatistic(boolean forward, int startPos, int endPos, double weight, Sequence seq) throws OperationNotSupportedException
Description copied from interface: Emission
This method adds the weight to the internal sufficient statistic.
Specified by:
addToStatistic in interface Emission
Parameters:
forward - whether to use the forward or the reverse strand
startPos - the start position
endPos - the end position
weight - the weight of the sequence
seq - the sequence
Throws:
OperationNotSupportedException - if forward=false and the reverse complement of the sequence seq is not defined

protected abstract int getConditionIndex(boolean forward, int seqPos, Sequence seq)
This method returns an index encoding the condition.
Parameters:
forward - a switch to decide whether to use the forward or the reverse complementary strand (e.g. for DNA sequences)
seqPos - the position in the sequence seq
seq - the sequence

protected int getIndex(int seqPos, Sequence seq)
Returns the index for position seqPos in sequence seq.
Parameters:
seqPos - the position
seq - the sequence

public void estimateFromStatistic()
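Description copied from interface: Emission
This method estimates the parameters from the internal sufficient statistic.
Specified by:
estimateFromStatistic in interface Emission

public void resetStatistic()
Description copied from interface: Emission
This method resets the internal sufficient statistic.
Specified by:
resetStatistic in interface Emission

public void setParameter(double[] params, int offset)
Description copied from interface: DifferentiableEmission
This method sets the internal parameters using the given global parameter array, the global offset of the HMM and the internal offset.
Specified by:
setParameter in interface DifferentiableEmission
Parameters:
params - the global parameter array of the classifier
offset - the offset of the HMM
See Also:
DifferentiableEmission.setParameterOffset(int)

public AlphabetContainer getAlphabetContainer()
Description copied from interface: Emission
This method returns the AlphabetContainer of this emission.
Specified by:
getAlphabetContainer in interface Emission
Returns:
the AlphabetContainer of this emission

public void fillCurrentParameter(double[] params)
Description copied from interface: DifferentiableEmission
Fills the current parameters in the global params array using the internal offset.
Specified by:
fillCurrentParameter in interface DifferentiableEmission
Parameters:
params - the global parameter array of the HMM
See Also:
DifferentiableEmission.setParameterOffset(int)

public int setParameterOffset(int offset)
Description copied from interface: DifferentiableEmission
This method sets the internal parameter offset and returns the new parameter offset for further use.
Specified by:
setParameterOffset in interface DifferentiableEmission
Parameters:
offset - the offset to be set

protected void drawParameters(double[][] hyper, boolean uniformBackup)
Draws the parameters of this AbstractConditionalDiscreteEmission from a Dirichlet distribution with given hyper-parameters. If the equivalent sample size (ess) according to the provided hyper-parameters is zero, parameters may be drawn from a uniform distribution on the simplex.
Parameters:
hyper - the hyper-parameters
uniformBackup - whether a uniform distribution should be used if the ess is zero

public void drawParametersFromStatistic()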
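Description copied from interface: SamplingFromStatistic
This method draws the parameters using a sufficient statistic representing an a-posteriori density. The drawn parameters can be saved using SamplingComponent.acceptParameters() so that they can later be parsed using the methods of the interface. Before this method is used, SamplingComponent.initForSampling(int) should be called.
Specified by:
drawParametersFromStatistic in interface SamplingFromStatistic
See Also:
SamplingComponent.initForSampling(int), SamplingComponent.acceptParameters()

public double getLogGammaScoreFromStatistic()
Description copied from interface: SamplingEmission
This method calculates a score for the current statistics that is independent of the current parameters. In general, the gamma-score is a product of gamma functions parameterized with the current statistics.
Specified by:
getLogGammaScoreFromStatistic in interface SamplingEmission

public void acceptParameters() throws IOException
Description copied from interface: SamplingComponent
This method accepts the drawn parameters.
Specified by:
acceptParameters in interface SamplingComponent
Throws:
IOException - if the file could not be handled correctly

public double getLogPosteriorFromStatistic()
Description copied from interface: SamplingFromStatistic
This method calculates the a-posteriori probability for the current statistics.
Specified by:
getLogPosteriorFromStatistic in interface SamplingFromStatistic

public void extendSampling(int start, boolean append) throws IOException
Description copied from interface: SamplingComponent
This method allows the user to extend a sampling.
Specified by:
extendSampling in interface SamplingComponent
Parameters:
start - the index of the sampling
append - whether to append the sampled parameters to an existing file or to overwrite the file
Throws:
IOException - if the file could not be handled correctly

public void initForSampling(int starts) throws IOException
Description copied from interface: SamplingComponent
This method initializes the instance for the sampling.
Specified by:
initForSampling in interface SamplingComponent
Parameters:
starts - the number of different sampling starts that will be done
Throws:
IOException - if something went wrong
See Also:
File.createTempFile(String, String, java.io.File)

public boolean isInSamplingMode()
Description copied from interface: SamplingComponent
This method returns true if the object is currently used in a sampling, otherwise false.
Specified by:
isInSamplingMode in interface SamplingComponent
Returns:
true if the object is currently used in a sampling, otherwise false

public boolean parseNextParameterSet()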
Description copied from interface: SamplingComponent
This method allows the user to parse the next set of parameters (from a file).
Specified by:
parseNextParameterSet in interface SamplingComponent
Returns:
true if the parameters could be parsed, otherwise false
See Also:
SamplingComponent.parseParameterSet(int, int)

public boolean parseParameterSet(int start, int n) throws IOException
Description copied from interface: SamplingComponent
This method allows the user to parse the set of parameters with index n of a certain sampling (from a file). The internal numbering should start with 0. The parameter set with index 0 is the initial (random) parameter set. It is recommended that a series of parameter sets is accessed by the following lines:

    // numSampling (number of samplings) and n (index of the first set) are defined by the caller
    for( int sampling = 0; sampling < numSampling; sampling++ )
    {
        boolean b = parseParameterSet( sampling, n );
        while( b )
        {
            // do something
            b = parseNextParameterSet();
        }
    }

Specified by:
parseParameterSet in interface SamplingComponent
Parameters:
start - the index of the sampling
n - the index of the parameter set
Returns:
true if the parameter set could be parsed
Throws:
IOException
See Also:
SamplingComponent.parseNextParameterSet()

public void samplingStopped() throws IOException
Description copied from interface: SamplingComponent
This method is the opposite of the method SamplingComponent.extendSampling(int, boolean). It can be used for closing any streams of writer, ...
Specified by:
samplingStopped in interface SamplingComponent
Throws:
IOException - if something went wrong
See Also:
SamplingComponent.extendSampling(int, boolean)

protected void finalize() throws Throwable
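
The sampling methods documented above (initForSampling, extendSampling, acceptParameters, samplingStopped, parseParameterSet, parseNextParameterSet) describe a write-then-read life cycle around a parameter file. The sketch below illustrates that life cycle with the Java standard library only. It is not the library implementation: the class SamplingFileSketch is hypothetical, it handles a single sampling instead of a File[] array, the tab-separated line format is an assumption, and the parse methods return the parsed array (or null) instead of a boolean.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-file sketch of the sampling life cycle; not the library code. */
final class SamplingFileSketch {
    private final File paramsFile;
    private BufferedWriter writer; // non-null only while a sampling is running
    private BufferedReader reader; // non-null only while parameter sets are parsed

    SamplingFileSketch() throws IOException {
        paramsFile = File.createTempFile("params-", ".dat");
        paramsFile.deleteOnExit();
    }

    /** Opens the file for writing, either appending to or overwriting earlier sets. */
    void extendSampling(boolean append) throws IOException {
        closeReader();
        writer = new BufferedWriter(new FileWriter(paramsFile, append));
    }

    /** Writes one parameter set as a tab-separated line (assumed format). */
    void acceptParameters(double[] params) throws IOException {
        if (writer == null) {
            throw new IllegalStateException("not in sampling mode");
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < params.length; i++) {
            if (i > 0) {
                line.append('\t');
            }
            line.append(params[i]);
        }
        writer.write(line.toString());
        writer.newLine();
    }

    /** Closes the writer; afterwards the sets can be parsed. */
    void samplingStopped() throws IOException {
        if (writer != null) {
            writer.close();
            writer = null;
        }
    }

    boolean isInSamplingMode() {
        return writer != null;
    }

    /** Positions the reader at set n (0-based) and returns it, or null if it does not exist. */
    double[] parseParameterSet(int n) throws IOException {
        closeReader();
        reader = new BufferedReader(new FileReader(paramsFile));
        for (int i = 0; i < n; i++) {
            if (reader.readLine() == null) {
                closeReader();
                return null;
            }
        }
        return parseNextParameterSet();
    }

    /** Returns the next set, or null at the end of the file. */
    double[] parseNextParameterSet() throws IOException {
        String line = (reader == null) ? null : reader.readLine();
        if (line == null) {
            closeReader();
            return null;
        }
        String[] tokens = line.split("\t");
        double[] params = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            params[i] = Double.parseDouble(tokens[i]);
        }
        return params;
    }

    private void closeReader() throws IOException {
        if (reader != null) {
            reader.close();
            reader = null;
        }
    }

    /** Writes all sets, then reads them back starting at index first. */
    static List<double[]> roundTrip(double[][] sets, int first) {
        try {
            SamplingFileSketch s = new SamplingFileSketch();
            s.extendSampling(false);
            for (double[] set : sets) {
                s.acceptParameters(set);
            }
            s.samplingStopped();
            List<double[]> parsed = new ArrayList<>();
            for (double[] p = s.parseParameterSet(first); p != null; p = s.parseNextParameterSet()) {
                parsed.add(p);
            }
            return parsed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (double[] p : roundTrip(new double[][] {{0.1, 0.9}, {0.25, 0.75}}, 0)) {
            System.out.println(p[0] + "\t" + p[1]);
        }
    }
}
```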
public String getNodeShape(boolean forward)
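Description copied from interface: Emission
Returns the graphviz string for the shape of the node.
Specified by:
getNodeShape in interface Emission
Parameters:
forward - if this emission is used on the forward strand

public String getNodeLabel(double weight, String name, NumberFormat nf)
Description copied from interface: Emission
Returns the graphviz label of the node containing this emission.
Specified by:
getNodeLabel in interface Emission
Parameters:
weight - the weight of the node which is represented by the color of the node, or -1 for no representation, i.e., white background
name - the name of the state using this emission
nf - the NumberFormat for formatting the textual representation of this emission

public void setLinear(boolean linear)
If set to true, the probabilities are mapped to colors directly; otherwise a logistic mapping is used to emphasize deviations from the uniform distribution.
Parameters:
linear - whether to map probabilities linearly

public void fillSamplingGroups(int parameterOffset, LinkedList<int[]> list)
Description copied from interface: DifferentiableEmission
Adds the groups of indexes of those parameters of this emission that should be sampled together in one step of a grouped sampling procedure, each as an int[], into list. In most cases, one group should contain the parameters that are living on a common simplex. The internal indexes of the parameters are incremented by an external parameterOffset.
Specified by:
fillSamplingGroups in interface DifferentiableEmission
Parameters:
parameterOffset - the external parameter offset
list - the list of sampling groups

public int getNumberOfParameters()
Description copied from interface: DifferentiableEmission
Returns the number of parameters of this emission.
Specified by:
getNumberOfParameters in interface DifferentiableEmission

public int getSizeOfEventSpace()
Description copied from interface: DifferentiableEmission
Returns the size of the event space, i.e., the number of possible outcomes, for the random variables of this emission.
Specified by:
getSizeOfEventSpace in interface DifferentiableEmission

public void setParameters(Emission t) throws IllegalArgumentException
Description copied from interface: Emission
Sets the values of the parameters of this instance to the values of the parameters of the given instance.
Specified by:
setParameters in interface Emission
Parameters:
t - the emission with the parameters to be set
Throws:
IllegalArgumentException - if the assumption about the same class for given and current instance is wrong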
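
The offset-related methods (setParameterOffset, fillCurrentParameter, setParameter, fillSamplingGroups) let several emissions share one global parameter array. The sketch below shows how such bookkeeping could work. The class OffsetSketch is hypothetical; the row-major layout of params in the global array and the way fillSamplingGroups combines the external and internal offsets are assumptions, not documented behavior.

```java
import java.util.Arrays;
import java.util.LinkedList;

/** Hypothetical sketch of offset bookkeeping in a shared parameter array; not the library code. */
final class OffsetSketch {
    final double[][] params; // one row per condition
    private int offset;      // index of this emission's first parameter in the global array

    OffsetSketch(int numConditions, int numEmissions) {
        params = new double[numConditions][numEmissions];
    }

    int getNumberOfParameters() {
        int n = 0;
        for (double[] row : params) {
            n += row.length;
        }
        return n;
    }

    /** Stores the offset and returns the first free index after this emission's parameters. */
    int setParameterOffset(int offset) {
        this.offset = offset;
        return offset + getNumberOfParameters();
    }

    /** Copies the internal parameters into the global array, row by row (assumed layout). */
    void fillCurrentParameter(double[] global) {
        int k = offset;
        for (double[] row : params) {
            for (double p : row) {
                global[k++] = p;
            }
        }
    }

    /** Reads the internal parameters back from the global array at hmmOffset + internal offset. */
    void setParameter(double[] global, int hmmOffset) {
        int k = hmmOffset + offset;
        for (double[] row : params) {
            for (int j = 0; j < row.length; j++) {
                row[j] = global[k++];
            }
        }
    }

    /** Adds one index group per condition, i.e. per simplex, shifted by the external offset. */
    void fillSamplingGroups(int parameterOffset, LinkedList<int[]> list) {
        int k = parameterOffset + offset;
        for (double[] row : params) {
            int[] group = new int[row.length];
            for (int j = 0; j < row.length; j++) {
                group[j] = k++;
            }
            list.add(group);
        }
    }

    public static void main(String[] args) {
        OffsetSketch a = new OffsetSketch(2, 4);
        OffsetSketch b = new OffsetSketch(1, 4);
        int next = a.setParameterOffset(0); // a occupies indexes 0..7
        next = b.setParameterOffset(next);  // b occupies indexes 8..11
        double[] global = new double[next];
        b.params[0][2] = 1.5;
        a.fillCurrentParameter(global);
        b.fillCurrentParameter(global);
        LinkedList<int[]> groups = new LinkedList<>();
        a.fillSamplingGroups(0, groups);
        b.fillSamplingGroups(0, groups);
        for (int[] g : groups) {
            System.out.println(Arrays.toString(g));
        }
    }
}
```

Chaining the return value of setParameterOffset into the next call is what keeps the index ranges of different emissions disjoint.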