de.jstacs.sequenceScores.statisticalModels.trainable.hmm.states.emissions.continuous
Class GaussianEmission

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.hmm.states.emissions.continuous.GaussianEmission
All Implemented Interfaces:
DifferentiableEmission, Emission, Storable, Cloneable
Direct Known Subclasses:
PluginGaussianEmission

public class GaussianEmission
extends Object
implements DifferentiableEmission

Emission for continuous values following a Gaussian distribution. The Gaussian density is parameterized in terms of mean and (log) precision. The prior of the Gaussian density is a normal-gamma density parameterized in terms of shape and rate.
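
As an illustration of the parameterization described above, the following standalone sketch evaluates a Gaussian log density from a mean and a log precision. It does not use the Jstacs classes; the class name GaussianLogDensitySketch and the method logDensity are hypothetical and are not part of GaussianEmission.

```java
final class GaussianLogDensitySketch {
    private static final double LOG_2_PI = Math.log(2.0 * Math.PI);

    /**
     * Log density of x under a Gaussian with the given mean and log precision:
     * 0.5 * (logPrecision - log(2*pi)) - 0.5 * exp(logPrecision) * (x - mean)^2.
     */
    static double logDensity(double x, double mean, double logPrecision) {
        double precision = Math.exp(logPrecision);
        double diff = x - mean;
        return 0.5 * (logPrecision - LOG_2_PI) - 0.5 * precision * diff * diff;
    }

    public static void main(String[] args) {
        // Standard normal (mean 0, log precision 0, i.e. precision 1) evaluated at 0.
        System.out.println(logDensity(0.0, 0.0, 0.0));
    }
}
```

Working with the log precision keeps the precision positive for any real parameter value, which is convenient for unconstrained numerical optimization.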

Author:
Jan Grau

Constructor Summary
GaussianEmission(AlphabetContainer con)
          Creates a GaussianEmission which can be used for maximum likelihood.
GaussianEmission(AlphabetContainer con, double ess, double priorMu, double priorAlpha, double priorBeta, boolean transformed)
          Creates a GaussianEmission with normal-gamma prior by directly defining the hyper-parameters of the prior.
GaussianEmission(double ess, AlphabetContainer con, double priorMu, double expectedPrecision, double sdPrecision, boolean transformed)
          Creates a GaussianEmission with normal-gamma prior by defining the expected precision and the expected standard deviation of the precision, i.e. via the expectation and variance of the gamma part of the normal-gamma prior.
GaussianEmission(StringBuffer xml)
          Creates a GaussianEmission from its XML representation.
 
Method Summary
 void addGradientOfLogPriorTerm(double[] gradient, int offset)
          This method computes the gradient of Emission.getLogPriorTerm() for each parameter of this model.
 void addToStatistic(boolean forward, int startPos, int endPos, double weight, Sequence seq)
          This method adds the weight to the internal sufficient statistic.
 GaussianEmission clone()
           
 void estimateFromStatistic()
          This method estimates the parameters from the internal sufficient statistic.
 void fillCurrentParameter(double[] params)
          Fills the current parameters in the global params array using the internal offset.
 void fillSamplingGroups(int parameterOffset, LinkedList<int[]> list)
          Adds the groups of indexes of those parameters of this emission that should be sampled together in one step of a grouped sampling procedure, each as an int[], into list.
protected  void fromXML(StringBuffer xml)
          This method is internally used by the constructor GaussianEmission(StringBuffer).
 AlphabetContainer getAlphabetContainer()
          This method returns the AlphabetContainer of this emission.
 double getLogPriorTerm()
          Returns a value that is proportional to the log of the prior.
 double getLogProbAndPartialDerivationFor(boolean forward, int startPos, int endPos, IntList indices, DoubleList partDer, Sequence seq)
          Returns the logarithmic score for a Sequence beginning at position startPos in the Sequence and fills lists with the indices and the partial derivations.
 double getLogProbFor(boolean forward, int startPos, int endPos, Sequence seq)
          This method computes the logarithm of the likelihood.
 String getNodeLabel(double weight, String name, NumberFormat nf)
          Returns the graphviz label of the node containing this emission.
 String getNodeShape(boolean forward)
          Returns the graphviz string for the shape of the node.
 int getNumberOfParameters()
          Returns the number of parameters of this emission.
 int getSizeOfEventSpace()
          Returns the size of the event space, i.e., the number of possible outcomes, for the random variables of this emission.
 void initializeFunctionRandomly()
          This method initializes the emission randomly.
 void joinStatistics(Emission... emissions)
          This method joins the statistics of different instances and sets this joined statistic as statistic of each instance.
protected  void precompute()
          This method precomputes some normalization constant.
 void resetStatistic()
          This method resets the internal sufficient statistic.
 void setParameter(double[] params, int offset)
          This method sets the internal parameters using the given global parameter array, the global offset of the HMM and the internal offset.
 int setParameterOffset(int offset)
          This method sets the internal parameter offset and returns the new parameter offset for further use.
 void setParameters(Emission t)
          Set values of parameters of the instance to the value of the parameters of the given instance.
 String toString(NumberFormat nf)
          This method returns a String representation of the instance.
 StringBuffer toXML()
          This method returns an XML representation as StringBuffer of an instance of the implementing class.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GaussianEmission

public GaussianEmission(AlphabetContainer con)
Creates a GaussianEmission which can be used for maximum likelihood.

Parameters:
con - the alphabet of the emissions

GaussianEmission

public GaussianEmission(AlphabetContainer con,
                        double ess,
                        double priorMu,
                        double priorAlpha,
                        double priorBeta,
                        boolean transformed)
Creates a GaussianEmission with normal-gamma prior by directly defining the hyper-parameters of the prior.

Parameters:
con - the alphabet of the emissions
ess - the equivalent sample size of the normal-gamma prior
priorMu - the a-priori mean of the normal part of the prior
priorAlpha - the shape parameter of the gamma part of the prior
priorBeta - the rate parameter of the gamma part of the prior
transformed - use the transformed Gaussian density, i.e. exponential precision, ignored for numerical optimization

GaussianEmission

public GaussianEmission(double ess,
                        AlphabetContainer con,
                        double priorMu,
                        double expectedPrecision,
                        double sdPrecision,
                        boolean transformed)
Creates a GaussianEmission with normal-gamma prior by defining the expected precision and the expected standard deviation of the precision, i.e. via the expectation and variance of the gamma part of the normal-gamma prior.

Parameters:
ess - the equivalent sample size of the normal-gamma prior
con - the alphabet of the emissions
priorMu - the a-priori mean of the normal part of the prior
expectedPrecision - the expected value of the precision
sdPrecision - the expected standard deviation of the precision
transformed - use the transformed Gaussian density, i.e. exponential precision, ignored for numerical optimization
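
The relation between this constructor and the one taking priorAlpha and priorBeta can be sketched by moment matching. For a gamma density with shape alpha and rate beta, the mean is alpha/beta and the variance is alpha/beta^2, so shape and rate follow from the expected precision and its standard deviation. The sketch below is standalone; the class and method names are hypothetical, and it is an assumption that the library performs the conversion in exactly this way.

```java
final class GammaMomentMatchingSketch {
    /**
     * Returns {shape, rate} of the gamma density whose mean is expectedPrecision
     * and whose standard deviation is sdPrecision.
     */
    static double[] shapeAndRate(double expectedPrecision, double sdPrecision) {
        if (expectedPrecision <= 0.0 || sdPrecision <= 0.0) {
            throw new IllegalArgumentException("mean and standard deviation must be positive");
        }
        double variance = sdPrecision * sdPrecision;
        double shape = expectedPrecision * expectedPrecision / variance; // mean^2 / variance
        double rate = expectedPrecision / variance;                      // mean / variance
        return new double[] { shape, rate };
    }
}
```

For example, an expected precision of 2 with a standard deviation of 1 corresponds to shape 4 and rate 2.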

GaussianEmission

public GaussianEmission(StringBuffer xml)
                 throws NonParsableException
Creates a GaussianEmission from its XML representation.

Parameters:
xml - the XML representation.
Throws:
NonParsableException - if the XML representation could not be parsed

Method Detail

clone

public GaussianEmission clone()
                       throws CloneNotSupportedException
Overrides:
clone in class Object
Throws:
CloneNotSupportedException

addGradientOfLogPriorTerm

public void addGradientOfLogPriorTerm(double[] gradient,
                                      int offset)
Description copied from interface: DifferentiableEmission
This method computes the gradient of Emission.getLogPriorTerm() for each parameter of this model. The results are added to the array gradient beginning at index (offset + internal offset).

Specified by:
addGradientOfLogPriorTerm in interface DifferentiableEmission
Parameters:
gradient - the array of gradients
offset - the start index of the HMM in the gradient array, where the partial derivations for the parameters of the HMM shall be entered
See Also:
Emission.getLogPriorTerm(), DifferentiableEmission.setParameterOffset(int)

joinStatistics

public void joinStatistics(Emission... emissions)
Description copied from interface: Emission
This method joins the statistics of different instances and sets this joined statistic as statistic of each instance. This method might be used for instance in a multi-threaded optimization to join partial statistics.

Specified by:
joinStatistics in interface Emission
Parameters:
emissions - the emissions to be joined
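
A minimal sketch of the joining step described above, assuming each instance keeps its sufficient statistic as a plain double[] of equal length. This is an assumption made for illustration; the internal representation used by GaussianEmission is not documented here, and the class name JoinStatisticsSketch is hypothetical.

```java
final class JoinStatisticsSketch {
    /** Sums the partial statistics element-wise and writes the total back into every array. */
    static void join(double[]... partials) {
        if (partials.length == 0) {
            return;
        }
        double[] total = new double[partials[0].length];
        for (double[] partial : partials) {
            for (int i = 0; i < total.length; i++) {
                total[i] += partial[i];
            }
        }
        // Every instance ends up with the same joined statistic.
        for (double[] partial : partials) {
            System.arraycopy(total, 0, partial, 0, total.length);
        }
    }
}
```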

addToStatistic

public void addToStatistic(boolean forward,
                           int startPos,
                           int endPos,
                           double weight,
                           Sequence seq)
                    throws OperationNotSupportedException
Description copied from interface: Emission
This method adds the weight to the internal sufficient statistic.

Specified by:
addToStatistic in interface Emission
Parameters:
forward - whether to use the forward or the reverse strand
startPos - the start position
endPos - the end position
weight - the weight of the sequence
seq - the sequence
Throws:
OperationNotSupportedException - if forward=false and the reverse complement of the sequence seq is not defined

estimateFromStatistic

public void estimateFromStatistic()
Description copied from interface: Emission
This method estimates the parameters from the internal sufficient statistic.

Specified by:
estimateFromStatistic in interface Emission
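
The interplay of addToStatistic, estimateFromStatistic and resetStatistic can be sketched for the maximum likelihood case with a weighted sufficient statistic for a Gaussian: the sum of weights, the weighted sum of values and the weighted sum of squared values. The class below is a standalone, hypothetical stand-in and not the library's implementation; with a normal-gamma prior, the hyper-parameters would enter the estimate as well.

```java
final class GaussianStatisticSketch {
    private double sumW;   // sum of weights
    private double sumWX;  // weighted sum of observed values
    private double sumWXX; // weighted sum of squared observed values

    /** Adds one observation x with the given weight to the statistic. */
    void add(double weight, double x) {
        sumW += weight;
        sumWX += weight * x;
        sumWXX += weight * x * x;
    }

    /** Clears the statistic, e.g. before the next training iteration. */
    void reset() {
        sumW = 0.0;
        sumWX = 0.0;
        sumWXX = 0.0;
    }

    /** Returns the maximum likelihood estimate as {mean, precision}. */
    double[] estimate() {
        double mean = sumWX / sumW;
        double variance = sumWXX / sumW - mean * mean;
        return new double[] { mean, 1.0 / variance };
    }
}
```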

fillCurrentParameter

public void fillCurrentParameter(double[] params)
Description copied from interface: DifferentiableEmission
Fills the current parameters in the global params array using the internal offset.

Specified by:
fillCurrentParameter in interface DifferentiableEmission
Parameters:
params - the global parameter array of the HMM
See Also:
DifferentiableEmission.setParameterOffset(int)

setParameter

public void setParameter(double[] params,
                         int offset)
Description copied from interface: DifferentiableEmission
This method sets the internal parameters using the given global parameter array, the global offset of the HMM and the internal offset.

Specified by:
setParameter in interface DifferentiableEmission
Parameters:
params - the global parameter array of the HMM
offset - the offset of the HMM
See Also:
DifferentiableEmission.setParameterOffset(int)

setParameterOffset

public int setParameterOffset(int offset)
Description copied from interface: DifferentiableEmission
This method sets the internal parameter offset and returns the new parameter offset for further use.

Specified by:
setParameterOffset in interface DifferentiableEmission
Parameters:
offset - the offset to be set
Returns:
the new parameter offset
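
The offset bookkeeping shared by setParameterOffset and fillCurrentParameter can be sketched as follows: each component stores where its parameters start in the global array and returns the first index that is free for the next component. The class OffsetSketch is a hypothetical, standalone illustration and does not reproduce the library code.

```java
final class OffsetSketch {
    private final double[] own; // the parameters owned by this component
    private int offset;         // start index of these parameters in the global array

    OffsetSketch(double... own) {
        this.own = own.clone();
    }

    /** Stores the offset and returns the offset to be used by the next component. */
    int setParameterOffset(int offset) {
        this.offset = offset;
        return offset + own.length;
    }

    /** Copies the own parameters into the global array at the stored offset. */
    void fillCurrentParameter(double[] params) {
        System.arraycopy(own, 0, params, offset, own.length);
    }
}
```

Chaining the returned offsets over all components yields the total length of the global parameter array.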

fromXML

protected void fromXML(StringBuffer xml)
                throws NonParsableException
This method is internally used by the constructor GaussianEmission(StringBuffer).

Parameters:
xml - the StringBuffer containing the xml representation of an instance
Throws:
NonParsableException - if the StringBuffer is not parsable
See Also:
GaussianEmission(StringBuffer)

getLogPriorTerm

public double getLogPriorTerm()
Description copied from interface: Emission
Returns a value that is proportional to the log of the prior. For maximum likelihood (ML) 0 should be returned.

Specified by:
getLogPriorTerm in interface Emission
Returns:
a value that is proportional to the log of the prior
See Also:
StatisticalModel.getLogPriorTerm()
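
For orientation, the log of a normal-gamma density up to an additive constant can be written as (alpha - 1/2) log(lambda) - beta lambda - (kappa lambda / 2)(mu - mu0)^2, where lambda is the precision. The standalone sketch below evaluates this expression. The names are hypothetical; kappa denotes the strength of the normal part, and it is an assumption, not confirmed by this page, that the equivalent sample size plays that role. The library may also add further terms, for instance for the log-precision parameterization.

```java
final class NormalGammaLogPriorSketch {
    /**
     * Log of the normal-gamma density at (mean, precision), up to an additive constant.
     * priorMu: prior mean, kappa: strength of the normal part (hypothetical name),
     * alpha: shape and beta: rate of the gamma part.
     */
    static double logPriorTerm(double mean, double precision,
                               double priorMu, double kappa, double alpha, double beta) {
        double diff = mean - priorMu;
        return (alpha - 0.5) * Math.log(precision)
                - beta * precision
                - 0.5 * kappa * precision * diff * diff;
    }
}
```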

getLogProbAndPartialDerivationFor

public double getLogProbAndPartialDerivationFor(boolean forward,
                                                int startPos,
                                                int endPos,
                                                IntList indices,
                                                DoubleList partDer,
                                                Sequence seq)
                                         throws OperationNotSupportedException
Description copied from interface: DifferentiableEmission
Returns the logarithmic score for a Sequence beginning at position startPos in the Sequence and fills lists with the indices and the partial derivations.

Specified by:
getLogProbAndPartialDerivationFor in interface DifferentiableEmission
Parameters:
forward - a switch whether to use the forward or the reverse complementary strand of the sequence
startPos - the start position in the Sequence
endPos - the end position in the Sequence
indices - an IntList of indices, after method invocation the list should contain the indices i where $\frac{\partial \log score(seq)}{\partial \lambda_i}$ is not zero
partDer - a DoubleList of partial derivations, after method invocation the list should contain the corresponding $\frac{\partial \log score(seq)}{\partial \lambda_i}$ that are not zero
seq - the Sequence
Returns:
the logarithmic score for the Sequence
Throws:
OperationNotSupportedException - if forward=false and the reverse complement of the sequence cannot be computed
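
The contract for indices and partDer can be sketched for a single observation under a Gaussian with parameters mean and log precision. The partial derivative with respect to the mean is precision * (x - mean), and with respect to the log precision it is 0.5 - 0.5 * precision * (x - mean)^2; only non-zero entries are appended. The sketch is standalone and uses java.util.List as a stand-in for IntList and DoubleList; the class name, the method name and the parameter order (mean first, then log precision) are assumptions.

```java
import java.util.List;

final class GaussianGradientSketch {
    /**
     * Returns the Gaussian log density of x and appends the indices and values
     * of the non-zero partial derivatives with respect to mean and log precision.
     */
    static double logDensityAndGradient(double x, double mean, double logPrecision, int offset,
                                        List<Integer> indices, List<Double> partDer) {
        double precision = Math.exp(logPrecision);
        double diff = x - mean;
        double dMean = precision * diff;
        double dLogPrecision = 0.5 - 0.5 * precision * diff * diff;
        if (dMean != 0.0) {
            indices.add(offset);         // index of the mean parameter
            partDer.add(dMean);
        }
        if (dLogPrecision != 0.0) {
            indices.add(offset + 1);     // index of the log precision parameter
            partDer.add(dLogPrecision);
        }
        return 0.5 * (logPrecision - Math.log(2.0 * Math.PI)) - 0.5 * precision * diff * diff;
    }
}
```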

getLogProbFor

public double getLogProbFor(boolean forward,
                            int startPos,
                            int endPos,
                            Sequence seq)
                     throws OperationNotSupportedException
Description copied from interface: Emission
This method computes the logarithm of the likelihood.

Specified by:
getLogProbFor in interface Emission
Parameters:
forward - whether to use the forward or the reverse strand
startPos - the start position
endPos - the end position
seq - the sequence
Returns:
the logarithm of the probability
Throws:
OperationNotSupportedException - if forward=false and the reverse complement of the sequence seq is not defined

initializeFunctionRandomly

public void initializeFunctionRandomly()
Description copied from interface: Emission
This method initializes the emission randomly.

Specified by:
initializeFunctionRandomly in interface Emission

precompute

protected void precompute()
This method precomputes some normalization constant.


resetStatistic

public void resetStatistic()
Description copied from interface: Emission
This method resets the internal sufficient statistic.

Specified by:
resetStatistic in interface Emission

toXML

public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.

Specified by:
toXML in interface Storable
Returns:
the XML representation

getAlphabetContainer

public AlphabetContainer getAlphabetContainer()
Description copied from interface: Emission
This method returns the AlphabetContainer of this emission.

Specified by:
getAlphabetContainer in interface Emission
Returns:
the AlphabetContainer of this emission

toString

public String toString(NumberFormat nf)
Description copied from interface: Emission
This method returns a String representation of the instance.

Specified by:
toString in interface Emission
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns:
a String representation of the instance

getNodeShape

public String getNodeShape(boolean forward)
Description copied from interface: Emission
Returns the graphviz string for the shape of the node.

Specified by:
getNodeShape in interface Emission
Parameters:
forward - if this emission is used on the forward strand
Returns:
the shape

getNodeLabel

public String getNodeLabel(double weight,
                           String name,
                           NumberFormat nf)
Description copied from interface: Emission
Returns the graphviz label of the node containing this emission.

Specified by:
getNodeLabel in interface Emission
Parameters:
weight - the weight of the node which is represented by the color of the node, or -1 for no representation, i.e., white background
name - the name of the state using this emission
nf - the NumberFormat for formatting the textual representation of this emission
Returns:
the label

fillSamplingGroups

public void fillSamplingGroups(int parameterOffset,
                               LinkedList<int[]> list)
Description copied from interface: DifferentiableEmission
Adds the groups of indexes of those parameters of this emission that should be sampled together in one step of a grouped sampling procedure, each as an int[], into list. In most cases, one group should contain the parameters that are living on a common simplex. The internal indexes of the parameters are incremented by an external parameterOffset.

Specified by:
fillSamplingGroups in interface DifferentiableEmission
Parameters:
parameterOffset - the external parameter offset
list - the list of sampling groups

getNumberOfParameters

public int getNumberOfParameters()
Description copied from interface: DifferentiableEmission
Returns the number of parameters of this emission.

Specified by:
getNumberOfParameters in interface DifferentiableEmission
Returns:
the number of parameters

getSizeOfEventSpace

public int getSizeOfEventSpace()
Description copied from interface: DifferentiableEmission
Returns the size of the event space, i.e., the number of possible outcomes, for the random variables of this emission.

Specified by:
getSizeOfEventSpace in interface DifferentiableEmission
Returns:
the size of the event space

setParameters

public void setParameters(Emission t)
                   throws IllegalArgumentException
Description copied from interface: Emission
Set values of parameters of the instance to the value of the parameters of the given instance. It can be assumed that the given instance and the current instance are from the same class. This method might be used for instance in a multi-threaded optimization to broadcast the parameters.

Specified by:
setParameters in interface Emission
Parameters:
t - the emission with the parameters to be set
Throws:
IllegalArgumentException - if the assumption about the same class for given and current instance is wrong