de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous
Class InhCondProb

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.Constraint
      extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhConstraint
          extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhCondProb
All Implemented Interfaces:
Storable, Cloneable

public class InhCondProb
extends InhConstraint

This class handles (conditional) probabilities of sequences for inhomogeneous models.

Author:
Jens Keilwagen

Field Summary
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhConstraint
offset
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.Constraint
counts, freq, usedPositions
 
Constructor Summary
InhCondProb(int[] pos, int[] alphabetLength, boolean cond)
          Creates a new InhCondProb instance.
InhCondProb(int pos, int... alphabetLength)
          Creates a new InhCondProb instance.
InhCondProb(StringBuffer xml)
          The standard constructor for the interface Storable.
 
Method Summary
protected  void appendAdditionalInfo(StringBuffer xml)
          This method appends additional information that is not stored in the base class to the StringBuffer.
 InhCondProb clone()
           
 void drawParameters(double ess)
          Draws the parameters from a Dirichlet distribution using the counts and the given ess (equivalent sample size) as hyperparameters.
protected  void drawUnConditional(int start, int end, double pc)
          This method draws the parameters for a part of this constraint.
 void estimate(double ess)
          Estimates the (smoothed) relative frequencies using the ess (equivalent sample size).
 void estimateUnConditional(double ess, double all)
          Estimates the unconditional frequencies using the ess (equivalent sample size).
protected  void estimateUnConditional(int start, int end, double pc, boolean exceptionWhenNoData)
          Estimates unconditionally.
protected  void extractAdditionalInfo(StringBuffer xml)
          This method parses additional information from the StringBuffer that is not parsed in the base class.
 String getDescription(AlphabetContainer con, int i)
          Returns the decoded symbol for the encoded symbol i.
 double getLnFreq(int index)
          Returns the logarithm of the relative frequency (=probability) at position index in the distribution.
 double getLnFreq(Sequence s, int start)
          Returns the logarithm of the relative frequency (=probability) of the entry whose index in the distribution is that of the specific constraint fulfilled by the Sequence s beginning at start.
 void getOutput(byte[] content, double p)
          This method is used to create random sequences.
protected  String getXMLTag()
          Returns the XML tag that is used to encode or decode the class.
 void setFreqs(String[] array, int start)
          This method is used to restore the values of a Gibbs Sampling run.
 String toString()
           
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhConstraint
satisfiesSpecificConstraint
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.Constraint
add, add, getCount, getFreq, getFreq, getFreqInfo, getMarginalOrder, getNumberOfSpecificConstraints, getPosition, getPositions, reset, resetCounts, toXML
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

InhCondProb

public InhCondProb(int pos,
                   int... alphabetLength)
Creates a new InhCondProb instance.

Parameters:
pos - the position
alphabetLength - the length of each alphabet (not only the used position)
See Also:
InhCondProb(int[], int[], boolean)

InhCondProb

public InhCondProb(int[] pos,
                   int[] alphabetLength,
                   boolean cond)
Creates a new InhCondProb instance.

Parameters:
pos - the positions
alphabetLength - the length of each alphabet (not only the used positions)
cond - indicates if the instance has to use conditional probabilities
See Also:
InhConstraint.InhConstraint(int[], int[])
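
As an illustration of what the constructor arguments imply, the sketch below computes how many table entries a constraint over several positions would need and how a combination of symbols could map to a single entry. It assumes a mixed-radix layout in which the last used position varies fastest; the class and method names are hypothetical, and the layout actually used by InhCondProb may differ.

```java
// Hypothetical sketch; not part of Jstacs. Assumes a mixed-radix table layout.
final class ConstraintIndexSketch {

    // Number of table entries: product of the alphabet lengths at the used positions.
    static int numberOfEntries(int[] pos, int[] alphabetLength) {
        int n = 1;
        for (int p : pos) {
            n *= alphabetLength[p];
        }
        return n;
    }

    // Maps the encoded symbols at the used positions to one table index.
    // The last used position varies fastest (assumption).
    static int index(int[] pos, int[] alphabetLength, int[] encodedSeq, int start) {
        int idx = 0;
        for (int p : pos) {
            idx = idx * alphabetLength[p] + encodedSeq[start + p];
        }
        return idx;
    }

    public static void main(String[] args) {
        int[] alphabetLength = {4, 4, 4}; // one entry per model position, not only the used ones
        int[] pos = {0, 2};               // the constraint uses positions 0 and 2
        int[] seq = {1, 3, 2};            // encoded symbols of an example sequence
        System.out.println(numberOfEntries(pos, alphabetLength)); // prints 16
        System.out.println(index(pos, alphabetLength, seq, 0));   // prints 6
    }
}
```

Note that alphabetLength is indexed by model position here, which matches the remark that it covers every alphabet and not only the used positions.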

InhCondProb

public InhCondProb(StringBuffer xml)
            throws NonParsableException
The standard constructor for the interface Storable. Creates a new InhCondProb instance out of its XML representation.

Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the InhCondProb instance could not be reconstructed out of the XML representation (the StringBuffer could not be parsed)
See Also:
Storable, InhConstraint.InhConstraint(StringBuffer)

Method Detail

clone

public InhCondProb clone()
                  throws CloneNotSupportedException
Overrides:
clone in class InhConstraint
Throws:
CloneNotSupportedException

drawParameters

public void drawParameters(double ess)
Draws the parameters from a Dirichlet distribution using the counts and the given ess (equivalent sample size) as hyperparameters.

Parameters:
ess - the given ess (equivalent sample size)
See Also:
drawUnConditional(int, int, double)
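
To make the idea of drawing parameters from a Dirichlet distribution concrete, here is a minimal, self-contained sketch. It assumes that the hyperparameter of each entry is its count plus an equal share pc of the ess, and it samples the Dirichlet by normalizing independent Gamma draws (Marsaglia-Tsang method). The class and method names are hypothetical; this is not the Jstacs implementation.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; not the Jstacs implementation.
final class DirichletDrawSketch {

    // Draws from Gamma(shape, 1) using the Marsaglia-Tsang method.
    static double gamma(double shape, Random rnd) {
        if (shape < 1.0) {
            // Boosting step for small shapes: Gamma(a) = Gamma(a + 1) * U^(1/a)
            return gamma(shape + 1.0, rnd) * Math.pow(rnd.nextDouble(), 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rnd.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0.0) {
                continue; // reject: the cube below must be positive
            }
            v = v * v * v;
            double u = rnd.nextDouble();
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) {
                return d * v;
            }
        }
    }

    // Draws one probability vector from Dirichlet(counts[i] + pc).
    static double[] draw(double[] counts, double pc, Random rnd) {
        double[] theta = new double[counts.length];
        double sum = 0.0;
        for (int i = 0; i < counts.length; i++) {
            theta[i] = gamma(counts[i] + pc, rnd);
            sum += theta[i];
        }
        for (int i = 0; i < theta.length; i++) {
            theta[i] /= sum; // normalized Gamma draws are Dirichlet distributed
        }
        return theta;
    }

    public static void main(String[] args) {
        double[] counts = {10, 0, 5, 5};
        double ess = 4.0;
        double pc = ess / counts.length; // equal split of the ess (assumption)
        System.out.println(Arrays.toString(draw(counts, pc, new Random())));
    }
}
```

With larger counts the drawn vector concentrates around the smoothed relative frequencies; with small counts the ess dominates and the draws vary more.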

estimate

public void estimate(double ess)
Description copied from class: Constraint
Estimates the (smoothed) relative frequencies using the ess (equivalent sample size).

Specified by:
estimate in class Constraint
Parameters:
ess - the ess (equivalent sample size)
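
The following sketch shows one way smoothed relative frequencies could be computed from counts and an ess in the conditional case. It assumes that the ess is split evenly over all table entries as a pseudocount and that each context block is normalized separately; the names and the table layout are hypothetical, and the real method may distribute the ess differently.

```java
import java.util.Arrays;

// Hypothetical sketch; not the Jstacs implementation.
final class SmoothedEstimateSketch {

    // counts holds one block of alphabetSize entries per context.
    // Assumes every block has a positive total after adding pseudocounts.
    static double[] estimate(double[] counts, int alphabetSize, double ess) {
        double pc = ess / counts.length; // equal pseudocount per entry (assumption)
        double[] freq = new double[counts.length];
        for (int start = 0; start < counts.length; start += alphabetSize) {
            double sum = 0.0;
            for (int i = start; i < start + alphabetSize; i++) {
                sum += counts[i] + pc;
            }
            for (int i = start; i < start + alphabetSize; i++) {
                freq[i] = (counts[i] + pc) / sum; // normalize within the context
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        // Two contexts over a four-letter alphabet; the second context was never observed.
        double[] counts = {3, 1, 0, 0, 0, 0, 0, 0};
        System.out.println(Arrays.toString(estimate(counts, 4, 8.0)));
        // prints [0.5, 0.25, 0.125, 0.125, 0.25, 0.25, 0.25, 0.25]
    }
}
```

The unobserved context falls back to a uniform distribution, which is the practical effect of a positive ess.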

estimateUnConditional

public void estimateUnConditional(double ess,
                                  double all)
Estimates the unconditional frequencies using the ess (equivalent sample size).

Parameters:
ess - the ess (equivalent sample size)
all - the sum of all weights used to fill the counts

getLnFreq

public double getLnFreq(int index)
Returns the logarithm of the relative frequency (=probability) at position index in the distribution.

Parameters:
index - the index of the entry in the distribution
Returns:
the logarithm of the relative frequency (=probability)

getLnFreq

public double getLnFreq(Sequence s,
                        int start)
Returns the logarithm of the relative frequency (=probability) of the entry whose index in the distribution is that of the specific constraint fulfilled by the Sequence s beginning at start.

Parameters:
s - the sequence
start - the index of the start position
Returns:
the logarithm of the relative frequency (=probability)
See Also:
InhConstraint.satisfiesSpecificConstraint(Sequence, int)
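
Logarithms are convenient here because the probability of a whole sequence is a product over positions, which becomes a sum of log frequencies and avoids numerical underflow. The sketch below illustrates this for the simplest case, a position-specific model without dependencies between positions. The table layout freq[position][symbol] and all names are assumptions made for the example, not Jstacs code.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class LnFreqSketch {

    // Sums ln P(symbol at position l) over all model positions,
    // reading the encoded sequence from index start onwards.
    static double lnProb(double[][] freq, int[] encodedSeq, int start) {
        double sum = 0.0;
        for (int l = 0; l < freq.length; l++) {
            sum += Math.log(freq[l][encodedSeq[start + l]]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] freq = {
            {0.5, 0.5},   // position 0
            {0.25, 0.75}  // position 1
        };
        int[] seq = {0, 1};
        // Equals ln(0.5 * 0.75) up to rounding.
        System.out.println(lnProb(freq, seq, 0));
    }
}
```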

getOutput

public void getOutput(byte[] content,
                      double p)
               throws OperationNotSupportedException
This method is used to create random sequences.

Parameters:
content - the content of the random sequence as far as it is known
p - a random number in (0,1)
Throws:
OperationNotSupportedException - if this instance models a joint probability for more than one position (shall be implemented in the future)
See Also:
StatisticalModel.emitDataSet(int, int[])
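
A common way to turn a random number p in (0,1) into a symbol is inverse-CDF sampling: accumulate the frequencies and pick the first symbol whose cumulative probability exceeds p. The sketch below shows this for a single position and writes the result into a byte array, mirroring the content parameter. It is an assumption about how such a method could work, with hypothetical names, and not a description of the actual implementation.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; not the Jstacs implementation.
final class EmitSymbolSketch {

    // Returns the first symbol whose cumulative frequency exceeds p.
    static int drawSymbol(double[] freq, double p) {
        double cumulative = 0.0;
        for (int a = 0; a < freq.length; a++) {
            cumulative += freq[a];
            if (p < cumulative) {
                return a;
            }
        }
        return freq.length - 1; // guards against rounding when p is close to 1
    }

    public static void main(String[] args) {
        double[][] freq = {
            {0.1, 0.2, 0.3, 0.4},
            {0.25, 0.25, 0.25, 0.25},
            {0.7, 0.1, 0.1, 0.1}
        };
        Random rnd = new Random();
        byte[] content = new byte[freq.length];
        for (int l = 0; l < content.length; l++) {
            content[l] = (byte) drawSymbol(freq[l], rnd.nextDouble());
        }
        System.out.println(Arrays.toString(content)); // one random encoded sequence
    }
}
```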

toString

public String toString()
Specified by:
toString in class Constraint

drawUnConditional

protected void drawUnConditional(int start,
                                 int end,
                                 double pc)
This method draws the parameters for a part of this constraint. It is used to draw from a distribution with fixed context.

Parameters:
start - the start index
end - the end index
pc - the pseudocount/hyperparameter

estimateUnConditional

protected void estimateUnConditional(int start,
                                     int end,
                                     double pc,
                                     boolean exceptionWhenNoData)
Description copied from class: Constraint
Estimates unconditionally.

Overrides:
estimateUnConditional in class Constraint
Parameters:
start - the start index
end - the end index
pc - the pseudocount for each parameter
exceptionWhenNoData - indicates if a (runtime) exception is thrown if no data was available to estimate the parameters

appendAdditionalInfo

protected void appendAdditionalInfo(StringBuffer xml)
Description copied from class: Constraint
This method appends additional information that is not stored in the base class to the StringBuffer.

Overrides:
appendAdditionalInfo in class InhConstraint
Parameters:
xml - the StringBuffer that is used for appending additional information

getXMLTag

protected String getXMLTag()
Description copied from class: Constraint
Returns the XML tag that is used to encode or decode the class.

Specified by:
getXMLTag in class Constraint
Returns:
the XML tag that is used to encode or decode the class

extractAdditionalInfo

protected void extractAdditionalInfo(StringBuffer xml)
                              throws NonParsableException
Description copied from class: Constraint
This method parses additional information from the StringBuffer that is not parsed in the base class.

Overrides:
extractAdditionalInfo in class InhConstraint
Parameters:
xml - the StringBuffer to be parsed
Throws:
NonParsableException - if the StringBuffer could not be parsed

setFreqs

public void setFreqs(String[] array,
                     int start)
              throws IllegalArgumentException
This method is used to restore the values of a Gibbs Sampling run. The parameters are encoded in Strings. Each String is one frequency. The index start is used to begin at a specific position in the array.

Parameters:
array - the array of String chunks to be parsed
start - the start index
Throws:
IllegalArgumentException - if something is wrong with the frequencies
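
The sketch below illustrates how frequencies encoded as Strings might be restored from an array, beginning at a given start index. Which checks the real method performs is not stated here; the validation shown (values in [0,1] that sum to 1), the extra parameter n, and all names are assumptions made for the example.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class RestoreFreqsSketch {

    // Parses n frequencies from array[start .. start + n - 1].
    static double[] parseFreqs(String[] array, int start, int n) {
        if (start < 0 || start + n > array.length) {
            throw new IllegalArgumentException("not enough entries in the array");
        }
        double[] freq = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            // NumberFormatException is a subclass of IllegalArgumentException.
            freq[i] = Double.parseDouble(array[start + i].trim());
            if (Double.isNaN(freq[i]) || freq[i] < 0.0 || freq[i] > 1.0) {
                throw new IllegalArgumentException("not a frequency: " + array[start + i]);
            }
            sum += freq[i];
        }
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("frequencies do not sum to 1");
        }
        return freq;
    }

    public static void main(String[] args) {
        // The first chunk is some other field of the stored line and is skipped via start = 1.
        String[] chunks = {"7", "0.1", "0.2", "0.3", "0.4"};
        double[] freq = parseFreqs(chunks, 1, 4);
        System.out.println(freq.length); // prints 4
    }
}
```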

getDescription

public String getDescription(AlphabetContainer con,
                             int i)
Description copied from class: Constraint
Returns the decoded symbol for the encoded symbol i.

Overrides:
getDescription in class InhConstraint
Parameters:
con - the AlphabetContainer
i - the encoded symbol
Returns:
the decoded symbol for the encoded symbol i