de.jstacs.sequenceScores.statisticalModels.trainable.discrete
Class ConstraintManager

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.ConstraintManager

public class ConstraintManager
extends Object

This class provides static methods for creating, filling, and evaluating constraints.

Author:
Jens Keilwagen

Method Summary
static void computeFreqs(double ess, Constraint... constr)
          This method computes the (smoothed) relative frequencies.
static double countInhomogeneous(AlphabetContainer alphabets, int length, DataSet data, double[] weights, boolean reset, Constraint... constr)
          Fills the (inhomogeneous) Constraint constr with the weighted absolute frequencies of the DataSet data.
static MEMConstraint[] createConstraints(AbstractList<int[]> list, int[] alphabetLength)
          Creates the constraints of a model.
static MEMConstraint[] createConstraints(AbstractList<int[]> list, int[] alphabetLength, int[] indices)
          Creates the constraints for a part of a model.
static void drawFreqs(double ess, InhCondProb... constr)
          This method draws relative frequencies for the constraints in constr.
static ArrayList<int[]> extract(int length, String encoded)
          Extracts the constraints of a String and returns an ArrayList of int[].
static double getEntropy(Constraint c)
          Tries to compute the entropy of a Constraint as exactly as possible.
static double getLogGammaSum(Constraint c, double ess)
          Computes the sum of the differences between the logarithmic values of the prior knowledge and all counts of a Constraint c.
static void reduce(AbstractList<int[]> list)
          This method tries to find and remove subconstraints that are already fulfilled by a bigger one.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

computeFreqs

public static void computeFreqs(double ess,
                                Constraint... constr)
                         throws IllegalArgumentException
This method computes the (smoothed) relative frequencies. If the ess (equivalent sample size) is zero, no smoothing is done.

Parameters:
ess - the ess (equivalent sample size); if the ess is zero, MLE (maximum likelihood estimation) is used, otherwise MAPE (maximum a posteriori estimation)
constr - the constraints, should be filled with absolute frequencies
Throws:
IllegalArgumentException - if the ess is negative
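
As a rough illustration of the smoothing described above, the following sketch turns absolute counts into relative frequencies. It is not the Jstacs implementation: the class SmoothedFreqSketch, the method smoothedFreqs, and the choice to spread the ess evenly over all entries as pseudocounts are assumptions made for this example.

```java
// Hypothetical stand-alone sketch; it does not use the Jstacs API.
final class SmoothedFreqSketch {

    // Converts absolute counts into relative frequencies.
    // Assumption: the ess is split evenly over all entries as pseudocounts.
    // With ess == 0 this reduces to the plain maximum likelihood estimate.
    static double[] smoothedFreqs(double ess, double[] counts) {
        if (ess < 0) {
            throw new IllegalArgumentException("The ess must not be negative.");
        }
        double total = ess;
        for (double c : counts) {
            total += c;
        }
        double pseudo = ess / counts.length;
        double[] freqs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            // If all counts and the ess are zero, this yields NaN.
            freqs[i] = (counts[i] + pseudo) / total;
        }
        return freqs;
    }
}
```

With counts {3, 1}, an ess of 0 gives the unsmoothed frequencies, while a positive ess pulls both values toward 0.5.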

countInhomogeneous

public static double countInhomogeneous(AlphabetContainer alphabets,
                                        int length,
                                        DataSet data,
                                        double[] weights,
                                        boolean reset,
                                        Constraint... constr)
                                 throws WrongAlphabetException,
                                        IllegalArgumentException
Fills the (inhomogeneous) Constraint constr with the weighted absolute frequencies of the DataSet data. The relative frequencies are not computed; use computeFreqs(double, Constraint...) for this task.

Parameters:
alphabets - the alphabets over which the constraints are defined
length - the length for which the constraints are defined
data - the DataSet
weights - the weights for the Sequences in the DataSet:
  1. weights == null or
  2. weights.length == data.getNumberOfElements(), for all i: weights[i] >= 0
reset - indicates whether the constraints should be reset
constr - the (inhomogeneous) constraints to be filled
Returns:
the sum of the weights
Throws:
WrongAlphabetException - if the alphabet of the data is not correct
IllegalArgumentException - if the weights array has wrong dimension or the element length of the data is not correct
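
The idea of position-specific weighted counting can be sketched without the library as follows. The class WeightedCountSketch, the representation of sequences as int[] of symbol indices, and the treatment of weights == null as weight 1 per sequence are assumptions for this example, not the documented behaviour of countInhomogeneous.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; it does not use the Jstacs API.
final class WeightedCountSketch {

    // Adds the weight of each sequence to counts[position][symbol]
    // and returns the sum of the weights.
    // Assumption: weights == null means every sequence has weight 1.
    static double count(int[][] data, double[] weights, boolean reset, double[][] counts) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("The weights array has the wrong dimension.");
        }
        if (reset) {
            for (double[] row : counts) {
                Arrays.fill(row, 0.0);
            }
        }
        double sum = 0.0;
        for (int n = 0; n < data.length; n++) {
            if (data[n].length != counts.length) {
                throw new IllegalArgumentException("Sequence " + n + " has the wrong length.");
            }
            double w = (weights == null) ? 1.0 : weights[n];
            for (int pos = 0; pos < counts.length; pos++) {
                counts[pos][data[n][pos]] += w;
            }
            sum += w;
        }
        return sum;
    }
}
```

The resulting counts are absolute (weighted) frequencies; turning them into relative frequencies is a separate step.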

drawFreqs

public static void drawFreqs(double ess,
                             InhCondProb... constr)
                      throws IllegalArgumentException
This method draws relative frequencies for the constraints in constr.

Parameters:
ess - the ess (equivalent sample size, additional pseudocount)
constr - the constraints (can be filled with absolute frequencies)
Throws:
IllegalArgumentException - if the ess is negative
See Also:
InhCondProb.drawParameters(double)

extract

public static ArrayList<int[]> extract(int length,
                                       String encoded)
                                throws IllegalArgumentException
Extracts the constraints of a String and returns an ArrayList of int[]. This can be used to create constraints.

Parameters:
length - the sequence/model length
encoded - constraints encoded in a String
  • items are separated by ";"
  • short notation for sets of constraints, e.g. "m2sx"
  • or each constraint as list of positions e.g. "0,1,2;"
Returns:
an ArrayList of int[]
Throws:
IllegalArgumentException - if something is wrong with the length or the encoded String
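
To make the explicit notation concrete, the sketch below parses only the position-list form (for example "0,1,2;1,3"). The short notation such as "m2sx" is not handled, and the class ExtractSketch and the method parsePositions are hypothetical names for this illustration rather than part of the library.

```java
import java.util.ArrayList;

// Hypothetical stand-alone sketch; it does not use the Jstacs API.
final class ExtractSketch {

    // Parses items separated by ";" where each item is a comma-separated
    // list of positions. Only this explicit notation is supported here.
    static ArrayList<int[]> parsePositions(int length, String encoded) {
        ArrayList<int[]> result = new ArrayList<>();
        for (String rawItem : encoded.split(";")) {
            String item = rawItem.trim();
            if (item.isEmpty()) {
                continue; // tolerate empty items, e.g. from ";;"
            }
            String[] parts = item.split(",");
            int[] positions = new int[parts.length];
            for (int i = 0; i < parts.length; i++) {
                // NumberFormatException is a subclass of IllegalArgumentException.
                positions[i] = Integer.parseInt(parts[i].trim());
                if (positions[i] < 0 || positions[i] >= length) {
                    throw new IllegalArgumentException(
                            "Position " + positions[i] + " is outside 0.." + (length - 1));
                }
            }
            result.add(positions);
        }
        return result;
    }
}
```

A list produced this way has the same shape (one int[] of positions per constraint) as the lists accepted by reduce and createConstraints.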

getEntropy

public static double getEntropy(Constraint c)
Tries to compute the entropy of a Constraint as exactly as possible.

Parameters:
c - the constraint
Returns:
the entropy of the constraint, computed as exactly as possible
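
For orientation, the entropy of a discrete distribution can be computed as in the following sketch. It operates on a plain double[] of relative frequencies instead of a Constraint, uses the natural logarithm (the base is not stated on this page), and does nothing special for numerical exactness; the class EntropySketch is a hypothetical name.

```java
// Hypothetical stand-alone sketch; it does not use the Jstacs API.
final class EntropySketch {

    // Computes -sum_i p_i * ln(p_i) for a probability vector.
    // Entries equal to zero contribute nothing (0 * ln 0 is taken as 0).
    static double entropy(double[] probs) {
        double h = 0.0;
        for (double p : probs) {
            if (p > 0.0) {
                h -= p * Math.log(p);
            }
        }
        return h;
    }
}
```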

getLogGammaSum

public static double getLogGammaSum(Constraint c,
                                    double ess)
Computes the sum of the differences between the logarithmic values of the prior knowledge and all counts of a Constraint c.
$\sum_i \left[\log \Gamma(\alpha_i) - \log \Gamma(\alpha_i + N_i)\right]$

Parameters:
c - the constraint
ess - the ess (equivalent sample size)
Returns:
the sum of the differences
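
As a small worked instance of the formula, assume (purely for illustration) two entries with hyperparameters $\alpha = (1, 1)$ and counts $N = (2, 1)$; how the $\alpha_i$ are derived from the ess is not specified on this page. Using $\Gamma(1) = \Gamma(2) = 1$ and $\Gamma(3) = 2$:
$\left[\log \Gamma(1) - \log \Gamma(3)\right] + \left[\log \Gamma(1) - \log \Gamma(2)\right] = -\log 2 \approx -0.693$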

reduce

public static void reduce(AbstractList<int[]> list)
This method tries to find and remove subconstraints that are already fulfilled by a bigger one. The method only looks at the positions, so it is recommended to use this method before any learning step.

Parameters:
list - the list of all constraints
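
One way to picture the reduction is as removing every position set that is contained in another one. The sketch below does exactly that on a list of int[]; the class ReduceSketch and its methods are hypothetical, and keeping one copy of exact duplicates is a choice made for this example rather than documented behaviour.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; it does not use the Jstacs API.
final class ReduceSketch {

    // Removes, in place, every entry whose positions are all contained
    // in another entry of the list. Of exact duplicates, one copy is kept.
    static void removeSubsets(List<int[]> list) {
        // Iterate backwards so that removals do not shift unvisited indices.
        for (int i = list.size() - 1; i >= 0; i--) {
            for (int j = 0; j < list.size(); j++) {
                if (i != j && isSubset(list.get(i), list.get(j))) {
                    list.remove(i);
                    break;
                }
            }
        }
    }

    // Returns true if every position in small also occurs in big.
    static boolean isSubset(int[] small, int[] big) {
        Set<Integer> positions = new HashSet<>();
        for (int b : big) {
            positions.add(b);
        }
        for (int s : small) {
            if (!positions.contains(s)) {
                return false;
            }
        }
        return true;
    }
}
```

For the list {0,1,2}, {1,2}, {3}, {0,3}, this keeps {0,1,2} and {0,3}, since {1,2} and {3} are each covered by a larger entry.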

createConstraints

public static MEMConstraint[] createConstraints(AbstractList<int[]> list,
                                                int[] alphabetLength,
                                                int[] indices)
Creates the constraints for a part of a model.

Parameters:
list - the list of all cliques, each clique is used for one constraint
alphabetLength - the array of alphabet lengths for each position
indices - the positions used in this part of the model
Returns:
the array of constraints
See Also:
extract(int, String), createConstraints(AbstractList, int[])

createConstraints

public static MEMConstraint[] createConstraints(AbstractList<int[]> list,
                                                int[] alphabetLength)
Creates the constraints of a model.

Parameters:
list - the list of all cliques, each clique is used for one constraint
alphabetLength - the array of alphabet lengths for each position
Returns:
the array of constraints
See Also:
extract(int, String), createConstraints(AbstractList, int[], int[])