de.jstacs.sequenceScores.statisticalModels.trainable.discrete
Class ConstraintManager

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.ConstraintManager

public class ConstraintManager
extends Object

This class provides static methods for creating and manipulating constraints.

Author:
Jens Keilwagen

Nested Class Summary
static class ConstraintManager.Decomposition
          This enum defines the different possible types of decomposition of a model.
 
Method Summary
static void computeFreqs(double ess, Constraint... constr)
          This method computes the (smoothed) relative frequencies.
static double countInhomogeneous(AlphabetContainer alphabets, int length, DataSet data, double[] weights, boolean reset, Constraint... constr)
          Fills the (inhomogeneous) constr with the weighted absolute frequencies of the DataSet data; the relative frequencies are not computed.
static MEMConstraint[] createConstraints(AbstractList<int[]> list, int[] alphabetLength)
          Creates the constraints of a model.
static MEMConstraint[] createConstraints(AbstractList<int[]> list, int[] alphabetLength, int[] indices)
          Creates the constraints for a part of a model.
static MEM[] disconnect(AbstractList<int[]> list, int[] alphabetLength, ConstraintManager.Decomposition decomposition)
          This method tries to disconnect the constraints and create the models.
static void drawFreqs(double ess, InhCondProb... constr)
          This method draws relative frequencies.
static ArrayList<int[]> extract(int length, String encoded)
          Extracts the constraints from a String and returns them as an ArrayList of int[].
static double getEntropy(Constraint c)
          Tries to compute the entropy as exactly as possible.
static double getLogGammaSum(Constraint c, double ess)
          Computes the sum of differences of the logarithmic values of the prior knowledge and all counts.
static void reduce(AbstractList<int[]> list)
          This method tries to find and remove subconstraints that are already fulfilled by a bigger one.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

computeFreqs

public static void computeFreqs(double ess,
                                Constraint... constr)
                         throws IllegalArgumentException
This method computes the (smoothed) relative frequencies. If ess=0, no smoothing is done.

Parameters:
ess - the ESS; if the ESS is zero, the MLE is computed, otherwise the MAPE
constr - the constraints, which should be filled with absolute frequencies
Throws:
IllegalArgumentException - if the ess is negative
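
The effect of the ess parameter can be illustrated with a minimal, self-contained sketch. It does not use the Jstacs Constraint class; the class SmoothedFreqsSketch and its method are hypothetical, and the sketch assumes that the ESS is spread uniformly over all outcomes as pseudocounts, which this page does not state.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Jstacs: smoothed relative frequencies from counts. */
final class SmoothedFreqsSketch {

    /**
     * Turns absolute counts into relative frequencies.
     * Assumption: the ESS is distributed uniformly over the outcomes.
     * If ess is zero and all counts are zero, the entries are NaN.
     */
    static double[] smoothedFreqs(double ess, double[] counts) {
        if (ess < 0) {
            throw new IllegalArgumentException("The ess must be non-negative.");
        }
        double pseudo = ess / counts.length; // uniform share of the ESS per outcome
        double total = ess;
        for (double c : counts) {
            total += c;
        }
        double[] freqs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            freqs[i] = (counts[i] + pseudo) / total;
        }
        return freqs;
    }

    public static void main(String[] args) {
        double[] counts = {6, 2, 1, 1};
        // ess = 0: plain relative frequencies, no smoothing
        System.out.println(Arrays.toString(smoothedFreqs(0, counts)));
        // ess = 4: one pseudocount per outcome under the uniform assumption
        System.out.println(Arrays.toString(smoothedFreqs(4, counts)));
    }
}
```

With ess=0 the result is the plain ratio count/total; a positive ess pulls all frequencies towards the uniform distribution.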

countInhomogeneous

public static double countInhomogeneous(AlphabetContainer alphabets,
                                        int length,
                                        DataSet data,
                                        double[] weights,
                                        boolean reset,
                                        Constraint... constr)
                                 throws WrongAlphabetException,
                                        IllegalArgumentException
Fills the (inhomogeneous) constr with the weighted absolute frequencies of the DataSet data. The relative frequencies are not computed.

Parameters:
alphabets - the alphabets over which the constraints are defined
length - the length for which the constraints are defined
data - the sequences
weights - the weights for the sequences; either
  1. weights==null, or
  2. weights.length == data.getNumberOfElements() and weights[i]>=0 for all i
reset - whether the constraints should be reset
constr - constraints to fill
Returns:
the sum of the weights
Throws:
WrongAlphabetException - if the alphabet of the data is not correct
IllegalArgumentException - if the weights array has wrong dimension or the element length of the data is not correct
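
The contract described above (weighted counting, optional reset, sum of weights as return value) can be sketched without the Jstacs types. In the following hypothetical class, sequences are plain int[] arrays of symbol indices and the constraints are reduced to one count table per position. Treating weights==null as weight 1 for every sequence is an assumption of this sketch.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Jstacs: position-specific weighted counting. */
final class WeightedCountSketch {

    /**
     * Adds the weight of each sequence to counts[position][symbol]
     * and returns the sum of the weights.
     * Assumption: weights == null means weight 1.0 for every sequence.
     */
    static double count(int[][] data, double[] weights, boolean reset, double[][] counts) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("The weights array has the wrong dimension.");
        }
        if (reset) {
            for (double[] row : counts) {
                Arrays.fill(row, 0.0); // discard counts from earlier calls
            }
        }
        double sum = 0;
        for (int n = 0; n < data.length; n++) {
            if (data[n].length != counts.length) {
                throw new IllegalArgumentException("Sequence " + n + " has the wrong length.");
            }
            double w = (weights == null) ? 1.0 : weights[n];
            if (w < 0) {
                throw new IllegalArgumentException("Weights must be non-negative.");
            }
            for (int pos = 0; pos < counts.length; pos++) {
                counts[pos][data[n][pos]] += w;
            }
            sum += w;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] data = {{0, 1}, {0, 0}, {1, 1}};
        double[][] counts = new double[2][2];
        double sum = count(data, new double[]{1.0, 0.5, 2.0}, true, counts);
        System.out.println(sum + " " + Arrays.deepToString(counts));
    }
}
```

Calling the method again with reset=false accumulates on top of the existing counts, which is how several data sets could be combined.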

drawFreqs

public static void drawFreqs(double ess,
                             InhCondProb... constr)
                      throws IllegalArgumentException
This method draws relative frequencies.

Parameters:
ess - the ESS (additional pseudocount)
constr - the constraints (can be filled with absolute frequencies)
Throws:
IllegalArgumentException - if the ess is negative
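
This page does not say from which distribution the frequencies are drawn. One common choice, shown here purely as an assumption, is a Dirichlet distribution whose hyperparameters are the counts plus a uniform share of the ESS. The class DrawFreqsSketch is hypothetical; it samples Gamma variates with the Marsaglia-Tsang method and normalizes them.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper, not part of Jstacs: draws frequencies from a Dirichlet. */
final class DrawFreqsSketch {

    /** Draws from Gamma(shape, 1) using the Marsaglia-Tsang method. */
    static double sampleGamma(double shape, Random rnd) {
        if (shape < 1.0) {
            // Boost small shapes: Gamma(a) = Gamma(a + 1) * U^(1/a)
            double u = rnd.nextDouble();
            return sampleGamma(shape + 1.0, rnd) * Math.pow(u, 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rnd.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0) {
                continue; // proposal outside the support, try again
            }
            v = v * v * v;
            double u = rnd.nextDouble();
            if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) {
                return d * v;
            }
        }
    }

    /**
     * Assumption: hyperparameter i is counts[i] + ess / counts.length.
     * The result is non-negative and sums to one.
     */
    static double[] drawFreqs(double ess, double[] counts, Random rnd) {
        if (ess < 0) {
            throw new IllegalArgumentException("The ess must be non-negative.");
        }
        double[] freqs = new double[counts.length];
        double total = 0;
        for (int i = 0; i < counts.length; i++) {
            freqs[i] = sampleGamma(counts[i] + ess / counts.length, rnd);
            total += freqs[i];
        }
        if (total <= 0) {
            throw new IllegalArgumentException("All hyperparameters are zero.");
        }
        for (int i = 0; i < freqs.length; i++) {
            freqs[i] /= total;
        }
        return freqs;
    }

    public static void main(String[] args) {
        double[] counts = {6, 2, 1, 1};
        System.out.println(Arrays.toString(drawFreqs(4.0, counts, new Random(42))));
    }
}
```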

extract

public static ArrayList<int[]> extract(int length,
                                       String encoded)
                                throws IllegalArgumentException
Extracts the constraints from a String and returns them as an ArrayList of int[]. These arrays can be used to create constraints.

Parameters:
length - the sequence length, i.e., the model length
encoded - constraints encoded in a String
  • items are separated by ";"
  • short notation for sets of constraints, e.g. ":m2sx"
  • or each constraint as e.g. "i,j,k;"
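
The explicit form "i,j,k;" can be parsed as in the following sketch. The class ExtractSketch is hypothetical and covers only the explicit form; the short notation (items starting with ":") is rejected because its grammar is not described here. That positions are 0-based and must be smaller than length is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;

/** Hypothetical helper, not part of Jstacs: parses explicit "i,j,k;" items only. */
final class ExtractSketch {

    /**
     * Splits the encoded String at ";" and each item at ",".
     * Assumption: positions are 0-based and must lie in [0, length).
     */
    static ArrayList<int[]> parseExplicit(int length, String encoded) {
        ArrayList<int[]> result = new ArrayList<>();
        for (String item : encoded.split(";")) {
            item = item.trim();
            if (item.isEmpty()) {
                continue; // tolerate empty items such as a doubled ";"
            }
            if (item.startsWith(":")) {
                throw new IllegalArgumentException("Short notation is not handled by this sketch: " + item);
            }
            String[] parts = item.split(",");
            int[] positions = new int[parts.length];
            for (int i = 0; i < parts.length; i++) {
                // NumberFormatException is a subclass of IllegalArgumentException
                int p = Integer.parseInt(parts[i].trim());
                if (p < 0 || p >= length) {
                    throw new IllegalArgumentException("Position out of range: " + p);
                }
                positions[i] = p;
            }
            result.add(positions);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] positions : parseExplicit(4, "0,1;1,2,3;")) {
            System.out.println(Arrays.toString(positions));
        }
    }
}
```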

getEntropy

public static double getEntropy(Constraint c)
Tries to compute the entropy as exactly as possible.

Parameters:
c - the constraint
Returns:
the entropy of the constraint
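
For orientation, the Shannon entropy of a frequency vector can be computed as in the sketch below. The class EntropySketch is hypothetical and works on a plain double[] instead of a Constraint. Using the natural logarithm (entropy in nats) is an assumption, since the unit is not stated on this page.

```java
/** Hypothetical helper, not part of Jstacs: Shannon entropy of a frequency vector. */
final class EntropySketch {

    /**
     * Computes -sum_i p_i * ln(p_i).
     * Assumption: natural logarithm, so the result is in nats.
     * Zero frequencies contribute nothing, following the convention 0 * ln 0 = 0.
     */
    static double entropy(double[] freqs) {
        double h = 0;
        for (double p : freqs) {
            if (p > 0) {
                h -= p * Math.log(p);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(entropy(new double[]{0.25, 0.25, 0.25, 0.25}));
        System.out.println(entropy(new double[]{0.7, 0.1, 0.1, 0.1}));
    }
}
```

The uniform distribution over k outcomes gives the maximum value ln(k); a distribution concentrated on one outcome gives zero.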

getLogGammaSum

public static double getLogGammaSum(Constraint c,
                                    double ess)
Computes the sum of differences of the logarithmic values of the prior knowledge and all counts.

In LaTeX notation:
\sum_i [\ln \Gamma(\alpha_i) - \ln \Gamma(\alpha_i + N_i)]

Parameters:
c - the constraint
ess - the ESS
Returns:
the sum
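
The formula can be evaluated on plain arrays as in the following sketch. The class LogGammaSumSketch is hypothetical. Since the Java standard library has no log-gamma function, the sketch uses a Lanczos approximation, and it assumes \alpha_i = ess / k for k outcomes, which this page does not state.

```java
/** Hypothetical helper, not part of Jstacs: evaluates the log-gamma sum on plain arrays. */
final class LogGammaSumSketch {

    // Coefficients of a widely used Lanczos approximation (g = 7, 9 terms)
    private static final double[] LANCZOS = {
        0.99999999999980993, 676.5203681218851, -1259.1392167224028,
        771.32342877765313, -176.61502916214059, 12.507343278686905,
        -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7
    };

    /** Approximates ln Gamma(x) for x > 0. */
    static double logGamma(double x) {
        if (x < 0.5) {
            // Reflection formula: Gamma(x) * Gamma(1 - x) = pi / sin(pi * x)
            return Math.log(Math.PI / Math.sin(Math.PI * x)) - logGamma(1.0 - x);
        }
        x -= 1.0;
        double a = LANCZOS[0];
        double t = x + 7.5;
        for (int i = 1; i < LANCZOS.length; i++) {
            a += LANCZOS[i] / (x + i);
        }
        return 0.5 * Math.log(2 * Math.PI) + (x + 0.5) * Math.log(t) - t + Math.log(a);
    }

    /**
     * Computes sum_i [ln Gamma(alpha_i) - ln Gamma(alpha_i + N_i)].
     * Assumption: alpha_i = ess / counts.length for every outcome i.
     */
    static double logGammaSum(double[] counts, double ess) {
        if (ess <= 0) {
            throw new IllegalArgumentException("This sketch requires a positive ess.");
        }
        double alpha = ess / counts.length;
        double sum = 0;
        for (double n : counts) {
            sum += logGamma(alpha) - logGamma(alpha + n);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(logGammaSum(new double[]{2, 3}, 2.0));
    }
}
```

For integer counts and \alpha_i = 1, each term reduces to -\ln(N_i!), which gives a simple way to check the result by hand.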

reduce

public static void reduce(AbstractList<int[]> list)
This method tries to find and remove subconstraints that are already covered by a larger one. The method only compares the positions, so it is recommended to call it before any learning step.

Parameters:
list - the list of all constraints
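
A minimal sketch of such a reduction: an entry is removed if all of its positions also occur in another entry of the list. The class ReduceSketch is hypothetical, and the handling of exact duplicates (the later copy is dropped) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Jstacs: removes position sets covered by others. */
final class ReduceSketch {

    /** Returns true if every position in small also occurs in big. */
    static boolean isSubset(int[] small, int[] big) {
        for (int s : small) {
            boolean found = false;
            for (int b : big) {
                if (b == s) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    /**
     * Removes, in place, every entry whose positions are contained in another entry.
     * Assumption: of two entries with identical positions, the later one is removed.
     */
    static void reduce(List<int[]> list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            int[] candidate = list.get(i);
            for (int j = 0; j < list.size(); j++) {
                if (i == j) {
                    continue;
                }
                int[] other = list.get(j);
                boolean strictlySmaller = candidate.length < other.length;
                if (isSubset(candidate, other) && (strictlySmaller || j < i)) {
                    list.remove(i); // removes by index
                    break;
                }
            }
        }
    }

    public static void main(String[] args) {
        List<int[]> list = new ArrayList<>(Arrays.asList(
            new int[]{0}, new int[]{0, 1}, new int[]{1, 2}, new int[]{2}, new int[]{0, 1}));
        reduce(list);
        for (int[] positions : list) {
            System.out.println(Arrays.toString(positions));
        }
    }
}
```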

createConstraints

public static MEMConstraint[] createConstraints(AbstractList<int[]> list,
                                                int[] alphabetLength,
                                                int[] indices)
Creates the constraints for a part of a model.

Parameters:
list - the list of all cliques, each clique is used for one constraint
alphabetLength - the array of alphabet lengths, one for each position
indices - the positions used in this part of the model
Returns:
the array of constraints

createConstraints

public static MEMConstraint[] createConstraints(AbstractList<int[]> list,
                                                int[] alphabetLength)
Creates the constraints of a model.

Parameters:
list - the list of all cliques, each clique is used for one constraint
alphabetLength - the array of alphabet lengths, one for each position
Returns:
the array of constraints

disconnect

public static MEM[] disconnect(AbstractList<int[]> list,
                               int[] alphabetLength,
                               ConstraintManager.Decomposition decomposition)
This method tries to disconnect the constraints and creates the corresponding models. Training the resulting smaller models can be much faster.

Parameters:
list - the list of all constraints
alphabetLength - the length of the alphabets at the specific positions
decomposition - the choice how to disconnect the constraints
Returns:
an array of MEM
See Also:
ConstraintManager.Decomposition.DECOMPOSE_LESS_CONNECTED, ConstraintManager.Decomposition.DECOMPOSE_UNCONNECTED, ConstraintManager.Decomposition.DECOMPOSE_NOTHING
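
The idea of disconnecting constraints can be illustrated by grouping cliques that share positions into connected components. The following sketch is hypothetical and is not the Jstacs implementation. It shows one plausible reading of splitting unconnected parts; the exact behaviour of the three Decomposition options is not described on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Jstacs: groups cliques into connected components. */
final class DisconnectSketch {

    /** Finds the representative of x, shortening paths along the way. */
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving
            x = parent[x];
        }
        return x;
    }

    /**
     * Two cliques belong to the same component if they are linked
     * through shared positions, directly or via other cliques.
     */
    static List<List<int[]>> components(List<int[]> cliques, int numPositions) {
        int[] parent = new int[numPositions];
        for (int i = 0; i < numPositions; i++) {
            parent[i] = i;
        }
        // Merge all positions that occur together in one clique
        for (int[] clique : cliques) {
            for (int k = 1; k < clique.length; k++) {
                parent[find(parent, clique[0])] = find(parent, clique[k]);
            }
        }
        // Collect the cliques by the representative of their first position
        Map<Integer, List<int[]>> groups = new LinkedHashMap<>();
        for (int[] clique : cliques) {
            if (clique.length == 0) {
                continue;
            }
            groups.computeIfAbsent(find(parent, clique[0]), r -> new ArrayList<>()).add(clique);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        List<int[]> cliques = Arrays.asList(
            new int[]{0, 1}, new int[]{1, 2}, new int[]{3}, new int[]{4, 5});
        for (List<int[]> component : components(cliques, 6)) {
            StringBuilder sb = new StringBuilder();
            for (int[] clique : component) {
                sb.append(Arrays.toString(clique)).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

Each component could then be trained as an independent model, which is where a speed-up during training would come from.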