de.jstacs.sequenceScores.statisticalModels.trainable.hmm
Class HMMFactory

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.hmm.HMMFactory

public class HMMFactory
extends Object

This class provides static factory methods for creating some frequently used HMMs.

Author:
Jan Grau, Jens Keilwagen

Nested Class Summary
static class HMMFactory.HMMType
          This enum defines some standard architectures of profile HMMs.
static class HMMFactory.PseudoTransitionElement
          This class is used as a placeholder for a later BasicHigherOrderTransition.AbstractTransitionElement.
 
Constructor Summary
HMMFactory()
           
 
Method Summary
static AbstractHMM createErgodicHMM(HMMTrainingParameterSet pars, int order, double ess, double selfTranistionPart, double expectedSequenceLength, Emission... emission)
          This method creates an ergodic, i.e. a completely connected, HMM using the given emissions.
static AbstractHMM createProfileHMM(MaxHMMTrainingParameterSet trainingParameterSet, double[][] initFromTo, boolean likelihood, int order, int numLayers, AlphabetContainer con, double ess, boolean conditionalMain, boolean closeCircle, double[][] conditionInitProbs, boolean insertUniform)
          Creates a new profile HMM for a given architecture and number of layers.
static AbstractHMM createProfileHMM(MaxHMMTrainingParameterSet trainingParameterSet, double[][] initFromTo, boolean likelihood, int order, int numLayers, AlphabetContainer con, double ess, boolean conditionalMain, int joiningStates, double[][] conditionInitProbs, boolean insertUniform)
          Creates a new profile HMM for a given architecture and number of layers.
static AbstractHMM createProfileHMM(MaxHMMTrainingParameterSet trainingParameterSet, HMMFactory.HMMType type, boolean likelihood, int order, int numLayers, AlphabetContainer con, double ess, boolean conditionalMain, boolean closeCircle, double[][] conditionInitProbs)
          Creates a new profile HMM for a given architecture and number of layers.
static AbstractHMM createPseudoErgodicHMM(HMMTrainingParameterSet pars, double ess, double selfTranistionPart, double finalTranistionPart, AlphabetContainer con, int numStates, boolean insertUniform)
          Creates an HMM with numStates+1 states, where the numStates emitting states form a clique and each of them is connected to the absorbing silent final state.
static AbstractHMM createSunflowerHMM(HMMTrainingParameterSet pars, AlphabetContainer con, double ess, int expectedSequenceLength, boolean startCentral, int... motifLength)
          This method creates a first-order sunflower HMM.
static AbstractHMM createSunflowerHMM(HMMTrainingParameterSet pars, AlphabetContainer con, double ess, int expectedSequenceLength, boolean startCentral, PhyloTree[] t, double[] motifProb, int[] motifLength)
          This method creates a first-order sunflower HMM allowing phylogenetic emissions.
static TransitionElement[] createTransition(double[][] hyperParams, ArrayList<HMMFactory.PseudoTransitionElement> list)
          Creates the real TransitionElements that can be used to create the HMM.
static HashMap<String,String> getHashMap()
          This method returns a HashMap that can be used in AbstractHMM.getGraphvizRepresentation(java.text.NumberFormat, de.jstacs.data.DataSet, double[], HashMap) to create a Graphviz representation of the AbstractHMM.
static Pair<double[][],double[]> propagateESS(double ess, ArrayList<HMMFactory.PseudoTransitionElement> list)
          Propagates the ess for an HMM with absorbing states.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HMMFactory

public HMMFactory()
Method Detail

createErgodicHMM

public static AbstractHMM createErgodicHMM(HMMTrainingParameterSet pars,
                                           int order,
                                           double ess,
                                           double selfTranistionPart,
                                           double expectedSequenceLength,
                                           Emission... emission)
                                    throws Exception
This method creates an ergodic, i.e. a completely connected, HMM using the given emissions.

Parameters:
pars - the parameters determining the training procedure of the HMM
order - the Markov order of the HMM
ess - the equivalent sample size (ess) to be used for the HMM
selfTranistionPart - the a-priori probability of any self transition
expectedSequenceLength - the expected length of sequences to be modeled
emission - the emissions of the states
Returns:
an ergodic HMM
Throws:
Exception - if the HMM cannot be created properly
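
The following sketch illustrates one plausible reading of selfTranistionPart in a completely connected HMM: each state keeps that share for its self transition and splits the remainder evenly among the other states. The class ErgodicPriorSketch and its method are hypothetical and not part of Jstacs; how the library actually distributes the remaining mass, and how it uses ess and expectedSequenceLength, is not stated here.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Jstacs: a-priori transitions of a fully connected HMM. */
final class ErgodicPriorSketch {

    /**
     * Builds a row-stochastic matrix in which every state keeps probability
     * selfPart for staying where it is and shares the rest equally among the others.
     * The even split is an assumption made for this illustration.
     */
    static double[][] transitionMatrix(int numStates, double selfPart) {
        if (numStates < 2 || selfPart < 0 || selfPart > 1) {
            throw new IllegalArgumentException("need numStates >= 2 and 0 <= selfPart <= 1");
        }
        double[][] a = new double[numStates][numStates];
        double other = (1.0 - selfPart) / (numStates - 1);
        for (int i = 0; i < numStates; i++) {
            Arrays.fill(a[i], other); // every transition to another state
            a[i][i] = selfPart;       // the self transition
        }
        return a;
    }
}
```

Every row of the returned matrix sums to 1, so it can be read as the mean of a prior over the transitions leaving one state.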

createPseudoErgodicHMM

public static AbstractHMM createPseudoErgodicHMM(HMMTrainingParameterSet pars,
                                                 double ess,
                                                 double selfTranistionPart,
                                                 double finalTranistionPart,
                                                 AlphabetContainer con,
                                                 int numStates,
                                                 boolean insertUniform)
                                          throws Exception
Creates an HMM with numStates+1 states, where the numStates emitting states form a clique and each of them is connected to the absorbing silent final state. Such an HMM models the length of the input sequences, i.e., summing the likelihood over all possible input sequences (of all lengths) gives 1. An ergodic HMM, in contrast, does not model the input length; there, the likelihoods of all sequences of any one fixed length sum to 1.

Parameters:
pars - the parameters of the algorithm for learning the model parameters
ess - the equivalent sample size, is propagated between states to obtain consistent hyper-parameters for all parameters
selfTranistionPart - the a-priori probability of a self transition for each emitting state
finalTranistionPart - the a-priori probability of the transition to the final state from each emitting state
con - the AlphabetContainer of the HMM
numStates - the number of emitting states
insertUniform - if true, the emitting states use UniformEmissions
Returns:
an HMM with numStates+1 states, where the numStates emitting states form a clique and each of them is connected to the absorbing silent final state
Throws:
Exception - if the HMM could not be created properly
See Also:
propagateESS(double, ArrayList)
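
The statement that the likelihoods of sequences of all lengths sum to 1 can be checked numerically under a simplifying assumption: if every emitting state moves to the final state with the same probability finalPart after each emission, the sequence length follows a geometric distribution regardless of the transitions inside the clique. The class LengthMassSketch below is a hypothetical illustration and not Jstacs code.

```java
/** Hypothetical illustration, not Jstacs code: length distribution with an absorbing final state. */
final class LengthMassSketch {

    /** Probability that a path emits exactly n symbols (n >= 1) before entering the final state. */
    static double lengthProbability(int n, double finalPart) {
        // n - 1 times "stay among the emitting states", then one step into the final state
        return Math.pow(1.0 - finalPart, n - 1) * finalPart;
    }

    /** Total probability mass of all lengths from 1 to maxLength. */
    static double cumulativeMass(int maxLength, double finalPart) {
        double sum = 0.0;
        for (int n = 1; n <= maxLength; n++) {
            sum += lengthProbability(n, finalPart);
        }
        return sum;
    }
}
```

As maxLength grows, cumulativeMass approaches 1 for any finalPart in (0, 1], which is the sense in which such an HMM models the sequence length.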

createSunflowerHMM

public static AbstractHMM createSunflowerHMM(HMMTrainingParameterSet pars,
                                             AlphabetContainer con,
                                             double ess,
                                             int expectedSequenceLength,
                                             boolean startCentral,
                                             int... motifLength)
                                      throws Exception
This method creates a first-order sunflower HMM. The current implementation does not set any hyper-parameters for the prior.

Parameters:
pars - the parameters determining the training procedure of the HMM
con - the AlphabetContainer of the HMM
ess - the equivalent sample size (ess) of this model
expectedSequenceLength - the expected sequence length to be modeled; this parameter is used to determine the prior
startCentral - a switch for deciding between starting in the central state or in all states
motifLength - the length of the motifs building the petals of the sunflower
Returns:
a first-order sunflower HMM
Throws:
Exception - if the HMM cannot be created properly
See Also:
createSunflowerHMM(HMMTrainingParameterSet, AlphabetContainer, double, int, boolean, PhyloTree[], double[], int[])

createSunflowerHMM

public static AbstractHMM createSunflowerHMM(HMMTrainingParameterSet pars,
                                             AlphabetContainer con,
                                             double ess,
                                             int expectedSequenceLength,
                                             boolean startCentral,
                                             PhyloTree[] t,
                                             double[] motifProb,
                                             int[] motifLength)
                                      throws Exception
This method creates a first-order sunflower HMM allowing phylogenetic emissions. The current implementation does not set any hyper-parameters for the prior.

Parameters:
pars - the parameters determining the training procedure of the HMM
con - the AlphabetContainer of the HMM
ess - the equivalent sample size (ess) of this model
expectedSequenceLength - the expected sequence length to be modeled; this parameter is used to determine the prior
startCentral - a switch for deciding between starting in the central state or in all states
t - an array of length two that contains a PhyloTree for the background and one for the motif; may be null, in which case a normal sunflower HMM is returned
motifProb - the a-priori probabilities for each motif, i.e., the a-priori probabilities for the edges from the central node to the first motif states
motifLength - the length of the motifs building the petals of the sunflower
Returns:
a first-order sunflower HMM
Throws:
Exception - if the HMM cannot be created properly
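
To make the sunflower shape concrete, the sketch below enumerates the edges of such a topology from the motif lengths: a central state, and one chain of states (a petal) per motif that starts and ends at the central state. The class SunflowerTopologySketch is hypothetical; the self loop on the central state and the edge from the last motif state back to the center are assumptions of this illustration, and the exact topology built by Jstacs may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration, not Jstacs code: edges of a sunflower-shaped state graph. */
final class SunflowerTopologySketch {

    /** Returns directed edges {from, to}; state 0 is the central state. */
    static List<int[]> edges(int... motifLength) {
        List<int[]> edges = new ArrayList<>();
        edges.add(new int[] {0, 0});                  // assumed self loop of the central state
        int next = 1;                                 // index of the next unused state
        for (int len : motifLength) {
            if (len < 1) {
                throw new IllegalArgumentException("motif length must be positive");
            }
            edges.add(new int[] {0, next});           // central state -> first motif state
            for (int p = 1; p < len; p++) {
                edges.add(new int[] {next + p - 1, next + p}); // along the petal
            }
            edges.add(new int[] {next + len - 1, 0}); // last motif state -> central state
            next += len;
        }
        return edges;
    }

    /** One central state plus one state per motif position. */
    static int numberOfStates(int... motifLength) {
        int n = 1;
        for (int len : motifLength) {
            n += len;
        }
        return n;
    }
}
```

For motif lengths 3 and 2 this yields 6 states, and the edges leaving state 0 towards the first motif states are the ones whose a-priori probabilities motifProb describes.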

createProfileHMM

public static AbstractHMM createProfileHMM(MaxHMMTrainingParameterSet trainingParameterSet,
                                           HMMFactory.HMMType type,
                                           boolean likelihood,
                                           int order,
                                           int numLayers,
                                           AlphabetContainer con,
                                           double ess,
                                           boolean conditionalMain,
                                           boolean closeCircle,
                                           double[][] conditionInitProbs)
                                    throws Exception
Creates a new profile HMM for a given architecture and number of layers.

Parameters:
trainingParameterSet - the parameters of the algorithm for learning the model parameters
type - the type of the HMM, i.e., its architecture
likelihood - if true, the likelihood is used as the score of a sequence; otherwise, the probability of the Viterbi path is used
order - the order of the HMM, i.e., the number of previous states that are considered for a transition probability
numLayers - the number of layers of the profile HMM
con - the alphabet of the profile HMM
ess - the equivalent sample size, is propagated between states to obtain consistent hyper-parameters for all parameters
conditionalMain - if true, the match states have ReferenceSequenceDiscreteEmissions, and DiscreteEmissions otherwise
closeCircle - if true, the circle from the end state to the initial state is closed, i.e., the HMM can be traversed several times
conditionInitProbs - the hyper-parameters for initializing the match states if conditionalMain is true; may be null, in which case the hyper-parameters of the prior are used
Returns:
the profile HMM
Throws:
Exception - if the profile HMM could not be created
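
The difference between the two scoring modes selected by the likelihood switch can be sketched on a plain first-order HMM without silent states: the likelihood sums over all state paths (forward algorithm), whereas the Viterbi score keeps only the most probable path. The class ScoreModeSketch and its array-based model representation are hypothetical and much simpler than a profile HMM in Jstacs.

```java
/** Hypothetical illustration, not Jstacs code: likelihood versus Viterbi path probability. */
final class ScoreModeSketch {

    /**
     * init[i] is the start probability of state i, trans[i][j] the transition probability
     * from i to j, and emit[i][s] the probability that state i emits symbol s.
     * seq must contain at least one symbol.
     * If likelihood is true, all paths are summed; otherwise only the best path is kept.
     */
    static double score(double[] init, double[][] trans, double[][] emit,
                        int[] seq, boolean likelihood) {
        int n = init.length;
        double[] cur = new double[n];
        for (int i = 0; i < n; i++) {
            cur[i] = init[i] * emit[i][seq[0]];
        }
        for (int t = 1; t < seq.length; t++) {
            double[] next = new double[n];
            for (int j = 0; j < n; j++) {
                double acc = 0.0;
                for (int i = 0; i < n; i++) {
                    double v = cur[i] * trans[i][j];
                    acc = likelihood ? acc + v : Math.max(acc, v); // sum or maximum over predecessors
                }
                next[j] = acc * emit[j][seq[t]];
            }
            cur = next;
        }
        double result = 0.0;
        for (double v : cur) {
            result = likelihood ? result + v : Math.max(result, v);
        }
        return result;
    }
}
```

Because the Viterbi path is one of the paths summed by the forward algorithm, the Viterbi score never exceeds the likelihood.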

createProfileHMM

public static AbstractHMM createProfileHMM(MaxHMMTrainingParameterSet trainingParameterSet,
                                           double[][] initFromTo,
                                           boolean likelihood,
                                           int order,
                                           int numLayers,
                                           AlphabetContainer con,
                                           double ess,
                                           boolean conditionalMain,
                                           boolean closeCircle,
                                           double[][] conditionInitProbs,
                                           boolean insertUniform)
                                    throws Exception
Creates a new profile HMM for a given architecture and number of layers.

Parameters:
trainingParameterSet - the parameters of the algorithm for learning the model parameters
initFromTo - hyper-parameters of the transitions from each state (first dimension) of the current layer to each state in the same layer (first three entries of the second dimension) and in the next layer (next three entries of the second dimension). If a hyper-parameter is set to Double.NaN, the corresponding transition is not allowed.
likelihood - if true, the likelihood is used as the score of a sequence; otherwise, the probability of the Viterbi path is used
order - the order of the HMM, i.e., the number of previous states that are considered for a transition probability
numLayers - the number of layers of the profile HMM
con - the alphabet of the profile HMM
ess - the equivalent sample size, is propagated between states to obtain consistent hyper-parameters for all parameters
conditionalMain - if true, the match states have ReferenceSequenceDiscreteEmissions, and DiscreteEmissions otherwise
closeCircle - if true, the circle from the end state to the initial state is closed, i.e., the HMM can be traversed several times
conditionInitProbs - the hyper-parameters for initializing the match states if conditionalMain is true; may be null, in which case the hyper-parameters of the prior are used
insertUniform - if true, the insert states use UniformEmissions
Returns:
the profile HMM
Throws:
Exception - if the profile HMM could not be created

createProfileHMM

public static AbstractHMM createProfileHMM(MaxHMMTrainingParameterSet trainingParameterSet,
                                           double[][] initFromTo,
                                           boolean likelihood,
                                           int order,
                                           int numLayers,
                                           AlphabetContainer con,
                                           double ess,
                                           boolean conditionalMain,
                                           int joiningStates,
                                           double[][] conditionInitProbs,
                                           boolean insertUniform)
                                    throws Exception
Creates a new profile HMM for a given architecture and number of layers.

Parameters:
trainingParameterSet - the parameters of the algorithm for learning the model parameters
initFromTo - hyper-parameters of the transitions from each state (first dimension) of the current layer to each state in the same layer (first three entries of the second dimension) and in the next layer (next three entries of the second dimension). If a hyper-parameter is set to Double.NaN, the corresponding transition is not allowed.
likelihood - if true, the likelihood is used as the score of a sequence; otherwise, the probability of the Viterbi path is used
order - the order of the HMM, i.e., the number of previous states that are considered for a transition probability
numLayers - the number of layers of the profile HMM
con - the alphabet of the profile HMM
ess - the equivalent sample size, is propagated between states to obtain consistent hyper-parameters for all parameters
conditionalMain - if true, the match states have ReferenceSequenceDiscreteEmissions, and DiscreteEmissions otherwise
joiningStates - the number of states used in the joining arc; if not positive, the profile HMM does not contain any joining states (i.e., the circle is not closed)
conditionInitProbs - the hyper-parameters for initializing the match states if conditionalMain is true; may be null, in which case the hyper-parameters of the prior are used
insertUniform - if true, the insert states use UniformEmissions
Returns:
the profile HMM
Throws:
Exception - if the profile HMM could not be created
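
The layout of initFromTo described above (three source states, six targets, Double.NaN for forbidden transitions) can be sketched as follows. The class InitFromToSketch, the state order match, insert, delete, and the concrete numbers are assumptions chosen for illustration; the documentation above does not fix the order of the three states. The normalization in priorMean uses the mean of a Dirichlet distribution, which is proportional to its hyper-parameters.

```java
/** Hypothetical illustration, not Jstacs code: working with an initFromTo-style matrix. */
final class InitFromToSketch {

    /** Counts allowed transitions, i.e., entries that are not NaN. */
    static int countAllowed(double[][] initFromTo) {
        int n = 0;
        for (double[] row : initFromTo) {
            for (double h : row) {
                if (!Double.isNaN(h)) {
                    n++;
                }
            }
        }
        return n;
    }

    /** Normalizes the allowed hyper-parameters of one row; forbidden entries become 0. */
    static double[] priorMean(double[] row) {
        double sum = 0.0;
        for (double h : row) {
            if (!Double.isNaN(h)) {
                sum += h;
            }
        }
        double[] p = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            p[i] = Double.isNaN(row[i]) ? 0.0 : row[i] / sum;
        }
        return p;
    }

    /** Example matrix with an assumed state order match (M), insert (I), delete (D). */
    static double[][] example() {
        double x = Double.NaN; // marks a transition that is not allowed
        return new double[][] {
            // same layer: M, I, D   next layer: M, I, D
            { x, 1, x, 8, x, 1 },  // from match
            { x, 1, x, 1, x, x },  // from insert
            { x, 1, x, 1, x, 1 },  // from delete
        };
    }
}
```

Note that NaN must be tested with Double.isNaN, since a comparison such as h == Double.NaN is always false in Java.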

createTransition

public static TransitionElement[] createTransition(double[][] hyperParams,
                                                   ArrayList<HMMFactory.PseudoTransitionElement> list)
Creates the real TransitionElements that can be used to create the HMM.

Parameters:
hyperParams - the hyper-parameters for the HMMFactory.PseudoTransitionElements in the list
list - a list of HMMFactory.PseudoTransitionElements that is used to create the HMM
Returns:
an array of TransitionElements that correspond to all entries of the list
See Also:
propagateESS(double, ArrayList)

propagateESS

public static Pair<double[][],double[]> propagateESS(double ess,
                                                     ArrayList<HMMFactory.PseudoTransitionElement> list)
Propagates the ess for an HMM with absorbing states.

Parameters:
ess - the equivalent sample size of the HMM
list - a list of HMMFactory.PseudoTransitionElements that is used to create the HMM
Returns:
a Pair that contains as first element the hyper-parameters for the HMMFactory.PseudoTransitionElements in the list, and as second element the ess of each state
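
The idea of propagating an equivalent sample size through the state graph can be sketched for the simplest case, an acyclic graph with a single start state: each transition receives the ess of its source state multiplied by its a-priori probability, and each state collects the hyper-parameters of its incoming transitions. The class EssPropagationSketch, its parameters, and the restriction to acyclic graphs are assumptions of this illustration; it is not the algorithm implemented in Jstacs, which is not described here.

```java
/** Hypothetical illustration, not Jstacs code: ess propagation on an acyclic state graph. */
final class EssPropagationSketch {

    /**
     * edges[k] = {from, to} and prob[k] is the a-priori probability of edge k.
     * Edges must be ordered so that all edges into a state precede the edges leaving it.
     * Returns {hyper, stateEss}: hyper[k] is the hyper-parameter of edge k,
     * stateEss[i] the ess that arrives at state i.
     */
    static double[][] propagate(double ess, int numStates, int[][] edges, double[] prob) {
        double[] hyper = new double[edges.length];
        double[] stateEss = new double[numStates];
        stateEss[0] = ess; // assumption: every path starts in state 0
        for (int k = 0; k < edges.length; k++) {
            int from = edges[k][0];
            int to = edges[k][1];
            hyper[k] = stateEss[from] * prob[k]; // share of the source ess on this edge
            stateEss[to] += hyper[k];            // target collects all incoming shares
        }
        return new double[][] { hyper, stateEss };
    }
}
```

In a graph where every path ends in one absorbing state, that state collects the full ess again, which is a simple consistency check for the resulting hyper-parameters.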

getHashMap

public static HashMap<String,String> getHashMap()
This method returns a HashMap that can be used in AbstractHMM.getGraphvizRepresentation(java.text.NumberFormat, de.jstacs.data.DataSet, double[], HashMap) to create a Graphviz representation of the AbstractHMM.

Returns:
a HashMap that can be used to create a Graphviz representation
See Also:
AbstractHMM.getGraphvizRepresentation(java.text.NumberFormat, de.jstacs.data.DataSet, double[], HashMap)