de.jstacs.sequenceScores.statisticalModels.trainable
Class TrainableStatisticalModelFactory

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.TrainableStatisticalModelFactory

public class TrainableStatisticalModelFactory
extends Object

This class makes it easy to create some frequently used models. It offers only one way of creating each model and sets some of the parameters to default values. If you want to set further parameters, please check the constructors of the individual classes.

Author:
Jens Keilwagen

Constructor Summary
TrainableStatisticalModelFactory()
           
 
Method Summary
static BayesianNetworkTrainSM createBayesianNetworkModel(AlphabetContainer con, int length, double ess, byte order)
          This method returns a Bayesian network model (BN) with user-specified order.
static HomogeneousMM createHomogeneousMarkovModel(AlphabetContainer con, double ess, byte order)
          This method returns a homogeneous Markov model with user-specified order.
static FSDAGTrainSM createInhomogeneousMarkovModel(AlphabetContainer con, int length, double ess, byte order)
          This method returns an inhomogeneous Markov model (IMM) with user-specified order.
static MixtureTrainSM createMixtureModel(double[] hyper, TrainableStatisticalModel[] model)
          This method creates a MixtureTrainSM that models a DataSet as a mixture of individual components.
static BayesianNetworkTrainSM createPermutedMarkovModel(AlphabetContainer con, int length, double ess, byte order)
          This method returns a permuted Markov model (PMM) with user-specified order.
static FSDAGTrainSM createPWM(AlphabetContainer con, int length, double ess)
          This method returns a position weight matrix (PWM).
static StrandTrainSM createStrandModel(TrainableStatisticalModel model)
          This method creates a StrandTrainSM that scores binding sites on both strands of DNA.
static ZOOPSTrainSM createZOOPS(TrainableStatisticalModel motif, TrainableStatisticalModel bg, double[] hyper, boolean trainOnlyMotifModel)
          This method creates a "zero or one occurrence per sequence" (ZOOPS) model that can be used to discover binding sites in a DataSet.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TrainableStatisticalModelFactory

public TrainableStatisticalModelFactory()
Method Detail

createPWM

public static FSDAGTrainSM createPWM(AlphabetContainer con,
                                     int length,
                                     double ess)
                              throws Exception
This method returns a position weight matrix (PWM). A PWM assumes that all positions of a sequence are statistically independent.

Parameters:
con - the AlphabetContainer of the PWM
length - the length of the PWM, i.e., the length of the sequences that can be modeled
ess - the equivalent sample size (ess) of the PWM; if it is 0 (zero), the model can be trained using the maximum likelihood principle, otherwise it can be trained using the maximum a posteriori principle with the BDeu prior
Returns:
the PWM
Throws:
Exception - if the model cannot be created correctly
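
The independence assumption and the effect of ess can be illustrated with a minimal, self-contained sketch. It does not use Jstacs: the class PwmSketch and its methods are hypothetical, the alphabet is assumed to be DNA, and the ess is assumed to be spread uniformly as ess/4 pseudocounts per symbol and position, giving the estimate (count + ess/4) / (N + ess). The exact estimator used by the library is not confirmed here; with ess = 0 the sketch reduces to plain relative frequencies.

```java
import java.util.List;

// Hypothetical illustration class; not part of Jstacs.
class PwmSketch {
    private static final String ALPHABET = "ACGT";
    private final double[][] logProb; // [position][symbol]

    // Estimates one independent distribution per position.
    // Assumption: ess is spread uniformly, i.e. ess/4 pseudocounts per symbol.
    PwmSketch(List<String> data, int length, double ess) {
        double[][] counts = new double[length][4];
        for (String s : data) {
            for (int p = 0; p < length; p++) {
                counts[p][ALPHABET.indexOf(s.charAt(p))]++;
            }
        }
        logProb = new double[length][4];
        double total = data.size() + ess;
        for (int p = 0; p < length; p++) {
            for (int a = 0; a < 4; a++) {
                logProb[p][a] = Math.log((counts[p][a] + ess / 4.0) / total);
            }
        }
    }

    // Independence: the log-probability of a sequence is a sum over positions.
    double logProbFor(String s) {
        double sum = 0;
        for (int p = 0; p < logProb.length; p++) {
            sum += logProb[p][ALPHABET.indexOf(s.charAt(p))];
        }
        return sum;
    }
}
```

With ess = 0, any symbol that never occurs at a position gets probability zero, so a sequence containing it scores negative infinity; a positive ess avoids this.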

createInhomogeneousMarkovModel

public static FSDAGTrainSM createInhomogeneousMarkovModel(AlphabetContainer con,
                                                          int length,
                                                          double ess,
                                                          byte order)
                                                   throws Exception
This method returns an inhomogeneous Markov model (IMM) with user-specified order.

Parameters:
con - the AlphabetContainer of the IMM
length - the length of the IMM, i.e., the length of the sequences that can be modeled
ess - the equivalent sample size (ess) of the IMM; if it is 0 (zero), the model can be trained using the maximum likelihood principle, otherwise it can be trained using the maximum a posteriori principle with the BDeu prior
order - the order of the IMM, i.e., the number of directly preceding random variables (=positions) that might have an influence on the probability of outcome of a random variable (=position)
Returns:
the IMM
Throws:
Exception - if the model cannot be created correctly
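
The meaning of the order parameter can be sketched without the library. The hypothetical class ImmSketch below keeps one conditional table per position, keyed by the up to order directly preceding symbols, and uses maximum likelihood estimates only (corresponding to ess = 0). It assumes a DNA alphabet and is an illustration, not the Jstacs implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Jstacs.
class ImmSketch {
    private static final String ALPHABET = "ACGT";
    private final int order;
    // One table per position: context (preceding symbols) -> symbol counts.
    private final List<Map<String, double[]>> tables = new ArrayList<>();

    ImmSketch(List<String> data, int length, int order) {
        this.order = order;
        for (int p = 0; p < length; p++) {
            Map<String, double[]> table = new HashMap<>();
            for (String s : data) {
                double[] counts = table.computeIfAbsent(context(s, p), c -> new double[4]);
                counts[ALPHABET.indexOf(s.charAt(p))]++;
            }
            tables.add(table);
        }
    }

    // The up to `order` symbols directly preceding position p.
    private String context(String s, int p) {
        return s.substring(Math.max(0, p - order), p);
    }

    // Maximum likelihood probability of s (no pseudocounts).
    double probFor(String s) {
        double prob = 1;
        for (int p = 0; p < tables.size(); p++) {
            double[] counts = tables.get(p).get(context(s, p));
            if (counts == null) {
                return 0; // context never observed at this position
            }
            double total = 0;
            for (double c : counts) {
                total += c;
            }
            prob *= counts[ALPHABET.indexOf(s.charAt(p))] / total;
        }
        return prob;
    }
}
```

With order 0 every context is empty, so the sketch degenerates to a PWM with independent positions.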

createPermutedMarkovModel

public static BayesianNetworkTrainSM createPermutedMarkovModel(AlphabetContainer con,
                                                               int length,
                                                               double ess,
                                                               byte order)
                                                        throws Exception
This method returns a permuted Markov model (PMM) with user-specified order. Permuted Markov models determine a permutation of the random variables (=positions) and then build an inhomogeneous Markov model based on this permutation.

Parameters:
con - the AlphabetContainer of the PMM
length - the length of the PMM, i.e., the length of the sequences that can be modeled
ess - the equivalent sample size (ess) of the PMM; if it is 0 (zero), the model can be trained using the maximum likelihood principle, otherwise it can be trained using the maximum a posteriori principle with the BDeu prior
order - the order of the PMM, i.e., the number of random variables (=positions) that might have an influence on the probability of outcome of a random variable (=position)
Returns:
the PMM
Throws:
Exception - if the model cannot be created correctly
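
How a permutation turns into a dependency structure can be sketched as follows. The hypothetical helper below takes an already chosen permutation (how the library determines it is not shown here) and returns, for each sequence position, the positions it depends on, assuming that each position depends on the up to order positions visited directly before it in the permutation.

```java
// Hypothetical illustration class; not part of Jstacs.
class PmmStructureSketch {
    // perm[k] is the sequence position visited at step k of the permutation.
    // Returns parents[i]: the positions that position i is conditioned on.
    static int[][] parents(int[] perm, int order) {
        int[][] parents = new int[perm.length][];
        for (int k = 0; k < perm.length; k++) {
            int n = Math.min(order, k); // fewer parents at the start of the chain
            int[] p = new int[n];
            for (int j = 0; j < n; j++) {
                p[j] = perm[k - n + j];
            }
            parents[perm[k]] = p;
        }
        return parents;
    }
}
```

For the identity permutation this yields the parent sets of an ordinary inhomogeneous Markov model of the same order.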

createBayesianNetworkModel

public static BayesianNetworkTrainSM createBayesianNetworkModel(AlphabetContainer con,
                                                                int length,
                                                                double ess,
                                                                byte order)
                                                         throws Exception
This method returns a Bayesian network model (BN) with user-specified order. Bayesian networks determine a directed acyclic graph over the random variables (=positions) and then learn the parameters of the distribution based on this network.

Parameters:
con - the AlphabetContainer of the BN
length - the length of the BN, i.e., the length of the sequences that can be modeled
ess - the equivalent sample size (ess) of the BN; if it is 0 (zero), the model can be trained using the maximum likelihood principle, otherwise it can be trained using the maximum a posteriori principle with the BDeu prior
order - the order of the BN, i.e., the number of random variables (=positions) that might have an influence on the probability of outcome of a random variable (=position)
Returns:
the BN
Throws:
Exception - if the model cannot be created correctly

createHomogeneousMarkovModel

public static HomogeneousMM createHomogeneousMarkovModel(AlphabetContainer con,
                                                         double ess,
                                                         byte order)
                                                  throws Exception
This method returns a homogeneous Markov model with user-specified order.

Parameters:
con - the AlphabetContainer of the model
ess - the equivalent sample size (ess) of the model; if it is 0 (zero), the model can be trained using the maximum likelihood principle, otherwise it can be trained using the maximum a posteriori principle with the BDeu prior
order - the order of the model, i.e., the number of directly preceding random variables (=positions) that might have an influence on the probability of outcome of a random variable (=position)
Returns:
the homogeneous Markov model
Throws:
Exception - if the model cannot be created correctly
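
In contrast to the inhomogeneous case, a homogeneous Markov model shares its conditional probabilities across all positions, which is why this method takes no length parameter. The hypothetical class below sketches this with a single context table and maximum likelihood estimates (ess = 0). It assumes a DNA alphabet and that shorter contexts are used at the start of a sequence; it is not the Jstacs implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Jstacs.
class HomogeneousMmSketch {
    private static final String ALPHABET = "ACGT";
    private final int order;
    // A single table shared by all positions: context -> symbol counts.
    private final Map<String, double[]> table = new HashMap<>();

    HomogeneousMmSketch(List<String> data, int order) {
        this.order = order;
        for (String s : data) {
            for (int p = 0; p < s.length(); p++) {
                double[] counts = table.computeIfAbsent(context(s, p), c -> new double[4]);
                counts[ALPHABET.indexOf(s.charAt(p))]++;
            }
        }
    }

    // The up to `order` symbols directly preceding position p.
    private String context(String s, int p) {
        return s.substring(Math.max(0, p - order), p);
    }

    // Maximum likelihood probability of a sequence of arbitrary length.
    double probFor(String s) {
        double prob = 1;
        for (int p = 0; p < s.length(); p++) {
            double[] counts = table.get(context(s, p));
            if (counts == null) {
                return 0; // context never observed in the training data
            }
            double total = 0;
            for (double c : counts) {
                total += c;
            }
            prob *= counts[ALPHABET.indexOf(s.charAt(p))] / total;
        }
        return prob;
    }
}
```

Because the table does not depend on the position, a model trained on sequences of length 3 can score sequences of any other length.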

createStrandModel

public static StrandTrainSM createStrandModel(TrainableStatisticalModel model)
                                       throws Exception
This method creates a StrandTrainSM that scores binding sites on both strands of DNA.

Parameters:
model - the internally used model
Returns:
the StrandTrainSM
Throws:
Exception - if the model cannot be created correctly
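
The idea of scoring both strands can be sketched as a two-component mixture over the sequence and its reverse complement. The helper below is hypothetical and self-contained: it wraps an arbitrary log-probability function that stands in for the internally used model, and it assumes a fixed prior of 0.5 for each strand, which may differ from how StrandTrainSM handles the strand probability.

```java
import java.util.function.ToDoubleFunction;

// Hypothetical illustration class; not part of Jstacs.
class StrandSketch {
    // Reverse complement of a DNA sequence over the symbols A, C, G, T.
    static String reverseComplement(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = s.length() - 1; i >= 0; i--) {
            switch (s.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:
                    throw new IllegalArgumentException("not a DNA symbol: " + s.charAt(i));
            }
        }
        return sb.toString();
    }

    // log( 0.5 * P(s) + 0.5 * P(reverse complement of s) ), computed stably.
    // Assumption: both strands are equally likely a priori.
    static double logProbBothStrands(ToDoubleFunction<String> logProb, String s) {
        double fwd = Math.log(0.5) + logProb.applyAsDouble(s);
        double rev = Math.log(0.5) + logProb.applyAsDouble(reverseComplement(s));
        double max = Math.max(fwd, rev);
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // both strands have probability zero
        }
        return max + Math.log(Math.exp(fwd - max) + Math.exp(rev - max));
    }
}
```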

createMixtureModel

public static MixtureTrainSM createMixtureModel(double[] hyper,
                                                TrainableStatisticalModel[] model)
                                         throws Exception
This method creates a MixtureTrainSM that models a DataSet as a mixture of individual components.

Parameters:
hyper - the hyperparameters for the components (should be identical to the ESS of the components)
model - the internally used models, one per component
Returns:
the MixtureTrainSM
Throws:
Exception - if the model cannot be created correctly
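
What "a mixture of individual components" means for a single sequence can be sketched numerically. The hypothetical helper below takes component weights and per-component log-probabilities, both assumed to be already available, and computes the mixture log-likelihood and the posterior component probabilities. Estimating the weights, and the role of the hyperparameters in that estimation, are not covered by this sketch.

```java
// Hypothetical illustration class; not part of Jstacs.
class MixtureSketch {
    // log( sum_k weights[k] * exp(componentLogProbs[k]) ), computed stably.
    static double logLikelihood(double[] weights, double[] componentLogProbs) {
        double[] joint = new double[weights.length];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < weights.length; k++) {
            joint[k] = Math.log(weights[k]) + componentLogProbs[k];
            max = Math.max(max, joint[k]);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // every component assigns probability zero
        }
        double sum = 0;
        for (double j : joint) {
            sum += Math.exp(j - max);
        }
        return max + Math.log(sum);
    }

    // Posterior probability of each component given the sequence.
    // Undefined (NaN) if every component assigns probability zero.
    static double[] responsibilities(double[] weights, double[] componentLogProbs) {
        double total = logLikelihood(weights, componentLogProbs);
        double[] r = new double[weights.length];
        for (int k = 0; k < weights.length; k++) {
            r[k] = Math.exp(Math.log(weights[k]) + componentLogProbs[k] - total);
        }
        return r;
    }
}
```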

createZOOPS

public static ZOOPSTrainSM createZOOPS(TrainableStatisticalModel motif,
                                       TrainableStatisticalModel bg,
                                       double[] hyper,
                                       boolean trainOnlyMotifModel)
                                throws Exception
This method creates a "zero or one occurrence per sequence" (ZOOPS) model that can be used to discover binding sites in a DataSet.

Parameters:
motif - the internally used model for the binding sites
bg - the internally used model for the flanking sequence
hyper - the hyperparameters for the components (should be identical to the ESS of the components)
trainOnlyMotifModel - a switch that determines whether only the motif model is trained or both models (motif and bg) are trained
Returns:
the ZOOPS model
Throws:
Exception - if the model cannot be created correctly
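
The "zero or one occurrence per sequence" likelihood can be sketched for a toy setting. The hypothetical helper below assumes a DNA alphabet, a background that emits symbols independently from one fixed distribution, a motif given as a matrix of per-position probabilities, and a uniform prior over the possible start positions of the site. These choices are simplifications for illustration and do not describe the internals of ZOOPSTrainSM.

```java
// Hypothetical illustration class; not part of Jstacs.
class ZoopsSketch {
    private static final String ALPHABET = "ACGT";

    // P(x) = (1 - motifProb) * P_bg(x)
    //        + motifProb * average over start positions of
    //          P_bg(left flank) * P_motif(site) * P_bg(right flank)
    // Plain products are used for clarity; they underflow for long sequences.
    static double likelihood(String x, double[] bg, double[][] motif, double motifProb) {
        int width = motif.length;
        int starts = x.length() - width + 1;
        if (starts <= 0) {
            throw new IllegalArgumentException("sequence shorter than motif");
        }
        double bgOnly = 1;
        for (int i = 0; i < x.length(); i++) {
            bgOnly *= bg[ALPHABET.indexOf(x.charAt(i))];
        }
        double withSite = 0;
        for (int start = 0; start < starts; start++) {
            double p = 1;
            for (int i = 0; i < x.length(); i++) {
                int a = ALPHABET.indexOf(x.charAt(i));
                boolean inSite = i >= start && i < start + width;
                p *= inSite ? motif[i - start][a] : bg[a];
            }
            withSite += p / starts; // uniform prior over start positions
        }
        return (1 - motifProb) * bgOnly + motifProb * withSite;
    }
}
```

Setting motifProb to 0 recovers the pure background model, and setting it to 1 forces exactly one site per sequence.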