de.jstacs.models.mixture.motif
Class HiddenMotifMixture

java.lang.Object
  extended by de.jstacs.models.AbstractModel
      extended by de.jstacs.models.mixture.AbstractMixtureModel
          extended by de.jstacs.models.mixture.motif.HiddenMotifMixture
All Implemented Interfaces:
Model, MotifDiscoverer, Storable, Cloneable
Direct Known Subclasses:
SingleHiddenMotifMixture

public abstract class HiddenMotifMixture
extends AbstractMixtureModel
implements MotifDiscoverer

This is the base class that every generative motif discoverer should extend. It implements a mixture model in which each sequence contains either zero or one occurrence of a motif.
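The zero-or-one-motif mixture described above can be sketched as a likelihood computation. The following is a minimal, self-contained illustration and not the Jstacs implementation: the class and method names are hypothetical, sequences are assumed to be integer-encoded, the flanking sequences are scored with an order-0 background, and a uniform distribution over start positions stands in for a PositionPrior.

```java
// Illustrative sketch only: none of these names are part of Jstacs.
final class ZeroOrOneMotifSketch {

    /** Log-probability of seq[start, end) under an order-0 background. */
    static double logBackground(int[] seq, int start, int end, double[] bg) {
        double lp = 0;
        for (int i = start; i < end; i++) {
            lp += Math.log(bg[seq[i]]);
        }
        return lp;
    }

    /** Log-probability of a motif occurrence starting at pos (pwm[column][symbol]). */
    static double logMotif(int[] seq, int pos, double[][] pwm) {
        double lp = 0;
        for (int j = 0; j < pwm.length; j++) {
            lp += Math.log(pwm[j][seq[pos + j]]);
        }
        return lp;
    }

    /** Numerically stable log(exp(a) + exp(b)). */
    static double logSum(double a, double b) {
        if (a == Double.NEGATIVE_INFINITY) return b;
        if (b == Double.NEGATIVE_INFINITY) return a;
        double m = Math.max(a, b);
        return m + Math.log(Math.exp(a - m) + Math.exp(b - m));
    }

    /**
     * Log-likelihood of seq under a two-component mixture: with probability
     * pMotif the sequence contains one motif at a hidden position, otherwise
     * it is generated by the background alone.
     */
    static double logLikelihood(int[] seq, double pMotif, double[][] pwm, double[] bg) {
        int w = pwm.length;
        int positions = seq.length - w + 1;
        double withMotif = Double.NEGATIVE_INFINITY;
        for (int pos = 0; pos < positions; pos++) {
            double lp = -Math.log(positions)                        // uniform position prior (assumption)
                    + logBackground(seq, 0, pos, bg)               // left flank
                    + logMotif(seq, pos, pwm)                      // motif occurrence
                    + logBackground(seq, pos + w, seq.length, bg); // right flank
            withMotif = logSum(withMotif, lp);                     // marginalize the hidden position
        }
        double noMotif = logBackground(seq, 0, seq.length, bg);
        return logSum(Math.log(pMotif) + withMotif, Math.log(1 - pMotif) + noMotif);
    }
}
```

Because the motif position is hidden, the motif component sums over all admissible start positions. A sequence shorter than the motif can only be explained by the background component in this sketch.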

Author:
Jens Keilwagen

Nested Class Summary
 
Nested classes/interfaces inherited from class de.jstacs.models.mixture.AbstractMixtureModel
AbstractMixtureModel.Algorithm, AbstractMixtureModel.Parameterization
 
Nested classes/interfaces inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer
MotifDiscoverer.KindOfProfile
 
Field Summary
protected  byte bgMaxMarkovOrder
          The order of the background model.
protected  PositionPrior posPrior
          The prior for the positions.
protected  boolean trainOnlyMotifModel
          A switch that restricts training to the motif model.
 
Fields inherited from class de.jstacs.models.mixture.AbstractMixtureModel
algorithm, algorithmHasBeenRun, alternativeModel, burnInTest, componentHyperParams, compProb, counter, dimension, estimateComponentProbs, file, filereader, filewriter, initialIteration, logWeights, model, optimizeModel, sample, samplingIndex, sostream, starts, stationaryIteration, weights
 
Fields inherited from class de.jstacs.models.AbstractModel
alphabets, length
 
Constructor Summary
protected HiddenMotifMixture(Model[] models, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, boolean trainOnlyMotifModel, AbstractMixtureModel.Algorithm algorithm, double alpha, double eps, AbstractMixtureModel.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest)
          Creates a new HiddenMotifMixture.
protected HiddenMotifMixture(StringBuffer xml)
          The standard constructor for the interface Storable.
 
Method Summary
protected  void checkLength(int index, int l)
          This method checks whether the length l of the model with index index is suitable for the current instance.
 HiddenMotifMixture clone()
          Follows the conventions of Object's clone-method.
protected  Sequence[] emitSampleUsingCurrentParameterSet(int n, int... lengths)
          Standard implementation throwing an OperationNotSupportedException.
protected  void extractFurtherInformation(StringBuffer xml)
          This method is used in the subclasses to extract further information from the xml representation and to set these as values of the instance.
protected  StringBuffer getFurtherInformation()
          This method is used in the subclasses to append further information to the XML representation.
 String getInstanceName()
          Should return a short instance name such as iMM(0), BN(2), ...
 double getLogPriorTerm()
          Returns a value that is proportional to the log of the prior.
abstract  int getMinimalSequenceLength()
          Returns the minimal length that a sequence, or the sequences of a sample, must have.
protected  void getNewParameters(int iteration, double[][] seqWeights, double[] w)
          This method trains the internal models on the internal sample and the given weights.
 String toString()
          Should give a simple representation (text) of the model as String.
 void train(Sample data, double[] weights)
          Trains the Model object given the data as Sample using the specified weights.
 void trainBgModel(Sample data, double[] weights)
          This method trains the bg-model.
 
Methods inherited from class de.jstacs.models.mixture.AbstractMixtureModel
algorithmHasBeenRun, checkModelsForGibbsSampling, continueIterations, continueIterations, createSeqWeightsArray, doFirstIteration, doFirstIteration, doFirstIteration, draw, emitSample, extendSampling, finalize, fromXML, getCharacteristics, getIndexOfMaximalComponentFor, getLogPriorTermForComponentProbs, getLogProbFor, getLogProbFor, getLogProbFor, getLogProbUsingCurrentParameterSetFor, getModel, getModels, getMRG, getMRGParams, getNameOfAlgorithm, getNewComponentProbs, getNewParametersForModel, getNewWeights, getNumberOfComponents, getNumericalCharacteristics, getProbFor, getScoreForBestRun, getWeights, initModelForSampling, initWithPrior, isInSamplingMode, isTrained, iterate, iterate, max, modifyWeights, parseNextParameterSet, parseParameterSet, samplingStopped, set, setAlpha, setOutputStream, setThreshold, setTrainData, setWeights, swap, toXML
 
Methods inherited from class de.jstacs.models.AbstractModel
getAlphabetContainer, getLength, getLogProbFor, getLogProbFor, getLogProbFor, getMaximalMarkovOrder, getPriorTerm, getProbFor, getProbFor, setNewAlphabetContainerInstance, train
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer
getGlobalIndexOfMotifInComponent, getIndexOfMaximalComponentFor, getMotifLength, getNumberOfComponents, getNumberOfMotifs, getNumberOfMotifsInComponent, getProfileOfScoresFor, getStrandFor
 
Methods inherited from interface de.jstacs.Storable
toXML
 

Field Detail

posPrior

protected PositionPrior posPrior
The prior for the positions.


trainOnlyMotifModel

protected boolean trainOnlyMotifModel
A switch that restricts training to the motif model.


bgMaxMarkovOrder

protected byte bgMaxMarkovOrder
The order of the background model.

Constructor Detail

HiddenMotifMixture

protected HiddenMotifMixture(Model[] models,
                             int starts,
                             boolean estimateComponentProbs,
                             double[] componentHyperParams,
                             double[] weights,
                             PositionPrior posPrior,
                             boolean trainOnlyMotifModel,
                             AbstractMixtureModel.Algorithm algorithm,
                             double alpha,
                             double eps,
                             AbstractMixtureModel.Parameterization parametrization,
                             int initialIteration,
                             int stationaryIteration,
                             BurnInTest burnInTest)
                      throws CloneNotSupportedException,
                             IllegalArgumentException,
                             WrongAlphabetException
Creates a new HiddenMotifMixture. This constructor can be used for any algorithm since it takes all necessary values as parameters.

Parameters:
models - the individual models that make up the AbstractMixtureModel; if the model is trained using AbstractMixtureModel.Algorithm.GIBBS_SAMPLING, the models that will be adjusted have to implement GibbsSamplingComponent. The models that are used for the flanking sequences have to be able to score sequences of arbitrary length.
starts - the number of times the algorithm will be started in the train method, at least 1
estimateComponentProbs - the switch for either estimating the component probabilities in the algorithm or holding them fixed; if the component probabilities are fixed, the values of weights will be used, otherwise the componentHyperParams will be incorporated in the adjustment
componentHyperParams - the hyperparameters for the component assignment prior
  • will only be used if estimateComponentProbs == true
  • the array has to be null or has to have length dimension
  • null or an array with all values zero (0): ML is used
  • otherwise (all values positive): a prior is used (MAP, MP, ...)
  • depends on the parameterization
weights - null or the weights for the components (then weights.length == dimension)
posPrior - this object determines the positional distribution that shall be used
trainOnlyMotifModel - a switch whether to train only the motif model
algorithm - either AbstractMixtureModel.Algorithm.EM or AbstractMixtureModel.Algorithm.GIBBS_SAMPLING
alpha - only for AbstractMixtureModel.Algorithm.EM: the positive parameter of the Dirichlet distribution that is used to initialize the gammas when train is invoked. It is recommended to use alpha = 1 (uniform distribution on a simplex).
eps - only for AbstractMixtureModel.Algorithm.EM: the non-negative threshold for stopping the EM algorithm
parametrization - only for AbstractMixtureModel.Algorithm.EM: the type of the component probability parameterization;
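
The role of componentHyperParams can be illustrated with a small sketch. It assumes that the hyperparameters act as pseudocounts added to the expected component counts, so that null or all-zero hyperparameters reduce to the ML estimate. This is an assumption made for illustration: the estimator actually used depends on the parameterization, and the class and method names below are hypothetical.

```java
// Illustrative sketch only: pseudocount reading of the hyperparameters (assumption).
final class ComponentProbSketch {

    /**
     * counts[k] is the expected (weighted) number of sequences assigned to
     * component k; hyperParams may be null, which is treated like all zeros.
     */
    static double[] estimate(double[] counts, double[] hyperParams) {
        if (hyperParams != null && hyperParams.length != counts.length) {
            throw new IllegalArgumentException("hyperParams must be null or have length dimension");
        }
        double[] p = new double[counts.length];
        double sum = 0;
        for (int k = 0; k < counts.length; k++) {
            double pseudo = hyperParams == null ? 0.0 : hyperParams[k];
            p[k] = counts[k] + pseudo;
            sum += p[k];
        }
        for (int k = 0; k < p.length; k++) {
            p[k] /= sum; // normalize to a probability vector
        }
        return p;
    }
}
```

With counts {3, 1} this yields {0.75, 0.25} without hyperparameters, while hyperparameters {1, 1} pull the estimate towards the uniform distribution.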

HiddenMotifMixture

protected HiddenMotifMixture(StringBuffer xml)
                      throws NonParsableException
The standard constructor for the interface Storable.

Parameters:
xml - the StringBuffer containing the model
Throws:
NonParsableException - if the StringBuffer cannot be parsed
Method Detail

clone

public HiddenMotifMixture clone()
                         throws CloneNotSupportedException
Description copied from class: AbstractModel
Follows the conventions of Object's clone-method.

Specified by:
clone in interface Model
Overrides:
clone in class AbstractMixtureModel
Returns:
an object that is a copy of the current AbstractModel (the member AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractModel. Hence X's clone method should work as follows:
1. Object o = (X)super.clone();
2. all additional member variables of o defined by X that are not of simple data types like int, double, ... have to be deeply copied
3. return o
Throws:
CloneNotSupportedException
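
The clone convention described above can be sketched with two toy classes. Base and X are hypothetical stand-ins for AbstractModel and a direct subclass; only the three steps themselves are taken from the description.

```java
// Illustrative sketch only: Base and X are hypothetical stand-ins.
final class CloneSketch {

    static class Base implements Cloneable {
        protected int length = 8; // simple data type, copied by Object.clone()

        @Override
        public Base clone() throws CloneNotSupportedException {
            return (Base) super.clone();
        }
    }

    static class X extends Base {
        protected double[] params = {0.1, 0.9}; // mutable member, needs a deep copy

        @Override
        public X clone() throws CloneNotSupportedException {
            X o = (X) super.clone();   // step 1: shallow copy via the superclass
            o.params = params.clone(); // step 2: deep copy of non-primitive members
            return o;                  // step 3: return the copy
        }
    }
}
```

Without step 2 the original and the copy would share the same params array, so changing a parameter of one model would silently change the other.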

getFurtherInformation

protected StringBuffer getFurtherInformation()
Description copied from class: AbstractMixtureModel
This method is used in the subclasses to append further information to the XML representation.

Overrides:
getFurtherInformation in class AbstractMixtureModel
Returns:
a part of the xml representation
See Also:
AbstractMixtureModel.extractFurtherInformation(StringBuffer)

extractFurtherInformation

protected void extractFurtherInformation(StringBuffer xml)
                                  throws NonParsableException
Description copied from class: AbstractMixtureModel
This method is used in the subclasses to extract further information from the xml representation and to set these as values of the instance.

Overrides:
extractFurtherInformation in class AbstractMixtureModel
Parameters:
xml - the xml representation
Throws:
NonParsableException - if the xml representation is not parsable
See Also:
AbstractMixtureModel.getFurtherInformation()
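
getFurtherInformation and extractFurtherInformation form a symmetric pair: whatever one appends, the other has to read back. The sketch below shows this round trip with a toy class. The tag names and the helper methods appendTag and extractTag are hypothetical, and IllegalArgumentException stands in for NonParsableException so that the example stays self-contained; which fields HiddenMotifMixture actually stores this way is not specified here.

```java
// Illustrative sketch only: tag names and helpers are hypothetical.
final class FurtherInformationSketch {
    boolean trainOnlyMotifModel;
    byte bgMaxMarkovOrder;

    /** Appends the additional fields of this class as simple tags. */
    StringBuffer getFurtherInformation() {
        StringBuffer xml = new StringBuffer();
        appendTag(xml, "trainOnlyMotifModel", String.valueOf(trainOnlyMotifModel));
        appendTag(xml, "bgMaxMarkovOrder", String.valueOf(bgMaxMarkovOrder));
        return xml;
    }

    /** Reads back exactly the fields written by getFurtherInformation(). */
    void extractFurtherInformation(StringBuffer xml) {
        trainOnlyMotifModel = Boolean.parseBoolean(extractTag(xml, "trainOnlyMotifModel"));
        bgMaxMarkovOrder = Byte.parseByte(extractTag(xml, "bgMaxMarkovOrder"));
    }

    static void appendTag(StringBuffer xml, String tag, String value) {
        xml.append('<').append(tag).append('>').append(value)
           .append("</").append(tag).append('>');
    }

    static String extractTag(StringBuffer xml, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = xml.indexOf(open);
        int end = xml.indexOf(close);
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("missing tag: " + tag);
        }
        return xml.substring(start + open.length(), end);
    }
}
```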

train

public void train(Sample data,
                  double[] weights)
           throws Exception
Description copied from interface: Model
Trains the Model object given the data as Sample using the specified weights. The weight at position i belongs to the element at position i, so the array weights should have as many entries as there are sequences in the sample. (Optionally, weights == null can be used if all weights have the value one.)
This method should work non-incrementally. That is, the result of the series train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.

Specified by:
train in interface Model
Overrides:
train in class AbstractMixtureModel
Parameters:
data - the given sequences
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - an Exception should be thrown if the training did not succeed (e.g. the dimension of weights and the number of sequences in the sample do not match).
See Also:
Sample.getElementAt(int), Sample.ElementEnumerator
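
The contract of train (one weight per sequence, null meaning weight one, and non-incremental behaviour) can be sketched with a toy model that only estimates symbol frequencies. The class below is hypothetical, works on integer-encoded sequences instead of a Sample, and is not related to the EM or Gibbs sampling training of this class.

```java
// Illustrative sketch only: a toy model, not a Jstacs Model.
final class NonIncrementalTrainingSketch {
    private final int alphabetSize;
    private double[] probs; // null until the model has been trained

    NonIncrementalTrainingSketch(int alphabetSize) {
        this.alphabetSize = alphabetSize;
    }

    void train(int[][] data, double[] weights) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        // Fresh counts on every call: nothing is carried over from earlier training.
        double[] counts = new double[alphabetSize];
        double total = 0;
        for (int n = 0; n < data.length; n++) {
            double w = weights == null ? 1.0 : weights[n]; // null means weight 1
            if (w < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            for (int symbol : data[n]) {
                counts[symbol] += w;
                total += w;
            }
        }
        if (total <= 0) {
            throw new IllegalArgumentException("no weighted observations");
        }
        for (int k = 0; k < alphabetSize; k++) {
            counts[k] /= total;
        }
        probs = counts;
    }

    double getProbFor(int symbol) {
        if (probs == null) {
            throw new IllegalStateException("model is not trained");
        }
        return probs[symbol];
    }
}
```

Calling train on a first data set and then on a second one leaves a model that reflects only the second data set.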

trainBgModel

public final void trainBgModel(Sample data,
                               double[] weights)
                        throws Exception
This method trains the bg-model. This can be useful if the bg-model is not trained during the EM-algorithm.

Parameters:
data - the sample
weights - the weights
Throws:
Exception - if something went wrong
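
As a rough illustration of training a background model on weighted sequences, the following sketch estimates a homogeneous Markov model of order 1 from integer-encoded sequences. All names are hypothetical, and the model type, order (see bgMaxMarkovOrder) and estimator that HiddenMotifMixture actually uses are not specified here.

```java
// Illustrative sketch only: not the Jstacs background model.
final class BackgroundTrainingSketch {

    /**
     * Estimates transition probabilities t[a][b] = P(next = b | previous = a).
     * weights == null means weight 1 for every sequence.
     */
    static double[][] trainOrder1(int[][] data, double[] weights, int alphabetSize) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        double[][] t = new double[alphabetSize][alphabetSize];
        for (int n = 0; n < data.length; n++) {
            double w = weights == null ? 1.0 : weights[n];
            for (int i = 1; i < data[n].length; i++) {
                t[data[n][i - 1]][data[n][i]] += w; // weighted transition count
            }
        }
        for (double[] row : t) {
            double sum = 0;
            for (double c : row) {
                sum += c;
            }
            for (int b = 0; b < alphabetSize; b++) {
                // contexts that were never observed fall back to a uniform distribution
                row[b] = sum > 0 ? row[b] / sum : 1.0 / alphabetSize;
            }
        }
        return t;
    }
}
```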

getNewParameters

protected void getNewParameters(int iteration,
                                double[][] seqWeights,
                                double[] w)
                         throws Exception
Description copied from class: AbstractMixtureModel
This method trains the internal models on the internal sample and the given weights.

Overrides:
getNewParameters in class AbstractMixtureModel
Parameters:
iteration - the number of times this method has been invoked
seqWeights - the weights for each model and sequence
w - the weights for the components
Throws:
Exception - if the training of the internal models went wrong

checkLength

protected void checkLength(int index,
                           int l)
Description copied from class: AbstractMixtureModel
This method checks whether the length l of the model with index index is suitable for the current instance. Otherwise an IllegalArgumentException is thrown.

Overrides:
checkLength in class AbstractMixtureModel
Parameters:
index - the index of the model
l - the length of the model

getMinimalSequenceLength

public abstract int getMinimalSequenceLength()
Returns the minimal length that a sequence, or the sequences of a sample, must have.

Returns:
the minimal length that a sequence, or the sequences of a sample, must have

getInstanceName

public String getInstanceName()
Description copied from interface: Model
Should return a short instance name such as iMM(0), BN(2), ...

Specified by:
getInstanceName in interface Model
Overrides:
getInstanceName in class AbstractMixtureModel
Returns:
a short instance name

getLogPriorTerm

public double getLogPriorTerm()
                       throws Exception
Description copied from interface: Model
Returns a value that is proportional to the log of the prior. For ML 0 should be returned.

Specified by:
getLogPriorTerm in interface Model
Overrides:
getLogPriorTerm in class AbstractMixtureModel
Returns:
a value that is proportional to the log of the prior
Throws:
Exception - if something went wrong
See Also:
Model.getPriorTerm()

toString

public String toString()
Description copied from interface: Model
Should give a simple representation (text) of the model as String.

Specified by:
toString in interface Model
Overrides:
toString in class Object
Returns:
the representation as String

emitSampleUsingCurrentParameterSet

protected Sequence[] emitSampleUsingCurrentParameterSet(int n,
                                                        int... lengths)
                                                 throws Exception
Standard implementation throwing an OperationNotSupportedException.

Specified by:
emitSampleUsingCurrentParameterSet in class AbstractMixtureModel
Parameters:
n - the number of sequences to be sampled
lengths - the corresponding lengths
Returns:
an array of sequences
Throws:
Exception - if it was impossible to sample
See Also:
AbstractModel.emitSample(int, int...)