public abstract class HiddenMotifMixture extends AbstractMixtureTrainSM implements MotifDiscoverer
Nested classes inherited from AbstractMixtureTrainSM: AbstractMixtureTrainSM.Algorithm, AbstractMixtureTrainSM.Parameterization
Nested classes inherited from MotifDiscoverer: MotifDiscoverer.KindOfProfile

| Modifier and Type | Field and Description |
|---|---|
| protected PositionPrior | posPrior: The prior for the positions. |
Fields inherited from AbstractMixtureTrainSM: algorithm, algorithmHasBeenRun, alternativeModel, best, burnInTest, componentHyperParams, compProb, counter, dimension, estimateComponentProbs, file, filereader, filewriter, initialIteration, logWeights, model, optimizeModel, sample, samplingIndex, seqWeights, sostream, starts, stationaryIteration, weights
Fields inherited from AbstractTrainableStatisticalModel: alphabets, length

| Modifier | Constructor and Description |
|---|---|
| protected | HiddenMotifMixture(StringBuffer xml): The standard constructor for the interface Storable. |
| protected | HiddenMotifMixture(TrainableStatisticalModel[] models, boolean[] optimzeArray, int components, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, AbstractMixtureTrainSM.Algorithm algorithm, double alpha, TerminationCondition tc, AbstractMixtureTrainSM.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest): Creates a new HiddenMotifMixture. |
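The second constructor is documented to reject some argument combinations with an IllegalArgumentException. The following is a minimal, self-contained sketch of those documented conditions only. The names ArgumentCheckSketch and checkArguments are hypothetical and not part of the library, and the unspecified check that componentHyperParams "are not correct" is left out.

```java
public class ArgumentCheckSketch {

    // Hypothetical helper mirroring the documented IllegalArgumentException
    // conditions; this is not the library's implementation.
    static void checkArguments(double[] weights, int starts) {
        if (starts < 1) {
            throw new IllegalArgumentException("starts must be at least 1");
        }
        if (weights != null) {
            // The documented condition is weights.length != 2.
            if (weights.length != 2) {
                throw new IllegalArgumentException("weights must have length 2");
            }
            for (int i = 0; i < weights.length; i++) {
                if (weights[i] < 0) {
                    throw new IllegalArgumentException("weights[" + i + "] is negative");
                }
            }
        }
    }

    public static void main(String[] args) {
        checkArguments(null, 1);                     // accepted: null weights
        checkArguments(new double[] {0.3, 0.7}, 5);  // accepted
        try {
            checkArguments(new double[] {0.3, -0.7}, 5);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking starts before weights is an arbitrary ordering choice in this sketch; the documentation does not state an order.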
| Modifier and Type | Method and Description |
|---|---|
| protected void | checkLength(int index, int l): This method checks if the length l of the model with index index is capable for the current instance. |
| HiddenMotifMixture | clone(): Follows the conventions of Object's clone()-method. |
| protected Sequence[] | emitDataSetUsingCurrentParameterSet(int n, int... lengths): Standard implementation throwing an OperationNotSupportedException. |
| protected void | extractFurtherInformation(StringBuffer xml): This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance. |
| protected StringBuffer | getFurtherInformation(): This method is used in the subclasses to append further information to the XML representation. |
| String | getInstanceName(): Should return a short instance name such as iMM(0), BN(2), ... |
| abstract int | getMinimalSequenceLength(): Returns the minimal length a sequence respectively a data set has to have. |
| protected void | getNewParameters(int iteration, double[][] seqWeights, double[] w): This method trains the internal models on the internal data set and the given weights. |
| String | toString(NumberFormat nf): This method returns a String representation of the instance. |
| void | train(DataSet data, double[] weights): Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. |
| abstract void | trainBgModel(DataSet data, double[] weights): This method trains the background model. |
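The clone() entry above refers to the conventions of Object's clone()-method: call super.clone(), deeply copy every member that is not of a simple data type, and return the copy. A minimal sketch of that pattern follows. The class Mixture and its fields are hypothetical stand-ins for illustration and are not taken from the library.

```java
import java.util.Arrays;

public class CloneSketch {

    // Hypothetical stand-in for a cloneable model class.
    static class Mixture implements Cloneable {
        int starts;        // simple data type: copied by super.clone()
        double[] weights;  // array: has to be deeply copied

        Mixture(int starts, double[] weights) {
            this.starts = starts;
            this.weights = weights;
        }

        @Override
        public Mixture clone() throws CloneNotSupportedException {
            Mixture o = (Mixture) super.clone(); // shallow copy of all fields
            o.weights = weights.clone();         // deep copy of the array member
            return o;
        }
    }

    public static void main(String[] args) throws CloneNotSupportedException {
        Mixture a = new Mixture(3, new double[] {0.25, 0.75});
        Mixture b = a.clone();
        b.weights[0] = 0.9; // must not affect the original
        System.out.println(Arrays.toString(a.weights)); // prints [0.25, 0.75]
        System.out.println(Arrays.toString(b.weights)); // prints [0.9, 0.75]
    }
}
```

Without the deep copy of weights, both instances would share one array and the change to b would also be visible in a.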
Methods inherited from AbstractMixtureTrainSM: algorithmHasBeenRun, checkModelsForGibbsSampling, continueIterations, continueIterations, createSeqWeightsArray, doFirstIteration, doFirstIteration, doFirstIteration, draw, emitDataSet, extendSampling, finalize, fromXML, getCharacteristics, getIndexOfMaximalComponentFor, getLogPriorTerm, getLogPriorTermForComponentProbs, getLogProbFor, getLogProbFor, getLogProbFor, getLogProbUsingCurrentParameterSetFor, getLogScoreFor, getModel, getModels, getMRG, getMRGParams, getNameOfAlgorithm, getNewComponentProbs, getNewParametersForModel, getNewWeights, getNumberOfComponents, getNumericalCharacteristics, getScoreForBestRun, getWeights, initModelForSampling, initWithPrior, isInitialized, isInSamplingMode, iterate, iterate, max, modifyWeights, parseNextParameterSet, parseParameterSet, samplingStopped, setAlpha, setOutputStream, setTrainData, setWeights, swap, toXML
Methods inherited from AbstractTrainableStatisticalModel: check, getAlphabetContainer, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, toString, train
Methods inherited from java.lang.Object: equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from MotifDiscoverer: getGlobalIndexOfMotifInComponent, getIndexOfMaximalComponentFor, getMotifLength, getNumberOfComponents, getNumberOfMotifs, getNumberOfMotifsInComponent, getProfileOfScoresFor, getStrandProbabilitiesFor

protected PositionPrior posPrior
The prior for the positions.

protected HiddenMotifMixture(TrainableStatisticalModel[] models, boolean[] optimzeArray, int components, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, AbstractMixtureTrainSM.Algorithm algorithm, double alpha, TerminationCondition tc, AbstractMixtureTrainSM.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest) throws CloneNotSupportedException, IllegalArgumentException, WrongAlphabetException
Creates a new HiddenMotifMixture. This constructor can be used for any algorithm since it takes all necessary values as parameters.
Parameters:
models - the single models building the HiddenMotifMixture; if the model is trained using AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING, the models that will be adjusted have to implement SamplingComponent. The models that are used for the flanking sequences have to be able to score sequences of arbitrary length.
optimzeArray - an array of switches indicating whether or not to train the corresponding model
components - the number of components (e.g. for ZOOPS this is 2)
starts - the number of times the algorithm will be started in the train-method, at least 1
estimateComponentProbs - the switch for estimating the component probabilities in the algorithm or holding them fixed; if the component parameters are fixed, the values of weights will be used, otherwise the componentHyperParams will be incorporated in the adjustment
componentHyperParams - the hyperparameters for the component assignment prior; relevant if estimateComponentProbs == true; the array has to be null or has to have length dimension; if it is null or an array with all values zero (0), then ML is used
weights - null or the weights for the components (then weights.length == dimension)
posPrior - this object determines the positional distribution that shall be used
algorithm - either AbstractMixtureTrainSM.Algorithm.EM or AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
alpha - only for AbstractMixtureTrainSM.Algorithm.EM; used in train to initialize the gammas. It is recommended to use alpha = 1 (uniform distribution on a simplex).
tc - only for AbstractMixtureTrainSM.Algorithm.EM; the TerminationCondition for stopping the EM-algorithm; tc has to return true from TerminationCondition.isSimple()
parametrization - only for AbstractMixtureTrainSM.Algorithm.EM; either AbstractMixtureTrainSM.Parameterization.THETA or AbstractMixtureTrainSM.Parameterization.LAMBDA
initialIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING (... stationaryIteration/starts)
stationaryIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
burnInTest - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
Throws:
IllegalArgumentException - if weights != null && weights.length != 2; if weights != null and there exists an i where weights[i] < 0; if starts < 1; or if componentHyperParams are not correct
WrongAlphabetException - if not all models work on the same simple alphabet
CloneNotSupportedException - if the models cannot be cloned

protected HiddenMotifMixture(StringBuffer xml) throws NonParsableException
The standard constructor for the interface Storable. Creates a new HiddenMotifMixture out of its XML representation.
Parameters:
xml - the XML representation of the model as StringBuffer
Throws:
NonParsableException - if the StringBuffer cannot be parsed

public HiddenMotifMixture clone() throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone()-method.
Specified by: clone in interface MotifDiscoverer, clone in interface SequenceScore, clone in interface TrainableStatisticalModel
Overrides: clone in class AbstractMixtureTrainSM
Returns: a copy of the current AbstractTrainableStatisticalModel (the member-AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractTrainableStatisticalModel. Hence X's clone()-method should work as: (1) Object o = (X)super.clone(); (2) all member variables of o defined by X that are not of simple data-types like int, double, ... have to be deeply copied; (3) return o.
Throws:
CloneNotSupportedException - if something went wrong while cloning
See Also: Cloneable

protected StringBuffer getFurtherInformation()
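Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to append further information to the XML representation.
Overrides: getFurtherInformation in class AbstractMixtureTrainSM
See Also: AbstractMixtureTrainSM.extractFurtherInformation(StringBuffer)

protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException
Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance.
Overrides: extractFurtherInformation in class AbstractMixtureTrainSM
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the XML representation is not parsable
See Also: AbstractMixtureTrainSM.getFurtherInformation()

public void train(DataSet data, double[] weights) throws Exception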
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i, so the array weights should have as many entries as the data set has sequences. (Optionally it is possible to use weights == null if all weights have the value one.) After the calls train(data1); train(data2), the model should be fully trained over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.
Specified by: train in interface TrainableStatisticalModel
Overrides: train in class AbstractMixtureTrainSM
Parameters:
data - the given sequences as DataSet
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the data set do not match)
See Also: DataSet.getElementAt(int), DataSet.ElementEnumerator

protected void getNewParameters(int iteration, double[][] seqWeights, double[] w) throws Exception
Description copied from class: AbstractMixtureTrainSM
This method trains the internal models on the internal data set and the given weights.
Overrides: getNewParameters in class AbstractMixtureTrainSM
Parameters:
iteration - the number of times this method has been invoked
seqWeights - the weights for each model and sequence
w - the weights for the components
Throws:
Exception - if the training of the internal models went wrong

public abstract void trainBgModel(DataSet data, double[] weights) throws Exception
This method trains the background model.
Parameters:
data - the data set
weights - the weights
Throws:
Exception - if something went wrong

protected void checkLength(int index, int l)
Description copied from class: AbstractMixtureTrainSM
This method checks whether the length l of the model with index index is suitable for the current instance. Otherwise an IllegalArgumentException is thrown.
Overrides: checkLength in class AbstractMixtureTrainSM
Parameters:
index - the index of the model
l - the length of the model

public abstract int getMinimalSequenceLength()
Returns the minimal length a sequence or a data set has to have.
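As a rough illustration of a length check in the spirit of checkLength(int, int), the sketch below rejects unsuitable model lengths with an IllegalArgumentException. The concrete rule is an assumption that this page does not state: model index 0 is treated as the motif model and must have a fixed positive length, and every other model is treated as a flanking model that reports length 0 to signal "arbitrary length". LengthCheckSketch and MOTIF_INDEX are hypothetical names.

```java
public class LengthCheckSketch {

    // Assumption: the motif model sits at index 0.
    static final int MOTIF_INDEX = 0;

    // Throws an IllegalArgumentException if length l does not fit the
    // assumed role of the model at the given index.
    static void checkLength(int index, int l) {
        if (index == MOTIF_INDEX) {
            if (l <= 0) {
                throw new IllegalArgumentException(
                        "The motif model must have a fixed positive length.");
            }
        } else if (l != 0) {
            // Assumption: length 0 encodes "can score arbitrary lengths".
            throw new IllegalArgumentException(
                    "Model " + index + " must accept sequences of arbitrary length.");
        }
    }

    public static void main(String[] args) {
        checkLength(0, 8); // accepted: motif of length 8
        checkLength(1, 0); // accepted: flanking model of arbitrary length
        try {
            checkLength(1, 5);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```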

public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
Specified by: getInstanceName in interface SequenceScore
Overrides: getInstanceName in class AbstractMixtureTrainSM

public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.
Specified by: toString in interface SequenceScore
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns: a String representation of the instance

protected Sequence[] emitDataSetUsingCurrentParameterSet(int n, int... lengths) throws Exception
Standard implementation throwing an OperationNotSupportedException.
Overrides: emitDataSetUsingCurrentParameterSet in class AbstractMixtureTrainSM
Parameters:
n - the number of sequences to be sampled
lengths - the corresponding lengths
Throws:
Exception - if it was impossible to sample the sequences
See Also: StatisticalModel.emitDataSet(int, int...)
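
A caller that tries to sample from such a model therefore has to expect an exception. The sketch below shows the "standard implementation throwing" pattern in isolation. It assumes the JDK class javax.naming.OperationNotSupportedException, which this page does not confirm as the class in use, and it returns String[] instead of Sequence[] to stay self-contained. EmitSketch is a hypothetical name.

```java
import javax.naming.OperationNotSupportedException;

public class EmitSketch {

    // Stand-in for a method that does not support sampling and always throws.
    static String[] emitDataSetUsingCurrentParameterSet(int n, int... lengths)
            throws Exception {
        throw new OperationNotSupportedException(
                "Emitting sequences is not supported by this sketch.");
    }

    public static void main(String[] args) throws Exception {
        try {
            emitDataSetUsingCurrentParameterSet(5, 20);
        } catch (OperationNotSupportedException e) {
            // Handle the unsupported operation instead of failing.
            System.out.println("not supported: " + e.getMessage());
        }
    }
}
```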