java.lang.Object
de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM
de.jstacs.sequenceScores.statisticalModels.trainable.mixture.motif.HiddenMotifMixture
public abstract class HiddenMotifMixture
extends AbstractMixtureTrainSM
implements MotifDiscoverer
This is the main class that every generative motif discoverer should extend.
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM |
|---|
| AbstractMixtureTrainSM.Algorithm, AbstractMixtureTrainSM.Parameterization |
| Nested classes/interfaces inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer |
|---|
| MotifDiscoverer.KindOfProfile |
| Field Summary | |
|---|---|
| protected PositionPrior | posPrior: The prior for the positions. |
| Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM |
|---|
| algorithm, algorithmHasBeenRun, alternativeModel, best, burnInTest, componentHyperParams, compProb, counter, dimension, estimateComponentProbs, file, filereader, filewriter, initialIteration, logWeights, model, optimizeModel, sample, samplingIndex, seqWeights, sostream, starts, stationaryIteration, weights |
| Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
|---|
| alphabets, length |
| Constructor Summary | |
|---|---|
| protected | HiddenMotifMixture(StringBuffer xml): The standard constructor for the interface Storable. |
| protected | HiddenMotifMixture(TrainableStatisticalModel[] models, boolean[] optimzeArray, int components, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, AbstractMixtureTrainSM.Algorithm algorithm, double alpha, TerminationCondition tc, AbstractMixtureTrainSM.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest): Creates a new HiddenMotifMixture. |
| Method Summary | |
|---|---|
| protected void | checkLength(int index, int l): This method checks whether the length l of the model with index index is suitable for the current instance. |
| HiddenMotifMixture | clone(): Follows the conventions of Object's clone()-method. |
| protected Sequence[] | emitDataSetUsingCurrentParameterSet(int n, int... lengths): Standard implementation throwing an OperationNotSupportedException. |
| protected void | extractFurtherInformation(StringBuffer xml): This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance. |
| protected StringBuffer | getFurtherInformation(): This method is used in the subclasses to append further information to the XML representation. |
| String | getInstanceName(): Should return a short instance name such as iMM(0), BN(2), ... |
| abstract int | getMinimalSequenceLength(): Returns the minimal length that a sequence or a data set must have. |
| protected void | getNewParameters(int iteration, double[][] seqWeights, double[] w): This method trains the internal models on the internal data set and the given weights. |
| String | toString(NumberFormat nf): This method returns a String representation of the instance. |
| void | train(DataSet data, double[] weights): Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. |
| abstract void | trainBgModel(DataSet data, double[] weights): This method trains the background model. |
| Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
|---|
| check, getAlphabetContainer, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, toString, train |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Methods inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer |
|---|
| getGlobalIndexOfMotifInComponent, getIndexOfMaximalComponentFor, getMotifLength, getNumberOfComponents, getNumberOfMotifs, getNumberOfMotifsInComponent, getProfileOfScoresFor, getStrandProbabilitiesFor |
| Methods inherited from interface de.jstacs.Storable |
|---|
| toXML |
| Field Detail |
|---|
protected PositionPrior posPrior
The prior for the positions.
| Constructor Detail |
|---|
protected HiddenMotifMixture(TrainableStatisticalModel[] models,
boolean[] optimzeArray,
int components,
int starts,
boolean estimateComponentProbs,
double[] componentHyperParams,
double[] weights,
PositionPrior posPrior,
AbstractMixtureTrainSM.Algorithm algorithm,
double alpha,
TerminationCondition tc,
AbstractMixtureTrainSM.Parameterization parametrization,
int initialIteration,
int stationaryIteration,
BurnInTest burnInTest)
throws CloneNotSupportedException,
IllegalArgumentException,
WrongAlphabetException
Creates a new HiddenMotifMixture. This constructor can be used
for any algorithm since it takes all necessary values as parameters.
models - the single models building the HiddenMotifMixture; if the model is trained using AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING, the models that will be adjusted have to implement SamplingComponent. The models that are used for the flanking sequences have to be able to score sequences of arbitrary length.
optimzeArray - an array of switches that determine whether the corresponding model is trained
components - the number of components (e.g. for ZOOPS this is 2)
starts - the number of times the algorithm will be started in the train-method, at least 1
estimateComponentProbs - the switch for estimating the component probabilities in the algorithm or holding them fixed; if the component parameters are fixed, the values of weights will be used, otherwise the componentHyperParams will be incorporated in the adjustment
componentHyperParams - the hyperparameters for the component assignment prior; only used if estimateComponentProbs == true; the array has to be null or has to have length dimension; if it is null or an array with all values zero (0), then ML is used; … parameterization
weights - null or the weights for the components (then weights.length == dimension)
posPrior - this object determines the positional distribution that shall be used
algorithm - either AbstractMixtureTrainSM.Algorithm.EM or AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
alpha - only for AbstractMixtureTrainSM.Algorithm.EM; used in train to initialize the gammas. It is recommended to use alpha = 1 (uniform distribution on a simplex).
tc - only for AbstractMixtureTrainSM.Algorithm.EM; the TerminationCondition for stopping the EM-algorithm, tc has to return true from TerminationCondition.isSimple()
parametrization - only for AbstractMixtureTrainSM.Algorithm.EM; AbstractMixtureTrainSM.Parameterization.THETA or AbstractMixtureTrainSM.Parameterization.LAMBDA; … AbstractMixtureTrainSM.Parameterization.LAMBDA
initialIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING (… stationaryIteration/starts)
stationaryIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
burnInTest - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
IllegalArgumentException - if weights != null && weights.length != 2; if weights != null and there exists an i where weights[i] < 0; if starts < 1; or if componentHyperParams are not correct
WrongAlphabetException - if not all models work on the same simple alphabet
CloneNotSupportedException - if the models cannot be cloned
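The argument conditions listed for this constructor can be illustrated with a small stand-alone check. This is only a sketch: MixtureArgumentCheck and its methods are hypothetical names that do not exist in Jstacs, and comparing weights.length with the number of components (rather than with the literal 2 mentioned in the exception list) is an assumption taken from the description of weights.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; it is not part of Jstacs and
// mirrors just the conditions named in the constructor documentation.
final class MixtureArgumentCheck {

    private MixtureArgumentCheck() {
    }

    /** Throws an IllegalArgumentException if a documented condition is violated. */
    static void check(int starts, int dimension, double[] weights, double[] componentHyperParams) {
        if (starts < 1) {
            throw new IllegalArgumentException("starts has to be at least 1");
        }
        if (weights != null) {
            // Assumption: one weight per component.
            if (weights.length != dimension) {
                throw new IllegalArgumentException("weights.length has to equal the number of components");
            }
            for (double w : weights) {
                if (w < 0) {
                    throw new IllegalArgumentException("weights have to be non-negative");
                }
            }
        }
        // The hyperparameter array has to be null or has to have length dimension.
        if (componentHyperParams != null && componentHyperParams.length != dimension) {
            throw new IllegalArgumentException("componentHyperParams has to be null or of length dimension");
        }
    }

    /** Returns true for null or an all-zero array, the documented ML case. */
    static boolean usesMaximumLikelihood(double[] componentHyperParams) {
        return componentHyperParams == null
                || Arrays.stream(componentHyperParams).allMatch(h -> h == 0.0);
    }

    public static void main(String[] args) {
        check(1, 2, new double[] {0.5, 0.5}, null);
        System.out.println(usesMaximumLikelihood(new double[] {0.0, 0.0}));
    }
}
```

Checking every argument before any model is cloned or stored keeps a rejected call from leaving a half-initialized instance behind.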
protected HiddenMotifMixture(StringBuffer xml)
throws NonParsableException
The standard constructor for the interface Storable.
Creates a new HiddenMotifMixture out of its XML representation.
xml - the XML representation of the model as StringBuffer
NonParsableException - if the StringBuffer can not be parsed
| Method Detail |
|---|
public HiddenMotifMixture clone()
throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone()-method.
clone in interface MotifDiscoverer
clone in interface SequenceScore
clone in interface TrainableStatisticalModel
clone in class AbstractMixtureTrainSM
Returns: a copy of the current AbstractTrainableStatisticalModel (the member-AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractTrainableStatisticalModel. Hence X's clone()-method should work as: (1) Object o = (X)super.clone(); (2) all member variables of o defined by X that are not of simple data-types like int, double, ... have to be deeply copied; (3) return o
CloneNotSupportedException - if something went wrong while cloning
See also: Cloneable
protected StringBuffer getFurtherInformation()
Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to append further information to the XML representation.
getFurtherInformation in class AbstractMixtureTrainSM
See also: AbstractMixtureTrainSM.extractFurtherInformation(StringBuffer)
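The clone() convention described for this class (call super.clone(), deeply copy every member that is not of a simple data type, return the copy) can be sketched as follows. BaseModel, MotifModel and CloneSketch are hypothetical stand-ins, not Jstacs classes; the sketch only illustrates the pattern.

```java
// Hypothetical stand-ins for illustration; these are not Jstacs classes.
class BaseModel implements Cloneable {
    protected int length = 8; // simple data type, copied by super.clone()

    @Override
    public BaseModel clone() throws CloneNotSupportedException {
        return (BaseModel) super.clone();
    }
}

class MotifModel extends BaseModel {
    protected double[] weights = {0.25, 0.75}; // not a simple data type

    @Override
    public MotifModel clone() throws CloneNotSupportedException {
        MotifModel o = (MotifModel) super.clone(); // step 1: shallow copy
        o.weights = weights.clone();               // step 2: deep copy of the array
        return o;                                  // step 3: return the copy
    }
}

class CloneSketch {
    public static void main(String[] args) throws CloneNotSupportedException {
        MotifModel original = new MotifModel();
        MotifModel copy = original.clone();
        copy.weights[0] = 0.9;
        // The original array is unaffected by the change to the copy.
        System.out.println(original.weights[0]);
    }
}
```

Without step 2, both instances would share one array, and training the copy would silently change the original.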
protected void extractFurtherInformation(StringBuffer xml)
throws NonParsableException
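Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance.
extractFurtherInformation in class AbstractMixtureTrainSM
xml - the XML representation
NonParsableException - if the XML representation is not parsable
See also: AbstractMixtureTrainSM.getFurtherInformation()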
public void train(DataSet data,
double[] weights)
throws Exception
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i, so the array weights should have the number of sequences in the data set as dimension. (Optionally it is possible to use weights == null if all weights have the value one.) The result of the calls train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.
train in interface TrainableStatisticalModel
train in class AbstractMixtureTrainSM
data - the given sequences as DataSet
weights - the weights of the elements, each weight should be non-negative
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the data set do not match)
See also: DataSet.getElementAt(int), DataSet.ElementEnumerator
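The training contract described above (one non-negative weight per sequence, null meaning that all weights are one, and no accumulation across calls) can be illustrated with a toy model. ToyLengthModel is a hypothetical class, not part of Jstacs, and it estimates only a weighted mean sequence length; it is a sketch of the contract, not of the mixture training itself.

```java
// Hypothetical toy model for illustration; it is not a Jstacs class.
class ToyLengthModel {
    private double weightedMeanLength = Double.NaN;

    /** Trains from scratch: results of earlier calls are discarded. */
    void train(String[] data, double[] weights) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        double sum = 0.0;   // fresh accumulators on every call,
        double total = 0.0; // so nothing from a previous data set survives
        for (int i = 0; i < data.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i]; // null: all weights are one
            if (w < 0) {
                throw new IllegalArgumentException("weights have to be non-negative");
            }
            sum += w * data[i].length();
            total += w;
        }
        weightedMeanLength = sum / total;
    }

    double getWeightedMeanLength() {
        return weightedMeanLength;
    }
}

class TrainSketch {
    public static void main(String[] args) {
        ToyLengthModel model = new ToyLengthModel();
        model.train(new String[] {"ACGT", "AC"}, null);           // data1
        model.train(new String[] {"ACGTAC"}, new double[] {2.0}); // data2
        // Reflects data2 only, not data1 + data2.
        System.out.println(model.getWeightedMeanLength());
    }
}
```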
protected void getNewParameters(int iteration,
double[][] seqWeights,
double[] w)
throws Exception
Description copied from class: AbstractMixtureTrainSM
This method trains the internal models on the internal data set and the given weights.
getNewParameters in class AbstractMixtureTrainSM
iteration - the number of times this method has been invoked
seqWeights - the weights for each model and sequence
w - the weights for the components
Exception - if the training of the internal models went wrong
public abstract void trainBgModel(DataSet data,
double[] weights)
throws Exception
This method trains the background model.
data - the data set
weights - the weights
Exception - if something went wrong
protected void checkLength(int index,
int l)
Description copied from class: AbstractMixtureTrainSM
This method checks whether the length l of the model with index index is suitable for the current instance. Otherwise an IllegalArgumentException is thrown.
checkLength in class AbstractMixtureTrainSM
index - the index of the model
l - the length of the model
public abstract int getMinimalSequenceLength()
Returns the minimal length that a sequence or a data set must have.
public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
getInstanceName in interface SequenceScore
getInstanceName in class AbstractMixtureTrainSM
public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.
toString in interface SequenceScore
nf - the NumberFormat for the String representation of parameters or probabilities
Returns: a String representation of the instance
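How a NumberFormat argument controls the printed precision of parameters or probabilities can be sketched with plain JDK classes. ToyComponentModel is a hypothetical stand-in, not a Jstacs class, and the output layout is an assumption made for the example.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical stand-in for illustration; it is not a Jstacs class.
class ToyComponentModel {
    private final double[] componentProbs = {0.123456, 0.876544};

    /** Builds a String representation, formatting each probability with nf. */
    String toString(NumberFormat nf) {
        StringBuilder sb = new StringBuilder("component probabilities:");
        for (double p : componentProbs) {
            sb.append(' ').append(nf.format(p));
        }
        return sb.toString();
    }
}

class ToStringSketch {
    public static void main(String[] args) {
        // A fixed locale keeps the decimal separator independent of the platform.
        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        nf.setMaximumFractionDigits(3);
        System.out.println(new ToyComponentModel().toString(nf));
    }
}
```

Passing the formatter in, rather than hard-coding it, lets the caller choose the precision and the locale per output.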
protected Sequence[] emitDataSetUsingCurrentParameterSet(int n,
int... lengths)
throws Exception
Standard implementation throwing an OperationNotSupportedException.
emitDataSetUsingCurrentParameterSet in class AbstractMixtureTrainSM
n - the number of sequences to be sampled
lengths - the corresponding lengths
Exception - if it was impossible to sample the sequences
See also: StatisticalModel.emitDataSet(int, int...)
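A default implementation that rejects sampling unless a subclass overrides it can be sketched as follows. The classes below are hypothetical stand-ins, not Jstacs classes, and the use of javax.naming.OperationNotSupportedException from the JDK is an assumption about which exception class is meant.

```java
import java.util.Arrays;
import javax.naming.OperationNotSupportedException;

// Hypothetical stand-ins for illustration; these are not Jstacs classes.
abstract class ToyGenerativeModel {
    /** Default implementation: sampling is not supported unless overridden. */
    protected String[] emit(int n, int... lengths) throws Exception {
        throw new OperationNotSupportedException(
                "sampling is not supported by " + getClass().getSimpleName());
    }
}

// Keeps the default and therefore cannot sample.
class NonSamplingModel extends ToyGenerativeModel {
}

// Overrides the default; emits sequences that consist of 'A' only.
class ConstantModel extends ToyGenerativeModel {
    @Override
    protected String[] emit(int n, int... lengths) {
        if (lengths.length != n) {
            throw new IllegalArgumentException("one length per sequence is required");
        }
        String[] result = new String[n];
        for (int i = 0; i < n; i++) {
            char[] symbols = new char[lengths[i]];
            Arrays.fill(symbols, 'A');
            result[i] = new String(symbols);
        }
        return result;
    }
}

class EmitSketch {
    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(new ConstantModel().emit(2, 3, 1)));
        try {
            new NonSamplingModel().emit(1, 5);
        } catch (OperationNotSupportedException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```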