de.jstacs.models.discrete.homogeneous
Class HomogeneousModel

java.lang.Object
  extended by de.jstacs.models.AbstractModel
      extended by de.jstacs.models.discrete.DiscreteGraphicalModel
          extended by de.jstacs.models.discrete.homogeneous.HomogeneousModel
All Implemented Interfaces:
InstantiableFromParameterSet, Model, Storable, Cloneable
Direct Known Subclasses:
HomogeneousMM

public abstract class HomogeneousModel
extends DiscreteGraphicalModel

This abstract class is the base class for homogeneous discrete models.

Author:
Jens Keilwagen
See Also:
HomogeneousModelParameterSet

Nested Class Summary
protected  class HomogeneousModel.HomCondProb
          This class handles the (conditional) homogeneous probabilities in a fast way.
 
Field Summary
protected  byte order
          The Markov order of the model.
protected  int[] powers
          The powers of the alphabet length.
 
Fields inherited from class de.jstacs.models.discrete.DiscreteGraphicalModel
params, trained
 
Fields inherited from class de.jstacs.models.AbstractModel
alphabets, length
 
Constructor Summary
HomogeneousModel(HomogeneousModelParameterSet params)
          Creates a homogeneous model from a parameter set.
HomogeneousModel(StringBuffer stringBuff)
          Creates a homogeneous model from a StringBuffer.
 
Method Summary
protected  void check(Sequence sequence, int startpos, int endpos)
          Checks that the model is trained and that the arguments are valid.
protected  int chooseFromDistr(Constraint distr, int start, int end, double randNo)
          Chooses a value in [0,end-start] according to the distribution encoded in the frequencies of distr between the indices start and end.
protected  HomogeneousModel.HomCondProb[] cloneHomProb(HomogeneousModel.HomCondProb[] p)
          Clones the given conditional probabilities.
 Sample emitSample(int no, int... length)
          Creates a sample of no sequences.
 double getLogProbFor(Sequence sequence, int startpos, int endpos)
          Returns the logarithm of the probability of the given sequence given the model.
 byte getMaximalMarkovOrder()
          This method returns the maximal used Markov order if possible.
 NumericalResultSet getNumericalCharacteristics()
          Returns the subset of numerical values that are also returned by getCharacteristics.
 double getProbFor(Sequence sequence, int startpos, int endpos)
          Returns the probability of the given sequence given the model.
protected abstract  Sequence getRandomSequence(Random r, int length)
          This method creates a sequence from a trained model.
protected abstract  double logProbFor(Sequence sequence, int startpos, int endpos)
          This method computes the logarithm of the probability of the given sequence in the given interval.
protected abstract  double probFor(Sequence sequence, int startpos, int endpos)
          This method computes the probability of the given sequence in the given interval.
protected  void set(DGMParameterSet params, boolean trained)
          Sets the parameters as internal parameters and does some essential computations.
 void train(Sample[] data)
          Trains the model on all given samples.
abstract  void train(Sample[] data, double[][] weights)
          Trains the model using an array of weighted samples.
 
Methods inherited from class de.jstacs.models.discrete.DiscreteGraphicalModel
clone, fromXML, getCurrentParameterSet, getDescription, getESS, getFurtherModelInfos, getXMLTag, isTrained, setFurtherModelInfos, toString, toXML
 
Methods inherited from class de.jstacs.models.AbstractModel
getAlphabetContainer, getCharacteristics, getLength, getLogProbFor, getLogProbFor, getLogProbFor, getLogProbFor, getPriorTerm, getProbFor, getProbFor, set, setNewAlphabetContainerInstance, train
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface de.jstacs.models.Model
getInstanceName, getLogPriorTerm, train
 

Field Detail

powers

protected int[] powers
The powers of the alphabet length.
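
One common use of precomputed powers of the alphabet length is to encode a context of several symbols as a single array index. The following is a minimal sketch of that idea only. The class PowersSketch, its method names, and the assumption that entry i holds the alphabet length raised to the power i are hypothetical and are not taken from the Jstacs source.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class PowersSketch {

    // Returns {1, a, a^2, ..., a^order} for alphabet length a (assumed layout).
    static int[] powers(int alphabetLength, int order) {
        int[] p = new int[order + 1];
        p[0] = 1;
        for (int i = 1; i <= order; i++) {
            p[i] = p[i - 1] * alphabetLength;
        }
        return p;
    }

    // Encodes symbols[from..to] (inclusive) as a base-a number,
    // with the first symbol as the most significant digit.
    static int contextIndex(int[] symbols, int from, int to, int[] powers) {
        int index = 0;
        for (int i = from; i <= to; i++) {
            index += symbols[i] * powers[to - i];
        }
        return index;
    }

    public static void main(String[] args) {
        int[] p = powers(4, 2);   // alphabet with 4 symbols, contexts up to 3 symbols
        int[] seq = {2, 0, 3};
        System.out.println(contextIndex(seq, 0, 2, p)); // 2*16 + 0*4 + 3*1 = 35
    }
}
```

With such an encoding, all conditional probabilities for a given order can be stored in one flat array indexed by context.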


order

protected byte order
The Markov order of the model.

Constructor Detail

HomogeneousModel

public HomogeneousModel(HomogeneousModelParameterSet params)
                 throws CloneNotSupportedException,
                        IllegalArgumentException,
                        NonParsableException
Creates a homogeneous model from a parameter set.

Parameters:
params - the parameter set
Throws:
CloneNotSupportedException - if the parameter set could not be cloned
IllegalArgumentException - if the parameter set is not instantiated
NonParsableException - if the parameter set is not parsable

HomogeneousModel

public HomogeneousModel(StringBuffer stringBuff)
                 throws NonParsableException
Creates a homogeneous model from a StringBuffer.

Parameters:
stringBuff - the StringBuffer
Throws:
NonParsableException - if the buffer is not parsable
Method Detail

emitSample

public final Sample emitSample(int no,
                               int... length)
                        throws NotTrainedException,
                               IllegalArgumentException,
                               EmptySampleException,
                               WrongAlphabetException,
                               WrongSequenceTypeException
Creates a sample of no sequences.

Specified by:
emitSample in interface Model
Overrides:
emitSample in class AbstractModel
Parameters:
no - the number of sequences in the sample
length - the length of all sequences, or an array of lengths, in which case the sequence with index i has length length[i]
Returns:
the sample
Throws:
NotTrainedException - if the model was not trained
IllegalArgumentException - if the dimension of length is neither 1 nor no
EmptySampleException - if no == 0
WrongSequenceTypeException
WrongAlphabetException
See Also:
Sample
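
The rule for the length argument (either one value for all sequences or one value per sequence) can be sketched as follows. The class EmitLengthsSketch and the method resolveLengths are hypothetical helpers written for illustration; they are not part of Jstacs, and the case no == 0, which the documented method rejects with an EmptySampleException, is not modeled here.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class EmitLengthsSketch {

    // Expands the varargs argument into exactly one length per sequence.
    static int[] resolveLengths(int no, int... length) {
        if (length.length != 1 && length.length != no) {
            throw new IllegalArgumentException(
                "the dimension of length must be 1 or " + no);
        }
        int[] result = new int[no];
        for (int i = 0; i < no; i++) {
            // A single value applies to all sequences; otherwise use length[i].
            result[i] = (length.length == 1) ? length[0] : length[i];
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(resolveLengths(3, 10)));   // [10, 10, 10]
        System.out.println(java.util.Arrays.toString(resolveLengths(2, 5, 7))); // [5, 7]
    }
}
```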

getRandomSequence

protected abstract Sequence getRandomSequence(Random r,
                                              int length)
                                       throws WrongAlphabetException,
                                              WrongSequenceTypeException
This method creates a sequence from a trained model.

Parameters:
r - the random generator
length - the length of the sequence
Returns:
the sequence
Throws:
WrongSequenceTypeException
WrongAlphabetException

getMaximalMarkovOrder

public byte getMaximalMarkovOrder()
Description copied from interface: Model
This method returns the maximal used Markov order if possible.

Specified by:
getMaximalMarkovOrder in interface Model
Overrides:
getMaximalMarkovOrder in class AbstractModel
Returns:
the maximal used Markov order

getNumericalCharacteristics

public NumericalResultSet getNumericalCharacteristics()
                                               throws Exception
Description copied from interface: Model
Returns the subset of numerical values that are also returned by getCharacteristics.

Returns:
the numerical characteristics
Throws:
Exception - an Exception is thrown if some of the characteristics could not be defined

getLogProbFor

public final double getLogProbFor(Sequence sequence,
                                  int startpos,
                                  int endpos)
                           throws NotTrainedException,
                                  Exception
Description copied from interface: Model
Returns the logarithm of the probability of the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

For more details see Model.getProbFor(Sequence, int, int)

Specified by:
getLogProbFor in interface Model
Overrides:
getLogProbFor in class AbstractModel
Parameters:
sequence - the sequence
startpos - the start position
endpos - the last position to be taken into account
Returns:
the logarithm of the probability, or of the value of the density function, of the part of the given sequence given the model
Throws:
NotTrainedException - a NotTrainedException should be thrown if the model is not trained yet.
Exception - an Exception should be thrown if the sequence could not be handled (e.g. startpos > endpos, endpos > sequence.length, ...) by the model
See Also:
Model.getProbFor(Sequence, int, int)

getProbFor

public final double getProbFor(Sequence sequence,
                               int startpos,
                               int endpos)
                        throws NotTrainedException,
                               Exception
Description copied from interface: Model
Returns the probability of the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

This method extends getProbFor(Sequence sequence, int startpos): because the model may, for example, be homogeneous, the length of the sequences whose probability is returned is not fixed. The end position is therefore given as well, and the probability of the part of the sequence from position startpos to endpos (inclusive) is returned.
The length and the alphabets define the type of data that can be modeled, and therefore both have to be checked.

Parameters:
sequence - the sequence
startpos - the start position
endpos - the last position to be taken into account
Returns:
the probability or the value of the density function of the part of the given sequence given the model
Throws:
NotTrainedException - a NotTrainedException should be thrown if the model is not trained yet.
Exception - an Exception should be thrown if the sequence could not be handled (e.g. startpos > endpos, endpos > sequence.length, ...) by the model
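
To make the interval semantics concrete, here is a minimal sketch of a homogeneous Markov model of order 1 over a two-symbol alphabet that scores the positions startpos to endpos (inclusive). Everything in it is an assumption made for illustration: the class HomogeneousProbSketch, the hard-coded parameters, the use of int[] instead of Sequence, the start distribution for the first scored symbol, and the 0-based bounds check.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class HomogeneousProbSketch {

    // Assumed parameters: start distribution and P(next | previous).
    static final double[] START = {0.5, 0.5};
    static final double[][] COND = {{0.9, 0.1}, {0.2, 0.8}};

    // Log-probability of seq[startpos..endpos], endpos inclusive.
    static double logProbFor(int[] seq, int startpos, int endpos) {
        if (startpos < 0 || startpos > endpos || endpos >= seq.length) {
            throw new IllegalArgumentException("invalid interval");
        }
        double logP = Math.log(START[seq[startpos]]);
        for (int i = startpos + 1; i <= endpos; i++) {
            // The same conditional table is used at every position (homogeneity).
            logP += Math.log(COND[seq[i - 1]][seq[i]]);
        }
        return logP;
    }

    // The probability is the exponential of the log-probability.
    static double probFor(int[] seq, int startpos, int endpos) {
        return Math.exp(logProbFor(seq, startpos, endpos));
    }

    public static void main(String[] args) {
        int[] seq = {0, 0, 1, 1};
        System.out.println(probFor(seq, 0, 3)); // about 0.5 * 0.9 * 0.1 * 0.8 = 0.036
        System.out.println(probFor(seq, 1, 2)); // about 0.5 * 0.1 = 0.05
    }
}
```

Because the parameters do not depend on the position, any interval of any length can be scored with the same tables.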

train

public void train(Sample[] data)
           throws Exception
Trains the model on all given samples.

Parameters:
data - the data
Throws:
Exception - if something went wrong

train

public abstract void train(Sample[] data,
                           double[][] weights)
                    throws Exception
Trains the model using an array of weighted samples. The weights[i] are for data[i].

Parameters:
data - the samples
weights - the weights
Throws:
Exception - if something went wrong; furthermore, data.length has to equal weights.length
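
The pairing of weights[i] with data[i] can be illustrated with a weighted count for a model of order 0. This is a sketch under stated assumptions: the class WeightedTrainSketch is hypothetical, sequences are plain int[] arrays, weights[i][n] is assumed to be the weight of sequence n in sample i, and the estimate is a plain weighted relative frequency without any prior.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class WeightedTrainSketch {

    // data[i][n] is sequence n of sample i; weights[i][n] is its assumed weight.
    static double[] train(int alphabetLength, int[][][] data, double[][] weights) {
        if (data.length != weights.length) {
            throw new IllegalArgumentException("data.length must equal weights.length");
        }
        double[] counts = new double[alphabetLength];
        double total = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i].length != weights[i].length) {
                throw new IllegalArgumentException("one weight per sequence is required");
            }
            for (int n = 0; n < data[i].length; n++) {
                for (int symbol : data[i][n]) {
                    // Each symbol contributes the weight of its sequence.
                    counts[symbol] += weights[i][n];
                    total += weights[i][n];
                }
            }
        }
        if (total <= 0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        for (int a = 0; a < alphabetLength; a++) {
            counts[a] /= total; // normalize weighted counts to probabilities
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][][] data = {{{0, 1}, {1, 1}}, {{0, 0}}};
        double[][] weights = {{1.0, 2.0}, {1.0}};
        System.out.println(java.util.Arrays.toString(train(2, data, weights))); // [0.375, 0.625]
    }
}
```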

set

protected void set(DGMParameterSet params,
                   boolean trained)
            throws CloneNotSupportedException,
                   NonParsableException
Description copied from class: DiscreteGraphicalModel
Sets the parameters as internal parameters and does some essential computations. Used in fromParameterSet-methods.

Overrides:
set in class DiscreteGraphicalModel
Parameters:
params - the new ParameterSet
trained - indicates whether the model is trained
Throws:
CloneNotSupportedException - if the parameter set could not be cloned
NonParsableException - if the parameters of the model could not be parsed

check

protected void check(Sequence sequence,
                     int startpos,
                     int endpos)
              throws NotTrainedException,
                     IllegalArgumentException
Checks that the model is trained and that the arguments are valid.

Overrides:
check in class DiscreteGraphicalModel
Parameters:
sequence - the sequence
startpos - the start position
endpos - the end position
Throws:
NotTrainedException - if the model is not trained
IllegalArgumentException - if some arguments are wrong

chooseFromDistr

protected final int chooseFromDistr(Constraint distr,
                                    int start,
                                    int end,
                                    double randNo)
Chooses a value in [0,end-start] according to the distribution encoded in the frequencies of distr between the indices start and end.

The instance distr is not changed in the process.

Parameters:
distr - the distribution
start - the start index
end - the end index
randNo - a random number in [0,1]
Returns:
the chosen value
See Also:
Constraint.getFreq(int)
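
Choosing a value from a discrete distribution with a single random number is typically done by walking the cumulative sum. The sketch below shows that technique on a plain double[] of frequencies. The class ChooseFromDistrSketch is hypothetical, the array stands in for the values that Constraint.getFreq(int) would provide, and the frequencies between start and end (inclusive) are assumed to sum to 1.

```java
// Hypothetical sketch; not the Jstacs implementation.
final class ChooseFromDistrSketch {

    // Returns a value in [0, end - start]; freq is only read, never modified.
    static int choose(double[] freq, int start, int end, double randNo) {
        double cumulative = 0;
        for (int i = start; i < end; i++) {
            cumulative += freq[i];
            if (randNo < cumulative) {
                return i - start;
            }
        }
        // The last index absorbs the remaining probability mass and rounding error.
        return end - start;
    }

    public static void main(String[] args) {
        double[] freq = {0.1, 0.2, 0.7};
        System.out.println(choose(freq, 0, 2, 0.05)); // 0
        System.out.println(choose(freq, 0, 2, 0.25)); // 1
        System.out.println(choose(freq, 0, 2, 0.95)); // 2
    }
}
```

Returning the offset from start rather than the absolute index matches the documented range [0,end-start].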

logProbFor

protected abstract double logProbFor(Sequence sequence,
                                     int startpos,
                                     int endpos)
This method computes the logarithm of the probability of the given sequence in the given interval. The method is only used in Model.getLogProbFor(Sequence, int, int) after the method check(Sequence, int, int) has been invoked.

Parameters:
sequence - the sequence
startpos - the start position
endpos - the end position
Returns:
the logarithm of the probability for the given subsequence
See Also:
check(Sequence, int, int), Model.getLogProbFor(Sequence, int, int)

probFor

protected abstract double probFor(Sequence sequence,
                                  int startpos,
                                  int endpos)
This method computes the probability of the given sequence in the given interval. The method is only used in Model.getProbFor(Sequence, int, int) after the method check(Sequence, int, int) has been invoked.

Parameters:
sequence - the sequence
startpos - the start position
endpos - the end position
Returns:
the probability for the given subsequence
See Also:
check(Sequence, int, int), Model.getProbFor(Sequence, int, int)
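
The division of labor described for logProbFor and probFor (a final public method validates its input with check and then delegates to an abstract method) is the template method pattern. The following sketch shows only that pattern. The classes CheckedModelSketch and UniformModelSketch are hypothetical, int[] replaces Sequence, and IllegalStateException stands in for NotTrainedException.

```java
// Hypothetical sketch; not the Jstacs implementation.
abstract class CheckedModelSketch {
    private final boolean trained;

    CheckedModelSketch(boolean trained) {
        this.trained = trained;
    }

    // Validates the model state and the interval before any computation.
    protected void check(int[] sequence, int startpos, int endpos) {
        if (!trained) {
            throw new IllegalStateException("model is not trained"); // stand-in for NotTrainedException
        }
        if (startpos < 0 || startpos > endpos || endpos >= sequence.length) {
            throw new IllegalArgumentException("invalid interval");
        }
    }

    // Subclasses may assume that check(...) has already succeeded.
    protected abstract double probFor(int[] sequence, int startpos, int endpos);

    // Final, so subclasses cannot bypass the check.
    public final double getProbFor(int[] sequence, int startpos, int endpos) {
        check(sequence, startpos, endpos);
        return probFor(sequence, startpos, endpos);
    }
}

// A trivial subclass: every symbol is equally likely at every position.
final class UniformModelSketch extends CheckedModelSketch {
    private final int alphabetLength;

    UniformModelSketch(boolean trained, int alphabetLength) {
        super(trained);
        this.alphabetLength = alphabetLength;
    }

    @Override
    protected double probFor(int[] sequence, int startpos, int endpos) {
        return Math.pow(1.0 / alphabetLength, endpos - startpos + 1);
    }

    public static void main(String[] args) {
        UniformModelSketch model = new UniformModelSketch(true, 4);
        System.out.println(model.getProbFor(new int[]{0, 1, 2}, 0, 1)); // 0.0625
    }
}
```

Keeping the validation in one final method means each subclass implements only the computation and cannot forget the checks.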

cloneHomProb

protected HomogeneousModel.HomCondProb[] cloneHomProb(HomogeneousModel.HomCondProb[] p)
Clones the given conditional probabilities.

Parameters:
p - the original conditional probabilities
Returns:
an array of clones