de.jstacs.sequenceScores.statisticalModels.trainable
Class DifferentiableStatisticalModelWrapperTrainSM

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
      extended by de.jstacs.sequenceScores.statisticalModels.trainable.DifferentiableStatisticalModelWrapperTrainSM
All Implemented Interfaces:
SequenceScore, StatisticalModel, TrainableStatisticalModel, Storable, Cloneable

public class DifferentiableStatisticalModelWrapperTrainSM
extends AbstractTrainableStatisticalModel

This model can be used to use a DifferentiableStatisticalModel as model. It enables the user to train the DifferentiableStatisticalModel in a generative way.

Author:
Jens Keilwagen
See Also:
DifferentiableStatisticalModel, LogGenDisMixFunction

Field Summary
protected  DifferentiableStatisticalModel nsf
          The internally used DifferentiableStatisticalModel.
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
alphabets, length
 
Constructor Summary
DifferentiableStatisticalModelWrapperTrainSM(DifferentiableStatisticalModel nsf, int threads, byte algo, AbstractTerminationCondition tc, double lineps, double startD)
          The main constructor that creates an instance with the user given parameters.
DifferentiableStatisticalModelWrapperTrainSM(StringBuffer stringBuff)
          The standard constructor for the interface Storable.
 
Method Summary
 DifferentiableStatisticalModelWrapperTrainSM clone()
          Follows the conventions of Object's clone()-method.
protected  void fromXML(StringBuffer xml)
          This method should only be used by the constructor that works on a StringBuffer.
 DifferentiableStatisticalModel getFunction()
          Returns a copy of the internally used DifferentiableStatisticalModel.
 String getInstanceName()
          Should return a short instance name such as iMM(0), BN(2), ...
 double getLogPriorTerm()
          Returns a value that is proportional to the log of the prior.
 double getLogProbFor(Sequence sequence, int startpos, int endpos)
          Returns the logarithm of the probability of (a part of) the given sequence given the model.
 NumericalResultSet getNumericalCharacteristics()
          Returns the subset of numerical values that are also returned by SequenceScore.getCharacteristics().
 boolean isInitialized()
          This method can be used to determine whether the instance is initialized.
 void setOutputStream(OutputStream o)
          Sets the OutputStream that is used e.g.
 String toString(NumberFormat nf)
          This method returns a String representation of the instance.
 StringBuffer toXML()
          This method returns an XML representation as StringBuffer of an instance of the implementing class.
 void train(DataSet data, double[] weights)
          Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights.
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
check, emitDataSet, getAlphabetContainer, getCharacteristics, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, toString, train
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

nsf

protected DifferentiableStatisticalModel nsf
The internally used DifferentiableStatisticalModel.

Constructor Detail

DifferentiableStatisticalModelWrapperTrainSM

public DifferentiableStatisticalModelWrapperTrainSM(DifferentiableStatisticalModel nsf,
                                                    int threads,
                                                    byte algo,
                                                    AbstractTerminationCondition tc,
                                                    double lineps,
                                                    double startD)
                                             throws CloneNotSupportedException
The main constructor that creates an instance with the user given parameters.

Parameters:
nsf - the DifferentiableStatisticalModel that should be used
threads - the number of threads that should be used for optimization
algo - the algorithm that should be used for the optimization
tc - the AbstractTerminationCondition for stopping the optimization
lineps - the line epsilon for stopping the line search in the optimization
startD - the start distance that should be used initially
Throws:
CloneNotSupportedException - if nsf can not be cloned

DifferentiableStatisticalModelWrapperTrainSM

public DifferentiableStatisticalModelWrapperTrainSM(StringBuffer stringBuff)
                                             throws NonParsableException
The standard constructor for the interface Storable. Creates a new DifferentiableStatisticalModelWrapperTrainSM out of a StringBuffer.

Parameters:
stringBuff - the StringBuffer to be parsed
Throws:
NonParsableException - is thrown if the StringBuffer could not be parsed
Method Detail

clone

public DifferentiableStatisticalModelWrapperTrainSM clone()
                                                   throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone()-method.

Specified by:
clone in interface SequenceScore
Specified by:
clone in interface TrainableStatisticalModel
Overrides:
clone in class AbstractTrainableStatisticalModel
Returns:
an object, that is a copy of the current AbstractTrainableStatisticalModel (the member-AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractTrainableStatisticalModel. Hence X's clone()-method should work as:
1. Object o = (X)super.clone();
2. all additional member variables of o defined by X that are not of simple data-types like int, double, ... have to be deeply copied
3. return o
Throws:
CloneNotSupportedException - if something went wrong while cloning

train

public void train(DataSet data,
                  double[] weights)
           throws Exception
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i. So the array weight should have the number of sequences in the data set as dimension. (Optionally it is possible to use weight == null if all weights have the value one.)
This method should work non-incrementally. That means the result of the following series: train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.

Parameters:
data - the given sequences as DataSet
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the data set do not match)
See Also:
DataSet.getElementAt(int), DataSet.ElementEnumerator

getLogProbFor

public double getLogProbFor(Sequence sequence,
                            int startpos,
                            int endpos)
                     throws NotTrainedException,
                            Exception
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given sequence given the model. If at least one random variable is continuous the value of density function is returned.

It extends the possibility given by the method StatisticalModel.getLogProbFor(Sequence, int) by the fact, that the model could be e.g. homogeneous and therefore the length of the sequences, whose probability should be returned, is not fixed. Additionally, the end position of the part of the given sequence is given and the probability of the part from position startpos to endpos (inclusive) should be returned.
The length and the alphabets define the type of data that can be modeled and therefore both has to be checked.

Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Returns:
the logarithm of the probability or the value of the density function of (the part of) the given sequence given the model
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the sequence could not be handled (e.g. startpos > , endpos > sequence.length, ...) by the model

getLogPriorTerm

public double getLogPriorTerm()
                       throws Exception
Description copied from interface: StatisticalModel
Returns a value that is proportional to the log of the prior. For maximum likelihood (ML) 0 should be returned.

Returns:
a value that is proportional to the log of the prior
Throws:
Exception - if something went wrong

getInstanceName

public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...

Returns:
a short instance name

isInitialized

public boolean isInitialized()
Description copied from interface: SequenceScore
This method can be used to determine whether the instance is initialized. If the instance is initialized you should be able to invoke SequenceScore.getLogScoreFor(Sequence).

Returns:
true if the instance is initialized, false otherwise

getNumericalCharacteristics

public NumericalResultSet getNumericalCharacteristics()
                                               throws Exception
Description copied from interface: SequenceScore
Returns the subset of numerical values that are also returned by SequenceScore.getCharacteristics().

Returns:
the numerical characteristics of the current instance
Throws:
Exception - if some of the characteristics could not be defined

toString

public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.

Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns:
a String representation of the instance

fromXML

protected void fromXML(StringBuffer xml)
                throws NonParsableException
Description copied from class: AbstractTrainableStatisticalModel
This method should only be used by the constructor that works on a StringBuffer. It is the counter part of Storable.toXML().

Specified by:
fromXML in class AbstractTrainableStatisticalModel
Parameters:
xml - the XML representation of the model
Throws:
NonParsableException - if the StringBuffer is not parsable or the representation is conflicting
See Also:
AbstractTrainableStatisticalModel.AbstractTrainableStatisticalModel(StringBuffer)

toXML

public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.

Returns:
the XML representation

setOutputStream

public final void setOutputStream(OutputStream o)
Sets the OutputStream that is used e.g. for writing information while training. It is possible to set o=null, than nothing will be written.

Parameters:
o - the OutputStream

getFunction

public DifferentiableStatisticalModel getFunction()
                                           throws CloneNotSupportedException
Returns a copy of the internally used DifferentiableStatisticalModel.

Returns:
a copy of the internally used DifferentiableStatisticalModel
Throws:
CloneNotSupportedException - if the internal instance could not be cloned