de.jstacs.models
Class NormalizableScoringFunctionModel

java.lang.Object
  extended by de.jstacs.models.AbstractModel
      extended by de.jstacs.models.NormalizableScoringFunctionModel
All Implemented Interfaces:
Model, Storable, Cloneable

public class NormalizableScoringFunctionModel
extends AbstractModel

This model wraps a NormalizableScoringFunction so that it can be used as a Model. It enables the user to train the NormalizableScoringFunction in a generative way.

Author:
Jens Keilwagen
See Also:
NormalizableScoringFunction, LogGenDisMixFunction

Field Summary
protected  NormalizableScoringFunction nsf
          The internally used NormalizableScoringFunction.
 
Fields inherited from class de.jstacs.models.AbstractModel
alphabets, length
 
Constructor Summary
NormalizableScoringFunctionModel(NormalizableScoringFunction nsf, int threads, byte algo, AbstractTerminationCondition tc, double lineps, double startD)
          The main constructor that creates an instance with the user given parameters.
NormalizableScoringFunctionModel(StringBuffer stringBuff)
          The standard constructor for the interface Storable.
 
Method Summary
 NormalizableScoringFunctionModel clone()
          Follows the conventions of Object's clone()-method.
protected  void fromXML(StringBuffer xml)
          This method should only be used by the constructor that works on a StringBuffer.
 NormalizableScoringFunction getFunction()
          Returns a copy of the internally used NormalizableScoringFunction.
 String getInstanceName()
          Should return a short instance name such as iMM(0), BN(2), ...
 double getLogPriorTerm()
          Returns a value that is proportional to the log of the prior.
 double getLogProbFor(Sequence sequence, int startpos, int endpos)
          Returns the logarithm of the probability of (a part of) the given sequence given the model.
 NumericalResultSet getNumericalCharacteristics()
          Returns the subset of numerical values that are also returned by Model.getCharacteristics().
 double getProbFor(Sequence sequence, int startpos, int endpos)
          Returns the probability of (a part of) the given sequence given the model.
 boolean isTrained()
          Returns true if the model has been trained successfully, false otherwise.
 void setOutputStream(OutputStream o)
          Sets the OutputStream that is used e.g. for writing information while training.
 String toString()
          Should give a simple representation (text) of the model as String.
 StringBuffer toXML()
          This method returns an XML representation as StringBuffer of an instance of the implementing class.
 void train(Sample data, double[] weights)
          Trains the Model object given the data as Sample using the specified weights.
 
Methods inherited from class de.jstacs.models.AbstractModel
emitSample, getAlphabetContainer, getCharacteristics, getLength, getLogProbFor, getLogProbFor, getLogProbFor, getLogProbFor, getMaximalMarkovOrder, getPriorTerm, getProbFor, getProbFor, set, setNewAlphabetContainerInstance, train
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

nsf

protected NormalizableScoringFunction nsf
The internally used NormalizableScoringFunction.

Constructor Detail

NormalizableScoringFunctionModel

public NormalizableScoringFunctionModel(NormalizableScoringFunction nsf,
                                        int threads,
                                        byte algo,
                                        AbstractTerminationCondition tc,
                                        double lineps,
                                        double startD)
                                 throws CloneNotSupportedException
The main constructor that creates an instance with the user given parameters.

Parameters:
nsf - the NormalizableScoringFunction that should be used
threads - the number of threads that should be used for optimization
algo - the algorithm that should be used for the optimization
tc - the AbstractTerminationCondition for stopping the optimization
lineps - the line epsilon for stopping the line search in the optimization
startD - the start distance that should be used initially
Throws:
CloneNotSupportedException - if nsf cannot be cloned

NormalizableScoringFunctionModel

public NormalizableScoringFunctionModel(StringBuffer stringBuff)
                                 throws NonParsableException
The standard constructor for the interface Storable. Creates a new NormalizableScoringFunctionModel out of a StringBuffer.

Parameters:
stringBuff - the StringBuffer to be parsed
Throws:
NonParsableException - if the StringBuffer could not be parsed
Method Detail

clone

public NormalizableScoringFunctionModel clone()
                                       throws CloneNotSupportedException
Description copied from class: AbstractModel
Follows the conventions of Object's clone()-method.

Specified by:
clone in interface Model
Overrides:
clone in class AbstractModel
Returns:
an object that is a copy of the current AbstractModel (the member AlphabetContainer is not deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X that directly inherits from AbstractModel. Hence X's clone()-method should work as follows:
1. Object o = (X)super.clone();
2. all additional member variables of o defined by X that are not of simple data-types like int, double, ... have to be deeply copied
3. return o
Throws:
CloneNotSupportedException - if something went wrong while cloning
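
The three steps listed under Returns can be sketched as follows. BaseModel and ToyModel are hypothetical stand-ins for AbstractModel and a direct subclass X; they are not part of Jstacs and only illustrate the convention.

```java
// Hypothetical stand-in for AbstractModel: holds only a primitive member,
// so the shallow copy made by Object.clone() is sufficient here.
class BaseModel implements Cloneable {
    protected final int length;

    BaseModel(int length) {
        this.length = length;
    }

    @Override
    public BaseModel clone() throws CloneNotSupportedException {
        return (BaseModel) super.clone();
    }
}

// Hypothetical subclass X with one mutable member that needs a deep copy.
class ToyModel extends BaseModel {
    double[] params;

    ToyModel(int length, double[] params) {
        super(length);
        this.params = params.clone();
    }

    @Override
    public ToyModel clone() throws CloneNotSupportedException {
        // 1. shallow copy via super.clone()
        ToyModel copy = (ToyModel) super.clone();
        // 2. deep copy of every mutable, non-primitive member defined in this class
        copy.params = params.clone();
        // 3. return the copy
        return copy;
    }
}
```

After cloning, changing params in the copy leaves the original untouched, which is the property step 2 is meant to guarantee.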

train

public void train(Sample data,
                  double[] weights)
           throws Exception
Description copied from interface: Model
Trains the Model object given the data as Sample using the specified weights. The weight at position i belongs to the element at position i, so the array weights should have as many entries as there are sequences in the sample. (Optionally, weights == null can be used if all weights have the value one.)
This method should work non-incrementally. That means the result of the series train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.

Parameters:
data - the given sequences as Sample
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the sample do not match)
See Also:
Sample.getElementAt(int), Sample.ElementEnumerator
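
The contract described for train (one weight per sequence, null meaning weight one everywhere, and non-incremental behaviour) can be illustrated with a minimal sketch. ToyFrequencyModel, its int[][] data encoding, and the use of IllegalArgumentException are assumptions made for this example; the real method works on Sample and declares Exception.

```java
// Hypothetical toy model: estimates symbol frequencies from weighted sequences.
class ToyFrequencyModel {
    private final double[] freq;
    private boolean trained = false;

    ToyFrequencyModel(int alphabetSize) {
        freq = new double[alphabetSize];
    }

    // data[i] is sequence i, with symbols encoded as 0..alphabetSize-1;
    // weights[i] is its weight, or weights == null for weight one everywhere.
    void train(int[][] data, double[] weights) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException(
                "weights.length must equal the number of sequences");
        }
        // Non-incremental: counts start from zero on every call,
        // so nothing learned by an earlier call survives.
        double[] counts = new double[freq.length];
        double total = 0.0;
        for (int i = 0; i < data.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            if (w < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            for (int symbol : data[i]) {
                counts[symbol] += w;
                total += w;
            }
        }
        if (total == 0.0) {
            throw new IllegalArgumentException("no weighted data to train on");
        }
        for (int a = 0; a < freq.length; a++) {
            freq[a] = counts[a] / total;
        }
        trained = true;
    }

    double getFrequency(int symbol) {
        return freq[symbol];
    }

    boolean isTrained() {
        return trained;
    }
}
```

Calling train a second time with different data replaces the estimates from the first call instead of accumulating them.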

getProbFor

public double getProbFor(Sequence sequence,
                         int startpos,
                         int endpos)
                  throws NotTrainedException,
                         Exception
Description copied from interface: Model
Returns the probability of (a part of) the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

This method extends Model.getProbFor(Sequence, int): the model could be, for example, homogeneous, in which case the length of the sequences whose probability is returned is not fixed. The end position of the part of the given sequence is therefore passed as well, and the probability of the part from position startpos to endpos (inclusive) should be returned.
The length and the alphabets define the type of data that can be modeled, and therefore both have to be checked.

Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Returns:
the probability or the value of the density function of (the part of) the given sequence given the model
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the sequence could not be handled (e.g. startpos > endpos, endpos > sequence.length, ...) by the model
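
A minimal sketch of scoring an inclusive range startpos..endpos with a homogeneous model follows. ToyHomogeneousModel, the int[] sequence encoding, zero-based positions, and the exact range check are assumptions made for this example and do not reproduce the Jstacs implementation.

```java
// Hypothetical homogeneous model: every position uses the same symbol distribution,
// so a subsequence of any length can be scored.
class ToyHomogeneousModel {
    private final double[] symbolProb;

    ToyHomogeneousModel(double[] symbolProb) {
        this.symbolProb = symbolProb.clone();
    }

    // Returns the probability of sequence[startpos..endpos], both ends inclusive.
    double probFor(int[] sequence, int startpos, int endpos) {
        // Assumed range check: positions are zero-based indices into the sequence.
        if (startpos < 0 || startpos > endpos || endpos >= sequence.length) {
            throw new IllegalArgumentException(
                "invalid range [" + startpos + ", " + endpos + "]");
        }
        double p = 1.0;
        for (int pos = startpos; pos <= endpos; pos++) {
            p *= symbolProb[sequence[pos]];
        }
        return p;
    }
}
```

Because endpos is inclusive, startpos == endpos scores exactly one position.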

getLogProbFor

public double getLogProbFor(Sequence sequence,
                            int startpos,
                            int endpos)
                     throws NotTrainedException,
                            Exception
Description copied from interface: Model
Returns the logarithm of the probability of (a part of) the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

For more details see Model.getProbFor(Sequence, int, int)

Specified by:
getLogProbFor in interface Model
Overrides:
getLogProbFor in class AbstractModel
Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Returns:
the logarithm of the probability or the value of the density function of (the part of) the given sequence given the model
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the sequence could not be handled (e.g. startpos > endpos, endpos > sequence.length, ...) by the model
See Also:
Model.getProbFor(Sequence, int, int)
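
One practical reason to work with log probabilities is floating-point underflow on long sequences. The sketch below is a general observation about double arithmetic, not a description of how this class computes its values; LogProbDemo and its two methods are hypothetical.

```java
// Hypothetical demo: product of per-position probabilities versus sum of their logs.
class LogProbDemo {
    // Multiplies the same per-symbol probability 'length' times.
    static double probFor(double symbolProb, int length) {
        double p = 1.0;
        for (int i = 0; i < length; i++) {
            p *= symbolProb;
        }
        return p;
    }

    // Adds the log of the per-symbol probability 'length' times.
    static double logProbFor(double symbolProb, int length) {
        double logSymbol = Math.log(symbolProb);
        double logP = 0.0;
        for (int i = 0; i < length; i++) {
            logP += logSymbol;
        }
        return logP;
    }
}
```

For a per-symbol probability of 0.25 and a length of 2000, the product is 2^-4000, which is below the smallest positive double and therefore becomes 0.0, while the log-space sum stays finite.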

getLogPriorTerm

public double getLogPriorTerm()
                       throws Exception
Description copied from interface: Model
Returns a value that is proportional to the log of the prior. For maximum likelihood (ML) 0 should be returned.

Returns:
a value that is proportional to the log of the prior
Throws:
Exception - if something went wrong
See Also:
Model.getPriorTerm()
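
To illustrate how such a term is typically used: a maximum a posteriori objective adds the log prior term to the log likelihood, and with a term of 0 it reduces to maximum likelihood. The sketch assumes a prior term of the form sum_i alpha_i * log(theta_i); that form, the class ToyPrior, and both method names are illustrative assumptions and are not taken from Jstacs.

```java
// Hypothetical illustration of a log prior term and its role in the training objective.
class ToyPrior {
    // Assumed form: sum_i alpha[i] * log(theta[i]).
    // With all alpha[i] == 0 (the maximum likelihood case) the term is 0.
    static double logPriorTerm(double[] theta, double[] alpha) {
        double term = 0.0;
        for (int i = 0; i < theta.length; i++) {
            if (alpha[i] != 0.0) { // skip zero weights to avoid 0 * log(0) = NaN
                term += alpha[i] * Math.log(theta[i]);
            }
        }
        return term;
    }

    // Objective to maximize: log likelihood plus the log prior term.
    static double objective(double logLikelihood, double[] theta, double[] alpha) {
        return logLikelihood + logPriorTerm(theta, alpha);
    }
}
```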

getInstanceName

public String getInstanceName()
Description copied from interface: Model
Should return a short instance name such as iMM(0), BN(2), ...

Returns:
a short instance name

isTrained

public boolean isTrained()
Description copied from interface: Model
Returns true if the model has been trained successfully, false otherwise.

Returns:
true if the model has been trained successfully, false otherwise

getNumericalCharacteristics

public NumericalResultSet getNumericalCharacteristics()
                                               throws Exception
Description copied from interface: Model
Returns the subset of numerical values that are also returned by Model.getCharacteristics().

Returns:
the numerical characteristics of the current instance of the model
Throws:
Exception - if some of the characteristics could not be defined

toString

public String toString()
Description copied from interface: Model
Should give a simple representation (text) of the model as String.

Specified by:
toString in interface Model
Overrides:
toString in class Object
Returns:
the representation as String

fromXML

protected void fromXML(StringBuffer xml)
                throws NonParsableException
Description copied from class: AbstractModel
This method should only be used by the constructor that works on a StringBuffer. It is the counterpart of Storable.toXML().

Specified by:
fromXML in class AbstractModel
Parameters:
xml - the XML representation of the model
Throws:
NonParsableException - if the StringBuffer is not parsable or the representation is conflicting
See Also:
AbstractModel.AbstractModel(StringBuffer)

toXML

public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.

Returns:
the XML representation
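
The pairing of toXML() with a constructor that takes a StringBuffer can be sketched as a round trip. ToyStorable, its tag names, and the hand-rolled parsing are assumptions made for this example; it throws IllegalArgumentException where the real classes would throw NonParsableException, and it does not escape special characters.

```java
// Hypothetical class following the Storable pattern:
// toXML() writes the state, the StringBuffer constructor reads it back.
class ToyStorable {
    private final String name;
    private final int order;

    ToyStorable(String name, int order) {
        this.name = name;
        this.order = order;
    }

    // Counterpart of toXML(): rebuilds an instance from its XML representation.
    ToyStorable(StringBuffer xml) {
        this.name = extract(xml, "name");
        this.order = Integer.parseInt(extract(xml, "order"));
    }

    StringBuffer toXML() {
        StringBuffer xml = new StringBuffer();
        xml.append("<toy>");
        xml.append("<name>").append(name).append("</name>");
        xml.append("<order>").append(order).append("</order>");
        xml.append("</toy>");
        return xml;
    }

    // Returns the text between <tag> and </tag>, or fails if the tag is missing.
    private static String extract(StringBuffer xml, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = xml.indexOf(open);
        int end = xml.indexOf(close);
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("missing tag: " + tag);
        }
        return xml.substring(start + open.length(), end);
    }

    String getName() {
        return name;
    }

    int getOrder() {
        return order;
    }
}
```

An instance rebuilt via new ToyStorable(original.toXML()) carries the same name and order as the original.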

setOutputStream

public final void setOutputStream(OutputStream o)
Sets the OutputStream that is used e.g. for writing information while training. It is possible to set o=null; then nothing will be written.

Parameters:
o - the OutputStream
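
The "null means silent" behaviour can be sketched as below. QuietableLogger, its writeln method, and the default of System.out are assumptions made for this example; they do not show how the class handles its stream internally.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: writes progress messages unless the stream is null.
class QuietableLogger {
    private OutputStream out = System.out; // assumed default for this sketch

    // Passing null switches all output off.
    void setOutputStream(OutputStream o) {
        this.out = o;
    }

    void writeln(String message) {
        if (out == null) {
            return; // nothing is written when no stream is set
        }
        try {
            out.write((message + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Redirecting to a ByteArrayOutputStream or a FileOutputStream captures the messages, and passing null suppresses them without any further changes at the call sites.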

getFunction

public NormalizableScoringFunction getFunction()
                                        throws CloneNotSupportedException
Returns a copy of the internally used NormalizableScoringFunction.

Returns:
a copy of the internally used NormalizableScoringFunction
Throws:
CloneNotSupportedException - if the internal instance could not be cloned