de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous
Class DAGTrainSM

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
      extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.DiscreteGraphicalTrainSM
          extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhomogeneousDGTrainSM
              extended by de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.DAGTrainSM
All Implemented Interfaces:
InstantiableFromParameterSet, SequenceScore, StatisticalModel, TrainableStatisticalModel, Storable, Cloneable
Direct Known Subclasses:
BayesianNetworkTrainSM, FSDAGTrainSM

public abstract class DAGTrainSM
extends InhomogeneousDGTrainSM

The abstract class for directed acyclic graphical models (DAGTrainSM).

Author:
Jens Keilwagen

Field Summary
protected  InhCondProb[] constraints
          The constraints for the model.
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhomogeneousDGTrainSM
DEFAULT_STREAM, sostream
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.DiscreteGraphicalTrainSM
params, trained
 
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
alphabets, length
 
Constructor Summary
protected DAGTrainSM(IDGTrainSMParameterSet params)
          This is the main constructor.
protected DAGTrainSM(StringBuffer xml)
          The standard constructor for the interface Storable.
 
Method Summary
protected static boolean checkAcyclic(int length, int[][] graph)
          This method checks whether a given graph is acyclic.
 DAGTrainSM clone()
          Follows the conventions of Object's clone()-method.
protected  void createConstraints(int[][] structure)
          This method creates the constraints for a given structure.
protected  void drawParameters(DataSet data, double[] weights)
          This method draws the parameters of the model from the likelihood or the posterior, respectively.
 DataSet emitDataSet(int n, int... lengths)
          This method returns a DataSet object containing artificial sequence(s).
protected  void estimateParameters(DataSet data, double[] weights)
          This method estimates the parameters of the model from the likelihood or the posterior, respectively.
protected  StringBuffer getFurtherModelInfos()
          Returns further model information as a StringBuffer.
 double getLogPriorTerm()
          Returns a value that is proportional to the log of the prior.
 double getLogProbFor(Sequence sequence, int startpos, int endpos)
          Returns the logarithm of the probability of (a part of) the given sequence given the model.
 NumericalResultSet getNumericalCharacteristics()
          Returns the subset of numerical values that are also returned by SequenceScore.getCharacteristics().
 String getStructure()
          Returns a String representation of the underlying graph.
protected  void setFurtherModelInfos(StringBuffer xml)
          This method replaces the internal model information with that from a StringBuffer.
 String toString(NumberFormat nf)
          This method returns a String representation of the instance.
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.inhomogeneous.InhomogeneousDGTrainSM
check, set, setOutputStream
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.discrete.DiscreteGraphicalTrainSM
fromXML, getCurrentParameterSet, getDescription, getESS, getXMLTag, isInitialized, toXML
 
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
getAlphabetContainer, getCharacteristics, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, toString, train
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.trainable.TrainableStatisticalModel
train
 
Methods inherited from interface de.jstacs.sequenceScores.SequenceScore
getInstanceName
 

Field Detail

constraints

protected InhCondProb[] constraints
The constraints for the model.

Constructor Detail

DAGTrainSM

protected DAGTrainSM(IDGTrainSMParameterSet params)
              throws CloneNotSupportedException,
                     IllegalArgumentException,
                     NonParsableException
This is the main constructor. It creates a new DAGTrainSM from the given IDGTrainSMParameterSet.

Parameters:
params - the given parameter set
Throws:
CloneNotSupportedException - if the parameter set could not be cloned
IllegalArgumentException - if the parameter set is not instantiated
NonParsableException - if the parameter set is not parsable
See Also:
InhomogeneousDGTrainSM.InhomogeneousDGTrainSM(IDGTrainSMParameterSet)

DAGTrainSM

protected DAGTrainSM(StringBuffer xml)
              throws NonParsableException
The standard constructor for the interface Storable. Creates a new DAGTrainSM out of its XML representation.

Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the DAGTrainSM could not be reconstructed out of the XML representation (the StringBuffer could not be parsed)
See Also:
Storable, InhomogeneousDGTrainSM.InhomogeneousDGTrainSM(StringBuffer)
Method Detail

clone

public DAGTrainSM clone()
                 throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone()-method.

Specified by:
clone in interface SequenceScore
Specified by:
clone in interface TrainableStatisticalModel
Overrides:
clone in class InhomogeneousDGTrainSM
Returns:
an object that is a copy of the current AbstractTrainableStatisticalModel (the member AlphabetContainer is not deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X that directly inherits from AbstractTrainableStatisticalModel. Hence X's clone() method should work as follows:
1. X o = (X) super.clone();
2. all additional member variables of o defined by X that are not of simple data types like int, double, ... have to be deeply copied
3. return o
Throws:
CloneNotSupportedException - if something went wrong while cloning
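
The three steps listed under Returns can be sketched as follows. This is a minimal illustration, not Jstacs code: the class CloneSketch and its fields are hypothetical stand-ins for a class X with one immutable member (shared between copies) and one mutable array member (deeply copied).

```java
// Hypothetical class for illustration only; it is not part of Jstacs.
class CloneSketch implements Cloneable {
    private final String alphabet; // immutable, so sharing it between copies is safe
    private double[][] params;     // mutable, so it has to be copied deeply

    CloneSketch(String alphabet, double[][] params) {
        this.alphabet = alphabet;
        this.params = params;
    }

    @Override
    public CloneSketch clone() throws CloneNotSupportedException {
        // Step 1: shallow copy via Object's clone().
        CloneSketch copy = (CloneSketch) super.clone();
        // Step 2: deep copy of every mutable, non-primitive member.
        if (params != null) {
            copy.params = new double[params.length][];
            for (int i = 0; i < params.length; i++) {
                copy.params[i] = params[i].clone();
            }
        }
        // Step 3: return the copy.
        return copy;
    }

    String getAlphabet() {
        return alphabet;
    }

    double getParam(int i, int j) {
        return params[i][j];
    }

    void setParam(int i, int j, double value) {
        params[i][j] = value;
    }
}
```

With this pattern, changing a parameter of the copy leaves the original untouched, while the immutable member is shared.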

emitDataSet

public DataSet emitDataSet(int n,
                           int... lengths)
                    throws NotTrainedException,
                           Exception
Description copied from interface: StatisticalModel
This method returns a DataSet object containing artificial sequence(s).

There are two ways to create a data set for a model with length 0 (homogeneous models):
  1. emitDataSet( int n, int l ) should return a data set with n sequences of length l.
  2. emitDataSet( int n, int[] l ) should return a data set with n sequences whose lengths correspond to the entries of the given array l.

For a model with length greater than 0 (inhomogeneous models), there are likewise two ways:
emitDataSet( int n ) and emitDataSet( int n, null ) should both return a data set with n sequences whose length is the length of the model (SequenceScore.getLength()).

The standard implementation throws an Exception.

Specified by:
emitDataSet in interface StatisticalModel
Overrides:
emitDataSet in class AbstractTrainableStatisticalModel
Parameters:
n - the number of sequences that should be contained in the returned data set
lengths - the length of the sequences for a homogeneous model; for an inhomogeneous model this parameter should be null or an array of size 0.
Returns:
a DataSet containing the artificial sequence(s)
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the emission did not succeed
See Also:
DataSet

getLogPriorTerm

public double getLogPriorTerm()
                       throws Exception
Description copied from interface: StatisticalModel
Returns a value that is proportional to the log of the prior. For maximum likelihood (ML) 0 should be returned.

Returns:
a value that is proportional to the log of the prior
Throws:
Exception - if something went wrong

getLogProbFor

public double getLogProbFor(Sequence sequence,
                            int startpos,
                            int endpos)
                     throws NotTrainedException,
                            Exception
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given sequence given the model. If at least one random variable is continuous, the value of the density function is returned.

It extends the method StatisticalModel.getLogProbFor(Sequence, int) to cover models that are, e.g., homogeneous, so that the length of the sequences whose probability should be returned is not fixed. In addition, the end position of the part of the given sequence is given, and the probability of the part from position startpos to endpos (inclusive) should be returned.
The length and the alphabets define the type of data that can be modeled and therefore both have to be checked.

Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Returns:
the logarithm of the probability or the value of the density function of (the part of) the given sequence given the model
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the sequence could not be handled by the model (e.g. startpos or endpos out of range, such as endpos > sequence.length, ...)
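
For a directed acyclic graphical model, the joint probability factorizes into one conditional probability per position given its parent positions, so the log probability is a sum of logarithms. The following is a minimal sketch of that computation, not the Jstacs implementation. The class DagLogProbSketch, the parents array, and the cpt table layout (cpt[i][context][symbol]) are assumptions made for illustration.

```java
// Hypothetical helper for illustration only; it is not part of Jstacs.
class DagLogProbSketch {

    /**
     * Assumed layout (not taken from Jstacs):
     * parents[i]       lists the parent positions of position i,
     * cpt[i][ctx][sym] is P(sym at position i | parent context ctx),
     * where ctx encodes the parent symbols as a number in base alphabetSize.
     * Symbols are integers in 0..alphabetSize-1.
     */
    static double logProb(int[] seq, int startpos, int endpos,
                          int[][] parents, double[][][] cpt, int alphabetSize) {
        int modelLength = parents.length;
        if (startpos < 0 || endpos < startpos || endpos >= seq.length
                || endpos - startpos + 1 != modelLength) {
            throw new IllegalArgumentException("invalid start or end position");
        }
        double logP = 0.0;
        for (int i = 0; i < modelLength; i++) {
            // Encode the symbols at the parent positions as a context index.
            int ctx = 0;
            for (int p : parents[i]) {
                ctx = ctx * alphabetSize + seq[startpos + p];
            }
            // Add log P(x_i | symbols at the parents of i).
            logP += Math.log(cpt[i][ctx][seq[startpos + i]]);
        }
        return logP;
    }
}
```

The range check mirrors the requirement that an inhomogeneous model of fixed length can only score a part of the sequence whose length equals the model length.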

getNumericalCharacteristics

public NumericalResultSet getNumericalCharacteristics()
Description copied from interface: SequenceScore
Returns the subset of numerical values that are also returned by SequenceScore.getCharacteristics().

Returns:
the numerical characteristics of the current instance

getStructure

public String getStructure()
                    throws NotTrainedException
Description copied from class: InhomogeneousDGTrainSM
Returns a String representation of the underlying graph.

Specified by:
getStructure in class InhomogeneousDGTrainSM
Returns:
a String representation of the underlying graph
Throws:
NotTrainedException - if the structure is not set; this can only be the case if the model is not trained

toString

public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.

Specified by:
toString in interface SequenceScore
Overrides:
toString in class DiscreteGraphicalTrainSM
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns:
a String representation of the instance

checkAcyclic

protected static boolean checkAcyclic(int length,
                                      int[][] graph)
This method checks whether a given graph is acyclic.

Parameters:
length - the sequence length (which corresponds to the number of nodes in the graph)
graph - the specified graph
Returns:
true if the given graph is acyclic, false otherwise
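
The layout of the graph array is not documented here. The following minimal sketch therefore assumes that graph[i] lists the parent nodes of node i, and it uses Kahn's topological sorting algorithm as one possible way to test acyclicity; it is not necessarily the algorithm used by this class. The class name AcyclicCheckSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration only; it is not part of Jstacs.
class AcyclicCheckSketch {

    /** Assumption: parents[i] lists the parent nodes of node i. */
    static boolean isAcyclic(int length, int[][] parents) {
        if (parents.length < length) {
            throw new IllegalArgumentException("graph has fewer than length nodes");
        }
        int[] remaining = new int[length];  // unprocessed parents per node
        int[] childCount = new int[length]; // number of children per node
        for (int child = 0; child < length; child++) {
            for (int p : parents[child]) {
                if (p < 0 || p >= length) {
                    throw new IllegalArgumentException("parent index out of range: " + p);
                }
                childCount[p]++;
            }
            remaining[child] = parents[child].length;
        }
        // Invert the parent lists into child lists.
        int[][] children = new int[length][];
        int[] fill = new int[length];
        for (int i = 0; i < length; i++) {
            children[i] = new int[childCount[i]];
        }
        for (int child = 0; child < length; child++) {
            for (int p : parents[child]) {
                children[p][fill[p]++] = child;
            }
        }
        // Repeatedly remove nodes whose parents have all been removed.
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < length; i++) {
            if (remaining[i] == 0) {
                ready.add(i);
            }
        }
        int processed = 0;
        while (!ready.isEmpty()) {
            int node = ready.poll();
            processed++;
            for (int c : children[node]) {
                if (--remaining[c] == 0) {
                    ready.add(c);
                }
            }
        }
        // Nodes on a cycle (including self-loops) are never removed.
        return processed == length;
    }
}
```

Under this assumption, a chain such as new int[][]{{}, {0}, {1}} is acyclic, while new int[][]{{2}, {0}, {1}} contains the cycle 0 -> 1 -> 2 -> 0 and is rejected.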

createConstraints

protected void createConstraints(int[][] structure)
This method creates the constraints for a given structure.

Parameters:
structure - the specified structure

drawParameters

protected void drawParameters(DataSet data,
                              double[] weights)
                       throws Exception
This method draws the parameters of the model from the likelihood or the posterior, respectively.

Parameters:
data - the given data
weights - the weights for the sequences in the data
Throws:
Exception - if something went wrong while counting or drawing
See Also:
ConstraintManager.countInhomogeneous(de.jstacs.data.AlphabetContainer, int, DataSet, double[], boolean, de.jstacs.sequenceScores.statisticalModels.trainable.discrete.Constraint...), ConstraintManager.drawFreqs(double, InhCondProb...)

estimateParameters

protected void estimateParameters(DataSet data,
                                  double[] weights)
                           throws Exception
This method estimates the parameters of the model from the likelihood or the posterior, respectively.

Parameters:
data - the data
weights - the weights for the sequences in the data
Throws:
Exception - if something went wrong while counting or estimating
See Also:
drawParameters(DataSet, double[])

getFurtherModelInfos

protected StringBuffer getFurtherModelInfos()
Description copied from class: DiscreteGraphicalTrainSM
Returns further model information as a StringBuffer.

Specified by:
getFurtherModelInfos in class DiscreteGraphicalTrainSM
Returns:
further model information like parameters of the distribution etc. in XML format
See Also:
DiscreteGraphicalTrainSM.toXML()

setFurtherModelInfos

protected void setFurtherModelInfos(StringBuffer xml)
                             throws NonParsableException
Description copied from class: DiscreteGraphicalTrainSM
This method replaces the internal model information with that from a StringBuffer.

Specified by:
setFurtherModelInfos in class DiscreteGraphicalTrainSM
Parameters:
xml - contains the model information like parameters of the distribution etc. in XML format
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also:
DiscreteGraphicalTrainSM.fromXML(StringBuffer)