ModelTester

de.jstacs.models.utils
Class ModelTester

java.lang.Object
  de.jstacs.models.utils.ModelTester

public class ModelTester
extends Object

This class provides utility methods for testing and comparing (discrete) models. It implements several statistics (log-likelihood, Shannon entropy, AIC, BIC, ...).

Author:: Jens Keilwagen
See Also:: AbstractModel

Constructor Summary
`ModelTester()`

Method Summary
`static double`	`getKLDivergence(Model m1, Model m2, int length)` Returns the Kullback-Leibler-divergence `D(p_m1\|\|p_m2)`.
`static double`	`getLogLikelihood(Model m, Sample data)` Returns the log-likelihood of a `Sample` `data` for a given model `m`.
`static double`	`getLogLikelihood(Model m, Sample data, double[] weights)` Returns the log-likelihood of a `Sample` `data` for a given model `m`, using one weight per element of the `Sample`.
`static double`	`getMarginalDistribution(Model m, int[] constraint)` This method computes the marginal distribution for any discrete model `m` and all sequences that fulfill the `constraint`, if possible.
`static double`	`getMaxOfDeviation(Model m1, Model m2, int length)` This method computes the maximum deviation between the probabilities for all sequences of `length` for discrete models `m1` and `m2`.
`static Sequence`	`getMostProbableSequence(Model m, int length)` Returns one most probable sequence for the discrete model `m`.
`static double`	`getShannonEntropy(Model m, int length)` This method computes the Shannon entropy for any discrete model `m` and all sequences of `length`, if possible.
`static double`	`getShannonEntropyInBits(Model m, int length)` This method computes the Shannon entropy in bits for any discrete model `m` and all sequences of `length`, if possible.
`static double`	`getSumOfDeviation(Model m1, Model m2, int length)` This method computes the sum of deviations between the probabilities for all sequences of `length` for discrete models `m1` and `m2`.
`static double`	`getSumOfDistribution(Model m, int length)` This method computes the sum of the probabilities of all sequences of `length` for any discrete model `m`, if possible.
`static double`	`getSymKLDivergence(Model m1, Model m2, int length)` Returns the symmetric Kullback-Leibler divergence, i.e. `D(p_m1\|\|p_m2) + D(p_m2\|\|p_m1)`.
`static double`	`getValueOfAIC(Model m, Sample s, int k)` This method computes the value of Akaike's Information Criterion (AIC).
`static double`	`getValueOfBIC(Model m, Sample s, int k)` This method computes the value of the Bayesian Information Criterion (BIC).

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

ModelTester

public ModelTester()

Method Detail

getKLDivergence

public static double getKLDivergence(Model m1,
                                     Model m2,
                                     int length)
                              throws Exception

Returns the Kullback-Leibler-divergence D(p_m1||p_m2).

Computes \sum_x p(x|m1) * \log \frac{p(x|m1)}{p(x|m2)}.

Parameters:: m1 - one discrete model; m2 - another discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the Kullback-Leibler-divergence
Throws:: Exception - if something went wrong
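
The formula above can be illustrated without the Jstacs classes. The following is a minimal sketch that assumes both distributions are already given as plain arrays over the same set of sequences; the class `KlSketch` and its method are hypothetical names for illustration and are not part of `ModelTester`.

```java
/** Illustration only: distributions are plain arrays, not Jstacs Model objects. */
final class KlSketch {

    /** Returns sum_x p[x] * log(p[x] / q[x]) using the natural logarithm. */
    static double klDivergence(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same size");
        }
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            if (p[x] > 0.0) { // terms with p[x] == 0 contribute 0 by convention
                sum += p[x] * Math.log(p[x] / q[x]);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        // The divergence is not symmetric: the two values differ.
        System.out.println(klDivergence(p, q));
        System.out.println(klDivergence(q, p));
    }
}
```

Because the sum runs over all sequences of the given length, the number of terms grows exponentially with `length` when the distributions come from enumerating a model.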

getSymKLDivergence

public static double getSymKLDivergence(Model m1,
                                        Model m2,
                                        int length)
                                 throws Exception

Returns the symmetric Kullback-Leibler divergence, i.e. the sum D(p_m1||p_m2) + D(p_m2||p_m1).

Computes \sum_x (p(x|m1)-p(x|m2)) * \log \frac{p(x|m1)}{p(x|m2)}.

Parameters:: m1 - one discrete model; m2 - another discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the symmetric Kullback-Leibler divergence
Throws:: Exception - if something went wrong
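
Expanding the computed expression term by term gives \sum_x p(x|m1) \log \frac{p(x|m1)}{p(x|m2)} + \sum_x p(x|m2) \log \frac{p(x|m2)}{p(x|m1)}, which is the sum of the two directed divergences. The sketch below checks this identity numerically on plain arrays. It assumes strictly positive probabilities, and `SymKlSketch` is a hypothetical helper, not part of `ModelTester`.

```java
/** Illustration only: checks the identity on plain arrays with strictly positive entries. */
final class SymKlSketch {

    /** Directed divergence sum_x p[x] * log(p[x] / q[x]). */
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += p[x] * Math.log(p[x] / q[x]);
        }
        return sum;
    }

    /** The expression from the method description: sum_x (p[x] - q[x]) * log(p[x] / q[x]). */
    static double symKl(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += (p[x] - q[x]) * Math.log(p[x] / q[x]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.3, 0.2};
        double[] q = {0.2, 0.2, 0.6};
        System.out.println(symKl(p, q));
        System.out.println(kl(p, q) + kl(q, p)); // same value up to rounding
    }
}
```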

getLogLikelihood

public static double getLogLikelihood(Model m,
                                      Sample data)
                               throws Exception

Returns the log-likelihood of a Sample data for a given model m.

Parameters:: m - the given model; data - the Sample
Returns:: the log-likelihood of data
Throws:: Exception - if something went wrong

getLogLikelihood

public static double getLogLikelihood(Model m,
                                      Sample data,
                                      double[] weights)
                               throws Exception

Returns the log-likelihood of a Sample data for a given model m, using one weight per element of the Sample.

Parameters:: m - the given model; data - the Sample; weights - the weight for each element of the Sample
Returns:: the log-likelihood of data
Throws:: Exception - if something went wrong
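
As a rough illustration of the two overloads, the sketch below computes a log-likelihood from per-sequence probabilities. It assumes that each weight multiplies the log-probability of the corresponding element, which the documentation does not state explicitly, and it replaces the `Model` and `Sample` by a plain array of probabilities. `LogLikelihoodSketch` is a hypothetical name.

```java
import java.util.Arrays;

/** Illustration only: the sample is represented by the probability of each of its sequences. */
final class LogLikelihoodSketch {

    /** Unweighted case: every element of the sample counts once. */
    static double logLikelihood(double[] probabilities) {
        double[] ones = new double[probabilities.length];
        Arrays.fill(ones, 1.0);
        return logLikelihood(probabilities, ones);
    }

    /** Weighted case (assumption): sum_i weights[i] * log(probabilities[i]). */
    static double logLikelihood(double[] probabilities, double[] weights) {
        if (probabilities.length != weights.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        double sum = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            sum += weights[i] * Math.log(probabilities[i]);
        }
        return sum;
    }
}
```

Under this assumption, a weight of 2 has the same effect as listing the sequence twice in the sample.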

getMarginalDistribution

public static double getMarginalDistribution(Model m,
                                             int[] constraint)
                                      throws Exception

This method computes the marginal distribution for any discrete model m and all sequences that fulfill the constraint, if possible.

Parameters:: m - a discrete model; constraint - constraint[i] < 0 stands for an irrelevant position, constraint[i] = c with 0 <= c < m.getAlphabets()[(m.getLength()==0)?0:i].getAlphabetLength() is the encoded character of position i
Returns:: the marginal distribution for a discrete model
Throws:: Exception - if something went wrong
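
The constraint encoding can be made concrete with a small sketch. It assumes a hypothetical position-independent model stored as a matrix `pwm[position][symbol]` instead of a Jstacs `Model`, and sums the probabilities of all encoded sequences that agree with the constraint. Negative entries mark positions that are summed out.

```java
/** Illustration only: brute-force marginal for a hypothetical position-independent model. */
final class MarginalSketch {

    /** Probability of an encoded sequence: product of the per-position probabilities. */
    static double probability(double[][] pwm, int[] seq) {
        double p = 1.0;
        for (int i = 0; i < seq.length; i++) {
            p *= pwm[i][seq[i]];
        }
        return p;
    }

    /** Sums the probabilities of all sequences that fulfill the constraint. */
    static double marginal(double[][] pwm, int[] constraint) {
        return sum(pwm, constraint, new int[constraint.length], 0);
    }

    private static double sum(double[][] pwm, int[] constraint, int[] seq, int pos) {
        if (pos == seq.length) {
            return probability(pwm, seq);
        }
        if (constraint[pos] >= 0) { // fixed position: only the given symbol is allowed
            seq[pos] = constraint[pos];
            return sum(pwm, constraint, seq, pos + 1);
        }
        double total = 0.0; // irrelevant position: sum over all symbols
        for (int c = 0; c < pwm[pos].length; c++) {
            seq[pos] = c;
            total += sum(pwm, constraint, seq, pos + 1);
        }
        return total;
    }
}
```

For example, with three positions and the constraint `{1, -1, 0}`, the sketch sums over both symbols at the middle position while fixing the first and last.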

getMaxOfDeviation

public static double getMaxOfDeviation(Model m1,
                                       Model m2,
                                       int length)
                                throws Exception

This method computes the maximum deviation between the probabilities for all sequences of length for discrete models m1 and m2.

Parameters:: m1 - one discrete model; m2 - another discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the maximum deviation between the probabilities
Throws:: Exception - if something went wrong

getMostProbableSequence

public static Sequence getMostProbableSequence(Model m,
                                               int length)
                                        throws Exception

Returns one most probable sequence for the discrete model m. (There may be more than one most probable sequence. In this case, only one of them is returned.)

This is only a standard implementation. For some special models, such as Markov models, the most probable sequence can be computed much faster with a dynamic programming algorithm.

Parameters:: m - the discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: one most probable sequence
Throws:: Exception - if something went wrong
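
A generic approach that works for any discrete model is to enumerate every sequence and keep the best one. The sketch below shows that idea under stated assumptions: the model is replaced by an arbitrary probability function over encoded sequences, all positions share one alphabet size, and `MostProbableSketch` is a hypothetical name. It is not claimed to be the implementation used by `ModelTester`.

```java
import java.util.function.ToDoubleFunction;

/** Illustration only: exhaustive search over all encoded sequences of a given length. */
final class MostProbableSketch {

    /** Returns one sequence with maximal probability; ties keep the first one found. */
    static int[] mostProbable(ToDoubleFunction<int[]> probability, int alphabetSize, int length) {
        int[] seq = new int[length];
        int[] best = seq.clone();
        double bestP = probability.applyAsDouble(seq);
        while (increment(seq, alphabetSize)) {
            double p = probability.applyAsDouble(seq);
            if (p > bestP) {
                bestP = p;
                best = seq.clone();
            }
        }
        return best;
    }

    /** Advances seq like an odometer; returns false after the last sequence. */
    static boolean increment(int[] seq, int alphabetSize) {
        for (int i = seq.length - 1; i >= 0; i--) {
            if (++seq[i] < alphabetSize) {
                return true;
            }
            seq[i] = 0;
        }
        return false;
    }
}
```

The search visits alphabetSize^length sequences, which is why model-specific dynamic programming is preferable for longer sequences.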

getShannonEntropy

public static double getShannonEntropy(Model m,
                                       int length)
                                throws Exception

This method computes the Shannon entropy for any discrete model m and all sequences of length, if possible.

Parameters:: m - the discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the Shannon entropy for a discrete model
Throws:: Exception - if something went wrong

getShannonEntropyInBits

public static double getShannonEntropyInBits(Model m,
                                             int length)
                                      throws Exception

This method computes the Shannon entropy in bits for any discrete model m and all sequences of length, if possible.

Parameters:: m - the discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the Shannon entropy in bits for a discrete model
Throws:: Exception - if something went wrong
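
The two entropy variants differ only by the base of the logarithm. The sketch below assumes that the plain variant uses the natural logarithm (an assumption, since the documentation does not name the unit) and converts to bits by dividing by ln 2. `EntropySketch` is a hypothetical helper operating on a plain array of probabilities.

```java
/** Illustration only: entropy of a distribution given as a plain array. */
final class EntropySketch {

    /** Returns -sum_x p[x] * ln p[x]; terms with p[x] == 0 contribute 0. */
    static double entropyNats(double[] p) {
        double sum = 0.0;
        for (double px : p) {
            if (px > 0.0) {
                sum -= px * Math.log(px);
            }
        }
        return sum;
    }

    /** Same quantity in bits, i.e. with the logarithm to base 2. */
    static double entropyBits(double[] p) {
        return entropyNats(p) / Math.log(2.0);
    }

    public static void main(String[] args) {
        double[] uniform = {0.25, 0.25, 0.25, 0.25};
        System.out.println(entropyNats(uniform)); // ln 4
        System.out.println(entropyBits(uniform)); // 2 bits, up to rounding
    }
}
```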

getSumOfDeviation

public static double getSumOfDeviation(Model m1,
                                       Model m2,
                                       int length)
                                throws Exception

This method computes the sum of deviations between the probabilities for all sequences of length for discrete models m1 and m2.

Parameters:: m1 - one discrete model; m2 - another discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the sum of deviations between the probabilities
Throws:: Exception - if something went wrong
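
The documentation does not define "deviation" precisely. The sketch below assumes it means the absolute difference |p(x|m1) - p(x|m2)| and shows both the sum and the maximum over all sequences, with the two distributions given as plain arrays. `DeviationSketch` is a hypothetical name.

```java
/** Illustration only: deviations are assumed to be absolute differences of probabilities. */
final class DeviationSketch {

    /** Sum over x of |p[x] - q[x]|. */
    static double sumOfDeviation(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += Math.abs(p[x] - q[x]);
        }
        return sum;
    }

    /** Maximum over x of |p[x] - q[x]|. */
    static double maxOfDeviation(double[] p, double[] q) {
        double max = 0.0;
        for (int x = 0; x < p.length; x++) {
            max = Math.max(max, Math.abs(p[x] - q[x]));
        }
        return max;
    }
}
```

Both values are 0 exactly when the two arrays agree on every sequence, which makes them a quick check for whether two models describe the same distribution.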

getSumOfDistribution

public static double getSumOfDistribution(Model m,
                                          int length)
                                   throws Exception

This method computes the sum of the probabilities of all sequences of length for any discrete model m, if possible. It can therefore be used to check whether a model defines a proper distribution or whether the implementation contains mistakes.

The expected result is 1.0, but because of the limited floating-point precision in Java the result is rarely exactly 1.0.

Math.abs( 1.0d - getSumOfDistribution( m, length ) ) should be smaller than 1E-10.

Parameters:: m - the discrete model; length - the length of the sequence (for inhomogeneous models length has to be Model.getLength())
Returns:: the sum of the probabilities of all sequences of the given length
Throws:: Exception - if something went wrong
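
The normalization check can be sketched without Jstacs. The example below assumes a hypothetical homogeneous first-order Markov chain given by a start distribution and a transition matrix, sums the probabilities of all sequences of a given length (at least 1), and compares the result with 1.0 using the tolerance mentioned above.

```java
/** Illustration only: normalization check for a hypothetical first-order Markov chain. */
final class SumOfDistributionSketch {

    /** Probability of an encoded sequence of length >= 1. */
    static double probability(double[] start, double[][] trans, int[] seq) {
        double p = start[seq[0]];
        for (int i = 1; i < seq.length; i++) {
            p *= trans[seq[i - 1]][seq[i]];
        }
        return p;
    }

    /** Sums the probabilities of all sequences of the given length (length >= 1). */
    static double sumOverAllSequences(double[] start, double[][] trans, int length) {
        int[] seq = new int[length];
        double sum = 0.0;
        do {
            sum += probability(start, trans, seq);
        } while (increment(seq, start.length));
        return sum;
    }

    /** Advances seq like an odometer; returns false after the last sequence. */
    static boolean increment(int[] seq, int alphabetSize) {
        for (int i = seq.length - 1; i >= 0; i--) {
            if (++seq[i] < alphabetSize) {
                return true;
            }
            seq[i] = 0;
        }
        return false;
    }

    public static void main(String[] args) {
        double[] start = {0.2, 0.8};
        double[][] trans = {{0.6, 0.4}, {0.3, 0.7}};
        double sum = sumOverAllSequences(start, trans, 4);
        System.out.println(Math.abs(1.0d - sum) < 1E-10);
    }
}
```

If the start distribution in this sketch is changed so that it no longer sums to 1, the check fails, which is the kind of implementation mistake the method is meant to reveal.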

getValueOfAIC

public static double getValueOfAIC(Model m,
                                   Sample s,
                                   int k)
                            throws Exception

This method computes the value of Akaike's Information Criterion (AIC). It uses the formula AIC = 2 * log L(t,x) - 2 * k, where L(t,x) is the likelihood of the Sample and k is the number of parameters in the model.

The value of the AIC can be used for model selection.

Parameters:: m - a trained model; s - the Sample for the test; k - the number of parameters of the model m
Returns:: the value of AIC
Throws:: Exception - if something went wrong

getValueOfBIC

public static double getValueOfBIC(Model m,
                                   Sample s,
                                   int k)
                            throws Exception

This method computes the value of the Bayesian Information Criterion (BIC). It uses the formula BIC = 2 * log L(t,x) - k * log n, where L(t,x) is the likelihood of the Sample, k is the number of parameters in the model, and n is the number of sequences in the Sample.

The value of the BIC can be used for model selection.

Parameters:: m - a trained model; s - the Sample for the test; k - the number of parameters of the model m
Returns:: the value of BIC
Throws:: Exception - if something went wrong
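
Both criteria are simple functions of the log-likelihood once it is known. The sketch below evaluates the two formulas as written in this documentation, taking an already computed log-likelihood instead of a `Model` and a `Sample`. `InformationCriteriaSketch` and its parameter names are hypothetical.

```java
/** Illustration only: the AIC and BIC formulas from this page applied to a given log-likelihood. */
final class InformationCriteriaSketch {

    /** AIC = 2 * log L - 2 * k, with k the number of model parameters. */
    static double aic(double logLikelihood, int k) {
        return 2.0 * logLikelihood - 2.0 * k;
    }

    /** BIC = 2 * log L - k * log n, with n the number of sequences in the sample. */
    static double bic(double logLikelihood, int k, int n) {
        return 2.0 * logLikelihood - k * Math.log(n);
    }

    public static void main(String[] args) {
        // Two hypothetical models evaluated on the same sample of 100 sequences.
        System.out.println(aic(-100.0, 3) + " vs " + aic(-98.5, 10));
        System.out.println(bic(-100.0, 3, 100) + " vs " + bic(-98.5, 10, 100));
    }
}
```

With this sign convention, which is the negative of the common 2 * k - 2 * log L form, a larger value indicates the better trade-off between fit and number of parameters.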
