java.lang.Object
  de.jstacs.utils.StatisticalModelTester
public class StatisticalModelTester
This class provides tests for (discrete) models. It implements several statistics (log-likelihood, Shannon entropy, AIC, BIC, ...) for comparing models.
See Also: StatisticalModel

| Constructor Summary |
|---|
| StatisticalModelTester() |
| Method Summary | |
|---|---|
| static double | getKLDivergence(StatisticalModel m1, StatisticalModel m2, int length)<br>Returns the Kullback-Leibler divergence D(p_m1\|\|p_m2). |
| static double | getLogLikelihood(StatisticalModel m, DataSet data)<br>Returns the log-likelihood of a DataSet data for a given model m. |
| static double | getLogLikelihood(StatisticalModel m, DataSet data, double[] weights)<br>Returns the log-likelihood of a DataSet data for a given model m. |
| static double | getMarginalDistribution(StatisticalModel m, int[] constraint)<br>This method computes the marginal distribution for any discrete model m and all sequences that fulfill the constraint, if possible. |
| static double | getMaxOfDeviation(StatisticalModel m1, StatisticalModel m2, int length)<br>This method computes the maximum deviation between the probabilities for all sequences of the given length for discrete models m1 and m2. |
| static Sequence | getMostProbableSequence(SequenceScore m, int length)<br>Returns one most probable sequence for the discrete model m. |
| static double | getShannonEntropy(StatisticalModel m, int length)<br>This method computes the Shannon entropy for any discrete model m and all sequences of the given length, if possible. |
| static double | getShannonEntropyInBits(StatisticalModel m, int length)<br>This method computes the Shannon entropy in bits for any discrete model m and all sequences of the given length, if possible. |
| static double | getSumOfDeviation(StatisticalModel m1, StatisticalModel m2, int length)<br>This method computes the sum of deviations between the probabilities for all sequences of the given length for discrete models m1 and m2. |
| static double | getSumOfDistribution(StatisticalModel m, int length)<br>This method computes the marginal distribution for any discrete model m over all sequences of the given length, i.e. the sum of their probabilities, if possible. |
| static double | getSymKLDivergence(StatisticalModel m1, StatisticalModel m2, int length)<br>Returns the symmetric Kullback-Leibler divergence, i.e. D(p_m1\|\|p_m2) + D(p_m2\|\|p_m1). |
| static double | getValueOfAIC(StatisticalModel m, DataSet s, int k)<br>This method computes the value of Akaike's Information Criterion (AIC). |
| static double | getValueOfBIC(StatisticalModel m, DataSet s, int k)<br>This method computes the value of the Bayesian Information Criterion (BIC). |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public StatisticalModelTester()
| Method Detail |
|---|
public static double getKLDivergence(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
Returns the Kullback-Leibler divergence D(p_m1||p_m2).
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Returns:
\sum_x p(x|m1) * \log \frac{p(x|m1)}{p(x|m2)}
Throws:
Exception - if something went wrong
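
The formula above can be illustrated without the Jstacs classes. The following is a minimal sketch, assuming both models have already been turned into explicit probability arrays over the same set of sequences; the class `KlDivergenceSketch`, its method, and the use of the natural logarithm are assumptions made for illustration and are not part of the Jstacs API.

```java
/** Sketch: Kullback-Leibler divergence between two explicit discrete distributions. */
final class KlDivergenceSketch {

    /**
     * Computes D(p||q) = sum_x p[x] * log(p[x] / q[x]) with the natural logarithm.
     * Terms with p[x] == 0 contribute 0 by convention; if q[x] == 0 while
     * p[x] > 0, the result is positive infinity.
     */
    static double klDivergence(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same support");
        }
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            if (p[x] > 0.0) {
                sum += p[x] * Math.log(p[x] / q[x]);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        System.out.println("D(p||q) = " + klDivergence(p, q));
        System.out.println("D(q||p) = " + klDivergence(q, p));
    }
}
```

Running `main` prints two different values, which shows that the divergence is not symmetric in its arguments.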
public static double getSymKLDivergence(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
Returns the symmetric Kullback-Leibler divergence, i.e. the sum of the two directed divergences D(p_m1||p_m2) + D(p_m2||p_m1).
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Returns:
\sum_x (p(x|m1)-p(x|m2)) * \log \frac{p(x|m1)}{p(x|m2)}
Throws:
Exception - if something went wrong
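
The returned sum \sum_x (p(x|m1)-p(x|m2)) * \log \frac{p(x|m1)}{p(x|m2)} can be checked numerically: expanding it term by term gives the two directed divergences added together. The sketch below assumes explicit, strictly positive probability arrays; `SymKlSketch` and its methods are hypothetical helpers, not Jstacs code.

```java
/** Sketch: the symmetric KL formula compared with the two directed divergences. */
final class SymKlSketch {

    /** Directed divergence D(p||q) for strictly positive p and q (natural logarithm). */
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += p[x] * Math.log(p[x] / q[x]);
        }
        return sum;
    }

    /** Evaluates sum_x (p[x] - q[x]) * log(p[x] / q[x]) directly. */
    static double symmetricKl(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += (p[x] - q[x]) * Math.log(p[x] / q[x]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.2, 0.3, 0.5};
        double[] q = {0.4, 0.4, 0.2};
        System.out.println("direct formula : " + symmetricKl(p, q));
        System.out.println("D(p||q)+D(q||p): " + (kl(p, q) + kl(q, p)));
    }
}
```

Both printed values agree up to floating-point rounding, and swapping the arguments leaves the result unchanged.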
public static double getLogLikelihood(StatisticalModel m, DataSet data) throws Exception
Returns the log-likelihood of a DataSet data for a given model m.
Parameters:
m - the given model
data - the DataSet
Returns:
the log-likelihood of data
Throws:
Exception - if something went wrong
public static double getLogLikelihood(StatisticalModel m, DataSet data, double[] weights) throws Exception
Returns the log-likelihood of a DataSet data for a given model m.
Parameters:
m - the given model
data - the DataSet
weights - the weight for each element of the DataSet
Returns:
the log-likelihood of data
Throws:
Exception - if something went wrong
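
How the weights enter the log-likelihood is not spelled out above. A common convention, assumed here, is that each sequence's log-probability is multiplied by its weight and that missing weights mean a weight of 1 for every sequence. The sketch works on precomputed per-sequence log-probabilities; `LogLikelihoodSketch` is a hypothetical helper and does not call Jstacs.

```java
/** Sketch: weighted log-likelihood from per-sequence log-probabilities. */
final class LogLikelihoodSketch {

    /**
     * Returns sum_i weights[i] * logProbs[i].
     * Assumption: a null weights array stands for weight 1 on every sequence.
     */
    static double logLikelihood(double[] logProbs, double[] weights) {
        if (weights != null && weights.length != logProbs.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        double sum = 0.0;
        for (int i = 0; i < logProbs.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            sum += w * logProbs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] logProbs = {Math.log(0.5), Math.log(0.25)};
        System.out.println("unweighted: " + logLikelihood(logProbs, null));
        System.out.println("weighted  : " + logLikelihood(logProbs, new double[] {2.0, 1.0}));
    }
}
```

Under this convention an integer weight of 2 has the same effect as listing the sequence twice in the data set.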
public static double getMarginalDistribution(StatisticalModel m, int[] constraint) throws Exception
This method computes the marginal distribution for any discrete model m and all sequences that fulfill the constraint, if possible.
Parameters:
m - a discrete model
constraint - constraint[i] < 0 stands for an irrelevant position; constraint[i] = c with 0 <= c < m.getAlphabets()[(m.getLength()==0)?0:i].getAlphabetLength() is the encoded character of position i
Throws:
Exception - if something went wrong
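
The constraint encoding can be illustrated by brute force: every position with a negative entry is summed over the whole alphabet, and every other position is fixed to the given symbol. The sketch below assumes a single alphabet size for all positions and a model given as a plain function from an encoded sequence to its probability; `MarginalSketch` is a hypothetical helper, not the Jstacs implementation.

```java
import java.util.function.ToDoubleFunction;

/** Sketch: marginal probability of all sequences that agree with a constraint. */
final class MarginalSketch {

    /**
     * Sums prob(seq) over all sequences of length constraint.length in which
     * position i equals constraint[i] whenever constraint[i] >= 0.
     */
    static double marginal(ToDoubleFunction<int[]> prob, int alphabetSize, int[] constraint) {
        return sum(prob, alphabetSize, constraint, new int[constraint.length], 0);
    }

    private static double sum(ToDoubleFunction<int[]> prob, int alphabetSize,
                              int[] constraint, int[] seq, int pos) {
        if (pos == seq.length) {
            return prob.applyAsDouble(seq);
        }
        if (constraint[pos] >= 0) {
            // Fixed position: only the constrained symbol is allowed.
            seq[pos] = constraint[pos];
            return sum(prob, alphabetSize, constraint, seq, pos + 1);
        }
        // Irrelevant position: sum over every symbol of the alphabet.
        double total = 0.0;
        for (int c = 0; c < alphabetSize; c++) {
            seq[pos] = c;
            total += sum(prob, alphabetSize, constraint, seq, pos + 1);
        }
        return total;
    }
}
```

For a model with independent positions, the result for a constraint that fixes a single position is just the probability of that symbol at that position, which makes a convenient sanity check. The number of visited sequences grows exponentially with the number of irrelevant positions.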
public static double getMaxOfDeviation(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
This method computes the maximum deviation between the probabilities for all sequences of the given length for discrete models m1 and m2.
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong
public static Sequence getMostProbableSequence(SequenceScore m, int length) throws Exception
Returns one most probable sequence for the discrete model m. (There may be more than one most probable sequence. In this case only one of them is returned.)
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong
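
One way to find a most probable sequence for an arbitrary discrete model is exhaustive enumeration. The sketch below assumes a single alphabet size and a scoring function over encoded sequences; `MostProbableSketch` is hypothetical, and the search strategy actually used by Jstacs is not stated in this document.

```java
import java.util.function.ToDoubleFunction;

/** Sketch: brute-force search for one highest-scoring sequence. */
final class MostProbableSketch {

    /** Returns the first sequence (in counting order) that attains the maximal score. */
    static int[] mostProbable(ToDoubleFunction<int[]> score, int alphabetSize, int length) {
        int[] seq = new int[length];
        int[] best = seq.clone();
        double bestScore = score.applyAsDouble(seq);
        while (increment(seq, alphabetSize)) {
            double s = score.applyAsDouble(seq);
            if (s > bestScore) { // strict comparison: ties keep the earlier sequence
                bestScore = s;
                best = seq.clone();
            }
        }
        return best;
    }

    /** Advances seq like an odometer; returns false once every sequence was visited. */
    private static boolean increment(int[] seq, int alphabetSize) {
        for (int i = seq.length - 1; i >= 0; i--) {
            if (++seq[i] < alphabetSize) {
                return true;
            }
            seq[i] = 0;
        }
        return false;
    }
}
```

Because the comparison is strict, exactly one sequence is returned even when several share the maximal score, which mirrors the remark above.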
public static double getShannonEntropy(StatisticalModel m, int length) throws Exception
This method computes the Shannon entropy for any discrete model m and all sequences of the given length, if possible.
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong
public static double getShannonEntropyInBits(StatisticalModel m, int length) throws Exception
This method computes the Shannon entropy in bits for any discrete model m and all sequences of the given length, if possible.
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong
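
The two entropy methods above differ only in the unit. The sketch below assumes that the first variant uses the natural logarithm (the unit is not stated in this document) and shows the conversion to bits on an explicit probability array; `EntropySketch` is a hypothetical helper.

```java
/** Sketch: Shannon entropy of an explicit discrete distribution in nats and in bits. */
final class EntropySketch {

    /** H(p) = -sum_x p[x] * ln p[x]; outcomes with p[x] == 0 contribute 0. */
    static double entropyNats(double[] p) {
        double sum = 0.0;
        for (double px : p) {
            if (px > 0.0) {
                sum -= px * Math.log(px);
            }
        }
        return sum;
    }

    /** Converts the entropy from nats to bits by dividing by ln 2. */
    static double entropyBits(double[] p) {
        return entropyNats(p) / Math.log(2.0);
    }

    public static void main(String[] args) {
        double[] uniform = {0.25, 0.25, 0.25, 0.25};
        System.out.println("nats: " + entropyNats(uniform));
        System.out.println("bits: " + entropyBits(uniform));
    }
}
```

For a uniform distribution over four outcomes the entropy is ln 4 nats, which corresponds to 2 bits.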
public static double getSumOfDeviation(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
This method computes the sum of deviations between the probabilities for all sequences of the given length for discrete models m1 and m2.
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong
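
The maximum and the sum of deviations can be sketched side by side. The code below assumes that "deviation" means the absolute difference |p(x|m1) - p(x|m2)| for each sequence x, which is not confirmed by this document, and it operates on explicit probability arrays; `DeviationSketch` is a hypothetical helper.

```java
/** Sketch: maximum and summed absolute deviation between two explicit distributions. */
final class DeviationSketch {

    /** Returns max_x |p[x] - q[x]|. */
    static double maxDeviation(double[] p, double[] q) {
        checkSameLength(p, q);
        double max = 0.0;
        for (int x = 0; x < p.length; x++) {
            max = Math.max(max, Math.abs(p[x] - q[x]));
        }
        return max;
    }

    /** Returns sum_x |p[x] - q[x]|. */
    static double sumOfDeviation(double[] p, double[] q) {
        checkSameLength(p, q);
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += Math.abs(p[x] - q[x]);
        }
        return sum;
    }

    private static void checkSameLength(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same support");
        }
    }
}
```

Under this assumption, both values are 0 exactly when the two models assign the same probability to every sequence, and the sum is always at least as large as the maximum.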
public static double getSumOfDistribution(StatisticalModel m, int length) throws Exception
This method computes the marginal distribution for any discrete model m over all sequences of the given length, i.e. the sum of their probabilities, if possible. It can therefore be used as a hint whether a model defines a proper distribution or whether the implementation contains mistakes: Math.abs( 1.0d - getSumOfDistribution( m, length ) ) should be smaller than 1E-10.
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models, length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong
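
The normalization check described above can be sketched as follows. The code assumes a single alphabet size and a model given as a plain function from an encoded sequence to its probability; `NormalizationSketch` and its method names are hypothetical, and only the 1E-10 tolerance is taken from the text.

```java
import java.util.function.ToDoubleFunction;

/** Sketch: summing a model's probabilities over all sequences of a given length. */
final class NormalizationSketch {

    /** Returns the sum of prob(seq) over all alphabetSize^length sequences. */
    static double sumOfDistribution(ToDoubleFunction<int[]> prob, int alphabetSize, int length) {
        int[] seq = new int[length];
        double total = 0.0;
        do {
            total += prob.applyAsDouble(seq);
        } while (increment(seq, alphabetSize));
        return total;
    }

    /** Applies the tolerance from the documentation: |1 - sum| < 1E-10. */
    static boolean looksNormalized(ToDoubleFunction<int[]> prob, int alphabetSize, int length) {
        return Math.abs(1.0d - sumOfDistribution(prob, alphabetSize, length)) < 1E-10;
    }

    /** Advances seq like an odometer; returns false once every sequence was visited. */
    private static boolean increment(int[] seq, int alphabetSize) {
        for (int i = seq.length - 1; i >= 0; i--) {
            if (++seq[i] < alphabetSize) {
                return true;
            }
            seq[i] = 0;
        }
        return false;
    }
}
```

A model whose per-position probabilities do not sum to 1 fails the check, which is the kind of implementation mistake the method is meant to reveal. The enumeration is exponential in the length, so it is practical only for short sequences.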
public static double getValueOfAIC(StatisticalModel m, DataSet s, int k) throws Exception
This method computes the value of Akaike's Information Criterion (AIC), defined here as 2 * log L(t,x) - 2*k, where L(t,x) is the likelihood of the DataSet and k is the number of parameters in the model.
Parameters:
m - a trained model
s - the DataSet for the test
k - the number of parameters of the model m
Throws:
Exception - if something went wrong
public static double getValueOfBIC(StatisticalModel m, DataSet s, int k) throws Exception
This method computes the value of the Bayesian Information Criterion (BIC), defined here as 2 * log L(t,x) - k * log n, where L(t,x) is the likelihood of the DataSet, k is the number of parameters in the model and n is the number of sequences in the DataSet.
Parameters:
m - a trained model
s - the DataSet for the test
k - the number of parameters of the model m
Throws:
Exception - if something went wrong
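
Both criteria follow directly from the log-likelihood once it is known. The sketch below evaluates the two formulas as written above, 2 * log L(t,x) - 2*k and 2 * log L(t,x) - k * log n; it assumes the natural logarithm for log n, and `InformationCriteriaSketch` is a hypothetical helper that takes the log-likelihood as a plain number.

```java
/** Sketch: AIC and BIC in the sign convention used by this class. */
final class InformationCriteriaSketch {

    /** AIC = 2 * logLikelihood - 2 * k, with k the number of model parameters. */
    static double aic(double logLikelihood, int k) {
        return 2.0 * logLikelihood - 2.0 * k;
    }

    /** BIC = 2 * logLikelihood - k * ln(n), with n the number of sequences. */
    static double bic(double logLikelihood, int k, int n) {
        return 2.0 * logLikelihood - k * Math.log(n);
    }

    public static void main(String[] args) {
        double logLikelihood = -100.0; // illustrative value, not from real data
        System.out.println("AIC: " + aic(logLikelihood, 3));
        System.out.println("BIC: " + bic(logLikelihood, 3, 50));
    }
}
```

With this sign convention a higher likelihood raises the value and additional parameters lower it, so larger values favour a model. This is the negative of the more common form 2*k - 2 * log L.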