de.jstacs.sequenceScores.statisticalModels.differentiable.directedGraphicalModels.structureLearning.measures
Class Measure

java.lang.Object
  extended by de.jstacs.sequenceScores.statisticalModels.differentiable.directedGraphicalModels.structureLearning.measures.Measure
All Implemented Interfaces:
InstantiableFromParameterSet, Storable, Cloneable
Direct Known Subclasses:
BTExplainingAwayResidual, BTMutualInformation, InhomogeneousMarkov, PMMExplainingAwayResidual, PMMMutualInformation

public abstract class Measure
extends Object
implements Cloneable, Storable, InstantiableFromParameterSet

Class for structure measures that derive an optimal structure with respect to some criterion within a class of possible structures from data.

Author:
Jan Grau

Nested Class Summary
static class Measure.MeasureParameterSet
          This class is the super class of any ParameterSet that can be used to instantiate a Measure.
 
Field Summary
protected  Measure.MeasureParameterSet parameters
          The parameters of this measure.
 
Constructor Summary
protected Measure(Measure.MeasureParameterSet parameters)
          Creates a new Measure from its Measure.MeasureParameterSet.
protected Measure(StringBuffer xml)
          Creates a new Measure from its XML-representation.
 
Method Summary
 Measure clone()
           
protected static void fillTensor(Tensor t, double[][] weights)
          Fills a Tensor t with the weights defined in weights.
protected static void fillTensor(Tensor t, double[][][] weights)
          Fills a Tensor t with the weights defined in weights.
protected static double[][][] getCMI(double[][][][][][] fgStats, double[][][][][][] bgStats, double n)
          Computes the conditional mutual information from fgStats and bgStats counted on sequences with a total weight of n.
protected static double[][] getCMI(double[][][][] fgStats, double[][][][] bgStats, double n, double nFg, double nBg)
          Computes the conditional mutual information from fgStats and bgStats counted on sequences with a total weight of nFg and nBg, respectively.
 InstanceParameterSet<Measure> getCurrentParameterSet()
          Returns the InstanceParameterSet that has been used to instantiate the current instance of the implementing class.
static double[][][] getEAR(double[][][][][][] fgStats, double[][][][][][] bgStats, double nFg, double nBg)
          Computes the explaining away residual from fgStats and bgStats counted on sequences with a total weight of nFg and nBg, respectively.
protected static double[][] getEAR(double[][][][] fgStats, double[][][][] bgStats, double nFg, double nBg)
          Computes the explaining away residual from fgStats and bgStats counted on sequences with a total weight of nFg and nBg, respectively.
abstract  String getInstanceName()
          Returns the name of the Measure and possibly some additional information about the current instance.
protected static double[][][] getMI(double[][][][][][] counts, double n)
          Computes the mutual information from counts counted on sequences with a total weight of n.
protected static double[][] getMI(double[][][][] counts, double n)
          Computes the mutual information from counts counted on sequences with a total weight of n.
abstract  int[][] getParents(DataSet fg, DataSet bg, double[] weightsFg, double[] weightsBg, int length)
          Returns the optimal parents for the given data and weights.
protected static double[][][][] getStatistics(DataSet s, double[] weights, int length, double ess)
          Counts the occurrences of symbols of the AlphabetContainer of DataSet s using weights.
protected static double[][][][][][] getStatisticsOrderTwo(DataSet s, double[] weights, int length, double ess)
          Counts the occurrences of symbols of the AlphabetContainer of DataSet s using weights.
abstract  String getXMLTag()
          Returns the XML-tag for storing this measure.
 boolean isShiftable()
          Indicates whether this Measure supports shifts.
protected static double sum(double[] ar)
          Computes the sum of all elements in the array ar.
protected static int[][] toParents(int[] o, byte order)
          Creates a new parent structure as defined by getParents(DataSet, DataSet, double[], double[], int) from an order and a topological ordering of positions.
 StringBuffer toXML()
          This method returns an XML representation as StringBuffer of an instance of the implementing class.
protected static double[] union(double[][] ar)
          Linearizes the arrays in the two-dimensional array ar to form a new, one-dimensional array.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

parameters

protected Measure.MeasureParameterSet parameters
The parameters of this measure.

Constructor Detail

Measure

protected Measure(StringBuffer xml)
           throws NonParsableException
Creates a new Measure from its XML-representation.

Parameters:
xml - the XML-representation
Throws:
NonParsableException - if the XML could not be parsed

Measure

protected Measure(Measure.MeasureParameterSet parameters)
           throws CloneNotSupportedException
Creates a new Measure from its Measure.MeasureParameterSet.

Parameters:
parameters - the parameters
Throws:
CloneNotSupportedException - if the parameters could not be cloned
Method Detail

getXMLTag

public abstract String getXMLTag()
Returns the XML-tag for storing this measure.

Returns:
the tag

toXML

public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.

Specified by:
toXML in interface Storable
Returns:
the XML representation

getCurrentParameterSet

public final InstanceParameterSet<Measure> getCurrentParameterSet()
                                                           throws Exception
Description copied from interface: InstantiableFromParameterSet
Returns the InstanceParameterSet that has been used to instantiate the current instance of the implementing class. If the current instance was not created using an InstanceParameterSet, an equivalent InstanceParameterSet should be returned, so that an instance created using this InstanceParameterSet would be in principle equal to the current instance.

Specified by:
getCurrentParameterSet in interface InstantiableFromParameterSet
Returns:
the current InstanceParameterSet
Throws:
Exception - if the InstanceParameterSet could not be returned

getInstanceName

public abstract String getInstanceName()
Returns the name of the Measure and possibly some additional information about the current instance.

Returns:
the name of the Measure

getParents

public abstract int[][] getParents(DataSet fg,
                                   DataSet bg,
                                   double[] weightsFg,
                                   double[] weightsBg,
                                   int length)
                            throws Exception
Returns the optimal parents for the given data and weights. The returned array of parents p at each position i is built as follows:

Parameters:
fg - the data of the current (foreground) class
bg - the data of the negative (background) class
weightsFg - the weights for the sequences of fg
weightsBg - the weights for the sequences of bg
length - the length of the model, must be equal to the length of the sequences
Returns:
the array p with the optimal parents
Throws:
Exception - if the lengths do not match or other problems concerning the data occur

clone

public Measure clone()
              throws CloneNotSupportedException
Overrides:
clone in class Object
Throws:
CloneNotSupportedException

toParents

protected static int[][] toParents(int[] o,
                                   byte order)
Creates a new parent structure as defined by getParents(DataSet, DataSet, double[], double[], int) from an order and a topological ordering of positions.

Parameters:
o - the topological ordering
order - the order
Returns:
the parent structure in a two-dimensional array
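
The following is a minimal sketch of how a parent structure could be derived from a topological ordering and an order. It is not the Jstacs source. It assumes that each position receives up to order predecessors from the ordering as parents, and that the last entry of each row is the position itself. The class name ToParentsSketch and the use of int instead of byte for the order are illustrative choices.

```java
import java.util.Arrays;

// Hypothetical sketch, not the Jstacs implementation.
class ToParentsSketch {

    // o: topological ordering of positions, order: maximal number of parents.
    // Assumed layout: parents[pos] = {parent_1, ..., parent_k, pos}.
    static int[][] toParents(int[] o, int order) {
        int[][] parents = new int[o.length][];
        for (int i = 0; i < o.length; i++) {
            // The first positions of the ordering have fewer predecessors.
            int numParents = Math.min(order, i);
            int[] p = new int[numParents + 1];
            for (int k = 0; k < numParents; k++) {
                p[k] = o[i - numParents + k];
            }
            p[numParents] = o[i]; // the position itself comes last
            parents[o[i]] = p;
        }
        return parents;
    }

    public static void main(String[] args) {
        // Ordering 2 -> 0 -> 1 with at most one parent per position.
        System.out.println(Arrays.deepToString(toParents(new int[] {2, 0, 1}, 1)));
        // prints [[2, 0], [0, 1], [2]]
    }
}
```

In this example, position 2 comes first in the ordering and therefore has no parent, position 0 has parent 2, and position 1 has parent 0.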

fillTensor

protected static void fillTensor(Tensor t,
                                 double[][] weights)
Fills a Tensor t with the weights defined in weights.

Parameters:
t - the Tensor to be filled
weights - the weights

fillTensor

protected static void fillTensor(Tensor t,
                                 double[][][] weights)
Fills a Tensor t with the weights defined in weights.

Parameters:
t - the Tensor to be filled
weights - the weights

getMI

protected static double[][][] getMI(double[][][][][][] counts,
                                    double n)
Computes the mutual information from counts counted on sequences with a total weight of n.

Parameters:
counts - the counts as returned by getStatisticsOrderTwo(DataSet, double[], int, double)
n - the total weight
Returns:
the mutual information

getCMI

protected static double[][][] getCMI(double[][][][][][] fgStats,
                                     double[][][][][][] bgStats,
                                     double n)
Computes the conditional mutual information from fgStats and bgStats counted on sequences with a total weight of n.

Parameters:
fgStats - the counts in the foreground sequences as returned by getStatisticsOrderTwo(DataSet, double[], int, double)
bgStats - the counts in the background sequences as returned by getStatisticsOrderTwo(DataSet, double[], int, double)
n - the total weight
Returns:
the conditional mutual information

getEAR

public static double[][][] getEAR(double[][][][][][] fgStats,
                                  double[][][][][][] bgStats,
                                  double nFg,
                                  double nBg)
Computes the explaining away residual from fgStats and bgStats counted on sequences with a total weight of nFg and nBg, respectively.

Parameters:
fgStats - the counts in the foreground sequences as returned by getStatisticsOrderTwo(DataSet, double[], int, double)
bgStats - the counts in the background sequences as returned by getStatisticsOrderTwo(DataSet, double[], int, double)
nFg - the total weight in the foreground
nBg - the total weight in the background
Returns:
the explaining away residual

getStatisticsOrderTwo

protected static double[][][][][][] getStatisticsOrderTwo(DataSet s,
                                                          double[] weights,
                                                          int length,
                                                          double ess)
                                                   throws Exception
Counts the occurrences of symbols of the AlphabetContainer of DataSet s using weights. The array counts is indexed as follows:
counts[first index][second index][third index][symbol at first index][symbol at second index][symbol at third index].

Parameters:
s - the data
weights - the weights
length - the length of the sequences
ess - the equivalent sample size
Returns:
the array counts with the symbol occurrences
Throws:
Exception - if the lengths do not match or other problems concerning the data occur

getStatistics

protected static double[][][][] getStatistics(DataSet s,
                                              double[] weights,
                                              int length,
                                              double ess)
                                       throws Exception
Counts the occurrences of symbols of the AlphabetContainer of DataSet s using weights. The array counts is indexed as follows:
counts[first index][second index][symbol at first index][symbol at second index].

Parameters:
s - the data
weights - the weights
length - the length of the sequences
ess - the equivalent sample size
Returns:
the array counts with the symbol occurrences
Throws:
Exception - if the lengths do not match or other problems concerning the data occur
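
The indexing described above can be illustrated with a small, self-contained sketch. It does not use the Jstacs DataSet or AlphabetContainer classes. Instead it assumes that sequences are already encoded as int arrays of symbol indices, that the alphabet size is passed explicitly, and that the equivalent sample size is spread uniformly over all symbol pairs as a pseudocount. All names (PairStatisticsSketch, pairCounts, alphabetSize) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch, not the Jstacs implementation.
class PairStatisticsSketch {

    // Returns counts[i][j][a][b]: weighted number of sequences with
    // symbol a at position i and symbol b at position j.
    static double[][][][] pairCounts(int[][] seqs, double[] weights,
                                     int length, int alphabetSize, double ess) {
        // Assumption: the ess is distributed uniformly over all symbol pairs.
        double pseudo = ess / ((double) alphabetSize * alphabetSize);
        double[][][][] counts = new double[length][length][alphabetSize][alphabetSize];
        for (double[][][] ci : counts) {
            for (double[][] cij : ci) {
                for (double[] row : cij) {
                    Arrays.fill(row, pseudo);
                }
            }
        }
        for (int s = 0; s < seqs.length; s++) {
            if (seqs[s].length != length) {
                throw new IllegalArgumentException("sequence " + s + " has wrong length");
            }
            for (int i = 0; i < length; i++) {
                for (int j = 0; j < length; j++) {
                    counts[i][j][seqs[s][i]][seqs[s][j]] += weights[s];
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] seqs = {{0, 1}, {0, 0}};
        double[] weights = {1.0, 2.0};
        double[][][][] c = pairCounts(seqs, weights, 2, 2, 0.0);
        System.out.println(c[0][1][0][1]); // weight of sequences with 0 at position 0 and 1 at position 1
    }
}
```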

getMI

protected static double[][] getMI(double[][][][] counts,
                                  double n)
Computes the mutual information from counts counted on sequences with a total weight of n.

Parameters:
counts - the counts as defined in getStatistics(DataSet, double[], int, double)
n - the total weight
Returns:
the mutual information
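
As an illustration, the sketch below computes pairwise mutual information from counts indexed as counts[i][j][a][b], using the standard definition I(X_i; X_j) = sum over a, b of p(a,b) * log( p(a,b) / (p(a) * p(b)) ) with p(a,b) = counts[i][j][a][b] / n. It is a sketch under assumptions, not the Jstacs source: the natural logarithm is used, cells with zero count are skipped, and n is assumed to equal the sum of each count table. The class and method names are hypothetical.

```java
// Hypothetical sketch, not the Jstacs implementation.
class MutualInformationSketch {

    // Mutual information of one pair of positions from its joint count table.
    static double pairMI(double[][] joint, double n) {
        double[] rowSum = new double[joint.length];
        double[] colSum = new double[joint[0].length];
        for (int a = 0; a < joint.length; a++) {
            for (int b = 0; b < joint[a].length; b++) {
                rowSum[a] += joint[a][b];
                colSum[b] += joint[a][b];
            }
        }
        double mi = 0.0;
        for (int a = 0; a < joint.length; a++) {
            for (int b = 0; b < joint[a].length; b++) {
                if (joint[a][b] > 0) {
                    // p(a,b) / (p(a) p(b)) = c * n / (rowSum * colSum)
                    mi += joint[a][b] / n
                        * Math.log(joint[a][b] * n / (rowSum[a] * colSum[b]));
                }
            }
        }
        return mi;
    }

    // Mutual information for all pairs of positions.
    static double[][] mutualInformation(double[][][][] counts, double n) {
        int length = counts.length;
        double[][] mi = new double[length][length];
        for (int i = 0; i < length; i++) {
            for (int j = 0; j < length; j++) {
                mi[i][j] = pairMI(counts[i][j], n);
            }
        }
        return mi;
    }

    public static void main(String[] args) {
        // Perfectly dependent binary positions: MI equals ln 2.
        System.out.println(pairMI(new double[][] {{5, 0}, {0, 5}}, 10));
        // Independent uniform positions: MI equals 0.
        System.out.println(pairMI(new double[][] {{2.5, 2.5}, {2.5, 2.5}}, 10));
    }
}
```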

getCMI

protected static double[][] getCMI(double[][][][] fgStats,
                                   double[][][][] bgStats,
                                   double n,
                                   double nFg,
                                   double nBg)
Computes the conditional mutual information from fgStats and bgStats counted on sequences with a total weight of nFg and nBg, respectively.

Parameters:
fgStats - the counts as defined in getStatistics(DataSet, double[], int, double) on the foreground
bgStats - the counts as defined in getStatistics(DataSet, double[], int, double) on the background
n - the total weight
nFg - the total weight in the foreground
nBg - the total weight in the background
Returns:
the conditional mutual information
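
A possible reading of the parameters is sketched below: the conditional mutual information given the class is the class-weighted average of the within-class mutual information, I(X_i; X_j | C) = (nFg / n) * I_fg(X_i; X_j) + (nBg / n) * I_bg(X_i; X_j). This is a sketch under assumptions and not the Jstacs source. It assumes n = nFg + nBg, uses the natural logarithm, and works on a single pair of positions. All names are hypothetical.

```java
// Hypothetical sketch, not the Jstacs implementation.
class ConditionalMutualInformationSketch {

    // Mutual information of one pair of positions from its joint count table.
    static double pairMI(double[][] joint, double n) {
        double[] rowSum = new double[joint.length];
        double[] colSum = new double[joint[0].length];
        for (int a = 0; a < joint.length; a++) {
            for (int b = 0; b < joint[a].length; b++) {
                rowSum[a] += joint[a][b];
                colSum[b] += joint[a][b];
            }
        }
        double mi = 0.0;
        for (int a = 0; a < joint.length; a++) {
            for (int b = 0; b < joint[a].length; b++) {
                if (joint[a][b] > 0) {
                    mi += joint[a][b] / n
                        * Math.log(joint[a][b] * n / (rowSum[a] * colSum[b]));
                }
            }
        }
        return mi;
    }

    // Class-weighted average of the within-class mutual information.
    static double pairCMI(double[][] fg, double[][] bg,
                          double n, double nFg, double nBg) {
        return nFg / n * pairMI(fg, nFg) + nBg / n * pairMI(bg, nBg);
    }

    public static void main(String[] args) {
        double[][] fg = {{5, 0}, {0, 5}};         // dependent in the foreground
        double[][] bg = {{2.5, 2.5}, {2.5, 2.5}}; // independent in the background
        System.out.println(pairCMI(fg, bg, 20, 10, 10)); // half of ln 2
    }
}
```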

getEAR

protected static double[][] getEAR(double[][][][] fgStats,
                                   double[][][][] bgStats,
                                   double nFg,
                                   double nBg)
Computes the explaining away residual from fgStats and bgStats counted on sequences with a total weight of nFg and nBg, respectively.

Parameters:
fgStats - the counts as defined in getStatistics(DataSet, double[], int, double) on the foreground
bgStats - the counts as defined in getStatistics(DataSet, double[], int, double) on the background
nFg - the total weight in the foreground
nBg - the total weight in the background
Returns:
the explaining away residual
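
The explaining away residual is commonly defined as the difference between the conditional mutual information given the class and the unconditional mutual information, EAR(X_i, X_j) = I(X_i; X_j | C) - I(X_i; X_j). The sketch below follows that definition and may differ in detail from what Jstacs computes. It assumes that the unconditional term is taken from the pooled foreground and background counts and that the total weight is nFg + nBg. All names are hypothetical.

```java
// Hypothetical sketch, not the Jstacs implementation.
class ExplainingAwayResidualSketch {

    // Mutual information of one pair of positions from its joint count table.
    static double pairMI(double[][] joint, double n) {
        double[] rowSum = new double[joint.length];
        double[] colSum = new double[joint[0].length];
        for (int a = 0; a < joint.length; a++) {
            for (int b = 0; b < joint[a].length; b++) {
                rowSum[a] += joint[a][b];
                colSum[b] += joint[a][b];
            }
        }
        double mi = 0.0;
        for (int a = 0; a < joint.length; a++) {
            for (int b = 0; b < joint[a].length; b++) {
                if (joint[a][b] > 0) {
                    mi += joint[a][b] / n
                        * Math.log(joint[a][b] * n / (rowSum[a] * colSum[b]));
                }
            }
        }
        return mi;
    }

    // EAR for one pair of positions: I(X_i; X_j | C) - I(X_i; X_j).
    static double pairEAR(double[][] fg, double[][] bg, double nFg, double nBg) {
        double n = nFg + nBg;
        double[][] pooled = new double[fg.length][fg[0].length];
        for (int a = 0; a < fg.length; a++) {
            for (int b = 0; b < fg[a].length; b++) {
                pooled[a][b] = fg[a][b] + bg[a][b];
            }
        }
        double cmi = nFg / n * pairMI(fg, nFg) + nBg / n * pairMI(bg, nBg);
        return cmi - pairMI(pooled, n);
    }

    // EAR for all pairs of positions.
    static double[][] ear(double[][][][] fgStats, double[][][][] bgStats,
                          double nFg, double nBg) {
        int length = fgStats.length;
        double[][] res = new double[length][length];
        for (int i = 0; i < length; i++) {
            for (int j = 0; j < length; j++) {
                res[i][j] = pairEAR(fgStats[i][j], bgStats[i][j], nFg, nBg);
            }
        }
        return res;
    }

    public static void main(String[] args) {
        // Opposite dependencies in the two classes cancel in the pooled data.
        double[][] fg = {{5, 0}, {0, 5}};
        double[][] bg = {{0, 5}, {5, 0}};
        System.out.println(pairEAR(fg, bg, 10, 10));
    }
}
```

In the example, the two positions are perfectly dependent within each class but appear independent once the classes are pooled, so the residual is large. A pair with the same dependency in both classes would yield a value near zero or below.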

sum

protected static double sum(double[] ar)
Computes the sum of all elements in the array ar.

Parameters:
ar - the array
Returns:
the sum of the elements of the array

union

protected static double[] union(double[][] ar)
Linearizes the arrays in the two-dimensional array ar to form a new, one-dimensional array.

Parameters:
ar - the two-dimensional array
Returns:
the linearized one-dimensional array
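
The two array helpers sum and union can be sketched as follows. This is an illustration of the described behavior rather than the Jstacs source. It assumes that union concatenates the rows in their given order. The class name ArrayHelpersSketch is hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch, not the Jstacs implementation.
class ArrayHelpersSketch {

    // Sum of all elements of ar.
    static double sum(double[] ar) {
        double s = 0.0;
        for (double v : ar) {
            s += v;
        }
        return s;
    }

    // Concatenates the rows of ar, in order, into one array.
    static double[] union(double[][] ar) {
        int total = 0;
        for (double[] row : ar) {
            total += row.length;
        }
        double[] res = new double[total];
        int pos = 0;
        for (double[] row : ar) {
            System.arraycopy(row, 0, res, pos, row.length);
            pos += row.length;
        }
        return res;
    }

    public static void main(String[] args) {
        // Total weight over foreground and background weights.
        double[] all = union(new double[][] {{1.0, 2.0}, {3.0}});
        System.out.println(Arrays.toString(all)); // prints [1.0, 2.0, 3.0]
        System.out.println(sum(all));             // prints 6.0
    }
}
```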

isShiftable

public boolean isShiftable()
Indicates whether this Measure supports shifts.

Returns:
true if this Measure supports shifts, false otherwise