de.jstacs.classifiers.assessment
Class Sampled_RepeatedHoldOutExperiment

java.lang.Object
  extended by de.jstacs.classifiers.assessment.ClassifierAssessment<Sampled_RepeatedHoldOutAssessParameterSet>
      extended by de.jstacs.classifiers.assessment.Sampled_RepeatedHoldOutExperiment

public class Sampled_RepeatedHoldOutExperiment
extends ClassifierAssessment<Sampled_RepeatedHoldOutAssessParameterSet>

This class is a special ClassifierAssessment that partitions the data of a user-specified reference class (typically the smallest class) and samples non-overlapping data sets for all other classes, so that the train data sets of all classes contain the same number of sequences (with the same sequence lengths), and likewise the test data sets.
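
The splitting idea described above can be illustrated with a small, library-independent sketch. It is not the Jstacs implementation: the class name SampledHoldOutSketch, the method split and the parameter trainFraction are hypothetical, sequences are represented only by their indices, and the matching of sequence lengths is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SampledHoldOutSketch {

    /** Indices of one class, split into a train part and a test part. */
    public static final class Split {
        public final List<Integer> train;
        public final List<Integer> test;

        Split(List<Integer> train, List<Integer> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Partitions the reference class by trainFraction and draws equally sized,
     * non-overlapping train and test parts from every other class.
     * All names here are hypothetical; sequence lengths are not modelled.
     *
     * @param classSizes     number of sequences per class
     * @param referenceClass index of the reference class
     * @param trainFraction  fraction of the reference class used for training
     * @param rng            source of randomness for the sampling
     */
    public static Split[] split(int[] classSizes, int referenceClass,
                                double trainFraction, Random rng) {
        int refSize = classSizes[referenceClass];
        int nTrain = (int) Math.round(refSize * trainFraction);
        int nTest = refSize - nTrain;

        Split[] result = new Split[classSizes.length];
        for (int c = 0; c < classSizes.length; c++) {
            if (classSizes[c] < refSize) {
                throw new IllegalArgumentException(
                        "class " + c + " is smaller than the reference class");
            }
            // Shuffle all indices of this class, then cut two disjoint slices.
            List<Integer> idx = new ArrayList<>();
            for (int i = 0; i < classSizes[c]; i++) {
                idx.add(i);
            }
            Collections.shuffle(idx, rng);
            result[c] = new Split(
                    new ArrayList<>(idx.subList(0, nTrain)),
                    new ArrayList<>(idx.subList(nTrain, nTrain + nTest)));
        }
        return result;
    }

    public static void main(String[] args) {
        Split[] s = split(new int[] {100, 5000}, 0, 0.7, new Random(42));
        for (int c = 0; c < s.length; c++) {
            System.out.println("class " + c + ": train=" + s[c].train.size()
                    + ", test=" + s[c].test.size());
        }
    }
}
```

For the reference class the two slices together cover the whole class (a partition); for a larger class they cover only a random sample of it, which is why the sketch requires every other class to be at least as large as the reference class.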

Author:
Jens Keilwagen
See Also:
Sampled_RepeatedHoldOutAssessParameterSet

Field Summary
 
Fields inherited from class de.jstacs.classifiers.assessment.ClassifierAssessment
myAbstractClassifier, myModel, myTempMeanResultSets, skipLastClassifiersDuringClassifierTraining
 
Constructor Summary
  Sampled_RepeatedHoldOutExperiment(AbstractClassifier... aCs)
          Creates a new Sampled_RepeatedHoldOutExperiment from a set of AbstractClassifiers.
  Sampled_RepeatedHoldOutExperiment(AbstractClassifier[] aCs, boolean buildClassifiersByCrossProduct, TrainableStatisticalModel[]... aMs)
          This constructor allows assessing a collection of given AbstractClassifiers and those constructed from the given TrainableStatisticalModels in a Sampled_RepeatedHoldOutExperiment.
protected Sampled_RepeatedHoldOutExperiment(AbstractClassifier[] aCs, TrainableStatisticalModel[][] aMs, boolean buildClassifiersByCrossProduct, boolean checkAlphabetConsistencyAndLength)
          Creates a new Sampled_RepeatedHoldOutExperiment from an array of AbstractClassifiers and a two-dimensional array of TrainableStatisticalModels, which are combined into additional classifiers.
  Sampled_RepeatedHoldOutExperiment(boolean buildClassifiersByCrossProduct, TrainableStatisticalModel[]... aMs)
          Creates a new Sampled_RepeatedHoldOutExperiment from a set of TrainableStatisticalModels.
 
Method Summary
protected  void evaluateClassifier(NumericalPerformanceMeasureParameterSet mp, Sampled_RepeatedHoldOutAssessParameterSet assessPS, DataSet[] s, double[][] weights, ProgressUpdater pU)
          This method must be implemented in all subclasses.
 Sampled_RepeatedHoldOutAssessParameterSet getAssessParameterSet()
          This method returns an instance of ClassifierAssessmentAssessParameterSet that can be used in the assess methods.
 
Methods inherited from class de.jstacs.classifiers.assessment.ClassifierAssessment
assess, assess, assess, assess, getClassifier, getNameOfAssessment, prepareAssessment, test, train
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Sampled_RepeatedHoldOutExperiment

protected Sampled_RepeatedHoldOutExperiment(AbstractClassifier[] aCs,
                                            TrainableStatisticalModel[][] aMs,
                                            boolean buildClassifiersByCrossProduct,
                                            boolean checkAlphabetConsistencyAndLength)
                                     throws IllegalArgumentException,
                                            WrongAlphabetException,
                                            CloneNotSupportedException,
                                            ClassDimensionException
Creates a new Sampled_RepeatedHoldOutExperiment from an array of AbstractClassifiers and a two-dimensional array of TrainableStatisticalModels, which are combined into additional classifiers. If buildClassifiersByCrossProduct is true, the cross-product of all TrainableStatisticalModels in aMs is built to obtain these classifiers.

Parameters:
aCs - the predefined classifiers
aMs - the TrainableStatisticalModels that are used to build additional classifiers
buildClassifiersByCrossProduct - determines how classifiers are constructed from the given models. Suppose a k-class problem. Each classifier then consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i], for i = 0, ..., k-1. Let S be the cross-product S_0 x S_1 x ... x S_(k-1).

true: one classifier is constructed for each element of S, i.e. for each possible combination of k models, one per class
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all second dimensions of aMs must have the same length, say m. In total, m classifiers are constructed.
checkAlphabetConsistencyAndLength - indicates if alphabets and lengths shall be checked for consistency
Throws:
IllegalArgumentException - if the classifiers have different lengths
WrongAlphabetException - if the classifiers use different alphabets
CloneNotSupportedException - if something went wrong while cloning
ClassDimensionException - if there is something wrong with the class dimension of the classifier
See Also:
ClassifierAssessment.ClassifierAssessment(AbstractClassifier[], TrainableStatisticalModel[][], boolean, boolean)
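
The two settings of buildClassifiersByCrossProduct described above can be illustrated with a minimal sketch. It is not Jstacs code: plain strings stand in for TrainableStatisticalModels, and the class ModelCombinationSketch with its methods crossProduct and aligned is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ModelCombinationSketch {

    /** "true" case: one combination per element of the cross-product. */
    public static List<String[]> crossProduct(String[][] aMs) {
        List<String[]> combos = new ArrayList<>();
        combos.add(new String[0]);
        for (String[] modelsOfClass : aMs) {
            // Extend every partial combination by each model of this class.
            List<String[]> extended = new ArrayList<>();
            for (String[] prefix : combos) {
                for (String model : modelsOfClass) {
                    String[] combo = Arrays.copyOf(prefix, prefix.length + 1);
                    combo[prefix.length] = model;
                    extended.add(combo);
                }
            }
            combos = extended;
        }
        return combos;
    }

    /** "false" case: combination i takes the i-th model of every class. */
    public static List<String[]> aligned(String[][] aMs) {
        int m = aMs[0].length;
        for (String[] modelsOfClass : aMs) {
            if (modelsOfClass.length != m) {
                throw new IllegalArgumentException(
                        "all second dimensions of aMs must have the same length");
            }
        }
        List<String[]> combos = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            String[] combo = new String[aMs.length];
            for (int c = 0; c < aMs.length; c++) {
                combo[c] = aMs[c][i];
            }
            combos.add(combo);
        }
        return combos;
    }

    public static void main(String[] args) {
        String[][] aMs = { {"fg1", "fg2"}, {"bg1", "bg2"} };
        for (String[] c : crossProduct(aMs)) {
            System.out.println("cross-product: " + Arrays.toString(c));
        }
        for (String[] c : aligned(aMs)) {
            System.out.println("aligned:       " + Arrays.toString(c));
        }
    }
}
```

With two models per class in a two-class problem, the cross-product yields four combinations, whereas the aligned variant yields two.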

Sampled_RepeatedHoldOutExperiment

public Sampled_RepeatedHoldOutExperiment(AbstractClassifier... aCs)
                                  throws IllegalArgumentException,
                                         WrongAlphabetException,
                                         CloneNotSupportedException,
                                         ClassDimensionException
Creates a new Sampled_RepeatedHoldOutExperiment from a set of AbstractClassifiers.

Parameters:
aCs - contains the classifiers to be assessed.
If model-based classifiers are trained, the order of the models in the classifiers determines which model is trained on which data set in method assess( ... ).
For a two-class problem, it is recommended
  • to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
  • to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
  • to give the data s in the order (s[0] contains foreground data, s[1] contains background data)
Throws:
IllegalArgumentException - if the classifiers have different lengths
WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer
CloneNotSupportedException - if something went wrong while cloning
ClassDimensionException - if there is something wrong with the class dimension of the classifier
See Also:
ClassifierAssessment.ClassifierAssessment(AbstractClassifier...)
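
The ordering recommendation above amounts to an index-based pairing of models and data sets. The following sketch shows only that convention; the class TrainingOrderSketch and the method pairByIndex are hypothetical, and strings stand in for models and data sets.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrainingOrderSketch {

    /** Pairs the model at index i with the data set at index i. */
    public static Map<String, String> pairByIndex(String[] models, String[] dataSets) {
        if (models.length != dataSets.length) {
            throw new IllegalArgumentException("need exactly one data set per model");
        }
        // LinkedHashMap keeps the class order: index 0 first, then index 1, ...
        Map<String, String> pairs = new LinkedHashMap<>();
        for (int i = 0; i < models.length; i++) {
            pairs.put(models[i], dataSets[i]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] models = {"foreground model", "background model"};
        String[] s = {"foreground data", "background data"};
        pairByIndex(models, s).forEach(
                (model, data) -> System.out.println(model + " is trained on " + data));
    }
}
```

Swapping the two entries of s without swapping the models would train the foreground model on background data, which is the mistake the recommended ordering is meant to prevent.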

Sampled_RepeatedHoldOutExperiment

public Sampled_RepeatedHoldOutExperiment(boolean buildClassifiersByCrossProduct,
                                         TrainableStatisticalModel[]... aMs)
                                  throws IllegalArgumentException,
                                         WrongAlphabetException,
                                         CloneNotSupportedException,
                                         ClassDimensionException
Creates a new Sampled_RepeatedHoldOutExperiment from a set of TrainableStatisticalModels. The argument buildClassifiersByCrossProduct determines how these TrainableStatisticalModels are combined to classifiers.

Parameters:
buildClassifiersByCrossProduct -
Determines how classifiers are constructed from the given models. Suppose a k-class problem. Each classifier then consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i], for i = 0, ..., k-1. Let S be the cross-product S_0 x S_1 x ... x S_(k-1).

true: one classifier is constructed for each element of S, i.e. for each possible combination of k models, one per class
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all second dimensions of aMs must have the same length, say m. In total, m classifiers are constructed.
aMs -
Contains the models in the following way (suppose a k-class problem): the first dimension encodes the class (its length is k), and the second dimension (aMs[i]) contains the models for class i.
If models are trained directly (during assessment), the order in which the models are given when this assessment object is initialized determines which data set is used for training which model. In general, the first model will be trained using the first data set in s... .
For a two-class problem, it is recommended
  • to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
  • to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
  • to give the data s in the order (s[0] contains foreground data, s[1] contains background data)
Throws:
IllegalArgumentException - if the classifiers have different lengths
WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer
CloneNotSupportedException - if something went wrong while cloning
ClassDimensionException - if there is something wrong with the class dimension of the classifier
See Also:
ClassifierAssessment.ClassifierAssessment(boolean, TrainableStatisticalModel[][])

Sampled_RepeatedHoldOutExperiment

public Sampled_RepeatedHoldOutExperiment(AbstractClassifier[] aCs,
                                         boolean buildClassifiersByCrossProduct,
                                         TrainableStatisticalModel[]... aMs)
                                  throws IllegalArgumentException,
                                         WrongAlphabetException,
                                         CloneNotSupportedException,
                                         ClassDimensionException
This constructor allows assessing a collection of given AbstractClassifiers and those constructed from the given TrainableStatisticalModels in a Sampled_RepeatedHoldOutExperiment.

Parameters:
aCs - contains AbstractClassifiers that should be assessed in addition to the AbstractClassifiers constructed from the given TrainableStatisticalModels
buildClassifiersByCrossProduct -
Determines how classifiers are constructed from the given models. Suppose a k-class problem. Each classifier then consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i], for i = 0, ..., k-1. Let S be the cross-product S_0 x S_1 x ... x S_(k-1).

true: one classifier is constructed for each element of S, i.e. for each possible combination of k models, one per class
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all second dimensions of aMs must have the same length, say m. In total, m classifiers are constructed.
aMs -
Contains the models in the following way (suppose a k-class problem): the first dimension encodes the class (its length is k), and the second dimension (aMs[i]) contains the models for class i.
If models are trained directly (during assessment), the order in which the models are given when this assessment object is initialized determines which data set is used for training which model. In general, the first model will be trained using the first data set in s... .
For a two-class problem, it is recommended
  • to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
  • to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
  • to give the data s in the order (s[0] contains foreground data, s[1] contains background data)
Throws:
IllegalArgumentException - if the classifiers have different lengths
WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer
CloneNotSupportedException - if something went wrong while cloning
ClassDimensionException - if there is something wrong with the class dimension of the classifier
See Also:
ClassifierAssessment.ClassifierAssessment(AbstractClassifier[], boolean, TrainableStatisticalModel[][])
Method Detail

evaluateClassifier

protected void evaluateClassifier(NumericalPerformanceMeasureParameterSet mp,
                                  Sampled_RepeatedHoldOutAssessParameterSet assessPS,
                                  DataSet[] s,
                                  double[][] weights,
                                  ProgressUpdater pU)
                           throws IllegalArgumentException,
                                  Exception
Description copied from class: ClassifierAssessment
This method must be implemented in all subclasses. It should perform the following tasks:
1.) create test and train datasets
2.) call method train() to train classifiers/models using train data
3.) call method test() to cause evaluation (test) of trained classifiers

Specified by:
evaluateClassifier in class ClassifierAssessment<Sampled_RepeatedHoldOutAssessParameterSet>
Parameters:
mp - defines which performance measures are used to assess classifiers
assessPS - contains assessment-specific parameters (for example, the number of iterations of a k-fold cross-validation)
s - data to be used for assessment (both: test and train data)
weights - the (non-negative) weights for the data: one weight for each data set (first dimension) and each sequence (second dimension); can be null, which is equivalent to weight 1 for all sequences in all data sets
pU - a ProgressUpdater that is mainly used to allow the user to cancel a currently running classifier assessment. This ProgressUpdater is guaranteed not to be null. In certain cases, for example in KFoldCrossValidation, aborting a classifier assessment is not allowed. In such cases the given ProgressUpdater should be ignored.

Usage:
  • pU.setMax(n), where n is the number of iterations of the assessment loop
  • iteration = 0;
  • assessment loop
    • pU.setValue(iteration + 1);
    • sample treatment
    • train();
    • test();
    • iteration++;
  • repeat unless (ready or pU.isCancelled())
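
The usage pattern above could be written as the following loop. This is a sketch under stated assumptions, not the Jstacs implementation: the nested interface Progress is a hypothetical stand-in for a ProgressUpdater that offers only the three calls named in the list, and the three Runnable arguments are placeholders for sample treatment, train() and test(). The loop stops as soon as all iterations are done or isCancelled() returns true.

```java
public class AssessmentLoopSketch {

    /** Hypothetical stand-in for a progress updater; not the Jstacs interface. */
    public interface Progress {
        void setMax(int max);
        void setValue(int value);
        boolean isCancelled();
    }

    /** Runs up to the given number of iterations and returns how many were completed. */
    public static int run(int repeats, Progress pU, Runnable sampleTreatment,
                          Runnable train, Runnable test) {
        pU.setMax(repeats);
        int iteration = 0;
        // Continue only while work remains and the user has not cancelled.
        while (iteration < repeats && !pU.isCancelled()) {
            pU.setValue(iteration + 1);
            sampleTreatment.run();
            train.run();
            test.run();
            iteration++;
        }
        return iteration;
    }

    public static void main(String[] args) {
        Progress console = new Progress() {
            private int max;

            @Override
            public void setMax(int max) {
                this.max = max;
            }

            @Override
            public void setValue(int value) {
                System.out.println("iteration " + value + " of " + max);
            }

            @Override
            public boolean isCancelled() {
                return false; // this console updater never cancels
            }
        };
        int done = run(3, console, () -> { }, () -> { }, () -> { });
        System.out.println("completed " + done + " iterations");
    }
}
```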
Throws:
IllegalArgumentException - if the given ClassifierAssessmentAssessParameterSet is of wrong type
Exception - that occurred during training or using classifiers/models
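
The convention for the weights parameter (null meaning weight 1 for every sequence, and all weights non-negative) can be made explicit with a small helper. The class WeightsSketch and the method resolve are hypothetical and only illustrate the convention; data sets are represented by their sizes.

```java
import java.util.Arrays;

public class WeightsSketch {

    /**
     * Returns explicit per-sequence weights. A null argument is expanded to
     * weight 1 for every sequence; negative weights are rejected.
     *
     * @param weights weights[c][n] for sequence n of data set c, or null
     * @param sizes   number of sequences in each data set
     */
    public static double[][] resolve(double[][] weights, int[] sizes) {
        if (weights != null && weights.length != sizes.length) {
            throw new IllegalArgumentException("need one weight array per data set");
        }
        double[][] resolved = new double[sizes.length][];
        for (int c = 0; c < sizes.length; c++) {
            if (weights == null) {
                // null is treated as weight 1 for all sequences.
                resolved[c] = new double[sizes[c]];
                Arrays.fill(resolved[c], 1.0);
            } else {
                if (weights[c].length != sizes[c]) {
                    throw new IllegalArgumentException(
                            "need one weight per sequence in data set " + c);
                }
                for (double w : weights[c]) {
                    if (w < 0) {
                        throw new IllegalArgumentException("weights must be non-negative");
                    }
                }
                resolved[c] = weights[c].clone();
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(resolve(null, new int[] {2, 3})));
    }
}
```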

getAssessParameterSet

public Sampled_RepeatedHoldOutAssessParameterSet getAssessParameterSet()
                                                                throws Exception
Description copied from class: ClassifierAssessment
This method returns an instance of ClassifierAssessmentAssessParameterSet that can be used in the assess methods.

Specified by:
getAssessParameterSet in class ClassifierAssessment<Sampled_RepeatedHoldOutAssessParameterSet>
Returns:
an instance of ClassifierAssessmentAssessParameterSet that can be used in the assess methods.
Throws:
Exception - if the parameter set could not be created properly
See Also:
ClassifierAssessment.assess(NumericalPerformanceMeasureParameterSet, ClassifierAssessmentAssessParameterSet, DataSet...), ClassifierAssessment.assess(NumericalPerformanceMeasureParameterSet, ClassifierAssessmentAssessParameterSet, ProgressUpdater, DataSet[]), ClassifierAssessment.assess(NumericalPerformanceMeasureParameterSet, ClassifierAssessmentAssessParameterSet, ProgressUpdater, DataSet[][]...), ClassifierAssessment.assess(NumericalPerformanceMeasureParameterSet, ClassifierAssessmentAssessParameterSet, ProgressUpdater, DataSet[], double[][])