de.jstacs.classifier.assessment
Class RepeatedSubSamplingExperiment

java.lang.Object
  extended by de.jstacs.classifier.assessment.ClassifierAssessment
      extended by de.jstacs.classifier.assessment.RepeatedSubSamplingExperiment

public class RepeatedSubSamplingExperiment
extends ClassifierAssessment

This class implements a repeated subsampling experiment. A repeated subsampling experiment uses the following procedure to assess classifiers.
The user supplies one data set for each class the classifiers are able to distinguish. In each step, test and training data sets are generated by subsampling from these data sets. The user defines how many elements the subsampled test and training data sets should contain. After subsampling, the training data sets are used to train the classifiers, and the test data sets are used to assess how well the classifiers predict the elements therein. In addition, the user defines how often this procedure is repeated and which assessment measures are used to assess the classifiers.
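
The procedure described above can be illustrated with a small, library-free sketch. It is not the Jstacs implementation: the class RepeatedSubSamplingSketch, its methods, the nearest-mean toy classifier, and the accuracy measure are all hypothetical stand-ins. The sketch also assumes that the training and test subsamples are drawn without replacement and do not overlap, which the description above does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of repeated subsampling; not part of Jstacs. */
class RepeatedSubSamplingSketch {

    /** Arithmetic mean of a non-empty list. */
    static double mean(List<Double> xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.size();
    }

    /** Toy classifier: returns the index of the class whose training mean is closest to x. */
    static int classify(double x, double[] classMeans) {
        int best = 0;
        for (int c = 1; c < classMeans.length; c++) {
            if (Math.abs(x - classMeans[c]) < Math.abs(x - classMeans[best])) {
                best = c;
            }
        }
        return best;
    }

    /** Repeats subsample/train/test and returns the accuracy averaged over all repeats. */
    static double run(List<List<Double>> dataPerClass, int trainSize, int testSize,
                      int repeats, long seed) {
        if (trainSize < 1 || testSize < 1 || repeats < 1) {
            throw new IllegalArgumentException("sizes and repeats must be positive");
        }
        Random rnd = new Random(seed);
        double accuracySum = 0.0;
        for (int r = 0; r < repeats; r++) {
            double[] means = new double[dataPerClass.size()];
            List<List<Double>> test = new ArrayList<>();
            for (int c = 0; c < dataPerClass.size(); c++) {
                List<Double> shuffled = new ArrayList<>(dataPerClass.get(c));
                if (trainSize + testSize > shuffled.size()) {
                    throw new IllegalArgumentException("class " + c + " has too few elements");
                }
                // Assumption: disjoint training and test subsamples, drawn without replacement.
                Collections.shuffle(shuffled, rnd);
                means[c] = mean(shuffled.subList(0, trainSize));            // "train"
                test.add(shuffled.subList(trainSize, trainSize + testSize));
            }
            int correct = 0;
            int total = 0;
            for (int c = 0; c < test.size(); c++) {                         // "test"
                for (double x : test.get(c)) {
                    if (classify(x, means) == c) {
                        correct++;
                    }
                    total++;
                }
            }
            accuracySum += (double) correct / total;
        }
        return accuracySum / repeats;
    }

    public static void main(String[] args) {
        List<List<Double>> data = Arrays.asList(
                Arrays.asList(0.0, 0.5, 1.0, 1.5, 2.0, 2.5),        // class 0 (foreground)
                Arrays.asList(10.0, 10.5, 11.0, 11.5, 12.0, 12.5)); // class 1 (background)
        System.out.println("mean accuracy: " + run(data, 3, 2, 5, 42L));
    }
}
```

Averaging the measure over the repeats is one plausible way to summarize the experiment; which measures are actually computed is determined by the assessment measures the user passes in.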

Author:
andr|e gohr (a0coder (nospam:@) gmail (nospam:.) com)

Field Summary
 
Fields inherited from class de.jstacs.classifier.assessment.ClassifierAssessment
myAbstractClassifier, myBuildClassifierByCrossProduct, myModel, myTempMeanResultSets, skipLastClassifiersDuringClassifierTraining
 
Constructor Summary
  RepeatedSubSamplingExperiment(AbstractClassifier... aCs)
          Creates a new RepeatedSubSamplingExperiment from a set of AbstractClassifiers.
  RepeatedSubSamplingExperiment(AbstractClassifier[] aCs, boolean buildClassifiersByCrossProduct, Model[]... aMs)
          This constructor allows assessing a collection of given AbstractClassifiers together with those constructed from the given Models in a RepeatedSubSamplingExperiment.
protected RepeatedSubSamplingExperiment(AbstractClassifier[] aCs, Model[][] aMs, boolean buildClassifiersByCrossProduct, boolean checkAlphabetConsistencyAndLength)
          Creates a new RepeatedSubSamplingExperiment from an array of AbstractClassifiers and a two-dimensional array of Models, which are combined to additional classifiers.
  RepeatedSubSamplingExperiment(boolean buildClassifiersByCrossProduct, Model[]... aMs)
          Creates a new RepeatedSubSamplingExperiment from a set of Models.
 
Method Summary
protected  boolean evaluateClassifier(MeasureParameters mp, ClassifierAssessmentAssessParameterSet assessPS, Sample[] s, ProgressUpdater pU)
          This method must be implemented in all subclasses.
 
Methods inherited from class de.jstacs.classifier.assessment.ClassifierAssessment
assess, assess, assess, getClassifier, getNameOfAssessment, prepareAssessment, test, train
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RepeatedSubSamplingExperiment

protected RepeatedSubSamplingExperiment(AbstractClassifier[] aCs,
                                        Model[][] aMs,
                                        boolean buildClassifiersByCrossProduct,
                                        boolean checkAlphabetConsistencyAndLength)
                                 throws IllegalArgumentException,
                                        WrongAlphabetException,
                                        CloneNotSupportedException,
                                        ClassDimensionException
Creates a new RepeatedSubSamplingExperiment from an array of AbstractClassifiers and a two-dimensional array of Models, which are combined to additional classifiers. If buildClassifiersByCrossProduct is true, the cross product of all Models in aMs is built to obtain these classifiers.

Parameters:
aCs - the pre-defined classifiers
aMs - the Models that are used to build additional classifiers
buildClassifiersByCrossProduct - determines how classifiers are constructed from the given models. Suppose a k-class problem. In this case, each classifier consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i]. Let S be the set S_1 x S_2 x ... x S_k (cross product).

true: one classifier is constructed for each element of S, i.e., for each combination of k models with one model taken from each S_i
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all second dimensions of aMs must have the same length, say m. In total, m classifiers are constructed.
checkAlphabetConsistencyAndLength - indicates if alphabets and lengths shall be checked for consistency
Throws:
IllegalArgumentException
WrongAlphabetException
CloneNotSupportedException
ClassDimensionException
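
The two construction modes selected by buildClassifiersByCrossProduct can be sketched without the library as follows. This is only an illustration under assumptions: models are represented by plain strings, and the class ClassifierCombinationSketch with its methods crossProduct and indexWise is hypothetical and not part of Jstacs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the two ways to combine models into classifiers. */
class ClassifierCombinationSketch {

    /** Mode "true": one k-tuple of models for every element of S_1 x ... x S_k. */
    static List<String[]> crossProduct(String[][] aMs) {
        List<String[]> result = new ArrayList<>();
        result.add(new String[0]); // start with a single empty tuple
        for (String[] modelsOfClass : aMs) {
            List<String[]> extended = new ArrayList<>();
            for (String[] partial : result) {
                for (String model : modelsOfClass) {
                    // Append one model of the current class to every partial tuple.
                    String[] tuple = Arrays.copyOf(partial, partial.length + 1);
                    tuple[partial.length] = model;
                    extended.add(tuple);
                }
            }
            result = extended;
        }
        return result;
    }

    /** Mode "false": classifier i consists of aMs[0][i], ..., aMs[k-1][i]. */
    static List<String[]> indexWise(String[][] aMs) {
        List<String[]> result = new ArrayList<>();
        if (aMs.length == 0) {
            return result;
        }
        int m = aMs[0].length;
        for (String[] modelsOfClass : aMs) {
            if (modelsOfClass.length != m) {
                throw new IllegalArgumentException("all second dimensions must have the same length");
            }
        }
        for (int i = 0; i < m; i++) {
            String[] tuple = new String[aMs.length];
            for (int c = 0; c < aMs.length; c++) {
                tuple[c] = aMs[c][i];
            }
            result.add(tuple);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two classes with two candidate models each (names are made up).
        String[][] aMs = {{"fg1", "fg2"}, {"bg1", "bg2"}};
        for (String[] t : crossProduct(aMs)) {
            System.out.println("cross product: " + Arrays.toString(t));
        }
        for (String[] t : indexWise(aMs)) {
            System.out.println("index-wise:    " + Arrays.toString(t));
        }
    }
}
```

For two classes with two models each, the cross product yields four model combinations, while the index-wise mode yields two.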

RepeatedSubSamplingExperiment

public RepeatedSubSamplingExperiment(AbstractClassifier... aCs)
                              throws IllegalArgumentException,
                                     WrongAlphabetException,
                                     CloneNotSupportedException,
                                     ClassDimensionException
Creates a new RepeatedSubSamplingExperiment from a set of AbstractClassifiers.

Parameters:
aCs - contains the classifiers to be assessed
If model-based classifiers are trained, the order of the models in the classifiers determines which model is trained using which sample in method assess().
For a two-class problem, it is recommended
  • to initiate the classifiers with models in the order (foreground model (positive class), background model (negative class))
  • to initiate an assessment object using models in the order (foreground model (positive class), background model (negative class))
  • to give the data s in the order (s[0] contains foreground data, s[1] contains background data)
Throws:
IllegalArgumentException
WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer
ClassDimensionException
CloneNotSupportedException
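
The ordering recommendation for this constructor amounts to pairing the model at position i with the sample at position i. The following sketch only illustrates that pairing with plain strings; the class SampleOrderSketch and the method trainingPlan are hypothetical and do not exist in Jstacs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: model i is trained on sample i. */
class SampleOrderSketch {

    /** Returns a human-readable plan that pairs models[i] with samples[i]. */
    static List<String> trainingPlan(String[] models, String[] samples) {
        if (models.length != samples.length) {
            throw new IllegalArgumentException("one sample per model is required");
        }
        List<String> plan = new ArrayList<>();
        for (int i = 0; i < models.length; i++) {
            plan.add(models[i] + " <- " + samples[i]);
        }
        return plan;
    }

    public static void main(String[] args) {
        String[] models = {"foreground model", "background model"};
        String[] samples = {"s[0] (foreground data)", "s[1] (background data)"};
        for (String line : trainingPlan(models, samples)) {
            System.out.println(line);
        }
    }
}
```

Swapping the two samples while keeping the model order would train the foreground model on background data, which is why a consistent order matters.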

RepeatedSubSamplingExperiment

public RepeatedSubSamplingExperiment(boolean buildClassifiersByCrossProduct,
                                     Model[]... aMs)
                              throws IllegalArgumentException,
                                     WrongAlphabetException,
                                     CloneNotSupportedException,
                                     ClassDimensionException
Creates a new RepeatedSubSamplingExperiment from a set of Models. The argument buildClassifiersByCrossProduct determines how these Models are combined to classifiers.

Parameters:
buildClassifiersByCrossProduct -
Determines how classifiers are constructed from the given models. Suppose a k-class problem. In this case, each classifier consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i]. Let S be the set S_1 x S_2 x ... x S_k (cross product).

true: one classifier is constructed for each element of S, i.e., for each combination of k models with one model taken from each S_i
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all second dimensions of aMs must have the same length, say m. In total, m classifiers are constructed.
aMs -
Contains the models in the following way (suppose a k-class problem): the first dimension encodes the class (here its length is k), the second dimension (aMs[i]) contains the models for class i.
If models are trained directly (during assessment), the order of the models given when initiating this assessment object determines which sample is used for training which model. In general, the first model is trained using the first sample in s, and so on.
For a two-class problem, it is recommended
  • to initiate the classifiers with models in the order (foreground model (positive class), background model (negative class))
  • to initiate an assessment object using models in the order (foreground model (positive class), background model (negative class))
  • to give the data s in the order (s[0] contains foreground data, s[1] contains background data)
Throws:
WrongAlphabetException - if not all given models are defined on the same AlphabetContainer
IllegalArgumentException
CloneNotSupportedException
ClassDimensionException

RepeatedSubSamplingExperiment

public RepeatedSubSamplingExperiment(AbstractClassifier[] aCs,
                                     boolean buildClassifiersByCrossProduct,
                                     Model[]... aMs)
                              throws IllegalArgumentException,
                                     WrongAlphabetException,
                                     CloneNotSupportedException,
                                     ClassDimensionException
This constructor allows assessing a collection of given AbstractClassifiers together with those constructed from the given Models in a RepeatedSubSamplingExperiment.

Parameters:
aCs - contains the AbstractClassifiers that should be assessed in addition to the classifiers constructed from the given Models
buildClassifiersByCrossProduct -
Determines how classifiers are constructed from the given models. Suppose a k-class problem. In this case, each classifier consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i]. Let S be the set S_1 x S_2 x ... x S_k (cross product).

true: one classifier is constructed for each element of S, i.e., for each combination of k models with one model taken from each S_i
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all second dimensions of aMs must have the same length, say m. In total, m classifiers are constructed.
aMs -
Contains the models in the following way (suppose a k-class problem): the first dimension encodes the class (here its length is k), the second dimension (aMs[i]) contains the models for class i.
If models are trained directly (during assessment), the order of the models given when initiating this assessment object determines which sample is used for training which model. In general, the first model is trained using the first sample in s, and so on.
For a two-class problem, it is recommended
  • to initiate the classifiers with models in the order (foreground model (positive class), background model (negative class))
  • to initiate an assessment object using models in the order (foreground model (positive class), background model (negative class))
  • to give the data s in the order (s[0] contains foreground data, s[1] contains background data)
Throws:
WrongAlphabetException - if not all given models are defined on the same AlphabetContainer
IllegalArgumentException
CloneNotSupportedException
ClassDimensionException

Method Detail

evaluateClassifier

protected boolean evaluateClassifier(MeasureParameters mp,
                                     ClassifierAssessmentAssessParameterSet assessPS,
                                     Sample[] s,
                                     ProgressUpdater pU)
                              throws IllegalArgumentException,
                                     Exception
Description copied from class: ClassifierAssessment
This method must be implemented in all subclasses. It should perform the following tasks:
1.) create test and training data sets
2.) call method train to train the classifiers/models using the training data
3.) call method test to evaluate (test) the trained classifiers

Specified by:
evaluateClassifier in class ClassifierAssessment
Parameters:
mp - defines which performance-measures are used to assess classifiers
assessPS - contains assessment-specific parameters (e.g., the number of iterations of a k-fold cross-validation)
s - data to be used for assessment (both test and training data)
pU - a ProgressUpdater that is mainly used to allow the user to cancel a currently running classifier assessment. This ProgressUpdater is guaranteed not to be null. In certain cases, aborting a classifier assessment is not allowed, for example in the case of KFoldCrossValidation. In this case, the given ProgressUpdater should be ignored.

Usage:
  • pU.setMax()= number of iterations of the assessment-loop
  • iteration=0;
  • assessment-loop
    • pU.setValue()=iteration+1;
    • Sample treatment
    • train();
    • test();
    • iteration++;
  • repeat until (ready or pU.isCanceled())
Returns:
true, if no errors occurred
Throws:
IllegalArgumentException - if the given AssessParameterSet is of wrong type
Exception - if an error occurred during training or using classifiers/models
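
The usage pattern for the progress updater listed above can be sketched as a simple loop. This is a library-free illustration, not the Jstacs code: the interface Progress is a hypothetical stand-in whose method names follow the usage notes, and the class AssessmentLoopSketch with its helper cancelAfter exists only for this example.

```java
/** Hypothetical sketch of an assessment loop that honors a cancel request. */
class AssessmentLoopSketch {

    /** Stand-in for a progress updater; method names follow the usage notes. */
    interface Progress {
        void setMax(int max);
        void setValue(int value);
        boolean isCanceled();
    }

    /**
     * Runs at most the given number of iterations and stops early if the
     * progress object reports a cancel request. Returns the number of
     * completed iterations.
     */
    static int runLoop(int iterations, Progress pU, Runnable trainAndTest) {
        pU.setMax(iterations);
        int iteration = 0;
        while (iteration < iterations && !pU.isCanceled()) {
            pU.setValue(iteration + 1);
            trainAndTest.run(); // sample treatment, train(), test()
            iteration++;
        }
        return iteration;
    }

    /** Demo progress object that reports "canceled" once the given value has been reached. */
    static Progress cancelAfter(final int limit) {
        return new Progress() {
            private int seen = 0;
            public void setMax(int max) { /* nothing to display in this sketch */ }
            public void setValue(int value) { seen = value; }
            public boolean isCanceled() { return seen >= limit; }
        };
    }

    public static void main(String[] args) {
        Runnable step = () -> { /* placeholder for one train/test round */ };
        System.out.println("completed: " + runLoop(5, cancelAfter(3), step));
        System.out.println("completed: " + runLoop(5, cancelAfter(100), step));
    }
}
```

Checking the cancel flag at the start of each iteration lets a running round finish before the loop stops, so no partially evaluated round is left behind.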