ClassifierAssessment

de.jstacs.classifiers.assessment
Class ClassifierAssessment

java.lang.Object
  de.jstacs.classifiers.assessment.ClassifierAssessment

Direct Known Subclasses:: KFoldCrossValidation, RepeatedHoldOutExperiment, RepeatedSubSamplingExperiment, Sampled_RepeatedHoldOutExperiment

public abstract class ClassifierAssessment
extends Object

Class defining an assessment of classifiers.
It should be used as a superclass for specialized classifier assessments such as k-fold cross-validation or repeated subsampling.
Several standard tasks, such as managing, training and testing classifiers or models, are already implemented. The method assess( ... ) is the standard entry point for starting a classifier assessment. Subclasses have to implement the method evaluateClassifier( ... ), whose main job is to construct the test and training subsets of the given data. These subsets can then be passed to the methods test( ... ) and train( ... ), which are already implemented in a standard way.
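
The division of labour described above (assess( ... ) as entry point, evaluateClassifier( ... ) supplied by subclasses, train( ... ) and test( ... ) provided by the superclass) resembles the template-method pattern. The following is a minimal, self-contained sketch of that structure only. It does not use Jstacs: the classes Assessment and HoldOut, the int[] data and the toy "threshold" model are hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AssessmentSketch {

    /** Hypothetical stand-in for an abstract assessment; not the Jstacs class. */
    static abstract class Assessment {
        // One recorded value per train/test round (stand-in for a result set).
        protected final List<Double> results = new ArrayList<>();
        // The toy "trained model": the mean of the training values.
        private double threshold;

        /** Entry point: resets state, then delegates the data splitting to the subclass. */
        public final List<Double> assess(int[] data) {
            results.clear();
            evaluateClassifier(data);
            return new ArrayList<>(results);
        }

        /** Subclasses build training and test subsets and call train() and test(). */
        protected abstract void evaluateClassifier(int[] data);

        /** Standard training step shared by all subclasses. */
        protected void train(int[] trainData) {
            threshold = Arrays.stream(trainData).average().orElse(0.0);
        }

        /** Standard test step: fraction of test values above the trained threshold. */
        protected void test(int[] testData) {
            long above = Arrays.stream(testData).filter(v -> v > threshold).count();
            results.add((double) above / testData.length);
        }
    }

    /** A single hold-out split: first half for training, second half for testing. */
    static class HoldOut extends Assessment {
        @Override
        protected void evaluateClassifier(int[] data) {
            int mid = data.length / 2;
            train(Arrays.copyOfRange(data, 0, mid));
            test(Arrays.copyOfRange(data, mid, data.length));
        }
    }

    public static void main(String[] args) {
        List<Double> r = new HoldOut().assess(new int[] {1, 2, 3, 4, 5, 6, 7, 8});
        System.out.println(r);
    }
}
```

In this sketch a new assessment strategy only has to override evaluateClassifier; the training and testing logic is inherited unchanged.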

Author:: Andre Gohr (bioinf (nospam:.) ag (nospam:@) googlemail (nospam:.) com), Jens Keilwagen

Field Summary
`protected AbstractClassifier[]`	`myAbstractClassifier` This array contains the internally used classifiers.
`protected TrainableStatisticalModel[][]`	`myModel` This array contains, for each class, the internally used models.
`protected MeanResultSet[]`	`myTempMeanResultSets` The temporary result set.
`protected int`	`skipLastClassifiersDuringClassifierTraining` Skip last classifier.

Constructor Summary
	`ClassifierAssessment(AbstractClassifier... aCs)` Creates a new `ClassifierAssessment` from a set of `AbstractClassifier`s.
	`ClassifierAssessment(AbstractClassifier[] aCs, boolean buildClassifiersByCrossProduct, TrainableStatisticalModel[]... aMs)` This constructor allows assessing a collection of given `AbstractClassifier`s and, in addition, classifiers that will be constructed using the given `TrainableStatisticalModel`s.
`protected`	`ClassifierAssessment(AbstractClassifier[] aCs, TrainableStatisticalModel[][] aMs, boolean buildClassifiersByCrossProduct, boolean checkAlphabetConsistencyAndLength)` Creates a new `ClassifierAssessment` from an array of `AbstractClassifier`s and a two-dimensional array of `TrainableStatisticalModel`s, which are combined to additional classifiers.
	`ClassifierAssessment(boolean buildClassifiersByCrossProduct, TrainableStatisticalModel[]... aMs)` Creates a new `ClassifierAssessment` from a set of `TrainableStatisticalModel`s.

Method Summary
`ListResult`	`assess(NumericalPerformanceMeasureParameterSet mp, ClassifierAssessmentAssessParameterSet assessPS, DataSet... s)` Assesses the contained classifiers.
`ListResult`	`assess(NumericalPerformanceMeasureParameterSet mp, ClassifierAssessmentAssessParameterSet assessPS, ProgressUpdater pU, DataSet... s)` Assesses the contained classifiers.
`ListResult`	`assess(NumericalPerformanceMeasureParameterSet mp, ClassifierAssessmentAssessParameterSet assessPS, ProgressUpdater pU, DataSet[][]... s)` Assesses the contained classifiers.
`protected abstract void`	`evaluateClassifier(NumericalPerformanceMeasureParameterSet mp, ClassifierAssessmentAssessParameterSet assessPS, DataSet[] s, ProgressUpdater pU)` This method must be implemented in all subclasses.
`AbstractClassifier[]`	`getClassifier()` Returns a deep copy of all classifiers that have been or will be used in this assessment.
`String`	`getNameOfAssessment()` Returns the name of this class.
`protected void`	`prepareAssessment(DataSet... s)` Prepares an assessment.
`protected void`	`test(NumericalPerformanceMeasureParameterSet mp, boolean exception, DataSet... testS)` Uses the given test samples to call the `evaluate( ... )` methods of the local `AbstractClassifier`s.
`protected void`	`train(DataSet... trainS)` Trains the local classifiers using the given training samples.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

myAbstractClassifier

protected AbstractClassifier[] myAbstractClassifier

This array contains the internally used classifiers.

myModel

protected TrainableStatisticalModel[][] myModel

This array contains, for each class, the internally used models. This makes it possible to train some models only once while evaluating them in combination with other models.

myTempMeanResultSets

protected MeanResultSet[] myTempMeanResultSets

The temporary result set.

skipLastClassifiersDuringClassifierTraining

protected int skipLastClassifiersDuringClassifierTraining

Skip last classifier.

Constructor Detail

ClassifierAssessment

protected ClassifierAssessment(AbstractClassifier[] aCs,
                               TrainableStatisticalModel[][] aMs,
                               boolean buildClassifiersByCrossProduct,
                               boolean checkAlphabetConsistencyAndLength)
                        throws IllegalArgumentException,
                               WrongAlphabetException,
                               CloneNotSupportedException,
                               ClassDimensionException

Creates a new ClassifierAssessment from an array of AbstractClassifiers and a two-dimensional array of TrainableStatisticalModels, which are combined to additional classifiers. If buildClassifiersByCrossProduct is true, the cross product of all TrainableStatisticalModels in aMs is built to obtain these classifiers.

Parameters:: aCs - the predefined classifiers; aMs - the TrainableStatisticalModels that are used to build additional classifiers; buildClassifiersByCrossProduct - determines how classifiers are constructed from the given models. Suppose a k-class problem. In this case, each classifier consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i]. Let S be the set S_0 x S_1 x ... x S_(k-1) (cross product).

true: one classifier is constructed for each element of S, that is, for each combination of k models with one model taken from each S_i
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all arrays aMs[j] must have the same length, say m, and m classifiers are constructed in total.; checkAlphabetConsistencyAndLength - indicates whether alphabets and lengths shall be checked for consistency
Throws:: IllegalArgumentException - if the classifiers have different lengths; WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer; CloneNotSupportedException - if something went wrong while cloning; ClassDimensionException - if there is something wrong with the class dimension of the classifier
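
The two modes of buildClassifiersByCrossProduct differ only in which model indices are combined. The sketch below enumerates those index combinations for given per-class model counts. It is an illustration under stated assumptions, not the Jstacs implementation: the class CrossProductSketch and the method combinations are hypothetical, and sizes[c] merely plays the role of aMs[c].length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CrossProductSketch {

    /**
     * Enumerates model combinations for the given per-class model counts.
     * Each returned int[] holds, for every class c, the index of the model
     * chosen from the (hypothetical) array aMs[c].
     */
    static List<int[]> combinations(int[] sizes, boolean crossProduct) {
        List<int[]> result = new ArrayList<>();
        int k = sizes.length;
        if (k == 0 || Arrays.stream(sizes).anyMatch(n -> n == 0)) {
            return result; // nothing to combine
        }
        if (crossProduct) {
            // One combination per element of the cross product of the index sets.
            int[] current = new int[k];
            while (true) {
                result.add(current.clone());
                // Odometer-style increment; the last class varies fastest.
                int c = k - 1;
                while (c >= 0 && ++current[c] == sizes[c]) {
                    current[c] = 0;
                    c--;
                }
                if (c < 0) {
                    break; // every position wrapped around: enumeration finished
                }
            }
        } else {
            // Index-wise pairing requires the same number of models per class.
            for (int n : sizes) {
                if (n != sizes[0]) {
                    throw new IllegalArgumentException(
                            "all classes need the same number of models");
                }
            }
            for (int i = 0; i < sizes[0]; i++) {
                int[] combo = new int[k];
                Arrays.fill(combo, i); // model i of every class
                result.add(combo);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two classes with 2 and 3 models: 2 * 3 = 6 classifiers.
        System.out.println(combinations(new int[] {2, 3}, true).size());
        // Two classes with 3 models each, paired by index: 3 classifiers.
        System.out.println(combinations(new int[] {3, 3}, false).size());
    }
}
```

With the cross product the number of classifiers is the product of the per-class model counts, so it grows quickly; index-wise pairing always yields m classifiers.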

ClassifierAssessment

public ClassifierAssessment(AbstractClassifier... aCs)
                     throws IllegalArgumentException,
                            WrongAlphabetException,
                            CloneNotSupportedException,
                            ClassDimensionException

Creates a new ClassifierAssessment from a set of AbstractClassifiers.

Parameters:

aCs - contains the classifiers to be assessed.
If model-based classifiers are trained, the order of the models in the classifiers determines which model will be trained using which sample in method assess( ... ).
For a two-class problem, it is recommended

to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
to give the data s in the order (s[0] contains foreground data, s[1] contains background data)

Throws:

IllegalArgumentException - if the classifiers have different lengths

WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer

CloneNotSupportedException - if something went wrong while cloning

ClassDimensionException - if there is something wrong with the class dimension of the classifier

See Also:

ClassifierAssessment(AbstractClassifier[], TrainableStatisticalModel[][], boolean, boolean)

ClassifierAssessment

public ClassifierAssessment(boolean buildClassifiersByCrossProduct,
                            TrainableStatisticalModel[]... aMs)
                     throws IllegalArgumentException,
                            WrongAlphabetException,
                            CloneNotSupportedException,
                            ClassDimensionException

Creates a new ClassifierAssessment from a set of TrainableStatisticalModels. The argument buildClassifiersByCrossProduct determines how these TrainableStatisticalModels are combined to classifiers.

Parameters:

buildClassifiersByCrossProduct -
Determines how classifiers are constructed from the given models. Suppose a k-class problem. In this case, each classifier consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i]. Let S be the set S_0 x S_1 x ... x S_(k-1) (cross product).

true: one classifier is constructed for each element of S, that is, for each combination of k models with one model taken from each S_i
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all arrays aMs[j] must have the same length, say m, and m classifiers are constructed in total.

aMs -
Contains the models in the following way (suppose a k-class problem): the first dimension indexes the class (so it has length k), and the second dimension (aMs[i]) contains the models for class i.
If models are trained directly (during assessment), the order of the models given when initializing this assessment object determines which sample will be used for training which model. In general, the first model will be trained using the first sample in s, and so on.
For a two-class problem, it is recommended

to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
to give the data s in the order (s[0] contains foreground data, s[1] contains background data)

Throws:

IllegalArgumentException - if the classifiers have different lengths

WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer

CloneNotSupportedException - if something went wrong while cloning

ClassDimensionException - if there is something wrong with the class dimension of the classifier

See Also:

ClassifierAssessment(AbstractClassifier[], TrainableStatisticalModel[][], boolean, boolean)

ClassifierAssessment

public ClassifierAssessment(AbstractClassifier[] aCs,
                            boolean buildClassifiersByCrossProduct,
                            TrainableStatisticalModel[]... aMs)
                     throws IllegalArgumentException,
                            WrongAlphabetException,
                            CloneNotSupportedException,
                            ClassDimensionException

This constructor allows assessing a collection of given AbstractClassifiers and, in addition, classifiers that will be constructed using the given TrainableStatisticalModels.

Parameters:

aCs - contains some AbstractClassifiers that should be assessed in addition to the AbstractClassifiers constructed using the given TrainableStatisticalModels

buildClassifiersByCrossProduct - Determines how classifiers are constructed from the given models. Suppose a k-class problem. In this case, each classifier consists of k models, one responsible for each class.
Let S_i be the set of all models in aMs[i]. Let S be the set S_0 x S_1 x ... x S_(k-1) (cross product).

true: one classifier is constructed for each element of S, that is, for each combination of k models with one model taken from each S_i
false: for each fixed i, one classifier consisting of the models aMs[0][i], aMs[1][i], ..., aMs[k-1][i] is constructed. In this case, all arrays aMs[j] must have the same length, say m, and m classifiers are constructed in total.

aMs - Contains the models in the following way (suppose a k-class problem): the first dimension indexes the class (so it has length k), and the second dimension (aMs[i]) contains the models for class i.
If models are trained directly (during assessment), the order of the models given when initializing this assessment object determines which sample will be used for training which model. In general, the first model will be trained using the first sample in s, and so on.
For a two-class problem, it is recommended

to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
to give the data s in the order (s[0] contains foreground data, s[1] contains background data)

Throws:

IllegalArgumentException - if the classifiers have different lengths

WrongAlphabetException - if not all given classifiers are defined on the same AlphabetContainer

CloneNotSupportedException - if something went wrong while cloning

ClassDimensionException - if there is something wrong with the class dimension of the classifier

See Also:

ClassifierAssessment(AbstractClassifier[], TrainableStatisticalModel[][], boolean, boolean)

Method Detail

assess

public ListResult assess(NumericalPerformanceMeasureParameterSet mp,
                         ClassifierAssessmentAssessParameterSet assessPS,
                         DataSet... s)
                  throws IllegalArgumentException,
                         WrongAlphabetException,
                         Exception

Assesses the contained classifiers.

Parameters:

s - contains the data to be used for assessment. The order of the samples is important.
If model-based classifiers are trained, the order of the models in the classifiers determines which model will be trained using which sample. The first model in the classifier will be trained using the first sample in s. If the models are trained directly, the order of the models given when initializing this assessment object determines which sample will be used for training which model. In general, the first model will be trained using the first sample in s, and so on.
For a two-class problem, it is recommended

to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
to give the data s in the order (s[0] contains foreground data, s[1] contains background data)

mp - defines which performance measure should be used to assess classifiers

assessPS - contains some parameters necessary for assessment (depends on the kind of assessment!)

Returns:

a ListResult that contains the results (means and standard errors) of the performance measures that the user specified via the given NumericalPerformanceMeasureParameterSet.

Throws:

IllegalArgumentException - if the given assessPS is not of the right type (see method evaluateClassifier( ... ))

WrongAlphabetException - if the given samples s do not use the same AlphabetContainer as contained classifiers/models

Exception - forwarded from training/testing of classifiers/models
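
The returned results are described as means and standard errors of the chosen performance measures. As a worked illustration of these two statistics only, the sketch below aggregates one measure over several iterations. It assumes the common estimator "sample standard deviation (n-1 denominator) divided by sqrt(n)" for the standard error; whether Jstacs uses exactly this estimator is not stated here, and the class and method names are hypothetical.

```java
public class MeanStdErrSketch {

    /** Arithmetic mean of the per-iteration values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * Standard error of the mean, assuming the unbiased sample variance
     * (n-1 denominator). Returns NaN for fewer than two values.
     */
    static double standardError(double[] values) {
        int n = values.length;
        if (n < 2) {
            return Double.NaN;
        }
        double m = mean(values);
        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - m) * (v - m);
        }
        double sampleVariance = squaredDeviations / (n - 1);
        return Math.sqrt(sampleVariance / n);
    }

    public static void main(String[] args) {
        // Hypothetical values of one performance measure from three iterations.
        double[] perIteration = {0.8, 0.9, 1.0};
        System.out.printf("mean=%.4f stderr=%.4f%n",
                mean(perIteration), standardError(perIteration));
    }
}
```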

assess

public ListResult assess(NumericalPerformanceMeasureParameterSet mp,
                         ClassifierAssessmentAssessParameterSet assessPS,
                         ProgressUpdater pU,
                         DataSet... s)
                  throws IllegalArgumentException,
                         WrongAlphabetException,
                         Exception

Assesses the contained classifiers.

Parameters:

s - contains the data to be used for assessment. The order of the samples is important.
If model-based classifiers are trained, the order of the models in the classifiers determines which model will be trained using which sample. The first model in the classifier will be trained using the first sample in s. If the models are trained directly, the order of the models given when initializing this assessment object determines which sample will be used for training which model. In general, the first model will be trained using the first sample in s, and so on.
For a two-class problem, it is recommended

to initialize the classifiers with models in the order (foreground model (positive class), background model (negative class))
to initialize an assessment object using models in the order (foreground model (positive class), background model (negative class))
to give the data s in the order (s[0] contains foreground data, s[1] contains background data)

mp - defines which performance measure should be used to assess classifiers

assessPS - contains some parameters necessary for assessment (depends on the kind of assessment!)

pU - this ProgressUpdater may be used to cancel the method assess(): as soon as pU.isCancelled() returns true, assess() aborts but returns the results already computed.
In certain cases, aborting a classifier assessment is not allowed, for example in the case of KFoldCrossValidation. In this case, it might be wise to override this method such that it just returns an error message.
pU is allowed to be null, although in this case it may be more convenient to use the assess( ... ) method that does not require a ProgressUpdater.

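Returns:

a ListResult that contains the results (means and standard errors) of the performance measures that the user specified via the given NumericalPerformanceMeasureParameterSet.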

Throws:

IllegalArgumentException - if the given assessPS is not of the right type (see method evaluateClassifier( ... ))

WrongAlphabetException - if the given samples s do not use the same AlphabetContainer as contained classifiers/models

Exception - forwarded from training/testing of classifiers/models

assess

public ListResult assess(NumericalPerformanceMeasureParameterSet mp,
                         ClassifierAssessmentAssessParameterSet assessPS,
                         ProgressUpdater pU,
                         DataSet[][]... s)
                  throws IllegalArgumentException,
                         WrongAlphabetException,
                         Exception

Assesses the contained classifiers. In contrast to the other assess() methods, this one allows the user to predefine all training and test data that should be used for the assessment.

Parameters:: s - contains the data to be used for assessment.
The order of the samples in s is important.
s[iter][train/test][class]: the first dimension selects the iteration iter in which the samples are used.
The second dimension selects training data (s[iter][0]) or test data (s[iter][1]). s[iter][0] contains one training sample for each class. Analogously, s[iter][1] contains the test samples. For further details, see the comment of method assess(NumericalPerformanceMeasureParameterSet, ClassifierAssessmentAssessParameterSet, DataSet...).
It is up to the user whether to ensure that the given test and training data sets do not overlap.; mp - defines which performance measure should be used to assess classifiers; assessPS - contains some parameters necessary for assessment. Must be of type ClassifierAssessmentAssessParameterSet; pU - this ProgressUpdater allows aborting this classifier assessment. If pU.isCancelled() returns true, all results already computed will be returned. A null reference is allowed.
Returns:: a ListResult that contains the results (means and standard errors) of the performance measures that the user specified via the given NumericalPerformanceMeasureParameterSet.
Throws:: IllegalArgumentException - if the given assessPS is not of the right type (see method evaluateClassifier( ... )); WrongAlphabetException - if the given samples s do not use the same AlphabetContainer as contained classifiers/models; Exception - forwarded from training/testing of classifiers/models
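
The index layout s[iter][train/test][class] described above can be sketched with plain Java arrays. The example is a sketch under stated assumptions and does not use Jstacs: the class Data is a hypothetical stand-in for DataSet, and the interleaved split in buildSplits is just one arbitrary way to obtain non-overlapping training and test parts.

```java
import java.util.ArrayList;
import java.util.List;

public class PredefinedSplitsSketch {

    static final int TRAIN = 0; // second index for training data
    static final int TEST = 1;  // second index for test data

    /** Hypothetical stand-in for a data set: an immutable list of sequences. */
    static final class Data {
        final List<String> sequences;

        Data(List<String> sequences) {
            this.sequences = List.copyOf(sequences);
        }
    }

    /**
     * Builds s[iter][TRAIN or TEST][class] from one data set per class.
     * In iteration iter, element i of a class is test data if
     * i % iterations == iter, and training data otherwise.
     */
    static Data[][][] buildSplits(Data[] perClass, int iterations) {
        Data[][][] s = new Data[iterations][2][perClass.length];
        for (int iter = 0; iter < iterations; iter++) {
            for (int c = 0; c < perClass.length; c++) {
                List<String> train = new ArrayList<>();
                List<String> test = new ArrayList<>();
                List<String> all = perClass[c].sequences;
                for (int i = 0; i < all.size(); i++) {
                    (i % iterations == iter ? test : train).add(all.get(i));
                }
                s[iter][TRAIN][c] = new Data(train);
                s[iter][TEST][c] = new Data(test);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // Class 0: foreground data, class 1: background data (recommended order).
        Data foreground = new Data(List.of("a", "b", "c", "d"));
        Data background = new Data(List.of("w", "x", "y", "z"));
        Data[][][] s = buildSplits(new Data[] {foreground, background}, 2);
        System.out.println("train fg, iteration 0: " + s[0][TRAIN][0].sequences);
        System.out.println("test  fg, iteration 0: " + s[0][TEST][0].sequences);
    }
}
```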

getClassifier

public AbstractClassifier[] getClassifier()
                                   throws CloneNotSupportedException

Returns a deep copy of all classifiers that have been or will be used in this assessment.

Returns:: a deep copy of all used classifiers in this assessment
Throws:: CloneNotSupportedException - if it is impossible to get a deep copy for at least one classifier (if the classifier could not be cloned)

getNameOfAssessment

public String getNameOfAssessment()

Returns the name of this class.

Returns:: name of this class

evaluateClassifier

protected abstract void evaluateClassifier(NumericalPerformanceMeasureParameterSet mp,
                                           ClassifierAssessmentAssessParameterSet assessPS,
                                           DataSet[] s,
                                           ProgressUpdater pU)
                                    throws IllegalArgumentException,
                                           Exception

This method must be implemented in all subclasses. It should perform the following tasks:
1.) create test and train datasets
2.) call method train() to train classifiers/models using train data
3.) call method test() to cause evaluation (test) of trained classifiers

Parameters:

mp - defines which performance measures are used to assess classifiers

assessPS - contains assessment specific parameters (like: number of iterations of a k-fold-crossvalidation)

s - data to be used for assessment (both: test and train data)

pU - a ProgressUpdater that mainly has to be used to allow the user to cancel a currently running classifier assessment. This ProgressUpdater is guaranteed not to be null. In certain cases, aborting a classifier assessment is not allowed, for example in the case of KFoldCrossValidation. In this case, the given ProgressUpdater should be ignored.

Usage:

pU.setMax( number of iterations of the assessment loop );
iteration = 0;
assessment loop
- pU.setValue( iteration+1 );
- sample treatment
- train();
- test();
- iteration++;
repeat until (ready or pU.isCancelled())
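
The usage pattern above can be sketched as a cancellable loop. This is a minimal sketch, not the Jstacs code: the interface Progress is a hypothetical stand-in that only mirrors the three calls named in the usage notes (setMax, setValue, isCancelled), and the Runnable stands for the sample treatment plus train() and test(). Unlike the "repeat until" form, the sketch checks the condition before each round, so it also handles zero iterations.

```java
public class ProgressLoopSketch {

    /** Hypothetical stand-in mirroring the calls named in the usage notes. */
    interface Progress {
        void setMax(int max);

        void setValue(int value);

        boolean isCancelled();
    }

    /** Runs at most `iterations` rounds and returns how many were completed. */
    static int runLoop(int iterations, Progress pU, Runnable trainAndTest) {
        pU.setMax(iterations);
        int iteration = 0;
        // Continue while there is work left and the user has not cancelled.
        while (iteration < iterations && !pU.isCancelled()) {
            pU.setValue(iteration + 1);
            trainAndTest.run(); // sample treatment, train(), test()
            iteration++;
        }
        return iteration;
    }

    /** Test helper: reports cancellation once the reported value reaches `limit`. */
    static Progress cancelAfter(int limit) {
        return new Progress() {
            private int value;

            @Override
            public void setMax(int max) {
                // not needed for this helper
            }

            @Override
            public void setValue(int v) {
                value = v;
            }

            @Override
            public boolean isCancelled() {
                return value >= limit;
            }
        };
    }

    public static void main(String[] args) {
        int[] rounds = {0};
        int done = runLoop(10, cancelAfter(3), () -> rounds[0]++);
        System.out.println(done + " of 10 rounds completed");
    }
}
```

Results computed before the cancellation are kept in such a loop, which matches the statement that an aborted assessment still returns the results already computed.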

Throws:

IllegalArgumentException - if the given ClassifierAssessmentAssessParameterSet is of wrong type

Exception - that occurred during training or using classifiers/models

prepareAssessment

protected void prepareAssessment(DataSet... s)
                          throws IllegalArgumentException,
                                 WrongAlphabetException

Prepares an assessment. If the given DataSets may not be used for this assessment, this method throws an Exception.
Furthermore, the MeanResultSets are initialized for this assessment (one for each contained classifier).

Parameters:

s - the DataSet to be checked

Throws:

WrongAlphabetException - if s is null or not of the required length (number of classes), or if the AlphabetContainers of s are not consistent with the AlphabetContainer of the local models or classifiers

IllegalArgumentException - if the given samples are not suitable

test

protected void test(NumericalPerformanceMeasureParameterSet mp,
                    boolean exception,
                    DataSet... testS)
             throws SimpleParameter.IllegalValueException,
                    MeanResultSet.InconsistentResultNumberException,
                    MeanResultSet.AdditionImpossibleException,
                    Exception

Uses the given test samples to call the evaluate( ... ) methods of the local AbstractClassifiers. The returned NumericalResults as well as the numerical characteristics are added to each classifier's MeanResultSet.
It should not be necessary to override this method in subclasses.

Parameters:: mp - determines which performance measures are used to assess the classifiers; exception - whether an Exception should be thrown if some AbstractPerformanceMeasure could not be evaluated; testS - samples used as test sets (has to contain one DataSet for each class)
Throws:: SimpleParameter.IllegalValueException - if a parameter is not valid; MeanResultSet.InconsistentResultNumberException - if the number of results differs between the result sets; MeanResultSet.AdditionImpossibleException - if added result sets do not match; Exception - if necessary; IllegalArgumentException - if the length of testS is not equal to the dimension of the classification problem (testS.length != this.myAbstractClassifier[0].getNumberOfClasses())
See Also:: AbstractClassifier.evaluate(de.jstacs.classifiers.performanceMeasures.PerformanceMeasureParameterSet, boolean, DataSet...)

train

protected void train(DataSet... trainS)
              throws IllegalArgumentException,
                     Exception

Trains the local classifiers using the given training samples.
The classifiers are trained either directly or via training of the local models. The second option is always used if the ClassifierAssessment object was constructed using TrainableStatisticalModels.

It should not be necessary to override this method in subclasses.

Parameters:: trainS - samples used as training sets (has to contain one DataSet for each class)
Throws:: IllegalArgumentException - if the length of trainS is not equal to the dimension of the classification problem (trainS.length != this.myAbstractClassifier[0].getNumberOfClasses()); Exception - if necessary
