Package de.jstacs.classifiers.assessment

This package allows the assessment of classifiers.

It contains the class ClassifierAssessment, which is the super-class of all implemented assessment methodologies.


Class Summary
ClassifierAssessment<T extends ClassifierAssessmentAssessParameterSet> - Class defining an assessment of classifiers.
ClassifierAssessmentAssessParameterSet - This class is the superclass of all ClassifierAssessmentAssessParameterSets.
KFoldCrossValidation - This class implements a k-fold cross-validation.
KFoldCrossValidationAssessParameterSet - This class implements a ClassifierAssessmentAssessParameterSet that must be used to call the method assess( ...
RepeatedHoldOutAssessParameterSet - This class implements a ClassifierAssessmentAssessParameterSet that must be used to call the method assess( ...
RepeatedHoldOutExperiment - This class implements a repeated hold-out experiment for assessing classifiers.
RepeatedSubSamplingAssessParameterSet - This class implements a ClassifierAssessmentAssessParameterSet that must be used to call the method assess( ...
RepeatedSubSamplingExperiment - This class implements a repeated subsampling experiment.
Sampled_RepeatedHoldOutAssessParameterSet - This class implements a ClassifierAssessmentAssessParameterSet that must be used to call the method assess( ...
Sampled_RepeatedHoldOutExperiment - This class is a special ClassifierAssessment that partitions the data of a user-specified reference class (typically the smallest class) and samples non-overlapping data sets for all other classes, so that each train and test data set contains the same number of sequences (and sequences of the same lengths).
 

Package de.jstacs.classifiers.assessment Description

This package allows the assessment of classifiers.

It contains the class ClassifierAssessment, which is the super-class of all implemented assessment methodologies. In addition, it should be used as the super-class of all future assessments, since it already implements the basic patterns that are common to all assessments.

Furthermore, it contains four implementations of different assessment methodologies. These are:

A RepeatedHoldOutExperiment implements the following procedure: in each iteration, the given data sets are randomly partitioned into mutually exclusive train and test data sets. The classifiers are then trained on the train data sets, and their ability to correctly predict the elements of the test data sets is assessed. This step is repeated as often as the user specifies.
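
The partitioning scheme behind this procedure can be sketched in plain Java. The sketch below only illustrates the repeated, mutually exclusive splitting; it does not use the Jstacs API, and all names in it are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RepeatedHoldOutSketch {

    /** Holds one mutually exclusive train/test split of the data. */
    public record Split<T>(List<T> train, List<T> test) {}

    /**
     * Repeatedly shuffles the data and cuts it into a disjoint test part and
     * train part; testFraction is the fraction of elements used for testing.
     */
    public static <T> List<Split<T>> repeatedHoldOut(List<T> data, double testFraction, int repeats, Random rnd) {
        List<Split<T>> splits = new ArrayList<>();
        for (int r = 0; r < repeats; r++) {
            List<T> shuffled = new ArrayList<>(data);
            Collections.shuffle(shuffled, rnd);
            int testSize = (int) Math.round(testFraction * shuffled.size());
            // the first part becomes the test set, the disjoint remainder the train set
            List<T> test = new ArrayList<>(shuffled.subList(0, testSize));
            List<T> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
            splits.add(new Split<>(train, test));
            // in an assessment, the classifiers would now be trained on 'train'
            // and their predictions evaluated on 'test'
        }
        return splits;
    }
}
```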

A Sampled_RepeatedHoldOutExperiment is a special ClassifierAssessment that partitions the data of a user-specified reference class (typically the smallest class) and samples non-overlapping data sets for all other classes, such that each train and test data set contains the same number of sequences (and sequences of the same lengths).
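
The length-matched sampling idea can be illustrated with a rough sketch in plain Java. This is not the Jstacs implementation; all names are made up, and sequences are represented as plain strings for simplicity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class LengthMatchedSamplingSketch {

    /**
     * For each sequence length taken from a reference split, draws one
     * not-yet-used sequence of the same length from another class, so the
     * result matches the reference split in size and in sequence lengths.
     */
    public static List<String> sampleMatching(List<Integer> referenceLengths, List<String> otherClass, Random rnd) {
        // group the other class by sequence length, in random order
        List<String> shuffled = new ArrayList<>(otherClass);
        Collections.shuffle(shuffled, rnd);
        Map<Integer, Deque<String>> byLength = new HashMap<>();
        for (String seq : shuffled) {
            byLength.computeIfAbsent(seq.length(), k -> new ArrayDeque<>()).add(seq);
        }
        List<String> sampled = new ArrayList<>();
        for (int len : referenceLengths) {
            Deque<String> pool = byLength.get(len);
            if (pool == null || pool.isEmpty()) {
                throw new IllegalArgumentException("no unused sequence of length " + len + " available");
            }
            // poll() removes the sequence from the pool, which keeps the samples non-overlapping
            sampled.add(pool.poll());
        }
        return sampled;
    }
}
```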

A KFoldCrossValidation implements a k-fold cross-validation: the given data is randomly partitioned into k mutually exclusive parts. Each of these parts is used exactly once as the test data set, while the remaining k-1 parts serve as the train data sets. In each of the k steps, the classifiers are trained on the train data sets, and their ability to correctly predict the elements of the test data set is assessed.
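
A minimal usage sketch of such a cross-validation is given below. The classes KFoldCrossValidation and KFoldCrossValidationAssessParameterSet and the method assess are from this package; the performance-measure factory and the exact constructor arguments (partition method, element length, behaviour on non-computable measures, number of folds) are assumptions and should be verified against the API documentation.

```java
import de.jstacs.classifiers.AbstractClassifier;
import de.jstacs.classifiers.assessment.KFoldCrossValidation;
import de.jstacs.classifiers.assessment.KFoldCrossValidationAssessParameterSet;
import de.jstacs.classifiers.performanceMeasures.NumericalPerformanceMeasureParameterSet;
import de.jstacs.classifiers.performanceMeasures.PerformanceMeasureParameterSet;
import de.jstacs.data.DataSet;
import de.jstacs.data.DataSet.PartitionMethod;
import de.jstacs.results.ListResult;

public class CrossValidationExample {

    /**
     * Runs a 10-fold cross-validation of the given classifiers on one
     * DataSet per class and returns the collected performance measures.
     */
    public static ListResult crossValidate( AbstractClassifier[] classifiers, DataSet[] data ) throws Exception {
        // numerical performance measures (assumption: this factory method fills in
        // a default set of measures; see the performanceMeasures package)
        NumericalPerformanceMeasureParameterSet measures = PerformanceMeasureParameterSet.createFilledParameters();

        // parameters of the assessment itself; the argument order shown here
        // (partition method, element length, fail on non-computable measures,
        // number of folds k) is an assumption
        KFoldCrossValidationAssessParameterSet assessParams = new KFoldCrossValidationAssessParameterSet(
                PartitionMethod.PARTITION_BY_NUMBER_OF_ELEMENTS, data[0].getElementLength(), true, 10 );

        KFoldCrossValidation cv = new KFoldCrossValidation( classifiers );
        return cv.assess( measures, assessParams, data );
    }
}
```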

A RepeatedSubSamplingExperiment subsamples, in each step, a train data set and a test data set from the given data; in contrast to the experiments above, these data sets may overlap. The classifiers are then trained on the train data sets, and their ability to correctly predict the elements of the test data sets is assessed. This procedure is repeated as often as the user specifies.
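
The subsampling scheme can again be sketched in plain Java. As before, this is only an illustration of the procedure, not the Jstacs API, and all names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RepeatedSubSamplingSketch {

    /** Draws a random subset of the given size (no duplicates within the subset). */
    private static <T> List<T> subSample(List<T> data, int size, Random rnd) {
        List<T> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, rnd);
        return new ArrayList<>(shuffled.subList(0, size));
    }

    /**
     * In each repetition, the train and test sets are drawn independently,
     * so (unlike hold-out or cross-validation) they may overlap.
     */
    public static <T> void repeatedSubSampling(List<T> data, int trainSize, int testSize, int repeats, Random rnd) {
        for (int r = 0; r < repeats; r++) {
            List<T> train = subSample(data, trainSize, rnd);
            List<T> test = subSample(data, testSize, rnd);
            // here the classifiers would be trained on 'train'
            // and their predictions evaluated on 'test'
        }
    }
}
```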

In addition, all classes allow classifiers to be assessed using a set of user-specified test data sets and a set of user-specified train data sets. This way, the train and test data sets do not have to be generated automatically but can be supplied directly by the user.