de.jstacs.motifDiscovery
Class MotifDiscoveryAssessment

java.lang.Object
  extended by de.jstacs.motifDiscovery.MotifDiscoveryAssessment

public class MotifDiscoveryAssessment
extends Object

This class enables the user to assess the prediction of motif occurrences.

Author:
Jan Grau, Jens Keilwagen

Constructor Summary
MotifDiscoveryAssessment()
           
 
Method Summary
static ListResult assess(DataSet truth, DataSet prediction, int maxDiff)
          This method computes the nucleotide and site measures.
static double[][] getSortedScoresForMotifAndFlanking(DataSet data, DataSet pred, String identifier)
          Returns the scores read from the prediction pred for the motif with identifier identifier and flanking sequences as annotated in the DataSet data.
static double[][] getSortedValuesForMotifAndFlanking(DataSet data, double[][] values, double offset, double factor, String identifier)
          This method provides score arrays that can be used in AbstractPerformanceMeasure to compute curves or areas under curves based on the values of the predictions.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MotifDiscoveryAssessment

public MotifDiscoveryAssessment()

Method Detail

assess

public static ListResult assess(DataSet truth,
                                DataSet prediction,
                                int maxDiff)
                         throws Exception
This method computes the nucleotide and site measures.

Parameters:
truth - the DataSet annotated with the true annotation
prediction - the DataSet annotated with the predicted annotation
maxDiff - the maximal difference between predicted and true start position; this value is used to determine the site measures
Returns:
a ListResult containing all NumericalResultSets
Throws:
Exception - if something went wrong
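
The role of maxDiff in the site measures can be illustrated with a small, self-contained sketch. It is not the Jstacs implementation and does not use DataSet or ListResult: the class SiteMatchSketch, its method names, the plain int[] start positions, and the greedy one-to-one matching are all assumptions made for illustration. A predicted site counts as matched when its start position differs from an unmatched true start position by at most maxDiff.

```java
import java.util.Arrays;

// Illustrative sketch only; all names here are hypothetical and not part of Jstacs.
final class SiteMatchSketch {

    /**
     * Counts predicted start positions that lie within maxDiff of a
     * not-yet-matched true start position (greedy matching on sorted positions).
     */
    static int countMatchedSites(int[] trueStarts, int[] predictedStarts, int maxDiff) {
        int[] t = trueStarts.clone();
        int[] p = predictedStarts.clone();
        Arrays.sort(t);
        Arrays.sort(p);
        int i = 0, j = 0, matched = 0;
        while (i < t.length && j < p.length) {
            int diff = p[j] - t[i];
            if (Math.abs(diff) <= maxDiff) {
                matched++; // prediction j explains true site i
                i++;
                j++;
            } else if (diff < 0) {
                j++; // prediction is too far left of every remaining true site
            } else {
                i++; // true site is too far left of every remaining prediction
            }
        }
        return matched;
    }

    /** Fraction of true sites that were matched by a prediction. */
    static double siteSensitivity(int matched, int numTrue) {
        return numTrue == 0 ? Double.NaN : matched / (double) numTrue;
    }

    /** Fraction of predicted sites that match a true site. */
    static double sitePositivePredictiveValue(int matched, int numPredicted) {
        return numPredicted == 0 ? Double.NaN : matched / (double) numPredicted;
    }

    public static void main(String[] args) {
        int[] truth = {10, 50, 90};
        int[] prediction = {12, 49, 70};
        int matched = countMatchedSites(truth, prediction, 3);
        System.out.println("matched sites: " + matched);
        System.out.println("site sensitivity: " + siteSensitivity(matched, truth.length));
        System.out.println("site PPV: " + sitePositivePredictiveValue(matched, prediction.length));
    }
}
```

A larger maxDiff makes the matching more tolerant, so site-level measures computed this way can only stay equal or increase as maxDiff grows.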

getSortedValuesForMotifAndFlanking

public static double[][] getSortedValuesForMotifAndFlanking(DataSet data,
                                                            double[][] values,
                                                            double offset,
                                                            double factor,
                                                            String identifier)
This method provides score arrays that can be used in AbstractPerformanceMeasure to compute curves or areas under curves based on the values of the predictions. The scores are computed as offset + factor*values[i][j].

Parameters:
data - the data
values - the array of smoothed values
offset - the offset that is added to the current values
factor - the factor by which the current values are multiplied
identifier - the identifier of the annotation of the positive class
Returns:
two arrays containing values; the first contains the values for the positive class, the second contains the values for the negative class
See Also:
SequenceAnnotation.getIdentifier()
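
As a rough illustration of the transformation offset + factor*values[i][j] and of the split into a positive and a negative array, the following sketch works on plain arrays. It is not the Jstacs implementation: the class ScoreSplitSketch, the method splitAndSort, and the boolean mask isMotif (a hypothetical stand-in for the annotation lookup via the identifier) are assumptions, as is the ascending sort order.

```java
import java.util.Arrays;

// Illustrative sketch only; all names here are hypothetical and not part of Jstacs.
final class ScoreSplitSketch {

    /**
     * Transforms each value to offset + factor * values[i][j] and assigns it to the
     * positive array if isMotif[i][j] is true, otherwise to the negative array.
     * Both arrays are sorted in ascending order (an assumption of this sketch).
     */
    static double[][] splitAndSort(double[][] values, boolean[][] isMotif,
                                   double offset, double factor) {
        int total = 0, positives = 0;
        for (int i = 0; i < values.length; i++) {
            for (int j = 0; j < values[i].length; j++) {
                total++;
                if (isMotif[i][j]) {
                    positives++;
                }
            }
        }
        double[] pos = new double[positives];
        double[] neg = new double[total - positives];
        int p = 0, n = 0;
        for (int i = 0; i < values.length; i++) {
            for (int j = 0; j < values[i].length; j++) {
                double score = offset + factor * values[i][j];
                if (isMotif[i][j]) {
                    pos[p++] = score;
                } else {
                    neg[n++] = score;
                }
            }
        }
        Arrays.sort(pos);
        Arrays.sort(neg);
        return new double[][] {pos, neg};
    }

    public static void main(String[] args) {
        double[][] values = {{0.25, 0.75, 0.125}, {0.5, 0.375}};
        boolean[][] isMotif = {{false, true, false}, {true, false}};
        double[][] result = splitAndSort(values, isMotif, 1.0, 2.0);
        System.out.println("positive: " + Arrays.toString(result[0]));
        System.out.println("negative: " + Arrays.toString(result[1]));
    }
}
```

A negative factor reverses the ordering of the values, which may be useful when smaller raw values indicate a stronger prediction.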

getSortedScoresForMotifAndFlanking

public static double[][] getSortedScoresForMotifAndFlanking(DataSet data,
                                                            DataSet pred,
                                                            String identifier)
Returns the scores read from the prediction pred for the motif with the identifier identifier and for the flanking sequences, as annotated in the DataSet data. The identifier may be null to obtain the scores for all motifs, irrespective of their identifiers. The first array of the returned value contains the scores for the motif annotations, while the second array contains the scores of the flanking sequences. Both arrays are sorted and can be used directly in the methods of AbstractPerformanceMeasure. The scores for the predictions must be added to the LocatedSequenceAnnotationWithLength representing the motifs as additional annotation using LocatedSequenceAnnotationWithLength.LocatedSequenceAnnotationWithLength(String, String, LocatedSequenceAnnotation[], Result...), where the name of the corresponding Result must be equal to "score".

Parameters:
data - the DataSet annotated with the truth
pred - the DataSet annotated with the prediction and associated scores
identifier - the identifier of the motif
Returns:
the scores of motifs and flanking sequences
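
To show what two sorted score arrays of this kind make possible, the sketch below computes the area under the ROC curve from them in plain Java. It does not call AbstractPerformanceMeasure or any other Jstacs class: the class AucSketch and the method aucRoc are hypothetical names, and the sketch assumes that both arrays are sorted in ascending order and that higher scores indicate the motif class.

```java
// Illustrative sketch only; all names here are hypothetical and not part of Jstacs.
final class AucSketch {

    /**
     * Computes the area under the ROC curve as the fraction of (positive, negative)
     * pairs in which the positive score is larger, counting ties as one half.
     * Both arrays must be sorted in ascending order.
     */
    static double aucRoc(double[] sortedPos, double[] sortedNeg) {
        if (sortedPos.length == 0 || sortedNeg.length == 0) {
            return Double.NaN;
        }
        double sum = 0;
        int below = 0;     // negatives strictly smaller than the current positive
        int belowOrEq = 0; // negatives smaller than or equal to the current positive
        for (double s : sortedPos) {
            while (below < sortedNeg.length && sortedNeg[below] < s) {
                below++;
            }
            if (belowOrEq < below) {
                belowOrEq = below;
            }
            while (belowOrEq < sortedNeg.length && sortedNeg[belowOrEq] <= s) {
                belowOrEq++;
            }
            sum += below + 0.5 * (belowOrEq - below);
        }
        return sum / ((double) sortedPos.length * sortedNeg.length);
    }

    public static void main(String[] args) {
        double[] motifScores = {1.0, 3.0};    // stand-in for the first returned array
        double[] flankingScores = {2.0, 3.0}; // stand-in for the second returned array
        System.out.println("AUC-ROC: " + aucRoc(motifScores, flankingScores));
    }
}
```

Because both inputs are already sorted, a single pass with two moving indices is enough, which is presumably why the method returns its arrays in sorted form.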