de.jstacs.motifDiscovery
Class MutableMotifDiscovererToolbox

java.lang.Object
  extended by de.jstacs.motifDiscovery.MotifDiscovererToolBox
      extended by de.jstacs.motifDiscovery.MutableMotifDiscovererToolbox

public final class MutableMotifDiscovererToolbox
extends MotifDiscovererToolBox

This class contains important methods for the initialization and optimization of MutableMotifDiscoverers.

Author:
Jan Grau, Jens Keilwagen

Nested Class Summary
static class MutableMotifDiscovererToolbox.InitMethodForDiffSM
          This enum defines some constants for the method getSortedInitialParameters(DifferentiableSequenceScore[], InitMethodForDiffSM[], DiffSSBasedOptimizableFunction, int, OutputStream, int).
 
Constructor Summary
MutableMotifDiscovererToolbox()
           
 
Method Summary
static void clearHistoryArray(History[][] history)
          This method clears all elements of a History-array, so that it can be used again.
static History[][] createHistoryArray(DifferentiableSequenceScore[] funs, History template)
          This method creates a History-array that can be used in an optimization.
static int[][] createMinimalNewLengthArray(DifferentiableSequenceScore[] funs)
          This method creates a minimalNewLength-array that can be used in an optimization.
static boolean doHeuristicSteps(DifferentiableSequenceScore[] funs, DataSet[] data, double[][] weights, DiffSSBasedOptimizableFunction opt, DifferentiableFunction neg, byte algorithm, double linEps, StartDistanceForecaster startDistance, SafeOutputStream out, boolean breakOnChanged, History[][] hist, int[][] minimalNewLength, boolean maxPos)
          This method tries to perform heuristic steps if at least one DifferentiableSequenceScore is a MutableMotifDiscoverer.
static Sequence[] enumerate(DifferentiableSequenceScore[] funs, int[] classIndex, int[] motifIndex, RecyclableSequenceEnumerator[] rse, double weight, DiffSSBasedOptimizableFunction opt, OutputStream out)
          This method enumerates all possible seeds for a number of motifs in the MutableMotifDiscoverers of specific classes.
static Sequence enumerate(DifferentiableSequenceScore[] funs, int classIndex, int motifIndex, RecyclableSequenceEnumerator rse, double weight, DiffSSBasedOptimizableFunction opt, OutputStream out)
          This method enumerates all possible seeds for a motif in the MutableMotifDiscoverer of a specific class.
static boolean findModification(int clazz, int motif, MutableMotifDiscoverer mmd, DifferentiableSequenceScore[] score, DataSet[] data, double[][] weights, DiffSSBasedOptimizableFunction opt, DifferentiableFunction neg, byte algo, double linEps, StartDistanceForecaster startDistance, SafeOutputStream out, History hist, int minimalNewLength, boolean maxPos)
          This method tries to find a promising modification, i.e. shifting, shrinking, or expanding a motif.
static ComparableElement<double[],Double>[] getSortedInitialParameters(DifferentiableSequenceScore[] funs, MutableMotifDiscovererToolbox.InitMethodForDiffSM[] init, DiffSSBasedOptimizableFunction opt, int n, OutputStream stream, int optimizationSteps)
          This method initializes the DifferentiableSequenceScores using different MutableMotifDiscovererToolbox.InitMethodForDiffSM.
static void initMotif(int idx, int[] classIndex, int[] motifIndex, DataSet[] s, double[][] seqWeights, boolean[] adjust, MutableMotifDiscoverer[] mmd, int[] len, DataSet[] data, double[][] dataWeights)
          This method initializes a number of motifs.
static double[][] optimize(DifferentiableSequenceScore[] funs, DiffSSBasedOptimizableFunction opt, byte algorithm, AbstractTerminationCondition condition, double linEps, StartDistanceForecaster startDistance, SafeOutputStream out, boolean breakOnChanged, History[][] hist, int[][] minimalNewLength, OptimizableFunction.KindOfParameter plugIn, boolean maxPos)
          This method tries to optimize the problem at hand as well as possible.
static double[][] optimize(DifferentiableSequenceScore[] funs, DiffSSBasedOptimizableFunction opt, byte algorithm, AbstractTerminationCondition condition, double linEps, StartDistanceForecaster startDistance, SafeOutputStream out, boolean breakOnChanged, History template, OptimizableFunction.KindOfParameter plugIn, boolean maxPos)
          This method tries to optimize the problem at hand as well as possible.
 
Methods inherited from class de.jstacs.motifDiscovery.MotifDiscovererToolBox
plot, plotAndAnnotate
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MutableMotifDiscovererToolbox

public MutableMotifDiscovererToolbox()
Method Detail

enumerate

public static Sequence enumerate(DifferentiableSequenceScore[] funs,
                                 int classIndex,
                                 int motifIndex,
                                 RecyclableSequenceEnumerator rse,
                                 double weight,
                                 DiffSSBasedOptimizableFunction opt,
                                 OutputStream out)
                          throws Exception
This method enumerates all possible seeds for a motif in the MutableMotifDiscoverer of a specific class.

Parameters:
funs - the DifferentiableSequenceScores
classIndex - the index of the class
motifIndex - the index of the motif in the MutableMotifDiscoverer
rse - a RecyclableSequenceEnumerator that contains the Sequence objects tested for initialization of the motif motifIndex
weight - the weight of the seed Sequence
opt - the objective function
out - a stream for writing output, if necessary
Returns:
the best Sequence with respect to the DiffSSBasedOptimizableFunction
Throws:
Exception - if something went wrong
See Also:
enumerate(DifferentiableSequenceScore[], int[], int[], RecyclableSequenceEnumerator[], double, DiffSSBasedOptimizableFunction, OutputStream)
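
The seed selection described above can be pictured as an exhaustive search that evaluates every candidate and keeps the best one. The following is only a minimal sketch under stated assumptions: SeedSearchSketch, bestSeed, and the toy objective are hypothetical stand-ins and not part of Jstacs, and the sketch assumes that a larger objective value is better.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in for illustration; not part of Jstacs.
final class SeedSearchSketch {

    // Returns the candidate with the highest objective value,
    // or null if the iterator yields no candidates.
    // Assumption: larger objective values are better.
    static <S> S bestSeed(Iterator<S> candidates, ToDoubleFunction<S> objective) {
        S best = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        while (candidates.hasNext()) {
            S candidate = candidates.next();
            double value = objective.applyAsDouble(candidate);
            if (best == null || value > bestValue) {
                best = candidate;
                bestValue = value;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> seeds = List.of("ACGT", "GGCC", "TATA");
        // Toy objective: the number of G or C characters in the seed.
        ToDoubleFunction<String> gcCount =
                s -> s.chars().filter(c -> c == 'G' || c == 'C').count();
        System.out.println(bestSeed(seeds.iterator(), gcCount));
    }
}
```

In the real method, the candidates come from the RecyclableSequenceEnumerator and the value from the DiffSSBasedOptimizableFunction; how the seed is weighted and plugged into the motif is not shown here.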

enumerate

public static Sequence[] enumerate(DifferentiableSequenceScore[] funs,
                                   int[] classIndex,
                                   int[] motifIndex,
                                   RecyclableSequenceEnumerator[] rse,
                                   double weight,
                                   DiffSSBasedOptimizableFunction opt,
                                   OutputStream out)
                            throws Exception
This method enumerates all possible seeds for a number of motifs in the MutableMotifDiscoverers of specific classes.

Parameters:
funs - the DifferentiableSequenceScores
classIndex - the indices of the classes
motifIndex - the indices of the motifs in the MutableMotifDiscoverers
rse - an array of RecyclableSequenceEnumerators, each containing the Sequence objects tested for initialization of the corresponding motif
weight - the weight of the seed Sequence
opt - the objective function
out - a stream for writing output, if necessary
Returns:
an array containing the best Sequences with respect to the DiffSSBasedOptimizableFunction
Throws:
Exception - if something went wrong

initMotif

public static void initMotif(int idx,
                             int[] classIndex,
                             int[] motifIndex,
                             DataSet[] s,
                             double[][] seqWeights,
                             boolean[] adjust,
                             MutableMotifDiscoverer[] mmd,
                             int[] len,
                             DataSet[] data,
                             double[][] dataWeights)
                      throws Exception
This method initializes a number of motifs.

Parameters:
idx - the index indicating how many motifs are initialized
classIndex - the indices of the classes of each motif
motifIndex - the indices of each motif within the MutableMotifDiscoverer
s - the DataSets to be used for the initialization
seqWeights - the weights corresponding to the DataSets
adjust - an array of switches indicating whether to adjust hidden parameters or not
mmd - the array of MutableMotifDiscoverers to be initialized
len - the length of each motif
data - the complete data sets
dataWeights - the weights corresponding to the complete data sets
Throws:
Exception - if something went wrong

getSortedInitialParameters

public static ComparableElement<double[],Double>[] getSortedInitialParameters(DifferentiableSequenceScore[] funs,
                                                                              MutableMotifDiscovererToolbox.InitMethodForDiffSM[] init,
                                                                              DiffSSBasedOptimizableFunction opt,
                                                                              int n,
                                                                              OutputStream stream,
                                                                              int optimizationSteps)
                                                                       throws Exception
This method initializes the DifferentiableSequenceScores using different MutableMotifDiscovererToolbox.InitMethodForDiffSM. It returns an array of ComparableElements that contain the parameters and the corresponding values of the DiffSSBasedOptimizableFunction.

Parameters:
funs - the DifferentiableSequenceScores
init - the specific MutableMotifDiscovererToolbox.InitMethodForDiffSM; the entries correspond one to one to those of funs
opt - the objective function
n - the number of initializations
stream - a stream for writing output, if necessary
optimizationSteps - the number of initial steps that should be performed before evaluating the function
Returns:
a sorted array containing ComparableElements of parameter arrays and corresponding values of the DiffSSBasedOptimizableFunction
Throws:
Exception - if something went wrong
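
The idea of drawing several initializations, scoring each one, and returning them sorted can be sketched as follows. This is an illustration only: SortedInitSketch, Scored, and sortedStarts are hypothetical names and not part of Jstacs, the random Gaussian start values stand in for the InitMethodForDiffSM strategies, the initial optimization steps are omitted, and the descending sort order (best first) is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in for illustration; not part of Jstacs.
final class SortedInitSketch {

    // Pairs a parameter vector with its objective value
    // (plays the role of ComparableElement<double[],Double>).
    static final class Scored {
        final double[] params;
        final double value;

        Scored(double[] params, double value) {
            this.params = params;
            this.value = value;
        }
    }

    // Draws n random start vectors, evaluates each, and sorts them.
    // Assumption: best (largest) value first.
    static Scored[] sortedStarts(int n, int dim, long seed,
                                 ToDoubleFunction<double[]> objective) {
        Random rnd = new Random(seed);
        Scored[] res = new Scored[n];
        for (int i = 0; i < n; i++) {
            double[] p = new double[dim];
            for (int j = 0; j < dim; j++) {
                p[j] = rnd.nextGaussian();
            }
            res[i] = new Scored(p, objective.applyAsDouble(p));
        }
        Arrays.sort(res, Comparator.comparingDouble((Scored s) -> s.value).reversed());
        return res;
    }

    public static void main(String[] args) {
        // Toy objective: negative squared norm, maximal at the origin.
        Scored[] starts = sortedStarts(5, 2, 42L, p -> -(p[0] * p[0] + p[1] * p[1]));
        for (Scored s : starts) {
            System.out.println(s.value + " " + Arrays.toString(s.params));
        }
    }
}
```

A caller would typically continue the optimization from the first element of the returned array.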

createMinimalNewLengthArray

public static int[][] createMinimalNewLengthArray(DifferentiableSequenceScore[] funs)
This method creates a minimalNewLength-array that can be used in an optimization.

Parameters:
funs - the DifferentiableSequenceScores used in an optimization
Returns:
a minimalNewLength-array for the given DifferentiableSequenceScores
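
The layout of such an array can be pictured as a jagged int[][] with one row per class and one entry per motif. The sketch below is a guess at the shape, not the Jstacs implementation: MinimalNewLengthSketch and create are hypothetical names, and both the null rows for classes without mutable motifs and the initialization with the current motif lengths are assumptions.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not part of Jstacs.
final class MinimalNewLengthSketch {

    // motifLengths[c] holds the current motif lengths of class c,
    // or null if the score of class c has no mutable motifs.
    static int[][] create(int[][] motifLengths) {
        int[][] minimal = new int[motifLengths.length][];
        for (int c = 0; c < motifLengths.length; c++) {
            if (motifLengths[c] != null) {
                // Assumption: start from the current length of each motif.
                minimal[c] = motifLengths[c].clone();
            }
        }
        return minimal;
    }

    public static void main(String[] args) {
        // Class 0 has two motifs; class 1 has none.
        int[][] minimal = create(new int[][] { { 8, 12 }, null });
        System.out.println(Arrays.deepToString(minimal));
    }
}
```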

createHistoryArray

public static History[][] createHistoryArray(DifferentiableSequenceScore[] funs,
                                             History template)
                                      throws CloneNotSupportedException
This method creates a History-array that can be used in an optimization.

Parameters:
funs - the DifferentiableSequenceScores used in an optimization
template - the template History instance
Returns:
a History-array for the given DifferentiableSequenceScores
Throws:
CloneNotSupportedException - if the template could not be cloned

clearHistoryArray

public static void clearHistoryArray(History[][] history)
This method clears all elements of a History-array, so that it can be used again.

Parameters:
history - the array

optimize

public static double[][] optimize(DifferentiableSequenceScore[] funs,
                                  DiffSSBasedOptimizableFunction opt,
                                  byte algorithm,
                                  AbstractTerminationCondition condition,
                                  double linEps,
                                  StartDistanceForecaster startDistance,
                                  SafeOutputStream out,
                                  boolean breakOnChanged,
                                  History template,
                                  OptimizableFunction.KindOfParameter plugIn,
                                  boolean maxPos)
                           throws Exception
This method tries to optimize the problem at hand as well as possible. If the optimization uses MutableMotifDiscoverers, it tries to perform modify operations as long as they seem promising.

Parameters:
funs - the DifferentiableSequenceScores for scoring sequences
opt - the DiffSSBasedOptimizableFunction
algorithm - used for the optimization
condition - used for the optimization
linEps - used for the optimization
startDistance - used for the optimization
out - a stream for writing information during the optimization
breakOnChanged - a switch that decides whether a new optimization is started after the first successful modify operation or only after a modification has been tried for all motifs
template - a History instance that is cloned to build the History array
plugIn - a switch whether to take the internal parameters or not
maxPos - a switch whether to take the maximal shift position or not in the heuristic
Returns:
the optimized value (res[0][0]) and the array for the class parameters (res[1])
Throws:
Exception - if something went wrong during the optimization
See Also:
clearHistoryArray(de.jstacs.motifDiscovery.history.History[][]), optimize(DifferentiableSequenceScore[], DiffSSBasedOptimizableFunction, byte, AbstractTerminationCondition, double, StartDistanceForecaster, SafeOutputStream, boolean, History[][], int[][], de.jstacs.classifiers.differentiableSequenceScoreBased.OptimizableFunction.KindOfParameter, boolean)

optimize

public static double[][] optimize(DifferentiableSequenceScore[] funs,
                                  DiffSSBasedOptimizableFunction opt,
                                  byte algorithm,
                                  AbstractTerminationCondition condition,
                                  double linEps,
                                  StartDistanceForecaster startDistance,
                                  SafeOutputStream out,
                                  boolean breakOnChanged,
                                  History[][] hist,
                                  int[][] minimalNewLength,
                                  OptimizableFunction.KindOfParameter plugIn,
                                  boolean maxPos)
                           throws Exception
This method tries to optimize the problem at hand as well as possible. If the optimization uses MutableMotifDiscoverers, it tries to perform modify operations as long as they seem promising.

Parameters:
funs - the DifferentiableSequenceScores for scoring sequences
opt - the DiffSSBasedOptimizableFunction
algorithm - used for the optimization
condition - used for the optimization
linEps - used for the optimization
startDistance - used for the optimization
out - a stream for writing information during the optimization
breakOnChanged - a switch that decides whether a new optimization is started after the first successful modify operation or only after a modification has been tried for all motifs
hist - an array that is used to check whether a modify-operation can be performed
minimalNewLength - the minimal new length for each motif in each class, that will be used in an expand if the motif was shortened before
plugIn - a switch whether to take the internal parameters or not
maxPos - a switch whether to take the maximal shift position or not in the heuristic
Returns:
the optimized value (res[0][0]) and the array for the class parameters (res[1])
Throws:
Exception - if something went wrong during the optimization
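
The interplay between numerical optimization and modify operations described above can be sketched as a loop that re-optimizes as long as a modification succeeds. This is a simplified illustration, not the Jstacs implementation: OptimizeLoopSketch and run are hypothetical names, and the two suppliers stand in for the numerical optimizer and for doHeuristicSteps.

```java
import java.util.function.BooleanSupplier;
import java.util.function.DoubleSupplier;

// Hypothetical stand-in for illustration; not part of Jstacs.
final class OptimizeLoopSketch {

    // Alternates between a numerical optimization and heuristic
    // modifications until no further modification is performed.
    // The result mirrors the documented layout: res[0][0] is the
    // optimized value, res[1] holds the class parameters.
    static double[][] run(DoubleSupplier optimizeOnce,
                          BooleanSupplier tryModify,
                          double[] classParams) {
        double value;
        do {
            value = optimizeOnce.getAsDouble();
        } while (tryModify.getAsBoolean());
        return new double[][] { { value }, classParams };
    }

    public static void main(String[] args) {
        int[] remainingModifications = { 2 };
        int[] rounds = { 0 };
        // Toy optimizer: the value grows with each round.
        // Toy heuristic: succeeds exactly twice.
        double[][] res = run(
                () -> ++rounds[0] * 1.5,
                () -> remainingModifications[0]-- > 0,
                new double[] { 0.1, -0.1 });
        System.out.println("rounds: " + rounds[0] + ", value: " + res[0][0]);
    }
}
```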

doHeuristicSteps

public static boolean doHeuristicSteps(DifferentiableSequenceScore[] funs,
                                       DataSet[] data,
                                       double[][] weights,
                                       DiffSSBasedOptimizableFunction opt,
                                       DifferentiableFunction neg,
                                       byte algorithm,
                                       double linEps,
                                       StartDistanceForecaster startDistance,
                                       SafeOutputStream out,
                                       boolean breakOnChanged,
                                       History[][] hist,
                                       int[][] minimalNewLength,
                                       boolean maxPos)
                                throws Exception
This method tries to perform heuristic steps if at least one DifferentiableSequenceScore is a MutableMotifDiscoverer. These heuristic steps include shift, shrink, and expand, as far as the user allows those operations by the History array.

Parameters:
funs - the DifferentiableSequenceScores for scoring sequences
data - array of DataSet containing the data for each class
weights - the weights corresponding to the Sequences in data
opt - the DiffSSBasedOptimizableFunction
neg - the NegativeDifferentiableFunction used in the optimization
algorithm - used for the optimization
linEps - used for the optimization
startDistance - used for the optimization
out - a stream for writing information during the optimization
breakOnChanged - a switch that decides whether a new optimization is started after the first successful modify operation or only after a modification has been tried for all motifs
hist - an array that is used to check whether a modify-operation can be performed
minimalNewLength - the minimal new length for each motif in each class, that will be used in an expand if the motif was shortened before
maxPos - a switch whether to take the maximal shift position or not in the heuristic
Returns:
true if at least one heuristic step has been performed, otherwise false
Throws:
Exception - if something went wrong
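
The effect of breakOnChanged can be illustrated with a loop over all motifs of all classes. The sketch is an assumption about the control flow, not the Jstacs implementation: HeuristicStepsSketch, MotifModifier, and doSteps are hypothetical names, and the modifier stands in for findModification.

```java
// Hypothetical stand-in for illustration; not part of Jstacs.
final class HeuristicStepsSketch {

    // Stand-in for a modification attempt on one motif of one class.
    interface MotifModifier {
        boolean tryModify(int clazz, int motif);
    }

    // Tries to modify each motif of each class. With breakOnChanged,
    // the method returns right after the first successful modification;
    // otherwise it tries all motifs before returning.
    static boolean doSteps(int[] motifsPerClass, MotifModifier modifier,
                           boolean breakOnChanged) {
        boolean changed = false;
        for (int c = 0; c < motifsPerClass.length; c++) {
            for (int m = 0; m < motifsPerClass[c]; m++) {
                if (modifier.tryModify(c, m)) {
                    changed = true;
                    if (breakOnChanged) {
                        return true;
                    }
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        int[] attempts = { 0 };
        // Toy modifier: only motif 1 of class 0 can be modified.
        MotifModifier modifier = (c, m) -> {
            attempts[0]++;
            return c == 0 && m == 1;
        };
        boolean changed = doSteps(new int[] { 2, 1 }, modifier, true);
        System.out.println("changed: " + changed + ", attempts: " + attempts[0]);
    }
}
```

Returning early keeps the parameters of the remaining motifs untouched until the next optimization has adapted to the change; trying all motifs first saves optimization runs.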

findModification

public static boolean findModification(int clazz,
                                       int motif,
                                       MutableMotifDiscoverer mmd,
                                       DifferentiableSequenceScore[] score,
                                       DataSet[] data,
                                       double[][] weights,
                                       DiffSSBasedOptimizableFunction opt,
                                       DifferentiableFunction neg,
                                       byte algo,
                                       double linEps,
                                       StartDistanceForecaster startDistance,
                                       SafeOutputStream out,
                                       History hist,
                                       int minimalNewLength,
                                       boolean maxPos)
                                throws Exception
This method tries to find a promising modification, i.e. shifting, shrinking, or expanding a motif. The method returns true if a modification was found and could be performed.

To find a promising modification, the method tests various shifts and computes the number of sequences predicted to be bound.

Parameters:
clazz - the class index for which the Scoring function will be tested for modification
motif - the motif index for which the Scoring function will be tested for modification
mmd - the MutableMotifDiscoverer that will be tested
score - the DifferentiableSequenceScores for scoring sequences
data - array of DataSet containing the data for each class
weights - array of double[] containing the weights for the data of each class
opt - the DiffSSBasedOptimizableFunction
neg - the NegativeDifferentiableFunction used in the optimization
algo - used for the optimization
linEps - used for the optimization
startDistance - used for the optimization
out - a stream for writing information
hist - an instance to check whether a modify-operation can be performed
minimalNewLength - the minimal new length for the motif, that will be used in an expand if the motif was shortened before
maxPos - a switch whether to take the maximal shift position or not in the heuristic
Returns:
true if a modification has been performed
Throws:
Exception - if something went wrong
See Also:
SignificantMotifOccurrencesFinder.getNumberOfBoundSequences(DataSet, double[], int)