PMMdeNovo

From Jstacs
Revision as of 12:36, 21 February 2015 by Eggeling

by Ralf Eggeling, Teemu Roos, Petri Myllymäki, and Ivo Grosse

Runnable JARs

The application consists of three independent tools.

ModelTrainer

The tool ModelTrainer performs de novo motif discovery on a set of putative, non-aligned sequences. It infers an inhomogeneous parsimonious Markov model (PMM) of arbitrary order, where order 0 corresponds to a position weight matrix (PWM) model. The tool produces two output files: (i) an .xml file containing the learned model and (ii) a .dot file containing the Graphviz representation of the learned parsimonious context tree (PCT) structures.
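To illustrate what the order-0 case means, the following is a minimal sketch of a PWM, in which every motif position has its own distribution over the nucleotides, independent of all other positions. It is not the ModelTrainer implementation and does not use the Jstacs API: the class `PwmSketch`, its methods, and the toy sites are hypothetical, and the sketch assumes already aligned, upper-case sites over A, C, G, T, whereas the tool itself works on non-aligned sequences.

```java
import java.util.Arrays;

/** Illustrative order-0 model (PWM); hypothetical names, not part of Jstacs. */
public class PwmSketch {

    private static final String ALPHABET = "ACGT";

    /** Estimates position-wise nucleotide probabilities from aligned sites of equal length. */
    public static double[][] estimate(String[] sites, double pseudocount) {
        int width = sites[0].length();
        double[][] probs = new double[width][ALPHABET.length()];
        // Start every cell with the pseudocount so that no probability is zero.
        for (double[] row : probs) {
            Arrays.fill(row, pseudocount);
        }
        // Count how often each nucleotide occurs at each position.
        for (String site : sites) {
            for (int pos = 0; pos < width; pos++) {
                probs[pos][ALPHABET.indexOf(site.charAt(pos))] += 1.0;
            }
        }
        // Normalize each position to a probability distribution.
        double total = sites.length + ALPHABET.length() * pseudocount;
        for (double[] row : probs) {
            for (int a = 0; a < row.length; a++) {
                row[a] /= total;
            }
        }
        return probs;
    }

    /** Log-likelihood of one site: a sum of independent per-position log-probabilities. */
    public static double logLikelihood(double[][] probs, String site) {
        double sum = 0.0;
        for (int pos = 0; pos < probs.length; pos++) {
            sum += Math.log(probs[pos][ALPHABET.indexOf(site.charAt(pos))]);
        }
        return sum;
    }

    public static void main(String[] args) {
        String[] sites = {"TATA", "TATA", "TACA", "TATT"}; // toy data, not from the ENCODE sets
        double[][] pwm = estimate(sites, 1.0);
        System.out.println("TATA: " + logLikelihood(pwm, "TATA"));
        System.out.println("GCGC: " + logLikelihood(pwm, "GCGC"));
    }
}
```

A model of higher order would replace the per-position distribution with one that is conditioned on preceding nucleotides; the sketch deliberately omits this.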

BindingSitePrediction

The tool BindingSitePrediction predicts instances of binding sites in a positive data set based on a previously learned model.
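The general idea of predicting binding sites with a learned model can be sketched as a sliding-window scan: every window of motif width is scored, and windows reaching a threshold are reported. This is only a sketch under stated assumptions, not the procedure BindingSitePrediction actually uses. The class `ScanSketch`, the `Hit` type, the threshold rule, and the toy consensus scorer are hypothetical, and only the forward strand is scanned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Illustrative sliding-window scan; hypothetical names, not part of Jstacs. */
public class ScanSketch {

    /** A predicted site: start offset within the sequence and its score. */
    public static final class Hit {
        public final int start;
        public final double score;

        public Hit(int start, double score) {
            this.start = start;
            this.score = score;
        }
    }

    /** Scores every window of the given width and keeps those with score >= threshold. */
    public static List<Hit> scan(String sequence, int width,
                                 ToDoubleFunction<String> scorer, double threshold) {
        List<Hit> hits = new ArrayList<>();
        for (int start = 0; start + width <= sequence.length(); start++) {
            double score = scorer.applyAsDouble(sequence.substring(start, start + width));
            if (score >= threshold) {
                hits.add(new Hit(start, score));
            }
        }
        return hits;
    }

    /** Toy stand-in for a learned model: number of positions matching a consensus. */
    public static double matches(String window, String consensus) {
        int count = 0;
        for (int i = 0; i < consensus.length(); i++) {
            if (window.charAt(i) == consensus.charAt(i)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Hit> hits = scan("GGTATACCTATT", 4, w -> matches(w, "TATA"), 3.0);
        for (Hit hit : hits) {
            System.out.println(hit.start + "\t" + hit.score);
        }
    }
}
```

In practice the scorer would be the log-likelihood under the learned model rather than a consensus match count.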

Classification

The tool Classification first performs motif discovery and then a fragment-based classification, using positive data that is assumed to contain an instance of the motif and negative data that is assumed not to contain the motif. The tool writes the classification results to standard output.

Data

The example data sets contain extracted ChIP-seq sequences of 50 different human transcription factors from the ENCODE project, as well as corresponding negative data. Each data set is split into 10 subsets to enable a reproducible 10-fold cross-validation.
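Because the subsets are fixed in advance, a 10-fold cross-validation needs no random shuffling: in each round one subset is held out for testing and the remaining nine are merged for training. The following is a minimal sketch of that loop. The class `CrossValidationSketch` and the placeholder sequence names are hypothetical, and reading the actual data files is left out because their format is not described here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative cross-validation over pre-split subsets; hypothetical names. */
public class CrossValidationSketch {

    /** Merges all folds except the held-out one into a single training set. */
    public static List<String> trainingSet(List<List<String>> folds, int heldOut) {
        List<String> train = new ArrayList<>();
        for (int i = 0; i < folds.size(); i++) {
            if (i != heldOut) {
                train.addAll(folds.get(i));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        // Placeholder folds; in practice each fold would be loaded from one of the 10 subsets.
        List<List<String>> folds = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            folds.add(Arrays.asList("seq" + i + "a", "seq" + i + "b"));
        }
        for (int k = 0; k < folds.size(); k++) {
            List<String> train = trainingSet(folds, k);
            List<String> test = folds.get(k);
            // Here a model would be trained on 'train' and evaluated on 'test'.
            System.out.println("fold " + k + ": train=" + train.size() + ", test=" + test.size());
        }
    }
}
```

Using the same fixed subsets in every run is what makes results comparable across models and repetitions.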

Source code

Building the source code requires Jstacs 2.1.