**SlimDimont** is an extension of Dimont, a universal tool for de-novo motif discovery, to parse local inhomogeneous mixture (Slim) models, which can capture neighboring and non-neighboring dependencies between the positions of sequence motifs. 
SlimDimont has successfully been applied to ChIP-seq and genomic context protein-binding microarray (gcPBM) data.

Input sequences must be supplied in an annotated FastA format as a file uploaded by the "Upload File" task in section "Get Data" of Galaxy.
In the annotation of each sequence, you need to provide a value that reflects the confidence that this sequence is bound by the factor of interest.
Such confidences may be peak statistics (e.g., number of fragments under a peak) for ChIP data or signal intensities for PBM data. In addition, you need to provide an anchor position within the sequence. 
In case of ChIP data, this anchor position could for instance be the peak summit.
For instance, an annotated FastA file for ChIP-seq data comprising sequences of length 100 centered around the peak summit could look like::
	
	> peak: 50; signal: 515
	ggccatgtgtatttttttaaatttccac...
	> peak: 50; signal: 199
	GGTCCCCTGGGAGGATGGGGACGTGCTG...
	...

where the anchor point is given as 50 for the first two sequences, and the confidence amounts to 515 and 199, respectively.
The FastA comment may contain additional annotations of the format ``key1 : value1; key2: value2;...``.
We also provide an example_ input file and a Perl script_ for preparing data in the format required by Dimont.

Accordingly, you would need to set the parameter "Position tag" to ``peak`` and the parameter "Value tag" to ``signal`` for the input file.

For the standard deviation of the position prior, the motif width and the number of pre-optimization runs, we provide default values that worked well in our studies. 
However, you may want adjust these parameters to meet your prior information.


The parameter "Weighting factor" defines the proportion of sequences that you expect to be bound by the targeted factor with high confidence. For ChIP data, the default value of ``0.2`` typically works well. 
For universal PBM data, containing a large number of unspecific probes, this parameter should be set to a lower value, e.g. ``0.01``.


The parameter "Model type" allows for choosing between standard Markov models (including the PWM model), Slim and LSlim models.
If the Markov model is chosen, the parameter "Order" selects the model order. If this parameter is set to ``0``, you obtain a position weight matrix (PWM) model. 
If it is set to ``1``, you obtain a weight array matrix (WAM) model. You can set the order of the motif model to at most ``3``.
In case of the Slim model, no further parameters are required, whereas in case of the LSlim model, the maximum distance between positions that is considered by the model is specified by the parameter "Distance".

The parameter "Markov order of the background model" sets the order of the homogeneous Markov model used for modeling positions not covered by a motif. 
If this parameter is set to ``-1``, you obtain a uniform distribution, which worked well for ChIP data. For PBM data, a background order of ``1`` worked well.

The "Equivalent sample size" reflects the strength of the influence of the prior on the model parameters, where higher values smooth out the parameters to a greater extent.

The parameter "Delete BSs from profile" defines if BSs of already discovered motifs should be deleted, i.e., "blanked out", from the sequence before searching for futher motifs.

You can also install this web-application within your local Galaxy server. Instructions can be found at the Slim_ page of Jstacs. 
There you can also download a command line version of Dimont.

If you use Slim models for motif discovery, please cite

 \J. Keilwagen and J. Grau. Varying levels of complexity in transcription factor binding motifs. doi:10.1093/nar/gkv577 Nucleic Acids Research, 2015.

and

 \J. Grau, S. Posch, I. Grosse, and J. Keilwagen. A general approach for discriminative de-novo motif discovery from high-throughput data. *Nucleic Acids Research*, 41(21):e197, 2013.


If you experience problems using Dimont, please contact_ us.

.. _example: http://www.jstacs.de/downloads/dimont-example.fa
.. _script: http://www.jstacs.de/index.php/Dimont#Data_preparation
.. _Slim: http://jstacs.de/index.php/Slim
.. _contact: mailto:grau@informatik.uni-halle.de