public class PFMWrapperTrainSM extends AbstractTrainableStatisticalModel implements PWMSupplier
Fields inherited from class AbstractTrainableStatisticalModel: alphabets, length
Constructor and Description |
---|
PFMWrapperTrainSM(AlphabetContainer alphabets, String name, double[][] pfm, double ess): Creates a new wrapper for a given position frequency matrix. |
PFMWrapperTrainSM(StringBuffer stringBuff): Creates a wrapper from its XML representation. |
Modifier and Type | Method and Description |
---|---|
protected void | fromXML(StringBuffer xml): This method should only be used by the constructor that works on a StringBuffer. |
String | getInstanceName(): Should return a short instance name such as iMM(0), BN(2), ... |
double | getLogPriorTerm(): Returns a value that is proportional to the log of the prior. |
double | getLogProbFor(Sequence sequence, int startpos, int endpos): Returns the logarithm of the probability of (a part of) the given sequence given the model. |
String | getName(): Returns a name (e.g., an identifier from a database) for the PWM. |
NumericalResultSet | getNumericalCharacteristics(): Returns the subset of numerical values that are also returned by SequenceScore.getCharacteristics(). |
double[][] | getPFM(): Returns a deep copy of the internal PFM. |
double[][] | getPWM(): Returns the position weight matrix. |
boolean | isInitialized(): This method can be used to determine whether the instance is initialized. |
String | toString(NumberFormat nf): This method returns a String representation of the instance. |
StringBuffer | toXML(): This method returns an XML representation as StringBuffer of an instance of the implementing class. |
void | train(DataSet data, double[] weights): Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. |
Methods inherited from class AbstractTrainableStatisticalModel: check, clone, emitDataSet, getAlphabetContainer, getCharacteristics, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, toString, train
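
The constructor's ess parameter is described as an equivalent sample size that is divided by the size of the alphabet to determine pseudo counts. The following is a minimal sketch of that idea and not the Jstacs implementation: it assumes the matrix is laid out as pfm[position][symbol], that the same pseudo count is added to every cell of a position, and that each position is then normalized to probabilities. The names PseudoCountSketch and toProbabilities are hypothetical.

```java
// Hypothetical helper, not part of Jstacs: turns a count matrix into
// per-position probabilities using pseudo counts derived from an ESS.
class PseudoCountSketch {

    // Assumption: pfm[position][symbol]; ess is split evenly over the alphabet.
    static double[][] toProbabilities(double[][] pfm, double ess) {
        double[][] probs = new double[pfm.length][];
        for (int pos = 0; pos < pfm.length; pos++) {
            int alphabetSize = pfm[pos].length;
            double pseudoCount = ess / alphabetSize;
            double sum = 0.0;
            probs[pos] = new double[alphabetSize];
            // Add the pseudo count to every cell and accumulate the row sum.
            for (int sym = 0; sym < alphabetSize; sym++) {
                probs[pos][sym] = pfm[pos][sym] + pseudoCount;
                sum += probs[pos][sym];
            }
            // Normalize the row so that it sums to one.
            for (int sym = 0; sym < alphabetSize; sym++) {
                probs[pos][sym] /= sum;
            }
        }
        return probs;
    }

    public static void main(String[] args) {
        double[][] pfm = { { 8, 0, 0, 0 }, { 1, 1, 1, 1 } };
        double[][] probs = toProbabilities(pfm, 4.0);
        for (double[] row : probs) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

In this sketch, ess = 0 adds no pseudo counts, so a matrix that already holds normalized weights passes through unchanged, which is in line with the advice that ess should typically be zero in that case.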
public PFMWrapperTrainSM(AlphabetContainer alphabets, String name, double[][] pfm, double ess) throws CloneNotSupportedException
Parameters:
alphabets - the alphabet
name - the name of the matrix
pfm - the position frequency matrix (may also be a position weight matrix, but ess should typically be zero in that case)
ess - the equivalent sample size (divided by the size of the alphabet to determine pseudo counts)
Throws:
CloneNotSupportedException - if the PFM could not be cloned
public PFMWrapperTrainSM(StringBuffer stringBuff) throws NonParsableException
Parameters:
stringBuff - the XML representation
Throws:
NonParsableException - if the XML could not be parsed
public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.
public void train(DataSet data, double[] weights) throws Exception
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i, so the weights array should have as many entries as there are sequences in the data set. (Optionally, weights == null can be used if all weights have the value one.)
The call sequence train(data1); train(data2) should result in a model fully trained over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.
Specified by:
train in interface TrainableStatisticalModel
Parameters:
data - the given sequences as DataSet
weights - the weights of the elements; each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the data set do not match)
See Also:
DataSet.getElementAt(int), DataSet.ElementEnumerator
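
The contract described for train (one weight per sequence, weights == null meaning that all weights are one, and a second call replacing rather than extending the first) can be illustrated with a toy count model. This is a sketch only: the class WeightedCountSketch and its int[][] data representation are hypothetical, and it says nothing about how PFMWrapperTrainSM itself handles training.

```java
// Hypothetical toy model, not part of Jstacs: collects weighted symbol
// counts per position and is retrained from scratch on every call.
class WeightedCountSketch {
    private final double[][] counts; // counts[position][symbol]

    WeightedCountSketch(int length, int alphabetSize) {
        counts = new double[length][alphabetSize];
    }

    // data[i] is the i-th sequence, encoded as symbol indices.
    void train(int[][] data, double[] weights) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        if (weights != null) {
            for (double w : weights) {
                if (w < 0) {
                    throw new IllegalArgumentException("weights must be non-negative");
                }
            }
        }
        // Non-incremental training: forget everything learned before.
        for (double[] row : counts) {
            java.util.Arrays.fill(row, 0.0);
        }
        for (int i = 0; i < data.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            for (int pos = 0; pos < counts.length; pos++) {
                counts[pos][data[i][pos]] += w;
            }
        }
    }

    double count(int position, int symbol) {
        return counts[position][symbol];
    }

    public static void main(String[] args) {
        WeightedCountSketch model = new WeightedCountSketch(2, 4);
        model.train(new int[][] { { 0, 1 }, { 0, 2 } }, null);
        model.train(new int[][] { { 3, 3 } }, new double[] { 0.5 });
        // Only the second data set is reflected in the counts.
        System.out.println(model.count(0, 0) + " " + model.count(0, 3));
    }
}
```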
public double getLogProbFor(Sequence sequence, int startpos, int endpos) throws Exception
Description copied from interface: StatisticalModel
Returns the logarithm of the probability of (a part of) the given sequence given the model. Compared with StatisticalModel.getLogProbFor(Sequence, int), this method accounts for the fact that the model could be, e.g., homogeneous, so that the length of the sequences whose probability should be returned is not fixed. Additionally, the end position of the part of the given sequence is given, and the probability of the part from position startpos to endpos (inclusive) should be returned.
The length and the alphabets define the type of data that can be modeled, and therefore both have to be checked.
Specified by:
getLogProbFor in interface StatisticalModel
Parameters:
sequence - the given sequence
startpos - the start position within the given sequence
endpos - the last position to be taken into account
Throws:
Exception - if the sequence could not be handled (e.g. startpos > , endpos > sequence.length, ...) by the model
NotTrainedException - if the model is not trained yet
public double getLogPriorTerm() throws Exception
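Description copied from interface: StatisticalModel
Returns a value that is proportional to the log of the prior.
Specified by:
getLogPriorTerm in interface StatisticalModel
Throws:
Exception - if something went wrong
public String getInstanceName()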
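Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
Specified by:
getInstanceName in interface SequenceScore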
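
For getLogProbFor(Sequence, int, int) documented above, the following sketch shows how a log probability over the inclusive range startpos to endpos could be computed from a matrix of per-position probabilities. It assumes a fixed-length model with independent positions, probabilities stored as pwm[position][symbol], and sequences encoded as int arrays. LogProbSketch and logProbFor are hypothetical names, and the real method may validate its arguments differently.

```java
// Hypothetical helper, not part of Jstacs: sums per-position log
// probabilities over the inclusive range [startpos, endpos].
class LogProbSketch {

    static double logProbFor(double[][] pwm, int[] sequence, int startpos, int endpos) {
        if (startpos < 0 || startpos > endpos || endpos >= sequence.length) {
            throw new IllegalArgumentException("invalid start or end position");
        }
        // Assumption: the model has a fixed length, so the range must match it.
        if (endpos - startpos + 1 != pwm.length) {
            throw new IllegalArgumentException("range length must equal model length");
        }
        double logProb = 0.0;
        for (int pos = 0; pos < pwm.length; pos++) {
            logProb += Math.log(pwm[pos][sequence[startpos + pos]]);
        }
        return logProb;
    }

    public static void main(String[] args) {
        double[][] pwm = { { 0.5, 0.5 }, { 0.25, 0.75 } };
        int[] sequence = { 0, 1, 1 };
        // Scores positions 1 and 2 of the sequence: log(0.5) + log(0.75).
        System.out.println(logProbFor(pwm, sequence, 1, 2));
    }
}
```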
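public String getName()
Description copied from interface: PWMSupplier
Returns a name (e.g., an identifier from a database) for the PWM.
Specified by:
getName in interface PWMSupplier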
public NumericalResultSet getNumericalCharacteristics() throws Exception
Description copied from interface: SequenceScore
Returns the subset of numerical values that are also returned by SequenceScore.getCharacteristics().
Specified by:
getNumericalCharacteristics in interface SequenceScore
Throws:
Exception - if some of the characteristics could not be defined
public boolean isInitialized()
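Description copied from interface: SequenceScore
This method can be used to determine whether the instance is initialized. If the instance is initialized, you should be able to invoke SequenceScore.getLogScoreFor(Sequence).
Specified by:
isInitialized in interface SequenceScore
Returns:
true if the instance is initialized, false otherwise
public String toString(NumberFormat nf)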
Description copied from interface: SequenceScore
This method returns a String representation of the instance.
Specified by:
toString in interface SequenceScore
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns:
a String representation of the instance
protected void fromXML(StringBuffer xml) throws NonParsableException
Description copied from class: AbstractTrainableStatisticalModel
This method should only be used by the constructor that works on a StringBuffer. It is the counterpart of Storable.toXML().
Specified by:
fromXML in class AbstractTrainableStatisticalModel
Parameters:
xml - the XML representation of the model
Throws:
NonParsableException - if the StringBuffer is not parsable or the representation is conflicting
See Also:
AbstractTrainableStatisticalModel.AbstractTrainableStatisticalModel(StringBuffer)
public double[][] getPFM() throws CloneNotSupportedException
Returns a deep copy of the internal PFM.
Throws:
CloneNotSupportedException - if the PFM could not be cloned
public double[][] getPWM()
Description copied from interface: PWMSupplier
Returns the position weight matrix.
Specified by:
getPWM in interface PWMSupplier
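
getPFM() is documented to return a deep copy of the internal PFM. As a small illustration of what that implies for callers, the sketch below copies a double[][] row by row, so that changes to the returned matrix do not reach the original. DeepCopySketch and deepCopy are hypothetical names; Jstacs may rely on its own utility for this.

```java
// Hypothetical helper, not part of Jstacs: copies a matrix row by row so
// that the copy shares no arrays with the original.
class DeepCopySketch {

    static double[][] deepCopy(double[][] matrix) {
        double[][] copy = new double[matrix.length][];
        for (int i = 0; i < matrix.length; i++) {
            // clone() on a double[] copies the primitive values themselves.
            copy[i] = matrix[i].clone();
        }
        return copy;
    }

    public static void main(String[] args) {
        double[][] internal = { { 1, 2 }, { 3, 4 } };
        double[][] returned = deepCopy(internal);
        returned[0][0] = 99;
        // The original matrix is untouched by the change to the copy.
        System.out.println(internal[0][0]); // prints 1.0
    }
}
```

A shallow copy via matrix.clone() alone would duplicate only the outer array, so both matrices would still share their rows.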