FAQs: Difference between revisions

Latest revision as of 16:04, 2 February 2012

Handling data

Also have a look at the code example Loading data.

Q: How do I create an AlphabetContainer instance for DNA sequences?
A: AlphabetContainer container = new AlphabetContainer( new DNAAlphabet() );


Q: Why should I use the AlphabetContainer and not just a simple Alphabet instance?
A: Because some data do not have the same alphabet at each position of the sequence, e.g. phenotypic data. Hence, we also strongly recommend always using getAlphabetLengthAt(int), e.g. when setting the size of an array.
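
The following plain-Java sketch illustrates why the array size should be queried per position. It does not use Jstacs: PositionalAlphabetsSketch is a hypothetical stand-in written for this example, the phenotypic symbols are invented, and only the method name getAlphabetLengthAt(int) is taken from the answer above.

```java
import java.util.List;

/**
 * Hypothetical stand-in for a container that holds one alphabet per position.
 * This is NOT the Jstacs AlphabetContainer; only the method name
 * getAlphabetLengthAt(int) is borrowed from the FAQ answer.
 */
final class PositionalAlphabetsSketch {

    private final List<String[]> alphabets; // one symbol set per position

    PositionalAlphabetsSketch(List<String[]> alphabets) {
        this.alphabets = alphabets;
    }

    /** Number of symbols allowed at the given position. */
    int getAlphabetLengthAt(int position) {
        return alphabets.get(position).length;
    }

    /** Number of positions covered by this container. */
    int getLength() {
        return alphabets.size();
    }

    /** Allocates one counter array per position, sized by that position's alphabet. */
    int[][] allocateCounts() {
        int[][] counts = new int[getLength()][];
        for (int pos = 0; pos < counts.length; pos++) {
            counts[pos] = new int[getAlphabetLengthAt(pos)];
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented example: position 0 is binary, position 1 has three states.
        PositionalAlphabetsSketch container = new PositionalAlphabetsSketch(List.of(
                new String[] {"absent", "present"},
                new String[] {"low", "medium", "high"}));
        int[][] counts = container.allocateCounts();
        System.out.println(counts[0].length + " " + counts[1].length); // prints "2 3"
    }
}
```

Sizing every array with a single global alphabet length would either waste space or overflow at positions with a larger alphabet, which is why the per-position query is the safer habit.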


Q: How can I create a simple sequence?
A: Use the create method of Sequence, e.g. Sequence.create( new AlphabetContainer( new DNAAlphabet() ), "ACGTACGT" );


Q: How can I load my own data?
A: If your sequences are stored either in plain text or in FastA format, you can directly create a new DataSet from the file.


Q: I wrote a sophisticated method using BioJava to load my data from a GenBank file/a database/somewhere else. How can I do something similar in Jstacs?
A: You can still use your existing method. Jstacs has an adapter for BioJava SequenceIterators.


Using existing models

Also have a look at the Code examples.

Q: Where do I find a list of the models currently implemented in Jstacs?
A: All generative models in Jstacs implement the TrainableStatisticalModel interface.
All discriminative models in Jstacs implement the DifferentiableStatisticalModel interface. You can find all existing implementations in the lists of implementing classes of these two interfaces.


Q: I decided on two TrainableStatisticalModels. How do I learn them and classify new data?
A: You can create a new TrainSMBasedClassifier from your models and use its train and classify methods. If you only want to learn a model from data, e.g. to sample new sequences, you can also directly use the train method of the TrainableStatisticalModel.
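
To make the idea behind the answer concrete, here is a minimal plain-Java sketch of classification with two generative models: each model is trained on the data of one class, and a new sequence is assigned to the class whose model gives it the higher log-likelihood. It is not the implementation of TrainSMBasedClassifier; FrequencyModel, the pseudocount of 1, and the assumption of equal class priors are choices made only for this illustration.

```java
import java.util.List;

/**
 * Plain-Java illustration of likelihood-based classification with two
 * generative models. None of these classes are part of Jstacs.
 */
final class TwoModelClassifierSketch {

    /** Position-independent nucleotide frequency model (hypothetical helper). */
    static final class FrequencyModel {
        private static final String SYMBOLS = "ACGT"; // assumes upper-case DNA only
        private final double[] logProb = new double[SYMBOLS.length()];

        /** Estimates symbol probabilities from the sequences, with a pseudocount of 1. */
        void train(List<String> sequences) {
            double[] counts = {1, 1, 1, 1};
            double total = counts.length;
            for (String seq : sequences) {
                for (char c : seq.toCharArray()) {
                    counts[SYMBOLS.indexOf(c)]++;
                    total++;
                }
            }
            for (int i = 0; i < counts.length; i++) {
                logProb[i] = Math.log(counts[i] / total);
            }
        }

        /** Sum of the log-probabilities of all symbols in the sequence. */
        double logLikelihood(String seq) {
            double sum = 0;
            for (char c : seq.toCharArray()) {
                sum += logProb[SYMBOLS.indexOf(c)];
            }
            return sum;
        }
    }

    private final FrequencyModel[] models = {new FrequencyModel(), new FrequencyModel()};

    /** Trains model 0 on the first data set and model 1 on the second. */
    void train(List<String> class0, List<String> class1) {
        models[0].train(class0);
        models[1].train(class1);
    }

    /** Returns the index of the class whose model scores the sequence higher (equal priors assumed). */
    int classify(String seq) {
        return models[0].logLikelihood(seq) >= models[1].logLikelihood(seq) ? 0 : 1;
    }

    public static void main(String[] args) {
        TwoModelClassifierSketch classifier = new TwoModelClassifierSketch();
        classifier.train(List.of("GCGCGGCC", "CCGGCGCG"), List.of("ATATTAAT", "TTAATATA"));
        System.out.println(classifier.classify("GGCC")); // prints 0
        System.out.println(classifier.classify("ATTA")); // prints 1
    }
}
```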


Q: I decided on two DifferentiableStatisticalModels. How do I learn them and classify new data?
A: You can create a new GenDisMixClassifier from your models and use its train and classify methods.


Q: I have to decide which model is best for my classification task. How do I assess different model combinations or classifiers?
A: You can use the subclasses of ClassifierAssessment, e.g. KFoldCrossValidation. All ClassifierAssessments have a constructor that accepts an array of classifiers (or TrainableStatisticalModels). You can then use the assess method to assess these classifiers on the same data using a number of pre-defined performance measures.
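
As background for KFoldCrossValidation, the sketch below shows the general k-fold idea in plain Java: the data indices are split into k folds, and each fold is held out once for testing while the remaining folds are used for training. How Jstacs partitions the data internally is not described in this FAQ; the round-robin assignment and the names KFoldSketch, folds, and trainingIndices are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Generic k-fold partitioning; not the Jstacs implementation. */
final class KFoldSketch {

    /** Splits the indices 0..n-1 into k folds, assigning index i to fold i % k. */
    static List<List<Integer>> folds(int n, int k) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("k must be between 2 and n");
        }
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(i);
        }
        return folds;
    }

    /** Indices used for training when the fold with index testFold is held out. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != testFold) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(7, 3);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("test=" + folds.get(f) + " train=" + trainingIndices(folds, f));
        }
    }
}
```

In practice the indices are usually shuffled before partitioning so that the folds do not depend on the order of the input file.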


Q: How can I store and load my model, classifier, ...?
A: All classes that implement Storable have a method toXML() that returns a StringBuffer containing the instance as XML. Such classes should also have a constructor with a single StringBuffer argument, which creates a new instance from a StringBuffer that contains an instance as XML. In addition, the class FileManager allows you to read and write StringBuffers from and to the hard drive.
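
The convention described above (a toXML() method plus a constructor taking a StringBuffer) can be sketched in plain Java as follows. The example deliberately avoids Jstacs classes: Threshold is a hypothetical class, its XML layout is invented, and the write and read helpers use java.nio.file in place of FileManager, whose method signatures are not given in this FAQ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Round trip of an object through XML in a StringBuffer; not Jstacs code. */
final class StorableSketch {

    /** Hypothetical class following the toXML() + StringBuffer-constructor convention. */
    static final class Threshold {
        private final double value;

        Threshold(double value) {
            this.value = value;
        }

        /** Re-creates an instance from the XML produced by toXML(). */
        Threshold(StringBuffer xml) {
            String s = xml.toString();
            int open = s.indexOf("<value>");
            int close = s.indexOf("</value>");
            if (open < 0 || close < open) {
                throw new IllegalArgumentException("not a Threshold XML: " + s);
            }
            this.value = Double.parseDouble(s.substring(open + "<value>".length(), close));
        }

        /** Serializes this instance as a small XML fragment. */
        StringBuffer toXML() {
            return new StringBuffer("<Threshold><value>").append(value).append("</value></Threshold>");
        }

        double getValue() {
            return value;
        }
    }

    /** Writes the XML to a file (stand-in for a file utility such as FileManager). */
    static void write(Path file, StringBuffer xml) throws IOException {
        Files.write(file, xml.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the whole file back into a StringBuffer. */
    static StringBuffer read(Path file) throws IOException {
        return new StringBuffer(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("threshold", ".xml");
        write(file, new Threshold(0.75).toXML());
        Threshold restored = new Threshold(read(file));
        System.out.println(restored.getValue()); // prints 0.75
        Files.delete(file);
    }
}
```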


Q: Why does Jstacs use XML to save instances?
A: Because it is human-readable.


Q: I use Gibbs sampling in a class extending AbstractMixtureTrainSM.

  • Q1: Why does the sampling create files in Java's temporary directory?
  • Q2: Will these files be deleted automatically once they are no longer used?

A:

  • A1: These files are created to store the sampled parameters temporarily. Java's temporary directory is used to minimize network load if you work on a cluster.
  • A2: These files will be deleted once no reference to the mixture instance exists and the garbage collector is called. Therefore, it is recommended to call the garbage collector explicitly at the end of any application.
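
The following plain-Java sketch shows where such temporary files end up and how the recommended explicit call might look at the end of an application. It does not reproduce the Jstacs clean-up mechanism: the file prefix "sampled-params-" and the deleteOnExit() safety net are assumptions added for this example. Note that System.gc() is only a request; the JVM decides whether a collection actually runs.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Temporary-file handling and an explicit garbage-collection request; not Jstacs code. */
final class TempFileSketch {

    /** Creates a scratch file in the directory named by the java.io.tmpdir system property. */
    static File createScratchFile() {
        try {
            File file = File.createTempFile("sampled-params-", ".tmp"); // hypothetical prefix
            file.deleteOnExit(); // safety net added for this sketch
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Temp directory: " + System.getProperty("java.io.tmpdir"));
        File scratch = createScratchFile();
        System.out.println("Scratch file: " + scratch.getAbsolutePath());

        // ... the application's work would happen here ...

        // Drop the reference, then request a collection before the application ends.
        scratch = null;
        System.gc();
    }
}
```

The temporary directory can be redirected when starting the JVM, e.g. with -Djava.io.tmpdir=/path/to/local/scratch, which is useful on cluster nodes with local disks.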


Implementing new models

Q: How do I implement a new generative TrainableStatisticalModel?
A: Write an implementation of the TrainableStatisticalModel interface. For convenience, you can use the abstract AbstractTrainableStatisticalModel class with default implementations for many methods.


Q: How do I implement a new discriminative model?
A: Write an implementation of the DifferentiableStatisticalModel interface. For convenience, you can use the abstract AbstractDifferentiableStatisticalModel class with default implementations for many methods.


Q: How do I implement a model that can be trained generatively and discriminatively?
A: You can either extend AbstractTrainableStatisticalModel and additionally implement the DifferentiableStatisticalModel interface, or extend AbstractDifferentiableStatisticalModel and additionally implement the TrainableStatisticalModel interface.


Reporting bugs and requesting new features

Q: How do I report bugs I found in Jstacs?
A: Before reporting bugs in Jstacs, you should be sure it's not a feature ;-) You can discuss potential issues in the Jstacs forum. You can also have a look at the bugs that have already been reported. If you are sure that you found a new bug, please submit a new ticket to the Jstacs bug tracking system.


Q: How can I request new features?
A: You may use the Jstacs forum to discuss your request with other users. Most likely, we will join the discussion, too (we are somewhere out there!).
If you are convinced that the feature you request will be useful for all users of Jstacs, you are invited to submit a new ticket with your request.


Other

Q: The class UserTime does not work! Why?
A: The class UserTime uses native code. Therefore, there are at least two possible causes:

  • A1: You have forgotten to set the Java library path: -Djava.library.path=...
  • A2: You have to compile the native code on your system.
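
To narrow down which of the two cases applies, a small diagnostic like the one below may help. It prints the effective java.library.path and tries to load a native library by name. The name of the native library that UserTime needs is not stated in this FAQ, so it has to be passed as an argument; "name_of_native_library" is only a placeholder, and NativeLibraryCheck is a hypothetical helper, not part of Jstacs.

```java
/** Diagnostic for native-library loading problems; not part of Jstacs. */
final class NativeLibraryCheck {

    /** Returns true if System.loadLibrary can find and load the named library. */
    static boolean canLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown if the library is not on java.library.path or cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        // Placeholder name; pass the real library name as the first argument.
        String name = args.length > 0 ? args[0] : "name_of_native_library";
        System.out.println(name + (canLoad(name) ? " loaded" : " could not be loaded"));
    }
}
```

If the library cannot be loaded although its directory appears in the printed path, the binary was probably built for a different platform or architecture and needs to be compiled on your system.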