java.lang.Object
  de.jstacs.data.Sample
public class Sample
This is the class for any sample of Sequences. All Sequences
in a Sample have to have the same AlphabetContainer. The
Sequences may have different lengths.
For the internal representation the class Sequence is used, where the
external alphabet is converted to integral numerical values. The class
Sample knows about this coding via instances of class
AlphabetContainer and accordingly Alphabet.
There are different ways to access the elements of a
Sample. For random access, use the method
getElementAt(int). For fast sequential access, it is recommended to
use a Sample.ElementEnumerator.
Sample is immutable.
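
The conversion of an external alphabet to integral values, as described above, can be pictured with the following minimal sketch. It is not Jstacs code: the class SymbolCodingSketch and its methods buildCode and encode are hypothetical, and the actual coding is done by AlphabetContainer and Alphabet, whose details may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of Jstacs. */
class SymbolCodingSketch {

    /** Assigns each symbol of the alphabet its position as integer code. */
    static Map<Character, Integer> buildCode(String alphabetSymbols) {
        Map<Character, Integer> code = new LinkedHashMap<>();
        for (int i = 0; i < alphabetSymbols.length(); i++) {
            code.put(alphabetSymbols.charAt(i), i);
        }
        return code;
    }

    /** Converts an external string into its integer representation. */
    static int[] encode(String sequence, Map<Character, Integer> code) {
        int[] encoded = new int[sequence.length()];
        for (int i = 0; i < sequence.length(); i++) {
            Integer value = code.get(sequence.charAt(i));
            if (value == null) {
                // Stand-in for an alphabet mismatch
                throw new IllegalArgumentException("Symbol not in alphabet: " + sequence.charAt(i));
            }
            encoded[i] = value;
        }
        return encoded;
    }

    public static void main(String[] args) {
        Map<Character, Integer> dna = buildCode("ACGT");
        // prints [2, 0, 3, 3, 0, 1, 0]
        System.out.println(Arrays.toString(encode("GATTACA", dna)));
    }
}
```
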
See Also:
AlphabetContainer, Alphabet, Sequence

| Nested Class Summary | |
|---|---|
| static class | Sample.ElementEnumerator: This class can be used for fast sequential access to a Sample. |
| static class | Sample.PartitionMethod: This enum defines different partition methods for a Sample. |
| static class | Sample.WeightedSampleFactory: This class enables you to eliminate Sequences that occur more than once in one or more Samples. |
| Constructor Summary | |
|---|---|
| Sample(AlphabetContainer abc, AbstractStringExtractor se) | Creates a new Sample from a StringExtractor using the given AlphabetContainer. |
| Sample(AlphabetContainer abc, AbstractStringExtractor se, int subsequenceLength) | Creates a new Sample from a StringExtractor using the given AlphabetContainer and all overlapping windows of length subsequenceLength. |
| Sample(AlphabetContainer abc, AbstractStringExtractor se, String delim) | Creates a new Sample from a StringExtractor using the given AlphabetContainer and a delimiter delim. |
| Sample(AlphabetContainer abc, AbstractStringExtractor se, String delim, int subsequenceLength) | Creates a new Sample from a StringExtractor using the given AlphabetContainer, the given delimiter delim and all overlapping windows of length subsequenceLength. |
| Sample(Sample s, int subsequenceLength) | Creates a new Sample from a given Sample and a given length subsequenceLength. |
| Sample(String annotation, Sequence... seqs) | Creates a new Sample from an array of Sequences and a given annotation. |
| Method Summary | |
|---|---|
| Sequence[] | getAllElements(): Returns an array of Sequences containing all elements of this Sample. |
| AlphabetContainer | getAlphabetContainer(): Returns the AlphabetContainer of this Sample. |
| String | getAnnotation(): Returns some annotation of the Sample. |
| static String | getAnnotation(Sample... s): Returns the annotation for an array of Samples. |
| Sample | getCompositeSample(int[] starts, int[] lengths): This method enables you to use only composite Sequences of all elements in the current Sample. |
| Sequence | getElementAt(int i): This method returns the element, i.e. the Sequence, with index i. |
| int | getElementLength(): Returns the length of the elements, i.e. the Sequences, in this Sample. |
| Sample | getInfixSample(int start, int length): This method enables you to use only an infix of all elements, i.e. the Sequences, in the current Sample. |
| int | getMaximalElementLength(): Returns the maximal length of an element, i.e. a Sequence, in this Sample. |
| int | getMinimalElementLength(): Returns the minimal length of an element, i.e. a Sequence, in this Sample. |
| int | getNumberOfElements(): Returns the number of elements, i.e. the Sequences, in this Sample. |
| int | getNumberOfElementsWithLength(int len): Returns the number of overlapping elements that can be extracted. |
| Sample | getSuffixSample(int start): This method enables you to use only a suffix of all elements, i.e. the Sequences, in the current Sample. |
| static Sample | intersection(Sample... samples): This method computes the intersection of all elements, i.e. Samples, of the array; it returns a Sample containing only Sequences that are contained in all Samples of the array. |
| boolean | isDiscreteSample(): This method indicates if all positions use discrete values. |
| boolean | isSimpleSample(): This method indicates whether all random variables are defined over the same range, i.e. all positions use the same (fixed) alphabet. |
| Sample[] | partition(double p, Sample.PartitionMethod method, int subsequenceLength): This method partitions the elements, i.e. the Sequences, of the Sample into two distinct parts. |
| Sample[] | partition(int k, Sample.PartitionMethod method): This method partitions the elements, i.e. the Sequences, of the Sample into k distinct parts. |
| Sample[] | partition(Sample.PartitionMethod method, double... percentage): This method partitions the elements, i.e. the Sequences, of the Sample into distinct parts where each part holds the corresponding percentage given in the array percentage. |
| void | save(String msg, File f): This method writes a message msg and the Sample to a file f. |
| Sample | subSampling(int number): Randomly samples elements, i.e. Sequences, from the set of all elements contained in this Sample. |
| String | toString() |
| static Sample | union(Sample... s): Unites all Samples of the array s. |
| static Sample | union(Sample[] s, boolean[] in): This method unites all Samples of the array s regarding the array in. |
| static Sample | union(Sample[] s, boolean[] in, int subsequenceLength): This method unites all Samples of the array s regarding the array in and sets the element length in the united Sample to subsequenceLength. |
| static Sample | union(Sample[] s, int subsequenceLength): This method unites all Samples of the array s and sets the element length in the united Sample to subsequenceLength. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public Sample(AlphabetContainer abc,
AbstractStringExtractor se)
throws WrongAlphabetException,
EmptySampleException,
WrongLengthException
Creates a new Sample from a StringExtractor
using the given AlphabetContainer.
Parameters:
abc - the AlphabetContainer
se - the StringExtractor
Throws:
WrongAlphabetException - if the AlphabetContainer is not suitable
EmptySampleException - if the Sample would be empty
WrongLengthException - never happens (forwarded from Sample(AlphabetContainer, AbstractStringExtractor, String, int))
See Also:
Sample(AlphabetContainer, AbstractStringExtractor, String, int)
public Sample(AlphabetContainer abc,
AbstractStringExtractor se,
int subsequenceLength)
throws WrongAlphabetException,
WrongLengthException,
EmptySampleException
Creates a new Sample from a StringExtractor
using the given AlphabetContainer and all overlapping windows of
length subsequenceLength.
Parameters:
abc - the AlphabetContainer
se - the StringExtractor
subsequenceLength - the length of the window sliding on the Strings of se; if subsequenceLength is 0 (zero), the Sequences are used as given by the StringExtractor
Throws:
WrongAlphabetException - if the AlphabetContainer is not suitable
WrongLengthException - if the subsequence length is not supported
EmptySampleException - if the Sample would be empty
See Also:
Sample(AlphabetContainer, AbstractStringExtractor, String, int)
public Sample(AlphabetContainer abc,
AbstractStringExtractor se,
String delim)
throws WrongAlphabetException,
EmptySampleException,
WrongLengthException
Creates a new Sample from a StringExtractor
using the given AlphabetContainer and a delimiter
delim.
Parameters:
abc - the AlphabetContainer
se - the StringExtractor
delim - the delimiter for parsing the Strings
Throws:
WrongAlphabetException - if the AlphabetContainer is not suitable
EmptySampleException - if the Sample would be empty
WrongLengthException - never happens (forwarded from Sample(AlphabetContainer, AbstractStringExtractor, String, int))
See Also:
Sample(AlphabetContainer, AbstractStringExtractor, String, int)
public Sample(AlphabetContainer abc,
AbstractStringExtractor se,
String delim,
int subsequenceLength)
throws EmptySampleException,
WrongAlphabetException,
WrongLengthException
Creates a new Sample from a StringExtractor
using the given AlphabetContainer, the given delimiter
delim and all overlapping windows of length
subsequenceLength.
Parameters:
abc - the AlphabetContainer
se - the StringExtractor
delim - the delimiter for parsing the Strings
subsequenceLength - the length of the window sliding on the Strings of se; if subsequenceLength is 0 (zero), the Sequences are used as given by the StringExtractor
Throws:
WrongAlphabetException - if the AlphabetContainer is not suitable
EmptySampleException - if the Sample would be empty
WrongLengthException - if the subsequence length is not supported
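
The overlapping windows of length subsequenceLength used by the constructors above can be pictured with the following sketch on plain Strings. It is not Jstacs code: the class SlidingWindowSketch and its method windows are hypothetical. Skipping sequences that are shorter than the window is an assumption of this sketch; the library may instead signal a WrongLengthException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of Jstacs. */
class SlidingWindowSketch {

    /**
     * Returns all overlapping windows of the given length.
     * A length of 0 keeps every sequence as given, mirroring the
     * documented meaning of subsequenceLength == 0.
     */
    static List<String> windows(List<String> sequences, int subsequenceLength) {
        if (subsequenceLength < 0) {
            throw new IllegalArgumentException("subsequenceLength must not be negative");
        }
        List<String> result = new ArrayList<>();
        for (String s : sequences) {
            if (subsequenceLength == 0) {
                result.add(s);
                continue;
            }
            // A sequence of length n yields n - subsequenceLength + 1 windows;
            // shorter sequences yield none (assumption of this sketch).
            for (int start = 0; start + subsequenceLength <= s.length(); start++) {
                result.add(s.substring(start, start + subsequenceLength));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [ACG, CGT, GTA, GGC]
        System.out.println(windows(Arrays.asList("ACGTA", "GGC"), 3));
    }
}
```
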
public Sample(Sample s,
int subsequenceLength)
throws WrongLengthException
Creates a new Sample from a given Sample and a given
length subsequenceLength.
The elements of the new Sample that are accessed via
getElementAt(int) are real objects and do not have to be created
at the invocation of the method. (The same holds for the
Sample.ElementEnumerator. In those cases both ways to access the
Sequences are approximately equally fast.)
Parameters:
s - the given Sample
subsequenceLength - the new element length
Throws:
WrongLengthException - if something is wrong with subsequenceLength
See Also:
Sample(Sample, int, boolean)
public Sample(String annotation,
Sequence... seqs)
throws EmptySampleException,
IllegalArgumentException
Creates a new Sample from an array of Sequences and a
given annotation. See Model.emitSample(int, int...).
Parameters:
annotation - the annotation of the Sample
seqs - the Sequence(s)
Throws:
EmptySampleException - if the array seqs is null or its length is 0
IllegalArgumentException - if the Alphabets do not match

| Method Detail |
|---|
public static final String getAnnotation(Sample... s)
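Returns the annotation for an array of Samples.
Parameters:
s - an array of Samples
Returns:
the annotation for the array of Samples
See Also:
getAnnotation()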
public static final Sample intersection(Sample... samples)
throws IllegalArgumentException,
EmptySampleException
This method computes the intersection of all elements, i.e. Samples,
of the array; it returns a Sample containing only
Sequences that are contained in all Samples of the array.
Parameters:
samples - the array of Samples
Returns:
the intersection of the Samples in the array
Throws:
IllegalArgumentException - if the elements of the array are from different domains
EmptySampleException - if the intersection is empty
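
The semantics of intersection described above can be sketched on plain Strings as follows. This is not Jstacs code: the class IntersectionSketch is hypothetical, an IllegalStateException stands in for EmptySampleException, and keeping the order of the first list is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not part of Jstacs. */
class IntersectionSketch {

    /** Keeps only those sequences that occur in every given list. */
    @SafeVarargs
    static List<String> intersection(List<String>... samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        // Start with the first list and keep its order (assumption of this sketch)
        Set<String> common = new LinkedHashSet<>(samples[0]);
        for (int i = 1; i < samples.length; i++) {
            common.retainAll(new HashSet<>(samples[i]));
        }
        if (common.isEmpty()) {
            // Stand-in for EmptySampleException
            throw new IllegalStateException("the intersection is empty");
        }
        return new ArrayList<>(common);
    }

    public static void main(String[] args) {
        // prints [ACG, TTT]
        System.out.println(intersection(
                Arrays.asList("ACG", "TTT", "GGA"),
                Arrays.asList("TTT", "ACG"),
                Arrays.asList("ACG", "CCC", "TTT")));
    }
}
```
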
public static final Sample union(Sample[] s,
boolean[] in)
throws IllegalArgumentException,
EmptySampleException
This method unites all Samples of the array s
regarding the array in.
Parameters:
s - the array of Samples
in - an array indicating which Sample is used in the union; if in[i]==true, the Sample s[i] is used
Returns:
the united Sample
Throws:
IllegalArgumentException - if s.length != in.length or the Alphabets do not match
EmptySampleException - if the union is empty
See Also:
union(Sample[], boolean[], int)
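
The role of the array in can be sketched on plain Strings as follows. This is not Jstacs code: the class UnionSketch is hypothetical, an IllegalStateException stands in for EmptySampleException, and keeping duplicate sequences in the result is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of Jstacs. */
class UnionSketch {

    /** Unites all lists s.get(i) for which in[i] is true. */
    static List<String> union(List<List<String>> s, boolean[] in) {
        if (s.size() != in.length) {
            throw new IllegalArgumentException("s and in differ in length");
        }
        List<String> united = new ArrayList<>();
        for (int i = 0; i < in.length; i++) {
            if (in[i]) {
                // Duplicates are kept (assumption of this sketch)
                united.addAll(s.get(i));
            }
        }
        if (united.isEmpty()) {
            // Stand-in for EmptySampleException
            throw new IllegalStateException("the union is empty");
        }
        return united;
    }

    public static void main(String[] args) {
        List<List<String>> s = Arrays.asList(
                Arrays.asList("ACG", "TTT"),
                Arrays.asList("GGA"),
                Arrays.asList("CCC", "ACG"));
        // prints [ACG, TTT, CCC, ACG]
        System.out.println(union(s, new boolean[] {true, false, true}));
    }
}
```
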
public static final Sample union(Sample... s)
throws IllegalArgumentException
Unites all Samples of the array s.
Parameters:
s - the array of Samples
Returns:
the united Sample
Throws:
IllegalArgumentException - if the Alphabets do not match
See Also:
union(Sample[], boolean[])
public static final Sample union(Sample[] s,
boolean[] in,
int subsequenceLength)
throws IllegalArgumentException,
EmptySampleException,
WrongLengthException
This method unites all Samples of the array s
regarding the array in and sets the element length in the
united Sample to subsequenceLength.
Parameters:
s - the array of Samples
in - an array indicating which Sample is used in the union; if in[i]==true, the Sample s[i] is used
subsequenceLength - the length of the elements in the united Sample
Returns:
the united Sample
Throws:
IllegalArgumentException - if s.length != in.length or the Alphabets do not match
EmptySampleException - if the union is empty
WrongLengthException - if the united Sample does not support this subsequenceLength
public static final Sample union(Sample[] s,
int subsequenceLength)
throws IllegalArgumentException,
WrongLengthException
This method unites all Samples of the array s and
sets the element length in the united Sample to
subsequenceLength.
Parameters:
s - the array of Samples
subsequenceLength - the length of the elements in the united Sample
Returns:
the united Sample
Throws:
IllegalArgumentException - if the Alphabets do not match
WrongLengthException - if the united Sample does not support this subsequenceLength
See Also:
union(Sample[], boolean[], int)

public Sequence[] getAllElements()
Returns an array of Sequences containing all elements of this
Sample.
Returns:
all elements (Sequences) of this Sample
See Also:
Sample.ElementEnumerator

public final AlphabetContainer getAlphabetContainer()
Returns the AlphabetContainer of this Sample.
Returns:
the AlphabetContainer of this Sample

public final String getAnnotation()
Returns some annotation of the Sample.
Returns:
some annotation of the Sample
public final Sample getCompositeSample(int[] starts,
int[] lengths)
throws IllegalArgumentException
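This method enables you to use only composite Sequences of all
elements in the current Sample. Each composite Sequence
is built from one corresponding Sequence in this
Sample, and all composite Sequences
are returned in a new Sample.
Parameters:
starts - the start positions of the chunks
lengths - the lengths of the chunks
Returns:
a new Sample containing the composite Sequences
Throws:
IllegalArgumentException - if either starts or lengths or both in combination are not suitable
See Also:
Sequence.getCompositeSequence(AlphabetContainer, int[], int[])

public Sequence getElementAt(int i)
This method returns the element, i.e. the Sequence, with index
i. See also the comment on the constructor Sample(Sample, int).
Parameters:
i - the index of the element, i.e. the Sequence
Returns:
the element, i.e. the Sequence, with index i

public int getElementLength()
Returns the length of the elements, i.e. the Sequences, in this
Sample.
Returns:
the length of the elements, i.e. the Sequences, in this Sample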
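
How the chunks given by starts and lengths in getCompositeSample(int[], int[]) combine, and how an infix or a suffix relates to a single chunk, can be sketched on a plain String as follows. This is not Jstacs code: the class CompositeSketch and its methods are hypothetical, and requiring every chunk length to be positive is an assumption of this sketch.

```java
/** Hypothetical illustration only; not part of Jstacs. */
class CompositeSketch {

    /** Concatenates the chunks [starts[i], starts[i] + lengths[i]) of one sequence. */
    static String composite(String sequence, int[] starts, int[] lengths) {
        if (starts.length != lengths.length) {
            throw new IllegalArgumentException("starts and lengths differ in size");
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < starts.length; i++) {
            int end = starts[i] + lengths[i];
            // Positive chunk lengths are an assumption of this sketch
            if (starts[i] < 0 || lengths[i] <= 0 || end > sequence.length()) {
                throw new IllegalArgumentException("unsuitable chunk " + i);
            }
            result.append(sequence, starts[i], end);
        }
        return result.toString();
    }

    /** An infix is a composite consisting of a single chunk. */
    static String infix(String sequence, int start, int length) {
        return composite(sequence, new int[] {start}, new int[] {length});
    }

    /** A suffix is an infix that reaches the end of the sequence. */
    static String suffix(String sequence, int start) {
        return infix(sequence, start, sequence.length() - start);
    }

    public static void main(String[] args) {
        String s = "ACGTACGT";
        System.out.println(composite(s, new int[] {0, 5}, new int[] {2, 3})); // prints ACCGT
        System.out.println(infix(s, 2, 4));                                   // prints GTAC
        System.out.println(suffix(s, 5));                                     // prints CGT
    }
}
```

Applying the same starts and lengths to every element yields the elements of the new Sample.
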
public final Sample getInfixSample(int start,
int length)
throws IllegalArgumentException
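This method enables you to use only an infix of all elements, i.e. the
Sequences, in the current Sample. The subsequences are
returned in a new Sample.
This method can also be used to create a Sample of prefixes if
the element length is not zero.
Parameters:
start - the start position of the infix
length - the length of the infix, has to be positive
Returns:
a Sample of the specified infixes
Throws:
IllegalArgumentException - if either start or length or both in combination are not suitable

public int getMinimalElementLength()
Returns the minimal length of an element, i.e. a Sequence, in
this Sample.
Returns:
the minimal length of an element, i.e. a Sequence, in this Sample

public int getMaximalElementLength()
Returns the maximal length of an element, i.e. a Sequence, in
this Sample.
Returns:
the maximal length of an element, i.e. a Sequence, in this Sample

public int getNumberOfElements()
Returns the number of elements, i.e. the Sequences, in this
Sample.
Returns:
the number of elements, i.e. the Sequences, in this Sample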
public int getNumberOfElementsWithLength(int len)
throws WrongLengthException
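Returns the number of overlapping elements that can be extracted.
Parameters:
len - the length of the elements
Returns:
the number of overlapping elements of length len that can be extracted
Throws:
WrongLengthException - if the given length is bigger than the minimal element length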
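
The count of overlapping elements follows from the lengths of the contained sequences: a sequence of length n contains n - len + 1 windows of length len. The sketch below is not Jstacs code; the class WindowCountSketch is hypothetical, it works on an array of sequence lengths, requires len to be positive (an assumption), and uses an IllegalArgumentException as a stand-in for WrongLengthException.

```java
/** Hypothetical illustration only; not part of Jstacs. */
class WindowCountSketch {

    /** Sums n - len + 1 over all sequence lengths n. */
    static int numberOfElementsWithLength(int[] sequenceLengths, int len) {
        if (len <= 0) {
            throw new IllegalArgumentException("len must be positive in this sketch");
        }
        int count = 0;
        for (int n : sequenceLengths) {
            if (len > n) {
                // Mirrors the documented failure when len exceeds the minimal element length
                throw new IllegalArgumentException("len exceeds the minimal element length");
            }
            count += n - len + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        // (5 - 3 + 1) + (3 - 3 + 1) + (8 - 3 + 1) = 10
        System.out.println(numberOfElementsWithLength(new int[] {5, 3, 8}, 3)); // prints 10
    }
}
```
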
public final Sample getSuffixSample(int start)
throws IllegalArgumentException
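This method enables you to use only a suffix of all elements, i.e. the
Sequences, in the current Sample. The subsequences are
returned in a new Sample.
Parameters:
start - the start position of the suffix
Returns:
a Sample of the specified suffixes
Throws:
IllegalArgumentException - if start is not suitable

public final boolean isSimpleSample()
This method indicates whether all random variables are defined over the same range, i.e. all positions use the same (fixed) alphabet.
Returns:
true if the Sample is simple, false otherwise
See Also:
AlphabetContainer.isSimple()

public final boolean isDiscreteSample()
This method indicates if all positions use discrete values.
Returns:
true if the Sample is discrete, false otherwise
See Also:
AlphabetContainer.isDiscrete()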
public Sample[] partition(double p,
Sample.PartitionMethod method,
int subsequenceLength)
throws WrongLengthException,
UnsupportedOperationException,
EmptySampleException
This method partitions the elements, i.e. the Sequences, of the
Sample into two distinct parts. The second part (test sample) holds
the percentage p, the first part (train sample) holds the rest. The
first part has the same element length as the current Sample, the second
has element length subsequenceLength, which might be
necessary for testing.
Parameters:
p - the percentage for the second part; the second part holds at least this percentage of the full Sample
method - the method used to partition the Sample (partitioning criterion)
subsequenceLength - the element length of the second part; if 0 (zero), the Sequences are used as given in this Sample
Returns:
the array of the two resulting Samples
Throws:
WrongLengthException - if something is wrong with subsequenceLength
UnsupportedOperationException - if the Sample is not simple
EmptySampleException - if at least one of the created partitions is empty
See Also:
Sample.PartitionMethod,
Sample.PartitionMethod.PARTITION_BY_NUMBER_OF_ELEMENTS,
Sample.PartitionMethod.PARTITION_BY_NUMBER_OF_SYMBOLS,
partition(PartitionMethod, double...),
setSubsequenceLength(int)
public Sample[] partition(Sample.PartitionMethod method,
double... percentage)
throws IllegalArgumentException,
EmptySampleException
This method partitions the elements, i.e. the Sequences, of the
Sample into distinct parts where each part holds the corresponding
percentage given in the array percentage.
Parameters:
method - the method used to partition the Sample (partitioning criterion)
percentage - the array of percentages for each "subsample"
Returns:
the array of resulting Samples
Throws:
IllegalArgumentException - if the percentages are not correct (sum != 1 or one value is not in [0,1])
EmptySampleException - if at least one of the created partitions is empty
See Also:
Sample.PartitionMethod,
Sample.PartitionMethod.PARTITION_BY_NUMBER_OF_ELEMENTS,
Sample.PartitionMethod.PARTITION_BY_NUMBER_OF_SYMBOLS
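
A partition by percentages, in the spirit of PARTITION_BY_NUMBER_OF_ELEMENTS, can be sketched on plain Strings as follows. This is not Jstacs code: the class PartitionSketch is hypothetical, an IllegalStateException stands in for EmptySampleException, and both the rounding of the part boundaries and the preservation of the element order are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of Jstacs. */
class PartitionSketch {

    /** Splits the elements into consecutive parts of the given relative sizes. */
    static List<List<String>> partition(List<String> elements, double... percentage) {
        double sum = 0;
        for (double p : percentage) {
            if (p < 0 || p > 1) {
                throw new IllegalArgumentException("percentage not in [0,1]: " + p);
            }
            sum += p;
        }
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("percentages do not sum to 1");
        }
        List<List<String>> parts = new ArrayList<>();
        int n = elements.size();
        int from = 0;
        double cumulative = 0;
        for (int i = 0; i < percentage.length; i++) {
            cumulative += percentage[i];
            // The last part takes all remaining elements to avoid rounding gaps
            int to = (i == percentage.length - 1) ? n : (int) Math.round(cumulative * n);
            if (to <= from) {
                // Stand-in for EmptySampleException
                throw new IllegalStateException("partition " + i + " is empty");
            }
            parts.add(new ArrayList<>(elements.subList(from, to)));
            from = to;
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> elements = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            elements.add("s" + i);
        }
        // part sizes: 7, 2, 1
        for (List<String> part : partition(elements, 0.7, 0.2, 0.1)) {
            System.out.println(part.size() + " " + part);
        }
    }
}
```
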
public Sample[] partition(int k,
Sample.PartitionMethod method)
throws IllegalArgumentException,
EmptySampleException
This method partitions the elements, i.e. the Sequences, of the
Sample into k distinct parts.
Parameters:
k - the number of distinct parts
method - the method used to partition the Sample (partitioning criterion)
Returns:
the array of resulting Samples
Throws:
IllegalArgumentException - if k is not correct
EmptySampleException - if at least one of the created partitions is empty
See Also:
Sample.PartitionMethod,
Sample.PartitionMethod.PARTITION_BY_NUMBER_OF_ELEMENTS,
Sample.PartitionMethod.PARTITION_BY_NUMBER_OF_SYMBOLS
public Sample subSampling(int number)
throws EmptySampleException
Randomly samples elements, i.e. Sequences, from the set of all
elements, i.e. the Sequences, contained in this Sample. Depending on whether this Sample is chosen to contain overlapping
elements (windows of length subsequenceLength) or not, those
elements (overlapping windows, whole sequences) are subsampled.
Parameters:
number - the number of Sequences that should be drawn from the contained set of Sequences (with replacement)
Returns:
a Sample containing the drawn Sequences
Throws:
EmptySampleException - if number is not positive
See Also:
Sample(AlphabetContainer, Sequence[], int, String)
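
Drawing with replacement, as stated for the parameter number, can be sketched on plain Strings as follows. This is not Jstacs code: the class SubSamplingSketch is hypothetical, the random source is passed in explicitly so that runs can be reproduced, and an IllegalArgumentException stands in for EmptySampleException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; not part of Jstacs. */
class SubSamplingSketch {

    /** Draws the given number of elements uniformly at random, with replacement. */
    static List<String> subSample(List<String> elements, int number, Random random) {
        if (number <= 0) {
            // Stand-in for EmptySampleException
            throw new IllegalArgumentException("number has to be positive");
        }
        if (elements.isEmpty()) {
            throw new IllegalArgumentException("nothing to sample from");
        }
        List<String> drawn = new ArrayList<>(number);
        for (int i = 0; i < number; i++) {
            // The same element may be drawn more than once
            drawn.add(elements.get(random.nextInt(elements.size())));
        }
        return drawn;
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("ACG", "TTT", "GGA");
        // A fixed seed makes the draw repeatable
        System.out.println(subSample(elements, 5, new Random(42L)));
    }
}
```

Because elements are drawn with replacement, number may exceed the number of available elements.
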
public final void save(String msg,
File f)
throws IOException
This method writes a message msg and the Sample to a
file f.
Parameters:
msg - the message, any information
f - the File
Throws:
IOException - if something went wrong with the file

public String toString()
Overrides:
toString in class Object