i6_core.lib.rasr_cache
This module reads (and may later also write) the RASR archive format.
- class i6_core.lib.rasr_cache.AllophoneLabeling(silence_phone, allophone_file, phoneme_file=None, state_tying_file=None, verbose_out=None)¶
Allophone labeling.
- Parameters:
silence_phone (str) – e.g. “si”
allophone_file (str) – list of allophones
phoneme_file (str|None) – list of phonemes
state_tying_file (str|None) – allophone state tying (e.g. via CART); maps each allophone state to a class label
verbose_out (file) – stream to dump log messages
- get_label_idx(allo_idx, state_idx)¶
- Parameters:
allo_idx (int) –
state_idx (int) –
- Return type:
int
- get_label_idx_by_allo_state_idx(allo_state_idx)¶
- Parameters:
allo_state_idx (int) –
- Return type:
int
- class i6_core.lib.rasr_cache.FileArchive(filename, must_exists=False, encoding='ascii')¶
File archive.
- RasrCacheHeader = 'SP_ARC1\x00'¶
- addAttributes(filename, dim, duration)¶
- Parameters:
filename (str) –
dim (int) –
duration (float) –
- addFeatureCache(filename, features, times)¶
- Parameters:
filename (str) –
features –
times –
- end_recovery_tag = 1437226410¶
- file_list()¶
- Return type:
list[str]
- finalize()¶
Finalize.
- getState(mix)¶
- Parameters:
mix (int) –
- Returns:
(mix, state)
- Return type:
(int,int)
- has_entry(filename)¶
- Parameters:
filename (str) – argument for self.read()
- Returns:
True if we have this entry
- read(filename, typ)¶
- Parameters:
filename (str) – the entry-name in the archive
typ (str) – “str”, “feat” or “align”
- Returns:
depends on typ: “str” -> string, “feat” -> (time, data), “align” -> align, where
string is a str,
time is a list of time-stamp tuples (start-time, end-time) in millisecs,
data is a list of features, each a numpy vector,
align is a list of (time, allophone, state) tuples, where time is an int from 0 to the length of align, allophone is an int, and state is e.g. in [0, 1, 2].
- Return type:
str|(list[numpy.ndarray],list[numpy.ndarray])|list[(int,int,int)]
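The “feat” return value described above can be consumed roughly as follows. This is a minimal sketch: feature_summary and _ToyArchive are hypothetical names, and the toy archive returns plain lists of floats in place of numpy vectors so that the example runs without an actual archive file.

```python
def feature_summary(archive, entry_name):
    """Summarize a feature entry read via archive.read(entry_name, "feat")."""
    times, data = archive.read(entry_name, "feat")
    return {
        "num_frames": len(data),
        "dim": len(data[0]),
        "start": times[0][0],  # start-time of the first frame
        "end": times[-1][1],  # end-time of the last frame
    }


class _ToyArchive:
    """Hypothetical stand-in exposing the documented has_entry()/read() methods."""

    def has_entry(self, filename):
        return filename == "toy/entry"

    def read(self, filename, typ):
        if typ != "feat":
            raise ValueError("this toy archive only holds features")
        times = [(0.0, 10.0), (10.0, 20.0), (20.0, 30.0)]
        data = [[0.0, 0.0, 0.0, 0.0] for _ in times]
        return times, data


archive = _ToyArchive()
summary = feature_summary(archive, "toy/entry") if archive.has_entry("toy/entry") else None
```

Calling has_entry() first avoids attempting to read a name that is not in the archive.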
- readFileInfoTable()¶
Read file info table.
- read_S16()¶
- Return type:
float
- read_U32()¶
- Return type:
int
- read_U8()¶
- Return type:
int
- read_bytes(l)¶
- Return type:
bytes
- read_char()¶
- Return type:
int
- read_f32()¶
- Return type:
float
- read_f64()¶
- Return type:
float
- read_packed_U32()¶
- Return type:
int
- read_str(l, enc='ascii')¶
- Return type:
str
- read_u32()¶
- Return type:
int
- read_u64()¶
- Return type:
int
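The fixed-width readers above can be pictured as thin wrappers around Python's struct module. The sketch below is an assumption about how such readers might work, not the module's actual code: the free functions take the stream as an argument, and little-endian byte order is assumed.

```python
import io
import struct


def read_u32(stream):
    """Read one unsigned 32-bit integer (assumed little-endian)."""
    return struct.unpack("<I", stream.read(4))[0]


def read_u64(stream):
    """Read one unsigned 64-bit integer (assumed little-endian)."""
    return struct.unpack("<Q", stream.read(8))[0]


def read_f32(stream):
    """Read one 32-bit float (assumed little-endian)."""
    return struct.unpack("<f", stream.read(4))[0]


def read_f64(stream):
    """Read one 64-bit float (assumed little-endian)."""
    return struct.unpack("<d", stream.read(8))[0]


# Build a small in-memory buffer and read the values back in the same order.
buf = io.BytesIO(struct.pack("<IQfd", 7, 2**40, 0.5, 1.25))
values = (read_u32(buf), read_u64(buf), read_f32(buf), read_f64(buf))
```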
- read_v(typ, size)¶
- Parameters:
typ (str) – “f” for float (float32) or “d” for double (float64)
size (int) – number of elements to return
- Returns:
numpy array of shape (size,) of dtype depending on typ
- Return type:
numpy.ndarray
- scanArchive()¶
Scan archive.
- setAllophones(f)¶
- Parameters:
f (str) – allophone filename; the file is line-separated, and lines starting with “#” are ignored
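The allophone file format described above (one entry per line, “#” lines ignored) could be parsed along these lines. load_allophones is a hypothetical helper, the sample lines are made up, and skipping blank lines is an additional assumption not stated in the documentation.

```python
def load_allophones(lines):
    """Collect allophone entries from an iterable of text lines."""
    allophones = []
    for line in lines:
        line = line.strip()
        # Comment lines are ignored; skipping blank lines is an assumption.
        if not line or line.startswith("#"):
            continue
        allophones.append(line)
    return allophones


# Made-up sample content standing in for an allophone file.
sample_lines = [
    "# Number of allophones: 2\n",
    "a{#+#}@i@f\n",
    "b{a+#}@f\n",
    "\n",
]
allophones = load_allophones(sample_lines)
```

An open file object can be passed directly, since iterating over it yields lines.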
- start_recovery_tag = 2857740885¶
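The two recovery tag constants are easier to recognize in hexadecimal. The short check below only restates the documented values; the variable names are illustrative.

```python
START_RECOVERY_TAG = 2857740885  # documented value of start_recovery_tag
END_RECOVERY_TAG = 1437226410  # documented value of end_recovery_tag

start_hex = f"{START_RECOVERY_TAG:#010x}"
end_hex = f"{END_RECOVERY_TAG:#010x}"

# The two tags are bitwise complements of each other within 32 bits.
are_complements = (START_RECOVERY_TAG ^ END_RECOVERY_TAG) == 0xFFFFFFFF
```

The alternating bit patterns (0xaa55aa55 and 0x55aa55aa) make the tags unlikely to appear by chance in payload data, which fits their role as markers.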
- writeFileInfoTable()¶
Write file info table.
- write_U32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_char(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_f32(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_f64(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_str(s, enc='ascii')¶
- Parameters:
s (str) –
- Return type:
int
- write_u32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_u64(i)¶
- Parameters:
i (int) –
- Return type:
int
- class i6_core.lib.rasr_cache.FileArchiveBundle(filename, encoding='ascii')¶
File archive bundle.
- Parameters:
filename (str) – .bundle file
encoding (str) – encoding used in the files
- file_list()¶
- Return type:
list[str]
- Returns:
list of content-filenames (which can be used for self.read())
- has_entry(filename)¶
- Parameters:
filename (str) – argument for self.read()
- Returns:
True if we have this entry
- read(filename, typ)¶
- Parameters:
filename (str) – the entry-name in the archive
typ (str) – “str”, “feat” or “align”
- Returns:
depends on typ: “str” -> string, “feat” -> (time, data), “align” -> align, where
string is a str,
time is a list of time-stamp tuples (start-time, end-time) in millisecs,
data is a list of features, each a numpy vector,
align is a list of (time, allophone, state) tuples, where time is an int from 0 to the length of align, allophone is an int, and state is e.g. in [0, 1, 2].
- Return type:
str|(list[numpy.ndarray],list[numpy.ndarray])|list[(int,int,int)]
Uses FileArchive.read().
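One way to picture a bundle that delegates to FileArchive.read() is an index from entry name to the archive that contains it. The sketch below is an assumption about the general idea, not the actual implementation; ToyArchive and ToyBundle are hypothetical stand-ins that mirror only the documented file_list(), has_entry() and read() methods.

```python
class ToyArchive:
    """Hypothetical in-memory archive with the documented method names."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def file_list(self):
        return list(self._entries)

    def has_entry(self, filename):
        return filename in self._entries

    def read(self, filename, typ):
        # A real archive would decode according to typ; here we return as-is.
        return self._entries[filename]


class ToyBundle:
    """Hypothetical bundle: maps each entry name to its containing archive."""

    def __init__(self, archives):
        self._index = {}
        for archive in archives:
            for name in archive.file_list():
                self._index[name] = archive

    def file_list(self):
        return list(self._index)

    def has_entry(self, filename):
        return filename in self._index

    def read(self, filename, typ):
        # Delegate to the archive that holds the entry.
        return self._index[filename].read(filename, typ)


bundle = ToyBundle([
    ToyArchive({"corpus/rec1/seg1": "first"}),
    ToyArchive({"corpus/rec2/seg1": "second"}),
])
second = bundle.read("corpus/rec2/seg1", "str")
```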
- setAllophones(filename)¶
- Parameters:
filename (str) – allophone filename
- class i6_core.lib.rasr_cache.FileInfo(name, pos, size, compressed, index)¶
File info.
- Parameters:
name (str) –
pos (int) –
size (int) –
compressed (bool|int) –
index (int) –
- class i6_core.lib.rasr_cache.MixtureSet(filename)¶
Mixture set.
- Parameters:
filename (str) –
- getCovByIdx(idx)¶
- Parameters:
idx (int) –
- Return type:
numpy.ndarray
- getMeanByIdx(idx)¶
- Parameters:
idx (int) –
- Return type:
numpy.ndarray
- getNumberMixtures()¶
- Return type:
int
- read_U32()¶
- Return type:
int
- read_char()¶
- Return type:
int
- read_f32()¶
- Return type:
float
- read_f64()¶
- Return type:
float
- read_str(l, enc='ascii')¶
- Parameters:
l (int) –
enc (str) –
- Return type:
str
- read_u32()¶
- Return type:
int
- read_u64()¶
- Return type:
int
- read_v(size, a)¶
- Parameters:
size (int) –
a (array.array) –
- Return type:
array.array
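The read_v(size, a) signature, which takes and returns an array.array, suggests filling a caller-supplied array from the stream. A minimal sketch of that pattern, assuming the data is stored in the machine's native byte order and using a hypothetical free function read_values that takes the stream explicitly:

```python
import array
import io


def read_values(stream, size, a):
    """Append `size` items of a's type code from `stream` to `a` and return it."""
    a.fromfile(stream, size)  # raises EOFError if fewer items are available
    return a


# Round trip through an in-memory stream with three float32 values.
payload = array.array("f", [1.0, 2.5, -3.0]).tobytes()
values = read_values(io.BytesIO(payload), 3, array.array("f"))
```

The type code of the passed array (“f” here) determines how many bytes each element occupies.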
- write(filename)¶
- Parameters:
filename (str) –
- write_U32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_char(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_f32(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_f64(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_str(s, enc='ascii')¶
- Parameters:
s (str) –
enc (str) –
- Return type:
int
- write_u32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_u64(i)¶
- Parameters:
i (int) –
- Return type:
int
- class i6_core.lib.rasr_cache.WordBoundaries(filename)¶
Word boundaries.
- Parameters:
filename (str) –
- read_str(l, enc='ascii')¶
- Return type:
str
- read_u16()¶
- Return type:
int
- read_u32()¶
- Return type:
int
- i6_core.lib.rasr_cache.is_rasr_cache_file(filename)¶
- Parameters:
filename (str) – file to check; must exist
- Returns:
True iff this is a rasr cache (which can be loaded with open_file_archive())
- Return type:
bool
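A check of this kind can be sketched by comparing the first bytes of a file against the documented RasrCacheHeader value. This is an assumption about how such a test might look, not the function's actual code; looks_like_rasr_cache is a hypothetical name, and the header is written here as bytes rather than str.

```python
import os
import tempfile

RASR_CACHE_HEADER = b"SP_ARC1\x00"  # byte form of the documented RasrCacheHeader


def looks_like_rasr_cache(filename):
    """Return True if the file starts with the archive header bytes."""
    with open(filename, "rb") as f:
        return f.read(len(RASR_CACHE_HEADER)) == RASR_CACHE_HEADER


# Demonstrate on two temporary files, one with and one without the header.
with tempfile.TemporaryDirectory() as tmp_dir:
    good = os.path.join(tmp_dir, "good.cache")
    bad = os.path.join(tmp_dir, "bad.cache")
    with open(good, "wb") as f:
        f.write(RASR_CACHE_HEADER + b"\x00" * 8)
    with open(bad, "wb") as f:
        f.write(b"not an archive")
    good_result = looks_like_rasr_cache(good)
    bad_result = looks_like_rasr_cache(bad)
```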
- i6_core.lib.rasr_cache.open_file_archive(archive_filename, must_exists=True, encoding='ascii')¶
- Parameters:
archive_filename (str) –
must_exists (bool) –
encoding (str) –
- Return type: