i6_core.lib.lexicon
Library for the RASR Lexicon files
For format details, see https://www-i6.informatik.rwth-aachen.de/rwth-asr/manual/index.php/Lexicon
- class i6_core.lib.lexicon.Lemma(orth: Optional[List[str]] = None, phon: Optional[List[str]] = None, synt: Optional[List[str]] = None, eval: Optional[List[List[str]]] = None, special: Optional[str] = None)
Represents a lemma of a lexicon
- Parameters:
orth – list of spellings used in the training data
phon – list of pronunciation variants. Each string should be a space-separated sequence of phonemes from the phoneme inventory.
synt – list of LM tokens that form a single token sequence. This sequence is used as the language model representation.
eval – list of output representations. Each sublist should contain one possible transcription (token sequence) of this lemma that is scored against the reference transcription.
special – assigns a special property to the lemma. Supported values: “silence”, “unknown”, “sentence-boundary”, or “sentence-begin” / “sentence-end”
- to_xml()
- Returns:
xml representation
- Return type:
ET.Element
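As a rough illustration of the parameters above, the following sketch builds a `<lemma>` element by hand using only the standard library. It does not import i6_core: the helper `build_lemma_element` is hypothetical, and the tag names (`orth`, `phon`, `synt`, `eval`, `tok`) are assumed from the RASR lexicon format linked above, so the element returned by `Lemma.to_xml()` may differ in detail.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def build_lemma_element(
    orth: Optional[List[str]] = None,
    phon: Optional[List[str]] = None,
    synt: Optional[List[str]] = None,
    eval_: Optional[List[List[str]]] = None,
    special: Optional[str] = None,
) -> ET.Element:
    """Hypothetical stand-in for illustration, not the i6_core implementation."""
    # Assumption: the special property is stored as an XML attribute.
    attrib = {"special": special} if special is not None else {}
    lemma = ET.Element("lemma", attrib)
    # One <orth> per spelling, one <phon> per pronunciation variant.
    for spelling in orth or []:
        ET.SubElement(lemma, "orth").text = spelling
    for pronunciation in phon or []:
        ET.SubElement(lemma, "phon").text = pronunciation
    # A single <synt> holding the LM token sequence.
    if synt is not None:
        synt_el = ET.SubElement(lemma, "synt")
        for token in synt:
            ET.SubElement(synt_el, "tok").text = token
    # One <eval> per possible transcription.
    for transcription in eval_ or []:
        eval_el = ET.SubElement(lemma, "eval")
        for token in transcription:
            ET.SubElement(eval_el, "tok").text = token
    return lemma


lemma_el = build_lemma_element(
    orth=["hello"],
    phon=["HH AH L OW", "HH EH L OW"],
    synt=["hello"],
    eval_=[["hello"]],
)
lemma_xml = ET.tostring(lemma_el, encoding="unicode")
```

Each pronunciation variant is one space-separated string, so a lemma with two variants carries two `phon` children rather than one list-valued field.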
- class i6_core.lib.lexicon.Lexicon
Represents a bliss lexicon that can be read from and written to .xml files
- add_phoneme(symbol, variation='context')
- Parameters:
symbol (str) – representation of one phoneme
variation (str) – possible values: “context” or “none”. Use “none” for context-independent phonemes such as silence and noise.
- load(path)
- Parameters:
path (str) – path to a bliss lexicon .xml or .xml.gz file
- remove_phoneme(symbol)
- Parameters:
symbol (str) – representation of the phoneme to remove
- to_xml()
- Returns:
xml representation, can be used with util.write_xml
- Return type:
ET.Element
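The phoneme-inventory handling and the .xml / .xml.gz round trip described above can be sketched with the standard library alone. This is an illustration under assumptions, not the i6_core code: a plain ordered mapping stands in for the state of `Lexicon`, the element names (`lexicon`, `phoneme-inventory`, `phoneme`, `symbol`, `variation`) are assumed from the RASR lexicon format, and `open_lexicon` is a hypothetical helper. Whether `remove_phoneme` does more than drop the inventory entry is not stated here, so the sketch only deletes the entry.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import OrderedDict

# symbol -> variation, mirroring add_phoneme(symbol, variation='context')
phonemes = OrderedDict()
phonemes["[SILENCE]"] = "none"  # context-independent phoneme
phonemes["AH"] = "context"
phonemes["HH"] = "context"
del phonemes["HH"]  # rough analogue of remove_phoneme("HH")

# Build an XML tree containing only the phoneme inventory.
root = ET.Element("lexicon")
inventory = ET.SubElement(root, "phoneme-inventory")
for symbol, variation in phonemes.items():
    phoneme_el = ET.SubElement(inventory, "phoneme")
    ET.SubElement(phoneme_el, "symbol").text = symbol
    ET.SubElement(phoneme_el, "variation").text = variation


def open_lexicon(path, mode):
    """Hypothetical helper: open .xml.gz via gzip, anything else as a plain file."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


# Write the tree to a temporary .xml.gz file and parse it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lexicon.xml.gz")
    with open_lexicon(path, "wb") as f:
        f.write(ET.tostring(root, encoding="utf-8"))
    with open_lexicon(path, "rb") as f:
        loaded = ET.parse(f).getroot()

loaded_symbols = [p.findtext("symbol") for p in loaded.iter("phoneme")]
```

Choosing the opener by file suffix keeps the read and write paths identical for compressed and uncompressed lexica.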