i6_core.lib.lexicon

Library for working with RASR lexicon files

For format details, see https://www-i6.informatik.rwth-aachen.de/rwth-asr/manual/index.php/Lexicon

class i6_core.lib.lexicon.Lemma(orth: Optional[List[str]] = None, phon: Optional[List[str]] = None, synt: Optional[List[str]] = None, eval: Optional[List[List[str]]] = None, special: Optional[str] = None)

Represents a lemma of a lexicon

Parameters:
  • orth – list of spellings used in the training data

  • phon – list of pronunciation variants. Each entry is a space-separated string of phonemes from the phoneme inventory.

  • synt – list of LM tokens that form a single token sequence. This sequence is used as the language model representation.

  • eval – list of output representations. Each sublist should contain one possible transcription (token sequence) of this lemma that is scored against the reference transcription.

  • special – assigns a special property to the lemma. Supported values: “silence”, “unknown”, “sentence-boundary”, or “sentence-begin” / “sentence-end”
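To make the parameters concrete, the following is a minimal sketch of the XML that a lemma with these fields could correspond to. It uses only the standard library. The helper `build_lemma_element` is hypothetical and not part of i6_core, and the element names (`orth`, `phon`, `synt`, `eval`, `tok`) are assumptions based on the RASR lexicon format linked above.

```python
import xml.etree.ElementTree as ET


def build_lemma_element(orth, phon, synt=None, eval_=None, special=None):
    """Hypothetical helper: build a <lemma> element in the assumed RASR layout."""
    # "special" is stored as an attribute of the lemma element.
    attrib = {"special": special} if special is not None else {}
    lemma = ET.Element("lemma", attrib)
    for spelling in orth:
        ET.SubElement(lemma, "orth").text = spelling
    for pronunciation in phon:
        # One space-separated phoneme string per pronunciation variant.
        ET.SubElement(lemma, "phon").text = pronunciation
    if synt is not None:
        # A single LM token sequence, one <tok> per token.
        synt_el = ET.SubElement(lemma, "synt")
        for token in synt:
            ET.SubElement(synt_el, "tok").text = token
    for transcription in eval_ or []:
        # One <eval> element per possible output transcription.
        eval_el = ET.SubElement(lemma, "eval")
        for token in transcription:
            ET.SubElement(eval_el, "tok").text = token
    return lemma


lemma_xml = ET.tostring(
    build_lemma_element(
        orth=["hello", "Hello"],
        phon=["hh ah l ow", "hh eh l ow"],
        synt=["hello"],
        eval_=[["hello"]],
    ),
    encoding="unicode",
)
silence_xml = ET.tostring(
    build_lemma_element(orth=["[SILENCE]"], phon=["si"], synt=[], eval_=[[]], special="silence"),
    encoding="unicode",
)
print(lemma_xml)
print(silence_xml)
```

The parameter is named `eval_` in this sketch only to avoid shadowing the Python builtin. An empty `synt` or `eval` list yields an empty element, which is how a silence lemma is typically kept out of the language model and the scored output.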

classmethod from_element(e)
Parameters:

e (ET.Element) – XML element representing a lemma

Return type:

Lemma
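As an illustration of what reading a lemma element involves, here is a standard-library sketch that collects the fields of a `<lemma>` element into a plain dict. `parse_lemma_element` is a hypothetical stand-in, not the library's implementation, and the element names are assumed from the RASR lexicon format.

```python
import xml.etree.ElementTree as ET

LEMMA_XML = """
<lemma special="unknown">
  <orth>[UNKNOWN]</orth>
  <phon>unk</phon>
  <synt><tok>&lt;UNK&gt;</tok></synt>
  <eval/>
</lemma>
"""


def parse_lemma_element(element):
    """Hypothetical stand-in: collect the lemma fields into a plain dict."""
    synt_el = element.find("synt")
    return {
        "orth": [(o.text or "").strip() for o in element.findall("orth")],
        "phon": [(p.text or "").strip() for p in element.findall("phon")],
        # A missing <synt> is kept distinct from an empty one.
        "synt": None
        if synt_el is None
        else [(t.text or "").strip() for t in synt_el.findall("tok")],
        "eval": [
            [(t.text or "").strip() for t in ev.findall("tok")]
            for ev in element.findall("eval")
        ],
        "special": element.get("special"),
    }


fields = parse_lemma_element(ET.fromstring(LEMMA_XML))
print(fields)
```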

to_xml()
Returns:

XML representation

Return type:

ET.Element

class i6_core.lib.lexicon.Lexicon

Represents a bliss lexicon that can be read from and written to .xml files

add_lemma(lemma)
Parameters:

lemma (Lemma) – lemma to add to the lexicon

add_phoneme(symbol, variation='context')
Parameters:
  • symbol (str) – representation of one phoneme

  • variation (str) – possible values: “context” or “none”. Use “none” for context-independent phonemes such as silence and noise.
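The relation between symbols and variations can be sketched as a mapping that is serialized into a phoneme inventory. This example is standard-library only. `build_phoneme_inventory` and `ALLOWED_VARIATIONS` are hypothetical names, and the `phoneme-inventory` / `phoneme` / `symbol` / `variation` element layout is an assumption based on the RASR lexicon format.

```python
import xml.etree.ElementTree as ET

# The two values named in the parameter description above.
ALLOWED_VARIATIONS = ("context", "none")


def build_phoneme_inventory(phonemes):
    """Hypothetical helper: turn {symbol: variation} into a <phoneme-inventory>."""
    inventory = ET.Element("phoneme-inventory")
    for symbol, variation in phonemes.items():
        if variation not in ALLOWED_VARIATIONS:
            raise ValueError(
                f"unsupported variation {variation!r} for phoneme {symbol!r}"
            )
        phoneme = ET.SubElement(inventory, "phoneme")
        ET.SubElement(phoneme, "symbol").text = symbol
        ET.SubElement(phoneme, "variation").text = variation
    return inventory


# Speech phonemes are context dependent; silence is not.
phonemes = {"ah": "context", "hh": "context", "[SILENCE]": "none"}
inventory = build_phoneme_inventory(phonemes)
inventory_xml = ET.tostring(inventory, encoding="unicode")
print(inventory_xml)
```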

load(path)
Parameters:

path (str) – bliss lexicon .xml or .xml.gz file
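Handling both plain and gzip-compressed files can be sketched with the standard library as follows. `parse_lexicon_file` is a hypothetical helper that picks the opener from the file extension; it is not how the library is known to do it, and the tiny lexicon is made up for the demonstration.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_lexicon_file(path):
    """Hypothetical helper: parse a plain or gzip-compressed lexicon file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        return ET.parse(handle).getroot()


LEXICON_XML = (
    b"<lexicon><phoneme-inventory/>"
    b"<lemma><orth>hi</orth><phon>hh ay</phon></lemma></lexicon>"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "lexicon.xml")
    gz_path = os.path.join(tmp_dir, "lexicon.xml.gz")
    with open(plain_path, "wb") as f:
        f.write(LEXICON_XML)
    with gzip.open(gz_path, "wb") as f:
        f.write(LEXICON_XML)
    # Both variants should yield the same element tree.
    plain_root = parse_lexicon_file(plain_path)
    gz_root = parse_lexicon_file(gz_path)

orths = [o.text for o in gz_root.iter("orth")]
print(orths)
```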

remove_phoneme(symbol)
Parameters:

symbol (str) – representation of the phoneme to remove

to_xml()
Returns:

XML representation; can be used with util.write_xml

Return type:

ET.Element
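Since the return value is a plain ET.Element, it can also be serialized with the standard library alone. The sketch below builds a small stand-in element by hand (the element layout is assumed from the RASR lexicon format), writes it to an in-memory buffer, and reads it back. It does not use util.write_xml, whose exact behaviour is not described here.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for the element returned by to_xml(); the layout is an assumption.
root = ET.Element("lexicon")
inventory = ET.SubElement(root, "phoneme-inventory")
phoneme = ET.SubElement(inventory, "phoneme")
ET.SubElement(phoneme, "symbol").text = "ah"
ET.SubElement(phoneme, "variation").text = "context"
lemma = ET.SubElement(root, "lemma")
ET.SubElement(lemma, "orth").text = "a"
ET.SubElement(lemma, "phon").text = "ah"

# Serialize with an XML declaration; a file path works in place of the buffer.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
serialized = buffer.getvalue().decode("utf-8")

# Round trip: parse the serialized bytes back into an element.
reloaded = ET.fromstring(buffer.getvalue())
print(serialized)
```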